⚡ Machine Coding: LRU Cache (Least Recently Used)
🔍 Overview
Design and implement a highly optimized, generic In-Memory Cache with a fixed capacity. This challenge focuses on advanced data structure composition to achieve strict \(O(1)\) constant-time performance for both retrieval and eviction operations.
Why This Challenge?
- Data Structure Composition: Evaluates your ability to combine Hash Maps and Doubly Linked Lists to bypass the limitations of each individual structure.
- Pointer Manipulation: Tests your precision in managing `prev` and `next` memory pointers without creating orphaned nodes or memory leaks.
- Generics & Typing: Demonstrates how to build highly reusable, type-safe data structures using Python's `Generic` and `TypeVar`.
📋 The Scenario & Requirements
💡 The Problem (The Villain)
"The \(O(N)\) Eviction." A naive cache implementation uses a standard array or list to track usage history. As the cache grows to millions of items, finding the "Least Recently Used" item to evict requires scanning the entire list or shifting array elements, causing massive latency spikes during high-traffic bursts.
🦸 The System (The Hero)
"The Hybrid Store." A lightning-fast cache that uses a Hash Map for \(O(1)\) lookups and a Doubly Linked List to maintain access order in \(O(1)\). By decoupling the key-value storage from the chronological eviction logic, the system handles massive throughput with deterministic, sub-millisecond latency.
📏 Requirements & Constraints
- (Functional): Implement `get(key)` to retrieve an item and mark it as recently used.
- (Functional): Implement `set(key, value)` to insert or update an item, marking it as recently used.
- (Functional): When the cache reaches its maximum capacity, it must automatically evict the Least Recently Used item before inserting a new one.
- (Technical): Both `get` and `set` must execute in strict \(O(1)\) time complexity.
- (Technical): The data structure must be generic and accept any hashable key type.
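Before building the structure from scratch, the required semantics can be sanity-checked against a reference model. The sketch below is not the challenge solution; it leans on Python's standard `collections.OrderedDict` (whose `move_to_end` and `popitem` are \(O(1)\)) purely to pin down the expected behavior, with an illustrative capacity of 2:

```python
from collections import OrderedDict
from typing import Optional

# Reference model of the required behavior; capacity of 2 is illustrative.
cache: "OrderedDict[str, int]" = OrderedDict()
MAX_SIZE = 2

def set_item(key: str, value: int) -> None:
    """Insert or update, mark as most recently used, evict LRU if over capacity."""
    if key in cache:
        cache.move_to_end(key)       # O(1) reorder to the MRU end
    cache[key] = value
    if len(cache) > MAX_SIZE:
        cache.popitem(last=False)    # evict the least recently used entry

def get_item(key: str) -> Optional[int]:
    """Retrieve and mark as most recently used; None on a miss."""
    if key not in cache:
        return None
    cache.move_to_end(key)
    return cache[key]

set_item("a", 1)
set_item("b", 2)
get_item("a")        # "a" is now most recently used
set_item("c", 3)     # over capacity -> evicts "b", the LRU
print(list(cache))   # ['a', 'c']
```

Any from-scratch implementation must reproduce exactly this eviction order: accessing "a" saved it, so "b" was the victim.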
🏗️ Design & Architecture
🧠 Thinking Process
To achieve \(O(1)\) for both lookup and eviction, we must combine two structures:
1. Hash Map (lookup): Stores the `key -> Node` mapping for instant memory access to any item in the cache.
2. Doubly Linked List: Stores the actual nodes in order of recency. The Most Recently Used (MRU) nodes are moved to the head, and the Least Recently Used (LRU) node sits at the tail.
3. Sentinel Nodes: To prevent messy `if node.prev is None` edge cases during pointer manipulation, we initialize the linked list with a "dummy" head and tail node. All real data lives safely between them.
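A minimal sketch of the sentinel idea, under simplifying assumptions (a bare `Node` without keys, and module-level helpers rather than the `DoublyLinkedList` class used in the final design): because the dummy head and tail always exist, neither splice operation ever touches `None`.

```python
class Node:
    """Doubly linked node; key omitted to focus on the pointer mechanics."""
    def __init__(self, value=None):
        self.value = value
        self.prev = None
        self.next = None

# Sentinel setup: dummy head and tail bracket all real nodes.
head = Node()
tail = Node()
head.next = tail
tail.prev = head

def add_to_front(node: Node) -> None:
    """Insert node right after the dummy head (the MRU position) in O(1)."""
    node.prev, node.next = head, head.next
    head.next.prev = node
    head.next = node

def unlink(node: Node) -> None:
    """Bridge the neighbors over node in O(1); no None checks needed."""
    node.prev.next = node.next
    node.next.prev = node.prev

a, b = Node("a"), Node("b")
add_to_front(a)   # list: head <-> a <-> tail
add_to_front(b)   # list: head <-> b <-> a <-> tail
unlink(a)         # list: head <-> b <-> tail
```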
🧩 Class Diagram
(The Object-Oriented Blueprint. Who owns what?)
```mermaid
classDiagram
    direction TB
    class LRUCache~K, V~ {
        -int max_size
        -Dict~K, Node~ lookup
        -DoublyLinkedList linked_list
        +get(key) V
        +set(key, value)
    }
    class Node~K, V~ {
        +K key
        +V value
        +Node prev
        +Node next
    }
    class DoublyLinkedList~K, V~ {
        -Node head
        -Node tail
        +move_to_front(node)
        +append_to_front(node)
        +remove_from_tail() Node
    }
    LRUCache *-- DoublyLinkedList : manages
    LRUCache *-- Node : maps via Dict
    DoublyLinkedList *-- Node : chains
```
⚙️ Design Patterns Applied
- Structural Composition: Fusing a Hash Map and a Linked List into a single cohesive facade.
- Sentinel Pattern: Using dummy head and tail nodes to simplify structural boundary conditions and eliminate null-pointer checks during node extraction.
💻 Solution Implementation
The Code
```python
# Solution implementation for High-Performance In-Memory Cache
# Building block: a generic Hash Table with separate chaining (the O(1) lookup layer).
from typing import Generic, TypeVar, List

K = TypeVar('K')
V = TypeVar('V')


class HashItem(Generic[K, V]):
    """Represents a key-value pair stored in the hash table.

    Attributes:
        key (K): The unique identifier for the item.
        value (V): The data associated with the key.
    """

    def __init__(self, key: K, value: V) -> None:
        """Initializes a new HashItem.

        Args:
            key (K): The key to identify the item.
            value (V): The value to store.
        """
        self.key: K = key
        self.value: V = value


class HashTable(Generic[K, V]):
    """A Hash Table implementation using separate chaining for collision resolution.

    Attributes:
        size (int): The total number of buckets in the hash table.
        table (List[List[HashItem[K, V]]]): The underlying array of buckets.
    """

    def __init__(self, size: int) -> None:
        """Initializes the HashTable with a fixed number of buckets.

        Args:
            size (int): The number of buckets.
        """
        self.size: int = size
        self.table: List[List[HashItem[K, V]]] = [[] for _ in range(self.size)]

    def _hash_function(self, key: K) -> int:
        """Calculates the bucket index for a given key.

        Args:
            key (K): The key to hash.

        Returns:
            int: The computed bucket index (0 to size - 1).
        """
        return hash(key) % self.size

    def set(self, key: K, value: V) -> None:
        """Inserts or updates a key-value pair in the hash table.

        Args:
            key (K): The key to insert or update.
            value (V): The value to associate with the key.
        """
        hash_index = self._hash_function(key)
        bucket = self.table[hash_index]
        # Update existing key
        for item in bucket:
            if item.key == key:
                item.value = value
                return
        # Insert new key
        bucket.append(HashItem(key, value))

    def get(self, key: K) -> V:
        """Retrieves the value associated with a given key.

        Args:
            key (K): The key to look up.

        Returns:
            V: The associated value.

        Raises:
            KeyError: If the key does not exist in the hash table.
        """
        hash_index = self._hash_function(key)
        bucket = self.table[hash_index]
        for item in bucket:
            if item.key == key:
                return item.value
        raise KeyError(f"Key '{key}' not found.")

    def remove(self, key: K) -> None:
        """Removes a key-value pair from the hash table.

        Args:
            key (K): The key to remove.

        Raises:
            KeyError: If the key does not exist in the hash table.
        """
        hash_index = self._hash_function(key)
        bucket = self.table[hash_index]
        for index, item in enumerate(bucket):
            if item.key == key:
                del bucket[index]
                return
        raise KeyError(f"Key '{key}' not found.")
```
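The hash table above covers only the lookup half of the design. A compact sketch of the full cache is shown below; for brevity it uses a plain `dict` as the lookup layer (swapping in the `HashTable` above is mechanical) and inlines the list operations as private helpers instead of the `DoublyLinkedList` class from the diagram:

```python
from typing import Dict, Generic, Optional, TypeVar

K = TypeVar("K")
V = TypeVar("V")

class _Node(Generic[K, V]):
    """Doubly linked node holding one cache entry (sentinels hold no data)."""
    def __init__(self, key: Optional[K] = None, value: Optional[V] = None):
        self.key = key
        self.value = value
        self.prev: Optional["_Node[K, V]"] = None
        self.next: Optional["_Node[K, V]"] = None

class LRUCache(Generic[K, V]):
    """Fixed-capacity cache: dict for O(1) lookup, linked list for O(1) recency."""
    def __init__(self, max_size: int) -> None:
        self.max_size = max_size
        self.lookup: Dict[K, _Node[K, V]] = {}
        self._head: _Node[K, V] = _Node()   # sentinel on the MRU side
        self._tail: _Node[K, V] = _Node()   # sentinel on the LRU side
        self._head.next = self._tail
        self._tail.prev = self._head

    def _unlink(self, node: _Node[K, V]) -> None:
        node.prev.next = node.next
        node.next.prev = node.prev

    def _push_front(self, node: _Node[K, V]) -> None:
        node.prev, node.next = self._head, self._head.next
        self._head.next.prev = node
        self._head.next = node

    def get(self, key: K) -> Optional[V]:
        node = self.lookup.get(key)
        if node is None:
            return None
        self._unlink(node)          # promote to MRU position in O(1)
        self._push_front(node)
        return node.value

    def set(self, key: K, value: V) -> None:
        node = self.lookup.get(key)
        if node is not None:        # update in place and promote to MRU
            node.value = value
            self._unlink(node)
            self._push_front(node)
            return
        if len(self.lookup) >= self.max_size:
            lru = self._tail.prev             # real node before the dummy tail
            self._unlink(lru)
            del self.lookup[lru.key]          # evict the LRU entry
        fresh = _Node(key, value)
        self.lookup[key] = fresh
        self._push_front(fresh)

cache: "LRUCache[str, int]" = LRUCache(max_size=2)
cache.set("a", 1)
cache.set("b", 2)
cache.get("a")      # "a" becomes MRU
cache.set("c", 3)   # capacity reached -> evicts "b"
```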
🔬 Why This Works (Evaluation)
The combination of the Hash Map and Doubly Linked List is the algorithmic key. When an item is accessed via `get`, the map provides the exact memory reference to the `Node` in \(O(1)\). We then "pluck" the node from its current position by bridging its `prev` and `next` pointers, and move it to the front of the list in \(O(1)\) (which is only possible because a doubly linked list allows us to look backward). Eviction is simply popping the node immediately preceding the dummy tail and deleting its key from the Hash Map.
⚖️ Trade-offs & Limitations
| Decision | Pros | Cons / Limitations |
|---|---|---|
| Doubly Linked List | \(O(1)\) extraction and reordering from the middle of the list. | High memory overhead (requires allocating 2 extra pointers per entry). |
| Algorithmic Purity (No Locks) | Maximum single-thread performance without context-switching overhead. | Not Thread-Safe. If multiple threads call set() concurrently, the linked list pointers will become corrupted. |
| In-Memory Only | Sub-millisecond latency. | Data is lost on server restart; cannot scale beyond the physical RAM of a single machine. |
🎤 Interview Toolkit
- Concurrency Probe: "This code isn't thread-safe. How would you modify it to safely handle 10,000 concurrent requests per second?" -> (Add a `threading.Lock()` around the `get` and `set` operations. For extreme throughput, implement Striped Locking: hash keys into different segments and lock only the specific segment being modified.)
- Extensibility: "How would you add a Time-To-Live (TTL) feature so keys expire after 5 minutes?" -> (Add a `timestamp` to the `Node`. Implement "Lazy Eviction" by checking the timestamp during a `get()` call and returning `None` if expired. Optionally, run a background "janitor" thread to clean up stale nodes periodically.)
- Algorithm Pivot: "What if we wanted an LFU (Least Frequently Used) cache instead?" -> (You would need to change the underlying structure, typically requiring two Hash Maps: one mapping `key -> value/frequency`, and another mapping `frequency -> DoublyLinkedList of nodes` to resolve ties.)
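For the concurrency probe, the coarse-grained locking answer can be sketched as follows. This is a minimal illustration, not production code: it uses `collections.OrderedDict` as a stand-in for the hand-rolled cache so the example stays self-contained, and a single `threading.Lock` serializes all access (striped locking would be the next refinement):

```python
import threading
from collections import OrderedDict
from typing import Optional

class ThreadSafeLRUCache:
    """Coarse-grained locking: one lock protects both the map and the ordering.

    Correct and simple; the cost is that all threads contend on one lock.
    """
    def __init__(self, max_size: int) -> None:
        self._max_size = max_size
        self._data: OrderedDict = OrderedDict()
        self._lock = threading.Lock()

    def get(self, key) -> Optional[object]:
        with self._lock:
            if key not in self._data:
                return None
            self._data.move_to_end(key)      # promote to MRU under the lock
            return self._data[key]

    def set(self, key, value) -> None:
        with self._lock:
            if key in self._data:
                self._data.move_to_end(key)
            self._data[key] = value
            if len(self._data) > self._max_size:
                self._data.popitem(last=False)   # evict LRU under the lock

cache = ThreadSafeLRUCache(max_size=100)
threads = [threading.Thread(target=cache.set, args=(i, i * 2)) for i in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Because every mutation happens inside the lock, the linked-list (here, ordering) invariants can never be observed half-updated by another thread.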
🔗 Related Challenges
- Hash Map (Dictionary) – The underlying foundation of the `lookup` dictionary used in this challenge.
- Distributed Rate Limiter – Uses a cache-like structural approach for tracking requests and evicting old timestamps via Sliding Windows.