LFU Cache: Understanding and Implementing Efficient Data Retrieval

Problem Overview

The LFU (Least Frequently Used) Cache is a cache-eviction policy that preserves hit-rate efficiency by discarding the least frequently accessed items first. In this problem, you're tasked with designing and implementing a data structure that simulates an LFU Cache supporting the following two operations:

  1. get(key): Retrieve the value of the key if it exists in the cache; otherwise, return -1.
  2. put(key, value): Insert or update the value of the key. If inserting a new key would exceed the capacity, the least frequently used key is evicted first.
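Before building an efficient structure, it helps to pin down the required behavior. The sketch below is a deliberately naive O(n) reference model of these semantics: every operation scans all entries, and frequency ties are broken by evicting the least recently used entry. The class name NaiveLFU and its fields are illustrative only, not part of the efficient solution this article develops.

```cpp
#include <cstddef>
#include <vector>

// Naive O(n) reference model of LFU semantics (illustrative only):
// every operation scans all entries; frequency ties are broken by
// evicting the least recently used entry.
struct NaiveLFU {
    struct Entry { int key, value, freq; long lastUsed; };
    std::vector<Entry> entries;
    int capacity;
    long tick = 0;  // logical clock used to track recency

    explicit NaiveLFU(int cap) : capacity(cap) {}

    int get(int key) {
        for (Entry& e : entries)
            if (e.key == key) { e.freq++; e.lastUsed = ++tick; return e.value; }
        return -1;  // key not present
    }

    void put(int key, int value) {
        if (capacity == 0) return;
        for (Entry& e : entries)
            if (e.key == key) { e.value = value; e.freq++; e.lastUsed = ++tick; return; }
        if ((int)entries.size() == capacity) {
            // Evict the lowest frequency; break ties by the oldest access time.
            std::size_t victim = 0;
            for (std::size_t i = 1; i < entries.size(); ++i)
                if (entries[i].freq < entries[victim].freq ||
                    (entries[i].freq == entries[victim].freq &&
                     entries[i].lastUsed < entries[victim].lastUsed))
                    victim = i;
            entries.erase(entries.begin() + victim);
        }
        entries.push_back(Entry{key, value, 1, ++tick});
    }
};
```

For example, with capacity 2: put(1,1), put(2,2), then get(1) returns 1; put(3,3) evicts key 2 (the lowest frequency); a later put(4,4) evicts key 1 rather than key 3, because both sit at frequency 2 but key 1 was touched less recently.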

Understanding the Problem

To implement an LFU Cache, both access operations (get) and updates (put) should run in optimal time, ideally O(1). The challenge is managing the frequency bookkeeping and the eviction strategy efficiently, given that keys with the same frequency must be further ordered by recency of access, with the least recently used evicted first.

Key DSA Concepts and Theory

Stacks and Queues: Both these data structures support different types of access. While they are not directly used in LFU Cache, understanding them is foundational for mastering data management concepts.

HashMap: An efficient way to store key-value pairs. In LFU Cache, a HashMap is used to quickly look up the values by their keys.

Doubly Linked List: This data structure helps in maintaining the order of elements and aids in quick insertion and removal of elements. It will be used to maintain the order of keys when they have the same frequency.

Frequency Map: A map from each access frequency to the keys currently at that frequency, typically implemented as a map of doubly linked lists for quick access and updates.
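Concretely, these pieces combine into two maps over a single node type. The skeleton below shows the layout; the names mirror the keyNodeMap/freqListMap used in the solution section, while the helper insertNew is a hypothetical illustration, not the full cache logic.

```cpp
#include <list>
#include <unordered_map>

// One cache entry: the key, its value, and how often it has been accessed.
struct Node {
    int key;
    int value;
    int freq;
};

// key -> iterator pointing at that key's Node inside its frequency bucket.
// Storing the iterator makes removal from a std::list O(1).
std::unordered_map<int, std::list<Node>::iterator> keyNodeMap;

// frequency -> doubly linked list of all nodes at that frequency,
// most recently used at the front, least recently used at the back.
std::unordered_map<int, std::list<Node>> freqListMap;

// A brand-new key always starts in the frequency-1 bucket.
void insertNew(int key, int value) {
    freqListMap[1].push_front(Node{key, value, 1});
    keyNodeMap[key] = freqListMap[1].begin();
}
```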

Solution Approach

The LFU Cache requires an efficient way to keep track of both frequency and order of keys. Here's a step-by-step approach:

  1. Data Structures:

    • Node class: Represents each key-value pair and the frequency of access.
    • HashMap (keyNodeMap): Maps each key to its corresponding node.
    • Frequency Map (freqListMap): Maps frequencies to a Doubly Linked List maintaining all nodes with that frequency.
    • Capacity: The maximum number of items that can be stored in the cache.
  2. get(key):

    • If the key exists in keyNodeMap, retrieve the node.
    • Increase the frequency of that node.
    • Move the node from its current frequency list to the next frequency list.
    • Return the value. If the key doesn’t exist, return -1.
  3. put(key, value):

    • If the cache capacity is zero, return immediately.
    • If the key is already present, update its value and increase its frequency.
    • If the key is new:
      • If the cache is already at capacity, remove the least frequently used item (breaking ties by least recent use) and decrement the size.
      • Insert the new node into the frequency map with a frequency of 1, reset minFreq to 1, and increment the size.
  4. Removal: If two keys have the same frequency, remove the one that was least recently used. This is efficient with the Doubly Linked List.
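Step 4 relies on the bucket ordering established in step 1: within a frequency bucket, new arrivals go to the front, so the least recently used node is always at the back. The standalone sketch below isolates just the eviction step under that invariant; evictLFU is a hypothetical helper name, not part of the final class.

```cpp
#include <list>
#include <unordered_map>

struct Node { int key, value, freq; };

std::unordered_map<int, std::list<Node>::iterator> keyNodeMap;
std::unordered_map<int, std::list<Node>> freqListMap;
int minFreq = 0;  // smallest frequency currently present

// Evict the least frequently used key; among equal frequencies, the
// least recently used node sits at the back of its bucket.
// Returns the evicted key, or -1 if the cache is empty.
int evictLFU() {
    if (freqListMap.empty()) return -1;
    std::list<Node>& bucket = freqListMap[minFreq];
    int victim = bucket.back().key;  // back == least recently used
    keyNodeMap.erase(victim);
    bucket.pop_back();
    if (bucket.empty()) freqListMap.erase(minFreq);
    return victim;
}
```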

Here is the code implementation in both C++ and Java:

C++ Implementation:

#include <unordered_map>
#include <list>

class LFUCache {
    struct Node {
        int key;
        int value;
        int freq;
    };
    
    std::unordered_map<int, std::list<Node>::iterator> keyNodeMap;  // key -> its node's position
    std::unordered_map<int, std::list<Node>> freqListMap;           // freq -> MRU-first bucket
    int capacity;  // maximum number of entries
    int size;      // current number of entries
    int minFreq;   // smallest frequency currently present
    
public:
    LFUCache(int capacity) : capacity(capacity), size(0), minFreq(0) {}
    
    int get(int key) {
        if (keyNodeMap.find(key) == keyNodeMap.end())
            return -1;
        
        auto nodeIt = keyNodeMap[key];
        Node node = *nodeIt;
        freqListMap[node.freq].erase(nodeIt);
        
        if (freqListMap[node.freq].empty()) {
            if (node.freq == minFreq)
                minFreq++;
            freqListMap.erase(node.freq);
        }
        
        node.freq++;
        freqListMap[node.freq].push_front(node);
        keyNodeMap[key] = freqListMap[node.freq].begin();
        
        return node.value;
    }
    
    void put(int key, int value) {
        if (capacity == 0)
            return;
        
        if (keyNodeMap.find(key) != keyNodeMap.end()) {
            auto nodeIt = keyNodeMap[key];
            Node node = *nodeIt;
            freqListMap[node.freq].erase(nodeIt);
            
            if (freqListMap[node.freq].empty()) {
                if (node.freq == minFreq)
                    minFreq++;
                freqListMap.erase(node.freq);
            }
            
            node.value = value;
            node.freq++;
            freqListMap[node.freq].push_front(node);
            keyNodeMap[key] = freqListMap[node.freq].begin();
        } else {
            if (size == capacity) {
                auto node = freqListMap[minFreq].back();
                keyNodeMap.erase(node.key);
                freqListMap[minFreq].pop_back();
                
                if (freqListMap[minFreq].empty())
                    freqListMap.erase(minFreq);
                
                size--;
            }
            
            Node newNode = {key, value, 1};
            minFreq = 1;
            freqListMap[1].push_front(newNode);
            keyNodeMap[key] = freqListMap[1].begin();
            size++;
        }
    }
};

Java Implementation:

import java.util.HashMap;
import java.util.LinkedList;

class LFUCache {
    class Node {
        int key, value, freq = 1;
        Node(int key, int value) {
            this.key = key;
            this.value = value;
        }
    }
    
    HashMap<Integer, Node> cache;
    HashMap<Integer, LinkedList<Node>> freqMap;
    int capacity, size, minFreq;
    
    public LFUCache(int capacity) {
        this.capacity = capacity;
        this.size = 0;
        this.minFreq = 0;
        this.cache = new HashMap<>();
        this.freqMap = new HashMap<>();
    }
    
    public int get(int key) {
        if (!cache.containsKey(key)) return -1;
        
        Node node = cache.get(key);
        update(node);
        return node.value;
    }
    
    public void put(int key, int value) {
        if (capacity == 0) return;
        
        if (cache.containsKey(key)) {
            Node node = cache.get(key);
            node.value = value;
            update(node);
        } else {
            if (size == capacity) {
                LinkedList<Node> minFreqList = freqMap.get(minFreq);
                Node removeNode = minFreqList.pollLast();
                cache.remove(removeNode.key);
                if(minFreqList.isEmpty()) freqMap.remove(minFreq);
                size--;
            }
            
            Node newNode = new Node(key, value);
            cache.put(key, newNode);
            freqMap.computeIfAbsent(1, k -> new LinkedList<>()).offerFirst(newNode);
            minFreq = 1;
            size++;
        }
    }
    
    private void update(Node node) {
        int freq = node.freq;
        freqMap.get(freq).remove(node); // note: LinkedList.remove(Object) is O(n); a LinkedHashSet bucket keeps this O(1)
        if (freqMap.get(freq).isEmpty()) {
            freqMap.remove(freq);
            if (minFreq == freq) minFreq++;
        }
        
        node.freq++;
        freqMap.computeIfAbsent(node.freq, k -> new LinkedList<>()).offerFirst(node);
    }
}

Time and Space Complexity Analysis

  • Time Complexity:

    • get() is O(1): one hash lookup plus splicing the node between two frequency lists via its stored iterator.
    • put() is O(1): one hash lookup, at most one eviction from the back of the minimum-frequency list, and one insertion (hash operations are O(1) on average).
    • Caveat: the Java version as written calls LinkedList.remove(Object), which scans the bucket in O(n); using a LinkedHashSet per frequency (or hand-rolled doubly linked nodes) restores guaranteed O(1) updates.
  • Space Complexity: O(capacity), for up to 'capacity' key-value pairs plus their frequency-bucket metadata.

Common Mistakes to Avoid

  • Failing to update both the frequency map and the key mapping after every get or put operation.
  • Incorrectly managing the minimum frequency, especially when removing elements.
  • Forgetting that when the cache is full, the element to evict is the least frequently used one, with ties broken by evicting the least recently used.
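The second pitfall (minFreq management) comes down to one rule: minFreq advances only when a node leaves the minimum-frequency bucket and that bucket becomes empty, and it resets to 1 whenever a new key is inserted. The focused sketch below isolates the promotion step; promote is a hypothetical helper using the same structures as the C++ implementation above.

```cpp
#include <list>
#include <unordered_map>

struct Node { int key, value, freq; };

std::unordered_map<int, std::list<Node>::iterator> keyNodeMap;
std::unordered_map<int, std::list<Node>> freqListMap;
int minFreq = 0;

// Move a key from its current frequency bucket to the next one,
// advancing minFreq only when the old bucket was the minimum and
// has just been drained.
void promote(int key) {
    auto it = keyNodeMap[key];
    Node node = *it;
    freqListMap[node.freq].erase(it);
    if (freqListMap[node.freq].empty()) {
        freqListMap.erase(node.freq);
        if (minFreq == node.freq) minFreq++;  // the pitfall: bump only when the min bucket drained
    }
    node.freq++;
    freqListMap[node.freq].push_front(node);
    keyNodeMap[key] = freqListMap[node.freq].begin();
}
```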

This comprehensive guide should provide a firm understanding and practical approach to implementing an LFU Cache while considering all the critical data structure components and associated operations.