K most frequent elements

Hero Image

Problem Overview

The provided LeetCode problem, titled "Top K Frequent Elements," revolves around identifying the elements that appear most frequently within a given list. The objective is to find the k most frequent elements efficiently.

Problem Statement

Given an integer array nums and an integer k, return the k most frequent elements in the array. You may return the answer in any order.

Constraints

  • 1 <= nums.length <= 10^5
  • k is in the range [1, the number of unique elements in nums]
  • It is guaranteed that the answer is unique

The URL to view the problem statement is: Top K Frequent Elements - LeetCode

Understanding the Problem

The problem requires us to compute the frequency of each element in the list and then retrieve the k elements with the highest frequency. This can become computationally expensive for larger datasets, so an efficient approach is required.

Key DSA Concepts and Theory

Heaps

A heap is a specialized tree-based data structure that satisfies the heap property. The two main types are:

  • Min-Heap: The key at a parent node is less than or equal to the keys of its children, hence having the smallest element on top.
  • Max-Heap: The parent node has a key greater than or equal to the keys of its children.

For this problem, we can leverage a Min-Heap to efficiently track the top k elements with the highest frequency, since the heap allows us to keep operations like insertion and deletion in O(log k) time.

Priority Queues

A priority queue is an abstract data structure similar to a regular queue or stack data structure, but where each element has a "priority" associated with it. In a priority queue, an element with high priority is served before an element with low priority.

Solution Approach

To solve the problem, we can use a combination of hashmaps to calculate the frequency of each element and a Min-Heap (priority queue) to efficiently retrieve the k most frequent elements.

Step-by-Step Solution

  1. Frequency Count:

    • Use a HashMap to count the frequency of each element in the array.
  2. Min-Heap for Top K Elements:

    • Use a Min-Heap to keep track of the top k elements by frequency.
    • Iterate through the hashmap, and for each entry (element and its frequency), compare it with the smallest frequency in the heap.
    • If the heap size is less than k, simply add the element.
    • If it's equal to k, pop the smallest element in frequency and add the new element if it is more frequent.
  3. Retrieve Results:

    • The heap will contain the k most frequent elements. Extract these for the result.

C++ Code

#include <vector>
#include <unordered_map>
#include <queue>

using namespace std;

vector<int> topKFrequent(vector<int>& nums, int k) {
    // Step 1: Frequency Map
    unordered_map<int, int> frequencyMap;
    for (int num : nums) {
        frequencyMap[num]++;
    }
    
    // Step 2: Min-Heap for top k frequent elements
    auto comp = [&frequencyMap](int n1, int n2) {
        return frequencyMap[n1] > frequencyMap[n2];
    };
    priority_queue<int, vector<int>, decltype(comp)> minHeap(comp);
    
    for (auto& pair : frequencyMap) {
        minHeap.push(pair.first);
        if (minHeap.size() > k) {
            minHeap.pop();
        }
    }
    
    // Step 3: Extracting results
    vector<int> result;
    while (!minHeap.empty()) {
        result.push_back(minHeap.top());
        minHeap.pop();
    }
    
    return result;
}

Java Code

import java.util.*;

class Solution {
    public int[] topKFrequent(int[] nums, int k) {
        // Step 1: Frequency Map
        HashMap<Integer, Integer> frequencyMap = new HashMap<>();
        for (int num : nums) {
            frequencyMap.put(num, frequencyMap.getOrDefault(num, 0) + 1);
        }

        // Step 2: Min-Heap for top k frequent elements
        PriorityQueue<Integer> minHeap = new PriorityQueue<>(
            (n1, n2) -> frequencyMap.get(n1) - frequencyMap.get(n2)
        );
        
        for (int num : frequencyMap.keySet()) {
            minHeap.add(num);
            if (minHeap.size() > k) {
                minHeap.poll();
            }
        }

        // Step 3: Extracting results
        int[] result = new int[k];
        for (int i = 0; i < k; ++i) {
            result[i] = minHeap.poll();
        }
        
        return result;
    }
}

Time and Space Complexity Analysis

  • Time Complexity:

    • Building the frequency map will take O(N), where N is the number of elements in the array.
    • Building the heap takes O(N log k) due to the insertion of elements.
    • Overall complexity: O(N log k).
  • Space Complexity:

    • Space used by the frequency map is O(N).
    • Space used by the heap is O(k).
    • Overall complexity: O(N + k).

Common Mistakes to Avoid

  1. Ignoring duplicates: Ensure to account for the frequency of elements and not just their unique occurrences.
  2. Heap Size Management: Ensure the heap never exceeds k elements, as this could lead to incorrect results and higher complexity.

Similar Problems on LeetCode

  1. Find K Pairs with Smallest Sums (LeetCode 373)
  2. K Closest Points to Origin (LeetCode 973)
  3. Kth Largest Element in an Array (LeetCode 215)

Additional Resources and References

This article should equip you with a holistic understanding of solving the "Top K Frequent Elements" problem using heaps efficiently in both C++ and Java while avoiding common pitfalls.