@derekli66

Solving LeetCode #347: Top K Frequent Elements (Part 2)

Using C++ STL Priority Queue

In Part 1, we implemented a custom min heap to solve this problem. However, C++ provides a powerful container in its Standard Template Library (STL) that can simplify our solution: the priority_queue.

The priority_queue container is an implementation of a heap data structure that provides constant time complexity for retrieving the maximum element and logarithmic time complexity for insertion and extraction.

Max Heap vs. Min Heap Approach

By default, priority_queue in C++ is a max heap. For this problem, we have two options:

  1. Min Heap of size k (our chosen approach):

    • Keep a min heap containing the k elements with highest frequencies
    • When the heap size exceeds k, remove the element with the lowest frequency
    • This requires us to customize the priority_queue to be a min heap using std::greater
  2. Max Heap of all elements:

    • Insert all elements and their frequencies into a max heap
    • Extract the top k elements
    • This would use the default priority_queue behavior

We’ll implement the min heap approach because it’s more efficient when k is much smaller than the number of unique elements, which is often the case in practice.

Let’s solve the problem using priority_queue:

class Solution {
public:
    vector<int> topKFrequent(vector<int>& nums, int k) {
        // Count frequency of each number
        unordered_map<int, int> freqMap;
        for (int num : nums) {
            freqMap[num]++;
        }
        
        // Use a min heap to keep track of top k frequent elements
        // The heap is ordered by frequency (first value in the pair)
        priority_queue<pair<int, int>, vector<pair<int, int>>, 
                      greater<pair<int, int>>> minHeap;
        
        // Process each element and its frequency
        for (auto& pair : freqMap) {
            minHeap.push({pair.second, pair.first});
            
            // If heap size exceeds k, remove the smallest frequency element
            if (minHeap.size() > k) {
                minHeap.pop();
            }
        }
        
        // Extract the k most frequent elements from the heap
        vector<int> result;
        while (!minHeap.empty()) {
            result.push_back(minHeap.top().second);
            minHeap.pop();
        }
        
        return result;
    }
};

How the Solution Works

Let’s break down this solution step by step:

  1. Frequency Counting:

    • We use an unordered_map<int, int> to count the frequency of each number in the input array.
    • This takes O(n) time, where n is the size of the input array.
  2. Min Heap Construction:

    • We define a min heap using priority_queue with greater<pair<int, int>> as the comparator.
    • The pairs are structured as (frequency, number), so the heap is ordered by frequency.
    • We use a min heap so that we can easily remove the element with the lowest frequency when the heap size exceeds k.
  3. Processing Elements:

    • For each number and its frequency, we push a pair into the min heap.
    • If the heap size exceeds k, we remove the top element (the one with the lowest frequency).
    • This ensures that we keep only the k elements with the highest frequencies.
  4. Result Extraction:

    • Finally, we extract all elements from the heap and add them to our result array.
    • Note that the numbers in the result will be ordered by increasing frequency. If a specific order is required, additional sorting might be needed.

Example Walkthrough

Let’s trace through the solution with the example input [1,1,1,2,2,3] and k=2:

  1. Count frequencies: {1:3, 2:2, 3:1}

  2. Process frequencies:

    • Push (3,1) into min heap: [(3,1)]
    • Push (2,2) into min heap: [(2,2), (3,1)]
    • Push (1,3) into min heap: [(1,3), (2,2), (3,1)]
    • Since heap size is 3 and k is 2, pop the top (minimum frequency) element: [(2,2), (3,1)]
  3. Extract results:

    • Pop (2,2) from heap: result = [2]
    • Pop (3,1) from heap: result = [2,1]
  4. Final result: [2,1] (which is equivalent to [1,2] since the order doesn’t matter)

Time and Space Complexity

Time Complexity

  • Building the frequency map: O(n)
  • Heap operations: O(m log k), where m is the number of unique elements (at most n)
  • Overall: O(n + m log k) = O(n log k) in the worst case

Space Complexity

  • Frequency map: O(m), where m is the number of unique elements
  • Min heap: O(k)
  • Overall: O(m + k) = O(n) in the worst case

When to Use This Approach

The priority queue approach is particularly useful when:

  1. You need to find the top k elements from a collection.
  2. The input array is large and k is relatively small.
  3. You want a straightforward, efficient implementation without having to code a custom data structure.

Conclusion

The priority queue solution is like having a smart system that always keeps your “top k” list updated, while efficiently discarding anything that doesn’t make the cut. This targeted approach is what makes it more efficient than sorting everything when you only care about a small subset of the elements.