@derekli66

Solving LeetCode #347: Top K Frequent Elements (Part 1)

Custom Min Heap Approach

This is the first in a three-part series exploring different approaches to solving the “Top K Frequent Elements” problem from LeetCode (#347).

Understanding the Problem

Before diving into solutions, let’s understand what we need to do:

  1. Count the frequency of each number in the array
  2. Sort the numbers by their frequency
  3. Return the top k numbers with the highest frequencies

Solution Approach: Custom Min Heap Implementation

Our first approach uses a custom min heap implementation to keep track of the top k frequent elements:

  1. Count the frequencies of each number using a hash map (unordered_map)
  2. Use a min heap to keep track of the top k elements by frequency
  3. When the heap size exceeds k, remove the element with the lowest frequency
  4. Return the elements in the heap as the result

Here’s my custom implementation:

class Solution {
public:
    vector<int> topKFrequent(vector<int>& nums, int k) {
        // Count frequency of each number
        unordered_map<int, int> frequentMap;
        
        for (auto& num : nums) {
            if (frequentMap.find(num) != frequentMap.end()) {
                frequentMap[num]++;
            } else {
                frequentMap[num] = 1;
            }
        }
        
        // Create a min heap to track top K elements
        vector<pair<int, int>> results;
        
        for (const auto& pair : frequentMap) {
            insert(results, make_pair(pair.first, pair.second));
            if (results.size() > k) {
                removePeak(results);
            }
        }
        
        // Convert results to output format
        vector<int> output;
        for (const auto& pair : results) {
            output.push_back(pair.first);
        }
        
        return output;
    }
    
private:
    void insert(vector<pair<int, int>>& results, pair<int, int> pair) {
        // Add the element to the end
        results.push_back(pair);
        
        // Maintain min heap property by bubbling up the new element
        int index = results.size() - 1;
        while (index > 0) {
            int parentIndex = parent(index);
            if (results[parentIndex].second <= results[index].second) {
                break; // Heap property satisfied
            }
            // Swap with parent if parent has higher frequency (min heap)
            std::swap(results[index], results[parentIndex]);
            index = parentIndex;
        }
    }
    
    void removePeak(vector<pair<int, int>>& results) {
        int lastIndex = results.size() - 1;
        std::swap(results[0], results[lastIndex]); // Using std::swap
        results.pop_back();
        minHeapify(results, 0);
    }
    
    void minHeapify(vector<pair<int, int>>& heap, int i) {
        int smallest = i;
        int leftIdx = left(i);
        int rightIdx = right(i);
        
        if (leftIdx < heap.size() && heap[leftIdx].second < heap[smallest].second) {
            smallest = leftIdx;
        }
        
        if (rightIdx < heap.size() && heap[rightIdx].second < heap[smallest].second) {
            smallest = rightIdx;
        }
        
        if (smallest != i) {
            std::swap(heap[i], heap[smallest]);
            minHeapify(heap, smallest);
        }
    }
    
    int parent(int i) { return (i - 1) / 2; }
    int left(int i) { return 2 * i + 1; }
    int right(int i) { return 2 * i + 2; }
};

How the Min Heap Works

Let’s walk through the key components of our min heap implementation:

  1. Structure: We represent the heap as a vector of pairs, where each pair contains (number, frequency).

  2. Insertion (insert function):

    • Add the new element to the end of the vector
    • Bubble up the element until the min heap property is satisfied
    • The min heap property ensures that the element with the lowest frequency is at the root
  3. Removal (removePeak function):

    • Swap the root (minimum element) with the last element in the heap
    • Remove the last element (which is now the minimum)
    • Restore the heap property by calling minHeapify on the root
  4. Heapify (minHeapify function):

    • Compare the current node with its left and right children
    • Find the smallest of the three elements
    • If the smallest is not the current node, swap and recursively heapify the affected subtree
  5. Helper functions:

    • parent(i): Returns the index of the parent of node at index i
    • left(i): Returns the index of the left child of node at index i
    • right(i): Returns the index of the right child of node at index i

Example Walkthrough

Let’s trace through the solution with the example input [1,1,1,2,2,3] and k=2:

  1. Count frequencies: {1:3, 2:2, 3:1}

  2. Process frequencies:

    • Insert (1,3) into heap: [(1,3)]
    • Insert (2,2) into heap: [(2,2), (1,3)] (after heapify)
    • Insert (3,1) into heap: [(3,1), (2,2), (1,3)] (after heapify)
    • Since heap size is 3 and k is 2, remove the minimum: [(2,2), (1,3)]
  3. Result: [2,1] (or equivalently [1,2] since the order doesn’t matter)

Time and Space Complexity

The time complexity of this solution is:

  • O(n) for counting frequencies, where n is the number of elements in the input array
  • O(m log k) for heap operations, where m is the number of unique elements

So the overall time complexity is O(n + m log k). Since m is at most n, this gives O(n log k) in the worst case. The space complexity is O(n) for the frequency map and heap.

Conclusion

The custom min heap approach offers an efficient solution for finding the top k frequent elements. By maintaining a heap of size k, we avoid the need to sort all unique elements, which is especially beneficial when k is much smaller than the number of unique elements.

Understanding this pattern of frequency counting and heap-based selection is valuable for solving many algorithm problems efficiently. In the next article, we’ll explore how to simplify this solution using C++’s STL priority queue.