Solving LeetCode #347: Top K Frequent Elements (Part 1)
Custom Min Heap Approach
This is the first in a three-part series exploring different approaches to solving the “Top K Frequent Elements” problem from LeetCode (#347).
Understanding the Problem
Before diving into solutions, let’s understand what we need to do:
- Count the frequency of each number in the array
- Sort the numbers by their frequency
- Return the top
k
numbers with the highest frequencies
Solution Approach: Custom Min Heap Implementation
Our first approach uses a custom min heap implementation to keep track of the top k frequent elements:
- Count the frequencies of each number using a hash map (unordered_map)
- Use a min heap to keep track of the top k elements by frequency
- When the heap size exceeds k, remove the element with the lowest frequency
- Return the elements in the heap as the result
Here’s my custom implementation:
class Solution {
public:
vector<int> topKFrequent(vector<int>& nums, int k) {
// Count frequency of each number
unordered_map<int, int> frequentMap;
for (auto& num : nums) {
if (frequentMap.find(num) != frequentMap.end()) {
frequentMap[num]++;
} else {
frequentMap[num] = 1;
}
}
// Create a min heap to track top K elements
vector<pair<int, int>> results;
for (const auto& pair : frequentMap) {
insert(results, make_pair(pair.first, pair.second));
if (results.size() > k) {
removePeak(results);
}
}
// Convert results to output format
vector<int> output;
for (const auto& pair : results) {
output.push_back(pair.first);
}
return output;
}
private:
void insert(vector<pair<int, int>>& results, pair<int, int> pair) {
// Add the element to the end
results.push_back(pair);
// Maintain min heap property by bubbling up the new element
int index = results.size() - 1;
while (index > 0) {
int parentIndex = parent(index);
if (results[parentIndex].second <= results[index].second) {
break; // Heap property satisfied
}
// Swap with parent if parent has higher frequency (min heap)
std::swap(results[index], results[parentIndex]);
index = parentIndex;
}
}
void removePeak(vector<pair<int, int>>& results) {
int lastIndex = results.size() - 1;
std::swap(results[0], results[lastIndex]); // Using std::swap
results.pop_back();
minHeapify(results, 0);
}
void minHeapify(vector<pair<int, int>>& heap, int i) {
int smallest = i;
int leftIdx = left(i);
int rightIdx = right(i);
if (leftIdx < heap.size() && heap[leftIdx].second < heap[smallest].second) {
smallest = leftIdx;
}
if (rightIdx < heap.size() && heap[rightIdx].second < heap[smallest].second) {
smallest = rightIdx;
}
if (smallest != i) {
std::swap(heap[i], heap[smallest]);
minHeapify(heap, smallest);
}
}
int parent(int i) { return (i - 1) / 2; }
int left(int i) { return 2 * i + 1; }
int right(int i) { return 2 * i + 2; }
};
How the Min Heap Works
Let’s walk through the key components of our min heap implementation:
Structure: We represent the heap as a vector of pairs, where each pair contains (number, frequency).
Insertion (
insert
function):- Add the new element to the end of the vector
- Bubble up the element until the min heap property is satisfied
- The min heap property ensures that the element with the lowest frequency is at the root
Removal (
removePeak
function):- Swap the root (minimum element) with the last element in the heap
- Remove the last element (which is now the minimum)
- Restore the heap property by calling
minHeapify
on the root
Heapify (
minHeapify
function):- Compare the current node with its left and right children
- Find the smallest of the three elements
- If the smallest is not the current node, swap and recursively heapify the affected subtree
Helper functions:
parent(i)
: Returns the index of the parent of node at index ileft(i)
: Returns the index of the left child of node at index iright(i)
: Returns the index of the right child of node at index i
Example Walkthrough
Let’s trace through the solution with the example input [1,1,1,2,2,3]
and k=2
:
Count frequencies:
{1:3, 2:2, 3:1}
Process frequencies:
- Insert (1,3) into heap:
[(1,3)]
- Insert (2,2) into heap:
[(2,2), (1,3)]
(after heapify) - Insert (3,1) into heap:
[(3,1), (2,2), (1,3)]
(after heapify) - Since heap size is 3 and k is 2, remove the minimum:
[(2,2), (1,3)]
- Insert (1,3) into heap:
Result:
[2,1]
(or equivalently[1,2]
since the order doesn’t matter)
Time and Space Complexity
The time complexity of this solution is:
- O(n) for counting frequencies, where n is the number of elements in the input array
- O(m log k) for heap operations, where m is the number of unique elements
So the overall time complexity is O(n + m log k). Since m is at most n, this gives O(n log k) in the worst case. The space complexity is O(n) for the frequency map and heap.
Conclusion
The custom min heap approach offers an efficient solution for finding the top k frequent elements. By maintaining a heap of size k, we avoid the need to sort all unique elements, which is especially beneficial when k is much smaller than the number of unique elements.
Understanding this pattern of frequency counting and heap-based selection is valuable for solving many algorithm problems efficiently. In the next article, we’ll explore how to simplify this solution using C++’s STL priority queue.