K Closest Points to Origin

Problem Overview
In the problem "K Closest Points to Origin," you're tasked with finding the k
points that are closest to the origin (0, 0)
from a given list of points in a 2D plane. The distance between two points (x1, y1)
and (x2, y2)
is calculated using the Euclidean distance formula. The objective is to return the k
closest points to the origin, in any order.
Link to the problem: K Closest Points to Origin on LeetCode
Understanding the Problem
The task involves selecting the closest k
points from a list of 2D points based on their distance from the origin. Instead of computing the actual distance, which would involve a square root calculation, we can compare the squares of the distances as it suffices to determine the relative order.
Example
Input:
- Points =
[(1, 3), (-2, 2)]
k = 1
Output:
[(−2, 2)]
Explanation: The square of the distance of (1, 3) from the origin is 1^2 + 3^2 = 10
and for (-2, 2), it is (-2)^2 + 2^2 = 8
. Since (-2, 2)
is closer, it is returned when k = 1.
Key DSA Concepts and Theory
Heaps
A heap is a specialized tree-based data structure that satisfies the heap property. In a max-heap, for any given node i
, the value of i
is greater than or equal to the values of its children, and in a min-heap, the value of i
is less than or equal to the values of its children. Heaps are often implemented with arrays for efficiency.
Max-Heap for K Closest Points
For this problem, a max-heap can be used to efficiently keep track of the k
points closest to the origin. By maintaining a max-heap of size k
, you can ensure that the largest (furthest away) element is at the root, and can be replaced with a closer element if found.
Solution Approach
To solve this problem, the most efficient way is to use a max-heap to maintain the closest k
points:
Initialize a Max-Heap: Use a priority queue to simulate a max-heap. The heap will store the points as pairs where the key is the negative of the squared distance (to convert min-heap functionality into a max-heap).
Iterate Over Points: Iterate through each point, calculate its squared distance to the origin, and compare it with the largest (top) element in the heap.
Heap Operations:
- If the heap contains fewer than
k
elements, push the new point. - If the heap already contains
k
elements and the new point is closer, pop the largest element and push the new point.
- If the heap contains fewer than
Construct Result: The heap now contains the
k
closest points. Extract these points and return them.
C++ Code Implementation
#include <vector>
#include <queue>
#include <cmath>
using namespace std;
vector<vector<int>> kClosest(vector<vector<int>>& points, int K) {
auto distance = [](vector<int>& point) {
return point[0] * point[0] + point[1] * point[1];
};
priority_queue<pair<int, vector<int>>> maxHeap;
for (auto& point : points) {
int dist = distance(point);
maxHeap.push({-dist, point});
if (maxHeap.size() > K) {
maxHeap.pop();
}
}
vector<vector<int>> result;
while (!maxHeap.empty()) {
result.push_back(maxHeap.top().second);
maxHeap.pop();
}
return result;
}
Java Code Implementation
import java.util.*;
public class KClosestPoints {
public int[][] kClosest(int[][] points, int K) {
PriorityQueue<int[]> maxHeap = new PriorityQueue<>((a, b) ->
Integer.compare(b[0] * b[0] + b[1] * b[1], a[0] * a[0] + a[1] * a[1]));
for (int[] point : points) {
maxHeap.offer(point);
if (maxHeap.size() > K) {
maxHeap.poll();
}
}
int[][] result = new int[K][2];
while (K-- > 0) {
result[K] = maxHeap.poll();
}
return result;
}
}
Time and Space Complexity Analysis
Time Complexity:
O(N log K)
, whereN
is the number of points. This is because eachoffer
andpoll
operation on the heap takeslog K
time, and we do this forN
points.Space Complexity:
O(K)
for storing the closestk
points in the heap.
Common Mistakes to Avoid
- Misinterpretation of Distance Calculation: Ensure to use the squared distance (without the square root) to avoid unnecessary floating-point calculations.
- Max-Heap Logic: Using a max-heap (via negative distances) correctly, as Java’s
PriorityQueue
is inherently a min-heap. - Handling Edge Cases: Such as when
k
is larger than the number of points or when points list is empty.
Similar Problems on LeetCode
- #973 K Closest Points to Origin
- #215 Kth Largest Element in an Array
- #703 Kth Largest Element in a Stream
Additional Resources and References
- Heaps and Priority Queues on Wikipedia.
- Euclidean Distance description and formula.
- Priority Queues in C++ Documentation.
- PriorityQueue in Java API Reference.
Understanding heaps and how they can be applied to windowed computations is essential to mastering algorithms that require partial orderings, such as "K Closest Points to Origin". This problem efficiently teaches how to leverage heaps for maintaining a dynamic list of important elements based on predefined criteria.