K Closest Points to Origin

Hero Image

DT

Dhaval Trivedi

Co-founder, Airtribe

Problem Overview

In the problem "K Closest Points to Origin," you're tasked with finding the k points that are closest to the origin (0, 0) from a given list of points in a 2D plane. The distance between two points (x1, y1) and (x2, y2) is calculated using the Euclidean distance formula. The objective is to return the k closest points to the origin, in any order.

Link to the problem: K Closest Points to Origin on LeetCode

Understanding the Problem

The task involves selecting the closest k points from a list of 2D points based on their distance from the origin. Instead of computing the actual distance, which would involve a square root calculation, we can compare the squares of the distances as it suffices to determine the relative order.

Example

Input:

  • Points = [(1, 3), (-2, 2)]
  • k = 1

Output:

  • [(−2, 2)]

Explanation: The square of the distance of (1, 3) from the origin is 1^2 + 3^2 = 10 and for (-2, 2), it is (-2)^2 + 2^2 = 8. Since (-2, 2) is closer, it is returned when k = 1.

Key DSA Concepts and Theory

Heaps

A heap is a specialized tree-based data structure that satisfies the heap property. In a max-heap, for any given node i, the value of i is greater than or equal to the values of its children, and in a min-heap, the value of i is less than or equal to the values of its children. Heaps are often implemented with arrays for efficiency.

Max-Heap for K Closest Points

For this problem, a max-heap can be used to efficiently keep track of the k points closest to the origin. By maintaining a max-heap of size k, you can ensure that the largest (furthest away) element is at the root, and can be replaced with a closer element if found.

Solution Approach

To solve this problem, the most efficient way is to use a max-heap to maintain the closest k points:

  1. Initialize a Max-Heap: Use a priority queue to simulate a max-heap. The heap will store the points as pairs where the key is the negative of the squared distance (to convert min-heap functionality into a max-heap).

  2. Iterate Over Points: Iterate through each point, calculate its squared distance to the origin, and compare it with the largest (top) element in the heap.

  3. Heap Operations:

    • If the heap contains fewer than k elements, push the new point.
    • If the heap already contains k elements and the new point is closer, pop the largest element and push the new point.
  4. Construct Result: The heap now contains the k closest points. Extract these points and return them.

C++ Code Implementation

#include <vector>
#include <queue>
#include <cmath>
using namespace std;

vector<vector<int>> kClosest(vector<vector<int>>& points, int K) {
    auto distance = [](vector<int>& point) {
        return point[0] * point[0] + point[1] * point[1];
    };
    
    priority_queue<pair<int, vector<int>>> maxHeap;
    
    for (auto& point : points) {
        int dist = distance(point);
        maxHeap.push({-dist, point});
        
        if (maxHeap.size() > K) {
            maxHeap.pop();
        }
    }
    
    vector<vector<int>> result;
    while (!maxHeap.empty()) {
        result.push_back(maxHeap.top().second);
        maxHeap.pop();
    }
    
    return result;
}

Java Code Implementation

import java.util.*;

public class KClosestPoints {
    public int[][] kClosest(int[][] points, int K) {
        PriorityQueue<int[]> maxHeap = new PriorityQueue<>((a, b) -> 
            Integer.compare(b[0] * b[0] + b[1] * b[1], a[0] * a[0] + a[1] * a[1]));
        
        for (int[] point : points) {
            maxHeap.offer(point);
            if (maxHeap.size() > K) {
                maxHeap.poll();
            }
        }
        
        int[][] result = new int[K][2];
        while (K-- > 0) {
            result[K] = maxHeap.poll();
        }
        
        return result;
    }
}

Time and Space Complexity Analysis

  • Time Complexity: O(N log K), where N is the number of points. This is because each offer and poll operation on the heap takes log K time, and we do this for N points.

  • Space Complexity: O(K) for storing the closest k points in the heap.

Common Mistakes to Avoid

  1. Misinterpretation of Distance Calculation: Ensure to use the squared distance (without the square root) to avoid unnecessary floating-point calculations.
  2. Max-Heap Logic: Using a max-heap (via negative distances) correctly, as Java’s PriorityQueue is inherently a min-heap.
  3. Handling Edge Cases: Such as when k is larger than the number of points or when points list is empty.

Similar Problems on LeetCode

  1. #973 K Closest Points to Origin
  2. #215 Kth Largest Element in an Array
  3. #703 Kth Largest Element in a Stream

Additional Resources and References

Understanding heaps and how they can be applied to windowed computations is essential to mastering algorithms that require partial orderings, such as "K Closest Points to Origin". This problem efficiently teaches how to leverage heaps for maintaining a dynamic list of important elements based on predefined criteria.