Amazon Interview Question: Find k largest/smallest numbe... | Glassdoor

## Interview Question

Software Development Engineer II Interview Seattle, WA

# Find k largest/smallest number in a series of numbers. What

data-structures will you use? Code it on white board.
Tags:
c++, c

0

For K smallest number in series you can use a max heap of size k. When you find a new number and it is smaller than root of heap, swap it with root and heapify.

Ajit on Dec 4, 2011
0

@Ajit: What're the initial values of the max heap? What happens to the first value that comes in?

Anon on Dec 5, 2011
0

Use selection sort for 'max' ( or for 'min')
If K &gt; series.length/2 then reverse the criteria- like in stead of looking for 15th highest out of 20 element array - look for (20 -15 =) 5th lowest and so on....

Kuntal on Jan 12, 2012
0

I think there's a cool solution which involves doing a pivot operation (like quicksort) until you have the top/bottom k numbers sorted. It's possible that you can get the top k numbers in a single pass of the array if you happen to choose a good pivot, and in general takes very few passes to find the top k numbers.

I coded it up in C a while back: http://privatepaste.com/1f1df9d8f0

You just call select with your list, the min index, max index, and top k numbers, respectively. It can be easily changed to find the min (just reverse the swap condition)

Jordan on Jan 19, 2012
0

@Jordan, I can no longer get at your privatepaste link, so if you see this post I am interested in answers to the following:

- since partition sizes are inherently not controllable by your choice of pivot, how do you land on an exactly k-sized partition (or even a k-sized total of the leftmost/rightmost N partitions)?

- how do you choose pivots reliably to perform this (if it isn't reliable, then it isn't always a good answer just like pathological inputs for quicksort without randomizing the pivot selecting)

- do you still just sort the partition(s) once you reach a k-sized set? I assume they want them in-order.

I would just use a heapsort that short-circuits once K elements have been popped from the internal sifting loop, which does minimal processing and returns them in-order (C# example):

public void FindKLargest(T[] A, int k)
{
. Heapify(A);
. int end = A.Length - 1;
. int k_offset = Math.Max(Math.Min(k, end), 1);
. int stop = end - k_offset;
. while (end &gt; stop)
. {
. // Save the root heap item as done
. Swap(ref A[0], ref A[end]);
. // Separate heap from saved item
. end--;
. // Re-sift the new root item
. SiftDown(A, 0, end);
. }
. // the K largest entries are now in ascending order at the end of A
}

Ed on Apr 20, 2013
0

For those of you who think building a max/min heap is O(nlogn) since heapify is O(logn) and heap build is O(n), but you are incorrect. Building a binary heap is actually O(n) because heapify depends on the height of the node, hence becoming O(h). Find more here: https://www.geeksforgeeks.org/time-complexity-of-building-a-heap/

Sam on Jul 25, 2018