Microsoft

www.microsoft.com
Engaged Employer

## Interview Question

Software Development Engineer Interview Redmond, WA

# Find the 20 longest strings in a text file.

0

If you don't have to worry about space, you could use a hash with each word being the "key" and its length being the "value". Then sort bu hash by value, and choose the top 20.

Homer Simpson on Aug 27, 2012
2

Another way can be to just traverse the strings and keep a min b-tree to keep the minimum element on the top. When size of tree + 20, then only insert (replace) in tree, if the current string length > root of tree(min) . This will take O(log20) for each insertion and O(N) for traversal.

Eizenh3im on Aug 28, 2012
1

In the hash map approach described in the first answer, instead of having the word as the key, we can have the length as the key and value as the list of words with the length pointed by the key. You can use LinkedHashSet if you want sorting. Or you can maintain the largest 20 lengths in a separate data structure.

This obviously is not a really optimal solution in terms of space complexity. Instead of that we can maintain a min-heap. Add first 20 words as it is. And then, if the new word has a larger length than the word at the top of the heap (smallest word as it's a min-heap) you remove the word from the heap and insert the new word into the heap. The heap size is always going to be 20 so you don't take much space and don't take much time to insert the word in the heap as well.

Random on Oct 10, 2012
0

The min-heap solutions looks nice..However, it would take O(n.logn) right? On worst case, we do have to insert/delete in min-heap (log n) for every n words.

Any better solution exists??
Can we use dynamic programming to find the longest word and then remove the longest word and repeat the routine for 19 more times? That would be 20times O(N) , which is better than O(n.log n).