Microsoft Interview Question: Find the 20 longest strings i... | Glassdoor

Find your next job here

Interview Question

Software Development Engineer Interview Redmond, WA

Find the 20 longest strings in a text file.


Interview Answer

6 Answers


If you don't have to worry about space, you could use a hash with each word being the "key" and its length being the "value". Then sort bu hash by value, and choose the top 20.

Homer Simpson on Aug 27, 2012

Another way can be to just traverse the strings and keep a min b-tree to keep the minimum element on the top. When size of tree + 20, then only insert (replace) in tree, if the current string length > root of tree(min) . This will take O(log20) for each insertion and O(N) for traversal.

Eizenh3im on Aug 28, 2012

In the hash map approach described in the first answer, instead of having the word as the key, we can have the length as the key and value as the list of words with the length pointed by the key. You can use LinkedHashSet if you want sorting. Or you can maintain the largest 20 lengths in a separate data structure.

This obviously is not a really optimal solution in terms of space complexity. Instead of that we can maintain a min-heap. Add first 20 words as it is. And then, if the new word has a larger length than the word at the top of the heap (smallest word as it's a min-heap) you remove the word from the heap and insert the new word into the heap. The heap size is always going to be 20 so you don't take much space and don't take much time to insert the word in the heap as well.

Random on Oct 10, 2012

The min-heap solutions looks nice..However, it would take O(n.logn) right? On worst case, we do have to insert/delete in min-heap (log n) for every n words.

Any better solution exists??
Can we use dynamic programming to find the longest word and then remove the longest word and repeat the routine for 19 more times? That would be 20times O(N) , which is better than O(n.log n).

Any comments?

Cijo Thomas on Oct 28, 2012

Correction: The min-heap version takes n.log(k) only which is O(n) only. -This looks the best solution so far.

Cijo Thomas on Oct 28, 2012

Use selection rank algorithm. Expected time complexity o(n)

Anonymous on Jan 16, 2018

Add Answers or Comments

To comment on this, Sign In or Sign Up.