Epic Interview Question: I have a log that consists of... | Glassdoor

Interview Question

Software Developer Interview(Student Candidate) Madison, WI

I have a log that consists of more than 100 million lines

 . Each line is just a data about user login, login time, etc. I want to sort them based on user login, and then if there is a tie based on login time, etc. However, I have limited memory, so don't think of storing all of them in an array. The memory can only hold n data where n is much smaller than 100 millions. You can access the disk though although it is much slower. How will you do it so that it is as efficient as possible?
technical, data structures, problem solving, programming, algorithm, algorithms

Interview Answer

5 Answers


hint: the memory can only hold n data, use them wisely.

Interview Candidate on Aug 29, 2009

we can solve this problem using n^2 complexity time
we scan the whole enters to find the smallest value and we do that in a for loop

its complexity is bad, n square
anyone knows better answer?

Anonymous on Oct 15, 2009

how about using hash table?

Anonymous on Oct 27, 2009

Use an external mergesort http://en.wikipedia.org/wiki/External_sorting

Anonymous on Oct 27, 2009

hash table

Anonymous on Feb 23, 2010

Add Answers or Comments

To comment on this, Sign In or Sign Up.