Software Engineer Interview Cambridge, MA

Design a system that can efficiently scale and store the count of all unique words from a *very large* document-based corpus of text.
analytical, concurrency, architecture, parallelism, big data

While this is a great broad topic for teasing out the high- and low-level considerations of a given design, I got more ambiguity than clarification when I asked for information on how to constrain my "design".

For example, what kind of hardware would I have at my disposal? Would it run on limited in-house commodity hardware, or would I have access to unlimited cloud-based elastic resources? Does "efficiently" also mean cheaply? Would it be a one-time job, or would the corpus keep growing while the counts are being accessed? How fast is "fast enough"? The answer: "You tell me."

I ended up asking more questions and exploring suppositions rather than giving concrete answers, which is probably what I would do on the job anyway, at least up front. My "design" (or rather, my failure to reach one!) seemed to severely underwhelm the interviewer.

Interview Candidate on Aug 29, 2012

Those questions you asked are not relevant to what the interviewers want to know. They want to know HOW you would actually implement it.

Anonymous on Dec 11, 2013

i.e. make an inverted index
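
To make that concrete, here is a minimal single-machine sketch of an inverted index that also keeps per-document term counts, so a word's corpus-wide count is the sum over its postings. The names `tokenize`, `build_inverted_index`, and `total_count` are made up for illustration, the tokenizer is a deliberately naive assumption, and a real corpus would need the index sharded and kept on disk rather than in one dict.

```python
import re
from collections import Counter, defaultdict

WORD_RE = re.compile(r"[a-z']+")  # assumption: naive ASCII tokenizer


def tokenize(text):
    """Lowercase the text and split it into words."""
    return WORD_RE.findall(text.lower())


def build_inverted_index(documents):
    """Map each word to {doc_id: occurrences of the word in that document}.

    `documents` is a dict of doc_id -> text (a hypothetical input shape).
    """
    index = defaultdict(dict)
    for doc_id, text in documents.items():
        for word, occurrences in Counter(tokenize(text)).items():
            index[word][doc_id] = occurrences
    return dict(index)


def total_count(index, word):
    """Corpus-wide count of a word: the sum over its posting list."""
    return sum(index.get(word, {}).values())


docs = {"d1": "the cat sat", "d2": "The cat ran"}
index = build_inverted_index(docs)
```

With this shape, `index["cat"]` tells you which documents contain the word, and `total_count(index, "the")` gives the count the question asks for.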

Anonymous on Dec 11, 2013

map reduce
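
A rough sketch of what that could look like, shrunk down to one process: a map step counts words per document, a shuffle step routes each word to a reducer by hash, and each reducer sums its words. The thread pool only stands in for a cluster of workers, and `map_phase`, `shuffle`, `reduce_phase`, and `word_count` are illustrative names, not any framework's API.

```python
import re
import zlib
from collections import Counter, defaultdict
from concurrent.futures import ThreadPoolExecutor

WORD_RE = re.compile(r"[a-z']+")  # assumption: naive ASCII tokenizer


def map_phase(document):
    """Map: count the words of one document independently of all others."""
    return Counter(WORD_RE.findall(document.lower()))


def shuffle(partials, num_reducers):
    """Shuffle: send every partial count for a word to the same reducer bucket."""
    buckets = [defaultdict(list) for _ in range(num_reducers)]
    for partial in partials:
        for word, occurrences in partial.items():
            # crc32 is a stable hash, so the routing is the same on every worker.
            slot = zlib.crc32(word.encode("utf-8")) % num_reducers
            buckets[slot][word].append(occurrences)
    return buckets


def reduce_phase(bucket):
    """Reduce: sum the partial counts of each word in one bucket."""
    return {word: sum(values) for word, values in bucket.items()}


def word_count(documents, num_reducers=4):
    """Run map, shuffle, and reduce; threads stand in for remote workers."""
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(map_phase, documents))
        buckets = shuffle(partials, num_reducers)
        reduced = list(pool.map(reduce_phase, buckets))
    totals = {}
    for part in reduced:
        totals.update(part)  # buckets hold disjoint words, so nothing is overwritten
    return totals


counts = word_count(["the cat sat", "The cat ran", "a dog"])
```

Because each word lands in exactly one bucket, the reducers never need to coordinate, which is what lets the same shape scale out by adding machines.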

Anonymous on Jun 13, 2015

Anonymous on Jul 11, 2019

