Facebook

  www.facebook.com
  www.facebook.com

Interview Question

Software Engineer Interview

Intersection of n sets without using a hash table.

Answer

Interview Answer

9 Answers

0

Use an algorithm similar to the combine step of merge-sort

Interview Candidate on Feb 25, 2012
0

Could you please give more details on how the input was given? Was it given as n 1-D arrays or a 2-D array?

D on Mar 2, 2012
0

OK. Assuming the two sets are simple arrays of numbers, you can choose to organize them in-place (with no additional memory cost) into a heap. Then your code calls both heap to get the next number to print out the intersecting numbers. Heap costs you O(n log n).

Ravi on Mar 13, 2012
0

But the problem states "intersection of N sets" is it right?

Anon on Mar 17, 2012
2

You can do this with an auxiliary array with a size of your smallest set in linear time.

since the number of items in the intersection set of n sets is smaller or equal to the number of items in the smallest set. You can go through all the sets and count the number of elements in each and then find the minimum by o(m) in which m is the number of items in all the sets

Now concatenate all the sets together just attach the arrays at the end of each other without doing any thing making sure that the smallest array is the first array. this can be done in o(n) in which n is the number of arrays now you have a long array of size m .

You are going to insert objects of (value,key)

Start from the beginning of the big array and go x steps forward inserting each element into the auxilary array with the key 1.
let i be = 1.
 now loop x more items through the rest of the big array and for each element search the smaller array using binary search in logx. if the key == i increase it otherwise let it be.
i++
continue this until you reach the end of big array. You have done this in o(m).
Now go through the auxillary array and print the items with key == i . This is done in o(x)

so the total is o(x + mlogx + m) . Considering the fact that the x is the size of the smallest array in is just a constant and does not change with increasing the input size we can safely remove it and you have o(m).

Ali on Apr 14, 2012
0

the approach by ali is correct
here is another alternative approach same order O(n+M) => M=sum of sizes of each array

vint intersect(vector<vint> sets)
{
  int n= sets.size();
  int i=0;

  vint pointers;
  vint values;
  pointers.insert(pointers.begin(),n,0);
  values.insert(values.begin(),n,0);

  int max;
  int maxindex;
  bool cond=true;
  vint res;
  res.clear();

  for(i=0;i<n;++i)
  {
    if(sets[i].size()==0) return res; //if there is a null set return empty
    sort(sets[i].begin(),sets[i].end()); //keep sorting otherwise
    print(sets[i]);
  }

  while(true)
  {
       for(i=0;i<n;++i)
           values[i]=sets[i][pointers[i]];

       max=values[0];
       maxindex=0;
       for(i=1;i<n;++i)
       {
         if(values[i]>max)
         {
            max=values[i];
            maxindex=i;
         }
       }

       cond=true;
       for(i=0;i<n;++i)
       {
          if(values[i]<max)
      {
          cond=false;
          pointers[i]+=1;
          if(pointers[i]>=sets[i].size()) //breaking condition
               return res;
      }
       }

       if(cond)
       {
         res.push_back(values[0]);
     for(i=0;i<n;++i)
     {
       pointers[i]+=1;
       if(pointers[i]>=sets[i].size())
          return res;
     }
       }

 }
 return res;
}

light on Apr 21, 2012
0

well this approach is the combine step in merge sort generalized for n arrays
initalize pointers to beginning of each array
then move all pointers that are < max of values pointed to by the pointers;
if we hit the end of any array then we terminate

light on Apr 21, 2012
0

This i progressively reductive. If the element doesn't exist in the first (smallest) 2 arrays, it does not exist in the solution. I propose creating a BST of the smallest array O(mlogm) where m is size of smallest array and then doing a search for every element in this tree O(Nlogm) (N being size of all elements). If you track counts in the tree-node the ones <N can be thrown out. This is O(m). Hence O(m) + O(mlogm) + O(Nlogm). You can consider trimming the tree regularly to remove elements with count < (curent array number).

anon on Apr 22, 2012
0

very easy in functional

intersectb([],_Li) ->[];
intersectb(_Li,[]) ->[];
intersectb([ H1 | T1 ],[ H2 | T2 ])
 ->
     if H1 == H2 -> [H1|intersectb(T1,T2)];
        H1 > H2 -> intersectb([H1|T1],T2);
       H1 < H2 -> intersectb(T1,[H2|T2])
end.

intersect([]) -> [];
intersect([ H | []]) -> H;
intersect([ H | T ]) -> intersectb(lists:sort(H),lists:sort(intersect(T))).

Vitaly on Aug 24, 2012

Add Answers or Comments

To comment on this, Sign In or Sign Up.