Machine learning intern Interview Questions
4K
Machine Learning Intern interview questions shared by candidates
Suppose you have a matrix of numbers. How can you easily compute the sum of any rectangle (i.e. a range [row_start, row_end, col_start, col_end]) of those numbers? How would you code this?
7 Answers↳
Compute the sum of the rectangles, for all i,j, bounded by (i,j), (i,m), (n,j), (n,m), where (n,m) is the size of the matrix M. Call that sum s(i,j). You can calculate s(i,j) by dynamic programming: s(i,j) = M(i,j) + s(i+1,j) + s(i,j+1) - s(i+1,j+1). And the sum of any rectangle can be computed from s(i,j). Less
↳
Awesome!!
↳
The answer is already popular in computer vision fields!! It is called integral imaging. See this page http://en.wikipedia.org/wiki/Haar-like_features Less

Why does one use MSE as a measure of quality. What is the scientific/mathematical reason for the same?
3 Answers↳
Mean-Square error is an error metric for measuring image or video quality it is popular video and image quality metric because the analysis and mathematics is easier with this L2-Norm metric. Most video and image quality experts will agree that MSE is not a very good measure of perceptual video and image quality. Less
↳
The mathematical reasoning behind the MSE is as follows: For any real applications, noise in the readings or the labels is inevitable. We generally assume this noise follows Gaussian distribution and this holds perfectly well for most of the real applications. Considering 'e' follows gaussian distribution in y=f(x) + e and calculating the MLE, we get MSE which is also L2 distance. Note: Assuming some other noise distribution may lead to other MLE estimate which will not be MSE. Less
↳
MSE is used for understanding the weight of the errors in any model. This helps us understand model accuracy in a way that is helpful when choosing different types of models. Check out more answers on InterviewQuery.com Less

How would you design a recommendation system (like amazon)?
2 Answers↳
Use collaborate filtering to compare personal preference with others. If A and B are similar, we can recommend preferred items in B to A. Less
↳
Why downvote on other answer? He/she is right. Collaborative filtering is the most common strategy for recommendation systems. You see user A buys these things and user B also bought those things but user B bought this other thing too so let's show that thing to User A. Less

The three data structure questions are: 1. the difference between linked list and array; 2. the difference between stack and queue; 3. describe hash table.
1 Answers↳
Wow... pathetically easy

Have you ever had your code formally verified?
6 Answers↳
What were the online coding questions like? Could you elaborate?
↳
Object detection. Is that what yours was?
↳
it is same as mine. Could you give me more details about the online coding? what algorithm did they test on object detection part? Less


What are some of the projects that you have done?
4 Answers↳
Do you mind to share what are the hard leetcode questions they asked during the interview? Less
↳
I dont think it's fair to share which question they asked. But the exact same question is on leetcode and the difficulty level is hard. Less
↳
What topic you are being ask from in leetcode? also did they ask you system design and CS fundamentals. Less

Give an image, when we take 2 sub images from it, calculate the ratio similar to AnB/AuB.
4 Answers↳
Coded in python but wasn't able to finish it
↳
Can you elaborate on the question
↳
Given a matrix and coordinates of 2 rectangles calculate the weighted IoU in linear/constant time. Less

Design round: Design an api rate limiter Coding round: simple manipulation of arrays and maps Craft round: Design an ML Labelling system
3 Answers↳
APi rate limiter was really simple, just look at uber/ratelimit on git and thats it. Rest was farily easy Less
↳
There will be many documents in a document database. The labelling system must use machine learning to label into different categories. Eg help desk, system document, technical. There will a small train dataset available but not entirely reliable. Less
↳
The correct answer would be to use a combination of weak learning methods and gradually incorporate feedback and make it stronger Less

how to sort in O(Logn) time
3 Answers↳
I don't think you can sort in O(logn) because you will need to go through the whole data at least once, making it O(n). Indeed, you can do it in O(logn) if the data is guarantee with some specific constrain or relationship. I think the best you can sort a completely random data is O(nlogn). Less
↳
I didn't come up with the answer. it is not difficult, just not prepared
↳
what is the question