Quantitative Researcher Interview Questions
Quantitative researcher interview questions shared by candidates.
Top Interview Questions
Quantitative Analyst at Morgan Stanley was asked...
What's the best unbiased estimator for a series of random variables?

Answers:
- I guess it is just a Gaussian (normal) distribution, since it has the smallest uncertainty (from a quantum point of view), i.e. the smallest variance.
- It is the OLS estimator (under the Gauss-Markov assumptions plus normality), by Fisher's theorem on maximum likelihood estimators.
- The question didn't say linear. If it's linear, then OLS; if not, the CEF (conditional expectation function).
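As a small empirical illustration of the efficiency claims in the thread, here is a sketch (assuming i.i.d. N(0,1) samples, an assumption not stated in the question) comparing the variance of the sample mean against the sample median as estimators of the location parameter; for normal data the sample mean is the minimum-variance unbiased estimator:

```python
import random
import statistics

# Illustrative assumption: estimate the mean of i.i.d. N(0, 1) samples.
# For normal data the sample mean attains the Cramer-Rao bound (variance 1/n);
# the sample median is also unbiased here but has roughly pi/2 times the variance.
random.seed(0)

def estimator_variance(estimator, n=25, trials=20000):
    estimates = []
    for _ in range(trials):
        sample = [random.gauss(0.0, 1.0) for _ in range(n)]
        estimates.append(estimator(sample))
    return statistics.pvariance(estimates)

var_mean = estimator_variance(statistics.mean)      # theory: 1/25 = 0.04
var_median = estimator_variance(statistics.median)  # theory: ~pi/50 = 0.063

print(var_mean, var_median)
```

The gap between the two Monte Carlo variances is the efficiency loss the OLS/MLE answer is alluding to.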
Quantitative Analyst at Jane Street was asked...
You have two decks of cards, each containing both red and black cards. One deck has twice the number of cards of the other, with the same color ratio (so one deck has 52 cards and the other 104, both half red and half black). I offer you a game: first, you choose which deck you want to play with; second, you draw 2 cards at random from your deck of choice. If both are red, I give you a Ferrari. Which deck do you choose?

Answers:
- The unalert interviewee would answer "it doesn't matter, the probability is the same." While this is true for the first card, you have a higher probability of drawing a second red card from the big deck than from the small one. So I chose the big deck, and I was right.
- Mathematically: (26/52 * 25/51) vs. (52/104 * 51/103), i.e. 25/102 vs. 51/206, and 51/206 > 25/102, so the big deck is better.
- Actually, the probabilities are the same for each deck. Because you have a 50/50 chance of drawing your first card red, there's a 50% chance the numerator of the next fraction is reduced by one, so your probability is 26/52 * (25/51 + 26/51)/2 vs. 52/104 * (51/103 + 52/103)/2, which are the same.
- "G" did the right calculation. To compute the probability of drawing two red cards in a row, you need the event that the first card was red AND the second card was red. The question asks for the probability of drawing two red cards in a row, NOT the probability of drawing a red card followed by either a red or a black card.
- Intuition tells us that if you add the same number of red and black cards to the original 52-card problem, the probability goes up. Just imagine adding a billion red cards and a billion black cards.
- You should definitely choose the larger deck if both are 50% red, 50% black. Here's another explanation in addition to the correct ones above: each deck is naturally partitioned into maximal sub-stacks, where each sub-stack consists of cards of a single color. If it is known ahead of time that half the cards are one color and half the other, then the expected size of the sub-stacks increases with the number of cards in the deck.
- Extreme case: deck 1 has 1 red and 1 black; deck 2 has 2 red and 2 black. So the more cards, the better your chances.
- 2 out of 52 is the equivalent of 4 out of 104. For these chances to be equal, the problem would have to be "2 out of 52" vs. "4 out of 104." If you still only draw two cards to try to get two reds, then the chances should be better with the smaller deck.
- Either one. If you don't like how the table is set, rearrange it: I would offer the dealer half of the reward to let me draw 5 more times from the deck with more cards.
- It seems that the odds are greater with the larger number of cards at a 2:1 ratio.
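The exact numbers debated in the thread can be checked directly; a minimal sketch using exact rational arithmetic:

```python
from fractions import Fraction

def p_two_reds(red, total):
    """Probability that two cards drawn without replacement are both red."""
    return Fraction(red, total) * Fraction(red - 1, total - 1)

small = p_two_reds(26, 52)    # 26/52 * 25/51 = 25/102
large = p_two_reds(52, 104)   # 52/104 * 51/103 = 51/206

print(small, large, large > small)
```

This confirms the larger deck wins, and by a small margin: 51/206 ≈ 0.2476 vs. 25/102 ≈ 0.2451.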
3) Poker. 26 red, 26 black. You take one card at a time, and each time you can choose to guess whether it's red. You have only one chance to guess. If you are right, you get 1 dollar. What's the strategy, and what are the expected winnings?

Answers:
- Expected winnings are 25 cents: 1/2 * 1/2 * 1 — the probability of choosing to guess is 1/2, the probability of guessing right is 1/2, and the payoff is $1.
- I would start turning over cards without guessing, to reduce the sample size. This is risky: I could just as easily lower my chances by removing red cards first as raise them by removing black cards first, but I like my chances that at some point the proportion of reds gets back to at least 50%. My strategy would be to turn 3 cards without deciding. If I start by seeing more reds, so the probability of red drops below 50%, I wait until it gets back to 50% (the risk being that it never does). If I see more black cards, I keep going without guessing until the odds are in my favor and the sample is smaller. The expected return then depends on when you decide to guess: guessing at a 50% chance of red gives an expected 50 cents, at 51% it's 51 cents, at 75% it's 75 cents — the expected return is a direct function of the probability of being correct at the moment you guess. Thoughts?
- There is symmetry between red and black. Each time you pull a card it is equally likely to be red or black (assuming you haven't looked at the previous cards you pulled). Thus no matter when you guess, your odds are 50% and the expected return is 50 cents.
- Scheme: guess when the first card is black. P(guess) x P(right) x $1 = 1/2 x 26/51 = 13/51.
- 0.5 — just turn the first card and see if it's red. I think it's more about trading psychology: if you don't know where the price is going, just get out of the market ASAP. Don't expect anything.
- The problem should be: cards are drawn at random without replacement, and on every draw you have one chance to guess. Then the strategy is to guess red at random on the first draw. If correct, there are now fewer reds than blacks left, so guess black on the next draw; if the first guess was wrong, guess red again. It's all about conditioning on the information from the previous drawings.
- This should be similar to the brainteaser about picking the optimal place in a queue ("if you are the first person whose birthday matches anyone in front of you, you win a free ticket"). Here we want the n that maximizes P(first n cards are black) * P((n+1)th card is red | first n cards are black), and we call the (n+1)th card.
- The problem statement is not very clear. What I understand is: you take one card at a time, and you can either guess or look at it. If you guess and it's red, you gain $1, and whatever the result, the game is over after the guess. The answer is then $0.50 under any strategy. Suppose x red and y black cards remain: if you guess, your chance of winning is x/(x+y). If instead you look at the card and guess the next one, your chance of winning is x/(x+y)*(x-1)/(x+y-1) + y/(x+y)*x/(x+y-1) = x/(x+y), which is the same. A rigorous proof goes by induction, starting from x, y = 0, 1.
- The answer above is not 100% correct: if you don't guess and only look, the total probability of red on the next card is indeed the same, but the fact that you looked means you know which of the two conditional probabilities, (x-1)/(x+y-1) or x/(x+y-1), applies. The argument therefore only holds if you don't look at the cards you pass.
- It doesn't matter what strategy you use; the probability is 1/2. This is a consequence of the Optional Stopping Theorem: the fraction of red cards left in the deck at each time is a martingale, and choosing when to stop and guess red is a stopping time. The expected value of a martingale at a stopping time equals its initial value, which is 1/2.
- (For the guess-every-card variant:) My strategy was always to pick the color that has been drawn fewer times so far, since more of that color remains in the deck. In the model, n = the number of cards already drawn, k = the number of black cards among them, and m = min(k, n-k), i.e. the count of whichever color has been drawn less often. After n draws we can face n+1 situations, k = 0, 1, ..., n. I take each situation to have probability (n choose m)/2^n, and in it the probability of winning the next pick is (26-m)/(52-n), since 26-m cards of the chosen color remain in a deck of 52-n cards. Combining them gives [(n choose m)/2^n] * [(26-m)/(52-n)]; sum over k from 0 to n, then over n from 0 to 51 (after the 52nd pick there is nothing left to choose). This is too much computation by hand, so I wrote it quickly in Python and got 37.2856419726, a significant improvement over the basic strategy of always choosing the same color.
- Dynamic programming: let E(R,B) be the expected gain with R red and B black cards remaining, with the strategy of guessing whichever color is more numerous. Then E(0,B) = B for all B, E(R,0) = R for all R, and E(R,B) = [max(R,B) + R*E(R-1,B) + B*E(R,B-1)]/(R+B). I don't know how to evaluate E(26,26) quickly by hand.
- The question, to me, is not clear — perhaps on purpose. If so, the best answer would involve asking for clarification.
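The recurrence in the dynamic-programming answer can be evaluated directly with memoization. This is a sketch of the guess-every-card variant that E(R,B) describes, not the single-guess game (whose value is $0.50):

```python
from functools import lru_cache

# Recurrence from the thread: with R red and B black cards left, guess the
# majority color. E(R,B) = [max(R,B) + R*E(R-1,B) + B*E(R,B-1)] / (R+B).
@lru_cache(maxsize=None)
def expected_correct(r, b):
    if r == 0:
        return float(b)  # only black cards remain: guess black every time
    if b == 0:
        return float(r)  # only red cards remain: guess red every time
    return (max(r, b) + r * expected_correct(r - 1, b)
            + b * expected_correct(r, b - 1)) / (r + b)

print(expected_correct(26, 26))  # about 30.04 correct guesses on average
```

Note the DP value (~30.04) disagrees with the 37.28 figure above; the (n choose m)/2^n weighting in that answer does not model drawing without replacement, whereas the recurrence conditions on the exact deck composition.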
The first question he gave me was not hard. 1. You call someone's house and ask if they have two children. The answer happens to be yes. Then you ask if one of the children is a boy. The answer happens to be yes again. What's the probability that the second child is a boy? 2. (Much harder) You call someone's house and ask if they have two children. The answer happens to be yes. Then you ask if one of the children is named William. The answer happens to be yes again. (We assume William is a boy's name, and that it's possible both children are Williams.) What's the probability that the second child is a boy?

Answers:
- The second answer should be 1/3.
- Let's go through the conditional probabilities. If the first child is named William and the second is NOT, the probability of the second child being a boy is 1/2. If the second child is named William and the first is NOT, the probability is 1. If both children are named William, the probability is 1. Assuming equal probabilities of the first, second, or both children being named William, the total probability of the second child being a boy is (1/3)(1/2) + (1/3)(1) + (1/3)(1) = 5/6.
- You're both incorrect. The first answer is 2/3 and the second answer is 4/5. Let B stand for a boy not named William, Bw for a boy named William, and G for a girl. 1. The sample space is (B,B), (B,G), (G,B); two of these three cases have a boy as the second child, so the probability is 2/3. 2. The sample space is (Bw,Bw), (Bw,G), (G,Bw), (Bw,B), (B,Bw); four of the five cases have a boy as the second child, so the probability is 4/5.
- The first answer is 2/3, as mentioned, but the second question hasn't been answered correctly here. Given a person X, define e = P[X's name = William | X is a boy]. Then P[X's name = William] = P[X's name = William | X is a boy] * P[X is a boy] = e/2 (the problem states that William is a boy's name). Letting C1 be child 1 and C2 be child 2, we want P[C2 is a boy | Y], where Y denotes the event "C1 is named William or C2 is named William." By Bayes' rule, P[C2 is a boy | Y] = P[C2 is a boy, Y] / P[Y]. Denominator: P[Y] = 1 - P[C1 not named William AND C2 not named William] = 1 - e/2 * e/2 = 1 - e^2/4. Numerator: P[C2 is a boy, Y] = P[Y | C2 is a boy] * P[C2 is a boy] = 1/2 * P[C1 or C2 named William | C2 is a boy]. Three cases: C1 is William and C2 isn't (given C2 is a boy): e/2 * (1-e); C2 is William and C1 isn't: e * (1 - e/2); both are Williams: e/2 * e. Summing gives 3e/2 - e^2/2, the numerator. Dividing: (3e/2 - e^2/2) / (1 - e^2/4) = e(3 - e)/(4 - e^2). Sanity check: e = 1 means all boys are named William, giving 2/3 as in the first question; e = 0 means no boys are named William, giving probability 0.
- Unless William's prior distribution is provided, the only information we have is that William is a boy's name, so the event reduces to "at least one of them is a boy," giving 2/3. Be careful when calculating probabilities with outcomes that are NOT equally likely.
- 1. BB, BG, GB, GG with probability 1/4 each, later reduced to BB, BG, GB with probability 1/3 each, so P(BB) = 1/3. 2. Let w be the probability of the name William. The probability of at least one William in the family is 2w - w^2 for BB, w for BG, w for GB, and 0 for GG. So P(BB | at least one William) = (2w - w^2)/(2w + 2w - w^2) ~ 1/2 for small w.
- 1. P[C2 is a boy | C1 is a boy] = 0.5, since these are independent events. 2. P[C2 is a boy | C1 is a boy and is called William] = 0.5, for the same reason.
- The anonymous answer of Sep 28, 2014 gets closest, but the calculation P[Y] = 1 - P[C1 not named William AND C2 not named William] should give 1 - (1 - e/2)(1 - e/2) = e - e^2/4, not 1 - e^2/4, which overstates P(Y). For example, with e = 0.5 (half of all boys named William), e - e^2/4 gives P(Y) = 7/16, while 1 - e^2/4 gives 15/16 — far too high, since there are many ways for neither child to be a William (both girls, boys not named William, etc.). The final answer then becomes (3e/2 - e^2/2) * (1/2) / (e - e^2/4) = (3e - e^2)/(4e - e^2) = (3 - e)/(4 - e). One reason I doubted this is that setting e = 0 no longer gives 0, but e = 0 violates the question's assumptions: if no boy is named William yet William is a boy's name, then no one is named William, and the question could never have come up with a person named William.
- I think "second child" refers to the other child (the one not on the phone). In that case the answer to the first question is 1/3 and to the second is (1-p)/(2-p), where p is the total probability of the name William. As a sanity check, if all boys are named William, the answers coincide.
- I think you guys are doing way too much; this is a trick question, and these are completely independent events. I could name my child William, or Lollipop — the chances of it being a boy are still 0.5, regardless of its sibling's name. (I'm going with 0.5 for both questions.) Keep in mind this is a quick phone interview question, so they won't ask anything too calculation-heavy, with e^2 exponents and fractions; you're supposed to be able to do it in your head.
- The answer by indosaurabh is correct. I go to Harvard.
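The competing closed forms in the thread can be checked by simulation. A sketch, assuming (as the e-based answers do) that girls are never named William and each boy is independently named William with probability e; it estimates P(child 2 is a boy | at least one child is named William) and compares it to (3 - e)/(4 - e):

```python
import random

random.seed(1)
e = 0.5           # assumed probability that a boy is named William
trials = 200_000
conditioned = boy2 = 0

for _ in range(trials):
    child_is_boy = [random.random() < 0.5 for _ in range(2)]
    is_william = [b and random.random() < e for b in child_is_boy]
    if is_william[0] or is_william[1]:   # condition: some child is a William
        conditioned += 1
        boy2 += child_is_boy[1]          # is child 2 a boy?

estimate = boy2 / conditioned
print(estimate)  # theory: (3 - 0.5)/(4 - 0.5) = 5/7, about 0.714
```

With e = 1 the same simulation reproduces the classic 2/3, matching the sanity check in the thread.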
Quantitative Researcher at Citadel was asked...
Given log X ~ N(0,1), compute the expectation of X.

Answers:
- This is a basic probability question.
- Exp[1/4].
- exp(mu + sigma^2/2) = exp(0 + 1/2) = exp(1/2).
- Let Y = log(X), so X = exp(Y) = r(Y). Calling the pdf of X f(X), E[X] = integral of X f(X) dX. By the change-of-variables formula, f(x) = g(r^{-1}(x)) * (r^{-1}(x))'; plugging this into E[X] gives the integral of f(y) dy, which equals 1.
- Suppose the density of Y is P(y) and that of X is F(x); then P(y) dy = F(x) dx, so E(X) = integral of x F(x) dx = integral of exp(y) P(y) dy. Plugging in the Gaussian density with its standard deviation and completing the square, E(X) = integral of exp(1/2) P(y - 1/2) d(y - 1/2) = exp(1/2). So mojo's answer is correct.
- I'm not so sure, as I got E(x) = 4. I substituted log X = y, so e^y = X, and then e^{2y} = t — and please do not forget to change the integration limits.
- Do they care if you explain the theory or not? I just looked at it: it's standard normal, therefore... sorry, misread the problem; ignore this.
- X has a log-normal distribution, so yes, the mean is exp(mu + sigma^2/2) = exp(1/2).
- Expanding on the correct answers above: E[X] = E[exp(log X)], and log X is normally distributed, so E[X] is the moment-generating function (mgf) of a standard normal evaluated at 1. The mgf of a normal with mean mu and SD sigma is exp(mu*t + (1/2) sigma^2 t^2); set mu = 0, sigma = 1, t = 1 to get exp(1/2).
- Complete the square in the integral.
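The claimed value exp(1/2) can be verified numerically; a sketch integrating E[e^Y] for Y ~ N(0,1) with a plain trapezoid rule:

```python
import math

# E[X] = E[e^Y] = integral of e^y * phi(y) dy, phi = standard normal density.
def integrand(y):
    return math.exp(y) * math.exp(-y * y / 2) / math.sqrt(2 * math.pi)

lo, hi, n = -10.0, 10.0, 200_000
h = (hi - lo) / n
total = 0.5 * (integrand(lo) + integrand(hi))
total += sum(integrand(lo + i * h) for i in range(1, n))
total *= h

print(total, math.exp(0.5))  # both about 1.6487
```

Truncating at ±10 is safe here because e^y * phi(y) = e^{1/2} * phi(y - 1), so the neglected tails are those of an N(1,1) density beyond ±10.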
Quantitative Trader at Jane Street was asked...
If we flip a coin 100 times, what is the probability of getting an even number of heads?

Answers:
- 1/2. Let p(e,n) be the probability of an even number of heads in n tosses. The recursion is p(e,n) = P(heads on toss n AND odd heads after n-1 tosses) + P(tails on toss n AND even heads after n-1 tosses) = 1/2 p(o,n-1) + 1/2 p(e,n-1). But p(o,n-1) = 1 - p(e,n-1), so p(e,n) = 1/2 (1 - p(e,n-1)) + 1/2 p(e,n-1) = 1/2.
- 1/2. Whether the number of heads is even or odd is ultimately determined by the nth flip, for any n, and the flip switches even to odd (or vice versa) with probability 1/2.
- ((2^50)+1)/(2^100). Even counts run from 0 heads to 100 heads. There are 50% even complements and 50% odd complements from 1 to 100, so take half the sample space; then the zero case (no heads) has only 1 outcome.
- Correction: ((2^99)+1)/(2^100), by the same reasoning with the even/odd split taken over 1 to 100 heads plus the single zero-heads outcome.
- 1. After 0-99 flips: 50% even, 50% odd. 2. P(even after 100 flips) = P(odd after 99)*0.5 + P(even after 99)*0.5 = 0.5*0.5 + 0.5*0.5 = 0.5.
- All your answers are wrong. Consider a game with 4 coins: there are 2^4 = 16 combinations, and the feasible combinations with a 50/50 head-tail split are 1100, 1010, 1001, 0110, 0101, 0011 — just 6 out of 16, a probability of 3/8. The formula of Anonym gives (2^2+1)/(2^4) = 5/16 or (2^3+1)/(2^4) = 9/16, and the solution of ky, P(odd after 3)*0.5 + P(even after 3)*0.5, forgets about 111x. However, we can calculate the distribution of all tosses: model 0 = tail, 1 = head and look at the sum S of the tosses. The central limit theorem tells us that [S - E(S)]/sqrt(100*0.25) is approximately standard normal, so S ~ N(50, 25) and on average we see heads and tails 1:1.
- (Sum over i from 0 to 50, inclusive, of C(100, 2i)) * (1/2^100).
- Answer: 1/2. Lemma 1: for any n >= 1, the sum over even i from 0 to n of (n choose i) equals the sum over odd j from 0 to n of (n choose j). Proof: for odd n this follows from (n choose r) = (n choose n-r) — with n odd, r even implies n-r odd, and vice versa, so the even-indexed sum equals the odd-indexed sum. For even n, the sum over even i of (n choose i) = sum over r from 0 to n-1 of (n-1 choose r) = sum over odd j of (n choose j); in other words, the sum of the even-position terms in the nth row of Pascal's triangle equals the sum of the entire (n-1)th row. For example, with rows 1 3 3 1 and 1 4 6 4 1: 1+6+1 = 1+3+3+1 = 4+4. Since the even and odd sums are equal and together total 2^n, the even sum is 2^(n-1). Therefore, in n >= 1 coin tosses, P(even number of heads) = 2^(n-1)/2^n = 1/2.
- There is a bijection between the set of toss sequences with an even number of heads and the set with an odd number of heads. (One can prove this by induction: it is true for n = 1, and the induction step is trivial because the even sets are carried over to the odd sets and vice versa.) Hence p = 1/2.
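The answer 1/2 can be confirmed exactly; a sketch summing the even binomial coefficients with exact rational arithmetic:

```python
from fractions import Fraction
from math import comb

# P(even number of heads in 100 fair flips) = sum over even k of C(100,k)/2^100.
p_even = sum(Fraction(comb(100, k), 2**100) for k in range(0, 101, 2))
print(p_even)  # 1/2
```

This is the lemma above in computational form: the even-indexed binomial coefficients sum to 2^99.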
How many numbers between 1 and 1000 contain a 3?

Answers:
- 300 numbers contain a 3 in some position, but that counts numbers of the form x33, 3x3, and 33x *twice*, so subtract them: 300 - 30 = 270. But 333 has now been subtracted once too many, so add it back: 300 - 30 + 1 = 271. The answer is 271.
- 271 does not seem right; I think it is 111. Aren't there just 11 numbers containing a 3 between 0 and 100, like 3, 13, 23, 30, 33, etc.? So between 0 and 1000 there are 111, after adding in the 300s.
- Thanks, Candidate — or more simply: 300 to 399 is 100 numbers, and each other block x00 to x99 contributes 19 numbers, so 19 * 9 = 171, for a total of 271.
- Eliminating one character from a base-10 numbering system gives a base-9 system: 9^3 = 729 three-digit strings with no 3, and since 10^3 = 1000, we get 10^3 - 9^3 = 271 numbers that contain a 3.
- 1000 does not contain a 3, so count the three-digit strings without a 3: there are 9 choices for each of the three digits, so 729 numbers without a 3, and 1000 - 729 = 271 with one.
- A simpler, faster method: let A be the count of numbers between 1 and 1000 that contain a 3 and A' the count that do not. Each of the 3 digits can take 10 values; excluding 3 leaves 9 possibilities, so A' = 9^3 = 729, and A = 1000 - 729 = 271.
- I'm in 7th grade and this question is easy: first count the numbers which DON'T have a 3 — 9 choices for the first digit, 9 for the second, 9 for the third. This counts 0 to 999 instead of 1 to 1000, but that's okay because it's the same amount of numbers: 9^3 = 729, and |729 - 1000| = 271 (I like using absolute value because it makes me look cool; it's just a way to show "difference" in math).
- Count by position of the first 3: 3xx gives 100; a3x with a not 3 (to avoid duplication) gives 10*9 = 90; ab3 with a and b both not 3 gives 9*9 = 81; total 271.
- You can use this formula for any power of 10: T(n+1) = 9*T(n) + 10^n, where T(n) is the count for the previous power of 10, starting from T = 1 for the numbers 1 to 10. For 100: 9*1 + 10^1 = 19. For 1000: 9*19 + 10^2 = 271. For 10,000: 9*271 + 10^3 = 3439.
- It is an infinite number: 3.1, 3.11, 3.111, 3.1111...
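A brute-force check of the counting arguments above:

```python
# Count how many integers in [1, 1000] contain the digit 3.
with_three = sum('3' in str(n) for n in range(1, 1001))
without_three = 1000 - with_three
print(with_three, without_three)  # 271 and 729, matching 1000 - 9^3
```

Enumerating directly settles the 271-vs-111 disagreement in a line.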
Quantitative Researcher at Jane Street was asked...
You have two decks of cards: a 52-card deck (26 black, 26 red) and a 26-card deck (13 black, 13 red). You randomly draw two cards and win if both are the same color. Which deck would you prefer? What if the 26-card deck was randomly drawn from the 52-card deck — which deck would you prefer then?

Answers:
- I responded immediately to the first part. The second part took me a bit longer: I immediately said my intuition was that the third deck and the first deck were equally good, but it took me about 30 seconds to produce a rigorous proof.
- Actually, I think the third deck is better than the first. The game says "draw two cards of the same color," not "draw two black cards." Compare three decks: 13 black and 13 red, 26 black, and 26 red. The chances of drawing two same-color cards are 12/25, 1, and 1 respectively. A little math shows that any distribution of 26 cards is better than or equally as good as a 13-red, 13-black distribution.
- @curious_cat: I think that only implies that the third deck is better than the second deck.
- 1) P(win | 52-card deck) = 25/51 and P(win | even 26-card deck) = 12/25, so the 52-card deck is better. 2) P(win | n red cards in a random 26-card deck) = (n/26)((n-1)/25) + ((26-n)/26)((25-n)/25) = n^2/325 - 2n/25 + 1. Taking the derivative and solving, 2n/325 - 2/25 = 0 gives n = 13, which is a minimum. Interpretation: an equal red/black split MINIMIZES your chance of winning, so the randomly selected deck is better than the evenly split 26-card deck — but is it better than the 52-card deck? Use the hypergeometric distribution (like the binomial, but for draws without replacement): P(n red cards in the random 26-card deck) = C(26, 26-n) * C(26, n) / C(52, 26). Combining these weights with P(win | n red) for n = 0, ..., 26 and summing gives exactly 25/51. Therefore the odds of winning are THE SAME with the even 52-card deck and the randomly selected 26-card deck. All that math to prove a simple equality! QED.
- The 3rd deck is the same as the 1st deck; we do not need to calculate it by hand. P(I randomly pick 2 cards from a 52-card deck) = P(I always pick the top 2 cards of the 52-card deck) = P(you shuffle the deck, then I pick the top 2) = P(you shuffle the deck, throw away the bottom half, then I pick the top 2) = P(pick a random 26-card deck, then I pick the top 2) = P(pick a random 26-card deck, then I randomly choose 2 cards from it). By this logic, even a random 4-card sub-deck of the 52 gives the same probability as choosing 2 cards directly from the 52-card deck.
- The 3rd deck is better. Suppose it has k red cards: the probability of two same-color cards is (C(k,2) + C(26-k,2))/C(26,2). It is easy to see that this is minimized at k = 13, the even split. So essentially any random 26 cards is at least as good as a 13-13 split.
- Splitting a blind draw into two draws doesn't change your distribution.
- These answers are all overkill; the answers are obvious by intuition, which is good enough (perhaps even better) for an interview. 1. Deck 1 is obviously better, because taking away your first card has a smaller impact on the ratio of remaining cards of the same color. 2. They're obviously the same: deck 1 is equivalent to shuffling a deck and taking the top 2 cards, and deck 3 is equivalent to shuffling a deck, taking the top 26 cards, and then taking the top 2 of those.
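The equality argued above — that a random 26-card sub-deck gives exactly the same winning probability as the full 52-card deck — can be verified with exact arithmetic:

```python
from fractions import Fraction
from math import comb

def p_same_color(red, black):
    """P(two cards drawn without replacement share a color)."""
    total = red + black
    return Fraction(comb(red, 2) + comb(black, 2), comb(total, 2))

p_full = p_same_color(26, 26)       # 25/51
p_even_half = p_same_color(13, 13)  # 12/25

# Random 26-card sub-deck: weight each red-count n by its hypergeometric prob.
p_random_half = sum(
    Fraction(comb(26, n) * comb(26, 26 - n), comb(52, 26)) * p_same_color(n, 26 - n)
    for n in range(27)
)
print(p_full, p_even_half, p_random_half)
```

The sum collapses to exactly 25/51, confirming the shuffle-equivalence argument without any approximation.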
You draw a pair (x, y) from a joint Gaussian distribution with zero covariance. Knowing the standard deviations of x and y, and knowing z = x + y, what is your best guess for x?

Answers:
- z * sigma_x^2 / (sigma_x^2 + sigma_y^2). X and Z are jointly normal and their covariance equals the variance of X, so the correlation coefficient is rho = sigma_x/sigma_z, and E(X|Z) = rho * (sigma_x/sigma_z) * z = (sigma_x^2/sigma_z^2) z. Using the fact that the variance of a sum of independent normals is the sum of the variances, sigma_z^2 = sigma_x^2 + sigma_y^2, which gives the answer.
- z/2. Think about the two-dimensional graph of the joint density of (x, y): the condition x + y = z (with z fixed) is a vertical plane, and the intersection is proportional to the conditional density. For any such intersection curve, the highest point has x-coordinate z/2.
- The answer is 0.5 EX - 0.5 EY + 0.5 z (when EX and EY are not equal to zero).
- [z * sigma_y^{-2}] / [sigma_x^{-2} + sigma_y^{-2}]. From a signal-processing point of view, x is the signal, y the noise, and z the observation. X has prior X ~ N(0, sigma_x^2), the noise has Y ~ N(0, sigma_y^2), and given Z = z the question asks for the MMSE estimate E(X|Z). Using Bayes' theorem (or the Gauss-Markov theorem), E(X|Z) = [z * sigma_y^{-2} + 0 * sigma_x^{-2}] / [sigma_x^{-2} + sigma_y^{-2}]. Comments: 1. Problems of this kind are very common, so keep in mind that in the Gaussian case the best estimate of X is an inverse-variance-weighted linear combination of the maximum-likelihood estimate (z here) and the prior mean (0 here). 2. The same rule applies in the multidimensional case where x, y, z are vectors; check the Gauss-Markov theorem. 3. The intuition is that the larger the noise variance, the less trust we place in the ML estimate, with weight sigma_y^{-2}, and correspondingly the more trust we place in the prior on X.
- z sigma_x^2 / (sigma_x^2 + sigma_y^2), similar to CAPM.
- If you don't know the above theorems, you can use good old Bayes: P(X|Z) = P(Z|X)P(X)/P(Z), then set the derivative to zero, since you have the pdfs of X and Z. But it's really messy and I don't want to do it.
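The formula E[X | Z = z] = z * sigma_x^2/(sigma_x^2 + sigma_y^2) can be sanity-checked by simulation. A sketch with the illustrative choices sigma_x = 1, sigma_y = 2 (so the theoretical coefficient is 1/5), estimating the regression slope of x on z:

```python
import random

random.seed(2)
sigma_x, sigma_y = 1.0, 2.0
n = 200_000

xs = [random.gauss(0, sigma_x) for _ in range(n)]
zs = [x + random.gauss(0, sigma_y) for x in xs]  # z = x + y, y independent noise

# Regress x on z: slope = cov(x, z)/var(z); theory: 1/(1 + 4) = 0.2.
mx = sum(xs) / n
mz = sum(zs) / n
cov_xz = sum((x - mx) * (z - mz) for x, z in zip(xs, zs)) / n
var_z = sum((z - mz) ** 2 for z in zs) / n
slope = cov_xz / var_z

print(slope)  # about 0.2
```

Note the precision-weighted form [z * sigma_y^{-2}]/[sigma_x^{-2} + sigma_y^{-2}] is algebraically identical to z sigma_x^2/(sigma_x^2 + sigma_y^2): multiply numerator and denominator by sigma_x^2 sigma_y^2.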
What is the probability that breaking a stick into 3 pieces lets you form a triangle?

Answers:
- It's 1/4. The key idea: taking the original stick to be of unit length, a triangle can be formed whenever the longest piece is less than half a unit long. Suppose x is the length of the first piece and y the length of the second (both nonnegative, with y <= 1 - x). To form a triangle we need x <= 1/2, y <= 1/2, and y >= 1/2 - x. The probability is the area of that set of (x,y) pairs divided by the area of all admissible (x,y) pairs: 0.125/0.5 = 0.25.
- The probability is 0, if the question is about the two breaking points falling exactly on the 1/3 and 2/3 marks: for a continuous variable, the probability of hitting any countable set of points is 0.
- For a one-time event, the probability is 0.
- I think the probability is 1/2. Breaking a stick into three pieces corresponds to selecting positive reals with a + b + c = 1 and, w.l.o.g., a >= b >= c. The triangle condition is that any two sides sum to more than the third. If a >= 0.5, then b + c <= a, so we cannot form a triangle. If a < 0.5, then b + c = 1 - a > 0.5 > a; the other two inequalities a + b > c and a + c > b also hold because a >= b >= c (b is positive and a >= c gives a + b > c; similarly c > 0 gives a + c > b). Hence we can form a triangle iff a < 0.5, and the chance that a number selected between 0 and 1 falls below 0.5 is 1/2.
- Nothing in the question said it had to be an equilateral triangle, so the probability is 100%.
- The conditions are 1. a + b > c, 2. b + c > a, 3. a + c > b, with a, b, c > 0; a triangle is formed when all three are true. The only possibilities are (T,T,F), where one piece exceeds the sum of the other two, and (T,T,T), where all three hold. The favorable possibility is (T,T,T), so the answer is 1/2 = 0.5.
- Try this simulation in R (break the stick at two uniform points and test the triangle inequalities):

  checkTri <- function() {
    p <- sort(runif(2))
    a <- p[1]; b <- p[2] - p[1]; c <- 1 - p[2]
    ifelse(a + b > c & a + c > b & b + c > a, 1, 0)
  }
  mean(replicate(100, checkTri()))

- Theoretically, with a and b the first two pieces, we have the conditions a + b > 1/2, a < 1/2, b < 1/2. If you draw this region in the coordinate plane, you can calculate the probability.
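A Monte Carlo check of the 1/4 answer, breaking the stick at two points chosen uniformly at random:

```python
import random

random.seed(3)

def forms_triangle():
    # Break a unit stick at two uniform points; the three pieces form a
    # triangle iff every piece is shorter than 1/2.
    u, v = sorted((random.random(), random.random()))
    a, b, c = u, v - u, 1 - v
    return max(a, b, c) < 0.5

trials = 200_000
p = sum(forms_triangle() for _ in range(trials)) / trials
print(p)  # theory: 1/4
```

The simulated value converges to 0.25, supporting the geometric argument and ruling out the 1/2 answers (which implicitly treat the longest piece's length as uniform on (0, 1), when in fact it is concentrated on [1/3, 1)).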
See Interview Questions for Similar Jobs
- Analyst
- Software Engineer
- Data Scientist
- Associate
- Quantitative Research Analyst
- Quantitative Researcher
- Business Analyst
- Data Analyst
- Trader
- Intern
- Financial Analyst
- Vice President
- Software Developer
- Senior Software Engineer
- Consultant
- Investment Banking Analyst
- Summer Analyst
- Quantitative Associate
- Strategist