- Q1: You and your friend are playing a game with a fair coin. The two of you will continue to toss the coin until the sequence HH or TH shows up. If HH shows up first, you win, and if TH shows up first your friend wins. What is the probability of you winning the game?
- Q2: If you roll a die three times, what is the probability of getting two consecutive threes?
- Q3: Suppose you have ten fair dice. If you randomly throw them simultaneously, what is the probability that the sum of all of the top faces is divisible by six?
- Q4: If you have three draws from a uniformly distributed random variable between 0 and 2, what is the probability that the median of three numbers is greater than 1.5?
- Q5: Assume you have a deck of 100 cards with values ranging from 1 to 100 and you draw two cards randomly without replacement, what is the probability that the number of one of them is double the other?
- Q6: What is the difference between the Bernoulli and Binomial distribution?
- Q7: If there are 30 people in a room, what is the probability that everyone has different birthdays?
- Q8: Assume two coins, one fair and the other unfair. You pick one at random, flip it five times, and observe that it comes up as tails all five times. What is the probability that you are flipping the unfair coin?
- Q9: Assume you take a stick of length 1 and you break it uniformly at random into three parts. What is the probability that the three pieces can be used to form a triangle?
- Q10: Say you draw a circle and choose two chords at random. What is the probability that those chords will intersect?
- Q11: If there’s a 15% probability that you might see at least one airplane in a five-minute interval, what is the probability that you might see at least one airplane in a period of half an hour?
- Q12: Say you are given an unfair coin, with an unknown bias towards heads or tails. How can you generate fair odds using this coin?
- Q13: According to hospital records, 75% of patients suffering from a disease die from that disease. Find out the probability that 4 out of the 6 randomly selected patients survive.
- Q14: Discuss some methods you will use to estimate the Parameters of a Probability Distribution
- Q15: You have 40 cards in four colors, 10 reds, 10 greens, 10 blues, and ten yellows. Each color has a number from 1 to 10. When you pick two cards without replacement, what is the probability that the two cards are not in the same color and not in the same number?
- Q16: Can you explain the difference between frequentist and Bayesian probability approaches?
- Q17: Explain the Difference Between Probability and Likelihood
Q1: You and your friend are playing a game with a fair coin. The two of you will continue to toss the coin until the sequence HH or TH shows up. If HH shows up first, you win, and if TH shows up first your friend wins. What is the probability of you winning the game?
Answer:
The first flip is either heads or tails. From the second flip onward, the first heads ends the game, because it completes either HH or TH. Hence there is a 1/2 chance of the game ending on the second flip. If the first flip is H and the second flip is H, player 1 wins. If the first flip is H and the second flip is T, the game goes on. Generalizing, once a T has appeared, the next H is always preceded by a T, so TH shows up before HH can and player 1 has no chance of winning. Player 1 can only win if the first flip is H and the second flip is H. Consider the following four scenarios:
- HH: Player 1 wins.
- HT...: HH will never occur before TH. Player 2 wins.
- TT...: HH will never occur before TH. Player 2 wins.
- TH: Player 2 wins.
Hence the probability of player 1 winning is 1/4 and the probability of player 2 winning is 3/4.
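The 1/4 result can be sanity-checked with a short simulation. This is only a sketch: the function names, trial count, and seed are illustrative choices, not part of the original answer.

```python
import random

def play_once(rng):
    """Toss a fair coin until HH or TH appears; return True if HH came first."""
    prev = rng.choice("HT")
    while True:
        cur = rng.choice("HT")
        if cur == "H":
            # The first H after the opening toss ends the game:
            # it completes HH if the previous toss was H, otherwise TH.
            return prev == "H"
        prev = cur

def estimate_hh_win(trials=100_000, seed=0):
    """Estimate the probability that HH shows up before TH."""
    rng = random.Random(seed)
    wins = sum(play_once(rng) for _ in range(trials))
    return wins / trials

print(estimate_hh_win())  # should be close to 0.25
```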
Q2: If you roll a die three times, what is the probability of getting two consecutive threes?
Answer: The right answer is 11/216.
There are different ways to answer this question:
If we roll a die three times, we can get two consecutive 3s in three ways:
- The first two rolls are 3s and the third is any other number, with probability 1/6 * 1/6 * 5/6.
- The first roll is not a 3 while the other two rolls are 3s, with probability 5/6 * 1/6 * 1/6.
- All three rolls are 3s, with probability (1/6)^3.
So the final result is 2 * (5/6 * (1/6)^2) + (1/6)^3 = 11/216.
By Inclusion-Exclusion Principle:
Probability of at least two consecutive threes = Probability of two consecutive threes in first two rolls + Probability of two consecutive threes in last two rolls - Probability of three consecutive threes
= 2 * Probability of two consecutive threes in first two rolls - Probability of three consecutive threes = 2 * (1/6) * (1/6) - (1/6) * (1/6) * (1/6) = 11/216
It can also be seen like this:
The sample space is made of (x, y, z) tuples where each letter can take a value from 1 to 6, so it has 6 * 6 * 6 = 216 outcomes. The outcomes with two consecutive threes have the form (3, 3, X) or (X, 3, 3). There are 6 outcomes of the first form, (3,3,1) through (3,3,6), and 6 of the second form, (1,3,3) through (6,3,3). Subtracting the duplicate (3,3,3), which appears in both, leaves 6 + 6 - 1 = 11 outcomes and a probability of 11/216.
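The counting argument can be checked by brute force. The sketch below enumerates all 216 ordered triples; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

# All ordered outcomes of three rolls of a six-sided die.
outcomes = list(product(range(1, 7), repeat=3))

# Keep outcomes where rolls 1-2 or rolls 2-3 are both threes.
favorable = [
    r for r in outcomes
    if (r[0] == 3 and r[1] == 3) or (r[1] == 3 and r[2] == 3)
]

probability = Fraction(len(favorable), len(outcomes))
print(len(favorable), len(outcomes), probability)  # 11 216 11/216
```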
Q3: Suppose you have ten fair dice. If you randomly throw them simultaneously, what is the probability that the sum of all of the top faces is divisible by six?
Answer: 1/6
Explanation: With 10 dice, the possible sums divisible by 6 are 12, 18, 24, 30, 36, 42, 48, 54, and 60. You don't need to calculate the probability of each of these sums, because no matter what the sum of the first 9 dice is, exactly one face of the last die (a number from 1 to 6) makes the final sum divisible by 6. Therefore, we only care about the last die, and the probability of getting that number on it is 1/6. So the answer is 1/6.
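For readers who prefer to verify this exactly, the distribution of the running sum modulo 6 can be tracked one die at a time. This is a sketch; `residue_distribution` is a hypothetical helper name.

```python
from fractions import Fraction

def residue_distribution(num_dice, modulus=6):
    """Exact distribution of (sum of num_dice fair dice) mod modulus."""
    dist = [Fraction(0)] * modulus
    dist[0] = Fraction(1)  # with zero dice the sum is 0
    for _ in range(num_dice):
        new = [Fraction(0)] * modulus
        for residue, prob in enumerate(dist):
            for face in range(1, 7):
                new[(residue + face) % modulus] += prob * Fraction(1, 6)
        dist = new
    return dist

print(residue_distribution(10)[0])  # 1/6
```

After the first die every residue is equally likely, and each further die keeps it that way, which is the same observation as in the explanation above.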
Q4: If you have three draws from a uniformly distributed random variable between 0 and 2, what is the probability that the median of three numbers is greater than 1.5?
The right answer is 5/32 or 0.156. There are different methods to solve it:
- Method 1:
To get a median greater than 1.5, at least two of the three numbers must be greater than 1.5. The probability of one number being greater than 1.5 in this distribution is 0.25. Then, using the binomial distribution with three trials and a success probability of 0.25, the probability of 2 or more successes is the probability that the median is more than 1.5, which is about 15.6%.
- Method 2:
A median greater than 1.5 occurs when either all three uniformly distributed random numbers are greater than 1.5, or exactly one of them is between 0 and 1.5 and the other two are greater than 1.5.
So, the probability of the above event is {(2 - 1.5) / 2}^3 + (3 choose 1)(1.5/2)(0.5/2)^2 = 1/64 + 9/64 = 10/64 = 5/32
- Method 3:
Using the Monte Carlo method: draw three such numbers many times and record the fraction of draws whose median exceeds 1.5, which should come out close to 0.156.
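A Monte Carlo estimate could be sketched as follows, alongside the exact binomial value from Method 1 for comparison. The function names, trial count, and seed are assumptions made for illustration.

```python
import random
from math import comb

def exact_median_prob(p=0.25, n=3):
    """P(at least 2 of n draws exceed 1.5), where each draw exceeds it with probability p."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(2, n + 1))

def simulate_median_prob(trials=200_000, seed=0):
    """Fraction of simulated triples from Uniform(0, 2) whose median exceeds 1.5."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        draws = sorted(rng.uniform(0, 2) for _ in range(3))
        if draws[1] > 1.5:  # the middle value is the median
            hits += 1
    return hits / trials

print(exact_median_prob())     # 0.15625
print(simulate_median_prob())  # should be close to 0.156
```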
Q5: Assume you have a deck of 100 cards with values ranging from 1 to 100 and you draw two cards randomly without replacement, what is the probability that the number of one of them is double the other?
There are a total of (100 C 2) = 4950 ways to choose two cards at random from the 100 cards, and only 50 of these pairs contain a number and its double: (1, 2), (2, 4), ..., (50, 100). Therefore the probability that the number of one of them is double the other is 50/4950 = 1/99, or about 0.0101.
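The pair count is easy to confirm by enumeration. A minimal sketch with illustrative variable names:

```python
from fractions import Fraction
from itertools import combinations

# Every unordered pair of distinct cards; combinations yields (a, b) with a < b.
pairs = list(combinations(range(1, 101), 2))

# Pairs in which the larger card is double the smaller one.
doubles = [(a, b) for a, b in pairs if b == 2 * a]

probability = Fraction(len(doubles), len(pairs))
print(len(doubles), len(pairs), probability)  # 50 4950 1/99
```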
Q6: What is the difference between the Bernoulli and Binomial distribution?
Answer:
Bernoulli and Binomial are both types of probability distributions.
The probability mass function of the Bernoulli distribution is given by:
p(x) = p^x * q^(1-x), x ∈ {0, 1}
Mean: p
Variance: p*q = p*(1-p)
The probability mass function of the Binomial distribution is given by:
p(x) = nCx * p^x * q^(n-x), x ∈ {0, 1, 2, ..., n}
Mean: np
Variance: npq
Where p and q are the probability of success and probability of failure respectively, n is the number of independent trials and x is the number of successes.
As we can see, the sample space (x) for the Bernoulli distribution is binary (two outcomes), and there is just a single trial.
E.g.: A loan sanction for a person can be either a success or a failure, with no other possibility (hence a single trial).
Whereas for the Binomial distribution the sample space (x) ranges from 0 to n.
E.g.: Tossing a coin 6 times, what is the probability of getting 2 or fewer heads?
Here there is more than one trial (n = 6, finite), the sample space is x ∈ {0, 1, ..., 6}, and the event of interest is x ∈ {0, 1, 2}.
In short, Bernoulli Distribution is a single trial version of Binomial Distribution.
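The relationship can be made concrete in a few lines. The sketch below uses hypothetical helper names and defines the Bernoulli PMF as the n = 1 case of the Binomial PMF, then evaluates the probability of at most two heads in six tosses of a fair coin.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for X ~ Binomial(n, p)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def bernoulli_pmf(x, p):
    """P(X = x) for X ~ Bernoulli(p): a Binomial with a single trial."""
    return binomial_pmf(x, 1, p)

# Six tosses of a fair coin: probability of 0, 1, or 2 heads.
p_at_most_two = sum(binomial_pmf(x, 6, 0.5) for x in range(3))

print(bernoulli_pmf(1, 0.3), bernoulli_pmf(0, 0.3))  # 0.3 0.7
print(p_at_most_two)  # 0.34375
```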
Q7: If there are 30 people in a room, what is the probability that everyone has different birthdays?
The sample space has 365^30 equally likely outcomes (ignoring leap years), and the number of favorable outcomes is 365P30, because birthdays must be assigned to people without replacement for everyone to have a unique birthday. Therefore the probability is 365P30 / 365^30 ≈ 0.2937.
A theoretical explanation is provided in the figure below thanks to Fazil Mohammed.
Note: Why do we use permutations and not combinations here?
When calculating 365C30, you are saying: “Out of 365 days, I'm choosing 30 distinct days, but I don't care in what order they are assigned to people.” This treats the selection of birthdays as unordered, which isn't the case in the birthday problem, because who gets which birthday is important.
For example, if you selected 30 distinct birthdays (as in a combination), this would only tell you which 30 birthdays are used, but it wouldn't account for the fact that different people being assigned different birthdays creates different outcomes. In contrast, with permutations, we are considering the specific assignment of each person to a particular birthday, where the order matters because we care about which person gets which birthday. That is, Person A being born on 01/01 and Person B on 02/02 is a different outcome from Person A being born on 02/02 and Person B on 01/01.
Interesting facts provided by Rishi Dey Chowdhury:
- With just 23 people there is over a 50% chance of a birthday match, and with 57 people the match probability exceeds 99%. One intuition for why the probability is so high with so few people: a match only requires a pair of people, and 23 choose 2 is 23 * 11 = 253 pairs, which is a relatively big number, so a 50% chance of a match sounds reasonable.
- Another interesting fact: if birthdays are not equally likely across the 365 days of the year, a birthday match becomes even more likely.
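The figures quoted in this answer can be reproduced with a short calculation. This sketch assumes 365 equally likely birthdays and uses `math.perm` (Python 3.8+); `prob_all_distinct` is a hypothetical helper name.

```python
from math import perm

def prob_all_distinct(people, days=365):
    """Probability that all birthdays differ, assuming equally likely days."""
    return perm(days, people) / days**people

print(round(prob_all_distinct(30), 4))   # about 0.2937
print(1 - prob_all_distinct(23) > 0.5)   # True: a match is more likely than not
print(1 - prob_all_distinct(57) > 0.99)  # True
```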
Q8: Assume two coins, one fair and the other unfair. You pick one at random, flip it five times, and observe that it comes up as tails all five times. What is the probability that you are flipping the unfair coin? Assume that the unfair coin always results in tails.
Answer:
Let's use Bayes' theorem. Let U denote the case where you are flipping the unfair coin and F denote the case where you are flipping the fair coin. Since the coin is chosen randomly, we know that P(U) = P(F) = 0.5. Let 5T denote the event of flipping 5 tails in a row.
Then, we are interested in solving for P(U|5T) (the probability that you are flipping the unfair coin given that you obtained 5 tails). Since the unfair coin always results in tails, P(5T|U) = 1, and P(5T|F) = 1/2^5 = 1/32 by the definition of a fair coin.
Applying Bayes' theorem: P(U|5T) = P(5T|U) * P(U) / (P(5T|U) * P(U) + P(5T|F) * P(F)) = 0.5 / (0.5 + 0.5 * 1/32) = 32/33 ≈ 0.97
Therefore the probability that you picked the unfair coin is about 97%.
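The same Bayes update can be written out with exact fractions. A minimal sketch; the variable names are illustrative.

```python
from fractions import Fraction

prior_unfair = Fraction(1, 2)
prior_fair = Fraction(1, 2)

likelihood_unfair = Fraction(1)         # the unfair coin always lands tails
likelihood_fair = Fraction(1, 2) ** 5   # five tails in a row with a fair coin

# Total probability of observing five tails.
evidence = likelihood_unfair * prior_unfair + likelihood_fair * prior_fair

posterior_unfair = likelihood_unfair * prior_unfair / evidence
print(posterior_unfair, round(float(posterior_unfair), 2))  # 32/33 0.97
```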
Q9: Assume you take a stick of length 1 and you break it uniformly at random into three parts. What is the probability that the three pieces can be used to form a triangle?
Answer: The right answer is 0.25
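Let X and Y be the two breaking points, chosen independently and uniformly along the stick, so the three pieces have lengths min(X, Y), |X - Y|, and 1 - max(X, Y).
As per the triangle inequality theorem, the sum of any two sides must be greater than the third side. Since the three lengths add up to 1, this means no piece can have length 1/2 or more.
To achieve this, the two breaking points must fall on opposite sides of the 0.5 mark on the stick (otherwise an end piece is longer than 0.5), and they must be less than 0.5 apart (otherwise the middle piece is too long).
P(one point is below 0.5 and the other is above 0.5) = 2 * 0.5 * 0.5 = 0.5
P(the points are less than 0.5 apart, given that they are on opposite sides) = 0.5, by symmetry: writing the lower point as a and the upper point as 0.5 + b, with a and b uniform on (0, 0.5), the gap is below 0.5 exactly when b < a.
Hence, overall probability = 0.5 * 0.5 = 1/4 = 0.25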
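A quick simulation can back up the 0.25 figure. This sketch assumes the stick is broken at two independent uniform points; the function names, trial count, and seed are illustrative.

```python
import random

def forms_triangle(x, y):
    """True if breaking a unit stick at points x and y gives three triangle sides."""
    low, high = min(x, y), max(x, y)
    pieces = (low, high - low, 1 - high)
    # The pieces sum to 1, so the triangle inequality holds
    # exactly when the longest piece is shorter than 1/2.
    return max(pieces) < 0.5

def estimate_triangle_prob(trials=200_000, seed=0):
    rng = random.Random(seed)
    hits = sum(forms_triangle(rng.random(), rng.random()) for _ in range(trials))
    return hits / trials

print(estimate_triangle_prob())  # should be close to 0.25
```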
Q10: Say you draw a circle and choose two chords at random. What is the probability that those chords will intersect?
Answer: Making 2 chords requires 4 endpoints, and 4 points on a circle can be paired into two chords in 3 different ways. Only one of these 3 pairings gives intersecting chords, hence the answer is 1/3. Let P1, P2, P3, and P4 be the four points, labeled in order around the circle. The 3 possible pairings are (P1 P2) (P3 P4), (P1 P3) (P2 P4), and (P1 P4) (P2 P3); only the second one, which joins alternating points, intersects. This assumes the chords are chosen by picking their endpoints independently and uniformly on the circle, so that all 3 pairings are equally likely.
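The 1/3 answer depends on how a "random chord" is defined. The sketch below assumes each chord is drawn by picking its two endpoints independently and uniformly on the circumference, with positions measured as a fraction of the way around the circle; the function names are hypothetical.

```python
import random

def chords_intersect(a, b, c, d):
    """Chord (a, b) crosses chord (c, d) when exactly one of c, d lies on the arc from a to b."""
    low, high = min(a, b), max(a, b)
    return (low < c < high) != (low < d < high)

def estimate_intersection_prob(trials=200_000, seed=0):
    rng = random.Random(seed)
    hits = sum(
        chords_intersect(rng.random(), rng.random(), rng.random(), rng.random())
        for _ in range(trials)
    )
    return hits / trials

print(estimate_intersection_prob())  # should be close to 1/3
```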
Q11: If there’s a 15% probability that you might see at least one airplane in a five-minute interval, what is the probability that you might see at least one airplane in a period of half an hour?
Answer:
Probability of at least one plane in a 5-minute interval = 0.15
Probability of no plane in a 5-minute interval = 0.85
Half an hour contains six 5-minute intervals, so, assuming the intervals are independent:
Probability of seeing at least one plane in 30 minutes = 1 - Probability of not seeing any plane in 30 minutes = 1 - (0.85)^6 ≈ 0.6229
This problem can also be solved using the Poisson distribution. Refer to this blog post.
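Both routes can be computed side by side. This is a sketch under the assumption that planes arrive as a Poisson process, so that the 5-minute rate can be recovered from P(no plane) = exp(-rate); the variable names are illustrative.

```python
import math

p_five_minutes = 0.15
intervals = 30 // 5  # six 5-minute intervals in half an hour

# Direct route: no plane in any of the six independent intervals.
p_half_hour = 1 - (1 - p_five_minutes) ** intervals

# Poisson route: solve exp(-rate) = 0.85 for the 5-minute rate, then scale to 30 minutes.
rate_per_interval = -math.log(1 - p_five_minutes)
p_half_hour_poisson = 1 - math.exp(-rate_per_interval * intervals)

print(round(p_half_hour, 4), round(p_half_hour_poisson, 4))  # 0.6229 0.6229
```

The two numbers agree because the Poisson assumption implies exactly the independence between intervals that the direct calculation uses.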
Q12: Say you are given an unfair coin, with an unknown bias towards heads or tails. How can you generate fair odds using this coin?
Answer:
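One standard construction, usually attributed to von Neumann, is to flip the coin twice: if the two results differ, report the first one, and if they match, discard both and flip again. Since P(HT) = P(TH) = p * (1 - p) for any bias p, the reported result is fair, provided the flips are independent and the coin is not completely one-sided. The sketch below illustrates the idea; `biased_flip` and its 0.7 bias are hypothetical stand-ins for the real coin.

```python
import random

def biased_flip(rng, p_heads=0.7):
    """Stand-in for the unfair coin; the bias is unknown to the caller in practice."""
    return "H" if rng.random() < p_heads else "T"

def fair_flip(rng, p_heads=0.7):
    """Flip the biased coin in pairs until the two results differ, then return the first."""
    while True:
        first = biased_flip(rng, p_heads)
        second = biased_flip(rng, p_heads)
        if first != second:
            return first

rng = random.Random(0)
results = [fair_flip(rng) for _ in range(100_000)]
heads_fraction = results.count("H") / len(results)
print(heads_fraction)  # should be close to 0.5
```

The more biased the coin, the more pairs are discarded, so the procedure trades extra flips for fairness.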
Q13: According to hospital records, 75% of patients suffering from a disease die from that disease. Find out the probability that 4 out of the 6 randomly selected patients survive.
Answer: This is a binomial problem since each patient has only 2 possible outcomes: death or survival.
Here n = 6 and x = 4.
p = 0.25 (probability of survival), q = 0.75 (probability of death)
Using the probability mass function:
P(X = x) = nCx * p^x * q^(n-x)
Then:
P(X = 4) = 6C4 * (0.25)^4 * (0.75)^2 = 15 * 0.00390625 * 0.5625 ≈ 0.033
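The arithmetic can be checked directly. A minimal sketch using `math.comb` for the binomial coefficient:

```python
from math import comb

n, x = 6, 4   # six patients, four survivors
p = 0.25      # probability that a single patient survives

prob_four_survive = comb(n, x) * p**x * (1 - p)**(n - x)
print(round(prob_four_survive, 4))  # 0.033
```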
Q14: Discuss some methods you will use to estimate the Parameters of a Probability Distribution
Answer:
There are different ways to go about this. The following are some common methods; you may use one of them or a combination, depending on the observed data.
- Method of moments
- Maximum Likelihood Estimation
- Bayesian Estimation
- Least Squares Estimation
- Method of Least Absolute Deviation
- Minimum Chi-squared Estimation
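To make the first two methods concrete, here is a small sketch that estimates the upper bound theta of a Uniform(0, theta) distribution. The distribution, the true value 5.0, and the function names are all assumptions chosen for illustration. The method of moments matches the sample mean to theta/2, while maximum likelihood picks the largest observation.

```python
import random

def method_of_moments(sample):
    """E[X] = theta / 2 for Uniform(0, theta), so theta is estimated as 2 * mean."""
    return 2 * sum(sample) / len(sample)

def maximum_likelihood(sample):
    """The likelihood theta**(-n) is largest at the smallest theta that covers every point."""
    return max(sample)

rng = random.Random(0)
true_theta = 5.0
sample = [rng.uniform(0, true_theta) for _ in range(10_000)]

mom_estimate = method_of_moments(sample)
mle_estimate = maximum_likelihood(sample)
print(mom_estimate, mle_estimate)  # both should be close to 5.0
```

The two estimators generally disagree on the same data: the maximum likelihood estimate never exceeds the true bound, while the method-of-moments estimate can land on either side of it.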
Q15: You have 40 cards in four colors, 10 reds, 10 greens, 10 blues, and ten yellows. Each color has a number from 1 to 10. When you pick two cards without replacement, what is the probability that the two cards are not in the same color and not in the same number?
Answer:
It doesn't matter how you choose the first card, so choose one card at random. Now all we have to care about is the restriction on the second card. It can't have the same number (which rules out 3 cards, one from each of the other colors) and it can't have the same color (which rules out the 9 remaining cards of that color, since we have already picked one).
So, of the 39 remaining cards, 39 - 12 = 27 are favorable, and the probability is 27/39 = 9/13.
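The result can be confirmed by enumerating every pair of cards. A sketch with illustrative names:

```python
from fractions import Fraction
from itertools import combinations, product

colors = ["red", "green", "blue", "yellow"]
deck = list(product(colors, range(1, 11)))  # 40 (color, number) cards

pairs = list(combinations(deck, 2))
favorable = [
    (first, second)
    for first, second in pairs
    if first[0] != second[0] and first[1] != second[1]
]

probability = Fraction(len(favorable), len(pairs))
print(len(favorable), len(pairs), probability)  # 540 780 9/13
```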
Q16: Can you explain the difference between frequentist and Bayesian probability approaches?
Answer:
The frequentist approach to probability defines probability as the long-run relative frequency of an event in an infinite number of trials. It views probabilities as fixed and objective, determined by the data at hand. In this approach, the parameters of a model are treated as fixed and unknown and estimated using methods like maximum likelihood estimation.
On the other hand, Bayesian probability defines probability as a degree of belief, or the degree of confidence, in an event. It views probabilities as subjective and personal, representing an individual's beliefs. In this approach, the parameters of a model are treated as random variables with prior beliefs, which are updated as new data becomes available to form a posterior belief.
In summary, the frequentist approach deals with fixed and objective probabilities and uses methods like estimation, while the Bayesian approach deals with subjective and personal probabilities and uses methods like updating prior beliefs with new data.
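A small coin-tossing example may help contrast the two views. The data (7 heads in 10 tosses) and the uniform Beta(1, 1) prior are assumptions made for illustration. The frequentist estimate is the maximum likelihood value, while the Bayesian estimate summarizes the posterior Beta distribution.

```python
from fractions import Fraction

heads, tosses = 7, 10  # hypothetical observed data

# Frequentist: the parameter is fixed but unknown; estimate it by maximum likelihood.
mle_estimate = Fraction(heads, tosses)

# Bayesian: start from a Beta(1, 1) (uniform) prior and update it with the data.
prior_a, prior_b = 1, 1
posterior_a = prior_a + heads
posterior_b = prior_b + (tosses - heads)
posterior_mean = Fraction(posterior_a, posterior_a + posterior_b)

print(mle_estimate, posterior_mean)  # 7/10 2/3
```

With more data the two estimates move closer together, since the prior carries less weight.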
Q17: Explain the Difference Between Probability and Likelihood
Answer:
Probability and likelihood are two concepts that are often used in statistics and data analysis, but they have different meanings and uses.
Probability is the measure of the likelihood of an event occurring. It is a number between 0 and 1, with 0 indicating an impossible event and 1 indicating a certain event. For example, the probability of flipping a coin and getting heads is 0.5.
The likelihood, on the other hand, is the measure of how well a statistical model or hypothesis fits a set of observed data. It is not a probability, but rather a measure of how plausible the data is given the model or hypothesis. For example, if we have a hypothesis that the average height of people in a certain population is 6 feet, the likelihood of observing a random sample of people with an average height of 5 feet would be low.
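The distinction can be illustrated with the same binomial formula read in two directions. In this sketch the data (7 heads in 10 tosses) and the grid of candidate values are assumptions. Probability fixes the parameter and varies the outcome, while likelihood fixes the observed outcome and varies the parameter.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(k successes in n trials) when each trial succeeds with probability p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Probability: fix p = 0.5 and vary the outcome k; the values sum to 1.
probabilities = [binomial_pmf(k, 10, 0.5) for k in range(11)]

# Likelihood: fix the data (7 heads in 10 tosses) and vary p over a grid.
grid = [i / 100 for i in range(1, 100)]
likelihoods = [binomial_pmf(7, 10, p) for p in grid]
best_p = grid[likelihoods.index(max(likelihoods))]

print(sum(probabilities))  # approximately 1.0
print(best_p)              # 0.7
```

The likelihood values over the grid do not need to sum to 1, which is one way to see that a likelihood is not a probability distribution over the parameter.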