
What I did

  • Probability Problems

What I learned

What’s the probability that two six-sided dice show different values and sum to 6?

import numpy as np

d1 = np.arange(1, 7, 1)
d2 = np.arange(1, 7, 1)

combs = [(x, y) for x in d1 for y in d2]  # nested list comprehension!
# Booleans sum as 1 (True) and 0 (False), so this counts the matching pairs.
tot = sum(x != y and x + y == 6 for x, y in combs)
print(tot / len(combs))  # 4 matching pairs out of 36
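
As a cross-check, the same count can be done without NumPy and with exact fractions. This is only a minimal sketch; the helper name `prob_distinct_sum` is made up for illustration.

```python
from fractions import Fraction
from itertools import product

def prob_distinct_sum(target, sides=6):
    """Probability that two fair dice differ and sum to `target` (hypothetical helper)."""
    outcomes = list(product(range(1, sides + 1), repeat=2))  # all 36 ordered rolls
    hits = sum(1 for a, b in outcomes if a != b and a + b == target)
    return Fraction(hits, len(outcomes))

print(prob_distinct_sum(6))  # 1/9
```

The matching rolls are (1, 5), (2, 4), (4, 2), and (5, 1); (3, 3) is excluded because the dice are identical.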

There are urns labeled X, Y, and Z.
Urn X contains 4 red balls and 3 black balls.
Urn Y contains 5 red balls and 4 black balls.
Urn Z contains 4 red balls and 4 black balls.
One ball is drawn from each of the urns. What is the probability that, of the balls drawn, 2 are red and 1 is black?

Mathematically,

\[\left(\frac{4}{7}\cdot\frac{5}{9}\cdot\frac{4}{8}\right) + \left(\frac{4}{7}\cdot\frac{4}{9}\cdot\frac{4}{8}\right) + \left(\frac{3}{7}\cdot\frac{5}{9}\cdot\frac{4}{8}\right) = \frac{204}{504} = \frac{17}{42}\]

Pythonically,

# create the lists
urnX = ['r' for _ in range(4)]
urnX.extend(['b' for _ in range(3)])
urnY = ['r' for _ in range(5)]
urnY.extend(['b' for _ in range(4)])
urnZ = ['r' for _ in range(4)]
urnZ.extend(['b' for _ in range(4)])

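import itertools
# either line builds the same list of 7 * 9 * 8 = 504 equally likely draws
combs = list(itertools.product(urnX, urnY, urnZ)) # with itertools
combs = [(x, y, z) for x in urnX for y in urnY for z in urnZ] # without itertools

# count the draws with exactly 2 red and 1 black, then divide by the total
matches = sum(tup.count('r') == 2 and tup.count('b') == 1 for tup in combs)
print(matches / len(combs)) # 204/504 = 17/42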
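
The hand calculation can also be reproduced with exact fractions instead of enumerating every ball. A rough sketch, assuming the urn contents from the problem statement; the `urns` dictionary and the `prob_k_red` helper are names invented here.

```python
from fractions import Fraction
from itertools import product

# (red, black) counts per urn, taken from the problem statement
urns = {"X": (4, 3), "Y": (5, 4), "Z": (4, 4)}

def prob_k_red(urns, k):
    """Probability of exactly k red balls when one ball is drawn per urn (hypothetical helper)."""
    total = Fraction(0)
    # Walk over every colour pattern, e.g. ('r', 'r', 'b')
    for colours in product("rb", repeat=len(urns)):
        if colours.count("r") != k:
            continue
        p = Fraction(1)
        for colour, (red, black) in zip(colours, urns.values()):
            p *= Fraction(red if colour == "r" else black, red + black)
        total += p
    return total

print(prob_k_red(urns, 2))  # 17/42
```

Each of the three patterns with two reds contributes one product of per-urn probabilities, which mirrors the three terms of the sum.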

A family has 2 children. The first born is a boy. What’s the probability that both children are boys?

\(1/2\). The options are BB and BG, so the probability is \(1/2\).

Given only that at least one of the children is a boy, what is the probability that both are boys?

\(1/3\). The equally likely options are now BB, BG, and GB, and only BB has two boys, so the probability is \(1/3\).
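
Both conditional probabilities can be checked by brute force over the four possible families. The sketch assumes each child is independently a boy or a girl with probability 1/2, which the puzzle leaves implicit; `conditional` and `both_boys` are names made up here.

```python
from fractions import Fraction
from itertools import product

# (first born, second born); assumed equally likely: BB, BG, GB, GG
families = list(product("BG", repeat=2))

def conditional(event, given):
    """P(event | given) by counting over the equally likely families."""
    kept = [f for f in families if given(f)]
    return Fraction(sum(1 for f in kept if event(f)), len(kept))

def both_boys(family):
    return family == ("B", "B")

print(conditional(both_boys, lambda f: f[0] == "B"))  # first born is a boy: 1/2
print(conditional(both_boys, lambda f: "B" in f))     # at least one boy: 1/3
```

The only difference between the two questions is which families survive the condition: two of them in the first case, three in the second.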

A random variable, x, follows a Poisson distribution with a mean of 2.5. Find the probability that the random variable equals 5.

The Poisson distribution looks like: \(P(k, \lambda) = \frac{\lambda^k e^{-\lambda}}{k!}\) where \(k\) is the actual number of successes in an interval and \(\lambda\) is the average number of successes that occur in an interval (a specified region).

So, \(P(k=5 \vert \lambda=2.5) = \frac{2.5^5e^{-2.5}}{5!}\)
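
To get a number out of that expression, the formula can be evaluated directly with the standard library. A minimal sketch; `poisson_pmf` is a helper name chosen here, not a library function.

```python
import math

def poisson_pmf(k, lam):
    """P(k; lam) = lam^k * e^(-lam) / k!  (hypothetical helper)."""
    return lam**k * math.exp(-lam) / math.factorial(k)

print(round(poisson_pmf(5, 2.5), 4))  # 0.0668
```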

Consider some Poisson random variable, x. Let \(\mathbb{E}[x]\) be the expectation of x. Find the value of \(\mathbb{E}[x^2]\).

Note that for a Poisson distribution, the expectation value and variance are both \(\lambda\).

Recall that for a random variable, \(\mathrm{var}(x) = \mathbb{E}[x^2] - \mathbb{E}[x]^2\), which rearranges to \(\mathbb{E}[x^2] = \mathbb{E}[x]^2 + \mathrm{var}(x)\). So for the Poisson distribution, \(\mathbb{E}[x^2] = \lambda^2 + \lambda\).
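
The identity can be sanity-checked numerically by summing \(k^2 P(k, \lambda)\) over the first many values of \(k\). This is just a sketch: the helper name `poisson_second_moment`, the cutoff of 100 terms, and the test value \(\lambda = 2.5\) are arbitrary choices made here.

```python
import math

def poisson_second_moment(lam, terms=100):
    """Truncated sum of k^2 * P(k; lam); `terms` is an arbitrary cutoff."""
    total = 0.0
    p = math.exp(-lam)  # P(0; lam)
    for k in range(terms):
        total += k * k * p
        p *= lam / (k + 1)  # recurrence: P(k+1) = P(k) * lam / (k+1)
    return total

lam = 2.5
print(poisson_second_moment(lam), lam**2 + lam)  # the two values should agree closely
```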

A class’s test scores have a mean of 70 and a standard deviation of 10. Assuming the scores are normally distributed, what percent of students scored >80? >=60? <60?

CDF of the normal distribution

import math
# math.erf(x) returns the error function of x.
# It accepts any real x and returns a value between -1 and +1.

# cumulative probability
# phi(x) = P(X <= x)

def phi(x, mu, sigma):
    return .5*(1 + math.erf( (x - mu)/(sigma*(2**.5)) ) )

# p score > 80 as percent
print(round(100*(1 - phi(80, 70, 10)), 2))
# p score >= 60 as percent
print(round(100*(1 - phi(60, 70, 10)), 2))
# p score < 60 as percent
print(round(100*(phi(60, 70, 10)), 2))
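
For comparison, Python 3.8+ ships `statistics.NormalDist`, which exposes the same CDF without the hand-written `erf` formula. A short sketch under the same normality assumption; the variable names are made up here.

```python
from statistics import NormalDist

scores = NormalDist(mu=70, sigma=10)

above_80 = round(100 * (1 - scores.cdf(80)), 2)
at_least_60 = round(100 * (1 - scores.cdf(60)), 2)
below_60 = round(100 * scores.cdf(60), 2)

print(above_80)     # 15.87
print(at_least_60)  # 84.13
print(below_60)     # 15.87
```

Since 80 and 60 each sit one standard deviation from the mean, the first and third answers match by symmetry.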

What I will do next

  • Spotify dataset exploration with PySpark
  • LaTeX thesis time