4 minute read

Stats riddles, Covariance Matrix, etc.

What I did

  • Continued Bayesian stats
  • I’ve become more interested in finance and looked into what projects or courses I could take to apply my ML knowledge to the finance domain. With this in mind, I did lots of brain-teasers and puzzles this week.

What I learned

  • The covariant matrix (alo known as a variant-covariant matrix) is important to understand in general, but especially with its link to eigen-decomposition and PCA. It’s best to just define it with an example. Considering the following data. Create a covariance matrix.
A B
1 3
2 1
-1 -2

First find the deviation score sums.

A B
(1 - 1/3)^2 (3 - 3/3)^2
(2 - 2/3)^2 (1 - 1/3)^2
(-1 + 1/3)^2 (2 + 2/3)^2

Now average these values to get the covariance matrix

A B
[(1 - 1/3)^2]/3 [(3 - 3/3)^2]/3
[(2 - 2/3)^2]/3 [(1 - 1/3)^2]/3
[(-1 + 1/3)^2]/3 [(2 + 2/3)^2]/3

In code:

import numpy as np
A = np.array([1,2,-1])
B = np.array([3,1,-2])

def make_covariance_matrix(A, B)
  matrix = np.stack((A, B))
  n = size(matrix, axis=1)
  temp = matrix - A/n
  matrix = matrix - temp
  matrix = matrix.T @ matrix
  matrix = matrix/n

* Note, the stack function gives a dimension like this 
matrix = array([[ 1,  2, -1],
                [ 3,  1, -2]])
  • Tricky stats problem:

    Basic case: What is the expected number of coin flips needed to get a heads.

    • \[E[X=heads] = \sum_i^\infty f(x)*p(X=x)\]
    • where x is some state. It could be 1,2,3, 4, etc. meaning that you flip H, or TH, or TTH, or TTTH, etc. and probability for each of those is 1/2, 1/4, 1/8, etc. This infinite sum looks like:
    • \[\sum\limits_{i=1}^\infty i p (1-p)^{i-1} = \frac{1}{p}\]
    • Here, \(f(x)=i\) and \((1-p) = p = \frac{1}{2}\). So the answer is \(\frac{1}{1/2} = 2\)
    • Another way to think about it is to deconstruct it a bit more explicitely. For example, if x is the number of flips before seeing heads, then the expected number of flips before seeing heads is 1 (your first flip) + probability of geting a tails instead * the expected number of flips. This is a recursive approach to expectation values.
    \[E[x] = 1 + p(x)E[x]\]
    • This is easily solvable to find that E[x] = 2 if p(x) = .5. This methodology is important to understand for the more complicated problem of 3 heads in a row.

    What is the expected number of flips needed to get 3 heads in a row?

    • If E[x] is the expected number of flips, then if you get a tails on your first try, the new expectation is 1 + E[x]. This has a probability of 1/2. If you get HT then your new expectation is E[x] + 2 since you’ve flipped twice and have to start over. This has a probability of 1/4. The probability that you get HHT is 1/8. If we flip HHH, the the expected number of flips is just 3, but this has a probability of 1/8.
    \[E[x] = \frac{1}{2} (1 + E[x]) + \frac{1}{4}(2 + E[x]) + \frac{1}{8}(3 + E[x]) + \frac{3}{8}\]

    How about for n in a row?

    \[\begin{aligned} E[x] &= \sum_{i=1}^n \left[ \frac{1}{2^n}*(n + E[x]) \right]\\ E[x] &= 2(2^n - 1) \end{aligned}\]

    How about the expected number of rolls to get 2 6’s in a row?

    \[\begin{aligned} E[x] &= \frac{5}{6}(1 + E[x]) &&+ ~~~~ \frac{1}{6}\frac{1}{6}(2 + ~~~~E[x]) &&+ ~~~~ \frac{1}{6}\frac{1}{6}(2) \\ &= P(\text{Roll 1} \neq 6) &&+ ~~~~ P(\text{Roll 1 = 6, Roll 2} \neq 6) &&+ ~~~~ P(\text{Roll} 1, 2 = 6) \end{aligned}\]

NLP

  • The Edit Distance is also known as the Levenstein distance and is a measure of the dissimilarity between two strings. It is the number of changes to one string to make it match the other string. This can be tricky because it might mean just adding/removing a letter, or replacing a letter, or any combination of those. Here’s how I would write an efficient algorithm to do it. Using dynamic programming you can compute the edit distance in \(O(st)\) for strings s and t. Dynamic programming is where you break down one problem into several smaller problems that are easier to solve.
#string s, string t, len(s), m = len(s), n = len(t)
def min_edit_distance(s, t, m, n):
# for dynamic programming, you need a table to hold the values for
# each subproblem.
  dp = [[0 for x in range(n + 1)] for x in range(m + 1)]

  for i in range(m+1):
    for j in range(n+1):
      # If first string is empty, only option is to 
      # insert all characters of second string
      if i == 0:
        dp[i][j] = j # j=min num of operations
      # If 2nd string is empty, only option is to
      # remove all the characters
      elif j == 0:
        dp[i][j] = i

      # If last characters are same, ignore last char
      # and recur for remaining string
      elif str1[i-1] == s[j-1]:
          dp[i][j] = dp[i-1][j-1]

      # If last character are different, consider all
      # possibilities and find minimum
      else:
          dp[i][j] = 1 + min(dp[i][j-1],        # Insert
                             dp[i-1][j],        # Remove
                             dp[i-1][j-1])      # Replace
  
    return dp[m][n]


  • You could also solve this recursively, but it’s exponential time, like \(O(3^t)\) so it’s a bad solution. Here’s a good source to actually read some more about dynamic programming and this specific problem, especially since it’s very useful for DNA sequence matching.

What I will do Next

  • More data science and quant trading interview prep.