# Simple Random Walk – Method 1

Suppose we consider a simple random walk. A particle starts at initial position $z_i$ and moves one unit to the left with probability $p$ and moves one unit to the right with probability $1-p$. What is the expected position $\mathbb{E}[X_n]$ of the particle after $n$ steps?

I will calculate the expected value using two different methods. The second method is simpler, but I’ll start using an iteration method.

Our PMF is:

$f_X(x) = \bigg\{ \begin{tabular}{cc} p & x = -1 \\ 1-p & x = 1 \end{tabular}$

Let’s set our initial position as:
$n=0: \quad \mathbb{E}[X_0] = z_i$

After one step our expected position is then:
$n=1: \quad \mathbb{E}[X_1] = (z_i - 1)p + (z_i + 1)(1 - p)$
$n=1: \quad \mathbb{E}[X_1] = z_{i}p - p + z_i + 1 - z_{i}p - p$
$n=1: \quad \mathbb{E}[X_1] = z_i + 1 - 2p$

Great, let’s try iterating one more to see what we get. Note that at $n=2$ our position is now the result from $n=1$, $z_i + 1 - 2p$.
$n=2: \quad \mathbb{E}[X_2] = (z_i + 1 - 2p - 1)p + (z_i + 1 - 2p + 1)(1 - p)$
$n=2: \quad \mathbb{E}[X_2] = z_{i}p - 2p^2 + Z_i - 2p + 2 - z_{i}p + 2p^2 - 2p$
$n=2: \quad \mathbb{E}[X_2] = z_i + 2(1 - 2p)$

If we keep iterating we will see that $\mathbb{E}[X_n] = z_i + n(1 - 2p)$. But we can prove this formally through induction. We’ve already done our base case, so let’s now do the induction step. We will assume that $\mathbb{E}[X_n] = z_i + n(1 - 2p)$ is true and show that it is also true for $n + 1$.

$\mathbb{E}[X_{n+1}] = (z_i + n(1 - 2p) - 1)p + (z_i + n(1 - 2p) + 1)(1 - p)$
$\mathbb{E}[X_{n+1}] = (z_i + n - 2pn - 1)p + (z_i + n - 2pn + 1)(1 - p)$
$\mathbb{E}[X_{n+1}] = z_{i}p + pn - 2p^{2}n - p + z_i + n - 2pn + 1 -z_{i}p - pn + 2p^{2}n - p$
$\mathbb{E}[X_{n+1}] = - p + z_i + n - 2pn + 1 - p$
$\mathbb{E}[X_{n+1}] = z_i + (n + 1)(1 - 2p)$

Thus our induction step holds and we have shown that $\mathbb{E}[X_n] = z_i + n(1 - 2p)$.

Because we chose our initial starting position $z_i$ to be arbitrary, we might as well set it to 0 to obtain a final result of $\mathbb{E}[X_n] = n(1 - 2p)$.

Let’s take a moment to think about this result and make sure it seems reasonable. Suppose $p = 0.5$. This would mean we have an equal chance of moving left or moving right. Over the long run we would expect our final position to be exactly where we started. Plugging in $p = 0.5$ to our equation yields $n(1 - 2 \cdot 0.5) = n(1 - 1) = 0$. Just as we expected! What if $p = 1$? This means we only move to the left. Plugging $p = 1$ into our equation yields $n(1 - 2 \cdot 1) = n(-1) = -n$. This makes sense! If we can only move to the left then after $n$ steps we would expect to be $n$ steps left of our staring position (the origin as we chose it), the negative direction in our problem setup. We could also choose $p$ to be 0, meaning we only move to the right and we would get $n$, again just what we would expect!

We can also run a simulation in R to verify our results:

################################################################
# R Simulation
################################################################
# Generate random walk
rand_walk = function (n, p, z) {
walk = sample(c(-1,1), size=n, replace=TRUE, prob=c(p,1-p))
for (i in 1:n) {
z = z + walk[i]
}
return(z)
}

n = 1000 # Walk n steps
p = .3 # Probability of moving left
z = 0 # Set initial position to 0
trials = 10000 # Num times to repeate sim
# Run simulation
X = replicate(trials, rand_walk(n,p,z))

# Calculate empirical and theoretical results
empirical = mean(X)
theoretical = n*(1-2*p)
percent_diff = abs((empirical-theoretical)/empirical)*100

# print to console
empirical
theoretical
percent_diff


Printing to the console we see that after 10,000 trials of 1,000 steps each our empirical and theoretical results differ by just 0.046%.

> empirical
[1] 400.1842
> theoretical
[1] 400
> percent_diff
[1] 0.0460288


# Expected Value of a Generic Positive Random Variable

Here is a math problem:

Suppose we are told that $P(X > 0) = 1$ and are given that $\lim_{n\to\infty} x\big[1-F(x)\big] = 0$. Show that $\mathbb{E}[X] = \int_0^{\infty} P(X > x)dx$.

Let’s start with the definition of expectation and use integration by parts.

$\mathbb{E}[X] = \int_{-\infty}^{\infty}xf_X(x)dx = \int_0^{\infty}xf_X(x)dx$ since we are given $x \in (0,\infty)$.

Now using integration by parts, making the natural selection we have:

$u=F_X(x) \quad \quad v=x$
$u'=f_X(x) \quad \quad v'=dx$

Recall that $\int u'v = uv-\int uv'$ and plugging in our selections we get:

$\mathbb{E}[X] = \int_0^{\infty}xf_X(x)dx = xF_X(x)\bigm|_0^{\infty} - \int_0^{\infty} F_X(x)dx$

Let’s rewrite $\int_0^{\infty} F_X(x)dx$ as $\int_0^{\infty} \big(1 - P(X > x)\big)dx$. This simplifies to $\int_0^{\infty} dx - \int_0^{\infty} P(X > x)dx$. Let’s plug this back into our equation above:

$\mathbb{E}[X] = \int_0^{\infty}xf_X(x)dx = xF_X(x)\bigm|_0^{\infty} - \bigg(\int_0^{\infty} dx - \int_0^{\infty} P(X > x)dx \bigg)$

$\mathbb{E}[X] = xF_X(x)\bigm|_0^{\infty} - x\bigm|_0^{\infty} + \int_0^{\infty} P(X > x)dx$

$\mathbb{E}[X] = \big[xF_X(x)- x\big]_0^{\infty} + \int_0^{\infty} P(X > x)dx$

$\mathbb{E}[X] = \big[x(F_X(x)- 1)\big]_0^{\infty} + \int_0^{\infty} P(X > x)dx$

$\mathbb{E}[X] = \big[-x(1-F_X(x))\big]^{x=\infty} + \int_0^{\infty} P(X > x)dx$ since plugging in $0$ to the first half of our expression just yields $0$. We can’t actually evaluate $x$ at infinity, instead we take the limit:

$\mathbb{E}[X] = -\lim_{x\to\infty} x\big[1-F(x)\big] + \int_0^{\infty} P(X > x)dx$, but recall that we are told $\lim_{x\to\infty} x\big[1-F(x)\big] = 0$ and so we are simply left with:

$\mathbb{E}[X] = \int_0^{\infty} P(X > x)dx$, our desired result.

# Finding the Expected Value of the Maximum of n Random Variables

My friend Ryan, who is also a math tutor at UW, and I are working our way through several math resources including Larry Wasserman’s famous All of Statistics. Here is a math problem:

Suppose we have $n$ random variables $X_1, ...X_n$ all distributed uniformly, $X_i \sim Uniform(0,1)$. We want to find the expected value of $\mathbb{E}[Y_n]$ where $Y_n = \max\{X_1,..., X_n\}$.

First, we need to find the Probability Density Function (PDF) $f_Y(y)$ and we do so in the usual way, by first finding the Cumulative Distribution Function (CDF) and taking the derivative:

$F_Y(y) = P(Y < y)$
$F_Y(y) = P(\max\{X_1, ..., X_n\} < y)$
$F_Y(y) = P(X_1,..., X_n < y)$

We want to be able to get this step:
$F_Y(y) = P(X_1 < y)P(X_2 < y) \cdots P(X_n < y)$

But must show independence and we are not give that our $X_i$‘s are in fact independent. Thanks to Ryan for helping me see that by definition:

$F_Y(y) = \underset{A}{\idotsint} f_{X_1, \dots, X_n}(y) \,dx_1 \dots dx_n$

However, note that in this case $f_{X_1, \dots, X_n}(y)$ is a unit $n-cube$ with area $A$ equal to $1$. In other words $f_{X_1, \dots, X_n}(y) = 1$. Our equation then simplifies:

$F_Y(y) = \idotsint 1 dx_1 \dots dx_n$
$F_Y(y) = \int dx_1 \dots \int dx_n = [F_X(y)]^n$ where $X$ here is a generic random variable, by symmetry (all $X_i$‘s are identically distributed). This is the same answer we would’ve gotten if we made the iid assumption earlier and obtained $F_Y(y) = P(X_1 < y)P(X_2 < y) \cdots P(X_n < y)$. Originally, I had made this assumption by way of wishful thinking — and a bit of intuition, it does seem that $n$ uniformly distributed random variables would be independent — but Ryan corrected my mistake.

Now that we have $F_Y(y)$ we can find $f_Y(y)$ the PDF.

$f_Y(y) = \frac{d}{dy}F_Y(y) = \frac{d}{dy}[F_X(y)]^n$
$f_Y(y) = n[F_X(y)]^{n-1}f_X(y)$ by the chain rule.

Recall that the PDF $f_X(x)$ of a $X \sim Uniform(0,1)$ is $\frac{1}{b-a} = \frac{1}{1-0} = 1$ for $x \in [0,1]$. And by extension the CDF $F_X(x)$ for a $X \sim Uniform(0,1)$ is:
$\int_a^x f(t)dt = \int_a^x \frac{1}{b-a}dt = t\frac{1}{b-a} \bigm|_a^x = \frac{x}{b-a} - \frac{a}{b-a} = \frac{x-a}{b-a} = \frac{x-0}{1-0} = x$.

Plugging these values into our equation above (and noting we have $F_X(y)$ not $F_X(x)$ meaning we simply replace the $x$ we just derived with $y$ as we would in any normal function) we have:

$f_Y(y) = ny^{n-1} \cdot 1$

Finally, we are ready to take our expectation:

$\mathbb{E}[Y] = \int_{y\in A}yf_Y(y)dy = \int_0^1 yny^{n-1}dy = n\int_0^1 y^{n}dy = n\bigg[\frac{1}{n+1}y^{n+1}\bigg]_0^1 = \frac{n}{n+1}$

Let’s take a moment and make sure this answer seems reasonable. First, note that if we have the trival case of $Y = \max\{X_1\}$ (which is simply $Y = X_1$; $n = 1$ in this case) we get $\frac{1}{1+1} = \frac{1}{2}$. This makes sense! If $Y = X_1$ then $Y$ is just a uniform random variable on the interval $0$ to $1$. And the expected value of that random variable is $\frac{1}{2}$ which is exactly what we got.

Also notice that $\lim_{n\to\infty} \frac{n}{n+1} = 1$. This also makes sense! If we take the maximum of 1 or 2 or 3 $X_i$‘s each randomly drawn from the interval 0 to 1, we would expect the largest of them to be a bit above $\frac{1}{2}$, the expected value for a single uniform random variable, but we wouldn’t expect to get values that are extremely close to 1 like .9. However, if we took the maximum of, say, 100 $X_i$‘s we would expect that at least one of them is going to be pretty close to 1 (and since we’re choosing the maximum that’s the one we would select). This doesn’t guarantee our math is correct (although it is), but it does give a gut check that what we derived is reasonable.

We can further verify our answer by simulation in R, for example by choosing $n = 5$ (thanks to the fantastic Markup.su syntax highlighter):

################################################################
# R Simulation
################################################################
X = 5
Y = replicate(100000, max(runif(X)))
empirical = mean(Y)
theoretical = (X/(X+1)) #5/6 = 8.33 in this case
percent_diff = abs((empirical-theoretical)/empirical)*100

# print to console
empirical
theoretical
percent_diff


We can see from our results that our theoretical and empirical results differ by just 0.05% after 100,000 runs of our simulation.

> empirical
[1] 0.8337853
> theoretical
[1] 0.8333333
> percent_diff
[1] 0.0542087


# Change of Integral Limits for Even Functions

For fun I’m enrolled in an online computational finance certificate at UW. In one of my homework problems I wanted to use the following fact about the integral of single variable, even functions:

$\int_{-\infty}^{-x} f(s)ds = \int_{x}^{\infty} f(s)ds$

If it’s been a few years since you’ve taken calculus that may not make much sense, but trust me when I tell you that it’s analytically obvious, especially when looking at functions graphically, as this terrible hand drawn image shows:

Intuitively, we know the two red areas are the same, so it seems we should be able to interchange the limits as I described above. Indeed, playing around in Mathematica suggests that this is true. However, I could not find a proof or theorem for this online so perhaps it is rarely used. I decided to prove it myself:

Original equation:
$\int_{s=-\infty}^{s=-x} f(s)ds$

Use u-substitution with $-u=s$ and $-du=ds$ :
$= \int_{u=\infty}^{u=x} -f(-u)du$

Bring minus sign outside integral:
$= -\int_{u=\infty}^{u=x} f(-u)du$

Use the fact that $-\int_{b}^{a} f(x)dx = \int_{a}^{b} f(x)dx$ :
$= \int_{u=x}^{u=\infty} f(-u)du$

By assumption $f$ is even so $f(-u)=f(u)$ :
$= \int_{u=x}^{u=\infty} f(u)du$

Rewrite improper integral:
$= \lim_{t \to \infty} \int_{x}^{t} f(u)du$

By Fundamental Theorem of Calculus:
$= \lim_{t \to \infty} [F(t)] - F(x)$

By Fundamental Theorem of Calculus:
$= \lim_{t \to \infty} \int_{x}^{t} f(s)ds$

$= \int_{x}^{\infty} f(s)ds$
Which is exactly the result we were trying to obtain.