Simple Random Walk – Method 2

Suppose we consider a simple random walk. A particle starts at initial position z_i and moves one unit to the left with probability p and moves one unit to the right with probability 1-p . What is the expected position \mathbb{E}[X_n] of the particle after n steps?

In a previous post I presented a method to solve this by iteration. I’d like to present a different method to solve the problem here.

Recall we have:

X_i = \bigg\{ \begin{tabular}{cc} p     & x = -1 \\ 1-p & x = 1 \end{tabular}

We give it some thought we can see that after n steps we’ve simply added up our random variables to get our final location. We could define a new random variable:

Y = \sum_{i=1}^{n} X_i

So now all we need to do is take the expected value of Y . We can use rules of expected value to bring the expectation inside the summation and take the expectation of each individual X_i as we normally would.

\mathbb{E}[Y] = \mathbb{E}\big[\sum_{i=1}^{n} X_i \big]
\mathbb{E}[Y] = \sum_{i=1}^{n} \mathbb{E}[X_i]
\mathbb{E}[Y] = \sum_{i=1}^{n} -1 \cdot p + 1 \cdot (1-p)
\mathbb{E}[Y] =  \sum_{i=1}^{n} 1 - 2p
\mathbb{E}[Y] = n(1 - 2p)

This is the same answer we got via the iteration method. This also makes our R simulation easier. Recall we has this setup:

################################################################
# R Simulation
################################################################
# Generate random walk
rand_walk = function (n, p, z) {
  walk = sample(c(-1,1), size=n, replace=TRUE, prob=c(p,1-p))
  for (i in 1:n) {
    z = z + walk[i]
  }
  return(z)
}

n = 1000 # Walk n steps
p = .3 # Probability of moving left
z = 0 # Set initial position to 0
trials = 10000 # Num times to repeate sim
# Run simulation
X = replicate(trials, rand_walk(n,p,z))

Our code now becomes:

################################################################
# R Simulation
################################################################
n = 1000 # Walk n steps
p = .3 # Probability of moving left
trials = 10000 # Num times to repeate sim
# Run simulation
X = replicate(trials, sum(sample(c(-1,1), size=n, replace=TRUE, prob=c(p,1-p))))

# Calculate empirical and theoretical results
empirical = mean(X)
theoretical = n*(1-2*p)
percent_diff = abs((empirical-theoretical)/empirical)*100

# print to console
empirical
theoretical
percent_diff

Notice that for our random walk we can replace this function:

rand_walk = function (n, p, z) {
  walk = sample(c(-1,1), size=n, replace=TRUE, prob=c(p,1-p))
  for (i in 1:n) {
    z = z + walk[i]
  }
  return(z)
}
X = replicate(trials, rand_walk(n,p,z))

With the simpler:

X = replicate(trials, sum(sample(c(-1,1), size=n, replace=TRUE, prob=c(p,1-p))))

Additionally, in this second method we don’t need to specify an initial position since we assume from the beginning it’s zero. Of course both methods complete the same task, but they use different conceptual models. The first uses an iteration model, while the latter completes the “iteration” in a single step.

Distribution Mean Convergence

Suppose we want to simulate \frac{1}{n}\sum_{i=1}^{n} X_i for X_1, X_2, \dots , X_n \sim N(0, 1) , n = 1, \dots , 10,000 . Suppose we want to do the same for the Cauchy distribution.

In other words, we want to draw several random variables from a normal distribution and then take the average. As n increases we should get closer to the mean of the distribution we’re drawing from, 0 in this case.

The R code below will do this. It produces this graph:
ch3-pr9

Notice that while the Normal distribution converges quickly the Cauchy never does. This is because the Cauchy distribution has fat tails and so extreme observations are common.

################################################################
# R Simulation
# James McCammon
# 2/20/2017
################################################################
# This script goes through the simulation of plotting both normal
# and Cauchy means for random vectors of size 1 to 10,000. It
# also demonstrates function creation and plotting.
# Highlight desired section and click "Run."

# Set working directory as needed
setwd("~/R Projects/All of Statistics")

###
# Calculate means and plot using base R
###

    # Set seed for reproducibility
    set.seed(271)
    
    #
    # Version 1: Simple
    #
    n = seq(from=1, to=10000, by=1)
    y=sapply(n, FUN=function(x) sum(rnorm(x))/x)
    plot(n, y, type="l")
    
    #
    # Version 2: Define a function
    #
    sim = function(x, FUN) {
      sapply(x, FUN=function(x) sum(FUN(x))/x)  
    }
    
    # Use function to plot normal means
    x = seq(from=1, to=10000, by=1)
    y1 = sim(x, rnorm)
    plot(x, y1, type="l")
    
    # Use function to plot Cauchy means
    y2 = sim(x, rcauchy)
    plot(x, y2, type="l")
    
    #
    # Version 3: More complex function
    #
    
    # This function has:
    # (1) error checking
    # (2) extra argument options
    # (3) the ability to input any distribution R supports
    sim = function(x, FUN, ...) {
      if(!is.character(FUN)) stop('Please enter distribution as string.')
      dists = c('rnorm',
                'rbeta',
                'rbinom',
                'rcauchy',
                'rchisq',
                'rexp',
                'rf',
                'rgamma',
                'rgeom',
                'rhyper',
                'rlnorm',
                'rmultinom',
                'rnbinom',
                'rpois',
                'rt',
                'runif',
                'rweibull')
      if(is.na(match(FUN, dists))) stop(paste('Please enter a valid distribution from one of:', paste(dists, collapse=', ')))
      FUN = get(FUN)
      sapply(x, FUN=function(x) sum(FUN(x, ...))/x)  
    }
    
    # We have to define our function in string form.
    # This will throw error 1.
    test1 = sim(x, rnorm)
    
    # We have to input a distribution R supports.
    # This will throw error 2.
    test2 = sim(x, 'my_cool_function')
    
    # We can input additional arguments like the
    # mean, standard deviations, or other shape parameters.
    test3 = sim(x, 'rnorm', mean=10, sd=2)

####
# Using ggplot2 to make pretty graph
###

    # Load libraries
    library(ggplot2)
    library(ggthemes)
    library(gridExtra)
    
    png(filename='Ch3-Pr9.png', width=1200, height=600)
    
      df1 = cbind.data.frame(x, y1)
      p1 = ggplot(df1, aes(x=x, y=y1)) + 
        geom_line(size = 1, color='#937EBF') + 
        theme_fivethirtyeight() +
        ggtitle("Normal Means")
      
      df2 = cbind.data.frame(x, y2)
      p2 = ggplot(df2, aes(x=x, y=y2)) + 
        geom_line(size = 1, color='#EF4664') + 
        theme_fivethirtyeight() +
        ggtitle("Cauchy Means")
      
      # Save charts
      grid.arrange(p1,p2,nrow=2,ncol=1)
      
    dev.off()
    # Print charts to screen
    grid.arrange(p1,p2,nrow=2,ncol=1)