Note: You are looking at a static copy of the former PineWiki site, used for class notes by James Aspnes from 2003 to 2012. Many mathematical formulas are broken, and there are likely to be other bugs as well. These will most likely not be fixed. You may be able to find more up-to-date versions of some of these notes at http://www.cs.yale.edu/homes/aspnes/#classes.

1. String evolution

A scribe repeatedly copies a string of bits. Whenever he recopies the string, he either copies it correctly (with probability 1/2), deletes the last bit (with probability 1/3; if the string has length 0, he copies it correctly in this case), or adds a new bit (with probability 1/12 each for adding a 0 or adding a 1).

The sequence of bit strings generated by the scribe can easily be seen to be a Markov chain, since each string depends only on its immediate predecessor.

1. Compute the stationary distribution of this Markov chain, and show whether it is or is not reversible.
2. Suppose we start with the empty string. Compute a bound on the rate of convergence to the stationary distribution, showing an upper bound on the expected time until the total variation distance drops below some small constant ε.

1.1. Solution

1. Organize the strings into a binary tree, where each string x is the parent of x0 and x1. The process is then a random walk on this tree, with probability 1/3 of moving up to the parent and probability 1/12 of moving down to each child. Let's guess that the process is reversible; then we have π(x)(1/12) = π(xb)(1/3) for b = 0, 1, giving π(xb) = π(x)/4 and in general π(x) = π(<>) 4^{-|x|}, where π(<>) is the stationary probability of the empty string. Since there are 2^n strings of length n, summing over all strings gives ∑_{n≥0} π(<>) 2^n 4^{-n} = π(<>) ∑_{n≥0} (1/2)^n = 2π(<>) = 1. This gives π(<>) = 1/2, and so in general we have π(x) = 4^{-|x|}/2. Because this distribution satisfies the detailed balance equations (we derived it from them) and sums to 1, it is a stationary distribution and the process is reversible.

2. We'll set up a coupling in the simplest way imaginable: given X_t and Y_t, we either delete the last bit from both (probability 1/3), add the same bit to both (probability 1/12 for each bit, 1/6 in total), or do nothing (probability 1/2). Deleting from an empty string leaves it empty. X_t and Y_t are equal as soon as |X_t| = |Y_t| = 0, and they stay equal from then on. To analyze when this occurs, observe that Z_t = max(|X_t|, |Y_t|) moves as a biased random walk with a reflecting barrier at 0: while Z_t > 0, it drops by 1 with probability 1/3 and rises by 1 with probability 1/6, for an expected change of -1/6 per time unit. It follows that Z_{t∧τ} + (t∧τ)/6 is a martingale, where τ is the first time at which Z_τ = 0. Since the increments are bounded and E[τ] is finite, the optional stopping theorem applies, and E[Z_τ + τ/6] = E[Z_0]. Because Z_τ = 0, this gives E[τ] = 6 E[Z_0].

We now need to compute E[Z_0]. We take X_0 to be the empty string and draw Y_0 from π, so Z_0 = |Y_0|, and π assigns probability 2^n 4^{-n}/2 = 2^{-(n+1)} to |Y_0| = n. So we have E[Z_0] = ∑_{n≥0} n 2^{-(n+1)} = (1/4) ∑_{n≥1} n (1/2)^{n-1} = (1/4) (1/(1-(1/2))^2) = 1. This gives E[τ] = 6. Markov's inequality then gives Pr[τ > 12] ≤ 1/2, and so by the coupling lemma d_TV(X_12, π) ≤ 1/2.
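The arithmetic in this solution can be checked mechanically. The following is a minimal sketch in exact rational arithmetic, assuming only the transition probabilities from the problem statement; the names scribe_summary and string_probability are hypothetical helpers introduced here, not part of the original notes. It derives the per-level ratio from the detailed balance equation π(x)(1/12) = π(xb)(1/3), normalizes the resulting geometric series over lengths, and divides the mean stationary length by the downward drift of the coupled walk.

```python
from fractions import Fraction

P_DELETE = Fraction(1, 3)     # delete the last bit (move up to the parent)
P_ADD_EACH = Fraction(1, 12)  # append one particular bit (move to one child)


def scribe_summary(p_delete=P_DELETE, p_add_each=P_ADD_EACH):
    """Return (ratio, pi_empty, mean_length, coupling_time_bound)."""
    # Detailed balance pi(x) * p_add_each = pi(xb) * p_delete gives the
    # ratio pi(xb) / pi(x) between a string and each of its children.
    ratio = p_add_each / p_delete
    # There are 2**n strings of length n, so the total mass on length n is
    # pi_empty * (2 * ratio)**n, a geometric series in n.
    level_ratio = 2 * ratio
    if level_ratio >= 1:
        raise ValueError("lengths drift upward; no stationary distribution")
    pi_empty = 1 - level_ratio                     # series must sum to 1
    mean_length = level_ratio / (1 - level_ratio)  # mean of that geometric law
    # Away from 0 the coupled length walk falls by this much per step on average.
    drift = p_delete - 2 * p_add_each
    return ratio, pi_empty, mean_length, mean_length / drift


def string_probability(bits, p_delete=P_DELETE, p_add_each=P_ADD_EACH):
    """Stationary probability of one particular bit string."""
    ratio, pi_empty, _, _ = scribe_summary(p_delete, p_add_each)
    return pi_empty * ratio ** len(bits)
```

With the probabilities from the problem, scribe_summary() returns a ratio of 1/4, π(<>) = 1/2, a mean stationary length of 1, and an expected coupling time of 6.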

2. Perturbing a chain

Let P be the transition matrix of a reversible Markov chain with conductance Φ_P and diameter d, where the diameter is the maximum number of steps needed to reach any state from any other state.

Suppose we replace P with a new Markov chain Q where ε p_ij ≤ q_ij ≤ p_ij for all i ≠ j, for some constant ε > 0, and suppose that Q is also reversible.

1. Let x be some state of P and Q. Compute the best upper and lower bounds you can on π_Q(x) as a function of π_P(x), d, and ε. Hint: First compute bounds on π_Q(y)/π_Q(x) as a function of π_P(y)/π_P(x), d, and ε.
2. Compute the best upper and lower bounds you can on the conductance Φ_Q of Q.
3. Suppose that you are not told Φ_P, but are told τ₂(P). What can you say about τ₂(Q), using your previous bound on Φ_Q?

2.1. Solution

1. Consider some shortest path x = x_0, x_1, x_2, ..., x_k = y from x to y; here k ≤ d. Because P is reversible, we have π_P(y) = π_P(x) ∏_i p_{x_i x_{i+1}}/p_{x_{i+1} x_i}. Similarly π_Q(y) = π_Q(x) ∏_i q_{x_i x_{i+1}}/q_{x_{i+1} x_i}. Each factor q_{x_i x_{i+1}}/q_{x_{i+1} x_i} lies between ε and 1/ε times the corresponding factor for P, and there are k ≤ d factors, so ε^d π_P(y)/π_P(x) ≤ π_Q(y)/π_Q(x) ≤ ε^{-d} π_P(y)/π_P(x). Since we have 1 = ∑_y π_Q(y) = π_Q(x) ∑_y π_Q(y)/π_Q(x), we can compute π_Q(x) = 1/(∑_y π_Q(y)/π_Q(x)) = c/(∑_y π_P(y)/π_P(x)) = c π_P(x), where ε^d ≤ c ≤ ε^{-d}. So this gives ε^d π_P(x) ≤ π_Q(x) ≤ ε^{-d} π_P(x).
2. Let S be such that Φ_Q = Φ_Q(S) = (1/π_Q(S)) ∑_{x∈S, y∉S} π_Q(x) q_xy. We have that ε p_xy ≤ q_xy ≤ p_xy, ε^d π_P(S) ≤ π_Q(S) ≤ ε^{-d} π_P(S), and for each x, ε^d π_P(x) ≤ π_Q(x) ≤ ε^{-d} π_P(x). Multiplying all these ε's together gives Φ_Q = Φ_Q(S) ≥ (1/(ε^{-d} π_P(S))) ∑_{x∈S, y∉S} ε^d π_P(x) ε p_xy = ε^{2d+1} Φ_P(S) ≥ ε^{2d+1} Φ_P. Reversing the argument, let S' be such that Φ_P = Φ_P(S'); then Φ_P = (1/π_P(S')) ∑_{x∈S', y∉S'} π_P(x) p_xy ≥ (1/(ε^{-d} π_Q(S'))) ∑_{x∈S', y∉S'} ε^d π_Q(x) q_xy = ε^{2d} Φ_Q(S') ≥ ε^{2d} Φ_Q. Combining the two bounds gives ε^{2d+1} Φ_P ≤ Φ_Q ≤ ε^{-2d} Φ_P.
3. We have 1/(2Φ_P) ≤ τ₂(P) ≤ 2/Φ_P², which gives 1/(2τ₂(P)) ≤ Φ_P ≤ √(2/τ₂(P)). Applying the previous bound gives ε^{2d+1}/(2τ₂(P)) ≤ Φ_Q ≤ ε^{-2d} √(2/τ₂(P)). Plugging this back into the same bound for τ₂(Q) gives (ε^{2d}/(2√2)) (τ₂(P))^{1/2} ≤ τ₂(Q) ≤ 8ε^{-4d-2} (τ₂(P))².
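The bound ε^d π_P(x) ≤ π_Q(x) ≤ ε^{-d} π_P(x) can be tried out on a small example. The sketch below is an illustration under an added assumption: it restricts attention to birth-death chains on a path, which are always reversible and whose diameter is the number of edges. The function names and the representation of a chain by its up and down probabilities are hypothetical choices made for this example.

```python
from fractions import Fraction


def stationary_birth_death(up, down):
    """Stationary distribution of a birth-death chain on states 0..n-1.

    up[i] is the probability of moving i -> i+1 and down[i] the probability
    of moving i+1 -> i; any leftover probability is a self-loop.
    Detailed balance gives pi[i+1] = pi[i] * up[i] / down[i].
    """
    weights = [Fraction(1)]
    for u, d in zip(up, down):
        weights.append(weights[-1] * u / d)
    total = sum(weights)
    return [w / total for w in weights]


def check_perturbation_bounds(up_p, down_p, up_q, down_q, eps):
    """Check eps**d * pi_P(x) <= pi_Q(x) <= eps**-d * pi_P(x) for every x."""
    d = len(up_p)  # diameter of a path with len(up_p) + 1 states
    # Verify the hypothesis eps * p_ij <= q_ij <= p_ij on off-diagonal entries.
    for p, q in zip(list(up_p) + list(down_p), list(up_q) + list(down_q)):
        if not eps * p <= q <= p:
            raise ValueError("Q is not an eps-perturbation of P")
    pi_p = stationary_birth_death(up_p, down_p)
    pi_q = stationary_birth_death(up_q, down_q)
    return all(eps ** d * a <= b <= a / eps ** d for a, b in zip(pi_p, pi_q))
```

For instance, take P with every up and down probability equal to 1/4 on four states, ε = 1/2, and Q with up probabilities 1/4, 1/8, 1/4 and down probabilities 1/8, 1/4, 1/6. Then π_P is uniform, π_Q = (2/11, 4/11, 2/11, 3/11), and every entry lies between ε³/4 = 1/32 and ε⁻³/4 = 2.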

3. A sticky chain

Consider a random walk on a cycle of n nodes, where at each step we remain on node i with probability f(i), where 1/2 ≤ f(i) ≤ 1-δ, for some constant δ > 0, and move to node i-1 or i+1 with probability (1-f(i))/2 each.

1. Compute the stationary distribution of this walk as a function of the f(i).
2. Give the best bound you can as a function of n and δ on the time to converge to within some fixed ε of the stationary distribution, starting from an arbitrary node.

3.1. Solution

1. Let's guess that the chain is reversible. Then detailed balance across the edge between i and i+1 gives π(i)(1-f(i))/2 = π(i+1)(1-f(i+1))/2, so we have π(i+1) = π(i) (1-f(i))/(1-f(i+1)) and in general π(i) = π(0) ∏_{j<i} (1-f(j))/(1-f(j+1)) = π(0) (1-f(0))/(1-f(i)). We can check that this is consistent around the cycle by computing π(0) = (1-f(n-1))/(1-f(0)) π(0) ∏_{j<n-1} (1-f(j))/(1-f(j+1)) = π(0) ∏_j (1-f(j)) / ∏_j (1-f(j)) = π(0). Normalizing gives π(i) = (1/(1-f(i))) / ∑_j (1/(1-f(j))).
2. We use a coupling based on the standard coupling for a lazy uniform walk on the cycle. If X_t = Y_t, we move both together. Otherwise, given X_t = i, Y_t = j, we move X_t only with probability 1-f(i) and Y_t only with probability 1-f(j), in each case in a uniformly random direction (these probabilities sum to at most 1 because f(i) ≥ 1/2 for all i). So now Z_t = X_t - Y_t (mod n) satisfies E[Z_{t+1} | X_t, Y_t] = Z_t, |Z_{t+1} - Z_t| ≤ 1, and Pr[Z_{t+1} ≠ Z_t | Z_t ≠ 0 (mod n)] ≥ 2δ. It follows that Z_{t∧τ}² - 2δ(t∧τ) is a submartingale, where τ is the first time t at which Z_t = 0 or n. Optional stopping applies because of bounded increments, so we have E[Z_τ² - 2δτ] ≥ E[Z_0²] ≥ 0; since Z_τ² ≤ n², this gives E[τ] ≤ n²/(2δ) (the constant can probably be improved a bit). So by Markov's inequality, taking τ₁ = n²/δ gives d_TV(X_{τ₁}, π) ≤ 1/2.
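Unrolling the detailed balance recurrence gives π(i) proportional to 1/(1-f(i)), and this can be verified directly. The sketch below is a minimal check in exact arithmetic; the function names and the example values of f are assumptions made for illustration. It builds the candidate distribution and applies one step of the walk to confirm that the distribution is unchanged.

```python
from fractions import Fraction


def sticky_stationary(f):
    """Candidate stationary distribution: pi(i) proportional to 1 / (1 - f(i))."""
    weights = [1 / (1 - fi) for fi in f]
    total = sum(weights)
    return [w / total for w in weights]


def step(dist, f):
    """Apply one step of the sticky walk to a distribution on the n-cycle."""
    n = len(f)
    out = [Fraction(0)] * n
    for i, mass in enumerate(dist):
        out[i] += mass * f[i]            # stay put with probability f(i)
        move = mass * (1 - f[i]) / 2     # otherwise split evenly
        out[(i - 1) % n] += move         # between the two neighbours
        out[(i + 1) % n] += move
    return out
```

For example, with f = (1/2, 2/3, 3/4, 1/2, 4/5) on a 5-cycle (so δ = 1/5), sticky_stationary returns (1/8, 3/16, 1/4, 1/8, 5/16), and step leaves it fixed. The stickier a node, the more stationary mass it holds.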