[FrontPage] [TitleIndex] [WordIndex

Note: You are looking at a static copy of the former PineWiki site, used for class notes by James Aspnes from 2003 to 2012. Many mathematical formulas are broken, and there are likely to be other bugs as well. These will most likely not be fixed. You may be able to find more up-to-date versions of some of these notes at http://www.cs.yale.edu/homes/aspnes/#classes.

1. String evolution

A scribe repeatedly copies a string of bits. Whenever he recopies the string, he either copies it correctly (with probability 1/2), deletes the last bit (with probability 1/3; if the string has length 0, he copies it correctly in this case); or adds a new bit (with probability 1/12 each for adding a 0 or adding a 1).

The sequence of bit strings generated by the scribe can easily be seen to be a Markov chain, since each string depends only on its immediate predecessor.

  1. Compute the stationary distribution of this Markov chain, and show whether it is or is not reversible.
  2. Suppose we start with the empty string. Compute a bound on the rate of convergence to the stationary distribution, showing an upper bound on the expected time until the total variation distance drops below some small constant ε.

1.1. Solution

1. Organize the strings into a binary tree, where each string x is the parent of x0 and x1. The process then describes a random walk on this tree, with a probability of 1/3 of going up and 1/12 of going down to each child. Let's guess that the process is reversible; then we have π(x)(1/12) = π(xb)(1/3) for b=0,1, giving π(xb) = π(x)/3 and in general π(x) = π(<>) 3-x where π(<>) is the stationary probability of the empty string. Summing over all strings gives ∑n π(<>) 2n3-n = π(<>) ∑n (2/3)n = 3π(<>) = 1. This gives π(<>) = 1/3, and so in general we have π(x) = 3-|x|-1. Because this distribution satisfies the detailed balance equations (we proved it using them), it is a stationary distribution and the process is reversible.

2. We'll set up a coupling in the simplest way imaginable: given Xt and Yt, we either delete a bit from both (probability 1/3), add the same bit to both (probability 1/6), or do nothing. We will then argue that Xt and Yt are linked as soon as |Xt| = |Yt| = 0. To analyze when this occurs, observe that Zt = max(|Xt|, |Yt|) moves as a biased random walk with a reflecting barrier at 0, and probability 1/3 of dropping and 1/6 of rising, for an expected change of -1/6 per time unit. It follows that Zt+min(t,τ)/6 is a martingale, where τ is the stopping time at which Zτ = 0. Since we have bounded increments, the optional stopping theorem applies, and E[Zτ] = E[Z0], giving E[τ] = 6 E[Z0].

We now need to compute E[Z0]. We have |X0| = 0, but for |Y0| we have a probability distribution that assigns probability 2n3-n-1 to |Y0| = n. So we have E[Z0] = ∑n=0..∞ n 2n3-n-1 = (2/9) ∑ n (2/3)n-1 = (2/9) (1/(1-(2/3))2 = 2. This gives E[τ] = 12, and using Markov's inequality we get dTV(X24, π) ≤ 1/2.

2. Perturbing a chain

Let P be the transition matrix of a reversible Markov chain with conductance ΦP and diameter d, where the diameter is the maximum number of steps needed to reach any state from any other state.

Suppose we replace P with a new Markov chain Q where ε ≤ qij ≤ pij for some constant ε ≥ 0, and suppose that Q is also reversible.

  1. Let x be some state of P and Q. Compute the best upper and lower bounds you can on πQ(x) as a function of πP(x), d, and ε. Hint: First compute bounds on πQ(y)/πQ(x) as a function of πP(y)/πP(x),,, d, and ε.

  2. Compute the best upper and lower bounds you can on the conductance ΦQ of Q.

  3. Suppose that you are not told ΦP, but are told τ2(P). What can you say about τ2(Q), using your previous bound on ΦQ?

2.1. Solution

  1. Consider some shortest path x = x0 x1 x2 ... xk = y from x to y; here k≤d. Because P is reversible, we have πP(y) = πP(x) ∏i px(i) x(i+1)/px(i+1) x(i). Similarly πQ(y) = πQ(x) ∏i qx(i) x(i+1)/qx(i+1) x(i). But then εd πP(y) / πP(x) ≤ πQ(y)/πQ(x) ≤ ε-d πP(y)/πP(x). Since we have 1 = ∑ πQ(y) = πQ(x) ∑ πQ(y)/πQ(x), we can compute πQ(x) = 1/(∑ πQ(y)/πQ(x)) = c / (∑ πP(y)/πP(x)) where εd ≤ c ≤ ε-d. So this gives εd πP(x) ≤ πQ(x) ≤ ε-d πP(x).

  2. Let S be such that ΦQ = ΦQ(S) = (1/πQ(S)) ∑x∈S,y∉S πQ(x) qxy. We have that ε pxy ≤ qxy ≤ pxy, εd πP(S) ≤ ε-d πP(S), and for each x, εd πP(x) ≤ πQ(x) ≤ ε-d πP(x). Multiplying all these ε's together gives ΦQ = ΦQ(S) ≥ (1/ε-d πP(S)) ∑x∈S y∉S εd πP(x) ε pxy = ε2d+1 ΦP(S) ≥ ε2d+1 ΦP. Reversing the argument gives ΦP = ΦP(S') = (1/πP(S')) ∑x∈s,y∉s πP(x) pxy ≥ (1/ε-d πQ(S')) ∑x∈S' y∉S' εd πQ(x) qxy = ε2d ΦQ(S') ≥ ε2d ΦQ. Combining the two bounds gives ε2d+1 ΦP ≤ ΦQ ≤ ε-2d ΦP.

  3. We have 1/2ΦP ≤ τ2(P) ≤ 2/Φ2(P), which gives 1/2τ2(P) ≤ ΦP ≤ √(2/τ2(P)). Applying the previous bound gives ε2d+1/2τ2(P) ≤ ΦQ ≤ ε-2d √(2/τ2(P)). Plugging this back in to the bound on τ2(Q) gives (ε2d/2√2) (τ2(P))1/2 ≤ τ2(Q) ≤ 8ε-4d-22(P))2.

3. A sticky chain

Consider a random walk on a cycle of n nodes, where at each step we remain on node i with probability f(i), where 1/2 ≤ f(i) ≤ 1-δ, for some constant δ > 0, and move to node i-1 or i+1 with probability (1-f(i))/2 each.

  1. Compute the stationary distribution of this walk as a function of the f(i).
  2. Give the best bound you can as a function of n and δ on the time to converge to within some fixed ε of the stationary distribution, starting from an arbitrary node.

3.1. Solution

  1. Let's guess that the chain is reversible. Then we have π(i+1) = π(i) (1-f(i))/(1-f(i+1)) and in general π(i) = π(0) ∏j<i (1-f(j))/(1-f(j+1)). We can check that this is consistent by computing π(0) = (1-f(n-1))/(1-f(0)) π(0) ∏j<n-1 (1-f(j))/(1-f(j+1)) = π(0) ∏j (1-f(j)) / ∏j (1-f(j)) = π(0).

  2. We use a coupling based on the standard coupling for a lazy uniform walk on the cycle. If Xt = Yt, we move both together. Otherwise, given Xt = i, Yt = j, we move Xt only with probability 1-f(i) and Yt only with probability 1-f(j) (these probabilities sum to at most 1 because f(i) ≥ 1/2 for all i). So now Zt = Xt - Yt (mod n) satisfies E[Zt+1] = E[Zt], |Zt+1-Zt| ≤ 1, and Pr[Zt+1≠Zt|Zt≠0 (mod n)] ≥ 2δ. It follows that Zt∧τ2 - 2δ(t∧τ) is a submartingale, where τ = mint Zt = 0 or n. Optional stopping applies because of bounded increments, so we have E[Zτ2 - 2δτ] ≥ E[Z02], giving E[τ] ≤ n2/2δ (the constant can probably be improved a bit). So taking τ1 = n2/δ gives dTV(Xτ₁,π) ≤ 1/2.


2014-06-17 11:58