Note: You are looking at a static copy of the former PineWiki site, used for class notes by James Aspnes from 2003 to 2012. Many mathematical formulas are broken, and there are likely to be other bugs as well. These will most likely not be fixed. You may be able to find more up-to-date versions of some of these notes at http://www.cs.yale.edu/homes/aspnes/#classes.

(For more details see MotwaniRaghavan Chapter 13 or BorodinElYaniv.)

1. On-line algorithms

Classic example: Sleator and Tarjan's list update problem, where we have a linked list and get a sequence σ of requests to find particular elements of the list (general version also includes insertions and deletions). It costs 1 unit of work per comparison, and we can only search from the front of the list, so finding the k-th item costs k units. But having found it we are allowed to move it to any position ≤k at no charge (free exchanges). We can also move arbitrary nodes in either direction at a cost of 1 unit per transposition (paid exchanges). Goal is to minimize total cost of an entire sequence σ.

Typical heuristic is move to front: always move the node we found to the front of the list (using free exchanges). It's easy to show this is not optimal for all σ, but maybe it's not much worse.

In the worst case, σ always asks for whatever element is at the end of the list, and we pay n units per search. In the best case, it always asks for the front of the list, and we pay 1 unit per search. We'd like a measure of how difficult a particular sequence σ is so that we can tell if our chosen algorithm is working.

Competitive analysis says we'll compare our on-line algorithm (move-to-front, say) with an optimal off-line algorithm that can see all of σ in advance (equivalently, that gets lucky; such lucky algorithms are delivered by our good friend the universal quantifier). An algorithm A is c-competitive or has competitive ratio c if there is a constant α such that for all σ we have A(σ) ≤ c⋅OPT(σ) + α, where A(σ) and OPT(σ) are the costs of A and the optimal algorithm OPT on σ.

The additive constant avoids problems with bad short sequences; to show that an algorithm has a bad competitive ratio, we effectively have to show that A(σ) ≥ c⋅OPT(σ) in the limit as OPT(σ) goes to infinity.

2. 2-competitiveness of move-to-front

(See BorodinElYaniv §2.1.)

We'll run MTF and OPT in parallel, and use a potential function Φ to pay for transitions where MTF pays much more than OPT.

State of each algorithm is a permutation π (MTF) or π^* (OPT) of the elements. The potential function Φ is the number of inversions between π and π^*, elements x and y where πx < πy but π^*x > π^*y. We will charge MTF an amortized cost that includes both its actual cost and the change in Φ (which may give negative costs if Φ drops), and we want to show that this amortized cost for each search is less than 2s-1, where s is the cost incurred by OPT.

Now let's see what happens if we do a search. Suppose we are looking for x, which is at position k^* in π^* and k in π. Consider the set of elements that appear before x in π but after x in π^* (thus producing a higher cost for MTF than for OPT); let ν be the number of such elements. Then we have k^* ≥ k-ν, and we also have that at most k-1-ν elements appear before x in both lists. By moving x to the front, MTF creates at most k-1-ν new inversions, while destroying ν old inversions, for a change in Φ of k-1-2ν and a total amortized cost of 2k-1-2ν ≤ 2(k-ν) - 1 ≤ 2k^*+1.

Now we let OPT make any transpositions it likes. Free transpositions move x closer to the front, so they can only decrease Φ (which reduces our amortized cost). Each paid transition can create at most one new inversion. So we have cost(MTF) + Φ' - Φ ≤ 2k^*-1+P ≤ 2(k^*+P)-1 = 2s-1.

Summing over a sequence of n operations gives MTF(σ) + Φ_n - Φ₀ ≤ 2⋅OPT(σ) - n. Since Φ is non-negative this gives the desired competitive bound MTF(σ) ≤ 2⋅OPT(σ).

3. Adding randomization

Now we look at expected costs.

3.1. Adversaries

Complication: Have to decide what adversary gets to do.

Oblivious: Can't see what A does at all.
Adaptive on-line: Can see what A does, OPT has to choose its actions before seeing A's next choice (but can connive with adversary).
Adaptive off-line: Can see what A does, OPT chooses its actions after seeing entire sequence.

We'll look at the relationship between these in more detail later.

3.2. Example: randomized move-to-front with oblivious adversary

(See BorodinElYaniv §2.3.)

Flip a coin for each search and move to front only if coin comes up heads. Provably twice as bad as deterministic MTF in the limit (and thus at best 2-competitive).

Bad sequence: (x_m)^k, (x_m-1)^k, ... (x₁)^k. Exercise: Compute the expected costs for this sequence.

3.3. Example: BIT with oblivious adversary

(See BorodinElYaniv §2.2.)

Like randomized move-to-front, except I only flip a coin once for each element and store it. If the element's bit is 0, I move it to the front and switch to 1; if 1, I do nothing and switch to 0.

We can show that E[BIT(σ)] ≤ (7/4) OPT(σ) using a modified version of the potential function for MTF. Define Φ = ∑_{inversions xy} (b(y)+1), which is equal to the sum over all inversions of the number of times we need to hit y before BIT will get rid of the inversion. The intuition is that b(y)+1 represents how many accesses we are going to have to pay for using this inversion before we actually move y to the front.

Again we treat a sequence of operations as consisting of the following events: (a) accesses to y, where the cost BIT_i to BIT and OPT_i to OPT are the positions of y in π and π^*, and either algorithm may move y using free exchanges; and (b) paid exchanges by OPT (BIT doesn't do paid exchanges). Defining the amortized cost a_i = Φ_i-Φ_i-1+BIT_i, we have BIT(σ) = Φ₀-Φ_last+∑a_i. So the bound will be obtained by showing that E[a_i] ≤ (7/4)OPT_i for all i.

The easy case is when i is a paid exchange by OPT. Each paid exchange creates at most one new inversion; the weight of this inversion is b(y)+1 for some y, which is 1 or 2 with equal probability. A complication is that whether the inversion is created or not may be correlated with b(y), but since we are only looking for an upper bound, we can just use the constant 1 as an upper bound on the number of inversions created and get a_i = (1/2)(1+2) = 3/2 < (7/4)OPT_i.

The painful and difficult case is when i is an access to some element y. Here we have to count creation and destruction of inversions as in the MTF case, and correlations abound. Let BIT_i=k be the position of y in π, and OPT_i=k^* be the position of y in π^*. Let I be the number of inversions <x,y> (note we are not counting weights here). By the same analysis as in MTF we have k ≤ k^*+I, giving BIT_i ≤ OPT^* + I. Now we just have to argue that the extra I is paid for by the competitive ratio and changes in Φ.

We'll divide the change in Φ into three pieces: A=added inversions, B=removed inversions, and C=inversions that change weight. We'll consider the effect B+C first and then add in A at the end.

Suppose b(y)=1 before i. Then y doesn't move in BIT but the weight of any inversion <x,y> drops. So we have C = -I and B≤0, giving B+C ≤ -I.

Alternatively, suppose b(y)=0 before i. Then y moves in BIT, eliminating I inversions; since no inversions <x,y> survive, the increase in b(y) has no effect. So now we have B=-I, C=0, and again B+C ≤ -I.

Now we tackle A. Consider the elements x₁...x_k*-1, that precede y in π^*. Since we are only moving y forwards (in BIT and/or OPT), any new inversion involves some x_j where (a) x_j precedes y in π, and (b) exactly one of BIT and OPT moves y before x_j. Let k' be the new position of y in π^*. Define X_j as the random variable representing the contribution of the new inversion involving x_j (if any).

Case 1: b(y)=0 before i. Then y moves in BIT and X_j = b(x_j)+1 precisely when y doesn't move past x_j in OPT, with X_j = 0 otherwise. The nonzero case occurs only if j < k', giving ∑ X_j ≤ ∑_j=1..k-1 (b(x_j)+1).

Case 2: b(y)=1 before i. Then y doesn't move in BIT. Now the nonzero case for X_j occurs when j ≥ k', giving ∑ X_j ≤ ∑_j=k+1..k-1 (b(y)+(1)) =

Both cases occur with probability 1/2. Summing the two cases, we get EA = E[∑X_j] ≤ (1/2) ∑_j=1..k'-1 (3/2) + (1/2) ∑_j=k'..k*-1 1 ≤ (3/4)(k^*-1).

Adding the contribution from B+C, we get E[A+B+C] ≤ (3/4)OPT_i - 3/4 - I. Now add BIT_i ≤ OPT_i + I to get E[a_i] ≤ (7/4) OPT_i - 3/4.

4. Paging

See BorodinElYaniv Chapter 4.

CategoryRandomizedAlgorithmsNotes