Note: You are looking at a static copy of the former PineWiki site, used for class notes by James Aspnes from 2003 to 2012. Many mathematical formulas are broken, and there are likely to be other bugs as well. These will most likely not be fixed. You may be able to find more up-to-date versions of some of these notes at http://www.cs.yale.edu/homes/aspnes/#classes.

1. Random sets

Suppose we are given a set of size S, and generate two subsets A and B by including each element of S in A with independent probability p and in B with independent probability q.

What is the probability that A⊆B?
What are the expected sizes of A, B, A∩B, and A∪B?

1.1. Solution

For A⊆B to hold, it must be the case x∈A ⇒ x∈B for all x∈S. Let's compute the probability that this condition fails for some fixed x: we need x∈A but x∉B. Since these events are independent, they occur with probability p(1-q). So the probability that the condition holds for x is (1-p(1-q)), and the probability that it holds for all x is (1-p(1-q))ⁿ.
Linearity of expectation is our friend here; given a target set T, we let X_x be the indicator variable for the event that x∈T. Then E[|T|] = ∑_x∈S E[X_x] = ∑_x∈S Pr[x∈T] = n Pr[x∈T] if the probability is the same for all x. Thus we have:
- For A, Pr[x∈A] = p ⇒ E[|A|] = pn.
- For B, Pr[x∈B] = q ⇒ E[|B|] = qn.
- For A∩B, Pr[x∈A∩B] = pq ⇒ E[|A∩B|] = pqn.
- For A∪B, Pr[x∈A∪B] = 1-(1-p)(1-q) = p+q-pq ⇒ E[|A∪B|] = n(p+q-pq).

2. Recurrences

Let T(n) = 1 + T(n-X), where X = 0 with probability p and ⌈n/2⌉ with probability q = 1-p. Give the best upper bound you can on E[T(n)].
Let T(n) = n² + T(n-X), where X is a uniformly distributed integer-valued random variable in the range 1..n. Give the best upper bound you can on E[T(n)].
Let T(n) = 1 + T(n-X), where the distribution of X depends on n and E[X] = μ(n), where μ satisfies the conditions of the Karp-Upfal-Wigderson bound. Give an example of a family of processes for which the K-U-W bound on E[T(n)] for some n is an arbitrarily large multiple of the actual value of E[T(n)].

2.1. Solutions

We can try doing this either with KUW or by attacking it directly.
KUW bound
EX = q⌈n/2⌉, a non-decreasing function in n, so E[T(n)] ≤ $\int_0^{n} \frac{1}{q\lceil t/2 \rceil} \,dt$ ≤ $\frac{1}{q} 2 \sum_{k=1}^{\lceil n/2 \rceil} \frac{1}{k}$ = (2/q) H(⌈n/2⌉) ≤ (2/q) ln n.
Direct bound
Here we observe that at each value we reach, we wait an expect (1/q) rounds before dropping to the next value. It takes ⌈lg n⌉+1 drops to get to 0, so the total expected time is exactly (1/q)(⌈lg n⌉ + 1). The constants here are slightly better than the KUW bound, but otherwise both are asymptotically Θ((1/q) log n).
Here we guess that an² works for some constant a, since this is what we'd expect if we were solving the recurrence with X = n/2. So let's check the guess against the actual recurrence. For n = 0, we get T(n) = 0 = an². For larger n, assume that E[T(m)] ≤ am² for m<n and check E[T(n)] = E[n²+T(n-X)] = $n^2+\frac{1}{n}\sum_{m=0}^{n-1} am^2$ ≤ $n^2 + \frac{1}{n} \int_0^n at^2 \,dt$ = n² + (a/n)(n³/3) = (1+a/3)n² ≤ an² provided (1+a/3)≤a. This holds when a≥3/2, so we get E[T(n)] ≤ (3/2)n².
Let's just look at a process with three states, corresponding to n=2, 1, 0. From n=2, suppose we drop to 0 with probability 1; from n=1, we drop to 0 with probability p and otherwise stay put. Now we have E[X₁] = p and E[X₂] = 2, so we can let μ(n) = p for 0≤n≤1 and 2 for 1<n≤2. The KUW bound on E[T(2)] is then 1/p+1/2. But the actual value of E[T(2)] is 1, so by letting p→0 we can make the KUW bound arbitrarily worse than the real value.

3. Random hitting sets

In the hitting set problem, one is given a collection of subsets A₁,A₂...A_k of a set S of n elements, and the goal is to find as small a set B as possible such that A_i∩B≠∅ for all i.

Suppose that |A_i| = m for all i, and that we choose B by including each element of S with independent probability c/m. As a function of n, k, and m, how large does c need to be so that you can show there is at least a constant probability that B∩A_i≠∅ for all i?

You may find it helpful to use the fact that 1+x ≤ e^x for all x.

3.1. Solution

We'll start by computing Pr[B∩A_i=∅] for any specific A_i. This is just the probability that none of the m elements of A_i are chosen, or (1-c/m)^m ≤ exp(-c/m)^m = exp(-c). Applying the union bound gives a bound on the probability than any A_i is missed of k(1-c/m)^m ≤ ke^-c. Letting ke^-c = ε and solving for c gives c = ln(k/ε).