The power of two choices is the observation that if we put n balls into n bins, the maximum load is much smaller if each ball can choose the lighter of two randomly-selected bins (without replacement) instead of just picking one bin at random. We'll do the upper bound proof from MitzenmacherUpfal §14.1 that with d≥2 choices the maximum load is ln ln n / ln d + O(1) with probability 1 - o(1/n); this is much better than the Θ(log n / log log n) we previously saw for the d=1 case.
For each ball j=1..n, define its height h(j) as the number of balls in its bin after it is placed. To get a ball of height i+1 or higher, we have to pick d bins with i balls or more. Suppose that we have a bound βi on the number of bins that ever have i or more balls. Then the probability that we pick d such bins is at most (βi/n)d. We will then use Chernoff bounds to show that the total number of height i+1 or higher balls is at most 2n(βi/n)d with high probability, or that βi+1/n ≤ 2(βi/n)d. This gives a recurrence for the concentration of high balls βi.
2. Probabilistic implications
One messy part of the argument is that in arguing that there are few height i+1 balls, we are depending on there being few height i balls. This sounds like a conditional probability, but if we condition on having few height i balls the process gets very messy (for example, it's not clear whether this increases or decreases the probability of having a lot of height i+1 balls; maybe the reason we have so few height i balls is that we have an usually large number of very full bins, leaving not so many balls to fill up other bins to height i). So instead we will look at events of the form [number of height i balls ≤ βi ⇒ number of height i+1 balls ≤ βi+1]. The negation of this event is [number of height i balls ≤ βi ∧ number of height i+1 balls > βi+1]; by showing that each of these bad events has low probability we can us the union bound to show that all the implications hold except with low probability, which will show that the entire chain of events holds.
3. Main induction step
Formally, let νi be the number of balls at height i or greater at the end of the process. Let βi be given by the recurrence β4 = n/4, βi+1/n = 2(βi/n)d. We want to show that νi ≤ βi for all sufficiently large i with high probability.
For i=4, we have Pr[νi ≤ βi] = 1; there aren't enough balls to get more than n/4 with height 4 or greater.
Now let i≥4 and look at νi+1. Let Xj be the indicator variable for the event [νi ≤ βi ∧ ball j picks d bins with at least i balls each]. Then E[Xj|X1..Xj-1] ≤ (βi/n)d, because either (a) at time j there are already more than βi balls at height i, and Xj = 0, or (b) at time j there are not more than βi balls at height i, the probability that we pick d bins that contain tall balls is bounded by (βi/n)d, and the probability that Xj=1 is no more than this (it may be less either because our estimate of the number of tall bins is too high or because νi later exceeds βi). This gives us a sequence of indicator variables where the expectation of each conditioned on the previous ones is bounded.
Unfortunately these variables are not independent, so we can't apply Chernoff bounds directly. We can probably apply Azuma-Hoeffding with a bit of work, but there is a sneaky trick that lets us apply Chernoff bounds anyway. The trick is that we can build a coupled collection of independent random variables Y1...Yn where Xj ≤ Yj always and each Yj=1 with probability exactly (βi/n)d. So then ∑ Xj ≤ ∑ Yj and we have Pr[νi ≤ βi ∧ νi+1 > βi+1] ≤ Pr[∑ Xj > βi+1] ≤ Pr[∑ Yj > βi+1] = Pr[∑ Yj > 2 E[∑ Yj]] ≤ (e2/22)np ≤ e-np/3 where p = (βi/n)d. This bound will be less than n-2 as long as np ≥ 6 ln n or p ≥ 6 ln n / n. When np gets smaller, we will have to switch to a different method.
4. Solving the recurrence
MitzenmacherUpfal give the solution to the recurrence βi/n = 2(βi/n)d as βi+4 = . The proof is by the usual plug-and-chug method, so we will omit it here. Since the sum in the exponent is annoying, we can simplify the bound to βi+4 by observing that the sum is (di-1)/(d-1) = Θ(di). So the break-even point where p = (βi/n)d drops below 6 ln n / n is at some i* = ln lg n / ln d + Θ(1) = ln ln n / ln d + Θ(1).
5. Final step
So here we are at i*, and we have νi* ≤ βi* with some high probability. We also have that p = (βi*/n)d ≤ 6 ln n / n (otherwise we could keep using the Chernoff bounds). By the coupling argument from above, we get Pr[νi* ≤ βi* ∧ νi*+1 > 18 ln n] ≤ Pr[B(n, 6 ln n / n) ≥ 18 ln n], where B(n,p) is the usual binomial random variable. From Chernoff bounds we have that the probability that this event occurs is at most (e3/33)6 ln n ≤ 1/n2.
Now let's look at i*+2. Here we argue Pr[νi*+1 ≤ 18 ln n ∧ νi*+2 ≥ 2] ≤ Pr[B(n, (18 ln n / n)d) ≥ 2] ≤ (n choose 2) (18 ln n / n)2d ≤ 1/n2 = O(ln4n / n2). The (n choose 2) is all different ways of choosing 2 possible height i*+2 balls, and the (18 ln n / n)2d bounds the probability that both balls are too tall.
Finally, at i*+3, we have Pr[νi*+2 ≤ 1 ∧ νi*+3 ≥ 1] = 0, since we can't pick d≥2 distinct non-empty bins if νi*+2 ≤ 1.
Summing up all the probabilities of failure gives (i*+2)/n2 + O(ln4 n / n2) = o(1/n).
6. Lower bound
The bound given here is tight up to an additive constant. With n balls in n bins we also get some bin with at least ln ln n / ln d - O(1) with probability at least 1 - o(1/n). The proof of the lower bound is approximately as horrible as the proof of the upper bound. See MitzenmacherUpfal §14.2 for details.