**The power of two choices** is the observation that if we put n balls into n bins, the maximum load is much smaller if each ball can choose the lighter of two randomly-selected bins (without replacement) instead of just picking one bin at random. We'll do the upper bound proof from MitzenmacherUpfal §14.1 that with d≥2 choices the maximum load is ln ln n / ln d + O(1) with probability 1 - o(1/n); this is much better than the Θ(log n / log log n) we previously saw for the d=1 case.
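To see the effect empirically before proving anything, here is a minimal simulation sketch (our own code, not from MitzenmacherUpfal; the function name `max_load` and the parameters are ours): each ball goes to the least-loaded of d bins sampled without replacement.

```python
import random

def max_load(n, d, rng):
    """Throw n balls into n bins, placing each ball in the least-loaded
    of d bins chosen uniformly at random without replacement; return
    the maximum load over all bins."""
    bins = [0] * n
    for _ in range(n):
        choices = rng.sample(range(n), d)            # d distinct bins
        lightest = min(choices, key=bins.__getitem__)
        bins[lightest] += 1
    return max(bins)

rng = random.Random(2024)
n = 100_000
print("d=1:", max_load(n, 1, rng))   # grows like log n / log log n
print("d=2:", max_load(n, 2, rng))   # grows like log log n: much smaller
```

Even at modest n the d=2 maximum load is visibly smaller than the d=1 one, and it barely moves as n grows.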

# 1. Intuition

For each ball j=1..n, define its **height** h(j) as the number of balls in its bin after it is placed. To get a ball of height i+1 or higher, we have to pick d bins with i balls or more. Suppose that we have a bound β_{i} on the number of bins that ever have i or more balls. Then the probability that we pick d such bins is at most (β_{i}/n)^{d}. We will then use Chernoff bounds to show that the total number of height i+1 or higher balls is at most 2n(β_{i}/n)^{d} with high probability, or that β_{i+1}/n ≤ 2(β_{i}/n)^{d}. This gives a recurrence for the bounds β_{i}, which it turns out drives them down doubly exponentially fast.
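To get a feel for how fast this recurrence collapses, we can iterate the fraction x_{i} = β_{i}/n directly (a quick numerical sketch, our own code; the starting value x_{4} = 1/4 comes from the base case used below):

```python
def beta_fractions(d, steps):
    """Iterate x_{i+1} = 2 * x_i**d starting from x_4 = 1/4,
    returning the fractions [beta_4/n, beta_5/n, ...]."""
    x = 0.25
    out = [x]
    for _ in range(steps):
        x = 2 * x ** d
        out.append(x)
    return out

# With d = 2 the fractions fall doubly exponentially:
for i, x in enumerate(beta_fractions(2, 5), start=4):
    print(f"beta_{i}/n = {x:.3g}")
```

After only a handful of steps the fraction of overfull bins is far below 1/n, which is why the maximum load tops out near ln ln n / ln d.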

# 2. Probabilistic implications

One messy part of the argument is that in arguing that there are few height i+1 balls, we are depending on there being few height i balls. This sounds like a conditional probability, but if we condition on having few height i balls the process gets very messy (for example, it's not clear whether this increases or decreases the probability of having a lot of height i+1 balls; maybe the reason we have so few height i balls is that we have an unusually large number of very full bins, leaving not so many balls to fill up other bins to height i). So instead we will look at events of the form [number of height i balls ≤ β_{i} ⇒ number of height i+1 balls ≤ β_{i+1}]. The negation of this event is [number of height i balls ≤ β_{i} ∧ number of height i+1 balls > β_{i+1}]; by showing that each of these bad events has low probability, we can use the union bound to show that all the implications hold except with low probability, which (together with the base case) shows that the entire chain of bounds holds.

# 3. Main induction step

Formally, let ν_{i} be the number of bins with i or more balls at the end of the process; since bins never lose balls, this also bounds the number of bins with i or more balls at any earlier point in the process. Let β_{i} be given by the recurrence β_{4} = n/4, β_{i+1}/n = 2(β_{i}/n)^{d}. We want to show that ν_{i} ≤ β_{i} for all i ≥ 4 with high probability.

For i=4, we have Pr[ν_{4} ≤ β_{4}] = 1: each bin with 4 or more balls accounts for at least 4 of the n balls, so there can never be more than n/4 such bins.

Now let i≥4 and look at ν_{i+1}. Let X_{j} be the indicator variable for the event [ν_{i} ≤ β_{i} ∧ ball j picks d bins with at least i balls each]. Then E[X_{j}|X_{1}..X_{j-1}] ≤ (β_{i}/n)^{d}, because either (a) at time j there are already more than β_{i} bins with i or more balls, in which case ν_{i} > β_{i} at the end as well (the count never decreases) and X_{j} = 0, or (b) at time j there are at most β_{i} bins with i or more balls, in which case the probability that we pick d such bins is bounded by (β_{i}/n)^{d}, and the probability that X_{j}=1 is no more than this (it may be less, either because our estimate of the number of full bins is too high or because ν_{i} later exceeds β_{i}). This gives us a sequence of indicator variables where the expectation of each, conditioned on the previous ones, is bounded.

Unfortunately these variables are not independent, so we can't apply Chernoff bounds directly. We could probably apply Azuma-Hoeffding with a bit of work, but there is a sneaky trick that lets us apply Chernoff bounds anyway. The trick is that we can build a coupled collection of independent random variables Y_{1}...Y_{n} where X_{j} ≤ Y_{j} always and each Y_{j}=1 with probability exactly p = (β_{i}/n)^{d}. Now observe that on the event [ν_{i} ≤ β_{i}], every ball of height i+1 or more has X_{j} = 1, and ν_{i+1} is at most the number of such balls; so ν_{i+1} ≤ ∑ X_{j} ≤ ∑ Y_{j}, and we have Pr[ν_{i} ≤ β_{i} ∧ ν_{i+1} > β_{i+1}] ≤ Pr[∑ X_{j} > β_{i+1}] ≤ Pr[∑ Y_{j} > β_{i+1}] = Pr[∑ Y_{j} > 2 E[∑ Y_{j}]] ≤ (e/4)^{np} ≤ e^{-np/3}. This bound will be less than n^{-2} as long as np ≥ 6 ln n, or p ≥ 6 ln n / n. When np gets smaller, we will have to switch to a different method.
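As a numerical sanity check on this Chernoff step (our own sketch, not part of the proof), we can compare the exact binomial tail Pr[∑ Y_{j} > 2np] against the e^{-np/3} bound for one concrete choice of n and p:

```python
import math

def binom_upper_tail(n, p, k):
    """Exact Pr[Binomial(n, p) >= k], by summing the pmf."""
    return sum(math.comb(n, j) * p**j * (1 - p)**(n - j)
               for j in range(k, n + 1))

n, p = 1000, 0.02                                  # so np = 20
tail = binom_upper_tail(n, p, int(2 * n * p) + 1)  # Pr[sum Y_j > 2np]
bound = math.exp(-n * p / 3)                       # the e^{-np/3} bound
print(tail, bound)
```

The exact tail should come out well below the bound; Chernoff bounds are loose by polynomial factors, but exponentially small is all we need here.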

# 4. Solving the recurrence

MitzenmacherUpfal give the solution to the recurrence β_{i+1}/n = 2(β_{i}/n)^{d} as β_{i+4} = n · 2^{∑_{j=0}^{i-1} d^{j} − 2d^{i}}. The proof is by the usual plug-and-chug method, so we will omit it here. Since the sum in the exponent is annoying, we can simplify the bound to β_{i+4} ≤ n / 2^{d^{i}} by observing that the sum is (d^{i}-1)/(d-1) ≤ d^{i}, so the whole exponent is between −2d^{i} and −d^{i}, i.e. −Θ(d^{i}). So the break-even point where p = (β_{i}/n)^{d} drops below 6 ln n / n is at some i^{*} = ln lg n / ln d + Θ(1) = ln ln n / ln d + Θ(1).
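We can sanity-check the closed form against direct iteration of β_{i+1}/n = 2(β_{i}/n)^{d} from β_{4}/n = 1/4 (a quick sketch, our own code; the closed form states β_{4+i}/n = 2^{(d^{i}-1)/(d-1) − 2d^{i}}):

```python
def beta_iter(d, i):
    """beta_{4+i}/n by iterating x -> 2 * x**d from x = 1/4."""
    x = 0.25
    for _ in range(i):
        x = 2 * x ** d
    return x

def beta_closed(d, i):
    """Closed form beta_{4+i}/n = 2**((d**i - 1)/(d - 1) - 2 * d**i).
    The geometric-series sum (d**i - 1)/(d - 1) is an exact integer."""
    return 2.0 ** ((d**i - 1) // (d - 1) - 2 * d**i)

for d in (2, 3):
    for i in range(6):
        assert abs(beta_iter(d, i) - beta_closed(d, i)) <= 1e-9 * beta_closed(d, i)
```

The agreement for several d and i is the same plug-and-chug verification the omitted proof performs.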

# 5. Final step

So here we are at i^{*}, and we have ν_{i*} ≤ β_{i*} with some high probability. We also have that p = (β_{i*}/n)^{d} ≤ 6 ln n / n (otherwise we could keep using the Chernoff bounds). By the coupling argument from above, we get Pr[ν_{i*} ≤ β_{i*} ∧ ν_{i*+1} > 18 ln n] ≤ Pr[B(n, 6 ln n / n) ≥ 18 ln n], where B(n,p) is the usual binomial random variable. From Chernoff bounds (with mean μ = 6 ln n and threshold 3μ) we have that the probability that this event occurs is at most (e^{2}/3^{3})^{6 ln n} ≤ 1/n^{2}.

Now let's look at i^{*}+2. Here we argue Pr[ν_{i*+1} ≤ 18 ln n ∧ ν_{i*+2} ≥ 2] ≤ Pr[B(n, (18 ln n / n)^{d}) ≥ 2] ≤ (n choose 2) (18 ln n / n)^{2d} = O(ln^{4} n / n^{2}), using d ≥ 2 in the last step. The (n choose 2) counts all different ways of choosing 2 possible height-(i^{*}+2) balls, and the (18 ln n / n)^{2d} bounds the probability that both balls are too tall.
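Plugging in a concrete n shows how small this failure probability is (a quick numerical sketch, our own choice of n and d = 2):

```python
import math

n = 10**6
d = 2
q = (18 * math.log(n) / n) ** d    # Pr[one particular ball is too tall]
fail = math.comb(n, 2) * q * q     # (n choose 2) * q^2, the union bound
print(fail)                        # O(ln^4 n / n^2): already tiny at n = 10^6
```

The constant hidden in the O(ln^{4} n / n^{2}) is large (roughly 18^{2d}/2), but the n^{2} in the denominator swamps it quickly.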

Finally, at i^{*}+3, we have Pr[ν_{i*+2} ≤ 1 ∧ ν_{i*+3} ≥ 1] = 0: a ball can only reach height i^{*}+3 by picking d≥2 distinct bins that each already hold at least i^{*}+2 balls, which is impossible when at most one such bin exists.

Summing up all the probabilities of failure gives (i^{*}+2)/n^{2} + O(ln^{4} n / n^{2}) = o(1/n).

# 6. Lower bound

The bound given here is tight up to an additive constant: with n balls in n bins, some bin receives at least ln ln n / ln d - O(1) balls with probability at least 1 - o(1/n). The proof of the lower bound is approximately as horrible as the proof of the upper bound. See MitzenmacherUpfal §14.2 for details.