Note: You are looking at a static copy of the former PineWiki site, used for class notes by James Aspnes from 2003 to 2012. Many mathematical formulas are broken, and there are likely to be other bugs as well. These will most likely not be fixed. You may be able to find more up-to-date versions of some of these notes at http://www.cs.yale.edu/homes/aspnes/#classes.

Recall that for any prime p≥m, the family of hash functions hab:M→N given by hab(x) = ((ax + b) mod p) mod n for a∈ℤp*, b∈ℤp is 2-universal: ∀x≠y Pr[hab(x) = hab(y)] ≤ 1/n. Show that it is not k-universal for any constant k>2, by showing that for any fixed k and sufficiently large n there exists a set Sn such that Pr[hab(Sn) is constant] = Ω(1/n) (with a constant that may depend on k).

1.1. Solution

Fix k, and let Sn=S be the set 0..k-1. Let A be the set { cn | 0 < cn < p/2k } and let B be the set { b | 0 ≤ b < p/2 }. Then for any a∈A and x∈S, we have ax < p/2 and ax mod n = 0. Adding in b∈B gives ax+b < p and ax mod n = b mod n for all x∈S. Because ax+b < p, we have (ax+b) mod p = ax+b and so hab(x) = ((ax+b) mod p) mod n = (ax+b) mod n = b mod n for all x∈S.

What is the probability that this event occurs? We have ⌊p/(2kn)-1⌋ elements in A, giving Pr[a∈A] = ⌊p/(2kn)-1⌋/(p-1) = Θ(1/kn). Similarly, there are ⌊p/2⌋ choices for b, giving Pr[b∈B] = ⌊p/2⌋/p = Θ(1). This gives a probability of Θ(1/kn) that hab(S) is constant.

2. Low expected distortion

Given a graph G, let V(G) be the set of vertices in G and let dG(u,v) be the length of the shortest path from u to v in G, where u,v∈V(G).

Given a graph G and a probability distribution P on pairs (f,H) where H is a graph and f:V(G)→V(H) such that dH(f(u),f(v)) ≥ dG(u,v) for all u,v∈V(G), let the expected distortion δ(P) = maxu,v∈V(G) E[dH(f(u),f(v))/dG(u,v)].

Let G = Cn be a cycle with n vertices. Show that there exists a probability distribution P on (f,H) such that H is always Pn-1, the path with n vertices, and δ(P) = O(1).

2.1. Solution

Obtain H by cutting G along a uniform random edge, and let f be the identity map. Then dH(u,v) ≥ dG(u,v) trivially (removing edges can only increase the length of the shortest path. To prove the distortion bound, suppose that dG(u,v) = k ≤ n/2. Then with probability k/n, we cut an edge on the shortest uv path, giving dH(u,v) = n-k; the rest of the time, dH(u,v) = dG(u,v) = k. This gives E[dH(u,v)] = (k/n)(n-k) + ((n-k)/n)(k) = 2k(n-k)/n < 2k and thus δ(P) ≤ max(2k/k) = 2.

3. Defective skip lists

Suppose we are given a skip list with a head node and n nodes with parameter p (the probability that a node at level k also appears at level k+1), and that we choose one of the non-head nodes uniformly at random and remove it without updating any pointers, so that any search that attempts to access the removed node fails.

What is the expected number of nodes that are still reachable by a standard skip list search starting from the head node?

3.1. Solution

It is possible to solve this problem by brute force, but it is easier to do it by making very heavy use of linearity of expectation. Suppose that the search path for j includes m non-head nodes (this includes j). Then the probability that j is not reachable, conditioned on this fact, is exactly m/n. The expected total number of non-reachable nodes is then E[∑ mj/n], where mj is the number of nodes on the search path to j, or (1/n) ∑j E[mj], the expected length of a search path to a uniformly-chosen node. Computing this quantity exactly is exceptionally painful, but we can easily get that it is O(log n / (p log (1/p))) using the usual skip list analysis.

2014-06-17 11:58