*For more up-to-date notes see http://www.cs.yale.edu/homes/aspnes/classes/465/notes.pdf.*

Want to do **adaptive** collect, where in a system with n processes of which only k participate, we read the values written by the participants in time that depends on k but not n. Such an algorithm is called **adaptive to total contention**, where the total contention k is the number of processes that ever execute the algorithm. Later we will see examples of algorithms that are even more refined, being adaptive to **interval contention** (maximum number of active processes that overlap any one operation) or even **point contention** (maximum number of active processes at any single time).

In principle, we can solve this problem using Renaming followed by standard-issue collect, but we'd like to do better than this, since the overhead on renaming can be pretty high. We will also want to minimize space, measured as a function of the maximum number of processes n (it doesn't entirely make sense to adapt to k here, unless we can arrange to dynamically go out and buy more space, although one could imagine algorithms that request space `malloc`-style from a common pool).

# Collecting with splitters

We'll start by describing some results from Attiya, Fouren, and Gafni, An adaptive collect algorithm with applications, Distributed Computing 15(2):87–96, 2002 (AttiyaFG2002.pdf).

Recall that a **splitter** (see Renaming) is a data structure with three outcomes stop, right, and down, with the guarantee that at most one process stops, at least one process does not go right, and at least one process does not go down. Moir and Anderson showed that if k processes enter a two-dimensional grid of splitters organized in rows and columns starting at (0,0), no process reaches a position (r,c) where (r,c) ≥ k.

Now let each process mark any splitter it touches and store its value in the splitter it won (by getting stop). A collect can be carried out by reading each diagonal with (r,c) = k for increasing k until we reach a diagonal that is empty. This gives a collect in O(k^{2}) time and O(n^{2}) space.

The Attiya-Fouren-Gafni collect trades off time for space by reducing the time complexity to O(k) at the cost of using space 2^{n-1}. What we do is build a depth-n binary tree of splitters; because at least one process entering each node doesn't reach one of its children, we have that every process stops somewhere in the tree. Since we also have the property that every node that gets more than one process sends at least one of them to a different place from the rest (stopping or to the other child), if we construct a tree consisting just of the splitters that are marked plus some dummy leaves for processes that stop, we get a tree in which every internal node has either 2 or 3 children; it follows that the number of internal nodes is bounded by the number of leaves, which is equal to the number of participating processes k.

If we do depth-first search on this, we get a collect in O(k) time. But the space cost is very high.

# Randomized adaptive collects get polynomial space

Hagit Attiya, Fabian Kuhn, Mirjam Wattenhofer, and Roger Wattenhofer, Efficient adaptive collect using randomization, DISC 2004 (won the best student paper award), later version Hagit Attiya , Fabian Kuhn , C. Greg Plaxton , Mirjam Wattenhofer , Roger Wattenhofer, Efficient adaptive collect using randomization, Distributed Computing, v.18 n.3, p.179-188, February 2006. http://www.dcg.ethz.ch/publications/disc04b.pdf

Two main contributions:

A tunable version of the AFG collect based on stacking up n / (γ - 1) lg n) trees of splitters of depth γ lg n. This gives space complexity O(n

^{γ+1}/ ((γ-1) log n)). For time complexity, it is shown that if k_{i}processes enter tree T_{i}, then at least min(k_{i}(γ-1) lg n) processes stop in T_{i}, implying every process stops somewhere in the first k / ((γ-1) lg n) trees; the messy detail here is that it takes up to lg k_{i}≤ lg n levels of the tree before the processes split up enough that we guarantee to lose at least one per level, which accounts for the -1. It costs O(k_{i}) ≤ O(k) to do a DFS search within T_{i}, giving a total cost of O(k^{2}/ ((γ-1) lg n)). At one extreme, γ = n/lg n + 1 gets exactly one tree, essentially reducing to AFG. At the other extreme, setting γ = 1 + 1/log n gives an algorithm that matches the O(k^{2}) time and O(n^{2}) memory bounds from Moir and Anderson.A randomized adaptive collect with O(k)

*expected*cost and polynomial space. We will give a simplified version of this that uses O(n^{3}) space; the paper does better. The idea is to do an AFG-style tree with γ lg n layers of "randomized splitters," which are essentially just stock Moir-Anderson splitters modified so that you flip a coin to decide whether to go right or down instead of using the output of the splitter. This is backstopped by some other collect algorithm that is only used with low probability; for quick bounds (and if we are not assuming N > n), a standard-issue (n-1)-read collect of n registers works. The space complexity is O(n + 2^{γ lg n}) = O(n^{γ}). The time complexity is O(k) + O(n)⋅Pr[escape], where Pr[escape] is the probability that any process fails to stop in the tree, forcing it into the backstop structure. AKWW give a rather messy argument based on balls-in-bins that this probability is low; this is needed for their low-space construction. An easier argument that works if we are willing to tolerate crummier space bounds is that I can only escape if the last splitter I go through is used by at least one other process, which means that there is at least one other process that gets the same value for all γ lg n coin-flips. This event occurs with probability 2^{-γ lg n}= n^{-γ}for each pair of processes, giving an expected number of bad pairs of less than n^{2-γ}= n^{-1}for γ = 3. So at the cost of O(n^{3}) space we get O(k) expected cost. The bound in AKWW is better: O(k) expected cost but O(n^{3/2+epsilon}space. (See the paper for details.) Even better: Mirjam Wattenhofer's dissertation shows how to get O(k + log n) cost for collect with high probability and O(n) space by cascading randomized splitters as in the tunable collect.)