For more uptodate notes see http://www.cs.yale.edu/homes/aspnes/classes/465/notes.pdf.
In a shared memory model, it may be possible to solve some problems using waitfree protocols, in which any process can finish the protocol in a bounded number of steps, no matter what the other processes are doing (see ObstructionFreedom for more on this and some variants).
The waitfree hierarchy h^{r}_{m} classifies AsynchronousSharedMemory object types T by consensus number, where a type T has consensus number n if with objects of type T and atomic registers (all initialized to appropriate values) it is possible to solve waitfree consensus (i.e., agreement, validity, waitfree termination) for n processes but not for n+1 processes. The consensus number of any type is at least 1, since 1process consensus requires no interaction, and may range up to ∞ for particularly powerful objects.
The waitfree hierarchy was suggested by work by Maurice Herlihy Waitfree synchronization, TOPLAS 13(1):124149 that classified many common (and some uncommon) sharedmemory objects by consensus number, and showed that an unbounded collection of objects with consensus number n together with atomic registers gives a waitfree implementation of any object in an nprocess system. Various subsequent authors noticed that this did not give a "robust" hierarchy in the sense that combining two types of objects with consensus number n could solve waitfree consensus for larger n, and the hierarchy h^{r}_{m} was proposed by Prasad Jayanti Jayanti, Robust waitfree hierarchies, JACM 44(4):592614, 1997 as a way of classifying objects that might be robust: an object is at level n of the h^{r}_{m} hierarchy if having unboundedly many objects plus unboundedly many registers solves nprocess waitfree consensus but not (n+1)process waitfree consensus. Whether or not the resulting hierarchy is in fact robust for arbitrary deterministic objects is still open, but Eric Ruppert Ruppert, Determining consensus numbers, SICOMP 30(4):11561168, 2000 subsequently showed that it is robust for RMW registers and objects with a read operation that returns the current state, and there is a paper by Borowsky, Gafni, and Afek (Consensus power makes (some) sense! (extended abstract), PODC 1994) that sketches a proof based on a topological characterization of computability that h^{r}_{m} is robust for deterministic objects that don't discriminate between processes (unlike, say, singlewriter registers). So for wellbehaved sharedmemory objects (i.e., deterministic, symmetrically accessible, with read operations, etc.), consensus number appears to give a real classification that allows us to say for example that any collection of readwrite registers (consensus number 1), fetchandincrements (2), testandset bits (2), and queues (2) is not enough to build a compareandswap (∞).
We won't do the full robustness proofs of Borowsky et al or Ruppert that let us get away with this. Instead, we'll concentrate on Herlihy's original results and show that specific objects have specific consensus numbers when used in isolation. The procedure in each case will be to show an upper bound on the consensus number using a variant of FischerLynchPaterson (made easier because we are waitfree and don't have to worry about fairness) and then show a matching lower bound (for nontrivial upper bounds) by exhibiting an nprocess consensus protocol for some n. Essentially everything below is taken from Herlihy's paper, so reading that may make more sense than reading these notes.
1. Classification by consensus number
Here's the quick table:
Consensus number 
Defining characteristic 
Examples 
1 
Read + interfering noreturn RMW 
Registers, counters, generalized counters 
2 
Interfering RMW; queuelike structures 
Fetchandwrite, fetchandadd, queues, processtomemory swap 
m 

mprocess consensus objects 
2m2 

atomic mregister write 
∞ 
First writelike operation wins 
Queue with peek, fetchandcons, sticky bits, compareandswap, memorytomemory swap, memorytomemory copy 
Details below.
1.1. Level 1: atomic registers, counters, other interfering RMW registers that don't return the old value
First observe that any type has consensus number at least 1, since 1process consensus is trivial.
We'll argue that a large class of particularly weak objects has consensus number exactly 1, by running FischerLynchPaterson with 2 processes. Recall that in FischerLynchPaterson we classify states as bivalent or univalent depending on whether both decision values are still possible, and that with at least one failure we can always start in a bivalent state (this doesn't depend on what objects we are using, since it depends only on having invisible inputs). Since the system is waitfree there is no constraint on adversary scheduling, and so if any bivalent state has a bivalent successor we can just do it. So to solve consensus we have to reach a bivalent configuration C that has only univalent successors, and in particular has a 0valent and a 1valent successor produced by applying operations x and y of processes p_{x} and p_{y}.
Assuming objects don't interact with each other behind the scenes, x and y must be operations of the same object. Otherwise Cxy = Cyx and we get a contradiction.
Now let's suppose we are looking at atomic registers, and consider cases:
 x and y are both reads, Then x and y commute: Cxy = Cyx ⇒ contradiction.
x is a read and y is a write. Then p_{y} can't tell the difference between Cyx and Cxy, so running p_{y} to completion gives the same decision value from both Cyx and Cxy, another contradiction.
x and y are both writes. Now p_{y} can't tell the difference between Cxy and Cy, so we get the same decision value for both, again constradicting that Cx is 0valent and Cy is 1valent.
There's a pattern to these cases that generalizes to other objects. Suppose that an object has a read operation that returns its state and one or more readmodifywrite operations that don't return anything (perhaps we could call them "modifywrite" operations). We'll say that the MW operations are interfering if for any two operations x and y either:
x and y commute: Cxy = Cyx.
One of x and y overwrites the other: Cxy = Cy or Cyx = Cx.
Then no pair of read or modifywrite operations can get us out of a bivalent state, because reads commute, the nonreader can't tell which of a read and a MW operation happened first, and for any two MW operations either they commute or the overwriter can't detect that the first operation happened. So any MW object with uninformative, interfering MW operations has consensus number 1. For example, consider a counter that supports operations read, increment, decrement, and write: a write overwrites any other operation, and increments and decrements commute with each other, so the counter has consensus number 1. The same applies to a generalized counter that supports an atomic x←x+a operation; as long as this operation doesn't return the old value, it still commutes with other atomic increments.
1.2. Level 2: interfering RMW objects that return the old value, queues (without peek)
Suppose now that we have a RMW object that returns the old value, and suppose that it is nontrivial in the sense that it has at least one RMW operation where the embedded function f that determines the new value is not the identity (otherwise RMW is just read). Then there is some value v such that f(v) ≠ v. To solve twoprocess consensus, have each process p_{i} first write its preferred value to a register r_{i}, then execute the nontrivial RMW operation on the RMW object initialized to v. The first process in sees v and decides its own value. The second process sees f(v) and decides the first process's value (which it reads from the register). It follows that nontrivial RMW object has consensus number at least 2.
In many cases, this is all we get. Suppose that the operations of some RMW type T are interfering in a way analogous to the previous definition, where now we say that x and y commute if they leave the object in the same state (regardless of what values are returned) and that y overwrites x if the object is always in the same state after both x and xy (again regardless of what is returned). The two processes that carry out x and y know what happenened, but a third process z doesn't. So if we run z to completion we get the same decision value after both Cx and Cy, which means that Cx and Cy can't be 0valent and 1valent. It follows that no collection of RMW registers with interfering operations can solve 3process consensus, and thus all such objects have consensus number 2.
There are some other objects with consensus number 2 that don't fit this pattern. Define a waitfree queue as an object with enqueue and dequeue operations (like normal queues), where dequeue returns empty if the queue is empty (instead of blocking). To solve 2process consensus with a waitfree queue, initialize the queue with a single value (it doesn't matter what the value is). We can then treat the queue as a nontrivial RMW register where a process wins if it successfully dequeues the initial value and loses if it gets empty.
However, enqueue operations are noninterfering: if p_{x} enqueues v_{x} and p_{y} enqueues v_{y}, then any third process can detect which happened first; similarly we can distinguish enq(x) deq() from deq() enq(x). So to show we can't do three process consensus we do something sneakier: given a bivalent state C with allegedly 0 and 1valent successors C enq(x) and C enq(y), consider both C enq(x) enq(y) and C enq(y) enq(x) and run x until it does a deq() (which it must, because otherwise it can't tell what to decide) and then stop it. Now run y until it also does a deq() and then stop it. We've now destroyed the evidence of the split and poor hapless z is stuck. In the case of C deq() enq(x) and C enq(x) deq() on a nonempty queue we can kill the initial dequeuer immediately and then kill whoever dequeues x or the value it replaced, and if the queue is empty only the dequeuer knows. In either case we reach indistinguishable states after killing only 2 witnesses, and the queue has number ≤ 2.
Similar arguments work on stacks, deques, and so forth—these all have consensus number exactly 2.
1.3. Level ∞: queue with peek, compareandswap, various memorytomemory operations, fetchandcons, sticky bits
Here are a bunch of level∞ objects:
 Queue with peek
 Has operations enq(x) and peek(), which returns the first value enqueued. (Maybe also deq(), but we don't use it). Protocol is to enq my input and then peek and return the first value into the queue.
 Fetchandcons
 Returns old cdr and adds new car on to the head of a list. Use preceding protocol where peek() = tail(car::cdr).
 Sticky bits
 Has write operation that fails unless register is in the initial ⊥ state. Protocol is to write my input and then return result of a read.
 Compareandswap
 has CAS(old, new) operation that writes new only if previous value = old. Use it to build a sticky bit.
 Memorytomemory swap
Has swap(r_{i}, r_{j}) operation that atomically swaps contents of r_{i} with r_{j}, as well as the usual read and write operations for all registers. Use to implement fetchandcons. Alternatively, use two registers input_{i} and victory_{i} for each process, where victory_{i} is initialized to 0, and a single central register token, initialized to 1. To execute consensus, write your input to input_{i}, then swap victory_{i} with token. The winning value is obtained by scanning all the victory registers for the one that contains a 1, then returning the corresponding input value.)
 Memorytomemory copy
Has copy(r_{i}, r_{j}) operation that copies r_{i} to r_{j} atomically. Idea is that each process has registers r_{p1} and r_{p2} and stakes its claim by copying r_{p1} to r_{p2}. It then writes 0 to all r_{p'1} for p' > p, and works backward looking for last nonzero r_{p'2}, which is the winner (because it staked its claim before anybody else could shut it down). No better claim will come up at this point because all the zeroes in r_{p'1} for p' > p mean that there can be no further changes in any r_{p'2} for p' > p, and we don't care about r_{p'2} for p' < p because nobody will get that far during their backward scan without hitting r_{p2} = 1 first.
1.4. Level 2m2: simultaneous mregister write
Here we have a (large) collection of atomic registers augmented by an mregister write operation that performs all the writes simultaneously. The intuition for why this is helpful is that if p_{1} writes r_{1} and r_{shared} while p_{2} writes r_{2} and r_{shared} then any process can look at the state of r_{1}, r_{2} and r_{shared} and tell which write happened first:
If the process reads r_{1} = r_{2} = ⊥, then we don't care which went first, because the reader (or somebody else) already won.
If the process reads r_{1} = 1 and then r_{2} = ⊥, then p_{1} went first.
If the process reads r_{2} = 2 and then r_{1} = ⊥, then p_{2} went first. (This requires at least one more read after checking the first case.)
Otherwise the process saw r_{1} = 1 and r_{2} = 2. Now read p_{shared}: if it's 1, p_{2} went first, and vice versa.
This requires 2register writes, and will give us a protocol for 2 processes (since the reader above has to participate somewhere to make the first case work). For m processes, we can do the same thing with mregister writes. We have a register r_{pq} = r_{qp} for each pair of distinct processes p and q, plus a register r_{pp} for each p; this gives a total of m(m+1)/2 = O(m^{2}) registers. All registers are initialized to ⊥. Process p then writes its initial preference to some singlewriter register pref_{p} and then simultaneously writes p to r_{pq} for all q (including r_{pp}). It then attempts to figure out the first writer by applying the above test for each q to r_{pq} (standing in for r_{shared}), r_{pp} (= r_{1}) and r_{qq} (= r_{2}). If it won against all the other processes, it decides its own value. If not, it repeats the test recursively for some p' that beat it until it finds a process that beat everybody, and returns its value. So mregister writes solve mprocess waitfree consensus.
A further tweak gets 2m2: run two copies of an m1 process protocol using separate arrays of registers to decide a winner for each group. Then add a second phase where each process has one register s_{p}, in which each process p from group 1 writes the winning id for its group simultaneously into s_{p} and s_{q} for each q in the other group. To figure out who won in the end, build a graph of all victories, where there is an edge from p to q iff p beat q in phase 1 or p's id was written before q's id in phase 2. The winner is the (unique) process with at least one outgoing edge and no incoming edges, which will be the process that won its own group (by writing first) and whose value was written first in phase 2.
1.4.1. Matching impossibility result
It would seem that the technique used to boost from mprocess consensus to (2m2)process consensus could be repeated to get up to at least Θ(m^{2}), but this turns out not to be the case. The essential idea is to show that in order to escape bivalence, we have to get to a configuration C where every process is about to do an mregister write leading to a univalent configuration (since reads don't help for the usual reasons, and normal writes can be simulated by mregister writes with an extra m1 dummy registers), and then argue that these writes can't overlap too much. So suppose we are in such a configuration, and suppose that Cx is 0valent and Cy is 1valent, and we also have many other operations z_{1}...z_{k} that lead to univalent states. Following Herlihy, we argue in two steps:
There is some register that is written to by x alone out of all the pending operations. Proof: Suppose not. Then the 0valent configuration Cxyz_{1}...z_{k} is indistinguishable from the 1valent configuration Cyz_{1}...z_{k} by any process except p_{x}, and we're in trouble.
There is some register that is written to by x and y but not by any of the z_{i}. Proof:: Suppose not; then Cxyz_{1}...z_{k} is indistinguishable from Cyxz_{1}...z_{k} for any process other than p_{x} and p_{y}, and we're still in trouble.
Now suppose we have 2m1 processes. The first part says that each of the pending operations (x, y, all of the z_{i}) writes to 1 singlewriter register and at least k twowriter registers where k is the number of processes leading to a different univalent value. This gives k+1 total registers simultaneously written by this operation. Now observe that with 2m1 process, there is some set of m processes whose operations all lead to a bvalent state; so for any process to get to a (¬b)valent state, it must write m+1 registers simultaneously. It follows that with only m simultaneous writes we can only do (2m2)consensus.
1.5. Level m: mprocess consensus objects
An mprocess consensus object has a single consensus operation that, the first m times it is called, returns the input value in the first operation, and thereafter returns only ⊥. Clearly this solves mprocess consensus. To show that it doesn't solve (m+1)process consensus even when augmented with registers, run a bivalent initial configuration to a configuration C where any further operation yields a univalent state. By an argument similar to the mregister write case we can show that the pending operations in C must all be consensus operations on the same consensus object (anything else commutes or overwrites). Now run Cxyz_{1}...z_{k} and Cyxz_{1}...z_{k}, where x and y lead to 0 and 1valent states, and observe that p_{k} can't distinguish the resulting configurations because all it got was ⊥. (Note: this works even if the consensus object isn't in its initial state, since we know that before x or y the configuration is still bivalent.)
So the mprocess consensus object has consensus number m. This shows that h^{r}_{m} is nonempty at each level.
A natural question at this point is whether the inability of mprocess consensus objects to solve (m+1)process consensus implies robustness of the hierarchy. One might consider the following argument: given any object at level m, we can simulate it with an mprocess consensus object, and since we can't combine mprocess consensus objects to boost the consensus number, we can't combine any objects they can simulate either. The problem here is that while mprocess consensus objects can simulate any object in a system with m processes (see below), it may be that some objects can do more in a system with m+1 objects while still not solving (m+1)process consensus. A simple way to see this would be to imagine a variant of the mprocess consensus object that doesn't fail completely after m operations; for example, it might return one of the first two inputs given to it instead of ⊥. This doesn't help with solving consensus, but it might (or might not) make it too powerful to implement using standard mprocess consensus objects.
2. Universality of consensus
This says that any type that can implement nprocess consensus can, together with atomic registers, give a waitfree implementation of any object in a system with n processes.
See Herlihy paper for the full result, which does a lot of extra work to handle garbage collection and reduce overhead. Simplified profligate version goes like this: the processes repeatedly use consensus to decide between candidate histories of the simulated object, and a process successfully completes an operation when its operation (tagged to distinguish it from other similar operations) appears in a winning history. In slightly more detail:
 Have a separate nprocess consensus protocol for each of a series of phases 0, 1, 2, ... . These protocols should allow processes as input values instead of just 0 or 1 (it's possible to build such objects out of binary consensus objects by doing tournaments).
 Processes post a list of (a) the operation they want to do and (b) the last phase they've participated in.
 To do an operation, process i:
 Posts the operation to its register.
 Reads all the lastphase values and takes their max.
 Runs the consensus protocol for the max phase to get the history decided on up to that phase.
 If the max phase history includes the process's pending operation, returns the result that operation would have had in the winning history.
 Otherwise, constructs a new history by appending all announced operations to the previous history, and tries to win with that history in phase max+1.
 Returns to step (b) if its operation doesn't make it into the winning history.
This terminates because even if process i doesn't get its value into the winning history, eventually some other process will pick up the announced value and include it.