
Note: You are looking at a static copy of the former PineWiki site, used for class notes by James Aspnes from 2003 to 2012. Many mathematical formulas are broken, and there are likely to be other bugs as well. These will most likely not be fixed. You may be able to find more up-to-date versions of some of these notes at http://www.cs.yale.edu/homes/aspnes/#classes.

For more up-to-date notes see http://www.cs.yale.edu/homes/aspnes/classes/465/notes.pdf.

Most of the distributed protocols we have considered allow for large (sometimes very large) memories in the processes. In finite-state protocols, the amount of space in each process is bounded no matter how big the system gets. The motivation for studying such protocols is systems composed of very small or very stupid bits of machinery, as found in some sensor networks or in molecular computation. The main question is what such finite-state protocols can do that is actually useful. The answer to this question, as with many questions in distributed computing, depends on the details of the model: whether the agents communicate synchronously or asynchronously, and what the underlying communication graph looks like.

As with less restricted protocols, finite-state protocols can generally do more with synchrony than with asynchrony. But unlike less restricted protocols, finite-state protocols tend to work best in highly restricted communication graphs.

1. Cellular automata: synchronous protocols on highly structured graphs

Here the processes (or cells) are typically arranged in a grid, and at each time unit each process computes a new state based on its own state and the state of its neighbors. Invented by John_von_Neumann, these were popularized by Conway's_Game_of_Life, a two-state cellular automaton on a two-dimensional grid that produces interesting patterns, and have been studied extensively by wealthy peer-review non-enthusiast Stephen_Wolfram. Despite the limitations of individual processes, cellular automata with enough states on a large enough grid can perform arbitrary Turing-machine computations: each cell represents one square of the tape, and the state of the head is marked on whatever cell the head is sitting on. This means that (ignoring efficiency issues) we know what cellular automata can compute.
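As a minimal illustration of the synchronous update rule (a sketch added here, not part of the original notes), here is a one-dimensional, two-state cellular automaton in Python; rule 110 is used as the update rule since it is a standard example known to be Turing-complete.

{{{
# Sketch: synchronous update of a one-dimensional, two-state cellular automaton.
# Each cell's new state depends only on its own state and its two neighbors'.
# Rule 110 is used as an example; the grid wraps around (circular boundary).

def step(cells, rule=110):
    n = len(cells)
    new = []
    for i in range(n):
        left, center, right = cells[(i - 1) % n], cells[i], cells[(i + 1) % n]
        neighborhood = (left << 2) | (center << 1) | right   # 3-bit index 0..7
        new.append((rule >> neighborhood) & 1)               # look up the rule-table bit
    return new

# Example: a single live cell in the middle, run a few synchronous rounds.
cells = [0] * 31
cells[15] = 1
for _ in range(5):
    print(''.join('#' if c else '.' for c in cells))
    cells = step(cells, rule=110)
}}}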

2. Petri nets: asynchronous protocols with no graph structure

To a first approximation, a Petri net consists of a set of places over which tokens wander according to transitions, where a typical transition looks something like "consume one token each from places A, B, and C and two tokens from D, and generate a new token on each of places E and F." Petri nets have been very extensively studied as a tool for understanding queuing systems and similar processes. If a Petri net is conservative, meaning that every transition consumes and generates the same number of tokens, then we can think of it as a finite-state distributed system, where the placement of a token on place A corresponds to the existence of a process in state A and a transition corresponds to several processes meeting together and simultaneously updating their states based on their old states. Petri nets are in general nondeterministic, since more than one transition might be available in each configuration, and there are many extensions, e.g. stochastic Petri nets, that supply probability distributions on when particular transitions fire.
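A small sketch of how such a net might be represented and fired (the code is illustrative, not from the notes), using the example transition above:

{{{
# Sketch: a Petri net as a marking (tokens per place) plus transitions that
# consume and produce tokens.  A transition is enabled when every input place
# holds enough tokens; firing it is atomic.

marking = {'A': 1, 'B': 1, 'C': 1, 'D': 2, 'E': 0, 'F': 0}

# "Consume one token each from A, B, C and two from D; produce one each on E and F."
transitions = [
    {'consume': {'A': 1, 'B': 1, 'C': 1, 'D': 2}, 'produce': {'E': 1, 'F': 1}},
]

def enabled(m, t):
    return all(m[p] >= k for p, k in t['consume'].items())

def fire(m, t):
    assert enabled(m, t)
    for p, k in t['consume'].items():
        m[p] -= k
    for p, k in t['produce'].items():
        m[p] += k

if enabled(marking, transitions[0]):
    fire(marking, transitions[0])
print(marking)   # {'A': 0, 'B': 0, 'C': 0, 'D': 0, 'E': 1, 'F': 1}
}}}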

Petri nets were defined (by Petri) in 1962; the subsequent four decades have produced a huge literature on Petri nets and their variants that it would be folly to try to summarize. The site http://www.informatik.uni-hamburg.de/TGI/PetriNets/ gives many pointers and references to the Petri net literature.

3. Population protocols: asynchronous protocols with some graph structure

Population protocols were defined by Angluin et al. (PODC 2004) as a way of modeling certain classes of sensor systems, in which each process is an anonymous finite-state agent and interactions between processes are under the control of an adversary. They have been elaborated on in subsequent work by (mostly) the same authors.

3.1. Basic model

We have a population of n agents whose states are drawn from some state space Q; a global configuration is thus a vector in Qⁿ. A transition relation δ:Q×Q→Q×Q takes a pair of states and returns a pair of states: this corresponds to an interaction between two automata in which both update their states based on the previous states of the two. The transition relation is not necessarily symmetric: when two automata interact, we imagine that one of them is an initiator and one a responder, making symmetry-breaking a built-in feature of the model.

3.1.1. Inputs and outputs

An input to a population protocol is just an initial configuration, possibly after applying some mapping from an input alphabet to states. An output is again a configuration, usually mapped through some output function that strips out irrelevant components of the agents' states. For example, a typical problem might be that each agent starts with an input bit, and we want to reach a configuration where every agent has as one component of its state the majority value.

3.1.2. Who interacts with whom?

Which agents interact at each step is under the control of an adversary, subject to the limitation that there is an interaction graph and two agents (i.e., nodes in the graph) can only interact if there is an edge between them. In general, the interaction graph is assumed to be directed, and if there is only an edge between two agents in one direction then only the source can be the initiator; this assumption is mostly a nuisance in building protocols and so typical protocols are designed to work despite the orientations of the edges.

Obviously, a particularly malicious adversary could choose to only allow interactions between some small subset of the agents. So we need some sort of fairness condition to ensure progress. The fairness condition used in this work so far has been a very strong global fairness condition: essentially, any transition between global configurations that can occur infinitely often occurs infinitely often. This makes for simple termination analysis: if I can always reach some configuration, eventually I will reach it. It's also somewhat justifiable if we assume that interactions are controlled by some stochastic process that is independent of the agents' states. But an open question is whether a weaker fairness condition (e.g. (a) any two agents joined by an edge interact infinitely often or (b) any two agents interact infinitely often while in particular states if they are in that pair of states infinitely often) would allow interesting protocols.

3.1.3. How does the protocol terminate?

Even with the strong fairness condition, it is impossible to detect termination from within a protocol for most problems. The reason is that within any finite time, the adversary may choose to hold back some of the agents from interacting. Since this effectively hides those agents' part of the input, if the protocol terminates before they become visible, the output must not depend on their inputs. Similar problems arise if the population is partitioned: we may have several subsets of the nodes that each believe they have computed the correct solution, an opinion that they will have to revise if an interaction occurs between the subsets.

The solution to this is to replace termination with convergence: a population protocol is output-stable if in any fair execution the protocol eventually reaches a configuration that yields a particular output, and this output does not change thereafter.

3.2. Some simple population protocols

We'll start with protocols that don't depend on the underlying graph.

3.2.1. Leader election

This is a tool for building other protocols: we want to reach a configuration in which there is exactly one distinguished leader agent, and this condition remains true in all subsequent configurations. The solution is the usual one where all agents start out thinking they are leaders, and one of the two leaders in a leader-leader interaction drops out.

This protocol solves the problem in a complete interaction graph, but it won't work in a graph with diameter greater than 1; it is possible to get two leaders who aren't neighbors and who never kill each other. So in general we have to allow wandering leaders; an interaction between a leader and a follower moves the leader bit. Because of the global fairness condition, if there is any sequence of interactions that causes two surviving leaders to meet, eventually they will.

Here's the transition relation, where 1 represents a leader and 0 a follower:

{{{00→00 01→10 10→01 11→01}}}
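As a sketch of how the general model and this particular protocol fit together (not from the original notes), here is a simulation in which the adversary is replaced by a uniformly random scheduler; random scheduling satisfies the global fairness condition with probability 1.

{{{
# Sketch: simulate the leader-election protocol under a random scheduler.
# States are 0 (follower) and 1 (leader); delta maps an ordered
# (initiator, responder) pair of states to the pair of new states.

import random

delta = {(0, 0): (0, 0),   # nothing happens
         (0, 1): (1, 0),   # leader bit moves to the initiator
         (1, 0): (0, 1),   # leader bit moves to the responder
         (1, 1): (0, 1)}   # two leaders meet: one of them drops out

def simulate(n, edges, steps=100000):
    config = [1] * n                      # everyone starts out a leader
    for _ in range(steps):
        i, r = random.choice(edges)       # adversary replaced by a random scheduler
        config[i], config[r] = delta[(config[i], config[r])]
    return config

# Complete interaction graph on 10 agents (both orientations of every edge).
edges = [(i, j) for i in range(10) for j in range(10) if i != j]
print(sum(simulate(10, edges)))           # with high probability: 1 leader left
}}}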

3.2.2. Majority

Now we start with each agent in state - or +, and we want to find out which input is the majority input (or even if they are equal). We can perform this computation by piggybacking it onto leader election: each non-leader tries to unload its input onto some leader, after which it enters a neutral 0 state. Whatever value the leader ends up with will then be the majority value. To make this work, we have to maintain the invariant that the sum of all values is constant: this means that the leader may have to refuse to accept values that would cause overflow.

The protocol looks like this. The first component of each state (+, -, or 0) is the token value; a * in the second component indicates leadership:

{{{
In      Out
x,y     x,y
0*,y    0,y*
+*,-    0,0*
-*,+    0,0*
x*,0    0,x*
x*,y*   x*,y
}}}

plus the symmetric transitions, obtained by swapping the roles of initiator and responder in any rule where their states differ.

One complication is that this protocol is not output-stable: it may reduce an initial population of a +'s and b -'s down to a-b +'s (including one designated leader) and many 0's, but the +'s will wander freely through the population. To make it output-stable we use a standard technique: each agent maintains an output field that is equal to the output value of the last leader it encountered. It follows that once the leader's output stabilizes, it will eventually convert all the non-leaders and the output of the protocol as a whole will be stable thereafter.
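The transition table above, together with the output-copying fix, might be coded roughly as follows (a sketch under my own state encoding, not the notes'); each agent's state is a (value, leader-bit, output) triple:

{{{
# Sketch: the majority protocol.  Each agent's state is (value, leader, output),
# with value in {'+', '-', '0'}.  Only the rules where the initiator is the
# unique leader are written out; when only the responder is a leader, the same
# rules are applied with the roles swapped.

def leader_rule(ldr_val, other_val):
    """Return (new leader-side value, new other-side value, who keeps the
    leader bit).  'other' means the leader bit moves to the non-leader."""
    if ldr_val == '0':                         # 0*,y -> 0,y*
        return '0', other_val, 'other'
    if {ldr_val, other_val} == {'+', '-'}:     # +*,- and -*,+ -> 0,0*
        return '0', '0', 'other'
    if other_val == '0':                       # x*,0 -> 0,x*
        return '0', ldr_val, 'other'
    return ldr_val, other_val, 'leader'        # e.g. +*,+ : refuse (would overflow)

def interact(a, b):
    """a is the initiator's (value, leader, output) triple, b the responder's."""
    (va, la, _), (vb, lb, _) = a, b
    if la and lb:                              # x*,y* -> x*,y : responder drops out
        la, lb = True, False
    elif la:
        va, vb, keep = leader_rule(va, vb)
        la, lb = (keep == 'leader'), (keep == 'other')
    elif lb:                                   # symmetric case: responder leads
        vb, va, keep = leader_rule(vb, va)
        lb, la = (keep == 'leader'), (keep == 'other')
    # Output-copying fix: both agents adopt the current leader's value as output.
    out = va if la else (vb if lb else None)
    oa = out if out is not None else a[2]
    ob = out if out is not None else b[2]
    return (va, la, oa), (vb, lb, ob)
}}}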

3.2.3. Small counters

We can generalize the +,-,0 split to allow a variety of small counters. E.g. if we count mod k (for a constant k) we can compute the sum of all inputs mod k at the leader, and in this case we can even make any interaction between two agents send one of their states to zero since there is no risk of overflow. Alternatively, using a bounded counter with values -k...k lets us count inputs up to small values (but forces us, as in the majority protocol, to just say "≥ k" or "≤ -k" when the counters would otherwise overflow).
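A minimal sketch of the mod-k case (the code and names are illustrative): any pair of agents can merge their counts, since addition mod k never overflows, and the sum of all counts mod k is preserved.

{{{
# Sketch: summing all inputs mod K.  The initiator absorbs the responder's
# count (mod K) and the responder's count drops to 0, so the sum of all
# counts mod K is invariant.  A leader-election component (not shown) decides
# which single agent's count is reported as the answer.

K = 5

def interact_mod_k(x, y):
    return (x + y) % K, 0

counts = [3, 4, 2, 1]
for j in range(1, len(counts)):                 # fold everything into agent 0
    counts[0], counts[j] = interact_mod_k(counts[0], counts[j])
print(counts)   # [0, 0, 0, 0] : (3+4+2+1) mod 5 == 0
}}}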

3.2.4. Affine predicates with small coefficients

This can further be generalized to allow testing formulas like ∑ aᵢxᵢ ≥ k or ∑ aᵢxᵢ = c (mod k), where xᵢ counts the number of occurrences of input symbol i and the coefficients aᵢ are small. The essential idea is that we simply translate input token i into aᵢ copies of a + token and then apply the previous results.
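For a concrete (made-up) instance: to test whether twice the number of a inputs plus the number of b inputs is at least 5, have each agent with input a start with counter value 2 and each agent with input b start with counter value 1, then run the bounded-counter threshold protocol of the previous section with k = 5.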

3.2.5. Boolean combinations of stably-computable predicates

If p and q are both stably-computable, then so are ¬p, p∧q, and p∨q; the proof is simply that we can run protocols for both p and q simultaneously and use the output map to translate the outputs of the individual protocols to the output of the combined protocol. This generalizes to arbitrary (fixed and finite) Boolean formulas.
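One way to realize "run both protocols simultaneously" is a product construction (a sketch, with names of my own choosing): the combined state is a pair of component states, interactions update both components, and the output map applies the Boolean formula to the component outputs.

{{{
# Sketch: product construction for running two protocols in parallel.
# delta_p, delta_q map ordered state pairs to state pairs; out_p, out_q map
# states to booleans.  The combined protocol computes, e.g., p AND q.

def delta_product(delta_p, delta_q):
    def delta(s1, s2):
        (p1, q1), (p2, q2) = s1, s2
        p1n, p2n = delta_p(p1, p2)          # run protocol for p on the p-components
        q1n, q2n = delta_q(q1, q2)          # run protocol for q on the q-components
        return (p1n, q1n), (p2n, q2n)
    return delta

def output_and(out_p, out_q):
    def out(state):
        p, q = state
        return out_p(p) and out_q(q)        # combine the component outputs
    return out
}}}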

Combined with the previous results, this tells us that for any predicate on the input counts definable in first-order Presburger_arithmetic, there is a population protocol, running in any weakly-connected interaction graph, that stably computes it. This is the main result of the Angluin et al paper from PODC 2004; they further conjectured that these are precisely the predicates that are stably computable in a complete graph.

3.3. Population protocols on restricted classes of graphs

On interaction graphs that are not complete, we may get more power. The question of how much more power we get is studied in Angluin et al, Stably computable properties of network graphs, which appeared in DCOSS 2005 (see http://cs-www.cs.yale.edu/homes/aspnes/graph-properties-abstract.html). At an extreme, if the interaction graph is a path, we can simulate a Turing machine using the same method as for cellular automata. A slightly less extreme case is one where we have a constant degree bound; this turns out to also allow a full Turing-machine simulation.

Note that in restricted interaction graphs, we can treat the structure of the graph itself as part of the input, and we may want to ask questions like "is the interaction graph a path/cycle/of bounded degree/etc.?" Such questions may be more interesting than simply asking about counts of input tokens (which is the only information we really have in a complete graph because of symmetry between nodes). For most of these problems that we know how to solve, the solution is essentially to map out the graph completely and build a Turing machine on top of it.

How do we do this? There are basically three steps:

  1. Construct a distance-2 coloring, in which any two nodes that share a common neighbor (in particular, any two nodes at distance 2 in the interaction graph) are assigned distinct colors. Since all neighbors of a given node share that node as a common neighbor, this requires at least one distinct color for each neighbor of a node; but if we can do it, it allows a node to identify its neighbors by color, and in particular allows it to store pointers to particular neighbors. The existence of such a coloring using a bounded number of colors in a bounded-degree graph is not hard to prove: any node shares a common neighbor with at most d(d-1) other nodes, so with d(d-1)+1 colors a greedy assignment can always find an unused color for it (a sketch of this argument appears after this list).

  2. Construct a spanning tree on top of the graph, using the node colors for pointers. We have to restart this process whenever the coloring changes, but once the coloring stabilizes we eventually win.
  3. Simulate a Turing machine on top of the spanning tree (an old trick from the self-stabilization literature). This again must be restarted periodically to deal with color or spanning tree changes.
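Here is a centralized sketch of the greedy counting argument from step 1 (this is only the existence argument; the population protocol itself has to reach such a coloring by local trial and error), reading the requirement as "any two nodes with a common neighbor get distinct colors":

{{{
# Sketch: greedy existence argument for the coloring in step 1.  A node shares
# a neighbor with at most d*(d-1) other nodes in a degree-d graph, so
# d*(d-1)+1 colors always suffice.

def distance2_coloring(adj):
    """adj: dict mapping each node to the set of its neighbors."""
    color = {}
    for v in adj:
        # Colors already used by nodes that share a neighbor with v.
        used = {color[w] for u in adj[v] for w in adj[u] if w != v and w in color}
        c = 0
        while c in used:
            c += 1
        color[v] = c
    return color

# Example: a star with center 0; the three leaves all share the common
# neighbor 0, so they get distinct colors and the center can tell them apart.
star = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
print(distance2_coloring(star))   # e.g. {0: 0, 1: 0, 2: 1, 3: 2}
}}}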

For details see the paper. The result is that on bounded-degree graphs we can stably compute anything computable in LINSPACE (deterministic linear space). For graphs that are not of bounded degree we don't currently know what happens.


CategoryDistributedComputingNotes

