

For more up-to-date notes see http://www.cs.yale.edu/homes/aspnes/classes/465/notes.pdf.

See AttiyaWelch Chapter 2 for details. We'll just give the basic overview here. See also AsynchronousMessagePassing and SynchronousMessagePassing for some older notes based on the I/O automaton model used in LynchBook.

1. Basic message-passing model

We have a collection of n processes p1...pn, each of which has a state drawn from a state set Qi, together with an inbuf and outbuf component representing messages available for delivery and messages posted to be sent, respectively. Messages are point-to-point, with a single sender and recipient: if you want broadcast, you have to pay for it. A configuration of the system consists of a vector of states, one for each process. The configuration of the system is updated by an event, which is either a delivery event (a message is moved from some process's outbuf to the appropriate process's inbuf) or a computation event (some process updates its state based on the current value of its inbuf and state components, possibly adding new messages to its outbuf).

An execution segment is a sequence of alternating configurations and events C0, φ1, C1, φ2, C2, ..., in which each triple Ci, φi+1, Ci+1 is consistent with the transition rules for the event φi+1 (see AttiyaWelch chapter 2 or the discussion below for more details on this) and the last element of the sequence (if the sequence is finite) is a configuration. If the first configuration C0 is an initial configuration of the system, we have an execution. A schedule is an execution with the configurations removed, i.e., just the sequence of events.

1.1. Formal details

Each process i has, in addition to its state statei, a variable inbufi[j] for each process j it can receive messages from and outbufi[j] for each process j it can send messages to. We assume each process has a transition function that maps tuples consisting of the inbuf values and the current state to a new state plus zero or one messages to be added to each outbuf (note that this means that the process's behavior can't depend on which of its previous messages have been delivered or not). A computation event comp(i) applies the transition function for i, emptying out all of i's inbuf variables, updating its state, and adding any outgoing messages to i's outbuf variables. A delivery event del(i,j,m) moves message m from outbufi[j] to inbufj[i].
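To make the formalism concrete, here is a minimal simulation sketch in Python. The names (Process, comp, deliver, and the placeholder transition function) are ours, not AttiyaWelch's; in particular deliver stands in for the del(i,j,m) event, since del is a reserved word in Python.

class Process:
    """One process in the message-passing model: a state plus an inbuf
    and an outbuf slot for each other process."""
    def __init__(self, pid, n, initial_state=None):
        self.pid = pid
        self.state = initial_state
        self.inbuf = {j: [] for j in range(n) if j != pid}
        self.outbuf = {j: [] for j in range(n) if j != pid}

    def transition(self, state, inbufs):
        """Placeholder transition function: acknowledge every neighbor we
        heard from. Per the model, it returns a new state plus at most one
        outgoing message per outbuf; a real protocol overrides this."""
        outgoing = {j: "ack" for j, msgs in inbufs.items() if msgs}
        return state, outgoing

def comp(p):
    """Computation event comp(i): apply the transition function, empty all
    of p's inbufs, update the state, post outgoing messages to outbufs."""
    new_state, outgoing = p.transition(p.state, p.inbuf)
    p.state = new_state
    for j in p.inbuf:
        p.inbuf[j] = []
    for j, m in outgoing.items():
        p.outbuf[j].append(m)

def deliver(sender, receiver):
    """Delivery event del(i,j,m): move a message m from outbuf_i[j] to
    inbuf_j[i]. The model lets the scheduler pick any pending message;
    for simplicity we take the oldest."""
    if sender.outbuf[receiver.pid]:
        m = sender.outbuf[receiver.pid].pop(0)
        receiver.inbuf[sender.pid].append(m)

Note that the accessible state discussed below corresponds to everything in Process except the outbuf component.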

A few things are implicit in this definition; one is worth spelling out:

An odd feature of this particular model is that communication channels are not modeled separately from processes, but instead are split across processes (as the inbuf and outbuf variables). This leads to some oddities like having to distinguish the accessible state of a process (which excludes the outbufs) from the full state (which doesn't). A different approach (taken, for example, by LynchBook) would be to have separate automata representing processes and communication channels. But since the resulting model produces essentially the same executions, the exact details don't really matter.

1.2. Network structure

It may be the case that not all processes can communicate directly; if so, we impose a network structure in the form of a directed graph, where i can send a message to j only if there is an edge from i to j in the graph. Typically we assume that each process knows the identity of all its neighbors.

For some problems (e.g., in peer-to-peer systems or other overlay networks) it may be natural to assume that there is a fully-connected underlying network but that we have a dynamic network on top of it, where processes can only send to other processes that they have obtained the addresses of in some way.

2. Asynchronous systems

In an asynchronous model, only minimal restrictions are placed on when messages are delivered and when local computation occurs. A schedule is said to be admissible if (a) there are infinitely many computation steps for each process, and (b) every message is eventually delivered. The first condition (a) assumes that processes do not explicitly terminate, which is the assumption used in AttiyaWelch; an alternative, which we will use when convenient, is to assume that every process either has infinitely many computation steps or reaches an explicit halting state.
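For illustration, here is a hedged sketch of one scheduler whose infinite runs are admissible, reusing the hypothetical Process/comp/deliver machinery from the sketch in section 1.1: round-robin computation steps satisfy (a), and a FIFO queue of pending deliveries satisfies (b). Admissibility of course allows far wilder interleavings than this.

def fair_run(procs, steps):
    """Finite prefix of one admissible schedule; assumes procs[k].pid == k.
    Round-robin gives every process infinitely many computation steps as
    steps grows; the FIFO queue guarantees that every message, once posted,
    is eventually delivered."""
    queue = []                           # pending (sender, receiver) pairs
    for step in range(steps):
        p = procs[step % len(procs)]     # (a) round-robin computation events
        before = {j: len(b) for j, b in p.outbuf.items()}
        comp(p)
        for j, b in p.outbuf.items():    # record newly posted messages
            queue += [(p, procs[j])] * (len(b) - before[j])
        if queue:                        # (b) deliver the oldest pending message
            sender, receiver = queue.pop(0)
            deliver(sender, receiver)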

2.1. Time complexity

There is no explicit notion of time in the asynchronous model, but we can define a time measure by adopting the rule that every message is delivered and processed at most 1 time unit after it is sent. Formally, we assign time 0 to the first event, and assign the largest time we can to each subsequent event, subject to the rule that if a message m from i to j is created at time t, then the time for the delivery of m from i to j and the time for the following computation step of j are both no greater than t+1. This is consistent with an assumption that message propagation takes at most 1 time unit and that local computation takes 0 time units. Another way to look at this is that it is a definition of a time unit in terms of maximum message delay together with an assumption that message delays dominate the cost of the computation. This last assumption is pretty much always true for real-world networks with any non-trivial physical separation between components, thanks to speed-of-light limitations.
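To see the time measure in action, here is a small sketch that assigns times to the events of a schedule greedily; the event encoding is our own, not AttiyaWelch's, and the assignment is a forward approximation consistent with the rule that each message hop costs at most 1 time unit and local computation costs 0. The time of the last event is then the time complexity.

def event_times(schedule):
    """Assign a time to each event. Event encoding (ours):
      ('comp', i, sends) -- process i takes a step, creating the messages
                            whose (globally unique) ids are listed in sends
      ('del', i, mid)    -- message mid is delivered to process i"""
    sent = {}    # sent[mid] = time at which message mid was created
    clock = {}   # clock[i] = time of process i's most recent event
    times = []
    for ev in schedule:
        if ev[0] == 'comp':
            _, i, sends = ev
            t = clock.get(i, 0)          # local computation takes 0 time
            for mid in sends:
                sent[mid] = t
        else:
            _, i, mid = ev
            t = max(clock.get(i, 0), sent[mid] + 1)   # one message hop
        clock[i] = t
        times.append(t)
    return times

# Example: a ping-pong between p0 and p1; each round trip costs 2 time units.
sched = [('comp', 0, ['m1']), ('del', 1, 'm1'),
         ('comp', 1, ['m2']), ('del', 0, 'm2'),
         ('comp', 0, [])]
print(event_times(sched))    # [0, 1, 1, 2, 2]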

The time complexity of a protocol (that terminates) is the time of the last event before all processes finish.

Note that "step complexity", the number of computation events involving either a particular process ("individual step complexity") or all processes ("total step complexity"), is not a useful measure in the asynchronous model, because a process may be scheduled to carry out arbitrarily many computation steps without any of its incoming or outgoing messages being delivered, which probably means that it won't be making any progress. These step-counting measures will be more useful when we look at SharedMemory models.

2.2. Message complexity

For a protocol that terminates, the message complexity is the total number of messages sent. We can also look at message length in bits, total bits sent, etc., if these are useful for distinguishing our new improved protocol from last year's model.

3. Synchronous systems

A synchronous message-passing system is exactly like an asynchronous system, except we insist that the schedule consists of alternating phases in which (a) every process executes a computation step, and (b) all messages are delivered. The combination of a computation phase and a delivery phase is called a round. Synchronous systems are effectively those in which all processes execute in lock-step, and there is no timing uncertainty. This makes protocols much easier to design, but makes them less resistant to real-world timing oddities. Sometimes this can be dealt with by applying a synchronizer, which transforms synchronous protocols into asynchronous protocols at a small cost in complexity.
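In terms of the simulation sketch from section 1.1 (with its hypothetical comp and deliver), a synchronous execution is just a fixed loop:

def sync_round(procs):
    """One synchronous round, assuming procs[k].pid == k: (a) every process
    takes a computation step, then (b) every outstanding message is
    delivered."""
    for p in procs:
        comp(p)
    for p in procs:
        for j in list(p.outbuf):
            while p.outbuf[j]:
                deliver(p, procs[j])

def run_synchronous(procs, rounds):
    """Time complexity in the synchronous model is just the round count."""
    for _ in range(rounds):
        sync_round(procs)

Comparing this with the fair_run sketch above shows why synchronous protocols are easier to design: the scheduler loses all control over the interleaving of steps and deliveries.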

For synchronous systems, time complexity becomes just the number of rounds until a protocol finishes. Message complexity is still only loosely connected to time complexity; for example, there are synchronous LeaderElection algorithms that, by virtue of grossly abusing the synchrony assumption, have unbounded time complexity but very low message complexity.


CategoryDistributedComputingNotes

