For more up-to-date notes see http://www.cs.yale.edu/homes/aspnes/classes/465/notes.pdf.
Synchronizers simulate an execution of a failure-free synchronous system in a failure-free asynchronous system. See AttiyaWelch Chapter 11 or LynchBook Chapter 16 for a detailed (and rigorous) presentation.
1. Formal definition of a synchronizer
Formally, a synchronizer sits between the underlying network and the processes and does one of two things:
A global synchronizer guarantees that no process receives a message for round r until all processes have sent their messages for round r.
A local synchronizer guarantees that no process receives a message for round r until all of that process's neighbors have sent their messages for round r.
In both cases the synchronizer packages all the incoming round-r messages for a single process together and delivers them as a single action recv(p, msgs, round). Similarly, a process is required to hand over all of its outgoing round-r messages to the synchronizer as a single action send(p, msgs, round)---this prevents a process from changing its mind and sending an extra round-r message or two.

It is easy to see that the global synchronizer produces executions that are effectively indistinguishable from synchronous executions, assuming that a synchronous execution is allowed to have some variability in exactly when within a given round each process does its thing.

The local synchronizer only guarantees an execution that is locally indistinguishable from an execution of the global synchronizer: an individual process can't tell the difference, but comparing actions at different (especially widely separated) processes may reveal some process finishing round r+1 while others are still stuck in round r or earlier. Whether this is good enough depends on what you want: it's bad for coordinating simultaneous missile launches, but may be just fine for adapting a SynchronousMessagePassing algorithm (e.g. for DistributedBreadthFirstSearch) to an asynchronous system, if we only care about the final states of the processes and not when precisely those states are reached.
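The send(p, msgs, round)/recv(p, msgs, round) interface can be sketched in a few lines. This is a minimal single-threaded simulation, not a distributed implementation; the class and method names are hypothetical, chosen to mirror the actions above. It enforces the global rule: no process receives its round-r messages until every process has called send for round r.

```python
from collections import defaultdict

class GlobalSynchronizer:
    """Hypothetical sketch of the global synchronizer's interface:
    buffer each process's round-r messages and deliver nothing for
    round r until *every* process has sent for round r."""

    def __init__(self, processes):
        self.processes = set(processes)
        self.senders = defaultdict(set)                       # round -> who has sent
        self.inbox = defaultdict(lambda: defaultdict(list))   # round -> dest -> msgs

    def send(self, p, msgs, rnd):
        # msgs maps destination -> message; one send() per process per round,
        # so a process can't change its mind and send an extra round-r message
        assert p not in self.senders[rnd], "already sent for this round"
        self.senders[rnd].add(p)
        for dest, m in msgs.items():
            self.inbox[rnd][dest].append((p, m))

    def recv(self, p, rnd):
        # deliver p's round-r messages as one batch, or None if not ready yet
        if self.senders[rnd] != self.processes:
            return None
        return self.inbox[rnd][p]
```

A local synchronizer would relax the test in recv to check only p's neighbors rather than the whole process set.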
1.1. Local indistinguishability of local and global synchronizers
Here we prove that any locally synchronized system is indistinguishable from the inside from a globally synchronized system.
Essentially, we use the same happens-before relation as in LogicalClocks, and the fact that if a schedule S' is a causal shuffle of another schedule S (i.e., a permutation of S that preserves causality), then S'|p = S|p for all p.
Given a schedule S, consider a schedule S' in which the events are ordered first by increasing round and then, within each round, with all sends before all receives. This ordering is consistent with ⇒S, so S' is a causal shuffle of S and S'|p = S|p. But S' is globally synchronized, since no round-r operation happens before any round-(r-1) operation, and within each round every send precedes every receive.
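The shuffle in this proof is easy to mechanize. Below is an illustrative sketch (the event encoding is an assumption of this example, not part of the proof): a schedule is a list of (process, kind, round) events, and a stable sort by round with sends before receives produces the globally synchronized order while leaving each process's projection untouched.

```python
def causal_shuffle(S):
    """Reorder a locally synchronized schedule S, given as a list of
    (process, kind, round) events with kind in {'send', 'recv'}, into
    the globally synchronized order: by increasing round, sends before
    receives. Python's sort is stable, so each process's local order
    (its projection S|p) is preserved."""
    return sorted(S, key=lambda e: (e[2], 0 if e[1] == 'send' else 1))

def projection(S, p):
    """S|p: the subsequence of events at process p."""
    return [e for e in S if e[0] == p]
```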
2. Implementations of synchronizers
These all implement at least a local synchronizer (the beta synchronizer is global). The names were chosen by their inventor, Baruch Awerbuch.
The main difference between them is the mechanism used to determine when round-r messages have been delivered.
In the alpha synchronizer, every node sends a message to every neighbor in every round (possibly a dummy message if the underlying protocol doesn't send a message); this allows the receiver to detect when it's gotten all its round-r messages (because it expects to get a message from every neighbor) but may produce huge blow-ups in message complexity in a dense graph.
In the beta synchronizer, messages are acknowledged by their receivers (doubling the message complexity), so the senders can detect when all of their messages have been delivered. But now we need a centralized mechanism to collect this information from the senders and distribute it to the receivers, since any particular receiver doesn't know which potential senders to wait for. This blows up time complexity, as we essentially end up building a global synchronizer with a central leader.
The gamma synchronizer combines the two approaches at different levels to obtain a trade-off between messages and time that depends on the structure of the graph and how the protocol is organized.
Details of each synchronizer are given below.
2.1. The alpha synchronizer
In round r, the synchronizer at p sends p's message (tagged with the round number) to each neighbor p', or no-msg(r) if it has no message for that neighbor. When it has collected a message or no-msg from each neighbor for round r, it delivers all the messages. It's easy to see that this satisfies the local synchronization specification.
This produces no change in time but may drastically increase message complexity because of all the extra no-msgs flying around.
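The per-node delivery rule can be sketched as follows. This is an illustrative model only (the class and method names are hypothetical): incoming round-tagged messages are buffered, with None standing in for no-msg, and round r is delivered as a batch once every neighbor has been heard from.

```python
from collections import defaultdict

class AlphaNode:
    """Hypothetical sketch of the alpha synchronizer at one node:
    buffer incoming round-tagged messages and deliver round r only
    once something (a real message or a no-msg) has arrived from
    every neighbor."""

    def __init__(self, neighbors):
        self.neighbors = set(neighbors)
        self.buffer = defaultdict(dict)   # round -> {sender: payload}

    def on_receive(self, sender, rnd, payload):
        """payload is the protocol message, or None for no-msg(rnd).
        Returns the batch of real round-rnd messages once complete,
        else None."""
        self.buffer[rnd][sender] = payload
        if set(self.buffer[rnd]) == self.neighbors:
            # heard from all neighbors: deliver the real messages
            return {q: m for q, m in self.buffer[rnd].items() if m is not None}
        return None
```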
2.2. The beta synchronizer
Centralize detection of message delivery using a rooted directed spanning tree (previously constructed). When p' receives a round-r message from p, it responds with ack(r). When p collects an ack for all the messages it sent plus an OK from all of its children, it sends OK to its parent. When the root has all its acks and OKs, it broadcasts go. Receiving go makes p deliver the queued round-r messages.
This works because in order for the root to issue go, every round-r message has to have been acknowledged, which means every round-r message is already waiting in its receiver's buffer to be delivered. For the beta synchronizer, message complexity increases slightly from M to 2M+2(n-1), but time complexity goes up by a factor proportional to the depth of the tree.
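The cost accounting works out as follows. This is an illustrative sketch (function names are mine, not from any source): M acks plus n-1 OKs up the tree plus n-1 go's back down gives the extra messages, and the OK convergecast plus the go broadcast each take time proportional to the tree's depth.

```python
def tree_depth(children, root):
    """Depth (longest root-to-leaf path) of a rooted tree given as a
    dict mapping each node to its list of children."""
    if not children.get(root):
        return 0
    return 1 + max(tree_depth(children, c) for c in children[root])

def beta_overhead(children, root, M, n):
    """Illustrative per-round overhead of the beta synchronizer for a
    protocol sending M messages per round over an n-node tree:
    returns (extra messages, extra time).
    Messages: M acks + (n-1) OKs up + (n-1) go's down.
    Time: depth of the tree for the OK convergecast, and again for
    the go broadcast."""
    d = tree_depth(children, root)
    return (M + 2 * (n - 1), 2 * d)
```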
2.3. The gamma synchronizer
The gamma synchronizer combines the alpha and beta synchronizers to try to get low blowups on both time complexity and message complexity. The essential idea is to cover the graph with a spanning forest and run beta within each tree and alpha between trees. Specifically:
Every message in the underlying protocol gets acked (including messages that pass between trees).
When a process has collected all of its outstanding round-r acks, it sends OK up its tree.
When the root of a tree gets all of its acks and OKs, it sends ready to the roots of all adjacent trees (and to itself). Two trees are adjacent if any of their members are adjacent.
When the root collects ready from itself and all adjacent roots, it broadcasts go through its own tree.
As in the alpha synchronizer, we can show that no root issues go unless it and all its neighbors issue ready, which happens only after both all nodes in the root's tree and all their neighbors (some of whom might be in adjacent trees) have received acks for all messages. This means that when a node receives go it can safely deliver its bucket of messages.
Message complexity is comparable to the beta synchronizer assuming there aren't too many adjacent trees: 2M messages for sends and acks, plus O(N) messages for in-tree communication, plus O(E_roots) messages for root-to-root communication, where E_roots is the number of pairs of adjacent trees. Time complexity per synchronous round is proportional to the depth of the trees: this includes both the time for in-tree communication, and the time for root-to-root communication, which might need to be routed through leaves.
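These costs can be tallied up in the same style as for beta. This is a rough, illustrative sketch (the function name and the constant factors are mine, not from any source): the extra messages are the acks, the in-tree OK/go traffic, and the ready messages between adjacent roots; the extra time is a small multiple of the tree depth, covering the convergecast up, the routed root-to-root ready exchange, and the broadcast down.

```python
def gamma_overhead(M, n, e_roots, depth):
    """Illustrative per-round overhead of the gamma synchronizer:
    returns (extra messages, extra time).
    Messages: M acks + ~2(n-1) in-tree OK/go traffic + e_roots
    ready messages between adjacent roots.
    Time: a constant multiple of the tree depth (OK up, ready
    across via routed paths, go down); the factor 4 is a placeholder."""
    extra_messages = M + 2 * (n - 1) + e_roots
    extra_time = 4 * depth
    return extra_messages, extra_time
```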
In a particularly nice graph, the gamma synchronizer can give costs comparable to the costs of the original synchronous algorithm. An example in LynchBook is a ring of k-cliques, where we build a tree in each clique and get O(1) time blowup and O(N) added messages. This is compared to O(N/k) time blowup for beta and O(k) message blowup (or worse) for alpha. Other graphs may favor tuning the size of the trees in the forest toward the alpha or beta ends of the spectrum, e.g. if the whole graph is a clique (and we didn't worry about contention issues), we might just use beta and get O(1) time blowup and O(N) added messages.
3. Synchronizers for dynamic networks
So far we have assumed the network structure is static. We can also apply a synchronizer to a dynamic network, where the connections between nodes change over time. This is useful, for example, in overlay networks where there is some underlying network that allows point-to-point connections between any two nodes, but only some of these connections are used.
A simple approach is to observe that if each node can compute what neighbors it has in each round of the synchronous execution, it can apply the alpha synchronizer by just waiting for its current neighbors. An example of this approach can be found in this paper.
The downside of the simple approach is that it requires every node to know the identities of all of its neighbors even in round 0, which creates trouble for new nodes trying to join the system. If I approach some existing node and ask to join it, then I have to adopt a starting round number that is at least as great as its current round number (otherwise the actions it computed for previous synchronous rounds will not be consistent with my having appeared before then). With a long chain of newcomers, this can lead to the last one starting with a very high round number, and so the simulation may take a long time to catch up to this high round number.
Alternatively, we can reset parts of the computation when a new node shows up. This may lead to faster computations (since after the reset everybody starts at round 0 again, and the usual synchronizer bounds apply). Choosing exactly how and when to do this demands cleverness; the best currently known approach is the zeta synchronizer of Awerbuch and Sipser http://citeseer.ist.psu.edu/awerbuch88dynamic.html.
4. Applications
See AttiyaWelch §11.3 or LynchBook §16.5. The one we have seen is DistributedBreadthFirstSearch, where the two asynchronous algorithms we described were essentially the synchronous algorithms with the beta and alpha synchronizers embedded in them. But in general what synchronizers give us is the ability to forget about problems resulting from asynchrony provided we can assume no failures (which may be a very strong assumption) and are willing to accept a bit of overhead.