
Note: You are looking at a static copy of the former PineWiki site, used for class notes by James Aspnes from 2003 to 2012. Many mathematical formulas are broken, and there are likely to be other bugs as well. These will most likely not be fixed. You may be able to find more up-to-date versions of some of these notes at http://www.cs.yale.edu/homes/aspnes/#classes.

1. Motivation

In the beginning, processes generally didn't talk to each other except occasionally by leaving notes in the filesystem. But faster methods were needed to get real-time cooperation between processes.

Most interactions between processes fall into a small number of patterns:

Shared state
A database, the Windows registry, the filesystem, the kernel scheduler data.
Producer-consumer
One process generates a sequence of data that is consumed by another process.
Worker threads
Several processes carry out tasks in parallel; requires a task queue and possibly some shared state the threads are operating on.

We want to provide tools for letting processes communicate with each other quickly to carry out tasks with these sorts of structures.

2. Shared memory

The simplest form of IPC is shared memory. Here the main issue is ConcurrencyControl (which we've already talked about) and possibly MemoryManagement to map the same physical memory to two different address spaces (which we haven't talked about yet).

Advantage: Very fast.

Disadvantages: Requires careful locking to avoid trouble. Doesn't work across multiple machines. Doesn't (by itself) solve the producer-consumer problem.

3. Message passing

Here the model is that one process sends a message (some block of bits) and another process receives it, with actual delivery the responsibility of the operating system. The main advantages of message-passing are that it scales to multiple machines and that it is agnostic about the actual mechanism of delivery (see, for example, http://www.ietf.org/rfc/rfc1149.txt, the IETF standard for IP over avian carriers).

3.1. Interface

3.1.1. Channels and naming

One issue is how to specify where you are sending a message to and where you want to receive a message from. This is particularly tricky if you start out without knowing any friends.

A distinction can be made between direct delivery—sending messages to a specific process—and indirect delivery—sending messages to some channel without necessarily knowing what process might be on the other end. The latter is generally more useful, since it decouples a service (processing particular messages) from its implementation.

Some of the strategies used in various kinds of systems:

Unix pipes

A pipe is an anonymous channel for shipping bytes. You create a pipe with the pipe system call, which gives you two file descriptors that you can then use standard POSIX I/O operations on. The recipient of a message is whoever is reading from the output end of the pipe. Since there is no way to hand file descriptors to an arbitrary process in most versions of Unix, pipe handles are usually inherited during a fork.
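The inheritance-across-fork pattern can be sketched as follows (POSIX-specific, using Python's os module; both file descriptors returned by pipe are copied into the child by fork):

```python
# Minimal sketch of an anonymous pipe shared across fork() (POSIX only).
# pipe() returns (read end, write end); the forked child inherits both.
import os

def pipe_roundtrip(msg: bytes) -> bytes:
    r, w = os.pipe()                 # r = read end, w = write end
    pid = os.fork()
    if pid == 0:                     # child: write the message and exit
        os.close(r)                  # close the unused read end
        os.write(w, msg)
        os.close(w)
        os._exit(0)
    os.close(w)                      # parent: close the unused write end
    data = os.read(r, 4096)          # blocks until the child writes
    os.close(r)
    os.waitpid(pid, 0)
    return data

if __name__ == "__main__":
    print(pipe_roundtrip(b"hello through the pipe"))
```

Closing the unused ends matters: a reader only sees end-of-file once every copy of the write descriptor has been closed.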

TCP/IP
An address is given by an IP address (32 bits, or 128 bits for IPv6) and a port number (16 bits). IP addresses may be obtained from more human-readable names using DNS. Many port numbers are standardized, e.g. 21 for FTP, 23 for telnet, 22 for ssh, 25 for SMTP, 80 for HTTP, 6000 for X11. (Try running nmap localhost on the Zoo sometime.) The recipient of a message is whatever process asked the kernel to open that port for it.

Sun RPC
Layered on top of TCP/IP. Port numbers are assigned dynamically by a portmapper service (which itself is at a standard port).

Peer-to-peer systems
Implement a distributed lookup service to translate queries to IP addresses.
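Naming by (IP address, port) can be sketched on the loopback interface; here the server binds to port 0 so the kernel assigns an ephemeral port, and the client reaches the server purely by that address-plus-port name:

```python
# Sketch: a loopback TCP exchange.  The channel's name is (IP address,
# port); the recipient is whichever process opened that port.
import socket
import threading

def tcp_echo_once() -> bytes:
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind(("127.0.0.1", 0))       # port 0: kernel picks a free port
    srv.listen(1)
    host, port = srv.getsockname()   # the name a client must know

    def serve():
        conn, _ = srv.accept()       # recipient = opener of the port
        conn.sendall(conn.recv(1024))  # echo the message back
        conn.close()

    t = threading.Thread(target=serve)
    t.start()
    cli = socket.create_connection((host, port))
    cli.sendall(b"ping")
    reply = cli.recv(1024)
    cli.close()
    t.join()
    srv.close()
    return reply

if __name__ == "__main__":
    print(tcp_echo_once())           # b'ping'
```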

Even once you have a name, there is still the issue of setting up a channel or connection and of what the behavior of that channel is: for example, whether delivery is synchronous or asynchronous, and whether lost messages are retransmitted or simply dropped.

3.1.2. Sending a message

The interface for sending a message can be pretty straightforward: something like send(msg, length, recipient), where recipient is the name of a channel. From the sender's perspective, the message is gone and it can go do whatever it likes.

3.1.3. Receiving a message

This is trickier. Since the recipient doesn't control when a message is sent (or how long it takes to arrive), we need to somehow get its attention. There are two basic approaches:

Event handler
Recipient registers a callback function with the kernel. When a message comes in, the kernel interrupts the recipient and runs the callback function.
Event loop
Recipient checks its messages from time to time using either a blocking receive function or a non-blocking polling function.

The advantages and disadvantages are similar to those for preemptive vs non-preemptive scheduling. Messages coming in through an event loop are tidier but less timely than messages delivered through interruptions.

Note that we can always simulate the event-loop model with an event handler by having the event handler store incoming messages in a buffer. Conversely, a multi-threaded process can simulate an event handler by having a listener thread whose only purpose is to wait for incoming messages.
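The event-loop model can be sketched with a blocking receive built on select(); in this sketch a pipe stands in for the message channel:

```python
# Sketch of the event-loop model: block until a channel is readable,
# then dispatch on it.  A pipe stands in for the message channel.
import os
import selectors

def event_loop_once() -> bytes:
    r, w = os.pipe()
    sel = selectors.DefaultSelector()
    sel.register(r, selectors.EVENT_READ)

    os.write(w, b"wake up")          # a "message" arrives on the channel
    received = b""
    for key, _ in sel.select(timeout=1.0):   # blocking receive
        received = os.read(key.fd, 1024)     # handle the ready channel
    sel.close()
    os.close(r)
    os.close(w)
    return received
```

Passing timeout=0 to select() instead would give the non-blocking polling variant mentioned above.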

3.2. Implementation

When we look at implementation, we suddenly realize that message-passing doesn't solve the producer-consumer problem, it instantiates it.

3.2.1. Using shared memory

For asynchronous message-passing, we need to allocate buffer space somewhere in the kernel for each channel. A send operation is now handled by a system call that (a) checks if the buffer is full and blocks if so (e.g., using a semaphore), and (b) then writes the message to the end of the buffer. Receive is the reverse, blocking first if the buffer is empty (a second semaphore), and then reading from the start of the buffer. The start and end of the buffer can be maintained by keeping track of two pointers that wrap around at the end, giving a ring buffer architecture. Another option that allows for unbounded buffering, subject to the memory limits and patience of the kernel, is a dynamically-allocated queue.
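The ring buffer with two semaphores can be sketched as follows (a user-space sketch with Python threads standing in for processes; a kernel version would live behind system calls):

```python
# Sketch of a bounded ring buffer guarded by two semaphores:
# send blocks when the buffer is full, receive blocks when it is empty.
import threading

class Channel:
    def __init__(self, capacity: int):
        self.buf = [None] * capacity
        self.head = 0                 # next slot to read
        self.tail = 0                 # next slot to write
        self.capacity = capacity
        self.slots = threading.Semaphore(capacity)  # free slots left
        self.items = threading.Semaphore(0)         # messages queued
        self.lock = threading.Lock()                # protects head/tail

    def send(self, msg):
        self.slots.acquire()          # block if the buffer is full
        with self.lock:
            self.buf[self.tail] = msg
            self.tail = (self.tail + 1) % self.capacity  # wrap around
        self.items.release()          # wake a blocked receiver

    def receive(self):
        self.items.acquire()          # block if the buffer is empty
        with self.lock:
            msg = self.buf[self.head]
            self.head = (self.head + 1) % self.capacity  # wrap around
        self.slots.release()          # wake a blocked sender
        return msg
```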

For synchronous message-passing, the buffer can be much smaller since it only has to hold one message in transit. Here the main implementation issue is how a send operation interacts with the scheduler. Since we are going to block the sender anyway, do we just queue it up or should we do something like give up the rest of its time-slice to the recipient to get a fast response? Further optimizations may be possible if we can immediately switch to the recipient, like passing very small messages in registers or very big messages by mapping physical memory between address spaces. These optimizations might be especially important when message-passing is a critical system bottleneck, as in microkernels.
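The one-message-in-transit behavior can be sketched as a rendezvous channel built from two semaphores (again a thread-based sketch, not a kernel implementation; the scheduler interactions discussed above don't appear at this level):

```python
# Sketch of synchronous message-passing: send does not return until a
# receiver has taken the message, so at most one message is in transit.
import threading

class SyncChannel:
    def __init__(self):
        self.slot = None                       # the single in-transit message
        self.ready = threading.Semaphore(0)    # message available
        self.taken = threading.Semaphore(0)    # message consumed
        self.send_lock = threading.Lock()      # one sender at a time

    def send(self, msg):
        with self.send_lock:
            self.slot = msg
            self.ready.release()               # hand the message over
            self.taken.acquire()               # block until it is received

    def receive(self):
        self.ready.acquire()                   # block until a sender arrives
        msg = self.slot
        self.taken.release()                   # unblock the sender
        return msg
```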

3.2.2. Across a network

Typical approach: Buffer on sender's machine, and then separate kernel thread and/or network driver pulls data out of buffer and sends it out through the network hardware. Receiver has matching structure, with incoming data being stuffed in buffers for later delivery.

Many details to work out: acknowledgments, retransmission, checksums, etc. Not our problem.

3.3. Exceptions

Bad things can happen. Suppose the recipient dies; how do we notify the sender?

Unix approach
SIGPIPE is delivered to the sender's signal handler (or, if the signal is not caught, the sender dies with a Broken Pipe error), and write operations return errors.

TCP approach
A special acknowledgment message says the connection is closed; the kernel translates this to appropriate errors.

How about lost messages?

Retransmit
Kernel takes responsibility for resending the message until it gets through (and is acknowledged).
Ignore it
Sometimes messages just get lost; too bad.
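The Unix dead-recipient behavior can be observed directly: writing to a pipe whose read end has been closed fails with EPIPE. (Python ignores SIGPIPE by default, so the failed write surfaces as a BrokenPipeError exception instead of killing the process.)

```python
# Sketch: what the sender sees when the recipient is gone.
import errno
import os

def write_to_dead_pipe() -> int:
    r, w = os.pipe()
    os.close(r)                      # "recipient dies"
    try:
        os.write(w, b"anyone there?")
    except BrokenPipeError as e:     # kernel reported EPIPE on the write
        return e.errno
    finally:
        os.close(w)
    return 0
```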

4. Remote procedure calls

Allows process P to call some specific procedure in process Q (generally process Q has to register the procedure as available) and get the result back. Process Q may or may not be on the same machine.

Implementation is basically two messages: call and return. Arguments to call are translated into standard form (marshaled) for transmission. Process P generally blocks while waiting for the return.
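Marshaling can be sketched as packing the procedure name and arguments into a standard byte format that can cross the wire; the length-prefixed JSON format here is invented for illustration (real systems use formats like XDR or protocol buffers):

```python
# Sketch of RPC marshaling: turn a call into bytes and back.
import json
import struct

def marshal_call(proc: str, args: list) -> bytes:
    """Pack (procedure name, arguments) into a length-prefixed message."""
    body = json.dumps({"proc": proc, "args": args}).encode()
    return struct.pack("!I", len(body)) + body   # 4-byte length prefix

def unmarshal_call(data: bytes):
    """Recover (procedure name, arguments) on the callee's side."""
    (length,) = struct.unpack("!I", data[:4])
    msg = json.loads(data[4:4 + length])
    return msg["proc"], msg["args"]
```

On the callee's side, Q would look the procedure name up in its table of registered procedures, apply it to the arguments, and marshal the result into the return message the same way.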

Simplifies interface since we already understand procedure calls. But blocking P may be expensive (can be dealt with by using multiple threads in P).

Example: HTTP GET.

5. Remote method invocation

Fancy-pants Java RPC. The problem with RPC for OO languages is that arguments might be objects. RPC would send a bitwise copy of the object, but complex objects (particularly objects with state) can't be marshaled. So instead we replace each object in P that is included as an argument with a delegate, a new object that lives on Q but forwards any method invocations back to its progenitor on P (via RMI again, now running in reverse). Note that allowing these reverse calls means that P can't block completely during an RMI or we'll get deadlock. This is handled by giving each process a listener thread that processes incoming RMI requests, spawning worker threads to carry out the actual operations.
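The delegate idea can be sketched as a forwarding proxy. This is a minimal sketch in which a callback stands in for the reverse RMI channel back to P; the class and names are invented for illustration:

```python
# Sketch of a delegate: an object on Q that forwards every method call
# back to its progenitor on P.  A callback stands in for reverse RMI.
class Delegate:
    def __init__(self, forward):
        # forward(name, args) ships the invocation back to the real object
        self._forward = forward

    def __getattr__(self, name):
        def method(*args):
            return self._forward(name, args)   # reverse call to P
        return method
```

For example, if `real` is the object living on P, Q's delegate would be constructed with `forward=lambda name, args: getattr(real, name)(*args)` replaced by an actual reverse-RMI send.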

6. Efficiency issues

Any form of IPC can be expensive, since dealing with multiple processes probably involves both context switches and copying data. One way to deal with the cost is bundling: Many messages or procedure calls can be packed together so that the overhead is amortized. This is done, for example, by the X11 protocol and by the HTTP KeepAlive mechanism. But this usually requires user cooperation or at the minimum asynchronous communication (so the kernel can bundle the messages for the user).
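Bundling can be sketched as length-prefixing several small messages into one buffer, so the per-send overhead (system call, context switch) is paid once per batch rather than once per message; this wire format is invented for illustration:

```python
# Sketch of bundling: pack many small messages into one send.
import struct

def bundle(messages):
    """Concatenate messages, each preceded by a 4-byte length."""
    out = bytearray()
    for m in messages:
        out += struct.pack("!I", len(m)) + m
    return bytes(out)

def unbundle(data):
    """Split a bundled buffer back into the original messages."""
    msgs, i = [], 0
    while i < len(data):
        (n,) = struct.unpack_from("!I", data, i)
        msgs.append(data[i + 4:i + 4 + n])
        i += 4 + n
    return msgs
```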


2014-06-17 11:58