
Note: You are looking at a static copy of the former PineWiki site, used for class notes by James Aspnes from 2003 to 2012. Many mathematical formulas are broken, and there are likely to be other bugs as well. These will most likely not be fixed. You may be able to find more up-to-date versions of some of these notes at http://www.cs.yale.edu/homes/aspnes/#classes.

Notes on deadlock for CS422. For more details see SilberschatzGalvinGagne Chapter 7.

1. What is deadlock?

Deadlock occurs when two or more processes never make progress because each is waiting for some resource held by another.

1.1. Example: too little memory

Two processes each demand 1.5 GB of memory on a machine with only 2 GB (and no virtual memory). The operating system kindly gives each 1 GB exactly. Neither process can make progress until the other gives up some of its memory.

1.2. Example: I/O

Two interactive processes on a primitive handheld device with no window system each want to control the keyboard (so the user can type at them) and the display (so they can respond). If process P grabs the keyboard while process Q grabs the display, both may be stuck waiting for the other resource.

1.3. Example: bidirectional pipe

A common design pattern in Unix-derived operating systems is the pipeline, e.g. sort | uniq | wc to count unique lines in a file. In a pipeline, each program feeds its output to the next program. Curiously, there is no built-in mechanism in most shells to create a circular pipeline, where two programs A and B each produce as output the input to the other, even though it may be reasonable in some circumstances for one program to use another as a subroutine, providing its input and consuming its output. Even scripting languages that provide such tools (e.g. Python's os.popen2 function) bury them in obscure libraries. Why is this?

Let's imagine a simple circular pipeline where process P (say some user program) sends data to process Q (say tr) and then reads the result. We'll imagine each process puts its outgoing data in a buffer guarded by a semaphore, and blocks when the buffer fills up. What happens?

If P's input to Q is large enough and Q produces output comparable in size to its input, then Q fills up its outgoing buffer and blocks before P is done writing. This causes P to fill up its output buffer and block. Nobody makes any progress from this point on—we have a deadlock.
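This blocking behavior can be made concrete with a small discrete simulation (the buffer sizes and variable names below are illustrative, not from the notes; real pipes behave analogously with their fixed kernel buffer capacity):

```python
def simulate(total, cap):
    """Simulate P sending `total` bytes through Q, with a buffer of
    capacity `cap` in each direction.  As in the scenario above, P
    writes all of its input before reading any output, while Q copies
    bytes through one at a time."""
    p_sent = p_read = 0     # bytes P has written / read back
    pq = qp = 0             # bytes currently sitting in each buffer
    while p_read < total:
        progress = False
        if p_sent < total and pq < cap:      # P writes into the P->Q buffer
            p_sent += 1; pq += 1; progress = True
        if pq > 0 and qp < cap:              # Q moves a byte to its output
            pq -= 1; qp += 1; progress = True
        if p_sent == total and qp > 0:       # P reads only after writing all
            qp -= 1; p_read += 1; progress = True
        if not progress:                     # nobody can move: deadlock
            return f"deadlock: P blocked after writing {p_sent} of {total} bytes"
    return "completed"
```

With `cap = 4`, any `total` up to 8 (= 2 × cap, one bufferful in each direction) completes, but `simulate(10, 4)` reports a deadlock with P stuck after 8 bytes: both buffers are full and P refuses to read until it finishes writing.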

2. Processes and resources

Deadlocks are described in terms of processes (things that can block) and resources (things processes can wait for). Processes may or may not correspond to full-blown processes as used elsewhere. It is typically assumed that each resource may have multiple instances, where a process is indifferent to which instance it gets and nobody blocks unless the processes collectively request more instances than the resource can provide.

3. Necessary conditions

There are four conditions which must hold for deadlock to occur as classically defined. These are known as Coffman's conditions from a 1971 survey paper of which Coffman was the first author in alphabetical order (Coffman, E.G., M.J. Elphick, and A. Shoshani, System Deadlocks, ACM Computing Surveys, 3(2):67–78, 1971; coffman_deadlocks.pdf).

  1. Mutual exclusion. There must be some resource that can't be shared between processes.

  2. Hold and wait. Some process must be holding one resource while waiting for another.

  3. No preemption. Only the process holding a resource can release it.

  4. Circular wait. There must be some cycle of waiting processes P1, P2, ..., Pn such that each process Pi is waiting for a resource held by the next process in the cycle. (Note that this implies hold-and-wait.)

4. Resource-allocation graphs

A resource-allocation graph depicts which processes are waiting for or holding each resource. Each node in the graph represents either a process or a resource. A directed edge is drawn from process P to resource R if P is waiting for R, and from R to P if P holds R. (See SGG §7.2 for many pictures of resource-allocation graphs.)

The situation where P1 waits for a resource R held by P2 corresponds to a path of length 2 in the resource-allocation graph. Circular waiting strings these length-2 paths together into a cycle. It follows that deadlock can occur only if there is a cycle in the resource-allocation graph. The converse is generally not true (although it holds if there is only one instance of each resource). For example, if P1 holds R1 and waits for R2 while P2 holds R2 and waits for R1, then we have a cycle in the resource-allocation graph, but there is no deadlock if R2 has multiple instances some of which are held by other processes not involved in the cycle. When these extra instances become available, P1 stops waiting and the cycle clears up.
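Checking a resource-allocation graph for a cycle is a standard depth-first search; here is a minimal Python sketch (the graph encoding and names are my own):

```python
def find_cycle(edges):
    """Find a cycle in a directed graph given as {node: [successors]}.
    Edges run process -> resource (waiting) and resource -> process
    (holding).  Returns a list of nodes on a cycle, or None."""
    WHITE, GRAY, BLACK = 0, 1, 2          # unvisited / in progress / done
    color = {n: WHITE for n in edges}
    stack = []                             # current DFS path

    def dfs(u):
        color[u] = GRAY
        stack.append(u)
        for v in edges.get(u, []):
            if color.get(v, WHITE) == GRAY:        # back edge: cycle found
                return stack[stack.index(v):]
            if color.get(v, WHITE) == WHITE:
                cyc = dfs(v)
                if cyc:
                    return cyc
        stack.pop()
        color[u] = BLACK
        return None

    for n in list(edges):
        if color[n] == WHITE:
            cyc = dfs(n)
            if cyc:
                return cyc
    return None
```

For the two-process example above (P1 holds R1 and waits for R2, P2 holds R2 and waits for R1) the edges are P1→R2, R2→P2, P2→R1, R1→P1, and the search reports that four-node cycle. As the text notes, with multiple instances per resource a reported cycle is only a necessary condition for deadlock, not a sufficient one.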

5. Preventing deadlock

To prevent deadlock, we just have to violate one of the Coffman conditions.

  1. No mutual exclusion. If there's no need for mutual exclusion, there's no deadlock. This is the best solution when it can be arranged, particularly when resources (read-only files, lock-free data structures) can be shared. This doesn't work for resources that can't reasonably be shared by two processes at the same time (most writable data structures, the CPU, CD-ROM burners, the keyboard). Sometimes resources can be partitioned to avoid mutual exclusion (memory partitioning, window systems).

  2. No hold-and-wait. Adopt a policy of not letting a process wait for one resource while holding another, either by requiring each process to hold only one resource at a time, or to request all of the resources it needs simultaneously. The first approach may be severely limiting (for example, an interactive program can't get exclusive access to the keyboard and the display at the same time). The second has two problems: it requires a process to predict what resources it needs in advance, and it may allow a process to starve waiting for a group of resources that never become free all at the same time. Sometimes resources can be consolidated to allow the single-resource approach: it's quite natural, for example, to have a single mutex that covers all three of the keyboard, mouse, and display. In the limit we can lock the entire system (caveat: difficult for distributed systems) in order to lock any resource, but then things are likely to get slow.
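The request-everything-at-once policy can be approximated with non-blocking lock attempts: try to take every lock, and if any attempt fails, give back what you hold and retry, so you never hold one resource while waiting on another. A Python sketch (retry count and backoff interval are arbitrary choices of mine):

```python
import random
import threading
import time

def acquire_all(locks, retries=100):
    """Grab every lock in `locks` or none of them.  On any failed
    attempt, release everything already held and back off, so we never
    hold one lock while blocked waiting for another."""
    for _ in range(retries):
        taken = []
        for lk in locks:
            if lk.acquire(blocking=False):
                taken.append(lk)
            else:
                for held in reversed(taken):    # give everything back
                    held.release()
                break
        else:
            return taken                        # got them all
        time.sleep(random.uniform(0, 0.01))     # back off before retrying
    raise TimeoutError("could not acquire all locks")
```

Note that this exhibits exactly the starvation risk described above: a process can loop forever if the full set of locks never happens to be free at the same moment.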

  3. Allow preemption. This is almost as good as eliminating mutual exclusion: if I can bump some process off a resource it is hogging, then I can break a deadlock cycle. The CPU is the primary preemptible resource (although we may get deadlock if we allow processes to block preemption while waiting for other resources—as can happen with disabled interrupts or in cases of priority inversion where a high-priority process pushes the low-priority process it's waiting for off the CPU). Another preemptible resource is memory (assuming we can swap out to disk). The difference between preemption and sharing is that in the preemption case we just need to be able to restore the state of the resource for the preempted process rather than letting it in at the same time as the preemptor.

  4. Eliminate cycles. This is similar to no-hold-and-wait. We'll require processes to grab resources in increasing order according to some total order (one common heuristic is increasing order by the memory address of the mutexes guarding the resources). If process P is holding R1 while waiting for R2, then R2 > R1 in the ordering (otherwise P would have grabbed it first). So we can't have a circular wait, because this would give a cycle in the total order on resources. Note that unlike hold and wait we don't get starvation if the resources are allocated fairly: I will eventually come to the front of the line for the next resource on my list and will thus eventually get all the resources. The disadvantage though is that I still need to be able to predict what resources I want in advance or accept that I may not be able to ask for some resource if I was unlucky enough to request some other resource first.
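The lock-ordering discipline is easy to wrap in a helper. A minimal Python sketch, using id() as a stand-in for the memory-address heuristic mentioned above:

```python
import threading

def acquire_in_order(*locks):
    """Acquire the given locks in one fixed global order (here id(),
    standing in for the mutex's memory address), so that no circular
    wait can form.  Returns the locks in the order acquired."""
    ordered = sorted(locks, key=id)
    for lk in ordered:
        lk.acquire()
    return ordered
```

With this helper, a thread calling acquire_in_order(keyboard, display) and another calling acquire_in_order(display, keyboard) take the locks in the same order, so neither can hold a lock the other is waiting for while also waiting itself.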

6. Avoiding deadlock

Here we aren't going to violate the Coffman conditions, but we may make processes wait for a resource even if it is available because granting the resource might lead to deadlock later. Typically this requires that processes declare in advance what resources they might request.

For example, in the keyboard-and-display situation, each process could declare that it may want at some point 1 keyboard and 1 display. Then once process P grabs the keyboard, a deadlock-avoidance algorithm could make Q wait for the display even if P hasn't asked for it.

Formally, this condition is expressed in terms of safe states. Intuitively, a state is safe if there is some way to let all the processes run without creating deadlock. Formally, a state is safe if there is a safe sequence or ordering of the processes { Pi } such that each process Pi can satisfy its maximum demands using only (a) resources that are currently available plus (b) resources held by processes earlier in the sequence. (The idea is that the safe sequence describes the order in which processes will finish and release their resources: once all earlier processes have finished, Pi can get whatever it needs and finish too.)

In the keyboard-and-display example, the state where P holds the keyboard is safe using the safe sequence P < Q, since Q can get everything it needs once P finishes. But the state where P holds the keyboard and Q holds the display is not, since if P < Q then P can't get its maximum resources. A working deadlock-avoidance algorithm will notice this and keep Q from acquiring the display.

6.1. The Banker's algorithm

The Banker's Algorithm, due to Edsger_Dijkstra (yes, Dijkstra again), detects safe states. The method is the following: we write down a matrix whose rows correspond to processes and columns to resources, and put each process's maximum remaining demand in its row. We keep track of a separate vector A of available resources. To determine if a state is safe:

  1. Look for a row that is componentwise less than or equal to A. If there is none, the state is not safe.
  2. Otherwise, pretend that the process for that row has finished, remove its row from the matrix, and give all of its resources back to A.
  3. Repeat until we get stuck or all processes are removed.
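The steps above can be sketched in Python, using the keyboard-and-display example from earlier in this section (the data layout is my own):

```python
def is_safe(available, max_remaining, allocation):
    """Banker's algorithm safety check.
    available:     list, free instances of each resource (the vector A)
    max_remaining: {process: list}, maximum remaining demand (the matrix rows)
    allocation:    {process: list}, instances currently held
    Returns a safe sequence of processes, or None if the state is unsafe."""
    work = list(available)
    remaining = dict(max_remaining)
    sequence = []
    while remaining:
        for p, demand in list(remaining.items()):
            if all(d <= w for d, w in zip(demand, work)):   # row <= A
                # Pretend p finishes: reclaim everything it holds.
                work = [w + a for w, a in zip(work, allocation[p])]
                sequence.append(p)
                del remaining[p]
                break
        else:
            return None          # stuck: no row fits, state is unsafe
    return sequence
```

With resources (keyboard, display), the state where P holds the keyboard and Q holds nothing is safe: is_safe([0, 1], {"P": [0, 1], "Q": [1, 1]}, {"P": [1, 0], "Q": [0, 0]}) returns the safe sequence ["P", "Q"]. The state where P holds the keyboard and Q holds the display is unsafe, and the same call with available = [0, 0] returns None.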

(This summary is taken from TanenbaumBook §3.5.4. SilberschatzGalvinGagne §7.5.3 gives more details.)

It's not hard to see that the Banker's Algorithm computes a safe sequence. More work is needed to show that it always finds a safe sequence if one exists. Tanenbaum observes that in practice nobody uses this algorithm given the unreasonable requirement that each process pre-specify its maximum demand.

7. Dealing with deadlock

There are several ways to detect and deal with deadlock. In increasing order of complexity:

  1. Do nothing. If the system locks up, that will teach the user not to try doing what they just did again. This is the approach taken in practice for many resource constraints.

  2. Kill processes. We can detect deadlock by looking for waiting cycles (which can be done retrospectively once we notice nobody is making as much progress as we like). Having done so, we can either kill every process on the cycle if we are feeling particularly bloodthirsty or kill one at a time until the cycle evaporates. In either case we need to be able to reset whatever resources the processes are holding to some sane state (which is weaker than full preemption, since we don't care if we can restore the previous state for the now-defunct process that used to be holding it). This is another reason why we design operating systems to be rebooted.

  3. Preempt and rollback. Like the previous approach, but instead of killing a process restore it to some earlier safe state where it doesn't hold the resource. This requires some sort of checkpointing or transaction mechanism, and again requires the ability to preempt resources.

A lot of research has been done on deadlock recovery in the database literature, where transactions (blocks of code that lock everything they touch but that are set up to be rolled back if preempted) provide a basic tool allowing the preempt-and-rollback strategy. Some of this work has started appearing in OS research, e.g. in Software_transactional_memory.

