[FrontPage] [TitleIndex] [WordIndex

Note: You are looking at a static copy of the former PineWiki site, used for class notes by James Aspnes from 2003 to 2012. Many mathematical formulas are broken, and there are likely to be other bugs as well. These will most likely not be fixed. You may be able to find more up-to-date versions of some of these notes at http://www.cs.yale.edu/homes/aspnes/#classes.

1. Persistent data

When a C program exits, all of its global variables, local variables, and heap-allocated blocks are lost. Its memory is reclaimed by the operating system, erased, and handed out to other programs. So what happens if you want to keep data around for later?

To make this problem concrete, let's suppose we want to keep track of a hit counter for web pages. From time to time, the user will run the command count_hit number where number is an integer value in the range 0 to 99, say. (A real application would probably be using urls, but let's keep things as simple as possible.) We want count_hit to print the number of times the page with the given number has been hit, i.e. 1 the first time it is called, 2 the next time, etc. Where can we store the counts so that they will survive to the next execution of count_hit?

2. A simple solution using text files

The simplest solution is probably to store the data in a text file. Here's a program that reads a file hits, increments the appropriate value, and the writes out a new version. To reduce the chances that data is lost (say if count_hit blows up halfway through writing the file), the new values are written to a new file hit~, which is then renamed to hit, taking the place of the previous version.

   1 #include <stdio.h>
   2 #include <stdlib.h>
   4 #define NUM_COUNTERS (100)      /* number of counters we keep track of */
   5 #define COUNTER_FILE "/tmp/hit" /* where they are stored */
   6 #define NEW_COUNTER_FILE COUNTER_FILE "~"  /* note use of constant string concatenation */
   8 int
   9 main(int argc, char **argv)
  10 {
  11     int c;
  12     int i;
  13     int counts[NUM_COUNTERS];
  14     FILE *f;
  16     if(argc < 2) {
  17         fprintf(stderr, "Usage: %s number\n", argv[0]);
  18         exit(1);
  19     }
  20     /* else */
  22     c = atoi(argv[1]);
  23     if(c < 0 || c > NUM_COUNTERS) {
  24         fprintf(stderr, "Counter %d not in range 0..%d\n", c, NUM_COUNTERS - 1);
  25         exit(2);
  26     }
  28     f = fopen(COUNTER_FILE, "r");
  29     if(f == 0) {
  30         perror(COUNTER_FILE);
  31         exit(3);
  32     }
  34     /* read them in */
  35     for(i = 0; i < NUM_COUNTERS; i++) {
  36         fscanf(f, "%d", &counts[i]);
  37     }
  38     fclose(f);
  40     printf("%d\n", ++counts[c]);
  42     /* write them back */
  43     f = fopen(NEW_COUNTER_FILE, "w");
  44     for(i = 0; i < NUM_COUNTERS; i++) {
  45         fprintf(f, "%d\n", counts[i]);
  46     }
  47     fclose(f);
  51     return 0;
  52 }

If you want to use this, you will need to create an initial file /tmp/hit with NUM_COUNTERS zeroes in it.

Using a simple text file like this is the easiest way to keep data around, since you can look at the file with a text editor or other tools if you want to do things to it. But it means that the program has to parse the file every time it runs. We can speed things up a little bit (and simplify the code) by storing the values in binary.

3. Using a binary file

Here's a version that stores the data as a binary file of exactly sizeof(int) * NUM_COUNTERS bytes. It uses the stdio routines fread and fwrite to read and write the file. These are much faster than the loops in the previous program, since they can just slap the bytes directly into counts without processing them at all.

The program also supplies and extra flag b to fopen. This is ignored on Unix-like machines but is needed on Windows machines to tell the operating system that the file contains binary data (such files are stored differently from text files on Windows).

   1 #include <stdio.h>
   2 #include <stdlib.h>
   4 #define NUM_COUNTERS (100)      /* number of counters we keep track of */
   5 #define COUNTER_FILE "/tmp/hit" /* where they are stored */
   6 #define NEW_COUNTER_FILE COUNTER_FILE "~"  /* note use of constant string concatenation */
   8 int
   9 main(int argc, char **argv)
  10 {
  11     int c;
  12     int counts[NUM_COUNTERS];
  13     FILE *f;
  15     if(argc < 2) {
  16         fprintf(stderr, "Usage: %s number\n", argv[0]);
  17         exit(1);
  18     }
  19     /* else */
  21     c = atoi(argv[1]);
  22     if(c < 0 || c > NUM_COUNTERS) {
  23         fprintf(stderr, "Counter %d not in range 0..%d\n", c, NUM_COUNTERS - 1);
  24         exit(2);
  25     }
  27     f = fopen(COUNTER_FILE, "rb");
  28     if(f == 0) {
  29         perror(COUNTER_FILE);
  30         exit(3);
  31     }
  33     /* read them in */
  34     fread(counts, sizeof(*counts), NUM_COUNTERS, f);
  35     fclose(f);
  37     printf("%d\n", ++counts[c]);
  39     /* write them back */
  40     f = fopen(NEW_COUNTER_FILE, "wb");
  41     fwrite(counts, sizeof(*counts), NUM_COUNTERS, f);
  42     fclose(f);
  46     return 0;
  47 }

Again, you'll have to initialize /tmp/hit to use this; in this case, you want it to contain exactly 400 nul characters. On a Linux machine you can do this with the command dd if=/dev/zero of=/tmp/hit bs=400 count=1.

The advantage of using binary files is that reading and writing them is both simpler and faster. The disadvantages are (a) you can't look at or update the binary data with your favorite text editor any more, and (b) the file may no longer be portable from one machine to another, if the different machines have different endianness or different values of sizeof(int). The second problem we can deal with by converting the data to a standard word size and byte order before storing it, but then we lose some advantages of speed.

4. A version that updates the file in place

We still may run into speed problems if NUM_COUNTERS is huge. The next program avoids rewriting the entire file just to update one value inside it. This program uses the fseek function to position the cursor inside the file. It opens the file using the "r+b" flag to fopen, which means to open an existing binary file for reading and writing.

   1 #include <stdio.h>
   2 #include <stdlib.h>
   4 #define NUM_COUNTERS (100)      /* number of counters we keep track of */
   5 #define COUNTER_FILE "/tmp/hit" /* where they are stored */
   7 int
   8 main(int argc, char **argv)
   9 {
  10     int c;
  11     int count;
  12     FILE *f;
  14     if(argc < 2) {
  15         fprintf(stderr, "Usage: %s number\n", argv[0]);
  16         exit(1);
  17     }
  18     /* else */
  20     c = atoi(argv[1]);
  21     if(c < 0 || c > NUM_COUNTERS) {
  22         fprintf(stderr, "Counter %d not in range 0..%d\n", c, NUM_COUNTERS - 1);
  23         exit(2);
  24     }
  26     f = fopen(COUNTER_FILE, "r+b");
  27     if(f == 0) {
  28         perror(COUNTER_FILE);
  29         exit(3);
  30     }
  32     /* read counter */
  33     fseek(f, sizeof(int) * c, SEEK_SET);
  34     fread(&count, sizeof(int), 1, f);
  36     printf("%d\n", ++count);
  38     /* write it back */
  39     fseek(f, sizeof(int) * c, SEEK_SET);
  40     fwrite(&count, sizeof(int), 1, f);
  41     fclose(f);
  43     return 0;
  44 }

Note that this program is not only shorter than the last one, but it also avoids allocating the counts array. It also is less likely to run into trouble with running out of space during writing. If we ignore issues of concurrency, this is the best we can probably do with just stdio.

5. An even better version using mmap

We can do even better using the mmap routine, available in all POSIX-compliant C libraries. POSIX, which is short for Portable Standard Unix, is supported by essentially all Unix-like operating systems and NT-based versions of Microsoft Windows. The mmap routine tells the operating system to "map" a file in the filesystem to a region in the process's address space. Reading bytes from this region will read from the file; writing bytes to this region will write to the file (although perhaps not immediately). Even better, if more than one process calls mmap on the same file at once, they will share the memory region, so that updates made by one process will be seen immediately by the others.1

Here is the program using mmap:

   1 #include <stdio.h>
   2 #include <stdlib.h>
   3 #include <sys/types.h>
   4 #include <sys/stat.h>
   5 #include <fcntl.h>
   6 #include <sys/mman.h>   /* For mmap.  I think mman is short for "memory management." */
   8 #define NUM_COUNTERS (100)      /* number of counters we keep track of */
   9 #define COUNTER_FILE "/tmp/hit" /* where they are stored */
  10 #define NEW_COUNTER_FILE COUNTER_FILE "~"  /* note use of constant string concatenation */
  12 int
  13 main(int argc, char **argv)
  14 {
  15     int c;
  16     int *counts;
  17     int fd;
  19     if(argc < 2) {
  20         fprintf(stderr, "Usage: %s number\n", argv[0]);
  21         exit(1);
  22     }
  23     /* else */
  25     c = atoi(argv[1]);
  26     if(c < 0 || c > NUM_COUNTERS) {
  27         fprintf(stderr, "Counter %d not in range 0..%d\n", c, NUM_COUNTERS - 1);
  28         exit(2);
  29     }
  31     /* open and map the file */
  32     fd = open(COUNTER_FILE, O_RDWR);
  33     if(fd < 0) {
  34         perror(COUNTER_FILE);
  35         exit(3);
  36     }
  37     counts = mmap(0, sizeof(*counts) * NUM_COUNTERS, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
  39     if(counts == 0) {
  40         perror(COUNTER_FILE);
  41         exit(4);
  42     }
  44     printf("%d\n", ++counts[c]);
  46     /* unmap the region and close the file just to be safe */
  47     munmap(counts, sizeof(*counts) * NUM_COUNTERS);
  48     close(fd);
  50     return 0;
  51 }

Now the code for actually incrementing counts[c] and writing it to the file is trivial. Unfortunately, we have left stdio behind, and have to deal with low-level POSIX calls like open and close to get at the file. Still, this may be the most efficient version we can do, and becomes even better if we plan to do many updates to the same file, since we can just keep the file open.

6. Concurrency and fault-tolerance issues: ACIDitiy

All of the solutions described so far can fail if you run two copies of count_hits simultaneously. The mmap solution is probably the least vulnerable to failures, as the worst that can happen is that some update is lost if the same locations is updated at exactly the same time. The other solutions can fail more spectacularly; simultaneous writes to /tmp/hit~ in the simple text file version, for example, can produce a wide variety of forms of file corruption. For a simple web page hit counter, this may not be a problem. If you are writing a back-end for a bank, you probably want something less vulnerable.

Database writers aim for a property called ACIDity from the acronym ACID = Atomicity, Consistency, Isolation, and Durability. These are defined for a system in which the database is accessed via transactions consisting of one or more operations. An example of a transaction might be ++counts[c], which we can think of as consisting of two operations: reading counts[c], and writing back counts[c]+1.

Atomicity means that either every operation in a transaction is performed or none is. In practice, this means if the transaction fails any partial progress must be undone.

Consistency means that at the end of a transaction the database is in a "consistent" state. This may just mean that no data has been corrupted (e.g. in the text data file we have exactly 100 lines and they're all integer counts), or it may also extend to integrity constraints enforce by the database (e.g. in a database of airline flights, the fact that flight 2937 lands at HVN at 22:34 on 12/17 implies that flight 2937 exists, has an assigned pilot, etc.).

Isolation says that two concurrent transactions can't detect each other; the partial progress of one transaction is not visible to others until the transaction commits.

Durability means that the results of any committed transaction are permanent. In practice this means there is enough information physically written to a disk to reconstruct the transaction before the transaction finished.

How can we enforce these requirements for our hit counter? Atomicity is not hard: if I stop a transaction after a read but before the write, no one will be the wiser (although there is a possible problem if only half of my write succeeds). Consistency is enforced by the fseek and mmap solutions, since they can't change the structure of the file. Isolation is not provided by any of our solutions, and would require some sort of locking (e.g. using flock) to make sure that only one program uses the file at a time. Durability is enforced by not having count_hits return until the fclose or close operation has succeeded (although full durability would require running fsync or msync to actually guarantee data was written to disk).

Though it would be possible to provide full ACIDity with enough work, this is a situation where using an existing well-debugged tool beats writing our own. Depending on what we are allowed to do to the machine our program is running on, we have many options for getting much better handling of concurrency. Some standard tools we could use are:


  1. It is still possible to lose updates if you are very unlucky. (1)

2014-06-17 11:57