[FrontPage] [TitleIndex] [WordIndex

Note: You are looking at a static copy of the former PineWiki site, used for class notes by James Aspnes from 2003 to 2012. Many mathematical formulas are broken, and there are likely to be other bugs as well. These will most likely not be fixed. You may be able to find more up-to-date versions of some of these notes at http://www.cs.yale.edu/homes/aspnes/#classes.

It is a truth universally acknowledged that test code should be written early in the development process. Unfortunately, most programmers (including me) tend to assume that a program will work on the first attempt and there's not much point in testing it anyway, so writing and running test code often gets deferred indefinitely. The solution is to write the test code first, and run it directly from your Makefile every time you save and compile your program. Not only will this guarantee that your program actually works when you are done (or at least passes the tests you thought of), it allows you to see how the program is improving with each positive change, and prevents you from accidentally making new negative changes that break things that used to work.

Going on step further, we can often write our interface and test code first, build a non-working stub implementation, and then slowly flesh out the missing pieces until the implementation passes all the tests. This way there is always some obvious step to do next, and we don't find ourselves stuck staring at an empty file.

1. Setting up the test harness

1.1. Module interface

The module will be a stack for storing integers.

Let's start with the interface, which we'll put in a file called stack.h:

1.1.1. stack.h

   1 /*
   2  * This is an "opaque struct"; it discourages people from looking at
   3  * the inside of our structure.  The actual definiton of struct stack
   4  * is contained in stack.c.
   5  */
   6 typedef struct stack *Stack;
   7 
   8 /* constructor and destructor */
   9 Stack stack_create(void);              /* returns 0 on allocation error */
  10 void stack_destroy(Stack);
  11 
  12 /* push a new element onto the stack */
  13 void stack_push(Stack , int new_element);
  14 
  15 /* return 1 if the stack is empty, 0 otherwise */
  16 int stack_isempty(Stack);
  17 
  18 /* remove and return top element of stack */
  19 /* returns STACK_EMPTY if stack is empty */
  20 #define STACK_EMPTY (-1)
  21 int stack_pop(Stack);

Our intent is that an Stack acts like a stack--- we push things onto it using stack_push, and then pull them off again in reverse order using stack_pop. Ideally, we don't ever pop the stack when it's empty (which we can detect using stack_isempty), but if we do, we have stack_pop return something well-defined.

1.2. Test code

Let's write some test code to try this out. Because our initial stack implementation may be exceptionally bug-ridden, we'll use a test harness that provides macros for detecting and intercepting segmentational faults and similar disasters. The various testing wrappers are defined in the files tester.h and tester.c, shown in the Appendix; you should feel free to use it for your own purposes. I've added line numbers in comments to all the TEST lines so we can find them again later.

1.2.1. test-stack.c

   1 #include <stdio.h>
   2 #include <setjmp.h>
   3 #include <signal.h>
   4 #include <unistd.h>
   5 
   6 #include <stdlib.h>
   7 
   8 #include "stack.h"
   9 #include "tester.h"
  10 
  11 #define STRESS_TEST_ITERATIONS (1000000)
  12 
  13 int
  14 main(int argc, char **argv)
  15 {
  16     Stack s;
  17     int i;
  18 
  19     tester_init();
  20 
  21     /* first we need to build one */
  22     TRY { s = stack_create(); } ENDTRY;
  23 
  24     /* 25 */ TEST_ASSERT(s != 0);
  25 
  26     /* now we'll try pushing and popping a bit */
  27     TRY { stack_push(s, 1); } ENDTRY;
  28     TRY { stack_push(s, 2); } ENDTRY;
  29     TRY { stack_push(s, 3); } ENDTRY;
  30 
  31     /* 32 */ TEST(stack_isempty(s), 0);
  32     /* 33 */ TEST(stack_pop(s), 3);
  33     /* 34 */ TEST(stack_isempty(s), 0);
  34     /* 35 */ TEST(stack_pop(s), 2);
  35     /* 36 */ TEST(stack_isempty(s), 0);
  36     /* 37 */ TEST(stack_pop(s), 1);
  37     /* 38 */ TEST(stack_isempty(s), 1);
  38     /* 39 */ TEST(stack_pop(s), STACK_EMPTY);
  39     /* 40 */ TEST(stack_isempty(s), 1);
  40     /* 41 */ TEST(stack_pop(s), STACK_EMPTY);
  41 
  42     /* can we still push after popping too much? */
  43     TRY { stack_push(s, 4); } ENDTRY;
  44     /* 45 */ TEST(stack_isempty(s), 0);
  45     /* 46 */ TEST(stack_pop(s), 4);
  46     /* 47 */ TEST(stack_isempty(s), 1);
  47     /* 48 */ TEST(stack_pop(s), STACK_EMPTY);
  48     /* 49 */ TEST(stack_isempty(s), 1);
  49 
  50     /* let's do some stress testing */
  51     /* we won't use TEST for this because we might get too much output */
  52     TRY {
  53         for(i = 0; i < STRESS_TEST_ITERATIONS; i++) {
  54             stack_push(s, i);
  55         }
  56         for(i = 0; i < STRESS_TEST_ITERATIONS; i++) {
  57             stack_push(s, 957);
  58             if(stack_pop(s) != 957) {
  59                 /* 60 */ FAIL("wanted 957 but didn't get it");
  60                 abort();
  61             }
  62         }
  63         for(i = STRESS_TEST_ITERATIONS - 1; i >= 0; i--) {
  64             if(stack_isempty(s)) {
  65                 /* 66 */ FAIL("stack empty too early");
  66                 abort();
  67             }
  68             if(stack_pop(s) != i) {
  69                 /* 70 */ FAIL("got wrong value!");
  70                 abort();
  71             }
  72         }
  73     } ENDTRY; /* 74 */
  74 
  75     /* 76 */ TEST(stack_isempty(s), 1);
  76 
  77     TRY { stack_destroy(s); } ENDTRY;
  78 
  79     tester_report(stdout, argv[0]);
  80     return tester_result();
  81 }

There is a lot of test code here. In practice, we might write just a few tests to start off with, and, to be honest, I didn't write all of this at once. But you can never have too many tests--- if nothing else, they give an immediate sense of gratification as the number of failed tests drops.

1.3. Makefile

1.3.1. Makefile

CC=gcc
CFLAGS=-g3 -Wall -ansi -pedantic

all:

test: test-stack
        ./test-stack 
        @echo OK!

test-stack: test-stack.o tester.o stack.o
        $(CC) $(CFLAGS) -o $@ $^

test-stack.o: stack.h tester.h
stack.o: stack.h

Note that we don't provide a convenient shortcut for building test-stack without running it. That's because we want to run the test code every single time.

2. Stub implementation

Of course, we still can't compile anything, because we don't have any implementation. Let's fix that. To make it easy to write, we will try to add as little as possible to what we already have in stack.h:

2.1. stack.c

   1 #include <stdlib.h>     
   2 #include "stack.h"
   3 
   4 struct stack { int dummy; };
   5 Stack stack_create(void) { return malloc(sizeof(struct stack)); }
   6 void stack_destroy(Stack s) { free(s); }
   7 void stack_push(Stack s, int elem) { ; }
   8 int stack_pop(Stack s) { return STACK_EMPTY; }
   9 int stack_isempty(Stack s) { return 1; }

Will this work? Of course not. There's hardly any code! But maybe it will compile if we run make test:

$ make test
gcc -g3 -Wall -ansi -pedantic   -c -o test-stack.o test-stack.c
gcc -g3 -Wall -ansi -pedantic   -c -o tester.o tester.c
gcc -g3 -Wall -ansi -pedantic   -c -o stack.o stack.c
gcc -g3 -Wall -ansi -pedantic -o test-stack test-stack.o tester.o stack.o
./test-stack 
test-stack.c:32: TEST FAILED: stack_isempty(s) -> 1 but expected 0
test-stack.c:33: TEST FAILED: stack_pop(s) -> -1 but expected 3
test-stack.c:34: TEST FAILED: stack_isempty(s) -> 1 but expected 0
test-stack.c:35: TEST FAILED: stack_pop(s) -> -1 but expected 2
test-stack.c:36: TEST FAILED: stack_isempty(s) -> 1 but expected 0
test-stack.c:37: TEST FAILED: stack_pop(s) -> -1 but expected 1
test-stack.c:45: TEST FAILED: stack_isempty(s) -> 1 but expected 0
test-stack.c:46: TEST FAILED: stack_pop(s) -> -1 but expected 4
test-stack.c:60: wanted 957 but didn't get it
test-stack.c:74: Aborted (signal 6)
./test-stack: errors 8/17, signals 1, FAILs 1
make[1]: *** [test] Error 8

Hooray! It compiles on the first try! (Well, not really, but let's pretend it did.) Unfortunately, it only passes any tests at all by pure dumb luck. But now we just need to get the code to pass a few more tests.

3. Bounded-space implementation

Here's a first attempt at a stack that suffers from some artificial limits. We retain the structure of the original broken implementation, we just put a few more lines of code in and format it more expansively.

3.1. stack.c

   1 #include <stdlib.h>     
   2 #include "stack.h"
   3 
   4 #define MAX_STACK_SIZE (100)
   5 
   6 struct stack { 
   7     int top;
   8     int data[MAX_STACK_SIZE];
   9 };
  10 
  11 Stack 
  12 stack_create(void)
  13 {
  14     struct stack *s;
  15     
  16     s = malloc(sizeof(*s));
  17     s->top = 0;
  18     return s;
  19 }
  20 
  21 void
  22 stack_destroy(Stack s)
  23 {
  24     free(s);
  25 }
  26 
  27 void
  28 stack_push(Stack s, int elem)
  29 {
  30     s->data[(s->top)++] = elem;
  31 }
  32 
  33 int
  34 stack_pop(Stack s)
  35 {
  36     return s->data[--(s->top)];
  37 }
  38 
  39 int
  40 stack_isempty(Stack s)
  41 {
  42     return s->top == 0;
  43 }

Let's see what happens now:

$ make test
gcc -g3 -Wall -ansi -pedantic   -c -o test-stack.o test-stack.c
gcc -g3 -Wall -ansi -pedantic   -c -o tester.o tester.c
gcc -g3 -Wall -ansi -pedantic   -c -o stack.o stack.c
gcc -g3 -Wall -ansi -pedantic -o test-stack test-stack.o tester.o stack.o
./test-stack 
test-stack.c:40: TEST FAILED: stack_isempty(s) -> 0 but expected 1
test-stack.c:41: TEST FAILED: stack_pop(s) -> 409 but expected -1
test-stack.c:47: TEST FAILED: stack_isempty(s) -> 0 but expected 1
test-stack.c:48: TEST FAILED: stack_pop(s) -> 0 but expected -1
test-stack.c:49: TEST FAILED: stack_isempty(s) -> 0 but expected 1
test-stack.c:74: Segmentation fault (signal 11)
test-stack.c:76: TEST FAILED: stack_isempty(s) -> 0 but expected 1
free(): invalid pointer 0x804b830!
./test-stack: errors 6/17, signals 1, FAILs 0
make[1]: *** [test] Error 6

There are still errors, but we get past several initial tests before things blow up. Looking back at the line numbers in test-stack.c, we see that the first failed test is the one that checks if the stack is empty after we pop from an empty stack. The code for stack_isempty looks pretty clean, so what happened? Somewhere s->top got set to a nonzero value, and the only place this can happen is inside stack_pop. Aha! There's no check in stack_pop for an empty stack, so it's decrementing s->top past 0. (Exercise: why didn't the test of stack_pop fail?)

4. First fix

If we're lucky, fixing this problem will make the later tests happier. Let's try a new version of stack_pop. We'll leave everything else the same.

   1 int
   2 stack_pop(Stack s)
   3 {
   4     if(stack_isempty(s)) {
   5         return STACK_EMPTY;
   6     } else {
   7         return s->data[--(s->top)];
   8     }

}

And now we get:

$ make test
gcc -g3 -Wall -ansi -pedantic   -c -o test-stack.o test-stack.c
gcc -g3 -Wall -ansi -pedantic   -c -o tester.o tester.c
gcc -g3 -Wall -ansi -pedantic   -c -o stack.o stack.c
gcc -g3 -Wall -ansi -pedantic -o test-stack test-stack.o tester.o stack.o
./test-stack 
test-stack.c:74: Segmentation fault (signal 11)
test-stack.c:76: TEST FAILED: stack_isempty(s) -> 0 but expected 1
./test-stack: errors 1/17, signals 1, FAILs 0
make[1]: *** [test] Error 1

Which is much nicer. We are still failing the stress test, but that's not terribly surprising.

5. Final version

After some more tinkering, this is what I ended up with. This version uses a malloc'd data field, and realloc's it when the stack gets too big.

5.1. stack.c

   1 #include <stdlib.h>     
   2 #include "stack.h"
   3 
   4 struct stack { 
   5     int top;    /* first unused slot in data */
   6     int size;   /* number of slots in data */
   7     int *data;  /* stack contents */
   8 };
   9 
  10 #define INITIAL_STACK_SIZE (1)
  11 #define STACK_SIZE_MULTIPLIER (2)
  12 
  13 Stack 
  14 stack_create(void)
  15 {
  16     struct stack *s;
  17 
  18     s = malloc(sizeof(*s));
  19     if(s == 0) return 0;
  20     
  21     s->top = 0;
  22     s->size = INITIAL_STACK_SIZE;
  23     s->data = malloc(s->size * sizeof(*(s->data)));
  24     if(s->data == 0) return 0;
  25 
  26     /* else everything is ok */
  27     return s;
  28 }
  29 
  30 void
  31 stack_destroy(Stack s)
  32 {
  33     free(s->data);
  34     free(s);
  35 }
  36 
  37 void
  38 stack_push(Stack s, int elem)
  39 {
  40     if(s->top == s->size) {
  41         /* need more space */
  42         s->size *= STACK_SIZE_MULTIPLIER;
  43         s->data = realloc(s->data, s->size * sizeof(*(s->data)));
  44         if(s->data == 0) {
  45             abort();    /* we have no other way to signal failure :-( */
  46         }
  47     }
  48     /* now there is enough room */
  49     s->data[s->top++] = elem;
  50 }
  51 
  52 int
  53 stack_pop(Stack s)
  54 {
  55     if(stack_isempty(s)) {
  56         return STACK_EMPTY;
  57     } else {
  58         return s->data[--(s->top)];
  59     }
  60 }
  61 
  62 int
  63 stack_isempty(Stack s)
  64 {
  65     return s->top == 0;
  66 }

At last we have a version that passes all tests:

$ make test
gcc -g3 -Wall -ansi -pedantic   -c -o test-stack.o test-stack.c
gcc -g3 -Wall -ansi -pedantic   -c -o tester.o tester.c
gcc -g3 -Wall -ansi -pedantic   -c -o stack.o stack.c
gcc -g3 -Wall -ansi -pedantic -o test-stack test-stack.o tester.o stack.o
./test-stack 
OK!

6. Moral

Writing a big program all at once is hard. If you can break the problem down into little problems, it becomes easier. "Test first" is a strategy not just for getting a well-tested program, but for giving you something easy to do at each step--- it's usually not too hard to write one more test, and it's usually not too hard to get just one test working. If you can keep taking those small, easy steps, eventually you will run out of failed tests and have a working program.

7. Appendix: Test macros

7.1. tester.h

   1 /*
   2  * Test macros.
   3  * 
   4  * Usage:
   5  *
   6  * #include <setjmp.h>
   7  * #include <stdio.h>
   8  * #include <signal.h>
   9  * #include <unistd.h>
  10  *
  11  * tester_init();                  -- Initialize internal data structures.
  12  * tester_report(FILE *, "name");  -- Print report.
  13  * tester_result();                -- Returns # of failed tests.
  14  *
  15  * TRY { code } ENDTRY;
  16  *
  17  * Wraps code to catch seg faults, illegal instructions, etc.  May not be
  18  * nested.
  19  * Prints a warning if a signal is caught.
  20  * To enforce a maximum time, set alarm before entering.
  21  *
  22  * TEST(expr, expected_value);
  23  *
  24  * Evaluates expr (which should yield an integer value) inside a TRY.
  25  * Prints a warning if evaluating expr causes a fault or returns a value
  26  * not equal to expected_value.
  27  *
  28  * TEST_ASSERT(expr)
  29  *
  30  * Equivalent to TEST(!(expr), 0)
  31  *
  32  * You can also cause your own failures with FAIL:
  33  *
  34  * TRY {
  35  *     x = 1;
  36  *     if(x == 2) FAIL("why is x 2?");
  37  * } ENDTRY;
  38  *
  39  * To limit the time taken by a test, call tester_set_time_limit with
  40  * a new limit in seconds, e.g.
  41  *
  42  * tester_set_time_limit(1);
  43  * TRY { while(1); } ENDTRY;
  44  *
  45  * There is an initial default limit of 10 seconds.
  46  * If you don't want any limit, set the limit to 0.
  47  *
  48  */
  49 
  50 /* global data used by macros */
  51 /* nothing in here should be modified directly */
  52 extern struct tester_global_data {
  53     jmp_buf escape_hatch;       /* jump here on surprise signals */
  54     int escape_hatch_active;    /* true if escape hatch is usable */
  55     int tests;                  /* number of tests performed */
  56     int errors;                 /* number of tests failed */
  57     int signals;                /* number of signals caught */
  58     int expr_value;             /* expression value */
  59     int setjmp_return;          /* return value from setjmp */
  60     int try_failed;             /* true if last try failed */
  61     int user_fails;             /* number of calls to FAIL */
  62     int time_limit;             /* time limit for TRY */
  63 } TesterData;
  64 
  65 /* set up system; call this before using macros */
  66 void tester_init(void);
  67 
  68 /* prints a summary report of all errors to f, prefixed with preamble */
  69 /* If there were no errors, nothing is printed */
  70 void tester_report(FILE *f, const char *preamble);
  71 
  72 /* returns number of errors so far. */
  73 int tester_result(void);
  74 
  75 /* set a time limit t for TRY, TEST, TEST_ASSERT etc. */
  76 /* After t seconds, an ALARM signal will interrupt the test. */
  77 /* Set t = 0 to have no time limit. */
  78 /* Default time limit is 10 seconds. */
  79 void tester_set_time_limit(int t);        
  80 
  81 const char *tester_strsignal(int);      /* internal hack; don't use this */
  82 
  83 /* gruesome non-syntactic macros */
  84 #define TRY \
  85     TesterData.try_failed = 0; \
  86     alarm(TesterData.time_limit); \
  87     if(((TesterData.setjmp_return = setjmp(TesterData.escape_hatch)) == 0) \
  88             && (TesterData.escape_hatch_active = 1) /* one = is correct*/)
  89 #define ENDTRY else { \
  90         fprintf(stderr, "%s:%d: %s (signal %d)\n", \
  91             __FILE__, __LINE__, \
  92             tester_strsignal(TesterData.setjmp_return), \
  93             TesterData.setjmp_return); \
  94         TesterData.signals++; \
  95         TesterData.try_failed = 1; \
  96     } \
  97     alarm(0); \
  98     TesterData.escape_hatch_active = 0
  99 
 100 /* another atrocity */
 101 #define TEST(expr, expected_value) \
 102     TesterData.tests++; \
 103     TesterData.errors++; /* guilty until proven innocent */ \
 104     TRY { TesterData.expr_value = (expr); \
 105         if(TesterData.expr_value != expected_value) { \
 106             fprintf(stderr, "%s:%d: TEST FAILED: %s -> %d but expected %d\n", \
 107                     __FILE__, __LINE__, __STRING(expr), \
 108                     TesterData.expr_value, expected_value); \
 109         } else { \
 110             TesterData.errors--; \
 111         } \
 112     } \
 113     ENDTRY; \
 114     if(TesterData.try_failed) \
 115         fprintf(stderr, "%s:%d: TEST FAILED: %s caught signal\n", \
 116                 __FILE__, __LINE__, __STRING(expr))
 117 
 118 #define TEST_ASSERT(expr) TEST((expr) != 0, 1)
 119 #define FAIL(msg) \
 120     (fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__, (msg)), \
 121              TesterData.user_fails++, \
 122              TesterData.try_failed = 1)
 123 

7.2. tester.c

   1 #define _GNU_SOURCE     /* get strsignal def */
   2 
   3 #include <stdio.h>
   4 #include <signal.h>
   5 #include <string.h>
   6 #include <setjmp.h>
   7 
   8 #include "tester.h"
   9 
  10 struct tester_global_data TesterData;
  11 
  12 const char *
  13 tester_strsignal(int sig)
  14 {
  15     return strsignal(sig);
  16 }
  17 
  18 static void
  19 tester_sighandler(int signal)
  20 {
  21     if(TesterData.escape_hatch_active) {
  22         TesterData.escape_hatch_active = 0;
  23         longjmp(TesterData.escape_hatch, signal);
  24     }
  25 }
  26 
  27 void
  28 tester_init(void)
  29 {
  30     TesterData.escape_hatch_active = 0;
  31     TesterData.tests = 0;
  32     TesterData.errors = 0;
  33     TesterData.signals = 0;
  34     TesterData.user_fails = 0;
  35 
  36     signal(SIGSEGV, tester_sighandler);
  37     signal(SIGILL, tester_sighandler);
  38     signal(SIGFPE, tester_sighandler);
  39     signal(SIGALRM, tester_sighandler);
  40     signal(SIGBUS, tester_sighandler);
  41     signal(SIGABRT, tester_sighandler);
  42 }
  43 
  44 void
  45 tester_report(FILE *f, const char *preamble)
  46 {
  47     if(TesterData.errors != 0 || TesterData.signals != 0) {
  48         fprintf(f, "%s: errors %d/%d, signals %d, FAILs %d\n", 
  49                 preamble,
  50                 TesterData.errors, 
  51                 TesterData.tests,
  52                 TesterData.signals,
  53                 TesterData.user_fails);
  54     }
  55 }
  56 
  57 int
  58 tester_result(void)
  59 {
  60     return TesterData.errors;
  61 }
  62 
  63 void
  64 tester_set_time_limit(int t)
  65 {
  66     TesterData.time_limit = t;
  67 }


CategoryProgrammingNotes


2014-06-17 11:58