It is a truth universally acknowledged that test code should be written early in the development process. Unfortunately, most programmers (including me) tend to assume that a program will work on the first attempt and there's not much point in testing it anyway, so writing and running test code often gets deferred indefinitely. The solution is to write the test code first, and run it directly from your Makefile every time you save and compile your program. Not only will this guarantee that your program actually works when you are done (or at least passes the tests you thought of), it allows you to see how the program is improving with each positive change, and prevents you from accidentally making new negative changes that break things that used to work.
Going on step further, we can often write our interface and test code first, build a non-working stub implementation, and then slowly flesh out the missing pieces until the implementation passes all the tests. This way there is always some obvious step to do next, and we don't find ourselves stuck staring at an empty file.
Contents
1. Setting up the test harness
1.1. Module interface
The module will be a stack for storing integers.
Let's start with the interface, which we'll put in a file called stack.h:
1.1.1. stack.h
1 /*
2 * This is an "opaque struct"; it discourages people from looking at
3 * the inside of our structure. The actual definiton of struct stack
4 * is contained in stack.c.
5 */
6 typedef struct stack *Stack;
7
8 /* constructor and destructor */
9 Stack stack_create(void); /* returns 0 on allocation error */
10 void stack_destroy(Stack);
11
12 /* push a new element onto the stack */
13 void stack_push(Stack , int new_element);
14
15 /* return 1 if the stack is empty, 0 otherwise */
16 int stack_isempty(Stack);
17
18 /* remove and return top element of stack */
19 /* returns STACK_EMPTY if stack is empty */
20 #define STACK_EMPTY (-1)
21 int stack_pop(Stack);
Our intent is that an Stack acts like a stack--- we push things onto it using stack_push, and then pull them off again in reverse order using stack_pop. Ideally, we don't ever pop the stack when it's empty (which we can detect using stack_isempty), but if we do, we have stack_pop return something well-defined.
1.2. Test code
Let's write some test code to try this out. Because our initial stack implementation may be exceptionally bug-ridden, we'll use a test harness that provides macros for detecting and intercepting segmentational faults and similar disasters. The various testing wrappers are defined in the files tester.h and tester.c, shown in the Appendix; you should feel free to use it for your own purposes. I've added line numbers in comments to all the TEST lines so we can find them again later.
1.2.1. test-stack.c
1 #include <stdio.h>
2 #include <setjmp.h>
3 #include <signal.h>
4 #include <unistd.h>
5
6 #include <stdlib.h>
7
8 #include "stack.h"
9 #include "tester.h"
10
11 #define STRESS_TEST_ITERATIONS (1000000)
12
13 int
14 main(int argc, char **argv)
15 {
16 Stack s;
17 int i;
18
19 tester_init();
20
21 /* first we need to build one */
22 TRY { s = stack_create(); } ENDTRY;
23
24 /* 25 */ TEST_ASSERT(s != 0);
25
26 /* now we'll try pushing and popping a bit */
27 TRY { stack_push(s, 1); } ENDTRY;
28 TRY { stack_push(s, 2); } ENDTRY;
29 TRY { stack_push(s, 3); } ENDTRY;
30
31 /* 32 */ TEST(stack_isempty(s), 0);
32 /* 33 */ TEST(stack_pop(s), 3);
33 /* 34 */ TEST(stack_isempty(s), 0);
34 /* 35 */ TEST(stack_pop(s), 2);
35 /* 36 */ TEST(stack_isempty(s), 0);
36 /* 37 */ TEST(stack_pop(s), 1);
37 /* 38 */ TEST(stack_isempty(s), 1);
38 /* 39 */ TEST(stack_pop(s), STACK_EMPTY);
39 /* 40 */ TEST(stack_isempty(s), 1);
40 /* 41 */ TEST(stack_pop(s), STACK_EMPTY);
41
42 /* can we still push after popping too much? */
43 TRY { stack_push(s, 4); } ENDTRY;
44 /* 45 */ TEST(stack_isempty(s), 0);
45 /* 46 */ TEST(stack_pop(s), 4);
46 /* 47 */ TEST(stack_isempty(s), 1);
47 /* 48 */ TEST(stack_pop(s), STACK_EMPTY);
48 /* 49 */ TEST(stack_isempty(s), 1);
49
50 /* let's do some stress testing */
51 /* we won't use TEST for this because we might get too much output */
52 TRY {
53 for(i = 0; i < STRESS_TEST_ITERATIONS; i++) {
54 stack_push(s, i);
55 }
56 for(i = 0; i < STRESS_TEST_ITERATIONS; i++) {
57 stack_push(s, 957);
58 if(stack_pop(s) != 957) {
59 /* 60 */ FAIL("wanted 957 but didn't get it");
60 abort();
61 }
62 }
63 for(i = STRESS_TEST_ITERATIONS - 1; i >= 0; i--) {
64 if(stack_isempty(s)) {
65 /* 66 */ FAIL("stack empty too early");
66 abort();
67 }
68 if(stack_pop(s) != i) {
69 /* 70 */ FAIL("got wrong value!");
70 abort();
71 }
72 }
73 } ENDTRY; /* 74 */
74
75 /* 76 */ TEST(stack_isempty(s), 1);
76
77 TRY { stack_destroy(s); } ENDTRY;
78
79 tester_report(stdout, argv[0]);
80 return tester_result();
81 }
There is a lot of test code here. In practice, we might write just a few tests to start off with, and, to be honest, I didn't write all of this at once. But you can never have too many tests--- if nothing else, they give an immediate sense of gratification as the number of failed tests drops.
1.3. Makefile
Finally, we'll write a Makefile:
1.3.1. Makefile
CC=gcc CFLAGS=-g3 -Wall -ansi -pedantic all: test: test-stack ./test-stack @echo OK! test-stack: test-stack.o tester.o stack.o $(CC) $(CFLAGS) -o $@ $^ test-stack.o: stack.h tester.h stack.o: stack.h
Note that we don't provide a convenient shortcut for building test-stack without running it. That's because we want to run the test code every single time.
2. Stub implementation
Of course, we still can't compile anything, because we don't have any implementation. Let's fix that. To make it easy to write, we will try to add as little as possible to what we already have in stack.h:
2.1. stack.c
1 #include <stdlib.h>
2 #include "stack.h"
3
4 struct stack { int dummy; };
5 Stack stack_create(void) { return malloc(sizeof(struct stack)); }
6 void stack_destroy(Stack s) { free(s); }
7 void stack_push(Stack s, int elem) { ; }
8 int stack_pop(Stack s) { return STACK_EMPTY; }
9 int stack_isempty(Stack s) { return 1; }
Will this work? Of course not. There's hardly any code! But maybe it will compile if we run make test:
$ make test gcc -g3 -Wall -ansi -pedantic -c -o test-stack.o test-stack.c gcc -g3 -Wall -ansi -pedantic -c -o tester.o tester.c gcc -g3 -Wall -ansi -pedantic -c -o stack.o stack.c gcc -g3 -Wall -ansi -pedantic -o test-stack test-stack.o tester.o stack.o ./test-stack test-stack.c:32: TEST FAILED: stack_isempty(s) -> 1 but expected 0 test-stack.c:33: TEST FAILED: stack_pop(s) -> -1 but expected 3 test-stack.c:34: TEST FAILED: stack_isempty(s) -> 1 but expected 0 test-stack.c:35: TEST FAILED: stack_pop(s) -> -1 but expected 2 test-stack.c:36: TEST FAILED: stack_isempty(s) -> 1 but expected 0 test-stack.c:37: TEST FAILED: stack_pop(s) -> -1 but expected 1 test-stack.c:45: TEST FAILED: stack_isempty(s) -> 1 but expected 0 test-stack.c:46: TEST FAILED: stack_pop(s) -> -1 but expected 4 test-stack.c:60: wanted 957 but didn't get it test-stack.c:74: Aborted (signal 6) ./test-stack: errors 8/17, signals 1, FAILs 1 make[1]: *** [test] Error 8
Hooray! It compiles on the first try! (Well, not really, but let's pretend it did.) Unfortunately, it only passes any tests at all by pure dumb luck. But now we just need to get the code to pass a few more tests.
3. Bounded-space implementation
Here's a first attempt at a stack that suffers from some artificial limits. We retain the structure of the original broken implementation, we just put a few more lines of code in and format it more expansively.
3.1. stack.c
1 #include <stdlib.h>
2 #include "stack.h"
3
4 #define MAX_STACK_SIZE (100)
5
6 struct stack {
7 int top;
8 int data[MAX_STACK_SIZE];
9 };
10
11 Stack
12 stack_create(void)
13 {
14 struct stack *s;
15
16 s = malloc(sizeof(*s));
17 s->top = 0;
18 return s;
19 }
20
21 void
22 stack_destroy(Stack s)
23 {
24 free(s);
25 }
26
27 void
28 stack_push(Stack s, int elem)
29 {
30 s->data[(s->top)++] = elem;
31 }
32
33 int
34 stack_pop(Stack s)
35 {
36 return s->data[--(s->top)];
37 }
38
39 int
40 stack_isempty(Stack s)
41 {
42 return s->top == 0;
43 }
Let's see what happens now:
$ make test gcc -g3 -Wall -ansi -pedantic -c -o test-stack.o test-stack.c gcc -g3 -Wall -ansi -pedantic -c -o tester.o tester.c gcc -g3 -Wall -ansi -pedantic -c -o stack.o stack.c gcc -g3 -Wall -ansi -pedantic -o test-stack test-stack.o tester.o stack.o ./test-stack test-stack.c:40: TEST FAILED: stack_isempty(s) -> 0 but expected 1 test-stack.c:41: TEST FAILED: stack_pop(s) -> 409 but expected -1 test-stack.c:47: TEST FAILED: stack_isempty(s) -> 0 but expected 1 test-stack.c:48: TEST FAILED: stack_pop(s) -> 0 but expected -1 test-stack.c:49: TEST FAILED: stack_isempty(s) -> 0 but expected 1 test-stack.c:74: Segmentation fault (signal 11) test-stack.c:76: TEST FAILED: stack_isempty(s) -> 0 but expected 1 free(): invalid pointer 0x804b830! ./test-stack: errors 6/17, signals 1, FAILs 0 make[1]: *** [test] Error 6
There are still errors, but we get past several initial tests before things blow up. Looking back at the line numbers in test-stack.c, we see that the first failed test is the one that checks if the stack is empty after we pop from an empty stack. The code for stack_isempty looks pretty clean, so what happened? Somewhere s->top got set to a nonzero value, and the only place this can happen is inside stack_pop. Aha! There's no check in stack_pop for an empty stack, so it's decrementing s->top past 0. (Exercise: why didn't the test of stack_pop fail?)
4. First fix
If we're lucky, fixing this problem will make the later tests happier. Let's try a new version of stack_pop. We'll leave everything else the same.
}
And now we get:
$ make test gcc -g3 -Wall -ansi -pedantic -c -o test-stack.o test-stack.c gcc -g3 -Wall -ansi -pedantic -c -o tester.o tester.c gcc -g3 -Wall -ansi -pedantic -c -o stack.o stack.c gcc -g3 -Wall -ansi -pedantic -o test-stack test-stack.o tester.o stack.o ./test-stack test-stack.c:74: Segmentation fault (signal 11) test-stack.c:76: TEST FAILED: stack_isempty(s) -> 0 but expected 1 ./test-stack: errors 1/17, signals 1, FAILs 0 make[1]: *** [test] Error 1
Which is much nicer. We are still failing the stress test, but that's not terribly surprising.
5. Final version
After some more tinkering, this is what I ended up with. This version uses a malloc'd data field, and realloc's it when the stack gets too big.
5.1. stack.c
1 #include <stdlib.h>
2 #include "stack.h"
3
4 struct stack {
5 int top; /* first unused slot in data */
6 int size; /* number of slots in data */
7 int *data; /* stack contents */
8 };
9
10 #define INITIAL_STACK_SIZE (1)
11 #define STACK_SIZE_MULTIPLIER (2)
12
13 Stack
14 stack_create(void)
15 {
16 struct stack *s;
17
18 s = malloc(sizeof(*s));
19 if(s == 0) return 0;
20
21 s->top = 0;
22 s->size = INITIAL_STACK_SIZE;
23 s->data = malloc(s->size * sizeof(*(s->data)));
24 if(s->data == 0) return 0;
25
26 /* else everything is ok */
27 return s;
28 }
29
30 void
31 stack_destroy(Stack s)
32 {
33 free(s->data);
34 free(s);
35 }
36
37 void
38 stack_push(Stack s, int elem)
39 {
40 if(s->top == s->size) {
41 /* need more space */
42 s->size *= STACK_SIZE_MULTIPLIER;
43 s->data = realloc(s->data, s->size * sizeof(*(s->data)));
44 if(s->data == 0) {
45 abort(); /* we have no other way to signal failure :-( */
46 }
47 }
48 /* now there is enough room */
49 s->data[s->top++] = elem;
50 }
51
52 int
53 stack_pop(Stack s)
54 {
55 if(stack_isempty(s)) {
56 return STACK_EMPTY;
57 } else {
58 return s->data[--(s->top)];
59 }
60 }
61
62 int
63 stack_isempty(Stack s)
64 {
65 return s->top == 0;
66 }
At last we have a version that passes all tests:
$ make test gcc -g3 -Wall -ansi -pedantic -c -o test-stack.o test-stack.c gcc -g3 -Wall -ansi -pedantic -c -o tester.o tester.c gcc -g3 -Wall -ansi -pedantic -c -o stack.o stack.c gcc -g3 -Wall -ansi -pedantic -o test-stack test-stack.o tester.o stack.o ./test-stack OK!
6. Moral
Writing a big program all at once is hard. If you can break the problem down into little problems, it becomes easier. "Test first" is a strategy not just for getting a well-tested program, but for giving you something easy to do at each step--- it's usually not too hard to write one more test, and it's usually not too hard to get just one test working. If you can keep taking those small, easy steps, eventually you will run out of failed tests and have a working program.
7. Appendix: Test macros
7.1. tester.h
1 /*
2 * Test macros.
3 *
4 * Usage:
5 *
6 * #include <setjmp.h>
7 * #include <stdio.h>
8 * #include <signal.h>
9 * #include <unistd.h>
10 *
11 * tester_init(); -- Initialize internal data structures.
12 * tester_report(FILE *, "name"); -- Print report.
13 * tester_result(); -- Returns # of failed tests.
14 *
15 * TRY { code } ENDTRY;
16 *
17 * Wraps code to catch seg faults, illegal instructions, etc. May not be
18 * nested.
19 * Prints a warning if a signal is caught.
20 * To enforce a maximum time, set alarm before entering.
21 *
22 * TEST(expr, expected_value);
23 *
24 * Evaluates expr (which should yield an integer value) inside a TRY.
25 * Prints a warning if evaluating expr causes a fault or returns a value
26 * not equal to expected_value.
27 *
28 * TEST_ASSERT(expr)
29 *
30 * Equivalent to TEST(!(expr), 0)
31 *
32 * You can also cause your own failures with FAIL:
33 *
34 * TRY {
35 * x = 1;
36 * if(x == 2) FAIL("why is x 2?");
37 * } ENDTRY;
38 *
39 * To limit the time taken by a test, call tester_set_time_limit with
40 * a new limit in seconds, e.g.
41 *
42 * tester_set_time_limit(1);
43 * TRY { while(1); } ENDTRY;
44 *
45 * There is an initial default limit of 10 seconds.
46 * If you don't want any limit, set the limit to 0.
47 *
48 */
49
50 /* global data used by macros */
51 /* nothing in here should be modified directly */
52 extern struct tester_global_data {
53 jmp_buf escape_hatch; /* jump here on surprise signals */
54 int escape_hatch_active; /* true if escape hatch is usable */
55 int tests; /* number of tests performed */
56 int errors; /* number of tests failed */
57 int signals; /* number of signals caught */
58 int expr_value; /* expression value */
59 int setjmp_return; /* return value from setjmp */
60 int try_failed; /* true if last try failed */
61 int user_fails; /* number of calls to FAIL */
62 int time_limit; /* time limit for TRY */
63 } TesterData;
64
65 /* set up system; call this before using macros */
66 void tester_init(void);
67
68 /* prints a summary report of all errors to f, prefixed with preamble */
69 /* If there were no errors, nothing is printed */
70 void tester_report(FILE *f, const char *preamble);
71
72 /* returns number of errors so far. */
73 int tester_result(void);
74
75 /* set a time limit t for TRY, TEST, TEST_ASSERT etc. */
76 /* After t seconds, an ALARM signal will interrupt the test. */
77 /* Set t = 0 to have no time limit. */
78 /* Default time limit is 10 seconds. */
79 void tester_set_time_limit(int t);
80
81 const char *tester_strsignal(int); /* internal hack; don't use this */
82
83 /* gruesome non-syntactic macros */
84 #define TRY \
85 TesterData.try_failed = 0; \
86 alarm(TesterData.time_limit); \
87 if(((TesterData.setjmp_return = setjmp(TesterData.escape_hatch)) == 0) \
88 && (TesterData.escape_hatch_active = 1) /* one = is correct*/)
89 #define ENDTRY else { \
90 fprintf(stderr, "%s:%d: %s (signal %d)\n", \
91 __FILE__, __LINE__, \
92 tester_strsignal(TesterData.setjmp_return), \
93 TesterData.setjmp_return); \
94 TesterData.signals++; \
95 TesterData.try_failed = 1; \
96 } \
97 alarm(0); \
98 TesterData.escape_hatch_active = 0
99
100 /* another atrocity */
101 #define TEST(expr, expected_value) \
102 TesterData.tests++; \
103 TesterData.errors++; /* guilty until proven innocent */ \
104 TRY { TesterData.expr_value = (expr); \
105 if(TesterData.expr_value != expected_value) { \
106 fprintf(stderr, "%s:%d: TEST FAILED: %s -> %d but expected %d\n", \
107 __FILE__, __LINE__, __STRING(expr), \
108 TesterData.expr_value, expected_value); \
109 } else { \
110 TesterData.errors--; \
111 } \
112 } \
113 ENDTRY; \
114 if(TesterData.try_failed) \
115 fprintf(stderr, "%s:%d: TEST FAILED: %s caught signal\n", \
116 __FILE__, __LINE__, __STRING(expr))
117
118 #define TEST_ASSERT(expr) TEST((expr) != 0, 1)
119 #define FAIL(msg) \
120 (fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__, (msg)), \
121 TesterData.user_fails++, \
122 TesterData.try_failed = 1)
123
7.2. tester.c
1 #define _GNU_SOURCE /* get strsignal def */
2
3 #include <stdio.h>
4 #include <signal.h>
5 #include <string.h>
6 #include <setjmp.h>
7
8 #include "tester.h"
9
10 struct tester_global_data TesterData;
11
12 const char *
13 tester_strsignal(int sig)
14 {
15 return strsignal(sig);
16 }
17
18 static void
19 tester_sighandler(int signal)
20 {
21 if(TesterData.escape_hatch_active) {
22 TesterData.escape_hatch_active = 0;
23 longjmp(TesterData.escape_hatch, signal);
24 }
25 }
26
27 void
28 tester_init(void)
29 {
30 TesterData.escape_hatch_active = 0;
31 TesterData.tests = 0;
32 TesterData.errors = 0;
33 TesterData.signals = 0;
34 TesterData.user_fails = 0;
35
36 signal(SIGSEGV, tester_sighandler);
37 signal(SIGILL, tester_sighandler);
38 signal(SIGFPE, tester_sighandler);
39 signal(SIGALRM, tester_sighandler);
40 signal(SIGBUS, tester_sighandler);
41 signal(SIGABRT, tester_sighandler);
42 }
43
44 void
45 tester_report(FILE *f, const char *preamble)
46 {
47 if(TesterData.errors != 0 || TesterData.signals != 0) {
48 fprintf(f, "%s: errors %d/%d, signals %d, FAILs %d\n",
49 preamble,
50 TesterData.errors,
51 TesterData.tests,
52 TesterData.signals,
53 TesterData.user_fails);
54 }
55 }
56
57 int
58 tester_result(void)
59 {
60 return TesterData.errors;
61 }
62
63 void
64 tester_set_time_limit(int t)
65 {
66 TesterData.time_limit = t;
67 }