
Note: You are looking at a static copy of the former PineWiki site, used for class notes by James Aspnes from 2003 to 2012. Many mathematical formulas are broken, and there are likely to be other bugs as well. These will most likely not be fixed. You may be able to find more up-to-date versions of some of these notes at http://www.cs.yale.edu/homes/aspnes/#classes.

One of the goals of programming is to make your code readable by other programmers (including your future self). An important tool for doing so is to give good names to everything. Not only can such a name document what it names, it can also be used to hide implementation details that are not interesting or that may change later.

1. Naming types

Suppose that you want to represent character strings as

    struct string {
        int length;
        char *data;         /* malloc'd block */
    };

    int string_length(const struct string *s);

If you later change the representation to, say, traditional null-terminated char * strings or some even more complicated type (union string **some_string[2];), you will need to go back and replace every occurrence of struct string * in every program that uses it with the new type. Even if you don't expect to change the type, you may still get tired of typing struct string * all the time, especially if your fingers slip and give you struct string sometimes.

The solution is to use a typedef, which defines a new type name:

    typedef struct string *String;

    int string_length(String s);

The syntax for typedef looks like a variable declaration preceded by typedef, except that the variable is replaced by the new type name that acts like whatever type the defined variable would have had. You can use a name defined with typedef anywhere you could use a normal type name, as long as it is later in the source file than the typedef definition. Typically typedefs are placed in a header file (.h file) that is then included anywhere that needs them.
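As a rough sketch of how the typedef'd name might be used, the following fills in the String type from above. The functions string_create and string_destroy are hypothetical additions for illustration; only string_length appears in the interface described here.

```c
#include <stdlib.h>
#include <string.h>

struct string {
    int length;
    char *data;         /* malloc'd block */
};

/* Reads like the variable declaration "struct string *String;",
 * but makes String a new name for the type struct string *. */
typedef struct string *String;

/* Hypothetical constructor: copies a null-terminated C string.
 * Returns NULL if allocation fails. */
String string_create(const char *text)
{
    String s = malloc(sizeof(*s));

    if (s == NULL) {
        return NULL;
    }
    s->length = (int) strlen(text);
    s->data = malloc((size_t) s->length + 1);
    if (s->data == NULL) {
        free(s);
        return NULL;
    }
    memcpy(s->data, text, (size_t) s->length + 1);
    return s;
}

int string_length(String s)
{
    return s->length;
}

/* Hypothetical destructor: frees the data block and the struct. */
void string_destroy(String s)
{
    if (s != NULL) {
        free(s->data);
        free(s);
    }
}
```

Code that calls string_create("hello") and then string_length on the result never has to mention struct string * at all, so the representation could change without touching the callers.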

You are not limited to using typedefs only for complex types. For example, if you were writing numerical code and wanted to declare overtly that a certain quantity was not just any double but actually a length in meters, you could write

    typedef double LengthInMeters;
    typedef double AreaInSquareMeters;

    AreaInSquareMeters rectangleArea(LengthInMeters height, LengthInMeters width);

Unfortunately, C does not do type enforcement on typedef'd types: it is perfectly acceptable to the compiler if you pass a value of type AreaInSquareMeters as the first argument to rectangleArea, since by the time it checks the types it has replaced both AreaInSquareMeters and LengthInMeters by double. So this feature is not as useful as it might be, although it does mean that you can write rectangleArea(2.0, 3.0) without having to do anything to convert 2.0 and 3.0 to type LengthInMeters.
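A small sketch of this lack of enforcement follows. The body of rectangleArea and the function confusedArea are assumptions made up for the example.

```c
typedef double LengthInMeters;
typedef double AreaInSquareMeters;

/* Assumed implementation: the notes only give the prototype. */
AreaInSquareMeters rectangleArea(LengthInMeters height, LengthInMeters width)
{
    return height * width;
}

/* Hypothetical misuse: an area is passed where a length is expected.
 * The compiler accepts this because both names are just double. */
AreaInSquareMeters confusedArea(void)
{
    AreaInSquareMeters room = rectangleArea(2.0, 3.0);

    return rectangleArea(room, 3.0);   /* area times length, yet it compiles */
}
```

Nothing here draws a type error, so the typedef names serve as documentation for the reader rather than as a check by the compiler.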

2. Naming constants

Suppose that you have a function (call it getchar) that needs to signal that sometimes it didn't work. The usual way is to return a value that the function won't normally return. Now, you could just tell the user what value that is:

    /* get a character (as an `int` ASCII code) from `stdin` */
    /* return -1 on end of file */
    int getchar(void);

and now the user can write

        while((c = getchar()) != -1) {
            ...
        }

But then somebody reading the code has to remember that -1 means "end of file" and not "signed version of 0xff" or "computer room on fire, evacuate immediately." It's much better to define a constant EOF that happens to equal -1, because among other things, if you later change the special return value from getchar, this code will still work (assuming you also update the definition of EOF):

        while((c = getchar()) != EOF) {
            ...
        }
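For a complete version of this loop, here is a minimal sketch that counts characters until end of file. The helper count_chars is hypothetical, and it uses getc on a caller-supplied stream instead of getchar so that it can be run on any open file; getchar() behaves like getc(stdin).

```c
#include <stdio.h>

/* Hypothetical helper: counts the characters remaining on a stream. */
long count_chars(FILE *in)
{
    int c;              /* int, not char, so EOF stays distinct from real characters */
    long count = 0;

    while ((c = getc(in)) != EOF) {
        count++;
    }
    return count;
}
```

Calling count_chars(stdin) reads standard input to the end. The loop never mentions -1, so it does not depend on the particular value the library chose for EOF.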

So how do you declare a constant in C? The traditional approach is to use the C preprocessor, the same tool that gets run before the compiler to expand out #include directives. To define EOF, the file /usr/include/stdio.h includes the text

    #define EOF (-1)

What this means is that whenever the characters EOF appear in a C program as a separate word (e.g. in 1+EOF*3 but not in wEOFully_long_variable_name), the preprocessor will replace them with the characters (-1). The parentheses around the -1 are customary: they ensure that the -1 is treated as a single unit and does not combine unexpectedly with operators in the surrounding expression.
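The effect of the parentheses is easier to see with a constant whose definition contains a binary operator. The macros and functions below are hypothetical names chosen only for this sketch.

```c
/* Hypothetical macros for illustration only. */
#define UNSAFE_LIMIT 1+2        /* no parentheses */
#define SAFE_LIMIT   (1+2)      /* parenthesized */

int unsafe_times_three(void)
{
    return UNSAFE_LIMIT*3;      /* expands to 1+2*3, which is 7 */
}

int safe_times_three(void)
{
    return SAFE_LIMIT*3;        /* expands to (1+2)*3, which is 9 */
}
```

Because the preprocessor substitutes text without knowing about operator precedence, wrapping every macro body in parentheses is a cheap habit that avoids this kind of surprise.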

In general, any time you have a non-trivial constant in a program, it should be #defined. Examples are things like array dimensions, special tags or return values from functions, maximum or minimum values for some quantity, or standard mathematical constants (e.g., /usr/include/math.h defines M_PI as pi to umpteen digits). This allows you to write

        char buffer[MAX_FILENAME_LENGTH+1];

        area = M_PI*r*r;

        if(status == COMPUTER_ROOM_ON_FIRE) {
            evacuate();
        }

instead of

   1     char buffer[513];
   2     
   3     area = 3.141592319*r*r;
   4 
   5     if(status == 136) {
   6         evacuate();
   7     }

which is just an invitation to errors (including the one on line 3).

Like typedefs, #defines that are intended to be globally visible are best done in header files; in large programs you will want to #include them in many source files. The usual convention is to write #defined names in all-caps to remind the user that they are macros and not real variables.

3. Naming values in sequences

C provides the enum construction for the special case where you want to have a sequence of named integer constants, but you don't care what their actual values are, as in

    enum color { RED, BLUE, GREEN, MAUVE, TURQUOISE };

This will assign the value 0 to RED, 1 to BLUE, and so on. These values are effectively of type int, although you can declare variables, arguments, and return values as type enum color to indicate their intended interpretation. Despite declaring a variable enum color c (say), the compiler will still allow c to hold arbitrary values of type int; see enums_are_ints.c for some silly examples of this.
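The following is a short sketch of this behavior. It is not the enums_are_ints.c file mentioned above, and the functions color_value and nameless_color are hypothetical.

```c
enum color { RED, BLUE, GREEN, MAUVE, TURQUOISE };

/* Hypothetical helper: an enum color converts to int with no cast. */
int color_value(enum color c)
{
    return c;
}

/* Hypothetical misuse: stores a value that has no name in the enum. */
enum color nameless_color(void)
{
    enum color c = TURQUOISE + 1;   /* 5 is not a declared color, but this compiles */

    return c;
}
```

The compiler treats both directions of the conversion as ordinary integer arithmetic, so keeping a variable of type enum color within the named values is up to the programmer.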

It is also possible to specify particular values for particular enumerated constants, as in

    enum color { RED = 37, BLUE = 12, GREEN = 66, MAUVE = 5, TURQUOISE };

Any constant that is not given an explicit value gets one plus the previous value, so the above definition would set TURQUOISE to 6.
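This numbering rule can be checked with a small sketch. The function turquoise_follows_mauve is a hypothetical name used only here.

```c
enum color { RED = 37, BLUE = 12, GREEN = 66, MAUVE = 5, TURQUOISE };

/* Hypothetical check: returns 1 if TURQUOISE, which has no explicit
 * value, was numbered one past MAUVE, the constant just before it. */
int turquoise_follows_mauve(void)
{
    return TURQUOISE == MAUVE + 1;
}
```

The explicit values do not have to be in increasing order; only the position of the unnumbered constant relative to its predecessor matters.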

In practice, enums are seldom used, and you will more commonly see a stack of #defines:

    #define RED       (0)
    #define BLUE      (1)
    #define GREEN     (2)
    #define MAUVE     (3)
    #define TURQUOISE (4)

The reason for this is partly historical—enum arrived late in the evolution of C—but partly practical: a table of #defines makes it much easier to figure out which color is represented by 3, without having to count through a list. But if you never plan to use the numerical values, enum is a better choice.

4. Other uses of #define

It is also possible to use #define to define preprocessor macros that take parameters; this will be discussed in C/Macros.




2014-06-17 11:57