Note: You are looking at a static copy of the former PineWiki site, used for class notes by James Aspnes from 2003 to 2012. Many mathematical formulas are broken, and there are likely to be other bugs as well. These will most likely not be fixed. You may be able to find more up-to-date versions of some of these notes at http://www.cs.yale.edu/homes/aspnes/#classes.

Our goal here is to find computationally-useful structures that act enough like the rational numbers ℚ or the real numbers ℝ that we can do arithmetic in them that are small enough that we can describe any element of the structure uniquely with a finite number of bits. Such structures are called finite fields.

An example of a finite field is ℤ_p, the integers mod p (see ModularArithmetic). These finite fields are inconvenient for computers, which like to count in bits and prefer numbers that look like 2ⁿ to horrible nasty primes. So we'd really like finite fields of size 2ⁿ for various n, particularly if the operations of addition, multiplication, etc. have a cheap implementation in terms of sequences of bits. To get these, we will show how to construct a finite filed of size pⁿ for any prime p and positive integer n, and then let p=2.

Contents

A magic trick
Fields and rings
Polynomials over a field
Algebraic field extensions
Applications

1. A magic trick

We will start with a magic trick. Suppose we want to generate a long sequence of bits that are hard to predict. One way to do this is using a mechanism known as a linear-feedback shift register (LFSR). There are many variants of LFSRs. Here is one that generates a sequence that repeats every 15 bits by keeping track of 4 bits of state, which we think of as a binary number r₃r₂r₁r₀.

To generate each new bit, we execute the following algorithm:

Rotate the bits of r left, to get a new number r₂r₁r₀r₃.
If the former leftmost bit was 1, flip the new leftmost bit.
Output the rightmost bit.

Here is the algorithm in action, starting with r = 0001:

r	rotated r	rotated r after flip	output
0001	0010	0010	0
0010	0100	0100	0
0100	1000	1000	0
1000	0001	1001	1
1001	0011	1011	1
1011	0111	1111	1
1111	1111	0111	1
0111	1110	1110	0
1110	1101	0101	1
0101	1010	1010	0
1010	0101	1101	1
1101	1011	0011	1
0011	0110	0110	0
0110	1100	1100	0
1100	1001	0001	1
0001	0010	0010	0

After 15 steps, we get back to 0001, having passed through all possible 4-bit values except 0000. The output sequence 000111101011001... has the property that every 4-bit sequence except 0000 appears starting at one of the 15 positions, meaning that after seeing any 3 bits (except 000), both bits are equally likely to be the next bit in the sequence. We thus get a sequence that is almost as long as possible given we have only 2⁴ possible states, that is highly unpredictable, and that is cheap to generate. So unpredictable and cheap, in fact, that the governments of both the United States and Russia operate networks of orbital satellites that beam microwaves into our brains carrying signals generated by linear-feedback shift registers very much like this one. Similar devices are embedded at the heart of every modern computer, scrambling all communications between the motherboard and PCI cards to reduce the likelihood of accidental eavesdropping.

What horrifying deep voodoo makes this work?

2. Fields and rings

A field is a set F together with two operations + and ⋅ that behave like addition and multiplication in the rationals or real numbers. Formally, this means that:

Addition is associative: (x+y)+z = x+(y+z) for all x, y, z F.
There is an additive identity 0 such that 0+x = x+0 = x for all x in F.
Every x in F has an additive inverse -x such that x+(-x) = (-x)+x = 0.
Addition is commutative: x+y = y+x for all x, y in F.
Multiplication distributes over addition: x⋅(y+z) = (x⋅y + x⋅z) and (y+z)⋅x = (y⋅x + z⋅x) for all x,y, z, in F.
Multiplication is associative: (x⋅y)⋅z = x⋅(y⋅z) for all x, y, z in F.
There is a multiplicative identity 1 such that 1⋅x = x⋅1 = x for all x in F-{0}.
Multiplication is commutative: x⋅y = y⋅x for all x, y in F.
Every x in F-{0} has a multiplicative inverse x^-1 such that x⋅x^-1 = x^-1⋅x = 1.

Some structures fail to satisfy all of these axioms but are still interesting enough to be given names. A structure that satisfies 1–3 is called a group; 1–4 is an abelian group; 1–7 is a ring; 1–8 is a commutative ring. In the case of groups and abelian groups there is only one operation +. There are also more exotic names for structures satisfying other subsets of the axioms; see AbstractAlgebra.

Some examples of fields: ℝ, ℚ, ℂ, ℤ_p where p is prime. We will be particularly interested in ℤ_p, since we are looking for finite fields that can fit inside a computer.

If (F,+,⋅) looks like a field except that multiplication isn't necessarily commutative and some nonzero elements might not have inverses, then it's a ring (or a commutative ring if multiplication is commutatitive). The integers ℤ are an example of a commutative ring, as is ℤ_m for m > 1. Square matrices of fixed dimension > 1 are an example of a non-commutative ring.

3. Polynomials over a field

Any field F generates a polynomial ring F[x] consisting of all polynomials in the variable x with coefficients in F. For example, if F = ℚ, some elements of ℚ[x] are 3/5, (22/7)x² + 12, 9003x⁴¹⁷ - (32/3)x⁴ + x², etc. Addition and multiplication are done exactly as you'd expect, by applying the distributive law and combining like terms: (x+1)⋅(x²+3/5) = x⋅x²+x⋅(3/5)+x²+(3/5) = x³ + x² + (3/5)x + (3/5).

The degree deg(p) of a polynomial p in F[x] is the exponent on the leading term, the term with a nonzero coefficient that has the largest exponent. Examples: deg(x²+1) = 2, deg(17) = 0. For 0, which doesn't have any terms with nonzero coefficients, the degree is taken to be -∞. Degrees add when multiplying polynomials: deg((x²+1)(x+5)) = deg(x²+1)+deg(x+5) = 2+1 = 3; this is just a consequence of the product leading terms producing the leading term of the new polynomial. For addition, we have deg(p+q) max(deg(p),deg(q)), but we can't guarantee equality (maybe the leading terms cancel).

Because F[x] is a ring, we can't do division the way we do it in a field like ℝ, but we can do division the way we do it in a ring like ℤ, leaving a remainder. The equivalent of the integer division algorithm for ℤ is:

Division algorithm for polynomials: Given a polynomial f and a nonzero polynomial g in F[x], there are unique polynomials q and r such that f = q⋅g + r and deg(r) < deg(g).

The essential idea is that we can find q and r using the same process of long division as we use for integers. For example, in ℚ[x]:

            x -  1
     ______________
x+2 ) x² +  x +  5
      x² + 2x
      -------
           -x +  5
           -x + -5
           =======
                10

From this we get x² + x + 5 = (x+2)(x-1) + 10, with deg(10) = 0 < deg(x+2) = 1. We are going to use this later to define finite fields by taking F[x] modulo some well-chosen polynomial, analogously to the way we can turn ℤ (a ring) into a field ℤ_p by taking quotients mod p.

4. Algebraic field extensions

Given a field F, we can make a bigger field by adding in extra elements that behave in a well-defined and consistent way. An example of this is the extension of the real numbers ℝ to the complex numbers ℂ by adding i.

The general name for this trick is algebraic field extension and it works by first constructing the ring of polynomials F[x] and then smashing it down into a field by taking remainders modulo some fixed polynomial p(x). For this to work, the polynomial has to to be irreducible, which mean that p(x) = 0 if and only if x = 0, or equivalently that p can't be factored as (x+a)p' for some a and p' (which makes irreducibility sort of like being prime, and makes this construction sort of like the construction of ℤ_p).

The fact that the resulting object is a field follows from inheriting all the commutative ring properties from F[x], plus getting multiplicative inverses for essentially the same reason as in ℤ_p: we can find them using the extended Euclidean algorithm applied to polynomials instead of integers (we won't prove this).

In the case of the complex numbers ℂ, the construction is ℂ = ℝ[i]/(i²+1). Because i²+1 = 0 has no solution i∈ℝ, this makes i²+1 an irreducible polynomial. An element of ℂ is then a degree-1 or less polynomial in ℝ[i], because these are the only polynomials that survive taking the remainder mod i²+1 intact.

If you've used complex numbers before, you were probably taught to multiply them using the rule i² = -1, which is a rewriting of i² + 1 = 0. This is equivalent to taking remainders: (i + 1)(i + 2) = (i² + 3i + 2) = 1⋅(i^2+1) + (3i + 1) = 3i + 1.

The same thing works for other fields and other irreducible polynomials. For example, in ℤ₂, the polynomial x²+x+1 is irreducible, because x²+x+1=0 has no solution (try plugging in 0 and 1 to see). So we can construct a new finite field ℤ₂[x]/(x²+x+1) whose elements are polynomials with coefficients in ℤ₂ modulo x²+x+1.

Addition in ℤ₂[x]/(x²+x+1) looks like vector addition<<FootNote(This is not an accident; it can be shown that that any extension field acts like a vector space over its base field.): (x+1) + (x+1) = 0⋅x + 0 = 0, (x+1) + x = 1, (1) + (x) = (x+1). Multiplication in ℤ₂[x]/(x²+x+1) works by first multiplying the polynomials and taking the remainder mod (x²+x+1): (x+1)⋅(x+1) = x²+1 = 1⋅(x²+x+1) + x = x. If you don't want to take remainders, you can instead substitute x+1 for any occurrence of x² (just like substituting -1 for i² in ℂ), since x²+x+1 = 0 implies x² = -x-1 = x+1 (since -1 = 1 in ℤ₂).

The full multiplication table for this field looks like this:

0		1	x	x+1
0	0	0	0	0
1	0	1	x	x+1
x	0	x	x+1	1
x+1	0	x+1	1	x

We can see that every nonzero element has an inverse by looking for ones in the table; e.g. 1⋅1 = 1 means 1 is its own inverse and x⋅(x+1) = x²+x = 1 means that x and x+1 are inverses of each other.

Here's the same thing for ℤ₂[x]/(x³+x+1):

	1	x	x + 1	x²	x² + 1	x² + x	x² + x + 1
0	0	0	0	0	0	0	0
1	1	x	x + 1	x²	x² + 1	x² + x	x² + x + 1
x	x	x²	x² + x	x + 1	1	x² + x + 1	x² + 1
x + 1	x + 1	x² + x	x² + 1	x² + x + 1	x²	1	x
x²	x²	x + 1	x² + x + 1	x² + x	x	x² + 1	1
x² + 1	x² + 1	1	x²	x	x² + x + 1	x + 1	x² + x
x² + x	x² + x	x² + x + 1	1	x² + 1	x + 1	x	x²
x² + x + 1	x² + x + 1	x² + 1	x	1	x² + x	x²	x + 1

Note that we now have 2³ = 8 elements. In general, if we take ℤ_p[x] modulo a degree-n polynomial, we will get a field with pⁿ elements. These turn out to be all the possible finite fields, with exactly one finite field for each number of the form pⁿ (up to isomorphism, which means that we consider two fields equivalent if there is a bijection between them that preserves + and ⋅). We can refer to a finite field of size pⁿ abstractly as GF(pⁿ), which is an abbreviation for the Galois field of order pⁿ.

5. Applications

So what are these things good for?

On the one hand, given an irreducible polynomial p(x) of degree n over ℤ₂(x), it's easy to implement arithmetic in ℤ₂[x]/p(x) GF(2ⁿ) using standard-issue binary integers. The trick is to represent each polynomial ∑ a_i xⁱ by the integer value a = ∑ a_i 2ⁱ, so that each coefficient a_i is just the i-th bit of a. Adding two polynomials a+b represented in this way corresponds to computing the bitwise exclusive or of a and b: a^b in programming languages that inherit their arithmetic syntax from C (i.e., almost everything except Scheme). Multiplying polynomials is more involved, although it's easy for some special cases like multiplying by x, which becomes a left-shift (a<<1) followed by XORing with the representation of our modulus if we get a 1 in the n-th place. (The general case is like this but involves doing XORs of a lot of left-shifted values, depending on the bits in the polynomial we are multiplying by.)

On the other hand, knowing that we can multiply 7 x²+x+1 by 5 x²+1 and get 6 x²+x quickly using C bit operations doesn't help us much if this product doesn't mean anything. For ModularArithmetic, we at least have the consolation that 7⋅5 = 6 (mod 29) tells us something about remainders. In GF(2³), what this means is much more mysterious. This makes it useful—not in contexts where we want multiplication to make sense—but in contexts where we don't. These mostly come up in random number generation and cryptography.

5.1. Linear-feedback shift registers

Let's suppose we generate x⁰, x¹, x², ... in ℤ₂/(x⁴+x³+1), which happens to be one of the finite fields isomorphic to GF(2⁴). Since there are only 2⁴-1 = 15 nonzero elements in GF(2⁴), we can predict that eventually this sequence will repeat, and in fact we can show that p¹⁵ = 1 for any nonzero p using essentially the same argument as for Fermat's Little Theorem (see ModularArithmetic). So we will have x⁰ = x¹⁵ = x³⁰ etc. and thus will expect our sequence to repeat every 15 steps (or possibly some factor of 15, if we are unlucky).

To compute the actual sequence, we could write out full polynomials: 1, x, x², x³, x³+1, x³+x+1, ..., but this gets tiresome fast. So instead we'd like to exploit our representation of ∑ a_ixⁱ as ∑ a_i2ⁱ.

Now multiplying by x is equivalent to shifting left (i.e. multiplying by 2) followed by XORing with 11001, the binary representation of x⁴ + x³ + 1, if we get a bit in the x⁴ place that we need to get rid of. For example, we might do:

 1101 (initial value)
11010 (after shift)
 0011 (after XOR with 11001)

 0110 (initial value)
01100 (after shift)
 1100 (no XOR needed)

If we write our initial value as r₃r₂r₁r₀, the shift produces a new value r₃r₂r₁r₀0. XORing with 11001 has three effects: (a) it removes a leading 1 if present; (b) it sets the rightmost bit to r₃; and (c) it flips the new leftmost bit if r₃ = 1. Steps (a) and (b) turn the shift into a rotation. Step (c) is the mysterious flip from our sequence generator. So in fact what our magic sequence generator was doing was just computing all the powers of x in a particular finite field.

As in ℤ_p, these powers of an element bounce around unpredictably, which makes them a useful (though cryptographically very weak) pseudorandom number generator. Because high-speed linear-feedback shift registers are very cheap to implement in hardware, they are used in applications where a pre-programmed, statistically smooth sequence of bits is needed, as in the Global Positioning System and to scramble electrical signals in computers to reduce radio-frequency interference.

5.2. Checksums

Shifting an LFSR corresponds to multiplying by x. If we also add 1 from time to time, we can build any polynomial we like, and get the remainder mod m; for example, to compute the remainder of 100101 mod 11001 we do

 0000 (start with 0)
00001 (shift in 1)
 0001 (no XOR)
00010 (shift in 0)
 0010 (no XOR)
00100 (shift in 0)
 0100 (no XOR)
01001 (shift in 1)
 1001 (no XOR)
10010 (shift in 0)
 1011 (XOR with 11001)
10111 (shift in 1)
 1110 (XOR with 11001)

and we have computed that the remainder of x⁵ + x³ + 1 mod x⁴ + x³ + 1 is x³ + x² + x.

This is the basis for Cyclic redundancy check checksums, which are used to detect accidental corruption of data. The idea is that we feed our data stream into the LFSR as the coefficients of some gigantic polynomial, and the checksum summarizing the data is the state when we are done. Since it's unlikely that a random sequence of flipped or otherwise damaged bits would equal 0 mod m, most non-malicious changes to the data will be visible by producing an incorrect checksum.

5.3. Cryptography

GF(2ⁿ) can also substitute for ℤ_p in some cryptographic protocols. An example would be the function f(s) = x^s (mod m), which is fairly easy to compute in ℤ_p and even easier to compute in GF(2ⁿ), but which seems to be hard to invert in both cases. Here we can take advantage of the fast remainder operation provided by LFSRs to avoid having to do expensive division in ℤ.

CategoryMathNotes