Note: You are looking at a static copy of the former PineWiki site, used for class notes by James Aspnes from 2003 to 2012. Many mathematical formulas are broken, and there are likely to be other bugs as well. These will most likely not be fixed. You may be able to find more up-to-date versions of some of these notes at http://www.cs.yale.edu/homes/aspnes/#classes.

A polynomial in a variable x over a commutative ring R is an expression of the form

$p(x) = \sum_{i=0}^{d} a_i x^i.$

Note that a polynomial only has finitely many terms—with infinitely many terms, we get a formal power series (see GeneratingFunctions).

In any polynomial, the values a_i are called the coefficients of the polynomial and each a_i in particular is the coefficient of xⁱ. The variable x generally has no role except to act as a hat-rack for exponents, although we can evaluate a polynomial by substituting a particular element of R for x. Evaluation turns a polynomial into a function from R to R, but we consider polynomials with different coefficients to be different even if they yield the same function (this usually only happens when R is finite, e.g. in ℤ₂[x] the polynomials x, x², x³, x + x² + x³⁷, etc. all give the same function but are still treated as distinct polynomials).

The set of all polynomials in x over R is written R[x]. It satisfies all the axioms of a commutative ring, where addition is defined by (p+q)_i = p_i + q_i and multiplication is defined by

$(pq)_i = \sum_{j=0}^{i} p_j q_{j-i},$

an operation on sequences known as convolution. The intuition behind this definition is that we can obtain a term xⁱ as the product of terms x^j x^i-j for any j with 0 ≤ j ≤ i since the exponents add, and the sum just collects the coefficients of all such terms.

Given a polynomial p(x), the largest exponent of any term with a nonzero coefficient is called the degree of a polynomial; this is often abrreviated as deg(p(x)). The nonzero coefficient in question is called the leading coefficient of the polynomial. A nonzero constant polynomial has degree 0. It is sometimes convenient to assign the zero polynomial degree -1 or -∞, but as the behavior of the zero polynomial is messy it is usually safer just to leave its degree undefined.

It is not hard to show that for nonzero p and q, deg(p+q) ≤ max(deg(p), deg(q)) and deg(pq) ≤ deg(p) + deg(q). The second inequality becomes an equality if R is a field. The reason is that the leading coefficient of pq is the product of the leading coefficients of p and q, and in a field a product can't be zero unless one of its factors is zero. It follows that polynomials over a field do not form a field, as multiplying any polynomial p of degree 1 or higher by any other polynomial q gives a product pq with deg(pq) ≥ 1, implying pq ≠ 1: p has no multiplicative inverse. However, we will be able to construct fields as quotient rings of rings of polynomials.

1. Division of polynomials

Just like we can divide integers to get a quotient and remainder, we can also divide polynomials over a field. The division algorithm looks suspiciously like long division, which is not terribly surprising if we realize that the usual base-10 representation of a number is just a polynomial over 10 instead of x. That the division algorithm for polynomials works and gives unique results follows from a simple induction argument on the degree.

Let a and b be polynomials in F[x], where F is some field. We want to find q and r such that a = bq + r and deg(r) < deg(b) (or r = 0 if deg(b) = 0—this is where it is convenient to define deg(0) as some negative quantity). It's not hard to see that deg(q) will be exactly deg(a) - deg(b) if this quantity is non-negative, since deg(bq) = deg(q) + deg(b) and deg(r) is less, preventing it from knocking out the leading coefficient of bq. On the other hand, if deg(a) < deg(b), then q = 0 and r = a is the unique solution to a = bq+r, as for any nonzero q we would get deg(a) ≥ deg(b). This forms the basis of an inductive argument on deg(a) that there is always a unique q and r for any a and b.

Suppose deg(a) ≥ deg(b), and let the leading term of a be a_nxⁿ and b be b_mx^m. Then a_n = q_m-nb_m since the sum defining a_n will contain only the product of the leading terms of b and q. We can thus solve for the unique leading coefficient q_m-n = a_n(b_m)^-1. To obtain the rest of q, let a' = a - b q_m-n x^m-n; this new polynomial a' has degree less than a, so by the induction hypothesis there is a unique q' and r' with deg(r') < deg(b) such that a' = bq' + r'. But then a = b(q_m-nx^m-n + q') + r' gives the desired unique solution to a = bq+r.

Example: Let F = ℤ₅ and let a = 4x² + 2x + 1 and b = 3x + 4. What are q and r? We have q₁ = 4⋅3^-1 = 4⋅2 = 3. Subtracting q₁x¹b = 3xb from a gives a' = (4x² + 2x + 1) - 3x(3x + 4) = (4x² + 2x + 1) - (4x² + 2x) = 1. This has degree less than deg(b), so we are done: 4x² + 2x + 1 = (3x + 4)(3x) + 1, which we can verify by multiplying out 3x⋅3x + 4⋅3x + 1 = 9x² + 12x + 1 = 4x² + 2x + 1.

2. Divisors and greatest common divisors

A polynomial d(x) is a divisor of a polynomial p(x) if there exists some polynomial q(x) such that d(x)q(x) = p(x). A greatest common divisor of two polynomial p(x) and p'(x) is a common divisor d(x) such that any other d'(x) that divides both p(x) and p'(x) also divides d(x). Outside of ℤ₂[x], a greatest common divisor of two polynomials is not unique, as the fact that our coefficients are all in a field means that if d(x)|p(x) then ad(x)|p(x) for any nonzero field element a (compute from p(x) = d(x)q(x) that p(x) = (ad(x))(a^-1q(x))). However, it is possible to show using a version of the Euclidean algorithm (see NumberTheory) that a greatest common divisor of any two polynomials p(x) and p'(x) exists and is unique up to multiplication by nonzero constants (see BiggsBook §22.6). We can go one step further: the extended Euclidean algorithm also works for polynomials, and yields for each p(x), p'(x), and gcd d(x) of p(x) and p'(x) polynomials a(x) and a'(x) such that d(x) = a(x)p(x) + a'(x)p'(x).

3. Factoring polynomials

As with integers, to factor a polynomial p is to find polynomials p₁, p₂, ... p_k such that p = p₁p₂⋯p_k. For polynomials, the role of primes in integer factorization is taken by irreducible polynomials, where a polynomial p is irreducible if p(x) = a(x)b(x) holds only if at lest one of a(x) or b(x) has degree zero. Because the extended Euclidean algorithm works for polynomials, the same proof used to prove unique factorization for integers also works for polynomials except for the messy detail of being able to multiply factors by nonzero constants. But if we restrict our attention to monic polynomials—those whose leading coefficient is 1—then we do in fact get a unique factorization (up to reordering of factors) of each monic polynomial into monic irreducible polynomials.

Which polynomials are irreducible and how polynomials factor depends on the underlying field. In ℤ₂, the polynomial x²+x+1 is irreducible: it's not equal to x(x+1) = x²+x or x⋅x = x², and there are no other degree-1 polynomials to work with. It's also irreducible over both ℚ and ℝ, since if it factors as (x+a)(x+b), then -a and -b are both solutions to x²+x+1 = 0, and the quadratic formula gives solutions

$\frac{-1 \pm \sqrt{-3}}{2}$

and there is no √(-3) in ℚ or ℝ. But it ceases to be irreducible in the field of complex numbers ℂ, since ℂ contains √(-3) = i√3. In fact, no polynomial of degree greater than 1 is irreducible in ℂ: this is why mathematicians invented the complex numbers.

To take a different example, x² + 1 is not irreducible over ℤ₂, since (x+1)(x+1) = x² + 2x + 1 = x²+1 (mod 2). But it is irreducible in ℚ and ℝ since neither contains i = √(-1), and it factors differently as (x+i)(x-i) in ℂ.

One familiar fact about polynomials that we've been implicitly using is the following handy theorem, which holds no matter what field we are working in:

Theorem: Let p(x) be a nonzero polynomial over a field F. Then p(x) has a divisor (x-a) if and only if p(a) = 0.
Proof: The only if direction is easy: if p(x) = (x-a)q(x), then p(a) = (a-a)q(a) = 0⋅q(a) = 0. For the if direction, suppose p(a) = 0, and apply the division algorithm to obtain q(x) (a polynomial) and r (a constant) such that p(x) = (x-a)q(x) + r. Then 0 = p(a) = (a-a)q(x) + r = 0⋅q(x) + r = r. In other words, r = 0 and p(x) = (x-a)q(x) as claimed.

This can quickly be used to show that a polynomial is not irreducible: e.g. p(x) = x⁴ + 3x² + 3x + 1 is not irreducible in ℤ₅ because p(2) = 0 (check it and see). It follows from the theorem that (x-2) = (x+3) divides p(x). However, it is possible for a polynomial to never evaluate to zero and still not be irreducible: a simple example in ℤ₂[x] would be x^{4 + x}2^{+ 1 = (x}2+x+1)(x²+x+1) and in ℝ[x] would be x⁴ + 2x² + 1 = (x²+1)(x²+1). In both cases the big polynomial never evaluates to zero because none of its factors ever do. However, this can only happen to polynomials that have no degree-1 factors, so looking for zeros is a good test of irreducibility for polynomials of degree 3 or less.

4. The ideal generated by an irreducible polynomial

Suppose p(x) is a polynomial in F[x] for some field F. Then I(p(x)) = { a(x)p(x) | a(x) ∈ F[x] } is an ideal of the ring F[x]. (Recall from AlgebraicStructures that I(p(x)) is an ideal of F[x] if it's a subring of F[x] such that for all b(x) in I(p(x)) and c(x) in F[x], b(x)c(x) is in I(p(x)).) Clearly I(p(x)) is a subring, since a(x)p(x) + a'(x)(x) = (a(x)+a'(x))p(x) is a multiple of p(x) and a(x)p(x)⋅a'(x)p(x) is also a multiple of p(x). But it's also an ideal, since multiplying any b(x) by some a(x)p(x) yields b(x)a(x)⋅p(x) ∈ I(p(x)). It follows that there is a quotient ring F[x]/I(p(x)), usually written simply as F[x]/p(x), in which the elements are the residue classes of polynomials modulo p(x) and two polynomials are in the same residue class if and only if they differ by a multiple of p(x).

Another way of describing these residue classes is that each consists of all polynomial a(x) that yield the same remainder r(x) when divided by p(x)-just like what happens in ℤ_m = ℤ/mℤ, another quotient ring obtained from an ideal generated by a single element. It follows that we can represent each one by a distinct r(x) with deg(r(x)) < deg(p(x)).

If p is irreducible, something very exciting happens in this quotient ring—analogous to what happens in ℤ_p = ℤ/pℤ when p is prime. Each nonzero r(x) with deg(r) < deg(p) has 1 as a greatest common divisor with p, for any constant divides 1 (1 = cc^-1) and any non-constant can't divide both p and r since p is irreducible. It follows from the extended Euclidean algorithm for polynomials that there exist r' and p' such that rr' + pp' = 1, or in other words that rr' is congruent to 1 mod p. It follows that every nonzero element of F[x]/p has a multiplicative inverse: F[x]/p is a field. Since we can specify an element of F[x]/p when deg(p) = n by giving the n coefficients r₀, r₁, ..., r_n-1 of r, this field has exactly |F|ⁿ elements.

Thus to find a finite field of pⁿ elements we can start with F = ℤ_p and take a quotient of F[x] with respect to some irreducible polynomial of degree n. Amazingly enough, such polynomials exist for any n. We won't prove this here but a proof is given in BiggsBook Chapter 23, which contains an extensive discussion of the construction and properties of finite fields.

We can also apply the technique of taking a quotient of F[x] with respect to an irreducible polynomials (known as field extension) to infinite fields. The most famous example is the field ℂ = ℝ[x]/(x²+1); the fact that x²+1 has degree 2 explains why all complex numbers can be written as two-term polynomials a+bx (i.e. a+bi). But we can construct other fields from other irreducible polynomials, such as the field ℚ[√2] = ℚ[x]/(x²-2), which consists of all expressions of the form a+b√2 where a and b are rational. This is a bigger field that ℚ, but it's still much smaller than ℝ: for example, it doesn't include π or even √3.

CategoryMathNotes