# 1. Multiplying large integers

Suppose we wish to multiply two very large integers (e.g. thousands of bits each). When dealing with such monsters it no longer makes sense to assume that multiplication is constant time. Instead, we treat each integer as an array of bits x_{n}x_{n-1}...x_{1}x_{0} and multiply using the algorithm:

x * y = (∑_{i} x_{i}2^{i})(∑_{j} y_{j}2^{j}) = ∑_{i}∑_{j} x_{i}y_{j}2^{i+j}.

Assuming that the addition is cheap, this takes Theta(n^{2}) time. In practice we can usually speed up this algorithm by a large constant factor by using larger digits, e.g. by multiplying the numbers 16 bits at a time instead of one bit at a time. But can we do better?
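As a concrete baseline, here is a sketch of the bit-at-a-time schoolbook algorithm in Python (the function name is ours; Python's arbitrary-precision shifts and adds stand in for the "cheap" operations). Each set bit y_j of y contributes x shifted left by j places:

```python
def schoolbook_multiply(x: int, y: int) -> int:
    """Theta(n^2) schoolbook multiplication of nonnegative integers:
    for each set bit y_j of y, add x * 2^j to the result."""
    result = 0
    j = 0
    while y >> j:            # while some bit at position >= j is set
        if (y >> j) & 1:     # y_j = 1, so this term is x * 2^j
            result += x << j
        j += 1
    return result
```

With n-bit operands the loop runs n times and each addition costs O(n), giving the Theta(n^{2}) bound.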

# 2. A divide-and-conquer algorithm

This algorithm is due to Karatsuba and Ofman, in a 1962 paper from *Doklady Akademii Nauk SSSR*. The paper is written in Russian, so I haven't read it. The description below is adapted from a description in Section 5.2.2 of Goodrich and Tamassia's *Algorithm Design* (Wiley, 2002).

Imagine splitting x and y into two huge digits of k bits each: x = x_{1}2^{k} + x_{0}, y = y_{1}2^{k} + y_{0}. Then

xy = (x_{1}2^{k} + x_{0})(y_{1}2^{k} + y_{0}) = x_{1}y_{1}2^{2k} + x_{1}y_{0}2^{k} + x_{0}y_{1}2^{k} + x_{0}y_{0}.

We will assume that shifting (i.e., multiplying by 2^{something}) is free and that addition takes linear time. We have thus reduced the problem of computing a product of n = 2k bits to the problem of computing four products of k bits, which gives running time T(n) = 4T(n/2) + Theta(n) = Theta(n^{2}) by the MasterTheorem. So far we haven't made any progress.
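The four-product recursion can be sketched in Python as follows (a minimal illustration, not a tuned implementation; the function name and the choice of k as half the larger bit length are ours, and built-in `*` appears only in the trivial base case):

```python
def multiply4(x: int, y: int) -> int:
    """Divide-and-conquer multiplication with FOUR recursive products,
    illustrating the T(n) = 4T(n/2) + Theta(n) recurrence (no speedup)."""
    if x < 2 or y < 2:
        return x * y                          # trivial operand: 0 or 1
    k = max(x.bit_length(), y.bit_length()) // 2
    x1, x0 = x >> k, x & ((1 << k) - 1)       # x = x1 * 2^k + x0
    y1, y0 = y >> k, y & ((1 << k) - 1)       # y = y1 * 2^k + y0
    return ((multiply4(x1, y1) << (2 * k))    # x1*y1 * 2^(2k)
            + (multiply4(x1, y0) << k)        # x1*y0 * 2^k
            + (multiply4(x0, y1) << k)        # x0*y1 * 2^k
            + multiply4(x0, y0))              # x0*y0
```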

The trick to the Karatsuba-Ofman algorithm is to compute the x_{1}y_{1} and x_{0}y_{0} terms with one recursive multiplication each, and then compute the sum of the remaining terms x_{1}y_{0} + x_{0}y_{1} with a single additional multiplication. This can be done by computing

z = (x_{1} - x_{0})(y_{0} - y_{1}) = x_{1}y_{0} - x_{1}y_{1} - x_{0}y_{0} + x_{0}y_{1}

so that

x_{1}y_{0} + x_{0}y_{1} = z + x_{1}y_{1} + x_{0}y_{0}.

So to compute xy, compute the three products above, and then let

xy = x_{1}y_{1}2^{2k} + (z + x_{1}y_{1} + x_{0}y_{0})2^{k} + x_{0}y_{0}.

Since we can compute the three products recursively using the same algorithm, we have T(n) = 3T(n/2) + Theta(n) = Theta(n^{lg 3}) by the MasterTheorem. This is O(n^{1.59}) or thereabouts, much better than the straightforward algorithm.
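A direct translation of the three-product scheme into Python (a sketch, not a tuned implementation; the name is ours, and built-in `*` is used only for trivial operands, with the sign checks handling the possibly negative factors in z):

```python
def karatsuba(x: int, y: int) -> int:
    """Karatsuba-Ofman multiplication: three recursive products
    instead of four, giving T(n) = 3T(n/2) + Theta(n)."""
    if x < 0:
        return -karatsuba(-x, y)
    if y < 0:
        return -karatsuba(x, -y)
    if x < 2 or y < 2:
        return x * y                          # trivial operand: 0 or 1
    k = max(x.bit_length(), y.bit_length()) // 2
    x1, x0 = x >> k, x & ((1 << k) - 1)       # x = x1 * 2^k + x0
    y1, y0 = y >> k, y & ((1 << k) - 1)       # y = y1 * 2^k + y0
    a = karatsuba(x1, y1)                     # product 1: x1*y1
    b = karatsuba(x0, y0)                     # product 2: x0*y0
    z = karatsuba(x1 - x0, y0 - y1)           # product 3 (may be negative)
    # x1*y0 + x0*y1 = z + a + b
    return (a << (2 * k)) + ((z + a + b) << k) + b
```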

A similar trick can be used to do fast matrix multiplication, by reducing multiplication of two n by n matrices to seven products of n/2 by n/2 matrices. The resulting algorithm, known as Strassen's Algorithm, has a recurrence T(n) = 7T(n/2) + O(n^{2}) with solution T(n) = Theta(n^{lg 7}). The value of lg 7 is about 2.8, so this is somewhat better than the Theta(n^{3}) running time for the straightforward algorithm. (See StrassensAlgorithm.)
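For comparison, here is a sketch of Strassen's seven-product scheme in Python, using lists of lists and assuming n is a power of two (the quadrant splitting and the standard seven combinations; padding and tuning are omitted):

```python
def strassen(A, B):
    """Strassen's algorithm for n x n matrices (n a power of two):
    seven recursive half-size products instead of eight."""
    n = len(A)
    if n == 1:
        return [[A[0][0] * B[0][0]]]
    h = n // 2
    # Split each matrix into four h x h quadrants.
    a11 = [row[:h] for row in A[:h]]; a12 = [row[h:] for row in A[:h]]
    a21 = [row[:h] for row in A[h:]]; a22 = [row[h:] for row in A[h:]]
    b11 = [row[:h] for row in B[:h]]; b12 = [row[h:] for row in B[:h]]
    b21 = [row[:h] for row in B[h:]]; b22 = [row[h:] for row in B[h:]]

    def add(X, Y): return [[x + y for x, y in zip(r, s)] for r, s in zip(X, Y)]
    def sub(X, Y): return [[x - y for x, y in zip(r, s)] for r, s in zip(X, Y)]

    # The seven products (additions/subtractions cost only O(n^2)).
    m1 = strassen(add(a11, a22), add(b11, b22))
    m2 = strassen(add(a21, a22), b11)
    m3 = strassen(a11, sub(b12, b22))
    m4 = strassen(a22, sub(b21, b11))
    m5 = strassen(add(a11, a12), b22)
    m6 = strassen(sub(a21, a11), add(b11, b12))
    m7 = strassen(sub(a12, a22), add(b21, b22))

    # Reassemble the quadrants of the product.
    c11 = add(sub(add(m1, m4), m5), m7)
    c12 = add(m3, m5)
    c21 = add(m2, m4)
    c22 = add(add(sub(m1, m2), m3), m6)
    return ([r1 + r2 for r1, r2 in zip(c11, c12)]
            + [r1 + r2 for r1, r2 in zip(c21, c22)])
```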

# 3. FFT approach

A still faster method is to apply a FastFourierTransform to the two numbers, which turns convolution (z_{i} = ∑_{j} x_{j}y_{i-j}; this is another way to describe the product) into pointwise multiplication (z_{i} = x_{i}y_{i}). We then carry out the pointwise multiplication and invert the FFT. The cost of this approach is O(n log n log log n) bit operations if implemented very carefully, as in the Schönhage-Strassen algorithm; see KnuthSeries volume 2, Section 4.3.3.C.
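A toy illustration of the idea in Python (ours, not the carefully implemented version Knuth describes): treat each number as a sequence of base-256 digits, transform both sequences with a naive recursive Cooley-Tukey FFT over complex floats, multiply pointwise, invert, and propagate carries. Floating-point rounding limits this sketch to moderately sized inputs:

```python
import cmath

def fft(a, invert=False):
    """Recursive Cooley-Tukey FFT; len(a) must be a power of two.
    The inverse transform omits the 1/n scaling (done by the caller)."""
    n = len(a)
    if n == 1:
        return a[:]
    even = fft(a[0::2], invert)
    odd = fft(a[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n)
        out[k] = even[k] + w * odd[k]
        out[k + n // 2] = even[k] - w * odd[k]
    return out

def fft_multiply(x: int, y: int) -> int:
    """Multiply nonnegative integers by convolving their base-256 digit
    sequences: FFT both, multiply pointwise, invert, then carry."""
    xs = x.to_bytes((x.bit_length() + 7) // 8 or 1, 'little')
    ys = y.to_bytes((y.bit_length() + 7) // 8 or 1, 'little')
    n = 1
    while n < len(xs) + len(ys):   # room for the full convolution
        n *= 2
    fa = fft([complex(b) for b in xs] + [0j] * (n - len(xs)))
    fb = fft([complex(b) for b in ys] + [0j] * (n - len(ys)))
    fc = fft([u * v for u, v in zip(fa, fb)], invert=True)
    result, carry = 0, 0
    for i, c in enumerate(fc):
        digit = carry + round(c.real / n)   # apply the 1/n scaling here
        carry, digit = divmod(digit, 256)   # carry into the next position
        result += digit << (8 * i)
    return result
```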