Polynomial multiplication is one of the most important problems in mathematics and computer science. It can be formally defined as follows:

You are given two polynomials roughly of equal size\[A(x)=a_0+a_1x+\dots+a_{n-1}x^{n-1}\]\[B(x)=b_0+b_1x+\dots+b_{n-1}x^{n-1}\]find a polynomial \[C(x)=c_0+c_1x+\dots+c_{2n-2}x^{2n-2}\] such that \(A(x)\cdot B(x)=C(x)\). At first, this may look like an easy problem, since you already know how to multiply polynomials. You just sum the product of every two terms, right?

def multiply(A, B): """ A = list of coefficients of A(x) B = list of coefficients of B(x) """ n = len(A) - 1 # C can have 2*n-2 degree at most C = [0 for i in range(2*n-1)] for i in range(n): for j in range(n): # adding a_i*b_j to c_{i+j} C[i+j] += A[i]*B[j]

It definitely works, but has asymptotic runtime of \(O(n^2)\). That’s quite bad. Can we do it faster?

Turns out we can! We can in fact multiply two polynomials in \(O(n\log n)\) time. How is that possible? It all comes down to looking at the problem from another perspective.

### Change of Perspective

So far we only saw polynomials as a list of coefficients. However, a polynomial of degree \(n\) can also be viewed as a curve in \(n+1\) dimensional space. It is quite possible define it in terms of the points it passes through. For instance, take the line \(P(x)=2x+1\) in 2d space. We only need to know two points on this line, say \((0, 1)\) and \((1, 3)\), to identify it uniquely. To determine a polynomial with degree \(n\), It has been proven that you need to know its values at \(n+1\) distinct points. This is called the Point Value Representation of Polynomials.

A polynomial \(P(x)\) in this form can be saved like \(\{0 : P(0), \dots, n : P(n)\}\), although you can evaluate \(P(x)\) at any complex point. (This will come handy later)

Note that multiplication is really cheap in this form. Given

\[A(x)=\{0:A(0),\dots (2n-2): A(2n-2)\}\] and \[B(x)=\{0:B(0),\dots (2n-2): B(2n-2)\}\] we can compute \(C(i)=A(i)\cdot B(i)\) at every point \(i\) from \(0\) to \(2n-2\). For instance, \(C(0)=A(0)\cdot B(0)\).This costs \(O(n)\) time. So, we are happy.

Wait, are we? We do not know the coefficients of \(C(x)\)! Evaluation at arbitrary points is very costly in point value form. So, we could not, for instance, raise \(C(x)\) to some arbitrary power. What if we could find a way to go back and forth between coefficient representation and point value representation? We could get the best of both worlds.

This is what Fast Fourier Transform allows us to do.

Before I move on, please note a couple of things.

- Prior to multiplication, we must evaluate both \(A(x)\) and \(B(x)\) at at least \(2n-1\) points even though each of them have degree \(n\). This is because \(C(x)\) has degree \(2n-2\).
- It is not necessary to evaluate the polynomials exactly at \(1,2,\dots,2n-2\). There is nothing special about these numbers. However, it is necessary that we evaluate both \(A(x)\) and \(B(x)\) at the same set of points.

### Coefficients to Point Value Representation

The only way to convert a polynomial with degree \(n\) to point value representation is to evaluate it at \(n+1\) distinct points. However, polynomial evaluation at a single point stills costs \(O(n)\) in coefficient representation. For instance, you can evaluate \(A(p)\) at any point \(p\) like this: \[A(p)=a_0+p(a_1+p(a_2+\cdots+pa_{n-1}))\] Then for \(n+1\) points, we would end up with \(O(n^2)\) runtime. However, you can save a lot of this cost by algebraic manipulation. Let me give you an example.

Suppose we want to compute \(P(x)=2+3x-4x^2+2x^3+5x^4\) at \(1, 2, -1, -2\). We can evaluate the odd powers and even powers separately since even powers of negative numbers is same as that of positive numbers. To be more precise, define \(Q(x)=2-4x+5x^2\) and \(R(x)=3x+2x^2\).

Note that we only kept the order of the coefficients from \(P(x)\), but not the associated powers \(x\). This gives an algebraic relationship as follows:

\[P(x)=Q(x^2)+xR(x^2)\] Now, note that \(Q((-1)^2)=Q(1^2), Q((-2)^2)=Q(2^2)\) and so on. Therefore, we will only need to compute \(Q(1), Q(4), R(1), R(4)\) in order to evaluate \(P(x)\) at four points. Both \(Q(x)\) and \(R(x)\) has half the degree as \(P(x)\), hence we’re saving almost half the work.

In order for trick to work, we must select our list of points very carefully. If the polynomial were much bigger, we would have to do that odd-even split multiple times. Every time we would like to have numbers with opposite signs, so that when squared, they give the same value. You can easily see that no initial set of real numbers can satisfy such requirements. This is why we choose roots of unity. In particular, we choose \(m=2^k\)th roots of unity such that \(2n\leq m\leq 4n\). (This is essentially the smallest power of two that’s greater than \(2n\)). They have this magic property that if \(w\) is a \(2^k\)th root, then so is \(-w\) regardless of which \(k\) you choose.

### Point Value to Coefficient Representation

If you know some linear algebra, you might notice that evaluating a polynomial at \(n+1\) point essentially defines a linear transformation, which can be represented by a matrix. Due to special nature of roots of unity, the inverse matrix consists of conjugates of previously used roots scaled by a factor of \(1/n\). I won’t show it here in details, but if you are interested you should check out this blogpost.

In English, the equations tell us that inverse of Fast Fourier Transform is another Fast Fourier Transform where every point of polynomial evaluation must be replaced by its conjugate, and the final result has to be scaled by a factor of \(1/n\).

### Time Complexity

Fast Fourier Transform is essentially a divide-and-conquer algorithm. The divide step corresponds to odd-even coefficient split. Each of the resulting subproblems has cost \(T(n/2)\). The combining step has cost \(O(n)\). (There are \(2n-2\) points and you have to compute the algebraic identity at each point.) So, the recurrence relationship should be

\[T(n)=2T\left(\frac{n}{2}\right)+O(n)\] Using master theorem, \(T(n)=O(n\log(n))\). Since FFT is the most expensive step in this approach, runtime of the entire process is dominated by \(O(n\log n)\). Therefore, polynomial multiplication also costs \(O(n\log n)\).

### Summary

- Given \(A(x)\) and \(B(x\) in coefficient representation, convert them to point value representation with Fast Fourier Transform. (\(O(n\log n)\)))
- Compute \(C(x)=A(x)\cdot B(x)\). (\(O(n)\))
- Go back to coefficients representation with Inverse Fast Fourier Transform. (\(O(n\log n)\)))