I recently ran into a situation where I needed a fast way to evaluate a polynomial on inputs 1,...,n where n. It turns out that there is a way of doing this quicker than just computing the evaluation using something like Horner’s method on each input, assuming that n is sufficiently larger than the degree of the polynomial. As with many other ingenious algorithms, this one can be found in Donald Knuth’s “The Art of Computer Programming”, where it’s mentioned briefly in section 4.6.4 and left as an exercise, and it gives something more general, namely a way to compute evaluations on points in arithmetic progression. Here I give some more details about the algorithm and provide an implementation in Rust.
Let f(x) = f_0 + f_1x + \cdots + f_d x^d be a polynomial in some ring and let x_0 and h define an arithmetic progression, x_0, x_0 + h, x_0 + 2h, \ldots. There is a pre-computation step before we can get to the actual evaluation: First we compute the evaluations of the first d+1 elements of the arithmetic progression:
y_j = f(x_0 + jh), \quad j = 0,\ldots,d.
Since this gives us the first d+1 evaluations, this proves that the method is not quicker if we need fewer points than d+1. To complete the pre-processing, we let for all k=1,\ldots,d and j=d,\ldots,k (going down)
y_j = y_j - y_{j-1}.Now, y_0 holds the evaluation of the first value in the arithmetic progression, i.e. f(x_0), and we can implement the evaluation as an iterator. To get the next evaluation, let for j = 0,\ldots, d-1,
y_j = y_j + y_{j+1},\qquad (1)and yield y_0 as the next evaluation. Notice that in this step, an evaluation can be computed with d additions where a naive evaluation using Horner’s method requires d additions and d multiplications.
There are a few optimisations that can be implemented for this algorithm:
- Since the first d+1 evaluations are computed in the preprocessing, we can store these and yield them directly. This is a very small optimisation since the iteration step (1) has to be computed anyway, and it gives larger state for the iterator.
- Notice that for a single iteration, y_0 only depends on y_0 and y_1, so for the last iterations of this iterator, some of the steps in (1) can be skipped.
The algorithm implemented in Rust is available on GitHub.