43.08.01 · numerical-analysis / 08-interpolation-approximation

Polynomial interpolation: existence, uniqueness, and the Lagrange form

shipped3 tiersLean: none

Anchor (Master): Davis 1975 *Interpolation and Approximation* (Dover) Ch. 2-3 (the interpolation problem as a moment problem, Vandermonde theory, the remainder); Trefethen 2019 *Approximation Theory and Approximation Practice* extended ed. (SIAM) Ch. 5 (the barycentric formula and its numerical stability); Berrut-Trefethen 2004 *SIAM Review* 46(3) (barycentric Lagrange interpolation as the form to use in practice); Higham 2004 *IMA J. Numer. Anal.* 24(4) (numerical stability of barycentric interpolation)

Intuition Beginner

Suppose you have measured a quantity at a handful of separate inputs — the temperature at five times of day, say — and you want a smooth formula that hits every reading exactly and lets you guess the value in between. The oldest answer is to fit a single polynomial through the points. A polynomial is built from powers of $x$ , it is easy to evaluate, easy to differentiate, and easy to integrate, so once you have one passing through your data you can compute with it freely. This unit is about the one polynomial that threads your points, why it exists, why there is only one, and three different ways to write it down.

The first surprise is a counting fact. Two points determine a unique straight line. Three points determine a unique parabola. In general, if you have a certain number of points with all-different inputs, there is exactly one polynomial of a matching modest degree that passes through all of them. Not zero, not many — exactly one. The degree you need grows with the number of points: one point needs a constant, two need a line, and adding a point bumps the degree up by one.

The second idea is a clever way to build that polynomial by hand. For each data point you cook up a helper polynomial that equals $1$ at that point's input and $0$ at every other input. Multiply each helper by the reading you want there and add them up. Because each helper switches on at exactly one input and switches off at all the others, the sum automatically lands on the right value at every input. These helpers are the Lagrange building blocks, and the recipe is the Lagrange formula.

There is a price to the most direct version of the recipe: if you later collect one more reading, you seem to have to rebuild everything from scratch. A second way of writing the same polynomial, due to Newton, fixes this by adding one correction term per new point, reusing all the earlier work. The two recipes always produce the identical curve; they are two notations for one object.

Visual Beginner

Picture five dots on graph paper, each at a different horizontal position. The interpolating polynomial is the single curve of the right degree that passes through all five dots at once. The table below shows the heart of the Lagrange construction for three nodes at horizontal positions $x_{0}, x_{1}, x_{2}$ : one helper curve per node, each rigged to read $1$ at its own node and $0$ at the others.

node	helper value at $x_{0}$	helper value at $x_{1}$	helper value at $x_{2}$
helper for $x_{0}$	$1$	$0$	$0$
helper for $x_{1}$	$0$	$1$	$0$
helper for $x_{2}$	$0$	$0$	$1$

Each row is a helper polynomial; the pattern of ones on the diagonal and zeros off it is the whole trick. When you scale row $k$ by the data value $f_{k}$ and add the rows, the diagonal of ones guarantees the total reads $f_{k}$ at node $x_{k}$ , so the combined curve threads every data dot.

Worked example Beginner

Let us fit a polynomial through the three points $(0, 1)$ , $(1, 3)$ , and $(2, 7)$ . Three points call for a polynomial of degree at most two — a parabola or simpler.

Step 1. Build the helper for the node $x_{0} = 0$ . It must read $1$ at $0$ and $0$ at the other inputs $1$ and $2$ . Start with $(x - 1) (x - 2)$ , which is zero at $1$ and at $2$ . At $x = 0$ this equals $(0 - 1) (0 - 2) = 2$ , so divide by $2$ to make it read $1$ there. The helper is $L_{0} (x) = (x - 1) (x - 2) \div 2$ .

Step 2. Build the helper for $x_{1} = 1$ . It must be zero at $0$ and $2$ , so start with $(x - 0) (x - 2)$ . At $x = 1$ this is $(1) (1 - 2) = - 1$ , so divide by $- 1$ . The helper is $L_{1} (x) = x (x - 2) \div (- 1) = - x (x - 2)$ .

Step 3. Build the helper for $x_{2} = 2$ . It must vanish at $0$ and $1$ , so start with $x (x - 1)$ . At $x = 2$ this is $(2) (1) = 2$ , so divide by $2$ . The helper is $L_{2} (x) = x (x - 1) \div 2$ .

Step 4. Scale each helper by its data value and add: $p (x) = 1 \cdot L_{0} (x) + 3 \cdot L_{1} (x) + 7 \cdot L_{2} (x)$ . Multiplying out, $L_{0} = (x^{2} - 3 x + 2) \div 2$ , $3 L_{1} = - 3 x^{2} + 6 x$ , and $7 L_{2} = (7 x^{2} - 7 x) \div 2$ . Adding gives $p (x) = x^{2} + x + 1$ .

Step 5. Check. At $x = 0$ , $p = 1$ ; at $x = 1$ , $p = 1 + 1 + 1 = 3$ ; at $x = 2$ , $p = 4 + 2 + 1 = 7$ . All three match.

What this tells us: the helper recipe turns "pass through these points" into a short, mechanical computation, and the answer is one definite polynomial — here the parabola $x^{2} + x + 1$ — that you can now evaluate anywhere you like.

Check your understanding Beginner

Formal definition Intermediate+

Throughout, $F$ denotes $R$ or $C$ , and $Π_{n}$ denotes the vector space of polynomials of degree at most $n$ with coefficients in $F$ ; $dim Π_{n} = n + 1$ , with the monomial basis $1, x, \dots, x^{n}$ . Write $[x^{k}] q$ for the coefficient of $x^{k}$ in a polynomial $q$ (the leading-coefficient operator). The symbol $\prod_{j}$ denotes a finite product over the indicated index set.

Definition (the interpolation problem). Let $x_{0}, x_{1}, \dots, x_{n} \in F$ be $n + 1$ distinct points — the interpolation nodes — and let $f_{0}, f_{1}, \dots, f_{n} \in F$ be prescribed values. A polynomial $p \in Π_{n}$ interpolates the data $(x_{i}, f_{i})$ if $p (x_{i}) = f_{i}$ for every $i = 0, \dots, n$ . When the values arise from a function $f$ , so that $f_{i} = f (x_{i})$ , $p$ is the interpolating polynomial of $f$ at the nodes and is written $p_{n}$ .

The conditions $p (x_{i}) = f_{i}$ are $n + 1$ linear equations in the $n + 1$ unknown coefficients of $p$ . In the monomial basis $p (x) = \sum_{j = 0}^{n} a_{j} x^{j}$ , the system is $V a = f$ , where $V$ is the Vandermonde matrix $V_{ij} = x_{i}^{j}$ ( $0 \leq i, j \leq n$ ), $a = (a_{0}, \dots, a_{n})^{⊤}$ , and $f = (f_{0}, \dots, f_{n})^{⊤}$ . Solvability and uniqueness are therefore governed by $det V$ .

Definition (Lagrange cardinal polynomials). For nodes $x_{0}, \dots, x_{n}$ , the Lagrange cardinal polynomials (or cardinal functions) are $$ L_k(x) = \prod_{\substack{j = 0 \ j \ne k}}^{n} \frac{x - x_j}{x_k - x_j}, \qquad k = 0, \dots, n. $$ Each $L_{k} \in Π_{n}$ , and by construction $L_{k} (x_{i}) = δ_{k i}$ (the Kronecker delta: $1$ if $i = k$ , else $0$ ), because the numerator kills $L_{k}$ at every node but $x_{k}$ , and the denominator normalises its value at $x_{k}$ to $1$ . The Lagrange form of the interpolant is $$ p_n(x) = \sum_{k=0}^{n} f_k, L_k(x). $$

Definition (divided differences and the Newton form). The divided differences of $f$ on the nodes are defined recursively. The zeroth order is $f [x_{i}] = f (x_{i})$ , and for $k \geq 1$ , $$ f[x_i, x_{i+1}, \dots, x_{i+k}] = \frac{f[x_{i+1}, \dots, x_{i+k}] - f[x_i, \dots, x_{i+k-1}]}{x_{i+k} - x_i}. $$ With the Newton basis $π_{0} (x) = 1$ and $π_{k} (x) = \prod_{j = 0}^{k - 1} (x - x_{j})$ for $k \geq 1$ , the Newton form of the interpolant is $$ p_n(x) = \sum_{k=0}^{n} f[x_0, \dots, x_k], \pi_k(x) = f[x_0] + f[x_0,x_1](x - x_0) + \cdots + f[x_0,\dots,x_n]\prod_{j=0}^{n-1}(x - x_j). $$ The leading coefficient $[x^{n}] p_{n} = f [x_{0}, \dots, x_{n}]$ is the top divided difference. Adding a new node $x_{n + 1}$ appends a single term $f [x_{0}, \dots, x_{n + 1}] π_{n + 1} (x)$ without disturbing the earlier coefficients — the incremental property that the Lagrange form lacks.

Definition (barycentric weights and form). The barycentric weights are $w_{k} = 1/ \prod_{j \neq = k} (x_{k} - x_{j})$ . The second (true) barycentric form of the interpolant, valid at any $x$ that is not a node, is $$ p_n(x) = \left. \sum_{k=0}^{n} \frac{w_k}{x - x_k}, f_k ;\middle/; \sum_{k=0}^{n} \frac{w_k}{x - x_k} \right. . $$

Counterexamples to common slips Intermediate+

"Repeated nodes are fine if the values agree." The existence-uniqueness theory requires the nodes to be pairwise distinct; with $x_{i} = x_{j}$ the Vandermonde matrix has two equal rows and is singular, and the cardinal denominators contain a zero factor. Matching a value and a derivative at a coincident node is Hermite interpolation, a separate construction (forward-pointed to 43.08.03).
"A higher-degree polynomial also works, so the interpolant is not unique." Uniqueness is asserted within $Π_{n}$ — degree at most $n$ . There are infinitely many higher-degree polynomials through the same $n + 1$ points; the theorem pins down the unique one of degree $\leq n$ , equivalently the unique representative modulo the node polynomial $\prod_{i} (x - x_{i})$ .
"The Lagrange and Newton forms give different polynomials." They are two coordinate systems for the same unique interpolant. The Lagrange form expands it in the cardinal basis ${L_{k}}$ ; the Newton form expands it in the basis ${π_{k}}$ . Both equal $p_{n}$ as functions on all of $F$ .
"The naive Lagrange formula is unstable, so avoid Lagrange interpolation." The representation is sound; only the textbook evaluation $\sum_{k} f_{k} L_{k} (x)$ , which recomputes each $L_{k}$ from scratch, is wasteful and can lose accuracy. The barycentric rewrite of the identical polynomial restores $O (n)$ -per-point evaluation and numerical stability (master tier).

Key theorem with proof Intermediate+

The signature result is the unisolvence theorem: through $n + 1$ distinct nodes there passes one and only one polynomial of degree at most $n$ . Existence is delivered constructively by the Lagrange form; uniqueness rests on the fact that a nonzero polynomial of degree at most $n$ cannot have $n + 1$ roots. The two facts together say the evaluation map is a linear isomorphism, which is why every later representation describes the same object.

Theorem (unisolvence of polynomial interpolation). Let $x_{0}, \dots, x_{n} \in F$ be pairwise distinct and let $f_{0}, \dots, f_{n} \in F$ be arbitrary. There exists exactly one polynomial $p \in Π_{n}$ with $p (x_{i}) = f_{i}$ for all $i$ . Equivalently, the evaluation map $E : Π_{n} \to F^{n + 1}$ , $E (p) = (p (x_{0}), \dots, p (x_{n}))$ , is a linear isomorphism, and the Vandermonde determinant is $$ \det V = \prod_{0 \le i < j \le n} (x_j - x_i) \ne 0. $$

Proof. Existence. The Lagrange combination $p (x) = \sum_{k = 0}^{n} f_{k} L_{k} (x)$ lies in $Π_{n}$ , being a sum of degree- $n$ cardinal polynomials. Evaluating at a node $x_{i}$ and using $L_{k} (x_{i}) = δ_{k i}$ gives $p (x_{i}) = \sum_{k} f_{k} δ_{k i} = f_{i}$ . So an interpolant exists.

Uniqueness. Suppose $p, q \in Π_{n}$ both interpolate the data. Their difference $d = p - q \in Π_{n}$ satisfies $d (x_{i}) = 0$ for all $n + 1$ distinct nodes. A nonzero polynomial of degree at most $n$ has at most $n$ roots, so a polynomial of degree $\leq n$ vanishing at $n + 1$ distinct points is the zero polynomial. Hence $d = 0$ and $p = q$ .

The Vandermonde determinant. The map $E$ is linear between spaces of equal dimension $n + 1$ , so it is an isomorphism precisely when it is injective; injectivity is the uniqueness just shown (the kernel is the set of degree- $\leq n$ polynomials vanishing at all nodes, namely ${0}$ ). In the monomial basis $E$ is the matrix $V$ with $V_{ij} = x_{i}^{j}$ , so $det V \neq = 0$ . The explicit value follows by induction on $n$ : subtract $x_{0}$ times column $j - 1$ from column $j$ (top-down), factor $(x_{i} - x_{0})$ from each row $i \geq 1$ , and reduce to the $n \times n$ Vandermonde on $x_{1}, \dots, x_{n}$ ; the row factors accumulate to $\prod_{i \geq 1} (x_{i} - x_{0})$ , and the induction yields $det V = \prod_{0 \leq i < j \leq n} (x_{j} - x_{i})$ . Distinctness of the nodes makes every factor nonzero. $□$

Corollary (the Newton leading coefficient). The coefficient of $x^{n}$ in the interpolant of $f$ at $x_{0}, \dots, x_{n}$ equals the top divided difference: $[x^{n}] p_{n} = f [x_{0}, \dots, x_{n}]$ . Consequently the divided difference is a symmetric function of its arguments, since the interpolant — and hence its leading coefficient — does not depend on the order in which the nodes are listed. The recurrence then computes the whole Newton form in $O (n^{2})$ arithmetic from a triangular table, and adding a node costs one further $O (n)$ pass.

Bridge. The unisolvence theorem is the foundational reason every representation in this chapter describes one object: existence-as-construction (the Lagrange cardinal basis) and uniqueness-as-rank (no degree- $\leq n$ polynomial has $n + 1$ roots) together say the evaluation map is an isomorphism, so the Lagrange, Newton, and barycentric formulas are change-of-basis images of a single interpolant — this is exactly the statement that ${L_{k}}$ and ${π_{k}}$ are two bases for $Π_{n}$ adapted to the same nodes. The result builds toward the interpolation error theorem of 43.08.02, where the node polynomial $\prod_{i} (x - x_{i})$ — visible here as the unique element of $Π_{n + 1}$ vanishing at all nodes — controls $f - p_{n}$ , and it appears again in the quadrature rules of 43.09.01, which integrate $p_{n}$ in place of $f$ . The central insight is that interpolation is a linear-isomorphism statement about $Π_{n}$ dressed in three notations, which generalises the Vandermonde nonsingularity into the broader theory of unisolvent (Haar) systems and is dual to the moment problem of reconstructing a measure from its action on $Π_{n}$ . Putting these together, the choice among the three forms is governed entirely by the computational task — cheap incremental updates favour Newton, stable repeated evaluation favours barycentric — never by which "polynomial" results.

Exercises Intermediate+

Exercise 3 (medium, symbolic).

Build the divided-difference table for $f$ at $x_{0} = 0, x_{1} = 1, x_{2} = 2$ with values $f_{0} = 1, f_{1} = 3, f_{2} = 7$ , and write the Newton form of the interpolant. Confirm it equals the $x^{2} + x + 1$ found by the Lagrange method.

Hint

Compute $f [x_{0}, x_{1}]$ , $f [x_{1}, x_{2}]$ , then $f [x_{0}, x_{1}, x_{2}]$ from the recurrence, and assemble $f [x_{0}] + f [x_{0}, x_{1}] (x - x_{0}) + f [x_{0}, x_{1}, x_{2}] (x - x_{0}) (x - x_{1})$ .

Answer

First differences: $f [x_{0}, x_{1}] = (3 - 1) / (1 - 0) = 2$ and $f [x_{1}, x_{2}] = (7 - 3) / (2 - 1) = 4$ . Second difference: $f [x_{0}, x_{1}, x_{2}] = (4 - 2) / (2 - 0) = 1$ . The Newton form is $p (x) = 1 + 2 (x - 0) + 1 \cdot (x - 0) (x - 1) = 1 + 2 x + x^{2} - x = x^{2} + x + 1$ . This matches the Lagrange result, illustrating that both forms produce the unique interpolant. Rubric: full credit for the correct table entries, the assembled Newton form, and the simplification to $x^{2} + x + 1$ .

Exercise 4 (medium, numeric).

Using the table from Exercise 3, a fourth point $(x_{3}, f_{3}) = (3, 21)$ is added. Compute the new top divided difference $f [x_{0}, x_{1}, x_{2}, x_{3}]$ and state the additional Newton term.

Hint

You need $f [x_{1}, x_{2}, x_{3}]$ first (from $f [x_{1}, x_{2}]$ and $f [x_{2}, x_{3}]$ ), then combine with $f [x_{0}, x_{1}, x_{2}]$ over $x_{3} - x_{0}$ . Only one new column of the table is computed.

Answer

$f [x_{2}, x_{3}] = (21 - 7) / (3 - 2) = 14$ ; $f [x_{1}, x_{2}, x_{3}] = (14 - 4) / (3 - 1) = 5$ ; $f [x_{0}, x_{1}, x_{2}, x_{3}] = (5 - 1) / (3 - 0) = \frac{4}{3}$ . The appended Newton term is $\frac{4}{3} x (x - 1) (x - 2)$ , added on top of $x^{2} + x + 1$ without changing the earlier coefficients. This is the incremental update the Lagrange form cannot perform cheaply. Rubric: full credit for the new column, the value $4/3$ , and the appended term.

Exercise 6 (medium, symbolic).

Show that the barycentric weight $w_{k} = 1/ \prod_{j \neq = k} (x_{k} - x_{j})$ satisfies $L_{k} (x) = w_{k} ℓ (x) / (x - x_{k})$ , where $ℓ (x) = \prod_{j = 0}^{n} (x - x_{j})$ is the node polynomial.

Hint

Note $ℓ (x) / (x - x_{k}) = \prod_{j \neq = k} (x - x_{j})$ , and compare with the definition of $L_{k}$ .

Answer

Cancelling the factor $(x - x_{k})$ gives $ℓ (x) / (x - x_{k}) = \prod_{j \neq = k} (x - x_{j})$ , which is exactly the numerator of $L_{k}$ . The definition $L_{k} (x) = \prod_{j \neq = k} (x - x_{j}) / \prod_{j \neq = k} (x_{k} - x_{j})$ then reads $L_{k} (x) = (\prod_{j \neq = k} (x - x_{j})) \cdot w_{k} = w_{k} ℓ (x) / (x - x_{k})$ , since $w_{k}$ is the reciprocal of the denominator product. This first barycentric form $p_{n} (x) = ℓ (x) \sum_{k} w_{k} f_{k} / (x - x_{k})$ is the bridge to the second form, obtained by dividing through by $ℓ (x) \sum_{k} w_{k} / (x - x_{k}) = 1$ (the partition-of-unity identity of Exercise 5 written barycentrically). Rubric: full credit for the cancellation and the identification of $w_{k}$ .

Exercise 7 (hard, symbolic).

Prove the divided-difference recurrence $f [x_{0}, \dots, x_{k}] = \frac{f [ x _{1} , \dots , x _{k} ] - f [ x _{0} , \dots , x _{k - 1} ]}{x _{k} - x _{0}}$ from the characterisation $[x^{k}] p = f [x_{0}, \dots, x_{k}]$ of the divided difference as a leading interpolation coefficient.

Hint

Let $p$ interpolate on $x_{0}, \dots, x_{k}$ , let $q$ interpolate on $x_{1}, \dots, x_{k}$ , and $r$ on $x_{0}, \dots, x_{k - 1}$ . Verify that $\frac{( x - x _{0} ) q ( x ) - ( x - x _{k} ) r ( x )}{x _{k} - x _{0}}$ interpolates the full data, then use uniqueness and compare leading coefficients.

Answer

Let $q \in Π_{k - 1}$ interpolate $f$ on $x_{1}, \dots, x_{k}$ and $r \in Π_{k - 1}$ on $x_{0}, \dots, x_{k - 1}$ . Define $$ \tilde p(x) = \frac{(x - x_0),q(x) - (x - x_k),r(x)}{x_k - x_0} \in \Pi_k. $$ At an interior node $x_{i}$ ( $1 \leq i \leq k - 1$ ), both $q (x_{i}) = r (x_{i}) = f_{i}$ , so $\tilde{p} (x_{i}) = \frac{( x _{i} - x _{0} ) - ( x _{i} - x _{k} )}{x _{k} - x _{0}} f_{i} = f_{i}$ . At $x_{0}$ the first term vanishes and $\tilde{p} (x_{0}) = \frac{- ( x _{0} - x _{k} ) r ( x _{0} )}{x _{k} - x _{0}} = r (x_{0}) = f_{0}$ . At $x_{k}$ the second term vanishes and $\tilde{p} (x_{k}) = \frac{( x _{k} - x _{0} ) q ( x _{k} )}{x _{k} - x _{0}} = q (x_{k}) = f_{k}$ . So $\tilde{p}$ interpolates all $k + 1$ points and, being in $Π_{k}$ , equals $p$ by uniqueness. Comparing the coefficient of $x^{k}$ : $[x^{k}] \tilde{p} = \frac{[ x ^{k - 1} ] q - [ x ^{k - 1} ] r}{x _{k} - x _{0}} = \frac{f [ x _{1} , \dots , x _{k} ] - f [ x _{0} , \dots , x _{k - 1} ]}{x _{k} - x _{0}}$ , while $[x^{k}] p = f [x_{0}, \dots, x_{k}]$ . Equating gives the recurrence. Rubric: full credit for the convex-combination construction, the node-by-node verification, and the leading-coefficient comparison.

Exercise 8 (hard, symbolic).

Let $ℓ (x) = \prod_{j = 0}^{n} (x - x_{j})$ . Prove that the barycentric weight satisfies $w_{k} = 1/ ℓ^{'} (x_{k})$ , where $ℓ^{'}$ is the derivative.

Hint

Differentiate the product $ℓ$ using the product rule; at $x = x_{k}$ all terms but one vanish.

Answer

By the product rule, $ℓ^{'} (x) = \sum_{m = 0}^{n} \prod_{j \neq = m} (x - x_{j})$ . Evaluate at $x = x_{k}$ : every summand with $m \neq = k$ contains the factor $(x_{k} - x_{k}) = 0$ and drops out, leaving only the $m = k$ term, $ℓ^{'} (x_{k}) = \prod_{j \neq = k} (x_{k} - x_{j})$ . Therefore $w_{k} = 1/ \prod_{j \neq = k} (x_{k} - x_{j}) = 1/ ℓ^{'} (x_{k})$ . This identity makes the barycentric weights computable directly from the node polynomial and is the form used in numerical practice. Rubric: full credit for the product-rule expansion, the vanishing of off-diagonal terms at $x_{k}$ , and the conclusion.

Advanced results Master

The elementary theory fixes the interpolant; the master layer concerns how the three representations differ in cost and numerical behaviour, how the divided difference encodes analytic data, and the structural reason the barycentric form is the one to compute with.

Theorem 1 (equivalence and cost of the three representations). The Lagrange, Newton, and second barycentric formulas all evaluate to the unique interpolant $p_{n} \in Π_{n}$ . Their construction-and-evaluation costs differ: naive Lagrange recomputes $n + 1$ products of $n$ factors at each evaluation point, costing $O (n^{2})$ per point; the Newton form costs $O (n^{2})$ once to build the divided-difference table and $O (n)$ per point by Horner's rule on the nested products, with $O (n)$ incremental cost to add a node; the barycentric form costs $O (n^{2})$ once to precompute the weights $w_{k} = 1/ ℓ^{'} (x_{k})$ and $O (n)$ per point thereafter, and a new node updates all weights in $O (n)$ by dividing each existing $w_{k}$ by $(x_{k} - x_{n + 1})$ and forming $w_{n + 1}$ directly ^{[Berrut-Trefethen 2004]}.

Theorem 2 (numerical stability of the barycentric form). The second barycentric formula is backward stable for evaluating $p_{n}$ : the computed value is the exact interpolant of slightly perturbed data $f_{k} (1 + ϵ_{k})$ with $∣ ϵ_{k} ∣$ bounded by a small multiple of unit roundoff times the Lebesgue-type amplification, for any node distribution; for nodes whose Lebesgue constant grows slowly (Chebyshev points), the formula is forward accurate as well. The naive Lagrange and the Newton forms can suffer cancellation that the barycentric quotient structure avoids, because the same potentially large factor $ℓ (x)$ appears in numerator and denominator and cancels ^{[Higham 2004]}. This overturns the classical verdict that Lagrange interpolation is numerically inferior: the representation is excellent once written barycentrically; only the textbook evaluation rule was at fault.

Theorem 3 (divided differences as derivatives and the Hermite-Genocchi formula). Divided differences are the discrete analogue of derivatives, and they limit to them: if all nodes coalesce, $f [x_{0}, \dots, x_{n}] \to f^{(n)} (ξ) / n!$ . The exact statement is the Hermite-Genocchi integral $$ f[x_0, \dots, x_n] = \int_{\Sigma_n} f^{(n)}!\Big(x_0 + \sum_{i=1}^n t_i (x_i - x_0)\Big), dt, $$ where $Σ_{n} = {t \in R^{n} : t_{i} \geq 0, \sum t_{i} \leq 1}$ is the standard simplex (volume $1/ n!$ ). The formula holds for $f \in C^{n}$ , makes the symmetry of the divided difference in its arguments manifest, and shows divided differences depend continuously on the nodes — so the Newton form extends smoothly to the confluent (Hermite) case where nodes repeat and derivative data is matched, the construction owned by 43.08.03.

Theorem 4 (conditioning and the basis question). Solving the interpolation problem through the Vandermonde system $V a = f$ in the monomial basis is ill-conditioned: the condition number of $V$ grows exponentially in $n$ for equispaced nodes on a real interval, so recovering the monomial coefficients $a_{j}$ is numerically hopeless even though the interpolant itself is well-determined. The remedy is never to form the monomial coefficients: the Newton basis triangularises the system (its matrix is lower-triangular, solved by the $O (n^{2})$ divided-difference recurrence) and the cardinal/barycentric basis diagonalises it (the coefficients are the data values $f_{k}$ themselves). The instability is an artifact of the monomial coordinate choice, not of the interpolation problem; the conditioning of the map data $\mapsto$ value is governed instead by the Lebesgue constant (forward cross-ref to the conditioning theory of 43.01.01 and the node-placement cure of 43.08.04) ^{[Higham 2004]}.

Synthesis. The unisolvence theorem is the foundational reason interpolation, linear algebra, and approximation are one subject seen from three angles: the evaluation map $Π_{n} \to F^{n + 1}$ is an isomorphism, and the Lagrange, Newton, and barycentric formulas are three bases of $Π_{n}$ adapted to the nodes — cardinal (diagonalising), Newton (triangularising), monomial (the ill-conditioned ambient frame one must avoid). The central insight is that the interpolant is uniquely and stably determined while its monomial coordinates are not, so the entire practical theory is the discipline of computing with $p_{n}$ through a well-conditioned basis; this is exactly the lesson the SVD teaches for least-squares in 01.01.12, where the right orthonormal basis tames a problem the naive normal-equation basis renders ill-conditioned, and it is dual to the moment problem, where one reconstructs the action of a functional on $Π_{n}$ from finitely many samples. Putting these together, the node polynomial $ℓ (x) = \prod_{i} (x - x_{i})$ threads the chapter: it is the kernel of the evaluation map on $Π_{n + 1}$ , supplies the barycentric weights through $w_{k} = 1/ ℓ^{'} (x_{k})$ , is the controllable factor in the interpolation error of 43.08.02, and its optimal root placement is the Chebyshev story of 43.08.04. The bridge is that choosing nodes to minimise $∥ ℓ ∥_{\infty}$ at once cures the Runge divergence, minimises the Lebesgue constant, and supplies the Gauss quadrature points of 43.09.03 — one extremal problem behind the whole edifice, which generalises the finite-dimensional isomorphism proved here into the asymptotic approximation theory the section develops.

Full proof set Master

Proposition 1 (Vandermonde determinant). For pairwise distinct nodes $x_{0}, \dots, x_{n}$ , the Vandermonde matrix $V$ with $V_{ij} = x_{i}^{j}$ has $det V = \prod_{0 \leq i < j \leq n} (x_{j} - x_{i})$ .

Proof. Induct on $n$ . For $n = 0$ , $det V = 1$ (empty product). For the step, perform the column operations $C_{j} \leftarrow C_{j} - x_{0} C_{j - 1}$ for $j = n, n - 1, \dots, 1$ (descending, so each uses the un-modified previous column); these leave the determinant unchanged. Row $0$ becomes $(1, 0, \dots, 0)$ , and row $i \geq 1$ has entries $x_{i}^{j} - x_{0} x_{i}^{j - 1} = x_{i}^{j - 1} (x_{i} - x_{0})$ in column $j \geq 1$ . Expanding along row $0$ reduces $det V$ to the determinant of the $n \times n$ minor with entries $x_{i}^{j - 1} (x_{i} - x_{0})$ for $1 \leq i \leq n$ , $1 \leq j \leq n$ . Factoring $(x_{i} - x_{0})$ from each row $i$ gives $det V = \prod_{i = 1}^{n} (x_{i} - x_{0}) \cdot det \tilde{V}$ , where $\tilde{V}$ is the Vandermonde matrix on $x_{1}, \dots, x_{n}$ . By the inductive hypothesis $det \tilde{V} = \prod_{1 \leq i < j \leq n} (x_{j} - x_{i})$ , and combining the two products gives $det V = \prod_{0 \leq i < j \leq n} (x_{j} - x_{i})$ . Pairwise distinctness makes every factor nonzero, so $det V \neq = 0$ and the interpolation system is uniquely solvable. $□$

Proposition 2 (cardinal partition of unity and the second barycentric identity). The cardinal polynomials satisfy $\sum_{k = 0}^{n} L_{k} (x) = 1$ , and consequently $\sum_{k = 0}^{n} w_{k} / (x - x_{k}) = 1/ ℓ (x)$ for $x$ not a node, yielding the second barycentric form.

Proof. The polynomial $\sum_{k} L_{k} \in Π_{n}$ takes the value $\sum_{k} L_{k} (x_{i}) = \sum_{k} δ_{k i} = 1$ at every node $x_{i}$ , so it interpolates the constant data $1$ ; the constant polynomial $1 \in Π_{n}$ does too, and by Proposition-level uniqueness (the unisolvence theorem) the two agree identically, giving $\sum_{k} L_{k} = 1$ . Writing each $L_{k} (x) = w_{k} ℓ (x) / (x - x_{k})$ (cancellation of the common factor, as $ℓ (x) / (x - x_{k}) = \prod_{j \neq = k} (x - x_{j})$ and $w_{k} = 1/ \prod_{j \neq = k} (x_{k} - x_{j})$ ), the identity becomes $ℓ (x) \sum_{k} w_{k} / (x - x_{k}) = 1$ , i.e. $\sum_{k} w_{k} / (x - x_{k}) = 1/ ℓ (x)$ . The first barycentric form is $p_{n} (x) = \sum_{k} f_{k} L_{k} (x) = ℓ (x) \sum_{k} w_{k} f_{k} / (x - x_{k})$ ; dividing it by $1 = ℓ (x) \sum_{k} w_{k} / (x - x_{k})$ cancels $ℓ (x)$ and gives $p_{n} (x) = (\sum_{k} w_{k} f_{k} / (x - x_{k})) / (\sum_{k} w_{k} / (x - x_{k}))$ . $□$

Proposition 3 (divided-difference recurrence and Newton form). The recurrence $f [x_{0}, \dots, x_{k}] = (f [x_{1}, \dots, x_{k}] - f [x_{0}, \dots, x_{k - 1}]) / (x_{k} - x_{0})$ holds, and the Newton form $p_{n} = \sum_{k = 0}^{n} f [x_{0}, \dots, x_{k}] π_{k}$ is the interpolant on $x_{0}, \dots, x_{n}$ .

Proof. Define $f [x_{0}, \dots, x_{k}] := [x^{k}] p^{(k)}$ , the leading coefficient of the degree- $\leq k$ interpolant $p^{(k)}$ on $x_{0}, \dots, x_{k}$ . Let $q$ interpolate on $x_{1}, \dots, x_{k}$ and $r$ on $x_{0}, \dots, x_{k - 1}$ , both in $Π_{k - 1}$ . The polynomial $\tilde{p} (x) = ((x - x_{0}) q (x) - (x - x_{k}) r (x)) / (x_{k} - x_{0})$ lies in $Π_{k}$ and interpolates all of $x_{0}, \dots, x_{k}$ : at interior $x_{i}$ the bracket is $(x_{i} - x_{0}) f_{i} - (x_{i} - x_{k}) f_{i} = (x_{k} - x_{0}) f_{i}$ ; at $x_{0}$ only the $r$ -term survives and equals $- (x_{0} - x_{k}) f_{0} = (x_{k} - x_{0}) f_{0}$ ; at $x_{k}$ only the $q$ -term survives and equals $(x_{k} - x_{0}) f_{k}$ . By unisolvence $\tilde{p} = p^{(k)}$ . The coefficient of $x^{k}$ in $\tilde{p}$ is $([x^{k - 1}] q - [x^{k - 1}] r) / (x_{k} - x_{0}) = (f [x_{1}, \dots, x_{k}] - f [x_{0}, \dots, x_{k - 1}]) / (x_{k} - x_{0})$ , which is the recurrence. For the Newton form, observe $p^{(k)} - p^{(k - 1)} \in Π_{k}$ vanishes at $x_{0}, \dots, x_{k - 1}$ (where both interpolate $f$ ), so it is a scalar multiple of $π_{k} (x) = \prod_{j < k} (x - x_{j})$ ; matching the leading coefficient, $p^{(k)} - p^{(k - 1)} = f [x_{0}, \dots, x_{k}] π_{k}$ . Telescoping from $p^{(0)} = f [x_{0}]$ to $p^{(n)} = p_{n}$ gives $p_{n} = \sum_{k = 0}^{n} f [x_{0}, \dots, x_{k}] π_{k}$ . $□$

Proposition 4 (Hermite-Genocchi representation). For $f \in C^{n}$ on an interval containing the distinct nodes, $f [x_{0}, \dots, x_{n}] = \int_{Σ_{n}} f^{(n)} (x_{0} + \sum_{i = 1}^{n} t_{i} (x_{i} - x_{0})) d t$ , with $Σ_{n}$ the standard simplex.

Proof. Induct on $n$ . For $n = 1$ , the fundamental theorem of calculus gives $\int_{0}^{1} f^{'} (x_{0} + t_{1} (x_{1} - x_{0})) d t_{1} = \frac{f ( x _{1} ) - f ( x _{0} )}{x _{1} - x _{0}} = f [x_{0}, x_{1}]$ . For the step, write the inner integration over $t_{n}$ on the simplex: holding $t_{1}, \dots, t_{n - 1}$ fixed, integrating $f^{(n)}$ in $t_{n}$ from $0$ to $1 - \sum_{i < n} t_{i}$ produces, by the fundamental theorem, a difference of $f^{(n - 1)}$ evaluated at the two endpoints, which are the simplex-convex combinations on the node sets $x_{0}, \dots, x_{n - 1}$ and $x_{0}, \dots, x_{n - 2}, x_{n}$ . The remaining $(n - 1)$ -fold simplex integral of each is, by the inductive hypothesis, $f [x_{0}, \dots, x_{n - 1}]$ and $f [x_{0}, \dots, x_{n - 2}, x_{n}]$ respectively; their difference divided by $(x_{n} - x_{n - 1})$ is the divided-difference recurrence (Proposition 3, using symmetry to order the arguments), giving $f [x_{0}, \dots, x_{n}]$ . Symmetry in the arguments is visible directly: the integrand depends on the nodes only through the convex combination, which is symmetric under relabelling, and the limit $x_{i} \to x_{0}$ recovers $f^{(n)} (x_{0}) / n!$ since $vol (Σ_{n}) = 1/ n!$ . $□$

Connections Master

The node polynomial $ℓ (x) = \prod_{i} (x - x_{i})$ introduced here is the controllable factor in the interpolation error theorem $f (x) - p_{n} (x) = \frac{f ^{(n + 1)} ( ξ )}{( n + 1 )!} ℓ (x)$ developed at 43.08.02; that unit proves the remainder by applying Rolle's theorem to an auxiliary function and identifies the Runge phenomenon as the failure of $∥ ℓ ∥_{\infty}$ to shrink for equispaced nodes. The divided-difference-as-derivative limit proved here (Proposition 4) is exactly the mechanism by which the error constant $f^{(n + 1)} (ξ) / (n + 1)!$ equals a confluent divided difference.
The confluent (repeated-node) limit of the Newton form, where the divided difference matches derivative data via the Hermite-Genocchi continuity of Proposition 4, is the construction of Hermite and piecewise/spline interpolation in 43.08.03; the generalized divided differences with repeated arguments are the coefficients of the Hermite-Newton form, and the $C^{2}$ cubic spline assembles local interpolants of this kind into a global tridiagonal system.
The optimal placement of the roots of $ℓ (x)$ to minimise $∥ ℓ ∥_{\infty}$ — the extremal problem flagged in the synthesis — is the Chebyshev-node theory of 43.08.04, where the monic Chebyshev polynomial is shown to be the minimal-sup-norm monic degree- $(n + 1)$ polynomial, so its roots are the interpolation nodes that defeat the Runge divergence and minimise the Lebesgue constant governing the conditioning of the interpolation map.
Integrating the interpolant $p_{n}$ in place of $f$ produces the interpolatory quadrature rules of 43.09.01: the Lagrange form gives $\int_{a}^{b} f \approx \sum_{k} f_{k} \int_{a}^{b} L_{k}$ , so the quadrature weights are the integrated cardinal polynomials, and the Newton-Cotes family is precisely this construction at equispaced nodes; the Gauss rules of 43.09.03 choose the nodes as roots of orthogonal polynomials, the same node-placement freedom that the barycentric weight identity $w_{k} = 1/ ℓ^{'} (x_{k})$ exploits.

Historical & philosophical context Master

Polynomial interpolation predates the calculus of finite differences that systematised it. The cardinal-function formula is named for Joseph-Louis Lagrange, who published it in his 1795 Leçons élémentaires sur les mathématiques lectures at the École Normale, though Edward Waring had given the same construction in 1779 in the Philosophical Transactions and Leonhard Euler had used an equivalent device earlier; the attribution is one of the standard misnamings in the subject ^{[Lagrange 1795]}. The divided-difference calculus and the incremental form belong to Isaac Newton, whose Principia (1687, Book III, Lemma V) and the earlier Methodus differentialis (circulated 1676, published 1711) constructed interpolating polynomials through forward and divided differences precisely to interpolate astronomical and tabular data, decades before Lagrange ^{[Newton 1711]}.

The Vandermonde determinant carries the name of Alexandre-Théophile Vandermonde, whose 1772 Mémoire sur l'élimination treated symmetric functions and determinants, although the explicit product formula for what is now called his matrix does not appear in his work in that form — another instance of attributive drift. The integral representation of divided differences is due to Charles Hermite and Angelo Genocchi in the 1870s, tying the discrete difference calculus to the integral remainder. The barycentric formula, latent in the nineteenth-century literature, was isolated as the numerically preferred representation by Carl Runge and made the centrepiece of modern practice by Jean-Paul Berrut and Lloyd N. Trefethen in their 2004 SIAM Review survey, with the backward-stability analysis supplied by Nicholas Higham the same year ^{[Berrut-Trefethen 2004]}; their rehabilitation of the Lagrange form reversed a century of textbook preference for the Newton scheme on stability grounds.

Bibliography Master

@book{sulimayers2003,
  author    = {S\"{u}li, Endre and Mayers, David F.},
  title     = {An Introduction to Numerical Analysis},
  publisher = {Cambridge University Press},
  year      = {2003}
}

@book{stoerbulirsch2002,
  author    = {Stoer, Josef and Bulirsch, Roland},
  title     = {Introduction to Numerical Analysis},
  edition   = {3},
  publisher = {Springer},
  year      = {2002}
}

@article{berruttrefethen2004,
  author  = {Berrut, Jean-Paul and Trefethen, Lloyd N.},
  title   = {Barycentric {L}agrange Interpolation},
  journal = {SIAM Review},
  volume  = {46},
  number  = {3},
  year    = {2004},
  pages   = {501--517}
}

@article{higham2004,
  author  = {Higham, Nicholas J.},
  title   = {The numerical stability of barycentric {L}agrange interpolation},
  journal = {IMA Journal of Numerical Analysis},
  volume  = {24},
  number  = {4},
  year    = {2004},
  pages   = {547--556}
}

@book{davis1975,
  author    = {Davis, Philip J.},
  title     = {Interpolation and Approximation},
  publisher = {Dover Publications},
  year      = {1975}
}

@book{trefethen2019,
  author    = {Trefethen, Lloyd N.},
  title     = {Approximation Theory and Approximation Practice, Extended Edition},
  publisher = {SIAM},
  year      = {2019}
}

@book{lagrange1795,
  author    = {Lagrange, Joseph-Louis},
  title     = {Le\c{c}ons \'{e}l\'{e}mentaires sur les math\'{e}matiques, donn\'{e}es \`{a} l'\'{E}cole Normale},
  publisher = {S\'{e}ances des \'{E}coles Normales},
  year      = {1795}
}

@book{newton1711,
  author    = {Newton, Isaac},
  title     = {Methodus differentialis},
  publisher = {London},
  year      = {1711}
}

Prerequisites

01.01.09
02.05.02

Tier anchors

beginner: Süli-Mayers 2003 *An Introduction to Numerical Analysis* (Cambridge) §6.1-6.2 (the interpolation problem and the Lagrange form, opening at the level of a first numerical-methods course); Burden-Faires 2011 *Numerical Analysis* 9e (Brooks/Cole) §3.1 for the hands-on Lagrange-polynomial bookkeeping
intermediate: Süli-Mayers 2003 *An Introduction to Numerical Analysis* (Cambridge) §6.1-6.4 (existence-uniqueness via the Vandermonde determinant, the Lagrange cardinal basis, Newton's divided differences and the recurrence); Stoer-Bulirsch 2002 *Introduction to Numerical Analysis* 3e (Springer) §2.1 (Lagrange and Newton forms, the divided-difference table)
master: Davis 1975 *Interpolation and Approximation* (Dover) Ch. 2-3 (the interpolation problem as a moment problem, Vandermonde theory, the remainder); Trefethen 2019 *Approximation Theory and Approximation Practice* extended ed. (SIAM) Ch. 5 (the barycentric formula and its numerical stability); Berrut-Trefethen 2004 *SIAM Review* 46(3) (barycentric Lagrange interpolation as the form to use in practice); Higham 2004 *IMA J. Numer. Anal.* 24(4) (numerical stability of barycentric interpolation)

References

Süli, E. & Mayers, D. F. — An Introduction to Numerical Analysis · Cambridge University Press (2003). Chapter 6 ('Polynomial interpolation') sets up the interpolation problem of finding a polynomial of degree at most n agreeing with f at n+1 distinct points; §6.1 proves existence and uniqueness; §6.2 builds the Lagrange cardinal polynomials L_k and the Lagrange interpolation formula; §6.3 develops the Newton divided-difference form, the recurrence f[x_i,...,x_{i+k}] = (f[x_{i+1},...,x_{i+k}] - f[x_i,...,x_{i+k-1}])/(x_{i+k}-x_i), and the incremental construction; §6.4 proves the interpolation error theorem f(x)-p_n(x) = f^{(n+1)}(xi)/(n+1)! prod_i (x-x_i), which this unit forward-points to 43.08.02 rather than re-deriving in full.
Stoer, J. & Bulirsch, R. — Introduction to Numerical Analysis · Springer, 3rd edition (2002). §2.1 treats interpolation by polynomials: the unisolvence theorem (existence and uniqueness for n+1 distinct nodes), the Lagrange representation, Neville's algorithm for the value of the interpolant, and Newton's divided-difference scheme with the triangular divided-difference table; the equivalence of the Lagrange and Newton representations as two coordinate systems for the same unique interpolating polynomial.
Berrut, J.-P. & Trefethen, L. N. — Barycentric Lagrange Interpolation · SIAM Review 46(3) (2004), 501-517. Rewrites the Lagrange formula in the barycentric form p(x) = (sum_k w_k f_k/(x-x_k)) / (sum_k w_k/(x-x_k)) with barycentric weights w_k = 1/prod_{j != k}(x_k-x_j); shows the first and second (true) barycentric forms cost O(n) per evaluation after an O(n^2) weight precomputation, update cleanly when a node is added, and are numerically stable, making Lagrange's 'fragile' reputation an artifact of the naive evaluation rather than the representation.
Davis, P. J. — Interpolation and Approximation · Dover (1975, corrected reprint of the 1963 Blaisdell edition). Chapter 2 frames interpolation as the moment/Haar-system problem and proves the Vandermonde determinant formula det V = prod_{i<j}(x_j - x_i); Chapter 3 develops the divided-difference calculus, the Newton form, the Hermite-Genocchi integral representation of divided differences, and the remainder theory.

Estimated time

beginner: 20m
intermediate: 50m
master: 90m