43.08.02 · numerical-analysis / 08-interpolation-approximation

Interpolation error and the Runge phenomenon

shipped · 3 tiers · Lean: none

Anchor (Master): Trefethen 2019 *Approximation Theory and Approximation Practice* extended ed. (SIAM) Ch. 11-15 (the Lebesgue constant, the Runge phenomenon, potential theory and the equilibrium measure, exponential vs divergent convergence); Davis 1975 *Interpolation and Approximation* (Dover) Ch. 3-4 (the remainder, the contour-integral Hermite formula, and the Lebesgue function); Runge 1901 *Zeitschrift für Mathematik und Physik* 46 (the original divergence example)

Intuition Beginner

In the previous unit you learned that through any handful of points with different horizontal positions there passes exactly one polynomial of a matching degree. That settles which curve you get. This unit asks the next question: if the points were sampled from some true function, how far does the interpolating curve stray from the true function in the gaps between the points?

The answer has a pleasing shape. The mismatch at a position $x$ is a product of two things. One is a measure of how curvy the true function is — captured by a high derivative of the function, the part that a polynomial of limited degree simply cannot imitate. The other is a single polynomial you can read straight off the node positions: it is the product of the distances from $x$ to each node. Call it the node product. It is zero exactly at the nodes, which matches the fact that the curve hits every data point dead on, and it grows as you move away from the cluster of nodes.

You only get to choose one of these two factors. The curviness of the function is fixed once the function is given. But the node product depends entirely on where you place the sample points, and that is yours to decide. So the whole craft of accurate interpolation comes down to placing nodes so that the node product stays small across the interval.

Here is the surprise that makes this unit worth a chapter of its own. The most natural choice — spread the nodes out evenly — is a bad one. As you add more evenly spaced points, the curve can start to swing wildly near the ends of the interval, overshooting the true function by more and more, even when the true function is perfectly smooth. This runaway wobble is called the Runge phenomenon, and the cure is to bunch the nodes toward the ends rather than spacing them evenly.

Visual Beginner

Picture the true function as a calm curve and the interpolating polynomial as a curve forced to pass through the chosen sample dots. Between dots the two curves can part ways. The table below tracks the node product for five evenly spaced nodes on the interval from $- 1$ to $1$ : it is zero at each node and largest in the gaps, and the gaps near the two ends are where it swells the most.

position $x$	nearest behaviour	size of the node product
at a node	curve meets the true function	$0$
middle gap, near centre	small mismatch	small
outer gap, near an end	large mismatch	large

Each row is the same message read off the node product: the error is pinned to zero at every node and is free to grow in between, and with evenly spaced nodes it grows fastest near the ends. Moving the nodes closer together near the ends flattens those tall outer humps, which is the whole idea behind the cure.
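
The table can be checked numerically. The following is a minimal sketch, assuming the five evenly spaced nodes $-1, -0.5, 0, 0.5, 1$ and a simple grid search inside each gap; the helper names `node_product` and `gap_maximum` are illustrative and not part of the unit.

```python
def node_product(x, nodes):
    """Return the product of (x - x_k) over all nodes x_k."""
    result = 1.0
    for xk in nodes:
        result *= x - xk
    return result


def gap_maximum(lo, hi, nodes, samples=2001):
    """Largest |node product| on a uniform grid covering [lo, hi]."""
    step = (hi - lo) / (samples - 1)
    return max(abs(node_product(lo + i * step, nodes)) for i in range(samples))


# Five evenly spaced nodes on [-1, 1], as in the table above.
nodes = [-1.0, -0.5, 0.0, 0.5, 1.0]
gap_maxima = [gap_maximum(a, b, nodes) for a, b in zip(nodes, nodes[1:])]

for (a, b), m in zip(zip(nodes, nodes[1:]), gap_maxima):
    print(f"gap [{a:+.1f}, {b:+.1f}]: max |node product| = {m:.4f}")
```

On this grid the two outer gaps peak at roughly $0.11$ and the two inner gaps at roughly $0.04$, so the outer humps are more than twice as tall as the inner ones.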

Worked example Beginner

Let us measure the worst-case error of fitting a straight line through two points of the curve $f (x) = x^{2}$ on the interval from $0$ to $2$ . Take the nodes $x_{0} = 0$ and $x_{1} = 2$ .

Step 1. Build the line through the two points. At $x_{0} = 0$ , $f = 0$ ; at $x_{1} = 2$ , $f = 4$ . The line through $(0, 0)$ and $(2, 4)$ is $p (x) = 2 x$ .

Step 2. Write down the exact mismatch. The error is $f (x) - p (x) = x^{2} - 2 x = x (x - 2)$ . Notice this is exactly the node product $(x - 0) (x - 2)$ for these two nodes, and it is zero at both nodes as expected.

Step 3. Find where the mismatch is biggest. The product $x (x - 2)$ is an upward-opening parabola with its lowest point halfway between the nodes, at $x = 1$ . There the value is $(1) (1 - 2) = - 1$ , so the line sits $1$ unit above the curve at the midpoint, and $1$ is the worst-case error on the interval.

Step 4. Compare with the error formula. The formula says the error equals a second derivative of $f$ , divided by $2$ , times the node product. For $f (x) = x^{2}$ the second derivative is the constant $2$ , so the formula gives $(2 \div 2) x (x - 2) = x (x - 2)$ . This matches the exact mismatch from Step 2 perfectly.

What this tells us: even a function as tame as $x^{2}$ cannot be matched by a line except at the two nodes, and the formula predicts the gap exactly — a fixed curviness factor times the node product, biggest right in the middle of the gap.
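
The four steps can be replayed as a short numerical check. This is only a sketch: the grid of 1001 sample points on $[0, 2]$ is an assumption chosen so that the midpoint $x = 1$ is hit exactly.

```python
def f(x):
    """The true function from the worked example."""
    return x * x


def p(x):
    """The line through (0, 0) and (2, 4)."""
    return 2.0 * x


# Sample the interval [0, 2] on 1001 evenly spaced points.
grid = [2.0 * i / 1000 for i in range(1001)]

# Locate the sample point where the line strays furthest from the curve.
worst_x = max(grid, key=lambda x: abs(f(x) - p(x)))
worst_error = f(worst_x) - p(worst_x)
print(worst_x, worst_error)
```

The search lands on $x = 1$ with error $-1$, matching Step 3, and at every sample point the error agrees with the node product $x (x - 2)$ up to rounding.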

Formal definition Intermediate+

Throughout, $f : [a, b] \to \mathbb{R}$ is the function being interpolated and $x_{0}, x_{1}, \dots, x_{n} \in [a, b]$ are $n + 1$ pairwise distinct nodes; $p_{n} \in Π_{n}$ is the unique interpolating polynomial of $f$ at these nodes (existence and uniqueness are the unisolvence theorem of 43.08.01). The notation $f \in C^{m} [a, b]$ means $f$ has continuous derivatives through order $m$ on $[a, b]$ ; $f^{(m)}$ denotes the $m$ -th derivative.

Definition (node polynomial and interpolation error). The node polynomial (or nodal polynomial) for the nodes $x_{0}, \dots, x_{n}$ is the monic degree- $(n + 1)$ polynomial $$ \pi_{n+1}(x) = \prod_{k=0}^{n} (x - x_k) . $$ It is the unique monic element of $Π_{n + 1}$ vanishing at all $n + 1$ nodes. The interpolation error (or remainder) is the function $e_{n} (x) = f (x) - p_{n} (x)$ . It vanishes at every node, exactly as $π_{n + 1}$ does, so the quotient $e_{n} / π_{n + 1}$ is defined off the node set; the error theorem below identifies it.

Definition (Lebesgue function and Lebesgue constant). With $L_{k}$ the Lagrange cardinal polynomials of 43.08.01, the Lebesgue function for the node set is $$ \lambda_n(x) = \sum_{k=0}^{n} \lvert L_k(x) \rvert , $$ and the Lebesgue constant is its maximum, $$ \Lambda_n = \max_{x \in [a,b]} \lambda_n(x) = \max_{x \in [a,b]} \sum_{k=0}^{n} \lvert L_k(x) \rvert . $$ The interpolation operator $I_{n} : C [a, b] \to Π_{n}$ , $I_{n} f = p_{n}$ , is linear and idempotent (a projection); $Λ_{n}$ is precisely its operator norm in the supremum norm $∥ g ∥_{\infty} = \max_{x \in [a, b]} ∣ g (x)∣$ . The best approximation error of $f$ in $Π_{n}$ is $E_{n} (f) = \min_{q \in Π_{n}} ∥ f - q ∥_{\infty}$ , the quantity treated in 43.08.04.

The symbols $Π_{n}$ , $\prod$ , the supremum norm $∥ \cdot ∥_{\infty}$ , the divided difference $f [\cdot]$ , and $C^{m} [a, b]$ are recorded in _meta/NOTATION.md; the divided-difference recurrence and the node polynomial are inherited from 43.08.01.
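
The two definitions translate directly into code. The sketch below is one possible reading, not a reference implementation: it assumes the interval $[-1, 1]$, estimates the maximum by sampling on a uniform grid (so the printed constants are slight underestimates), and takes the roots of $T_{n+1}$ as the Chebyshev nodes. All function names are illustrative.

```python
import math


def lagrange_basis(k, x, nodes):
    """Value at x of the k-th Lagrange cardinal polynomial L_k."""
    value = 1.0
    for j, xj in enumerate(nodes):
        if j != k:
            value *= (x - xj) / (nodes[k] - xj)
    return value


def lebesgue_function(x, nodes):
    """lambda_n(x) = sum over k of |L_k(x)|."""
    return sum(abs(lagrange_basis(k, x, nodes)) for k in range(len(nodes)))


def lebesgue_constant(nodes, a=-1.0, b=1.0, samples=2001):
    """Grid estimate of Lambda_n = max of lambda_n over [a, b]."""
    step = (b - a) / (samples - 1)
    return max(lebesgue_function(a + i * step, nodes) for i in range(samples))


def equispaced(n):
    """The n + 1 equispaced nodes on [-1, 1]."""
    return [-1.0 + 2.0 * k / n for k in range(n + 1)]


def chebyshev(n):
    """The n + 1 roots of the Chebyshev polynomial T_{n+1}."""
    return [math.cos((2 * k + 1) * math.pi / (2 * n + 2)) for k in range(n + 1)]


for n in (5, 10, 20):
    print(n, lebesgue_constant(equispaced(n)), lebesgue_constant(chebyshev(n)))
```

At every node the Lebesgue function equals $1$, since exactly one cardinal polynomial is $1$ there and the rest vanish; the interesting values are the ones attained between nodes.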

Counterexamples to common slips Intermediate+

"The interpolation error is bounded by best approximation error." It is not bounded by it — it is bounded by $(1 + Λ_{n})$ times it. The Lebesgue constant is the amplification factor between the best a polynomial of degree $n$ can do and what interpolation at a fixed node set actually does. For equispaced nodes $Λ_{n}$ grows faster than any power of $n$ , so a small best-approximation error can coexist with a large interpolation error.
"A smooth (even analytic) function is always interpolated convergently." Runge's example $1/ (1 + 25 x^{2})$ is real-analytic on $[- 1, 1]$ yet its equispaced interpolants diverge in the sup norm. Convergence depends on the interplay between the function's complex-analytic singularities and the node distribution, not on real-line smoothness alone.
"The intermediate point $ξ$ in the error formula is the same for every $x$ ." The point $ξ = ξ (x)$ depends on the evaluation point $x$ and lies in the smallest interval containing $x$ and all the nodes. The formula is pointwise; only when one then bounds $∣ f^{(n + 1)} ∣$ by its maximum does the $x$ -dependence of $ξ$ drop out.
"Equispaced nodes minimise the node polynomial." They do the opposite near the endpoints: $∥ π_{n + 1} ∥_{\infty}$ for equispaced nodes is exponentially larger than for the Chebyshev nodes that minimise it. The node polynomial is the controllable factor, and equispacing is a poor control choice (43.08.04).

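The last slip can be made concrete by comparing $∥ π_{n + 1} ∥_{\infty}$ for the two node families on $[- 1, 1]$ . This is a sketch under stated assumptions: the sup norm is estimated on a uniform grid, the Chebyshev nodes are taken to be the roots of $T_{n + 1}$ (for which the node polynomial is $T_{n + 1} / 2^{n}$ with sup norm $2^{- n}$ ), and the helper names are hypothetical.

```python
import math


def node_polynomial(x, nodes):
    """pi_{n+1}(x) = product of (x - x_k)."""
    result = 1.0
    for xk in nodes:
        result *= x - xk
    return result


def sup_norm(nodes, samples=4001):
    """Grid estimate of the sup norm of the node polynomial on [-1, 1]."""
    step = 2.0 / (samples - 1)
    return max(abs(node_polynomial(-1.0 + i * step, nodes)) for i in range(samples))


def equispaced_nodes(n):
    return [-1.0 + 2.0 * k / n for k in range(n + 1)]


def chebyshev_nodes(n):
    """Roots of T_{n+1}."""
    return [math.cos((2 * k + 1) * math.pi / (2 * n + 2)) for k in range(n + 1)]


for n in (5, 10, 20):
    equi = sup_norm(equispaced_nodes(n))
    cheb = sup_norm(chebyshev_nodes(n))
    print(f"n={n:2d}  equispaced={equi:.3e}  Chebyshev={cheb:.3e}  ratio={equi / cheb:.1f}")
```

The ratio column grows with $n$ , which is the sense in which equispacing is a poor control choice.
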
Key theorem with proof Intermediate+

The signature result is the interpolation error theorem: it expresses the remainder $f - p_{n}$ as a product of an $(n + 1)$ -st derivative of $f$ and the node polynomial. The derivative factor measures the part of $f$ that no degree- $n$ polynomial can capture; the node polynomial is the factor the analyst controls through node placement. The proof uses only Rolle's theorem 02.05.02, applied in $n + 1$ successive rounds to an auxiliary function rigged to have one extra zero. This follows Süli-Mayers ^{[Süli-Mayers §6.4]}.

Theorem (interpolation error / Cauchy remainder). Let $f \in C^{n + 1} [a, b]$ and let $p_{n} \in Π_{n}$ interpolate $f$ at the distinct nodes $x_{0}, \dots, x_{n} \in [a, b]$ . For every $x \in [a, b]$ there exists a point $ξ = ξ (x)$ in the open interval $I_{x}$ , the interior of the smallest closed interval containing $x$ and all the nodes, such that $$ f(x) - p_n(x) = \frac{f^{(n+1)}(\xi)}{(n+1)!}\,\pi_{n+1}(x), \qquad \pi_{n+1}(x) = \prod_{k=0}^{n}(x - x_k). $$

Proof. If $x$ coincides with a node, both sides vanish and there is nothing to prove; fix $x$ not equal to any node. Define the constant $$ G = \frac{f(x) - p_n(x)}{\pi_{n+1}(x)}, $$ which is well-defined because $π_{n + 1} (x) \neq 0$ for $x$ off the node set, and form the auxiliary function $$ \phi(t) = f(t) - p_n(t) - G\,\pi_{n+1}(t), \qquad t \in [a,b]. $$ Then $ϕ \in C^{n + 1} [a, b]$ , since $f$ is and $p_{n}, π_{n + 1}$ are polynomials. At each node $t = x_{k}$ , both $f (x_{k}) - p_{n} (x_{k}) = 0$ and $π_{n + 1} (x_{k}) = 0$ , so $ϕ (x_{k}) = 0$ . At $t = x$ , the definition of $G$ gives $ϕ (x) = (f (x) - p_{n} (x)) - G π_{n + 1} (x) = 0$ . Thus $ϕ$ vanishes at the $n + 2$ distinct points $x, x_{0}, \dots, x_{n}$ .

Apply Rolle's theorem on each of the $n + 1$ consecutive subintervals determined by these $n + 2$ zeros: between any two adjacent zeros of $ϕ$ there is a zero of $ϕ^{'}$ , giving $ϕ^{'}$ at least $n + 1$ distinct zeros in $I_{x}$ . Repeating on $ϕ^{'}$ yields $ϕ^{''}$ with at least $n$ zeros, and after $n + 1$ such steps $ϕ^{(n + 1)}$ has at least one zero $ξ \in I_{x}$ . Differentiating $ϕ$ exactly $n + 1$ times: $p_{n} \in Π_{n}$ has $p_{n}^{(n + 1)} \equiv 0$ , while $π_{n + 1}$ is monic of degree $n + 1$ , so $π_{n + 1}^{(n + 1)} \equiv (n + 1)!$ . Hence $$ 0 = \phi^{(n+1)}(\xi) = f^{(n+1)}(\xi) - G\,(n+1)!, $$ so $G = f^{(n + 1)} (ξ) / (n + 1)!$ . Substituting back into the definition of $G$ gives $f (x) - p_{n} (x) = (f^{(n + 1)} (ξ) / (n + 1)!) π_{n + 1} (x)$ . $□$

Corollary (uniform error bound). If $M_{n + 1} = \max_{t \in [a, b]} ∣ f^{(n + 1)} (t)∣$ , then $$ \lVert f - p_n \rVert_\infty \le \frac{M_{n+1}}{(n+1)!}\,\lVert \pi_{n+1} \rVert_\infty . $$ The bound factorises into a function-dependent part $M_{n + 1} / (n + 1)!$ and a node-dependent part $∥ π_{n + 1} ∥_{\infty}$ , the latter the sole quantity under the analyst's control and the entry point for node optimisation in 43.08.04.
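
The corollary can be sanity-checked on a case where $M_{n + 1}$ is known in closed form. The sketch below assumes $f (x) = e^{x}$ on $[- 1, 1]$ with five equispaced nodes ( $n = 4$ ), so that $M_{5} = e$ ; both sup norms are estimated on the same uniform grid, and the function names are illustrative.

```python
import math


def interpolate(x, nodes, values):
    """Evaluate the Lagrange-form interpolant at x."""
    total = 0.0
    for k, xk in enumerate(nodes):
        term = values[k]
        for j, xj in enumerate(nodes):
            if j != k:
                term *= (x - xj) / (xk - xj)
        total += term
    return total


def node_polynomial(x, nodes):
    """pi_{n+1}(x) = product of (x - x_k)."""
    result = 1.0
    for xk in nodes:
        result *= x - xk
    return result


nodes = [-1.0, -0.5, 0.0, 0.5, 1.0]
values = [math.exp(t) for t in nodes]
n = len(nodes) - 1
grid = [-1.0 + 2.0 * i / 2000 for i in range(2001)]

max_error = max(abs(math.exp(x) - interpolate(x, nodes, values)) for x in grid)
pi_norm = max(abs(node_polynomial(x, nodes)) for x in grid)

# Every derivative of exp is exp, so on [-1, 1] its maximum is e (at x = 1).
M = math.e
bound = M / math.factorial(n + 1) * pi_norm
print(f"observed max error = {max_error:.3e}, corollary bound = {bound:.3e}")
```

Because the fifth derivative of $e^{x}$ also stays above $e^{- 1}$ on the interval, the pointwise theorem gives a matching lower estimate $e^{- 1} ∥ π_{5} ∥_{\infty} / 5!$ for the observed error, which brackets it from both sides.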

Bridge. This theorem is the foundational reason node placement is the whole game: the error splits into a derivative factor fixed by $f$ and the node polynomial $π_{n + 1}$ fixed by the analyst, and the corollary makes minimising $∥ π_{n + 1} ∥_{\infty}$ the one lever available. This is exactly the optimisation that the Chebyshev theory solves, so the result builds toward the minimax node placement of 43.08.04, and the node polynomial $π_{n + 1} = \prod_{k} (x - x_{k})$ appears again in the quadrature error of 43.09.01, where the same product governs how well $\int p_{n}$ approximates $\int f$ . The Rolle-iteration mechanism generalises the single mean value theorem of 02.05.02 from one secant to $n + 1$ nested differences, and the leading-coefficient identity $π_{n + 1}^{(n + 1)} = (n + 1)!$ is dual to the divided-difference reading of the remainder: putting these together, $f (x) - p_{n} (x) = f [x_{0}, \dots, x_{n}, x] π_{n + 1} (x)$ , so the error constant is itself a divided difference on the $n + 2$ points $x_{0}, \dots, x_{n}, x$ , the central insight tying this unit to the divided-difference calculus of 43.08.01.

Exercises Intermediate+

Exercise 4 (medium, symbolic).

Prove the divided-difference form of the remainder: for distinct nodes and $x$ not a node, $f (x) - p_{n} (x) = f [x_{0}, \dots, x_{n}, x] π_{n + 1} (x)$ .

Hint

Let $q$ be the interpolant of $f$ on the $n + 2$ points $x_{0}, \dots, x_{n}, x$ . Use the Newton form: $q = p_{n} + f [x_{0}, \dots, x_{n}, x] π_{n + 1}$ , and evaluate at $x$ .

Answer

Let $q \in Π_{n + 1}$ interpolate $f$ at the $n + 2$ distinct points $x_{0}, \dots, x_{n}, x$ . Writing $q$ in Newton form on these points and grouping the first $n + 1$ terms, which together form the interpolant $p_{n}$ on $x_{0}, \dots, x_{n}$ , gives $q (t) = p_{n} (t) + f [x_{0}, \dots, x_{n}, x] \prod_{k = 0}^{n} (t - x_{k}) = p_{n} (t) + f [x_{0}, \dots, x_{n}, x] π_{n + 1} (t)$ . Since $q$ interpolates $f$ at $x$ , evaluating at $t = x$ yields $f (x) = q (x) = p_{n} (x) + f [x_{0}, \dots, x_{n}, x] π_{n + 1} (x)$ , so $f (x) - p_{n} (x) = f [x_{0}, \dots, x_{n}, x] π_{n + 1} (x)$ . Rubric: full credit for the Newton-form grouping, the identification of the first $n + 1$ terms with $p_{n}$ , and the evaluation at $x$ .
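
The identity can also be confirmed numerically. The following sketch assumes $f (x) = e^{x}$ , the nodes $0, 0.5, 1, 1.5$ and the evaluation point $x = 0.8$ ; the helpers `divided_differences` and `newton_eval` are hypothetical names for the standard in-place divided-difference table and Horner-style Newton evaluation.

```python
import math


def divided_differences(xs, fs):
    """Return the Newton coefficients f[x0], f[x0,x1], ..., f[x0,...,xm]."""
    coeffs = list(fs)
    m = len(xs)
    for level in range(1, m):
        # Work from the back so lower-order entries are still available.
        for i in range(m - 1, level - 1, -1):
            coeffs[i] = (coeffs[i] - coeffs[i - 1]) / (xs[i] - xs[i - level])
    return coeffs


def newton_eval(t, xs, coeffs):
    """Evaluate the Newton-form polynomial at t by nested multiplication."""
    result = coeffs[-1]
    for k in range(len(coeffs) - 2, -1, -1):
        result = result * (t - xs[k]) + coeffs[k]
    return result


nodes = [0.0, 0.5, 1.0, 1.5]
x = 0.8

# Left-hand side: f(x) - p_n(x), with p_n built on the nodes alone.
coeffs = divided_differences(nodes, [math.exp(t) for t in nodes])
lhs = math.exp(x) - newton_eval(x, nodes, coeffs)

# Right-hand side: f[x0,...,xn,x] * pi_{n+1}(x), with x appended as a last point.
extended = nodes + [x]
dd = divided_differences(extended, [math.exp(t) for t in extended])[-1]
pi_x = 1.0
for xk in nodes:
    pi_x *= x - xk
rhs = dd * pi_x

print(lhs, rhs, dd)
```

For this choice of $f$ the divided difference must also lie between $e^{0} /4!$ and $e^{1.5} /4!$ , which is the mean-value property taken up in Exercise 5.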

Exercise 5 (medium, symbolic).

Compare the two remainder forms of Exercise 4 and the Key theorem to deduce the mean-value identity $f [x_{0}, \dots, x_{n}, x] = f^{(n + 1)} (ξ) / (n + 1)!$ for some $ξ$ . State what this becomes as all $n + 2$ points coalesce to a single point $c$ .

Hint

Both forms equal $f (x) - p_{n} (x)$ and share the factor $π_{n + 1} (x) \neq 0$ ; cancel it. For the limit, relabel: consider a divided difference on $m + 1$ coalescing points.

Answer

Equating the Rolle remainder $(f^{(n + 1)} (ξ) / (n + 1)!) π_{n + 1} (x)$ with the divided-difference remainder $f [x_{0}, \dots, x_{n}, x] π_{n + 1} (x)$ and cancelling the common nonzero factor $π_{n + 1} (x)$ gives $f [x_{0}, \dots, x_{n}, x] = f^{(n + 1)} (ξ) / (n + 1)!$ , the mean-value property of divided differences on $n + 2$ points. As all points coalesce to $c$ , the divided difference on $m + 1$ equal arguments becomes $f^{(m)} (c) / m!$ ; here with $m = n + 1$ this is $f^{(n + 1)} (c) / (n + 1)!$ , consistent with $ξ \to c$ . Rubric: full credit for the cancellation, the mean-value statement, and the coalescent limit.

Exercise 6 (medium, numeric).

For Runge's function $f (x) = 1/ (1 + 25 x^{2})$ on $[- 1, 1]$ , the fourth derivative at $0$ is $f^{(4)} (0) = 15000$ . Using degree- $3$ interpolation at the four equispaced nodes $- 1, - \frac{1}{3}, \frac{1}{3}, 1$ , give the leading factor $∣ f^{(4)} (0)∣ /4!$ that multiplies the node polynomial in the error bound at the centre.

Hint

The error formula has prefactor $f^{(4)} (ξ) /4!$ ; here you are asked only for $∣ f^{(4)} (0)∣ /4!$ with the given derivative value.

Answer

$625$ . Compute $15000/4! = 15000/24 = 625$ . The point of the exercise is the magnitude: even though $f$ is bounded by $1$ , its high derivatives are enormous ( $f^{(4)} (0) = 15000$ ), and these large derivatives are precisely the engine of the Runge divergence: the prefactor $M_{n + 1} / (n + 1)!$ grows geometrically with $n$ , and for equispaced nodes $∥ π_{n + 1} ∥_{\infty}$ does not shrink fast enough to compensate. Rubric: full credit for $625$ and a sentence linking large derivatives to divergence.
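
The derivative value quoted in the exercise can be recovered, as a quick check, from the geometric series of $f$ about the origin, which converges for $∣ x ∣ < 1/5$ : $$ \frac{1}{1 + 25x^{2}} = \sum_{j=0}^{\infty} (-25)^{j} x^{2j} = 1 - 25x^{2} + 625x^{4} - \cdots, \qquad \frac{f^{(4)}(0)}{4!} = 625, \qquad f^{(4)}(0) = 24 \cdot 625 = 15000 . $$ The same series gives $f^{(2 j)} (0) / (2 j)! = (- 25)^{j}$ , so the even-order Taylor coefficients have magnitude $5^{2 j}$ , and the radius of convergence $1/5$ is the distance from the origin to the poles $\pm i /5$ .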

Exercise 7 (hard, symbolic).

Prove the near-best inequality $∥ f - I_{n} f ∥_{\infty} \leq (1 + Λ_{n}) E_{n} (f)$ , where $I_{n}$ is the interpolation projection and $E_{n} (f)$ the best sup-norm approximation error.

Hint

Let $q^{⋆}$ be a best approximant. Use $I_{n} q^{⋆} = q^{⋆}$ (interpolation reproduces polynomials), insert it, apply the triangle inequality, and bound $∥ I_{n} (f - q^{⋆}) ∥_{\infty} \leq Λ_{n} ∥ f - q^{⋆} ∥_{\infty}$ .

Answer

Let $q^{⋆} \in Π_{n}$ attain $∥ f - q^{⋆} ∥_{\infty} = E_{n} (f)$ . Since $I_{n}$ is a projection onto $Π_{n}$ , $I_{n} q^{⋆} = q^{⋆}$ . Then $$ f - I_n f = (f - q^\star) - I_n f + q^\star = (f - q^\star) - I_n(f - q^\star). $$ By the triangle inequality, $∥ f - I_{n} f ∥_{\infty} \leq ∥ f - q^{⋆} ∥_{\infty} + ∥ I_{n} (f - q^{⋆}) ∥_{\infty}$ . The operator-norm inequality $∥ I_{n} g ∥_{\infty} \leq Λ_{n} ∥ g ∥_{\infty}$ (the Lebesgue constant is the sup-norm operator norm of $I_{n}$ ) bounds the second term by $Λ_{n} ∥ f - q^{⋆} ∥_{\infty}$ . Hence $∥ f - I_{n} f ∥_{\infty} \leq (1 + Λ_{n}) ∥ f - q^{⋆} ∥_{\infty} = (1 + Λ_{n}) E_{n} (f)$ . Rubric: full credit for the polynomial-reproduction step, the triangle inequality, and the operator-norm bound by $Λ_{n}$ .

Exercise 8 (hard, symbolic).

Show that the Lebesgue constant is the operator norm of $I_{n}$ , i.e. $Λ_{n} = \sup_{∥ g ∥_{\infty} = 1} ∥ I_{n} g ∥_{\infty}$ , with $Λ_{n} = \max_{x} \sum_{k} ∣ L_{k} (x)∣$ .

Hint

Upper bound: $∣ I_{n} g (x)∣ = ∣ \sum_{k} g (x_{k}) L_{k} (x)∣ \leq ∥ g ∥_{\infty} \sum_{k} ∣ L_{k} (x)∣$ . Lower bound: at the maximising $x^{⋆}$ , choose $g$ continuous with $∥ g ∥_{\infty} = 1$ and $g (x_{k}) = \operatorname{sign} L_{k} (x^{⋆})$ .

Answer

For any $g$ with $∥ g ∥_{\infty} \leq 1$ , the Lagrange form gives $I_{n} g (x) = \sum_{k} g (x_{k}) L_{k} (x)$ , so $∣ I_{n} g (x)∣ \leq \sum_{k} ∣ g (x_{k})∣ ∣ L_{k} (x)∣ \leq \sum_{k} ∣ L_{k} (x)∣ = λ_{n} (x) \leq Λ_{n}$ . Taking the max over $x$ shows the operator norm is $\leq Λ_{n}$ . For the reverse, let $x^{⋆}$ attain $λ_{n} (x^{⋆}) = Λ_{n}$ . Choose values $s_{k} = \operatorname{sign} L_{k} (x^{⋆}) \in \{- 1, + 1\}$ (taking $s_{k} = + 1$ if $L_{k} (x^{⋆}) = 0$ ) and let $g \in C [a, b]$ with $∥ g ∥_{\infty} = 1$ and $g (x_{k}) = s_{k}$ (a continuous interpolant of the signs exists, for instance the piecewise-linear one). Then $I_{n} g (x^{⋆}) = \sum_{k} s_{k} L_{k} (x^{⋆}) = \sum_{k} ∣ L_{k} (x^{⋆})∣ = Λ_{n}$ , so the operator norm is $\geq Λ_{n}$ . The two bounds give equality. Rubric: full credit for the upper bound via the Lagrange form, the sign-construction for the lower bound, and the conclusion of equality.
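
The sign construction in the lower bound can be carried out explicitly. The sketch below assumes five equispaced nodes on $[- 1, 1]$ , locates $x^{⋆}$ by grid search rather than exactly, and takes $g$ to be the piecewise-linear function through the points $(x_{k}, s_{k})$ ; these choices and the names used are illustrative.

```python
def lagrange_basis(k, x, nodes):
    """Value at x of the k-th Lagrange cardinal polynomial L_k."""
    value = 1.0
    for j, xj in enumerate(nodes):
        if j != k:
            value *= (x - xj) / (nodes[k] - xj)
    return value


def lebesgue_function(x, nodes):
    return sum(abs(lagrange_basis(k, x, nodes)) for k in range(len(nodes)))


nodes = [-1.0, -0.5, 0.0, 0.5, 1.0]
grid = [-1.0 + 2.0 * i / 2000 for i in range(2001)]

# Grid approximation of the point where the Lebesgue function peaks.
x_star = max(grid, key=lambda x: lebesgue_function(x, nodes))
signs = [1.0 if lagrange_basis(k, x_star, nodes) >= 0 else -1.0
         for k in range(len(nodes))]


def g(t):
    """Piecewise-linear function through the points (x_k, s_k)."""
    for i in range(len(nodes) - 1):
        if nodes[i] <= t <= nodes[i + 1]:
            w = (t - nodes[i]) / (nodes[i + 1] - nodes[i])
            return (1.0 - w) * signs[i] + w * signs[i + 1]
    raise ValueError("t lies outside the interval spanned by the nodes")


# The interpolant of g at x_star only sees the nodal values g(x_k) = s_k.
interpolant_at_star = sum(g(nodes[k]) * lagrange_basis(k, x_star, nodes)
                          for k in range(len(nodes)))
lebesgue_at_star = lebesgue_function(x_star, nodes)
sup_g = max(abs(g(t)) for t in grid)
print(x_star, interpolant_at_star, lebesgue_at_star, sup_g)
```

The function $g$ never exceeds $1$ in absolute value, yet its interpolant at $x^{⋆}$ equals the Lebesgue function there, which is the amplification the operator norm records.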

Advanced results Master

The pointwise remainder fixes the local error; the master layer concerns the global behaviour as the degree grows — when interpolation converges, when it diverges, and the single quantity, the Lebesgue constant, that adjudicates.

Theorem 1 (Lebesgue inequality and the role of node distribution). For any node set, $∥ f - I_{n} f ∥_{\infty} \leq (1 + Λ_{n}) E_{n} (f)$ , with $Λ_{n}$ the Lebesgue constant of the nodes and $E_{n} (f)$ the best uniform approximation error. By the Weierstrass theorem $E_{n} (f) \to 0$ for every $f \in C [a, b]$ , so interpolation converges uniformly whenever $Λ_{n} E_{n} (f) \to 0$ . The Lebesgue constant depends only on the nodes, not on $f$ : for equispaced nodes on $[- 1, 1]$ it grows as $Λ_{n} \sim 2^{n + 1} / (e n \log n)$ , faster than any polynomial, while for Chebyshev nodes $Λ_{n} \sim (2/ π) \log n$ grows only logarithmically ^{[Trefethen Ch. 15]}. The contrast is the precise content of the slogan that node placement, not node count, governs convergence.

Theorem 2 (the Runge phenomenon). Equispaced polynomial interpolation of $f (x) = 1/ (1 + 25 x^{2})$ on $[- 1, 1]$ diverges: $∥ f - I_{n} f ∥_{\infty} \to \infty$ as $n \to \infty$ , with the error growing geometrically and concentrated near the endpoints $\pm 1$ , even though $f$ is real-analytic on $[- 1, 1]$ ^{[Runge 1901]}. The mechanism is twofold. The function-side factor $f^{(n + 1)}$ in the remainder grows like $(n + 1)! ρ^{- (n + 1)}$ , governed by the distance $ρ = 1/5$ from $[- 1, 1]$ to the poles $\pm i /5$ of $f$ in the complex plane. The node-side factor $∣ π_{n + 1} (x)∣$ for equispaced nodes is larger near the endpoints than near the centre by a factor growing like $2^{n}$ (Proposition 4 of the full proof set); the product diverges precisely where the equispaced node polynomial swells. Potential theory makes this exact: the limiting node density of equispaced points is uniform, whose logarithmic potential is non-constant on $[- 1, 1]$ , so the node polynomial cannot be uniformly small.
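
The divergence is easy to observe directly. The following is a rough experiment rather than a proof: it assumes naive Lagrange-form evaluation, estimates the sup-norm error on a uniform grid of 2001 points, and uses the roots of $T_{n + 1}$ as the Chebyshev comparison nodes. The function names are illustrative.

```python
import math


def runge(x):
    """Runge's function 1 / (1 + 25 x^2)."""
    return 1.0 / (1.0 + 25.0 * x * x)


def interpolate(x, nodes, values):
    """Evaluate the Lagrange-form interpolant at x."""
    total = 0.0
    for k, xk in enumerate(nodes):
        term = values[k]
        for j, xj in enumerate(nodes):
            if j != k:
                term *= (x - xj) / (xk - xj)
        total += term
    return total


def max_error(nodes, samples=2001):
    """Grid estimate of the sup-norm interpolation error on [-1, 1]."""
    values = [runge(t) for t in nodes]
    step = 2.0 / (samples - 1)
    return max(abs(runge(-1.0 + i * step) - interpolate(-1.0 + i * step, nodes, values))
               for i in range(samples))


def equispaced_nodes(n):
    return [-1.0 + 2.0 * k / n for k in range(n + 1)]


def chebyshev_nodes(n):
    """Roots of T_{n+1}."""
    return [math.cos((2 * k + 1) * math.pi / (2 * n + 2)) for k in range(n + 1)]


for n in (5, 10, 20):
    print(f"n={n:2d}  equispaced error={max_error(equispaced_nodes(n)):.3e}  "
          f"Chebyshev error={max_error(chebyshev_nodes(n)):.3e}")
```

Between $n = 10$ and $n = 20$ the equispaced error grows by more than a factor of ten while the Chebyshev error shrinks, in line with the theorem and with the cure described in the beginner tier.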

Theorem 3 (potential-theoretic convergence criterion). Let the nodes be drawn with limiting density $μ$ (a probability measure on $[- 1, 1]$ ), and let $U^{μ} (z) = \int \log ∣ z - t ∣^{- 1} d μ (t)$ be its logarithmic potential, with $U_{\min}^{μ} = \min_{x \in [- 1, 1]} U^{μ} (x)$ . Interpolation at these nodes converges geometrically for functions analytic throughout the closed region $\{z : U^{μ} (z) \geq U_{\min}^{μ}\}$ and can diverge when $f$ has a singularity inside it; for the uniform density of equispaced nodes this region is bounded by the level curve of $U^{μ}$ through the endpoints $\pm 1$ . For the equilibrium (arcsine) measure $d μ = π^{- 1} (1 - t^{2})^{- 1/2} d t$ , the limiting density of Chebyshev nodes, $U^{μ}$ is constant on $[- 1, 1]$ , the level curves are confocal ellipses (Bernstein ellipses), and interpolation converges geometrically for any function analytic in a neighbourhood of the interval ^{[Trefethen Ch. 12]}. The uniform density of equispaced nodes is not the equilibrium measure, which is the root cause of Runge divergence; the arcsine clustering of Chebyshev nodes toward the endpoints is exactly the correction.

Theorem 4 (divided-difference remainder and confluence). The remainder admits the divided-difference form $f (x) - p_{n} (x) = f [x_{0}, \dots, x_{n}, x] π_{n + 1} (x)$ , valid for $x$ off the node set and extending by continuity to all $x$ when $f \in C^{n + 1}$ ^{[Stoer-Bulirsch §2.1.4]}. The divided difference on $n + 2$ points equals $f^{(n + 1)} (ξ) / (n + 1)!$ by the mean-value property, and as the evaluation point $x$ migrates to a node the formula passes continuously to the Hermite (repeated-node) remainder of 43.08.03, where derivative data is matched and the node polynomial acquires repeated factors $\prod_{k} (x - x_{k})^{m_{k}}$ . The divided difference is thus the analytic object unifying the Lagrange remainder, the Hermite remainder, and the coalescent limit $f [x_{0}, \dots, x_{n}] \to f^{(n)} (ξ) / n!$ established in 43.08.01.

Synthesis. The error theorem is the foundational reason interpolation accuracy is a question of geometry rather than of smoothness alone: the remainder factorises into a derivative of $f$ and the node polynomial $π_{n + 1}$ , and the Lebesgue constant $Λ_{n}$ packages the same node-dependence into the operator norm of the interpolation projection, so the convergence theory is the study of one node-determined quantity. The central insight is that node placement controls $∥ π_{n + 1} ∥_{\infty}$ and $Λ_{n}$ at once: equispaced nodes make the node polynomial swell exponentially and leave $Λ_{n}$ super-polynomial, while Chebyshev nodes make $∥ π_{n + 1} ∥_{\infty}$ minimal and $Λ_{n}$ merely logarithmic — this is exactly the extremal problem solved in 43.08.04, and it is dual to the potential-theoretic statement that the equilibrium measure of the interval is the arcsine law, not the uniform law. Putting these together, the Runge phenomenon is not a failure of polynomials but a failure of equispaced sampling: the factor $f^{(n + 1)}$ is governed by the complex singularities of $f$ through the Bernstein-ellipse picture, the node-side factor by the limiting node density, and divergence occurs when the analyticity region of $f$ omits the convergence region the node distribution allows. The bridge is that the arcsine node density at once minimises the node polynomial, tames the Lebesgue constant, and places the Gauss quadrature nodes of 43.09.03 — one geometric principle, clustering toward the endpoints, which generalises the local Rolle-based remainder proved here into the section's global approximation theory.

Full proof set Master

Proposition 1 (interpolation error theorem, Rolle form). For $f \in C^{n + 1} [a, b]$ and $p_{n}$ the interpolant at distinct nodes $x_{0}, \dots, x_{n}$ , every $x \in [a, b]$ admits $ξ \in I_{x}$ with $f (x) - p_{n} (x) = f^{(n + 1)} (ξ) π_{n + 1} (x) / (n + 1)!$ .

Proof. For $x$ equal to a node both sides vanish; fix $x$ off the node set and set $G = (f (x) - p_{n} (x)) / π_{n + 1} (x)$ . The function $ϕ (t) = f (t) - p_{n} (t) - G π_{n + 1} (t)$ lies in $C^{n + 1} [a, b]$ and vanishes at the $n + 2$ distinct points $x, x_{0}, \dots, x_{n}$ (at the nodes because $f - p_{n}$ and $π_{n + 1}$ both vanish there, at $x$ by the choice of $G$ ). Ordering these $n + 2$ zeros and applying Rolle's theorem to $ϕ$ on each of the $n + 1$ adjacent gaps produces $\geq n + 1$ zeros of $ϕ^{'}$ in $I_{x}$ ; iterating, $ϕ^{(j)}$ has $\geq n + 2 - j$ zeros, so $ϕ^{(n + 1)}$ has at least one zero $ξ \in I_{x}$ . Since $p_{n}^{(n + 1)} \equiv 0$ and $π_{n + 1}^{(n + 1)} \equiv (n + 1)!$ (monic of degree $n + 1$ ), $0 = ϕ^{(n + 1)} (ξ) = f^{(n + 1)} (ξ) - G (n + 1)!$ , whence $G = f^{(n + 1)} (ξ) / (n + 1)!$ and the claim follows. $□$

Proposition 2 (divided-difference remainder). For distinct nodes and $f \in C^{n + 1}$ , $f (x) - p_{n} (x) = f [x_{0}, \dots, x_{n}, x] π_{n + 1} (x)$ for $x$ off the node set, and $f [x_{0}, \dots, x_{n}, x] = f^{(n + 1)} (ξ) / (n + 1)!$ for some $ξ \in I_{x}$ .

Proof. Let $q \in Π_{n + 1}$ interpolate $f$ at $x_{0}, \dots, x_{n}, x$ . Its Newton form on these $n + 2$ ordered points is $q (t) = \sum_{k = 0}^{n} f [x_{0}, \dots, x_{k}] \prod_{j < k} (t - x_{j}) + f [x_{0}, \dots, x_{n}, x] \prod_{j = 0}^{n} (t - x_{j})$ . The first sum is the Newton form of the interpolant on $x_{0}, \dots, x_{n}$ , namely $p_{n} (t)$ , and the last product is $π_{n + 1} (t)$ , so $q (t) = p_{n} (t) + f [x_{0}, \dots, x_{n}, x] π_{n + 1} (t)$ . Evaluating at $t = x$ , where $q (x) = f (x)$ , gives $f (x) - p_{n} (x) = f [x_{0}, \dots, x_{n}, x] π_{n + 1} (x)$ . Comparing with Proposition 1 and cancelling $π_{n + 1} (x) \neq 0$ yields $f [x_{0}, \dots, x_{n}, x] = f^{(n + 1)} (ξ) / (n + 1)!$ . $□$

Proposition 3 (Lebesgue constant as operator norm and the near-best bound). The Lebesgue constant satisfies $Λ_{n} = ∥ I_{n} ∥_{C [a, b] \to C [a, b]} = \max_{x} \sum_{k} ∣ L_{k} (x)∣$ , and consequently $∥ f - I_{n} f ∥_{\infty} \leq (1 + Λ_{n}) E_{n} (f)$ .

Proof. For $∥ g ∥_{\infty} \leq 1$ , $I_{n} g (x) = \sum_{k} g (x_{k}) L_{k} (x)$ gives $∣ I_{n} g (x)∣ \leq \sum_{k} ∣ L_{k} (x)∣ \leq Λ_{n}$ , so $∥ I_{n} ∥ \leq Λ_{n}$ . At a point $x^{⋆}$ with $\sum_{k} ∣ L_{k} (x^{⋆})∣ = Λ_{n}$ , pick $g \in C [a, b]$ , $∥ g ∥_{\infty} = 1$ , with $g (x_{k}) = \operatorname{sign} L_{k} (x^{⋆})$ ; then $I_{n} g (x^{⋆}) = \sum_{k} ∣ L_{k} (x^{⋆})∣ = Λ_{n}$ , so $∥ I_{n} ∥ \geq Λ_{n}$ , giving equality. For the near-best bound, let $q^{⋆} \in Π_{n}$ attain $E_{n} (f)$ ; since $I_{n} q^{⋆} = q^{⋆}$ , $f - I_{n} f = (f - q^{⋆}) - I_{n} (f - q^{⋆})$ , and the triangle inequality with $∥ I_{n} (f - q^{⋆}) ∥_{\infty} \leq Λ_{n} ∥ f - q^{⋆} ∥_{\infty}$ gives $∥ f - I_{n} f ∥_{\infty} \leq (1 + Λ_{n}) ∥ f - q^{⋆} ∥_{\infty} = (1 + Λ_{n}) E_{n} (f)$ . $□$

Proposition 4 (equispaced node polynomial swells near the endpoints). For the $n + 1$ equispaced nodes $x_{k} = - 1 + 2 k / n$ on $[- 1, 1]$ , the ratio of $∣ π_{n + 1} ∣$ near an endpoint to its value near the centre grows exponentially in $n$ .

Proof. Compare the centre $x = 0$ with a point near the endpoint $x = 1$ , both taken between nodes, since $π_{n + 1}$ vanishes at the nodes themselves. Estimate $\log ∣ π_{n + 1} (x)∣ = \sum_{k} \log ∣ x - x_{k} ∣$ by recognising it as a Riemann sum for $\frac{n}{2} \int_{- 1}^{1} \log ∣ x - t ∣ d t = n U (x)$ up to a correction of lower order than $n$ , where $U (x) = \frac{1}{2} \int_{- 1}^{1} \log ∣ x - t ∣ d t$ is the logarithmic potential of the uniform density on $[- 1, 1]$ . A direct integration gives $U (x) = \frac{1}{2} [(1 + x) \log ∣ 1 + x ∣ + (1 - x) \log ∣ 1 - x ∣] - 1$ . Evaluating at the two points, $U (0) = \frac{1}{2} [\log 1 + \log 1] - 1 = - 1$ , and $U (1) = \frac{1}{2} [2 \log 2 + 0] - 1 = \log 2 - 1$ . Their difference is $U (1) - U (0) = \log 2 > 0$ , so $U$ is strictly larger near the endpoint than at the centre. Hence $\log ∣ π_{n + 1} (\text{near endpoint})∣ - \log ∣ π_{n + 1} (\text{centre})∣ \approx n (U (1) - U (0)) = n \log 2$ , and the ratio grows like $e^{n \log 2} = 2^{n} \to \infty$ . The endpoint swelling of the equispaced node polynomial is therefore exponential in the degree, the node-side driver of the Runge divergence; the arcsine (Chebyshev) density replaces the non-constant $U$ by a potential constant on $[- 1, 1]$ , removing the swelling. $□$
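
The growth rate in this proof can be observed numerically. The sketch below makes two assumptions not fixed by the text: $n$ is taken odd so that the centre $x = 0$ is not a node, and the point near the endpoint is the midpoint of the last gap, $x = 1 - 1/ n$ . Logarithms are summed directly to avoid underflow; the function names are illustrative.

```python
import math


def log_abs_node_polynomial(x, n):
    """log|pi_{n+1}(x)| for the n + 1 equispaced nodes x_k = -1 + 2k/n."""
    return sum(math.log(abs(x - (-1.0 + 2.0 * k / n))) for k in range(n + 1))


def log_ratio(n):
    """log of |pi(midpoint of last gap)| / |pi(0)|; n must be odd."""
    near_end = 1.0 - 1.0 / n
    return log_abs_node_polynomial(near_end, n) - log_abs_node_polynomial(0.0, n)


for n in (5, 11, 21, 41, 81):
    print(f"n={n:2d}  log(ratio)/n = {log_ratio(n) / n:.4f}   log 2 = {math.log(2.0):.4f}")
```

The per-degree rate $\log (\text{ratio}) / n$ climbs toward $\log 2$ from below, but slowly: the lower-order correction to the Riemann-sum estimate is still visible at these degrees.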

Connections Master

The node polynomial $π_{n + 1} (x) = \prod_{k} (x - x_{k})$ whose sup norm the error corollary isolates is minimised over all monic degree- $(n + 1)$ polynomials by the scaled Chebyshev polynomial $T_{n + 1}$ , whose roots are the optimal interpolation nodes; this extremal problem, the equioscillation characterisation, and the resulting logarithmic Lebesgue constant are the content of 43.08.04, which closes the Runge loop opened here by replacing equispaced nodes with the arcsine-clustered Chebyshev nodes.
The divided-difference remainder $f [x_{0}, \dots, x_{n}, x] π_{n + 1} (x)$ proved here passes continuously, as the evaluation point migrates to a node, into the Hermite (repeated-node) remainder with node polynomial $\prod_{k} (x - x_{k})^{m_{k}}$ ; the confluent divided differences and the $C^{2}$ piecewise-cubic construction that defeats Runge by lowering the local degree are developed in 43.08.03, which inherits both the error mechanism and the node polynomial of this unit.
Integrating the interpolant in place of $f$ turns the interpolation error into the quadrature error: $\int_{a}^{b} (f - p_{n}) = \int_{a}^{b} f^{(n + 1)} (ξ (x)) π_{n + 1} (x) / (n + 1)! d x$ , so the same node polynomial governs how accurately a Newton-Cotes rule integrates, and the Peano-kernel representation of the quadrature error in 43.09.01 is the integrated form of the pointwise remainder established here; the instability of high-order equispaced Newton-Cotes mirrors the Runge divergence through the identical node-polynomial swelling.
The error theorem rests entirely on iterated Rolle applied to an auxiliary function, the same mechanism by which the Taylor remainder is derived in 02.05.02; the Lagrange remainder $f^{(n + 1)} (ξ) / (n + 1)! \prod (x - x_{k})$ is the interpolation generalisation of the Taylor remainder $f^{(n + 1)} (ξ) / (n + 1)! (x - x_{0})^{n + 1}$ , recovered in the confluent limit where all nodes coalesce to $x_{0}$ , tying numerical interpolation to the differential calculus of single-variable analysis.

Historical & philosophical context Master

The pointwise interpolation remainder in the form $f^{(n + 1)} (ξ) / (n + 1)! \prod_{k} (x - x_{k})$ was established by Augustin-Louis Cauchy in the 1840s, building on the Lagrange-remainder technique Cauchy had himself made rigorous for Taylor's theorem; the auxiliary-function-and-Rolle argument is the natural extension of the mean-value reasoning to $n + 1$ interpolation conditions. The divided-difference form of the remainder and the identification of divided differences with derivatives via the mean-value property are due to the nineteenth-century difference calculus, with the integral representation supplied by Charles Hermite and Angelo Genocchi in the 1870s.

The divergence example is Carl Runge's 1901 paper in the Zeitschrift für Mathematik und Physik, where he showed that equispaced interpolation of $1/ (1 + x^{2})$ on a sufficiently wide symmetric interval (equivalently $1/ (1 + 25 x^{2})$ on $[- 1, 1]$ ) diverges as the degree increases, and traced the cause to the location of the function's complex poles relative to the interval ^{[Runge 1901]}. The quantitative theory matured through the twentieth century: Henri Lebesgue introduced the function and constant now bearing his name as the operator norm of the interpolation projection; Sergei Bernstein and the potential-theory school connected convergence to the equilibrium measure of the interval and the Bernstein-ellipse analyticity regions; and the modern synthesis, including the sharp asymptotics $Λ_{n} \sim 2^{n + 1} / (e n \log n)$ for equispaced nodes against $(2/ π) \log n$ for Chebyshev nodes, is presented by Lloyd N. Trefethen ^{[Trefethen Ch. 15]}. Runge's example is the standard cautionary instance that smoothness on the real line does not guarantee convergence of high-degree equispaced interpolation.

Bibliography Master

@book{sulimayers2003,
  author    = {S\"{u}li, Endre and Mayers, David F.},
  title     = {An Introduction to Numerical Analysis},
  publisher = {Cambridge University Press},
  year      = {2003}
}

@book{stoerbulirsch2002,
  author    = {Stoer, Josef and Bulirsch, Roland},
  title     = {Introduction to Numerical Analysis},
  edition   = {3},
  publisher = {Springer},
  year      = {2002}
}

@book{trefethen2019,
  author    = {Trefethen, Lloyd N.},
  title     = {Approximation Theory and Approximation Practice, Extended Edition},
  publisher = {SIAM},
  year      = {2019}
}

@book{davis1975,
  author    = {Davis, Philip J.},
  title     = {Interpolation and Approximation},
  publisher = {Dover Publications},
  year      = {1975}
}

@article{runge1901,
  author  = {Runge, Carl},
  title   = {\"{U}ber empirische Funktionen und die Interpolation zwischen \"{a}quidistanten Ordinaten},
  journal = {Zeitschrift f\"{u}r Mathematik und Physik},
  volume  = {46},
  year    = {1901},
  pages   = {224--243}
}

@article{cauchy1840interpolation,
  author  = {Cauchy, Augustin-Louis},
  title   = {Sur les fonctions interpolaires},
  journal = {Comptes Rendus de l'Acad\'{e}mie des Sciences},
  volume  = {11},
  year    = {1840},
  pages   = {775--789}
}

Prerequisites

43.08.01
02.05.02

Tier anchors

beginner: Süli-Mayers 2003 *An Introduction to Numerical Analysis* (Cambridge) §6.4 (the interpolation error and why high-degree fits can wobble, at the level of a first numerical-methods course); Burden-Faires 2011 *Numerical Analysis* 9e (Brooks/Cole) §3.1 for the error-bound bookkeeping on worked nodes
intermediate: Süli-Mayers 2003 *An Introduction to Numerical Analysis* (Cambridge) §6.4 (the error theorem $f-p_n = f^{(n+1)}(\xi)/(n+1)!\,\prod(x-x_k)$ proved by Rolle, the node polynomial as the controllable factor); Stoer-Bulirsch 2002 *Introduction to Numerical Analysis* 3e (Springer) §2.1.4 (the remainder, divided differences as derivatives, and equispaced-node behaviour)
master: Trefethen 2019 *Approximation Theory and Approximation Practice* extended ed. (SIAM) Ch. 11-15 (the Lebesgue constant, the Runge phenomenon, potential theory and the equilibrium measure, exponential vs divergent convergence); Davis 1975 *Interpolation and Approximation* (Dover) Ch. 3-4 (the remainder, the contour-integral Hermite formula, and the Lebesgue function); Runge 1901 *Zeitschrift für Mathematik und Physik* 46 (the original divergence example)

References

Süli, E. & Mayers, D. F. — An Introduction to Numerical Analysis · Cambridge University Press (2003). §6.4 proves the interpolation error theorem: if f is (n+1)-times continuously differentiable on an interval containing the distinct nodes x_0,...,x_n and p_n is the interpolating polynomial, then for each x there is a point xi in the smallest interval containing x and the nodes with f(x)-p_n(x) = f^{(n+1)}(xi)/(n+1)! prod_{k}(x-x_k). The proof fixes x, forms the auxiliary function phi(t) = f(t)-p_n(t) - [f(x)-p_n(x)]/pi(x) * pi(t) with pi the node polynomial, observes phi vanishes at the n+2 points x, x_0,...,x_n, and applies Rolle's theorem n+1 times to locate a zero of phi^{(n+1)}. §6.4 also discusses the node polynomial as the only x-dependent factor the analyst controls and points to the size of its sup norm as the governing quantity.
Trefethen, L. N. — Approximation Theory and Approximation Practice (Extended Edition) · SIAM (2019). Chapters 11-15 develop the modern account: the Lebesgue constant Lambda_n = max_x sum_k |L_k(x)| as the operator norm of the interpolation projection from C[a,b] onto Pi_n in the sup norm, the near-best inequality ||f-p_n|| <= (1+Lambda_n) E_n(f) relating interpolation error to the best approximation error E_n(f), the growth Lambda_n ~ 2^{n+1}/(e n log n) for equispaced nodes versus Lambda_n ~ (2/pi) log n for Chebyshev nodes, and the potential-theoretic explanation of the Runge phenomenon via the equilibrium measure of the interval, where the divergence of equispaced interpolation of 1/(1+25x^2) on [-1,1] is read off from the node polynomial's exponential growth away from the centre.
Stoer, J. & Bulirsch, R. — Introduction to Numerical Analysis · Springer, 3rd edition (2002). §2.1.4 gives the interpolation remainder via divided differences, proving f(x)-p_n(x) = f[x_0,...,x_n,x] prod_k (x-x_k) and identifying the divided difference on n+2 points with f^{(n+1)}(xi)/(n+1)! through the mean-value property of divided differences (the confluent / Hermite-Genocchi limit), thereby connecting the Rolle-based remainder of Süli-Mayers §6.4 to the divided-difference calculus of the Newton form and to the limit f[x_0,...,x_n] -> f^{(n)}(xi)/n! as the nodes coalesce.
Runge, C. — Über empirische Funktionen und die Interpolation zwischen äquidistanten Ordinaten · Zeitschrift für Mathematik und Physik 46 (1901), 224-243. The original paper exhibiting that equispaced polynomial interpolation of the analytic function 1/(1+x^2) on a symmetric interval diverges as the degree increases, with the error blowing up near the endpoints even though the interpolated function is smooth on the real axis; Runge traced the divergence to the behaviour of the node polynomial and the location of the function's complex singularities relative to the interval.

Estimated time

beginner: 20m
intermediate: 50m
master: 90m