Best uniform approximation, minimax, and Chebyshev polynomials
Anchor (Master): Cheney 1966 *Introduction to Approximation Theory* (McGraw-Hill / AMS Chelsea reprint) Ch. 1-3 (existence, the Haar condition, equioscillation, uniqueness, the Remez algorithm); DeVore-Lorentz 1993 *Constructive Approximation* (Springer Grundlehren 303) Ch. 3 and Ch. 7 (the Chebyshev systems, strong uniqueness, the extremal theory of Chebyshev polynomials); Trefethen 2019 *Approximation Theory and Approximation Practice* extended ed. (SIAM) Ch. 9-10, 16 (near-best Chebyshev interpolation versus true minimax, the Remez algorithm in practice)
Intuition Beginner
In the previous unit you learned that an interpolating polynomial hits the data points exactly but can swing badly in between, and that you only get to control where the error lives by choosing where to sample. This unit asks a sharper question. Forget hitting any point exactly. Among all polynomials of a fixed degree, which one stays closest to a target function across the whole interval — measuring "closest" by the single worst gap anywhere?
This is the minimax idea: minimise the maximum error. You imagine sliding a polynomial of the allowed degree around until its largest deviation from the target, taken over the entire interval, is as small as it can be. The polynomial that wins this contest is the best uniform approximation. It is the right notion when you care about a guarantee — when you need every point to be within some tolerance, not just most points on average.
There is a strikingly clean signature of the winner. When a polynomial is the best one, its error curve does not just stay small; it touches its largest size again and again, flipping sign each time, like a guitar string vibrating between two rails. The error rides up to the maximum, comes down through zero, rides down to the negative of that maximum, comes back up, and so on. This back-and-forth touching is called equioscillation, and it is the fingerprint of the optimal fit.
Why must the best fit wobble like that? Picture a fit that touches its peak error only on one side, say it pokes up high but never dips correspondingly low. Then you could push the whole polynomial up a little and shave that peak without creating a worse one elsewhere. So a one-sided error is never optimal — only a fit balanced between equal-and-opposite extremes cannot be improved.
The hero polynomials of this story are the Chebyshev polynomials. They are built to oscillate between and as evenly as possible, and that even oscillation makes them the answer to many minimax questions at once — including the choice of interpolation nodes that finally cures the Runge wobble from the last unit.
Visual Beginner
Picture the error curve of a candidate fit drawn across the interval. For a poor fit the curve has one tall spike and is small elsewhere. For the best fit the curve looks like an even ripple: it rises to the same height, falls to the same depth, rises again, touching the top and bottom rails several times with the sign flipping at each touch.
The table below contrasts the two error-measuring philosophies you have now met.
| how you measure error | what the best fit does | named tool |
|---|---|---|
| worst gap anywhere (uniform) | error equioscillates between equal extremes | Chebyshev polynomials |
| total squared gap (least squares) | error is perpendicular to the fit space | orthogonal polynomials |
The whole point of the top row is the alternation: a fit whose error reaches its maximum size at several points with flipping signs cannot be nudged to do better, and that is exactly what "best in the worst-case sense" means.
Worked example Beginner
Let us find the best constant — a degree-zero polynomial — that approximates on the interval from to , measuring by the worst gap.
Step 1. Write the error. The error is . As runs from to , this runs from up to .
Step 2. Find the worst gap for a given . The largest size of over the interval is whichever is bigger: at the left end, or at the right end. The worst gap is the larger of and .
Step 3. Balance the two ends. The larger of and is made as small as possible when the two are equal: , giving . Then both ends have gap , and that is the smallest the worst gap can be.
Step 4. Check the signature. With the error is . At it equals ; at it equals . The error touches its maximum size at two points, with opposite signs — it equioscillates at two points, exactly the fingerprint of a best fit at this degree.
What this tells us: the best uniform fit is not found by averaging or by matching any single point, but by balancing the error so it reaches equal-and-opposite extremes. Even at the lowliest degree, the equioscillation signature already appears.
Check your understanding Beginner
Formal definition Intermediate+
Throughout, is a continuous real function, is the space of real polynomials of degree at most in the sense of 43.08.01, and is the supremum (uniform) norm. The contrast object throughout is the best approximation of 43.08.05, which uses the inner product of 02.11.07 in place of this norm.
Definition (minimax error and best uniform approximation). The minimax error (or best uniform approximation error) of from is $$ E_n(f) = \inf_{p \in \Pi_n} \lVert f - p \rVert_\infty = \inf_{p \in \Pi_n}\ \max_{x \in [a,b]} \lvert f(x) - p(x) \rvert . $$ A polynomial attaining the infimum, , is a best uniform approximation (or minimax polynomial) of in .
Definition (alternation / equioscillation set). Let be the error of a candidate , and let . A set of points is an alternation set of length for if for every and the signs alternate: for each , i.e. with fixed. The error equioscillates on points when such a set exists.
Definition (Chebyshev polynomials of the first kind). The Chebyshev polynomial of the first kind is the unique polynomial with for all . The substitution makes this an identity on ; it extends to a genuine polynomial through the recurrence $$ T_0 = 1, \quad T_1 = x, \quad T_{n+1}(x) = 2x,T_n(x) - T_{n-1}(x), $$ which follows from . Then has degree , leading coefficient for , satisfies on , and attains alternately at the Chebyshev extreme points , , with . Its roots, the Chebyshev points, are , .
The symbols , , the supremum norm , the node polynomial , and are recorded in _meta/NOTATION.md; the node polynomial and the minimax error are inherited from 43.08.02, where governs the near-best interpolation bound.
Counterexamples to common slips Intermediate+
"The best uniform approximation is the interpolant at any nodes." It is not. Interpolation forces zeros of the error and pins the error to zero there; the minimax error instead equioscillates and is generically nonzero at points. The Chebyshev interpolant is near-minimax — within a factor of (
43.08.02) — but it is not the true minimax polynomial."Equioscillation on points characterises the best approximation." The correct count is alternation points for . The extra point is what makes the de la Vallée-Poussin and exchange arguments work; alternations is one short and does not force optimality.
"The Chebyshev polynomial itself is the minimal-sup-norm polynomial." The monic rescaling is the minimal-sup-norm element among monic degree- polynomials on ; has leading coefficient and sup norm , but the extremal statement is about the monic normalisation, with minimal deviation .
"Best and best approximation coincide." They differ. The projection of
43.08.05makes the residual orthogonal to and minimises a weighted average of the squared error; the best approximation makes the error equioscillate and minimises the worst case. They agree only in degenerate cases (e.g. the best constant for a monotone via a symmetric weight); in general the minimax polynomial is not any orthogonal projection.
Key theorem with proof Intermediate+
The signature result is the Chebyshev equioscillation theorem: a polynomial is the best uniform approximation if and only if its error reaches the maximum size at points with alternating signs. The forward direction is a perturbation argument — if the alternation is too short, a correcting polynomial shaves the error — and the converse is a clean lower bound. The proof follows Süli-Mayers [Süli-Mayers §8.3] and Cheney [Cheney Ch. 3].
Theorem (Chebyshev equioscillation / alternation). Let and , with error and . Then is the best uniform approximation of from if and only if has an alternation set of length at least — that is, there exist points in with for a fixed sign . The best approximation is unique.
Proof. Assume first that equioscillates on points , and suppose for contradiction that some does strictly better: . Consider . At each , , so the sign of equals the sign of , which alternates with . A continuous function changing sign across each of the gaps between the has at least zeros; but is a polynomial of degree , so , i.e. , contradicting . Hence no beats : is a best approximation.
For the converse, assume is a best approximation but has a longest alternation set of length (so ). Partition by the points where attains into maximal runs of constant extremal sign; there are such runs, separated by interior points at which the extremal sign of flips (each lies strictly between a block and a block and may be chosen with ). Form the *correcting polynomial* $$ \sigma(x) = \varepsilon \prod_{j=1}^{m}(s_j - x), $$ of degree , choosing the sign so that carries the same sign as on each run (it switches sign exactly at the , matching the flips of ). For small the perturbed error has strictly smaller maximum modulus: on each extremal run and share sign, so subtracting pulls below there, while away from the runs is already bounded away from and is taken small enough that does not overshoot. Then , contradicting optimality of . So the alternation length is at least .
Uniqueness: suppose and are both best, each with . Their average also satisfies , so is best too and its error equioscillates on points . At each , forces , since two numbers of modulus whose average has modulus must both equal that average. Thus vanishes at the points ; a degree- polynomial with zeros is identically zero, so .
Bridge. This theorem is the foundational reason the minimax problem is solvable in closed structural terms: optimality is not an inscrutable minimisation but the visible geometric fact that the error equioscillates on points, and the correcting-polynomial perturbation is exactly the construction that shows any shorter alternation can be improved. The result builds toward the extremal characterisation of the Chebyshev polynomials in the Advanced section, where is the minimal-deviation monic polynomial precisely because equioscillates on points, and it appears again in the node-optimisation of 43.08.02, where minimising the node polynomial in the sup norm is the same alternation problem with and a leading-coefficient constraint. The alternation count generalises the two-point balance of the Beginner constant fit, and the perturbation mechanism is dual to the orthogonality condition of 43.08.05: putting these together, both the and best approximations are characterised by an orthogonality-type condition on the residual — sign-alternation against in the uniform case, perpendicularity to in the quadratic case — which is exactly the central insight unifying the two best-approximation theories of this chapter.
Exercises Intermediate+
Advanced results Master
The elementary theory establishes existence, the alternation characterisation, and uniqueness; the master layer concerns the abstract Haar-system setting that isolates why polynomials behave well, the strong form of uniqueness, the precise sense in which Chebyshev interpolation is near-minimax, and the Remez algorithm that computes the true minimax polynomial.
Theorem 1 (existence of a best approximation). For and any , the infimum is attained. The map is continuous and coercive on the finite-dimensional (it tends to infinity as , since and norms on are equivalent), so it attains its minimum on the compact sublevel set [Cheney Ch. 1]. Existence requires only that be a finite-dimensional subspace of a normed space; no Haar condition is needed for existence, only for the alternation characterisation and uniqueness.
Theorem 2 (Haar systems and the Chebyshev characterisation). A system is a Haar (Chebyshev) system if every nonzero generalised polynomial has at most distinct zeros in , equivalently if every collocation matrix is nonsingular for distinct nodes — the unisolvence of 43.08.01. The monomials form a Haar system on any interval. For approximation from a Haar subspace , the equioscillation theorem holds verbatim: is the best uniform approximation of iff has an alternation set of length , and the best approximation is unique [Cheney Ch. 3]. The Haar condition is exactly the hypothesis that makes the correcting-polynomial perturbation available: a generalised polynomial can be built with prescribed sign changes at up to interior points.
Theorem 3 (strong uniqueness and Lipschitz dependence). For a Haar subspace, the best approximation is strongly unique: there is a constant with
$$
\lVert f - p \rVert_\infty \ge \lVert f - p_n^* \rVert_\infty + \gamma,\lVert p - p_n^* \rVert_\infty \quad \text{for all } p \in G,
$$
a one-sided quadratic-free lower bound that ordinary uniqueness lacks [DeVore-Lorentz Ch. 3]. Strong uniqueness forces the best-approximation operator to be locally Lipschitz (indeed pointwise-Lipschitz with constant ), in sharp contrast to the linear projection of 43.08.05: the minimax operator is nonlinear, since in general, yet it is stable. The nonlinearity is the price of the uniform norm's lack of an inner product; strong uniqueness is the compensation that keeps the problem well-posed.
Theorem 4 (near-minimax Chebyshev interpolation). Let interpolate at the Chebyshev points on . Then
$$
\lVert f - p_n^{\mathrm{Cheb}} \rVert_\infty \le (1 + \Lambda_n),E_n(f), \qquad \Lambda_n \sim \tfrac{2}{\pi}\log n ,
$$
so Chebyshev interpolation is within a factor of the true minimax error — for the factor is below [Trefethen Ch. 16]. This is the quantitative payoff of 43.08.02: the Lebesgue constant for Chebyshev nodes grows only logarithmically, where for equispaced nodes it grows faster than any power of . Chebyshev interpolation therefore delivers near-optimal uniform approximation by a linear projection that requires no iteration, recovering most of the benefit of the minimax polynomial at a fraction of the cost; the true minimax polynomial , which the next theorem computes, improves on it by at most that logarithmic factor.
Theorem 5 (Remez exchange algorithm). The minimax polynomial is computed by the Remez exchange algorithm: start with a reference of points ; solve the square linear system $$ f(t_i) - \sum_{j=0}^{n} a_j, t_i^{,j} = (-1)^i,h, \qquad i = 0, \dots, n+1, $$ for the coefficients and the levelled error (one equation per reference point, equations in unknowns, solvable because the reference is a Haar collocation augmented by the alternating column); then exchange the reference for the points where the current error attains its local extrema, preserving the alternation pattern. The iteration increases monotonically toward , sandwiched below by the de la Vallée-Poussin bound , and converges quadratically once the reference is near the true alternation set [DeVore-Lorentz Ch. 7]. Each iteration's is a guaranteed lower bound on and a guaranteed upper bound, so the algorithm self-certifies its accuracy.
Synthesis. The equioscillation theorem is the foundational reason best uniform approximation is a structural rather than a merely variational subject: optimality is the geometric fact that the error alternates on points, and every other result is read off it. The central insight is that the uniform norm replaces the inner-product orthogonality of 43.08.05 by sign-alternation against the approximating space, so the best-approximation operator is nonlinear yet strongly unique — this is exactly the trade the uniform norm imposes, and it generalises the two-point balance of the elementary constant fit to the full -point alternation. The Chebyshev polynomials are the fixed point of the whole theory: equioscillates on points, which makes the minimal-deviation monic polynomial, which makes its roots the optimal interpolation nodes that minimise the node polynomial of 43.08.02 and defeat the Runge phenomenon, and which makes Chebyshev interpolation near-minimax by the logarithmic Lebesgue constant. The bridge is that one extremal object, the equioscillating Chebyshev polynomial, simultaneously solves the minimax node problem, supplies the near-best linear interpolant, and seeds the Remez reference — the same polynomial that the theory of 43.08.05 meets through the weight , so the uniform and quadratic best-approximation theories are dual faces of one approximation problem, putting these together into the single principle that node placement is governed by the arcsine measure of the interval and the Weierstrass density theorem guarantees the whole tower converges.
Full proof set Master
Proposition 1 (existence by compactness). For , the minimax error is attained by some .
Proof. The functional is continuous on (it is -Lipschitz: ). For , the reverse triangle inequality gives , so forces . The sublevel set is therefore bounded; it is closed by continuity of , and since is finite-dimensional (all norms equivalent), is compact. The set is nonempty ( gives ). A continuous function on a nonempty compact set attains its infimum, so attains its minimum on , which equals because any has .
Proposition 2 (equioscillation characterisation). For with and , is a best uniform approximation iff has an alternation set of length .
Proof. () Let equioscillate on . If some had , then satisfies, at each , (the subtracted term is strictly smaller in modulus), so alternates sign across the gaps and has zeros; as , , contradicting . So is best. () Suppose is best but the maximal alternation length is . Let separate the maximal extremal-sign runs of , chosen with , and set with fixed so matches the extremal sign of on every run. Let bound on the compact complement of small neighbourhoods of the extremal runs where might disagree in sign; for small, has modulus everywhere (on each run and share sign and ; off the runs and the perturbation is small). Then beats , a contradiction; so .
Proposition 3 (uniqueness). The best uniform approximation from is unique.
Proof. Let be best, . Then has , so is best and, by Proposition 2, equioscillates on points with . At each , has modulus while each summand has modulus ; equality in with forces . Hence at the points . A polynomial in with zeros is identically zero, so .
Proposition 4 (Chebyshev minimal deviation). Among monic degree- polynomials on , is the unique minimiser of the sup norm, with .
Proof. The leading coefficient of is (by induction on the recurrence : the new leading term is ), so is monic, and with alternating at the points . Suppose a monic had . The difference has degree (monic leading terms cancel), and at each , , so has the alternating sign of . Thus changes sign across each of the gaps between consecutive , giving zeros; with this forces , so , contradicting . Uniqueness and minimality follow.
Proposition 5 (de la Vallée-Poussin lower bound). If and takes alternating signs at points , then .
Proof. Set and suppose ; let be best, . Then has at each the sign of , since . As alternates across the , has sign changes, hence zeros; forces , contradicting . So . Combined with , any near-equioscillating brackets between and , the bracket the Remez iteration tightens.
Connections Master
The minimal node polynomial closes the Runge loop of
43.08.02: minimising over monic degree- polynomials is exactly the Chebyshev minimal-deviation problem of Proposition 4, whose solution has roots at the Chebyshev points; placing interpolation nodes there drives the interpolation-error corollary's controllable factor to its minimum and makes the Lebesgue constant logarithmic, the precise correction to the exponential equispaced node-polynomial swelling and the cure for the divergence exhibited in43.08.02.The uniform best-approximation theory is dual to the best-approximation theory of
43.08.05: where this unit characterises optimality by sign-alternation of the residual against , that unit characterises it by orthogonality of the residual to in a weighted inner product; the two meet on the Chebyshev system, since the Chebyshev polynomials are simultaneously the equioscillating minimax extremals here and the orthogonal polynomials for the weight there, and the near-minimax property of Chebyshev interpolation is the bridge that makes the cheap -flavoured construction nearly solve the problem.The Chebyshev points constructed here are the optimal-node half of the Gauss-quadrature story of
43.09.03: Gauss-Chebyshev quadrature places its nodes at these same roots, so the extremal interpolation theory and the extremal integration theory share one node set, and the positivity and stability of the Gauss weights inherit from the same orthogonal-polynomial root structure that the minimal-deviation property exposes.Existence of a best uniform approximation presupposes that polynomials can approximate continuous functions arbitrarily well, which is the Weierstrass approximation theorem; the trigonometric/Fourier density established via Fejér in
02.10.01is the classical density backdrop, guaranteeing for every so that the equioscillation theory operates on a convergent tower rather than on a sequence with a positive floor, tying the numerical minimax theory to the harmonic-analysis density results of the analysis section.
Historical & philosophical context Master
The theory originates with Pafnuty Chebyshev, who in his 1854 memoir Théorie des mécanismes connus sous le nom de parallélogrammes studied the linkage design problem of approximating straight-line motion and was led to the polynomials of least deviation from zero; he proved that the monic degree- polynomial minimising the maximum modulus on a symmetric interval is the rescaled , and identified the alternation of the error as the signature of optimality [Chebyshev 1854]. The polynomials now bear his name through the transliteration "Tchebychef", which fixes the symbol .
The general characterisation theorem — that the best uniform approximation from the polynomials is the one whose error equioscillates on points — and its uniqueness were brought to their modern form in the early twentieth century, with Émile Borel's 1905 Leçons sur les fonctions de variables réelles presenting the alternation theorem for polynomial approximation. The abstraction to Haar systems is due to Alfréd Haar (1918), who isolated the unisolvence condition under which the alternation theorem and uniqueness survive for a general finite-dimensional subspace. The computational method is Evgeny Remez's 1934 exchange algorithm, which turns the existence theorem into an iteration that converges to the equioscillating error [Remez 1934]. The existence backdrop is the Weierstrass approximation theorem of 1885, that polynomials are dense in , without which could fail; the synthesis of the minimax theory with the analyticity-controlled convergence rate via the Bernstein ellipse, and the practical Remez computation, is the subject of the modern treatments by Cheney, DeVore-Lorentz, and Trefethen.
Bibliography Master
@book{sulimayers2003,
author = {S\"{u}li, Endre and Mayers, David F.},
title = {An Introduction to Numerical Analysis},
publisher = {Cambridge University Press},
year = {2003}
}
@book{cheney1966,
author = {Cheney, Elliott Ward},
title = {Introduction to Approximation Theory},
publisher = {McGraw-Hill},
year = {1966}
}
@book{devorelorentz1993,
author = {DeVore, Ronald A. and Lorentz, George G.},
title = {Constructive Approximation},
series = {Grundlehren der mathematischen Wissenschaften},
volume = {303},
publisher = {Springer},
year = {1993}
}
@book{trefethen2019,
author = {Trefethen, Lloyd N.},
title = {Approximation Theory and Approximation Practice, Extended Edition},
publisher = {SIAM},
year = {2019}
}
@article{chebyshev1854,
author = {Chebyshev, Pafnuty L.},
title = {Th\'{e}orie des m\'{e}canismes connus sous le nom de parall\'{e}logrammes},
journal = {M\'{e}moires des Savants \'{e}trangers pr\'{e}sent\'{e}s \`{a} l'Acad\'{e}mie de Saint-P\'{e}tersbourg},
volume = {7},
year = {1854},
pages = {539--568}
}
@article{remez1934,
author = {Remez, Evgeny Ya.},
title = {Sur la d\'{e}termination des polyn\^{o}mes d'approximation de degr\'{e} donn\'{e}e},
journal = {Communications de la Soci\'{e}t\'{e} math\'{e}matique de Kharkov},
volume = {10},
year = {1934},
pages = {41--63}
}
@book{borel1905,
author = {Borel, \'{E}mile},
title = {Le\c{c}ons sur les fonctions de variables r\'{e}elles et les d\'{e}veloppements en s\'{e}ries de polyn\^{o}mes},
publisher = {Gauthier-Villars, Paris},
year = {1905}
}