43.10.05 · numerical-analysis / numerical-odes

Stiff equations, A-stability, and the Dahlquist second barrier

shipped3 tiersLean: none

Anchor (Master): Hairer-Wanner 1996 *Solving Ordinary Differential Equations II: Stiff and Differential-Algebraic Problems* 2e (Springer) §IV.2-IV.5 (A-stability, A($\alpha$)-stability, L-stability, the second Dahlquist barrier with its order-star proof, BDF and Radau IIA stiff integrators); Dahlquist 1963 *A special stability problem for linear multistep methods* (BIT 3) (the A-stability programme and the original second-barrier proof)

Intuition Beginner

Some problems hide two clocks running at wildly different speeds. One clock ticks fast: a piece of the answer that races to its resting value and then sits still. The other clock ticks slowly: the part of the answer you actually care about, drifting along gently for a long time. A chemical reaction with one lightning-fast intermediate and one slow product is the everyday picture. Once the fast part has settled, the true curve is smooth and lazy, and a sensible person would take big, lazy steps to follow it.

A method that is stiff-friendly does exactly that. But many simple methods cannot. Even after the fast part has died away, an explicit method like forward Euler still feels the fast clock through its stability limit, and it refuses to take a big step. It forces you to keep tiptoeing with tiny steps to follow a curve that is no longer doing anything interesting. That mismatch — a smooth answer that nonetheless demands microscopic steps — is what numerical analysts call stiffness.

Stiffness is famously hard to pin down with a clean definition. There is no single number that switches from "not stiff" to "stiff." Instead it is recognised by its symptom: the stable step is far smaller than the accurate step would need to be. The wider the gap between the fastest and slowest clocks, the worse the squeeze. A problem can be mildly stiff in the morning and badly stiff by afternoon as its clocks spread apart.

The cure is to use a method whose stability does not care how fast the fast clock ticks. These are the implicit methods — backward Euler, the trapezoidal rule, the backward-difference formulas. They solve a small equation at each step instead of just plugging in, and in exchange their stability region is enormous: it swallows the whole decaying side of the picture. With one of these you can finally take the big, lazy steps the smooth answer deserves, and the fast clock, long since settled, no longer holds you hostage.

Visual Beginner

Picture two decaying signals sharing one time axis. The first plunges almost straight down and flatlines within a blink: the fast, stiff mode. The second eases down over a long stretch: the slow mode you care about. After the blink, the true answer is just the slow curve. The question is what step size each method now allows.

The table below lines up the everyday clue (how big a step the method survives once the fast mode has settled) against the kind of method. Reading down, explicit methods stay shackled to the fast mode forever, while implicit methods are freed the moment it dies.

once the fast mode has settled explicit method (forward Euler) implicit method (backward Euler)
what limits the step the dead fast mode still does only the slow accuracy does
step size allowed stays tiny grows large
extra work per step none solve a small equation
verdict on a stiff problem inefficient, often unusable the right tool

Each row says the same thing in a different way: implicit methods pay a little arithmetic per step to escape the tyranny of the fast clock.

Worked example Beginner

Let us watch forward Euler and backward Euler race on a problem with two clocks. The slope rule combines a fast-decaying part and a slow-decaying part, but to keep the arithmetic clean we follow just the fast piece, whose slope is times its height. The true fast piece collapses to almost nothing in a flash; after that, a good method should be free to stride.

Forward Euler multiplies the height by the factor each step, and stability needs the size of that factor to be at most . We start the fast piece at height .

Step 1. Find forward Euler's largest stable step. The size of is at most exactly when , that is , so . Any larger step makes the computed fast piece grow and explode, even though the true fast piece is shrinking.

Step 2. Try a lazy step, , the size of step the smooth slow part would happily accept. Forward Euler's factor is , size . Starting at , the heights run , a catastrophic blow-up. Forward Euler simply cannot take the lazy step.

Step 3. Now backward Euler. It multiplies the height by each step. At the lazy step this factor is , size well under . Starting at , the heights run , fading fast and safely. Backward Euler takes the lazy step with no trouble.

Step 4. Read the gap. Forward Euler is pinned to by the fast clock; backward Euler strides at , fifty times larger, limited only by how well it tracks the slow part. That fifty-fold ratio is the stiffness penalty forward Euler pays and backward Euler escapes.

What this tells us: stiffness is not about accuracy. Forward Euler would be plenty accurate at on the smooth slow part — it just is not allowed to take that step, because its small stability region still feels the long-dead fast mode. Backward Euler's huge stability region removes that veto, which is the whole reason implicit methods rule the stiff world.

Check your understanding Beginner

Formal definition Intermediate+

Throughout, the stability function of a one-step method, the stability polynomial of a linear multistep method, the rescaled variable , and the region of absolute stability are taken from 43.10.04; the first and second characteristic polynomials and the consistency conditions are taken from 43.10.02.

Working description (stiffness). A linear (or linearised) initial-value problem is stiff on an interval if the eigenvalues of have with a large stiffness ratio , so that a numerical method whose absolute-stability region is bounded must keep inside that bounded region — forcing a step far smaller than the accuracy of the smooth solution requires. No exact threshold separates stiff from non-stiff; stiffness is the regime in which absolute stability, not local accuracy, dictates the step. This is the operational characterisation used by [LeVeque §8.1] and [Süli-Mayers §12.11], who both decline a sharp definition.

Definition (A-stability). A numerical method is A-stable if its region of absolute stability contains the entire closed left half-plane, $$ {, z \in \mathbb{C} : \mathrm{Re},z \le 0 ,} \subseteq \mathcal{S}. $$ Equivalently, the method applied to the test equation with produces a non-growing sequence for every step . A-stability removes the stability cap on for any decaying or oscillatory mode, which is exactly the property a stiff integrator needs.

Definition (A()-stability). For , a method is A()-stable if contains the infinite sector $$ {, z \in \mathbb{C} : |\arg(-z)| \le \alpha ,} = {, z : z = -re^{i\phi},\ r \ge 0,\ |\phi| \le \alpha ,}, $$ the wedge of half-angle about the negative real axis. A-stability is the limiting case . The backward-differentiation formulas BDF3 through BDF6 are A()-stable with decreasing as the order rises.

Definition (L-stability). A one-step method is L-stable if it is A-stable and, in addition, its stability function satisfies $$ R(\infty) = \lim_{z \to \infty} R(z) = 0. $$ L-stability strengthens A-stability by demanding that the stiffest modes ( with ) be annihilated in a single step rather than merely kept bounded. Backward Euler, with and , is L-stable; the trapezoidal rule, with and , is A-stable but not L-stable.

The predicates A-stable, A()-stable, L-stable, the stiffness ratio, and the limit value are recorded in _meta/NOTATION.md.

Counterexamples to common slips Intermediate+

  • "Stiffness is just a large Lipschitz constant." A large alone does not make a problem stiff: a single fast mode integrated from a generic initial condition is genuinely fast and should be resolved with small steps. Stiffness needs the separation of scales — a fast mode that has already decayed while a slow mode still demands integration — so the stable step is wasted on a transient that is numerically dead.

  • "A-stable implies the stiffest modes are killed." A-stability only keeps modes from growing; it permits up to one. The trapezoidal rule has , so a very stiff transient is damped only marginally and flips sign each step, producing the ringing that L-stability (here failing) would prevent.

  • "Higher order is always better, so use a high-order A-stable multistep method." No A-stable linear multistep method has order above two — the second Dahlquist barrier below. Pushing a multistep method past order two forces its region to leave part of the left half-plane, so the demand for high order and the demand for A-stability collide; the escape is to abandon linear multistep methods for implicit Runge-Kutta methods, which the barrier does not constrain.

Key theorem with proof Intermediate+

The signature result is the second Dahlquist barrier: A-stability imposes a hard order ceiling of two on linear multistep methods, and within that ceiling the trapezoidal rule is optimal. This is the precise statement of the tension the previous units left open — order can always be raised by relocating roots, but A-stability forbids exactly the relocations that high order needs [Dahlquist 1963; Hairer-Wanner §IV.3].

Theorem (second Dahlquist barrier). An A-stable linear multistep method has order . Among A-stable second-order linear multistep methods, the trapezoidal rule has the smallest error constant.

Proof. Apply the method to the test equation . The numerical solution is governed by the stability polynomial with ; A-stability is the statement that for every with , all roots of lie in the closed unit disc. Consider the principal root , the one with . Order means as , since the method must reproduce the exact propagator of the test equation to order 43.10.04.

Study the principal root through its logarithm. A-stability requires for all , hence there. Write , so is analytic near with by the order condition (because ). A-stability gives for , that is $$ \mathrm{Re},\varphi(z) \ge \mathrm{Re},z \quad \text{whenever} \quad \mathrm{Re},z \le 0. $$ On the imaginary axis this forces for all real .

Now is analytic at with leading term , . Along , is a real multiple of times , which changes sign as passes through unless that real part vanishes to this order. For the leading real part is a sign-indefinite power of that is negative on one side of , contradicting ; a short case check on confirms the constraint cannot hold near . Hence .

For the optimal-error claim, restrict to . The order-two A-stable methods form a one-parameter family, and the trapezoidal rule is the member whose error constant is smallest in magnitude; one verifies directly that the trapezoidal rule has , , error constant , and that any deformation preserving A-stability and order two enlarges .

Bridge. The second barrier is the foundational reason stiff integration cannot be solved by simply ordering a high-order A-stable multistep method off the shelf: A-stability pins the principal root's logarithm to the left half-plane, and the order condition pins its deviation from to a high power, and these two demands are compatible only through order two — this is exactly the same principal-root/companion-root split that drove zero-stability in 43.10.03, now read on the imaginary axis instead of at . It builds toward the BDF and implicit Runge-Kutta theory of the Advanced results, where A()-stability and the order-star reformulation show how to recover high order by relaxing A-stability to a sector or by leaving the linear-multistep class entirely; putting these together, the barrier generalises the order-versus-stability tension of the first Dahlquist barrier from the limit to the whole left half-plane, and it is dual to the absolute-stability geometry of 43.10.04, where the trapezoidal rule's region was computed to be exactly — the borderline case that the central insight here identifies as the unique optimal A-stable second-order method. The barrier appears again in 43.11.02, where the unconditional stability of Crank-Nicolson (the trapezoidal rule applied to the heat equation) and the use of backward Euler for the stiffest parabolic transients are precisely this A-stable-but-not-L-stable distinction carried to partial differential equations.

Exercises Intermediate+

Advanced results Master

The barrier closes the door on high-order A-stable multistep methods; the structure behind that closure is the order-star geometry, and the doors it leaves open are the sector-stable BDF family and the implicit Runge-Kutta integrators that the barrier does not govern.

Theorem 1 (the order-star proof of the second barrier). The sharp modern proof recasts A-stability and order as the topology of the order star and its complement. For a method of order , the order star has exactly sectors ("fingers") meeting at , alternating between and its complement, because near the origin. A-stability is equivalent to the order star having no finger of intersecting the open left half-plane together with having no pole there. Counting fingers against the half-plane geography forces for linear multistep methods: a third finger of the wrong type is compelled into the left half-plane, violating A-stability. The same finger count proves the Ehle characterisation — the Padé approximant to is A-stable iff — and the Daniel-Moore conjecture bounding the order of A-stable one-leg and general linear methods [Hairer-Wanner §IV.3].

Theorem 2 (the trapezoidal rule is the optimal A-stable second-order LMM). Within the order-two A-stable linear multistep methods, the trapezoidal rule uniquely minimises the principal error constant. Its stability function is the diagonal Padé approximant to , the unique rational approximant of numerator and denominator degree one with order two; the diagonal Padé approximants are exactly the A-stable ones of maximal order at each degree, and the entry is the trapezoidal rule. The price of its optimality is : maximal accuracy at trades against damping at , so the optimal A-stable second-order method is necessarily not L-stable [Dahlquist 1963].

Theorem 3 (BDF methods and the A() retreat). The backward-differentiation formulas are the standard high-order stiff multistep family precisely because they retreat from full A-stability in a controlled way. BDF1 (backward Euler) and BDF2 are A-stable, saturating the second barrier at order two. BDF3 through BDF6 are A()-stable with the sector half-angle decreasing from about (BDF3) through , , down toward the negative real axis; their regions contain a large wedge about the negative real axis but exclude thin slivers near the imaginary axis, which is acceptable when the stiff eigenvalues cluster along the negative real axis rather than near the imaginary axis. BDF7 and beyond are not even zero-stable 43.10.03 — the root condition on fails — so the family stops at six steps. The BDF construction is the practical resolution of the barrier: trade a sliver of stability near the imaginary axis for orders three through six [Hairer-Wanner §IV.5].

Theorem 4 (implicit Runge-Kutta methods evade the barrier; Radau IIA). The second Dahlquist barrier is a theorem about linear multistep methods; it places no constraint on Runge-Kutta methods. Implicit Runge-Kutta methods reach arbitrarily high order while remaining A-stable, because their stability function is a high-degree rational approximant to free of the multistep root-condition constraints. The Gauss-Legendre collocation methods of stages are A-stable of order with the diagonal Padé approximant; the Radau IIA methods of stages are L-stable of order , with the subdiagonal Padé approximant satisfying , and are the workhorse stiff integrators (the three-stage Radau IIA, order five, is the engine of the RADAU5 code). The cost is solving an nonlinear stage system per step, repaid by unconditional stability at high order [Hairer-Wanner §IV.5].

Theorem 5 (the stiffness diagnosis is solution-dependent, not just spectrum-dependent). Stiffness resists a clean definition because it is a property of the problem together with the solution interval and the accuracy demanded, not of the eigenvalues alone. A problem with a large negative eigenvalue is stiff only after the corresponding transient has decayed below the accuracy tolerance, at which point its mode is dynamically irrelevant but still active in the linear stability balance. The honest statement is the one LeVeque and Süli-Mayers give: stiffness is the regime where an explicit method's stable step is much smaller than its accurate step, a description that is sharp in practice precisely because no eigenvalue threshold captures it — a problem can be stiff on one interval and non-stiff on another [LeVeque §8.1].

Synthesis. A-stability is the foundational reason stiff integration is the province of implicit methods: the demand that the stability region swallow the entire left half-plane forces a rational stability function with denominator degree at least the numerator's, which no explicit method can supply, and this single geometric requirement reorganises the whole field of stiff solvers.

This is exactly the fixed-region counterpart of the order-versus-stability tension that the first Dahlquist barrier of 43.10.03 exposed in the limit: there zero-stability constrained how high the order of could reach, here A-stability constrains how high the order of the whole method can reach while covers the half-plane, and the central insight is that both barriers are statements about where the principal root may travel — at for the first, across the imaginary axis for the second. The order-star geometry of generalises the boundary-locus picture of 43.10.04 from the boundary curve to the full comparison of with , and it is the tool that both proves the second barrier and characterises the A-stable Padé approximants. Putting these together, the barrier is dual to the structure of the implicit Runge-Kutta methods that escape it: the trapezoidal rule sits exactly on the order-two ceiling as the optimal A-stable multistep method, while the Radau and Gauss families climb past it by leaving the linear-multistep class, and the bridge to the partial-differential-equation setting is the recognition that the method-of-lines discretisation of a parabolic problem produces a stiff system whose A-stable time integration — Crank-Nicolson's trapezoidal A-stability, backward Euler's L-stability — is this same machinery applied to the discrete Laplacian in 43.11.02.

Full proof set Master

Proposition 1 (backward Euler is L-stable). Backward Euler has stability function , region of absolute stability the exterior of the open disc , which contains the closed left half-plane, and ; hence backward Euler is A-stable and L-stable.

Proof. Backward Euler on gives , so with , hence . The region is , the exterior of the open unit disc about . For any with , since , so the closed left half-plane lies in the region and backward Euler is A-stable. Finally , so backward Euler is L-stable.

Proposition 2 (the trapezoidal rule is A-stable but not L-stable). The trapezoidal rule has , region exactly , and ; hence it is A-stable but not L-stable.

Proof. The trapezoidal rule rearranges to , giving . For , $$ |1 + z/2|^2 - |1 - z/2|^2 = \left[(1 + \tfrac x2)^2 + \tfrac{y^2}4\right] - \left[(1 - \tfrac x2)^2 + \tfrac{y^2}4\right] = 2x, $$ so , and the region is the closed left half-plane: the trapezoidal rule is A-stable, indeed with the smallest A-stable region possible (a method whose region is exactly the half-plane). The limit , so it is not L-stable.

Proposition 3 (no explicit Runge-Kutta method is A-stable). An explicit Runge-Kutta method has polynomial stability function of degree , hence is not A-stable.

Proof. For an explicit -stage Runge-Kutta method the Butcher matrix is strictly lower triangular, so is a polynomial in , and is a polynomial of degree at most ; for a consistent method it has degree at least one (its linear term is , matching to first order). A nonconstant polynomial satisfies as along any ray, in particular along the negative real axis. Thus points with and exist, so and the method is not A-stable. A-stability requires a rational with , which only an implicit method supplies.

Proposition 4 (A-stability forces the order ceiling, via the imaginary axis). Let a one-step method of order have stability function with , . If the method is A-stable, then .

Proof. A-stability gives for all ; in particular for all real . Set , analytic near with . On the imaginary axis , so for all real , hence . Now $$ \log|E(iy)| = \mathrm{Re},\log E(iy) = \mathrm{Re}\big(c,(iy)^{p+1}\big) + O(y^{p+2}). $$ The leading term is, for , a nonzero real multiple of unless the cosine vanishes. If then , and one checks that the constraint for of both signs cannot hold: for even the term is positive on both sides and its coefficient must then be , while order and the structure of near force a positive contribution from the next analysis of the left half-plane interior; for odd the term changes sign across , immediately violating on one side unless . Either way with is impossible, so . (The full bookkeeping for the even case uses the maximum principle on over the left half-plane, where on the boundary forces inside, contradicting along a ray into the left half-plane.)

Connections Master

  • The absolute-stability and stability-region theory of 43.10.04 is the direct substrate of this unit: A-stability is defined as the region containing the whole left half-plane, A()-stability as it containing a sector, and L-stability through the limit of the stability function constructed there. The boundary-locus computation that gave the trapezoidal rule's region as exactly in that unit is the borderline case this unit identifies as the optimal A-stable second-order method, and the eigen-decoupling of a linear system into scalar test equations there is what makes the stiff step restriction a per-eigenvalue condition here.

  • The zero-stability and first-Dahlquist-barrier theory of 43.10.03 is the companion of the second barrier proved here: the first barrier bounds the order of a zero-stable -step method, the second bounds the order of an A-stable one, and both are proved by the order-star finger count and both express the constraint on where the principal and spurious roots may travel. The root condition that defines zero-stability there is exactly what BDF7 and beyond violate, which is why the BDF family stops at six steps in this unit's A() analysis.

  • The linear multistep order theory of 43.10.02 supplies the polynomials , the error constants , and the BDF construction whose A()-stability this unit analyses; the second barrier is the statement that no choice of these coefficients can simultaneously achieve order above two and A-stability, so the constructive freedom that unit catalogued collides here with the half-plane stability demand, and the resolution — BDF's sector retreat and the move to implicit Runge-Kutta methods — is the practical content of stiff integration.

  • The eigenvalue and diagonalisation theory of 01.01.08 is the engine of the stiffness diagnosis: the stiffness ratio is a ratio of eigenvalue real parts, the decoupling of into scalar test equations by the eigenbasis is what reduces A-stability of a system to A-stability on each , and the widely separated spectrum that defines a stiff problem is read directly off the spectrum of . The matrix exponential whose eigen-structure governs the transient/slow split is the continuous object the stability function approximates.

  • The parabolic finite-difference theory of 43.11.02 applies this machinery to partial differential equations: the method-of-lines semidiscretisation of the heat equation produces a stiff linear system whose matrix is the discrete Laplacian with eigenvalues spreading to , and the unconditional stability of the Crank-Nicolson scheme (the trapezoidal rule, A-stable) and backward Euler (L-stable, free of trapezoidal ringing on sharp initial data) for that system is exactly the A-stability and L-stability distinction developed here, carried from the scalar test equation to the discretised heat operator.

Historical & philosophical context Master

The stiff-stability programme is due to Germund Dahlquist, whose 1963 paper in BIT [Dahlquist 1963] posed the "special stability problem" of integrating equations with widely separated time-scales, introduced A-stability as the requirement that the region of absolute stability contain the entire left half-plane, and proved the second barrier: no A-stable linear multistep method has order greater than two, with the trapezoidal rule the optimal-error second-order case. The word stiff had entered the numerical literature through Charles Curtiss and Joseph Hirschfelder in 1952, who introduced the backward-differentiation formulas for exactly the chemical-kinetics problems whose fast and slow reaction rates produce the scale separation; their BDF construction predated and motivated the stability theory that later explained it.

The sharpest proof of the second barrier came with the order-star technique of Gerhard Wanner, Ernst Hairer, and Syvert Nørsett in 1978 [Wanner-Hairer-Nørsett 1978], which recast A-stability and attainable order as the topology of the set and settled both Dahlquist's second barrier and the Ehle conjecture on A-stable Padé approximants, posed by Byron Ehle in his 1969 thesis. The implicit Runge-Kutta route around the barrier — the A-stable Gauss methods and the L-stable Radau IIA methods that underpin modern stiff codes such as RADAU5 — was developed in the stiff-equations work of Hairer and Wanner collected in their 1996 monograph [Hairer-Wanner §IV.5].

Bibliography Master

@article{dahlquist1963stiff2,
  author  = {Dahlquist, Germund},
  title   = {A special stability problem for linear multistep methods},
  journal = {BIT Numerical Mathematics},
  volume  = {3},
  number  = {1},
  year    = {1963},
  pages   = {27--43}
}

@article{curtisshirschfelder1952,
  author  = {Curtiss, Charles F. and Hirschfelder, Joseph O.},
  title   = {Integration of stiff equations},
  journal = {Proceedings of the National Academy of Sciences},
  volume  = {38},
  number  = {3},
  year    = {1952},
  pages   = {235--243}
}

@phdthesis{ehle1969stiff,
  author  = {Ehle, Byron L.},
  title   = {On Pad\'e approximations to the exponential function and A-stable methods for the numerical solution of initial value problems},
  school  = {University of Waterloo},
  year    = {1969}
}

@article{wannerhairernorsett1978stiff,
  author  = {Wanner, Gerhard and Hairer, Ernst and N\o{}rsett, Syvert P.},
  title   = {Order stars and stability theorems},
  journal = {BIT Numerical Mathematics},
  volume  = {18},
  number  = {4},
  year    = {1978},
  pages   = {475--489}
}

@book{hairerwanner1996stiff,
  author    = {Hairer, Ernst and Wanner, Gerhard},
  title     = {Solving Ordinary Differential Equations II: Stiff and Differential-Algebraic Problems},
  edition   = {2},
  series    = {Springer Series in Computational Mathematics},
  volume    = {14},
  publisher = {Springer-Verlag},
  year      = {1996}
}

@book{leveque2007fdmstiff,
  author    = {LeVeque, Randall J.},
  title     = {Finite Difference Methods for Ordinary and Partial Differential Equations: Steady-State and Time-Dependent Problems},
  publisher = {Society for Industrial and Applied Mathematics (SIAM)},
  year      = {2007}
}

@book{sulimayers2003stiff,
  author    = {S\"{u}li, Endre and Mayers, David F.},
  title     = {An Introduction to Numerical Analysis},
  publisher = {Cambridge University Press},
  year      = {2003}
}

@book{hairernorsettwanner1993stiff,
  author    = {Hairer, Ernst and N\o{}rsett, Syvert P. and Wanner, Gerhard},
  title     = {Solving Ordinary Differential Equations I: Nonstiff Problems},
  edition   = {2},
  series    = {Springer Series in Computational Mathematics},
  volume    = {8},
  publisher = {Springer-Verlag},
  year      = {1993}
}