43.10.02 · numerical-analysis / numerical-odes

Linear multistep methods: Adams and BDF families, order via the characteristic polynomials

shipped · 3 tiers · Lean: none

Anchor (Master): Hairer-Nørsett-Wanner 1993 *Solving Ordinary Differential Equations I: Nonstiff Problems* 2e (Springer) §III.1-III.2 (the general LMM, the error constants $C_p$, Adams and Nyström families, the order via the residual operator); Hairer-Wanner 1996 *Solving Ordinary Differential Equations II: Stiff and Differential-Algebraic Problems* 2e (Springer) §V.1 (BDF methods and their construction); Henrici 1962 *Discrete Variable Methods in Ordinary Differential Equations* (Wiley) Ch. 5 (the difference-operator calculus and the order conditions)

Intuition Beginner

A one-step method draws the next point of a solution curve using only the point you are standing on right now. Every slope you computed at the previous points is thrown away the moment you move. That feels wasteful: you paid to learn the slope at the last few stops, and those slopes still carry real information about how the curve is bending. A linear multistep method is the idea of keeping a short memory and reusing it.

Imagine walking a trail and predicting where the path goes next. A forgetful walker looks only at the ground underfoot. A walker with memory recalls the last few steps, notices the trail has been curving gently to the left, and leans into the turn before reaching it. By blending several recent slope readings instead of one, the method anticipates the bend and lands closer to the true path for the same amount of work each step.

There are two natural things to remember, slopes or positions, and they lead to three families. The first keeps a memory of past slopes and extrapolates forward; this gives the Adams-Bashforth family, which predicts the next point from slopes already computed. The second also leans on the slope at the point you are about to reach, which must be solved for; this gives the Adams-Moulton family, which corrects a prediction and tracks curvature even better. The third keeps a memory of past positions and fits the new slope to them; this gives the backward-differentiation formulas, the workhorses for stiff problems where explicit methods fail.

Two practical wrinkles appear the moment you keep a memory. First, a method needing the last three points cannot start from a single initial point, so the first few values must be supplied some other way, usually by a one-step method. Second, more memory is not automatically better: a badly chosen blend can let small rounding errors feed on themselves and grow. Sorting out which blends are safe is the work of the next unit; this one builds the blends and measures how accurate they are.

Visual Beginner

Picture the solution curve and a row of equally spaced time stations along the bottom. At each past station you have already marked a slope dash. A one-step method uses only the dash at the current station. A multistep method reaches back and draws on the dashes at several recent stations, fitting a small curve through them and following it forward to the next station.

The table below lines up the three families this unit builds. Reading across, each row says what the method remembers, whether it must solve for the new point, and the highest power of the step size whose error it cancels — the order.

family	what it remembers	solve for the new point?	typical order for $r$ steps
Adams-Bashforth	the last $r$ slopes	no (explicit)	$r$
Adams-Moulton	the last $r$ slopes plus the new slope	yes (implicit)	$r + 1$
backward differentiation	the last $r$ positions plus the new slope	yes (implicit)	$r$

Each row is a recipe with fixed numerical weights, and the rest of the unit explains where those weights come from and why the order column reads as it does.

Worked example Beginner

Let us build and run the simplest memory method, two-step Adams-Bashforth, on the slope rule "the slope equals the current height," starting from height $1$ at time $0$ , whose true curve is $y = e^{t}$ . The recipe uses the two most recent slopes with weights $\frac{3}{2}$ and $- \frac{1}{2}$ : $$ u^{n+2} = u^{n+1} + h\left(\tfrac{3}{2} f^{n+1} - \tfrac{1}{2} f^{n}\right), $$ where $f^{n}$ is the slope at the $n$ -th point, here equal to $u^{n}$ . Take a step of $h = 0.5$ .

Step 1. The method needs two starting points, but we are handed only one. Use one step of forward Euler to manufacture the second: $u^{1} = u^{0} + 0.5 \times u^{0} = 1 + 0.5 = 1.5$ . Now we have $u^{0} = 1$ at time $0$ and $u^{1} = 1.5$ at time $0.5$ .

Step 2. Apply the two-step recipe to reach time $1$ . The slopes are $f^{0} = 1$ and $f^{1} = 1.5$ , so $$ u^2 = 1.5 + 0.5\left(\tfrac{3}{2}\times 1.5 - \tfrac{1}{2}\times 1\right) = 1.5 + 0.5(2.25 - 0.5) = 1.5 + 0.875 = 2.375. $$

Step 3. Compare with the truth. The exact height at time $1$ is $e \approx 2.718$ . Plain forward Euler reached only $2.25$ here, an undershoot of about $0.468$ . The two-step method reached $2.375$ , an undershoot of about $0.343$ — closer, using the same number of slope evaluations per step, because it remembered the previous slope and used the curve's bend.

Step 4. See the order. The weights $\frac{3}{2}$ and $- \frac{1}{2}$ are chosen so the method is exact whenever the true solution is a parabola. That makes it second order: halving the step shrinks the error by about a factor of four, not just two as for Euler.

What this tells us: keeping a one-step memory and blending two slopes with the right weights buys an extra power of accuracy for free, at the cost of needing a helper method to supply the first extra starting value.
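The arithmetic above can be replayed in a few lines. The following is a minimal sketch, not a production integrator: the function name `ab2_solve` and its argument layout are choices made for this illustration, and the forward-Euler starter is the same helper used in Step 1.

```python
import math

def ab2_solve(f, t0, y0, h, n_steps):
    """Two-step Adams-Bashforth with one forward-Euler starter step.

    Returns the lists of times and values u^0, ..., u^{n_steps}.
    Assumes n_steps >= 1.
    """
    f0 = f(t0, y0)
    t = [t0, t0 + h]
    u = [y0, y0 + h * f0]                 # forward Euler supplies u^1
    slopes = [f0, f(t[1], u[1])]          # remembered slopes f^0, f^1
    for _ in range(n_steps - 1):
        # blend the two most recent slopes with weights 3/2 and -1/2
        u_next = u[-1] + h * (1.5 * slopes[-1] - 0.5 * slopes[-2])
        t.append(t[-1] + h)
        u.append(u_next)
        slopes.append(f(t[-1], u[-1]))    # one new slope evaluation per step
    return t, u

def rhs(t, y):
    return y                              # slope equals current height

# Reproduce the worked example: h = 0.5, two steps to reach t = 1.
_, u = ab2_solve(rhs, 0.0, 1.0, 0.5, 2)
u2 = u[2]
print("u^2 =", u2)

# Halve the step repeatedly and compare the errors at t = 1.
errors = []
for k in (10, 20, 40):
    _, values = ab2_solve(rhs, 0.0, 1.0, 1.0 / k, k)
    errors.append(abs(values[-1] - math.e))
ratios = [errors[i] / errors[i + 1] for i in range(len(errors) - 1)]
print("error ratios on halving h:", ratios)
```

The first print gives the value $2.375$ from Step 2. The error ratios should sit near four, which is the second-order behaviour described in Step 4.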

Check your understanding Beginner

Formal definition Intermediate+

Consider the initial-value problem (IVP) $y^{'} (t) = f (t, y (t))$ , $y (t_{0}) = y_{0}$ , with $f : [t_{0}, T] \times R^{d} \to R^{d}$ continuous and globally Lipschitz in $y$ , so the exact $C^{1}$ solution exists and is unique by the Picard-Lindelöf theorem 02.12.01. Fix a step $h > 0$ , set $t_{n} = t_{0} + nh$ , write $u^{n} \approx y (t_{n})$ , and abbreviate $f^{n} := f (t_{n}, u^{n})$ .

Definition (linear multistep method). An $r$ -step linear multistep method (LMM) is the difference equation $$ \sum_{j=0}^{r} \alpha_j\, u^{n+j} \;=\; h \sum_{j=0}^{r} \beta_j\, f^{n+j}, \qquad n = 0, 1, 2, \dots, $$ with real coefficients $α_{j}, β_{j}$ , normalised by $α_{r} = 1$ (so the leading term solves for the newest value) and $|α_{0}| + |β_{0}| \neq 0$ (so the method genuinely reaches back $r$ steps). The method is explicit when $β_{r} = 0$ , since then $u^{n + r}$ is given outright by past data; it is implicit when $β_{r} \neq 0$ , since then $u^{n + r}$ appears inside $f^{n + r}$ and each step solves an algebraic equation for it.

Definition (characteristic polynomials). The first and second characteristic polynomials of the method are $$ \rho(\zeta) = \sum_{j=0}^{r} \alpha_j \zeta^j, \qquad \sigma(\zeta) = \sum_{j=0}^{r} \beta_j \zeta^j. $$ The polynomial $ρ$ records how past values are combined; $σ$ records how past slopes are combined. The pair $(ρ, σ)$ determines the method and is the object through which both accuracy and stability are read.

Definition (linear difference operator, truncation error, order). Associate to the method the linear difference operator acting on a smooth function $z$ : $$ \mathcal{L}[z; h](t) \;=\; \sum_{j=0}^{r} \big(\alpha_j\, z(t + jh) - h\,\beta_j\, z'(t + jh)\big). $$ The local truncation error (LTE) is $τ^{n} = \frac{1}{h} L [y; h] (t_{n})$ , the residual when the exact solution is substituted into the formula and divided by $h$ . The method has order $p$ if $L [z; h] (t) = O (h^{p + 1})$ for every $C^{p + 1}$ function $z$ , equivalently $τ^{n} = O (h^{p})$ ; it is consistent if $p \geq 1$ . Expanding each $z (t + j h)$ and $z^{'} (t + j h)$ in a Taylor series collects $L [z; h] = \sum_{q \geq 0} C_{q} h^{q} z^{(q)} (t)$ with error constants $$ C_0 = \sum_j \alpha_j, \qquad C_q = \frac{1}{q!}\sum_j \alpha_j j^q - \frac{1}{(q-1)!}\sum_j \beta_j j^{q-1}\ \ (q \ge 1), $$ and order $p$ is exactly $C_{0} = C_{1} = \dots = C_{p} = 0$ , $C_{p + 1} \neq 0$ , with $C_{p + 1}$ the principal error constant.

The symbols $ρ, σ$ , the operator $L$ , the error constants $C_{q}$ , and the $O (h^{p})$ Landau symbol are recorded in _meta/NOTATION.md.
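The error-constant formula is mechanical enough to evaluate exactly. Below is a small sketch using rational arithmetic; the helper names `error_constants` and `order_of` are hypothetical, and coefficients are passed as lists indexed by $j = 0, \dots, r$.

```python
from fractions import Fraction
from math import factorial

def error_constants(alpha, beta, q_max):
    """Return [C_0, ..., C_{q_max}] for coefficient lists alpha[j], beta[j]."""
    alpha = [Fraction(a) for a in alpha]
    beta = [Fraction(b) for b in beta]
    consts = [sum(alpha)]                                   # C_0
    for q in range(1, q_max + 1):
        a_sum = sum(a * j**q for j, a in enumerate(alpha))
        # Python evaluates 0**0 as 1, which is what the j = 0, q = 1 term needs
        b_sum = sum(b * j**(q - 1) for j, b in enumerate(beta))
        consts.append(a_sum / factorial(q) - b_sum / factorial(q - 1))
    return consts

def order_of(alpha, beta, q_max=12):
    """Return (p, C_{p+1}): the order and the principal error constant."""
    for q, c in enumerate(error_constants(alpha, beta, q_max)):
        if c != 0:
            return q - 1, c
    raise ValueError("all constants up to q_max vanish; increase q_max")

# Two-step Adams-Bashforth: u^{n+2} = u^{n+1} + h (3/2 f^{n+1} - 1/2 f^n)
ab2 = order_of([0, -1, 1], [Fraction(-1, 2), Fraction(3, 2), 0])
# Forward Euler: u^{n+1} = u^n + h f^n
euler = order_of([-1, 1], [1, 0])
print("AB2:", ab2)
print("Euler:", euler)
```

Working in `Fraction` avoids rounding, so a constant that should vanish is exactly zero and the first nonzero entry identifies the order unambiguously.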

Counterexamples to common slips Intermediate+

"Consistency of a multistep method is just $τ^{n} \to 0$ , same as one-step." The consistency conditions read identically as $C_{0} = C_{1} = 0$ , but for $r \geq 2$ consistency no longer implies convergence: a consistent LMM can diverge if its $ρ$ has a spurious root outside the unit disc. The missing ingredient, zero-stability, is the root condition on $ρ$ developed in 43.10.03; this unit constructs the methods and certifies their order, and treats convergence as the companion question.
"The order is whatever $σ$ says." Both polynomials enter every order condition. Consistency is $ρ (1) = 0$ and $ρ^{'} (1) = σ (1)$ together: the first says constants are reproduced, the second says linear functions are reproduced. Ignoring $ρ^{'} (1) = σ (1)$ produces a method that is exact on constants but already wrong on straight lines.
"More steps always raises the order." The number of order conditions an $r$ -step method must satisfy grows with $p$ , and for zero-stable methods the attainable order is capped — the first Dahlquist barrier limits a stable $r$ -step method to order $r + 1$ (odd $r$ ) or $r + 2$ (even $r$ ). One can push the order higher by spending all coefficients on accuracy, but the resulting $ρ$ then violates the root condition and the method is useless. This tension is the subject of 43.10.03.

Key theorem with proof Intermediate+

The signature result is the characterisation of order through $ρ$ and $σ$ : a single algebraic statement that turns "match the Taylor expansion to order $p$ " into explicit conditions on the two characteristic polynomials, and reads consistency off as two equations, one on $ρ (1)$ alone and one linking $ρ^{'} (1)$ to $σ (1)$ . Every coefficient choice in the Adams and BDF constructions below is dictated by it.

Theorem (order conditions for linear multistep methods). Let an $r$ -step LMM have characteristic polynomials $ρ, σ$ and error constants $C_{q}$ as above. Then:

1. The method has order $\geq p$ if and only if $C_{0} = C_{1} = \dots = C_{p} = 0$ , equivalently
$$ \sum_{j=0}^{r} \alpha_j = 0 \quad (q = 0), \qquad \sum_{j=0}^{r} \alpha_j j^{q} = q \sum_{j=0}^{r} \beta_j j^{q-1} \quad (1 \leq q \leq p). $$
2. Equivalently, in terms of the polynomials,
$$ \rho(e^{h}) - h\,\sigma(e^{h}) = C_{p+1} h^{p+1} + O(h^{p+2}) \qquad (h \to 0), $$
so the method has order $p$ exactly when the analytic function $ρ (e^{h}) - hσ (e^{h})$ vanishes to order $p + 1$ at $h = 0$ .
3. In particular the method is consistent (order $\geq 1$ ) if and only if
$$ \rho(1) = 0 \quad \text{and} \quad \rho'(1) = \sigma(1). $$

Proof. Take $z$ smooth and expand each term of $L [z; h] (t)$ in a Taylor series about $t$ . With $z (t + j h) = \sum_{q \geq 0} \frac{(jh)^{q}}{q!} z^{(q)} (t)$ and $z^{'} (t + j h) = \sum_{q \geq 1} \frac{(jh)^{q - 1}}{(q - 1)!} z^{(q)} (t)$ , substitute into the operator and collect powers of $h$ : $$ \mathcal{L}[z;h](t) = \sum_{j} \alpha_j \sum_{q\ge 0}\frac{(jh)^q}{q!} z^{(q)}(t) - h\sum_j \beta_j \sum_{q \ge 1}\frac{(jh)^{q-1}}{(q-1)!} z^{(q)}(t) = \sum_{q\ge 0} C_q h^q z^{(q)}(t), $$ with $C_{0} = \sum_{j} α_{j}$ and, for $q \geq 1$ , $C_{q} = \frac{1}{q!} \sum_{j} α_{j} j^{q} - \frac{1}{(q - 1)!} \sum_{j} β_{j} j^{q - 1}$ , reading off the coefficient of $h^{q} z^{(q)}$ from both sums (the second sum contributes to order $q$ via its $h \cdot h^{q - 1}$ factor). The operator therefore annihilates every $C^{p + 1}$ function to order $h^{p + 1}$ precisely when $C_{0} = \dots = C_{p} = 0$ . Clearing the factorials in $C_{q} = 0$ gives $\sum_{j} α_{j} j^{q} = q \sum_{j} β_{j} j^{q - 1}$ , which is part 1.

For part 2, apply the operator to the exponential probe $z (t) = e^{s t}$ . Then $z (t + j h) = e^{s t} e^{s j h}$ and $z^{'} (t + j h) = s e^{s t} e^{s j h}$ , so $$ \mathcal{L}[e^{s\cdot}; h](t) = e^{st}\Big(\sum_j \alpha_j e^{sjh} - h s \sum_j \beta_j e^{sjh}\Big) = e^{st}\big(\rho(e^{sh}) - hs\,\sigma(e^{sh})\big). $$ Setting $s = 1$ (a unit-rate probe, with $z^{'} = z$ ) gives the function $ρ (e^{h}) - hσ (e^{h})$ . On the other hand $L [e^{s \cdot}; h] = e^{s t} \sum_{q} C_{q} (s h)^{q}$ from the expansion above with $z^{(q)} = s^{q} e^{s t}$ ; comparing, $ρ (e^{h}) - hσ (e^{h}) = \sum_{q} C_{q} h^{q}$ at $s = 1$ . Hence order $p$ (i.e. $C_{0} = \dots = C_{p} = 0$ , $C_{p + 1} \neq 0$ ) is the same as $ρ (e^{h}) - hσ (e^{h}) = C_{p + 1} h^{p + 1} + O (h^{p + 2})$ .

Part 3 is the case $p = 1$ written through the polynomials. The condition $C_{0} = 0$ is $\sum_{j} α_{j} = ρ (1) = 0$ . The condition $C_{1} = 0$ is $\sum_{j} α_{j} j = \sum_{j} β_{j}$ , and since $ρ^{'} (ζ) = \sum_{j} α_{j} j ζ^{j - 1}$ gives $ρ^{'} (1) = \sum_{j} α_{j} j$ while $σ (1) = \sum_{j} β_{j}$ , this is $ρ^{'} (1) = σ (1)$ . $□$

The analytic form in part 2 is often written as the log condition: setting $ζ = e^{h}$ , so that $h = \log ζ$ near $ζ = 1$ , and dividing by $h$ , order $p$ is the statement that $ρ (ζ) / \log ζ - σ (ζ) = O ((ζ - 1)^{p})$ as $ζ \to 1$ . This is the lens through which the Adams and BDF families are derived. For Adams, choose $ρ$ first by a structural principle, then take $σ$ to be the Taylor truncation of $ρ (ζ) / \log ζ$ to the required degree; for BDF the roles are reversed, with $σ$ fixed as a single term at the newest node and $ρ$ taken as the corresponding truncation of $σ (ζ) \log ζ$ .
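The exponential-probe identity can be checked numerically. The sketch below evaluates $ρ (e^{h}) - hσ (e^{h})$ in floating point for the two-step Adams-Bashforth coefficients and divides by $h^{3}$ ; the function name `probe_residual` is an assumption of this illustration. If the method has order $2$ , the quotient should approach the principal error constant $C_{3}$ as $h \to 0$ .

```python
import math

def probe_residual(alpha, beta, h):
    """Evaluate rho(e^h) - h * sigma(e^h) for coefficient lists alpha, beta."""
    zeta = math.exp(h)
    rho = sum(a * zeta**j for j, a in enumerate(alpha))
    sigma = sum(b * zeta**j for j, b in enumerate(beta))
    return rho - h * sigma

# Two-step Adams-Bashforth on the window j = 0, 1, 2
alpha = [0.0, -1.0, 1.0]
beta = [-0.5, 1.5, 0.0]

quotients = []
for h in (0.1, 0.05, 0.025):
    quotients.append(probe_residual(alpha, beta, h) / h**3)
    print(h, quotients[-1])
```

The printed quotients drift toward $\frac{5}{12} \approx 0.4167$ , with a remaining gap that shrinks roughly in proportion to $h$ because the next term in the expansion is $C_{4} h^{4}$ .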

Bridge. The order conditions are the foundational reason a multistep method's accuracy is a property of the pair $(ρ, σ)$ and not of any individual coefficient: the linear difference operator $L$ filters out exactly the Taylor data that the polynomials fail to reproduce, and the probe $z = e^{s t}$ collapses that filtering into the single analytic identity $ρ (e^{h}) - hσ (e^{h}) = C_{p + 1} h^{p + 1} + O (h^{p + 2})$ . This argument builds toward the convergence theory of 43.10.03, where the same first polynomial $ρ$ that here governs accuracy through $ρ (1) = 0$ , $ρ^{'} (1) = σ (1)$ reappears as the carrier of stability through the location of its roots; that is exactly the point at which consistency and convergence come apart, because a consistent $ρ$ may still have a root outside the unit disc. The structure generalises the one-step convergence theorem of 43.10.01: the scalar amplification factor $1 + h Λ$ that there propagated the local error is here replaced by the companion matrix of $ρ$ , and putting these together, the central insight of the whole chapter — local accuracy from Taylor matching, propagated under a stability condition — is what the bridge from this unit to the next makes precise, with the root condition on $ρ$ playing the role the Lipschitz bound played for one-step methods. The error-constant calculus that drives the Adams and BDF construction below appears again in the Dahlquist-barrier count of how much order a stable $ρ$ can support.

Exercises Intermediate+

Exercise 3 (medium, symbolic).

Derive the two-step Adams-Moulton method (the trapezoidal-like implicit scheme with $ρ (ζ) = ζ - 1$ shifted appropriately) of maximal order by choosing $β_{2}, β_{1}, β_{0}$ in $u^{n + 1} = u^{n} + h (β_{2} f^{n + 1} + β_{1} f^{n} + β_{0} f^{n - 1})$ to kill $C_{1}, C_{2}, C_{3}$ .

Hint

Here $ρ (ζ) = ζ^{2} - ζ$ on the index window $\{n - 1, n, n + 1\}$ , so $C_{0} = ρ (1) = 0$ holds automatically and only $σ$ remains to be chosen; impose $\sum_{j} α_{j} j^{q} = q \sum_{j} β_{j} j^{q - 1}$ for $q = 1, 2, 3$ with nodes $j \in \{0, 1, 2\}$ .

Answer

Index the window $j \in {0, 1, 2}$ with $α_{2} = 1, α_{1} = - 1, α_{0} = 0$ and unknown $β_{2}, β_{1}, β_{0}$ . The conditions $\sum_{j} α_{j} j^{q} = q \sum_{j} β_{j} j^{q - 1}$ read: $q = 1$ : $α_{1} + 2 α_{2} = 1 = β_{0} + β_{1} + β_{2}$ ; $q = 2$ : $α_{1} + 4 α_{2} = 3 = 2 (β_{1} + 2 β_{2})$ ; $q = 3$ : $α_{1} + 8 α_{2} = 7 = 3 (β_{1} + 4 β_{2})$ . From the last two, $β_{1} + 2 β_{2} = \frac{3}{2}$ and $β_{1} + 4 β_{2} = \frac{7}{3}$ ; subtracting, $2 β_{2} = \frac{7}{3} - \frac{3}{2} = \frac{5}{6}$ , so $β_{2} = \frac{5}{12}$ , $β_{1} = \frac{3}{2} - \frac{5}{6} = \frac{2}{3}$ , and $β_{0} = 1 - β_{1} - β_{2} = 1 - \frac{2}{3} - \frac{5}{12} = - \frac{1}{12}$ . This is the third-order Adams-Moulton method $u^{n + 1} = u^{n} + h (\frac{5}{12} f^{n + 1} + \frac{2}{3} f^{n} - \frac{1}{12} f^{n - 1})$ . Rubric: full credit for setting up all three conditions and solving the linear system to the stated coefficients.
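The same linear system can be set up and solved exactly by machine. This is a sketch under the assumption that $ρ$ is given and all $r + 1$ slope weights are free; the helpers `solve_exact` and `betas_from_order_conditions` are names invented for this illustration.

```python
from fractions import Fraction

def solve_exact(A, b):
    """Gauss-Jordan elimination over the rationals (A square, nonsingular)."""
    n = len(A)
    M = [[Fraction(x) for x in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [x / M[col][col] for x in M[col]]        # scale pivot row to 1
        for r in range(n):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [x - factor * y for x, y in zip(M[r], M[col])]
    return [row[-1] for row in M]

def betas_from_order_conditions(alpha):
    """Solve sum_j alpha_j j^q = q sum_j beta_j j^(q-1), q = 1..r+1, for beta."""
    r = len(alpha) - 1
    qs = range(1, r + 2)
    A = [[q * j**(q - 1) for j in range(r + 1)] for q in qs]   # 0**0 == 1
    b = [sum(Fraction(a) * j**q for j, a in enumerate(alpha)) for q in qs]
    return solve_exact(A, b)

am2 = betas_from_order_conditions([0, -1, 1])   # rho = zeta^2 - zeta
trap = betas_from_order_conditions([-1, 1])     # rho = zeta - 1
print("two-step Adams-Moulton weights:", am2)
print("one-step (trapezoidal) weights:", trap)
```

For $α = (0, - 1, 1)$ the solver returns $(β_{0}, β_{1}, β_{2}) = (- \frac{1}{12}, \frac{2}{3}, \frac{5}{12})$ , the weights derived by hand above.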

Exercise 4 (medium, symbolic).

Compute the principal error constant $C_{3}$ of the two-step Adams-Bashforth method $u^{n + 2} = u^{n + 1} + h (\frac{3}{2} f^{n + 1} - \frac{1}{2} f^{n})$ and state its order.

Hint

Coefficients on the window $j \in \{0, 1, 2\}$ : $α = (0, - 1, 1)$ and $β = (- \frac{1}{2}, \frac{3}{2}, 0)$ , lining up $β$ with the slope indices so that $f^{n}, f^{n + 1}$ correspond to $j = 0, 1$ . Use $C_{q} = \frac{1}{q!} \sum_{j} α_{j} j^{q} - \frac{1}{(q - 1)!} \sum_{j} β_{j} j^{q - 1}$ .

Answer

On the window $j \in \{0, 1, 2\}$ the value coefficients are $α_{0} = 0, α_{1} = - 1, α_{2} = 1$ and the slope coefficients are $β_{0} = - \frac{1}{2}, β_{1} = \frac{3}{2}, β_{2} = 0$ . Check $C_{0} = 0$ , $C_{1} = (- 1 + 2) - (- \frac{1}{2} + \frac{3}{2}) = 1 - 1 = 0$ , $C_{2} = \frac{1}{2} (- 1 + 4) - (β_{1} + 2 β_{2}) = \frac{3}{2} - \frac{3}{2} = 0$ . Then $C_{3} = \frac{1}{6} (α_{1} + 8 α_{2}) - \frac{1}{2} (β_{1} \cdot 1 + β_{2} \cdot 4) = \frac{1}{6} (7) - \frac{1}{2} (\frac{3}{2}) = \frac{7}{6} - \frac{3}{4} = \frac{14 - 9}{12} = \frac{5}{12}$ . Since $C_{0} = C_{1} = C_{2} = 0$ and $C_{3} = \frac{5}{12} \neq 0$ , the order is $p = 2$ with principal error constant $\frac{5}{12}$ . Rubric: full credit for verifying $C_{2} = 0$ and computing $C_{3} = \frac{5}{12}$ .

Exercise 5 (medium, symbolic).

Derive the two-step backward-differentiation formula (BDF2) by fitting the interpolating polynomial through $(t_{n - 1}, u^{n - 1}), (t_{n}, u^{n}), (t_{n + 1}, u^{n + 1})$ and matching its derivative at $t_{n + 1}$ to $f^{n + 1}$ . Show it is $\frac{3}{2} u^{n + 1} - 2 u^{n} + \frac{1}{2} u^{n - 1} = h f^{n + 1}$ .

Hint

For equally spaced nodes, the backward-difference derivative of the quadratic interpolant at the newest node is $\frac{1}{h} (\frac{3}{2} u^{n + 1} - 2 u^{n} + \frac{1}{2} u^{n - 1})$ . Set it equal to $f^{n + 1}$ .

Answer

Let $P$ be the quadratic interpolant of $u^{n - 1}, u^{n}, u^{n + 1}$ at $t_{n - 1}, t_{n}, t_{n + 1}$ . Writing $P$ in Newton backward-difference form about $t_{n + 1}$ , $P (t_{n + 1} - s h) = u^{n + 1} - s \nabla u^{n + 1} + \frac{s (s - 1)}{2} \nabla^{2} u^{n + 1}$ with $\nabla u^{n + 1} = u^{n + 1} - u^{n}$ , $\nabla^{2} u^{n + 1} = u^{n + 1} - 2 u^{n} + u^{n - 1}$ (at $s = 1$ this returns $u^{n}$ , as it must). Differentiating in $t$ (so $\frac{d}{d t} = - \frac{1}{h} \frac{d}{d s}$ ) and evaluating at $s = 0$ gives $P^{'} (t_{n + 1}) = \frac{1}{h} (\nabla u^{n + 1} + \frac{1}{2} \nabla^{2} u^{n + 1}) = \frac{1}{h} (\frac{3}{2} u^{n + 1} - 2 u^{n} + \frac{1}{2} u^{n - 1})$ . Setting $P^{'} (t_{n + 1}) = f^{n + 1}$ yields $\frac{3}{2} u^{n + 1} - 2 u^{n} + \frac{1}{2} u^{n - 1} = h f^{n + 1}$ . So $ρ (ζ) = \frac{3}{2} ζ^{2} - 2 ζ + \frac{1}{2}$ , $σ (ζ) = ζ^{2}$ ; one checks $ρ (1) = 0$ and $ρ^{'} (1) = 1 = σ (1)$ , and the order is $2$ . Rubric: full credit for the backward-difference derivative and the resulting coefficients.

Exercise 6 (medium, numeric).

A predictor-corrector pair uses two-step Adams-Bashforth to predict and the trapezoidal rule to correct once. On $y^{'} = y$ , $y (0) = 1$ , with $h = 0.5$ and starting values $u^{0} = 1$ , $u^{1} = 1.5$ , compute the corrected $u^{2}$ in PECE mode.

Hint

Predict $\bar{u}^{2}$ by Adams-Bashforth, evaluate $\bar{f}^{2} = \bar{u}^{2}$ , then correct $u^{2} = u^{1} + \frac{h}{2} (f^{1} + \bar{f}^{2})$ with $f^{1} = u^{1} = 1.5$ .

Answer

$u^{2} = 2.46875$ . Predict: $\bar{u}^{2} = u^{1} + h (\frac{3}{2} f^{1} - \frac{1}{2} f^{0}) = 1.5 + 0.5 (\frac{3}{2} \cdot 1.5 - \frac{1}{2} \cdot 1) = 1.5 + 0.5 (2.25 - 0.5) = 2.375$ . Evaluate $\bar{f}^{2} = \bar{u}^{2} = 2.375$ . Correct (trapezoidal): $u^{2} = u^{1} + \frac{0.5}{2} (f^{1} + \bar{f}^{2}) = 1.5 + 0.25 (1.5 + 2.375) = 1.5 + 0.25 (3.875) = 1.5 + 0.96875 = 2.46875$ . (The final "E" re-evaluates $f^{2} = u^{2}$ for the next step.) For comparison, the bare predictor gave $2.375$ . Rubric: full credit for the predict, evaluate, correct sequence and the value $2.46875$ .
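The predict, evaluate, correct, evaluate sequence can be written out as one function. This is a sketch for the specific pair in this exercise (two-step Adams-Bashforth predictor, trapezoidal corrector); the name `pece_step` and its return layout are choices made for the illustration.

```python
def pece_step(f, t1, h, u0, u1):
    """One PECE step from (t1 - h, u0), (t1, u1) to t1 + h.

    Returns the predicted value, the corrected value, and the slope
    re-evaluated at the corrected value (the final E).
    """
    f0 = f(t1 - h, u0)
    f1 = f(t1, u1)
    u_pred = u1 + h * (1.5 * f1 - 0.5 * f0)      # P: Adams-Bashforth 2
    f_pred = f(t1 + h, u_pred)                   # E: slope at the prediction
    u_corr = u1 + 0.5 * h * (f1 + f_pred)        # C: one trapezoidal correction
    f_corr = f(t1 + h, u_corr)                   # E: slope stored for the next step
    return u_pred, u_corr, f_corr

def rhs(t, y):
    return y

u_pred, u_corr, f_corr = pece_step(rhs, 0.5, 0.5, 1.0, 1.5)
print("predicted:", u_pred)
print("corrected:", u_corr)
```

With the exercise data the corrected value is $2.46875$ and the predicted value is $2.375$ ; the exact value $e \approx 2.718$ lies above both.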

Exercise 7 (hard, symbolic).

Derive the explicit two-step Adams-Bashforth coefficients from interpolatory quadrature: integrate $y (t_{n + 2}) - y (t_{n + 1}) = \int_{t_{n + 1}}^{t_{n + 2}} f (t, y) d t$ with $f$ replaced by the linear interpolant of $(t_{n + 1}, f^{n + 1})$ and $(t_{n}, f^{n})$ , and recover the weights $\frac{3}{2}, - \frac{1}{2}$ .

Hint

Parametrise $t = t_{n + 1} + s h$ , $s \in [0, 1]$ ; the linear interpolant through the two slope points is $f^{n + 1} + s (f^{n + 1} - f^{n})$ . Integrate over $s \in [0, 1]$ and multiply by $h$ .

Answer

The linear interpolant of $f$ through $(t_{n}, f^{n})$ and $(t_{n + 1}, f^{n + 1})$ is, in the variable $t = t_{n + 1} + s h$ , $ℓ (s) = f^{n + 1} + s (f^{n + 1} - f^{n})$ (slope $\frac{f^{n + 1} - f^{n}}{h}$ extrapolated forward). Then $$ \int_{t_{n+1}}^{t_{n+2}} f\,dt \approx h\int_0^1 \ell(s)\,ds = h\Big[f^{n+1} + (f^{n+1}-f^n)\int_0^1 s\,ds\Big] = h\Big[f^{n+1} + \tfrac12(f^{n+1}-f^n)\Big] = h\big(\tfrac32 f^{n+1} - \tfrac12 f^n\big). $$ So $y (t_{n + 2}) - y (t_{n + 1}) \approx h (\frac{3}{2} f^{n + 1} - \frac{1}{2} f^{n})$ , which is exactly the two-step Adams-Bashforth update with weights $\frac{3}{2}, - \frac{1}{2}$ . The error term comes from the interpolation remainder $\frac{1}{2} f^{''} (ξ) s (s + 1) h^{2}$ (the nodes sit at $s = 0$ and $s = - 1$ ) integrated over $[0, 1]$ : since $\int_{0}^{1} s (s + 1) d s = \frac{5}{6}$ , this recovers $C_{3} = \frac{5}{12}$ . Rubric: full credit for the interpolant, the integral, and the two weights; partial for the error-constant link.
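The quadrature derivation extends to any step count by integrating each Lagrange basis polynomial over the new interval. A sketch in exact arithmetic follows; `adams_bashforth_weights` and its helpers are hypothetical names, and polynomials are stored as coefficient lists in the scaled variable $s = (t - t_{n + r - 1}) / h$ .

```python
from fractions import Fraction

def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists (lowest degree first)."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def integrate_01(p):
    """Integral over [0, 1] of sum_k p[k] * s^k."""
    return sum(c / (k + 1) for k, c in enumerate(p))

def adams_bashforth_weights(r):
    """Weights beta_0..beta_{r-1} multiplying f^n..f^{n+r-1}.

    Interpolation nodes in s are -(r-1), ..., -1, 0; the integral runs
    over the next interval s in [0, 1].
    """
    nodes = [Fraction(i - (r - 1)) for i in range(r)]
    weights = []
    for i, si in enumerate(nodes):
        basis = [Fraction(1)]
        for k, sk in enumerate(nodes):
            if k != i:
                # factor (s - sk) / (si - sk) of the Lagrange basis polynomial
                basis = poly_mul(basis, [-sk / (si - sk), 1 / (si - sk)])
        weights.append(integrate_01(basis))
    return weights

ab2_w = adams_bashforth_weights(2)
ab3_w = adams_bashforth_weights(3)
print("AB2 weights:", ab2_w)
print("AB3 weights:", ab3_w)
```

For $r = 2$ this returns $(- \frac{1}{2}, \frac{3}{2})$ , matching the hand computation; for $r = 3$ it returns $(\frac{5}{12}, - \frac{4}{3}, \frac{23}{12})$ .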

Exercise 8 (hard, symbolic).

Using the log condition, show that fixing $ρ (ζ) = ζ^{r} - ζ^{r - 1}$ (the Adams choice) and taking $σ$ to be the degree- $r$ Taylor truncation of $ρ (ζ) / \log ζ$ at $ζ = 1$ produces a method of order $r + 1$ (Adams-Moulton) when $σ$ has degree $r$ , and order $r$ (Adams-Bashforth) when the $ζ^{r}$ term of $σ$ is dropped ( $β_{r} = 0$ ) to make the method explicit.

Hint

Order $p$ is $ρ (ζ) / \log ζ - σ (ζ) = O ((ζ - 1)^{p})$ . With $ρ / \log ζ$ a power series in $(ζ - 1)$ , matching $σ$ to its first $r + 1$ terms kills the residual through $(ζ - 1)^{r}$ .

Answer

Write $w = ζ - 1$ . Since $\log ζ = w - \frac{w^{2}}{2} + \frac{w^{3}}{3} - \dots$ and $ρ (ζ) = ζ^{r - 1} (ζ - 1) = (1 + w)^{r - 1} w$ , the quotient $ρ (ζ) / \log ζ = (1 + w)^{r - 1} \cdot \frac{w}{\log (1 + w)}$ is analytic at $w = 0$ with value $1$ there, hence a power series $g (w) = \sum_{k \geq 0} g_{k} w^{k}$ . The order- $p$ log condition is $g (w) - σ (1 + w) = O (w^{p})$ . Taking $σ (1 + w)$ to be the truncation $\sum_{k = 0}^{r} g_{k} w^{k}$ (degree $r$ in $ζ$ , the implicit case using the newest node) makes the residual $O (w^{r + 1})$ , so $p = r + 1$ : this is the $r$ -step Adams-Moulton method. Dropping the top coefficient $g_{r}$ so that $σ$ has no $ζ^{r}$ term (the explicit case, $β_{r} = 0$ ) leaves a residual $O (w^{r})$ , so $p = r$ : this is the $r$ -step Adams-Bashforth method. The construction shows Adams-Moulton gains exactly one order over Adams-Bashforth at equal step count, at the price of implicitness. Rubric: full credit for the analytic quotient, the truncation argument, and both order counts.
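As a concrete instance of the truncation argument, here is the case $r = 2$ worked out, using the standard expansion $\frac{w}{\log (1 + w)} = 1 + \frac{w}{2} - \frac{w^{2}}{12} + O (w^{3})$ : $$ g(w) = (1 + w)\,\frac{w}{\log(1 + w)} = (1 + w)\Big(1 + \tfrac{1}{2} w - \tfrac{1}{12} w^{2} + O(w^{3})\Big) = 1 + \tfrac{3}{2} w + \tfrac{5}{12} w^{2} + O(w^{3}). $$ Truncating after the linear term and substituting $w = ζ - 1$ gives the explicit choice $$ \sigma(\zeta) = 1 + \tfrac{3}{2}(\zeta - 1) = \tfrac{3}{2}\zeta - \tfrac{1}{2}, $$ which carries the two-step Adams-Bashforth weights. Keeping the quadratic term gives the implicit choice $$ \sigma(\zeta) = 1 + \tfrac{3}{2}(\zeta - 1) + \tfrac{5}{12}(\zeta - 1)^{2} = \tfrac{5}{12}\zeta^{2} + \tfrac{2}{3}\zeta - \tfrac{1}{12}, $$ which carries the two-step Adams-Moulton weights found in Exercise 3.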

Advanced results Master

The order calculus and the $ρ / σ$ construction extend into the full structural theory of multistep methods: the systematic generation of the Adams, Nyström, and BDF families, the error-constant comparison that selects methods in practice, the variable-step and variable-order reformulations of production codes, and the obstruction that caps the order of any method one would actually use.

Theorem 1 (Adams family and its error constants). The $r$ -step Adams methods take $ρ (ζ) = ζ^{r} - ζ^{r - 1}$ and $σ$ the interpolatory quadrature weight polynomial for the integral $\int_{t_{n + r - 1}}^{t_{n + r}} f$ . The explicit Adams-Bashforth method interpolates $f$ at the $r$ nodes $t_{n}, \dots, t_{n + r - 1}$ and has order $r$ ; the implicit Adams-Moulton method interpolates additionally at $t_{n + r}$ and has order $r + 1$ . Their principal error constants are $C_{r + 1}^{AB} = γ_{r}$ and $C_{r + 2}^{AM} = γ_{r + 1}^{*}$ , the Adams coefficients generated by $\sum_{k \geq 0} γ_{k} t^{k} = \frac{- t}{(1 - t) \log (1 - t)}$ and $\sum_{k \geq 0} γ_{k}^{*} t^{k} = \frac{- t}{\log (1 - t)}$ (for two steps, $γ_{2} = \frac{5}{12}$ and $γ_{3}^{*} = - \frac{1}{24}$ ); for each step count the Moulton constant is markedly smaller in magnitude than the Bashforth constant, which is why the implicit corrector is paired with the explicit predictor ^{[Hairer-Nørsett-Wanner §III.1]}.
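The two generating functions translate into short recurrences, because multiplying either one by $- \log (1 - t) / t = 1 + \frac{t}{2} + \frac{t^{2}}{3} + \dots$ gives $\frac{1}{1 - t}$ and $1$ respectively. The sketch below computes the first few coefficients exactly; the function name `adams_coefficients` is an assumption of this illustration.

```python
from fractions import Fraction

def adams_coefficients(m_max):
    """Return (gamma, gamma_star) up to index m_max.

    Recurrences obtained by comparing coefficients of t^m:
      sum_{j=0}^{m} gamma_{m-j}      / (j + 1) = 1            for all m >= 0
      sum_{j=0}^{m} gamma_star_{m-j} / (j + 1) = 1 if m == 0 else 0
    """
    gamma, gamma_star = [], []
    for m in range(m_max + 1):
        tail = sum(gamma[m - j] / (j + 1) for j in range(1, m + 1))
        gamma.append(Fraction(1) - tail)
        tail_star = sum(gamma_star[m - j] / (j + 1) for j in range(1, m + 1))
        target = Fraction(1) if m == 0 else Fraction(0)
        gamma_star.append(target - tail_star)
    return gamma, gamma_star

gamma, gamma_star = adams_coefficients(4)
print("gamma      :", gamma)
print("gamma_star :", gamma_star)
```

The entries $γ_{2} = \frac{5}{12}$ and $γ_{3}^{*} = - \frac{1}{24}$ agree with the error constants computed elsewhere in this unit for the two-step Adams-Bashforth and two-step Adams-Moulton methods.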

Theorem 2 (BDF methods and their construction). The $r$ -step backward-differentiation formula takes $σ (ζ) = β_{r} ζ^{r}$ (a single slope term, at the newest node) and $ρ (ζ) = β_{r} \sum_{k = 1}^{r} \frac{1}{k} ζ^{r - k} (ζ - 1)^{k}$ , obtained by differentiating the interpolant of the values $u^{n}, \dots, u^{n + r}$ and matching its derivative at $t_{n + r}$ to $f^{n + r}$ . In backward-difference form the formula reads $\sum_{k = 1}^{r} \frac{1}{k} \nabla^{k} u^{n + r} = h f^{n + r}$ , and the normalisation $α_{r} = 1$ fixes $β_{r} = (\sum_{k = 1}^{r} \frac{1}{k})^{- 1}$ . BDF $r$ has order $r$ . The defining feature is that $σ$ concentrates all slope information at the new point, which makes BDF methods strongly damping and the standard choice for stiff systems; the construction loses zero-stability for $r \geq 7$ , so only BDF1 through BDF6 are usable, with BDF1 being backward Euler ^{[Hairer-Wanner §V.1]}.
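The damping claim can be seen on a small linear test. The sketch below compares BDF2 with two-step Adams-Bashforth on $y^{'} = λ y$ ; the values $λ = - 50$ and $h = 0.1$ are illustrative choices made here, not taken from the references, and because the problem is linear the implicit BDF2 equation is solved in closed form instead of by Newton iteration.

```python
import math

lam, h, n_steps = -50.0, 0.1, 20      # illustrative values, h * lam = -5

def ab2_linear(lam, h, n_steps):
    """Two-step Adams-Bashforth on y' = lam * y with exact starting values."""
    u = [1.0, math.exp(lam * h)]
    for n in range(n_steps - 1):
        u.append(u[n + 1] + h * lam * (1.5 * u[n + 1] - 0.5 * u[n]))
    return u

def bdf2_linear(lam, h, n_steps):
    """BDF2 on y' = lam * y with exact starting values.

    (3/2) u_new - 2 u_cur + (1/2) u_old = h * lam * u_new is linear in
    u_new, so it is solved directly.
    """
    u = [1.0, math.exp(lam * h)]
    for n in range(n_steps - 1):
        u.append((2.0 * u[n + 1] - 0.5 * u[n]) / (1.5 - h * lam))
    return u

ab_values = ab2_linear(lam, h, n_steps)
bdf_values = bdf2_linear(lam, h, n_steps)
print("AB2  final |u|:", abs(ab_values[-1]))
print("BDF2 final |u|:", abs(bdf_values[-1]))
```

The true solution decays to essentially zero by $t = 2$ . At this step size the BDF2 iterates decay as well, while the Adams-Bashforth iterates grow by several orders of magnitude, which is the practical meaning of an explicit method failing on a stiff problem.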

Theorem 3 (predictor-corrector schemes and local error estimation). Pairing an explicit predictor $P$ of order $p$ with an implicit corrector $C$ of order $p + 1$ in the modes PEC, PECE, or P(EC) $^{m}$ E avoids solving the implicit equation exactly: one or a fixed few corrector iterations suffice to recover the corrector's order provided $h$ is small enough that the corrector iteration contracts. The difference between predicted and corrected values, $u^{n + r} - \bar{u}^{n + r}$ , is a computable estimate of the local error proportional to $C_{p + 2} h^{p + 2}$ (Milne's device) and is the basis of step-size control in production codes ^{[LeVeque §5.9.5]}.

Theorem 4 (first Dahlquist barrier — stated, proof deferred). A zero-stable $r$ -step linear multistep method has order at most $r + 2$ if $r$ is even, $r + 1$ if $r$ is odd, and order $r + 2$ forces all roots of $ρ$ on the unit circle (the optimal methods, with poor damping). The maximal useful order for an explicit zero-stable method is $r$ , and for an A-stable method is $2$ (the second Dahlquist barrier). The barrier is the precise expression of the tension flagged above: order conditions can always be satisfied, but only at the cost of pushing the roots of $ρ$ where the root condition forbids. The root condition itself, zero-stability, and the equivalence theorem that converts it into convergence are developed in 43.10.03, where this barrier is proved ^{[Hairer-Nørsett-Wanner §III.3]}.

Theorem 5 (variable-step and Nordsieck representation). Production multistep codes vary both step size and order. Re-deriving the coefficients at each step for a new step-size ratio is captured compactly by the Nordsieck vector, which carries scaled derivatives $(u^{n}, h u^{' n}, \frac{h ^{2}}{2} u^{'' n}, \dots)$ rather than past values, so a step-size change is a diagonal rescaling and an order change is a truncation or extension of the vector. The Adams-Moulton corrector in Nordsieck form is the algorithm underlying the classical Gear and Adams codes (LSODE, VODE, DASSL), making variable-order Adams and BDF the default stiff and nonstiff integrators in scientific computing ^{[Hairer-Nørsett-Wanner §III.5]}.

Synthesis. The order conditions on $(ρ, σ)$ are the foundational reason the entire zoo of multistep methods is one family with one accuracy calculus: every Adams, Nyström, Milne, and BDF scheme is a choice of $ρ$ by a structural principle — reuse the last value difference, or differentiate past values — followed by the unique $σ$ the log condition forces for the target order, with the principal error constant $C_{p + 1}$ ranking methods of equal order. This is exactly the situation met for one-step methods in 43.10.01, where order came from Taylor matching of the increment function; the central insight is that consistency is a statement about $ρ$ and $σ$ near $ζ = 1$ , and it generalises into the chapter's organising dichotomy: accuracy lives at $ζ = 1$ while stability lives on the rest of the unit circle. Putting these together, the Adams and BDF constructions are dual responses to one constraint — Adams puts all freedom into $σ$ for nonstiff accuracy, BDF collapses $σ$ to a single term for stiff damping — and the bridge to the next unit is the recognition that the freedom the order conditions leave in $ρ$ is precisely what the root condition will spend, so the first Dahlquist barrier is the exact accounting of how much order a stable $ρ$ can carry. The companion-matrix propagation that converts these polynomials into a convergence statement is what 43.10.03 supplies, completing the consistency-plus-stability template this unit's order theory is one half of.

Full proof set Master

Proposition 1 (consistency conditions via $ρ, σ$ ). An $r$ -step LMM is consistent (order $\geq 1$ ) if and only if $ρ (1) = 0$ and $ρ^{'} (1) = σ (1)$ .

Proof. By the Taylor expansion of the difference operator (Key theorem), order $\geq 1$ is $C_{0} = C_{1} = 0$ . The constant $C_{0} = \sum_{j} α_{j} = ρ (1)$ , so $C_{0} = 0 ⟺ ρ (1) = 0$ . The constant $C_{1} = \sum_{j} α_{j} j - \sum_{j} β_{j}$ . Since $ρ^{'} (ζ) = \sum_{j} α_{j} j ζ^{j - 1}$ , evaluation at $ζ = 1$ gives $ρ^{'} (1) = \sum_{j} α_{j} j$ , while $σ (1) = \sum_{j} β_{j}$ . Thus $C_{1} = ρ^{'} (1) - σ (1)$ , and $C_{1} = 0 ⟺ ρ^{'} (1) = σ (1)$ . $□$

Proposition 2 (the error constant is independent of the trajectory). If a method has order $p$ with principal error constant $C_{p + 1}$ , then for any $C^{p + 1}$ exact solution $y$ the local truncation error is $τ^{n} = C_{p + 1} h^{p} y^{(p + 1)} (t_{n}) + O (h^{p + 1})$ .

Proof. The difference operator expands as $L [y; h] (t) = \sum_{q \geq 0} C_{q} h^{q} y^{(q)} (t)$ . Order $p$ means $C_{0} = \dots = C_{p} = 0$ , so the leading surviving term is $C_{p + 1} h^{p + 1} y^{(p + 1)} (t)$ . By definition $τ^{n} = \frac{1}{h} L [y; h] (t_{n}) = C_{p + 1} h^{p} y^{(p + 1)} (t_{n}) + O (h^{p + 1})$ , the higher terms collecting $C_{p + 2} h^{p + 1} y^{(p + 2)}$ and beyond. The coefficient $C_{p + 1}$ depends only on $(α_{j}, β_{j})$ , not on $y$ . $□$

Proposition 3 (Adams-Moulton two-step is order three with $C_{4} = - \frac{1}{24}$ ). The method $u^{n + 1} = u^{n} + h (\frac{5}{12} f^{n + 1} + \frac{2}{3} f^{n} - \frac{1}{12} f^{n - 1})$ has order $3$ , and its principal error constant is $C_{4} = - \frac{1}{24}$ .

Proof. Index the window $j \in \{0, 1, 2\}$ (nodes $t_{n - 1}, t_{n}, t_{n + 1}$ ) with $α = (0, - 1, 1)$ and $β = (- \frac{1}{12}, \frac{2}{3}, \frac{5}{12})$ . Compute the error constants from $C_{q} = \frac{1}{q!} \sum_{j} α_{j} j^{q} - \frac{1}{(q - 1)!} \sum_{j} β_{j} j^{q - 1}$ : $C_{0} = \sum α_{j} = 0$ ; $C_{1} = (- 1 + 2) - (- \frac{1}{12} + \frac{2}{3} + \frac{5}{12}) = 1 - 1 = 0$ ; $C_{2} = \frac{1}{2} (- 1 + 4) - (\frac{2}{3} + 2 \cdot \frac{5}{12}) = \frac{3}{2} - (\frac{2}{3} + \frac{5}{6}) = \frac{3}{2} - \frac{3}{2} = 0$ ; $C_{3} = \frac{1}{6} (- 1 + 8) - \frac{1}{2} (\frac{2}{3} + 4 \cdot \frac{5}{12}) = \frac{7}{6} - \frac{1}{2} (\frac{2}{3} + \frac{5}{3}) = \frac{7}{6} - \frac{1}{2} \cdot \frac{7}{3} = \frac{7}{6} - \frac{7}{6} = 0$ . So order $\geq 3$ . Then $C_{4} = \frac{1}{24} (- 1 + 16) - \frac{1}{6} (\frac{2}{3} + 8 \cdot \frac{5}{12}) = \frac{15}{24} - \frac{1}{6} (\frac{2}{3} + \frac{10}{3}) = \frac{5}{8} - \frac{1}{6} \cdot 4 = \frac{5}{8} - \frac{2}{3} = \frac{15 - 16}{24} = - \frac{1}{24} \neq 0$ . Hence the order is exactly $3$ with $C_{4} = - \frac{1}{24}$ . The magnitude $\frac{1}{24}$ is smaller than the two-step Adams-Bashforth constant $\frac{5}{12}$ at the same step count, the quantitative form of the predictor-corrector advantage. $□$

Proposition 4 (BDF2 is zero-stable; the construction fails beyond six steps). The two-step BDF method $\frac{3}{2} u^{n + 1} - 2 u^{n} + \frac{1}{2} u^{n - 1} = h f^{n + 1}$ has $ρ (ζ) = \frac{3}{2} ζ^{2} - 2 ζ + \frac{1}{2}$ with roots $ζ = 1$ and $ζ = \frac{1}{3}$ , both in the closed unit disc with the boundary root simple; it is order $2$ and zero-stable. For general $r$ , the BDF construction is zero-stable only for $r \leq 6$ .

Proof. Factor $ρ (ζ) = \frac{3}{2} (ζ^{2} - \frac{4}{3} ζ + \frac{1}{3}) = \frac{3}{2} (ζ - 1) (ζ - \frac{1}{3})$ , exhibiting the roots $1$ and $\frac{1}{3}$ . The principal root $ζ = 1$ is simple and on the unit circle; the second root $\frac{1}{3}$ has modulus $< 1$ . The root condition (all roots of $ρ$ satisfy $|ζ| \leq 1$ , and those on $|ζ| = 1$ are simple) therefore holds, so the method is zero-stable 43.10.03. For order: $σ (ζ) = ζ^{2}$ , so $σ (1) = 1$ ; $ρ^{'} (ζ) = 3 ζ - 2$ gives $ρ^{'} (1) = 1 = σ (1)$ , and $ρ (1) = 0$ , so the method is consistent ( $C_{0} = C_{1} = 0$ ). For the higher constants, take $α = (\frac{1}{2}, - 2, \frac{3}{2})$ on $j \in \{0, 1, 2\}$ and $β = (0, 0, 1)$ : $C_{2} = \frac{1}{2} (- 2 + 4 \cdot \frac{3}{2}) - (2 \cdot 1) = \frac{1}{2} (4) - 2 = 0$ and $C_{3} = \frac{1}{6} (- 2 + 8 \cdot \frac{3}{2}) - \frac{1}{2} (4 \cdot 1) = \frac{1}{6} (10) - 2 = \frac{5}{3} - 2 = - \frac{1}{3} \neq 0$ , so the order is exactly $2$ . The general BDF $r$ first polynomial is $ρ (ζ) = \sum_{k = 1}^{r} \frac{1}{k} ζ^{r - k} (ζ - 1)^{k}$ , and a root-locus computation shows a root crosses outside the unit circle once $r \geq 7$ , so the root condition fails and BDF $r$ is not zero-stable beyond six steps; the proof of that crossing is the barrier analysis of 43.10.03. $□$
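The root condition for BDF $r$ can also be tested mechanically. The sketch below is an assumption-laden illustration rather than the root-locus computation cited in the proof: it builds $ρ$ from the formula above in exact rational arithmetic, divides out the principal root $ζ = 1$ , and applies the Schur-Cohn recursion (a swapped-in technique) to decide whether every remaining root lies strictly inside the unit disc. All function names are hypothetical.

```python
from fractions import Fraction
from math import comb


def bdf_rho(r):
    """Ascending coefficients of rho(z) = sum_{k=1}^r (1/k) z^(r-k) (z-1)^k."""
    c = [Fraction(0)] * (r + 1)
    for k in range(1, r + 1):
        for i in range(k + 1):
            # (z-1)^k contains comb(k,i) (-1)^(k-i) z^i; shift by z^(r-k)
            c[r - k + i] += Fraction(comb(k, i) * (-1) ** (k - i), k)
    return c


def deflate_at_one(c):
    """Quotient of the polynomial by (z - 1), assuming it vanishes at z = 1."""
    n = len(c) - 1
    q = [Fraction(0)] * n
    carry = Fraction(0)
    for i in range(n, 0, -1):      # synthetic division, highest power first
        carry += c[i]
        q[i - 1] = carry
    return q


def schur_cohn_inside(c):
    """True if all roots of the real polynomial c lie in |z| < 1."""
    c = list(c)
    while len(c) > 1:
        a0, an = c[0], c[-1]
        if abs(a0) >= abs(an):
            return False
        reduced = [an * x - a0 * y for x, y in zip(c, c[::-1])]
        c = reduced[1:]            # constant term is zero; drop it (divide by z)
    return True


def bdf_zero_stable(r):
    """Root condition for BDF r: simple root at 1, all others strictly inside."""
    return schur_cohn_inside(deflate_at_one(bdf_rho(r)))


if __name__ == "__main__":
    for r in range(1, 8):
        print(r, bdf_zero_stable(r))
```

Working with `Fraction` keeps the test free of rounding questions; the price is that it only answers yes or no and does not report the root moduli themselves.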

Connections Master

The one-step methods of 43.10.01 supply the local-truncation-error and order framework this unit specialises: the difference operator $L$ and the order constants $C_{q}$ are the multistep generalisation of the increment-function Taylor matching used there, and the forward-Euler starter that bootstraps a multistep method's missing initial values is one of its instances. The clean discrete-Gronwall convergence bound of that unit, however, no longer holds automatically here — that is the gap the next unit closes.
The zero-stability and Dahlquist equivalence theorem of 43.10.03 is the direct continuation: it converts the same first characteristic polynomial $ρ$ used here for accuracy into the carrier of stability through the root condition on its zeros, and proves that for a consistent LMM convergence is equivalent to zero-stability, with the first Dahlquist barrier (stated here) limiting the order a zero-stable $ρ$ can attain. The order constants $C_{p + 1}$ this unit computes are the accuracy half of the consistency-plus-stability template that unit completes.
The absolute-stability theory of 43.10.04 studies the combined polynomial $ρ (ζ) - z σ (ζ)$ on the linear test equation $y^{'} = λ y$ , $z = hλ$ : the region where all its roots lie in the unit disc is the multistep region of absolute stability, the direct analogue of the one-step stability function $R (z)$ . The Adams and BDF $(ρ, σ)$ pairs built here are exactly the inputs to that boundary-locus construction.
The stiffness and A-stability theory of 43.10.05 explains why the BDF family constructed here, with $σ$ collapsed to a single newest-node term, is the standard stiff integrator: its strong damping comes from that structural choice, and the second Dahlquist barrier limits A-stable LMMs to order $2$ , with the trapezoidal rule — the one-step Adams-Moulton member — as the optimal case. BDF1 through BDF6 are the practically usable stiff multistep methods.
The interpolatory-quadrature derivation of the Adams coefficients rests on the polynomial-interpolation and Newton-Gregory machinery whose root unit is the corpus's interpolation theory; the multivariable Taylor expansion of 02.05.05 is the engine of every order condition, and the continuous IVP existence guaranteed by Picard-Lindelöf 02.12.01 is the exact solution the local truncation error is measured against.
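The root test on $ρ (ζ) - z σ (ζ)$ mentioned in the connection to 43.10.04 can be sketched for two-step methods, where the polynomial is a quadratic. This is a minimal illustration under stated assumptions: the function names are hypothetical, the BDF2 pair is the one from Proposition 4, the two-step Adams-Bashforth pair uses the standard coefficients $β = (- \frac{1}{2}, \frac{3}{2}, 0)$ , and the leading coefficient of the quadratic is assumed nonzero at the chosen $z$ .

```python
import cmath


def stability_roots(rho, sigma, z):
    """Roots of rho(zeta) - z*sigma(zeta) for a two-step method.

    rho and sigma are ascending coefficient triples (index j = 0, 1, 2).
    Assumes the zeta^2 coefficient is nonzero for this z.
    """
    c, b, a = (r - z * s for r, s in zip(rho, sigma))
    disc = cmath.sqrt(b * b - 4 * a * c)
    return ((-b + disc) / (2 * a), (-b - disc) / (2 * a))


def absolutely_stable(rho, sigma, z):
    """True if both roots lie strictly inside the unit disc at this z = h*lambda."""
    return all(abs(root) < 1 for root in stability_roots(rho, sigma, z))


# (rho, sigma) pairs, ascending coefficients.
AB2 = ((0.0, -1.0, 1.0), (-0.5, 1.5, 0.0))   # assumed standard AB2
BDF2 = ((0.5, -2.0, 1.5), (0.0, 0.0, 1.0))   # from Proposition 4

if __name__ == "__main__":
    for z in (-0.5, -1.5, -100.0):
        print(z, absolutely_stable(*AB2, z), absolutely_stable(*BDF2, z))
```

Sampling a few values of $z$ on the negative real axis already shows the contrast between the explicit Adams pair and the BDF pair that the stiffness discussion relies on.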

Historical & philosophical context Master

The Adams methods originate with John Couch Adams, who devised the backward-difference quadrature scheme around 1855 to compute the drop profiles needed for Francis Bashforth's tests of the theory of capillary action; the work was published jointly as Bashforth and Adams in 1883 ^{[Bashforth-Adams 1883]}, giving the explicit family the name Adams-Bashforth. The implicit companion was introduced by Forest Ray Moulton (1926) ^{[Moulton 1926]} in his treatise on the numerical computation of ballistic trajectories, where he paired the explicit predictor with the implicit corrector, the predictor-corrector idea in its modern form.

The unifying treatment through the characteristic polynomials $ρ$ and $σ$ and the systematic order theory is due to Germund Dahlquist, whose 1956 paper ^{[Dahlquist 1956]} established consistency, the root condition, the equivalence of convergence with zero-stability for the general linear multistep method, and the first of the two barriers that bear his name; his 1963 paper proved the second barrier, which caps the order of A-stable methods at $2$ . The difference-operator calculus that makes the order conditions a routine computation was organised by Peter Henrici in his 1962 monograph. The backward-differentiation formulas were brought to prominence by C. William Gear (1971) ^{[Gear 1971]}, whose stiff-equation solver made BDF the default method for the stiff systems pervasive in chemical kinetics and circuit simulation; the variable-order Nordsieck implementation underlying the LSODE and VODE codes descends from this line.

Bibliography Master

@book{leveque2007fdm,
  author    = {LeVeque, Randall J.},
  title     = {Finite Difference Methods for Ordinary and Partial Differential Equations: Steady-State and Time-Dependent Problems},
  publisher = {Society for Industrial and Applied Mathematics (SIAM)},
  year      = {2007}
}

@book{sulimayers2003,
  author    = {S\"{u}li, Endre and Mayers, David F.},
  title     = {An Introduction to Numerical Analysis},
  publisher = {Cambridge University Press},
  year      = {2003}
}

@book{hnw1993,
  author    = {Hairer, Ernst and N\o{}rsett, Syvert P. and Wanner, Gerhard},
  title     = {Solving Ordinary Differential Equations I: Nonstiff Problems},
  edition   = {2},
  series    = {Springer Series in Computational Mathematics},
  volume    = {8},
  publisher = {Springer-Verlag},
  year      = {1993}
}

@book{hairerwanner1996,
  author    = {Hairer, Ernst and Wanner, Gerhard},
  title     = {Solving Ordinary Differential Equations II: Stiff and Differential-Algebraic Problems},
  edition   = {2},
  series    = {Springer Series in Computational Mathematics},
  volume    = {14},
  publisher = {Springer-Verlag},
  year      = {1996}
}

@book{henrici1962,
  author    = {Henrici, Peter},
  title     = {Discrete Variable Methods in Ordinary Differential Equations},
  publisher = {John Wiley \& Sons},
  year      = {1962}
}

@book{bashforth1883,
  author    = {Bashforth, Francis and Adams, John Couch},
  title     = {An Attempt to Test the Theories of Capillary Action by Comparing the Theoretical and Measured Forms of Drops of Fluid},
  publisher = {Cambridge University Press},
  year      = {1883}
}

@book{moulton1926,
  author    = {Moulton, Forest Ray},
  title     = {New Methods in Exterior Ballistics},
  publisher = {University of Chicago Press},
  year      = {1926}
}

@article{dahlquist1956,
  author  = {Dahlquist, Germund},
  title   = {Convergence and stability in the numerical integration of ordinary differential equations},
  journal = {Mathematica Scandinavica},
  volume  = {4},
  year    = {1956},
  pages   = {33--53}
}

@article{dahlquist1963,
  author  = {Dahlquist, Germund},
  title   = {A special stability problem for linear multistep methods},
  journal = {BIT Numerical Mathematics},
  volume  = {3},
  number  = {1},
  year    = {1963},
  pages   = {27--43}
}

@book{gear1971,
  author    = {Gear, C. William},
  title     = {Numerical Initial Value Problems in Ordinary Differential Equations},
  publisher = {Prentice-Hall},
  year      = {1971}
}

Prerequisites

43.10.01
02.05.05
02.12.01

Tier anchors

beginner: LeVeque 2007 *Finite Difference Methods for Ordinary and Partial Differential Equations* (SIAM) §5.9 (the leapfrog/Adams picture and the idea of reusing past slopes); Süli-Mayers 2003 *An Introduction to Numerical Analysis* (Cambridge) §12.6 (linear multistep methods introduced from the integral form)
intermediate: Süli-Mayers 2003 *An Introduction to Numerical Analysis* (Cambridge) §12.6-12.8 (linear multistep methods, the characteristic polynomials, consistency and order conditions); LeVeque 2007 (SIAM) §5.9-5.10 (the $\rho,\sigma$ polynomials, Adams-Bashforth/Moulton, BDF, predictor-corrector)
master: Hairer-Nørsett-Wanner 1993 *Solving Ordinary Differential Equations I: Nonstiff Problems* 2e (Springer) §III.1-III.2 (the general LMM, the error constants $C_p$, Adams and Nyström families, the order via the residual operator); Hairer-Wanner 1996 *Solving Ordinary Differential Equations II: Stiff and Differential-Algebraic Problems* 2e (Springer) §V.1 (BDF methods and their construction); Henrici 1962 *Discrete Variable Methods in Ordinary Differential Equations* (Wiley) Ch. 5 (the difference-operator calculus and the order conditions)

References

LeVeque, R. J. — Finite Difference Methods for Ordinary and Partial Differential Equations: Steady-State and Time-Dependent Problems · SIAM 2007. §5.9 introduces the general linear multistep method $\sum_{j=0}^r \alpha_j u^{n+j} = h\sum_{j=0}^r \beta_j f^{n+j}$, defines the first and second characteristic polynomials $\rho(\zeta)=\sum_j \alpha_j \zeta^j$ and $\sigma(\zeta)=\sum_j \beta_j \zeta^j$, and derives the local truncation error by inserting the exact solution; the order conditions are read off as $\rho(1)=0$, $\rho'(1)=\sigma(1)$ for consistency and the higher matching $\sum_j (\alpha_j j^q - q\beta_j j^{q-1})=0$ for orders through $p$. §5.9.2-5.9.3 construct the Adams-Bashforth (explicit) and Adams-Moulton (implicit) families by interpolating the integrand of $y(t_{n+1})-y(t_n)=\int f$, and §5.9.4 the backward-differentiation formulas by differentiating an interpolant of the unknown; predictor-corrector pairs and the starting-value problem are in §5.9.5.
Süli, E. & Mayers, D. F. — An Introduction to Numerical Analysis · Cambridge University Press 2003. Chapter 12 §12.6 defines the $r$-step linear multistep method and the polynomials $\rho,\sigma$, derives the consistency conditions $\rho(1)=0$ and $\rho'(1)=\sigma(1)$ from the requirement that the truncation error vanish to first order, and develops the order-$p$ residual conditions via the linear difference operator $\mathcal{L}[y;h]=\sum_j (\alpha_j y(t+jh)-h\beta_j y'(t+jh))$ and its error constants $C_q$; §12.7 constructs Adams-Bashforth and Adams-Moulton schemes by Newton-Gregory backward-difference interpolation of $f$; §12.8 treats the root condition and zero-stability as the companion convergence theory.
Hairer, E., Nørsett, S. P. & Wanner, G. — Solving Ordinary Differential Equations I: Nonstiff Problems · Springer, 2nd revised edition 1993. §III.1 sets up the general linear multistep formula and the generating polynomials $\rho(\zeta),\sigma(\zeta)$, proves the order characterisation $\rho(e^h)-h\sigma(e^h)=C_{p+1}h^{p+1}+O(h^{p+2})$ (the log-condition $\rho(\zeta)/\log\zeta - \sigma(\zeta)=O((\zeta-1)^{p})$ near $\zeta=1$), and collects the error constants $C_p$; §III.2 derives the Adams methods (explicit Bashforth, implicit Moulton) from the integral identity by interpolatory quadrature on backward differences, the Nyström and Milne-Simpson families, and the BDF methods, tabulating their coefficients and error constants and noting the loss of zero-stability of BDF beyond six steps.

Estimated time

beginner: 20m
intermediate: 50m
master: 90m