43.02.01 · numerical-analysis / nonlinear-equations

The Bisection Method and the Scalar Root-Finding Problem

shipped3 tiersLean: none

Anchor (Master): Süli & Mayers 2003 *An Introduction to Numerical Analysis* (Cambridge) Ch. 1 (root-finding, the convergence-order framework opening the nonlinear-equations chapter); Trefethen & Bau 1997 *Numerical Linear Algebra* (SIAM) Lecture 13-15 (conditioning and the precision-limited accuracy floor); Brent 1973 *Algorithms for Minimization without Derivatives* (Prentice-Hall) Ch. 3-4 (the bracketing/interpolation hybrid that makes bisection a practical safeguard)

Intuition Beginner

Suppose a quantity depends on some input — the height of a thrown ball at time $x$ , the profit of a business at price $x$ — and you want to know where that quantity hits zero. Crossing zero is where the ball lands or where you break even. Writing the quantity as $f (x)$ , you are asking for the input $x$ where $f (x) = 0$ . That input is called a root, and the search for it is the root-finding problem. Most equations you meet in practice cannot be solved by a tidy formula, so you need a procedure that hunts the root down by getting closer and closer.

Here is the one fact that makes the hunt reliable. If $f$ is a continuous quantity — no sudden jumps — and it is negative at one input and positive at another, then somewhere between those two inputs it must pass through zero. You cannot get from below the ground to above the ground without touching the ground. So if you can find one place where $f$ is negative and another where $f$ is positive, you have trapped a root between them. That trapping interval is called a bracket.

The bisection method is the most patient way to shrink a bracket. Look at the midpoint of your interval and check the sign of $f$ there. The midpoint splits the interval into two halves, and the root has to live in whichever half still shows a sign change. Keep that half, throw the other away, and repeat. Every step cuts the interval in half, so the trap closes in on the root as tightly as you like.

This unit is where the whole chapter on solving equations begins. Bisection is the method that always works once you have a bracket, and the price for that guarantee is that it is slow. Later methods in the chapter trade some of that guarantee for far greater speed, and the way we measure speed — how fast the error shrinks per step — is set up here.

Visual Beginner

Picture a continuous curve that starts below the horizontal axis on the left and ends above it on the right. Because it has no breaks, it must cross the axis somewhere in between, and that crossing point is the root we want. Bisection finds it by repeatedly checking the middle.

The picture below shows the interval $[1, 2]$ for a curve that is negative at $1$ and positive at $2$ . Each row halves the bracket by testing the midpoint and keeping the half where the sign still changes.

   f(1) < 0                                  f(2) > 0
     |---------------------------------------------|
     1                    1.5                       2      midpoint 1.5: f(1.5) > 0  -> root in [1, 1.5]
     |--------------------|
     1         1.25       1.5                              midpoint 1.25: f(1.25) < 0 -> root in [1.25, 1.5]
               |----------|
               1.25  1.375 1.5                             midpoint 1.375: keep the half with the sign change
               |-----|
   each row is half as wide as the row above it

step	bracket	width	midpoint tested
start	$[1, 2]$	$1$	—
1	$[1, 1.5]$	$0.5$	$1.5$
2	$[1.25, 1.5]$	$0.25$	$1.25$
3	$[1.25, 1.375]$	$0.125$	$1.375$

Each row of the table is exactly half as wide as the row above it, and the crossing point stays trapped inside every bracket. After ten steps the width is about one thousandth of the start; after twenty, about one millionth.

Worked example Beginner

Let us find a root of $f (x) = x^{2} - 2$ by bisection. A root of this is a number whose square is $2$ , that is, the square root of $2$ , which is about $1.414$ . We will pretend we do not know that and let the method discover it.

Step 1. Find a starting bracket. Check the endpoints $1$ and $2$ . At $x = 1$ , $f (1) = 1 - 2 = - 1$ , which is negative. At $x = 2$ , $f (2) = 4 - 2 = 2$ , which is positive. The sign changes, so a root is trapped in $[1, 2]$ .

Step 2. Test the midpoint of $[1, 2]$ , which is $1.5$ . Here $f (1.5) = 2.25 - 2 = 0.25$ , positive. The root sits between the negative end $1$ and the positive value at $1.5$ , so the new bracket is $[1, 1.5]$ .

Step 3. Test the midpoint of $[1, 1.5]$ , which is $1.25$ . Here $f (1.25) = 1.5625 - 2 = - 0.4375$ , negative. The sign change is now between $1.25$ and $1.5$ , so the new bracket is $[1.25, 1.5]$ .

Step 4. Test the midpoint of $[1.25, 1.5]$ , which is $1.375$ . Here $f (1.375) = 1.890625 - 2 = - 0.109375$ , negative. The new bracket is $[1.375, 1.5]$ .

Step 5. Read off the current estimate. After three halvings the bracket $[1.375, 1.5]$ has width $0.125$ , and its midpoint $1.4375$ is our estimate. The true root $1.41421 \dots$ lies inside the bracket, and our estimate is off by less than half the width, under $0.0625$ .

What this tells us: starting from nothing but two sign-checks, three halvings already pin the square root of $2$ to within about one part in twenty, and every further step doubles the precision. The method needs no formula for the root, only the ability to check the sign of $f$ .

Check your understanding Beginner

Exercise (easy, multiple-choice).

The bisection method is best described as:

A formula that returns the root in one step.
A guaranteed but slow method that halves a sign-change bracket each step.
A method that needs the derivative of $f$ at every step.
A method that only works for polynomials.

Hint

Think about what bisection uses (only the sign of $f$ ) and how the bracket changes each step.

Answer

B. Bisection halves a bracketing interval each step and is guaranteed to converge once a sign change is found, but it is slow compared to derivative-using methods. Feedback-correct: it uses only the sign of $f$ at the midpoint, never a formula or a derivative, and works for any continuous $f$ . Feedback-wrong: A and C describe other methods; D is wrong because continuity, not polynomial form, is all that is required.

Formal definition Intermediate+

Let $f : [a, b] \to R$ be continuous with $f (a) f (b) < 0$ . The scalar root-finding problem is to determine $x^{*} \in (a, b)$ with $f (x^{*}) = 0$ . Existence of such a root is supplied by the intermediate value theorem (the image of a connected set under a continuous map is connected, specialised to a real interval 02.01.02; equivalently the completeness of $R$ used in 02.02.01): since $0$ lies strictly between $f (a)$ and $f (b)$ , there is a point at which $f$ attains it. Uniqueness is not assumed — the interval may bracket several roots; the method below returns one of them.

The bisection method generates a nested sequence of brackets $[a_{0}, b_{0}] \supseteq [a_{1}, b_{1}] \supseteq \dots$ with $[a_{0}, b_{0}] = [a, b]$ . At step $n$ , set the midpoint $c_{n} = \frac{1}{2} (a_{n} + b_{n})$ and inspect the sign of $f (c_{n})$ :

[a_{n + 1}, b_{n + 1}] = {[a_{n}, c_{n}], [c_{n}, b_{n}], if f (a_{n}) f (c_{n}) \leq 0, if f (a_{n}) f (c_{n}) > 0.

In the boundary case $f (c_{n}) = 0$ the midpoint is itself a root and the process terminates. Each step preserves the invariant $f (a_{n}) f (b_{n}) \leq 0$ , so every bracket contains a root, and the width satisfies $b_{n} - a_{n} = (b - a) / 2^{n}$ . The bisection estimate after $n$ steps is the midpoint $x_{n} = c_{n} = \frac{1}{2} (a_{n} + b_{n})$ ; it requires exactly one evaluation of $f$ per step. A stopping criterion halts the iteration once a target is met: a width test $b_{n} - a_{n} < δ$ , a residual test $∣ f (x_{n}) ∣ < τ$ , or a cap on the iteration count. The symbols $x^{*}$ for the exact root, $e_{n} = x_{n} - x^{*}$ for the error, and $[a_{n}, b_{n}]$ for the bracket are recorded in _meta/NOTATION.md.

A method generating estimates $x_{n} \to x^{*}$ is said to converge with order $p \geq 1$ if there is a constant $C$ (with $C < 1$ when $p = 1$ ) such that, for all large $n$ , $$ |x_{n+1} - x^| \le C,|x_n - x^|^{p}. $$ The case $p = 1$ is linear convergence with rate (or contraction factor) $C$ ; $p = 2$ is quadratic; $1 < p < 2$ is superlinear. This order framework is the yardstick the rest of chapter 43.02 uses to compare bisection against fixed-point iteration 43.02.02 and Newton's method 43.02.03.

Counterexamples to common slips Intermediate+

"A sign change at the endpoints means exactly one root." It guarantees an odd number of roots in the bracket (counting multiplicity for crossings), not a unique one. Bisection converges to one root of the bracket; which one depends on the sign pattern at the midpoints, and a different starting bracket may capture a different root.
"Same sign at the endpoints means no root." A continuous $f$ can have a root — even several — inside $[a, b]$ while $f (a)$ and $f (b)$ share a sign, as with $f (x) = (x - 1.5)^{2} - 0.01$ on $[1, 2]$ , which has two roots yet $f (1), f (2) > 0$ . The sign-change test is sufficient for a bracket, not necessary for a root; an even number of crossings is invisible to it.
"The error per step is the midpoint residual $∣ f (x_{n}) ∣$ ." The quantity that halves deterministically is the bracket width, hence the bound $∣ e_{n} ∣ \leq (b - a) / 2^{n + 1}$ on the position error. The residual $∣ f (x_{n}) ∣$ can be large while $x_{n}$ is close to a flat root, or small while $x_{n}$ is far from a steep one; residual and position error are linked only through the local slope.
"Bisection has a convergence rate $|g'(x^)| $l ik e f i x e d - p o in t i t er a t i o n ." * B i sec t i o ni s n o t a f i x e d - p o in t i t er a t i o n o f a s m oo t hma p; i t ser r or n ee d n o t d ecr e a se m o n o t o ni c a l l y, o n l y i t s * b o u n d * d oes . I t co n v er g es l in e a r l y w i t h t h e b o u n d^{'} sco n t r a c t i o n f a c t or$ 1/2 $, a p r o p er t y o f t h e ha l v in g, n o t o f an y d er i v a t i v eo f$ f$.

Key theorem with proof Intermediate+

The signature result is the guaranteed convergence of bisection together with its explicit a-priori error bound, because it is the single statement that both certifies the method always works and quantifies exactly how many steps a prescribed accuracy costs.

Theorem (bisection convergence with a-priori bound). Let $f : [a, b] \to R$ be continuous with $f (a) f (b) < 0$ . The bisection iterates $x_{n} = \frac{1}{2} (a_{n} + b_{n})$ converge to a root $x^ \in [a, b] $o f$ f $, an df or e v er y$ n \ge 0$* $$ |x_n - x^| \le \frac{b - a}{2^{,n+1}}. $$ Consequently, to guarantee $|x_n - x^| \le \varepsilon$ it suffices to take $$ n \ge \log_2!\frac{b - a}{\varepsilon} - 1. $$

Proof. By construction each step keeps the half on which $f$ changes sign, so the invariant $f (a_{n}) f (b_{n}) \leq 0$ holds for all $n$ , and $b_{n} - a_{n} = (b - a) / 2^{n}$ . The left endpoints $a_{n}$ are nondecreasing and bounded above by $b$ , and the right endpoints $b_{n}$ are nonincreasing and bounded below by $a$ ; by the monotone convergence of bounded real sequences (a consequence of the completeness of $R$ used in the metric-space and real-number foundations 02.01.05, 02.02.01) both converge, and since $b_{n} - a_{n} \to 0$ they share a common limit $x^{*} \in [a, b]$ . Continuity of $f$ gives $f (a_{n}) \to f (x^{*})$ and $f (b_{n}) \to f (x^{*})$ , so $f (x^{*})^{2} = lim f (a_{n}) f (b_{n}) \leq 0$ , forcing $f (x^{*}) = 0$ . Thus $x^{*}$ is a root.

The midpoint $x_{n} = \frac{1}{2} (a_{n} + b_{n})$ lies at the centre of $[a_{n}, b_{n}]$ , and $x^{*} \in [a_{n}, b_{n}]$ , so the distance from $x_{n}$ to $x^{*}$ is at most half the width: $$ |x_n - x^| \le \frac{1}{2}(b_n - a_n) = \frac{1}{2}\cdot\frac{b - a}{2^{n}} = \frac{b - a}{2^{n+1}}. $$ For the iteration count, $|x_n - x^| \le \varepsilon $i ssec u r e d o n ce$ (b - a)/2^{n+1} \le \varepsilon $, t ha t i s$ 2^{n+1} \ge (b - a)/\varepsilon $; t ak in g$ \log_2 $g i v es$ n + 1 \ge \log_2\big((b-a)/\varepsilon\big) $, i . e .$ n \ge \log_2\big((b-a)/\varepsilon\big) - 1 $.$ \square$

Bridge. This theorem is the foundational reason bisection is the safe default for scalar root-finding: the error bound depends on nothing but the starting width and the step count, so convergence is certified before a single evaluation is made, and it builds toward the comparative study of speed that organises the chapter. The bound $∣ x_{n} - x^{*} ∣ \leq (b - a) / 2^{n + 1}$ is exactly the order-of-convergence definition with $p = 1$ and contraction factor $C = 1/2$ , which is the central insight that lets bisection be ranked on the same scale as every other method here: this is exactly linear convergence, and the gain of one correct binary digit per step is what later methods accelerate. The halving generalises to any deterministic interval-contraction scheme, and it is dual to the fixed-point picture of 43.02.02, where convergence is governed instead by a derivative at the root. Putting these together, the a-priori bound certifies that bisection converges and how slowly, and this same convergence-order language appears again in 43.02.02 for fixed-point iteration with rate $∣ g^{'} (x^{*}) ∣$ and in 43.02.03 for Newton's quadratic order, where the contrast with bisection's guaranteed-but-linear behaviour is the organising comparison.

Exercises Intermediate+

Exercise 4 (medium, symbolic).

Prove that the bisection error bound is sharp in the sense that there exist $f$ and a starting bracket for which $∣ x_{0} - x^{*} ∣ = (b - a) /2$ exactly, so the constant $1/2$ in $(b - a) / 2^{n + 1}$ at $n = 0$ cannot be improved.

Hint

Place the root at an endpoint-adjacent extreme of the first bracket so that the first midpoint is as far from the root as the bound allows.

Answer

Take $f$ with its only root in $[a, b]$ located at $x^{*} = a$ approached from a one-sided sign change — for instance let the root sit infinitesimally inside, $x^{*} = a + η$ with $η \to 0^{+}$ , while $f (a) < 0 < f (b)$ . The first midpoint is $x_{0} = \frac{1}{2} (a + b)$ , at distance $∣ x_{0} - x^{*} ∣ = \frac{1}{2} (a + b) - (a + η) = \frac{1}{2} (b - a) - η$ . As $η \to 0$ this approaches $(b - a) /2 = (b - a) / 2^{0 + 1}$ , the bound at $n = 0$ . Hence no constant smaller than $1/2$ works uniformly over all admissible $f$ , so the bound is sharp. Full credit for exhibiting a configuration whose midpoint error meets the bound and arguing the constant cannot be reduced.

Exercise 6 (hard, symbolic).

Show that the bisection midpoint sequence $(x_{n})$ need not be monotone and need not satisfy $∣ x_{n + 1} - x^{*} ∣ \leq \frac{1}{2} ∣ x_{n} - x^{*} ∣$ for the actual errors, even though the bound halves each step. Construct a two-step example.

Hint

The midpoint can land on the opposite side of the root from the previous midpoint, and a later midpoint can be closer to the root than the bound suggests, so consecutive actual errors need not be in a fixed ratio.

Answer

Let $x^{*} = 0.5$ be the root of a function with sign change on $[0, 1]$ , arranged so the brackets are $[0, 1] \to [0, 0.5] \to [0.25, 0.5]$ . The midpoints are $x_{0} = 0.5$ , $x_{1} = 0.25$ , $x_{2} = 0.375$ . Then $∣ x_{0} - x^{*} ∣ = 0$ while $∣ x_{1} - x^{*} ∣ = 0.25$ : the actual error increased from step $0$ to step $1$ , so the ratio $∣ x_{n + 1} - x^{*} ∣ \leq \frac{1}{2} ∣ x_{n} - x^{*} ∣$ fails (the right side would be $0$ ). The errors $0, 0.25, 0.125$ are not monotone decreasing in a fixed ratio, yet each respects its own bound $∣ x_{n} - x^{*} ∣ \leq 2^{- (n + 1)} = 0.5, 0.25, 0.125$ . The lesson: bisection is linearly convergent through the contraction of the bracket-width bound, not through a per-step ratio of actual errors — the midpoint can momentarily sit on or near the root and then move away as the bracket re-centres. Full credit for an explicit bracket sequence whose actual midpoint errors violate a fixed per-step ratio while honouring the a-priori bound.

Exercise 7 (hard, symbolic).

Suppose $f$ is continuous on $[a, b]$ and has $k$ simple roots there, all in the interior, with $f (a) f (b) < 0$ . Show $k$ is odd, and that bisection converges to some root but the limit can depend on the arithmetic of the midpoints.

Hint

A simple root is a sign-crossing. Count sign changes from $a$ to $b$ ; an odd total is forced by $f (a), f (b)$ having opposite signs. For dependence, note a midpoint hitting a root or a sign pattern steers the kept half.

Answer

At a simple root $f$ changes sign. Reading the sign of $f$ from $a$ to $b$ , each simple root flips it; after $k$ flips the sign at $b$ differs from the sign at $a$ if and only if $k$ is odd. Since $f (a) f (b) < 0$ the signs differ, so $k$ is odd. Convergence: the bracketing invariant $f (a_{n}) f (b_{n}) \leq 0$ is preserved at every step (each kept half still shows a sign change), and $b_{n} - a_{n} \to 0$ , so by the convergence theorem the nested brackets close on a point $x^{*}$ with $f (x^{*}) = 0$ — one of the $k$ roots. Which root: when several roots lie in the current bracket, the sign of $f$ at the midpoint decides which half is kept, and that decision depends on where the computed midpoint falls relative to the roots; with two roots in one half, the method keeps that half and eventually isolates one of them, while a perturbation of the midpoint (e.g. from rounding) can change which sub-bracket survives and hence which root is reached. So the limit is a root, but not a canonically determined one. Full credit for the parity argument, the preserved-bracket convergence, and the explanation of root-selection sensitivity.

Advanced results Master

Bisection is the slowest method that still carries an unconditional guarantee, and the results below place it precisely: as the optimal worst-case interval method, as the linear baseline of the order hierarchy, and as the safeguard that hybrid solvers wrap around faster but unreliable iterations.

Theorem 1 (linear convergence and the order baseline). The bisection bound $∣ x_{n} - x^{*} ∣ \leq (b - a) / 2^{n + 1}$ exhibits linear convergence of order $p = 1$ with asymptotic error constant $C = 1/2$ : writing $β_{n} = (b - a) / 2^{n + 1}$ for the bound, $β_{n + 1} / β_{n} = 1/2$ exactly. No higher order is attainable, because the method extracts only one bit of information per evaluation — the sign of $f (c_{n})$ — and one bit can at best halve the candidate interval. This information-theoretic ceiling is the reason every superlinear method in the chapter must use more than the sign: fixed-point iteration uses the value of a contraction map 43.02.02, Newton uses the value and the derivative 43.02.03, and the secant uses two values to approximate the derivative ^{[Atkinson §2.1]}.

Theorem 2 (worst-case optimality of bisection among bracketing methods). Among all methods that, given only the ability to evaluate $sign f$ and that begin with a bracket of width $w = b - a$ containing a single sign change, no method can guarantee to localise the root to an interval of width less than $w / 2^{n}$ after $n$ evaluations. Bisection achieves $w / 2^{n}$ exactly, so it is worst-case optimal in the sign-evaluation model. The argument is an adversary one: after $n$ sign queries the responses partition the possible root locations into at most $2^{n}$ cells, and an adversary answering to keep the largest cell alive forces a residual interval of width $\geq w / 2^{n}$ ; bisection's repeated halving meets this lower bound ^{[Sikorski 1982]}.

Theorem 3 (finite-precision accuracy floor). In floating-point arithmetic with unit roundoff $ε_{mach}$ , the computed midpoint $c_{n} = fl (\frac{1}{2} (a_{n} + b_{n}))$ and the computed sign $sign f (c_{n})$ degrade once the bracket width approaches the spacing of representable numbers near $x^{*}$ , which is of order $∣ x^{*} ∣ ε_{mach}$ . Below that width the evaluated sign of $f$ is dominated by rounding error in computing $f$ , and further halving no longer reliably brackets the true root: the attainable accuracy is floored at roughly $∣ x^{*} ∣ ε_{mach}$ for a well-conditioned simple root, and worse for an ill-conditioned one, where the conditioning of the root scales the floor by $1/∣ f^{'} (x^{*}) ∣$ . This is the conditioning-and-roundoff boundary developed in chapter 43.01 ^{[Trefethen-Bau Lec. 12-15]}, and it is why a stopping test should never demand a width below the precision the data and arithmetic support.

Theorem 4 (bisection as a safeguard; hybrid bracketing methods). A practical scalar solver wraps a fast, unreliable iteration (secant, inverse-quadratic interpolation, Newton) inside a maintained bracket, accepting the fast step only when it lands inside the current bracket and produces sufficient decrease, and falling back to a bisection step otherwise. This preserves the unconditional convergence of bisection while attaining the superlinear speed of the interpolation step on smooth functions. Dekker's method and Brent's method are the canonical realisations: Brent's method combines inverse quadratic interpolation, the secant method, and bisection, guaranteeing convergence in at most about the square of the bisection step count while typically converging superlinearly ^{[Brent 1973 Ch. 4]}. The role of bisection in the hybrid is exactly its guarantee — it is the floor that catches the fast step when it strays.

Synthesis. Bisection is the foundational reason the order-of-convergence framework has a well-defined bottom: it is the method that uses the least information — a single sign bit per evaluation — and so it sets the linear, rate- $1/2$ baseline against which every faster method is measured, and this is exactly the worst-case-optimal performance in the sign-query model, so nothing using only signs can beat it. The central insight is that speed in root-finding is bought by extracting more local information than a sign, and the chapter is the systematic study of that trade: the fixed-point picture of 43.02.02 reads a contraction's value to obtain rate $∣ g^{'} (x^{*}) ∣$ , Newton's method of 43.02.03 reads the derivative to obtain quadratic order, and each gain in order is dual to a loss of unconditional guarantee, which is why the practical solvers of Theorem 4 graft the fast iteration onto a bisection safeguard. Putting these together, bisection appears again not as a competitor to the superlinear methods but as their floor and their fallback: the guaranteed-but-slow method is exactly what makes the fast-but-fragile methods safe to deploy, and the finite-precision floor of Theorem 3 is the foundational reason no method, fast or slow, can be asked for accuracy below what the conditioning of the root and the arithmetic permit — the bridge from this elementary method to the conditioning theory of chapter 43.01.

Full proof set Master

Proposition 1 (nested-bracket convergence and the a-priori bound). Under the hypotheses of the Key theorem, the bisection iterates converge to a root $x^ $w i t h$ |x_n - x^| \le (b-a)/2^{n+1}$.

Proof. The construction keeps, at each step, the half-interval on which $f$ retains a sign change, so $f (a_{n}) f (b_{n}) \leq 0$ for all $n$ and the brackets nest: $a \leq a_{n} \leq a_{n + 1} \leq b_{n + 1} \leq b_{n} \leq b$ . The sequence $(a_{n})$ is nondecreasing and bounded above, $(b_{n})$ nonincreasing and bounded below, so each converges by the monotone convergence theorem, itself a consequence of the order-completeness of $R$ 02.02.01, 02.01.05. Since $b_{n} - a_{n} = (b - a) / 2^{n} \to 0$ , the two limits coincide at a single point $x^{*} \in [a, b]$ . Continuity gives $f (a_{n}) \to f (x^{*})$ and $f (b_{n}) \to f (x^{*})$ ; passing to the limit in $f (a_{n}) f (b_{n}) \leq 0$ yields $f (x^{*})^{2} \leq 0$ , so $f (x^{*}) = 0$ . The midpoint $x_{n} = \frac{1}{2} (a_{n} + b_{n})$ and the root both lie in $[a_{n}, b_{n}]$ , whose width is $(b - a) / 2^{n}$ , so $∣ x_{n} - x^{*} ∣ \leq \frac{1}{2} (b_{n} - a_{n}) = (b - a) / 2^{n + 1}$ . $□$

Proposition 2 (iteration count for a tolerance). To guarantee $|x_n - x^| \le \varepsilon $i t s u f f i ces t ha t$ n \ge \log_2\big((b-a)/\varepsilon\big) - 1$, and this many steps is necessary in the worst case.*

Proof. Sufficiency: $(b - a) / 2^{n + 1} \leq ε ⟺ 2^{n + 1} \geq (b - a) / ε ⟺ n + 1 \geq lo g_{2} ((b - a) / ε)$ , the stated inequality. Necessity in the worst case: by Proposition 3 the residual interval after $n$ sign evaluations has width $(b - a) / 2^{n}$ , and an adversary can place the root anywhere in it, so the guaranteed position error is no better than half that width, $(b - a) / 2^{n + 1}$ . Demanding this be $\leq ε$ reproduces the same bound on $n$ , so fewer steps cannot guarantee the tolerance for every admissible $f$ . $□$

Proposition 3 (information lower bound; worst-case optimality). Any method restricted to evaluating $sign f$ and starting from a width- $w$ bracket with a single sign change cannot guarantee, after $n$ evaluations, a residual bracket of width below $w / 2^{n}$ .

Proof. Model each evaluation as a binary query: it returns the sign of $f$ at a chosen point, one of two outcomes (treating a zero outcome as a lucky termination that an adversary avoids). After $n$ queries the sequence of outcomes is one of at most $2^{n}$ branches; the method's output interval is determined by the branch taken. The possible root locations consistent with a given branch form a sub-interval, and the $2^{n}$ branches partition the original width- $w$ set of locations into at most $2^{n}$ pieces, so some piece has width $\geq w / 2^{n}$ . An adversary answers each query to keep alive the largest consistent piece; the method ends with a residual interval of width $\geq w / 2^{n}$ on that branch. Bisection halves the bracket on every query, reaching exactly $w / 2^{n}$ , so it attains the lower bound and is worst-case optimal. $□$

Proposition 4 (order $p = 1$ with constant $1/2$ ). The bisection bound sequence $β_{n} = (b - a) / 2^{n + 1}$ satisfies $β_{n + 1} = \frac{1}{2} β_{n}$ , so bisection converges linearly with asymptotic error constant $1/2$ and no higher order.

Proof. Directly $β_{n + 1} = (b - a) / 2^{n + 2} = \frac{1}{2} \cdot (b - a) / 2^{n + 1} = \frac{1}{2} β_{n}$ , a fixed ratio $\frac{1}{2} \in (0, 1)$ , which is the definition of linear convergence of order $p = 1$ with constant $C = \frac{1}{2}$ applied to the error bound. No order $p > 1$ can hold for the guaranteed bound: if $β_{n + 1} \leq C β_{n}^{p}$ held with $p > 1$ for all admissible $f$ , then for the worst-case $f$ of Proposition 3, where the residual width is exactly $w / 2^{n}$ and cannot be beaten, the bound $β_{n} = w / 2^{n + 1}$ is tight, and $β_{n + 1} / β_{n}^{p} = \frac{1}{2} β_{n}^{1 - p} \to \infty$ as $β_{n} \to 0$ since $1 - p < 0$ , so no finite $C$ exists. Hence $p = 1$ is exact. $□$

Connections Master

Fixed-point iteration 43.02.02 reformulates $f (x) = 0$ as $x = g (x)$ and iterates $x_{n + 1} = g (x_{n})$ , converging linearly with asymptotic rate $∣ g^{'} (x^{*}) ∣$ when $∣ g^{'} (x^{*}) ∣ < 1$ . The order-of-convergence framework defined in this unit — $∣ e_{n + 1} ∣ \leq C ∣ e_{n} ∣^{p}$ with $p = 1$ the linear case — is exactly what that unit specialises: bisection is the rate- $1/2$ instance whose contraction comes from halving rather than from a derivative, and the contrast (a fixed, guaranteed $1/2$ versus a problem-dependent $∣ g^{'} (x^{*}) ∣$ that may exceed $1$ and diverge) is the comparison 43.02.02 opens with.
Newton's method 43.02.03 is the fixed-point map $g (x) = x - f (x) / f^{'} (x)$ with $g^{'} (x^{*}) = 0$ , achieving quadratic order $p = 2$ . It buys that speed by reading the derivative, the extra information bisection refuses to use; the duality is exact — bisection guarantees convergence and gives linear order, Newton gives quadratic order but only local, conditional convergence. The hybrid safeguards of this unit's Advanced results are the resolution: a bracket-maintaining wrapper takes Newton or secant steps when they stay inside the bracket and falls back to a bisection step otherwise, so 43.02.03's speed and this unit's guarantee combine.
Conditioning, floating-point arithmetic, and backward error 43.01.01 supply the finite-precision floor that caps the accuracy of every root-finder, including bisection: once the bracket width approaches $∣ x^{*} ∣ ε_{mach}$ the evaluated sign of $f$ is rounding-dominated and further halving is unreliable. This unit's stopping-criterion discussion and Theorem 3 forward-reference that chapter; the conditioning of the root, $1/∣ f^{'} (x^{*}) ∣$ , is the multiplier that turns the unit-roundoff floor into the achievable-accuracy floor.
The intermediate value theorem 02.01.02 — the image of a connected interval under a continuous map is connected, so a continuous function changing sign attains zero — is the existence theorem this entire method rests on, and the completeness of $R$ 02.02.01 is what makes the nested-bracket limit exist. Bisection is, in effect, the constructive content of the IVT made into an algorithm: it turns the pure existence statement into a sequence of brackets that produces the guaranteed root to any tolerance.

Historical & philosophical context Master

The idea of trapping a root between values of opposite sign and repeatedly subdividing is old, descending from the method of false position (regula falsi) used by Babylonian, Indian, and medieval Islamic and European computers, and from the bisection-style arguments implicit in proofs of the intermediate value theorem. Bernard Bolzano's 1817 memoir Rein analytischer Beweis gave the first rigorous proof of the intermediate value theorem, isolating the existence of a root of a sign-changing continuous function as a theorem of analysis rather than a geometric assumption ^{[Bolzano 1817]}; the repeated-bisection argument he and later Cauchy and Weierstrass used to locate the root is precisely the algorithm of this unit, read as a constructive proof. The numerical method is thus the existence theorem turned into a finite procedure, and the a-priori bound $(b - a) / 2^{n + 1}$ is the quantitative content the existence proof leaves implicit.

The systematic comparison of root-finders by order of convergence belongs to the twentieth-century numerical-analysis literature. Süli and Mayers open their nonlinear-equations chapter with bisection precisely because its guaranteed-but-linear behaviour fixes the baseline of the order framework ^{[Süli-Mayers Ch. 1]}. The worst-case optimality of bisection in the sign-evaluation model was established within information-based complexity, where Sikorski and others proved that no algorithm using sign information can localise a root faster than halving. Richard Brent's 1973 monograph gave the safeguarded hybrid that remains the standard library root-finder, combining inverse quadratic interpolation with a bisection fallback to secure both speed and guaranteed convergence ^{[Brent 1973]}.

Bibliography Master

@book{sulimayers2003,
  author    = {S\"{u}li, Endre and Mayers, David F.},
  title     = {An Introduction to Numerical Analysis},
  publisher = {Cambridge University Press},
  year      = {2003}
}

@book{atkinson1989,
  author    = {Atkinson, Kendall E.},
  title     = {An Introduction to Numerical Analysis},
  edition   = {2},
  publisher = {John Wiley \& Sons},
  year      = {1989}
}

@book{stoerbulirsch2002,
  author    = {Stoer, Josef and Bulirsch, Roland},
  title     = {Introduction to Numerical Analysis},
  edition   = {3},
  series    = {Texts in Applied Mathematics},
  volume    = {12},
  publisher = {Springer},
  year      = {2002}
}

@book{brent1973,
  author    = {Brent, Richard P.},
  title     = {Algorithms for Minimization without Derivatives},
  publisher = {Prentice-Hall},
  year      = {1973}
}

@book{trefethenbau1997,
  author    = {Trefethen, Lloyd N. and Bau, David},
  title     = {Numerical Linear Algebra},
  publisher = {SIAM},
  year      = {1997}
}

@article{sikorski1982,
  author  = {Sikorski, Krzysztof},
  title   = {Bisection is optimal},
  journal = {Numerische Mathematik},
  volume  = {40},
  number  = {1},
  year    = {1982},
  pages   = {111--117}
}

@book{bolzano1817,
  author    = {Bolzano, Bernard},
  title     = {Rein analytischer Beweis des Lehrsatzes, dass zwischen je zwey Werthen, die ein entgegengesetztes Resultat gew\"{a}hren, wenigstens eine reelle Wurzel der Gleichung liege},
  publisher = {Gottlieb Haase},
  address   = {Prague},
  year      = {1817}
}

Prerequisites

02.01.05
02.05.02

Tier anchors

beginner: Süli & Mayers 2003 *An Introduction to Numerical Analysis* (Cambridge) §1.1-1.2 (the root-finding problem, the bisection algorithm as repeated halving of a sign-change bracket); Burden & Faires 2011 *Numerical Analysis* 9e (Brooks/Cole) §2.1 for the step-by-step bracketing picture
intermediate: Süli & Mayers 2003 *An Introduction to Numerical Analysis* (Cambridge) §1.2 (bisection, the a-priori error bound $(b-a)/2^{n+1}$, the iteration count for a tolerance); Atkinson 1989 *An Introduction to Numerical Analysis* 2e (Wiley) §2.1; Stoer & Bulirsch 2002 *Introduction to Numerical Analysis* 3e (Springer) §5.1
master: Süli & Mayers 2003 *An Introduction to Numerical Analysis* (Cambridge) Ch. 1 (root-finding, the convergence-order framework opening the nonlinear-equations chapter); Trefethen & Bau 1997 *Numerical Linear Algebra* (SIAM) Lecture 13-15 (conditioning and the precision-limited accuracy floor); Brent 1973 *Algorithms for Minimization without Derivatives* (Prentice-Hall) Ch. 3-4 (the bracketing/interpolation hybrid that makes bisection a practical safeguard)

References

Süli, E. & Mayers, D. — An Introduction to Numerical Analysis · Cambridge University Press (2003). Chapter 1 'Solution of equations by iteration' opens the nonlinear-equations material: the scalar root-finding problem $f(x)=0$ for continuous $f$ on $[a,b]$; the bisection method built on the intermediate value theorem, which guarantees a root in any interval where $f$ changes sign; the a-priori error bound that after $n$ bisections the midpoint approximates a root to within $(b-a)/2^{n+1}$, giving the explicit number of steps needed for a prescribed tolerance; the resulting linear convergence with contraction factor $1/2$; and the framing of order of convergence ($p=1$ for bisection) that the chapter then sharpens for fixed-point iteration (linear, rate $|g'(x^*)|$) and Newton's method (quadratic). Süli & Mayers stress that bisection is unconditionally convergent once a sign-change bracket is found, at the cost of being slow, and that finite-precision arithmetic eventually bounds the attainable accuracy.
Atkinson, K. E. — An Introduction to Numerical Analysis · Wiley, 2nd edition (1989). §2.1 presents the bisection method with the same halving-interval error bound and contrasts its guaranteed but linear convergence against the faster but conditionally convergent Newton and secant methods; the chapter develops the general order-of-convergence definition $|x_{n+1}-x^*| \le C|x_n-x^*|^p$ used to rank root-finders.
Trefethen, L. N. & Bau, D. — Numerical Linear Algebra · SIAM (1997). Lectures 12-15 develop conditioning, floating-point arithmetic, and backward stability; the relevant point for root-finding is that the achievable accuracy of any method is floored by the conditioning of the root and unit roundoff $\varepsilon_{\mathrm{mach}}$, so a residual-based stopping test cannot demand accuracy below the level the data and arithmetic support. This is the forward cross-reference for the finite-precision limit on bisection in chapter 43.01.

Estimated time

beginner: 20m
intermediate: 45m
master: 80m