43.02.01 · numerical-analysis / nonlinear-equations

The Bisection Method and the Scalar Root-Finding Problem

shipped3 tiersLean: none

Anchor (Master): Süli & Mayers 2003 *An Introduction to Numerical Analysis* (Cambridge) Ch. 1 (root-finding, the convergence-order framework opening the nonlinear-equations chapter); Trefethen & Bau 1997 *Numerical Linear Algebra* (SIAM) Lecture 13-15 (conditioning and the precision-limited accuracy floor); Brent 1973 *Algorithms for Minimization without Derivatives* (Prentice-Hall) Ch. 3-4 (the bracketing/interpolation hybrid that makes bisection a practical safeguard)

Intuition Beginner

Suppose a quantity depends on some input — the height of a thrown ball at time , the profit of a business at price — and you want to know where that quantity hits zero. Crossing zero is where the ball lands or where you break even. Writing the quantity as , you are asking for the input where . That input is called a root, and the search for it is the root-finding problem. Most equations you meet in practice cannot be solved by a tidy formula, so you need a procedure that hunts the root down by getting closer and closer.

Here is the one fact that makes the hunt reliable. If is a continuous quantity — no sudden jumps — and it is negative at one input and positive at another, then somewhere between those two inputs it must pass through zero. You cannot get from below the ground to above the ground without touching the ground. So if you can find one place where is negative and another where is positive, you have trapped a root between them. That trapping interval is called a bracket.

The bisection method is the most patient way to shrink a bracket. Look at the midpoint of your interval and check the sign of there. The midpoint splits the interval into two halves, and the root has to live in whichever half still shows a sign change. Keep that half, throw the other away, and repeat. Every step cuts the interval in half, so the trap closes in on the root as tightly as you like.

This unit is where the whole chapter on solving equations begins. Bisection is the method that always works once you have a bracket, and the price for that guarantee is that it is slow. Later methods in the chapter trade some of that guarantee for far greater speed, and the way we measure speed — how fast the error shrinks per step — is set up here.

Visual Beginner

Picture a continuous curve that starts below the horizontal axis on the left and ends above it on the right. Because it has no breaks, it must cross the axis somewhere in between, and that crossing point is the root we want. Bisection finds it by repeatedly checking the middle.

The picture below shows the interval for a curve that is negative at and positive at . Each row halves the bracket by testing the midpoint and keeping the half where the sign still changes.

   f(1) < 0                                  f(2) > 0
     |---------------------------------------------|
     1                    1.5                       2      midpoint 1.5: f(1.5) > 0  -> root in [1, 1.5]
     |--------------------|
     1         1.25       1.5                              midpoint 1.25: f(1.25) < 0 -> root in [1.25, 1.5]
               |----------|
               1.25  1.375 1.5                             midpoint 1.375: keep the half with the sign change
               |-----|
   each row is half as wide as the row above it
step bracket width midpoint tested
start
1
2
3

Each row of the table is exactly half as wide as the row above it, and the crossing point stays trapped inside every bracket. After ten steps the width is about one thousandth of the start; after twenty, about one millionth.

Worked example Beginner

Let us find a root of by bisection. A root of this is a number whose square is , that is, the square root of , which is about . We will pretend we do not know that and let the method discover it.

Step 1. Find a starting bracket. Check the endpoints and . At , , which is negative. At , , which is positive. The sign changes, so a root is trapped in .

Step 2. Test the midpoint of , which is . Here , positive. The root sits between the negative end and the positive value at , so the new bracket is .

Step 3. Test the midpoint of , which is . Here , negative. The sign change is now between and , so the new bracket is .

Step 4. Test the midpoint of , which is . Here , negative. The new bracket is .

Step 5. Read off the current estimate. After three halvings the bracket has width , and its midpoint is our estimate. The true root lies inside the bracket, and our estimate is off by less than half the width, under .

What this tells us: starting from nothing but two sign-checks, three halvings already pin the square root of to within about one part in twenty, and every further step doubles the precision. The method needs no formula for the root, only the ability to check the sign of .

Check your understanding Beginner

Formal definition Intermediate+

Let be continuous with . The scalar root-finding problem is to determine with . Existence of such a root is supplied by the intermediate value theorem (the image of a connected set under a continuous map is connected, specialised to a real interval 02.01.02; equivalently the completeness of used in 02.02.01): since lies strictly between and , there is a point at which attains it. Uniqueness is not assumed — the interval may bracket several roots; the method below returns one of them.

The bisection method generates a nested sequence of brackets with . At step , set the midpoint and inspect the sign of :

In the boundary case the midpoint is itself a root and the process terminates. Each step preserves the invariant , so every bracket contains a root, and the width satisfies . The bisection estimate after steps is the midpoint ; it requires exactly one evaluation of per step. A stopping criterion halts the iteration once a target is met: a width test , a residual test , or a cap on the iteration count. The symbols for the exact root, for the error, and for the bracket are recorded in _meta/NOTATION.md.

A method generating estimates is said to converge with order if there is a constant (with when ) such that, for all large , $$ |x_{n+1} - x^| \le C,|x_n - x^|^{p}. $$ The case is linear convergence with rate (or contraction factor) ; is quadratic; is superlinear. This order framework is the yardstick the rest of chapter 43.02 uses to compare bisection against fixed-point iteration 43.02.02 and Newton's method 43.02.03.

Counterexamples to common slips Intermediate+

  • "A sign change at the endpoints means exactly one root." It guarantees an odd number of roots in the bracket (counting multiplicity for crossings), not a unique one. Bisection converges to one root of the bracket; which one depends on the sign pattern at the midpoints, and a different starting bracket may capture a different root.

  • "Same sign at the endpoints means no root." A continuous can have a root — even several — inside while and share a sign, as with on , which has two roots yet . The sign-change test is sufficient for a bracket, not necessary for a root; an even number of crossings is invisible to it.

  • "The error per step is the midpoint residual ." The quantity that halves deterministically is the bracket width, hence the bound on the position error. The residual can be large while is close to a flat root, or small while is far from a steep one; residual and position error are linked only through the local slope.

  • "Bisection has a convergence rate $|g'(x^)|1/2f$.

Key theorem with proof Intermediate+

The signature result is the guaranteed convergence of bisection together with its explicit a-priori error bound, because it is the single statement that both certifies the method always works and quantifies exactly how many steps a prescribed accuracy costs.

Theorem (bisection convergence with a-priori bound). Let be continuous with . The bisection iterates converge to a root $x^ \in [a, b]fn \ge 0$* $$ |x_n - x^| \le \frac{b - a}{2^{,n+1}}. $$ Consequently, to guarantee $|x_n - x^| \le \varepsilon$ it suffices to take $$ n \ge \log_2!\frac{b - a}{\varepsilon} - 1. $$

Proof. By construction each step keeps the half on which changes sign, so the invariant holds for all , and . The left endpoints are nondecreasing and bounded above by , and the right endpoints are nonincreasing and bounded below by ; by the monotone convergence of bounded real sequences (a consequence of the completeness of used in the metric-space and real-number foundations 02.01.05, 02.02.01) both converge, and since they share a common limit . Continuity of gives and , so , forcing . Thus is a root.

The midpoint lies at the centre of , and , so the distance from to is at most half the width: $$ |x_n - x^| \le \frac{1}{2}(b_n - a_n) = \frac{1}{2}\cdot\frac{b - a}{2^{n}} = \frac{b - a}{2^{n+1}}. $$ For the iteration count, $|x_n - x^| \le \varepsilon(b - a)/2^{n+1} \le \varepsilon2^{n+1} \ge (b - a)/\varepsilon\log_2n + 1 \ge \log_2\big((b-a)/\varepsilon\big)n \ge \log_2\big((b-a)/\varepsilon\big) - 1\square$

Bridge. This theorem is the foundational reason bisection is the safe default for scalar root-finding: the error bound depends on nothing but the starting width and the step count, so convergence is certified before a single evaluation is made, and it builds toward the comparative study of speed that organises the chapter. The bound is exactly the order-of-convergence definition with and contraction factor , which is the central insight that lets bisection be ranked on the same scale as every other method here: this is exactly linear convergence, and the gain of one correct binary digit per step is what later methods accelerate. The halving generalises to any deterministic interval-contraction scheme, and it is dual to the fixed-point picture of 43.02.02, where convergence is governed instead by a derivative at the root. Putting these together, the a-priori bound certifies that bisection converges and how slowly, and this same convergence-order language appears again in 43.02.02 for fixed-point iteration with rate and in 43.02.03 for Newton's quadratic order, where the contrast with bisection's guaranteed-but-linear behaviour is the organising comparison.

Exercises Intermediate+

Advanced results Master

Bisection is the slowest method that still carries an unconditional guarantee, and the results below place it precisely: as the optimal worst-case interval method, as the linear baseline of the order hierarchy, and as the safeguard that hybrid solvers wrap around faster but unreliable iterations.

Theorem 1 (linear convergence and the order baseline). The bisection bound exhibits linear convergence of order with asymptotic error constant : writing for the bound, exactly. No higher order is attainable, because the method extracts only one bit of information per evaluation — the sign of — and one bit can at best halve the candidate interval. This information-theoretic ceiling is the reason every superlinear method in the chapter must use more than the sign: fixed-point iteration uses the value of a contraction map 43.02.02, Newton uses the value and the derivative 43.02.03, and the secant uses two values to approximate the derivative [Atkinson §2.1].

Theorem 2 (worst-case optimality of bisection among bracketing methods). Among all methods that, given only the ability to evaluate and that begin with a bracket of width containing a single sign change, no method can guarantee to localise the root to an interval of width less than after evaluations. Bisection achieves exactly, so it is worst-case optimal in the sign-evaluation model. The argument is an adversary one: after sign queries the responses partition the possible root locations into at most cells, and an adversary answering to keep the largest cell alive forces a residual interval of width ; bisection's repeated halving meets this lower bound [Sikorski 1982].

Theorem 3 (finite-precision accuracy floor). In floating-point arithmetic with unit roundoff , the computed midpoint and the computed sign degrade once the bracket width approaches the spacing of representable numbers near , which is of order . Below that width the evaluated sign of is dominated by rounding error in computing , and further halving no longer reliably brackets the true root: the attainable accuracy is floored at roughly for a well-conditioned simple root, and worse for an ill-conditioned one, where the conditioning of the root scales the floor by . This is the conditioning-and-roundoff boundary developed in chapter 43.01 [Trefethen-Bau Lec. 12-15], and it is why a stopping test should never demand a width below the precision the data and arithmetic support.

Theorem 4 (bisection as a safeguard; hybrid bracketing methods). A practical scalar solver wraps a fast, unreliable iteration (secant, inverse-quadratic interpolation, Newton) inside a maintained bracket, accepting the fast step only when it lands inside the current bracket and produces sufficient decrease, and falling back to a bisection step otherwise. This preserves the unconditional convergence of bisection while attaining the superlinear speed of the interpolation step on smooth functions. Dekker's method and Brent's method are the canonical realisations: Brent's method combines inverse quadratic interpolation, the secant method, and bisection, guaranteeing convergence in at most about the square of the bisection step count while typically converging superlinearly [Brent 1973 Ch. 4]. The role of bisection in the hybrid is exactly its guarantee — it is the floor that catches the fast step when it strays.

Synthesis. Bisection is the foundational reason the order-of-convergence framework has a well-defined bottom: it is the method that uses the least information — a single sign bit per evaluation — and so it sets the linear, rate- baseline against which every faster method is measured, and this is exactly the worst-case-optimal performance in the sign-query model, so nothing using only signs can beat it. The central insight is that speed in root-finding is bought by extracting more local information than a sign, and the chapter is the systematic study of that trade: the fixed-point picture of 43.02.02 reads a contraction's value to obtain rate , Newton's method of 43.02.03 reads the derivative to obtain quadratic order, and each gain in order is dual to a loss of unconditional guarantee, which is why the practical solvers of Theorem 4 graft the fast iteration onto a bisection safeguard. Putting these together, bisection appears again not as a competitor to the superlinear methods but as their floor and their fallback: the guaranteed-but-slow method is exactly what makes the fast-but-fragile methods safe to deploy, and the finite-precision floor of Theorem 3 is the foundational reason no method, fast or slow, can be asked for accuracy below what the conditioning of the root and the arithmetic permit — the bridge from this elementary method to the conditioning theory of chapter 43.01.

Full proof set Master

Proposition 1 (nested-bracket convergence and the a-priori bound). Under the hypotheses of the Key theorem, the bisection iterates converge to a root $x^|x_n - x^| \le (b-a)/2^{n+1}$.

Proof. The construction keeps, at each step, the half-interval on which retains a sign change, so for all and the brackets nest: . The sequence is nondecreasing and bounded above, nonincreasing and bounded below, so each converges by the monotone convergence theorem, itself a consequence of the order-completeness of 02.02.01, 02.01.05. Since , the two limits coincide at a single point . Continuity gives and ; passing to the limit in yields , so . The midpoint and the root both lie in , whose width is , so .

Proposition 2 (iteration count for a tolerance). To guarantee $|x_n - x^| \le \varepsilonn \ge \log_2\big((b-a)/\varepsilon\big) - 1$, and this many steps is necessary in the worst case.*

Proof. Sufficiency: , the stated inequality. Necessity in the worst case: by Proposition 3 the residual interval after sign evaluations has width , and an adversary can place the root anywhere in it, so the guaranteed position error is no better than half that width, . Demanding this be reproduces the same bound on , so fewer steps cannot guarantee the tolerance for every admissible .

Proposition 3 (information lower bound; worst-case optimality). Any method restricted to evaluating and starting from a width- bracket with a single sign change cannot guarantee, after evaluations, a residual bracket of width below .

Proof. Model each evaluation as a binary query: it returns the sign of at a chosen point, one of two outcomes (treating a zero outcome as a lucky termination that an adversary avoids). After queries the sequence of outcomes is one of at most branches; the method's output interval is determined by the branch taken. The possible root locations consistent with a given branch form a sub-interval, and the branches partition the original width- set of locations into at most pieces, so some piece has width . An adversary answers each query to keep alive the largest consistent piece; the method ends with a residual interval of width on that branch. Bisection halves the bracket on every query, reaching exactly , so it attains the lower bound and is worst-case optimal.

Proposition 4 (order with constant ). The bisection bound sequence satisfies , so bisection converges linearly with asymptotic error constant and no higher order.

Proof. Directly , a fixed ratio , which is the definition of linear convergence of order with constant applied to the error bound. No order can hold for the guaranteed bound: if held with for all admissible , then for the worst-case of Proposition 3, where the residual width is exactly and cannot be beaten, the bound is tight, and as since , so no finite exists. Hence is exact.

Connections Master

  • Fixed-point iteration 43.02.02 reformulates as and iterates , converging linearly with asymptotic rate when . The order-of-convergence framework defined in this unit — with the linear case — is exactly what that unit specialises: bisection is the rate- instance whose contraction comes from halving rather than from a derivative, and the contrast (a fixed, guaranteed versus a problem-dependent that may exceed and diverge) is the comparison 43.02.02 opens with.

  • Newton's method 43.02.03 is the fixed-point map with , achieving quadratic order . It buys that speed by reading the derivative, the extra information bisection refuses to use; the duality is exact — bisection guarantees convergence and gives linear order, Newton gives quadratic order but only local, conditional convergence. The hybrid safeguards of this unit's Advanced results are the resolution: a bracket-maintaining wrapper takes Newton or secant steps when they stay inside the bracket and falls back to a bisection step otherwise, so 43.02.03's speed and this unit's guarantee combine.

  • Conditioning, floating-point arithmetic, and backward error 43.01.01 supply the finite-precision floor that caps the accuracy of every root-finder, including bisection: once the bracket width approaches the evaluated sign of is rounding-dominated and further halving is unreliable. This unit's stopping-criterion discussion and Theorem 3 forward-reference that chapter; the conditioning of the root, , is the multiplier that turns the unit-roundoff floor into the achievable-accuracy floor.

  • The intermediate value theorem 02.01.02 — the image of a connected interval under a continuous map is connected, so a continuous function changing sign attains zero — is the existence theorem this entire method rests on, and the completeness of 02.02.01 is what makes the nested-bracket limit exist. Bisection is, in effect, the constructive content of the IVT made into an algorithm: it turns the pure existence statement into a sequence of brackets that produces the guaranteed root to any tolerance.

Historical & philosophical context Master

The idea of trapping a root between values of opposite sign and repeatedly subdividing is old, descending from the method of false position (regula falsi) used by Babylonian, Indian, and medieval Islamic and European computers, and from the bisection-style arguments implicit in proofs of the intermediate value theorem. Bernard Bolzano's 1817 memoir Rein analytischer Beweis gave the first rigorous proof of the intermediate value theorem, isolating the existence of a root of a sign-changing continuous function as a theorem of analysis rather than a geometric assumption [Bolzano 1817]; the repeated-bisection argument he and later Cauchy and Weierstrass used to locate the root is precisely the algorithm of this unit, read as a constructive proof. The numerical method is thus the existence theorem turned into a finite procedure, and the a-priori bound is the quantitative content the existence proof leaves implicit.

The systematic comparison of root-finders by order of convergence belongs to the twentieth-century numerical-analysis literature. Süli and Mayers open their nonlinear-equations chapter with bisection precisely because its guaranteed-but-linear behaviour fixes the baseline of the order framework [Süli-Mayers Ch. 1]. The worst-case optimality of bisection in the sign-evaluation model was established within information-based complexity, where Sikorski and others proved that no algorithm using sign information can localise a root faster than halving. Richard Brent's 1973 monograph gave the safeguarded hybrid that remains the standard library root-finder, combining inverse quadratic interpolation with a bisection fallback to secure both speed and guaranteed convergence [Brent 1973].

Bibliography Master

@book{sulimayers2003,
  author    = {S\"{u}li, Endre and Mayers, David F.},
  title     = {An Introduction to Numerical Analysis},
  publisher = {Cambridge University Press},
  year      = {2003}
}

@book{atkinson1989,
  author    = {Atkinson, Kendall E.},
  title     = {An Introduction to Numerical Analysis},
  edition   = {2},
  publisher = {John Wiley \& Sons},
  year      = {1989}
}

@book{stoerbulirsch2002,
  author    = {Stoer, Josef and Bulirsch, Roland},
  title     = {Introduction to Numerical Analysis},
  edition   = {3},
  series    = {Texts in Applied Mathematics},
  volume    = {12},
  publisher = {Springer},
  year      = {2002}
}

@book{brent1973,
  author    = {Brent, Richard P.},
  title     = {Algorithms for Minimization without Derivatives},
  publisher = {Prentice-Hall},
  year      = {1973}
}

@book{trefethenbau1997,
  author    = {Trefethen, Lloyd N. and Bau, David},
  title     = {Numerical Linear Algebra},
  publisher = {SIAM},
  year      = {1997}
}

@article{sikorski1982,
  author  = {Sikorski, Krzysztof},
  title   = {Bisection is optimal},
  journal = {Numerische Mathematik},
  volume  = {40},
  number  = {1},
  year    = {1982},
  pages   = {111--117}
}

@book{bolzano1817,
  author    = {Bolzano, Bernard},
  title     = {Rein analytischer Beweis des Lehrsatzes, dass zwischen je zwey Werthen, die ein entgegengesetztes Resultat gew\"{a}hren, wenigstens eine reelle Wurzel der Gleichung liege},
  publisher = {Gottlieb Haase},
  address   = {Prague},
  year      = {1817}
}