The Itô integral and Itô's formula
Anchor (Master): Karatzas–Shreve — Brownian Motion and Stochastic Calculus Ch. 3; Revuz–Yor — Continuous Martingales and Brownian Motion Ch. IV; Protter — Stochastic Integration and Differential Equations Ch. II
Intuition Beginner
Ordinary integration adds up a quantity that changes along a smooth path: how far you travel is your speed summed over time. Stochastic integration asks the same kind of question, but the path you are summing against is a jittery random walk — the trajectory of a particle bumped around by countless tiny collisions. You hold some amount of a risky asset, the asset price jiggles randomly, and you want the total gain. The amount you hold can itself change as you watch the price move.
The catch is that the random path is far too rough to handle the usual way. Over any time interval, however short, the path wiggles up and down without settling, and its total back-and-forth length is infinite. So we cannot treat the random increments as ordinary small steps and add them with the familiar rules.
The fix is a rule about timing: you must decide how much to hold before you see the next random jump, never after. That single honesty condition — no peeking ahead — is what makes the whole construction work and what separates this integral from every integral that came before it.
Visual Beginner
Picture a jagged random path climbing and falling across the page from left to right: the running position of a particle in random motion. Below it, a second staircase shows how much of the asset you are holding, flat over each short time block and then jumping to a new flat level at the start of the next block.
Each block contributes one term: your fixed holding over that block, multiplied by the net change in the random path across that block. Because your holding for a block is locked in at its left edge — before the path moves through the block — you never get to use future information. Summing these block contributions gives the total gain. As the blocks shrink toward zero width, the staircase sums settle onto a single number: the stochastic integral.
Worked example Beginner
Take the simplest holding: hold exactly one unit at all times, from time $0$ to time $T$. The random path $B$ starts at $B_0 = 0$. Split $[0, T]$ into blocks at times $0 = t_0 < t_1 < \dots < t_n = T$. Over block $k$ the path changes by $B_{t_{k+1}} - B_{t_k}$, and your holding is $1$, so block $k$ contributes that change. Adding all blocks gives $B_T - B_0$. The total gain from holding one unit is just the net displacement $B_T$, exactly as you would hope.
Now hold the current path value instead: at the left edge of block $k$ you hold $B_{t_k}$. Block $k$ then contributes $B_{t_k}(B_{t_{k+1}} - B_{t_k})$. A short algebraic rearrangement of the telescoping total gives one half of $B_T^2$ minus one half of the added-up squared changes across all the blocks. The squared changes do not vanish as blocks shrink: their running total settles to $T$. So the total gain is $\tfrac12 B_T^2 - \tfrac12 T$, not the $\tfrac12 B_T^2$ that ordinary calculus would predict. That stubborn extra $-\tfrac12 T$ is the signature of the whole subject: the random path is so rough that its squared wiggles add up to real time.
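The rearrangement above can be checked numerically. The following is a minimal sketch, assuming a uniform time grid and a discretised path built from Gaussian steps; the helper name `brownian_path`, the seed, and the grid size are illustrative choices, not part of the text.

```python
import math
import random

def brownian_path(n_steps, horizon, rng):
    """Return [B_0, ..., B_n] on a uniform grid, with B_0 = 0 (hypothetical helper)."""
    dt = horizon / n_steps
    path = [0.0]
    for _ in range(n_steps):
        path.append(path[-1] + rng.gauss(0.0, math.sqrt(dt)))
    return path

rng = random.Random(0)
T, n = 1.0, 100_000
B = brownian_path(n, T, rng)

# Gain from holding B_{t_k} over block k (holding fixed at the left edge).
gain = sum(B[k] * (B[k + 1] - B[k]) for k in range(n))
# Added-up squared changes across all blocks.
squared_changes = sum((B[k + 1] - B[k]) ** 2 for k in range(n))

# Algebraic identity: gain = B_T^2 / 2 - (squared changes) / 2, exact up to rounding.
print(gain, 0.5 * B[-1] ** 2 - 0.5 * squared_changes)
# The squared changes should sit close to T for a fine grid.
print(squared_changes, T)
```

The first printed pair agrees up to floating-point rounding on every path, because it is pure algebra; only the second pair is a statement about randomness.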
Formal definition Intermediate+
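Fix a probability space $(\Omega, \mathcal{F}, \mathbb{P})$ with a filtration $(\mathcal{F}_t)_{t \ge 0}$ satisfying the usual conditions, and let $B = (B_t)_{t \ge 0}$ be a one-dimensional Brownian motion adapted to this filtration, with increments $B_t - B_s$ independent of $\mathcal{F}_s$ for $s < t$ (the process constructed in 02.15.01). An integrand $H$ is simple predictable when there exist deterministic times $0 = t_0 < t_1 < \dots < t_n = T$ and bounded $\mathcal{F}_{t_k}$-measurable random variables $\xi_k$ with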
$$
H_t(\omega) = \sum_{k=0}^{n-1} \xi_k(\omega)\, \mathbf{1}_{(t_k, t_{k+1}]}(t).
$$
The measurability of $\xi_k$ with respect to $\mathcal{F}_{t_k}$ is the non-anticipation requirement. For such $H$ the Itô integral is defined pathwise by the finite sum
$$
\int_0^T H_s\, dB_s := \sum_{k=0}^{n-1} \xi_k\,(B_{t_{k+1}} - B_{t_k}),
$$
evaluating each integrand at the left endpoint of its block. The left-endpoint choice is the convention that makes the construction a martingale; the right-endpoint or midpoint choices give different integrals, the latter being the Stratonovich integral of 02.15.05.
The defining estimate is the Itô isometry: for simple predictable $H$ with $\mathbb{E}\int_0^T H_s^2\, ds < \infty$,
$$
\mathbb{E}\!\left[\left(\int_0^T H_s\, dB_s\right)^{\!2}\right] = \mathbb{E}\!\int_0^T H_s^2\, ds.
$$
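This identity says the map $H \mapsto \int_0^T H_s\, dB_s$ sends the inner-product space of simple integrands isometrically into $L^2(\Omega)$. The class $\mathcal{H}^2$ of progressively measurable integrands $H$ with $\mathbb{E}\int_0^T H_s^2\, ds < \infty$ is the completion of the simple integrands under the norm $\|H\|_{\mathcal{H}^2} = \big(\mathbb{E}\int_0^T H_s^2\, ds\big)^{1/2}$; the isometry extends the integral uniquely and continuously to all of $\mathcal{H}^2$, since a complete target space ($L^2(\Omega)$, by $L^p$ completeness from 02.07.06) receives the limit of any Cauchy sequence of integrals.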
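A Monte Carlo check makes the isometry concrete. The sketch below assumes the integrand is the path itself frozen at left endpoints on a uniform grid, so the right-hand side is the deterministic sum of $t_k\,\Delta t$; the function name, seed, and sample sizes are illustrative assumptions.

```python
import math
import random

def left_point_integral(n_steps, horizon, rng):
    """One sample of sum_k B_{t_k} (B_{t_{k+1}} - B_{t_k}) on a uniform grid."""
    dt = horizon / n_steps
    b, total = 0.0, 0.0
    for _ in range(n_steps):
        db = rng.gauss(0.0, math.sqrt(dt))
        total += b * db  # the holding b is fixed before the increment db is drawn
        b += db
    return total

rng = random.Random(1)
T, n, n_paths = 1.0, 50, 20_000
samples = [left_point_integral(n, T, rng) for _ in range(n_paths)]

mean = sum(samples) / n_paths                        # estimate of E[integral]
lhs = sum(x * x for x in samples) / n_paths          # estimate of E[(integral)^2]
rhs = sum((k * T / n) * (T / n) for k in range(n))   # sum_k E[B_{t_k}^2] * dt = sum_k t_k * dt

print(mean, lhs, rhs)
```

The two sides should agree up to Monte Carlo noise, and the sample mean should sit near zero, reflecting the martingale property of the left-endpoint sums.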
The quadratic variation of a continuous process $X$ is the limit in probability $$ [X]_t = \lim_{|\Pi| \to 0} \sum_{k} (X_{t_{k+1}} - X_{t_k})^2 $$ over partitions $\Pi$ of $[0, t]$ with mesh $|\Pi|$; for Brownian motion $[B]_t = t$. The covariation of two processes is the polarisation $[X, Y]_t = \tfrac14\big([X + Y]_t - [X - Y]_t\big)$, so $[X, X]_t = [X]_t$. The heuristic $(dB_t)^2 = dt$, $dB_t\, dt = 0$, $(dt)^2 = 0$ encodes these facts and drives every Itô computation below.
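The contrast between a Brownian path and a smooth path shows up directly when the partition is refined. This is a minimal sketch, assuming one finely sampled Brownian path that is subsampled at coarser strides and a sine curve as the smooth comparison; the helper `squared_increment_sum` and all numerical parameters are illustrative.

```python
import math
import random

def squared_increment_sum(path, stride):
    """Sum of squared increments of `path` sampled every `stride` grid points."""
    pts = path[::stride]
    return sum((b - a) ** 2 for a, b in zip(pts, pts[1:]))

rng = random.Random(2)
T, n_fine = 1.0, 2 ** 16
dt = T / n_fine

brownian = [0.0]
for _ in range(n_fine):
    brownian.append(brownian[-1] + rng.gauss(0.0, math.sqrt(dt)))
smooth = [math.sin(2 * math.pi * k * dt) for k in range(n_fine + 1)]

# Refine the partition: number of blocks, then the two squared-increment sums.
for stride in (2 ** 10, 2 ** 5, 1):
    print(n_fine // stride,
          squared_increment_sum(brownian, stride),
          squared_increment_sum(smooth, stride))

qv_brownian = squared_increment_sum(brownian, 1)
qv_smooth = squared_increment_sum(smooth, 1)
```

Under refinement the Brownian column should hover around $t = 1$ while the smooth column shrinks toward zero in proportion to the mesh.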
Counterexamples to common slips
- Evaluating the integrand at the right endpoint instead of the left destroys the martingale property and shifts the answer by the quadratic variation; this is not the Itô integral.
- The quadratic variation is not zero. For an ordinary differentiable path the analogous limit is zero, so importing the smooth-path intuition that squared increments are negligible is the standard error.
- The Itô isometry requires $H$ to be non-anticipating. For an integrand that peeks ahead, the cross terms $\mathbb{E}[\xi_j \xi_k\, \Delta_j B\, \Delta_k B]$ with $j \ne k$ need not vanish and the isometry fails.
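The first slip can be seen on a single simulated path. The sketch below assumes the integrand is the path itself and compares left-endpoint, right-endpoint, and endpoint-average sums on a uniform grid; the endpoint average is used here as a simple stand-in for the midpoint convention, and the seed and grid size are illustrative.

```python
import math
import random

rng = random.Random(3)
T, n = 1.0, 50_000
dt = T / n
B = [0.0]
for _ in range(n):
    B.append(B[-1] + rng.gauss(0.0, math.sqrt(dt)))

increments = [B[k + 1] - B[k] for k in range(n)]
left = sum(B[k] * increments[k] for k in range(n))       # Ito convention
right = sum(B[k + 1] * increments[k] for k in range(n))  # anticipating convention
average = 0.5 * (left + right)                           # average of the two endpoints
qv = sum(d * d for d in increments)                      # discrete quadratic variation

print(right - left, qv)            # identical up to rounding
print(average, 0.5 * B[-1] ** 2)   # ordinary chain rule value, up to rounding
print(left, 0.5 * B[-1] ** 2 - 0.5 * T)
```

The gap between the right and left sums is exactly the discrete quadratic variation, and the endpoint average telescopes to the value the ordinary chain rule would give.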
Key theorem with proof Intermediate+
Theorem (Itô's formula, one-dimensional). Let $f \in C^2(\mathbb{R})$ and let $B$ be Brownian motion. Then for every $t \ge 0$, almost surely, $$ f(B_t) = f(B_0) + \int_0^t f'(B_s)\, dB_s + \frac{1}{2}\int_0^t f''(B_s)\, ds. $$
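Proof. Fix $t > 0$ and a partition $0 = t_0 < t_1 < \dots < t_n = t$ of $[0, t]$ with mesh tending to zero. A second-order Taylor expansion of $f$ between consecutive points gives, for each $k$, $$ f(B_{t_{k+1}}) - f(B_{t_k}) = f'(B_{t_k})\,\Delta_k B + \tfrac12 f''(B_{t_k})\,(\Delta_k B)^2 + R_k, $$ where $\Delta_k B = B_{t_{k+1}} - B_{t_k}$ and the remainder satisfies $|R_k| \le \tfrac12\, \omega(|\Delta_k B|)\,(\Delta_k B)^2$ with $\omega$ the modulus of continuity of $f''$ on the (almost surely bounded) range of $B$ over $[0, t]$. Summing over $k$ telescopes the left side to $f(B_t) - f(B_0)$.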
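The first sum is the Itô-integral approximation of $\int_0^t f'(B_s)\, dB_s$; since $f'(B)$ is continuous and adapted, hence in $\mathcal{H}^2$ on $[0, t]$ once $f'$ is bounded (the general case follows by stopping $B$ when it leaves a large interval), this converges in $L^2$ to that integral as the mesh shrinks.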
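For the second sum, replace $(\Delta_k B)^2$ by $\Delta_k t = t_{k+1} - t_k$ and control the difference. Write $$ \sum_k f''(B_{t_k})(\Delta_k B)^2 = \sum_k f''(B_{t_k})\,\Delta_k t + \sum_k f''(B_{t_k})\big((\Delta_k B)^2 - \Delta_k t\big). $$ The first piece is a Riemann sum converging almost surely to $\int_0^t f''(B_s)\, ds$ by continuity of $s \mapsto f''(B_s)$. For the second piece, the terms are conditionally centred, $\mathbb{E}\big[(\Delta_k B)^2 - \Delta_k t \mid \mathcal{F}_{t_k}\big] = 0$, and conditionally uncorrelated across blocks, so the second moment of the sum is $\sum_k \mathbb{E}\big[f''(B_{t_k})^2\big]\, 2(\Delta_k t)^2$, using $\mathbb{E}\big[((\Delta_k B)^2 - \Delta_k t)^2\big] = 2(\Delta_k t)^2$ for a Gaussian increment. This is bounded by $2\,\|f''\|_\infty^2\, |\Pi|\, t$ when $f''$ is bounded, with $|\Pi|$ the mesh of the partition. Hence the second piece vanishes in $L^2$.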
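Finally $\sum_k R_k \to 0$ in probability, because the modulus of continuity $\omega(|\Delta_k B|) \to 0$ uniformly in $k$ (the path is uniformly continuous on $[0, t]$) while $\sum_k (\Delta_k B)^2$ stays bounded in probability. Collecting the limits gives the stated identity, with the factor $\tfrac12$ on the second integral coming directly from the second-order Taylor coefficient.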
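The partition argument can be replayed on one simulated path. The following is a minimal sketch, assuming $f = \sin$ as a test function and a uniform grid; the variable names, the seed, and the grid size are illustrative, and the discrete sums only approximate the two integrals in the theorem.

```python
import math
import random

rng = random.Random(4)
t, n = 1.0, 20_000
dt = t / n

# Test function and its first two derivatives (an illustrative choice).
f = math.sin
df = math.cos
d2f = lambda x: -math.sin(x)

b = 0.0
ito_sum = 0.0    # sum_k f'(B_{t_k}) * Delta_k B
drift_sum = 0.0  # sum_k f''(B_{t_k}) * Delta_k t
for _ in range(n):
    db = rng.gauss(0.0, math.sqrt(dt))
    ito_sum += df(b) * db
    drift_sum += d2f(b) * dt
    b += db

lhs = f(b) - f(0.0)                          # f(B_t) - f(B_0)
with_correction = ito_sum + 0.5 * drift_sum  # right side of Ito's formula
without_correction = ito_sum                 # what the ordinary chain rule would give

print(lhs, with_correction, without_correction)
```

On a fine grid the corrected sum should track $f(B_t) - f(B_0)$ closely, while the uncorrected sum is off by roughly half the drift sum.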
Bridge. Itô's formula builds toward the entire theory of stochastic differential equations and their links to partial differential equations, and the same correction term appears again in the multidimensional and time-dependent versions below. The foundational reason for the extra $\tfrac12 \int_0^t f''(B_s)\, ds$ is that quadratic variation does not vanish: $[B]_t = t$ forces the second-order Taylor term to survive in the limit, which is exactly the heuristic $(dB_t)^2 = dt$ made into a theorem. This is the mechanism by which the worked example produced $\tfrac12 B_T^2 - \tfrac12 T$ rather than $\tfrac12 B_T^2$, and it generalises the ordinary chain rule, to which it reduces whenever the integrating path has zero quadratic variation. Putting these together, the bridge is that Itô's formula converts the analytic operator $\tfrac12 \tfrac{d^2}{dx^2}$ into the generator of Brownian motion, so that harmonic and heat-equation theory transfer wholesale to the probabilistic setting. This is the central insight that organises stochastic analysis, and it reappears in 02.15.03 as the link between SDEs and second-order PDEs.
Advanced results Master
The Itô integral against Brownian motion is a continuous $L^2$-bounded martingale: for $H \in \mathcal{H}^2$ (progressively measurable with $\mathbb{E}\int_0^T H_s^2\, ds < \infty$) the process $I_t = \int_0^t H_s\, dB_s$, $0 \le t \le T$, admits a continuous modification, is a martingale with $I_0 = 0$, and has quadratic variation $[I]_t = \int_0^t H_s^2\, ds$. The isometry is the case $t = T$ of $\mathbb{E}[I_t^2] = \mathbb{E}[I]_t$, itself a consequence of the fact that $I^2 - [I]$ is a martingale. This identification of the integral's quadratic variation with the time-integral of the squared integrand is the foundational reason the whole calculus closes on itself: differentiating an Itô integral returns its integrand, and squaring returns the $dt$-clock that drives every correction term.
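Itô's formula extends to continuous semimartingales. For an Itô process $dX_t = \mu_t\, dt + \sigma_t\, dB_t$ with $\mu, \sigma$ progressively measurable and locally square-integrable, and $f \in C^{1,2}([0,\infty) \times \mathbb{R})$, $$ df(t, X_t) = \partial_t f\, dt + \partial_x f\, dX_t + \tfrac12 \partial_{xx} f\, d[X]_t, \qquad d[X]_t = \sigma_t^2\, dt. $$ The multidimensional version, for $X_t \in \mathbb{R}^d$ driven by an $m$-dimensional Brownian motion through $dX_t = \mu_t\, dt + \sigma_t\, dB_t$ with $\sigma_t \in \mathbb{R}^{d \times m}$, reads $$ df(t, X_t) = \partial_t f\, dt + \nabla f \cdot dX_t + \tfrac12 \operatorname{tr}\!\big(\sigma_t \sigma_t^{\mathsf T} D^2 f\big)\, dt, $$ with $D^2 f$ the Hessian. The trace term is the contraction of the Hessian against the diffusion matrix $\sigma_t \sigma_t^{\mathsf T}$; it is the multidimensional avatar of the $\tfrac12 f''$ correction and identifies $\mathcal{L} = \mu \cdot \nabla + \tfrac12 \operatorname{tr}\!\big(\sigma \sigma^{\mathsf T} D^2\big)$ as the generator of the diffusion $X$.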
The exponential martingale is the universal example. For an adapted $\theta$ with $\int_0^T \theta_s^2\, ds < \infty$ almost surely, the Doléans-Dade exponential $$ \mathcal{E}(\theta)_t = \exp\!\Big(\int_0^t \theta_s\, dB_s - \tfrac12 \int_0^t \theta_s^2\, ds\Big) $$ satisfies $d\mathcal{E}(\theta)_t = \mathcal{E}(\theta)_t\, \theta_t\, dB_t$, so it is a local martingale; under Novikov's condition it is a true martingale, and it is the density process of the Girsanov change of measure that removes drift. The $-\tfrac12 \int_0^t \theta_s^2\, ds$ in the exponent is precisely the Itô correction, the same term that appeared in geometric Brownian motion.
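The role of the correction in the exponent can be illustrated in the simplest case. The sketch below assumes a constant $\theta$, for which $\int_0^T \theta\, dB_s = \theta B_T$ exactly and no time discretisation is needed; the value of $\theta$, the seed, and the sample size are illustrative assumptions.

```python
import math
import random

rng = random.Random(5)
theta, T, n_paths = 0.5, 1.0, 40_000

with_correction = 0.0     # running mean of exp(theta * B_T - theta^2 * T / 2)
without_correction = 0.0  # running mean of exp(theta * B_T)
for _ in range(n_paths):
    b_T = rng.gauss(0.0, math.sqrt(T))  # B_T ~ N(0, T)
    with_correction += math.exp(theta * b_T - 0.5 * theta ** 2 * T)
    without_correction += math.exp(theta * b_T)
with_correction /= n_paths
without_correction /= n_paths

print(with_correction)                                      # close to 1
print(without_correction, math.exp(0.5 * theta ** 2 * T))   # drifts upward without the correction
```

With the correction the sample mean stays near one, as a martingale started at one must; without it the mean grows like $e^{\theta^2 T / 2}$.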
The Burkholder-Davis-Gundy inequalities control the running maximum of a continuous local martingale $M$ with $M_0 = 0$ by its quadratic variation: for every $0 < p < \infty$ there are universal constants $0 < c_p \le C_p < \infty$ with $$ c_p\, \mathbb{E}\big[[M]_T^{\,p/2}\big] \le \mathbb{E}\Big[\sup_{0 \le t \le T} |M_t|^p\Big] \le C_p\, \mathbb{E}\big[[M]_T^{\,p/2}\big]. $$ For $p = 2$ the upper bound is Doob's inequality combined with the isometry; the general case is the quantitative backbone of $L^p$ estimates for stochastic integrals and of existence-uniqueness theory for SDEs.
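For $p = 2$ and $M = B$ the two-sided bound can be checked by simulation, since $[B]_T = T$ and Doob's inequality gives the upper constant $4$. This is a minimal sketch on a uniform grid, so the supremum is only taken over grid times; the seed and sample sizes are illustrative.

```python
import math
import random

rng = random.Random(6)
T, n, n_paths = 1.0, 200, 4_000
dt = T / n

sup_sq_mean = 0.0  # estimate of E[sup_t |B_t|^2] over grid times
end_sq_mean = 0.0  # estimate of E[B_T^2] = E[[B]_T] = T
for _ in range(n_paths):
    b, running_max = 0.0, 0.0
    for _ in range(n):
        b += rng.gauss(0.0, math.sqrt(dt))
        running_max = max(running_max, abs(b))
    sup_sq_mean += running_max ** 2
    end_sq_mean += b ** 2
sup_sq_mean /= n_paths
end_sq_mean /= n_paths

# Expected ordering: E[B_T^2] <= E[sup |B_t|^2] <= 4 * T.
print(end_sq_mean, sup_sq_mean, 4.0 * T)
```

The lower comparison holds path by path, because the running maximum dominates the terminal value; the upper comparison is the Doob bound.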
Synthesis. The foundational reason Itô calculus exists as a closed system is the identity $\big[\int H\, dB\big]_t = \int_0^t H_s^2\, ds$: it says the quadratic variation of a stochastic integral is the time-integral of its squared integrand, and this is exactly the bookkeeping that turns the heuristic $(dB_t)^2 = dt$ into operative calculus. Putting these together, the one-dimensional correction $\tfrac12 f''\, dt$, the multidimensional trace term $\tfrac12 \operatorname{tr}\!\big(\sigma \sigma^{\mathsf T} D^2 f\big)\, dt$, and the exponential-martingale drift $-\tfrac12 \theta^2\, dt$ are one phenomenon wearing three costumes: each is the second-order Taylor term that survives because quadratic variation does not vanish. This is the central insight that the generator of a diffusion is a second-order operator, which generalises the ordinary chain rule and is dual to the forward Kolmogorov (Fokker-Planck) evolution of the law of $X_t$. The Burkholder-Davis-Gundy inequalities then make the quadratic variation the universal yardstick: control of $[M]_T$ controls every $L^p$ norm of the path, and this is the bridge from the algebra of the differential rules to the analysis of existence, uniqueness, and convergence for stochastic differential equations in 02.15.03.
Full proof set Master
The one-dimensional Itô formula and its partition argument are proved in full in the Key theorem section. The remaining Master claims are recorded here.
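Proposition (Itô isometry and the extension). For simple predictable $H$, $\mathbb{E}\big[\big(\int_0^T H_s\, dB_s\big)^2\big] = \mathbb{E}\int_0^T H_s^2\, ds$. Consequently the integral extends uniquely to a linear isometry $\mathcal{H}^2 \to L^2(\Omega)$, where $\mathcal{H}^2$ is the class of progressively measurable $H$ with $\mathbb{E}\int_0^T H_s^2\, ds < \infty$.
Proof. Write $\int_0^T H_s\, dB_s = \sum_k \xi_k\, \Delta_k B$ with $\Delta_k B = B_{t_{k+1}} - B_{t_k}$. Expanding the square,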
$$
\mathbb{E}\Big[\Big(\sum_k \xi_k\, \Delta_k B\Big)^2\Big] = \sum_{j,k} \mathbb{E}[\xi_j \xi_k\, \Delta_j B\, \Delta_k B].
$$
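For $j < k$, condition on $\mathcal{F}_{t_k}$: the factor $\xi_j \xi_k\, \Delta_j B$ is $\mathcal{F}_{t_k}$-measurable and $\Delta_k B$ is independent of $\mathcal{F}_{t_k}$ with mean zero, so $\mathbb{E}[\xi_j \xi_k\, \Delta_j B\, \Delta_k B] = 0$; the same holds for $j > k$ by symmetry. The diagonal terms give $\mathbb{E}[\xi_k^2 (\Delta_k B)^2] = \mathbb{E}[\xi_k^2]\,(t_{k+1} - t_k)$, again by independence of the increment from $\mathcal{F}_{t_k}$. Summing, $\mathbb{E}\big[\big(\sum_k \xi_k\, \Delta_k B\big)^2\big] = \sum_k \mathbb{E}[\xi_k^2]\,(t_{k+1} - t_k) = \mathbb{E}\int_0^T H_s^2\, ds$. The isometry holds. Since the simple integrands are dense in $\mathcal{H}^2$ and $L^2(\Omega)$ is complete by 02.07.06, the isometric map sends Cauchy sequences to Cauchy sequences and extends uniquely and continuously to the completion.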
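Proposition (the integral is a continuous martingale). For $H \in \mathcal{H}^2$ the process $I_t = \int_0^t H_s\, dB_s$ is a martingale with respect to $(\mathcal{F}_t)$ and admits a continuous modification.
Proof. For simple $H$ and $s < t$, the increment $I_t - I_s$ is a sum of terms $\xi_k\, \Delta_k B$ over blocks past $s$ (after adding $s$ to the partition); each has $\mathbb{E}[\xi_k\, \Delta_k B \mid \mathcal{F}_s] = 0$ by the tower property and mean-zero independent increments, so $\mathbb{E}[I_t - I_s \mid \mathcal{F}_s] = 0$. The martingale property passes to the limit because conditional expectation is an $L^2$-contraction. Continuity of the simple-integrand process is plain (on each block it is a constant plus a multiple of the continuous path $B$); for general $H$, take simple $H^n \to H$ in $\mathcal{H}^2$ with integrals $I^n$, and Doob's maximal inequality gives $\mathbb{E}\big[\sup_{t \le T} |I^n_t - I^m_t|^2\big] \le 4\, \mathbb{E}\int_0^T (H^n_s - H^m_s)^2\, ds$, so a subsequence converges uniformly almost surely, and the uniform limit of continuous paths is continuous.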
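Proposition (quadratic variation of the integral). For $H \in \mathcal{H}^2$ and $I_t = \int_0^t H_s\, dB_s$, the process $I_t^2 - \int_0^t H_s^2\, ds$ is a martingale; equivalently $[I]_t = \int_0^t H_s^2\, ds$.
Proof. It suffices to show $\mathbb{E}\big[(I_t - I_s)^2 \mid \mathcal{F}_s\big] = \mathbb{E}\big[\int_s^t H_u^2\, du \mid \mathcal{F}_s\big]$ for $s < t$, since then $I_t^2 - I_s^2 = (I_t - I_s)^2 + 2 I_s (I_t - I_s)$ has $\mathcal{F}_s$-conditional expectation $\mathbb{E}\big[\int_s^t H_u^2\, du \mid \mathcal{F}_s\big]$, the martingale identity (the cross term $2 I_s (I_t - I_s)$ vanishes by the martingale property of $I$). The displayed equality is the Itô isometry applied to the integral of $H \mathbf{1}_{(s,t]}$ against $B$ conditioned on $\mathcal{F}_s$, which holds blockwise for simple integrands by the diagonal computation of the first proposition and extends by the $L^2$-limit. By the characterisation of quadratic variation as the unique continuous increasing process $A$ with $A_0 = 0$ and $I^2 - A$ a martingale, $[I]_t = \int_0^t H_s^2\, ds$.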
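The identity in this proposition can be seen on a single path once the integral is discretised. The sketch below assumes the integrand is the path itself, so each increment of the discretised integral is $B_{t_k}\, \Delta_k B$; the seed and grid size are illustrative, and both sums are only approximations of the continuous-time objects.

```python
import math
import random

rng = random.Random(7)
T, n = 1.0, 50_000
dt = T / n

b = 0.0
qv_of_integral = 0.0  # sum_k (B_{t_k} * Delta_k B)^2, squared increments of the integral
time_integral = 0.0   # sum_k B_{t_k}^2 * Delta_k t, Riemann sum for int_0^T H_s^2 ds
for _ in range(n):
    db = rng.gauss(0.0, math.sqrt(dt))
    qv_of_integral += (b * db) ** 2
    time_integral += b * b * dt
    b += db

print(qv_of_integral, time_integral)
```

The two numbers should be close on a fine grid, even though each is random: the squared increments of the integral accumulate at rate $H_s^2$ per unit time.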
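Proposition (multidimensional Itô formula). For $dX_t = \mu_t\, dt + \sigma_t\, dB_t$ with $\sigma_t \in \mathbb{R}^{d \times m}$, $B$ an $m$-dimensional Brownian motion, and $f \in C^{1,2}$, $df(t, X_t) = \partial_t f\, dt + \nabla f \cdot dX_t + \tfrac12 \operatorname{tr}\!\big(\sigma_t \sigma_t^{\mathsf T} D^2 f\big)\, dt$.
Proof. The one-dimensional partition argument generalises componentwise. The second-order Taylor expansion of $f$ now carries cross terms $\tfrac12\, \partial_{ij} f\, \Delta X^i\, \Delta X^j$. The covariation of the components is $d[X^i, X^j]_t = (\sigma_t \sigma_t^{\mathsf T})_{ij}\, dt$, because $d[B^k, B^l]_t = \delta_{kl}\, dt$ for independent Brownian coordinates and $dX^i_t = \mu^i_t\, dt + \sum_k \sigma^{ik}_t\, dB^k_t$, so $d[X^i, X^j]_t = \sum_k \sigma^{ik}_t \sigma^{jk}_t\, dt$; products involving $dt$ contribute nothing. Summing the Taylor terms over a shrinking partition, the first-order part assembles to $\nabla f \cdot dX_t$, the diagonal-and-cross second-order part to $\tfrac12 \sum_{i,j} (\sigma_t \sigma_t^{\mathsf T})_{ij}\, \partial_{ij} f\, dt = \tfrac12 \operatorname{tr}\!\big(\sigma_t \sigma_t^{\mathsf T} D^2 f\big)\, dt$, and the explicit time-dependence to $\partial_t f\, dt$. The remainder estimate from the one-dimensional case applies coordinatewise.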
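The covariation step of this proof can be checked numerically. The following is a minimal sketch, assuming zero drift and a constant, hypothetical $2 \times 2$ coefficient `sigma`, so that the summed products of increments should approach $\sigma \sigma^{\mathsf T}\, T$; the seed and grid size are illustrative.

```python
import math
import random

rng = random.Random(8)
T, n = 1.0, 50_000
dt = T / n
sigma = [[1.0, 0.5],
         [0.0, 2.0]]  # hypothetical constant diffusion coefficient

cov = [[0.0, 0.0], [0.0, 0.0]]  # running sums of Delta X^i * Delta X^j
for _ in range(n):
    # Independent Brownian increments, then Delta X = sigma * Delta B (no drift).
    db = [rng.gauss(0.0, math.sqrt(dt)) for _ in range(2)]
    dx = [sum(sigma[i][k] * db[k] for k in range(2)) for i in range(2)]
    for i in range(2):
        for j in range(2):
            cov[i][j] += dx[i] * dx[j]

# Entry (i, j) of sigma * sigma^T is sum_k sigma[i][k] * sigma[j][k].
sigma_sigma_t = [[sum(sigma[i][k] * sigma[j][k] for k in range(2)) for j in range(2)]
                 for i in range(2)]

print(cov)
print(sigma_sigma_t)
```

The off-diagonal entries are nonzero here only because `sigma` mixes the two independent Brownian coordinates; with the identity matrix they would be close to zero.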
Connections Master
Brownian motion 02.15.01 is the integrator on which this entire construction rests. The independence of increments and the Gaussian scaling established there are exactly what make the Itô isometry hold and what give $[B]_t = t$; without the non-anticipating filtration and the mean-zero independent increments of that unit, neither the martingale property of the integral nor the surviving correction term would be available.
The Lebesgue integral and monotone convergence 02.07.04 supply the integration theory used pathwise and in expectation throughout: the $ds$-integrals are ordinary Lebesgue integrals along each path, and the interchange of limit and expectation in the isometry extension and in the martingale-convergence steps rests on the convergence theorems proved there.
$L^p$ spaces and completeness 02.07.06 provide the target space for the central extension. The Itô integral is defined as the unique continuous extension of an isometry from simple integrands into $L^2(\Omega)$, and that extension exists precisely because $L^2(\Omega)$ is complete; the Burkholder-Davis-Gundy inequalities then live in the $L^p$ scale built in that unit.
Stochastic differential equations 02.15.03 are the immediate downstream consumer. Itô's formula is the change-of-variables rule that lets one solve SDEs in closed form (geometric Brownian motion, the exponential martingale) and is the analytic engine behind the Feynman-Kac and Kolmogorov correspondences that tie diffusion processes to second-order parabolic PDEs; the existence-uniqueness theory there runs on the isometry and the Burkholder-Davis-Gundy bounds proved here.
The Stratonovich integral 02.15.05 is the alternative construction that this unit pointedly does not use. Evaluating the integrand at the midpoint rather than the left endpoint restores the ordinary chain rule at the cost of the martingale property; the difference between the two integrals is exactly $\tfrac12 [H, B]_T$, the same quadratic-covariation term that produces the Itô correction, so the contrast between the two conventions is a direct corollary of the calculus developed here.
Historical & philosophical context Master
The stochastic integral was introduced by Kiyosi Itô in 1944 (Stochastic Integral, Proc. Imperial Academy Tokyo 20, 519–524) [Itô 1944], building on Norbert Wiener's 1923 rigorous construction of Brownian motion and on Paul Lévy's structural study of its paths. Itô's decisive move was to define the integral for non-anticipating integrands and to prove the change-of-variables formula with its second-order correction term, the result now universally called Itô's formula; Wolfgang Doeblin had independently arrived at closely related ideas in a sealed note deposited with the Académie des Sciences in 1940 and opened only in 2000, so the lemma is sometimes called the Itô-Doeblin formula. The modern measure-theoretic treatment via the isometry and the martingale property is the synthesis of Joseph Doob's martingale theory with Itô's construction; the textbook accounts of Karatzas and Shreve and of Revuz and Yor [Revuz 1999] codify the semimartingale generality, and Philip Protter's functional-analytic development takes the integral itself as the primitive object.
The conceptual content is that a calculus can be built on a path of infinite total variation provided one fixes the order in which information is revealed. The quadratic variation, which vanishes for every classically differentiable path, becomes the carrier of the new structure: it is the clock against which the second-order term is measured, and the choice of evaluation point (left endpoint for Itô, midpoint for Stratonovich) is the choice of which symmetry to preserve — the martingale property or the ordinary chain rule. Itô's framework gave probability theory its own differential calculus, and through the Feynman-Kac correspondence it supplied a probabilistic representation for solutions of second-order parabolic partial differential equations, a bridge that has organised diffusion theory ever since (Itô 1951, On Stochastic Differential Equations, Memoirs Amer. Math. Soc. 4).
Bibliography Master
@article{ito1944,
author = {It\^o, Kiyosi},
title = {Stochastic Integral},
journal = {Proceedings of the Imperial Academy (Tokyo)},
volume = {20},
number = {8},
pages = {519--524},
year = {1944}
}
@book{karatzas1991,
author = {Karatzas, Ioannis and Shreve, Steven E.},
title = {Brownian Motion and Stochastic Calculus},
series = {Graduate Texts in Mathematics},
volume = {113},
edition = {2nd},
publisher = {Springer-Verlag, New York},
year = {1991}
}
@book{revuzyor1999,
author = {Revuz, Daniel and Yor, Marc},
title = {Continuous Martingales and Brownian Motion},
series = {Grundlehren der mathematischen Wissenschaften},
volume = {293},
edition = {3rd},
publisher = {Springer-Verlag, Berlin},
year = {1999}
}
@book{oksendal2003,
author = {{\O}ksendal, Bernt},
title = {Stochastic Differential Equations: An Introduction with Applications},
edition = {6th},
publisher = {Springer-Verlag, Berlin},
year = {2003}
}
@book{protter2005,
author = {Protter, Philip E.},
title = {Stochastic Integration and Differential Equations},
series = {Stochastic Modelling and Applied Probability},
volume = {21},
edition = {2nd},
publisher = {Springer-Verlag, Berlin},
year = {2005}
}
@article{ito1951sde,
author = {It\^o, Kiyosi},
title = {On Stochastic Differential Equations},
journal = {Memoirs of the American Mathematical Society},
volume = {4},
pages = {1--51},
year = {1951}
}