02.13.03 · analysis / pde

Heat Equation, Heat Kernel, and Duhamel's Principle

shipped · 3 tiers · Lean: none

Anchor (Master): Evans §2.3; John §7; Friedman, Partial Differential Equations of Parabolic Type (Prentice-Hall 1964); Lieberman, Second Order Parabolic Differential Equations (World Scientific 1996); Stroock-Varadhan, Multidimensional Diffusion Processes (Springer 1979)

Intuition Beginner

The heat equation is the partial differential equation that describes how temperature evens out over time. Drop a hot coin into a metal block and the block warms near the coin, then warms further away, then warms more and more uniformly until the whole block sits at a single intermediate temperature. The heat equation is the local rule that produces this global behaviour.

At every point in space and at every instant in time, the rate at which the temperature changes equals a positive constant (the thermal diffusivity) times the Laplacian of the temperature field. The Laplacian measures how much a function bows above or below the average of its surroundings; a positive Laplacian means the point is colder than its neighbours, so heat flows in and the temperature rises.

The same equation governs an enormous range of physical processes. Salt diffusing through water, ink spreading in a glass, smoke dissipating in a still room, oxygen diffusing across a cell membrane, voltage smoothing across a passive electrical cable, even the probability density of a Brownian particle: all of these obey a heat-equation-style law. The unifying picture is diffusion, the gradual smoothing of a quantity that is locally conserved but whose flux moves from high concentration toward low concentration in proportion to the local gradient. The heat equation is the canonical example of a parabolic PDE, the class of equations describing irreversible time-evolution toward equilibrium.

Two qualitative properties make the heat equation stand apart from the wave equation and the Laplace equation. First, it smooths. If the initial temperature distribution has a jagged spike, even an honest discontinuity, the temperature at any later positive time is infinitely smooth in space. Diffusion is a relentless smoothing operator: high-frequency fluctuations decay first and fastest, low-frequency fluctuations decay last. After a millisecond the spike has rounded into a bump; after a second the bump has spread into a gentle hump; after a minute the hump is barely a ripple above the background.

Second, the heat equation has infinite propagation speed. A point heated at one location instantly raises the temperature, by an exponentially small amount, at every other point in the universe. This is mathematical fact and physical fiction; the heat equation is an excellent model on intermediate scales but breaks down at the relativistic limit, where finite-speed corrections are needed. Engineers and physicists use the heat equation knowing it is an idealisation; the breakdown matters only when the heated region is so small that mean-free-path effects (and ultimately the speed of light) intrude.

A third property, beyond these two, is the one that converts the heat equation into a powerful computational tool: a single explicit function does all the work. The heat kernel is the function that records the temperature distribution that develops from a unit of heat dropped at a single point at a single instant. It is a Gaussian bump centred at the source location, broader and shorter as time advances, whose total area is conserved (this is conservation of energy: the total heat stays fixed, only its spatial distribution changes).

Once we know the heat kernel, we can solve the heat equation for any initial distribution by adding up the contributions of each initial point, weighted by the heat kernel that has spread out from that point over the elapsed time. The recipe is the same convolution idea that solved the Poisson equation, now adapted to the time-dependent setting.

The non-homogeneous version of the equation, with a heat source that varies in space and time, is solved by Duhamel's principle. The idea is simple and physical: a continuously acting heat source is equivalent to an infinite stream of instantaneous sources fired one after another. Each instantaneous source produces a heat-kernel-shaped contribution that subsequently spreads and decays; the total temperature is the integral over time of these contributions. Duhamel's principle is the parabolic version of variation-of-parameters from ordinary differential equations, the universal trick for converting a homogeneous solution operator into a non-homogeneous one.

The takeaway in a single sentence: the heat equation describes diffusion, the heat kernel is its fundamental solution, convolution with the heat kernel solves the initial-value problem, and Duhamel's principle solves the inhomogeneous problem.

Visual Beginner

Picture a long thin metal rod heated at one point at one instant. Just after the heat is delivered, the temperature is a tall narrow spike at the source. A moment later the spike has melted into a slightly broader and slightly shorter bump. A bit later the bump is broader and shorter still. The shape stays Gaussian throughout, a smooth bell curve, but the width grows in proportion to the square root of the elapsed time and the height shrinks correspondingly so that the area under the curve stays fixed at one unit of heat.

This single picture, the spreading Gaussian, is the heat kernel. It is the response of the rod to a unit of heat dropped at one point at one instant. Every other heat-equation solution on the rod is built from this one picture by superposition: stack up shifted and rescaled copies of the heat kernel, one for each initial point, weighted by the initial temperature there, and the sum is the solution.

The same picture extends to two or three dimensions. The heat kernel in three-dimensional space is a three-dimensional Gaussian, a fuzzy ball of concentrated temperature that spreads radially outward over time. The width of the ball grows as the square root of time, and the peak height drops in proportion so that the total heat is conserved. Adding up copies of the three-dimensional heat kernel, weighted by initial temperatures, gives the temperature field for any initial heat distribution in three dimensions.

The visual contrast with the wave equation is sharp. A wave on a string preserves the shape of an initial pulse, moving it bodily without distortion (in one space dimension), or at most spreading it in a controlled wavefront with sharp edges (in three space dimensions). The heat equation has no such structure: every initial pulse, no matter its shape, is smooth at any positive time and, when the total heat is finite, relaxes toward a single Gaussian-shaped blob as time goes on. The wave equation conserves the initial regularity; the heat equation destroys it, replacing whatever roughness was there with perfect smoothness.

Worked example Beginner

We solve the heat equation on an infinite metal bar with a simple piecewise-constant initial temperature. The setup: at time zero, the temperature is equal to one degree on the segment from minus one to one, and zero elsewhere. The bar is infinite in both directions. We compute the temperature at the origin at time one (in units where the diffusion constant is one).

Step 1. Set up the heat equation. The temperature $u(x,t)$ satisfies $u_t = u_{xx}$ for every $x$ on the real line and every $t > 0$, with initial condition $u(x,0) = 1$ for $|x| \le 1$ and $u(x,0) = 0$ for $|x| > 1$.

Step 2. The heat kernel in one space dimension is the Gaussian $\Phi(x,t) = \frac{1}{\sqrt{4\pi t}}\, e^{-x^2/(4t)}$. The solution of the initial-value problem at the point $x$ and time $t$ is obtained by adding up (in the continuous-summation sense) the contributions to the temperature at $x$ from each source point $y$. The contribution from source point $y$ is the value $\Phi(x-y,t)$ of the heat kernel, weighted by the initial temperature $u(y,0)$ at that source point. We write this continuous sum in the standard convolution shorthand $u(x,t) = \int_{-\infty}^{\infty} \Phi(x-y,t)\,u(y,0)\,dy$.

Step 3. Use the initial condition. Only source points $y$ with $|y| \le 1$ contribute (the others have zero initial temperature). So $u(x,t)$ equals the continuous-sum of $\Phi(x-y,t)$ as $y$ varies over the interval $[-1,1]$. In standard symbols this is the area under the curve $y \mapsto \Phi(x-y,t)$ between $y = -1$ and $y = 1$.

Step 4. Specialise to $x = 0$ and $t = 1$, and substitute $s = y/2$ (so the area-element factor of $dy = 2\,ds$ appears as a compensating $2$). The target becomes the area under $\frac{1}{\sqrt{\pi}}\, e^{-s^2}$ between $s = -1/2$ and $s = 1/2$.

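Step 5. Recognise the error function. The standard error function $\operatorname{erf}(z)$ is defined as twice the area under $\frac{1}{\sqrt{\pi}}\, e^{-s^2}$ between $s = 0$ and $s = z$. Our target is the area between $-1/2$ and $1/2$, which by symmetry equals $\operatorname{erf}(1/2)$ exactly. So $u(0,1) = \operatorname{erf}(1/2)$. Numerically $\operatorname{erf}(1/2) \approx 0.5205$.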

Step 6. Interpret. At time zero the temperature at the origin was one degree. At time one, the temperature has dropped to about $0.52$ degrees as heat has spread out beyond the initial segment. The bar to the left and to the right of the origin has warmed slightly above zero, drawing heat away from the central region. As $t \to \infty$, $u(x,t) \to 0$ for every fixed $x$, since the total amount of heat is finite (equal to $2$ degree-units) and it spreads over an infinite bar.

What this tells us: the heat kernel converts an initial-data area-computation into an error-function value. The same recipe handles every reasonable initial condition: convolve with the Gaussian, evaluate either symbolically (often in terms of error functions) or numerically (a single Gaussian area). The error function and its close relatives are the workhorses of explicit heat-equation calculations.
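The worked example can be checked numerically. The following is a minimal sketch using only the Python standard library; the function names `heat_kernel_1d` and `box_solution` and the midpoint-rule quadrature are illustrative choices, not part of the text above.

```python
import math

def heat_kernel_1d(x, t):
    """One-dimensional heat kernel for u_t = u_xx (diffusion constant 1)."""
    return math.exp(-x * x / (4.0 * t)) / math.sqrt(4.0 * math.pi * t)

def box_solution(x, t, half_width=1.0, n=2000):
    """Midpoint-rule approximation of the area under y -> kernel(x - y, t)
    over [-half_width, half_width], i.e. the solution for box initial data."""
    h = 2.0 * half_width / n
    total = 0.0
    for i in range(n):
        y = -half_width + (i + 0.5) * h
        total += heat_kernel_1d(x - y, t)
    return total * h

u_numeric = box_solution(0.0, 1.0)   # temperature at the origin at time one
u_exact = math.erf(0.5)              # closed form from Step 5
print(u_numeric, u_exact)
```

Calling `box_solution(x, t)` at other points and times gives the full temperature profile for the same initial data.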

Check your understanding Beginner

Formal definition Intermediate+

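Let $U \subset \mathbb{R}^n$ be an open set and let $T > 0$. Write $U_T = U \times (0,T]$ for the parabolic cylinder. The heat equation is the second-order linear PDE $u_t - \Delta u = 0$, where $u = u(x,t)$ and $\Delta = \sum_{i=1}^n \partial_{x_i}^2$ is the spatial Laplacian. The inhomogeneous heat equation has a forcing term $f = f(x,t)$: $u_t - \Delta u = f$. A classical solution is a function $u \in C^2_1(U_T)$ (twice continuously differentiable in $x$, once in $t$) satisfying the equation pointwise [Evans 2010 §2.3].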

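Cauchy problem. The Cauchy problem (initial-value problem) on the whole space asks for $u : \mathbb{R}^n \times [0,\infty) \to \mathbb{R}$ satisfying $u_t - \Delta u = 0$ on $\mathbb{R}^n \times (0,\infty)$ and $u = g$ on $\mathbb{R}^n \times \{t = 0\}$, where $g$ is the initial datum. The inhomogeneous Cauchy problem replaces the right side of the PDE by a given source $f(x,t)$ and asks for the same initial condition $u(\cdot,0) = g$.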

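Definition (heat kernel). The heat kernel on $\mathbb{R}^n$ is the function $\Phi(x,t) = (4\pi t)^{-n/2}\, e^{-|x|^2/(4t)}$ defined for $x \in \mathbb{R}^n$ and $t > 0$. For each fixed $t > 0$, $\Phi(\cdot,t)$ is a Gaussian probability density on $\mathbb{R}^n$ with mean zero and covariance matrix $2t\,I_n$.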

Properties of the heat kernel. Direct calculation shows:

  1. .
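  2. $\Phi_t - \Delta \Phi = 0$ on $\mathbb{R}^n \times (0,\infty)$. That is, $\Phi$ itself solves the heat equation, away from $t = 0$.
  3. $\int_{\mathbb{R}^n} \Phi(x,t)\,dx = 1$ for every $t > 0$ (total heat is conserved).
  4. As $t \to 0^+$, $\Phi(\cdot,t) \to \delta_0$ in the sense of distributions: for every $g \in C_b(\mathbb{R}^n)$ (continuous bounded), $\int_{\mathbb{R}^n} \Phi(x-y,t)\,g(y)\,dy \to g(x)$.
  5. Semigroup property: $\Phi(\cdot,s) * \Phi(\cdot,t) = \Phi(\cdot,s+t)$ for $s, t > 0$, where $*$ denotes spatial convolution.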
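The unit-mass and semigroup properties can be spot-checked numerically in one space dimension. The sketch below is a hedged illustration using only the standard library; the helper `integrate`, the truncation length `L`, and the sample values of `s`, `t`, `x` are assumptions made for the example.

```python
import math

def phi(x, t):
    """Heat kernel on the real line: (4 pi t)^(-1/2) exp(-x^2 / (4t))."""
    return math.exp(-x * x / (4.0 * t)) / math.sqrt(4.0 * math.pi * t)

def integrate(f, a, b, n=4000):
    """Midpoint rule on [a, b]; adequate here because the integrands are
    smooth and negligibly small at the endpoints."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

L = 40.0  # truncation of the real line; the Gaussian tails beyond are negligible

# Unit mass: the integral of phi(., t) over the line equals 1.
mass = integrate(lambda x: phi(x, 0.7), -L, L)

# Semigroup: (phi(., s) * phi(., t))(x) equals phi(x, s + t).
s, t, x = 0.3, 0.5, 1.2
conv = integrate(lambda y: phi(x - y, s) * phi(y, t), -L, L)
target = phi(x, s + t)
print(mass, conv, target)
```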

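The heat kernel is the fundamental solution of the heat operator $\partial_t - \Delta$, in the sense that (extended by zero to $t \le 0$) it solves the operator equation $(\partial_t - \Delta)\Phi = \delta_{(0,0)}$ on $\mathbb{R}^n \times \mathbb{R}$, where $\delta_{(0,0)}$ is the Dirac mass at the space-time origin.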

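Solution formula (homogeneous Cauchy problem). For $g \in C_b(\mathbb{R}^n)$ (continuous bounded), the function $u(x,t) = \int_{\mathbb{R}^n} \Phi(x-y,t)\,g(y)\,dy$ is $C^\infty$ on $\mathbb{R}^n \times (0,\infty)$, solves the heat equation on $\mathbb{R}^n \times (0,\infty)$, and satisfies $u(x,t) \to g(x_0)$ as $(x,t) \to (x_0,0)$ with $t > 0$ for every continuity point $x_0$ of $g$.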

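Definition (Duhamel formula). For $g \in C_b(\mathbb{R}^n)$ and $f : \mathbb{R}^n \times [0,\infty) \to \mathbb{R}$ with $f$ bounded and suitably regular, the Duhamel formula gives the solution of the inhomogeneous Cauchy problem $u_t - \Delta u = f$ with $u(\cdot,0) = g$ as $u(x,t) = \int_{\mathbb{R}^n} \Phi(x-y,t)\,g(y)\,dy + \int_0^t \int_{\mathbb{R}^n} \Phi(x-y,t-s)\,f(y,s)\,dy\,ds$. The first integral is the homogeneous solution from the initial data; the second is the time-integrated convolution against the source.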
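A short worked instance may help fix the Duhamel formula. The data below (one space dimension, zero initial temperature, source $\cos y$ constant in time) are an illustrative assumption, chosen because the heat kernel acts on a single cosine mode by a pure decay factor.

```latex
% Illustrative data (assumed for this example): n = 1, g \equiv 0, f(y,s) = \cos y.
\begin{align*}
\int_{\mathbb{R}} \Phi(x-y,\tau)\,\cos y \,dy &= e^{-\tau}\cos x
  && \text{(the mode $\cos x$ decays at rate $e^{-\tau}$)} \\
u(x,t) &= \int_0^t e^{-(t-s)}\cos x \,ds = \bigl(1 - e^{-t}\bigr)\cos x
  && \text{(Duhamel integral with $\tau = t-s$)} \\
u_t - u_{xx} &= e^{-t}\cos x + \bigl(1 - e^{-t}\bigr)\cos x = \cos x = f(x,t),
  && u(x,0) = 0 .
\end{align*}
```

As $t \to \infty$ the solution tends to $\cos x$, which solves the steady-state equation $-u_{xx} = \cos x$.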

Counterexamples to common slips Intermediate+

  • Bounded continuous initial data is the natural class. The convolution formula extends to $g \in L^p(\mathbb{R}^n)$ for $1 \le p \le \infty$, but the boundary behaviour as $t \to 0^+$ degrades: for $1 \le p < \infty$ one recovers $g$ in $L^p$-norm rather than pointwise. For $g$ merely measurable and locally integrable without a growth bound, the convolution can diverge; the standard growth restriction $|g(y)| \le C e^{a|y|^2}$ for some $a > 0$ guarantees the integral converges for $0 < t < 1/(4a)$.

  • The heat equation cannot be solved backwards in time. Given a temperature distribution at time $T > 0$, attempting to reconstruct the temperature at $t < T$ is an ill-posed problem in the sense of Hadamard: the solution does not exist for arbitrary final data (only for final data in the image of the forward semigroup), and even when it exists it depends discontinuously on the data. The phenomenon reflects the smoothing property of the forward equation: forward evolution irreversibly destroys high-frequency information.

  • Uniqueness requires a growth condition. Tikhonov's 1935 counterexample exhibits a non-zero function on $\mathbb{R} \times [0,\infty)$ that solves the heat equation everywhere, vanishes identically at $t = 0$, and grows super-exponentially in $|x|$ for $t > 0$. Without a growth condition on the solution, the Cauchy problem has infinitely many solutions; uniqueness in the class $|u(x,t)| \le C e^{a|x|^2}$ for some $C, a > 0$ holds (Widder 1944).

  • Infinite propagation speed is mathematical, not physical. The heat kernel $\Phi(x,t)$ is strictly positive everywhere for every $t > 0$: a unit of heat placed at the origin instantly raises the temperature, by an exponentially small amount, at every point in space. This is a feature of the linear parabolic model; relativistic generalisations (Cattaneo's equation, telegrapher's equation) modify the heat equation to enforce finite propagation speed. The standard heat equation remains the workhorse model because its analytic tractability outweighs the unphysical-speed flaw on every length scale relevant to ordinary physics.

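  • The maximum principle holds on bounded domains, not the whole space. On a bounded parabolic cylinder $U_T = U \times (0,T]$, the maximum of a subsolution (and both the maximum and the minimum of a solution) over $\overline{U_T}$ is attained on the parabolic boundary $\Gamma_T = \overline{U_T} \setminus U_T$. The top face $U \times \{T\}$ is excluded from $\Gamma_T$. The weak principle does not by itself prevent the max from also being attained at a point of $U_T$; the strong version forbids even that for non-constant solutions, modulo connectedness. On the unbounded whole space, the heat kernel itself is a counterexample to a naive maximum principle: $\Phi(0,t) \to \infty$ as $t \to 0^+$ even though $\Phi(x,t) \to 0$ as $t \to 0^+$ at every $x \neq 0$.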

Key theorem with proof Intermediate+

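Theorem (heat kernel solves the Cauchy problem). Let $g \in C_b(\mathbb{R}^n)$ (continuous bounded). Define $u$ by $u(x,t) = \int_{\mathbb{R}^n} \Phi(x-y,t)\,g(y)\,dy$ for $x \in \mathbb{R}^n$, $t > 0$. Then $u \in C^\infty(\mathbb{R}^n \times (0,\infty))$, $u_t - \Delta u = 0$ on $\mathbb{R}^n \times (0,\infty)$, $\sup |u| \le \|g\|_\infty$, and $\lim_{(x,t) \to (x_0,0),\, t > 0} u(x,t) = g(x_0)$ for every $x_0 \in \mathbb{R}^n$ [Evans 2010 §2.3, Theorem 1].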

Proof. Throughout, fix $g \in C_b(\mathbb{R}^n)$ with $\|g\|_\infty = \sup |g| < \infty$. The proof has four steps.

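Step 1 (smoothness). Fix any compact $K \subset \mathbb{R}^n \times (0,\infty)$, so $K \subset \overline{B(0,R)} \times [\delta, 1/\delta]$ for some $R, \delta > 0$. On $K$, $\Phi(x-y,t)$ and all its $x$- and $t$-derivatives are bounded by computable Gaussian-decay expressions in $y$ that are integrable in $y$ uniformly over $(x,t) \in K$. For instance, $|\partial_{x_i} \Phi(x-y,t)| = \frac{|x_i - y_i|}{2t}\,\Phi(x-y,t) \le \frac{|x-y|}{2\delta}\,(4\pi\delta)^{-n/2}\, e^{-\delta |x-y|^2/4}$ on $K$, which is integrable in $y$ and, since $|x| \le R$, bounded uniformly over $K$ by a fixed integrable function of $y$. Differentiation under the integral sign is justified by the dominated convergence theorem 02.07.04 applied to the difference quotients, giving $\partial_{x_i} u(x,t) = \int_{\mathbb{R}^n} \partial_{x_i}\Phi(x-y,t)\,g(y)\,dy$ and similarly for all higher derivatives. Repeating gives $u \in C^\infty(\mathbb{R}^n \times (0,\infty))$.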

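Step 2 (heat equation). From the explicit formula for $\Phi$, direct calculation gives $\Phi_t = \Delta\Phi$ on $\mathbb{R}^n \times (0,\infty)$. (Compute: $\log \Phi = -\frac{n}{2}\log(4\pi t) - \frac{|x|^2}{4t}$, so $\Phi_t = \left(-\frac{n}{2t} + \frac{|x|^2}{4t^2}\right)\Phi$ and $\Phi_{x_i} = -\frac{x_i}{2t}\,\Phi$, $\Phi_{x_i x_i} = \left(-\frac{1}{2t} + \frac{x_i^2}{4t^2}\right)\Phi$, giving $\Delta\Phi = \left(-\frac{n}{2t} + \frac{|x|^2}{4t^2}\right)\Phi = \Phi_t$.) By the differentiation-under-the-integral identity from Step 1: $u_t(x,t) - \Delta u(x,t) = \int_{\mathbb{R}^n} (\Phi_t - \Delta_x \Phi)(x-y,t)\,g(y)\,dy = 0$. So $u_t - \Delta u = 0$ on $\mathbb{R}^n \times (0,\infty)$.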

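Step 3 (sup bound). For every $x \in \mathbb{R}^n$ and $t > 0$: $|u(x,t)| \le \int_{\mathbb{R}^n} \Phi(x-y,t)\,|g(y)|\,dy \le \|g\|_\infty \int_{\mathbb{R}^n} \Phi(x-y,t)\,dy = \|g\|_\infty$, using positivity of $\Phi$ and the unit-mass property $\int_{\mathbb{R}^n} \Phi(\cdot,t) = 1$.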

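Step 4 (initial condition). Fix $x_0 \in \mathbb{R}^n$ and $\varepsilon > 0$. By continuity of $g$, choose $\delta > 0$ so that $|g(y) - g(x_0)| < \varepsilon$ whenever $|y - x_0| < \delta$. Compute, for $|x - x_0| < \delta/2$ and $t > 0$: $|u(x,t) - g(x_0)| = \left|\int_{\mathbb{R}^n} \Phi(x-y,t)\,(g(y) - g(x_0))\,dy\right| \le \int_{\mathbb{R}^n} \Phi(x-y,t)\,|g(y) - g(x_0)|\,dy$, using $\int \Phi = 1$ to absorb the constant $g(x_0)$ into the kernel-integral. Split the integration domain into $B(x_0,\delta)$ and $\mathbb{R}^n \setminus B(x_0,\delta)$, and call the two resulting integrals $I$ and $J$:

For $I$: $|g(y) - g(x_0)| < \varepsilon$ on $B(x_0,\delta)$, so $I \le \varepsilon \int_{B(x_0,\delta)} \Phi(x-y,t)\,dy$, and the unit-mass property gives $I \le \varepsilon$.

For $J$: $|g(y) - g(x_0)| \le 2\|g\|_\infty$. For $|x - x_0| < \delta/2$ and $|y - x_0| \ge \delta$, $|y - x| \ge |y - x_0| - |x - x_0| \ge \delta/2$, so $\{|y - x_0| \ge \delta\} \subset \{|y - x| \ge \delta/2\}$. Then $J \le 2\|g\|_\infty \int_{|y-x| \ge \delta/2} \Phi(x-y,t)\,dy = 2\|g\|_\infty \int_{|z| \ge \delta/2} \Phi(z,t)\,dz$. The Gaussian tail integral satisfies $\int_{|z| \ge r} \Phi(z,t)\,dz \to 0$ as $t \to 0^+$ for every fixed $r > 0$. (Concretely, the change of variables $z = 2\sqrt{t}\,w$ gives $\int_{|z| \ge r} \Phi(z,t)\,dz = \pi^{-n/2} \int_{|w| \ge r/(2\sqrt{t})} e^{-|w|^2}\,dw$, the tail of a fixed integrable function over a region that recedes to infinity.) Choose $t_0 > 0$ small enough that $J < \varepsilon$ for $0 < t < t_0$.

Combining: for $|x - x_0| < \delta/2$ and $0 < t < t_0$, $|u(x,t) - g(x_0)| < 2\varepsilon$. Since $\varepsilon$ was arbitrary, $u(x,t) \to g(x_0)$ as $(x,t) \to (x_0, 0^+)$. $\blacksquare$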
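The identity $\Phi_t = \Delta\Phi$ used in Step 2 can be sanity-checked with finite differences. This is a rough numerical sketch, not a proof; the function names, the sample point in $\mathbb{R}^3$, and the step size `h` are assumptions chosen for illustration.

```python
import math

def phi(x, t):
    """Heat kernel on R^n; x is a sequence of n coordinates."""
    n = len(x)
    r2 = sum(c * c for c in x)
    return (4.0 * math.pi * t) ** (-n / 2.0) * math.exp(-r2 / (4.0 * t))

def time_derivative(x, t, h=1e-3):
    """Central difference in t."""
    return (phi(x, t + h) - phi(x, t - h)) / (2.0 * h)

def laplacian(x, t, h=1e-3):
    """Sum of second central differences in each coordinate."""
    total = 0.0
    for i in range(len(x)):
        xp = list(x)
        xm = list(x)
        xp[i] += h
        xm[i] -= h
        total += (phi(xp, t) - 2.0 * phi(x, t) + phi(xm, t)) / (h * h)
    return total

point, time = (0.3, -0.5, 0.8), 0.6
dt_phi = time_derivative(point, time)
lap_phi = laplacian(point, time)
residual = dt_phi - lap_phi   # should be small compared with |dt_phi|
print(dt_phi, lap_phi, residual)
```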

Bridge. The same convolution recipe that worked for the Poisson equation 02.13.02 works again for the heat equation, with the parabolic heat kernel $\Phi(x,t)$ replacing the elliptic Newtonian-potential kernel. The added ingredient is time: $\Phi(\cdot,t)$ is no longer a fixed kernel but a family of kernels parametrised by $t > 0$, each of which is a smoothing operator whose smoothing strength grows with $t$. Duhamel's principle below packages the time-dependent inhomogeneous problem into the same convolution-against-fundamental-solution form. The pattern recurs in 02.13.04 the wave equation via retarded Green functions (the Kirchhoff and d'Alembert formulae), in 02.13.07 separation of variables and Fourier-series expansions on bounded domains, and in the modern Brownian-motion picture (the heat kernel is the transition density of $n$-dimensional Brownian motion, the bridge to stochastic analysis and Feynman-Kac formulas).

Exercises Intermediate+

Advanced results Master

The modern theory of the heat equation organises around six pillars: the fundamental-solution apparatus and Cauchy-problem theory, the maximum-principle framework on bounded domains, uniqueness theorems and their counterexamples (Tikhonov), regularity theory and the Nash-De Giorgi-Moser apparatus, the parabolic Harnack inequality, and the probabilistic interpretation via Brownian motion.

Theorem 1 (heat kernel via Fourier transform; Fourier 1822). The heat kernel on $\mathbb{R}^n$ is the inverse Fourier transform of $e^{-t|\xi|^2}$: $\Phi(x,t) = (2\pi)^{-n} \int_{\mathbb{R}^n} e^{i x \cdot \xi}\, e^{-t|\xi|^2}\,d\xi$ [Fourier 1822]. Conversely, Fourier-transforming the Cauchy problem $u_t = \Delta u$, $u(\cdot,0) = g$ gives the algebraic ODE $\partial_t \hat{u}(\xi,t) = -|\xi|^2\,\hat{u}(\xi,t)$ with $\hat{u}(\xi,0) = \hat{g}(\xi)$, whose solution is $\hat{u}(\xi,t) = e^{-t|\xi|^2}\,\hat{g}(\xi)$. Fourier inversion recovers the convolution formula $u(\cdot,t) = \Phi(\cdot,t) * g$. The Fourier-transform derivation was Fourier's original method in his 1822 Théorie analytique de la chaleur: Fourier introduced both the Fourier series (for the heat equation on a bounded interval) and the Fourier integral (for the heat equation on the line), inventing the modern technique of decomposition into eigenmodes of the spatial differential operator.
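The Fourier representation can be tested numerically in one dimension, where the inversion integral reduces to a cosine integral because the multiplier is even. The sketch below assumes the convention that the inverse transform carries the factor $(2\pi)^{-1}$; the cutoff and grid size are illustrative choices.

```python
import math

def phi(x, t):
    """Closed-form heat kernel on the real line."""
    return math.exp(-x * x / (4.0 * t)) / math.sqrt(4.0 * math.pi * t)

def phi_from_fourier(x, t, cutoff=40.0, n=20000):
    """Midpoint-rule approximation of (1 / 2 pi) * integral of
    cos(x * xi) * exp(-t * xi^2) over [-cutoff, cutoff]."""
    h = 2.0 * cutoff / n
    total = 0.0
    for i in range(n):
        xi = -cutoff + (i + 0.5) * h
        total += math.cos(x * xi) * math.exp(-t * xi * xi)
    return total * h / (2.0 * math.pi)

x0, t0 = 1.3, 0.5
direct = phi(x0, t0)
via_fourier = phi_from_fourier(x0, t0)
print(direct, via_fourier)
```

The same multiplier $e^{-t\xi^2}$ shows why high frequencies are damped first: the decay rate of a mode grows like the square of its frequency.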

Theorem 2 (smoothing and infinite propagation speed). Let $g \in L^p(\mathbb{R}^n)$ for some $1 \le p \le \infty$, and let $u(\cdot,t) = \Phi(\cdot,t) * g$ for $t > 0$. Then $u \in C^\infty(\mathbb{R}^n \times (0,\infty))$, and for every $t > 0$, $u(\cdot,t)$ is real-analytic in $x$. Moreover, if $g \ge 0$ with $g \not\equiv 0$, then $u(x,t) > 0$ for every $(x,t) \in \mathbb{R}^n \times (0,\infty)$ (strict positivity at every space-time point with $t > 0$). The smoothing reflects the regularising character of the heat semigroup: arbitrary $L^p$ data become real-analytic after any positive time. The strict positivity reflects infinite propagation speed: an arbitrarily small amount of heat anywhere instantly produces a strictly positive temperature everywhere else, however far away.

The analyticity is precise: extends to a complex-analytic function on in the strip for an explicit constant depending only on . The proof uses the Gaussian decay of the heat kernel together with the fact that the heat kernel itself extends to a complex-analytic function in . This is the parabolic analogue of the elliptic real-analyticity for harmonic functions; the parabolic version is sharper because it gives explicit complex-analytic-strip bounds.

Theorem 3 (strong maximum principle; Nirenberg 1953). Let $u \in C^2_1(U_T) \cap C(\overline{U_T})$ be a subsolution of the heat equation: $u_t - \Delta u \le 0$ on $U_T$. Suppose $U$ is open, connected, and bounded. If $u$ attains its maximum over $\overline{U_T}$ at an interior space-time point $(x_0,t_0) \in U_T$, then $u$ is constant on $\overline{U_{t_0}}$. The strong maximum principle sharpens the weak maximum principle of Exercise 7 in two ways. First, it rules out maxima of non-constant subsolutions at every point of $U_T$, including the top time $t = T$ (whereas the weak version only asserts that the max is attained on the parabolic boundary $\Gamma_T = \overline{U_T} \setminus U_T$, which excludes the top face, and does not prevent it from also being attained elsewhere). Second, it propagates the maximum backward in time: if the max is attained at $(x_0,t_0)$, then $u$ equals the same value everywhere in the parabolic cylinder up to time $t_0$. The mechanism is the Hopf-type lemma for parabolic equations and a connectedness argument.

Theorem 4 (Tikhonov non-uniqueness counterexample; Tikhonov 1935). Without a growth restriction, the Cauchy problem for the heat equation on $\mathbb{R} \times [0,\infty)$ has infinitely many solutions. Concretely, define $\varphi(t) = e^{-1/t^2}$ for $t > 0$, $\varphi(t) = 0$ for $t \le 0$, and consider the formal power series $u(x,t) = \sum_{k=0}^{\infty} \frac{\varphi^{(k)}(t)}{(2k)!}\, x^{2k}$. Tikhonov's 1935 Mat. Sbornik paper [Tikhonov 1935] proved that this series converges for every $(x,t)$, defines a $C^\infty$ function on $\mathbb{R} \times \mathbb{R}$, satisfies $u_t = u_{xx}$ on $\mathbb{R} \times (0,\infty)$ (away from $t \le 0$, where $u$ is identically zero), and vanishes identically at $t = 0$. So $u$ is a non-zero solution of the Cauchy problem with zero initial data. The non-uniqueness reflects the super-exponential growth in $|x|$: schematically, $u$ is not dominated by any fixed Gaussian $C e^{a|x|^2}$ uniformly as $t \to 0^+$. The Widder uniqueness theorem (next item) restores uniqueness under a Gaussian-growth restriction.

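Theorem 5 (Widder uniqueness; Widder 1944). Let $u$ be $C^2_1$ on $\mathbb{R}^n \times (0,T]$, continuous on $\mathbb{R}^n \times [0,T]$, and satisfy $u_t - \Delta u = 0$ on $\mathbb{R}^n \times (0,T]$ with $u = 0$ on $\mathbb{R}^n \times \{t = 0\}$. Assume the growth condition $|u(x,t)| \le C e^{a|x|^2}$ for some constants $C, a > 0$ and all $(x,t) \in \mathbb{R}^n \times [0,T]$. Then $u \equiv 0$ on $\mathbb{R}^n \times [0,T]$ [Widder 1944]. Widder's positive-temperature theorem identifies the critical growth rate $e^{a|x|^2}$: solutions growing faster than $e^{a|x|^2}$ can fail to be unique (Tikhonov's example saturates this rate); solutions growing strictly slower are unique. The proof uses the Phragmén-Lindelöf principle for parabolic equations, comparing the candidate solution to the explicit Gaussian growth bound. Widder's 1944 paper handled the positive-data case; the symmetric two-sided result is a standard extension.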

Theorem 6 (backward uniqueness; Lions-Malgrange 1960). Let $u, v$ be two classical solutions of the heat equation on $\mathbb{R}^n \times (0,T]$ with appropriate growth conditions. If $u(x,T) = v(x,T)$ for every $x \in \mathbb{R}^n$, then $u \equiv v$ on $\mathbb{R}^n \times (0,T]$ (and on the closure if both extend continuously). The backward uniqueness theorem says that distinct trajectories of the heat semigroup cannot converge to the same final state: the forward evolution, though irreversible in the sense that the inverse problem is ill-posed, is injective on its image. The ill-posedness is in stability (the inverse depends discontinuously on the data), not in uniqueness.

The backward heat equation $u_t = -\Delta u$, which is what one gets by reversing time and then trying to evolve forward, is ill-posed in the sense of Hadamard: arbitrarily small final data can produce arbitrarily large norms at earlier times. Concretely, the function $u_k(x,t) = \frac{1}{k}\, e^{k^2 (T - t)} \sin(kx)$ solves $u_t = u_{xx}$ on $\mathbb{R} \times [0,T]$ and has final data $u_k(\cdot,T)$ uniformly bounded by $1/k$ (which is small for large $k$), but the earlier data $u_k(\cdot,t)$, $t < T$, has supremum $\frac{1}{k}\, e^{k^2(T-t)}$, which is unbounded as $k \to \infty$. The issue is that small perturbations in high-frequency modes amplify under backward evolution. The ill-posedness of the backward heat equation underlies the regularisation theory of inverse problems (Tikhonov regularisation, named for the same Tikhonov who produced the non-uniqueness counterexample) and the engineering challenges of inverse-conduction problems (reconstructing initial temperature from a present-time measurement).
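The amplification of high-frequency modes under backward evolution is easy to tabulate. The sketch below takes, as an illustrative assumption, the single mode $\sin(kx)$ with final amplitude $1/k$ and asks how large it must have been a time `elapsed` earlier; the function and variable names are hypothetical.

```python
import math

def backward_amplification(k, elapsed):
    """Growth factor of the mode sin(kx) when u_t = u_xx is run backwards
    for `elapsed` units of time: forward decay exp(-k^2 t) inverted."""
    return math.exp(k * k * elapsed)

elapsed = 0.5
rows = []
for k in (1, 2, 5, 10):
    final_size = 1.0 / k                       # sup of the final data
    earlier_size = final_size * backward_amplification(k, elapsed)
    rows.append((k, final_size, earlier_size))
    print(k, final_size, earlier_size)
```

The final data shrink like $1/k$ while the earlier data blow up like $e^{k^2 \cdot \text{elapsed}}/k$, which is the failure of continuous dependence on the data.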

Theorem 7 (parabolic Harnack inequality; Moser 1964). Let $u \ge 0$ be a non-negative solution of the heat equation on a space-time cylinder $Q = B(x_0,2r) \times (t_0 - 4r^2, t_0)$ (with appropriate scaling so that the spatial and temporal extents match the parabolic scaling $t \sim |x|^2$). Write $Q^- = B(x_0,r) \times (t_0 - 3r^2, t_0 - 2r^2)$ and $Q^+ = B(x_0,r) \times (t_0 - r^2, t_0)$. Then there exists a constant $C = C(n)$ such that for every such $u$: $\sup_{Q^-} u \le C \inf_{Q^+} u$. The parabolic Harnack inequality says: in any parabolic-scaled cylinder, a non-negative solution's maximum on the lower (earlier) sub-cylinder is controlled by its minimum on the upper (later) sub-cylinder. Moser's 1964 Comm. Pure Appl. Math. paper [Moser 1964] proved the inequality by an iteration scheme that converts $L^2$ energy estimates into $L^\infty$ control. The inequality is the parabolic analogue of the classical elliptic Harnack inequality and underlies the Nash-De Giorgi-Moser regularity theory for divergence-form equations.

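Theorem 8 (Aronson Gaussian bounds; Aronson 1967). For any divergence-form parabolic operator $L = \partial_t - \sum_{i,j} \partial_{x_i}\bigl(a_{ij}(x,t)\,\partial_{x_j}\bigr)$ with bounded measurable coefficients satisfying uniform ellipticity $\lambda |\xi|^2 \le \sum_{i,j} a_{ij}(x,t)\,\xi_i \xi_j \le \Lambda |\xi|^2$, the fundamental solution $\Gamma(x,t;y,s)$ satisfies two-sided Gaussian bounds $\frac{1}{C (t-s)^{n/2}}\, e^{-C|x-y|^2/(t-s)} \le \Gamma(x,t;y,s) \le \frac{C}{(t-s)^{n/2}}\, e^{-|x-y|^2/(C(t-s))}$ for $t > s$, where the constants depend only on $n, \lambda, \Lambda$ [Aronson 1967]. The Aronson bounds say that even for non-smooth, non-symmetric, time-dependent coefficient matrices, the fundamental solution behaves like a Gaussian with explicit constants. The result is the parabolic analogue of the elliptic Green-function pointwise bounds, and is the keystone of modern parabolic regularity theory and stochastic analysis with rough coefficients.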

Theorem 9 (Brownian motion connection; Wiener 1923, Einstein 1905, Kac 1949). Let $(B_t)_{t \ge 0}$ be standard $n$-dimensional Brownian motion on a probability space, with $B_0 = 0$. Then the heat kernel is the transition density of $B$: $\mathbb{P}(B_t \in A) = \int_A (2\pi t)^{-n/2}\, e^{-|x|^2/(2t)}\,dx = \int_A \Phi(x, t/2)\,dx$, and more generally $y \mapsto \Phi(y - x, (t-s)/2)$ is the density of the random variable $B_t$ conditioned on $B_s = x$. The heat equation appears as the Kolmogorov forward equation of Brownian motion: the density $p(x,t)$ of $B_t$ satisfies $p_t = \tfrac{1}{2}\Delta p$ with $p(\cdot,0) = \delta_0$ (the factor $\tfrac{1}{2}$ is the standard probabilistic normalisation; the analyst's heat kernel uses the normalisation $u_t = \Delta u$, which corresponds to Brownian motion with variance $2t$ instead of $t$). The Brownian-motion connection was identified by Einstein 1905 Ann. Phys. 17 [Einstein 1905] (heuristic diffusion-equation derivation from molecular kinetic theory) and rigorously formalised by Wiener 1923 J. Math. Phys. 2 [Wiener 1923] (Wiener measure on path space).
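The transition-density statement can be illustrated by a small Monte Carlo experiment in one dimension. The sketch below simulates standard Brownian motion up to time $2$ by summing independent Gaussian increments; at that time its density is the analyst's kernel at time $1$, so the probability of landing in $[-1,1]$ should be close to $\operatorname{erf}(1/2)$. The path count, step count, and seed are arbitrary assumptions, and the agreement is only statistical.

```python
import math
import random

def simulate_endpoints(t, n_paths=20000, n_steps=20, seed=0):
    """Endpoints B_t of standard 1D Brownian paths started at 0, built
    from independent N(0, dt) increments."""
    rng = random.Random(seed)
    dt = t / n_steps
    sd = math.sqrt(dt)
    endpoints = []
    for _ in range(n_paths):
        b = 0.0
        for _ in range(n_steps):
            b += rng.gauss(0.0, sd)
        endpoints.append(b)
    return endpoints

ends = simulate_endpoints(2.0)
fraction_inside = sum(1 for b in ends if abs(b) <= 1.0) / len(ends)
sample_variance = sum(b * b for b in ends) / len(ends)
predicted = math.erf(0.5)   # integral of the kernel at time 1 over [-1, 1]
print(fraction_inside, predicted, sample_variance)
```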

Theorem 10 (Feynman-Kac formula; Kac 1949). Let $V : \mathbb{R}^n \to \mathbb{R}$ be a continuous bounded function (or measurable with appropriate decay), and let $u$ solve the heat equation with potential, in the probabilistic normalisation of Theorem 9: $u_t = \tfrac{1}{2}\Delta u - V u$ on $\mathbb{R}^n \times (0,\infty)$, $u(\cdot,0) = g$. Then for every $x \in \mathbb{R}^n$ and $t > 0$: $u(x,t) = \mathbb{E}_x\!\left[ g(B_t)\, \exp\!\left(-\int_0^t V(B_s)\,ds\right) \right]$, where $\mathbb{E}_x$ denotes expectation with respect to Brownian motion started at $x$ [Kac 1949]. The Feynman-Kac formula identifies the solution operator of the heat equation with a potential as the path integral of an exponential weight against Brownian paths, and is the bridge between PDE theory and stochastic analysis. Wick-rotating (replacing imaginary time by real time) converts the heat equation with potential into the Schrödinger equation with the same potential, and the Feynman-Kac formula becomes Feynman's path-integral formula for quantum mechanics (Feynman 1948 Rev. Mod. Phys. 20). The bridge is the most striking instance of the deep mathematical similarity between diffusion and quantum mechanics.

Theorem 11 (Nash-De Giorgi regularity; Nash 1958, De Giorgi 1957). For a divergence-form parabolic operator $L = \partial_t - \sum_{i,j} \partial_{x_i}\bigl(a_{ij}(x,t)\,\partial_{x_j}\bigr)$ with bounded measurable coefficients satisfying uniform ellipticity, every weak solution of $Lu = 0$ on a parabolic cylinder is Hölder continuous, with Hölder exponent and norm depending only on $n, \lambda, \Lambda$ [Nash 1958] [De Giorgi 1957]. The Nash-De Giorgi theorem is the foundational regularity result for second-order divergence-form equations with rough coefficients. Nash 1958 and De Giorgi 1957 proved the theorem independently, with very different methods: Nash by a probabilistic argument tracking entropy along Brownian motion, De Giorgi by an iteration scheme converting $L^2$ control into Hölder control. Moser 1964 unified the two approaches via the Harnack inequality (Theorem 7 above). The Nash-De Giorgi-Moser apparatus is the foundation of modern PDE regularity theory: it provides Hölder continuity for solutions of divergence-form equations without any smoothness assumption on the coefficients, and is the entry point to the modern theory of elliptic and parabolic equations on Riemannian manifolds (Saloff-Coste, Grigor'yan), to homogenisation theory, and to the theory of stochastic differential equations with non-smooth drift.

Synthesis. The heat equation is the prototype parabolic equation, and its solution apparatus (heat kernel, convolution formula, Duhamel principle, maximum principle, energy method, Harnack inequality, Brownian-motion identification) is the prototype of every parabolic regularity and existence theory. The pattern recurs in three main escalations. First, replace the Laplacian by a divergence-form operator with rough coefficients: the convolution formula is replaced by a Green-function representation with Aronson Gaussian bounds, the smoothing property becomes Nash-De Giorgi-Moser Hölder regularity, and the Harnack inequality becomes the centrepiece of the modern theory. Second, replace the Euclidean background by a Riemannian manifold: the heat kernel becomes the Riemannian heat kernel, with Li-Yau gradient estimates replacing the explicit Gaussian formula, and the long-time behaviour becomes a probe of the underlying geometry (Hamilton's Ricci flow, Perelman's entropy functional, the Atiyah-Singer index theorem via heat-kernel asymptotics). Third, replace the linear equation by a nonlinear equation: the porous-medium equation, the $p$-Laplacian, the Hele-Shaw flow, the Stefan problem, and ultimately mean-curvature flow and Ricci flow all build on the linear-heat-equation apparatus as their starting point.

The probabilistic side of the equation has been equally fertile. The identification of the heat kernel with Brownian motion's transition density, due to Einstein and Wiener, opened the modern theory of stochastic processes. The Feynman-Kac formula, due to Kac, is the bridge to quantum mechanics via Wick rotation. The Itô-Stratonovich calculus extends the framework to stochastic differential equations with drift and diffusion. The Malliavin calculus, the Wiener chaos decomposition, and the modern theory of large deviations (Freidlin-Wentzell, Varadhan) all build on the foundational identification of the heat semigroup with Brownian motion.

The conceptual closure is the recognition that the heat equation packages five distinct mathematical phenomena into a single equation: the spectral decomposition of the Laplacian (Fourier 1822, Sturm-Liouville 1836), the smoothing property of irreversible time evolution, the probabilistic structure of Brownian motion, the analytic continuation to the Schrödinger equation by Wick rotation, and the foundational example of a parabolic PDE underlying modern regularity theory. The arc from Fourier's 1822 Théorie analytique de la chaleur to modern Ricci-flow theory is a two-century lineage in which the same equation, the same heat kernel, and the same convolution recipe have been continuously refined into ever more general and ever more powerful tools.

Full proof set Master

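Proposition 1 (heat kernel via Fourier transform). For $x \in \mathbb{R}^n$ and $t > 0$, $\Phi(x,t) = (2\pi)^{-n} \int_{\mathbb{R}^n} e^{i x \cdot \xi - t|\xi|^2}\,d\xi$.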

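Proof. The Fourier transform of a Gaussian is a Gaussian. Specifically, the standard $n$-dimensional Gaussian integral with linear shift is $\int_{\mathbb{R}^n} e^{-a|\xi|^2 + i x \cdot \xi}\,d\xi = (\pi/a)^{n/2}\, e^{-|x|^2/(4a)}$ for $a > 0$. The proof factors the integral over coordinates (since both the exponential of a sum-of-squares and the inner product decompose into one-dimensional pieces) and reduces to the one-dimensional identity $\int_{\mathbb{R}} e^{-a\xi^2 + i x \xi}\,d\xi = \sqrt{\pi/a}\; e^{-x^2/(4a)}$, which follows from completing the square and shifting the contour of integration.

Set $a = t$ to get $\int_{\mathbb{R}^n} e^{-t|\xi|^2 + i x \cdot \xi}\,d\xi = (\pi/t)^{n/2}\, e^{-|x|^2/(4t)}$. Multiply both sides by $(2\pi)^{-n}$: $(2\pi)^{-n} \int_{\mathbb{R}^n} e^{i x \cdot \xi - t|\xi|^2}\,d\xi = (2\pi)^{-n} (\pi/t)^{n/2}\, e^{-|x|^2/(4t)} = (4\pi t)^{-n/2}\, e^{-|x|^2/(4t)} = \Phi(x,t)$.

The intermediate identity follows from algebraic simplification: $(2\pi)^{-n} (\pi/t)^{n/2} = (4\pi^2)^{-n/2}\, \pi^{n/2}\, t^{-n/2} = (4\pi t)^{-n/2}$. $\blacksquare$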

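Proposition 2 (Duhamel principle). Let $g \in C_b(\mathbb{R}^n)$ and $f \in C(\mathbb{R}^n \times [0,\infty))$ with $f$ bounded and locally Hölder-continuous in space uniformly in time. Then the function $u(x,t) = \int_{\mathbb{R}^n} \Phi(x-y,t)\,g(y)\,dy + \int_0^t \int_{\mathbb{R}^n} \Phi(x-y,t-s)\,f(y,s)\,dy\,ds$ is in $C^2_1(\mathbb{R}^n \times (0,\infty))$ and solves $u_t - \Delta u = f$ on $\mathbb{R}^n \times (0,\infty)$ with $u = g$ on $\mathbb{R}^n \times \{t = 0\}$.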

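Proof. Write $u = v + w$ where $v(x,t) = \int_{\mathbb{R}^n} \Phi(x-y,t)\,g(y)\,dy$ and $w(x,t) = \int_0^t \int_{\mathbb{R}^n} \Phi(x-y,t-s)\,f(y,s)\,dy\,ds$. By the Key Theorem of the Intermediate tier, $v$ solves $v_t - \Delta v = 0$ on $\mathbb{R}^n \times (0,\infty)$ with $v(\cdot,0) = g$. It suffices to show $w$ is in $C^2_1(\mathbb{R}^n \times (0,\infty))$ and solves $w_t - \Delta w = f$ on $\mathbb{R}^n \times (0,\infty)$ with $w(\cdot,0) = 0$.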

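The convergence to zero at $t = 0$ is immediate: $|w(x,t)| \le \|f\|_\infty \int_0^t \int_{\mathbb{R}^n} \Phi(x-y,t-s)\,dy\,ds = t\,\|f\|_\infty$ by the unit-mass property of $\Phi$, so $w(\cdot,t) \to 0$ uniformly as $t \to 0^+$.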

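For the PDE, compute the time derivative. Change variables in the time integral via $\sigma = t - s$ (so $\sigma$ runs from $t$ down to $0$), and shift the space variable $y \mapsto x - y$: $w(x,t) = \int_0^t \int_{\mathbb{R}^n} \Phi(y,\sigma)\,f(x-y,t-\sigma)\,dy\,d\sigma$.

Take $\partial_t$. The integrand depends on $t$ both through the upper limit and through $f(x-y,t-\sigma)$: $w_t(x,t) = \int_{\mathbb{R}^n} \Phi(y,t)\,f(x-y,0)\,dy + \int_0^t \int_{\mathbb{R}^n} \Phi(y,\sigma)\,f_t(x-y,t-\sigma)\,dy\,d\sigma$, provided $f$ is $C^1$ in $t$ (this is the cleanest case; for $f$ merely continuous with Hölder regularity in $x$, an integration-by-parts argument shifts the time derivative onto $\Phi$ instead).

The cleaner derivation: work with $w(x,t) = \int_0^t \int_{\mathbb{R}^n} \Phi(x-y,t-s)\,f(y,s)\,dy\,ds$ directly. Compute $\Delta w(x,t) = \int_0^t \int_{\mathbb{R}^n} \Delta_x \Phi(x-y,t-s)\,f(y,s)\,dy\,ds$, where the differentiation under the integral is justified by the Gaussian decay of $\Phi$ (the singularity at $s = t$ is controlled by the Hölder regularity of $f$; a standard cutoff and limiting argument handles the diagonal behaviour). And $w_t$ splits the integral into a near-the-diagonal piece (which captures $f(x,t)$ in the limit $s \to t^-$, via the same approximation-to-the-identity argument as Step 4 of the Key Theorem proof) and a far-from-the-diagonal piece (where $\Phi_t = \Delta\Phi$, by Step 2 of the Key Theorem proof, so the integrand cancels against the corresponding piece of $\Delta w$).

Concretely: $w_t(x,t) - \Delta w(x,t) = \lim_{\varepsilon \to 0^+} \int_{\mathbb{R}^n} \Phi(x-y,\varepsilon)\,f(y,t-\varepsilon)\,dy = f(x,t)$, where the last limit uses the approximation-to-the-identity property of $\Phi(\cdot,\varepsilon)$ and the continuity of $f$ at $(x,t)$.

So $w_t - \Delta w = f$ on $\mathbb{R}^n \times (0,\infty)$. $\blacksquare$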

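Proposition 3 (Tikhonov non-uniqueness, sketch). Define $\varphi(t) = e^{-1/t^2}$ for $t > 0$ and $\varphi(t) = 0$ for $t \le 0$. Then the series $u(x,t) = \sum_{k=0}^{\infty} \frac{\varphi^{(k)}(t)}{(2k)!}\, x^{2k}$ converges absolutely and uniformly on every compact subset of $\mathbb{R} \times \mathbb{R}$, defines a $C^\infty$ function $u$, satisfies $u_t = u_{xx}$ on $\mathbb{R} \times \mathbb{R}$, and vanishes identically on $\{t \le 0\}$ (in particular $u(\cdot,0) \equiv 0$). The function $u$ is not identically zero: $u(x,t) \neq 0$ for every $t > 0$ and almost every $x$.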

Proof sketch. The function $\varphi$ is in $C^\infty(\mathbb{R})$, with all derivatives vanishing at $t = 0$ (this is the standard example of a smooth non-analytic function). The bound $|\varphi^{(k)}(t)| \le M_k$ on bounded intervals, with $M_k$ growing factorially in $k$ but slower than $(2k)!$, gives absolute convergence of the series on bounded space-time domains.

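Term-by-term verification of $u_t = u_{xx}$: $u_t = \sum_{k=0}^{\infty} \frac{\varphi^{(k+1)}(t)}{(2k)!}\, x^{2k}$, and $u_{xx} = \sum_{k=1}^{\infty} \frac{\varphi^{(k)}(t)}{(2k-2)!}\, x^{2k-2}$. Reindexing $k \mapsto k+1$ in the second sum gives $u_{xx} = \sum_{k=0}^{\infty} \frac{\varphi^{(k+1)}(t)}{(2k)!}\, x^{2k}$, matching the time derivative term-by-term.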

That $u$ is not identically zero requires a more delicate argument: one shows that for any fixed $t > 0$, the function $x \mapsto u(x,t)$ has a non-zero power-series expansion at $x = 0$, so cannot be identically zero by analyticity in $x$. (Indeed, $u(0,t) = \varphi(t) = e^{-1/t^2} > 0$ for $t > 0$, so $u$ is non-zero at every $(0,t)$ with $t > 0$.) The detailed verification is in Tikhonov 1935 Mat. Sbornik 42, 199-216 [Tikhonov 1935] and in John 1982 PDE 4e [John 1982] §7.

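Proposition 4 (energy decay for the linear heat equation). Let $u$ solve $u_t = \Delta u$ on $\mathbb{R}^n \times (0,\infty)$ with $u(\cdot,0) = g \in L^2(\mathbb{R}^n)$. Then $\|u(\cdot,t)\|_{L^2}^2 = \int_{\mathbb{R}^n} u(x,t)^2\,dx$ is non-increasing in $t$.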

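Proof. Multiply the equation by $u$ and integrate over $\mathbb{R}^n$: $\int_{\mathbb{R}^n} u\,u_t\,dx = \int_{\mathbb{R}^n} u\,\Delta u\,dx$. The left side is $\frac{1}{2}\frac{d}{dt} \int_{\mathbb{R}^n} u^2\,dx$ (assuming sufficient decay to interchange the derivative and the integral, which holds for $t > 0$ since the solution decays at infinity by the Gaussian heat kernel). The right side is $-\int_{\mathbb{R}^n} |\nabla u|^2\,dx$ (integration by parts, with vanishing boundary terms at infinity by the decay). So $\frac{1}{2}\frac{d}{dt} \|u(\cdot,t)\|_{L^2}^2 = -\|\nabla u(\cdot,t)\|_{L^2}^2 \le 0$.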

So is non-increasing in , and Plancherel together with the explicit Fourier-side semigroup gives the sharper rate which decays in at a rate determined by the frequency content of near the origin.
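A discrete analogue of the energy identity can be checked numerically. The sketch assumes unit diffusivity, one space dimension, a periodic grid, and the explicit forward-Euler scheme with centred differences and $r = \Delta t / \Delta x^2 \le 1/2$. None of these choices comes from the text above; the periodic box stands in for the whole-space setting of Proposition 4 purely for convenience.

```python
import math

def heat_step(u, r):
    """One explicit step u_j <- u_j + r (u_{j+1} - 2 u_j + u_{j-1}) on a periodic grid."""
    n = len(u)
    return [u[j] + r * (u[(j + 1) % n] - 2.0 * u[j] + u[j - 1]) for j in range(n)]

def l2_norm_sq(u, dx):
    """Riemann-sum approximation of the integral of u^2."""
    return dx * sum(v * v for v in u)

n = 200
dx = 2.0 * math.pi / n
r = 0.4  # r = dt / dx^2; r <= 1/2 keeps the explicit scheme stable
# Rough initial data: a square pulse plus a high-frequency ripple.
u = [(1.0 if 60 <= j < 90 else 0.0) + 0.3 * math.sin(25 * j * dx) for j in range(n)]

energies = [l2_norm_sq(u, dx)]
for _ in range(500):
    u = heat_step(u, r)
    energies.append(l2_norm_sq(u, dx))

print(energies[0], energies[-1])
```

For $r \le 1/2$ every discrete Fourier mode is multiplied by a factor of modulus at most one per step, so the discrete $L^2$ norm cannot grow; the high-frequency ripple is damped fastest, mirroring the Fourier-side statement in the proof.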

Connections Master

  • Laplace equation and harmonic functions 02.13.01. The steady-state limit of the heat equation. Setting in gives the Poisson equation (next item); setting and gives the Laplace equation . The heat equation's long-time behaviour on bounded domains with time-independent boundary data is convergence to the steady-state harmonic-function solution: exponentially fast as , with harmonic and satisfying the same boundary data. The heat equation is the parabolic relaxation toward elliptic equilibrium.

  • Poisson equation and Newtonian potential 02.13.02. The elliptic precursor of the heat equation. The Poisson equation is the steady state of the inhomogeneous heat equation with a time-independent source. The Newtonian potential is the elliptic analogue of the Duhamel formula, and the elliptic fundamental solution is the long-time integral of the parabolic heat kernel: when the integral converges (in dimension ). The parabolic heat kernel is the time-resolved version of the elliptic fundamental solution.

  • Lebesgue integral and dominated convergence 02.07.04. Supplies the integration framework on which the heat kernel apparatus rests. The convolution is a Lebesgue integral, and the differentiation-under-the-integral arguments of the Key Theorem proof rely on the dominated convergence theorem to pass derivatives through the integral. The -semigroup theory of the heat semigroup, the energy method for uniqueness, and the Brownian-motion picture all assume the full Lebesgue-integration apparatus.

  • Wave equation 02.13.04. The hyperbolic cousin of the heat equation. The wave equation has finite propagation speed (signals travel at unit speed), conserves an energy $E(t) = \tfrac{1}{2}\int (u_t^2 + |\nabla u|^2)\,dx$, and does not smooth (initial regularity is preserved). The heat equation has infinite propagation speed, dissipates energy (the $L^2$ norm decays), and smooths arbitrarily (any data become real-analytic at any positive time). The heat and wave equations are the prototypes of parabolic and hyperbolic theory respectively, and many problems of mathematical physics combine both (the telegrapher's equation interpolates between them; the Klein-Gordon equation adds a mass term to the wave equation).

  • Fourier series and Fourier transform [02.10]. The diagonalisation framework for the heat equation. The Fourier transform diagonalises the spatial Laplacian: $\widehat{\Delta u}(\xi) = -|\xi|^2\,\hat u(\xi)$. For each fixed frequency $\xi$ the heat equation becomes the scalar ODE $\partial_t \hat u(\xi,t) = -|\xi|^2\,\hat u(\xi,t)$ with solution $\hat u(\xi,t) = e^{-t|\xi|^2}\,\hat u_0(\xi)$. Inverse-Fourier-transforming recovers the convolution formula. The Fourier-series version (for the heat equation on a bounded interval or ring) was Fourier's original 1822 method and the historical motivation for the entire Fourier-analytic apparatus.

  • Separation of variables 02.13.07. The bounded-domain analogue of the Fourier-transform method. For the heat equation on a bounded interval $[0, L]$ with Dirichlet boundary conditions $u(0,t) = u(L,t) = 0$, the eigenfunctions of $-\partial_x^2$ are $\sin(n\pi x/L)$ for $n = 1, 2, \dots$, with eigenvalues $(n\pi/L)^2$. The solution of the heat equation with initial data $u_0$ is $u(x,t) = \sum_{n \ge 1} b_n\,e^{-(n\pi/L)^2 t}\sin(n\pi x/L)$, where $b_n = \frac{2}{L}\int_0^L u_0(x)\sin(n\pi x/L)\,dx$ are the Fourier sine coefficients. The Sturm-Liouville theory [Sturm 1836] [Liouville 1836] generalises the spectral decomposition to second-order ODEs with variable coefficients, providing the foundational tool for heat-equation problems on bounded domains with general geometry or variable material coefficients.

  • Schrödinger equation and Wick rotation [12.04]. The quantum-mechanical analogue of the heat equation, related by analytic continuation in time. The free Schrödinger equation $i\,\partial_t \psi = -\Delta \psi$ (with $\hbar = 1$, $m = 1/2$) becomes the heat equation $\partial_\tau \psi = \Delta \psi$ under the substitution $t = -i\tau$. The free-particle propagator in quantum mechanics is the analytic continuation of the heat kernel: $(4\pi i t)^{-n/2}\,e^{i|x|^2/(4t)}$. The Feynman-Kac formula on the heat-equation side becomes the Feynman path integral on the Schrödinger-equation side. The imaginary-time analytic continuation is the bridge between diffusion and quantum mechanics.

  • Brownian motion and stochastic processes [11.05]. The probabilistic interpretation of the heat equation. The heat kernel is the transition density of -dimensional Brownian motion: the probability that a Brownian particle starting at is at at time is . The heat equation is the Kolmogorov forward equation for Brownian motion. Diffusion processes with drift and variable diffusion coefficients satisfy more general parabolic equations of the form , with the Aronson Gaussian bounds (Theorem 8) connecting the parabolic-equation theory to the underlying stochastic differential equation.

  • Thermodynamics and Brownian motion [11.01]. The physical origin of the heat equation. Fourier's 1822 derivation of the heat equation modelled the diffusion of thermal energy in a solid, with the diffusion constant determined by the thermal conductivity, specific heat, and density of the material. Einstein's 1905 paper [Einstein 1905] derived the heat equation for the position density of a Brownian particle in a fluid, with the diffusion constant given by the Stokes-Einstein relation $D = k_B T/(6\pi\eta r)$ (with $k_B$ Boltzmann's constant, $T$ the absolute temperature, $\eta$ the fluid viscosity, $r$ the particle radius). Measurements of this diffusion constant gave the first direct measurement of Avogadro's number and confirmed the atomic-molecular picture of matter.
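The separation-of-variables recipe in the list above can be sketched in a few lines. The example assumes unit diffusivity on the interval $[0, \pi]$ with Dirichlet conditions and the initial profile $u_0(x) = x(\pi - x)$; the interval, the profile, and the function names are illustrative choices, not taken from the text. For this profile the sine coefficients are known in closed form ($b_n = 8/(\pi n^3)$ for odd $n$, $0$ for even $n$), which gives a check on the quadrature.

```python
import math

L = math.pi  # interval length (illustrative choice)

def u0(x):
    """Initial temperature profile, vanishing at both endpoints."""
    return x * (L - x)

def sine_coefficient(n, samples=2000):
    """b_n = (2/L) * int_0^L u0(x) sin(n pi x / L) dx, by the midpoint rule."""
    dx = L / samples
    total = sum(u0((i + 0.5) * dx) * math.sin(n * math.pi * (i + 0.5) * dx / L)
                for i in range(samples))
    return 2.0 / L * total * dx

def heat_solution(x, t, modes=50):
    """Truncated series sum_n b_n exp(-(n pi / L)^2 t) sin(n pi x / L)."""
    return sum(sine_coefficient(n) * math.exp(-(n * math.pi / L) ** 2 * t)
               * math.sin(n * math.pi * x / L)
               for n in range(1, modes + 1))

print(sine_coefficient(1), 8.0 / math.pi)    # odd mode against the closed form
print(sine_coefficient(2))                   # even modes vanish by symmetry
print(heat_solution(L / 2, 0.0), u0(L / 2))  # the series reproduces the initial data
print(heat_solution(L / 2, 1.0))             # dominated by the n = 1 mode at t = 1
```

Each mode decays like $e^{-n^2 t}$ here, so by $t = 1$ the $n = 3$ term is already smaller than the $n = 1$ term by a factor of order $e^{-8}$; this is the bounded-domain version of the statement that high frequencies decay first.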

Historical & philosophical context Master

Fourier's 1822 Théorie analytique de la chaleur [Fourier 1822] is the founding document of the modern theory of partial differential equations. Fourier derived the heat equation from a physical model of heat conduction in a solid, introducing the now-standard idea that the flux of heat at a point is proportional to the negative gradient of the temperature (Fourier's law of heat conduction). The conservation of energy then gives the local PDE for a homogeneous isotropic material with thermal diffusivity . Fourier solved the equation by his eponymous method: decompose the temperature into eigenmodes of the spatial Laplacian (sines and cosines on a bounded interval; complex exponentials in the integral version on the whole line), evolve each mode independently as a damped exponential in time, and reassemble the solution by superposition. The Fourier series and Fourier integral were both invented for this purpose; their applications to other branches of mathematics (number theory, harmonic analysis, quantum mechanics, signal processing) came later.

Sturm and Liouville's 1836 papers [Sturm 1836] [Liouville 1836] in the Journal de Mathématiques Pures et Appliquées gave the first systematic treatment of the eigenvalue problem for second-order linear ODEs with variable coefficients, the Sturm-Liouville theory. Their motivation was the heat equation on bounded domains with general geometry: separation of variables reduces the PDE to an ODE eigenvalue problem, and the Sturm-Liouville theory guarantees the existence of a complete orthogonal basis of eigenfunctions in the associated weighted $L^2$ space, generalising the Fourier-sine basis on the interval to arbitrary Sturm-Liouville problems. The theorem on interlacing of zeros (the zeros of consecutive eigenfunctions interlace) is one of the foundational results of spectral theory.

Tikhonov's 1935 Matematicheskii Sbornik paper [Tikhonov 1935] gave the explicit non-uniqueness counterexample for the Cauchy problem without growth restrictions. Tikhonov's example was a landmark: it showed that the natural-looking PDE $u_t = u_{xx}$ on $\mathbb{R} \times (0,\infty)$ with zero initial data has infinitely many smooth solutions if one drops the boundedness assumption. The example exhibits the precise growth rate at which uniqueness fails: super-exponential in $|x|$ for any fixed $t > 0$. The same paper proved the matching uniqueness theorem: uniqueness holds for solutions in the growth class $|u(x,t)| \le A e^{a|x|^2}$ with constants $A, a > 0$. Widder's 1944 Trans. Amer. Math. Soc. paper [Widder 1944] added uniqueness within the class of non-negative solutions, with no growth hypothesis. The Tikhonov-Widder pair frames the modern understanding of well-posedness for the heat-equation Cauchy problem: the growth class matters, the critical rate is Gaussian with width , and the boundedness assumption that mathematicians take for granted is not gratuitous but a precise technical hypothesis at the edge of the uniqueness theorem.

Einstein's 1905 Annalen der Physik paper on Brownian motion [Einstein 1905] showed that the heat equation governs the position density of a Brownian particle suspended in a fluid. Einstein derived the diffusion constant from kinetic theory and the equipartition of energy, giving the Stokes-Einstein relation $D = k_B T/(6\pi\eta r)$ (Boltzmann's constant $k_B$, absolute temperature $T$, fluid viscosity $\eta$, particle radius $r$). Jean Perrin's experimental verification (1909, Nobel Prize 1926) of Einstein's predictions for Brownian motion gave the first direct measurement of Avogadro's number and was the decisive confirmation of the atomic-molecular theory of matter. The heat equation, originally a tool for predicting how heat diffuses in iron rods, turned out to supply the quantitative test that established the existence of atoms.

Wiener's 1923 Journal of Mathematics and Physics paper [Wiener 1923] gave the first rigorous construction of Brownian motion as a stochastic process on the path space of continuous functions . Wiener constructed what is now called Wiener measure on this path space: a probability measure on continuous paths starting at the origin, with the property that the finite-dimensional marginal distributions are exactly those of an -dimensional Brownian motion. The heat kernel appears as the transition density of the process. Wiener's construction was the foundation of modern stochastic analysis and the bridge between probability theory and PDE theory.

Kac's 1949 Transactions of the American Mathematical Society paper [Kac 1949] gave the Feynman-Kac formula representing solutions of the heat equation with potential as path integrals against Wiener measure. The formula is the probabilistic analogue of the variation-of-parameters formula in ODE theory, and is the bridge between the analyst's view of the heat equation (solve via Fourier transform and convolution) and the probabilist's view (compute the expected value of a functional of Brownian paths). The Wick rotation converts the Feynman-Kac formula into Feynman's path-integral formula for quantum mechanics (Feynman 1948), making the heat equation and the Schrödinger equation two analytic-continuation versions of the same mathematical structure.

Nash 1958 American Journal of Mathematics [Nash 1958] and De Giorgi 1957 Mem. Accad. Sci. Torino [De Giorgi 1957] independently proved the Hölder regularity of solutions of divergence-form equations with bounded measurable coefficients (De Giorgi in the elliptic case, Nash in both the parabolic and the elliptic case), the foundational result of modern PDE regularity theory. Nash's method was an entropy argument inspired by statistical mechanics, applied to the fundamental solution; De Giorgi's was a deterministic iteration scheme converting $L^2$ control into Hölder control. Moser 1964 [Moser 1964] unified the two approaches via the parabolic Harnack inequality. The Nash-De Giorgi-Moser apparatus is the keystone of the modern theory of partial differential equations with rough coefficients, underlying the theory of homogenisation, the modern theory of stochastic differential equations with non-smooth coefficients, and the Cheeger-Gromov theory of geometric analysis on metric-measure spaces.

Aronson's 1967 Bulletin of the AMS paper [Aronson 1967] proved the two-sided Gaussian bounds on the fundamental solution of divergence-form parabolic equations with bounded measurable coefficients, sharpening the Nash-De Giorgi-Moser regularity into pointwise estimates. The Aronson bounds say that the fundamental solution of an arbitrary uniformly-elliptic parabolic equation with rough coefficients behaves like the heat kernel, up to multiplicative constants depending only on the ellipticity bounds. The result is the parabolic analogue of the Green-function pointwise bounds for the Laplacian and is the foundational tool of modern parabolic theory.

The heat equation now appears as the foundational example in essentially every textbook of partial differential equations and mathematical physics, and the heat-kernel apparatus underlies a vast range of modern mathematics: spectral geometry (heat-kernel expansions on Riemannian manifolds; Atiyah-Singer index theorem), geometric flows (Hamilton's Ricci flow; Perelman's resolution of the Poincaré conjecture), stochastic analysis (Itô's stochastic calculus; the Malliavin calculus; large deviations theory), quantum field theory (Wick rotation and the Schwinger-DeWitt expansion), and computational physics (finite-difference and finite-element methods for diffusion problems in engineering and science). The arc from Fourier's 1822 Théorie analytique de la chaleur to modern Ricci-flow theory is a two-century lineage in which the same equation, the same heat kernel, and the same convolution recipe have been continuously refined into ever more general and ever more powerful tools.

Bibliography Master

@book{Fourier1822,
  author    = {Fourier, Jean Baptiste Joseph},
  title     = {Th\'eorie analytique de la chaleur},
  publisher = {Firmin Didot},
  address   = {Paris},
  year      = {1822}
}

@article{Sturm1836,
  author  = {Sturm, Charles},
  title   = {M\'emoire sur les \'equations diff\'erentielles lin\'eaires du second ordre},
  journal = {Journal de Math\'ematiques Pures et Appliqu\'ees},
  volume  = {1},
  year    = {1836},
  pages   = {106--186}
}

@article{Liouville1836,
  author  = {Liouville, Joseph},
  title   = {Sur le d\'eveloppement des fonctions ou parties de fonctions en s\'eries},
  journal = {Journal de Math\'ematiques Pures et Appliqu\'ees},
  volume  = {1},
  year    = {1836},
  pages   = {253--265}
}

@article{Tikhonov1935,
  author  = {Tikhonov, Andrey N.},
  title   = {Th\'eor\`emes d'unicit\'e pour l'\'equation de la chaleur},
  journal = {Matematicheskii Sbornik (N.S.)},
  volume  = {42},
  year    = {1935},
  pages   = {199--216}
}

@article{Widder1944,
  author  = {Widder, David V.},
  title   = {Positive temperatures on an infinite rod},
  journal = {Transactions of the American Mathematical Society},
  volume  = {55},
  year    = {1944},
  pages   = {85--95}
}

@article{Nash1958,
  author  = {Nash, John},
  title   = {Continuity of solutions of parabolic and elliptic equations},
  journal = {American Journal of Mathematics},
  volume  = {80},
  year    = {1958},
  pages   = {931--954}
}

@article{DeGiorgi1957,
  author  = {De Giorgi, Ennio},
  title   = {Sulla differenziabilit\`a e l'analiticit\`a delle estremali degli integrali multipli regolari},
  journal = {Memorie dell'Accademia delle Scienze di Torino. Classe di Scienze Fisiche, Matematiche e Naturali. Serie 3},
  volume  = {3},
  year    = {1957},
  pages   = {25--43}
}

@article{Moser1964,
  author  = {Moser, J\"urgen},
  title   = {A {H}arnack inequality for parabolic differential equations},
  journal = {Communications on Pure and Applied Mathematics},
  volume  = {17},
  year    = {1964},
  pages   = {101--134}
}

@article{Aronson1967,
  author  = {Aronson, Donald G.},
  title   = {Bounds for the fundamental solution of a parabolic equation},
  journal = {Bulletin of the American Mathematical Society},
  volume  = {73},
  year    = {1967},
  pages   = {890--896}
}

@article{Einstein1905,
  author  = {Einstein, Albert},
  title   = {\"Uber die von der molekularkinetischen Theorie der W\"arme geforderte Bewegung von in ruhenden Fl\"ussigkeiten suspendierten Teilchen},
  journal = {Annalen der Physik},
  volume  = {17},
  year    = {1905},
  pages   = {549--560}
}

@article{Wiener1923,
  author  = {Wiener, Norbert},
  title   = {Differential-space},
  journal = {Journal of Mathematics and Physics},
  volume  = {2},
  year    = {1923},
  pages   = {131--174}
}

@article{Kac1949,
  author  = {Kac, Mark},
  title   = {On distributions of certain {W}iener functionals},
  journal = {Transactions of the American Mathematical Society},
  volume  = {65},
  year    = {1949},
  pages   = {1--13}
}

@book{Evans2010,
  author    = {Evans, Lawrence C.},
  title     = {Partial Differential Equations},
  edition   = {2},
  publisher = {American Mathematical Society},
  series    = {Graduate Studies in Mathematics},
  volume    = {19},
  year      = {2010}
}

@book{John1982,
  author    = {John, Fritz},
  title     = {Partial Differential Equations},
  edition   = {4},
  publisher = {Springer},
  year      = {1982}
}

@book{Strauss2008,
  author    = {Strauss, Walter A.},
  title     = {Partial Differential Equations: An Introduction},
  edition   = {2},
  publisher = {Wiley},
  year      = {2008}
}

@book{Friedman1964,
  author    = {Friedman, Avner},
  title     = {Partial Differential Equations of Parabolic Type},
  publisher = {Prentice-Hall},
  year      = {1964}
}

@book{Lieberman1996,
  author    = {Lieberman, Gary M.},
  title     = {Second Order Parabolic Differential Equations},
  publisher = {World Scientific},
  year      = {1996}
}

@book{StroockVaradhan1979,
  author    = {Stroock, Daniel W. and Varadhan, S. R. Srinivasa},
  title     = {Multidimensional Diffusion Processes},
  publisher = {Springer},
  series    = {Grundlehren der mathematischen Wissenschaften},
  volume    = {233},
  year      = {1979}
}