Galerkin Existence and Energy Estimates for Second-Order Parabolic Equations
Anchor (Master): Evans §7.1-§7.2; Lions, Équations différentielles opérationnelles et problèmes aux limites (Springer 1961); Temam, Navier-Stokes Equations (North-Holland 1977), Ch. 3 §1-§3; Showalter, Monotone Operators in Banach Space and Nonlinear PDEs (AMS 1997), Ch. 3-4; Wloka, Partial Differential Equations (Cambridge 1987), §25-§26
Intuition Beginner
A diffusion equation describes a quantity that smooths itself out as time runs forward: heat in a bar, a chemical seeping through a gel, a population spreading across a habitat. The companion unit on the heat equation handed you one such equation and one explicit formula, the spreading Gaussian, that solved it on an infinite bar with constant material. But most real diffusion happens in a bounded region, through a material whose conductivity varies from place to place and even drifts with time. There is no tidy Gaussian for that. The question becomes sharper: does a solution even exist, and is it the only one?
This unit answers yes by building the solution out of simple pieces instead of guessing a formula. Pick a handful of standard "shapes" that a function on the region can have, the lowest-pitched vibration modes of the region, and look for the best approximate solution that is a blend of just those few shapes. Each shape has a strength that changes over time, and plugging the blend into the diffusion rule turns the partial differential equation into an ordinary system of rate equations for those strengths, the kind of system that always has a solution. Then add more shapes, and more, and watch the approximations settle down toward an honest solution of the full equation.
Why should adding more shapes settle down rather than blow up? Because diffusion drains energy. At every instant the total "size" of the solution can only decrease, apart from whatever the source term pumps in, and a clean bookkeeping inequality turns that physical fact into a uniform numerical bound on every approximation at once, no matter how many shapes you used. A family of approximations that is uniformly bounded cannot run off to infinity; it must have a settling-down point, and that point is the solution.
This is the energy method, and the shape-by-shape construction is the Galerkin method. Together they give existence, uniqueness, and a guarantee that the solution depends continuously on the data, for diffusion equations far too general for any explicit formula. The price is that the solution is found in an averaged, weak sense first, exactly as in the companion unit on weak solutions of the steady-state problem; smoothness is recovered afterward.
Visual Beginner
The picture to hold is a stack of approximations climbing toward a true solution, each one a blend of a fixed handful of standard shapes, all kept from blowing up by a single energy ceiling.
Read the panels left to right. The left panel is the reduction: instead of searching among all functions, you search among blends of a few fixed shapes. The diffusion rule, restricted to those blends, becomes a closed system of ordinary rate equations for the time-varying strengths, and such systems are solvable by standard ordinary-differential-equation theory.
The middle panel is the limit. Each rung of the ladder uses more shapes than the one below, so each is a finer approximation. The rungs crowd together as you climb because a single energy ceiling, drawn as the horizontal bar, holds all of them at once; nothing on the ladder can escape upward. A bounded climbing family has a settling-down point, the dashed curve at the top, and that limit solves the full equation.
The right panel is the source of the ceiling. The total energy of any approximation starts at the energy of the initial data and, because diffusion only smooths and never sharpens, can only fall, except for the controlled amount the source term feeds in. That one-way energy budget is the whole reason the construction converges.
Worked example Beginner
We run the shape-by-shape recipe by hand on the simplest bounded diffusion problem and watch the energy fall. Take the bar from zero to $\pi$, held at zero temperature at both ends, governed by the rule "rate of temperature change equals curvature", which is $u_t = u_{xx}$. Start with an initial temperature profile of the form $u(x,0) = a_1 \sin x + a_2 \sin 2x$, with $a_1$ and $a_2$ given nonzero numbers, already a blend of two standard shapes.
Step 1. Choose the shapes. The natural shapes for a bar clamped at both ends are the standing waves $\sin x$, $\sin 2x$, $\sin 3x$, and so on. Each one keeps its shape under the diffusion rule and only changes in strength, which is exactly what makes the recipe close up neatly.
Step 2. Write the blend. Look for $u(x,t) = c_1(t)\sin x + c_2(t)\sin 2x$, with the two strengths $c_1(t)$ and $c_2(t)$ to be found. The starting strengths are $c_1(0) = a_1$ and $c_2(0) = a_2$, read straight off the initial profile.
Step 3. Turn the rule into rate equations. The curvature of $\sin kx$ is $-k^2 \sin kx$. Matching the rate of change of each shape to its curvature gives two separate rate equations: $c_1' = -c_1$ and $c_2' = -4c_2$. The shapes have decoupled into independent decays.
Step 4. Solve the rate equations. Each is simple exponential decay: $c_1(t) = a_1 e^{-t}$ and $c_2(t) = a_2 e^{-4t}$. So $u(x,t) = a_1 e^{-t}\sin x + a_2 e^{-4t}\sin 2x$. The higher shape, $\sin 2x$, decays four times faster, the precise statement that finer wrinkles smooth out first.
Step 5. Watch the energy fall. Measure energy by the total of $u^2$ across the bar. Because the shapes are orthogonal, this total is $\frac{\pi}{2}\big(c_1(t)^2 + c_2(t)^2\big)$. At time zero this is $\frac{\pi}{2}\big(a_1^2 + a_2^2\big)$. At any later time both terms are smaller, so the energy has strictly fallen, and it heads to zero as time runs on.
What this tells us: restricting to a few shapes turned the partial differential equation into a handful of decoupled decays we solved exactly, and the energy, measured as the total of $u^2$, only decreased. With no source feeding the bar, the energy ceiling is just its starting value, and every approximation stays under it forever. That falling energy is the engine that, in the general case, keeps the shape-by-shape approximations from blowing up as we add more shapes.
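The five steps can be checked numerically. The following is a minimal sketch, assuming a bar of length $\pi$ and hypothetical starting strengths `A1`, `A2` (illustrative values, not taken from the text); the function names are likewise illustrative. It evaluates the two decaying strengths, computes the energy from the orthogonality formula, and cross-checks it by integrating the squared profile directly.

```python
import numpy as np

BAR_LENGTH = np.pi   # assumed bar length, so the shapes are sin(x) and sin(2x)
A1, A2 = 1.0, 0.5    # hypothetical starting strengths (not from the text)


def strengths(t):
    """Exact solutions of c1' = -c1 and c2' = -4*c2 with c1(0)=A1, c2(0)=A2."""
    return A1 * np.exp(-t), A2 * np.exp(-4.0 * t)


def energy_closed_form(t):
    """Integral of u^2 over the bar, using orthogonality of the two shapes."""
    c1, c2 = strengths(t)
    return 0.5 * BAR_LENGTH * (c1**2 + c2**2)


def energy_quadrature(t, n=20001):
    """Same integral by the trapezoid rule applied to the profile itself."""
    x = np.linspace(0.0, BAR_LENGTH, n)
    c1, c2 = strengths(t)
    u2 = (c1 * np.sin(x) + c2 * np.sin(2.0 * x)) ** 2
    dx = x[1] - x[0]
    return dx * (u2.sum() - 0.5 * (u2[0] + u2[-1]))


if __name__ == "__main__":
    for t in (0.0, 0.25, 0.5, 1.0):
        print(f"t={t:4.2f}  energy={energy_closed_form(t):.6f}  "
              f"quadrature={energy_quadrature(t):.6f}")
```

Running the script prints an energy column that shrinks from one row to the next, with the two ways of computing it agreeing to rounding error.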
Check your understanding Beginner
Formal definition Intermediate+
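Throughout, $\Omega \subset \mathbb{R}^n$ is open and bounded, $T > 0$ is fixed, $\Omega_T = \Omega \times (0,T]$, and the spatial operator at each time $t$ is the divergence-form second-order operator of 02.16.04,
$$L u = -\sum_{i,j=1}^{n} \big(a^{ij}(x,t)\,u_{x_i}\big)_{x_j} + \sum_{i=1}^{n} b^{i}(x,t)\,u_{x_i} + c(x,t)\,u,$$
with coefficients $a^{ij}, b^i, c \in L^\infty(\Omega_T)$, $a^{ij} = a^{ji}$, and uniform ellipticity $\sum_{i,j} a^{ij}(x,t)\,\xi_i \xi_j \ge \theta |\xi|^2$ for a.e. $(x,t) \in \Omega_T$ and all $\xi \in \mathbb{R}^n$. The time-dependent bilinear form is
$$B[u,v;t] = \int_\Omega \Big( \sum_{i,j=1}^{n} a^{ij}(\cdot,t)\,u_{x_i} v_{x_j} + \sum_{i=1}^{n} b^{i}(\cdot,t)\,u_{x_i} v + c(\cdot,t)\,u v \Big)\,dx,$$
bounded and Gårding-coercive on $H_0^1(\Omega)$ uniformly in $t$ by the estimates of 02.16.04.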
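Definition (Gelfand triple). Let $V = H_0^1(\Omega)$ and $H = L^2(\Omega)$, with $V \hookrightarrow H$ continuous and dense. Identifying $H$ with its own dual through the Riesz map of 02.11.08, and composing with the dual of the inclusion $V \hookrightarrow H$, yields the Gelfand triple (or rigged Hilbert space)
$$V \hookrightarrow H \cong H^* \hookrightarrow V^* = H^{-1}(\Omega),$$
with both inclusions continuous and dense. The duality pairing $\langle \cdot, \cdot \rangle : V^* \times V \to \mathbb{R}$ extends the inner product $(\cdot,\cdot)$ of $H$: for $h \in H$ and $v \in V$, $\langle h, v \rangle = (h, v)$.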
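Definition (Bochner spaces). For a Banach space $X$ and $1 \le p \le \infty$, $L^p(0,T;X)$ is the space of (strongly measurable) functions $u : (0,T) \to X$ with $\|u\|_{L^p(0,T;X)} = \big(\int_0^T \|u(t)\|_X^p\,dt\big)^{1/p} < \infty$ (essential supremum for $p = \infty$); these are the Bochner spaces built on the Bochner integral of 24.01.01. A function $u \in L^2(0,T;V)$ has weak time-derivative $u' \in L^2(0,T;V^*)$ if
$$\int_0^T u(t)\,\varphi'(t)\,dt = -\int_0^T u'(t)\,\varphi(t)\,dt \quad \text{in } V^* \text{ for all } \varphi \in C_c^\infty(0,T).$$
The Bochner-Sobolev space of solutions is
$$W(0,T) = \big\{ u \in L^2(0,T;V) : u' \in L^2(0,T;V^*) \big\}.$$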
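Definition (weak solution of the parabolic problem). Given $f \in L^2(0,T;V^*)$ and $g \in H$, a function $u \in W(0,T)$ is a weak solution of the initial/boundary-value problem if $u(0) = g$ in $H$ (a meaningful pointwise statement by the embedding theorem below) and
$$\langle u'(t), v \rangle + B[u(t), v; t] = \langle f(t), v \rangle \quad \text{for all } v \in V \text{ and a.e. } t \in (0,T).$$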
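Definition (Galerkin approximation). Let $\{w_k\}_{k \ge 1}$ be the orthonormal-in-$H$, orthogonal-in-$V$ basis of Dirichlet eigenfunctions, $-\Delta w_k = \lambda_k w_k$ in $\Omega$, $w_k = 0$ on $\partial\Omega$, supplied by the spectral theorem for the compact resolvent of 02.16.04; $\{w_k\}$ is complete in both $V$ and $H$. The $m$-th Galerkin approximation is
$$u_m(t) = \sum_{k=1}^{m} d_m^k(t)\,w_k,$$
where the coefficient vector $d_m = (d_m^1, \dots, d_m^m)$ solves the finite-dimensional ODE system
$$(u_m'(t), w_k) + B[u_m(t), w_k; t] = \langle f(t), w_k \rangle \quad (k = 1, \dots, m,\ \text{a.e. } t), \qquad d_m^k(0) = (g, w_k).$$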
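To make the finite-dimensional system concrete, here is a minimal sketch under several assumptions that are not in the text: a one-dimensional bar $(0,\pi)$, no lower-order terms, the sine eigenfunctions as basis, trapezoid quadrature for the stiffness entries, and implicit Euler time stepping standing in for the abstract ODE solvability. The function name `galerkin_heat` and its parameters are hypothetical. It assembles the time-dependent matrix with entries $\int a\,w_k' w_l'\,dx$ and advances the coefficient vector, recording its Euclidean norm, which equals the $L^2$ norm of $u_m$ because the basis is orthonormal.

```python
import numpy as np


def galerkin_heat(m, a, g, f=None, T=1.0, steps=200, nq=2001):
    """Galerkin coefficients for u_t - (a(x,t) u_x)_x = f on (0, pi), u = 0 at both ends.

    m: number of sine modes; a(x, t): diffusion coefficient (positive);
    g(x): initial data; f(x, t): optional source. Returns the final
    coefficient vector and the history of its Euclidean norm.
    """
    x = np.linspace(0.0, np.pi, nq)
    h = x[1] - x[0]
    wq = np.full(nq, h)
    wq[0] = wq[-1] = 0.5 * h                        # trapezoid weights
    k = np.arange(1, m + 1)[:, None]
    W = np.sqrt(2.0 / np.pi) * np.sin(k * x)        # w_k(x), orthonormal in L^2
    dW = np.sqrt(2.0 / np.pi) * k * np.cos(k * x)   # w_k'(x)
    d = W @ (wq * g(x))                             # d_k(0) = (g, w_k)
    dt = T / steps
    norms = [np.linalg.norm(d)]
    for n in range(1, steps + 1):
        t = n * dt
        A = (dW * (wq * a(x, t))) @ dW.T            # A_kl(t) = int a w_k' w_l' dx
        F = W @ (wq * f(x, t)) if f is not None else np.zeros(m)
        # implicit Euler step: (I + dt*A) d_new = d_old + dt*F
        d = np.linalg.solve(np.eye(m) + dt * A, d + dt * F)
        norms.append(np.linalg.norm(d))
    return d, np.array(norms)


if __name__ == "__main__":
    a = lambda x, t: 1.0 + 0.5 * np.sin(x) * np.cos(t)   # hypothetical coefficient
    g = lambda x: x * (np.pi - x)                         # hypothetical initial data
    for m in (2, 5, 10):
        _, norms = galerkin_heat(m, a, g)
        print(m, norms[0], norms[-1])
```

With no source, the recorded norms never increase, whatever the number of modes: a discrete counterpart of the uniform energy ceiling. Implicit Euler is chosen here because it preserves that monotonicity for any step size.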
Counterexamples to common slips Intermediate+
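The time-derivative lives in $V^*$, not $H$. For a weak solution, $u_t$ belongs only to $L^2(0,T;H^{-1})$. The stronger statement $u' \in L^2(0,T;H)$ needs the extra hypotheses $g \in V$ and $f \in L^2(0,T;H)$. In general the equation $u' + L u = f$ holds in $V^*$, not in $H$.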
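The pairing identity needs the embedding, not just integration by parts. The formula $\frac{d}{dt}\|u(t)\|_H^2 = 2\langle u'(t), u(t)\rangle$, the engine of every energy estimate, is valid precisely for $u \in L^2(0,T;V)$ with $u' \in L^2(0,T;V^*)$ and requires the Lions-Aubin embedding of that space into $C([0,T];H)$. Writing it down for $u \in L^2(0,T;V)$ alone is meaningless: such a $u$ has no pointwise-in-time values in $H$ and no initial trace.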
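Coercivity may only hold after a shift. The Gårding inequality gives $B[u,u;t] \ge \beta\|u\|_V^2 - \gamma\|u\|_H^2$, not coercivity outright. The substitution $v = e^{-\gamma t}u$ converts the equation for $u$ into one with the genuinely coercive form $B[\cdot,\cdot;t] + \gamma(\cdot,\cdot)_H$; forgetting this shift makes the Grönwall step circular when $\gamma > 0$.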
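The shift can be spelled out in three lines. The notation below is assumed for illustration: $B[\cdot,\cdot;t]$ for the bilinear form, $(\cdot,\cdot)$ for the inner product of $H$, $\langle\cdot,\cdot\rangle$ for the $V^*$-$V$ pairing, and $\beta > 0$, $\gamma \ge 0$ for the Gårding constants.

```latex
% Assumed symbols: B (bilinear form), beta and gamma (Garding constants),
% u a weak solution with source f, w an arbitrary test function in V.
\begin{align*}
  v(t) &= e^{-\gamma t}\,u(t), \qquad v' = e^{-\gamma t}\,u' - \gamma v, \\
  \langle v', w\rangle + B[v, w; t] + \gamma\,(v, w)
       &= e^{-\gamma t}\,\langle f, w\rangle \qquad \text{for all } w \in V, \\
  B[v, v; t] + \gamma\,\|v\|_H^2 &\ge \beta\,\|v\|_V^2 .
\end{align*}
```

The second line follows by multiplying the weak equation for $u$ by $e^{-\gamma t}$ and using the first; the third is the Gårding inequality with the $\gamma$ term moved to the left, which is coercivity of the shifted form.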
Compactness is needed only for nonlinear passage; the linear limit is purely weak. For the linear equation, weak and weak-* limits of the Galerkin sequence suffice to pass to the limit in every term, because all terms are linear in . The Aubin-Lions strong compactness becomes essential only when a nonlinearity must be passed to the limit; invoking it for the linear problem is unnecessary, though harmless.
Key theorem with proof Intermediate+
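Theorem (Galerkin existence and uniqueness). Let $L$ be uniformly elliptic with $L^\infty$ coefficients on $\Omega_T$, so that $B[\cdot,\cdot;t]$ is bounded with constant $M$ and satisfies the Gårding inequality $B[u,u;t] \ge \beta\|u\|_V^2 - \gamma\|u\|_H^2$ uniformly in $t$. Then for every $f \in L^2(0,T;V^*)$ and $g \in H$ there exists a unique weak solution $u$ of the parabolic problem, and it obeys the a priori estimate $\|u\|_{L^\infty(0,T;H)} + \|u\|_{L^2(0,T;V)} + \|u'\|_{L^2(0,T;V^*)} \le C\big(\|g\|_H + \|f\|_{L^2(0,T;V^*)}\big)$ with $C = C(\beta,\gamma,M,T)$ [Evans 2010 §7.1] [Lions 1961].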
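Proof. Step 1 (the finite-dimensional system is solvable). Fix $m$. Writing $u_m(t) = \sum_{k=1}^m d_m^k(t)\,w_k$ and using $(w_k, w_l) = \delta_{kl}$, the Galerkin equations read $\frac{d}{dt} d_m^k(t) + \sum_{l=1}^m e^{kl}(t)\,d_m^l(t) = f^k(t)$, a linear ODE system with $e^{kl}(t) = B[w_l, w_k; t] \in L^\infty(0,T)$ and $f^k(t) = \langle f(t), w_k\rangle \in L^2(0,T)$. By the Carathéodory existence theorem for ODEs with $L^\infty$ / $L^2$ coefficients there is a unique absolutely continuous $d_m$ on $[0,T]$ with the prescribed initial data, hence a unique $u_m$.
Step 2 (energy estimate, uniform in $m$). Multiply the $k$-th Galerkin equation by $d_m^k(t)$ and sum over $k = 1, \dots, m$; since $u_m = \sum_k d_m^k w_k$ this is exactly the test choice $v = u_m(t)$: $$(u_m', u_m) + B[u_m, u_m; t] = \langle f, u_m\rangle.$$ The first term is $\frac12 \frac{d}{dt}\|u_m\|_H^2$. Gårding bounds the second from below by $\beta\|u_m\|_V^2 - \gamma\|u_m\|_H^2$, and the right side is estimated by $\|f\|_{V^*}\|u_m\|_V \le \frac{1}{2\beta}\|f\|_{V^*}^2 + \frac{\beta}{2}\|u_m\|_V^2$ via Young's inequality. Rearranging, $$\frac{d}{dt}\|u_m\|_H^2 + \beta\|u_m\|_V^2 \le 2\gamma\|u_m\|_H^2 + \frac{1}{\beta}\|f\|_{V^*}^2.$$ Dropping the nonnegative $\beta\|u_m\|_V^2$ and applying the differential Grönwall inequality [Grönwall 1919] gives, for $0 \le t \le T$, $$\|u_m(t)\|_H^2 \le e^{2\gamma t}\Big(\|u_m(0)\|_H^2 + \frac{1}{\beta}\int_0^t \|f(s)\|_{V^*}^2\,ds\Big) \le e^{2\gamma T}\Big(\|g\|_H^2 + \frac{1}{\beta}\|f\|_{L^2(0,T;V^*)}^2\Big),$$ since $\|u_m(0)\|_H \le \|g\|_H$ ($u_m(0)$ is the $H$-orthogonal projection of $g$ onto $V_m = \operatorname{span}\{w_1, \dots, w_m\}$). This bounds $\|u_m\|_{L^\infty(0,T;H)}$. Integrating the rearranged inequality over $[0,T]$ and using the just-proved sup bound on the $2\gamma\|u_m\|_H^2$ term bounds $\beta\int_0^T \|u_m\|_V^2\,dt$, hence $\|u_m\|_{L^2(0,T;V)}$. Both bounds are independent of $m$.
Step 3 (estimate on the time-derivative). Fix $v \in V$ with $\|v\|_V \le 1$ and split $v = v^1 + v^2$ with $v^1 \in V_m$ the $H$-orthogonal projection and $(v^2, w_k) = 0$ for $k \le m$. Then $\langle u_m', v\rangle = (u_m', v^1) = \langle f, v^1\rangle - B[u_m, v^1; t]$, so $|\langle u_m', v\rangle| \le \|f\|_{V^*} + M\|u_m\|_V$, using $\|v^1\|_V \le \|v\|_V \le 1$. Taking the supremum over such $v$ identifies $\sup_{\|v\|_V \le 1} |\langle u_m', v\rangle|$ with $\|u_m'\|_{V^*}$, whence $\int_0^T \|u_m'\|_{V^*}^2\,dt \le 2\int_0^T \big(\|f\|_{V^*}^2 + M^2\|u_m\|_V^2\big)\,dt$, again bounded uniformly in $m$.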
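Step 4 (passage to the limit). Steps 2-3 give $u_m$ bounded in $L^2(0,T;V)$ and $u_m'$ bounded in $L^2(0,T;V^*)$. Both spaces are reflexive (Hilbert), so by Banach-Alaoglu / weak sequential compactness there is a subsequence with $u_{m_j} \rightharpoonup u$ in $L^2(0,T;V)$ and $u_{m_j}' \rightharpoonup u'$ in $L^2(0,T;V^*)$ (the weak limit of the derivatives is the derivative of the weak limit, since distributional differentiation is weakly continuous). Hence $u$ lies in the solution space. Fix $v \in V$ and $\varphi \in C^1([0,T])$ with $\varphi(T) = 0$, and write $v_m \in V_m$ for the $H$-orthogonal projection of $v$ onto $V_m = \operatorname{span}\{w_1, \dots, w_m\}$. For $m = m_j$, test the Galerkin identity with $v_m$, multiply by $\varphi(t)$ and integrate by parts in time: $$-\int_0^T (u_m, v_m)\,\varphi'\,dt + \int_0^T B[u_m, v_m; t]\,\varphi\,dt = \int_0^T \langle f, v_m\rangle\,\varphi\,dt + (u_m(0), v_m)\,\varphi(0).$$ Every term is linear and continuous in $u_m$ for the weak topologies, so passing along the subsequence replaces $u_m$ by $u$ and $v_m$ by $v$ (the projections converge to $v$ in $V$), while $(u_m(0), v_m) = (g, v_m) \to (g, v)$. The resulting identity holds for every $v \in V$ and every such $\varphi$; undoing the integration by parts shows $\langle u', v\rangle + B[u, v; t] = \langle f, v\rangle$ for a.e. $t$ and $u(0) = g$. Thus $u$ is a weak solution, and the a priori estimate is the limit of the uniform bounds by weak lower semicontinuity of the norms.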
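Step 5 (uniqueness). If $u$ is a weak solution with $f = 0$, $g = 0$, test the equation with $v = u(t)$ (legitimate by the embedding and the pairing identity below): $\frac12 \frac{d}{dt}\|u\|_H^2 + B[u,u;t] = 0$, so $\frac{d}{dt}\|u\|_H^2 \le 2\gamma\|u\|_H^2$, and Grönwall with $\|u(0)\|_H = 0$ forces $u \equiv 0$. Two solutions with the same data thus coincide, since their difference solves the problem with zero data.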
Bridge. The energy estimate is the foundational reason the construction converges: the Gårding coercivity of 02.16.04, the same inequality that proved elliptic existence, now controls the time-integrated -norm, while the -norm of the solution is dominated for all time by the data through Grönwall — this is exactly the bounded coercive form of the stationary theory promoted to an evolution by integrating in time. The abstract Lax-Milgram solvability of the elliptic problem appears again here at every fixed time as the solvability of the Galerkin ODE matrix , and the whole scheme builds toward the semigroup picture, where the operator generates the solution flow. Putting these together, the central insight is that a parabolic equation is an elliptic energy inequality integrated against time and closed by Grönwall, so existence costs no explicit kernel and no Fourier analysis; the bridge is that uniqueness and continuous dependence both fall out of the single identity , which generalises the Beginner-tier falling-energy picture into a rigorous statement valid for variable, time-dependent coefficients.
Exercises Intermediate+
Advanced results Master
The Galerkin/energy existence theorem sits inside a wider structure. The semigroup viewpoint of Hille-Yosida recasts the autonomous case as an abstract Cauchy problem generated by ; the Lions-Aubin compactness lemma upgrades weak convergence to the strong convergence needed for nonlinear and quasilinear problems; the variational Lions theorem replaces the eigenbasis by an arbitrary dense sequence and handles genuinely non-self-adjoint, time-dependent forms; parabolic regularity bootstraps the weak solution to a classical one; and the same apparatus, with the energy identity replaced by a conserved quantity, treats the hyperbolic (wave) equation. Each refines the Galerkin argument of the Intermediate tier.
Theorem 1 (semigroup representation; autonomous case). Let be time-independent, with generating, via the Hille-Yosida theorem, a strongly continuous (in fact analytic) contraction-type semigroup on [Hille 1948] [Yosida 1948]. Then the weak solution constructed by Galerkin coincides with the mild/semigroup solution
the Duhamel/variation-of-parameters formula of 02.13.03 lifted to the abstract operator. Analyticity of the semigroup encodes the parabolic smoothing: for , maps into the domain of every power of , so is spatially smooth even for only. The Galerkin energy estimate is, in this language, the dissipativity that Hille-Yosida requires of a generator.
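In the eigenbasis the coincidence of the Galerkin and semigroup solutions can be checked mode by mode. The sketch below assumes the autonomous one-dimensional model $u_t = u_{xx} + f$ on $(0,\pi)$ with eigenvalues $\lambda_k = k^2$ and a hypothetical source whose $k$-th coefficient is $F_k \cos t$; the function names are illustrative. It compares the closed-form Duhamel expression $e^{-\lambda t} g_k + \int_0^t e^{-\lambda(t-s)} F_k \cos s\,ds$ with a Crank-Nicolson integration of the Galerkin rate equation $d' + \lambda d = F_k \cos t$.

```python
import numpy as np


def duhamel_mode(lam, g_k, F_k, t):
    """Closed form of exp(-lam t) g_k + int_0^t exp(-lam (t-s)) F_k cos(s) ds."""
    forced = (lam * np.cos(t) + np.sin(t) - lam * np.exp(-lam * t)) / (lam**2 + 1.0)
    return np.exp(-lam * t) * g_k + F_k * forced


def galerkin_mode(lam, g_k, F_k, T, steps):
    """Crank-Nicolson time stepping of d' + lam d = F_k cos(t), d(0) = g_k."""
    dt = T / steps
    d = np.array(g_k, dtype=float)
    for n in range(steps):
        t0, t1 = n * dt, (n + 1) * dt
        rhs = (1.0 - 0.5 * dt * lam) * d + 0.5 * dt * F_k * (np.cos(t0) + np.cos(t1))
        d = rhs / (1.0 + 0.5 * dt * lam)
    return d


if __name__ == "__main__":
    lam = np.arange(1, 5, dtype=float) ** 2      # lambda_k = k^2 for k = 1..4
    g_k = np.array([1.0, -0.5, 0.25, 0.1])       # hypothetical initial coefficients
    F_k = np.array([0.3, 0.0, -0.2, 1.0])        # hypothetical source amplitudes
    exact = duhamel_mode(lam, g_k, F_k, 1.0)
    approx = galerkin_mode(lam, g_k, F_k, 1.0, 1000)
    print(np.max(np.abs(exact - approx)))
```

The printed discrepancy is the time-discretisation error only; it shrinks quadratically as `steps` grows, which is the expected Crank-Nicolson rate.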
Theorem 2 (Aubin-Lions-Simon compactness). Let with the first embedding compact. Then compactly [Aubin 1963] [Lions 1969]. The proof interpolates: the compact embedding makes a -bounded sequence precompact in at a.e. fixed time, and the uniform -bound on provides equicontinuity in time (an Ehrling-inequality argument: ). This is the lemma that lets the Galerkin method pass nonlinearities to the limit: a quasilinear term converges because strongly in , hence a.e. after a further subsequence.
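For reference, the Ehrling inequality used in this argument can be stated as follows. The space names $X$, $Y$, $Z$ are assumed here for illustration, with $X$ compactly embedded in $Y$ and $Y$ continuously embedded in $Z$; the second line is the time-integrated form applied to differences of a sequence $u_m$.

```latex
% Assumed names: X compactly embedded in Y, Y continuously embedded in Z.
\begin{align*}
  &\text{for every } \varepsilon > 0 \text{ there is } C_\varepsilon \text{ with}\quad
    \|w\|_Y \le \varepsilon\,\|w\|_X + C_\varepsilon\,\|w\|_Z
    \quad \text{for all } w \in X, \\
  &\|u_m - u_n\|_{L^2(0,T;Y)} \le
    \varepsilon\,\|u_m - u_n\|_{L^2(0,T;X)}
    + C_\varepsilon\,\|u_m - u_n\|_{L^2(0,T;Z)} .
\end{align*}
```

The first term on the right is small because the sequence is bounded in $L^2(0,T;X)$ and $\varepsilon$ is arbitrary; the second is where the bound on the time-derivatives enters, by making the sequence precompact in the weakest norm.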
Theorem 3 (Lions' variational existence; general time-dependent forms). Let be a family of bounded bilinear forms on , measurable in , with the uniform Gårding inequality . For every and there is a unique with for all , a.e. , and [Lions 1961] [Lions-Magenes 1972 Ch. 3]. The proof replaces the eigenbasis by any sequence dense in (no spectral theory needed) and runs the same energy estimate and weak-limit passage; the inf-sup structure of the space-time bilinear form on supplies existence à la Babuška-Nečas 02.16.04. This is the form in which the method generalizes to Navier-Stokes and to monotone-operator equations.
Theorem 4 (parabolic regularity). If, in addition, , , and with , then the weak solution satisfies and , with
[Evans 2010 §7.1]. Higher regularity follows by differentiating the equation in and bootstrapping with the interior elliptic estimate of 02.16.04; with smooth data and compatibility conditions the weak solution is the classical solution. The mechanism is the improved energy estimate (Exercise 7): testing with rather than trades one time-derivative for one elliptic gain.
Theorem 5 (the hyperbolic parallel). For the second-order hyperbolic problem with , , the same Galerkin scheme produces a unique weak solution with and . The energy estimate now tests with and uses the conserved quantity , whose time-derivative is controlled rather than sign-definite; there is no smoothing and no gain of regularity, the signature contrast between parabolic dissipation and hyperbolic conservation. The wave equation [02.13.03 successor chapter] is the constant-coefficient instance.
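The contrast between conservation and dissipation is visible already at the level of Galerkin coefficients. The sketch below assumes the constant-coefficient model on $(0,\pi)$ with eigenvalues $k^2$ and no source, so that the wave coefficients solve $d_k'' + k^2 d_k = 0$ and the heat coefficients solve $d_k' + k^2 d_k = 0$. The function names and the sample initial data are hypothetical.

```python
import numpy as np


def wave_modes(a, b, t):
    """Exact coefficients for d_k'' + k^2 d_k = 0 with d_k(0)=a_k, d_k'(0)=b_k."""
    k = np.arange(1, len(a) + 1)
    d = a * np.cos(k * t) + (b / k) * np.sin(k * t)
    v = -a * k * np.sin(k * t) + b * np.cos(k * t)
    return d, v


def wave_energy(a, b, t):
    """(1/2) * (||u'||^2 + B[u,u]) written in the orthonormal eigenbasis."""
    k = np.arange(1, len(a) + 1)
    d, v = wave_modes(a, b, t)
    return 0.5 * np.sum(v**2 + (k * d) ** 2)


def heat_energy(a, t):
    """(1/2) * ||u||^2 for the heat flow started from the same coefficients."""
    k = np.arange(1, len(a) + 1)
    return 0.5 * np.sum((a * np.exp(-(k**2) * t)) ** 2)


if __name__ == "__main__":
    a = np.array([1.0, 0.5, -0.25])   # hypothetical initial displacement coefficients
    b = np.array([0.0, 1.0, 0.3])     # hypothetical initial velocity coefficients
    for t in (0.0, 0.3, 1.0, 5.0):
        print(t, wave_energy(a, b, t), heat_energy(a, t))
```

The wave column stays constant in time while the heat column decays, which is the signature contrast described in the theorem.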
Synthesis. The energy estimate is the foundational reason the Galerkin scheme converges, and it is exactly the Gårding coercivity of 02.16.04 integrated against time and closed by Grönwall: the time-integrated -norm is controlled by ellipticity while the -norm is propagated by the data, so existence costs no kernel and no Fourier analysis. This is dual to the elliptic theory in a precise sense — the stationary Lax-Milgram solvability at each frozen time is what makes the Galerkin ODE matrix invertible, and the parabolic solution operator is the time-ordered product of these elliptic resolvents, which in the autonomous case is exactly the analytic semigroup generated by through Hille-Yosida, the Duhamel formula of 02.13.03 lifted to operators. Putting these together, the central insight is that a parabolic equation is one energy identity, , fed three different right-hand sides: tested against it gives existence and uniqueness, tested against it gives the improved regularity that recovers the classical solution, and combined with the compact embedding of 02.16.04 through Aubin-Lions it gives the strong convergence that carries the whole method into the nonlinear world. The hyperbolic problem is the same scheme with dissipation replaced by conservation, so the bridge from this unit reaches simultaneously back to the elliptic existence theory it integrates and forward to the semigroup, regularity, and nonlinear evolution theories it generates.
Full proof set Master
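Proposition 1 (uniform energy estimate). Under the hypotheses of the main theorem, the Galerkin approximations $u_m$ satisfy $\|u_m\|_{L^\infty(0,T;H)} + \|u_m\|_{L^2(0,T;V)} + \|u_m'\|_{L^2(0,T;V^*)} \le C\big(\|g\|_H + \|f\|_{L^2(0,T;V^*)}\big)$ with $C$ independent of $m$.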
Proof. Testing the Galerkin identity with gives . Gårding and Young () yield . Discarding and applying Grönwall (Exercise 4) bounds . Reinstating and integrating over : using the sup bound just proved. Finally (Step 3 of the theorem, via -orthogonal projection onto ), so , completing the bound.
Proposition 2 (existence of a weak solution). A subsequence of converges weakly in , with converging weakly in , to a limit that is a weak solution with .
Proof. By Proposition 1 and reflexivity, extract in and in . For and , ; passing to the limit (weak convergence against the fixed test pairing) gives , so and . Fix and with . For , multiplying the Galerkin identity by and integrating by parts in time gives . Each term is weakly continuous in ; since in (projections of fixed ), the limit reads . By density of in this holds for all ; choosing recovers the equation a.e., and comparing the boundary terms for general with forces .
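Proposition 3 (uniqueness and continuous dependence). The weak solution is unique and depends continuously on the data $(f, g)$: $\|u\|_{L^\infty(0,T;H)} + \|u\|_{L^2(0,T;V)} + \|u'\|_{L^2(0,T;V^*)} \le C\big(\|g\|_H + \|f\|_{L^2(0,T;V^*)}\big)$.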
Proof. The estimate is the limit of Proposition 1 under weak lower semicontinuity, together with the embedding (Exercise 8). For uniqueness, the difference of two solutions with the same data solves , . The pairing identity (Exercise 3) gives . Grönwall with gives for all , so the two solutions agree.
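Proposition 4 (the abstract pairing identity). For $u \in L^2(0,T;V)$ with $u' \in L^2(0,T;V^*)$, the map $t \mapsto \|u(t)\|_H^2$ is absolutely continuous with $\frac{d}{dt}\|u(t)\|_H^2 = 2\langle u'(t), u(t)\rangle$ for a.e. $t$, and $\max_{0 \le t \le T}\|u(t)\|_H \le C\big(\|u\|_{L^2(0,T;V)} + \|u'\|_{L^2(0,T;V^*)}\big)$.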
Proof. Smooth -valued functions are dense in (mollify in time after extending). For such the identity is the classical product rule, making the pairing an inner product. For general take in smooth; then . The integrand converges in (weak-times-strong in dual spaces), and the difference converges uniformly by the a priori bound of Exercise 8 applied to , giving a Cauchy sequence in whose limit is the continuous representative of . Passing to the limit yields the identity and the embedding bound .
Connections Master
The spatial engine is the elliptic weak theory of 02.16.04: the Gårding coercivity proved there from uniform ellipticity is the exact input to the parabolic energy estimate, and the Dirichlet eigenbasis driving the Galerkin scheme is the spectral decomposition of the compact resolvent built in that unit. This unit owns the time-evolution layer; 02.16.04 owns the frozen-time elliptic solvability that the Galerkin matrix inherits at each instant.
The Bochner spaces $L^p(0,T;X)$, the weak time-derivative, and the Gelfand triple are built on the Sobolev-space and Bochner-integral framework of 24.01.01; that unit supplies the duality and the trace giving meaning to the spatial boundary condition, while this unit assembles them into the solution space and the Lions-Aubin embedding into $C([0,T];H)$.
The constant-coefficient, whole-space heat equation of 02.13.03 is the explicitly solvable special case: its Gaussian heat kernel is the integral form of the abstract semigroup of Theorem 1, and its Duhamel formula is the variation-of-parameters representation that the Galerkin solution realizes for variable, bounded-domain coefficients where no kernel is available. This unit is the existence theory of which 02.13.03 is the one computable instance.
The semigroup viewpoint of Theorem 1 is developed in its own right in 02.18.03 (Hille-Yosida and $C_0$-semigroups), which characterizes exactly which operators generate the parabolic flow; the Galerkin energy estimate of this unit is the dissipativity hypothesis of that generation theorem, so the two units are the constructive and the generator-theoretic faces of the same evolution.
The variational/minimization counterpart for the stationary problem is the direct method of 02.18.04, which finds elliptic solutions as energy minimizers; the parabolic flow of this unit is the gradient flow of that same Dirichlet energy, so the long-time limit of the parabolic solution converges to the minimizer, linking time-evolution existence to variational existence within the chapter.
Historical & philosophical context Master
The approximation by finitely many fixed shapes originates with Boris Galerkin's 1915 paper on the elastic equilibrium of rods and plates [Galerkin 1915], which projected the equilibrium equations onto a finite set of admissible deflection functions; the idea itself traces to Walther Ritz's 1908 variational method, of which Galerkin's is the weak-form generalization not requiring an energy functional. The extension of the method from steady-state to time-dependent problems is due to Sandro Faedo, whose 1949 Annali della Scuola Normale Superiore di Pisa memoir [Faedo 1949] introduced what is now called the Faedo-Galerkin method: project onto finitely many spatial modes, solve the resulting system of ordinary differential equations in time, and pass to the limit using a-priori energy bounds.
The functional-analytic completion belongs to Jacques-Louis Lions, whose 1961 Équations différentielles opérationnelles [Lions 1961] cast parabolic and hyperbolic problems in the Gelfand-triple framework and proved existence for general time-dependent coercive forms by the variational method, with the trace and compactness theory developed jointly with Enrico Magenes [Lions-Magenes 1972]. The compactness lemma that carries the method into nonlinear problems was given by Jean-Pierre Aubin in 1963 [Aubin 1963] and extended by Lions [Lions 1969]. The dual semigroup formulation rests on the Hille-Yosida generation theorem of Einar Hille [Hille 1948] and Kōsaku Yosida [Yosida 1948], independently proved in 1948, and the Grönwall inequality closing the energy estimate is Thomas Grönwall's 1919 lemma [Grönwall 1919].
Bibliography Master
@article{Galerkin1915,
author = {Galerkin, Boris G.},
title = {Rods and plates: series in some questions of elastic equilibrium of rods and plates},
journal = {Vestnik Inzhenerov i Tekhnikov},
volume = {19},
year = {1915},
pages = {897--908}
}
@article{Faedo1949,
author = {Faedo, Sandro},
title = {Un nuovo metodo per l'analisi esistenziale e quantitativa dei problemi di propagazione},
journal = {Annali della Scuola Normale Superiore di Pisa, Serie 3},
volume = {1},
year = {1949},
pages = {1--41}
}
@book{Lions1961,
author = {Lions, Jacques-Louis},
title = {\'Equations diff\'erentielles op\'erationnelles et probl\`emes aux limites},
series = {Grundlehren der mathematischen Wissenschaften},
number = {111},
publisher = {Springer},
year = {1961}
}
@book{LionsMagenes1972,
author = {Lions, Jacques-Louis and Magenes, Enrico},
title = {Non-Homogeneous Boundary Value Problems and Applications I},
series = {Grundlehren der mathematischen Wissenschaften},
number = {181},
publisher = {Springer},
year = {1972}
}
@article{Aubin1963,
author = {Aubin, Jean-Pierre},
title = {Un th\'eor\`eme de compacit\'e},
journal = {Comptes Rendus de l'Acad\'emie des Sciences Paris},
volume = {256},
year = {1963},
pages = {5042--5044}
}
@article{Gronwall1919,
author = {Gr\"onwall, Thomas H.},
title = {Note on the derivatives with respect to a parameter of the solutions of a system of differential equations},
journal = {Annals of Mathematics},
volume = {20},
year = {1919},
pages = {292--296}
}
@article{Yosida1948,
author = {Yosida, K\=osaku},
title = {On the differentiability and the representation of one-parameter semi-group of linear operators},
journal = {Journal of the Mathematical Society of Japan},
volume = {1},
year = {1948},
pages = {15--21}
}
@book{Hille1948,
author = {Hille, Einar},
title = {Functional Analysis and Semi-Groups},
series = {American Mathematical Society Colloquium Publications},
number = {31},
publisher = {American Mathematical Society},
year = {1948}
}