The Direct Method of the Calculus of Variations
Anchor (Master): Evans §8.1-§8.2; Dacorogna, Direct Methods in the Calculus of Variations, 2e (Springer 2008); Giusti, Direct Methods in the Calculus of Variations (World Scientific 2003); Morrey, Multiple Integrals in the Calculus of Variations (Springer 1966), Ch. 1, 4; Ball, Convexity conditions and existence theorems in nonlinear elasticity (Arch. Rational Mech. Anal. 1977)
Intuition Beginner
Many problems in physics and geometry ask for the shape that does the least of something. The soap film stretched across a wire loop settles into the shape of least area. A hanging chain settles into the shape of least energy. A heated plate, once it stops changing, holds the temperature pattern that stores the least of a certain energy. In every case there is a number attached to each candidate shape, and nature picks the shape that makes that number as small as possible.
The direct method is a strategy for proving that such a smallest-shape actually exists. It is "direct" because it goes straight at the minimum instead of first writing down an equation the minimum must satisfy and then solving the equation. The older, indirect route is to say: if a best shape exists, it has to obey a certain balance condition, so let us solve for shapes obeying that condition. The danger is that the balance condition might have no solution, or the solution it finds might not really be the best shape. The direct method sidesteps this by hunting for the minimum first.
Here is the whole idea in three steps. First, look at the list of all possible values of the number you want to minimize and find the lowest value it ever gets close to, its floor. Second, build a sequence of candidate shapes whose numbers march down toward that floor, getting better and better. Third, and this is the hard part, show that this marching sequence settles down to an actual limiting shape, and that the limiting shape really achieves the floor rather than overshooting it.
The third step has two enemies. The sequence of improving shapes might wander off without settling anywhere, the way a sequence of numbers like one, two, three keeps growing and never lands. Or it might settle down to a limit, but the number attached to the limit could jump up above the floor at the last moment, so the limit is not actually the best shape.
The direct method names exactly the two properties that defeat these enemies. A budget that forbids wandering is called coercivity: large shapes must cost a lot, so the improving sequence cannot run off to infinity. A rule that forbids the last-moment jump is called lower semicontinuity: when shapes settle toward a limit, their number can only drop or hold steady, never leap upward.
When both properties hold, the smallest shape exists. That is the direct method, and it is the engine behind most existence proofs in the modern theory of partial differential equations.
Visual Beginner
The single picture to hold is a landscape of candidate shapes with a value attached to each, and a sequence of guesses walking downhill toward the lowest point.
Read the left panel as the heart of the method. The bowl is the value attached to each candidate shape. Because the bowl turns upward at its edges, a shape that is very large or very wild sits high on the walls, so any sequence of improving guesses is trapped near the bottom and cannot escape sideways. This trapping is coercivity. The staircase of guesses is the improving sequence, each one a little lower than the last, piling up at the bottom of the bowl. The dashed line is the floor, the lowest value the number ever gets near.
The right panel shows what can still go wrong even after the guesses pile up. If the value can leap upward right at the limiting shape, then the limit the guesses approached might sit above the floor, and the would-be best shape is a fraud. Lower semicontinuity is the promise that no such upward leap happens: approaching a limit, the value only falls or holds level. With both panels in force, the bottom of the bowl is genuinely reached.
Worked example Beginner
We watch the direct method work on the simplest possible energy, the one whose smallest shape we can find by hand and then check. Take the region to be the interval from zero to one, and attach to each function the number equal to the total of its slope squared, the integral of the square of the derivative, with the function pinned to the value zero at the left end and the value one at the right end. This number is a clean stand-in for stored energy.
Step 1. Find the floor. Among all functions running from zero up to one across the interval, which one spends the least squared slope? The straight ramp, the function whose value is just the position itself, has constant slope equal to one. Its squared slope is one at every point, and the total over the interval of length one is exactly one. So the straight ramp gives the value one.
Step 2. Check that one really is the floor. Any function from zero to one must climb a total height of one. A theorem about averages says that spreading a fixed climb evenly, at constant slope, spends the least squared slope; bunching the climb into a steep stretch spends more, because squaring punishes large slopes. So no function beats the straight ramp, and the floor is one, reached by the ramp.
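The "theorem about averages" invoked in Step 2 can be written out with the Cauchy-Schwarz inequality. The following is a sketch, with $u$ introduced here (it is not named in the text above) for any function satisfying $u(0)=0$, $u(1)=1$ whose derivative is square-integrable:

```latex
1 \;=\; u(1) - u(0) \;=\; \int_0^1 u'(x)\,dx
  \;\le\; \Big(\int_0^1 1^2\,dx\Big)^{1/2} \Big(\int_0^1 u'(x)^2\,dx\Big)^{1/2}
  \;=\; \Big(\int_0^1 u'(x)^2\,dx\Big)^{1/2},
\qquad\text{hence}\qquad
\int_0^1 u'(x)^2\,dx \;\ge\; 1 .
```

Equality in Cauchy-Schwarz requires $u'$ to be constant, and the boundary values then force $u(x) = x$, the straight ramp.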
Step 3. Build an improving sequence. Suppose we did not already know the answer. We could take a sequence of wiggly functions that still run from zero to one but whose wiggles shrink at each stage, their squared-slope totals dropping toward one: maybe one and a half, then one and a quarter, then one and an eighth, marching down.
Step 4. Watch the two safeguards. Coercivity here is the fact that a function with a huge squared-slope total is expensive, so the improving sequence cannot run off to wild functions; it is penned near the ramp. Lower semicontinuity is the fact that as our improving functions settle toward the ramp, their squared-slope totals settle toward the ramp's value of one without jumping above it. Both hold, so the marching sequence lands on the ramp, and the ramp is the genuine minimizer.
What this tells us: the smallest-energy shape exists and is the straight ramp, and we found it by following a sequence of improving guesses down to its floor and checking that nothing let the guesses escape or jump at the last moment. The balance condition that the ramp secretly satisfies, constant slope, is the simplest case of the Euler-Lagrange equation, which the Intermediate tier turns into a precise rule.
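The worked example can be checked numerically. The sketch below assumes a uniform grid and forward differences, and it uses a hypothetical improving sequence (the ramp plus a shrinking sine wiggle); none of these choices come from the text, and they are only one convenient way to watch the energies march down to the floor.

```python
import numpy as np

def dirichlet_energy(u, h):
    """Approximate the integral of (u')^2 over [0, 1] using forward differences."""
    slopes = np.diff(u) / h
    return float(np.sum(slopes**2) * h)

n = 1000                          # number of grid cells (illustrative choice)
h = 1.0 / n
x = np.linspace(0.0, 1.0, n + 1)

# The straight ramp u(x) = x, pinned to 0 at the left end and 1 at the right end.
energy_ramp = dirichlet_energy(x, h)

# Hypothetical improving sequence: ramp plus a wiggle that vanishes at both ends
# and shrinks at each stage, so the boundary values are preserved.
amplitudes = [0.2, 0.1, 0.05, 0.025]
energies = [dirichlet_energy(x + a * np.sin(2 * np.pi * x), h) for a in amplitudes]

print("ramp energy:", energy_ramp)
print("energies of the improving sequence:", energies)
```

Every wiggly candidate costs more than the ramp, and the costs decrease toward the ramp's value as the wiggle shrinks, which is the behaviour Steps 3 and 4 describe.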
Check your understanding Beginner
Formal definition Intermediate+
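Throughout, $\Omega \subset \mathbb{R}^n$ is open and bounded with Lipschitz boundary $\partial\Omega$ (so the trace and extension theory applies), $1 < p < \infty$, and the unknown is a scalar function $u : \Omega \to \mathbb{R}$ unless the vectorial case is named. The Sobolev space $W^{1,p}(\Omega)$, its norm $\|u\|_{W^{1,p}(\Omega)}$, the trace operator, and the $W_0^{1,p}(\Omega)$ apparatus are taken as available 02.16.01, and the weak compactness of norm-bounded sequences in the reflexive space $W^{1,p}(\Omega)$ (every bounded sequence has a weakly convergent subsequence) is supplied by 02.16.03, where it underlies the Rellich-Kondrachov argument.
The Lagrangian and the variational integral. Let $L : \mathbb{R}^n \times \mathbb{R} \times \Omega \to \mathbb{R}$, written $L(\xi, z, x)$ with $\xi$ the gradient slot, $z$ the value slot, and $x$ the position, be continuous in $(\xi, z)$ and measurable in $x$. The associated variational integral (or energy functional) is
$$I[u] = \int_\Omega L(\nabla u(x), u(x), x)\,dx,$$
defined on the admissible class
$$\mathcal{A} = \{\, u \in W^{1,p}(\Omega) : u = g \text{ on } \partial\Omega \text{ in the trace sense} \,\}$$
for a prescribed boundary datum $g$ (the trace of some fixed $W^{1,p}(\Omega)$ function). A function $u \in \mathcal{A}$ is a minimizer if $I[u] \le I[w]$ for all $w \in \mathcal{A}$.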
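Definition (coercivity). The functional $I$ is coercive on $\mathcal{A}$ if there are constants $\alpha > 0$ and $\beta \ge 0$ with
$$L(\xi, z, x) \ge \alpha|\xi|^p - \beta \quad \text{for all } (\xi, z, x),$$
so that $I[u] \ge \alpha\|\nabla u\|_{L^p(\Omega)}^p - \beta|\Omega|$. Together with the Poincaré inequality 02.16.03 (admissible $u$ share the fixed trace $g$, so $u - w_0$ has zero trace for a fixed $w_0 \in \mathcal{A}$), coercivity forces $I[u] \to +\infty$ as $\|u\|_{W^{1,p}(\Omega)} \to \infty$ within $\mathcal{A}$: sublevel sets $\{u \in \mathcal{A} : I[u] \le c\}$ are bounded in $W^{1,p}(\Omega)$.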
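Definition (weak lower semicontinuity). The functional $I$ is (sequentially) weakly lower semicontinuous on $W^{1,p}(\Omega)$ if
$$u_k \rightharpoonup u \text{ weakly in } W^{1,p}(\Omega) \quad\Longrightarrow\quad I[u] \le \liminf_{k \to \infty} I[u_k].$$
The inequality is one-directional by design: the value of $I$ at the weak limit may be strictly smaller than the limiting values along the sequence, never larger.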
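Definition (convexity in the gradient). The Lagrangian $L$ is convex in $\xi$ if for each fixed $(z, x)$ the map $\xi \mapsto L(\xi, z, x)$ is convex:
$$L(t\xi + (1-t)\eta, z, x) \le t\,L(\xi, z, x) + (1-t)\,L(\eta, z, x)$$
for $\xi, \eta \in \mathbb{R}^n$ and $t \in [0,1]$. It is strictly convex in $\xi$ if the inequality is strict for $\xi \ne \eta$, $t \in (0,1)$. When $L$ is $C^2$ in $\xi$, convexity in $\xi$ is equivalent to the Hessian condition $\sum_{i,j=1}^n L_{\xi_i \xi_j}(\xi, z, x)\,\eta_i \eta_j \ge 0$ for all $\eta \in \mathbb{R}^n$.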
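Definition (Euler-Lagrange equation, weak form). A minimizer $u \in \mathcal{A}$ is a weak solution of the Euler-Lagrange equation if, for every test function $v \in W_0^{1,p}(\Omega)$,
$$\int_\Omega \Big( D_\xi L(\nabla u, u, x) \cdot \nabla v + L_z(\nabla u, u, x)\, v \Big)\,dx = 0,$$
which is the weak form of the divergence-structure PDE $-\operatorname{div}\big(D_\xi L(\nabla u, u, x)\big) + L_z(\nabla u, u, x) = 0$. This is the variational counterpart of the classical Euler-Lagrange derivation 09.02.02; there the equation is obtained as a necessary pointwise condition on a smooth extremal, here it is read off as the vanishing first variation of $I$ at the minimizer.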
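As a concrete instance of the weak Euler-Lagrange equation, consider the Dirichlet energy with a source term. The sketch below assumes the notation $L(\xi, z, x)$ for the Lagrangian with gradient slot $\xi$ and value slot $z$, and a given function $f \in L^2(\Omega)$; the symbol $f$ and the exponent choice $p = 2$ are illustrative assumptions, not taken from the text.

```latex
L(\xi, z, x) = \tfrac12 |\xi|^2 - f(x)\,z
\quad\Longrightarrow\quad
D_\xi L(\xi, z, x) = \xi, \qquad L_z(\xi, z, x) = -f(x).

\text{Weak form:}\quad
\int_\Omega \big( \nabla u \cdot \nabla v - f\,v \big)\,dx = 0
\quad \text{for all test functions } v \text{ with zero trace}.

\text{Divergence-structure PDE:}\quad
-\operatorname{div}(\nabla u) - f = 0,
\quad\text{that is,}\quad -\Delta u = f \ \text{ in } \Omega .
```

With $f = 0$ and $\Omega = (0,1)$ this reduces to $u'' = 0$, the constant-slope condition satisfied by the straight ramp of the Beginner worked example.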
Counterexamples to common slips Intermediate+
Coercivity is about growth in $\xi$, not in $z$. The exponent in the lower bound must match the Sobolev exponent of the admissible class; a bound $L \ge \alpha|z|^p - \beta$ does not control the gradient and gives no $W^{1,p}$ bound. The Poincaré inequality is what converts the gradient bound into a full-norm bound, and it needs the fixed boundary trace.
Lower semicontinuity needs the weak topology, not the strong one. In the strong $W^{1,p}$ topology $I$ is continuous for every continuous $L$ with $p$-growth, hence harmless, but minimizing sequences only converge weakly (that is all coercivity plus reflexivity delivers). The content of Tonelli's theorem is lower semicontinuity along merely weakly convergent sequences, where $\nabla u_k \rightharpoonup \nabla u$ but $\nabla u_k \not\to \nabla u$ in norm.
Convexity in $\xi$ is sufficient but the wrong condition in the vectorial case. For $u : \Omega \to \mathbb{R}^m$ with $n, m \ge 2$, weak lower semicontinuity is equivalent to Morrey quasiconvexity of $L$ in the gradient matrix, a strictly weaker condition than convexity. The determinant on $\mathbb{R}^{2 \times 2}$ is quasiconvex (even a null Lagrangian) but not convex, and energies built from it are weakly lower semicontinuous without being convex.
A minimizer need not be smooth, so the classical and weak Euler-Lagrange equations are not interchangeable a priori. The direct method produces a minimizer satisfying the weak equation; promoting it to a classical solution of the pointwise PDE requires separate regularity theory (De Giorgi-Nash-Moser, Schauder), and for certain vectorial integrands minimizers are genuinely singular.
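The gap between weak and strong convergence of gradients, the second slip above, can be seen numerically. The sketch below assumes the illustrative sequence $u_k(x) = \sin(2\pi k x)/(2\pi k)$ on $(0,1)$ and one fixed test function; both are choices made here, not examples from the text. The gradients pair to zero against the test function, while their squared $L^2$ norm stays at one half, so the energy of the weak limit $u = 0$ sits strictly below the limiting energies.

```python
import numpy as np

def integrate(values, h):
    """Midpoint-rule sum on a uniform grid of cell midpoints."""
    return float(np.sum(values) * h)

n = 200_000                            # number of cells (illustrative choice)
h = 1.0 / n
x = (np.arange(n) + 0.5) * h           # cell midpoints of [0, 1]
test_function = x * (1.0 - x)          # one fixed smooth test function (assumption)

frequencies = [1, 4, 16, 64]
pairings = []                          # integral of u_k' times the test function
energies = []                          # integral of (u_k')^2

for k in frequencies:
    grad = np.cos(2 * np.pi * k * x)   # derivative of sin(2*pi*k*x) / (2*pi*k)
    pairings.append(integrate(grad * test_function, h))
    energies.append(integrate(grad**2, h))

for k, pairing, energy in zip(frequencies, pairings, energies):
    print(f"k={k:3d}  pairing={pairing:+.6f}  energy={energy:.6f}")
```

The pairings shrink toward zero (weak convergence of the gradients to zero), yet the energies do not shrink, so the gradients do not converge in norm. The inequality $I[u] \le \liminf I[u_k]$ holds here with strict inequality, $0 < 1/2$.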
Key theorem with proof Intermediate+
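Theorem (Tonelli; existence of a minimizer). Let $\Omega \subset \mathbb{R}^n$ be open and bounded, $1 < p < \infty$, and $L$, $D_\xi L$ continuous, with $\xi \mapsto L(\xi, z, x)$ **convex** for each fixed $(z, x)$ and the coercivity bound $L(\xi, z, x) \ge \alpha|\xi|^p - \beta$ holding with $\alpha > 0$, $\beta \ge 0$. Suppose the admissible class $\mathcal{A}$ is nonempty and $\inf_{\mathcal{A}} I < \infty$. Then $I$ attains its minimum on $\mathcal{A}$: there exists $u \in \mathcal{A}$ with $I[u] = \min_{w \in \mathcal{A}} I[w]$ [Tonelli 1921] [Evans 2010 §8.2].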
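Proof. Write $m := \inf_{w \in \mathcal{A}} I[w]$; by hypothesis $m < \infty$, and coercivity gives $I[w] \ge -\beta|\Omega|$ for every $w \in \mathcal{A}$, so $m$ is finite.
Step 1 (minimizing sequence and a priori bound). Choose $u_k \in \mathcal{A}$ with $I[u_k] \to m$. For $k$ large, $I[u_k] \le m + 1$, so by coercivity $\alpha\|\nabla u_k\|_{L^p(\Omega)}^p - \beta|\Omega| \le m + 1$, hence $\sup_k \|\nabla u_k\|_{L^p(\Omega)} < \infty$. Fix $w_0 \in \mathcal{A}$; then $u_k - w_0 \in W_0^{1,p}(\Omega)$, and the Poincaré inequality 02.16.03 gives $\|u_k - w_0\|_{L^p(\Omega)} \le C\|\nabla u_k - \nabla w_0\|_{L^p(\Omega)}$. Therefore $\|u_k\|_{L^p(\Omega)} \le \|w_0\|_{L^p(\Omega)} + C\big(\|\nabla u_k\|_{L^p(\Omega)} + \|\nabla w_0\|_{L^p(\Omega)}\big)$ and $\sup_k \|u_k\|_{W^{1,p}(\Omega)} < \infty$, a bound uniform in $k$.
Step 2 (extract a weak limit). The space $W^{1,p}(\Omega)$ is reflexive for $1 < p < \infty$ 02.16.01. By weak compactness of bounded sequences in a reflexive space 02.16.03, a subsequence (still denoted $u_k$) converges weakly, $u_k \rightharpoonup u$ in $W^{1,p}(\Omega)$, meaning $u_k \rightharpoonup u$ in $L^p(\Omega)$ and $\nabla u_k \rightharpoonup \nabla u$ in $L^p(\Omega; \mathbb{R}^n)$. The trace operator is weakly continuous, so $u = g$ on $\partial\Omega$ and $u \in \mathcal{A}$.
Step 3 (weak lower semicontinuity from convexity). The crux is
$$I[u] \le \liminf_{k \to \infty} I[u_k].$$
By the Rellich-Kondrachov theorem 02.16.03 the embedding $W^{1,p}(\Omega) \hookrightarrow L^p(\Omega)$ is compact, so passing to a further subsequence $u_k \to u$ strongly in $L^p(\Omega)$ and pointwise a.e. Fix $\varepsilon > 0$. By Egorov's theorem there is a measurable $E_\varepsilon \subset \Omega$ with $|\Omega \setminus E_\varepsilon| \le \varepsilon$ on which $u_k \to u$ uniformly and $u$, $\nabla u$ are bounded. Convexity of $\xi \mapsto L(\xi, z, x)$ gives the supporting-hyperplane (gradient) inequality
$$L(\nabla u_k, u_k, x) \ge L(\nabla u, u_k, x) + D_\xi L(\nabla u, u_k, x) \cdot (\nabla u_k - \nabla u).$$
Integrate over $E_\varepsilon$. The first right-hand term tends to $\int_{E_\varepsilon} L(\nabla u, u, x)\,dx$ by uniform convergence and continuity of $L$. The second term has the form $\int_{E_\varepsilon} D_\xi L(\nabla u, u_k, x) \cdot (\nabla u_k - \nabla u)\,dx$ with $D_\xi L(\nabla u, u_k, x) \to D_\xi L(\nabla u, u, x)$ uniformly and boundedly on $E_\varepsilon$; since $\nabla u_k \rightharpoonup \nabla u$ in $L^p(\Omega; \mathbb{R}^n)$ and $D_\xi L(\nabla u, u_k, x) \to D_\xi L(\nabla u, u, x)$ strongly in $L^{p'}(E_\varepsilon; \mathbb{R}^n)$, the product integral tends to $0$. Hence
$$\liminf_{k \to \infty} \int_{E_\varepsilon} L(\nabla u_k, u_k, x)\,dx \ge \int_{E_\varepsilon} L(\nabla u, u, x)\,dx.$$
Coercivity makes the integrand bounded below by $-\beta$, so dropping the integral over $\Omega \setminus E_\varepsilon$ on the left costs at most $\beta\varepsilon$; letting $\varepsilon \to 0$ and using $L \ge -\beta$ with monotone convergence on the right yields $I[u] \le \liminf_{k \to \infty} I[u_k]$.
Step 4 (conclude). Combining, $I[u] \le \liminf_{k \to \infty} I[u_k] = m$. Since $u \in \mathcal{A}$, also $I[u] \ge m$, so $I[u] = m$: $u$ is a minimizer.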
Bridge. The proof is the foundational reason coercivity and weak lower semicontinuity are the two pillars of variational existence: coercivity converts a finite-energy bound into $W^{1,p}$-boundedness, reflexivity converts boundedness into a weak limit, and convexity-driven lower semicontinuity converts the weak limit into an actual minimizer. This is exactly the upgrade from a weak subsequential limit to a genuine solution that Rellich-Kondrachov 02.16.03 was built to provide, and it is dual to the classical route 09.02.02, which writes the Euler-Lagrange equation first and hunts for a solution. The central insight is that convexity in the gradient is precisely the geometry that forbids the weak limit's energy from leaping upward: a supporting hyperplane at $\nabla u$ is below the graph, and a weakly convergent gradient cannot beat the tangent. This builds toward the regularity theory that promotes the minimizer to a classical solution, and it appears again in the vectorial theory, where convexity is replaced by Morrey's quasiconvexity and the same liminf inequality is recovered from a weaker geometric hypothesis.
Exercises Intermediate+
Advanced results Master
The existence theorem organizes a larger structure: the precise scalar characterization of lower semicontinuity by convexity, its replacement by quasiconvexity in the vectorial case, the Euler-Lagrange equation as the bridge to PDE, the role of the method in resolving Hilbert's nineteenth and twentieth problems, and the relaxation theory that handles non-convex scalar integrands by replacing $L$ with its convex envelope. Each refines the coercivity-plus-lower-semicontinuity argument of the Intermediate tier.
Theorem 1 (scalar lower semicontinuity is exactly convexity; Tonelli 1921, Serrin, Morrey). For a continuous $L = L(\xi, z, x)$ with the natural growth and coercivity bounds, the functional $I$ is sequentially weakly lower semicontinuous on $W^{1,p}(\Omega)$ for scalar $u$ if and only if $\xi \mapsto L(\xi, z, x)$ is convex for each $(z, x)$ [Tonelli 1921] [Morrey 1966]. Sufficiency is the Key Theorem; necessity is the oscillating-laminate construction of Exercise 6, which forces Jensen's inequality on $L$. Convexity is therefore not a convenient hypothesis but the exact analytic content of lower semicontinuity in the scalar case, and the direct method's reliance on it is structural rather than technical.
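Theorem 2 (the vectorial case; Morrey quasiconvexity). For $u : \Omega \to \mathbb{R}^m$ with $n \ge 2$, $m \ge 2$, and $L = L(\xi)$ on $m \times n$ matrices with $p$-growth, $I$ is sequentially weakly lower semicontinuous on $W^{1,p}(\Omega; \mathbb{R}^m)$ if and only if $L$ is quasiconvex in Morrey's sense:
$$\int_D L(A + \nabla\varphi(x))\,dx \ge |D|\,L(A) \quad \text{for all } A \in \mathbb{R}^{m \times n},\ \varphi \in C_c^\infty(D; \mathbb{R}^m),$$
i.e. the affine map $x \mapsto Ax$ minimizes $w \mapsto \int_D L(\nabla w)\,dx$ among its own compactly-supported perturbations [Morrey 1952] [Morrey 1966]. Quasiconvexity is strictly weaker than convexity and strictly stronger than rank-one convexity (the Legendre-Hadamard condition $t \mapsto L(A + t\,a \otimes b)$ convex for all vectors $a \in \mathbb{R}^m$, $b \in \mathbb{R}^n$); the implications convex $\Rightarrow$ polyconvex $\Rightarrow$ quasiconvex $\Rightarrow$ rank-one convex are all strict for $m \ge 3$, $n \ge 2$ (Šverák's example separates quasiconvexity from rank-one convexity). The condition is non-local (it cannot be tested pointwise on the Hessian), which is the central difficulty of the vectorial theory.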
Theorem 3 (polyconvexity and nonlinear elasticity; Ball 1977). A function $W : \mathbb{R}^{n \times n} \to \mathbb{R} \cup \{+\infty\}$ is polyconvex if $W(F) = G(\mathbf{M}(F))$ for a convex $G$ of all minors $\mathbf{M}(F)$ of $F$. Polyconvexity implies quasiconvexity and is verifiable, so it furnishes the practical existence tool in nonlinear elasticity, where stored-energy densities $W(F)$ of the deformation gradient $F = \nabla u$ are non-convex (frame indifference and $W(F) \to +\infty$ as $\det F \to 0^+$ both rule out convexity) yet are polyconvex for standard Ogden materials [Ball 1977]. The direct method then yields equilibria as minimizers of the elastic energy: coercivity from a growth bound such as $W(F) \ge \alpha|F|^p - \beta$, weak lower semicontinuity from polyconvexity via the weak continuity of minors (Exercise 8 for $n = 2$), and a weak limit by reflexivity. This is the canonical demonstration that the right convexity notion is dictated by the physics, not imposed for convenience.
Theorem 4 (regularity; Hilbert's nineteenth problem). The direct method produces a minimizer $u$ satisfying only the weak Euler-Lagrange equation, but minimizers of analytic, uniformly convex scalar Lagrangians are themselves analytic. This is Hilbert's nineteenth problem, resolved by the De Giorgi-Nash theorem (1957): a bounded weak solution of a uniformly elliptic divergence-form equation with measurable coefficients is Hölder continuous, after which a bootstrap through Schauder theory promotes $u$ to $C^\infty$ and analyticity. In the vectorial case the conclusion fails: De Giorgi and Giusti-Miranda exhibit quasiconvex (even smooth, uniformly rank-one convex) vectorial integrands whose minimizers have singular sets, so partial regularity (smoothness off a closed set of measure zero) is the best available, and the singular set can be nonempty. The scalar/vectorial divide in regularity mirrors exactly the convex/quasiconvex divide in the existence theory.
Theorem 5 (relaxation; the convex envelope). When the scalar $L$ is not convex, $I$ is not weakly lower semicontinuous and minimizers may fail to exist (minimizing sequences develop finer and finer oscillations, as in Exercise 6 read backward). The relaxed functional is $\bar{I}[u] = \int_\Omega L^{**}(\nabla u, u, x)\,dx$, where $L^{**}$ is the convex envelope (biconjugate) of $L$ in $\xi$; it is the largest weakly lower semicontinuous functional below $I$, and $\min \bar{I} = \inf I$ with minimizers of $\bar{I}$ being the weak limits of minimizing sequences of $I$. In the vectorial case the relevant envelope is the quasiconvex envelope $QL$, and the relaxation theorem of Dacorogna identifies $\bar{I}[u]$ with $\int_\Omega QL(\nabla u)\,dx$ [Dacorogna 2008]. Relaxation is the systematic account of what minimizing sequences converge to when the direct method's hypotheses fail, recovering a generalized minimizer carrying the microstructure of the oscillations.
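The failure of existence for a non-convex integrand, and its repair by the convex envelope, can be illustrated on a one-dimensional double-well problem. The sketch below is an assumption-laden example chosen here, not one from the text: the energy is the integral of $((u')^2 - 1)^2 + u^2$ over $(0,1)$ with zero boundary values, the minimizing sequence is a family of sawtooth functions with slopes $\pm 1$, and the grid is chosen so that every sawtooth kink falls on a grid node.

```python
import numpy as np

def double_well(xi):
    """Non-convex integrand W(xi) = (xi^2 - 1)^2."""
    return (xi**2 - 1.0) ** 2

def double_well_envelope(xi):
    """Convex envelope of W: zero on [-1, 1], equal to W outside that interval."""
    return np.where(np.abs(xi) <= 1.0, 0.0, double_well(xi))

def energy(u, h, integrand):
    """Approximate the integral of integrand(u') + u^2 on a uniform grid."""
    slopes = np.diff(u) / h
    midpoints = 0.5 * (u[:-1] + u[1:])
    return float(np.sum(integrand(slopes) + midpoints**2) * h)

n = 4096                               # power of two, so sawtooth kinks sit on grid nodes
h = 1.0 / n
x = np.arange(n + 1) * h

def sawtooth(k):
    """Sawtooth with k teeth, slopes +1 and -1, vanishing at x = 0 and x = 1."""
    s = k * x
    return np.abs(s - np.round(s)) / k

teeth = [1, 2, 4, 8, 16]
energies = [energy(sawtooth(k), h, double_well) for k in teeth]

zero = np.zeros(n + 1)                 # the weak limit of the sawtooth sequence
energy_at_limit = energy(zero, h, double_well)
relaxed_at_limit = energy(zero, h, double_well_envelope)

print("sawtooth energies:", energies)
print("original energy at the limit:", energy_at_limit)
print("relaxed energy at the limit:", relaxed_at_limit)
```

The sawtooth energies decrease toward zero, so the infimum is zero, yet the limit function has original energy one: the infimum is not attained and lower semicontinuity fails. The relaxed energy of the limit is zero, matching the infimum, which is the behaviour the relaxation theorem describes.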
Synthesis. The direct method is the foundational reason the modern theory of partial differential equations can assert existence of weak solutions, and the entire structure is generated by a single principle: coercivity converts an energy bound into weak compactness, and the right convexity notion converts the weak limit into a minimizer. Putting these together, existence is the meeting of a compactness input from 02.16.03 and a lower-semicontinuity input from the geometry of $L$. The central insight is that convexity in the gradient is not a technical convenience but exactly the analytic content of weak lower semicontinuity in the scalar case (Theorem 1), and this is exactly why the vectorial theory needs the weaker, non-local quasiconvexity (Theorem 2): the supporting-hyperplane argument of the scalar proof is dual to Jensen's inequality, and Jensen tested against gradient fields rather than measures is precisely Morrey's averaging condition. The Euler-Lagrange equation is the bridge from the minimizer to the PDE, and it generalises the classical pointwise derivation 09.02.02 by reading the equation as the vanishing first variation rather than as a necessary condition on a presumed-smooth extremal; this is the foundational reason the variational and the differential formulations agree once regularity is established (Theorem 4).
The relaxation theory (Theorem 5) is dual to the existence theorem: where existence holds because $L$ is convex, relaxation explains the failure when it is not, replacing $L$ by its convex envelope and recovering a generalized minimizer. The arc from the Dirichlet principle through Weierstrass's critique and Hilbert's rehabilitation to Tonelli's coercivity-plus-semicontinuity synthesis and Morrey's quasiconvexity is one continuous refinement of a single idea: find the floor, then prove it is reached.
Full proof set Master
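Proposition 1 (coercivity yields a bounded minimizing sequence). Let $L(\xi, z, x) \ge \alpha|\xi|^p - \beta$ with $\alpha > 0$, $\beta \ge 0$, and let $\mathcal{A}$ be nonempty with $m := \inf_{\mathcal{A}} I < \infty$. Then any minimizing sequence $(u_k) \subset \mathcal{A}$ is bounded in $W^{1,p}(\Omega)$.
Proof. For large $k$, $I[u_k] \le m + 1$, so $\alpha\|\nabla u_k\|_{L^p(\Omega)}^p - \beta|\Omega| \le m + 1$, giving $\|\nabla u_k\|_{L^p(\Omega)}^p \le (m + 1 + \beta|\Omega|)/\alpha$. Fix $w_0 \in \mathcal{A}$. Then $u_k - w_0 \in W_0^{1,p}(\Omega)$, so the Poincaré inequality 02.16.03 gives $\|u_k - w_0\|_{L^p(\Omega)} \le C_P \|\nabla(u_k - w_0)\|_{L^p(\Omega)}$. Hence $\|u_k\|_{L^p(\Omega)} \le \|w_0\|_{L^p(\Omega)} + C_P\big(\|\nabla u_k\|_{L^p(\Omega)} + \|\nabla w_0\|_{L^p(\Omega)}\big)$, and $\|u_k\|_{W^{1,p}(\Omega)} \le C$, a bound independent of $k$.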
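Proposition 2 (convexity gives the lower-semicontinuity liminf inequality, model case). For $I[u] = \int_\Omega |\nabla u|^p\,dx$ and $u_k \rightharpoonup u$ in $W^{1,p}(\Omega)$, $I[u] \le \liminf_{k \to \infty} I[u_k]$.
Proof. The function $\phi(\xi) = |\xi|^p$ is convex and $C^1$ with $D\phi(\xi) = p|\xi|^{p-2}\xi$, so the gradient inequality $|\eta|^p \ge |\xi|^p + p|\xi|^{p-2}\xi \cdot (\eta - \xi)$ holds for all $\xi, \eta \in \mathbb{R}^n$. With $\xi = \nabla u(x)$, $\eta = \nabla u_k(x)$,
$$|\nabla u_k|^p \ge |\nabla u|^p + p|\nabla u|^{p-2}\nabla u \cdot (\nabla u_k - \nabla u).$$
Integrate over $\Omega$. The field $p|\nabla u|^{p-2}\nabla u$ lies in $L^{p'}(\Omega; \mathbb{R}^n)$, $p' = p/(p-1)$, since $\big||\nabla u|^{p-2}\nabla u\big|^{p'} = |\nabla u|^p \in L^1(\Omega)$. As $\nabla u_k \rightharpoonup \nabla u$ in $L^p$ and $p|\nabla u|^{p-2}\nabla u \in L^{p'}$, the pairing $\int_\Omega p|\nabla u|^{p-2}\nabla u \cdot (\nabla u_k - \nabla u)\,dx \to 0$. Therefore $I[u_k] \ge I[u] + o(1)$, and taking $\liminf$ gives the claim.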
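Proposition 3 (the weak Euler-Lagrange equation holds at a minimizer). Let $L$ be $C^1$ in $(\xi, z)$ and satisfy the growth bounds $|L(\xi, z, x)| \le C(|\xi|^p + |z|^p + 1)$ and $|D_\xi L(\xi, z, x)| + |L_z(\xi, z, x)| \le C(|\xi|^{p-1} + |z|^{p-1} + 1)$, and let $u \in \mathcal{A}$ minimize $I$. Then for all $v \in W_0^{1,p}(\Omega)$,
$$\int_\Omega \Big( D_\xi L(\nabla u, u, x) \cdot \nabla v + L_z(\nabla u, u, x)\, v \Big)\,dx = 0.$$
Proof. Fix $v \in W_0^{1,p}(\Omega)$; then $u + tv \in \mathcal{A}$ for all $t \in \mathbb{R}$, since $v$ has zero trace. Define $i(t) := I[u + tv]$. Because $u$ minimizes $I$ and $u + tv$ is admissible, $i$ has a minimum at $t = 0$. The growth bounds make the difference quotient $\tfrac{1}{t}\big(L(\nabla u + t\nabla v, u + tv, x) - L(\nabla u, u, x)\big)$ dominated, by the mean value theorem and Young's inequality, by a fixed $L^1(\Omega)$ function uniformly for $0 < |t| \le 1$; dominated convergence permits differentiation under the integral. Hence $i'(0) = \int_\Omega \big( D_\xi L(\nabla u, u, x) \cdot \nabla v + L_z(\nabla u, u, x)\, v \big)\,dx$, and $i'(0) = 0$ because $t = 0$ is an interior minimum.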
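Proposition 4 (uniqueness under strict convexity in $\xi$). If $(\xi, z) \mapsto L(\xi, z, x)$ is convex for each $x$ and strictly convex in $\xi$ (the convexity inequality is strict whenever the two gradient arguments differ), and $\mathcal{A}$ is convex, then $I$ has at most one minimizer.
Proof. Suppose $u_1, u_2 \in \mathcal{A}$ both minimize, $I[u_1] = I[u_2] = m$. The midpoint $w = \tfrac12(u_1 + u_2) \in \mathcal{A}$ by convexity of $\mathcal{A}$, with $\nabla w = \tfrac12(\nabla u_1 + \nabla u_2)$ and $I[w] \ge m$. Pointwise convexity gives $L(\nabla w, w, x) \le \tfrac12 L(\nabla u_1, u_1, x) + \tfrac12 L(\nabla u_2, u_2, x)$. Integrating, $I[w] \le \tfrac12 I[u_1] + \tfrac12 I[u_2] = m$. If $u_1 \ne u_2$, then $\nabla u_1 \ne \nabla u_2$ on a set $E$ of positive measure (equal gradients a.e. plus equal trace force equality), and strict convexity in $\xi$ makes the integrand inequality strict on $E$, so $I[w] < m$. This contradicts the minimality of $m$. Hence $u_1 = u_2$.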
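Proposition 5 (convex $\Rightarrow$ quasiconvex). If $L : \mathbb{R}^{m \times n} \to \mathbb{R}$ is convex, then $L$ is quasiconvex.
Proof. Let $A \in \mathbb{R}^{m \times n}$, $D \subset \mathbb{R}^n$ bounded, $\varphi \in C_c^\infty(D; \mathbb{R}^m)$. By Jensen's inequality for the convex $L$ and the probability measure $dx/|D|$ on $D$,
$$\frac{1}{|D|}\int_D L(A + \nabla\varphi)\,dx \ge L\Big( \frac{1}{|D|}\int_D (A + \nabla\varphi)\,dx \Big).$$
Since $\varphi$ has compact support in $D$, $\int_D \nabla\varphi\,dx = 0$ by the divergence theorem (each column integrates to a boundary term that vanishes). Hence the right side is $L(A)$, giving $\int_D L(A + \nabla\varphi)\,dx \ge |D|\,L(A)$, which is quasiconvexity. The converse fails for $n, m \ge 2$ (the determinant of Exercise 8), so quasiconvexity is strictly weaker.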
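That the determinant is not convex, the fact behind the failed converse, can be verified with one explicit pair of matrices. The matrices $A$ and $B$ below are chosen here for illustration:

```latex
A = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \qquad
B = -A = \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix}, \qquad
\det A = \det B = -1 .

\det\!\Big( \tfrac12 A + \tfrac12 B \Big) = \det 0 = 0
\;>\; -1 = \tfrac12 \det A + \tfrac12 \det B .
```

The midpoint value exceeds the average of the endpoint values, which violates the convexity inequality, so $\det$ is not convex on $\mathbb{R}^{2 \times 2}$.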
Connections Master
The weak-compactness engine (boundedness in $W^{1,p}(\Omega)$ yields a weakly convergent subsequence, upgraded to strong $L^p$ convergence) is exactly the Rellich-Kondrachov and Poincaré apparatus of 02.16.03. That unit's compactness theorem is invoked at Step 3 of the Key Theorem to pass from $u_k \rightharpoonup u$ in $W^{1,p}(\Omega)$ to $u_k \to u$ in $L^p(\Omega)$; without it the lower-bound hyperplane argument has no strong convergence to anchor the value slot. The direct method is the canonical consumer of that compactness result.
The coercivity-to-boundedness step rests on the Sobolev embedding and the reflexivity of $W^{1,p}(\Omega)$ developed in 02.16.01: coercivity bounds the gradient norm, the Poincaré inequality (a corollary of the embedding theory) bounds the full norm, and reflexivity for $1 < p < \infty$ supplies the weak limit. The critical-exponent restrictions of 02.16.01 reappear here as the natural growth conditions that make the Euler-Lagrange first variation well-defined.
The Euler-Lagrange equation derived here as the vanishing first variation is the variational twin of the classical pointwise derivation in 09.02.02: there the equation is a necessary condition on a smooth extremal of a one-dimensional action; here it is the weak PDE satisfied by a minimizer of a multiple integral. The two agree once regularity theory promotes the weak solution to a classical one, which is the content of Hilbert's nineteenth problem.
The existence of minimizers feeds directly into the regularity theory of elliptic equations: the De Giorgi-Nash-Moser theorem takes the weak minimizer produced here and establishes Hölder continuity, and the spectral theory of the Laplacian uses the same minimization (of the Rayleigh quotient over $H_0^1(\Omega)$) to produce eigenfunctions, with attainment of the infimum guaranteed by exactly this method, as noted in the spectral discussion of 02.16.03.
Historical & philosophical context Master
The method has its roots in the Dirichlet principle, the assertion — used freely by Gauss, Dirichlet, Thomson, and Riemann in the mid-nineteenth century — that the boundary-value problem for the Laplace equation is solved by minimizing the Dirichlet energy over functions with prescribed boundary values. Riemann based much of his function theory on it. Weierstrass, in Berlin lectures around 1870, undermined the principle by exhibiting a coercive-looking variational problem whose infimum is not attained [Weierstrass 1870], the prototype of the weighted-energy example of Exercise 5: an infimum can fail to be a minimum, so the mere boundedness-below of an energy does not produce a minimizer. The critique stalled the variational approach for a generation.
David Hilbert rehabilitated the Dirichlet principle in his 1900 address and a companion note [Hilbert 1900], arguing that under suitable hypotheses the minimum is genuinely attained and that the principle could be made rigorous; the nineteenth and twentieth of his celebrated problems concern, respectively, the analyticity of minimizers and the existence of solutions to regular variational problems. Leonida Tonelli, in his two-volume Fondamenti di Calcolo delle Variazioni of 1921-1923 [Tonelli 1921], gave the method its modern form by isolating the two hypotheses that make it work: coercivity, ensuring a minimizing sequence is compact in the relevant weak topology, and lower semicontinuity, ensuring the weak limit does not overshoot the infimum, with convexity of the integrand in the gradient identified as the criterion for the latter.
The vectorial theory required a genuinely new idea. Charles Morrey, in his 1952 Pacific Journal of Mathematics paper [Morrey 1952] and his 1966 monograph [Morrey 1966], introduced quasiconvexity as the exact condition for weak lower semicontinuity when the unknown is vector-valued, showing that convexity is too strong and rank-one convexity too weak. John Ball's 1977 Archive for Rational Mechanics and Analysis paper [Ball 1977] supplied the verifiable intermediate notion of polyconvexity and applied the direct method to nonlinear elasticity, where the physically mandated non-convex stored-energy densities had blocked every earlier existence attempt. Hilbert's nineteenth problem was settled by Ennio De Giorgi and John Nash independently in 1957, completing the passage from variational minimizer to smooth classical solution in the scalar case.
Bibliography Master
@book{Tonelli1921,
author = {Tonelli, Leonida},
title = {Fondamenti di Calcolo delle Variazioni},
publisher = {Zanichelli},
address = {Bologna},
year = {1921},
note = {2 volumes, 1921 and 1923}
}
@article{Hilbert1900,
author = {Hilbert, David},
title = {{\"U}ber das {D}irichletsche {P}rinzip},
journal = {Jahresbericht der Deutschen Mathematiker-Vereinigung},
volume = {8},
year = {1900},
pages = {184--188}
}
@article{Morrey1952,
author = {Morrey, Charles B.},
title = {Quasi-convexity and the lower semicontinuity of multiple integrals},
journal = {Pacific Journal of Mathematics},
volume = {2},
year = {1952},
pages = {25--53}
}
@book{Morrey1966,
author = {Morrey, Charles B.},
title = {Multiple Integrals in the Calculus of Variations},
series = {Grundlehren der mathematischen Wissenschaften},
volume = {130},
publisher = {Springer},
year = {1966}
}
@article{Ball1977,
author = {Ball, John M.},
title = {Convexity conditions and existence theorems in nonlinear elasticity},
journal = {Archive for Rational Mechanics and Analysis},
volume = {63},
year = {1977},
pages = {337--403}
}
@book{Dacorogna2008,
author = {Dacorogna, Bernard},
title = {Direct Methods in the Calculus of Variations},
edition = {2},
series = {Applied Mathematical Sciences},
volume = {78},
publisher = {Springer},
year = {2008}
}
@book{Giusti2003,
author = {Giusti, Enrico},
title = {Direct Methods in the Calculus of Variations},
publisher = {World Scientific},
year = {2003}
}
@book{Evans2010,
author = {Evans, Lawrence C.},
title = {Partial Differential Equations},
edition = {2},
series = {Graduate Studies in Mathematics},
volume = {19},
publisher = {American Mathematical Society},
year = {2010}
}