The Cauchy-Kovalevskaya Theorem and Holmgren Uniqueness
Anchor (Master): Evans §4.6; John §3.3-3.5; Hörmander, The Analysis of Linear Partial Differential Operators I, 2e (Springer 1990), §9.4 (Holmgren); Hadamard, Lectures on Cauchy's Problem in Linear Partial Differential Equations (Yale UP 1923); Lewy, An example of a smooth linear partial differential equation without solution (Annals of Mathematics 66, 1957)
Intuition Beginner
Suppose you know everything happening on one flat wall: the value of some physical field at every point of the wall, and the rate at which the field changes as you step directly away from the wall. A natural question is whether that wall data already decides the field everywhere nearby, or whether the field is still free to do many different things just off the wall. The Cauchy-Kovalevskaya theorem answers this question for a large family of equations, and the answer is that the wall data does pin the field down near the wall, provided two conditions hold.
The first condition is that the equation and the wall data are analytic: smooth enough that each can be written as a power series and rebuilt exactly from its own derivatives at a point. The second condition is that the wall is non-characteristic, meaning the equation actually lets you compute the field's change in the direction away from the wall. If the wall happens to lie along a direction the equation refuses to control, no amount of wall data settles what happens off the wall.
The strategy is the oldest trick in analysis: guess a power series. From the equation and the wall data you can read off, one derivative at a time, every coefficient of the power series of the unknown field. The recipe always produces a candidate series. The only worry is whether the series actually adds up to a finite number near the wall, or whether the coefficients grow so fast that the sum blows apart. The whole theorem reduces to controlling that growth.
To control the growth you compare your series term by term against a second, simpler series whose coefficients are visibly larger and whose sum you can compute by hand. If the bigger series converges, the smaller one is trapped beneath it and converges too. The simpler comparison series is built from an ordinary geometric series, the one summing to a clean closed form. This comparison method is called the method of majorants: a majorant is just a dominating series that sits above yours and certifies that yours stays finite.
A companion result, Holmgren's theorem, removes the analyticity assumption from the answer while keeping it on the equation. It says that for a linear equation with analytic coefficients, the wall data fixes the field uniquely, even among fields that are merely smooth and not analytic. So while Cauchy-Kovalevskaya hands you one analytic field that fits the wall data, Holmgren guarantees no second field of any kind can sneak in beside it.
There is a sharp boundary to all of this. Hans Lewy found, in 1957, a perfectly smooth linear equation with no solution at all near a point, for most choices of the right-hand side. His equation has polynomial, hence analytic, coefficients; what is smooth but not analytic is the right-hand side, exactly the case Cauchy-Kovalevskaya refuses to cover. The analytic hypothesis is not a convenience that better technique would remove; it is doing real work, and dropping it breaks existence outright.
The one-sentence takeaway: Cauchy-Kovalevskaya builds, by the method of majorants, a unique analytic field matching analytic data on a non-characteristic wall; Holmgren upgrades the uniqueness to all smooth fields for linear analytic equations; and Lewy's example shows the analytic hypothesis cannot simply be dropped.
Visual Beginner
Picture a flat sheet of paper standing upright; call it the initial wall. On the wall you are handed two pieces of information at every point: the height of an invisible surface that touches the wall, and the slope at which that surface leaves the wall heading away from you. The theorem says the surface is then forced into a single shape in a thin slab of space hugging the wall, like a tent whose fabric is nailed both in position and in lean-angle all along one edge.
    data given on the wall          surface forced near the wall
    (position + away-slope)         (unique, in a thin slab)

      |                               |   /
      |  <- value here                |  /    surface leaning
      |  <- slope here                | /     off the wall
      |                               | \
      |  <- value here                |  \
      |  <- slope here                |   \
      |________ wall ______           |___\____ wall ______
                                       \--slab--/

    non-characteristic: the equation lets you step OFF the wall
    characteristic:     the wall lies along a "blind" direction,
                        and off-wall behaviour is NOT determined

The left half of the picture shows the data living on the wall. The right half shows the surface determined in the thin slab beside it. The crucial qualifier is the word thin: the theorem is local. It promises a unique surface only in a slab close to the wall, not across the whole room, because the power series that builds the surface is only guaranteed to add up to a finite value within some radius of the wall.
The non-characteristic condition is the difference between the wall facing the surface and the wall lying edge-on to it. A wall facing the surface catches the away-direction the equation can compute, and the surface is determined. A wall lying edge-on sits along a direction the equation is blind to, and the same data is then compatible with many different surfaces.
Worked example Beginner
We follow the power-series recipe on the simplest possible case so the mechanism is visible: a single function $u(x,t)$ of two variables whose change in the second variable is dictated by its change in the first. The rule is "the rate of change in the $t$ direction equals the rate of change in the $x$ direction", written $u_t = u_x$, with wall data given on the wall $t = 0$ as $u(x,0) = \cos x$.
Step 1. Read the wall data. On the wall we are told $u(x,0) = \cos x$. So at the special point $(0,0)$ we know $u(0,0) = \cos 0 = 1$.
Step 2. Use the wall data to get every $x$-derivative on the wall. Differentiating $\cos x$ in $x$ repeatedly and setting $x = 0$ gives the familiar pattern $1, 0, -1, 0, 1, \dots$ for the value and successive $x$-derivatives at the origin.
Step 3. Use the equation to trade a $t$-derivative for an $x$-derivative. The rule $u_t = u_x$ says: to step once in $t$, differentiate once in $x$. So the $t$-derivative of $u$ at the origin equals the $x$-derivative of $\cos x$ at $x = 0$, which is $-\sin 0 = 0$.
Step 4. Repeat to get a second $t$-derivative. Applying the rule twice, two $t$-derivatives equal two $x$-derivatives: the value $u_{tt}(0,0)$ is the second $x$-derivative of $\cos x$ at $x = 0$, which is $-\cos 0 = -1$.
Step 5. Assemble the series and recognise it. We have built the value $1$, first $t$-derivative $0$, second $t$-derivative $-1$, and (continuing the same trade) the pattern keeps matching $1, 0, -1, 0, 1, \dots$. Combined with the $x$-data, the series sums to $u(x,t) = \cos(x+t)$. A direct check confirms it: stepping in $t$ and stepping in $x$ both shift the cosine the same way, so $u_t = u_x = -\sin(x+t)$, and at $t = 0$ it reduces to $\cos x$.
What this tells us: the equation plus the wall data determined every coefficient with no freedom left, and the resulting series summed to a clean closed-form answer valid for all $x$ and $t$. The wall was non-characteristic for this equation, which is exactly why the trade in Step 3 was possible. The method of majorants is the tool that guarantees, in harder cases where no closed form appears, that the series we build the same way still adds up near the wall.
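The coefficient-by-coefficient recipe can be sketched in a few lines of Python. This is a minimal illustration, assuming the worked example is the transport equation $u_t = u_x$ with wall data $u(x,0) = \cos x$; the function names (`cauchy_data_coeffs`, `formal_solution`, `partial_sum`) are hypothetical helpers, not part of the source.

```python
import math


def cauchy_data_coeffs(order):
    """Taylor coefficients of cos(x) at 0: entry j multiplies x**j."""
    return [0.0 if j % 2 else (-1.0) ** (j // 2) / math.factorial(j)
            for j in range(order + 1)]


def formal_solution(order):
    """Coefficients a[(j, k)] of x**j * t**k, j + k <= order, for the
    assumed problem u_t = u_x, u(x, 0) = cos(x)."""
    a = {(j, 0): c for j, c in enumerate(cauchy_data_coeffs(order))}
    for k in range(order):
        for j in range(order - k):
            # Matching the x**j * t**k terms of u_t = u_x gives
            # (k + 1) * a[j, k+1] = (j + 1) * a[j+1, k].
            a[(j, k + 1)] = (j + 1) * a[(j + 1, k)] / (k + 1)
    return a


def partial_sum(a, x, t):
    """Evaluate the truncated double power series at (x, t)."""
    return sum(c * x ** j * t ** k for (j, k), c in a.items())


if __name__ == "__main__":
    coeffs = formal_solution(30)
    x, t = 0.3, 0.4
    print(partial_sum(coeffs, x, t), math.cos(x + t))
```

Every coefficient with a positive power of $t$ is computed from coefficients already known, so the recursion leaves no freedom, and the truncated series agrees with $\cos(x+t)$ to rounding error at the sample point.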
Check your understanding Beginner
Formal definition Intermediate+
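Let $\Omega \subset \mathbb{R}^n$ be open and let $\Sigma \subset \Omega$ be a smooth hypersurface with unit conormal field $\nu$. Consider a $k$-th order partial differential operator $P = \sum_{|\alpha| \le k} a_\alpha(x)\,\partial^\alpha$ with principal part $P_k = \sum_{|\alpha| = k} a_\alpha(x)\,\partial^\alpha$ and principal symbol $p_k(x,\xi) = \sum_{|\alpha| = k} a_\alpha(x)\,\xi^\alpha$.
Definition (characteristic surface). The surface $\Sigma$ is characteristic for $P$ at a point $x_0 \in \Sigma$ when $p_k(x_0, \nu(x_0)) = 0$, where $\nu(x_0)$ is the conormal to $\Sigma$ at $x_0$. It is non-characteristic at $x_0$ when $p_k(x_0, \nu(x_0)) \neq 0$, and non-characteristic (without qualification) when this holds at every point of $\Sigma$ [Evans 2010 §4.6]. Non-characteristicity is the condition that the principal symbol does not vanish on the conormal direction, equivalently that the top-order normal derivative can be solved for in terms of the equation and the lower-order data.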
Definition (Cauchy problem and Cauchy data). Given a non-characteristic surface $\Sigma$ and functions $g_0, \dots, g_{k-1}$ on $\Sigma$, the Cauchy problem asks for $u$ with $Pu = f$ near $\Sigma$ and prescribed Cauchy data
$$u = g_0, \quad \partial_\nu u = g_1, \quad \dots, \quad \partial_\nu^{k-1} u = g_{k-1} \quad \text{on } \Sigma.$$
A $k$-th order equation requires $k$ pieces of Cauchy data, one for each normal derivative up to order $k-1$, matching the count seen for the second-order wave equation 02.13.04, where $k = 2$ and the data are the initial position and initial velocity.
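Cauchy-Kovalevskaya normal form. After flattening $\Sigma$ to the hyperplane $\{t = 0\}$ (writing $x = (x', t)$, $x' \in \mathbb{R}^{n-1}$), the non-characteristic condition lets the equation be solved for the pure top-order normal derivative:
$$\partial_t^k u = F\bigl(x', t, (\partial_{x'}^\alpha \partial_t^j u)_{|\alpha| + j \le k,\ j < k}\bigr),$$
where $F$ is analytic in its arguments. A higher-order scalar equation in this form reduces, by introducing the derivatives of $u$ up to order $k-1$ as new unknowns, to a first-order quasilinear system
$$\mathbf{u}_t = \sum_{j=1}^{n-1} B_j(x', \mathbf{u})\,\mathbf{u}_{x_j} + c(x', \mathbf{u}),$$
with $B_j$, $c$ analytic. The reduction is the standard device that lets one prove the theorem for first-order systems and recover the general case [John 1982 §3.3].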
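Definition (majorant). For formal power series $f = \sum_\alpha f_\alpha x^\alpha$ and $g = \sum_\alpha g_\alpha x^\alpha$ in $x = (x_1, \dots, x_n)$, say $g$ majorises $f$, written $g \gg f$, when $g_\alpha \ge |f_\alpha|$ for every multi-index $\alpha$. The basic majorant of an analytic germ $f$ convergent on the complex polydisc $\{|x_i| \le r\}$ with $|f| \le C$ there is the geometric germ
$$\frac{Cr}{r - (x_1 + \dots + x_n)} = C \sum_{k \ge 0} \Bigl(\frac{x_1 + \dots + x_n}{r}\Bigr)^{k},$$
whose coefficients dominate those of $f$ and whose sum is an explicit rational function. The dominance relation is preserved by addition, multiplication, and composition of series with non-negative coefficients, which is what makes it propagate through the recursion that defines the formal solution.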
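The dominance relation can be checked numerically on a small example. The sketch below assumes the geometric germ has the form $Cr/(r - (x_1 + \dots + x_n))$ and tests it against $\exp(x_1 + x_2)$, for which $|f| \le e^2$ on the polydisc of radius $1$; the helper names are hypothetical, and only finitely many coefficients are compared, so this illustrates the definition and proves nothing.

```python
import math
from itertools import product


def geometric_majorant_coeff(alpha, C, r):
    """Coefficient of x**alpha in C*r / (r - (x_1 + ... + x_n)),
    namely C * (|alpha|! / alpha!) / r**|alpha|."""
    total = sum(alpha)
    multinomial = math.factorial(total)
    for a in alpha:
        multinomial //= math.factorial(a)
    return C * multinomial / r ** total


def exp_sum_coeff(alpha):
    """Coefficient of x**alpha in exp(x_1 + ... + x_n), namely 1 / alpha!."""
    denom = 1
    for a in alpha:
        denom *= math.factorial(a)
    return 1.0 / denom


def majorises(g_coeff, f_coeff, n_vars, max_order):
    """Check g_alpha >= |f_alpha| for all alpha with entries <= max_order."""
    return all(g_coeff(alpha) >= abs(f_coeff(alpha))
               for alpha in product(range(max_order + 1), repeat=n_vars))


if __name__ == "__main__":
    C, r = math.exp(2.0), 1.0
    germ = lambda alpha: geometric_majorant_coeff(alpha, C, r)
    print(majorises(germ, exp_sum_coeff, n_vars=2, max_order=8))
```

Choosing $C$ too small breaks the comparison already at the constant term, which is why the constant must bound $|f|$ on the polydisc.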
Counterexamples to common slips Intermediate+
Smoothness is not enough; analyticity is essential. The Lewy operator $L = \partial_x + i\,\partial_y - 2i(x + iy)\,\partial_t$ has $C^\infty$ (indeed polynomial, hence analytic) coefficients, yet for most $f \in C^\infty$ the equation $Lu = f$ has no solution in any neighbourhood of a point [Lewy 1957]. Cauchy-Kovalevskaya does not apply because the right-hand side $f$, though smooth, is not the analytic data the majorant argument consumes; the theorem's analyticity hypothesis on the data cannot be weakened to $C^\infty$.
Non-characteristic depends on the surface, not only the operator. For the heat operator $\partial_t - \Delta_x$ the hyperplane $\{t = 0\}$ is characteristic: the principal symbol is $-|\xi|^2$ (the time covariable $\tau$ enters only at first order, below the principal degree $2$), which vanishes on the conormal $(\xi, \tau) = (0, 1)$. So the standard initial-value surface of the heat equation is exactly the surface Cauchy-Kovalevskaya cannot use, and indeed prescribing $u$ and $\partial_t u$ on $\{t = 0\}$ over-determines the heat equation.
Convergence is local, never global. Even for entire analytic data the Cauchy-Kovalevskaya solution is only guaranteed on a neighbourhood of the surface. The first-order scalar equation $u_t = u^2$ with $u(x, 0) = 1$ has the analytic solution $u = 1/(1 - t)$, which blows up at $t = 1$; the majorant radius cannot be pushed past the singularity, and no theorem of this type promises a global solution.
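The local nature of convergence shows up directly in the coefficients. As a hedged illustration, take the representative problem $u_t = u^2$ with constant data $1$ on $\{t = 0\}$ (an assumed example, chosen because its solution $1/(1-t)$ is explicit); the helper names below are hypothetical.

```python
from fractions import Fraction


def riccati_coeffs(n):
    """Taylor coefficients a[0..n-1] in t of the solution of
    u_t = u**2 with u = 1 at t = 0 (assumed example problem)."""
    a = [Fraction(1)]
    for k in range(n - 1):
        # Coefficient of t**k in u**2 is the Cauchy product of known terms;
        # matching with u_t gives (k + 1) * a[k+1] = sum_i a[i] * a[k-i].
        cauchy_product = sum(a[i] * a[k - i] for i in range(k + 1))
        a.append(cauchy_product / (k + 1))
    return a


def partial_sum(coeffs, t):
    """Evaluate the truncated series at time t."""
    return sum(float(c) * t ** k for k, c in enumerate(coeffs))


if __name__ == "__main__":
    coeffs = riccati_coeffs(60)
    print(coeffs[:5])
    print(partial_sum(coeffs, 0.5), 1.0 / (1.0 - 0.5))
```

All coefficients equal $1$, so the series is the geometric series in $t$: it converges for $|t| < 1$ and no further, although the data is constant and the equation polynomial.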
Holmgren needs a non-characteristic surface too. Uniqueness fails across characteristic surfaces. For the wave operator $\partial_t^2 - \partial_x^2$ the characteristic line $\{x = t\}$ carries non-zero solutions with vanishing Cauchy data taken along it, for example $u = (x - t)^2$; Holmgren's duality argument breaks precisely because the adjoint Cauchy problem it relies on is no longer solvable backward from such a surface.
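The characteristic test in these counterexamples is a finite computation for constant-coefficient operators. The sketch below assumes the convention that the symbol is obtained by replacing each derivative with the matching covector component (no factors of $i$), in two variables ordered $(x, t)$; the dictionaries and function names are hypothetical.

```python
def principal_symbol(terms, covector):
    """Evaluate the principal symbol of a constant-coefficient operator.

    terms maps a multi-index (tuple of derivative orders) to a coefficient;
    only the terms of top total order contribute.
    """
    order = max(sum(alpha) for alpha in terms)
    total = 0.0
    for alpha, coeff in terms.items():
        if sum(alpha) == order:
            monomial = 1.0
            for xi, power in zip(covector, alpha):
                monomial *= xi ** power
            total += coeff * monomial
    return total


def is_characteristic(terms, conormal, tol=1e-12):
    """A surface is characteristic where the symbol vanishes on its conormal."""
    return abs(principal_symbol(terms, conormal)) < tol


# Multi-index (a, b) stands for d_x^a d_t^b (or d_x^a d_y^b for LAPLACE).
HEAT = {(0, 1): 1.0, (2, 0): -1.0}     # d_t - d_x^2
WAVE = {(0, 2): 1.0, (2, 0): -1.0}     # d_t^2 - d_x^2
LAPLACE = {(2, 0): 1.0, (0, 2): 1.0}   # d_x^2 + d_y^2

if __name__ == "__main__":
    print(is_characteristic(HEAT, (0.0, 1.0)))    # surface {t = 0}
    print(is_characteristic(WAVE, (0.0, 1.0)))    # surface {t = 0}
    print(is_characteristic(WAVE, (1.0, -1.0)))   # line {x = t}
```

The same operator can pass the test on one surface and fail it on another, which is the point of the heat and wave examples above.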
Key theorem with proof Intermediate+
Theorem (Cauchy-Kovalevskaya). Let $\mathbf{u} = (u^1, \dots, u^m)$ and let the first-order quasilinear system
$$\mathbf{u}_t = \sum_{j=1}^{n-1} B_j(x', \mathbf{u})\,\mathbf{u}_{x_j} + c(x', \mathbf{u}), \qquad \mathbf{u}(x', 0) = g(x'),$$
have entries $B_j$, $c$ real-analytic near $(0, g(0))$ and $g$ real-analytic near $0$. Then there is a neighbourhood of the origin on which the system has a real-analytic solution $\mathbf{u}$, and this solution is unique among real-analytic solutions [Kovalevskaya 1875] [Evans 2010 §4.6.3]. Consequently every analytic Cauchy problem with analytic Cauchy data on a non-characteristic analytic surface has a unique local analytic solution.
Proof. After translating, assume $g(0) = 0$ and all data are analytic at the origin. The proof is in three steps: the formal coefficients are uniquely determined; a geometric majorant problem is constructed; and the majorant problem is solved in closed form, certifying convergence.
Step 1 (the formal solution is forced). Write $\mathbf{u} = \sum_{\alpha, j} \mathbf{u}_{\alpha, j}\, x'^{\alpha} t^{j}$, so $\mathbf{u}_{\alpha, j} = \partial_{x'}^{\alpha} \partial_t^{j} \mathbf{u}(0) / (\alpha!\, j!)$. Tangential derivatives on the surface are fixed by the data: $\partial_{x'}^{\alpha} \mathbf{u}(0, 0) = \partial^{\alpha} g(0)$. The equation expresses $\mathbf{u}_t$ as an analytic function of $x'$, $\mathbf{u}$, and the tangential derivatives $\mathbf{u}_{x_j}$. Differentiating the equation in $t$ and in the tangential variables and evaluating on $\{t = 0\}$ expresses every coefficient $\mathbf{u}_{\alpha, j}$ as a polynomial, with non-negative coefficients, in the Taylor coefficients of $B_j$, $c$, and $g$. The recursion never requires a normal derivative it has not already computed, so each $\mathbf{u}_{\alpha, j}$ is determined exactly once. This proves uniqueness among analytic solutions and produces a candidate formal series.
Step 2 (the majorant problem). The key structural fact about the recursion of Step 1 is monotonicity: because the determining polynomials have non-negative coefficients, replacing $B_j$, $c$, $g$ by majorants $B_j^*$, $c^*$, $g^*$ produces, through the same recursion, a new formal solution $\mathbf{u}^*$ with $\mathbf{u}^* \gg \mathbf{u}$. So it suffices to exhibit one analytic majorant problem whose formal solution converges; then $\mathbf{u}^* \gg \mathbf{u}$ and the candidate series for $\mathbf{u}$ converges by comparison on the same polydisc.
Choose the geometric majorant. Since $B_j$, $c$ are analytic, there are constants $C, r > 0$ with all their Taylor coefficients dominated by those of $\dfrac{Cr}{r - (x_1 + \dots + x_{n-1}) - (z_1 + \dots + z_m)}$. Replace each $B_j$ and $c$ by this common geometric germ and each component of $g$ by the one-variable geometric germ $\dfrac{Cs}{r - s}$ in $s = x_1 + \dots + x_{n-1}$, which majorises any analytic $g$ with $g(0) = 0$ after enlarging $C$.
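Step 3 (solving the majorant problem in closed form). By symmetry the majorant system admits a solution depending only on $s = x_1 + \dots + x_{n-1}$ and $t$, with all $m$ components equal to a single scalar $v(s, t)$. With $C, r$ the constants of the geometric majorant, the system collapses to the single scalar equation
$$v_t = \frac{Cr}{r - s - m\,v}\,\bigl((n-1)\,m\,v_s + 1\bigr),$$
with initial data $v(s, 0)$ given by the one-variable geometric germ.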
This is a first-order equation in two variables solvable by the method of characteristics: one seeks and reduces to an ordinary differential relation, yielding an analytic explicitly as the root of an algebraic equation
real-analytic at the origin by the analytic inverse/implicit function theorem 02.05.04. Hence , and with it , is analytic on a neighbourhood of the origin. The formal series of is dominated coefficientwise by that of , so it converges on the same neighbourhood and defines a real-analytic function satisfying the original system by construction.
Bridge. The method of majorants is exactly the technique that proves the real-analytic inverse function theorem 02.05.04; here it is deployed one categorical level up, on series in both the independent variables and the unknowns, and this is the foundational reason a single convergence engine drives both the inverse function theorem and the general analytic Cauchy problem. The reduction to a first-order system in normal form builds toward the symmetric-hyperbolic well-posedness machinery, and the role of the principal symbol on the conormal — the non-characteristic test — generalises the light-cone characteristic geometry of the wave equation 02.13.04 to arbitrary operators. The existence half of the story appears again in the uniqueness half below: Holmgren's theorem runs the same Cauchy-Kovalevskaya existence result on the adjoint operator and pairs the two by an integration-by-parts duality that lives naturally in the language of distributions 02.14.04. Putting these together, Cauchy-Kovalevskaya and Holmgren form a single existence-plus-uniqueness backbone for local solvability of analytic equations.
Exercises Intermediate+
Advanced results Master
Theorem 1 (general Cauchy-Kovalevskaya for analytic systems). Let $F_1, \dots, F_N$ be analytic operators and consider the analytic system in normal form
$$\partial_t^{k_i} u_i = F_i\bigl(x', t, (\partial_{x'}^{\alpha} \partial_t^{j} u_l)\bigr), \qquad i = 1, \dots, N,$$
where each $F_i$ is analytic and depends only on derivatives $\partial_{x'}^{\alpha} \partial_t^{j} u_l$ with $|\alpha| + j \le k_l$, $j < k_l$. With analytic Cauchy data $\partial_t^{j} u_i|_{t=0} = g_{ij}$ for $0 \le j < k_i$, there is a unique analytic solution in a neighbourhood of the origin [Kovalevskaya 1875]. The scalar first-order proof extends verbatim once the determining recursion is checked to respect the order bookkeeping $j < k_l$, which is the abstract content of the non-characteristic hypothesis: every normal derivative appearing on the right is of strictly lower normal order than the one being solved for.
Theorem 2 (Holmgren's uniqueness theorem). Let $P = \sum_{|\alpha| \le k} a_\alpha(x)\,\partial^\alpha$ have real-analytic coefficients on a neighbourhood of a point $x_0$ of a non-characteristic hypersurface $\Sigma$. If $u \in C^k$ satisfies $Pu = 0$ near $x_0$ with vanishing Cauchy data on $\Sigma$, then $u \equiv 0$ on a neighbourhood of $x_0$ [Holmgren 1901] [Hörmander 1990 §9.4]. Equivalently, the Cauchy problem for a linear analytic operator has at most one solution in $C^k$, with no analyticity required of the solution.
The proof realises the duality sketched in Exercise 7. One constructs a family of nearby non-characteristic surfaces $\Sigma_\lambda$ foliating a lens region $D_\lambda$ with $\partial D_\lambda \subset \Sigma \cup \Sigma_\lambda$, chosen convex toward $\Sigma$ so that the adjoint Cauchy problem from the outer surface is solvable by Cauchy-Kovalevskaya. For each analytic $f$ one solves $P^{t} v = f$ in $D_\lambda$ with zero Cauchy data on the outer boundary $\Sigma_\lambda$; Green's identity then yields $\int_{D_\lambda} u\,f\,dx = 0$. Density of analytic $f$ (Weierstrass) forces $u = 0$ in $D_\lambda$, and letting $\lambda$ sweep recovers a full neighbourhood of $x_0$. The geometric heart is Holmgren's transformation, a convexification of the surface guaranteeing the adjoint solvability over the whole lens.
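Theorem 3 (Lewy non-solvability, sharpness). The operator $L = \partial_x + i\,\partial_y - 2i(x + iy)\,\partial_t$ on $\mathbb{R}^3$ has the property that for $f$ in a residual set of $C^\infty(\mathbb{R}^3)$ the equation $Lu = f$ has no distributional solution on any open set [Lewy 1957]. The coefficients of $L$ are polynomial, hence analytic, so Cauchy-Kovalevskaya still solves $Lu = f$ locally whenever $f$ is analytic, and Holmgren's uniqueness still applies to $L$; what fails is existence for right-hand sides that are merely smooth. Consequently the analyticity hypothesis on the data in Cauchy-Kovalevskaya is sharp: with merely smooth data, local existence can fail outright. Whether Holmgren's analytic-coefficient hypothesis can be relaxed is a separate question that Lewy's example does not test. Hörmander's condition on the Poisson bracket of the principal symbol, refined by Nirenberg-Trèves into the geometric condition $(\Psi)$ (equivalent to their condition $(P)$ for differential operators), characterises local solvability and explains the Lewy example as the bracket obstruction made concrete.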
Theorem 4 (Métivier; analyticity is necessary for general well-posedness of the non-characteristic Cauchy problem). For a non-Kovalevskayan (genuinely overdetermined-in-time) class of operators, the Cauchy problem is well-posed in Gevrey or smooth classes only under symbol conditions strictly stronger than non-characteristicity; the analytic category is the unique one in which non-characteristicity alone suffices. This places Cauchy-Kovalevskaya as the maximal general theorem: weakening the function class forces extra hypotheses (hyperbolicity, parabolicity, ellipticity) tied to the symbol's geometry rather than to the surface alone.
Synthesis. Cauchy-Kovalevskaya and Holmgren together are the foundational reason that local solvability of analytic equations needs only one structural input, the non-characteristic test on the principal symbol, and the central insight is that the same geometric-series majorant certifies convergence for both the forward problem and the adjoint problem that powers uniqueness. The existence theorem and the uniqueness theorem are dual: Holmgren is Cauchy-Kovalevskaya applied to the formal adjoint and paired back by integration by parts, so what looks like two theorems is one convergence engine viewed from two sides. This is exactly the pattern that builds toward the modern theory: the non-characteristic symbol condition generalises the light-cone geometry of the wave equation 02.13.04 and the ellipticity of the Laplace operator 02.13.01 into a single principal-symbol criterion; the duality pairing lives in the distribution calculus 02.14.04 and reappears in the microlocal propagation-of-singularities theorems; and the majorant method is the very engine of the analytic inverse function theorem 02.05.04, now run on series in the unknowns. Putting these together, the analytic Cauchy problem occupies the apex of a hierarchy whose lower floors (hyperbolic, parabolic, elliptic well-posedness) each trade the clean analytic hypothesis for a sharper symbol condition, and Lewy's example marks the exact edge where dropping analyticity of the data collapses existence, even for an operator with polynomial coefficients. The bridge from this unit to the symmetric-hyperbolic and microlocal theory is the recognition that non-characteristicity, normal-form reduction, and adjoint duality survive into the smooth category only when reinforced by the principal symbol's deeper geometry.
Full proof set Master
Proposition 1 (uniqueness of the formal solution). Under the hypotheses of the Cauchy-Kovalevskaya theorem, the Taylor coefficients $\mathbf{u}_{\alpha, j}$ of any analytic solution are uniquely determined by the system and the Cauchy data.
Proof. Coefficients with $j = 0$ are the tangential derivatives, equal to those of $g$, hence fixed. Suppose all coefficients with $j < J$ are determined. The system differentiated $J - 1$ times in $t$ and $|\alpha|$ times tangentially, then evaluated at $0$, expresses the coefficient with $j = J$ as a universal polynomial in the Taylor coefficients of $B_j$, $c$ and in coefficients of $\mathbf{u}$ of normal order $< J$ (the right side carries at most one $x'$-derivative beyond those already present, and the tangential derivatives raise only $|\alpha|$). By the induction hypothesis these are known, so the order-$J$ coefficient is determined. Induction on $J$ fixes every coefficient.
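Proposition 2 (majorant domination propagates through the recursion). Let the recursion of Proposition 1 determine $\mathbf{u}_{\alpha, j} = q_{\alpha, j}(B, c, g)$, where each $q_{\alpha, j}$ is a polynomial with non-negative coefficients in the Taylor coefficients of $B_j$, $c$, $g$. If $B_j^* \gg B_j$, $c^* \gg c$, $g^* \gg g$, then the solution $\mathbf{u}^*$ of the majorised system satisfies $\mathbf{u}^* \gg \mathbf{u}$.
Proof. Because $q_{\alpha, j}$ has non-negative coefficients, it is monotone in each argument: replacing every input Taylor coefficient by one of larger or equal absolute value can only increase the output. The majorant hypotheses say exactly that each input coefficient of $(B^*, c^*, g^*)$ dominates the absolute value of the corresponding input of $(B, c, g)$. Hence, by the triangle inequality and monotonicity, $|\mathbf{u}_{\alpha, j}| = |q_{\alpha, j}(B, c, g)| \le q_{\alpha, j}(|B|, |c|, |g|) \le q_{\alpha, j}(B^*, c^*, g^*) = \mathbf{u}^*_{\alpha, j}$ for every $(\alpha, j)$, which is $\mathbf{u}^* \gg \mathbf{u}$.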
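The propagation of domination can be watched on a toy problem. The following sketch replaces the system by a scalar ODE $u' = f(u)$, $u(0) = 0$, which obeys the same kind of recursion; it takes $f(u) = 1 - u^2$ (solution $\tanh t$) and the geometric majorant $f^*(u) = 1/(1 - u)$ (solution $1 - \sqrt{1 - 2t}$). The ODE setting and the function names are assumptions made for illustration, not the source's construction.

```python
from fractions import Fraction


def mul_trunc(p, q, n):
    """Product of coefficient lists p and q, truncated to degree < n."""
    out = [Fraction(0)] * n
    for i, pi in enumerate(p[:n]):
        if pi == 0:
            continue
        for j, qj in enumerate(q[: n - i]):
            out[i + j] += pi * qj
    return out


def solve_ode_series(f, n):
    """Taylor coefficients a[0..n-1] of the solution of u' = f(u), u(0) = 0,
    where f(u) = sum_m f[m] * u**m is given by its coefficient list."""
    a = [Fraction(0)] * n
    for k in range(n - 1):
        # Coefficient of t**k in f(u(t)); it only involves a[0..k].
        power = [Fraction(1)] + [Fraction(0)] * k   # series of u**0
        rhs_k = Fraction(0)
        for fm in f:
            rhs_k += fm * power[k]
            power = mul_trunc(power, a[: k + 1], k + 1)
        a[k + 1] = rhs_k / (k + 1)
    return a


if __name__ == "__main__":
    n = 10
    u = solve_ode_series([1, 0, -1], n)       # f(u) = 1 - u**2
    u_star = solve_ode_series([1] * n, n)     # f*(u) = 1 / (1 - u), truncated
    print(all(us >= abs(c) for us, c in zip(u_star, u)))
```

Each coefficient of $u^*$ is produced by the same polynomial recursion fed with larger non-negative inputs, so it dominates the corresponding coefficient of $u$ in absolute value, exactly as the proposition asserts.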
Proposition 3 (the geometric majorant problem is analytically solvable). The scalar majorant problem has a solution real-analytic at the origin.
Proof. Seek as a function of the single combination through the characteristic ansatz; equivalently, look for analytic at with . Clearing the denominator, satisfies the analytic relation
which at the origin reads . Define as the algebraic equation obtained by integrating the characteristic system; explicitly the solution is the root, vanishing at the origin, of a quadratic in with coefficients analytic in :
The right side is analytic and positive near , and the implicit relation has non-vanishing -derivative there ( at the origin). By the real-analytic implicit function theorem 02.05.04, is real-analytic near , and a direct substitution confirms it solves the majorant problem with the prescribed data.
Proposition 4 (Green's identity for the Holmgren pairing). Let $D$ be a bounded region whose boundary consists of a piece of the non-characteristic surface $\Sigma$ and a piece of a transversal surface $\Sigma'$. For $u \in C^k(\overline{D})$ and $v \in C^k(\overline{D})$,
$$\int_D \bigl(v\,Pu - u\,P^{t}v\bigr)\,dx = \int_{\partial D} B(u, v)\,dS,$$
where $P^{t}$ is the formal adjoint and $B$ is a bilinear boundary form depending on $u$, $v$ and their derivatives up to order $k - 1$.
Proof. Each term $v\,a_\alpha\,\partial^\alpha u$ is integrated by parts $|\alpha|$ times; every integration moves one derivative from $u$ to $a_\alpha v$ and emits a divergence whose integral is, by the divergence theorem, a boundary integral of a bilinear expression in lower-order derivatives of $u$ and $v$. Summing over $\alpha$ collects the interior terms into $u\,P^{t}v$, with $P^{t}v = \sum_{|\alpha| \le k} (-1)^{|\alpha|}\,\partial^\alpha(a_\alpha v)$, and collects the boundary emissions into $B(u, v)$. The boundary form involves only derivatives of order $\le k - 1$, i.e. precisely the Cauchy data on each boundary piece.
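In the lowest-order case the identity can be written out in full. As a hedged illustration (an example chosen here, not taken from the source), take the first-order operator $P = \partial_t + a(x,t)\,\partial_x$ in two variables, with $\nu = (\nu_x, \nu_t)$ the outward unit normal to $\partial D$:

```latex
\begin{aligned}
P u &= u_t + a\,u_x, \qquad P^{t} v = -v_t - (a\,v)_x,\\[2pt]
v\,Pu - u\,P^{t}v
  &= v\,u_t + a\,v\,u_x + u\,v_t + u\,(a\,v)_x
   = (u\,v)_t + (a\,u\,v)_x,\\[2pt]
\int_D \bigl(v\,Pu - u\,P^{t}v\bigr)\,dx\,dt
  &= \int_{\partial D} u\,v\,\bigl(\nu_t + a\,\nu_x\bigr)\,dS .
\end{aligned}
```

Here the boundary form is $u\,v$ times the principal symbol of $P$ evaluated on the normal, so it involves no derivatives at all (order $k - 1 = 0$) and vanishes wherever either factor has zero Cauchy data.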
Proposition 5 (Holmgren uniqueness from the pairing). With $Pu = 0$, $u \in C^k(\overline{D})$, and vanishing Cauchy data on $\Sigma$, and $v$ ranging over analytic solutions of $P^{t}v = f$ in $D$ with zero Cauchy data on $\Sigma'$, one has $u = 0$ in $D$.
Proof. By Proposition 4, $\int_D (v\,Pu - u\,f)\,dx = \int_{\partial D} B(u, v)\,dS$. The first interior term vanishes since $Pu = 0$. The boundary integral splits over $\Sigma$ and $\Sigma'$: on $\Sigma$ the Cauchy data of $u$ vanish, killing $B(u, v)$; on $\Sigma'$ the Cauchy data of $v$ vanish by construction, killing $B(u, v)$. Hence $\int_D u\,f\,dx = 0$. Cauchy-Kovalevskaya applied to $P^{t}$ (non-characteristic since $P$ is, as $P$ and $P^{t}$ share a principal symbol up to sign) produces such $v$ for every analytic $f$. Analytic functions are dense in $C(\overline{D})$ by Weierstrass approximation, so $\int_D u\,f\,dx = 0$ for all $f$ in a dense set, forcing $u = 0$ a.e., hence everywhere by continuity.
Connections Master
The non-characteristic test generalises the geometry already seen for the wave equation 02.13.04, where the characteristic surfaces are precisely the light cones and the Cauchy problem is posed on the non-characteristic spacelike hyperplane $\{t = 0\}$; Cauchy-Kovalevskaya recovers analytic solvability there as one instance, while the wave equation's own theory extends solvability to non-analytic data that this unit cannot reach.
The convergence engine is the method of majorants, identical to the one proving the real-analytic inverse and implicit function theorems 02.05.04; this unit reuses that result directly in solving the geometric majorant problem and in Holmgren's adjoint construction, so the analytic-category toolkit is shared rather than duplicated.
Holmgren's duality pairing is an integration-by-parts statement that lives most naturally in the language of distributions and the Schwartz kernel theorem 02.14.04; the boundary form is a distributional trace, and the propagation-of-singularities refinement of Holmgren (the Holmgren-John uniqueness theorem) is a microlocal statement about wavefront sets defined there.
The elliptic prototype 02.13.01 supplies the cleanest non-characteristic case: ellipticity means no real surface is characteristic, so analytic data on any analytic surface is solvable, and the analyticity of harmonic functions is the Cauchy-Kovalevskaya shadow of the elliptic regularity proved by other means in that unit.
Historical & philosophical context Master
Augustin-Louis Cauchy introduced the method of limits (calcul des limites), the ancestor of the method of majorants, in a series of 1842 notes in the Comptes Rendus [Cauchy 1842], proving local existence for analytic systems of the special form that now bears his and Kovalevskaya's names. Cauchy's argument already contained the decisive idea of dominating the unknown power series by an explicitly summable comparison series, though his treatment was restricted to particular normal forms.
Sofya Kovalevskaya, in her 1875 Crelle paper Zur Theorie der partiellen Differentialgleichungen [Kovalevskaya 1875] (the work for which Göttingen awarded her the doctorate in absentia under Weierstrass's supervision), gave the general theorem for arbitrary analytic systems in normal form, with the clean reduction to first order and the geometric majorant that makes the convergence proof uniform. Her exposition fixed the modern statement: a non-characteristic analytic Cauchy problem has a unique local analytic solution. Weierstrass's insistence on rigorous convergence estimates shaped the majorant technique into the form still taught.
Erik Holmgren proved his uniqueness theorem in 1901 [Holmgren 1901], recognising that the existence theorem for the adjoint operator could be turned, by duality, into a uniqueness theorem valid for non-analytic solutions of analytic linear equations. The geometric device of convexifying the surface to guarantee adjoint solvability is Holmgren's transformation; Fritz John and later Lars Hörmander recast the argument in the operator-theoretic and microlocal language of the twentieth century [Hörmander 1990 §9.4], connecting it to the wavefront-set propagation theory and to the uniqueness-across-non-characteristic-surfaces results of John and of Mizohata [Mizohata 1962].
Hans Lewy's 1957 Annals note An example of a smooth linear partial differential equation without solution [Lewy 1957] ended any hope of dropping analyticity for free: his three-real-variable first-order operator with polynomial coefficients has no local solution for generic smooth right-hand sides. The example provoked the local-solvability program of Hörmander, Nirenberg, and Trèves, culminating in the condition $(\Psi)$ that characterises solvability through the geometry of the principal symbol's Poisson bracket. Jacques Hadamard's earlier framing of well-posedness [Hadamard 1923] supplies the conceptual backdrop: existence, uniqueness, and continuous dependence are separate demands, and the Cauchy-Kovalevskaya/Holmgren pair settles the first two in the analytic category while leaving the third, and the entire smooth category, to the symbol-sensitive theories that grew from Lewy's counterexample.
Bibliography Master
@article{Cauchy1842,
author = {Cauchy, Augustin-Louis},
title = {M\'emoire sur l'emploi du calcul des limites dans l'int\'egration des \'equations aux d\'eriv\'ees partielles},
journal = {Comptes Rendus de l'Acad\'emie des Sciences de Paris},
volume = {15},
year = {1842},
pages = {44--59}
}
@article{Kovalevskaya1875,
author = {Kovalevskaya, Sofya},
title = {Zur {T}heorie der partiellen {D}ifferentialgleichungen},
journal = {Journal f\"ur die reine und angewandte Mathematik (Crelle)},
volume = {80},
year = {1875},
pages = {1--32}
}
@article{Holmgren1901,
author = {Holmgren, Erik},
title = {\"Uber {S}ysteme von linearen partiellen {D}ifferentialgleichungen},
journal = {\"Ofversigt af Kongl. Vetenskaps-Akademiens F\"orhandlingar},
volume = {58},
year = {1901},
pages = {91--103}
}
@article{Lewy1957,
author = {Lewy, Hans},
title = {An example of a smooth linear partial differential equation without solution},
journal = {Annals of Mathematics},
volume = {66},
year = {1957},
pages = {155--158}
}
@article{Mizohata1962,
author = {Mizohata, Sigeru},
title = {Solutions nulles et solutions non analytiques},
journal = {Journal of Mathematics of Kyoto University},
volume = {1},
year = {1962},
pages = {271--302}
}
@book{Hormander1990,
author = {H\"ormander, Lars},
title = {The Analysis of Linear Partial Differential Operators I},
edition = {2},
publisher = {Springer},
series = {Grundlehren der mathematischen Wissenschaften},
volume = {256},
year = {1990}
}
@book{Hadamard1923,
author = {Hadamard, Jacques},
title = {Lectures on {C}auchy's Problem in Linear Partial Differential Equations},
publisher = {Yale University Press},
year = {1923}
}
@book{Evans2010,
author = {Evans, Lawrence C.},
title = {Partial Differential Equations},
edition = {2},
publisher = {American Mathematical Society},
series = {Graduate Studies in Mathematics},
volume = {19},
year = {2010}
}
@book{Folland1995,
author = {Folland, Gerald B.},
title = {Introduction to Partial Differential Equations},
edition = {2},
publisher = {Princeton University Press},
year = {1995}
}
@book{John1982,
author = {John, Fritz},
title = {Partial Differential Equations},
edition = {4},
publisher = {Springer},
year = {1982}
}