37.08.05 · probability / 08-random-matrices

The Airy Kernel and the Tracy-Widom Edge Law

Anchor (Master): Anderson, Guionnet, Zeitouni, An Introduction to Random Matrices (Cambridge, 2010) Ch. 3; Tracy, Widom, Commun. Math. Phys. 159 (1994) and 177 (1996); Deift, Orthogonal Polynomials and Random Matrices: A Riemann-Hilbert Approach (AMS, 1999); Forrester, Log-Gases and Random Matrices (Princeton, 2010) Ch. 9; Hastings, McLeod, Arch. Ration. Mech. Anal. 73 (1980)

Intuition Beginner

The eigenvalues of a large random matrix spread out into a smooth bulk, densest in the middle and thinning toward two edges where the spectrum simply stops. The single most interesting eigenvalue is often the very largest one, sitting right at the top edge. Where does it land, and how much does it wobble from one random matrix to the next? That wobble is not just noise to be averaged away — it has a precise and surprising shape, and that shape is the subject of this unit.

You might guess the largest eigenvalue jitters like a bell curve, the way most averaged quantities do. It does not. Near the edge the eigenvalues are sparse, and the topmost one is held in place from below by the repulsion of all the others. The result is a lopsided distribution: the top eigenvalue rarely dips far below its typical spot, because that would mean squeezing the whole crowd beneath it, but it can stray somewhat further above, since only one eigenvalue has to move. This special lopsided law has a name, the Tracy-Widom law, and it turns out to govern the top eigenvalue of a very broad class of large random matrices, largely independent of the fine details of how the matrix was built.

The reason this matters far beyond matrices is that the same law keeps reappearing in places that look unrelated: the length of the longest increasing run in a shuffled deck, the shape of a growing crystal interface, the front of a spreading stain. Whenever many things compete to be the extreme one under a rigid, repelling crowd, the Tracy-Widom shape emerges.

Visual Beginner

Picture the eigenvalues as dots piled under a smooth dome — the semicircle profile. In the middle the dots are packed tight; as you move outward toward the right edge they thin out, and just past the edge of the dome there are almost none. Now zoom way in on that right edge with a magnifying glass, stretching the picture so the few topmost dots fill the view. On this magnified scale the highest dot is seen to dance back and forth a little each time you draw a fresh matrix.

The lopsided curve under the inset is the punchline. It records how often the largest eigenvalue lands at each spot. The longer tail points right: the top eigenvalue can occasionally break away above its usual perch, since only that one dot has to move. The left side falls off more sharply, because sagging below the perch means pushing the whole repelling crowd down with it. A symmetric bell curve would have matching tails on both sides; this one does not, and that asymmetry is the fingerprint of edge fluctuations.

Worked example Beginner

We see how the magnification scale at the edge is forced on us, using only counting, with no matrix algebra.

Step 1. In the bulk of an $n$ by $n$ matrix the eigenvalues fill a range of width about $4$ (from about $-2$ to $2$ after the usual scaling), so the typical gap between neighbours is of order $1/n$. Magnifying the bulk by a factor of $n$ makes neighbours sit about one unit apart.

Step 2. At the edge the dome comes down to zero, so the dots thin out and the gaps grow. The density of dots near the right edge falls off like the square root of the distance from the edge: a dot a small distance $d$ inside the edge sits where the density is proportional to the square root of $d$.

Step 3. The topmost few dots occupy the region where there is, on average, about one dot. Write $d$ for the width of that region. The expected count is about $n$ times the density times the width, that is $n\cdot\sqrt d\cdot d = n\,d^{3/2}$. Setting this equal to one gives $d = n^{-2/3}$: the natural width of the top region shrinks like $n$ raised to the power minus two-thirds. The right magnification at the edge is therefore $n$ to the two-thirds, a coarser zoom than the bulk's factor of $n$.

Step 4. Check the exponents with a number. Take . The bulk gap is about . The edge window scales like to the minus two-thirds, which is — wider than the bulk gap, exactly as expected, since the dots are sparser at the edge.

Step 5. What this tells us: the bulk and the edge live on different scales. The bulk is magnified by $n$ and its local statistics are the sine pattern of the previous unit; the edge is magnified by the gentler $n$ to the two-thirds, and on that scale a new pattern, the Airy pattern, and a new fluctuation law, Tracy-Widom, take over.
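
The counting argument of the worked example can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names `bulk_gap` and `edge_window` are made up for this example, constants of order one are dropped, and the spectrum is assumed scaled to fill an interval of fixed width.

```python
def bulk_gap(n: int) -> float:
    """Typical neighbour spacing in the bulk, up to a constant: about 1/n."""
    return 1.0 / n


def edge_window(n: int) -> float:
    """Width d of the top region, from n * sqrt(d) * d = 1, i.e. d = n**(-2/3)."""
    return n ** (-2.0 / 3.0)


for n in (10**3, 10**6):
    d = edge_window(n)
    # Expected number of dots in the top window: n * density * width ~ n * d**1.5.
    expected_count = n * d ** 1.5
    # Ratio of edge window to bulk gap grows like n**(1/3).
    ratio = d / bulk_gap(n)
    print(f"n={n}: bulk gap={bulk_gap(n):.3g}, edge window={d:.3g}, "
          f"ratio={ratio:.3g}, expected count={expected_count:.3g}")
```

The expected count stays at one for every $n$, which is the defining property of the edge window, while the ratio of the two scales grows like $n^{1/3}$.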

Check your understanding Beginner

Formal definition Intermediate+

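Work with the Gaussian Unitary Ensemble in the normalisation of 37.08.04, where the GUE has the Hermite kernel $$ K_n(x,y) = \sum_{k=0}^{n-1}\phi_k(x)\,\phi_k(y),\qquad \phi_k = p_k\, e^{-x^2/4}, $$ with $p_k$ the Hermite polynomials orthonormal for $e^{-x^2/2}\,dx$, so that the spectrum is supported asymptotically on $[-2\sqrt n,\,2\sqrt n]$ and the soft edge sits at $2\sqrt n$. The Airy function $\mathrm{Ai}$ is the entire solution of $\mathrm{Ai}''(x) = x\,\mathrm{Ai}(x)$ decaying as $x\to+\infty$, with the contour representation $$ \mathrm{Ai}(x) = \frac{1}{2\pi i}\int_{\mathcal C}\exp\!\Big(\frac{t^3}{3} - x t\Big)\, dt, $$ where $\mathcal C$ runs from $\infty e^{-i\pi/3}$ to $\infty e^{i\pi/3}$. Its asymptotics are $\mathrm{Ai}(x)\sim \dfrac{e^{-\frac23 x^{3/2}}}{2\sqrt\pi\,x^{1/4}}$ for $x\to+\infty$ and an oscillatory decay $\mathrm{Ai}(x)\sim \pi^{-1/2}|x|^{-1/4}\sin\!\big(\tfrac23|x|^{3/2}+\tfrac\pi4\big)$ for $x\to-\infty$.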

The edge rescaling magnifies the spectrum at $2\sqrt n$ on the scale $n^{-1/6}$, which is the fluctuation window of the largest eigenvalue. Set $$ x = 2\sqrt n + \frac{u}{\sqrt 2\, n^{1/6}},\qquad y = 2\sqrt n + \frac{v}{\sqrt 2\, n^{1/6}}. $$ The Airy kernel is the edge scaling limit of the Hermite kernel: $$ \frac{1}{\sqrt 2\, n^{1/6}}\,K_n(x,y)\;\longrightarrow\; K_{\mathrm{Ai}}(u,v) = \frac{\mathrm{Ai}(u)\,\mathrm{Ai}'(v) - \mathrm{Ai}'(u)\,\mathrm{Ai}(v)}{u - v}, $$ with the diagonal value $K_{\mathrm{Ai}}(u,u) = \mathrm{Ai}'(u)^2 - u\,\mathrm{Ai}(u)^2$ by l'Hôpital and the Airy equation. The kernel $K_{\mathrm{Ai}}$ defines the Airy point process, a determinantal process on $\mathbb R$ whose points are the edge-rescaled limits of the GUE eigenvalues; its top point is the limiting fluctuation of the largest eigenvalue.
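
The diagonal value of the Airy kernel follows from a single l'Hôpital step. As a sketch, differentiate numerator and denominator in $v$ at fixed $u$, let $v\to u$, and then use the Airy equation $\mathrm{Ai}''(u) = u\,\mathrm{Ai}(u)$:

$$ K_{\mathrm{Ai}}(u,u) = \lim_{v\to u}\frac{\mathrm{Ai}(u)\,\mathrm{Ai}'(v) - \mathrm{Ai}'(u)\,\mathrm{Ai}(v)}{u-v} = \lim_{v\to u}\frac{\mathrm{Ai}(u)\,\mathrm{Ai}''(v) - \mathrm{Ai}'(u)\,\mathrm{Ai}'(v)}{-1} = \mathrm{Ai}'(u)^2 - \mathrm{Ai}(u)\,\mathrm{Ai}''(u) = \mathrm{Ai}'(u)^2 - u\,\mathrm{Ai}(u)^2. $$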

The Tracy-Widom distribution $F_2$ is the limiting law of the largest eigenvalue, written as the gap probability of the Airy point process on a half-line, i.e. the Fredholm determinant of $K_{\mathrm{Ai}}$ restricted to $(s,\infty)$: $$ F_2(s) = \lim_{n\to\infty}\mathbb P\!\Big(\lambda_{\max} \le 2\sqrt n + \frac{s}{\sqrt 2\, n^{1/6}}\Big) = \det\!\big(I - K_{\mathrm{Ai}}\big)\big|_{L^2(s,\infty)}. $$ Tracy and Widom identified this determinant with a Painlevé II transcendent. Let $q$ solve $$ q''(s) = s\,q(s) + 2\,q(s)^3,\qquad q(s)\sim \mathrm{Ai}(s)\ \text{ as } s\to+\infty, $$ the Hastings-McLeod solution (the unique one with that decay; it behaves as $q(s)\sim\sqrt{-s/2}$ as $s\to-\infty$). Then $$ F_2(s) = \exp\!\Big(-\int_s^\infty (x-s)\,q(x)^2\, dx\Big). $$ The GOE and GSE ($\beta = 1, 4$) edge laws are $F_1$ and $F_4$, expressible through the same $q$: $$ F_1(s)^2 = F_2(s)\,\exp\!\Big(-\int_s^\infty q(x)\,dx\Big),\qquad F_4(s/\sqrt2)^2 = F_2(s)\,\cosh^2\!\Big(\tfrac12\int_s^\infty q(x)\,dx\Big), $$ so the three classical symmetry classes share one Painlevé II transcendent and differ only by the elementary factor.
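
The Fredholm determinant on the half-line can be approximated numerically. The following is a minimal sketch, not the method of any cited reference: it swaps in a quadrature (Nyström-type) discretisation with Gauss-Legendre nodes, truncating $(s,\infty)$ to $(s, s+L)$ on the assumption that the Airy kernel is negligible beyond the cutoff. The names `airy_kernel` and `tracy_widom_cdf`, the node count `m`, and the cutoff `length` are illustrative choices; it assumes NumPy and SciPy are available.

```python
import numpy as np
from scipy.special import airy


def airy_kernel(u, v):
    """Airy kernel K_Ai(u, v), using the diagonal formula where u == v."""
    u, v = np.broadcast_arrays(np.asarray(u, float), np.asarray(v, float))
    ai_u, aip_u, _, _ = airy(u)
    ai_v, aip_v, _, _ = airy(v)
    diff = u - v
    same = np.isclose(diff, 0.0)
    safe = np.where(same, 1.0, diff)          # avoid division by zero
    off_diag = (ai_u * aip_v - aip_u * ai_v) / safe
    diag = aip_u**2 - u * ai_u**2             # l'Hopital limit v -> u
    return np.where(same, diag, off_diag)


def tracy_widom_cdf(s, m=40, length=12.0):
    """Approximate F_2(s) = det(I - K_Ai) on L^2(s, s + length)."""
    nodes, weights = np.polynomial.legendre.leggauss(m)
    x = s + 0.5 * length * (nodes + 1.0)      # map [-1, 1] to [s, s + length]
    w = 0.5 * length * weights
    sw = np.sqrt(w)
    kernel = airy_kernel(x[:, None], x[None, :])
    # Symmetrised quadrature matrix sqrt(w_i) K(x_i, x_j) sqrt(w_j).
    return np.linalg.det(np.eye(m) - sw[:, None] * kernel * sw[None, :])


for s in (-4.0, -2.0, 0.0):
    print(s, tracy_widom_cdf(s))
```

Because the kernel is smooth, the quadrature error falls off quickly with `m`; the cutoff matters only for very negative `s`, where a longer interval and more nodes would be needed.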

Counterexamples to common slips Intermediate+

  • The edge scale is $n^{-1/6}$, not $n^{1/2}$ or $n^{-1/2}$. The eigenvalue lives near $2\sqrt n$ and fluctuates on the additive scale $n^{-1/6}$ (so $\lambda_{\max} - 2\sqrt n$ is of order $n^{-1/6}$ in these units); in the often-quoted normalisation where the spectrum fills $[-2,2]$, the fluctuation window is $n^{-2/3}$. Conflating the location $2\sqrt n$, the bulk spacing $n^{-1/2}$, and the edge window $n^{-1/6}$ is the most common error.
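  • Tracy-Widom is not Gaussian. To leading exponential order, $F_2$ has tails $F_2(s)\approx e^{-|s|^3/12}$ as $s\to-\infty$ and $1-F_2(s)\approx e^{-\frac43 s^{3/2}}$ as $s\to+\infty$: strongly asymmetric, with a cubic left tail. Fitting a normal curve to the largest eigenvalue is wrong.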
  • The Airy kernel is not the sine kernel. Rescaling at a bulk point gives $K_{\sin}$ 37.08.04; rescaling at the edge gives $K_{\mathrm{Ai}}$. The two come from different Plancherel-Rotach regimes (oscillatory cosine in the bulk, turning-point Airy at the edge) and have different scaling exponents ($n^{-1/2}$ versus $n^{-1/6}$).
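  • The Hastings-McLeod solution is a specific solution. Painlevé II has a one-parameter family of solutions with the prescribed asymptotics $q(s)\sim k\,\mathrm{Ai}(s)$ as $s\to+\infty$. For $|k|<1$ the solution stays bounded and oscillates as $s\to-\infty$; for $|k|>1$ it blows up at a finite point; only $k=\pm1$ gives the pole-free solution with the growth $|q(s)|\sim\sqrt{-s/2}$ as $s\to-\infty$. $F_2$ uses precisely $k=1$.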
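
The first two slips can be probed by simulation. The sketch below is a rough Monte Carlo illustration under stated assumptions, not a proof: it samples GUE matrices with unit-variance entries, so that the spectrum of $H/\sqrt n$ fills $[-2,2]$, and rescales the top eigenvalue on the $n^{-2/3}$ window of that normalisation. The helper names, matrix size, trial count and seed are arbitrary illustrative choices, and NumPy is assumed.

```python
import numpy as np


def sample_gue(n, rng):
    """Hermitian matrix with unit-variance entries; H / sqrt(n) has spectrum near [-2, 2]."""
    g = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2.0)
    return (g + g.conj().T) / np.sqrt(2.0)


def rescaled_top_eigenvalues(n, trials, seed=0):
    """Samples of n^(2/3) * (lambda_max / sqrt(n) - 2)."""
    rng = np.random.default_rng(seed)
    out = np.empty(trials)
    for k in range(trials):
        lam_max = np.linalg.eigvalsh(sample_gue(n, rng))[-1]  # eigenvalues ascending
        out[k] = n ** (2.0 / 3.0) * (lam_max / np.sqrt(n) - 2.0)
    return out


samples = rescaled_top_eigenvalues(n=100, trials=300)
mean = samples.mean()
std = samples.std()
skew = ((samples - mean) ** 3).mean() / std**3
print(f"mean={mean:.3f}  std={std:.3f}  skewness={skew:.3f}")
```

On the correct scale the sample mean and standard deviation settle near the Tracy-Widom values (a negative mean and a spread of order one) instead of drifting with $n$. The sample skewness is noisy at this sample size, so detecting the asymmetry reliably would need many more trials.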

Key theorem with proof Intermediate+

Theorem (Airy-kernel edge limit of the GUE). With the edge rescaling $x = 2\sqrt n + u/(\sqrt2\,n^{1/6})$, $y = 2\sqrt n + v/(\sqrt2\,n^{1/6})$, the Hermite kernel of the GUE satisfies $$ \lim_{n\to\infty}\frac{1}{\sqrt2\, n^{1/6}}\,K_n(x,y) = K_{\mathrm{Ai}}(u,v) = \frac{\mathrm{Ai}(u)\,\mathrm{Ai}'(v) - \mathrm{Ai}'(u)\,\mathrm{Ai}(v)}{u-v}, $$ uniformly for $(u,v)$ in compact sets. Consequently the largest GUE eigenvalue, recentred and rescaled at the edge, converges in distribution to the law $F_2$.

Proof. By Christoffel-Darboux 37.08.04 the kernel is carried by the two top Hermite functions: $$ K_n(x,y) = \sqrt n\,\frac{\phi_n(x)\phi_{n-1}(y) - \phi_{n-1}(x)\phi_n(y)}{x-y}. $$ The whole limit therefore reduces to the edge Plancherel-Rotach asymptotics of $\phi_n$ and $\phi_{n-1}$ near $2\sqrt n$. At the edge the classical turning point of the Hermite differential operator coincides with the spectral boundary, and the WKB cosine of the bulk degenerates: the rescaled Hermite function converges to the Airy function. Concretely, with $x = 2\sqrt n + \xi/(\sqrt2\,n^{1/6})$, $$ \Big(\frac{1}{\sqrt2\, n^{1/6}}\Big)^{1/2}\phi_n(x) \longrightarrow \mathrm{Ai}(\xi),\qquad \Big(\frac{1}{\sqrt2\, n^{1/6}}\Big)^{1/2}\phi_{n-1}(x)\longrightarrow \mathrm{Ai}(\xi), $$ uniformly on compacts, where the shift $n\to n-1$ leaves the limit unchanged at leading order because the index step is finer than the edge window. The cleanest derivation reads the Airy limit from the contour integral for the Hermite functions $$ \phi_n(x) = \frac{c_n}{2\pi i}\oint \exp\!\big(n f(t) + \cdots\big)\,\frac{dt}{t},\qquad f(t) = \tfrac12 t^2 - \log t\ (\text{schematically}), $$ and applies steepest descent. At a generic bulk point $f$ has two simple, well-separated saddles producing the oscillatory cosine; at the edge the two saddles coalesce into one degenerate saddle $t_0$ where $f'(t_0) = f''(t_0) = 0$. Near a coalescing saddle the cubic term governs, the local model is $\exp(\tau^3/3 - \xi\tau)$, and this is exactly the Airy contour integral. The scale is forced by balancing the cubic: writing $t = t_0 + \tau\,n^{-1/3}$ makes the cubic term of $n f(t)$ of order one, and the linear term equals $-\xi\tau$ precisely when the spatial displacement is $\xi/(\sqrt2\,n^{1/6})$, fixing both the scale and the constant.

Insert the two convergences into Christoffel-Darboux. The prefactor $\sqrt n$ combined with the rescaling $1/(\sqrt2\,n^{1/6})$ and the two amplitude factors collects, after the substitution $x - y = (u-v)/(\sqrt2\,n^{1/6})$ in the denominator, into the finite limit $$ \frac{1}{\sqrt2\, n^{1/6}}K_n(x,y) \longrightarrow \frac{\mathrm{Ai}(u)\,\mathrm{Ai}'(v) - \mathrm{Ai}'(u)\,\mathrm{Ai}(v)}{u - v}, $$ where the appearance of $\mathrm{Ai}'$ comes from the index shift $n\to n-1$ contributing the derivative of the limiting Airy profile through the discrete difference $\phi_n - \phi_{n-1}$, which on the edge scale tends to a multiple of $\mathrm{Ai}'$. This is $K_{\mathrm{Ai}}(u,v)$. Uniformity on compacts follows from the uniform steepest-descent error.

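For the largest eigenvalue, the event $\{\lambda_{\max}\le 2\sqrt n + s/(\sqrt2\,n^{1/6})\}$ is the event that the rescaled point process has no point in $(s,\infty)$. The gap probability of a determinantal process is the Fredholm determinant of its kernel on that set 37.08.04, and trace-class convergence on $L^2(s,\infty)$ (the Airy kernel is trace-class on a half-line because of the super-exponential decay of $\mathrm{Ai}$) transfers to convergence of the Fredholm determinants. Hence $\mathbb P\big(\lambda_{\max}\le 2\sqrt n + s/(\sqrt2\,n^{1/6})\big)\to\det(I-K_{\mathrm{Ai}})\big|_{L^2(s,\infty)} = F_2(s)$.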

Bridge. This edge computation builds toward the universal edge statistics of Hermitian random matrices, and it appears again in the Painlevé II / Hastings-McLeod analysis where the Fredholm determinant is evaluated in closed form. The foundational reason a single special function governs the edge is that two steepest-descent saddles coalesce there into one degenerate saddle whose universal local model is the Airy integral — this is exactly the same mechanism by which the sine kernel arose from well-separated saddles in the bulk 37.08.04, so the bulk and edge are two regimes of one contour integral. The Airy kernel is dual to the sine kernel in this sense: the sine kernel is the Fourier projection onto a band of occupied frequencies, while the Airy kernel is the projection associated with the Schrödinger operator whose eigenfunctions are shifted Airy functions, and it generalises the bulk universality of the previous unit to the spectral boundary. Putting these together, the bridge is that Christoffel-Darboux reduces every local limit to the top two Hermite functions, and which limit appears is dictated solely by whether their Plancherel-Rotach asymptotics are oscillatory (bulk, sine) or turning-point (edge, Airy); this is the central insight that makes the edge law as universal as the bulk law.

Exercises Intermediate+

Advanced results Master

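The Airy point process is the edge scaling limit of the GUE eigenvalue process and is itself determinantal with kernel $K_{\mathrm{Ai}}$. From the projection representation $K_{\mathrm{Ai}}(u,v) = \int_0^\infty \mathrm{Ai}(u+t)\,\mathrm{Ai}(v+t)\,dt$, $K_{\mathrm{Ai}}^2 = K_{\mathrm{Ai}}$, every finite gap probability is well defined: the Airy function's decay makes $K_{\mathrm{Ai}}$ trace-class on any half-line $(s,\infty)$, so the Fredholm determinant $F_2(s)$ converges and is an entire function of $s$ taking values in $(0,1)$ for real $s$, increasing from $0$ at $-\infty$ to $1$ at $+\infty$.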

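The Painlevé II representation is the deepest structural fact. Tracy and Widom analysed the resolvent $R = K_{\mathrm{Ai}}(I-K_{\mathrm{Ai}})^{-1}$ on $L^2(s,\infty)$ and found that the quantities $R(s,s)$ (diagonal resolvent) and the inner products with $\mathrm{Ai}$ satisfy a closed coupled ODE system that integrates to Painlevé II. Setting $q(s)$ to be the value built from $(I-K_{\mathrm{Ai}})^{-1}\mathrm{Ai}$ at the endpoint $s$, one obtains $q'' = sq + 2q^3$ with $q\sim\mathrm{Ai}$ at $+\infty$, and $$ \frac{d^2}{ds^2}\log F_2(s) = -q(s)^2,\qquad F_2(s) = \exp\!\Big(-\int_s^\infty (x-s)\,q(x)^2\,dx\Big). $$ The Hastings-McLeod solution is the unique Painlevé II solution real on all of $\mathbb R$ with this boundary value; its asymptotic $q(s)\sim\sqrt{-s/2}$ as $s\to-\infty$ gives the left tail $F_2(s)\approx e^{-|s|^3/12}$, while the tail $q\sim\mathrm{Ai}$ gives the right tail $1-F_2(s)\approx e^{-\frac43 s^{3/2}}$. The mean and variance are approximately $-1.771$ and $0.813$.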

The GOE and GSE edge laws were derived by Tracy and Widom from Pfaffian (matrix-kernel) versions of the Airy kernel. All three reduce to one transcendent: $$ F_1(s) = \exp\!\Big(-\tfrac12\int_s^\infty q(x)\,dx\Big)\sqrt{F_2(s)},\qquad F_4(s) = \cosh\!\Big(\tfrac12\int_s^\infty q(x)\,dx\Big)\sqrt{F_2(s)} $$ (with the GSE argument rescaled by $\sqrt2$). The factor is elementary given $q$; the symmetry class enters only through it, a clean instance of the threefold way 37.08.03 surviving the edge scaling.
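
The Painlevé II formulas can be turned into a short numerical recipe. The sketch below is one possible approach under stated assumptions, not the method of the cited papers: it starts the Hastings-McLeod solution at a large point `s0` with the approximation $q(s_0)\approx\mathrm{Ai}(s_0)$, integrates backward with SciPy's `solve_ivp`, and carries along the running integrals $I(s) = \int_s^\infty (x-s)\,q^2\,dx$ and $J(s) = \int_s^\infty q\,dx$. The function name and tolerances are illustrative, and backward integration of this kind is only reliable for moderate negative $s$, since the Hastings-McLeod solution is numerically unstable far to the left.

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.special import airy


def tracy_widom_from_painleve(s, s0=8.0):
    """Return (F1(s), F2(s)) by integrating Painleve II backward from s0 to s."""
    ai0, aip0, _, _ = airy(s0)

    def rhs(x, y):
        q, p, big_i, big_ip, big_j = y
        # q'' = x q + 2 q^3;  I'' = q^2 with I' = -int_x^inf q^2;  J' = -q.
        return [p, x * q + 2.0 * q**3, big_ip, q**2, -q]

    # At s0 the integrals I, I', J are negligibly small, so start them at zero.
    y0 = [ai0, aip0, 0.0, 0.0, 0.0]
    sol = solve_ivp(rhs, (s0, s), y0, method="RK45", rtol=1e-10, atol=1e-14)
    _, _, big_i, _, big_j = sol.y[:, -1]
    f2 = np.exp(-big_i)
    f1 = np.exp(-0.5 * big_j) * np.sqrt(f2)
    return f1, f2


for s in (-3.0, -2.0, 0.0):
    f1, f2 = tracy_widom_from_painleve(s)
    print(f"s={s}: F1={f1:.5f}  F2={f2:.5f}")
```

Since $\int_s^\infty q > 0$, the GOE value always sits below $\sqrt{F_2(s)}$, which gives a quick sanity check on the output.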

Edge universality lifts the exact-GUE computation to a theorem for all Wigner matrices, paralleling bulk universality 37.08.04. Soshnikov proved (1999) that for Wigner matrices with symmetric entry distributions and sub-Gaussian tails the largest eigenvalue obeys the Tracy-Widom law of the symmetry class; the four-moment theorem of Tao-Vu and the Green's-function comparison of Erdős-Yau removed the symmetry hypothesis, requiring only matching first four moments and a finite high moment. The same Tracy-Widom edge law also governs the longest increasing subsequence of a random permutation (Baik-Deift-Johansson, 1999), the corner-growth and last-passage-percolation height (Johansson), and the KPZ universality class height fluctuations, so the Tracy-Widom law is a genuine universal object far beyond random matrices.

Synthesis. The foundational reason a single transcendent organises the entire edge is that the GUE edge process is the determinantal projection onto the negative spectrum of the Airy operator, and the gap probability of this projection is a Fredholm determinant whose logarithmic curvature is exactly the squared Hastings-McLeod solution of Painlevé II — this is exactly why the bulk sine kernel of 37.08.04 and the edge Airy kernel are dual faces of one Christoffel-Darboux reduction, the first from well-separated saddles and the second from a coalescing turning-point saddle. Putting these together, the moment method 37.08.01, the resolvent self-consistency 37.08.02, the log-gas determinantal structure 37.08.03, and the bulk sine kernel 37.08.04 are views of one spectral object: the diagonal is the semicircle whose square-root edge sets the window, and the kernel whose bulk limit is the sine projection has edge limit the Airy projection. The central insight is that universality is the statement of which projection survives a scaling limit; this generalises the central-limit principle 37.03.01, and it is dual to the Gaussian for averaged quantities, since the edge is where averaging fails and a non-Gaussian transcendent takes over. The bridge to the frontier is that the same three-step local-law / Dyson-Brownian-motion / four-moment strategy that promotes the sine kernel to bulk universality promotes the Airy kernel to edge universality and the Tracy-Widom law to the entire KPZ class.

Full proof set Master

The Airy edge limit and the Tracy-Widom convergence are proved above. The remaining Master claims are recorded here.

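Proposition (Airy kernel is a projection; integral representation). $K_{\mathrm{Ai}}(u,v) = \int_0^\infty \mathrm{Ai}(u+t)\,\mathrm{Ai}(v+t)\,dt$, and as an operator on $L^2(\mathbb R)$, $K_{\mathrm{Ai}}$ is the orthogonal projection $\mathbf 1_{(-\infty,0]}(H)$ for $H = -\frac{d^2}{du^2} + u$; in particular $K_{\mathrm{Ai}}^2 = K_{\mathrm{Ai}}$.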

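Proof. Set $J(u,v) = \int_0^\infty \mathrm{Ai}(u+t)\,\mathrm{Ai}(v+t)\,dt$, convergent by the super-exponential decay of $\mathrm{Ai}$. Using $\mathrm{Ai}''(x) = x\,\mathrm{Ai}(x)$, $$ (\partial_u^2 - \partial_v^2)J = \int_0^\infty\big[(u+t)-(v+t)\big]\mathrm{Ai}(u+t)\,\mathrm{Ai}(v+t)\,dt = (u-v)J. $$ The integrand is also a total $t$-derivative: $(\partial_u^2 - \partial_v^2)$ applied pointwise to $\mathrm{Ai}(u+t)\,\mathrm{Ai}(v+t)$ equals $\partial_t\big[\mathrm{Ai}'(u+t)\,\mathrm{Ai}(v+t) - \mathrm{Ai}(u+t)\,\mathrm{Ai}'(v+t)\big]$ as well, so integrating from $0$ to $\infty$ and using the vanishing of the bracket at $\infty$ gives $(u-v)J = \mathrm{Ai}(u)\,\mathrm{Ai}'(v) - \mathrm{Ai}'(u)\,\mathrm{Ai}(v)$. Hence $J = K_{\mathrm{Ai}}$. The functions $\psi_t(u) = \mathrm{Ai}(u+t)$ satisfy $H\psi_t = -t\,\psi_t$ and form a continuous orthonormal basis ($\int_{\mathbb R}\mathrm{Ai}(u+t)\,\mathrm{Ai}(u+t')\,du = \delta(t-t')$ by the Airy resolution of identity), so $J$ is the spectral projection onto eigenvalues $-t\le 0$, i.e. $\mathbf 1_{(-\infty,0]}(H)$, which is an orthogonal projection: $K_{\mathrm{Ai}}^2 = K_{\mathrm{Ai}} = K_{\mathrm{Ai}}^*$.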
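
The integral representation can be spot-checked numerically against the closed form of the kernel. The sketch below assumes SciPy's `airy` and `quad`; the function names and the finite cutoff replacing the upper limit $\infty$ are illustrative choices, justified only by the rapid decay of $\mathrm{Ai}$ for the moderate arguments used here.

```python
from scipy.integrate import quad
from scipy.special import airy


def airy_kernel_closed(u, v):
    """Closed form (Ai(u) Ai'(v) - Ai'(u) Ai(v)) / (u - v), with the diagonal limit."""
    ai_u, aip_u, _, _ = airy(u)
    ai_v, aip_v, _, _ = airy(v)
    if u == v:
        return aip_u**2 - u * ai_u**2
    return (ai_u * aip_v - aip_u * ai_v) / (u - v)


def airy_kernel_integral(u, v, cutoff=20.0):
    """Integral of Ai(u + t) Ai(v + t) over t in [0, cutoff]."""
    def integrand(t):
        return airy(u + t)[0] * airy(v + t)[0]

    value, _ = quad(integrand, 0.0, cutoff, limit=200)
    return value


for u, v in [(0.3, -0.7), (-2.0, 1.5), (0.5, 0.5)]:
    print(u, v, airy_kernel_closed(u, v), airy_kernel_integral(u, v))
```

The two columns should agree to roughly the quadrature tolerance, including on the diagonal, where the closed form uses the l'Hôpital value.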

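Proposition (edge one-point density matches the semicircle edge). The one-point intensity $\rho(u) = K_{\mathrm{Ai}}(u,u) = \mathrm{Ai}'(u)^2 - u\,\mathrm{Ai}(u)^2$ satisfies $\rho(u)\sim\frac1\pi\sqrt{-u}$ as $u\to-\infty$ and $\rho(u)\to0$ super-exponentially as $u\to+\infty$.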

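Proof. The closed form $\rho(u) = \mathrm{Ai}'(u)^2 - u\,\mathrm{Ai}(u)^2$ follows from l'Hôpital and the Airy equation as in Exercise 1. For $u\to+\infty$, both $\mathrm{Ai}(u)^2$ and $\mathrm{Ai}'(u)^2$ decay like $e^{-\frac43u^{3/2}}$ (up to powers of $u$), so $\rho(u)\to0$ at that rate. For $u\to-\infty$ use the oscillatory asymptotics $\mathrm{Ai}(u)\approx\pi^{-1/2}|u|^{-1/4}\sin(\zeta+\tfrac\pi4)$ and $\mathrm{Ai}'(u)\approx-\pi^{-1/2}|u|^{1/4}\cos(\zeta+\tfrac\pi4)$ with $\zeta = \tfrac23|u|^{3/2}$. Then $\mathrm{Ai}'(u)^2\approx\pi^{-1}|u|^{1/2}\cos^2(\zeta+\tfrac\pi4)$ and $-u\,\mathrm{Ai}(u)^2\approx\pi^{-1}|u|^{1/2}\sin^2(\zeta+\tfrac\pi4)$, summing (the oscillation cancels at leading order through the Pythagorean identity) to $\rho(u)\approx\pi^{-1}\sqrt{|u|}$. This is the local form of the semicircle's square-root edge: moving into the bulk, the edge density grows like $\sqrt{|u|}/\pi$, matching $\rho_{\mathrm{sc}}(x) = \frac{1}{2\pi}\sqrt{4-x^2}$ from 37.08.01 under the edge rescaling.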
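
Both asymptotic regimes of the one-point intensity are easy to observe numerically. The following is a small sketch assuming SciPy's `airy`; the name `airy_density` and the sample points are illustrative.

```python
import numpy as np
from scipy.special import airy


def airy_density(u):
    """One-point intensity rho(u) = Ai'(u)^2 - u Ai(u)^2 of the Airy process."""
    ai, aip, _, _ = airy(u)
    return aip**2 - u * ai**2


# Left side: compare with the square-root law sqrt(|u|) / pi.
for u in (-5.0, -20.0, -80.0):
    ratio = airy_density(u) / (np.sqrt(-u) / np.pi)
    print(f"u={u}: rho={airy_density(u):.6f}  ratio to sqrt(|u|)/pi = {ratio:.6f}")

# Right side: rapid decay past the edge.
for u in (1.0, 3.0, 5.0):
    print(f"u={u}: rho={airy_density(u):.3e}")
```

The ratio approaches one as $u\to-\infty$, with an oscillating correction that shrinks as $|u|$ grows, while on the right the intensity collapses within a few units of the edge.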

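Proposition (Tracy-Widom logarithmic derivatives). With $\log F_2(s) = -\int_s^\infty (x-s)\,q(x)^2\,dx$, one has $\frac{d}{ds}\log F_2(s) = \int_s^\infty q(x)^2\,dx$ and $\frac{d^2}{ds^2}\log F_2(s) = -q(s)^2$.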

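Proof. Let $G(s) = \int_s^\infty (x-s)\,q(x)^2\,dx$, so $\log F_2 = -G$. Differentiating under the integral, the endpoint contribution at $x = s$ carries the factor $(x-s)|_{x=s} = 0$ and drops, leaving $G'(s) = -\int_s^\infty q(x)^2\,dx$. Hence $(\log F_2)'(s) = \int_s^\infty q(x)^2\,dx > 0$. Differentiating once more, $(\log F_2)''(s) = -q(s)^2\le 0$. Both quantities are finite for all real $s$ because the Hastings-McLeod $q$ decays like $\mathrm{Ai}$ at $+\infty$ (so the integrals converge there) and grows only like $\sqrt{-s/2}$ at $-\infty$. The first identity shows $F_2$ is strictly increasing; the second shows $\log F_2$ is concave. The asymmetry of the law comes from the different behaviour of $q$ at the two ends.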

Proposition (left-tail rate of $F_2$). As $s\to-\infty$, $\log F_2(s) = -\frac{|s|^3}{12}\,(1+o(1))$.

Proof. From $(\log F_2)'' = -q^2$ and the Hastings-McLeod asymptotic $q(s)^2\sim -s/2$ as $s\to-\infty$, one has $(\log F_2)''(s)\sim s/2$. Integrating twice: $(\log F_2)'(s)\sim s^2/4$ (there is no free constant to fix, since $(\log F_2)'(s) = \int_s^\infty q(x)^2\,dx$ and its leading growth is $\int_s^0(-x/2)\,dx = s^2/4$), and integrating again gives $\log F_2(s)\sim s^3/12 = -|s|^3/12$ for $s\to-\infty$. Thus the probability that the largest eigenvalue lies far below its typical position decays like $e^{-|s|^3/12}$, a cubic large-deviation rate and far heavier suppression than a Gaussian's $e^{-s^2/2}$, reflecting that depressing the whole edge of a rigid spectrum is costly.

Connections Master

The determinantal point processes and the sine kernel 37.08.04 is the direct parent of this unit: the same Hermite kernel and the same Christoffel-Darboux reduction produce the sine kernel in the bulk and the Airy kernel at the edge, the only difference being the Plancherel-Rotach regime, oscillatory cosine versus coalescing turning point. The Fredholm-determinant gap-probability machinery developed there is reused verbatim here to express $F_2(s)$ as $\det(I - K_{\mathrm{Ai}})\big|_{L^2(s,\infty)}$, so this unit is the edge companion of that unit's bulk analysis.

The Wigner semicircle law and the moment method 37.08.01 supplies the square-root vanishing of the edge density that fixes the $n^{-1/6}$ (equivalently $n^{-2/3}$ in the $[-2,2]$ normalisation) edge scale. The edge one-point intensity $\rho(u)\sim\sqrt{|u|}/\pi$ as $u\to-\infty$ is precisely the local form of $\rho_{\mathrm{sc}}(x) = \frac{1}{2\pi}\sqrt{4-x^2}$, so the macroscopic semicircle edge and the microscopic Airy edge are the same geometry seen on two scales, and the moment method's control of the spectral radius is the global statement whose fluctuation refinement is the Tracy-Widom law.

The Gaussian ensembles and the joint eigenvalue density 37.08.03 is the source of the symmetry-class structure that produces $F_1, F_2, F_4$ from one Painlevé II transcendent. The $\beta = 2$ determinantal kernel gives the scalar Airy kernel and $F_2$, while the $\beta = 1, 4$ Pfaffian structures give the matrix Airy kernels whose Pfaffians reduce to $\sqrt{F_2}$ dressed by $\int_s^\infty q(x)\,dx$; Dyson's threefold way therefore survives intact into the edge fluctuation laws.

The Stieltjes transform and the resolvent 37.08.02 is the analytic route to the edge of the spectrum and a technical input to edge universality. The square-root behaviour of the resolvent at the branch points is the analytic signature of the soft edge, and the local law down to scale $n^{-1+\varepsilon}$ proved there is the first step that upgrades the exact-GUE Airy computation of this unit to Tracy-Widom universality for all Wigner matrices.

Historical & philosophical context Master

The edge of the random-matrix spectrum was studied numerically and heuristically through the 1980s and early 1990s, with Forrester identifying the Airy-kernel scaling at the soft edge in 1993 [Forrester 1993]. The decisive analytic result came from Craig Tracy and Harold Widom, who in 1994 expressed the gap probability of the Airy kernel as a Fredholm determinant and reduced it to the second Painlevé equation, producing the closed form $F_2(s) = \exp\!\big(-\int_s^\infty (x-s)\,q(x)^2\,dx\big)$ [Tracy 1994]. They extended the analysis to the orthogonal and symplectic ensembles two years later, obtaining $F_1$ and $F_4$ from the same transcendent [Tracy 1996]. The distinguished solution of Painlevé II with the boundary behaviour $q(s)\sim\mathrm{Ai}(s)$ had been singled out earlier, in a different context, by Stewart Hastings and John McLeod, who proved its global existence and uniqueness while studying a boundary-value problem connected to the Korteweg-de Vries equation [Hastings 1980].

The reach of the Tracy-Widom law beyond random matrices was revealed by Jinho Baik, Percy Deift and Kurt Johansson in 1999, who proved that the length of the longest increasing subsequence of a uniformly random permutation, suitably centred and scaled, converges to $F_2$, connecting the matrix edge to a purely combinatorial extreme-value problem. Through the subsequent work of Johansson, Prähofer-Spohn and others, the same law was found to govern the height fluctuations of growth models in the Kardar-Parisi-Zhang universality class, so the edge eigenvalue distribution computed here is now understood as a universal attractor for the extreme statistics of strongly correlated systems. The universality of the law within random-matrix theory, namely that the laws $F_\beta$ govern the largest eigenvalue of all Wigner matrices of the corresponding class, was established by Soshnikov and completed by the Tao-Vu four-moment theorem and the Erdős-Yau program.

Bibliography Master

@article{tracywidom1994airy,
  author  = {Tracy, Craig A. and Widom, Harold},
  title   = {Level-spacing distributions and the {Airy} kernel},
  journal = {Communications in Mathematical Physics},
  volume  = {159},
  number  = {1},
  pages   = {151--174},
  year    = {1994}
}

@article{tracywidom1996,
  author  = {Tracy, Craig A. and Widom, Harold},
  title   = {On orthogonal and symplectic matrix ensembles},
  journal = {Communications in Mathematical Physics},
  volume  = {177},
  number  = {3},
  pages   = {727--754},
  year    = {1996}
}

@article{hastingsmcleod1980,
  author  = {Hastings, S. P. and McLeod, J. B.},
  title   = {A boundary value problem associated with the second {Painlev\'e} transcendent and the {Korteweg-de Vries} equation},
  journal = {Archive for Rational Mechanics and Analysis},
  volume  = {73},
  number  = {1},
  pages   = {31--51},
  year    = {1980}
}

@article{forrester1993,
  author  = {Forrester, Peter J.},
  title   = {The spectrum edge of random matrix ensembles},
  journal = {Nuclear Physics B},
  volume  = {402},
  number  = {3},
  pages   = {709--728},
  year    = {1993}
}

@article{baikdeiftjohansson1999,
  author  = {Baik, Jinho and Deift, Percy and Johansson, Kurt},
  title   = {On the distribution of the length of the longest increasing subsequence of random permutations},
  journal = {Journal of the American Mathematical Society},
  volume  = {12},
  number  = {4},
  pages   = {1119--1178},
  year    = {1999}
}

@book{deift1999airy,
  author    = {Deift, Percy},
  title     = {Orthogonal Polynomials and Random Matrices: A Riemann-Hilbert Approach},
  series    = {Courant Lecture Notes},
  volume    = {3},
  publisher = {American Mathematical Society},
  year      = {1999}
}

@book{agz2010airy,
  author    = {Anderson, Greg W. and Guionnet, Alice and Zeitouni, Ofer},
  title     = {An Introduction to Random Matrices},
  series    = {Cambridge Studies in Advanced Mathematics},
  volume    = {118},
  publisher = {Cambridge University Press},
  year      = {2010}
}

@book{forrester2010airy,
  author    = {Forrester, Peter J.},
  title     = {Log-Gases and Random Matrices},
  series    = {London Mathematical Society Monographs},
  publisher = {Princeton University Press},
  year      = {2010}
}