The Stieltjes Transform and the Semicircle Law via the Resolvent
Anchor (Master): Anderson-Guionnet-Zeitouni, An Introduction to Random Matrices (Cambridge, 2010) §2.4; Bai-Silverstein, Spectral Analysis of Large Dimensional Random Matrices (Springer 2e, 2010) Ch. 2 and App. B; Erdős-Yau, A Dynamical Approach to Random Matrix Theory (AMS, 2017) Ch. 5-6 (local semicircle law); Erdős-Schlein-Yau, Local semicircle law and complete delocalization, Commun. Math. Phys. 287 (2009)
Intuition Beginner
There is a slick way to read off the shape of a cloud of points on a line without drawing a histogram. Put a small electric charge at each point and stand off to the side, at a position you choose anywhere in the plane. Measure the pull you feel. As you slide your viewpoint around, the pattern of pulls you record is a single smooth function, and that function encodes exactly where the charges sat. From it you can reconstruct the original cloud. This recording device is the Stieltjes transform, the analyst's preferred handle on a distribution because moving your viewpoint into the plane smooths away the spikiness of individual points.
For a matrix, the cloud of points is its list of eigenvalues, and there is a beautiful shortcut to the recording. Instead of computing all the eigenvalues and then placing charges, you form a single new matrix — the resolvent — by subtracting your chosen viewpoint from the matrix and inverting the result. A simple average of the diagonal of that one matrix gives you the whole recording at that viewpoint. So a question about the entire spectrum collapses into inverting one matrix and reading its diagonal.
The payoff for random matrices is that this recording satisfies a tidy self-referential equation: the value you read is one divided by a simple expression that again contains the value you read. Solving that equation hands back the semicircle directly, and because moving off the line tames the randomness, this route reaches all the way to the sharp edge of the spectrum where the cruder counting methods stall.
Visual Beginner
Picture the eigenvalues as dots strung along a horizontal line. Your viewpoint is a point floating above the line at some height. The Stieltjes transform at that viewpoint is a single complex number you can think of as an arrow: its size says how strong the total pull is, and its tilt says which way the cloud leans relative to you. Lower your viewpoint toward the line and the arrow swells and sharpens, because the nearest dots dominate; raise it high above and the arrow shrinks and points almost straight up, because from far away the whole cloud looks like one lump.
The second panel is the punchline: if you skim your viewpoint just barely above the line and record only the upward component of the arrow at each horizontal position, the trace you get is the smooth density of the dots. Skimming the semicircle cloud this way redraws the half-circle. The recording at a height turns into the true shape in the limit as the height drops to zero.
Worked example Beginner
We compute the recording for the simplest cloud, a single eigenvalue sitting at the point $0$, and watch the smoothing happen.
Step 1. With one charge at $0$, the Stieltjes transform at a viewpoint $z$ is just one divided by the gap between the charge and the viewpoint: the value is $1/(0 - z) = -1/z$. There is nothing to average yet, because there is only one point.
Step 2. Put the viewpoint right on the line at $z = 2$, two units to the right of the charge. The value is $1/(0 - 2) = -1/2$. The pull points back toward the charge, which is why the sign is negative.
Step 3. Now lift the viewpoint off the line to height $1$ above the point $0$, that is $z = i$, so the gap $0 - i = -i$ is purely vertical and one unit long. Dividing one by this gap turns the arrow by a quarter turn: $1/(-i) = i$, so the recording has size one and points straight up. The upward component is what we read as smoothed density, and here it is finite even though the charge sits directly below the viewpoint: lifting off the line has spread the single spike into a gentle bump.
Step 4. Drop the height to one tenth, $z = i/10$. The upward arrow grows tenfold, to $10i$, so the smoothed bump at horizontal position $0$ becomes ten times taller and ten times narrower. Its total area stays fixed at one charge.
Step 5. What this tells us: a viewpoint exactly on the line sees a charge as an infinitely sharp spike, but any positive height replaces the spike with a finite smooth bump whose height grows and whose width shrinks as the height drops. Averaging many such bumps, one per eigenvalue, and then letting the height fall to zero is exactly how the recording reconstructs a smooth eigenvalue density like the semicircle.
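The arithmetic of the worked example can be replayed in a few lines. This is a minimal sketch, assuming the single charge sits at $0$ and using this unit's kernel $1/(x - z)$; the function name `stieltjes_of_points` is a hypothetical helper, not part of any library.

```python
def stieltjes_of_points(points, z):
    """Average of 1/(x - z) over the points: the Stieltjes transform of
    the empirical measure, in this unit's sign convention."""
    return sum(1.0 / (x - z) for x in points) / len(points)

charge = [0.0]  # assumed location of the single eigenvalue

on_axis = stieltjes_of_points(charge, 2.0)        # viewpoint on the line at z = 2
height_one = stieltjes_of_points(charge, 1j)      # viewpoint lifted to height 1
height_tenth = stieltjes_of_points(charge, 0.1j)  # viewpoint lowered to height 0.1

for label, value in [("z = 2", on_axis),
                     ("z = i", height_one),
                     ("z = 0.1i", height_tenth)]:
    print(label, value)
```

Passing a longer list of points averages one bump per point, which is the sense in which the recording smooths a whole cloud.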
Check your understanding Beginner
Formal definition Intermediate+
Let $\mu$ be a Borel probability measure on $\mathbb{R}$. Its Stieltjes transform (also called the Cauchy transform) is the function $$ s_\mu(z) = \int_{\mathbb{R}} \frac{1}{x - z}\, d\mu(x), \qquad z \in \mathbb{C}^+ := \{z \in \mathbb{C} : \operatorname{Im} z > 0\}. $$ The integrand is bounded by $1/\operatorname{Im} z$, so the integral converges and $s_\mu$ is holomorphic on $\mathbb{C}^+$ (and on $\mathbb{C}^-$ by the same estimate). It is a Herglotz (Nevanlinna) function: $\operatorname{Im} s_\mu(z) > 0$ for $z \in \mathbb{C}^+$, and $iy\, s_\mu(iy) \to -1$ as $y \to \infty$, which encodes total mass one. Conversely every Herglotz function with this normalisation is the Stieltjes transform of a probability measure.
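Stieltjes inversion. The measure is recovered from the boundary behaviour: for continuity points $a < b$ of $\mu$, $$ \mu\big((a,b)\big) = \frac{1}{\pi}\lim_{\varepsilon \downarrow 0} \int_a^b \operatorname{Im} s_\mu(x + i\varepsilon)\, dx, $$ because $\frac{1}{\pi}\operatorname{Im} s_\mu(x + i\varepsilon)$ is the Poisson/Cauchy kernel smoothing of $\mu$, an approximate identity as $\varepsilon \downarrow 0$. When $\mu$ has a density $\rho$ continuous at $x$, this gives the pointwise formula $\rho(x) = \frac{1}{\pi}\lim_{\varepsilon \downarrow 0}\operatorname{Im} s_\mu(x + i\varepsilon)$. Thus $s_\mu$ determines $\mu$.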
The resolvent. For an $n \times n$ Hermitian matrix $M$ and $z \in \mathbb{C} \setminus \mathbb{R}$ the matrix $M - zI$ is invertible (its eigenvalues are $\lambda_i - z$ with nonzero imaginary part), and the resolvent is $$ G(z) = (M - z I)^{-1}. $$ In the eigenbasis $G(z) = \sum_{i} (\lambda_i - z)^{-1} P_i$ with $P_i$ the spectral projections, so its normalised trace is the Stieltjes transform of the empirical spectral distribution $\mu_M$ [from 37.08.01]: $$ s_n(z) := \frac{1}{n}\operatorname{tr} G(z) = \frac{1}{n}\sum_{i=1}^n \frac{1}{\lambda_i - z} = \int \frac{d\mu_M(x)}{x - z} = s_{\mu_M}(z). $$ The diagonal entries $G_{ii}(z)$ are the local resolvent entries whose sum is $\operatorname{tr} G(z) = n\, s_n(z)$; the resolvent route works by finding a closed equation for an individual $G_{ii}$ and then averaging.
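The identity between the normalised resolvent trace and the eigenvalue average can be checked by hand on the smallest nontrivial case. The sketch below assumes a hypothetical $2 \times 2$ Hermitian matrix (the entries `a`, `b`, `d` and the viewpoint `z` are illustrative choices), inverts $M - zI$ with the explicit $2 \times 2$ formula, and compares against the eigenvalues obtained from the quadratic formula.

```python
# Hypothetical 2x2 Hermitian matrix [[a, b], [conj(b), d]], chosen for illustration.
a, d = 1.0, -0.5
b = 0.3 + 0.4j
z = 0.2 + 0.7j  # any viewpoint with Im z > 0

# Resolvent diagonal from the explicit inverse of M - zI.
det = (a - z) * (d - z) - abs(b) ** 2
g11 = (d - z) / det
g22 = (a - z) / det
trace_route = (g11 + g22) / 2  # (1/n) tr G(z) with n = 2

# Eigenvalues of a 2x2 Hermitian matrix from the quadratic formula.
mean = (a + d) / 2
radius = ((a - d) ** 2 / 4 + abs(b) ** 2) ** 0.5
eigenvalues = [mean - radius, mean + radius]
eigen_route = sum(1 / (lam - z) for lam in eigenvalues) / 2

print(trace_route, eigen_route)
```

The two printed values should agree to rounding error, and both should have positive imaginary part, as the Herglotz property requires.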
Counterexamples to common slips Intermediate+
- The Stieltjes transform is taken off the real axis. The defining integral with real $z$ can diverge (it does whenever $\mu(\{z\}) > 0$). The transform lives on $\mathbb{C}^+$; values on $\mathbb{R}$ are obtained only as boundary limits, and it is precisely the imaginary part of those limits that returns the density.
- Pointwise convergence of $s_{\mu_n}$ must be tested off the axis. Stieltjes continuity says $s_{\mu_n}(z) \to s_\mu(z)$ for each fixed $z \in \mathbb{C}^+$ is equivalent to $\mu_n \to \mu$ weakly. Convergence only at real arguments carries no such force, since the real-axis values need not exist.
- The self-consistent equation has two roots; only one is a Stieltjes transform. The fixed-point equation $s = 1/(-z - s)$ is quadratic in $s$, so it has two solutions. The correct branch is fixed by the Herglotz constraints $\operatorname{Im} s > 0$ on $\mathbb{C}^+$ and $s(z) \sim -1/z$ at infinity; choosing the wrong root gives a non-Herglotz function with no measure behind it.
- Sign conventions differ across the literature. Many references write the kernel as $1/(z - x)$ rather than $1/(x - z)$, flipping the sign of $s$ and of the self-consistent equation. This unit fixes $s_\mu(z) = \int (x - z)^{-1}\, d\mu(x)$, so that $\operatorname{Im} s_\mu > 0$ on $\mathbb{C}^+$ and $s_\mu(z) \sim -1/z$; every formula below is stated in this convention.
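The two-root slip is easy to see numerically. The following sketch, with hypothetical helper names `both_roots` and `herglotz_root`, computes both solutions of $s^2 + zs + 1 = 0$ at a sample point of $\mathbb{C}^+$ and keeps the one with positive imaginary part. Since the roots multiply to $1$, exactly one of them lies in $\mathbb{C}^+$ whenever neither is real.

```python
import cmath

def both_roots(z):
    """The two solutions of s**2 + z*s + 1 = 0."""
    r = cmath.sqrt(z * z - 4)
    return (-z + r) / 2, (-z - r) / 2

def herglotz_root(z):
    """The root with positive imaginary part, intended for Im z > 0."""
    return max(both_roots(z), key=lambda s: s.imag)

z = 0.5 + 0.25j  # sample viewpoint, an illustrative choice
s_good = herglotz_root(z)
s_bad = min(both_roots(z), key=lambda s: s.imag)

print("Herglotz root:", s_good)
print("rejected root:", s_bad)
print("far-field check:", herglotz_root(1000j), "vs -1/z =", -1 / 1000j)
```

Selecting by imaginary part sidesteps the branch cut of `cmath.sqrt`, which does not coincide with the branch that makes $\sqrt{z^2 - 4} \sim z$ at infinity.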
Key theorem with proof Intermediate+
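Theorem (semicircle law via the self-consistent resolvent equation). Let $M_n$ be normalised Wigner matrices as in 37.08.01, with i.i.d. mean-zero, unit-variance off-diagonal entries and finite fourth moment. Fix $z \in \mathbb{C}^+$ and let $s_n(z) = \frac{1}{n}\operatorname{tr}(M_n - z)^{-1}$. Then $s_n(z) \to s_{\mathrm{sc}}(z)$ in probability, where $s_{\mathrm{sc}}(z)$ is the unique root of
$$
s_{\mathrm{sc}}(z) = \frac{1}{-z - s_{\mathrm{sc}}(z)}, \qquad\text{equivalently}\qquad s_{\mathrm{sc}}(z)^2 + z\, s_{\mathrm{sc}}(z) + 1 = 0,
$$
lying in $\mathbb{C}^+$ with $s_{\mathrm{sc}}(z) \sim -1/z$ as $|z| \to \infty$. Consequently $\mu_{M_n} \to \mu_{\mathrm{sc}}$ weakly in probability, where $\mu_{\mathrm{sc}}$ has density $\rho_{\mathrm{sc}}(x) = \frac{1}{2\pi}\sqrt{4 - x^2}\,\mathbf{1}_{|x| \le 2}$.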
Proof. The engine is the Schur complement applied to a single diagonal resolvent entry. Write $M = M_n$ with entries $m_{jk}$ and $G(z) = (M - z)^{-1}$. Fix an index $i$ and split $M$ into its $i$-th row/column and the $(n-1) \times (n-1)$ minor $M^{(i)}$ obtained by deleting row and column $i$. Let $\mathbf{a}_i$ be the $i$-th column of $M$ with the $i$-th entry removed, and let $m_{ii}$ be the $i$-th diagonal entry. The Schur complement formula for the $(i,i)$ entry of the inverse gives $$ G_{ii}(z) = \frac{1}{m_{ii} - z - \mathbf{a}_i^{*}(M^{(i)} - z)^{-1}\mathbf{a}_i}. $$ This is the cavity identity: removing site $i$ leaves the cavity minor $M^{(i)}$, and the quadratic form $\mathbf{a}_i^{*}(M^{(i)} - z)^{-1}\mathbf{a}_i$ couples site $i$ back to it.
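The cavity identity is a deterministic statement and can be sanity-checked on a small matrix. The sketch below assumes a hypothetical $3 \times 3$ Hermitian matrix (all entries are illustrative), computes $G_{11}$ through the Schur complement with the $2 \times 2$ minor, and compares it with Cramer's rule, $G_{11} = \det(M^{(1)} - z)/\det(M - z)$.

```python
# Hypothetical 3x3 Hermitian matrix, chosen only for illustration.
m11, m22, m33 = 0.4, -1.1, 0.7
u = 0.6 - 0.2j    # entry M[2][1]; M[1][2] is its conjugate
v = -0.3 + 0.5j   # entry M[3][1]; M[1][3] is its conjugate
c = 0.2 + 0.1j    # entry M[2][3]; M[3][2] is its conjugate
z = 0.3 + 0.5j    # viewpoint with Im z > 0

# Cavity route: resolvent R of the 2x2 minor (rows/columns 2 and 3).
p, q = m22 - z, m33 - z
det_minor = p * q - abs(c) ** 2
r11, r12 = q / det_minor, -c / det_minor
r21, r22 = -c.conjugate() / det_minor, p / det_minor

# Quadratic form a* R a with a = (u, v), the first column minus its top entry.
quad = (u.conjugate() * (r11 * u + r12 * v)
        + v.conjugate() * (r21 * u + r22 * v))
g11_cavity = 1 / (m11 - z - quad)

# Direct route: Cramer's rule, with det(M - z) expanded along the first row.
det_full = ((m11 - z) * det_minor
            - u.conjugate() * (u * q - c * v)
            + v.conjugate() * (u * c.conjugate() - p * v))
g11_direct = det_minor / det_full

print(abs(g11_cavity - g11_direct))
```

The printed discrepancy should be at the level of floating-point rounding; no randomness or large-$n$ limit is involved at this step.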
Analyse the denominator. First, $m_{ii} \to 0$ as $n \to \infty$ since the diagonal entry has bounded variance and carries a $n^{-1/2}$ factor. Second, the vector $\mathbf{a}_i$ has entries $m_{ji}$ for $j \ne i$, independent of the minor $M^{(i)}$, with mean zero and variance $1/n$. Conditioning on $M^{(i)}$ and writing $R = (M^{(i)} - z)^{-1}$, the quadratic form concentrates on its conditional mean: $$ \mathbb{E}\big[\mathbf{a}_i^{*} R\, \mathbf{a}_i \,\big|\, M^{(i)}\big] = \sum_{j \ne i} \mathbb{E}|(\mathbf{a}_i)_j|^2\, R_{jj} = \frac{1}{n}\operatorname{tr} R = \frac{1}{n}\operatorname{tr}(M^{(i)} - z)^{-1}. $$ The off-diagonal contributions vanish in expectation because distinct entries of $\mathbf{a}_i$ are independent and centred, and the conditional variance of the quadratic form is $O(1/n)$: it is a sum of terms each of size $n^{-2}$ times $|R_{jk}|^2$, and $\sum_{j,k}|R_{jk}|^2 = \operatorname{tr}(RR^{*}) \le n/(\operatorname{Im} z)^2$, with the fourth-moment hypothesis controlling the diagonal part. Hence $\mathbf{a}_i^{*} R\, \mathbf{a}_i = \tfrac{1}{n}\operatorname{tr} R + o_{\mathbb{P}}(1)$.
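Now compare the cavity trace to the full trace. Deleting one row and column perturbs the trace of the resolvent by $O(1/\operatorname{Im} z)$: the interlacing of eigenvalues of $M^{(i)}$ between those of $M$ forces $\big|\operatorname{tr} G(z) - \operatorname{tr}(M^{(i)} - z)^{-1}\big| \le C/\operatorname{Im} z$, a rank-one resolvent bound. Hence $\tfrac{1}{n}\operatorname{tr} R = s_n(z) + O\big(1/(n \operatorname{Im} z)\big)$, and the denominator of $G_{ii}$ is $-z - s_n(z) + o_{\mathbb{P}}(1)$, uniformly in $i$. Therefore every diagonal entry satisfies $$ G_{ii}(z) = \frac{1}{-z - s_n(z)} + o_{\mathbb P}(1), $$ and averaging over $i$, since $s_n(z) = \tfrac{1}{n}\sum_i G_{ii}(z)$, yields the approximate self-consistent equation $$ s_n(z) = \frac{1}{-z - s_n(z)} + o_{\mathbb P}(1). $$ Let $s_{\mathrm{sc}}(z)$ be the genuine root of $s = 1/(-z - s)$ in $\mathbb{C}^+$. The map $s \mapsto 1/(-z - s)$ is a strict contraction on the relevant region (its derivative at the fixed point is $1/(z + s_{\mathrm{sc}})^2 = s_{\mathrm{sc}}(z)^2$, and $|s_{\mathrm{sc}}(z)| < 1$ for $z$ away from $[-2,2]$, with a stability argument covering the rest), so the approximate fixed-point relation forces $s_n(z) \to s_{\mathrm{sc}}(z)$ in probability for each fixed $z \in \mathbb{C}^+$.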
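The contraction claim can be illustrated by simply iterating the map. This is a sketch under stated assumptions: the starting point `1j`, the step count, and the sample `z` are illustrative choices, and the function name `iterate_self_consistent` is hypothetical. For $z \in \mathbb{C}^+$ the map sends $\mathbb{C}^+$ into itself, so the iterates never leave the half-plane.

```python
def iterate_self_consistent(z, steps=200, start=1j):
    """Iterate s -> 1/(-z - s); `start` and `steps` are illustrative choices."""
    s = start
    for _ in range(steps):
        s = 1 / (-z - s)
    return s

z = 0.5 + 1.5j  # a viewpoint well above the real axis
s = iterate_self_consistent(z)
residual = abs(s - 1 / (-z - s))

print("fixed point:", s)
print("residual:", residual)
print("contraction factor |s|^2:", abs(s) ** 2)
```

Convergence is geometric with rate about $|s_{\mathrm{sc}}(z)|^2$, so it slows as $z$ approaches $[-2,2]$, where $|s_{\mathrm{sc}}|$ tends to $1$; that is the regime the stability argument has to cover.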
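Solving the quadratic gives $s_{\mathrm{sc}}(z) = \frac{-z + \sqrt{z^2 - 4}}{2}$ with the branch of the square root making $\sqrt{z^2 - 4} \sim z$ at infinity, hence $\operatorname{Im} s_{\mathrm{sc}} > 0$ on $\mathbb{C}^+$. Stieltjes inversion recovers the density: for $|x| < 2$, $\sqrt{(x + i\varepsilon)^2 - 4} \to i\sqrt{4 - x^2}$ as $\varepsilon \downarrow 0$, so $\frac{1}{\pi}\operatorname{Im} s_{\mathrm{sc}}(x + i0) = \frac{1}{2\pi}\sqrt{4 - x^2}$. Pointwise convergence on $\mathbb{C}^+$ is, by the Stieltjes continuity theorem (proved below), equivalent to $\mu_{M_n} \to \mu_{\mathrm{sc}}$ weakly in probability.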
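The inversion step can be checked numerically by skimming just above the axis. The sketch below uses hypothetical helper names and an illustrative height `eps = 1e-6`; it evaluates $\frac{1}{\pi}\operatorname{Im} s_{\mathrm{sc}}(x + i\varepsilon)$ and compares it with $\frac{1}{2\pi}\sqrt{4 - x^2}$ at a few sample points.

```python
import cmath
import math

def semicircle_transform(z):
    """Root of s**2 + z*s + 1 = 0 with positive imaginary part (Im z > 0)."""
    r = cmath.sqrt(z * z - 4)
    return max((-z + r) / 2, (-z - r) / 2, key=lambda s: s.imag)

def smoothed_density(x, eps):
    """(1/pi) Im s_sc(x + i*eps): the density seen from height eps."""
    return semicircle_transform(complex(x, eps)).imag / math.pi

def semicircle_density(x):
    """(1/(2 pi)) sqrt(4 - x^2) on [-2, 2], zero outside."""
    return math.sqrt(max(4 - x * x, 0.0)) / (2 * math.pi)

eps = 1e-6  # illustrative small height
for x in (-1.5, 0.0, 1.0, 3.0):
    print(x, smoothed_density(x, eps), semicircle_density(x))
```

Inside $(-2, 2)$ the two columns should agree up to an error of order `eps`; at $x = 3$, outside the support, both are essentially zero.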
Bridge. This derivation builds toward the local semicircle law and the entire resolvent-based universality program, and it appears again in the Marchenko-Pastur law for sample-covariance matrices, where the same Schur-complement cavity step produces a different fixed-point equation. The foundational reason the self-consistent equation closes is that removing one row barely moves the spectrum, so a single diagonal resolvent entry sees the rest of the matrix only through the averaged trace it is trying to compute; this is exactly the cavity self-consistency that makes $s_n$ satisfy $s = 1/(-z - s)$ in the limit. The resolvent route is dual to the moment method of 37.08.01: there one expands in powers and matches Catalan numbers, here one inverts and matches a fixed point, and the Catalan generating function reappearing as $s_{\mathrm{sc}}$ is the central insight that the two routes compute one analytic object. Putting these together, the branch point of $s_{\mathrm{sc}}$ at $z = \pm 2$ is the spectral edge that the moment method could only reach indirectly, and this is exactly why the resolvent method, refined to imaginary parts of order $1/n$, controls the edge and the local eigenvalue statistics.
Exercises Intermediate+
Advanced results Master
The convergence $s_n(z) \to s_{\mathrm{sc}}(z)$ holds almost surely and, more importantly, uniformly down to the optimal scale. The local semicircle law of Erdős-Schlein-Yau [Erdős 2009] upgrades the fixed-$z$ statement to control of $s_n(z)$ for $z = E + i\eta$ with imaginary part as small as $\eta \gg 1/n$ (up to logarithmic or $n^{\varepsilon}$ factors): there exist high-probability bounds of the form $|s_n(z) - s_{\mathrm{sc}}(z)| \le n^{\varepsilon}/(n\eta)$ in the bulk, together with the entrywise law $|G_{ij}(z) - \delta_{ij}\, s_{\mathrm{sc}}(z)| \le n^{\varepsilon}/\sqrt{n\eta}$. The entrywise control yields complete delocalisation of eigenvectors (no eigenvector concentrates on $o(n)$ coordinates), and the control of $s_n$ on scale $\eta$ counts eigenvalues in windows of width $\eta$, hence in windows holding only about $n\eta$ eigenvalues. This is the technical content the moment method cannot reach: moments see only global averages, while the resolvent at small $\eta$ resolves the spectrum locally.
The stability of the self-consistent equation is what makes the local law possible. Writing the equation as $F(s, z) := s^2 + zs + 1 = 0$, the derivative $\partial_s F = 2s + z = \sqrt{z^2 - 4}$ at $s = s_{\mathrm{sc}}(z)$ is nonzero except at the edges $z = \pm 2$, where it degenerates like $\sqrt{|z \mp 2|}$. The bulk stability propagates an additive error $\delta$ in the equation into an error $O(\delta)$ in $s$; near the edge the square-root degeneracy is exactly the source of the Tracy-Widom edge scaling $n^{-2/3}$, the same edge exponent the moment method located through the vanishing of the semicircle density at $\pm 2$ in 37.08.01. The resolvent method makes the edge quantitative because it sees the degeneracy of $\partial_s F$ directly.
The method extends far beyond Wigner. For deformed models $M + D$ with $D$ deterministic diagonal, the cavity step yields a self-consistent equation $s(z) = \int \frac{d\mu_D(x)}{x - z - s(z)}$ (a vector or functional fixed point) whose solution is the Stieltjes transform of the free convolution of the semicircle with the empirical law $\mu_D$ of $D$, the analytic incarnation of free additive convolution. For sample covariance matrices the same step produces the Marchenko-Pastur equation; for band and sparse matrices it produces matrix Dyson equations $-G^{-1} = z + \mathcal{S}[G]$ where $\mathcal{S}$ is a self-energy operator encoding the variance profile. In every case the architecture is identical: a Schur complement isolates one site, concentration replaces a random quadratic form by a deterministic trace, and the resulting fixed-point equation is solved within the Herglotz class.
The free-probability reading completes the circle. The semicircle is the distribution of a free-semicircular element, and up to the sign convention the Stieltjes transform is its Cauchy transform, $w = -s_{\mathrm{sc}}(z)$; the self-consistent equation is the statement that the R-transform of the semicircle is the identity map $R(w) = w$, since $s = 1/(-z - s)$ rearranges to $z = w + 1/w = R(w) + 1/w$. Free additive convolution linearises under the R-transform exactly as classical convolution linearises under the logarithm of the characteristic function 37.03.01, and the deformed-model equations above are the R-transform addition law made matrix-valued. The resolvent is thus the bridge between the spectral analysis of one large matrix and the algebraic structure of free independence.
Synthesis. The foundational reason a single fixed-point equation organises this entire method is that the spectrum is stable under deleting one row, so a diagonal resolvent entry can only depend on the rest of the matrix through the averaged trace it helps define, and this is exactly the cavity self-consistency that closes into $s = 1/(-z - s)$. Putting these together, the resolvent route is dual to the moment method of 37.08.01: the branch point of the Catalan generating function, the square-root branch point of $s_{\mathrm{sc}}$, and the vanishing of $\partial_s(s^2 + zs + 1) = 2s + z$ at the edge are one degeneracy seen three ways, and this is the central insight that lets the same equation, refined to imaginary parts of order $1/n$, deliver the local law, eigenvector delocalisation, and the Tracy-Widom edge that global moments cannot reach. The deformed, covariance, and band equations show that the architecture generalises to any variance profile, and the R-transform identity $R(w) = w$ shows it is dual to free probability: free additive convolution is to the R-transform what classical convolution is to the cumulant expansion. The bridge to the frontier is that the matrix Dyson equation is the universal form of which $s = 1/(-z - s)$ is the constant-variance scalar shadow, and its stability theory is exactly what the universality program makes quantitative.
Full proof set Master
The Schur-complement derivation of the self-consistent equation, the convergence $s_n \to s_{\mathrm{sc}}$, and the inversion to $\rho_{\mathrm{sc}}$ are proved in full above. The remaining Master claims are recorded here.
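Proposition (Stieltjes inversion). Let $\mu$ be a Borel probability measure on $\mathbb{R}$ with Stieltjes transform $s_\mu$. For continuity points $a < b$ of $\mu$, $\mu\big((a,b)\big) = \lim_{\varepsilon \downarrow 0} \frac{1}{\pi}\int_a^b \operatorname{Im} s_\mu(x + i\varepsilon)\, dx$.
Proof. Compute $\operatorname{Im}\frac{1}{t - x - i\varepsilon} = \frac{\varepsilon}{(t - x)^2 + \varepsilon^2}$. Thus $\frac{1}{\pi}\operatorname{Im} s_\mu(x + i\varepsilon) = \int_{\mathbb{R}} P_\varepsilon(x - t)\, d\mu(t)$ where $P_\varepsilon(u) = \frac{1}{\pi}\frac{\varepsilon}{u^2 + \varepsilon^2}$ is the Cauchy (Poisson-for-the-half-plane) kernel, a probability density with mass concentrating at $0$ as $\varepsilon \downarrow 0$. Integrating in $x$ over $(a,b)$ and applying Fubini, $$ \frac1\pi\int_a^b \operatorname{Im} s_\mu(x + i\varepsilon)\, dx = \int_{\mathbb{R}}\Big(\int_a^b P_\varepsilon(x - t)\, dx\Big)\, d\mu(t) = \int_{\mathbb{R}} \Phi_\varepsilon(t)\, d\mu(t), $$ where $\Phi_\varepsilon(t) = \frac{1}{\pi}\big[\arctan\frac{b - t}{\varepsilon} - \arctan\frac{a - t}{\varepsilon}\big]$. As $\varepsilon \downarrow 0$, $\Phi_\varepsilon \to \mathbf{1}_{(a,b)} + \tfrac12\mathbf{1}_{\{a,b\}}$ pointwise, and $0 \le \Phi_\varepsilon \le 1$, so bounded convergence gives the limit $\mu\big((a,b)\big) + \tfrac12\mu(\{a,b\})$. At continuity points the boundary term vanishes.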
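The three pointwise limits of the integrated Cauchy kernel (one inside the interval, one half at an endpoint, zero outside) can be seen directly from the arctangent formula. A minimal sketch, with a hypothetical function name `phi` and illustrative values for `a`, `b`, and `eps`:

```python
import math

def phi(t, a, b, eps):
    """Integral of the Cauchy kernel P_eps(x - t) over x in (a, b)."""
    return (math.atan((b - t) / eps) - math.atan((a - t) / eps)) / math.pi

a, b, eps = -1.0, 1.0, 1e-8  # illustrative interval and small height

inside = phi(0.0, a, b, eps)    # t strictly inside (a, b)
endpoint = phi(1.0, a, b, eps)  # t at the endpoint b
outside = phi(2.0, a, b, eps)   # t outside [a, b]

print(inside, endpoint, outside)
```

The endpoint value near one half is the boundary term $\tfrac12\mu(\{a,b\})$ in the proof, which is why the statement is restricted to continuity points.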
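Proposition (Herglotz representation and characterisation). A holomorphic $s : \mathbb{C}^+ \to \mathbb{C}$ is the Stieltjes transform of a probability measure on $\mathbb{R}$ if and only if $\operatorname{Im} s > 0$ on $\mathbb{C}^+$ and $\lim_{y \to \infty} iy\, s(iy) = -1$.
Proof. Necessity is the computation $\operatorname{Im} s_\mu(z) = \int \frac{\operatorname{Im} z}{|x - z|^2}\, d\mu(x) > 0$ together with $iy\, s_\mu(iy) = \int \frac{iy}{x - iy}\, d\mu(x) \to -1$ by dominated convergence. For sufficiency, the Nevanlinna representation of a function mapping $\mathbb{C}^+$ to its closure gives $s(z) = \alpha + \beta z + \int_{\mathbb{R}} \Big(\frac{1}{x - z} - \frac{x}{1 + x^2}\Big)\, d\nu(x)$ for a real $\alpha$, $\beta \ge 0$, and a positive measure $\nu$ with $\int \frac{d\nu(x)}{1 + x^2} < \infty$. The normalisation $iy\, s(iy) \to -1$ forces $\beta = 0$, $\nu(\mathbb{R}) = 1$, and $\alpha = \int \frac{x}{1 + x^2}\, d\nu(x)$, collapsing the formula to $s(z) = \int \frac{d\nu(x)}{x - z}$ with $\nu$ a probability measure.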
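Proposition (the self-consistent root is the semicircle transform). The unique root of $s^2 + zs + 1 = 0$ that is Herglotz on $\mathbb{C}^+$ with $s \sim -1/z$ at infinity is $s_{\mathrm{sc}}(z) = \frac{-z + \sqrt{z^2 - 4}}{2}$, and its inversion yields $\rho_{\mathrm{sc}}(x) = \frac{1}{2\pi}\sqrt{4 - x^2}\,\mathbf{1}_{|x| \le 2}$.
Proof. The quadratic has roots $s_\pm(z) = \frac{-z \pm \sqrt{z^2 - 4}}{2}$, where $\sqrt{z^2 - 4}$ is the branch on $\mathbb{C} \setminus [-2,2]$ with $\sqrt{z^2 - 4} \sim z$ as $|z| \to \infty$. Then $s_+(z) = -1/z + O(|z|^{-3})$, matching the required asymptotic, whereas $s_-(z) \sim -z$ is not bounded and not a Stieltjes transform. Hence $s_{\mathrm{sc}} = s_+$. For $\operatorname{Im} s_+ > 0$: $s_+$ takes no real value on $\mathbb{C}^+$, since a real root $s$ (necessarily nonzero) would force $z = -(s^2 + 1)/s \in \mathbb{R}$. So $\operatorname{Im} s_+$ is continuous and nonvanishing on the connected set $\mathbb{C}^+$, and it is positive at $z = iy$ for large $y$ (Exercise 2 pattern), hence $\operatorname{Im} s_+ > 0$ throughout. Taking $z = x + i\varepsilon$ with $|x| < 2$, $\sqrt{z^2 - 4} \to i\sqrt{4 - x^2}$, so $\operatorname{Im} s_{\mathrm{sc}}(x + i0) = \tfrac12\sqrt{4 - x^2}$, giving $\frac{1}{\pi}\operatorname{Im} s_{\mathrm{sc}}(x + i0) = \frac{1}{2\pi}\sqrt{4 - x^2}$; by the inversion proposition $\rho_{\mathrm{sc}}(x) = \frac{1}{2\pi}\sqrt{4 - x^2}$, supported on $[-2,2]$ where the square root $\sqrt{4 - x^2}$ is real.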
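Proposition (resolvent stability of the trace under rank-one deletion). For Hermitian $M$ and its principal minor $M^{(i)}$, and any $z = E + i\eta \in \mathbb{C}^+$, $\big|\tfrac{1}{n}\operatorname{tr}(M - z)^{-1} - \tfrac{1}{n}\operatorname{tr}(M^{(i)} - z)^{-1}\big| \le \frac{C}{n\eta}$ for an absolute constant $C$.
Proof. Embed $M^{(i)}$ as the $n \times n$ matrix $\widetilde{M}$ equal to $M$ with row and column $i$ zeroed out except for a chosen real diagonal value $c$; then $\operatorname{tr}(\widetilde{M} - z)^{-1} = \operatorname{tr}(M^{(i)} - z)^{-1} + \frac{1}{c - z}$. The matrices $M$ and $\widetilde{M}$ differ by a Hermitian perturbation of rank at most two (the $i$-th row and column), and for a rank-$r$ Hermitian perturbation the eigenvalue-counting functions interlace with displacement at most $r$, so $|N_M(x) - N_{\widetilde{M}}(x)| \le 2$ for all $x$. Writing the difference of traces as $\int (x - z)^{-1}\, d(N_M - N_{\widetilde{M}})(x)$ and integrating by parts, the rank bound together with $\int_{\mathbb{R}} |x - z|^{-2}\, dx = \pi/\eta$ bounds it by $2\pi/\eta$; the single isolated term $\frac{1}{c - z}$ is bounded by $1/\eta$. Combining and dividing by $n$ yields the stated bound with $C = 2\pi + 1$.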
Connections Master
The Wigner semicircle law and the moment method 37.08.01 is the dual route to the same theorem and the direct prerequisite. That unit computes the moments $\int x^{2k}\, d\mu_{\mathrm{sc}}(x) = C_k$ by counting non-crossing pair partitions; this unit recovers the identical limiting law by inverting one matrix and solving $s = 1/(-z - s)$, and the bridge between the two is the identity $s_{\mathrm{sc}}(z) = -\sum_{k \ge 0} C_k\, z^{-(2k+1)}$ for $|z| > 2$, tying the Catalan generating function to the resolvent. The resolvent route is the one that survives to the spectral edge and the local scale where the moment method stalls.
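The link between the Catalan moments and the fixed-point equation can be tested numerically. The sketch below assumes the expansion $-\sum_{k \ge 0} C_k z^{-(2k+1)}$ in this unit's sign convention, which converges for $|z| > 2$; the truncation at 80 terms and the sample point are illustrative, and the function names are hypothetical.

```python
from math import comb

def catalan(k):
    """k-th Catalan number, C_k = binom(2k, k) / (k + 1)."""
    return comb(2 * k, k) // (k + 1)

def catalan_series(z, terms=80):
    """Partial sum of -sum_k C_k / z**(2k+1); assumes |z| > 2."""
    return -sum(catalan(k) / z ** (2 * k + 1) for k in range(terms))

z = 3.0 + 1.0j  # sample point outside the disc of radius 2
s = catalan_series(z)
residual = abs(s * s + z * s + 1)

print("series value:", s)
print("residual of s^2 + z s + 1:", residual)
```

A residual at rounding level indicates that the moment series and the Herglotz root of the quadratic are the same function on the region where the series converges.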
The characteristic functions and Lévy continuity theorem 37.03.01 are the exact classical analogue of the Stieltjes continuity theorem proved here. There, weak convergence of measures is equivalent to pointwise convergence of characteristic functions with continuity at the origin; here it is equivalent to pointwise convergence of Stieltjes transforms on $\mathbb{C}^+$, and the tightness-plus-uniqueness skeleton of the proof is shared verbatim, with the Cauchy kernel replacing the Fourier kernel and the test point moved off the real axis.
The QFT large-$N$ matrix model and topological expansion 08.14.06 meets this unit through the resolvent: the large-$N$ saddle-point equation for the planar free energy of a one-matrix model is exactly a self-consistent equation for the resolvent, whose solution determines the spectral density. The loop equations (the Schwinger-Dyson hierarchy) there are the field-theoretic form of the cavity self-consistency derived here, and the planar limit reproduces the same semicircle for the Gaussian potential.
The Itô integral and Itô's formula 02.15.02 connects through Dyson Brownian motion: differentiating the resolvent along the eigenvalue flow and applying Itô's formula produces a stochastic advection equation for $s_t(z)$ whose deterministic limit is the complex Burgers equation $\partial_t s_t = s_t\, \partial_z s_t$ (in this unit's sign convention), the dynamical companion of the static self-consistent equation; the stochastic calculus of that unit is what makes the resolvent flow and the local-law analysis rigorous.
Historical & philosophical context Master
The transform originates with Thomas Stieltjes, whose 1894 memoir on continued fractions [Stieltjes 1894] introduced both the Stieltjes integral and the use of the Cauchy-type transform to study the moment problem, reading the measure off the analytic continuation of a continued-fraction expansion of the transform. The same object had appeared as the Cauchy transform in complex analysis and as the Borel transform of a moment sequence; its identification as a Herglotz/Nevanlinna function placed it inside the representation theory of functions mapping the upper half-plane to itself, developed by Nevanlinna and Pick in the 1910s-1920s.
The resolvent route to limiting spectral distributions is due to Vladimir Marchenko and Leonid Pastur, whose 1967 paper [Marchenko 1967] introduced the self-consistent-equation method for the eigenvalue distribution of sample-covariance and related random matrices, deriving the Marchenko-Pastur law and, as a special case of the technique, the semicircle. The modern resolvent program — pushing the self-consistent equation down to the optimal scale to obtain the local semicircle law, eigenvector delocalisation, and ultimately bulk and edge universality — was carried out by László Erdős, Benjamin Schlein, Horng-Tzer Yau and collaborators beginning with their 2009 local semicircle law [Erdős 2009], and runs parallel to the Marchenko-Pastur-style fixed-point analysis systematised in the monograph of Bai and Silverstein. The free-probability interpretation, in which the self-consistent equation is the R-transform addition law, was supplied by Voiculescu's free probability of the 1980s-1990s, identifying the semicircle as the free analogue of the Gaussian.
Bibliography Master
@article{stieltjes1894,
author = {Stieltjes, Thomas Jan},
title = {Recherches sur les fractions continues},
journal = {Annales de la Facult\'e des sciences de Toulouse},
volume = {8},
pages = {J1--J122},
year = {1894}
}
@article{marchenkopastur1967,
author = {Marchenko, Vladimir A. and Pastur, Leonid A.},
title = {Distribution of eigenvalues for some sets of random matrices},
journal = {Matematicheskii Sbornik},
volume = {72(114)},
number = {4},
pages = {507--536},
year = {1967}
}
@book{agz2010,
author = {Anderson, Greg W. and Guionnet, Alice and Zeitouni, Ofer},
title = {An Introduction to Random Matrices},
series = {Cambridge Studies in Advanced Mathematics},
volume = {118},
publisher = {Cambridge University Press},
year = {2010}
}
@book{baisilverstein2010,
author = {Bai, Zhidong and Silverstein, Jack W.},
title = {Spectral Analysis of Large Dimensional Random Matrices},
edition = {2nd},
series = {Springer Series in Statistics},
publisher = {Springer, New York},
year = {2010}
}
@article{erdosschleinyau2009,
author = {Erd{\H o}s, L\'aszl\'o and Schlein, Benjamin and Yau, Horng-Tzer},
title = {Local semicircle law and complete delocalization for {Wigner} random matrices},
journal = {Communications in Mathematical Physics},
volume = {287},
number = {2},
pages = {641--655},
year = {2009}
}
@book{erdosyau2017,
author = {Erd{\H o}s, L\'aszl\'o and Yau, Horng-Tzer},
title = {A Dynamical Approach to Random Matrix Theory},
series = {Courant Lecture Notes},
volume = {28},
publisher = {American Mathematical Society},
year = {2017}
}