40.07.01 · combinatorics / probabilistic-method

The Probabilistic Method: First-Moment and Counting Arguments

shipped3 tiersLean: none

Anchor (Master): Alon-Spencer 2016 *The Probabilistic Method* 4e (Wiley) Ch. 1-3, 6 (first moment, deletion, second moment, Lovász Local Lemma); Erdős 1947 *Bull. AMS* 53 (the Ramsey lower bound); Erdős 1963/1964 *Nordisk Mat. Tidskr.* / *Israel J. Math.* (property B); Bollobás 1986 *Combinatorics* (Cambridge) Ch. on random colourings

Intuition Beginner

Sometimes the easiest way to prove that a thing exists is to refuse to build it and instead show that a blind random guess would land on it with better-than-zero odds. If you reach into a bag and the chance of pulling out a winning ticket is greater than zero, then at least one winning ticket must be in the bag. That single sentence is the whole engine of the probabilistic method: a property that a random object has with positive probability is a property that some specific object actually has.

The surprise is how far this carries. Suppose you want to colour the connections in a big network so that no small group is connected all in one colour. Building such a colouring by hand looks hopeless. Instead, flip a coin for every connection. If the expected number of "bad" all-one-colour groups comes out to less than one, then some coin-flip outcome must have produced zero bad groups, because you cannot have an average below one if every outcome had at least one. So a good colouring exists, even though we never name it.

This counting-by-averaging idea, due to Paul Erdős, replaced clever explicit constructions with a short calculation. You compare the number of bad arrangements against the total, or you compute an average and find it small, and existence falls out for free.

Visual Beginner

Picture a complete network on six dots, every pair joined by a line, and colour each line red or blue by a coin flip. A "bad event" is a triangle whose three lines all share one colour. We count how many triangles there are and how likely each is to be monochromatic.

quantity value for dots, triangle target
number of triangles
chance one fixed triangle is all-one-colour
expected number of bad triangles

When the expected number of bad triangles drops below one, some colouring has none. The table makes the trade-off concrete: more dots means more triangles, so the method works only while the count stays small. For larger targets than a triangle the chance shrinks fast, which is exactly what lets the method beat the count.

Worked example Beginner

Show that the edges of the complete network on five dots can be coloured red or blue so that no triangle is all one colour.

Step 1. Count the triangles. A triangle is a choice of three dots out of five, and the number of such choices is . So there are ten triangles to worry about.

Step 2. Find the chance one fixed triangle goes bad. Its three edges are each red or blue by an independent fair coin. All three red happens with chance , and all three blue likewise with chance . So the chance the triangle is monochromatic is .

Step 3. Average the count of bad triangles. The average number of monochromatic triangles, across all random colourings, is the number of triangles times the per-triangle chance: .

Step 4. Read off what the average permits. An average of is bigger than one, so this particular bound does not yet force a good colouring. Drop to four dots: now there are triangles and the average is , still not below one.

Step 5. Push the target up instead. Ask for no all-one-colour set of four mutually joined dots among six dots. There are such sets, each monochromatic with chance , giving an average of .

What this tells us: an average below one guarantees an outcome with zero bad sets, so a colouring of six dots avoiding a one-colour group of four exists. The method trades construction for a single division.

Check your understanding Beginner

Formal definition Intermediate+

Fix a finite probability space — a finite sample set with a probability assignment on its subsets. An event is a subset , and a random variable is a function with expectation . Expectation is linear: for any random variables, with no independence required. These notions are developed at the basic-probability units 37.01.x; here they are applied, not reproved.

Basic existence principle (first form). If is an event with , then : there is a sample point with . Contrapositively, if every fails a property , then .

Basic existence principle (expectation form). Let . There is a point with and a point with . In particular, if then some point has , and if is integer-valued with then some point has .

Union bound (the first-moment method). Let be events, the "bad" events to be avoided. Writing for the number of bad events that occur, linearity gives . By the expectation form, if $$ \sum_{i=1}^{m} \Pr(A_i) < 1, $$ then some sample point lies in none of the , i.e. . This is the first-moment method: control the expected number of bad events and conclude existence of a bad-event-free outcome. The bound never uses independence of the and never uses the joint distribution beyond the individual probabilities.

The notation , , (indicator), and (complement) is registered in _meta/NOTATION.md.

Counterexamples to common slips Intermediate+

  • "You need the bad events to be independent." The union bound and its expectation form hold for arbitrary events; independence is irrelevant. It is precisely this that makes the method robust.

  • " gives for any real-valued ." It gives a point with . The conclusion needs to be a non-negative integer-valued count, where "below one" forces "exactly zero".

  • "A small expectation means most outcomes are good." It need not: guarantees one good outcome, not a majority. A few outcomes with large can coexist with the existence claim; refining "one good outcome" to "many" is the job of the second-moment method (40.07.02).

  • "The method constructs the object." It certifies existence non-constructively. Turning the guarantee into an algorithm is a separate task — the deletion method, alterations, and the algorithmic Local Lemma address it.

Key theorem with proof Intermediate+

The signature application is Erdős's exponential lower bound on the diagonal Ramsey number, the first triumph of the method [Erdős 1947]. Recall is the least such that every red/blue colouring of the edges of the complete graph contains a monochromatic . The companion unit 40.05.04 states Ramsey's theorem and proves the upper bound ; the matching lower bound is proved here.

Theorem (Ramsey lower bound; Erdős 1947). *If , then there is a red/blue colouring of the edges of with no monochromatic . Consequently, for ,* $$ R(k, k) > 2^{k/2}. $$

Proof. Colour each of the edges of independently red or blue, each with probability ; this defines a finite probability space on the colourings. For a fixed set of vertices, let be the event that the edges inside are all the same colour. There are two monochromatic patterns (all red, all blue) out of equally likely patterns on those edges, so $$ \Pr(A_S) = \frac{2}{2^{\binom{k}{2}}} = 2^{,1 - \binom{k}{2}}. $$ Let count the monochromatic -subsets. By linearity of expectation, summing over the choices of , $$ \mathbb{E}[X] = \sum_{|S| = k} \Pr(A_S) = \binom{n}{k}, 2^{,1 - \binom{k}{2}}. $$ By hypothesis . Since is a non-negative integer-valued random variable, the expectation form of the existence principle yields a colouring with — a colouring with no monochromatic . By definition of the Ramsey number, .

It remains to extract the bound . Take . Using and , $$ \binom{n}{k},2^{1 - \binom{k}{2}} ;\le; \frac{n^k}{k!},2^{1 - k(k-1)/2} ;\le; \frac{2^{k^2/2}}{k!},\cdot, 2^{1 - k(k-1)/2} ;=; \frac{2^{1 + k/2}}{k!}. $$ For the right-hand side is below one (at it is , and outgrows thereafter), so the hypothesis holds with and therefore .

Bridge. This proof builds toward the entire probabilistic-method chapter: the same template — define a random object, bound the expected number of defects, invoke — appears again in the tournament, dominating-set, and property-B arguments below, and the foundational reason it works is the expectation form of the existence principle, that an integer count with mean below one must vanish somewhere. The bound is exactly the lower companion to the constructive-looking upper bound of 40.05.04, so putting these together pins the diagonal Ramsey number between and — a gap, narrowed only by constant factors in seventy years, that is the central insight motivating the second-moment and Lovász-Local-Lemma refinements of 40.07.02. The first-moment estimate generalises from "no monochromatic " to any family of forbidden substructures whose total expected count is small.

Exercises Intermediate+

Advanced results Master

The first moment is the base of a tower of refinements. Each sharpens the conclusion "one good outcome exists" by adding control over the distribution of the defect count, not merely its mean.

Theorem 1 (alteration / deletion bound for Ramsey). For every and , . Two-colour at random; with monochromatic -sets in expectation, fix a colouring with at most this many, then delete one vertex from each monochromatic -set. The surviving graph on at least vertices has no monochromatic . Optimising improves the constant: it yields rather than the bare . The deletion method is developed in 40.07.02.

Theorem 2 (the gap and its slow closure). The diagonal Ramsey number satisfies , where the lower constant is Erdős's first-moment bound (improved by a polynomial factor through the second moment and the Lovász Local Lemma, Spencer 1975) and the upper constant is the Erdős-Szekeres count. The base of the upper bound stood at from 1935 until Campos, Griffiths, Morris, and Sahasrabudhe (2023) proved for an absolute . No exponential-base improvement to the lower bound is known: the first-moment value has resisted all attempts, and producing an explicit colouring matching even is a notorious open problem (the best explicit constructions, via Frankl-Wilson and later Conlon-type designs, are far weaker).

Theorem 3 (Lovász Local Lemma sharpening). When the bad events each have probability and each is mutually independent of all but at most of the others, the symmetric Local Lemma gives provided — a local condition replacing the global . For the Ramsey colouring this multiplies the achievable by a factor polynomial in , and for property B it yields , improving the first-moment . The Local Lemma is the chapter's apex existence tool (40.07.03).

Theorem 4 (sharp threshold for property B). The first moment gives and a random-plus-counting argument gives (Erdős 1964); Radhakrishnan-Srinivasan (2000) sharpened the lower bound to via a clever random recolouring, and the exact growth rate of remains open between and . The two-sided estimate exhibits the method's typical shape: a clean exponential main term pinned quickly, with the polynomial correction the residue of genuine difficulty.

Synthesis. Putting these together, the first-moment method is the foundational reason the whole probabilistic-method edifice coheres: every refinement — deletion, the second moment, the Local Lemma — is a way of extracting more from the same random object than its mean, and each reduces in its degenerate case to the plain inequality . The Ramsey gap is exactly the signature of this: the lower constant is the first moment's verdict and the upper is a forcing count, and the central insight is that they are dual — existence by averaging below a threshold against existence by pigeonhole above one — so closing the gap is the open problem of either beating the average from below or the forcing from above. This generalises the single Ramsey calculation into a method: define a random structure, identify the bad events, sum their probabilities, and either conclude directly (first moment), delete the few defects (alteration), or localise the dependencies (Local Lemma). The bridge from this unit to the rest of the chapter is that all three share one engine, and the deletion and second-moment continuations of 40.07.02 are precisely the next two turns of that engine.

Full proof set Master

Proposition 1 (expectation form of the existence principle). Let be a finite probability space and . Then there exist with . If is integer-valued and then some has .

Proof. Write , a weighted average with weights summing to over the support. If for every in the support, then , contradicting ; so some has . The symmetric argument applied to gives with . For the last clause, if is integer-valued and , take with ; since is a non-negative integer (assuming ) below , it equals .

Proposition 2 (union-bound existence criterion). Let be events. If then ; in particular the intersection of complements is nonempty.

Proof. Let , the number of events that occur; is a non-negative integer-valued random variable. By linearity of expectation, . By Proposition 1 some has , meaning for all , i.e. . Since and the set is nonempty, on a finite space where every sample point has positive probability (or, in general, the event is nonempty, which is all the existence claim requires).

Proposition 3 (Ramsey lower bound, full statement). If then ; hence for all .

Proof. The colouring space is the uniform measure on . For each -set , the event "the edges within are monochromatic" has probability , since exactly two of the patterns on those edges are constant. By Proposition 2 with the events , the hypothesis yields a colouring in : no -set is monochromatic, so has no monochromatic and . Taking and bounding gives for , so , hence (strictly, since the inequality is strict and is an integer exceeding ).

Proposition 4 (property B lower bound). Every -uniform hypergraph with fewer than edges is 2-colourable; equivalently .

Proof. Colour vertices red/blue independently with probability each. For an edge of size , the event " is monochromatic" has . If the hypergraph has edges, then . By Proposition 2 there is a colouring avoiding every , i.e. a proper 2-colouring. Contrapositively, a non-2-colourable -uniform hypergraph has at least edges, so .

Connections Master

  • The deletion (alteration) and second-moment continuations live in 40.07.02: where the first moment proves a good outcome exists, deletion repairs an almost-good outcome by removing its few defects, and the second moment upgrades " forces somewhere" to " is concentrated near its mean", which is what lower-bounds rather than upper-bounds counts. Both are read here as further turns of the single engine .

  • The Ramsey lower bound here is the exact lower companion to Ramsey's theorem and its upper bound in 40.05.04: that unit owns the statement of Ramsey's theorem and the forcing (pigeonhole-recursion) upper bound and cross-references this unit for the probabilistic lower bound, so together they trap between and .

  • The whole method applies the finite-probability and expectation apparatus of the basic-probability units 37.01.x — finite sample spaces, indicators, and the linearity that needs no independence — without reproving any measure theory; this unit is the combinatorial pay-off of that linearity, which is the one property of expectation the first-moment method actually consumes.

Historical & philosophical context Master

The method originates with Paul Erdős's 1947 note in the Bulletin of the American Mathematical Society [Erdős 1947], a three-page paper proving by the counting argument above. Erdős did not phrase it in the language of probability — he compared the number of colourings containing a monochromatic against the total and observed the former is smaller — but the calculation is exactly the first-moment bound, and it is conventionally regarded as the founding instance of the probabilistic method. The reframing in explicitly probabilistic terms, with expectation and the union bound as named tools, was consolidated over the following decades and codified in the Alon-Spencer monograph [Alon-Spencer 2016].

The property-B threshold traces to Erdős's 1963-64 papers [Erdős 1964], which introduced and established by pairing the first-moment lower bound with a random construction for the upper. Bernstein had used a counting argument of similar spirit for a colouring problem as early as 1908, but Erdős made the technique systematic and demonstrated its reach across Ramsey theory, hypergraph colouring, and number theory. Spencer's 1975 use of the Lovász Local Lemma and Radhakrishnan-Srinivasan's 2000 recolouring refined the constants without displacing the first moment as the entry point.

Bibliography Master

@article{erdos1947ramsey,
  author  = {Erd\H{o}s, Paul},
  title   = {Some remarks on the theory of graphs},
  journal = {Bulletin of the American Mathematical Society},
  volume  = {53},
  pages   = {292--294},
  year    = {1947}
}

@article{erdos1964propertyB,
  author  = {Erd\H{o}s, Paul},
  title   = {On a combinatorial problem. II},
  journal = {Acta Mathematica Academiae Scientiarum Hungaricae},
  volume  = {15},
  pages   = {445--447},
  year    = {1964}
}

@book{alonspencer2016,
  author    = {Alon, Noga and Spencer, Joel H.},
  title     = {The Probabilistic Method},
  edition   = {4},
  publisher = {Wiley},
  year      = {2016}
}

@article{spencer1975ramsey,
  author  = {Spencer, Joel},
  title   = {Ramsey's theorem -- a new lower bound},
  journal = {Journal of Combinatorial Theory, Series A},
  volume  = {18},
  pages   = {108--115},
  year    = {1975}
}

@article{radhakrishnan2000property,
  author  = {Radhakrishnan, Jaikumar and Srinivasan, Aravind},
  title   = {Improved bounds and algorithms for hypergraph 2-coloring},
  journal = {Random Structures \& Algorithms},
  volume  = {16},
  number  = {1},
  pages   = {4--32},
  year    = {2000}
}

@article{campos2023ramsey,
  author  = {Campos, Marcelo and Griffiths, Simon and Morris, Robert and Sahasrabudhe, Julian},
  title   = {An exponential improvement for diagonal Ramsey},
  journal = {arXiv preprint arXiv:2303.09521},
  year    = {2023}
}