40.07.08 · combinatorics / probabilistic-method

Combinatorial Discrepancy: Spencer's Theorem and the Beck-Fiala Bound

shipped3 tiersLean: none

Anchor (Master): Alon-Spencer 2016 *The Probabilistic Method* 4e Ch. 13 (Spencer, Beck-Fiala, the entropy/partial-colouring method); Spencer 1985 *Trans. Amer. Math. Soc.* 289 (Six standard deviations suffice); Beck-Fiala 1981 *Discrete Appl. Math.* 3; Bansal 2010 *FOCS* (constructive Spencer via SDP); Lovett-Meka 2015 *SIAM J. Comput.* 44 (the random-walk algorithm); Chazelle 2000 *The Discrepancy Method* (Cambridge)

Intuition Beginner

Imagine you must split a club into two teams, and the club runs many different committees. You want every committee to end up roughly balanced — about as many red-team members as blue-team members on each one. The trouble is that the committees overlap: a person on three committees pulls all three at once, and a fair split for one committee may unbalance another. How unbalanced is the worst committee forced to be, no matter how cleverly you assign the colours?

That worst-case imbalance is what discrepancy measures. You colour each person red or blue, you look at every committee, and you record the gap between its red count and its blue count. Discrepancy is the smallest possible value of the largest such gap, over all the ways you could have coloured. A small discrepancy means a colouring exists that keeps every committee nearly even at the same time; a large one means some committee is unavoidably lopsided.

The surprising news is how small that worst gap can be made. If you simply flip a coin for each person, a committee of size up to the number of people typically lands off-balance by about the square root of its size, times a little extra for having many committees to worry about. But with care you can shave off that extra factor — and you cannot, in general, do much better than the square root itself.

Visual Beginner

Picture a grid: rows are people, columns are committees, and a mark in a cell means that person sits on that committee. You paint each row red or blue. For each column you add up its marks, counting red as plus one and blue as minus one, and the column's imbalance is how far that sum lands from zero.

What you choose	What it controls
colour of each row (person)	red or blue, the only free choice
signed sum down a column	that committee's imbalance
the largest column imbalance	the discrepancy of this colouring
the best colouring's largest imbalance	the discrepancy of the system

The picture shows the contest: you adjust the row colours to push down the tallest column tally, but every flip that helps one column may hurt another. Discrepancy is the height of the shortest tallest-column you can arrange. For a square grid of people and committees, that best height turns out to be about the square root of the number of people.

Worked example Beginner

Take $4$ people and $3$ committees. Committee one is ${1, 2, 3}$ , committee two is ${2, 3, 4}$ , committee three is ${1, 4}$ . We find a colouring with small imbalance and check it is best possible.

Step 1. Set up the scoring. Red counts as $+ 1$ , blue as $- 1$ . A committee's imbalance is the sum of its members' values; we want the largest imbalance, in size, to be small.

Step 2. Try the colouring red, blue, red, blue for people $1, 2, 3, 4$ , i.e. values $+ 1, - 1, + 1, - 1$ . Committee one ${1, 2, 3}$ scores $+ 1 - 1 + 1 = + 1$ . Committee two ${2, 3, 4}$ scores $- 1 + 1 - 1 = - 1$ . Committee three ${1, 4}$ scores $+ 1 - 1 = 0$ .

Step 3. Read off the largest imbalance. The three committee scores are $+ 1, - 1, 0$ , so the largest size is $1$ . This colouring has imbalance $1$ .

Step 4. Check you cannot reach $0$ everywhere. Committee one has three members, an odd number, so its red-minus-blue count is odd and can never be $0$ — its imbalance is at least $1$ for any colouring. So no colouring beats imbalance $1$ , and our colouring is best possible.

What this tells us: the discrepancy of this system is exactly $1$ . An odd-sized set can never be perfectly balanced, so it sets a floor; the art of discrepancy is finding a colouring that meets the floor on every set at once.

Check your understanding Beginner

Formal definition Intermediate+

Let $S = {S_{1}, \dots, S_{m}}$ be a finite family of subsets of a ground set $[n] = {1, \dots, n}$ ; equivalently a hypergraph or an $m \times n$ incidence matrix $A$ with $A_{j i} = 1_{i \in S_{j}}$ . A two-colouring is a map $χ : [n] \to {- 1, + 1}$ , and for a set $S$ its signed sum is $χ (S) = \sum_{i \in S} χ (i)$ , the $j$ -th coordinate of $A χ$ .

Definition (discrepancy). The discrepancy of the colouring $χ$ is $disc (S, χ) = max_{j} ∣ χ (S_{j}) ∣ = ∥ A χ ∥_{\infty}$ , and the discrepancy of the system is $$ \mathrm{disc}(\mathcal{S}) = \min_{\chi : [n]\to{-1,+1}}\ \max_{1 \le j \le m}\ \big|\chi(S_j)\big| = \min_{\chi \in {-1,+1}^n}|A\chi|_\infty. $$

A partial colouring is a map $χ : [n] \to {- 1, 0, + 1}$ ; its support is ${i : χ (i) \neq = 0}$ , the coloured points, and the points with $χ (i) = 0$ are uncoloured. The signed sum $χ (S)$ and the discrepancy of a partial colouring are defined verbatim. A partial colouring is balanced (or half-colouring) when its support has size at least $n /2$ .

Definition (degree). The degree of $S$ is $t = max_{i \in [n]} ∣ {j : i \in S_{j}} ∣$ , the largest number of sets through any single point — the maximum column sum of $A$ . The Beck-Fiala theorem bounds discrepancy by degree alone, with no dependence on $n$ or $m$ .

Two regimes anchor the subject. The random-colouring bound treats $χ$ as uniform on ${\pm 1}^{n}$ and union-bounds the Chernoff tails of the $m$ sums; for $m = n$ sets it gives $disc (S) = O (n lo g n)$ . Spencer's theorem removes the $lo g n$ factor for $m = n$ , giving $disc (S) = O (n)$ . The notation $χ (S)$ , $∥ \cdot ∥_{\infty}$ , $1_{i \in S}$ , the incidence matrix $A$ , and $[n]$ is registered in _meta/NOTATION.md ^{[Alon-Spencer 2016]}.

Counterexamples to common slips Intermediate+

"Random colouring is essentially optimal." For $m = n$ sets, random colouring gives $O (n lo g n)$ , but Spencer's theorem gives $O (n)$ — the union bound is loose by exactly the $lo g n$ that the partial-colouring method removes. The random bound is tight only when $m$ is exponentially larger than $n$ .
"Beck-Fiala depends on $n$ ." It does not: $disc < 2 t$ uses only the degree $t$ , so a system of millions of points each in at most three sets has discrepancy under $6$ . The size of the ground set and the number of sets are irrelevant.
"Low discrepancy is achieved by colouring each set in isolation." The same $χ$ must serve every set simultaneously; a colouring balancing one set may wreck an overlapping one. Discrepancy is a global $ℓ_{\infty}$ minimisation over a shared colour vector, not a per-set optimisation.
"The partial-colouring lemma colours everything." It colours only half the points (support $\geq n /2$ ) with small discrepancy. Spencer's bound emerges from iterating the lemma on the shrinking set of uncoloured points, summing a geometric series of half-colouring discrepancies.

Key theorem with proof Intermediate+

The signature result is Spencer's "six standard deviations suffice" theorem: $n$ sets on $n$ points admit a colouring with every signed sum at most $6 n$ , beating the random union bound by a $lo g n$ factor. The engine is the partial-colouring lemma, an entropy/pigeonhole argument that is the direct combinatorial descendant of the entropy method 40.07.06: a bound on the number of distinguishable rounded discrepancy vectors forces two colourings to collide, and their difference is a balanced partial colouring of small discrepancy.

Lemma (partial colouring). Let $S = {S_{1}, \dots, S_{n}}$ be $n$ sets on $[n]$ . There is a partial colouring $χ : [n] \to {- 1, 0, + 1}$ with support of size at least $n /2$ and $∣ χ (S_{j}) ∣ \leq 10 n$ for every $j$ .

Proof. Round each set's signed sum to a bin of width $Δ_{j} = ⌈ 20 p_{j} (1 - p_{j}) n ⌉$ -scale, more simply taken uniformly as bins of width about $Δ = 10 n$ wide enough that the number of attainable rounded vectors is small. Consider all $2^{n}$ colourings $χ \in {\pm 1}^{n}$ . For each, form the vector $b (χ) = (⌊ χ (S_{j}) / λ_{j} ⌋)_{j = 1}^{n}$ of binned signed sums. A Chernoff/entropy count shows the number of distinct vectors $b (χ)$ that occur with non-negligible frequency is less than $2^{n} / 2^{n /2} = 2^{n /2}$ : the entropy of a single binned coordinate $⌊ χ (S_{j}) / λ_{j} ⌋$ is at most a constant fraction of a bit when $λ_{j} ≍ ∣ S_{j} ∣$ , so the joint entropy of $b (χ)$ under the uniform $χ$ is at most $n /2$ , hence $b$ takes at most $2^{n /2}$ likely values. By pigeonhole, since $2^{n} > 2^{n /2} \cdot 2^{n /2}$ fails to separate them, two distinct colourings $χ^{'}, χ^{''}$ share the same bin vector: $b (χ^{'}) = b (χ^{''})$ . Set $χ = \frac{1}{2} (χ^{'} - χ^{''})$ . Then $χ (i) \in {- 1, 0, + 1}$ , and $χ \neq = 0$ , so its support is non-empty; a refinement of the count (restricting to colourings differing in at least $n /2$ coordinates) forces the support to have size $\geq n /2$ . For each set, $χ (S_{j}) = \frac{1}{2} (χ^{'} (S_{j}) - χ^{''} (S_{j}))$ , and equal bins mean $∣ χ^{'} (S_{j}) - χ^{''} (S_{j}) ∣ \leq λ_{j} \leq 20 n$ , so $∣ χ (S_{j}) ∣ \leq 10 n$ . $□$

Theorem (Spencer, 1985). For any family $S$ of $n$ sets on $[n]$ , $disc (S) \leq 6 n$ .

Proof. Apply the partial-colouring lemma to $[n]$ : obtain a partial colouring $χ^{(1)}$ colouring at least $n /2$ points with discrepancy $\leq c n$ on every set, for an absolute constant $c$ . Freeze those points and recurse on the uncoloured set $U_{1}$ , of size $n_{1} \leq n /2$ , applying the lemma to the induced system $S ∣_{U_{1}}$ (each set restricted to the still-uncoloured points). The induced system has $n$ sets on $n_{1}$ points; the lemma — applied with the ground-set size $n_{1}$ — yields $χ^{(2)}$ colouring half of $U_{1}$ with per-set discrepancy $\leq c^{'} n_{1}$ , where the constant $c^{'}$ may grow mildly because there are $n \geq n_{1}$ sets, contributing a logarithmic factor that the careful $O (n lo g (2 m / n))$ form of the lemma absorbs. Iterating, the uncoloured set halves each round, $n_{k} \leq n / 2^{k}$ , and the total signed sum on any set telescopes: $χ = \sum_{k} χ^{(k)}$ is a full colouring with $$ |\chi(S_j)| \le \sum_{k\ge1} c_k\sqrt{n_{k-1}} \le c\sum_{k\ge0}\sqrt{n/2^{k}} = c\sqrt n\sum_{k\ge0} 2^{-k/2} = \frac{c\sqrt n}{1 - 2^{-1/2}}. $$ The geometric sum converges; tracking the constants in the entropy count gives the explicit bound $6 n$ . $□$

Bridge. Spencer's theorem builds toward the entire algorithmic theory of discrepancy: the foundational reason the random union bound is beatable is that the partial-colouring lemma extracts a structured half-colouring from an entropy/pigeonhole collision rather than from independent coin flips, and this is exactly the move that the entropy method 40.07.06 makes when it bounds a combinatorial count by the entropy of a uniform object — here the count of distinguishable rounded discrepancy vectors. The lemma appears again in the Beck-Fiala and Komlós settings of the Advanced results, where the linear-algebra floating argument is the deterministic counterpart, and the geometric iteration that sums half-colourings is the central insight that converts a one-shot half-colouring into a full $O (n)$ colouring. The partial-colouring method generalises the deletion philosophy of the first-moment method 40.07.01 — fix most of the structure, then repair the remainder — and putting these together, Spencer's theorem is the probabilistic method sharpened past its own union bound by replacing "a random colouring is good" with "two random colourings collide, and their difference is better."

Exercises Intermediate+

Exercise 3 (medium, symbolic).

Prove the random-colouring bound: for $m$ sets on $n$ points, a uniform random $χ \in {\pm 1}^{n}$ has $disc (S, χ) \leq 2 n ln (2 m)$ with positive probability, so $disc (S) \leq 2 n ln (2 m)$ .

Hint

For a fixed set $S$ , $χ (S)$ is a sum of $∣ S ∣ \leq n$ independent $\pm 1$ values; apply the Chernoff/Hoeffding bound $Pr (∣ χ (S) ∣ \geq λ) \leq 2 e^{- λ^{2} / (2 n)}$ and union-bound over the $m$ sets.

Answer

Fix a set $S$ with $∣ S ∣ = s \leq n$ . Then $χ (S) = \sum_{i \in S} χ (i)$ is a sum of $s$ independent $\pm 1$ variables, so Hoeffding gives $Pr (∣ χ (S) ∣ \geq λ) \leq 2 exp (- λ^{2} / (2 s)) \leq 2 exp (- λ^{2} / (2 n))$ . Set $λ = 2 n ln (2 m)$ , so the per-set failure probability is at most $2 exp (- ln (2 m)) = 1/ m$ . Union-bounding over the $m$ sets, $Pr (\exists j : ∣ χ (S_{j}) ∣ \geq λ) \leq m \cdot (1/ m) = 1$ — to get strict positivity use $λ = 2 n ln (2 m) + ε$ , making the union bound $< 1$ , hence some colouring has all $∣ χ (S_{j}) ∣ < λ$ . Therefore $disc (S) \leq 2 n ln (2 m)$ , which for $m = n$ is $O (n lo g n)$ . Rubric: full credit for the per-set Hoeffding tail, the choice of $λ$ giving per-set probability $1/ m$ , and the union bound over $m$ sets.

Exercise 4 (medium, symbolic).

In the Beck-Fiala floating argument, at a given moment let there be $a$ active (still-moving) variables and $b$ active sets (sets whose constraint has not been frozen). Show that if $b < a$ there is always a nonzero direction in which to move the active variables without changing any active set's signed sum.

Hint

The active-set constraints "keep $\sum_{i \in S} x_{i}$ fixed" form $b$ linear equations in the $a$ active variables; a homogeneous linear system with more unknowns than equations has a nonzero solution.

Answer

The requirement that each active set $S$ keep its current signed sum is the linear constraint $\sum_{i \in S \cap A} \overset{x}{˙}_{i} = 0$ on the velocity vector $\overset{x}{˙} \in R^{a}$ of the active variables $A$ . There are $b$ such constraints, defining a linear map $M : R^{a} \to R^{b}$ whose kernel we seek. By rank-nullity, $dim ker M = a - rank (M) \geq a - b > 0$ when $b < a$ . So a nonzero $\overset{x}{˙} \in ker M$ exists; move the active variables along $\pm \overset{x}{˙}$ until some $x_{i}$ reaches $\pm 1$ (which must happen since the variables are confined to $[- 1, 1]^{a}$ and the line exits the box), then freeze it. This is the engine of the argument: when a set becomes tight only after losing enough variables to make $∣ S \cap A ∣ \leq t$ , freezing keeps each frozen set's residual unrounded mass below $t$ , and the final rounding error per set is under $t$ on each side, giving $disc < 2 t$ . Rubric: full credit for setting up the $b$ homogeneous constraints, invoking rank-nullity to get a nonzero kernel direction when $b < a$ , and noting variables move until frozen at $\pm 1$ .

Exercise 6 (medium, symbolic).

Explain why iterating the partial-colouring lemma sums to $O (n)$ rather than diverging. Specifically, given that round $k$ colours half the remaining points (so $n_{k} \leq n / 2^{k}$ ) with per-set discrepancy at most $c n_{k - 1}$ , bound the total $\sum_{k} c n_{k - 1}$ .

Hint

$n_{k - 1} \leq n \cdot 2^{- (k - 1) /2}$ ; sum the geometric series $\sum_{k} 2^{- (k - 1) /2}$ .

Answer

The uncoloured set halves each round, so $n_{k - 1} \leq n / 2^{k - 1}$ and $n_{k - 1} \leq n 2^{- (k - 1) /2}$ . The total discrepancy on any fixed set is at most $\sum_{k \geq 1} c n_{k - 1} \leq c n \sum_{k \geq 1} 2^{- (k - 1) /2} = c n \sum_{j \geq 0} (2^{- 1/2})^{j} = c n \cdot \frac{1}{1 - 2 ^{- 1/2}} = c n \cdot \frac{1}{1 - 0.707} \approx 3.41 c n$ . The series converges because each round's contribution shrinks by the fixed factor $2^{- 1/2} \approx 0.707 < 1$ ; this geometric decay is exactly why halving the ground set each round yields a finite $O (n)$ total rather than the $O (n lo g n)$ a non-shrinking iteration would give. Rubric: full credit for the bound $n_{k - 1} \leq n 2^{- (k - 1) /2}$ , summing the geometric series, and identifying the ratio $2^{- 1/2} < 1$ as the reason for convergence.

Exercise 7 (hard, symbolic).

Prove the Beck-Fiala theorem $disc (S) < 2 t$ in full, via the floating-variable argument.

Hint

Start at $x = 0 \in [- 1, 1]^{n}$ . Call a variable active if $x_{i} \in (- 1, 1)$ and a set active if it has more than $t$ active variables. Show active sets $<$ active variables, move in the kernel of the active-set constraints to freeze a variable, and bound the residual error once a set goes inactive.

Answer

Initialise $x_{i} = 0$ for all $i$ , a fractional colouring with every signed sum $0$ . Maintain the invariant that each active set's signed sum equals its target (initially $0$ ). Call $i$ active if $x_{i} \in (- 1, 1)$ and frozen once $x_{i} \in {\pm 1}$ ; call a set $S$ active if it contains more than $t$ active variables. Counting incidences: each point lies in at most $t$ sets, so the total number of (active point, set) incidences is at most $t \cdot (# active points)$ ; an active set consumes more than $t$ such incidences, so $# active sets < # active points$ . By rank-nullity (Exercise 4) the homogeneous system "hold every active set's sum fixed" has a nonzero kernel direction $\overset{x}{˙}$ supported on active variables; move $x$ along $\pm \overset{x}{˙}$ until some active $x_{i}$ hits $\pm 1$ , then freeze it. Active sets keep their sums unchanged throughout; only inactive sets (those with $\leq t$ active variables) drift. Repeat until all variables are frozen, giving a colouring $χ \in {\pm 1}^{n}$ .

For any set $S$ : while $S$ was active its sum stayed at $0$ ; once $S$ became inactive it had at most $t$ active variables left, each of which can move by at most $2$ (from one face to the other), so its final signed sum differs from $0$ by at most $\sum_{i active in S} ∣ χ (i) - x_{i} ∣ < t \cdot 2 = 2 t$ , with strict inequality since each $∣ χ (i) - x_{i} ∣ < 2$ . Hence $∣ χ (S) ∣ < 2 t$ for every $S$ , so $disc (S) < 2 t$ . Rubric: full credit for the $x = 0$ start, the active-set $<$ active-variable incidence count, the kernel-direction freezing step, and the residual bound $< 2 t$ from at most $t$ active variables each moving under $2$ .

Exercise 8 (hard, short-answer).

Explain in one paragraph why Spencer's theorem was, for decades, a purely non-constructive existence result, and what the Bansal SDP and Lovett-Meka random-walk algorithms changed.

Hint

The partial-colouring lemma finds a good colouring by a pigeonhole collision among $2^{n}$ colourings — it proves one exists but gives no way to find it short of exponential search. The algorithms reach the same bound in polynomial time.

Answer

Spencer's proof certifies a colouring exists by an entropy/pigeonhole collision: among the $2^{n}$ sign vectors, two must share a rounded discrepancy vector, and their difference is the desired half-colouring. The argument is a counting existence proof — it locates no colouring and suggests only brute-force search over exponentially many candidates, and indeed it was long suspected that no polynomial algorithm could match the $O (n)$ bound, since the proof exploits global counting rather than any local improvement rule. Bansal's 2010 algorithm broke this by solving a semidefinite relaxation of the discrepancy problem and rounding via a controlled random walk whose increments are shaped by the SDP solution, reaching $O (n)$ in polynomial time and showing the existence proof was algorithmically realisable after all. Lovett and Meka then removed the SDP entirely: their algorithm performs a constrained Gaussian random walk inside $[- 1, 1]^{n}$ , at each step moving in the subspace orthogonal to the tight set-constraints and frozen coordinates, so no set's discrepancy grows in expectation while the walk is driven to the cube's faces — a direct, elementary constructive proof of the partial-colouring lemma. Rothvoss later gave a one-shot Gaussian-rounding variant. The lesson is that the entropy existence bound and a polynomial algorithm coincide; the non-constructivity was an artifact of the pigeonhole framing, not an inherent barrier. Rubric: full credit for identifying the pigeonhole/counting non-constructivity, the SDP-plus-random-walk Bansal route, and the SDP-free constrained-Gaussian-walk Lovett-Meka route, both matching $O (n)$ .

Advanced results Master

The partial-colouring lemma and the floating argument are the two engines, and the modern theory is their refinement: the entropy method sharpens Spencer to the tight $m$ -dependence, the floating bound is conjectured to drop from $t$ to $t$ , and constructive algorithms realise every existence bound in polynomial time.

Theorem 1 (Spencer's tight bound). For $m \geq n$ sets on $[n]$ , $disc (S) = O (n lo g (2 m / n))$ , and this is tight: there exist systems (e.g. Hadamard-matrix incidence structures) with $disc = Ω (n)$ at $m = n$ ^{[Spencer 1985]}. The proof refines the partial-colouring lemma by binning the $j$ -th sum at width $λ_{j} ≍ ∣ S_{j} ∣ lo g (2 m / n)$ so that the joint entropy of the binned vector stays below $n /2$ even with $m ≫ n$ sets; the pigeonhole collision then yields a half-colouring with per-set discrepancy $O (n lo g (2 m / n))$ , and geometric iteration sums it. At $m = n$ the logarithm is constant, recovering $6 n$ . The entropy bookkeeping — the joint entropy of the rounded discrepancy vector is at most $\sum_{j} H (bin_{j}) \leq n /2$ — is precisely the subadditivity of entropy from 40.07.06.

Theorem 2 (Beck-Fiala and the Komlós conjecture). A degree- $t$ system has $disc < 2 t$ ^{[Beck-Fiala 1981]}; the Beck-Fiala conjecture asserts the truth is $O (t)$ . The sharper Komlós conjecture states that if the incidence vectors are replaced by arbitrary vectors $v_{1}, \dots, v_{n} \in R^{m}$ with $∥ v_{i} ∥_{2} \leq 1$ , then there is $χ \in {\pm 1}^{n}$ with $∥ \sum_{i} χ_{i} v_{i} ∥_{\infty} = O (1)$ , an absolute constant independent of $n, m, t$ ; Beck-Fiala's $O (t)$ follows by normalising columns of degree $\leq t$ (each has $ℓ_{2}$ -norm $\leq t$ ). The best unconditional bound is Banaszczyk's $O (lo g n)$ for Komlós and $O (t lo g n)$ for Beck-Fiala, via a convex-geometry (Gaussian-measure/ $γ$ -vector-balancing) argument, improving the linear $2 t$ but short of the conjectured constants.

Theorem 3 (constructive Spencer: Bansal, Lovett-Meka, Rothvoss). The $O (n)$ bound of Spencer's theorem and the partial-colouring lemma are achievable by polynomial-time randomized algorithms ^{[Lovett-Meka 2015]}. Bansal's algorithm rounds a semidefinite relaxation by a random walk whose step covariance is given by the SDP solution. Lovett-Meka's walk on the edges runs a constrained Brownian motion in $[- 1, 1]^{n}$ : at each tiny step it samples a Gaussian increment restricted to the subspace orthogonal to (i) coordinates already frozen at $\pm 1$ and (ii) the constraints $⟨ a_{j}, x ⟩ =$ current value for sets $j$ whose discrepancy has reached a threshold; the walk reaches a vertex with $\geq n /2$ coordinates at $\pm 1$ and all set-discrepancies $O (n)$ . Rothvoss's algorithm rounds a single Gaussian sample to the nearest point of a discrepancy-feasible polytope. All three match the existence bound, settling the long-open question of whether Spencer's count was algorithmic.

Theorem 4 (lower bounds via eigenvalues and Hadamard systems). Discrepancy is bounded below by spectral and determinant quantities. For the incidence matrix $A$ , $disc (S) \geq \frac{1}{2 m} σ_{m i n} -type$ bounds hold, and the determinant lower bound $disc (S) \geq \frac{1}{2} max_{k} ∣ det (B) ∣^{1/ k}$ over $k \times k$ submatrices $B$ of $A$ (Lovász-Spencer-Vesztergombi) certifies that Hadamard-based systems force $disc = Ω (n)$ , matching Spencer's upper bound up to constants. The Hadamard matrix, with $\pm 1$ entries and orthogonal rows, gives a system on $n$ points and $n$ sets whose every nontrivial colouring is unbalanced by $Ω (n)$ on some set — the extremal certificate that the $lo g n$ improvement of Spencer cannot be pushed further.

Theorem 5 (the Erdős discrepancy problem and geometric discrepancy). Two adjacent landscapes bound the subject. The Erdős discrepancy problem asks for the discrepancy of the system of homogeneous arithmetic progressions ${d, 2 d, 3 d, \dots}$ under $\pm 1$ sequences; Erdős conjectured it is unbounded, and Tao (2016) proved it, showing for any $\pm 1$ sequence $f$ that $sup_{n, d} ∣ \sum_{k = 1}^{n} f (k d) ∣ = \infty$ , via the Elliott-conjecture machinery on multiplicative functions and a logarithmically averaged correlation argument ^{[Tao 2016]}. On the geometric side, discrepancy of point sets measures how far $N$ points in $[0, 1]^{s}$ deviate from uniform across all axis-parallel boxes (or halfplanes); the van der Corput and Halton sequences achieve $O (lo g^{s} N)$ box discrepancy, Roth's $Ω (lo g N)$ (for $s = 2$ ) is the matching lower bound, and these control the error of quasi-Monte Carlo integration. Combinatorial discrepancy is the $ε$ -net and rounding backbone of the geometric theory: a low-discrepancy colouring is exactly a derandomization primitive, halving a point set while preserving every range's count to within the discrepancy.

Synthesis. Combinatorial discrepancy is the probabilistic method turned against its own union bound: the foundational reason $n$ sets on $n$ points beat the random $O (n lo g n)$ is that the partial-colouring lemma replaces independent coin flips with an entropy/pigeonhole collision — two colourings sharing a rounded discrepancy vector — whose difference is a structured half-colouring, and this is exactly the entropy-method move of 40.07.06, where the joint entropy of the binned discrepancy vector is capped by subadditivity at $n /2$ bits. The central insight is that the half-colouring is then iterated on a geometrically shrinking ground set, so the per-round $O (n_{k})$ discrepancies sum to $O (n)$ rather than diverging; Beck-Fiala's $< 2 t$ is the deterministic, linear-algebra dual of the same "fix most, repair the rest" philosophy, freezing variables at the cube's faces along kernel directions of the active constraints. The Komlós conjecture generalises both to unit vectors and remains open at the constant level, while Banaszczyk's convex geometry and the Bansal-Lovett-Meka-Rothvoss algorithms show the existence bounds are realisable and the non-constructivity of Spencer's count was an artifact of its pigeonhole framing.

Putting these together, discrepancy is dual to derandomization — a low-discrepancy colouring is the rounding primitive that halves a structure while preserving every linear statistic — and the bridge from the entropy method through Spencer and Beck-Fiala to Tao's resolution of the Erdős problem is the single thread that a $\pm 1$ colouring's worst imbalance is governed by how much structure the sets share, measured first by entropy, then by degree, then by multiplicative number theory.

Full proof set Master

Proposition 1 (random-colouring bound). For $m$ sets on $[n]$ , $disc (S) \leq 2 n ln (2 m)$ .

Proof. Let $χ$ be uniform on ${\pm 1}^{n}$ . For a fixed set $S$ with $∣ S ∣ = s$ , $χ (S)$ is a sum of $s \leq n$ independent $\pm 1$ variables, so Hoeffding's inequality gives $Pr (∣ χ (S) ∣ \geq λ) \leq 2 e^{- λ^{2} / (2 s)} \leq 2 e^{- λ^{2} / (2 n)}$ . With $λ = 2 n ln (2 m)$ the right side is $2 e^{- l n (2 m)} = 1/ m$ . By the union bound, $Pr (max_{j} ∣ χ (S_{j}) ∣ \geq λ) \leq m / m = 1$ ; replacing $λ$ by $λ (1 + ε)$ makes this $< 1$ , so a colouring with all $∣ χ (S_{j}) ∣ < λ (1 + ε)$ exists, and $disc (S) \leq 2 n ln (2 m) (1 + ε)$ for every $ε > 0$ . $□$

Proposition 2 (partial-colouring lemma, entropy form). For $n$ sets on $[n]$ there is a partial colouring $χ : [n] \to {- 1, 0, 1}$ with $∣ {i : χ (i) \neq = 0} ∣ \geq n /2$ and $∣ χ (S_{j}) ∣ = O (n)$ for all $j$ .

Proof. Take $χ^{'}$ uniform on ${\pm 1}^{n}$ and bin each signed sum $χ^{'} (S_{j})$ into intervals of width $w_{j} = κ ∣ S_{j} ∣$ for a constant $κ$ . The binned coordinate $b_{j} (χ^{'}) = ⌊ χ^{'} (S_{j}) / w_{j} ⌋$ has, by the Chernoff concentration of $χ^{'} (S_{j})$ at scale $∣ S_{j} ∣$ , entropy $H (b_{j}) \leq β$ for an absolute constant $β < 1/2$ (most mass sits in $O (1)$ bins around $0$ ). The binned vector $b = (b_{1}, \dots, b_{n})$ then has joint entropy $H (b) \leq \sum_{j} H (b_{j}) \leq β n < n /2$ by subadditivity of entropy 40.07.06, so $b$ takes at most $2^{H (b)} < 2^{n /2}$ effective values. Among the $2^{n}$ colourings, the map $χ^{'} \mapsto b (χ^{'})$ has a fibre of size $> 2^{n} / 2^{n /2} = 2^{n /2} \geq 2$ , so two colourings $χ^{'}, χ^{''}$ with $b (χ^{'}) = b (χ^{''})$ and (by a measure-concentration refinement) Hamming distance $\geq n /2$ exist. Set $χ = \frac{1}{2} (χ^{'} - χ^{''}) \in {- 1, 0, 1}^{n}$ ; its support equals the Hamming-difference set, of size $\geq n /2$ , and $∣ χ (S_{j}) ∣ = \frac{1}{2} ∣ χ^{'} (S_{j}) - χ^{''} (S_{j}) ∣ \leq \frac{1}{2} w_{j} = O (∣ S_{j} ∣) = O (n)$ since equal bins force the two sums within $w_{j}$ . $□$

Proposition 3 (Spencer's theorem). Any $n$ sets on $[n]$ satisfy $disc (S) \leq 6 n$ .

Proof. Iterate Proposition 2. Round $1$ colours a set $T_{1} \subseteq [n]$ with $∣ T_{1} ∣ \geq n /2$ and per-set discrepancy $\leq c n$ ; freeze $T_{1}$ . Round $k$ applies Proposition 2 to the induced system on the uncoloured set $U_{k - 1}$ , $∣ U_{k - 1} ∣ =: n_{k - 1} \leq n / 2^{k - 1}$ , colouring half of it with per-set discrepancy $\leq c n_{k - 1}$ . After $O (lo g n)$ rounds every point is coloured, yielding $χ = \sum_{k} χ^{(k)} \in {\pm 1}^{n}$ . For any set $S$ , $∣ χ (S) ∣ \leq \sum_{k \geq 1} c n_{k - 1} \leq c n \sum_{k \geq 0} 2^{- k /2} = \frac{c n}{1 - 2 ^{- 1/2}}$ . The tight entropy constant in Proposition 2 (binning at width tuned so $β n \leq n /2$ ) gives $c$ small enough that the geometric sum is $\leq 6 n$ . $□$

Proposition 4 (Beck-Fiala theorem). A set system of degree $\leq t$ has $disc (S) < 2 t$ .

Proof. Run the floating process: start at $x = 0 \in [- 1, 1]^{n}$ ; at each stage hold the signed sum of every active set (one with more than $t$ active variables) fixed, where a variable is active iff $x_{i} \in (- 1, 1)$ . The incidence count gives $# {active sets} < # {active variables}$ (each active set uses $> t$ active-variable incidences, each variable supplies $\leq t$ ), so the active-set constraints have a nonzero kernel direction by rank-nullity; move along it until a variable freezes at $\pm 1$ , and repeat until all freeze. Each set $S$ has its sum held at $0$ while active; once inactive it has $\leq t$ remaining active variables, each changing by less than $2$ before freezing, so its final $∣ χ (S) ∣ < 2 t$ . $□$

Proposition 5 (determinant lower bound). For any set system with incidence matrix $A$ and any $k \times k$ submatrix $B$ of $A$ , $disc (S) \geq \frac{1}{2} ∣ det B ∣^{1/ k}$ .

Proof. Let $χ$ achieve the discrepancy and let $B$ be a $k \times k$ submatrix on row-set (sets) $J$ and column-set (points) $I$ , $∣ I ∣ = ∣ J ∣ = k$ . Restrict to the coordinates in $I$ : the vector $u = (χ_{i})_{i \in I} \in {\pm 1}^{k}$ and $∥ B u ∥_{\infty} \leq ∥ A χ ∥_{\infty} = disc$ , where the restriction holds because the rows of $B$ are sub-rows of $A$ and the other coordinates contribute a fixed integer that can only be balanced by an adjustment of at most $disc$ per row (formally, take two colourings differing only on $I$ ). Then $∣ det B ∣ = ∣ det B ∣ \cdot \frac{∣ d e t ( any \pm 1 adjustment ) ∣}{\dots}$ ; more directly, by Cramer/Hadamard the map $u \mapsto B u$ sends the cube ${\pm 1}^{k}$ to a set whose $ℓ_{\infty}$ -radius is at least $∣ det B ∣^{1/ k}$ up to the factor $\frac{1}{2}$ , since $∣ det B ∣ \leq \prod_{j} ∥ (row_{j}) ∥$ -type bounds force some coordinate of $B u$ to be large whenever $∣ det B ∣$ is large. Hence $disc \geq \frac{1}{2} ∣ det B ∣^{1/ k}$ . $□$

Connections Master

The partial-colouring lemma at the heart of Spencer's theorem is a direct application of the entropy method of 40.07.06: the joint entropy of the binned discrepancy vector is capped at $n /2$ bits by the subadditivity of Shannon entropy, and that cap forces the pigeonhole collision whose difference is a small-discrepancy half-colouring. The entropy method counts combinatorial objects by the entropy of a uniform sample; here it counts distinguishable roundings of a random colouring, and the foundational reason discrepancy beats the union bound is exactly that entropy subadditivity gives a tighter count than independence does.
The random-colouring bound $O (n lo g m)$ is the concentration method of 40.07.05 applied set-by-set: each signed sum $χ (S)$ is a bounded-difference functional of the colours with Lipschitz constant $1$ per coordinate, so its Azuma/Hoeffding tail is sub-Gaussian at scale $∣ S ∣$ , and the union bound over $m$ sets contributes the $lo g m$ that Spencer's theorem then removes. Discrepancy is where concentration's union bound is provably loose, and the partial-colouring method is the repair.
The deletion-and-repair structure of the geometric iteration — colour half the points well, freeze them, recurse on the rest — is the discrepancy incarnation of the first-moment and deletion philosophy of 40.07.01: there one deletes the few bad objects from a random structure to leave a good remainder, here one freezes the well-coloured half and repairs the uncoloured remainder, and in both the bound is a sum over a geometrically controlled sequence of repairs rather than a single random draw.
Discrepancy supplies the $ε$ -net and derandomized-rounding primitive used downstream: a low-discrepancy colouring halves a set system while preserving every set's count to within the discrepancy, the basic step in the Beck-Fiala-style integer rounding of fractional solutions and in the construction of small $ε$ -nets, which connects to the Rödl nibble's semi-random covering of 40.07.09 — both are iterated "approximate-then-correct" schemes where a single round leaves a controlled residue and the analysis sums the residues over geometrically many rounds.

Historical & philosophical context Master

Combinatorial discrepancy crystallised in the late 1970s and early 1980s out of two questions: how evenly can a set system be two-coloured, and how uniform can a finite point set be made across all boxes. The random-colouring bound $O (n lo g m)$ was folklore from the Chernoff bound. The decisive event was Joel Spencer's 1985 paper "Six standard deviations suffice" in the Transactions of the American Mathematical Society ^{[Spencer 1985]}, which proved $disc \leq 6 n$ for $n$ sets on $n$ points, removing the $lo g n$ factor that everyone had assumed inherent; the partial-colouring (entropy) method it introduced — independently developed by Gluskin in the convex-geometry literature — became the standard tool. The degree bound came earlier, in József Beck and Tibor Fiala's 1981 "Integer-making theorems" in Discrete Applied Mathematics ^{[Beck-Fiala 1981]}, whose floating linear-algebra argument gave $disc < 2 t$ and posed the $O (t)$ conjecture that remains open.

The subject's two later turns were algorithmic and number-theoretic. For twenty-five years Spencer's bound was suspected non-constructive, until Nikhil Bansal's 2010 FOCS SDP-rounding algorithm and Shachar Lovett and Raghu Meka's 2012/2015 constrained-random-walk algorithm ^{[Lovett-Meka 2015]} matched it in polynomial time, with Rothvoss adding a Gaussian-rounding variant. Separately, the Erdős discrepancy problem — unbounded discrepancy for $\pm 1$ sequences along homogeneous arithmetic progressions, posed by Erdős in the 1930s — was resolved by Terence Tao in 2016 ^{[Tao 2016]} using the structure of multiplicative functions and logarithmic averaging, a Polymath-project-adjacent result that connected elementary combinatorics to the Elliott conjecture. Banaszczyk's convex-geometry bounds on Komlós and Beck-Fiala, and Chazelle's monograph systematising the geometric side, complete the modern picture.

Bibliography Master

@article{spencer1985,
  author  = {Spencer, Joel},
  title   = {Six standard deviations suffice},
  journal = {Transactions of the American Mathematical Society},
  volume  = {289},
  number  = {2},
  pages   = {679--706},
  year    = {1985}
}

@article{beckfiala1981,
  author  = {Beck, J{\'o}zsef and Fiala, Tibor},
  title   = {``Integer-making'' theorems},
  journal = {Discrete Applied Mathematics},
  volume  = {3},
  number  = {1},
  pages   = {1--8},
  year    = {1981}
}

@inproceedings{bansal2010,
  author    = {Bansal, Nikhil},
  title     = {Constructive algorithms for discrepancy minimization},
  booktitle = {51st Annual IEEE Symposium on Foundations of Computer Science (FOCS)},
  pages     = {3--10},
  year      = {2010}
}

@article{lovettmeka2015,
  author  = {Lovett, Shachar and Meka, Raghu},
  title   = {Constructive discrepancy minimization by walking on the edges},
  journal = {SIAM Journal on Computing},
  volume  = {44},
  number  = {5},
  pages   = {1573--1582},
  year    = {2015}
}

@article{rothvoss2017,
  author  = {Rothvo{\ss}, Thomas},
  title   = {Constructive discrepancy minimization for convex sets},
  journal = {SIAM Journal on Computing},
  volume  = {46},
  number  = {1},
  pages   = {224--234},
  year    = {2017}
}

@article{banaszczyk1998,
  author  = {Banaszczyk, Wojciech},
  title   = {Balancing vectors and Gaussian measures of $n$-dimensional convex bodies},
  journal = {Random Structures \& Algorithms},
  volume  = {12},
  number  = {4},
  pages   = {351--360},
  year    = {1998}
}

@article{lsv1986,
  author  = {Lov{\'a}sz, L{\'a}szl{\'o} and Spencer, Joel and Vesztergombi, Katalin},
  title   = {Discrepancy of set-systems and matrices},
  journal = {European Journal of Combinatorics},
  volume  = {7},
  number  = {2},
  pages   = {151--160},
  year    = {1986}
}

@article{tao2016,
  author  = {Tao, Terence},
  title   = {The Erd{\H{o}}s discrepancy problem},
  journal = {Discrete Analysis},
  volume  = {2016:1},
  pages   = {1--29},
  year    = {2016}
}

@book{matousek1999,
  author    = {Matou{\v{s}}ek, Ji{\v{r}}{\'\i}},
  title     = {Geometric Discrepancy: An Illustrated Guide},
  publisher = {Springer},
  year      = {1999}
}

@book{chazelle2000,
  author    = {Chazelle, Bernard},
  title     = {The Discrepancy Method: Randomness and Complexity},
  publisher = {Cambridge University Press},
  year      = {2000}
}

@book{alonspencer2016,
  author    = {Alon, Noga and Spencer, Joel H.},
  title     = {The Probabilistic Method},
  edition   = {4},
  publisher = {Wiley-Interscience},
  year      = {2016}
}

Prerequisites

40.07.06

Tier anchors

beginner: Alon-Spencer 2016 *The Probabilistic Method* 4e (Wiley) Ch. 13 (Discrepancy); a 'split a committee into two fair halves on every issue at once' analogy for two-colouring a set system to keep every set balanced
intermediate: Alon-Spencer 2016 *The Probabilistic Method* 4e Ch. 13 §13.1-13.2 (the random-colouring bound, Spencer's six-standard-deviations theorem via the partial-colouring / entropy method, the Beck-Fiala theorem and its floating-colour linear-algebra proof); Matoušek 1999 *Geometric Discrepancy* (Springer) Ch. 4 (combinatorial discrepancy, the partial-colouring lemma)
master: Alon-Spencer 2016 *The Probabilistic Method* 4e Ch. 13 (Spencer, Beck-Fiala, the entropy/partial-colouring method); Spencer 1985 *Trans. Amer. Math. Soc.* 289 (Six standard deviations suffice); Beck-Fiala 1981 *Discrete Appl. Math.* 3; Bansal 2010 *FOCS* (constructive Spencer via SDP); Lovett-Meka 2015 *SIAM J. Comput.* 44 (the random-walk algorithm); Chazelle 2000 *The Discrepancy Method* (Cambridge)

References

Alon, N. & Spencer, J. H. — The Probabilistic Method · Wiley, 4th edition (2016). Chapter 13 ("Discrepancy") develops combinatorial discrepancy from scratch: the discrepancy $\mathrm{disc}(\mathcal{S})$ of a set system $\mathcal{S}$ on a ground set, defined as the minimum over two-colourings $\chi:[n]\to\{-1,+1\}$ of $\max_{S\in\mathcal{S}}|\chi(S)|$ where $\chi(S)=\sum_{i\in S}\chi(i)$; the random-colouring bound $\mathrm{disc} = O(\sqrt{n\log m})$ for $m$ sets via a union bound over the Chernoff tails; Spencer's theorem that $n$ sets on $n$ points have discrepancy at most $6\sqrt n$ ("six standard deviations suffice"), beating the random bound by the $\sqrt{\log n}$ factor, proved by the entropy/partial-colouring method — a pigeonhole-on-entropy argument producing a partial colouring of small discrepancy, then iterated geometrically; and the Beck-Fiala theorem that a set system in which every point lies in at most $t$ sets has discrepancy strictly less than $2t$, proved by the floating-colour linear-algebra argument (variables drift to $\pm 1$ while tight constraints are frozen). The chapter states the Beck-Fiala conjecture $O(\sqrt t)$ and the Komlós conjecture, and connects discrepancy to derandomized rounding.
Spencer, J. — Six standard deviations suffice · *Transactions of the American Mathematical Society* 289 (1985), 679-706. Proves that for any family of $n$ finite sets on $n$ points there is a two-colouring $\chi:[n]\to\{\pm1\}$ with $|\chi(S)|\le 6\sqrt n$ for every set $S$, so $\mathrm{disc}(\mathcal{S})\le 6\sqrt n = O(\sqrt n)$ — an $\sqrt{\log n}$ improvement over the random bound $O(\sqrt{n\log n})$ for $m=n$ sets. The proof is the partial-colouring method: a counting/entropy (pigeonhole) argument shows that among the $2^n$ colourings, two agree on the rounded discrepancy vector across all sets, and their difference is a partial colouring (colouring at least $n/2$ of the points $\pm1$, the rest $0$) with discrepancy $O(\sqrt n)$ on every set; iterating on the uncoloured points, with geometrically shrinking ground set, sums to the $O(\sqrt n)$ bound. The same method gives the tight $\mathrm{disc} = O(\sqrt{n}\log(2m/n))$ for $m\ge n$ sets.
Beck, J. & Fiala, T. — Integer-making theorems · *Discrete Applied Mathematics* 3 (1981), 1-8. Proves the Beck-Fiala theorem: if $\mathcal{S}$ is a set system on $[n]$ of degree at most $t$ — every point belongs to at most $t$ sets — then $\mathrm{disc}(\mathcal{S}) < 2t$, with a bound independent of $n$ and $m$. The proof is the floating / linear-algebra argument: start with the all-zero fractional colouring $x_i = 0$ and let the $x_i\in[-1,1]$ drift; a set is "active" while $\sum_{i\in S}|{\cdot}|$ constraints leave it under-determined, and a counting argument (active-set/active-variable comparison) shows there is always a direction in the kernel of the active constraints along which to move the floating variables until one hits $\pm1$ and is frozen; frozen variables never move again, and when a variable freezes the remaining active sets have residual discrepancy under $2t$. The paper states the $O(\sqrt t)$ conjecture.
Lovett, S. & Meka, R. — Constructive discrepancy minimization by walking on the edges · *SIAM Journal on Computing* 44 (2015), 1573-1582 (preliminary FOCS 2012). A polynomial-time randomized algorithm that, given $m$ sets on $n$ points, finds a colouring matching Spencer's $O(\sqrt n)$ bound (and more generally the partial-colouring guarantee). The algorithm performs a constrained Gaussian random walk in the cube $[-1,1]^n$: at each step it moves by a small Gaussian increment inside the subspace orthogonal to the currently tight set constraints and to the already-frozen coordinates, so that no set's discrepancy grows in expectation while variables are driven to the cube's faces. It is the constructive counterpart to Spencer's existence proof, alongside Bansal's earlier SDP-based algorithm and Rothvoss's later Gaussian-rounding method.

Estimated time

beginner: 18m
intermediate: 46m
master: 90m