Continuous-Time Markov Chains I: Q-Matrices, Jump Chains, and Holding Times
Anchor (Master): Norris 1997 *Markov Chains* (Cambridge) §2.1-2.8 (jump-chain/holding-time construction, explosion, the backward equation); Chung 1967 *Markov Chains with Stationary Transition Probabilities* 2e Ch. II; Anderson 1991 *Continuous-Time Markov Chains: An Applications-Oriented Approach* (Springer) Ch. 2
Intuition Beginner
A continuous-time Markov chain models a system that hops between a list of states, like a discrete-time chain, but now the clock runs continuously: the system sits in a state for a random length of time, then jumps to a new state, then waits again, and so on. The memoryless rule still holds, but it now has two parts. While the system sits in a state, only the current state decides how long the wait tends to be and where the next jump is likely to go. Nothing about the earlier history matters, and — this is the new ingredient — not even how long the system has already been waiting matters.
That last point forces the waiting times to be of one special kind. If the remaining wait should not depend on how long you have already waited, the wait must follow the "exponential" pattern: a continuous version of repeatedly flipping a coin to decide whether to jump now. Each state has a rate that sets how fast its exponential clock runs. A large rate means short stays and frequent jumps; a small rate means long, patient stays.
So a continuous-time chain is built from two ingredients you can picture separately. First, an ordinary discrete chain that says, "when you do jump out of state , here are the odds of where you land." This is the jump chain. Second, for each state, a single number — the rate — that runs the exponential waiting clock. Bundle the jump odds and the rates together and you get one table, the generator or -matrix, that packages both the where and the when.
One subtlety gives the theory its only real surprise. If the rates keep getting bigger as the system wanders into faster and faster states, the waits can shrink so quickly that infinitely many jumps pile up into a finite stretch of time. The chain then "runs off the end of the clock," an event called explosion. Most chains do not explode, and there is a clean test telling them apart.
The one-sentence takeaway: a continuous-time chain waits an exponential time on each state, set by that state's rate, then jumps according to a discrete jump chain, and the rate-and-jump data is exactly the -matrix — provided the jumps do not pile up into an explosion.
Visual Beginner
Picture three states drawn as circles, with the system tracing a step-like path in time: flat stretches where it waits, and sudden vertical hops where it jumps.
Left: the chain as a labeled graph, each circle carrying a rate that runs its exponential clock; the arrows out of a circle carry the jump-chain odds. Right: a sample path as a staircase in time — flat holding times set by the rates, vertical jumps set by the jump chain. The inset shows the one pathology, explosion, where shrinking waits crowd infinitely many jumps into a finite time.
Worked example Beginner
We follow a two-state chain. State has rate (so its average wait is ), and state has rate (average wait ). From state the only place to jump is state , and from state the only place to jump is state , so the jump chain just alternates .
Step 1. Start in state . The wait is exponential with rate . The average wait is time unit. Say this particular wait comes out to .
Step 2. At time the system jumps. The jump chain from state sends it to state with chance , so it lands in state .
Step 3. Now in state , the wait is exponential with rate , average . Say this wait is . The system holds state from time until time .
Step 4. At time it jumps back to state , and the pattern repeats with a fresh exponential wait of rate .
Step 5. Read off the long-run fraction of time in each state. On average the system spends a unit in state and unit in state per round trip, so out of every units it spends in state . The fraction of time in state is , and in state it is .
What this tells us: the rates alone fixed the average waits, the jump chain alone fixed the order of visits, and combining them gave both a sample path and a long-run time split. The state with the slower clock (smaller rate) collects more of the time, which matches the picture of patient versus hurried states.
Check your understanding Beginner
Formal definition Intermediate+
Throughout, is a countable state space and time is continuous, indexed by . All random objects live on a probability space 37.01.01. The discrete-time apparatus of stochastic matrices, the simple and strong Markov properties, and the finite-dimensional law from 37.05.01 is assumed.
Definition (-matrix; generator). A -matrix (or generator) on is a family such that Each row therefore sums to zero. Write for the total jump rate (the holding rate) out of . A -matrix with for every (built into the definition above) is stable; if moreover it is uniform or bounded. The matrix is conservative when the row sums are exactly zero, as imposed here; allowing corresponds to an extra killing rate , excluded in this unit.
Definition (jump matrix). The jump matrix (or jump chain matrix) associated with is the stochastic matrix A state with is absorbing: once entered, it is never left, and keeps the chain there. For the row is the conditional distribution of the next state given a jump out of , and recovers the off-diagonal generator entries. Thus the pair and the matrix carry identical information.
Definition (minimal chain via jump chain and holding times). Let be a discrete-time chain 37.05.01, the jump chain. Independently, let be i.i.d. random variables, and set the holding times (with if ). The jump times are , , and the explosion time is . The minimal continuous-time chain is
where is an adjoined cemetery state used only to define past explosion; before the path is right-continuous and piecewise constant with jumps at the . Conditional on , the holding time is and is independent of and of the past, the defining structural fact of the construction.
Definition (explosion and non-explosion). The chain is non-explosive (or honest) from if , and explosive if for some . Non-explosion means for all almost surely, so the cemetery is never reached and is a genuine -valued process for all time.
Counterexamples to common slips Intermediate+
The -matrix is not a stochastic matrix. Its rows sum to , not , and its diagonal is non-positive. The associated transition semigroup (when defined) is stochastic; is its derivative at , the continuous-time analogue of the discrete generator of
37.05.01, not of itself.Holding times are exponential, not the increments of the jump chain. The randomness splits cleanly: the jump chain decides where, the exponentials decide when, and the two are independent given the visited states. Confusing the holding rate with a transition probability conflates the clock with the routing.
for a non-absorbing state. The jump chain never stays put when ; self-transitions of the continuous chain are invisible because a jump from to would not change the path. Self-loops in are therefore meaningless — only off-diagonal carry information.
Explosion is genuine and rate-driven. A finite-state -matrix, or any bounded one with , never explodes; explosion needs along a path the jump chain can traverse fast enough. The Markov property alone does not rule it out.
Key theorem with proof Intermediate+
Theorem (jump chain and holding times -matrix; the competing-exponentials law). Let be a -matrix on with jump matrix and rates . Construct as in the minimal-chain definition. Then for a state with , started at :
(a) the holding time is ;
(b) the first jump target has law , i.e. ;
(c) and are independent.
Conversely, if a process holds for an time and then jumps to with probability , independently, its generator entries are off-diagonal and , recovering [Norris 1997 §2.1-2.2].
Proof. Model the dynamics out of by a family of independent competing exponential clocks. For each with , let be independent across , and let the chain jump to whichever rings first.
Step 1 (the minimum of independent exponentials). Let . For , independence gives since . Hence , proving (a) with .
Step 2 (which clock rings first). The probability that clock is the strict minimum is computed by conditioning on and requiring every other clock to exceed : using in the exponent. This proves (b): the jump target has law .
Step 3 (independence of when and where). Compute the joint law of the first-ring time and its identity. For the event that rings first and this happens after time ,
The joint probability factors into the marginal of the target and the marginal of the time, so and are independent, proving (c). The competing-exponentials model therefore produces exactly an holding time and an independent -distributed jump, which is the minimal-chain construction; and the lack-of-memory of the exponential makes the resulting process Markov, with the same data regenerating at each jump by the strong Markov property of 37.05.01.
Converse. Given holding rate and jump law , set for and . Then off-diagonal and , so is a conservative -matrix, and the construction run from returns the given holding-and-jump dynamics.
Bridge. This theorem builds toward the entire analytic theory of continuous-time chains and appears again in the forward and backward Kolmogorov equations, where becomes the generator of the transition semigroup . The foundational reason the construction is canonical is that the memoryless exponential is the unique continuous holding law compatible with the Markov property, so the only freedom left is the routing matrix : this is exactly the split of the dynamics into a when (the rate ) and a where (the row ), and the identity is the dictionary between them. The discrete jump chain is the embedded skeleton governed by the discrete theory of 37.05.01, so every recurrence, transience, and irreducibility question about reduces to one about ; putting these together, the continuous-time chain is the jump chain with exponential clocks bolted on, and the -matrix is dual to the diffusion generator of 02.15.03 — there the generator is a second-order differential operator on a continuum, here it is a row-zero-sum matrix on a countable set, the central insight being that both are the infinitesimal description of a Markov semigroup.
Exercises Intermediate+
Advanced results Master
Beyond the pathwise construction, the theory organizes around the equivalence of the data, the embedded jump chain as the discrete skeleton, the explosion dichotomy with Reuter's analytic criterion, and the passage from to the transition semigroup that places the matrix as the infinitesimal generator dual to the continuous-state operator of 02.15.03.
Theorem 1 (three equivalent descriptions). Let be the minimal right-continuous process on . The following data determine one another and determine the law of : (i) the conservative stable -matrix ; (ii) the pair consisting of the jump matrix and the rate vector , via , ; (iii) the requirement that, started at with , the first holding time is independent of the first jump, whose law is , and that the process restarts afresh at each jump (strong Markov at jump times). The equivalence (i) (ii) is algebraic; (ii) (iii) is the Key theorem together with the strong Markov property of the discrete jump chain 37.05.01.
Theorem 2 (the embedded jump chain governs the qualitative theory). The jump chain is a discrete-time chain, and the continuous chain's irreducibility, recurrence, and transience are read off : is irreducible iff is irreducible (on the non-absorbing states), and a state is recurrent for iff it is recurrent for . The continuous and discrete notions of visiting a state coincide because the holding times are almost surely positive and finite (for ), so the set of states visited is identical for and . The distinction between positive and null recurrence, however, involves the holding rates: positive recurrence of requires the expected return time to be finite, blending the routing of with the rates.
Theorem 3 (explosion dichotomy and Reuter's criterion). Let be a stable conservative -matrix. The minimal chain is non-explosive (honest) from every state iff the only bounded solution with to is . This is Reuter's criterion [Reuter 1957]. The quantity is the maximal such solution: it satisfies by a first-jump decomposition, and explosion ( for some ) is equivalent to the existence of a non-zero bounded solution. Sufficient conditions follow at once: a bounded -matrix (), or more generally any chain for which almost surely, is non-explosive; the pure-birth criterion is the one-dimensional instance.
Theorem 4 (from to the transition semigroup; the generator). Define for the minimal chain. Then with is a sub-stochastic semigroup: , (the continuous-time Chapman–Kolmogorov relation), and with equality iff the chain is non-explosive up to time . It satisfies the backward equation and, under non-explosion and suitable regularity, the forward equation , with when is bounded. Thus is the infinitesimal generator: the matrix analogue, on a countable state space, of the second-order differential generator of a diffusion 02.15.03. The off-diagonal is the instantaneous rate , and is the instantaneous rate of leaving .
Synthesis. The foundational reason the subject coheres is that one conservative -matrix encodes the entire law of the minimal chain, and putting these together, the jump-chain/holding-time data is exactly the same information presented as routing-plus-clock rather than as a single generator — this is the central insight that the when and the where decouple, with the memoryless exponential the unique clock the Markov property allows. The embedded jump chain is the discrete skeleton, so every recurrence and transience question generalises the discrete theory of 37.05.01 verbatim, while the rates re-enter only for positive recurrence and for the time-change between and . The explosion dichotomy is dual to the analytic question of whether admits a non-zero bounded solution: this is exactly Reuter's criterion, and the maximal solution is the bridge between the probabilistic event of explosion and the linear algebra of the generator. The passage shows the -matrix is dual to the diffusion generator of 02.15.03 — both are for their respective semigroups, one a row-zero-sum matrix on a countable set, the other a second-order elliptic operator on a continuum — so the -matrix, the jump-and-hold construction, the explosion criterion, and the generator of the semigroup are four presentations of the single fact that an infinitesimal description determines a continuous-time Markov flow whenever its jumps do not pile up.
Full proof set Master
Proposition 1 (the jump matrix is stochastic). For any stable conservative -matrix , the associated is a stochastic matrix.
Proof. For with : for since , and ; the row sum is using . For with : and for , a degenerate but valid stochastic row. Hence every row of is a probability distribution, so is stochastic.
Proposition 2 (competing exponentials; full joint law). Let be independent with , . Let and . Then , , and are independent, and the minimizer is almost surely unique.
Proof. For the tail, , so . For the joint law of , condition on : Setting gives ; combined with the joint factors as , proving independence. Two continuous independent variables coincide with probability zero, so the minimizer is a.s. unique.
Proposition 3 (memorylessness characterizes the exponential). A nonnegative random variable with satisfies for all iff for some (with meaning a.s.).
Proof. If then gives . Conversely the hypothesis is , the multiplicative Cauchy equation for the right-continuous non-increasing with . Right-continuous solutions of on with are for some : writing gives Cauchy's additive equation , whose monotone solutions are linear . Hence , i.e. .
Proposition 4 ( for the explosion functional). Let for the minimal chain, . Then on the non-absorbing states, and is the maximal bounded solution in .
Proof. Condition on the first holding time and the first jump , independent by the Key theorem. After the first jump the residual explosion time is with law by the strong Markov property at the jump time 37.05.01. Hence, for ,
using for . Multiplying by and rearranging with , :
which simplifies to , i.e. . Maximality: any solving the same equation satisfies for every by iterating the first-jump identity, and with , giving by dominated convergence.
Proposition 5 (Reuter's criterion). The minimal chain is non-explosive from every state iff is the only solution of in .
Proof. By Proposition 4, is the maximal bounded -solution. Non-explosion from every state means for all , equivalently a.s., equivalently for all . Since is maximal, forces every bounded -solution to vanish; conversely if some then itself is a non-zero solution and , i.e. explosion. Hence non-explosion the only solution is .
Proposition 6 (bounded gives , a stochastic semigroup). If , the series converges in operator norm on , defines a stochastic semigroup with , and equals the transition matrix of the minimal (non-explosive) chain.
Proof. On , , so and the series converges absolutely in operator norm; then satisfies , , and by the functional equation of the matrix exponential. Conservativity gives since for ; nonnegativity of follows by writing with chosen so entrywise, a convergent series of nonnegative matrices. Boundedness gives non-explosion (Exercise 7), and the forward/backward equations identify with the minimal chain's transition matrix.
Connections Master
The discrete-time chain
37.05.01supplies the embedded jump chain : it is a chain in the exact sense of that unit, and its stochastic-matrix algebra, simple and strong Markov properties, and recurrence/transience dichotomy transfer verbatim to the qualitative theory of the continuous chain; the continuous-time content added here is only the exponential clock and the explosion phenomenon that the discrete skeleton cannot see.The probability-space and extension foundations
37.01.01underwrite the construction of the pair on one probability space: the jump chain is built from a consistent family of finite-dimensional laws and the holding-time exponentials are an independent i.i.d. sequence, the joint law assembled by the Kolmogorov/Ionescu–Tulcea machinery, and the explosion time is a measurable functional of that data on path space.The continuous-state diffusion generator
02.15.03is the parallel theory on a continuum: the -matrix is to the countable-state semigroup what the second-order elliptic operator is to the diffusion semigroup ; both are , both run forward (Kolmogorov forward / Fokker–Planck) on measures and backward on observables, and the contrast is jumps on a discrete set versus continuous sample paths on , with explosion here mirroring the explosion of solutions to the SDE there.
Historical & philosophical context Master
The analytic theory of continuous-time Markov processes was founded by Andrei N. Kolmogorov in his 1931 Mathematische Annalen memoir [Kolmogorov 1931], which introduced the transition function and derived the forward and backward differential equations governing it. Kolmogorov worked from the semigroup relation — the continuous-time Chapman–Kolmogorov equation — and obtained as the matrix of instantaneous rates, though the pathwise jump-and-hold construction was not yet his concern.
William Feller, in his 1940 Transactions of the American Mathematical Society paper on purely discontinuous Markov processes [Feller 1940], analyzed the integro-differential equations of jump processes on countable state spaces and confronted the possibility that the backward equation does not determine a unique transition function — the analytic shadow of explosion and of boundary behaviour. The probabilistic resolution, separating the routing (jump matrix) from the timing (exponential holding times) and constructing the minimal process directly, is the modern synthesis presented here.
Gerd E. H. Reuter settled the existence and uniqueness question for denumerable chains in his 1957 Acta Mathematica paper [Reuter 1957], characterizing non-explosion (honesty) of the minimal process through the vanishing of every bounded -solution of , the criterion stated above. The contraction-semigroup framework he used connects the matrix to the Hille–Yosida generation theory for the transition semigroup on , placing continuous-time chains within the general theory of Markov semigroups whose continuous-state instance is the diffusion generator of 02.15.03.
Bibliography Master
@book{Norris1997,
author = {Norris, James R.},
title = {Markov Chains},
series = {Cambridge Series in Statistical and Probabilistic Mathematics},
publisher = {Cambridge University Press},
year = {1997}
}
@article{Reuter1957,
author = {Reuter, Gerd E. H.},
title = {Denumerable {M}arkov processes and the associated contraction semigroups on $l$},
journal = {Acta Mathematica},
volume = {97},
year = {1957},
pages = {1--46}
}
@article{Kolmogorov1931,
author = {Kolmogorov, Andrei N.},
title = {\"Uber die analytischen {M}ethoden in der {W}ahrscheinlichkeitsrechnung},
journal = {Mathematische Annalen},
volume = {104},
year = {1931},
pages = {415--458}
}
@article{Feller1940,
author = {Feller, William},
title = {On the integro-differential equations of purely discontinuous {M}arkoff processes},
journal = {Transactions of the American Mathematical Society},
volume = {48},
year = {1940},
pages = {488--515}
}
@book{Anderson1991,
author = {Anderson, William J.},
title = {Continuous-Time Markov Chains: An Applications-Oriented Approach},
series = {Springer Series in Statistics},
publisher = {Springer},
year = {1991}
}
@book{Chung1967,
author = {Chung, Kai Lai},
title = {Markov Chains with Stationary Transition Probabilities},
edition = {2},
publisher = {Springer},
year = {1967}
}