38.02.02 · dynamics / symbolic-dynamics

Shifts of Finite Type, Transition Matrices, and Coding

shipped3 tiersLean: none

Anchor (Master): Lind-Marcus 1995 *An Introduction to Symbolic Dynamics and Coding* (Cambridge University Press) Ch. 2-7 (edge shifts, the conjugacy problem, state splitting, strong shift equivalence, the Decomposition and Classification theorems, sofic shifts); Kitchens 1998 *Symbolic Dynamics* (Springer Universitext) Ch. 2; Williams 1973 *Classification of subshifts of finite type* (Annals of Mathematics 98) — the origin of strong shift equivalence; Bowen-Lanford 1970 *Zeta functions of restrictions of the shift transformation* (Proc. Symp. Pure Math. 14) — the rationality of the SFT zeta function

Intuition Beginner

A subshift is the set of all sequences that avoid some list of forbidden patterns. The friendliest case is when the only patterns you forbid are pairs of adjacent letters: you draw a small map of dots, one dot per letter, and an arrow from letter to letter exactly when the pair is allowed. An allowed sequence is then nothing more than an endless walk that always travels along arrows of your map. The whole system is captured by a tiny picture, and the messy question "which infinite strings are legal?" becomes the concrete question "which walks does this graph permit?"

The map can be packaged as a square table of zeros and ones: put a in row , column when there is an arrow from to , and a when there is none. This table is the transition matrix. It looks just like the probability table of a random walk, but it has thrown the probabilities away and kept only the bare fact of which steps are possible. That single change — from "how likely" to "is it allowed" — is the whole difference between the random world and the symbolic one, and it is why the same little graph governs both.

The surprise is how much the table remembers. Multiply the table by itself times and the entries count the walks of length . The diagonal then counts the loops — the walks that return to where they started — which are exactly the sequences that repeat with period . So a question about repeating patterns in infinite strings becomes a question you answer by multiplying a small matrix.

Visual Beginner

Picture three dots labelled , , , with arrows showing which letter may follow which. Say may go to or , may go only to , and may go to . An allowed sequence is any endless trip that always follows an arrow: you can sit at for a while, then hop to , then you are forced to , then back to , and so on.

Beside the picture sits its table. Reading row : a under and under , a under , because may go to or but not straight to . The table below contrasts the two ways of describing the same allowed sequences.

description the graph picture the matrix picture
what a letter is a dot a row-and-column index
what "allowed pair" means an arrow a in row , column
an allowed sequence an endless walk along arrows a sequence of indices with each step a
how many length- walks count paths by hand add up the entries of the table multiplied times
sequences repeating with period closed loops of length the diagonal sum of that -th power

Worked example Beginner

Take the golden-mean rule on two letters and : the only forbidden pair is . The graph has an arrow , an arrow , and an arrow , but no arrow . Its transition matrix, with rows and columns ordered then , is $$ A = \begin{pmatrix} 1 & 1 \ 1 & 0 \end{pmatrix}. $$ We will count the sequences that repeat with period by computing the diagonal sum of powers of .

Step 1. Square the matrix. Multiplying by itself, $$ A^2 = \begin{pmatrix} 1 & 1 \ 1 & 0 \end{pmatrix}\begin{pmatrix} 1 & 1 \ 1 & 0 \end{pmatrix} = \begin{pmatrix} 2 & 1 \ 1 & 1 \end{pmatrix}. $$ The diagonal entries are and , and they add to . So there are closed walks of length .

Step 2. Multiply once more. From and , $$ A^3 = \begin{pmatrix} 2 & 1 \ 1 & 1 \end{pmatrix}\begin{pmatrix} 1 & 1 \ 1 & 0 \end{pmatrix} = \begin{pmatrix} 3 & 2 \ 2 & 1 \end{pmatrix}. $$ The diagonal entries are and , adding to . So there are closed walks of length .

Step 3. Read off the diagonal sums. The diagonal sum of is ; of it is ; of it is . These count the period-, period-, and period- repeating sequences allowed by the golden-mean rule (counting each by the length of its repeating block).

Step 4. Notice the Fibonacci flavour. The numbers that come out are the Lucas numbers, the close cousins of the Fibonacci counts of allowed words. Both grow at the same rate, set by the golden ratio.

What this tells us: the single banned pair is enough to fix the count of every repeating sequence, and you recover those counts by the purely mechanical act of multiplying a two-by-two table and adding its diagonal. The graph forbids one arrow; the matrix does all the bookkeeping.

Check your understanding Beginner

Formal definition Intermediate+

Fix a finite alphabet and recall from 38.02.01 that a subshift is a closed, shift-invariant subset of the full shift , and that a subshift defined by a finite set of forbidden words is a subshift of finite type (SFT). When the forbidden words all have length , the SFT is a topological Markov chain: it is described by a transition matrix indexed by the alphabet , with iff the block is allowed. Then $$ \mathsf{X}A = { x \in \mathcal{A}^{\mathbb{Z}} : A{x_n x_{n+1}} = 1 \text{ for all } n \in \mathbb{Z} }. $$ This matrix is the topological analogue of the stochastic transition matrix of a Markov chain 37.05.02: where the stochastic matrix records the probability of , the transition matrix records merely whether is permitted. The two share the same support graph (vertices , an edge iff ), and every combinatorial notion below is read off alone.

Definition (vertex and edge shifts). Let be a finite directed graph with vertex set and edge set . The vertex shift is the set of bi-infinite vertex sequences with an edge for every ; this is for the vertex adjacency matrix. The edge shift is the set of bi-infinite edge sequences with ; this is the vertex shift of the line graph of , and it is the SFT defined by the edge adjacency. Multiple edges between two vertices are allowed, so the edge adjacency matrix is a general non-negative integer matrix, not merely .

Definition (higher-block presentation). For , the -th higher-block code sends to the sequence whose -th symbol is the block , an element of the allowed--block alphabet . It is an invertible sliding-block code onto its image, the higher-block presentation .

Definition (irreducible, primitive, period). The matrix (equivalently the graph ) is irreducible if for every pair there is with — the graph is strongly connected. It is primitive if there is a single with entrywise. The period of an irreducible is , the gcd of the lengths of closed walks; the value is independent of the vertex . An irreducible matrix is primitive iff (aperiodic). These are exactly the irreducibility/period notions of 37.05.02, now read off the support graph.

Definition (conjugacy and the conjugacy problem). Two subshifts are topologically conjugate, written , if there is a shift-commuting homeomorphism ; by the Curtis-Hedlund-Lyndon theorem 38.02.01 this is an invertible sliding-block code. The conjugacy problem for SFTs asks for a decidable invariant complete for .

Counterexamples to common slips

  • Irreducible is not the same as primitive. The matrix is irreducible (its graph is the -cycle, strongly connected) but not primitive: is never strictly positive, because it alternates between and . Its period is . Primitivity demands aperiodicity on top of strong connectedness.

  • The transition matrix need not be symmetric, and direction matters. means is allowed; this says nothing about . The golden-mean matrix is symmetric by accident; a one-way rule like "allow but forbid " gives an asymmetric with a different support graph.

  • Conjugacy is finer than equal entropy. Two irreducible SFTs can have the same Perron eigenvalue (hence equal topological entropy) yet fail to be conjugate. Entropy is a conjugacy invariant, not a complete one; the complete invariant is strong shift equivalence (Master tier), and even the weaker shift equivalence does not suffice in general.

  • Higher-block presentations change the alphabet but not the system. Passing from to replaces letters by overlapping -blocks; the two are conjugate by construction. An "-step SFT" (forbidden words of length ) is therefore a -step SFT in disguise, after recoding.

Key theorem with proof Intermediate+

Theorem (the topological Markov dictionary, periodic-point count, and zeta function). Let be a transition matrix with subshift and support graph .

(a) Recoding. Every SFT is topologically conjugate to a -step SFT, and every -step vertex SFT is conjugate to an edge shift; thus every SFT is conjugate to the edge shift of a finite graph.

(b) Dictionary. irreducible (i.e. strongly connected) iff is topologically transitive; primitive (strongly connected and aperiodic) iff is topologically mixing.

(c) Periodic points and zeta. The number of points of period is , and the Artin-Mazur zeta function is rational: $$ \zeta_\sigma(t) = \exp!\Big( \sum_{n \ge 1} \frac{#\mathrm{Fix}(\sigma^n)}{n}, t^n \Big) = \det(I - tA)^{-1}. $$

(See [Lind-Marcus 1995 Ch. 2-4], [Brin-Stuck 2002 §3.3-§3.5], [Bowen-Lanford 1970].)

Proof. Part (a). If is an SFT with forbidden words of length , its higher-block presentation (with alphabet the allowed -blocks) has forbidden words of length only: a pair of overlapping -blocks is allowed iff their overlap is consistent and the combined -block is allowed. Thus is a -step SFT conjugate to via the invertible higher-block code . A -step vertex SFT becomes an edge shift by taking the graph whose vertices are the letters and drawing one edge for each allowed pair: the map sending a vertex sequence to its sequence of traversed edges is a conjugacy onto the edge shift .

Part (b). Topological transitivity means a dense orbit, equivalently that for any two allowed blocks there is an allowed block with allowed. Given letters last symbol of and first symbol of , such a exists iff there is a walk in , i.e. iff for some — exactly strong connectedness of , i.e. irreducibility of . Topological mixing means that for all there is with an allowed connecting block of every length ; the connecting blocks of length correspond to walks of length , and walks of all large lengths exist between every ordered pair iff for all large , i.e. iff is primitive. (For an irreducible with period , walk lengths are confined to a fixed residue class mod , so mixing fails; this is the cyclic structure of 37.05.02.)

Part (c). A point with is a bi-infinite sequence of period , determined by one period with for all taken mod ; this is precisely a closed walk of length in . The number of closed walks of length based at vertex is , so the total is . For the zeta function, write the eigenvalues of as (with multiplicity), so . Then $$ \sum_{n \ge 1} \frac{\operatorname{tr}(A^n)}{n} t^n = \sum_i \sum_{n \ge 1} \frac{(\lambda_i t)^n}{n} = -\sum_i \log(1 - \lambda_i t) = -\log \prod_i (1 - \lambda_i t) = -\log\det(I - tA), $$ using . Exponentiating gives , a rational function.

Bridge. This theorem builds toward the entire conjugacy theory of symbolic systems and appears again in every entropy and classification computation, because it reduces three apparently different questions — recoding, mixing, and counting — to linear algebra on one non-negative matrix. The foundational reason the same matrix answers all three is that an allowed sequence is a walk and a periodic point is a closed walk, so combinatorics on and spectral data of are two readings of one object; this is exactly the principle that makes the trace count loops and the determinant package them into a zeta function. The dictionary half is dual to the probabilistic dictionary of 37.05.02: irreducibility and period are graph properties shared by the matrix and the stochastic matrix on the same support, and the central insight is that topological transitivity here plays the role of irreducibility there while topological mixing plays the role of aperiodicity. Putting these together, the recoding statement promotes every finite-type system to an edge shift, so the conjugacy problem of the next unit becomes a problem about non-negative integer matrices up to a graph-rewriting equivalence — the bridge from forbidden words to matrices that the Master tier turns into strong shift equivalence.

Exercises Intermediate+

Advanced results Master

Theorem (state splitting and the Decomposition Theorem; Williams). Every topological conjugacy between edge shifts is a composition of in-splittings, out-splittings, and their inverses (amalgamations). Two edge shifts are topologically conjugate iff the non-negative integer matrices and are strong shift equivalent: there are non-negative integer rectangular matrices and a chain with , (each step an elementary equivalence). (See [Williams 1973], [Lind-Marcus 1995 Ch. 7].)

Strong shift equivalence is the exact algebraic shadow of conjugacy. A single elementary equivalence , realises one round of state splitting: records how each state of splits into states of and records the reassembly, and , encode that the two graphs present the same bi-infinite walk set. The Decomposition Theorem says these elementary moves generate all of conjugacy, so the conjugacy problem for SFTs is the decidability problem for strong shift equivalence — a problem whose general decidability remained open for decades and whose subtlety is the reason the coarser shift equivalence (, for some , with , ) was introduced as a computable approximation. Kim and Roush showed shift equivalence is decidable; whether it equals strong shift equivalence (Williams' conjecture) is false in general, settled by Kim-Roush in the 1990s with reducible counterexamples.

Theorem (entropy, the Perron eigenvalue, and Lind's theorem). For an irreducible SFT with transition matrix , the topological entropy is , where is the Perron-Frobenius eigenvalue (spectral radius) of . A real number is the entropy-exponential of some irreducible SFT iff is a Perron number (an algebraic integer strictly dominating its other conjugates in modulus). (Lind-Marcus 1995 [Lind-Marcus 1995 Ch. 4].)

The entropy formula is Perron-Frobenius applied to the word-count growth , and it shows entropy to be a conjugacy invariant taking values in . This is the topological counterpart of the Perron eigenvalue governing the convergence rate of the stochastic chain in 37.05.02: there the eigenvalue carries the stationary distribution and the spectral gap controls mixing, here the eigenvalue carries the entropy and the rest of the spectrum carries the zeta-function poles. Lind's characterisation closes the realisation question — every Perron number, and no other, is an SFT growth rate — exhibiting the spectral radius as the bridge from combinatorial complexity to algebraic number theory.

Theorem (sofic shifts as factors of SFTs; the Krieger and Fischer covers). A subshift is sofic iff it is a sliding-block factor of an SFT, iff it is the set of label-sequences of a finite labelled graph, iff its language is regular. Every irreducible sofic shift has a unique minimal right-resolving presentation, the Fischer cover, and a canonical Krieger cover; sofic-but-not-SFT shifts exist, the even shift being canonical. (Lind-Marcus 1995 [Lind-Marcus 1995 Ch. 3].)

Sofic shifts sit one level above finite type: an SFT records a language defined by finitely many forbidden factors, while a sofic shift records an arbitrary regular language, presented by a finite automaton whose underlying edge shift is an SFT and whose label map is the factor code. The Fischer cover is the minimal such automaton among irreducible right-resolving presentations, playing the role for sofic shifts that the transition matrix plays for SFTs, and it makes entropy and zeta-function computations for sofic shifts reduce to matrix computations on the cover. The even shift's failure to be finite type — no bounded memory can enforce a global parity constraint — is exactly the gap between regular languages and the bounded-factor languages of SFTs.

Synthesis. These results are one architecture seen at four resolutions, and putting these together the entire finite-type theory becomes linear algebra on non-negative integer matrices up to strong shift equivalence. The central insight is that an SFT is its transition matrix modulo state splitting: the Decomposition Theorem makes conjugacy a chain of elementary equivalences , so the foundational reason entropy, periodic-point counts, and the zeta function are conjugacy invariants is that each is a strong-shift-equivalence invariant of the matrix — the trace data , the determinant , and the spectral radius all survive the move because and share their nonzero spectrum. This is exactly the principle that the periodic-point count is the matrix trace and the zeta function is its determinant package, now promoted to a classification statement. The dictionary half is dual to the probabilistic theory of 37.05.02: the matrix is the support shadow of the stochastic matrix, irreducibility and period are shared by both on the common support graph, and the Perron eigenvalue that controls mixing there controls entropy here — topological and measured Markov dynamics share one combinatorial skeleton. The bridge onward is that sofic shifts generalise SFTs to regular languages, embedding symbolic dynamics in automata theory, so the classification of finite-state symbolic systems is the classification of non-negative integer matrices up to strong shift equivalence, the central problem this unit hands to the conjugacy theory downstream.

Full proof set Master

Proposition 1 (periodic points are closed walks). For a transition matrix , .

Proof. A point with satisfies for all , so it is the bi-infinite repetition of the word , subject to for with . This is exactly a closed walk of length in . The number of closed walks of length based at is — by the standard induction number of length- walks , since counts walks by their penultimate vertex. Summing the diagonal over base vertices gives .

Proposition 2 (rationality of the zeta function). .

Proof. Let be the eigenvalues of with algebraic multiplicity, so (Newton; the trace of a power is the power sum of eigenvalues). Then for small, $$ \sum_{n \ge 1} \frac{\operatorname{tr}(A^n)}{n} t^n = \sum_i \sum_{n \ge 1} \frac{(\lambda_i t)^n}{n} = -\sum_i \log(1 - \lambda_i t). $$ Exponentiating, , since . The right side is rational in .

Proposition 3 (irreducible transitive). is topologically transitive iff is irreducible.

Proof. Transitivity is equivalent to: for all allowed blocks there is a block with allowed (this builds a dense orbit by concatenating an enumeration of block pairs, and conversely a dense orbit realises every such junction). Writing for the last letter of and for the first of , the existence of is the existence of a walk in , i.e. for some . Quantifying over all , this is strong connectedness of , which is irreducibility of .

Proposition 4 (primitive mixing). is topologically mixing iff is primitive.

Proof. Mixing means: for all allowed there is such that for every a connecting block of length exists with allowed and the gap equal to . With as before, connecting blocks of length correspond to walks of length , counted by . Thus mixing says for all and all large , i.e. entrywise for all large . A non-negative matrix with for some has for all when it is irreducible with a self-loop somewhere; in general for all large is precisely primitivity (irreducible and aperiodic, by Perron-Frobenius). Conversely a primitive matrix has for all some , giving mixing.

Proposition 5 (higher-block conjugacy). The higher-block code is a topological conjugacy, and is a -step SFT for at least the longest forbidden-word length minus one.

Proof. is a sliding-block code with window and block map , hence continuous and shift-commuting 38.02.01. Its inverse is the -block code reading the first symbol of each -block, , also a sliding-block code; and on the image, so is an invertible sliding-block code, a conjugacy onto . In the image alphabet , two consecutive symbols , are compatible iff overlap in their length- inner block and the merged -block is allowed; this is a length- constraint, so the image is -step. For one less than the longest forbidden word, every forbidden constraint becomes a length- constraint on , so is a -step SFT.

Proposition 6 (elementary equivalence preserves the nonzero spectrum, hence entropy and zeta). If and with non-negative integer matrices, then and have the same nonzero eigenvalues with multiplicity; consequently for all , , and up to the factor a power of accounting for zero eigenvalues.

Proof. For any (possibly rectangular) with defined, and have equal nonzero eigenvalues with multiplicity: if with , then and , so is a -eigenvector of ; the map is injective on the -eigenspace for (as forces ), and symmetrically, giving equal multiplicities. Hence the power sums agree for , the spectral radii agree (), and and differ only by the unit factor from the differing counts of zero eigenvalues, which is . Therefore strong shift equivalence — a chain of such elementary moves — preserves entropy and the zeta function, confirming they are conjugacy invariants.

Connections Master

  • Shift spaces and subshifts 38.02.01. This unit specialises the general theory of that prerequisite to the finite-type case: the forbidden-word characterisation and the Curtis-Hedlund-Lyndon theorem proved there are the tools that turn an SFT into an edge shift and a conjugacy into an invertible sliding-block code, and the golden-mean and even shifts introduced there are the running examples whose transition matrix, entropy, and sofic-vs-SFT status are computed here. The upstream unit owns the definition of subshift and code; this unit owns the matrix theory and the conjugacy classification.

  • Class structure, irreducibility, and periodicity of Markov chains 37.05.02. The transition matrix here is the topological shadow of the stochastic transition matrix there: communication classes, irreducibility, and the gcd-of-cycle-lengths period transfer verbatim from the stochastic matrix to its support graph, with topological transitivity replacing irreducibility and topological mixing replacing aperiodicity. Where the probabilistic theory weights walks by probability and extracts a stationary distribution from the Perron eigenvalue , the symbolic theory counts walks and extracts entropy from the Perron eigenvalue ; both read off the same directed graph, and this unit's hooks_out records the link.

  • Dynamical systems, orbits, and limit sets 38.01.01. The chaotic systems catalogued abstractly there — the doubling map, the Smale horseshoe, the hyperbolic toral automorphism — are each conjugate, via a Markov partition, to a subshift of finite type studied here, so the transition matrix of the present unit is the concrete combinatorial model of the recurrence, topological transitivity, and mixing defined there. The periodic-point count computes the periodic orbits of those smooth systems through their symbolic codings.

Historical & philosophical context Master

Subshifts of finite type appear implicitly in the symbolic codings of Hadamard (1898) and of Morse and Hedlund (1938), but their systematic theory as topological Markov chains was developed in the 1960s and 1970s alongside the smooth-ergodic-theory program of Anosov, Sinai, and Bowen, who used Markov partitions to code Axiom A diffeomorphisms by SFTs. The rationality of the Artin-Mazur zeta function for a subshift of finite type was established by Rufus Bowen and Oscar Lanford in their 1970 note Zeta functions of restrictions of the shift transformation (Proc. Symp. Pure Math. 14) [Bowen-Lanford 1970], adapting the Artin-Mazur definition of a dynamical zeta function to the symbolic setting.

The conjugacy problem for SFTs was reframed as an algebraic problem by Robert Williams in his 1973 paper Classification of subshifts of finite type (Annals of Mathematics 98) [Williams 1973], which introduced strong shift equivalence and shift equivalence of non-negative integer matrices and proved the Decomposition Theorem that every conjugacy of edge shifts factors through state splittings. Williams conjectured that shift equivalence and strong shift equivalence coincide; the question stood for two decades until Ki Hang Kim and Fred Roush constructed reducible counterexamples in the early 1990s, with the irreducible case resolved later, leaving the general decidability of strong shift equivalence — and hence of SFT conjugacy — among the central open problems of the field. The textbook synthesis is Douglas Lind and Brian Marcus' 1995 An Introduction to Symbolic Dynamics and Coding (Cambridge) [Lind-Marcus 1995], which organises the subject around edge shifts, the transition matrix, entropy and the Perron number characterisation due to Lind, sofic shifts, and the Williams classification.

Bibliography Master

@book{LindMarcus1995sft,
  author    = {Lind, Douglas and Marcus, Brian},
  title     = {An Introduction to Symbolic Dynamics and Coding},
  publisher = {Cambridge University Press},
  year      = {1995}
}

@book{BrinStuck2002sft,
  author    = {Brin, Michael and Stuck, Garrett},
  title     = {Introduction to Dynamical Systems},
  publisher = {Cambridge University Press},
  year      = {2002}
}

@article{Williams1973,
  author  = {Williams, Robert F.},
  title   = {Classification of subshifts of finite type},
  journal = {Annals of Mathematics},
  volume  = {98},
  year    = {1973},
  pages   = {120--153}
}

@incollection{BowenLanford1970,
  author    = {Bowen, Rufus and Lanford, Oscar E.},
  title     = {Zeta functions of restrictions of the shift transformation},
  booktitle = {Global Analysis (Proc. Sympos. Pure Math., Vol. XIV)},
  publisher = {American Mathematical Society},
  year      = {1970},
  pages     = {43--49}
}

@book{Kitchens1998sft,
  author    = {Kitchens, Bruce P.},
  title     = {Symbolic Dynamics: One-sided, Two-sided and Countable State Markov Shifts},
  publisher = {Springer-Verlag},
  series    = {Universitext},
  year      = {1998}
}

@article{KimRoush1999,
  author  = {Kim, Ki Hang and Roush, Fred W.},
  title   = {The Williams conjecture is false for irreducible subshifts},
  journal = {Annals of Mathematics},
  volume  = {149},
  year    = {1999},
  pages   = {545--558}
}