01.01.17 · foundations / linear-algebra

Change of basis and the transformation laws

shipped · 3 tiers · Lean: none

Anchor (Master): Shilov *Linear Algebra* Ch. 4–5; Hoffman-Kunze *Linear Algebra* §3.4 and Ch. 5–6 (similarity, equivalence, congruence); Halmos *Finite-Dimensional Vector Spaces* §§37–47 (change of basis, similarity, the dual transformation law); Lang *Algebra* Ch. XIII (matrices, similarity, and bilinear forms); Bourbaki *Algèbre* Ch. II (modules, bases, and matrices)

Intuition Beginner

A vector is a thing in the world — an arrow, a displacement, a force. The list of numbers you write down for it is not the vector itself; it is a description, and the description depends on the rulers you chose. Switch to different rulers and the same arrow gets a different list of numbers, even though nothing about the arrow has moved. A change of basis is exactly that switch of rulers, and the transformation laws are the bookkeeping that tells you how every list of numbers must change so that all the things they describe stay put.

Here is the part that surprises people. When you switch to a finer set of rulers, so that each new ruler measures a smaller step, the numbers you read off get bigger, not smaller. Cut your unit in half and every length doubles. So the coordinates move opposite to the rulers. This opposite movement is the whole point: vectors and their coordinates pull in opposite directions, and keeping track of which is which is what the laws below do.

The same idea applies to a transformation — a rule that moves the whole space. Its grid of numbers, its matrix, also depends on the rulers. Re-describe the rulers and you must re-describe the matrix, sandwiching it between the switch and its undo. Quantities that survive this sandwich untouched, like the determinant, are the real features of the transformation rather than features of your chosen rulers.

Visual Beginner

The picture shows one fixed arrow drawn twice, once against a square grid and once against a slanted, stretched grid. The arrow never moves. Against the square grid it reads as two steps right and one step up. Against the slanted grid, whose rulers point in different directions and have different lengths, the very same arrow reads as a different pair of numbers. The change-of-basis matrix is the dictionary translating one reading into the other.

Two facts are visible. The arrow is the same in both panels — the geometry does not care which grid you drew. And the two coordinate pairs are linked by a fixed dictionary that depends only on the two grids, not on which arrow you picked.

Worked example Beginner

Take the plane with the ordinary square grid. Pick a new pair of rulers: the first new ruler is the arrow $(1, 1)$ and the second is the arrow $(-1, 1)$. These are the columns of the change-of-basis matrix $$ P = \begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix}. $$ Now take the fixed vector $v$ that reads as $(2, 4)$ in the old square grid. We want its reading in the new grid.

Step 1. The new coordinates are obtained by undoing $P$, that is by multiplying by $P^{-1}$. Compute the inverse. The determinant is $\det P = (1)(1) - (-1)(1) = 2$, and $$ P^{-1} = \frac{1}{2}\begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix}. $$

Step 2. Apply it to the old coordinates $(2, 4)$: $$ P^{-1}\begin{pmatrix} 2 \\ 4 \end{pmatrix} = \frac{1}{2}\begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix}\begin{pmatrix} 2 \\ 4 \end{pmatrix} = \frac{1}{2}\begin{pmatrix} 6 \\ 2 \end{pmatrix} = \begin{pmatrix} 3 \\ 1 \end{pmatrix}. $$

Step 3. Check the answer means what it should. The reading $(3, 1)$ claims $v = 3 \cdot (1, 1) + 1 \cdot (- 1, 1)$ . Compute the right side: $3 \cdot (1, 1) = (3, 3)$ and $1 \cdot (- 1, 1) = (- 1, 1)$ , and $(3, 3) + (- 1, 1) = (2, 4)$ . That is the original $v$ . The reading is correct.

What this tells us: to convert coordinates from the old grid to the new grid you multiply by $P^{- 1}$ , not by $P$ . The matrix $P$ carries new-ruler readings back to old readings; converting an old reading into a new one runs that backward, which is why the inverse appears.
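The three steps can be replayed mechanically. The following is a minimal Python sketch using exact fractions; the helper names `inv2` and `matvec` are illustrative and written only for this example.

```python
from fractions import Fraction

def inv2(m):
    """Inverse of a 2x2 matrix given as a list of rows (exact arithmetic)."""
    (a, b), (c, d) = m
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("matrix is not invertible")
    return [[d / det, -b / det], [-c / det, a / det]]

def matvec(m, v):
    """Matrix-vector product for a list-of-rows matrix and a list vector."""
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]

# Columns of P are the new rulers (1, 1) and (-1, 1) in old coordinates.
P = [[1, -1],
     [1,  1]]
v_old = [2, 4]

v_new = matvec(inv2(P), v_old)   # old -> new uses the inverse
v_back = matvec(P, v_new)        # new -> old uses P itself

print(v_new)   # [Fraction(3, 1), Fraction(1, 1)]
print(v_back)  # [Fraction(2, 1), Fraction(4, 1)]
```

Multiplying the result by `P` returns the original reading, which is the same check as Step 3.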

Check your understanding Beginner

Exercise (easy, multiple choice).

The columns of the change-of-basis matrix $P$ are the new basis vectors written in old coordinates. To convert a vector's old coordinates into its new coordinates, which matrix do you apply?

A. $P$
B. $P^{-1}$
C. $P + P$
D. The identity matrix

Hint

$P$ turns new-grid readings into old-grid readings. You want the reverse direction.

Answer

B. $P^{- 1}$ . Because the columns of $P$ are the new rulers in old coordinates, multiplying by $P$ takes a list written in the new grid and returns the corresponding list in the old grid. Converting the other way — old coordinates into new — runs that backward, so you apply $P^{- 1}$ . Feedback-correct: coordinates convert by the inverse of the new-to-old matrix. Feedback-wrong: applying $P$ would go new-to-old; you want old-to-new, which is its undo, $P^{- 1}$ .

Formal definition Intermediate+

Let $K$ be a field and $V$ a finite-dimensional $K$ -vector space of dimension $n$ , with the notion of a basis and of coordinates relative to a basis as in 01.01.04, and the matrix of a linear map as in 01.01.05. Fix two ordered bases $B = (b_{1}, \dots, b_{n})$ and $B^{'} = (b_{1}^{'}, \dots, b_{n}^{'})$ . For a vector $v \in V$ , write $[v]_{B} \in K^{n}$ for its coordinate column relative to $B$ , the unique column with $v = \sum_{j} ([v]_{B})_{j} b_{j}$ .

Definition (change-of-basis matrix). The change-of-basis matrix from $B^{'}$ to $B$, written $P = P_{B \leftarrow B^{'}}$, is the $n \times n$ matrix whose $j$-th column is the coordinate column $[b_{j}^{'}]_{B}$ of the new basis vector $b_{j}^{'}$ expressed in the old basis $B$: $$ P_{B \leftarrow B'} = \big(\, [b'_1]_B \mid [b'_2]_B \mid \cdots \mid [b'_n]_B \,\big). $$ Equivalently $b_{j}^{'} = \sum_{i} P_{ij} b_{i}$. Because $B^{'}$ is a basis, its vectors are linearly independent, so $P$ is invertible, and $P^{-1} = P_{B' \leftarrow B}$ is the change-of-basis matrix in the reverse direction.

Coordinate transformation law (contravariance). For every $v \in V$, $$ [v]_B = P\,[v]_{B'}, \qquad\text{equivalently}\qquad [v]_{B'} = P^{-1}\,[v]_B. $$ The basis vectors transform by $P$ (the columns of $P$ are the new vectors in old coordinates), while the coordinates transform by $P^{-1}$. The coordinates move contravariantly, that is, opposite to the basis. This sign-of-direction convention is the source of the lower-versus-upper index notation in tensor analysis and is fixed here as: $P$ goes new-to-old on vectors, hence $P^{-1}$ goes old-to-new on coordinates.

Operator transformation law (similarity). Let $T : V \to V$ be a linear operator with matrix $[T]_{B}$ relative to $B$, defined by $[T v]_{B} = [T]_{B} [v]_{B}$. Then $$ [T]_{B'} = P^{-1}\,[T]_B\,P. $$ Two square matrices $A, A^{'} \in M_{n} (K)$ are similar (or conjugate) when $A^{'} = P^{-1} A P$ for some invertible $P$; the operator transformation law says that the matrices of one operator in different bases form a single similarity class. Similarity is an equivalence relation, and its equivalence classes are exactly the operators on $V$ up to choice of basis.
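A quick numerical check of the operator law, offered as a sketch under assumptions: the operator `T_B`, the basis matrix `P`, and the test vector are made-up illustrative values, and `matmul` and `inv2` are hypothetical helpers written for this example. It compares two routes to the new coordinates of $T w$: apply $T$ in the old basis and then convert, or apply the conjugated matrix directly.

```python
from fractions import Fraction

def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def inv2(m):
    """Exact inverse of a 2x2 matrix."""
    (a, b), (c, d) = m
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("matrix is not invertible")
    return [[d / det, -b / det], [-c / det, a / det]]

P = [[1, 1],
     [1, -1]]          # columns: new basis vectors in old coordinates
T_B = [[2, 1],
       [0, 3]]         # illustrative operator, written in the old basis

P_inv = inv2(P)
T_Bp = matmul(matmul(P_inv, T_B), P)       # [T]_{B'} = P^{-1} [T]_B P

w_Bp = [[5], [-2]]                          # a vector given in NEW coordinates
w_B = matmul(P, w_Bp)                       # same vector in OLD coordinates

route_old = matmul(P_inv, matmul(T_B, w_B))  # act in old basis, then convert
route_new = matmul(T_Bp, w_Bp)               # act directly in new basis

print(route_old == route_new)  # True
```

Both routes describe the same vector $T w$, which is what the similarity formula encodes.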

Passive versus active. The matrix $P$ here is a passive change of basis: the vectors of $V$ do not move; only their descriptions change. This is to be held apart from an active linear map, which moves vectors of $V$ to new vectors. Numerically the two can look identical — both are invertible matrices — but they act on different data. A passive $P$ relates two coordinate columns of the same vector; an active $T$ relates the coordinate columns of $v$ and of its image $T v$ in a single basis.

Dual transformation law (contragredience). Let $V^{*}$ be the dual space 01.01.02, with dual bases $B^{*} = (b^{1}, \dots, b^{n})$ and $B^{' *} = (b^{'1}, \dots, b^{' n})$ characterised by $b^{i} (b_{j}) = δ_{j}^{i}$ and $b^{'i} (b_{j}^{'}) = δ_{j}^{i}$. Writing a covector's coordinates as a row relative to the dual basis, the row transforms by right multiplication by $P$, that is $[φ]_{B^{'}} = [φ]_{B} P$; written as columns, the dual coordinates transform by $P^{T}$ where the primal coordinates transformed by $P^{-1}$. The dual basis is contragredient to the original: its change-of-basis matrix in the same direction as $P = P_{B \leftarrow B^{'}}$ is $P_{B^{*} \leftarrow B^{' *}} = (P^{-1})^{T} = (P^{T})^{-1}$. Covector coordinates are therefore covariant: they transform the same way the basis vectors do, by $P^{T}$ acting on the coordinate column (equivalently, by $P$ acting on the right of the coordinate row).
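As a small illustration of why the pairing does not depend on the basis, the sketch below transforms a covector row by $P$ on the right and a vector column by $P^{-1}$ on the left, then compares the two row-column products. The numbers and the helpers `matmul` and `inv2` are assumptions made for this example only.

```python
from fractions import Fraction

def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def inv2(m):
    """Exact inverse of a 2x2 matrix."""
    (a, b), (c, d) = m
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("matrix is not invertible")
    return [[d / det, -b / det], [-c / det, a / det]]

P = [[1, -1],
     [1,  1]]            # columns: new basis vectors in old coordinates

phi_B = [[3, 5]]         # covector as a 1x2 row in the old dual basis
v_B = [[2], [4]]         # vector as a 2x1 column in the old basis

phi_Bp = matmul(phi_B, P)        # rows transform by P on the right
v_Bp = matmul(inv2(P), v_B)      # columns transform by P^{-1} on the left

pair_old = matmul(phi_B, v_B)[0][0]
pair_new = matmul(phi_Bp, v_Bp)[0][0]

print(pair_old, pair_new)  # 26 26
```

The entries of `phi_Bp` are the values of the covector on the new basis vectors, which is another way to read the covariant law.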

Notation: $[v]_{B}$ is the coordinate column of $v$ in the ordered basis $B$ ; $P_{B \leftarrow B^{'}}$ is the change-of-basis matrix whose columns are the new basis vectors in old coordinates; $A^{T}$ is the transpose; $δ_{j}^{i}$ is the Kronecker delta; $A^{'} = P^{- 1} A P$ denotes similarity and $A^{'} = P^{T} A P$ denotes congruence, a distinct relation used for bilinear and quadratic forms 01.01.15.

Counterexamples to common slips

The coordinates do not transform by $P$ . Writing $[v]_{B^{'}} = P [v]_{B}$ inverts the law. The columns of $P$ are the new vectors in old coordinates, so $P$ sends a new-basis column to the matching old-basis column; the coordinate change from old to new is therefore $P^{- 1}$ . The single most common error is to apply $P$ where $P^{- 1}$ is required.
Similarity ( $P^{- 1} A P$ ) is not congruence ( $P^{T} A P$ ). The first is the law for the matrix of an operator $V \to V$ ; the second is the law for the matrix of a bilinear form $V \times V \to K$ . They agree only when $P$ is orthogonal, $P^{T} = P^{- 1}$ . Confusing them swaps the invariants: similarity preserves eigenvalues, congruence preserves the signature.
A change of basis is passive, not active. The matrix $P^{- 1} [T]_{B} P$ is the same operator $T$ re-described; it is not the composite of $T$ with a genuine motion of the space. Reading $P$ as an active map silently changes the object under study.
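The second slip can be made concrete with a non-orthogonal $P$. In the sketch below (illustrative values; `matmul`, `transpose`, and `inv2` are hypothetical helpers), the same symmetric matrix is pushed through both laws and the results differ: the similarity transform keeps the eigenvalues $\pm\tfrac12$, while the congruence transform keeps only the sign pattern.

```python
from fractions import Fraction

def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(m):
    """Transpose of a list-of-rows matrix."""
    return [list(col) for col in zip(*m)]

def inv2(m):
    """Exact inverse of a 2x2 matrix."""
    (a, b), (c, d) = m
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("matrix is not invertible")
    return [[d / det, -b / det], [-c / det, a / det]]

half = Fraction(1, 2)
A = [[0, half],
     [half, 0]]          # symmetric matrix with eigenvalues +1/2 and -1/2
P = [[1, -1],
     [1,  1]]            # invertible but NOT orthogonal (columns have length sqrt 2)

similar = matmul(matmul(inv2(P), A), P)         # operator law  P^{-1} A P
congruent = matmul(matmul(transpose(P), A), P)  # form law      P^T A P

print(similar == [[half, 0], [0, -half]])   # True
print(congruent == [[1, 0], [0, -1]])       # True
```

Rescaling the columns of `P` to unit length would make it orthogonal, and the two outputs would then agree.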

Key theorem with proof Intermediate+

Theorem (the transformation laws; Shilov Ch. 4 ^{[source pending]}; Halmos §§44–47 ^{[source pending]}). Let $B, B^{'}$ be ordered bases of the finite-dimensional space $V$, and let $P = P_{B \leftarrow B^{'}}$ be the change-of-basis matrix whose $j$-th column is $[b_{j}^{'}]_{B}$. Then $P$ is invertible, and for every vector $v \in V$ and every linear operator $T : V \to V$, $$ [v]_{B'} = P^{-1}\,[v]_B, \qquad [T]_{B'} = P^{-1}\,[T]_B\,P. $$

Proof. Invertibility of $P$ . The columns of $P$ are the coordinate columns $[b_{1}^{'}]_{B}, \dots, [b_{n}^{'}]_{B}$ . The coordinate map $v \mapsto [v]_{B}$ is a linear isomorphism $V \to K^{n}$ , so it carries the linearly independent family $b_{1}^{'}, \dots, b_{n}^{'}$ to a linearly independent family of columns; $n$ independent columns in $K^{n}$ form an invertible matrix. Hence $P \in G L_{n} (K)$ .

Coordinate law. Fix $v \in V$ and write its new coordinates $[v]_{B^{'}} = (c_{1}, \dots, c_{n})^{T}$, so $v = \sum_{j} c_{j} b_{j}^{'}$ by definition of coordinates relative to $B^{'}$. Apply the linear coordinate map $[\cdot]_{B}$ to both sides: $$ [v]_B = \Big[\sum_j c_j\, b'_j\Big]_B = \sum_j c_j\, [b'_j]_B = \sum_j c_j\, (\text{$j$-th column of } P) = P\, [v]_{B'}. $$ The first equality is the definition, the second is linearity of $[\cdot]_{B}$, the third is the definition of $P$'s columns, and the last is the column rule for matrix–vector multiplication. Thus $[v]_{B} = P [v]_{B^{'}}$, and since $P$ is invertible, $[v]_{B^{'}} = P^{-1} [v]_{B}$.

Operator law. By definition of the matrix of $T$ in a basis, $[T w]_{B^{'}} = [T]_{B^{'}} [w]_{B^{'}}$ and $[T w]_{B} = [T]_{B} [w]_{B}$ for all $w$. Start from an arbitrary $w$ and chase coordinates through the coordinate law applied to both $w$ and $T w$: $$ [T]_{B'}[w]_{B'} = [Tw]_{B'} = P^{-1}[Tw]_B = P^{-1}[T]_B[w]_B = P^{-1}[T]_B\, P\,[w]_{B'}. $$ The second equality is the coordinate law for the vector $T w$, the third is the operator's matrix in $B$, and the fourth is the coordinate law $[w]_{B} = P [w]_{B^{'}}$. The identity $[T]_{B'}[w]_{B'} = P^{-1}[T]_B P\,[w]_{B'}$ holds for every $w$, hence for every coordinate column $[w]_{B'} \in K^n$ as $w$ ranges over $V$. Two matrices that agree on all of $K^n$ are equal, so $[T]_{B'} = P^{-1}[T]_B P$. $\square$

Bridge. The operator transformation law builds toward the entire canonical-form programme: once $[T]_{B^{'}} = P^{- 1} [T]_{B} P$ , the question "what is the simplest matrix of $T$ ?" becomes "what is the simplest representative of a similarity class?", which is exactly what diagonalisation 01.01.08, the Jordan form 01.01.11, and the primary decomposition 01.01.16 answer by choosing $P$ to be an eigenvector or generalised-eigenvector matrix. The foundational reason the determinant, trace, rank, and characteristic polynomial are attached to the operator rather than to the matrix is that each is invariant under $A \mapsto P^{- 1} A P$ , so they are constant on a similarity class and therefore properties of $T$ alone. This is exactly the same algebra read on covectors with the transpose in place of the inverse: the dual basis transforms contragrediently, by $P^{T}$ , which generalises to the upper-versus-lower index transformation law of tensors 13.02.01 and appears again in the congruence law $A \mapsto P^{T} A P$ for bilinear and quadratic forms 01.01.15, where the invariant is no longer the eigenvalue but the signature. Putting these together, change of basis is the single grammatical rule — conjugate operators, transpose-invert covectors, sandwich forms — from which the invariants of every later structure are read off.

Exercises Intermediate+

Exercise 3 (medium, symbolic).

Over $R^{2}$ with standard basis $B$, let $B^{'} = (b_{1}^{'}, b_{2}^{'})$ with $b_{1}^{'} = (1, 1)$, $b_{2}^{'} = (1, -1)$, and let $T$ have standard matrix $[T]_{B} = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$ (the swap of coordinates). Compute $[T]_{B^{'}}$ and interpret the result.

Hint

$P = \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}$, and $b_{1}^{'}, b_{2}^{'}$ are eigenvectors of the swap with eigenvalues $+1, -1$.

Answer

With $P = \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}$, $\det P = -2$ and $P^{-1} = -\frac{1}{2} \begin{pmatrix} -1 & -1 \\ -1 & 1 \end{pmatrix} = \frac{1}{2} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}$. Then $$ [T]_{B'} = P^{-1}[T]_B P = \tfrac{1}{2}\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. $$ The matrix is diagonal: in the eigenbasis $B^{'}$ the swap acts as $+1$ on $b_{1}^{'} = (1, 1)$ and $-1$ on $b_{2}^{'} = (1, -1)$. The change of basis to the eigenbasis diagonalises $T$. Rubric: full credit for the diagonal matrix $\operatorname{diag}(1, -1)$ and the eigenbasis interpretation.

Exercise 6 (medium, symbolic).

Let $B$ be the standard basis of $R^{2}$ and $B^{'}$ the basis obtained by rotating the axes through $45^{\circ}$: $b_{1}^{'} = \frac{1}{\sqrt{2}} (1, 1)$, $b_{2}^{'} = \frac{1}{\sqrt{2}} (-1, 1)$. The quadratic form $q (x, y) = x y$ has symmetric matrix $A = \begin{pmatrix} 0 & 1/2 \\ 1/2 & 0 \end{pmatrix}$. Compute the matrix of $q$ in the rotated basis using the congruence law $P^{T} A P$, and contrast with the similarity law.

Hint

For a form, the transformation law is $A \mapsto P^{T} A P$, not $P^{-1} A P$. Here $P$ is orthogonal, so $P^{T} = P^{-1}$, but write the form law explicitly.

Answer

With $P = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix}$ (columns $b_{1}^{'}, b_{2}^{'}$), the congruence transform is $$ P^{\mathsf T} A P = \tfrac{1}{2}\begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix}\begin{pmatrix} 0 & 1/2 \\ 1/2 & 0 \end{pmatrix}\begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix} = \begin{pmatrix} 1/2 & 0 \\ 0 & -1/2 \end{pmatrix}, $$ where the prefactor $\tfrac{1}{2}$ collects the two factors of $\tfrac{1}{\sqrt{2}}$. So in rotated coordinates $(u, v)$ the form becomes $q = \frac{1}{2} u^{2} - \frac{1}{2} v^{2}$: the rotation of axes diagonalises the quadratic form, turning the cross term $x y$ into a difference of squares. Because $P$ is orthogonal here, $P^{T} A P$ and $P^{-1} A P$ coincide numerically, but the congruence law is the correct one for a form: it preserves the signature $(+, -)$, whereas similarity would preserve eigenvalues. Rubric: full credit for $\operatorname{diag}(1/2, -1/2)$ and the congruence-versus-similarity distinction.

Exercise 7 (hard, short-answer).

Show that the trace is a similarity invariant directly, without going through the characteristic polynomial, using only the identity $tr (X Y) = tr (Y X)$ .

Hint

Group $P^{- 1} A P$ as $(P^{- 1}) (A P)$ and cycle.

Answer

Write $A^{'} = P^{- 1} A P$ and set $X = P^{- 1}$ , $Y = A P$ . By the cyclic property $tr (X Y) = tr (Y X)$ , $$ \operatorname{tr}(A') = \operatorname{tr}(P^{-1}(AP)) = \operatorname{tr}((AP)P^{-1}) = \operatorname{tr}(A(PP^{-1})) = \operatorname{tr}(A). $$ The third equality uses associativity to regroup $A P \cdot P^{- 1} = A (P P^{- 1}) = A$ . Hence $tr (A^{'}) = tr (A)$ , so the trace depends only on the operator, not on the basis. The identity $tr (X Y) = tr (Y X)$ itself follows from $\sum_{i} (X Y)_{ii} = \sum_{i, j} X_{ij} Y_{j i} = \sum_{j, i} Y_{j i} X_{ij} = \sum_{j} (Y X)_{j j}$ . Rubric: full credit for the cyclic-trace argument and the index proof of cyclicity.

Exercise 8 (hard, symbolic).

Let $T$ act on $R^{2}$, let $\{f^{1}, f^{2}\}$ be the dual basis to $B$ and $\{f^{'1}, f^{'2}\}$ the dual basis to $B^{'}$, with change-of-basis matrix $P = P_{B \leftarrow B^{'}}$. Derive the contragredient law: show the dual coordinate rows transform by $P^{T}$ when the primal columns transform by $P^{-1}$, and conclude that the natural pairing $φ (v)$ is basis-independent.

Hint

Write the pairing as the row–column product $[φ]_{B}^{row} [v]_{B}$ and demand it equal $[φ]_{B^{'}}^{row} [v]_{B^{'}}$ for all $v$ .

Answer

For a covector $φ \in V^{*}$, the value $φ (v)$ is the row–column product $[φ]_{B} [v]_{B}$, where $[φ]_{B}$ is the coordinate row of $φ$ relative to the dual basis $B^{*}$. Basis-independence demands $[φ]_{B^{'}} [v]_{B^{'}} = [φ]_{B} [v]_{B}$ for all $v$. Substitute $[v]_{B} = P [v]_{B^{'}}$: $$ [\varphi]_B[v]_B = [\varphi]_B\,P\,[v]_{B'} = ([\varphi]_B P)\,[v]_{B'}. $$ Comparing with $[\varphi]_{B'}[v]_{B'}$ for all $[v]_{B'}$ gives $[\varphi]_{B'} = [\varphi]_B\, P$. As columns this is $[\varphi]_{B'}^{\mathsf T} = P^{\mathsf T}[\varphi]_B^{\mathsf T}$, so the dual coordinates transform by $P^{\mathsf T}$: covariantly, in the same direction as the basis vectors and opposite to the primal coordinates, which used $P^{-1}$. The pairing $\varphi(v)$ is invariant by construction: the $P$ on the covector side cancels the $P^{-1}$ on the vector side. Rubric: full credit for $[\varphi]_{B'} = [\varphi]_B P$, the transpose form, and the cancellation showing invariance.

Advanced results Master

Theorem (the three relations and their invariants; Hoffman-Kunze Ch. 5–6 ^{[source pending]}; Halmos §§44–47 ^{[source pending]}). On $M_{n} (K)$ there are three transformation relations attached to three different geometric objects, each generated by an invertible $P$ :

$$ \text{equivalence: } A \mapsto Q A P, \qquad \text{similarity: } A \mapsto P^{-1} A P, \qquad \text{congruence: } A \mapsto P^{\mathsf T} A P. $$

Equivalence ($Q, P$ independently invertible) is the law for the matrix of a linear map $V \to W$ under independent changes of basis in source and target; its complete invariant is the rank, and the canonical form is $\begin{pmatrix} I_{r} & 0 \\ 0 & 0 \end{pmatrix}$. Similarity is the law for the matrix of an operator $V \to V$ under one change of basis used on both source and target; its invariants are the rank, determinant, trace, characteristic polynomial, minimal polynomial, and the full system of elementary divisors, and the canonical form is the rational or Jordan form. Congruence is the law for the matrix of a bilinear form $V \times V \to K$ under one change of basis; over $R$ for symmetric matrices its complete invariant is the signature (Sylvester's law of inertia), and the canonical form is $\operatorname{diag}(I_{p}, -I_{q}, 0)$.

The three relations coincide precisely when $P$ is orthogonal and $Q = P^{-1} = P^{T}$, which is why the spectral theorem 01.01.13 (diagonalisation by an orthogonal change of basis) simultaneously diagonalises a symmetric operator and its associated quadratic form, reconciling similarity and congruence in one stroke. Away from orthogonal $P$ the relations genuinely differ: the matrices $\operatorname{diag}(1, 1)$ and $\operatorname{diag}(2, 1/2)$ are congruent (both positive definite, signature $(2, 0)$) but not similar (both have determinant $1$, but their traces $2$ and $5/2$ differ, and the identity is similar only to itself); the matrices $\operatorname{diag}(1, -1)$ and $\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$ are both similar and congruent because a connecting $P$ can be chosen orthogonal.
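To see the first pair explicitly, one possible connecting matrix (chosen here for illustration, not taken from the cited sources) is $P = \operatorname{diag}(\sqrt{2}, 1/\sqrt{2})$. Under the congruence law it carries the identity to $\operatorname{diag}(2, 1/2)$, while under the similarity law every invertible $Q$ fixes the identity: $$ P^{\mathsf T} I P = \begin{pmatrix} \sqrt{2} & 0 \\ 0 & 1/\sqrt{2} \end{pmatrix}\begin{pmatrix} \sqrt{2} & 0 \\ 0 & 1/\sqrt{2} \end{pmatrix} = \begin{pmatrix} 2 & 0 \\ 0 & 1/2 \end{pmatrix}, \qquad Q^{-1} I Q = I \ \text{ for every invertible } Q. $$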

Theorem (the tensor transformation law). Coordinates of a vector transform by $P^{-1}$ (contravariantly, written with an upper index $v^{i}$), coordinates of a covector transform by $P^{T}$ (covariantly, written with a lower index $ω_{i}$; it is the dual basis, not the components, that transforms by the inverse transpose), and a general type-$(p, q)$ tensor transforms with $p$ copies of $P^{-1}$ on its upper indices and $q$ copies of $P^{T}$ on its lower indices. In index notation, with $\tilde{P}^{i}{}_{j}$ the entries of $P^{-1}$ and $P^{j}{}_{i}$ the entries of $P$, $$ \tilde v{}^i = \tilde P{}^i{}_j\, v^j, \qquad \tilde \omega_i = P^j{}_i\, \omega_j, \qquad \tilde T{}^{i}{}_{k} = \tilde P{}^i{}_j\, P^l{}_k\, T^{j}{}_{l}, $$ so that contractions like $ω_{i} v^{i}$ are invariant because each upper $P^{-1}$ cancels a lower $P$. The operator transformation law $[T]_{B'} = P^{-1}[T]_B P$ is the special case of a type-$(1,1)$ tensor: one $P^{-1}$ on the contravariant slot, one $P$ on the covariant slot, which is exactly conjugation by $P$. This identifies the matrix of an operator with a $(1,1)$-tensor and the matrix of a bilinear form with a $(0,2)$-tensor, explaining at the index level why operators conjugate while forms sandwich with the transpose 13.02.01.
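The index formulas can be spelled out as explicit sums. The sketch below is only an illustration with made-up entries; `matmul` and `inv2` are hypothetical helpers, and the convention assumed is that the upper index of $P^{l}{}_{k}$ is the row and the lower index is the column. It checks that the $(1,1)$ index law reproduces $P^{-1} T P$ and that contractions do not change.

```python
from fractions import Fraction

def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def inv2(m):
    """Exact inverse of a 2x2 matrix."""
    (a, b), (c, d) = m
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("matrix is not invertible")
    return [[d / det, -b / det], [-c / det, a / det]]

n = 2
P = [[2, 1], [1, 1]]       # P[l][k] plays the role of P^l_k
P_inv = inv2(P)            # P_inv[i][j] plays the role of P-tilde^i_j
T = [[1, 2], [3, 4]]       # T[j][l] plays the role of T^j_l
v = [2, 4]                 # upper-index components v^j
w = [3, 5]                 # lower-index components omega_j

# (1,1) law: T-tilde^i_k = sum over j, l of P-tilde^i_j * P^l_k * T^j_l
T_index = [[sum(P_inv[i][j] * P[l][k] * T[j][l]
                for j in range(n) for l in range(n))
            for k in range(n)] for i in range(n)]
T_matrix = matmul(matmul(P_inv, T), P)

v_new = [sum(P_inv[i][j] * v[j] for j in range(n)) for i in range(n)]  # contravariant
w_new = [sum(P[j][i] * w[j] for j in range(n)) for i in range(n)]      # covariant

print(T_index == T_matrix)                                   # True
print(sum(w[i] * v[i] for i in range(n)),
      sum(w_new[i] * v_new[i] for i in range(n)))            # 26 26
print(sum(T[i][i] for i in range(n)),
      sum(T_index[i][i] for i in range(n)))                  # 5 5
```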

Theorem (the groupoid of bases). Fix $V$ of dimension $n$. The ordered bases of $V$ are the objects of a category in which the unique morphism from $B$ to $B^{'}$ is the change-of-basis matrix $P_{B^{'} \leftarrow B} \in G L_{n} (K)$; composition is matrix multiplication and every morphism is invertible, so this category is a groupoid, and it is connected because any two bases are related by some invertible $P$. Because the morphism between two given bases is unique, every vertex group is trivial. Choosing a base point $B_{0}$ trivialises the groupoid: the map $B \mapsto P_{B_{0} \leftarrow B}$ identifies the set of bases with $G L_{n} (K)$, and the change-of-basis matrices are the morphisms of the action groupoid of $G L_{n} (K)$ acting simply transitively on the set of bases. Operators, vectors, forms, and tensors are then functors on this groupoid valued in their respective transformation laws (conjugation, $P^{-1}$, $P^{T} A P$, mixed), and "an intrinsic quantity" means a natural object on the groupoid, constant on the connected component. This is the categorical reading of basis-independence: an invariant is a limit over the groupoid of bases.
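The composition and inverse rules of this groupoid can be checked on concrete bases. In the sketch below each basis of $K^{2}$ is stored as the matrix of its vectors written as columns in standard coordinates, so that $P_{X \leftarrow Y} = X^{-1} Y$; the three bases and the helper `change` are assumptions introduced for the example.

```python
from fractions import Fraction

def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def inv2(m):
    """Exact inverse of a 2x2 matrix."""
    (a, b), (c, d) = m
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("matrix is not invertible")
    return [[d / det, -b / det], [-c / det, a / det]]

def change(X, Y):
    """P_{X <- Y}: columns are the Y basis vectors written in X coordinates."""
    return matmul(inv2(X), Y)

# Three illustrative bases, each given by its column vectors.
B0 = [[1, 0], [0, 1]]
B1 = [[1, -1], [1, 1]]
B2 = [[2, 1], [1, 1]]

identity = [[1, 0], [0, 1]]

# Composition B0 -> B1 -> B2 equals the direct morphism B0 -> B2.
composed = matmul(change(B2, B1), change(B1, B0))
direct = change(B2, B0)

# The morphism B1 -> B0 inverts the morphism B0 -> B1.
round_trip = matmul(change(B0, B1), change(B1, B0))

print(composed == direct)       # True
print(round_trip == identity)   # True
```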

Synthesis. The central insight is that the change-of-basis matrix $P$ , whose columns are the new basis vectors in old coordinates, is the single datum from which every transformation law in linear algebra descends, and the laws differ only in which representation of $G L_{n} (K)$ the object carries. A vector carries the standard representation read backward, $v \mapsto P^{- 1} v$ , so its coordinates are contravariant; a covector carries the dual representation, $ω \mapsto P^{T} ω$ , so its coordinates are covariant — the covector law is dual to the vector law — and the pairing $ω_{i} v^{i}$ is invariant because the two representations are contragredient and cancel. An operator carries the conjugation representation $A \mapsto P^{- 1} A P$ , the tensor product of the standard with its dual; this is exactly why its invariants — determinant, trace, rank, characteristic and minimal polynomials, elementary divisors — are the class functions on $G L_{n} (K)$ and the data of the rational and Jordan canonical forms 01.01.11. A bilinear form carries the congruence representation $A \mapsto P^{T} A P$ , the symmetric or alternating square of the dual, whose complete invariant over $R$ is the signature of Sylvester's law of inertia 01.01.15 rather than the spectrum; the same conjugation grammar generalises to the upper-versus-lower index calculus of tensors 13.02.01.

The three relations — equivalence, similarity, congruence — are the three ways one matrix can present a map, an operator, or a form, and their distinct canonical forms (rank normal form, Jordan/rational form, inertia normal form) are the three answers to "what survives a change of basis." Reading $P$ as a morphism in the groupoid of bases makes the whole structure functorial: invariants are the natural quantities, the index calculus of upper and lower indices 13.02.01 is the bookkeeping of which representation acts, and the spectral theorem is the single point where an orthogonal $P$ makes similarity and congruence agree. The transformation laws are therefore not three separate rules but one representation-theoretic principle applied to four kinds of object.

Full proof set Master

Proposition (similarity is an equivalence relation and rank, determinant, trace, and characteristic polynomial are invariants). On $M_{n} (K)$ , the relation $A \sim A^{'}$ defined by $\exists P \in G L_{n} (K) : A^{'} = P^{- 1} A P$ is reflexive, symmetric, and transitive, and the maps $A \mapsto rank A$ , $A \mapsto det A$ , $A \mapsto tr A$ , $A \mapsto χ_{A}$ are constant on each equivalence class.

Proof. Reflexivity: $A = I^{- 1} A I$ . Symmetry: if $A^{'} = P^{- 1} A P$ then $A = (P^{- 1})^{- 1} A^{'} (P^{- 1})$ , so $A^{'} \sim A$ with $P^{- 1}$ . Transitivity: if $A^{'} = P^{- 1} A P$ and $A^{''} = Q^{- 1} A^{'} Q$ then $A^{''} = Q^{- 1} P^{- 1} A P Q = (P Q)^{- 1} A (P Q)$ , with $P Q$ invertible. Hence $\sim$ is an equivalence relation.

Rank is invariant because multiplication by an invertible matrix on either side is a bijective linear map, which neither increases nor decreases the dimension of the image; thus $rank (P^{- 1} A P) = rank A$ . Determinant: $det (P^{- 1} A P) = det (P^{- 1}) det A det P = (det P)^{- 1} det A det P = det A$ , using multiplicativity and $det (P^{- 1}) = (det P)^{- 1}$ . Trace: $tr (P^{- 1} A P) = tr ((A P) P^{- 1}) = tr (A)$ by the cyclic property of the trace. Characteristic polynomial: $t I - P^{- 1} A P = P^{- 1} (t I - A) P$ , so $χ_{P^{- 1} A P} (t) = det (P^{- 1} (t I - A) P) = det (t I - A) = χ_{A} (t)$ ; the determinant and trace invariance follow again as the bottom and next-to-top coefficients of $χ_{A}$ , giving a second proof consistent with the direct ones. $□$
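A small numerical companion to the proof, under stated assumptions: the matrices are illustrative $2 \times 2$ examples (with $A$ deliberately singular so that the rank is not trivially full), and `det2`, `trace2`, `rank2`, and `char_poly2` are hypothetical helpers that only handle the $2 \times 2$ case.

```python
from fractions import Fraction

def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def inv2(m):
    """Exact inverse of a 2x2 matrix."""
    (a, b), (c, d) = m
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("matrix is not invertible")
    return [[d / det, -b / det], [-c / det, a / det]]

def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def trace2(m):
    return m[0][0] + m[1][1]

def rank2(m):
    """Rank of a 2x2 matrix: 2 if invertible, 0 if zero, otherwise 1."""
    if det2(m) != 0:
        return 2
    return 1 if any(x != 0 for row in m for x in row) else 0

def char_poly2(m):
    """Coefficients (1, -tr, det) of t^2 - tr(m) t + det(m)."""
    return (1, -trace2(m), det2(m))

A = [[1, 2], [2, 4]]        # singular, rank 1
P = [[1, 1], [0, 1]]        # invertible shear
A_conj = matmul(matmul(inv2(P), A), P)

print(rank2(A), rank2(A_conj))            # 1 1
print(char_poly2(A) == char_poly2(A_conj))  # True
```

The conjugated matrix has different entries from `A`, yet every quantity named in the proposition agrees.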

Proposition (the coordinate and operator laws determine each other). If a rule assigning to each basis $B$ a column $[v]_{B} \in K^{n}$ transforms by $[v]_{B^{'}} = P^{- 1} [v]_{B}$ under $P = P_{B \leftarrow B^{'}}$ , then the induced matrix of any operator $T$ necessarily transforms by $[T]_{B^{'}} = P^{- 1} [T]_{B} P$ ; conversely the operator law forces the coordinate law on the columns of an invertible operator's matrix.

Proof. Assume the coordinate law. For an operator $T$ and any $w$, $[T w]_{B^{'}} = P^{-1} [T w]_{B} = P^{-1} [T]_{B} [w]_{B} = P^{-1} [T]_{B} P [w]_{B^{'}}$, while also $[T w]_{B^{'}} = [T]_{B^{'}} [w]_{B^{'}}$; equating and ranging over all $[w]_{B^{'}} \in K^{n}$ gives $[T]_{B^{'}} = P^{-1} [T]_{B} P$. Conversely assume the operator law holds for all $T$. Since $P$ is invertible its first column is nonzero, so pick $k$ with $P_{k1} \neq 0$. For a fixed $v$, apply the law to the operator $T$ that sends $b_{k}$ to $v$ and the other vectors of $B$ to $0$, so that $[T]_{B}$ has $[v]_{B}$ in column $k$ and zeros elsewhere. Then $T b_{1}^{'} = T (\sum_{i} P_{i1} b_{i}) = P_{k1} v$, so the first column of $[T]_{B^{'}}$ is $[T b_{1}^{'}]_{B^{'}} = P_{k1} [v]_{B^{'}}$. On the other side, the first column of $P^{-1} [T]_{B} P$ is $P^{-1} [T]_{B} (\text{first column of } P) = P^{-1} [T]_{B} [b_{1}^{'}]_{B} = P_{k1} P^{-1} [v]_{B}$. Equating the two and dividing by $P_{k1}$ recovers $[v]_{B^{'}} = P^{-1} [v]_{B}$; the two laws are thus inter-derivable, reflecting that the operator law is the coordinate law tensored with its dual. $□$

Proposition (Sylvester's law of inertia; congruence invariant). Let $A$ be a real symmetric $n \times n$ matrix. The number $p$ of positive and $q$ of negative diagonal entries obtained when $A$ is reduced to diagonal form by a congruence $A \mapsto P^{T} A P$ is independent of the reducing $P$ ; the pair $(p, q)$ — the signature — is a complete invariant of the congruence class, and $p + q = rank A$ .

Proof. Reduction to diagonal form by congruence is possible by simultaneous row-and-column operations (symmetric Gaussian elimination), so some $(p, q)$ arises. Suppose two congruences give signatures $(p, q)$ and $(p^{'}, q^{'})$ with $p > p^{'}$. Let $U_{+}$ be the $p$-dimensional subspace on which the form $x \mapsto x^{T} A x$ is positive definite in the first reduction, and $W_{-}$ the $(n - p^{'})$-dimensional subspace on which it is negative semidefinite in the second (the span of the negative and zero directions). Then $\dim U_{+} + \dim W_{-} = p + (n - p^{'}) > n$, so $U_{+} \cap W_{-} \neq \{0\}$; a nonzero vector there makes $x^{T} A x$ simultaneously $> 0$ and $\leq 0$, a contradiction. Hence $p \leq p^{'}$, and by symmetry $p = p^{'}$; likewise $q = q^{'}$. The rank is invariant under congruence by the rank argument of the first proposition, so $p + q = \operatorname{rank} A$. The signature, not the eigenvalues, is therefore the congruence invariant, which is the precise sense in which congruence and similarity carry different information. $□$
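The reduction step of the proof can be sketched in code. The routine below is one possible implementation of symmetric elimination over exact fractions, written for illustration rather than taken from the cited sources; `signature`, `add_multiple`, and `swap` are hypothetical names. Each operation is a row operation followed by the matching column operation, so it is a congruence.

```python
from fractions import Fraction

def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(m):
    return [list(col) for col in zip(*m)]

def signature(a):
    """Return (p, q) for a symmetric matrix, via symmetric elimination."""
    n = len(a)
    m = [[Fraction(x) for x in row] for row in a]   # exact working copy

    def add_multiple(src, dst, c):
        # Congruence by an elementary matrix: row op, then the same column op.
        for j in range(n):
            m[dst][j] += c * m[src][j]
        for i in range(n):
            m[i][dst] += c * m[i][src]

    def swap(i, j):
        # Congruence by a permutation matrix: swap rows, then columns.
        m[i], m[j] = m[j], m[i]
        for row in m:
            row[i], row[j] = row[j], row[i]

    for k in range(n):
        if m[k][k] == 0:
            # Prefer a nonzero diagonal entry further down.
            pivot = next((j for j in range(k + 1, n) if m[j][j] != 0), None)
            if pivot is not None:
                swap(k, pivot)
            else:
                # Remaining diagonal is zero; use an off-diagonal entry of row k.
                j = next((j for j in range(k + 1, n) if m[k][j] != 0), None)
                if j is None:
                    continue              # row and column k are already cleared
                add_multiple(j, k, 1)     # new m[k][k] = 2 * m[k][j], nonzero
        for i in range(k + 1, n):
            if m[i][k] != 0:
                add_multiple(k, i, -m[i][k] / m[k][k])

    diagonal = [m[i][i] for i in range(n)]
    return (sum(1 for d in diagonal if d > 0),
            sum(1 for d in diagonal if d < 0))

half = Fraction(1, 2)
A = [[0, half], [half, 0]]                 # matrix of the form q(x, y) = x y
P = [[1, 2], [3, 4]]                       # an arbitrary invertible matrix
A2 = matmul(matmul(transpose(P), A), P)    # a congruent matrix

print(signature(A), signature(A2))         # (1, 1) (1, 1)
```

The two matrices have different entries and different eigenvalues, but the counts of positive and negative pivots agree, as the proposition asserts.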

Connections Master

Change of basis is the engine of the canonical-form theory. Diagonalisation 01.01.08 chooses $P$ to be an eigenvector matrix so that $P^{- 1} A P$ is diagonal; the Jordan canonical form 01.01.11 chooses $P$ from generalised eigenvectors when diagonalisation fails; the primary decomposition 01.01.16 is the basis-free statement of which $P$ exist. Each is the search for the simplest representative of a similarity class, the orbits of which the operator transformation law defines.

The congruence law $A \mapsto P^{T} A P$ is the transformation law for bilinear and quadratic forms 01.01.15, whose invariant is the signature by Sylvester's law of inertia, in deliberate contrast to the eigenvalue invariants of similarity. The spectral theorem for normal operators 01.01.13 is exactly the case where an orthogonal $P$ makes the two laws coincide, simultaneously diagonalising an operator and its associated form, which is why the principal-axes theorem can be read either as similarity or as congruence.

The contravariant-coordinate / covariant-covector pair generalises directly to the tensor transformation law on smooth manifolds 13.02.01, where the change-of-basis matrix becomes the Jacobian of a coordinate change and upper and lower indices record exactly the $P^{- 1}$ -versus- $P^{T}$ distinction proved here for the dual basis 01.01.02; the invariance of contractions is the finite-dimensional shadow of the tensor calculus underlying general relativity.

Historical & philosophical context Master

The idea that a vector exists independently of any coordinate system, so that coordinates are derived descriptions subject to a transformation law, was first made systematic by Hermann Grassmann in the 1844 Ausdehnungslehre, which built an $n$ -dimensional theory of linear extension with bases but without a privileged coordinate frame ^{[Grassmann 1844]}. The matrix as an algebraic object on which a substitution acts, with a product, an inverse, and a calculus that makes $P^{- 1} A P$ meaningful, was introduced by Arthur Cayley in the 1858 Memoir on the Theory of Matrices, the work in which similarity becomes a manipulable operation rather than an implicit change of variables ^{[Cayley 1858]}. The recognition that congruence carries a different invariant from similarity is due to James Joseph Sylvester, whose law of inertia, stated for real quadratic forms in 1852, established that the signature is preserved under the substitution $A \mapsto P^{T} A P$ regardless of the reducing transformation ^{[Sylvester 1852]}. The modern finite-dimensional packaging — coordinates as a linear isomorphism, the operator law as conjugation, the dual law as contragredience — is the treatment of Paul Halmos's 1958 Finite-Dimensional Vector Spaces, while Shilov's 1971 Linear Algebra gives the transition-matrix development in the concrete operator language followed here.

Bibliography Master

@article{Cayley1858,
  author  = {Cayley, Arthur},
  title   = {A Memoir on the Theory of Matrices},
  journal = {Philosophical Transactions of the Royal Society of London},
  volume  = {148},
  year    = {1858},
  pages   = {17--37}
}

@article{Sylvester1852,
  author  = {Sylvester, James Joseph},
  title   = {A demonstration of the theorem that every homogeneous quadratic polynomial is reducible by real orthogonal substitutions to the form of a sum of positive and negative squares},
  journal = {Philosophical Magazine (Series 4)},
  volume  = {4},
  year    = {1852},
  pages   = {138--142}
}

@book{Grassmann1844,
  author    = {Grassmann, Hermann},
  title     = {Die lineale Ausdehnungslehre, ein neuer Zweig der Mathematik},
  publisher = {Otto Wigand},
  address   = {Leipzig},
  year      = {1844}
}

@book{Halmos1958,
  author    = {Halmos, Paul R.},
  title     = {Finite-Dimensional Vector Spaces},
  edition   = {2nd},
  publisher = {Van Nostrand},
  address   = {Princeton, NJ},
  year      = {1958}
}

@book{Shilov1977,
  author    = {Shilov, Georgi E.},
  title     = {Linear Algebra},
  publisher = {Dover Publications},
  address   = {New York},
  year      = {1977},
  note      = {Translation of the 1971 Russian edition, transl. R. A. Silverman}
}

@book{HoffmanKunze1971,
  author    = {Hoffman, Kenneth and Kunze, Ray},
  title     = {Linear Algebra},
  edition   = {2nd},
  publisher = {Prentice-Hall},
  address   = {Englewood Cliffs, NJ},
  year      = {1971}
}

@book{Lang2002,
  author    = {Lang, Serge},
  title     = {Algebra},
  edition   = {3rd, revised},
  series    = {Graduate Texts in Mathematics},
  volume    = {211},
  publisher = {Springer},
  year      = {2002}
}

Prerequisites

01.01.04
01.01.05

Tier anchors

beginner: Re-describing the same arrow and the same motion when you switch rulers — Shilov *Linear Algebra* Ch. 4; 3Blue1Brown *Essence of Linear Algebra* Ch. 13 (change of basis)
intermediate: Shilov *Linear Algebra* Ch. 4 (the change-of-basis matrix, coordinate and operator transformation laws, similarity); Hoffman-Kunze *Linear Algebra* §3.4 (the matrix of a linear transformation and change of basis); Axler *Linear Algebra Done Right* §3.C–§3.F (matrices, invertibility, change of basis)
master: Shilov *Linear Algebra* Ch. 4–5; Hoffman-Kunze *Linear Algebra* §3.4 and Ch. 5–6 (similarity, equivalence, congruence); Halmos *Finite-Dimensional Vector Spaces* §§37–47 (change of basis, similarity, the dual transformation law); Lang *Algebra* Ch. XIII (matrices, similarity, and bilinear forms); Bourbaki *Algèbre* Ch. II (modules, bases, and matrices)

References

images/Shilov-Linear-Algebra__4cbdee00cc.jpg · Shilov *Linear Algebra* — Fast Track archive cover; Ch. 4 the matrix of a linear operator under a change of basis, the coordinate transformation law $[v]_{B'} = P^{-1}[v]_B$, the operator transformation law $[T]_{B'} = P^{-1}[T]_B P$, and similarity
Shilov, G. E. — Linear Algebra (Dover, 1977 transl. of the 1971 Russian ed.) · Ch. 4 §§4.2–4.4 (coordinates relative to a basis, the transition matrix, transformation of the matrix of an operator), Ch. 5 (the canonical form and similarity invariants)
Halmos, P. R. — Finite-Dimensional Vector Spaces (2nd ed., Van Nostrand, 1958) · §§37–47 — bases and coordinates, the matrix of a transformation, change of basis, similarity, the transpose / adjoint and the contragredient transformation law on the dual space
Cayley, A. — A Memoir on the Theory of Matrices · Philosophical Transactions of the Royal Society of London 148 (1858), 17–37 — the matrix as an algebraic object, multiplication, the inverse, and the calculus of substitutions underlying similarity
Sylvester, J. J. — A demonstration of the theorem that every homogeneous quadratic polynomial is reducible by real orthogonal substitutions to the form of a sum of positive and negative squares · Philosophical Magazine (4) 4 (1852), 138–142 — the law of inertia: the signature of a real quadratic form is invariant under the congruence transformation $A \mapsto P^{\mathsf T} A P$
Grassmann, H. — Die lineale Ausdehnungslehre, ein neuer Zweig der Mathematik · Otto Wigand, Leipzig, 1844 — the first systematic theory of $n$-dimensional linear extension, bases, and coordinates independent of a fixed coordinate system, the conceptual source of basis-independence

Estimated time

beginner: 16m
intermediate: 42m
master: 85m