Change of basis and the transformation laws
Anchor (Master): Shilov *Linear Algebra* Ch. 4–5; Hoffman-Kunze *Linear Algebra* §3.4 and Ch. 5–6 (similarity, equivalence, congruence); Halmos *Finite-Dimensional Vector Spaces* §§37–47 (change of basis, similarity, the dual transformation law); Lang *Algebra* Ch. XIII (matrices, similarity, and bilinear forms); Bourbaki *Algèbre* Ch. II (modules, bases, and matrices)
Intuition Beginner
A vector is a thing in the world — an arrow, a displacement, a force. The list of numbers you write down for it is not the vector itself; it is a description, and the description depends on the rulers you chose. Switch to different rulers and the same arrow gets a different list of numbers, even though nothing about the arrow has moved. A change of basis is exactly that switch of rulers, and the transformation laws are the bookkeeping that tells you how every list of numbers must change so that all the things they describe stay put.
Here is the part that surprises people. When you switch to a finer set of rulers, so that each new ruler measures a smaller step, the numbers you read off get bigger, not smaller. Cut your unit in half and every length doubles. So the coordinates move opposite to the rulers. This opposite movement is the whole point: the rulers and the coordinates pull in opposite directions, and keeping track of which is which is what the laws below do.
The same idea applies to a transformation — a rule that moves the whole space. Its grid of numbers, its matrix, also depends on the rulers. Re-describe the rulers and you must re-describe the matrix, sandwiching it between the switch and its undo. Quantities that survive this sandwich untouched, like the determinant, are the real features of the transformation rather than features of your chosen rulers.
Visual Beginner
The picture shows one fixed arrow drawn twice, once against a square grid and once against a slanted, stretched grid. The arrow never moves. Against the square grid it reads as two steps right and one step up. Against the slanted grid, whose rulers point in different directions and have different lengths, the very same arrow reads as a different pair of numbers. The change-of-basis matrix is the dictionary translating one reading into the other.
Two facts are visible. The arrow is the same in both panels — the geometry does not care which grid you drew. And the two coordinate pairs are linked by a fixed dictionary that depends only on the two grids, not on which arrow you picked.
Worked example Beginner
Take the plane with the ordinary square grid. Pick a new pair of rulers: the first new ruler is the arrow $(1, 1)$ and the second is the arrow $(-1, 1)$. These are the columns of the change-of-basis matrix $$ P = \begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix}. $$ Now take the fixed vector that reads as $(2, 4)$ in the old square grid. We want its reading in the new grid.
Step 1. The new coordinates are obtained by undoing $P$, that is by multiplying by $P^{-1}$. Compute the inverse. The number $\det P = 1 \cdot 1 - (-1) \cdot 1 = 2$, and $$ P^{-1} = \frac{1}{2}\begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix}. $$
Step 2. Apply it to the old coordinates $(2, 4)$: $$ P^{-1}\begin{pmatrix} 2 \\ 4 \end{pmatrix} = \frac{1}{2}\begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix}\begin{pmatrix} 2 \\ 4 \end{pmatrix} = \frac{1}{2}\begin{pmatrix} 6 \\ 2 \end{pmatrix} = \begin{pmatrix} 3 \\ 1 \end{pmatrix}. $$
Step 3. Check the answer means what it should. The reading $(3, 1)$ claims $(2, 4) = 3\,(1, 1) + 1\,(-1, 1)$. Compute the right side: $3\,(1, 1) = (3, 3)$ and $1\,(-1, 1) = (-1, 1)$, and $(3, 3) + (-1, 1) = (2, 4)$. That is the original $(2, 4)$. The reading is correct.
What this tells us: to convert coordinates from the old grid to the new grid you multiply by $P^{-1}$, not by $P$. The matrix $P$ carries new-ruler readings back to old readings; converting an old reading into a new one runs that backward, which is why the inverse appears.
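The three steps above can be replayed mechanically. The following is a minimal sketch in plain Python with exact fractions; the helper names `inverse_2x2` and `mat_vec` are illustrative choices made here, not part of the text.

```python
from fractions import Fraction

def inverse_2x2(m):
    """Inverse of a 2x2 matrix (nested lists), computed with exact fractions."""
    (a, b), (c, d) = m
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("singular matrix: its columns are not a basis")
    return [[d / det, -b / det], [-c / det, a / det]]

def mat_vec(m, v):
    """Matrix times column vector."""
    return [sum(row[j] * v[j] for j in range(len(v))) for row in m]

# Columns of P are the new rulers (1, 1) and (-1, 1) written in the old grid.
P = [[1, -1],
     [1, 1]]
old_reading = [2, 4]

new_reading = mat_vec(inverse_2x2(P), old_reading)  # old -> new uses P^{-1}
round_trip = mat_vec(P, new_reading)                # new -> old uses P

print([str(x) for x in new_reading])  # ['3', '1']
print([str(x) for x in round_trip])   # ['2', '4']
```

Multiplying by `P` instead of its inverse in the first conversion would give $(-2, 6)$, which fails the Step 3 check.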
Check your understanding Beginner
Formal definition Intermediate+
Let $K$ be a field and $V$ a finite-dimensional $K$-vector space of dimension $n$, with the notion of a basis and of coordinates relative to a basis as in 01.01.04, and the matrix of a linear map as in 01.01.05. Fix two ordered bases $B = (b_1, \dots, b_n)$ and $B' = (b'_1, \dots, b'_n)$. For a vector $v \in V$, write $[v]_B$ for its coordinate column relative to $B$, the unique column $(c_1, \dots, c_n)^{\mathsf T}$ with $v = \sum_i c_i\, b_i$.
Definition (change-of-basis matrix). The change-of-basis matrix from $B'$ to $B$, written $P = P_{B \leftarrow B'}$, is the $n \times n$ matrix whose $j$-th column is the coordinate column of the new basis vector $b'_j$ expressed in the old basis $B$: $$ P_{B \leftarrow B'} = \big(\, [b'_1]_B \mid [b'_2]_B \mid \cdots \mid [b'_n]_B \,\big). $$ Equivalently $b'_j = \sum_i P_{ij}\, b_i$. Because $B'$ is a basis, its vectors are linearly independent, so $P$ is invertible, and $P^{-1} = P_{B' \leftarrow B}$ is the change-of-basis matrix in the reverse direction.
Coordinate transformation law (contravariance). For every $v \in V$, $$ [v]_B = P\,[v]_{B'}, \qquad\text{equivalently}\qquad [v]_{B'} = P^{-1}\,[v]_B. $$ The basis vectors transform by $P$ (the columns of $P$ are the new vectors in old coordinates), while the coordinates transform by $P^{-1}$. The coordinates move contravariantly, that is, opposite to the basis. This direction convention is the source of the lower-versus-upper index notation in tensor analysis and is fixed here as: $P$ expresses the new basis vectors in the old basis, hence $P^{-1}$ takes old coordinates to new coordinates.
Operator transformation law (similarity). Let $T : V \to V$ be a linear operator with matrix $[T]_B$ relative to $B$, defined by $[Tv]_B = [T]_B\,[v]_B$. Then $$ [T]_{B'} = P^{-1}\,[T]_B\,P. $$ Two square matrices $A, A'$ are similar (or conjugate) when $A' = P^{-1}AP$ for some invertible $P$; the operator transformation law says that the matrices of one operator in different bases form a single similarity class. Similarity is an equivalence relation, and its equivalence classes are exactly the operators on $V$ up to choice of basis.
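As a sanity check on the operator law, the sketch below re-describes one operator in the basis of the worked example and confirms that the new matrix acts correctly on new coordinates. The operator matrix `T_old` is a hypothetical example chosen here, and the helper names are illustrative.

```python
from fractions import Fraction

def inverse_2x2(m):
    """Inverse of a 2x2 matrix (nested lists), computed with exact fractions."""
    (a, b), (c, d) = m
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("singular matrix")
    return [[d / det, -b / det], [-c / det, a / det]]

def mat_mul(a, b):
    """Matrix product of nested-list matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def mat_vec(m, v):
    """Matrix times column vector."""
    return [sum(row[j] * v[j] for j in range(len(v))) for row in m]

P = [[1, -1], [1, 1]]        # columns: new basis vectors in old coordinates
T_old = [[2, 1], [0, 3]]     # hypothetical operator matrix [T]_B
P_inv = inverse_2x2(P)

# Operator law: [T]_{B'} = P^{-1} [T]_B P
T_new = mat_mul(P_inv, mat_mul(T_old, P))

# Defining property in the new basis: [Tv]_{B'} = [T]_{B'} [v]_{B'}
v_old = [2, 4]
v_new = mat_vec(P_inv, v_old)
image_direct = mat_vec(T_new, v_new)                     # apply T in new coordinates
image_via_old = mat_vec(P_inv, mat_vec(T_old, v_old))    # apply T in old, then convert

print([[str(x) for x in row] for row in T_new])  # [['3', '1'], ['0', '2']]
```

Both routes give the same column, which is what it means for `T_new` to be the matrix of the same operator in the new basis.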
Passive versus active. The matrix $P$ here is a passive change of basis: the vectors of $V$ do not move; only their descriptions change. This is to be held apart from an active linear map, which moves vectors of $V$ to new vectors. Numerically the two can look identical, since both are invertible matrices, but they act on different data. A passive $P$ relates two coordinate columns of the same vector; an active matrix $[T]_B$ relates the coordinate columns of $v$ and of its image $Tv$ in a single basis.
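Dual transformation law (contragredience). Let $V^*$ be the dual space 01.01.02, with dual bases $B^* = (b_1^*, \dots, b_n^*)$ and $(B')^* = ((b'_1)^*, \dots, (b'_n)^*)$ characterised by $b_i^*(b_j) = \delta_{ij}$ and $(b'_i)^*(b'_j) = \delta_{ij}$. Writing $[\omega]_B$ for a covector's coordinate row relative to a basis, so that $\omega(v) = [\omega]_B\,[v]_B$, the dual coordinates transform by $P$ where the primal coordinates transformed by $P^{-1}$: $[\omega]_{B'} = [\omega]_B\,P$. The dual basis is contragredient to the original. Concretely the matrix of dual bases satisfies $P_{B^* \leftarrow (B')^*} = (P^{-1})^{\mathsf T}$, so covector coordinates are covariant: they transform the same way the basis vectors do, by $P$ acting on the row.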
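The cancellation between the two laws can be seen numerically. In this sketch a covector is stored as a coordinate row and transformed by right multiplication with $P$, while the vector column is transformed by $P^{-1}$; the covector `omega_old` is a hypothetical example, and the helper names are illustrative.

```python
from fractions import Fraction

def inverse_2x2(m):
    """Inverse of a 2x2 matrix (nested lists), computed with exact fractions."""
    (a, b), (c, d) = m
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("singular matrix")
    return [[d / det, -b / det], [-c / det, a / det]]

def mat_vec(m, v):
    """Matrix times column vector."""
    return [sum(row[j] * v[j] for j in range(len(v))) for row in m]

def row_times_matrix(row, m):
    """Row vector times matrix."""
    return [sum(row[i] * m[i][j] for i in range(len(row)))
            for j in range(len(m[0]))]

def pair(row, col):
    """Evaluate a covector (row) on a vector (column)."""
    return sum(r * c for r, c in zip(row, col))

P = [[1, -1], [1, 1]]
omega_old = [5, -1]   # hypothetical covector, as a row in the old dual basis
v_old = [2, 4]

omega_new = row_times_matrix(omega_old, P)   # covariant: row times P
v_new = mat_vec(inverse_2x2(P), v_old)       # contravariant: P^{-1} times column

value_old = pair(omega_old, v_old)
value_new = pair(omega_new, v_new)
print(value_old, value_new)  # 6 6
```

The number $\omega(v)$ is the same in both descriptions because the row picks up a $P$ and the column a $P^{-1}$.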
Notation: $[v]_B$ is the coordinate column of $v$ in the ordered basis $B$; $P = P_{B \leftarrow B'}$ is the change-of-basis matrix whose columns are the new basis vectors in old coordinates; $P^{\mathsf T}$ is the transpose; $\delta_{ij}$ is the Kronecker delta; $\sim$ denotes similarity and $\cong$ denotes congruence, a distinct relation used for bilinear and quadratic forms 01.01.15.
Counterexamples to common slips
- The coordinates do not transform by $P$. Writing $[v]_{B'} = P\,[v]_B$ inverts the law. The columns of $P$ are the new vectors in old coordinates, so $P$ sends a new-basis column to the matching old-basis column; the coordinate change from old to new is therefore $P^{-1}$. The single most common error is to apply $P$ where $P^{-1}$ is required.
- Similarity ($P^{-1}AP$) is not congruence ($P^{\mathsf T}AP$). The first is the law for the matrix of an operator $T : V \to V$; the second is the law for the matrix of a bilinear form $\beta : V \times V \to K$. The two laws agree for every $A$ only when $P$ is orthogonal, $P^{\mathsf T} = P^{-1}$. Confusing them swaps the invariants: similarity preserves eigenvalues, congruence preserves the signature.
- A change of basis is passive, not active. The matrix $P^{-1}[T]_B\,P$ is the same operator re-described; it is not the composite of $T$ with a genuine motion of the space. Reading $P$ as an active map silently changes the object under study.
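The second slip is easy to exhibit with numbers. The sketch below applies both laws to one symmetric matrix, first with a shear (not orthogonal) and then with a quarter-turn rotation (orthogonal). The matrices `S`, `shear`, and `rotation` are hypothetical examples chosen here, and the helper names are illustrative.

```python
from fractions import Fraction

def inverse_2x2(m):
    """Inverse of a 2x2 matrix (nested lists), computed with exact fractions."""
    (a, b), (c, d) = m
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("singular matrix")
    return [[d / det, -b / det], [-c / det, a / det]]

def transpose(m):
    return [list(col) for col in zip(*m)]

def mat_mul(a, b):
    """Matrix product of nested-list matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def similar(A, P):
    """Operator law: P^{-1} A P."""
    return mat_mul(inverse_2x2(P), mat_mul(A, P))

def congruent(A, P):
    """Bilinear-form law: P^T A P."""
    return mat_mul(transpose(P), mat_mul(A, P))

S = [[2, 1], [1, 3]]            # hypothetical symmetric matrix
shear = [[1, 1], [0, 1]]        # invertible, not orthogonal
rotation = [[0, -1], [1, 0]]    # orthogonal: transpose equals inverse

by_shear_sim = similar(S, shear)        # trace stays 5, symmetry is lost
by_shear_con = congruent(S, shear)      # symmetry stays, trace becomes 9
by_rot_sim = similar(S, rotation)
by_rot_con = congruent(S, rotation)

print(by_shear_sim != by_shear_con)  # True
print(by_rot_sim == by_rot_con)      # True
```

With the shear, the similar matrix keeps the trace while the congruent matrix keeps symmetry; with the rotation the two outputs coincide.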
Key theorem with proof Intermediate+
Theorem (the transformation laws; Shilov Ch. 4 [source pending]; Halmos §§44–47 [source pending]). Let $B, B'$ be ordered bases of the finite-dimensional space $V$, and let $P = P_{B \leftarrow B'}$ be the change-of-basis matrix whose $j$-th column is $[b'_j]_B$. Then $P$ is invertible, and for every vector $v \in V$ and every linear operator $T : V \to V$, $$ [v]_{B'} = P^{-1}\,[v]_B, \qquad [T]_{B'} = P^{-1}\,[T]_B\,P. $$
Proof. Invertibility of $P$. The columns of $P$ are the coordinate columns $[b'_1]_B, \dots, [b'_n]_B$. The coordinate map $v \mapsto [v]_B$ is a linear isomorphism $V \to K^n$, so it carries the linearly independent family $b'_1, \dots, b'_n$ to a linearly independent family of columns; $n$ independent columns in $K^n$ form an invertible matrix. Hence $P \in GL_n(K)$.
Coordinate law. Fix $v \in V$ and write its new coordinates $[v]_{B'} = (c_1, \dots, c_n)^{\mathsf T}$, so $v = \sum_j c_j\, b'_j$ by definition of coordinates relative to $B'$. Apply the linear coordinate map $[\,\cdot\,]_B$ to both sides: $$ [v]_B = \Big[\sum_j c_j\, b'_j\Big]_B = \sum_j c_j\, [b'_j]_B = \sum_j c_j\, (j\text{-th column of } P) = P\, [v]_{B'}. $$ The first equality is the definition, the second is linearity of $[\,\cdot\,]_B$, the third is the definition of $P$'s columns, and the last is the column rule for matrix–vector multiplication. Thus $[v]_B = P\,[v]_{B'}$, and since $P$ is invertible, $[v]_{B'} = P^{-1}\,[v]_B$.
Operator law. By definition of the matrix of $T$ in a basis, $[Tw]_B = [T]_B\,[w]_B$ and $[Tw]_{B'} = [T]_{B'}\,[w]_{B'}$ for all $w \in V$. Start from an arbitrary $w$ and chase coordinates through the coordinate law applied to both $w$ and $Tw$: $$ [T]_{B'}[w]_{B'} = [Tw]_{B'} = P^{-1}[Tw]_B = P^{-1}[T]_B[w]_B = P^{-1}[T]_B\, P\,[w]_{B'}. $$ The second equality is the coordinate law for the vector $Tw$, the third is the operator's matrix in $B$, and the fourth is the coordinate law $[w]_B = P\,[w]_{B'}$. The identity $[T]_{B'}[w]_{B'} = P^{-1}[T]_B P\,[w]_{B'}$ holds for every $w$, and $[w]_{B'}$ ranges over all of $K^n$ as $w$ ranges over $V$; two matrices that agree on every column of $K^n$ are equal, so $[T]_{B'} = P^{-1}[T]_B P$. $\square$
Bridge. The operator transformation law builds toward the entire canonical-form programme: once $[T]_{B'} = P^{-1}[T]_B P$, the question "what is the simplest matrix of $T$?" becomes "what is the simplest representative of a similarity class?", which is exactly what diagonalisation 01.01.08, the Jordan form 01.01.11, and the primary decomposition 01.01.16 answer by choosing $P$ to be an eigenvector or generalised-eigenvector matrix. The foundational reason the determinant, trace, rank, and characteristic polynomial are attached to the operator rather than to the matrix is that each is invariant under $A \mapsto P^{-1}AP$, so they are constant on a similarity class and therefore properties of $T$ alone. This is exactly the same algebra read on covectors with the transpose in place of the inverse: the dual basis transforms contragrediently, by $(P^{-1})^{\mathsf T}$, which generalises to the upper-versus-lower index transformation law of tensors 13.02.01 and appears again in the congruence law $A \mapsto P^{\mathsf T}AP$ for bilinear and quadratic forms 01.01.15, where the invariant is no longer the eigenvalue but the signature. Putting these together, change of basis is the single grammatical rule (conjugate operators, transpose-invert covectors, sandwich forms) from which the invariants of every later structure are read off.
Exercises Intermediate+
Advanced results Master
Theorem (the three relations and their invariants; Hoffman-Kunze Ch. 5–6 [source pending]; Halmos §§44–47 [source pending]). On $M_n(K)$ there are three transformation relations attached to three different geometric objects, each generated by invertible matrices: equivalence $A' = Q^{-1}AP$, similarity $A' = P^{-1}AP$, and congruence $A' = P^{\mathsf T}AP$.
Equivalence ($P, Q$ independently invertible) is the law for the matrix of a linear map under independent changes of basis in source and target; its complete invariant is the rank $r$, and the canonical form is $\begin{pmatrix} I_r & 0 \\ 0 & 0 \end{pmatrix}$. Similarity is the law for the matrix of an operator under one change of basis used on both source and target; its invariants are the rank, determinant, trace, characteristic polynomial, minimal polynomial, and the full system of elementary divisors, and the canonical form is the rational form, or the Jordan form when the characteristic polynomial splits over $K$. Congruence is the law for the matrix of a bilinear form under one change of basis; over $\mathbb{R}$ for symmetric matrices its complete invariant is the signature $(p, q)$ (Sylvester's law of inertia), and the canonical form is $\operatorname{diag}(I_p, -I_q, 0)$.
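The similarity and congruence laws coincide precisely when $P$ is orthogonal, $P^{\mathsf T} = P^{-1}$, which is why the spectral theorem 01.01.13 (diagonalisation by an orthogonal change of basis) simultaneously diagonalises a symmetric operator and its associated quadratic form, reconciling similarity and congruence in one stroke. Away from orthogonal $P$ the relations genuinely differ. For example, over $\mathbb{R}$ the matrices $\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$ and $\begin{pmatrix} 2 & 0 \\ 0 & 2 \end{pmatrix}$ are congruent (both positive definite, signature $(2, 0)$) but not similar (different determinants); the matrices $\begin{pmatrix} 1 & 0 \\ 0 & 2 \end{pmatrix}$ and $\begin{pmatrix} 2 & 0 \\ 0 & 1 \end{pmatrix}$ are both similar and congruent because the connecting $P$, the permutation matrix swapping the two coordinates, happens to be orthogonal.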
Theorem (the tensor transformation law). Coordinates of a vector transform by $P^{-1}$ (contravariantly, written with an upper index $v^i$), coordinates of a covector transform by $P$ acting on the row, equivalently by $P^{\mathsf T}$, the inverse-transpose of $P^{-1}$, viewed on the column of components (covariantly, written with a lower index $\omega_i$), and a general type-$(r, s)$ tensor transforms with $r$ copies of $P^{-1}$ on its upper indices and $s$ copies of $P$ on its lower indices. In index notation, with $P^i{}_j$ the entries of $P$ and $\tilde P^i{}_j$ the entries of $P^{-1}$,
$$
\tilde v^i = \tilde P^i{}_j\, v^j, \qquad \tilde \omega_i = P^j{}_i\, \omega_j, \qquad \tilde T^{i}{}_{k} = \tilde P^i{}_j\, P^l{}_k\, T^{j}{}_{l},
$$
so that contractions like $\omega_i v^i$ are invariant because each upper $P^{-1}$ cancels a lower $P$. The operator transformation law $[T]_{B'} = P^{-1}[T]_B P$ is the $(1,1)$ case, with one $P^{-1}$ and one $P$; a bilinear form takes two copies of $P$ because it is not a $(1,1)$-tensor but a $(0,2)$-tensor, explaining at the index level why operators conjugate while forms sandwich with the transpose 13.02.01.
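The index formulas can be checked against the matrix formulas by writing the sums out literally. In this sketch `transform_type_11` implements the displayed $(1,1)$ law and `transform_type_02` implements the $(0,2)$ law with two lower-index copies of $P$; the function names and the sample arrays `T` and `G` are assumptions made here for illustration.

```python
from fractions import Fraction

def inverse_2x2(m):
    """Inverse of a 2x2 matrix (nested lists), computed with exact fractions."""
    (a, b), (c, d) = m
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("singular matrix")
    return [[d / det, -b / det], [-c / det, a / det]]

def transform_type_11(T, P, P_inv):
    """T'^i_k = P_inv^i_j * P^l_k * T^j_l, summed over j and l."""
    n = len(T)
    return [[sum(P_inv[i][j] * P[l][k] * T[j][l]
                 for j in range(n) for l in range(n))
             for k in range(n)] for i in range(n)]

def transform_type_02(G, P):
    """G'_ik = P^j_i * P^l_k * G_jl, summed over j and l."""
    n = len(G)
    return [[sum(P[j][i] * P[l][k] * G[j][l]
                 for j in range(n) for l in range(n))
             for k in range(n)] for i in range(n)]

P = [[1, -1], [1, 1]]
P_inv = inverse_2x2(P)
T = [[2, 1], [0, 3]]   # hypothetical (1,1)-tensor: an operator matrix
G = [[2, 1], [1, 3]]   # hypothetical (0,2)-tensor: a bilinear form matrix

T_new = transform_type_11(T, P, P_inv)   # same entries as P^{-1} T P
G_new = transform_type_02(G, P)          # same entries as P^T G P

# Full contractions are invariant: the trace T^i_i and the value G(u, v).
trace_old = sum(T[i][i] for i in range(2))
trace_new = sum(T_new[i][i] for i in range(2))

u_old, v_old = [1, 0], [2, 4]
u_new = [sum(P_inv[i][j] * u_old[j] for j in range(2)) for i in range(2)]
v_new = [sum(P_inv[i][j] * v_old[j] for j in range(2)) for i in range(2)]
form_old = sum(u_old[i] * G[i][k] * v_old[k] for i in range(2) for k in range(2))
form_new = sum(u_new[i] * G_new[i][k] * v_new[k] for i in range(2) for k in range(2))
```

Here `trace_old` and `trace_new` agree, as do `form_old` and `form_new`, while the component arrays themselves change.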
Theorem (the groupoid of bases). Fix $V$ of dimension $n$. The ordered bases of $V$ are the objects of a category in which the unique morphism from $B'$ to $B$ is the change-of-basis matrix $P_{B \leftarrow B'}$; composition is matrix multiplication, $P_{B \leftarrow B'}\,P_{B' \leftarrow B''} = P_{B \leftarrow B''}$, and every morphism is invertible, so this category is a groupoid, and it is connected because any two bases are related by some invertible $P$. Choosing a base point $B_0$ trivialises the groupoid: the assignment $B \mapsto P_{B_0 \leftarrow B}$ identifies the set of bases with $GL_n(K)$, and the change-of-basis matrices are the morphisms of the action groupoid of $GL_n(K)$ acting simply transitively on the set of bases. Operators, vectors, forms, and tensors are then functors on this groupoid valued in their respective transformation laws (conjugation, $P^{-1}$, $P^{\mathsf T}(\,\cdot\,)P$, mixed), and "an intrinsic quantity" means a natural object on the groupoid, constant on the connected component. This is the categorical reading of basis-independence: an invariant is a limit over the groupoid of bases.
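The groupoid laws (composition, inverses, identities) can be checked on concrete bases of $\mathbb{Q}^2$. In the sketch below each basis is stored as the matrix of its vectors in standard coordinates, so the change-of-basis matrix from `new` to `old` is `inverse(old) * new`; the three bases and the function name `change_of_basis` are hypothetical choices made here.

```python
from fractions import Fraction

def inverse_2x2(m):
    """Inverse of a 2x2 matrix (nested lists), computed with exact fractions."""
    (a, b), (c, d) = m
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("singular matrix: columns are not a basis")
    return [[d / det, -b / det], [-c / det, a / det]]

def mat_mul(a, b):
    """Matrix product of nested-list matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def change_of_basis(old, new):
    """P_{old <- new}: columns are the new basis vectors in old coordinates."""
    return mat_mul(inverse_2x2(old), new)

# Three hypothetical bases of Q^2; columns are the basis vectors.
B0 = [[1, 2], [0, 1]]
B1 = [[1, -1], [1, 1]]
B2 = [[2, 1], [1, 1]]

P01 = change_of_basis(B0, B1)
P12 = change_of_basis(B1, B2)
P02 = change_of_basis(B0, B2)
P10 = change_of_basis(B1, B0)

composed = mat_mul(P01, P12)        # should equal P02
round_trip = mat_mul(P01, P10)      # should equal the identity
identity_morphism = change_of_basis(B0, B0)
```

Because there is exactly one morphism between any two bases, every closed loop of changes of basis multiplies out to the identity matrix.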
Synthesis. The central insight is that the change-of-basis matrix $P$, whose columns are the new basis vectors in old coordinates, is the single datum from which every transformation law in linear algebra descends, and the laws differ only in which representation of $GL_n(K)$ the object carries. A vector carries the standard representation read backward, $[v]_{B'} = P^{-1}[v]_B$, so its coordinates are contravariant; a covector carries the dual representation, $[\omega]_{B'} = [\omega]_B\,P$, so its coordinates are covariant (the covector law is dual to the vector law), and the pairing $\omega(v)$ is invariant because the two representations are contragredient and cancel. An operator carries the conjugation representation $A \mapsto P^{-1}AP$, the tensor product of the standard with its dual; this is exactly why its invariants (determinant, trace, rank, characteristic and minimal polynomials, elementary divisors) are the class functions on $M_n(K)$ and the data of the rational and Jordan canonical forms 01.01.11. A bilinear form carries the congruence representation $A \mapsto P^{\mathsf T}AP$, the symmetric or alternating square of the dual, whose complete invariant over $\mathbb{R}$ is the signature of Sylvester's law of inertia 01.01.15 rather than the spectrum; the same conjugation grammar generalises to the upper-versus-lower index calculus of tensors 13.02.01.
The three relations (equivalence, similarity, congruence) are the three ways one matrix can present a map, an operator, or a form, and their distinct canonical forms (rank normal form, Jordan/rational form, inertia normal form) are the three answers to "what survives a change of basis." Reading $P$ as a morphism in the groupoid of bases makes the whole structure functorial: invariants are the natural quantities, the index calculus of upper and lower indices 13.02.01 is the bookkeeping of which representation acts, and the spectral theorem is the single point where an orthogonal $P$ makes similarity and congruence agree. The transformation laws are therefore not three separate rules but one representation-theoretic principle applied to four kinds of object.
Full proof set Master
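Proposition (similarity is an equivalence relation and rank, determinant, trace, and characteristic polynomial are invariants). On $M_n(K)$, the relation $A \sim A'$ defined by $A' = P^{-1}AP$ for some $P \in GL_n(K)$ is reflexive, symmetric, and transitive, and the maps $A \mapsto \operatorname{rank} A$, $A \mapsto \det A$, $A \mapsto \operatorname{tr} A$, $A \mapsto \chi_A(t) = \det(tI - A)$ are constant on each equivalence class.
Proof. Reflexivity: $A = I^{-1}AI$. Symmetry: if $A' = P^{-1}AP$ then $A = PA'P^{-1}$, so $A = Q^{-1}A'Q$ with $Q = P^{-1}$. Transitivity: if $A' = P^{-1}AP$ and $A'' = Q^{-1}A'Q$ then $A'' = (PQ)^{-1}A\,(PQ)$, with $PQ$ invertible. Hence $\sim$ is an equivalence relation.
Rank is invariant because multiplication by an invertible matrix on either side is a bijective linear map, which neither increases nor decreases the dimension of the image; thus $\operatorname{rank}(P^{-1}AP) = \operatorname{rank} A$. Determinant: $\det(P^{-1}AP) = \det(P)^{-1}\det(A)\det(P) = \det A$, using multiplicativity and $\det(P^{-1}) = \det(P)^{-1}$. Trace: $\operatorname{tr}(P^{-1}AP) = \operatorname{tr}(APP^{-1}) = \operatorname{tr} A$ by the cyclic property of the trace. Characteristic polynomial: $tI - P^{-1}AP = P^{-1}(tI - A)P$, so $\chi_{P^{-1}AP}(t) = \det\!\big(P^{-1}(tI - A)P\big) = \chi_A(t)$; the determinant and trace invariance follow again because, up to sign, they are the bottom and next-to-top coefficients of $\chi_A$, giving a second proof consistent with the direct ones.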
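The invariance of the characteristic polynomial can be confirmed in exact arithmetic. The sketch below computes the coefficients of $\det(tI - A)$ with the Faddeev–LeVerrier recursion, a technique swapped in here for convenience (it is not the method of the proof and needs characteristic zero), and compares them before and after conjugation. The matrices `A` and `P` are hypothetical examples.

```python
from fractions import Fraction

def mat_mul(a, b):
    """Matrix product of nested-list matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def inverse(m):
    """Gauss-Jordan inverse with exact fractions."""
    n = len(m)
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            raise ValueError("singular matrix")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [x / scale for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def char_poly(a):
    """Coefficients of det(tI - A), leading coefficient first (Faddeev-LeVerrier)."""
    n = len(a)
    coeffs = [Fraction(1)]
    m = [[Fraction(0)] * n for _ in range(n)]
    c = Fraction(1)
    for k in range(1, n + 1):
        am = mat_mul(a, m)
        m = [[am[i][j] + (c if i == j else 0) for j in range(n)] for i in range(n)]
        am = mat_mul(a, m)
        c = -sum(am[i][i] for i in range(n)) / k
        coeffs.append(c)
    return coeffs

A = [[2, 1, 0], [0, 2, 0], [0, 0, 3]]   # hypothetical operator matrix
P = [[1, 1, 0], [0, 1, 1], [1, 0, 1]]   # hypothetical invertible matrix, det 2
A_new = mat_mul(inverse(P), mat_mul(A, P))

coeffs_old = char_poly(A)       # t^3 - 7 t^2 + 16 t - 12
coeffs_new = char_poly(A_new)
```

The second coefficient is $-\operatorname{tr} A$ and the last is $(-1)^n \det A$, so equality of the two coefficient lists also re-checks the trace and determinant.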
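Proposition (the coordinate and operator laws determine each other). If a rule assigning to each basis $B$ a column $[v]_B$ transforms by $[v]_{B'} = P^{-1}[v]_B$ under $B \to B'$, then the induced matrix of any operator necessarily transforms by $[T]_{B'} = P^{-1}[T]_B P$; conversely the operator law, applied to rank-one operators, forces the coordinate law.
Proof. Assume the coordinate law. For an operator $T$ and any $w \in V$, $[Tw]_{B'} = P^{-1}[Tw]_B = P^{-1}[T]_B\,[w]_B = P^{-1}[T]_B P\,[w]_{B'}$, while also $[Tw]_{B'} = [T]_{B'}[w]_{B'}$; equating and ranging over all $w$ gives $[T]_{B'} = P^{-1}[T]_B P$. Conversely assume the operator law holds for all $T$. Fix $v \in V$ and an index $k$ with $P_{k1} \neq 0$, which exists because the first column of $P$ is nonzero. Apply the law to the operator $T_v$ whose matrix in $B$ is $[v]_B\, e_k^{\mathsf T}$, i.e. $T_v$ sends $b_k$ to $v$ and the other basis vectors to $0$. The $k$-th column of $[T_v]_B$ is $[v]_B$. Since $b'_1 = \sum_i P_{i1}\, b_i$, we have $T_v b'_1 = P_{k1}\, v$, so the first column of $[T_v]_{B'}$ is $P_{k1}\,[v]_{B'}$; by the operator law it is also the first column of $P^{-1}[v]_B\, e_k^{\mathsf T} P$, namely $P_{k1}\, P^{-1}[v]_B$. Tracking this first column and normalising by the factor $P_{k1}$ coming from the action on $b'_1$ recovers $[v]_{B'} = P^{-1}[v]_B$; the two laws are thus inter-derivable, reflecting that the operator law is the coordinate law tensored with its dual.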
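Proposition (Sylvester's law of inertia; congruence invariant). Let $A$ be a real symmetric $n \times n$ matrix. The number $p$ of positive and $q$ of negative diagonal entries obtained when $A$ is reduced to diagonal form by a congruence $A \mapsto P^{\mathsf T}AP$ is independent of the reducing $P$; the pair $(p, q)$, the signature, is a complete invariant of the congruence class, and $p + q = \operatorname{rank} A$.
Proof. Reduction to diagonal form by congruence is possible by simultaneous row-and-column operations (symmetric Gaussian elimination), and rescaling the basis vectors normalises the nonzero diagonal entries to $\pm 1$, so some $\operatorname{diag}(I_p, -I_q, 0)$ arises. Suppose two congruences give signatures $(p, q)$ and $(p', q')$ with $p > p'$. Let $U$ be the $p$-dimensional subspace on which the form is positive definite in the first reduction, and $W$ the $(n - p')$-dimensional subspace on which it is negative semidefinite in the second (the span of the negative and zero directions). Then $\dim U + \dim W = p + n - p' > n$, so $U \cap W \neq 0$; a nonzero vector $x$ there makes simultaneously $x^{\mathsf T}Ax > 0$ and $x^{\mathsf T}Ax \le 0$, a contradiction. Hence $p \le p'$, and by symmetry $p = p'$; likewise $q = q'$. The rank is invariant under congruence by the rank argument of the first proposition, so $p + q = \operatorname{rank} A$. Conversely, two symmetric matrices with the same signature are congruent to the same $\operatorname{diag}(I_p, -I_q, 0)$ and hence to each other, so the invariant is complete. The signature, not the eigenvalues, is therefore the congruence invariant, which is the precise sense in which congruence and similarity carry different information.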
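The symmetric elimination used in the proof can be run directly. The following sketch diagonalises a symmetric matrix by paired row and column operations over the rationals and counts the signs on the diagonal; the function name `signature` and the test matrices are assumptions made here, and the pivoting strategy is one simple choice among several.

```python
from fractions import Fraction

def signature(a):
    """Return (p, q) for a symmetric rational matrix via congruence operations."""
    n = len(a)
    m = [[Fraction(x) for x in row] for row in a]

    def add_multiple(dst, src, factor):
        # Row operation followed by the matching column operation: E^T M E.
        for c in range(n):
            m[dst][c] += factor * m[src][c]
        for r in range(n):
            m[r][dst] += factor * m[r][src]

    def swap(i, j):
        # Swap rows i, j and then columns i, j (congruence by a permutation).
        m[i], m[j] = m[j], m[i]
        for r in range(n):
            m[r][i], m[r][j] = m[r][j], m[r][i]

    for k in range(n):
        if m[k][k] == 0:
            j = next((j for j in range(k + 1, n) if m[j][j] != 0), None)
            if j is not None:
                swap(k, j)
            else:
                j = next((j for j in range(k + 1, n) if m[k][j] != 0), None)
                if j is None:
                    continue                 # row and column k are already zero
                add_multiple(k, j, 1)        # makes m[k][k] = 2 * m[k][j] != 0
        for i in range(k + 1, n):
            if m[i][k] != 0:
                add_multiple(i, k, -m[i][k] / m[k][k])

    p = sum(1 for i in range(n) if m[i][i] > 0)
    q = sum(1 for i in range(n) if m[i][i] < 0)
    return p, q

def transpose(x):
    return [list(col) for col in zip(*x)]

def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

A = [[2, 1, 0], [1, -1, 0], [0, 0, 0]]   # hypothetical symmetric matrix of rank 2
P = [[1, 2, 0], [0, 1, 3], [0, 0, 1]]    # hypothetical invertible, non-orthogonal
A_congruent = mat_mul(transpose(P), mat_mul(A, P))

sig_A = signature(A)
sig_congruent = signature(A_congruent)
sig_hyperbolic = signature([[0, 1], [1, 0]])
print(sig_A, sig_hyperbolic)  # (1, 1) (1, 1)
```

The diagonal entries produced depend on the pivot choices, but the counts of positive and negative entries do not, which is the content of the proposition.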
Connections Master
Change of basis is the engine of the canonical-form theory. Diagonalisation 01.01.08 chooses $P$ to be an eigenvector matrix so that $P^{-1}AP$ is diagonal; the Jordan canonical form 01.01.11 chooses $P$ from generalised eigenvectors when diagonalisation fails; the primary decomposition 01.01.16 is the basis-free statement of which $P$ exist. Each is the search for the simplest representative of a similarity class, the orbits of which the operator transformation law defines.
The congruence law $A \mapsto P^{\mathsf T}AP$ is the transformation law for bilinear and quadratic forms 01.01.15, whose invariant is the signature by Sylvester's law of inertia, in deliberate contrast to the eigenvalue invariants of similarity. The spectral theorem for normal operators 01.01.13 is exactly the case where an orthogonal $P$ makes the two laws coincide, simultaneously diagonalising an operator and its associated form, which is why the principal-axes theorem can be read either as similarity or as congruence.
The contravariant-coordinate / covariant-covector pair generalises directly to the tensor transformation law on smooth manifolds 13.02.01, where the change-of-basis matrix becomes the Jacobian of a coordinate change and upper and lower indices record exactly the $P^{-1}$-versus-$P$ distinction proved here for the dual basis 01.01.02; the invariance of contractions $\omega_i v^i$ is the finite-dimensional shadow of the tensor calculus underlying general relativity.
Historical & philosophical context Master
The idea that a vector exists independently of any coordinate system, so that coordinates are derived descriptions subject to a transformation law, was first made systematic by Hermann Grassmann in the 1844 *Ausdehnungslehre*, which built an $n$-dimensional theory of linear extension with bases but without a privileged coordinate frame [Grassmann 1844]. The matrix as an algebraic object on which a substitution acts, with a product, an inverse, and a calculus that makes $P^{-1}AP$ meaningful, was introduced by Arthur Cayley in the 1858 *Memoir on the Theory of Matrices*, the work in which similarity becomes a manipulable operation rather than an implicit change of variables [Cayley 1858]. The recognition that congruence carries a different invariant from similarity is due to James Joseph Sylvester, whose law of inertia, stated for real quadratic forms in 1852, established that the signature is preserved under the substitution $A \mapsto P^{\mathsf T}AP$ regardless of the reducing transformation [Sylvester 1852]. The modern finite-dimensional packaging (coordinates as a linear isomorphism, the operator law as conjugation, the dual law as contragredience) is the treatment of Paul Halmos's 1958 *Finite-Dimensional Vector Spaces*, while Shilov's 1971 *Linear Algebra* gives the transition-matrix development in the concrete operator language followed here.
Bibliography Master
@article{Cayley1858,
author = {Cayley, Arthur},
title = {A Memoir on the Theory of Matrices},
journal = {Philosophical Transactions of the Royal Society of London},
volume = {148},
year = {1858},
pages = {17--37}
}
@article{Sylvester1852,
author = {Sylvester, James Joseph},
title = {A demonstration of the theorem that every homogeneous quadratic polynomial is reducible by real orthogonal substitutions to the form of a sum of positive and negative squares},
journal = {Philosophical Magazine (Series 4)},
volume = {4},
year = {1852},
pages = {138--142}
}
@book{Grassmann1844,
author = {Grassmann, Hermann},
title = {Die lineale Ausdehnungslehre, ein neuer Zweig der Mathematik},
publisher = {Otto Wigand},
address = {Leipzig},
year = {1844}
}
@book{Halmos1958,
author = {Halmos, Paul R.},
title = {Finite-Dimensional Vector Spaces},
edition = {2nd},
publisher = {Van Nostrand},
address = {Princeton, NJ},
year = {1958}
}
@book{Shilov1977,
author = {Shilov, Georgi E.},
title = {Linear Algebra},
publisher = {Dover Publications},
address = {New York},
year = {1977},
note = {Translation of the 1971 Russian edition, transl. R. A. Silverman}
}
@book{HoffmanKunze1971,
author = {Hoffman, Kenneth and Kunze, Ray},
title = {Linear Algebra},
edition = {2nd},
publisher = {Prentice-Hall},
address = {Englewood Cliffs, NJ},
year = {1971}
}
@book{Lang2002,
author = {Lang, Serge},
title = {Algebra},
edition = {3rd, revised},
series = {Graduate Texts in Mathematics},
volume = {211},
publisher = {Springer},
year = {2002}
}