Mutation and repair
Anchor (Master): Friedberg et al., *DNA Repair and Mutagenesis* (2nd ed., ASM Press 2006); Lindahl — Instability and decay of the primary structure of DNA (1993 Nature); Modrich — Mechanisms in eukaryotic DNA mismatch repair (2006 J. Biol. Chem.); Cleaver — Xeroderma pigmentosum (1968 Nature); Haber — Mating-type gene switching in S. cerevisiae (1992)
Intuition [Beginner]
DNA is remarkably stable, but it is not invincible. Every day, every cell in your body suffers tens of thousands of DNA lesions — breaks, chemical modifications, mispaired bases. Most are fixed silently by dedicated repair enzymes. When they are not, the result is a mutation: a permanent change in the DNA sequence.
Mutations come in several types. A point mutation changes a single base pair. If it changes the amino acid encoded, it is a missense mutation (like sickle-cell: Glu to Val). If it creates a premature stop codon, it is a nonsense mutation. If it changes a base but does not change the amino acid (due to genetic code redundancy), it is a silent mutation. If bases are inserted or deleted (not in multiples of three), the reading frame shifts — a frameshift — garbling everything downstream.
Mutations can arise from replication errors, from chemical damage (UV light, cigarette smoke, reactive oxygen species), or from mobile genetic elements that jump around the genome.
Cells have multiple repair systems. Some correct mismatched bases right after replication. Others detect and remove damaged bases. Some patch broken DNA strands. When repair fails, mutations accumulate. A few mutations in the right (or wrong) genes can lead to cancer.
Visual [Beginner]
Think of DNA as a long text. A point mutation is a typo: one wrong letter. A missense mutation changes a word's meaning ("the" becomes "tea"). A nonsense mutation ends the sentence prematurely ("the big cat" becomes "the big [stop]"). A frameshift shifts all the reading boundaries ("t|he|big|cat" becomes "th|eb|ig|ca|t...").
Worked example [Beginner]
The sickle-cell mutation is a single base change in the HBB gene (beta-globin). The normal codon at position 6 is GAG (glutamic acid). The mutation changes it to GTG (valine):
Normal DNA: ...GAG...   mRNA: ...GAG...   Amino acid: Glu
Mutant DNA: ...GTG...   mRNA: ...GUG...   Amino acid: Val
This is an A to T transversion on the coding strand (a purine replaced by a pyrimidine). This single amino acid substitution, which replaces a negatively charged, hydrophilic side chain with a non-polar, hydrophobic one, creates a sticky patch on the hemoglobin surface that causes polymerization, distorting red blood cells into sickle shapes. The consequences: pain crises, organ damage, anemia. Yet the mutation persists because carriers (heterozygotes) have resistance to malaria.
What this tells us: a single base change can have enormous phenotypic consequences through a chain of molecular interactions, from DNA to protein structure to cell behaviour to organismal fitness.
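The step from codon to amino acid in this example can be checked mechanically. The sketch below builds the standard genetic code as a lookup table and classifies a single-codon change. The names `CODON_TABLE` and `classify_substitution` are illustrative choices, not part of the source.

```python
# Standard genetic code for DNA codons, with bases ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def classify_substitution(ref_codon: str, alt_codon: str) -> str:
    """Classify a codon change by its effect on the encoded protein."""
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    if ref_aa == alt_aa:
        return "silent"
    if alt_aa == "*":       # sense codon became a stop codon
        return "nonsense"
    if ref_aa == "*":       # stop codon became a sense codon
        return "readthrough"
    return "missense"


# Sickle-cell change in HBB codon 6: GAG (Glu, E) to GTG (Val, V).
print(CODON_TABLE["GAG"], CODON_TABLE["GTG"], classify_substitution("GAG", "GTG"))
```

The same function distinguishes the other effect classes: a change to TAG, TAA, or TGA is reported as nonsense, and a synonymous change such as GAG to GAA is reported as silent.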
Check your understanding [Beginner]
Formal definition [Intermediate+]
A mutation is any heritable change in the nucleotide sequence of a genome. DNA repair encompasses the enzymatic pathways that detect and correct DNA damage and replication errors.
Types of mutations
By scale:
- Point mutations: Single base-pair substitutions. Classified as transitions (purine to purine: A to G; pyrimidine to pyrimidine: C to T) or transversions (purine to pyrimidine or vice versa).
- Insertions/deletions (indels): Addition or removal of one or more base pairs.
- Copy number variations (CNVs): Duplications or deletions of larger chromosomal segments.
- Chromosomal rearrangements: Translocations, inversions, fusions.
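The transition/transversion distinction for point mutations reduces to asking whether both bases belong to the same chemical class. A minimal sketch, with the function name `substitution_class` chosen here for illustration:

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def substitution_class(ref: str, alt: str) -> str:
    """Return 'transition' or 'transversion' for a single-base substitution."""
    ref, alt = ref.upper(), alt.upper()
    valid = PURINES | PYRIMIDINES
    if ref not in valid or alt not in valid:
        raise ValueError("bases must be one of A, C, G, T")
    if ref == alt:
        raise ValueError("ref and alt must differ")
    # Same chemical class (purine/purine or pyrimidine/pyrimidine) is a transition.
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"
```

Of the twelve possible substitutions, four are transitions and eight are transversions.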
By effect on the protein:
- Silent (synonymous): Codon changes but amino acid does not (due to code degeneracy).
- Missense: One amino acid replaced by another. Conservative (similar properties) or non-conservative (different properties).
- Nonsense: Codon becomes a stop codon, truncating the protein.
- Readthrough: Stop codon mutated to a sense codon, extending the protein.
- Frameshift: Indel not divisible by 3 shifts the reading frame, altering all downstream amino acids.
- Splice-site mutations: Disrupt intron-exon boundaries, causing aberrant splicing.
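The frameshift rule (an indel whose length is not divisible by 3 changes every downstream codon) can be seen by translating a short reading frame before and after a one-base deletion. The sequence below is an illustrative open reading frame chosen for this sketch, and the helper names are assumptions.

```python
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate from the first base, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def is_frameshift(indel_length: int) -> bool:
    """An indel shifts the frame unless its length is a multiple of 3."""
    return indel_length % 3 != 0


wild_type = "ATGGTGCATCTGACTCCTGAGGAGAAG"   # illustrative reading frame
mutant = wild_type[:4] + wild_type[5:]       # delete the base at index 4

print(translate(wild_type))
print(translate(mutant))
```

In this example the deletion not only changes the downstream amino acids but also brings a stop codon into frame, which is a common consequence of frameshifts.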
DNA damage types
Spontaneous damage (approximately 10,000–100,000 lesions per cell per day):
- Depurination: Loss of purine bases (A or G), leaving an apurinic (AP) site. About 10,000 per cell per day.
- Deamination: Loss of an amino group from a base. Cytosine deaminates to uracil (~100–500 per day). 5-methylcytosine deaminates to thymine (difficult for BER to correct — a major source of C to T transitions at CpG dinucleotides).
- Oxidation: Reactive oxygen species modify bases. 8-oxo-guanine (8-oxoG) is the most common oxidative lesion; it pairs with A instead of C, causing G to T transversions.
Induced damage:
- UV radiation: Causes cyclobutane pyrimidine dimers (CPDs) and 6-4 photoproducts between adjacent pyrimidines. These distort the helix and block replication.
- Ionizing radiation: Causes single-strand breaks (SSBs), double-strand breaks (DSBs), and base modifications.
- Chemical mutagens: Alkylating agents (add methyl/ethyl groups to bases), deaminating agents (nitrous acid), base analogs (5-bromouracil), intercalating agents (ethidium bromide, causing frameshifts).
Repair mechanisms
1. Direct reversal. A few lesions are repaired without excision: photolyase (absent in placental mammals) uses blue light to split UV dimers; O6-methylguanine methyltransferase (MGMT) transfers a methyl group from O6-methylG to itself (a suicide enzyme).
2. Base excision repair (BER). Repairs small base modifications (deamination, oxidation, alkylation). A DNA glycosylase recognizes and removes the damaged base, creating an AP site. AP endonuclease (APE1) nicks the backbone. DNA polymerase beta fills the gap, and DNA ligase seals it. Different glycosylases target different lesions: UNG (uracil), OGG1 (8-oxoG), MBD4 (T from 5-methyl-C deamination).
3. Nucleotide excision repair (NER). Repairs bulky lesions that distort the helix (UV dimers, bulky chemical adducts). Two sub-pathways: global genome NER (GG-NER, scans entire genome; uses XPC for damage recognition) and transcription-coupled NER (TC-NER, repairs lesions in actively transcribed strands; uses CSB/CSA). In both, TFIIH unwinds the DNA, XPA verifies the lesion, and endonucleases XPF-ERCC1 (5' cut) and XPG (3' cut) excise a 24–32 nt oligonucleotide. DNA polymerase fills the gap and ligase seals it.
4. Mismatch repair (MMR). Corrects base-pairing errors that escape DNA polymerase proofreading. In E. coli: MutS recognizes the mismatch, MutL recruits MutH, which nicks the newly synthesized strand at a hemimethylated GATC site. In eukaryotes: MSH2-MSH6 (MutS homolog) recognizes mismatches; MLH1-PMS2 (MutL homolog) recruits excision machinery. MMR improves fidelity approximately 100-fold, from about $10^{-7}$ to about $10^{-9}$ errors per base.
5. Double-strand break repair. Two pathways: non-homologous end joining (NHEJ) directly ligates broken ends using Ku70/80, DNA-PKcs, Artemis, and XRCC4-Ligase IV — fast but error-prone. Homologous recombination (HR) uses the sister chromatid as a template for accurate repair via Rad51 (loaded by BRCA2) — accurate but restricted to S/G2 phase.
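The five mechanisms above amount to a dispatch on lesion type, with the cell-cycle phase deciding between the two double-strand break pathways. The sketch below is a deliberately simplified lookup. The lesion labels and the function name `choose_pathway` are illustrative, and real cells show overlap and backup between pathways that this table ignores.

```python
# Simplified lesion-to-pathway map based on the mechanisms listed above.
PATHWAY_BY_LESION = {
    "O6-methylguanine": "direct reversal",
    "uracil": "BER",
    "8-oxoG": "BER",
    "AP site": "BER",
    "pyrimidine dimer": "NER",
    "bulky adduct": "NER",
    "mismatch": "MMR",
}


def choose_pathway(lesion: str, cell_cycle_phase: str = "G1") -> str:
    """Return the primary repair pathway for a lesion (toy model)."""
    if lesion == "double-strand break":
        # HR needs a sister chromatid, which exists only in S and G2.
        return "HR" if cell_cycle_phase in {"S", "G2"} else "NHEJ"
    return PATHWAY_BY_LESION[lesion]
```

For example, `choose_pathway("double-strand break", "G2")` selects HR, while the same lesion in G1 is routed to NHEJ.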
Counterexamples to common slips
- "All mutations are harmful." Most are neutral, especially in non-coding regions and at synonymous sites. Beneficial mutations are rare but are the only source of evolutionary novelty. The neutral theory 19.02.05 holds that the majority of observed polymorphism is selectively neutral.
- "DNA repair always restores the original sequence." Translesion synthesis and error-prone polymerases deliberately introduce mutations as a trade-off for survival when the replication fork encounters unrepaired damage. The SOS response in bacteria upregulates error-prone polymerases.
- "Double-strand breaks are always repaired by NHEJ." HR is the preferred pathway in S/G2 when a sister chromatid is available, and it is essentially error-free. NHEJ dominates in G1 when no sister chromatid exists.
Key theorem with proof [Intermediate+]
Theorem (Replication fidelity from layered repair). The overall per-base-pair error rate of chromosomal DNA replication in a eukaryotic cell with functional 3'-to-5' exonuclease proofreading and mismatch repair is approximately $10^{-9}$ errors per base pair per replication, a $10^{4}$-fold improvement over the intrinsic polymerase error rate. The overall fidelity equals the product of the fidelity contributions at each stage.
Proof. DNA polymerases $\delta$ and $\varepsilon$ have an intrinsic misincorporation rate of approximately $10^{-5}$ per base (one wrong nucleotide per 100,000 incorporations). The 3'-to-5' exonuclease proofreading activity of Pol $\delta$/$\varepsilon$ corrects approximately 99% of these misincorporations, reducing the error rate to approximately $10^{-7}$. Mismatch repair then scans the newly synthesized duplex, detects the residual mismatches, and corrects approximately 99% of those that escaped proofreading, reducing the error rate further to approximately $10^{-9}$ to $10^{-10}$.
The overall error rate is the raw error rate multiplied by the per-stage survival rates of errors:
$$\epsilon_{\text{total}} = \epsilon_{\text{pol}} \, (1 - f_{\text{proof}}) \, (1 - f_{\text{MMR}})$$
where $\epsilon_{\text{pol}}$ is the raw polymerase error rate, $f_{\text{proof}}$ is the proofreading correction efficiency, and $f_{\text{MMR}}$ is the MMR correction efficiency. Substituting:
$$\epsilon_{\text{total}} = 10^{-5} \times (1 - 0.99) \times (1 - 0.99) = 10^{-9}$$
For a diploid human genome of $6.4 \times 10^{9}$ base pairs, this gives approximately 6 new mutations per genome per cell division, consistent with direct measurements of de novo mutation rates in human pedigrees (approximately 70 per generation, accumulated over roughly 10–20 cell divisions in the germline).
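The multiplicative argument of the proof is short enough to evaluate directly. In the sketch below the function name and the diploid genome size of 6.4e9 bp are assumptions made for illustration; the error rate and the two 99% efficiencies are the values used in the proof.

```python
import math


def overall_error_rate(raw_error: float, proofreading_eff: float, mmr_eff: float) -> float:
    """Raw error rate times the fraction of errors escaping each correction stage."""
    return raw_error * (1.0 - proofreading_eff) * (1.0 - mmr_eff)


GENOME_BP = 6.4e9  # assumed diploid human genome size in base pairs

rate = overall_error_rate(raw_error=1e-5, proofreading_eff=0.99, mmr_eff=0.99)
mutations_per_division = rate * GENOME_BP

print(f"per-base error rate: {rate:.2e}")
print(f"new mutations per division: {mutations_per_division:.1f}")
```

Setting either efficiency to 0.0 removes that layer and raises the result 100-fold, which is the fidelity collapse discussed next.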
Corollary (Fidelity collapse from repair loss). Loss of a single repair layer increases the per-base error rate by 100-fold. Loss of MMR alone raises the error rate from $10^{-9}$ to $10^{-7}$, producing a mutator phenotype that drives rapid cancer evolution. Lynch syndrome, caused by germline mutations in MSH2 or MLH1, increases the error rate by this factor and confers an approximately 80% lifetime risk of colorectal cancer.
Bridge. The multiplicative fidelity principle builds toward 19.02.05, where the per-generation mutation rate enters the Wright-Fisher model as the parameter governing the supply of new genetic variation in a finite population. The foundational reason genome stability is maintained at $10^{-9}$ rather than $10^{-5}$ is that the cell deploys three independent error-correction stages in series, and this is exactly the engineering principle of a cascade of filters, applied to error suppression rather than signal amplification. The consequence for cancer, when one filter is removed, appears again in 17.07.02, where DNA damage signaling cascades activate cell-cycle arrest in response to the elevated error burden.
Exercises [Intermediate+]
Advanced results [Master]
DNA damage types and mutagen specificity
The DNA damage spectrum in a mammalian cell spans orders of magnitude in both frequency and severity. Lindahl's 1993 census [Lindahl1993] established the quantitative baseline: approximately 10,000 depurinations, 3,000 single-strand breaks, 1,000 oxidative lesions, and 500 cytosine deaminations per cell per day. These endogenous lesions are constant, unavoidable consequences of thermal chemistry and aerobic metabolism 17.04.01. Their rate sets the minimum repair capacity a cell must maintain to avoid genomic catastrophe.
Exogenous mutagens add to this baseline with distinctive chemical signatures. UV-B radiation (280–315 nm) produces cyclobutane pyrimidine dimers (CPDs) and 6-4 photoproducts (6-4PPs) at adjacent pyrimidines on the same strand. The chemistry is a [2+2] photocycloaddition between the C5-C6 double bonds of neighbouring thymines or cytosines. The resulting CPD bends the DNA helix by approximately 30 degrees and unwinds it by approximately 9 degrees — a distortion that XPC recognizes indirectly through the thermodynamic destabilization of the duplex, not through direct contact with the damaged bases.
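Because CPDs and 6-4 photoproducts form only between adjacent pyrimidines on the same strand, the candidate UV target sites in a sequence can be listed with a simple scan. This is a minimal sketch with an illustrative function name; it inspects only the strand it is given, so sites on the complementary strand would need a second pass over the reverse complement.

```python
def dipyrimidine_sites(seq: str) -> list[tuple[int, str]]:
    """Return (index, dinucleotide) for every pair of adjacent pyrimidines."""
    seq = seq.upper()
    return [
        (i, seq[i:i + 2])
        for i in range(len(seq) - 1)
        if seq[i] in "CT" and seq[i + 1] in "CT"
    ]


print(dipyrimidine_sites("ATTCGCCA"))
```

Overlapping sites are reported separately, so a run such as TTC contributes both TT and TC.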
Ionizing radiation (X-rays, gamma rays, particle radiation) generates a mixed damage profile: single-strand breaks, double-strand breaks, base modifications (primarily 8-oxoG and thymine glycol), and sugar damage. The linear energy transfer (LET) of the radiation determines the clustering of damage: high-LET radiation (alpha particles) produces dense clusters of lesions within 1–10 bp, which are harder to repair than the isolated lesions produced by low-LET radiation (X-rays).
Alkylating agents produce a predictable spectrum: N7-methylguanine (the most common alkylation product, relatively benign), O6-methylguanine (highly mutagenic — pairs with T instead of C, causing G to A transitions), and N3-methyladenine (blocks replication). The specificity of O6-methylG for causing G to A transitions after replication is the basis for its mutagenic potency: if unrepaired by MGMT, DNA polymerase inserts T opposite the lesion, and the next replication round fixes the G-C to A-T change permanently.
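The two-round logic by which a miscoding lesion becomes a fixed substitution can be written out explicitly: the polymerase inserts a base opposite the lesion in the first round, and that inserted base templates its own complement in the second round. The tables below restate pairings given in this unit; the dictionary and function names are illustrative.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

# Undamaged base each lesion derives from.
LESION_ORIGIN = {"O6-methylG": "G", "8-oxoG": "G", "uracil": "C"}
# Base the polymerase inserts opposite each unrepaired lesion.
LESION_PAIRS_WITH = {"O6-methylG": "T", "8-oxoG": "A", "uracil": "A"}


def fixed_substitution(lesion: str) -> str:
    """Return the substitution fixed after two replication rounds, e.g. 'G>A'."""
    original = LESION_ORIGIN[lesion]
    inserted = LESION_PAIRS_WITH[lesion]    # round 1: base placed opposite the lesion
    replacement = COMPLEMENT[inserted]      # round 2: complement of the inserted base
    return f"{original}>{replacement}"


for lesion in LESION_ORIGIN:
    print(lesion, fixed_substitution(lesion))
```

The model assumes the lesion escapes repair through both rounds; repair before the first round leaves no mutation.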
Theorem 1 (Mutational signatures). Each class of mutagen produces a characteristic pattern of base substitutions, flanking sequence context, and strand asymmetry. These mutational signatures can be extracted from tumor genome sequences by non-negative matrix factorization, allowing reconstruction of the mutagenic exposures that drove each tumor's evolution.
Alexandrov et al. (2013, Nature 500, 415–421) identified 21 distinct mutational signatures across human cancers, and later catalogues extend the list to approximately 30 or more. Signature 7 (predominantly C to T transitions at dipyrimidine sites with a CC preference) is the UV signature; signature 4 (C to A transversions) is the tobacco signature; signature 3 (large numbers of small indels with microhomology) marks BRCA-deficient HR failure. The mutational signature programme identifies the causal mutagen retrospectively from the tumor genome alone.
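Signature extraction starts from a count matrix whose rows are substitution classes in their flanking context. A common convention, assumed here, reports each substitution from the pyrimidine strand, which yields 6 substitution types times 16 flanking contexts, or 96 channels. The sketch shows only this encoding step, not the matrix factorization itself, and the function names are illustrative.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]


def sbs96_channel(context: str, alt: str) -> str:
    """Encode a substitution as e.g. 'T[C>T]C'.

    context: reference trinucleotide with the mutated base in the middle.
    alt: the base that replaces the middle base.
    """
    context, alt = context.upper(), alt.upper()
    if len(context) != 3 or len(alt) != 1 or alt == context[1]:
        raise ValueError("need a trinucleotide and a different alternate base")
    if context[1] in "AG":
        # Purine reference: report the event from the complementary strand.
        context = reverse_complement(context)
        alt = reverse_complement(alt)
    return f"{context[0]}[{context[1]}>{alt}]{context[2]}"


print(sbs96_channel("TCC", "T"))
print(sbs96_channel("GGA", "A"))  # same event seen from the other strand
```

Counting these channels per tumor gives the non-negative matrix that a factorization method then decomposes into signatures and exposures.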
Base excision repair and mismatch repair in depth
BER is the workhorse repair pathway, handling the highest volume of lesions. The pathway operates in two sub-pathways distinguished by the length of the repair patch.
Short-patch BER replaces a single nucleotide. A monofunctional glycosylase (e.g., UNG for uracil, SMUG1 for oxidized uracil) cleaves the N-glycosidic bond, releasing the damaged base and creating an AP site. APE1 cleaves the phosphodiester backbone 5' to the AP site, generating a 3'-OH and a 5'-deoxyribose phosphate (dRP). Pol $\beta$ removes the dRP moiety via its lyase activity, inserts the correct nucleotide, and DNA ligase III-XRCC1 seals the nick. The entire process replaces one nucleotide.
Long-patch BER replaces 2–10 nucleotides and is used when the dRP moiety is refractory to Pol $\beta$ lyase activity (e.g., reduced or oxidized sugar residues). Pol $\delta$ or Pol $\varepsilon$ performs strand displacement synthesis, generating a flap of 2–10 nucleotides that is removed by flap endonuclease 1 (FEN1). DNA ligase I seals the final nick. Long-patch BER is favoured when the cell is in S phase because the replication machinery (PCNA, Pol $\delta$/$\varepsilon$, FEN1) is already available.
Theorem 2 (BER enzyme specificity). Each DNA glycosylase recognizes a specific subset of damaged bases through a combination of base-flipping (the damaged base is rotated 180 degrees out of the helix into the enzyme active site) and chemical sensing. UNG distinguishes uracil from thymine by a single atom: the 5-methyl group of thymine is sterically excluded from the UNG active site, while uracil (lacking this methyl group) fits. OGG1 recognizes 8-oxoG by detecting the additional oxygen atom at position 8.
Mismatch repair is the final proofreading layer after polymerase selectivity and exonuclease proofreading. Its central problem is strand discrimination: the mismatch is a legitimate base pair in chemistry (e.g., G-T has two hydrogen bonds), so the repair machinery must identify which strand is newly synthesized. In E. coli, Dam methylase methylates adenine at GATC sequences, and the newly synthesized strand is transiently hemimethylated (methylated on the parental strand only). MutH nicks the unmethylated (new) strand at the nearest GATC site.
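The E. coli strand-discrimination step depends on locating the GATC site closest to the mismatch, where MutH can nick the unmethylated strand. A minimal sketch of that search, with an illustrative function name and no modelling of methylation state:

```python
from typing import Optional


def nearest_gatc(seq: str, mismatch_pos: int) -> Optional[int]:
    """Return the start index of the GATC site closest to mismatch_pos, or None."""
    seq = seq.upper()
    sites = [i for i in range(len(seq) - 3) if seq[i:i + 4] == "GATC"]
    if not sites:
        return None
    return min(sites, key=lambda i: abs(i - mismatch_pos))


example = "GATC" + "A" * 8 + "GATC"   # GATC sites at indices 0 and 12
print(nearest_gatc(example, 10))
```

Because GATC is its own reverse complement, a site found on one strand is also a site on the other, which is why the methylation state rather than the sequence tells the two strands apart.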
Theorem 3 (Eukaryotic strand discrimination). In eukaryotes, which lack Dam methylation, strand discrimination uses the intrinsic asymmetry of the replication fork: the lagging strand contains frequent Okazaki fragment junctions (nicks), and the leading strand is distinguished by the presence of PCNA and the 3' terminus of the nascent strand. MutS-alpha (MSH2-MSH6) binds the mismatch and recruits MutL-alpha (MLH1-PMS2), which contains a latent endonuclease activity activated by PCNA and RFC. This endonuclease introduces nicks in the discontinuous strand near the mismatch, providing the entry point for exonuclease 1 (EXO1) degradation past the mismatch.
Lynch syndrome (hereditary nonpolyposis colorectal cancer, HNPCC) results from germline mutations in MMR genes — most commonly MSH2 or MLH1. The resulting MMR deficiency produces a mutator phenotype (approximately 100-fold elevated mutation rate) that accelerates the accumulation of driver mutations. The lifetime colorectal cancer risk reaches approximately 80% for MLH1 mutation carriers, compared to approximately 4% in the general population. Endometrial cancer risk is approximately 60% for female MLH1 carriers.
Double-strand break repair pathways
Double-strand breaks (DSBs) are the most dangerous form of DNA damage because both strands are severed, eliminating the template needed for accurate repair. A single unrepaired DSB can trigger apoptosis or cause chromosomal translocations that activate oncogenes. Two mechanistically distinct pathways repair DSBs.
Non-homologous end joining (NHEJ) ligates the broken ends directly. The Ku70/Ku80 heterodimer binds each DSB end within seconds of break formation, forming a ring around the DNA terminus. Ku recruits DNA-PKcs (the catalytic subunit of DNA-dependent protein kinase), which autophosphorylates to activate the complex. If the ends are not directly ligatable (e.g., damaged bases, hairpins), Artemis nuclease processes them. XRCC4-DNA Ligase IV (with XLF/Cernunnos as a co-factor) performs the final ligation. NHEJ is fast (minutes to hours) and active throughout the cell cycle, but it is error-prone: nucleotides are lost or added at the junction, producing small insertions and deletions.
When NHEJ joins ends from different chromosomes, the result is a chromosomal translocation. The BCR-ABL translocation (Philadelphia chromosome, t(9;22)) joins the BCR gene on chromosome 22 to the ABL1 gene on chromosome 9, producing a constitutively active tyrosine kinase that drives chronic myeloid leukaemia. This single translocation event — one DSB on each chromosome, misjoined by NHEJ — is both the molecular cause of CML and the target for imatinib, the first successful tyrosine kinase inhibitor.
Homologous recombination (HR) uses an undamaged homologous sequence (the sister chromatid in mitotic cells) as a template for accurate repair. The MRN complex (Mre11-Rad50-Nbs1) binds the DSB and initiates 5'-to-3' end resection, generating 3' single-stranded DNA overhangs. BRCA1 promotes long-range resection by recruiting CtIP and the BLM helicase/EXO1 complex. RPA coats the ssDNA overhangs to prevent secondary structure formation.
BRCA2 then loads Rad51 onto the ssDNA, displacing RPA and forming a nucleoprotein filament. Rad51 performs strand invasion: the ssDNA filament searches for and invades the homologous duplex, forming a displacement loop (D-loop). DNA synthesis extends the invading 3' end using the sister chromatid as template. The resulting intermediates (Holliday junctions) are resolved by resolvases (GEN1, MUS81-EME1, or the BLM-TOP3A-RMI1/2 dissolvasome) to produce either crossover or non-crossover products.
The choice between NHEJ and HR is regulated by CDK-dependent phosphorylation of repair factors and by the MRN complex itself. During G1 phase, 53BP1 binds DSB ends and shields them from resection, directing repair toward NHEJ. In S and G2 phases, BRCA1 antagonizes 53BP1, allowing CtIP-mediated end resection and committing the break to HR. Loss of 53BP1 in BRCA1-deficient cells partially restores HR — a finding that reveals the competitive balance between the two pathways and has implications for resistance to PARP inhibitors.
Gene conversion outcomes in HR are not always symmetric. When the donor sequence differs from the broken copy at heterozygous sites, the repair can produce 3:1 or 4:0 segregation patterns at these sites (where 2:2 is the Mendelian expectation). Haber's work in S. cerevisiae mating-type switching [Haber1992] demonstrated these non-Mendelian gene conversion events directly: the DSB at the MAT locus is repaired using a silent donor (HML or HMR), and the repaired locus acquires the donor's sequence. The ratio of crossover to non-crossover outcomes is tightly regulated — crossovers are essential for proper chromosome segregation in meiosis but are suppressed in mitosis because they can produce loss of heterozygosity at tumour suppressor loci.
Theorem 4 (Synthetic lethality of PARP inhibition in HR-deficient cells). Inhibition of poly-ADP-ribose polymerase (PARP) is selectively lethal to cells with defective homologous recombination (e.g., BRCA1 or BRCA2 mutations). PARP repairs single-strand breaks (SSBs) via the base excision repair pathway. When PARP is inhibited, unrepaired SSBs are encountered by the replication fork and converted to one-ended DSBs. In HR-proficient cells, these DSBs are accurately repaired by HR. In HR-deficient cells, the DSBs are shunted to error-prone NHEJ, causing genomic catastrophe and cell death. This synthetic lethality is the basis for olaparib and other PARP inhibitors in BRCA-mutant ovarian and breast cancer (Bryant et al. 2005 Nature 434, 913–917; Farmer et al. 2005 Nature 434, 917–921).
The clinical significance is substantial: PARP inhibitors kill tumour cells with BRCA mutations while sparing normal cells (which retain one functional BRCA allele and can perform HR). This is one of the first successful applications of synthetic lethality in oncology.
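The logic of synthetic lethality is a two-input truth table: either defect alone is tolerated, and only the combination is lethal. The toy model below encodes the reasoning of Theorem 4 and nothing more; the function name is illustrative, and real responses are graded, not binary.

```python
def cell_survives(hr_proficient: bool, parp_inhibited: bool) -> bool:
    """Toy model of PARP-inhibitor sensitivity."""
    # With active PARP, single-strand breaks are repaired before replication.
    if not parp_inhibited:
        return True
    # With PARP inhibited, SSBs become one-ended DSBs that require HR.
    return hr_proficient


for hr in (True, False):
    for parp_inhibited in (False, True):
        print(f"HR proficient={hr}, PARP inhibited={parp_inhibited}: "
              f"survives={cell_survives(hr, parp_inhibited)}")
```

Only the HR-deficient, PARP-inhibited combination returns `False`, which corresponds to the selective killing of BRCA-mutant tumour cells.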
Translesion synthesis and the DNA damage response
When the replication fork encounters a lesion that has not been repaired, the replicative polymerase (Pol $\delta$/$\varepsilon$) stalls because its active site cannot accommodate distorted bases. The cell then faces a choice: activate error-prone translesion synthesis (TLS) to bypass the lesion, or collapse the fork and invoke recombination-based recovery.
Translesion synthesis uses specialized Y-family polymerases with relaxed active sites that can accommodate damaged bases. Pol $\eta$ (eta) bypasses UV-induced thymine dimers accurately: it inserts two adenines opposite the dimer, preserving the sequence. Mutations in the POLH gene encoding Pol $\eta$ cause the XP-V (xeroderma pigmentosum variant) form of XP, which has normal NER but elevated UV mutagenesis because thymine dimers are bypassed by less accurate polymerases instead.
Pol $\iota$ (iota) and Pol $\kappa$ (kappa) bypass other lesions with variable accuracy. Pol $\zeta$ (Rev3/Rev7, a B-family polymerase) extends from the mispaired termini generated by Y-family insertion, completing the bypass. The Rev1 protein acts as a scaffold, recruiting TLS polymerases to the stalled fork via its ubiquitin-binding domain (which recognizes monoubiquitinated PCNA).
Theorem 5 (Error-prone vs. error-free bypass trade-off). TLS is regulated by PCNA ubiquitination. When the replication fork stalls at a lesion, Rad6-Rad18 (E2-E3 ubiquitin ligase) monoubiquitinates PCNA at K164. This modification recruits TLS polymerases. If the lesion is a CPD, Pol $\eta$ performs accurate bypass (error-free TLS). If the lesion is not a substrate for Pol $\eta$ (e.g., a bulky adduct), Pol $\iota$ or Pol $\kappa$ insert a nucleotide opposite the lesion and Pol $\zeta$ extends; this is error-prone TLS, introducing a mutation but preventing fork collapse. The trade-off: a mutation is preferable to a stalled replication fork that could generate a double-strand break.
The DNA damage response (DDR) coordinates repair with cell-cycle progression. The master regulators are ATM (activated by DSBs through the MRN complex) and ATR (activated by RPA-coated ssDNA at stalled forks). ATM phosphorylates H2AX on serine 139 (producing gamma-H2AX), which spreads megabases from the break site and serves as a recruitment platform for MDC1, 53BP1, and BRCA1. ATM also phosphorylates Chk2 and p53. ATR, recruited by ATRIP, phosphorylates Chk1, which inhibits CDC25A/C phosphatases, preventing CDK1/2 activation and halting the cell cycle at G1/S or G2/M checkpoints.
p53 sits at the decision node. When stabilized by ATM/ATR-mediated phosphorylation (which blocks its MDM2-mediated ubiquitination and degradation), p53 transcriptionally activates p21 (a CDK inhibitor), causing cell-cycle arrest. If the damage burden is too extensive for repair, p53 activates pro-apoptotic genes (BAX, PUMA, NOXA), committing the cell to programmed death. The decision between arrest and apoptosis depends on the magnitude and duration of the DNA damage signal, the cell type, and the p53 dosage.
The bacterial analogue of the DDR is the SOS response, first characterized by Miroslav Radman in 1975. When replication forks stall at unrepaired lesions, RecA binds the accumulated ssDNA and forms nucleoprotein filaments that activate the LexA repressor's self-cleavage. LexA derepression upregulates approximately 40 SOS genes, including UmuD and UmuC (which form Pol V, an error-prone TLS polymerase), sulA (which inhibits FtsZ and blocks cell division), and several DNA repair enzymes. The SOS response trades fidelity for survival: the upregulated error-prone polymerases introduce mutations, but they allow replication to complete rather than stall catastrophically. This regulated mutagenesis is a source of adaptive mutations under stress.
Theorem 6 (p53 as the guardian of the genome). p53 is mutated in approximately 50% of all human cancers — more than any other single gene. Its function is damage sensing and response orchestration: it receives inputs from ATM/ATR, decides between cell-cycle arrest (to allow repair) and apoptosis (when damage is irreparable), and transcriptionally regulates the effector genes. Loss of p53 eliminates the checkpoint that prevents cells with damaged DNA from dividing, allowing mutations to accumulate.
Synthesis. The foundational reason genomic integrity persists across billions of cell divisions is the hierarchical deployment of repair pathways, each tuned to a specific damage class: direct reversal for the rare lesions amenable to chemical correction, BER for the ubiquitous small modifications, NER for the bulky helix-distorting adducts, MMR for the replication errors, and DSB repair for the most dangerous lesions. Putting these together with the DNA damage checkpoint network reveals that the cell does not merely repair damage passively — it actively coordinates repair with cell-cycle progression, halting replication until the genome is restored. The central insight is that error-prone translesion synthesis is not a failure of the system but a calculated trade-off: when accurate repair is impossible, a mutagenic bypass is preferable to a stalled replication fork that would cause a catastrophic collapse. This is exactly the logic that identifies cancer cells with repair-deficient states — BRCA loss, MMR loss, NER loss — where the mutation burden escalates because one layer of the hierarchy is missing. The bridge is between the biochemistry of glycosylases, helicases, and recombinases on one hand and the population genetics of mutation-selection balance on the other 19.02.05: the pattern recurs from single enzymes to organismal fitness.
Full proof set [Master]
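Proposition 1 (Mutation-selection balance for a dominant deleterious allele). For a dominant deleterious allele $A$ with fitness reduction $s$ (fitness of $AA$, $Aa$, $aa$ equal to $1-s$, $1-s$, $1$) and mutation rate $\mu$ from $a$ to $A$ per generation, the equilibrium frequency of the deleterious allele in a large randomly-mating population is $q^* \approx \mu / s$, and the frequency of affected individuals is approximately $2\mu/s$.
Proof. Let $q$ be the frequency of the deleterious allele $A$ and $p = 1 - q$ be the wild-type frequency. The mean fitness is $\bar{w} = 1 - s(q^2 + 2pq) = 1 - sq(2 - q)$. For $q \ll 1$, $\bar{w} \approx 1 - 2sq \approx 1$.
The change in $q$ due to selection per generation is:
$$\Delta q_{\text{sel}} = \frac{q(1-s)}{\bar{w}} - q = -\frac{s\,q\,(1-q)^2}{\bar{w}} \approx -sq$$
The change due to mutation (new mutations from $a$ to $A$) is:
$$\Delta q_{\text{mut}} = \mu(1 - q) \approx \mu$$
At equilibrium, $\Delta q_{\text{sel}} + \Delta q_{\text{mut}} = 0$:
$$s\,q^* \approx \mu \quad\Longrightarrow\quad q^* \approx \frac{\mu}{s}$$
Since the disease is dominant, affected individuals are either $AA$ or $Aa$, with combined frequency $q^2 + 2pq \approx 2q^* = 2\mu/s$ for $q \ll 1$. For achondroplasia (a dominant disorder with $s \approx 0.8$), new mutations arise at an incidence of $2\mu$ against a total incidence of $2\mu/s$, so a fraction $s$, approximately 80%, of cases are new mutations.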
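The equilibrium can be checked numerically by iterating the exact one-generation recursion (selection against the dominant allele, then mutation toward it) until the allele frequency stops changing. The value of the mutation rate below is a hypothetical round number chosen only to exercise the code, and the function names are illustrative.

```python
def next_generation(q: float, s: float, mu: float) -> float:
    """One generation of selection against a dominant allele, then mutation."""
    mean_fitness = 1.0 - s * q * (2.0 - q)      # carriers (AA, Aa) have fitness 1 - s
    q_after_selection = q * (1.0 - s) / mean_fitness
    # Mutation converts a fraction mu of wild-type alleles to the deleterious allele.
    return q_after_selection + mu * (1.0 - q_after_selection)


def equilibrium_frequency(s: float, mu: float, generations: int = 500) -> float:
    q = 0.0
    for _ in range(generations):
        q = next_generation(q, s, mu)
    return q


s, mu = 0.8, 1e-5          # mu is a hypothetical illustrative value
q_eq = equilibrium_frequency(s, mu)
affected = q_eq * (2.0 - q_eq)   # frequency of AA plus Aa genotypes

print(f"simulated q*: {q_eq:.3e}, predicted mu/s: {mu / s:.3e}")
print(f"affected frequency: {affected:.3e}, predicted 2mu/s: {2 * mu / s:.3e}")
```

With strong selection the recursion converges within a few tens of generations, because each generation removes a fraction $s$ of the existing deleterious alleles.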
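Proposition 2 (Replication fidelity collapse from MMR loss). Loss of mismatch repair increases the per-base mutation rate by a factor of approximately $1/(1 - f_{\text{MMR}})$, where $f_{\text{MMR}}$ is the MMR correction efficiency. The per-genome mutation burden per cell division increases from approximately 6 to approximately 640 new mutations.
Proof. With functional MMR, the per-base error rate (with $\epsilon_{\text{pol}}$ the raw polymerase error rate and $f_{\text{proof}}$ the proofreading correction efficiency) is:
$$\epsilon_{+} = \epsilon_{\text{pol}}\,(1 - f_{\text{proof}})\,(1 - f_{\text{MMR}}) = 10^{-5} \times 10^{-2} \times 10^{-2} = 10^{-9}$$
Without MMR:
$$\epsilon_{-} = \epsilon_{\text{pol}}\,(1 - f_{\text{proof}}) = 10^{-5} \times 10^{-2} = 10^{-7}$$
The fold increase is $\epsilon_{-}/\epsilon_{+} = 1/(1 - f_{\text{MMR}}) = 100$. Per-genome: $10^{-9} \times 6.4 \times 10^{9} \approx 6$ versus $10^{-7} \times 6.4 \times 10^{9} = 640$. The 634 additional mutations per division are predominantly at microsatellite loci (where strand slippage is frequent) and at mono-/di-nucleotide repeats. Over the approximately 30–40 cell divisions required for a colonic crypt to accumulate the 3–7 driver mutations for colorectal cancer, MMR deficiency produces approximately 19,000–25,000 excess mutations, more than sufficient to generate the observed driver mutation complement.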
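The excess mutation burden quoted in the proof follows from simple arithmetic on the per-division counts. The sketch below reproduces it under the assumption of a 6.4e9 bp diploid genome, a value inferred here from the figure of 640 mutations per division and not stated explicitly in the source. The function name is illustrative.

```python
def mutations_per_division(genome_bp: float, raw_error: float,
                           proofreading_eff: float, mmr_eff: float) -> float:
    """Expected new mutations per genome per cell division."""
    return genome_bp * raw_error * (1.0 - proofreading_eff) * (1.0 - mmr_eff)


GENOME_BP = 6.4e9  # assumed diploid genome size

with_mmr = mutations_per_division(GENOME_BP, 1e-5, 0.99, 0.99)
without_mmr = mutations_per_division(GENOME_BP, 1e-5, 0.99, 0.0)
excess_per_division = without_mmr - with_mmr

for divisions in (30, 40):
    print(f"{divisions} divisions: about {excess_per_division * divisions:,.0f} excess mutations")
```

The two totals bracket the 19,000 to 25,000 range stated in the proof.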
Connections [Master]
- Biomolecules in cells 17.01.01. The nucleic acid chemistry introduced in the biomolecules unit provides the structural basis for understanding why specific lesions form: the N-glycosidic bond linking bases to deoxyribose is labile to hydrolysis (depurination), the exocyclic amino group on cytosine is susceptible to deamination, and the conjugated pi-system of purine and pyrimidine rings absorbs UV radiation to form photoproducts.
- Cellular respiration and ROS production 17.04.01. The electron transport chain in mitochondria leaks approximately 1–2% of electrons to oxygen, generating superoxide ($\mathrm{O_2^{\bullet -}}$) and downstream reactive oxygen species (hydrogen peroxide, hydroxyl radical) that are the primary endogenous source of oxidative DNA damage, including 8-oxoG. The mutation rate is therefore coupled to metabolic rate: tissues with high oxidative phosphorylation activity face a larger endogenous damage burden.
- Wright-Fisher model and the diffusion approximation 19.02.05. The per-generation mutation rate is the parameter that feeds into the Wright-Fisher population-genetics model as the rate of introduction of new alleles. The balance between mutation input ($\mu$) and selective removal ($s$) determines the equilibrium genetic load, and the mutation rate sets the molecular clock for neutral evolution. The mutation-selection balance proposition proved in this unit is the deterministic foundation on which the stochastic Wright-Fisher model builds.
- Electrophilic addition to alkenes 15.05.01. The formation of cyclobutane pyrimidine dimers by UV radiation is a photochemical [2+2] cycloaddition, a reaction that is mechanistically related to the pericyclic reactions covered in the alkene addition unit. The C5-C6 double bond in thymine is an electron-rich alkene susceptible to photochemical activation, and the resulting cyclobutane ring is a four-membered ring with the same ring strain considerations discussed in organic chemistry.
- Cell cycle and mitosis 17.08.01. The DNA damage checkpoints (G1/S and G2/M) described in the cell cycle unit consume the repair machinery characterised here: ATM/ATR kinases halt the cell cycle via p53/p21 and Chk1/Cdc25 to allow repair before replication or division proceeds. Failure of these checkpoints allows cells to replicate and segregate damaged DNA, accelerating mutation accumulation.
- Mendelian genetics 19.01.01 (pending). Mutations are the molecular origin of the alleles whose segregation Mendel described; the mutation rate sets the supply of new genetic variation on which selection and drift act. Dominance can arise when one allele produces a functional protein and the other does not (haploinsufficiency), a relationship that connects the molecular repair machinery to classical genetic ratios.
Historical & philosophical context [Master]
Hermann Muller demonstrated in 1927 that X-rays induce heritable mutations in Drosophila melanogaster [Muller1927], establishing that mutations could be artificially induced and opening the field of radiation genetics. Muller received the Nobel Prize in 1946 for this discovery. The Luria-Delbruck fluctuation test of 1943 [LuriaDelbruck1943] demonstrated that bacterial mutations arise spontaneously before exposure to selective pressure, not in response to it — establishing the randomness of mutation with respect to fitness.
James Cleaver discovered in 1968 that cells from xeroderma pigmentosum patients are defective in nucleotide excision repair [Cleaver1968], establishing the first direct link between a DNA repair defect and human disease. This finding connected the biochemistry of DNA repair to clinical medicine and stimulated the systematic characterization of repair pathways.
Tomas Lindahl's 1993 Nature paper, "Instability and decay of the primary structure of DNA" [Lindahl1993], systematically catalogued spontaneous DNA damage rates and demonstrated that DNA is chemically unstable — decaying at a rate incompatible with life without constant repair. This quantitative framework underpins the entire field. Lindahl went on to characterize base excision repair; Aziz Sancar elucidated nucleotide excision repair; and Paul Modrich defined the mismatch repair mechanism. The three shared the 2015 Nobel Prize in Chemistry.
Jack Haber's work on mating-type switching in S. cerevisiae (1992) [Haber1992] provided the experimental system for studying double-strand break repair by homologous recombination in eukaryotes, revealing the gene conversion and crossover outcomes that define HR. The concept of synthetic lethality, in which combining two individually non-lethal mutations produces cell death, was applied to BRCA-deficient cancer in 2005 by Bryant et al. [Bryant2005] and, independently, by Farmer et al., leading directly to PARP inhibitor therapy.
The view that cancer is a disease of accumulating mutations traces to Nowell's 1976 clonal evolution model (Science 194, 23–28), which proposed that tumours evolve by sequential acquisition of mutations conferring growth advantages. Vogelstein and colleagues supported this model in colorectal cancer in 1988, showing that progression from benign polyp to invasive carcinoma involves sequential alterations in APC, KRAS, and p53, each driving a further round of clonal expansion. The DNA repair defects described in this unit accelerate this evolutionary process: a repair deficiency increases the mutation supply, shortening the time a cell lineage needs to accumulate the full complement of driver mutations required for malignancy.
Bibliography [Master]
@article{Muller1927,
author = {Muller, H. J.},
title = {Artificial transmutation of the gene},
journal = {Science},
volume = {66},
year = {1927},
pages = {84--87},
}
@article{LuriaDelbruck1943,
author = {Luria, S. E. and Delbr{\"u}ck, M.},
title = {Mutations of bacteria from virus sensitivity to virus resistance},
journal = {Genetics},
volume = {28},
year = {1943},
pages = {491--511},
}
@article{Cleaver1968,
author = {Cleaver, J. E.},
title = {Defective repair replication of {DNA} in xeroderma pigmentosum},
journal = {Nature},
volume = {218},
year = {1968},
pages = {652--656},
}
@article{Lindahl1993,
author = {Lindahl, T.},
title = {Instability and decay of the primary structure of {DNA}},
journal = {Nature},
volume = {362},
year = {1993},
pages = {709--715},
}
@article{Modrich2006,
author = {Modrich, P.},
title = {Mechanisms in eukaryotic {DNA} mismatch repair},
journal = {J. Biol. Chem.},
volume = {281},
year = {2006},
pages = {30305--30309},
}
@article{Sancar2016,
author = {Sancar, A.},
title = {Mechanisms of {DNA} repair by photolyase and excision nuclease ({Nobel} Lecture)},
journal = {Angew. Chem. Int. Ed.},
volume = {55},
year = {2016},
pages = {8502--8527},
}
@article{Haber1992,
author = {Haber, J. E.},
title = {Mating-type gene switching in {Saccharomyces cerevisiae}},
journal = {Trends Genet.},
volume = {8},
year = {1992},
pages = {446--452},
}
@book{Friedberg2006,
author = {Friedberg, E. C. and Walker, G. C. and Siede, W. and Wood, R. D. and Schultz, R. A. and Ellenberger, T.},
title = {DNA Repair and Mutagenesis},
publisher = {ASM Press},
year = {2006},
edition = {2nd},
}
@book{Alberts2014,
author = {Alberts, B. and Johnson, A. and Lewis, J. and Morgan, D. and Raff, M. and Roberts, K. and Walter, P.},
title = {Molecular Biology of the Cell},
publisher = {Garland Science},
year = {2014},
edition = {6th},
}
@article{Alexandrov2013,
author = {Alexandrov, L. B. and Nik-Zainal, S. and Wedge, D. C. and others},
title = {Signatures of mutational processes in human cancer},
journal = {Nature},
volume = {500},
year = {2013},
pages = {415--421},
}
@article{Bryant2005,
author = {Bryant, H. E. and Schultz, N. and Thomas, H. D. and others},
title = {Specific killing of {BRCA2}-deficient tumours with inhibitors of poly({ADP}-ribose) polymerase},
journal = {Nature},
volume = {434},
year = {2005},
pages = {913--917},
}
Cycle D Track B deepening. Status: draft. All hooks_out targets are proposed. Pending Tyler review and external biology reviewer.