17.01.01 · mol-cell-bio / biomolecules

Biomolecules in cells — overview


Anchor (Master): Alberts et al., MBoC 7e; Voet & Voet, Biochemistry 5e; Dill & Bromberg, Molecular Driving Forces, 2nd ed. (2011); Tanford, The Hydrophobic Effect (1980); Pauling & Mirsky 1936 Proc. Natl. Acad. Sci. 22, 439-447; Anfinsen 1973 Science 181, 223-230

Intuition [Beginner]

Every cell, from a bacterium to a neuron, is built from four families of molecules: carbohydrates, lipids, proteins, and nucleic acids. These are the cell's structural materials, energy currency, catalysts, and information carriers. All four share a common architectural principle: small molecular building blocks (monomers) are strung together into large chains (polymers), and the properties of the polymer depend on both the identity and the sequence of its monomers.

Water is the solvent in which all of this chemistry happens. The polar nature of the water molecule means it dissolves charged and polar substances readily, but excludes nonpolar ones. This exclusion drives protein folding, membrane formation, and nearly every higher-order structure in the cell.

Functional groups are the reactive substructures attached to carbon skeletons. A hydroxyl group ( $-OH$ ) makes a molecule more water-soluble and can participate in hydrogen bonding. A carboxyl group ( $-COOH$ ) can donate a proton, becoming negatively charged at cellular pH. An amino group ( $-NH_{2}$ ) can accept a proton, becoming positively charged. These charged states govern how molecules interact inside the cell.

The monomer-to-polymer transition proceeds by condensation (dehydration synthesis): two monomers join, and a water molecule is removed. The reverse reaction, hydrolysis, breaks the bond by adding water. Enzymes in the cell control both directions. This cycle of building up and breaking down is the metabolic heartbeat of every living system.

Carbohydrates provide short-term energy storage and structural scaffolds. Glucose, a six-carbon sugar, is the primary fuel for cellular metabolism. Lipids store energy long-term in the hydrophobic tails of fatty acids and form the barriers that define cellular compartments. Proteins, built from twenty amino acid monomers, carry out nearly all catalytic and structural work. Nucleic acids (DNA and RNA) store and transmit genetic information.

Visual [Beginner]

The four classes of biomolecules can be organised by their monomer-polymer relationships:

Class	Monomer	Polymer	Bond type	Primary role
Carbohydrates	Monosaccharides (e.g., glucose)	Polysaccharides (e.g., glycogen)	Glycosidic	Energy, structure
Lipids	Fatty acids, glycerol	Triglycerides, phospholipids	Ester	Energy storage, membranes
Proteins	Amino acids	Polypeptides	Peptide	Catalysis, structure
Nucleic acids	Nucleotides	DNA, RNA	Phosphodiester	Information storage

Each condensation reaction removes one water molecule. Each hydrolysis reaction adds one back.

Worked example [Beginner]

Sucrose (table sugar) is a disaccharide composed of one glucose molecule and one fructose molecule linked by a glycosidic bond. When you eat sucrose, the enzyme sucrase catalyses hydrolysis, cleaving the bond and releasing free glucose and fructose.

Step 1. Identify the monomers. Glucose is a six-carbon aldose (an aldehyde-bearing sugar). Fructose is a six-carbon ketose (a ketone-bearing sugar). Both have the molecular formula $C_{6} H_{12} O_{6}$ .

Step 2. Identify the bond. The glycosidic bond in sucrose connects carbon-1 of glucose (the anomeric carbon, bearing the aldehyde-derived hydroxyl in ring form) to carbon-2 of fructose (the anomeric carbon in the ketose). Because both anomeric carbons are locked in the bond, sucrose is a non-reducing sugar.

Step 3. Write the reaction. Condensation: glucose + fructose yields sucrose + $H_{2} O$ . Hydrolysis (the biologically relevant direction in digestion): sucrose + $H_{2} O$ yields glucose + fructose. The enzyme sucrase lowers the activation energy barrier, allowing the reaction to proceed rapidly at body temperature ( $37^{\circ} C$ ).

Step 4. Energetics. The standard free energy change for sucrose hydrolysis is $Δ G^{\circ^{'}} \approx - 29 kJ/mol$ . The negative sign means the reaction releases energy and proceeds spontaneously under standard conditions. In the cell, the released monosaccharides are then absorbed and metabolised through glycolysis.

What this tells us: even a single glycosidic bond stores enough energy that its hydrolysis releases a measurable thermodynamic kick, and enzymes make that energy available on biologically relevant timescales.


Formal definition [Intermediate+]

A biomolecule is any organic molecule produced by a living organism. The four major classes are distinguished by their monomer chemistry and bond types.

Carbohydrates have the general formula $C_{n} (H_{2} O)_{m}$ . Monosaccharides are polyhydroxy aldehydes or ketones with three to seven carbons. Two monosaccharides join via a glycosidic bond (an acetal linkage between an anomeric carbon and a hydroxyl oxygen) to form a disaccharide, releasing one $H_{2} O$ . Longer chains form oligosaccharides (3-10 units) and polysaccharides (hundreds to thousands of units). The glycosidic bond has the general form:

$R_{1}-O-R_{2} + H_{2} O \rightleftharpoons R_{1}-OH + HO-R_{2}$

where the forward direction is hydrolysis and the reverse is condensation.

Lipids are defined operationally: they are biological molecules that are poorly soluble in water but soluble in nonpolar solvents. Unlike the other three classes, lipids are not all polymeric. The main subtypes are fatty acids (long hydrocarbon chains with a terminal carboxyl group), triglycerides (three fatty acids esterified to a glycerol backbone), phospholipids (two fatty acids plus a phosphate-containing head group on glycerol), and sterols (four-ring structures such as cholesterol). The ester bond linking fatty acids to glycerol forms by condensation between a carboxyl group and a hydroxyl group.

Proteins are polymers of amino acids. Each of the twenty standard amino acids has a central (alpha) carbon bonded to an amino group ( $-NH_{2}$ ), a carboxyl group ( $-COOH$ ), a hydrogen, and a variable side chain (R group). Amino acids polymerise via peptide bonds, formed by condensation between the carboxyl group of one amino acid and the amino group of the next:

$R-CH(NH_{2})-COOH + R'-CH(NH_{2})-COOH \longrightarrow R-CH(NH_{2})-CO-NH-CH(R')-COOH + H_{2} O$

A polypeptide chain has an N-terminus (free amino group) and a C-terminus (free carboxyl group). Proteins range from a few dozen to several thousand amino acid residues.

Nucleic acids are polymers of nucleotides. Each nucleotide has three components: a nitrogenous base (purine or pyrimidine), a pentose sugar (ribose in RNA, deoxyribose in DNA), and one or more phosphate groups. Nucleotides polymerise via phosphodiester bonds that link the 3'-hydroxyl of one nucleotide to the 5'-phosphate of the next. In enzymatic synthesis the incoming monomer is a nucleoside triphosphate, and each bond formed releases pyrophosphate ( $PP_{i}$ ). The resulting backbone is a sugar-phosphate chain with the bases projecting from it.

Water as a biological solvent has a high dielectric constant ( $ε \approx 80$ at $20^{\circ} C$ ), high specific heat capacity, and strong hydrogen-bonding capacity. These properties stabilise charged species, moderate temperature fluctuations, and drive the hydrophobic effect that underlies membrane formation and protein folding.

Functional groups summary

Group	Structure	Charge at pH 7	Key role
Hydroxyl	$-OH$	Neutral	H-bond donor/acceptor; alcohols, sugars
Carboxyl	$-COOH$	$-COO^{-}$	Acid; amino acid C-terminus
Amino	$-NH_{2}$	$-NH_{3}^{+}$	Base; amino acid N-terminus
Phosphate	$-PO_{4} H_{2}$	$-PO_{4}^{2 -}$	Energy transfer (ATP); nucleic acid backbone
Sulfhydryl	$-SH$	Neutral	Disulfide bonds in proteins
Carbonyl	$C=O$	Neutral	Aldehyde/ketone; sugar reactivity

Key theorem with proof [Intermediate+]

Theorem (Hydrolysis thermodynamics of biomolecular bonds). Ranked by the magnitude of the standard free energy released on hydrolysis, the major biomolecular bonds typically follow the hierarchy: phosphoanhydride (ATP) > ester > glycosidic > peptide. The more negative $Δ G^{\circ^{'}}$ of hydrolysis, the more energy is released when the bond is cleaved, and the more work the cell can extract from the reaction. ATP hydrolysis ( $Δ G^{\circ^{'}} \approx - 30.5 kJ/mol$ ) occupies the top of this hierarchy, which is one reason the cell uses ATP as its universal energy currency. The ranking describes typical bonds of each class: sucrose, whose glycosidic bond joins two anomeric carbons, is an unusually energy-rich exception at about $- 29 kJ/mol$ .

Proof sketch (by enthalpy-entropy decomposition). Consider a generic condensation bond $A-B$ hydrolysing to $A-OH + HO-B$ . The free energy change is:

$Δ G^{\circ^{'}} = Δ H^{\circ^{'}} - T Δ S^{\circ^{'}}$

For phosphoanhydride bonds (as in ATP), three factors make $Δ G^{\circ^{'}}$ strongly negative. First, electrostatic repulsion: at pH 7, ATP carries four negative charges on its triphosphate tail. Hydrolysis relieves this repulsion by separating the charges onto ADP ( $-3$ ) and inorganic phosphate ( $-2$ at physiological pH), with one proton released to balance the charge; each product is now solvated independently. The enthalpy gain from charge separation is substantial.

Second, resonance stabilisation: inorganic phosphate ( $P_{i}$ ) distributes its negative charge over four equivalent oxygen atoms. ADP also gains resonance stabilisation relative to its state in the ATP molecule. The products are thermodynamically more stable than the reactant.

Third, entropy increase: one molecule (ATP) produces two molecules (ADP + $P_{i}$ ), increasing the number of translational degrees of freedom. At $T = 310 K$ (body temperature), even a modest entropy increase contributes $- T Δ S^{\circ^{'}} \approx - 10$ to $- 15 kJ/mol$ to the total free energy change.

Combining these contributions gives $Δ G^{\circ^{'}} \approx - 30.5 kJ/mol$ for ATP hydrolysis under standard biochemical conditions (1 M concentrations, pH 7, $25^{\circ} C$ ). In the cell, where the mass-action ratio [ADP][ $P_{i}$ ]/[ATP] is orders of magnitude below its standard-state value, the actual $Δ G$ is closer to $- 50$ to $- 60 kJ/mol$ .
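The shift from the standard value to the cellular value follows from $Δ G = Δ G^{\circ^{'}} + RT \ln Q$ with $Q = [ADP][P_{i}]/[ATP]$ . A minimal Python sketch, assuming illustrative millimolar concentrations (hypothetical values chosen for this example, not taken from the text):

```python
import math

R = 8.314e-3  # gas constant in kJ/(mol*K)

def actual_delta_g(dg_standard, temp_k, atp, adp, pi):
    """Return dG = dG0' + RT ln([ADP][Pi]/[ATP]) in kJ/mol.

    Concentrations are in mol/L, relative to the 1 M standard state.
    """
    q = (adp * pi) / atp
    return dg_standard + R * temp_k * math.log(q)

# Assumed, illustrative cellular concentrations (not from the text)
dg_cell = actual_delta_g(-30.5, 310.0, atp=2.25e-3, adp=0.25e-3, pi=1.65e-3)
print(f"Cellular dG of ATP hydrolysis: {dg_cell:.1f} kJ/mol")
```

With these assumed concentrations the result lands a little beyond $- 50 kJ/mol$ ; setting all three concentrations to 1 M recovers the standard value.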

Peptide bonds, by contrast, have $Δ G^{\circ^{'}} \approx - 3.8 kJ/mol$ for hydrolysis. The smaller driving force reflects the absence of electrostatic repulsion in the reactant and less resonance stabilisation in the products. Hydrolysis is still thermodynamically favourable, yet proteins are kinetically stable: the resonance-stabilised amide bond presents a high activation barrier, so uncatalysed hydrolysis is extremely slow and the cell relies on proteases to control it.

Bridge. The hydrolysis hierarchy builds toward 17.04.01 cellular respiration, where ATP is invested in the early steps of glycolysis and regenerated by substrate-level and oxidative phosphorylation. The foundational reason ATP occupies the top of the hierarchy is the combination of electrostatic repulsion in the triphosphate tail and resonance stabilisation in the products. A related electrostatic principle appears in the catalytic triad of serine proteases, where a charge relay stabilises the transition state (15.14.01, pending). The free-energy framework appears again in 14.06.01 chemical thermodynamics as the canonical application of $Δ G = Δ H - T Δ S$ to biochemical coupling.

Worked example: comparing bond energies

Calculate how many peptide bonds the cell can synthesise using the energy from hydrolysis of one ATP molecule under cellular conditions ( $Δ G \approx - 50 kJ/mol$ ).

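Each peptide bond formation requires condensation, which is energetically unfavourable. The cell couples this to ATP hydrolysis. Peptide bond hydrolysis releases $3.8 kJ/mol$ , so forming one requires approximately $+ 3.8 kJ/mol$ (at standard conditions). One ATP hydrolysis at cellular $Δ G \approx - 50 kJ/mol$ therefore supplies, on paper, enough free energy for $50/3.8 \approx 13$ peptide bonds. This is only a thermodynamic ceiling, and it mixes a cellular $Δ G$ with a standard-state one. In practice the ribosome spends about four high-energy phosphate bonds per peptide bond (two for aminoacyl-tRNA charging, two GTP during elongation), because most of that energy buys speed and accuracy rather than the bond itself.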
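The ratio in this worked example can be checked in a couple of lines. This is a sketch that uses only the two free-energy magnitudes quoted in the text; the variable names are introduced here for illustration.

```python
ATP_CELLULAR_KJ = 50.0  # magnitude of cellular dG for ATP hydrolysis, kJ/mol (from the text)
PEPTIDE_BOND_KJ = 3.8   # standard-state cost of forming one peptide bond, kJ/mol (from the text)

# Thermodynamic ceiling: free energy available divided by cost per bond
bonds_upper_bound = ATP_CELLULAR_KJ / PEPTIDE_BOND_KJ
print(f"At most {bonds_upper_bound:.1f} peptide bonds per ATP on free-energy grounds")
```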

Exercises [Intermediate+]

Exercise 5 (medium, short answer).

Starch and cellulose are both polymers of glucose, yet humans can digest starch but not cellulose. Explain why, naming the specific structural difference and the enzymatic constraint.

Hint

The difference lies in the configuration of the glycosidic bond. One uses alpha linkage; the other uses beta.

Answer

Starch consists of alpha-1,4-glycosidic bonds (with alpha-1,6 branches in amylopectin). Cellulose consists of beta-1,4-glycosidic bonds. The beta linkage produces a linear, flat chain that packs tightly into fibres. Humans possess amylase, which cleaves alpha-1,4 bonds, but lack cellulase, the enzyme required to hydrolyse beta-1,4 bonds. Ruminants and termites host symbiotic microorganisms that produce cellulase, enabling them to extract energy from cellulose.

Exercise 7 (hard, symbolic).

ATP hydrolysis has $Δ G^{\circ^{'}} = - 30.5 kJ/mol$ . The synthesis of glutamine from glutamate and $NH_{4}^{+}$ has $Δ G^{\circ^{'}} = + 14.2 kJ/mol$ . Show that coupling ATP hydrolysis to glutamine synthesis makes the overall reaction spontaneous, and calculate $Δ G^{\circ^{'}}$ for the coupled reaction.

Hint

Coupled reactions have additive free energy changes. Write both half-reactions and add them.

Answer

Reaction 1: $ATP + H_{2} O \to ADP + P_{i}$ , $Δ G^{\circ^{'}} = - 30.5 kJ/mol$ .

Reaction 2: $Glutamate + NH_{4}^{+} \to Glutamine + H_{2} O$ , $Δ G^{\circ^{'}} = + 14.2 kJ/mol$ .

Coupled: $ATP + Glutamate + NH_{4}^{+} \to ADP + P_{i} + Glutamine$ , $Δ G^{\circ^{'}} = - 30.5 + 14.2 = - 16.3 kJ/mol$ .

The coupled reaction is spontaneous. In the actual enzymatic mechanism (glutamine synthetase), the reaction proceeds through a gamma-glutamyl phosphate intermediate, ensuring tight coupling.
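To see what the sign change means in practice, the standard free energies can be converted to equilibrium constants with $K = e^{- Δ G^{\circ^{'}} / RT}$ . A minimal sketch, assuming $T = 298.15 K$ to match standard conditions:

```python
import math

R = 8.314e-3  # gas constant in kJ/(mol*K)
T = 298.15    # K, assumed standard temperature

def k_eq(dg_standard_kj):
    """Equilibrium constant from a standard free energy change (kJ/mol)."""
    return math.exp(-dg_standard_kj / (R * T))

dg_atp = -30.5  # ATP hydrolysis, from the exercise
dg_gln = 14.2   # glutamine synthesis, from the exercise
dg_coupled = dg_atp + dg_gln  # free energies of coupled reactions add

print(f"Coupled dG0' = {dg_coupled:.1f} kJ/mol")
print(f"K (uncoupled) = {k_eq(dg_gln):.2e}")
print(f"K (coupled)   = {k_eq(dg_coupled):.2e}")
```

The uncoupled equilibrium constant is well below 1 and the coupled one is in the hundreds, so coupling shifts the equilibrium by roughly five orders of magnitude.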

Exercise 8 (hard, short answer).

Cholesterol is classified as a lipid despite being neither a fatty acid nor a glycerol ester. Justify this classification using the operational definition of lipids. Then explain why cholesterol's rigid ring structure is functionally important for membrane properties.

Hint

Consider the operational definition: insoluble in water, soluble in organic solvents. For membrane function, consider how rigid rings affect fluidity.

Answer

Cholesterol is a lipid by the operational definition: its hydrocarbon-dominated structure (four fused rings plus a hydrocarbon tail) makes it poorly soluble in water but soluble in organic solvents like chloroform.

Cholesterol's rigid fused-ring structure inserts between phospholipid fatty-acid tails in the bilayer. At high temperatures, it restrains the movement of the tails, decreasing membrane fluidity. At low temperatures, it prevents tight packing of the tails, maintaining fluidity. This dual buffering effect keeps membrane fluidity within a functional range across a wider temperature span. The hydroxyl group at carbon-3 positions cholesterol at the lipid-water interface, anchoring it in place.

Exercise 9 (hard, symbolic).

The Watson-Crick base pairs are adenine-thymine (A-T, two hydrogen bonds) and guanine-cytosine (G-C, three hydrogen bonds). A DNA molecule of length 100 base pairs has 60% G-C content. Estimate the total number of hydrogen bonds stabilising the double helix, and explain why DNA with higher G-C content has a higher melting temperature ( $T_{m}$ ).

Hint

Calculate how many G-C pairs and how many A-T pairs there are, then multiply by the number of hydrogen bonds per pair type.

Answer

100 base pairs: 60 are G-C, 40 are A-T.

Total hydrogen bonds: $60 \times 3 + 40 \times 2 = 180 + 80 = 260$ hydrogen bonds.

Higher G-C content raises $T_{m}$ because each G-C pair contributes three hydrogen bonds (vs. two for A-T), requiring more thermal energy to disrupt. Additionally, the stacking interactions involving G-C base pairs are stronger than those involving A-T pairs. A commonly used empirical approximation for duplexes longer than about 13 bp is $T_{m} \approx 64.9 + 41 \times (n_{G} + n_{C} - 16.4) / N$ in $^{\circ} C$ , where $n_{G} + n_{C}$ is the number of G and C bases per strand and $N$ is the strand length; the exact value also depends on salt concentration. For 100 bp at 60% G-C: $T_{m} \approx 64.9 + 41 \times (60 - 16.4)/100 \approx 83^{\circ} C$ .
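The hydrogen-bond count generalises to any length and composition. A short sketch (the function name is introduced here for illustration):

```python
def duplex_hydrogen_bonds(length_bp, gc_fraction):
    """Count Watson-Crick hydrogen bonds: 3 per G-C pair, 2 per A-T pair."""
    gc_pairs = round(length_bp * gc_fraction)
    at_pairs = length_bp - gc_pairs
    return 3 * gc_pairs + 2 * at_pairs

total = duplex_hydrogen_bonds(100, 0.60)
print(total)  # 60*3 + 40*2 = 260
```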

Noncovalent interactions and the hydrophobic effect [Master]

The four classes of biomolecules interact through a small repertoire of noncovalent forces whose quantitative properties determine the structure and stability of every cellular assembly. These forces are individually weak (0.5-5 kcal/mol per interaction, compared to 50-100 kcal/mol for a covalent bond) but collectively determine three-dimensional structure through their enormous numbers.

Hydrogen bonds form when a hydrogen atom covalently bonded to an electronegative donor (D-H, where D is typically N, O, or S) interacts with a lone-pair-bearing acceptor atom (A). The geometry is constrained: the donor-acceptor (D $\dots$ A) distance is 2.7-3.2 Å and the D-H $\dots$ A angle exceeds 150 degrees for a strong bond ^{[Pauling 1936]}. In vacuum, a single hydrogen bond contributes 3-7 kcal/mol of stabilisation. In water, the net contribution drops to 0.5-1.5 kcal/mol because water molecules compete for the same donors and acceptors: the net stability of a hydrogen bond between two groups in solution is the difference between the group-group bond and the two group-water bonds it replaces.

In protein secondary structure, backbone amide N-H and carbonyl C=O groups form regular hydrogen-bond networks. The alpha-helix places H-bonds between residue $i$ and residue $i + 4$ (3.6 residues per turn, 5.4 Å pitch). The beta-sheet places them between adjacent strands. These patterns produce the characteristic signatures in circular dichroism spectroscopy and in the Ramachandran density map of folded proteins.

In DNA, the Watson-Crick base pairs exploit hydrogen bonding: adenine-thymine (two bonds) and guanine-cytosine (three bonds). The additional bond per G-C pair and the stronger stacking interactions of G-C pairs with their neighbours account for the higher melting temperature of GC-rich DNA.

Electrostatic interactions between charged groups follow Coulomb's law: $E = q_{1} q_{2} / (4 π ε_{0} ε_{r} r)$ . In the cellular environment, where the ionic strength is approximately 150 mM, the Debye-Hückel screening length $λ_{D}$ is approximately 0.8 nm. At a separation of one screening length (about 8 Å), the mobile ions in solution attenuate an electrostatic interaction to 1/e of its unscreened value, and the attenuation grows exponentially with distance beyond that. Inside a protein, where the local dielectric constant can be as low as $ε_{r} \approx 4$ (compared to $ε_{r} \approx 80$ for bulk water), salt bridges (ion pairs between oppositely charged side chains, such as lysine NH $_{3}^{+}$ and aspartate COO $^{-}$ ) contribute 1-5 kcal/mol. The pH dependence of these interactions connects to the pKa of the participating groups 14.10.01.
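The quoted screening length can be reproduced from the standard Debye-Hückel expression $λ_{D} = \sqrt{ε_{0} ε_{r} k_{B} T / (2 N_{A} e^{2} I)}$ . The sketch below assumes a simple 1:1 electrolyte, $ε_{r} = 80$ , and $T = 310$ K; the function names are illustrative.

```python
import math

EPS0 = 8.8541878128e-12     # vacuum permittivity, F/m
KB = 1.380649e-23           # Boltzmann constant, J/K
E_CHARGE = 1.602176634e-19  # elementary charge, C
N_A = 6.02214076e23         # Avogadro constant, 1/mol

def debye_length_nm(ionic_strength_molar, eps_r=80.0, temp_k=310.0):
    """Debye screening length in nm for ionic strength given in mol/L."""
    ionic_si = ionic_strength_molar * 1000.0  # convert mol/L to mol/m^3
    lam_sq = (EPS0 * eps_r * KB * temp_k) / (2.0 * N_A * E_CHARGE**2 * ionic_si)
    return math.sqrt(lam_sq) * 1e9

def screening_factor(r_nm, lambda_d_nm):
    """Extra attenuation exp(-r/lambda_D) applied to the Coulomb energy."""
    return math.exp(-r_nm / lambda_d_nm)

lam = debye_length_nm(0.150)
print(f"Debye length at 150 mM: {lam:.2f} nm")
print(f"Screening factor at 2 nm: {screening_factor(2.0, lam):.3f}")
```

At 150 mM the computed length comes out close to 0.8 nm, and the factor falls to 1/e when the separation equals that length.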

Van der Waals forces comprise London dispersion (induced dipole-induced dipole), Debye (dipole-induced dipole), and Keesom (dipole-dipole) interactions. The total van der Waals interaction between two atoms is approximated by the Lennard-Jones 6-12 potential:

$E (r) = 4 ε_{LJ} [(\frac{σ}{r})^{12} - (\frac{σ}{r})^{6}]$

where $ε_{LJ}$ is the well depth and $σ$ is the distance at which the potential is zero. The $r^{- 12}$ term is the repulsive wall (Pauli exclusion); the $r^{- 6}$ term is the attractive dispersion. The minimum lies at $r_{0} = 2^{1/6} σ$ . For carbon-carbon interactions, $σ \approx 3.4$ Å and $ε_{LJ} \approx 0.12$ kcal/mol, placing the minimum at $r_{0} \approx 3.8$ Å. Individual van der Waals contacts are weak, but a well-packed protein core contains thousands of them; their summed contribution to stability is 10-20 kcal/mol.
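The two landmark distances of the 6-12 potential, the zero crossing at $σ$ and the minimum at $2^{1/6} σ$ , can be verified numerically. The sketch treats 3.4 Å as $σ$ and 0.12 kcal/mol as the well depth; these are carbon-like parameters assumed for illustration.

```python
def lennard_jones(r, sigma, eps):
    """Lennard-Jones 6-12 energy; r and sigma share units, result in units of eps."""
    sr6 = (sigma / r) ** 6
    return 4.0 * eps * (sr6 * sr6 - sr6)

SIGMA = 3.4  # angstrom, assumed carbon-like value
EPS = 0.12   # kcal/mol, assumed well depth

r_min = 2 ** (1 / 6) * SIGMA  # analytic position of the minimum

print(f"E(sigma) = {lennard_jones(SIGMA, SIGMA, EPS):.3f} kcal/mol")
print(f"r_min    = {r_min:.2f} angstrom")
print(f"E(r_min) = {lennard_jones(r_min, SIGMA, EPS):.3f} kcal/mol")
```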

The hydrophobic effect is not a bond but an entropy-driven phenomenon. When a nonpolar solute dissolves in water, the surrounding water molecules reorganise into a more ordered cage-like structure to maintain their hydrogen-bond network around the nonpolar surface. This ordering reduces the entropy of the water. When two nonpolar surfaces come together, the ordered water between them is released back to the bulk, gaining translational and rotational entropy. The free energy of hydrophobic burial is approximately proportional to the buried nonpolar surface area. For a typical protein burying roughly 5000 Å$^{2}$ of nonpolar surface upon folding, the hydrophobic effect contributes roughly 125 kcal/mol to the driving force, by far the largest single contribution. Calorimetric measurements (differential scanning calorimetry, Privalov 1979) confirm that the hydrophobic effect has a large positive entropy component and a relatively small enthalpy component at room temperature.

Proposition. The free energy of transferring a nonpolar solute of surface area $A$ from a nonpolar environment into water is $Δ G_{transfer} \approx + γ A$ , where $γ \approx 25$ cal/(mol $\cdot$ Å$^{2}$ ) is the hydrophobic surface tension coefficient. The sign is positive (unfavourable) for exposing the surface to water; burying the same surface is the reverse process, with $Δ G \approx - γ A$ (favourable).

Proof. The transfer free energy decomposes as $Δ G = Δ H - T Δ S$ . For nonpolar solutes, $Δ H$ of transfer is small (typically +1 to +3 kcal/mol at 25 degrees C, from the disruption of van der Waals contacts with water). The entropy change is large and negative when the solute enters water, because water reorganises into ordered cages and loses configurational entropy: $Δ S \approx - A \cdot s_{0}$ , where $s_{0}$ is the entropy per unit area of caged water. Burial reverses the sign, releasing $Δ S_{water} \approx + A \cdot s_{0}$ . Calorimetric measurements give $s_{0} \approx 0.08$ cal/(mol $\cdot$ K $\cdot$ Å$^{2}$ ). At $T = 310$ K, the entropic cost of exposure is $- T Δ S = T A s_{0} \approx 310 \cdot 0.08 \cdot A \approx 25 A$ cal/mol. The enthalpy contribution is small by comparison, so the net coefficient remains approximately 25 cal/(mol $\cdot$ Å$^{2}$ ). $□$
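The coefficient and the burial energies quoted in this section follow from two multiplications. A small sketch using the text's values of $s_{0}$ and $γ$ (function names are illustrative):

```python
S0 = 0.08  # cal/(mol*K*angstrom^2), entropy per unit area of caged water (from the text)

def hydrophobic_coefficient(temp_k, s0=S0):
    """Entropic estimate of gamma = T * s0, in cal/(mol*angstrom^2)."""
    return temp_k * s0

def burial_free_energy_kcal(area_angstrom2, gamma_cal=25.0):
    """Free energy of burying nonpolar surface, in kcal/mol (negative = favourable)."""
    return -gamma_cal * area_angstrom2 / 1000.0

print(f"gamma at 310 K: {hydrophobic_coefficient(310.0):.1f} cal/(mol*A^2)")
print(f"Burying 5000 A^2: {burial_free_energy_kcal(5000.0):.0f} kcal/mol")
```

The same function applied to 800 Å$^{2}$ gives about $- 20$ kcal/mol, the figure used for phospholipid aggregation.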

The hydrophobic effect strengthens with increasing temperature (up to approximately 60-70 degrees C) because the entropy gain from releasing ordered water becomes larger at higher $T$ . This temperature dependence explains why some proteins cold-denature: at low temperatures the hydrophobic driving force weakens and the folded state becomes less favourable.

Protein structure hierarchy and folding thermodynamics [Master]

Proteins adopt a hierarchy of structural levels — primary, secondary, tertiary, and quaternary — each determined by the physicochemical properties of the amino acid sequence and the cellular environment.

Primary structure is the linear sequence of amino acid residues. The peptide bond is planar (partial double-bond character of the C-N bond, resonance-stabilised) and adopts the trans configuration in more than 99.9 percent of non-proline peptide bonds and in roughly 95 percent of X-Pro bonds, where the cis form is less disfavoured. This planarity constrains the backbone conformations accessible to the polypeptide chain.

Secondary structure comprises regular, repeating backbone conformations stabilised by hydrogen bonds. The alpha-helix (3.6 residues per turn, pitch 5.4 Å, H-bonds from C=O( $i$ ) to N-H( $i + 4$ )) was predicted by Pauling, Corey, and Branson in 1951 from stereochemical considerations before any protein structure had been solved experimentally. The beta-sheet (parallel or antiparallel strands connected by inter-strand H-bonds) was predicted by Pauling and Corey in a companion paper the same year. Turns (reverse turns, beta-bends) change the direction of the polypeptide chain and are enriched at the protein surface.

The Ramachandran plot maps the allowed combinations of the two backbone torsion angles $ϕ$ (N-C $_{α}$ ) and $ψ$ (C $_{α}$ -C). Steric clashes between backbone atoms and the side chain restrict the allowed regions. Glycine, with only a hydrogen atom as its side chain, can access a much larger region of the plot. Proline, whose side chain bonds to the backbone nitrogen, is restricted to a narrow range of $ϕ$ values. In a well-refined protein structure, greater than 98 percent of residues fall within the allowed regions.

Tertiary structure is the full three-dimensional arrangement of a single polypeptide chain. The driving force is hydrophobic collapse: nonpolar side chains (valine, leucine, isoleucine, phenylalanine, methionine) pack into the protein interior, while polar and charged side chains remain at the surface. The structure is stabilised by hydrogen bonds, electrostatic interactions (salt bridges between oppositely charged side chains), van der Waals packing in the core, and occasionally disulfide bonds (covalent S-S bonds between cysteine residues, formed in the oxidising environment of the endoplasmic reticulum).

The serine protease chymotrypsin illustrates these principles. Chymotrypsinogen, the inactive zymogen, is a 245-residue protein that folds into a compact globular structure stabilised by 5 disulfide bonds and a hydrophobic core. Upon proteolytic activation, a small conformational change repositions the catalytic triad (Ser195, His57, Asp102) into the precise geometry required for peptide bond hydrolysis. The X-ray structure determined by Matthews et al. in 1967 ^{[Matthews 1967]} revealed how the folded scaffold positions the catalytic residues — the serine hydroxyl, the histidine imidazole, and the aspartate carboxylate form a charge-relay system that activates Ser195 as a nucleophile.

Quaternary structure is the arrangement of multiple polypeptide subunits. Haemoglobin ( $α_{2} β_{2}$ tetramer) is the canonical example: oxygen binding to one subunit induces a conformational change that increases the oxygen affinity of the remaining subunits (cooperative binding, described by the Monod-Wyman-Changeux model). The Bohr effect (decreased oxygen affinity at lower pH) is a direct consequence of electrostatic interactions between protons and specific amino acid side chains at the subunit interface.

Anfinsen's thermodynamic hypothesis. In experiments on ribonuclease A beginning in 1961, Christian Anfinsen demonstrated that a denatured protein can refold to its native, enzymatically active conformation without any external template ^{[Anfinsen 1973]}. Ribonuclease A was denatured in 8 M urea with $β$ -mercaptoethanol (reducing the four disulfide bonds), then the denaturant and reducing agent were removed. The protein refolded to its native structure with full enzymatic activity, reforming the correct four disulfide bonds out of 105 possible combinations. Anfinsen concluded that the native structure is the thermodynamic ground state — the conformation with the lowest Gibbs free energy under physiological conditions. This result earned him the 1972 Nobel Prize in Chemistry.
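The figure of 105 is the number of ways to pair eight cysteines into four disulfides, $7 \times 5 \times 3 \times 1$ . A short sketch of the count (the function name is introduced here for illustration):

```python
def disulfide_pairings(n_cys):
    """Ways to pair n_cys cysteines into n_cys/2 disulfides: (n_cys - 1)!!"""
    if n_cys < 0 or n_cys % 2:
        raise ValueError("need a non-negative, even number of cysteines")
    count = 1
    # The first cysteine can pick any of n-1 partners, the next free one n-3, ...
    for partners in range(n_cys - 1, 0, -2):
        count *= partners
    return count

print(disulfide_pairings(8))  # ribonuclease A has eight cysteines
```

A random re-pairing would therefore recover the native set only about 1 percent of the time, which is why full recovery of activity was persuasive.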

Levinthal's paradox highlights the gap between thermodynamic prediction and kinetic reality. If a 100-residue protein sampled each of its roughly $10^{100}$ possible conformations at a rate of $10^{13}$ /s (the rate of bond vibration), exhaustive search would take $10^{87}$ s, or roughly $10^{79}$ years. Real proteins fold in milliseconds to seconds. The resolution is that the energy landscape is funnel-shaped: local interactions guide the chain toward the native state through a series of progressively lower-energy intermediates, rather than requiring a random search. The folding funnel concept, developed by Bryngelson and Wolynes in the 1980s, provides the theoretical framework.
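The order of magnitude of the exhaustive-search time can be checked directly from the conformation count and sampling rate given in the text. The seconds-per-year constant is the only value added here.

```python
SECONDS_PER_YEAR = 3.156e7  # about 365.25 days

conformations = 1e100  # from the text
sampling_rate = 1e13   # conformations per second, from the text

search_seconds = conformations / sampling_rate
search_years = search_seconds / SECONDS_PER_YEAR
print(f"Exhaustive search: {search_seconds:.1e} s = {search_years:.1e} years")
```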

Molecular chaperones assist protein folding in the cell without violating Anfinsen's thermodynamic hypothesis. The Hsp70 family binds exposed hydrophobic patches on nascent or misfolded polypeptides, preventing aggregation. The GroEL/GroES complex (Hsp60) provides an enclosed chamber in which a single protein molecule can fold in isolation, shielded from the crowded cytoplasm. Chaperones accelerate the kinetics of reaching the native fold and prevent off-pathway aggregation, but they do not change the thermodynamic endpoint.

Intrinsically disordered proteins (IDPs) lack a fixed three-dimensional structure under physiological conditions. Far from being non-functional, disorder enables binding-induced folding (the protein folds upon encountering its target), signalling flexibility (a single disordered region can interact with multiple partners), and regulatory post-translational modifications. Approximately 30-40 percent of eukaryotic proteins contain substantial disordered regions.

Protein misfolding diseases arise when proteins adopt aberrant conformations that self-associate into toxic aggregates. In Alzheimer's disease, the amyloid- $β$ peptide (A $β$ 42) forms oligomers and then fibrils with a cross- $β$ spine structure in which $β$ -strands run perpendicular to the fibril axis. In prion diseases (Creutzfeldt-Jakob disease, bovine spongiform encephalopathy), the normal cellular prion protein (PrP $^{C}$ , predominantly $α$ -helical) converts to the scrapie form (PrP $^{Sc}$ , predominantly $β$ -sheet), which templates further conversion. This conformational self-propagation is the molecular basis of prion infectivity.

Lipid self-assembly and membrane thermodynamics [Master]

The amphipathic character of lipids — a hydrophilic head group attached to hydrophobic hydrocarbon tails — drives the spontaneous formation of membrane structures. The thermodynamics of self-assembly is a direct consequence of the hydrophobic effect.

Critical micelle concentration (CMC). When amphipathic molecules are added to water, they remain as monomers at low concentration. Above the CMC, they self-assemble into micelles: spherical or cylindrical aggregates with the hydrophobic tails sequestered in the interior and the hydrophilic heads facing the water. The CMC depends on tail length (longer tails produce a lower CMC, because the hydrophobic driving force is stronger) and head group chemistry (larger or more polar heads produce a higher CMC). For sodium dodecyl sulfate (SDS, a 12-carbon tail), the CMC is approximately 8 mM. For phospholipids, the CMC is extremely low ( $10^{-10}$ to $10^{-6}$ M), which is why phospholipid bilayers are stable structures in cells.

The aggregation process is governed by $Δ G_{agg} = Δ H_{agg} - T Δ S_{agg}$ . The enthalpy change is modest; the dominant contribution is the entropy gain from releasing ordered water molecules around the hydrophobic tails, the same hydrophobic effect that drives protein folding. For a phospholipid with two 16-carbon tails, burying approximately 800 Å$^{2}$ of hydrophobic surface upon aggregation releases roughly $800 \times 25 = 20{,}000$ cal/mol = 20 kcal/mol of free energy.

The packing parameter $p = v / (a_{0} \cdot l_{c})$ — where $v$ is the tail volume, $a_{0}$ is the optimal head-group area, and $l_{c}$ is the critical tail length — predicts aggregate geometry. When $p < 1/3$ , the molecule is cone-shaped and forms micelles. When $1/3 < p < 1/2$ , cylindrical micelles form. When $1/2 < p < 1$ , the molecule is roughly cylindrical and forms bilayers (vesicles, liposomes). Phospholipids with two tails have $p$ near 1, which is why they form stable bilayers rather than micelles.
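The geometric rule can be written as a small classifier. This is a sketch of the thresholds stated above; the molecular dimensions in the usage lines are hypothetical numbers chosen to land in two different regimes, not measured values.

```python
def packing_parameter(tail_volume, head_area, tail_length):
    """p = v / (a0 * lc); use consistent units, e.g. angstrom^3, angstrom^2, angstrom."""
    return tail_volume / (head_area * tail_length)

def predicted_aggregate(p):
    """Map a packing parameter onto the geometries listed in the text."""
    if p <= 0:
        raise ValueError("packing parameter must be positive")
    if p < 1 / 3:
        return "spherical micelle"
    if p < 1 / 2:
        return "cylindrical micelle"
    if p <= 1:
        return "bilayer"
    return "outside the ranges covered here"

# Hypothetical single-tail and double-tail amphiphiles with the same head group
p_single = packing_parameter(tail_volume=350.0, head_area=65.0, tail_length=17.0)
p_double = packing_parameter(tail_volume=700.0, head_area=65.0, tail_length=17.0)
print(f"{p_single:.2f} -> {predicted_aggregate(p_single)}")
print(f"{p_double:.2f} -> {predicted_aggregate(p_double)}")
```

Doubling the tail volume at fixed head area and length doubles $p$ , which is the geometric reason a second tail pushes a lipid from micelles toward bilayers.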

Bilayer properties. A phospholipid bilayer is approximately 5 nm thick (depending on tail length). Lateral diffusion of individual lipid molecules within the plane of the bilayer is rapid (diffusion coefficient of roughly 1 $μ$ m$^{2}$ /s at 37 degrees C, so a lipid wanders about 2 $μ$ m in one second), allowing membrane components to redistribute. Transverse diffusion (flip-flop of a lipid from one leaflet to the other) is energetically costly because it requires dragging the polar head group through the hydrophobic core, and it occurs on a timescale of hours to days without enzymatic assistance. Flippases, floppases, and scramblases catalyse transverse lipid movement.
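For diffusion in the plane of the membrane, the mean-square displacement grows as $\langle r^{2} \rangle = 4 D t$ . The sketch below assumes a lateral diffusion coefficient of 1 $μ$ m$^{2}$ /s, a typical order of magnitude for a fluid bilayer, and a hypothetical 20 $μ$ m cell.

```python
import math

def rms_displacement_um(d_um2_per_s, t_s):
    """Root-mean-square displacement for 2D diffusion: sqrt(4 D t)."""
    return math.sqrt(4.0 * d_um2_per_s * t_s)

def time_to_travel_s(distance_um, d_um2_per_s):
    """Time for the RMS displacement to reach a given distance: L^2 / (4 D)."""
    return distance_um**2 / (4.0 * d_um2_per_s)

D_LIPID = 1.0  # um^2/s, assumed lateral diffusion coefficient

print(f"RMS displacement after 1 s: {rms_displacement_um(D_LIPID, 1.0):.1f} um")
print(f"Time to cover 20 um: {time_to_travel_s(20.0, D_LIPID):.0f} s")
```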

Phase behaviour. Below a characteristic melting temperature $T_{m}$ , the bilayer is in the gel phase ( $L_{β}$ ): the lipid tails are ordered and extended, and lateral diffusion is slow. Above $T_{m}$ , the bilayer enters the fluid (liquid-crystalline, $L_{α}$ ) phase: the tails are disordered, and lateral diffusion is fast. Saturated tails (no double bonds) pack tightly and raise $T_{m}$ ; unsaturated tails (cis double bonds) introduce kinks that disrupt packing and lower $T_{m}$ . Bacteria adjust their fatty acid composition in response to temperature changes (homeoviscous adaptation) to maintain functional membrane fluidity.

Cholesterol intercalates between phospholipid tails with its hydroxyl group at the lipid-water interface and its rigid ring system within the upper portion of the bilayer. At temperatures above $T_{m}$ , cholesterol restrains tail motion, decreasing fluidity. At temperatures below $T_{m}$ , it prevents tight packing, maintaining fluidity. This dual buffering effect keeps membrane properties within a functional range across a wider temperature span. Mammalian cell membranes contain 20-50 mol percent cholesterol.

Lipid rafts are cholesterol- and sphingolipid-enriched microdomains proposed to serve as platforms for signal transduction, membrane trafficking, and pathogen entry. Their existence has been supported by detergent-resistance assays and single-particle tracking, though their size (10-200 nm) and lifetime remain debated.

Membrane protein insertion exploits the same thermodynamic principles. A transmembrane alpha-helix of approximately 20 hydrophobic residues (a rise of 0.15 nm per residue gives a length of about 3 nm) spans the hydrophobic core of the roughly 5 nm bilayer, with its polar backbone hydrogen-bonded internally and its hydrophobic side chains facing the lipid tails. The energetic cost of inserting a charged residue into the bilayer core is approximately 2-4 kcal/mol, which is why transmembrane domains are overwhelmingly hydrophobic. The signal recognition particle (SRP) targets nascent membrane proteins to the endoplasmic reticulum, where the Sec61 translocon provides a gated channel for co-translational insertion of transmembrane helices into the bilayer. Beta-barrel outer-membrane proteins (in bacteria, mitochondria, and chloroplasts) are inserted post-translationally, by the Bam complex in bacteria and by related Omp85-family machinery in organelles (the SAM complex in mitochondria), folding into cylindrical barrels whose hydrogen-bonded strands close upon themselves.
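The helix-length arithmetic behind the figure of about 20 residues can be sketched as follows. The sketch assumes the standard alpha-helical rise of 1.5 Å (0.15 nm) per residue and a hydrophobic core about 30 Å thick; both are textbook approximations supplied here, and the function names are illustrative.

```python
import math

RISE_PER_RESIDUE_A = 1.5  # axial rise per residue in an alpha-helix, in angstroms


def helix_length_angstrom(n_residues: int) -> float:
    """Axial length of an ideal alpha-helix with n residues."""
    return n_residues * RISE_PER_RESIDUE_A


def residues_to_span(thickness_angstrom: float) -> int:
    """Smallest number of residues whose helix is at least this long.

    Assumes the helix axis is perpendicular to the membrane plane;
    a tilted helix would need more residues.
    """
    return math.ceil(thickness_angstrom / RISE_PER_RESIDUE_A)


print(helix_length_angstrom(20))   # length of a 20-residue helix, in angstroms
print(residues_to_span(30.0))      # residues needed for a ~3 nm hydrophobic core
print(residues_to_span(50.0))      # residues needed for the full ~5 nm bilayer
```

The comparison shows that 20 residues match the hydrophobic core rather than the full head-group-to-head-group thickness.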

Carbohydrate diversity and cellular function [Master]

Carbohydrates exhibit structural diversity generated by stereochemistry at multiple chiral centres and by the variety of glycosidic linkage positions. This diversity underlies their roles in energy storage, structural support, cell-cell recognition, and immune function.

Glycosidic bond stereochemistry. The anomeric carbon (C1 in aldoses, C2 in ketoses) of a sugar in ring form can adopt the $α$ -configuration (the C1-OH points down in the Haworth projection) or the $β$ -configuration (pointing up). Glycosidic bonds form between the anomeric carbon of one sugar and a hydroxyl-bearing carbon of another, and the $α$ or $β$ designation specifies the configuration at the linkage. Starch and glycogen use $α$ -1,4 linkages (and $α$ -1,6 branches), which produce an open, helical conformation amenable to enzymatic degradation. Cellulose uses $β$ -1,4 linkages, which produce a flat, extended conformation that packs tightly into fibres through inter-chain hydrogen bonds.

Polysaccharide structural diversity. Starch (plant energy storage) consists of amylose (unbranched $α$-1,4 glucan, forming a left-handed helix with 6 residues per turn) and amylopectin ($α$-1,4 backbone with $α$-1,6 branches every 24-30 residues). Glycogen (animal energy storage) is more highly branched ($α$-1,6 branches every 8-12 residues), providing many non-reducing ends for rapid glucose release by glycogen phosphorylase. Cellulose ($β$-1,4-glucan chains packed in parallel) forms microfibrils with tensile strength approaching that of steel on a per-weight basis. Chitin (arthropod exoskeletons, fungal cell walls) is a $β$-1,4-linked polymer of N-acetylglucosamine: the C2 hydroxyl of glucose is replaced by an acetamido (N-acetylamino) group, which adds inter-chain hydrogen bonds.

The diversity of biomolecular function depends on isomerism at multiple scales. At the monomer level, stereochemistry determines biological activity: L-amino acids are the form incorporated into proteins, whereas D-amino acids appear mainly in bacterial cell walls and some antibiotics. Similarly, D-glucose (not L-glucose) is the metabolic fuel. This homochirality is a universal feature of terrestrial life. Stereoisomers among the carbohydrates are also biologically distinct: glucose, galactose, and mannose share the formula $C_{6} H_{12} O_{6}$ but differ in hydroxyl configuration (galactose is the C4 epimer of glucose and mannose the C2 epimer). Galactose must be converted to a glucose derivative via the Leloir pathway before entering glycolysis, and failure of this conversion causes galactosemia. At the polymer level, sequence isomerism generates immense diversity: a protein of $n$ residues built from 20 amino acids has $20^{n}$ possible sequences.
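The scale of sequence isomerism is easier to appreciate with numbers. The following Python sketch counts the possible sequences for a chain of $n$ residues drawn from a 20-letter alphabet; the function names are illustrative, and the chain lengths are arbitrary examples.

```python
import math


def sequence_count(n_residues: int, alphabet_size: int = 20) -> int:
    """Number of distinct linear sequences: alphabet_size ** n_residues."""
    return alphabet_size ** n_residues


def log10_sequence_count(n_residues: int, alphabet_size: int = 20) -> float:
    """Base-10 logarithm of the count, convenient for long chains."""
    return n_residues * math.log10(alphabet_size)


for n in (2, 10, 100, 300):
    # Report the order of magnitude rather than the full integer.
    print(n, f"about 10^{log10_sequence_count(n):.0f} sequences")
```

Python integers have arbitrary precision, so `sequence_count` stays exact even for long chains; the logarithmic form is simply more readable.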

Glycoproteins carry covalently attached oligosaccharide chains that modify protein function, stability, and localisation. N-linked glycans attach to Asn in the consensus sequence Asn-X-Ser/Thr and are assembled on a dolichol lipid carrier in the endoplasmic reticulum before transfer to the protein. O-linked glycans attach to Ser or Thr residues in the Golgi apparatus. Glycosylation is the most common post-translational modification in eukaryotes, affecting an estimated 50 percent of all proteins. Functions include assisting protein folding (calnexin/calreticulin cycle in the ER), protecting against proteolysis, mediating cell-cell adhesion (selectin-carbohydrate interactions in immune cell trafficking), and enabling immune evasion (dense glycan shields on viral envelope proteins).
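The Asn-X-Ser/Thr consensus can be located in a protein sequence with a simple pattern search. The sketch below is a minimal Python illustration: the example sequence is invented, the function name is hypothetical, and the exclusion of proline at the X position is a commonly stated refinement that is not part of the text above. A match marks a candidate site only; the presence of a sequon does not guarantee that the site is glycosylated.

```python
import re

# N, then any residue except P, then S or T.
# The lookahead lets overlapping sequons be reported separately.
SEQUON = re.compile(r"(?=(N[^P][ST]))")


def find_sequons(sequence: str) -> list[tuple[int, str]]:
    """Return (1-based position of Asn, matched tripeptide) for each candidate site."""
    seq = sequence.upper()
    return [(m.start() + 1, m.group(1)) for m in SEQUON.finditer(seq)]


# Invented one-letter sequence for illustration only.
example = "MKNGSAANPTLLNVTQ"
print(find_sequons(example))  # [(3, 'NGS'), (13, 'NVT')]
```

The Asn at position 8 is skipped because the following residue is proline.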

The ABO blood group system is a direct application of carbohydrate chemistry to medicine. The H antigen (present on type O red blood cells) is a specific oligosaccharide chain on surface glycoproteins and glycolipids, terminating in fucose linked $α$-1,2 to galactose. Type A individuals express a glycosyltransferase that adds N-acetylgalactosamine (GalNAc) in $α$-1,3 linkage to the galactose. Type B individuals express a different glycosyltransferase that adds galactose (Gal) in $α$-1,3 linkage. Type AB individuals express both enzymes. The A and B enzymes differ by four amino acid substitutions that alter substrate specificity. Both are encoded by a single gene with three common alleles: two encode the enzyme variants, and the third (type O) encodes a non-functional version.
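The genotype-to-antigen logic described above is compact enough to express as a lookup. The sketch below is a simplified Python model: the names are hypothetical, each allele is reduced to the sugar its enzyme adds, and rare variants are ignored.

```python
# Sugar that each allele's glycosyltransferase adds to the H antigen.
# None marks the non-functional O allele, which leaves the H antigen unmodified.
ALLELE_SUGAR = {"A": "GalNAc", "B": "Gal", "O": None}


def added_sugars(allele1: str, allele2: str) -> list[str]:
    """Sorted list of sugars added to the H antigen for a two-allele genotype."""
    sugars = {ALLELE_SUGAR[a.upper()] for a in (allele1, allele2)}
    sugars.discard(None)
    return sorted(sugars)


def blood_type(allele1: str, allele2: str) -> str:
    """ABO phenotype implied by which terminal sugars are present."""
    sugars = added_sugars(allele1, allele2)
    if sugars == ["Gal", "GalNAc"]:
        return "AB"
    if sugars == ["GalNAc"]:
        return "A"
    if sugars == ["Gal"]:
        return "B"
    return "O"  # no functional transferase: bare H antigen


for genotype in (("A", "O"), ("B", "B"), ("A", "B"), ("O", "O")):
    print(genotype, blood_type(*genotype), added_sugars(*genotype))
```

One functional copy is enough to produce the antigen, which is why the A and B alleles behave as co-dominant to each other and dominant over O in this model.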

Extracellular matrix polysaccharides provide hydration, compressive resistance, and signalling scaffolds. Hyaluronan is a non-sulfated disaccharide polymer (glucuronic acid $β$-1,3 N-acetylglucosamine $β$-1,4) of up to 25,000 disaccharide units (molecular weight up to $10^{7}$ Da). Its extraordinary water-binding capacity (up to 1000 times its weight in water) underlies its role in joint lubrication and dermal hydration. The sulfated glycosaminoglycans (heparan sulfate, chondroitin sulfate, dermatan sulfate) are polysaccharides attached to core proteins as proteoglycans. Their high negative charge density attracts cations and water, providing compressive resistance to cartilage. Heparan sulfate proteoglycans on the cell surface serve as co-receptors for growth factors (FGF, VEGF), concentrating ligands and presenting them to their signalling receptors.
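The quoted upper mass of hyaluronan follows from the chain length by simple multiplication. The sketch below assumes a repeat-unit mass of about 379 Da for the free-acid disaccharide ($C_{14} H_{21} N O_{11}$); this value and the function name are supplied here for illustration and do not come from the text.

```python
# Assumed mass of one glucuronic acid / N-acetylglucosamine repeat unit
# within the chain (free-acid form), in daltons.
DISACCHARIDE_REPEAT_DA = 379.3


def hyaluronan_mass_da(n_disaccharides: int) -> float:
    """Approximate chain mass, ignoring end groups and counter-ions."""
    return n_disaccharides * DISACCHARIDE_REPEAT_DA


mass = hyaluronan_mass_da(25_000)
print(f"{mass:.2e} Da")  # close to the 10^7 Da upper figure quoted in the text
```

Counting sodium counter-ions would raise the repeat-unit mass slightly and bring the total even closer to $10^{7}$ Da.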

Synthesis. The four classes of biomolecules are unified by a small repertoire of noncovalent interactions (the hydrophobic effect, hydrogen bonding, electrostatic forces, and van der Waals contacts) whose quantitative treatment explains higher-order cellular architecture. The central insight is that the cell exploits these forces hierarchically: hydrophobic collapse drives protein folding and membrane assembly, hydrogen bonds stabilise secondary structures and base pairing, and electrostatic interactions govern enzyme specificity and ion selectivity. The free-energy landscape of the cell emerges from the sum of millions of weak interactions, and in this sense molecular self-assembly is the origin of cellular structure. The pattern generalises from individual protein folds to multi-subunit complexes, from lipid bilayers to membrane-bound organelles, and from single glycosidic bonds to extracellular matrices. This unit therefore bridges the atomic-scale chemistry described in 14.01.01 and the cell-scale biology that depends on it: the thermodynamic principles governing a hydrogen bond in a chymotrypsin active site are the same principles that build a membrane, and both follow from the physics of noncovalent interactions in water.

Connections [Master]

Lewis structures and molecular bonding 14.02.01. The covalent bonding framework — octet rule, electronegativity, bond polarity — describes why carbon, nitrogen, oxygen, and phosphorus form the specific functional groups that define monomer chemistry. This unit assumes those bonding patterns and builds the macromolecular edifice on top of them.
Acid-base chemistry and pKa 14.10.01. The protonation state of amino acid side chains at physiological pH determines protein charge, enzyme active-site chemistry, and the zwitterionic form of free amino acids. The Henderson-Hasselbalch framework developed there explains why carboxyl groups are deprotonated and amino groups are protonated at pH 7.
Chemical thermodynamics 14.06.01. The free energy framework $Δ G = Δ H - T Δ S$ governs every condensation and hydrolysis reaction in this unit and extends to the folding, self-assembly, and binding equilibria treated in the Master tier. ATP coupling is the central application.
Cell membranes — structure 17.02.01. The phospholipid amphipathicity and bilayer thermodynamics introduced here become the organising principle for the next unit's treatment of membrane structure, fluidity, and protein insertion.
Cellular respiration — glycolysis and the citric acid cycle 17.04.01. Glucose, the carbohydrate monomer described here, is the substrate for the metabolic pathway that generates the ATP whose hydrolysis drives cellular work. The downstream unit traces ATP coupling through the complete oxidative pathway.
DNA replication and transcription 17.05.01 pending. The nucleic acid monomer-polymer chemistry — phosphodiester bonds, base pairing, the sugar-phosphate backbone — introduced here is the chemical prerequisite for understanding the mechanisms of information storage, copying, and readout.
Amino acids and protein chemistry 15.12.01 pending. The organic chemistry perspective on amino acid structure, stereochemistry, and reactivity complements this unit's biochemical treatment. That unit provides the detailed reaction mechanisms that this unit summarises.
Cellular organization: organelles 17.03.01 pending. The four classes of biomolecules introduced here — lipids (membrane compartments), proteins (enzymatic and transport functions), nucleic acids (genetic information in nucleus and mitochondria), and carbohydrates (energy storage and surface markers) — are the molecular constituents from which every organelle is built. The organelle unit describes how these biomolecules self-assemble into the compartmentalised architecture of the eukaryotic cell.
Mutation and repair 17.06.01 pending. The nucleic acid chemistry described here — the N-glycosidic bond linking bases to deoxyribose, the exocyclic amino group on cytosine, and the conjugated pi-systems of purine and pyrimidine rings — explains why DNA is vulnerable to depurination, deamination, and UV photoproducts. The mutation and repair unit builds on these structural vulnerabilities to explain the molecular mechanisms of DNA damage and its correction.

Historical & philosophical context [Master]

The recognition that living matter obeys the same chemical principles as non-living matter is conventionally traced to Friedrich Wöhler's synthesis of urea from ammonium cyanate in 1828 ^{[Wohler 1828]}. Wöhler's result undermined the vitalist position that organic compounds required a "vital force" for their production and opened the way to the mechanistic biochemistry that followed, though vitalism persisted in attenuated forms for decades.

Emil Fischer's 1894 work on enzyme specificity introduced the lock-and-key analogy: the stereochemical complementarity between an enzyme's active site and its substrate determines catalytic selectivity ^{[Fischer 1894]}. Fischer's insight — that molecular geometry governs biological recognition — anticipated the entire field of molecular recognition and remains the conceptual foundation for understanding receptor-ligand interactions, antibody-antigen binding, and substrate selectivity.

Linus Pauling's 1936 paper with Alfred Mirsky established the role of hydrogen bonding in maintaining protein structure ^{[Pauling 1936]}. Pauling proposed that denaturation disrupts the hydrogen-bond network while leaving covalent bonds intact, and that the native structure represents a hydrogen-bond-stabilised state — the direct precursor to the modern understanding of secondary structure. His 1951 prediction of the alpha-helix and beta-sheet from stereochemical reasoning, before any protein structure had been experimentally determined, is one of the great predictive successes in structural biology.

Christian Anfinsen's experiments on ribonuclease A refolding, published from 1961 and summarised in his 1973 Nobel lecture ^{[Anfinsen 1973]}, established the thermodynamic hypothesis: the native fold is the global free-energy minimum of the polypeptide chain under physiological conditions. Charles Tanford's monographs on the hydrophobic effect ^{[Tanford 1980]} quantified the entropy-driven nature of hydrophobic burial and provided the thermodynamic framework for understanding membrane self-assembly and protein folding as manifestations of the same physical phenomenon.

The determination of the first protein structures by John Kendrew (myoglobin, 1958) and Max Perutz (haemoglobin, 1960) using X-ray crystallography revealed the three-dimensional complexity of protein folds and established the sequence-structure relationship. Atomistic models of lipid bilayers came much later, with molecular dynamics simulations (CHARMM force field, Karplus and coworkers, 1980s onward) complementing experimental data from X-ray diffraction and NMR spectroscopy. Hermann Staudinger's 1920 proposal that polymers are giant molecules held together by covalent bonds, controversial at the time but confirmed by Svedberg's ultracentrifugation, laid the conceptual groundwork for all of macromolecular chemistry and earned him the Nobel Prize in 1953.

Bibliography [Master]

Primary literature:

Wöhler, F., "Ueber die künstliche Bildung des Harnstoffs," Annalen der Physik und Chemie 88 (1828), 253-256.
Fischer, E., "Einfluss der Configuration auf die Wirkung der Enzyme," Ber. Dtsch. Chem. Ges. 27 (1894), 2985-2993.
Staudinger, H., "Über Polymerisation," Ber. Dtsch. Chem. Ges. 53 (1920), 1073-1085.
Pauling, L. & Mirsky, A. E., "On the structure of native, denatured, and coagulated proteins," Proc. Natl. Acad. Sci. USA 22 (1936), 439-447.
Pauling, L., Corey, R. B. & Branson, H. R., "The structure of proteins: two hydrogen-bonded helical configurations of the polypeptide chain," Proc. Natl. Acad. Sci. USA 37 (1951), 205-211.
Chargaff, E., "Chemical specificity of nucleic acids and mechanism of their enzymatic degradation," Experientia 6 (1950), 201-209.
Watson, J. D. & Crick, F. H. C., "Molecular structure of nucleic acids: a structure for deoxyribose nucleic acid," Nature 171 (1953), 737-738.
Kendrew, J. C. et al., "A three-dimensional model of the myoglobin molecule obtained by X-ray analysis," Nature 181 (1958), 662-666.
Matthews, B. W. et al., "Three-dimensional structure of tosyl- $α$ -chymotrypsin," Nature 214 (1967), 652-656.
Anfinsen, C. B., "Principles that govern the folding of protein chains," Science 181 (1973), 223-230.
Tanford, C., The Hydrophobic Effect: Formation of Micelles and Biological Membranes, 2nd ed. (Wiley, 1980).

Textbooks and monographs:

Alberts, B. et al., Molecular Biology of the Cell, 7th ed. (Garland Science, 2022).
Berg, J. M., Tymoczko, J. L. & Stryer, L., Biochemistry, 9th ed. (W. H. Freeman, 2019).
Voet, D. & Voet, J. G., Biochemistry, 5th ed. (Wiley, 2019).
Dill, K. A. & Bromberg, S., Molecular Driving Forces, 2nd ed. (Garland Science, 2011).
Israelachvili, J. N., Intermolecular and Surface Forces, 3rd ed. (Academic Press, 2011).

Prerequisites

14.01.01 pending

Used in

17.02.01
17.05.01

Tier anchors

beginner: Alberts et al., Molecular Biology of the Cell, 7th ed. (2022), Ch. 2
intermediate: Alberts et al., MBoC 7e, Ch. 2; Berg, Tymoczko & Stryer, Biochemistry, 9th ed. (2019), Ch. 1-3
master: Alberts et al., MBoC 7e; Voet & Voet, Biochemistry 5e; Dill & Bromberg, Molecular Driving Forces, 2nd ed. (2011); Tanford, The Hydrophobic Effect (1980); Pauling & Mirsky 1936 Proc. Natl. Acad. Sci. 22, 439-447; Anfinsen 1973 Science 181, 223-230

References

Alberts et al. — Molecular Biology of the Cell, 7th ed. (Garland Science, 2022) · Ch. 2 The Chemical Components of the Cell
Berg, Tymoczko & Stryer — Biochemistry, 9th ed. (W. H. Freeman, 2019) · Ch. 1-3 Foundations; Water; Amino acids and proteins
Voet & Voet — Biochemistry, 5th ed. (Wiley, 2019) · Ch. 1-3 Introduction; Water; Nucleotides and nucleic acids
Anfinsen, C. B. — Principles that govern the folding of protein chains · Science 181 (1973) 223-230; Nobel lecture on the thermodynamic hypothesis of protein folding
Tanford, C. — The Hydrophobic Effect: Formation of Micelles and Biological Membranes, 2nd ed. (Wiley, 1980) · Ch. 5-7 Hydrophobic effect, micelles, bilayers
Pauling, L. & Mirsky, A. E. — On the structure of native, denatured, and coagulated proteins · Proc. Natl. Acad. Sci. USA 22 (1936) 439-447; hydrogen bonding in protein structure
Fischer, E. — Einfluss der Configuration auf die Wirkung der Enzyme · Ber. Dtsch. Chem. Ges. 27 (1894) 2985-2993; lock-and-key specificity
Matthews, B. W. et al. — Three-dimensional structure of tosyl-alpha-chymotrypsin · Nature 214 (1967) 652-656; X-ray structure of chymotrypsin
Wohler, F. — Ueber die kunstliche Bildung des Harnstoffs · Annalen der Physik und Chemie 88 (1828) 253-256; first synthesis of an organic compound from inorganic precursors

Reviewer

Tyler (pending external biology reviewer per BIOLOGY_PLAN §6)

Estimated time

beginner: 15m
intermediate: 35m
master: 75m