Biomolecules in cells — overview
Anchor (Master): Alberts et al., MBoC 7e; Voet & Voet, Biochemistry 5e; Dill & Bromberg, Molecular Driving Forces, 2nd ed. (2011); Tanford, The Hydrophobic Effect (1980); Pauling & Mirsky 1936 Proc. Natl. Acad. Sci. 22, 439-447; Anfinsen 1973 Science 181, 223-230
Intuition [Beginner]
Every cell, from a bacterium to a neuron, is built from four families of molecules: carbohydrates, lipids, proteins, and nucleic acids. These are the cell's structural materials, energy currency, catalysts, and information carriers. All four share a common architectural principle: small molecular building blocks (monomers) are strung together into large chains (polymers), and the properties of the polymer depend on both the identity and the sequence of its monomers.
Water is the solvent in which all of this chemistry happens. The polar nature of the water molecule means it dissolves charged and polar substances readily, but excludes nonpolar ones. This exclusion drives protein folding, membrane formation, and nearly every higher-order structure in the cell.
Functional groups are the reactive substructures attached to carbon skeletons. A hydroxyl group ($-\mathrm{OH}$) makes a molecule more water-soluble and can participate in hydrogen bonding. A carboxyl group ($-\mathrm{COOH}$) can donate a proton, becoming negatively charged ($-\mathrm{COO^-}$) at cellular pH. An amino group ($-\mathrm{NH_2}$) can accept a proton, becoming positively charged ($-\mathrm{NH_3^+}$). These charged states govern how molecules interact inside the cell.
The monomer-to-polymer transition proceeds by condensation (dehydration synthesis): two monomers join, and a water molecule is removed. The reverse reaction, hydrolysis, breaks the bond by adding water. Enzymes in the cell control both directions. This cycle of building up and breaking down is the metabolic heartbeat of every living system.
Carbohydrates provide short-term energy storage and structural scaffolds. Glucose, a six-carbon sugar, is the primary fuel for cellular metabolism. Lipids store energy long-term in the hydrophobic tails of fatty acids and form the barriers that define cellular compartments. Proteins, built from twenty amino acid monomers, carry out nearly all catalytic and structural work. Nucleic acids (DNA and RNA) store and transmit genetic information.
Visual [Beginner]
The four classes of biomolecules can be organised by their monomer-polymer relationships:
| Class | Monomer | Polymer | Bond type | Primary role |
|---|---|---|---|---|
| Carbohydrates | Monosaccharides (e.g., glucose) | Polysaccharides (e.g., glycogen) | Glycosidic | Energy, structure |
| Lipids | Fatty acids, glycerol | Triglycerides, phospholipids | Ester | Energy storage, membranes |
| Proteins | Amino acids | Polypeptides | Peptide | Catalysis, structure |
| Nucleic acids | Nucleotides | DNA, RNA | Phosphodiester | Information storage |
Each condensation reaction removes one water molecule. Each hydrolysis reaction adds one back.
Worked example [Beginner]
Sucrose (table sugar) is a disaccharide composed of one glucose molecule and one fructose molecule linked by a glycosidic bond. When you eat sucrose, the enzyme sucrase catalyses hydrolysis, cleaving the bond and releasing free glucose and fructose.
Step 1. Identify the monomers. Glucose is a six-carbon aldose (an aldehyde-bearing sugar). Fructose is a six-carbon ketose (a ketone-bearing sugar). Both have the molecular formula $\mathrm{C_6H_{12}O_6}$.
Step 2. Identify the bond. The glycosidic bond in sucrose connects carbon-1 of glucose (the anomeric carbon, bearing the aldehyde-derived hydroxyl in ring form) to carbon-2 of fructose (the anomeric carbon in the ketose). Because both anomeric carbons are locked in the bond, sucrose is a non-reducing sugar.
Step 3. Write the reaction. Condensation: glucose + fructose yields sucrose + $\mathrm{H_2O}$. Hydrolysis (the biologically relevant direction in digestion): sucrose + $\mathrm{H_2O}$ yields glucose + fructose. The enzyme sucrase lowers the activation energy barrier, allowing the reaction to proceed rapidly at body temperature ($37\,^\circ\mathrm{C}$).
Step 4. Energetics. The standard free energy change for sucrose hydrolysis is approximately $\Delta G^{\circ\prime} \approx -29\ \text{kJ/mol}$. The negative sign means the reaction releases free energy and proceeds spontaneously under standard conditions. In the body, the released monosaccharides are then absorbed and metabolised through glycolysis.
What this tells us: even a single glycosidic bond stores enough energy that its hydrolysis releases a measurable thermodynamic kick, and enzymes make that energy available on biologically relevant timescales.
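The link between a negative standard free energy and a reaction that lies far toward products can be made concrete with $K' = e^{-\Delta G^{\circ\prime}/RT}$. The sketch below is illustrative only: the value of about $-29$ kJ/mol is a commonly quoted textbook figure used here as an assumption, and the function name is hypothetical.

```python
import math

R = 8.314  # gas constant, J/(mol*K)


def equilibrium_constant(delta_g_standard_kj: float, temperature_k: float = 310.15) -> float:
    """Return K' = exp(-dG°'/RT) for a standard free energy given in kJ/mol."""
    return math.exp(-delta_g_standard_kj * 1000.0 / (R * temperature_k))


# ASSUMPTION: about -29 kJ/mol for sucrose hydrolysis (commonly quoted textbook value).
SUCROSE_DG_KJ = -29.0

k_eq = equilibrium_constant(SUCROSE_DG_KJ)
print(f"K' at 37 degrees C is roughly {k_eq:.1e}")
```

Under this assumption the equilibrium constant is in the tens of thousands, so at equilibrium essentially all sucrose is hydrolysed; the enzyme changes only how fast that state is reached.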
Check your understanding [Beginner]
Formal definition [Intermediate+]
A biomolecule is any organic molecule produced by a living organism. The four major classes are distinguished by their monomer chemistry and bond types.
Carbohydrates have the general formula $(\mathrm{CH_2O})_n$. Monosaccharides are polyhydroxy aldehydes or ketones with three to seven carbons. Two monosaccharides join via a glycosidic bond (an acetal linkage between an anomeric carbon and a hydroxyl oxygen) to form a disaccharide, releasing one $\mathrm{H_2O}$. Longer chains form oligosaccharides (3-10 units) and polysaccharides (hundreds to thousands of units). The glycosidic bond has the general form:
$$\mathrm{R{-}O{-}R'} + \mathrm{H_2O} \rightleftharpoons \mathrm{R{-}OH} + \mathrm{R'{-}OH}$$
where the forward direction is hydrolysis and the reverse is condensation.
Lipids are defined operationally: they are biological molecules that are poorly soluble in water but soluble in nonpolar solvents. Unlike the other three classes, lipids are not all polymeric. The main subtypes are fatty acids (long hydrocarbon chains with a terminal carboxyl group), triglycerides (three fatty acids esterified to a glycerol backbone), phospholipids (two fatty acids plus a phosphate-containing head group on glycerol), and sterols (four-ring structures such as cholesterol). The ester bond linking fatty acids to glycerol forms by condensation between a carboxyl group and a hydroxyl group.
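Proteins are polymers of amino acids. Each of the twenty standard amino acids has a central (alpha) carbon bonded to an amino group ($-\mathrm{NH_2}$), a carboxyl group ($-\mathrm{COOH}$), a hydrogen, and a variable side chain (R group). Amino acids polymerise via peptide bonds, formed by condensation between the carboxyl group of one amino acid and the amino group of the next:
$$\mathrm{H_2N{-}CHR_1{-}COOH} + \mathrm{H_2N{-}CHR_2{-}COOH} \longrightarrow \mathrm{H_2N{-}CHR_1{-}CO{-}NH{-}CHR_2{-}COOH} + \mathrm{H_2O}$$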
A polypeptide chain has an N-terminus (free amino group) and a C-terminus (free carboxyl group). Proteins range from a few dozen to several thousand amino acid residues.
Nucleic acids are polymers of nucleotides. Each nucleotide has three components: a nitrogenous base (purine or pyrimidine), a pentose sugar (ribose in RNA, deoxyribose in DNA), and one or more phosphate groups. Nucleotides polymerise via phosphodiester bonds, linking the 5'-phosphate of one nucleotide to the 3'-hydroxyl of the next, releasing pyrophosphate ($\mathrm{PP_i}$). The resulting backbone is a sugar-phosphate chain with bases projecting outward.
Water as a biological solvent has a high dielectric constant ($\varepsilon \approx 80$ at $20\,^\circ\mathrm{C}$), high specific heat capacity, and strong hydrogen-bonding capacity. These properties stabilise charged species, moderate temperature fluctuations, and drive the hydrophobic effect that underlies membrane formation and protein folding.
Functional groups summary
| Group | Structure | Charge at pH 7 | Key role |
|---|---|---|---|
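| Hydroxyl | $-\mathrm{OH}$ | Neutral | H-bond donor/acceptor; alcohols, sugars |
| Carboxyl | $-\mathrm{COOH}$ | Negative ($-\mathrm{COO^-}$) | Acid; amino acid C-terminus |
| Amino | $-\mathrm{NH_2}$ | Positive ($-\mathrm{NH_3^+}$) | Base; amino acid N-terminus |
| Phosphate | $-\mathrm{OPO_3^{2-}}$ | Negative | Energy transfer (ATP); nucleic acid backbone |
| Sulfhydryl | $-\mathrm{SH}$ | Neutral | Disulfide bonds in proteins |
| Carbonyl | $\mathrm{C{=}O}$ | Neutral | Aldehyde/ketone; sugar reactivity |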
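The charge column follows from the Henderson-Hasselbalch relation: a group is mostly deprotonated when the pH is well above its pKa and mostly protonated when the pH is well below it. The sketch below uses round pKa values (about 4 for a carboxyl group, about 10 for an amino group) as illustrative assumptions, since actual values depend on the molecule; the function name is hypothetical.

```python
def fraction_deprotonated(pka: float, ph: float = 7.0) -> float:
    """Henderson-Hasselbalch: fraction of a group in its deprotonated form."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


# ASSUMPTION: round, illustrative pKa values; real values vary by molecule.
CARBOXYL_PKA = 4.0
AMINO_PKA = 10.0

carboxyl_charged = fraction_deprotonated(CARBOXYL_PKA)    # fraction present as -COO-
amino_charged = 1.0 - fraction_deprotonated(AMINO_PKA)    # fraction present as -NH3+

print(f"carboxyl as -COO- at pH 7: {carboxyl_charged:.3f}")
print(f"amino as -NH3+ at pH 7:    {amino_charged:.3f}")
```

With a pKa three units away from pH 7 in either direction, more than 99 percent of each group is in its charged form.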
Key theorem with proof [Intermediate+]
Theorem (Hydrolysis thermodynamics of biomolecular bonds). The magnitude of the standard free energy of hydrolysis, $|\Delta G^{\circ\prime}|$, for the major biomolecular bonds follows the hierarchy: phosphoanhydride (ATP) > ester > glycosidic > peptide. The more negative the $\Delta G^{\circ\prime}$ of hydrolysis, the more energy is released when the bond is cleaved, and the more work the cell can extract from the reaction. ATP hydrolysis ($\Delta G^{\circ\prime} \approx -30.5\ \text{kJ/mol}$) occupies the top of this hierarchy, which is why the cell uses ATP as its universal energy currency.
Proof sketch (by enthalpy-entropy decomposition). Consider a generic condensation bond $\mathrm{A{-}B}$ hydrolysing to $\mathrm{A{-}OH} + \mathrm{H{-}B}$. The free energy change is:
$$\Delta G = \Delta H - T\Delta S$$
For phosphoanhydride bonds (as in ATP), three factors make $\Delta G$ strongly negative. First, electrostatic repulsion: at pH 7, ATP carries four negative charges on its triphosphate tail. Hydrolysis relieves this repulsion by separating the charges onto ADP ($\mathrm{ADP^{3-}}$) and inorganic phosphate ($\mathrm{HPO_4^{2-}}$ at physiological pH), each now solvated independently. The enthalpy gain from charge separation is substantial.
Second, resonance stabilisation: inorganic phosphate ($\mathrm{P_i}$) distributes its negative charge over four equivalent oxygen atoms. ADP also gains resonance stabilisation relative to its state in the ATP molecule. The products are thermodynamically more stable than the reactant.
Third, entropy increase: one molecule (ATP) produces two molecules (ADP + $\mathrm{P_i}$), increasing the number of translational degrees of freedom. At $T = 310\ \mathrm{K}$ (body temperature), even a modest entropy increase makes a favourable (negative) $-T\Delta S$ contribution to the total free energy change.
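Combining these contributions gives $\Delta G^{\circ\prime} \approx -30.5\ \text{kJ/mol}$ for ATP hydrolysis under standard biochemical conditions (1 M concentrations, pH 7, $25\,^\circ\mathrm{C}$). In the cell, where the ratio [ADP][$\mathrm{P_i}$]/[ATP] is held far below its equilibrium value, the actual $\Delta G$ is closer to roughly $-50$ to $-60\ \text{kJ/mol}$.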
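The shift from the standard to the cellular free energy follows from $\Delta G = \Delta G^{\circ\prime} + RT\ln Q$ with $Q = [\text{ADP}][\mathrm{P_i}]/[\text{ATP}]$. The sketch below is a rough illustration: the concentrations are hypothetical round numbers chosen only to be of a physiologically plausible order, and the function name is an assumption.

```python
import math

R_KJ = 8.314e-3  # gas constant, kJ/(mol*K)


def actual_delta_g(dg_standard_kj: float, reaction_quotient: float,
                   temperature_k: float = 310.15) -> float:
    """dG = dG°' + RT ln Q, in kJ/mol."""
    return dg_standard_kj + R_KJ * temperature_k * math.log(reaction_quotient)


# ASSUMPTION: hypothetical cellular concentrations in mol/L (illustrative only).
ATP, ADP, PI = 5e-3, 0.5e-3, 5e-3

q = ADP * PI / ATP
dg_cell = actual_delta_g(-30.5, q)
print(f"Q = {q:.1e}, dG in the cell is about {dg_cell:.0f} kJ/mol")
```

With these assumed concentrations the mass-action term contributes roughly $-20$ kJ/mol on top of the standard value, which is how the cellular figure ends up well below $-30.5$ kJ/mol.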
Peptide bonds, by contrast, have a considerably smaller (less negative) $\Delta G^{\circ\prime}$ for hydrolysis. The smaller driving force reflects the absence of electrostatic repulsion in the reactant and less resonance stabilisation in the products. Proteins are nevertheless kinetically stable: hydrolysis is thermodynamically favourable, but it is extremely slow without a catalyst.
Bridge. The hydrolysis hierarchy builds toward 17.04.01 cellular respiration, where ATP hydrolysis drives every energy-requiring step in glycolysis, the citric acid cycle, and oxidative phosphorylation. The foundational reason ATP occupies the top of the hierarchy is the combination of electrostatic repulsion in the triphosphate tail and resonance stabilisation in the products. This is the same charge-separation principle that appears in the catalytic triad of serine proteases, where electrostatic relay stabilises the transition state 15.14.01 pending. The free-energy framework appears again in 14.06.01 chemical thermodynamics as the canonical application of $\Delta G = \Delta H - T\Delta S$ to biochemical coupling.
Worked example: comparing bond energies
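Calculate how many peptide bonds the cell can synthesise using the energy from hydrolysis of one ATP molecule under cellular conditions (that is, using the cellular free energy $\Delta G_{\text{ATP}}$ rather than the standard value).
Each peptide bond formation requires condensation, which is energetically unfavourable. The cell couples this to ATP hydrolysis. Peptide bond hydrolysis releases $|\Delta G^{\circ\prime}_{\text{pep}}|$, so forming one requires approximately $+|\Delta G^{\circ\prime}_{\text{pep}}|$ (at standard conditions). One ATP hydrolysis at the cellular $\Delta G_{\text{ATP}}$ therefore provides enough energy to form $|\Delta G_{\text{ATP}}| / |\Delta G^{\circ\prime}_{\text{pep}}|$ peptide bonds in principle, though the actual coupling efficiency is lower due to entropy losses.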
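The ratio in this worked example is simple to evaluate once numbers are chosen. The sketch below uses round values purely as assumptions: about $-50$ kJ/mol for ATP hydrolysis in the cell and about $-10$ kJ/mol for peptide bond hydrolysis (reported values vary with the peptide). The function name is hypothetical.

```python
def peptide_bonds_per_atp(dg_atp_cell_kj: float, dg_peptide_hydrolysis_kj: float) -> float:
    """Upper bound on peptide bonds formed per ATP, from the ratio of free energies."""
    return abs(dg_atp_cell_kj) / abs(dg_peptide_hydrolysis_kj)


# ASSUMPTION: round illustrative values in kJ/mol, not measurements from this text.
DG_ATP_CELL = -50.0
DG_PEPTIDE_HYDROLYSIS = -10.0

upper_bound = peptide_bonds_per_atp(DG_ATP_CELL, DG_PEPTIDE_HYDROLYSIS)
print(f"Thermodynamic upper bound: about {upper_bound:.0f} peptide bonds per ATP")
```

This is a thermodynamic ceiling only; it says nothing about the mechanism or the real energetic cost of protein synthesis.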
Exercises [Intermediate+]
Noncovalent interactions and the hydrophobic effect [Master]
The four classes of biomolecules interact through a small repertoire of noncovalent forces whose quantitative properties determine the structure and stability of every cellular assembly. These forces are individually weak (0.5-5 kcal/mol per interaction, compared to 50-100 kcal/mol for a covalent bond) but collectively determine three-dimensional structure through their enormous numbers.
Hydrogen bonds form when a hydrogen atom covalently bonded to an electronegative donor (D-H, where D is typically N, O, or S) interacts with a lone-pair-bearing acceptor atom (A). The geometry is constrained: the D-H···A distance (donor to acceptor) is 2.7-3.2 Å and the D-H···A angle exceeds 150 degrees for a strong bond [Pauling 1936]. In vacuum, a single hydrogen bond contributes 3-7 kcal/mol of stabilisation. In water, the net contribution drops to 0.5-1.5 kcal/mol because water molecules compete for the same donors and acceptors: a hydrogen bond between two groups in solution is really the difference between the group-group bond and two group-water bonds.
In protein secondary structure, backbone amide N-H and carbonyl C=O groups form regular hydrogen-bond networks. The alpha-helix places H-bonds between residue $i$ and residue $i+4$ (3.6 residues per turn, 5.4 Å pitch). The beta-sheet places them between adjacent strands. These patterns produce the characteristic signatures in circular dichroism spectroscopy and in the Ramachandran density map of folded proteins.
In DNA, the Watson-Crick base pairs exploit hydrogen bonding: adenine-thymine (two bonds) and guanine-cytosine (three bonds). The additional bond per G-C pair and the stronger stacking interactions between adjacent purines account for the higher melting temperature of GC-rich DNA.
Electrostatic interactions between charged groups follow Coulomb's law: $E = \dfrac{q_1 q_2}{4\pi\varepsilon_0 \varepsilon r}$, where $\varepsilon$ is the relative dielectric constant of the medium. In the cellular environment, where the ionic strength is approximately 150 mM, the Debye-Hückel screening length is approximately 0.8 nm. At a separation of about 8 Å, screening by the mobile ions in solution has already attenuated an electrostatic interaction to 1/e of its unscreened value. Inside a protein, where the local dielectric constant can be as low as $\varepsilon \approx 2\text{-}4$ (compared to $\varepsilon \approx 80$ for bulk water), salt bridges (ion pairs between oppositely charged side chains, such as lysine $\mathrm{NH_3^+}$ and aspartate $\mathrm{COO^-}$) contribute 1-5 kcal/mol. The pH dependence of these interactions connects to the pKa of the participating groups 14.10.01.
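The quoted screening length can be checked from the Debye-Hückel expression for a 1:1 electrolyte, $\lambda_D = \sqrt{\varepsilon_0 \varepsilon k_B T / (2 N_A e^2 I)}$. The sketch below assumes a relative permittivity of 78.5 and a temperature of 25 degrees C (neither is stated in the text), and the function name is hypothetical.

```python
import math

# Physical constants (SI units)
EPSILON_0 = 8.8541878128e-12  # vacuum permittivity, F/m
K_B = 1.380649e-23            # Boltzmann constant, J/K
N_A = 6.02214076e23           # Avogadro constant, 1/mol
E_CHARGE = 1.602176634e-19    # elementary charge, C


def debye_length_nm(ionic_strength_molar: float,
                    relative_permittivity: float = 78.5,  # ASSUMPTION: water near 25 C
                    temperature_k: float = 298.15) -> float:
    """Debye screening length in nm for a 1:1 electrolyte."""
    ionic_strength_si = ionic_strength_molar * 1000.0  # mol/L -> mol/m^3
    numerator = EPSILON_0 * relative_permittivity * K_B * temperature_k
    denominator = 2.0 * N_A * E_CHARGE ** 2 * ionic_strength_si
    return math.sqrt(numerator / denominator) * 1e9


lambda_d = debye_length_nm(0.150)
print(f"Debye length at 150 mM: {lambda_d:.2f} nm")
```

Because the length scales as $1/\sqrt{I}$, lowering the ionic strength a hundredfold lengthens the screening distance tenfold.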
Van der Waals forces comprise London dispersion (induced dipole-induced dipole), Debye (dipole-induced dipole), and Keesom (dipole-dipole) interactions. The total van der Waals interaction between two atoms is approximated by the Lennard-Jones 6-12 potential:
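$$V(r) = 4\epsilon\left[\left(\frac{\sigma}{r}\right)^{12} - \left(\frac{\sigma}{r}\right)^{6}\right]$$
where $\epsilon$ is the well depth and $\sigma$ is the distance at which the potential is zero. The $r^{-12}$ term is the repulsive wall (Pauli exclusion); the $r^{-6}$ term is the attractive dispersion. For carbon-carbon interactions, the minimum occurs at $r_{\min} = 2^{1/6}\sigma$, roughly 4 Å, with $\epsilon$ on the order of 0.1 kcal/mol. Individual van der Waals contacts are weak, but a well-packed protein core contains thousands of them; their summed contribution to stability is 10-20 kcal/mol.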
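The shape of the 6-12 potential is easy to verify numerically: it crosses zero at $r = \sigma$ and reaches its minimum of $-\epsilon$ at $r = 2^{1/6}\sigma$. The sketch below uses illustrative parameters ($\sigma = 3.4$ Å, $\epsilon = 0.1$ kcal/mol) that are assumptions, not values taken from this text.

```python
def lennard_jones(r: float, epsilon: float, sigma: float) -> float:
    """Lennard-Jones 6-12 potential: 4*eps*[(sigma/r)^12 - (sigma/r)^6]."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


# ASSUMPTION: illustrative carbon-like parameters (Angstrom, kcal/mol).
SIGMA = 3.4
EPSILON = 0.1

r_min = 2.0 ** (1.0 / 6.0) * SIGMA
print(f"V(sigma) = {lennard_jones(SIGMA, EPSILON, SIGMA):.3f} kcal/mol")
print(f"r_min = {r_min:.2f} A, V(r_min) = {lennard_jones(r_min, EPSILON, SIGMA):.3f} kcal/mol")
```

The steep $r^{-12}$ wall means the energy rises sharply for separations only slightly shorter than $\sigma$, while the attraction fades gently beyond the minimum.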
The hydrophobic effect is not a bond but an entropy-driven phenomenon. When a nonpolar solute dissolves in water, the surrounding water molecules reorganise into a more ordered cage-like structure to maintain their hydrogen-bond network around the nonpolar surface. This ordering reduces the entropy of the water. When two nonpolar surfaces come together, the ordered water between them is released back to the bulk, gaining translational and rotational entropy. The free energy of hydrophobic burial is approximately proportional to the buried nonpolar surface area. For a typical protein burying roughly 5000 Å² of nonpolar surface upon folding, the hydrophobic effect contributes roughly 125 kcal/mol to the driving force, by far the largest single contribution. Calorimetric measurements (differential scanning calorimetry, Privalov 1979) confirm that the hydrophobic effect has a large positive entropy component and a relatively small enthalpy component at room temperature.
Proposition. The free energy of transferring a nonpolar solute of surface area $A$ from water to a nonpolar environment is $\Delta G_{\text{transfer}} = -\gamma A$, where $\gamma \approx 25$ cal/(mol·Å²) is the hydrophobic surface tension coefficient. Equivalently, the hydrophobic free energy is $+\gamma A$ (unfavourable) when the surface is exposed to water and $-\gamma A$ (favourable) when it is buried.
Proof. The transfer free energy decomposes as $\Delta G_{\text{transfer}} = \Delta H_{\text{transfer}} - T\Delta S_{\text{transfer}}$. For nonpolar solutes, $\Delta H$ of transfer is small (typically +1 to +3 kcal/mol at 25 degrees C, from the disruption of van der Waals contacts with water). The entropy change is large and negative when the solute enters water, because water reorganises into ordered cages, losing configurational entropy. The entropy of the released water upon burial is $\Delta S = s_A A$, where $s_A$ is the entropy per unit area of water released from the cage structure, a quantity fixed by calorimetric measurements in units of cal/(mol·K·Å²). At temperature $T$, the entropic contribution to the free energy of burial is therefore $-T s_A A$, which scales linearly with the buried area. The enthalpy contribution is smaller and partially compensating, giving the net coefficient of approximately 25 cal/(mol·Å²).
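The area-proportional estimate can be written as a one-line function. The sketch below takes the coefficient of 25 cal/(mol·Å²) from the proposition and applies it to the 5000 Å² example given earlier; the function name is hypothetical, and the sign convention (negative for burial) is the one stated in the proposition.

```python
GAMMA_CAL_PER_MOL_A2 = 25.0  # hydrophobic surface tension coefficient, cal/(mol*A^2)


def burial_free_energy_kcal(buried_area_a2: float,
                            gamma: float = GAMMA_CAL_PER_MOL_A2) -> float:
    """Free energy of burying nonpolar surface, in kcal/mol (negative = favourable)."""
    return -gamma * buried_area_a2 / 1000.0


print(burial_free_energy_kcal(5000.0))  # typical folded protein example from the text
```

The linear form is an approximation: it treats every square ångström of nonpolar surface as equivalent and ignores curvature and local chemistry.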
The hydrophobic effect strengthens with increasing temperature (up to approximately 60-70 degrees C) because the entropy gain from releasing ordered water becomes larger at higher $T$. This temperature dependence explains why some proteins cold-denature: at low temperatures the hydrophobic driving force weakens and the folded state becomes less favourable.
Protein structure hierarchy and folding thermodynamics [Master]
Proteins adopt a hierarchy of structural levels — primary, secondary, tertiary, and quaternary — each determined by the physicochemical properties of the amino acid sequence and the cellular environment.
Primary structure is the linear sequence of amino acid residues. The peptide bond is planar (partial double-bond character of the C-N bond, resonance-stabilised) and adopts the trans configuration in greater than 99.9 percent of peptide bonds that do not precede proline. X-Pro bonds are the main exception: roughly 5 percent of them adopt the cis configuration. This planarity constrains the backbone conformations accessible to the polypeptide chain.
Secondary structure comprises regular, repeating backbone conformations stabilised by hydrogen bonds. The alpha-helix (3.6 residues per turn, pitch 5.4 Å, H-bonds from C=O($i$) to N-H($i+4$)) was predicted by Pauling, Corey, and Branson in 1951 from stereochemical considerations before any protein structure had been solved experimentally. The beta-sheet (parallel or antiparallel strands connected by inter-strand H-bonds) was proposed by Pauling and Corey in a companion paper the same year. Turns (reverse turns, beta-bends) change the direction of the polypeptide chain and are enriched at the protein surface.
The Ramachandran plot maps the allowed combinations of the two backbone torsion angles $\phi$ (N-C$_\alpha$) and $\psi$ (C$_\alpha$-C). Steric clashes between backbone atoms and the side chain restrict the allowed regions. Glycine, with only a hydrogen atom as its side chain, can access a much larger region of the plot. Proline, whose side chain bonds to the backbone nitrogen, is restricted to a narrow range of $\phi$ values. In a well-refined protein structure, greater than 98 percent of residues fall within the allowed regions.
Tertiary structure is the full three-dimensional arrangement of a single polypeptide chain. The driving force is hydrophobic collapse: nonpolar side chains (valine, leucine, isoleucine, phenylalanine, methionine) pack into the protein interior, while polar and charged side chains remain at the surface. The structure is stabilised by hydrogen bonds, electrostatic interactions (salt bridges between oppositely charged side chains), van der Waals packing in the core, and occasionally disulfide bonds (covalent S-S bonds between cysteine residues, formed in the oxidising environment of the endoplasmic reticulum).
The serine protease chymotrypsin illustrates these principles. Chymotrypsinogen, the inactive zymogen, is a 245-residue protein that folds into a compact globular structure stabilised by 5 disulfide bonds and a hydrophobic core. Upon proteolytic activation, a small conformational change repositions the catalytic triad (Ser195, His57, Asp102) into the precise geometry required for peptide bond hydrolysis. The X-ray structure determined by Matthews et al. in 1967 [Matthews 1967] revealed how the folded scaffold positions the catalytic residues — the serine hydroxyl, the histidine imidazole, and the aspartate carboxylate form a charge-relay system that activates Ser195 as a nucleophile.
Quaternary structure is the arrangement of multiple polypeptide subunits. Haemoglobin ($\alpha_2\beta_2$ tetramer) is the canonical example: oxygen binding to one subunit induces a conformational change that increases the oxygen affinity of the remaining subunits (cooperative binding, described by the Monod-Wyman-Changeux model). The Bohr effect (decreased oxygen affinity at lower pH) is a direct consequence of electrostatic interactions between protons and specific amino acid side chains at the subunit interface.
Anfinsen's thermodynamic hypothesis. In experiments on ribonuclease A beginning in 1961, Christian Anfinsen demonstrated that a denatured protein can refold to its native, enzymatically active conformation without any external template [Anfinsen 1973]. Ribonuclease A was denatured in 8 M urea with β-mercaptoethanol (reducing the four disulfide bonds), then the denaturant and reducing agent were removed. The protein refolded to its native structure with full enzymatic activity, reforming the correct four disulfide bonds out of 105 possible combinations. Anfinsen concluded that the native structure is the thermodynamic ground state, the conformation with the lowest Gibbs free energy under physiological conditions. This result earned him the 1972 Nobel Prize in Chemistry.
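The figure of 105 is the number of ways to pair eight cysteines into four disulfide bonds: the first cysteine has 7 possible partners, the next unpaired one has 5, then 3, then 1. A minimal sketch of that count follows; the function name is hypothetical.

```python
def count_disulfide_pairings(n_cysteines: int) -> int:
    """Number of ways to pair all cysteines into disulfides: (n-1)*(n-3)*...*1."""
    if n_cysteines < 0 or n_cysteines % 2 != 0:
        raise ValueError("need a non-negative, even number of cysteines")
    count = 1
    for partners in range(n_cysteines - 1, 0, -2):
        count *= partners
    return count


pairings = count_disulfide_pairings(8)  # ribonuclease A has eight cysteines
print(pairings, f"random chance of the native pairing: {1 / pairings:.2%}")
```

Random pairing would yield the native arrangement less than 1 percent of the time, which is why recovery of full activity points to thermodynamic control rather than chance.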
Levinthal's paradox highlights the gap between thermodynamic prediction and kinetic reality. If a 100-residue protein sampled each of its roughly $3^{100} \approx 5 \times 10^{47}$ possible conformations (taking about three conformations per residue) at a rate of $10^{13}$/s (the rate of bond vibration), exhaustive search would take on the order of $10^{27}$ years. Real proteins fold in seconds. The resolution is that the energy landscape is funnel-shaped: local interactions guide the chain toward the native state through a series of progressively lower-energy intermediates, rather than requiring a random search. The folding funnel concept, developed by Bryngelson and Wolynes in the 1980s, provides the theoretical framework.
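The scale of the paradox can be reproduced with a few lines of arithmetic. The sketch below assumes three conformations per residue and a sampling rate of $10^{13}$ per second; both are conventional illustrative choices rather than measured quantities, and the function name is hypothetical.

```python
SECONDS_PER_YEAR = 3.156e7


def exhaustive_search_years(n_residues: int,
                            conformations_per_residue: int = 3,  # ASSUMPTION
                            samples_per_second: float = 1e13) -> float:  # ASSUMPTION
    """Time to visit every backbone conformation once, in years."""
    total_conformations = conformations_per_residue ** n_residues
    return total_conformations / samples_per_second / SECONDS_PER_YEAR


years = exhaustive_search_years(100)
print(f"Exhaustive search for 100 residues: about {years:.1e} years")
```

The exact exponent depends on the assumptions, but any plausible choice gives a time vastly longer than the age of the universe, which is the point of the argument.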
Molecular chaperones assist protein folding in the cell without violating Anfinsen's thermodynamic hypothesis. The Hsp70 family binds exposed hydrophobic patches on nascent or misfolded polypeptides, preventing aggregation. The GroEL/GroES complex (Hsp60) provides an enclosed chamber in which a single protein molecule can fold in isolation, shielded from the crowded cytoplasm. Chaperones accelerate the kinetics of reaching the native fold and prevent off-pathway aggregation, but they do not change the thermodynamic endpoint.
Intrinsically disordered proteins (IDPs) lack a fixed three-dimensional structure under physiological conditions. Far from being non-functional, disorder enables binding-induced folding (the protein folds upon encountering its target), signalling flexibility (a single disordered region can interact with multiple partners), and regulatory post-translational modifications. Approximately 30-40 percent of eukaryotic proteins contain substantial disordered regions.
Protein misfolding diseases arise when proteins adopt aberrant conformations that self-associate into toxic aggregates. In Alzheimer's disease, the amyloid-β peptide (Aβ42) forms oligomers and then fibrils with a cross-β spine structure in which β-strands run perpendicular to the fibril axis. In prion diseases (Creutzfeldt-Jakob disease, bovine spongiform encephalopathy), the normal cellular prion protein ($\mathrm{PrP^{C}}$, predominantly α-helical) converts to the scrapie form ($\mathrm{PrP^{Sc}}$, predominantly β-sheet), which templates further conversion. This conformational self-propagation is the molecular basis of prion infectivity.
Lipid self-assembly and membrane thermodynamics [Master]
The amphipathic character of lipids — a hydrophilic head group attached to hydrophobic hydrocarbon tails — drives the spontaneous formation of membrane structures. The thermodynamics of self-assembly is a direct consequence of the hydrophobic effect.
Critical micelle concentration (CMC). When amphipathic molecules are added to water, they remain as monomers at low concentration. Above the CMC, they self-assemble into micelles: spherical or cylindrical aggregates with the hydrophobic tails sequestered in the interior and the hydrophilic heads facing the water. The CMC depends on tail length (longer tails produce a lower CMC, because the hydrophobic driving force is stronger) and head group chemistry (larger or more polar heads produce a higher CMC). For sodium dodecyl sulfate (SDS, a 12-carbon tail), the CMC is approximately 8 mM. For phospholipids, the CMC is extremely low (typically nanomolar or below), which is why phospholipid bilayers are stable structures in cells.
The aggregation process is governed by $\Delta G_{\text{agg}} = \Delta H_{\text{agg}} - T\Delta S_{\text{agg}}$. The enthalpy change is modest; the dominant contribution is the entropy gain from releasing ordered water molecules around the hydrophobic tails, the same hydrophobic effect that drives protein folding. For a phospholipid with two 16-carbon tails, burying approximately 800 Å² of hydrophobic surface upon aggregation releases roughly $25 \times 800 = 20{,}000$ cal/mol = 20 kcal/mol of free energy.
The packing parameter $p = v/(a_0 l_c)$, where $v$ is the tail volume, $a_0$ is the optimal head-group area, and $l_c$ is the critical tail length, predicts aggregate geometry. When $p < 1/3$, the molecule is cone-shaped and forms micelles. When $1/3 < p < 1/2$, cylindrical micelles form. When $1/2 < p \le 1$, the molecule is roughly cylindrical and forms bilayers (vesicles, liposomes). Phospholipids with two tails have $p$ near 1, which is why they form stable bilayers rather than micelles.
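The geometric rule above amounts to a small lookup on the value of $p = v/(a_0 l_c)$. The sketch below is a minimal illustration: the molecular dimensions are hypothetical round numbers, the function names are assumptions, and values of $p$ above 1 (not discussed in the text) are simply flagged as outside the listed ranges.

```python
def packing_parameter(tail_volume_a3: float, head_area_a2: float,
                      tail_length_a: float) -> float:
    """p = v / (a0 * lc), dimensionless."""
    return tail_volume_a3 / (head_area_a2 * tail_length_a)


def predicted_aggregate(p: float) -> str:
    """Map a packing parameter onto the aggregate shapes listed in the text."""
    if p <= 0:
        raise ValueError("packing parameter must be positive")
    if p < 1 / 3:
        return "spherical micelle"
    if p < 1 / 2:
        return "cylindrical micelle"
    if p <= 1:
        return "bilayer"
    return "outside the ranges listed (p > 1)"


# ASSUMPTION: hypothetical dimensions (A^3, A^2, A) for a one-tailed and a two-tailed lipid.
single_tail = packing_parameter(350.0, 65.0, 17.0)
double_tail = packing_parameter(1000.0, 65.0, 18.0)
print(f"{single_tail:.2f} -> {predicted_aggregate(single_tail)}")
print(f"{double_tail:.2f} -> {predicted_aggregate(double_tail)}")
```

Doubling the tail volume at a similar head-group area is what moves a lipid from the micelle regime into the bilayer regime.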
Bilayer properties. A phospholipid bilayer is approximately 5 nm thick (depending on tail length). Lateral diffusion of individual lipid molecules within the plane of the bilayer is rapid (diffusion coefficient roughly 1 µm²/s at 37 degrees C), allowing membrane components to redistribute. Transverse diffusion (flip-flop of a lipid from one leaflet to the other) is energetically costly, because it requires dragging the polar head group through the hydrophobic core, and occurs on a timescale of hours to days without enzymatic assistance. Flippases, floppases, and scramblases catalyse transverse lipid movement.
Phase behaviour. Below a characteristic melting temperature $T_m$, the bilayer is in the gel phase ($L_\beta$): the lipid tails are ordered and extended, and lateral diffusion is slow. Above $T_m$, the bilayer enters the fluid (liquid-crystalline, $L_\alpha$) phase: the tails are disordered, and lateral diffusion is fast. Saturated tails (no double bonds) pack tightly and raise $T_m$; unsaturated tails (cis double bonds) introduce kinks that disrupt packing and lower $T_m$. Bacteria adjust their fatty acid composition in response to temperature changes (homeoviscous adaptation) to maintain functional membrane fluidity.
Cholesterol intercalates between phospholipid tails with its hydroxyl group at the lipid-water interface and its rigid ring system within the upper portion of the bilayer. At temperatures above $T_m$, cholesterol restrains tail motion, decreasing fluidity. At temperatures below $T_m$, it prevents tight packing, maintaining fluidity. This dual buffering effect keeps membrane properties within a functional range across a wider temperature span. Mammalian cell membranes contain 20-50 mol percent cholesterol.
Lipid rafts are cholesterol- and sphingolipid-enriched microdomains proposed to serve as platforms for signal transduction, membrane trafficking, and pathogen entry. Their existence has been supported by detergent-resistance assays and single-particle tracking, though their size (10-200 nm) and lifetime remain debated.
Membrane protein insertion exploits the same thermodynamic principles. A transmembrane alpha-helix of approximately 20 hydrophobic residues spans the hydrophobic core (roughly 3 nm) of the 5 nm bilayer with its polar backbone hydrogen-bonded internally and its hydrophobic side chains facing the lipid tails. The energetic cost of inserting a charged residue into the bilayer core is approximately 2-4 kcal/mol, which is why transmembrane domains are overwhelmingly hydrophobic. The signal recognition particle (SRP) targets nascent membrane proteins to the endoplasmic reticulum, where the Sec61 translocon provides a gated channel for co-translational insertion of transmembrane helices into the bilayer. Beta-barrel outer-membrane proteins (in bacteria, mitochondria, and chloroplasts) are inserted post-translationally by the Bam complex, folding into cylindrical barrels whose hydrogen-bonded strands close upon themselves.
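The figure of about 20 residues follows from alpha-helix geometry: a pitch of 5.4 Å over 3.6 residues per turn gives a rise of 1.5 Å per residue. The sketch below assumes a hydrophobic core about 30 Å thick, which is an illustrative value not stated in the text; the function names are hypothetical.

```python
HELIX_PITCH_A = 5.4        # Angstrom per turn
RESIDUES_PER_TURN = 3.6


def rise_per_residue_a() -> float:
    """Axial rise per residue of an alpha-helix, in Angstrom."""
    return HELIX_PITCH_A / RESIDUES_PER_TURN


def residues_to_span(thickness_a: float) -> float:
    """Number of helical residues needed to cross a slab of the given thickness."""
    return thickness_a / rise_per_residue_a()


# ASSUMPTION: hydrophobic core of about 30 Angstrom (illustrative).
print(f"rise per residue: {rise_per_residue_a():.2f} A")
print(f"residues to span 30 A: {residues_to_span(30.0):.1f}")
```

A helix tilted relative to the membrane normal would need somewhat more residues to cross the same core.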
Carbohydrate diversity and cellular function [Master]
Carbohydrates exhibit structural diversity generated by stereochemistry at multiple chiral centres and by the variety of glycosidic linkage positions. This diversity underlies their roles in energy storage, structural support, cell-cell recognition, and immune function.
Glycosidic bond stereochemistry. The anomeric carbon (C1 in aldoses, C2 in ketoses) of a sugar in ring form can adopt the α-configuration (the C1-OH points down in the Haworth projection) or the β-configuration (pointing up). Glycosidic bonds form between the anomeric carbon of one sugar and a hydroxyl-bearing carbon of another, and the α or β designation specifies the configuration at the linkage. Starch and glycogen use α-1,4 linkages (and α-1,6 branches), which produce an open, helical conformation amenable to enzymatic degradation. Cellulose uses β-1,4 linkages, which produce a flat, extended conformation that packs tightly into fibres through inter-chain hydrogen bonds.
Polysaccharide structural diversity. Starch (plant energy storage) consists of amylose (unbranched α-1,4 glucan, forming a left-handed helix with 6 residues per turn) and amylopectin (α-1,4 backbone with α-1,6 branches every 24-30 residues). Glycogen (animal energy storage) is more highly branched (α-1,6 branches every 8-12 residues), providing many non-reducing ends for rapid glucose release by glycogen phosphorylase. Cellulose (β-1,4-glucan chains packed in parallel) forms microfibrils with tensile strength approaching that of steel on a per-weight basis. Chitin (arthropod exoskeletons, fungal cell walls) replaces the C2 hydroxyl of glucose with an acetamido (N-acetylamino) group, adding inter-chain hydrogen bonds.
The diversity of biomolecular function depends on isomerism at multiple scales. At the monomer level, stereochemistry determines biological activity: L-amino acids are the form used in ribosomally synthesised proteins; D-amino acids appear mainly in bacterial cell walls and some antibiotics. Similarly, D-glucose (not L-glucose) is the metabolic fuel. This homochirality is a universal feature of terrestrial life. Stereoisomers in carbohydrates are also biologically distinct: glucose, galactose, and mannose share the formula $\mathrm{C_6H_{12}O_6}$ but differ in hydroxyl configuration, so galactose must be converted to glucose via the Leloir pathway before entering glycolysis, and failure of this conversion causes galactosemia. At the polymer level, sequence isomerism generates immense diversity: a protein of $n$ residues built from 20 amino acids has $20^n$ possible sequences.
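The growth of sequence space with chain length is easy to underestimate. A minimal sketch of the $20^n$ count follows, using Python's arbitrary-precision integers; the function name is hypothetical.

```python
def sequence_count(n_residues: int, alphabet_size: int = 20) -> int:
    """Number of distinct linear sequences of a given length."""
    if n_residues < 0:
        raise ValueError("length must be non-negative")
    return alphabet_size ** n_residues


for n in (2, 10, 100):
    digits = len(str(sequence_count(n)))
    print(f"n = {n:3d}: a {digits}-digit number of possible sequences")
```

Even a modest 100-residue protein has a sequence space far larger than the number of atoms in the observable universe, so life has sampled only a vanishing fraction of it.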
Glycoproteins carry covalently attached oligosaccharide chains that modify protein function, stability, and localisation. N-linked glycans attach to Asn in the consensus sequence Asn-X-Ser/Thr and are assembled on a dolichol lipid carrier in the endoplasmic reticulum before transfer to the protein. O-linked glycans attach to Ser or Thr residues in the Golgi apparatus. Glycosylation is the most common post-translational modification in eukaryotes, affecting an estimated 50 percent of all proteins. Functions include assisting protein folding (calnexin/calreticulin cycle in the ER), protecting against proteolysis, mediating cell-cell adhesion (selectin-carbohydrate interactions in immune cell trafficking), and enabling immune evasion (dense glycan shields on viral envelope proteins).
The ABO blood group system is a direct application of carbohydrate chemistry to medicine. The H antigen (present on type O red blood cells) is a specific oligosaccharide chain on surface glycoproteins and glycolipids, terminating in fucose linked α-1,2 to galactose. Type A individuals express a glycosyltransferase that adds N-acetylgalactosamine (GalNAc) in α-1,3 linkage to the galactose. Type B individuals express a different glycosyltransferase that adds galactose (Gal) in α-1,3 linkage. Type AB individuals express both enzymes. The two enzymes differ by four amino acid substitutions that alter substrate specificity: a single gene with three common alleles encodes two enzyme variants and a non-functional version (type O).
Extracellular matrix polysaccharides provide hydration, compressive resistance, and signalling scaffolds. Hyaluronan is a non-sulfated disaccharide polymer (glucuronic acid β-1,3 N-acetylglucosamine β-1,4) of up to 25,000 disaccharide units (molecular weight up to $10^7$ Da). Its extraordinary water-binding capacity (up to 1000 times its weight in water) underlies its role in joint lubrication and dermal hydration. Glycosaminoglycans (heparan sulfate, chondroitin sulfate, dermatan sulfate) are sulfated polysaccharides attached to core proteins as proteoglycans. Their high negative charge density attracts cations and water, providing compressive resistance to cartilage. Heparan sulfate proteoglycans on the cell surface serve as co-receptors for growth factors (FGF, VEGF), concentrating ligands and presenting them to their signalling receptors.
Synthesis. The four classes of biomolecules are unified by a small repertoire of noncovalent interactions (the hydrophobic effect, hydrogen bonding, electrostatic forces, and van der Waals contacts), and the quantitative treatment of these interactions explains how higher-order cellular architecture arises. The central insight is that the cell exploits these forces hierarchically: hydrophobic collapse at the sub-nanometre scale drives protein folding and membrane assembly, hydrogen bonds stabilise secondary structures and base pairing, and electrostatic interactions govern enzyme specificity and ion selectivity. Taken together, the free-energy landscape of the cell is the sum of millions of weak interactions, and in this sense the emergence of cellular structure is molecular self-assembly. The pattern generalises from individual protein folds to multi-subunit complexes, from lipid bilayers to membrane-bound organelles, and from single glycosidic bonds to extracellular matrices. This unit bridges the atomic-scale chemistry described in 14.01.01 and the cell-scale biology that depends on it: the thermodynamic principles governing a hydrogen bond in a chymotrypsin active site are the same principles that build a membrane, and both rest on the same noncovalent interactions and the physics of water.
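The idea that structure emerges from a sum of weak interactions can be made semi-quantitative with a two-state model. The sketch below is an illustration rather than anything taken from this unit: it converts a net folding free energy into an equilibrium folded fraction using K = exp(-ΔG/RT). The function name, the 310 K temperature, and the example free energies are assumptions chosen for illustration.

```python
import math

R = 8.314e-3  # gas constant, kJ/(mol*K)

def fraction_folded(delta_g_kj: float, temp_k: float = 310.0) -> float:
    """Two-state folded fraction; delta_g_kj is the net folding free energy
    in kJ/mol (negative values favour the folded state)."""
    k_eq = math.exp(-delta_g_kj / (R * temp_k))  # K = [folded] / [unfolded]
    return k_eq / (1.0 + k_eq)

# Illustrative net free energies, not measured values.
for dg in (0.0, -5.0, -20.0, -40.0):
    print(f"dG = {dg:6.1f} kJ/mol -> folded fraction {fraction_folded(dg):.4f}")
```

Because the dependence is exponential, even a modest net stabilisation shifts the population almost entirely to the folded state, which is why a small imbalance among many opposing weak interactions is enough to define a structure.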
Connections [Master]
Lewis structures and molecular bonding
14.02.01. The covalent bonding framework (octet rule, electronegativity, bond polarity) describes why carbon, nitrogen, oxygen, and phosphorus form the specific functional groups that define monomer chemistry. This unit assumes those bonding patterns and builds the macromolecular edifice on top of them.
Acid-base chemistry and pKa
14.10.01. The protonation state of amino acid side chains at physiological pH determines protein charge, enzyme active-site chemistry, and the zwitterionic form of free amino acids. The Henderson-Hasselbalch framework developed there explains why carboxyl groups are deprotonated and amino groups are protonated at pH 7.
Chemical thermodynamics
14.06.01. The free energy framework governs every condensation and hydrolysis reaction in this unit and extends to the folding, self-assembly, and binding equilibria treated in the Master tier. ATP coupling is the central application.
Cell membranes — structure
17.02.01. The phospholipid amphipathicity and bilayer thermodynamics introduced here become the organising principle for the next unit's treatment of membrane structure, fluidity, and protein insertion.
Cellular respiration — glycolysis and the citric acid cycle
17.04.01. Glucose, the carbohydrate monomer described here, is the substrate for the metabolic pathway that generates the ATP whose hydrolysis drives cellular work. The downstream unit traces ATP coupling through the complete oxidative pathway.
DNA replication and transcription
17.05.01 (pending). The nucleic acid monomer-polymer chemistry introduced here (phosphodiester bonds, base pairing, the sugar-phosphate backbone) is the chemical prerequisite for understanding the mechanisms of information storage, copying, and readout.
Amino acids and protein chemistry
15.12.01 (pending). The organic chemistry perspective on amino acid structure, stereochemistry, and reactivity complements this unit's biochemical treatment. 15.12.01 (pending) provides the detailed reaction mechanisms that this unit summarises.
Cellular organization: organelles
17.03.01 (pending). The four classes of biomolecules introduced here, namely lipids (membrane compartments), proteins (enzymatic and transport functions), nucleic acids (genetic information in nucleus and mitochondria), and carbohydrates (energy storage and surface markers), are the molecular constituents from which every organelle is built. The organelle unit describes how these biomolecules self-assemble into the compartmentalised architecture of the eukaryotic cell.
Mutation and repair
17.06.01 (pending). The nucleic acid chemistry described here (the N-glycosidic bond linking bases to deoxyribose, the exocyclic amino group on cytosine, and the conjugated pi-systems of purine and pyrimidine rings) explains why DNA is vulnerable to depurination, deamination, and UV photoproducts. The mutation and repair unit builds on these structural vulnerabilities to explain the molecular mechanisms of DNA damage and its correction.
Historical & philosophical context [Master]
The recognition that living matter obeys the same chemical principles as non-living matter was established by Friedrich Wöhler's synthesis of urea from ammonium cyanate in 1828 [Wöhler 1828]. Wöhler's result demolished the vitalist position that organic compounds required a "vital force" for their production and opened the mechanistic biochemistry that followed, though vitalism persisted in attenuated forms for decades.
Emil Fischer's 1894 work on enzyme specificity introduced the lock-and-key analogy: the stereochemical complementarity between an enzyme's active site and its substrate determines catalytic selectivity [Fischer 1894]. Fischer's insight — that molecular geometry governs biological recognition — anticipated the entire field of molecular recognition and remains the conceptual foundation for understanding receptor-ligand interactions, antibody-antigen binding, and substrate selectivity.
Linus Pauling's 1936 paper with Alfred Mirsky established the role of hydrogen bonding in maintaining protein structure [Pauling 1936]. Pauling proposed that denaturation disrupts the hydrogen-bond network while leaving covalent bonds intact, and that the native structure represents a hydrogen-bond-stabilised state — the direct precursor to the modern understanding of secondary structure. His 1951 prediction of the alpha-helix and beta-sheet from stereochemical reasoning, before any protein structure had been experimentally determined, is one of the great predictive successes in structural biology.
Christian Anfinsen's experiments on ribonuclease A refolding, published from 1961 and summarised in his 1973 Nobel lecture [Anfinsen 1973], established the thermodynamic hypothesis: the native fold is the global free-energy minimum of the polypeptide chain under physiological conditions. Charles Tanford's monographs on the hydrophobic effect [Tanford 1980] quantified the entropy-driven nature of hydrophobic burial and provided the thermodynamic framework for understanding membrane self-assembly and protein folding as manifestations of the same physical phenomenon.
The determination of the first protein structures by John Kendrew (myoglobin, 1958) and Max Perutz (haemoglobin, 1960) using X-ray crystallography revealed the three-dimensional complexity of protein folds and established the sequence-structure relationship. The first atomic-resolution lipid bilayer structures came much later, with molecular dynamics simulations (CHARMM force field, Karplus and coworkers, 1980s onward) providing atomistic models that complement experimental structures from X-ray diffraction and NMR spectroscopy. Hermann Staudinger's 1920 proposal that polymers are giant molecules held together by covalent bonds — controversial at the time but confirmed by Svedberg's ultracentrifugation — laid the conceptual groundwork for all of macromolecular chemistry (Nobel Prize 1953).
Bibliography [Master]
Primary literature:
- Wöhler, F., "Ueber künstliche Bildung des Harnstoffs," Annalen der Physik und Chemie 88 (1828), 253-256.
- Fischer, E., "Einfluss der Configuration auf die Wirkung der Enzyme," Ber. Dtsch. Chem. Ges. 27 (1894), 2985-2993.
- Staudinger, H., "Über Polymerisation," Ber. Dtsch. Chem. Ges. 53 (1920), 1073-1085.
- Pauling, L. & Mirsky, A. E., "On the structure of native, denatured, and coagulated proteins," Proc. Natl. Acad. Sci. USA 22 (1936), 439-447.
- Pauling, L., Corey, R. B. & Branson, H. R., "The structure of proteins: two hydrogen-bonded helical configurations of the polypeptide chain," Proc. Natl. Acad. Sci. USA 37 (1951), 205-211.
- Chargaff, E., "Chemical specificity of nucleic acids and mechanism of their enzymatic degradation," Experientia 6 (1950), 201-209.
- Watson, J. D. & Crick, F. H. C., "Molecular structure of nucleic acids: a structure for deoxyribose nucleic acid," Nature 171 (1953), 737-738.
- Kendrew, J. C. et al., "A three-dimensional model of the myoglobin molecule obtained by X-ray analysis," Nature 181 (1958), 662-666.
- Matthews, B. W. et al., "Three-dimensional structure of tosyl-α-chymotrypsin," Nature 214 (1967), 652-656.
- Anfinsen, C. B., "Principles that govern the folding of protein chains," Science 181 (1973), 223-230.
- Tanford, C., The Hydrophobic Effect: Formation of Micelles and Biological Membranes, 2nd ed. (Wiley, 1980).
Textbooks and monographs:
- Alberts, B. et al., Molecular Biology of the Cell, 7th ed. (Garland Science, 2022).
- Berg, J. M., Tymoczko, J. L. & Stryer, L., Biochemistry, 9th ed. (W. H. Freeman, 2019).
- Voet, D. & Voet, J. G., Biochemistry, 5th ed. (Wiley, 2019).
- Dill, K. A. & Bromberg, S., Molecular Driving Forces, 2nd ed. (Garland Science, 2011).
- Israelachvili, J. N., Intermolecular and Surface Forces, 3rd ed. (Academic Press, 2011).