Punctuation: semicolons, colons, dashes
Anchor (Master): Nunberg, The Linguistics of Punctuation (1990)
Intuition [Beginner]
Three kinds of punctuation mark let you connect ideas more closely than a period does, with more structure than a comma gives:
Semicolons (;) join two related sentences into one. Think of them as "super commas" or "weak periods." Use them when two independent clauses are closely related and you want the reader to see the connection:
- "It was getting late; we decided to head home."
- "The experiment failed; however, we learned a great deal."
Colons (:) mean "here it comes." They signal that what follows explains, illustrates, or lists what came before:
- "I need three things: milk, bread, and eggs."
- "The reason is simple: we ran out of time."
- "Shakespeare wrote: 'All the world's a stage.'"
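Dashes come in two varieties. The em dash, the longer mark (typed in this text as a spaced double hyphen, " -- "), creates emphasis or marks a sharp break. The en dash, the shorter mark (typed here as an unspaced double hyphen, "--"), indicates a range or a connection between two endpoints: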
- Em dash: "The answer -- and only one answer -- is correct."
- Em dash: "She packed everything she owned -- clothes, books, photos -- into one suitcase."
- En dash: "pages 10--15", "2020--2024", "the London--Paris flight"
Quick guide to choosing:
| Situation | Use |
|---|---|
| Two related sentences | Semicolon |
| Before a list or explanation | Colon |
| Before a quotation | Colon |
| Sudden break or emphasis | Em dash |
| Number or date range | En dash |
Visual [Beginner]
SEMICOLON -- joins related independent clauses:
"The sun was setting; the sky turned orange."
                    ^
(Both sides could stand alone as sentences.)
With a conjunctive adverb:
"It rained all day; nevertheless, we hiked."
                  ^             ^
(semicolon + adverb + comma)
COLON -- "here comes the explanation/list/quote":
LIST: I need three items: pen, paper, and ink.
                        ^
EXPLANATION: The result was clear: we had won.
                                 ^
QUOTATION: She announced: "The winner is Alice."
                        ^
EM DASH (--) -- emphasis, interruption, or parenthetical:
EMPHASIS: The answer -- and this surprised everyone -- was zero.
                     ^                              ^
INTERRUPTION: He started to explain -- but then the phone rang.
                                    ^
AFTER LIST: Pen, paper, ink -- all were scattered on the desk.
                            ^
EN DASH (--) -- ranges:
pages 45--52    January--March    2018--2022
        ^              ^              ^
Worked example [Beginner]
Punctuate each sentence correctly.
1. "The data was incomplete we could not draw a conclusion"
- Two related independent clauses with no conjunction.
- Semicolon: "The data was incomplete; we could not draw a conclusion."
2. "I need to buy eggs milk cheese and butter at the store"
- A list follows the verb. "I need to buy" is not a complete statement on its own, so no colon belongs before the list.
- Commas between items: "I need to buy eggs, milk, cheese, and butter at the store."
- To use a colon, first complete the clause: "I need to buy the following: eggs, milk, cheese, and butter."
3. "The team played well but they lost the game in the final minutes"
- Two independent clauses joined by "but" -- use a comma before the coordinating conjunction.
- "The team played well, but they lost the game in the final minutes."
- Alternatively, with a semicolon (for stronger separation): "The team played well; but they lost the game in the final minutes." (Less common.)
4. "The book had only one flaw it ended too soon"
- The second clause explains the first.
- Colon: "The book had only one flaw: it ended too soon."
- Or, more dramatically, an em dash: "The book had only one flaw -- it ended too soon."
5. "Please read chapters 4 through 7 for homework"
- A range of numbers.
- En dash: "Please read chapters 4--7 for homework."
Formal definition [Intermediate+]
Semicolon (;). The semicolon joins two independent clauses without a coordinating conjunction. It indicates a closer semantic relationship between the clauses than a period would. The clauses it joins typically have a coordinate or contrastive relationship. The semicolon also separates items in a list when one or more items contain internal commas:
- "The guests came from Portland, Oregon; Austin, Texas; and Albany, New York."
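The list rule above can be sketched in code. This is a minimal Python illustration, not a style-guide implementation, and the name `join_series` is hypothetical. It switches the separator from a comma to a semicolon as soon as any item contains a comma of its own.

```python
def join_series(items):
    """Join list items into a series, using semicolons only when needed.

    Hypothetical helper for illustration: if any item contains an internal
    comma, semicolons separate the items; otherwise commas do.
    """
    items = list(items)
    if len(items) < 2:
        return "".join(items)
    if len(items) == 2:
        # Two items need only "and", e.g. "milk and bread".
        return f"{items[0]} and {items[1]}"
    separator = "; " if any("," in item for item in items) else ", "
    # Keep the separator before the final "and", as in the example above.
    return separator.join(items[:-1]) + f"{separator}and {items[-1]}"
```

Called with `["Portland, Oregon", "Austin, Texas", "Albany, New York"]`, the function reproduces the sentence fragment above; called with `["milk", "bread", "eggs"]`, it falls back to plain commas.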
Colon (:). The colon follows an independent clause and signals that what follows elaborates on it. The three main uses:
| Use | Pattern | Constraint |
|---|---|---|
| List | IC: item, item, and item | The clause before the colon must be a complete independent clause |
| Explanation | IC: explanation | Same constraint |
| Quotation | IC: "quotation" | Same constraint |
A common error is placing a colon after a fragment: *"The items are: milk, bread, and eggs" is incorrect because "The items are" is not an independent clause. Correct: "I need the following items: milk, bread, and eggs."
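The independent-clause constraint can be approximated with a rough heuristic. The sketch below is an assumption-laden illustration, not a grammar checker: the function name `colon_follows_fragment` and the word list `FRAGMENT_ENDINGS` are hypothetical, and the check only looks at the single word before the first colon.

```python
import re

# Hypothetical, deliberately small word list: words that usually leave the
# clause before a colon incomplete (linking verbs, "include", prepositions).
FRAGMENT_ENDINGS = {
    "is", "are", "was", "were", "include", "includes", "including",
    "to", "of", "for", "with", "by", "from", "as",
}

def colon_follows_fragment(sentence):
    """Return True if the word just before the first colon suggests a fragment."""
    head, colon, _ = sentence.partition(":")
    if not colon:
        return False  # no colon, nothing to check
    words = re.findall(r"[A-Za-z']+", head)
    return bool(words) and words[-1].lower() in FRAGMENT_ENDINGS
```

The heuristic flags "The items are: milk, bread, and eggs" and passes "I need the following items: milk, bread, and eggs." It cannot tell whether a clause is truly independent, so it will miss fragments that end in other words.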
Em dash (typed here as " -- "). The em dash marks a sharp break in thought, an interruption, or a parenthetical element. It is more emphatic than commas and less formal than parentheses. Em dashes can be used singly (to introduce a terminal parenthetical) or in pairs (to enclose an interruption):
- Single: "She finally understood -- the answer had been obvious all along."
- Pair: "The answer -- if you can call it that -- was unsatisfying."
En dash (typed here as "--" with no surrounding spaces). The en dash indicates a range between two endpoints (numbers, dates, pages) or a connection between two entities:
- Range: "pages 45--52", "2018--2024"
- Connection: "the Boston--New York route"
The en dash is shorter than the em dash and longer than a hyphen. In practice, many writers use a hyphen for ranges in informal contexts.
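Where hyphens have been typed for numeric ranges, the substitution is mechanical enough to sketch. The Python function below is a hypothetical helper (`normalize_numeric_ranges` is not from any library) that replaces one or two ASCII hyphens between digits with the Unicode en dash, U+2013.

```python
import re

EN_DASH = "\u2013"  # Unicode EN DASH

# One or two ASCII hyphens between digits, e.g. "45-52" or "45--52".
_NUMERIC_RANGE = re.compile(r"(?<=\d)-{1,2}(?=\d)")

def normalize_numeric_ranges(text):
    """Replace hyphens between digits with an en dash (hypothetical helper)."""
    return _NUMERIC_RANGE.sub(EN_DASH, text)
```

This handles "pages 45-52" and "2018--2024" and leaves "well-known" alone. It is only a sketch: it would also rewrite digit-hyphen-digit strings that are not ranges, such as ISO dates or phone numbers, and it does not touch word ranges like "January--March".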
Key concepts [Intermediate+]
Semicolon with conjunctive adverbs. Conjunctive adverbs (however, therefore, moreover, furthermore, nevertheless, consequently, thus, meanwhile, otherwise, instead, accordingly, besides, indeed, rather) transition between independent clauses. The pattern is: IC; adverb, IC. "It was late; therefore, we left." The semicolon is required because conjunctive adverbs are adverbs, not conjunctions, and cannot join clauses alone.
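The semicolon, adverb, comma pattern can be expressed as a small template. The following Python sketch is illustrative only; `join_with_adverb` is a hypothetical name, and the adverb set simply copies the list given in the paragraph above.

```python
CONJUNCTIVE_ADVERBS = {
    "however", "therefore", "moreover", "furthermore", "nevertheless",
    "consequently", "thus", "meanwhile", "otherwise", "instead",
    "accordingly", "besides", "indeed", "rather",
}

def join_with_adverb(first_clause, adverb, second_clause):
    """Build "IC; adverb, IC." from two clauses (hypothetical helper).

    Assumes the caller passes two independent clauses, with the second
    written as it should appear mid-sentence (not capitalized unless it
    starts with a proper noun or "I").
    """
    if adverb.lower() not in CONJUNCTIVE_ADVERBS:
        raise ValueError(f"{adverb!r} is not in the conjunctive adverb list")
    first = first_clause.strip().rstrip(" .;,")   # drop any end punctuation
    second = second_clause.strip().rstrip(".")
    return f"{first}; {adverb.lower()}, {second}."
```

For example, `join_with_adverb("It was late", "therefore", "we left")` yields the model sentence from the paragraph. Passing a coordinating conjunction such as "but" raises an error, since that case calls for a comma rather than a semicolon.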
Colon vs. dash for introducing an explanation. Both the colon and the em dash can introduce an explanation or elaboration. The colon is more formal and requires a complete clause before it. The dash is more informal and dramatic. "There was one problem: the car would not start" (formal, explanatory) vs. "There was one problem -- the car would not start" (dramatic, abrupt).
Hyphen, en dash, and em dash are three distinct marks, listed here from shortest to longest. The hyphen (-) joins compound words ("well-known"). The en dash indicates ranges ("2020--2024"). The em dash marks breaks and parentheticals. In typed text, en and em dashes are often represented as hyphens or double hyphens, as they are throughout this text, but formal typography distinguishes all three.
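In digital text the three marks are separate Unicode characters, which a short Python sketch can make concrete. The dictionary `MARKS` and the function `describe_marks` are illustrative names; `unicodedata.name` is the standard-library call that returns a character's official Unicode name.

```python
import unicodedata

# The three marks as distinct Unicode characters.
MARKS = {
    "hyphen": "-",         # the keyboard hyphen, joins compound words
    "en dash": "\u2013",   # ranges and connections
    "em dash": "\u2014",   # breaks and parentheticals
}

def describe_marks():
    """Return (label, code point, official Unicode name) for each mark."""
    return [
        (label, f"U+{ord(char):04X}", unicodedata.name(char))
        for label, char in MARKS.items()
    ]
```

Note that Unicode names the keyboard hyphen HYPHEN-MINUS, because the same key also serves as a minus sign in typed text.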
Overuse of dashes. Em dashes are emphatic and should be used sparingly. In formal prose, more than two or three pairs per page is generally excessive. Overused, dashes make writing feel breathless and disjointed. Commas or parentheses are usually quieter alternatives.
Linguistic theory [Master]
The semicolon as a marker of discourse coherence. The semicolon does not mark a syntactic relationship in the way that subordinating conjunctions do. Rather, it signals to the reader that two clauses should be processed as a single discourse unit. In terms of Halliday and Hasan's (1976) framework of cohesion, the semicolon can be read as an implicit marker of conjunctive cohesion: it invites the reader to infer a relationship (cause, contrast, elaboration, temporal sequence) that is not lexically explicit.
This inferential quality makes the semicolon pragmatic rather than purely syntactic. The same two clauses can be joined by a semicolon or separated by a period, and the difference is a matter of information packaging, not grammaticality. "She studied hard; she passed" packages the information as a single unit and invites a causal reading. "She studied hard. She passed" presents two separate assertions and gives the reader no such cue.
The colon as a construction. The colon participates in what construction grammarians might call the specification construction: a general statement followed by a specific instance or enumeration. The constraint that the pre-colon clause must be independent is not arbitrary -- it reflects the requirement that the general statement be a complete assertion capable of standing on its own before being elaborated.
The dash in spoken vs. written language. The em dash has no clear prosodic counterpart in speech. While commas correspond to brief pauses and periods to terminal falls, the em dash seems to correspond to a sudden shift in intonation or a dramatic pause that is difficult to transcribe precisely. This makes the dash the most "writerly" of the three marks discussed here -- it is a tool of written rhetoric with no direct spoken equivalent.
The en dash and typographic convention. The en dash's use for ranges is a typographic convention that emerged with modern printing. It has no grammatical function in the syntactic sense; its role is purely notational, indicating that two endpoints form a span. In linguistic terms, the en dash functions as a relational operator between two nominal expressions.
Historical context [Master]
The semicolon was introduced by Aldus Manutius the Elder around 1494, originally as a mark indicating a pause of intermediate length -- longer than a comma but shorter than a colon. In the elocutionary system of punctuation that dominated through the 16th and 17th centuries, punctuation marks were ranked by pause duration: comma (shortest), semicolon, colon, period (longest). This hierarchy is still echoed in the Italian names punto e virgola ("point and comma") and due punti ("two points").
The semicolon's function shifted from prosodic to syntactic during the 18th and 19th centuries. Grammarians began prescribing its use for joining independent clauses rather than marking a medium-length pause. By the late 19th century, the modern rule -- semicolon between independent clauses not joined by a coordinating conjunction -- was well established.
The colon's history is older. The Greek colon (meaning "limb" or "part") originally referred to a segment of a sentence. In medieval punctuation, the punctus medius (middle dot, later the colon) marked a division within a sentence. Its modern function as a signal of specification or enumeration developed in the 17th and 18th centuries.
The em dash derives from the printer's em quad, a blank space equal to the width of the letter M in a given typeface. Its use as a punctuation mark for interruption and emphasis developed in 18th-century English printing. The practice of using two em dashes to indicate a missing word or name ("Mr. D------") was common in 18th- and 19th-century novels.
The en dash takes its name from the en quad, a space equal to the width of the letter N. Its use for ranges is a relatively modern typographic convention, becoming standardized in the 20th century as style guides distinguished it from the hyphen and em dash.
Kurt Vonnegut famously disparaged the semicolon in A Man Without a Country (2005): "They are transvestite hermaphrodites representing absolutely nothing. All they do is show you've been to college." This is a minority view, but it reflects a genuine tension: the semicolon is the most "literary" of punctuation marks, and its use signals a certain register of formal prose.
Bibliography [Master]
- Halliday, M.A.K. & Hasan, R. (1976). Cohesion in English. Longman.
- Huddleston, R. & Pullum, G.K. (2002). The Cambridge Grammar of the English Language. Cambridge University Press.
- Manutius, A. (the Younger) (1561). Orthographiae ratio. Venice.
- Nunberg, G. (1990). The Linguistics of Punctuation. CSLI.
- Parkes, M. (1993). Pause and Effect: An Introduction to the History of Punctuation in the West. University of California Press.
- Quirk, R., Greenbaum, S., Leech, G., & Svartvik, J. (1985). A Comprehensive Grammar of the English Language. Longman.
- Truss, L. (2003). Eats, Shoots & Leaves: The Zero Tolerance Approach to Punctuation. Profile Books.