Philosophy of science: demarcation, falsification, and paradigms
Anchor (Master): primary sources: Popper 1935, Kuhn 1962, Lakatos 1970, Feyerabend 1975, van Fraassen 1980
Intuition Beginner
What counts as science? The question sounds simple. Astronomy is science; astrology is not. Evolutionary biology is science; creationism is not. Physics is science; phrenology is not. Most people can sort individual cases without much trouble. But the principle behind the sorting has resisted every attempt to state it cleanly, and the failures have shaped the last century of philosophy of science.
The problem has a name: the demarcation problem. It asks for a criterion that separates science from non-science (or from pseudoscience, or from pretence to science) in a way that tracks what actually matters — not just whether a claim uses technical jargon or has institutional backing, but whether it earns its conclusions in a way that deserves the label "scientific".
Why does it matter? Three reasons that compound each other. First, resources: governments fund research, universities hire faculty, journals publish papers, and all of these decisions presuppose some demarcation, even if nobody writes it down. Second, authority: scientific claims carry weight in public policy, medicine, and law; if the label "science" can be attached without constraint, that authority is counterfeit. Third, self-understanding: scientists themselves operate with implicit criteria about what counts as good practice, and making those criteria explicit is part of understanding what the enterprise is doing.
The history of the demarcation problem is a sequence of bold proposals, each followed by decisive counterexamples. The logical positivists of the 1920s and 1930s proposed the verification principle: a statement is meaningful only if it can be verified by observation. Science consists of statements that pass this test; metaphysics, religion, and much of traditional philosophy are literally meaningless — not false, but without content. The proposal was dramatic, and it failed.
No version of the verification principle could be stated in a form that both excluded the intended targets (metaphysics, theology) and included what the positivists wanted to keep (scientific laws, theoretical entities, universal generalisations). Universal claims like "all ravens are black" can never be verified by finite observation; yet they are paradigmatic scientific statements. The principle also undermined itself: it is neither true by definition nor verifiable by observation, so by its own standard it is meaningless.
Karl Popper, working in Vienna at the same time, proposed the mirror image: what makes a theory scientific is not that it can be verified but that it can be falsified. A theory is scientific if it makes predictions that could, in principle, be shown wrong by observation. Astrology is not scientific because its predictions are so vague that nothing counts as a refutation. Marxism, in Popper's view, had once been scientific but had degenerated into unfalsifiable doctrine. Einstein's general relativity was the gold standard: it predicted the bending of starlight by gravity, a specific, quantitative, risky prediction that observation could have contradicted.
Falsificationism is more durable than verificationism, and it captures something important about scientific practice. But it too faces counterexamples. Real scientists do not abandon their theories when a single experiment disagrees with prediction. They blame the instruments, the auxiliary assumptions, the calibration, the background conditions. And they are sometimes right to do so. Newtonian mechanics predicted the perihelion advance of Mercury incorrectly for decades before general relativity explained the discrepancy. By strict falsificationist logic, Newtonian mechanics should have been abandoned long before Einstein arrived. It was not, because it was too successful in too many domains to discard on one anomaly.
The demarcation problem, then, is not a technicality. It is a window onto the deepest questions about how scientific knowledge is produced, how it changes, and what makes it authoritative.
Visual Beginner
Figure: A horizontal spectrum with five columns — pseudoscience (astrology, no testable predictions), protoscience (string theory circa 1985, testable but unconfirmed), mature science (general relativity, specific predictions, rigorous testing), historical science (evolutionary biology, retrodiction and converging evidence), and formal science (mathematics, proof not observation). Three arrows show why verificationism, falsificationism, and Kuhnian paradigms each fail to draw a single sharp boundary between the columns.
Worked example Beginner
Consider two theories about the motion of planets. Theory A says: planets move in ellipses with the Sun at one focus, and the square of the orbital period is proportional to the cube of the semi-major axis. Theory B says: planets move according to the will of a cosmic intelligence whose purposes are imperfectly understood by humans.
Theory A makes specific, quantitative predictions. You can calculate where Mars should be on the night of 14 March 2027, and if Mars is somewhere else, the theory is wrong. The prediction is risky in Popper's sense: it stakes out a definite claim that observation can contradict. The theory has been tested thousands of times. No number of successful predictions proves the theory true (this is Popper's point), but any failed prediction would, in principle, count against it.
Theory B does not make risky predictions. Whatever Mars does on 14 March 2027, a defender of Theory B can say: the cosmic intelligence intended that outcome, and our imperfect understanding did not anticipate it. No observation contradicts the theory because the theory accommodates any observation after the fact. This is not a weakness in the theory, exactly — it is a structural feature of the kind of claim the theory makes. But it is the feature that disqualifies it from being scientific in Popper's sense.
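To make the contrast concrete, here is a minimal sketch of the kind of risky, checkable number Theory A commits itself to. It uses only the period law stated above, in units where the period is in years and the semi-major axis is in astronomical units. The function name is hypothetical, and the Mars figures are standard reference values rather than anything taken from this text.

```python
def orbital_period_years(semi_major_axis_au: float) -> float:
    """Kepler's third law in solar units: T^2 = a^3 (T in years, a in AU)."""
    return semi_major_axis_au ** 1.5


# Assumed reference values for Mars (not from the text above).
MARS_SEMI_MAJOR_AXIS_AU = 1.524
MARS_OBSERVED_PERIOD_YEARS = 1.881

predicted = orbital_period_years(MARS_SEMI_MAJOR_AXIS_AU)
discrepancy = abs(predicted - MARS_OBSERVED_PERIOD_YEARS)

# A large discrepancy here would count against Theory A.
# Theory B offers no comparable number to check.
print(f"predicted {predicted:.3f} yr, observed {MARS_OBSERVED_PERIOD_YEARS} yr")
print(f"discrepancy {discrepancy:.4f} yr")
```

The point of the sketch is structural: the function can return a value that disagrees with observation, which is exactly what Theory B never risks.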
Now consider a harder case. In 1915, Einstein's general relativity predicted that starlight passing near the Sun would be deflected by 1.75 arcseconds. Newtonian gravity predicted either 0 or 0.87 arcseconds, depending on the calculation. Eddington's 1919 eclipse expedition measured approximately 1.6 arcseconds — closer to Einstein than to Newton. This was a risky prediction: had the measurement come out at 0, Einstein's theory would have been in serious trouble.
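The two competing figures can be reproduced with a short calculation. The sketch below assumes the standard weak-field formula for a light ray grazing the Sun, 4GM/(c²R) for general relativity and half of that for the Newtonian corpuscular calculation. The physical constants are rounded reference values and the function name is hypothetical.

```python
import math

# Rounded reference constants (assumptions, not taken from the text).
G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2
M_SUN = 1.989e30   # solar mass, kg
R_SUN = 6.957e8    # solar radius, m
C = 2.998e8        # speed of light, m/s

RAD_TO_ARCSEC = 180 / math.pi * 3600


def deflection_arcsec(relativistic: bool) -> float:
    """Deflection of a light ray grazing the solar limb, in arcseconds."""
    newtonian = 2 * G * M_SUN / (C ** 2 * R_SUN)
    angle = 2 * newtonian if relativistic else newtonian
    return angle * RAD_TO_ARCSEC


einstein = deflection_arcsec(relativistic=True)
newton = deflection_arcsec(relativistic=False)
print(f"general relativity: {einstein:.2f} arcsec")
print(f"Newtonian:          {newton:.2f} arcsec")
```

The two outputs land close to the 1.75 and 0.87 arcsecond figures quoted above, which is what made the eclipse measurement a discriminating test rather than a mere confirmation.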
But here is the complication that Popper's simple picture does not capture. Eddington's data were noisy. Some plates supported Einstein; others were closer to Newton. Eddington made judgement calls about which plates to include and which to discard. Was the result a genuine test, or did the experimenter's expectations shape the analysis?
The episode illustrates that even the cleanest falsification is embedded in a web of auxiliary assumptions — about instruments, about data processing, about what counts as a good plate. The Duhem-Quine thesis, discussed in the Intermediate section, makes this point in full generality: you never test a single hypothesis in isolation; you test the whole package.
What this tells us: the simple distinction between "falsifiable" and "not falsifiable" works well for extreme cases (astrology versus astronomy) but becomes murky in the middle, where real scientific judgement operates. The demarcation problem is not about drawing a line and being done; it is about understanding the structure of scientific reasoning well enough to know why some practices deserve confidence and others do not.
Formal definition Intermediate+
The demarcation problem can be stated as a request for a function $D$ from the set of theories (or belief systems, or research programmes) to the set $\{\text{science}, \text{non-science}\}$, such that $D$ captures what is epistemically distinctive about scientific practice. The major proposals differ in what they take the input to $D$ to be and what they take the discriminating feature to be.
Verificationism (Vienna Circle, 1920s-1930s). A statement is scientifically meaningful iff it is either analytically true (true by definition) or empirically verifiable. The verification condition requires that there exists a finite set of observations that would establish the truth of the statement. Problem: universal generalisations ("all $F$ are $G$") are never finitely verifiable, yet they are central to scientific practice. Successively weaker versions (verification in principle, partial verifiability, confirmability) each proved either too weak to exclude metaphysics or too strong to include science. The programme collapsed in the 1940s-1950s.
Falsificationism (Popper, Logik der Forschung, 1935). A theory $T$ is scientific iff $T$ entails at least one basic statement $p$ such that the negation $\neg p$ is empirically testable. Formally, $T$ is falsifiable iff there exists a potential falsifier, a basic statement $b$ inconsistent with $T$, that is in principle observable. Popper's key insight is asymmetry: no finite number of observations can verify a universal claim, but a single counterexample can falsify one. Science progresses not by accumulating confirmations but by surviving severe tests.
Popper introduced the notion of corroboration: a theory is well-corroborated if it has survived many severe tests. Corroboration is not a measure of truth or probability; it is a historical report on how the theory has performed so far. This is a subtle point: Popper was a deductivist about science. He denied that induction plays any legitimate role in scientific reasoning.
Conventionalist stratagems are Popper's term for the moves scientists make to protect a theory from refutation: introducing ad hoc hypotheses, redefining terms, adjusting auxiliary assumptions. Popper's response is to require that any modification to a theory in the face of counterevidence must increase the theory's falsifiable content — it must make new, testable predictions, not merely accommodate the anomaly. This is Popper's criterion for distinguishing legitimate theoretical revision from immunisation.
Kuhn's paradigms (The Structure of Scientific Revolutions, 1962). Kuhn rejected the idea that science progresses by linear accumulation of knowledge. He distinguished three phases of scientific activity. Pre-paradigm science: multiple competing schools, no consensus on methods or problems. Normal science: a single paradigm dominates — a shared constellation of theories, methods, exemplars, and values that defines what counts as a legitimate problem and a legitimate solution. Scientists spend most of their careers doing normal science: articulating the paradigm, filling in details, resolving minor puzzles. Revolutionary science: anomalies accumulate that the paradigm cannot resolve; a crisis develops; an alternative paradigm is proposed; the scientific community undergoes a paradigm shift.
Kuhn's concept of incommensurability is the claim that competing paradigms do not share a common measure. Observations are theory-laden: what a scientist sees is partly determined by the paradigm they inhabit. Proponents of different paradigms may use the same words but mean different things by them ("mass" in Newtonian versus Einsteinian physics is the standard example). They may disagree about what problems are worth solving and what counts as a solution. Kuhn compared paradigm shifts to Gestalt switches — the same data look different before and after.
Lakatos's research programmes (Falsification and the Methodology of Scientific Research Programmes, 1970). Lakatos attempted to reconcile Popper's rationalism with Kuhn's historicism. A research programme consists of a hard core of central assumptions that are protected from refutation by convention, and a protective belt of auxiliary hypotheses that absorb counterevidence. The programme is progressive if the modifications to the protective belt lead to novel predictions that are confirmed. It is degenerating if the modifications are purely ad hoc — absorbing anomalies without predicting anything new. Demarcation, for Lakatos, is not about individual theories but about the trajectory of research programmes over time. A programme that has been degenerating for an extended period is, while not exactly unscientific, a poor investment of research effort.
Feyerabend's epistemological anarchism (Against Method, 1975). Feyerabend took Kuhn's historical observations to their radical conclusion. If the history of science shows that scientists routinely violate every proposed methodological rule, and yet science progresses, then there is no universal scientific method. Feyerabend's slogan: "anything goes." He argued that the only principle that does not inhibit scientific progress is the principle that anything goes. His target was not science itself but the ideology of a single scientific method, which he saw as stultifying and historically inaccurate. Feyerabend pointed to Galileo's advocacy of heliocentrism as a case study: Galileo succeeded not by following any recognisable scientific method but by using rhetoric, persuasion, and occasionally deceptive arguments.
The Duhem-Quine thesis. Pierre Duhem (1906) and W.V.O. Quine (1951) independently argued that scientific hypotheses are never tested in isolation. When a prediction derived from a theory fails, the logic of the situation is: the conjunction of the central hypothesis $H$, the auxiliary assumptions $A$, and the initial conditions $I$ entails the prediction $P$. Observation yields $\neg P$. By modus tollens, $\neg(H \wedge A \wedge I)$ follows, but this tells you only that something in the conjunction is wrong, not which component. You can always preserve $H$ by modifying some auxiliary assumption. Quine extended this to a holism about confirmation: our beliefs face the "tribunal of experience" as a whole, not one by one. The Duhem-Quine thesis undercuts falsificationism at its foundation: if no single hypothesis can be isolated for testing, then no theory is falsified by a single experiment.
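The logical core of the thesis can be checked by brute force. The sketch below enumerates every truth assignment to four propositions, with the hypothetical variable names `h` (central hypothesis), `a` (auxiliary assumptions), `i` (initial conditions), and `p` (prediction), and keeps only those assignments consistent with the two premises: the conjunction entails the prediction, and the prediction failed.

```python
from itertools import product


def implies(antecedent: bool, consequent: bool) -> bool:
    """Material conditional."""
    return (not antecedent) or consequent


# Valuations of (h, a, i) that survive the premises
# "(h and a and i) implies p" and "not p".
surviving = [
    (h, a, i)
    for h, a, i, p in product([True, False], repeat=4)
    if implies(h and a and i, p) and not p
]

# The conjunction is refuted in every surviving valuation...
conjunction_refuted = all(not (h and a and i) for h, a, i in surviving)
# ...but the central hypothesis on its own is not.
hypothesis_refuted = all(not h for h, a, i in surviving)
# Ways of keeping h true by giving up an auxiliary or an initial condition.
rescues = [(a, i) for h, a, i in surviving if h]

print(conjunction_refuted, hypothesis_refuted, len(rescues))
```

The enumeration shows that the failed prediction rules out the package as a whole while leaving several consistent ways to keep the central hypothesis, which is all the thesis needs at the level of deductive logic.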
Underdetermination
A consequence of the Duhem-Quine thesis is underdetermination: the thesis that the available evidence is insufficient to determine which of several competing theories is correct. In its strong form, underdetermination says that for any body of evidence $E$, there exist multiple mutually incompatible theories $T_1, T_2, \ldots$ each consistent with $E$. The strong form is a logical point: for any finite evidence set, indefinitely many theories can be constructed that fit the data. The interesting question is whether underdetermination is transient (resolvable by future evidence) or permanent (theories that remain empirically equivalent even in principle). The transient form is a fact about the growth of knowledge; the permanent form is a philosophical challenge to scientific realism.
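The logical point about finite evidence can be illustrated with a toy construction. In the sketch below, the "theories" are simply functions from a measured input to a predicted output, the evidence is a handful of made-up data points, and `make_rival` is a hypothetical helper. It builds a rival that agrees with a base theory at every observed input but disagrees elsewhere, by adding a term that vanishes exactly at the observed points.

```python
def theory_a(x: int) -> int:
    """A simple linear 'theory' (toy example)."""
    return 2 * x + 1


def make_rival(base, observed_xs, scale: int = 1):
    """Return a theory that matches `base` on observed_xs but not in general."""
    def rival(x: int) -> int:
        bump = scale
        for xi in observed_xs:
            bump *= (x - xi)  # zero whenever x is an observed input
        return base(x) + bump
    return rival


observed_xs = [0, 1, 2, 3]
evidence = [(x, theory_a(x)) for x in observed_xs]  # made-up data
theory_b = make_rival(theory_a, observed_xs)

fits_a = all(theory_a(x) == y for x, y in evidence)
fits_b = all(theory_b(x) == y for x, y in evidence)

next_x = 4  # an input nobody has measured yet
print(fits_a, fits_b, theory_a(next_x), theory_b(next_x))
```

Each choice of `scale` yields a different rival, so the family of evidence-fitting theories is unbounded. In this toy case the underdetermination is transient, because a measurement at the new input would separate the rivals. The permanent form requires theories that agree on every possible observation.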
Realism vs anti-realism
Scientific realism holds that (i) the world described by science is largely mind-independent, (ii) mature scientific theories are approximately true, and (iii) the theoretical entities postulated by successful theories (electrons, genes, quasars) really exist. The no-miracles argument is the realist's master argument: it would be a miracle if a false theory made such accurate predictions; the best explanation for scientific success is that theories are at least approximately true.
Anti-realism comes in several forms. Instrumentalism holds that theories are instruments for prediction, not descriptions of reality; their terms need not refer to real entities. Constructive empiricism (van Fraassen, 1980) accepts that theories aim to be empirically adequate — to get the observable phenomena right — but withholds commitment to the truth of claims about unobservable entities. The distinction between observable and unobservable becomes the hinge: van Fraassen argues that accepting a theory involves believing its claims about observables but merely accepting (using) its claims about unobservables.
Structural realism (Worrall, 1989) is a compromise: what is preserved across theory change is not the ontology (electrons were not what J.J. Thomson thought they were) but the mathematical structure (the equations relating measurable quantities). Structural realism attempts to explain both the success of science (the structure tracks reality) and the history of theory change (the ontology gets revised).
Key argument — Popper's falsificationism reconstructed Intermediate+
Popper's falsificationism can be reconstructed as an explicit argument with numbered premises.
Premise P1 (asymmetry of falsifiability). No finite set of observation statements can logically entail a universal generalisation $T$, but a single observation statement $b$ can entail $\neg T$ if $T \vdash \neg b$ and the observation yields $b$. This is the logical asymmetry between verification and falsification.
Premise P2 (demarcation criterion). A theory $T$ is scientific iff there exists at least one potential falsifier: a basic (observation) statement $b$ such that $T$ entails $\neg b$ and $b$ is in principle observable. Theories with no potential falsifiers are not scientific.
Premise P3 (scientific rationality). The rational response to a genuine falsification (observation of $b$ where $T \vdash \neg b$) is to reject $T$, or at minimum to regard $T$ as no longer tenable without substantial revision.
Premise P4 (anti-conventionalist constraint). Any revision of $T$ in response to a falsification must increase the theory's empirical content: it must predict something new that the original theory did not.
Conclusion C1 (demarcation). Theories that are in principle falsifiable are scientific; theories that are not are non-science or pseudoscience.
Conclusion C2 (rationality). Scientific rationality consists in subjecting theories to the severest possible tests and revising or replacing them when they fail, while resisting ad hoc modifications that merely accommodate counterevidence.
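The asymmetry in Premise P1 can be sketched in a few lines. The example below uses the "all ravens are black" generalisation from the Beginner section, with made-up observation records of the form (kind, colour). The function names are hypothetical. The search can return a refuting observation, but it has no way to return a proof of the law.

```python
from typing import Callable, Iterable, Optional, Tuple

Observation = Tuple[str, str]  # (kind, colour), toy representation


def all_ravens_are_black(obs: Observation) -> bool:
    """True if this observation is compatible with the universal claim."""
    kind, colour = obs
    return kind != "raven" or colour == "black"


def find_falsifier(
    observations: Iterable[Observation],
    law: Callable[[Observation], bool],
) -> Optional[Observation]:
    """Return the first observation that contradicts the law, if any."""
    for obs in observations:
        if not law(obs):
            return obs
    return None


confirmations = [("raven", "black")] * 1000  # made-up data
status_before = find_falsifier(confirmations, all_ravens_are_black)

with_anomaly = confirmations + [("raven", "white")]
status_after = find_falsifier(with_anomaly, all_ravens_are_black)

# None means "not yet falsified", never "verified".
print(status_before, status_after)
```

A thousand agreeing records leave the law merely unrefuted, while one disagreeing record settles the matter at the level of pure logic. The objection that follows concerns whether real observations ever arrive this cleanly.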
The strongest objection: the Duhem-Quine problem. Premise P1 is correct as a point of deductive logic. Premise P2 defines a property that some theories have and others lack, and the distinction captures something real. The trouble starts with Premise P3. In practice, theories do not generate predictions unaided. The inference from $T$ to the prediction $p$ requires auxiliary assumptions $A$ (about instruments, background conditions, calibration, the absence of interfering factors). What is actually tested is $T \wedge A$, not $T$ alone. When the prediction fails, modus tollens gives $\neg(T \wedge A)$, which is equivalent to $\neg T \vee \neg A$. The scientist can preserve $T$ by rejecting $A$.
This is not merely a logical point. It is an historical fact about how science works. Newtonian celestial mechanics "falsified" itself on Mercury's perihelion for decades. The response was not to reject Newton but to postulate an unseen planet (Vulcan) that would account for the anomaly. Vulcan was an auxiliary hypothesis. It turned out to be wrong — there is no such planet — but the strategy of modifying auxiliaries rather than rejecting the core theory is standard practice, and sometimes it produces genuine discoveries (the discovery of Neptune was exactly this kind of auxiliary-hypothesis manoeuvre, and it was correct).
Popper was aware of the Duhem-Quine problem and responded by requiring that modifications to auxiliaries increase the theory's falsifiable content (Premise P4). But this response concedes the central point: falsification is not the simple, clean logical operation that Premise P1 suggests. It is always embedded in a network of auxiliary commitments, and the decision about which component to revise is a judgement call, not a deduction. Lakatos's research programme framework is the most developed attempt to make this judgement call principled: a progressive programme modifies its protective belt in ways that predict novel facts, while a degenerating programme merely accommodates known anomalies.
Kuhn's paradigms and the structure of scientific revolutions Master
Kuhn's The Structure of Scientific Revolutions (1962) is among the most cited academic books of the twentieth century, and its influence extends far beyond philosophy of science. Kuhn introduced a vocabulary (paradigm, normal science, anomaly, crisis, revolution, incommensurability) that has become standard in discussions of how knowledge changes. The book's central claim is that science does not progress by the steady accumulation of facts toward an ultimate truth. Instead, it alternates between extended periods of conservative puzzle-solving within a dominant framework (normal science) and rare, disruptive episodes in which the framework itself is replaced (scientific revolutions).
Normal science is the activity Kuhn attributes to most scientists most of the time. It is not aimed at discovering new phenomena or testing fundamental assumptions. It is aimed at articulating and extending the existing paradigm — filling in details, resolving puzzles, improving the match between the paradigm's predictions and observation. A paradigm, for Kuhn, is more than a theory. It includes exemplary problem solutions (what Kuhn calls "exemplars"), shared values about what counts as a good explanation, standard experimental techniques, and a consensus about which problems are worth working on. The paradigm defines the rules of the game. Scientists working within a paradigm are not testing the paradigm; they are using it.
This is a striking claim. Popper had argued that the essence of science is critical testing — that scientists should always be trying to refute their theories. Kuhn says that this almost never happens. Normal science is not critical; it is consolidating. It assumes the paradigm is correct and works to bring recalcitrant phenomena into line. This is not a criticism of science; Kuhn insists that normal science is productive. The depth and precision of scientific knowledge depend on the focused, uncritical work that paradigms make possible.
Anomalies arise when the paradigm cannot accommodate a phenomenon despite sustained effort. Most anomalies are eventually resolved within the paradigm. But some persist. When enough persistent anomalies accumulate, the scientific community enters a state of crisis. Crisis is characterised by a loss of confidence in the paradigm, a proliferation of ad hoc modifications, and the emergence of alternative frameworks. The transition from crisis to revolution occurs when a competitor paradigm demonstrates that it can resolve the anomalies that defeated the old paradigm, and a sufficient portion of the community switches allegiance.
Incommensurability is Kuhn's most contested concept. Kuhn claims that proponents of competing paradigms practise their trades in different worlds, in the sense that their perceptual and conceptual categories are shaped by their paradigm. One of his examples: Western astronomers first recorded change in the supposedly immutable heavens in the half-century after Copernicus's proposal, although Chinese astronomers, whose cosmology did not preclude celestial change, had recorded new stars and sunspots much earlier. The phenomena had been there to see; what had been missing was a conceptual framework in which they could register. Kuhn extended this to observation language: he argued that there is no neutral, theory-independent observation language in which competing paradigms can be compared point by point. "Planet" in Ptolemaic astronomy includes the Sun and excludes the Earth, while in Copernican astronomy it is the reverse; "mass" is a conserved quantity in Newton but is convertible with energy in Einstein. The words are the same; what they mean, and in some cases what they pick out, differs.
Critics — particularly Popper and his followers — charged Kuhn with relativism: if paradigms are incommensurable, then there is no rational basis for preferring one to another, and scientific change reduces to mob psychology. Kuhn denied this charge in the postscript to the second edition (1970) and in later writings. He argued that there are shared values among scientists (accuracy, consistency, scope, simplicity, fruitfulness) that survive paradigm shifts and provide rational grounds for theory choice — but that these values do not determine a unique choice in every case. Two scientists sharing all five values can still disagree about which paradigm better satisfies them, because the values conflict (simplicity vs. scope, accuracy vs. fruitfulness) and their relative weights are matters of judgement.
Normal science and the puzzle-solving tradition
Kuhn's account of normal science as puzzle-solving has been both influential and misunderstood. A puzzle, in Kuhn's sense, is a problem that is guaranteed to have a solution within the paradigm. The paradigm supplies the rules and the assurance that an answer exists. The scientist's job is to find it. This is what makes normal science possible: without the guarantee that a solution exists, scientists would not invest years of effort in a problem. The guarantee comes from the paradigm, not from nature. If the paradigm is wrong, the puzzle may be ill-formed — but that realisation arrives only during a crisis, not during normal science.
The puzzle-solving character of normal science explains why scientists are resistant to anomalies. An anomaly is not just a failed prediction; it is a violation of the guarantee that the paradigm provides. The default response is not "the paradigm is wrong" but "I have not solved the puzzle correctly." This is rational within the framework of normal science: the paradigm has succeeded thousands of times before, and one failure is more likely to be the scientist's error than the paradigm's. Only when failures accumulate across many researchers and many techniques does the community begin to doubt the paradigm itself.
Paradigm shifts and Gestalt psychology
Kuhn explicitly modelled paradigm shifts on Gestalt perceptual switches — the duck-rabbit image, where the same drawing is seen first as a duck, then as a rabbit, with no intermediate state. The analogy is deliberate: Kuhn wanted to convey that a paradigm shift is not a gradual accumulation of evidence but a sudden reorganisation of perception. After a revolution, scientists "see the world differently" — not because the world has changed, but because their conceptual framework has.
The Gestalt analogy has been criticised as misleading. Perceptual switches are instantaneous and involuntary; paradigm shifts are gradual and require intellectual effort. Kuhn acknowledged the disanalogy but maintained that the experience of conversion to a new paradigm shares with Gestalt switches the property that the before-state and the after-state are not simultaneously accessible. You cannot see the duck and the rabbit at the same time; you cannot inhabit two paradigms simultaneously. This is what makes paradigm shifts disorienting and why they are often completed only when the older generation of scientists dies or retires. Kuhn quoted Planck: "a new scientific truth does not triumph by convincing its opponents and making them see the light, but rather because its opponents eventually die, and a new generation grows up that is familiar with it."
Lakatos and the methodology of research programmes Master
Imre Lakatos, a colleague of Popper at the London School of Economics, sought to preserve Popper's rationalism while accommodating Kuhn's historical insights. His framework, presented in the 1970 paper "Falsification and the Methodology of Scientific Research Programmes," replaces the individual theory with the research programme as the unit of scientific appraisal.
A research programme has three components. The hard core consists of the central assumptions that define the programme — assumptions that are, by methodological convention, immune from refutation. The protective belt consists of auxiliary hypotheses, initial conditions, and background assumptions that can be modified in response to counterevidence. The heuristic is a set of rules for how the programme should be developed: the positive heuristic tells the scientist what to do (what lines of research to pursue, what modifications to make), while the negative heuristic tells the scientist what not to do (do not modify the hard core).
The key distinction is between progressive and degenerating problemshifts. When the protective belt is modified in response to an anomaly, the modification is progressive if it leads to the prediction of novel facts — facts that were not known when the modification was made and that are independently confirmed. A degenerating problemshift is one in which the modification merely accommodates known anomalies without predicting anything new.
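The distinction can be sketched as a small classification rule. The data structure and function below are hypothetical illustrations, not Lakatos's own formalism: a modification to the protective belt is recorded with the novel predictions it made and the subset later confirmed, and it counts as progressive only if at least one novel prediction was confirmed. The two sample records use the Neptune and Vulcan episodes described in the Intermediate section.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Problemshift:
    """One modification to a programme's protective belt (toy model)."""
    description: str
    anomaly_accommodated: str
    novel_predictions: List[str] = field(default_factory=list)
    confirmed_predictions: List[str] = field(default_factory=list)


def classify(shift: Problemshift) -> str:
    """Apply the two-way rule stated in the text above."""
    confirmed_novel = set(shift.novel_predictions) & set(shift.confirmed_predictions)
    return "progressive" if confirmed_novel else "degenerating"


neptune = Problemshift(
    description="postulate an unseen outer planet",
    anomaly_accommodated="irregularities in the orbit of Uranus",
    novel_predictions=["planet at the computed position"],
    confirmed_predictions=["planet at the computed position"],
)
vulcan = Problemshift(
    description="postulate an unseen inner planet",
    anomaly_accommodated="Mercury's perihelion advance",
    novel_predictions=["planet inside Mercury's orbit"],
    confirmed_predictions=[],
)

print(classify(neptune), classify(vulcan))
```

The two-way label is deliberately coarse. It records only the eventual verdict on a single modification, whereas the appraisal described in the text concerns the trajectory of a whole programme over time.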
Lakatos illustrated the framework with the competition between the Newtonian and Cartesian research programmes in celestial mechanics. The Newtonian programme was progressive for over two centuries: its most celebrated modification, postulating an unseen planet to explain perturbations in the orbit of Uranus, predicted a novel celestial body (Neptune) that was subsequently observed. The Cartesian vortex theory, by contrast, was degenerating: its modifications were ad hoc, designed to accommodate known planetary motions without predicting new phenomena.
The framework has several virtues. It captures the insight that theories should not be judged in isolation but as part of ongoing research traditions. It explains why scientists rationally persist with a theory despite anomalies (the programme may still be progressive overall). It provides a criterion for when persistence becomes irrational (when the programme has been degenerating for an extended period with no prospect of revival). And it avoids Kuhn's relativism by maintaining that the distinction between progressive and degenerating programmes is objective, not paradigm-relative.
The framework also has limitations. The distinction between the hard core and the protective belt is a methodological convention, not a logical distinction — and conventions can change. What counts as a "novel fact" is sometimes contested: a prediction that was made before the relevant evidence was gathered, but which was based on evidence that was already available in some form, may or may not count as genuinely novel. And the framework is better suited to retrospective analysis than to prospective guidance — Lakatos himself admitted that it may take decades to determine whether a programme is progressive or degenerating.
Lakatos versus Kuhn: rational reconstruction versus historical description
The Lakatos-Kuhn debate is one of the central episodes in twentieth-century philosophy of science. Kuhn's position is primarily descriptive: he aimed to describe how science actually works, drawing on historical case studies. Lakatos's position is primarily normative: he aimed to prescribe how science should work, using historical examples as tests of his prescriptions. The tension between these two projects — describing science as it is versus prescribing how it ought to be — runs through the entire demarcation debate and remains unresolved.
Lakatos accused Kuhn of reducing theory choice to "mob psychology." Kuhn responded that Lakatos's methodology is either empty (if the hard core can be redefined retroactively) or question-begging (if it presupposes what it is supposed to explain). The exchange was heated and never fully resolved; Lakatos died in 1974, aged 51.
Feyerabend and epistemological anarchism Master
Paul Feyerabend, who studied under Popper in the early 1950s, moved in a direction diametrically opposite to Lakatos. Where Lakatos tried to rescue rationalism from Kuhn's historicism, Feyerabend embraced the historicist conclusion and drew out its most radical implications.
Feyerabend's central argument in Against Method (1975) proceeds by historical case study. He examines Galileo's defence of heliocentrism and argues that Galileo violated every methodological principle that philosophers of science have proposed. Galileo argued for the truth of Copernican astronomy at a time when the empirical evidence favoured the Ptolemaic system. He used rhetorical tricks, thought experiments that could not be performed, and arguments that were (by the standards of the time) misleading. Yet Galileo was right, and his "unscientific" methods produced a scientific revolution.
From this and similar cases, Feyerabend drew a general conclusion: there is no single scientific method. The history of science shows that the only principle that does not impede scientific progress is the principle that anything goes. Scientists should be free to use whatever methods, arguments, and approaches produce results — including approaches that philosophers classify as unscientific.
Feyerabend's target was not science but the ideology of a single scientific method. He argued that the insistence on a universal method serves to suppress alternative approaches (traditional knowledge systems, alternative medicine, non-Western science) and to concentrate epistemic authority in the hands of a professional elite. His pluralism was political as well as epistemological.
The standard objection to Feyerabend is that his conclusion does not follow from his premises. The fact that scientists sometimes violate methodological rules and still succeed does not show that the rules are worthless — it shows that the rules are not exceptionless. A rule that is generally useful but occasionally violated is different from no rule at all. Feyerabend's response was that the exceptions are not occasional but pervasive: the history of science is a continual sequence of violations, and what counts as a "methodological rule" in one era is a violation in another. The rules themselves are historically contingent.
A more sympathetic reading treats Feyerabend not as denying that there are better and worse ways of investigating nature, but as insisting that the criteria for "better" and "worse" are themselves open to revision in light of scientific practice. On this reading, epistemological anarchism is not nihilism about method but a demand that methodology remain responsive to the actual practice of science rather than legislating it in advance.
The Duhem-Quine thesis and underdetermination Master
The Duhem-Quine thesis has two components, corresponding to the two thinkers who developed it. Duhem's thesis (1906) is that in physics, hypotheses are tested only as networks, never in isolation. Quine's thesis (1951) is stronger: the entire web of belief faces experience as a whole, and any belief can be held true come what may, provided sufficient adjustments are made elsewhere in the system. Quine's holism is global; Duhem's is local to physical theories.
The distinction matters. Duhem's point is that the logic of hypothesis testing involves bundles of claims, and modus tollens tells you only that something in the bundle is wrong. This is a constraint on the falsificationist programme: it shows that falsification is not the simple logical operation that Popper's initial formulation requires. Quine's point is philosophical: it is about the relation between evidence and belief in general, not about the structure of physical theories specifically. Quine argued that the distinction between analytic truths (true by meaning) and synthetic truths (true by fact) is untenable, and that confirmation is holistic — our beliefs are connected not by a chain (where each link can be tested separately) but by a net (where any strand can be adjusted to accommodate recalcitrant experience).
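Duhem's point about bundles can be written out schematically. The block below is only a sketch of the logical form; the symbols $H$ (the hypothesis under test), $A_1, \dots, A_n$ (auxiliary assumptions) and $O$ (the predicted observation) are notation introduced here for illustration, not taken from Duhem or Popper.

```latex
\begin{align*}
  &(H \land A_1 \land \dots \land A_n) \rightarrow O
     && \text{the bundle entails the prediction} \\
  &\neg O
     && \text{the prediction fails} \\
  &\therefore\ \neg (H \land A_1 \land \dots \land A_n)
     && \text{modus tollens} \\
  &\equiv\ \neg H \lor \neg A_1 \lor \dots \lor \neg A_n
     && \text{De Morgan: at least one conjunct is false}
\end{align*}
```

The conclusion is a disjunction, so logic alone does not say whether to reject $H$ or one of the auxiliaries; that choice has to be made on other grounds.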
Underdetermination follows from holism. If for any body of evidence $E$ there exist multiple theories $T_1, T_2, \ldots$ each consistent with $E$, then evidence alone cannot determine theory choice. In its transient form, underdetermination is unremarkable: it says that current evidence does not settle all theoretical questions, but future evidence may resolve the underdetermination. In its permanent or strong form, underdetermination says that there exist empirically equivalent theories (theories that agree on all possible observations but disagree on theoretical claims) and that no amount of evidence can choose between them.
The strong form is a logical point: given any finite set of observations, one can construct indefinitely many theories that fit the data. Whether the strong form is interesting depends on whether the constructed theories are genuine rivals (theoretically motivated, not just logical trickery) and whether they disagree on claims that matter. The standard example in contemporary philosophy of science is the underdetermination of quantum-mechanical interpretations: Bohmian mechanics, Everettian many-worlds, and GRW collapse theories all reproduce the statistical predictions of standard QM but disagree about what the world is like. The disagreement is not resolvable by any experiment that both sides accept, because the disagreement is about what exists, not about what is observed.
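The construction of indefinitely many data-fitting theories from a finite set of observations can be illustrated with a toy curve-fitting case. This is a minimal sketch under invented assumptions: the data points, the baseline function and the `make_rival` helper are all hypothetical, and the "theories" are just polynomials.

```python
from fractions import Fraction

# Hypothetical finite data set: (x, y) observations.
observations = [(0, 1), (1, 3), (2, 7)]

def baseline(x):
    """One 'theory' that fits the data: y = x^2 + x + 1."""
    return x * x + x + 1

def make_rival(c):
    """Build a rival that agrees with baseline at every observed x.

    The added term c * (x - x_1)(x - x_2)...(x - x_n) vanishes at each
    observed point, so the rival fits the data exactly for any c.
    """
    def rival(x):
        bump = Fraction(c)
        for xi, _ in observations:
            bump *= (x - xi)
        return baseline(x) + bump
    return rival

rivals = [make_rival(c) for c in (1, -2, Fraction(1, 2))]

# Every rival reproduces all the observations...
fits = [all(r(x) == y for x, y in observations) for r in rivals]

# ...but they disagree about a case not yet observed (x = 3).
predictions = [baseline(3)] + [r(3) for r in rivals]
```

Each choice of `c` yields a different function, so the family is unbounded. The rivals here are exactly the kind of logical trickery the paragraph above warns about: nothing motivates them except the fit, which is why the interesting question is whether genuine rivals exist.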
The underdetermination argument against realism
The anti-realist wields underdetermination as follows:
Premise U1. For any successful theory $T$, there exist empirically equivalent rivals $T'$ that disagree with $T$ on unobservable claims.
Premise U2. If evidence cannot distinguish between $T$ and $T'$, rational belief should be agnostic between them.
Conclusion. We should not believe the unobservable claims of $T$; we should accept $T$ only as empirically adequate (van Fraassen's position).
The realist responds in several ways. One line denies Premise U1 for mature theories: as a theory develops and integrates with neighbouring theories, the space of empirically equivalent rivals shrinks, and genuinely equivalent rivals become contrived or degenerating. Another line attacks Premise U2: the no-miracles argument says that the best explanation of $T$'s success is that its unobservable claims are approximately true, and inference to the best explanation licenses believing them. A third line, structural realism, accepts that the ontology may be underdetermined but insists that the mathematical structure is not: whatever rival theory matches the empirical data must share the relevant structure with $T$.
Realism vs anti-realism: the contemporary debate Master
The realism debate has become one of the most technically sophisticated areas of philosophy of science. The major positions can be mapped along two dimensions: what realism claims and what the anti-realist alternative is.
Selective realism (Kitcher, Psillos) concedes that the history of science contains cases where successful theories made false claims about unobservable entities (caloric, phlogiston, the luminiferous ether). But these failures are not random: they cluster around the idle parts of theories rather than the working parts. The working parts, the claims that actually contributed to the theory's predictive success, have generally been preserved across theory change. Selective realism claims that we are entitled to believe the working parts, even if the idle parts are revised away. This is an attempt to salvage the no-miracles argument while acknowledging the historical record of ontological revision.
Entity realism (Cartwright, Hacking) takes a different approach. We are entitled to believe in unobservable entities not because the theories that postulate them are approximately true, but because we can manipulate those entities in experiments. Hacking's example: if you can spray electrons onto a target to alter its charge, and so use them as tools to investigate other phenomena, then electrons are real, regardless of whether the theory that describes them is true. The criterion is intervention, not representation. The objection is that the criterion pulls in two directions: it is too permissive, since it licenses belief in entities that are later eliminated (phlogiston was experimentally manipulated), and too restrictive, since it does not extend to entities that are genuinely unobservable in principle (quarks, which cannot be isolated).
Constructive empiricism (van Fraassen) is the most developed anti-realist position. Van Fraassen accepts that theories should be read literally, as making claims that are true or false, but argues that science aims at empirical adequacy rather than truth and that acceptance of a theory involves only the belief that it is empirically adequate: that what it says about observables is correct. The theoretical superstructure (claims about unobservables) is accepted as useful but not believed. The distinction between observable and unobservable is epistemically significant: we have direct epistemic access to observables but only indirect access to unobservables, and the indirect access is mediated by the very theories whose truth is in question.
The main objection to constructive empiricism concerns the observable/unobservable distinction itself. Is it sharp? Van Fraassen concedes that the boundary is vague but holds that the distinction is still usable, because there are clear cases on either side: a thing is observable if it can be detected by unaided human senses. But this makes observability contingent on human physiology in a way that seems arbitrary. A being with different sensory apparatus could observe things we cannot; does that make our "unobservable" observable? And the boundary shifts with technology: the moons of Jupiter were unobservable before the telescope, observable after. Van Fraassen's response is that observability is a contingent, empirical matter, but this makes the distinction itself a scientific question, which complicates its role as a philosophical criterion.
Van Fraassen's voluntarism and the pragmatic stance
Van Fraassen later developed a broader epistemological position he calls voluntarism: the view that epistemic rules do not determine a unique rational response to any given body of evidence. Rational agents may permissibly differ in their beliefs even given the same evidence, because the principles of rationality (consistent probability assignments, avoidance of Dutch books, updating by conditionalisation) underdetermine priors and leave room for legitimate epistemic variation. Applied to scientific realism, voluntarism implies that both realist and anti-realist stances are rationally permissible — the evidence does not compel either. This is a more nuanced position than simple anti-realism, but it raises the question of what role philosophy of science plays if its central debates are ultimately matters of permissible preference rather than correctness.
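The claim that coherence and conditionalisation underdetermine priors can be made concrete with a two-agent toy case. This is a sketch under stated assumptions: the hypothesis, the likelihoods and the two priors are invented numbers, and a single application of Bayes' rule stands in for updating by conditionalisation.

```python
from fractions import Fraction

def conditionalise(prior_h, likelihood_e_given_h, likelihood_e_given_not_h):
    """Bayes' rule for a single hypothesis H and a piece of evidence E."""
    joint_h = prior_h * likelihood_e_given_h
    joint_not_h = (1 - prior_h) * likelihood_e_given_not_h
    return joint_h / (joint_h + joint_not_h)

# Both agents agree on how strongly E bears on H (shared likelihoods).
p_e_given_h = Fraction(4, 5)
p_e_given_not_h = Fraction(2, 5)

# They differ only in their priors, which coherence does not fix.
prior_a = Fraction(1, 2)
prior_b = Fraction(1, 10)

posterior_a = conditionalise(prior_a, p_e_given_h, p_e_given_not_h)  # 2/3
posterior_b = conditionalise(prior_b, p_e_given_h, p_e_given_not_h)  # 2/11
```

Both agents have probabilistically coherent credences and both update the same way on the same evidence, yet one ends up thinking H more likely than not and the other does not. Nothing in the updating rule itself says which prior was the right one to start from.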
Values in science Master
The question of values in science intersects the demarcation problem at a specific point: if scientific reasoning is supposed to be objective and value-free, what role do ethical, social, and cognitive values play in actual scientific practice?
The value-free ideal holds that scientific conclusions should be determined solely by evidence and logic, not by moral, political, or personal values. The ideal has been defended on the grounds that values introduce bias, that scientific authority depends on the perception of impartiality, and that the role of science is to describe the world, not to prescribe what should be done with the descriptions.
Heather Douglas (2009) argued that the value-free ideal is both unattainable and undesirable. It is unattainable because scientists must make judgements under uncertainty — choosing which hypotheses to test, how to design experiments, how to interpret ambiguous data, how to set statistical significance thresholds — and these judgements are influenced by values whether or not the scientist acknowledges them. It is undesirable because the suppression of value-relevant reasoning can lead to worse outcomes: a toxicologist who sets a high bar for evidence of harm (to avoid false positives) is implicitly valuing the avoidance of unnecessary regulation over the protection of public health, and this value judgement should be made explicitly rather than hidden behind methodological conventions.
Douglas distinguishes epistemic values (simplicity, scope, accuracy, fruitfulness, consistency) from non-epistemic values (ethical, social, political). Epistemic values are internal to the scientific enterprise: they concern what makes a theory good as a representation of the world. Non-epistemic values concern what makes a theory good for purposes beyond representation — its social consequences, its ethical implications, its political ramifications. Douglas argues that non-epistemic values have a legitimate role in scientific reasoning, specifically in the assessment of uncertainty: when the evidence is ambiguous, non-epistemic values can permissibly influence how uncertainty is resolved. They should not, however, determine the content of scientific claims directly — a scientist should not adjust data to fit a desired conclusion, even if the conclusion serves a good cause.
The argument from inductive risk is the classic formulation. When a scientific claim has practical consequences (a drug is safe or unsafe, a chemical is carcinogenic or not), the cost of a false positive (approving an unsafe drug) and the cost of a false negative (rejecting a safe drug) are asymmetric. The choice of how much evidence to require before accepting a claim reflects a value judgement about which kind of error is worse. This value judgement is not eliminable by gathering more data; it is inherent in the logic of decision-making under uncertainty.
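The structure of the inductive-risk argument can be sketched as a small expected-cost calculation. The framing below is an assumption made for illustration, not Douglas's own formalism: the hypothesis H is "the drug is safe", correct decisions are taken to cost nothing, and the cost figures and function names are hypothetical.

```python
from fractions import Fraction

def acceptance_threshold(cost_false_accept, cost_false_reject):
    """Probability of H above which accepting H has lower expected cost.

    Accepting H when it is false costs cost_false_accept; rejecting H
    when it is true costs cost_false_reject. With p the probability of H,
    accepting is cheaper iff
        (1 - p) * cost_false_accept < p * cost_false_reject,
    i.e. iff p > cost_false_accept / (cost_false_accept + cost_false_reject).
    """
    return Fraction(cost_false_accept, cost_false_accept + cost_false_reject)

def decide(p_h, cost_false_accept, cost_false_reject):
    """Return 'accept' or 'reject' for H given its probability and the costs."""
    if p_h > acceptance_threshold(cost_false_accept, cost_false_reject):
        return "accept"
    return "reject"

# Same evidence in both cases: probability 9/10 that the drug is safe.
p_safe = Fraction(9, 10)

# Errors weighted equally: threshold 1/2.
symmetric = decide(p_safe, 1, 1)

# Approving an unsafe drug judged 19 times worse: threshold 19/20.
cautious = decide(p_safe, 19, 1)
```

The evidence is identical in the two cases; only the relative weight given to the two kinds of error changes, and with it the verdict. Gathering more data would move `p_safe`, but it would not say where the threshold should sit.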
Helen Longino (1990) approached the question from a social epistemology perspective. She argued that objectivity in science is not the property of individual scientists but of scientific communities. A community achieves objectivity when it includes diverse perspectives, when dissent is institutionally protected, when criticism is public and responsive, and when the standards of evaluation are publicly shared. On this view, values are not contaminants to be excluded but resources to be distributed across a community, with the expectation that competing value commitments will generate constructive criticism.
Feminist philosophy of science and situated knowledge
Feminist philosophers of science (Harding, Haraway, Longino, Wylie) have contributed some of the most careful work on values in science. Sandra Harding's strong objectivity thesis argues that the value-free ideal produces less objective science, because it renders invisible the social interests and assumptions that shape research agendas, methodological choices, and interpretive frameworks. Strong objectivity requires making those interests and assumptions explicit and subjecting them to critical scrutiny — a more demanding standard than the value-free ideal, not a less demanding one.
Donna Haraway's concept of situated knowledges makes a related point from a different direction. Haraway rejects the "god trick" — the claim to a view from nowhere that sees everything from no particular location. All knowledge, she argues, is produced from a particular location, with particular resources and limitations. Acknowledging this situatedness does not undermine objectivity; it is a precondition for it. A knower who is aware of their own position can compensate for its limitations in ways that a knower who believes themselves to be positionless cannot.
Alison Wylie's work in philosophy of archaeology provides a concrete case study. Archaeological evidence is always partial and ambiguous; interpretation requires background assumptions about past societies. Wylie showed that feminist archaeologists, by bringing different background assumptions to the same evidence, generated new and productive research programmes — not because feminist values determined the conclusions, but because they opened up questions that had been invisible within the dominant (androcentric) framework. The case supports Longino's claim that diversity of values, properly organised, enhances rather than undermines scientific objectivity.
Connections Master
Epistemology: knowledge, justification, and truth [20.01.01] (pending) connects directly: the demarcation problem is an application of epistemology to the specific case of scientific knowledge. What counts as justification for a scientific claim? How does scientific evidence relate to truth? The epistemology unit's treatment of foundationalism, coherentism, and the Gettier problem provides the background for the phil-of-science analysis of confirmation and falsification.
Phil-of-physics: the measurement problem [20.03.01] is a paradigmatic case study for several demarcation-related issues: underdetermination of interpretation by evidence (multiple QM interpretations produce identical predictions), the role of auxiliary assumptions in theory choice, and the realism debate (what does the wave function describe?). The phil-of-science unit provides the general framework; the phil-of-physics unit provides the concrete application.
Phil-of-biology [20.05.NN] (pending) connects via the demarcation debates specific to biology: is evolutionary biology "real science" by Popper's criterion? (Popper initially called it a "metaphysical research programme" and later retracted.) How does the historical character of biological knowledge (retrodiction vs. prediction) affect its scientific status?
Phil-of-math [20.09.NN] (pending) connects via the question of whether mathematics is a science. If demarcation requires empirical falsifiability, mathematics is excluded; if it requires rigorous proof and progressive knowledge growth, mathematics is the paradigm case. The tension illuminates the limits of any demarcation criterion rooted in empirical methods.
Historical & philosophical context Master
The demarcation problem has roots in Aristotle's distinction between episteme (demonstrative knowledge) and techne (practical craft), but its modern form emerges in the early twentieth century with the Vienna Circle. The Circle — Schlick, Carnap, Neurath, Feigl, and others — convened around the programme of logical positivism: the application of modern logic and empirical science to the renovation of philosophy. Their central doctrine was the verification principle, and their central target was metaphysics: the tradition from Plato through Hegel that made claims about reality that could not be tested by observation.
The verification principle underwent multiple revisions. The strongest form — a statement is meaningful iff it is analytically true or verifiable by a finite set of observations — excluded universal scientific laws. Weakened forms (confirmability, testability-in-principle) were proposed but each encountered counterexamples or proved too weak to exclude the intended targets. The programme was further undermined by Quine's "Two Dogmas of Empiricism" (1951), which attacked the analytic-synthetic distinction that the positivists relied on to separate logical truths from empirical ones, and by the growing recognition that the observation/theory distinction was not sharp.
Popper's Logik der Forschung (1935; English edition The Logic of Scientific Discovery, 1959) was written in Vienna during the same period but took a different approach. Popper was not a positivist; he rejected the verification principle and the attempt to eliminate metaphysics as meaningless. His target was what he called the problem of demarcation: distinguishing science from non-science, not meaningful from meaningless. Falsificationism was his solution: a theory is scientific if it makes predictions that could be refuted by observation. Popper applied the criterion to Marxism, psychoanalysis, and astrology, which he classified as pseudoscientific because they could accommodate any observation after the fact.
The Popper-Kuhn encounter at the 1965 London conference on criticism and the growth of knowledge was a watershed. Kuhn's Structure had been published three years earlier; Popper and his followers (Lakatos, Watkins, Musgrave) confronted Kuhn with the charge of relativism. The proceedings, published as Criticism and the Growth of Knowledge (Lakatos and Musgrave, eds., 1970), contain the most concentrated exchange on demarcation in the literature. Lakatos's research programme methodology was presented there for the first time as an attempt to reconcile Popper and Kuhn.
Feyerabend's Against Method (1975) was the rupture. Originally planned as a collaborative work with Lakatos (who was to write a counter-argument), it became a solo polemic after Lakatos's death. The book's historical case studies — Galileo above all — were marshalled against every methodological rule in the Popper-Lakatos tradition. Feyerabend's later work extended the argument to a general critique of the authority of science in society, arguing for the separation of science and state on the model of the separation of church and state.
The contemporary landscape (2000s-2020s) has moved beyond the stark oppositions of the earlier debate. The demarcation problem is widely regarded as having no clean solution — no single criterion separates science from non-science in all cases. Instead, philosophers have pursued cluster approaches (a set of features — falsifiability, empirical adequacy, progressive research, institutional mechanisms for criticism, etc. — that tend to co-occur in paradigmatic science and tend to be absent in paradigmatic pseudoscience, without any single feature being necessary and sufficient). The cluster approach draws on Wittgenstein's concept of family resemblance: there is no single property shared by all games, but games are connected by overlapping similarities. Similarly, there may be no single property shared by all sciences, but the sciences are connected by overlapping methodological and institutional features.
The values-in-science debate has become a major research area in its own right, intersecting with feminist epistemology, social studies of science, and the philosophy of scientific practice. The consensus among working philosophers of science is that the value-free ideal is untenable in its strongest form; the active research question is how to characterise the legitimate role of values without collapsing into relativism or arbitrariness.
Bibliography Master
Foundational:
- Duhem, P. — The Aim and Structure of Physical Theory (1906; English trans. Wiener, Princeton University Press, 1954).
- Popper, K. R. — Logik der Forschung (Springer, 1935); English trans. The Logic of Scientific Discovery (Hutchinson, 1959).
- Quine, W. V. O. — "Two dogmas of empiricism", Philosophical Review 60, 20-43 (1951).
- Kuhn, T. S. — The Structure of Scientific Revolutions (University of Chicago Press, 1962; 2nd ed. with postscript, 1970).
- Lakatos, I. — "Falsification and the methodology of scientific research programmes", in Lakatos & Musgrave (eds.), Criticism and the Growth of Knowledge (Cambridge University Press, 1970), pp. 91-196.
- Feyerabend, P. K. — Against Method: Outline of an Anarchistic Theory of Knowledge (Humanities Press, 1975).
Contemporary canonical:
- van Fraassen, B. C. — The Scientific Image (Clarendon Press, 1980).
- Laudan, L. — "The demise of the demarcation problem", in Ruse (ed.), But Is It Science? (Prometheus Books, 1988), pp. 337-350.
- Kitcher, P. — The Advancement of Science: Science without Legend, Objectivity without Illusions (Oxford University Press, 1993).
- Psillos, S. — Scientific Realism: How Science Tracks Truth (Routledge, 1999).
- Chalmers, A. F. — What Is This Thing Called Science? 4th ed. (Hackett, 2013).
- Ladyman, J. — Understanding Philosophy of Science (Routledge, 2002).
Values in science:
- Longino, H. E. — Science as Social Knowledge: Values and Objectivity in Scientific Inquiry (Princeton University Press, 1990).
- Douglas, H. E. — Science, Policy, and the Value-Free Ideal (University of Pittsburgh Press, 2009).
- Harding, S. — Whose Science? Whose Knowledge? Thinking from Women's Lives (Cornell University Press, 1991).
- Haraway, D. J. — "Situated knowledges: the science question in feminism and the privilege of partial perspective", Feminist Studies 14, 575-599 (1988).
- Wylie, A. — "The constitution of archaeological evidence: gender politics and science", in Galison & Stump (eds.), The Disunity of Science (Stanford University Press, 1996), pp. 311-343.
Structural realism and the realism debate:
- Worrall, J. — "Structural realism: the best of both worlds?", Dialectica 43, 99-124 (1989).
- Ladyman, J. & Ross, D. (with Spurrett & Collier) — Every Thing Must Go: Metaphysics Naturalized (Oxford University Press, 2007).
- Wray, K. B. — Kuhn's Evolutionary Social Epistemology (Cambridge University Press, 2011).