The Internal Revenue Code is 2.4 million words long because English grammar refuses to be precise. Every definition section is a prosthesis for a missing grammatical feature.
| Year | Word Count (Code + Regulations) | IRC Sections | Cross-References |
|---|---|---|---|
| 1913 | ~5,000 | — | — |
| 1954 | ~1,400,000 | 103 | — |
| 1991 | — | — | <50,000 |
| 2005 | — | 736 | — |
| 2012 | — | — | ~70,000 |
| 2024 | ~10,000,000 | 800+ | — |
The shape of the curve tells the story. This is not linear growth driven by new policy. It is accelerating growth driven by definitional patches: each new rule introduces terms that require further definition, which introduces further terms.
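As a rough check on the growth described above, the two intervals for which the table gives word counts can be compared directly. The sketch below uses only the table's approximate figures for 1913, 1954, and 2024. The function name `interval_growth` is illustrative, and three data points cannot establish the shape of a curve, so this shows only what the table itself implies.

```python
# Approximate word counts (Code + Regulations) taken from the table above.
WORD_COUNTS = {1913: 5_000, 1954: 1_400_000, 2024: 10_000_000}


def interval_growth(counts: dict[int, int]) -> list[tuple[int, int, float, float]]:
    """Return (start, end, avg words added per year, compound annual rate) per interval."""
    years = sorted(counts)
    rows = []
    for start, end in zip(years, years[1:]):
        span = end - start
        words_per_year = (counts[end] - counts[start]) / span
        compound_rate = (counts[end] / counts[start]) ** (1 / span) - 1
        rows.append((start, end, words_per_year, compound_rate))
    return rows


for start, end, words_per_year, compound_rate in interval_growth(WORD_COUNTS):
    print(f"{start}-{end}: {words_per_year:,.0f} words/year, {compound_rate:.1%} compound")
```

On these figures the average number of words added per year is larger in the second interval than in the first, while the compound annual rate is smaller.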
The core argument: the IRC’s complexity is substantially driven by the need to define terms that English grammar leaves ambiguous. The code is not primarily a description of tax policy. It is a definitional apparatus — a machine for pinning down meaning that the grammar refuses to fix.
Section 7701 contains 50+ definitions for general use across the entire Internal Revenue Code. It is the grammar patch layer — the place where English words are forcibly assigned precise meanings.
“Person” — includes corporations, partnerships, trusts, estates, associations, and companies. Not just humans. The word “person” in the IRC does not mean what it means in English.
“Taxpayer” — means “any person subject to any internal revenue tax.” Not self-contained: it is defined using “person,” which is itself redefined.
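“Employee” — “includes an officer of a corporation.” The word “includes” here is itself ambiguous in English: is the list exhaustive or illustrative? The code has to settle even this by definition: §7701(c) provides that “includes” does not exclude other things otherwise within the meaning of the term defined.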
“Trade or business” — famously, §7701 does not define this term. It is the most fundamental term in tax law, used 339+ times in the code, and it has no statutory definition. The absence has generated thousands of court cases over a century. The fact that the most important term in the IRC is undefined is the perfect illustration: English lets you use a phrase without specifying its boundary, and sometimes even the 10-million-word patch cannot close the gap.
The phrase “for purposes of this section” appears hundreds of times in the IRC. Each occurrence creates a local definition that overrides the general meaning — a scoping mechanism that English grammar does not natively provide. It is the tax code’s version of a namespace.
| Section | Term | General Meaning | Local Definition | Why Different |
|---|---|---|---|---|
| §61 | gross income | Everything you receive | “all income from whatever source derived” | Broader than common usage |
| §162 | ordinary and necessary | Common and required | Not defined — courts decide case-by-case | English can’t specify |
| §401(k) | qualified plan | A good plan | Specific 18-requirement checklist | Precision via enumeration |
| §1031 | like-kind | Similar things | Real property for real property (post-TCJA) | Not common meaning |
| §529 | qualified | Meeting standards | Specific education savings account requirements | Different “qualified” |
| §280A | home office | Where you work at home | “Regular and exclusive use as principal place of business” | Much narrower |
| §132 | de minimis | Trivially small | “so small as to make accounting for it unreasonable” | Still vague |
| §501(c)(3) | organized and operated | Structured and running | “exclusively for religious, charitable, scientific…” | “exclusively” means “primarily” per case law |
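The scoping behavior described above resembles nested lookup tables in a programming language. The following Python sketch is an analogy only, not a model of how the IRC is actually organized: the `GENERAL` and `LOCAL` dictionaries, the `lookup` helper, and the paraphrased definition strings are all illustrative assumptions built from the examples in the table.

```python
from collections import ChainMap
from typing import Optional

# General definitions that apply unless a section overrides them (paraphrased, illustrative).
GENERAL = {
    "person": "includes corporations, partnerships, trusts, estates, associations, companies",
}

# Section-local definitions, standing in for "for purposes of this section" (illustrative).
LOCAL = {
    "§529": {"qualified": "meets the education savings account requirements"},
    "§401(k)": {"qualified": "meets the retirement plan requirements"},
}


def lookup(term: str, section: str) -> Optional[str]:
    """Resolve a term the way a nested scope would: local definition first, then general."""
    scope = ChainMap(LOCAL.get(section, {}), GENERAL)
    return scope.get(term)  # None models a term with no statutory definition


print(lookup("qualified", "§529"))
print(lookup("qualified", "§401(k)"))
print(lookup("person", "§529"))
print(lookup("trade or business", "§162"))
```

The same word resolves differently depending on the section, and a term defined in neither scope comes back empty, which is the situation the text describes for “trade or business.”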
The same English word gets redefined in different sections of the IRC:
“Qualified” means different things in §401 (retirement plans), §529 (education savings), §1031 (property exchanges), §42 (low-income housing), and §179 (depreciation). Five different meanings for one English word, each requiring its own definitional apparatus.
“Person” in §7701 includes corporations. In common English it means a human being. The gap between these two meanings is the entire body of entity taxation law.
“Income” in §61 is broader than economic income, narrower than accounting income, and different from cash flow. Three disciplines, three meanings, one English word.
“Employee” vs “independent contractor” — the IRC’s definition doesn’t match the common law test, which doesn’t match the economic reality test. Three legal frameworks, still arguing about what one English word means.
This is exactly the clusivity problem from the unsayable analysis: English “we” doesn’t distinguish inclusive/exclusive. English “qualified” doesn’t distinguish which qualifications. The tax code patches this with local definitions, but each patch creates new ambiguity at its boundary.
Each grammatical gap identified in the enemies analysis has a direct counterpart in the tax code. The table below maps the unsayable dimensions to the IRC patches that compensate for them.
| English Gap (from enemies/) | Tax Code Patch | Cost of the Patch |
|---|---|---|
| No evidentiality marking | Documentation requirements, substantiation rules, record-keeping mandates | IRS form burden: 7.9 billion hours/year |
| No clusivity (“we” is ambiguous) | Entity classification rules (§7701), check-the-box regulations | Entire body of entity law |
| No definitional scope marking | “For purposes of this section…” — hundreds of local definitions | Thousands of pages of definitions |
| No moral precision (“bad” is vague) | Penalty tiers: negligence (§6662), substantial understatement, fraud (§6663) | Each tier requires separate proof standard |
| Count noun bias (ideas as objects) | “Property” defined broadly to include intangible rights, creating IP taxation complexity | Transfer pricing rules (§482) |
| No grammatical precision on time | Taxable year, accounting periods, recognition timing, realization doctrine | Entire temporal framework of tax law |
Some of the IRC’s most important terms are defined circularly or not at all. These are the places where the grammar patch fails — where English’s ambiguity is so deep that even millions of words of definition cannot close the gap.
Start with “trade or business.” The Supreme Court in Commissioner v. Groetzinger (1987) admitted it couldn’t supply a general definition either. The most fundamental term in tax law has no statutory definition because English cannot draw the boundary between an activity that is a “trade or business” and one that is not.
What is reasonable? Courts use multi-factor tests because English cannot specify. The word “reasonable” is itself a placeholder for a judgment that the grammar refuses to encode.
On “ordinary and necessary,” Welch v. Helvering (1933) could offer only this: “life in all its fullness must supply the answer.” Justice Cardozo conceded that the phrase has no stable meaning; it is English at its most strategically vague.
“Fair market value” is defined in regulations as “the price at which the property would change hands between a willing buyer and a willing seller, neither being under any compulsion to buy or to sell and both having reasonable knowledge of relevant facts.” That is 35 words to define a 3-word phrase, and it is still litigated constantly.
“Substantially all” is never defined in the statute. IRS ruling guidelines suggest 90% of net assets and 70% of gross assets. But “substantially” is inherently vague in English: it resists quantification because the language does not grammatically distinguish degrees of completeness.
How do other legal traditions handle the same problem? The answer depends, in part, on the grammatical resources of the underlying language.
German’s compound noun system allows precise term creation. “Einkommensteuergesetz” is one word: income-tax-law. The language’s grammatical case system (nominative, accusative, dative, genitive) reduces referential ambiguity — you know who is doing what to whom from the noun endings, not from word order. German tax law is complex but more structurally precise. The grammar does some of the work that English offloads to definitions sections.
Civil law systems generally centralize definitions in codes rather than scattering them across sections. The “for purposes of this section” pattern is less common because the code structure is more systematic. Definitions are declared once and imported by reference — closer to how a programming language handles scope than how English handles context.
Chinese has a topic-prominent grammar and a classifier system, which create different ambiguities. “Income” (所得, suǒdé, “that which is obtained”) is more process-oriented than English’s nominalization. The word itself encodes a directional metaphor: income is something that arrives at you, not something you possess. The conceptual framework is different before any law is written.
The growth of the tax code can be modeled as an entropy reduction process.
The tax code cannot stop growing because it is trapped in a positive feedback loop. Each step in the cycle produces the input for the next.
This is the same structure as the extinction curve in the languages page, but inverted: instead of languages dying, definitions are being born. The tax code is the fastest-growing “language” in human history: by the table’s figures, roughly 123,000 words per year on average over the seven decades since 1954.
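The feedback loop can be illustrated with a toy model. This is a sketch under invented assumptions, not a measurement: `terms_per_rule` and `spawn_rate` (the average number of new undefined terms each definition introduces) are hypothetical parameters with no empirical basis in the text, and the model does not attempt the entropy framing.

```python
def definitions_needed(terms_per_rule: int, spawn_rate: float, rounds: int) -> float:
    """Total definitions triggered by one new rule after `rounds` passes of the loop.

    Each pass defines every pending term; each definition then introduces
    `spawn_rate` further undefined terms on average (hypothetical parameter).
    """
    pending = float(terms_per_rule)
    total = 0.0
    for _ in range(rounds):
        total += pending          # define everything currently pending
        pending *= spawn_rate     # those definitions introduce new terms
    return total


# If each definition spawns fewer than one new term, the loop settles down.
print(definitions_needed(terms_per_rule=4, spawn_rate=0.5, rounds=50))
# If each definition spawns one new term or more, it never does.
print(definitions_needed(terms_per_rule=4, spawn_rate=1.0, rounds=50))
```

In this toy model the total stays bounded when `spawn_rate` is below 1 and otherwise keeps growing with the number of rounds. Whether real definitional growth behaves either way is an open empirical question.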
Three serious objections have been raised against the framing presented above:
1. The German Counterexample. German has a four-case system, compound noun formation, and gendered articles that reduce referential ambiguity. Yet German tax law (Einkommensteuergesetz + Abgabenordnung) is notoriously complex. If richer grammar produced simpler legal codes, German tax law should be substantially simpler. It is not.
Response: The hypothesis is not that grammar is the primary driver of tax complexity — policy complexity (lobbying, carve-outs, special interests) is. The refined claim: grammar is a measurable multiplier on complexity. German tax law may be complex, but the definition overhead ratio may still be lower than the IRC’s. This is empirically testable and has not been tested.
2. The Causal vs. Analogical Problem. The mappings between unsayable dimensions and IRC patches (evidentiality → substantiation, clusivity → entity classification) may be analogical rather than causal. Courts would need substantiation rules even in Quechua because the legal verification problem is independent of grammar.
Response: Correct. Substantiation addresses a legal problem (verification), not a grammatical problem (evidentiality marking). The connection is that Quechua grammar forces the speaker to mark their epistemic relationship to the claim, which is one layer of the same problem that substantiation rules address at a different layer. The mapping is structural, not causal — they solve the same information problem through different mechanisms.
3. Ambiguity vs. Vagueness. Most definitional work in the IRC addresses conceptual vagueness (where does “small business” end and “large business” begin?) rather than grammatical ambiguity (what does “they” refer to?). Vagueness is a property of concepts mapping to continuous reality, not a property of grammar. A tax code in Lojban would still need to define “substantially all.”
Response: This is the strongest objection. The refined hypothesis: English’s grammatical ambiguity contributes a marginal multiplier to the definitional overhead that is primarily driven by conceptual vagueness and policy specificity. The grammar-patch framing remains useful as an analytical lens, not as a causal explanation.
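The response to the first objection calls the “definition overhead ratio” empirically testable but does not say how it would be measured. One hypothetical way to operationalize it is sketched below: the share of words that fall in sentences containing a definitional marker. The marker list, the sentence splitter, the function name, and the sample text are all assumptions made for illustration; a real study would need a vetted marker list per language and a proper legal-text parser.

```python
import re

# Hypothetical markers of definitional language (an assumption, not a vetted list).
DEFINITION_MARKERS = re.compile(
    r"\b(?:the term|means|includes|for purposes of)\b", re.IGNORECASE
)


def definition_overhead_ratio(text: str) -> float:
    """Fraction of words sitting in sentences that contain a definitional marker."""
    # Naive split on whitespace that follows a period or semicolon.
    sentences = [s for s in re.split(r"(?<=[.;])\s+", text) if s.strip()]
    total_words = sum(len(s.split()) for s in sentences)
    if total_words == 0:
        return 0.0
    definitional_words = sum(
        len(s.split()) for s in sentences if DEFINITION_MARKERS.search(s)
    )
    return definitional_words / total_words


# Illustrative sample text: one definitional sentence, one operative sentence.
sample = (
    "The term taxpayer means any person subject to any internal revenue tax. "
    "There is hereby imposed a tax on taxable income."
)
print(f"{definition_overhead_ratio(sample):.2f}")
```

Running the same function over comparable portions of the IRC and a German tax statute would give two numbers to compare, although a keyword count of this kind cannot separate grammatical ambiguity from the conceptual vagueness raised in the third objection.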