In the last chapter, we saw how duality of patterning greatly increases the expressive power of language by combining a small number of meaningless elements (phonemes) into a very large number of meaningful elements (morphemes). These elements, in turn, can recombine to form a potentially extremely large number of further meaningful elements: complex words of various kinds.
Now we look at syntax: the structure of phrases and sentences. Here we’ll see how the rules of syntax allow us to form any of an infinite number of sentences. Syntax is at the core of the expressive power of language; it quite literally gives us freedom of expression, the ability to talk about just about everything and anything.
You might object that it’s quite impossible to produce an infinite number of sentences in practice, for reasons having to do, basically, with lack of time and energy. Here it is useful to introduce an important distinction, that between linguistic competence and linguistic performance. A person’s competence in their native language is their ability to control all the aspects of that language’s structure: the huge, complex, intricate array of things we’ve seen in the preceding chapters, plus the syntax and semantics we’ll look at in this chapter and the next. Performance is putting competence into action: actually speaking and understanding your language. Performance depends on competence but draws on more than just linguistic abilities; when talking and listening, other factors – long- and short-term memory, concentration and so on – all come into play. Competence refers to the ‘pure’ linguistic abilities. This implies, for example, that competence is distinct from actually speaking and listening; my linguistic competence as a native speaker of English remains in my mind when I’m asleep, or just keeping quiet. We’ll come back to this distinction in Chapter 9, when we look at psycholinguistics, but what’s relevant here is that our competence allows us to produce an infinite number of sentences. Nonetheless, performance limitations (lack of time, etc.) prevent us from ever actually doing so in practice.
There are two ways to see that there is a potentially infinite number of sentences in English. First, we asked in the last chapter how many words there are in English and suggested that, although people might be tempted to say there are a few hundred thousand (or however many there are in the Oxford English Dictionary or some similar authority), in fact the number is unlimited, as you can always add new ones, and we did. But it is reasonable to ask how many English words you have stored in your mental dictionary (or lexicon) somewhere in your long-term memory. Knowing the answer would be about as useful as knowing the exact number of hairs you have on your head – only harder to find out, as we have no idea where to look in the brain to find the mental lexicon, or how to look for it, while we could in principle sit there and count our hairs.
Now ask yourself how many English sentences you have stored in your head. There’s no answer. Imagine the longest sentence you can; you can always add something like ‘… but I don’t think so’ at the end. And you can add it again. And again. And again … ad infinitum. From a strictly linguistic, syntactic point of view, nothing stops you from beginning your attempted longest sentence right now and carrying on through to the heat death of the universe and beyond (mortality, the laws of physics and, above all, boredom will get in the way, but the sentence itself can just go on and on). There is no longest sentence (in any language), just as there is no largest number; however big a number you can imagine, you can always add one to it and get a larger number. Similarly, however long a sentence you imagine, you can always add ‘but I don’t think so’, or something comparable.
A related point, often made by the linguist Noam Chomsky (who is personally responsible for just about every idea in this chapter): most of the sentences you say and understand every day of your life are completely novel; you’ve never heard them before and you might very well never hear them again. You may be the first person in all of history to say a given sentence. This awesome observation holds true because of the unlimited power of syntax to make new sentences out of largely (but not always!) familiar words. This is what Chomsky called the creative aspect of language use (the word ‘creative’ here doesn’t necessarily have anything to do with literature or poetry, although of course it might: even the most ordinary sentences are creative in Chomsky’s sense because they are novel).
So the most important thing to understand as we look at the rather abstract rules and formalisms of syntax is this: syntax gives us our flexibility of mind by giving us maximum freedom of expression in speech and thought. Thanks to syntax, we can talk and think about all the things we do, and generate, store and transmit all our knowledge. In a nutshell: no syntax, no spaceships.
All of this means that there can’t possibly be a list of grammatical English sentences in your head or anywhere else. It’s impossible, as the list is infinite. In fact, as Steven Pinker has pointed out, even the finite combinatorial possibilities of syntax are so great that a list would be hopelessly long; he estimates that there are 6.4 trillion possible combinations of words making up a simple five-word sentence of English (The cat ate a mouse is one of them). So we clearly need another way of specifying what the possible sentences of a language can be.
Sentences come about because of the syntactic rules that generate them out of smaller elements like words and morphemes: the centrality of this idea gives this approach to syntax its usual name, ‘generative grammar’. So now let’s look at how this works. We’ll look at the three basic building blocks of syntax: categories, constituents and rules. All these go together to account for the nature of constituent structure, how words group into larger units of various kinds (known as phrases) and phrases into sentences.
Categories
Categories come in two main types: lexical categories and functional categories. Lexical categories include some of the familiar ones from school grammars, some of which we briefly saw in the previous chapter: nouns, verbs, adjectives, adverbs and prepositions. Syntacticians love abbreviations and are extremely unimaginative in their use of them, so we refer to these categories for short as N, V, Adj, Adv and P.
Lexical categories form open classes: you can always add new members to them (actually, this isn’t so easy in the case of prepositions, which is why these are sometimes classified as functional categories). We saw ways to invent new words out of existing ones in the last chapter, but here we’re going to be a bit more creative. Take a look at sentence (1):
The onx splooed a blatt blarg.
(1) is syntactically just like (2):
The cat ate a fat mouse.
Obviously, in (1) we have some novel lexical elements – onx, sploo, blatt and blarg – while in (2) we don’t; the other parts of the sentence are functional categories (the words the and a as well as the past-tense inflection -ed) and they’re the same in both (1) and (2), as you can see. I’ll say more about functional categories later.
The traditional definitions of lexical categories, which are really based on semantics (i.e. the kinds of things that words typically mean), are somewhat useful but don’t get us very far, especially with new words such as onx, etc. For example, the traditional definition of a noun is that it means a ‘person, place or thing’. In (1), onx is a noun, and it does seem plausible that an onx is a thing (although it’s not easy to say what kind of thing). A verb is traditionally defined as an action of some kind, and splooed might indeed be telling us what the onx is doing here. Adjectives refer to qualities, or kinds of things, and blatt might be telling us what particular kind of blarg is being splooed by the onx.
But it can’t be that we’re working out that onx is a noun, sploo a verb and blatt an adjective on the basis of their meaning because we don’t actually know what they mean. There must be other ways of picking out lexical categories. One is morphology. As we saw in the last chapter, nouns inflect for plural in English. We can guess that the plural of onx is onxes, just as the plural of cat is cats (note the phonologically conditioned allomorphy here). Verbs inflect for past tense: splooed is the (regular) past tense of sploo just as ate is the (irregular) past tense of eat. Some adjectives in English inflect for their comparative (more than something) and superlative (the most of something) forms, as in fat, fatter (comparative) and fattest (superlative). So we could say (if we knew what we were talking about): This blarg is blatter than that one. So, the main lexical categories can be identified by their inflections (in languages with more inflections than English, this is much easier to do).
But the main thing that helps us to spot categories is syntax. This is how we can be reasonably confident about the categories of onx, sploo and so on in (1) (and in fact we partly guess their possible meanings on this basis; compare The sploo onxed a blarg blatt with (1)).
For example, nouns can appear in the subject position in English. This position is often first in the sentence, designates the doer (agent) of the action described by the verb and usually comes right before the verb. The subject position is blank in (3):
____ stole my lunch money.
In the blank here you could put any of the following: teenagers, vampires, cats, Priscilla or even, with a bit of poetic licence, syntax. But you can’t put in a verb, adjective or preposition:
a. *Walk stole my lunch money.
b. *Tall stole my lunch money.
c. *Under stole my lunch money.
(Remember the asterisk, indicating ungrammaticality, from the previous chapter). In fact, whole sequences of words can go in the blank in (3), but they must have a noun as their main word: the noun must be the head of the phrase. The notion of head here is really the same as what we saw in the last chapter in connection with heads of words, only now we apply it to sequences of words, i.e. phrases. So we can say:
a. The cat stole my lunch money.
b. A strange person stole my lunch money.
c. A professor of linguistics stole my lunch money.
d. Someone’s dog stole my lunch money.
e. The person I met yesterday stole my lunch money.
Sequences of words such as the cat, a strange person, a professor of linguistics, someone’s dog or the person I met yesterday are all Noun Phrases, phrases whose most important word is a noun, NPs for short. In fact, the noun is the head of the NP. Just as in morphology, we can see this from the fact that a professor of linguistics is a kind of professor, a strange person and the person I met yesterday are both kinds of people, and someone’s dog is a kind of dog. So professor, dog and person are all nouns, heading their respective NPs.
Only verbs can appear in the blank in (6):
Teenagers can ____ quickly.
So, we can complete (6) as (7), for example, by filling verbs into the blank:
Teenagers can talk/write/learn/understand quickly.
But NPs, adjectives, adverbs and prepositions are banned from that position, and trying to put them there leads to ungrammatical sentences, such as:
*Teenagers can Dave/kids/injections/tall/in/yesterday quickly.
On the other hand, sequences of words whose head is a verb, i.e. Verb Phrases (VPs), can go into the blank slot in (6):
Teenagers can dissolve in sulphuric acid/get angry/conclude you’re boring quickly.
So here we have VPs whose heads are dissolve, get and conclude, respectively.
These tests for categories can confirm our idea that onx is a noun and sploo a verb in (1). In fact, we can see that the onx is an NP and splooed a blatt blarg is a VP. The onx fits into the gap in (3):
The onx stole my lunch money.
And splooed a blatt blarg fits into the gap in (6) (with a slight change in the verb, dropping the past-tense ending -ed as this is not a past-tense context):
Teenagers can sploo a blatt blarg quickly.
One thing that you might have noticed about (1) and (2) is that both contain two familiar English words: the and a. These are called the definite and indefinite articles respectively. They are examples of functional categories. Functional categories are closed classes; you can’t make up new ones, and they vary much more from language to language (for example, as far as we know all languages have nouns and verbs, but plenty of languages have neither a definite nor an indefinite article: Latin, for example).
Another important class of functional categories is auxiliaries, like can in (6–9) or has in the following slight modifications of (1) and (2):
The onx has splooed a blatt blarg.
The cat has eaten a fat mouse.
So we can use morphology (inflections really), positions in the sentence and semantics to identify lexical categories. We can do the same with functional categories, too. For example, a syntactic test for English auxiliaries is that only members of this category can ‘invert’ (i.e. go before the subject instead of after it) in simple questions. So, alongside the statements in (14) we have the questions in (15):
a. Onxes can sploo blargs.
b. Cats can eat mice.
a. Can onxes sploo blargs?
b. Can cats eat mice?
These examples show that can is an auxiliary. Compare (16), where a verb tries to invert:
a. *Sploo onxes blargs?
b. *Eat cats mice?
Together, (14–16) tell us that verbs and auxiliaries are distinct syntactic categories in English. Also, while there are lots of verbs, and new ones can always be invented as we have seen, there are very few auxiliaries, maybe ten or twelve at the most. There are lots of other tests like these which we can use to individuate various other functional categories, but these are enough to get across the basic idea.
So, here we’ve seen that
There are two types of categories, lexical and functional.
The lexical categories are N, V, Adj, Adv and P.
Functional categories include auxiliaries and articles.
There are various kinds of tests for distinguishing and isolating the categories.
Now that we’ve seen roughly what categories there are, let’s look at how they combine.
Constituent Structure
Look at these two really simple sentences:
a. Night fell.
b. John yawned.
Each of these consists of an N and a V. So we could very straightforwardly indicate the constituent structure of these sentences as in (18):
a. [N Night] [V fell].
b. [N John] [V yawned].
Notation: here we have the same square-bracket-with-subscript notation as we saw in the structure of complex derived words in the previous chapter.
Nouns are heads of NPs, and as we saw, where we can find a noun, we can usually find an NP. So we have:
a. [NP A professor of linguistics] [laughed].
b. [NP The cat who lives next door] [sneezed].
c. [NP Several onxes] [splooed].
These NPs are complex categories, in the sense that they contain more than one word. They all have a head noun (professor, cat and onxes) and other words and phrases that either depend on or modify that noun in various ways. These other words include articles (e.g. the), and many different kinds of modifiers such as relative clauses (who lives next door in (19b)) and quantity words like several in (19c), among many other possibilities. (We’ll say more about certain kinds of ‘quantity words’ when we look at semantics in the next chapter.)
Verbs head VPs, so the right-hand bracketed word in (19) can be ‘expanded’ as in (20):
a. [NP John] [VP spoke about the economy].
b. [NP Mary] [VP ate her husband].
c. [NP Clover] [VP ate a fat mouse].
The VPs here include the verb and various other words and phrases that depend on or modify the verb. These are often direct objects, the thing that undergoes the action described by the verb (as we briefly mentioned in the previous chapter): her husband in (20b) and a fat mouse in (20c). But they can be various other things, such as the Prepositional Phrase (PP) about the economy in (20a).
Labelled brackets like those in (20) are one way of representing constituent structure. We can represent the full constituent structure of (20b) as in (21):

[S [NP [N Mary]] [VP [V ate] [NP [D her] [N husband]]]]
The abbreviations NP, N, VP and V are fairly familiar by now, I hope. S simply stands for ‘sentence’. D stands for ‘determiner’, a functional category which includes articles as well as possessive pronouns like her in (21).
The other way to represent constituent structure is with a tree diagram:

                  S
                /   \
             NP1     VP
              |     /  \
             N1    V    NP2
              |    |   /    \
            Mary  ate  D     N2
                       |      |
                      her  husband
There’s nothing magical or mysterious about either labelled brackets or tree diagrams. Both are just different ways of representing exactly the same information about constituent structure. We decide which to use for convenience; tree diagrams are what most people prefer as they present the structure in a way that’s immediate and easy to see. For example, it’s obvious from a glance at (22) that the whole thing is an S, whose basic division is into NP and VP. This information is present in (21), but it takes a bit more looking (and bracket-counting) to spot it.
Time for some tree terminology. The first, and most obvious, thing to note is that the tree is actually pictured upside-down, with its root, as it were, at the top. The parts of the tree with category labels on them are called nodes. S is the root node, where the tree starts. The lines linking the nodes are, unsurprisingly, called branches. If a node divides, it’s called a branching node; so VP and NP2 are branching nodes in (22), while NP1 is a non-branching node.
Notation: the subscript numbers on NP1 and NP2 are used simply to distinguish these two NPs and have no deeper significance; we could just as well call them NPBellatrix and NPPriscilla for example, but the numbers are more convenient to use.
If a node has branches below it, it is a non-terminal node. If it doesn’t, it is a terminal node. In (22), the actual words are the terminal nodes. You can think of the words as the leaves on the branches.
Three conventions apply to tree diagrams like (22). First, branches never cross. Second, all branches emanate from the root, S. Third, branching is only downward. All of these conventions are needed in order to make tree diagrams equivalent to labelled brackets, and, more generally, to make sure they capture the idea of breaking the structure down, as we go ‘down the tree’ from S to the terminal nodes.
Next we need to see two central notions of constituent structure: dominance and constituency. As we’ll see right away, these are really the same relation looked at in different ways.
In a tree diagram, a given category – call it A – dominates another category B just where A is ‘higher up the tree’ than B and connected to B by a continuous sequence of branches going down the tree from A to B. So, in (22), S dominates all the other nodes, VP dominates V, NP2, D and N2, while NP1 dominates N1. On the other hand, VP does not dominate NP1. Further, a node A immediately dominates another node B just where A dominates B and no node intervenes on the dominance path from A to B. So, in (22), S immediately dominates NP1 and VP, and VP immediately dominates V and NP2. But VP does not immediately dominate D and N2, although it does dominate them.
Constituency is dominance ‘the other way up’. So B is a constituent of A just where there is a continuous sequence of branches going up the tree from B to A. Also, B is an immediate constituent of A just where B is a constituent of A and no node intervenes on the upward path from B to A. In these terms, we can say that a linear string of terminal nodes forms a constituent if each of them can be traced upwards by single lines to the same non-terminal node. In (22), then, everything (except S) is a constituent of S, but only NP1 and VP are immediate constituents of S. A good exercise, to prove that you’ve grasped these notions, is to work out some more dominance (downward-looking) and constituency (upward-looking) relations in (22) for yourself. Now we’ve seen what constituent structure looks like. Next we need to know how to get it right.
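If it helps to see these relations made fully mechanical, here is a minimal sketch in Python – my illustration, not part of the theory itself – in which a node is just a label plus a list of daughter nodes, and dominance is simply ‘reachable by following branches downward’. The tree built below is the one in (22).

    # A minimal sketch of constituent-structure trees (an illustration,
    # not the chapter's formalism). A node is a label plus a list of
    # daughters; terminal nodes have no daughters.

    class Node:
        def __init__(self, label, daughters=None):
            self.label = label
            self.daughters = daughters or []

        def dominates(self, other):
            # A dominates B if B is reachable going down the branches
            # from A; equivalently, B is then a constituent of A.
            for daughter in self.daughters:
                if daughter is other or daughter.dominates(other):
                    return True
            return False

        def immediately_dominates(self, other):
            # A immediately dominates B if B is a daughter of A;
            # equivalently, B is an immediate constituent of A.
            return other in self.daughters

    # The tree in (22): Mary ate her husband
    mary, ate = Node("Mary"), Node("ate")
    her, husband = Node("her"), Node("husband")
    n1, v = Node("N1", [mary]), Node("V", [ate])
    det, n2 = Node("D", [her]), Node("N2", [husband])
    np1, np2 = Node("NP1", [n1]), Node("NP2", [det, n2])
    vp = Node("VP", [v, np2])
    s = Node("S", [np1, vp])

    print(s.dominates(n2))                # True: S dominates every other node
    print(vp.immediately_dominates(np2))  # True: NP2 is an immediate constituent of VP
    print(vp.dominates(np1))              # False: NP1 hangs from S, not from VP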
Rules
We’ve seen how syntactic representations, either labelled brackets or tree diagrams, are made out of categories and constituents. Morphemes and words belong to different categories, and – aside from being constituents themselves – make up larger phrasal constituents to form sentences. So they combine and recombine in various ways. But what specifies how they combine? And what prevents them from combining wrongly, giving ungrammatical sentences like (23)?
a. *Sneezed John.
b. *Husband her ate Mary.
c. *Fat Clover a ate mouse.
We need rules of combination. The basic rules of combination in syntax are Phrase-Structure Rules, or PS-rules for short.
PS-rules are a formal device, meaning that they’re supposed to be followed absolutely literally – the way a computer executes a program – and have nothing to do with meaning. PS-rules generate constituent structure, by specifying the precise ways in which categories and constituents can combine, and only those.
Here is a simple basic PS-rule of English:
S → NP VP
This says ‘rewrite S as the sequence NP VP, in that order’. You can imagine a computer doing this in a very mechanical way (think of ‘Find and Replace’ in a basic word-processing application). In general, PS-rules mean ‘replace every occurrence of the symbol on the left of the arrow with the one(s) on the right of the arrow’. The PS-rules specify possible combinations of constituents, and so tell us what kinds of tree or labelled brackets are allowed. The one in (24), for example, tells us that the top bit of the tree in (22) is allowed in English. Here it is again:

[S NP VP]
Implicitly, unless we are told something more, it tells us that other ‘expansions of S’ (ways of rewriting S) are not allowed. Everything which is not required by a PS-rule is forbidden.
There is a link between PS-rules and dominance/constituency. The rule in (24) generates the bit of tree in (25), and thereby tells us that S immediately dominates NP and VP, and that NP and VP are immediate constituents of S. It also tells us that NP and VP must appear in that order, and not in the order VP NP. So (24) tells us how the categories NP and VP can form the constituent S by combining together in the stated order. PS-rules, then, tell us about hierarchical relations (dominance/constituency), linear relations (NP VP and not VP NP) and categorial relations (NP and VP, not AP [Adjective Phrase] and PP).
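The ‘Find and Replace’ metaphor can be taken quite literally. Here is a tiny sketch (my illustration, not anything official): rule (24) stored as a table entry, and a rewriting step that mechanically replaces the first symbol for which a rule exists.

    # Rule (24) as a literal 'Find and Replace' operation (a sketch,
    # not the chapter's own formalism).

    rules = {"S": ["NP", "VP"]}   # rule (24): S -> NP VP

    def rewrite_once(symbols):
        # Replace the first symbol that has a rule with its expansion.
        for i, symbol in enumerate(symbols):
            if symbol in rules:
                return symbols[:i] + rules[symbol] + symbols[i + 1:]
        return symbols  # nothing left to rewrite

    print(rewrite_once(["S"]))  # ['NP', 'VP'] -- in that order, never VP NP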
Here are some more English PS-rules:
i. VP → V NP
ii. NP → (D) N
The rule in (26i) says that the bit of structure in (27) is allowed:

[VP V NP]
The rule in (26ii) has a category in brackets on the right of the arrow. This means that category is optional. Strictly speaking, (26ii) is an abbreviation for the two rules in (28):
i. NP → N
ii. NP → D N
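The brackets are pure abbreviation, and unpacking them is a mechanical matter: a rule with n optional categories abbreviates 2^n concrete rules. A quick sketch (again mine, purely illustrative):

    # Unpacking optional (bracketed) categories: a rule with n optional
    # categories abbreviates 2**n concrete rules (an illustrative sketch).
    from itertools import product

    def expand(rhs):
        # rhs is a list of (symbol, optional?) pairs.
        options = [([sym], []) if optional else ([sym],)
                   for sym, optional in rhs]
        return [sum(choice, []) for choice in product(*options)]

    # Rule (26ii): NP -> (D) N
    for concrete in expand([("D", True), ("N", False)]):
        print("NP ->", " ".join(concrete))
    # NP -> D N
    # NP -> N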
The rules in (24) and (26) can generate the tree in (22). Here is the structure again (without the words), with the PS-rule that generated each part indicated:

[S [NP N] [VP V [NP D N]]]
(S → NP VP by rule (24); NP → N and NP → D N by rule (26ii); VP → V NP by rule (26i))
If we tweak rule (26ii) a bit by adding an optional AP between D and N, and add a simple rule for APs, as in (30), we can generate the structure of (1) and (2):
(26ii, revised) NP → (D) (AP) N
(30) AP → (Mod) A
‘Mod’ here means a category of elements that can modify adjectives, like very, as in very hot.
Now here’s the structure of (1) and (2):

[S [NP1 [D The] [N1 onx]] [VP [V splooed] [NP2 [D a] [AP [A blatt]] [N2 blarg]]]]

(and identically for (2), with cat, ate, fat and mouse in place of onx, splooed, blatt and blarg).
Under NP2, we have AP as an immediate constituent thanks to our tweaking of rule (26ii). AP expands as just A following one option under rule (30). The other option, giving the substructure in (32), would be:

[AP Mod A]
Taking this option of rule (30), we would generate: The cat ate a very fat mouse or The onx splooed a very blatt blarg.
The PS-rules don’t determine where the words go, just the ways categories and constituents combine. But the words, as leaves on the trees, have to match the categories they appear with. So, in (31), the and a are Ds, cat, onx, mouse and blarg are Ns, splooed and ate are Vs and so on. How the words get to ‘slot into’ their positions in the trees is called lexical insertion (inserting lexical items – or words/morphemes – into constituent structures). There are various ways to do this, whose precise technical details would take us too far afield here; the main point is that the category of the word or morpheme (which is intrinsic to that word and partly connected to its meaning) must match the category of the terminal node it sits under.
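Here is one very simple way to picture the whole pipeline – PS-rules plus lexical insertion – as a sketch under my own simplifying assumptions (the chapter deliberately leaves the exact insertion mechanism open). The lexicon records each word’s category, and a word can only be slotted in under a terminal node with a matching label.

    import random

    # A toy grammar (rules (24), (26i), revised (26ii) and (30), unpacked)
    # plus a toy lexicon. An illustrative sketch only; like the real
    # rules, it overgenerates a little.
    rules = {
        "S":  [["NP", "VP"]],
        "VP": [["V", "NP"]],
        "NP": [["N"], ["D", "N"], ["AP", "N"], ["D", "AP", "N"]],
        "AP": [["A"], ["Mod", "A"]],
    }
    lexicon = {
        "D": ["the", "a"], "N": ["cat", "mouse", "onx", "blarg"],
        "V": ["ate", "splooed"], "A": ["fat", "blatt"], "Mod": ["very"],
    }

    def generate(symbol):
        # Expand phrasal categories by the PS-rules; at a terminal
        # category, perform lexical insertion: the word's category
        # must match the node it sits under.
        if symbol in rules:
            expansion = random.choice(rules[symbol])
            return " ".join(generate(s) for s in expansion)
        return random.choice(lexicon[symbol])

    print(generate("S"))  # e.g. 'the cat ate a very blatt blarg'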
One last and really important point about syntax. PS-rules can be applied and re-applied to their own output. To see what this means, let’s look at a more complex English sentence:
The detective thinks that [Mary ate her husband].
If you look at the bracketed part of this sentence, [Mary ate her husband], you can see that it is a sentence, too. In fact, it is (20b), which has the structure in (22), and we’ve seen the PS-rules that can generate (22).
So we already know what the structure of the bracketed part of (33) is. What about the rest? It’s clear that the detective is an NP, of the form [NP D N], generated by one of the rules in (26ii). We can easily tell that thinks is a V, but what about that?
Here we need to introduce another functional category: C, for Complementiser. Complementisers correspond roughly to what you may know from school grammars as ‘subordinating conjunctions’. Complementisers combine with sentences (S) to form a larger constituent, the Complementiser Phrase, or CP. We can state this in a new PS-rule:
CP → C S
Also, we need to modify the VP-rule in (26i) as follows:
VP → V (NP) (CP)
Now, choosing the expansion of VP as V CP in (35), and applying rule (34), we can give the structure in (36):

[S [NP [D The] [N detective]] [VP [V thinks] [CP [C that] [S [NP [N Mary]] [VP [V ate] [NP [D her] [N husband]]]]]]]
Looking at each bit of the tree in turn we can see that we’ve already seen the PS-rules that generate each of them.
But there’s something hugely important about (36), which again emerges if we look closely at how each bit of the tree is generated by a PS-rule. The first rule we apply here is (24): S → NP VP. Looking at VP, we choose one of the options in (35), VP → V CP. CP is expanded, following (34), as C S. And now, we apply rule (24) to S again, and get [S NP VP] (equivalently, the subtree in (25)). Then we apply a different option to VP (V NP). But, of course, at this point we could take the V CP option again here, apply (34) again, apply (24) again, expand VP as V CP again … and so on. Until the heat death of the universe (PS-rules aren’t interested in cosmology).
So our PS-rules can give us infinitely big trees of the form in (37):

[S NP [VP V [CP C [S NP [VP V [CP C [S NP [VP V [CP C …]]]]]]]]]
Trees like this correspond to sentences like:
Mary said that John thinks that Priscilla believes that Clover suspects …
Sentences like this, although long-winded and inelegant, are definitely grammatical sentences of English.
What all this means is that even a small, simple set of PS-rules like the ones that we have seen here – rules (24), (34) and one version of (35) – can generate an infinite number of sentences! These simple rules – or at least a few rather like them – can give us the unlimited expressive power of syntax that we talked about at the beginning of this chapter.
The property of PS-rules that makes it possible to generate infinite structures from a finite set of rules is known as recursion. Recursion means roughly ‘running back again’, and in a way this is what happens when we generate potentially infinite structures like (37). We apply rule (24), one version of rule (35), rule (34), and then ‘run back’ to rule (24), and then do it all again, and again and again. The rules, taken together, apply to their own output. Formally, this is a very simple matter: you get recursion as long as the same symbol appears on the right of an arrow as on the left (not necessarily in the same rule, but somewhere in the rule system), and the rules are free to apply in any order (so you can ‘run back’).
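Here is recursion in miniature, as a sketch (mine, and deliberately simplified): rule (24) rewrites S, and rules (35) and (34) between them put S back on the right-hand side, so the rules feed their own output. The depth parameter stands in for performance limitations; competence itself imposes no bound.

    # Recursion in miniature: S -> NP VP (24), VP -> V CP (35),
    # CP -> C S (34) -- and we 'run back' to S. An illustrative sketch;
    # 'depth' models performance limits, not competence.

    names = ["Mary", "John", "Priscilla", "Clover"]
    verbs = ["said", "thinks", "believes", "suspects"]

    def sentence(depth):
        np = names[depth % len(names)]      # S -> NP VP
        if depth == 0:
            return np + " yawned"           # VP -> V, both options omitted
        v = verbs[depth % len(verbs)]       # VP -> V CP, CP -> C S:
        return np + " " + v + " that " + sentence(depth - 1)  # back to S

    print(sentence(3))
    # Clover suspects that Priscilla believes that John thinks that Mary yawned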
So we see that it’s the finite PS-rules that can give rise to the infinite number of sentences in languages. Human language is thus a system of discrete infinity. In this respect, sentences are exactly like the whole numbers in mathematics. Sentences are discrete in that there are five-word sentences and six-word sentences but no five-and-a-half-word sentences. Sentences are infinite in that, just like whole numbers, we can always add one more bit to a sentence of any length at all and get another slightly longer one, just as you can always add one to a whole number, however large, to get a bigger one.
The recursive nature of PS-rules of the kind we have seen here captures – in a relatively straightforward way – one of the very central properties of human language and maybe a defining aspect of us as humans.
*
What we’ve seen in this chapter is the very heart of language: the way syntax works, as specified by PS-rules generating constituent structures, is what gives language its unlimited expressive power. Arguably, this is what makes it possible for us to say – and think – anything. Thanks to the PS-rules, we can build spaceships.
But there’s still a pretty big part of the story missing. On their own, constituent structures don’t mean anything; representations like (22), (31), (36) and (37) don’t convey meaning. They’re purely formal objects, generated by the formal system of PS-rules. We still need to see how semantics fits into the picture, and that’s what we’ll do next.