Published online by Cambridge University Press: 22 April 2026
This paper presents a method for the analysis of connected speech (or writing). The method is formal, depending only on the occurrence of morphemes as distinguishable elements; it does not depend upon the analyst's knowledge of the particular meaning of each morpheme. By the same token, the method does not give us any new information about the individual morphemic meanings that are being communicated in the discourse under investigation. But the fact that such new information is not obtained does not mean that we can discover nothing about the discourse but how the grammar of the language is exemplified within it. For even though we use formal procedures akin to those of descriptive linguistics, we can obtain new information about the particular text we are studying, information that goes beyond descriptive linguistics.
1 It is a pleasure to acknowledge here the cooperation of three men who have collaborated with me in developing the method and in analyzing various texts: Fred Lukoff, Noam Chomsky, and A. F. Brown. Earlier investigations in the direction of this method have been presented by Lukoff, Preliminary analysis of the linguistic structure of extended discourse, University of Pennsylvania Library (1948). A detailed analysis of a sample text will appear in a future number of Language.
2 Correlations between personality and language are here taken to be not merely related to correlations between ‘culture’ and language, but actually a special case of these. The reason for this view is that most individual textual characteristics (as distinguished from phonetic characteristics) correlate with those personality features which arise out of the individual's experience with socially conditioned interpersonal situations.
3 When the verb is transformed to suit such an inversion of subject (N₁ above) and object (N₂), we may call the new verb form the conjugate of the original form, and write it V*. Then an active verb has a passive verb as its conjugate, and a passive verb has an active verb as its conjugate.
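The conjugate relation in note 3 can be illustrated with a minimal sketch. The verb table below is a hypothetical toy lexicon, and the names `CONJUGATE` and `invert` are assumptions introduced for illustration only; nothing here is taken from the paper itself. The sketch shows only that swapping subject and object while replacing V by V* is its own inverse.

```python
# Hypothetical toy lexicon: each active form paired with a passive form.
ACTIVE_TO_PASSIVE = {
    "phoned": "was phoned by",
    "saw": "was seen by",
}

# The conjugate map V -> V* runs both ways: active to passive and back.
CONJUGATE = dict(ACTIVE_TO_PASSIVE)
CONJUGATE.update({passive: active for active, passive in ACTIVE_TO_PASSIVE.items()})


def invert(n1, verb, n2):
    """Turn the sequence N1 V N2 into N2 V* N1, where V* is the conjugate of V."""
    return (n2, CONJUGATE[verb], n1)


sentence = ("Bill", "phoned", "Jim")
inverted = invert(*sentence)
print(" ".join(inverted))           # Jim was phoned by Bill
print(invert(*inverted) == sentence)  # True
```

Applying `invert` twice returns the original sequence, which is what it means for an active verb and its passive to be conjugates of each other.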
4 Two personal names may have identical distributions. Thus, for every sentence containing Bill we may find an otherwise identical sentence containing Jim instead.
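The condition in note 4 can be checked mechanically on a small corpus. The following is a rough sketch under simplifying assumptions: sentences are split on whitespace, the corpus is an invented toy list, and the function names `environments` and `same_distribution` are hypothetical. Two words count as having identical distributions here when every sentence containing one has an otherwise identical counterpart containing the other.

```python
def environments(sentences, word):
    """Collect the frames in which `word` occurs, with the word replaced by a slot."""
    frames = set()
    for sentence in sentences:
        tokens = sentence.split()
        for i, token in enumerate(tokens):
            if token == word:
                frames.add(tuple(tokens[:i] + ["_"] + tokens[i + 1:]))
    return frames


def same_distribution(sentences, a, b):
    """True when `a` and `b` occur in exactly the same set of frames."""
    return environments(sentences, a) == environments(sentences, b)


# Invented toy corpus, for illustration only.
corpus = [
    "Bill phoned the man",
    "Jim phoned the man",
    "I saw Bill",
    "I saw Jim",
    "I saw Casals",
]

print(same_distribution(corpus, "Bill", "Jim"))     # True
print(same_distribution(corpus, "Bill", "Casals"))  # False
```

In practice a finite text will rarely show full identity of environments, so this strict equality test is best read as the limiting case of the equivalence the paper works with.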
5 I owe a clarification of the use of such chains to the unpublished work of Noam Chomsky.
6 Mathematics, and to a greater extent logic, have already set up particular sentence orders which are equivalent to each other. This equivalence can be rediscovered linguistically by finding that the distribution of each sequence is equivalent to that of the others. Our interest here, however, is to discover other equivalences than those which we already know to have been explicitly built into a system.
6a This is the actual text of an advertisement, found on a card which had presumably been attached to a bottle of hair tonic. A considerable number of advertisements have been analyzed, because they offer repetitive and transparent material which is relatively easy to handle at this stage of our experience with discourse analysis. Many other kinds of texts have been analyzed as well—sections of textbooks, conversations, essays, and so on; and a collection of these will be published soon.
7 This will be true, though to a lesser extent, even in the writing of those who obey the school admonition to use synonyms instead of repeating a word. In such cases the synonyms will often be found in the same environments as the original not-to-be-repeated word. In contrast, when a writer has used a different word because he intends the particular difference in meaning expressed by it, the synonym will often occur in correspondingly different environments from the original word.
8 Cf. Harris, Methods in structural linguistics 160 (Chicago, 1951). It goes without saying that this vague use of foresight is a preliminary formulation. Detailed investigations will show what may be expected from different kinds of equivalence chains, and will thus make possible a more precise formulation of safeguards.
9 The -s is also a part of all singular nouns (The child walks, etc.). Or else walks, goes, and the like can be taken as alternants of walk, go, etc. after he and singular nouns.
10 Before this can be done, some further operations must be carried out to reduce Four out of five ... say they prefer ... to two PW sequences: Four ... say ... and They prefer ..., with the sentence You ... will prefer ... as a third PW sequence. Otherwise, the words say they would be left hanging, since the P section (equivalent to Millions) is only Four out of five people in a nationwide survey, and the corrected W section (identical with the W of You ... will prefer ...) is only prefer X- to any hair tonic Q've used. See §3.2 below.
10a In such formulas as A is X₁: AX₂, the italic colon indicates the end of a sentence or interval. (It is used instead of a period because that might be mistaken for the period at the end of a sentence in the author's exposition.)
11 The case which we have been considering here is the important one of the sequence adjective + noun + verb, in which the noun relates independently to the adjective and to the verb. The adjective can be represented as a predicate of the noun in the same way as the verb. This will be discussed in §2.33 below.
12 True, one might claim that this last sentence is still ‘grammatical’. But present-day grammar does not distinguish among the various members of a morpheme class. Hence to require that sentence B must contain the same morphemes as sentence A is to go beyond grammar in the ordinary sense.
13 To give a crude example, one can read the text sentence The memorable concerts were recorded in company with an informant, and then stop and say to him, in an expectant and hesitant way, ‘That is to say, the concerts ...’, waiting for him to supply the continuation.
14 We may find a great many sentences beginning with The concerts and containing the other two words, e.g. The concerts were not memorable but were nevertheless recorded. These sentences will contain various words in addition to those of the original sentence; but the only new word which will occur in all sentences of the desired form NMR (or rather in a subclass of the NMR sentences) will be a form of the verb to be. Hence this is the only new word that is essential when changing to that form.
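The reasoning in note 14 amounts to intersecting the additions found in each elicited sentence. The sketch below is one possible reading, not the paper's procedure: it counts word occurrences, so that a second were counts as new even though the text sentence already contains one. The second variant sentence is invented for illustration, and the name `essential_new_words` is hypothetical.

```python
from collections import Counter
from functools import reduce


def essential_new_words(original, variants):
    """Return the word occurrences added in every variant relative to `original`."""
    base = Counter(original.lower().split())
    # Counter subtraction keeps only the occurrences beyond those in the original.
    additions = [Counter(v.lower().split()) - base for v in variants]
    if not additions:
        return []
    # Counter intersection keeps the minimum count shared by all variants.
    shared = reduce(lambda left, right: left & right, additions)
    return sorted(shared.elements())


original = "The memorable concerts were recorded"
variants = [
    "The concerts were not memorable but were nevertheless recorded",
    "The concerts were memorable and were recorded",  # invented example
]

print(essential_new_words(original, variants))  # ['were']
```

Words such as not, but, and and each appear in only one variant and so drop out, leaving the form of to be as the only addition common to all of them.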
15 A for adjective, N for noun, V for verb, P for preposition. Subscripts indicate particular morphemes, regardless of their class.
16 The only way to express the exclusion of an object here purely in terms of occurrence of elements is to say that the object already occurs. This cannot be I, since that is the subject of phoned; hence it must be the other N, the man.
17 Semantically one would say that the PN ‘modifies’ the A.
17a The array given here represents the following sentences, taken from a review of some recent phonograph records: Casals, who is self-exiled from Spain, stopped performing after the fascist victory ... The self-exiled Casals is waiting across the Pyrenees for the fall of Franco ... The memorable concerts were recorded in Prades ... The concerts were recorded first on tape. (The other sentences analyzed in §2.32 were composed by me for comparison with these.) The sentences do not represent a continuous portion of the text. This fact limits very materially the relevance of the double array; but that does not concern us here, since the array is intended only as an example of how such arrangements are set up.
18 We have consumers in P; and since the singular-plural distinction does not figure in our classes, we can associate the dropping of the -s with the occurrence of consumers in the first sentence. By dropping the -s from the P-element consumers we get a P-form consumer for the sentence.
19 Since millions of consumers would be a natural English phrase (P₁ of P₂ = P₂), the effect of using the almost identical sequence millions of consumer in front of bottles is to give a preliminary impression that the sentence is talking about P; but when one reaches the word bottles one sees that the subject of the sentence is B, with the P words only adjectival to B.