The coinages in Seuss

The children's books of Dr. Seuss abound in words that the author invented. Inspection shows that these coinages are not arbitrary, raising the challenge of specifying the linguistic basis on which they were created. Drawing evidence from regression analyses covering the full set of Seuss coinages, I note several patterns, which include coinages that are phonotactically ill-formed, coinages meant to sound German and coinages that assist compliance with the meter. But the primary coinage principle for Seuss appears to have been to use words that include phonesthemes (Firth 1930), small quasi-morphemic sequences affiliated with vague meanings. For instance, the coinage Snumm contains two phonesthemes identified in earlier research, [sn-] and [-ʌm]. Concerning phonesthemes in general, I assert their affiliation with vernacular style, and suggest that phonesthemes can be identified in words purely from their stylistic effect, even when the affiliated meaning is absent. This is true, I argue, both for Seuss’s coinages and for the existing vocabulary.

2 The Seuss coinages: an attempt at precise description
I adopt the strategy of Shih and Kawahara, employing a digital data corpus and statistical modeling in order to obtain objective testimony about issues that can become subjective very easily. The modeling described here employs the technique of logistic regression (on which see, e.g., Johnson 2008: 159-74). The purpose of my models is to predict for any given word, on the basis of just its phonological form, whether it is a Seuss coinage or a real word (for similar applications in other domains, see Hayes 2016). It is, of course, impossible to achieve perfect prediction, and what the model really does is to assign a 'probability-of-Seussian' value to every form, so its predictions are gradient. The deeper purpose of the modeling is that, once a model has been optimized, we can make useful inferences from its internal structure, specifically the degree to which the model attributes explanatory importance to principles hypothesized to underlie Seuss's coinage practice.
I employed readily accessible data. The Seuss coinages, which number about 435 in his complete oeuvre, were carefully collected and described by Lathem (2000). I extracted the coinages from Lathem's work and rendered them in phonetic transcription by hand. I believe the latter task is not difficult or controversial, in light of Seuss's clear use of orthography and the additional clues provided by rhyme. For English in general I employed my own version of the Carnegie-Mellon pronunciation dictionary (www.speech.cs.cmu.edu/cgi-bin/cmudict). My edited version, with 17,744 entries, includes only words that have a frequency of one or above in the English CELEX database (Baayen et al. 1995). This is meant to restrict it to words likely to be familiar to English speakers. 4 I also excluded words formed with highly productive suffixes such as inflectional [-z/-s/-əz] (plural, possessive, 3rd sg. pres.) or [-d/-t/-əd] (past tenses and participles). This is important because there are sequences that are very unusual in stems, but common in inflected forms. For instance, [ts] is rare in stems (e.g. Katz, Hertz) but is ordinary in inflected forms like cats or hurts. I argue below that Seuss indeed uses [ts] as a basis for coinages.
All the analytic work I did for this article (lexical databases with phonetic transcription, R scripts, spreadsheets) may be accessed in the Supplemental Materials at https://linguistics.ucla.edu/people/hayes/papers/SeussSupplementalMaterials.zip

2.1 What principles might be used to characterize the coined words?
Following a few pilot efforts, I settled on the following procedure to guide the work: I searched a fairly large preliminary set of predictive principles, then narrowed it down to a smaller set with just the most effective ones.
My pilot studies indicated that several strongly predictive traits consisted of word-initial syllable onsets, such as the [sn-] of Snumm. To be thorough, I searched the entire set of 73 attested word-initial onsets, irrespective of whether they occur in the real-word corpus or the Seuss corpus (some onsets occurred only in Seuss). I also found that particular vowels, occurring in the main-stressed position of a word, were sometimes highly predictive of Seuss, such as [ʌ]. Thus, in my more serious search, I included all 15 main-stressed vowels from the corpora as potential predictive factors. 5 I also incorporated into my search some ideas taken from the research literature on sound symbolism, mostly from work on PHONESTHEMES. These are short segmental sequences that don't fully qualify as morphemes, but nonetheless often impart a (perhaps vague) meaning to words that they contain. Phonesthemes are discussed extensively below in section 3.4. For purposes of including multiple phonesthemes in the initial search, I relied on the lists in Marchand (1960) and Hutchins (1998).

The culling procedure
The final model was made by reducing the original set of factors, described in the previous section, to a smaller set, each of whose members demonstrably contributes to predicting Seussian status in a logistic regression model.
I offer a brief note on logistic regression. Every principle that might help predict Seussian status is here termed a FEATURE. 6 Each feature is given a particular number, called its WEIGHT. In my own setup, if the weight of a feature is positive, it means that the feature favors Seuss-coinage status; if the weight is negative, it means that the feature militates against Seussian status; and if it is zero, the feature is indifferent. Greater magnitudes of weights (either positive or negative) have greater effect.
The output of the model, for any given phonetically transcribed word, is a value ranging from zero to one, expressing the estimated probability that a word is Seussian; a perfect model would assign one to all Seuss coinages and zero to all ordinary words. Where computation enters is in setting the weights: one's chosen logistic-regression software will calculate the weights that best separate out the Seuss coinages in the data from the real words. 7
5 For both cases above, it might have been possible to generalize the initial findings, using standard phonological features in the normal way. However, I usually found this was not all that helpful (see fn. 10 for the one case that I retained in the final model). Since generalizing the features vastly expands the number of hypotheses to consider and makes diagnosis harder, I did not pursue it any further.
6 The terminology comes, I believe, from computer science. In linguistics, we would be more likely to call the features CONSTRAINTS, following the research tradition of Optimality Theory (Prince & Smolensky 1993/2004). However, constraints normally act only as penalties for particular candidates, and here they more often reward them; hence the terminology.
7 The method of computation used here is a standard one, namely to maximize LIKELIHOOD, the probability predicted by the model for the corpus as a whole. This is the product of the probability assigned to Seussian-status for all 435 Seuss coinages, multiplied by the probability assigned to real-status for all 17,744 real words. Maximizing this product (it would approach 1 in a perfect model, 0 in a perfectly bad model) maximizes what we intuit as model effectiveness. Actual likelihood achieved by this and other models is reported (in log form) in appendix A.
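For readers who prefer to see the arithmetic spelled out, the following is a minimal Python sketch of how a logistic model turns weighted features into a probability, and of the likelihood described in footnote 7 (computed in log form). It is not the R code used for this article: the function names, the intercept and the feature weights are hypothetical values chosen only for illustration, and the two-word corpus is a toy.

```python
import math

def seuss_probability(intercept, weights, features):
    """Probability that a word is Seussian under a logistic model.

    weights  -- dict mapping feature name to weight (positive favors Seuss)
    features -- dict mapping feature name to how often it applies to the word
    """
    score = intercept + sum(weights.get(name, 0.0) * value
                            for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-score))

def log_likelihood(intercept, weights, corpus):
    """Sum of log probabilities the model assigns to each word's true status.

    corpus -- list of (features, is_seuss) pairs
    """
    total = 0.0
    for features, is_seuss in corpus:
        p = seuss_probability(intercept, weights, features)
        total += math.log(p if is_seuss else 1.0 - p)
    return total

# Hypothetical values, for illustration only (not the fitted model).
INTERCEPT = -4.0
WEIGHTS = {"[sn": 3.0, "stressed ʌ": 1.5}

toy_corpus = [
    ({"[sn": 1, "stressed ʌ": 1}, True),   # a Snumm-like coinage
    ({"[sn": 0, "stressed ʌ": 0}, False),  # an ordinary word with no features
]

for features, is_seuss in toy_corpus:
    print(is_seuss, round(seuss_probability(INTERCEPT, WEIGHTS, features), 3))
print("log likelihood:", round(log_likelihood(INTERCEPT, WEIGHTS, toy_corpus), 3))
```

Fitting a model amounts to searching for the weights that make this log likelihood as high (as close to zero) as possible.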
My chosen software was the bayesglm() function within the R statistics system (https://cran.r-project.org/web/packages/arm/arm.pdf). This version of logistic regression is somewhat conservative, assigning lower-magnitude weights than would obtain under the simplest forms of logistic regression.
I sought to trim my large candidate system of features into one that would be much smaller but perform almost as well. First, I removed all features that tested nonsignificant (at a p-value threshold of .1), then culled further with the stepAIC() function, which keeps a feature only if it improves the model by the Akaike Information Criterion, a well-known measure that penalizes overly complex models. 8 Statistical evaluation of all models reported here is given below in appendix A. Tables 1 and 2 below give what I found. There is one non-specific feature in the system, the INTERCEPT, which simply is a raw penalty against being Seussian, a sensible penalty in light of the disparity in numbers. The weight of the Intercept is −4.80, which is quite large. Hence, for any form to receive a really strong Seussian probability, it must rack up substantial compensation from the positive-weighted features in order, as it were, to climb out of the hole.
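The size of the 'hole' can be made concrete with a short Python sketch. It uses only the intercept reported above and the standard logistic formula; the function names are mine, introduced for illustration, and nothing here reproduces the actual R analysis.

```python
import math

INTERCEPT = -4.80  # weight of the Intercept, as reported in the text

def probability(score):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-score))

def weight_needed(target_probability, intercept=INTERCEPT):
    """Total feature weight a form must accumulate to reach a target probability."""
    logit = math.log(target_probability / (1.0 - target_probability))
    return logit - intercept

# A form that falls under no feature at all gets only the intercept.
print(round(probability(INTERCEPT), 4))   # 0.0082
# To be judged as likely Seussian as not, a form must make up the whole deficit.
print(round(weight_needed(0.5), 2))       # 4.8
```

In other words, a form with no Seussian traits is assigned a probability below one percent, and reaching even odds requires positive weights summing to 4.80.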

The features of the completed model
For each feature in the tables, I give the following information:
• The form of the feature; usually a sequence of phonemes. '[' means that a feature counts the relevant sequence only if it is initial in the word; ']' analogously means 'final'; no bracket means 'anywhere'. A few features deviate from this format and are described in words.
• The weight of the feature. For intuitive interpretation of weights see footnote. 9
• The number of words, both Seussian and real, that come under the scope of the feature, with representative Seussian examples.
• Explanatory comments, where applicable; these serve as placeholders for the discussion to follow. 'Marchand' with page number means that a sequence has been identified as an English phonestheme by Marchand (1960), the reference source used for statistical testing in section 4.
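The bracket notation just described can be read as a small matching procedure. The Python sketch below is one way to implement it, under assumptions of my own: words are lists of phoneme symbols, a feature is written with spaces between symbols, and a feature's value is the number of matches it finds. The function names and input format are hypothetical and are not taken from the Supplemental Materials.

```python
def parse_feature(notation):
    """Split a feature such as '[ s n' or 'ʌ m ]' into (sequence, initial, final)."""
    tokens = notation.split()
    initial = bool(tokens) and tokens[0] == "["
    final = bool(tokens) and tokens[-1] == "]"
    if initial:
        tokens = tokens[1:]
    if final:
        tokens = tokens[:-1]
    return tokens, initial, final

def count_matches(notation, word):
    """Count occurrences of the feature's phoneme sequence in a word.

    '[' restricts the match to word-initial position, ']' to word-final
    position, and no bracket allows a match anywhere.
    """
    seq, initial, final = parse_feature(notation)
    n = len(seq)
    if n == 0 or n > len(word):
        return 0
    if initial and final:
        starts = [0] if n == len(word) else []
    elif initial:
        starts = [0]
    elif final:
        starts = [len(word) - n]
    else:
        starts = range(len(word) - n + 1)
    return sum(1 for i in starts if word[i:i + n] == seq)

snumm = ["s", "n", "ʌ", "m"]
print(count_matches("[ s n", snumm))   # 1: initial [sn-]
print(count_matches("ʌ m ]", snumm))   # 1: final [-ʌm]
print(count_matches("t s", snumm))     # 0: no [ts] anywhere
```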

How well does the model work?
We should not expect the model to make always-correct up-or-down decisions on whether a word is a Seuss coinage or a normal word; it would be remarkable if Seuss somehow managed to make every coinage fully distinguishable in this way. Rather, we should see if the model makes meaningful, useful distinctions. It emerges here that the
8 It would have been more principled to use stepAIC() to do all of the culling, but this task overwhelmed my computing equipment.
9 By the math of logistic regression, it can be shown that the formula $e^{-\text{weight}}$ tells us the ratio of predicted probability between otherwise-identical forms that do and do not come under the scope of a feature. A weight difference of 5 means a probability ratio of about 150, 2 about 7, 1 about 2.7, .5 about 1.6, and 0 means equal probability.
We can get a more detailed picture by comparing histograms. In figures 1 and 2, I plot the probabilities assigned by the model to the 435 Seuss words, compared to the probabilities assigned to the 17,744 real words (in the latter, the scale is compressed to accommodate them in the same vertical space). We can also examine the extremes of behavior. In table 3 are given the ten most 'Seussian' Seuss coinages. I include the particular phoneme sequences that are picked up by the features of tables 1 and 2 and converted, via the math of logistic regression, to high predicted probability.
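The ratios quoted in footnote 9 can be checked with a few lines of Python. The sketch below is illustrative only; the function names and the base scores are hypothetical. One caveat seems worth stating: in a logistic model the exponential of a weight difference is, strictly speaking, a ratio of odds, and it approximates a ratio of probabilities only when both probabilities are small, as they are for most forms here.

```python
import math

def odds_ratio(weight_difference):
    """Ratio of odds between forms that differ by the given total weight."""
    return math.exp(weight_difference)

def probability(score):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-score))

def probability_ratio(base_score, weight_difference):
    """Exact ratio of predicted probabilities, given the lower form's score."""
    return probability(base_score + weight_difference) / probability(base_score)

for w in (5, 2, 1, 0.5, 0):
    print(w, round(odds_ratio(w), 1))

# With a very low (hypothetical) base score the two ratios nearly coincide;
# with a base score of 0 they diverge sharply.
print(probability_ratio(-10.0, 1.0), odds_ratio(1.0))
print(probability_ratio(0.0, 5.0), odds_ratio(5.0))
```

The loop prints values close to the 150, 7, 2.7, 1.6 and 1 cited in the footnote.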
In less detail, (3) gives the ten least Seussian Seuss coinages, as well as the 'most Seussian' and 'least Seussian' real words (the latter consists of ten words randomly chosen from the 484 real words that got a 0.000 score). These are meant mainly as a guide for the intuition, though the forms of (3b) evoke a further phenomenon: Seuss occasionally adapts a real word, often bearing Seussian phonological traits, to serve as a novel word; for example zip, respelled as Zipp, is used as a surname in Oh Say Can You Say?. These forms are discussed in appendix B. Perhaps more informative is a sampling of minimal-Seuss-score monosyllables, which are for the most part not from learned vocabulary: case, coal, cork, corn, course, shake, stake, stave, stove, sty.

BRUCE HAYES
One wonders whether the model could be improved by further work. I would judge that this is likely, since many of the Seussian coinages are assigned low scores but somehow sound Seussian to me, for example sporn, Jounce and tweetle, all with scores below 0.02; something is still missing. However, I believe the model in its present form suffices for its intended purpose; namely, that we can inspect it, trying to find in its features some principles that will be informative about Seuss's coinage practice in general terms.

3 The Seuss coinages: seeking general principles
I will put forth four proposed principles of Seussian coinage.

Meter
First, the Seussian words are skewed somewhat to make them fit easily into his hallmark meter, anapestic tetrameter; see features (k) and (p) in table 1. Since these metrical principles are so distinct from the main theme of this article, I have relegated discussion of them to appendix C below.

Phonotactic violations
As Nilsen (1977) observed, a noticeable minority of the coinages violate principles of English phonotactics; specifically word-level phonological well-formedness. For English phonotactics see e.g. Hammond (1999), Hayes & Wilson (2008) and Daland et al. (2011). The patterns noted below are probably not controversial.
First, a number of onsets found in the Seuss coinages are not permissible in the core English vocabulary (although they may occur in unassimilated borrowings). Lastly, consider Snumm, quoted above in (2). Along with Snimm (a proper name from Too Many Daves) this coinage violates a phonotactic principle discussed in Davis (1991): English avoids the occurrence of similar or identical consonants in the C positions of the formula sCVC. Davis' constraints include, for instance, bans on /spVp/ and /skVk/ (spip or skeck would be odd as English words). In the present context, the relevant ban, also noticed by Davis, is on /sNVN/, where N is any nasal consonant. As Davis points out, no such words exist in English and I personally find smem, smun, snam (and indeed Snumm and Snimm) to sound odd.
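The sCVC restriction can be stated as a simple check on the first four segments of a word. The Python sketch below covers only the cases mentioned here (identical consonants, as in /spVp/ and /skVk/, and two nasals, as in /sNVN/); Davis's notion of 'similar' consonants is broader than this, and the vowel inventory and function name are simplifying assumptions of mine.

```python
NASALS = {"m", "n", "ŋ"}
# Simplified vowel inventory, for illustration only.
VOWELS = {"i", "ɪ", "e", "ɛ", "æ", "ʌ", "ə", "ɑ", "ɔ", "o", "ʊ", "u"}

def violates_scvc_ban(phonemes):
    """True if a word begins s + C + V + C with the two Cs identical or both nasal."""
    if len(phonemes) < 4:
        return False
    s, c1, v, c2 = phonemes[:4]
    if s != "s" or v not in VOWELS or c1 in VOWELS or c2 in VOWELS:
        return False
    return c1 == c2 or (c1 in NASALS and c2 in NASALS)

print(violates_scvc_ban(["s", "n", "ʌ", "m"]))  # Snumm: True
print(violates_scvc_ban(["s", "p", "ɪ", "p"]))  # spip: True
print(violates_scvc_ban(["s", "n", "ɪ", "p"]))  # snip: False
```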
Unsurprisingly, none of these phonotactic violations is extreme, like, say, the use of uvular consonants or grossly sonority-violating initial or final clusters. It seems that Seuss wanted his words to sound funny, but would hardly want to inflict an impossible phonetic challenge on his readers.
The specific examples given above most likely are only the most salient cases of a more general pattern: Westbury et al.'s (2016) experiments suggest that phonotactically improbable English nonce words are more likely than chance to be felt as funny, and their sample of Seuss coinages emerged in the aggregate as less phonotactically probable than ordinary words.

Words that sound German
Nilsen (1977) and Teuber (2018) suggest that a number of the Seuss coinages sound like German words. Some of these have already been mentioned in the previous section: words beginning in [ʃl] and [ʃn] are aberrant in English, but are normal in German.
German is of course closely related to English and has similar phonotactics. Yet the phonological history of the language (see e.g. Chambers & Wilkie 1970) has produced
12 We know this because in On Beyond Zebra Nuh is the letter used to spell Nutches, which rhymes with hutches.
That these coinages were actually intended by Seuss to sound German is made plausible by several factors. First, the orthography Seuss chose for them is largely German, as in Gitz, Glotz, Schlottz, Schnutz (that is, tz not ts, sch not sh). Second, the texts include a few overt German cultural references, notably the blue-footed mandolinist Gretchen von Schwinn, from Oh Say Can You Say, and the castle of Krupp, from Dr. Seuss's Sleep Book. Lastly, Seuss's German-styled coinage practice can be related to his own life history (Morgan & Morgan 1995): he grew up in a German-speaking family (he was third-generation) in Springfield, Massachusetts, a city that during his youth included a vibrant German-American community.

The German coinages and the American audience
It is only natural that Seuss, a popular artist, would have attempted to create coinages that would make sense to his readers. In the present context this raises the question of whether Seuss's audience (mostly mid-century Americans) would have been able to identify Germanness in nonce words. An intriguing research finding by Oh et al. (2020) bears on this question: they show by experiment that non-Māori residents of New Zealand, very few of whom can actually speak Māori, nonetheless have an accurate sense of the phonotactic principles of the language, obtained from second-hand exposure. This suggests that if Seuss's audience had enough second-hand exposure to German they likewise could have internalized a sense of what German phonology is like. It seems reasonable to me to claim that mid-century Americans did indeed have considerable exposure to German; this was the period following World War II, and closer to the historical time when German-Americans were the nation's largest ethnic minority. 14 Of course, even now many American Seuss readers would surely recognize Schlottz as a German-like word. 15
13 I did not include in my feature set any phoneme sequences that correspond to actual German morphemes, but these do occur a number of times in the coinages:

Phonesthemes
For present purposes, I define a phonestheme as the following: (i) it is a segment or segment sequence that occurs in multiple words; (ii) it has some vague, often expressive meaning; (iii) its 'residue' in a word is not a morpheme; e.g. in words of the form [ Ph X ] word , where Ph is a phonestheme, X is not in general an identifiable morpheme of the language. To give an example, initial [sn-] is a well-known phonestheme of English. Its meaning is (vaguely, as always) 'having something to do with the nose', as in snoot, snot, sneeze, snout, snuff, snore, sniff, sniffle, snort; and by extension 'looking down the nose at', snob, snooty, sneer, snicker, snide, sniffy and snub. 16 We will examine other phonesthemes below.
Phonesthemes are the topic of a large research literature, 17 which I briefly discuss before going on to the Seuss coinages.

Theories about phonesthemes
I see three basic lines of thought.
The first is least relevant here, so let us dispose of it up front. Phonesthemes, or at least many of them, are often said to have a NATURAL PHONETIC BASIS, as in the affiliation of [i] (a low-sonority vowel with high F2) with smallness (Jespersen 1933). For a careful overview of this topic see Kawahara (2021b). For present purposes I believe it will be safe to ignore whether a phonestheme is natural or arbitrary.
More pertinently, there are different points of view about where phonesthemes come from and their role in language. One prominent viewpoint is the WORD AFFINITIES approach, put forth by Bolinger (1965) and Magnus (2001). This sees phonesthemes as the result of word comparison: human language learners comb through their lexicons, seeking all conceivable correlations between phoneme sequences and meaning. Of course, when pursued to a successful conclusion, this learning behavior yields knowledge of the authentic morphology, enabling most words to be parsed into a sequence of clearly defined, plainly meaningful morphemes. Phonesthemes, in contrast, are the morpheme candidates left on the workbench when learning doesn't fully succeed; hence, they occur in words whose 'residues' (X in [ Ph X ] word ) are meaningless, their meanings are elusive, and native speaker judgments about them are difficult and ambivalent. 18
A rather different view on phonesthemes is put forth by Bloomfield (1933: 156), Wales (1990) and Joseph (1994), who emphasize the STYLISTIC FUNCTION of phonesthemes: phonesthemic words are characteristically vernacular in tone and expressive in function. Joseph (1994: 222, 229) articulates this view clearly, describing phonesthemic words as 'expressive, affective, connotative'; they 'add color to the language'. An important component of this view, put forth by Joseph, is that a word can include a phonestheme which embodies style without bearing any trace of the phonestheme's meaning. This will turn out to be important when we later turn to Seuss. The stylistic function of phonesthemes arises, I suspect, from their use in word coinage. Speakers obviously do not concoct phonologically novel words for the purpose of making their meaning clear; rather, these coinages are intended to make an impression, based on their imaginative phonological content. Earlier (section 1), I mentioned the apparent fact that many of our existing words originated as phonesthetic coinages, the work of creative speakers long forgotten. It is not unreasonable to regard these coinages, at least at the moment of origin, as the deployment of phonesthemes in the service of verbal folk art. 19 Here, I suggest that Seuss embraced this art form as part of his own distinctive vernacular style.
16 Pentangelo (2020). The idea that Seuss uses phonesthemes in his coinages was first put forth by Teuber (2018).
18 Interestingly, in recent years it has become possible to implement the word-affinities approach as a computational model (Otis & Sagi 2008; Liu et al. 2018), since there exist ways to approximate meaning using text distributions.
I have now given two accounts of phonesthemes, but how do we integrate them? Here again, word coinage provides the key: the anonymous verbal artists who coin new words draw on the set of word affinities to make their words more vivid as well as more intelligible. Although the phonesthemes originate with word affinities, the fact that they are repeatedly used to create new vernacular words over time means that the phonesthemes themselves are likely eventually to acquire the vernacular stylistic tinge. And the process may be self-feeding: the acquired stylistic tinge invites word-coiners to make use of the phonestheme more frequently, a virtuous cycle.

Phonesthemic words: a three-way distinction
With the above general discussion in mind, we now turn to a proposed taxonomy of the words in which the phonesthemes occur. The idea is that for any given phonestheme, we will normally find words that fit into each of the following categories.
(5) A three-way classification for phonesthemic vocabulary
(a) Words in the MEANINGFUL CORE of a phonestheme ('core words') both contain the phonestheme and bear the appropriate meaning.
(b) Words in the PENUMBRA of a phonestheme contain the phonestheme and also convey the vivid, expressive character of phonesthetic style; but they do not bear the meaning of the phonestheme.
(c) Words in the NEUTRAL ZONE of a phonestheme contain the segments of the phonestheme but do not bear the meaning of the phonestheme and lack a vivid, expressive meaning; they are not phonesthetic. 20
19 Wales (1990) aptly refers to the coiners of novel (vernacular) words as 'folk poets'.
20 Citations: (5a) is agreed upon by all. To my knowledge, only Joseph (1994: 229-30) has ever noticed (5b), the penumbra. The neutral zone, (5c), is widely noted, e.g. by Jespersen (1922: 406) and Fordyce (1988: 177). Note that my use of the term 'core' differs from that of Fordyce, who uses it to describe those words that embody the phonestheme's meaning most clearly and saliently. That degree of adherence to the meaning of a phonestheme is gradient seems clear from Fordyce's as well as Hutchins' (1998) experiments.

I illustrate this taxonomy for the 'nasal' phonestheme [sn-] already mentioned. The core of this phonestheme would include the words with nasal meaning enumerated earlier: snoot, snot, sneeze, snout, snuff, snore, sniff, sniffle, snort, snob, snooty, sneer, snicker, snide, sniffy, snub. What of the penumbra? I suggest that it includes words like snazzy, snag, snap, snatch, sneak, snip, snitch, snoop and snug. These seem unnasal in their meaning, but they are nonetheless expressive, in the way that phonesthemes characteristically are. To defend this claim, I juxtapose some words that occupy the penumbra of the [sn-] phonestheme with their literal near-equivalents:  (6) and (7) that include the phonestheme are more vivid and more colloquial. The implication is that a phonestheme does not require its core meaning to be present to render its stylistic effect. Unsurprisingly, the element of vivid style that is the sole phonesthemic property of penumbral words is also found in the words of the core, as the comparisons of (8) suggest. Consider next the neutral zone. It is treated here as the set of words that accidentally contain the segments of a phonestheme, in the same way that, say, lens accidentally contains the [-z] of the plural suffix. This zone can be a source of frustration to anyone lecturing about phonesthemes, whose audience is naturally inclined to ask, 'What about word X? Isn't that a counterexample?' It seems best to acknowledge that most phonesthemes do have a neutral zone, but the existence of this zone should not be taken as counterevidence to the existence of the phonestheme: in pointing out a phonestheme, we are only pointing out a pattern that is too frequent to be coincidence, not an implicational law. Indeed, Hutchins' (1998) experiments affirmed psychological reality for phonesthemes that possess a demonstrable neutral zone.
The neutral zone for [sn-], a potent phonestheme, is small; I suggest that two plausible candidate words are snow and snail.

'Gravitational attraction' in phonesthemes
A number of scholars (e.g. Jespersen 1922: 407; Malkiel 1990; Magnus 2001: 8, 72; Pentangelo 2020) have suggested that phonesthemes exert a kind of gravitational attraction, drawing additional words into their membership by adjusting either their form 21 or their meaning. In present terms this claim can be elaborated a bit: I suggest that members of the periphery may gradually assume semantic properties of the core, and members of the neutral zone may be drawn into the core or periphery, becoming regarded as phonesthetic and expressive. Such drift is likely to be the result of language misacquisition; children are prone to mislearn either the style level or the meaning of phonesthetic words.
Here is an example of drift into the core: Malkiel (1990: 99-110) documents an Italian phonestheme of the form CVCᵢCᵢV (CᵢCᵢ a geminate) with meaning 'negative, or ridiculous, or both', which has pulled in words that were formerly neutral such as nullo 'nothing' and secco 'dry', giving them novel secondary usages that fit the core meaning. Another example is the extraordinary semantic drift of English snob (roughly, from 'lowlife' to 'one who looks down on others'), documented in the OED. For drift into the penumbra, I am on more speculative ground, but the reader may wish to ponder the words snooker and snipe. I feel that they belong in the penumbra, not the neutral zone, of [sn-]: as words they seem absurdly jokey and vivid for the purpose of denoting an ordinary indoor sport and bird species. 22 The Broadway composer Irving Berlin evidently felt a sense of pull for the phonestheme [ j-] when he wrote the musical Yip Yip Yaphank, attracting the name of the Long Island town where he did his Army service ([ˈjæpæŋk]) into the penumbra of the [ j-] phonestheme.
What enables a neutral-zone word to resist the inward pull of its component phonestheme? I suspect frequency matters: in my lexical database, the most frequent words (per CELEX) beginning with the phonesthemes discussed here have at most a modest penumbral tinge: snow, 23 Z, zone, year, use, young. The other cause of phonestheme resistance is speech register: formal or technical words are incompatible with the stylistic character of phonesthesia, and so they can contain the phonesthemic
21 For example, Jespersen suggests that peep originated as a phonesthetic 'repair' of pipe, a word which had lost its phonesthetic appropriateness when the Great Vowel Shift altered its vowel from [iː] to [aɪ].
22 It is circular reasoning, but worth pointing out, that Seuss used both words in his books: snipe appears (as such) in If I Ran the Circus, and Snookers is a surname in Happy Birthday to You. A journalist calls snooker 'the funniest word I have ever heard'; www.theguardian.com/stage/2016/dec/16/andy-zaltzman-day-today-lee-mack
23 Snow is perhaps a core word for sniffers of white powder cocaine, a usage dated to 1914 by the OED.
An important implication of the above for present purposes is this: a word coined for purposes of writing a children's book would be unlikely to occupy the neutral zone of any phonestheme it contains. If an author uses the segments of a phonestheme, it will probably be perceived by readers as being intended as a phonestheme. The word frequency of a coinage is very low (i.e. zero); and technical or formal vocabulary would hardly be expected in a children's book.

Phonesthemes in general: summary
Summing up, in the discussion below I will approach Seuss's coinages from the viewpoint of the three-way taxonomy of (5), which emphasizes (a) the stylistic role of phonesthemes; (b) the possibility of phonesthemes that convey style but not the relevant meaning; (c) gravitational attraction, under specified conditions, from neutral zone to penumbra to core. These ideas can be connected to the rough theories of phonesthemes discussed above. The core words acquire their phonesthetic meaning via the word-comparison process, during language acquisition. Core words tend to be felt as vernacular for the reason given earlier; that use in coinages over time gradually lends the phonestheme a vernacular tone. Words of the penumbra have meanings that cannot be accommodated within the phonestheme's semantic territory, but speakers nevertheless apprehend their vernacular character, either from context, or simply by adopting the reasonable hypothesis that whatever is phonesthemic is also vernacular. Lastly, neutral zone words are the words that can escape the gravitational-attraction mechanism: either they are so frequent that they can maintain their style and meaning on their own, or they fall into a dry, technical lexical domain, so that no one would think of using them in vernacular style.
At this point we can turn to some of the particular phonesthemes used in Seuss's coinages. I will argue that a minority of the phonesthemic usages in Seuss are core, the rest are penumbral and none are neutral.

[sn-]
The 21 Seuss coinages that begin with the 'nasal' phonestheme [sn-] are given in (9). Of these, I have identified four as belonging to the core of [sn-]. Snaff, from The Big Brag, inherits the phonesthemic status of sniff, of which it is a jocular past tense. Snargled appears in a sequence of verbs with sneezed, snuffled and sniffed, describing inhalation of polluted air, in The Lorax. The snobbish Sneetches plainly qualify, per Seuss's description:
(10) With their snoots in the air, they would sniff and they'd snort
A more subtle case is the Sneedle, from On Beyond Zebra: this is an insect whose nose takes the form of a large and frightening stinger:
(11) Then we go on to SNEE. And the SNEE is for Sneedle
     A terrible kind of ferocious mos-keedle
     Whose hum-dinger stinger is sharp as a needle.
However, this seems to exhaust the core [sn-] words in Seuss, as the remaining 18 [sn-] coinages are slim pickings for anyone seeking out nasal meaning. For instance, the Drum-Tummied Snumm, from (2) above, has a spectacular tummy, but a very ordinary nose. Elsewhere in If I Ran the Circus, neither the Harp-Twanging Snarp nor Mr. Sneelock seem nasal in any way, and the same goes for the remaining words in (9). I would suggest that these forms are indeed penumbral; i.e. expressive but not nasally meaningful.

[z-]
[z-] is given short shrift by my primary reference, Marchand (1960) ('an infrequent initial') but is taken more seriously by Wescott (1980), who demonstrates considerable productivity for it. Let us consider the cases from my own data corpus. The 47 real [z]-initial words in my dictionary include eight that seem fairly clearly phonesthetic: zest, zigzag, zing, zip, zoom, zot, zany and zap. Were I to try to define the core meaning of [z-], I would guess something like 'with great liveliness'. Thus, a person who is zany is not just somewhat crazy, but crazy in a lively way; for a lizard to zap an insect it must make a very abrupt movement of its tongue.
The [z-] phonestheme also appears to have a penumbra. For example, zit is a very expressive way to denote a pimple, but pimples are not lively. Zilch means 'nothing', but is used to express the idea with feeling and humor. Zonked is plainly expressive but denotes stupor rather than liveliness. There is a neutral zone, composed of technical expressions like zinc and zinnia. A possible example of a neutral-zone word drawn toward the penumbra is Zenith, a brand name that did well for selling television sets in Seuss's day.
[z-] has been noticed before by scholars of the Seuss coinages (Teuber 2018; Keyes 2021) and is indeed the most frequent phonestheme in his work, with 40 occurrences.

With an extension to not-quite-initial position, we may include G-r-r-zapp, G-r-r-zibb, G-r-r-zopp, the sounds of the arrows shot by the Yeoman of the Bowmen in The 500 Hats of Bartholomew Cubbins. However, most instances of [z-] in Seuss appear to be only penumbral. Notably, several [z]-initial Seussian animals are placid and serene: the Ziffs and Zuffs of Scrambled Eggs Super, the Zizzer-Zazzer-Zuzz of Dr. Seuss's ABC, the Zatz-It of On Beyond Zebra, and the Zans and the Zeep of One Fish Two Fish Red Fish Blue Fish. 24
[j-]
[j-] is a 'weaker' phonestheme than the other two and it has a large neutral zone including words like yellow, 25 yoke, yarn, yolk and eucharist. Neutral zone words that (for me at least) risk falling into the penumbra are Yonkers, yak and yam, which seem a bit silly for purposes of denoting a city, an animal and a vegetable; see also Yaphank, above.
As a phonestheme in Seuss [j-] includes the following core items:
(13) Core [j-]-initial coinages in Seuss
     (a) YOPP, the cry of help made by a small Who that saves the Whos from destruction (Horton Hears a Who) 26
     (b) Yekko, a beast who 'howls in an underground grotto in Gekko' (On Beyond Zebra)
     (c) Ying, a creature with whom it is fun to sing (One Fish Two Fish)
But as before, the penumbral examples outnumber them: these include Yop (this time a name of a creature, in One Fish Two Fish); Yink, another creature in One Fish Two Fish; and Yupster, a place name in On Beyond Zebra. There are about ten other cases.
To sum up this section: the patterning of phonesthemes in Seuss's coinages matches their behavior in real language: we find full-blown core coinages like Sneedle, bearing the appropriate meaning; as well as penumbral coinages like Snumm, in which the phonestheme provides only expressiveness and style. The third case, namely appearance of the phonesthemic segments without any phonesthetic effect at all, appears to be impossible, since in real life these cases exist only among words that are frequent or learned, neither of which could plausibly be used in a Seuss coinage.
24 A possibility to consider is that the [z-] phonestheme possesses two cores, the second of which evokes sleepiness. Some relevant real words I have noticed are zzz (orthographic phonestheme denoting snoring), zone out and zonked. For Seuss, several of the animals cited above are portrayed as sleepy. For multiple-core phonesthemes see Fordyce (1988: 194-5), Wales (1990), Magnus (2001) and Pentangelo (2020).
25 Yellow is perhaps penumbral when used to mean 'cowardly'.
26 As MacDonald (1988: 86) observes, the climactic YOPP is prepared by the appearance of four evidently phonesthemic [j-] words (yapping, yipping, yo-yo, yip) in the immediately preceding pages. One is reminded of Bergen's (2004) experimental finding that phonesthemes can be primed.
4 Are these speculations on the right track? A statistical test
To return to the main thread, we sought to explain in general terms Seuss's coinage practice, and came up with four hypotheses:
• Words that match Seuss's meter are likely to be Seuss coinages.
• Words that are phonotactically aberrant are likely to be Seuss coinages.
• Words that sound German are likely to be Seuss coinages.
• Words that contain phonesthemes are likely to be Seuss coinages.
The statistical model described in section 2 was meant to provide the raw material for evaluating these hypotheses in detail. However, that model only tests phoneme sequences as such, and we have not yet tested whether it is really true that it is the phonesthemic status of these sequences, as I have claimed, that is essential. Perhaps Seuss's practice is systematic, but has nothing to do with phonesthemes. Hypothesis-testing in this domain is not straightforward given the notorious subjectivity of phonesthemic analysis.
Hoping to find objectivity, I constructed a second logistic regression model on a different basis. Whereas the previous model was an attempt to scrutinize a great number of potential features, hoping to find the best ones, my second model implements only the proposed phonesthemes found in one single reference source, Marchand (1960); I will call it the Marchand Model. 27 The model is less complete and accurate than the Full Model given in tables 1-2, but it is arguably objective. Marchand had no ax to grind concerning Seuss, but simply compiled a long list, offering his considered and informed judgment (based on examination of numerous examples) of whether a particular sequence was phonesthemic.
In compiling his list it is clear that Marchand examined all English vowels, all possible onsets and a great many syllable rhymes. 28 In these domains, if Marchand makes no mention of a sequence, it is reasonable to infer that he saw no reason to call it a phonestheme. Unsurprisingly, there is much overlap in the features of the Marchand Model with my Full Model, and to show this, I included the information (page number) of Marchand's discussion of these various sequences in tables 1 and 2 above. As before, the complete Marchand Model may be inspected in the Supplemental Materials.
27 I picked Marchand because he tends to be somewhat conservative, refraining from seeing phonesthemes everywhere he looks. Magnus (2001) and Bolinger (1965) are, conceivably, correct in claiming that phonesthemes are omnipresent, but if this view is true then the hypothesis 'Seuss used phonesthemes in coining words' becomes trivial and not worth checking.
28 Indeed, departing from his normal practice, he seeks a phonesthemic interpretation for every vowel, so there was no point in including them in my Marchand Model. Marchand also covers a few singleton final codas, but they did not improve the accuracy of the model and I omit them here.
The model, being so coarse, is far less effective than the model of tables 1 and 2 in predicting Seussian status; see appendix A for details. The key point of the model is that it permits Marchand's independent testimony to bear on the question of whether Seuss's practice is indeed phonesthemic. The results of the model are that Germanness, phonotactic ill-formedness and metrical appropriateness all test significant as factors for predicting Seussian status. In addition, phonesthemic status, for sequences identified as such by Marchand, also matters; although the constraint weights may be lower than those for Germanness and phonotactics, the number of words covered is considerably larger. 30, 31

PREFER-REAL: This is the intercept term, expressing as before the basic preference for a word being real.
GERMAN: This feature is invoked by all words that invoke one of the specific German features in the Full Model; these are the ones labeled 'German' in tables 1 and 2.
PHONOTACTIC: This feature is invoked by all words that are phonologically ill-formed or near-ill-formed, invoking one of the features labeled as such in tables 1 and 2. 29
METRICAL: The sum of the violations of the two meter-related constraints discussed in appendix C.
MARCHAND-ONSET: This feature is invoked by all words that begin with a Marchand-mentioned onset phonestheme.
MARCHAND-RHYME: This feature is invoked by all words that end with a Marchand-mentioned rhyme phonestheme.
29 Of course the German sequences are themselves mostly ill-formed in English, but for clarity I excluded them from the scope of my Phonotactic feature, which covers only the remaining ill-formed cases.
30 A reviewer suggested examining a model that includes interaction terms; this would test, for example, if having a Marchand onset is more important in words that are metrically felicitous. A check revealed that no interaction terms test as significant.
31 A final note: as reviewers have commented, the factors overlap; e.g. initial [z-] is a phonestheme, but it may also be a Germanism (from the sound change *[s] > [z]) and is moreover phonotactically somewhat improbable (Hayes & Wilson 2008). I'm not sure what kind of test could control for these issues.
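The way these six features combine into a single 'probability-of-Seussian' value can be illustrated with a minimal sketch of a logistic regression score. The weights below, including their signs, are hypothetical values chosen only for illustration; the fitted values belong to the Marchand Model itself and are not reproduced here. The function name prob_seussian is likewise an assumption.

```python
import math

# Hypothetical weights (and signs), for illustration only; these are NOT
# the fitted values of the Marchand Model.
WEIGHTS = {
    "PREFER-REAL": -4.0,     # intercept: baseline preference for real words
    "GERMAN": 3.0,
    "PHONOTACTIC": 2.5,
    "METRICAL": -1.0,        # assumed here to penalize meter violations
    "MARCHAND-ONSET": 1.2,
    "MARCHAND-RHYME": 1.0,
}

def prob_seussian(features):
    """Map a word's feature values to a probability of being a Seuss coinage.

    `features` maps feature names (other than the intercept) to values:
    1 or 0 for the binary features, a violation count for METRICAL.
    """
    score = WEIGHTS["PREFER-REAL"]
    for name, value in features.items():
        score += WEIGHTS[name] * value
    # The logistic function turns the weighted sum into a probability.
    return 1.0 / (1.0 + math.exp(-score))

# A hypothetical word with a Marchand onset and a Marchand rhyme,
# compared with a word that triggers no feature at all.
p_phonesthemic = prob_seussian({"MARCHAND-ONSET": 1, "MARCHAND-RHYME": 1})
p_plain = prob_seussian({})
```

Under any positive weights for the two Marchand features, p_phonesthemic exceeds p_plain; how much it exceeds it depends entirely on the fitted weights.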

To obtain a more detailed look, I also ran a fine-grained version of the Marchand Model that separates out all the Marchand-mentioned features (there are 44 for onsets and 116 for rhymes). The result, available in the Supplemental Materials, demonstrates that most of the work of predicting Seussian status is being done by a fairly small subset of Marchand's features; only 21 of 160 meet the criterion of bearing a weight of at least 1 and receiving a p-value < .001. 32
The upshot of these studies, I believe, is as follows. If we agree to take Marchand as an impartial witness for phonesthemic status, then it seems almost certain that Seuss is using phonesthemes when he coins words. Further, Seuss is making use of only a modest subset of Marchand's phonesthemes. There are at least two possible reasons for this. First, Marchand may have been overenthusiastic in positing phonesthemes (I tend to think so, particularly among the onsets). Second, Seuss was perhaps making an unconscious artistic decision, choosing his favorites from a larger available inventory.
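The selection criterion applied to the fine-grained Marchand Model (a weight of at least 1 together with a p-value below .001) amounts to a simple filter over the fitted features. The sketch below shows that filter; the rows are invented placeholders, not results from the Supplemental Materials, and the names ROWS and strong_features are assumptions.

```python
# Hypothetical (feature name, weight, p-value) rows, for illustration only.
ROWS = [
    ("onset-A", 2.1, 0.0001),
    ("onset-B", 0.4, 0.0001),   # significant, but weight too low
    ("rhyme-C", 1.5, 0.02),     # weight high enough, but not significant
    ("rhyme-D", 1.0, 0.0005),
]

def strong_features(rows, min_weight=1.0, max_p=0.001):
    """Keep features with weight >= min_weight and p-value < max_p."""
    return [name for name, weight, p in rows
            if weight >= min_weight and p < max_p]

selected = strong_features(ROWS)
```

Both conditions must hold at once, so a feature with a large weight but weak statistical support is excluded, as is a well-supported feature with a small weight.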

Conclusions
Verbal artists, particularly popular artists, must rely on phonological resources they share with their reading community. This dictum is confirmed by Seuss's coinage practice. First, native speakers of English internalize a detailed phonotactics of their language, which leads them to be amused by novel words like Thneed. Speakers also have some ability to internalize phonotactic principles of languages they don't speak but find accessible, and hence can be entertained by novel pseudo-German words like Schlottz. Lastly, they have internalized a system of phonesthemes, which gives them the ability to appreciate novel phonesthemic words. Just like real-life phonesthetic words, coined ones may either include the semantic component of the phonestheme, as in Sneedle, or exclude it, with the phonestheme offering only a sense of style, as in Snumm.
It goes without saying that the rigor of the research reported in this article would be increased by extensive experimentation, in the research tradition of, e.g., Fordyce (1988), Hutchins (1998) and Bergen (2004). We would like to know more about which proposed phonesthemes are actually internalized by native speakers, what meanings they are assigned, whether my proposed 'penumbra' (section 3.4.2) is psychologically real, and (on a different topic) the extent to which older American English speakers (the original Seuss audience) have internalized the phonotactics of German (section 3.3.1). Since my account depends on the ability of people to learn the stylistic affiliation of particular linguistic entities, we are also in need of a theory of how this is done.
Lastly, it might also be useful to carry out studies comparing Seuss's use of phonesthemes with that of other word-coiners in literature, in ordinary life and in industry (see Wong 2014, and the Pokémon research cited in section 1). I imagine that such study would find considerable variation. While Seuss's choices were principled, they probably access only a subset of the possibilities offered by the resources the