In the nineteenth century, an English woman, Sarah Ann Glover (1785–1867), believed that singing was for the public good and a Yorùbá man, Samuel Àjàyí Crowther (1809–91), thought that speech tones should be preserved in writing. Their stories illustrate how diversity in thought sometimes struggles to have an impact, but can ultimately shape human consciousness, and that distinct ideas with disparate aims may be creolized in a period of rapid social change (see, for example, Hannerz 1987). While the outcome shows a positive side of the missionary field, bringing people and ideas together, the transmission of Glover's and Crowther's ideas was mediated by the overlapping political, social and cultural hegemonies of the colonial era. Crowther was celebrated in the English-speaking world as evidence that the civilizing agenda – and colonialism – was good for all involved. Glover's innovations in music education have been misattributed to a few different men. This article draws on evidence from ethnographic work, field recordings and literature from a variety of disciplines, including several articles published in this journal. All of this information contributes to one answer to the question: why is do-re-mi the preferred heuristic for speech tone levels among bilingual Yorùbá speakers and teachers of Yorùbá language?
The presence of speech surrogates, such as the talking drum (dùndún), indicates that language and music have long had a close relationship in Yorùbá-speaking areas. Throughout sub-Saharan Africa, missionary activity introduced Western forms of literacy for both language and music concurrently. In an ethnolinguistic culture with a fuzzy boundary between language and music, a culture where drums can speak, it is not surprising that a musical model was (and is) used to fill a void in the Western concept of language. Together, Crowther's innovation of marking tone along with aspects of Glover's Tonic Sol-Fa method have made lexical tone more comprehensible for students of the Yorùbá language, from Nigerian secondary school students to Africanist scholars.
Sarah Ann Glover and the Tonic Sol-Fa method
Solmization originated during the rise of musical literacy in the Carolingian era, as musical literacy necessitated a way to teach it. In around 795 CE, Charlemagne wrote to the Abbot of Fulda requesting that the bishoprics and monasteries ‘undertake the task of teaching’ because ‘without knowledge it is impossible to do good’ (Treitler 1984: 135). Two hundred years later, Italian music theorist Guido d'Arezzo (c.990–1030) introduced a more precise staff notation along with solmization syllables to make it comprehensible – ut-re-mi-fa-sol-la, forming a hexachord. The growth in monastic education, including a literate tradition of hymn-singing, explains Guido's choice of an existing musical text and tune, ‘Ut queant laxis’ (see Figure 1), as the basis for his hexachordal (six-note) solmization system (Boynton 2003: 100).
In Guido's time, a plurality of notation methods existed with the common goal of transmitting texts with efficiency and fidelity (Treitler 1984: 139, 207). Guido's staff notation and solmization were widely adopted throughout medieval Christian Europe. Staff notation continued to evolve, but the hexachordal system continued to be used for centuries. On the other side of the continent, English choristers at cathedrals and in the Chapel Royal learned to recite the Guidonian gamut[1] forwards and backwards well into the nineteenth century (Rainbow 1967: 14–15). In this same period, a related but modified solmization system was introduced by a Sunday school teacher in Norwich, England.
Sarah Ann Glover (1785–1867) had a conviction that teaching should emphasize practice and that theory should be derived from practice, not the other way around (Bennett 1984: 28). By reducing complexity – ‘inadequate representation of the scale on the staff’, ‘non-accidental sharps and flats’, and the ‘contrivance of clefs’ – Glover could implement practice swiftly (Glover 1982: 16–17). Glover's system anglicized Guido's syllables and added a syllable (te) for the seventh degree of the diatonic scale, which had become conventional by the late seventeenth century. Do-re-me-fah-sole-lah-te (Glover's spelling) were used to sing the major scale and the same syllables starting from lah were used for the minor scale. Like Guido, her method included both a notation and a solmization, but with an even more direct connection between the two. In Glover's notation, the pitches are represented by the first letter of the syllable and accompanied by a rhythmic tablature of dots and lines (see Figure 2).
Glover's attempts to apply her Sunday School ‘experiments’ in day schools were met with resistance; some believed that teaching music at charity schools might be detrimental to the public good. It was the influence of John Curwen (1816–80) that overrode these concerns. Curwen transformed Glover's Sol-Fa method into a movement aligned with the temperance cause, ensuring that no one could associate Sol-Fa singing with societal ills (Bennett 1984: 29). By the 1850s, the ‘Sol-Faists’ were a community of thousands (see Curwen 1880). The Sol-Fa movement precipitated a musical renaissance in Victorian England but struggled to gain traction with the academic establishment. John Hullah (1812–84), musical inspector for the UK's Council of Education, campaigned against Tonic Sol-Fa in the schools and advocated for a fixed-do system as was taught in much of continental Europe at the time (Leinster-Mackay 1981: 165).
Moveable-do systems[2] have since been adopted widely in English and American music education. However, Glover – and, to a lesser extent, Curwen – has largely been neglected in histories of the period. Despite its tangible impact on amateur singing culture in England, music historian George Grove omitted the Sol-Fa movement in his telling of the Victorian Musical Renaissance (Olwage 2010: 193). In American music education, Zoltán Kodály is given credit for a system he adapted. Sometimes Curwen is cited as his source, but rarely Glover. Bernarr Rainbow drew attention to the efforts of Glover and Curwen in The Land Without Music (1967). In several recent books and articles,[4] the neglect of Glover has begun to be corrected, even if her contribution is not yet fully recognized. It is doubtful that the Western tonal music system would have had the same impact on music throughout the world without Glover's Tonic Sol-Fa methods. And, perhaps, Christian evangelism would not have been so successful during the colonial era.
In the late nineteenth century, the Tonic Sol-Fa Society joined forces with the missionary movement, similar to the earlier alliance with the temperance movement in the 1840s. In 1857, Curwen began to publish testimonials and reports from stations in Barbados, China, India and throughout the African continent, including one from Old Calabar in present-day Nigeria. A missionary to China, John Fryer, reported that Tonic Sol-Fa ‘formed a bond of union between teachers and pupils’ (McGuire 2009: 130). Although Sol-Fa had been part of missionary activities before, the first missionary trained directly by Curwen was Robert Toy, who was sent to Madagascar in 1862 by the London Missionary Society (LMS) (Southcott 2004: 3). In Madagascar and elsewhere, learning Tonic Sol-Fa, along with European dress and language, became a rite of passage to conversion and an important symbol of control recognized by the colonizers (McGuire 2009: 128). Choral singing became a method of disciplining colonized peoples, demonstrating that they could be civilized through the work of missionaries (Olwage 2005). An 1890s tour by a black choir from the Cape Colony was designed to impress English audiences, convincing them that non-Europeans could be civilized through missionary education including a healthy dose of Sol-Fa singing. Halfway through the performance, the choir changed from indigenous dress to Victorian clothing, intended and likely received as a ‘serious demonstration’ of progress (Erlmann 1999: 128).
By the early twentieth century, the Sol-Fa method was present at English- and American-run missions throughout the African continent, often with a diet of simple hymns such as those composed by American evangelist Ira Sankey (1840–1908). Germany's loss of World War I led to the internment of German missionaries and the takeover of their missions by British and American missionaries (Busse Berger 2013: 482). As a result, many former German missions (which likely used fixed do) now taught Tonic Sol-Fa with moveable do.
The 1950s and 1960s independence movement largely brought the Protestant missionary era to an end in Africa. Now, many Africans are evangelists of both Christianity and Tonic Sol-Fa. The method also continues to thrive in Asia and the Pacific, in locales as far as Fiji, where it is a mainstay of community singing and a highly effective ‘alternative to staff notation’ (Stevens 2003). In Nigeria in particular, Glover's solmization and notation are well preserved in contemporary musical practice, whereas the method is filtered by Kodály in Europe and the United States, and Glover's notation is absent. Sarah Ann Glover's invigoration of moveable do has had a profound impact on Western music education, but even more directly on music practice in former colonies. Although Glover surely did not know it, the use of moveable do made the Tonic Sol-Fa method more sympathetic to the intertwined musical and linguistic tone systems of Niger-Congo cultures. Fixed do, in which each solmization syllable is tied to an absolute pitch, is dependent on the availability of instruments with standardized tuning and requires a significant amount of formal training before it can be used. Moveable do, as a relative pitch system, is not and does not. To illustrate the lack of, and even resistance to, standard tuning in indigenous African music, Kavyu offers the anecdote of a 1971 cultural workshop at the Institute of African Studies in Nairobi, Kenya. A group of lyre players from across Kenya were gathered and asked to tune their lyres to the same pitch level. While it may have satisfied the organizers for lyrists of many different ethnicities to play in concert, it was disorienting to the musicians (Kavyu 1977: 31). Èkwúèmé suggests that diatonicism (scale systems similar to the Western major and minor scale) existed in Africa before the arrival of missionaries (Èkwúèmé 1974: 52). However, standardized tuning certainly did not.
Moveable do is much more adaptable to different instruments and voice ranges and, in its reliance on relative pitch recognition, is perceptually similar to the contrastive lexical tone of Niger-Congo languages.
Samuel Àjàyí Crowther and African tone systems
For centuries, the ports at Badagry and Ouidah in the Bight of Benin were busy with human trafficking to Brazil, Cuba, Haiti and elsewhere. Many of the people traded at these ports belonged to what is now known as the Yorùbá ethnic group. As a result, aspects of Yorùbá culture, including the language, religion, food and music, are now found throughout the Americas as well as in their ancestral home in present-day Nigeria and Benin. The act abolishing British involvement in the slave trade was passed by the British parliament in 1807. Soon afterwards, the Royal Navy began intercepting outbound slave ships along the West African coast. On one of these ships was a young man, Àjàyí, who would be reborn as Samuel Crowther (1809–91) in the missionary community in Freetown, Sierra Leone.
Samuel Crowther's career was unique. A kidnapped slave in 1821, a rescued slave in 1822, a mission school boy in 1823, a baptized Christian in 1825, a college student in 1826, a teacher in 1828, a clergyman in 1843, a missionary to the country whence he had been stolen in 1845, the founder of a new mission in 1857, the first negro bishop in 1864 – where is the parallel to such a life? (Page 1908: vi)
Crowther received his doctorate in divinity from the University of Oxford in 1864. His identity as a ‘Black Bishop’ piqued the interest of evangelical communities in London and New York, where multiple biographies were published (see Figure 3). His legacy in Nigeria is closely tied to the publishing of the first Yorùbá primer and vocabulary in 1843 and a complete orthography, grammar and dictionary in 1852. Crowther went on to publish primers and vocabularies for Igbo (1857) and Nupe (1860 and 1864).
Like Crowther, many of those freed from slave ships and educated in Freetown eventually returned to their homelands. In the Niger territory, they were known as Sàrós (after Sierra Leone) and occupied esteemed, but ultimately restricted, positions within the colonial system. As bureaucrats, clergy and educationists, the repatriates utilized dual identities of being Africans and Western-educated Christians to mediate between the colonists and the colonized. Crowther's development of a Yorùbá orthography aided the colonization and evangelism of Yorùbá-speaking areas, forever changing the society. At the same time, the early adoption and sustained use of a Romanized orthography have contributed to an ethnolinguistic culture that continues to be robust in the twenty-first century while the majority of African languages are in decline.
Crowther's first task was to create a pan-Yorùbá identity by collapsing the dialects:
Among the purest Yorùbá speakers, there are no less than three modes of pronouncing some words; namely, the Capital – or Ọ̀yọ́ – pronunciation, and two Provincial dialects – the Ibapa and the Ibollo. People from all parts of Yorùbá are now together in the Colony of Sierra Leone, and each party contends for the superiority of its mode of utterance. (Crowther 1852: 1)
Because Crowther recognized Ọ̀yọ́ as the capital, Ọ̀yọ́ dialect became the model for Standard Yorùbá (SY). The Church Missionary Society (CMS) became the arbiter of Yorùbá literacy. In 1875, the CMS convened a conference to standardize the Romanized orthography, a disappointment to Muslim Yorùbá scholars then and now (Ogunbiyi 2003: 77). Despite this, Yorùbá people are united by their language and are not prone to the intra-ethnic religious conflict consuming northern Nigeria in the twenty-first century.
The success of the civilizing mission among the Yorùbá was lauded by general reports from the missionary field:
Yariba is the every-day Language of teaching and preaching of a large Mission at Lagos and Abeokúta. The whole Bible is in the course of publication … The Yariba people are full of energy, and from their ranks several men have already sprung up of high attainments, and we may look forward to this Language being one of the most important in Western Africa. (Cust 1883: 207)
While Cust was a bit off on the spelling, his prediction about Yorùbá’s virility in the future was correct. English is the national language of Nigeria, but Yorùbá people continue to take pride in their language and cultivate a pan-Yorùbá identity around it, ignoring smaller differences in dialect and custom, and sometimes major differences in religious belief. Elsewhere in the book, Cust also noted a peculiar aspect of Crowther's approach to orthography: Crowther insisted on the importance of tones in both the language and the orthography (Cust 1883: 229). Cust's survey of the missionary field makes it clear that Crowther's emphasis on tone was novel at the time. Analogies between tone-language speech and musical melody are now known to Western scholars, but at the time Crowther's description must have struck readers as extraordinary.
The Yorùbá language is very musical: certain marks to distinguish the tones thus become indispensable. Two accents have therefore been used to point out this distinction, i.e. not to imply that a particular stress is to be laid on the accentuated syllable, but to mark a variety of intonation. The accents thus employed are, the acute [(´)], indicating elevation of tone … [and the] grave [(`)], indicating depression of tone. (Crowther 1852: 3)
When Crowther developed the orthography in the 1840s, there was no precedent for accommodating the strong presence of lexical tone found in many Niger-Congo languages. Over 40 per cent of the disyllables depend on tone to be differentiated from other entries in the concise and widely available Yorùbá dictionary published by Ìbàdàn University Press. In a more thorough dictionary, such as Abraham's (1962), I would expect this percentage to be higher. There is a comparison to lexical tone in Chinese in the preface to the 1852 edition. However, the Romanized script for Chinese with diacritics, Pīnyīn, was not developed until the twentieth century. It is not made explicit, but it is likely that the polytonic orthography for ancient and medieval Greek pitch accents – (´) for high, (`) for low – was the closest model available to Crowther.
Crowther's prescription for marking tones has been sustained and developed in Ayo Bamgboṣe's orthography (first published in 1966), which is the basis for the Academy of African Languages’ (ACALAN) Yorùbá writing manual. However, tone-marking remains alternately bewildering and irritating to some fluent speakers and a recent dictionary advocates ‘eliminating the tonal signs’ (Fakinlede 2003: 9). The concept of lexical tone was almost inconceivable to missionaries who spoke only European languages without contrastive lexical tone. The missionaries approached orthography for the hundreds of languages in Africa from a very different perspective. Crowther had a distinct advantage of being a first-language (L1) speaker of Yorùbá who had immersive training in English. Of the European missionary linguists, Johann Gottlieb Christaller (1827–95) of Basel was unique in his careful consideration of West African tone systems (Bearth 1998: 85). Written some years later and describing Akan in the Gold Coast (Ghana), Christaller's description of Niger-Congo tonology is largely consistent with Crowther's and likely influenced by it.
The great variety of vowels is increased by different tones, every syllable of every word having its own relative tone, equal with or different from the neighbouring syllables, either high, or low, or middle, sometimes in successive degrees. (Christaller 1875: xviii)
Some sceptics still doubt the necessity of marking tone for a language such as Yorùbá or Akan. However, if one accepts that marking tone is necessary, the issue then becomes efficiency: should tone always be marked or just when it provides lexical contrast? There are many words of the same class – noun and noun, verb and verb – in which vowels or consonants are allophonic (non-contrastive) while tone is phonemic (contrastive). Are these the only words in which tones should be marked? What about the syllables themselves? Should all syllables within a word carry diacritics? While these are no longer controversies among modern linguists, Crowther and Christaller grappled with these issues. Crowther's method was to mark only the first high or low tone in a sequence of consecutive high or low tones, so that in a four-syllable word of mid-high-high-high tones only the second syllable would be marked.
In the examples Crowther offers (see Table 1), leaving repeated high or low tones unmarked is more efficient because fewer diacritics are necessary. However, his tone-marking cannot indicate a return to the unmarked (mid or neutral) level after a marked (high or low) tone. This is problematic for the tonally contrastive minimal pair in Table 2.
Two words with the same segments, [mimɔ], are written in exactly the same way using Crowther's orthography, even though they are pronounced differently and have distinct meanings. In Bamgboṣe's orthography (first published in 1966), each occurrence of high or low is marked, solving the problem.
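The contrast between the two marking conventions can be sketched in a few lines of Python. This is only an illustration under stated assumptions: tones are encoded as the letters H, M and L, the function names `crowther_marks` and `full_marks` are hypothetical, a hyphen stands for a syllable written without a diacritic, and the sequences HH and HM are stand-ins chosen to show the ambiguity rather than the actual tones of the pair in Table 2.

```python
def crowther_marks(tones):
    """Mark only the first tone in each run of consecutive H or L tones.

    Assumed encoding: 'H' high, 'M' mid, 'L' low; '-' means no diacritic.
    """
    marks = []
    previous = None
    for tone in tones:
        # A diacritic is written only when a new run of H or L begins.
        starts_run = tone in ("H", "L") and tone != previous
        marks.append(tone if starts_run else "-")
        previous = tone
    return "".join(marks)


def full_marks(tones):
    """Mark every H or L syllable; only mid syllables stay bare."""
    return "".join(tone if tone in ("H", "L") else "-" for tone in tones)


# The four-syllable mid-high-high-high case: only the second syllable is marked.
print(crowther_marks("MHHH"))  # -H--

# Hypothetical pair: high-high and high-mid collapse under the first-of-run rule.
print(crowther_marks("HH"), crowther_marks("HM"))  # H- H-
print(full_marks("HH"), full_marks("HM"))          # HH H-
```

Under the first-of-run rule, a bare syllable after a marked one can be read either as a continuation of the marked tone or as a return to mid, which is the gap that marking every high and low tone closes.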
Christaller's system largely followed Crowther's in terms of leaving consecutive equal tones unmarked but introduced some complex rules for step tones and short vowels (Christaller 1875: 15). While Reverend Christaller had a much greater sensitivity to tone than the vast majority of missionary linguists, he adopted a laissez-faire attitude about when tone marks should be applied:
In common writing and in books for the people we mark the tone only in cases of ambiguity; but in grammar and dictionary, and for the study of the language by foreigners, an accurate designation of the tones and the stress is necessary. (Christaller 1875: 15)
However, in the preface, Christaller admits:
[The tone-marks] are also wanting on many words of this dictionary, either from uncertainty or oversight, or because the tones may be known from analogy or simple rules. (Christaller 1875: xxv)
Both Crowther's and Christaller's early treatments of tone had inadequacies that have been corrected by recent orthographies, but their methods continued to be used largely intact for nearly a century. In the late twentieth century, formal linguistics grew immensely, and, along with it, the study of tone. Much like the misattribution of Sarah Ann Glover's Tonic Sol-Fa method, linguists have often cited Christaller as the progenitor of tone-marking in African languages, not Crowther.
What is the difference between tone and tune?
Ignorance among missionaries about the importance of tone is manifested in metric translations of hymn texts into African languages. The singing of the translations to standard tunes has produced hymns of utter nonsense across West Africa (Parrinder 1956: 37) as well as in Asia and the Americas. For a well-documented language such as Yorùbá, with a published vocabulary available since 1843, those who undertook these translations had the resources available to be more sensitive to tone. One may speculate that they chose to ignore tone. The response among Yorùbá Anglicans was to add ‘native airs’ to the hymnbook, explained by Reverend J. J. Ransome-Kuti (grandfather of Fẹlá Kuti) in his preface:
No [hymn] tune … can possibly express the meaning of words in a ‘tonic’ language such as Yorùbá, so well as one written specially for the words. (Ìwẹ́ Ọrin Mímọ́ (Book of Holy Songs) 1923)
The native airs, though cognizant of tone, were decidedly Westernized. Comparative musicologist Erich von Hornbostel suggested that, in lieu of European hymns or hymns composed in a European fashion, African converts be encouraged to ‘sing and play after their own natural manner’ (1928: 62). This musical practice was pioneered in the Africanized Aládùúrá churches early on, but it was not until the postcolonial era that a more ‘natural manner’ spread to Catholic and Protestant churches, where European hymns in both foreign and indigenous languages are sung alongside praise choruses to this day. Agu indicates that the true fault of the hymns is not the linguistic defects – people have come to recognize the intended meaning: the problem is that one cannot dance to them (1992: 14). However, I have seen this shortcoming of hymns overcome on many occasions.
Linguists and ethnomusicologists have explored correspondence between speech tones and musical tunes for nearly a century. The first major article on the subject, Herzog's ‘Speech-melody and primitive music’, states that a strict representation of speech melody by musical melody is not implied (1934: 466). However, if a melody ascends in pitch where a spoken contour would descend, it is a problem for lyric intelligibility. The Yorùbá text ‘Wá s’ádúrà ŏrọ̀’ means ‘Come to morning prayer’ with the proper tone, but when the text is sung to the melody in Figure 4 the last word sounds like ‘ŏrọ’ (crippled).
The Yorùbá text ‘Nkọ̀ jẹ́ gbẹ́kẹ̀lẹ́ ohun kan’ means ‘I will not ever hope in one thing’ with the proper tone. However, when it is sung to the melody in Figure 5 it sounds more like ‘I will not eat hope in one voice’.
Because of the potential for tone–tune mismatch, it is ideal to compose a melody to a text. Beyond one-to-one correspondence, there is a more general incongruence between the contours of Niger-Congo languages and the aesthetics of European music. ‘Nkọ̀ jẹ́ gbẹ́kẹ̀lẹ́ ohun kan’ has a tone sequence of MLHHLHMMM, including no fewer than four changes of direction. The corresponding melody of ‘Solid Rock’ contains only one change of direction, forming a melodic arch ascending for three beats then descending for three beats. A melody composed based on the text would be more angular, with more frequent changes of direction.
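The count of direction changes can be checked with a short Python sketch. The numeric heights assigned to L, M and H, the function name `direction_changes`, and the seven-step arch used as a stand-in for an arch-shaped melody are all assumptions made for illustration; the arch is not a transcription of ‘Solid Rock’.

```python
# Assumed relative heights for the three tone levels.
TONE_HEIGHT = {"L": 0, "M": 1, "H": 2}


def direction_changes(heights):
    """Count reversals between rising and falling motion.

    Repeated heights are skipped, so a level stretch does not reset
    or add to the count.
    """
    changes = 0
    last_direction = 0  # +1 rising, -1 falling, 0 before any movement
    for previous, current in zip(heights, heights[1:]):
        step = current - previous
        if step == 0:
            continue
        direction = 1 if step > 0 else -1
        if last_direction != 0 and direction != last_direction:
            changes += 1
        last_direction = direction
    return changes


speech = [TONE_HEIGHT[tone] for tone in "MLHHLHMMM"]
arch = [0, 1, 2, 3, 2, 1, 0]  # hypothetical rise-then-fall contour

print(direction_changes(speech))  # 4
print(direction_changes(arch))    # 1
```

On this encoding the spoken contour reverses four times while a simple arch reverses once, which is the mismatch in angularity described above.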
Indeed, the general angularity of Niger-Congo tone-language speech is reflected in the melodic character of the music. Kolinski describes a song from Dahomey (see Figure 6), a neighbouring kingdom to Ọ̀yọ́, as ‘bold and ragged’ (1965: 116). Kolinski's Western-acculturated ears underlie his pejorative assessment, since they are accustomed to small intervals between pitches (steps and skips), which are fairly continuous in direction, and less frequent large intervals (leaps) and changes in direction. Different language typologies correspond to distinctive musical aesthetics.
If one wonders whether the analogy between music and language in Yorùbá culture originated with Crowther's comment in his preface to the 1852 orthography – ‘the Yorùbá language is very musical’ – the answer is most likely no. Speech surrogate instruments that serve both as signal and entertainment are found in many tone-language cultures in Africa, from the iconic Yorùbá dùndún (double-membrane hourglass-shaped talking drums) to sets of pitched idiophones among the Ìgbò (including the ògénè bells and ùdù pot drums). Long into the past, Yorùbá praise poets and dùndún players (who often work in tandem) must have understood a connection between the pitch effects of decreasing and increasing the tension of the leather bands of the drum and manipulating the human voice because they would imitate each other in antiphonal performance. Nigeria's first professor of music, Ígwē Laz Èkwúèmé, was the fifth thesis advisee of celebrated music theorist Allen Forte (Carson Berry 2009: 214). Èkwúèmé rejects missionary-turned-musicologist A. M. Jones’ contention that Africans are unconscious of any organized theory behind their music. He cites examples of ethnolinguistic groups that have concepts similar to a tonal centre around which notes revolve (Èkwúèmé 1974: 35–6). Later in the same article, he comments on scales:
[In] many cases the music of Sub-Saharan Africans is diatonic – that is, uses whole steps and half steps – but may be said to be modal in that the ordering of these whole steps and half steps may not be in keeping with the ordering of the Western European major or minor scale. Scales may be tetratonic, pentatonic, hexatonic, or heptatonic. (Èkwúèmé 1974: 52)
Although anachronistic, Èkwúèmé’s attempts to reconcile the diatonic scale and African pitch systems must have some grain of truth: how else could Tonic Sol-Fa so quickly take hold in so many cultures in sub-Saharan Africa? The responsiveness of each culture to Tonic Sol-Fa likely varied, and may have done so in accordance with the nature of the instruments already present within the culture. Wolfe and Schubert (2010; Schubert and Wolfe 2013) argue that stable pitch in singing is unnatural and is influenced by pitched musical instruments. If this is the case, at least in some African cultures, stable pitch in voice was probably already present in cultures with stable pitch instruments, such as the lamellophone (thumb piano). In others, Tonic Sol-Fa may well have introduced singing on a relatively stable pitch. Èkwúèmé points out that spoken pitch is less definite than sung pitch and that slides and glissandi are present in singing, but states that these ‘should be ignored in an attempt to determine a scale’ (Èkwúèmé 1974: 52). In Èkwúèmé’s Ìgbò language, speech tones and the pitches of speech-surrogate instruments tend to be discrete (delineated and stable pitch). Thus, the assumption of Tonic Sol-Fa – people sing on a level pitch – is compatible. However, in Yorùbá, sloped pitches (falling or rising contour) are found within a vowel segment and the instruments also produce glissandi. Yet, Tonic Sol-Fa, including the notation, was heartily adopted and adapted into Yorùbá Christian culture and is now a common method for choral singing. While the portamenti and glissandi of Yorùbá tones and talking drums are still extant, the solmization syllables do-re-mi have come to inform conceptions about pitch in unexpected ways.
In 2013, the BBC World Service broadcast a programme on the ‘thriving music and art scene’ in Lagos, Nigeria. During a visit to the Musical Society of Nigeria on Lagos Island, Will Ross interviewed several students in the diploma programme:
[Uche] played a rickety piano, accompanying Alaba, whose hands danced above the xylophone to a number inspired by West African highlife music. At least I thought his name was Alaba until he put me right. ‘Say, “Do-do-mi”,’ he instructed, as he tapped out the notes on the xylophone. ‘That is it now – A-la-ba.’ Although it is not the most convenient thing to carry around, a xylophone is just what is needed to get to grips with the tricky tonal pronunciation of Nigeria's Yorùbá names. A former President, Olusegun Obasanjo, was broken down to: ‘Re-mi-mi-re, re-mi-re-mi: O-lu-se-gun, O-ba-san-jo.’[10]
When the music student Àlàbá (tones: low-low-high) uses the xylophone to teach the lexical tone of his name, he is using the do-re-mi heuristic. Another name with the same tone contour as Àlàbá is Crowther's Yorùbá name: Àjàyí (also low-low-high). What is glossed over by the do-re-mi heuristic is the articulatory detail that the first low tone typically has a falling contour and the high tone following a low tone has a rising contour in fluent speech. However, the concept of the relative pitch heights (if not the pitch trajectories) is preserved. Unsurprisingly, the first reference to both Sol-Fa and Yorùbá tone in the same manuscript is made by a musician, Thomas Ekundayo Phillips, a former organist of Christ Church Cathedral in Lagos, not far from the Musical Society.
Yorùbá is supposed to have only three tones. There are some who go further to assert that these three tones are fixed and can be represented by Do, Me and Soh. These ideas are quite erroneous. The positions of the tones may be principally three, but not only may each of these, especially the medium, be slightly higher or lower, but the speech tones do not strictly follow the three Solfa tones. The system that I propose to use is that of a three-line Staff, with provision made for the use of the space as well as the lines, as in music. (Phillips 1952: 1)
Phillips is also the first, to my knowledge, to point out in published literature that the positions of the tones are not fixed. Acoustic analysis suggests that a wider range than do-re-mi (four semitones) is more characteristic of recorded speech. Do-mi-so (seven semitones) is more similar to the spacing than do-re-mi (Carter-Ényì 2016: 155). To use any solmization syllable suggests that speech tones are stable pitches. Drawing on a staff gives the option of indicating contours by connecting pitch events across the lines and spaces, through portamento and glissando. This is useful because there are circumstances in which tones are stable and others in which tones are sloped. In ‘The assimilated low tone’, Bamgboṣe describes the circumstances under which low tone has level pitch (after a low tone) and is low-falling (after a mid or high) (1966). In his grammar, Bamgboṣe explains that high becomes low-rising after low and mid is low-mid after low (2010: 9). Although capable of indicating these pitch trajectories, Phillips’ suggestion of using the staff is more elaborate than the simple diacritics so many already find cumbersome, and is more appropriate for transcription than a streamlined orthography. The musical staff has been used as Phillips suggested by music researchers, for example in Adégbìtè’s study of oríkì (1978). The comparison of tones to musical pitches is not restricted to musicians; linguists have also contributed to the conflation. Writing on another Nigerian language, Jukun, Welmers states:
The three levels [of Jukun] are discrete throughout the sentence, and so precisely limited that playing them on three notes on a piano (a major triad does very well) does not appreciably distort the pitches of normal speech. (Welmers 1973: 81)
While the tones may be understood if sung on do-mi-so, the phonetic implementation in speech is often more complex than Welmers describes. As Welmers suggests, playing speech tones on a piano does not ‘appreciably distort’ the tones, but it does not capture the full story either. A variable-tension talking drum (such as the dùndún) is much more capable of representing the speech contours. In addition to the sloped contours (rising and falling tones) that Bamgboṣe describes, production and intonational effects during fluent speech such as downstep and high-rising put tone levels in constant flux (Laniran and Clements 2003: 203). Despite the complexity of African tone systems and the manifold ways in which musical pitches are not like speech tones, the do-re-mi folk heuristic has gained traction with the Christianized public and within the academy in Nigeria. A popular text found in street markets and used in secondary schools includes Hausa, Ìgbò and Yorùbá vocabulary (the three major indigenous languages of Nigeria) and was written to ‘generate unity’. Yorùbá is the only language in the text for which tones are marked: ‘accents are used over Yorùbá words to denote their sounds which are doo, ree, mii … ree has no visible tone mark except on nasalized syllables’ (Odetunde 2009: 1). According to the brief biography included in the book, Odetunde is a bit of a renaissance man with an interest in languages but little formal training. A foreign text by a bona fide linguist, Professor Antonia (Yetunde Folarin) Schleicher of Indiana University, also uses the same folk heuristic:
Each unit in this book has a tone exercise to help you learn how these tones are pronounced in different words. You can use the musical notes ‘do, reh, mi’ to help you learn how to pronounce the tones: low tone is ‘do’; mid tone is ‘reh’; high tone is ‘mi’. (Schleicher 2008: xv)
Although Schleicher is targeting a non-Yorùbá audience, schoolteachers and even linguists in Nigeria use the do-re-mi heuristic to describe the tone levels to fluent speakers working on Yorùbá literacy. I have also observed people with Yorùbá literacy use do-re-mi as a mnemonic aid in transcribing speech to text. Do they have experience singing in a choir using the Tonic Sol-Fa method? Most likely; many Nigerians do.
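The heuristic described above can be illustrated with a minimal sketch in Python. It is not drawn from any of the texts cited here. It assumes the standard tone-marking convention (grave accent for low, acute accent for high, no mark for mid) and makes the simplifying assumption that every vowel letter, and every tone-marked m or n, is a tone-bearing unit. The function names and the two spacing tables are hypothetical and chosen only for illustration; the semitone values follow the do-re-mi and do-mi-so spacings discussed earlier.

```python
import unicodedata

GRAVE, ACUTE = "\u0300", "\u0301"  # combining grave (low) and acute (high)
VOWELS = set("aeiou")              # base letters; the under-dot of ẹ and ọ is a separate mark

# Assumed mapping, following the heuristic: low = do, mid = re, high = mi.
SOLFA = {"L": "do", "M": "re", "H": "mi"}
# Semitones above do for the two spacings compared in the text.
DO_RE_MI = {"L": 0, "M": 2, "H": 4}
DO_MI_SO = {"L": 0, "M": 4, "H": 7}


def tone_sequence(word):
    """Return 'L', 'M' or 'H' for each tone-bearing unit in a tone-marked word."""
    groups = []  # pairs of (base letter, set of combining marks attached to it)
    for ch in unicodedata.normalize("NFD", word.lower()):
        if unicodedata.combining(ch) and groups:
            groups[-1][1].add(ch)
        else:
            groups.append((ch, set()))
    tones = []
    for base, marks in groups:
        marked = ACUTE in marks or GRAVE in marks
        # Simplification: vowels always bear tone; m and n only when tone-marked.
        if base in VOWELS or (base in "mn" and marked):
            tones.append("H" if ACUTE in marks else "L" if GRAVE in marks else "M")
    return tones


def to_solfa(word):
    """Spell the tones of a word as solmization syllables."""
    return [SOLFA[t] for t in tone_sequence(word)]


def to_semitones(word, spacing=DO_RE_MI):
    """Give each tone as semitones above do, under the chosen spacing."""
    return [spacing[t] for t in tone_sequence(word)]
```

Under these assumptions, `to_solfa("Yorùbá")` gives `['re', 'do', 'mi']`, and `to_semitones("dùndún", DO_MI_SO)` gives `[0, 7]`. The sketch treats the tones as fixed steps, which is exactly the simplification Phillips objects to; it shows the mnemonic, not the phonetics of fluent speech.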
The current orthography, reflected in Bamgboṣe's grammar (first published in 1966), makes the analogy more appropriate, because the underlying tones are all conceived as discrete (low, mid or high). To manage this, he divides long syllables with contour tones into smaller units: ‘The so-called glides … recognised as additional tones by many scholars … are treated in this system as separate tones occurring on a sequence of two syllables’ (Bamgboṣe 2010: 6). A new Yorùbá grammar by Rutgers University professor Akinlabi is forthcoming. Some years back, in a chapter for laypeople on Yorùbá orthography, he stated:
there are three contrastive tones, a one syllable may have a three way pitch contrast … e.g. ko (H) (build), ko (M) (sing), ko (L) (reject). Therefore tones are like consonants and vowels in Yorùbá, since they distinguish the meanings of words like consonants and vowels do. (Akinlabi 2004: 459–60)
Akinlabi's research has covered complex and novel topics such as under-specification and clitic assimilation of tone, all of which point to the relationality of tone. Yet, for a description for laypeople, Akinlabi's illustration of phonemic tone does not deviate from Crowther's description 150 years earlier. Both used monosyllables to introduce the concept. Despite a century of research, the basic concept of the tonemes as ‘atomic units’ has not changed. Shortly before his death, tonologist Nick Clements questioned whether tone features were motivated at all, reasserting that they ‘do not serve the same functions as segmental features’ (Clements et al. 2011: 3), similar to Lehiste (1970). Akinlabi's example of a monosyllable homophone with three distinct tones, like Crowther's example 150 years earlier, is misleading because it is not clear how an isolated syllable can have contrastive tone, except by using extreme parts of one's voice range. Elsewhere, I provide empirical and experimental evidence that high and low tones spoken in the upper or lower part of a speaker's range are intelligible and that tone levels are perceived syntagmatically, as relative pitches (Carter-Ényì 2016: 156). This is implied in Crowther's and Christaller's early descriptions, using words such as ‘relative’, or ‘elevation’ and ‘depression’ of intonation. From my experiments (ibid.: 147–55), it is clear how two or more syllables may be perceived as having high or low tone levels in relation to each other, but it is unclear how tone may be an isolated (or paradigmatic) unit, perceived through absolute pitch. However, the temptation to use a monosyllable example persists. Segmental features, such as those that combine to create /a/ (+open, +back),Footnote 11 are easily discretized because they are absolute (paradigmatic) features.
However, tone features may be perceived in terms of relative pitch (syntagmatic relationships).Footnote 12 Although the conflation of tone and tune is not without its faults, the folk heuristic of do-re-mi, implying moveable pitch relationships, has advantages over the atomic units low, mid and high.Footnote 13
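The difference between absolute and relative pitch can be made concrete with a small sketch. It is an illustration only: the frequencies are hypothetical values generated from the do-re-mi spacing, not measurements from my recordings, and the function names are invented for this example. It shows that the same low-high-mid sequence produced in two different voice ranges has different absolute frequencies but identical intervals between successive syllables.

```python
import math


def semitone_intervals(freqs_hz):
    """Interval in semitones between each pair of successive syllables."""
    return [12 * math.log2(b / a) for a, b in zip(freqs_hz, freqs_hz[1:])]


def transpose(freqs_hz, semitones):
    """Shift every frequency by the same number of semitones."""
    return [f * 2 ** (semitones / 12) for f in freqs_hz]


# Hypothetical low-high-mid sequence (do, mi, re) built on an arbitrary 110 Hz 'do'.
lower_voice = [110.0 * 2 ** (st / 12) for st in (0, 4, 2)]
# The same sequence one octave higher, standing in for a speaker with a higher range.
higher_voice = transpose(lower_voice, 12)

lower_intervals = semitone_intervals(lower_voice)    # approximately [4.0, -2.0]
higher_intervals = semitone_intervals(higher_voice)  # approximately [4.0, -2.0]
```

No frequency is shared between the two voices, yet the interval pattern is the same. This is the sense in which a moveable do-re-mi models syntagmatic tone relationships better than fixed pitches do.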
For several years, I have worked with colleagues on transcriptions of Yorùbá vocal arts, including poetry and song, with particular attention to accurately recording tone. Studying Yorùbá poetry is challenging because it includes ìjìnlẹ̀ (deep) language that is not only untranslatable but has no synonymy within Yorùbá. Because there are words not in dictionaries, these transcriptions are not only a record of a performance, but a record of the language. I often seek independent opinions on words, phrases or larger sections. Time and again, I have found bilingual speakers prefer to talk about tone as do-re-mi, not low-mid-high. The linguistic terms low-mid-high reflect an association between frequency, how fast or slow the sound wave is vibrating, and height. This conceptualization of pitch is shared with Western music theory, but conceiving of pitch as low or high is not found in all cultures. In Yorùbá culture, tension may be a better descriptor for pitch variance than height. This is suggested by talking-drum (dùndún) performance practice.
In dùndún performance, the lead drummer often engages in a dialogue with a poet-singer and can speak proverbs or common sayings. This is accomplished through a mapping of the tones of speech to pitches played on the drum. Pitch is changed by tightening or loosening the grip of one's hand on a cluster of tension cords connected to the drumhead (see Figure 7a). Additionally, the hip of the drummer is used as a counter-force to press against and the thumb may apply additional pressure to specific tension cords that are gripped by the hand that adjusts the tension (usually the non-dominant hand). The energy for the sound wave is supplied by striking the drumhead with a curved beater (drumstick) in the other hand (usually the dominant hand). A light presence of the hand on the cords produces mid-tone, squeezing them tightly produces high tone, and releasing all tension is low tone. The pitch-control mechanism of the talking drum is remarkably similar to the pitch-control mechanism of the human voice: variable tension. The vocal cords (or folds; see Figure 7b) are pulled tight by the cricothyroid muscle, increasing frequency (pitch height).
Yorùbá speech is full of ideophones – words that symbolize ideas through sonic imagery (like onomatopoeia). Despite the prevalent use of sound symbolism in Yorùbá, òkè, a word meaning ‘on top of’ or ‘up’, has low tones. In combination with voice (ohùn), ohùn-òkè means ‘high tone’, but it has low tones. Terms used to describe speech tone in Yorùbá are codified in Bamgboṣe's Yorùbá Metalanguage (1990), and many of them appear in Abraham's 1962 dictionary. The terms found in Abraham (1962) and Bamgboṣe (1990) are not in colloquial use, nor do they appear in Crowther's works. However, Crowther did rely on the pitch-height paradigm by referring to ‘elevation’ and ‘depression’ of intonation, so the modern terms are consistent with Crowther's conceptualization of tone. Most likely, the metalanguage terms (including ohùn-òkè) are later translations of linguistic terms from English into Yorùbá.
In contrast to òkè, a word referring to height, words referring to tension better fulfil expectations of sound symbolism, or an ideophonic quality:
Tight (adj) há, fún, le, mọ́, pinpin.
Tightly (adv) ni lilelile, gaga, daindain, ṣínṣín. (A Dictionary of the Yorùbá Language 1991)
Many of these words have high tone and none of them have low tone. Is tension a better conceptualization of pitch variation than height within Yorùbá culture? Perhaps. In most of the world, unless one is a vocologist, the variable-tension mechanism of the human voice is felt and not seen. Unlike cultures whose instruments rarely use variable tension and rely much more on fixed pitches, Yorùbá culture has exceptional external embodiments of the human voice. The dùndún is both an iconic cultural symbol and a tangible model of the voice. Neither the instrument nor the hands of the player move up and down; instead, a variable-tension mechanism is tightened and loosened. Thus, tight tone may be closer to indigenous concepts of pitch than high tone. Furthermore, voice range can be conceived in terms of age, gender or size instead of height, with the mother, father and child drums. Many discussions during fieldwork in Nigeria and much evidence from the literature on other cultures (see, for example, Seeger 1987) suggest that the pitch-height paradigm manifested by Western science and music notation does not reflect universal concepts of pitch in either music or language.
In the Western classical music tradition, reading staff notation constantly reinforces the idea that pitch goes up and down, but Tonic Sol-Fa notation does not present pitch in this way because there is no staff. It reads from left to right in a chronological stream of syllables (do-re-mi-fa-so-la-ti). Figure 2 shows both staff notation and Glover's Tonic Sol-Fa notation (above and below the staff) for a Yorùbá Christmas carol. Although it appears in Àìná’s score, staff notation is not nearly as widespread a practice as solmization syllables and Tonic Sol-Fa notation. Tonic Sol-Fa notation is preferred for amateur music-making by church and school choirs in southern Nigeria. When singing, one feels the sensation of tightening (engaging the cricothyroid muscle) or thickening (engaging the thyroarytenoid muscle) the vocal folds, which may or may not extend to a metaphorical notion of raising or lowering pitch. The action of singing the melody in Figure 2 is more than superficially analogous to the sequence of tightening and loosening the grip necessary for playing the same melody on the talking drum.Footnote 16 The height paradigm is now present in Nigeria. However, among the Christianized public, including language teachers and even professional musicians (who are very familiar with the musical staff), the do-re-mi heuristic is preferred. Whether one associates pitch change with height or tension, do-re-mi fits. A weakness of the pitch-height concept is that it is used to describe both small- and large-scale pitch relationships as well as vocal and instrumental pitch ranges. Within the field of linguistics, tone and non-Western languages are peripheral, outside the mainstream. So, using height to describe small-scale pitch relationships in non-Western languages is unproblematic. In the public sphere of a tone-language culture, where pitch plays so many roles in both language and music, it becomes much more problematic. 
In my examination of four dictionaries,Footnote 17 it is not clear whether ohùn-òkè means a voice with a high range, using the high part of one's range, or a high tone (which can be quite low within one's range) – all very different meanings. Terms such as ohùn-òkè reflect an ongoing and important effort to develop a metalanguage for Yorùbá in Yorùbá (Adéẹ̀kọ́ 1992). This is an effort that, unfortunately, is not always understood or appreciated by the general public, like many other pursuits within humanities research. My conjecture that adjusting tension is a historic and still vital conceptualization of changing pitch is post hoc. It arises out of a consideration of a wide variety of data, but it does not reflect a line of enquiry I pursued during my fieldwork or that can be substantiated historically. However, the do-re-mi heuristic avoids the extrinsic analogy between pitch and height that does not resonate with all people, opting instead for the cross-cultural currency of the Tonic Sol-Fa movement.Footnote 18 If one chooses to take the heuristic very literally and sing Yorùbá words with the tone levels fixed to do-re-mi, it works; if one uses the heuristic as intended, infusing the small-scale pitch relationships of Tonic Sol-Fa into speech (not singing), it works even better.
It is not clear when the heuristic originated, but this is how I imagine it. Missionary churches were a meeting place of language and music, both Western and indigenous, but indigenous language and music were seen as tools for evangelism, not forms of communication and art that were valuable in themselves. European missionaries were bewildered by the melodic speech of Yorùbá, much as stress-language speakers struggle to learn tone languages now. From the late nineteenth century to the early twentieth century, Yorùbá scholar evangelists (in one or many locations synergistically) creolized their rich knowledge of two very different cultures into a notion that Europeans could understand. There is no evidence to suggest that this originated with Crowther himself, but it certainly arose among those who followed in his footsteps. I imagine a statement both confrontational and cathartic: ‘You know your do-re-mi that you've been evangelizing with? Well, that is how our language is, and that is why your hymns don't make sense!’ Reverend J. J. Ransome-Kuti wrote that no hymn tune can ‘express the words in a tonic language’ in the preface to the Yorùbá-language Anglican hymnal (Ìwẹ́ Ọrin Mímọ́ 1923). Over time, the do-re-mi heuristic became a pedagogical device for Yorùbá instruction in parochial and public schools, and was particularly useful for learning to mark tones in written Yorùbá.Footnote 19 Low-mid-high tone levels are acknowledged in secondary school, but do-re-mi is the mnemonic for learning and writing tone sequences. The standardization of the model in bilingual education reinforced the colloquial use of do-re-mi in talking about tone. While the Yorùbá language is robust, it is competing in a multi-lingual environment of Hausa, Igbo, Pidgin, Arabic, English, French and other languages. Language change is speeding up. I have often heard older speakers correct younger speakers, aghast that they do not know the correct tones. 
As an òyìbó, I have often received more patient coaching because even a little sensitivity to tone on my part is appreciated.
Crowther made one of the most important contributions to the field of linguistics. He made it clear that segmental phonemes (i.e. letters of the alphabet) are not enough to describe the lexicon of all languages. This very important observation questions the primacy of the segmental phoneme and is still not fully appreciated in the linguistics community (see Majid and Levinson 2010). Phonology is ostensibly the study of sounds, but it has mostly been the study of segmental phonemes thus far. Methods for the analysis of other forms of linguistic contrast (within phonology) are not nearly as developed (Leben 2006). Before Crowther, orthographies of African languages were filtered through Western ears and minds, and conformed to Western ways of writing with little or no accommodation for linguistic diversity. In mandating that ‘elevations’ and ‘depressions’ of pitch be marked in the Yorùbá orthography, Crowther made a bold step forward. Unfortunately, his innovation of orthography was attributed to Johann Gottlieb Christaller, who is credited with generating a more holistic and sensitive understanding of African languages (see Bearth 1998). The exclusion of Crowther in the narrative of linguistic theory is partially because of the dominance of Germany within the field at the time (Agwuele 2008), and because Crowther was an African. His fame within Anglophone evangelism led to biographies that lauded his work as a bishop, not as a linguist. Although Christaller's 1875 Twi grammar does not cite Crowther's 1852 Yorùbá orthography, the comparative study of related languages in the introductory notes draws comparisons to Yorùbá no fewer than seven times (Christaller 1875: x–xxiv). While Crowther is celebrated within Yorùbá scholarship, he is not given due credit in the broader field of linguistics.
I propose the Christaller method be redubbed the Crowther method. Tone-marking has evolved considerably, specifically in Bamgboṣe's work with regard to Yorùbá. The minor nuances between Crowther's and Christaller's tone-marking conventions are now insignificant. History often celebrates who got there first, certainly more so than minor innovations. Hopefully we have progressed to the point where the fact that Crowther was an African does not prevent us from giving credit where credit is due. Even more effort is needed (beyond this article) to draw attention to Crowther's contributions, not only to Yorùbá, but to the broader study of African languages.
Sarah Ann Glover's modernization of Guido's solfege was also misattributed in past scholarship, often credited to Curwen and Kodály. Several notable scholarly works (Rainbow 1967; Bennett 1984; McGuire 2009) have already made this correction. This is a good model for the corrective process needed for Crowther, but it is still incomplete. Within American music education, the Tonic Sol-Fa method is still considered a subset of the Kodály method, and thus continued reinforcement within scholarly literature is necessary to acknowledge Glover. This article is a testament to her impact, revealing a trajectory of the Tonic Sol-Fa method that is completely independent from Kodály and extends beyond music into language.
Glover and Crowther both suffered the consequences of intellectual and social dominance by white men, in life and in death, but there is a greater kinship between their innovative approaches. Glover brought music education out of elite cathedrals and conservatories, adapting an antiquated method of solmization to new music and environments. Crowther developed an orthographic approach that accommodated other forms of linguistic contrast, instead of using the same Eurocentric method. Although their physical paths never crossed,Footnote 20 their ideas did, converging in an inter-continental and trans-disciplinary synthesis.
Four main points summarize this article:
1. The presence of speech-surrogate instruments indicates that language and music had a close relationship among Yorùbá-speaking cultures prior to the missionary era.
2. Western perspectives on language and music literacy were introduced simultaneously in many African cultures, with key aspects including: the segmentation of sound into phonemes using the Latin alphabet; and the Tonic Sol-Fa method of solmization and notation.
3. Crowther's innovation of marking tone in standard Yorùbá was later adapted by Christaller to Twi languages and became a standard up until the mid-twentieth century.
4. The do-re-mi heuristic creolizes indigenous knowledge with solmization into a pedagogical tool that is now widely used in secondary and tertiary education.
The do-re-mi heuristic may be seen as a gentle and democratic resistance to the pitch-height paradigm embedded in Western culture and used in formal linguistics: that is, low-mid-high tone levels. And, as Crowther taught us, we should maintain healthy scepticism about the primacy of segmental phonemes and their self-sufficiency.