The interaction of material, social, and individual dimensions of meaning design is clearly exemplified in the early development of the technology of writing. Writing is thought to have developed roughly 5,000 years ago in four different parts of the world: Mesopotamia, China, the Indus River Valley, and Egypt, with independent development considerably later in Mesoamerica.Footnote 1 Because scholarly documentation is most extensive for early writing in ancient Mesopotamia, this chapter will focus on that region.
Although few statements about the beginnings of writing go uncontested, most scholars agree that writing in Mesopotamia began not as a representation of language at all, but as a way of recording administrative activities in the ancient Near East. At the end of the Neolithic period, social conditions were changing rapidly, with rising wealth, labor specialization, increased circulation of goods, and the emergence of an elite. A more complex political and social infrastructure was essential to manage the expanding economy, and a specific need arose to keep durable records of the production, distribution, and storage of goods. The earliest writing, using an interface of soft clay etched with a pointed stick or reed, was well suited to this need.
But clay etchings and the system that organized them did not spring from nowhere. Archaeological evidence suggests they likely developed over millennia through the invention and use of different systems of administrative record-keeping such as clay tokens, standardized containers, and seals. For example, Schmandt-Besserat (1996) argues that writing evolved from an accounting system of clay tokens used as early as 8000 bce.Footnote 2 The earliest tokens, appearing with the beginning of agriculture, were basic geometric forms such as cones, disks, spheres, and cylinders, which Schmandt-Besserat calls ‘plain’ tokens (Figure 5.1).Footnote 3 With the rise of cities about four millennia later, ‘complex’ tokens arrived on the scene, complementing the plain tokens with new geometric shapes: bent coils, parabolas, quadrangles, as well as miniature representations of animals, fruits, furniture, vessels, and tools (Figure 5.2).Footnote 4 Whereas the plain tokens had a smooth surface, with no markings, the complex tokens often bore patterns, notches, and punctuations made with a stylus. Both types of token served the same overall function: to organize and store information about goods produced or transacted.

Figure 5.1 Plain tokens from Tepe Gawra, present-day Iraq, c. 4000 bce

Figure 5.2 Complex tokens from Tello, ancient Girsu, present-day Iraq, c. 3300 bce
As the administrative need arose to archive transaction records, several storage methods were developed. Plain tokens were enclosed in a bulla – a clay envelope the size and shape of a baseball (Figure 5.3) – while complex tokens were often strung on a string attached at both ends to an oblong lump of clay impressed with a seal to endorse the authenticity and legitimacy of the record.Footnote 5 Both methods were presumably designed to impede tampering with records – a particularly important safeguard for agreements to be transacted at a later time.

Figure 5.3 Clay ball and tokens
The problem with storing tokens inside clay balls, however, was that once the tokens were enclosed, they were no longer visible or accessible, so the ball had to be broken if one wanted to verify its contents. The solution was to press the tokens into the soft clay exterior of the envelope just before sealing them inside (Figure 5.4). In that way, the contents could be verified without having to break the hardened clay ball. This storage system effectively made the enclosed tokens obsolete, for the impressions alone provided a sufficient record of transacted goods. And if impressions sufficed, it meant that the hollow clay envelope was no longer needed either – a mere flat surface would be adequate. Thus was born the clay tablet (Figure 5.5).Footnote 6

Figure 5.4 Sealed clay ball impressed with tokens similar to those shown to its right
Figure 5.5 Clay tablet from Susa (Iran) showing an enumeration
In terms of the design process, then, a change in social environment (i.e., greater economic and political complexity in Mesopotamian society) was accompanied by the adaptation of existing accounting technologies to develop new practices with the same basic materials (i.e., clay). But how did these social-technological changes relate to linguistic writing?
This is where the story becomes highly controversial. Archaeological evidence from the ancient city of Uruk (in present-day Iraq) dates the beginning of a pictographic writing system known as proto-cuneiform to around 3200 bce.Footnote 7 Proto-cuneiform included icons representing parts of the body, birds, fish, plants, mountains, stars, and so forth (e.g., distinct pictographic signs for ‘bird’ and ‘barley’). Schmandt-Besserat (1996) argues that some of these pictographic signs depicted complex tokens she had inventoried – in other words, they were signs of signs, and thus constituted the essential leap necessary to allow the creation of written language.
Critics argue, however, that Schmandt-Besserat's classification scheme is flawed, that complex tokens are too loosely defined, and that there is no reason to assume that a given token consistently meant the same thing over many millennia and across the vast geographic range where tokens were found (from Egypt to Iran to Anatolia). Archaeologist Paul Zimansky (1993) suggests it is more likely that various people at various times used clay tokens to represent whatever they (as individuals) wanted them to represent. Stephen Lieberman (1980) adds that because tokens continued to be used alongside written records, we should not think of them as the specific precursor to writing but rather as a parallel system. He takes the fact that the complex tokens were themselves inscribed with marks as further evidence that there may well have existed a sign system that manifested itself both in clay tokens and in clay tablets contemporaneously.
Sumerologist Piotr Michalowski (1993) agrees that many tokens likely developed along with or even later than the first inscribed tablets, and proposes a feedback mechanism “whereby certain tokens were made in the shape of written signs, and the proliferation of symbols reflected the experimentation that was taking place in the first writing system” (p. 997). He argues that while tokens and bullae may well have been resources drawn upon in the invention of writing, they alone were not adequate to account for the development of what was really a completely new semiotic form, whose genesis he describes as follows:
It is clear…that multiple forms of communication and visual means of social control were prevalent in the Near East in the periods directly preceding and during the time of rapid urbanization. Different social contexts provided the impetus for differing vehicles. Seals, potters’ marks, painting and craft ornamentation, tokens, bullae, numerical tablets, and other designs – these must be seen as parallel systems of communication. The Uruk IV tablets must be placed beside them, not as an evolutionary descendant but as a new member of the extended family. The inventor(s) of proto-cuneiform undoubtedly drew upon many pre-existing elements to create the new vehicle.
Following Michalowski in light of the metaphor of design, we can think of tokens, bullae, seals, numerical tablets, standardized containers, potters’ marks, painting, and craft ornamentation as available designs that operate within their own social contexts. In the new overarching social context of nascent civilization (rapid urbanization, economic exchange on a scale that demanded record-keeping of past, present, and future transactions), proto-cuneiform may not have evolved specifically from tokens but rather emerged from the vortex of available designs operating in parallel semiotic systems, drawing elements from each system (Figure 5.6).Footnote 8

Figure 5.6 Available designs in the development of proto-cuneiform tablets
It is important to point out that language in the sense of connected speech was not explicitly in the mix at this stage, since proto-cuneiform texts apparently bore no relation to spoken language (Damerow, 2006, p. 4). The transition from proto-cuneiform to a full-fledged cuneiform writing system capable of representing speech is poorly documented by archaeological findings, but specialists hypothesize that at least five centuries probably passed between the early pictograms and the first connected texts (e.g., legal, religious, commemorative).
What is clear, however, is that two important changes took place between proto-cuneiform and cuneiform writing of the Fara period about 500 years later. One was material, having to do with the technique of writing. The other was functional, having to do with how sounds were linked to graphic signs, first in Sumerian and then other languages.
The language connection: phonetic coding and the rebus principle
Archaeologists believe that the use of true cuneiform writing (as opposed to proto-cuneiform) began during the Early Dynastic I period (c. 2950–2750 bce) in Mesopotamia. Two key changes are considered especially important in this development. The first has to do with the material techniques used in writing, which led to increasingly abstract forms. The second, perhaps enabled by this increased abstraction, is the way written signs came to represent speech sounds.
Because wet clay did not preserve curved lines well, and because it dried quickly in the hot, dry climate, lines drawn with a sharp pointed stylus in proto-cuneiform were replaced by short, straight impressions made with a reed stylus with a triangular tip, explaining the characteristic wedge-shaped marks that give cuneiform its name (see Figure 5.7). With this new wedge-shaped stylus, curved lines became straight strokes (Figure 5.8), circles became squares, and fine details were eliminated, all of which increased writing speed (Gaur, 1984, p. 48). This is the first instance of many that we will encounter of changes in written forms brought about by material factors.

Figure 5.7 Stylus shapes and their respective impressions

Figure 5.8 Breaking up of curved lines with change of stylus
Because early clay tablets were bookkeeping devices, they did not have any discernible syntax and were mostly organized in spatial hierarchies (Damerow, 2006).Footnote 9 For example, names of donors or recipients of a transaction were consistently found below signs indicating the goods transacted, allowing information such as ‘ten sheep (received from) Kurlil’ to be recorded even in the absence of signs to indicate verbs and prepositions (Schmandt-Besserat, 1996, p. 98). While such extreme economy might seem overly ambiguous to the modern reader, it is not entirely unlike today's pared-down text messages, tweets, and emoticons. Nissen (1986) points out that for a long time writing was used “only as a means of producing catchwords for someone who was more or less familiar with the context but needed to be reminded of particular details” (p. 329). Nissen illustrates this point with the example shown in Figure 5.9. While the inscriptions can be interpreted as ‘Two sheep delivered to the temple (or house) of the goddess Inanna,’ they can also be read as ‘Two sheep received from the temple/house of the goddess Inanna.’ Furthermore, because the starlike sign (AN – ‘heaven’) can refer to An, the god of the heavens, or can function as a semantic determinative (marking divinity), it is not clear whether the house or the temple belongs to ‘the goddess Inanna’ or ‘the gods An and Inanna’ (1986, p. 329). Contingency of meaning was thus a feature of written texts right from the start.

Figure 5.9 The problem of relating signs in a proto-cuneiform text fragment
Over a period of centuries, ‘naturalistic’ signs became increasingly abstract, and they continued to change over subsequent millennia as they came to be used not just for Sumerian but also for other languages such as Akkadian, Hittite, Elamite, Hurrian, and Urartian. Consider, for example, the transformation of the signs for the Sumerian words mušen (‘bird’), še (‘barley’), and gu (‘ox’) from pictograms to cuneiform signs (Figure 5.10). The 90-degree left rotation (shown in the third column of Figure 5.10) occurred sometime during the early third millennium, corresponding to the transition from writing in vertical columns (read from right to left) to writing in rows (read left to right).Footnote 10

The second and crucial change, occurring during the Early Dynastic period (c. 2800 bce), was the linkage of Sumerian writing and speech through phonetic coding. This adaptation changed both the structure and the range of application of proto-cuneiform writing. Although proto-cuneiform was almost entirely logographic, meaning that one graphic sign represented a whole word, the system gradually became more logosyllabic, meaning that signs took on phonetic value through the use of the rebus principle. In a rebus, a picture or symbol is used purely for its sound to represent a spoken word or syllable (see Figure 5.11).
Figure 5.11 Example of a rebus in English. May I see you home, my dear? Escort Card, c. 1865, US
In cuneiform writing, the sign for one word was used to represent another word with the same or similar sound.Footnote 11 For example, the sign for the Sumerian word ti (‘arrow’) was also used to designate the near homophone til (‘life’). Rebuses and homophony not only made it possible to write abstract nouns but also afforded the possibility of writing multisyllabic words and names by representing the sounds of each individual syllable. An analogy in English would be using an icon of the sun not only to represent the words sun or son, or the name Sunn, but also the syllable in sundry, lesson, Anderson, and so on.
But homophony has its limitations as well. Whereas context might clarify meaning in face-to-face speech, in the context-reduced situation of writing, a systematic one-to-one sound–symbol correspondence system could lead to frequent ambiguity. Because some Sumerian words had a large number of homophones, they needed to be distinguished in writing. For example, fourteen homophones of gu were represented by distinct visual signs. The first four are presented below (from Walker, 1987, p. 12):

This is similar to how contemporary languages like English and French attribute multiple meanings to the same sound pattern, but differentiate each meaning with a distinct visual sign (e.g., there, their, they're; ver, vers, vert, vair, verre). And this is what allows tongue-in-cheek use of homophony as in the email forwarding tag ‘Sent from my eyePhone.’
To further complicate things, Sumerian was also characterized by homophony's complement, polyphony, meaning that one graphic sign could represent semantically related words that nevertheless had vastly different sounds. For example, the sign for ka ‘mouth’, originally a pictogram of a head with the mouth drawn on it, was also used for words associated with the mouth, such as gu3 ‘voice’, zu ‘tooth’, du11 ‘speak’, and inim ‘word’ (Walker, 1987). Because of homophony and polyphony, the intended meaning of cuneiform signs could often only be determined by context.
A partial solution to the one-to-many correspondence problem was the development of determinatives – signs placed before or after other signs to specify the relevant semantic domain (e.g., geography, divinity, metal, wood, and so on). Perhaps the most common determinative in Sumerian was an, which appears in the tablet in Figure 5.9 above to designate a deity. Other markers were phonetic, to indicate how a sign should be pronounced (akin to our use of st, nd, rd in 1st, 2nd, 3rd in English). Because they ‘determine’ how a stretch of writing should be read, these particles are vaguely analogous to emoticons in electronic texts today, which are often intended to clarify the tone of an utterance.
An important structural change resulting from cuneiform's new function of representing speech was the standardization of sign order and linearity. Whereas early accounts and administrative records used formats in which written signs were not necessarily placed in the order in which they would be spoken, by the middle of the third millennium bce the order of signs usually reflected the syntax of spoken discourse (Veldhuis, 2012, p. 6). This happened at the same time that cuneiform was being used for the first time on a large scale to record lengthy prose and poetic compositions.
This story of the origins of writing still poses intriguing puzzles for scholars. We still do not know the extent to which particular threads of the story interact. As we have seen, for example, two important changes occurred between proto-cuneiform and cuneiform writing: material changes in the technique of writing, and functional changes with the beginnings of phonetic representation. Was there a relationship between these two developments? Here are the facts as we currently know them. First, we have seen that the earliest written signs depicted things in the world. When the wedge-shaped reed stylus was introduced, curved lines became straight impressions, fine details were lost, and signs became more stylized and abstract – a trend that continued over centuries and millennia. Signs lost any resemblance they might have originally had to things in the world. Might it have been this increasing abstraction of signs, grounded in the very materiality of writing practices, that facilitated the development of phonetic coding by distancing signs from their original real-world referents and thus making them more suitable for ‘arbitrary’ and multifunctional use? We will never know. But we do know that as cuneiform writing interfaced more fully with the Sumerian language, the degree of multifunctional use of cuneiform signs increased remarkably. Logograms served additionally as phonograms, and homophony and polyphony led to a complex web of relationships between signs, sounds, and meaning. Certain signs were used as semantic determinatives. Sumerian logograms were loosened from the language and came to be used to write Akkadian and other languages as well. Thus a limited inventory of signs was used to accomplish a maximum of communicative work. 
We can see how this principle still operates today as the Roman alphabet is used to write multiple languages, and when we use letters and numbers to represent words or parts of words, as when we write CU l8r for ‘see you later’ in a text message.
To round out the story, cuneiform became one of the most successful writing systems in human history. Over a period of three millennia, cuneiform writing served multiple needs, purposes, cultures, and languages. Despite its purely administrative origins, cuneiform writing came to preserve and disseminate literature (e.g., the Sumerian/Akkadian Epic of Gilgamesh), law (e.g., the Code of Hammurabi), religion (prayers, hymns, omens, divinations), and scholarship in fields such as astronomy, mathematics, and medicine.
Cuneiform and design
The above description has been somewhat detailed because the specific processes involved in establishing early Mesopotamian writing 5,000 years ago – the use of tokens, seals, tablets, pictograms, and abstract signs that were re-worked in relation to one another – are connected to general processes that have been reiterated throughout the history of literacy, and have direct relevance to our understanding of the relationship of language, literacy, and technology today.
We have seen that writing in Mesopotamia did not begin as a representation of speech but rather was born out of administrative needs in an increasingly complex society, to allow records of transactions to be made and preserved. What thus began as an interface to administrative bureaucracy was subsequently adapted to become a visual interface to language. This sequence is important, as it reminds us that writing is first and foremost a social phenomenon, and that its formal linguistic features follow from its social function as well as material constraints. In previous chapters we have seen this idea play out in diverse examples from handwriting to Greeklish, texting, online chat, and Facebook, and we will see it again in the next chapter when we consider new conventions that developed with the printing press. It will also help us to think about educational issues related to literacy in Part III. In writing classrooms today, for example, it all too often seems that writing is made to be about rules of standard grammar, spelling, and punctuation but without any real exploration of the social and pragmatic underpinnings of those conventions. The case study of cuneiform affirms the importance of conventions, but also highlights their social basis, which is what gives them meaning.
Material dimensions of literacy are also clearly illustrated in the case of cuneiform writing. As we saw in Chapter 2, the physical medium used has important implications for the form of the script, the size of texts, and their relative permanence. The medium of clay and reed stylus lent itself to short, straight impressions rather than round drawn lines. Because most tablets were small enough to fit in a scribe's palm, texts were generally short and required economical use of space. To date, clay has proved to be the most durable of all materials used for writing – and its durability was only increased by fires!
Another important point that we can take from the story of cuneiform is that writing co-evolved with Mesopotamia's increasing social, economic, and cultural complexity, but it was not its cause. Technology is often considered an autonomous force that brings about advances in society, but the cases of innovations such as cuneiform writing, the telephone, the printing press, and the computer show us that it is not technology per se, but the interaction of technologies with social life, in the context of particular social conditions, that gives rise to broad social change. As functional needs change, so do material forms of communication, and these changes in turn open up possibilities for new functions, and so on.
The story of cuneiform also reminds us that the repurposing of existing resources depends on the ability to view familiar signs in a new way, drawing on repertoires of symbol manipulation. That ability is facilitated by the abstraction of signs. As users of an alphabet whose origins go back to the Phoenicians, writers of English no more think of an ox head when they write the letter A than they think of a little snake when they write the letter N. And no doubt the Phoenicians didn't either. The letters A and N make meaning when positioned in relation to other letters, but have no intrinsic meaning as letter forms.Footnote 12 The leap made by the Sumerians was to make signs abstract, which was at least partly due to material factors in using clay and stylus. Abstraction allowed signs to designate other signs (e.g., impressions on a bulla of tokens contained within it, which were in turn signs of categories of objects in the world; signs inscribed in clay being used to designate spoken words as well as functions that had no spoken equivalent, such as determinatives). The fact that a sign might refer to a notion of an object in the world, or it might refer to a language sound, or it might be used as a semantic determinative also suggests that extralinguistic context must have played an important role in the Sumerians’ interpretations of texts.
Another significant point illustrated by the case of cuneiform is that technologies of communication don't simply replace one another in a neat succession. Rather, they overlap and inform one another, creating a synergistic panoply of resources for record-keeping, for authenticating, for making and communicating meaning – all operating within a sociocultural ecology (as described in the Introduction and Chapter 3). When writing appeared, it supplanted neither token use nor seals, and it certainly did not supplant the use of speech. Nor did writing likely have just one precursor. As Lieberman, Michalowski, and others have argued, writing emerged from a matrix of existing forms (i.e., available designs) coming from different, but interacting, semiotic systems.Footnote 13 Furthermore, the technology of cuneiform was itself an ‘available design’ in the sense that it went on to influence the development of other scripts, such as Ugaritic and Old Persian. Gaur (2000) likens this (re)invention process to the turning of a kaleidoscope: “at every turn new patterns emerge but the basic components from which these patterns are made remain the same” (p. 6).
Finally, the case of cuneiform illustrates that writing systems are not static but respond dynamically to the ecologies in which they are used: they are variable in extension and adapt to the needs of the reading culture to which they belong. Thus whereas Old Assyrian merchants needed only about one hundred cuneiform signs, Old Babylonian scholars recording the Sumerian literary tradition used hundreds more, many of which had several distinct values (Veldhuis, 2012, p. 8). Yet they were working with a system that had a recognizable quality of ‘sameness’ to it.Footnote 14 By virtue of interfacing with fourteen languages other than Sumerian, cuneiform writing also disabuses us of the notion that writing systems ‘represent’ languages in any essential way. Rather, writing systems have the potential to interface with multiple languages, and are often adopted for political or cultural reasons rather than for linguistic ones.
In the history of literacy, the printing press is usually described as the second quantum leap forward after the invention of writing itself. The standard narrative highlights Johannes Gutenberg's invention of the printing press in the mid-fifteenth century in Mainz, Germany. Because books no longer had to be copied individually by hand and could be mechanically mass produced, the press made it possible for many different people, physically dispersed, to view the same texts, images, maps, or diagrams simultaneously (Eisenstein, 1980, p. 53). This meant that texts, literacy, and knowledge could be much more widely disseminated, and that power could be redistributed from a small priestly and scholarly elite to a much broader public. The standard narrative has it that printing was the catalyst that made possible the Reformation, the Renaissance, and the Enlightenment, along with a wide range of -isms, including individualism, scientism, rationalism, and nationalism (Eisenstein, 1980; McLuhan, 1962).Footnote 1
This standard narrative is, however, technocentric and Eurocentric, focusing on how a material tool revolutionized European society. The printing press as an invented artifact did not constitute a quantum leap forward; rather, it was the felicitous conjuncture of the printing press with just the right social, material, and economic conditions in Europe that made widespread social change possible.Footnote 2
Indeed, in some respects the printing press might even be considered more reactionary than revolutionary (Pattison, 1982). Gutenberg did not introduce a new form of storing narrative and information, but mechanized the production of a familiar form.Footnote 3 For example, the earliest printed books imitated scribal manuscripts down to the finest details, incorporating the same idiosyncratic letterforms, ligatures, and abbreviations. They retained the same page layout, and even left blank spaces for artists to add elaborate initial letters by hand and to illuminate the texts individually, just as they had done with manuscripts.Footnote 4
In theory, printed texts could eliminate errors through careful editing and typesetting. However, early printed versions actually increased distortions and corruptions of texts, and when errors were produced in print, they were disseminated far more widely than they would have been with manuscript technology.Footnote 5 Besides errors that arose in the process of copy-reading or typesetting, the need to print pages out of sequence on folded sheets of paper also contributed to errors. In his 1492 treatise De laude scriptorum (In Praise of Scribes; Trithemius, 1974), Johannes Trithemius argued that scribes were more careful than printers and that parchment was a far more durable medium than paper.Footnote 6
Nor was the technology itself entirely new. Ceramic movable type had been invented four centuries before Gutenberg in China, and even metal movable type had been developed in Korea two centuries before Gutenberg's time.Footnote 7 Printing was also known in the Muslim world long before being ‘invented’ in Europe.Footnote 8 The technology of the Gutenberg press was drawn from wine presses and cloth presses, which had come to Germany from the Romans. Gutenberg was a goldsmith by trade, and his true invention was to use brass casts to make unlimited uniform copies of letter sorts by pouring molten lead into the casts. His other key innovation was to develop an oil-based ink that would adhere to the metal type, since the traditional water-based ink used in woodblock printing would not stick. Here, he was inspired by Flemish painters, who mixed pigment into a linseed-oil varnish (Steinberg, 1974, p. 25). Gutenberg's genius, then, was in bringing together tools and resources (i.e., available designs) from different spheres of activity – goldsmithery, bookmaking, wine making, painting – and synthesizing them in a novel way to satisfy a rapidly growing need for more efficient production of texts (Figure 6.1).

Figure 6.1 Available designs in the development of the printing press
Print and society
If the technology existed across the Asian continent long before it did in Europe, why didn't printing become widespread there first? Essentially because of different social conditions. In the case of China, printing was under the strict control of its rulers, who limited the number of print copies and restricted their dissemination to an elite. Furthermore, unlike Europe, where printing was immediately recognized as a profit-producing industry, China did not have a capitalist economy. In the Muslim world, printing encountered powerful social resistance.Footnote 9 The reasons were partly religious, partly economic, and partly technical. With regard to religion, only the handwritten word was considered sacred. As Ogier Ghiselin de Busbecq, the European ambassador to the Ottoman Empire, wrote from Istanbul in 1560: “the scriptures, their holy letters, once printed would cease to be scriptures” (Káldy-Nagy, 1974, p. 203). Furthermore, the practice of cleaning typesettings with hog-bristle brushes made it all the more inconceivable to print the name of Allah in type (ibid., p. 203).Footnote 10 Another reason was economic: printing would supposedly have put thousands of manuscript copyists out of work.Footnote 11 A third reason for the late adoption of printing had to do with the complexity of typesetting Arabic script. Because Arabic is a cursive script, letters are connected to one another and take different forms in initial, medial, final, and free-standing positions. This means that a complete Arabic font, including vowel marks, can run to over 600 glyphs (Bloom, 2001, p. 218). Typographically, then, Arabic presented greater difficulties – and required greater typesetting skill – than languages that could be printed in uniform, discrete characters like English or Chinese.
Ironically, the movable type printing press became widely used in the Islamic world only after the French occupation of Egypt in 1798.Footnote 12
Clearly, the mere existence of the technology of movable type was not sufficient for its extensive use; the right social ecology was needed as well. The comparatively rapid success of printing in Europe depended in particular on three key factors. One was a good supply of readers to make the mass production of books economically feasible. Related to this were the entrepreneurial ambitions of printers, who were motivated by the commercial potential of publishing texts. The third necessary ingredient was an abundant supply of a durable yet inexpensive material on which to print.
Concerning readers, a “vigorous literate culture” (Clanchy, Reference Clanchy1979, p. 8) had been developing in Europe since the late twelfth century, when the growth of universities created an unprecedented demand for written texts, commentaries, and reference works. In cities such as Paris, Bologna, and Oxford, lay scribes worked in guilds to hand copy academic texts long before the arrival of print (Steinberg, Reference Steinberg1974). By the late fifteenth century, a growing population (and growing numbers of readers), combined with an ascendant economy, translated into a viable market for booksellers. The printing press may have amplified the demand for books, but it did not create it in the first place. Like cuneiform writing in Mesopotamia, the print ‘revolution’ co-evolved with, but did not cause, increased societal complexity.
Within the first fifty years following Gutenberg's invention, hundreds of printing presses were established in cities throughout Europe, and Febvre and Martin (Reference Febvre, Martin and Gerard1976) estimate that some twenty million books were printed (p. 248). Compared to the limited capacity of scribal guilds, this was clearly a huge expansion in the scale of production. Printers sought the most lucrative markets – university towns and commercial centers – and developed standardized procedures to maximize speed and productivity.
The third requirement, the plentiful availability of inexpensive material on which to print, was satisfied by paper. Introduced into southern Europe from the Muslim world in the eleventh century, paper was strong, smooth, flexible, and could be produced in uniform formats. Because it was made from recycled linen rags, paper was much cheaper than parchment, and could be produced in virtually unlimited quantities. If early printers had only had parchment to print on, their books would have been almost as expensive as handwritten manuscripts, and printing would not likely have taken hold to the extent that it did.
Although printing presses and books proliferated rapidly in Europe in the first fifty years after Gutenberg printed his first Bible, they did not immediately change the existing power dynamics of literacy. Three-quarters of the books published were in Latin and most readers were clerics or academics, so print began as a product of – and for – a very circumscribed segment of society. Although the printing press was initially welcomed by the Church because it increased dissemination of Bibles and other religious materials, it soon became apparent that the new print era also heralded unwelcome changes. The Church's tight control over the production of books was threatened, as was the Church's very language (since the Bible could now be published in vernacular languages as well as Latin). Mass dissemination of print also threatened communal recitation of prayers by making it possible for individuals to read and contemplate scripture privately. Circulation of the printed writings of Martin Luther, Jean Calvin, and other Protestant reformers so enraged France's King François I that he issued a series of restrictive laws, culminating in a 1535 decree outlawing all printing (on pain of hanging) and ordering the closure of all book shops.Footnote 13
Nor did the printing press immediately usher in more widespread literacy. Non-literate aristocrats could hire readers and scribes for their literacy needs. Peasants, on the other hand, had little need of or access to writing. It was among the emerging middle classes that literacy held the greatest promise, for in a culture with a growing demand for written records, the ability to write assured a livelihood. But this demand for written documents preceded the printing press, and for many lay scribes, print was more of a threat than a boon. During the sixteenth and seventeenth centuries, literacy rates in Europe remained low, especially in rural areas, and it wasn't until the late eighteenth century – more than three hundred years after Gutenberg's invention – that literacy rates came to approach even 50 percent among men in France and 60 percent among men in England.Footnote 14
Scholarly practices remained largely unchanged following the arrival of the printing press. Given the still relatively high cost of books, scholars continued to copy texts by hand, including texts that had been produced on a printing press.Footnote 15 Even in the early eighteenth century, access to university libraries was often limited to professors, and teaching practices typically followed the medieval tradition of reading aloud from authoritative texts and commenting upon them (Melton, Reference Melton2001, p. 90).
If print did not suddenly transform an oral society into a literate one, it was nevertheless appropriated in interesting ways into oral culture. Historian Natalie Zemon Davis writes that printing in sixteenth-century France (like the Internet today) established “new networks of communication…new options for the people…new means of controlling the people” (1975, p. 190). She explains how in village veillées in which books were read aloud, ‘reading aloud’ meant translating texts into the local dialect so the villagers could understand. In the case of long works, it also meant editing while reading. Reading was therefore far from a simple vocalization of print – it was a wholesale recreation and performance of texts. Non-literate peasants on the listening end were not passive recipients of information, Davis tells us, but participated in their own way in the new world of printing – not by reading and writing, but by actively interpreting and using the information that was read to them. Orality and literacy therefore suffused and transformed one another.
Similarly, Harold Love points out that even in late sixteenth and early seventeenth century England many key cultural texts were oral, and texts that were written were often intended for oral performance (2002, pp. 100–101). Because people's actual experience of texts often involved a mixture of speech, print, and handwriting, there was no clear-cut transition from ‘oral’ to ‘literate’ cultures, and practices in early modern Britain were better characterized as “intricate negotiations between the media” (ibid., p. 102). In the next chapter we will see that the same holds true today, and that “negotiations between the media” aptly characterizes twenty-first-century literacies as well.
Finally, a common assertion is that the printing press was the key factor in disseminating Luther's and other Protestant reformers’ ideas during the Reformation in Germany. It is undeniable that print played a significant role, but given the low literacy rates in sixteenth-century Germany, Scribner and Dixon (Reference Scribner and Dixon2003) estimate that only 2.3 percent of the German population could have encountered Luther's ideas via print. They argue that most information was passed by word of mouth through personal contacts, and that the real role of print was to inform ‘opinion leaders’ – literate and influential people – who disseminated the new ideas through oral communication, with sermons being the most powerful communication platform (p. 20). What print did, then, was to accelerate and broaden dissemination of information to such opinion leaders, creating more networks and spheres of influence through which new ideas could be spread.Footnote 16
Print and language
Probably the most significant way in which the printing press influenced language was to introduce an ethos of uniformity and standardization. This happened at two levels: at the visual surface of the printed text and more broadly in the standardization of language use and in the determination of what counted as a language. Although manuscripts had been written in standard hands, variability from one scribe to the next was inevitable, and of course any individual scribe's handwriting could vary slightly from one writing session to the next, depending on posture, fatigue, ambient temperature, and so on. As manuscripts were transformed into printed texts, such individual variability was lost, as the same letter forms were used consistently within and across texts produced by a given printer. As Amalia Gnanadesikan puts it, printing shifted the process from creating letters to selecting letters (Reference Gnanadesikan2009, p. 252). This shift was later extended by the typewriter and the computer, which allowed texts to be composed from the outset by means of selecting letters.
On a broader level, the printing press provided the impetus for European vernaculars (which had been mostly spoken) to be written, codified, and thereby legitimized as languages. Prior to the arrival of print in Europe, a diglossic linguistic configuration existed whereby the language of the Church, authority, and literacy was Latin, and the language of daily life in society was the regional vernacular. Although in the early years of printing most texts were published in Latin, by the sixteenth century printers produced increasing numbers of vernacular texts, motivated by the prospect of broadening their market base. The wide circulation of such texts helped consolidate and establish the literary languages of Europe and ultimately contributed to the decline of Latin.
Of course, there were many regional vernaculars. Those chosen for print (i.e., those most closely associated with power) became standard languages, whereas those that remained unprinted took on the status of provincial dialects. Early grammars for the most prestigious vernaculars were developed during the sixteenth century (Nebrija's 1492 Gramática castellana being a notably early contribution). These grammars provided a basis for filtering out many of the regionalisms that had marked earlier vernacular publications.
Print also gradually introduced new formatting, spelling, and punctuation conventions. Although many of the earliest printed books followed manuscript tradition in terms of layout and punctuation, new practices came to be established. For example, whereas rubrication (literally ‘red-inking’) had been used to show divisions in a text, printers found this too costly and made use of different fonts for headings, initial words, and main text. Quotations, which in manuscripts had been underlined in red ink, came to be indicated by quotation marks. On the other hand, paragraphs, which had been explicitly marked with symbols (such as the pilcrow, ¶), came to be marked by white space (indentation of the first line, or most recently, a preceding blank line in ‘block’ paragraphing) (Baron, Reference Baron2000, p. 169).
Before print, spelling was highly variable, and abbreviations were commonly used. Producing a manuscript book was expensive, not only in terms of labor, but also in terms of materials (a Bible written on parchment required the skins of hundreds of sheep). Because of this expense, scribes saved space wherever possible and developed elaborate abbreviation schemes (Cappelli's Lexicon Abbreviaturarum lists some 13,000 Latin abbreviations).Footnote 17 Printed books, which at first attempted to reproduce manuscripts as closely as possible, preserved abbreviations along with handwriting style, ligatures, illuminations, and layout conventions. Similarly, in both manuscript and early print books, word spellings were adjusted by adding or subtracting letters to fine-tune the length of each line of text so that a flush right margin could be maintained. Once printed, spellings took on a certain authority, and some of the ‘anomalous’ modified spellings became standard when dictionaries were made. Other techniques scribes and printers used to make text fit were adding or deleting whole words, increasing or decreasing space between words or between letters, substituting words for phrases, or vice versa (N. S. Baron, Reference Baron2000).
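The space-stretching that scribes and printers used to maintain a flush right margin survives in modern typesetting as full justification: surplus spaces are distributed across the gaps of a line until it reaches the target width. The sketch below is a minimal, modern illustration of that idea (the function name and the greedy left-to-right distribution are my own assumptions, not a historical algorithm):

```python
def justify(words, width):
    """Distribute surplus spaces between words so the line is flush on both
    margins -- a simplified echo of the compositor's space-stretching trick."""
    if len(words) == 1:
        # A lone word can only be padded on the right.
        return words[0].ljust(width)
    gaps = len(words) - 1
    total_spaces = width - sum(len(w) for w in words)
    # Give every gap the same base padding; leftover spaces go to the
    # leftmost gaps, one each.
    base, extra = divmod(total_spaces, gaps)
    line = ""
    for i, word in enumerate(words[:-1]):
        line += word + " " * (base + (1 if i < extra else 0))
    return line + words[-1]

print(justify(["the", "printed", "page"], 20))  # -> "the   printed   page"
```

Early printers achieved the same visual effect by respelling words; the modern convention of adjusting only the spaces leaves spelling intact, which is one reason print could later stabilize orthography.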
Abbreviations gradually came to be less used in printed books because printers did not want to risk losing readers who might not understand what they meant. Moreover, as they shifted from expensive parchment to comparatively cheap paper, printers were less compelled to maximize the efficiency of the texts they printed by using abbreviations.
The history of these early printing practices reminds us that spelling is not ‘natural’ and is often shaped by the technological medium in which it is used. Moreover, the control and standardization of spelling (as well as grammar) is also a matter of social forces whereby the power elite set standards, and ordinary people's access and adherence to those standards largely determines their social mobility.
It is clear that the printing press both reflected and shaped the cultural evolution of Europe. However, as mentioned earlier, print was not a catalyst for social change in Central and East Asia. We will now return to this story to see how another technological innovation – paper – was.
Paper
The invention of paper is commonly attributed to Tsai Lun, a eunuch at the court of the Han emperor He Di in China in 105 ce, although archeological findings from the Xuanquanzhi ruins of Tunhuang in China's northwest Gansu province place the invention of paper some two hundred years earlier (Yi & Lu, Reference Yi, Lu, Allen, Zuzao, Xiaolan and Bos2010). Made from rags, hemp, or tree bark dissolved in water and then sieved through woven cloth stretched over a bamboo frame, paper was from the beginning made of waste products and therefore relatively cheap to produce.Footnote 18 It was also very light in weight. These were substantial advantages over competing writing materials such as bamboo, which was heavy, and silk, which was expensive. Over time, production techniques were perfected and paper was made in various qualities for different purposes. One key use of paper was for printing, which evolved from the ancient use of carved stone or bronze seals to make impressions on clay or silk, and from the practice of making rubbings from stone and bronze reliefs (Bloom, Reference Bloom2001, p. 36).
Papermaking spread to Korea, where the product was refined to new levels of quality, and subsequently to Japan in the early seventh century. Paper was introduced to Central Asia in the eighth century, allegedly by Chinese prisoners taken captive by Arabs during the Battle of Talas, although archeological evidence from Samarkand suggests that paper may have existed there centuries earlier (ibid., pp. 43–45). In any event, with the unification of West Asia under Islam, the practice of papermaking migrated westward to Baghdad, Mecca, Egypt, Morocco, and finally to Spain. According to Bloom (ibid.), Spanish Christians began to use paper well before the year 1000, and their use of paper grew as they came to dominate greater areas of the Iberian peninsula. However, it was only after the arrival of the printing press in the fifteenth century that paper gradually came to replace parchment as the preferred medium for recording European thought.
In Islamic civilization, it was paper, not printing, that made possible huge advances in learning and new ways of thinking between the eighth and the sixteenth centuries. Paper was originally produced as an adjunct to papyrus and parchment to serve the administrative needs of the huge bureaucracy that developed during the Caliphates and the Ottoman Empire.Footnote 19 But its light weight, availability, and relatively low cost made it a catalyst for social, intellectual, and artistic innovation. The Hindu numeral system had spread through the Islamic world by the ninth century, and by the tenth century, the mathematician Abu al-Hasan Ahmad ibn Ibrahim al-Uqlidisi adapted the Hindu system to the affordances of paper and ink, developing the notion of positional decimal fractions and showing how to perform calculations without deletions (ibid.). Scholars collected and codified the oral traditions of Muhammad on paper, Greek bookrolls were translated into Arabic and written on paper, and new forms of literature, such as cookbooks and The Thousand and One Nights were copied on paper and sold (ibid., p. 12).
Paper provided a convenient means for textualizing not only language, but also artistic designs, architectural plans, genealogy charts, and battle plans, which could be composed in one place and put to use in another (ibid., pp. 215–16). This ability to separate things from their original context of conception and to recontextualize them in new settings, in new mediums, for new purposes, allowed ideas to be not only disseminated, but also transformed:
Drafters could abstract a design from one place and apply it in an entirely different setting. The scale, too, might also change dramatically. An artist could, for example, draw a design observed on a Chinese carved lacquer bowl and put the drawing aside in a portfolio or album. A bookbinder or plasterer might come upon the drawing and transfer the design by means of stencil or pounce to another medium, perhaps a molded-leather book cover or a carved and painted plaster panel many times the size of the original lacquer bowl. The bookbinder or plasterer would never have seen the bowl, and the intermediary drafter might have had no inkling that his drawing would be translated into leather or stucco. Designs were divorced from their original contexts, and this free-floating quality of design became a feature of later Islamic art, particularly the art made for the court.
The idea of divorcing design from its original context (like writing, which divorces language from its original context of utterance) is crucial to understanding relations among language, technology, and literacy. In Chapter 2 we saw how this idea is extended in today's digital texts, whose content is divorced from form (at the level of machine representation) in order to allow those texts to be appropriately scaled across a broad variety of devices.
Paper and language
In the Islamic world, the most important written text was of course the Qur'an. Although the oral literary tradition based on memorization and recitation remained vital, the written Qur'an assumed an increasingly important role when paper became available. Religious scholars were at first reluctant to transcribe the Qur'an on paper, as it was normally copied on parchment. However, given that a single Qur'an required the skins of about three hundred sheep, paper eventually won out. The oldest paper edition dates to the tenth century.
The medium of paper had an effect on the Arabic script. The Kufic script, traditionally used to write the Qur'an on parchment, and characterized by simple geometric shapes, gave way to an angular ‘broken Kufic’ that contrasted thick and thin strokes and was sometimes vertically elongated. This new script was well suited to the characteristics of paper and carbon ink and was more legible and easier to write. It was used for a wide variety of purposes, both secular and religious, and was popular among Christians as well as Muslims. This broken Kufic script led to the development of a more flowing rounded cursive similar to the Arabic script used today (Bloom, Reference Bloom2001, pp. 103–106).
The physical format of the Qur'an also changed with the accepted use of paper. Parchment editions of the Qur'an had been in landscape (horizontal) format, which differentiated them visually from Christian and Jewish scripture, which respectively took the form of codices with vertical orientation and bookrolls with horizontal orientation. Paper editions of the Qur'an, however, were in portrait (vertical) format (ibid., pp. 103–104). In the next chapter on electronically mediated discourse, we will again see that changes in the material medium are accompanied by changes in the form of writing.
The form of paper itself also had great cultural significance. When mainstream production of paper shifted to Europe in the fourteenth century, European paper began to be imported into Syria, Egypt, and North Africa. But the watermarked images of animals and symbols (and Christian crosses) again raised questions about the suitability of paper for religious writing (ibid., p. 13). Once again, material and technological innovations find themselves subject to sociocultural filters when it comes to their adoption.
With respect to literacy and the overlay of written tradition upon oral tradition, the development of paper has been described as the “industrialization of memory” (Debray, Reference Debray1991). And over the centuries, paper has certainly played a key role in the industrialization of society. Yet today, in the face of digitalization, which at least theoretically makes paper records obsolete (Sellen & Harper, Reference Sellen and Harper2002), the paper industry often highlights the ‘human’ face of paper: a medium of sentiment, a medium of personal meaning that binds us to our past and that we bequeath to our loved ones, as one paper company advertises on its website:
Birth certificates. Wedding albums. Autographed books and baseball cards. Paper mementos are among our most treasured possessions. And although digital media is gaining an increasingly important role in today's world, it can never replace the warm, tactile touch of paper. After all, scrapbooks don't require startups or shutdowns. Magazines never crash. And while flash drives are a convenient way to store images, wearing one around your neck just doesn't have the same effect as your grandmother's antique locket. Computers may keep records – but paper leaves a legacy.Footnote 20
Although it is a technology, paper has become naturalized, imbued with human significance both at the level of society and at the level of the individual.
In the next section we turn to a technology of the nineteenth and twentieth centuries that combined paper with print, but was designed for widespread personal use by individuals.
The typewriter: a personal printing press
The origins of the typewriter go back to at least 1714, when an English engineer named Henry Mill applied for a patent for his machine for transcribing letters that would print letters on paper “so neat and exact as not to be distinguished from print” and whose impression would be “deeper and more lasting than any other writing, and not to be erased or counterfeited without manifest discovery” (Bliven, Reference Bliven1954, p. 24). However, it took over 150 years and dozens of other similar inventions before a marketable machine actually named ‘Type-Writer’ arrived on the scene. Attributed to Christopher Latham Sholes, a newspaper editor and politician from Wisconsin, the prototype Type-Writer was inspired by the piano, with alternating black and white keys that triggered typeface hammers that brought an inked cloth ribbon into contact with paper to leave a printed impression. The ribbon had to be hand-inked, the hammers produced only upper case letter forms, and typists couldn't see what they had typed until they had completed an entire line of text. Nevertheless, the arrival of the ‘literary piano,’ as the editors of Scientific American had called an even earlier prototype, signaled the obsolescence of “the laborious and unsatisfactory performance of the pen” and heralded “a revolution as remarkable as that effected in books by the invention of printing.”Footnote 21
After failing to interest Western Union in his machine, Sholes turned to Philo Remington, whose company manufactured firearms and sewing machines. Demand for firearms was down since the end of the Civil War, and the company had extra production capacity. Remington bought Sholes's patent and marketed the first Remington typewriter in 1874. Sales were slow, given that the Type-Writer cost $125, but one of the early adopters was Mark Twain, who, despite his distaste for the machine, became the first writer to submit a typescript to a publisher.Footnote 22
One particular technical problem that Sholes and his collaborator James Densmore had faced was that the type keys often jammed if they were struck in too quick succession, requiring the typist to stop and pry the type bars apart. The original keyboard displayed the alphabet in sequential order but split across two rows of keys, as follows:
- 3 5 7 9 N O P Q R S T U V W X Y Z
2 4 6 8 . A B C D E F G H I J K L M
Densmore suggested reordering letter keys in an unfamiliar pattern to slow down typing. Sholes consulted with his mathematician brother-in-law, asking him to reorganize the key configuration so that the type bars that most often got stuck would be separated. Many calculations and experiments led to the four-row ‘QWERTY’ configuration that is still used on English language keyboards to this day. Sholes claimed that this scientific arrangement of the keys would boost typists’ speed and efficiency. It may have done so to the extent that typists spent less time prying apart stuck type bars, but, keeping in mind that typing was done with only two fingers at the time, the QWERTY layout actually made typists’ movements less efficient because it maximized the distance that their fingers needed to travel in typing most words (Beeching, Reference Beeching1974). Besides reducing the incidence of jams, the QWERTY layout also positioned all of the letters in the word TYPEWRITER in the top row of letter keys, a feature that salesmen used to full advantage when they demonstrated the machine.
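Beeching's finger-travel claim can be made concrete. The sketch below sums the distance a single finger travels between successive keys of a word, which is the relevant measure for the two-finger typists of the era. The grid coordinates and the alphabetical comparison layout are hypothetical simplifications for illustration; real typewriter key geometry was staggered and differed from this grid:

```python
# Hypothetical layouts on a simplified (row, column) grid.
QWERTY = ["QWERTYUIOP", "ASDFGHJKL", "ZXCVBNM"]
ALPHABETICAL = ["ABCDEFGHIJ", "KLMNOPQRS", "TUVWXYZ"]

def key_positions(layout):
    """Map each letter to its (row, column) coordinate in the grid."""
    return {ch: (r, c) for r, row in enumerate(layout) for c, ch in enumerate(row)}

def travel(word, layout):
    """Total Euclidean distance one finger travels typing the word."""
    pos = key_positions(layout)
    coords = [pos[ch] for ch in word.upper()]
    return sum(((r2 - r1) ** 2 + (c2 - c1) ** 2) ** 0.5
               for (r1, c1), (r2, c2) in zip(coords, coords[1:]))

for word in ["THE", "HAND"]:
    print(word, round(travel(word, QWERTY), 2), round(travel(word, ALPHABETICAL), 2))
```

Running such comparisons over a corpus of English words, rather than a handful of examples, would be needed to substantiate any claim about which layout minimizes overall travel.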
Although a number of other keyboard layouts have been invented since Sholes's time, and these have been claimed to be more scientific and efficient (most notably the one developed by Dr. August Dvorak in 1932), none has been able to replace the QWERTY standard.Footnote 23 Once again, we see that technological factors may sometimes be an impetus for change, but in many cases they are not, and that what makes the difference is the social context of the time and place. The technical problem that QWERTY targeted no longer exists – it has been irrelevant ever since the development of the typeball and was permanently put to rest with the personal computer. And yet QWERTY has prevailed – not because of any inherent technical superiority, but rather because of social inertia born of comfortable human habit and the consequent economic risk that any manufacturer proposing a new standard would face.
Unlike the technologies of handwriting and print, which do not favor certain scripts over others, the typewriter and the computer keyboard that evolved from it have a clear structural bias toward alphabetic writing. This is not surprising: a limited character set was a precondition for the typewriter; without an alphabet, the typewriter would not even have been conceivable. Attempts were made at adapting the typewriter to other scripts. For example, Underwood developed a Japanese typewriter based on the Katakana syllabary in 1923. However, because it could not include Kanji (Chinese) characters, which were typically used in conjunction with Katakana, the typewriter was not even marketed as a writing tool, and its use was largely limited to the preparation of billing statements in large companies (Gottlieb, Reference Gottlieb2000). Another limitation was that the typewriter imposed horizontal writing, whereas Japanese writing was most commonly formatted vertically. Finally, the typewriter's need to space characters uniformly created an unusual appearance on the page, and typists had to remember to insert word spaces to avoid ambiguities (Gnanadesikan, Reference Gnanadesikan2009, p. 129). A later typewriter designed in the 1960s for the Hiragana syllabary suffered a similar fate. It was not until the late 1970s, when word-processing technology made it possible to input Kana characters and convert them to Kanji, that the potential of the personal printing press was finally realized – but at the staggering cost of $37,000 per machine (ibid., p. 130). Today, Japanese computer keyboards retain the QWERTY keyboard, but they also print kana characters on the keys, facilitating the inputting of multiple scripts.
If the typewriter's design posed problems for languages written in non-alphabetic scripts, it contributed to the global spread of alphabetic writing by influencing decisions about what script to use for newly written languages. As Gnanadesikan points out:
In the newly global economy that came in the wake of colonialism and saw the growth of new multinational corporations, the result was to advantage those nations whose scripts fit easily onto a typewriter keyboard.
In sub-Saharan Africa, for example, the Roman alphabet (sometimes extended with additional letters) has often been deemed the most practical means to transcribe languages that have not traditionally been written “because it looks international, because individuals already educated in the colonial languages (English, French, and Portuguese) already know it…and because of the lasting, arguably pernicious influence of the typewriter” (ibid., p. 261).
With electronics, type keys were freed from singular mechanical connections to typeface characters. Any key could theoretically be programmed to produce any character – in any language or script. This is the principle that allows the QWERTY keyboard to be used in Chinese, for example. Perhaps the most common way to write Chinese characters on computers is to use the standard Roman alphabet keys on a QWERTY keyboard to write in pinyin (Roman phonetic transliteration), which activates a menu of Chinese characters that correspond to that phonetic representation. The user then chooses the appropriate Chinese character from the menu, and this is inserted into the text. An alternative procedure (called Wubi) maps components of Chinese characters (radicals and strokes) onto the QWERTY keyboard. One enters these character components in the same sequence as one would handwrite them on paper. Wubi-configured keyboards are divided into five regions, each designating a different type of stroke: horizontal, vertical, left-falling, right-falling, and hook. For experienced typists, this way of entering characters is far faster than using pinyin. Yet another way of entering Chinese characters is with finger movements on a touchscreen – pattern recognition software converts the hand-traced forms into printed characters.
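The pinyin procedure described above amounts to a lookup from a typed syllable to a menu of candidate characters, from which the user selects one. A minimal sketch of that candidate-menu logic (the tiny dictionary below is an invented sample; a real input method ships a large lexicon and ranks candidates by frequency and context):

```python
# Toy pinyin-to-character table. One syllable maps to many characters,
# which is why the user must choose from a menu.
PINYIN_TABLE = {
    "ma": ["妈", "马", "吗", "骂"],
    "zhong": ["中", "种", "重"],
}

def candidates(syllable):
    """Return the menu of characters offered for a typed pinyin syllable."""
    return PINYIN_TABLE.get(syllable, [])

def choose(syllable, index):
    """Simulate the user picking the nth character from the menu."""
    return candidates(syllable)[index]

print(candidates("ma"))    # the menu the user sees for 'ma'
print(choose("zhong", 0))  # user selects the first candidate: 中
```

Shape-based methods such as Wubi replace the syllable key with a sequence of stroke-component codes, but the underlying keystrokes-to-character lookup is structurally similar.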
Today, the accessibility of self-publishing via websites, blogs, social networking sites, and online forums has led to the production of a staggering amount of writing published by non-professional writers. The publishing industry's traditional role in legitimizing text and gatekeeping what can become ‘public writing’ has become to some extent democratized. This change has also put what was once the specialized knowledge of professional printers into the hands of ordinary individuals. One important such area of knowledge is the semiotics of typeface.
The semiotics of typefaces
Typographers often seek to make type as ‘transparent’ as possible, so as not to distract the reader's attention from the writer's work (Warde, Reference Warde1956). At the same time, however, there is a very long tradition, from Egyptian hieroglyphs to illuminated manuscripts to advertising today, of texts that purposely call attention to their surface visual form, beckoning the reader to look at them as well as through them (Lanham, Reference Lanham1993). Calligraphy and typography are semiotic modes in their own right, contributing their own nuances to the meanings prompted by words (van Leeuwen, Reference van Leeuwen2005a). Fonts put new faces on words, and in so doing they affect our reading of those words. Consider how font design characteristics can add a dimension of coolness, warmth, or cutesiness to a greeting.
Conversely, we can be led to make associations between different words by means of similar typographic design, as in a 2010 Greenpeace campaign against Nestlé, the maker of KitKat candy bars, for buying palm oil from companies that destroy Indonesian rainforests and thereby push orangutans toward extinction. The typography and red color were designed to lead readers to associate a KitKat bar with the word ‘killer,’ as shown in Figure 6.2.
Figure 6.2 Greenpeace's appropriation of the visual design of the Nestlé KitKat logo
Like language, typeface has meaning potential, but not predetermined meaning as in a code. Typefaces, like other aspects of style, get culturally appropriated and therefore can ‘mean’ different things in different cultural and historical contexts. For example, the Nazis appropriated Gothic script as a symbol of German nationalism; stickers from the early years of the Third Reich displayed slogans like “Feel German, think German, speak German, be German, even in your script” (C. Burke, Reference Burke1998, p. 148). However, in 1941, when Hitler realized the importance of communicating his case to the wider world, he rejected Gothic script, calling it a Jewish invention (“Schwabacher-Judenlettern”), and decreed that the Antiqua typeface was to be the “normal script” of the German people (Steinberg, Reference Steinberg1974, p. 293). In early modern England, on the other hand, Gothic script had been appropriated by the common people because it was thought to be the easiest script to read.
Today, Helvetica is an example of a typeface that is both revered and reviled, and is even the subject of a feature film.Footnote 24 Developed in 1957 by Swiss typographer Max Miedinger, Helvetica was designed to be modern, legible, comforting, and ‘neutral’ in the sense that the typeface itself should not convey any meaning and thus be ‘open’ to interpretation. Many cities have adopted Helvetica for their public signs displaying the dos and don'ts of street life. Critics of Helvetica condemn its lack of rhythm and contrast, its boring predictability, and its overuse in society.
Typefaces might be considered expressions of identity – just like clothes, haircuts, eyeglasses, and other visual design accouterments – and in this light it is notable that social networking sites such as Facebook and MySpace allow users to modify typefaces in their personal spaces (the default fonts are Lucida Grande and Verdana respectively). However, in comparison with the many ways in which social networking users are unwittingly being homogenized by algorithms that make them behave in consistent, predictable ways, the typeface choices may be little more than a superficial gimmick to provide an illusion of agency.
Because visual design features such as typeface and layout contribute in significant ways to the meanings people make and take from texts, there is a substantial responsibility to understand the visual pragmatics that underlies their use. Kress and van Leeuwen are two scholars who have contributed much valuable work in this area (Kress & van Leeuwen, Reference Kress and van Leeuwen1996, Reference Kress and van Leeuwen2001; van Leeuwen, Reference van Leeuwen2005a, Reference van Leeuwen2005b, Reference van Leeuwen2006) and we will consider the pedagogical implications of this topic in Part III.
Conclusion
We have seen in the last two chapters how technologies of literacy have affected humans’ relationship to language. The invention of writing allowed language to be separated from speakers’ bodies and distanced from the original context of utterance. Paper, by providing a cheap, light, foldable medium, offered wider access to writing and drawing, and thus broader use of textualization and recontextualization. Print introduced the mechanization of writing, imposing another degree of spatial and temporal distance between the author and the final form of the work (a distance that had been introduced during the manuscript era by scribes). The typewriter made a simplified version of print accessible to the masses; it made printing personal and portable, but still maintained mechanical intermediaries between writer and written product.
The story of paper and print illustrates how material, social, and individual dimensions interact in producing new designs of knowledge, learning, and social life. The fact that printing did not become widespread for centuries in either China or the Muslim world leads us to the conclusion that it was not the technological innovation of movable type alone that made printing viable. Rather, a favorable social–economic–material ecology had to be in place to allow the technological innovation to take hold. Paper and print influenced the social–economic–material ecology by making it possible to circulate ideas to an unprecedented degree, and, for the first time, to archive texts at a potentially unlimited number of sites. In so doing, paper and print changed both the individual's and society's relationship to written language. These developments were crucial for the development of nation-states, the notion of the public sphere, and the consolidation of academic disciplines, especially the sciences.
Today, digital devices introduce a further stage of transformative separation between humans and language (albeit hidden from the eye), as HTML and other computer codes intervene between the embodied biomechanical movements of writing and the display of signs on screens of various types and sizes. The connections between these devices also make it possible for individuals to disseminate their writing with unprecedented speed and breadth of dispersal. It is to electronically mediated writing that we turn in the next chapter.
Communication is a kind of interaction that actively seeks variety. No matter how firmly custom or instrumentality may appear to organize it and contain it, it carries the seeds of its own subversion.
Written communication in electronically mediated environments involves conventions, to be sure, but it also affords the individual an unusual degree of leeway to invent new forms, which in some cases become socially accepted as new conventions. Some see the use of new forms as an illustration of the degradation of language. Others welcome it as a celebration of linguistic creativity. This chapter argues that such debates miss the point – that what is important about the special forms found in electronically mediated discourse is that they provide their users with a special identity and sense of belonging to particular discourse communities.
Electronically mediated writing runs the gamut from the most formal, polished expression to the most spontaneous and improvisational. On the formal end of the spectrum, electronically mediated writing looks very much like the writing found in print media (in fact, print is now almost always derived from electronic copy). When online writing is used for personal communication, on the other hand, it sometimes presents new forms and conventions, as in this texting exchange (to which we will return later):
A: wuz^
B: nmhu?
Non-standard forms and conventions arise partly from the fact that interactive electronic discourse is often space-limited or time-pressured. For example, Twitter has a 140-character limit and text messages have a limit of 160 characters.Footnote 1 In real-time interaction environments like chatrooms, people have to understand and respond quickly, as messages may scroll off the screen within a few seconds if there are many participants writing actively. Even if there are only two participants, lags are to be avoided because they disrupt the rhythm of conversation. Consequently, responses tend to be produced quickly and are short in length. Users have developed a plethora of reductions, abbreviations, acronyms, neologisms, emoticons, and amalgams of letters and symbols to deal with these space and time limitations.Footnote 2 For example, consider the following tweet relaying a statement by US Secretary of Education Arne Duncan:
RT @actfl: “W/i the brdr ctxt of a qual ed 4 evry stud, intl ed & fl stdy r vital 2…stud's full access to the world.” Sec of Ed Dunc …
This tweet incorporates vowel deletion (e.g., brdr ctxt), truncations (e.g., qual ed), initialisms (fl), numerals used in rebus fashion (4, 2), and letter/symbol abbreviated forms (e.g., w/i). ‘RT’ is a Twitter-specific device signifying that this is a forwarded ‘re-tweet’ and ‘@actfl’ means that the message was originally posted on the American Council on the Teaching of Foreign Languages Twitter site. The extreme abbreviation of the text is necessitated by Twitter's 140-character limit. The original quote from the Secretary of Education ran to 240 characters:
“Within the broader context of ensuring a quality education for every student, international education and foreign language study are vital to giving those students full access to the world around them.” Secretary of Education Arne Duncan
Eleven characters and spaces were required for the re-tweet acknowledgment (RT @actfl), leaving 129 for the message and presenting the writer with the task of compressing the original statement to 54 percent of its length (a reduction of 46 percent) while maintaining its full content.
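The arithmetic behind that compression can be spelled out directly from the figures in the text: Twitter's 140-character limit, an 11-character re-tweet prefix, and a 240-character original quote.

```python
# Figures as given in the text.
TWEET_LIMIT = 140        # Twitter's character limit at the time
RETWEET_PREFIX = 11      # "RT @actfl" plus punctuation and spacing
ORIGINAL_LENGTH = 240    # length of Secretary Duncan's original statement

available = TWEET_LIMIT - RETWEET_PREFIX      # characters left for the message
target_ratio = available / ORIGINAL_LENGTH    # fraction of the original that fits

print(available, round(target_ratio * 100))   # 129 54
```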
Besides the formal surface features of this tweet, there is an interesting communicative characteristic related to the fact that this message is being ‘re-tweeted’ by various individuals who did not articulate the abbreviated word forms, much less compose the message. As such texts are recirculated across networks they become farther and farther removed from what Goffman (Reference Goffman1981) called the author (the person who composes the words) and the principal (the person whose position and beliefs are represented by the words). In this case, Arne Duncan is the principal and author, and the original tweeter (whose identity is unknown) is what Goffman called the animator, the one who actually articulates the words of the utterance or text. This is not new in the sense that scribes, journalists, and editors have been recasting other people's language for centuries, but it is new in that formal modifications of others’ words can now be made by anyone with a digital device. Although the author and principal, US Secretary of Education Arne Duncan, remains identified, his words have taken a radically new form. Is the meaning the same? Yes and no. Although the substance of Secretary Duncan's remarks has been retained, the surface features of the tweet affect its comprehensibility and what we might call its ‘identity aura.’ The abbreviated language projects a hip, youthful glow that may not normally be associated with Secretary Duncan, and moreover, the message can become associated with the various people who re-tweet it. Twitter scholar Dhiraj Murthy (Reference Murthy2013) points out that how much a tweet gets recirculated often has more to do with who is re-tweeting it than with who originally authored it. If this is true, it raises the question of whether our traditional notions of textual authority are valid in the Twitter world.
To suggest that abbreviations and other modifications in tweets, text messages, and chats are motivated only by space restrictions or a need for quick typing would be misleading. Similar forms are frequently used even when there is no particular space limitation or time pressure, to create a friendly, playful mood or to enact certain identities, as in the instant messaging (IM) exchange between two undergraduates shown in Table 7.1.
Table 7.1 IM exchange between two undergraduates
| Message | Time interval from previous entry (min:sec) |
|---|---|
| 1 A:ohaider | |
| 2 B:ohai2u | 2:48 |
| 3 A:howru | 0:24 |
| 4 A:<33 | 0:02 |
| 5 B:haha i'm ok | 0:06 |
| 6 B:it's been a really lazy Sunday | 0:06 |
| 7 A:gud :-) | 0:09 |
| 8 A:as it should be | 0:01 |
| 9 B:what did you do today | 43:12 |
| 10 A:judged debates | 0:15 |
| 11 A:i'm at the glenbrooks | 0:02 |
| 12 B:oh | 0:02 |
| 13 A:hha | 0:13 |
| 14 A:hot weekend | 0:01 |
| 15 A:i kno | 0:01 |
This interaction took place in two segments, separated by a 43-minute gap after line 8. The first segment is characterized by phatic communication, in which language establishes social contact rather than conveys new information. It is in this section that we find the most individual variation, high expressivity, and creativity. There is no rush; B takes almost three minutes to respond to A's greeting, and A takes almost half a minute to respond in turn. While such delays would be inconceivable (or rude) in spoken communication, they are considered entirely normal in online written communication because it is understood that one's interlocutor is doing other things, perhaps even conversing with other people. This acceptance of non-focal engagement is at least partly explicable by the nature of the technology. Because messages (which are often requests for interaction) arrive instantaneously, at any hour of the day or night, senders cannot assume that receivers are available to communicate. However, the highly personal and interactive nature of text messages creates communicative ‘pull’ for the receiver, who is often tempted to respond even though he or she might be engaged in some other activity (hence the many car accidents attributable to texting while driving). Whereas writing has traditionally been a focal activity, in today's online communication it is sometimes a peripheral activity.
To return to the exchange in Table 7.1, those not familiar with instant messaging might not even recognize it as being in English until the fifth line (haha i'm ok). However, even though the spellings might be unfamiliar, the language is not:
A: Oh hi there
B: Oh hi to you
A: How are you?
Line 4 presents a right-tilted ‘heart’ emoticon, with ‘doubled’ affection indicated by the additional 3. What is important to recognize is that although ohaider, ohai2u, howru may not be standard spellings, they are nevertheless highly conventional spellings in the instant messaging world, as evidenced by the fact they each have their own entry in the Urban Dictionary (urbandictionary.com). They are part of a social code that signals a particular identity or establishes a playful tone.Footnote 3 They thus enact a heteroglossic layering of voices. The outside graphic ‘shell’ is impenetrable to the uninitiated, yet is highly familiar to participants in online games, forums, and chatrooms and allows writers and readers alike to identify as members of an in-group.Footnote 4 The second, ‘core’ layer consists of the language indexed by the surface spelling, which has dialectal echoes of Scandinavian English, and is associated with the ‘cute’ language of cats in humorous situations (the lolcats meme) on websites such as icanhas.cheezburger.com. Both the graphic and the dialectal elements perform pragmatic effects that enhance the underlying greeting by reinforcing the affective bond between the participants through the complicity that goes along with mutually understanding messages that may not be readily intelligible to others.Footnote 5
As we can see from the first four lines of this exchange, the more formulaic (and therefore predictable) the communicative function is, the more leeway participants have to use playful language forms (since the writer can expect that the reader's interpretation will be guided largely by context). Even if one does not immediately recognize the graphic forms ohaider and ohai2u, the expectation of a greeting helps the reader decipher the forms. When A and B resume communication in line 9, however, their writing shifts to standard forms. This is consistent with the hypothesis that when informational content will not be predictable, the writers will tend to communicate in more standard forms.
At first glance, the sequence of A's and B's messages gives the impression that turn-taking structure is modified since A and B have what look to be multiple sequential turns. In the case of this interaction, however, this is no more than a superficial artifact of the medium. In instant messaging and other synchronous communication environments, no text is sent until the user hits the return key. So, in order to maintain the floor and not keep their conversation partners waiting, users hit return often, splitting their turn into two or three or more discrete transmissions to maintain activity on the screen. Because of this characteristic of synchronous online interaction, discrete marking of interlocutors’ identities is crucial to sort out who has said what – one cannot assume that a new line entry indicates a new turn. This identification usually includes the user's online name (here removed for privacy purposes) and is often marked by a different color for each participant and sometimes left-side and right-side positioning on the screen (as in iChat), which makes the overall turn structure readily recognizable. If participant identity were not so clearly marked, the structural configuration of having turns split over several lines could lead to considerable confusion. Let's take the last three lines as an example.
A: hha
A: hot weekend
A: i kno
Had B sent the last line (i kno) instead of A, the meaning of hot weekend could conceivably have been interpreted to be referring to the weather (i.e., Yes, I agree, it's very hot this weekend). Knowing that it is A who is continuing her turn, however, makes this interpretation impossible; we know that ‘hot weekend’ is being used in an ironic sense to mean this weekend is not very exciting. When additional participants are added to interactions, the need for clear marking of speaker identity becomes all the more essential.
The mixing of non-standard and standard written English forms has led some to view chat, texting, and instant messaging as reduced (i.e., simplified, simple, and therefore impoverished) forms of language use. Indeed, in the popular press, texting, instant messaging, and other forms of electronically mediated written discourse are often dismissed as signs of the degradation of writing, the failure of schools, and the laziness of young people. Such a perspective overlooks two things. First, non-standard forms are not used as frequently as one might be led to believe by the media. In research conducted with her colleague Rich Ling, Naomi Baron (Reference Baron2008) found that reduced forms actually accounted for less than 1 percent of the total words in samples of American undergraduate women's texting and instant messages, and emoticons were very infrequently used. These were very small-scale studies, but they do suggest that the attention devoted to ‘textisms’ owes more to their perceptual salience than to their actual frequency. Second, although certain forms of online discourse may be reduced, they often demand greater work on the part of both the writer and the reader. Furthermore, as we saw in Chapter 5, creative use of mixed systems is the norm rather than the exception in the history of writing.
The Twitter and instant messaging examples illustrate the creative mixing of three types of resources that will be explored in greater detail below. These resources are phonologically based forms, graphically based forms, and affective elements specific to online discourse.
Phonological and graphic strategies
Phonological strategies
Numerals are frequently used for their sound value, as in 2 for ‘to/too,’ 4 for ‘for’ (and B4 for ‘before’), d8 for ‘date,’ or l8r for ‘later.’ This is a widespread practice in other languages as well. In French, for instance, 2 Ri1 is used for ‘de rien,’ and vi1 7aprem for ‘viens cet après-m[idi].’ In Italian, the numeral 6 is used to represent the second person singular form of essere (tu sei), and 16 (sedici) is used to approximate the expression se dici (‘if you say’). In Korean, 8 is pronounced ‘pal’ and 2 is pronounced ‘ee,’ so the commonly used 8282 is read as ‘palee palee,’ which means ‘hurry, hurry.’ Similarly, 1004 is pronounced ‘cheonsa,’ which means ‘angel’ in Korean (hence the name of a Korean popular song “Be my 1004”).
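The numeral-as-rebus principle amounts to a token-level substitution: each rebus form stands in for a whole word or syllable. A minimal sketch, using a small invented sample of the English forms cited above (not an exhaustive lexicon):

```python
# Hypothetical sample table of rebus/textism forms and their expansions.
REBUS = {
    "2": "to",
    "4": "for",
    "b4": "before",
    "d8": "date",
    "l8r": "later",
}

def expand(message: str) -> str:
    """Replace whole-token rebus forms with their spelled-out words."""
    return " ".join(REBUS.get(token.lower(), token) for token in message.split())

print(expand("c u l8r"))   # c u later
```

Note that the table is language-specific: the same digit indexes entirely different words in French, Italian, or Korean, which is precisely why these forms mark membership in a particular linguistic community.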
Sometimes the phonological strategy is complexified by invoking a phonological system of another language. In Japanese, for example, 39 is pronounced ‘sankyuu’ which is meant to sound like English ‘thank you.’ Similarly, the Chinese have 3Q (san Q, read more or less as sankyu). What is interesting about 39 and 3Q is that they are not purely Japanese or Chinese, but hybrid Japanese- and Chinese-inflected English forms that both rely on a stereotyped East Asian pronunciation that replaces the [θ] (‘th’ sound) with [s].Footnote 6
The Chinese have been particularly expansive in their use of numerals. Writing on a computer in Chinese normally involves typing either a radical or a pinyin (roman alphabet phonetic) version, and then selecting the appropriate character from a drop-down table. Because this takes time, Chinese chatters commonly use numerals to signify roughly similar-sounding words (or at least a similar leading consonant) to maintain a brisk pace of communication. For example, 282 [èr bā èr] is commonly used to represent 饿不饿 [è bù è] ‘are you hungry?’. One could reply by writing 246 [èr sì liù], used to approximate [è sǐ le] 饿死了 ‘starving to death’ and then sign off with 88 [bā bā], sounding like ‘bye bye’.Footnote 7 While people don't actually produce whole exchanges written purely in numerals like this, what is striking is the extent to which the technology-induced use of numerals has become part of youth culture. Chinese blogs show that there are dozens, perhaps hundreds, of numerical transliterations of Chinese expressions like these. And now they have migrated from the computer to other realms of cultural life. For instance, Mavis Fan, a Taiwanese pop star, put numerical textings to music in her 2001 hit song “Digital Love.” Hotels and restaurants make up special deals for Valentine's Day with coded numerical phrases, and some jokes make use of homophonic numbers.Footnote 8 Change the language, of course, and the meaning changes dramatically. In Thai, for example, the number 5 is pronounced ‘ha,’ and so 555 is used for ‘hahaha.’ But in Chinese, 5 is pronounced wu and 555 simulates crying.
Graphic strategies
As we saw earlier in the cases of Greeklish and ASCII-ized Arabic, numerals can also sometimes be used for their graphic resemblance to letters, as in 8 for θ and 3 for ξ in Greek, and 3 for Arabic ع and 7 for Arabic ح.Footnote 9 Other strategies include graphic reductions and leet speak.
Graphic reductions involve initialisms (such as TTYL for Talk to You Later, OMG for Oh My God, or LOL for Laughing Out Loud), in which letters are pronounced individually; acronyms, which are pronounced as words (such as ROTFL, rolling on the floor laughing, or BRB, be right back, pronounced ‘rotful’ and ‘berb’ respectively); and abbreviations (such as thx for thanks, coo for cool, sup for what's up?).
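The mechanical core of initialisms and acronyms alike is the same reduction: take the first letter of each word. A minimal sketch (the helper name is mine, for illustration):

```python
def initialism(phrase: str) -> str:
    """Build an initialism or acronym from the first letter of each word."""
    return "".join(word[0].upper() for word in phrase.split())

print(initialism("talk to you later"))              # TTYL
print(initialism("rolling on the floor laughing"))  # ROTFL
```

What the reduction cannot capture is the social distinction the text draws: whether the result is spelled out letter by letter (TTYL) or pronounced as a word (‘rotful’) is a convention of the community, not a property of the letters.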
Mathematical symbols are frequently added as well. For example, in French, the plus sign (+) is often used for plus (‘more’), as in the advertising slogan of BNP Paribas: TA + K ENTRER (Tu n'as plus qu’à entrer – You need only come in). It is perhaps most widely used in the leave-taking expression A+, a truncated form of à plus tard (‘until later’), frequently used at the end of an email, just prior to signing one's name. In Italian, the addition sign (+) is similarly used for più (more), while the multiplication sign (x) is used for per (meaning ‘for’ or ‘by’). To further complicate matters, x is sometimes combined with letters to form novel word forms, such as xò for però (‘nevertheless’) or xké for perché (‘why’/‘because’).Footnote 10
The Chinese increasingly incorporate pinyin (roman script) into their online writing, and sometimes abbreviate pinyin spellings. For example, zhengfu, the word for government, is often shortened to ZF. Some abbreviations refer to English rather than Chinese, such as PK for videogame ‘player killing,’ PS for Photoshop, and ML for making love (Tatlow, Reference Tatlow2012).
To the extent that users don't know what specific words a graphically reduced form stands for, it becomes functionally an ideogram. For example, many non-English websites use FAQ as a heading (often pronounced as a word [fɑk]), even though ‘Frequently Asked Questions’ is not part of their language. In the French-speaking world, FAQ has been reassigned to the expression ‘Foire Aux Questions’ to avoid giving the English language credit for the acronym, but it is important to note that the existing acronym drove the choice of words, and not vice versa.
Sometimes an abbreviation does not look like one. In French, for example, t'inquiètes is often used instead of ne t'inquiète pas in text messages. Because the negation is completely absent, someone who did not know the convention (a language learner, for example) might well interpret it as ‘be worried’ rather than ‘don't worry about it.’
Graphic reductions such as IRL (in real life), ROTFL, TTYL, or LOL, even if they are used outside of digital environments, inevitably evoke the domain of electronically mediated communication and virtual worlds when they are used, whereas reduced forms that did not develop online, such as FYI, AIDS, TGIF, CIA, IOU do not. The fact that they are bundled in their own succinct package is evidence of their widespread use – one cannot make a reduced form of something that is not already familiar to at least some members of one's social network. If I write YCMAIOT it will remain forever opaque and useless, since there is no functional need for the expression ‘you cannot make an initialism of this.’ However, if you write me an email that incorporates ycmaiot, at that point it becomes a socially shared resource, an Available Design, whose very use identifies you and me as being ‘in’ on a special meaning that those who haven't read this book will not share, and that therefore gives us an in-group status.
Although today graphic reductions are often associated with texting and other forms of online writing, it is important to remember that the phenomenon is a widespread and ancient one, often born of material limitations but then perpetuated by social convention. In ancient Egypt, for example, abbreviations became common when a rush pen was used to write cursive hieroglyphic on papyrus, because a rush pen could not easily draw strokes from right to left, and consequently only the left half of some signs was written (Parkinson & Quirke, Reference Parkinson and Quirke1995, p. 32). In medieval manuscripts, the use of abbreviations and symbols was common because parchment was expensive. But even when paper and the printing press reduced material and labor costs, abbreviations were preserved, since early printers wished to imitate manuscripts as closely as possible.
Reading and writing non-standard forms
The phonological and graphic strategies described above may or may not save keystrokes. But they rarely, if ever, make reading easier, especially when both phonological and graphic strategies are combined. When writers use multiple encoding strategies, their readers need to use multiple decoding strategies. Let's return to the exchange that opened this chapter:
A: wuz^
B: nmhu?
A's encoding of “what's up?” uses two strategies. First, it tries to capture the phonetic characteristics of casual speech, with a deleted /t/ and an ensuing assimilation that voices the /s/, turning it into [z]. We as readers need to think not only in terms of English speech, but also more specifically in terms of a particular speech variety of the hip-hop generation. When we encounter the caret character we need to switch from a phonological strategy to a graphic strategy to interpret the caret as an upward pointing arrow, and subsequently make the association between this symbol and the word up. There is no question mark, so we must infer that this is a question from our prior familiarity with the expression “what's up?” B's response (nmhu?) also mixes strategies. The first three letters are an initialism representing the first letters of the written words ‘not much, how about’ (or possibly ‘not much here’), but the u is a rebus, phonologically representing the whole word ‘you.’ Again, we must switch processing conventions mid-stream from graphic to phonological in order to correctly parse the utterance. This need to switch processing strategies makes reading difficult initially, especially because the strategy switching points are never marked but must be discovered by trial and error. Of course, if we encounter utterances such as nmhu on a recurring basis, we will develop automaticity and will instantaneously recognize the whole unit, as a unit, without added cognitive load.
‘Leet speak,’ which we encountered in Chapter 4, involves the use of non-alphabetic characters to substitute for alphabetic letters in writing. Although leet chiefly employs a graphic strategy (e.g., 1337 is a common respelling of leet), it sometimes adds a phonological layer of recoding, using non-standard spellings (e.g., r00l for ‘rule’) or incorporates common typing mistakes, such as in teh for ‘the,’ and nad for ‘and,’ which might be rendered as 73# and |\|4|) respectively (N. Ross, Reference Ross2006). This obviously requires users’ prior familiarity with common typographical errors, but also – and significantly – the ability to deal with subsequent transformations of already transformed forms.Footnote 11
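At its simplest, the graphic layer of leet is a character-for-character substitution. The sketch below uses the substitutions implied by the examples above (1337 for ‘leet,’ 73# for teh, |\|4|) for nad); real leet usage is far more variable, and this table is only an illustrative subset:

```python
# Illustrative subset of leet substitutions, drawn from the examples in the text.
LEET = {
    "l": "1", "e": "3", "t": "7", "a": "4", "o": "0",
    "h": "#",
    "n": "|\\|",   # multi-character glyph built from punctuation
    "d": "|)",
}

def leetify(text: str) -> str:
    """Apply the graphic substitution layer of leet, character by character."""
    return "".join(LEET.get(ch, ch) for ch in text.lower())

print(leetify("leet"))   # 1337
print(leetify("teh"))    # 73#
```

The phonological layer the text describes (r00l for ‘rule,’ deliberate typos like teh) has to be applied before this graphic recoding, which is exactly the stacking of transformations that makes leet demanding to read.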
The process of writing and reading non-standard forms in online discourse is somewhat like what linguistics students have to go through when they first learn to use the International Phonetic Alphabet. But phonetic transcription has a clear analytic purpose. In the case of online interpersonal communication, why add work? One reason is that symbolic transformation is creatively satisfying fun that can be shared with others, and the complicity that comes with mutually understanding ‘coded’ forms can reinforce personal bonds and reaffirm social affiliations within particular communities of practice.
Affective elements
Because writing is very often unaccompanied by voice, facial expression, gesture, or body language, it has to develop its own devices for conveying emotion. Written texts have always managed to express emotion when their writers have wished to express it, and they often do so in extraordinarily potent ways. Although some early pundits predicted that computer-mediated writing would be adequate for informational memos but not for meaningful personal interaction, research has shown on aggregate that online communication is no less emotional or engaging than face-to-face communication, and if anything, it involves more explicit and frequent expression of emotion (Derks, Fischer, & Bos, Reference Derks, Fischer and Bos2008).
Walther (Reference Walther1996) suggests that online writing encourages ‘hyperpersonal’ communication, referring to the stronger affect that people sometimes experience in technology-mediated communication as compared to face-to-face communication. This effect arises partly because people can often craft a more interesting image of themselves online (and can also better monitor and control that projected image). This curation of the self can lead others to idealize and over-estimate one's qualities, which, in turn, leads to enhanced feedback that seems much better than what one gets in face-to-face interaction. This generates a loop: the more interest one's interlocutor seems to show, the more one will be encouraged to be even more witty and engaging to keep the good feedback coming. All this may lead people to interact quite differently from how they might in a face-to-face context.
What is important to recognize is that this hyperpersonal argument is not limited to twenty-first-century online communication, but it has applied to the technology of writing for centuries. Consider the bashful boy who wants to tell a girl he loves her, but he's struck dumb by her beauty every time he sees her. He may be speechless in her presence, but there's a good chance the letter or sonnet he writes will win her over. The same goes for expressing grief. People can write a heartfelt, articulate sympathy note, but when face to face with the grieving person, they often can't get past clichés or awkward silence.
What is new in electronic discourse are some specific (and comparatively trivial) conventions that have developed to express affect, some language-based and others non-linguistic. These include initialisms such as LOL (laughing out loud), emoticons ;-), capitalization (I AM SHOUTING), punctuation marks (!!!), modified spelling (miss u soooo much!), expressions typical of spoken conversation (Geez! Mm hmm. Yum!), and ‘emotes’ that verbally describe an action or response (e.g., *groan, *shrug, *grimace).Footnote 12
‘Shouting,’ the use of capital letters for emphasis or to express strong emotion (e.g., I AM SO MAD!), is so widely used a convention that even in face-to-face contexts young people sometimes refer to loud, emotion-laden speech as using an ‘upper case voice.’ A similar technique is to flank words with asterisks or underlines to intensify feeling (e.g., I'm *crazy about* that band; I _really_ hate this idea). These devices borrow from the well-established typographical conventions of underlined, boldface, or italic print, and were developed as ‘work-arounds’ before those techniques could be implemented in online writing. Even though formatting options have expanded in many online interactive environments, these work-arounds are nevertheless still commonly used.
Two affective devices that have developed specifically in the context of text-based digitally mediated communication are emoticons and LOL.
Emoticons
As discussed in Chapter 1, emoticons are pictographs that add a non-verbal affective dimension to online communication. Assembled from punctuation marks, they are a good example of how one set of available resources (symbols that mark syntactic boundaries) can be put to use in new configurations to serve a wholly new purpose (to inject friendliness into an interaction or to dissuade readers from taking what is written too seriously). Unlike the expression of emotion in face-to-face communication, which often happens uncontrollably or even unconsciously, a writer's use of emoticons is always conscious and sometimes contrived. That is, when someone uses an emoticon, it does not mean that the person is actually experiencing the represented emotion, but that he or she wants to evoke an aura of that feeling for some communicative effect. This remove is what allows emoticons to be used non-literally for effects such as irony and humor, as in this instant messaging example:
A: tell me why im taking math again? >.<
B: because you're an idiot
A: T_T thanks a lot oh lovely friend of mine
A: lol
Here A is not literally wincing in pain (>.<), but humorously expressing her feelings of frustration with her math course. Her feigned flowing tears (T_T) in response to being called an idiot are tempered by both the sarcasm of “thanks a lot oh lovely friend of mine” and by the lol she sends as a coda in line 4.
Since the 1990s in Japan, and since about 2009 in the US and Europe, smartphones have allowed users to insert emoji, or full-color emoticons, into their texts. Some emoji are specific to Japanese culture (a cup of sake, a bowl of ramen noodles, a white flower for brilliant homework), but many others are more broadly applicable.
Emoji go well beyond just expressing affect; they can be used as a kind of rudimentary sign language. The “Narratives in Emoji” blog offers scenarios and movie plots recounted in emoji. For example, Les Misérables is illustrated in ninety-five emoji, and Titanic in fourteen, starting with a ship's anchor and ending with a broken heart.Footnote 13 In certain online environments, ASCII emoticons are automatically converted into emoji. For example, when I type :-) in Microsoft Word, my keystrokes are converted into ☺. Many people find these transformations annoying precisely because they are automatic (though the autocorrect feature can be overridden) and because they produce an overly ‘cutesy’ look that may be ill suited to the impression the writer wants to convey.
Emoji have become a standard built-in feature in certain chatrooms and blogging environments, and often they are animated as well. Applications such as CLIPish provide full-motion animations that can be inserted into texts to express affect (Figure 7.1).

Figure 7.1 Still from a full-motion animation expressing jubilation
Given that writing has always had its ways of expressing affect and attitude, why have emoticons and emoji developed now? No doubt, the rapid pace and truncated language of chat, instant messaging, and texting have something to do with it, for one has little time or space to elaborate on one's mental states in those environments. Emoticons are quick, easy, and synoptic. Their less frequent use in email (Randall, Reference Randall2002) suggests that when people do have the time and space to write more fully, they also have less need of emoticons.Footnote 14 However, there is also an organic, cultural dimension to the arrival of emoticons. On computers and smartphones, where icons are the dominant organizational and navigational devices, where images increasingly compete with and complement words, and where one person's creative moment can quickly go viral, the emoticon is a logically consistent outgrowth that is superficially emblematic of the digital ethos. This superficiality is precisely why the use of emoticons varies so much from individual to individual: some computer novices wanting to fit in will use emoticons as digital membership badges, while some more experienced digerati eschew emoticons because they see them as silly and devoid of any real meaning. Most people are probably in between, willing to use an occasional emoticon for a touch of humor, but essentially indifferent to them. In any event, emoticons are for now a fixture in the online landscape and, as we will see in the next section, people have developed nuanced conventions by which to interpret them.
LOL
LOL (laughing out loud) is one of the most widely used online initialisms, and its use has spread to many languages other than English. While it usually signals that the writer finds whatever is onscreen to be humorous, it rarely, if ever, means that the writer is literally laughing out loud as he or she is writing. Most often, in English at least, lol conveys a tone of light irony or sarcasm.Footnote 15 Despite its origin as an initialism, lol is probably more appropriately viewed as an independent lexical item for a number of reasons.
First, it is most often used in contexts in which an expansion to ‘laughing out loud’ would not be grammatically or pragmatically appropriate. For example, lol frequently appears alone on a line of its own in chat or text exchanges, and often in response to another's lol.
Second, it has developed unique semantic, morphological, and phonological features. Sometimes pronounced as L-O-L, sometimes as ‘lole,’ sometimes as ‘lall,’ lol has spawned literally hundreds of variations (e.g., lololol, loooooolll, I did it for the lulz – i.e., for the laughs) as well as internet memes such as ‘lolcats,’ in which photos or images of cats in humorous positions and circumstances are captioned in ‘lol speak,’ such as i can has t3h kibbls plz?? for ‘Can I have some food please?’Footnote 16
Third, lol is commonly used in languages other than English, where it loses its abbreviation value (i.e., no connection can be made between the letters lol and a corresponding phrase).Footnote 17 However, language-specific, phonetic representations of laughter (such as he he in English) are also used on the Internet. For instance, the Korean character ㅋ is equivalent to the hard ‘k’ sound in English, and ㅋㅋㅋㅋㅋ (‘kekekekeke’) is used as a phonetic imitation of laughter. Koreans also use ‘ㅎㅎㅎㅎㅎ’ for ‘hahahahaha.’ In Japanese, WWWWW repeats the first letter of the romanized verb warau ‘to laugh,’ in an iterative pattern that mimics the rhythm of laughter; in Japanese blogs, the kanji character for laughter, 笑, is often used instead of LOL or WWWWW. In Thai, the digit 5 is pronounced ‘ha,’ so 555 reads as ‘hahaha.’
Finally, lol appears to be undergoing semantic bleaching, gradually losing its association with laughter, and becoming an element of discourse grammar, as I will speculate below.
Interpreting affective elements
How are affective elements interpreted? In one experimental study (Walther & D'Addario, Reference Walther and D'Addario2001), researchers compared sentences such as the following to assess the impact of emoticons on message interpretation:
That econ class you asked me about, it's a joy. I wish all my classes were just like it. :-)
That econ class you asked me about, it's a joy. I wish all my classes were just like it. :-(
That econ class you asked me about, it's hell. I wish I never have another class like it. :-)
That econ class you asked me about, it's hell. I wish I never have another class like it. :-(
The researchers were testing the hypothesis (among others) that a frown emoticon used in conjunction with a positive verbal message (e.g., item 2 above) would convey less positivity than a positive ‘pure message’ and less negativity than a negative ‘pure message,’ such as item 4 above. However, they found that the use of :-), ;-), or :-( emoticons made a difference in interpretation only if the verbal part of the message was positive. For example, contrary to the researchers’ predictions, item 2 was rated as being as ‘unhappy’ as any of the negative verbal messages (for which emoticons made no difference). The problem is that the study did not probe what exactly the writer was unhappy about. The researchers assumed that raters were assessing the writer's attitude toward the econ class. But the juxtaposition of a frown emoticon with an unambiguous statement like ‘it's a joy’ makes little sense, so subjects were most probably applying it to the more proximate second part of the prompt concerning the writer's other courses. By this reading, the frown emoticon implies that the writer's other courses are not as good as the economics class, which explains why subjects in the study rated the item so negatively.
My undergraduate students make a clear distinction between the use of ‘lol’ and a smiley :) either on a line of its own (i.e., as a complete utterance) or at the end of an utterance. For example:
You're a real stud muffin! lol
You're a real stud muffin! :)
The first message, with lol, suggests that the writer is teasing or being ironic. The same message with a smiley at the end is more sincere, expressing admiration, and perhaps flirtatiousness. The effect of lol is to attenuate the seriousness of whatever precedes it, so it can suggest that the writer does not really believe what he or she has written. On the other hand, lol could be concealing coyness or shyness if the writer really does think that the other person is a “stud muffin,” but is afraid of the reader taking it the wrong way, finding it “weird” (and therefore the writer makes it a joke by adding lol). Many students think that this utterance would indeed be strange without some kind of affective particle like lol or :) attached.
However, when we change the matrix sentence to a negative pragmatic polarity, people's responses are more variable, and the values of lol and :) may shift subtly. Consider, for example:
You're a real jerk lol
You're a real jerk :)
Here, some people interpret both lol and :) as face-saving hedges (i.e., “You really are a jerk, but I don't feel comfortable saying so without some attenuation”), but others see only lol as a hedge, with the emoticon :) signifying flirtiness, complicity, or attachment (i.e., “You're a real jerk, but I think it's cute.” “I'm referring to an inside joke between us”). Or even “you did something mean to a third party, and I'm happy about it.” If the speakers know each other fairly well, lol may suggest a slightly negative assessment, but not a threat to their mutual friendship (“You're a real jerk, and I can't believe I put up with it”), but :) suggests a more positive attitude (“You're a real jerk, teehee.” “And that's what I love about you”).
Is there a grammar of affective elements in electronically mediated discourse? My own informal and unscientific inquiries suggest that young people have quite clear and nuanced ideas about how to interpret affective particles and that some kind of shared, conventional ‘grammar’ may be developing, but this is a highly speculative assertion at present. What is needed is research that takes different sources of variation (e.g., age, gender, social class, language, culture, education, online experience) into account in the production and interpretation of affective elements in various contexts of electronic discourse.
Discursive dimensions of online communication
Electronically mediated writing can be synchronous (such as chat and MOOs) or asynchronous (such as blogs, websites, wikis, forums). A number of platforms can be used either synchronously or asynchronously (for instance, email, texting, instant messaging, Twitter). In this section we will contrast two examples of online collaboration, one synchronous (MOO) and the other asynchronous (Wiki).
The kind of writing produced in synchronous environments is first and foremost social interaction, and its form reflects that purpose. Like speech, written chat exchanges are characterized by direct interpersonal address, rapid topic shifts, and frequent digressions. Similarly, the functions expressed in chat frequently overlap with purposes normally associated with speech. Writing becomes a channel for lively, spontaneous exchange of thoughts, feelings, ideas, and wit, and tends to be oriented toward developing social relationships and entertaining others more than for informing, explaining, or persuading.
Exploring a MOO
We saw an example of interaction in a French MOO in Chapter 4. Let's here consider an example of written interaction in English involving eight participants in a MOO. The excerpt is from one of my graduate seminars, during a week when our focus was on the dynamics of online interaction. Although MOOs offer many features that go beyond what is available in chat rooms, this particular example only makes use of standard chatroom features. The excerpt below, which represents my students’ first experience of a MOO, begins on what was the 66th turn in the session (the previous turns involved students logging on and getting acclimated to the environment) and is followed by 224 subsequent turns (about six pages) of continued discussion. The excerpt is almost purely language play, centered on the term MOO. The seed of a ‘serious’ theme of conceptual differences across languages is planted in turns 16–17, but it is not taken up again until turn 28 and then again right after this excerpt, at which point the interaction gains momentum and gradually develops greater content focus in the remaining turns.
1. Matt_R says, “this is moo-erific!”
2. Brindy_S :P
3. Matt_R says, “moo-tastic”
4. Matt_R says, “moolicious”
5. Patti_D says, “thattaboy, Matt!!!!!”
6. Paul_B says, “mooving experience”
7. Matt_R says, “or maybe just moo-fy”
8. Brindy_S says, “moot point”
9. Patti_D says, “she moooooaned.”
10. Paul_B says, “moooood altering”
11. Matt_R says, “I'm just mooing around”
12. Matt_R says, “go moo yourself”
13. Brindy_S says, “not in the mooooood”
14. Patti_D says, “moooooooreover, we have work to do!”
15. Matt_R says, “Who's wearing a moo moo”
16. Patti_D says, “so, if you don't have “moo” in your language, then could you understand our current word play?”
17. Patti_D says, “I mean, you could understand “word play,” but could you translate the precise word play?”
18. Matt_R says, “hmm”
19. Greg_S says, “are you serious?”
20. Patti_D says, “would you NEED to understand “moo”?”
21. Patti_D says, “yes, it's a serious question.”
22. Greg_S says, “let's start with a defintion of “moo””
23. Brindy_S says, “………”
24. Greg_S says, “are there moos?”
25. Matt_R says, “examples of conceptual differences across languages: we have moo; they don't”
26. Brindy_S says, “bovine speech”
27. Patti_D says, “but they DOOO, it's just called something else – onomotopoetic (sp??)”
28. Rick_K says, “So what kinds of conceptual differences have you encountered in your own language learning?”
29. Kathy_L says, “we say moooooooo, others might saw meeeewwwww….”
30. Greg_S says, “this is udder-ly interesting”
31. Brindy_S says, “meuh in French I think”
32. Greg_S says, “moo-ving”
33. Matt_R says, “if a moo falls in the woods and noone hears it does it make a sound?”
34. Matt_R says, “what's the sound of one moo clapping?”
35. Patti_D says, “OK, maybe we should moooove to more precise examples”
36. Greg_S says, “is this for real?”
37. Greg_S says, “I keep trying to think, but all I hear is “mooooooo””
I have chosen this particular excerpt because it represents a dramatic breaking away from the kind of discourse that students engage in when sitting around a seminar table talking face to face. Although the students were all in the same room, the fact that they were interacting through the frame of their individual computer screens completely destabilized their normal interaction patterns, removed filters of standard notions of appropriateness, and gave rise to an unbridled ludic spirit among the students. I as teacher no longer had any control of the interaction, nor did my (written) voice have any more authority than that of any other participant – witness how my attempt to corral the discussion in line 28 goes completely unheeded. Wit is the top priority in this excerpt, and there is a traceable development in the humor. The first examples (moo-erific, moo-tastic, moolicious) are somewhat forced, using moo to replace unrelated syllables. This sets the stage for a series of phonologically based examples that play on the /mu/ sound shared with the source word: mooving experience, moot point, moooood altering. Matt_R then experiments with keeping just the /u/ sound in common with the source word (moo-fy – [g]oofy; mooing around – [scr]ewing around). He then tries a further modification, dropping the -ing in mooing (go moo yourself) and gets the satisfaction of an immediate clever response from Brindy_S in line 13 that plays not only on the sound but also on the sexual tenor of Matt's imperative (not in the mooooood).
Several of the witticisms take a different tack, some relying on graphic, not phonological, play (e.g., Patti_D's moooooaned and moooooooreover in lines 9 and 14), others playing on semantic puns (udder-ly interesting), others based on philosophical riddles (if a moo falls in the woods…, the sound of one moo clapping, lines 33–34), and others simply making clever ripostes (e.g., in response to Greg's call for a definition of ‘moo,’ Brindy replies “bovine speech,” lines 22, 26).
Turn 35 begins a shift toward a more serious focus, and marks the transition effectively by being serious in intent, but at the same time incorporating humor in its form (OK, maybe we should moooove to more precise examples). Following this excerpt, examples of language play are sparse in the remaining 224 turns, but they do not disappear completely, with occasional remarks such as That's moooos to me, following a serious comment by a fellow student.
On the surface of things, it appears that this kind of synchronous online interaction is like speech written down. Turns are short, reminiscent of spoken banter, and necessarily quick to maintain the rhythm and pace of interaction in real time. Some turns directly mimic oral utterances (thattaboy, Matt!!!!!) and others imitate spoken stress patterns through the use of ‘upper case voice’ (but they DOOO… which cleverly plays simultaneously on the ‘moo’ theme). There are analogues to non-verbal turns in conversation (e.g., facial expressions, gestures, shrugs, etc. that serve as a communicative turn) such as the sticking out of the tongue emoticon in line 2, “hmm” in line 18, and the pregnant silence “………” in line 23.
On the other hand, a closer look reveals some significant differences from spoken communication. First, messages arrive as discrete bundles rather than overlapping, as turns do in spoken interaction. The written channel is either ‘on’ or ‘off,’ and messages are not posted until the return key is pressed. One does not know what the other participants are writing until their message is complete. This can be a good thing – it prevents one from cutting off fellow interlocutors, assuring that everyone gets to complete their thoughts to their own satisfaction – but it does create a very different dynamic from face-to-face or telephonic communication.
Second, unlike most forms of discourse, coherence in the excerpt is based almost exclusively on the associative links among the word forms, all relating in some way to ‘moo.’ It is rather like living in Ionesco's play The Bald Soprano, with phrases being bandied about that at best have only an associative connection with one another. What is important to recognize is that this language play is encouraged by the medium of communication. The same group of graduate students met weekly with me for fifteen weeks, and never once did word play come up in our face-to-face interactions. The social roles of participants (instructor–students and peer-to-peer discussants) remained constant, and although the purpose of online interaction was to become familiar with the dynamics of interaction in a MOO environment, the specific task at hand (which became no more than a pretext) was to discuss conceptual differences across languages – very similar to the types of topics they discussed each week. The physical environment was different in that they were now in a computer lab rather than sitting around a seminar table; however, there was nothing ludic about the physical environment.
What is not represented in this static transcript is the quick-paced temporal rhythm of the exchange, or the spurts of laughter that accompanied the reading of classmates’ postings. This is an important methodological point, for even though the ‘text’ of the interaction is the same, the participants’ experience of the interaction as it unfolded turn by turn in real time was quite different from that of the reader confronted with a transcript on a printed page. Sometimes messages appeared on the screen in a quick batched flurry, alerting the reader to the fact that a number of extraneous messages might separate turns that are meant to go together, such as a response to a question. Other times there were prolonged screen ‘silences’ but lots of audible clicking of keyboards in the room, signaling active message production. Laughter brought readers’ attention to the screen, and made some participants scroll up, looking for a message they might have missed. It is important to note that in environments like MOOs and chatrooms, in which people are alternating reading with typing, almost no one reads all the messages that appear on the screen – there are always some that escape an individual reader's attention. But which particular turns or messages are missed will vary from participant to participant. This is very different from a face-to-face situation, where everyone hears everything. Although face-to-face speakers’ attention can vary, all are at least exposed to all turns, which is generally not the case in online synchronous chat, unless one is just observing and not contributing to the discussion.
Globally, we see that features of the MOO interface affect conversational behavior: the completely open floor (i.e., the lack of any clear turn allocation system) means that all participants can potentially be writing at the same time, which sometimes leads to a chaotic quality to the interaction. Participants consequently adopt new behavior to cope, such as breaking up their utterances into short fragments that will appear onscreen more quickly, and therefore closer to the utterance to which they are responding. These particular adaptations are specific to the environment, but they reflect a more general human tendency to adapt communicative behavior to the medium, which is not unique to electronic environments but has been the general rule since the origins of writing.
Wikis and We media
We now turn to an asynchronous example of collaborative online participation. Wikis are all about community involvement in the creation and maintenance of texts. A wiki (from Hawaiian wiki wiki, meaning ‘fast’) is a collaborative web environment that allows multiple authors to create, edit, and continually update interlinked web pages.Footnote 18 Unlike traditional co-authored texts, which might have two or three authors, wiki texts may involve the participation of an unlimited number of writers. Moreover, authors work anonymously, so even though a wiki is collaborative, one rarely knows who one's collaborators are. Wikis are best known as encyclopedias – most notably Wikipedia.org, “the free encyclopedia that anyone can edit” started by Jimmy Wales in 2001, which as of 2014 existed in 287 language editions – but they are also used in community websites, corporate intranets, online publications, collaborative document design, course and knowledge management systems, and note-taking applications.
Wikis serve different purposes. Some are hierarchical, giving users different levels of access, that is, control over different functions and content. For example, a wikimedia publication might allow no changes to original content, but allow readers to insert comments and propose links to other web content. Wikipedia.org, on the other hand, allows anyone to add or modify content, but content removal has to be explained and approved by community consensus. A wiki designed to produce a collaboratively authored document will usually not have any restrictions whatsoever. A key operative assumption is that Wikipedia can be trusted because there is a community behind it, with all information being reviewed by many people who are working together.
The counterargument is that the ‘democratic’ assumptions behind a wiki are unacceptable, and that what makes an encyclopedia valuable is the identifiable authority of its writers and the careful curation of its editors. This is the argument often endorsed by the traditional institutional gatekeepers of knowledge – schools, universities, and publishers. But then who decides who the real experts are? And who, in turn, decides who those decision makers choosing the experts should be?
The wiki phenomenon raises important questions about who is in control of language and knowledge, and reflects a broader movement to question cultural and intellectual authority. Wiki media (and self-publishing generally) are, in a broad sense, part of this anti-authoritarian movement. Who needs Britannica when one has Wikipedia? Who needs Random House when one can self-publish with Amazon?
Not surprisingly, journalism is undergoing a similar transformation. Wikis and blogs are increasingly used in what is known as ‘citizen journalism’ or ‘social media journalism.’ During the riots of late 2005 in France, for example, L'Hebdo, a Swiss magazine, sent journalists to Bondy, a suburb of Paris, to blog every day about what they observed and experienced. Months later, after the riots had subsided, the bloggers remained to portray post-riot life on the Bondy Blog, interviewing unemployed youth, hanging out with gang members, going to parties, and talking with the mayor. The reporters found that blogging influenced the way they recorded and reported events. It transformed their writing process, and changed their relationship to their readers, who would post feedback on their blog (Giussani, Reference Giussani2006). What was most remarkable was the impact on some of the idle youngsters from Bondy, who were trained by L'Hebdo to continue to blog themselves with editorial and technical support from the magazine. Five years later, in 2010, the dismal conditions in Bondy had not improved, but the Bondy Blog had become the voice of the region, and some of the young blogger/journalists had published major articles in France's most prestigious newspapers (Jeannet & Schenk, Reference Jeannet and Schenk2010). Through old media journalism and new media technology, the Bondy youngsters had become “actors in their own social space” (Giussani, Reference Giussani2006).
While this everyman approach to journalism has interesting implications, it also has its limits. On June 17, 2005, Michael Kinsley, an opinion and editorial editor for the Los Angeles Times, experimented with an online editorial about the war in Iraq entitled “War and Consequences” which readers were invited to rewrite. His ‘wikitorial’ attracted thousands of readers, many of whom took a hand in contributing to the piece. Additions and modifications appeared in bold and were time-stamped, and comments remained until the next contributor came along. But within two days the wikitorial experiment had to be called off due to online vandalism involving the posting of pornography. Commentators were mixed in their view of the wikitorial. Many saw it as inappropriate, since editorials are normally a matter of a single professional writer taking a coherent, well-reasoned stand on an issue, whereas a wiki turns it into an opinion free-for-all. Some saw it as a pointless use of interactivity just for the sake of interactivity. Others more optimistically saw the potential for a new kind of participatory opinion journalism that could reflect the multiple voices and points of view of a community.
The wikitorial experiment does raise interesting questions about the relationship of genre and medium. It's ostensibly one thing to have an encyclopedia where people are correcting information to make articles as accurate and as up to date as possible, and quite another to edit an ‘opinion’ piece as a reader, where it's not as much about verifying information as arguing a point coherently.
But the two tasks might not be as different as they might appear to be on the surface. One problem with Wikipedia is that it assumes that all information is good as long as it is accurate. But information, however accurate, is never neutral. For example, the Wikipedia entry on Martin Luther King includes information about his extramarital liaisons. For some readers, this information might diminish King's status as a great moral leader. While some Wikipedia participants might want the unvarnished truth, others might want to suppress this information about a man they consider a hero. Formerly, the writing of biography or history assumed that the author could produce at least one solid point of view about his subject. Other writers might publish biographies with different points of view. But each biography stood on its own. Wikipedia, on the other hand, militates against individual authors staking out a unique perspective undiluted by others’ views. As a consequence, Wikipedia potentially denies readers the advantage of engaging with the colloquy among disparate authors. We get information, but in the absence of editorial oversight the perspectives tied up with the information run the risk of becoming so homogenized that readers may not even recognize that multiple viewpoints are represented.
Finally, Wikipedia raises important questions about how knowledge should be substantiated and curated. To date, in order to validate itself within the encyclopedia genre, Wikipedia has stayed rather close to the Britannica model, relying heavily on text, and requiring that entries include published source citations. But in the age of multimodal online technology, might this requirement be outdated? Many forms of knowledge are not encoded in writing, and if Wikipedia's ambition is to provide “free access to the sum of all human knowledge” (R. Miller, Reference Rieger2004), then it cannot limit itself to written knowledge (N. Cohen, Reference Cohen2011). Technology is not the obstacle: speech and video can be easily integrated with text, and they can be quoted as easily as text. Moreover, the requisite know-how is no longer the exclusive domain of experts but has become increasingly widespread among ordinary individuals. Rather, the obstacle lies in the social convention of acknowledging published (and that traditionally means print) sources. In trying to prove itself worthy by comparing itself to traditional print encyclopedias, Wikipedia is actually selling itself short. This is an area where social forces have put the brakes on technological momentum, and it will be interesting to see how the collective influence of individuals will or will not bring about institutionalized change.
But the Wikimedia movement may be slowed by material forces as well. People are increasingly using mobile phones to do tasks they used to do with computers. Editing Wikipedia pages is difficult to do on the small screen of a mobile phone, and fewer people are editing Wikipedia. As Noam Cohen of the New York Times writes, “smartphones and tablets are designed for ‘consumer behavior’ rather than ‘creative behavior.’ In other words, mobile users are much more likely to read a Wikipedia article than improve it. As a result, the shift to mobile away from desktops could pose long-term problems for Wikipedia” (Cohen, Reference Cohen2014, p. B1).
Conclusion
We have seen that electronically mediated discourse is not just shaped by technology but also by broad social forces as well as the widely varying needs of individuals in particular situations. Each act of online communication brings into play a particular set of language forms and communicative practices that are dynamically adapted to the setting and task at hand. Although these adaptations get socially shared (i.e., taken up as new Available Designs), there is no uniform language of electronically mediated communication as might be suggested by David Crystal's (Reference Crystal2006) term Netspeak. Even the various media within the broad category of electronically mediated writing (i.e., email, chat, texting, etc.) cannot be unambiguously associated with particular genres – in fact, each of them can support multiple genres.
The language inventions found in chat, IM, and SMS environments (which sometimes bleed over into blogs, email, discussion lists, etc.) are not really a simplification of writing systems, but an adaptation of those writing systems to allow inclusion of features needed by the online culture of communication. That is, they are features that the forms of writing associated with print culture don't adequately provide. This adaptation is nothing new; this is what has happened with writing again and again throughout its 5,000-year history.
Moreover, graphic reductions are not specific to online environments. Just to give one example, terms of address are frequently abbreviated (e.g., Mr., Mrs., Dr. in English; M., Mme, Mlle in French; Sr., Sna., Da., Ud., Uds. in Spanish).Footnote 19 Such forms are not signs of the degradation of language, and neither are those found in online environments. Indeed, we have seen that the strategies involved in using reduced or recoded forms often involve shifting among multiple representational systems, thus demonstrating the symbolic sophistication and cognitive flexibility of those who use them.
Clearly, writing is not a purely linguistic activity; it involves designing meaning using forms and space. It is partly linguistic, but never exclusively linguistic. The forms of writing may come from different classes of signs, grounded in different representational systems, and an important part of literacy is developing the ability to deal with the multiple systems that underlie written texts.
In the next chapter we will extend our exploration of electronically mediated discourse beyond writing, considering multimedia environments that involve image, sound, or video.
We have always craved rich, mixed, competitive, antiphonal signals.
The environments in which we communicate, learn, conduct business, and entertain ourselves are ever more infused with technology, integrating not only images but also music, graphic design, color, animation, and video with spoken and written language. Consider a few scenarios. In an economics class, students arrive in the lecture hall to find three graphs displayed on a screen at the front of the room. They have already watched their professor's lecture (on video and PowerPoint) the night before on the course website. Class begins with a quick quiz: the professor summarizes a case study, and students must pick the graph that best illustrates the situation he has described. Students use clickers to indicate their choice, and, as they respond, a large bar graph appears on a second screen in the lecture hall, displaying the percentages of students who selected each graph. Students are then asked to discuss the problem in small groups, and after five minutes, they take the quiz again with their clickers. This time all answers converge on one graph.
In the online game Global Conflicts: Afghanistan, the player takes on the role of an investigative reporter visiting a small Afghan village, attempting to find out who is making threats against a school. Through audio and written text, the player must try to understand the views of the Mullah, the Taliban warrior, the ISAF soldier, and others in the village. The player then composes a newspaper article from available headlines, photos and quotes, and makes a recommendation about how to solve the school problem.
On vacation in Florence, you point your smartphone lens at buildings and image-recognition software tells you what they are and when they were built. As you enter a museum, you point your smartphone at a painting to find out who the artist was, when it was painted, its title, and other historical background. During your trip, your phone interprets signs, translates text from Italian to English, identifies foods, plants, and landmarks, and even recognizes music.
While you are on a lunch break at work, you receive an incoming text (“what are you doing?”). Instead of typing a reply, you snap and send a picture of your surroundings. When your friend receives your visual response, it appears on her display for up to ten seconds and then vanishes forever – the image is as evanescent as speech.Footnote 1
These are all examples of multimodal discourse – that is, discourse that incorporates semiotic resources from different modes of meaning making. One of the interesting things about multimodal discourse is that those different modes are generally based on different temporal or spatial logics.Footnote 2 In the context of multimodal discourse, literacy involves understanding relationships within and across modes. This chapter looks at such relationships, showing how language is used in conjunction with other semiotic systems and how material, social, and individual factors interact in multimodal discourse. A key point will be that the decisions that go into designing multimodal discourse are rhetorical in nature, and that the choices people make about what to express in words versus gesture, sound versus graphics, still images versus moving images can be meaningful in and of themselves.Footnote 3
The old and the new
Multimodal communication is not new. Indeed, it has historically been the norm for most communication, whether spoken, written, or signed.Footnote 4 That is to say, although most acts of communication involve some form of language, language rarely stands on its own. We have seen how images and language have complemented one another in texts since the origins of writing, and images were no doubt combined with speech in communication and artistic expression for millennia before that. Medieval illuminated manuscripts, rebus books, theater performance, science textbooks with diagrams and illustrations, concrete poetry, comic books, newspapers and magazines, advertising, instructional manuals, and television captioning are all common examples.Footnote 5
What is new is that in digital environments, verbal, visual, and sonic modes are all represented numerically. This provides a common architecture for integrating music, sound, film, and graphic arts in ‘written’ texts.Footnote 6 By the same token, tools for ‘writing’ may be used to compose in other modes. For example, the voice tracks for video animations can be produced by typing on a keyboard with interfaces such as Xtranormal or GoAnimate. In these kinds of situations, ‘authoring,’ ‘composing,’ or ‘designing’ are more apt terms than ‘writing,’ even though the biomechanical processes look identical to those involved in writing text. Numerical representation also means that anyone who has access to the requisite computer software has the wherewithal to create, transform, or re-mediate content and to disseminate it widely on the Internet, potentially attracting massive audiences.Footnote 7 For example, visual mash-ups, such as the one shown in Figure 8.1, exemplify a popular internet practice of combining images and text from a variety of sources to produce something similar to a comic strip, in which the semantic load is distributed across linguistic and visual modes. Here, Kobe Bryant of the Los Angeles Lakers is put into dialogue with Gandalf from the movie version of Tolkien's The Lord of the Rings: The Fellowship of the Ring, with the humor centered on the ambiguity of the word pass and Bryant's penchant for hogging the ball and shooting. Mash-ups rely on readers’ familiarity with pop culture and are frequently found on Facebook and other social networks.
Figure 8.1 Visual mash-up: Kobe Bryant and Gandalf in conversation
Remixes like this circulate widely as internet memes and give rise to many variations on the basic theme. Originality does not derive from the borrowed images or utterances themselves, but rather from the particular juxtapositions that bring the images and utterances into novel relations with one another. These kinds of recontextualizations of others’ work remind us that creativity is not just an individual trait, but also a socially distributed and collaborative practice.
Recontextualizations also introduce questions about intellectual property and ethics – questions that I myself face as I cite this example in this book. For although I can identify the middle image of Figure 8.1 as belonging to the film The Lord of the Rings: The Fellowship of the Ring, I cannot identify the sources of the other images nor the author who put the words and images together.Footnote 8 In sum, there is no readily identifiable original source to cite. But authentication and citation are not part of mash-up culture – they only become an issue when I remove the mash-up from its native material and social ecology and try to incorporate it into a non-native medium (a published book) that conventionally requires explicit acknowledgment of sources. This transplantation of a text (which is itself composed of other transplanted texts) illustrates how material and social conventions interact and shows us that what is distinctive about this act of recontextualization is not the borrowing itself (for, as Bakhtin (1986) reminds us, our texts are always filled with others’ words) but rather how we acknowledge that borrowing in different contexts. The world of internet remixing and repurposing is like one that Foucault invited us to imagine, namely: “a culture where discourse would circulate without any need for an author,” a world in which discourses “would unfold in a pervasive anonymity” (Reference Foucault, Bouchard, Simons and Bouchard1977b, p. 138). Such visions are at the heart of current debates about plagiarism and its cultural relativity (Pennycook, Reference Pennycook1996). But multimodal texts take the debate to a new level in that they involve the borrowing not only of language, but also of design more broadly, since the look, sound, and feel of a digital object can now be copied and pasted just as easily as language can.Footnote 9
Sharing the semiotic load: graphic overtitles
Let's now take a closer look at how meaning making involves interactions among semiotic modes in multimodal texts. We will take as an example a popular form of captioning used in Korean television shows. Unlike closed captioning, in which ongoing speech is simultaneously shown onscreen in written form, or subtitling, in which speech is translated into another language, this particular form of captioning does not reiterate what is being said onscreen. Rather, it expresses unspoken voices – the internal thoughts of a character, a comment by the producer, a projected reaction from the audience – which may sometimes contrast with the spoken language, images, and actions that are unfolding on the screen. Linguist Junghee Park (Reference Park2009) calls these captions “graphic overtitles” and points out that they add a dialogic dimension to the viewing experience, making it like watching a television show in the company of friends who laugh, make snide remarks, admire stunts, imagine what a character is thinking, and make predictions about what will happen next. Similar in many ways to the intertitles used in silent films to convey the thoughts of characters, Korean graphic overtitles are distinctive in that they draw heavily on graphics and symbols from Korean comic books and electronically mediated communication, with typographical variation and color used to express different voices. However, there is no explicit specification of whose ‘voice’ we are reading, since overtitles float anonymously on the screen.
For example, in Figure 8.2, the three co-hosts are laughing about a man's story about his grades in elementary school, and the graphic overtitles can be translated as ‘big laughter’ written diagonally in purple, followed by ‘got bowled over’ written horizontally in yellow below. What is interesting here from a linguistic standpoint is the absence of a grammatical subject, which Park argues is a characteristic syntactic feature of many graphic overtitles. So, instead of saying ‘the co-hosts were bowled over,’ only the words ‘got bowled over’ appear on the screen. The grammar is adapted to the multimodal context, where the grammatical subject (here, the co-hosts) is represented visually on the screen and only the predicate needs to be expressed in language. Park points out that this principle of omitting subjects that are represented visually onscreen also holds in the case of relative clauses. For instance, ‘a girl who is pretty’ reads simply as ‘who is pretty’ in the graphic overtitle because the girl appears on the screen. In other words, expression of the total semantic structure is distributed across linguistic and non-linguistic resources. Viewers thus read the television screen somewhat like a rebus text.

Figure 8.2 Graphic overtitles that read “Big laughter. Got bowled over”
To a non-Korean sensibility, overtitles might seem to overdetermine meaning. Are they too controlling of audience response? Do they reduce the interpretive work of viewers and consequently limit the range of possible responses and interpretations? Whatever the answer, one thing is clear: Korean viewers consider them an integral part of TV game and reality shows. Park reports that when one episode of the show Infinite Challenge aired without graphic overtitles, viewers responded immediately by creating their own overtitles and uploading the modified video onto the Internet (2009, p. 22). Such acts show how active a role individual viewers can play in modifying and re-disseminating media broadcasts. At the same time, such acts remind us that even in a globalized world, the particular forms, norms, and practices that people apply are deeply embedded in culture-specific expectations and literacy traditions.
If semiotic modes interact, they also affect how the signs they mediate are interpreted. When information is represented and re-represented across modes, the meanings it generates can be altered. We will explore this idea in the following section.
Modes and mediation
Multimedia can filter and transform realities in subtle ways that are not immediately recognized. In other words, what seems ‘natural’ may in fact be highly mediated.
Consider, for example, the 2008 ‘text message scandal’ of Detroit mayor Kwame Kilpatrick. As described by Squires (Reference Squires, Thurlow and Mroczek2011), private text messages between Kilpatrick and his chief of staff (and love interest) Christine Beatty were remediated on television news broadcasts. In the process, the text messages became laden with ideological overtones as they moved from ‘private’ to ‘public’ and from ‘source’ to ‘representation.’ Squires outlines the multiple layers of remediation and recontextualization of Kilpatrick's and Beatty's text messages as they appeared in the Sky Tel digital log file, in court document spreadsheets, in on-screen television displays of the texts, and as they were read aloud on news broadcasts. In the process, the particular ways the text messages were reproduced, contextualized, and sequenced in the news broadcasts had the effect of toning down the playful, inventive qualities of the text messages (see Chapter 7) and radically reframing them as ‘adult’ discourse, thereby negatively shaping viewer perceptions of Kilpatrick and Beatty.
It is important to recognize, however, that it is not the technologies themselves that produce these rhetorical effects. It is rather the journalists who use the retextualizing technologies to present the ‘data’ in seemingly objective fashion, as they conceal their true rhetorical intent. Technology may contribute to the shaping of discourse, but it is also guided by the imagination, rhetorical skill, and ethical responsibility of the individuals who design the discourse to pursue a communicative agenda. We turn now to a second, more involved example.
Kony 2012: individuals shaping the world?
On March 5, 2012, a 30-minute video called Kony 2012 was released by Invisible Children, an advocacy group based in the United States.Footnote 10 The video was designed as an experiment. As described by the Invisible Children website, the question behind the experiment was: “Could an online video make an obscure war criminal famous? And if he was famous, would the world work together to stop him?” The obscure war criminal was Joseph Kony, leader of the Lord's Resistance Army (LRA) in Central Africa, accused of massive human rights violations, including child abduction, mutilation, murder, and child-sex slavery. The video does not launch into this part of the story right away, however. Instead, opening with an image of the planet Earth from space, it establishes the frame of a new world order, made possible by technology, in which individuals 1) should have the right to thrive regardless of where they happen to be born, and 2) have the power to effect social change when they unite behind a just cause.Footnote 11 “Who are you to end a war?” asks the passionate director and narrator Jason Russell as he speaks to a crowd of multiethnic young people listening in rapt attention. “I'm here to tell you: Who are you not to?”
This is the rhetorical hook Russell needs to capture viewers’ attention. He then proceeds to tell the story of Joseph Kony through parallel dialogues with two boys who changed his life: his young son Gavin and a Ugandan named Jacob, whom Russell had met ten years earlier when traveling in Africa. The camera zooms from a Facebook map of Africa to Uganda to the town of Gulu to an ‘on the ground’ interview with Jacob at night. The extreme close-up of Jacob's illuminated face against the black night stirs viewer empathy as Jacob describes how his brother had been killed by the LRA. The camera pans to another boy who explains how children are not safe in their homes, followed by images of children walking (as if migrating) through the darkness, presumably to avoid LRA raids in their villages. The camera then pans to hundreds of huddled young bodies sleeping in what appears to be a shelter. When Jacob (shown in a tight close-up) sobs after he says he would be better off dead than on earth without his brother, filmmaker Russell makes a solemn promise: “We are also going to do everything that we can to stop them.” The camera cuts to several still images of Jacob and Russell as Russell's voice-over explains that his fight over nine years has led to this film, asserting that his promise is “not just about Jacob or me, it's also about you,” as the camera shows hundreds of photographs of young people posted on a wall. Russell explains that 2012 is the year his promise to Jacob can be fulfilled, that the course of human history can be changed. And it is the technology of social media that makes this possible: “the technology that has brought our planet together is allowing us to respond to the problems of our friends. We are not just studying human history,” Russell tells us, “we are shaping it.”
What pulls the viewer in is not just the linguistic rhetoric, but also the synergy of image, sound, and language that gives an ‘insider view’ of Russell's relationship to his son and to Jacob. Viewers are addressed as members of an imagined community who, when aided by technology to unite behind a cause, become empowered to shape a new world order. In a sense, Kony is merely a pretext – a straw man against whom to stand united. The video and the broader Invisible Children advertising campaign do not merely serve this imagined community; they create it.
Kony 2012 is about individual agency, but it really represents an appropriation of the idea of individual agency in two senses. First, on the level of video production, it appears to be Jason Russell's personal YouTube posting, but in reality it is a highly produced film made by Invisible Children, Inc. Second, on a sociopolitical level, Kony 2012 doesn't use social media to empower citizens of Uganda or the Central African Republic to present their own story. Instead, it is a white man from California who tells a story about them. Unlike the Arab Spring (which developed from within certain Middle Eastern countries), Kony 2012 was developed from the outside and designed for viewers on the outside of the situations depicted.
The rest of the video is devoted to Russell's rationale and plan, which can be summarized as follows: the Ugandan army needs to find Kony. In order to do so, they need technology (from US military advisors). The continued presence of these advisors will require widespread popular support. To get widespread support, Kony must become an infamous ‘household name’ in America.
This is the same kind of branding strategy that is used by companies to sell their products. It centers on appealing to viewers’ emotions through a recognizable brand, established through repeated exposure to well-crafted images and slogans. In the case of Kony 2012, the media campaign started with the film, but also involved banners, posters, stickers, and overt support from politicians, celebrities, athletes, and billionaires – and contributions from viewers of the video.Footnote 12
According to the Invisible Children website, Kony 2012 reached 100 million views in just six days. In late 2013, YouTube ranked it as the most watched video of all time in the ‘Non-Profits and Activism’ category. It also enjoyed some political impact: several weeks after its release, a bipartisan group of thirty-three United States senators introduced a resolution condemning Joseph Kony's crimes against humanity and calling for enhanced US support of African forces pursuing the LRA. Senator Lindsey Graham commented on the role of social media in the resolution: “When you get 100 million Americans looking at something, you will get our attention…this YouTube sensation is gonna help the Congress be more aggressive and will do more to lead to [Kony's] demise than all other action combined” (Wong, Reference Wong2012).
In the end, it was not the content of the video that carried the day, but rather its packaging. The oversimplification of Kony 2012's premises, its ideological bias, and its lack of Ugandan perspective are hidden by the careful design and polished editing of the images, music, sounds, and voice of the film, which appeal above all to viewers’ emotions and preconceived notions of corruption and violence in Africa. The story is told to a child, and the film simplifies the context to a child's level of understanding, but it covers this oversimplified narrative with a veneer of technological sophistication.Footnote 13
The day after Kony 2012 was released, blogger Jennifer Lentfer commented on the rhetoric of the video:
They conjure up a horrible situation, only to let us distance ourselves from the difficult emotions it inevitably brings forth by creating a shallow sense of empowerment, that is, enabling us to believe that we can change the course of another country's history. It's a Hollywood blockbuster, the ultimate gaming experience, and we're the heroes.
But another hero in this story is technology. Technology gives viewers the sense of ‘seeing’ the problem objectively (because the camera seems to present an incontrovertible ‘on the ground’ reality) and technology offers a solution to that problem by providing the means to amass individuals’ voices and money to effect social change. That is the positive face put on technology from the early moments of the video. Less overt is the ‘artful,’ manipulative side of the technology that allows the filmmaker to select, frame, and sequence images and juxtapose them with the right words to persuade viewers of the rightness and urgency of the cause.
In the next example we will consider a very different use of video: to facilitate personal communication. Even though videoconferencing does not usually involve rhetorical design, the video medium can nevertheless have subtle effects on people's self-representation, interactional behavior, and mutual understanding.
Videoconferencing between California and France
People separated by distance have traditionally been obliged to communicate by writing or by telephone. The Internet has made synchronous audio-visual communication possible through videoconferencing programs such as Skype. Although at first blush videoconferencing gives the impression of immediacy (Bolter & Grusin, Reference Bolter and Grusin2000), what gets included in the audio-visual signal transmitted from one conversational partner to another is filtered and transformed by both hardware and software. This means that people sometimes have to manage semiotic resources differently than they do in face-to-face conversation.
One research project I've been involved with for a number of years examines videoconferencing interactions between learners of French in the United States and graduate students in France who are preparing to become teachers.Footnote 14 Both American and French cohorts consider their online interactions to be authentic, highly engaging, and a welcome addition to their regular academic courses. Nevertheless, a number of significant mediational issues underlie their videoconferencing exchanges. Below we will consider a number of ways in which visual and audio cues can be transformed by technological mediation and how the students we have studied adapt to those transformations.
The webcam is the eye through which all visual communication occurs in videoconferencing. Because webcams are often built into computers, they cannot be easily repositioned, and users have to remain relatively still if they are to remain in the webcam's field of view. This means that people may unnaturally restrict their movement when videoconferencing. The webcam also exaggerates the effects of physical movement (see Figure 8.3). Whereas a short distance from the webcam creates a sense of immediacy and intimacy, a distance of even three feet makes one look distant. Parkinson and Lea (Reference Parkinson, Lea, Kappas and Krämer2011) found that when people videoconference with people they don't know well, they tend to compensate for relatively intimate visual contact by talking about less personal topics in order to increase social distance. “Paradoxically,” they write, “one consequence may be that [videoconferencing] produces less intimacy than text-based or audio-only communication, because, in the latter cases, interactants may seek to increase rather than decrease the emotional relevance of the conversation itself when fewer alternative cues are available” (p. 103).

Figure 8.3 Shifts in position are exaggerated by the webcam
As anyone who has used a videoconferencing application knows, eye gaze is managed very differently online. Real eye contact does not exist. When interlocutors look at each other, they appear to be looking downward. If they want to create the illusion of looking into their interlocutor's eyes, they have to look directly at the webcam, but then paradoxically they cannot see their interlocutor at all. Figure 8.4 shows a woman asking a question, looking directly into the webcam to create an illusion of eye contact, and then looking at the screen as she listens to her interlocutors’ response. Users adjust quickly to the new dynamics of gaze management in videoconferencing – sometimes even claiming during interviews that they maintained frequent direct eye contact with their interlocutors (even though mutual eye contact is currently impossible).

Figure 8.4 Looking at the webcam (left) and looking at her interlocutors (right)
Webcams mediate gestures as well. Although people often gesture extensively during videoconferencing exchanges, when their gestures occur outside the webcam's field of view they remain invisible to their onscreen partners. Ironically, the closer a speaker is to the webcam (e.g., leaning in toward the computer, signaling a high level of involvement), the less likely it is that his or her gestures will be captured by the webcam. On the other hand, the greater the speaker's distance from the webcam (suggestive of social distance in face-to-face interaction), the greater the likelihood that gestures will be picked up by the webcam. Perhaps as a way of compensating for the webcam's limited visual field, people often heighten their facial expressiveness as they speak (Figure 8.5).

Figure 8.5 Animated facial expressivity online
Perhaps the most significant limitation of webcams is that they can create the illusion that people are looking at each other when in fact they may not be. In one instance, two Californian students, while interacting with their French online partners, launched a video that the partners had mentioned. As they watched the video (which covered their Skype window), their French partners tried in vain to engage them. The Californian students could not see their French interlocutors (and could not hear them over the sound of the video), yet to the French it appeared that the Californian students were looking at them normally, so they erroneously concluded there was a technical problem with the connection.Footnote 15
Audio mediation is significant as well. We are accustomed to the transistorized sound of speech in the telephone, but bandwidth and connection issues can make speech in videoconferencing downright garbled at moments. Moreover, there is no way to know if one's own voice is distorted unless one's interlocutor says it is. Lag and desynchronization of the audio and video signals can also affect communication. While videoconferencing, people generally have the impression they are seeing their interlocutors directly, as if they were right in front of them. But in fact tiny delays are produced as the signal is compressed, delivered over hundreds or thousands of miles, and decompressed. What is perceived, both aurally and visually, has been produced in the past, however slightly. The effect is to introduce a slightly awkward rhythm to the interaction, making one feel as if one is always just a tad behind the beat. At moments when the delays become more pronounced, users sometimes wonder whether the desynchronized smiles, gestures, or facial expressions they see onscreen are responses to what they are saying right now or whether they correspond to what they said a moment earlier. And people sometimes mistake a transmission lag for hesitation on the part of their interlocutor. These ambiguities, combined with dropped frames, which result from bandwidth limitations and produce a jerky appearance to body motions, can present real challenges to understanding.
Mediation and learning
From the above discussion, it might seem that technological mediation in videoconferencing has a largely negative effect. That would be to overestimate the impact of the medium, since, as mentioned earlier, participants have overwhelmingly enjoyed their online interactions. The point is that even though participants are satisfied with their overall experience, they nevertheless face small challenges introduced by the medium itself – in addition to the larger challenges they face in understanding and making themselves understood in a non-native language. What is important is that all participants be aware of the small effects of the medium so that ‘medium behavior’ does not become confused with participant behavior.Footnote 16
From an educational standpoint, videoconferencing offers not only an opportunity to learn from others at a distance, but also an opportunity to learn from retrospective analysis of the interaction itself. Recording technology textualizes communicative interactions, making it possible for students to archive, analyze, discuss, and learn from whatever misunderstandings might have occurred during their online exchanges. One student described the experience of reviewing a recording of a videoconferencing session as follows:
It's kind of like rereading a book that you've already read in terms of familiarity. You kind of know what's coming up next, so the next time you catch more stuff. So I felt like I had a better clarity of what exactly happened in the conversation. I have more self-awareness in terms of what I said, I'll be like, “oh, ok” I know what I was trying to say at that point in time, I can hear now what I did say, but I think now that I could have said it better this way. So it's sort of similar to going back and editing an essay or anything like that, you're re-appraising your own work. And plus it's nice to kind of pick up maybe on a few things you missed out on in that conversation because there's so much to be learned in the course of those conversations, and you hate the fact that there's only so much you can learn.
This student's comments raise a number of key points. First, because he is no longer in the heat of the communicative moment, he can attend to details of the interaction that he hadn't noticed the first time around. Second, having directly experienced the event, he can anticipate moments of uncertainty or misunderstanding and pay special attention to these points. Third, when he compares his memory of the interaction with the objective data of the video recording, he comes away with a greater sense of self-awareness. Finally, likening the process to revising an essay, the student has an opportunity to self-assess and to think about alternative words or actions he could have used, potentially enlarging his repertoire for future interactions.
We will return to the use of technology to heighten users’ awareness of mediation in Chapter 10. We now turn to the question of how multimodal discourse might affect how we think about literacy.
Literacy and multimodal discourse: is the image replacing the word?
It is frequently observed that literacy is undergoing an important shift from a dominance of writing to a dominance of image. Pundits now take it as a given that images speak louder than words. For example, commenting on the image of President Obama speaking at the ceremony commemorating the fiftieth anniversary of Martin Luther King's "I Have a Dream" speech, critic Alessandra Stanley wrote: "Rhetoric soars, but everyone knows a visual image can convey more than even the most dramatic or inspiring language. Dr. King's words still captivate, but nowadays, the camera trumps sound" (2013, p. A16). One may not agree with Stanley's assessment, but it certainly resonates with much of the discourse in the popular press, and with some market trends. For example, in US mobile networks, images are trending up and text is trending down. According to a wireless trade association report, US mobile networks carried 41 percent more multimedia images but 4.9 percent fewer text messages in 2012 than they did in 2011 (Lawson, 2013).Footnote 17 Images and especially video are also becoming increasingly common ways of accessing information: YouTube has become a first-choice search site and internet entry point for many young computer users – hence Google's purchase of YouTube. We increasingly develop visual models of information and concepts that used to be presented only in prose. PowerPoint presentations are now the default rather than the exception at professional conferences. Online, we engage in new forms of participation that may involve avatars whose appearance we design.Footnote 18
What are the implications for literacy? Because writing and images operate with different logics, social semiotician Gunther Kress predicts that
the dominance of the mode of image and of the medium of the screen will produce deep changes in the forms and functions of writing. This in turn will have profound effects on human, cognitive/affective, cultural and bodily engagement with the world, and on the forms and shapes of knowledge. The world told is a world different to the world shown.
Kress (2005, 2010) elaborates on this point by arguing that telling and showing involve different epistemological commitments. Kress compares writing or saying "A plant cell has a nucleus" with drawing a picture of a cell, pointing out that whereas the verb ‘have’ expresses a relation of possession or ownership, the drawing does not.Footnote 19 However, the act of drawing forces a decision about where the nucleus should be positioned within the cell walls, which the linguistic utterance does not. Thus writing or speaking confronts us with a different kind of epistemological commitment from that of drawing. Kress argues that when we use language we deal in categories of relations (such as causality, possession, and so on) and when we use images we are concerned with the positioning of elements in a framed space.
This line of argumentation is perhaps overly reductive in the sense that the language/image distinction is not always so stark. Writing always involves spatial arrangement and speech is most frequently accompanied by visible gestures, body language, and facial expressions that are often crucial to interpretation. And drawing can show logical relations through conventional signs like arrows. Nevertheless, Kress's distinction between telling and showing does seem useful when thinking broadly about how we are positioned to think in certain ways by different modes and mediums of expression. Jones and Hafner (2012) offer the example of describing a traffic accident. English speakers will tend to formulate their narrative in a subject–verb–object pattern, attributing agency (as in ‘The truck hit the car’), whereas if they produce a drawing of the accident, they will attend to the particular spatial relationship between the truck, the car, and the road, and perhaps other vehicles as well, but might not include any information about agency (pp. 52–53).
If language and image offer different resources for meaning making and involve different epistemological commitments, then there are also implications for learning. Chun and Plass (1997), for example, point out that learning from text and learning from images involve different cognitive processes and that educators can take advantage of these differences to improve learners’ comprehension through strategic uses of different modes. However, when it comes to graphics, the technology that allows a teacher to conveniently prepare illustrations of complete models, data charts, and so on in advance of a presentation may not always produce a good learning experience for students. It is often far more comprehensible to watch a presenter draw a diagram or a model on a blackboard, seeing the step-by-step process that lies behind what is being drawn, than to see only the finished product, however aesthetically superior it might be, flashed on a screen.
Communications scholar Mitchell Stephens speculates that the predominance of images will eventually affect both our writing and our speech: "the language our descendants write and speak will increasingly be a less precise, less subtle language – one designed for use with images" (Stephens, 1998, p. 207). Whether or not language becomes less precise and subtle, one thing is clear: if language was king in the mediums of print and radio, which made language look and sound autonomous, today's television and computer mediums remind us how fundamentally language interacts with and complements other semiotic modes. Word and image are both important technologies for the design of meaning, and literacy requires competence with the one just as much as the other. As we have seen, technologies do not replace one another, but remediate and complement one another. Written text is at no risk of disappearing, but it will increasingly share the semiotic stage with other players. This mutual dependency can be seen on the audio-visual side as well: photos, videos, and music have become ‘linguistic’ in the sense that they have verbal tags that allow them to be searched, grouped, and shared.Footnote 20 These tags add value to the original visual object by allowing its features and qualities to be specified by multiple users. The identity (and usefulness) of visual objects is thus increasingly textualized over time.
Semiotics of the image
While the forms and functions of linguistic structures are formalized in grammars, we haven't had a comparably explicit vocabulary to describe the representational modes of images. But this is changing. In the 1950s and 1960s, Roland Barthes (1957, 1977) reminded us that visual communication, like language, is always culturally coded and, applying the structuralist ideas of the day, he distinguished between denotative and connotative layers of meaning in photographs. Gunther Kress and Theo van Leeuwen critiqued and built on Barthes's ideas, adding perspective from Michael Halliday's systemic functional linguistics to develop a social semiotic framework for the analysis of visual design in Reading Images (1996), which was followed by other useful books, such as van Leeuwen and Jewitt's Handbook of Visual Analysis (2001), Kress and van Leeuwen's Multimodal Discourse (2001), and Kress's Multimodality: A Social Semiotic Approach to Contemporary Communication (2010).Footnote 21
Some of the principles of visual semiotics have been in circulation for some time. In film and photography, for example, camera angles, framing, and focus have long been known to guide the viewer's attention and to influence interpretation in particular ways.Footnote 22 Other principles have come to light more recently, often related to insights from discourse analysis and cognitive linguistics. For example, left and right areas of a visual layout can connote a variety of relations, such as negative–positive valuation, before–after time sequences, or given–new information structure.Footnote 23 Such configurations are of course dependent on the language's direction of writing: left–right relations in English become right–left relations in Arabic or Hebrew. More universal would be differences of valuation along the vertical axis, with ‘up’ and ‘high’ being better than ‘down’ and ‘low’ (Lakoff & Johnson, 1980). Kress and van Leeuwen (1996) suggest that the upper portion of a visual layout often represents the ideal, whereas the lower portion represents the real, or where readers are called upon to take some form of action. They also discuss center–margin relationships, with the center being associated with core information, and the margins showing information of lesser importance.
Eye-tracking studies suggest that readers look for the most important information at the top of a screen, and tend to direct their gaze most frequently to the left half of the screen (Nielsen, 2010). However, it should be noted that these studies are usually performed with business and news sites, which have a long (paper-based) tradition of placing the most important information ‘above the fold,’ and it may be that other kinds of sites encourage different eye movement patterns. For example, some types of documents such as financial spreadsheets and checkbook registers may draw readers’ primary attention to the ‘bottom line’ rather than to upper regions of the document.Footnote 24 Cause and effect are hard to untangle here, but it does seem safe to say that how people look at web pages is not entirely a function of page design, but involves an interaction of design with individual readers’ purposes.
What is new and interesting about page design in the context of digital environments is precisely the ability to accommodate layouts to different readers. Whereas a fixed print text might be designed to strike a ‘happy medium’ in addressing the envisioned needs, interests, and sensibilities of a range of different readers, digital texts can be reconfigured on the spot to provide the right information in the right manner for readers of different languages and cultures. So, for example, when one goes to the Air France website, one indicates one's country and language to get the most relevant and understandable information. In Facebook, when one sets the language differently, one also gets different layouts (compare English and Arabic, for example, to see how the page layout is mirrored). One challenge for research in multimodal discourse is thus to study the interaction of multiple modes not just in some generic or universal sense, but also in how multimodal discourse is enacted and interpreted by different people, ranging from monolinguals to plurilingual and pluricultural individuals.
Conclusion
Literacy has always been multimodal. But today we are faced with more choices than ever about how to communicate. What to express through still or moving images? What to express in speech? What in writing? And what will it ‘say’ to express it in that particular medium? Furthermore, each choice entails subsequent decisions about the particular mediational tools and techniques to be used, and the most appropriate styles and forms of language to employ. These decisions matter, because modes, technologies, and language are not neutral conduits.
Designing meaning is relational work, and it is rhetorical work. Whereas the rhetorical decisions involved in technology-mediated communication used to be reserved for specialists, they are now often in the hands of ordinary people. This change has important implications for education. If different modes and media have different affordances, then it is important for educators, first, to understand what those affordances are in specific acts of reading and writing and, second, to help young people develop their understanding of the ‘invisible’ processes that go into their practices. This will be taken up in Part III.