1. Introduction
The last several decades have seen a growing literature comparing the structures of language to those of pictures, especially to the sequential images of visual narratives like comics. Numerous linguistic theories have been applied to sequential images and their units (Bateman & Wildfeuer, 2014; Borkent, 2023; Cohn, 2013; Davies, 2019; Szawerna, 2014), and experimental work has shown similar (neuro)cognitive processing for the understanding of visual sequences and language (Cohn, 2020b; Loschky et al., 2020). More recent work has begun asking whether graphics might also display typological regularities similar to those of languages (Bateman et al., 2021; Cohn, 2024; Krajinović et al., 2025), especially exploring potential ‘universals’ that persist across diverse variation in graphic systems. We here explore one such typological domain: Do graphic systems display the same relations between complexity, size and frequency as the lexicons of languages?
Although lexicons of languages clearly vary in their structural complexities (Croft, 2003; Haspelmath, 2017, 2018; Jackendoff & Audring, 2020), a widely observed tradeoff emerges in the frequency of different lengths of word units, with shorter words being more frequent than longer words, what is known as Zipf’s law of abbreviation or brevity (Zipf, 1949). Zipf’s law of abbreviation persists across diverse spoken and written languages (Bentz & Ferrer-i-Cancho, 2016; Koshevoy et al., 2023; Levshina, 2022; Linders & Louwerse, 2023; Lyu et al., 2025; Pimentel et al., 2023; Zipf, 1935) and sign languages (Kimchi et al., 2023). Such word lengths may also be constrained by information density, not just frequency (Piantadosi et al., 2011). This brevity law has been taken as a linguistic universal (Bentz & Ferrer-i-Cancho, 2016), even appearing in some vocal systems in animals (Ferrer-i-Cancho & Lusseau, 2009; Heesen et al., 2019; Semple et al., 2010).
The ubiquity of Zipf’s law of abbreviation has been explained as a principle of least effort (Zipf, 1949), where behaviour is optimized by minimizing effort (i.e. using smaller words). Indeed, longer words typically take longer to read (Barton et al., 2014). An information-theoretic interpretation describes this in terms of a balance between expressivity and parsimony or compression (Ferrer-i-Cancho, 2018; Ferrer-i-Cancho et al., 2022), that is, longer words are thought to be more expressive, but the demand of longer length is in tension with aims to limit communicative cost. Greater cognitive load would therefore favour cognitively easier processing and thereby shorter, more frequent words (Linders & Louwerse, 2023). Other work has argued that Zipf’s law of abbreviation arises less from pressures for communicative efficiency than from a cultural evolutionary pressure for brevity (Morin & Koshevoy, 2024).
Given that properties of visual narratives have been shown to involve the same (neuro)cognitive processing as linguistic systems (Cohn, 2020b), if Zipf’s law of abbreviation originates because of cultural evolution and/or domain-general or information-theoretic constraints, it should also persist in the graphic modality. The critical methodological question then is: What graphic unit, and what measurement of its size, is analogous to words and their length?
Word length is measurable in various ways, including the quantity of phonemes or morphemes, or for many studies, the length of a written word in letters (Grzybek, 2007). All of these measures are also characterizable by length within the one-dimensional arrays constrained by the duration of speech (or its graphic depiction as lines of writing). Some work has looked at graphics to investigate Zipf’s law of abbreviation. Studies of the complexity of individual characters in diverse writing systems display tendencies consistent with Zipf’s law of abbreviation (Koshevoy et al., 2023; Lyu et al., 2025). However, writing systems use graphic representations to convey speech sounds, and a measure of graphic ‘complexity’ for a corpus of historical pictures has not yielded a trade-off with frequency (Miton & Morin, 2019), leading to speculation that iconicity does not allow for such a law of brevity.
Nevertheless, when people produce strings of emoji, utterance lengths maintain the expected trade-off between length and frequency (Cohn et al., 2019; Sekine & Ikuta, 2024). While this work suggests that graphics may evoke Zipf’s law of abbreviation, emoji are produced in one-dimensional arrays alongside, or substituted for, text. In addition, unlike the canonical studies of word length, it is emoji sequencing that maintains a tradeoff in length and frequency, while emoji as units are consistent in their sizes.
In contrast, the units of visual narratives are panels, and though narratives also use a sequence, panels are flexibly shaped as analogue two-dimensional spatial arrays, meaning that they not only have a length but also a width. Such measurements of size are also continuous, unlike the discrete, countable letters. To investigate the properties of visual narrative panels, we therefore triangulate the size and frequency of these graphic units with their ‘morphological complexity’ because this dimension is argued to comprise the lexicalized aspects of visual sequences, which relate to the amount of information in panels, rather than their graphic features alone. In addition, prior work has examined morphological complexity of panels across several decades of corpus analyses.
The morphological content of panels can be decomposed into two parts (Cohn, 2007, 2024): active entities are elements that motivate the structure of a sequence (as shown in Figure 1a), while inactive entities are not necessary to understand a sequence, though they may contribute additional (environmental) meaning and context. Therefore, a sequence should still be understandable if inactive information is deleted (see Figure 1b), but should be harder to understand if active information is deleted (Figure 1c). Prototypically, active entities are the characters or objects in a scene, as in Figure 1a, while inactive information is the background. Active entities also function as the morphological stems to which visual affixes attach (Cohn, 2007, 2013, 2024), that is, the speaker (active entity) of a speech balloon (affix) or the mover (active entity) of a motion line (affix, see the panels in Figure 1).

Figure 1. (a) A panel diagrammed for active and inactive entities and sequences, (b) deleting inactive entities but maintaining active entities and (c) deleting active entities and maintaining inactive entities. Les Tuniques Bleues #1 by Willy Lambil and Raoul Cauvin, Belgium.
Prior analyses of visual narratives have examined the morphological complexity of panels based on categories characterizing the number of active entities, what has been called a panel’s attentional framing structure (Cohn, 2011, 2024; Cohn et al., 2012). Panels are categorized as containing multiple active entities (macros, as in the first four panels analysed in Figure 1a), or a single entity (monos, as in the final two panels in Figure 1a). Panels can also show less than a single entity using a zoom (micros), show no active entities but only an environment (amorphics), contain only a visual affix with no root (affixing, e.g. a panel with only a speech balloon but no speaker) or have no content at all (null). In line with word length effects in reading (Barton et al., 2014), panels with multiple entities are viewed more slowly than those with single entities, which in turn are viewed more slowly than those with zooms of entities, even when panel size is uniform (Cohn, van Middelaar, et al., 2024).
Analysis of these framing categories in corpora of various sizes – from 30 comics to 380 comics – has revealed that panels with single or multiple entities occur consistently more often than those with zooms or no entities. Yet, variation also occurs, with panels from Asian comics typically having fewer entities than those from North America or Europe (Cohn, 2011, 2024; Cohn et al., 2012). These results were originally taken to suggest that geographic region or culture was the defining feature of variation between these comics, but other factors could explain this variation, such as the patterns of the graphic systems used in the comics, which may transcend geographic and cultural boundaries.
Visual Language Theory (Cohn, 2013, 2024) posits that pictorial systems use similar structure and cognition as spoken and signed languages. Supporting this, visual narrative sequences elicit neural responses similar to those evoked by manipulating sentence structure (Cohn, 2020a), modulated by both proficiency and age of acquisition (Coderre & Cohn, 2024; Cohn, 2020a). Under this linguistic framework, variation across graphics is better explained as reflecting the patterns of diverse visual languages, just as spoken and signed languages differ typologically.
Visual languages should thus vary across their lexicons (i.e. the patterned representations of pictorial units) and their grammars (i.e. the organizational systems guiding sequences of pictures). Prior corpus analysis has substantiated this idea. For example, the ‘manga style’ is indicative of a Japanese Visual Language that originated in Japan but has proliferated around the world (Brienza, 2015). This system displays typological characteristics of both lexicon and grammar that differ systematically from those of the Kirbyan American Visual Language (i.e. stereotypical of superhero comics), which originated in the United States, and from the visual language(s) in European comics (Cohn, 2013, 2024). Indeed, comprehenders’ processing is modulated not only by general proficiency for visual narratives, but also by proficiency in the lexical and grammatical patterns of distinctive visual languages (Cohn & Foulsham, 2020; Cohn & Kutas, 2017; Hacımusaoğlu & Cohn, 2025; Shimizu et al., 2025).
Thus, it is unclear whether findings that Asian comics typically have fewer entities per panel than comics from Europe or North America are indicative of cross-cultural variation or of differences between particular visual languages. Some analyses do seem to suggest this visual linguistic rather than cultural interpretation. For example, ‘American manga’ created by authors in North America display properties of panel framing which are closer to their ‘style’, i.e., Asian manga, than to their geographic origin, i.e., American comics (Cohn, 2024).
While these findings suggest typological diversity across the units of visual narratives, additional corpus analyses indicate that Zipf’s law of abbreviation is maintained across sequences in visual narratives in a more universal way. For example, for repetitions of panels sharing the same framing categories (e.g. successive panels containing multiple entities, successive panels containing one entity), shorter sequences were more frequent than longer sequences with a distribution akin to Zipf’s law of abbreviation (Cohn, 2024). Similar tradeoffs between sequence length and frequency have also been observed in other structures of visual narratives, such as the number of panels in rows of layouts (Cohn, 2024) and the number of panels separating types of backgrounds (Atilla et al., 2023). All of these relationships between frequency and sequence length occurred independent of the style or global origin of the comics. These findings are consistent with observations that Zipf’s law of abbreviation appears not only between frequency and the lengths of words (Bentz & Ferrer-i-Cancho, 2016), but also between frequency and the lengths of phrases (Ryland Williams et al., 2015).
These findings suggest that Zipf’s law of abbreviation persists for various types of sequences in graphic systems beyond their diverse structures, yet it remains unknown whether such a tradeoff characterizes aspects of visual narrative units themselves (i.e. panels and their content). In particular, no work has yet triangulated between entities per panel, their frequency and the physical size of these graphic units. We therefore investigated these issues by looking at the size and frequency of panels with different numbers of entities per panel – here using the raw count of active entities in order to have an ordinal measure of comparison across panels, rather than the framing categories used in previous studies.
We first ask: Are differences in the number of entities per panel in comics systematically related to typological variation and/or geographical or cultural origins of comics? As analysis of prior corpora conflated these factors, we sought more clarity by using the TINTIN Corpus, a corpus with global scope and a diverse range of styles (Cohn et al., under review). Because drawing style characterizes the morpho-graphological structure of a graphic system, and is a primary access point for creators, it may hint at regularities of other structures (like entities per panel). We therefore treat ‘style’ as a workable methodological proxy for classifying potential ‘visual languages’. The caveat here is that styles do not necessarily map cleanly onto visual languages, and that ‘entities per panel’ is just one typological dimension that may characterize such languages. As the possibility of identifying visual languages is treated as an empirical question within the broader TINTIN Project, future work will explore more robust classifications of visual languages by combining annotations of various dimensions and style metrics (see Cohn, 2024, for a first attempt at this with a prior corpus).
Next, we asked: Is the number of entities in panels related to their size? Here we predicted a positive relationship, that is, larger numbers of entities per panel will come with larger panel sizes. This would maintain the iconicity principle that ‘more form is more meaning’ as has been argued to explain various typological phenomena (Croft, 2003; Haiman, 1980; Taylor, 2002), including Zipf’s law of abbreviation (Haiman, 2008; cf. Haspelmath, 2008, 2021). While a relationship between panel size and number of entities should be intuitive because larger physical sizes allow for more contents, it is certainly not a given. The size of the graphic content could scale relative to the unit size, for example, smaller panels could also have multiple entities, just drawn proportionally smaller. We further predicted that such an association would be universal across potential sources of diversity like styles or global origin.
Finally, we asked: Is there a relationship between panels’ information/size and their frequency? In line with Zipf’s law of abbreviation between word length and frequency, we predicted a negative tradeoff such that panels with fewer entities are more frequent than those with more entities. In addition, we predicted that this negative tradeoff would transcend any systematic diversity across graphic systems.
2. Methods
2.1. Materials
We used data from the TINTIN Corpus, an annotated collection of 1,030 comics from 144 countries/territories around the world with a wide distribution of styles, as depicted in Figure 2. Additional details about the corpus can be found in Cohn et al. (under review) and are represented in the dataset for this study in its repository: https://doi.org/10.34894/MAEOJE. In total, we examined 75,603 panels.

Figure 2. The distribution of the 1030 comics from the TINTIN Corpus in terms of (a) their quantities from 144 countries and (b) their styles across global regions.
We used classification of the drawing ‘style’ of the comics as a proxy for comparing potential differences between graphic systems. However, incorporation of both style and entities per panel, along with other lexical and grammatical structures, into a data-driven assessment of what constitutes ‘visual languages’ is an extended goal of this corpus research (see Cohn, 2024 for a precedent). The TINTIN Corpus includes several diverse drawing styles that span across global regions (Figure 2b). These were categorized using a combination of manual annotation and computer vision techniques (for details, see Cohn et al., under review; Titarsolej et al., 2024), resulting in five primary stylistic designations. Manga were stylistically stereotypical of Japanese manga, while Cartoony styles had plastic features without accuracy to real proportions. The RealExaggerated style characterized embellished features that kept real proportions, such as more ‘realistic cartoony’ and/or ‘superheroic’ styles. Realistic styles kept proportions and dimensions more accurate to general perception, and Alternative styles used unorthodox or unclassifiable representations.
2.2. Areas of analyses
Comics in the TINTIN Corpus were analysed using the Multimodal Annotation Software Tool, or MAST (Cardoso & Cohn, 2022), where vector-based regions were selected corresponding to panels (Figure 3), which were then further annotated (see Cohn et al., under review for details). Annotators first assigned panels to categories of framing (macro, mono, micro, amorphic, etc.) using the Visual Language Theory: Morphology: Framing v.2 annotation scheme (Cohn, Hacımusaoğlu, et al., 2024), and the number of entities per panel was added to the Notes field of annotations if it was not inferable from the framing category. For example, macros could contain any number of multiple entities, but monos by definition show a single character unless that mono represented a group consisting of multiple figures, though these were rare (multi-figure monos constituted only 0.068% of all mono panels).

Figure 3. A screenshot from the MAST interface showing a selected panel and its annotations of entities per panel in the Notes Field for the annotation of its framing type as a macro panel. Example comic is Archie’s Friend Scarlett © Archie Comics.
‘Entities’ were counted as any identifiable element in a panel that contributed meaningfully to the visual sequence (as in Figure 1b), and they were not limited to humans or animates. For example, the primary panel in Figure 3 has four entities (three humans and one dog). In Figure 1a, the first four panels have three entities (two humans and one horse), while the final panels have one (human) entity.
Annotations were carried out by eight trained coders who independently analysed each comic using MAST. All annotators were trained in the annotation procedures and were required to sufficiently analyse multiple practice comics before proceeding to annotate the TINTIN Corpus. All annotations were checked by at least one other supervisor, and debated annotations were discussed in weekly meetings until agreement had been reached.
2.3. Data analyses
We first asked whether the number of entities per panel varies with style and/or geographic/cultural origin of the comics. Here we compared linear mixed models with the number of entities in each panel of each book of the corpus as the dependent variable. Entities per panel were incremented by 1 and then log transformed (allowing 0 values to be included). Fixed effects were the comics’ style and/or global region, and comic was included as a random effect. Models were compared using maximum likelihood (ML), with follow-up contrasts using a Bonferroni correction.
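The transform and the information criteria used for model comparison can be sketched as follows. This is a minimal illustration rather than the analysis code used in the study: the log-likelihoods, parameter counts and number of observations are invented placeholders, the natural log is assumed, and fitting the mixed models themselves is left to a statistics package.

```python
import math

def log1p_entities(counts):
    """Add 1 to each raw entity count, then take the natural log (keeps zeros)."""
    return [math.log(c + 1) for c in counts]

def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_likelihood

def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: k ln(n) - 2 ln(L)."""
    return n_params * math.log(n_obs) - 2 * log_likelihood

# Hypothetical fit summaries for three candidate models (NOT values from the paper).
candidates = {
    "style": {"loglik": -1000.0, "k": 7},
    "region": {"loglik": -1040.0, "k": 5},
    "style_x_region": {"loglik": -995.0, "k": 30},
}
n_obs = 5000  # hypothetical number of panels

scores = {
    name: (aic(m["loglik"], m["k"]), bic(m["loglik"], m["k"], n_obs))
    for name, m in candidates.items()
}
# Lower AIC/BIC indicates the preferred model.
best_by_aic = min(scores, key=lambda name: scores[name][0])
best_by_bic = min(scores, key=lambda name: scores[name][1])
```

With these placeholder numbers, the simpler style-only model is preferred because the extra parameters of the interaction model are not offset by a sufficiently higher log-likelihood.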
Our next analysis sought to establish a relationship between the number of entities per panel and the physical size of those panels, calculated as the relative area (percentage) of the panel out of the total size of the page. For each comic, we calculated the average size of panels for each number of entities per panel and log transformed both size and entities per panel data. As panels with no entities are both rare – as in spoken languages (Croft, 2003) – and play supplementary functional roles in visual narratives to the primary sequence that includes entities (Cohn, 2024), we excluded panels with no entities from our analyses here. We provide analyses which include all panels in the Supplementary Materials. We compared linear regression models with size as a dependent variable and entities per panel and style as predictors.
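The aggregation described here can be sketched in a few lines. The records, field order and function names below are hypothetical, the natural log is assumed (the base is not specified above), and the sketch fits only entities per panel as a predictor, omitting style.

```python
import math
from collections import defaultdict

# Hypothetical records: (comic_id, panel_area, page_area, n_entities).
panels = [
    ("comic_a", 20.0, 100.0, 1),
    ("comic_a", 30.0, 100.0, 1),
    ("comic_a", 50.0, 100.0, 3),
    ("comic_b", 10.0, 200.0, 0),  # no entities: excluded below
    ("comic_b", 40.0, 200.0, 2),
    ("comic_b", 80.0, 200.0, 4),
]

def relative_area(panel_area, page_area):
    """Panel size as a percentage of the page area."""
    return 100.0 * panel_area / page_area

def mean_size_by_count(records):
    """Average relative panel size per (comic, entity count), skipping zero-entity panels."""
    sizes = defaultdict(list)
    for comic, panel_area, page_area, n_entities in records:
        if n_entities == 0:
            continue
        sizes[(comic, n_entities)].append(relative_area(panel_area, page_area))
    return {key: sum(vals) / len(vals) for key, vals in sizes.items()}

def ols(x, y):
    """Ordinary least squares for one predictor; returns (slope, intercept)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

means = mean_size_by_count(panels)
log_entities = [math.log(n) for (_, n) in means]
log_size = [math.log(s) for s in means.values()]
slope, intercept = ols(log_entities, log_size)
```

A positive slope on the log-log scale corresponds to the predicted pattern of more entities in larger panels.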
Finally, we asked about the relative frequency of panels with varying numbers of entities per panel. For each comic, we calculated the frequency of panels for each number of entities per panel (e.g. six panels with two entities, four panels with three entities), again excluding those with no entities (for completeness, see Supplementary Materials). We log transformed both the entities per panel and frequencies and examined their relationship in a linear model along with the style of comics.
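As a rough sketch of this frequency analysis, the per-comic counts and their log-log association could be computed as below. The comics and counts are invented for illustration, the natural log is assumed, and a Pearson correlation stands in for the full linear model with style.

```python
import math
from collections import Counter

# Hypothetical data: each comic maps to the entity count of each of its panels.
comics = {
    "comic_a": [1, 1, 1, 1, 2, 2, 3, 0],
    "comic_b": [1, 1, 1, 2, 2, 4],
}

def frequency_by_count(entity_counts):
    """Number of panels for each entity count, excluding zero-entity panels."""
    return Counter(n for n in entity_counts if n > 0)

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)

log_entities, log_freq = [], []
for comic, counts in comics.items():
    for n_entities, freq in frequency_by_count(counts).items():
        log_entities.append(math.log(n_entities))
        log_freq.append(math.log(freq))

r = pearson_r(log_entities, log_freq)
```

A negative `r` here is the pattern expected under Zipf’s law of abbreviation. In a regression with a single predictor, the standardized slope equals this correlation.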
3. Results
Our first analysis sought to confirm that the number of entities per panel varied systematically with comics’ style and/or global region of origin. Model comparisons using AIC, BIC and log-likelihood indicated that the best model included only Style and not Global Region (Table 1).
Table 1. Model comparisons for numbers of entities per panel across style and global region of comics

Note: Comics were random effects.
This model showed a main effect of Style, χ²(4) = 186, p < .001, which arose because, as depicted in Figure 4a, manga had fewer entities per panel than all other styles (all zs > 53, all ps < .001) and realistic styles had more than all other styles (all zs > 57, all ps < .001). Between these extremes, cartoony styles had more entities per panel than real exaggerated styles (z = 125, p < .001), which had more than alternative styles (z = 57.6, p < .001), though estimated marginal means suggested that these differences appeared smaller than those related to manga or realistic styles.

Figure 4. Entities per panel across (a) different styles of comics and (b) different styles across global regions. Error bars display 95% confidence interval.
Further insight is gained when comparing this model to the one with both Global Region and Style. Here, the main effect of Style persisted, χ²(3) = 58, p < .001, and the model also showed an interaction between Style and Global Region, χ²(23) = 36.5, p < .05. However, no main effect of Global Region was found, χ²(2) = 1.07, p = .587. As in Figure 4b, the number of entities per panel varied for Styles within Global Regions, except for a consistently reduced number in manga, and often an increased number in realistic comics, compared to all other styles, a pattern which persisted in almost all regions.
In order to establish that this measure of morphological complexity has a relation to the size of the units (cf. word length), we next looked at the average size of panels for each number of entities per panel per book. Entities per panel predicted panel size, F(1, 6034) = 556, p < .001, R = .291, R² = .084, confirming a positive relationship whereby more entities per panel align with larger physical sizes of panels, β = .291, p < .001, as in Figure 5. Including Style in the model accounted for further variance (R² change = .047, F = 92, p < .001). However, though slight variations in the mean size of panels exist across styles, the overall positive relationship between entities per panel and size of panels persisted across all styles in a uniform way (all b > .08 and < .12, see Figure 5).
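As a reading aid, the reported statistics for the single-predictor model can be related to one another with the standard identity between R² and F, here written for k predictors and residual degrees of freedom df₂. The worked value uses the rounded figures above, so it only approximates the reported F.

```latex
F = \frac{R^2 / k}{(1 - R^2) / df_2},
\qquad
F \approx \frac{.084 / 1}{(1 - .084) / 6034} \approx 553
```

This is close to the reported F of 556, with the gap attributable to rounding of R². In a regression with one predictor, the standardized coefficient β also equals the correlation R, as in the values reported above.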

Figure 5. Linear relations between log-transformed averages of entities per panel and the size (relative area of panels per page) across different styles of comics. Confidence band depicts 95% confidence interval.
Having established this relationship between size and morphological complexity, we next investigated the relationship between the number of entities per panel and the frequency of those panels. Entities per panel predicted their frequency, F(1, 6034) = 9937, p < .001, R = .789, R² = .622, with a strong negative relationship (β = −.789, p < .001). Further model comparison indicated that including Style accounted for only marginally more variance (R² change = .003) because, as visible in Figure 6, the same relationship persists between entities per panel and frequency across styles.

Figure 6. Relations between log-transformed averages of entities per panel and their frequency across different styles of comics. Confidence band depicts 95% confidence interval.
4. Discussion
This study examined the relationship between the size and frequency of panels with different numbers of entities in a global corpus of comics. Given prior findings of fewer entities in panels in Asian than European or North American comics (Cohn, 2011, 2024; Cohn et al., 2012), we sought to replicate findings of systematic diversity using a more continuous measure of the morphological complexity of panels. This analysis also sought to clarify whether any such diversity emerges from typologies of graphic systems (‘visual languages’, using the proxy of style) and/or from the comics’ geographic or cultural origins.
We indeed found that consistent differences arose between styles, which better accounted for these tendencies than global regions. Manga maintained the fewest entities per panel, while realistic comics used the most entities per panel. In particular, this reduced amount of information per unit for manga is consistent with prior work (Cohn, 2011, 2024; Cohn et al., 2012), but the present analysis clarifies that it is more a property of ‘manga’ as a conventional graphic system than of its global origin in Asia or Japan specifically. Here, manga globally had consistent tendencies, though whether this reflects a spread of a consolidated system originating in Japan or the stabilization of conventions resulting from a proliferation across the world remains a question for further longitudinal analysis.
This variation in structure should place different demands on processing, as less information in each panel (as in manga) would reduce the costs of finding and selecting information within units but may result in increased costs in building a situation model out of disparate parts (Cohn, 2020b; Loschky et al., 2020). Indeed, manga proficiency modulates the brain responses for constructions connecting panels with disparate entities into a common environment (Cohn & Kutas, 2017), and for panels framing varying amounts of information (Cohn & Foulsham, 2020).
While these findings may point toward diverse graphic systems rather than cross-cultural differences, it is worth emphasizing that such systematic differences between styles do not necessarily indicate visual languages wholesale. Styles may be a proxy for visual languages, but these findings are most indicative of differences along this one typological structure, just as any single linguistic typological dimension is only one part of a broader collection of structures making up the whole of a language. This structure can therefore inform the potential for visual languages but would need to combine with multiple others for a full characterization of such systems.
We next sought to establish that our measure of panel information (entities per panel) had a relationship with the physical size of panels as an analogy to words as units and their length. In this case, physical size was not one-dimensional length, but the two-dimensional relative area of a panel given the area of a page. We indeed observed a clear positive relationship, where an increase in the number of entities per panel was associated with physically larger panels. As this positive relationship also persisted across styles in a consistent way, it implies a universal tendency.
This suggests an iconic relationship between the quantity of information and its physical size. This follows proposals for an isomorphism that ‘more form is more meaning’ (Taylor, 2002), along with claims that word length corresponds to the complexity of word information content (Piantadosi et al., 2011). Here we correspondingly found that ‘more size is more panel information complexity’, with larger panels containing greater numbers of active entities. Indeed, such an iconicity principle has been used to explain a variety of phenomena in linguistic typology (Haiman, 1980, 2008; Taylor, 2002), sometimes in tension with constraints of economy (Croft, 2003) or frequency (Haspelmath, 2008, 2021).
Finally, having established a relationship between the number of entities per panel and panel size, we next examined whether entity counts also related to panel frequency. This follows from longstanding findings showing that shorter words are more frequent than longer words, that is, Zipf’s law of abbreviation, which seemingly persists across languages (Bentz & Ferrer-i-Cancho, Reference Bentz and Ferrer-i-Cancho2016; Koshevoy et al., Reference Koshevoy, Miton and Morin2023; Levshina, Reference Levshina2022; Linders & Louwerse, Reference Linders and Louwerse2023; Lyu et al., Reference Lyu, Wang, Yang, Barner, R. A. and Walker2025; Pimentel et al., Reference Pimentel, Meister, Wilcox, Mahowald and Cotterell2023; Zipf, Reference Zipf1935), sequences of emoji (Cohn et al., Reference Cohn, Engelen and Schilperoord2019; Sekine & Ikuta, Reference Sekine and Ikuta2024), and even vocal systems of animal communication (Heesen et al., Reference Heesen, Hobaiter, Ferrer-i-Cancho and Semple2019; Semple et al., Reference Semple, Hsu and Agoramoorthy2010). We again confirmed such a negative tradeoff, whereby panels with more entities were progressively less frequent. Again, this distribution persisted across styles, implying a universal tendency.
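The two relations described above (entities per panel with relative panel area, and entities per panel with frequency) can be illustrated with a minimal sketch. It is not the analysis pipeline used in this study: the `panels` values are hypothetical toy numbers chosen only to show the expected directions, and the names `relative_area` and `summarize` are assumptions introduced for illustration.

```python
from collections import Counter
from statistics import mean

# Hypothetical toy annotations, NOT data from the study:
# (entities in panel, panel area, page area), areas in arbitrary units.
panels = [
    (1, 10.0, 100.0),
    (1, 12.0, 100.0),
    (1, 11.0, 100.0),
    (1, 9.0, 100.0),
    (2, 18.0, 100.0),
    (2, 20.0, 100.0),
    (2, 16.0, 100.0),
    (3, 30.0, 100.0),
    (3, 28.0, 100.0),
    (4, 45.0, 100.0),
]


def relative_area(panel_area, page_area):
    """Panel size as a proportion of the page it appears on."""
    return panel_area / page_area


def summarize(panels):
    """Return {entity_count: (frequency, mean relative area)}."""
    counts = Counter(n for n, _, _ in panels)
    summary = {}
    for n in sorted(counts):
        areas = [relative_area(a, p) for m, a, p in panels if m == n]
        summary[n] = (counts[n], mean(areas))
    return summary


summary = summarize(panels)
for n, (freq, area) in summary.items():
    print(f"{n} entities: frequency={freq}, mean relative area={area:.3f}")
```

In this toy sample, frequency falls and mean relative area rises as the entity count increases, which is the pattern the brevity law and the size relationship jointly predict. A real analysis would additionally need to model style and book-level variation.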
The iconicity principle has also been argued to help explain Zipf’s law of abbreviation in the lexicon of spoken languages (Haiman, Reference Haiman2008), though this has been disputed (Haspelmath, Reference Haspelmath2008, Reference Haspelmath2021). Given our triangulated findings of relations between entities per panel, panel size and frequency, such an iconicity principle could be argued to work here as well. However, other interpretations are also possible, such as a tradeoff between expressiveness and parsimony (Ferrer-i-Cancho, Reference Ferrer-i-Cancho2018), which should persist as much in graphic as in verbal communication. Indeed, almost all functional interpretations of Zipf’s law of abbreviation in verbal lexicons should apply to visual lexicons (e.g. least effort, iconicity, expressivity, parsimony, cognitive load).
Despite being taken as defining features of linguistic and/or cultural systems (Bentz & Ferrer-i-Cancho, Reference Bentz and Ferrer-i-Cancho2016; Morin & Koshevoy, Reference Morin and Koshevoy2024), power law relationships like Zipf’s law of abbreviation have been observed across many phenomena (Linders & Louwerse, Reference Linders and Louwerse2023), including in vocal communication systems of animals (Ferrer-i-Cancho & Lusseau, Reference Ferrer-i-Cancho and Lusseau2009; Heesen et al., Reference Heesen, Hobaiter, Ferrer-i-Cancho and Semple2019; Semple et al., Reference Semple, Hsu and Agoramoorthy2010). The appearance of this law in the units of visual narrative sequencing, which has already been established to evoke neurocognitive processing similar to that of sentences (Cohn, Reference Cohn2020b), further reinforces that it characterizes universal tendencies in communicative systems that transcend the vocal modality (Ferrer-i-Cancho et al., Reference Ferrer-i-Cancho, Hernández-Fernández, Lusseau, Agoramoorthy, Hsu and Semple2013). Indeed, it has been observed across written (Zipf, Reference Zipf1935, Reference Zipf1949), spoken (Linders & Louwerse, Reference Linders and Louwerse2023), signed (Kimchi et al., Reference Kimchi, Wolters, Stamp and Arnon2023) and now graphic modalities.
Altogether, these results suggest that the graphic systems used in comics exhibit an interplay between diversity and universality that is characteristic of the lexicons of linguistic systems. This includes both variation between diverse systems in how information is conveyed by units and consistency across that diversity in the relationships of information with size and with frequency. These findings further reinforce similarities between the structures of graphic systems and other linguistic systems (Cohn, Reference Cohn2013, Reference Cohn2020b), warranting a reconsideration of the cognitive functions of such diversity and universal tradeoffs across modalities.
Supplementary material
The supplementary material for this article can be found at http://doi.org/10.1017/langcog.2026.10063.
Data availability statement
The dataset for this study can be found in its repository: https://doi.org/10.34894/MAEOJE.
Acknowledgements
This project has received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement no. 850975).
Competing interests
The authors declare none.
