Some quantitative aspects of written and spoken French based on syntactically annotated corpora

Published online by Cambridge University Press:  07 February 2020

Rafaël Poiret
Zhejiang University
Haitao Liu*
Zhejiang University Beijing Language and Culture University Guangdong University of Foreign Studies
*Corresponding author. Email:


Based on two syntactically annotated corpora, and within the theoretical tradition of dependency grammar, the current study investigates the quantitative differences and similarities between written and spoken French. Our findings support the assumption that spoken and written French are two realizations of one language that differ not in their syntactic categories, but in the frequency of these categories and in their organization within the sentence. In spoken French, subjects are mostly pronouns, whereas in written French they are mostly nouns and pronouns. Spoken and written French share many syntactic relations, but with different frequencies. For instance, dislocations are more diverse and more frequent in spoken French. Spoken and written French also differ in the word order of vocative nominal phrases. Finally, written French is slightly more difficult to process than spoken French.
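As a rough illustration of the kind of counts summarized above, the following minimal sketch tallies the part of speech of subjects and the frequency of each syntactic relation in two tiny hand-made samples. The sentences, the tuple layout, and the UD-style labels (`nsubj`, `dislocated`, `PRON`, `NOUN`) are all assumptions made for this example; they are not the study's corpora, annotation scheme, or code.

```python
from collections import Counter

# Hypothetical toy data, not taken from the study's corpora.
# Each token is (form, part_of_speech, head_index, relation);
# head_index is 1-based and 0 marks the root of the sentence.
spoken = [
    [("il", "PRON", 2, "nsubj"), ("dort", "VERB", 0, "root")],
    [("moi", "PRON", 3, "dislocated"), ("je", "PRON", 3, "nsubj"),
     ("pars", "VERB", 0, "root")],
]
written = [
    [("le", "DET", 2, "det"), ("ministre", "NOUN", 3, "nsubj"),
     ("parle", "VERB", 0, "root")],
    [("elle", "PRON", 2, "nsubj"), ("arrive", "VERB", 0, "root")],
]


def subject_pos_shares(sentences):
    """Return the share of each part of speech among tokens labeled nsubj."""
    counts = Counter(
        pos for sent in sentences for (_, pos, _, rel) in sent if rel == "nsubj"
    )
    total = sum(counts.values())
    return {pos: n / total for pos, n in counts.items()} if total else {}


def relation_frequencies(sentences):
    """Count how often each syntactic relation occurs in the sample."""
    return Counter(rel for sent in sentences for (_, _, _, rel) in sent)


print(subject_pos_shares(spoken))
print(subject_pos_shares(written))
print(relation_frequencies(spoken)["dislocated"])
print(relation_frequencies(written)["dislocated"])
```

On real treebanks the same two functions would be applied to every sentence of each corpus, and the resulting distributions compared across the spoken and written sets.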

© Cambridge University Press 2020

