
Extensive data for morphology: using the World Wide Web

Published online by Cambridge University Press:  01 March 2008

NABIL HATHOUT
Affiliation:
CLLE-ERSS (UMR 5263) Université de Toulouse & CNRS
FABIO MONTERMINI*
Affiliation:
CLLE-ERSS (UMR 5263) Université de Toulouse & CNRS
LUDOVIC TANGUY
Affiliation:
CLLE-ERSS (UMR 5263) Université de Toulouse & CNRS
* Address for correspondence: Fabio Montermini, Université de Toulouse-Le Mirail, Maison de la Recherche, 5, allées Antonio Machado, F-31058 Toulouse Cedex 9, France. E-mail: fabio.montermini@univ-tlse2.fr

Abstract

This paper presents a number of recent studies in French morphology that make extensive use of data. The data, which concern derived words, were collected automatically from digital corpora, mostly from the Web. Our main point is that this massive increase in the amount of available data can substantially modify the results of a morphological study and can lead to new theoretical conclusions that would not have been possible with traditional data such as word lists gathered from dictionaries. However, using the Web as a corpus raises several technical and methodological questions, which we address through examples and a discussion of the available tools and techniques. We illustrate our thesis with studies of the suffixal forms -esque, -este, -able and -ment.

Type
Articles
Copyright
Copyright © Cambridge University Press 2008

