Influence of personal choices on lexical variability in referring expressions

RAQUEL HERVÁS; JAVIER ARROYO; VIRGINIA FRANCISCO; FEDERICO PEINADO; PABLO GERVÁS

doi:10.1017/S1351324915000182

Published online by Cambridge University Press: 09 July 2015

Affiliation (all authors): Departamento de Ingeniería del Software e Inteligencia Artificial, Universidad Complutense de Madrid, 28040, Madrid, Spain
e-mails: raquelhb@fdi.ucm.es, javier.arroyo@fdi.ucm.es, virginia@fdi.ucm.es, fpeinado@fdi.ucm.es, pgervas@sip.ucm.es

Abstract

Variability is inherent in human language, as different people make different choices when facing the same communicative act. In Natural Language Processing, variability is a challenge: it hinders some tasks, such as the evaluation of generated expressions, yet it is also an interesting resource for achieving naturalness and avoiding repetitiveness. In this work, we present a methodological approach to studying the influence of lexical variability. We apply this approach to TUNA, a corpus of referring expression lexicalizations, in order to study the use of different lexical choices. First, we reannotate the TUNA corpus with new information about lexicalization, and then we analyze this reannotation to study how people lexicalize referring expressions. The results show that people tend to be consistent when generating referring expressions but that, at the same time, different people share certain preferences.

Information

Type: Articles
Information: Natural Language Engineering, Volume 22, Issue 2, March 2016, pp. 257–290

DOI: https://doi.org/10.1017/S1351324915000182
Copyright: Copyright © Cambridge University Press 2015
