Computational generation and dissection of lexical replacement humor*


We consider the automated generation of humorous texts by substituting a single word in a given short text. In this setting, several factors that potentially contribute to the funniness of a text can be integrated into a unified framework as constraints on the lexical substitution. We discuss three types of such constraints: formal constraints concerning the similarity of sound or spelling between the original word and the substitute, semantic or connotational constraints requiring the substitute to be a taboo word, and contextual constraints concerning the position and context of the replacement. Empirical evidence from extensive user studies, using real SMS messages as the corpus, indicates that taboo constraints are statistically very effective. A constraint requiring the substitution to take place at the end of the text is also effective, although its effect is smaller. The effects of the individual constraints are largely cumulative. In addition, there is a strong interaction between connotational taboo words and word position.
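
The three constraint types can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the use of Levenshtein edit distance as the spelling-similarity measure, the distance threshold, and the toy lexicon and taboo list are all assumptions made for illustration.

```python
from typing import Iterable, List, Set, Tuple


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


def substitutions(words: List[str],
                  lexicon: Iterable[str],
                  taboo: Set[str],
                  max_dist: int = 1,
                  end_only: bool = True) -> List[Tuple[int, str]]:
    """Return (position, new_text) pairs for single-word replacements.

    Hypothetical helper: every parameter is an illustrative assumption.
    """
    candidates = list(lexicon)
    if not words:
        return []
    # Contextual constraint: optionally restrict to the last word.
    positions = [len(words) - 1] if end_only else range(len(words))
    results = []
    for pos in positions:
        original = words[pos].lower()
        for cand in candidates:
            if cand == original:
                continue
            # Formal constraint: the substitute must be spelled similarly.
            if edit_distance(original, cand) > max_dist:
                continue
            # Taboo constraint: the substitute must be on the taboo list.
            if cand not in taboo:
                continue
            new_text = " ".join(words[:pos] + [cand] + words[pos + 1:])
            results.append((pos, new_text))
    return results


# Toy data, assumed for illustration only.
print(substitutions("see you at the bus".split(),
                    ["bum", "bug", "bun"], {"bum"}))
```

Treating each constraint as an independent filter mirrors the framing above, in which constraints can be switched on or off and their effects compared separately or in combination.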

We would like to thank the anonymous reviewers for their insightful comments that have greatly helped us improve the paper. This work has been supported by the Academy of Finland (decision 276897, CLiC; and the Algorithmic Data Analysis Centre of Excellence, Algodan), and by the European Commission (FET grant 611733, ConCreTe; and FET grant 611560, WHIM).

Natural Language Engineering
  • ISSN: 1351-3249
  • EISSN: 1469-8110
  • URL: /core/journals/natural-language-engineering