Computational generation and dissection of lexical replacement humor*


We consider the automated generation of humorous texts by substituting a single word in a given short text. In this setting, several factors that potentially contribute to the funniness of a text can be integrated into a unified framework as constraints on the lexical substitution. We discuss three types of such constraints: formal constraints concerning the similarity of sound or spelling between the original word and the substitute, semantic or connotational constraints requiring the substitute to be a taboo word, and contextual constraints concerning the position and context of the replacement. Empirical evidence from extensive user studies, using real SMS messages as the corpus, indicates that taboo constraints are statistically very effective. A constraint requiring the substitution to take place at the end of the text is also effective, although its effect is smaller. The effects of the individual constraints are largely cumulative. In addition, the taboo constraint and word position interact strongly.
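As an illustration only, the three constraint types can be sketched as filters on candidate substitutions. The sketch below is not the system evaluated in the studies. It assumes that form similarity can be approximated by edit distance on spellings (sound similarity is not modelled), that the taboo constraint is membership in a small hypothetical lexicon `TABOO_WORDS`, and that the position constraint restricts replacement to the last word. All function names, parameters, and the lexicon contents are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two strings, computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


# Hypothetical stand-in for a taboo lexicon; a real one would be far larger.
TABOO_WORDS = {"butt", "fart", "poo"}


def candidate_substitutions(text, lexicon=TABOO_WORDS,
                            max_distance=1, last_word_only=True):
    """Return variants of `text` with exactly one word replaced.

    Form constraint: the substitute is within `max_distance` edits
    of the original word (spelling only, an assumption of this sketch).
    Taboo constraint: the substitute is drawn from `lexicon`.
    Position constraint: if `last_word_only`, only the final word
    may be replaced.
    Tokenization is plain whitespace splitting; punctuation is not handled.
    """
    words = text.split()
    if not words:
        return []
    positions = [len(words) - 1] if last_word_only else range(len(words))
    results = []
    for pos in positions:
        original = words[pos].lower()
        for substitute in sorted(lexicon):
            if substitute == original:
                continue
            if edit_distance(original, substitute) <= max_distance:
                variant = words[:pos] + [substitute] + words[pos + 1:]
                results.append(" ".join(variant))
    return results
```

For example, `candidate_substitutions("see you at the farm")` replaces only the final word, while passing `last_word_only=False` relaxes the position constraint so that any word may be replaced.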

We would like to thank the anonymous reviewers for their insightful comments that have greatly helped us improve the paper. This work has been supported by the Academy of Finland (decision 276897, CLiC; and the Algorithmic Data Analysis Centre of Excellence, Algodan), and by the European Commission (FET grant 611733, ConCreTe; and FET grant 611560, WHIM).

