Multilingual extension and evaluation of a poetry generator*


Poetry generation is a specific kind of natural language generation where several sources of knowledge are typically exploited to handle features on different levels, such as syntax, semantics, form or aesthetics. Although this task has been addressed by several researchers, and different languages have been targeted, all known systems have focused on a limited purpose and a single language. This article describes the effort of adapting the same architecture to generate poetry in three different languages – Portuguese, Spanish and English. An existing architecture is first described and then complemented with the adaptations required for each language, including the linguistic resources used for handling morphology, syntax, semantics and metric scansion. An automatic evaluation was designed so that it would be applicable to all target languages. It covered three relevant aspects of the generated poems: the presence of poetic features, the variation of the linguistic structure, and the semantic connection to a given topic. The automatic measures applied for the second and third aspects can be seen as novel in the evaluation of poetry. Overall, poems were successfully generated in the three languages addressed. Despite minor differences across languages and seed words, the poems proved to have a regular metre and frequent rhymes, to exhibit an interesting degree of variation, and to be semantically associated with the initially given seeds.

This work was supported by the projects PROSECCO and ConCreTe. Part of this work was developed during short-term visits funded by the PROSECCO CSA project, European Commission, under FP7 FET grant number 600653. The project ConCreTe acknowledges the financial support of the Future and Emerging Technologies (FET) programme within the Seventh Framework Programme for Research of the European Commission, under FET grant number 611733.

