
Improving semantic coverage of data-to-text generation model using dynamic memory networks

Published online by Cambridge University Press:  31 May 2023

Elham Seifossadat
Affiliation:
Department of Computer Engineering, Sharif University of Technology, Tehran, Iran
Hossein Sameti*
Affiliation:
Department of Computer Engineering, Sharif University of Technology, Tehran, Iran
Corresponding author: Hossein Sameti; E-mail: Sameti@sharif.edu

Abstract

This paper proposes a sequence-to-sequence model for data-to-text generation, called DM-NLG, to generate natural language text from structured nonlinguistic input. Specifically, by adding a dynamic memory module to the attention-based sequence-to-sequence model, it can store the information that led to the generation of previous output words and use it to generate the next word. In this way, the decoder part of the model is aware of all previous decisions, and as a result, the generation of duplicate words or incomplete semantic concepts is prevented. To improve the quality of the sentences generated by the DM-NLG decoder, a postprocessing step is performed using pretrained language models. To prove the effectiveness of the DM-NLG model, we performed experiments on five different datasets and observed that our proposed model is able to reduce the slot error rate by 50% and improve the BLEU score by 10%, compared to state-of-the-art models.
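The sketch below (in Python/PyTorch) illustrates the general idea of a decoder step with a gated memory write and an attentive memory read, as described above. It is not the authors' implementation; all module names, shapes, and the choice of a GRU cell are assumptions made for illustration only.

    import torch
    import torch.nn as nn

    class DynamicMemoryDecoderStep(nn.Module):
        """One decoding step of an attention-based seq2seq decoder augmented
        with a dynamic memory (gated write, attentive read). Illustrative only."""
        def __init__(self, hidden_size, vocab_size, num_slots=4):
            super().__init__()
            self.rnn = nn.GRUCell(hidden_size * 2, hidden_size)
            self.write_gate = nn.Linear(hidden_size * 2, num_slots)  # forget gate over memory slots
            self.write_proj = nn.Linear(hidden_size, hidden_size)
            self.read_query = nn.Linear(hidden_size, hidden_size)
            self.out = nn.Linear(hidden_size * 2, vocab_size)

        def forward(self, prev_state, enc_outputs, memory):
            # prev_state: (B, H)   enc_outputs: (B, T, H)   memory: (B, S, H)
            # 1) Attention over the encoded input MR slots.
            attn = torch.softmax(enc_outputs @ prev_state.unsqueeze(-1), dim=1)        # (B, T, 1)
            context = (attn * enc_outputs).sum(dim=1)                                  # (B, H)
            # 2) Memory reading: attend over the slots with the previous decoder state.
            query = self.read_query(prev_state)
            read_w = torch.softmax((memory @ query.unsqueeze(-1)).squeeze(-1), dim=1)  # (B, S)
            mem_read = (read_w.unsqueeze(-1) * memory).sum(dim=1)                      # (B, H)
            # 3) Recurrent update from the attention context and the memory read-out.
            state = self.rnn(torch.cat([context, mem_read], dim=-1), prev_state)
            # 4) Memory writing: a forget gate decides how much of each slot is kept
            #    versus overwritten with information about the word being generated.
            forget = torch.sigmoid(self.write_gate(torch.cat([state, context], dim=-1)))  # (B, S)
            new_content = self.write_proj(state).unsqueeze(1)                             # (B, 1, H)
            memory = forget.unsqueeze(-1) * memory + (1.0 - forget).unsqueeze(-1) * new_content
            # 5) Predict the next word from the updated state and the attention context.
            logits = self.out(torch.cat([state, context], dim=-1))
            return logits, state, memory

In such a setup, the step would be applied once per output word, with the memory initialized (e.g., to zeros), so that every prediction can condition on what earlier decoding steps stored.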

Information

Type
Article
Creative Commons
Creative Commons License: CC BY
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© Sharif University of Technology, 2023. Published by Cambridge University Press

Figure 1. The block diagram of our proposed DM-NLG model. Here, the input of the encoder is an example of the Restaurant dataset MRs.


Table 1. Dataset statistics.


Table 2. Performance of the baseline and the proposed DM-NLG model on Restaurant, Hotel, TV, and Laptop datasets in terms of BLEU and SER.


Table 3. Performance of the proposed DM-NLG model on Restaurant, Hotel, TV, and Laptop datasets in terms of BERTScore.


Table 4. Performance of the baseline and the proposed DM-NLG model on E2E dataset in terms of BLEU, NIST, METEOR, ROUGE-L, SER, and BERTScore.


Table 5. Performance of the baseline and the proposed DM-NLG model on Personage dataset in terms of BLEU, NIST, METEOR, ROUGE-L, SER, and BERTScore.


Table 6. Examples of sentences generated by DM-NLG with multislot memory and improved by GPT and Transformer-XL, for given MRs from the Restaurant and Hotel datasets.


Table 7. Evaluation results on the Restaurant, Hotel, TV, and Laptop datasets for the proposed DM-NLG model, without postprocessing, compared to pretrained encoder–decoder transformer-based models, in terms of BLEU (B), SER, and BERTScore-F1 (BS).


Table 8. Evaluation results on the E2E dataset for the proposed DM-NLG model, without postprocessing, compared to pretrained encoder–decoder transformer-based models, in terms of BLEU, NIST, METEOR, ROUGE-L, SER, and BERTScore-F1.


Table 9. Evaluation results on the Personage dataset for the proposed DM-NLG model, without postprocessing, compared to pretrained encoder–decoder transformer-based models, in terms of BLEU, NIST, METEOR, ROUGE-L, SER, and BERTScore-F1.


Table 10. Performance of our proposed model on six datasets in terms of Shannon Text Entropy.


Table 11. Performance of the proposed models without postprocessing on seen and unseen data, in terms of BLEU, SER, and BERTScore.


Table 12. Results of human evaluations on the five datasets used, in terms of Faithfulness, Coverage, and Fluency (rating out of 3).


Table 13. Comparison of the generated sentences from the Laptop dataset for the DM-NLG model without postprocessing and baselines.


Table 14. Comparison of the generated sentences from the E2E dataset for the DM-NLG model without postprocessing and baselines.


Table 15. Comparison of the generated sentences from the Personage dataset for the DM-NLG multislot model without postprocessing and baselines.


Figure 2. The change of the forget gate value of the memory writing module (a) and the attention weights (b) for the DM-NLG one-slot model after generating each word of the output text for a given MR from the Laptop dataset.


Figure 3. The change of the forget gate value of the memory writing module (a), the attention weights of the memory reading module (b), and the attention weights (c) for the DM-NLG multislot model after generating each word of the output text for a given MR from the Laptop dataset.