
Learning and semiautomatic intention labeling for classification models: a COVID-19 dialog attendance study for chatbots

Published online by Cambridge University Press:  25 October 2024

Valmir Oliveira dos Santos Júnior*
Affiliation:
Insight Data Science Lab, Federal University of Ceará, Brazil; Graduate Program in Computer Science (PCOMP), Quixadá, Brazil
Marcos Antonio de Oliveira
Affiliation:
Insight Data Science Lab, Federal University of Ceará, Brazil; Graduate Program in Computer Science (PCOMP), Quixadá, Brazil
Lívia Almada Cruz
Affiliation:
Insight Data Science Lab, Federal University of Ceará, Brazil
Ticiana L. Coelho da Silva
Affiliation:
Insight Data Science Lab, Federal University of Ceará, Brazil; Graduate Program in Computer Science (PCOMP), Quixadá, Brazil
Corresponding author: Valmir Oliveira dos Santos Júnior; Email: valmir.oliveira@insightlab.ufc.br

Abstract

It is increasingly common to use chatbots as an interface to services. One of the main components of a chatbot is the Natural Language Understanding (NLU) model, which is responsible for interpreting the text and extracting the intent and entities present in it. It is possible to focus on just one of these NLU tasks, such as intent classification. Training an NLU intent classification model generally requires a considerable amount of annotated data, where each sentence of the dataset receives a label indicating an intent. Labeling data manually is arduous and, depending on the data volume, impracticable. Thus, an unsupervised machine learning technique, such as data clustering, can be applied to find and label patterns in the data. For this task, it is essential to have an effective vector embedding representation of the texts, one that captures semantic information and helps the machine understand the context, intent, and other nuances of the entire text. This paper extensively evaluates different text embedding models for clustering and labeling. We also apply operations to improve the dataset's quality, such as removing sentences and establishing various strategies for distance thresholds (cosine similarity) from the clusters' centroids. We then trained intent classification models with two different architectures, one built with the Rasa framework and the other a neural network (NN), using the attendance texts from the Coronavirus Platform Service of Ceará, Brazil. We also manually annotated a dataset to be used as validation data. Our study of semiautomatic labeling, implemented through clustering and visual inspection, shows that it introduces some labeling errors into the intent classification models; annotating the entire dataset manually, however, would be unfeasible. Even so, the trained models achieved competitive accuracy.
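The semiautomatic labeling step described above (cluster sentence embeddings, then keep only sentences sufficiently close to their cluster centroid) can be sketched as follows. This is a minimal illustration assuming K-Means with scikit-learn; the random stand-in embeddings, the candidate range for K, and the 0.1 similarity threshold are placeholders for exposition, not the paper's actual models or settings.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score
from sklearn.metrics.pairwise import cosine_similarity

# Stand-in for real sentence embeddings (e.g., from GloVe or MUSE):
# 200 "sentences" embedded in 50 dimensions.
rng = np.random.default_rng(42)
embeddings = rng.normal(size=(200, 50))

# Choose K over a small candidate range by maximizing the silhouette score
# (the paper also reports Davies-Bouldin scores for this purpose).
best_k, best_score = None, -1.0
for k in range(2, 8):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(embeddings)
    score = silhouette_score(embeddings, labels)
    if score > best_score:
        best_k, best_score = k, score

# Refit with the chosen K and compute each sentence's cosine similarity
# to its own cluster centroid.
km = KMeans(n_clusters=best_k, n_init=10, random_state=0).fit(embeddings)
sims = cosine_similarity(embeddings, km.cluster_centers_)
sim_to_own_centroid = sims[np.arange(len(embeddings)), km.labels_]

# Keep only sentences close enough to their centroid; after visual
# inspection, each surviving cluster would receive a single intent label.
threshold = 0.1  # illustrative value
kept = sim_to_own_centroid >= threshold
```

In a real pipeline, the retained sentences inherit the intent label assigned to their cluster, and the filtered-out sentences are treated as outliers and dropped from the training data.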

Information

Type
Article
Creative Commons
CC BY
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2024. Published by Cambridge University Press
Figure 1. Pipeline to build NLU models and intent classifier models.

Table 1. Dialog example

Table 2. Fragment of a dialog showing the collection of the service evaluation performed by the Patient

Figure 2. Pipeline to get embedding sentence representation using Glove.

Figure 3. Intent classifier model architecture.

Figure 4. Davies-Bouldin score of each embedding model.

Table 3. K value chosen for each embedding model representation

Figure 5. Silhouette score of each embedding model.

Figure 6. A word cloud generated by the Glove embedding model, representing sentences of one cluster intention.

Table 4. Sentences of the Glove clusters representing an inform_symptoms intention

Figure 7. t-SNE visualization of the ninety-nine clusters generated with the Glove embedding model.

Table 5. Number of clusters by intention

Figure 8. Davies-Bouldin scores for dataset variations.

Figure 9. Silhouette scores for dataset variations.

Table 6. Number of sentences after outlier removal

Table 7. Result metrics (macro) for the intent classification models based on a feed-forward neural network

Table 8. Result metrics (macro) for the Rasa intent classification models

Figure 10. Histogram of intention predictions using the NLU model trained with Rasa and MUSE embeddings.

Table 9. Result metrics (macro) on the manually labeled validation set for the intent classification models based on a feed-forward neural network

Table 10. Result metrics (macro) on the manually labeled validation set for the Rasa intent classification models

Table 11. Number of correct label assignments by clustering for each embedding model on the validation dataset