
Improving aspect-based neural sentiment classification with lexicon enhancement, attention regularization and sentiment induction

Published online by Cambridge University Press:  12 September 2022

Lingxian Bao*
Affiliation:
Universitat Pompeu Fabra, Carrer de Roc Boronat, 138, 08018 Barcelona, Spain
Patrik Lambert
Affiliation:
RWS - Language Weaver, Goya 6, Madrid, 28001, Spain
Toni Badia
Affiliation:
Universitat Pompeu Fabra, Carrer de Roc Boronat, 138, 08018 Barcelona, Spain
*Corresponding author. E-mail: lingxian.bao@upf.edu

Abstract

Deep neural networks, as an end-to-end approach, lack robustness from an application point of view: it is very difficult to fix an obvious problem without retraining the model, for example, when a model consistently predicts positive on seeing the word "terrible." Meanwhile, it is less often noted that the commonly used attention mechanism tends to "over-fit" by becoming overly sparse, so that key positions in the input sequence may be overlooked by the network. To address these problems, we proposed a lexicon-enhanced attention LSTM model, named ATLX, in 2019. In this paper, we describe extended experiments and analysis of the ATLX model, and we also attempt to further improve the aspect-based sentiment analysis system by combining it with a vector-based sentiment domain adaptation method.
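The over-sparsity problem mentioned above can be illustrated with a minimal numpy sketch of an entropy-based attention regularizer: penalizing low entropy of the attention distribution discourages the model from concentrating all weight on one or two positions. This is only an illustration of the general idea; the function names, the coefficient `lam`, and the sign convention are assumptions, not the paper's exact formulation (the paper compares several regularizer variants).

```python
import numpy as np

def softmax(scores):
    """Turn raw attention scores into a probability distribution."""
    e = np.exp(scores - scores.max())
    return e / e.sum()

def attention_entropy(weights, eps=1e-12):
    """Shannon entropy of the attention weights; low entropy = sparse, peaked attention."""
    return -np.sum(weights * np.log(weights + eps))

def regularized_loss(task_loss, weights, lam=0.1):
    """Illustrative combined loss: subtracting lam * entropy rewards less sparse attention."""
    return task_loss - lam * attention_entropy(weights)

peaked = softmax(np.array([8.0, 0.0, 0.0, 0.0]))   # almost all weight on one token
uniform = softmax(np.zeros(4))                     # evenly spread attention
print(attention_entropy(uniform))                  # ln(4), approximately 1.386
print(attention_entropy(peaked) < attention_entropy(uniform))  # True: peaked is penalized more
```

Under this (assumed) sign convention, a uniform attention distribution attains the maximum entropy ln(n), so the regularizer pushes the model away from degenerate, single-token attention.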

Information

Type
Article
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution, and reproduction in any medium, provided the original work is properly cited.
Copyright
© The Author(s), 2022. Published by Cambridge University Press

Figure 1. Baseline AT-LSTM model architecture (Wang et al. 2016).

Figure 2. ATLX model architecture.

Table 1. Example of the merged lexicon $U$

Table 2. Seed words and word counts used for domain adaptation

Table 3. Distribution of aspects by label and train/test split in the SemEval 2014 Task 4, restaurant domain dataset.

Table 4. Distribution of aspects by label and train/test split in the SemEval 2015 Task 12, laptop domain dataset.

Table 5. Lexicon statistics: counts of positive, neutral, and negative words, and number of words covered in the corpus.

Table 6. Mean accuracy and standard deviation ($\sigma$) of cross-validation results on six folds of development sets and one holdout test set of the SemEval14, restaurant dataset. Note that in our replicated baseline system, the cross-validation performance on the test set ranges from 80.06 to 83.45; in Wang et al. (2016), 83.1 was reported.

Table 7. Mean accuracy and standard deviation ($\sigma$) of cross-validation results on six folds of development sets and one holdout test set. Evaluated on the SemEval14, restaurant dataset and the SemEval15, laptop dataset.

Table 8. ATLX lexicon dimension experiments on SemEval14, restaurant domain dataset.

Figure 3. ATLX cross-validation results on test set with increasing lexicon size on SemEval14, restaurant domain dataset.

Figure 4. Baseline ("Base") and ATLX comparison (1/6); the baseline predicts positive ("Pos") for both examples, while the gold labels are negative ("Neg") for all. In the rows annotated as "Base" and "ATLX," the numbers represent the attention weights of each model when predicting. Note that they do not sum to 1 in the figure because predictions are made in batches with padding positions at the end, which are not shown. The rows annotated as "Lexicon" indicate the average polarity per word given by $U$ as described in Section 3.1.2. In some of the following plots, the neutral polarity is annotated as "Neu."

Figure 5. Baseline and ATLX comparison (2/6).

Figure 6. Baseline and ATLX comparison (3/6). Baseline predicts neutral ("Neu").

Figure 7. Baseline and ATLX comparison (4/6).

Figure 8. Baseline and ATLX comparison (5/6).

Figure 9. Baseline and ATLX comparison (6/6).

Table 9. Comparison between main experiments and attention regularizers. Mean accuracy and standard deviation of cross-validation results on six folds of development sets and one holdout test set. Evaluated on the SemEval14 and SemEval15 datasets.

Table 10. (a) ATLX model performance (average cross-validation accuracy and variance) with Domain Adapted Lexicons (DAL) on SemEval15 Task 12, laptop dataset. (b) Accuracy and F-score of DALs measured against the gold lexicon, where binary classification excludes neutral and ternary classification includes it. The subscripts ${}_{\textrm{bin}}$ and ${}_{\textrm{ter}}$ refer to binary classification and ternary classification, respectively.

Table 11. ATLX model performance (average cross-validation accuracy and variance) with Aspect Adapted Lexicons (AAL) on SemEval15 Task 12, laptop dataset.

Figure 10. Comparison of attention weights between the baseline (base), the baseline with the standard deviation regularizer (base$_{\textrm{std}}$), the baseline with the negative entropy regularizer (base$_{\textrm{ent-}}$), the baseline with the positive entropy regularizer (base$_{\textrm{ent+}}$), and ATLX. The baseline predicts positive while all other models correctly predict negative. The row annotated as "Lexicon" indicates the average polarity given by $U$. Note that only ATLX takes lexical features into account; the others do not.

Figure 11. Polarity distribution of different lexicons.

Figure 12. Model performance in accuracy with increasing amount of noise in the lexicon.

Table 12. Comparison between our proposed methods and other ABSA systems in accuracy on the restaurant and laptop domain datasets. All results of the restaurant domain are based on the SemEval 2014 Task 4 restaurant dataset. All results of the laptop domain are based on the SemEval 2014 Task 4 laptop dataset, except those marked by *, which are based on the SemEval 2015 Task 12 laptop dataset. The ATLX + DAL$_{\textrm{ter}}$ experiment is the laptop review domain adaptation experiment explained in Section 4.3; thus, no results are presented for the restaurant domain.