Comparison of language models for wine sentiment analysis

Published online by Cambridge University Press:  13 October 2025

Chenyu Yang
Affiliation:
Department of Statistics and Data Science, Southern Methodist University, Dallas, TX, USA
Jing Cao*
Affiliation:
Department of Statistics and Data Science, Southern Methodist University, Dallas, TX, USA
*Corresponding author: Jing Cao; Email: jcao@smu.edu

Abstract

This study presents a comparative evaluation of sentiment analysis models applied to a large corpus of expert wine reviews from Wine Spectator, with the goal of classifying reviews into binary sentiment categories based on expert ratings. We assess six models: logistic regression, XGBoost, LSTM, BERT, the interpretable Attention-based Multiple Instance Classification (AMIC) model, and the generative language model LLAMA 3.1, highlighting their differences in accuracy, interpretability, and computational efficiency. While LLAMA 3.1 achieves the highest accuracy, its marginal improvement over AMIC and BERT comes at a significantly higher computational cost. Notably, AMIC matches the performance of pretrained large language models while offering superior interpretability, making it particularly effective for domain-specific tasks such as wine sentiment analysis. Through qualitative analysis of sentiment-bearing words, we demonstrate AMIC’s ability to uncover nuanced, context-dependent language patterns unique to wine reviews. These findings challenge the assumption of generative models’ universal superiority and underscore the importance of aligning model selection with domain-specific requirements, especially in applications where transparency and linguistic nuance are critical.
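As a rough illustration of the simplest setup described above (binary sentiment labels derived from expert ratings, with logistic regression as the baseline), the following is a minimal sketch and not the authors' implementation. The rating cutoff of 90, the whitespace tokenizer, the training settings, and the four toy reviews are all assumptions made for this example; none of them come from the article.

```python
import math
from collections import Counter

# Hypothetical cutoff: this excerpt does not state how ratings are split.
CUTOFF = 90

def binarize(rating, cutoff=CUTOFF):
    """Map a numeric rating to a binary sentiment label (1 = positive)."""
    return 1 if rating >= cutoff else 0

def featurize(text):
    """Bag-of-words counts over lowercase whitespace tokens."""
    return Counter(text.lower().split())

def train_logreg(docs, labels, epochs=200, lr=0.5):
    """Fit per-word weights and a bias by batch gradient descent on log loss."""
    weights, bias = Counter(), 0.0
    feats = [featurize(d) for d in docs]
    n = len(feats)
    for _ in range(epochs):
        grad_w, grad_b = Counter(), 0.0
        for x, y in zip(feats, labels):
            z = bias + sum(weights[w] * c for w, c in x.items())
            err = 1.0 / (1.0 + math.exp(-z)) - y  # predicted prob minus label
            grad_b += err
            for w, c in x.items():
                grad_w[w] += err * c
        bias -= lr * grad_b / n
        for w, g in grad_w.items():
            weights[w] -= lr * g / n
    return weights, bias

def predict(text, weights, bias):
    """Return 1 if the linear score is non-negative, else 0."""
    z = bias + sum(weights[w] * c for w, c in featurize(text).items())
    return 1 if z >= 0 else 0

# Toy reviews invented for this sketch (not taken from Wine Spectator).
reviews = [
    ("elegant and vibrant with a long finish", 94),
    ("rich and elegant with fine structure", 92),
    ("thin and dull with a short finish", 82),
    ("dull and harsh with a thin finish", 80),
]
texts = [t for t, _ in reviews]
labels = [binarize(r) for _, r in reviews]
weights, bias = train_logreg(texts, labels)
predictions = [predict(t, weights, bias) for t in texts]
```

The learned per-word weights are directly readable, which is the kind of interpretability the abstract contrasts with larger models: here `weights["elegant"]` ends up positive and `weights["dull"]` negative.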

Information

Type
Article
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2025. Published by Cambridge University Press on behalf of American Association of Wine Economists.
Figure 1. Histogram of wine ratings.

Table 1. Test accuracy and parameter count of different models

Figure 2. Word clouds generated from AMIC’s learned word sentiment scores. The size of each word is proportional to the absolute value of its sentiment score.

Table A1. AMIC’s list of top 50 positive sentiment words

Table A2. AMIC’s list of bottom 50 negative sentiment words