
Deception detection in text and its relation to the cultural dimension of individualism/collectivism

Published online by Cambridge University Press:  02 July 2021

Katerina Papantoniou*
Affiliation:
Computer Science Department, University of Crete, Heraklion, Greece Institute of Computer Science, FORTH-ICS, Heraklion, Greece
Panagiotis Papadakos
Affiliation:
Computer Science Department, University of Crete, Heraklion, Greece Institute of Computer Science, FORTH-ICS, Heraklion, Greece
Theodore Patkos
Affiliation:
Institute of Computer Science, FORTH-ICS, Heraklion, Greece
George Flouris
Affiliation:
Institute of Computer Science, FORTH-ICS, Heraklion, Greece
Ion Androutsopoulos
Affiliation:
Department of Informatics, Athens University of Economics and Business, Athens, Greece
Dimitris Plexousakis
Affiliation:
Computer Science Department, University of Crete, Heraklion, Greece Institute of Computer Science, FORTH-ICS, Heraklion, Greece
*
*Corresponding author. E-mail: papanton@ics.forth.gr

Abstract

Automatic deception detection is a crucial task with many applications in both direct physical and computer-mediated human communication. Our focus is on automatic deception detection in text across cultures. In this context, we view culture through the prism of the individualism/collectivism dimension, and we approximate culture by using country as a proxy. Taking as a starting point recent conclusions drawn from the social psychology discipline, we explore whether differences in the usage of specific linguistic features of deception across cultures can be confirmed and attributed to cultural norms with respect to the individualism/collectivism divide. In addition, we investigate whether a universal feature set for cross-cultural text deception detection tasks exists. We evaluate the predictive power of different feature sets and approaches. We create culture/language-aware classifiers by experimenting with a wide range of n-gram features from several levels of linguistic analysis, namely phonology, morphology, and syntax; other linguistic cues, such as word and phoneme counts and pronoun use; and token embeddings. We conducted our experiments over eleven data sets in five languages (English, Dutch, Russian, Spanish, and Romanian) from six countries (United States of America, Belgium, India, Russia, Mexico, and Romania), applying two classification methods, namely logistic regression and fine-tuned BERT models. The results show that the undertaken task is fairly complex and demanding. Furthermore, there are indications that some linguistic cues of deception have cultural origins and are consistent across diverse domains and data set settings for the same language. This is most evident in the usage of pronouns and the expression of sentiment in deceptive language.
The results of this work show that automatic deception detection across cultures and languages cannot be handled in a unified manner, and that such approaches should be augmented with knowledge about cultural differences and the domains of interest.
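The feature-based side of the pipeline described above can be illustrated with a minimal sketch: extract character n-gram counts from each document and train a logistic regression classifier. This is a toy illustration with hypothetical example documents and a plain gradient-descent trainer, not the authors' actual implementation or feature set.

```python
import math
from collections import Counter

def char_ngrams(text, n=3):
    """Bag of character n-grams, one simple feature family for surface cues."""
    text = text.lower()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))

def featurize(docs):
    """Map each document to a raw n-gram count vector over a shared vocabulary."""
    vocab = sorted({g for d in docs for g in char_ngrams(d)})
    return [[char_ngrams(d).get(g, 0) for g in vocab] for d in docs], vocab

def train_logreg(X, y, lr=0.5, epochs=200):
    """Plain per-example gradient descent on the logistic loss (no regularization)."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            z = sum(wj * xj for wj, xj in zip(w, xi)) + b
            g = 1.0 / (1.0 + math.exp(-z)) - yi  # gradient of the loss w.r.t. z
            w = [wj - lr * g * xj for wj, xj in zip(w, xi)]
            b -= lr * g
    return w, b

def predict(w, b, x):
    z = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1 if 1.0 / (1.0 + math.exp(-z)) >= 0.5 else 0

# Hypothetical toy corpus: label 1 = deceptive, 0 = truthful.
docs = ["the room was amazing and perfect",
        "we stayed two nights near the station",
        "absolutely wonderful amazing experience",
        "the bus to the airport left at seven"]
labels = [1, 0, 1, 0]
X, vocab = featurize(docs)
w, b = train_logreg(X, labels)
print([predict(w, b, x) for x in X])
```

In the actual study, such n-gram features are complemented by other linguistic cues (counts, pronoun use, sentiment) and compared against fine-tuned BERT models.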

Information

Type
Article
Creative Commons
Creative Commons License - CC-BY
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution, and reproduction in any medium, provided the original work is properly cited.
Copyright
© The Author(s), 2021. Published by Cambridge University Press

Table 1. Social psychology studies on within- and across-culture deception detection


Table 2. Studies from the social psychology discipline on the expression of sentiment in individualist and collectivist cultures


Table 3. Summary of differences in language use between truthful and deceptive statements across the four cultural groups examined by Taylor et al. (2014, 2017). Differences in pronoun usage and perceptual details were confirmed when participants lied about experiences, whereas affective language differences were confirmed when participants lied about opinions


Table 4. Overview of the used data sets. The columns are (a) data set, (b) culture, (c) language, (d) type, (e) origin, (f) collection process, (g) number of total, truthful, and deceptive documents, and (h) average length, in words, of truthful and deceptive documents. (T) stands for truthful and (D) for deceptive, while (I) stands for individualist cultures and (C) for collectivist cultures. Truthful documents tend to be longer than deceptive ones, except in the Bluff and Russian collections


Figure 1. Differences between cultures along Hofstede’s individualism dimension (source: https://www.hofstede-insights.com/product/compare-countries/).


Table 5. The list of used features. Features examined in Taylor's work (2014, 2017) are marked with an asterisk (*). The dot ($\bullet$) marks nonnormalized features. The absence of a tick indicates that the feature could not be extracted for that language. N/A indicates that the feature is not applicable to that language


Table 6. Phoneme connection to sentiment in phonological iconicity studies


Table 7. Sentiment lexicons used for each language


Table 8. Linguistic tools used for each language for the extraction of features


Table 9. Examples of n-gram features


Table 10. BERT pretrained models used for each language


Table 11. Multiple logistic regression analysis on linguistic features for each US data set. SE stands for standard error. p-values < 0.01 are shown in bold. Positive and negative estimate values indicate features associated with deceptive and truthful text, respectively


Table 12. Multiple logistic regression analysis for each data set across cultures. The Russian data set is absent, since no significant features were found by the Mann–Whitney test. SE stands for standard error. p-values < 0.01 are shown in bold. Positive and negative estimate values indicate features associated with deceptive and truthful text, respectively
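The Mann–Whitney screening mentioned in the caption above can be sketched as follows: for each feature, compare its per-document values in the truthful and deceptive groups and keep only features whose U statistic is significantly extreme. This is a minimal stdlib sketch with hypothetical feature values, using the direct pair-counting definition of U and its normal approximation; it is not the authors' exact computation.

```python
import math

def mann_whitney_u(xs, ys):
    """U statistic by the direct pair-counting definition (ties count 0.5)."""
    return sum(1.0 if x > y else 0.5 if x == y else 0.0
               for x in xs for y in ys)

def mw_z(xs, ys):
    """Normal approximation to the U distribution (adequate for moderate samples)."""
    m, n = len(xs), len(ys)
    u = mann_whitney_u(xs, ys)
    mu = m * n / 2.0                                # mean of U under H0
    sigma = math.sqrt(m * n * (m + n + 1) / 12.0)  # std of U under H0 (no tie correction)
    return (u - mu) / sigma

# Hypothetical per-document counts of first-person pronouns.
truthful = [5, 7, 6, 8, 9, 7]
deceptive = [3, 2, 4, 3, 5, 2]
print(round(mw_z(truthful, deceptive), 2))
```

A |z| beyond the critical value for the chosen significance level would flag the feature as a candidate for the subsequent logistic regression analysis.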


Table 13. Results for the OpSpam dataset


Table 14. Results on the test set for the DeRev dataset


Table 15. Results on the test set for the Boulder dataset


Table 16. Results on the test set for the EnglishUS dataset


Table 17. Results on the test set for the Bluff dataset


Table 18. The top ten discriminating deceptive features for each native English data set, listed by decreasing estimate value as calculated by the logistic regression algorithm


Table 19. The top ten discriminating truthful features for each native English data set, listed by decreasing estimate value as calculated by the logistic regression algorithm


Table 20. Cross-data set results for US data sets


Table 21. The top ten discriminating deceptive features for each of the five cross-data set cases, listed by decreasing estimate value as calculated by the logistic regression algorithm


Table 22. The top ten discriminating truthful features for each of the five cross-data set cases, listed by decreasing estimate value as calculated by the logistic regression algorithm


Table 23. Per-culture results


Table 24. The ten discriminating deceptive features for each data set, listed by decreasing estimate value as calculated by the logistic regression algorithm. English translations are given in square brackets


Table 25. The ten discriminating truthful features for each data set, listed by decreasing estimate value as calculated by the logistic regression algorithm. English translations are given in square brackets


Table 26. Results of fine-tuning the BERT model on US data sets


Table 27. Per-culture results for (a) the fine-tuned BERT model and (b) the fine-tuned BERT model combined with the linguistic features. Results are reported for both the monolingual and the multilingual BERT models


Table 28. Comparison between the monolingual BERT models and the multilingual model. We report the average accuracy over the BERT-only and BERT+linguistic setups for the monolingual BERT model and for the mBERT model, respectively. The best accuracy is marked in bold. St. sign. stands for statistical significance. We performed a one-tailed z-test with a 99% confidence interval and $\alpha = 0.01$
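The significance test named in the caption above, a one-tailed z-test at $\alpha = 0.01$, can be sketched as a two-proportion test on the accuracies of the two models. The counts below are hypothetical, and the pooled-standard-error formulation is an assumption about the exact variant used, not a reproduction of the authors' computation.

```python
import math

def ztest_accuracy(correct_a, correct_b, n, alpha=0.01):
    """One-tailed two-proportion z-test: is accuracy A significantly higher than B?
    Both models are assumed evaluated on the same number n of test documents."""
    p_a, p_b = correct_a / n, correct_b / n
    pooled = (correct_a + correct_b) / (2 * n)           # pooled success rate under H0
    se = math.sqrt(2 * pooled * (1 - pooled) / n)        # pooled standard error
    z = (p_a - p_b) / se
    p_value = 0.5 * math.erfc(z / math.sqrt(2))          # upper-tail standard normal
    return z, p_value, p_value < alpha

# Hypothetical example: model A gets 410/500 test documents right, model B 370/500.
z, p, significant = ztest_accuracy(410, 370, 500)
print(round(z, 2), significant)
```

At $\alpha = 0.01$ the one-tailed critical value is about 2.33, so any z above it would be reported as a statistically significant difference.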


Table 29. Results for the multilingual model when it is trained on one language and tested on another. Accuracy values over 60% are shown in bold. Rom. stands for Romanian and SpanMex. for Spanish (Mexico)


Table 30. Comparison with other works on the same corpora. Bold values denote models studied in this work and the best scores. Accu. stands for accuracy, and St. Sign. marks cases where a statistically significant difference between this work and related work was found