
Automatic Deception Detection using Multiple Speech and Language Communicative Descriptors in Dialogs

Published online by Cambridge University Press:  16 April 2021

Huang-Cheng Chou*
Affiliation:
Department of Electrical Engineering, National Tsing Hua University, Taiwan MOST Joint Research Center for AI Technology and All Vista Healthcare, Taiwan
Yi-Wen Liu
Affiliation:
Department of Electrical Engineering, National Tsing Hua University, Taiwan
Chi-Chun Lee*
Affiliation:
Department of Electrical Engineering, National Tsing Hua University, Taiwan MOST Joint Research Center for AI Technology and All Vista Healthcare, Taiwan
*
Corresponding author: Huang-Cheng Chou Email: hc.chou@gapp.nthu.edu.tw Chi-Chun Lee Email: cclee@ee.nthu.edu.tw

Abstract

While deceptive behaviors are a natural part of human life, it is well known that humans are generally bad at detecting deception. In this study, we present an automatic deception detection framework that comprehensively integrates prior domain knowledge about deceptive behavior. Specifically, we compute acoustic features, textual information, implicatures with non-verbal behaviors, and conversational temporal dynamics to improve automatic deception detection in dialogs. The proposed model reaches state-of-the-art performance on the Daily Deceptive Dialogues corpus of Mandarin (DDDM) database, with 80.61% unweighted accuracy recall in deception recognition. In further analyses, we reveal that (i) the deceivers' deceptive behaviors can be observed in the interrogators' behaviors via the conversational temporal dynamics features and (ii) some of the acoustic features (e.g. loudness and MFCC) and textual features are significant and effective indicators of deceptive behavior.

Information

Type
Original Paper
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution, and reproduction in any medium, provided the original work is properly cited.
Copyright
Copyright © The Author(s), 2021. Published by Cambridge University Press in association with the Asia Pacific Signal and Information Processing Association.

Fig. 1. Overview of the proposed deception detection model. $ACO$, $BERT$, and $CTD$ denote the turn-level acoustic-prosodic features, the textual embeddings extracted by the pretrained BERT model, and the conversational temporal dynamics features proposed in [20], respectively. $BERT_{P}$ denotes the word-level textual embeddings extracted by the pretrained BERT model, which also include non-verbal and pragmatic behavior information. $C$, $CK$, $S$, and $O$ represent complications, common knowledge details, self-handicapping strategies, and others (none of the above), respectively.


Fig. 2. An illustration of question-answering (QA) pair turns. We use only "complete" QA pair turns and exclude questioning turns that have no corresponding answering turn. Each turn can contain multiple utterances.


Table 1. The number and annotation of each acoustic and pragmatic behavior.


Table 2. Results (%) on the DDDM database presented with metrics of unweighted accuracy recall (UAR), weighted-F1 score, and macro-precision.
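As a clarifying sketch (not part of the paper's code), the UAR metric reported in Table 2 is the mean of the per-class recalls, which makes it insensitive to class imbalance between truthful and deceptive turns. A minimal pure-Python implementation, with hypothetical labels for illustration:

```python
from collections import defaultdict

def unweighted_average_recall(y_true, y_pred):
    """UAR: the mean of per-class recalls, robust to class imbalance."""
    hits = defaultdict(int)    # correct predictions per true class
    totals = defaultdict(int)  # true instances per class
    for t, p in zip(y_true, y_pred):
        totals[t] += 1
        if t == p:
            hits[t] += 1
    recalls = [hits[c] / totals[c] for c in totals]
    return sum(recalls) / len(recalls)

# Hypothetical imbalanced example: 8 deceptive turns, 2 truthful turns.
y_true = ["deceptive"] * 8 + ["truthful"] * 2
y_pred = ["deceptive"] * 8 + ["deceptive", "truthful"]
# Per-class recall: deceptive 8/8 = 1.0, truthful 1/2 = 0.5
print(unweighted_average_recall(y_true, y_pred))  # → 0.75
```

Plain accuracy on this example would be 0.9, so UAR penalizes the missed minority-class (truthful) turn much more heavily.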


Table 3. Results and data distribution of the four-class implicature recognition on the DDDM database, reported as unweighted accuracy recall (UAR), weighted-F1, and macro-precision (%). $C$, $CK$, $S$, and $O$ represent complications, common knowledge details, self-handicapping strategies, and others (none of the above), respectively.


Table 4. The proportion of each implicature class, computed as the number of that implicature divided by the total number of the three implicature types (the number of self-handicapping strategies plus the number of common knowledge details plus the number of complications).


Table 5. Welch's t-test results between truthful and deceptive answering turns for the three feature sets. Only features with p-value $< 0.05$ are listed; features with p-value $< 0.01$ are marked with *.


Table 6. Welch's t-test between truthful and deceptive responses in acoustic features. Only features with p-value $< 0.05$ are listed; features with p-value $< 0.01$ are marked with *.
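The significance tests in Tables 5 and 6 use Welch's t-test, which, unlike Student's t-test, does not assume equal variances between the two groups. A minimal sketch (not the authors' code; the sample values are invented) of the t-statistic and the Welch–Satterthwaite degrees of freedom:

```python
from statistics import mean, variance

def welch_t(a, b):
    """Welch's t-statistic and Welch-Satterthwaite degrees of freedom
    for two independent samples with possibly unequal variances."""
    na, nb = len(a), len(b)
    va, vb = variance(a), variance(b)   # sample variances (n - 1 denominator)
    se2a, se2b = va / na, vb / nb       # squared standard errors of the means
    t = (mean(a) - mean(b)) / (se2a + se2b) ** 0.5
    df = (se2a + se2b) ** 2 / (se2a ** 2 / (na - 1) + se2b ** 2 / (nb - 1))
    return t, df

# Hypothetical per-turn loudness values for truthful vs. deceptive turns.
truthful = [62.1, 58.4, 60.3, 61.7, 59.9]
deceptive = [65.2, 66.8, 63.9, 67.1, 64.5]
t_stat, df = welch_t(truthful, deceptive)
print(t_stat, df)  # a negative t here: the truthful group mean is lower
```

The p-value then comes from the t-distribution with `df` degrees of freedom (e.g. `scipy.stats.ttest_ind(a, b, equal_var=False)` computes both in one call).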