The accuracy versus interpretability trade-off in fraud detection model

Published online by Cambridge University Press:  05 July 2021

Anna Nesvijevskaia*
Affiliation:
Quinten, 8 rue Vernier, Paris 75017, France Laboratory DICEN Ile de France, Conservatoire National des Arts et Métiers, 292 rue Saint Martin, Paris 75003, France
Sophie Ouillade
Affiliation:
Quinten, 8 rue Vernier, Paris 75017, France
Pauline Guilmin
Affiliation:
Quinten, 8 rue Vernier, Paris 75017, France
Jean-Daniel Zucker
Affiliation:
IRD, UMMISCO, Sorbonne University, Bondy F-93143, France
*Corresponding author. E-mail: anna.nesvijevskaia@gmail.com

Abstract

Like a hydra, fraudsters adapt and circumvent increasingly sophisticated barriers erected by public or private institutions. Among these institutions, banks must quickly take measures to avoid losses while guaranteeing the satisfaction of law-abiding customers. Facing an expanding flow of operations, effective banking relies on data analytics to support established risk control processes, but also on a better understanding of the underlying fraud mechanism. In addition, fraud being a criminal offence, the evidential aspect of the process must also be considered. These legal, operational, and strategic constraints lead to compromises on the means to be implemented for fraud management. This paper first focuses on the translation of practical questions raised in the banking industry at each step of the fraud management process into performance evaluation required to design a fraud detection model. Secondly, it considers a range of machine learning approaches that address these specificities: the imbalance between fraudulent and nonfraudulent operations, the lack of fully trusted labels, the concept-drift phenomenon, and the unavoidable trade-off between accuracy and interpretability of detection. This state-of-the-art review sheds some light on a technology race between black box machine learning models improved by post-hoc interpretation and intrinsic interpretable models boosted to gain accuracy. Finally, it discusses how concrete and promising hybrid approaches can provide pragmatic, short-term answers to banks and policy makers without swallowing up stakeholders with economical and ethical stakes in this technological race.

Information

Type
Research Article
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution-NonCommercial-ShareAlike licence (http://creativecommons.org/licenses/by-nc-sa/4.0), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the same Creative Commons licence is used to distribute the re-used or adapted article and the original article is properly cited. The written permission of Cambridge University Press must be obtained prior to any commercial use.
Copyright
© Quinten SAS and Jean-Daniel Zucker, 2021. Published by Cambridge University Press
Figure 1. Illustration of fraud dimensions.

Figure 2. Simplified process of fraud management, its associated tasks and actors.

Figure 3. Summary of solutions to deal with imbalanced data.

Figure 4. Summary of methods used for anomaly detection.

Table 1. Procedures addressing the concept drift issues in fraud detection.

Figure 5. Summary of approaches addressing interpretability issues.

Figure 6. Interpretability vs accuracy trade-off: main models and their improvement directions.

Table 2. Confusion matrix on the test set.

Figure 7. Number of true fraud cases detected over 1 year.

Figure 8. Average number of alerts per day over 1 year.

Table 3. Synthesis of the answers of the hybrid fraud detection model (FDM) to fraud management task issues.

Table A1. Solutions to deal with imbalanced data.

Table A2. Methods used for anomaly detection.
