Minimizing the Societal Cost of Credit Card Fraud with Limited and Imbalanced Data

09/03/2019
by Samuel Showalter, et al.

Machine learning has automated much of financial fraud detection, notifying firms of questionable transactions, or even blocking them, instantly. However, data imbalance starves traditionally trained models of the fraud examples they need to detect fraud. This study examines three separate factors of credit card fraud detection via machine learning. First, it assesses whether two sampling methods, random undersampling and the Synthetic Minority Oversampling Technique (SMOTE), can improve algorithm performance in data-starved environments. Second, five industry-practical machine learning algorithms are evaluated on total fraud cost savings in addition to traditional statistical metrics. Finally, an ensemble of the individual models is trained with a genetic algorithm in an attempt to achieve higher cost efficiency than its components. Monte Carlo performance distributions showed that random undersampling outperformed SMOTE in lowering fraud costs, and that the ensemble was unable to outperform its individual parts. Most notably, the F1 score, a traditional metric often used to measure performance on imbalanced data, was uncorrelated with derived cost efficiency. Assuming a realistic cost structure can be derived, cost-based metrics provide an essential supplement to objective statistical evaluation.
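The gap between the F1 score and cost efficiency can be illustrated with a minimal sketch. The cost structure here is an assumption made for illustration, not the one derived in the paper: a missed fraud is taken to cost the full transaction amount, and every flagged transaction is taken to cost a fixed, hypothetical review fee. The names `Transaction`, `f1_score`, `fraud_cost`, and `review_fee` are likewise invented for this example.

```python
from dataclasses import dataclass


@dataclass
class Transaction:
    amount: float    # transaction value in dollars
    is_fraud: bool   # ground-truth label
    flagged: bool    # model prediction (True = flagged as fraud)


def f1_score(txns):
    """F1 = 2*TP / (2*TP + FP + FN), with fraud as the positive class."""
    tp = sum(t.is_fraud and t.flagged for t in txns)
    fp = sum((not t.is_fraud) and t.flagged for t in txns)
    fn = sum(t.is_fraud and (not t.flagged) for t in txns)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def fraud_cost(txns, review_fee=5.0):
    """Total cost under the assumed (hypothetical) cost structure."""
    cost = 0.0
    for t in txns:
        if t.flagged:
            cost += review_fee   # every alert is assumed to cost a review
        elif t.is_fraud:
            cost += t.amount     # missed fraud is assumed to lose the full amount
    return cost


# Two hypothetical models scored on the same four transactions.
# Model A catches the $1000 fraud and misses the $10 fraud; model B does the reverse.
model_a = [
    Transaction(1000.0, True, True),
    Transaction(10.0, True, False),
    Transaction(50.0, False, False),
    Transaction(75.0, False, False),
]
model_b = [
    Transaction(1000.0, True, False),
    Transaction(10.0, True, True),
    Transaction(50.0, False, False),
    Transaction(75.0, False, False),
]

for name, txns in (("A", model_a), ("B", model_b)):
    print(name, round(f1_score(txns), 3), fraud_cost(txns))
```

Both models have one true positive, one false negative, and no false positives, so their F1 scores are identical, yet under the assumed costs model B is far more expensive because the fraud it misses is the large one. Count-based metrics cannot see transaction amounts, which is one plausible reason a metric like F1 can diverge from cost efficiency.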


