Abstract
Credit card fraud is a growing concern: the misuse of individuals' information poses a
significant threat and causes substantial monetary loss. Hence, the prevention of credit
card fraud is crucial. Credit card fraud detection classifies transactions as either
legitimate or fraudulent. Recently, different machine learning techniques have been
implemented to detect credit card fraud.
However, the main challenge with fraud detection is that credit card data is highly
skewed, with fraudulent transactions making up as little as 1% of the total data. This study
investigates the performance of four supervised machine learning algorithms (logistic
regression, support vector machine, decision tree, and random forest) combined with
different sampling techniques, to better understand the fraud detection attributes
and the performance measures associated with them. This review also explores prior
works in which the model achieves better values across all of the performance
evaluation metrics: recall, precision, F1-score, accuracy, MCC, AUC, and area under
the precision-recall curve. Identifying such models should help detect fraudulent credit
card transactions more reliably and control credit card fraud.
Keywords: Decision tree, Logistic regression, Random undersampling technique, Random forest, Random oversampling technique, Supervised machine learning algorithms, SVM, SMOTE.
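
To make the class-imbalance problem concrete, the following is a minimal Python sketch (standard library only) rather than the method of any work reviewed here. It assumes a hypothetical data set of 1,000 transactions with 1% fraud, labelled 1 for fraudulent and 0 for legitimate. It shows why accuracy alone can be misleading on skewed data, how recall, precision, F1-score, and MCC follow from the confusion matrix, and how random undersampling and random oversampling rebalance the classes. The function names are illustrative assumptions. SMOTE is not shown; unlike random oversampling, it synthesizes new minority samples instead of duplicating existing ones.

```python
import math
import random


def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn), treating label 1 (fraud) as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def metrics(tp, fp, fn, tn):
    """Compute threshold-based metrics; undefined ratios are reported as 0.0."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1,
            "accuracy": accuracy, "mcc": mcc}


def random_undersample(labels, seed=0):
    """Keep all fraud rows and an equal-sized random subset of legitimate rows."""
    rng = random.Random(seed)
    minority = [i for i, y in enumerate(labels) if y == 1]
    majority = [i for i, y in enumerate(labels) if y == 0]
    kept = rng.sample(majority, min(len(minority), len(majority)))
    return sorted(minority + kept)


def random_oversample(labels, seed=0):
    """Duplicate random fraud rows until both classes have the same size."""
    rng = random.Random(seed)
    minority = [i for i, y in enumerate(labels) if y == 1]
    majority = [i for i, y in enumerate(labels) if y == 0]
    n_extra = max(len(majority) - len(minority), 0)
    extra = [rng.choice(minority) for _ in range(n_extra)] if minority else []
    return sorted(majority + minority + extra)


# Hypothetical data: 10 fraudulent and 990 legitimate transactions (1% fraud).
labels = [1] * 10 + [0] * 990

# A trivial "classifier" that labels every transaction as legitimate.
all_legitimate = [0] * len(labels)
print(metrics(*confusion_counts(labels, all_legitimate)))

# Row indices of the rebalanced training sets.
under_idx = random_undersample(labels)
over_idx = random_oversample(labels)
print(len(under_idx), len(over_idx))
```

In this sketch the trivial classifier scores high on accuracy while catching no fraud at all, which is why the metrics listed in the abstract are considered together. In practice, resampling would be applied to the training split only, so that evaluation still reflects the original class distribution.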