Fraud Detection in Healthcare Insurance Claims Using Machine Learning

Eman Nabrawi; Abdullah Alanazi

Journal Article, Open Access

Risks (2023) 11(9)

DOI: 10.3390/risks11090160

Abstract

Healthcare fraud is the intentional submission of false claims or the misrepresentation of facts to obtain entitlement payments. It wastes healthcare financial resources, increases healthcare costs, and consequently poses a substantial financial challenge. Supervised machine and deep learning methods such as random forest, logistic regression, and artificial neural networks (ANN) have been used successfully to detect healthcare insurance fraud. This study aims to develop a model that automatically detects fraud in health insurance claims in Saudi Arabia. The model is intended to identify the greatest contributing factors to fraud with optimal accuracy. Three supervised machine and deep learning methods (random forest, logistic regression, and ANN) were applied to a labeled, imbalanced dataset obtained from three healthcare providers in Saudi Arabia. The synthetic minority oversampling technique (SMOTE) was used to balance the dataset, and Boruta feature selection was applied to exclude insignificant features. Validation metrics were accuracy, precision, recall, specificity, F1 score, and area under the curve (AUC). The random forest classifier indicated policy type, education, and age as the most significant features and achieved an accuracy of 98.21%, precision of 98.08%, recall of 100%, F1 score of 99.03%, specificity of 80%, and AUC of 90.00%. Logistic regression achieved an accuracy of 80.36%, precision of 97.62%, recall of 80.39%, F1 score of 88.17%, specificity of 80%, and AUC of 80.20%. The ANN achieved an accuracy of 94.64%, precision of 98.00%, recall of 96.08%, F1 score of 97.03%, specificity of 80%, and AUC of 88.04%. All three models in this predictive analytics study yielded acceptable accuracy and validation metrics; however, further research on a larger dataset is advised.
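The validation metrics listed in the abstract all derive from the four confusion-matrix counts, and the Python sketch below shows that arithmetic. The counts are an assumption: the abstract reports neither a confusion matrix nor a test-set size, and the integers used here (a hypothetical 56-claim test set with 51 positives and 5 negatives) are simply one set of values that reproduces the reported percentages. The function name `classification_metrics` and the treatment of AUC as the mean of recall and specificity (the AUC of a classifier evaluated at a single hard threshold) are likewise illustrative choices, not the authors' stated method.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute the abstract's validation metrics from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)            # also called sensitivity
    specificity = tn / (tn + fp)
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": 2 * precision * recall / (precision + recall),
        # Assumption: AUC of a single hard-label operating point,
        # which equals balanced accuracy.
        "auc_single_threshold": (recall + specificity) / 2,
    }


# Hypothetical counts (assumption, not taken from the paper): chosen only
# because they are consistent with the percentages quoted in the abstract.
assumed_counts = {
    "random_forest": dict(tp=51, fp=1, tn=4, fn=0),
    "logistic_regression": dict(tp=41, fp=1, tn=4, fn=10),
    "ann": dict(tp=49, fp=1, tn=4, fn=2),
}

results = {
    name: classification_metrics(**counts)
    for name, counts in assumed_counts.items()
}

for name, metrics in results.items():
    # Print each metric as a percentage rounded to two decimals.
    print(name, {k: round(100 * v, 2) for k, v in metrics.items()})
```

Under these assumed counts, a specificity of 80% corresponds to only four of five negative cases being classified correctly, which illustrates why the abstract advises further research on a larger dataset.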

Cite

Nabrawi, E., & Alanazi, A. (2023). Fraud Detection in Healthcare Insurance Claims Using Machine Learning. Risks, 11(9). https://doi.org/10.3390/risks11090160