Handling imbalanced medical datasets: review of a decade of research

Citations: 8
Readers: 70 (Mendeley users who have this article in their library)

This article is free to access.

Abstract

Machine learning and medical diagnostic studies often struggle with the issue of class imbalance in medical datasets, complicating accurate disease prediction and undermining diagnostic tools. Despite ongoing research efforts, specific characteristics of medical data frequently remain overlooked. This article comprehensively reviews advances in addressing imbalanced medical datasets over the past decade, offering a novel classification of approaches into preprocessing, learning levels, and combined techniques. We present a detailed evaluation of the medical datasets and metrics used, synthesizing the outcomes of previous research to reflect on the effectiveness of the methodologies despite methodological constraints. Our review identifies key research trends and offers speculative insights and research trajectories to enhance diagnostic performance. Additionally, we establish a consensus on best practices to mitigate persistent methodological issues, assisting the development of generalizable, reliable, and consistent results in medical diagnostics.

Citation (APA)

Salmi, M., Atif, D., Oliva, D., Abraham, A., & Ventura, S. (2024). Handling imbalanced medical datasets: review of a decade of research. Artificial Intelligence Review, 57(10). https://doi.org/10.1007/s10462-024-10884-2

Readers' Seniority

PhD / Post grad / Masters / Doc: 14 (52%)
Lecturer / Post doc: 7 (26%)
Professor / Associate Prof.: 4 (15%)
Researcher: 2 (7%)

Readers' Discipline

Computer Science: 17 (52%)
Engineering: 9 (27%)
Business, Management and Accounting: 5 (15%)
Economics, Econometrics and Finance: 2 (6%)
