RDDL: A systematic ensemble pipeline tool that streamlines balancing training schemes to reduce the effects of data imbalance in rare-disease-related deep-learning applications


Abstract

Identifying low-prevalence diseases, or rare diseases, in their early stages is key to treatment. Deep learning techniques now provide promising tools for this purpose. However, the low prevalence of rare diseases causes severe class imbalance, which hinders the proper application of deep networks to disease identification. Over the past decades, a number of balancing methods have been studied to handle data imbalance, but it has been verified that none of these methods is guaranteed to outperform the others. This performance variation creates the need for a systematic pipeline, backed by a comprehensive software tool, to enhance deep-learning applications in rare disease identification. We reviewed the existing balancing schemes and consolidated them into a systematic deep ensemble pipeline, implemented as a tool called RDDL, for handling data imbalance. In two real case studies, we showed that the RDDL pipeline tool can improve rare disease identification by mitigating the data imbalance problem during model training. The RDDL pipeline tool is available at https://github.com/cobisLab/RDDL/.
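To make the terms in the abstract concrete, the following is a minimal sketch of two widely used balancing schemes (inverse-frequency class weighting and random oversampling) and of a simple ensemble step (averaging predicted probabilities). It is an illustration only: the abstract does not list which schemes RDDL implements, and the function names `inverse_frequency_weights`, `random_oversample`, and `soft_vote`, as well as the toy 95:5 label split, are assumptions made for this example rather than part of the RDDL tool.

```python
import random
from collections import Counter


def inverse_frequency_weights(labels):
    """Class weights n_samples / (n_classes * count), so rarer classes weigh more."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {cls: n / (k * c) for cls, c in counts.items()}


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen samples of each smaller class until all
    classes are as large as the largest one."""
    rng = random.Random(seed)
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)
    target = max(len(xs) for xs in by_class.values())
    out_x, out_y = [], []
    for y, xs in by_class.items():
        extra = [rng.choice(xs) for _ in range(target - len(xs))]
        out_x.extend(xs + extra)
        out_y.extend([y] * target)
    return out_x, out_y


def soft_vote(prob_lists):
    """Average the positive-class probabilities predicted by several models."""
    n_models = len(prob_lists)
    return [sum(ps) / n_models for ps in zip(*prob_lists)]


# Hypothetical toy data: 95 common-class cases (0) and 5 rare-disease cases (1).
labels = [0] * 95 + [1] * 5
samples = list(range(100))

weights = inverse_frequency_weights(labels)        # rare class gets weight 10.0
bal_x, bal_y = random_oversample(samples, labels)  # 95 samples per class
# Two hypothetical models, each scoring the same two cases.
ensemble = soft_vote([[0.25, 1.0], [0.75, 0.5]])
```

Because no single balancing scheme is guaranteed to be best, one plausible design is to train one model per scheme and combine their outputs, which is what the averaging step stands in for here.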




Citation

Yang, T. H., Liao, Z. Y., Yu, Y. H., & Hsia, M. (2023). RDDL: A systematic ensemble pipeline tool that streamlines balancing training schemes to reduce the effects of data imbalance in rare-disease-related deep-learning applications. Computational Biology and Chemistry, 106. https://doi.org/10.1016/j.compbiolchem.2023.107929
