Identification of different types of minority class examples in imbalanced data

Krystyna Napierala; Jerzy Stefanowski

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2012) 7209 LNAI(PART 2) 139-150

DOI: 10.1007/978-3-642-28931-6_14

61 Citations

18 Readers

Abstract

The characteristics of the minority class distribution in imbalanced data are studied. Four types of minority examples (safe, borderline, rare and outlier) are distinguished and analysed. We propose a new method for identifying these examples in the data, based on analysing the local neighbourhoods of examples. Its application to UCI imbalanced datasets shows that the minority class is often scattered, with few safe examples. This characteristic of the data distributions is also confirmed by a separate analysis using Multidimensional Scaling visualization. We examine the influence of these types of examples on six different classifiers learned over various real-world datasets. The experimental results show that the individual classifiers differ in their sensitivity to the type of example. © 2012 Springer-Verlag.
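The abstract says only that the identification method analyses local neighbourhoods; it does not give the neighbourhood size or the cut-offs. The sketch below is one plausible reading, not the authors' implementation: it assumes k = 5 nearest neighbours under Euclidean distance and labels a minority example by how many of those neighbours share its class (4 to 5 safe, 2 to 3 borderline, 1 rare, 0 outlier). The function names `minority_example_types` and `type_distribution`, the parameter `k`, and the thresholds are all assumptions made for illustration.

```python
import math
from collections import Counter


def minority_example_types(X, y, minority_label, k=5):
    """Label each minority example by the class mix of its k nearest neighbours.

    Assumed cut-offs (not stated in the abstract), expressed relative to k:
    at least 4/5 same-class neighbours -> safe, at least 2/5 -> borderline,
    at least one -> rare, none -> outlier.
    """
    types = {}
    for i, xi in enumerate(X):
        if y[i] != minority_label:
            continue
        # Brute-force Euclidean distances to every other example.
        by_distance = sorted(
            (math.dist(xi, xj), j) for j, xj in enumerate(X) if j != i
        )
        # Count neighbours that belong to the minority class as well.
        same = sum(1 for _, j in by_distance[:k] if y[j] == minority_label)
        if same * 5 >= 4 * k:
            types[i] = "safe"
        elif same * 5 >= 2 * k:
            types[i] = "borderline"
        elif same >= 1:
            types[i] = "rare"
        else:
            types[i] = "outlier"
    return types


def type_distribution(types):
    """Summarise how many minority examples fall into each type."""
    return Counter(types.values())
```

Calling `type_distribution(minority_example_types(X, y, minority_label=1))` on a dataset gives the kind of per-type summary the abstract refers to when it reports that safe examples are scarce. Integer comparisons are used for the thresholds so that a different `k` scales the cut-offs without floating-point rounding.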

Citation (APA)

Napierala, K., & Stefanowski, J. (2012). Identification of different types of minority class examples in imbalanced data. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 7209 LNAI, pp. 139–150). https://doi.org/10.1007/978-3-642-28931-6_14
