The characteristics of the minority class distribution in imbalanced data are studied. Four types of minority examples (safe, borderline, rare and outlier) are distinguished and analysed. We propose a new method for identifying these examples in the data, based on analysing the local neighbourhoods of examples. Its application to UCI imbalanced datasets shows that the minority class is often scattered, without many safe examples. This characteristic of the data distributions is also confirmed by a separate analysis using Multidimensional Scaling visualization. We examine the influence of these types of examples on six different classifiers learned over various real-world datasets. The experimental results show that the classifiers differ in their sensitivity to the type of examples. © 2012 Springer-Verlag.
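The neighbourhood-based identification can be illustrated with a minimal sketch. The abstract does not state the neighbourhood size, the distance measure, or the cut-offs between types, so the choices below (k = 5 nearest neighbours, Euclidean distance, and the thresholds `safe_min` and `borderline_min`) are assumptions made for illustration only, and the function name `label_minority_examples` is hypothetical. Each minority example is labelled by how many of its k nearest neighbours also belong to the minority class.

```python
from math import dist


def label_minority_examples(X, y, minority, k=5, safe_min=4, borderline_min=2):
    """Label each minority example by the class mix of its k nearest neighbours.

    Assumed thresholds (not taken from the abstract), with k = 5:
      >= safe_min same-class neighbours        -> "safe"
      >= borderline_min same-class neighbours  -> "borderline"
      exactly one or more, below borderline    -> "rare"
      no same-class neighbour                  -> "outlier"
    """
    types = {}
    for i, (xi, yi) in enumerate(zip(X, y)):
        if yi != minority:
            continue
        # Brute-force search: sort all other examples by Euclidean distance.
        others = sorted(
            (j for j in range(len(X)) if j != i),
            key=lambda j: dist(xi, X[j]),
        )
        neighbours = others[:k]
        same = sum(1 for j in neighbours if y[j] == minority)
        if same >= safe_min:
            types[i] = "safe"
        elif same >= borderline_min:
            types[i] = "borderline"
        elif same >= 1:
            types[i] = "rare"
        else:
            types[i] = "outlier"
    return types


# Toy data: a compact minority cluster, a majority cluster,
# and one minority point lying inside the majority cluster.
X = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5), (0.2, 0.8),
     (10, 10), (10, 11), (11, 10), (11, 11), (10.5, 10.5), (9.5, 10),
     (10.2, 10.2)]
y = [1] * 6 + [0] * 6 + [1]

types = label_minority_examples(X, y, minority=1)
```

In this toy data the six clustered minority points are surrounded only by their own class, while the last point has only majority neighbours. A brute-force search is used to keep the sketch dependency-free; on real datasets a spatial index would be the more practical choice.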
CITATION STYLE
Napierala, K., & Stefanowski, J. (2012). Identification of different types of minority class examples in imbalanced data. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 7209 LNAI, pp. 139–150). https://doi.org/10.1007/978-3-642-28931-6_14