Semi-supervised Learning (SSL) methods have emerged as an efficient way to mitigate the scarcity of labelled instances. Several methods have been proposed in the literature, of which Self-training and Co-training are two of the best known. The main idea is to build a model from only a few labelled instances and then apply it in an iterative labelling process, in which unlabelled instances are labelled and added to the labelled set. However, the labelling process depends directly on which unlabelled instances are selected, so the criterion used to select and label new instances has an important effect on the performance of a semi-supervised method. In this paper, we propose a distance-weighted selection of unlabelled instances for the Self-training and Co-training semi-supervised methods. In addition, we compare the standard Self-training and Co-training methods against the proposed versions of these two methods over 20 classification datasets.
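The labelling loop described above can be illustrated with a minimal Self-training sketch. The abstract does not state the exact weighting used, so everything specific here is an assumption made for illustration: the base learner is a simple nearest-centroid classifier, the confidence is a softmax over negative distances, and the distance-weighted score is taken to be `confidence / (1 + distance to the predicted class centroid)`. The function names and the `per_round` parameter are hypothetical and are not taken from the paper.

```python
import math


def centroids(X, y):
    """Mean vector of each class in the current labelled set."""
    sums, counts = {}, {}
    for xi, yi in zip(X, y):
        acc = sums.setdefault(yi, [0.0] * len(xi))
        for j, v in enumerate(xi):
            acc[j] += v
        counts[yi] = counts.get(yi, 0) + 1
    return {c: [v / counts[c] for v in s] for c, s in sums.items()}


def predict_with_confidence(cents, x):
    """Return (label, confidence, distance to the predicted class centroid)."""
    dists = {c: math.dist(x, m) for c, m in cents.items()}
    label = min(dists, key=dists.get)
    # Assumption: softmax over negative distances stands in for a
    # classifier's confidence estimate.
    weights = {c: math.exp(-d) for c, d in dists.items()}
    conf = weights[label] / sum(weights.values())
    return label, conf, dists[label]


def self_training(X_lab, y_lab, X_unlab, per_round=2, distance_weighted=True):
    """Iteratively move the best-scored unlabelled instances to the labelled set."""
    X_lab, y_lab, pool = list(X_lab), list(y_lab), list(X_unlab)
    while pool:
        cents = centroids(X_lab, y_lab)  # "retrain" the model
        scored = []
        for idx, x in enumerate(pool):
            label, conf, d = predict_with_confidence(cents, x)
            # Assumed weighting: confidence discounted by distance.
            # The standard variant ranks by confidence alone.
            score = conf / (1.0 + d) if distance_weighted else conf
            scored.append((score, idx, label))
        scored.sort(reverse=True)
        chosen = scored[:per_round]
        for _, idx, label in chosen:
            X_lab.append(pool[idx])
            y_lab.append(label)
        chosen_idx = {idx for _, idx, _ in chosen}
        pool = [x for i, x in enumerate(pool) if i not in chosen_idx]
    return X_lab, y_lab


# Usage: two labelled seeds and six unlabelled points in two loose clusters.
seeds, seed_labels = [(0.0, 0.0), (10.0, 10.0)], [0, 1]
unlabelled = [(1.0, 0.0), (0.0, 1.0), (9.0, 10.0), (10.0, 9.0), (2.0, 2.0), (8.0, 8.0)]
X_out, y_out = self_training(seeds, seed_labels, unlabelled)
```

Setting `distance_weighted=False` gives a confidence-only ranking, which makes it easy to compare the two selection criteria on the same data in the spirit of the comparison the abstract describes.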
CITATION
Barreto, C. A. S., Gorgônio, A. C., Canuto, A. M. P., & Xavier-Júnior, J. C. (2020). A Distance-Weighted Selection of Unlabelled Instances for Self-training and Co-training Semi-supervised Methods. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 12320 LNAI, pp. 352–366). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-030-61380-8_24