Two-stage clustering with k-means algorithm

Raied Salman; Vojislav Kecman; Qi Li; Robert Strack; Erick Test

Conference Proceedings

Communications in Computer and Information Science (2011) 162 CCIS 110-122

DOI: 10.1007/978-3-642-21937-5_11

Citations: 1

Readers: 8

Abstract

k-means has recently been recognized as one of the best algorithms for clustering unlabeled data. Because k-means depends mainly on computing distances between all data points and the centers, its cost is high when the dataset is large (for example, more than 500MG points). We propose a two-stage algorithm to reduce the computational cost for huge datasets. The first stage is a fast calculation that uses a small portion of the data to estimate the best locations of the centers. The second stage is a slow calculation whose initial centers are taken from the first stage. The fast and slow stages represent the movement of the centers. In the slow stage, the whole dataset can be used to obtain the exact locations of the centers. The cost of the fast stage is very low because the chosen subset is small. The cost of the slow stage is also small because few iterations are needed. © 2011 Springer-Verlag.
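The two stages described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how the subset is drawn or how the first centers are chosen, so the uniform random sample, the 10% `sample_fraction` default, the random initial centers, and the function names `lloyd` and `two_stage_kmeans` are all assumptions made for illustration. Both stages use plain Lloyd iterations in pure Python.

```python
import random


def _sq_dist(p, q):
    """Squared Euclidean distance between two points (tuples of floats)."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def lloyd(points, centers, max_iter=100, tol=1e-9):
    """Run plain Lloyd iterations from the given centers.

    Returns (centers, number_of_iterations_used).
    """
    centers = [tuple(c) for c in centers]
    for it in range(1, max_iter + 1):
        # Assignment step: group each point with its nearest center.
        groups = [[] for _ in centers]
        for p in points:
            j = min(range(len(centers)), key=lambda i: _sq_dist(p, centers[i]))
            groups[j].append(p)
        # Update step: move each center to the mean of its group.
        # An empty group keeps its previous center.
        new_centers = []
        for c, g in zip(centers, groups):
            if g:
                new_centers.append(tuple(sum(x) / len(g) for x in zip(*g)))
            else:
                new_centers.append(c)
        # Largest squared movement of any center in this iteration.
        shift = max(_sq_dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift <= tol:
            return centers, it
    return centers, max_iter


def two_stage_kmeans(points, k, sample_fraction=0.1, seed=0):
    """Hypothetical two-stage k-means: fast stage on a sample, slow stage on all data."""
    rng = random.Random(seed)
    # Fast stage (assumed sampling scheme): k-means on a small uniform random subset.
    n_sample = min(len(points), max(k, int(len(points) * sample_fraction)))
    sample = rng.sample(points, n_sample)
    init = rng.sample(sample, k)
    fast_centers, fast_iters = lloyd(sample, init)
    # Slow stage: refine on the whole dataset, starting from the fast-stage centers.
    centers, slow_iters = lloyd(points, fast_centers)
    return centers, fast_iters, slow_iters


# Usage on synthetic data: two well-separated 2-D Gaussian blobs.
demo_rng = random.Random(1)
data = [(demo_rng.gauss(0, 0.5), demo_rng.gauss(0, 0.5)) for _ in range(500)]
data += [(demo_rng.gauss(10, 0.5), demo_rng.gauss(10, 0.5)) for _ in range(500)]
centers, fast_iters, slow_iters = two_stage_kmeans(data, k=2)
```

In this sketch each fast-stage iteration computes distances for only about a tenth of the points, and the slow stage starts from centers that are already close to their final positions, which is the source of the savings the abstract describes. The iteration counts `fast_iters` and `slow_iters` are returned so the two stages can be compared on a given dataset.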

Citation (APA)

Salman, R., Kecman, V., Li, Q., Strack, R., & Test, E. (2011). Two-stage clustering with k-means algorithm. In Communications in Computer and Information Science (Vol. 162 CCIS, pp. 110–122). https://doi.org/10.1007/978-3-642-21937-5_11
