An integrated approach to improve the text categorization using semantic measures

K. Purna Chand; G. Narsimha

Conference Proceedings

Smart Innovation, Systems and Technologies (2015) 32 39-47

DOI: 10.1007/978-81-322-2208-8_5

4 Citations

1 Reader

Abstract

Categorization of text documents plays a vital role in information retrieval systems. Clustering text documents in a way that supports effective classification and the extraction of semantic knowledge is a tedious task. Most existing methods cluster on the basis of factors such as term frequency, document frequency, and feature selection methods, but clustering accuracy is still not up to the mark. In this paper we propose an integrated approach built on a metric named Term Rank Identifier (TRI). TRI measures the frequent terms and indexes them based on their frequency. For those ranked terms, TRI finds the semantics and the corresponding class labels. We also propose a Semantically Enriched Terms Clustering (SETC) algorithm which, integrated with TRI, improves clustering accuracy and leads to incremental text categorization. Our experimental analysis on different data sets showed that the proposed SETC performs better.
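The abstract does not give TRI's exact definition, so the following is only a minimal sketch of the general idea it describes: counting terms and indexing them by frequency. The function name `rank_terms`, the regex tokenizer, the `top_k` parameter, and the sample documents are all assumptions made for illustration; the semantic lookup and class-label steps of TRI are not modelled here.

```python
from collections import Counter
import re


def rank_terms(documents, top_k=5):
    """Count term occurrences across documents and rank them by frequency.

    Illustrative only: this is plain frequency ranking, not the authors' TRI.
    """
    counts = Counter()
    for doc in documents:
        # Simplistic tokenizer (an assumption): lowercase, keep runs of letters.
        counts.update(re.findall(r"[a-z]+", doc.lower()))
    # Sort by descending frequency, then alphabetically so ties are deterministic.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    # Attach a 1-based rank index to each of the top_k terms.
    return [
        (rank, term, freq)
        for rank, (term, freq) in enumerate(ordered[:top_k], start=1)
    ]


# Hypothetical sample documents, not taken from the paper's data sets.
docs = [
    "text clustering groups text documents",
    "semantic clustering of text",
]
ranking = rank_terms(docs, top_k=3)
for rank, term, freq in ranking:
    print(rank, term, freq)
```

A practical pipeline would likely remove stop words and apply stemming before ranking; those steps are omitted here to keep the sketch short.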

Cite

Purna Chand, K., & Narsimha, G. (2015). An integrated approach to improve the text categorization using semantic measures. In Smart Innovation, Systems and Technologies (Vol. 32, pp. 39–47). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-81-322-2208-8_5
