An integrated approach to improve the text categorization using semantic measures

Abstract

Categorization of text documents plays a vital role in information retrieval systems. Clustering text documents in a way that supports effective classification and the extraction of semantic knowledge is a tedious task. Most existing methods cluster documents using factors such as term frequency, document frequency, and feature selection methods, yet clustering accuracy is still not up to the mark. In this paper, we propose an integrated approach with a metric named Term Rank Identifier (TRI). TRI measures the frequent terms and indexes them based on their frequency. For those ranked terms, TRI finds the semantics and the corresponding class labels. We also propose a Semantically Enriched Terms Clustering (SETC) algorithm; integrated with TRI, it improves clustering accuracy, which leads to incremental text categorization. Our experimental analysis on different data sets shows that the proposed SETC performs better.
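
The abstract describes TRI only at a high level, so the following is a minimal sketch of the frequency-ranking step alone, not the authors' implementation. The function name `rank_terms`, the regular-expression tokenizer, the `top_k` cutoff, and the alphabetical tie-breaking are all assumptions made for illustration; the semantic lookup and class-label assignment are not specified in the abstract and are left out.

```python
import re
from collections import Counter


def rank_terms(documents, top_k=5):
    """Count term occurrences across documents and rank them by frequency.

    Hypothetical helper: only illustrates "index terms by frequency".
    """
    counts = Counter()
    for doc in documents:
        # Assumed tokenizer: lowercase runs of ASCII letters.
        counts.update(re.findall(r"[a-z]+", doc.lower()))
    # Sort by descending frequency, then alphabetically so ties are deterministic.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    # Rank 1 is the most frequent term.
    return [
        (rank, term, freq)
        for rank, (term, freq) in enumerate(ordered[:top_k], start=1)
    ]


docs = [
    "clustering text documents",
    "text categorization of documents",
    "semantic text clustering",
]
ranked = rank_terms(docs, top_k=3)
for rank, term, freq in ranked:
    print(rank, term, freq)
```

In a fuller pipeline, the ranked terms would presumably be passed on to a semantic resource and a clustering step, but those stages would need the paper's full text to reproduce faithfully.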

Citation (APA)

Purna Chand, K., & Narsimha, G. (2015). An integrated approach to improve the text categorization using semantic measures. In Smart Innovation, Systems and Technologies (Vol. 32, pp. 39–47). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-81-322-2208-8_5
