Term based semantic clusters for very short text classification

Abstract

Very short texts, such as tweets and invoices, present challenges in classification. Term occurrences are strong indicators of content, but very short texts are so sparse that important semantic relationships are difficult to capture. A solution calls for a method that not only considers term occurrence but also handles sparseness well. In this work, we introduce such an approach, Term Based Semantic Clusters (TBSeC), which employs terms to create distinctive semantic concept clusters. These clusters are ranked using a semantic similarity function, which in turn defines a semantic feature space that can be used for text classification. Our method is evaluated on an invoice classification task. Compared to well-known content representation methods, the proposed method performs competitively.
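The general idea of scoring a short text against term-derived concept clusters can be illustrated with a minimal sketch. This is not the authors' TBSeC implementation: the abstract does not specify how clusters are built or which similarity function is used, so the hand-made three-dimensional vectors, the fixed cluster memberships, the centroid representation, and the max-cosine scoring below are all assumptions chosen for illustration. The function names `semantic_features` and `rank_clusters` are hypothetical.

```python
import math

# Hypothetical toy "embeddings"; a real system would load pretrained vectors.
TOY_VECTORS = {
    "taxi": (0.9, 0.1, 0.0),
    "train": (0.8, 0.2, 0.0),
    "flight": (0.7, 0.3, 0.1),
    "laptop": (0.1, 0.9, 0.0),
    "monitor": (0.0, 0.8, 0.2),
    "coffee": (0.1, 0.0, 0.9),
    "lunch": (0.0, 0.2, 0.8),
}

# Assumed concept clusters, each defined by a few characteristic terms.
CLUSTERS = {
    "travel": ["taxi", "train"],
    "hardware": ["laptop", "monitor"],
    "catering": ["coffee", "lunch"],
}


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def centroid(terms):
    """Component-wise mean of the vectors of the given terms."""
    vectors = [TOY_VECTORS[t] for t in terms]
    return tuple(sum(component) / len(vectors) for component in zip(*vectors))


def semantic_features(text):
    """Map a text to one score per cluster: the best cosine similarity
    between any known token in the text and the cluster centroid."""
    tokens = [t for t in text.lower().split() if t in TOY_VECTORS]
    features = {}
    for name, terms in CLUSTERS.items():
        center = centroid(terms)
        scores = (cosine(TOY_VECTORS[t], center) for t in tokens)
        features[name] = max(scores, default=0.0)
    return features


def rank_clusters(text):
    """Cluster names ordered from most to least similar to the text."""
    features = semantic_features(text)
    return sorted(features, key=features.get, reverse=True)


if __name__ == "__main__":
    print(rank_clusters("flight to Berlin"))
    print(semantic_features("new laptop"))
```

In this sketch the word "flight" belongs to no cluster's term list, yet it still scores highest against the "travel" cluster because its vector lies close to that centroid. That is the sense in which a semantic feature space can compensate for sparse term overlap; the resulting per-cluster scores could then be passed to any standard classifier.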

Cite

Paalman, J., Mullick, S., Zervanou, K., & Zhang, Y. (2019). Term based semantic clusters for very short text classification. In International Conference Recent Advances in Natural Language Processing, RANLP (Vol. 2019-September, pp. 878–887). Incoma Ltd. https://doi.org/10.26615/978-954-452-056-4_102
