Improved email classification through enhanced data preprocessing approach

Abstract

Email has become one of the most widely used forms of communication. The resulting exponential growth in the number of emails received places an immense burden on existing approaches to email classification. Applying a classifier directly to raw data may degrade its performance, so the data must first be prepared to improve the performance of the machine learning classifiers. This paper proposes an enhanced data preprocessing approach for multi-category email classification. The proposed model first removes the email signature. It then removes special characters and unwanted words using preprocessing methods such as stop-word removal, enhanced stop-word removal, and stemming. The proposed model is evaluated with several classifiers: Multinomial Naïve Bayes, Linear Support Vector Classifier, Logistic Regression, and Random Forest. The results show that the proposed preprocessing approach to email classification outperforms the existing approach.
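The preprocessing steps named in the abstract (signature removal, special-character removal, stop-word removal, enhanced stop-word removal, and stemming) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the signature markers, both word lists, the `naive_stem` suffix stripper (a crude stand-in for a real stemmer such as Porter's), and all function names are assumptions made for illustration only.

```python
import re
from typing import List

# Hypothetical, deliberately small stop-word list (assumption, not from the paper).
STOP_WORDS = {"a", "an", "the", "is", "are", "to", "of", "and", "for", "in", "on"}

# Assumed reading of "enhanced" stop-word removal: extra email-specific filler words.
EXTRA_STOP_WORDS = {"hi", "hello", "dear", "thanks", "regards"}

# Assumed signature markers: the conventional "--" delimiter or a common sign-off line.
SIGNATURE_RE = re.compile(
    r"^(--\s*|(best |kind )?regards,?|thanks,?|sincerely,?)$", re.IGNORECASE
)


def strip_signature(body: str) -> str:
    """Drop everything from the first line that looks like a signature marker."""
    lines = body.splitlines()
    for i, line in enumerate(lines):
        if SIGNATURE_RE.match(line.strip()):
            return "\n".join(lines[:i])
    return body


def naive_stem(token: str) -> str:
    """Strip one common suffix; a crude stand-in for a real stemming algorithm."""
    for suffix in ("ing", "ed", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(body: str) -> List[str]:
    """Signature removal, special-character removal, stop-word removal, stemming."""
    text = strip_signature(body).lower()
    text = re.sub(r"[^a-z\s]", " ", text)  # replace digits and special characters
    drop = STOP_WORDS | EXTRA_STOP_WORDS
    return [naive_stem(tok) for tok in text.split() if tok not in drop]


email = (
    "Hi team,\n"
    "The invoices are attached for billing!\n"
    "Best regards,\n"
    "Alice\n"
    "alice@example.com"
)
tokens = preprocess(email)
print(tokens)  # ['team', 'invoic', 'attach', 'bill']
```

The resulting tokens would then be vectorized and passed to classifiers such as those listed in the abstract; that stage is outside this sketch.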

Kumara, B. A., Kodabagi, M. M., Choudhury, T., & Um, J. S. (2021). Improved email classification through enhanced data preprocessing approach. Spatial Information Research, 29(2), 247–255. https://doi.org/10.1007/s41324-020-00378-y
