Enhancing Phishing Email Detection through Ensemble Learning and Undersampling

Abstract

In real-world scenarios, phishing and benign emails are usually imbalanced in number, which biases traditional machine learning and deep learning algorithms towards benign emails and causes them to misclassify phishing emails. Few studies address this imbalance, even though the resulting misclassification significantly threatens people’s financial and information security. To mitigate the impact of imbalance on the model and improve phishing email detection, this paper proposes two new undersampling-based algorithms: the Fisher–Markov-based phishing ensemble detection (FMPED) method and the Fisher–Markov–Markov-based phishing ensemble detection (FMMPED) method. The algorithms first remove benign emails in overlapping areas, then undersample the remaining benign emails, and finally combine the retained benign emails with the phishing emails into a new training set, on which ensemble learning algorithms are trained and used for classification. Experimental results show that the proposed algorithms outperform other machine learning and deep learning algorithms, achieving an F1-score of 0.9945, an accuracy of 0.9945, an AUC of 0.9828, and a G-mean of 0.9827.
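
The three steps named in the abstract (remove overlapping benign emails, undersample the rest, train an ensemble) can be illustrated with a minimal sketch. This is not the FMPED or FMMPED algorithm: the abstract does not describe the Fisher–Markov selector, so the overlap filter below is a stand-in centroid-distance heuristic, the base learner is a hypothetical nearest-centroid classifier, and the two-dimensional feature vectors are toy data. All function names are assumptions made for illustration. The `g_mean` helper shows why the abstract reports G-mean alongside accuracy on imbalanced data.

```python
import math
import random


def centroid(rows):
    """Component-wise mean of a list of equal-length feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def dist(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def drop_overlapping(benign, phishing):
    """Stand-in overlap filter (assumption, not the paper's method):
    keep benign rows that lie closer to the benign centroid than to
    the phishing centroid."""
    cb, cp = centroid(benign), centroid(phishing)
    return [r for r in benign if dist(r, cb) < dist(r, cp)]


def train_ensemble(benign, phishing, n_members=5, seed=0):
    """Train one nearest-centroid member per balanced subset."""
    rng = random.Random(seed)
    kept = drop_overlapping(benign, phishing)
    k = min(len(kept), len(phishing))
    members = []
    for _ in range(n_members):
        sample = rng.sample(kept, k)  # random undersampling of benign rows
        members.append((centroid(sample), centroid(phishing)))
    return members


def predict(members, row):
    """Majority vote over members; 1 = phishing, 0 = benign."""
    votes = sum(1 for cb, cp in members if dist(row, cp) < dist(row, cb))
    return 1 if votes * 2 > len(members) else 0


def g_mean(y_true, y_pred):
    """Geometric mean of recall on each class (both classes must occur)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    pos = sum(y_true)
    neg = len(y_true) - pos
    return math.sqrt((tp / pos) * (tn / neg))


# Toy data: 100 benign points near the origin, 2 benign points that sit
# inside the phishing region, and 9 phishing points near (3, 3).
benign = [[0.1 * i, 0.1 * j] for i in range(10) for j in range(10)]
benign += [[2.9, 3.0], [3.1, 2.9]]
phishing = [[3 + 0.1 * i, 3 + 0.1 * j] for i in range(3) for j in range(3)]

kept = drop_overlapping(benign, phishing)
members = train_ensemble(benign, phishing)
```

On this toy data the filter discards the two benign points inside the phishing region, and each ensemble member then sees 9 benign and 9 phishing examples. For the labels `[1, 1, 0, 0, 0, 0]` and predictions `[1, 0, 0, 0, 0, 0]`, accuracy is 5/6 while `g_mean` is about 0.707, since half of the minority class is missed.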

Cite

Qi, Q., Wang, Z., Xu, Y., Fang, Y., & Wang, C. (2023). Enhancing Phishing Email Detection through Ensemble Learning and Undersampling. Applied Sciences (Switzerland), 13(15). https://doi.org/10.3390/app13158756
