Semi-Supervised Self-Training Approach for Web Robots Activity Detection in Weblog

Abstract

Because web servers carry significant value, they are attractive targets for attack, and web security has received considerable attention. Web server logs, which record every request made by users, have become an important object of analysis in web security. Traditionally, system experts analyze log data manually using keyword searches and regular expressions. However, the volume of log data and the variety of attack types make such manual detection ineffective. Over the last decade, supervised and unsupervised machine learning approaches have been employed extensively to improve on traditional detection methods. The proposed semi-supervised web robot detection system, STBOOST, uses self-training with XGBoost as its base classifier. Experimental data are taken from the NASA 95 dataset, available in an open-source data repository, and from e-commerce site access logs. On both datasets, self-training XGBoost outperforms plain XGBoost and can detect anonymous web robots using unlabeled data.
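The self-training idea named in the abstract can be illustrated with a minimal sketch. This is not the authors' STBOOST implementation: a nearest-centroid classifier stands in for XGBoost so the example runs with the Python standard library alone, and the session features (requests per minute, fraction of image requests), the confidence threshold, and all function names are hypothetical choices made for illustration. The loop trains on the labeled sessions, pseudo-labels the unlabeled sessions it is confident about, adds them to the training set, and refits until nothing new is added.

```python
import math

def fit_centroids(X, y):
    """Stand-in base classifier (assumption: replaces XGBoost): one mean vector per class."""
    sums, counts = {}, {}
    for xi, yi in zip(X, y):
        acc = sums.setdefault(yi, [0.0] * len(xi))
        for j, v in enumerate(xi):
            acc[j] += v
        counts[yi] = counts.get(yi, 0) + 1
    return {c: [s / counts[c] for s in sums[c]] for c in sums}

def predict_proba(centroids, x):
    """Class scores: softmax over negative Euclidean distance to each centroid."""
    weights = {c: math.exp(-math.dist(x, mu)) for c, mu in centroids.items()}
    total = sum(weights.values())
    return {c: w / total for c, w in weights.items()}

def predict(centroids, x):
    """Most probable class for one feature vector."""
    proba = predict_proba(centroids, x)
    return max(proba, key=proba.get)

def self_train(X_lab, y_lab, X_unlab, threshold=0.8, max_rounds=10):
    """Self-training: repeatedly pseudo-label confident unlabeled points and refit."""
    X_lab, y_lab, pool = list(X_lab), list(y_lab), list(X_unlab)
    model = fit_centroids(X_lab, y_lab)
    for _ in range(max_rounds):
        remaining, added = [], 0
        for x in pool:
            proba = predict_proba(model, x)
            label = max(proba, key=proba.get)
            if proba[label] >= threshold:
                # Confident prediction: treat it as a label for the next fit.
                X_lab.append(x)
                y_lab.append(label)
                added += 1
            else:
                remaining.append(x)
        pool = remaining
        if added == 0:
            break  # no confident pseudo-labels left; stop early
        model = fit_centroids(X_lab, y_lab)
    return model, pool

# Hypothetical session features: [requests per minute, fraction of image requests].
X_labeled = [[2.0, 0.6], [3.0, 0.7], [40.0, 0.0], [55.0, 0.05]]
y_labeled = ["human", "human", "robot", "robot"]
X_unlabeled = [[2.5, 0.65], [50.0, 0.02], [20.0, 0.3]]

model, leftover = self_train(X_labeled, y_labeled, X_unlabeled)
print(predict(model, [45.0, 0.01]), predict(model, [4.0, 0.5]))
```

In practice the base classifier would be a gradient-boosted model and the threshold would be tuned on held-out data; the wrapper loop itself stays the same regardless of which classifier is plugged in.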

Cite

Jagat, R. R., Sisodia, D. S., & Singh, P. (2022). Semi-Supervised Self-Training Approach for Web Robots Activity Detection in Weblog. In Lecture Notes on Data Engineering and Communications Technologies (Vol. 116, pp. 911–924). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-981-16-9605-3_64