Supersymmetry theory predicts that every particle in the Standard Model has a superpartner particle with a different mass. Classifying supersymmetric particles in high-energy physics remains a major challenge for physicists. This paper addresses this big data classification problem using the Apache Spark environment with the MLlib library. It explores the performance of machine learning methods on large data, namely the "SUSY" dataset from the UCI Machine Learning Repository. Performance is measured using three metrics: accuracy, area under the curve (AUC), and training computation time (CT). The results are promising: the Gradient Boosted Tree (GBT) classifier achieves a high accuracy score (79%), while the Logistic Regression (LR) algorithm achieves a good AUC score (86%).
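The three metrics named above can be illustrated with a minimal, library-free sketch. It is not the paper's Spark pipeline: the function names (`accuracy`, `auc`, `timed`), the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration, and the AUC is computed with the standard pairwise-ranking formulation rather than any particular MLlib evaluator.

```python
import time


def accuracy(labels, scores, threshold=0.5):
    """Fraction of examples whose thresholded score matches the 0/1 label.

    The 0.5 threshold is an assumption, not a value taken from the paper.
    """
    correct = sum((s >= threshold) == bool(y) for y, s in zip(labels, scores))
    return correct / len(labels)


def auc(labels, scores):
    """Area under the ROC curve via pairwise ranking.

    Equals the probability that a randomly chosen positive example
    receives a higher score than a randomly chosen negative one,
    with ties counted as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def timed(fn, *args):
    """Run fn(*args) and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


# Hypothetical toy data: 1 = signal, 0 = background.
labels = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.4, 0.6, 0.3, 0.2, 0.7]

acc = accuracy(labels, scores)
area, elapsed = timed(auc, labels, scores)
print(f"accuracy={acc:.3f} auc={area:.3f} time={elapsed:.6f}s")
```

In a real experiment, `timed` would wrap the model-fitting call so that CT covers training only. The pairwise AUC loop is quadratic in the number of examples, so it suits small illustrations rather than a dataset of SUSY's size.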
CITATION STYLE
Azhari*, M. … Dakkon, M. (2020). Using Pyspark Environment for Solving a Big Data Problem: Searching for Supersymmetric Particles. International Journal of Innovative Technology and Exploring Engineering, 9(7), 541–546. https://doi.org/10.35940/ijitee.g5308.059720