Using Pyspark Environment for Solving a Big Data Problem: Searching for Supersymmetric Particles

Mourad Azhari*; Abdallah Abarda; Badia Ettaki; Jamal Zerouaoui; Mohamed Dakkon

Journal Article

International Journal of Innovative Technology and Exploring Engineering (2020) 9(7) 541-546

DOI: 10.35940/ijitee.g5308.059720

Abstract

Supersymmetry theory predicts that every particle in the Standard Model has a superpartner particle with a different mass. Classifying supersymmetric particles in high-energy physics is a major challenge for physicists. This paper addresses this big data classification problem using the Apache Spark environment with the "MLlib" library. It explores the performance of machine learning methods on a large dataset, the "SUSY" dataset from the UCI Machine Learning Repository. Performance is measured with three metrics: accuracy, area under the curve (AUC), and training computation time (CT). The results are promising: the Gradient Boosted Tree (GBT) classifier achieves a high accuracy score (79%), while the Logistic Regression (LR) algorithm achieves a good AUC score (86%).
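The abstract reports that one classifier leads on accuracy while another leads on AUC. The plain-Python sketch below illustrates how that can happen: accuracy depends on a decision threshold, whereas AUC depends only on how well positives are ranked above negatives. It is an illustration under stated assumptions, not the authors' code: the abstract does not say how each metric was computed, the labels and scores are made-up toy values (not SUSY data), and the names accuracy, auc, timed, scores_a and scores_b are hypothetical.

```python
import time


def accuracy(labels, scores, threshold=0.5):
    """Fraction of examples whose thresholded score matches the label."""
    predictions = [1 if s >= threshold else 0 for s in scores]
    correct = sum(1 for y, p in zip(labels, predictions) if y == p)
    return correct / len(labels)


def auc(labels, scores):
    """Area under the ROC curve, computed as the probability that a
    randomly chosen positive is scored above a randomly chosen negative
    (ties count as one half)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def timed(fn, *args):
    """Run fn(*args) and return (result, elapsed seconds), the same idea
    as measuring a training computation time around a fit call."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


# Toy data, made up for illustration only.
labels = [1, 1, 1, 0, 0, 0]
scores_a = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2]    # hypothetical classifier A
scores_b = [0.45, 0.4, 0.35, 0.3, 0.2, 0.1]  # hypothetical classifier B

acc_a, acc_b = accuracy(labels, scores_a), accuracy(labels, scores_b)
auc_a, auc_b = auc(labels, scores_a), auc(labels, scores_b)
_, elapsed = timed(auc, labels, scores_a)

print(f"A: accuracy={acc_a:.3f} auc={auc_a:.3f}")  # accuracy 4/6, AUC 8/9
print(f"B: accuracy={acc_b:.3f} auc={auc_b:.3f}")  # accuracy 3/6, AUC 9/9
print(f"AUC computation took {elapsed:.6f} s")
```

On these toy values classifier A has the higher accuracy and classifier B the higher AUC, because B ranks every positive above every negative but all of its scores fall below the 0.5 threshold.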

Cite

Azhari*, M. … Dakkon, M. (2020). Using Pyspark Environment for Solving a Big Data Problem: Searching for Supersymmetric Particles. International Journal of Innovative Technology and Exploring Engineering, 9(7), 541–546. https://doi.org/10.35940/ijitee.g5308.059720
