Audio DeepFake Detection Employing Multiple Parametric Exponential Linear Units

Md Shahidul Alam; Abderrahim Fathan; Jahangir Alam

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2023) 14339 LNAI 307-321

DOI: 10.1007/978-3-031-48312-7_25

Abstract

The main objective of an audio deepfake detection system is to find the artifacts that speech synthesis or voice conversion leaves in the input speech. The recent trend in deepfake detection is to employ deep learning architectures in an end-to-end fashion to discriminate between bonafide and spoofed speech signals. In deep learning, activation functions play an important role in deciding whether a neuron’s input is relevant to the prediction or classification. In this work, we propose to employ the Multiple Parametric Exponential Linear Unit (MPELU) activation function with the Residual Network (ResNet) architecture. The MPELU activation function aims to generalize and unify the rectified and exponential linear units. Furthermore, we adopt the Attention Rectified Linear Unit (AReLU), which adds an element-wise sign-based attention mechanism to a ReLU module and thereby enhances positive elements and suppresses negative ones in a data-adaptive manner. The proposed frameworks were evaluated on the logical access (LA) task of the ASVspoof 2019 dataset and outperformed systems using standard non-learnable and learnable activation functions.
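The abstract names the two activation functions but does not give their formulas. The sketch below is an illustration only: it follows the commonly cited formulations of MPELU and AReLU, not code from this paper. The function names, the scalar pure-Python form, and the default parameter values are assumptions; in a real network, `alpha` and `beta` would be learnable parameters applied element-wise to tensors.

```python
import math


def mpelu(x: float, alpha: float = 1.0, beta: float = 1.0) -> float:
    """MPELU sketch: identity for x > 0, alpha * (exp(beta * x) - 1) otherwise.

    alpha and beta are fixed floats here (assumption); they are learned in practice.
    """
    if x > 0:
        return x
    return alpha * (math.exp(beta * x) - 1.0)


def arelu(x: float, alpha: float = 0.9, beta: float = 2.0) -> float:
    """AReLU sketch: sign-based scaling of the input.

    Negative inputs are suppressed by a factor clamped to [0.01, 0.99];
    non-negative inputs are amplified by 1 + sigmoid(beta).
    The defaults are illustrative assumptions.
    """
    if x < 0:
        return min(max(alpha, 0.01), 0.99) * x
    return (1.0 + 1.0 / (1.0 + math.exp(-beta))) * x


# Special cases showing how MPELU unifies rectified and exponential units:
relu_like = mpelu(-1.0, alpha=0.0)            # alpha = 0 zeroes the negative branch (ReLU)
elu_like = mpelu(-1.0, alpha=1.0, beta=1.0)   # alpha = beta = 1 recovers ELU: exp(-1) - 1
```

Setting `alpha = 0` reduces MPELU to ReLU, and `alpha = beta = 1` reduces it to ELU, which is the sense in which it generalizes both families.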

Cite

Alam, M. S., Fathan, A., & Alam, J. (2023). Audio DeepFake Detection Employing Multiple Parametric Exponential Linear Units. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 14339 LNAI, pp. 307–321). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-031-48312-7_25
