Deep learning has been getting more attention towards the researchers for transforming input data into an effective representation through various learning algorithms. Hence it requires a large and variety of datasets to ensure good performance and generalization. But manually labeling a dataset is really a time consuming and expensive process, limiting its size. Some of websites like YouTube and Freesound etc. provide large volume of audio data along with their metadata. General purpose audio tagging is one of the newly proposed tasks in DCASE that can give valuable insights into classification of various acoustic sound events. The proposed work analyzes a large scale imbalanced audio data for a audio tagging system. The baseline of the proposed audio tagging system is based on Convolutional Neural Network with Mel Frequency Cepstral Coefficients. Audio tagging system is developed with Google Colaboratory on free Telsa K80 GPU using keras, Tensorflow, and PyTorch. The experimental result shows the performance of proposed audio tagging system with an average mean precision of 0.92.
CITATION STYLE
Sophiya, E., & Jothilakshmi, S. (2019). Audio tagging system using deep learning model. International Journal of Innovative Technology and Exploring Engineering, 8(10), 1949–1957. https://doi.org/10.35940/ijitee.J9281.0881019
Mendeley helps you to discover research relevant for your work.