Optimized Deep Learning Audio Tagging Approach

Fatma S. El-metwally, Department of Computer Engineering and Control Systems, Faculty of Engineering, Mansoura University
Ali I. Eldesouky, Computer Engineering Department, University of Mansoura, Mansoura
Sally M. Elghamrawy, Communications & Computer Engineering Department, Misr Higher Institute for Engineering and Technology

Corresponding Author

Sally M. Elghamrawy

Subject Area

Computer and Control Systems Engineering

Article Type

Original Study

Abstract

Audio signal processing applies algorithms and techniques to record, enhance, store, and transmit audio signals. Audio tagging (AT) is the task of predicting the tags of audio clips. Advances in deep learning and audio signal processing have significantly improved audio tagging. Several studies have introduced different audio tagging techniques, but their reported performance remains insufficient. This study proposes an Optimized Deep Learning Audio Tagging (ODLAT) approach to analyze and classify audio clips. Features are extracted from each input signal and fed into a neural network that performs multi-label classification to predict the tags. Adam and Adamax are used as adaptive learning-rate optimization methods. Many experiments are conducted to validate the ODLAT approach against other approaches. The results show that the proposed approach outperforms them.
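
As an illustration of two ideas named in the abstract, the following minimal Python sketch shows a multi-label decision rule (each tag is scored independently with a sigmoid and a threshold) and single-parameter update steps for Adam and Adamax. It is not the authors' implementation: the tag names, the 0.5 threshold, and the hyperparameter defaults (the commonly cited ones for these optimizers) are assumptions made for illustration, and the abstract does not state the settings actually used.

```python
import math


def sigmoid(x):
    """Map a raw network output (logit) to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def predict_tags(logits, tag_names, threshold=0.5):
    """Multi-label decision: every tag is judged independently, so a clip
    can receive zero, one, or several tags (threshold is an assumption)."""
    return [name for name, z in zip(tag_names, logits)
            if sigmoid(z) >= threshold]


def adam_step(w, g, m, v, t, lr=0.001, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update for a single weight w with gradient g at step t >= 1."""
    m = b1 * m + (1 - b1) * g          # first-moment (mean) estimate
    v = b2 * v + (1 - b2) * g * g      # second-moment estimate
    m_hat = m / (1 - b1 ** t)          # bias correction
    v_hat = v / (1 - b2 ** t)
    return w - lr * m_hat / (math.sqrt(v_hat) + eps), m, v


def adamax_step(w, g, m, u, t, lr=0.002, b1=0.9, b2=0.999, eps=1e-8):
    """One Adamax update: the second moment is replaced by an
    exponentially weighted infinity norm of past gradients."""
    m = b1 * m + (1 - b1) * g
    u = max(b2 * u, abs(g))            # infinity-norm estimate
    # eps guards against division by zero (an implementation assumption)
    return w - (lr / (1 - b1 ** t)) * m / (u + eps), m, u


# Hypothetical tags and logits, for illustration only.
tags = predict_tags([2.0, -1.0, 0.3], ["dog", "rain", "speech"])
print(tags)  # ['dog', 'speech']
```

In a full pipeline, the logits would come from a neural network fed with features such as short-time Fourier transform spectrograms, and the update steps would be applied to every trainable weight.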

Keywords

Deep learning, audio tagging, short-time Fourier transform, multi-label classification

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.

Recommended Citation

El-metwally, Fatma S.; Eldesouky, Ali I.; and Elghamrawy, Sally M. (2023) "Optimized Deep Learning Audio Tagging Approach," Mansoura Engineering Journal: Vol. 48: Iss. 2, Article 12.
Available at: https://doi.org/10.58491/2735-4202.3096
