Subject Area
Computer and Control Systems Engineering
Article Type
Original Study
Abstract
Audio signal processing applies algorithms and techniques to record, enhance, store, and transmit audio signals. Audio tagging (AT) is the task of predicting the tags of audio clips. Advances in deep learning and audio signal processing have significantly improved audio tagging. Several studies have introduced different audio tagging techniques, but the performance they report remains insufficient. This study proposes an Optimized Deep Learning Audio Tagging (ODLAT) approach to classify and analyze audio clips for tagging. Different features are extracted from each input signal and fed into a neural network that performs multi-label classification to predict the tags. Adam and Adamax, both adaptive learning-rate optimizers, are used to train the network. Experiments are conducted to test the validity of the ODLAT approach against other approaches. The results show the superiority of the proposed approach.
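The pipeline summarized in the abstract (spectral features, then multi-label prediction) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `stft_magnitude` and `predict_tags`, the frame length, hop size, threshold, and tag names are all assumptions chosen for illustration, and the logits are hand-written stand-ins for the output of a trained network.

```python
import numpy as np

def stft_magnitude(signal, frame_len=256, hop=128):
    """Magnitude spectrogram from a Hann-windowed short-time Fourier transform.

    frame_len and hop are illustrative values, not taken from the article.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop: i * hop + frame_len] * window
                       for i in range(n_frames)])
    # One row per frame, one column per frequency bin (frame_len // 2 + 1 bins)
    return np.abs(np.fft.rfft(frames, axis=1))

def predict_tags(logits, tag_names, threshold=0.5):
    """Multi-label decision: an independent sigmoid per tag, then a threshold.

    Unlike softmax classification, any number of tags may be active at once.
    """
    probs = 1.0 / (1.0 + np.exp(-np.asarray(logits, dtype=float)))
    return [name for name, p in zip(tag_names, probs) if p >= threshold]

# Toy input: one second of a 440 Hz tone sampled at 8 kHz (hypothetical data)
sr = 8000
t = np.arange(sr) / sr
features = stft_magnitude(np.sin(2 * np.pi * 440 * t))

# Hypothetical network outputs for three made-up tags
tags = predict_tags([2.0, -1.0, 0.3], ["speech", "music", "dog"])
```

In a full system, `features` would be fed to a neural network trained with an optimizer such as Adam or Adamax, and the network's per-tag outputs would replace the hand-written logits.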
Keywords
Deep learning, audio tagging, short-time Fourier transform, multi-label classification
Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 License.
Recommended Citation
El-metwally, Fatma S.; Eldesouky, Ali I.; and Elghamrawy, Sally M. (2023) "Optimized Deep Learning Audio Tagging Approach," Mansoura Engineering Journal: Vol. 48: Iss. 2, Article 12.
Available at:
https://doi.org/10.58491/2735-4202.3096