
A hybrid deep learning approach for classification of music genres using wavelet and spectrogram analysis

  • Original Article
Neural Computing and Applications

Abstract

Manually classifying millions of songs by genre is a challenging task for humans, which motivates an automated model that can classify song genres accurately. In this paper, a deep learning-based hybrid model is proposed for the analysis and classification of music genre files. The proposed hybrid model mainly combines multimodal and transfer learning-based models for classification. The model is analyzed using the GTZAN and Ballroom datasets. The GTZAN dataset contains 1000 music files across 10 genres (Metal, Classical, Rock, Reggae, Pop, Disco, Blues, Country, Hip-Hop and Jazz), and each file is 30 s long. The Ballroom dataset contains 698 music files across 8 genres (Tango, ChaChaCha, Rumba, Viennese waltz, Jive, Waltz, Quickstep and Samba), and each file is 30 s long. The performance of the model is evaluated using Python. The macro-average and weighted average are used to compute the percentage accuracy of each model. The results show that the proposed hybrid model performs better than other deep learning models, such as the convolutional neural network model, the transfer learning-based model and the multimodal model, as well as machine learning models and other existing models, in terms of training accuracy, validation accuracy, training loss, validation loss, precision, recall, F1-score and support.
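The abstract names the macro-average and the weighted average but does not give their formulas. The sketch below shows the conventional definitions, assuming per-class precision, recall and F1-score are computed first: the macro-average is the unweighted mean over genres, while the weighted average weights each genre by its support (the number of true samples of that genre). The function names and the six-sample toy labels are illustrative assumptions, not the authors' evaluation code or data.

```python
from collections import Counter


def per_class_report(y_true, y_pred):
    """Return {label: (precision, recall, f1, support)} for every label seen."""
    labels = sorted(set(y_true) | set(y_pred))
    true_pos = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    pred_count = Counter(y_pred)   # how often each label was predicted
    support = Counter(y_true)      # how often each label truly occurs
    report = {}
    for label in labels:
        tp = true_pos[label]
        precision = tp / pred_count[label] if pred_count[label] else 0.0
        recall = tp / support[label] if support[label] else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report[label] = (precision, recall, f1, support[label])
    return report


def macro_average(report):
    """Unweighted mean of (precision, recall, f1) over classes."""
    n = len(report)
    return tuple(sum(row[i] for row in report.values()) / n for i in range(3))


def weighted_average(report):
    """Mean of (precision, recall, f1) weighted by each class's support."""
    total = sum(row[3] for row in report.values())
    return tuple(
        sum(row[i] * row[3] for row in report.values()) / total
        for i in range(3)
    )


# Toy example (hypothetical labels): 4 blues clips and 2 jazz clips,
# with one blues clip misclassified as jazz.
y_true = ["blues", "blues", "blues", "blues", "jazz", "jazz"]
y_pred = ["blues", "blues", "blues", "jazz", "jazz", "jazz"]

report = per_class_report(y_true, y_pred)
macro = macro_average(report)
weighted = weighted_average(report)

print("macro    (P, R, F1):", [round(v, 3) for v in macro])
print("weighted (P, R, F1):", [round(v, 3) for v in weighted])
```

On balanced data the two averages coincide. They diverge only when genres have unequal sample counts, as in the toy example, where the larger blues class pulls the weighted recall below the macro recall.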

Data availability

Data are available on request.

Acknowledgements

We thank the Department of CSE, PMEC Berhampur (Government), India, for providing adequate infrastructure and facilities to conduct this research work. We also thank Sonalisha Mohapatra for her overall coordination with group members Anup Pradhan, Subhankar Dash, and Subham Kumar Biswal of the Department of CSE, PMEC Berhampur, in supporting the completion of this research work under the guidance of Dr. Kalyan Kumar Jena and Dr. Sourav Kumar Bhoi.

Funding

There is no funding information.

Author information

Contributions

KKJ, SKB and SM contributed equally to this work. SB contributed to the design of the hybrid model and the simulations.

Corresponding author

Correspondence to Kalyan Kumar Jena.

Ethics declarations

Conflict of interest

There is no conflict of interest.

Ethical approval

Not applicable.

Consent to participate

Not applicable.

Consent to publication

Not applicable.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Jena, K.K., Bhoi, S.K., Mohapatra, S. et al. A hybrid deep learning approach for classification of music genres using wavelet and spectrogram analysis. Neural Comput & Applic 35, 11223–11248 (2023). https://doi.org/10.1007/s00521-023-08294-6

  • DOI: https://doi.org/10.1007/s00521-023-08294-6
