International Journal of Art Innovation and Development, 2022, 3(4); doi: 10.38007/IJAID.2022.030405.

Music Emotion Recognition Model Integrating Deep Learning


Huimin Yang

Corresponding Author:
Huimin Yang

Hebei Chemical & Pharmaceutical College, Shijiazhuang, China


As digital music technology has developed, researchers have begun to explore new classification and recognition methods for retrieving target music from massive data. Music is a carrier of emotion, so recognition research based on music emotion classification is highly significant. This paper studies a music emotion recognition model that incorporates deep learning. Low-level audio features are extracted from the music and screened, fused with audio features obtained by deep learning, and passed to a CNN-SVM network model for music emotion classification and recognition. The advantages of the two are thereby combined for the music emotion classification task, and comparative experiments are carried out on three different datasets. The experiments show that the proposed CNN-SVM model, combined with the filtering of the CNN layer and the new chord vector feature, achieves the best results on all three datasets.
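The fusion idea summarized in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the fixed filters stand in for learned CNN filters, the single hand-crafted value per clip stands in for the screened low-level features, the task is reduced to two classes, and all names (`conv_relu_maxpool`, `fuse`, `train_linear_svm`, `clips`) and data are hypothetical.

```python
# Illustrative sketch only: fixed filters stand in for learned CNN filters,
# and a tiny linear SVM stands in for the SVM stage. All data are made up.

def conv_relu_maxpool(signal, kernel):
    """Valid 1-D cross-correlation, ReLU, then global max pooling.

    Assumes len(signal) >= len(kernel).
    """
    k = len(kernel)
    responses = [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]
    return max(max(r, 0.0) for r in responses)


def fuse(handcrafted, signal, kernels):
    """Concatenate hand-crafted features with the pooled filter responses."""
    return list(handcrafted) + [conv_relu_maxpool(signal, k) for k in kernels]


def train_linear_svm(X, y, lr=0.1, lam=0.01, epochs=200):
    """Subgradient descent on the L2-regularized hinge loss; labels are +1/-1."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            # Weight decay from the regularization term.
            w = [(1.0 - lr * lam) * wj for wj in w]
            # Hinge-loss step only when the sample violates the margin.
            if margin < 1.0:
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
                b += lr * yi
    return w, b


def predict(w, b, x):
    """Return +1 or -1 from the sign of the decision function."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1


# Hypothetical filters: a change detector and a smoother.
KERNELS = [[1.0, -1.0], [0.5, 0.5]]

# Hypothetical clips: (hand-crafted feature, short signal, label),
# with +1 for "energetic" and -1 for "calm".
clips = [
    ([0.9], [1.0, -1.0, 1.0, -1.0, 1.0, -1.0], 1),
    ([0.8], [0.8, -0.8, 0.8, -0.8, 0.8, -0.8], 1),
    ([0.3], [0.5, 0.5, 0.6, 0.6, 0.5, 0.5], -1),
    ([0.2], [0.4, 0.4, 0.4, 0.5, 0.5, 0.4], -1),
]

fused = [fuse(h, s, KERNELS) for h, s, _ in clips]
labels = [label for _, _, label in clips]
w, b = train_linear_svm(fused, labels)
predictions = [predict(w, b, x) for x in fused]
print(predictions)
```

In this sketch the classifier sees one fused vector per clip, so the hand-crafted value and the filter responses contribute to the same decision boundary. A multi-class emotion task would need, for example, a one-vs-rest extension, which is omitted here.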


Deep Learning, Music Emotion Recognition, Recognition Model, CNN-SVM Model

Cite This Paper

Huimin Yang. Music Emotion Recognition Model Integrating Deep Learning. International Journal of Art Innovation and Development (2022), Vol. 3, Issue 4: 53-60. https://doi.org/10.38007/IJAID.2022.030405.

