A classifier ensemble model for music emotion classification

Charles, J.; Lekamge, L.S.

URI: http://ir.lib.ruh.ac.lk/xmlui/handle/iruor/11949

Date: 2020-01-22

Abstract:

Computational modelling of music emotion has attracted increasing attention, especially in today’s digital age. Although the use of conventional machine learning algorithms for music-emotion classification is frequently reported in the literature, these studies mostly use Western/Western classical music, while traditional melodies such as Sri Lankan folk music remain less explored. Therefore, we considered a Sri Lankan folk music dataset introduced by the authors in a previous study, comprising 206 stimuli (30 seconds; 44100 Hz; stereo; 32-bit; .wav). The stimuli were purposely composed and orchestrated to express happy (54), sad (70), or fear (82) as the predominant emotion, and the emotion annotation was performed by a panel of musicologists. Twenty-two features related to dynamics, rhythm, timbre, pitch, and tonality were extracted using the MATLAB MIRToolbox. Five individual classifiers (Logistic Regression (LR), Naive Bayes, Decision Tree, Random Forest (RF), and k-Nearest Neighbor (k-NN)) were applied to the dataset. k-NN outperformed the others, yielding an accuracy of 78.44%, followed by RF (76.19%) and LR (73.42%). These classifiers were then combined in an ensemble model using max-voting. Results were further improved with ensemble boosting techniques (AdaBoost, Gradient Boosting, and XGBoost). With optimized features, AdaBoost (RF) yielded the highest accuracy (95.23%) while significantly reducing classifier training time. This accuracy exceeds reported state-of-the-art results, although a direct comparison was not possible owing to differences in the datasets, listener populations, etc. used in each study. Future work will explore features such as Mel-frequency cepstral coefficients (MFCCs) and their deltas, which are predominantly used in the emotion recognition literature.
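
The max-voting step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `max_vote`, the tie-breaking rule, and the example predictions are all hypothetical, chosen only to show how per-classifier labels might be combined into one ensemble label per stimulus.

```python
from collections import Counter


def max_vote(predictions):
    """Combine per-classifier label lists by majority (max) voting.

    predictions: list of label lists, one list per classifier, all the
    same length. Ties go to the label of the earliest-listed classifier
    (an assumption made for this sketch, not taken from the abstract).
    """
    n_samples = len(predictions[0])
    if any(len(p) != n_samples for p in predictions):
        raise ValueError("all classifiers must predict the same number of samples")

    voted = []
    for votes in zip(*predictions):  # one tuple of votes per sample
        counts = Counter(votes)
        top = max(counts.values())
        # Pick the first label, in classifier order, that reaches the top count.
        voted.append(next(label for label in votes if counts[label] == top))
    return voted


# Hypothetical predictions from three classifiers on four stimuli.
knn = ["happy", "sad", "fear", "fear"]
rf = ["happy", "sad", "sad", "fear"]
lr = ["sad", "sad", "fear", "happy"]

ensemble = max_vote([knn, rf, lr])
print(ensemble)
```

Listing the individually strongest classifier first means it decides ties, which is one simple design choice among several; weighting votes by validation accuracy would be another.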
