Abstract:
Music has been shaped by technology, and data is now one of its fundamental
building blocks. Through the concepts of Music Information Retrieval (MIR),
data embedded in music can be retrieved and used effectively for data mining
and machine learning. This study aimed to utilize correlation analysis and a
gradient boosting technique to increase the accuracy of music genre
prediction. MIR techniques were used to extract data from 200 music tracks,
which was then preprocessed and subjected to a correlation analysis. The
correlation analysis identified the most correlated set of music features
as roll-off, beats, zero-crossing rate, spectral centroid, tempo, spectral
bandwidth and a Mel-frequency cepstral coefficient (mfcc2). An experiment
was then designed to measure the accuracy achieved using correlation and gradient
boosting. As the first experiment, a Random Forest Classifier
(RFC) and an XGBoost Classifier (XGBC) were developed to predict the genres
using the full set of extracted features as input. These yielded F-scores of
60.5% and 63% for RFC and XGBC, respectively. For the second
experiment, an RFC and an XGBC were developed using only the identified set of
correlated features, achieving accuracies of 93.5% and 100%, respectively. In
both experiments, the models were trained to classify 10 music genres, and
an 80:20 training-to-testing data split was used. Given these results, it can
be concluded that combining a gradient boosting technique with correlation
analysis increased the accuracy of music genre prediction from
music data.
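
The correlation-based feature selection and the 80:20 split described above can be illustrated with a minimal sketch. It is not the study's implementation: it assumes that "correlated" means the absolute Pearson correlation between a feature column and the integer-encoded genre label, and the 0.5 threshold, the helper names (`pearson`, `select_correlated`, `split_80_20`) and the toy data are all hypothetical. The classifiers themselves (RFC, XGBC) are omitted.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / math.sqrt(var_x * var_y)


def select_correlated(features, labels, threshold):
    """Return names of features whose |r| with the labels is >= threshold.

    `features` maps a feature name to its column of values.
    The threshold is an assumption; the abstract does not state one.
    """
    return [
        name
        for name, column in features.items()
        if abs(pearson(column, labels)) >= threshold
    ]


def split_80_20(n_rows, seed=0):
    """Shuffle row indices and split them 80:20 into train and test."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    cut = n_rows * 8 // 10
    return indices[:cut], indices[cut:]


# Toy data, NOT the study's: 200 rows, 10 integer-encoded genres.
rng = random.Random(42)
labels = [i % 10 for i in range(200)]
features = {
    # Hypothetical column that rises with the encoded label.
    "tempo": [60 + 10 * g + rng.gauss(0, 5) for g in labels],
    # Hypothetical column drawn independently of the label.
    "noise": [rng.gauss(0, 1) for _ in labels],
}

selected = select_correlated(features, labels, threshold=0.5)
train_idx, test_idx = split_80_20(len(labels))
```

In this sketch only the selected columns, restricted to the rows in `train_idx`, would be passed to a classifier, with the rows in `test_idx` held out for evaluation. Correlating against an integer-encoded categorical label is a simplification, since the encoding order is arbitrary.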