Van Pham, T. and Quang, N.T.N. and Thanh, T.M. (2019) Deep learning approach for singer voice classification of Vietnamese popular music. In: 10th International Symposium on Information and Communication Technology, SoICT 2019, 4 December 2019 through 6 December 2019.
Abstract
Singer voice classification is a meaningful task in the digital era. Given the huge number of songs available today, identifying the singer is very helpful for music information retrieval, music property indexing, and similar tasks. In this paper, we propose a new method to identify the singer's name based on analysis of Vietnamese popular music. We use vocal segment detection and singing voice separation as pre-processing steps, whose purpose is to extract the singer's voice from the mixed audio. To build a singer classifier, we propose a neural network architecture that takes Mel-frequency cepstral coefficients (MFCCs) extracted from the separated vocals as input features. To verify the accuracy of our method, we evaluate on a dataset of 300 Vietnamese songs by 18 famous singers. We achieve an accuracy of 92.84% with 5-fold stratified cross-validation, the best result among the methods compared on the same dataset. © 2019 Association for Computing Machinery.
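The evaluation protocol named in the abstract, 5-fold stratified cross-validation, can be illustrated with a minimal sketch. This is not the authors' code: the function name `stratified_k_fold`, the round-robin assignment, and the toy label list (18 singers with 10 songs each) are assumptions made for illustration only, since the abstract does not state how many songs each singer has or how the folds were built.

```python
import random
from collections import defaultdict


def stratified_k_fold(labels, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs that keep class proportions similar in every fold."""
    rng = random.Random(seed)

    # Group sample indices by class label (here: by singer).
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    # Shuffle each class separately, then deal its samples round-robin across folds.
    folds = [[] for _ in range(k)]
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)

    # Each fold serves once as the test set; the remaining folds form the training set.
    for i in range(k):
        test_idx = sorted(folds[i])
        train_idx = sorted(idx for j in range(k) if j != i for idx in folds[j])
        yield train_idx, test_idx


# Hypothetical labels: 18 singers, 10 songs each (NOT the paper's actual distribution).
labels = [singer for singer in range(18) for _ in range(10)]
splits = list(stratified_k_fold(labels, k=5))
```

With these toy labels, every test fold contains two songs per singer, so a classifier is always evaluated on all 18 classes. In practice the reported accuracy would be the mean over the five folds.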
Item Type: | Conference or Workshop Item (Paper) |
---|---|
Divisions: | Faculties > Faculty of Information Technology |
Identification Number: | 10.1145/3368926.3369700 |
Uncontrolled Keywords: | Classification (of information); Information retrieval; Network architecture; Cross validation; Input features; Learning approach; Mel-frequency cepstral coefficients; Music information retrieval; Popular music; Pre-processing step; Singing voice separations; Deep learning |
Additional Information: | Conference code: 156141. Language of original document: English. All Open Access, Green. |
URI: | http://eprints.lqdtu.edu.vn/id/eprint/9199 |