SOME NEW RESULTS ON AUTOMATIC IDENTIFICATION OF VIETNAMESE FOLK SONGS CHEO AND QUANHO
Keywords: Identification, folk songs, Vietnamese, Cheo, Quan ho, GMM, MFCC, excerpt, tempo, F0.
Abstract. Vietnamese folk songs are very rich in genre and content. Automatically identifying Vietnamese folk tunes will support the storage of, and search for, information about these tunes. This paper first surveys work on music genre classification carried out in Vietnam and abroad. For two very popular types of Vietnamese folk song, Cheo and Quan ho, the paper then describes the dataset and the GMM (Gaussian Mixture Model) used in our identification experiments. The GMM was evaluated with four parameter sets built from MFCCs (Mel Frequency Cepstral Coefficients), energy, the first and second derivatives of the MFCCs and energy, tempo, intensity, and fundamental frequency. The results show that the parameters added to the MFCCs contribute significantly to improving identification accuracy when the number of Gaussian components M is chosen appropriately. Our experiments also show that with excerpts whose average length was only 29.63% of the whole song for Cheo and 38.1% for Quan ho, the identification rate dropped by only 3.1% and 2.33%, respectively, compared with using whole songs.
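The decision rule behind the experiments described above is the standard GMM maximum-likelihood classifier: one GMM is trained per genre on frame-level features, and a song (or excerpt) is assigned to the genre whose model gives its frames the highest total log-likelihood. The following is a minimal numpy sketch of that scoring step only; the toy single-component models, feature dimensions, and the `identify`/`gmm_loglik` names are illustrative assumptions, not the paper's actual trained models or toolkit.

```python
import numpy as np

def gmm_loglik(X, weights, means, variances):
    """Total log-likelihood of frames X under a diagonal-covariance GMM.

    X: (T, D) feature frames; weights: (M,); means, variances: (M, D).
    """
    per_comp = []
    for w, mu, var in zip(weights, means, variances):
        # Log of one weighted diagonal Gaussian component, evaluated per frame.
        ll = -0.5 * (np.sum(np.log(2.0 * np.pi * var))
                     + np.sum((X - mu) ** 2 / var, axis=1))
        per_comp.append(np.log(w) + ll)
    stacked = np.vstack(per_comp)                 # shape (M, T)
    m = stacked.max(axis=0)                       # log-sum-exp over the M components
    frame_ll = m + np.log(np.exp(stacked - m).sum(axis=0))
    return float(frame_ll.sum())                  # frames assumed independent

def identify(X, models):
    """Assign X to the genre whose GMM scores it highest."""
    return max(models, key=lambda name: gmm_loglik(X, *models[name]))

# Toy single-component models standing in for trained Cheo / Quan ho GMMs
# (in practice M components are fit with EM on MFCC+energy+delta features).
models = {
    "Cheo":    (np.array([1.0]), np.array([[0.0, 0.0]]), np.array([[1.0, 1.0]])),
    "Quan ho": (np.array([1.0]), np.array([[5.0, 5.0]]), np.array([[1.0, 1.0]])),
}
rng = np.random.default_rng(0)
frames = rng.normal(5.0, 1.0, size=(50, 2))  # excerpt frames near the "Quan ho" model
print(identify(frames, models))              # expected to print "Quan ho"
```

Scoring an excerpt rather than a whole song simply means summing frame log-likelihoods over fewer frames, which is why shorter excerpts cost only a few percentage points of accuracy.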