Pham Ngoc Hung, Trinh Van Loan, Nguyen Hong Quang


Dialect identification has been studied for many languages around the world; nevertheless, research on signal processing for Vietnamese dialects is still limited, and few works have been published. Vietnamese has many different dialects, and dialectal features have an important influence on speech recognition systems. If dialect information is known during the speech recognition process, the performance of recognition systems will be better, because the corpus of these systems is normally organized according to dialect. This paper presents the combination of MFCC coefficients and fundamental frequency features of Vietnamese for dialect identification based on Gaussian mixture models (GMM). Experimental results on the Vietnamese dialect corpus show that identification performance increases from 59% when using only MFCC coefficients to 71% when using MFCC coefficients together with fundamental frequency information.


Fundamental frequency; MFCC; GMM; Vietnamese dialects; identification
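
The general idea of the abstract (append fundamental frequency to each MFCC frame, score the utterance against one GMM per dialect, and pick the best-scoring dialect) can be illustrated with a minimal sketch. This is not the authors' implementation: the dialect names, the tiny 2-dimensional "MFCC" vectors, the diagonal covariances, and all model parameters below are hypothetical values chosen for illustration, and the models are assumed to be already trained.

```python
import math

def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def gmm_log_likelihood(frames, gmm):
    """Sum over frames of log sum_k w_k N(x | mean_k, var_k)."""
    total = 0.0
    for x in frames:
        comps = [math.log(w) + log_gauss_diag(x, m, v) for w, m, v in gmm]
        peak = max(comps)  # log-sum-exp trick for numerical stability
        total += peak + math.log(sum(math.exp(c - peak) for c in comps))
    return total

def combine(mfcc_frames, f0_values):
    """Append each frame's F0 value to its MFCC vector."""
    return [list(c) + [f] for c, f in zip(mfcc_frames, f0_values)]

def identify(frames, models):
    """Return the dialect whose GMM gives the highest log-likelihood."""
    return max(models, key=lambda d: gmm_log_likelihood(frames, models[d]))

# Hypothetical toy models: each component is (weight, mean, variance),
# with 2 "MFCC" dimensions followed by 1 F0 dimension (in Hz).
models = {
    "dialect_A": [(0.5, [0.0, 0.0, 120.0], [1.0, 1.0, 100.0]),
                  (0.5, [1.0, 1.0, 140.0], [1.0, 1.0, 100.0])],
    "dialect_B": [(0.5, [0.0, 0.0, 200.0], [1.0, 1.0, 100.0]),
                  (0.5, [1.0, 1.0, 220.0], [1.0, 1.0, 100.0])],
}

mfcc = [[0.1, -0.2], [0.9, 1.1], [0.4, 0.5]]  # hypothetical MFCC frames
f0 = [125.0, 138.0, 130.0]                     # hypothetical F0 per frame
frames = combine(mfcc, f0)
predicted = identify(frames, models)
print(predicted)
```

In this toy setup the two models share the same MFCC parameters and differ only in the F0 dimension, so the decision rests entirely on the appended fundamental frequency. A real system would estimate the GMM parameters from a dialect corpus (for example with the EM algorithm) and would use full MFCC vectors.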





Journal of Computer Science and Cybernetics ISSN: 1813-9663

Published by Vietnam Academy of Science and Technology