Applying Bottle Neck Feature for Vietnamese speech recognition

Nguyễn Văn Huy; Lương Chi Mai; Vũ Tất Thắng

doi:10.15625/1813-9663/29/4/4345

Author affiliations

Authors

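Nguyễn Văn Huy, Faculty of Electronics, University of Technology (Trường ĐH Kỹ thuật Công nghiệp), Thái Nguyên, Vietnam
Lương Chi Mai, Institute of Information Technology, Vietnam Academy of Science (Viện Khoa học hàn lâm Việt Nam)
Vũ Tất Thắng, Institute of Information Technology, Vietnam Academy of Science (Viện Khoa học hàn lâm Việt Nam)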

DOI:

https://doi.org/10.15625/1813-9663/29/4/4345

Keywords:

BNF, bottle neck feature, Vietnamese speech recognition, HMM-GMM

Abstract

This paper presents the basic idea of the Bottle Neck Feature (BNF) and the procedure for extracting it. We apply BNF to Vietnamese speech recognition using a five-layer MLP network, varying the size of the first hidden layer. The input features used to extract BNF are Perceptual Linear Prediction (PLP) and Mel Frequency Cepstral Coefficients (MFCC). The experiments are carried out on a VOV (Voice of Vietnam) data set. The results show that using BNF for Vietnamese speech recognition improves the WER (Word Error Rate) by up to 6-7% compared with the baseline system, and that the MFCC feature gives better results than the PLP feature.
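As a rough illustration of the extraction step the abstract describes, the sketch below forwards acoustic frames through a five-layer MLP and reads the activations of the narrow middle layer as the BNF. It is a minimal sketch under stated assumptions, not the authors' implementation: the layer sizes, the sigmoid activation, the random (untrained) weights, and the names `init_mlp` and `extract_bnf` are all hypothetical. In practice the network would first be trained as a classifier before its bottleneck outputs are used as features.

```python
import numpy as np

def init_mlp(layer_sizes, seed=0):
    """Create random weights for an MLP (placeholder for trained weights)."""
    rng = np.random.default_rng(seed)
    return [
        (rng.standard_normal((n_in, n_out)) * 0.1, np.zeros(n_out))
        for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])
    ]

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def extract_bnf(frames, params, bottleneck_index=2):
    """Forward frames up to the bottleneck layer and return its activations.

    bottleneck_index counts layers from the input (layer 0), so index 2 is
    the middle layer of a five-layer network.
    """
    h = frames
    for weights, bias in params[:bottleneck_index]:
        h = sigmoid(h @ weights + bias)
    return h

# Assumed sizes: 39-dim input (e.g. MFCC or PLP with deltas), first hidden
# layer of 1000 units (the size the paper varies), 39-unit bottleneck,
# second wide hidden layer, and 120 output classes. All values hypothetical.
layer_sizes = [39, 1000, 39, 1000, 120]
params = init_mlp(layer_sizes)

frames = np.random.default_rng(1).standard_normal((10, 39))  # 10 dummy frames
bnf = extract_bnf(frames, params)
print(bnf.shape)
```

The resulting `bnf` matrix has one row per frame and one column per bottleneck unit, and would replace or augment the PLP or MFCC vectors as input to the HMM-GMM recognizer.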

Published

03-12-2013

How to Cite

[1]

N. V. Huy, L. C. Mai, and V. T. Thắng, “Applying Bottle Neck Feature for Vietnamese speech recognition”, J. Comput. Sci. Cybern., vol. 29, no. 4, pp. 379–388, Dec. 2013.

Issue

Vol. 29 No. 4 (2013)

Section

Computer Science

License

1. We hereby assign copyright of our article (the Work) in all forms of media, whether now known or hereafter developed, to the Journal of Computer Science and Cybernetics. We understand that the Journal of Computer Science and Cybernetics will act on my/our behalf to publish, reproduce, distribute and transmit the Work.
2. This assignment of copyright to the Journal of Computer Science and Cybernetics is done so on the understanding that permission from the Journal of Computer Science and Cybernetics is not required for me/us to reproduce, republish or distribute copies of the Work in whole or in part. We will ensure that all such copies carry a notice of copyright ownership and reference to the original journal publication.
3. We warrant that the Work is our results and has not been published before in its current or a substantially similar form and is not under consideration for another publication, does not contain any unlawful statements and does not infringe any existing copyright.
4. We also warrant that We have obtained the necessary permission from the copyright holder/s to reproduce in the article any materials including tables, diagrams or photographs not owned by me/us.
