HMM-Based Vietnamese Speech Synthesis

Abstract:

This paper describes improvements to the naturalness of HMM-based speech synthesis for the Vietnamese language. In this synthesis method, trajectories of speech parameters are generated from trained hidden Markov models, and the final speech waveform is synthesized from those parameters. The main objective of the development is to achieve maximum naturalness in the output speech through three key points. First, the system uses a high-quality recorded Vietnamese speech database that is appropriate for training, especially in the statistical parametric modeling approach. Second, prosodic information such as tone, part of speech (POS), and features based on the characteristics of the Vietnamese language are added to ensure the quality of the synthetic speech. Third, the system uses STRAIGHT, which has shown its ability to produce high-quality voice manipulation and has been successfully incorporated into HMM-based speech synthesis. The results show that the speech produced by our system achieves the best results when compared with other Vietnamese TTS systems trained on the same speech data.

Authors: Son Trinh; Kiem Hoang

Keywords: Hidden Markov models, Speech, Speech synthesis, Context, Databases, Training, Context modeling

Journal: International Journal of Software Innovation (IJSI)

Indexing: ISSN: 2166-7160; ESCI, Scopus
