Quality improvement of Vietnamese HMM-based speech synthesis system based on decomposition of naturalness and intelligibility using non-negative matrix factorization

Abstract

Hidden Markov model (HMM)-based synthesized speech is intelligible but not natural, especially under limited-data conditions. The goal of this study is to improve naturalness while preserving acceptable intelligibility by decomposing the naturalness and intelligibility of synthesized speech using a novel asymmetric bilinear model involving non-negative matrix factorization (NMF). Subjective evaluations carried out on Vietnamese data confirmed that the achieved synthesis quality is higher than that of other methods under limited-data conditions. Because the F0 contour is important for naturalness and intelligibility, especially in Vietnamese, the proposed method is also capable of modifying the over-smoothed F0 contour without destroying tonal information.
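
For readers unfamiliar with NMF, the following is a minimal sketch of plain NMF using the standard multiplicative updates for the Frobenius-norm objective. It is not the asymmetric bilinear model proposed in the paper, which the abstract does not specify. The function name `nmf`, the parameters `rank`, `n_iter`, and `eps`, and the toy matrix `V` are all assumptions made for illustration.

```python
import numpy as np

def nmf(V, rank, n_iter=200, eps=1e-9, seed=0):
    """Factor a non-negative matrix V (n x m) as W @ H with W, H >= 0.

    Generic multiplicative-update NMF (hypothetical helper, not the
    paper's asymmetric bilinear model).
    """
    rng = np.random.default_rng(seed)
    n, m = V.shape
    # Strictly positive random initialisation keeps the updates well defined.
    W = rng.random((n, rank)) + eps
    H = rng.random((rank, m)) + eps
    for _ in range(n_iter):
        # Each update rescales entries by a non-negative ratio,
        # so W and H stay non-negative throughout.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def relative_error(V, W, H):
    """Frobenius-norm reconstruction error relative to the norm of V."""
    return np.linalg.norm(V - W @ H) / np.linalg.norm(V)

# Toy data (an assumption): a non-negative matrix of exact rank 3.
rng = np.random.default_rng(1)
V = rng.random((6, 3)) @ rng.random((3, 8))

W0, H0 = nmf(V, rank=3, n_iter=0)   # initial guess only
W, H = nmf(V, rank=3, n_iter=200)   # same seed, after the updates
err_before = relative_error(V, W0, H0)
err_after = relative_error(V, W, H)
```

In a speech setting the columns of `V` would typically be non-negative feature vectors (for example, magnitude spectra), but how the paper arranges its data and assigns factors to naturalness and intelligibility is not stated in the abstract.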

Cite

Dinh, A. T., Phan, T. S., & Akagi, M. (2017). Quality improvement of Vietnamese HMM-based speech synthesis system based on decomposition of naturalness and intelligibility using non-negative matrix factorization. In Advances in Intelligent Systems and Computing (Vol. 538 AISC, pp. 490–499). Springer Verlag. https://doi.org/10.1007/978-3-319-49073-1_53
