IEICE Trans - A Speech Parameter Generation Algorithm Considering Global Variance for HMM-Based Speech Synthesis

Tomoki TODA
Keiichi TOKUDA

Publication
IEICE TRANSACTIONS on Information and Systems Vol.E90-D No.5 pp.816-824
Publication Date: 2007/05/01
Online ISSN: 1745-1361
DOI: 10.1093/ietisy/e90-d.5.816
Print ISSN: 0916-8532
Type of Manuscript: PAPER
Category: Speech and Hearing
Keyword:
HMM-based speech synthesis, speech parameter generation, maximum likelihood criterion, over-smoothing effect, global variance

Summary:
This paper describes a novel parameter generation algorithm for HMM-based speech synthesis. The conventional algorithm generates a trajectory of static features that maximizes the likelihood of a given HMM for the parameter sequence consisting of static and dynamic features, under an explicit constraint between those two kinds of features. The generated trajectory is often excessively smoothed due to the statistical processing, and speech synthesized from such over-smoothed parameters usually sounds muffled. To alleviate the over-smoothing effect, we propose a generation algorithm that considers not only the HMM likelihood maximized in the conventional algorithm but also a likelihood for the global variance (GV) of the generated trajectory. The latter likelihood works as a penalty for over-smoothing, i.e., for a reduction of the GV of the generated trajectory. The result of a perceptual evaluation demonstrates that the proposed algorithm yields considerably large improvements in the naturalness of synthetic speech.
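
The idea in the summary can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes a one-dimensional feature, diagonal covariances, a simple two-point delta window, Gaussian log-likelihoods with constant terms dropped, and plain gradient ascent with step-size backtracking in place of whatever optimizer the paper uses. All function names, the weight `omega`, and the toy numbers are hypothetical choices made for illustration.

```python
import numpy as np

def delta_window_matrix(T):
    """Build a (2T, T) matrix W so that W @ c interleaves static and delta features."""
    W = np.zeros((2 * T, T))
    for t in range(T):
        W[2 * t, t] = 1.0                          # static feature c[t]
        W[2 * t + 1, max(t - 1, 0)] -= 0.5         # delta: 0.5 * (c[t+1] - c[t-1]),
        W[2 * t + 1, min(t + 1, T - 1)] += 0.5     # clamped at the sequence edges
    return W

def ml_trajectory(W, mu, prec):
    """Conventional solution: maximize the Gaussian likelihood of W @ c (diagonal precision)."""
    A = W.T @ (prec[:, None] * W)
    b = W.T @ (prec * mu)
    return np.linalg.solve(A, b)

def global_variance(c):
    """Variance of the static trajectory over the whole utterance."""
    return np.mean((c - c.mean()) ** 2)

def objective(c, W, mu, prec, gv_mean, gv_prec, omega):
    """omega * HMM log-likelihood + GV log-likelihood (constants dropped)."""
    r = W @ c - mu
    hmm_ll = -0.5 * np.sum(prec * r * r)
    gv_ll = -0.5 * gv_prec * (global_variance(c) - gv_mean) ** 2
    return omega * hmm_ll + gv_ll

def gv_trajectory(W, mu, prec, gv_mean, gv_prec, omega, steps=200):
    """Gradient ascent on the combined objective, starting from the ML trajectory."""
    c = ml_trajectory(W, mu, prec)
    T = c.size
    best = objective(c, W, mu, prec, gv_mean, gv_prec, omega)
    lr = 1.0
    for _ in range(steps):
        r = W @ c - mu
        grad_hmm = -(W.T @ (prec * r))
        v = global_variance(c)
        grad_gv = -gv_prec * (v - gv_mean) * (2.0 / T) * (c - c.mean())
        cand = c + lr * (omega * grad_hmm + grad_gv)
        val = objective(cand, W, mu, prec, gv_mean, gv_prec, omega)
        if val > best:          # accept only steps that improve the objective
            c, best = cand, val
            lr *= 1.2
        else:                   # otherwise shrink the step size
            lr *= 0.5
    return c

# Toy example (all values are made up): a step-shaped sequence of static means.
T = 20
W = delta_window_matrix(T)
mu = np.zeros(2 * T)
mu[0::2] = np.r_[np.zeros(T // 2), np.ones(T // 2)]   # static means; delta means stay 0
prec = np.empty(2 * T)
prec[0::2], prec[1::2] = 1.0, 4.0                     # precisions of static / delta
gv_mean, gv_prec = 0.3, 100.0                         # hypothetical GV statistics
omega = 1.0 / (2 * T)                                 # hypothetical likelihood weight

c_ml = ml_trajectory(W, mu, prec)
c_gv = gv_trajectory(W, mu, prec, gv_mean, gv_prec, omega)
print("GV of ML trajectory:", global_variance(c_ml))
print("GV with GV term    :", global_variance(c_gv))
```

In this toy setup the maximum-likelihood trajectory is a smoothed version of the step, so its variance is smaller than that of the underlying means. Adding the GV term pulls the variance of the generated trajectory back toward `gv_mean`, at the cost of a slightly lower HMM likelihood.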
