The presence of inhalation breaths in speech pauses has recently attracted more attention, especially since the focus of speech synthesis research has shifted to prosodic aspects beyond the single sentence, for instance in the synthesis of audiobooks. Inhalation breath pauses are usually not an issue in traditional speech synthesis corpora: these corpora typically consist of single sentences of limited length, so pauses containing inhalation breaths either rarely occur or are deliberately avoided during recording. However, readings of long coherent texts such as audiobooks often contain inhalation breaths, particularly in publicly available audiobooks. These inhalation breaths are relevant for the modelling of pauses in audiobook synthesis and can reduce naturalness when left unmodelled. Therefore, this paper presents a method to automatically classify pauses into one of four classes (silent pause, inhalation breath pause, noisy pause, no pause) for improved pause modelling in HMM-TTS.
Index Terms: inhalation breaths, pauses, speech synthesis, HMM-TTS, classification
Cite as: Braunschweiler, N., Chen, L. (2013) Automatic detection of inhalation breath pauses for improved pause modelling in HMM-TTS. Proc. 8th ISCA Workshop on Speech Synthesis (SSW 8), 1-6
@inproceedings{braunschweiler13_ssw,
  author={Norbert Braunschweiler and Langzhou Chen},
  title={{Automatic detection of inhalation breath pauses for improved pause modelling in HMM-TTS}},
  year=2013,
  booktitle={Proc. 8th ISCA Workshop on Speech Synthesis (SSW 8)},
  pages={1--6}
}