This paper describes a new approach to pitch marking. Unlike other approaches, which use the same combination of features for the whole signal, we take the signal properties into account and choose which features to apply according to a heuristic. We use the short-term energy as a novel, robust feature for placing the pitch marks. Where the energy information is not a suitable indicator, we resort to the fundamental wave computed from a contiguous F0 contour, in combination with detailed voicing information. Our experiments demonstrate that the proposed pitch marking algorithm considerably improves the quality of synthesised speech generated by a concatenative text-to-speech system that uses TD-PSOLA for prosodic modifications.
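As a rough illustration of the short-term energy idea only, the following minimal sketch computes an energy contour and takes its local maxima as candidate pitch marks. It is not the algorithm of the paper: the abstract does not specify the window, the heuristic, or the fallback to the fundamental wave, so the function names (`short_term_energy`, `energy_peak_marks`) and the parameters `win_len` and `min_distance` are assumptions made for this example.

```python
def short_term_energy(signal, win_len):
    """Sum of squared samples in a window of about win_len samples
    centred on each sample (the window is truncated at the signal edges)."""
    half = win_len // 2
    n = len(signal)
    energy = []
    for i in range(n):
        lo = max(0, i - half)
        hi = min(n, i + half + 1)
        energy.append(sum(x * x for x in signal[lo:hi]))
    return energy


def energy_peak_marks(signal, win_len, min_distance):
    """Candidate pitch marks: local maxima of the energy contour that are
    at least min_distance samples after the previously accepted mark.
    (Hypothetical helper; the selection rule is an assumption.)"""
    energy = short_term_energy(signal, win_len)
    marks = []
    for i in range(1, len(energy) - 1):
        is_peak = energy[i] > energy[i - 1] and energy[i] >= energy[i + 1]
        if is_peak and (not marks or i - marks[-1] >= min_distance):
            marks.append(i)
    return marks
```

On a synthetic train of short decaying pulses with a period of 80 samples, the candidate marks returned by this sketch are also 80 samples apart, although each mark sits slightly after the pulse onset because the window is centred on the energy of the whole pulse. Real speech would additionally need the voicing decision and the F0-based fallback that the abstract mentions, neither of which is modelled here.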
Bibliographic reference. Ewender, Thomas / Pfister, Beat (2010): "Accurate pitch marking for prosodic modification of speech segments", In INTERSPEECH-2010, 178-181.