Wavelet-based adaptation of pitch contour to Lombard speech

Juraj Simko, Antti Suni, Martti Vainio


An increase in fundamental frequency (f0) is one of the most robust and best-studied phenomena characterizing Lombard speech. In this work, three types of global transformation of f0 contours from normal speech to the Lombard condition are investigated: (1) a linear re-scaling of the quiet-condition contour to match the mean and standard deviation of f0 in Lombard speech, (2) a non-linear regression relating the f0 values in the quiet condition to the corresponding f0 values in Lombard speech, and (3) a multiple non-linear regression using components obtained by a wavelet decomposition of the quiet-condition contours. The quality of the fits is evaluated on a phonetically controlled corpus of Finnish sentences with varying prosodic focus and ambient noise conditions. The results show that the non-linear regression yields a smaller root mean squared error than the simple re-scaling. Both methods are outperformed by the technique based on the continuous wavelet transform, which uses hierarchical information encoded in the speech signal. The findings are discussed in terms of their theoretical implications as well as their possible technological applications.
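As an illustration of transformation (1) and of the error measure used to compare the methods, the following is a minimal sketch and not the authors' implementation. The function names `rescale_f0` and `rmse`, the use of the population standard deviation, and the toy f0 values are all assumptions made for the example; the abstract does not specify the f0 scale (Hz or semitones) or how the contours are time-aligned, so the sketch simply assumes two contours of equal length sampled at corresponding points.

```python
import math
import statistics


def rescale_f0(quiet_f0, target_mean, target_std):
    """Linearly map a quiet-condition contour onto a target mean and standard deviation."""
    mu = statistics.fmean(quiet_f0)
    sigma = statistics.pstdev(quiet_f0)  # population SD; an assumption of this sketch
    if sigma == 0:
        # A flat contour carries no shape to scale, so only the level is shifted.
        return [float(target_mean) for _ in quiet_f0]
    return [target_mean + (v - mu) / sigma * target_std for v in quiet_f0]


def rmse(predicted, observed):
    """Root mean squared error between two equally long, aligned contours."""
    if len(predicted) != len(observed):
        raise ValueError("contours must have the same length")
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(statistics.fmean(squared))


# Hypothetical toy contours (not data from the paper), assumed to be time-aligned.
quiet = [100.0, 110.0, 120.0, 110.0, 100.0]
lombard = [130.0, 150.0, 170.0, 150.0, 130.0]

predicted = rescale_f0(quiet, statistics.fmean(lombard), statistics.pstdev(lombard))
error = rmse(predicted, lombard)
```

In this toy case the Lombard contour is an exact linear function of the quiet one, so the re-scaling reproduces it up to floating-point error. On real contours the relation is generally not linear, and the residual error is what the regression-based transformations (2) and (3) aim to reduce.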


DOI: 10.21437/SpeechProsody.2016-100

Cite as

Simko, J., Suni, A., Vainio, M. (2016) Wavelet-based adaptation of pitch contour to Lombard speech. Proc. Speech Prosody 2016, 489-493.

Bibtex
@inproceedings{Simko+2016,
author={Juraj Simko and Antti Suni and Martti Vainio},
title={Wavelet-based adaptation of pitch contour to Lombard speech},
year=2016,
booktitle={Speech Prosody 2016},
doi={10.21437/SpeechProsody.2016-100},
url={http://dx.doi.org/10.21437/SpeechProsody.2016-100},
pages={489--493}
}