The Use of Read versus Conversational Lombard Speech in Spectral Tilt Modeling for Intelligibility Enhancement in Near-End Noise Conditions

Emma Jokinen, Ulpu Remes, Paavo Alku


Intelligibility of speech in adverse near-end noise conditions can be enhanced with post-processing. Recently, a post-processing method based on statistical mapping of the spectral tilt of normal speech to that of Lombard speech was proposed. However, previous intelligibility improvement studies utilizing Lombard speech have mainly gathered data from read sentences, which might result in a less pronounced Lombard effect. Having a mild Lombard effect in the training data weakens the statistical normal-to-Lombard mapping of the spectral tilt, which in turn degrades the performance of intelligibility enhancement. Therefore, in this study, a database containing both conversational and read Lombard speech was recorded in several background noise conditions. Statistical models for normal-to-Lombard mapping of the spectral tilt were then trained using the obtained conversational and read speech data and evaluated using an objective intelligibility metric. The results suggest that the conversational data contains a more pronounced Lombard effect and could be used to obtain better statistical models for intelligibility enhancement.


DOI: 10.21437/Interspeech.2016-143

Cite as

Jokinen, E., Remes, U., Alku, P. (2016) The Use of Read versus Conversational Lombard Speech in Spectral Tilt Modeling for Intelligibility Enhancement in Near-End Noise Conditions. Proc. Interspeech 2016, 2771-2775.

BibTeX
@inproceedings{Jokinen+2016,
author={Emma Jokinen and Ulpu Remes and Paavo Alku},
title={The Use of Read versus Conversational Lombard Speech in Spectral Tilt Modeling for Intelligibility Enhancement in Near-End Noise Conditions},
year=2016,
booktitle={Interspeech 2016},
doi={10.21437/Interspeech.2016-143},
url={http://dx.doi.org/10.21437/Interspeech.2016-143},
pages={2771--2775}
}