Robust Speaker Recognition with Combined Use of Acoustic and Throat Microphone Speech

Md. Sahidullah, Rosa Gonzalez Hautamäki, Dennis Alexander Lehmann Thomsen, Tomi Kinnunen, Zheng-Hua Tan, Ville Hautamäki, Robert Parts, Martti Pitkänen


The accuracy of automatic speaker verification (ASV) systems degrades severely in the presence of background noise. In this paper, we study the use of additional side information provided by a body-conducted sensor, the throat microphone. The throat microphone signal is much less affected by background noise than the acoustic microphone signal, which makes throat microphones potentially useful for feature extraction or speech activity detection. First, we propose a new prototype system for simultaneous acquisition of acoustic and throat microphone signals. Second, we study the use of this additional information for speech activity detection, feature extraction, and fusion of the acoustic and throat microphone signals. We collect a pilot database of 38 subjects that includes both clean and noisy sessions. We carry out speaker verification experiments using a Gaussian mixture model with universal background model (GMM-UBM) system and an i-vector based system. We achieve considerable improvement in recognition accuracy, even in highly degraded conditions.
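
As an illustration of how a throat microphone signal might drive speech activity detection for the acoustic channel, the following is a minimal sketch only. The abstract does not specify the detector used, so the energy threshold rule, the function names, and the default frame length, hop, and margin are all assumptions made for this example. It also assumes the two channels are time-aligned and that acoustic features are computed with the same framing.

```python
import math


def frame_log_energy(signal, frame_len, hop):
    """Return the log energy (in dB) of each full frame of a sample sequence."""
    energies = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        power = sum(x * x for x in frame) / frame_len
        # Small floor avoids log10(0) on silent frames.
        energies.append(10.0 * math.log10(power + 1e-12))
    return energies


def throat_sad_mask(throat, frame_len=400, hop=160, margin_db=30.0):
    """Label a frame as speech when its throat-channel energy lies within
    margin_db of the loudest frame (an assumed, simple decision rule)."""
    energies = frame_log_energy(throat, frame_len, hop)
    if not energies:
        return []
    threshold = max(energies) - margin_db
    return [e > threshold for e in energies]


def select_speech_frames(acoustic_features, mask):
    """Keep only the acoustic feature frames that the throat channel marks as speech."""
    return [f for f, keep in zip(acoustic_features, mask) if keep]
```

The mask is computed on the throat channel because, as the abstract notes, that signal is much less affected by background noise. The retained frames come from the acoustic channel, whose features would then be passed to the speaker verification back end.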


DOI: 10.21437/Interspeech.2016-1153

Cite as

Sahidullah, M., Hautamäki, R.G., Thomsen, D.A.L., Kinnunen, T., Tan, Z., Hautamäki, V., Parts, R., Pitkänen, M. (2016) Robust Speaker Recognition with Combined Use of Acoustic and Throat Microphone Speech. Proc. Interspeech 2016, 1720-1724.

Bibtex
@inproceedings{Sahidullah+2016,
author={Md. Sahidullah and Rosa Gonzalez Hautamäki and Dennis Alexander Lehmann Thomsen and Tomi Kinnunen and Zheng-Hua Tan and Ville Hautamäki and Robert Parts and Martti Pitkänen},
title={Robust Speaker Recognition with Combined Use of Acoustic and Throat Microphone Speech},
year=2016,
booktitle={Interspeech 2016},
doi={10.21437/Interspeech.2016-1153},
url={http://dx.doi.org/10.21437/Interspeech.2016-1153},
pages={1720--1724}
}