Odyssey 2010: The Speaker and Language Recognition Workshop

Brno, Czech Republic
28 June – 1 July 2010

Constrained Subword Units for Speaker Recognition

Doris Baum, Daniel Schneider (1), Timo Mertens (2), Joachim Kohler (1)

(1) Fraunhofer IAIS, (2) Norwegian University of Science and Technology

Phonetic features have been proposed to overcome performance degradation in spectral speaker recognition in difficult acoustic conditions. The harmful effect of those conditions, however, is not restricted to spectral systems but also affects the performance of the open-loop phone recognisers on which phonetic systems are based. In automatic speech recognition, larger subword units and the use of additional constraints from language models have been employed to improve robustness against adverse acoustic conditions. This paper evaluates the performance of more constrained phone recognition and different subword units for speaker recognition on heterogeneous broadcast data from German parliamentary speeches. Using phone clusters and a strong language model instead of phones obtained from unconstrained recognition improves the equal error rate from 14.3% to 8.6% on the given data.
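The abstract reports results as an equal error rate (EER), the operating point at which the false acceptance rate and the false rejection rate coincide. As a rough illustration of how such a figure can be computed from trial scores, here is a minimal sketch. The function name `equal_error_rate`, the threshold sweep, and the example scores are assumptions made for illustration; they are not taken from the paper, whose scoring procedure is not described in the abstract.

```python
def equal_error_rate(target_scores, impostor_scores):
    """Approximate the EER by sweeping a threshold over all observed scores.

    A trial is accepted when its score is >= the threshold.
    Illustrative sketch only; not the paper's evaluation code.
    """
    if not target_scores or not impostor_scores:
        raise ValueError("both score lists must be non-empty")

    best_gap = None
    best_eer = None
    for threshold in sorted(set(target_scores) | set(impostor_scores)):
        # False acceptance: impostor trials scoring at or above the threshold.
        far = sum(s >= threshold for s in impostor_scores) / len(impostor_scores)
        # False rejection: target trials scoring below the threshold.
        frr = sum(s < threshold for s in target_scores) / len(target_scores)
        gap = abs(far - frr)
        # Keep the threshold where the two error rates are closest.
        if best_gap is None or gap < best_gap:
            best_gap = gap
            best_eer = (far + frr) / 2
    return best_eer


# Hypothetical scores, invented purely for this example.
targets = [0.9, 0.8, 0.7, 0.3]
impostors = [0.6, 0.4, 0.2, 0.1]
eer = equal_error_rate(targets, impostors)
print(f"EER = {eer:.1%}")
```

With a finite number of trials the two error rates rarely cross exactly, so this sketch reports the midpoint at the threshold where they are closest; evaluation toolkits often interpolate instead.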


Bibliographic reference. Baum, Doris / Schneider, Daniel / Mertens, Timo / Kohler, Joachim (2010): "Constrained Subword Units for Speaker Recognition", in Odyssey-2010, paper 002.