The acoustic aspects that differentiate voices are difficult to separate from the signal traits that reflect the identity of the sounds. There are two sources of variation among speakers: (1) differences in the vocal cords and vocal tract shape, and (2) differences in speaking style. The latter includes variations in both the target vocal tract positions for phonemes and the dynamic aspects of speech, such as speaking rate. However, most parameters and features used for speaker recognition capture only the former. In this paper, we propose a prosodic feature that represents the micro-prosody of utterances, and we show that this feature is robust in noisy environments. We also propose a combined model that uses both the spectral feature and the prosodic feature. In our experiments, this model provides robust speaker recognition in noisy environments.
Cite as: Kyung, Y.-J., Lee, H.-S. (1998) Text independent speaker recognition using micro-prosody. Proc. 5th International Conference on Spoken Language Processing (ICSLP 1998), paper 0407, doi: 10.21437/ICSLP.1998-219
@inproceedings{kyung98_icslp,
  author={Youn-Jeong Kyung and Hwang-Soo Lee},
  title={{Text independent speaker recognition using micro-prosody}},
  year=1998,
  booktitle={Proc. 5th International Conference on Spoken Language Processing (ICSLP 1998)},
  pages={paper 0407},
  doi={10.21437/ICSLP.1998-219}
}