ISCA Archive Interspeech 2017

Using Voice Quality Features to Improve Short-Utterance, Text-Independent Speaker Verification Systems

Soo Jin Park, Gary Yeung, Jody Kreiman, Patricia A. Keating, Abeer Alwan

Due to within-speaker variability in phonetic content and/or speaking style, the performance of automatic speaker verification (ASV) systems degrades, especially when the enrollment and test utterances are short. This study examines how different types of variability influence the performance of ASV systems. Speech samples (< 2 sec) from the UCLA Speaker Variability Database, containing 5 different read sentences by 200 speakers, were used to study content variability. Other samples (about 5 sec) that contained speech directed towards pets, characterized by exaggerated prosody, were used to analyze style variability. Using the i-vector/PLDA framework, the ASV system error rate with MFCCs showed relative increases of at least 265% and 730% in content-mismatched and style-mismatched trials, respectively. A set of features that represents voice quality (F0, F1, F2, F3, H1-H2, H2-H4, H4-H2k, A1, A2, A3, and CPP) was also used. When these features were combined with MFCCs through score fusion, error rates decreased in all conditions. In addition, on the NIST SRE10 database, score fusion provided relative improvements of 11.78% for 5-second utterances, 12.41% for 10-second utterances, and a small improvement for long utterances (about 5 min). These results suggest that voice quality features can improve short-utterance, text-independent ASV system performance.
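
The abstract does not spell out how the score fusion or the relative error-rate figures were computed, so the following is only a minimal sketch of one common approach, not the authors' method. It assumes a weighted sum of z-normalized per-trial scores from two systems and the usual definition of relative change. The function names, the `weight` parameter, and all numbers are hypothetical.

```python
from statistics import mean, pstdev


def z_normalize(scores):
    """Shift and scale a list of scores to zero mean and unit variance."""
    mu = mean(scores)
    sigma = pstdev(scores)
    if sigma == 0:
        # All scores identical: nothing to scale, return zeros.
        return [0.0 for _ in scores]
    return [(s - mu) / sigma for s in scores]


def fuse_scores(mfcc_scores, vq_scores, weight=0.5):
    """Weighted-sum fusion of two per-trial score lists.

    `weight` is applied to the MFCC system and (1 - weight) to the
    voice quality system. This fusion rule is an assumption made for
    illustration; the abstract does not state which rule was used.
    """
    if len(mfcc_scores) != len(vq_scores):
        raise ValueError("both systems must score the same trials")
    a = z_normalize(mfcc_scores)
    b = z_normalize(vq_scores)
    return [weight * x + (1.0 - weight) * y for x, y in zip(a, b)]


def relative_change_percent(baseline, new):
    """Relative change of an error rate, in percent of the baseline.

    Positive values mean the error rate went up; negative values mean
    it went down (an improvement).
    """
    return 100.0 * (new - baseline) / baseline


# Hypothetical scores for three trials (not data from the paper).
fused = fuse_scores([1.2, -0.4, 0.9], [0.3, -1.1, 0.8], weight=0.7)

# Hypothetical error rates: 10.0% baseline, 8.8% after fusion.
change = relative_change_percent(10.0, 8.8)
```

With these made-up error rates, `change` is negative, which corresponds to a relative improvement of about 12%. In practice the fusion weight would be tuned on held-out trials rather than fixed in advance.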


doi: 10.21437/Interspeech.2017-157

Cite as: Park, S.J., Yeung, G., Kreiman, J., Keating, P.A., Alwan, A. (2017) Using Voice Quality Features to Improve Short-Utterance, Text-Independent Speaker Verification Systems. Proc. Interspeech 2017, 1522-1526, doi: 10.21437/Interspeech.2017-157

@inproceedings{park17b_interspeech,
  author={Soo Jin Park and Gary Yeung and Jody Kreiman and Patricia A. Keating and Abeer Alwan},
  title={{Using Voice Quality Features to Improve Short-Utterance, Text-Independent Speaker Verification Systems}},
  year=2017,
  booktitle={Proc. Interspeech 2017},
  pages={1522--1526},
  doi={10.21437/Interspeech.2017-157}
}