Influences of Fundamental Oscillation on Speaker Identification in Vocalic Utterances by Humans and Computers

Volker Dellwo, Thayabaran Kathiresan, Elisa Pellegrino, Lei He, Sandra Schwab, Dieter Maurer


We tested the influence of fundamental oscillation (fo) on human and machine speaker recognition performance in vocalic test utterances. In experiment I, we trained a Gaussian mixture model on 15 speakers (80 multi-word utterances each) and tested it with sustained vowel utterances (/a:/, /i:/ and /u:/) under six fo conditions, three changing (fall, rise, fall-rise) and three steady-state (high, mid, low). Results revealed better performance for the steady-state than for the changing conditions; within the steady-state conditions, performance was poorest for high fo. In experiment II, we tested 9 human listeners on a subset of 4 speakers from experiment I. They went through two training tasks (training 1: multi-word utterances; training 2: words). In the test, they recognized speakers based on the same vocalic utterances as in experiment I (for these 4 speakers). Results showed that performance was about equally high for the changing and steady-state vowels; in the steady-state condition, however, performance was best for high fo vowels. The experiments suggest that (a) fo has an influence on the strength of speaker-specific characteristics in vowels and (b) humans, compared to machines, pay attention to different acoustic information in vocalic utterances for speaker recognition.
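
As a rough illustration of the kind of Gaussian mixture model speaker identification described for experiment I, the sketch below fits one small diagonal-covariance GMM per speaker with EM and assigns a test utterance to the speaker whose model gives the highest average log-likelihood. The abstract does not state the acoustic features, the number of mixture components, or the training procedure, so everything here is an assumption made for illustration: the two-dimensional synthetic "frames", the speaker labels `spk_A` and `spk_B`, and the helper names `fit_gmm`, `avg_loglik` and `identify` are hypothetical and are not taken from the paper.

```python
import math
import random

def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))

def logsumexp(values):
    """Numerically stable log(sum(exp(v))) over a list of floats."""
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))

def fit_gmm(frames, n_components=2, n_iter=20, seed=0, var_floor=1e-3):
    """Fit a diagonal GMM with EM; returns a list of (weight, mean, var)."""
    rng = random.Random(seed)
    n, dim = len(frames), len(frames[0])
    # Initialise means from random frames and variances from the global variance.
    gmean = [sum(f[d] for f in frames) / n for d in range(dim)]
    gvar = [max(sum((f[d] - gmean[d]) ** 2 for f in frames) / n, var_floor)
            for d in range(dim)]
    comps = [(1.0 / n_components, list(f), list(gvar))
             for f in rng.sample(frames, n_components)]
    for _ in range(n_iter):
        # E-step: posterior responsibility of each component for each frame.
        resp = []
        for x in frames:
            logs = [math.log(w) + log_gauss_diag(x, m, v) for w, m, v in comps]
            norm = logsumexp(logs)
            resp.append([math.exp(l - norm) for l in logs])
        # M-step: re-estimate weights, means and (floored) variances.
        new_comps = []
        for k in range(n_components):
            nk = sum(r[k] for r in resp) + 1e-10
            mean = [sum(r[k] * x[d] for r, x in zip(resp, frames)) / nk
                    for d in range(dim)]
            var = [max(sum(r[k] * (x[d] - mean[d]) ** 2
                           for r, x in zip(resp, frames)) / nk, var_floor)
                   for d in range(dim)]
            new_comps.append((max(nk / n, 1e-10), mean, var))
        comps = new_comps
    return comps

def avg_loglik(frames, comps):
    """Average per-frame log-likelihood of an utterance under one GMM."""
    return sum(logsumexp([math.log(w) + log_gauss_diag(x, m, v)
                          for w, m, v in comps]) for x in frames) / len(frames)

def identify(frames, models):
    """Return the speaker label whose GMM scores the utterance highest."""
    return max(models, key=lambda spk: avg_loglik(frames, models[spk]))

def synth_frames(center, n, rng):
    """Synthetic stand-in for acoustic feature frames (not data from the study)."""
    return [[rng.gauss(c, 1.0) for c in center] for _ in range(n)]

rng = random.Random(42)
centers = {"spk_A": (0.0, 0.0), "spk_B": (5.0, 5.0)}  # hypothetical speakers
models = {spk: fit_gmm(synth_frames(c, 200, rng)) for spk, c in centers.items()}
test_a = synth_frames(centers["spk_A"], 50, rng)
test_b = synth_frames(centers["spk_B"], 50, rng)
print(identify(test_a, models), identify(test_b, models))
```

In a setup like the one in the abstract, the training frames would come from the multi-word utterances and the test frames from the sustained vowels in each fo condition, so a mismatch between training and test material shows up as a lower likelihood under the correct speaker's model.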


 DOI: 10.21437/Interspeech.2018-2331

Cite as: Dellwo, V., Kathiresan, T., Pellegrino, E., He, L., Schwab, S., Maurer, D. (2018) Influences of Fundamental Oscillation on Speaker Identification in Vocalic Utterances by Humans and Computers. Proc. Interspeech 2018, 3795-3799, DOI: 10.21437/Interspeech.2018-2331.


@inproceedings{Dellwo2018,
  author={Volker Dellwo and Thayabaran Kathiresan and Elisa Pellegrino and Lei He and Sandra Schwab and Dieter Maurer},
  title={Influences of Fundamental Oscillation on Speaker Identification in Vocalic Utterances by Humans and Computers},
  year=2018,
  booktitle={Proc. Interspeech 2018},
  pages={3795--3799},
  doi={10.21437/Interspeech.2018-2331},
  url={http://dx.doi.org/10.21437/Interspeech.2018-2331}
}