9th Annual Conference of the International Speech Communication Association

Brisbane, Australia
September 22-26, 2008

Human-Like Ears versus Two-Microphone Array, Which Works Better for Speaker Identification?

Waleed H. Abdulla, Yushi Zhang

University of Auckland, New Zealand

In this paper we try to answer, with justification, the question posed in the title. For this purpose we used a speech recording device, an acoustic artificial head that accurately imitates the human head, shoulders, and outer ears. It offers an excellent level of realism and clarity in audio recording. Special speech corpora were prepared under different noise conditions using the artificial head in a normal office environment and in an anechoic chamber. The corpora were then used to evaluate the performance of a text-independent speaker identification system using two types of features. Identical corpora recorded with a two-microphone array were also prepared and used to assess the performance of the same speaker identification system. The results show that recording with the artificial head improves the identification rate under different noise conditions. This confirms that the structure of the human ears and head plays a role in improving the human ability to recognize people by their voices.


Bibliographic reference. Abdulla, Waleed H. / Zhang, Yushi (2008): "Human-like ears versus two-microphone array, which works better for speaker identification?", in INTERSPEECH-2008, 1337-1340.