EUROSPEECH 2003 - INTERSPEECH 2003
8th European Conference on Speech Communication and Technology

Geneva, Switzerland
September 1-4, 2003

Subjective Evaluations for Perception of Speaker Identity Through Acoustic Feature Transplantations

Oytun Turk (1), Levent M. Arslan (2)

(1) Sestek Inc., Turkey
(2) Bogazici University, Turkey

Perception of speaker identity is an important characteristic of the human auditory system. This paper describes a subjective test that investigates the relevance of four acoustic features to this process: vocal tract, pitch, duration, and energy. PSOLA-based methods provide the framework for transplanting these acoustic features between two speakers. The test database consists of different combinations of transplantation outputs obtained from a database of 8 speakers. Subjective decisions on speaker similarity indicate that the vocal tract is the most relevant feature in single-feature transplantations. Pitch and duration have similar significance, whereas energy is the least important acoustic feature. Vocal tract + pitch + duration transplantation results in the highest similarity to the target speaker. Vocal tract + pitch, vocal tract + duration + energy, and vocal tract + duration transplantations also yield convincing results in transforming the perceived speaker identity.

Turkish abstract, translated: Perception of speaker identity is one of the important characteristics of the human auditory system. This study examines, through a subjective experiment, the importance of four acoustic features in the perception of speaker identity: vocal tract, pitch, duration, and energy. The PSOLA-based methods developed here make it possible to transplant these features between speakers. The experiment used transplantation outputs obtained from speaker pairs in a database of eight speakers. The subjective results show that the vocal tract is the single most important feature in the perception of speaker identity. Vocal tract + pitch + duration transplantations produced the output most similar to the target speaker. Vocal tract + pitch and vocal tract + duration + energy transplantations also gave successful results in transforming speaker identity.


Bibliographic reference.  Turk, Oytun / Arslan, Levent M. (2003): "Subjective evaluations for perception of speaker identity through acoustic feature transplantations", In EUROSPEECH-2003, 2093-2096.