15th Annual Conference of the International Speech Communication Association

September 14-18, 2014

Vocal Tract Length Estimation Based on Vowels Using a Database Consisting of 385 Speakers and a Database with MRI-Based Vocal Tract Shape Information

Hideki Kawahara (1), Tatsuya Kitamura (2), Hironori Takemoto (3), Ryuichi Nisimura (1), Toshio Irino (1)

(1) Wakayama University, Japan
(2) Konan University, Japan
(3) NICT, Japan

A highly reproducible method for estimating vocal tract length (VTL) and a text-independent VTL estimation method are proposed. Both are based on a Japanese vowel database spoken by 385 male and female speakers aged 6 to 56 and on a second vowel database with MRI-based vocal tract shape information. The proposed methods rely on an interference-free power spectral representation and systematic suppression of biasing factors. The MRI data are used to calibrate the VTL estimates so that they are expressed in physically meaningful units. The databases are normalized using the estimated VTL information to provide a reference template, which is used to implement the text-independent VTL estimation method. A prototype system for text-independent VTL estimation is implemented in MATLAB and runs faster than real time on a PC.
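The idea of matching a speaker's spectrum to a reference template can be illustrated with a toy sketch. It is not the authors' method: the paper's spectral representation and bias suppression are not reproduced here. The sketch assumes that a longer vocal tract compresses the spectrum along the frequency axis by the VTL ratio. It also assumes a reference VTL in centimetres (standing in for an MRI-derived calibration value), a uniform-tube toy spectrum, and a simple grid search. All function names, the speed of sound, and the 17.5 cm reference are hypothetical choices made for illustration.

```python
import math

SPEED_OF_SOUND_CM_S = 35000.0  # assumed round value for illustration


def tube_resonances(vtl_cm, count=3):
    """Resonances of a uniform tube closed at one end: F_n = (2n - 1) c / (4 L)."""
    return [(2 * n - 1) * SPEED_OF_SOUND_CM_S / (4.0 * vtl_cm)
            for n in range(1, count + 1)]


def toy_spectrum(freqs_hz, vtl_cm):
    """Toy smooth spectrum: one bump per resonance, width proportional to the peak."""
    peaks = tube_resonances(vtl_cm)
    return [sum(math.exp(-0.5 * ((f - p) / (0.15 * p)) ** 2) for p in peaks)
            for f in freqs_hz]


def interp_uniform(x, xs, ys):
    """Linear interpolation on a uniformly spaced grid, clamped at both ends."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    step = xs[1] - xs[0]
    i = min(int((x - xs[0]) / step), len(xs) - 2)
    t = (x - xs[i]) / step
    return ys[i] * (1.0 - t) + ys[i + 1] * t


def estimate_vtl_ratio(freqs_hz, target, template, candidates):
    """Grid search for the ratio a such that target(f) best matches template(a * f)."""
    best_ratio, best_err = None, float("inf")
    for a in candidates:
        err = sum((t - interp_uniform(a * f, freqs_hz, template)) ** 2
                  for f, t in zip(freqs_hz, target))
        if err < best_err:
            best_ratio, best_err = a, err
    return best_ratio


def calibrate_to_cm(ratio, reference_vtl_cm):
    """Convert a relative VTL ratio to centimetres using a known reference length."""
    return ratio * reference_vtl_cm


if __name__ == "__main__":
    freqs = [10.0 * k for k in range(401)]            # 0 to 4000 Hz
    reference_vtl_cm = 17.5                           # hypothetical calibration value
    template = toy_spectrum(freqs, reference_vtl_cm)
    target = toy_spectrum(freqs, 1.1 * reference_vtl_cm)  # speaker with a 10% longer tract
    candidates = [0.70 + 0.01 * k for k in range(61)]
    ratio = estimate_vtl_ratio(freqs, target, template, candidates)
    print("estimated ratio:", ratio)
    print("estimated VTL (cm):", calibrate_to_cm(ratio, reference_vtl_cm))
```

Because the template is compared against a whole spectrum rather than labelled vowel formants, a search of this kind does not need to know which vowel was spoken, which is the sense in which a template-based estimate can be text independent.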

Bibliographic reference.  Kawahara, Hideki / Kitamura, Tatsuya / Takemoto, Hironori / Nisimura, Ryuichi / Irino, Toshio (2014): "Vocal tract length estimation based on vowels using a database consisting of 385 speakers and a database with MRI-based vocal tract shape information", In INTERSPEECH-2014, 870-874.