14th Annual Conference of the International Speech Communication Association

Lyon, France
August 25-29, 2013

Estimation of Multiple-Branch Vocal Tract Models: The Influence of Prior Assumptions

Christian H. Kasess, Wolfgang Kreuzer

Austrian Academy of Sciences, Austria

Branched-tube models can be used for modeling nasal speech such as nasal stops and nasalized vowels. Previously, it has been shown that the use of probabilistic prior information such as smoothness priors can reduce the within-speaker variability of the vocal tract estimates. The model used there, however, lacked a representation of the paranasal cavities, and thus a model with a more complex branching structure is desirable. This raises the question of what prior information is necessary for physically plausible parameter estimates. Here, a model with one maxillary sinus is estimated. The sinus is parameterized in terms of its resonance, using radius and angle in the z-plane, and its coupling area ratio. The probabilistic scheme mentioned above is used to estimate the model for the nasal stops /m/ and /n/ extracted from the TIMIT database. Different prior assumptions concerning resonance frequency, bandwidth, and coupling of the sinus to the nasal cavity are tested. Results show, on average, a better model fit for the model including the sinus. Further, prior assumptions are shown to have a large influence on the estimated resonance of the sinus. In particular, the lack of anatomically motivated assumptions about the bandwidth and/or the resonance frequency yields unrealistic estimates of these values.
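To illustrate how a z-plane radius and angle relate to the resonance frequency and bandwidth on which priors are placed, the following minimal sketch uses the standard relations for a complex-conjugate pole pair. It is not taken from the paper: the function names are hypothetical, the paper's exact parameterization may differ, and the default sampling rate of 16 kHz is assumed because that is the rate of the TIMIT recordings.

```python
import math

# Assumption: a sinus resonance is represented by a complex-conjugate pole
# pair at radius * exp(+/- j * angle) in the z-plane. The function names
# below are hypothetical and chosen only for this illustration.

def pole_to_resonance(radius, angle, fs=16000.0):
    """Map a z-plane pole (radius, angle in radians) to (frequency, bandwidth) in Hz."""
    if not 0.0 < radius < 1.0:
        raise ValueError("radius must lie strictly inside the unit circle")
    freq_hz = angle * fs / (2.0 * math.pi)
    # Standard 3 dB bandwidth approximation for a pole close to the unit circle.
    bandwidth_hz = -math.log(radius) * fs / math.pi
    return freq_hz, bandwidth_hz

def resonance_to_pole(freq_hz, bandwidth_hz, fs=16000.0):
    """Inverse mapping: (frequency, bandwidth) in Hz to (radius, angle in radians)."""
    radius = math.exp(-math.pi * bandwidth_hz / fs)
    angle = 2.0 * math.pi * freq_hz / fs
    return radius, angle

if __name__ == "__main__":
    # Illustrative values only, not estimates reported in the paper.
    r, theta = resonance_to_pole(500.0, 100.0)
    print(r, theta)
    print(pole_to_resonance(r, theta))
```

Under this mapping, a prior on bandwidth constrains the radius and a prior on resonance frequency constrains the angle, which is one way to see why leaving either unconstrained can let the estimate drift to anatomically implausible values.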

Bibliographic reference. Kasess, Christian H. / Kreuzer, Wolfgang (2013): "Estimation of multiple-branch vocal tract models: the influence of prior assumptions", in INTERSPEECH-2013, 1663-1667.