For nasal stops and nasalized vowels, single-tube models provide an inadequate representation. To model the spectral components of nasal speech signals, a minimum of two connected tubes is necessary. Typically, the estimation of branched-tube area functions is based on a pole-zero model. The present paper introduces a variational Bayesian scheme under Gaussian assumptions to estimate the tube areas directly from the log-spectrum of the speech signal. Probabilistic priors are used to enforce smoothness of the tubes. The method is tested on recorded tokens of /m/ from several speakers using different prior variances. The results show that mild smoothness assumptions perform best in terms of model error and marginal likelihood. Furthermore, while yielding comparable fits, the reflection coefficients estimated by the Bayesian scheme show less intra-subject variability between tokens than those from an unregularized non-linear solver.
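As a rough illustration of two quantities named in the abstract, the sketch below computes junction reflection coefficients from a list of tube-section areas and evaluates an unnormalized Gaussian smoothness log-prior. It is not the authors' implementation: the function names, the sign convention for the reflection coefficients, and the choice to penalize first differences of log-areas are all assumptions made here for illustration only.

```python
import numpy as np

def reflection_coefficients(areas):
    """Reflection coefficient at each junction between adjacent sections.

    Assumed convention: r_k = (A_{k+1} - A_k) / (A_{k+1} + A_k).
    Other sign conventions are in common use.
    """
    a = np.asarray(areas, dtype=float)
    return (a[1:] - a[:-1]) / (a[1:] + a[:-1])

def smoothness_log_prior(areas, prior_var):
    """Unnormalized zero-mean Gaussian log-prior on first differences
    of log-areas (an assumed form of smoothness prior).

    A smaller prior_var penalizes area changes between neighbouring
    sections more strongly, i.e. enforces a smoother tube.
    """
    d = np.diff(np.log(np.asarray(areas, dtype=float)))
    return -0.5 * np.sum(d ** 2) / prior_var

# Hypothetical area values, for demonstration only.
areas = [1.0, 2.0, 1.5, 3.0]
r = reflection_coefficients(areas)
lp_mild = smoothness_log_prior(areas, prior_var=1.0)
lp_strict = smoothness_log_prior(areas, prior_var=0.1)
```

With this form of prior, a uniform tube has zero penalty, and reducing `prior_var` makes the same non-uniform tube less probable a priori, which is the sense in which the prior variance controls the strength of the smoothness assumption.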
Index Terms: vocal tract, estimation, nasal stops, Bayesian statistics
Bibliographic reference. Kasess, Christian H. / Kreuzer, Wolfgang / Enzinger, Ewald / Kerschhofer-Puhalo, Nadja (2012): "Estimation of the vocal tract shape of nasals using a Bayesian scheme", In INTERSPEECH-2012, 699-702.