ISCA Archive SpeechProsody 2010

A novel feature extraction for neural–based modes in acoustic-articulatory inversion mapping

Hossein Behbood, Seyyed Ali SeyyedSalehi, Hamid Reza Tohidypour

Acoustic-articulatory inversion mapping is the process of converting an acoustic speech signal into articulatory features. Most research has focused on finding the best model for this mapping, with less attention paid to finding appropriate representations of the articulatory and acoustic signals. This paper proposes two feature extraction methods, the Logarithm of square Hanning Critical Bank filterbank (LHCB) and the Discrete Wavelet Transform (DWT), which perform better than conventional feature extraction based on Mel-Frequency Cepstral Coefficients (MFCC). A standard feed-forward neural network is used for the inversion mapping, and a Time Delay Neural Network (TDNN) is applied for phone recognition. The results demonstrate the efficiency of the two new feature extraction methods.
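
As a rough illustration of what DWT-based feature extraction for a single speech frame can look like, the sketch below computes log sub-band energies from a multi-level wavelet decomposition. It is not the paper's method: the abstract does not specify the wavelet family, the number of decomposition levels, or how coefficients are turned into features, so the Haar wavelet, the four levels, the log-energy summary, the 256-sample frame, and the function names `haar_dwt_step` and `dwt_log_energy_features` are all assumptions made for illustration.

```python
import numpy as np

def haar_dwt_step(x):
    """One level of the orthonormal Haar DWT: returns (approximation, detail).

    The Haar wavelet is an assumption; the abstract does not name a wavelet.
    """
    x = np.asarray(x, dtype=float)
    if len(x) % 2:
        x = np.append(x, x[-1])  # pad odd-length input by repeating the last sample
    even, odd = x[0::2], x[1::2]
    approx = (even + odd) / np.sqrt(2.0)  # low-pass branch, downsampled by 2
    detail = (even - odd) / np.sqrt(2.0)  # high-pass branch, downsampled by 2
    return approx, detail

def dwt_log_energy_features(frame, levels=4, eps=1e-10):
    """Log energy per sub-band, ordered [detail_1, ..., detail_levels, approx_levels].

    Using log sub-band energy as the feature is an illustrative choice,
    not something stated in the abstract.
    """
    approx = np.asarray(frame, dtype=float)
    feats = []
    for _ in range(levels):
        approx, detail = haar_dwt_step(approx)
        feats.append(np.log(np.sum(detail ** 2) + eps))
    feats.append(np.log(np.sum(approx ** 2) + eps))
    return np.array(feats)

# Hypothetical frame: 256 samples of synthetic noise standing in for speech.
rng = np.random.default_rng(0)
frame = rng.standard_normal(256)
features = dwt_log_energy_features(frame, levels=4)
print(features.shape)  # (5,)
```

A vector like `features`, computed frame by frame, could then serve as the input to a feed-forward network that regresses articulatory trajectories, in place of an MFCC vector.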

Index Terms: Discrete Wavelet Transform, Time Delay Neural Networks (TDNNs), MOCHA-TIMIT database, Acoustic-Articulatory Inversion Mapping, Logarithm of square Hanning Critical Bank filterbank (LHCB), Mel Frequency Cepstral Coefficients (MFCC)


Cite as: Behbood, H., SeyyedSalehi, S.A., Tohidypour, H.R. (2010) A novel feature extraction for neural–based modes in acoustic-articulatory inversion mapping. Proc. Speech Prosody 2010, paper 582

@inproceedings{behbood10b_speechprosody,
  author={Hossein Behbood and Seyyed Ali SeyyedSalehi and Hamid Reza Tohidypour},
  title={{A novel feature extraction for neural–based modes in acoustic-articulatory inversion mapping}},
  year=2010,
  booktitle={Proc. Speech Prosody 2010},
  pages={paper 582}
}