Relating Articulatory Motions in Different Speaking Rates

Astha Singh, G. Nisha Meenakshi, Prasanta Kumar Ghosh

The movements of articulators (e.g., tongue, lips, and jaw) at different speaking rates are related in a complex manner. In this work, we examine the underlying function that transforms the articulatory movements involved in producing speech at a neutral speaking rate into those at fast and slow speaking rates (N2F and N2S, respectively). For this, we use articulatory movement data collected from five subjects with an electromagnetic articulograph at neutral, fast, and slow speaking rates. As candidate transformation functions (TFs), we consider an affine transformation with a diagonal matrix, an affine transformation with a full matrix, and a nonlinear function modeled by a deep neural network (DNN). Since the duration of an utterance typically differs across speaking rates, the articulatory movement trajectories must be time-aligned, and the alignment in turn affects the TF learnt. We therefore propose an iterative algorithm that alternately optimizes the TF and the time alignment. Subject-specific experiments reveal that while the N2F transformation is well described by an affine transformation with a full matrix, the N2S transformation is better represented by a more complex nonlinear function modeled by a DNN. This could be because subjects exhibit gross articulatory movements during fast speech and hyper-articulate while producing slow speech.
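The alternating optimization described above can be illustrated with a minimal sketch. The abstract does not specify the alignment method, the estimator, or the initialization, so everything here is an assumption made for illustration: dynamic time warping (DTW) is used for the time alignment, ordinary least squares for the full-matrix affine TF, and the identity map as the starting point. The function names `dtw_path`, `fit_affine`, and `alternate_fit` are hypothetical and are not taken from the paper.

```python
import numpy as np


def dtw_path(X, Y):
    """DTW warping path between the rows of X (n, d) and Y (m, d).

    Assumption: plain DTW with Euclidean frame distance; the paper's
    actual alignment procedure is not given in the abstract.
    """
    n, m = len(X), len(Y)
    cost = np.linalg.norm(X[:, None, :] - Y[None, :, :], axis=2)
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            acc[i, j] = cost[i - 1, j - 1] + min(
                acc[i - 1, j - 1], acc[i - 1, j], acc[i, j - 1]
            )
    # Backtrack from the last cell to the first one.
    i, j = n, m
    path = [(i - 1, j - 1)]
    while (i, j) != (1, 1):
        steps = [(i - 1, j - 1), (i - 1, j), (i, j - 1)]
        i, j = min(steps, key=lambda s: acc[s])
        path.append((i - 1, j - 1))
    return np.array(path[::-1])


def fit_affine(X, Y):
    """Least-squares A, b such that Y is approximately X @ A.T + b."""
    Xa = np.hstack([X, np.ones((len(X), 1))])
    W, *_ = np.linalg.lstsq(Xa, Y, rcond=None)
    return W[:-1].T, W[-1]


def alternate_fit(neutral, target, n_iter=5):
    """Alternate between time alignment and affine TF estimation.

    neutral: (n, d) trajectory at the neutral rate.
    target:  (m, d) trajectory at the fast or slow rate.
    Assumption: identity initialization and a fixed iteration count.
    """
    d = neutral.shape[1]
    A, b = np.eye(d), np.zeros(d)
    for _ in range(n_iter):
        # Step 1: align the transformed neutral trajectory to the target.
        path = dtw_path(neutral @ A.T + b, target)
        # Step 2: re-estimate the TF on the aligned frame pairs.
        A, b = fit_affine(neutral[path[:, 0]], target[path[:, 1]])
    return A, b, path
```

Under these assumptions, a diagonal-matrix TF would correspond to fitting each articulatory dimension independently, and a DNN-based TF would replace `fit_affine` with a regression network trained on the same aligned frame pairs.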

DOI: 10.21437/Interspeech.2018-1862

Cite as: Singh, A., Meenakshi, G.N., Ghosh, P.K. (2018) Relating Articulatory Motions in Different Speaking Rates. Proc. Interspeech 2018, 2992-2996, DOI: 10.21437/Interspeech.2018-1862.
