14th Annual Conference of the International Speech Communication Association

Lyon, France
August 25-29, 2013

A Neural Oscillator Model of Speech Timing and Rhythm

Erin Rusaw

University of Illinois at Urbana-Champaign, USA

This paper introduces the Neural Oscillator Model of Speech Timing and Rhythm (NOMSTR), which is designed to be a flexible tool for investigating, through simulation, the systems that affect speech rhythm and timing. NOMSTR is an artificial neural network (ANN) model that incorporates oscillators inspired by central pattern generators (CPGs), a type of neural circuit that underlies other patterned motor behaviors in animals. NOMSTR pairs three oscillators with thresholded nodes to model three levels of prosodic structure (e.g., syllables, accents, and phrases). The periods and phases of the oscillators can be set to represent syllable and phrase durations, and the weights between the thresholded nodes can be adjusted to model interactions between prosodic levels and their durational effects (e.g., pre-boundary lengthening). In this paper I demonstrate NOMSTR's ability to simulate the prosodic structure of spontaneous utterances in English and French, two languages with disparate prosodic systems. The accuracy of the simulated prosodic structures is tested by how well NOMSTR reproduces syllable durations, the locations of accents and phrase boundaries, and the influence of accenting and boundaries on syllable durations.
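
The architecture described in the abstract (three oscillators, each paired with a thresholded node, with adjustable weights between nodes) can be illustrated with a toy simulation. The sketch below is not the NOMSTR implementation: the function name `simulate`, the phase-accumulator form of the oscillators, the rule by which an active node slows another level's oscillator, and every numeric value (periods, threshold, weight, time step) are assumptions chosen only to make the idea concrete.

```python
def simulate(periods, weights, threshold=0.8, dt=0.001, duration=3.0):
    """Toy sketch of coupled prosodic oscillators (illustrative only).

    periods[i]    : assumed base period, in seconds, of level i
                    (0 = syllable, 1 = accent, 2 = phrase).
    weights[i][j] : assumed coupling; while node j is above threshold,
                    oscillator i is slowed by a factor (1 + weights[i][j]).
                    Weights are assumed to be non-negative.
    Returns one list of event times per level (the moments a cycle ends).
    """
    n = len(periods)
    phase = [0.0] * n                   # each oscillator's phase in [0, 1)
    events = [[] for _ in range(n)]
    for step in range(int(round(duration / dt))):
        t = step * dt
        # A thresholded node is "on" during the final part of its cycle,
        # i.e. while its level is approaching a boundary.
        active = [1.0 if p >= threshold else 0.0 for p in phase]
        for i in range(n):
            slowdown = 1.0 + sum(
                weights[i][j] * active[j] for j in range(n) if j != i
            )
            phase[i] += dt / (periods[i] * slowdown)
            if phase[i] >= 1.0:         # cycle complete: record an event
                phase[i] -= 1.0
                events[i].append(t + dt)
    return events


def intervals(times):
    """Durations between consecutive events."""
    return [b - a for a, b in zip(times, times[1:])]


# Hypothetical settings: 0.2 s syllables, 0.6 s accent cycle, 1.8 s phrases.
PERIODS = [0.2, 0.6, 1.8]
NO_COUPLING = [[0.0] * 3 for _ in range(3)]
# Assumed weight from the phrase node onto the syllable oscillator.
PHRASE_TO_SYLLABLE = [[0.0, 0.0, 0.5], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]

free = simulate(PERIODS, NO_COUPLING)
coupled = simulate(PERIODS, PHRASE_TO_SYLLABLE)
print("longest syllable, uncoupled:", round(max(intervals(free[0])), 3))
print("longest syllable, coupled:  ", round(max(intervals(coupled[0])), 3))
```

With a non-zero weight from the phrase node to the syllable oscillator, the syllables that fall near the end of the phrase cycle come out longer than the others, which is one simple way to picture a durational effect such as pre-boundary lengthening arising from an interaction between levels.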

Bibliographic reference. Rusaw, Erin (2013): "A neural oscillator model of speech timing and rhythm", in INTERSPEECH-2013, 607-611.