5th European Conference on Speech Communication and Technology

Rhodes, Greece
September 22-25, 1997

Extraction and Representation Rhythmic Components of Spontaneous Speech

Shigeyoshi Kitaazawa, Hideya Ichikawa, Satoshi Kobayashi, Yukihiro Nishinuma (1)

Department of Computer Science, Faculty of Information, Shizuoka University, Jouhoku, Hamamatsu, Japan
(1) CNRS, URA 261 "Laboratoire parole et langage", Universite de Provence, Aix-en-Provence, France

Speech rate is measured and displayed with our algorithm TEMAX (Temporal Evaluation and Measurement Algorithm by KS). The TEMAX-gram, a sonagram-like display of the DFT of the speech envelope computed with a 1-second window, is convenient for bringing out isosyllabic characteristics. For Japanese, it traces two dark bars, called rhythmic formants, RF1 and RF2: the first around 8 Hz and the second at about half that frequency. RF1 corresponds to the speech rate, and RF2 represents the bimoraic rhythmic foot. For English, isochronic characteristics are observable as RF1 with a 2-second window. Furthermore, with a 1-second window, the periodicity of syllables between stresses is displayed as RF2.
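
The idea of taking a windowed DFT of the speech envelope can be illustrated with a minimal sketch. This is not the TEMAX algorithm itself, whose details the abstract does not give. The envelope extraction (rectification and block averaging), the Hann window, the 100 Hz envelope rate, the 0.1-second hop, and all function names (`envelope`, `modulation_spectrum`, `rhythm_gram`) are assumptions made for illustration only.

```python
import cmath
import math


def envelope(signal, rate, env_rate):
    """Rectify the waveform and average it in blocks (assumed method),
    giving an amplitude envelope sampled at env_rate Hz."""
    block = rate // env_rate
    return [
        sum(abs(x) for x in signal[i:i + block]) / block
        for i in range(0, len(signal) - block + 1, block)
    ]


def modulation_spectrum(frame, env_rate, max_hz):
    """DFT magnitudes of one mean-removed, Hann-windowed envelope frame,
    for bins from 0 Hz up to max_hz."""
    n = len(frame)
    mean = sum(frame) / n
    windowed = [
        (x - mean) * (0.5 - 0.5 * math.cos(2 * math.pi * i / n))
        for i, x in enumerate(frame)
    ]
    spectrum = []
    for k in range(int(max_hz * n / env_rate) + 1):
        coeff = sum(
            w * cmath.exp(-2j * math.pi * k * i / n)
            for i, w in enumerate(windowed)
        )
        spectrum.append(abs(coeff))
    return spectrum


def rhythm_gram(signal, rate, env_rate=100, window_s=1.0, hop_s=0.1, max_hz=20):
    """Slide a window over the envelope and return one modulation
    spectrum per frame (rows = time, columns = modulation frequency)."""
    env = envelope(signal, rate, env_rate)
    win = int(window_s * env_rate)
    hop = int(hop_s * env_rate)
    return [
        modulation_spectrum(env[start:start + win], env_rate, max_hz)
        for start in range(0, len(env) - win + 1, hop)
    ]
```

With `window_s=1.0` the bins are spaced 1 Hz apart, and with `window_s=2.0` they are spaced 0.5 Hz apart. Plotting the returned rows as a grey-scale image over time would give a sonagram-like picture in which a steady syllable or mora rate appears as a dark horizontal bar.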


Bibliographic reference.  Kitaazawa, Shigeyoshi / Ichikawa, Hideya / Kobayashi, Satoshi / Nishinuma, Yukihiro (1997): "Extraction and representation rhythmic components of spontaneous speech", In EUROSPEECH-1997, 641-644.