September 22-25, 1997
A statistical model of voice fundamental frequency contours was proposed with the aim of developing effective ways to use prosodic features in speech recognition. Because prosodic features should be treated over longer units, the proposed model represents transitions in moraic units. A fundamental frequency contour was first segmented into moraic units, and each moraic contour was then represented by a code according to its shape. The fundamental frequency contours spanning several morae around each boundary in question were modeled with an HMM scheme, and experiments on syntactic boundary detection were conducted. The detection rate reached 89.2% in the closed-condition experiment and was around 85% in the open-condition (speaker and topic) experiment. Experiments on accent type recognition were also conducted, yielding around 74% correct recognition in the speaker-independent case.
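The pipeline described above (segment into morae, code each mora by shape, score the code sequence with an HMM) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the three-way code set (`rise`, `flat`, `fall`), the 1.0 threshold, the function names, and the toy model parameters. The scoring step uses the standard forward algorithm for a discrete HMM, and the boundary decision simply compares two hypothetical models.

```python
import math

def shape_code(segment, threshold=1.0):
    """Map one mora's F0 samples to a coarse shape code (hypothetical code set)."""
    delta = segment[-1] - segment[0]  # net F0 change across the mora
    if delta > threshold:
        return "rise"
    if delta < -threshold:
        return "fall"
    return "flat"

def encode_contour(f0, boundaries, threshold=1.0):
    """Split an F0 contour at mora boundaries (sample indices) and code each mora."""
    return [
        shape_code(f0[start:end], threshold)
        for start, end in zip(boundaries[:-1], boundaries[1:])
    ]

def forward_log_likelihood(codes, init, trans, emit):
    """Log-likelihood of a code sequence under a discrete HMM (scaled forward pass)."""
    n = len(init)
    alpha = [init[s] * emit[s][codes[0]] for s in range(n)]
    log_lik = 0.0
    for t in range(len(codes)):
        if t > 0:
            alpha = [
                sum(alpha[p] * trans[p][s] for p in range(n)) * emit[s][codes[t]]
                for s in range(n)
            ]
        scale = sum(alpha)  # rescale at every step to avoid numerical underflow
        log_lik += math.log(scale)
        alpha = [a / scale for a in alpha]
    return log_lik

def detect_boundary(codes, boundary_model, other_model):
    """Decide 'boundary' when the boundary model explains the codes better."""
    return (forward_log_likelihood(codes, *boundary_model)
            > forward_log_likelihood(codes, *other_model))

# Toy one-state models with made-up parameters, for illustration only.
boundary_model = ([1.0], [[1.0]], [{"rise": 0.2, "flat": 0.2, "fall": 0.6}])
other_model = ([1.0], [[1.0]], [{"rise": 0.4, "flat": 0.4, "fall": 0.2}])

codes = encode_contour([0, 1, 2, 2, 2, 2, 1, 0, -1], [0, 3, 6, 9])
is_boundary = detect_boundary(["fall", "fall", "flat"], boundary_model, other_model)
```

In practice the models would be trained on labeled data and would have several states; the one-state models above only keep the arithmetic easy to follow.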
Bibliographic reference. Hirose, Keikichi / Iwano, Kouji (1997): "A method of representing fundamental frequency contours of Japanese using statistical models of moraic transition", In EUROSPEECH-1997, 311-314.