7th International Conference on Spoken Language Processing
September 16-20, 2002
We carried out a very detailed segmentation into prosodic phrases in order to construct a Japanese prosodic database. Boundaries correspond to junctures between phrases, including C|C and V|V clusters. The "prosodic phrase" that we introduce as the unit of segmentation is defined and regarded as a unit of spoken-language perception. To locate boundaries exactly, candidate points for each segment boundary were enumerated from the wide-band and narrow-band spectrograms, the detailed speech waveform, the fundamental frequency contour, and the amplitude transitions of the higher-order formants. The exact boundary was then determined by fine time adjustment in steps of the local fundamental period of the speech. To keep the segmentation consistent, a single person carefully checked all segments. The database, referred to here as "Japanese Multext", contains read-style and spontaneous-style speech in the Tokyo dialect from three male and three female speakers.
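The fine adjustment in steps of the fundamental period can be illustrated with a minimal sketch. It is not the authors' tool: it assumes that pitch marks (one time stamp per fundamental period, in seconds, sorted) are already available, and the function names `snap_to_pitch_mark` and `shift_by_periods` are hypothetical. A candidate boundary is first snapped to the nearest pitch mark and can then be moved forward or backward by whole periods.

```python
from bisect import bisect_left


def snap_to_pitch_mark(candidate_s, pitch_marks_s):
    """Return the pitch mark (in seconds) closest to a candidate boundary.

    pitch_marks_s is assumed to be a sorted, non-empty list of period onsets.
    """
    if not pitch_marks_s:
        raise ValueError("at least one pitch mark is required")
    i = bisect_left(pitch_marks_s, candidate_s)
    if i == 0:
        return pitch_marks_s[0]
    if i == len(pitch_marks_s):
        return pitch_marks_s[-1]
    before, after = pitch_marks_s[i - 1], pitch_marks_s[i]
    # Ties go to the earlier mark.
    return before if candidate_s - before <= after - candidate_s else after


def shift_by_periods(boundary_s, pitch_marks_s, steps):
    """Move a boundary that lies on a pitch mark by a whole number of periods.

    Positive steps move later in time, negative steps earlier; the result is
    clamped to the first or last available pitch mark.
    """
    i = bisect_left(pitch_marks_s, boundary_s)
    if i == len(pitch_marks_s) or pitch_marks_s[i] != boundary_s:
        raise ValueError("boundary must coincide with a pitch mark")
    j = max(0, min(len(pitch_marks_s) - 1, i + steps))
    return pitch_marks_s[j]


if __name__ == "__main__":
    marks = [0.100, 0.108, 0.116, 0.125]  # illustrative values only
    boundary = snap_to_pitch_mark(0.111, marks)
    print(boundary, shift_by_periods(boundary, marks, 1))
```

Working on pitch marks rather than fixed-length frames keeps each boundary aligned with a period onset, which is the property the period-by-period adjustment relies on.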
Bibliographic reference. Kitazawa, Shigeyoshi / Itoh, Toshihiko / Kitamura, Tatsuya (2002): "Juncture segmentation of Japanese prosodic unit based on the spectrographic features", In ICSLP-2002, 1201-1204.