4th International Conference on Spoken Language Processing

Philadelphia, PA, USA
October 3-6, 1996

Constructing Multi-level Speech Database for Spontaneous Speech Processing

Minsoo Hahn (1), Sanghun Kim (2), Jung-Chul Lee (2), Yong-Ju Lee (3)

(1) Audio Information Processing Section
(2) Spoken Language Processing Section, Electronics and Telecommunication Research Institute, Taejeon, Korea
(3) Dept. of Computer Eng., WonKwang Univ., Chonbuk, Korea

This paper describes a new database, called the multi-level speech database, for spontaneous speech processing. We designed the database to cover textual and acoustic variations ranging from declarative speech to spontaneous speech. The database is composed of five categories, which are, in order of decreasing spontaneity: spontaneous speech, interview, simulated interview, declarative speech with context, and declarative speech without context. We collected a total of 112 sets from 23 subjects (19 male, 4 female). The database was first transcribed using 15 transcription symbols according to our own transcription rules; prosodic information will be added in a second step. The goals of this research are a comparative textual and prosodic analysis at each level and a quantification of the spontaneity of a diversified speech database for dialogue speech synthesis and recognition. A preliminary analysis of the transcribed texts shows that, as expected, spontaneous speech contains more corrections, repetitions, and pauses than the other categories. In addition, the average number of sentences per turn is greater in spontaneous speech than in the other categories. Based on these results, we can quantify the spontaneity of a speech database.
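The textual measures named in the abstract (corrections, repetitions, pauses, and average sentences per turn) could be computed from transcripts roughly as follows. This is only a minimal sketch: the abstract does not list the 15 transcription symbols, so the markers `<cor>`, `<rep>`, and `<pau>`, the function name `spontaneity_stats`, and the example sentences are all hypothetical and are not taken from the paper.

```python
# Hypothetical markers: the paper's actual 15 transcription symbols
# are not given in the abstract, so these three are placeholders.
MARKERS = {"<cor>": "corrections", "<rep>": "repetitions", "<pau>": "pauses"}


def spontaneity_stats(turns):
    """Count marker occurrences and average sentences per turn.

    turns: list of turns, each turn a list of transcribed sentence strings.
    """
    stats = {name: 0 for name in MARKERS.values()}
    n_sentences = 0
    for turn in turns:
        for sentence in turn:
            n_sentences += 1
            for token in sentence.split():
                if token in MARKERS:
                    stats[MARKERS[token]] += 1
    # Guard against an empty transcript to avoid division by zero.
    stats["sentences_per_turn"] = n_sentences / len(turns) if turns else 0.0
    return stats


# Invented example data: two turns, three sentences in total.
example_turns = [
    ["I would like <pau> to book a room", "for <rep> for two nights"],
    ["on the third <cor> the fourth of October"],
]
stats = spontaneity_stats(example_turns)
print(stats)
```

Comparing such per-category statistics across the five levels would be one way to rank them by spontaneity, under the assumption that the markers are annotated consistently.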


Bibliographic reference. Hahn, Minsoo / Kim, Sanghun / Lee, Jung-Chul / Lee, Yong-Ju (1996): "Constructing multi-level speech database for spontaneous speech processing", In ICSLP-1996, 1930-1933.