SPoSS - Sound Patterns of Spontaneous Speech

La Baume-les-Aix, France
September 24-26, 1998

The Aims of SPoSS

Danielle Duez

Laboratoire Parole et Langage (LPL), CNRS ESA 6057, Aix-en-Provence, France

It is well known that most of our daily speech productions are spontaneous, i.e. improvised, expressed on the spur of the moment. We speak to exchange ideas, to express our feelings, to communicate with our relatives, friends, colleagues and new acquaintances. In spite of this, however, phonetic studies on spontaneous speech have long been relatively rare, or at best, marginal.

There have been many reasons for this marginality, including the extreme variability of spontaneous speech patterns and the search for scientific rigour. To neutralize speech variability and to avoid the manifestations of spontaneity, often considered undesirable, studies on speech production and perception have restricted their investigations to read speech, or more precisely to oralized written speech, also called 'laboratory speech'. Most often, laboratory-speech corpora have consisted of a limited number of prepared sentences repeated a limited number of times in soundproof rooms by a selected speaker.

The contribution of laboratory-speech investigations has been fundamental to the understanding of the processes of speech production and perception, to the description of universal as well as language-specific characteristics, and to the development of speech technology. However, the results obtained have been specific to laboratory speech, that is, to speech stripped of its normal creativity because it has consisted of prepared sentences produced in isolation in a very specific situational context and because the speakers have had to be neutral and the listeners, virtual and ideal. Moreover, the analysis of factor effects has been based on the premise that factors act in a categorical way (for example, prominence/non-prominence; normal rate/fast rate).

Within the past thirty years, there has been a renewed interest in spontaneous speech processes. Now, far from being ignored or avoided, the manifestations of spontaneity have become the subject of investigations in various languages. For example, to pinpoint the characteristics of spontaneous speech, the acoustic structure of words or sentences produced in conversations has often been compared with the structure of the same words or sentences when read. Generally, the speech segments in words or sentences produced spontaneously were found to be reduced, altered, omitted, or combined with other segments compared with the same words or sentences when read. This finding was interpreted as the result of greater reduction and contextual assimilation in spontaneous speech.

The fact that the acoustic-phonetic form of words may be reduced and impoverished has crucial implications for the production and perception of spontaneous speech, and raises a certain number of questions such as:

  1. Are spontaneous-speech processes rule-governed?
  2. To what extent are spontaneous-speech processes universal, and to what extent are they language-specific?
  3. Are diachronic sound changes rooted in the same principles as those that dictate synchronic sound variations?
  4. How does prosodic information affect spontaneous-speech processes?
  5. Do we perceive the manifestations of spontaneous-speech processes?
  6. The reduced form of words may be insufficient for identification, so how do listeners cope with missing information? What is the role of context in the treatment of variations?
  7. Is there complete omission of articulatory movements in cases of speech-segment deletion?
  8. How can spontaneous-speech processes be modelled?

These questions provided the rationale for this workshop on the 'Sound Patterns of Spontaneous Speech'. It is believed that the analysis of spontaneous-speech processes will provide us with better knowledge of articulatory organization and a better understanding of speech perception, and should allow us to improve the naturalness of speech synthesis and the reliability of speech recognition.

The aim of the workshop was fourfold:

  1. The principle of phoneme concatenation and coarticulation, supplemented with articulatory phenomena as occasional deviations from it, has recently been questioned. Speech movements do not appear to act simultaneously, but rather seem to occur with grossly shifted timing. Moreover, articulatory-movement timing also seems to depend on external factors such as syllable position in phrases and utterances, and speech rates and styles. Spontaneous-speech data are particularly valuable since they concern long stretches of speech and allow us to test the ability of articulatory models to account for spontaneous-speech phenomena such as deletions or reconstructions of speech segments. Therefore, the first aim of the workshop was to collect data on various languages using various phonological and prosodic systems.
  2. Languages are 'alive' and, like any 'living thing', are subject to variations and changes. These changes can be investigated diachronically or synchronically. Over the past twenty years there has been a surge of interest in the link between diachronic changes in speech segments and synchronic morphophonological or phonological variations. It is clear that any diachronic change has its synchronic consequence and that when a diachronic change has taken place, some traces of the old form may be retained at the phonological or morphophonological level. Thus, there may be a correspondence between diachronic and synchronic events and facts. Therefore, the second aim of the workshop was to examine the question of whether the same rules of assimilation and of weakening or strengthening provide an explanation both for diachronic changes and for the coarticulatory patterns of speech segments.
  3. In spontaneous speech, the acoustic structure of words may be severely reduced and impoverished compared to their intended phonetic form. Similarly, assimilatory processes can also lead to the creation of words which do not exist, i.e. to non-words. It is well known that the intelligibility of words taken from conversations is often less than 50% when assessed on the basis of the acoustic signal alone. Therefore, a fundamental question that arises is how listeners decode these non-words. What implications does this have for the way knowledge of lexical form is mentally represented and for the kinds of mental processes that relate non-words to lexical representation?
  4. Transcribing speech is a very difficult task, mainly due to the complexity of real-world speech. In the case of spontaneous speech, the difficulty of the transcription is almost insurmountable since the goal is to report spontaneous-speech processes as accurately as possible. This raises a certain number of questions such as: 1) what are the spontaneous-speech processes that are to be transcribed, and 2) how can we transcribe them? Answering these questions implies methodological and theoretical choices which have crucial implications for the linguistic representation of speech and for linguistic theory.

The SPoSS workshop will probably offer some answers to the above questions, while also raising other new questions and issues. As Bachelard wrote (1980), 'La connaissance du réel est une lumière qui projette toujours quelque part des ombres. Elle n'est jamais immédiate et pleine'. (Knowledge of the real is a light that always casts shadows somewhere. Never is it immediate or complete). But these new questions will sow the seeds for future research on spontaneous speech.

The workshop is organised in four main sections, beginning with data descriptions of spontaneous speech processes in various languages. The complex relationship between spontaneous speech processes and their perceptual consequences is addressed in section 2. In section 3, there are detailed descriptions of prosodic aspects of spontaneous speech. Section 4 presents methods of automatic transcription and labelling of spontaneous speech.

As a workshop on spontaneous speech, it can only be interactive. Each day of the workshop will end with a concluding discussion about the data presented in the different talks and lectures and the questions they have raised. A panel session entitled 'Achievements and perspectives of research on spontaneous speech' will close the workshop. No doubt La Baume, with its beautiful old building and peaceful pine woods, will also be a place conducive to thinking and to fruitful and friendly discussion.

I would like to thank ESCA and GPCP for their scientific sponsorship of the workshop. I am also grateful to you for your participation and your interest in spontaneous speech. I hope that you will like the scientific program and that you will enjoy La Baume-les-Aix, Aix-en-Provence, and the surroundings.


Bibliographic reference.  Duez, Danielle (1998): "The aims of SPoSS. Introductory remarks", In SPoSS-1998, vii-ix.