Phonetics and Phonology of Speaking Styles: Reduction and Elaboration in Speech Communication

Barcelona, Catalonia, Spain
September 30 - October 2, 1991

        

The Lancaster/IBM Spoken English Corpus: a Database for Research into Speaking Styles

Gerry Knowles

Department of Linguistics, Lancaster University, UK

This paper discusses techniques currently being developed at Lancaster University to investigate in a systematic way a wide range of phonetic patterns in naturally occurring spoken texts. These techniques will have a number of applications, including the study of speaking style.

There has been a steady shift over the last decade or so away from the study of language as system towards the study of language in communication. This has important consequences for the way we approach the phonetic form of a text. Formerly it was sufficient for many purposes to use a standard orthographic transcript, or at most to do a phonemic transcription and mark word stress: either of these is sufficient to indicate what words are being said, but neither can satisfactorily represent what is being communicated.

If we are concerned with language as system, we can pose relatively simple phonological questions, typically relating to individual words out of context, and they can be answered in a simple and straightforward way. Thus we can ascertain how words are pronounced, and which syllables are stressed, either by examining our own intuitions, or by looking the words up in a pronouncing dictionary. But to deal with what is communicated requires an analysis of a range of patterns - including pitch, timing, voice quality and loudness - which conventional phonetic and phonological theory is not equipped to handle.

As an illustration of the problem, consider the pronunciation of "and", which in texts sometimes begins with a glottal stop and semi-reduced vowel. The glottal stop is traditionally regarded as phonemically irrelevant, and semi-reduction is impossible if all vowels must belong uniquely to one phoneme or another. And yet this pronunciation indicates a different meaning from the form /an/ with no glottal stop and a fully reduced shwa. (At the time of writing it would appear that the former type acts as a cohesive link between sentences, and the latter as a coordinator at phrase level.) The distribution and meaning of such forms - and even their existence - cannot be discovered by introspection. Their occurrence cannot be predicted with any certainty, and is governed by probabilities.

The size of prosodic word groups, and patterns of accentuation, are similarly a matter of probability. The probabilities themselves are not fixed, but vary according to style: the prosodic patterns of a liturgical passage differ from those of a story read aloud. The variation may or may not be related to specific parameters, such as speaking rate.

In order to handle contextual patterns in texts, we need to develop a new methodology. First, we need a carefully compiled corpus of naturally occurring data. Secondly, the corpus has to be annotated in a useful way. Thirdly, we need to convert the corpus into a database so that we can extract contextual information from it.
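As a rough illustration of the third step, the sketch below loads a handful of annotated word tokens into an in-memory SQLite database and counts the pronunciation variants of a word by sentence position. The table layout, the column names, the variant labels and the toy rows are all assumptions made for the example; they do not describe the actual structure or content of the Lancaster/IBM corpus.

```python
import sqlite3

# Hypothetical schema: one row per annotated word token. Column names,
# variant labels and the toy rows are illustrative assumptions only.
def build_database(tokens):
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE token ("
        " text_id TEXT, position INTEGER, word TEXT,"
        " variant TEXT, sentence_initial INTEGER)"
    )
    conn.executemany("INSERT INTO token VALUES (?, ?, ?, ?, ?)", tokens)
    return conn

def variant_counts(conn, word):
    """Count each annotated variant of `word`, split by sentence position."""
    rows = conn.execute(
        "SELECT sentence_initial, variant, COUNT(*) FROM token"
        " WHERE word = ? GROUP BY sentence_initial, variant"
        " ORDER BY sentence_initial, variant",
        (word,),
    ).fetchall()
    return {(bool(initial), variant): n for initial, variant, n in rows}

# Invented toy data: (text_id, position, word, variant, sentence_initial)
TOY_TOKENS = [
    ("text1", 1, "and", "glottal", 1),
    ("text1", 7, "and", "reduced", 0),
    ("text1", 12, "and", "reduced", 0),
    ("text2", 1, "and", "glottal", 1),
    ("text2", 5, "and", "glottal", 0),
]

conn = build_database(TOY_TOKENS)
counts = variant_counts(conn, "and")
```

Dividing each count by the total for its position would give the kind of contextual relative frequency discussed above; adding a style column to the table would allow the same query to be run per speaking style.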

Bibliographic reference. Knowles, Gerry (1991): "The Lancaster/IBM spoken English corpus: a database for research into speaking styles", in PPoSpSt-1991, paper 035.