5th International Conference on Spoken Language Processing
This paper describes a methodology for analysing the dynamic encoding of speaker identity and dialect in prosodic parameters. Properties of the path of best match produced by the well-known DTW (Dynamic Time Warping) algorithm allow the dynamic properties of acoustic parameters to be separated from the static ones. A database of 19 speakers of Australian English was recorded, and F0, energy, zero crossing rate and voicing contours were extracted. Identity encoding was measured by discriminant analysis figures, and dialect encoding by correlation rates. Dynamic encoding levels were found to be significantly higher than static levels for both speaker characteristics (identity: 75% versus 55%; dialect: 0.58 versus 0.45). Normalisation of the acoustic parameters into the range 0 to 1, which eliminates all static information, reduced encoding levels to 70% (identity) and 0.52 (dialect), showing the robustness of dynamic encoding. Contrasting DTW warp path properties with the DTW distance showed the warp path to be a significantly better extractor of encoded information (72% versus 54% for identity; 0.45 versus 0.30 for dialect).
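As an illustration of the two operations named in the abstract, the following is a minimal sketch of range normalisation to 0 to 1 and of a textbook DTW that returns both the distance and the warp path. It is not the authors' implementation: the function names (`minmax_normalise`, `dtw`, `diagonal_deviation`), the absolute-difference local cost, the symmetric step pattern and the diagonal-deviation feature are all assumptions made for the example, since the abstract does not specify which warp path properties were used.

```python
import numpy as np


def minmax_normalise(contour):
    """Rescale a contour into the range 0 to 1 (assumed normalisation scheme)."""
    x = np.asarray(contour, dtype=float)
    span = x.max() - x.min()
    if span == 0:
        return np.zeros_like(x)  # a flat contour carries no dynamic information
    return (x - x.min()) / span


def dtw(a, b):
    """Textbook DTW: return (distance, warp_path) for two 1-D contours.

    Local cost is |a[i] - b[j]|; allowed steps are diagonal, vertical and
    horizontal (an assumption, other step patterns are common).
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i, j] = d + min(cost[i - 1, j - 1], cost[i - 1, j], cost[i, j - 1])

    # Backtrack from the end of both contours to recover the path of best match.
    i, j = n, m
    path = [(i - 1, j - 1)]
    while (i, j) != (1, 1):
        candidates = [
            (cost[i - 1, j - 1], i - 1, j - 1),
            (cost[i - 1, j], i - 1, j),
            (cost[i, j - 1], i, j - 1),
        ]
        _, i, j = min(candidates)
        path.append((i - 1, j - 1))
    path.reverse()
    return cost[n, m], path


def diagonal_deviation(path):
    """Hypothetical warp path feature: offset j - i at each step of the path."""
    return [j - i for i, j in path]


if __name__ == "__main__":
    # Toy contours standing in for F0 tracks; not data from the paper.
    ref = minmax_normalise([100, 120, 140, 160])
    test = minmax_normalise([200, 240, 240, 280, 320])
    distance, path = dtw(ref, test)
    print(distance, path, diagonal_deviation(path))
```

In this sketch the two toy contours differ in absolute level and in timing. After normalisation the level difference disappears, so the DTW distance reflects only the residual mismatch, while the timing difference remains visible in the warp path. This is the distinction the abstract draws between the two quantities.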
Bibliographic reference. Barlow, Michael / Wagner, Michael (1998): "Measuring the dynamic encoding of speaker identity and dialect in prosodic parameters", In ICSLP-1998, paper 0979.