Cross-Linguistic Study of the Production of Turn-Taking Cues in American English and Argentine Spanish

Pablo Brusco, Juan Manuel Pérez, Agustín Gravano


We present the results of a series of machine learning experiments aimed at exploring the differences and similarities in the production of turn-taking cues in American English and Argentine Spanish. An analysis of prosodic features automatically extracted from 21 dyadic conversations (12 English, 9 Spanish) revealed that, when signaling Holds, speakers of both languages tend to use roughly the same combination of cues, characterized by a sustained final intonation, a shorter duration of turn-final inter-pausal units, and a distinct voice quality. In speech preceding Smooth Switches or Backchannels, however, we observe the same set of prosodic turn-taking cues in both languages, although the ways in which these cues are combined to form complex signals differ. Still, we find that these differences do not degrade the performance of cross-linguistic systems for automatically detecting turn-taking signals below chance level. These results are relevant to the construction of multilingual spoken dialogue systems, which need to adapt not only their ASR modules but also the way prosodic turn-taking cues are synthesized and recognized.


DOI: 10.21437/Interspeech.2017-124

Cite as: Brusco, P., Pérez, J.M., Gravano, A. (2017) Cross-Linguistic Study of the Production of Turn-Taking Cues in American English and Argentine Spanish. Proc. Interspeech 2017, 2351-2355, DOI: 10.21437/Interspeech.2017-124.


@inproceedings{Brusco2017,
  author={Pablo Brusco and Juan Manuel Pérez and Agustín Gravano},
  title={Cross-Linguistic Study of the Production of Turn-Taking Cues in American English and Argentine Spanish},
  year=2017,
  booktitle={Proc. Interspeech 2017},
  pages={2351--2355},
  doi={10.21437/Interspeech.2017-124},
  url={http://dx.doi.org/10.21437/Interspeech.2017-124}
}