In this paper we present a new approach for the localisation of superposed speech regions. The system tracks the dominant-amplitude frequencies of speech segments over time and requires no training of acoustic or prosodic models. The resulting frequency tracks are then grouped using a harmonicity-based distance, each group corresponding to the production of a single speaker. The co-occurrence of distinct harmonic groups is then taken as evidence of the presence of multiple speakers. Our method has been evaluated on data from the French ANR evaluation campaign ETAPE, demonstrating the viability of this approach.
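The grouping step can be illustrated with a minimal sketch. This is not the authors' implementation: the distance below (deviation of the mean frame-wise frequency ratio from the nearest integer) and the greedy single-linkage clustering are simplified stand-ins for the harmonicity-based distance described in the paper, and all function names are hypothetical.

```python
def harmonicity_distance(track_a, track_b):
    """Distance in [0, 0.5]: 0 when the frame-wise frequency ratio of the
    two tracks (lists of Hz values, one per frame) is an exact integer,
    i.e. the tracks are harmonically related. Simplified illustration."""
    ratios = [max(a, b) / min(a, b) for a, b in zip(track_a, track_b)]
    mean_ratio = sum(ratios) / len(ratios)
    return abs(mean_ratio - round(mean_ratio))

def group_tracks(tracks, threshold=0.05):
    """Greedy single-linkage grouping: a track joins the first group
    containing a member within `threshold` harmonicity distance.
    Each resulting group is meant to model one speaker's harmonics."""
    groups = []
    for i, track in enumerate(tracks):
        for group in groups:
            if any(harmonicity_distance(track, tracks[j]) < threshold
                   for j in group):
                group.append(i)
                break
        else:
            groups.append([i])
    return groups

# A fundamental near 100 Hz and its second harmonic group together;
# a track near 313 Hz (not an integer multiple) forms its own group,
# so frames where both groups are active would be flagged as overlap.
tracks = [[100, 101, 99], [200, 202, 198], [315, 312, 309]]
print(group_tracks(tracks))  # → [[0, 1], [2]]
```

With real data, the tracks would come from the frequency-tracking front end, and the distance would be computed only over frames where both tracks are active.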
Bibliographic reference. Coz, Maxime Le / Pinquier, Julien / André-Obrecht, Régine (2013): "Superposed speech localisation using frequency tracking", In INTERSPEECH-2013, 714-717.