R-Vectors: New Technique for Adaptation to Room Acoustics

Yuri Khokhlov, Alexander Zatvornitskiy, Ivan Medennikov, Ivan Sorokin, Tatiana Prisyach, Aleksei Romanenko, Anton Mitrofanov, Vladimir Bataev, Andrei Andrusenko, Mariya Korenevskaya, Oleg Petrov


Distant speech recognition is an important problem that is far from solved. Reverberation and noise are among the main challenges in this area, and the most popular methods of dealing with them are data augmentation and speech enhancement. In this paper, we propose a novel approach inspired by modern methods of speaker adaptation.

First, a feed-forward network is trained to classify room impulse responses (RIRs) from speech recordings. This network is then used to extract embeddings, which we call R-vectors, and these R-vectors are appended to the input features of the acoustic model. Because labeled data for the RIR classification task is lacking, we propose a self-supervised method of training the network that uses artificial audio generated by a room simulator.
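The two data-handling steps described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. Several things here are assumptions made only for illustration: the toy `synthetic_rir` function (exponentially decaying noise) stands in for a real room simulator, the RIR index is used directly as the class label, one R-vector per utterance is tiled over all frames, and all names and dimensions (40-dimensional features, 32-dimensional embedding) are hypothetical. The classifier network itself is omitted.

```python
import numpy as np


def synthetic_rir(rt60, sr=16000, length_s=0.5, rng=None):
    """Toy RIR (assumption): white noise under an exponential decay.

    The amplitude envelope drops by 60 dB at t = rt60 seconds.
    A real room simulator would be used in practice.
    """
    if rng is None:
        rng = np.random.default_rng(0)
    t = np.arange(int(sr * length_s)) / sr
    decay = 10.0 ** (-3.0 * t / rt60)  # 20*log10(decay) = -60 dB at t = rt60
    rir = rng.standard_normal(t.size) * decay
    return rir / np.max(np.abs(rir))


def reverberate(speech, rir):
    """Convolve dry speech with an RIR and trim to the original length."""
    return np.convolve(speech, rir)[: speech.size]


def append_r_vector(features, r_vector):
    """Tile one utterance-level embedding over all frames and concatenate.

    features: (num_frames, feat_dim), r_vector: (emb_dim,)
    returns:  (num_frames, feat_dim + emb_dim)
    """
    tiled = np.tile(r_vector, (features.shape[0], 1))
    return np.concatenate([features, tiled], axis=1)


rng = np.random.default_rng(0)

# Self-labeled training data: each simulated RIR defines its own class,
# so no manual annotation of room acoustics is needed.
rirs = [synthetic_rir(rt60, rng=rng) for rt60 in (0.2, 0.5, 0.8)]
dry = rng.standard_normal(16000)  # placeholder for one second of dry speech
examples = [(reverberate(dry, rir), label) for label, rir in enumerate(rirs)]

# Acoustic-model input: frame features with the R-vector appended.
features = rng.standard_normal((100, 40))  # hypothetical 40-dim features
r_vector = rng.standard_normal(32)         # hypothetical 32-dim R-vector
augmented = append_r_vector(features, r_vector)
print(augmented.shape)  # (100, 72)
```

Appending the same embedding to every frame mirrors how utterance-level speaker embeddings are commonly fed to acoustic models. Whether the R-vectors are computed per utterance or over shorter windows is not stated in the abstract.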

Experimental evaluation was conducted on the VOiCES19 and AMI single-channel tasks as well as the CHiME5 multi-channel task. The R-vector-adapted ASR systems achieve up to 14% relative WER reduction. Furthermore, this gain is additive with the gains from state-of-the-art dereverberation (WPE) and speaker adaptation (x-vector) techniques.


DOI: 10.21437/Interspeech.2019-2645

Cite as: Khokhlov, Y., Zatvornitskiy, A., Medennikov, I., Sorokin, I., Prisyach, T., Romanenko, A., Mitrofanov, A., Bataev, V., Andrusenko, A., Korenevskaya, M., Petrov, O. (2019) R-Vectors: New Technique for Adaptation to Room Acoustics. Proc. Interspeech 2019, 1243-1247, DOI: 10.21437/Interspeech.2019-2645.


@inproceedings{Khokhlov2019,
  author={Yuri Khokhlov and Alexander Zatvornitskiy and Ivan Medennikov and Ivan Sorokin and Tatiana Prisyach and Aleksei Romanenko and Anton Mitrofanov and Vladimir Bataev and Andrei Andrusenko and Mariya Korenevskaya and Oleg Petrov},
  title={{R-Vectors: New Technique for Adaptation to Room Acoustics}},
  year=2019,
  booktitle={Proc. Interspeech 2019},
  pages={1243--1247},
  doi={10.21437/Interspeech.2019-2645},
  url={http://dx.doi.org/10.21437/Interspeech.2019-2645}
}