ISCA Archive Interspeech 2021

X-net: A Joint Scale Down and Scale Up Method for Voice Call

Liang Wen, Lizhong Wang, Xue Wen, Yuxing Zheng, Youngo Park, Kwang Pyo Choi

This paper proposes X-net, a jointly learned scale-down and scale-up architecture for data pre- and post-processing in voice calls, as a means of bandwidth extension over band-limited channels. The scale-down and scale-up submodules are deployed separately on the transmitter and the receiver to perform downsampling and upsampling, respectively. Separate supervision is applied to each submodule so that X-net works properly even if one submodule is missing. A two-stage training method is used to train X-net for improved perceptual quality. Results show that the jointly learned X-net achieves promising improvement over blind audio super-resolution on both objective and subjective metrics, even in a lightweight implementation with only 1k parameters.
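The abstract does not specify layers, losses, or sampling rates, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a factor-of-2 rate change, single FIR filters standing in for the learned submodules, and a weighted sum of two mean-squared errors as the "separate supervision". The names `scale_down`, `scale_up`, `joint_loss`, `ref_down`, and `alpha` are hypothetical.

```python
import numpy as np

def scale_down(x, h_down):
    """Transmitter side (assumed form): filter, then keep every 2nd sample."""
    return np.convolve(x, h_down, mode="same")[::2]

def scale_up(y, h_up):
    """Receiver side (assumed form): insert zeros between samples, then filter."""
    z = np.zeros(2 * len(y))
    z[::2] = y
    return np.convolve(z, h_up, mode="same")

def joint_loss(x, h_down, h_up, ref_down, alpha=0.5):
    """Hypothetical joint objective with one term per submodule.

    loss_down keeps the narrowband signal close to a conventional
    downsampled reference, so it stays usable without scale_up.
    loss_up measures how well the wideband signal is reconstructed.
    """
    y = scale_down(x, h_down)
    x_hat = scale_up(y, h_up)
    loss_down = np.mean((y - ref_down) ** 2)
    loss_up = np.mean((x_hat - x) ** 2)
    return alpha * loss_down + (1 - alpha) * loss_up

# Toy usage: identity filter on the way down, linear interpolation on the way up.
x = np.arange(16.0)
h_down = np.array([0.0, 1.0, 0.0])
h_up = np.array([0.5, 1.0, 0.5])
y = scale_down(x, h_down)
x_hat = scale_up(y, h_up)
```

In this toy setup the filters are fixed; in a learned system both filters would be trained together so that the transmitted narrowband signal carries information that helps the receiver reconstruct the wideband signal.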


doi: 10.21437/Interspeech.2021-812

Cite as: Wen, L., Wang, L., Wen, X., Zheng, Y., Park, Y., Choi, K.P. (2021) X-net: A Joint Scale Down and Scale Up Method for Voice Call. Proc. Interspeech 2021, 1644-1648, doi: 10.21437/Interspeech.2021-812

@inproceedings{wen21_interspeech,
  author={Liang Wen and Lizhong Wang and Xue Wen and Yuxing Zheng and Youngo Park and Kwang Pyo Choi},
  title={{X-net: A Joint Scale Down and Scale Up Method for Voice Call}},
  year=2021,
  booktitle={Proc. Interspeech 2021},
  pages={1644--1648},
  doi={10.21437/Interspeech.2021-812}
}