Directional Audio Rendering Using a Neural Network Based Personalized HRTF

Geon Woo Lee, Jung Hyuk Lee, Seong Ju Kim, Hong Kook Kim


Multi-channel speech/audio separation and enhancement methods are widely used in many speech/audio applications. However, these methods may cause a loss of spatial cues, including the interaural time difference and interaural level difference, because further processing is applied to monaural signals. As a result, listeners may have difficulty perceiving the direction of the source signal. We present a directional audio renderer based on a personalized HRTF, which is estimated by a neural network combining a DNN and a CNN that takes the listener's anthropometric parameters and ear images as input. This demonstrated directional audio renderer aims to foster research on audio processing for virtual reality/augmented reality and to improve the quality of service of such devices.
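
The rendering step described above can be illustrated with a minimal sketch: once a personalized HRTF has been estimated for the listener and expressed as a time-domain head-related impulse response (HRIR) pair, a monaural source is convolved with the left and right HRIRs to restore the directional cues. The sketch below assumes NumPy/SciPy and uses dummy signals; all function and variable names are illustrative and not taken from the paper's implementation.

# Minimal sketch of HRIR-based binaural rendering, assuming the personalized
# HRTF has already been estimated (e.g., by a DNN/CNN model as described above)
# and converted to a time-domain HRIR pair. Names here are illustrative only.
import numpy as np
from scipy.signal import fftconvolve


def render_direction(mono, hrir_left, hrir_right):
    """Convolve a monaural signal with a left/right HRIR pair to place it
    at the direction encoded by that pair; returns a (num_samples, 2) array."""
    left = fftconvolve(mono, hrir_left, mode="full")
    right = fftconvolve(mono, hrir_right, mode="full")
    out = np.stack([left, right], axis=-1)
    # Normalize to avoid clipping when writing to a fixed-point audio file.
    peak = np.max(np.abs(out))
    return out / peak if peak > 0 else out


if __name__ == "__main__":
    fs = 48000
    # Dummy source and HRIR pair; in practice the HRIRs would come from the
    # HRTF personalized to the listener.
    mono = np.random.randn(fs)           # 1 s of noise as a stand-in source
    hrir_l = np.random.randn(256) * np.hanning(256)
    hrir_r = np.roll(hrir_l, 20)          # crude stand-in for ITD/ILD
    binaural = render_direction(mono, hrir_l, hrir_r)
    print(binaural.shape)                 # (48255, 2)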


Cite as: Lee, G.W., Lee, J.H., Kim, S.J., Kim, H.K. (2019) Directional Audio Rendering Using a Neural Network Based Personalized HRTF. Proc. Interspeech 2019, 2364-2365.


@inproceedings{Lee2019,
  author={Geon Woo Lee and Jung Hyuk Lee and Seong Ju Kim and Hong Kook Kim},
  title={{Directional Audio Rendering Using a Neural Network Based Personalized HRTF}},
  year=2019,
  booktitle={Proc. Interspeech 2019},
  pages={2364--2365}
}