It is well known that speech utterances convey a rich diversity of information about the speaker in addition to the semantic content. Such information includes speaker traits such as personality, likability, and health or pathology. Detecting speaker traits in human-computer interfaces is an important step toward more efficient and natural interaction with computers. This study proposes two groups of supra-segmental features for improving speaker trait detection performance. Compared with the baseline system based on 6125-dimensional features, the proposed supra-segmental system not only improves performance by 9.0% but is also computationally attractive and suited to real-life applications, since it uses fewer than 63 feature dimensions, about 99% fewer than the baseline.
Cite as: Liu, G., Hansen, J.H.L. (2014) Supra-Segmental Feature Based Speaker Trait Detection. Proc. The Speaker and Language Recognition Workshop (Odyssey 2014), 94-99, doi: 10.21437/Odyssey.2014-19
@inproceedings{liu14_odyssey,
  author={Gang Liu and John H.L. Hansen},
  title={{Supra-Segmental Feature Based Speaker Trait Detection}},
  year=2014,
  booktitle={Proc. The Speaker and Language Recognition Workshop (Odyssey 2014)},
  pages={94--99},
  doi={10.21437/Odyssey.2014-19}
}