The objective of this work is to represent the information in the speech signal picked up by a throat microphone (TM) efficiently in terms of the number of bits required. Since the TM signal is unaffected by ambient noise, the required information can be extracted effectively under different environmental conditions. A spectral mapping technique from TM speech to normal microphone (NM) speech is proposed to improve the perceptual quality. The mapping is done using vector quantization of pairwise spectral feature vectors derived from each frame of the TM signal and the corresponding frame of the NM signal. Once the codebook is formed, the spectral features of a TM signal are represented as a sequence of codebook indices. The sequence of codebook indices, together with the pitch contour and the energy contour derived from the TM signal, is used to store or transmit the TM speech information efficiently. From the received sequence of codebook indices, the NM spectral vectors are retrieved directly, because each codebook entry was formed by pairwise vector quantization and therefore holds the NM features alongside the TM features. A synthetic residual signal is generated at the receiver from prestored residual templates by incorporating the pitch and the energy. The synthetic residual signal is used to excite the system corresponding to the NM spectral vectors to generate the speech signal.
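The pairwise codebook idea can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the clustering algorithm, the spectral features, or the codebook size, so plain k-means on concatenated TM and NM vectors is assumed here, and all function names (`train_joint_codebook`, `encode`, `decode`) are hypothetical.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(vec, centroids):
    """Index of the centroid closest to vec."""
    return min(range(len(centroids)), key=lambda k: sq_dist(vec, centroids[k]))


def train_joint_codebook(tm_frames, nm_frames, size, iters=20, seed=0):
    """Assumed training step: k-means on concatenated [TM | NM] vectors.

    Returns two aligned codebooks, so entry k of the TM codebook and
    entry k of the NM codebook come from the same joint centroid.
    """
    joint = [list(t) + list(n) for t, n in zip(tm_frames, nm_frames)]
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(joint, size)]
    for _ in range(iters):
        clusters = [[] for _ in range(size)]
        for v in joint:
            clusters[nearest(v, centroids)].append(v)
        for k, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous centroid
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    dim = len(tm_frames[0])
    tm_book = [c[:dim] for c in centroids]
    nm_book = [c[dim:] for c in centroids]
    return tm_book, nm_book


def encode(tm_frames, tm_book):
    """Sender side: one codebook index per TM frame."""
    return [nearest(f, tm_book) for f in tm_frames]


def decode(indices, nm_book):
    """Receiver side: look up the NM spectral vector paired with each index."""
    return [nm_book[i] for i in indices]


if __name__ == "__main__":
    # Toy 2-dimensional "spectral" vectors, invented for illustration only.
    tm = [[0, 0]] * 3 + [[5, 5]] * 3
    nm = [[10, 10]] * 3 + [[20, 20]] * 3
    tm_book, nm_book = train_joint_codebook(tm, nm, size=2)
    indices = encode([[0.2, -0.1], [4.8, 5.1]], tm_book)
    print(indices, decode(indices, nm_book))
```

Under this sketch only the indices cross the channel. With fixed-length coding, a codebook of K entries costs ceil(log2 K) bits per frame for the spectral information, in addition to whatever is spent on the pitch and energy contours.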
Bibliographic reference. Rama Murty, K. Sri / Khurana, Saurav / Itankar, Yogendra Umesh / Kesheorey, M. R. / Yegnanarayana, B. (2008): "Efficient representation of throat microphone speech", In INTERSPEECH-2008, 2610-2613.