Keynote Speech


Recent Trends in NLP Research

Authors: Changning HUANG
Affiliation: Microsoft Research, China
Mailto: cnhuang@microsoft.com

ABSTRACT

See abstract in his paper

Page 1


From Graphical to Voice User Interface: The Next Revolution

Authors: Chin-Hui LEE
Affiliation: Dialogue Systems Research Department, Bell Labs, Lucent Technologies, Murray Hill, New Jersey
Mailto: chl@research.bell-labs.com

ABSTRACT

See abstract in his paper

Page 2


Topics on Minimum Classification Error Rate Based Discriminant Function Approach to Speech Recognition

Authors: Wu CHOU
Affiliation: Bell Labs., Lucent Technologies, 600 Mountain Ave., Murray Hill, NJ 07974
Mailto: wuchou@research.bell-labs.com

ABSTRACT

In this paper, we study the discriminant-function-based minimum recognition error rate approach to pattern recognition. This approach departs from the conventional paradigm, which links a classification/recognition task to the problem of distribution estimation. Instead, it takes a discriminant-function-based statistical pattern recognition approach, and the suitability of this approach for classification error rate minimization is established through a special loss function. It is meaningful even when the model correctness assumption is known to be invalid. The use of discriminant functions has a significant impact on classifier design, since in many realistic applications, such as speech recognition, the true distribution form of the source is rarely known precisely, and without the model correctness assumption, the classical optimality theory of the distribution estimation approach cannot be applied directly. We discuss issues in this new classifier design paradigm and present various extensions of this approach for applications in speech processing.
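
The abstract does not spell out the loss function, so the following is only a minimal sketch of one commonly used minimum classification error (MCE) formulation, assumed here for illustration: a misclassification measure compares the discriminant score of the correct class with a soft maximum over the competing classes, and a sigmoid turns that measure into a smooth approximation of the 0/1 error count. The function names and the parameters `eta` and `gamma` are hypothetical and are not taken from the paper.

```python
import math


def misclassification_measure(scores, k, eta=1.0):
    """d_k = -g_k + (1/eta) * log(mean over j != k of exp(eta * g_j)).

    scores: discriminant values g_j(x), one per class (assumed given).
    k: index of the correct class.
    eta: assumed smoothing constant; larger values approach max over rivals.
    """
    rivals = [g for j, g in enumerate(scores) if j != k]
    m = max(rivals)  # shift by the largest rival score for numerical stability
    soft_max = m + math.log(
        sum(math.exp(eta * (g - m)) for g in rivals) / len(rivals)
    ) / eta
    return -scores[k] + soft_max


def mce_loss(scores, k, eta=1.0, gamma=1.0):
    """Sigmoid of the misclassification measure: a smooth stand-in for 0/1 error."""
    d = misclassification_measure(scores, k, eta)
    return 1.0 / (1.0 + math.exp(-gamma * d))


if __name__ == "__main__":
    scores = [2.0, 0.5, -1.0]          # hypothetical discriminant scores
    print(mce_loss(scores, k=0))       # correct class scores highest: loss below 0.5
    print(mce_loss(scores, k=1))       # correct class is outscored: loss above 0.5
```

Because the loss depends only on the discriminant scores, it can be minimized directly (for example by gradient descent) without assuming that the scores come from a correct distribution model.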

Page 3


Toward Making Speech Part of People's Daily Life

Authors: Yonghong YAN
Affiliation: Intel China Research Center
Mailto: yonghong.yan@intel.com

ABSTRACT

See abstract in his paper

Page 11


A Corpus-Based Prosodic Modeling Method for Mandarin and Min-Nan Text-to-Speech Conversions

Authors: Sin-Horng CHEN
Affiliation: Department of Communication Engineering, Chiao Tung University, Hsinchu
Mailto: schen@cc.nctu.edu.tw

ABSTRACT

This talk gives an introduction to a recurrent neural network (RNN) based prosody synthesis method for both Mandarin and Min-Nan text-to-speech (TTS) conversions. The method uses a four-layer RNN to model the dependency of output prosodic information on input linguistic information. The main advantages of the method are its capability of learning many human prosody pronunciation rules automatically and its relatively short system development time. Two variations of the baseline RNN prosody synthesis method are also discussed. One uses an additional fuzzy-neural network to infer some fuzzy rules of affections from high-level linguistic features to assist the RNN prosody generation. The other uses additional statistical models of prosodic parameters to remove some affecting factors of linguistic features, reducing the load on the RNN.

Page 12


Processing Some Special Features in Chinese Speech Recognition

Authors: Bo XU, Taiyi HUANG
Affiliation: National Lab of Pattern Recognition (NLPR), Institute of Automation, Chinese Academy of Sciences

ABSTRACT

See abstract in their paper

Page 22