ISCA Archive SPECOM 2004

Integration and fusion aspects of speech and handwriting media

Sascha Schimke, Thomas Vogel, Claus Vielhauer, Jana Dittmann

In this paper we discuss synchronization approaches for the fusion of speech and handwriting data at the signal representation level. Using modalities in addition to speech has many advantages; for example, bimodal signals have the potential to increase the accuracy of recognition systems. Furthermore, we intend to give users more flexibility in human-computer communication by allowing them to choose their preferred modality. After discussing these goals, we analyze different ways of synchronizing media streams. Besides approaches based on synchronized time stamp protocols carried as additional metadata, we focus on a concept for synchronization that embeds the data stream of one modality into the other using digital watermarking techniques. Here we introduce the general concept of direct embedding and analyze the watermarking capacity (payload) necessary for synchronization. Finally, we consider aspects of information retrieval in multimodal documents.


Cite as: Schimke, S., Vogel, T., Vielhauer, C., Dittmann, J. (2004) Integration and fusion aspects of speech and handwriting media. Proc. 9th Conference on Speech and Computer (SPECOM 2004), 42-46

@inproceedings{schimke04_specom,
  author={Sascha Schimke and Thomas Vogel and Claus Vielhauer and Jana Dittmann},
  title={{Integration and fusion aspects of speech and handwriting media}},
  year=2004,
  booktitle={Proc. 9th Conference on Speech and Computer (SPECOM 2004)},
  pages={42--46}
}