EUROSPEECH 2003 - INTERSPEECH 2003
8th European Conference on Speech Communication and Technology

Geneva, Switzerland
September 1-4, 2003

Time Alignment for Scenario and Sounds with Voice, Music and BGM

Yamato Wada, Masahide Sugiyama

University of Aizu, Japan

This paper proposes a new method for time-aligning a scenario with sounds that contain voice, music and BGM (background music), with the aim of generating video captions automatically. The proposed method, the Voice-Music-Pause+BGM method, is based on the composition of voice and music models. Evaluation experiments show that the proposed method performs about 10 to 60 times better than conventional time alignment methods.

Bibliographic reference. Wada, Yamato / Sugiyama, Masahide (2003): "Time alignment for scenario and sounds with voice, music and BGM", in EUROSPEECH-2003, 445-448.