EUROSPEECH 2001 Scandinavia
We conducted human language identification experiments with Japanese and bilingual subjects, using signals with reduced segmental information. American English and Japanese excerpts from the OGI_TS Corpus were processed by spectral-envelope removal (SER), vowel extraction from SER (VES), and temporal-envelope modulation (TEM). With the SER signal, in which the spectral envelope is eliminated, subjects could still identify the languages fairly successfully. With the VES signal, which retains only the vowel sections of the SER signal, the identification score was low. With the TEM signal, composed of white-noise-driven intensity envelopes from several frequency bands, the identification score rose as the number of bands increased. Results varied depending on the stimulus language, and Japanese and bilingual subjects scored differently. These results indicate that humans can identify languages from a signal with drastically reduced segmental information. They also suggest variation due to the phonetic attributes of the languages and the subjects' knowledge.
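The TEM description above (white noise driven by per-band intensity envelopes) can be illustrated with a minimal sketch. It is not the authors' processing chain: the band edges, the second-order band-pass filter, the 50 Hz envelope smoothing, and the function names `bandpass`, `envelope`, and `tem` are all assumptions chosen for illustration, and the input is a synthetic tone rather than corpus speech.

```python
import math
import random


def bandpass(x, fs, lo, hi):
    """Second-order band-pass biquad (assumed filter; the paper's filter is not given)."""
    f0 = math.sqrt(lo * hi)              # geometric centre of the band
    q = f0 / (hi - lo)
    w0 = 2 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2 * q)
    b0, b2 = alpha, -alpha               # b1 is zero for this band-pass form
    a0, a1, a2 = 1 + alpha, -2 * math.cos(w0), 1 - alpha
    y, x1, x2, y1, y2 = [], 0.0, 0.0, 0.0, 0.0
    for s in x:
        out = (b0 * s + b2 * x2 - a1 * y1 - a2 * y2) / a0
        x2, x1, y2, y1 = x1, s, y1, out
        y.append(out)
    return y


def envelope(x, fs, cutoff=50.0):
    """Intensity envelope: full-wave rectification, then one-pole low-pass smoothing."""
    a = math.exp(-2 * math.pi * cutoff / fs)
    env, state = [], 0.0
    for s in x:
        state = a * state + (1 - a) * abs(s)
        env.append(state)
    return env


def tem(signal, fs, edges, seed=0):
    """Sum of band-limited white noise, each band scaled by that band's envelope."""
    rng = random.Random(seed)
    out = [0.0] * len(signal)
    for lo, hi in zip(edges[:-1], edges[1:]):
        env = envelope(bandpass(signal, fs, lo, hi), fs)
        noise = bandpass([rng.gauss(0.0, 1.0) for _ in signal], fs, lo, hi)
        for i, (e, n) in enumerate(zip(env, noise)):
            out[i] += e * n
    return out


# Synthetic stand-in for speech: 0.25 s of a 1 kHz tone followed by 0.25 s of silence.
fs = 8000
tone = [math.sin(2 * math.pi * 1000 * t / fs) for t in range(2000)]
speech_like = tone + [0.0] * 2000

# Hypothetical band edges in Hz; more edges give more bands.
tem_signal = tem(speech_like, fs, edges=[100, 600, 1500, 3400])
```

Adding entries to `edges` increases the number of bands, which corresponds to the condition under which the identification score rose. The output keeps the timing and loudness contour of the input while replacing its fine spectral detail with noise.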
Bibliographic reference. Komatsu, Masahiko / Mori, Kazuya / Arai, Takayuki / Murahara, Yuji (2001): "Human language identification with reduced segmental information: comparison between monolinguals and bilinguals", In EUROSPEECH-2001, 149-152.