TY - GEN
T1 - Automatic speech recognition for live TV subtitling for hearing-impaired people
AU - Obach, Michael
AU - Lehr, Maider
AU - Arruti, Andoni
PY - 2007
Y1 - 2007
N2 - Most Spanish TV channels offer subtitles (closed captions) only for some of their pre-recorded programmes. Mainly because of the cost of specially trained stenographers and fast typists, subtitles are rarely available for live programmes such as news broadcasts and sports events. Progress in automatic speech recognition (ASR) opens a new path to live subtitling, but ASR works well only when trained to recognise a single voice and trained in advance on material related to the programme content. We developed an ASR-based prototype that could be used to automatically generate live subtitles as teletext for Spanish news broadcasts without human participation. The main goal was to evaluate the feasibility of using this technology to improve the quality of life of millions of hearing-impaired people, in accordance with applicable and future Spanish legislation. State-of-the-art speech recognition software for dictation (literal transcription of speech) and a commercial teletext generator conforming to Spanish standards were integrated with our own modules for improved pre-processing of the audio signal, voice normalization for speaker independence, speech/non-speech segmentation, and tools for generating and updating dictionaries. The prototype was validated in cooperation with a TV broadcaster, which provided audiovisual material for building the language corpus and specific dictionaries. System outputs were evaluated by organizations of the deaf and the hard of hearing. Results indicate that ASR is (still) not suitable for fully automated live subtitling. A delay of several seconds between speech and subtitle was observed. A limited word recognition rate, caused mainly by the large number of named entities and by the variability of speakers and acoustic conditions, sometimes made the news impossible to understand. We identified the lack of automatic punctuation as a major problem that reduced the readability of the subtitles and also affected recognition quality. Many results also apply to other languages and to areas of subtitling beyond television.
KW - Automatic Speech Recognition
KW - Closed Captioning
KW - Deaf and Hard of Hearing
KW - Hearing Impaired
KW - Live Subtitling
KW - Subtitling
KW - Teletext
UR - https://www.scopus.com/pages/publications/84865495940
M3 - Conference contribution
AN - SCOPUS:84865495940
SN - 9781586037918
T3 - Assistive Technology Research Series
SP - 286
EP - 291
BT - Challenges for Assistive Technology. AAATE 07
A2 - Eizmendi, Gorka
A2 - Azkoitia, Jose Miguel
A2 - Craddock, Gerald
ER -