Retrieving radio news broadcasts in Danish: accuracy and categorization of unrecognized words

Morten Hertzum, Haakon Lund, Rasmus Troelsgård

Abstract

Digital archives of radio news broadcasts can possibly be made searchable by combining speech recognition with information retrieval. We explore this possibility for the retrieval of news broadcasts in Danish. On average, 84% of the words in the broadcasts were recognized. Most of the unrecognized words were compounds, names, and other words that appear valuable for retrieval. Thus, the set of words describing a broadcast has to be expanded to compensate for the recognition errors. We discuss doing this by exploiting the alternative matches from the speech recognizer and by extracting words from a related corpus.
Original language: English
Title: OzCHI'16: The 28th Australian Conference on Computer-Human Interaction
Number of pages: 5
Place of publication: New York
Publisher: ACM
Publication date: 29 Nov. 2016
Pages: 160-164
ISBN (Electronic): 978-1-4503-4618-4
Status: Published - 29 Nov. 2016
Event: Australian Conference on Human-Computer Interaction - Launceston, Australia
Duration: 29 Nov. 2016 to 2 Dec. 2016
Conference number: 28
http://www.ozchi.org/2016/index.html

Conference

Conference: Australian Conference on Human-Computer Interaction
Number: 28
Location: Launceston
Country/Territory: Australia
Period: 29/11/2016 to 02/12/2016