Retrieving radio news broadcasts in Danish: accuracy and categorization of unrecognized words

Morten Hertzum, Haakon Lund, Rasmus Troelsgård

Abstract

Digital archives of radio news broadcasts can possibly be made searchable by combining speech recognition with information retrieval. We explore this possibility for the retrieval of news broadcasts in Danish. An average of 84% of the words in the broadcasts was recognized. Most of the unrecognized words were compounds, names, and other words that appear of value to retrieval. Thus, the set of words describing a broadcast has to be expanded to compensate for the recognition errors. We discuss doing this by exploiting the alternative matches from the speech recognizer and by extracting words from a related corpus.
Original language: English
Title of host publication: OzCHI'16: The 28th Australian Conference on Computer-Human Interaction
Number of pages: 5
Place of Publication: New York
Publisher: ACM
Publication date: 29 Nov 2016
Pages: 160-164
ISBN (Electronic): 978-1-4503-4618-4
Publication status: Published - 29 Nov 2016
Event: Australian Conference on Human-Computer Interaction - Launceston, Australia
Duration: 29 Nov 2016 to 2 Dec 2016
Conference number: 28
http://www.ozchi.org/2016/index.html

