Demographic Factors Improve Classification Performance

Dirk Hovy

58 Citations (Scopus)

Abstract

Extra-linguistic factors influence language use and are accounted for by speakers and listeners. Most natural language processing (NLP) tasks to date, however, treat language as uniform. This assumption can harm performance. We investigate the effect of including demographic information on performance in a variety of text-classification tasks. We find that by including age or gender information, we consistently and significantly improve performance over demographic-agnostic models. These results hold across three text-classification tasks in five languages.

Original language: English
Title of host publication: Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing
Number of pages: 11
Volume: 1
Publisher: Association for Computational Linguistics
Publication date: 2015
Pages: 752-762
ISBN (Print): 978-1-941643-72-3
Status: Published - 2015
