Demographic Factors Improve Classification Performance

Dirk Hovy

58 Citations (Scopus)

Abstract

Extra-linguistic factors influence language use, and are accounted for by speakers and listeners. Most natural language processing (NLP) tasks to date, however, treat language as uniform. This assumption can harm performance. We investigate the effect of including demographic information on performance in a variety of text-classification tasks. We find that by including age or gender information, we consistently and significantly improve performance over demographic-agnostic models. These results hold across three text-classification tasks in five languages.

Original language: English
Title of host publication: Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing
Number of pages: 11
Volume: Volume 1
Publisher: Association for Computational Linguistics
Publication date: 2015
Pages: 752-762
ISBN (Print): 978-1-941643-72-3
Publication status: Published - 2015
