Demographic Factors Improve Classification Performance

Dirk Hovy

Closed: Center for Sprogteknologi

58 citations (Scopus)

Abstract

Extra-linguistic factors influence language use, and are accounted for by speakers and listeners. Most natural language processing (NLP) tasks to date, however, treat language as uniform. This assumption can harm performance. We investigate the effect of including demographic information on performance in a variety of text-classification tasks. We find that by including age or gender information, we consistently and significantly improve performance over demographic-agnostic models. These results hold across three text-classification tasks in five languages.

Original language	English
Title of host publication	Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing
Number of pages	11
Volume	Volume 1
Publisher	Association for Computational Linguistics
Publication date	2015
Pages	752-762
ISBN (Print)	978-1-941643-72-3
Status	Published - 2015

Access to document

http://www.aclweb.org/anthology/P/P15/P15-1073.pdf

Citation formats

Demographic Factors Improve Classification Performance. / Hovy, Dirk.
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing. Volume 1. Association for Computational Linguistics, 2015. pp. 752-762.

Publication: Contribution to book/anthology/report › Conference contribution in proceedings › Research › peer-reviewed

@inproceedings{3612cc2f53b84cddb50a69b3259d88fb,

title = "Demographic Factors Improve Classification Performance",

abstract = "Extra-linguistic factors influence language use, and are accounted for by speakers and listeners. Most natural language processing (NLP) tasks to date, however, treat language as uniform. This assumption can harm performance. We investigate the effect of including demographic information on performance in a variety of text-classification tasks. We find that by including age or gender information, we consistently and significantly improve performance over demographic-agnostic models. These results hold across three text-classification tasks in five languages.",

author = "Dirk Hovy",

year = "2015",

language = "English",

isbn = "978-1-941643-72-3",

volume = "1",

pages = "752--762",

booktitle = "Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing",

publisher = "Association for Computational Linguistics",

}

TY - GEN

T1 - Demographic Factors Improve Classification Performance

AU - Hovy, Dirk

PY - 2015

Y1 - 2015

N2 - Extra-linguistic factors influence language use, and are accounted for by speakers and listeners. Most natural language processing (NLP) tasks to date, however, treat language as uniform. This assumption can harm performance. We investigate the effect of including demographic information on performance in a variety of text-classification tasks. We find that by including age or gender information, we consistently and significantly improve performance over demographic-agnostic models. These results hold across three text-classification tasks in five languages.

AB - Extra-linguistic factors influence language use, and are accounted for by speakers and listeners. Most natural language processing (NLP) tasks to date, however, treat language as uniform. This assumption can harm performance. We investigate the effect of including demographic information on performance in a variety of text-classification tasks. We find that by including age or gender information, we consistently and significantly improve performance over demographic-agnostic models. These results hold across three text-classification tasks in five languages.

M3 - Article in proceedings

SN - 978-1-941643-72-3

VL - 1

SP - 752

EP - 762

BT - Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing

PB - Association for Computational Linguistics

ER -
