Abstract
Recent work in geolocation has made several hypotheses about what linguistic markers are relevant to detect where people write from. In this paper, we examine six hypotheses against a corpus consisting of all geo-tagged tweets from the US, or whose geo-tags could be inferred, in a 19% sample of Twitter history. Our experiments lend support to all six hypotheses, including that spelling variants and hashtags are strong predictors of location. We also study what kinds of common nouns are predictive of location after controlling for named entities such as dolphins or sharks.
Original language | English |
---|---|
Title of host publication | Proceedings of the 3rd Workshop on Noisy User-generated Text |
Number of pages | 6 |
Publisher | Association for Computational Linguistics |
Publication date | 2017 |
Pages | 62-67 |
ISBN (Print) | 978-1-945626-94-4 |
Publication status | Published - 2017 |
Event | 3rd Workshop on Noisy User-generated Text - Copenhagen, Denmark. Duration: 7 Sept 2017 → 7 Sept 2017 |
Conference
Conference | 3rd Workshop on Noisy User-generated Text |
---|---|
Country/Territory | Denmark |
City | Copenhagen |
Period | 07/09/2017 → 07/09/2017 |