Seasonal Web Search Query Selection for Influenza-Like Illness (ILI) Estimation

Niels Dalum Hansen, Kåre Mølbak, Ingemar Johansson Cox, Christina Lioma

6 Citations (Scopus)

Abstract

Influenza-like illness (ILI) estimation from web search data is an important web analytics task. The basic idea is to use the frequencies of queries in web search logs that are correlated with past ILI activity as features when estimating current ILI activity. It has been noted that since influenza is seasonal, this approach can lead to spurious correlations with features/queries that also exhibit seasonality, but have no relationship with ILI. Spurious correlations can, in turn, degrade performance. To address this issue, we propose modeling the seasonal variation in ILI activity and selecting queries that are correlated with the residual of the seasonal model and the observed ILI signal. Experimental results show that re-ranking queries obtained by Google Correlate based on their correlation with the residual strongly favours ILI-related queries.
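
The residual-based query selection described in the abstract can be sketched roughly as follows. This is an illustration only, not the authors' implementation: the seasonal model is assumed here to be a single annual sine/cosine pair fitted by least squares, the data are synthetic, and the names `seasonal_fit` and `rank_by_residual` are hypothetical.

```python
import numpy as np


def seasonal_fit(ili, period=52):
    """Least-squares fit of an intercept plus one annual sine/cosine pair.

    Assumption: this simple harmonic model stands in for whatever
    seasonal model is actually used; `period` is in weeks.
    """
    t = np.arange(len(ili))
    X = np.column_stack([
        np.ones(len(ili)),
        np.sin(2 * np.pi * t / period),
        np.cos(2 * np.pi * t / period),
    ])
    coef, *_ = np.linalg.lstsq(X, ili, rcond=None)
    return X @ coef


def rank_by_residual(ili, query_freqs, period=52):
    """Rank queries by Pearson correlation with the seasonal residual."""
    residual = ili - seasonal_fit(ili, period)
    scores = {
        name: float(np.corrcoef(freq, residual)[0, 1])
        for name, freq in query_freqs.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Synthetic example: three years of weekly data (all values are made up).
rng = np.random.default_rng(0)
weeks = np.arange(156)
season = np.sin(2 * np.pi * weeks / 52)
outbreak = rng.normal(size=weeks.size)  # non-seasonal ILI deviations
ili = 2.0 + season + 0.5 * outbreak

queries = {
    # Seasonal but unrelated to ILI (a spurious candidate).
    "seasonal_only": season + 0.1 * rng.normal(size=weeks.size),
    # Tracks both the season and the non-seasonal ILI deviations.
    "ili_related": season + 0.5 * outbreak + 0.1 * rng.normal(size=weeks.size),
}

ranking = rank_by_residual(ili, queries)
print(ranking)
```

In this toy setup both queries correlate strongly with the raw ILI signal because they share its seasonality, but only the query that follows the non-seasonal deviations keeps a high correlation once the seasonal fit is subtracted.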

Original language: English
Title: SIGIR '17 Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval
Publisher: Association for Computing Machinery
Publication date: 7 Aug 2017
Pages: 1197-1200
ISBN (Electronic): 978-1-4503-5022
DOI
Status: Published - 7 Aug 2017
Event: 40th International ACM SIGIR Conference on Research and Development in Information Retrieval: SIGIR '17 - Shinjuku, Tokyo, Japan
Duration: 7 Aug 2017 to 11 Aug 2017

Conference

Conference: 40th International ACM SIGIR Conference on Research and Development in Information Retrieval
Country/Territory: Japan
City: Shinjuku, Tokyo
Period: 07/08/2017 to 11/08/2017
