AWARE: Exploiting evaluation measures to combine multiple assessors

Marco Ferrante; Nicola Ferro; Maria Maistro

doi:10.1145/3110217

5 Citations (Scopus)

Abstract

We propose the Assessor-driven Weighted Averages for Retrieval Evaluation (AWARE) probabilistic framework, a novel methodology for dealing with multiple crowd assessors that may be contradictory and/or noisy. By modeling relevance judgements and crowd assessors as sources of uncertainty, AWARE takes the expectation of a generic performance measure, like Average Precision, composed with these random variables. In this way, it approaches the problem of aggregating different crowd assessors from a new perspective, that is, directly combining the performance measures computed on the ground truth generated by the crowd assessors instead of adopting some classification technique to merge the labels produced by them. We propose several unsupervised estimators that instantiate the AWARE framework and we compare them with state-of-the-art approaches, that is, Majority Vote and Expectation Maximization, on TREC collections. We found that AWARE approaches improve in terms of their capability of correctly ranking systems and predicting their actual performance scores.
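The contrast drawn in the abstract, merging labels first versus combining measure scores directly, can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy judgements, and in particular the uniform weights are assumptions made here for illustration, since the abstract does not specify how the unsupervised estimators assign weights.

```python
from typing import Dict, List, Optional, Set


def average_precision(ranking: List[str], relevant: Set[str]) -> float:
    """Average Precision of a ranked list against a set of relevant doc ids."""
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for rank, doc in enumerate(ranking, start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank  # precision at this relevant document
    return total / len(relevant)


def relevant_set(judgement: Dict[str, int]) -> Set[str]:
    """Documents one assessor marked as relevant (binary labels assumed)."""
    return {doc for doc, label in judgement.items() if label}


def majority_vote(judgements: List[Dict[str, int]]) -> Set[str]:
    """Label merging: relevant if more than half of the assessors say so."""
    docs = set().union(*judgements)
    return {
        d for d in docs
        if 2 * sum(j.get(d, 0) for j in judgements) > len(judgements)
    }


def weighted_measure(
    ranking: List[str],
    judgements: List[Dict[str, int]],
    weights: Optional[List[float]] = None,
) -> float:
    """Weighted average of AP computed on each assessor's own ground truth.

    Uniform weights are a placeholder assumption, not an AWARE estimator.
    """
    if weights is None:
        weights = [1.0] * len(judgements)
    scores = [average_precision(ranking, relevant_set(j)) for j in judgements]
    return sum(w * s for w, s in zip(weights, scores)) / sum(weights)


# Hypothetical toy data: one system ranking judged by three assessors.
ranking = ["d1", "d2", "d3", "d4"]
assessors = [
    {"d1": 1, "d2": 0, "d3": 1, "d4": 0},
    {"d1": 1, "d2": 1, "d3": 0, "d4": 0},
    {"d1": 0, "d2": 0, "d3": 1, "d4": 0},
]

merged_ap = average_precision(ranking, majority_vote(assessors))
combined_ap = weighted_measure(ranking, assessors)
```

On this toy data the two routes give different scores for the same system (5/6 after majority vote, 13/18 with the uniform weighted average), which is the distinction the abstract describes. Replacing the uniform weights with assessor-specific ones is where an actual estimator would plug in.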

Original language	English
Article number	20
Journal	ACM Transactions on Information Systems
Volume	36
Issue number	2
ISSN	1046-8188
DOI	https://doi.org/10.1145/3110217
Status	Published - 1 Aug 2017
Published externally	Yes

Access to document

10.1145/3110217

Other files and links

Link to publication in Scopus

Citation formats

@article{48f6e8528c15486daab0ede3a6095d48,

title = "AWARE: Exploiting evaluation measures to combine multiple assessors",

abstract = "We propose the Assessor-driven Weighted Averages for Retrieval Evaluation (AWARE) probabilistic framework, a novel methodology for dealing with multiple crowd assessors that may be contradictory and/or noisy. By modeling relevance judgements and crowd assessors as sources of uncertainty, AWARE takes the expectation of a generic performance measure, like Average Precision, composed with these random variables. In this way, it approaches the problem of aggregating different crowd assessors from a new perspective, that is, directly combining the performance measures computed on the ground truth generated by the crowd assessors instead of adopting some classification technique to merge the labels produced by them. We propose several unsupervised estimators that instantiate the AWARE framework and we compare them with state-of-the-art approaches, that is, Majority Vote and Expectation Maximization, on TREC collections. We found that AWARE approaches improve in terms of their capability of correctly ranking systems and predicting their actual performance scores.",

keywords = "AWARE, Crowdsourcing, Performance measure, Unsupervised estimators, Weighted average",

author = "Marco Ferrante and Nicola Ferro and Maria Maistro",

year = "2017",

month = aug,

day = "1",

doi = "10.1145/3110217",

language = "English",

volume = "36",

journal = "ACM Transactions on Information Systems",

issn = "1046-8188",

publisher = "Association for Computing Machinery, Inc.",

number = "2",

}

TY - JOUR

T1 - AWARE

T2 - Exploiting evaluation measures to combine multiple assessors

AU - Ferrante, Marco

AU - Ferro, Nicola

AU - Maistro, Maria

PY - 2017/8/1

Y1 - 2017/8/1

N2 - We propose the Assessor-driven Weighted Averages for Retrieval Evaluation (AWARE) probabilistic framework, a novel methodology for dealing with multiple crowd assessors that may be contradictory and/or noisy. By modeling relevance judgements and crowd assessors as sources of uncertainty, AWARE takes the expectation of a generic performance measure, like Average Precision, composed with these random variables. In this way, it approaches the problem of aggregating different crowd assessors from a new perspective, that is, directly combining the performance measures computed on the ground truth generated by the crowd assessors instead of adopting some classification technique to merge the labels produced by them. We propose several unsupervised estimators that instantiate the AWARE framework and we compare them with state-of-the-art approaches, that is, Majority Vote and Expectation Maximization, on TREC collections. We found that AWARE approaches improve in terms of their capability of correctly ranking systems and predicting their actual performance scores.

AB - We propose the Assessor-driven Weighted Averages for Retrieval Evaluation (AWARE) probabilistic framework, a novel methodology for dealing with multiple crowd assessors that may be contradictory and/or noisy. By modeling relevance judgements and crowd assessors as sources of uncertainty, AWARE takes the expectation of a generic performance measure, like Average Precision, composed with these random variables. In this way, it approaches the problem of aggregating different crowd assessors from a new perspective, that is, directly combining the performance measures computed on the ground truth generated by the crowd assessors instead of adopting some classification technique to merge the labels produced by them. We propose several unsupervised estimators that instantiate the AWARE framework and we compare them with state-of-the-art approaches, that is, Majority Vote and Expectation Maximization, on TREC collections. We found that AWARE approaches improve in terms of their capability of correctly ranking systems and predicting their actual performance scores.

KW - AWARE

KW - Crowdsourcing

KW - Performance measure

KW - Unsupervised estimators

KW - Weighted average

UR - http://www.scopus.com/inward/record.url?scp=85028669422&partnerID=8YFLogxK

U2 - 10.1145/3110217

DO - 10.1145/3110217

M3 - Journal article

AN - SCOPUS:85028669422

SN - 1046-8188

VL - 36

JO - ACM Transactions on Information Systems

JF - ACM Transactions on Information Systems

IS - 2

M1 - 20

ER -
