Abstract
The problem of detecting scientific fraud
using machine learning was recently introduced,
with initial, positive results from
a model taking into account various general
indicators. The results seem to suggest
that writing style is predictive of scientific
fraud. We revisit these initial experiments,
and show that the leave-one-out
testing procedure they used likely leads to
a slight over-estimate of the predictability,
but also that simple models can outperform
their proposed model by some margin.
We go on to explore more abstract
linguistic features, such as linguistic complexity
and discourse structure, only to obtain
negative results. Upon analyzing our
models, we do see some interesting patterns,
though: Scientific fraud, for example,
contains less comparison, as well as
different types of hedging and ways of presenting
logical reasoning.
Original language | English |
---|---|
Title | Proceedings of the Workshop on Stylistic Variation |
Number of pages | 6 |
Publisher | Association for Computational Linguistics |
Publication date | 2017 |
Pages | 37-42 |
ISBN (Print) | 978-1-945626-99-9 |
Status | Published - 2017 |
Event | Workshop on Stylistic Variation - Copenhagen, Denmark. Duration: 8 Sep 2017 → 8 Sep 2017 |
Workshop
Workshop | Workshop on Stylistic Variation |
---|---|
Country/Territory | Denmark |
City | Copenhagen |
Period | 08/09/2017 → 08/09/2017 |