Abstract
Pupils who do not complete their secondary education pose a major societal problem. Previous studies indicate that machine learning can predict high-school dropout, which makes early intervention possible. To the best of our knowledge, this paper presents the first large-scale study of this kind. It considers pupils who were at least six months into their Danish high-school education, with the goal of predicting dropout in the subsequent three months. We combined information from the MaCom Lectio study administration system, which is used by most Danish high schools, with data from public online sources (name database, travel planner, governmental statistics). In contrast to existing studies, which were based on only a few hundred students, we considered a considerably larger sample of 36,299 pupils for training and 36,299 for testing. We evaluated different machine learning methods. A random forest classifier achieved an accuracy of 93.47% and an area under the curve (AUC) of 0.965. Given the large sample, we conclude that machine learning can reliably detect high-school dropout from information already available to many schools.
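The abstract reports two evaluation figures, accuracy and area under the curve. As a minimal sketch of how such metrics are computed, the snippet below evaluates both on a tiny made-up example. It assumes the reported area under the curve refers to the ROC curve; the labels, scores, the 0.5 decision threshold, and the function names `accuracy` and `roc_auc` are illustrative assumptions, not data or code from the paper.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def roc_auc(y_true, scores):
    """ROC AUC via its pairwise interpretation: the probability that a
    randomly chosen positive is scored above a randomly chosen negative
    (ties count as half a win)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = dropout, 0 = no dropout.
y_true = [1, 0, 0, 1, 0, 0, 1, 0]
# Hypothetical classifier scores (e.g. predicted dropout probabilities).
scores = [0.9, 0.2, 0.4, 0.6, 0.7, 0.1, 0.8, 0.3]
# Assumed decision threshold of 0.5 to turn scores into class labels.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

print(f"accuracy = {accuracy(y_true, y_pred):.3f}")
print(f"ROC AUC  = {roc_auc(y_true, scores):.3f}")
```

Accuracy depends on the chosen threshold, whereas the AUC summarizes how well the scores rank positives above negatives across all thresholds, which is one reason the two are often reported together.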
Original language | English |
---|---|
Title | Proceedings. ESANN 2015 : 23rd European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning |
Editors | Michel Verleysen |
Number of pages | 6 |
Publisher | i6doc.com |
Publication date | 2015 |
Pages | 319-324 |
ISBN (Print) | 978-2-87587-014-8 |
ISBN (Electronic) | 978-2-87587-015-5 |
Status | Published - 2015 |
Event | 23rd European Symposium on Artificial Neural Networks - Bruges, Belgium. Duration: 22 Apr 2015 → 24 Apr 2015. Conference number: 23 |
Conference
Conference | 23rd European Symposium on Artificial Neural Networks |
---|---|
Number | 23 |
Country/Territory | Belgium |
City | Bruges |
Period | 22/04/2015 → 24/04/2015 |