Abstract
Selecting an optimal subset of k out of d features for linear regression models, given n training instances, is often considered intractable for feature spaces with hundreds or thousands of dimensions. We propose an efficient massively parallel implementation that selects such optimal feature subsets in a brute-force fashion for small k. By exploiting the enormous compute power of modern parallel devices such as graphics processing units, it can handle thousands of input dimensions even on standard commodity hardware. We evaluate the practical runtime on artificial datasets and sketch the applicability of our framework in the context of astronomy.
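The brute-force strategy summarized in the abstract can be illustrated with a minimal sequential sketch (the paper's contribution is a massively parallel GPU implementation; this serial NumPy version, with a hypothetical `best_subset` helper, only shows the underlying enumeration over all size-k subsets with an ordinary least-squares fit per subset):

```python
# Minimal sequential sketch of brute-force best-subset selection for OLS.
# Assumption: small k, so enumerating all C(d, k) subsets is feasible.
from itertools import combinations

import numpy as np


def best_subset(X, y, k):
    """Return the size-k column index set of X minimizing the OLS
    residual sum of squares, found by exhaustive enumeration."""
    best_idx, best_rss = None, np.inf
    for idx in combinations(range(X.shape[1]), k):
        sub = X[:, idx]
        beta, rss, *_ = np.linalg.lstsq(sub, y, rcond=None)
        # lstsq returns residuals only for full-rank, overdetermined systems;
        # otherwise compute the residual sum of squares explicitly.
        rss_val = rss[0] if rss.size else float(np.sum((y - sub @ beta) ** 2))
        if rss_val < best_rss:
            best_idx, best_rss = idx, rss_val
    return best_idx, best_rss
```

Each subset's least-squares fit is independent of all others, which is what makes the problem amenable to the massive parallelism the paper exploits on GPUs.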
Original language | English |
---|---|
Title of host publication | 2017 IEEE Symposium Series on Computational Intelligence (SSCI) Proceedings |
Number of pages | 8 |
Publisher | IEEE |
Publication date | 1 Jul 2017 |
Pages | 1-8 |
ISBN (Electronic) | 978-1-5386-2726-6 |
DOIs | |
Publication status | Published - 1 Jul 2017 |
Event | 2017 IEEE Symposium Series on Computational Intelligence (SSCI) - Honolulu, United States |
Duration | 27 Nov 2017 → 1 Dec 2017 |
Conference
Conference | 2017 IEEE Symposium Series on Computational Intelligence (SSCI) |
---|---|
Country/Territory | United States |
City | Honolulu |
Period | 27/11/2017 → 01/12/2017 |
Keywords
- graphics processing units
- least squares approximations
- optimisation
- parallel processing
- regression analysis
- sensitivity analysis
- input dimensions
- linear regression models
- massively-parallel best subset selection
- optimal feature subsets
- optimal subset
- ordinary least-squares regression
- subset selection
- Computational modeling
- Graphics processing units
- Instruction sets
- Optimization
- Runtime
- Task analysis
- Training