Abstract
In applied research, data are often collected from sources with high-dimensional multivariate output. Analysis of such data typically involves, e.g., the extraction and characterization of underlying patterns, often with the aim of identifying a small subset of significant variables or features. Variable and feature selection is well established in regression, whereas for other types of models it is more difficult. Penalization of the L1 norm provides an interesting avenue for such problems, as it produces a sparse solution and hence embeds variable selection. This paper gives a brief introduction to the mathematical properties of the L1 norm as a penalty, and presents examples of models extended with L1-norm penalties/constraints. The examples include PCA with sparse loadings, which enhances the interpretability of individual components; sparse inverse covariance estimation, used to unravel which variables affect each other; and a modified PCA for modeling data with (piecewise) constant responses, e.g. in process monitoring. All examples are demonstrated on real or synthetic data. The results indicate that sparse solutions, when appropriate, can enhance model interpretability.
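As a minimal sketch of the three kinds of L1-penalized models the abstract names, the following Python snippet uses scikit-learn's `Lasso`, `SparsePCA`, and `GraphicalLasso` estimators. These are generic off-the-shelf implementations, not the paper's own algorithms, and the synthetic data and penalty weights are illustrative assumptions.

```python
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.decomposition import SparsePCA
from sklearn.covariance import GraphicalLasso

rng = np.random.default_rng(0)

# Synthetic data: 100 samples, 20 variables, only 3 of which drive the response.
X = rng.standard_normal((100, 20))
y = 2.0 * X[:, 0] - 1.5 * X[:, 3] + 0.5 * X[:, 7] + 0.1 * rng.standard_normal(100)

# L1-penalized regression (lasso): the penalty drives most coefficients
# exactly to zero, so the nonzero entries constitute a variable selection.
lasso = Lasso(alpha=0.1).fit(X, y)
print("selected variables:", np.flatnonzero(lasso.coef_))

# Sparse PCA: an L1 penalty on the loadings yields components that involve
# only a few variables, easing interpretation of each single component.
spca = SparsePCA(n_components=2, alpha=1.0, random_state=0).fit(X)
print("nonzero loadings per component:",
      np.count_nonzero(spca.components_, axis=1))

# Sparse inverse covariance (graphical lasso): zeros in the estimated
# precision matrix indicate which variable pairs do not affect each other
# (conditional independence).
glasso = GraphicalLasso(alpha=0.2).fit(X)
print("nonzero precision entries:", np.count_nonzero(glasso.precision_))
```

The paper's modified PCA for (piecewise) constant responses has no direct off-the-shelf counterpart here and is omitted from the sketch.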
| Original language | English |
|---|---|
| Journal | Chemometrics and Intelligent Laboratory Systems |
| Volume | 119 |
| Pages (from-to) | 21-31 |
| Number of pages | 11 |
| ISSN | 0169-7439 |
| DOIs | |
| Publication status | Published - 1 Oct 2012 |