Using electronic patient records to discover disease correlations and stratify patient cohorts

Francisco S Roque; Peter B Jensen; Henriette Schmock; Marlene Dalgaard; Massimo Andreatta; Thomas Hansen; Karen Søeby; Søren Bredkjær; Anders Juul; Thomas Werge; Lars J Jensen; Søren Brunak

doi:10.1371/journal.pcbi.1002141

180 Citations (Scopus)

Abstract

Electronic patient records remain a rather unexplored, but potentially rich data source for discovering correlations between diseases. We describe a general approach for gathering phenotypic descriptions of patients from medical records in a systematic and non-cohort dependent manner. By extracting phenotype information from the free-text in such records we demonstrate that we can extend the information contained in the structured record data, and use it for producing fine-grained patient stratification and disease co-occurrence statistics. The approach uses a dictionary based on the International Classification of Disease ontology and is therefore in principle language independent. As a use case we show how records from a Danish psychiatric hospital lead to the identification of disease correlations, which subsequently can be mapped to systems biology frameworks.
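
The dictionary-based idea in the abstract can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration, not the authors' pipeline: the three-term dictionary, the example notes, the function names (`extract_codes`, `cooccurrence`, `lift`), and the use of a plain observed/expected ratio as the co-occurrence statistic. Clinical text mining also has to handle negation, family history, and spelling variants, which this sketch ignores.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical mini-dictionary mapping a free-text term to an ICD-10 category.
# A real dictionary would cover the full ICD terminology in the target language.
ICD10_TERMS = {
    "schizophrenia": "F20",
    "depression": "F32",
    "obesity": "E66",
}


def extract_codes(note, dictionary=ICD10_TERMS):
    """Return the set of ICD-10 codes whose dictionary term occurs in the note."""
    text = note.lower()
    return {
        code
        for term, code in dictionary.items()
        if re.search(r"\b" + re.escape(term) + r"\b", text)
    }


def cooccurrence(patient_codes):
    """Count single codes and unordered code pairs across patients."""
    singles, pairs = Counter(), Counter()
    for codes in patient_codes:
        singles.update(codes)
        pairs.update(combinations(sorted(codes), 2))
    return singles, pairs


def lift(pair, singles, pairs, n_patients):
    """Observed pair count divided by the count expected under independence."""
    a, b = pair
    expected = singles[a] * singles[b] / n_patients
    return pairs[pair] / expected


# Made-up notes, one per patient, purely for illustration.
notes = [
    "Patient with schizophrenia and obesity.",
    "History of depression; obesity noted.",
    "Schizophrenia, long-standing. Obesity.",
    "Depression, recurrent.",
]
patient_codes = [extract_codes(n) for n in notes]
singles, pairs = cooccurrence(patient_codes)
```

Calling `lift(("E66", "F20"), singles, pairs, len(notes))` then compares how often the two codes appear together with what independence would predict; values above 1 suggest the pair co-occurs more often than chance in this toy sample.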

Original language	English
Journal	PLoS Computational Biology
Volume	7
Issue number	8
Pages (from-to)	e1002141
Number of pages	10
ISSN	1553-734X
DOI	https://doi.org/10.1371/journal.pcbi.1002141
Status	Published - Aug 2011

Access to document

10.1371/journal.pcbi.1002141

Citation formats

@article{4b0328de49b14b85a63ae97bb274506e,
  title = "Using electronic patient records to discover disease correlations and stratify patient cohorts",
  abstract = "Electronic patient records remain a rather unexplored, but potentially rich data source for discovering correlations between diseases. We describe a general approach for gathering phenotypic descriptions of patients from medical records in a systematic and non-cohort dependent manner. By extracting phenotype information from the free-text in such records we demonstrate that we can extend the information contained in the structured record data, and use it for producing fine-grained patient stratification and disease co-occurrence statistics. The approach uses a dictionary based on the International Classification of Disease ontology and is therefore in principle language independent. As a use case we show how records from a Danish psychiatric hospital lead to the identification of disease correlations, which subsequently can be mapped to systems biology frameworks.",
  keywords = "Cluster Analysis, Cohort Studies, Comorbidity, Computational Biology, Data Collection, Data Mining, Electronic Health Records, Humans, International Classification of Diseases, Reproducibility of Results",
  author = "Roque, {Francisco S} and Jensen, {Peter B} and Henriette Schmock and Marlene Dalgaard and Massimo Andreatta and Thomas Hansen and Karen S{\o}eby and S{\o}ren Bredkj{\ae}r and Anders Juul and Thomas Werge and Jensen, {Lars J} and S{\o}ren Brunak",
  year = "2011",
  month = aug,
  doi = "10.1371/journal.pcbi.1002141",
  language = "English",
  volume = "7",
  pages = "e1002141",
  journal = "PLoS Computational Biology (Online)",
  issn = "1553-734X",
  publisher = "Public Library of Science",
  number = "8",
}

TY  - JOUR
T1  - Using electronic patient records to discover disease correlations and stratify patient cohorts
AU  - Roque, Francisco S
AU  - Jensen, Peter B
AU  - Schmock, Henriette
AU  - Dalgaard, Marlene
AU  - Andreatta, Massimo
AU  - Hansen, Thomas
AU  - Søeby, Karen
AU  - Bredkjær, Søren
AU  - Juul, Anders
AU  - Werge, Thomas
AU  - Jensen, Lars J
AU  - Brunak, Søren
PY  - 2011/8
Y1  - 2011/8
N2  - Electronic patient records remain a rather unexplored, but potentially rich data source for discovering correlations between diseases. We describe a general approach for gathering phenotypic descriptions of patients from medical records in a systematic and non-cohort dependent manner. By extracting phenotype information from the free-text in such records we demonstrate that we can extend the information contained in the structured record data, and use it for producing fine-grained patient stratification and disease co-occurrence statistics. The approach uses a dictionary based on the International Classification of Disease ontology and is therefore in principle language independent. As a use case we show how records from a Danish psychiatric hospital lead to the identification of disease correlations, which subsequently can be mapped to systems biology frameworks.
AB  - Electronic patient records remain a rather unexplored, but potentially rich data source for discovering correlations between diseases. We describe a general approach for gathering phenotypic descriptions of patients from medical records in a systematic and non-cohort dependent manner. By extracting phenotype information from the free-text in such records we demonstrate that we can extend the information contained in the structured record data, and use it for producing fine-grained patient stratification and disease co-occurrence statistics. The approach uses a dictionary based on the International Classification of Disease ontology and is therefore in principle language independent. As a use case we show how records from a Danish psychiatric hospital lead to the identification of disease correlations, which subsequently can be mapped to systems biology frameworks.
KW  - Cluster Analysis
KW  - Cohort Studies
KW  - Comorbidity
KW  - Computational Biology
KW  - Data Collection
KW  - Data Mining
KW  - Electronic Health Records
KW  - Humans
KW  - International Classification of Diseases
KW  - Reproducibility of Results
U2  - 10.1371/journal.pcbi.1002141
DO  - 10.1371/journal.pcbi.1002141
M3  - Journal article
C2  - 21901084
SN  - 1553-734X
VL  - 7
SP  - e1002141
JO  - PLoS Computational Biology (Online)
JF  - PLoS Computational Biology (Online)
IS  - 8
ER  - 
