Using importance sampling to improve simulation in linkage analysis

Lars Ängquist; Ola Hössjer

doi:10.2202/1544-6115.1049

Corresponding author: Lars Ängquist

6 citations (Scopus)

Abstract

In this article we describe and discuss implementation of a weighted simulation procedure, importance sampling, in the context of nonparametric linkage analysis. The objective is to estimate genome-wide p-values, i.e. the probability that the maximal linkage score exceeds given thresholds under the null hypothesis of no linkage. In order to reduce variance of the estimate for large thresholds, we simulate linkage scores under a distribution different from the null with an artificial disease locus positioned somewhere along the genome. To compensate for the fact that we simulate under the wrong distribution, the simulated scores are reweighted using a certain likelihood ratio. If the sampling distribution is properly chosen, the variance of the corresponding estimate is reduced. This results in accurate genome-wide p-value estimates for a wide range of large thresholds with a substantially smaller cost adjusted relative efficiency with respect to standard unweighted simulation. We illustrate the performance of the method for several pedigree examples, discuss implementation including the amount of variance reduction and describe some possible generalizations.
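The reweighting idea in the abstract can be illustrated with a toy problem. The sketch below is not the authors' linkage procedure: it assumes a single standard normal score instead of a maximal linkage score over a genome, and it uses a simple mean shift (one form of exponential tilting) as the sampling distribution. The function names, the choice of shifting the mean to the threshold, and the sample size are all illustrative assumptions.

```python
import math
import random


def tail_prob_importance_sampling(threshold, n_samples, seed=0):
    """Estimate P(Z > threshold) for Z ~ N(0, 1) by sampling from N(shift, 1).

    Toy stand-in for a p-value at a large threshold; the mean shift to the
    threshold is an assumed, not an optimised, choice of sampling distribution.
    """
    rng = random.Random(seed)
    shift = threshold
    total = 0.0
    for _ in range(n_samples):
        x = rng.gauss(shift, 1.0)  # draw under the "wrong" (tilted) distribution
        if x > threshold:
            # Likelihood ratio of null density to sampling density:
            # phi(x) / phi(x - shift) = exp(-shift * x + shift**2 / 2)
            total += math.exp(-shift * x + 0.5 * shift * shift)
    return total / n_samples


def exact_tail_prob(threshold):
    """Exact P(Z > threshold) for a standard normal, via the complementary error function."""
    return 0.5 * math.erfc(threshold / math.sqrt(2.0))


if __name__ == "__main__":
    t = 4.0
    print("importance sampling:", tail_prob_importance_sampling(t, 20000))
    print("exact:              ", exact_tail_prob(t))
```

With unweighted simulation from the null, only a handful of 20000 draws would be expected to exceed a threshold of 4, so the estimate would be very noisy. Under the shifted distribution roughly half of the draws exceed it, and the likelihood ratio weights correct for the shift.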

Original language	English
Article number	5
Journal	Statistical Applications in Genetics and Molecular Biology
Volume	3
Issue number	1
ISSN	1544-6115
DOI	https://doi.org/10.2202/1544-6115.1049
Status	Published - 1 Jan 2004


Citation formats

@article{b394d51e07d0435f9710fa3f611f3956,

title = "Using importance sampling to improve simulation in linkage analysis",

abstract = "In this article we describe and discuss implementation of a weighted simulation procedure, importance sampling, in the context of nonparametric linkage analysis. The objective is to estimate genome-wide p-values, i.e. the probability that the maximal linkage score exceeds given thresholds under the null hypothesis of no linkage. In order to reduce variance of the estimate for large thresholds, we simulate linkage scores under a distribution different from the null with an artificial disease locus positioned somewhere along the genome. To compensate for the fact that we simulate under the wrong distribution, the simulated scores are reweighted using a certain likelihood ratio. If the sampling distribution is properly chosen, the variance of the corresponding estimate is reduced. This results in accurate genome-wide p-value estimates for a wide range of large thresholds with a substantially smaller cost adjusted relative efficiency with respect to standard unweighted simulation. We illustrate the performance of the method for several pedigree examples, discuss implementation including the amount of variance reduction and describe some possible generalizations.",

keywords = "Change of probability measure, Cost adjusted relative efficiency, Exponential tilting, Genome-wide significance, Importance sampling, Marker information, Nonparametric linkage analysis, Variance reduction",

author = "Lars {\"A}ngquist and Ola H{\"o}ssjer",

year = "2004",

month = jan,

day = "1",

doi = "10.2202/1544-6115.1049",

language = "English",

volume = "3",

journal = "Statistical Applications in Genetics and Molecular Biology",

issn = "1544-6115",

publisher = "Walter de Gruyter GmbH",

number = "1",

}

TY - JOUR

T1 - Using importance sampling to improve simulation in linkage analysis

AU - Ängquist, Lars

AU - Hössjer, Ola

PY - 2004/1/1

Y1 - 2004/1/1

N2 - In this article we describe and discuss implementation of a weighted simulation procedure, importance sampling, in the context of nonparametric linkage analysis. The objective is to estimate genome-wide p-values, i.e. the probability that the maximal linkage score exceeds given thresholds under the null hypothesis of no linkage. In order to reduce variance of the estimate for large thresholds, we simulate linkage scores under a distribution different from the null with an artificial disease locus positioned somewhere along the genome. To compensate for the fact that we simulate under the wrong distribution, the simulated scores are reweighted using a certain likelihood ratio. If the sampling distribution is properly chosen, the variance of the corresponding estimate is reduced. This results in accurate genome-wide p-value estimates for a wide range of large thresholds with a substantially smaller cost adjusted relative efficiency with respect to standard unweighted simulation. We illustrate the performance of the method for several pedigree examples, discuss implementation including the amount of variance reduction and describe some possible generalizations.

AB - In this article we describe and discuss implementation of a weighted simulation procedure, importance sampling, in the context of nonparametric linkage analysis. The objective is to estimate genome-wide p-values, i.e. the probability that the maximal linkage score exceeds given thresholds under the null hypothesis of no linkage. In order to reduce variance of the estimate for large thresholds, we simulate linkage scores under a distribution different from the null with an artificial disease locus positioned somewhere along the genome. To compensate for the fact that we simulate under the wrong distribution, the simulated scores are reweighted using a certain likelihood ratio. If the sampling distribution is properly chosen, the variance of the corresponding estimate is reduced. This results in accurate genome-wide p-value estimates for a wide range of large thresholds with a substantially smaller cost adjusted relative efficiency with respect to standard unweighted simulation. We illustrate the performance of the method for several pedigree examples, discuss implementation including the amount of variance reduction and describe some possible generalizations.

KW - Change of probability measure

KW - Cost adjusted relative efficiency

KW - Exponential tilting

KW - Genome-wide significance

KW - Importance sampling

KW - Marker information

KW - Nonparametric linkage analysis

KW - Variance reduction

UR - http://www.scopus.com/inward/record.url?scp=18544367812&partnerID=8YFLogxK

U2 - 10.2202/1544-6115.1049

DO - 10.2202/1544-6115.1049

M3 - Journal article

AN - SCOPUS:18544367812

SN - 1544-6115

VL - 3

JO - Statistical Applications in Genetics and Molecular Biology

JF - Statistical Applications in Genetics and Molecular Biology

IS - 1

M1 - 5

ER -
