Consistency of estimators of population scaled parameters using composite likelihood

Carsten Wiuf

doi:10.1007/s00285-006-0031-0

Carsten Wiuf^*

^*Corresponding author for this work

32 Citations (Scopus)

Abstract

Composite likelihood methods have become very popular for the analysis of large-scale genomic data sets because of the computational intractability of the basic coalescent process and its generalizations: it is virtually impossible to calculate the likelihood of an observed data set spanning a large chromosomal region without using approximate or heuristic methods. Composite likelihood methods are approximate methods; in the present article, the composite likelihood is written as a product of likelihoods, one for each of a number of smaller regions that together make up the whole region from which data are collected. A very general framework for neutral coalescent models is presented and discussed. The framework comprises many of the most popular coalescent models currently used for the analysis of genetic data. Assuming that data are collected from a series of consecutive regions of equal size, it is shown that the observed data form a stationary, ergodic process. General conditions are given under which the maximum composite likelihood estimator of the parameters describing the model (e.g. mutation rates, demographic parameters and the recombination rate) is consistent as the number of regions tends to infinity.
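The idea in the abstract can be illustrated with a toy sketch. It is not the coalescent framework of the article: the model below is an assumed stand-in in which each region's count is Poisson with mean `theta` and neighbouring regions are correlated because they share a hidden component, so the sequence of regions is stationary and ergodic but not independent. The composite likelihood treats the regions as if they were independent and multiplies their marginal likelihoods. All function names and the parameter `theta` are hypothetical and chosen for illustration only.

```python
import math
import random


def poisson(rng, lam):
    """Draw one Poisson(lam) variate with Knuth's method (adequate for small lam)."""
    threshold = math.exp(-lam)
    k, p = 0, rng.random()
    while p > threshold:
        k += 1
        p *= rng.random()
    return k


def simulate_regions(theta, n_regions, rng):
    """Toy stationary, ergodic sequence (an assumption, not a coalescent model).

    Region i observes shared[i] + shared[i + 1], so each count is
    Poisson(theta) marginally, while adjacent regions are dependent.
    """
    shared = [poisson(rng, theta / 2.0) for _ in range(n_regions + 1)]
    return [shared[i] + shared[i + 1] for i in range(n_regions)]


def composite_loglik(theta, counts):
    """Sum of per-region Poisson log-likelihoods, ignoring dependence between regions."""
    return sum(x * math.log(theta) - theta - math.lgamma(x + 1) for x in counts)


def max_composite_estimate(counts):
    """Maximiser of composite_loglik; for Poisson marginals it is the sample mean."""
    return sum(counts) / len(counts)


if __name__ == "__main__":
    rng = random.Random(0)
    true_theta = 2.0
    for n in (100, 10_000, 100_000):
        counts = simulate_regions(true_theta, n, rng)
        print(n, round(max_composite_estimate(counts), 3))
```

Run as a script, the printed estimates should settle near the assumed true value of 2.0 as the number of regions grows. In this toy case that follows from the ergodic theorem applied to the sample mean, which is the same kind of argument the abstract points to for the general setting.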

Original language	English
Journal	Journal of Mathematical Biology
Volume	53
Issue number	5
Pages (from-to)	821-841
Number of pages	21
ISSN	0303-6812
DOIs	https://doi.org/10.1007/s00285-006-0031-0
Publication status	Published - 1 Nov 2006
Externally published	Yes

Keywords

Coalescent theory
Composite likelihood
Consistency
Estimator
Genomic data

Cite this

@article{2eee2bc0a7514475aa63fa86f08a5dec,

title = "Consistency of estimators of population scaled parameters using composite likelihood",

abstract = "Composite likelihood methods have become very popular for the analysis of large-scale genomic data sets because of the computational intractability of the basic coalescent process and its generalizations: It is virtually impossible to calculate the likelihood of an observed data set spanning a large chromosomal region without using approximate or heuristic methods. Composite likelihood methods are approximate methods and, in the present article, assume the likelihood is written as a product of likelihoods, one for each of a number of smaller regions that together make up the whole region from which data is collected. A very general framework for neutral coalescent models is presented and discussed. The framework comprises many of the most popular coalescent models that are currently used for analysis of genetic data. Assume data is collected from a series of consecutive regions of equal size. Then it is shown that the observed data forms a stationary, ergodic process. General conditions are given under which the maximum composite estimator of the parameters describing the model (e.g. mutation rates, demographic parameters and the recombination rate) is a consistent estimator as the number of regions tends to infinity.",

keywords = "Coalescent theory, Composite likelihood, Consistency, Estimator, Genomic data",

author = "Carsten Wiuf",

year = "2006",

month = nov,

day = "1",

doi = "10.1007/s00285-006-0031-0",

language = "English",

volume = "53",

pages = "821--841",

journal = "Journal of Mathematical Biology",

issn = "0303-6812",

publisher = "Springer",

number = "5",

}

TY - JOUR

T1 - Consistency of estimators of population scaled parameters using composite likelihood

AU - Wiuf, Carsten

PY - 2006/11/1

Y1 - 2006/11/1

AB - Composite likelihood methods have become very popular for the analysis of large-scale genomic data sets because of the computational intractability of the basic coalescent process and its generalizations: It is virtually impossible to calculate the likelihood of an observed data set spanning a large chromosomal region without using approximate or heuristic methods. Composite likelihood methods are approximate methods and, in the present article, assume the likelihood is written as a product of likelihoods, one for each of a number of smaller regions that together make up the whole region from which data is collected. A very general framework for neutral coalescent models is presented and discussed. The framework comprises many of the most popular coalescent models that are currently used for analysis of genetic data. Assume data is collected from a series of consecutive regions of equal size. Then it is shown that the observed data forms a stationary, ergodic process. General conditions are given under which the maximum composite estimator of the parameters describing the model (e.g. mutation rates, demographic parameters and the recombination rate) is a consistent estimator as the number of regions tends to infinity.

KW - Coalescent theory

KW - Composite likelihood

KW - Consistency

KW - Estimator

KW - Genomic data

UR - http://www.scopus.com/inward/record.url?scp=33750543338&partnerID=8YFLogxK

U2 - 10.1007/s00285-006-0031-0

DO - 10.1007/s00285-006-0031-0

M3 - Journal article

C2 - 16960689

AN - SCOPUS:33750543338

SN - 0303-6812

VL - 53

SP - 821

EP - 841

JO - Journal of Mathematical Biology

JF - Journal of Mathematical Biology

IS - 5

ER -
