On the number of siblings and p-th cousins in a large population sample

Vladimir Shchur; Rasmus Nielsen

doi:10.1007/s00285-018-1252-8

Abstract

The number of individuals in a random sample with close relatives in the sample is a quantity of interest when designing Genome Wide Association Studies and other cohort based genetic, and non-genetic, studies. In this paper, we develop expressions for the distribution and expectation of the number of p-th cousins in a sample from a population of size N under two diploid Wright–Fisher models. We also develop simple asymptotic expressions for large values of N. For example, the expected proportion of individuals with at least one p-th cousin in a sample of K individuals, for a diploid dioecious Wright–Fisher model, is approximately $1 - e^{-(2^{2p-1})K/N}$. Our results show that a substantial fraction of individuals in the sample will have at least a second cousin if the sampling fraction (K/N) is on the order of $10^{-2}$. This confirms that, for large cohort samples, relatedness among individuals cannot easily be ignored.
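As a rough illustration of the asymptotic expression quoted in the abstract, the following minimal sketch evaluates 1 - exp(-(2^(2p-1)) K/N) for a few cousin degrees at a sampling fraction of 0.01. The function name and the example values are assumptions made here for illustration; they do not come from the paper, and the sketch covers only the large-N approximation for the dioecious model, not the exact distributions the authors derive.

```python
import math


def approx_fraction_with_pth_cousin(p: int, sampling_fraction: float) -> float:
    """Approximate expected proportion of sampled individuals with at least
    one p-th cousin in the sample, using the large-N expression from the
    abstract: 1 - exp(-(2**(2p - 1)) * K / N).

    `sampling_fraction` is K / N. The function name is hypothetical.
    """
    if p < 1:
        raise ValueError("p must be a positive integer (1 = first cousins)")
    if not 0.0 <= sampling_fraction <= 1.0:
        raise ValueError("sampling_fraction must lie in [0, 1]")
    return 1.0 - math.exp(-(2 ** (2 * p - 1)) * sampling_fraction)


if __name__ == "__main__":
    # Example values chosen for illustration only.
    for p in (1, 2, 3):
        value = approx_fraction_with_pth_cousin(p, 0.01)
        print(f"p = {p}: approx. fraction = {value:.4f}")
```

Under this approximation the exponent grows by a factor of four with each additional cousin degree, so for a fixed K/N the fraction rises quickly as p increases.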

Original language	English
Journal	Journal of Mathematical Biology
Volume	77
Issue number	5
Pages (from-to)	1279-1298
ISSN	0303-6812
DOIs	https://doi.org/10.1007/s00285-018-1252-8
Publication status	Published - 1 Nov 2018

Cite this

@article{9ca154bd72d644d988cac0745b8f689c,

title = "On the number of siblings and p-th cousins in a large population sample",

abstract = "The number of individuals in a random sample with close relatives in the sample is a quantity of interest when designing Genome Wide Association Studies and other cohort based genetic, and non-genetic, studies. In this paper, we develop expressions for the distribution and expectation of the number of p-th cousins in a sample from a population of size N under two diploid Wright–Fisher models. We also develop simple asymptotic expressions for large values of N. For example, the expected proportion of individuals with at least one p-th cousin in a sample of K individuals, for a diploid dioecious Wright–Fisher model, is approximately $1 - e^{-(2^{2p-1})K/N}$. Our results show that a substantial fraction of individuals in the sample will have at least a second cousin if the sampling fraction (K/N) is on the order of $10^{-2}$. This confirms that, for large cohort samples, relatedness among individuals cannot easily be ignored.",

author = "Vladimir Shchur and Rasmus Nielsen",

year = "2018",

month = nov,

day = "1",

doi = "10.1007/s00285-018-1252-8",

language = "English",

volume = "77",

pages = "1279--1298",

journal = "Journal of Mathematical Biology",

issn = "0303-6812",

publisher = "Springer",

number = "5",

}

TY - JOUR

T1 - On the number of siblings and p-th cousins in a large population sample

AU - Shchur, Vladimir

AU - Nielsen, Rasmus

PY - 2018/11/1

Y1 - 2018/11/1

N2 - The number of individuals in a random sample with close relatives in the sample is a quantity of interest when designing Genome Wide Association Studies and other cohort based genetic, and non-genetic, studies. In this paper, we develop expressions for the distribution and expectation of the number of p-th cousins in a sample from a population of size N under two diploid Wright–Fisher models. We also develop simple asymptotic expressions for large values of N. For example, the expected proportion of individuals with at least one p-th cousin in a sample of K individuals, for a diploid dioecious Wright–Fisher model, is approximately 1 - e^{-(2^{2p-1})K/N}. Our results show that a substantial fraction of individuals in the sample will have at least a second cousin if the sampling fraction (K/N) is on the order of 10^{-2}. This confirms that, for large cohort samples, relatedness among individuals cannot easily be ignored.

AB - The number of individuals in a random sample with close relatives in the sample is a quantity of interest when designing Genome Wide Association Studies and other cohort based genetic, and non-genetic, studies. In this paper, we develop expressions for the distribution and expectation of the number of p-th cousins in a sample from a population of size N under two diploid Wright–Fisher models. We also develop simple asymptotic expressions for large values of N. For example, the expected proportion of individuals with at least one p-th cousin in a sample of K individuals, for a diploid dioecious Wright–Fisher model, is approximately 1 - e^{-(2^{2p-1})K/N}. Our results show that a substantial fraction of individuals in the sample will have at least a second cousin if the sampling fraction (K/N) is on the order of 10^{-2}. This confirms that, for large cohort samples, relatedness among individuals cannot easily be ignored.

U2 - 10.1007/s00285-018-1252-8

DO - 10.1007/s00285-018-1252-8

M3 - Journal article

C2 - 29876645

SN - 0303-6812

VL - 77

SP - 1279

EP - 1298

JO - Journal of Mathematical Biology

JF - Journal of Mathematical Biology

IS - 5

ER -
