TY - JOUR
T1 - Novel variation and de novo mutation rates in population-wide de novo assembled Danish trios
AU - Besenbacher, Søren
AU - Liu, Siyang
AU - Gonzalez-Izarzugaza, Jose Maria
AU - Grove, Jakob
AU - Belling, Kirstine G
AU - Bork-Jensen, Jette
AU - Huang, Shujia
AU - Als, Thomas Damm
AU - Li, Shengting
AU - Yadav, Rachita
AU - Rubio García, Arcadio
AU - Lescai, Francesco
AU - Demontis, Ditte
AU - Rao, Junhua
AU - Ye, Weijian
AU - Mailund, Thomas
AU - Møllegaard Friborg, Rune
AU - Pedersen, Christian N. S.
AU - Xu, Ruiqi
AU - Sun, Jihua
AU - Liu, Hao
AU - Wang, Ou
AU - Cheng, Xiaofang
AU - Flores, David
AU - Rydza, Emil Karol
AU - Rapacki, Kristoffer
AU - Sørensen, John Damm
AU - Chmura, Piotr Jaroslaw
AU - Westergaard, David
AU - Dworzynski, Piotr
AU - Sørensen, Thorkild I. A.
AU - Lund, Ole
AU - Hansen, Torben
AU - Xu, Xun
AU - Li, Ning
AU - Bolund, Lars
AU - Pedersen, Oluf
AU - Eiberg, Hans
AU - Krogh, Anders
AU - Børglum, Anders D.
AU - Brunak, Søren
AU - Kristiansen, Karsten
AU - Schierup, Mikkel H
AU - Wang, Jun
AU - Gupta, Ramneek
AU - Villesen, Palle
AU - Rasmussen, Simon
PY - 2015/1
Y1 - 2015/1
N2 - Building a population-specific catalogue of single nucleotide variants (SNVs), indels and structural variants (SVs) with frequencies, termed a national pan-genome, is critical for further advancing clinical and public health genetics in large cohorts. Here we report a Danish pan-genome obtained from sequencing 10 trios to high depth (50×). We report 536k novel SNVs and 283k novel short indels from mapping approaches and develop a population-wide de novo assembly approach to identify 132k novel indels larger than 10 nucleotides with low false discovery rates. We identify a higher proportion of indels and SVs than previous efforts showing the merits of high coverage and de novo assembly approaches. In addition, we use trio information to identify de novo mutations and use a probabilistic method to provide direct estimates of 1.27e-8 and 1.5e-9 per nucleotide per generation for SNVs and indels, respectively.
AB - Building a population-specific catalogue of single nucleotide variants (SNVs), indels and structural variants (SVs) with frequencies, termed a national pan-genome, is critical for further advancing clinical and public health genetics in large cohorts. Here we report a Danish pan-genome obtained from sequencing 10 trios to high depth (50×). We report 536k novel SNVs and 283k novel short indels from mapping approaches and develop a population-wide de novo assembly approach to identify 132k novel indels larger than 10 nucleotides with low false discovery rates. We identify a higher proportion of indels and SVs than previous efforts showing the merits of high coverage and de novo assembly approaches. In addition, we use trio information to identify de novo mutations and use a probabilistic method to provide direct estimates of 1.27e-8 and 1.5e-9 per nucleotide per generation for SNVs and indels, respectively.
U2 - 10.1038/ncomms6969
DO - 10.1038/ncomms6969
M3 - Journal article
C2 - 25597990
SN - 2041-1723
VL - 6
JO - Nature Communications
JF - Nature Communications
M1 - 5969
ER -