Improved imputation of low-frequency and rare variants using the UK10K haplotype reference panel

Jie Huang, Bryan Howie, Shane McCarthy, Yasin Memari, Klaudia Walter, Josine L. Min, Petr Danecek, Giovanni Malerba, Elisabetta Trabetti, Hou-Feng Zheng, Saeed Al Turki, Antoinette Amuzu, Carl A. Anderson, Richard Anney, Dinu Antony, María Soler Artigas, Muhammad Ayub, Senduran Bala, Jeffrey C. Barrett, Inês Barroso, Phil Beales, Marianne Benn, Jamie Bentham, Shoumo Bhattacharya, Ewan Birney, Douglas Blackwood, Martin Bobrow, Elena Bochukova, Patrick F. Bolton, Rebecca Bounds, Chris Boustred, Gerome Breen, Mattia Calissano, Keren Carss, Juan Pablo Casas, John C. Chambers, Ruth Charlton, Krishna Chatterjee, Lu Chen, Antonio Ciampi, Sebahattin Cirak, Peter Clapham, Gail Clement, Guy Coates, Massimiliano Cocca, David A. Collier, Catherine Cosgrove, Tony Cox, Nick Craddock, Lucy Crooks, Sarah Curran, David Curtis, Allan Daly, Ian N. M. Day, Aaron Day-Williams, George Dedoussis, Thomas Down, Yuanping Du, Cornelia M. Van Duijn, Ian Dunham, Sarah Edkins, Rosemary Ekong, Peter Ellis, David M. Evans, I. Sadaf Farooqi, David R. Fitzpatrick, Paul Flicek, James Floyd, A. Reghan Foley, Christopher S. Franklin, Marta Futema, Louise Gallagher, Paolo Gasparini, Tom R. Gaunt, Matthias Geihs, Daniel Geschwind, Celia Greenwood, Heather Griffin, Detelina Grozeva, Xiaosen Guo, Xueqin Guo, Hugh Gurling, Deborah Hart, Audrey E. Hendricks, Peter Holmans, Liren Huang, Tim Hubbard, Steve E. Humphries, Matthew E. Hurles, Pirro Hysi, Valentina Iotchkova, Aaron Isaacs, David K. Jackson, Yalda Jamshidi, Jon Johnson, Chris Joyce, Konrad J. Karczewski, Jane Kaye, Thomas Keane, John P. Kemp, Karen Kennedy, Alastair Kent, Julia Keogh, Farrah Khawaja, Marcus E. Kleber, Margriet Van Kogelenberg, Anja Kolb-Kokocinski, Jaspal S. Kooner, Genevieve Lachance, Claudia Langenberg, Cordelia Langford, Daniel Lawson, Irene Lee, Elisabeth M. Van Leeuwen, Monkol Lek, Rui Li, Yingrui Li, Jieqin Liang, Hong Lin, Ryan Liu, Jouko Lönnqvist, Luis R. Lopes, Margarida Lopes, Jian'an Luan, Daniel G. MacArthur, Massimo Mangino, Gaëlle Marenne, Winfried März, John Maslen, Angela Matchan, Iain Mathieson, Peter McGuffin, Andrew M. McIntosh, Andrew G. McKechanie, Andrew McQuillin, Sarah Metrustry, Nicola Migone, Hannah M. Mitchison, Alireza Moayyeri, James Morris, Richard Morris, Dawn Muddyman, Francesco Muntoni, Børge Nordestgaard, Kate Northstone, Michael C. O'Donovan, Stephen O'Rahilly, Alexandros Onoufriadis, Karim Oualkacha, Michael J. Owen, Aarno Palotie, Kalliope Panoutsopoulou, Victoria Parker, Jeremy R. Parr, Lavinia Paternoster, Tiina Paunio, Felicity Payne, Stewart J. Payne, John R. B. Perry, Olli Pietilainen, Vincent Plagnol, Rebecca C. Pollitt, Sue Povey, Michael A. Quail, Lydia Quaye, Lucy Raymond, Karola Rehnström, Cheryl K. Ridout, Susan Ring, Graham R. S. Ritchie, Nicola Roberts, Rachel L. Robinson, David B. Savage, Peter Scambler, Stephan Schiffels, Miriam Schmidts, Nadia Schoenmakers, Richard H. Scott, Robert A. Scott, Robert K. Semple, Eva Serra, Sally I. Sharp, Adam Shaw, Hashem A. Shihab, So-Youn Shin, David Skuse, Kerrin S. Small, Carol Smee, George Davey Smith, Lorraine Southam, Olivera Spasic-Boskovic, Timothy D. Spector, David St Clair, Beate St Pourcain, Jim Stalker, Elizabeth Stevens, Jianping Sun, Gabriela Surdulescu, Jaana Suvisaari, Petros Syrris, Ioanna Tachmazidou, Rohan Taylor, Jing Tian, Martin D. Tobin, Daniela Toniolo, Michela Traglia, Anne Tybjærg-Hansen, Ana M. Valdes, Anthony M. Vandersteen, Anette Varbo, Parthiban Vijayarangakannan, Peter M. Visscher, Louise V. Wain, James T. R. Walters, Guangbiao Wang, Jun Wang, Yu Wang, Kirsten Ward, Eleanor Wheeler, Peter Whincup, Tamieka Whyte, Hywel J. Williams, Kathleen A. Williamson, Crispian Wilson, Scott G. Wilson, Kim Wong, Changjiang Xu, Jian Yang, Gianluigi Zaza, Eleftheria Zeggini, Feng Zhang, Pingbo Zhang, Weihua Zhang, Giovanni Gambaro, J. Brent Richards, Richard Durbin, Nicholas J. Timpson, Jonathan Marchini, Nicole Soranzo


Abstract

Imputing genotypes from reference panels created by whole-genome sequencing (WGS) provides a cost-effective strategy for augmenting the single-nucleotide polymorphism (SNP) content of genome-wide arrays. The UK10K Cohorts project has generated a data set of 3,781 whole genomes sequenced at low depth (average 7x), aiming to exhaustively characterize genetic variation down to 0.1% minor allele frequency in the British population. Here we demonstrate the value of this resource for improving imputation accuracy at rare and low-frequency variants in both a UK and an Italian population. We show that large increases in imputation accuracy can be achieved by re-phasing WGS reference panels after initial genotype calling. We also present a method for combining WGS panels to improve variant coverage and downstream imputation accuracy, which we illustrate by integrating 7,562 WGS haplotypes from the UK10K project with 2,184 haplotypes from the 1000 Genomes Project. Finally, we introduce a novel approximation that maintains speed without sacrificing imputation accuracy for rare variants.

Original language: English
Article number: 8111
Journal: Nature Communications
Volume: 6
Number of pages: 9
ISSN: 2041-1723
Publication status: Published - 14 Sept 2015
