Lasse Maretty, Jacob Malte Jensen, Bent Petersen, Jonas Andreas Sibbesen, Siyang Liu, Palle Villesen, Laurits Skov, Kirstine Belling, Christian Theil Have, Jose M G Izarzugaza, Marie Grosjean, Jette Bork-Jensen, Jakob Grove, Thomas D Als, Shujia Huang, Yuqi Chang, Ruiqi Xu, Weijian Ye, Junhua Rao, Xiaosen Guo, Jihua Sun, Hongzhi Cao, Chen Ye, Johan van Beusekom, Thomas Espeseth, Esben Flindt, Rune M Friborg, Anders E Halager, Stephanie Le Hellard, Christina M Hultman, Francesco Lescai, Shengting Li, Ole Lund, Peter Løngren, Thomas Mailund, Maria Luisa Matey-Hernandez, Ole Mors, Christian N S Pedersen, Thomas Sicheritz-Pontén, Patrick Sullivan, Ali Syed, David Westergaard, Rachita Yadav, Ning Li, Xun Xu, Torben Hansen, Anders Krogh, Lars Bolund, Thorkild I A Sørensen, Oluf Pedersen, Ramneek Gupta, Simon Rasmussen, Søren Besenbacher, Anders D Børglum, Jun Wang, Hans Eiberg, Karsten Kristiansen, Søren Brunak, Mikkel Heide Schierup.
Abstract
Hundreds of thousands of human genomes are now being sequenced to characterize genetic variation and use this information to augment association mapping studies of complex disorders and other phenotypic traits. Genetic variation is identified mainly by mapping short reads to the reference genome or by performing local assembly. However, these approaches are biased against discovery of structural variants and variation in the more complex parts of the genome. Hence, large-scale de novo assembly is needed. Here we show that it is possible to construct excellent de novo assemblies from high-coverage sequencing with mate-pair libraries extending up to 20 kilobases. We report de novo assemblies of 150 individuals (50 trios) from the GenomeDenmark project. The quality of these assemblies is similar to those obtained using the more expensive long-read technology. We use the assemblies to identify a rich set of structural variants including many novel insertions and demonstrate how this variant catalogue enables further deciphering of known association mapping signals. We leverage the assemblies to provide 100 completely resolved major histocompatibility complex haplotypes and to resolve major parts of the Y chromosome. Our study provides a regional reference genome that we expect will improve the power of future association mapping studies and hence pave the way for precision medicine initiatives, which now are being launched in many countries including Denmark.
Year: 2017 PMID: 28746312 DOI: 10.1038/nature23264
Source DB: PubMed Journal: Nature ISSN: 0028-0836 Impact factor: 49.962