Birte Kehr, Anna Helgadottir, Pall Melsted, Hakon Jonsson, Hannes Helgason, Adalbjörg Jonasdottir, Aslaug Jonasdottir, Asgeir Sigurdsson, Arnaldur Gylfason, Gisli H Halldorsson, Snaedis Kristmundsdottir, Gudmundur Thorgeirsson, Isleifur Olafsson, Hilma Holm, Unnur Thorsteinsdottir, Patrick Sulem, Agnar Helgason, Daniel F Gudbjartsson, Bjarni V Halldorsson, Kari Stefansson.
Abstract
Genomes usually contain some non-repetitive sequences that are missing from the reference genome and occur only in a population subset. Such non-repetitive, non-reference (NRNR) sequences have remained largely unexplored in terms of their characterization and downstream analyses. Here we describe 3,791 breakpoint-resolved NRNR sequence variants called using PopIns from whole-genome sequence data of 15,219 Icelanders. We found that over 95% of the 244 NRNR sequences that are 200 bp or longer are present in chimpanzees, indicating that they are ancestral. Furthermore, 149 variant loci are in linkage disequilibrium (r² > 0.8) with a genome-wide association study (GWAS) catalog marker, suggesting disease relevance. Additionally, we report an association (P = 3.8 × 10⁻⁸, odds ratio (OR) = 0.92) with myocardial infarction (23,360 cases, 300,771 controls) for a 766-bp NRNR sequence variant. Our results underline the importance of including variation of all complexity levels when searching for variants that associate with disease.
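To make the linkage disequilibrium criterion concrete, the following is a minimal sketch of how an r² value between an insertion variant and a nearby marker could be computed and compared against the 0.8 threshold. It is not the authors' pipeline: it assumes unphased genotype dosages coded 0/1/2 and uses the squared Pearson correlation of those dosages, a common approximation to haplotype-based r². The function name `ld_r2` and the toy genotype vectors are hypothetical.

```python
def ld_r2(g1, g2):
    """Squared Pearson correlation between two genotype dosage vectors.

    Assumption: each vector holds one dosage (0, 1 or 2 copies of the
    non-reference allele) per individual, in the same individual order.
    """
    n = len(g1)
    if n == 0 or n != len(g2):
        raise ValueError("genotype vectors must be non-empty and equal in length")

    mean1 = sum(g1) / n
    mean2 = sum(g2) / n

    # Sums of squares and cross-products around the means.
    cov = sum((a - mean1) * (b - mean2) for a, b in zip(g1, g2))
    var1 = sum((a - mean1) ** 2 for a in g1)
    var2 = sum((b - mean2) ** 2 for b in g2)

    if var1 == 0 or var2 == 0:
        raise ValueError("r2 is undefined for a monomorphic variant")

    return (cov * cov) / (var1 * var2)


# Hypothetical toy data: dosages for 8 individuals.
nrnr_variant = [0, 1, 2, 1, 0, 2, 1, 0]   # insertion present on 0/1/2 haplotypes
gwas_marker = [0, 1, 2, 1, 0, 2, 1, 1]    # nearby catalog marker

r2 = ld_r2(nrnr_variant, gwas_marker)
in_strong_ld = r2 > 0.8  # the threshold quoted in the abstract
print(f"r2 = {r2:.3f}, in strong LD: {in_strong_ld}")
```

With real data, phased haplotypes and a dedicated tool would normally be preferred, since dosage correlation only approximates haplotype-level r².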
Year: 2017 PMID: 28250455 DOI: 10.1038/ng.3801
Source DB: PubMed Journal: Nat Genet ISSN: 1061-4036 Impact factor: 38.330