Henry Richard Johnston1, Yi-Juan Hu1, Jingjing Gao2, Timothy D O'Connor3,4,5, Gonçalo R Abecasis6, Genevieve L Wojcik7, Christopher R Gignoux7, Pierre-Antoine Gourraud8, Antoine Lizee8, Mark Hansen9, Rob Genuario9, Dave Bullis9, Cindy Lawley9, Eimear E Kenny7,10, Carlos Bustamante7, Terri H Beaty11, Rasika A Mathias11,12, Kathleen C Barnes11,12, Zhaohui S Qin1.
Abstract
A primary goal of The Consortium on Asthma among African-ancestry Populations in the Americas (CAAPA) is to develop an 'African Diaspora Power Chip' (ADPC), a genotyping array consisting of tagging SNPs, useful in comprehensively identifying African-specific genetic variation. This array is designed based on the novel variation identified in 642 CAAPA samples of African ancestry with high-coverage whole-genome sequence data (~30× depth). This novel variation extends the pattern of variation catalogued in the 1000 Genomes and Exome Sequencing Projects to a spectrum of populations representing the wide range of West African genomic diversity. These individuals from CAAPA also comprise a large swath of the African Diaspora population and incorporate historical genetic diversity covering nearly the entire Atlantic coast of the Americas. Here we show the results of designing and producing such a microchip array. This novel array covers African-specific variation far better than other commercially available arrays, and will enable better GWAS analyses for researchers with individuals of African descent in their study populations. A recent study cataloging variation in continental African populations suggests this type of African-specific genotyping array is both necessary and valuable for facilitating large-scale GWAS in populations of African ancestry.
Year: 2017 PMID: 28429804 PMCID: PMC5399604 DOI: 10.1038/srep46398
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Illumina projected coverage of African variants on several commercially available GWAS arrays.
| Array Type | MAF <0.010 (non-singletons) | MAF >0.010 | MAF >0.025 | MAF >0.050 |
|---|---|---|---|---|
| HumanExomev1-2 | 0.031 | 0.032 | 0.034 | 0.035 |
| HumanCore-12v1-0 | 0.127 | 0.152 | 0.203 | 0.256 |
| HumanCoreExome-12v1-0 | 0.148 | 0.172 | 0.221 | 0.271 |
| OmniExpress-12v1-1 | 0.210 | 0.249 | 0.326 | 0.395 |
| OmniExpressExome-8v1-1 | 0.226 | 0.263 | 0.337 | 0.403 |
Figure 1. The ADPC design pipeline, describing the steps taken to whittle ~15 million novel African SNPs down to a 627k African-targeted GWAS array.
Figure 2. Estimated imputation coverage of variants tagged by the ADPC content as part of the MEGA array.
(a) Coverage in 1000 Genomes African populations is r² ≥ 0.8 down to 1% MAF. (b) Coverage in 1000 Genomes admixed African populations is r² ≥ 0.8 down to 1% MAF.
Projected coverage for the ADPC among CAAPA variants >=1% MAF, with and without OmniExpress pairing, for the whole genome.
| r² threshold | OmniExpress Alone | ADPC Alone | Combined |
|---|---|---|---|
| 0.9 | 20% | 12% | 29% |
| 0.8 | 26% | 16% | 37% |
| 0.5 | 39% | 31% | 56% |
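The complementarity of the two arrays can be read off this table with simple arithmetic. The sketch below is illustrative only and rests on one assumption not stated in the source: that the "Combined" column is the union of the variants tagged by either array, so that inclusion-exclusion applies. Under that assumption, the share of variants tagged by both arrays is OmniExpress + ADPC − Combined, and the gain from adding the ADPC to OmniExpress is Combined − OmniExpress. The function and variable names are hypothetical.

```python
# Coverage of CAAPA variants >=1% MAF, in percent, copied from the table above.
coverage = {
    0.9: {"omni": 20, "adpc": 12, "combined": 29},
    0.8: {"omni": 26, "adpc": 16, "combined": 37},
    0.5: {"omni": 39, "adpc": 31, "combined": 56},
}


def overlap_and_gain(omni, adpc, combined):
    """Return (shared, gain) in percentage points.

    ASSUMPTION: `combined` is the union of variants tagged by either array,
    so inclusion-exclusion gives the share tagged by both.
    """
    shared = omni + adpc - combined  # tagged by both arrays
    gain = combined - omni           # added by pairing the ADPC with OmniExpress
    return shared, gain


results = {r2: overlap_and_gain(**row) for r2, row in coverage.items()}

for r2, (shared, gain) in results.items():
    print(f"r2 >= {r2}: shared = {shared} points, ADPC adds {gain} points")
```

If the union assumption holds, only a small fraction of variants is tagged by both arrays at each threshold (3, 5, and 14 percentage points), which is consistent with the arrays using different tagging approaches.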
Figure 3. Projected minor allele frequency histograms for the ADPC and OmniExpress arrays, overlaid on one another.
The disparity between the arrays is significant, and represents very different tagging approaches. This makes them well suited to complement each other.