Application of ancestry informative markers to association studies in European Americans.

Michael F Seldin, Alkes L Price

Year:  2008        PMID: 18208330      PMCID: PMC2211545          DOI: 10.1371/journal.pgen.0040005

Source DB:  PubMed          Journal:  PLoS Genet        ISSN: 1553-7390            Impact factor:   5.917


Recently, whole genome association (WGA) studies have accelerated progress in the search for genetic variations underlying the inheritance of complex genetic diseases. Although population differences in allele frequencies are usually small, these studies have demonstrated the importance of accounting for population differences in order to reduce false positive associations. Even within a continental population, population stratification (ancestry differences between cases and controls) can cause false associations at markers whose frequency differs across subpopulations [1,2]. For example, in a recent WGA study of rheumatoid arthritis in European Americans, markers in the LCT and IRF4 genes would have been falsely implicated as associated with disease without the application of methods to control for stratification [3]. Similar empirical examples of population stratification exist for other phenotypes, and genetic risk has been reported to vary across Europe for a wide range of diseases [4-8]. In general, investigators should consider population stratification whenever WGA data indicate that a particular marker shows a strong frequency gradient across Europe.

Methods have already been developed to control for population stratification in the initial stage of WGA studies, in which data from hundreds of thousands of markers are generated [9,10]. However, controlling for stratification is just as important in replication studies in independent sample sets, which will focus on a small number of markers. Similarly, candidate gene studies and fine mapping or sequencing studies will also require attention to population differences. Because a small number of candidate markers will not be sufficiently informative for ancestry, and genotyping a large number of markers is expensive, there is a need for small panels of ancestry informative markers (AIMs) that can be used to accurately infer ancestry [11].
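To make the stratification mechanism described above concrete, the following is a minimal sketch in Python, not a method taken from either research paper. It uses hypothetical allele counts (invented for illustration, not real data) for a marker that has no effect on disease within either subpopulation but differs in frequency between a "north" and a "south" group, with cases drawn mostly from the north and controls mostly from the south. The function name `allelic_chi2` and the stratum labels are assumptions of this example.

```python
def allelic_chi2(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Pearson chi-square statistic (1 df) for a 2x2 table of allele counts."""
    a, b, c, d = case_alt, case_ref, ctrl_alt, ctrl_ref
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0  # degenerate table: no evidence either way
    return n * (a * d - b * c) ** 2 / denom


# Hypothetical allele counts: (case_alt, case_ref, ctrl_alt, ctrl_ref).
# Within each stratum, cases and controls share the same allele frequency
# (0.7 in "north", 0.3 in "south"), so the marker is not associated with disease.
strata = {
    "north": (1120, 480, 280, 120),   # cases over-sampled from the north
    "south": (120, 280, 480, 1120),   # controls over-sampled from the south
}

# Ignoring ancestry: add the counts across strata and test the pooled table.
pooled = tuple(sum(counts[i] for counts in strata.values()) for i in range(4))
pooled_chi2 = allelic_chi2(*pooled)

# Accounting for ancestry: test within each stratum separately.
per_stratum_chi2 = {name: allelic_chi2(*counts) for name, counts in strata.items()}

print(f"pooled chi-square: {pooled_chi2:.1f}")   # about 230.4
for name, value in per_stratum_chi2.items():
    print(f"{name} chi-square: {value:.1f}")      # 0.0 in both strata
```

In this constructed example the pooled test is strongly significant even though neither stratum shows any association, which is the pattern that ancestry inference from AIMs is intended to expose in a replication study.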
We focus here on European Americans, a structured population that is often sampled in association studies. Multiple studies have shown that the largest source of population structure in European Americans involves a north–south (or northwest–southeast) cline through Europe [9,12,13]. However, subtler effects involving other regional or ethnic differences can also contribute to stratification. An important question is which ancestries should be evaluated in replication studies by genotyping of AIMs at additional cost. The answer to this question will vary from study to study, depending on factors such as the collection location of cases and controls, the phenotype being studied, and considerations of cost. For example, a study of a phenotype with known ancestry differences, in which cases are collected from a large city and controls are collected from throughout the country, would be well-advised to define ancestry to the fullest extent possible. On the other hand, a study of a phenotype with no known ancestry differences, involving cases and controls rigorously matched by location, might choose to bypass the use of AIMs entirely. An intermediate option would be to model only north–south ancestry, addressing the single most likely source of stratification at partial cost, with some residual risk of stratification.

Two research papers by our two groups in the current issue of PLoS Genetics provide a broad assessment of European American population structure, and also provide several sets of AIMs for inferring ancestry in European Americans [3,4]. Our respective sets of AIMs were ascertained using different pairs of populations, but have each been shown to be effective in discerning the ancestries for which they were ascertained. The Price et al. study analyzes WGA data from the Affymetrix 500 K and Illumina 300 K platforms and describes a set of 100 AIMs ascertained using northwest versus southeast European ancestry (Price100) and a set of 200 AIMs ascertained using southeast European versus Ashkenazi Jewish ancestry (Price200) [4]. The Tian et al. study analyzes WGA data from the Illumina 300 K and 500 K platforms and describes a set of 192 AIMs ascertained using northern European versus Ashkenazi Jewish ancestry (Tian192) and a set of 1,211 AIMs ascertained using Irish versus other northern European ancestry (Tian1211) [3]. It should be stressed that combined information from either a very large set of markers or a set of highly specialized markers is required to distinguish the ancestries of these genetically very similar populations, whose real or perceived group differences may often be dominated by environmental, social, and cultural factors.

Below, we outline the possible choices of marker sets for inferring various ancestries. In each case, a method such as structured association or principal components analysis can be applied to genotype data to correct for stratification. To correct for stratification along the north–south (or northwest–southeast) cline, either the Price100 or the Tian192 marker set can be used. (The Tian192 markers, which were ascertained using northern European versus Ashkenazi Jewish ancestry, are effective in distinguishing north–south ancestry because southern Europeans attain intermediate ancestry values as compared to values at one extreme for northern Europeans.) To correct for stratification involving both north–south and Ashkenazi Jewish ancestry, one option is to use the Price100+Price200 marker sets, which together separate north, south, and Ashkenazi ancestry into three distinct clusters. Another option is to use the Tian192 marker set, which models these three ancestries along a single axis and will be sufficient if the phenotype being analyzed takes intermediate values for southern European ancestry, relative to northern European and Ashkenazi Jewish ancestry. Finally, to correct for stratification involving a west–east gradient within northern Europe (e.g., Irish versus other northern European ancestry), the Tian1211 marker set is the only set of AIMs available.

We note that the initial information from a WGA study can help to determine the appropriate choice of AIMs for a replication study. An important question is whether there are ancestry differences between cases and controls in the initial WGA study and, if so, which ancestries contribute to this effect and whether sets of AIMs correct for stratification in the WGA data as effectively as the complete set of WGA markers. Of course, a caveat to such an approach is the requirement that the cases and controls used for replication are demographically matched to those used in the initial study.

It is also worth noting that for some studies the analysis of population structure might precede WGA. Thus, depending on the number of case and control samples and the cost of prescreening with AIM panels, it may be advantageous to first match cases and controls for ancestry. This could improve the power of the study, for two reasons. First, methods to correct for stratification in a scenario with poorly matched cases and controls will lead to an inevitable loss of power in a WGA study. Second, if a variant is more polymorphic or has higher relative risk in samples of a particular ancestry, then a more genetically homogeneous group of subjects (for example, focusing on Ashkenazi Jewish ancestry [14]) may be more likely to reveal that variant.

We caution that population stratification is not the only source of false positive associations in disease studies. In particular, differences in DNA quality or laboratory treatment between cases and controls may produce spurious signals that will not be addressed by using AIMs [15]. Subtle instances of differential bias will be difficult to detect in studies involving a small number of markers, but a possible diagnostic check is to compare rates of missing data between cases and controls.

In conclusion, replication and candidate gene studies in European Americans can now make use of AIMs for examining north–south, Ashkenazi Jewish, and Irish ancestry. Though European Americans could exhibit additional, even subtler population structure effects, these would contribute much less strongly to stratification and would require a larger number of markers to discern, limiting their relevance to AIM sets. Going forward, the widespread implementation of AIMs may benefit from specialized products on dedicated platforms to reduce costs. Discussions are currently under way to achieve this, and we anticipate that specialized AIM products will be commercially available to the research community in the near future. We also envision that the rapid growth of WGA data will aid the ascertainment of AIM panels for a broader range of populations, beyond European Americans.
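The diagnostic check mentioned above, comparing rates of missing data between cases and controls, could be sketched as follows. This is only an illustration under stated assumptions: genotype calls for one marker are held in plain Python lists with `None` marking a missing call, the comparison uses a standard pooled two-proportion z statistic (one reasonable choice, not a test specified in the text), and the function name `missing_rate_z` and the example data are hypothetical.

```python
import math


def missing_rate_z(case_calls, control_calls):
    """Return (case missing rate, control missing rate, z statistic).

    Each argument is a list of genotype calls for one marker, with None
    marking a missing call. The z statistic is the pooled two-proportion
    test of equal missing rates in cases and controls.
    """
    n_cases, n_controls = len(case_calls), len(control_calls)
    if n_cases == 0 or n_controls == 0:
        raise ValueError("both groups must contain at least one sample")
    missing_cases = sum(call is None for call in case_calls)
    missing_controls = sum(call is None for call in control_calls)
    rate_cases = missing_cases / n_cases
    rate_controls = missing_controls / n_controls
    pooled = (missing_cases + missing_controls) / (n_cases + n_controls)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_cases + 1 / n_controls))
    z = 0.0 if se == 0 else (rate_cases - rate_controls) / se
    return rate_cases, rate_controls, z


# Hypothetical example: 10 of 200 case genotypes missing, 2 of 200 in controls.
cases = [None] * 10 + ["AG"] * 190
controls = [None] * 2 + ["AG"] * 198
rate_cases, rate_controls, z = missing_rate_z(cases, controls)
print(f"missing rate in cases: {rate_cases:.3f}")
print(f"missing rate in controls: {rate_controls:.3f}")
print(f"z statistic: {z:.2f}")
```

A large absolute z value at a marker suggests that genotyping quality differs between the two groups, so an apparent association at that marker deserves extra scrutiny regardless of any ancestry correction.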
  15 in total

1.  Shifts in angiotensin I converting enzyme insertion allele frequency across Europe: implications for Alzheimer's disease risk.

Authors:  F Panza; V Solfrizzi; A D'Introno; A M Colacicco; C Capurso; A Capurso; P G Kehoe
Journal:  J Neurol Neurosurg Psychiatry       Date:  2003-08       Impact factor: 10.154

2.  Control of confounding of genetic associations in stratified populations.

Authors:  Clive J Hoggart; Eteban J Parra; Mark D Shriver; Carolina Bonilla; Rick A Kittles; David G Clayton; Paul M McKeigue
Journal:  Am J Hum Genet       Date:  2003-06       Impact factor: 11.025

3.  An Icelandic example of the impact of population structure on association studies.

Authors:  Agnar Helgason; Bryndís Yngvadóttir; Birgir Hrafnkelsson; Jeffrey Gulcher; Kári Stefánsson
Journal:  Nat Genet       Date:  2004-12-19       Impact factor: 38.330

4.  Population structure, differential bias and genomic control in a large-scale, case-control association study.

Authors:  David G Clayton; Neil M Walker; Deborah J Smyth; Rebecca Pask; Jason D Cooper; Lisa M Maier; Luc J Smink; Alex C Lam; Nigel R Ovington; Helen E Stevens; Sarah Nutland; Joanna M M Howson; Malek Faham; Martin Moorhead; Hywel B Jones; Matthew Falkowski; Paul Hardenbol; Thomas D Willis; John A Todd
Journal:  Nat Genet       Date:  2005-10-09       Impact factor: 38.330

5.  Demonstrating stratification in a European American population.

Authors:  Catarina D Campbell; Elizabeth L Ogburn; Kathryn L Lunetta; Helen N Lyon; Matthew L Freedman; Leif C Groop; David Altshuler; Kristin G Ardlie; Joel N Hirschhorn
Journal:  Nat Genet       Date:  2005-07-24       Impact factor: 38.330

6.  Principal components analysis corrects for stratification in genome-wide association studies.

Authors:  Alkes L Price; Nick J Patterson; Robert M Plenge; Michael E Weinblatt; Nancy A Shadick; David Reich
Journal:  Nat Genet       Date:  2006-07-23       Impact factor: 38.330

7.  Contribution of factor VII genotype to activated FVII levels. Differences in genotype frequencies between northern and southern European populations.

Authors:  F Bernardi; P Arcieri; R M Bertina; F Chiarotti; J Corral; M Pinotti; H Prydz; M Samama; P M Sandset; R Strom; V V Garcia; G Mariani
Journal:  Arterioscler Thromb Vasc Biol       Date:  1997-11       Impact factor: 8.311

8.  Familial empirical risks for inflammatory bowel disease: differences between Jews and non-Jews.

Authors:  H Yang; C McElree; M P Roth; F Shanahan; S R Targan; J I Rotter
Journal:  Gut       Date:  1993-04       Impact factor: 23.059

9.  Coronary heart disease incidence in northern and southern European populations: a reanalysis of the seven countries study for a European coronary risk chart.

Authors:  A Menotti; M Lanti; P E Puddu; D Kromhout
Journal:  Heart       Date:  2000-09       Impact factor: 5.994

10.  Analysis and application of European genetic substructure using 300 K SNP information.

Authors:  Chao Tian; Robert M Plenge; Michael Ransom; Annette Lee; Pablo Villoslada; Carlo Selmi; Lars Klareskog; Ann E Pulver; Lihong Qi; Peter K Gregersen; Michael F Seldin
Journal:  PLoS Genet       Date:  2008-01       Impact factor: 5.917
