
Genetic heterogeneity of self-reported ancestry groups in an admixed Brazilian population.

Tulio C Lins, Rodrigo G Vieira, Breno S Abreu, Paulo Gentil, Ricardo Moreno-Lima, Ricardo J Oliveira, Rinaldo W Pereira.

Abstract

BACKGROUND: Population stratification is the main source of spurious results and poor reproducibility in genetic association findings. Population heterogeneity can be controlled for by grouping individuals in ethnic clusters; however, in admixed populations, there is evidence that such proxies do not provide efficient stratification control. The aim of this study was to evaluate the relation between self-reported and genetic ancestry and the statistical risk of grouping an admixed sample based on self-reported ancestry.
METHODS: A questionnaire that included an item on self-reported ancestry was completed by 189 female volunteers from an admixed Brazilian population. Individual genetic ancestry was then determined by genotyping ancestry informative markers.
RESULTS: Self-reported ancestry was classified as white, intermediate, and black. The mean difference among self-reported groups was significant for European and African, but not Amerindian, genetic ancestry. Pairwise fixation index analysis revealed a significant difference among groups. However, the increase in the chance of type 1 error was estimated to be 14%.
CONCLUSIONS: Self-reporting of ancestry was not an appropriate methodology to cluster groups in a Brazilian population, due to high variance at the individual level. Ancestry informative markers are more useful for quantitative measurement of biological ancestry.

Year:  2011        PMID: 21498954      PMCID: PMC3899415          DOI: 10.2188/jea.je20100164

Source DB:  PubMed          Journal:  J Epidemiol        ISSN: 0917-5040            Impact factor:   3.211


INTRODUCTION

The genetic structure of human populations is relevant in epidemiologic studies and can be used as a tool for collecting parental ancestry information in an admixed population. Although the biogeography of some groups is culturally and genetically fixed, other groups have experienced substantial recent admixture with ancestors from widely divergent regions. That is the case in the Brazilian population, which is genetically characterized by differing degrees of admixture of 3 parental populations (European, African, and Native American).[1],[2] The debate on how genetic studies should be controlled for population stratification has encompassed several methodologies, including self-reported ethnicity and genetic ancestry markers.[3]–[8] Self-reported ancestry has been described as a method that is highly correlated with genetic population structure in well-defined, stratified ethnic groups, such as Europeans, Africans, and Asians.[7]–[9] However, in admixed populations, both self-reported ancestry and anthropometric traits used as proxies, such as skin pigmentation, are believed to be unreliable methods of determining ancestry,[3]–[5],[8] which suggests that molecular markers based on genetic clustering should be used to reduce the potential inaccuracies of population stratification.[3],[6],[10] Many association studies have classified ethnic groups by means of subjective assessment by the interviewer, evaluation of anthropometric traits, genealogical examination, and self-reported ancestry.[11]–[15] However, the recent use of molecular markers to determine genetic ancestry has revealed wide genetic heterogeneity in admixed Brazilians.[5],[16]–[22] One problem in performing association studies of admixed populations that are assessed solely by self-reported ancestry as a proxy of ethnic group is the possibility of spurious association with false-positive or false-negative results.[5]–[7],[23]–[26] Thus, the aim of this study was to evaluate the relation between self-reported and genetic ancestry and the statistical risk of grouping an admixed sample by using self-reported ancestry.

METHODS

Population sample

Samples were obtained from 189 postmenopausal women (age, 67.77 ± 5.22 years) who had volunteered as part of a healthcare program developed by the Universidade Católica de Brasília, located in the Center-West Region of Brazil (Taguatinga, DF, Brazil). The volunteers answered a lifestyle questionnaire that included a multiple-choice question on self-reported ancestry, based on the method used by the Brazilian Institute of Geography and Statistics (IBGE) national census survey.[27] All sampled individuals signed an informed consent form, and the research protocol was approved by the University Ethical Committee.

Assessment of individual genetic ancestry

For assessment of individual genetic ancestry, we selected 13 Ancestry Informative Markers (AIMs) that have differential allele frequencies among European, African, and Amerindian parental populations[28]–[30] (Table 1). The potential informativeness of most of these SNPs was evaluated in a Brazilian population sample,[21] and a modified method was applied to the use of the current markers. Briefly, genotypic data were obtained by optimized PCR to coamplify DNA fragments in 2 multiplex panels of ancestry informative markers. Later, the PCR-amplified products were purified in an enzymatic treatment with exonuclease I (ExoI) and shrimp alkaline phosphatase (SAP) enzymes to eliminate nonincorporated dNTPs and primers. Finally, the minisequencing reaction was performed using the SNaPshot Multiplex minisequencing kit reaction mix (Applied Biosystems, Foster City, USA), and the products were analyzed on the ABI 3130 XL Genetic Analyzer (Applied Biosystems) in an ABI 3700 POP-6 polymer. Genotypes were called using GeneScan Analysis Software, version 3.7 (Applied Biosystems) and Genotyper version 3.7 (Applied Biosystems). The detailed optimized multiplex single-base extension protocol, with reactant concentration and PCR thermocycling conditions, has been reported elsewhere.[21],[31]–[33]
Table 1.

Allelic frequencies and information on 13 ancestry informative markers for parental populations and studied samples

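| Locus | Ref | Position | Allele | EUR | AMR | AFR | White | Intermediate | Black |
|---|---|---|---|---|---|---|---|---|---|
| CRH (rs3176921) | [30] | 8q13.1 | G | 0.073 | 0.017 | 0.682 | 0.435 | 0.395 | 0.767 |
| CYP3A4 (rs2740574) | [29] | 7q22.1 | G | 0.958 | 0.959 | 0.198 | 0.485 | 0.654 | 0.771 |
| FyNull (rs2814778) | [30] | 1q23.2 | C | 0.002 | 0.000 | 0.999 | 0.101 | 0.237 | 0.548 |
| LPL (rs285) | [30] | 8p21.3 | G | 0.508 | 0.558 | 0.029 | 0.519 | 0.513 | 0.226 |
| OCA2 (rs1800404) | [30] | 15q13.1 | G | 0.254 | 0.552 | 0.885 | 0.370 | 0.528 | 0.717 |
| rs1129038 | [30] | 15q13 | C | 0.224 | 0.983 | 0.995 | 0.671 | 0.808 | 0.871 |
| rs1426654 | [30] | 15q21 | C | 0.010 | 0.930 | 0.970 | 0.096 | 0.276 | 0.758 |
| rs1480642 | [30] | 6q23 | C | 0.994 | 0.621 | 0.106 | 0.788 | 0.679 | 0.435 |
| AT3 (rs3138521) | [28] | 1q25 | Insertion | 0.282 | 0.061 | 0.858 | 0.354 | 0.304 | 0.597 |
| rs736556 | [30] | 7p15 | C | 0.244 | 0.018 | 0.939 | 0.242 | 0.313 | 0.519 |
| rs3768641 | [30] | 2p13 | G | 0.923 | 1.000 | 0.010 | 0.715 | 0.725 | 0.577 |
| rs1871534 | [30] | 5p15.2 | G | 0.019 | 0.000 | 0.960 | 0.029 | 0.049 | 0.250 |
| rs4766807 | [30] | 12q24.2 | A | 0.622 | 0.948 | 0.030 | 0.559 | 0.542 | 0.429 |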

European (EUR), African (AFR) and Native American (AMR).


Statistical analysis

Allelic frequencies were obtained by direct counting, and pairwise population fixation index (FST) analysis was performed using GenAlEx software.[34] FST measures population differentiation based on the heterozygosity of genetic polymorphism data and is calculated using the formula $F_{ST} = (H_T - H_S)/H_T$, where $H_T$ is the expected heterozygosity in the total population and $H_S$ is the observed heterozygosity in a subpopulation.[35] The fixation index can range from 0.0 (no differentiation) to 1.0 (complete differentiation) and is conventionally interpreted as indicating little (0.0 to 0.05), moderate (0.05 to 0.15), great (0.15 to 0.25), or very great (>0.25) genetic differentiation.

Estimation of individual genomic ancestry was performed using an algorithm based on maximum likelihood estimation (MLE). Briefly, the log-likelihood function is maximized over the admixture parameter of up to 3 parental populations, using a priori known allele frequencies, to estimate the individual ancestry proportions from a predetermined number of analyzed genotypes of an admixed individual. The MLE approach was implemented in the software program IAE3CI; the detailed statistics have been described elsewhere.[36],[37]

Basic descriptive statistics and 1-way analysis of variance (ANOVA) with the post-hoc Games-Howell test, which adjusts for unequal variances, were used to determine the relation between genetic ancestry distribution and self-reported ancestry groups. A P value of 0.05 or lower was considered statistically significant. Statistical analysis was performed using SPSS software version 13 (SPSS Inc., Chicago, IL, USA).
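The maximum likelihood step described above can be illustrated with a minimal Python sketch. It is not the IAE3CI implementation, whose details are given in the cited references. It assumes independent loci, Hardy-Weinberg proportions within the admixed individual, genotypes coded as the number of copies (0, 1, or 2) of the allele listed in Table 1, and a simple grid search over the three admixture proportions. The function names, the grid resolution, and the example individual are hypothetical.

```python
import math

# Parental frequencies (EUR, AMR, AFR) of the listed allele, copied from Table 1.
PARENTAL_FREQS = {
    "CRH (rs3176921)":    (0.073, 0.017, 0.682),
    "CYP3A4 (rs2740574)": (0.958, 0.959, 0.198),
    "FyNull (rs2814778)": (0.002, 0.000, 0.999),
    "LPL (rs285)":        (0.508, 0.558, 0.029),
    "OCA2 (rs1800404)":   (0.254, 0.552, 0.885),
    "rs1129038":          (0.224, 0.983, 0.995),
    "rs1426654":          (0.010, 0.930, 0.970),
    "rs1480642":          (0.994, 0.621, 0.106),
    "AT3 (rs3138521)":    (0.282, 0.061, 0.858),
    "rs736556":           (0.244, 0.018, 0.939),
    "rs3768641":          (0.923, 1.000, 0.010),
    "rs1871534":          (0.019, 0.000, 0.960),
    "rs4766807":          (0.622, 0.948, 0.030),
}

EPS = 1e-6  # assumption: keeps log() finite when a parental frequency is 0 or 1


def log_likelihood(genotypes, m):
    """Binomial log-likelihood of the genotypes given admixture proportions m.

    The constant binomial coefficient is omitted because it does not depend on m.
    """
    total = 0.0
    for locus, count in genotypes.items():
        # Expected allele frequency in the individual: mixture of parental frequencies.
        q = sum(m_k * p_k for m_k, p_k in zip(m, PARENTAL_FREQS[locus]))
        q = min(max(q, EPS), 1.0 - EPS)
        total += count * math.log(q) + (2 - count) * math.log(1.0 - q)
    return total


def estimate_ancestry(genotypes, steps=100):
    """Grid search over (EUR, AMR, AFR) proportions that sum to 1."""
    best_m, best_ll = None, -math.inf
    for i in range(steps + 1):
        for j in range(steps + 1 - i):
            k = steps - i - j
            m = (i / steps, j / steps, k / steps)
            ll = log_likelihood(genotypes, m)
            if ll > best_ll:
                best_m, best_ll = m, ll
    return best_m, best_ll


# Hypothetical individual: homozygous for whichever allele is more common in EUR.
example_genotypes = {
    locus: (2 if freqs[0] > 0.5 else 0) for locus, freqs in PARENTAL_FREQS.items()
}
estimate, loglik = estimate_ancestry(example_genotypes)
print("EUR, AMR, AFR proportions:", estimate)
```

A grid search is used here only because it is easy to follow and needs no external libraries; a numerical optimizer constrained to the simplex would be the more usual choice for larger marker panels.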

RESULTS

A total of 192 participants completed the study, but only 3 were of self-reported Amerindian ancestry. Due to the lack of statistical power, these 3 women were not considered in the analysis, and 189 participants remained for study, as previously described. The questionnaire responses indicated that the sampled population had similar prevalences (41.8%) for 2 groups, white and intermediate (ie, any sort of admixture). Blacks represented 16.4% of the sample. No individual reported Asian descent in the present research. Allelic frequency for the 13 genotyped SNPs is described in Table 1, along with the frequencies of the parental populations. Using Wright’s scale of genetic differentiation, FST analysis revealed little difference between the white and intermediate groups (FST = 0.022), a moderate difference between the intermediate and black groups (FST = 0.138), and great differentiation between the black and white groups (FST = 0.225); all P-values were significant. The genetic ancestry of each self-reported ancestry category and the total sample was estimated (Table 2). The range of individual ancestry for the 3 parental genomes within each self-reported ancestry category is depicted in a box plot (Figure). The 3 self-reported categories had overlapping ranges for each parental ancestry. For example, with regard to European ancestry, there were individuals in all 3 self-reported categories within the range of 0.41 to 0.78 for the ancestry proportion. For African ancestry, this overlap ranged from 0.19 to 0.48, and for Amerindian ancestry the range was from 0 to 0.42 (Figure). For instance, an individual in the black group had 78% European ancestry and 22% African ancestry (sample 181; Figure).
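As an illustration of the heterozygosity-based fixation index used for the pairwise comparisons above, the following Python sketch applies F_ST = (H_T − H_S)/H_T to the group allele frequencies in Table 1. It assumes biallelic loci, equal weighting of the two groups, expected (Hardy-Weinberg) heterozygosity within each group, and pooling across loci as a ratio of sums. GenAlEx uses its own estimation procedure, so the values produced here are not expected to reproduce the reported figures exactly. The function names are hypothetical.

```python
# Frequencies of the listed allele in the self-reported groups
# (White, Intermediate, Black), copied from Table 1.
GROUP_FREQS = {
    "CRH (rs3176921)":    (0.435, 0.395, 0.767),
    "CYP3A4 (rs2740574)": (0.485, 0.654, 0.771),
    "FyNull (rs2814778)": (0.101, 0.237, 0.548),
    "LPL (rs285)":        (0.519, 0.513, 0.226),
    "OCA2 (rs1800404)":   (0.370, 0.528, 0.717),
    "rs1129038":          (0.671, 0.808, 0.871),
    "rs1426654":          (0.096, 0.276, 0.758),
    "rs1480642":          (0.788, 0.679, 0.435),
    "AT3 (rs3138521)":    (0.354, 0.304, 0.597),
    "rs736556":           (0.242, 0.313, 0.519),
    "rs3768641":          (0.715, 0.725, 0.577),
    "rs1871534":          (0.029, 0.049, 0.250),
    "rs4766807":          (0.559, 0.542, 0.429),
}
WHITE, INTERMEDIATE, BLACK = 0, 1, 2


def expected_het(p):
    """Expected heterozygosity of a biallelic locus with allele frequency p."""
    return 2.0 * p * (1.0 - p)


def pairwise_fst(p1, p2):
    """Single-locus F_ST for two equally weighted groups."""
    h_t = expected_het((p1 + p2) / 2.0)                  # pooled population
    h_s = (expected_het(p1) + expected_het(p2)) / 2.0    # mean within groups
    return 0.0 if h_t == 0 else (h_t - h_s) / h_t


def multilocus_fst(col_a, col_b):
    """Pool loci by summing H_T - H_S and H_T before taking the ratio."""
    numerator = denominator = 0.0
    for freqs in GROUP_FREQS.values():
        p1, p2 = freqs[col_a], freqs[col_b]
        h_t = expected_het((p1 + p2) / 2.0)
        h_s = (expected_het(p1) + expected_het(p2)) / 2.0
        numerator += h_t - h_s
        denominator += h_t
    return numerator / denominator


fst_white_intermediate = multilocus_fst(WHITE, INTERMEDIATE)
fst_intermediate_black = multilocus_fst(INTERMEDIATE, BLACK)
fst_white_black = multilocus_fst(WHITE, BLACK)
print(fst_white_intermediate, fst_intermediate_black, fst_white_black)
```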
Table 2.

Genetic ancestry estimates among self-reported ancestry groups, by skin color

| Self-reported ancestry | European AV ± SD | European VAR | African AV ± SD | African VAR | Amerindian AV ± SD | Amerindian VAR |
|---|---|---|---|---|---|---|
| White (n = 79) | 0.738 ± 0.135 | 0.018 | 0.172 ± 0.134 | 0.018 | 0.090 ± 0.120 | 0.014 |
| Intermediate (n = 79) | 0.615 ± 0.140 | 0.020 | 0.256 ± 0.142 | 0.020 | 0.129 ± 0.168 | 0.029 |
| Black (n = 31) | 0.387 ± 0.164 | 0.027 | 0.472 ± 0.158 | 0.025 | 0.141 ± 0.189 | 0.035 |
| Total (n = 189) | 0.629 ± 0.187 | 0.035 | 0.254 ± 0.174 | 0.031 | 0.117 ± 0.155 | 0.024 |

Average values (AV), standard deviations (SD), and variance (VAR).

Figure.

Boxplot of genetic ancestry estimates of European (EUR), African (AFR), and Amerindian (AMR) proportions among the 3 self-reported groups. *outliers.

One-way ANOVA for comparison of means, in conjunction with the Games-Howell post-hoc test, revealed significant mean differences among self-reported groups for European and African ancestries, but not for Amerindian ancestry (Table 3). Although the differences were significant, the average width of the 95% confidence intervals for European and African ancestries was 0.14, which was taken to indicate that the probability of spurious findings (type 1 error) arising simply by chance was 14% higher than the conventional 5% level (P = 0.05).
Table 3.

Comparisons of genetic ancestry among groups, by self-reported skin color. The mean difference was considered significant at P ≤ 0.05

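| Dependent variable | Group (i) | Group (j) | Mean difference (i − j) | Standard error | P-value | 95% CI lower bound | 95% CI upper bound |
|---|---|---|---|---|---|---|---|
| EUR | White | Intermediate | 0.122 | 0.022 | <0.001 | 0.070 | 0.174 |
| EUR | White | Black | 0.351 | 0.033 | <0.001 | 0.271 | 0.432 |
| EUR | Intermediate | White | −0.122 | 0.022 | <0.001 | −0.174 | −0.070 |
| EUR | Intermediate | Black | 0.229 | 0.034 | <0.001 | 0.148 | 0.310 |
| EUR | Black | White | −0.351 | 0.033 | <0.001 | −0.432 | −0.271 |
| EUR | Black | Intermediate | −0.229 | 0.034 | <0.001 | −0.310 | −0.148 |
| AFR | White | Intermediate | −0.078 | 0.022 | 0.001 | −0.130 | −0.026 |
| AFR | White | Black | −0.300 | 0.032 | <0.001 | −0.377 | −0.222 |
| AFR | Intermediate | White | 0.078 | 0.022 | 0.001 | 0.026 | 0.130 |
| AFR | Intermediate | Black | −0.221 | 0.033 | <0.001 | −0.300 | −0.143 |
| AFR | Black | White | 0.300 | 0.032 | <0.001 | 0.222 | 0.377 |
| AFR | Black | Intermediate | 0.221 | 0.033 | <0.001 | 0.143 | 0.300 |
| AMR | White | Intermediate | −0.041 | 0.023 | 0.187 | −0.097 | 0.014 |
| AMR | White | Black | −0.051 | 0.037 | 0.349 | −0.141 | 0.038 |
| AMR | Intermediate | White | 0.041 | 0.023 | 0.187 | −0.014 | 0.097 |
| AMR | Intermediate | Black | −0.010 | 0.039 | 0.964 | −0.105 | 0.084 |
| AMR | Black | White | 0.051 | 0.037 | 0.349 | −0.038 | 0.141 |
| AMR | Black | Intermediate | 0.010 | 0.039 | 0.964 | −0.084 | 0.105 |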

European (EUR), African (AFR), and Native American (AMR) proportions.
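The standard errors in Table 3 and the average confidence interval width of 0.14 mentioned in the text can be checked from the published summary values. The Python sketch below assumes the usual Games-Howell construction, in which the standard error of a pairwise difference is sqrt(s1²/n1 + s2²/n2) with Welch-Satterthwaite degrees of freedom, and the confidence interval half-width is that standard error multiplied by q/√2, where q is a studentized range critical value. The critical value is not computed here because the Python standard library has no studentized range distribution. The original analysis was run in SPSS; the function and variable names below are hypothetical.

```python
import math

# European ancestry by self-reported group: (n, variance), copied from Table 2.
EUR_SUMMARY = {
    "White": (79, 0.018),
    "Intermediate": (79, 0.020),
    "Black": (31, 0.027),
}


def welch_se_df(n1, var1, n2, var2):
    """Standard error of a mean difference and Welch-Satterthwaite df."""
    a, b = var1 / n1, var2 / n2
    se = math.sqrt(a + b)
    df = (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))
    return se, df


se_wi, df_wi = welch_se_df(*EUR_SUMMARY["White"], *EUR_SUMMARY["Intermediate"])
se_wb, df_wb = welch_se_df(*EUR_SUMMARY["White"], *EUR_SUMMARY["Black"])
print(f"EUR White vs Intermediate: SE = {se_wi:.3f}, df = {df_wi:.1f}")
print(f"EUR White vs Black:        SE = {se_wb:.3f}, df = {df_wb:.1f}")

# 95% confidence bounds for the EUR and AFR comparisons in Table 3,
# taking each unordered pair of groups once.
CI_BOUNDS = [
    (0.070, 0.174),    # EUR White - Intermediate
    (0.271, 0.432),    # EUR White - Black
    (0.148, 0.310),    # EUR Intermediate - Black
    (-0.130, -0.026),  # AFR White - Intermediate
    (-0.377, -0.222),  # AFR White - Black
    (-0.300, -0.143),  # AFR Intermediate - Black
]
widths = [upper - lower for lower, upper in CI_BOUNDS]
mean_width = sum(widths) / len(widths)
print(f"Mean CI width for EUR and AFR comparisons: {mean_width:.2f}")
```

Because the inputs are rounded table values, small discrepancies in the third decimal place relative to Table 3 are to be expected.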


DISCUSSION

The Federal District is a modern urban center and the capital of Brazil. It has a population of migrants from several regions of Brazil. The 2007 National Household Sample Survey reported a distribution of self-reported ancestry in the Federal District that was very similar to that of the present study, differing only in the prevalence of blacks.[27] The inferred ancestry estimated here is comparable to those of other published studies of the Center-West Brazilian population,[17],[21] with only slight differences in European and African ancestry proportions, which were probably due to sampling issues. In this study, the statistical power of the 13-AIM panel might have been insufficient to accurately assess ancestry in an admixed population.[38] However, when the population ancestry estimates, standard deviations, and variances of the present study were compared with those of a different sample of Center-West Brazil that was assessed using a set of 28 ancestry markers,[21] the values did not statistically differ between samples (P = 0.49), especially with regard to individuals of Amerindian ancestry (0.118 ± 0.149, variance 0.022, in the earlier sample).[21] Allelic frequency and F-statistic estimates significantly differed among groups. It is noteworthy that allelic frequencies diverged markedly between the corresponding ancestry-related populations (ie, European versus white Brazilians and African versus black Brazilians). This was the case for CYP3A4 in EUR-white (δ = 0.473) and for rs1871534 in AFR-black (δ = 0.710), which highlights the admixture among these groups. The proportions of genetic ancestries in the intermediate group were similar to those of the total sample and differed only in variance.
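The δ values quoted above are absolute differences in allele frequency between a parental population and the corresponding self-reported group. A short Python sketch using the Table 1 frequencies shows the arithmetic for all 13 markers; the variable and function names are hypothetical.

```python
# (EUR, AFR, White, Black) frequencies of the listed allele, copied from Table 1.
FREQS = {
    "CRH (rs3176921)":    (0.073, 0.682, 0.435, 0.767),
    "CYP3A4 (rs2740574)": (0.958, 0.198, 0.485, 0.771),
    "FyNull (rs2814778)": (0.002, 0.999, 0.101, 0.548),
    "LPL (rs285)":        (0.508, 0.029, 0.519, 0.226),
    "OCA2 (rs1800404)":   (0.254, 0.885, 0.370, 0.717),
    "rs1129038":          (0.224, 0.995, 0.671, 0.871),
    "rs1426654":          (0.010, 0.970, 0.096, 0.758),
    "rs1480642":          (0.994, 0.106, 0.788, 0.435),
    "AT3 (rs3138521)":    (0.282, 0.858, 0.354, 0.597),
    "rs736556":           (0.244, 0.939, 0.242, 0.519),
    "rs3768641":          (0.923, 0.010, 0.715, 0.577),
    "rs1871534":          (0.019, 0.960, 0.029, 0.250),
    "rs4766807":          (0.622, 0.030, 0.559, 0.429),
}


def delta(p_a, p_b):
    """Absolute allele frequency difference between two populations."""
    return round(abs(p_a - p_b), 3)


# EUR versus self-reported white, and AFR versus self-reported black.
eur_white = {locus: delta(f[0], f[2]) for locus, f in FREQS.items()}
afr_black = {locus: delta(f[1], f[3]) for locus, f in FREQS.items()}

largest_eur_white = max(eur_white, key=eur_white.get)
largest_afr_black = max(afr_black, key=afr_black.get)
print(largest_eur_white, eur_white[largest_eur_white])
print(largest_afr_black, afr_black[largest_afr_black])
```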
Therefore, the range and variance of ancestry at the individual level were too large for self-reported ancestry to be considered a suitable proxy for homogeneous clustering, even though the differences in group means were statistically significant. In addition, we observed an overlap in the range of genetic ancestry values among groups, which suggests that individuals with the same proportions of admixture could place themselves in any ethnic category. The confidence interval revealed that the risk of this occurring simply by chance was 14%, considering the European and African ancestry estimates. It is worth mentioning that the group self-reported as black had a proportion of non-African ancestry of approximately 53% (Table 2). In previous studies of the Brazilian population, African ancestry did not exceed non-African ancestry, but a large proportion was observed in a sample of the rural community of the Southeastern Brazilian state of Minas Gerais (48% non-African ancestry)[5] and in an urban population sample from Rio de Janeiro (49% non-African ancestry).[39] The intermediate group described here had estimates closer to those of the white group, as was the case for the urban population of Rio de Janeiro.[39] In contrast, in a study of a sample from a rural community,[5] the intermediate group was closer to blacks. This reveals an important issue, namely, that groups with equal self-reported proportions can have different genetic ancestry profiles, especially if the samples are from communities with different levels of urbanization.
Those features were also demonstrated in other Brazilian population samples in which ancestry was assessed by using maternal (mtDNA) and paternal (Y-chromosome) molecular markers.[20],[22],[40] A related study of Puerto Ricans who self-classified into ancestry/color groups found statistically significant differences in genetic ancestry among 3 groups (blanco, trigueño, and negro).[41] The results of that study can be compared with the categories in the present study. An overlapping range in ancestry estimates was also reported, in which the distribution of African ancestry overlapped across 12% for all 3 color categories (range of ancestry estimates: 0.27–0.35). In the present sample, this range was considerably wider (0.19–0.48) and encompassed 48% of the sample. For European ancestry, the overlap accounted for 63% of the sample in a range of 0.41–0.78, while for Amerindian ancestry, 95% of individuals were in the overlapping range (0.0–0.42) for all 3 categories. The reliability of self-classification can be poor, even among a proband and siblings,[4] for whom the family history would be assumed to be more reliable. In the same way, interviewers might misclassify an individual whose ancestral family history they do not know.
Indeed, different interviewers have classified the same individual into different groups.[23],[42] These examples illustrate how self-identified ethnicity might not be sufficiently accurate for use in biomedical research, as it is primarily a sociocultural construct.[42] Considerable variation exists because ethnicity is essentially constructed under social circumstances that consider many cultural traditions.[1],[42],[43] In admixed populations, individuals might feel that they belong to a certain ethnic group for cultural reasons or beliefs; however, their genealogy might consist of an unknown admixture.[1],[2] Although our sample comprised only women, we did not evaluate such effects on self-declared ancestry, as it may have more sociological than biological meaning.[42],[43] From a sociological and anthropological point of view, a person’s biological ancestry might have no relationship to their self-identification with a cultural group, but such ancestry might be of great importance in clinical research. In conclusion, determination of ethnicity based on self-reported ancestry is vulnerable to misclassification and should be avoided in scientific research. The concepts of ethnic group and self-declared ancestry are therefore not synonymous in biomedical research and must be replaced with scientific measurements that have biological meaning, such as individual ancestry estimated by DNA markers.[5],[24],[44] Several strategies can be effective in controlling for heterogeneity. For example, individual ancestry estimates can be used to match admixed case-control groups.[45] They can also be used in cross-sectional studies as covariates to adjust for a population stratification effect.[24],[31],[33] The use of ancestry informative markers to estimate individual ancestry is an effective and reliable solution to correct the effects of heterogeneity.
  43 in total

1.  Control of confounding of genetic associations in stratified populations.

Authors:  Clive J Hoggart; Eteban J Parra; Mark D Shriver; Carolina Bonilla; Rick A Kittles; David G Clayton; Paul M McKeigue
Journal:  Am J Hum Genet       Date:  2003-06       Impact factor: 11.025

Review 2.  Human population structure and genetic association studies.

Authors:  Elad Ziv; Esteban González Burchard
Journal:  Pharmacogenomics       Date:  2003-07       Impact factor: 2.533

3.  Genetic structure of human populations.

Authors:  Noah A Rosenberg; Jonathan K Pritchard; James L Weber; Howard M Cann; Kenneth K Kidd; Lev A Zhivotovsky; Marcus W Feldman
Journal:  Science       Date:  2002-12-20       Impact factor: 47.728

4.  Characterization of CD28, CTLA4, and ICOS polymorphisms in three Brazilian ethnic groups.

Authors:  V B Guzman; A Morgun; N Shulzhenko; K L Mine; A Gonçalves-Primo; C C Musatti; M Gerbase-Delima
Journal:  Hum Immunol       Date:  2005-07       Impact factor: 2.850

5.  The role of self-defined race/ethnicity in population structure control.

Authors:  X-Q Liu; A D Paterson; E M John; J A Knight
Journal:  Ann Hum Genet       Date:  2006-07       Impact factor: 1.670

6.  A multiplex single-base extension protocol for genotyping Cdx2, FokI, BsmI, ApaI, and TaqI polymorphisms of the vitamin D receptor gene.

Authors:  T C L Lins; L R Nogueira; R M Lima; P Gentil; R J Oliveira; R W Pereira
Journal:  Genet Mol Res       Date:  2007-05-22

7.  Who were the male founders of rural Brazilian Afro-derived communities? A proposal based on three populations.

Authors:  Guilherme Galvarros Bueno Lobo Ribeiro; Kiyoko Abe-Sandes; Rejane da Silva Sena Barcelos; Maria de Nazaré Klautau-Guimarães; Wilson Araujo da Silva Junior; Silviene Fabiana de Oliveira
Journal:  Ann Hum Biol       Date:  2010-07-15       Impact factor: 1.533

8.  Vitamin-d-receptor genotypes and bone-mineral density in postmenopausal women: interaction with physical activity.

Authors:  Paulo Gentil; Tulio Cesar de Lima Lins; Ricardo Moreno Lima; Breno Silva de Abreu; Dario Grattapaglia; Martim Bottaro; Ricardo Jaco de Oliveira; Rinaldo Wellerson Pereira
Journal:  J Aging Phys Act       Date:  2009-01       Impact factor: 1.961

9.  Genetic signatures of parental contribution in black and white populations in Brazil.

Authors:  Vanderlei Guerreiro-Junior; Rafael Bisso-Machado; Andrea Marrero; Tábita Hünemeier; Francisco M Salzano; Maria Cátira Bortolini
Journal:  Genet Mol Biol       Date:  2009-01-10       Impact factor: 1.771

10.  Genetic ancestry, social classification, and racial inequalities in blood pressure in Southeastern Puerto Rico.

Authors:  Clarence C Gravlee; Amy L Non; Connie J Mulligan
Journal:  PLoS One       Date:  2009-09-09       Impact factor: 3.240
