Neil Risch, Esteban Burchard, Elad Ziv, Hua Tang.
Abstract
A debate has arisen regarding the validity of racial/ethnic categories for biomedical and genetic research. An epidemiologic perspective on the issue of human categorization in biomedical and genetic research strongly supports the continued use of self-identified race and ethnicity.
Year: 2002 PMID: 12184798 PMCID: PMC139378 DOI: 10.1186/gb-2002-3-7-comment2007
Source DB: PubMed Journal: Genome Biol ISSN: 1474-7596 Impact factor: 13.583
Figure 1. The evolutionary tree of human races. Population genetic studies of world populations support the categorization into five major groups, as shown. See text for further details.
Allele frequency differentiation of drug metabolizing enzymes on the basis of "genetic clusters" versus "racial groups," from the data of Wilson et al. [2]
| | Genetic clusters | | | | | Racial groups | | | |
| | Allele frequencies | | | | σ² | Allele frequencies | | | σ² |
| Locus | C | A | B | D | | African | Caucasian | East Asian | |
| | 0.60 | 0.66 | 0.69 | 0.59 | 0.0023 | 0.58 | 0.68 | 0.67 | 0.0030 |
| | 0.31 | 0.47 | 0.53 | 0.45 | 0.0087 | 0.33 | 0.49 | 0.52 | 0.0104 |
| | 0.27 | 0.09 | 0.37 | 0.25 | 0.0134 | 0.22 | 0.08 | 0.35 | 0.0182 |
| | 0.19 | 0.22 | 0.11 | 0.53 | 0.0340 | 0.21 | 0.21 | 0.32 | 0.0040 |
| | 0.46 | 0.74 | 0.17 | 0.33 | 0.0582 | 0.58 | 0.74 | 0.15 | 0.0931 |
| | 0.70 | 0.53 | 0.39 | 0.42 | 0.0197 | 0.70 | 0.49 | 0.37 | 0.0279 |
Genetic clusters: C, primarily African; A, primarily Caucasian; B, primarily Pacific Islander; D, primarily East Asian. Racial groups: East Asian denotes Chinese plus Papua New Guinean.
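The σ² columns in the table above appear to be the sample variance (n − 1 denominator) of the allele frequencies across groups. That reading is an inference from the tabulated numbers, not something the caption states. A minimal Python sketch of the check on two rows follows; the row labels and the helper name `frequency_variance` are illustrative, since the locus names are not available in this copy.

```python
from statistics import variance

# Allele frequencies copied from two rows of the table above.
# Rows are labelled by position because the locus names are missing here.
rows = {
    "row 1": {"clusters": [0.60, 0.66, 0.69, 0.59], "races": [0.58, 0.68, 0.67]},
    "row 5": {"clusters": [0.46, 0.74, 0.17, 0.33], "races": [0.58, 0.74, 0.15]},
}


def frequency_variance(freqs):
    """Sample variance (n - 1 denominator) of allele frequencies, to 4 decimals."""
    return round(variance(freqs), 4)


for label, groups in rows.items():
    print(
        label,
        "clusters:", frequency_variance(groups["clusters"]),
        "races:", frequency_variance(groups["races"]),
    )
```

Under this assumption the computed values can be compared directly with the two σ² columns to see which grouping shows greater allele frequency differentiation at each locus.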
The number of markers required for clustering as a function of the misclassification rate (calculated as shown in Box 1)
| | Misclassification rate | | |
| δ value | 10⁻³ | 10⁻⁴ | 10⁻⁵ |
| 0.6 | 9 | 13 | 17 |
| 0.5 | 15 | 21 | 28 |
| 0.4 | 25 | 37 | 48 |
| 0.3 | 49 | 71 | 92 |
| 0.2 | 115 | 166 | 218 |
| 0.1 | 474 | 687 | 901 |
Median δ values for different racial/ethnic group comparisons, from data of Dean et al. [29] and Smith et al. [30]
| | All markers | | Top 50% of markers | | Top 20% of markers | |
| Groups | SNP | STRP | SNP | STRP | SNP | STRP |
| CA-AA | 0.17 | 0.28 | 0.24 | 0.36 | 0.37 | 0.44 |
| CA-AS | 0.17 | 0.36 | 0.27 | 0.47 | 0.37 | 0.59 |
| CA-NA | 0.21 | 0.29 | 0.40 | |||
| CA-HA | 0.18 | 0.27 | 0.34 | |||
| AA-AS | 0.20 | 0.42 | 0.34 | 0.52 | 0.46 | 0.59 |
| AA-NA | 0.20 | 0.36 | 0.48 | |||
| AA-HA | 0.27 | 0.36 | 0.43 | |||
| AS-NA | 0.16 | 0.29 | 0.42 | |||
| AS-HA | 0.32 | 0.42 | 0.50 | |||
Groups are as follows: CA, Caucasians; AA, African Americans; AS, East Asians; NA, Native Americans; HA, Hispanic Americans.
Figure 2. An example of confounding and a stratified analysis of environmental and genetic factors. Here we assume two populations (for example, races), groups A and B. G1 and G2 represent dichotomous genotype classes at a candidate gene locus (one of the classes represents two genotypes for simplicity, as would be the case for a dominant model), and E1 and E2 represent two strata of an environmental factor.

(a) We assume that the probability (P) of trait D depends only on E, so that the risk of D given E1 is 10%, versus 1% given E2. In group A, the frequencies of G1, G2, E1 and E2 are each 50%, whereas in group B, the frequencies of G1 and E1 are each 10% and the frequencies of G2 and E2 are each 90%. Then, within group A, the prevalence of D is 5.5% (0.5 × 10% + 0.5 × 1%), whereas in group B the prevalence is 1.9% (0.1 × 10% + 0.9 × 1%); hence, a racial difference exists in the prevalence of D.

(b) We next consider the prevalence of D within strata defined by G and E. First, we assume G and E are independent in frequency within each group. In this case, the difference in the prevalence of D between groups A and B persists within strata defined by G, but not within strata defined by E. Thus, the environmental factor E can completely explain the racial difference between groups A and B, but the genetic factor G cannot. Next, consider the case where G and E are completely correlated in frequency within groups. In this case, analysis stratified on either G or E eliminates the prevalence difference between groups A and B, and it is impossible to determine which is the functional cause of the racial difference. More importantly, consider the situation where factor E was not measured. In the first scenario (G and E independent within group), analysis stratified on G yields the correct interpretation that G does not contribute to the racial difference; in the second scenario (G and E fully correlated), however, analysis stratified on G would lead to the incorrect conclusion that G is the cause of the racial difference.
P(D|G1) denotes the probability of disease given an individual has genotype G1, and similarly for G2, E1 and E2.
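The arithmetic in the Figure 2 caption can be sketched in Python as follows. The risks and frequencies are taken from the caption; the names `RISK`, `FREQ`, `joint` and `prevalence` are hypothetical helpers introduced only for illustration, and the fully correlated scenario is modelled by assuming that G1 always co-occurs with E1.

```python
# Risk of trait D depends only on the environmental stratum (values from the caption).
RISK = {"E1": 0.10, "E2": 0.01}

# Frequency of G1 and of E1 in each group; G2 and E2 are the complements.
FREQ = {"A": {"G1": 0.5, "E1": 0.5}, "B": {"G1": 0.1, "E1": 0.1}}


def joint(group, correlated):
    """Return P(G, E) within a group for the independent or fully correlated scenario."""
    g1, e1 = FREQ[group]["G1"], FREQ[group]["E1"]
    if correlated:
        # Assumption: G1 always occurs with E1 and G2 with E2.
        return {("G1", "E1"): g1, ("G1", "E2"): 0.0,
                ("G2", "E1"): 0.0, ("G2", "E2"): 1 - g1}
    return {
        (g, e): pg * pe
        for g, pg in (("G1", g1), ("G2", 1 - g1))
        for e, pe in (("E1", e1), ("E2", 1 - e1))
    }


def prevalence(group, correlated=False, stratum=None):
    """P(D) in a group, optionally restricted to one G or E stratum."""
    cells = {k: p for k, p in joint(group, correlated).items()
             if stratum is None or stratum in k}
    total = sum(cells.values())
    return sum(p * RISK[e] for (_, e), p in cells.items()) / total


for scenario, correlated in (("independent", False), ("correlated", True)):
    for stratum in (None, "G1", "G2", "E1", "E2"):
        a = prevalence("A", correlated, stratum)
        b = prevalence("B", correlated, stratum)
        print(f"{scenario:11s} stratum={stratum}: A={a:.3f} B={b:.3f}")
```

With G and E independent, the A versus B difference should persist within the G strata and vanish within the E strata; with G and E fully correlated, it vanishes within both, which is why stratifying on G alone cannot separate the two explanations.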