Abdimajid Osman, Jon Jonasson.
Abstract
Most studies reporting human genetic variants have been performed in populations of European ancestry, whereas other global populations, particularly many ethnolinguistic groups on other continents, are heavily underrepresented. To investigate the extent of this disproportionate representation for variants of significance to thrombosis and hemostasis, 845 single nucleotide polymorphisms (SNPs) in and around 34 genes associated with thrombosis and hemostasis and included in the commercial Axiom Precision Medicine Research Array (PMRA) were evaluated using gene frequencies in 3 African (Somali and Luhya in East Africa, and Yoruba in West Africa) and 14 non-African (admixed American, East Asian, European, South Asian, and sub-groups) populations. Among the populations studied, Europeans were the best represented by the hemostatic SNPs included in the PMRA. The European population also presented the largest number of common pharmacogenetic and pathogenic hemostatic variants reported in the ClinVar database. The number of such variants decreased with a population's genetic distance from Europeans: Yoruba and East Asians presented the fewest clinically significant hemostatic SNPs in ClinVar and were also the two populations genetically most distinct from Europeans among those compared. The current study shows the lopsided representation of global populations with regard to hemostatic genetic variants listed in commercial SNP arrays, such as the PMRA, and reported in genetic databases, and underlines the importance of including non-European ethnolinguistic populations in genomics studies designed to discover variants of significance to bleeding and thrombotic disorders.
Keywords: Gene frequency; Hemostasis; Population genetics; Single nucleotide polymorphism; Thrombosis
Year: 2022 PMID: 35337356 PMCID: PMC8957123 DOI: 10.1186/s12920-022-01220-0
Source DB: PubMed Journal: BMC Med Genomics ISSN: 1755-8794 Impact factor: 3.063
Fig. 1 PCA for 845 hemostatic SNPs in 6 global populations. Cos2 values show the quality of representation of the variables on the PCA; the higher the Cos2, the better the representation
Fig. 2 Contributions of each population to the first two dimensions of the PCA (PC1 and PC2). The red dashed horizontal line represents the expected mean contribution
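The Cos2 and contribution values named in the captions of Figs. 1 and 2 can be illustrated with a minimal sketch. It assumes a correlation-based PCA in which the populations are the variables and the SNPs are the observations; the record does not state which software or settings were used, so the function name `pca_variable_stats` and the random input matrix are hypothetical.

```python
import numpy as np

def pca_variable_stats(freqs):
    """Return (cos2, contrib) for the variables (columns) of a correlation PCA.

    freqs: array of shape (n_snps, n_populations) holding allele frequencies.
    cos2[i, k]    squared correlation of population i with component k.
    contrib[i, k] percentage contribution of population i to component k.
    """
    X = np.asarray(freqs, dtype=float)
    # Standardize each population column so the PCA works on the correlation matrix.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    corr = (Z.T @ Z) / (Z.shape[0] - 1)
    eigvals, eigvecs = np.linalg.eigh(corr)
    order = np.argsort(eigvals)[::-1]          # largest component first
    eigvals = np.clip(eigvals[order], 0, None)
    eigvecs = eigvecs[:, order]
    coords = eigvecs * np.sqrt(eigvals)        # variable coordinates on each component
    cos2 = coords ** 2
    contrib = 100 * cos2 / cos2.sum(axis=0)    # each column sums to 100 %
    return cos2, contrib

# Hypothetical input: 845 SNPs x 6 populations of random frequencies (not study data).
rng = np.random.default_rng(0)
toy_freqs = rng.random((845, 6))
cos2, contrib = pca_variable_stats(toy_freqs)
expected_mean_contribution = 100 / toy_freqs.shape[1]  # reference line if all contributed equally
```

Under these assumptions, a population whose contribution to PC1 or PC2 exceeds `expected_mean_contribution` contributes more than it would if all populations contributed equally, which is the usual reading of a reference line like the one in Fig. 2.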
Fig. 3 Violin plots showing the distribution of allele frequencies for 845 hemostatic SNPs in different populations. The small white circles indicate the median value: 0.036 (AMR), 0.018 (EAS), 0.035 (EUR), 0.031 (SAS), 0.053 (SOM), and 0.056 (YRI). The black rectangle in the middle represents the interquartile range. Allele frequency refers to the frequency of the alternative allele
Fig. 4 A heatmap showing 66 hemostatic SNPs found in the ClinVar database with at least 1% alternative allele frequency in 6 world populations. The legend on the right side shows the corresponding colors for alternative allele frequencies (AAF). The SNPs are arranged from top to bottom according to their ClinVar annotations: drug-response (n = 8), pathogenic (n = 9), variant of uncertain significance (n = 13), likely-benign (n = 26), and benign (n = 10)
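The 1% alternative allele frequency cut-off in the Fig. 4 caption can be sketched as a simple filter. The caption does not say whether the threshold had to be met in every population or in at least one, so the hypothetical `require_all` flag below exposes both readings; the SNP identifiers and frequencies are invented placeholders, not study data.

```python
def common_snps(aaf_by_snp, threshold=0.01, require_all=False):
    """Return SNP ids whose alternative allele frequency (AAF) meets the threshold.

    aaf_by_snp: dict mapping SNP id -> dict mapping population -> AAF.
    require_all: if True, every population must reach the threshold;
                 otherwise one population is enough (assumed default).
    """
    pick = all if require_all else any
    return [
        snp_id
        for snp_id, freqs in aaf_by_snp.items()
        if pick(f >= threshold for f in freqs.values())
    ]

# Placeholder data for illustration only.
toy_aaf = {
    "snp_a": {"EUR": 0.120, "YRI": 0.004},
    "snp_b": {"EUR": 0.002, "YRI": 0.003},
    "snp_c": {"EUR": 0.050, "YRI": 0.030},
}
in_any_population = common_snps(toy_aaf)
in_all_populations = common_snps(toy_aaf, require_all=True)
```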
VKORC1 haplotypes in different world populations
| Gene | VKORC1 | Haplotype frequencies | ||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Chromosome | 16 | 16 | 16 | 16 | 16 | 16 | 16 | 16 | This study | 1000 genomes phase III | ||||||||
| Position hg19 | 31,102,321 | 31,103,796 | 31,104,509 | 31,104,720 | 31,104,878 | 31,105,353 | 31,105,554 | 31,105,945 | ||||||||||
| Ref. allele | C | A | C | C | G | G | A | C | ||||||||||
| Alt. allele | T | G | G | T | A | A | C | A | ||||||||||
| rs-number | rs7294 | rs2359612 | rs8050894 | rs72547529 | rs9934438 | rs17708472 | rs2884737 | rs61742245 | SOM | ASW | CEU | CHB | GIH | JPT | LWK | MXL | TSI | YRI |
| Haplotypes | C | A | C | C | G | G | A | C | 7% | 12% | 14% | |||||||
| C | A | C | T | G | G | A | C | 3% | ||||||||||
| C | A | G | C | A | G | A | C | 3% | 8% | 17% | 95% | 11% | 90% | 3% | 33% | 15% | 3% | |
| C | A | G | C | A | G | C | C | 41% | 7% | 28% | 7% | 14% | 33% | |||||
| C | G | C | C | G | A | A | C | 14% | 7% | 26% | 16% | 6% | 15% | 16% | 2% | |||
| C | G | C | C | G | G | A | A | 8% | ||||||||||
| C | G | C | C | G | G | A | C | 1% | 8% | 19% | 10% | |||||||
| C | G | G | C | G | G | A | C | 15% | 13% | 3% | 2% | 19% | ||||||
| T | G | C | C | G | G | A | C | 31% | 48% | 32% | 4% | 67% | 10% | 43% | 34% | 33% | 51% | |
The following populations were included: Somali in Northeastern Somalia (SOM), African Ancestry in Southwest USA (ASW), Northern Europeans from Utah (CEU), Han Chinese in Beijing, China (CHB), Gujarati Indians in Houston, Texas, USA (GIH), Japanese in Tokyo, Japan (JPT), Luhya in Webuye, Kenya (LWK), Mexican Ancestry in Los Angeles, CA, USA (MXL), Toscani in Italy (TSI), and Yoruba in Ibadan, Nigeria (YRI)
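Haplotype frequencies such as those tabulated above are, at their simplest, the share of phased chromosomes that carry each allele combination across the eight VKORC1 SNPs. The sketch below assumes the haplotypes are already phased and encoded as strings in the SNP order of the table; the record does not describe the phasing method, and the function name and counts are hypothetical.

```python
from collections import Counter

def haplotype_frequencies(haplotypes):
    """Return {haplotype: frequency}, most frequent first.

    haplotypes: iterable of allele strings, one per phased chromosome,
    e.g. "TGCCGGAC" for the eight SNPs in table order.
    """
    counts = Counter(haplotypes)
    total = sum(counts.values())
    return {hap: n / total for hap, n in counts.most_common()}

# Toy sample of four chromosomes; the counts are invented for illustration.
toy_sample = ["TGCCGGAC", "TGCCGGAC", "TGCCGGAC", "CAGCAGAC"]
toy_freqs = haplotype_frequencies(toy_sample)
```

Each diploid individual contributes two chromosomes, so a population of n individuals yields 2n haplotype strings.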
Fig. 5 Continental and subcontinental haplotypes in a DNA region of 1 Mbps encompassing the protein C (PROC) gene. Rows and columns represent haplotypes and SNPs, respectively. Haplotypes of the same color are identical
Fig. 6 Battleship plots displaying allele frequencies (AF) for drug response (left) and pathogenic (right) SNPs in different populations. The horizontal width of the rectangle (or the square) is proportional to the magnitude of allele frequency. A legend for AF is shown on the right side of each plot with the highest AF on top and the lowest AF at the foot
Fig. 7 Ancestry category distribution in the GWAS Catalog. The figure shows the distribution of ancestry categories in percentages of individuals (N = 110,291,046; left panel) and studies (N = 4655; right panel).
Source: https://www.ebi.ac.uk/gwas/docs/ancestry-data