Ngoc Hieu Tran1,2, Thanh Binh Vo1,3, Van Thong Nguyen4, Nhat-Thang Tran5, Thu-Huong Nhat Trinh6, Hong-Anh Thi Pham1,3, Thi Hong Thuy Dao1,3, Ngoc Mai Nguyen1,3, Yen-Linh Thi Van1,3, Vu Uyen Tran1,3, Hoang Giang Vu1,3, Quynh-Tram Nguyen Bui1,3, Phuong-Anh Ngoc Vo1,3, Huu Nguyen Nguyen1,3, Quynh-Tho Thi Nguyen3, Thanh-Thuy Thi Do3, Nien Vinh Lam7, Phuong Cao Thi Ngoc3,8, Dinh Kiet Truong3, Hoai-Nghia Nguyen9, Hoa Giang10,11, Minh-Duy Phan12,13.
Abstract
The under-representation of several ethnic groups in existing genetic databases and studies has undermined our understanding of genetic variation and its associated traits and diseases in many populations. Cost and technology limitations remain obstacles to large-scale genome sequencing projects in many developing countries, including Vietnam. As one of the most rapidly adopted genetic tests, non-invasive prenatal testing (NIPT) offers an alternative, untapped source of data for genetic studies. Here we performed a large-scale genomic analysis of 2,683 pregnant Vietnamese women using their NIPT data and identified a comprehensive set of 8,054,515 single-nucleotide polymorphisms, of which 8.2% were new to the Vietnamese population. Our study also revealed 24,487 disease-associated genetic variants and their allele frequency distribution, including 5 pathogenic variants for prevalent genetic disorders in Vietnam. We also observed major discrepancies in the allele frequency distribution of disease-associated genetic variants between the Vietnamese and other populations, highlighting the need for genome-wide association studies dedicated to the Vietnamese population. The resulting database of Vietnamese genetic variants, their allele frequency distribution, and their associated diseases is a valuable resource for future genetic studies.
Year: 2020 PMID: 33154511 PMCID: PMC7644705 DOI: 10.1038/s41598-020-76245-5
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Distributions of genome coverage and sequencing depth of the NIPT dataset. (a) Average genome coverage and sequencing depth per sample and from all samples combined. (b) Summary histogram of sequencing depth over all genome positions. (c) Distribution of sequencing depth per chromosome. (d) IGV tracks of sequencing depth, bwa MAPQ score, and Umap k50 mappability across the whole genome (the figure was produced using IGV, Integrative Genomics Viewer, version 2.8.9 [24]).
Figure 2. Summary of the NIPT call set. (a) Venn diagram comparing the NIPT call set, the KHV and EAS call sets from the 1000 Genomes Project, and the dbSNP database. Percentages are calculated with respect to the NIPT call set. (b) Allele frequency distribution of the NIPT call set. (c) Distribution of locations and effects of variants in the NIPT call set.
Figure 3. Principal component analysis of the NIPT call set and other East Asian populations. (a) Scatter plot comparing allele frequencies estimated from the NIPT and the KHV call sets. (b) Principal component analysis. (JPT: Japanese in Tokyo, Japan; CHB: Han Chinese in Beijing, China; CHS: Southern Han Chinese; CDX: Chinese Dai in Xishuangbanna, China; KHV: Kinh in Ho Chi Minh City, Vietnam; NIPT: Non-Invasive Prenatal Testing data.)
Summary of ClinVar annotations for the NIPT call set.
| ClinVar clinical significance | In KHV | Not in KHV |
|---|---|---|
| Benign or likely benign | 23,414 | 300 |
| Uncertain significance | 353 | 74 |
| Pathogenic or likely pathogenic | 4 | 1 |
| Drug response | 114 | 3 |
| Others | 211 | 13 |
| Total number of annotations | 24,096 | 391 |
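The column totals can be cross-checked against the 24,487 disease-associated variants reported in the abstract. The following is a minimal Python sketch of that arithmetic; the counts are copied from the table above, and the variable names are illustrative only, not taken from the study's own pipeline.

```python
# ClinVar annotation counts copied from the table above: (in KHV, not in KHV).
clinvar_counts = {
    "Benign or likely benign": (23414, 300),
    "Uncertain significance": (353, 74),
    "Pathogenic or likely pathogenic": (4, 1),
    "Drug response": (114, 3),
    "Others": (211, 13),
}

# Sum each column, then combine them into the overall annotation count.
in_khv = sum(a for a, _ in clinvar_counts.values())
not_in_khv = sum(b for _, b in clinvar_counts.values())
total = in_khv + not_in_khv

# Share of annotated variants that are absent from the KHV call set.
share_not_in_khv = 100 * not_in_khv / total

print(in_khv, not_in_khv, total)
print(f"{share_not_in_khv:.1f}% of annotated variants are not in KHV")
```

The two column sums reproduce the "Total number of annotations" row, and their sum matches the figure quoted in the abstract.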
Pathogenic variants identified from the NIPT call set.
| Chr | Position | dbSNP | Ref | Alt | ClinVar ID | Gene | Conditions | NIPT AF (%) | gnomAD EAS AF (%) | gnomAD AF (%) |
|---|---|---|---|---|---|---|---|---|---|---|
| chr18 | 57,571,588 | rs2272783 | A | G | 562 | FECH | Erythropoietic protoporphyria | 28.10 | 32.57 | 11.23 |
| chr13 | 20,189,473 | rs72474224 | C | T | 17023 | GJB2 | Nonsyndromic hearing loss and deafness | 13.40 | 8.35 | 0.76 |
| chr13 | 72,835,359 | rs17089782 | G | A | 217689 | PIBF1 | Joubert syndrome | 6.80 | 5.13 | 1.36 |
| chr6 | 26,090,951 | rs1799945 | C | G | 10 | HFE | Hemochromatosis type 1 | 5.10 | 3.41 | 10.82 |
| chr2 | 31,529,325 | rs9332964 | C | T | 3351 | SRD5A2 | 5-alpha reductase deficiency | 2.90 | 0.67 | 0.05 |
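One simple way to quantify the discrepancies in allele frequency between populations is the ratio of the NIPT frequency to the global gnomAD frequency for each variant. The sketch below is an illustration only: the values are copied from the table above, the helper `fold_vs_global` is a hypothetical name, and a plain ratio of percentages ignores sampling error, so it should not be read as the study's statistical method.

```python
# Allele frequencies (%) copied from the table above:
# (NIPT, gnomAD EAS, gnomAD global).
pathogenic_af = {
    "rs2272783 (FECH)": (28.10, 32.57, 11.23),
    "rs72474224 (GJB2)": (13.40, 8.35, 0.76),
    "rs17089782 (PIBF1)": (6.80, 5.13, 1.36),
    "rs1799945 (HFE)": (5.10, 3.41, 10.82),
    "rs9332964 (SRD5A2)": (2.90, 0.67, 0.05),
}


def fold_vs_global(nipt_af, global_af):
    """Hypothetical helper: ratio of NIPT to global gnomAD allele frequency."""
    return nipt_af / global_af


# Fold difference per variant, ignoring the gnomAD EAS column.
folds = {
    variant: fold_vs_global(nipt, global_af)
    for variant, (nipt, _eas, global_af) in pathogenic_af.items()
}

# Print variants from most to least enriched in the NIPT call set.
for variant, fold in sorted(folds.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{variant}: {fold:.1f}x")
```

Under this crude measure, the SRD5A2 variant shows the largest enrichment in the NIPT call set relative to global gnomAD, while the HFE variant is the only one of the five that is less frequent in the NIPT data.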