Woei-Yuh Saw, Erwin Tantoso, Husna Begum, Lihan Zhou, Ruiyang Zou, Cheng He, Sze Ling Chan, Linda Wei-Lin Tan, Lai-Ping Wong, Wenting Xu, Don Kyin Nwe Moong, Yenly Lim, Bowen Li, Nisha Esakimuthu Pillai, Trevor A Peterson, Tomasz Bielawny, Peter J Meikle, Piyushkumar A Mundra, Wei-Yen Lim, Ma Luo, Kee-Seng Chia, Rick Twee-Hee Ong, Liam R Brunham, Chiea-Chuen Khor, Heng Phon Too, Richie Soong, Markus R Wenk, Peter Little, Yik-Ying Teo.
Abstract
The Singapore Integrative Omics Study provides valuable insights into establishing population reference measurements in 364 Chinese, Malay, and Indian individuals. These measurements include more than 2.5 million genetic variants, expression levels of 21,649 transcripts, quantification of 282 lipid species, and 284 clinical, lifestyle, and dietary variables. This concept paper introduces the depth of the data resource and investigates the extent of ethnic variation in these omics and non-omics biomarkers. Each platform contains specific biomarkers that differentiate between the ethnicities, and intra-population analyses suggest that, of the three groups, Chinese are the most biologically homogeneous and Indians the most heterogeneous. Consistent patterns of correlation between lipid species also suggest the possibility of lipid tagging to simplify future lipidomics assays. The Singapore Integrative Omics Study is expected to allow the characterization of intra-omic and inter-omic correlations within and across all three ethnic groups through a systems biology approach.
The Singapore Genome Variation projects characterized the genetics of Singapore's Chinese, Malay, and Indian populations. The Singapore Integrative Omics Study introduced here goes further by providing multi-omic measurements in individuals from these populations, including genetic, transcriptome, lipidome, and lifestyle data, and will facilitate the study of common diseases in Asian communities.
Year: 2017 PMID: 28935855 PMCID: PMC5608948 DOI: 10.1038/s41467-017-00413-x
Source DB: PubMed Journal: Nat Commun ISSN: 2041-1723 Impact factor: 14.919
Spectrum of omics and non-omics measurements available in the iOmics
| Category | Details | Sample size per ethnicity (C/M/I) |
|---|---|---|
| Genomics | Illumina 2.5M microarray genotyping | 110/108/105 |
| Genomics | Illumina exome chip genotyping | 110/108/105 |
| Genomics | Pharmacogenomics SNP typing (4032 SNPs) | 106/112/115 |
| Genomics | HLA typing (-A, -B, -C, -DPA, -DPB, -DQA, -DQB, -DRB) | 111/119/120 |
| Genomics | Deep (30×) whole-genome sequencing | 0/62/38 |
| Lipidomics | Mass spectrometry with Multiple Reaction Monitoring of 282 lipid molecules in three major lipid classes (glycerophospholipids, sphingolipids, sterols) | 122/117/120 |
| Transcriptomics | Affymetrix HumanGene 1.0 ST array | 98/75/96 |
| MicroRNA | mSMRT-qPCR miRNA assay of 274 circulating miRNAs | 117/115/119 |
| Nutrition | Validated interviewer-directed Food Frequency Questionnaire (199 dietary variables) | 122/116/120 |
| Lifestyle and environment | Interviewer-directed questionnaire, including smoking, alcohol consumption, and physical activity (46 lifestyle variables) | 122/116/120 |
| Clinical measurements | Clinically assessed measurements and assays, including age, sex, height, weight, BMI, HDLc, LDLc, TG, BP, total cholesterol, HbA1c, fasting glucose (39 clinical variables) | 122/116/120 |
Note: The sample sizes stated here refer to the number of subjects that remained after assessment for data quality
Fig. 1 PCAs of omics and clinical/lifestyle/diet data. Biplots are shown for six distinct PCAs, each using the first two axes of variation from the respective PCA. The six PCAs correspond to the analysis of: a 101,099 autosomal SNPs pseudo-randomly chosen to minimize linkage disequilibrium between the SNPs; b 21,649 gene transcript probesets; c 274 miRNAs; d 282 lipid species; e a set of 284 clinical, lifestyle, and dietary variables; and f only the 199 dietary variables. Each circle represents an individual from the iOmics and is assigned a color corresponding to the self-reported ethnicity of the subject, according to the color legend at the top right of panel a
The six candidate pharmacogenomic variants most differentiated between the three ethnic groups
| SNP | CHR | POS | Alleles | Gene region | Frequency (Chinese) | Frequency (Malay) | Frequency (Indian) | Clinical PGx implication (a) | Wright FST | P-value |
|---|---|---|---|---|---|---|---|---|---|---|
| rs2359612 | 16 | 31011297 | A/G | VKORC1, intron | 0.118 (G) | 0.263 (G) | 0.900 (G) | (i) Patients with the AA genotype who are treated with warfarin may require the lowest dose as compared to patients with the AG or GG genotype (ii) Patients with the AG genotype who are treated with warfarin may require a lower dose as compared to patients with the GG genotype (iii) Patients with the GG genotype who are treated with warfarin may require a higher dose as compared to patients with the AG or AA genotype | 0.471 | 1.23E-05 |
| rs749671 | 16 | 30995848 | A/G | VKORC1 ZNF646, coding SYN | 0.118 (G) | 0.273 (G) | 0.900 (G) | NA | 0.466 | 1.31E-05 |
| rs8050894 | 16 | 31012010 | C/G | VKORC1, intron | 0.118 (C) | 0.263 (C) | 0.868 (C) | (i) Patients with the CC genotype who are treated with warfarin may require a higher dose as compared to patients with the CG or GG genotype (ii) Patients with the CG genotype who are treated with warfarin may require a lower dose as compared to patients with the GG genotype (iii) Patients with the GG genotype who are treated with warfarin may require the lowest dose as compared to patients with the CG or CC genotype | 0.434 | 2.33E-05 |
| rs7294 | 16 | 31009822 | C/T | VKORC1, flanking UTR | 0.109 (T) | 0.272 (T) | 0.822 (T) | (i) Patients with the CC genotype who are treated with warfarin may require a lower dose as compared to patients with the CT or TT genotype (ii) Patients with the CT genotype who are treated with warfarin may require a higher dose as compared to patients with the CC genotype (iii) Patients with the TT genotype who are treated with warfarin may require a higher dose as compared to patients with the CC genotype | 0.387 | 4.79E-05 |
| rs1238741 | 4 | 100202335 | C/T | ADH4/5, flanking UTR | 0.179 (T) | 0.272 (T) | 0.865 (T) | NA | 0.375 | 5.78E-05 |
| rs11974407 | 7 | 20695644 | C/G | ABCB5, intron | 0.038 (C) | 0.165 (C) | 0.670 (C) | NA | 0.361 | 7.08E-05 |
(a) Information was retrieved from PharmGKB®; only clinical implications with level 1, 2a, or 2b were retrieved
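To make the Wright FST column concrete, here is a minimal Python sketch of one common way to compute FST for a biallelic SNP from per-population allele frequencies: FST = 1 − H_S / H_T, where H_S is the mean within-population expected heterozygosity and H_T is the expected heterozygosity at the mean allele frequency. The function name `wright_fst`, the equal weighting of the three populations, and the absence of any sample-size correction are assumptions made for illustration; the estimator actually used by the authors is not stated in this section.

```python
def wright_fst(freqs):
    """Wright's FST for one biallelic SNP from per-population allele frequencies.

    Assumption: populations are weighted equally and no sample-size
    correction is applied (hypothetical simplification).
    """
    n = len(freqs)
    p_bar = sum(freqs) / n
    # Expected heterozygosity of the pooled population.
    h_total = 2 * p_bar * (1 - p_bar)
    # Mean expected heterozygosity within each population.
    h_within = sum(2 * p * (1 - p) for p in freqs) / n
    if h_total == 0:
        return 0.0  # monomorphic everywhere: no differentiation to measure
    return 1 - h_within / h_total


# Allele frequencies for rs2359612 (G allele) in Chinese, Malay, Indian.
fst_rs2359612 = wright_fst([0.118, 0.263, 0.900])
print(f"{fst_rs2359612:.2f}")
```

With the rounded frequencies from the table, this simple estimator gives values close to the tabulated FST for rs2359612 and rs11974407, which suggests (but does not prove) that a similar definition was used.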
Fig. 2 Distribution of the Wright FST value across three ethnic groups at the eight HLA loci. The alleles shown in the plot are the three alleles with the highest FST at each HLA locus. Triangles indicate HLA alleles whose FST values are driven by differences between Chinese and Indians. Diamonds indicate HLA alleles whose FST values are driven by differences between Chinese and Malays. Squares indicate HLA alleles whose FST values are driven by differences between Malays and Indians. Shapes with a red outline represent drug-associated HLA alleles[41] (Table 3)
Drug-associated HLA alleles
| HLA allele | Drug | Adverse reaction | Allele frequency, Chinese (%) | Allele frequency, Malay (%) | Allele frequency, Indian (%) | Wright FST value |
|---|---|---|---|---|---|---|
| A*31:01 | Carbamazepine | Rash | 0.9 | 0 | 5.0 | 0.025 |
| B*15:02 | Carbamazepine, Phenytoin | SJS | 7.4 | 12.4 | 3.8 | 0.017 |
| B*13:01 | Dapsone | HSS | 10.2 | 3.0 | 0.8 | 0.036 |
| B*38:02 | Sulfamethoxazole | SJS/TEN | 4.6 | 4.3 | 2.9 | 0.001 |
| B*57:01 | Abacavir, Flucloxacillin | HSS, DILI | 0 | 0.9 | 7.5 | 0.041 |
| B*58:01 | Allopurinol | SJS | 10.2 | 3.0 | 0.8 | 0.036 |
DILI drug-induced liver injury, HSS hypersensitivity syndrome, SJS Stevens–Johnson syndrome, TEN toxic epidermal necrolysis
Fig. 3 A combined boxplot and scatter plot of the top three significant transcript probesets across three populations. The combined plots show the distribution of transcript intensities for a 7912136:UTS2 gene, b 8041061:PLB1 gene, and c 8102362:TIFA gene across the three populations. P-values were calculated using ANOVA, adjusted for batch effect and gender, and Bonferroni-corrected. The upper whisker extends to the maximum value observed or to 1.5 times the interquartile range above the third quartile, whichever is smaller. The lower whisker extends to the minimum value observed or to 1.5 times the interquartile range below the first quartile, whichever is greater. The details of the significant transcript probesets across three populations can be found in Supplementary Data 3
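The whisker rule in the Fig. 3 caption can be sketched in a few lines of Python. This is an illustrative reading of the caption, not the authors' plotting code: the function name `caption_whiskers`, the choice of the "inclusive" quartile method, and the sample intensities are all assumptions.

```python
import statistics


def caption_whiskers(values):
    """Return (lower, upper) whisker positions as described in the caption.

    Assumption: quartiles use the 'inclusive' interpolation method; other
    quartile definitions shift the result slightly.
    """
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    # Upper whisker: the smaller of the observed maximum and Q3 + 1.5 * IQR.
    upper = min(max(values), q3 + 1.5 * iqr)
    # Lower whisker: the greater of the observed minimum and Q1 - 1.5 * IQR.
    lower = max(min(values), q1 - 1.5 * iqr)
    return lower, upper


# Hypothetical transcript intensities with one high outlier.
intensities = [1, 2, 3, 4, 5, 6, 7, 8, 9, 30]
lower, upper = caption_whiskers(intensities)
print(lower, upper)
```

In this made-up sample the value 30 lies beyond the upper whisker, so a boxplot would draw it as an individual outlying point.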
The five most differentiated miRNAs between the three ethnic groups after adjustment for RT plate effect
| miRNA | P-value | lsmChinese | lsmMalay | lsmIndian | FC (Malay–Chinese) | FC (Indian–Chinese) |
|---|---|---|---|---|---|---|
| hsa_miR_4732_3p | 9.50E-04 | 18.01 | 18.17 | 17.42 | 1.12 | 0.67 |
| hsa_miR_375 | 9.40E-03 | 18.04 | 17.52 | 17.38 | 0.70 | 0.63 |
| hsa_miR_140_3p | 1.10E-02 | 21.82 | 22.00 | 21.40 | 1.13 | 0.75 |
| hsa_miR_378a_3p | 2.92E-02 | 20.81 | 20.87 | 20.48 | 1.04 | 0.79 |
| hsa_miR_378a_5p | 3.13E-02 | 15.10 | 15.35 | 14.69 | 1.19 | 0.75 |
Note: The least squares mean (lsm) was calculated for each ethnic group, and the fold change (FC) was calculated with respect to Chinese. Because the data were log2-transformed, log2 FC(Malay–Chinese) = lsmMalay − lsmChinese, and therefore FC = 2^(log2 FC(Malay–Chinese))
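The fold-change calculation described in the note can be checked with a short Python sketch. The function name `fold_change` is hypothetical; the input values are the rounded least squares means from the table, so the results agree with the tabulated fold changes only to within rounding.

```python
def fold_change(lsm_group, lsm_reference):
    """Fold change of a group relative to a reference on log2-scale data.

    The difference of two log2-scale means is log2(FC), so FC = 2 ** difference.
    """
    log2_fc = lsm_group - lsm_reference
    return 2 ** log2_fc


# hsa_miR_375: lsmChinese = 18.04, lsmMalay = 17.52, lsmIndian = 17.38
fc_malay = fold_change(17.52, 18.04)
fc_indian = fold_change(17.38, 18.04)
print(f"{fc_malay:.2f} {fc_indian:.2f}")
```

A fold change below 1 means the group's mean is lower than the Chinese reference, and a value above 1 means it is higher.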
Fig. 4 Correlation heatmap of the 282 lipids in each ethnic group. The correlation heatmaps between the 282 lipids are shown for a Chinese; b Malays; and c Indians. The correlation was calculated from the lipid concentrations using Pearson’s correlation, r². The lipids are first grouped by lipid category and then by lipid class. Within each lipid class, the lipid species are ordered according to their carbon chain length and degree of unsaturation (number of double bonds). The intensity of the color reflects the magnitude of the correlation, in which white means r² = 0 and red means r² = 1
Fig. 5 Distribution of the differences and correlation of the 16 significantly differentiated clinical phenotypes across three populations. The distribution of the differences of the 16 significant clinical phenotypes across three populations is shown together with a correlation heatmap of the phenotypes. The correlation was calculated using Pearson’s correlation, r², across 358 samples. The intensity of the color reflects the magnitude of the correlation, in which white means r² = 0 and red means r² = 1. The details of the differences can be found in Supplementary Table 9
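The heatmaps in Figs. 4 and 5 are colored by the squared Pearson correlation, r². The following Python sketch shows how r² could be computed for one pair of variables; the function name `pearson_r2` and the example concentrations are hypothetical, and the sketch is not the authors' analysis code.

```python
import math


def pearson_r2(x, y):
    """Squared Pearson correlation between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant variable")
    r = cov / math.sqrt(var_x * var_y)
    return r * r


# Hypothetical concentrations of two lipid species in four individuals.
lipid_a = [1.0, 2.0, 3.0, 4.0]
lipid_b = [8.0, 6.0, 4.0, 2.0]
print(round(pearson_r2(lipid_a, lipid_b), 6))
```

Because r is squared, a perfectly inverse relationship such as the one above is colored the same as a perfectly positive one, so the heatmaps show the strength of a correlation but not its direction.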