Seizo Koshiba1,2, Ikuko Motoike1,3, Kaname Kojima1,2, Takanori Hasegawa1,2, Matsuyuki Shirota1,2, Tomo Saito1,2, Daisuke Saigusa1,2, Inaho Danjoh1,2, Fumiki Katsuoka1,2, Soichi Ogishima1,2, Yosuke Kawai1,2, Yumi Yamaguchi-Kabata1,2, Miyuki Sakurai1, Sachiko Hirano1, Junichi Nakata1, Hozumi Motohashi1,4, Atsushi Hozawa1,2, Shinichi Kuriyama1,2, Naoko Minegishi1,2, Masao Nagasaki1,2,3, Takako Takai-Igarashi1,2, Nobuo Fuse1,2, Hideyasu Kiyomoto1,2, Junichi Sugawara1,2, Yoichi Suzuki1,2, Shigeo Kure1,2, Nobuo Yaegashi1,2, Osamu Tanabe1,2, Kengo Kinoshita1,3,4, Jun Yasuda1,2, Masayuki Yamamoto1,2.
Abstract
The relationship between structural variants of enzymes and metabolic phenotypes in a human population was investigated through an association study of metabolite quantitative traits with whole-genome sequence data for 512 individuals from a population cohort. We identified five significant associations between metabolites and non-synonymous variants. Four of these non-synonymous variants are located in enzymes involved in metabolic disorders, and structural analyses of these moderate non-synonymous variants show that they lie in peripheral regions of the catalytic sites or in related regulatory domains. In contrast, two individuals with larger changes in metabolite levels were also identified; these individuals carried rare variants that cause non-synonymous changes near the catalytic site. These results are the first demonstration that variant frequency, structural location, and phenotypic effect correlate with one another in a human population, and they imply that metabolic individuality and disease susceptibility may arise from both moderate variants and much more deleterious but rare variants.
Year: 2016 PMID: 27528366 PMCID: PMC4985752 DOI: 10.1038/srep31463
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Study population.
(a) Sample characteristics in this study. (b,c) Distributions of leucine (b) and glutamine (c) concentrations in plasma. Red and blue bars represent females and males, respectively.
Figure 2. Correlation analyses.
(a) Correlations between the quantified metabolites. Each node represents a metabolite, and each edge represents a metabolite pair with log10(P-value) < −10. Positive and negative correlations are shown as black and red lines, respectively. Thicker lines represent stronger correlations between two metabolite levels. This figure was generated using Cytoscape v3.2.1 (ref. 42). (b) Correlations between time after eating and the quantified metabolites. Each node represents a metabolite or time after eating, and each edge represents a pair with log10(P-value) < −10. Positive and negative correlations are represented as in (a). (c) Correlation between the plasma glucose concentration quantified by NMR and that obtained from the blood test.
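The edge rule described in the Figure 2 caption can be sketched in a few lines. This is a minimal illustration only: the function name `build_edges`, the metabolite labels, and the correlation and P-values are hypothetical placeholders, not the study's data or code.

```python
import math

# Hypothetical pairwise results: (metabolite_1, metabolite_2, correlation r, P-value).
# These numbers are placeholders for illustration, not values from the study.
pairs = [
    ("metabolite_A", "metabolite_B", 0.62, 1e-40),
    ("metabolite_A", "metabolite_C", -0.35, 1e-12),
    ("metabolite_B", "metabolite_C", 0.10, 2e-3),
]

def build_edges(pairs, log10_p_cutoff=-10.0):
    """Keep pairs with log10(P) below the cutoff and style them as in the caption."""
    edges = []
    for a, b, r, p in pairs:
        if math.log10(p) < log10_p_cutoff:
            edges.append({
                "source": a,
                "target": b,
                "color": "black" if r > 0 else "red",  # positive vs negative correlation
                "width": abs(r),  # thicker line for a stronger correlation
            })
    return edges

edges = build_edges(pairs)
```

With these placeholder inputs, only the first two pairs pass the log10(P-value) < −10 filter, and the negative correlation is drawn in red.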
Table 1. Genome-wide significant loci associated with metabolites.
| Metabolite | Sex | −log10 P | SNP | Chr. | Gene | Allele | MAF | Residue change |
|---|---|---|---|---|---|---|---|---|
| asparagine | all | 14.19 | rs8012505 | 14 | ASPG | C > G | 0.127 | S344R |
| phenylalanine | all | 12.54 | rs118092776 | 12 | PAH | C > T | 0.047 | R53H |
| proline | all | 14.88 | rs5747933 | 22 | PRODH | G > T | 0.148 | T275N |
| glycine | all | 50.66 | rs1047891 | 2 | CPS1 | C > A | 0.152 | T1406N |
| formate | all | 8.34 | rs1801133 | 1 | MTHFR | C > T | 0.397 | A222V |

Chr., chromosome; MAF, minor allele frequency.
| MAF | ||||||||
| Gene | ASPG | PAH | PRODH | CPS1 | MTHFR | MYLK | ||
| Allele | C > G | C > T | G > T | C > A | C > T | G > A | ||
| Non-synonymous variant | S344R | R53H | T275N | T1406N | A222V | |||
| ToMMo | 0.127 | 0.047 | 0.148 | 0.152 | 0.397 | 0.020 | ||
| CSHL-HAPMAP | HapMap-JPT | Japan | 0.852*1 | 0.153 | 0.360 | 0.017 | ||
| HapMap-HCB | China | 0.909*1 | 0.159 | 0.512 | 0.058 | |||
| HapMap-CEU | European (Utah, USA) | 0.817*1 | 0.295 | 0.310 | 0.808 | |||
| HapMap-YRI | Sub-Saharan African | 0.567*1 | 0.283 | 0.093 | 0.044 | |||
| HAPMAP-ASW | African ancestry in Southwest USA. | 0.388 | 0.112 | 0.214 | ||||
| 1000 GENOMES (pilot 1) | YRI_low_coverage_panel | 0.051 | 0.186 | 0.119 | 0.008 | |||
| CEU_low_coverage_panel | 0.175 | 0.083 | 0.300 | 0.275 | 0.800 | |||
| CHB + JPT_low_coverage_panel | 0.117 | 0.025 | 0.158 | 0.100 | 0.433 | 0.042 | ||
MAFs for HAPMAP and 1000 GENOMES were obtained from the NCBI dbSNP database (http://www.ncbi.nlm.nih.gov/SNP/).
*1: The minor allele for ASPG in CSHL-HAPMAP appears to be defined opposite to that in 1000 GENOMES and in our study.
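If the *1 footnote is taken at face value, the flagged CSHL-HAPMAP frequencies for ASPG refer to the other allele, so the comparable value is the complement, 1 − f. The sketch below applies that conversion to the four *1 values in the table; the helper name `harmonize_maf` is hypothetical, and the flip itself is an assumption based on the footnote rather than a confirmed correction.

```python
def harmonize_maf(freq, flipped):
    """Return the frequency of the allele used as 'minor' in the reference dataset.

    If the source labels the opposite allele, the comparable frequency
    is the complement, 1 - freq (rounded to three decimals).
    """
    return round(1.0 - freq, 3) if flipped else freq

# ASPG frequencies flagged with *1 in the CSHL-HAPMAP rows of the table above.
hapmap_aspg = {
    "HapMap-JPT": 0.852,
    "HapMap-HCB": 0.909,
    "HapMap-CEU": 0.817,
    "HapMap-YRI": 0.567,
}

# Assumption: all four flagged values use the opposite allele definition.
harmonized = {panel: harmonize_maf(f, flipped=True) for panel, f in hapmap_aspg.items()}
```

After the flip, the HapMap-JPT value becomes 0.148, which is on the same scale as the ToMMo frequency of 0.127 for the same variant.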
Figure 3. Association study of metabolomics and genomics.
(a) Manhattan plots for metabolic traits. The strength of association with plasma metabolite concentrations for the five loci is shown based on the results of the association study for all 512 samples (Table 1). The line indicates a suggestive genome-wide significance level at a P-value of 7.08 × 10⁻⁹. (b–f) Regional association plots for the significant loci reported in Table 1. The statistical significance of associated SNPs is plotted on the −log10(P-value) scale as a function of chromosomal position (NCBI build 37). The identified causal SNP at each locus is shown in purple. Correlation of the causal SNP with other SNPs at each locus is shown on a scale from minimal (blue) to maximal (red). Estimated recombination rates are also shown.
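As a quick arithmetic check, the significance line in the Manhattan plots can be placed on the same scale as the −log P column of Table 1. The sketch below assumes that column is base 10, as the regional plots are; the variable names are illustrative.

```python
import math

P_THRESHOLD = 7.08e-9  # significance level quoted in the Figure 3 caption
threshold = -math.log10(P_THRESHOLD)  # position of the line on the -log10 scale

# -log P values for the five loci, copied from Table 1 (assumed to be base 10).
loci = {
    "asparagine": 14.19,
    "phenylalanine": 12.54,
    "proline": 14.88,
    "glycine": 50.66,
    "formate": 8.34,
}

# Loci whose association strength lies above the significance line.
significant = {metabolite: v for metabolite, v in loci.items() if v > threshold}
```

The threshold works out to roughly 8.15 on the −log10 scale, so all five loci clear it, with formate (8.34) the closest to the line.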
Figure 4. Distribution of the plasma metabolite concentration.
The distribution of plasma metabolite concentrations across the genotypes at each locus summarized in Table 1 is shown as violin plots. The figures were made using an R package.
Figure 5. Mapping of the five non-synonymous variants onto the reported crystal structures of the corresponding enzymes.
Ribbon models are shown for (a) the guinea pig L-asparaginase 1 catalytic domain (Protein Data Bank (PDB) ID: 4R8K), (b) rat phenylalanine hydroxylase (PAH) (PDB ID: 2PHM), (c) two tetramer model structures of PAH, the low-activity tetramer and the high-activity tetramer, (d) the E. coli PutA proline dehydrogenase domain (PDB ID: 2FZM), (e) the human carbamoyl-phosphate synthase 1 (CPS1) regulatory domain (PDB ID: 2YVQ), and (f) E. coli methylenetetrahydrofolate reductase (MTHFR) (PDB ID: 1B5T). Catalytic domains are depicted in green and regulatory domains in cyan. In the PAH structure, the C-terminal tetramerization domain is depicted in yellow. The residue corresponding to the position of each non-synonymous variant is shown as a sphere model. The FAD cofactors of PRODH and MTHFR are shown as stick models. The catalytic site of each enzyme is indicated by a red arrow. The coordinates of the human PAH tetramers were obtained from the supplementary data of a previous report (ref. 15). All figures were made using the PyMOL Software Package (https://www.pymol.org/).
Figure 6. Relationship among variant, allele frequency, and structure.
(a) Scatter plot of the concentration distribution of plasma phenylalanine. Heterozygotes for rs118092776, rs79931499, and rs746203167 are colored orange, red, and pink, respectively. All others are colored grey. (b) Mapping of the three non-synonymous variants onto the structure of rat PAH. The three residues corresponding to the positions of the non-synonymous variants are shown as sphere models. (d) Schematic view of the relationship among variant, allele frequency, and structure. We propose a new category of variant, "omic", comprising variants that partially influence the omics environments of individuals.
Prediction of functional effects of the SNPs.
| Metabolite | SNP | Gene | Allele | Residue change | CADD impact | CADD phred score | PolyPhen-2 prediction | PolyPhen-2 score | Provean prediction (cutoff = 0.05) | Provean score | SIFT prediction | SIFT score |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| asparagine | rs8012505 | ASPG | C > G | S344R | Moderate, modifier | 15.87 | probably damaging | 0.988 | Damaging | 0.016 | Damaging | 0.03 |
| proline | rs5747933 | PRODH | G > T | T275N | Moderate, modifier | 7.274 | no data | | Tolerated | 0.636 | Tolerated | 0.5 |
| phenylalanine | rs118092776 | PAH | C > T | R53H | Moderate, modifier | 17.68 | no data | | Damaging | 0.007 | Damaging | 0.01 |
| glycine | rs1047891 | CPS1 | C > A | T1406N | Moderate, modifier | 14.03 | benign | 0.009 | Tolerated | 0.235 | Tolerated | 0.25 |
| formate | rs1801133 | MTHFR | C > T | A222V | Moderate | 21.9 | probably damaging | 0.998 | Deleterious | −3.76 | Damaging | 0.05 |
| creatinine | rs820336 | MYLK | G > A | intron | Modifier | 4.782 | | | | | | |
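
The table header mentions a cutoff of 0.05. As a hedged consistency check, the sketch below assumes that this cutoff applies to the SIFT scores in the last column with an inclusive comparison (score ≤ 0.05 is called "Damaging") and tests whether that rule reproduces the SIFT predictions listed in the table. The function name `sift_call` and the inclusive comparison are assumptions, not the authors' stated procedure.

```python
SIFT_CUTOFF = 0.05  # assumed cutoff; scores at or below it are called "Damaging"

def sift_call(score, cutoff=SIFT_CUTOFF):
    """Classify a SIFT score under the assumed inclusive cutoff."""
    return "Damaging" if score <= cutoff else "Tolerated"

# (SNP, SIFT score, SIFT prediction) copied from the last two columns of the table.
rows = [
    ("rs8012505", 0.03, "Damaging"),
    ("rs5747933", 0.5, "Tolerated"),
    ("rs118092776", 0.01, "Damaging"),
    ("rs1047891", 0.25, "Tolerated"),
    ("rs1801133", 0.05, "Damaging"),
]

# SNPs whose tabulated label disagrees with the assumed rule.
mismatches = [snp for snp, score, label in rows if sift_call(score) != label]
```

Under this assumption no row disagrees with the table; the MTHFR variant sits exactly at the cutoff, which is why the comparison is written as inclusive.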