Alexandra Moura, Michael A Savageau, Rui Alves.
Abstract
BACKGROUND: Identifying organism-environment interactions at the molecular level is crucial to understanding how organisms adapt to and change the chemical and molecular landscape of their habitats. In this work we investigated whether relative amino acid compositions could be used as a molecular signature of an environment and whether such a signature could also be observed at the level of the cellular amino acid composition of the microorganisms that inhabit that environment. METHODOLOGIES/PRINCIPAL
Year: 2013 PMID: 24204807 PMCID: PMC3808408 DOI: 10.1371/journal.pone.0077319
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Average environmental relative amino acid abundance (eRAAA) across habitats calculated from the literature.
| Amino acid | Aquatic (mean ±SD) | Terrestrial (mean ±SD) | Host-associated (mean ±SD) | All environments (mean ±SD) |
| | 0.1167 ±0.05315 | 0.1144 ±0.01689 | 0.0602 ±0.00516 | 0.1113 ±0.04662 |
| | 0.0378 ±0.03249 | 0.0459 ±0.03462 | 0.0504 ±0.01340 | 0.0409 ±0.03191 |
| | 0.0896 ±0.04352 | 0.1435 ±0.04755 | 0.0618 ±0.00557 | 0.1009 ±0.04980 |
| | 0.0005 ±0.00265 | 0.0015 ±0.00323 | 0.0351 ±0.00332 | 0.0037 ±0.01004 |
| | 0.0883 ±0.04270 | 0.1696 ±0.13768 | 0.2039 ±0.01316 | 0.1187 ±0.08787 |
| | 0.2002 ±0.11806 | 0.0951 ±0.04960 | 0.0781 ±0.00580 | 0.1633 ±0.11176 |
| | 0.0134 ±0.02044 | 0.0328 ±0.03000 | 0.0199 ±0.00191 | 0.0189 ±0.02374 |
| | 0.0375 ±0.02426 | 0.0266 ±0.00944 | 0.0349 ±0.00138 | 0.0345 ±0.02075 |
| | 0.0604 ±0.02298 | 0.0439 ±0.02440 | 0.0723 ±0.00263 | 0.0572 ±0.02374 |
| | 0.0139 ±0.02318 | 0.0356 ±0.02159 | 0.0333 ±0.00376 | 0.0210 ±0.02386 |
| | 0.0032 ±0.00587 | 0.0055 ±0.00211 | 0.0174 ±0.00149 | 0.0050 ±0.00627 |
| | 0.0334 ±0.02062 | 0.0367 ±0.02188 | 0.0344 ±0.00219 | 0.0343 ±0.01995 |
| | 0.0157 ±0.03960 | 0.0039 ±0.01650 | 0.1174 ±0.00959 | 0.0213 ±0.04465 |
| | 0.1455 ±0.15386 | 0.0660 ±0.02238 | 0.0558 ±0.00245 | 0.1178 ±0.13120 |
| | 0.0595 ±0.01956 | 0.0538 ±0.01834 | 0.0440 ±0.00337 | 0.0567 ±0.01884 |
| | 0.0005 ±0.00129 | 0.0023 ±0.00549 | 0.0080 ±0.00037 | 0.0016 ±0.00359 |
| | 0.0179 ±0.01621 | 0.0563 ±0.05206 | 0.0126 ±0.00177 | 0.0272 ±0.03355 |
| | 0.0583 ±0.02954 | 0.0487 ±0.01737 | 0.0605 ±0.00371 | 0.0560 ±0.02583 |
Environmental determinations of Asp/Asn and Glu/Gln did not distinguish between the two amino acids in each pair; the members of each pair were therefore considered together in the analysis.
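The pooling described in the note above can be sketched in a few lines of Python: amounts for Asp and Asn, and for Glu and Gln, are summed before everything is rescaled to relative abundances. This is only an illustrative sketch. The pooled labels `Asx` and `Glx`, the function name `pool_and_normalize`, and the example amounts are assumptions made here, not names or values taken from the study.

```python
# Hypothetical helper: pool amino acids that environmental assays cannot
# tell apart, then convert the pooled amounts to relative abundances.
POOLS = {"Asp": "Asx", "Asn": "Asx", "Glu": "Glx", "Gln": "Glx"}  # assumed labels


def pool_and_normalize(amounts):
    """Return relative abundances (summing to 1) with Asp/Asn and Glu/Gln pooled."""
    pooled = {}
    for amino_acid, value in amounts.items():
        key = POOLS.get(amino_acid, amino_acid)  # unpooled residues keep their name
        pooled[key] = pooled.get(key, 0.0) + value
    total = sum(pooled.values())
    if total <= 0:
        raise ValueError("total amino acid amount must be positive")
    return {amino_acid: value / total for amino_acid, value in pooled.items()}


# Made-up amounts in arbitrary units, for illustration only.
example = {"Asp": 2.0, "Asn": 1.0, "Glu": 3.0, "Gln": 1.0, "Gly": 3.0}
relative = pool_and_normalize(example)
```

Applying the same pooling to proteome-derived compositions keeps them comparable with the environmental measurements, since both then use the same 18 categories.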
Figure 1. Characterization of different environments by their relative amino acid composition.
A) Scatter plot by Principal Component Analysis according to the type of environment; B) Hierarchical clustering analysis. The length of the branches represents the degree of dissimilarity between clusters. The x-axis of the heat map represents the 20 amino acids in alphabetical order of their three-letter codes. Determinations of Asp/Asn and Glu/Gln were considered together for the analysis, because environmental measurements did not distinguish between the two amino acids in each pair. The y-axis of the heat map represents the individual environments where amino acid abundance was determined. Over- and under-representation of amino acid residues in each environment are shown as green and red squares, respectively.
Spearman rank correlation coefficients between estimated amino acid compositions (based on CAI and δ predictors) and experimentally-determined amino acid abundances.
| Organism | Description | Correlation | p-value |
| | ƒaa | 0.783 | *** |
| | CAI | 0.789 | *** |
| | δ | | *** |
| | ƒaa | 0.846 | *** |
| | CAI | 0.837 | *** |
| | δ | | *** |
| | ƒaa | 0.854 | *** |
| | CAI | 0.847 | *** |
| | δ | | *** |
| | ƒaa | 0.775 | *** |
| | CAI | 0.743 | *** |
| | δ | | *** |
Values in bold indicate the strongest correlation.
ƒaa indicates unweighted amino acid frequency in the complete predicted proteome of an organism.
*** p<0.001.
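As a rough illustration of the two quantities used in the table above, the sketch below computes an unweighted amino acid frequency (ƒaa) from a list of protein sequences and a Spearman rank correlation between two composition vectors. It is a minimal pure-Python sketch, not the authors' pipeline: the function names, the toy sequences, and the example vectors are assumptions, the CAI and δ weightings are not reproduced, and no p-value is computed.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # one-letter codes of the 20 standard residues


def amino_acid_frequencies(proteins):
    """Unweighted frequency of each residue over all sequences (an f_aa-style estimate)."""
    counts = Counter()
    for sequence in proteins:
        counts.update(ch for ch in sequence.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no standard amino acids found")
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1  # mean of the 1-based positions i..j
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation, computed as the Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    mean_x, mean_y = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (var_x * var_y) ** 0.5


# Toy data, for illustration only.
frequencies = amino_acid_frequencies(["MKVLA", "MAAGK"])
predicted = [0.10, 0.05, 0.20, 0.01]
measured = [0.12, 0.04, 0.25, 0.02]
rho = spearman_rho(predicted, measured)
```

Because the coefficient depends only on rank order, it measures whether the predicted and measured compositions agree on which residues are abundant and which are rare, without requiring a linear relationship between the two.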
Figure 2. Characterization of the relative amino acid composition of the proteomes from different organisms.
A) Scatter plot by Principal Component Analysis according to the type of environment; B) Hierarchical clustering analysis. The length of the branches represents the degree of dissimilarity between clusters. The x-axis of the heat map represents the 20 amino acids in alphabetical order of their three-letter codes. The y-axis of the heat map represents the individual organisms for which amino acid abundance was estimated. Over- and under-representation of amino acid residues in each organism are shown as green and red squares, respectively.
Linear regression models for the effect of GC content, Phylogeny and Habitat on the relative cellular amino acid abundance.
| Amino acid | Variable | Function | Adjusted R² | p-value |
| Ala | %GC | 12.3823+416.765x | 0.888 | *** |
| Ala | Phylogeny | 14.78−0.209605x | 0.019 | *** |
| Ala | Habitat | 3.28+0.0541254x | 0.030 | *** |
| Arg | %GC | 10.0666+708.622x | 0.892 | *** |
| Arg | Phylogeny | 14.78−0.0953123x | 0.003 | n.s. |
| Arg | Habitat | 3.28+0.0627061x | 0.041 | *** |
| Asn | %GC | 77.0024−682.993x | 0.815 | *** |
| Asn | Phylogeny | 14.78+0.213086x | 0.034 | *** |
| Asn | Habitat | 3.28+0.0618256x | 0.038 | *** |
| Asp | %GC | 18.0057+587.185x | 0.085 | *** |
| Asp | Phylogeny | 14.78−0.0699774x | 0.018 | *** |
| Asp | Habitat | 3.28+0.052295x | 0.028 | *** |
| Cys | %GC | 53.5793−362.54x | 0.009 | *** |
| Cys | Phylogeny | 14.78−0.0835933x | 0.028 | *** |
| Cys | Habitat | 3.28+0.0504854x | 0.025 | *** |
| Gln | %GC | 50.0219−13.8308x | −0.001 | *** |
| Gln | Phylogeny | 14.78−0.0940546x | 0.036 | *** |
| Gln | Habitat | 3.28+0.0485648x | 0.023 | *** |
| Glu | %GC | 79.8505−472.873x | 0.134 | *** |
| Glu | Phylogeny | 14.78−0.162344x | 0.096 | *** |
| Glu | Habitat | 3.28+0.0319903x | 0.009 | *** |
| Gly | %GC | −16.8867+945.005x | 0.839 | *** |
| Gly | Phylogeny | 14.78+0.420635x | 0.120 | *** |
| Gly | Habitat | 3.28+0.0704056x | 0.046 | *** |
| His | %GC | 13.4426+1730.45x | 0.230 | *** |
| His | Phylogeny | 14.78−0.194095x | 0.122 | *** |
| His | Habitat | 3.28+0.034075x | 0.010 | *** |
| Ile | %GC | 86.7595−563.15x | 0.849 | *** |
| Ile | Phylogeny | 14.78−0.453057x | 0.130 | *** |
| Ile | Habitat | 3.28+0.0538274x | 0.026 | *** |
| Leu | %GC | 1.94247+470.982x | 0.077 | *** |
| Leu | Phylogeny | 14.78−0.11572x | 0.051 | *** |
| Leu | Habitat | 3.28+0.0486466x | 0.023 | *** |
| Lys | %GC | 76.1764−470.209x | 0.866 | *** |
| Lys | Phylogeny | 14.78−0.0723698x | 0.002 | n.s. |
| Lys | Habitat | 3.28+0.0618894x | 0.040 | *** |
| Met | %GC | 61.4524−479.803x | 0.016 | *** |
| Met | Phylogeny | 14.78−0.091508x | 0.034 | *** |
| Met | Habitat | 3.28+0.0488429x | 0.024 | *** |
| Phe | %GC | 104.337−1332.69x | 0.682 | *** |
| Phe | Phylogeny | 14.78+0.13288x | 0.022 | *** |
| Phe | Habitat | 3.28+0.0707417x | 0.056 | *** |
| Pro | %GC | −0.615608+1159.53x | 0.841 | *** |
| Pro | Phylogeny | 14.78−0.145668x | 0.013 | *** |
| Pro | Habitat | 3.28+0.0642624x | 0.043 | *** |
| Ser | %GC | 89.6811−655.588x | 0.248 | *** |
| Ser | Phylogeny | 14.78−0.00453643x | −0.001 | n.s. |
| Ser | Habitat | 3.28+0.063256x | 0.042 | *** |
| Thr | %GC | 15.6874+650.185x | 0.089 | *** |
| Thr | Phylogeny | 14.78−0.0849289x | 0.027 | *** |
| Thr | Habitat | 3.28+0.0493972x | 0.024 | *** |
| Trp | %GC | 7.78436+3584.66x | 0.617 | *** |
| Trp | Phylogeny | 14.78−0.15309x | 0.037 | *** |
| Trp | Habitat | 3.28+0.0582441x | 0.034 | *** |
| Tyr | %GC | 93.3622−1407.74x | 0.764 | *** |
| Tyr | Phylogeny | 14.78−0.333038x | 0.110 | *** |
| Tyr | Habitat | 3.28+0.0519745x | 0.025 | *** |
| Val | %GC | −11.348+863.332x | 0.496 | *** |
| Val | Phylogeny | 14.78+0.170741x | 0.061 | *** |
| Val | Habitat | 3.28+0.0768137x | 0.058 | *** |
*** p<0.001; n.s., not significant.
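Each row of the table above reports a straight-line fit of the form a+bx together with an adjusted R². The sketch below shows, under stated assumptions, how such a fit and its adjusted R² can be computed by ordinary least squares with a single predictor. The table does not state which quantity plays the role of x, how Phylogeny and Habitat were encoded numerically, or which software was used, so the function name `fit_line` and all example data here are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least squares fit y = intercept + slope * x with R^2 and adjusted R^2."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need at least 3 paired observations")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x must not be constant")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    if ss_tot == 0:
        raise ValueError("y must not be constant")
    r2 = 1.0 - ss_res / ss_tot
    # One predictor, so the residual degrees of freedom are n - 2.
    adjusted_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - 2)
    return intercept, slope, r2, adjusted_r2


# Made-up observations, for illustration only.
x_example = [0.06, 0.08, 0.09, 0.11, 0.13]
y_example = [31.0, 39.0, 52.0, 58.0, 70.0]
intercept, slope, r2, adjusted_r2 = fit_line(x_example, y_example)

# A predictor with no linear relationship to y gives R^2 near 0,
# and the degrees-of-freedom penalty then makes adjusted R^2 negative.
_, _, flat_r2, flat_adjusted_r2 = fit_line([1, 2, 3, 4, 5], [2, 1, 2, 1, 2])
```

The second call illustrates why slightly negative adjusted R² values, such as the −0.001 entries in the table, are possible: the adjustment subtracts a penalty for the fitted parameter, so a model that explains essentially none of the variance ends up below zero.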
Figure 3. Relative amino acid composition, weighted by δ index, of each organism plotted against average GC content.
Figure 4. Spearman Rank Correlations between the RAAA of organisms and environments.
Asterisks represent significance at p<0.01 (**) and p<0.001 (***).