Zlatko Smole, Nela Nikolic, Fran Supek, Tomislav Šmuc, Ivo F Sbalzarini, Anita Krisko.
Abstract
BACKGROUND: Prokaryotic environmental adaptations occur at different levels within cells to ensure the preservation of genome integrity, proper protein folding and function, as well as membrane fluidity. Although the specific composition and structure of cellular components suitable for a variety of extreme conditions have already been postulated, a systematic study describing such adaptations has not yet been performed. We therefore explored whether the environmental niche of a prokaryote could be deduced from the sequence of its proteome. Finally, we aimed at finding the precise differences between proteome sequences of prokaryotes from different environments.
Year: 2011 PMID: 21269423 PMCID: PMC3045906 DOI: 10.1186/1471-2148-11-26
Source DB: PubMed Journal: BMC Evol Biol ISSN: 1471-2148 Impact factor: 3.260
Classification accuracies, displayed as area under the curve (AUC), obtained by support vector machines (SVM) and random forest (RF) for classification according to domain of life, halophilicity and thermophilicity.
| Classification | AUC (SVM) | AUC (RF) |
|---|---|---|
| Domain of Life | 0.99 | 0.99 |
| Halophilicity | 0.83 | 0.89 |
| Thermophilicity | 0.96 | 0.95 |
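
An AUC value can be read as the probability that a classifier ranks a randomly chosen positive example above a randomly chosen negative one, with ties counted as one half. The following is a minimal sketch of that calculation for a two-class task in plain Python. The function name `auc` and the toy labels and scores are illustrative assumptions, not data or code from the study, and how the three-class thermophilicity AUC was computed is not stated here, so only the binary case is covered.

```python
def auc(labels, scores):
    """Area under the ROC curve for a binary task.

    Computed as the fraction of (positive, negative) pairs in which the
    positive example receives the higher score; ties count as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example (made-up values): 1 = halophile, 0 = non-halophile.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
toy_auc = auc(toy_labels, toy_scores)
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is the scale on which the values in the table should be read.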
Figure 1. Four unique features used for classifications regarding domain of life revealed by the feature selection algorithm of RF. Pairs of box-and-whisker plots are shown for each feature: Leu content, average protein size in a proteome, His content, and 10-Cys content. Box-and-whisker plots represent bacteria and archaea from top to bottom. The feature values are normalized from 0 to 1 from left to right. (+) signs represent outliers.
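
Features such as Leu content and average protein size can be derived directly from the protein sequences of a proteome and then rescaled to the 0 to 1 range used in the plots. The sketch below shows one plausible way to do this. It assumes that residue content is pooled over all residues in the proteome and that normalization is a plain min-max rescaling across proteomes; neither detail is confirmed by the text here, and the function names and toy sequences are hypothetical.

```python
def residue_content(proteins, residue):
    """Fraction of all residues in a proteome that equal `residue`.

    Assumption: counts are pooled over the whole proteome rather than
    averaged per protein.
    """
    total = sum(len(p) for p in proteins)
    if total == 0:
        raise ValueError("proteome contains no residues")
    return sum(p.count(residue) for p in proteins) / total


def average_protein_size(proteins):
    """Mean protein length (in residues) over a proteome."""
    if not proteins:
        raise ValueError("proteome contains no proteins")
    return sum(len(p) for p in proteins) / len(proteins)


def min_max_normalize(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Toy proteomes (made-up sequences, one-letter amino-acid codes).
toy_proteomes = {
    "organism_a": ["MLLK", "HLAL"],
    "organism_b": ["MKKH", "AAHG"],
    "organism_c": ["MLAK", "GGHHLL"],
}

leu_raw = [residue_content(p, "L") for p in toy_proteomes.values()]
leu_normalized = min_max_normalize(leu_raw)
sizes = [average_protein_size(p) for p in toy_proteomes.values()]
```

Each proteome then contributes one value per feature, and those per-proteome values are what a classifier or a box-and-whisker plot would consume.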
Figure 2. Three unique features used for classifications regarding halophilicity revealed by the feature selection algorithm of RF. Pairs of box-and-whisker plots are shown for each feature: positive charge, normalized frequency of beta turn, and Phe content. Box-and-whisker plots represent non-halophiles and halophiles from top to bottom. The feature values are normalized from 0 to 1 from left to right. (+) signs represent outliers.
Figure 3. Four unique features used for classifications regarding thermophilicity revealed by the feature selection algorithm of RF. Triplets of box-and-whisker plots are shown for each feature: information measure for loop, Val content, Tyr content, and Chou-Fasman parameter of the coil conformation. Box-and-whisker plots represent mesophiles, mesothermophiles and thermophiles from top to bottom. The feature values are normalized from 0 to 1 from left to right. (+) signs represent outliers.
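
The box-and-whisker plots in the figures summarize each normalized feature by its quartiles and mark extreme values as outliers. The captions do not say which outlier rule was used; the sketch below assumes the common convention that points more than 1.5 interquartile ranges beyond the quartiles are outliers. The function name `box_summary` and the sample values are hypothetical.

```python
import statistics


def box_summary(values, whisker=1.5):
    """Quartiles, whisker ends and outliers for one box-and-whisker plot.

    Assumption: outliers are points farther than `whisker` * IQR from the
    first or third quartile (the usual 1.5 * IQR convention).
    """
    data = sorted(values)
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - whisker * iqr
    high_fence = q3 + whisker * iqr

    inside = [v for v in data if low_fence <= v <= high_fence]
    outliers = [v for v in data if v < low_fence or v > high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        "outliers": outliers,
    }


# Toy normalized feature values for one group (made-up numbers).
toy_values = [0.1, 0.2, 0.25, 0.3, 0.35, 0.4, 0.95]
summary = box_summary(toy_values)
```

Comparing such summaries between groups (for example mesophiles versus thermophiles) gives a numeric counterpart to the visual separation shown in the plots.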