Joana C Xavier, Kiran Raosaheb Patil, Isabel Rocha.
Abstract
Essential metabolic reactions are shaping constituents of metabolic networks, enabling viable and distinct phenotypes across diverse life forms. Here we analyse and compare modelling predictions of essential metabolic functions with experimental data and thereby identify core metabolic pathways in prokaryotes. Simulations of 15 manually curated genome-scale metabolic models were integrated with 36 large-scale gene essentiality datasets encompassing a wide variety of species of bacteria and archaea. Conservation of metabolic genes was estimated by analysing 79 representative genomes from all the branches of the prokaryotic tree of life. We find that essentiality patterns reflect phylogenetic relations both for modelling and experimental data, which correlate highly at the pathway level. Genes that are essential for several species tend to be highly conserved, whereas non-essential genes may or may not be conserved. The tRNA-charging module is highlighted as ancestral and with high centrality in the networks, followed closely by cofactor metabolism, pointing to an early information processing system supplied by organic cofactors. The results, which point to model improvements and also indicate faults in the experimental data, should be relevant to the study of centrality in metabolic networks and ancient metabolism but also to metabolic engineering with prokaryotes.
Year: 2018 PMID: 30444863 PMCID: PMC6283598 DOI: 10.1371/journal.pcbi.1006556
Source DB: PubMed Journal: PLoS Comput Biol ISSN: 1553-734X Impact factor: 4.475
Details on models and corresponding species used in in silico essentiality simulations.
| Phylum | Species | Model ID | # Reactions | # Metabolites | % ORFs | Reference |
|---|---|---|---|---|---|---|
| Firmicutes | | iYO844 | 1020 | 988 | 21% | [ |
| Firmicutes | | iCB925 | 938 | 881 | 18% | [ |
| Firmicutes | | iSB619 | 641 | 571 | 24% | [ |
| Proteobacteria | | iAF1260 | 2077 | 1039 | 29% | [ |
| Proteobacteria | | iCA1273 | 2477 | 1111 | 27% | [ |
| Proteobacteria | | iIT341 | 476 | 485 | 21% | [ |
| Proteobacteria | | iYL1228 | 1970 | 1658 | 24% | [ |
| Proteobacteria | | iJN746 | 950 | 911 | 14% | [ |
| Proteobacteria | | STM_v1.0 | 2201 | 1119 | 28% | [ |
| Proteobacteria | | iSO783 | 774 | 634 | 15% | [ |
| Actinobacteria | | iNJ661 | 939 | 828 | 15% | [ |
| Chloroflexi | | iAI549 | 518 | 549 | 27% | [ |
| Cyanobacteria | | iJN678 | 863 | 795 | 21% | [ |
| Thermotogales | | (None) | 562 | 503 | 25% | [ |
| Euryarchaeota | | iAF692 | 476 | 485 | 14% | [ |
Large-scale essentiality assays used in this study, number of essential genes in each and respective original reference of publication.
| Species name | Number of essential genes | Reference |
|---|---|---|
| | 499 | [ |
| | 271 | [ |
| | 547 | [ |
| | 325 | [ |
| | 505 | [ |
| | 406 | [ |
| | 228 | [ |
| | 480 | [ |
| | 609 | [ |
| | 296 | [ |
| | 392 | [ |
| | 642 | [ |
| | 323 | [ |
| | 519 | [ |
| | 614 | [ |
| | 771 | [ |
| | 687 | [ |
| | 381 | [ |
| | 310 | [ |
| | 463 | [ |
| | 117 | [ |
| | 335 | [ |
| | 353 | [ |
| | 358 | [ |
| | 353 | [ |
| | 105 | [ |
| | 230 | [ |
| | 403 | [ |
| | 535 | [ |
| | 302 | [ |
| | 351 | [ |
| | 244 | [ |
| | 227 | [ |
| | 241 | [ |
| | 218 | [ |
| | 779 | [ |
a The corresponding annotated data was obtained from the DEG database [17].
Fig 1. Clustering of (A) simulated and (B) experimental genome-scale essentialities of prokaryotes.
Clusters show approximately unbiased p-values greater than 80% calculated by multiscale bootstrap re-sampling with 1000 replicas (see Methods for details). Phyla are coloured according to the taxonomical representation of the prokaryotic tree of life built with iTOL [78] (tree in S2 Fig).
Fig 2. Essentiality for biomass production of each metabolic subsystem in 15 genome-scale manually curated metabolic models.
The colour bar represents the ratio of essential reactions in a subsystem to the total number of essential reactions in each model (0%: blue; 5%: white; 50%: red). The number in parentheses next to each subsystem name is the total number of reactions in that subsystem across all models.
Fig 3. Number of essential reactions predicted by genome-scale metabolic models and essential genes in genome-scale experimental assays for different metabolic subsystems.
Each reaction and each gene was counted only once, even if present (or present and essential) in more than one model or experimental dataset. Single, double and triple asterisks indicate p-values below 0.05, 0.01 and 0.0001, respectively, from a Fisher's exact test for count data, which tests whether the numbers of essential and non-essential mapped genes and reactions are independent.
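The significance marking described in the Fig 3 caption can be illustrated with a minimal, standard-library sketch of a two-sided Fisher's exact test on a 2x2 table. The function names and the example counts are assumptions made for illustration; the source does not state which software or which test sidedness was used.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Hypothetical layout: rows = (reactions, genes),
    columns = (essential, non-essential).
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, p)


def significance_stars(p):
    """Map a p-value to the asterisk thresholds given in the caption."""
    if p < 0.0001:
        return "***"
    if p < 0.01:
        return "**"
    if p < 0.05:
        return "*"
    return ""


# Made-up counts for one subsystem (not taken from the paper).
p_value = fisher_exact_two_sided(30, 10, 12, 28)
print(p_value, significance_stars(p_value))
```

Summing every table that is no more probable than the observed one is one common definition of the two-sided p-value; other conventions exist and can give slightly different values.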
Fig 4. Conservation of essentiality of metabolic subsystems in 36 large-scale gene essentiality datasets and correlation with modelling predictions (inset).
Red indicates the highest conservation (genes essential in more than 31 datasets) and grey the lowest (essential in fewer than 4 datasets); black bar: weighted sum of essential genes given the number of datasets in which they are essential (see Methods). Inset plot: correlation between modelling and experimental genome-scale essentiality data at the subsystem level (adjusted R² of 0.821, Pearson correlation coefficient of 0.909 with p-value 9.65e-14); axes are shown in log scale for visualization purposes only.
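The two statistics quoted for the inset can be sketched as follows, assuming a simple linear fit with one predictor, so that R² equals the squared Pearson coefficient. The function names and the per-subsystem counts are hypothetical; the source does not describe how the fit was computed.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def adjusted_r2(r, n, predictors=1):
    """Adjusted R^2 for a linear fit with the given number of predictors."""
    return 1 - (1 - r * r) * (n - 1) / (n - predictors - 1)


# Made-up counts per subsystem: essential reactions (models) vs
# essential genes (experiments). Raw counts are used, not log values,
# since the caption says the log axes are for visualization only.
model_counts = [120, 45, 30, 12, 8, 3]
experiment_counts = [95, 50, 22, 15, 5, 4]

r = pearson_r(model_counts, experiment_counts)
print(r, adjusted_r2(r, len(model_counts)))
```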
Fig 5. Conservation of metabolic subsystems in genomes of all prokaryotic phyla with at least one fully sequenced genome.
Dark red indicates the highest conservation (genes that are present in all 79 genomes accessed) and dark blue the lowest (present in fewer than 10 genomes).
Fig 6. Conservation and essentiality of protein-encoding genes.
Conservation is calculated as the number of genomes in which a gene is present, and essentiality as the number of datasets in which a gene is essential minus the number of datasets in which it is non-essential. The full list of genes is given in S3 File. The bottom panel plots all data, with conservation on the x axis and essentiality on the y axis. The top panel shows the density distribution of conservation (on the shared x axis) for genes with a sum of essentiality larger than 0 (blue) and for genes with a sum of essentiality equal to or smaller than 0 (orange). The results of a Kolmogorov–Smirnov test for the independence of both distributions are shown: D is the value of the test statistic, given together with the p-value of the test.
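The scoring and comparison described in the Fig 6 caption can be sketched with the standard library alone. The gene names, calls and conservation values below are invented for illustration, and the function names are assumptions. Only the test statistic D is computed here, not the p-value.

```python
from bisect import bisect_right


def essentiality_score(calls):
    """Datasets where a gene is essential minus datasets where it is not.

    `calls` holds True (essential), False (non-essential) or None
    (gene not assayed in that dataset; ignored).
    """
    essential = sum(1 for c in calls if c is True)
    non_essential = sum(1 for c in calls if c is False)
    return essential - non_essential


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic D: the largest gap
    between the two empirical cumulative distribution functions."""
    a, b = sorted(sample_a), sorted(sample_b)
    d = 0.0
    for x in set(a) | set(b):
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d


# Hypothetical genes: (essentiality calls per dataset, genomes with the gene).
genes = {
    "geneA": ([True, True, True, False], 79),
    "geneB": ([True, True, None, None], 75),
    "geneC": ([False, False, True, False], 70),
    "geneD": ([False, False, False, None], 12),
    "geneE": ([False, None, False, False], 40),
}

# Split conservation values by the sign of the essentiality score.
positive = [cons for calls, cons in genes.values() if essentiality_score(calls) > 0]
non_positive = [cons for calls, cons in genes.values() if essentiality_score(calls) <= 0]
print(ks_statistic(positive, non_positive))
```

Splitting at a score of 0 mirrors the blue and orange groups of the top panel; a large D means the two conservation distributions differ strongly.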