Andrea Zorro-Aranda, Juan Miguel Escorcia-Rodríguez, José Kenyi González-Kise, Julio Augusto Freyre-González.
Abstract
Streptomyces coelicolor A3(2) is a model microorganism for the study of Streptomycetes, antibiotic production, and secondary metabolism in general. Although S. coelicolor has an outstanding variety of regulators among bacteria, little effort has been made to study its transcription globally. We manually curated 29 years of literature and databases to assemble a meta-curated, experimentally validated gene regulatory network (GRN) with 5386 genes and 9707 regulatory interactions (~41% of the total expected interactions). This provides the most extensive and up-to-date reconstruction available for the regulatory circuitry of this organism. Only ~6% of these interactions (534/9707) are supported by experiments confirming that the transcription factor binds the upstream region of the target gene, the so-called "strong" evidence; for the remaining interactions there is no confirmation of direct binding. To tackle network incompleteness, we performed network inference using several methods (including two proposed here) for motif identification in DNA sequences and GRN inference from transcriptomics. Further, we contrasted the structural properties and functional architecture of the networks to assess the reliability of the predictions, finding the inference from DNA sequence data to be the most trustworthy approach. Finally, we show two applications of the inferred and the curated networks. The inference allowed us to propose novel transcription factors for the key Streptomyces antibiotic regulatory proteins (SARPs). The curated network allowed us to study the conservation of the system-level components between S. coelicolor and Corynebacterium glutamicum. There, we identified the basal machinery as the common signature between the two organisms. The curated networks were deposited in Abasy Atlas ( https://abasy.ccg.unam.mx/ ), while the inferences are available as Supplementary Material.
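The proportions quoted in the abstract can be checked with a few lines of arithmetic. The sketch below uses only the numbers stated above; the implied network size is a back-calculation from the rounded 41% figure, so it is an approximation rather than a value reported by the authors.

```python
# Figures quoted in the abstract.
curated_interactions = 9707
strong_interactions = 534
coverage = 0.41  # "~41% of the total expected interactions" (rounded in the source)

# Share of curated interactions backed by direct-binding ("strong") evidence.
strong_fraction = strong_interactions / curated_interactions

# Approximate size of the complete network implied by the rounded coverage.
implied_total = curated_interactions / coverage

print(f"strong fraction: {strong_fraction:.1%}")
print(f"implied total interactions: {implied_total:.0f}")
```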
Year: 2022 PMID: 35181703 PMCID: PMC8857197 DOI: 10.1038/s41598-022-06658-x
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. (a) Workflow of this work. The purple area covers the inference of the networks. (b) Types of interactions contained in the networks. The green path connects to curated regulations supported by experimental evidence. The “strong” network contains only the interactions supported by an experiment proving that the transcription factor binds a DNA site near a target gene to regulate its transcription. Curated networks without the “strong” label might contain indirect interactions, as they could be supported by non-directed experiments (such as a gene knockout and its effect on gene transcription). The purple path connects to inferred interactions. Predictions based solely on binding-site predictions infer only TF-DNA interactions. Predictions involving gene expression data might contain indirect interactions.
Figure 2. Interactions curated from literature for Streptomyces coelicolor A3(2). (a) Number of publications per year. (b) Number of interactions reported per year.
Description of networks used in this work.
| Network | Abasy ID | Genes | Interactions | Description |
|---|---|---|---|---|
| Curated_RTB | 100226_v2015_sRTB13 | 311 | 330 | Network from RegTransBase database |
| Curated_DBSCR | 100226_v2015_sDBSCR15 | 273 | 341 | Network from Database of transcriptional regulation in |
| Curated_DBSCR(S) | 100226_v2015_sDBSCR15_eStrong | 112 | 115 | Filtration of interactions with strong evidence from the DBSCR network |
| Curated_FL | 100226_v2019_sA22 | 5331 | 9454 | Network from the collection and curation performed for this work |
| Curated_FL(cS) | Not reported | 347 | 438 | Filtration of interactions with strong evidence from the FL network (cS = curated strong) |
| Curated_FL(S) | 100226_v2019_sA22_eStrong | 396 | 493 | Filtration of interactions with strong evidence from the FL network along with statistically validated interactions |
| Curated_FL-DBSCR-RTB | 100226_v2019_sA22-DBSCR15-RTB13 | 5386 | 9707 | Meta-curation of RTB, DBSCR and FL networks |
| Curated_FL(cS)-DBSCR(S) | Not reported | 387 | 480 | Filtration of interactions with strong evidence from the meta-curated network |
| Curated_FL(S)-DBSCR(S) | 100226_v2019_sA22-DBSCR15_eStrong | 435 | 534 | Filtration of interactions with strong evidence from meta-curated networks along with statistically validated interactions |
| Inferred_BS | Available as a Supplementary File | 6263 | 23,908 | Inferred GRN from binding sites prediction |
| Inferred_Exp | Available as a Supplementary File | 4739 | 23,908 | Inferred GRN from transcriptomic data |
| Inferred_BS-Exp | Available as a Supplementary File | 4763 | 23,908 | Community network from Inferred_BS and Inferred_Exp |
| Inferred_All | Available as a Supplementary File | 3804 | 23,908 | Community network from all the inference methods |
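The table describes two operations on curated networks: meta-curation (merging RTB, DBSCR and FL) and filtration by "strong" evidence. A minimal sketch of both follows, assuming meta-curation behaves like a union of TF-target pairs. That assumption is consistent with the table, since 9454 + 341 + 330 = 10,125 exceeds the 9707 interactions of the merged network. The `Interaction` record, the `"strong"`/`"weak"` labels, and the gene names are hypothetical and do not reflect the Abasy Atlas file format.

```python
from typing import NamedTuple

class Interaction(NamedTuple):
    tf: str        # regulator (transcription factor)
    target: str    # regulated gene
    evidence: str  # hypothetical label: "strong" or "weak"

def meta_curate(*networks):
    """Union of networks; a TF-target pair keeps 'strong' evidence if any source has it."""
    merged = {}
    for network in networks:
        for edge in network:
            key = (edge.tf, edge.target)
            if key not in merged or edge.evidence == "strong":
                merged[key] = edge
    return list(merged.values())

def strong_only(network):
    """Keep only interactions supported by direct-binding evidence."""
    return [edge for edge in network if edge.evidence == "strong"]

# Toy networks with made-up identifiers.
fl = [Interaction("tfA", "gene1", "weak"), Interaction("tfA", "gene2", "strong")]
dbscr = [Interaction("tfA", "gene1", "strong"), Interaction("tfB", "gene3", "weak")]

merged = meta_curate(fl, dbscr)
print(len(merged), len(strong_only(merged)))
```

Keying on the (TF, target) pair is what keeps an interaction reported by several sources from being counted more than once.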
Figure 3. (a) AUROC and AUPR for each of the methods and the community networks. (b) Number of interactions statistically validated by TF.
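For readers unfamiliar with the two scores in Figure 3a, the sketch below computes them for a ranked list of predicted interactions against a gold standard. It is an illustration under stated assumptions, not the authors' evaluation pipeline: AUPR is approximated here by average precision, tied scores are ordered arbitrarily in that calculation, and the scores and labels are made up.

```python
def auroc(scores, labels):
    """Probability that a true interaction outranks a false one (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def average_precision(scores, labels):
    """Step-wise area under the precision-recall curve."""
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits, total = 0, 0.0
    for rank, (_, is_true) in enumerate(ranked, start=1):
        if is_true:
            hits += 1
            total += hits / rank  # precision at this recall step
    return total / hits

# Toy prediction scores and gold-standard membership (1 = known interaction).
scores = [0.9, 0.8, 0.7, 0.6, 0.5]
labels = [1, 0, 1, 1, 0]
print(round(auroc(scores, labels), 3), round(average_precision(scores, labels), 3))
```

Both functions assume at least one positive and one negative example; a production evaluation would guard against empty classes.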
Network properties for inferred networks.
| Property | Inferred_BS | Inferred_Exp | Inferred_BS-Exp | Inferred_All | Curated_FL-DBSCR-RTB |
|---|---|---|---|---|---|
| Number of nodes (N) | 6263 | 4739 | 4763 | 3804 | 5386 |
| | 2.17 | 2.14 | 2.14 | 2.11 | 2.15 |
| Average shortest path length | 2.86 | 3.38 | 3.38 | 3.11 | 2.84 |
| Average clustering coefficient | 0.213 | 0.385 | 0.385 | 0.470 | 0.182 |
| | 1.861 | 1.952 | 1.955 | 1.968 | 1.742 |
| R²adj | 0.87 | 0.92 | 0.91 | 0.84 | |
| | 0.924 | 0.767 | 0.742 | 0.729 | 1.142 |
| R²adj | 0.79 | 0.58 | 0.54 | 0.68 | 0.89 |
Figure 4. Network comparison by structural properties. (a) Pearson correlation of the profile of structural properties listed in Supplementary Figure 12. (b) D-value from Schieber et al. [43] to measure network similarity.
Figure 5. Simpson similarity index for NDA analysis for all curated and inferred networks. (a) Global regulators. (b) Modular genes. (c) Intermodular genes. (d) Basal machinery genes.
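The Simpson similarity index compares two gene sets relative to the smaller one. A minimal sketch follows, assuming the index used here is the standard overlap coefficient |A ∩ B| / min(|A|, |B|); the gene identifiers are placeholders.

```python
def simpson_similarity(a, b):
    """Overlap coefficient: shared items divided by the size of the smaller set."""
    a, b = set(a), set(b)
    if not a or not b:
        return 0.0  # convention chosen for this sketch when a set is empty
    return len(a & b) / min(len(a), len(b))

# Toy sets of genes assigned to one NDA class in two different networks.
class_in_net1 = {"g1", "g2"}
class_in_net2 = {"g1", "g2", "g3", "g4"}
print(simpson_similarity(class_in_net1, class_in_net2))
```

Because the denominator is the smaller set, the index reaches 1 whenever one set is fully contained in the other, which makes it tolerant of networks of very different sizes.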
Figure 6. MCC for global regulators predicted by NDA for each of the curated and inferred networks. Scores ≥ 0.5 are represented in white numbers and meta-curations are marked with an asterisk. ᵃGold standard curated from reference [28]. ᵇGold standard curated from independent publications (Supplementary Table 4).
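The Matthews correlation coefficient (MCC) scores a binary prediction, here "gene is a global regulator", against a gold standard. The sketch below applies the standard MCC formula to sets of gene identifiers; the helper names, the toy genes, and the choice of returning 0 when the denominator vanishes are assumptions of this example.

```python
from math import sqrt

def mcc(tp, fp, fn, tn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

def mcc_from_sets(predicted, gold, universe):
    """MCC for predicted vs. gold-standard genes within a universe of candidate genes."""
    predicted, gold, universe = set(predicted), set(gold), set(universe)
    tp = len(predicted & gold)
    fp = len(predicted - gold)
    fn = len(gold - predicted)
    tn = len(universe - predicted - gold)
    return mcc(tp, fp, fn, tn)

# Toy example: four candidate genes, two of them known global regulators.
universe = {"g1", "g2", "g3", "g4"}
print(mcc_from_sets({"g1", "g2"}, {"g1", "g2"}, universe))  # perfect agreement
print(mcc_from_sets({"g1", "g3"}, {"g1", "g2"}, universe))  # chance-level agreement
```

The universe is assumed to contain every predicted and gold-standard gene; its size determines the true-negative count and therefore affects the score.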
Figure 7. Conservation of the systems-level components between S. coelicolor and C. glutamicum. (a) NDA classification for the GRN-wide orthologs in their corresponding organism. The matrix sums to 1. Most of the GRN-wide orthologs are classified as basal machinery in both organisms. (b) Overlap of the NDA classes between the two networks with reference to the smallest set. Each cell ranges between 0 and 1, where 1 means one class is a subset of the other and 0 means there is no overlap at all. (c) Similarities between the two organisms highlight the size difference between the datasets. The color of the inner circle sections represents the NDA classes, and ribbon colors represent the S. coelicolor NDA classes. Numbers represent the genes for each class and organism. Gray ribbons are the widest, showing that all the basal machinery of S. coelicolor with a 1:1 orthology relationship with C. glutamicum is classified as either basal machinery or modular genes in C. glutamicum. This suggests that multiple basal machinery genes could be reclassified as modular components in more complete reconstructions of the S. coelicolor network.
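A matrix like the one in panel (a) can be built by cross-tabulating the NDA class of each ortholog in the two organisms and dividing by the number of orthologs, so that all cells sum to 1. The sketch below assumes each input maps a shared ortholog identifier to an NDA class; the identifiers and class assignments are invented for illustration.

```python
from collections import Counter

def class_crosstab(classes_a, classes_b):
    """Fraction of shared orthologs in each (class in A, class in B) cell; cells sum to 1."""
    shared = classes_a.keys() & classes_b.keys()
    counts = Counter((classes_a[g], classes_b[g]) for g in shared)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: n / total for pair, n in counts.items()}

# Hypothetical ortholog identifiers mapped to NDA classes in each organism.
organism_a = {"o1": "basal", "o2": "basal", "o3": "modular", "o4": "global"}
organism_b = {"o1": "basal", "o2": "modular", "o3": "modular", "o4": "global"}

matrix = class_crosstab(organism_a, organism_b)
for pair, fraction in sorted(matrix.items()):
    print(pair, fraction)
```

Normalizing by the total rather than by row or column keeps the cells directly comparable as shares of all orthologs.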