Alberto Santos-Zavaleta, Heladia Salgado, Socorro Gama-Castro, Mishael Sánchez-Pérez, Laura Gómez-Romero, Daniela Ledezma-Tejeida, Jair Santiago García-Sotelo, Kevin Alquicira-Hernández, Luis José Muñiz-Rascado, Pablo Peña-Loredo, Cecilia Ishida-Gutiérrez, David A Velázquez-Ramírez, Víctor Del Moral-Chávez, César Bonavides-Martínez, Carlos-Francisco Méndez-Cruz, James Galagan, Julio Collado-Vides.
Abstract
RegulonDB, first published 20 years ago, is a comprehensive electronic resource on the regulation of transcription initiation in Escherichia coli K-12, with decades of knowledge from classic molecular biology experiments and, more recently, from high-throughput genomic methodologies. We curated the literature to keep RegulonDB up to date, and initiated curation of ChIP and gSELEX experiments. We estimate that current knowledge describes between 10% and 30% of the expected total number of transcription factor-gene regulatory interactions in E. coli. RegulonDB provides datasets for interactions for which there is no evidence that they affect expression, as well as expression datasets. We developed a proof-of-concept pipeline to merge binding and expression evidence to identify regulatory interactions. These datasets can be visualized in the RegulonDB JBrowse. We developed the Microbial Conditions Ontology (MCO) with a controlled vocabulary for the minimal properties needed to reproduce an experiment, which helps integrate data from high-throughput and classic literature. At a higher level of integration, we report Genetic Sensory-Response Units for 200 transcription factors, including their regulation at the metabolic level, and include summaries for 70 of them. Finally, we summarize our research with natural language processing strategies to enhance our biocuration work.
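The idea of merging binding and expression evidence can be illustrated with a minimal sketch. This is not the RegulonDB pipeline, whose details are not given here. It assumes that each kind of evidence has already been reduced to a set of (transcription factor, target gene) pairs, and all names and data in it (`merge_evidence`, `TF_A`, `gene1`, and so on) are hypothetical placeholders.

```python
from typing import Dict, Set, Tuple

# A regulatory interaction candidate: (transcription factor, target gene).
Interaction = Tuple[str, str]


def merge_evidence(
    binding: Set[Interaction],
    expression: Set[Interaction],
) -> Dict[str, Set[Interaction]]:
    """Partition TF-gene pairs by the kind of evidence that supports them.

    Assumption: `binding` holds pairs with binding evidence (for example
    from ChIP or gSELEX) and `expression` holds pairs where the gene's
    expression changes when the TF is perturbed.
    """
    return {
        # Supported by both kinds of evidence.
        "binding_and_expression": binding & expression,
        # Binding observed, but no evidence of an effect on expression.
        "binding_only": binding - expression,
        # Expression change observed, but no binding evidence.
        "expression_only": expression - binding,
    }


# Hypothetical toy data, for illustration only.
binding = {("TF_A", "gene1"), ("TF_A", "gene2"), ("TF_B", "gene3")}
expression = {("TF_A", "gene1"), ("TF_B", "gene4")}

merged = merge_evidence(binding, expression)
for category, pairs in merged.items():
    print(category, sorted(pairs))
```

In this sketch, the `binding_only` category corresponds to interactions for which there is no evidence that they affect expression, while pairs supported by both kinds of evidence are the candidate regulatory interactions.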
Year: 2019 PMID: 30395280 PMCID: PMC6324031 DOI: 10.1093/nar/gky1077
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. Growth through the years of types of elements and of citations in RegulonDB. (A) Curated elements over time supported by classic experimental methods. (B) Breakdown of classic versus HT curated records for promoters and RIs. (C) Publications curated by RegulonDB over time. (D) Curated objects reported in the NAR special issue publications and citations over time in RegulonDB.
Figure 2. Display of HT curated regulatory interactions and datasets in RegulonDB. (A) The HT RIs are part of the Regulon web page results, in a section called ‘High-throughput transcription factor binding sites’. (B) The HT datasets are available in the Downloads menu, and the user can filter them by any field, using the text box.
Figure 3. Annotation framework and display in RegulonDB of growth condition (GC) contrasts. (A) Part of the framework to curate GCs based on the MCO. (B) Example display of contrasts on the GCs results page. The variable that induces dps gene expression, i.e. a hydrogen peroxide concentration of 120 μM, is shown in bold. The evidence and reference for the induction of the gene under those GCs are shown for each contrast.
Number of curated instances of growth condition variable types
| Growth condition variable type | Curated instances |
|---|---|
| Knockout of genes | 26 |
| Insertion of a plasmid with gene(s) | 28 |
| Microbiological culture medium | 9 |
| Antimicrobial agents | 5 |
| Carbon sources | 23 |
| Electron acceptors (respiration) | 10 |
| Iron depletion/repletion | 7 |
| Nitrogen sources | 4 |
| Nucleotide availability | 4 |
| Oxidative stress | 10 |
| Protein cofactors | 11 |
| Quorum sensing | 4 |
| Growth phase | 3 |
Number of curated instances of growth condition elements
| Growth condition element | Curated instances |
|---|---|
| GC phrases | 378 |
| GC contrasts | 269 |
| GC controls | 164 |
| GC experimental tests | 256 |
| GC variables | 162 |