Socorro Gama-Castro1, Fabio Rinaldi2, Alejandra López-Fuentes1, Yalbi Itzel Balderas-Martínez1, Simon Clematide1, Tilia Renate Ellendorff1, Alberto Santos-Zavaleta1, Hernani Marques-Madeira1, Julio Collado-Vides2.
Abstract
Given the current explosion of data within original publications generated in the field of genomics, a recognized bottleneck is the transfer of such knowledge into comprehensive databases. We have for years organized knowledge on transcriptional regulation reported in the original literature of Escherichia coli K-12 into RegulonDB (http://regulondb.ccg.unam.mx), our database that is currently supported by >5000 papers. Here, we report a first step towards the automatic biocuration of growth conditions in this corpus. Using the OntoGene text-mining system (http://www.ontogene.org), we extracted and manually validated regulatory interactions and growth conditions in a new approach based on filters that enable the curator to select informative sentences from preprocessed full papers. Based on a set of 48 papers dealing with the response to oxidative stress mediated by OxyR, we were able to retrieve 100% of the OxyR regulatory interactions present in RegulonDB, including the transcription factors and their effect on target genes. Our strategy was also designed to extract the growth conditions of these interactions, which we did. This result provides a proof of concept for a more direct and efficient curation process, and enables us to define the strategy of the subsequent steps to be implemented for a semi-automatic curation of original literature dealing with regulation of gene expression in bacteria. This project will enhance the efficiency and quality of the curation of knowledge present in the literature of gene regulation, and contribute to a significant increase in the encoding of the regulatory network of E. coli. RegulonDB Database URL: http://regulondb.ccg.unam.mx; OntoGene URL: http://www.ontogene.org
Year: 2014 PMID: 24903516 PMCID: PMC4207228 DOI: 10.1093/database/bau049
Source DB: PubMed Journal: Database (Oxford) ISSN: 1758-0463 Impact factor: 3.451
Figure 1. The display of ODIN adapted for curation in RegulonDB.
The same semantic structure obtained from several sentences
| Sentence | Semantic structures |
|---|---|
| We also discovered that the | MntR [+] |
| Taken together, the observations that | MntR [+] |
| We additionally found that expression of the | MntR [+] |
| We demonstrated | MntR [+] |
The words in bold are those extracted in the corresponding normalized semantic structures.
Several semantic structures obtained from a unique sentence
| Sentence | Semantic structure |
|---|---|
| The two binding modes probably allow | OxyR [−] oxyR |
| The two binding modes probably allow | OxyR [+] katG [oxidative stress] |
| The two binding modes probably allow | OxyR [+] ahpCF [oxidative stress] |
| The two binding modes probably allow | OxyR [+] dps [oxidative stress] |
| The two binding modes probably allow | OxyR [+] gorA [oxidative stress] |
| The two binding modes probably allow | OxyR [+] oxyS [oxidative stress] |
The words in bold are those extracted in the corresponding normalized semantic structures.
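The normalized semantic structures in the two tables above follow a regular shape: a transcription factor, an effect sign in square brackets, a target gene or operon, and an optional growth condition in a second pair of brackets. A minimal sketch of how such strings could be parsed is shown here. The names `SemanticStructure` and `parse_structure`, the regular expression, and the effect labels are illustrative assumptions, not part of OntoGene or RegulonDB.

```python
import re
from typing import NamedTuple, Optional


class SemanticStructure(NamedTuple):
    """Hypothetical container for one normalized structure."""
    tf: str
    effect: str
    target: str
    growth_condition: Optional[str]


# Assumed pattern for strings such as "OxyR [+] katG [oxidative stress]".
# The effect sign may be "+", "?", an ASCII hyphen, or the Unicode minus sign.
_PATTERN = re.compile(
    r"^\s*(?P<tf>\S+)\s*\[(?P<effect>[+\-\u2212?])\]\s*(?P<target>\S+)"
    r"(?:\s*\[(?P<gc>[^\]]+)\])?\s*$"
)

# Assumed mapping from sign to a readable label.
_EFFECTS = {"+": "activator", "-": "repressor", "\u2212": "repressor", "?": "unknown"}


def parse_structure(text: str) -> SemanticStructure:
    """Split one normalized structure into its parts, or raise ValueError."""
    match = _PATTERN.match(text)
    if match is None:
        raise ValueError(f"not a recognized semantic structure: {text!r}")
    return SemanticStructure(
        tf=match["tf"],
        effect=_EFFECTS[match["effect"]],
        target=match["target"],
        growth_condition=match["gc"],  # None when no condition is given
    )


if __name__ == "__main__":
    print(parse_structure("OxyR [+] katG [oxidative stress]"))
    print(parse_structure("OxyR [\u2212] oxyR"))
```

Structures without a target, such as a bare `MntR [+]`, are rejected by this sketch rather than guessed at.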
OxyR RIs and their GCs
| Interaction number | GENE/operon target | EFFECT | GC |
|---|---|---|---|
| 1 | fhuF | Repressor | Hydrogen peroxide treatment |
| 2 | flu | Activator/repressor | Hydrogen peroxide treatment |
| 3 | gor | Activator | Hydrogen peroxide treatment |
| 4 | grxA | Activator | Hydrogen peroxide treatment |
| 5 | katG | Activator | Hydrogen peroxide treatment |
| 6 | mntH | Activator | Hydrogen peroxide treatment |
| 7 | oxyS | Activator/repressor | Hydrogen peroxide treatment |
| 8 | sufABCDSE | Activator | Hydrogen peroxide treatment |
| 9 | trxC | Activator | Hydrogen peroxide treatment |
| 10 | uof-fur | Activator | Hydrogen peroxide treatment |
| 11 | ychF | Repressor | Hydrogen peroxide treatment |
| 12 | ahpCF | Activator | Hydrogen peroxide treatment, ascorbate treatment |
| 13 | dps | Activator | Hydrogen peroxide treatment, exponential phase, stationary phase |
| 14 | gntP | Repressor | Not found |
| 15 | hcp-hcr | Activator | Not found |
| 16 | uxuAB | Repressor | Not found |
| 17 | ybjC-nfsA-rimK-ybjN | Repressor | Not found |
| 18 | yhjA | Activator | Not found |
| 19 | dsbG | Activator | Oxidative stress |
| 20 | hemH | Activator | Oxidative stress |
| 21 | oxyR | Repressor | Oxidative stress, reducing conditions |
List of all RIs for OxyR extracted from the corpus. The RIs that show a dual effect ‘activator/repressor’ are discussed in the text. For some RIs, no GC was found.
RIs and their GCs of several TFs
| TF name | GENE/operon name | EFFECT | GC |
|---|---|---|---|
| AraC | ara | Repressor | Absence of arabinose |
| ArcA | cydAB | Activator | Anaerobiosis |
| ArcA | gltA | Repressor | Anaerobiosis |
| ArcA | sdhCDAB | Repressor | Anaerobiosis, anoxic transition |
| ArcA | sodA | Repressor | Anaerobiosis |
| CRP | oxyR | Activator | Exponential phase |
| Fis | acs | Repressor | Exponential phase |
| FNR | yfgF | Activator | Anaerobiosis |
| FNR | yhjA | Activator | Anaerobiosis |
| Fur | fhuF | Repressor | Iron-rich conditions + absence of oxidative stress |
| Fur | mntH | Repressor | CO2 treatment, iron treatment |
| Fur | sufABCDSE | Repressor | Iron treatment |
| IHF | acs-yjcH-actP | Repressor | Stationary phase |
| IHF | sufABCDSE | Activator | Oxidative stress |
| IHF + SigmaS | dps | Activator | Stationary phase |
| IscR | iscRSUA | Repressor | Anaerobiosis, reactive oxygen species |
| MntR | dps | Repressor | Stationary phase |
| MntR | mntH | Repressor | Manganese treatment, iron treatment, metal treatment |
| MntR | mntP | Repressor | Manganese treatment |
| MntR | mntS | Repressor | Manganese treatment |
| NarL | hcp-hcr | Activator | Nitrate treatment, Nitrite treatment |
| NarP | hcp-hcr | Activator | Nitrate treatment, Nitrite treatment |
| SigmaS | aidB | Activator | Oxygen-limiting conditions |
| SigmaS | ansP | Activator | Onset of stationary phase |
| SigmaS | artIPQM | Activator | Onset of stationary phase |
| SigmaS | ilvD | Activator | Onset of stationary phase |
| SigmaS | tnaA | Activator | Onset of stationary phase |
| SoxS | mutM | Activator | Superoxide generators treatment |
| SoxS | ybjC-nfsA-rimK-ybjN | Activator | Paraquat treatment |
These are interactions obtained from the same corpus of papers.
GCs and their effect on target genes
| GC | EFFECT | GENE/operon name |
|---|---|---|
| 2,2′-dipyridyl treatment | Induction | mntH |
| 2,2′-dipyridyl treatment | Inhibition | mntS |
| 2,2′-dipyridyl treatment | Induction | isc |
| 2,2′-dipyridyl treatment | Induction | suf |
| Aerobiosis | Induction | katG |
| Anaerobiosis | Induction | arcA |
| Anaerobiosis | Induction | hcp |
| Anaerobiosis | Inhibition | lctPRD |
| Anaerobiosis + nitrate treatment | Induction | hcp |
| Anaerobiosis + nitrite treatment | Induction | hcp |
| Carbon starvation | Induction | csiD |
| Δfur mutant | Induction | ryhB |
| ΔoxyR mutant | Induction | sufA |
| EDTA treatment | Induction | mntH |
| Exponential growth | Induction | bolA |
| Fructuronate treatment | Induction | uxuAB |
| Gluconate treatment | Inhibition | gntP |
| Glucose treatment | Inhibition | gntP |
| Glucose treatment | Inhibition | oxyR |
| Glucose treatment | Inhibition | uxuAB |
| Glucuronate treatment | Induction | uxuAB |
| Hydrogen peroxide treatment | Induction | fpr |
| Hydrogen peroxide treatment | Induction | iscRSUA |
| Hydrogen peroxide treatment | Inhibition | rplB |
| Hydrogen peroxide treatment | Induction | sodA |
| Hydrogen peroxide treatment | Induction | soxS |
| Hydrogen peroxide treatment | Induction | yaeH |
| Hydrogen peroxide treatment | Induction | ydcH |
| Hydrogen peroxide treatment | Induction | ydeN |
| Hydrogen peroxide treatment | Inhibition | yfdI |
| Hydrogen peroxide treatment | Induction | ygaQ |
| Hydrogen peroxide treatment | Induction | ytfK |
| Hydrogen peroxide treatment | Inhibition | fldA |
| Hydrogen peroxide treatment + Δfur mutant | Induction | sufA |
| Iron starvation | Induction | mntH |
| Iron starvation | Induction | isc |
| L-ascorbate + glutamine treatment | Induction | yiaK |
| L-ascorbate + proline treatment | Induction | yiaK |
| L-ascorbate + threonine treatment | Induction | yiaK |
| L-ascorbate treatment | Induction | ulaA |
| L-ascorbate treatment | Induction | ulaG |
| L-ascorbate treatment | Induction | yiaK |
| L-ascorbate treatment + early exponential phase | Regulation | ahpC |
| Mannonic amide treatment | Induction | uxuAB |
| Menadione treatment | Induction | ahpC |
| Menadione treatment | Induction | ryhB |
| Menadione treatment | Induction | sufA |
| Menadione treatment + ΔoxyR | Induction | sufA |
| Menadione treatment + ΔsoxRS | Induction | sufA |
| Menadione treatment + ΔsoxS | Induction | sufA |
| MnCl2 treatment | Induction | sodA |
| Nitrosative stress | Induction | hcp-hcr |
| Nitrosative stress | Induction | hmp |
| Nitrosative stress | Induction | yeaR-yoaG |
| Nitrosative stress | Induction | ytfE |
| Paraquat treatment | Induction | fldA-fur |
| Paraquat treatment | Induction | oxyS |
| Paraquat treatment | Induction | soxS |
| Paraquat treatment | Induction | sufA |
| Plumbagin treatment | Induction | sufA |
| PMS treatment | Induction | ahpC |
| PMS treatment | Induction | katG |
| PMS treatment | Induction | ryhB |
| PMS treatment + Δfur mutant | Induction | sufA |
| Stationary phase | Induction | katE |
| Stationary phase | Inhibition | oxyR |
| Superoxide treatment | No effect | mntH |
As discussed in the text, ‘induction’ and ‘inhibition’ are used here because there is no knowledge about the mechanism or TF involved in these interactions.
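Because the table above lists one growth condition per row, a curator may want to read it the other way round, starting from a gene. The sketch below indexes a few rows copied from the table by gene and then by effect. The function name `conditions_by_gene` and the tuple layout are assumptions made for illustration only.

```python
from collections import defaultdict

# A few (growth condition, effect, gene) rows copied from the table above.
ROWS = [
    ("2,2′-dipyridyl treatment", "Induction", "mntH"),
    ("EDTA treatment", "Induction", "mntH"),
    ("Iron starvation", "Induction", "mntH"),
    ("Superoxide treatment", "No effect", "mntH"),
    ("Glucose treatment", "Inhibition", "oxyR"),
    ("Stationary phase", "Inhibition", "oxyR"),
]


def conditions_by_gene(rows):
    """Group growth conditions by gene, then by reported effect."""
    index = defaultdict(lambda: defaultdict(list))
    for growth_condition, effect, gene in rows:
        index[gene][effect].append(growth_condition)
    return index


if __name__ == "__main__":
    for gene, effects in conditions_by_gene(ROWS).items():
        for effect, conditions in effects.items():
            print(f"{gene}: {effect} under {', '.join(conditions)}")
```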
The (OxyR + katG) sentences in the corpus
| Sentence_ID | Normalized structures | Information within the sentence | Paper |
|---|---|---|---|
| S23 | OxyR [+] katG | | PMID:10419964 ( |
| S17 | OxyR [+] katG | | PMID:12644490 ( |
| S165 | OxyR [+] katG | Reference (PMID:2693740) | PMID:12644490 ( |
| S116 | OxyR [+] katG | | PMID:15009899 ( |
| S38 | OxyR [+] katG | Reference (PMID:2693740) | PMID:1730735 ( |
| S208 | OxyR [+] katG | Reference | PMID:1730735 ( |
| S147 | OxyR [+] katG | Reference (PMID:8087856) | PMID:17464064 ( |
| S157 | OxyR [+] katG | | PMID:17464064 ( |
| S42 | OxyR [?] katG | | PMID:2693740 ( |
| S140 | OxyR [+] katG | | PMID:2693740 ( |
| S180 | OxyR [+] katG | Evidence | PMID:2693740 ( |
| S99 | OxyR [?] katG | | PMID:8990289 ( |
| S141 | OxyR [?] katG | | PMID:8990289 ( |
| S106 | OxyR [+] katG | | PMID:9324269 ( |
| S27 | OxyR [+] katG | Reference | PMID:11443091 ( |
| S25 | OxyR [+] katG | Reference | PMID:11443092 ( |
| S94 | OxyR [?] katG | Evidence | PMID:22539721 ( |
| S118 | OxyR [+] katG | Figure/Table | PMID:22539721 ( |
| S20 | H2O2 [+] katG | Reference | PMID:3045098 ( |
| S20 | OxyR [+] katG | Reference | PMID:3045098 ( |
| S34 | OxyR [?] katG | Reference | PMID:7868602 ( |
| S15 | OxyR [?] katG | | PMID:7984106 ( |
| S91 | OxyR [?] katG | Reference linked | PMID:7984106 ( |
| S135 | OxyR [?] katG | Reference/evidence linked | PMID:8087856 ( |
The first column gives the ID of each sentence within its paper; the last column gives the PMID of that paper, followed by its citation number in this article. The third column indicates whether the sentence includes evidence, a reference, a figure or a table.
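The table above shows the same interaction reported by many sentences, some with a known effect ([+]) and some with an undetermined one ([?]). One simple way to collapse such sentence-level structures into a single interaction is sketched below. This consolidation rule is an assumption made for illustration, not the curation procedure described by the authors, and the name `consolidate` and the "+/-" label for dual effects are hypothetical.

```python
from collections import Counter

# Effect signs of the OxyR/katG rows in the table above:
# 15 sentences report [+] and 8 report [?].
OXYR_KATG_EFFECTS = ["+"] * 15 + ["?"] * 8


def consolidate(effects):
    """Collapse sentence-level effect signs into one interaction-level sign.

    Undetermined signs ("?") are ignored when at least one sentence states
    a direction; both directions together are reported as a dual effect.
    """
    counts = Counter(effects)
    known = {sign for sign in counts if sign in ("+", "-")}
    if not known:
        return "?", counts
    if known == {"+", "-"}:
        return "+/-", counts
    return next(iter(known)), counts


if __name__ == "__main__":
    sign, counts = consolidate(OXYR_KATG_EFFECTS)
    print(sign, dict(counts))
```

Keeping the raw counts alongside the consolidated sign lets a curator see how much sentence-level support each interaction has before accepting it.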