Sónia Carneiro, Anália Lourenço, Eugénio C Ferreira, Isabel Rocha.
Abstract
BACKGROUND: Understanding the mechanisms responsible for cellular responses depends on the systematic collection and analysis of information on the main biological concepts involved. Indeed, the identification of biologically relevant concepts in free text, namely genes, tRNAs, mRNAs, gene products and small molecules, is crucial to capture the structure and functioning of different responses.
Year: 2011 PMID: 22587779 PMCID: PMC3372295 DOI: 10.1186/2042-5783-1-14
Source DB: PubMed Journal: Microb Inform Exp ISSN: 2042-5783
Figure 1. The (p)ppGpp-mediated stringent response. (A) Low amino acid concentrations lead to decreased charging of the corresponding tRNAs. (B) The translational machinery depends on the translocation along the mRNA whereby a new acetylated-tRNA is positioned in the ribosome. Whenever an uncharged tRNA binds to the ribosome, the elongation of the polypeptide chain is stalled. (C) The stringent factor RelA is then activated in the presence of the ribosomal protein L11, catalyzing the synthesis of (p)ppGpp nucleotides. (D) These nucleotides bind directly to the RNA polymerase and affect the binding abilities of sigma factors to the core RNA polymerase. (E) The cofactor DksA also binds to the RNA polymerase and augments the (p)ppGpp regulation of the transcription initiation at certain σ70-dependent promoters, functioning both as negative and positive regulators. (F) These regulators change the gene expression by (i) decreasing the transcription activity of genes involved in translational activities and (ii) increasing the transcription of stress-related operons and genes encoding enzymes needed for the synthesis and transport of amino acids.
Figure 2. Corpus annotation contents. Overview of the extent of biological concepts (A) and concept annotations (B) per class in the corpus. GO assignments (C) for molecular functions and biological processes mapped for each set of gene products (i.e. enzymes, transcription factors and other proteins) and MultiFun gene assignments (D) for different functional roles (BC-1 to Metabolism, BC-2 to Information transfer, BC-3 to Regulation, BC-4 to Transport, BC-5 to Cell processes, BC-6 to Cell structure, BC-7 to Location of gene products and BC-8 to Extrachromosomal origin) were recognized in the corpus.
Annotations of the genetic components in the corpus.
| Class | Concept | Number of Annotations | Number of Documents | Frequency (%) (Eq. 1) | Mean (Eq. 2) | Std (Eq. 3) | VMR (Eq. 4) |
|---|---|---|---|---|---|---|---|
| Genes | | 3163 | 138 | 71.50 | 22.92 | 27.23 | 33.14 |
| | | 1315 | 88 | 45.60 | 14.94 | 27.42 | 52.07 |
| | | 354 | 63 | 32.64 | 5.620 | 19.42 | 72.20 |
| | | 534 | 50 | 25.91 | 10.68 | 17.16 | 28.90 |
| | | 91 | 47 | 24.35 | 1.940 | 0.050 | 4.000 |
| | | 523 | 47 | 24.35 | 11.13 | 20.68 | 36.36 |
| | | 82 | 39 | 20.21 | 2.100 | 1.810 | 0.5000 |
| | | 95 | 36 | 18.65 | 2.640 | 3.530 | 4.500 |
| | | 84 | 36 | 18.65 | 2.330 | 3.760 | 4.500 |
| | | 103 | 34 | 17.62 | 3.030 | 7.250 | 16.33 |
| | | 98 | 34 | 17.62 | 2.880 | 6.800 | 18.00 |
| | | 205 | 33 | 17.10 | 6.210 | 10.83 | 16.67 |
| | | 308 | 33 | 17.10 | 9.330 | 16.61 | 28.44 |
| | | 42 | 31 | 16.06 | 1.350 | 0.7400 | 0 |
| | | 389 | 30 | 15.54 | 12.97 | 17.60 | 24.08 |
| | | 240 | 30 | 15.54 | 8.000 | 21.54 | 55.13 |
| | | 144 | 25 | 12.95 | 5.760 | 14.73 | 39.20 |
| | | 60 | 20 | 10.36 | 3.000 | 3.810 | 3.000 |
| | | 23 | 19 | 9.840 | 1.210 | 0.5600 | 0 |
| DNAs | DNA | 1839 | 137 | 70.98 | 13.42 | 16.31 | 19.69 |
| | plasmid DNA | 193 | 36 | 18.65 | 5.360 | 12.31 | 28.80 |
| | chromosomal DNA | 63 | 24 | 12.44 | 2.630 | 2.440 | 2.000 |
| | cDNA | 125 | 23 | 11.92 | 5.430 | 5.820 | 5.000 |
| RNAs | RNA | 4193 | 140 | 72.54 | 29.95 | 38.21 | 49.79 |
| | uncharged tRNA | 1168 | 117 | 60.62 | 9.980 | 19.64 | 40.11 |
| | rRNA | 1116 | 97 | 50.26 | 11.51 | 25.97 | 56.82 |
| | a mRNA | 999 | 91 | 47.15 | 10.98 | 19.52 | 36.10 |
| | rrnA | 911 | 87 | 45.08 | 10.47 | 22.51 | 48.40 |
| | stable RNA | 430 | 87 | 45.08 | 4.940 | 8.030 | 16.00 |
| | a charged tRNA | 140 | 43 | 22.28 | 3.260 | 4.200 | 5.330 |
| | rrnB | 301 | 26 | 13.47 | 11.58 | 19.30 | 32.82 |
| | rrn | 321 | 26 | 13.47 | 12.35 | 30.42 | 75.00 |
| | 16s-rRNAs | 156 | 25 | 12.95 | 6.240 | 9.090 | 13.50 |
Individual genetic components (i.e. genes, DNAs and RNAs) were evaluated considering the number of documents where these entities were annotated and their number of annotations in the corpus. Statistical measurements are detailed in the Methods and Materials section.
ψ A threshold of 10% of the frequency of annotation was set for each genetic component category. However, lists of all annotated entities are provided in Additional file 5.
VMR: variance-to-mean ratio
Std: standard deviation
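The per-concept statistics reported in the table above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `annotation_stats`, the input layout (one annotation count per document in which the concept appears) and the use of the population variance are assumptions, because Eq. 1 to Eq. 4 are not reproduced in this extract. VMR is taken in its usual sense, variance divided by mean.

```python
from statistics import pstdev, pvariance


def annotation_stats(counts_per_doc, total_docs):
    """Summarise the annotations of one concept.

    counts_per_doc: annotation counts, one entry per document in which the
                    concept was annotated (hypothetical input layout).
    total_docs:     number of documents in the whole corpus.
    """
    n_documents = len(counts_per_doc)
    n_annotations = sum(counts_per_doc)
    mean = n_annotations / n_documents
    return {
        "n_annotations": n_annotations,
        "n_documents": n_documents,
        # Share of corpus documents that mention the concept, in percent.
        "frequency_pct": 100.0 * n_documents / total_docs,
        # Average number of annotations per annotated document.
        "mean": mean,
        # Population standard deviation (assumption; a sample estimate is equally plausible).
        "std": pstdev(counts_per_doc),
        # Variance-to-mean ratio: values well above 1 indicate that the
        # annotations are concentrated in a few documents.
        "vmr": pvariance(counts_per_doc) / mean,
    }


if __name__ == "__main__":
    print(annotation_stats([2, 4, 6], total_docs=6))
```

As a consistency check on the frequency column, 138 annotated documents at 71.50% correspond to a corpus of 193 documents; that total is inferred from the table values and is not stated in this extract.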
Annotations of the gene products in the corpus.
| Class | Concept | Number of Annotations | Number of Documents | Frequency (%) (Eq. 1) | Mean (Eq. 2) | Std (Eq. 3) | VMR (Eq. 4) |
|---|---|---|---|---|---|---|---|
| Proteins | Ribosome | 1643 | 128 | 66.32 | 12.84 | 23.57 | 44.08 |
| | Rel | 1021 | 62 | 32.12 | 16.50 | 36.60 | 81.00 |
| | LacZ | 543 | 53 | 27.46 | 10.30 | 17.44 | 28.90 |
| | Sigma 38 factor | 392 | 42 | 21.76 | 9.330 | 15.40 | 25.00 |
| | Sigma factor | 112 | 35 | 18.13 | 3.200 | 5.870 | 8.330 |
| | UvrD | 56 | 35 | 18.13 | 1.600 | 1.300 | 1.000 |
| | RpoB | 252 | 35 | 18.13 | 7.200 | 11.50 | 17.29 |
| | RecA | 99 | 31 | 16.06 | 3.190 | 4.260 | 5.330 |
| | EF-Tu | 223 | 26 | 13.47 | 8.580 | 17.32 | 36.13 |
| | Der | 51 | 25 | 12.95 | 2.040 | 2.140 | 2.000 |
| | Sigma 70 factor | 134 | 21 | 10.88 | 6.380 | 11.19 | 20.17 |
| Transcription factors | Fis | 888 | 18 | 9.330 | 49.33 | 86.88 | 150.9 |
| | Fur | 56 | 13 | 6.740 | 4.310 | 9.260 | 20.25 |
| | CRP | 279 | 12 | 6.220 | 23.25 | 36.28 | 56.35 |
| | DnaA | 121 | 11 | 5.700 | 11.00 | 23.00 | 48.09 |
| | H-NS | 73 | 11 | 5.700 | 6.640 | 10.73 | 16.67 |
| | LexA | 101 | 10 | 5.180 | 10.10 | 18.32 | 32.40 |
| | IHF | 54 | 9 | 4.660 | 6.000 | 5.250 | 4.170 |
| Enzymes | RelA | 4138 | 152 | 78.76 | 27.22 | 31.16 | 35.59 |
| | RNAP | 1873 | 117 | 60.62 | 16.01 | 28.08 | 49.00 |
| | SpoT | 1024 | 60 | 31.09 | 17.07 | 42.19 | 103.8 |
| | EcoRI | 215 | 53 | 27.46 | 4.060 | 4.970 | 4.000 |
| | β-galactosidase | 294 | 47 | 24.35 | 6.260 | 6.550 | 6.000 |
| | BamHI | 149 | 43 | 22.28 | 3.470 | 5.870 | 8.330 |
| | HindIII | 114 | 41 | 21.24 | 2.780 | 2.160 | 2.000 |
| | RNase | 109 | 36 | 18.65 | 3.030 | 4.280 | 5.330 |
| | YbcS | 50 | 23 | 11.92 | 2.170 | 2.620 | 2.000 |
| | Reverse transcriptase | 34 | 21 | 10.88 | 1.620 | 1.050 | 1.000 |
| | tRNA synthetase | 54 | 20 | 10.36 | 2.700 | 2.630 | 2.000 |
| | Endonuclease I | 29 | 20 | 10.36 | 1.450 | 1.400 | 1.000 |
Individual gene products (i.e. enzymes, transcription factors and other proteins) were evaluated considering the number of documents where these entities were annotated and their number of annotations in the corpus. Statistical measurements are detailed in the Methods and Materials section.
ψ A threshold of 10% of the frequency of annotation was set for enzymes and other proteins, whereas a threshold of 5% was set for transcription factors. However, lists of all annotated entities are provided in Additional file 6.
VMR: variance-to-mean ratio
Std: standard deviation
Annotations of the small molecules in the corpus.
| Concept | Number of Annotations | Number of Documents | Frequency (%) (Eq. 1) | Mean (Eq. 2) | Std (Eq. 3) | VMR (Eq. 4) |
|---|---|---|---|---|---|---|
| Amino acids | 1557 | 160 | 82.90 | 9.730 | 13.83 | 18.78 |
| Nucleotides | 1230 | 145 | 75.13 | 8.480 | 9.290 | 10.13 |
| ppGpp | 4159 | 145 | 75.13 | 28.68 | 31.00 | 34.32 |
| β-D-glucose | 792 | 123 | 63.73 | 6.440 | 10.63 | 16.67 |
| Pi | 662 | 113 | 58.55 | 5.860 | 12.60 | 28.80 |
| Guanosine | 407 | 112 | 58.03 | 3.630 | 3.540 | 3.000 |
| ATP | 587 | 100 | 51.81 | 5.870 | 7.410 | 9.800 |
| GTP | 748 | 91 | 47.15 | 8.220 | 13.85 | 21.13 |
| AMP | 598 | 90 | 46.63 | 6.640 | 10.09 | 16.67 |
| PPi | 447 | 87 | 45.08 | 5.140 | 5.180 | 5.000 |
| H2O | 210 | 83 | 43.01 | 2.530 | 2.430 | 2.000 |
| Tris | 261 | 82 | 42.49 | 3.180 | 2.800 | 1.330 |
| Carbon | 288 | 80 | 41.45 | 3.600 | 4.850 | 5.330 |
| Chloramphenicol | 435 | 77 | 39.90 | 5.650 | 8.250 | 12.80 |
| pppGpp | 632 | 74 | 38.34 | 8.540 | 13.61 | 21.13 |
| (p)ppGpp | 3127 | 72 | 37.31 | 43.43 | 56.00 | 72.93 |
| NaCl | 189 | 67 | 34.72 | 2.820 | 2.790 | 2.000 |
| L-lactate | 413 | 65 | 33.68 | 6.350 | 20.84 | 66.67 |
| Glycerol | 145 | 65 | 33.68 | 2.230 | 1.850 | 0.5000 |
| Ethanol | 189 | 65 | 33.68 | 2.910 | 4.400 | 8.000 |
| Na+ | 145 | 63 | 32.64 | 2.300 | 2.100 | 2.000 |
| Ampicillin | 321 | 62 | 32.12 | 5.180 | 12.74 | 28.80 |
| EDTA | 142 | 60 | 31.09 | 2.370 | 1.680 | 0.5000 |
| L-methionine | 248 | 59 | 30.57 | 4.200 | 6.670 | 9.000 |
| L-histidine | 183 | 59 | 30.57 | 3.100 | 5.410 | 8.330 |
| L-valine | 396 | 57 | 29.53 | 6.950 | 11.90 | 20.17 |
| Formate | 136 | 57 | 29.53 | 2.390 | 2.360 | 2.000 |
Individual small molecules were evaluated considering the number of documents where these entities were annotated and the number of annotations in the corpus. Statistical measurements are detailed in the Methods and Materials section.
ψ A threshold of 30% of the frequency of annotation was set for compounds. However, lists of all annotated entities are provided in Additional file 7.
VMR: variance-to-mean ratio
Std: standard deviation
Figure 3. Proteins co-annotated with ppGpp, pppGpp and the collective (p)ppGpp entities. The nodes represent proteins with frequency of co-annotation higher than 10%. Highly co-annotated proteins are represented by nodes with a larger size (frequencies of co-annotation greater than 50%). Pink nodes represent the proteins that were co-annotated with the three entities, while green and yellow nodes indicate the proteins that were co-annotated with only two and one of the nucleotides, respectively.
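The co-annotation network described in this caption can be approximated with a short sketch. It is illustrative only: the function names, the input layout (a mapping from document identifier to the set of entities annotated in it) and the choice of denominator (documents in which the nucleotide is annotated) are assumptions, since the extract does not define how the frequency of co-annotation was computed.

```python
from collections import Counter


def co_annotation_frequencies(docs, target, proteins):
    """Percentage of documents annotated with `target` that also mention each protein.

    docs:     dict mapping a document id to the set of entity names annotated in it.
    target:   nucleotide entity of interest, e.g. "ppGpp".
    proteins: set of entity names that belong to the protein classes.
    """
    target_docs = [entities for entities in docs.values() if target in entities]
    if not target_docs:
        return {}
    counts = Counter(p for entities in target_docs for p in entities & proteins)
    return {p: 100.0 * c / len(target_docs) for p, c in counts.items()}


def node_colours(freqs_by_target, threshold=10.0):
    """Colour a protein by the number of targets it is co-annotated with.

    Only frequencies above `threshold` (in percent) count; the palette
    follows the caption and assumes at most three targets.
    """
    membership = Counter()
    for freqs in freqs_by_target.values():
        for protein, freq in freqs.items():
            if freq > threshold:
                membership[protein] += 1
    palette = {3: "pink", 2: "green", 1: "yellow"}
    return {protein: palette[n] for protein, n in membership.items()}


if __name__ == "__main__":
    # Toy documents, not data from the corpus.
    docs = {
        "d1": {"ppGpp", "RelA", "SpoT"},
        "d2": {"ppGpp", "RelA"},
        "d3": {"pppGpp", "RelA"},
    }
    proteins = {"RelA", "SpoT"}
    freqs = {t: co_annotation_frequencies(docs, t, proteins) for t in ("ppGpp", "pppGpp")}
    print(node_colours(freqs))
```

Node size could be derived the same way, by flagging frequencies above 50%.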
Figure 4. Venn diagram comparing annotations from the corpus and selected reviews. The diagram shows the number of biological concepts per class found in the corpus and in the latest reviews considered relevant to this subject. The intersection gives the number of biological concepts reported in both sets of documents.
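The counts behind such a Venn diagram reduce to set operations per class. The sketch below assumes that the concepts of each class are available as sets of names for the corpus and for the reviews; the function name `venn_counts_per_class` and the input layout are hypothetical.

```python
def venn_counts_per_class(corpus, reviews):
    """Count concepts found only in the corpus, only in the reviews, or in both.

    corpus, reviews: dicts mapping a class name (e.g. "genes") to an
                     iterable of concept names annotated in that source.
    """
    result = {}
    for cls in sorted(set(corpus) | set(reviews)):
        a = set(corpus.get(cls, ()))
        b = set(reviews.get(cls, ()))
        result[cls] = {
            "corpus_only": len(a - b),
            "shared": len(a & b),  # the intersecting zone of the diagram
            "reviews_only": len(b - a),
        }
    return result


if __name__ == "__main__":
    # Toy sets, not the counts reported in the figure.
    corpus = {"enzymes": {"RelA", "SpoT", "RNAP"}}
    reviews = {"enzymes": {"RelA", "SpoT", "Endonuclease I"}}
    print(venn_counts_per_class(corpus, reviews))
```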
Some examples of less-reported entities (namely in recent reviews), which are relevant in the E. coli stringent response.
| Biological entities | Freq (%) | Details | References |
|---|---|---|---|
| DnaJ - chaperone with DnaK | 3.11 | Chaperone protein that assists the DnaJ/DnaK/GrpE system of | [ |
| ClpB chaperone | 1.55 | ClpB, together with the DnaJ/DnaK/GrpE chaperone system, is able to resolubilize aggregated proteins. | [ |
| GroEL-GroES chaperonin complex | 0.52 | GroEL and GroES are both induced by heat and when ppGpp is overproduced in | [ |
| RuvB - branch migration of Holliday structures; repair helicase | 1.55 | Component of the RuvABC enzymatic complex that promotes the rescue of stalled (often formed by ppGpp) or broken DNA replication forks in | [ |
| CsrA - carbon storage regulator | 1.04 | Regulator of carbohydrate metabolism, which activates UvrY, responsible for the transcription of | [ |
| | 0.52 | Encodes the UvrY protein that has been shown to be the cognate response regulator for the BarA sensor protein. This regulator participates in controlling several genes involved in the DNA repair system (e.g. CsrA) and carbon metabolism. | [ |
| | 0.52 | Gene encoding the CstA peptide transporter, whose expression is induced by carbon starvation and requires the CRP-cAMP transcriptional regulator. The CstA translation is regulated by the CsrA that occludes ribosome binding to the | [ |
| CspD - DNA replication inhibitor | 0.52 | CspD is a toxin that appears to inhibit the DNA replication. ppGpp is one of the positive factors for the expression of | [ |
| FabH - β-ketoacyl-ACP synthase III | 0.52 | A key enzyme in the initiation of fatty acids biosynthesis that is stringently regulated by ppGpp. | [ |
| FadR transcriptional dual regulator | 1.55 | Regulates the fatty acid biosynthesis and fatty acid degradation at the level of transcription. ppGpp has been shown to be also involved in the regulation of these pathways | [ |
| NtrC-Phosphorylated transcriptional dual regulator | 1.04 | Regulatory protein involved in the assimilation of nitrogen and in slow growth caused by N-limited condition. It was reported that ppGpp levels increase upon nitrogen starvation. | [ |
| | 2.59 | Gene encoding the Dps protein that is highly abundant in the stationary-phase and is required for the starvation responses. It was found to be regulated by ppGpp and RpoS. | [ |
| | 2.07 | Gene induced during phosphate starvation that has been associated with the accumulation of ppGpp. | [ |
| | 2.07 | Encodes the MazE antitoxin, a component of the MazE-MazF system that causes a "programmed cell death" in response to stresses, including starvation. Genes | [ |
| | 0.52 | Encodes the MazG nucleoside pyrophosphohydrolase that limits the detrimental effects of the MazF toxin under nutritional stress conditions. Overexpression of | [ |
Figure 5. Comparison of the expansion of knowledge to the applied experimental techniques. Bars represent the number of biological entities (left Y axis) found for the three major biological classes, i.e., genetic components (genes, RNAs and DNAs), gene products (proteins, transcription factors and enzymes) and small molecules. Lines plot the number of experimental techniques (right Y axis) associated with the annotated PSI-MI classes.
PSI-MI assignments to annotated experimental techniques.
| PSI-MI ID | Technique | Frequency | Mean | Std | VMR | 1970-1979 | 1980-1989 | 1990-1999 | 2000-2009 | |
|---|---|---|---|---|---|---|---|---|---|---|
| MI:0659 | experimental feature detection | 55% | 2.44 | 2.76 | 2 | 64% | 65% | 67% | 67% | |
| MI:0833 | autoradiography | 25% | 1.65 | 1.4 | 1 | 29% | 35% | 22% | 22% | |
| MI:0113 | western blot | 21% | 4.95 | 3.38 | 2.25 | - | 13% | 18% | 44% | |
| MI:0074 | mutation analysis | 20% | 3.34 | 3.24 | 3 | 14% | 20% | 27% | 32% | |
| MI:0114 | x-ray crystallography | 4% | 1.63 | 1.46 | 1 | - | - | 2% | 8% | |
| MI:0811 | insertion analysis | 4% | 1.14 | 0.38 | 0 | - | - | 7% | 4% | |
| MI:0091 | chromatography technology | 50% | 4.23 | 4.14 | 4 | 100% | 85% | 55% | 64% | |
| MI:0254 | genetic interference | 42% | 2.68 | 2.35 | 2 | - | 28% | 69% | 64% | |
| MI:0807 | comigration in gel electrophoresis | 37% | 2.51 | 2.13 | 2 | 21% | 48% | 51% | 49% | |
| MI:0045 | experimental interaction detection* | 36% | 3.1 | 3.12 | 3 | 14% | 45% | 47% | 41% | |
| MI:0808 | comigration in sds page | 27% | 2.02 | 1.61 | 0.5 | 7% | 23% | 33% | 29% | |
| MI:0099 | scintillation proximity assay | 24% | 1.94 | 1.65 | 1 | 64% | 35% | 20% | 15% | |
| MI:0051 | fluorescence technology | 16% | 1.84 | 1.8 | 1 | 7% | 20% | 5% | 22% | |
| MI:0071 | molecular sieving | 15% | 2.25 | 3.19 | 4.5 | 29% | 13% | 15% | 13% | |
| MI:0217 | phosphorylation reaction | 13% | 2.72 | 3.74 | 4.5 | 7% | 8% | 16% | 14% | |
| MI:0415 | enzymatic study | 12% | 1.96 | 2.11 | 4 | 7% | 8% | 11% | 19% | |
| MI:0008 | array technology | 10% | 8.47 | 7.47 | 6.13 | - | - | - | 22% | |
| MI:0928 | filter trap assay | 9% | 2.24 | 2.47 | 2 | 36% | 13% | 4% | 6% | |
| MI:0004 | affinity chromatography technology | 8% | 2.33 | 2.44 | 2 | - | 5% | 5% | 13% | |
| MI:0428 | imaging technique | 7% | 1.57 | 1.41 | 1 | - | 8% | 4% | 11% | |
| MI:0047 | far western blotting | 6% | 1.5 | 0.82 | 0 | - | 3% | 7% | 8% | |
| MI:0435 | protease assay | 6% | 3.92 | 4.01 | 5.33 | - | 5% | 5% | 8% | |
| MI:0017 | classical fluorescence spectroscopy | 6% | 1.08 | 0.29 | 0 | - | - | 5% | 11% | |
| MI:0089 | protein array | 6% | 1.64 | 1.62 | 1 | - | 3% | 2% | 11% | |
| MI:0029 | cosedimentation through density gradient | 5% | 5.22 | 3.94 | 1.8 | 43% | 10% | - | 2% | |
| MI:0040 | electron microscopy | 4% | 2.57 | 1.31 | 0.5 | - | 10% | - | 4% | |
| MI:0676 | tandem affinity purification | 3% | 3.8 | 5.18 | 8.33 | - | - | - | 7% | |
| MI:0054 | fluorescence-activated cell sorting | 3% | 5.8 | 4.6 | 3.2 | - | - | 4% | 4% | |
| MI:0413 | electrophoretic mobility shift assay | 3% | 1.6 | 0.77 | 0 | - | - | 5% | 2% | |
| MI:0012 | bioluminescence resonance energy transfer | 2% | 6.25 | 5.68 | 4.17 | - | - | 4% | 2% | |
| MI:0018 | two hybrid | 2% | 2 | 1.41 | 0.5 | - | - | - | 7% | |
| MI:0053 | fluorescence polarization spectroscopy | 2% | 5 | 2.94 | 0.8 | - | 3% | - | 2% | |
| MI:0397 | two hybrid array | 2% | 2 | 1.41 | 0.5 | - | - | 2% | 2% | |
| MI:0227 | reverse phase chromatography | 2% | 3.25 | 2.87 | 1.33 | - | - | 5% | 1% | |
| MI:0226 | ion exchange chromatography | 1% | 1 | 0 | 0 | - | - | - | 1% | |
| MI:0031 | protein cross-linking with a bifunctional reagent | 1% | 7 | 4 | 2.29 | - | 3% | - | 1% | |
| MI:0052 | fluorescence correlation spectroscopy | 1% | 1 | 0 | 0 | - | - | - | 1% | |
| MI:0416 | fluorescence microscopy | 1% | 2.5 | 0.71 | 0 | - | - | - | 2% | |
| MI:0016 | circular dichroism | 1% | 1.5 | 0.71 | 0 | - | - | - | 2% | |
| MI:0225 | chromatin immunoprecipitation array | 1% | 1 | 0 | 0 | - | - | - | 1% | |
| MI:0872 | atomic force microscopy | 1% | 1 | 0 | 0 | - | - | - | 1% | |
| MI:0049 | filter binding | 1% | 1 | 0 | 0 | - | 3% | 2% | - | |
| MI:0426 | light microscopy | 1% | 1 | 0 | 0 | - | - | - | 2% | |
| MI:0088 | primer specific pcr | 40% | 10.38 | 15.87 | 22.5 | - | 8% | 29% | 95% | |
| MI:0080 | partial dna sequence identification by hybridization | 27% | 3.75 | 3.47 | 3 | 14% | 30% | 29% | 26% | |
| MI:0078 | nucleotide sequence identification | 20% | 1.77 | 1.26 | 1 | - | 15% | 25% | 22% | |
| MI:0103 | southern blot | 14% | 3.04 | 2.05 | 1.33 | - | 8% | 15% | 19% | |
| MI:0929 | northern blot | 8% | 5.56 | 4.8 | 3.2 | - | 3% | 11% | 11% | |
| MI:0421 | identification by antibody | 6% | 1.82 | 1.51 | 1 | - | 8% | 5% | 6% | |
| MI:0427 | identification by mass spectrometry | 5% | 1.67 | 0.94 | 0 | - | - | 4% | 8% | |
| MI:0082 | peptide mass fingerprinting | 2% | 1.5 | 0.71 | 0 | - | - | - | 5% | |
| MI:0093 | protein sequence identification | 1% | 1 | 0 | 0 | - | - | 2% | ||
| MI:0411 | enzyme linked immunosorbent assay | 1% | 4 | 2 | 1 | - | - | 2% | 1% | |
| MI:0714 | nucleic acid transduction | 26% | 2.22 | 2.82 | 2 | 14% | 15% | 31% | 29% | |
| MI:0715 | nucleic acid conjugation | 6% | 1.73 | 1.41 | 1 | 7% | 3% | 5% | 7% | |
| MI:0308 | electroporation | 5% | 1.89 | 1.56 | 1 | - | - | 2% | 9% | |
| MI:0343 | cdna library | 3% | 1 | 0 | 0 | - | - | - | 6% | |
| MI:0194 | cleavage reaction | 1% | 1 | 0 | 0 | - | 3% | - | - | |
| MI:0373 | dye label | 5% | 1.2 | 0.45 | 0 | 7% | 8% | 5% | 4% | |
Experimental techniques were evaluated considering the number of documents where these concepts were annotated and the number of annotations in the corpus. Moreover, the frequency of annotation of experimental techniques was also estimated for documents published in the four decades (from 1970 to 2009). Statistical measures are detailed in the Methods and Materials section.
* These PSI-MI general classes were used to identify techniques that did not map into any particular technique within the class.
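The per-decade frequencies in the table can be illustrated with the sketch below. It assumes that the frequency for a decade is the share of documents published in that decade in which the technique was annotated at least once; the function name `frequency_per_decade` and the input layout are hypothetical, and the exact definition is given in the Methods and Materials section rather than in this extract.

```python
from collections import defaultdict


def frequency_per_decade(doc_years, annotated_docs):
    """Percentage of documents per decade in which a concept is annotated.

    doc_years:      dict mapping a document id to its publication year.
    annotated_docs: set of document ids in which the concept was annotated.
    """
    totals = defaultdict(int)
    hits = defaultdict(int)
    for doc_id, year in doc_years.items():
        decade = (year // 10) * 10  # 1984 -> 1980
        totals[decade] += 1
        if doc_id in annotated_docs:
            hits[decade] += 1
    return {decade: 100.0 * hits[decade] / totals[decade] for decade in sorted(totals)}


if __name__ == "__main__":
    # Toy publication years, not data from the corpus.
    years = {"a": 1975, "b": 1978, "c": 1984, "d": 2003}
    print(frequency_per_decade(years, {"a", "d"}))
```

Decades without any document are omitted from the result, which keeps them distinguishable from decades in which the frequency is zero.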
MultiFun cellular function assignments.
| MultiFun Concept | Frequency of Ontology Annotation | Brief Description | Gene Name | Gene Frequency of Assignment | Gene Frequency of Annotation | Gene Product Name | Gene Product Frequency of Annotation |
|---|---|---|---|---|---|---|---|
| BC-1.7.33 Nucleotide and nucleoside conversions | 76% | The chemical reactions involved in the central carbon metabolism by which a nucleobase, nucleoside or nucleotide is converted from another nucleobase, nucleoside or nucleotide. | | 68% | 72% | RelA | 79% |
| | | | | 28% | 46% | SpoT | 31% |
| BC-3.1.3.4 Proteases, cleavage of compounds | 55% | Proteins that hydrolyse a peptide bond or bonds within a protein during posttranscriptional regulatory processes. | | 91% | 46% | SpoT | 31% |
| BC-2.2.2 Transcription related functions | 51% | The information transfer related functions involved in the synthesis of RNA on a template of DNA. | | 22% | 6% | Fis | 9% |
| | | | | 17% | 16% | RpoB | 18% |
| BC-2.3.2 Translation | 48% | The cellular metabolic process by which a protein is formed, using the sequence of a mature mRNA molecule to specify the sequence of amino acids in a polypeptide chain. | | 23% | 3% | DksA | 8% |
| | | | | 17% | 6% | RplK | 7% |
| | | | | 12% | 18% | RpsG | NA |
| | | | | 11% | 19% | RpsL | 3% |
| BC-5.5.3 Starvation | 47% | A state or activity of a cell or an organism as a result of adaptation to starvation. | | 85% | 46% | SpoT | 31% |
| | | | | 12% | 3% | DksA | 8% |
| BC-1.1.1 Carbon compounds | 46% | The metabolic reactions by which living organisms utilise carbon compounds. | | 49% | 26% | LacZ | 28% |
| | | | | 22% | 16% | PtsG | 1% |
| BC-2.3.8 Ribosomal proteins | 44% | Proteins that associate to form a ribosome involved in genetic information transfer in cells. | | 24% | 6% | RplK | 7% |
| | | | | 18% | 18% | RpsG | NA |
| | | | | 17% | 19% | RpsL | 3% |
| BC-3.1.2.3 Repressor | 40% | Any transcription regulator that prevents or downregulates transcription. | | 41% | 6% | Fis | 9% |
| BC-3.1.2.2 Activator | 32% | Any transcription regulator that induces or upregulates transcription. | | 45% | 6% | Fis | 9% |
MultiFun terms were assigned to annotated concepts associated with genes. A threshold of 30% of documents was considered for ontology assignment and a threshold of 10% was used to point out the genes that most contributed to such assignment.
NA: corresponds to non-annotated gene products in the corpus.
GO biological processes assignments.
| Gene Ontology Concept | Frequency of Ontology Annotation | Brief Description | Gene Product Name | Gene Product Frequency of Assignment | Gene Product Frequency of Annotation | Coding Gene Name | Coding Gene Frequency of Annotation |
|---|---|---|---|---|---|---|---|
| GO:0008152 Metabolic process | 89% | The chemical reactions and pathways, including anabolism and catabolism, by which living organisms transform chemical substances. | RelA | 80% | 79% | | 72% |
| | | | LacZ | 10% | 24% | | 26% |
| GO:0015949 Nucleobase, nucleoside and nucleotide interconversion | 80% | The chemical reactions and pathways by which a nucleobase, nucleoside or nucleotide is synthesized from another nucleobase, nucleoside or nucleotide. | RelA | 80% | 79% | | 72% |
| | | | SpoT | 20% | 31% | | 46% |
| GO:0015969 Guanosine tetraphosphate metabolic process | 80% | The chemical reactions and pathways involving guanosine tetraphosphate (5'-ppGpp-3'), a derivative of guanine riboside with four phosphates. | RelA | 80% | 79% | | 72% |
| | | | SpoT | 20% | 31% | | 46% |
| GO:0006350 Transcription | 56% | The synthesis of either RNA on a template of DNA or DNA on a template of RNA. | RpoS | 16% | 22% | | 17% |
| | | | CRP | 12% | 6% | | 4% |
| | | | RpoB | 10% | 18% | | 16% |
| | | | Mfd | 10% | 2% | | 2% |
| GO:0006355 Regulation of transcription, DNA-dependent | 52% | Any process that modulates the frequency, rate or extent of DNA-dependent transcription. | RpoS | 20% | 22% | | 17% |
| | | | CRP | 14% | 6% | | 4% |
| | | | Mfd | 12% | 2% | | 2% |
| GO:0006412 Translation | 40% | The cellular metabolic process by which a protein is formed, using the sequence of a mature mRNA molecule to specify the sequence of amino acids in a polypeptide chain. | RplK | 28% | 7% | | 6% |
| | | | DksA | 28% | 8% | | 3% |
| | | | EF-Tu | 13% | 14% | | 3% |
| GO:0006950 Response to stress | 39% | A change in state or activity of a cell or an organism as a result of a disturbance in cellular homeostasis, usually, but not necessarily, exogenous. | RecA | 20% | 16% | | 20% |
| | | | RelB | 16% | 4% | | 3% |
| | | | NusA | 10% | 5% | | 4% |
| GO:0042594 Response to starvation | 39% | A change in state or activity of a cell or an organism as a result of a starvation stimulus, deprivation of nourishment. | SpoT | 67% | 31% | | 46% |
| | | | DksA | 30% | 8% | | 3% |
| GO:0006970 Response to osmotic stress | 38% | A change in state or activity of a cell or an organism as a result of a stimulus indicating an increase or decrease in the concentration of solutes outside the organism or cell. | RpoS | 59% | 22% | | 17% |
| | | | EF-Tu | 34% | 14% | | 3% |
| GO:0005975 Carbohydrate metabolic process | 36% | The chemical reactions and pathways involving carbohydrates, any of a group of organic compounds based on the general formula Cx(H2O)y. | LacZ | 94% | 24% | | 26% |
| GO:0006974 Response to DNA damage stimulus | 36% | A change in state or activity of a cell or an organism as a result of a stimulus indicating damage to its DNA from environmental insults or errors during metabolism. | Mfd | 28% | 2% | | 2% |
| | | | RecA | 11% | 16% | | 20% |
| | | | RecG | 11% | 3% | | NA |
| GO:0006281 DNA repair | 36% | The process of restoring DNA after damage that includes direct reversal, base excision repair, nucleotide excision repair, photoreactivation, bypass, double-strand break repair pathway, and mismatch repair pathway. | Mfd | 28% | 2% | | 2% |
| | | | RecA | 11% | 16% | | 20% |
| | | | RecG | 11% | 3% | | NA |
GO terms were assigned to annotated concepts associated with gene products annotations. A threshold of 35% of documents was considered for ontology assignment and a threshold of 10% was used to point out the gene products that most contributed to such assignment.
NA: corresponds to non-annotated genes in the corpus.
Assignment of GO concepts related to stress responses.
| GO ID | Concept | Brief Description | 1970-1979 | 1980-1989 | 1990-1999 | 2000-2009 |
|---|---|---|---|---|---|---|
| GO:0042594 | Response to starvation | A change in state or activity of a cell or an organism as a result of a starvation stimulus, deprivation of nourishment. | - | 15% | 28% | 68% |
| GO:0006974 | Response to DNA damage stimulus | A change in state or activity of a cell or an organism as a result of a stimulus indicating damage to its DNA. | 7% | 26% | 31% | 50% |
| GO:0006970 | Response to osmotic stress | A change in state or activity of a cell as a result of an increase or decrease in the concentration of solutes outside the cell. | 21% | 28% | 35% | 50% |
| GO:0006950 | Response to stress | A change in state or activity of a cell or an organism as a result of a disturbance in organismal or cellular homeostasis. | - | 31% | 46% | 46% |
| GO:0006979 | Response to oxidative stress | A change in state or activity of a cell or an organism as a result of oxidative stress. | - | 3% | 20% | 45% |
| GO:0009432 | SOS response | An error-prone process for repairing damaged microbial DNA. | - | 23% | 30% | 45% |
| GO:0046677 | Response to antibiotic | A change in state or activity of a cell or an organism as a result of an antibiotic stimulus. | - | 23% | 30% | 15% |
| GO:0042493 | Response to drug | A change in state or activity of a cell or an organism as a result of a drug stimulus. | - | 15% | 35% | 11% |
| GO:0009266 | Response to temperature stimulus | A change in state or activity of a cell or an organism as a result of a temperature stimulus. | - | - | 9% | 11% |
| GO:0042742 | Defense response to bacterium | Reactions triggered in response to the presence of a bacterium that act to protect the cell or organism. | 14% | 26% | 9% | 8% |
| GO:0015968 | Stringent response | A specific global change in the metabolism of a bacterial cell as a result of starvation. | - | 13% | 7% | 6% |
| GO:0009408 | Response to heat | A change in state or activity of a cell or an organism as a result of a heat stimulus. | - | 3% | 13% | 5% |
| GO:0009409 | Response to cold | A change in state or activity of a cell or an organism as a result of a cold stimulus. | - | - | - | 3% |
| GO:0009636 | Response to toxin | A change in state or activity of a cell or an organism as a result of a toxin stimulus. | - | - | - | 3% |
| GO:0009267 | Cellular response to starvation | A change in state or activity of a cell as a result of deprivation of nourishment. | - | - | - | 1% |
| GO:0046688 | Response to copper ion | A change in state or activity of a cell or an organism as a result of a copper ion stimulus. | - | - | - | 1% |
| GO:0009269 | Response to desiccation | A change in state or activity of a cell or an organism as a result of a desiccation stimulus. | - | - | 4% | - |
| GO:0031427 | Response to methotrexate | A change in state or activity of a cell or an organism as a result of a methotrexate stimulus. | - | - | 2% | - |
The frequency of annotation of stress response-related concepts was estimated for documents published in the four decades analysed (from 1970 to 2009).
Figure 6. Semi-automatic information extraction approach. The first step encompasses the retrieval of relevant documents that are then processed to recognize biological concepts. In the following step, a manual curation procedure is undertaken to ensure the quality of the final corpus. Ontological terms are further mapped to enable functional enrichment analysis. The corpus analysis enables the identification of key players or significant information by an incremental curation that can further deliver information for retrieving new relevant documents.
Annotation statistics used in the analysis.
| Frequency | (Eq. 1) | |
|---|---|---|
| Mean | (Eq. 2) | |
| Std | (Eq. 3) | |
| VMR | (Eq. 4) | |
| (Eq. 5) | ||