Albert Pallejà, Tomàs Reverter, Santiago Garcia-Vallvé, Antoni Romeu.
Abstract
BACKGROUND: Although prokaryotes live in a variety of habitats and differ in metabolic and genomic complexity, they share several features of genomic architecture. Overlapping genes are a common feature of prokaryote genomes. Overlaps tend to be short because the longer an overlap is, the greater the risk of deleterious mutations. The spacers between genes also tend to be short, because prokaryotes tend to reduce their non-coding DNA. However, spacers must be long enough to maintain essential regulatory signals such as the Shine-Dalgarno (SD) sequence, which is responsible for efficient translation.
DESCRIPTION: PairWise Neighbours is an interactive and intuitive database for retrieving information about the spacers and overlapping genes among bacterial and archaeal genomes. It contains 1,956,294 gene pairs from 678 fully sequenced prokaryote genomes and is freely available at http://genomes.urv.cat/pwneigh. The database provides information about the overlaps and their conservation across species. Furthermore, it allows a broad analysis of the intergenic regions, providing useful information such as the location and strength of the SD sequence.
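The spacer or overlap between two neighbouring genes follows directly from their coordinates. The following is a minimal sketch in Python, assuming 1-based inclusive coordinates with the gene at the lower coordinates given first. The `Gene` type, the function names and the example coordinates are hypothetical and are not taken from the PairWise Neighbours schema; the terms "convergent" and "divergent" are the usual labels for opposite-strand neighbours and are used here as an assumption about how such pairs would be described.

```python
from typing import NamedTuple


class Gene(NamedTuple):
    name: str
    start: int   # leftmost coordinate, 1-based, inclusive
    end: int     # rightmost coordinate, 1-based, inclusive
    strand: str  # "+" or "-"


def pair_distance(left: Gene, right: Gene) -> int:
    """Spacer length in bp if positive, minus the overlap length if negative."""
    return right.start - left.end - 1


def orientation(left: Gene, right: Gene) -> str:
    """Relative orientation of two neighbours, with `left` at the lower coordinates."""
    if left.strand == right.strand:
        return "co-directional"
    return "convergent" if left.strand == "+" else "divergent"


def describe(left: Gene, right: Gene) -> str:
    """One-line summary of a gene pair: orientation plus spacer or overlap."""
    d = pair_distance(left, right)
    kind = f"{-d} bp overlap" if d < 0 else f"{d} bp spacer"
    return f"{left.name}/{right.name}: {orientation(left, right)}, {kind}"


# Hypothetical coordinates, chosen only to illustrate the arithmetic.
print(describe(Gene("geneA", 100, 500, "+"), Gene("geneB", 497, 900, "+")))
print(describe(Gene("geneA", 100, 500, "+"), Gene("geneC", 509, 900, "+")))
```

In the first hypothetical pair the genes share positions 497 to 500, which gives a 4 bp overlap; in the second, positions 501 to 508 lie between the genes, which gives an 8 bp spacer.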
Year: 2009 PMID: 19555467 PMCID: PMC2716372 DOI: 10.1186/1471-2164-10-281
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Figure 1. Entity-relationship model of the MySQL database. Schema of the data model designed and translated to a MySQL database.
Figure 2. Study of the conservation of a 4 bp overlap. Compilation of images that users can find when studying the conservation of an overlap. The General Info label shows information about the 4 bp overlap between the NC_000913.b0043 and NC_000913.b0044 genes (A). The BLAST results give an idea of the conservation of the overlap across species (B). The NC_000913.b0044 Downstream Gene label provides gene details (gene function, gene COG, start and stop codons) and SD-related information (position of the minimal ΔG° value, minimal ΔG° value), as well as a graph of the ΔG° values along the translation initiation region (C). The same information is given for the NC_003197.STM078 gene (D), which is an ortholog of NC_000913.b0044.
Figure 3. Study of an incorrectly annotated 130 bp overlap. The BLAST results show that the gene NC_002947.PP_2781 of P. putida KT2440 is longer than its ortholog NC_009512.Pput_2974 in P. putida F1 (A). This difference in length indicates that the 130 bp overlap between NC_002947.PP_2780 and NC_002947.PP_2781 is not conserved and thus not reliable. The NC_002947.PP_2781 Downstream Gene label shows that this gene has no SD sequence (B), while the NC_009512.Pput_2974 Upstream Gene label shows that this gene has an SD sequence 7 nucleotides from the start codon (C).
Figure 4. Study of the location of the SD sequence between a co-directional gene pair. Compilation of images that users can find when studying the location of the SD sequence between the co-directional genes NC_000913.b2644 and NC_000913.b4548, which are separated by 8 bp. The General Info label gives details about the spacer between this gene pair, including the Spacing length and the Spacer sequence (A). The NC_000913.b2644 Upstream Gene label gives information about this gene (B), while the NC_000913.b4548 Downstream Gene label gives information about this gene as well as SD-related information and the corresponding graph of the ΔG° values along the translation initiation region (C).
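The figures above report the position of the SD sequence as the position of the minimal ΔG° value along the translation initiation region. The thermodynamic calculation itself is not described here, so the sketch below swaps in a much simpler technique: it slides a window along the region upstream of the start codon and counts matches to a core motif. The core `AGGAGG`, the function name `best_sd_window` and the example sequence are assumptions made for illustration; this is not the scoring used by the database.

```python
from typing import Optional, Tuple

# Assumed SD core motif; a real analysis would score pairing free energy instead.
SD_CORE = "AGGAGG"


def best_sd_window(upstream: str, core: str = SD_CORE) -> Optional[Tuple[int, int, str]]:
    """Find the window of `upstream` that best matches `core`.

    `upstream` is read 5'->3' and ends immediately before the start codon.
    Returns (matches, gap, window), where `gap` is the number of nucleotides
    between the end of the window and the start codon, or None if the
    region is shorter than the core. Ties keep the window farthest upstream.
    """
    upstream = upstream.upper()
    k = len(core)
    best = None
    for i in range(len(upstream) - k + 1):
        window = upstream[i:i + k]
        matches = sum(1 for x, y in zip(window, core) if x == y)
        gap = len(upstream) - (i + k)
        if best is None or matches > best[0]:
            best = (matches, gap, window)
    return best


# Hypothetical upstream region: a perfect core followed by 7 nucleotides.
print(best_sd_window("TTAAGGAGGTTACATC"))
```

A match count only indicates where an SD-like motif might sit; it says nothing about strength, which is what the ΔG° values in the figures convey.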
Genes with or without SD in E. coli K12
| Gene set | Number of genes | Percentage of genes with SD | Percentage of genes without SD |
| All | 4,133 | 69.66 | 30.34 |
| HEG (1) | 253 | 81.03 | 18.97 |
| HGT (2) | 310 | 68.39 | 31.61 |
| Mean and standard deviation of 100 sets of 300 genes randomly selected from E. coli K12 | 300 | 69.04 ± 2.58 | 30.96 ± 2.58 |
Number of genes and percentage of genes with the Shine-Dalgarno motif from E. coli K12.
(1) HEG (highly expressed genes) extracted from the HEG-DB [39]
(2) HGT (horizontally transferred genes) extracted from the HGT-DB [40]
Abbreviations: SD, Shine-Dalgarno
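The last row of the table summarises 100 random sets of 300 genes. A resampling of that kind can be sketched as follows, assuming each gene is reduced to a single with-SD or without-SD flag. The flags are synthetic, rebuilt only from the totals in the "All" row (4,133 genes, 69.66% with SD); the function names, the seed, sampling without replacement and the use of the sample standard deviation are all assumptions, since the table does not specify them.

```python
import random
import statistics
from typing import List, Tuple


def resample_sd_percentage(has_sd: List[bool], set_size: int = 300,
                           n_sets: int = 100, seed: int = 0) -> Tuple[float, float]:
    """Mean and sample standard deviation of the percentage of genes with SD
    over `n_sets` random gene sets of `set_size`, drawn without replacement."""
    rng = random.Random(seed)
    percentages = []
    for _ in range(n_sets):
        sample = rng.sample(has_sd, set_size)
        percentages.append(100.0 * sum(sample) / set_size)
    return statistics.mean(percentages), statistics.stdev(percentages)


def z_score(observed: float, mean: float, sd: float) -> float:
    """Distance of an observed percentage from the random-set mean, in SD units."""
    return (observed - mean) / sd


# Synthetic flags reproducing only the "All" row totals, not real annotations.
n_genes = 4133
n_with_sd = round(n_genes * 0.6966)
flags = [True] * n_with_sd + [False] * (n_genes - n_with_sd)

mean_pct, sd_pct = resample_sd_percentage(flags)
print(f"random sets: {mean_pct:.2f} ± {sd_pct:.2f} % with SD")

# Values taken from the table: the 253-gene and 310-gene rows against 69.04 ± 2.58.
print(f"253-gene row: z = {z_score(81.03, 69.04, 2.58):.2f}")
print(f"310-gene row: z = {z_score(68.39, 69.04, 2.58):.2f}")
```

Read this way, the 253-gene row lies roughly 4.6 standard deviations above the random-set mean, while the 310-gene row lies about 0.25 standard deviations below it. This is only a rough reading of the table, not a formal significance test.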