David A Rasko, Garry S A Myers, Jacques Ravel.
Abstract
BACKGROUND: The first microbial genome sequence, that of Haemophilus influenzae, was published in 1995. Since then, more than 400 microbial genome sequences have been completed or commenced. This massive influx of data provides the opportunity to obtain biological insights through comparative genomics. However, few tools are available for this scale of comparative analysis.
Year: 2005 PMID: 15634352 PMCID: PMC545078 DOI: 10.1186/1471-2105-6-2
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. BSR rationale and scatter plot example. A. BLAST score ratio (BSR) calculation, demonstrating how the two coordinates plotted in panels B and C are calculated. B. The location of a peptide's spot reveals the similarity that the peptide has to the two Query genomes. Use of a 0.4 separator is based on ~30% amino acid identity over 30% of the length of the peptide [10]. C. Sample data obtained from comparison of Chlamydia caviae GPIC (GenBank Accession Number AE015925) to the proteomes of Chlamydia muridarum strain Nigg (GenBank Accession Number AE002160) and Chlamydia pneumoniae AR39 (GenBank Accession Number AE002161) [17]. Each point in the figure represents a single peptide in Chlamydia caviae GPIC. This analysis reveals that while these organisms are very similar, C. caviae is more similar to C. pneumoniae AR39 than to C. muridarum strain Nigg, as indicated by the skew of the peptides toward a slope greater than 1.
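The calculation in panel A and the 0.4 separator in panel B can be illustrated with a minimal Python sketch. The caption does not spell out the formula, so the sketch assumes that the BSR of a Reference peptide is its BLAST bit score against a Query proteome divided by its bit score against itself. The function names, the example scores, the quadrant labels, and the choice to count a ratio of exactly 0.4 as a hit are all hypothetical and are not taken from the published tool.

```python
from typing import Iterable, Tuple

THRESHOLD = 0.4  # separator quoted in the Figure 1 caption


def bsr(query_score: float, self_score: float) -> float:
    """Assumed definition: Query bit score divided by the Reference self-hit score."""
    if self_score <= 0:
        raise ValueError("self-hit score must be positive")
    return max(0.0, query_score) / self_score


def bsr_coordinates(self_score: float, score_q1: float, score_q2: float) -> Tuple[float, float]:
    """Return the (x, y) plotting coordinates for one Reference peptide."""
    return bsr(score_q1, self_score), bsr(score_q2, self_score)


def classify(x: float, y: float, threshold: float = THRESHOLD) -> str:
    """Assign a point to one of the four regions cut out by the separator."""
    hit_q1 = x >= threshold  # treating the boundary as a hit is an assumption
    hit_q2 = y >= threshold
    if hit_q1 and hit_q2:
        return "conserved in both queries"
    if hit_q1:
        return "similar to query 1 only"
    if hit_q2:
        return "similar to query 2 only"
    return "unique to reference"


def skew(points: Iterable[Tuple[float, float]]) -> Tuple[int, int]:
    """Count points above (y > x) and below (x > y) the diagonal."""
    above = below = 0
    for x, y in points:
        if y > x:
            above += 1
        elif x > y:
            below += 1
    return above, below


if __name__ == "__main__":
    # Made-up bit scores: self-hit 500, best hit 450 in query 1, 100 in query 2.
    x, y = bsr_coordinates(500.0, 450.0, 100.0)
    print(x, y, classify(x, y))
```

Under these assumptions, a proteome-wide excess of points above the diagonal corresponds to the skew toward a slope greater than 1 described in panel C. Which Query genome sits on which axis is not stated in the caption, so "query 1" and "query 2" are placeholders.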
Figure 2. Genome structure visualization. Direct comparison of two genomes at a time, demonstrating some examples of large-scale genomic rearrangements. Each protein is plotted by the genomic location of its coding region and is color-coded by its degree of similarity based on the BSR, as shown in the legend. A. Comparison of C. caviae GPIC and C. pneumoniae AR39. This comparison contains two genomic rearrangements of different sizes, as indicated by the arrows. B. Comparison of C. caviae GPIC and C. muridarum strain Nigg reveals more extensive genomic rearrangements, suggesting that while these organisms are proteomically similar, their genomes have diverged significantly. C. E. coli CFT073 (GenBank Accession Number AE014075) vs. E. coli K12 (GenBank Accession Number U00096). E. coli CFT073 contains a number of unique insertions that are represented as breakpoints in the plot and highlighted with arrows. These genomes exhibit a high level of synteny and similarity.
Figure 3. Visualization with GGobi. GGobi screenshots of the graphical outputs from the BSR. The proteins for tryptophan synthase alpha and beta subunits are highlighted, as they were unique to the C. caviae genome and represented a significant metabolic adaptation of this species in comparison to the other species compared [17]. A. The scatter plot shows the same data as Figure 1C; however, the interactive nature of GGobi allows visualization of the annotation associated with any of the peptides. B. Synteny plots as seen in GGobi. The same genes from Figure 3A can be highlighted in the synteny plots and their genomic locations can be observed. To take advantage of the interactive mouseover, the BSR is included with the annotation.