H Chiapello, I Bourgait, F Sourivong, G Heuclin, A Gendrault-Jacquemard, M-A Petit, M El Karoui.
Abstract
BACKGROUND: Public databases now contain a multitude of complete bacterial genomes, including several genomes of the same species. The available data offer new opportunities to address questions about bacterial genome evolution, a task that requires reliable, fine-scale comparison data for closely related genomes. Recent analyses using pairwise whole-genome alignments have shown that bacterial genomes can be segmented into a common conserved backbone and strain-specific sequences called loops.
Year: 2005 PMID: 16011797 PMCID: PMC1187871 DOI: 10.1186/1471-2105-6-171
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. Flow diagram of bacterial genome segmentation in MOSAIC. Segmentation in MOSAIC includes four main steps: selection of NCBI bacterial genomes using MUMmer and MGA, processing of genome alignments using MGA, backbone/loop segmentation and database integration using Perl scripts.
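The backbone/loop segmentation step can be illustrated with a minimal sketch. It assumes that backbone segments are already available as half-open coordinate intervals on one linear genome, and it derives loops as the gaps between them. The function name `loops_from_backbone`, the interval convention and the `min_loop_size` parameter are hypothetical and are not taken from the MOSAIC Perl scripts.

```python
from typing import List, Tuple

# Half-open interval [start, end) in bp; this convention is an assumption.
Interval = Tuple[int, int]


def loops_from_backbone(backbone: List[Interval], genome_length: int,
                        min_loop_size: int = 1) -> List[Interval]:
    """Return the regions of a linear genome not covered by backbone segments."""
    loops: List[Interval] = []
    cursor = 0  # end of the backbone seen so far
    for start, end in sorted(backbone):
        # A gap before this backbone segment is a candidate loop.
        if start - cursor >= min_loop_size:
            loops.append((cursor, start))
        cursor = max(cursor, end)
    # Trailing region after the last backbone segment.
    if genome_length - cursor >= min_loop_size:
        loops.append((cursor, genome_length))
    return loops


# Hypothetical coordinates, for illustration only.
example_backbone = [(0, 1000), (1500, 4000), (4020, 9000)]
example_loops = loops_from_backbone(example_backbone, genome_length=10000)
print(example_loops)
```

Raising `min_loop_size` discards short gaps, which is one simple way to impose a minimal loop length.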
Segmentation results obtained from MGA alignments and included in the MOSAIC database. For each segmentation result, the first column describes the species and genomes used for segmentation analyses; the number of compared strains is indicated in parentheses. Total loop sizes and loop numbers of each genome are entered in the same order as strain names, separated by '+'. Coverage is the ratio of backbone size to total genome size of a strain; the mean value over all compared strains is given here as a percentage.
| C58 Cereon circ X C58 Univ. Wash circ | 2.09 | 751 | 74 % |
| C58 Cereon lin RC X C58 Univ. Wash lin | 1.82 | 252 | 88 % |
| Ames X Ames 'Ancestor' | 3.93 | 528 | 90 % |
| ATCC14579 X ATCC10987 | 4.02 | 1390 | 76 % |
| AR39 RC+TR X CWL029 X J138 X TW183 | 1.22 | 10 | 99 % |
| CWL029 X J138 X TW183 | 1.21 | 15 | 99 % |
| CWL029 X J138 | 1.21 | 21 | 99 % |
| J138 X TW183 | 1.22 | 9 | 99 % |
| CWL029 X TW183 | 1.22 | 13 | 99 % |
| AR39 RC+TR X CWL029 | 1.22 | 8 | 99 % |
| K-12 X Sakai X EDL933 X CFT073 | 3.52 | 1119 | 68 % |
| K-12 X Sakai X CFT073 | 3.73 | 904 | 73 % |
| 26695 X J99 | 1.24 | 428 | 75 % |
| EGD X 4b F2365 | 2.67 | 270 | 92 % |
| CDC1551 X H37Rv | 4.19 | 217 | 95 % |
| MW2 X MU50 X N315 | 2.59 | 226 | 92 % |
| 2603V/R X NEM316 | 1.88 | 276 | 86 % |
| R6 X TIGR4 | 1.91 | 128 | 91 % |
| M1GAS X MGAS315 X MGAS8232 | 1.62 | 235 | 86 % |
| M1GAS X MGAS315 | 1.64 | 210 | 88 % |
| M1GAS X MGAS8232 | 1.65 | 206 | 88 % |
| YJ016 K2 X CMCP6 K2 TR | 1.63 | 222 | 89 % |
| YJ016 K1 RC X CMCP6 K1 TR | 2.73 | 628 | 82 % |
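The coverage definition given in the table caption (backbone size divided by total genome size, averaged over the compared strains) can be written as a short sketch. The function name `mean_coverage_percent` and the sizes used below are illustrative assumptions, not values from the MOSAIC database.

```python
from typing import Sequence


def mean_coverage_percent(backbone_size: float,
                          genome_sizes: Sequence[float]) -> float:
    """Mean over strains of backbone_size / genome_size, in percent.

    All sizes must be in the same unit (for example Mb).
    """
    if not genome_sizes:
        raise ValueError("at least one genome size is required")
    ratios = [backbone_size / size for size in genome_sizes]
    return 100.0 * sum(ratios) / len(ratios)


# Hypothetical example: a 3.0 Mb backbone shared by genomes of 4.0 and 5.0 Mb.
coverage = mean_coverage_percent(3.0, [4.0, 5.0])
print(round(coverage, 1))
```

Because the backbone is shared, the strain with the larger genome has the lower individual coverage; the mean hides that spread.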
Figure 2. Graphical visualization of the backbone/loop structure available through the Web interface of MOSAIC. 'Physical map' mode of MOSAIC, showing a 15 kb portion of the segmented E. coli K-12, O157:H7 Sakai and CFT073 genomes (data correspond to the comparison of three E. coli strains described in the results). GenBank annotations are indicated with coloured arrows. Supplementary annotations are indicated as red boxes. Backbone is shown in grey and loops in green.
Size distribution of loops (in bp) obtained from segmentation of the E. coli genomes K-12, O157:H7 Sakai (SAK) and CFT073 (CFT). Minimal size (Min), Mean size, Maximal size (Max), First Quartile (1st Qu.), Median size, and Third Quartile (3rd Qu.) are shown.
| | K-12 | SAK | CFT |
| Min | 20 | 20 | 20 |
| Mean | 1093 | 2217 | 1942 |
| Max | 40120 | 96682 | 150690 |
| 1st Qu. | 34 | 32.5 | 31 |
| Median | 113 | 109 | 77 |
| 3rd Qu. | 486 | 863 | 314 |
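A summary of this kind can be computed from a list of loop sizes with the Python standard library, as in the minimal sketch below. The function name `loop_size_summary` and the input sizes are hypothetical. The "inclusive" quantile method is an assumption; the source does not state which quartile definition was used.

```python
import statistics
from typing import Dict, Sequence


def loop_size_summary(sizes: Sequence[float]) -> Dict[str, float]:
    """Return Min, quartiles, Mean and Max for a collection of loop sizes (bp)."""
    # quantiles() with n=4 returns the three quartile cut points;
    # the "inclusive" method treats the data as the full population.
    q1, median, q3 = statistics.quantiles(sizes, n=4, method="inclusive")
    return {
        "Min": min(sizes),
        "1st Qu.": q1,
        "Median": median,
        "Mean": statistics.mean(sizes),
        "3rd Qu.": q3,
        "Max": max(sizes),
    }


# Hypothetical loop sizes in bp, for illustration only.
summary = loop_size_summary([60, 20, 40, 30, 50])
for label, value in summary.items():
    print(f"{label}: {value}")
```

For strongly right-skewed data such as loop sizes, the mean sits well above the median, which is why both are reported.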
Figure 3. Distribution of the loop sizes of three E. coli genomes (K-12, Sakai and CFT073). The minimal loop size is 20 bp; the maximal loop size ranges from 40 120 bp to 151 690 bp depending on the genome. A log10 scale is used on the x-axis.
Distribution of BIME (in percent of length) in backbone and loops regions of the E. coli K-12 genome, as determined from the triple K-12, Sakai and CFT073 alignment.
| 38 % | 62 % | |
| | 29 % | 71 % |
| | 47 % | 53 % |
| | 37 % | 63 % |