Abstract
Comparing genomes is an essential preliminary step to solve many problems in biology. Matching long similar segments between two genomes is a precondition for their evolutionary, genetic, and genome rearrangement analyses. Though various comparison methods have been developed in recent years, a quantitative assessment of their performance is lacking. Here, we describe two families of assessment measures whose purpose is to evaluate bacteria-oriented comparison tools. The first measure is based on how well the genome segmentation fits the gene annotation of the studied organisms; the second uses the number of segments created by the segmentation and the percentage of the two genomes that are conserved. The effectiveness of the two measures is demonstrated by applying them to the results of genome comparison tools obtained on 41 pairs of bacterial species. Despite the difference in the nature of the two types of measurements, both show consistent results, providing insights into the subtle differences between the mapping tools.
Year: 2009 PMID: 20049164 PMCID: PMC2798158 DOI: 10.1155/2009/749027
Source DB: PubMed Journal: Adv Bioinformatics ISSN: 1687-8027
Figure 1. A schematic illustration of a genome mapping result. Each genome (illustrated as a straight line) is broken into segments or fragments (the labeled rectangular blocks), and each segment is mapped to a corresponding one in the other genome. The breakage of the genome into segments is referred to as segmentation. Segments with an identical label but different signs have reversed orientation. Note that a segmentation may leave regions outside of the blocks. Those regions are considered segmental insertions/deletions and are not included in the mapping between the genomes.
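The mapping described in Figure 1 can be sketched as a small data structure. The following is a minimal illustration only, assuming each segment is stored as a signed label with half-open coordinates on its genome. The names `Segment`, `segmentation_size`, `conserved_fraction`, and `reversed_labels` are hypothetical and do not come from the tools evaluated here. The conserved-fraction formula (covered length over both genomes divided by their total length) is an assumption made for illustration and is not necessarily the definition used in the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """One labeled block of a segmentation (hypothetical representation)."""
    label: int   # signed label; a negative sign means reversed orientation
    start: int   # 0-based start coordinate, inclusive
    end: int     # end coordinate, exclusive

    @property
    def length(self) -> int:
        return self.end - self.start


def segmentation_size(segments):
    """Number of blocks in a segmentation."""
    return len(segments)


def conserved_fraction(segs_a, len_a, segs_b, len_b):
    """Fraction of both genomes covered by mapped segments (assumed formula).

    Regions outside the blocks count as insertions/deletions and are not covered.
    """
    covered = sum(s.length for s in segs_a) + sum(s.length for s in segs_b)
    return covered / (len_a + len_b)


def reversed_labels(segs_a, segs_b):
    """Labels that appear with opposite signs in the two genomes."""
    forward_in_a = {abs(s.label): s.label > 0 for s in segs_a}
    return sorted(
        abs(s.label)
        for s in segs_b
        if abs(s.label) in forward_in_a and forward_in_a[abs(s.label)] != (s.label > 0)
    )


# Toy genomes of 1000 bp each; the coordinates are made up for illustration.
genome_a = [Segment(1, 0, 400), Segment(2, 450, 700), Segment(3, 700, 900)]
genome_b = [Segment(1, 0, 380), Segment(-3, 400, 610), Segment(2, 650, 900)]

size = segmentation_size(genome_a)
fraction = conserved_fraction(genome_a, 1000, genome_b, 1000)
flipped = reversed_labels(genome_a, genome_b)
```

In this toy case segment 3 carries opposite signs in the two genomes, so it is the only label reported as reversed, and the uncovered stretches (for example positions 400 to 450 in the first genome) lower the conserved fraction.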
Table 1. The 41 bacteria pairs. Pair: number of the pair. Organism: names of the organisms. Size: genome size in base pairs. MAGIC: MAGIC's results. Mauve: Mauve's results. Seg.: segmentation size. Ratio: conserved percentage. The pairs are sorted in alphabetical order. For each pair, the best obtained scores are marked in bold.
| Pair | Organism | Size | MAGIC Seg. | MAGIC Ratio | Mauve Seg. | Mauve Ratio |
|---|---|---|---|---|---|---|
| 1 | | 1197687 | | 0.35 | 94 | |
| | | 1471282 | | | | |
| 2 | | 5411809 | | | 260 | |
| | | 5300915 | | | | |
| 3 | | 705557 | | | | |
| | | 791654 | | | | |
| 4 | | 5277274 | | 0.87 | 146 | |
| | | 5205140 | | | | |
| 5 | | 1931047 | | 0.73 | 37 | |
| | | 1581384 | | | | |
| 6 | | 4086189 | | 0.79 | 189 | |
| | | 5339179 | | | | |
| 7 | | 640681 | | | 2 | |
| | | 641454 | | | | |
| 8 | | 1641481 | | | 8 | |
| | | 1777831 | | | | |
| 9 | | 3031430 | | 0.82 | 94 | |
| | | 2897393 | | | | |
| 10 | | 2932766 | | 0.67 | 561 | |
| | | 3046682 | | | | |
| 11 | | 1469720 | | 0.7 | 47 | |
| | | 1395502 | | | | |
| 12 | | 1516355 | | | 13 | |
| | | 1499920 | | | | |
| 13 | | 1892819 | | | 67 | |
| | | 1895727 | | | | |
| 14 | | 1830138 | | | 34 | |
| | | 1913428 | | | | |
| 15 | | 1667867 | | | 67 | |
| | | 1643831 | | | | |
| 16 | | 2944528 | | | 47 | |
| | | 3011208 | | | | |
| 17 | | 3345687 | | | 64 | |
| | | 3503610 | | | | |
| 18 | | 580076 | | | 13 | |
| | | 816394 | | | | |
| 19 | | 892758 | | | 37 | |
| | | 920079 | | | | |
| 20 | | 4411532 | | | 10 | |
| | | 4403837 | | | | |
| 21 | | 2272351 | | 0.83 | 321 | |
| | | 2184406 | | | | |
| 22 | | 3402093 | | 0.52 | 301 | |
| | | 4406967 | | | | |
| 23 | | 2650701 | | | 148 | |
| | | 3059876 | | | | |
| 24 | | 7074893 | | 0.54 | 391 | |
| | | 6438405 | | | | |
| 25 | | 1751080 | | 0.46 | 84 | |
| | | 1709204 | | | | |
| 26 | | 5459213 | | 0.59 | 392 | |
| | | 5331656 | | | | |
| 27 | | 1111523 | | 0.69 | 27 | |
| | | 1485148 | | | | |
| 28 | | 2160267 | | | 20 | |
| | | 2127839 | | | | |
| 29 | | 2814816 | | | 83 | |
| | | 2820462 | | | | |
| 30 | | 4607203 | | | 38 | |
| | | 4599354 | | | | |
| 31 | | 2685015 | | 0.49 | 175 | |
| | | 2516575 | | | | |
| 32 | | 2160842 | | | 110 | |
| | | 2038615 | | | | |
| 33 | | 1895017 | | 0.91 | 130 | |
| | | 1894275 | | | | |
| 34 | | 1796226 | | 0.96 | 31 | |
| | | 1796846 | | | | |
| 35 | | 4809037 | | | 8 | |
| | | 4791961 | | | | |
| 36 | | 2434428 | | | 323 | |
| | | 2696255 | | | | |
| 37 | | 1894877 | | 0.92 | 32 | |
| | | 1849742 | | | | |
| 38 | | 927303 | | | 37 | |
| | | 925938 | | | | |
| 39 | | 5076188 | | | 254 | |
| | | 5178466 | | | | |
| 40 | | 2679306 | | | 266 | |
| | | 2519802 | | | | |
| 41 | | 4653728 | | 0.87 | 48 | |
| | | 4600755 | | | | |
Figure 2. (a) GD-scores and (b) their normalization by segmentation sizes for both MAGIC and Mauve. The X-axis lists the pairs (as in Table 1), with MAGIC and Mauve results represented in the left and right bars, respectively, of each pair. The Y-axis is the scores. The rightmost column is the mean score for the method.
Figure 3. The GD-scores as a function of segmentation size for MAGIC and Mauve. The X-axis is the segmentation size. The Y-axis is the GD-scores. Least-squares estimated linear fittings are shown as straight lines.
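The least-squares linear fittings mentioned for Figure 3 can be illustrated with the closed-form solution for a straight line. This is a generic sketch of ordinary least squares, not the authors' plotting code; the function name `least_squares_line` and the data points are hypothetical and are not values from the paper.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) of the ordinary least-squares line y = slope*x + intercept."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) points of equal count")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of cross deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical (segmentation size, score) points lying exactly on y = 2x + 5.
sizes = [10, 20, 30, 40]
scores = [25.0, 45.0, 65.0, 85.0]
slope, intercept = least_squares_line(sizes, scores)
```

Fitting one line per tool, as the caption describes, would amount to calling the function once on each tool's (segmentation size, score) points and comparing the resulting slopes.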
Figure 4. The difference between Mauve and MAGIC in segmentation size and matching percentage.
Figure 5. The conserved percentage divided by the segmentation size.
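The quantity plotted in Figure 5 is a simple quotient, sketched below for two tools on a single genome pair. The tool names and numbers are placeholders rather than entries from Table 1, the helper `ratio_per_segment` is hypothetical, and treating a larger quotient as preferable (more conserved sequence in fewer segments) is an assumption of this sketch.

```python
def ratio_per_segment(conserved_ratio, segmentation_size):
    """Conserved ratio divided by the number of segments."""
    if segmentation_size <= 0:
        raise ValueError("segmentation size must be positive")
    return conserved_ratio / segmentation_size


# Placeholder results for one genome pair: tool -> (conserved ratio, segmentation size).
results = {
    "tool_a": (0.80, 40),
    "tool_b": (0.84, 120),
}

normalized = {tool: ratio_per_segment(r, s) for tool, (r, s) in results.items()}

# Assumption: a higher conserved ratio per segment is the better outcome.
best = max(normalized, key=normalized.get)
```

Here the second tool conserves slightly more of the genomes but needs three times as many segments, so its quotient is lower.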