Marília D V Braga, Christian Gautier, Marie-France Sagot.
Abstract
BACKGROUND: The reversal distance and optimal sequences of reversals to transform a genome into another are useful tools to analyse evolutionary scenarios. However, the number of sequences is huge and some additional criteria should be used to obtain a more accurate analysis. One strategy is to search for sequences that respect constraints, such as the common intervals (clusters of co-localised genes). Another approach is to explore the whole space of sorting sequences, possibly grouping them into equivalence classes. Recently the two strategies have been combined, to restrict the space to the sequences that respect constraints. In particular, an algorithm has been proposed to list classes whose sorting sequences do not break the common intervals detected between the two initial genomes A and B. This approach may reduce the space of sequences and is symmetric (the result of the analysis sorting A into B can be obtained from the analysis sorting B into A).
Year: 2009 PMID: 20042101 PMCID: PMC2813847 DOI: 10.1186/1748-7188-4-16
Source DB: PubMed Journal: Algorithms Mol Biol ISSN: 1748-7188 Impact factor: 1.405
Figure 1. Different approaches to select an optimal sorting sequence. The permutations (-5, -2, -7, 4, -8, 3, 6, -1) and ℐ8 have only one initially detected non-trivial irreducible common interval, which is {2,..., 8}. (A) A sequence of reversals that sorts the permutation, but does not preserve the initially detected common interval. (B) A sequence of reversals that is a perfect sorting sequence (preserves the initially detected common interval), but does not preserve the new common intervals that appear during the sorting process (such as {3, 4} and {2, 3}). (C) A progressive perfect sequence that sorts the descendant permutation (-5, -2, -7, 4, -8, 3, 6, -1) without breaking the progressively detected irreducible common intervals (listed on the right side).
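The notions used in this caption can be illustrated with a minimal Python sketch. It is not taken from the BAOBABLUNA package: the function names `apply_reversal`, `is_contiguous` and `is_common_interval`, and the 0-based position convention, are assumptions made for illustration. A set of genes is treated here as a common interval when its elements occupy consecutive positions in both permutations, ignoring signs.

```python
from typing import Iterable, Sequence, Tuple

Perm = Tuple[int, ...]


def apply_reversal(perm: Sequence[int], i: int, j: int) -> Perm:
    """Reverse the segment perm[i..j] (0-based, inclusive) and flip its signs."""
    segment = tuple(-x for x in reversed(perm[i:j + 1]))
    return tuple(perm[:i]) + segment + tuple(perm[j + 1:])


def is_contiguous(perm: Sequence[int], values: Iterable[int]) -> bool:
    """True if the given genes occupy consecutive positions in perm (signs ignored)."""
    wanted = set(values)
    positions = [k for k, x in enumerate(perm) if abs(x) in wanted]
    return (
        bool(positions)
        and len(positions) == len(wanted)
        and positions[-1] - positions[0] + 1 == len(positions)
    )


def is_common_interval(a: Sequence[int], b: Sequence[int], values: Iterable[int]) -> bool:
    """True if the genes form a contiguous block in both permutations."""
    wanted = set(values)
    return is_contiguous(a, wanted) and is_contiguous(b, wanted)


A = (-5, -2, -7, 4, -8, 3, 6, -1)
I8 = tuple(range(1, 9))
interval = range(2, 9)  # the common interval {2, ..., 8} from the caption

# The interval is common to A and the identity permutation.
initially_common = is_common_interval(A, I8, interval)

# A reversal of the last two elements overlaps the interval and breaks it.
overlapping = apply_reversal(A, 6, 7)
still_common_after_overlap = is_common_interval(overlapping, I8, interval)

# A reversal lying entirely inside the interval leaves it intact.
inner = apply_reversal(A, 1, 2)
still_common_after_inner = is_common_interval(inner, I8, interval)
```

In this sketch the reversal of positions 6 to 7 moves gene 1 between genes of {2,..., 8}, so the interval is no longer contiguous, whereas the reversal of positions 1 to 2 only rearranges genes inside the interval.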
Table 1. Experimental results
| Perm. | Algorithm | N (sequences) | N (traces) | Exec. time |
|---|---|---|---|---|
| (A, ℐ8) | all | 81,869 | 377 | ≃ 5 s |
| (A, ℐ8) | prf | 51,304 | 92 | ≃ 5 s |
| (A, ℐ8) | prg (A → ℐ8) | 11,568 | 12 | ≃ 3 s |
| (A, ℐ8) | prg (ℐ8 → A) | 8,400 | 5 | ≃ 2 s |
| (B, ℐ16) | all | 505,634,256 | 21,902 | ≃ 7.3 m |
| (B, ℐ16) | prf | 122,862,960 | 171 | ≃ 27 s |
| (B, ℐ16) | prg (B → ℐ16) | 5,963,760 | 6 | ≃ 14 s |
| (B, ℐ16) | prg (ℐ16 → B) | 5,393,520 | 9 | ≃ 16 s |
The experimental results of computing traces (all), perfect traces (prf) and progressive perfect subtraces (prg; in both directions), considering the pairs of permutations (A, ℐ8) and (B, ℐ16), where A = (-5, -2, -7, 4, -8, 3, 6, -1) and B = (-12, 11, -10, -1, 16, -4, -3, 15, -14, 9, -8, -7, -2, -13, 5, -6).
The two N columns give, respectively, the resulting number of sorting sequences and traces for each approach. All algorithms are part of the BAOBABLUNA package [22]. Experiments were run on a 64-bit personal computer with two 3 GHz CPUs and 2 GB of RAM; execution times are given in seconds (s) or minutes (m).
Figure 2. Evolutionary scenario between R2 and Rickettsia felis. (A) Phylogenetic tree of six Rickettsia (extracted from [2]). The numbers on the edges give the reversal distance between the genomes on the vertices, each of which is either a current species or an ancestor (R1, R2, R3, R4 and R5). (B) The optimal sequence of reversals to transform the ancestor R2 into Rickettsia felis (proposed by Blanc et al. [2] with the help of the software GRIMM [10]). The two common interval breaks are indicated by the "comma" signs.
Table 2. Traces of sequences sorting R. felis into its ancestor
| Trace | Trace normal form (f) / Subtrace representative (e) | Sequences in trace | Sequences in subtrace |
|---|---|---|---|
| 1. | | 90720 | 45360 |
| 2. | | 90720 | 45360 |
| 3. | | 90720 | 45360 |
| 4. | | 60480 | 60480 |
| 5. | | 60480 | 0 |
| 6. | | 60480 | 0 |
| 7. | | 60480 | 60480 |
| 8. | | 9072 | 0 |
| 9. | | 6048 | 0 |
| 10. | | 6048 | 0 |
| 11. | | 6048 | 6048 |
| 12. | | 3024 | 0 |
| 13. | | 2520 | 0 |
| Total | | 546840 | 263088 |
The 546840 sequences that sort Rfe = (1, 3, -2, -11, 5, -9, -10, 8, 6, -7, -4, 12) (R. felis) into R2 = (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12) are distributed in 13 traces. Each trace is represented by its normal form. The third column indicates the number of sequences in each trace. When we apply the progressive detection of common intervals, accepting at most two common interval breaks, we obtain 263088 sequences distributed in 6 progressive near-perfect subtraces (subsets of traces 1, 2, 3, 4, 7 and 11). Each progressive near-perfect subtrace is represented by a 2-tuple (e is the subtrace representative, f is the trace normal form). The fourth column gives the number of sequences in each subtrace.
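The totals quoted above can be cross-checked with a short Python sketch that uses only the per-trace counts listed in the table. The variable names are illustrative and not part of the original work.

```python
# Number of sequences in each of the 13 traces (third column of the table).
trace_sizes = [90720, 90720, 90720, 60480, 60480, 60480, 60480,
               9072, 6048, 6048, 6048, 3024, 2520]

# Number of sequences kept in each progressive near-perfect subtrace
# (fourth column; 0 means no sequence of that trace is kept).
subtrace_sizes = [45360, 45360, 45360, 60480, 0, 0, 60480,
                  0, 0, 0, 6048, 0, 0]

total_sequences = sum(trace_sizes)
total_kept = sum(subtrace_sizes)

# 1-based indices of the traces that contain a non-empty subtrace.
surviving_traces = [n for n, kept in enumerate(subtrace_sizes, start=1) if kept > 0]

# Share of all optimal sorting sequences that break at most two common intervals.
fraction_kept = total_kept / total_sequences
```

With these figures, slightly less than half of the optimal sorting sequences remain once at most two common interval breaks are accepted, and they come from traces 1, 2, 3, 4, 7 and 11, as stated in the text.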
Figure 3. Alternative scenario between R2 and Rickettsia felis. An alternative optimal sequence of reversals to transform the ancestor R2 into Rickettsia felis, which is the inverse of a sequence taken from subtrace 3 of Table 2. The two common interval breaks are indicated by the "comma" signs.