Jorge Álvarez-Jarreta, Eduardo Ruiz-Pesini.
Abstract
BACKGROUND: Molecular evolution studies involve many different hard computational problems solved, in most cases, with heuristic algorithms that provide a nearly optimal solution. Hence, diverse software tools exist for the different stages involved in a molecular evolution workflow.
Keywords: Biological knowledge; Molecular evolution; Multiple sequence alignment; Phylogenetic inference; Python; Software; Supertrees
Year: 2016 PMID: 27793083 PMCID: PMC5084326 DOI: 10.1186/s12859-016-1303-3
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
MEvoLib modules
| Category | Module list |
|---|---|
| Data | rCRS |
| Fetch | BioSeqs, PhyTrees |
| Cluster | Naïve rows, Naïve columns, PRD, Genes |
| Align | Mafft, Clustal Omega, Muscle |
| Inference | FastTree, RAxML |
| PhyloAssemble | Consense |
Fig. 1. Motivating example of the application of MEvoLib. The figure shows a common molecular evolution workflow: data fetching, clustering, multiple alignment, phylogenetic inference and building of the consensus tree. Each stage presents MEvoLib's interface for that procedure and an example of a possible parameterization. Sequence sets and subsets are represented by sheets and the resulting phylogenies by triangles.
Performance and feature recovery of a classical approach versus the Genes method of MEvoLib for four sets of hmtDNA sequences
| Num. Seqs. | Configuration | Time (s) | Mem. (MB) | CDS feat. | CDS div. | rRNA feat. | rRNA div. | tRNA feat. | tRNA div. |
|---|---|---|---|---|---|---|---|---|---|
| 100 | Gene | 1.56 | 32.37 | 26 | (+13) | 3 | (+1) | 21 | (-1) |
| 100 | Product | 1.41 | 32.33 | 18 | (+5) | 2 | (0) | 20 | (-2) |
| 100 | All | 1.42 | 32.46 | 15 | (+2) | 2 | (0) | 20 | (-2) |
| 1000 | Gene | 13.15 | 91.42 | 26 | (+13) | 5 | (+3) | 43 | (+21) |
| 1000 | Product | 12.71 | 92.06 | 22 | (+9) | 4 | (+2) | 20 | (-2) |
| 1000 | All | 13.74 | 92.63 | 15 | (+2) | 2 | (0) | 20 | (-2) |
| 10000 | Gene | 127.95 | 687.63 | 31 | (+18) | 5 | (+3) | 45 | (+23) |
| 10000 | Product | 171.80 | 692.59 | 34 | (+21) | 4 | (+2) | 20 | (-2) |
| 10000 | All | 136.91 | 700.73 | 13 | (0) | 2 | (0) | 20 | (-2) |
| 31752 | Gene | 412.22 | 2126.83 | 33 | (+20) | 5 | (+3) | 46 | (+24) |
| 31752 | Product | 467.78 | 2144.32 | 51 | (+38) | 6 | (+4) | 20 | (-2) |
| 31752 | All | 509.78 | 2177.28 | 13 | (0) | 2 | (0) | 20 | (-2) |
The recovered features are CDS, rRNA and tRNA. The classical approach uses the qualifiers gene or product separately, whilst the Genes method exploits all of them. The second column of each feature shows the divergence of the result from the expected value.
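The divergence values in parentheses can be reproduced with a short calculation. The sketch below is illustrative only and is not part of MEvoLib: the function names are hypothetical, and the expected counts (13 CDS, 2 rRNA, 22 tRNA) are an assumption inferred from the table rows with zero divergence and from the standard gene content of the human mitochondrial genome.

```python
# Assumed expected feature counts for a complete human mtDNA genome
# (13 protein-coding genes, 2 rRNAs, 22 tRNAs); inferred, not stated in the source.
EXPECTED = {"CDS": 13, "rRNA": 2, "tRNA": 22}


def divergence(recovered):
    """Return recovered minus expected count for each feature type."""
    return {kind: recovered[kind] - EXPECTED[kind] for kind in EXPECTED}


def format_divergence(value):
    """Format as in the table: explicit sign, plain 0 for no divergence."""
    return "(0)" if value == 0 else f"({value:+d})"


# Row "100 sequences / Gene" from the table.
row = {"CDS": 26, "rRNA": 3, "tRNA": 21}
div = divergence(row)
print({kind: format_divergence(value) for kind, value in div.items()})
# -> {'CDS': '(+13)', 'rRNA': '(+1)', 'tRNA': '(-1)'}
```

A positive value means more distinct features were recovered than expected, which typically points to inconsistent annotation across records; a negative value means some expected features were missed.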