Etienne Lord, Mickael Leclercq, Alix Boc, Abdoulaye Baniré Diallo, Vladimir Makarenkov.
Abstract
In this paper we introduce Armadillo v1.1, a novel workflow platform dedicated to designing and conducting phylogenetic studies, including comprehensive simulations. A number of important phylogenetic and general bioinformatics tools have been included in the first software release. As Armadillo is an open-source project, it allows scientists to develop their own modules as well as to integrate existing computer applications. Using our workflow platform, different complex phylogenetic tasks can be modeled and presented in a single workflow without any prior knowledge of programming techniques. The first version of Armadillo was successfully used by professors of bioinformatics at Université du Quebec à Montreal during graduate computational biology courses taught in 2010-11. The program and its source code are freely available at: <http://www.bioinfo.uqam.ca/armadillo>.
Year: 2012 PMID: 22253821 PMCID: PMC3256230 DOI: 10.1371/journal.pone.0029903
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Figure 1. Comparison of four different bioinformatics platforms for sequence search (i.e., the BLAST algorithm is used in all of them).
Panel (A) presents a standard pipeline using Perl scripting; Panels (B, C, D) show different workflow designs for the sequence search operation provided by Galaxy (B) [10], Taverna (C) [27] and the introduced Armadillo workflow platform (D).
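The scripted pipeline of Panel (A) can be approximated in a few lines. The sketch below uses Python rather than Perl and assumes that NCBI BLAST+ (`blastn`) is installed and on the PATH; the file names, database name and E-value cutoff are hypothetical, and this is not the script used in the paper. It builds the command line, parses the tabular output produced by `-outfmt 6`, and keeps the best hit per query.

```python
import subprocess

# Column names of BLAST+ tabular output (-outfmt 6), in their default order.
FIELDS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
          "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def blast_command(query, db, out, evalue=1e-5):
    """Build a blastn command line (file and database names are hypothetical)."""
    return ["blastn", "-query", query, "-db", db, "-out", out,
            "-outfmt", "6", "-evalue", str(evalue)]


def parse_tabular(text):
    """Parse -outfmt 6 text into a list of dicts, one per hit."""
    hits = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        hit = dict(zip(FIELDS, line.split("\t")))
        for key in ("pident", "evalue", "bitscore"):
            hit[key] = float(hit[key])
        hits.append(hit)
    return hits


def best_hit_per_query(hits):
    """Keep the highest-bitscore hit for each query sequence."""
    best = {}
    for hit in hits:
        current = best.get(hit["qseqid"])
        if current is None or hit["bitscore"] > current["bitscore"]:
            best[hit["qseqid"]] = hit
    return best


def run_search(query, db, out):
    """Run blastn and return the parsed hits; requires BLAST+ to be installed."""
    subprocess.run(blast_command(query, db, out), check=True)
    with open(out) as handle:
        return parse_tabular(handle.read())
```

A workflow platform hides exactly this kind of glue code (command construction, output parsing, filtering) behind connected graphical components.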
Figure 2. Overview of the graphical user interface of the Armadillo workflow platform.
Panel (A) presents the available tools. All these tools can be used as drag-and-drop components in the workflow. Panel (B) presents the main view of the workflow design. Panel (C) presents a picture of a phylogenetic tree (i.e., phylogeny or evolutionary tree) displayed using the PhyloWidget application [40]. Panel (D) shows an integrated sequence viewer. Panel (E) shows an example of a custom component view allowing an easy configuration of the user pipeline.
Bioinformatics applications and services included in Armadillo v1.1.
| Bioinformatics tasks | Applications and services |
| --- | --- |
| National Center for Biotechnology Information (NCBI) | Access to database search and data downloads through the EUtils Web-services |
| ENSEMBL-European Bioinformatics Institute (EBI) | Access to database search through EBI-Eye |
| HUGO Gene Nomenclature Committee | Access to database search and downloads of human genes information |
| BAli-phy | |
| HGT Detection | |
| fastDNAml | |
| PhyloWidget | |
| jModelTest | |
| PAML v4.4 | |
| BLAST (Local and Web at EBI and WTSI, and NCBI) | |
See the Armadillo website for the complete list of included applications (a).
(a) An up-to-date list of included applications is available at: http://adn.bioinfo.uqam.ca/armadillo/included.html.
NCBI EUtils is available at: http://www.ncbi.nlm.nih.gov/entrez/query/static/esoap_help.html.
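As an illustration of the database access listed in the table, the following sketch builds NCBI E-utilities request URLs. It is an assumption-laden example: it uses the REST-style `esearch.fcgi` and `efetch.fcgi` endpoints rather than the SOAP interface referenced above, the search term and identifiers are hypothetical, and no network request is made. It does not reproduce Armadillo's own implementation.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities REST endpoints.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def esearch_url(db, term, retmax=20):
    """Build a search URL returning up to `retmax` record identifiers."""
    params = {"db": db, "term": term, "retmax": retmax}
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


def efetch_url(db, ids, rettype="fasta", retmode="text"):
    """Build a download URL for the given record identifiers."""
    params = {"db": db, "id": ",".join(ids),
              "rettype": rettype, "retmode": retmode}
    return f"{EUTILS_BASE}/efetch.fcgi?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical query; the URLs are only printed, not requested.
    print(esearch_url("nucleotide", "cytochrome b", retmax=5))
    print(efetch_url("nucleotide", ["1", "2"]))
```

The two calls mirror the search-then-download pattern described in the table: a search yields identifiers, and a second request retrieves the corresponding sequences.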
Figure 3. An example of a bioinformatics solution created with Armadillo.
Panel (A) presents available comments and support files (available in the text and HTML formats). Panel (B) presents the beginning of the workflow and the if control used to select between different alternatives in the dataflow. Panel (C) shows how different multiple sequence alignment applications can be modeled. Panel (D) illustrates the use of different colors to annotate different parts of the workflow in order to facilitate the learning process. Panel (E) presents an example of a phylogenetic pipeline. Panel (F) displays an example of obtained results (i.e., results report).
Comparison of the main features provided by Armadillo v1.1 with those available in the Taverna [27], Galaxy [13], LONI [50], Ergatis [48] and Kepler [49] bioinformatics workflow platforms.
Feature groups: Workflow design; Data management; Platform expansion.

| Platform | | | | | | | | |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| | Yes | Yes | No | Yes | Yes | Yes | Yes | Yes |
| | No | Yes | No | No | No | Yes | Yes | Yes |
| | Yes | No | Yes | Yes | Yes | Yes | No | No |
| | Yes | Yes | Yes | Yes | Yes | Yes | No | Yes |
| | Yes | Yes | Yes | No | No | Yes | Yes | Yes |
| | Yes | Yes | No | No | No | Yes | Yes | Yes |
Addition of new applications through Web Services or Java programming.
Figure 4. A quick view of different steps needed for phylogenetic inference with Armadillo.
Step A: Search dialog box allowing for direct access to different Internet databases. Step B: Creating and interconnecting individual components by means of drag-and-drop operations. Muscle and ProbCons multiple sequence alignment applications are presented here. Step C: Representing the aligned sequences using an internal sequence viewer. Step D: Configuring the options of the PhyML and ProtDist applications prior to phylogenetic inference. Step E: Visualizing the resulting PhyML phylogenetic tree using the Archaeopteryx tree viewer. Step F: Displaying the complete computational workflow after a sequential execution of the first (multiple sequence alignment algorithms) and second (phylogenetic tree inference algorithms) workflow parts.
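The sequential execution described in the caption (alignment first, tree inference second) can be sketched as a small dependency graph run in topological order. This is a minimal illustration, not Armadillo's engine: `run_workflow`, `placeholder_align` and `placeholder_tree` are hypothetical names, and the two step functions are stand-ins that do not perform real alignment or phylogenetic inference.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_workflow(steps, dependencies, inputs):
    """Run each step after all of the steps it depends on.

    steps:        name -> function taking the dict of results so far
    dependencies: name -> set of names that must be available first
    inputs:       name -> value supplied by the user (no function needed)
    """
    results = dict(inputs)
    order = list(TopologicalSorter(dependencies).static_order())
    for name in order:
        if name in steps:
            results[name] = steps[name](results)
    return order, results


def placeholder_align(results):
    # Stand-in for Muscle or ProbCons: pads sequences with gaps to equal
    # length. This is NOT a real multiple sequence alignment.
    seqs = results["sequences"]
    width = max(len(seq) for seq in seqs.values())
    return {name: seq.ljust(width, "-") for name, seq in seqs.items()}


def placeholder_tree(results):
    # Stand-in for PhyML: returns an unresolved (star) tree in Newick format.
    return "(" + ",".join(sorted(results["align"])) + ");"


if __name__ == "__main__":
    order, results = run_workflow(
        steps={"align": placeholder_align, "tree": placeholder_tree},
        dependencies={"align": {"sequences"}, "tree": {"align"}},
        inputs={"sequences": {"A": "ACGT", "B": "AC"}},
    )
    print(order)
    print(results["tree"])
```

Declaring dependencies rather than a fixed call order is what lets a graphical platform rearrange or extend a workflow by reconnecting components.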