
Phylemon 2.0: a suite of web-tools for molecular evolution, phylogenetics, phylogenomics and hypotheses testing.

Rubén Sánchez, François Serra, Joaquín Tárraga, Ignacio Medina, José Carbonell, Luis Pulido, Alejandro de María, Salvador Capella-Gutiérrez, Jaime Huerta-Cepas, Toni Gabaldón, Joaquín Dopazo, Hernán Dopazo.

Abstract

Phylemon 2.0 is a new release of the suite of web tools for molecular evolution, phylogenetics, phylogenomics and hypotheses testing. It has been designed in response to the increasing demand for molecular sequence analyses from expert and non-expert users. Phylemon 2.0 has several unique features that differentiate it from other similar web resources: (i) it offers an integrated environment that enables evolutionary analyses, format conversion, file storage and editing of results; (ii) it suggests further analyses, thereby guiding users through the web server; and (iii) it allows users to design and save phylogenetic pipelines to be used over multiple genes (phylogenomics). Altogether, Phylemon 2.0 integrates a suite of 30 tools covering sequence alignment reconstruction and trimming; tree reconstruction, visualization and manipulation; and evolutionary hypotheses testing.

Year:  2011        PMID: 21646336      PMCID: PMC3125789          DOI: 10.1093/nar/gkr408

Source DB:  PubMed          Journal:  Nucleic Acids Res        ISSN: 0305-1048            Impact factor:   16.971


INTRODUCTION

Phylogenetic analysis and model-based hypothesis testing are essential elements in current molecular evolution studies (1,2). Web servers for phylogenetic and evolutionary analyses range from those running single programs to those integrating multiple tools. Among the first are servers that execute multiple sequence alignment (MSA) tools such as ClustalW (3) (http://www.ebi.ac.uk/clustalw/), or sophisticated programs to test molecular adaptation such as the HyPhy environment (4) (http://www.datamonkey.org/). In the second category, resources such as the ‘Pasteur server’ (e.g. see http://bioweb.pasteur.fr/seqanal/phylogeny/intro-uk.html), Phylogeny.fr (http://www.phylogeny.fr/) (5), CIPRES (http://www.phylo.org/) and Phylemon (6) developed the concept of integrated platforms, which implement different phylogenetic analysis programs in a single server. Phylemon was originally developed in 2007 as a web server providing a common framework to run the most frequent analyses on DNA and protein sequences from a phylogenetic and evolutionary perspective. Phylemon 2.0 covers a wide, yet selected, range of programs, integrating 30 different tools for phylogenetic and evolutionary analyses. Phylemon 2.0 has several unique features that differentiate it from other resources: (i) it offers an integrated environment that enables the concatenation of evolutionary analyses and the storage of results, and that handles format conversions transparently; (ii) once an output file is produced, Phylemon suggests other possible analyses that could logically follow, thus guiding the user through the web server; and finally (iii) users can build and save complete pipelines to be automatically used on many genes in subsequent sessions (phylogenomics). The main objective of Phylemon is to provide expert and non-expert users with all necessary applications in a single integrated web framework that guides them through the whole evolutionary analysis of their sequences.
Here, we outline the main characteristics of the server and the new developments added to this version. Phylemon 2.0 is accessible at http://phylemon.bioinfo.cipf.es

OUTLINE OF THE PROGRAM

Phylemon 2.0 resources are organized in five major sections: Alignment, Phylogeny, Evolutionary Tests, Pipeliner and Utilities. Phylemon verifies the format of the uploaded input file (non-aligned sequences, aligned sequences, distance matrix, tree or pipeline format) and stores files for the exclusive use of tools reading that format. Users can rename files to help their recognition throughout the project. A basic phylogenetic analysis consists of: (i) the proposal of a hypothesis about positional homology in a multiple alignment of sequences; and (ii) the search for a tree topology and branch lengths, the main components of a phylogeny. Once the phylogenetic hypothesis is solved, users may test for additional and specific hypotheses related to their sequences, including molecular clock behavior, estimation of synonymous and non-synonymous distances, maximum likelihood (ML) based parameter estimation, statistical support among competing topologies, clades or adaptive events acting on the sequences. For these purposes, Phylemon 2.0 groups different tools under the web tabs: Alignment, Phylogeny and Evolutionary Tests (see sections below). Table 1 lists all the programs implemented in Phylemon 2.0 and the main connections among them.
Table 1. Programs available in Phylemon 2.0 web server

| No. | Type (a) | Program | Version | Function | Output to program | Pplin (b) |
|-----|----------|---------|---------|----------|-------------------|-----------|
| Alignment | | | | | | |
| 1 | T | ClustalW | 2.0.10 | Multiple alignments. DNA and protein sequences | 5, 8, 9, 11–16, 21–23, 26, 28 | Y |
| 2 | T | Muscle | 3.7 | Multiple alignments. DNA and protein sequences | 5, 8, 9, 11–16, 21–23, 26, 28 | Y |
| 3 | T | Lagan | 2.0 | Pairwise alignment. Long and distant genomic sequences | 5, 8, 9, 11 | |
| 4 | T | M-Lagan | 2.0 | Multiple alignments. Long and distant genomic sequences | 5, 8, 9, 11–16, 21–23, 26, 28 | |
| 5 | U | TrimAl | 1.3 | Automated trimming of MSAs | 8, 9, 11–16, 21–23, 26, 28 | Y |
| 6 | U | CDS-ProtAl | 1.0 | Alignment of DNA coding sequence using protein template | 8, 11–16, 22, 28–30 | Y |
| 7 | U | ConcatenAl | 1.0 | Concatenation of MSAs | 8, 9, 11–16, 21–23, 25, 26 | |
| 8 | U | ReadAl | 1.3 | File format conversion | 1–5, 8, 9, 11–16, 21–23, 25–30 | |
| Phylogeny reconstruction | | | | | | |
| 9 | T | Seqboot | Phylip 3.68 | Bootstrap, jackknife or permutation resampling methods | 11–16 | Y |
| 10 | T | Consense | Phylip 3.68 | Consensus tree reconstruction | 20 | Y |
| 11 | T | Dnadist | Phylip 3.68 | DNA pairwise distances computation | 17, 18 | Y |
| 12 | T | Protdist | Phylip 3.68 | Protein pairwise distances computation | 17, 18 | Y |
| 13 | T | DnaML | Phylip 3.68 | ML tree reconstruction from DNA data | 10, 20 | |
| 14 | T | ProML | Phylip 3.68 | ML tree reconstruction from protein data | 10, 20 | |
| 15 | T | DnaPars | Phylip 3.68 | Maximum parsimony tree reconstruction from DNA data | 10, 20 | |
| 16 | T | ProtPars | Phylip 3.68 | Maximum parsimony tree reconstruction from protein data | 10, 20 | |
| 17 | T | Neighbor | Phylip 3.68 | Tree reconstruction using UPGMA and NJ methods | 10, 20 | Y |
| 18 | T | Fitch | Phylip 3.68 | Tree reconstruction using LS and ME methods | 10, 20 | Y |
| 19 | U | TreeDist | Phylip 3.68 | Distance computation among tree topologies | | |
| 20 | U | ETE | 2.1 beta | Tree visualization | | Y |
| 21 | T | PhyML-Best-AIC-Tree | 1.0 | ML tree with the best model fitting data under AIC estimation | 20 | Y |
| 22 | T | PhyML | 3.00 | Maximum likelihood analysis (MLA) of DNA & protein data | 20 | Y |
| 23 | T | Tree-Puzzle | 5.2 | MLA of DNA & protein sequences using quartets | 20 | |
| 24 | T | MrBayes | 3.1.2 | Bayesian phylogenetic analysis of DNA and protein sequences | 20 | |
| Evolutionary tests | | | | | | |
| 25 | T | ProtTest | 1.4 | ML fitting of protein sequences to evolutionary models | | |
| 26 | T | jModelTest | 0.1.0 | Model testing and phylogeny averaging | | |
| 27 | T | RRTree | 1.1.11 | Relative rate test | | |
| 28 | T | SLR | 1.3 | Site-wise analysis of positive and negative selection | | |
| 29 | T | YN00 | PAML 4.4c | Pairwise analysis of positive selection (PS) with counting methods | | |
| 30 | T | CodeML | PAML 4.4c | MLA of PS using sites, branch and branch-site models | | |

Programs are assembled in three main blocks: (i) alignment and files format conversion; (ii) phylogenetic reconstruction; and (iii) evolutionary tests. New resources in this version are shown in cursive.

(a) T-U: tools/utilities.

(b) Pplin: programs able to run in the Pipeliner.
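The 'Output to program' column of Table 1 can be read as a directed graph of allowed connections between tools. The following Python sketch transcribes a few rows of that column and uses them to list the possible follow-up tools and to check a chain of programs. The function names and data layout are illustrative assumptions; the sketch does not describe how the server itself stores or evaluates these connections.

```python
# Partial transcription of the "Output to program" column of Table 1.
# Keys and values are the program numbers used in the table.
OUTPUT_TO = {
    1: [5, 8, 9, 11, 12, 13, 14, 15, 16, 21, 22, 23, 26, 28],  # ClustalW
    9: [11, 12, 13, 14, 15, 16],                               # Seqboot
    10: [20],                                                  # Consense
    11: [17, 18],                                              # Dnadist
    12: [17, 18],                                              # Protdist
    17: [10, 20],                                              # Neighbor
    18: [10, 20],                                              # Fitch
    20: [],                                                    # ETE
}


def suggest_next(program):
    """Return the programs that may follow `program` (hypothetical helper)."""
    return sorted(OUTPUT_TO.get(program, []))


def is_valid_chain(chain):
    """Check that every consecutive pair in `chain` is an allowed connection."""
    return all(b in OUTPUT_TO.get(a, []) for a, b in zip(chain, chain[1:]))


# ClustalW -> Seqboot -> Dnadist -> Neighbor -> Consense -> ETE
bootstrap_chain = [1, 9, 11, 17, 10, 20]
chain_ok = is_valid_chain(bootstrap_chain)
```

With these rows, `suggest_next(11)` lists Neighbor (17) and Fitch (18) as the tools that can read a Dnadist distance matrix, whereas a chain going directly from ClustalW (1) to Neighbor (17) is rejected because a distance computation step is missing.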

Alignment

Phylemon 2.0 integrates four different programs for the alignment of molecular sequences: ClustalW v2.0.10 (3), Muscle v3.7 (7), Lagan v2.0 and M-Lagan v2.0 (8). The first two are among the most frequently used programs for MSA. In this version of the server, we added Lagan (Limited Area Global Alignment of Nucleotides) and Multi-Lagan, which run efficient algorithms specifically developed to produce pairwise and multiple alignments of long genomic sequences, respectively. Each program takes a single input file in Fasta format, containing two sequences for Lagan and more than two for M-Lagan. Both programs use the Translated Anchoring option, translating coding regions to anchor sequences. This is useful when distantly related sequences are compared (e.g. primates and fishes). The Reverse Complement option in Lagan is useful to search for positional homology on the opposite DNA strand of the second sequence. Both programs produce a single output file of aligned sequences in Fasta format. Multiple alignments in Phylemon 2.0 can be sent to distance, parsimony and statistical tree reconstruction (ML and Bayesian) tools. Format conversion or alignment editing can be performed using ReadAl and TrimAl (see ‘Utilities’ section).

Phylogeny

Phylemon 2.0 incorporates distance, parsimony, ML and Bayesian methods for tree reconstruction. Distance and parsimony methods for DNA or protein sequence data are provided by algorithms of the Phylip package (9) v3.68: DnaDist, ProtDist, DnaPars and ProtPars, respectively. ML analysis can be performed using Phylip (DnaML, ProML), Tree-Puzzle v5.2 (10) and PhyML v3.0 (11,12) programs. Bayesian phylogenetic analysis runs in MrBayes (13) v3.1.2. Users have the option to interact with the program, thus monitoring the progress of the analysis. The program allows the user to specify sump and sumt parameters. Users interested in building the MrBayes commands block can fill in the form that summarizes the main parameters. A useful list of command line parameters is available on the fly. PhyML-Best-AIC-tree v1.02 b is a new tool in Phylemon 2.0. It is a Python script allowing the reconstruction of ML trees using the best DNA or protein model according to AIC (14).
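Selection of the best model under AIC can be illustrated with a short Python sketch. It uses the standard definition AIC = 2k − 2 lnL, where k is the number of free parameters and lnL the maximized log-likelihood, and picks the model with the lowest score. The log-likelihood values and parameter counts below are invented for illustration, and the code is not taken from the PhyML-Best-AIC-tree script.

```python
def aic(log_likelihood, n_free_params):
    """Akaike Information Criterion: AIC = 2k - 2 lnL (lower is better)."""
    return 2.0 * n_free_params - 2.0 * log_likelihood


def best_model_by_aic(fits):
    """Pick the model with the lowest AIC.

    `fits` maps a model name to a (lnL, k) tuple; returns (best_name, scores).
    """
    scores = {name: aic(lnl, k) for name, (lnl, k) in fits.items()}
    best = min(scores, key=scores.get)
    return best, scores


# Hypothetical fits of three DNA models to the same alignment.
# The numbers are made up; k counts substitution-model parameters only.
example_fits = {
    "JC69": (-2510.0, 0),
    "HKY85": (-2450.0, 4),
    "GTR": (-2448.5, 8),
}

best, scores = best_model_by_aic(example_fits)
```

In this invented example GTR has the highest likelihood, but its four extra parameters improve lnL by only 1.5 units, so HKY85 obtains the lowest AIC and would be the model used for tree reconstruction.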

Evolutionary tests

For users interested in evolutionary hypotheses testing, Phylemon 2.0 provides tools for model selection, molecular adaptation and relative rate tests. In this version, we added jModelTest (15) v0.1, and a new version of ProtTest (16) v1.4 to improve the search for the model of DNA or protein evolution that best explains the data. One of the interesting results of jModelTest is the model-averaged topology obtained from the models within the 95% confidence interval. This topology can be used as the intree file required by all programs testing for molecular adaptation in Phylemon 2.0. Adaptation tests on protein-coding DNA sequences run in Phylemon 2.0 by means of the Site-wise Likelihood Ratio (SLR) test program v1.3 (17) and CodeML and YN00 (18) from PAML v4.4c (19). Finally, deviations from the molecular clock hypothesis can be tested using the RRTree (20) program v1.1.11. RRTree computes relative rate tests among user-defined lineages with a weighted or unweighted scheme of species based on the tree topology provided by the user. The program accepts different parameters: the number of synonymous substitutions and synonymous transitions per synonymous site (Ks and As, respectively), the number of non-synonymous substitutions and non-synonymous transversions per non-synonymous site (Ka and Ba, respectively) and, finally, the number of synonymous transversions per 4-fold degenerate site (B4). Kimura two-parameter (K2P) (21) and Jukes and Cantor (JC) models are available for non-coding DNA sequences. For protein sequences, RRTree computes a modification of the JC model. Users interested in ML comparison of topologies (paired-sites test) can select the ‘evaluation of user-defined trees’ option in the Tree-Puzzle program.
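The JC and K2P models mentioned above correct the observed proportion of differences between two aligned DNA sequences for multiple substitutions. The Python sketch below implements the standard pairwise distance formulas, d = −(3/4) ln(1 − 4p/3) for JC and d = −(1/2) ln[(1 − 2P − Q) √(1 − 2Q)] for K2P, where p is the proportion of differing sites, P the proportion of transitions and Q the proportion of transversions. It only illustrates the distance corrections; it is not the relative rate test itself, and the function names are assumptions.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def count_differences(seq1, seq2):
    """Count compared sites, transitions and transversions in two aligned sequences.

    Sites where either sequence has a gap or an ambiguous base are skipped.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    n = transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        n += 1
        if a == b:
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1      # purine<->purine or pyrimidine<->pyrimidine
        else:
            transversions += 1    # purine<->pyrimidine
    if n == 0:
        raise ValueError("no comparable sites")
    return n, transitions, transversions


def jc_distance(seq1, seq2):
    """Jukes-Cantor distance: d = -(3/4) ln(1 - 4p/3)."""
    n, ts, tv = count_differences(seq1, seq2)
    p = (ts + tv) / n
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def k2p_distance(seq1, seq2):
    """Kimura two-parameter distance: d = -(1/2) ln((1 - 2P - Q) sqrt(1 - 2Q))."""
    n, ts, tv = count_differences(seq1, seq2)
    P, Q = ts / n, tv / n
    return -0.5 * math.log((1.0 - 2.0 * P - Q) * math.sqrt(1.0 - 2.0 * Q))
```

For saturated sequence pairs the argument of the logarithm becomes zero or negative and `math.log` raises a `ValueError`, which reflects the fact that the distance is undefined under these models.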

PIPELINE AND PHYLOGENOMICS

Phylogenomic analyses sometimes involve repeating a certain set of analyses over several groups of genes. In such cases, it is necessary to apply the same set of phylogenetic algorithms to different sequence data using a single pipeline of tools. To satisfy this requirement, we developed the Pipeliner tool. Users interested in this kind of study can upload a zip file containing the sequences to run in a pipeline. The previous version of Pipeliner provided basic programs derived from the Phylip package, so that a pipeline such as ClustalW, Seqboot, DnaDist/ProtDist, Neighbor and Consense, used in that order, produced a phylogenetic reconstruction with bootstrap values. Pipeliner in Phylemon 2.0 adds PhyML and PhyML-Best-AIC-tree, which selects the best tree after comparing the AIC (Akaike Information Criterion) estimations of all DNA or protein models. Users can add tools from the list of tools and connect them using the ‘create link’ option. Once all the options of the tools are completed, the user can run and save the pipeline for future jobs.
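The idea of applying one saved pipeline to every gene in an uploaded zip file can be sketched in Python as follows. The sketch only builds a job plan (one ordered list of steps per file in the archive) and does not run any program. The file names, the `plan_jobs` function and the in-memory example archive are hypothetical and do not reflect the internal design of Pipeliner.

```python
import io
import zipfile

# Step order taken from the bootstrap pipeline described in the text.
DEFAULT_PIPELINE = ("ClustalW", "Seqboot", "DnaDist", "Neighbor", "Consense")


def plan_jobs(zip_bytes, pipeline=DEFAULT_PIPELINE):
    """Map every file in a zip archive to the same ordered list of steps."""
    plan = {}
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for name in sorted(archive.namelist()):
            if name.endswith("/"):   # skip directory entries
                continue
            plan[name] = list(pipeline)
    return plan


def make_example_zip():
    """Build a small in-memory archive with two made-up gene files."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("geneA.fasta", ">sp1\nACGT\n>sp2\nACGA\n")
        archive.writestr("geneB.fasta", ">sp1\nTTGA\n>sp2\nTTGC\n")
    return buffer.getvalue()


plan = plan_jobs(make_example_zip())
```

Each entry of `plan` pairs one gene file with the full sequence of tools, which is the property that makes a saved pipeline reusable across many genes.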

UTILITIES

Phylemon 2.0 implements three new utilities. First, TrimAl v1.3 (22) automatically removes poorly aligned regions from MSAs. The user can select a set of columns to be removed or set specific thresholds based on the fraction of gaps or the similarity of residues in a column. Additionally, TrimAl implements a series of automated algorithms that apply different optimized thresholds, based on the characteristics of each alignment. Second, CDS-ProtAl is a new tool for multiple coding sequence alignment based on protein sequences. This program translates the coding DNA sequences provided as input (universal genetic code) and uses Muscle to align the resulting proteins with default parameters, capped at 5 h of running time or 9999 iterations. Finally, ReadAl v1.3, a new tool for conversion among the most popular file formats used in phylogeny, has been included. In addition, ETE v2.1 (23) allows users to visualize and interact with trees. The new version allows rooting, collapsing, expanding or swapping nodes and incorporates the possibility to search for distances, support values or names (including the use of Perl-based regular expressions). These options and many others are available by clicking with the left mouse button on the nodes and in the area surrounding the tree.
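Trimming based on the fraction of gaps per column can be illustrated with a minimal Python sketch. It removes every alignment column whose gap fraction exceeds a user-defined threshold. This is a simplified illustration of the idea only: it is not the TrimAl implementation, and the function name, the `max_gap_fraction` parameter and the example alignment are assumptions.

```python
def trim_gappy_columns(alignment, max_gap_fraction=0.5):
    """Drop alignment columns whose fraction of gaps exceeds the threshold.

    `alignment` maps sequence names to aligned sequences of equal length,
    with '-' as the gap character.
    """
    names = list(alignment)
    if not names:
        return {}
    length = len(alignment[names[0]])
    if any(len(alignment[name]) != length for name in names):
        raise ValueError("all sequences must have the same aligned length")

    keep = []
    for i in range(length):
        gaps = sum(alignment[name][i] == "-" for name in names)
        if gaps / len(names) <= max_gap_fraction:
            keep.append(i)

    return {name: "".join(alignment[name][i] for i in keep) for name in names}


# Made-up alignment: the third column has gaps in two of three sequences.
example = {
    "sp1": "AC-GT",
    "sp2": "A--GT",
    "sp3": "ACTG-",
}
trimmed = trim_gappy_columns(example, max_gap_fraction=0.5)
```

With a threshold of 0.5 only the third column (gap fraction 2/3) is removed; columns with a single gap (fraction 1/3) are kept.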

REGISTRATION, ACCOUNTS, PROJECTS AND SPACE

Phylemon 2.0 can be accessed by anonymous login or by registered users. The only difference between these choices is that registered users, from whom only an e-mail is required, can have many different projects and use the server to store up to 1.0 GB of data for future use. Files from anonymous users are deleted after 24 h. Projects and jobs in Phylemon 2.0 can be created, renamed and deleted by users. The numbers of jobs that are finished and waiting to be visited, visited, running, and waiting in the queue are colored green, blue, red and yellow, respectively. Two icons give users access to files, projects and data management.

Technical details

Phylemon 2.0 has been completely reengineered. The server side is implemented in Java and the client side in AJAX (Asynchronous JavaScript And XML). Client and server exchange data in JSON (JavaScript Object Notation). Consequently, the new interface allows asynchronous use of tools (a program can be left running and its results checked later), and includes new facilities for the management of projects and jobs. Moreover, a queue system has been implemented in the server. This release makes intensive use of new web technologies and standards, so the supported browsers for this version are as follows: Chrome 7+, Firefox 3.5+, Safari 4+, Opera 10+ and Internet Explorer 8. Internet Explorer 6 and 7 are no longer supported. Pipeliner was developed in HTML5 and JavaScript and makes use of Scalable Vector Graphics (SVG); therefore, it runs in Chrome 7+, Firefox 4+ or Internet Explorer 9+. More details are available at the Wiki-Help: http://docs.bioinfo.cipf.es/phylemonwiki/doku.php.

DISCUSSION

Molecular evolution, phylogenetics, phylogenomics and evolutionary hypothesis testing embrace a wide range of scientific enquiries in biology. Following the latest developments in the field, Phylemon 2.0 combines tools and programs ranging from the simplest distance-based phylogenetic reconstruction, or the basic relative rate test, to the newest ML model-averaged estimation of the tree topology or the analysis of molecular adaptation. The incorporation of new tools in Phylemon 2.0 extends its usefulness to advanced users trying to find answers to more complex questions of phylogeny and evolution in a web server. Phylemon 2.0 addresses an important requirement of users and students of evolution and phylogeny; namely, the need for a public web server providing a core set of format-compatible, classical and advanced tools truly integrated in an independent web platform.

FUNDING

Grants (BFU2009-13409-C02-01, BIO2008-04212, BFU2009-09168); ‘Plan de Estímulo a la Economía y el Empleo’ (Plan E) from MICINN; PROMETEO/2010/001 from the GVA-FEDER. The CIBER de Enfermedades Raras and the INB are initiatives of the ISCIII. Funding for open access charge: MICINN project to HD.

Conflict of interest statement. None declared.
  20 in total

1.  LAGAN and Multi-LAGAN: efficient tools for large-scale multiple alignment of genomic DNA.

Authors:  Michael Brudno; Chuong B Do; Gregory M Cooper; Michael F Kim; Eugene Davydov; Eric D Green; Arend Sidow; Serafim Batzoglou
Journal:  Genome Res       Date:  2003-03-12       Impact factor: 9.043

2.  MrBayes 3: Bayesian phylogenetic inference under mixed models.

Authors:  Fredrik Ronquist; John P Huelsenbeck
Journal:  Bioinformatics       Date:  2003-08-12       Impact factor: 6.937

3.  MUSCLE: multiple sequence alignment with high accuracy and high throughput.

Authors:  Robert C Edgar
Journal:  Nucleic Acids Res       Date:  2004-03-19       Impact factor: 16.971

4.  Detecting amino acid sites under positive selection and purifying selection.

Authors:  Tim Massingham; Nick Goldman
Journal:  Genetics       Date:  2005-01-16       Impact factor: 4.562

5.  HyPhy: hypothesis testing using phylogenies.

Authors:  Sergei L Kosakovsky Pond; Simon D W Frost; Spencer V Muse
Journal:  Bioinformatics       Date:  2004-10-27       Impact factor: 6.937

6.  ProtTest: selection of best-fit models of protein evolution.

Authors:  Federico Abascal; Rafael Zardoya; David Posada
Journal:  Bioinformatics       Date:  2005-01-12       Impact factor: 6.937

7.  Approximate likelihood-ratio test for branches: A fast, accurate, and powerful alternative.

Authors:  Maria Anisimova; Olivier Gascuel
Journal:  Syst Biol       Date:  2006-08       Impact factor: 15.683

8.  Phylogenetic methods come of age: testing hypotheses in an evolutionary context.

Authors:  J P Huelsenbeck; B Rannala
Journal:  Science       Date:  1997-04-11       Impact factor: 47.728

9.  A simple method for estimating evolutionary rates of base substitutions through comparative studies of nucleotide sequences.

Authors:  M Kimura
Journal:  J Mol Evol       Date:  1980-12       Impact factor: 2.395

10.  CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice.

Authors:  J D Thompson; D G Higgins; T J Gibson
Journal:  Nucleic Acids Res       Date:  1994-11-11       Impact factor: 16.971
