Shigehiro Kuraku, Christian M Zmasek, Osamu Nishimura, Kazutaka Katoh.
Abstract
We report a new web server, aLeaves (http://aleaves.cdb.riken.jp/), for homologue collection from diverse animal genomes. In molecular comparative studies involving multiple species, orthology identification is the basis on which most subsequent biological analyses rely. It can be achieved most accurately by explicit phylogenetic inference. More and more species are being subjected to large-scale sequencing, but the resulting resources are scattered across independent project-based web sites and across multi-species but separate web sites. This complicates data access and is becoming a serious barrier to the comprehensiveness of molecular phylogenetic analysis. aLeaves, launched to overcome this difficulty, collects sequences similar to an input query sequence from various data sources. The collected sequences can be passed on to the MAFFT sequence alignment server (http://mafft.cbrc.jp/alignment/server/), whose interactivity has been significantly improved. This update enables users to switch between (i) sequence selection using the Archaeopteryx tree viewer, (ii) multiple sequence alignment and (iii) tree inference. These steps can be repeated as a loop until one reaches a sensible data set, which minimizes redundancy for better visibility and handling in phylogenetic inference while covering relevant taxa. The workflow achieved by the seamless link between aLeaves and MAFFT provides a convenient online platform to address various questions in zoology and evolutionary biology.
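As a rough illustration of what "minimizing redundancy" in a collected data set can mean, the sketch below removes records whose sequences are exactly identical from FASTA-formatted text. It is only an assumption-laden example and not the procedure used by aLeaves or MAFFT, where refinement is done interactively by the user. The function names (`parse_fasta`, `drop_identical`) and the example headers and sequences are hypothetical.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a list of (header, sequence) pairs."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def drop_identical(records):
    """Keep only the first record for each distinct sequence (case-insensitive)."""
    seen = set()
    kept = []
    for header, seq in records:
        key = seq.upper()
        if key not in seen:
            seen.add(key)
            kept.append((header, seq))
    return kept


# Hypothetical input: the third record duplicates the first one.
example = """>human_geneA
MKTAYIAK
QRQISFVK
>mouse_geneA
MKTAYIAKQRQISFVR
>human_geneA_copy
mktayiakqrqisfvk
"""

unique = drop_identical(parse_fasta(example))
for header, seq in unique:
    print(f">{header}\n{seq}")
```

Exact-match filtering only removes trivial duplicates; choosing among closely related but non-identical homologues still requires inspecting the tree, which is the step the interactive loop described above supports.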
Year: 2013 PMID: 23677614 PMCID: PMC3692103 DOI: 10.1093/nar/gkt389
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. Overview of the interface of the aLeaves web server. (A) The ‘Top’ page of the aLeaves server, containing the search interface. (B) The results page shown when a search and sequence collection is completed. The ‘Proceed to tree building’ section (red oval) provides a gateway to the MAFFT server for the rest of the process from data set refinement to molecular phylogenetic tree inference.
Figure 2. Phylogenetic coverage of the compiled databases available at the aLeaves web server. Numbering of the databases (Database 1–13) corresponds to that in the aLeaves server (http://aleaves.cdb.riken.jp/aleaves/database.html).
Sources of genome-wide protein data sets available at aLeaves but not available at Ensembl-based or NCBI-based sites
| Species name | English common name | Phylum | Database ID at aLeaves | URL |
|---|---|---|---|---|
| | Elephant shark (or ghost shark) | Chordata | 6 | |
| | | Chordata | 8 | |
| | Amphioxus | Chordata | 8 | |
| | Phytoparasitic root-knot nematode | Nematoda | 10 | |
| | Parasitic flatworm | Platyhelminthes | 11 | |
| | Polychaete worm | Annelida | 11 | |
| | Leech | Annelida | 11 | |
| | Pearl oyster | Mollusca | 11 | |
| | Pacific oyster | Mollusca | 11 | |
| | Okinawan staghorn coral | Cnidaria | 12 | |
These species are available at aLeaves but not available at ‘Ensembl’, ‘EnsemblGenomes Metazoa’ or NCBI Genome (as of April 8, 2013). The complete list of species available at aLeaves is found at http://aleaves.cdb.riken.jp/aleaves/species.html.
a The details of the aLeaves databases are given in Figure 2 (see also http://aleaves.cdb.riken.jp/aleaves/database.html).
Figure 3. Sequence data set refinement at the MAFFT web server through Archaeopteryx. Shown as inset is a view of the Archaeopteryx applet, in which a single node containing six sequences is selected (highlighted in bright green with parent node marked by a circle). The parental web browser window shows an HTML page with a list of sequences in the present data set, in which the six sequences selected in Archaeopteryx are newly selected with ticks on the left. The colouring of the different sequences indicates their taxonomic categorization (detailed in the ‘Species’ page of the aLeaves server).
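The node-based selection described in the Figure 3 caption amounts to collecting every leaf (sequence) that descends from a chosen internal node. The following minimal sketch shows that idea on a tree stored as nested dictionaries. It is not the Archaeopteryx or MAFFT implementation; the tree layout, the node and sequence names, and the helper functions `find_node` and `leaves_under` are all assumptions made for illustration.

```python
def find_node(node, name):
    """Depth-first search for the node with the given name; None if absent."""
    if node.get("name") == name:
        return node
    for child in node.get("children", []):
        found = find_node(child, name)
        if found is not None:
            return found
    return None


def leaves_under(node):
    """Return the names of all leaves below `node`, in left-to-right order."""
    children = node.get("children", [])
    if not children:
        return [node["name"]]
    names = []
    for child in children:
        names.extend(leaves_under(child))
    return names


# Hypothetical tree: the internal node "cladeA" contains six sequences.
tree = {
    "name": "root",
    "children": [
        {
            "name": "cladeA",
            "children": [
                {"name": "seq1"},
                {"name": "seq2"},
                {"name": "inner", "children": [{"name": "seq3"}, {"name": "seq4"}]},
                {"name": "seq5"},
                {"name": "seq6"},
            ],
        },
        {"name": "seq7"},
    ],
}

# Selecting one internal node selects all sequences beneath it.
selected = leaves_under(find_node(tree, "cladeA"))
print(selected)
```

In the interface described in the caption, the resulting list would correspond to the sequences ticked in the HTML page of the parent browser window.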