Abstract
BACKGROUND: Large phylogenies are crucial for many areas of biological research. One method of creating such large phylogenies is the supertree method, but creating supertrees containing thousands of taxa, and hence providing a comprehensive phylogeny, requires hundreds or even thousands of source input trees. Managing and processing these data in a systematic and error-free manner is challenging and will become even more so as supertrees contain ever-increasing numbers of taxa. Protocols for processing input source phylogenies have been proposed to ensure data quality, but no robust software implementations of these protocols exist as yet.
Year: 2010 PMID: 20377857 PMCID: PMC2872655 DOI: 10.1186/1756-0500-3-95
Source DB: PubMed Journal: BMC Res Notes ISSN: 1756-0500
Executables in the STK. List of all executables contained in the STK along with a short description
| Script Name | Function |
|---|---|
| Stk_amalagamate_trees | Create a single file from multiple tree files. |
| Stk_check_data | Provides a consistency check on the current dataset. |
| Stk_check_overlap | Ensures minimal taxonomic overlap between source trees in the dataset. |
| Stk_check_substitutions | Checks that a substitution file will not introduce taxa into a dataset. Also checks formatting of a file. |
| Stk_create_matrix | Creates an MRP matrix file from a dataset. |
| Stk_data_independence | Check the dataset for independence between source trees. |
| Stk_data_summary | Prints a summary of a dataset, including taxa list and some basic statistics. |
| Stk_fix_treeview | Converts tree files produced in TreeView [ |
| Stk_replace_genera | Converts a dataset to species-level (as far as possible). |
| Stk_replace_taxa | Enables substitutions or removal of taxa from a dataset or file. |
| Stk_search_data | Searches phylogenetic- and meta-data. Can also create a copy from the returned results. |
| Stk_tree_permutation | Uses a modified NEXUS file to create all permutations of taxa positions for paraphyletic taxa. |
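
To illustrate what an MRP (matrix representation with parsimony) matrix contains, here is a minimal sketch of the standard coding: each clade in each source tree becomes one character, scored `1` for taxa inside the clade, `0` for taxa in that tree but outside the clade, and `?` for taxa absent from that tree. The nested-tuple tree representation and the function names (`leaves`, `clades`, `mrp_matrix`) are assumptions made for this example. This is not the STK implementation and does not reproduce its file formats.

```python
def leaves(node):
    """Return the set of taxon names below a node (a str or a nested tuple)."""
    if isinstance(node, str):
        return {node}
    result = set()
    for child in node:
        result |= leaves(child)
    return result


def clades(node):
    """Yield the taxon set of every internal node below the root."""
    if isinstance(node, str):
        return
    for child in node:
        if not isinstance(child, str):
            yield frozenset(leaves(child))
            yield from clades(child)


def mrp_matrix(source_trees):
    """Build a taxon -> character-string mapping using standard MRP coding."""
    all_taxa = sorted(set().union(*(leaves(t) for t in source_trees)))
    rows = {taxon: [] for taxon in all_taxa}
    for tree in source_trees:
        present = leaves(tree)
        for clade in clades(tree):
            for taxon in all_taxa:
                if taxon not in present:
                    rows[taxon].append("?")  # taxon missing from this source tree
                elif taxon in clade:
                    rows[taxon].append("1")  # taxon inside the clade
                else:
                    rows[taxon].append("0")  # taxon in the tree, outside the clade
    return {taxon: "".join(chars) for taxon, chars in rows.items()}


# Two hypothetical source trees: ((A,B),C) and ((B,C),D)
example_trees = [(("A", "B"), "C"), (("B", "C"), "D")]
matrix = mrp_matrix(example_trees)
for taxon, row in matrix.items():
    print(taxon, row)
```

Each column of the resulting matrix corresponds to one clade from one source tree, so the matrix grows with the number and resolution of the source trees.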
Figure 1. Checking taxonomic overlap of source phylogenies. An example of the output from stk_check_overlap. A) shows the source trees represented as nodes, with an edge drawn between two trees when they share at least two taxa. Tree 10 does not have sufficient taxonomic overlap and should therefore be excluded. B) shows the same set of source phylogenies, but with four taxa required as the minimum for taxonomic overlap. Now trees 10 and 18 should be excluded. Note also that the number of edges connecting each node has decreased accordingly.
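
The overlap check described in this caption can be sketched as a small graph computation. The sketch below is illustrative only: the function names, the representation of each source tree as a set of taxon names, and the rule that trees outside the largest connected component are flagged for exclusion are all assumptions of this example, not the behaviour of stk_check_overlap itself.

```python
from itertools import combinations


def overlap_graph(trees, min_shared=2):
    """Link two trees when they share at least `min_shared` taxa.

    `trees` maps a tree name to the set of taxa it contains.
    """
    graph = {name: set() for name in trees}
    for a, b in combinations(trees, 2):
        if len(trees[a] & trees[b]) >= min_shared:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def connected_components(graph):
    """Return the connected components of an undirected graph as sets."""
    seen = set()
    components = []
    for start in graph:
        if start in seen:
            continue
        component = set()
        stack = [start]
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(graph[node] - component)
        seen |= component
        components.append(component)
    return components


def trees_to_exclude(trees, min_shared=2):
    """Assumed rule: flag every tree outside the largest connected component."""
    if not trees:
        return []
    components = connected_components(overlap_graph(trees, min_shared))
    largest = max(components, key=len)
    return sorted(set(trees) - largest)


# Hypothetical data: taxon sets for four source trees
trees = {
    "tree_1": {"A", "B", "C", "D"},
    "tree_2": {"A", "B", "C", "E"},
    "tree_3": {"B", "C", "E", "F"},
    "tree_10": {"A", "X", "Y"},
}
print(trees_to_exclude(trees, min_shared=2))
print(len(overlap_graph(trees, 2)["tree_1"]), len(overlap_graph(trees, 3)["tree_1"]))
```

Raising `min_shared` removes edges from the graph, which mirrors the caption's observation that nodes have fewer connections when a larger overlap is required.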
Figure 2. Chaining of STK scripts to produce a "processing pipeline". Data (right - both trees and associated XML meta-data) are processed via the STK scripts. The data output from one script can be fed into another, with data checks being performed throughout processing. The final stage is matrix generation and use of other tools to produce a supertree.