T Jeffrey Cole, Michael S Brewer.
Abstract
BACKGROUND: The recent proliferation of biodiversity transcriptomic data has created an ever-expanding need for scalable, user-friendly tools capable of answering large-scale molecular evolution questions. FUSTr identifies gene families involved in the process of adaptation. The tool finds genes under strong positive selection in transcriptomic datasets, and it automatically detects isoform designation patterns in transcriptome assemblies to maximize phylogenetic independence in downstream analysis.
Keywords: Gene family reconstruction; Molecular evolution; Positive selection; Transcriptomics
Year: 2018 PMID: 29372116 PMCID: PMC5775752 DOI: 10.7717/peerj.4234
Source DB: PubMed Journal: PeerJ ISSN: 2167-8359 Impact factor: 2.984
Figure 1. Parallelization scheme and workflow of FUSTr.
Color coding denotes functional subroutines in the pipeline: preparation and open reading frame prediction (red); homology inference and gene family clustering (green); multiple sequence alignment, phylogenetics, and selection detection (brown); and model selection and reconciliation (blue).
Benchmarks of the time and memory used by each subroutine in the Tetragnatha transcriptome assembly analysis.
The red highlighted row marks the subroutine consuming the most memory and time per task; the blue highlighted row marks the subroutine consuming the most memory and time in total.
| Subroutine | Tasks | Seconds per task | Total seconds | RAM per task (MiB) | Total RAM (MiB) |
|---|---|---|---|---|---|
| Clean fastas | 6 | 1.40 | 8.38 | 46.5 | 278.9 |
| New headers | 6 | 1.65 | 9.90 | 43.6 | 261.5 |
| Long isoform | 6 | 0.512 | 3.07 | 51.5 | 309.13 |
| Diamond | 1 | 32.1 | 32.1 | 234.0 | 234.0 |
| SiLiX | 1 | 4.51 | 4.51 | 22.8 | 22.8 |
| Mafft | 135 | 3.24 | 437.8 | 18.3 | 2,466.5 |
| FastTree | 135 | 3.09 | 417.4 | 18.5 | 2,491.3 |
| TrimAL | 135 | 1.87 | 252.2 | 17.9 | 2,415.6 |
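The per-task and total columns above are related by simple arithmetic (total = tasks × per-task value). The following is a minimal sketch that recomputes the totals from the rows listed here and ranks the subroutines by time. The function names are hypothetical, and the recomputed totals differ slightly from the reported ones because the per-task values in the table are rounded.

```python
# Rows copied from the benchmark table above:
# (subroutine, tasks, seconds per task, RAM per task in MiB)
BENCHMARKS = [
    ("Clean fastas", 6, 1.40, 46.5),
    ("New headers", 6, 1.65, 43.6),
    ("Long isoform", 6, 0.512, 51.5),
    ("Diamond", 1, 32.1, 234.0),
    ("SiLiX", 1, 4.51, 22.8),
    ("Mafft", 135, 3.24, 18.3),
    ("FastTree", 135, 3.09, 18.5),
    ("TrimAL", 135, 1.87, 17.9),
]


def totals(rows):
    """Return {subroutine: (total seconds, total RAM in MiB)} as tasks x per-task value."""
    return {name: (tasks * sec, tasks * ram) for name, tasks, sec, ram in rows}


def slowest_per_task(rows):
    """Name of the subroutine with the largest seconds-per-task value."""
    return max(rows, key=lambda row: row[2])[0]


def slowest_in_total(rows):
    """Name of the subroutine with the largest recomputed total time."""
    return max(rows, key=lambda row: row[1] * row[2])[0]


if __name__ == "__main__":
    for name, (seconds, ram) in totals(BENCHMARKS).items():
        print(f"{name:<13} {seconds:8.1f} s {ram:9.1f} MiB")
    print("Slowest per task:", slowest_per_task(BENCHMARKS))
    print("Slowest in total:", slowest_in_total(BENCHMARKS))
```

Among the rows listed here, the single Diamond task is the most expensive per task, while the 135 Mafft tasks dominate total time. This ranking covers only the rows shown and says nothing about which rows were highlighted in the original table.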
Figure 2. Schematic of EvolveAGene methods used to simulate sequences for the validation of FUSTr.
Sequences were randomly generated and evolved along a symmetric phylogeny under a given selective regime (positive, negative, or constant across the entire gene).
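To illustrate what evolving a random sequence along a symmetric phylogeny means, here is a toy sketch. It is not EvolveAGene and does not reproduce its model: it works on nucleotides with a uniform per-site substitution probability on every branch, ignores indels, and applies no selective regime. The function names, the `rate` parameter, and the balanced binary tree of fixed `depth` are all assumptions made for illustration.

```python
import random

BASES = "ACGT"


def mutate(seq, rate, rng):
    """Copy seq, replacing each base with a different base with probability rate."""
    out = []
    for base in seq:
        if rng.random() < rate:
            out.append(rng.choice([b for b in BASES if b != base]))
        else:
            out.append(base)
    return "".join(out)


def simulate_symmetric(length, depth, rate, seed=0):
    """Evolve a random root sequence down a balanced binary tree.

    Every node has exactly two children, so the tree is symmetric and
    ends in 2**depth leaf sequences. Returns (root, leaves).
    """
    rng = random.Random(seed)
    root = "".join(rng.choice(BASES) for _ in range(length))
    level = [root]
    for _ in range(depth):
        # Each parent gives rise to two independently mutated children.
        level = [mutate(parent, rate, rng) for parent in level for _ in range(2)]
    return root, level


if __name__ == "__main__":
    root, leaves = simulate_symmetric(length=60, depth=3, rate=0.05, seed=1)
    print("root  ", root)
    for i, leaf in enumerate(leaves):
        print(f"leaf {i}", leaf)
```

A codon-aware simulator with distinct positive, negative, and constant selective regimes, as the caption describes, would need a substitution model well beyond this sketch.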