Julie M Allen, Raphael LaFrance, Ryan A Folk, Kevin P Johnson, Robert P Guralnick.
Abstract
Massive strides have been made in technologies for collecting genome-scale data. However, tools for efficiently and flexibly assembling raw outputs into downstream analytical workflows are still nascent. aTRAM 1.0 was designed to assemble any locus from genome sequencing data but was neither optimized for efficiency nor able to serve as a single toolkit for all assembly needs. We have completely re-implemented aTRAM and redesigned its structure for faster read retrieval while adding a number of key features to improve flexibility and functionality. The software can now (1) assemble single- or paired-end data, (2) utilize both read directions in the database, (3) use an additional de novo assembly module, and (4) leverage new built-in pipelines to automate common workflows in phylogenomics. Owing to reimplementation of databasing strategies, we demonstrate that aTRAM 2.0 is much faster across all applications compared to the previous version.
Keywords: aTRAM; locus assembly; massively parallel sequencing; short-read sequencing; software
Year: 2018 PMID: 29881251 PMCID: PMC5987885 DOI: 10.1177/1176934318774546
Source DB: PubMed Journal: Evol Bioinform Online ISSN: 1176-9343 Impact factor: 1.625
Figure 1. Overall aTRAM workflow, which includes a preparation process followed by assembly steps. The preparation process includes sharding raw data into an aTRAM library, including construction of a whole-dataset SQLite database from raw reads. The assembly process uses a bait sequence or set of sequences to perform an iterative assembly. The whole process generates assembled reads that are often longer than target baits as shown in later iterations below.
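
The two stages in the caption (preparing a sharded read library backed by SQLite, then iterating from a bait) can be illustrated with a toy sketch. This is not aTRAM's code or API. Every name here (`build_library`, `find_hits`, `extend`, `iterative_assemble`) is hypothetical, shards are modelled as a column in a single in-memory table, shared k-mers stand in for the real similarity search, and a greedy overlap merge stands in for a de novo assembler.

```python
import sqlite3
import zlib


def build_library(reads, shard_count=2):
    """Preparation: load (name, sequence) pairs into SQLite and assign shards."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE reads (name TEXT PRIMARY KEY, shard INTEGER, seq TEXT)"
    )
    # Assumption: shard by a stable hash of the read name.
    rows = [
        (name, zlib.crc32(name.encode()) % shard_count, seq)
        for name, seq in reads
    ]
    db.executemany("INSERT INTO reads VALUES (?, ?, ?)", rows)
    db.commit()
    return db


def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def find_hits(db, shard_count, query, k):
    """Search shard by shard for reads sharing a k-mer with the query."""
    query_kmers = kmers(query, k)
    hits = []
    for shard in range(shard_count):
        cursor = db.execute(
            "SELECT seq FROM reads WHERE shard = ? ORDER BY name", (shard,)
        )
        hits.extend(seq for (seq,) in cursor if kmers(seq, k) & query_kmers)
    return hits


def extend(contig, read, k):
    """Greedy merge: grow the contig if the read overlaps an end by >= k."""
    if read in contig:
        return contig
    if contig in read:
        return read
    for olap in range(min(len(contig), len(read)) - 1, k - 1, -1):
        if contig.endswith(read[:olap]):
            return contig + read[olap:]      # extend to the right
        if contig.startswith(read[-olap:]):
            return read[:-olap] + contig     # extend to the left
    return contig


def iterative_assemble(db, shard_count, bait, k=8, iterations=10):
    """Assembly: each round's contig becomes the bait for the next round."""
    contig = bait
    for _ in range(iterations):
        previous = contig
        for read in find_hits(db, shard_count, contig, k):
            contig = extend(contig, read, k)
        if contig == previous:               # no read extended the contig
            break
    return contig
```

Given a library built from overlapping reads, `iterative_assemble(db, shard_count, bait)` returns a contig that grows past the bait on each round until no remaining read extends it, which mirrors the caption's point that the output is often longer than the target bait. The sketch ignores reverse complements, sequencing errors, and repeats, all of which a real assembler must handle.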