Getiria Onsongo, Jesse Erdmann, Michael D Spears, John Chilton, Kenneth B Beckman, Adam Hauge, Sophia Yohe, Matthew Schomaker, Matthew Bower, Kevin A T Silverstein, Bharat Thyagarajan.
Abstract
BACKGROUND: The introduction of next generation sequencing (NGS) has revolutionized molecular diagnostics, but several challenges still limit the widespread adoption of NGS testing in clinical practice. One such challenge is the development of a robust bioinformatics pipeline that can handle the volume of data generated by high-throughput sequencing in a cost-effective manner. Analysis of sequencing data typically requires a substantial level of computing power that is often cost-prohibitive for most clinical diagnostics laboratories.
Year: 2014 PMID: 24885806 PMCID: PMC4036707 DOI: 10.1186/1756-0500-7-314
Source DB: PubMed Journal: BMC Res Notes ISSN: 1756-0500
Figure 1. Flowchart for analysis pipeline. A metadata file describing the sequence data being uploaded for analysis, together with the location of the files, is passed as input to a shell script. The shell script configures the VM on Amazon's AWS and uploads the data to the VM. A Galaxy workflow is used for Phase 1 of the analysis. QC results are examined to verify that the data meet quality thresholds. A second Galaxy workflow is used for Phase 2 of the analysis, producing a VCF file containing variants.
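The two-phase flow with a QC gate in between can be illustrated with a minimal Python sketch. This is not the authors' shell script or Galaxy workflow: the metric names (`mean_coverage`, `pct_bases_q30`), the threshold values, and the functions `passes_qc` and `run_two_phase` are all hypothetical, chosen only to show the control flow in which Phase 2 runs only when the Phase 1 QC results meet the thresholds.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class QCThresholds:
    # Hypothetical thresholds; the source does not state the actual values.
    min_mean_coverage: float = 100.0
    min_pct_bases_q30: float = 80.0


def passes_qc(metrics: Dict[str, float], thresholds: QCThresholds) -> bool:
    """Return True only if every (hypothetical) QC metric meets its threshold."""
    return (
        metrics.get("mean_coverage", 0.0) >= thresholds.min_mean_coverage
        and metrics.get("pct_bases_q30", 0.0) >= thresholds.min_pct_bases_q30
    )


def run_two_phase(
    run_phase1: Callable[[], Dict[str, float]],
    run_phase2: Callable[[], str],
    thresholds: QCThresholds,
) -> Optional[str]:
    """Run Phase 1, check QC, and run Phase 2 only if QC passes.

    run_phase1 stands in for the first workflow and returns QC metrics.
    run_phase2 stands in for the second workflow and returns a VCF path.
    Returns None when the QC gate stops the run.
    """
    metrics = run_phase1()
    if not passes_qc(metrics, thresholds):
        return None
    return run_phase2()
```

In a real deployment the two callables would presumably wrap the workflow invocations on the VM; here they are plain functions so that the gate logic can be exercised in isolation.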
Figure 2. The Galaxy analysis pipeline. The URL gives a link to a virtual machine on the Amazon cloud that runs the analysis pipeline. The Galaxy interface is configured to make the CLIA-certified workflows accessible as tools under the tools pane (left pane). The center pane shows results for one of the QC analyses (a coverage plot outlining the percent of bases with different levels of coverage). The right pane is a history of all the tools and the order in which they were executed by the pipeline, together with their results.
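The coverage summary behind a plot of this kind can be sketched in a few lines of Python. This is an illustration only, under the assumption that "percent of bases with different levels of coverage" means the share of targeted bases covered at or above each depth level; the function name `percent_bases_at_coverage` and the example depth levels are hypothetical and do not come from the source.

```python
from typing import Dict, Sequence


def percent_bases_at_coverage(
    depths: Sequence[int], levels: Sequence[int]
) -> Dict[int, float]:
    """Percent of bases whose read depth is at least each given level.

    depths holds one read-depth value per targeted base.
    levels holds the coverage cutoffs to report (assumed "at least" semantics).
    """
    total = len(depths)
    if total == 0:
        # No bases to summarize; report zero for every level.
        return {level: 0.0 for level in levels}
    return {
        level: 100.0 * sum(1 for d in depths if d >= level) / total
        for level in levels
    }


# Five bases with depths 0, 5, 12, 30 and 100 reads.
summary = percent_bases_at_coverage([0, 5, 12, 30, 100], [1, 10, 20, 100])
print(summary)  # {1: 80.0, 10: 60.0, 20: 40.0, 100: 20.0}
```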