Richard M Leggett, Ricardo H Ramirez-Gonzalez, Bernardo J Clavijo, Darren Waite, Robert P Davey.
Abstract
The processes of quality assessment and control are an active area of research at The Genome Analysis Centre (TGAC). Unlike other sequencing centers that often concentrate on a certain species or technology, TGAC applies expertise in genomics and bioinformatics to a wide range of projects, often requiring bespoke wet lab and in silico workflows. TGAC is fortunate to have access to a diverse range of sequencing and analysis platforms, and we are at the forefront of investigations into library quality and sequence data assessment. We have developed and implemented a number of algorithms, tools, pipelines and packages to ascertain, store, and expose quality metrics across a number of next-generation sequencing platforms, allowing rapid and in-depth cross-platform Quality Control (QC) bioinformatics. In this review, we describe these tools as a vehicle for data-driven informatics, offering the potential to provide richer context for downstream analysis and to inform experimental design.
Keywords: NGS data analysis; QC; bioinformatics tools; contamination screening; quality assessment and improvement; quality control; run statistics; sequence analysis
Year: 2013 PMID: 24381581 PMCID: PMC3865868 DOI: 10.3389/fgene.2013.00288
Source DB: PubMed Journal: Front Genet ISSN: 1664-8021 Impact factor: 4.599
Figure 1. Data flow through the Primary Analysis Pipeline, focused on the Illumina platform.
Figure 2. Example k-mer spectra. An initial peak around coverage 1 is indicative of sequencing errors. Two further peaks indicate heterozygosity.
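The k-mer spectrum in the Figure 2 caption can be illustrated with a toy sketch. This is not the authors' tooling: the function names `kmer_counts` and `kmer_spectrum`, the choice of k = 3, and the four example reads are all assumptions made for illustration, and reverse-complement (canonical) k-mers are deliberately ignored. The sketch counts how often each k-mer occurs and then tallies how many distinct k-mers share each multiplicity, which is the histogram such a figure plots.

```python
from collections import Counter


def kmer_counts(reads, k):
    """Count every k-mer (length-k substring) across all reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def kmer_spectrum(reads, k):
    """Map multiplicity -> number of distinct k-mers seen that many times."""
    return Counter(kmer_counts(reads, k).values())


# Hypothetical data: three identical reads plus one read carrying a
# single-base substitution (A -> T at the fifth base), standing in for
# a sequencing error.
reads = ["ACGTAC", "ACGTAC", "ACGTAC", "ACGTTC"]
spectrum = kmer_spectrum(reads, 3)

for multiplicity in sorted(spectrum):
    print(multiplicity, spectrum[multiplicity])
```

In this toy input the substituted base creates k-mers that occur only once, so they fall in the multiplicity-1 bin, mirroring the low-coverage error peak the caption describes. Real data would need far larger k, canonical k-mer handling, and a dedicated counting tool.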