FASTQuick: rapid and comprehensive quality assessment of raw sequence reads.

Fan Zhang, Hyun Min Kang.

Abstract

BACKGROUND: Rapid and thorough quality assessment of sequenced genomes on an ultra-high-throughput scale is crucial for successful large-scale genomic studies. Comprehensive quality assessment typically requires full genome alignment, which costs a substantial amount of computational resources and turnaround time. Existing tools are either computationally expensive because they perform full alignment or lack essential quality metrics because they skip read alignment.
FINDINGS: We developed a set of rapid and accurate methods to produce comprehensive quality metrics directly from a subset of raw sequence reads (from whole-genome or whole-exome sequencing) without full alignment. Our methods offer orders of magnitude faster turnaround time than existing full alignment-based methods while providing comprehensive and sophisticated quality metrics, including estimates of genetic ancestry and cross-sample contamination.
CONCLUSIONS: By performing quality assessment rapidly and comprehensively, our tool will help investigators detect potential issues in ultra-high-throughput sequence reads in real time and at low computational cost during the early stages of analysis, ensuring high-quality downstream results and preventing unexpected loss of time, money, and invaluable specimens.
© The Author(s) 2021. Published by Oxford University Press GigaScience.

Keywords:  contamination; genetic ancestry; quality assessment; sequencing data analysis

Year:  2021        PMID: 33511994      PMCID: PMC7844880          DOI: 10.1093/gigascience/giab004

Source DB:  PubMed          Journal:  Gigascience        ISSN: 2047-217X            Impact factor:   6.524

