2-kupl: mapping-free variant detection from DNA-seq data of matched samples.

Yunfeng Wang, Haoliang Xue, Christine Pourcel, Yang Du, Daniel Gautheret.

Abstract

BACKGROUND: The detection of genome variants, including point mutations, indels and structural variants, is a fundamental and challenging computational problem. We address here the problem of variant detection between two deep-sequencing (DNA-seq) samples, such as two human samples from an individual patient, or two samples from distinct bacterial strains. The preferred strategy in such a case is to align each sample to a common reference genome, collect all variants and compare these variants between samples. Such mapping-based protocols have several limitations. DNA sequences with large indels, aggregated mutations and structural variants are hard to map to the reference. Furthermore, DNA sequences cannot be mapped reliably to genomic low complexity regions and repeats.
RESULTS: We introduce 2-kupl, a k-mer based, mapping-free protocol to detect variants between two DNA-seq samples. On simulated and actual data, 2-kupl achieves higher accuracy than other mapping-free protocols. Applying 2-kupl to prostate cancer whole exome sequencing data, we identify a number of candidate variants in hard-to-map regions and propose potential novel recurrent variants in this disease.
CONCLUSIONS: We developed a mapping-free protocol for variant calling between matched DNA-seq samples. Our protocol is suitable for variant detection in unmappable genome regions or in the absence of a reference genome.
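The general idea behind a k-mer based, mapping-free comparison can be illustrated with a small sketch. This is not the 2-kupl implementation: the function names (`canonical`, `count_kmers`, `specific_kmers`), the toy reads, and the parameters `k` and `min_count` are assumptions chosen for illustration only. The sketch counts canonical k-mers in each sample and keeps those seen repeatedly in one sample but never in the other; a single point mutation then shows up as a cluster of up to k sample-specific k-mers, with no reference genome involved.

```python
from collections import Counter

# Translation table for reverse-complementing DNA strings.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical(kmer):
    """Return the lexicographically smaller of a k-mer and its reverse complement."""
    rc = kmer.translate(_COMPLEMENT)[::-1]
    return min(kmer, rc)


def count_kmers(reads, k):
    """Count canonical k-mers over all reads, skipping k-mers with non-ACGT bases."""
    counts = Counter()
    for read in reads:
        read = read.upper()
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            if set(kmer) <= set("ACGT"):
                counts[canonical(kmer)] += 1
    return counts


def specific_kmers(case_reads, control_reads, k=5, min_count=2):
    """Return k-mers seen at least min_count times in case and never in control.

    min_count is a hypothetical threshold meant to discard k-mers that
    arise from isolated sequencing errors.
    """
    case = count_kmers(case_reads, k)
    control = count_kmers(control_reads, k)
    return {km for km, n in case.items() if n >= min_count and km not in control}


if __name__ == "__main__":
    # Toy data: the case reads differ from the control reads by one substitution.
    control = ["ACGTTGCAAGGCT", "ACGTTGCAAGGCT"]
    case = ["ACGTTGCTAGGCT", "ACGTTGCTAGGCT"]
    print(sorted(specific_kmers(case, control)))
```

In practice, k is much larger than in this toy example (a few tens of bases is typical for k-mer methods) so that k-mers are mostly unique in the genome, and counting is done with a dedicated k-mer counter rather than an in-memory dictionary.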

Keywords:  Contigs; DNAseq; Mapping-free; PRAD; Recurrent variants; WES; WGS; k-mers

Year: 2021; PMID: 34090332; DOI: 10.1186/s12859-021-04185-6

Source DB: PubMed; Journal: BMC Bioinformatics; ISSN: 1471-2105; Impact factor: 3.169


