Vecuum: identification and filtration of false somatic variants caused by recombinant vector contamination.

Junho Kim, Ju Heon Maeng, Jae Seok Lim, Hyeonju Son, Junehawk Lee, Jeong Ho Lee, Sangwoo Kim.

Abstract

MOTIVATION: Advances in sequencing technologies have substantially lowered the detection limit for somatic variants, making low-frequency variants detectable. However, calling mutations in this range is still confounded by many factors, including environmental contamination. Vector contamination is a recurring issue and is especially problematic because vector inserts are hard to distinguish from sample sequences. Such inserts, which may harbor polymorphisms and engineered functional mutations, can cause false variant calls at the corresponding sites. Numerous vector-screening methods have been developed, but none can handle contamination from inserts because they focus on vector backbone sequences alone.
RESULTS: We developed a novel method, Vecuum, that identifies vector-originated reads and the resulting false variants. Since vector inserts are generally constructed from intron-less cDNAs, Vecuum identifies vector-originated reads by inspecting the clipping patterns at exon junctions. False variant calls are then detected based on the biased distribution of mutant alleles toward vector-originated reads. Tests on simulated and spike-in experimental data showed that Vecuum could detect 93% of vector contaminants and remove up to 87% of variant-like false calls with 100% precision. Application to public sequence datasets demonstrated the utility of Vecuum in detecting false variants resulting from various types of external contamination.
AVAILABILITY AND IMPLEMENTATION: Java-based implementation of the method is available at http://vecuum.sourceforge.net/
CONTACT: swkim@yuhs.ac
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
© The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
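
The two steps summarized under RESULTS (flagging reads whose clipping coincides with exon junctions, then testing whether mutant alleles are biased toward those reads) can be illustrated with a minimal sketch. This is not Vecuum's implementation. It assumes SAM-style CIGAR strings, 0-based half-open coordinates, and a caller-supplied exon list. The function names, the `tol` parameter, and the use of a one-sided Fisher's exact test are all illustrative assumptions, since the abstract does not name the statistic used.

```python
import re
from math import comb

CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")
REF_CONSUMING = set("MDN=X")  # CIGAR operations that advance along the reference


def clip_boundaries(pos, cigar):
    """Return (left, right) reference coordinates where a soft clip abuts
    the aligned block, or None on a side that has no soft clip."""
    # Hard clips are dropped so that a soft clip next to them is still seen.
    ops = [(int(n), op) for n, op in CIGAR_RE.findall(cigar) if op != "H"]
    ref_len = sum(n for n, op in ops if op in REF_CONSUMING)
    left = pos if ops and ops[0][1] == "S" else None
    right = pos + ref_len if ops and ops[-1][1] == "S" else None
    return left, right


def is_vector_like(pos, cigar, exons, tol=0):
    """Assumed heuristic: a read looks cDNA-derived if a soft clip falls
    on an exon start (left clip) or an exon end (right clip)."""
    left, right = clip_boundaries(pos, cigar)
    for start, end in exons:
        if left is not None and abs(left - start) <= tol:
            return True
        if right is not None and abs(right - end) <= tol:
            return True
    return False


def fisher_one_sided(a, b, c, d):
    """One-sided Fisher's exact test on the 2x2 table
        a = mutant & vector-like      b = mutant & other
        c = reference & vector-like   d = reference & other
    Returns P(at least `a` mutant reads among the vector-like reads)."""
    n = a + b + c + d
    mutant = a + b
    vector_like = a + c
    total = comb(n, vector_like)
    upper = min(mutant, vector_like)
    hits = sum(
        comb(mutant, k) * comb(n - mutant, vector_like - k)
        for k in range(a, upper + 1)
    )
    return hits / total


# Hypothetical exons and reads, for illustration only.
exons = [(100, 200), (500, 600)]
flags = [
    is_vector_like(150, "50M50S", exons),  # right clip at an exon end
    is_vector_like(120, "100M", exons),    # no clipping at all
    is_vector_like(500, "30S70M", exons),  # left clip at an exon start
    is_vector_like(130, "40M60S", exons),  # clip inside an exon
]
p_biased = fisher_one_sided(8, 0, 2, 90)
p_unbiased = fisher_one_sided(1, 9, 10, 90)
```

A small `p_biased` means the mutant allele is carried almost exclusively by vector-like reads, which is the pattern the abstract describes for false calls. A real pipeline would read alignments from a BAM file and would need to handle strand, mapping quality, and multiple testing.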

Year:  2016        PMID: 27334474     DOI: 10.1093/bioinformatics/btw383

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  3 in total

1.  VecScreen_plus_taxonomy: imposing a tax(onomy) increase on vector contamination screening.

Authors:  Alejandro A Schäffer; Eric P Nawrocki; Yoon Choi; Paul A Kitts; Ilene Karsch-Mizrachi; Richard McVeigh
Journal:  Bioinformatics       Date:  2018-03-01       Impact factor: 6.937

2.  Analysis of low-level somatic mosaicism reveals stage and tissue-specific mutational features in human development.

Authors:  Ja Hye Kim; Shinwon Hwang; Hyeonju Son; Dongsun Kim; Il Bin Kim; Myeong-Heui Kim; Nam Suk Sim; Dong Seok Kim; Yoo-Jin Ha; Junehawk Lee; Hoon-Chul Kang; Jeong Ho Lee; Sangwoo Kim
Journal:  PLoS Genet       Date:  2022-09-19       Impact factor: 6.020

3.  cDNA-detector: detection and removal of cDNA contamination in DNA sequencing libraries.

Authors:  Meifang Qi; Utthara Nayar; Leif S Ludwig; Nikhil Wagle; Esther Rheinbay
Journal:  BMC Bioinformatics       Date:  2021-12-24       Impact factor: 3.169
