Tiffany M Delhomme, Patrice H Avogbe, Aurélie A G Gabriel, Nicolas Alcala, Noemie Leblay, Catherine Voegele, Maxime Vallée, Priscilia Chopard, Amélie Chabrier, Behnoush Abedi-Ardekani, Valérie Gaborieau, Ivana Holcatova, Vladimir Janout, Lenka Foretová, Sasa Milosavljevic, David Zaridze, Anush Mukeriya, Elisabeth Brambilla, Paul Brennan, Ghislaine Scelo, Lynnette Fernandez-Cuesta, Graham Byrnes, Florence L Calvez-Kelm, James D McKay, Matthieu Foll.
Abstract
The emergence of next-generation sequencing (NGS) has revolutionized how genome sequences are obtained, with the promise of providing a comprehensive characterization of DNA variations. Nevertheless, detecting somatic mutations remains a difficult problem, in particular when trying to identify low-abundance mutations, such as subclonal mutations, tumour-derived alterations in body fluids or somatic mutations in histologically normal tissue. The main challenge is to precisely distinguish between sequencing artefacts and true mutations, particularly when the latter are so rare that they reach abundance levels similar to those of artefacts. Here, we present needlestack, a highly sensitive variant caller that learns the level of systematic sequencing errors directly from the data in order to call mutations accurately. Needlestack is based on the idea that the sequencing error rate can be dynamically estimated by analysing multiple samples together. We show that the sequencing error rate varies across alterations, illustrating the need to estimate it precisely. We evaluate the performance of needlestack for various types of variations, and we show that needlestack is robust across positions and outperforms existing state-of-the-art methods for low-abundance mutations. Needlestack, along with its source code, is freely available on the GitHub platform: https://github.com/IARCbioinfo/needlestack. © World Health Organization and the authors, 2020. All rights reserved. The World Health Organization and the authors have granted the Publisher permission for the reproduction of this article.
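The core idea stated in the abstract, estimating the error rate at a position from many samples and then looking for samples that exceed it, can be illustrated with a small sketch. This is not needlestack's actual statistical model, which the abstract does not describe: the median-based error rate, the binomial tail test, and the names `binom_sf`, `call_outliers`, `alpha` and `min_rate` are all assumptions made for illustration only.

```python
import math
from statistics import median


def binom_sf(k, n, p):
    """Return P(X >= k) for X ~ Binomial(n, p), summed in log space."""
    if k <= 0:
        return 1.0
    if p <= 0.0:
        return 0.0
    if p >= 1.0:
        return 1.0
    log_p, log_q = math.log(p), math.log1p(-p)
    total = 0.0
    for i in range(k, n + 1):
        # log of the binomial coefficient via lgamma avoids huge integers
        log_coef = math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
        total += math.exp(log_coef + i * log_p + (n - i) * log_q)
    return min(total, 1.0)


def call_outliers(alt_counts, depths, alpha=1e-6, min_rate=1e-5):
    """Flag samples whose alternate-read count exceeds the shared error rate.

    Hypothetical stand-in for a multi-sample caller: the error rate for one
    position and one alteration is taken as the median alternate-read
    fraction across samples (floored at min_rate), and each sample is then
    tested against that rate with a one-sided binomial test.
    """
    rates = [a / d for a, d in zip(alt_counts, depths) if d > 0]
    error_rate = max(median(rates), min_rate)
    pvalues = [
        binom_sf(a, d, error_rate) if d > 0 else 1.0
        for a, d in zip(alt_counts, depths)
    ]
    calls = [p < alpha for p in pvalues]
    return error_rate, pvalues, calls


# Six samples at one position, each covered by 2000 reads (made-up counts).
rate, pvals, calls = call_outliers([2, 1, 3, 2, 40, 1], [2000] * 6)
print(rate, calls)  # only the fifth sample is flagged
```

Using the median rather than the mean keeps a single true carrier from inflating the estimated error rate, which is the reason a robust estimator is the natural choice when errors and mutations are pooled across samples.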
Year: 2020 PMID: 32363341 PMCID: PMC7182099 DOI: 10.1093/nargab/lqaa021
Source DB: PubMed Journal: NAR Genom Bioinform ISSN: 2631-9268