Ideafix: a decision tree-based method for the refinement of variants in FFPE DNA sequencing data.

Maitena Tellaetxe-Abete, Borja Calvo, Charles Lawrie.

Abstract

Increasingly, treatment decisions for cancer patients are being made from next-generation sequencing results generated from formalin-fixed and paraffin-embedded (FFPE) biopsies. However, this material is prone to sequence artefacts that cannot be easily identified. To address this issue, we designed a machine learning-based algorithm to identify these artefacts using data from >1 600 000 variants from 27 paired FFPE and fresh-frozen breast cancer samples. Using these data, we assembled a series of variant features and evaluated the classification performance of five machine learning algorithms. Using leave-one-sample-out cross-validation, we found that XGBoost (extreme gradient boosting) and random forest obtained AUC (area under the receiver operating characteristic curve) values >0.86. Performance was further tested using two independent datasets that resulted in AUC values of 0.96, whereas a comparison with previously published tools resulted in a maximum AUC value of 0.92. The most discriminating features were read pair orientation bias, genomic context and variant allele frequency. In summary, our results show a promising future for the use of FFPE samples in molecular testing. We built the algorithm into an R package called Ideafix (DEAmination FIXing) that is freely available at https://github.com/mmaitenat/ideafix.
© The Author(s) 2021. Published by Oxford University Press on behalf of NAR Genomics and Bioinformatics.
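
The evaluation scheme named in the abstract, leave-one-sample-out cross-validation scored by AUC, can be illustrated with a minimal Python sketch. This is not the Ideafix implementation (which is an R package): the toy data, the single feature, the nearest-centroid scorer and every function name below are assumptions made for illustration only. In particular, the toy data assume that artefacts sit at low variant allele frequency; the abstract only states that this feature is discriminating.

```python
from collections import defaultdict

def auc(labels, scores):
    """AUC via the rank-sum (Mann-Whitney U) identity; ties count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def leave_one_sample_out(variants):
    """Yield (held_out, train, test); all variants of one sample are held out."""
    by_sample = defaultdict(list)
    for v in variants:
        by_sample[v["sample"]].append(v)
    for held_out in sorted(by_sample):
        train = [v for s, vs in by_sample.items() if s != held_out for v in vs]
        yield held_out, train, by_sample[held_out]

def fit_centroids(train):
    """Toy stand-in for a real classifier: mean VAF per class."""
    means = {}
    for label in (0, 1):
        vals = [v["vaf"] for v in train if v["artefact"] == label]
        means[label] = sum(vals) / len(vals)
    return means

def score(means, v):
    """Higher score = closer to the artefact centroid than to the other one."""
    return abs(v["vaf"] - means[0]) - abs(v["vaf"] - means[1])

# Hypothetical toy data: (sample id, variant allele frequency, 1 = artefact).
rows = [
    ("S1", 0.03, 1), ("S1", 0.05, 1), ("S1", 0.40, 0), ("S1", 0.35, 0),
    ("S2", 0.02, 1), ("S2", 0.30, 0), ("S2", 0.45, 0), ("S2", 0.06, 1),
    ("S3", 0.04, 1), ("S3", 0.50, 0), ("S3", 0.07, 1), ("S3", 0.25, 0),
]
variants = [{"sample": s, "vaf": f, "artefact": y} for s, f, y in rows]

# One AUC per held-out sample; the model never sees that sample in training.
fold_aucs = {}
for held_out, train, test in leave_one_sample_out(variants):
    means = fit_centroids(train)
    labels = [v["artefact"] for v in test]
    scores = [score(means, v) for v in test]
    fold_aucs[held_out] = auc(labels, scores)

print(fold_aucs)
```

Splitting by sample rather than by variant keeps all variants from one biopsy on the same side of the train/test boundary, so a per-fold AUC reflects performance on an unseen sample.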

Year:  2021        PMID: 34729472      PMCID: PMC8557387          DOI: 10.1093/nargab/lqab092

Source DB:  PubMed          Journal:  NAR Genom Bioinform        ISSN: 2631-9268

