Lennart Langouche (1), April Aralar (2), Mridu Sinha (2), Shelley M Lawrence (3,4,5), Stephanie I Fraley (2,4), Todd P Coleman (2)
1. Department of Nanoengineering, University of California, San Diego, La Jolla, CA, USA.
2. Department of Bioengineering, University of California, San Diego, La Jolla, CA, USA.
3. Department of Pediatrics, Division of Neonatal-Perinatal Medicine, University of California, San Diego, La Jolla, CA, USA.
4. Center for Microbiome Innovation, University of California, San Diego, La Jolla, CA, USA.
5. Rady Children's Hospital of San Diego, San Diego, CA, USA.
Abstract
MOTIVATION: The need to rapidly screen complex samples for a wide range of nucleic acid targets, such as those associated with infectious diseases, remains unmet. Digital High-Resolution Melt (dHRM) is an emerging technology with the potential to meet this need through broad-based, rapid nucleic acid sequence identification. Here, we set out to develop a computational framework for estimating the resolving power of dHRM technology for defined sequence profiling tasks. By deriving noise models from experimentally generated dHRM datasets and applying them to in silico predicted melt curves, we enable the production of synthetic dHRM datasets that faithfully recapitulate real-world variations arising from sample and machine variables. We then use these datasets to identify the most challenging melt curve classification tasks likely to arise for a given application and to test the performance of benchmark classifiers.
RESULTS: This toolbox enables the in silico design and testing of broad-based dHRM screening assays and the selection of optimal classifiers. For an example application of screening common human bacterial pathogens, we show that the pathogens with the most similar sequences and melt curves remain reliably identifiable in the presence of experimental noise. Further, we find that ensemble methods outperform whole series classifiers for this task and are in some cases able to resolve melt curves with single-nucleotide resolution.
AVAILABILITY: Data and code are available at https://github.com/lenlan/dHRM-noise-modeling.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
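The workflow described in the abstract (take clean predicted melt curves, perturb them with a noise model, then test how well a classifier separates the noisy curves) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the sigmoidal `melt_curve` stands in for an in silico predicted curve, the Gaussian temperature shift and additive noise in `add_noise` stand in for the experimentally derived noise models, the melting temperatures are hypothetical, and the nearest-template rule is a simple placeholder rather than one of the benchmark classifiers evaluated in the paper.

```python
import numpy as np

def melt_curve(temps, tm, width=1.0):
    """Idealized sigmoidal melt curve (assumption): helicity falls from 1 to 0 around tm."""
    return 1.0 / (1.0 + np.exp((temps - tm) / width))

def add_noise(curve_fn, temps, rng, n=50, shift_sd=0.1, amp_sd=0.01):
    """Draw n noisy replicates of a curve.

    Each replicate gets one random temperature offset (a stand-in for
    machine or sample variation) plus independent additive noise per point.
    Both noise terms are illustrative Gaussians, not fitted models.
    """
    shifts = rng.normal(0.0, shift_sd, size=(n, 1))
    clean = curve_fn(temps[None, :] - shifts)
    return clean + rng.normal(0.0, amp_sd, size=clean.shape)

def nearest_template(curves, templates):
    """Label each curve with the index of the closest clean template (Euclidean distance)."""
    dists = np.linalg.norm(curves[:, None, :] - templates[None, :, :], axis=2)
    return dists.argmin(axis=1)

rng = np.random.default_rng(0)
temps = np.linspace(70.0, 95.0, 251)   # 0.1 degree C steps
tms = [84.0, 84.5]                     # hypothetical melting temperatures of two targets
n_rep = 50

# Clean templates, one per target
templates = np.stack([melt_curve(temps, tm) for tm in tms])

# Synthetic noisy dataset: n_rep replicates per target
X = np.concatenate([
    add_noise(lambda t, tm=tm: melt_curve(t, tm), temps, rng, n=n_rep)
    for tm in tms
])
y = np.repeat(np.arange(len(tms)), n_rep)

accuracy = float((nearest_template(X, templates) == y).mean())
print(f"nearest-template accuracy: {accuracy:.2f}")
```

Moving the two hypothetical melting temperatures closer together, or increasing `shift_sd`, makes the task harder, which is the kind of resolving-power question the framework is meant to answer with realistic noise models and stronger classifiers.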