Chiann-Ling C Yeh (1), Clara J Amorosi (1), Soyeon Showman (1,2), Maitreya J Dunham (1)
1. Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA.
2. Molecular and Cellular Biology Program, University of Washington, Seattle, WA 98195, USA.
Abstract
SUMMARY: PacBio sequencing is increasingly used to characterize barcoded libraries of genetic variants. However, current approaches to resolving PacBio sequencing artifacts can leave a large number of reads incorrectly identified or unusable. Here, we developed a PacBio Read Alignment Tool (PacRAT) that improves the accuracy of barcode-variant mapping through several steps of read alignment and consensus calling. To quantify the performance of our approach, we simulated PacBio reads from eight variant libraries of various lengths and showed that PacRAT improves the accuracy of pairing barcodes with variants across these libraries. Analysis of real (non-simulated) libraries also showed that PacRAT increases the number of reads that can be used for downstream analyses.
AVAILABILITY AND IMPLEMENTATION: PacRAT is written in Python and is freely available (https://github.com/dunhamlab/PacRAT).
SUPPLEMENTARY INFORMATION: Supplemental data are available at Bioinformatics online.
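The general idea of consensus calling per barcode, mentioned in the summary, can be illustrated with a minimal sketch. This is not PacRAT's code: the function names, the `min_fraction` threshold, and the toy reads are hypothetical, and the read-alignment step is skipped by assuming that all sequences sharing a barcode are already aligned to equal length.

```python
from collections import Counter, defaultdict


def group_by_barcode(reads):
    """Group (barcode, variant_sequence) pairs by barcode."""
    groups = defaultdict(list)
    for barcode, seq in reads:
        groups[barcode].append(seq)
    return dict(groups)


def column_consensus(aligned_seqs, min_fraction=0.6):
    """Majority-vote consensus over equal-length, pre-aligned sequences.

    Returns None if the input is empty or the lengths differ (assumption:
    a real pipeline would align the reads first). Positions where no base
    reaches min_fraction are reported as 'N'.
    """
    if not aligned_seqs or len({len(s) for s in aligned_seqs}) != 1:
        return None
    consensus = []
    for column in zip(*aligned_seqs):
        base, count = Counter(column).most_common(1)[0]
        consensus.append(base if count / len(column) >= min_fraction else "N")
    return "".join(consensus)


def barcode_variant_map(reads, min_fraction=0.6):
    """Map each barcode to the consensus of its variant sequences."""
    return {
        barcode: column_consensus(seqs, min_fraction)
        for barcode, seqs in group_by_barcode(reads).items()
    }


# Toy data (hypothetical): one read for barcode AAGT carries an error.
reads = [
    ("AAGT", "ACGTAC"),
    ("AAGT", "ACGTAC"),
    ("AAGT", "ACCTAC"),
    ("CCTA", "TTGACA"),
]
mapping = barcode_variant_map(reads)
print(mapping)
```

In this toy input, the single discordant base in the third `AAGT` read is outvoted by the other two reads, so the barcode still maps to one variant sequence rather than being discarded.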