Hui Peng (1), Yi Zheng (1), Zhixun Zhao (1), Tao Liu (2,3), Jinyan Li (1)
(1) Advanced Analytics Institute, Faculty of Engineering and Information Technology, University of Technology Sydney, Broadway, Australia.
(2) Centre for Childhood Cancer Research, University of New South Wales, Kensington, Australia.
(3) Children's Cancer Institute, Sydney, Australia.
Abstract
Motivation: CRISPR/Cas9 is driving a broad range of innovative applications from basic biology to biotechnology and medicine. One of its current issues is off-target editing, which must be resolved and, ideally, completely avoided when the system is used.
Results: We developed an ensemble learning method to detect the off-target sites of a single guide RNA (sgRNA) from its thousands of genome-wide candidates. Nucleotide mismatches between on-target and off-target sites have been studied recently. We confirm that there is strong mismatch enrichment, with clear mismatch preferences, in the regions of the off-target sequences close to the 5'-end. Compared with the on-target sites, sequences of no-editing sites can also be characterized by GC composition changes and position-specific mismatch binary features. In this novel feature space, an ensemble strategy was applied to train a prediction model. The model achieved a mean area under the receiver operating characteristic curve of 0.99 and a mean area under the precision-recall curve of 0.45 in cross-validation on large datasets, outperforming state-of-the-art methods in various test scenarios. Our predicted off-target sites also correspond very well to those detected by high-throughput sequencing techniques. In particular, two case studies on selecting sgRNAs to cure hearing loss and retinal degeneration partly support the effectiveness of our method.
Availability and implementation: The Python and MATLAB versions of the source code for detecting off-target sites of a given sgRNA, together with the supplementary files, are freely available at https://github.com/penn-hui/OfftargetPredict.
Supplementary information: Supplementary data are available at Bioinformatics online.
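To make the two feature types named in the Results more concrete, here is a minimal Python sketch of how a position-specific mismatch binary vector and a GC composition change might be computed for one on-target/candidate pair. It is an illustration only and is not taken from the authors' repository: the function names (`mismatch_features`, `gc_content`, `encode_pair`), the 20-nt example sequences, and the definition of GC change as a simple difference of GC fractions are all assumptions.

```python
from typing import List


def mismatch_features(on_target: str, candidate: str) -> List[int]:
    """Return 1 at every position where the candidate differs from the on-target site."""
    if len(on_target) != len(candidate):
        raise ValueError("sequences must have the same length")
    return [int(a != b) for a, b in zip(on_target.upper(), candidate.upper())]


def gc_content(seq: str) -> float:
    """Fraction of G and C nucleotides in the sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def encode_pair(on_target: str, candidate: str) -> List[float]:
    """Concatenate the mismatch vector with the GC change (assumed: candidate minus on-target)."""
    mismatches = mismatch_features(on_target, candidate)
    gc_change = gc_content(candidate) - gc_content(on_target)
    return [float(m) for m in mismatches] + [gc_change]


# Illustrative sequences (hypothetical inputs, 5' to 3').
on_target = "GAGTCCGAGCAGAAGAAGAA"
candidate = "GAGTTAGAGCAGAAGAAGAA"
features = encode_pair(on_target, candidate)
print(features[:20])   # mismatch indicators, one per position
print(features[20])    # GC change
```

A vector of this kind could then be passed to any ensemble classifier. The specific ensemble strategy used in the paper is not described in the abstract, so it is not sketched here.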
Authors: Bogdan Kirillov; Ekaterina Savitskaya; Maxim Panov; Aleksey Y Ogurtsov; Svetlana A Shabalina; Eugene V Koonin; Konstantin V Severinov Journal: Nucleic Acids Res Date: 2022-01-25 Impact factor: 16.971
Authors: Hanspeter Naegeli; Jean-Louis Bresson; Tamas Dalmay; Ian Crawford Dewhurst; Michelle M Epstein; Leslie George Firbank; Philippe Guerche; Jan Hejatko; Francisco Javier Moreno; Ewen Mullins; Fabien Nogué; Jose Juan Sánchez Serrano; Giovanni Savoini; Eve Veromann; Fabio Veronesi; Josep Casacuberta; Andrea Gennaro; Konstantinos Paraskevopoulos; Tommaso Raffaello; Nils Rostoks Journal: EFSA J Date: 2020-11-24