
CRISPRclassify: Repeat-Based Classification of CRISPR Loci.

Matthew A Nethery, Michael Korvink, Kira S Makarova, Yuri I Wolf, Eugene V Koonin, Rodolphe Barrangou.

Abstract

Detection and classification of CRISPR-Cas systems in metagenomic data have become increasingly prevalent in recent years due to their potential for diverse applications in genome editing. Traditionally, CRISPR-Cas systems are classified through reference-based identification of proximate cas genes. Here, we present a machine learning approach for the detection and classification of CRISPR loci using repeat sequences in a cas-independent context, enabling identification of unclassified loci missed by traditional cas-based approaches. Using biological attributes of the CRISPR repeat, the core element in CRISPR arrays, and leveraging methods from natural language processing, we developed a machine learning model capable of accurate classification of CRISPR loci in an extensive set of metagenomes, resulting in an F1 measure of 0.82 across all predictions and an F1 measure of 0.97 when limiting to classifications with probabilities >0.85. Furthermore, assessing performance on novel repeats yielded an F1 measure of 0.96. Although the performance of cas-based identification will exceed that of a repeat-based approach in many cases, CRISPRclassify provides an efficient approach to classification of CRISPR loci for cases in which cas gene information is unavailable, such as metagenomes and fragmented genome assemblies.
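To make the evaluation described above concrete, here is a minimal sketch of two ideas the abstract mentions: treating a repeat sequence as overlapping k-mer "words" (one common way to apply natural language processing methods to sequences) and computing an F1 measure before and after keeping only predictions above a probability threshold. This is not the CRISPRclassify implementation. The function names, the choice of k = 3, the macro-averaged F1, and the example repeat and subtype labels are all assumptions made for illustration; the abstract does not state how features were built or how F1 was averaged.

```python
from collections import Counter


def kmer_counts(repeat, k=3):
    """Count overlapping k-mers in a repeat (hypothetical featurization, k is assumed)."""
    seq = repeat.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (the averaging scheme is an assumption)."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        scores.append(2 * precision * recall / denom if denom else 0.0)
    return sum(scores) / len(scores)


def filter_by_probability(y_true, y_pred, probs, threshold=0.85):
    """Keep only predictions whose probability exceeds the threshold."""
    kept = [(t, p) for t, p, q in zip(y_true, y_pred, probs) if q > threshold]
    return [t for t, _ in kept], [p for _, p in kept]


# Toy data, invented for illustration only (not results from the paper).
example_repeat = "GTTTTAGA"
true_labels = ["I-E", "I-E", "II-A"]
predicted = ["I-E", "II-A", "II-A"]
probabilities = [0.90, 0.60, 0.95]

features = kmer_counts(example_repeat)
f1_all = macro_f1(true_labels, predicted)
kept_true, kept_pred = filter_by_probability(true_labels, predicted, probabilities)
f1_confident = macro_f1(kept_true, kept_pred)
```

In this toy case the single low-probability prediction is also the only wrong one, so filtering raises the score. That mirrors the pattern the abstract reports (0.82 across all predictions versus 0.97 above 0.85), with the trade-off that filtered-out loci receive no classification.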

Year: 2021; PMID: 34406047; PMCID: PMC8392126; DOI: 10.1089/crispr.2021.0021

Source DB: PubMed; Journal: CRISPR J; ISSN: 2573-1599


