CRISPRcasIdentifier: Machine learning for accurate identification and classification of CRISPR-Cas systems.

Victor A Padilha, Omer S Alkhnbashi, Shiraz A Shah, André C P L F de Carvalho, Rolf Backofen.

Abstract

BACKGROUND: CRISPR-Cas genes are extraordinarily diverse and evolve rapidly when compared to other prokaryotic genes. With the rapid increase in newly sequenced archaeal and bacterial genomes, manual identification of CRISPR-Cas systems is no longer viable. Thus, an automated approach is required for advancing our understanding of the evolution and diversity of these systems and for finding new candidates for genome engineering in eukaryotic models.
RESULTS: We introduce CRISPRcasIdentifier, a new machine learning-based tool that combines regression and classification models to predict potentially missing proteins in instances of CRISPR-Cas systems and to predict their respective subtypes. In contrast to other available tools, CRISPRcasIdentifier can both detect cas genes and extract potential association rules that reveal functional modules for CRISPR-Cas systems. In our experimental benchmark on the most recently published and comprehensive CRISPR-Cas system dataset, CRISPRcasIdentifier was compared with recent state-of-the-art tools. In these experiments, CRISPRcasIdentifier achieved the best Cas protein identification and subtype classification performance.
CONCLUSIONS: Overall, our tool greatly extends the classification of CRISPR cassettes and, for the first time, predicts missing Cas proteins and association rules between Cas proteins. Additionally, we investigated the properties of CRISPR subtypes. The proposed tool relies not only on the knowledge of manual CRISPR annotation but also on models trained using machine learning.
© The Author(s) 2020. Published by Oxford University Press.
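
The abstract mentions extracting association rules between Cas proteins but does not say how the rules are mined. As a rough illustration only, the sketch below computes the standard support and confidence measures for single-protein rules of the form A -> B. The cassettes, thresholds, and function names are hypothetical and chosen for this example; they are not taken from CRISPRcasIdentifier or its dataset.

```python
from itertools import combinations

# Hypothetical toy cassettes (NOT real data): each set lists the Cas protein
# families observed in one locus.
cassettes = [
    {"cas1", "cas2", "cas9"},
    {"cas1", "cas2", "cas9", "csn2"},
    {"cas1", "cas2", "cas3"},
    {"cas1", "cas2", "cas3", "cas5"},
    {"cas9"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def pairwise_rules(transactions, min_support=0.3, min_confidence=0.9):
    """Return (antecedent, consequent, support, confidence) for rules A -> B.

    Only single-item antecedents and consequents are considered; the
    thresholds are illustrative assumptions.
    """
    items = sorted(set().union(*transactions))
    rules = []
    for a, b in combinations(items, 2):
        pair_support = support({a, b}, transactions)
        if pair_support < min_support:
            continue  # the pair co-occurs too rarely to be of interest
        for ante, cons in ((a, b), (b, a)):
            # confidence(A -> B) = support(A and B) / support(A)
            confidence = pair_support / support({ante}, transactions)
            if confidence >= min_confidence:
                rules.append((ante, cons, pair_support, confidence))
    return rules

rules = pairwise_rules(cassettes)
for ante, cons, sup, conf in rules:
    print(f"{ante} -> {cons}  support={sup:.2f}  confidence={conf:.2f}")
```

On this toy input, a rule such as cas3 -> cas1 is kept because every cassette containing cas3 also contains cas1, whereas cas9 -> cas1 is dropped because cas9 also occurs without cas1. A rule of this kind is one way a co-occurring set of proteins could be flagged as a candidate functional module.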

Keywords:  CRISPR-Cas; Cas genes; Cas proteins; machine learning

Year: 2020; PMID: 32556168; PMCID: PMC7298778; DOI: 10.1093/gigascience/giaa062

Source DB: PubMed; Journal: Gigascience; ISSN: 2047-217X; Impact factor: 6.524


