
eCAMI: simultaneous classification and motif identification for enzyme annotation.

Jing Xu, Han Zhang, Jinfang Zheng, Philippe Dovoedo, Yanbin Yin.

Abstract

MOTIVATION: Carbohydrate-active enzymes (CAZymes) are extremely important to bioenergy, human gut microbiome and plant pathogen research, and to the related industries. Here we developed a new amino acid k-mer-based CAZyme classification, motif identification and genome annotation tool using a bipartite network algorithm. Using this tool, we classified 390 CAZyme families into thousands of subfamilies, each with distinguishing k-mer peptides. These k-mers represented the characteristic motifs (in the form of a collection of conserved short peptides) of each subfamily, and thus were further used to annotate new genomes for CAZymes. This idea was also generalized to extract characteristic k-mer peptides for all the Swiss-Prot enzymes classified by EC (Enzyme Commission) numbers and applied to enzyme EC prediction.
RESULTS: This new tool was implemented as a Python package named eCAMI. Benchmark analysis of eCAMI against the state-of-the-art tools on CAZyme and enzyme EC datasets found that: (i) eCAMI has the best performance in terms of accuracy and memory use for CAZyme and enzyme EC classification and annotation; (ii) the k-mer-based tools (including PPR-Hotpep, CUPP and eCAMI) perform better than homology-based tools and deep-learning tools in enzyme EC prediction. Lastly, we confirmed that the k-mer-based tools have the unique ability to identify the characteristic k-mer peptides in the predicted enzymes.
AVAILABILITY AND IMPLEMENTATION: https://github.com/yinlabniu/eCAMI and https://github.com/zhanglabNKU/eCAMI.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
© The Author(s) 2019. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
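
The abstract's core idea, assigning a protein to a subfamily by the characteristic k-mer peptides it contains, can be illustrated with a minimal sketch. This is not eCAMI's bipartite network algorithm and does not use the eCAMI package or its API. The function names, the toy sequences, the subfamily labels, the value of k and the `min_hits` threshold are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def kmers(seq, k):
    """Return the set of all length-k substrings of an amino acid sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def distinguishing_kmers(groups, k):
    """For each group, keep the k-mers found in that group and in no other group."""
    per_group = {
        name: set().union(*(kmers(s, k) for s in seqs))
        for name, seqs in groups.items()
    }
    # Count how many groups contain each k-mer.
    owners = Counter(km for kms in per_group.values() for km in kms)
    return {
        name: {km for km in kms if owners[km] == 1}
        for name, kms in per_group.items()
    }


def annotate(seq, motifs, k, min_hits=2):
    """Assign seq to the group sharing the most k-mers with it.

    Returns (group, matched_kmers), or (None, []) when fewer than
    min_hits k-mers match (min_hits is an assumed, arbitrary cutoff).
    """
    query = kmers(seq, k)
    hits = {name: sorted(query & kms) for name, kms in motifs.items()}
    best = max(hits, key=lambda name: len(hits[name]))
    if len(hits[best]) < min_hits:
        return None, []
    return best, hits[best]


# Toy data: made-up sequences and labels, not real CAZyme families.
K = 4
groups = {
    "famA_sub1": ["MKTAYIAKQR", "MKTAYIGKQR"],
    "famB_sub1": ["GSHWDELLPQ", "GSHWDELLPN"],
}
motifs = distinguishing_kmers(groups, K)

label, matched = annotate("AAMKTAYIAA", motifs, K)
unknown_label, unknown_matched = annotate("WWWWWWWW", motifs, K)
print(label, matched)
print(unknown_label, unknown_matched)
```

Because the matched k-mers are returned alongside the label, a prediction made this way can be traced back to the short peptides that support it, which is the property the abstract attributes to k-mer-based tools.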

Year:  2020        PMID: 31794006     DOI: 10.1093/bioinformatics/btz908

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  5 in total

1.  Conserved unique peptide patterns (CUPP) online platform: peptide-based functional annotation of carbohydrate active enzymes.

Authors:  Kristian Barrett; Cameron J Hunt; Lene Lange; Anne S Meyer
Journal:  Nucleic Acids Res       Date:  2020-07-02       Impact factor: 16.971

2.  Structural basis of the strict specificity of a bacterial GH31 α-1,3-glucosidase for nigerooligosaccharides.

Authors:  Marina Ikegaya; Toshio Moriya; Naruhiko Adachi; Masato Kawasaki; Enoch Y Park; Takatsugu Miyazaki
Journal:  J Biol Chem       Date:  2022-03-12       Impact factor: 5.486

3.  Transcriptional profile of oil palm pathogen, Ganoderma boninense, reveals activation of lignin degradation machinery and possible evasion of host immune response.

Authors:  Braham Dhillon; Richard C Hamelin; Jeffrey A Rollins
Journal:  BMC Genomics       Date:  2021-05-05       Impact factor: 3.969

4.  New Method for Identifying Fungal Kingdom Enzyme Hotspots from Genome Sequences.

Authors:  Lene Lange; Kristian Barrett; Anne S Meyer
Journal:  J Fungi (Basel)       Date:  2021-03-11

5.  Multiple Profile Models Extract Features from Protein Sequence Data and Resolve Functional Diversity of Very Different Protein Families.

Authors:  R Vicedomini; J P Bouly; E Laine; A Falciatore; A Carbone
Journal:  Mol Biol Evol       Date:  2022-04-10       Impact factor: 8.800
