
SNARE-CNN: a 2D convolutional neural network architecture to identify SNARE proteins from high-throughput sequencing data.

Nguyen Quoc Khanh Le, Van-Nui Nguyen.

Abstract

Deep learning is increasingly used to solve problems across many fields, often with state-of-the-art performance. In bioinformatics, it can reduce the requirement for feature extraction while still reaching high performance. This study uses deep learning to predict SNARE proteins, which carry out one of the most vital molecular functions in life science. A functional loss of SNARE proteins has been implicated in a variety of human diseases (e.g., neurodegenerative disorders, mental illness, and cancer). Creating a precise model to identify their functions is therefore crucial for understanding these diseases and for designing drug targets. Our SNARE-CNN model, which uses two-dimensional convolutional neural networks and position-specific scoring matrix (PSSM) profiles, identified SNARE proteins with a sensitivity of 76.6%, specificity of 93.5%, accuracy of 89.7%, and Matthews correlation coefficient (MCC) of 0.7 on the cross-validation dataset. We also evaluated the model on an independent dataset, and the results show that it avoids the overfitting problem. Compared with other state-of-the-art methods, this approach achieved significant improvement in all of the metrics. This study provides an effective model for identifying SNARE proteins and a basis for further research applying deep learning in bioinformatics, especially in protein function prediction. SNARE-CNN is freely available at https://github.com/khanhlee/snare-cnn. ©2019 Le and Nguyen.
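The four metrics reported in the abstract (sensitivity, specificity, accuracy, and MCC) are standard functions of a binary confusion matrix. The following is a minimal sketch of how they are conventionally computed; it is not taken from the SNARE-CNN repository. The function name `binary_metrics` and the example counts are hypothetical and chosen only for illustration, not the study's data.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Standard binary-classification metrics from confusion-matrix counts.

    tp/fn: positives (e.g., SNARE proteins) predicted correctly/incorrectly.
    tn/fp: negatives (non-SNARE proteins) predicted correctly/incorrectly.
    """
    def safe_div(num, den):
        # Return 0.0 instead of raising when a denominator is empty.
        return num / den if den else 0.0

    sensitivity = safe_div(tp, tp + fn)           # true positive rate
    specificity = safe_div(tn, tn + fp)           # true negative rate
    accuracy = safe_div(tp + tn, tp + fp + tn + fn)
    # Matthews correlation coefficient: ranges from -1 to 1 and stays
    # informative when the classes are imbalanced.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = safe_div(tp * tn - fp * fn, denom)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }


# Hypothetical counts for illustration only (not the paper's results).
example = binary_metrics(tp=80, fp=10, tn=90, fn=20)
print(example)
```

Reporting MCC alongside accuracy is a common choice when negatives outnumber positives, because accuracy alone can look high even if few positives are recovered.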


Keywords:  Biological domain; Cancer; Deep learning; Human disease; Membrane fusion; Overfitting; Position specific scoring matrix; Protein family classification; SNARE protein function; Vesicular transport protein

Year:  2019        PMID: 33816830      PMCID: PMC7924420          DOI: 10.7717/peerj-cs.177

Source DB:  PubMed          Journal:  PeerJ Comput Sci        ISSN: 2376-5992


  11 in total

1.  SNARER: new molecular descriptors for SNARE proteins classification.

Authors:  Alessia Auriemma Citarella; Luigi Di Biasi; Michele Risi; Genoveffa Tortora
Journal:  BMC Bioinformatics       Date:  2022-04-24       Impact factor: 3.307

2.  Improving protein domain classification for third-generation sequencing reads using deep learning.

Authors:  Nan Du; Jiayu Shang; Yanni Sun
Journal:  BMC Genomics       Date:  2021-04-09       Impact factor: 3.969

3.  SNAREs-SAP: SNARE Proteins Identification With PSSM Profiles.

Authors:  Zixiao Zhang; Yue Gong; Bo Gao; Hongfei Li; Wentao Gao; Yuming Zhao; Benzhi Dong
Journal:  Front Genet       Date:  2021-12-20       Impact factor: 4.599

4.  Gutter oil detection for food safety based on multi-feature machine learning and implementation on FPGA with approximate multipliers.

Authors:  Wei Jiang; Ruiqi Chen; Yuhanxiao Ma
Journal:  PeerJ Comput Sci       Date:  2021-11-16

5.  Genetic variations analysis for complex brain disease diagnosis using machine learning techniques: opportunities and hurdles.

Authors:  Hala Ahmed; Louai Alarabi; Shaker El-Sappagh; Hassan Soliman; Mohammed Elmogy
Journal:  PeerJ Comput Sci       Date:  2021-09-20

6.  Deep Learning Algorithms Achieved Satisfactory Predictions When Trained on a Novel Collection of Anticoronavirus Molecules.

Authors:  Emna Harigua-Souiai; Mohamed Mahmoud Heinhane; Yosser Zina Abdelkrim; Oussama Souiai; Ines Abdeljaoued-Tej; Ikram Guizani
Journal:  Front Genet       Date:  2021-11-29       Impact factor: 4.599

7.  Identification of Enzymes-specific Protein Domain Based on DDE, and Convolutional Neural Network.

Authors:  Rahu Sikander; Yuping Wang; Ali Ghulam; Xianjuan Wu
Journal:  Front Genet       Date:  2021-11-30       Impact factor: 4.599

8.  A SNARE Protein Identification Method Based on iLearnPlus to Efficiently Solve the Data Imbalance Problem.

Authors:  Dong Ma; Zhihua Chen; Zhanpeng He; Xueqin Huang
Journal:  Front Genet       Date:  2022-01-28       Impact factor: 4.599

9.  Identification of the ubiquitin-proteasome pathway domain by hyperparameter optimization based on a 2D convolutional neural network.

Authors:  Rahu Sikander; Muhammad Arif; Ali Ghulam; Apilak Worachartcheewan; Maha A Thafar; Shabana Habib
Journal:  Front Genet       Date:  2022-07-22       Impact factor: 4.772

10.  VTP-Identifier: Vesicular Transport Proteins Identification Based on PSSM Profiles and XGBoost.

Authors:  Yue Gong; Benzhi Dong; Zixiao Zhang; Yixiao Zhai; Bo Gao; Tianjiao Zhang; Jingyu Zhang
Journal:  Front Genet       Date:  2022-01-03       Impact factor: 4.599

