
Incorporating Distance-Based Top-n-gram and Random Forest To Identify Electron Transport Proteins.

Xiaoqing Ru, Lihong Li, Quan Zou.

Abstract

Cellular respiration supplies living organisms with directly usable energy. During respiration, electrons are stored and transported through electron transport chains, so identifying electron transport proteins is an important research task. In protein identification, the choice of feature extraction method and classification algorithm directly affects classification performance. This study used the distance-based Top-n-gram method, which is built on the frequency profile and incorporates evolutionary information, for feature extraction. The Max-Relevance-Max-Distance algorithm was adopted for feature selection, and the first 4D features that greatly influenced the classification result were selected to form the feature data set. Finally, the random forest algorithm was used to identify electron transport proteins. Under 10-fold cross-validation, the model's sensitivity, specificity, and accuracy exceeded 85%, 80%, and 82%, respectively. On the testing set, F-measure, AUC value, and accuracy exceeded 74%, 95%, and 86%, respectively. These results indicate that the classification model built in this study is an effective tool for identifying electron transport proteins.
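The abstract reports sensitivity, specificity, accuracy, and F-measure without defining them. The following is a minimal sketch of the standard definitions, assuming binary labels in which 1 marks an electron transport protein. The names `confusion` and `metrics` and the toy label lists are illustrative assumptions, not taken from the paper. AUC is omitted because it needs ranked prediction scores rather than hard labels.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    tp: int  # true positives: positives predicted as positive
    fp: int  # false positives: negatives predicted as positive
    tn: int  # true negatives: negatives predicted as negative
    fn: int  # false negatives: positives predicted as negative


def confusion(y_true, y_pred, positive=1):
    """Count the four confusion-matrix cells for a binary problem."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return Confusion(tp, fp, tn, fn)


def _ratio(num, den):
    # Return 0.0 instead of raising when a class is absent.
    return num / den if den else 0.0


def metrics(c):
    """Standard definitions of the measures named in the abstract."""
    sensitivity = _ratio(c.tp, c.tp + c.fn)   # recall on positives
    specificity = _ratio(c.tn, c.tn + c.fp)   # recall on negatives
    accuracy = _ratio(c.tp + c.tn, c.tp + c.fp + c.tn + c.fn)
    precision = _ratio(c.tp, c.tp + c.fp)
    f_measure = _ratio(2 * precision * sensitivity, precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "f_measure": f_measure,
    }


# Toy labels (assumed for illustration only, not data from the study).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
scores = metrics(confusion(y_true, y_pred))
print(scores)
```

In a 10-fold cross-validation setting, the same counts would typically be accumulated over the ten held-out folds before the ratios are computed.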

Keywords:  ACC; AUC value; F-measure; Max-Relevance-Max-Distance; distance-based Top-n-gram method; electron transport proteins; feature extraction; feature selection; protein identification; random forest

Year:  2019        PMID: 31136183     DOI: 10.1021/acs.jproteome.9b00250

Source DB:  PubMed          Journal:  J Proteome Res        ISSN: 1535-3893            Impact factor:   4.466


  24 in total

1.  sgRNACNN: identifying sgRNA on-target activity in four crops using ensembles of convolutional neural networks.

Authors:  Mengting Niu; Yuan Lin; Quan Zou
Journal:  Plant Mol Biol       Date:  2021-01-01       Impact factor: 4.076

2.  Prediction of m5C Modifications in RNA Sequences by Combining Multiple Sequence Features.

Authors:  Lijun Dou; Xiaoling Li; Hui Ding; Lei Xu; Huaikun Xiang
Journal:  Mol Ther Nucleic Acids       Date:  2020-06-10       Impact factor: 8.886

3.  A Novel Triple Matrix Factorization Method for Detecting Drug-Side Effect Association Based on Kernel Target Alignment.

Authors:  Xiaoyi Guo; Wei Zhou; Yan Yu; Yijie Ding; Jijun Tang; Fei Guo
Journal:  Biomed Res Int       Date:  2020-05-28       Impact factor: 3.411

4.  IMPContact: An Interhelical Residue Contact Prediction Method.

Authors:  Chao Fang; Yajie Jia; Lihong Hu; Yinghua Lu; Han Wang
Journal:  Biomed Res Int       Date:  2020-03-25       Impact factor: 3.411

5.  Its2vec: Fungal Species Identification Using Sequence Embedding and Random Forest Classification.

Authors:  Chao Wang; Ying Zhang; Shuguang Han
Journal:  Biomed Res Int       Date:  2020-05-27       Impact factor: 3.411

6.  Enhancing Top-Down Proteomics Data Analysis by Combining Deconvolution Results through a Machine Learning Strategy.

Authors:  Sean J McIlwain; Zhijie Wu; Molly Wetzel; Daniel Belongia; Yutong Jin; Kent Wenger; Irene M Ong; Ying Ge
Journal:  J Am Soc Mass Spectrom       Date:  2020-04-08       Impact factor: 3.262

7.  Predicting ATP-Binding Cassette Transporters Using the Random Forest Method.

Authors:  Ruiyan Hou; Lida Wang; Yi-Jun Wu
Journal:  Front Genet       Date:  2020-03-25       Impact factor: 4.599

8.  STS-NLSP: A Network-Based Label Space Partition Method for Predicting the Specificity of Membrane Transporter Substrates Using a Hybrid Feature of Structural and Semantic Similarity.

Authors:  Xiangeng Wang; Xiaolei Zhu; Mingzhi Ye; Yanjing Wang; Cheng-Dong Li; Yi Xiong; Dong-Qing Wei
Journal:  Front Bioeng Biotechnol       Date:  2019-11-06

9.  PSBP-SVM: A Machine Learning-Based Computational Identifier for Predicting Polystyrene Binding Peptides.

Authors:  Chaolu Meng; Yang Hu; Ying Zhang; Fei Guo
Journal:  Front Bioeng Biotechnol       Date:  2020-03-31

10.  iRNA5hmC: The First Predictor to Identify RNA 5-Hydroxymethylcytosine Modifications Using Machine Learning.

Authors:  Yuan Liu; Dasheng Chen; Ran Su; Wei Chen; Leyi Wei
Journal:  Front Bioeng Biotechnol       Date:  2020-03-31
