Systematic analysis and prediction of type IV secreted effector proteins by machine learning approaches.

Jiawei Wang, Bingjiao Yang, Yi An, Tatiana Marquez-Lago, André Leier, Jonathan Wilksch, Qingyang Hong, Yang Zhang, Morihiro Hayashida, Tatsuya Akutsu, Geoffrey I Webb, Richard A Strugnell, Jiangning Song, Trevor Lithgow.

Abstract

In the course of infecting their hosts, pathogenic bacteria secrete numerous effectors, namely, bacterial proteins that pervert host cell biology. Many Gram-negative bacteria, including context-dependent human pathogens, use a type IV secretion system (T4SS) to translocate effectors directly into the cytosol of host cells. Various type IV secreted effectors (T4SEs) have been experimentally validated to play crucial roles in virulence by manipulating host cell gene expression and other processes. Consequently, the identification of novel effector proteins is an important step in increasing our understanding of host-pathogen interactions and bacterial pathogenesis. Here, we train and compare six machine learning models, namely, Naïve Bayes (NB), K-nearest neighbor (KNN), logistic regression (LR), random forest (RF), support vector machine (SVM) and multilayer perceptron (MLP), for the identification of T4SEs using 10 types of selected features and 5-fold cross-validation. Our study shows that: (1) including different but complementary features generally enhances the performance of T4SE prediction; (2) ensemble models, obtained by integrating individual single-feature models, exhibit significantly improved predictive performance; and (3) the 'majority voting strategy' led to more stable and accurate classification when used to combine distinct single-feature models into an ensemble learning model. We further developed a new method to effectively predict T4SEs, Bastion4 (Bacterial secretion effector predictor for T4SS), and we show that our ensemble classifier clearly outperforms two recent prediction tools. In summary, we developed a state-of-the-art T4SE predictor by conducting a comprehensive performance evaluation of different machine learning algorithms along with a detailed analysis of single- and multi-feature selections.
© The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
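
The 'majority voting strategy' mentioned in the abstract can be illustrated with a minimal sketch. This is not the Bastion4 implementation: the function name `majority_vote`, the model names, the tie-breaking rule and the example labels are all assumptions made for illustration, and the abstract does not specify how ties or probability scores are handled.

```python
from typing import Dict, List


def majority_vote(predictions: Dict[str, List[int]]) -> List[int]:
    """Combine binary labels from several single-feature models by majority vote.

    `predictions` maps a (hypothetical) model name to its 0/1 labels, one per
    protein, in the same protein order for every model. A protein is labelled
    1 (effector) only if more than half of the models vote 1; ties therefore
    fall to 0, which is an arbitrary choice made for this sketch.
    """
    models = list(predictions.values())
    if not models:
        raise ValueError("need at least one model")
    n_proteins = len(models[0])
    if any(len(labels) != n_proteins for labels in models):
        raise ValueError("all models must label the same proteins")

    ensemble = []
    for votes in zip(*models):  # one tuple of votes per protein
        positives = sum(votes)
        ensemble.append(1 if positives * 2 > len(votes) else 0)
    return ensemble


# Illustrative labels only; these are not data from the paper.
single_feature_predictions = {
    "feature_a_model": [1, 0, 1, 0],
    "feature_b_model": [1, 1, 0, 0],
    "feature_c_model": [1, 0, 1, 1],
}
ensemble_labels = majority_vote(single_feature_predictions)
print(ensemble_labels)
```

Using an odd number of single-feature models avoids ties altogether; with an even number, the tie-breaking rule becomes a design decision that affects sensitivity and specificity.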

Keywords:  bioinformatics; comprehensive performance evaluation; feature analysis; machine learning; sequence analysis; type IV secreted effector

Year:  2019        PMID: 29186295      PMCID: PMC6585386          DOI: 10.1093/bib/bbx164

Source DB:  PubMed          Journal:  Brief Bioinform        ISSN: 1467-5463            Impact factor:   11.622


  18 in total

1.  Bastion3: a two-layer ensemble predictor of type III secreted effectors.

Authors:  Jiawei Wang; Jiahui Li; Bingjiao Yang; Ruopeng Xie; Tatiana T Marquez-Lago; André Leier; Morihiro Hayashida; Tatsuya Akutsu; Yanju Zhang; Kuo-Chen Chou; Joel Selkrig; Tieli Zhou; Jiangning Song; Trevor Lithgow
Journal:  Bioinformatics       Date:  2019-06-01       Impact factor: 6.937

2.  Bastion6: a bioinformatics approach for accurate prediction of type VI secreted effectors.

Authors:  Jiawei Wang; Bingjiao Yang; André Leier; Tatiana T Marquez-Lago; Morihiro Hayashida; Andrea Rocker; Yanju Zhang; Tatsuya Akutsu; Kuo-Chen Chou; Richard A Strugnell; Jiangning Song; Trevor Lithgow
Journal:  Bioinformatics       Date:  2018-08-01       Impact factor: 6.937

3.  PaCRISPR: a server for predicting and visualizing anti-CRISPR proteins.

Authors:  Jiawei Wang; Wei Dai; Jiahui Li; Ruopeng Xie; Rhys A Dunstan; Christopher Stubenrauch; Yanju Zhang; Trevor Lithgow
Journal:  Nucleic Acids Res       Date:  2020-07-02       Impact factor: 16.971

4.  ATGPred-FL: sequence-based prediction of autophagy proteins with feature representation learning.

Authors:  Shihu Jiao; Zheng Chen; Lichao Zhang; Xun Zhou; Lei Shi
Journal:  Amino Acids       Date:  2022-03-14       Impact factor: 3.520

5.  ASPIRER: a new computational approach for identifying non-classical secreted proteins based on deep learning.

Authors:  Xiaoyu Wang; Fuyi Li; Jing Xu; Jia Rong; Geoffrey I Webb; Zongyuan Ge; Jian Li; Jiangning Song
Journal:  Brief Bioinform       Date:  2022-03-10       Impact factor: 13.994

6.  Computational analysis and prediction of lysine malonylation sites by exploiting informative features in an integrative machine-learning framework.

Authors:  Yanju Zhang; Ruopeng Xie; Jiawei Wang; André Leier; Tatiana T Marquez-Lago; Tatsuya Akutsu; Geoffrey I Webb; Kuo-Chen Chou; Jiangning Song
Journal:  Brief Bioinform       Date:  2019-11-27       Impact factor: 11.622

7.  Protein functional annotation of simultaneously improved stability, accuracy and false discovery rate achieved by a sequence-based deep learning.

Authors:  Jiajun Hong; Yongchao Luo; Yang Zhang; Junbiao Ying; Weiwei Xue; Tian Xie; Lin Tao; Feng Zhu
Journal:  Brief Bioinform       Date:  2020-07-15       Impact factor: 11.622

8.  iT4SE-EP: Accurate Identification of Bacterial Type IV Secreted Effectors by Exploring Evolutionary Features from Two PSI-BLAST Profiles.

Authors:  Haitao Han; Chenchen Ding; Xin Cheng; Xiuzhi Sang; Taigang Liu
Journal:  Molecules       Date:  2021-04-24       Impact factor: 4.411

9.  SDM6A: A Web-Based Integrative Machine-Learning Framework for Predicting 6mA Sites in the Rice Genome.

Authors:  Shaherin Basith; Balachandran Manavalan; Tae Hwan Shin; Gwang Lee
Journal:  Mol Ther Nucleic Acids       Date:  2019-08-16       Impact factor: 8.886

10.  Using an optimal set of features with a machine learning-based approach to predict effector proteins for Legionella pneumophila.

Authors:  Zhila Esna Ashari; Kelly A Brayton; Shira L Broschat
Journal:  PLoS One       Date:  2019-01-25       Impact factor: 3.240
