
BERT4Bitter: a bidirectional encoder representations from transformers (BERT)-based model for improving the prediction of bitter peptides.

Phasit Charoenkwan, Chanin Nantasenamat, Md Mehedi Hasan, Balachandran Manavalan, Watshara Shoombuatong.

Abstract

MOTIVATION: The identification of bitter peptides through experimental approaches is an expensive and time-consuming endeavor. Due to the huge number of newly available peptide sequences in the post-genomic era, the development of automated computational models for the identification of novel bitter peptides is highly desirable.
RESULTS: In this work, we present BERT4Bitter, a bidirectional encoder representations from transformers (BERT)-based model for predicting bitter peptides directly from their amino acid sequence without using any structural information. To the best of our knowledge, this is the first time a BERT-based model has been employed to identify bitter peptides. Compared to widely used machine learning models, BERT4Bitter achieved the best performance, with accuracies of 0.861 and 0.922 for cross-validation and independent tests, respectively. Furthermore, extensive empirical benchmarking experiments on the independent dataset demonstrated that BERT4Bitter clearly outperformed the existing method, with improvements of >8% in accuracy and >16% in Matthews correlation coefficient, highlighting the effectiveness and robustness of BERT4Bitter. We believe that the BERT4Bitter method proposed herein will be a useful tool for rapidly screening and identifying novel bitter peptides for drug development and nutritional research.
AVAILABILITY: The user-friendly web server of the proposed BERT4Bitter is freely accessible at: http://pmlab.pythonanywhere.com/BERT4Bitter.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
© The Author(s) (2021). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
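
The abstract reports performance as accuracy and Matthews correlation coefficient (MCC). As a reading aid, the following minimal sketch shows how these two metrics are conventionally computed from the counts of a binary confusion matrix. The function names and the example counts are illustrative assumptions; they are not taken from the paper or from the BERT4Bitter code.

```python
import math


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all predictions that are correct."""
    total = tp + tn + fp + fn
    return (tp + tn) / total


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient for a binary classifier.

    Returns 0.0 when the denominator is zero (a common convention
    when any row or column of the confusion matrix is empty).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts for illustration only (NOT results from the paper):
# bitter peptides are the positive class, non-bitter the negative class.
tp, tn, fp, fn = 45, 40, 10, 5

print(f"accuracy = {accuracy(tp, tn, fp, fn):.3f}")
print(f"MCC      = {mcc(tp, tn, fp, fn):.3f}")
```

Unlike accuracy, MCC uses all four confusion-matrix cells and ranges from -1 to 1, which is one reason it is often reported alongside accuracy for classifiers of this kind.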

Year:  2021        PMID: 33638635     DOI: 10.1093/bioinformatics/btab133

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  14 in total

1.  BERT6mA: prediction of DNA N6-methyladenine site using deep learning-based approaches.

Authors:  Sho Tsukiyama; Md Mehedi Hasan; Hong-Wen Deng; Hiroyuki Kurata
Journal:  Brief Bioinform       Date:  2022-03-10       Impact factor: 11.622

2.  STALLION: a stacking-based ensemble learning framework for prokaryotic lysine acetylation site prediction.

Authors:  Shaherin Basith; Gwang Lee; Balachandran Manavalan
Journal:  Brief Bioinform       Date:  2022-01-17       Impact factor: 11.622

3.  GeMI: interactive interface for transformer-based Genomic Metadata Integration.

Authors:  Giuseppe Serna Garcia; Michele Leone; Anna Bernasconi; Mark J Carman
Journal:  Database (Oxford)       Date:  2022-06-03       Impact factor: 4.462

Review 4.  Empirical comparison and analysis of machine learning-based predictors for predicting and analyzing of thermophilic proteins.

Authors:  Phasit Charoenkwan; Nalini Schaduangrat; Md Mehedi Hasan; Mohammad Ali Moni; Pietro Lió; Watshara Shoombuatong
Journal:  EXCLI J       Date:  2022-03-02       Impact factor: 4.022

5.  nhKcr: a new bioinformatics tool for predicting crotonylation sites on human nonhistone proteins based on deep learning.

Authors:  Yong-Zi Chen; Zhuo-Zhi Wang; Yanan Wang; Guoguang Ying; Zhen Chen; Jiangning Song
Journal:  Brief Bioinform       Date:  2021-11-05       Impact factor: 11.622

6.  A novel sequence-based predictor for identifying and characterizing thermophilic proteins using estimated propensity scores of dipeptides.

Authors:  Phasit Charoenkwan; Warot Chotpatiwetchkul; Vannajan Sanghiran Lee; Chanin Nantasenamat; Watshara Shoombuatong
Journal:  Sci Rep       Date:  2021-12-10       Impact factor: 4.379

Review 7.  Large-scale comparative review and assessment of computational methods for phage virion proteins identification.

Authors:  Muhammad Kabir; Chanin Nantasenamat; Sakawrat Kanthawong; Phasit Charoenkwan; Watshara Shoombuatong
Journal:  EXCLI J       Date:  2022-01-03       Impact factor: 4.068

Review 8.  Representation learning applications in biological sequence analysis.

Authors:  Hitoshi Iuchi; Taro Matsutani; Keisuke Yamada; Natsuki Iwano; Shunsuke Sumi; Shion Hosoda; Shitao Zhao; Tsukasa Fukunaga; Michiaki Hamada
Journal:  Comput Struct Biotechnol J       Date:  2021-05-23       Impact factor: 7.271

9.  PredNTS: Improved and Robust Prediction of Nitrotyrosine Sites by Integrating Multiple Sequence Features.

Authors:  Andi Nur Nilamyani; Firda Nurul Auliah; Mohammad Ali Moni; Watshara Shoombuatong; Md Mehedi Hasan; Hiroyuki Kurata
Journal:  Int J Mol Sci       Date:  2021-03-08       Impact factor: 5.923

10.  Identification of Helicobacter pylori Membrane Proteins Using Sequence-Based Features.

Authors:  Mujiexin Liu; Hui Chen; Dong Gao; Cai-Yi Ma; Zhao-Yue Zhang
Journal:  Comput Math Methods Med       Date:  2022-01-12       Impact factor: 2.238

