Muhammad Kabir1, Chanin Nantasenamat2, Sakawrat Kanthawong3, Phasit Charoenkwan4, Watshara Shoombuatong2.
Abstract
Phage virion proteins (PVPs) are effective at recognizing and binding to host cell receptors while having no deleterious effects on human or animal cells. Understanding their functional mechanisms is regarded as a critical goal that will aid in rational antibacterial drug discovery and development. Although high-throughput experimental methods for identifying PVPs are considered the gold standard for exploring crucial PVP features, these procedures are frequently time-consuming and labor-intensive. Thus far, more than ten sequence-based predictors have been established for the in silico identification of PVPs in conjunction with traditional experimental approaches. As a result, a revised and more thorough assessment is highly desirable. With this purpose in mind, we first conduct a thorough survey and evaluation of 13 state-of-the-art PVP predictors. These predictors can be classified into three groups according to the type of machine learning (ML) algorithm employed (i.e., traditional ML-based methods, ensemble-based methods and deep learning-based methods). Subsequently, we explore which factors are important for building more accurate and stable predictors, including training/independent datasets, feature encoding algorithms, feature selection methods, core algorithms, performance evaluation metrics/strategies and web servers. Finally, we provide insights and future perspectives for the design and development of new and more effective computational approaches for the detection and characterization of PVPs.
Keywords: bioinformatics; classification; feature representation; feature selection; machine learning; phage virion protein
Year: 2022 PMID: 35145365 PMCID: PMC8822302 DOI: 10.17179/excli2021-4411
Source DB: PubMed Journal: EXCLI J ISSN: 1611-2156 Impact factor: 4.068
Table 1. A comprehensive list of current PVP predictors reviewed in this study
Table 2. A summary of training and independent test datasets used in PVP predictors
Table 3. Different types of features employed for developing the PVP predictors
Table 4. Cross-validation results for different PVP predictors evaluated on the Feng2013 dataset
Table 5. Independent test results from different PVP predictors as evaluated on Manavalan2018 and Charoenkwan2020_2.0 datasets
Figure 1. Performance evaluation on the Feng2013 dataset as deduced from the 10-fold cross-validation test
Figure 2. Performance evaluation on the Manavalan2018 (A) and Charoenkwan2020_2.0 (B) datasets as deduced from the independent test