Shihu Jiao, Lei Xu, Ying Ju.
Abstract
Drug resistance in pathogenic bacteria has become increasingly serious in recent years because of antibiotic overuse. Researchers have found that cell wall lyases are effective antibacterial agents that can specifically recognize target bacteria and degrade bacterial peptidoglycan. Identifying lyases with traditional wet-lab experiments is usually expensive, time-consuming and laborious. There is therefore an urgent need for computational prediction tools that identify lyases quickly and accurately. In this paper, a new predictor, CWLy-RF, is proposed based on the random forest (RF) algorithm to identify cell wall lyases. In this method, we combined three feature sets, namely 400D, 188D and the composition of k-spaced amino acid group pairs, using a mixed-feature representation. We then used the information gain method to select the top 100 features, which improved the feature representation, and trained a predictive model using RF. The model was evaluated using 10-fold cross-validation and achieved an accuracy of 96.09%, an AUC of 0.993, an MCC of 0.922, a sensitivity of 94.92% and a specificity of 97.32%. These results indicate that the proposed predictor, CWLy-RF, outperforms other recent models, and we hope it will become an effective and useful tool for identifying lyases.
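For readers unfamiliar with the reported metrics, the following minimal sketch shows how accuracy, sensitivity, specificity and MCC are conventionally derived from a binary confusion matrix. It is not the authors' code: the function name `binary_metrics` and the example counts are hypothetical and chosen only for illustration, and the sketch assumes both classes are present in the data.

```python
import math


def binary_metrics(tp, fn, tn, fp):
    """Compute standard binary-classification metrics from confusion counts.

    tp/fn: positives predicted correctly / missed
    tn/fp: negatives predicted correctly / wrongly flagged
    Assumes at least one positive and one negative sample.
    """
    total = tp + fn + tn + fp
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn)  # true positive rate (recall)
    specificity = tn / (tn + fp)  # true negative rate
    # Matthews correlation coefficient; defined as 0 when the denominator is 0.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "mcc": mcc,
    }


# Hypothetical counts for illustration only; NOT taken from the paper.
example = binary_metrics(tp=90, fn=10, tn=95, fp=5)
for name, value in example.items():
    print(f"{name}: {value:.4f}")
```

In a 10-fold cross-validation setting, these counts would typically be accumulated over the held-out predictions of all ten folds before the metrics are computed.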
Keywords: Cell wall lyases; Feature selection; Mixed-feature method; Random forest; Unbalanced data
Year: 2021 PMID: 34186189 DOI: 10.1016/j.ygeno.2021.06.038
Source DB: PubMed Journal: Genomics ISSN: 0888-7543 Impact factor: 5.736