Han Yu Zhao1, Chao Che1, Bo Jin2, Xiao Peng Wei3.
Abstract
The interaction between viral proteins and small-molecule compounds is the basis of drug design. Identifying viral proteins from their amino acid sequences is therefore a fundamental challenge in the field of biopharmaceuticals. Traditional prediction methods suffer from the data imbalance problem and require long computation times. To this end, this paper proposes a deep learning framework for viral protein identification. In the framework, we employ a Temporal Convolutional Network (TCN) instead of a Recurrent Neural Network (RNN) for feature extraction to improve computational efficiency. We also customize the cost-sensitive loss function of the TCN and introduce the misclassification cost of training samples into the weight update of the Gradient Boosting Decision Tree (GBDT) to address the data imbalance problem. Experimental results show that our framework not only outperforms traditional data imbalance methods but also greatly reduces computation time while slightly improving performance.
Keywords: GBDT; TCN; data imbalance; deep learning; viral protein identifying
Year: 2019 PMID: 30947439 DOI: 10.3934/mbe.2019081
Source DB: PubMed Journal: Math Biosci Eng ISSN: 1547-1063 Impact factor: 2.080
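
The abstract mentions a cost-sensitive loss function for handling data imbalance but does not give its form. As a rough illustration of the general idea only, the sketch below shows a class-weighted binary cross-entropy in NumPy. The function names `cost_sensitive_bce` and `inverse_frequency_costs`, and the inverse-frequency choice of costs, are assumptions made for this example and are not taken from the paper.

```python
import numpy as np


def inverse_frequency_costs(y_true):
    """Return (cost_pos, cost_neg) inversely proportional to class frequency.

    Assumption for this sketch: labels are 0/1 and both classes are present.
    """
    y = np.asarray(y_true)
    n_pos = int((y == 1).sum())
    n_neg = int((y == 0).sum())
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present to derive costs")
    n = n_pos + n_neg
    return n / (2.0 * n_pos), n / (2.0 * n_neg)


def cost_sensitive_bce(y_true, p_pred, cost_pos=1.0, cost_neg=1.0, eps=1e-12):
    """Mean binary cross-entropy with a separate cost per class.

    Errors on positive samples are scaled by cost_pos and errors on
    negative samples by cost_neg (hypothetical parameters for illustration).
    """
    y = np.asarray(y_true, dtype=float)
    # Clip probabilities so that log() never receives exactly 0 or 1.
    p = np.clip(np.asarray(p_pred, dtype=float), eps, 1.0 - eps)
    per_sample = -(cost_pos * y * np.log(p)
                   + cost_neg * (1.0 - y) * np.log(1.0 - p))
    return float(per_sample.mean())


if __name__ == "__main__":
    labels = [1, 0, 0, 0]            # one minority-class sample out of four
    probs = [0.1, 0.1, 0.1, 0.1]     # a model that always predicts "negative"
    c_pos, c_neg = inverse_frequency_costs(labels)
    print("plain loss:   ", cost_sensitive_bce(labels, probs))
    print("weighted loss:", cost_sensitive_bce(labels, probs, c_pos, c_neg))
```

With these costs, a model that ignores the minority class is penalized more heavily than under the unweighted loss, which is the effect a cost-sensitive objective is meant to produce. How the paper sets its misclassification costs, and how they enter the GBDT weight update, is not specified in the abstract.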