Jael Sanyanda Wekesa, Jun Meng, Yushi Luan.
Abstract
Long non-coding RNAs (lncRNAs) play key roles in regulating cellular biological processes through diverse molecular mechanisms, including binding to RNA-binding proteins. Most plant lncRNAs are functionally uncharacterized, so accurate prediction of plant lncRNA-protein interactions is imperative for subsequent functional studies. We present an integrative model, DRPLPI, which is distinguished by predicting through multi-feature fusion. It uses structural features and four groups of sequence features: tri-nucleotide composition, gapped k-mer, recursive complement and binary profile. We design a multi-head self-attention long short-term memory encoder-decoder network to extract generative high-level features. To obtain robust results, DRPLPI combines categorical boosting and extra trees into a single meta-learner. Experiments on Zea mays and Arabidopsis thaliana obtained areas under the precision-recall curve (AUPRC) of 0.9820 and 0.9652, respectively. The proposed method shows a significant improvement in prediction performance over existing state-of-the-art methods.
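Three of the sequence feature groups named in the abstract can be illustrated with a minimal Python sketch. This is not the DRPLPI implementation: the function names, the overlapping-window counting, the single-gap dinucleotide form of the gapped k-mer, and the one-hot form of the binary profile are all assumptions made for illustration. The abstract does not define the recursive complement feature, so it is left out.

```python
from itertools import product

ALPHABET = "ACGU"  # assumed RNA alphabet; T is mapped to U below


def _normalize(seq):
    """Upper-case the sequence and treat DNA-style T as U."""
    return seq.upper().replace("T", "U")


def trinucleotide_composition(seq):
    """Relative frequency of each of the 64 overlapping 3-mers (assumed definition)."""
    seq = _normalize(seq)
    kmers = ["".join(p) for p in product(ALPHABET, repeat=3)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    for i in range(len(seq) - 2):
        tri = seq[i:i + 3]
        if tri in counts:  # skip windows containing ambiguous bases
            counts[tri] += 1
            total += 1
    return [counts[k] / total if total else 0.0 for k in kmers]


def gapped_dimer_composition(seq, gap=1):
    """Relative frequency of base pairs separated by `gap` positions.

    This is one simple variant of a gapped k-mer (hypothetical choice):
    16 features, one per ordered pair of bases.
    """
    seq = _normalize(seq)
    pairs = ["".join(p) for p in product(ALPHABET, repeat=2)]
    counts = dict.fromkeys(pairs, 0)
    total = 0
    for i in range(len(seq) - gap - 1):
        pair = seq[i] + seq[i + gap + 1]
        if pair in counts:
            counts[pair] += 1
            total += 1
    return [counts[p] / total if total else 0.0 for p in pairs]


def binary_profile(seq):
    """One-hot encode each base as 4 bits; unknown bases become all zeros."""
    seq = _normalize(seq)
    profile = []
    for base in seq:
        profile.extend(1 if base == ref else 0 for ref in ALPHABET)
    return profile


def fuse_features(seq, gap=1):
    """Concatenate the fixed-length feature groups into one vector."""
    return trinucleotide_composition(seq) + gapped_dimer_composition(seq, gap)
```

Here `fuse_features` concatenates only the fixed-length groups (64 + 16 values); the binary profile grows with sequence length, so a real pipeline would need to truncate or pad it before fusion, which is a design choice the abstract does not specify.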
Keywords: Deep learning; Plants; Prediction; Secondary structure features; lncRNA–protein interaction
Year: 2020 PMID: 32437848 DOI: 10.1016/j.ygeno.2020.05.005
Source DB: PubMed Journal: Genomics ISSN: 0888-7543 Impact factor: 5.736