
Opportunities and Challenges of Predictive Approaches for the Non-coding RNA in Plants.

Dong Xu, Wenya Yuan, Chunjie Fan, Bobin Liu, Meng-Zhu Lu, Jin Zhang.

Keywords:  bioinformatics tools; deep learning; experimental technology; miRNA; non-coding RNA

Year:  2022        PMID: 35498708      PMCID: PMC9048598          DOI: 10.3389/fpls.2022.890663

Source DB:  PubMed          Journal:  Front Plant Sci        ISSN: 1664-462X            Impact factor:   6.627



Introduction

Non-coding RNAs (ncRNAs) are key regulatory RNAs with limited coding potential (Liu et al., 2015). To date, a variety of ncRNAs have been identified in plants. They can be classified into three groups according to sequence length: small RNAs shorter than 50 nucleotides (nt), such as microRNAs (miRNAs); long non-coding RNAs (lncRNAs), defined by an arbitrary cutoff of more than 200 nt; and intermediate-size ncRNAs, whose length falls between those of small RNAs and lncRNAs (Wang et al., 2014). In addition, back-splicing produces a class of covalently closed RNA molecules called exonic circular RNAs (circRNAs). Different types of ncRNAs act through diverse mechanisms. miRNAs typically bind the transcripts of their target genes and trigger their degradation at the post-transcriptional level (Chipman and Pasquinelli, 2019). lncRNAs, in contrast, function at the transcriptional, post-transcriptional, and epigenetic levels by interacting with macromolecules (Wu et al., 2020), while circRNAs can act as miRNA or protein sponges or regulate alternative splicing or transcription (Lai et al., 2018). Together, these ncRNAs form a complex regulatory network that controls plant growth, development, and stress responses. For instance, Vvi-miPEP171d1 regulates adventitious root formation in grapevine (Chen et al., 2020), and miRNA frameworks have been shown to be important for flower induction in apple (Fan et al., 2018). Beyond regulating processes within a single plant, ncRNAs also play a role in plant-to-plant communication: exogenous miR399 and miR156 can trigger RNA interference to repress the expression of PHOSPHATE OVERACCUMULATOR 2 (PHO2) and SQUAMOSA-PROMOTER BINDING PROTEIN-LIKE 9 (SPL9) in plants (Betti et al., 2021). Moreover, ncRNAs can even act as a bridge between plants and other species.
For example, two plant viruses, barley yellow dwarf virus and red clover necrotic mosaic virus, express small subgenomic RNAs (sgRNAs) that attenuate host translation by binding the translation initiation factor eIF4G (Miller et al., 2016). Expressing double-stranded RNA (dsRNA) in maize can effectively reduce feeding damage by western corn rootworm by triggering RNA interference in the insect (Baum et al., 2007). In summary, ncRNAs play important roles in plant activities, and their regulatory functions underpin plant growth, development, and survival. The regulatory roles of ncRNAs also make them important tools for adjusting gene expression and studying gene function. For example, the artificial miRNA pAmiRNA156h-PDSh can effectively reduce the expression of phytoene desaturase (MdPDS) in apple (Charrier et al., 2019). Overexpression of amiRNA-319a-HaEcR in tomato can effectively silence the ECDYSONE RECEPTOR (HaEcR) gene and reduce the survival of Helicoverpa armigera (Yogindran and Rajam, 2021). From this point of view, exploring new ncRNAs and elucidating their regulatory mechanisms will have a profound impact on plant research. Here, we focus on technologies for identifying ncRNAs in plants, including bioinformatics tools and experimental technologies.
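The length-based grouping described above can be written as a small rule. The sketch below is illustrative only: the function name is hypothetical, and assigning sequences of exactly 50 nt or exactly 200 nt to the intermediate group is an assumption, since the text gives only "<50 nt" and "longer than 200 nt".

```python
def classify_by_length(length_nt: int) -> str:
    """Assign an ncRNA to one of the three length-based groups.

    Thresholds follow the text: small RNAs < 50 nt, lncRNAs > 200 nt.
    Boundary handling (exactly 50 or 200 nt) is an assumption.
    """
    if length_nt <= 0:
        raise ValueError("length must be a positive number of nucleotides")
    if length_nt < 50:
        return "small RNA"
    if length_nt > 200:
        return "lncRNA"
    return "intermediate-size ncRNA"


# Example lengths chosen for illustration only.
for n in (21, 120, 1500):
    print(n, classify_by_length(n))
```

Length alone does not identify circRNAs, which are defined by their covalently closed structure rather than by size.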

Bioinformatics Tools

The release of genomic information for an increasing number of plant species provides a key basis for the discovery of novel ncRNAs. In addition, knowledge about the function, structure, and conservation of ncRNAs is accumulating, which helps to distinguish different types of ncRNAs. Since ncRNAs were first reported, at least 39 bioinformatics websites and software packages have been established, of which 40.5% (13 websites and 4 software packages) were released in the past 5 years (Table 1). Furthermore, 41 deep learning models were developed in the past 3 years (Table 1). These tools can be divided into three classes.
Table 1

Bioinformatics tools in ncRNAs analysis.

(1) Websites and software (a)
Tools | Websites | Main function
(1.1) Database
PLncDB | http://www.tobaccodb.org/plncdb/ | Long non-coding RNA database
Rfam | http://rfam.xfam.org/search | RNA sequence-family database
PlantcircBase | http://ibi.zju.edu.cn/plantcircbase/index.php | Predict circRNAs
RNAcentral | https://rnacentral.org/ | ncRNA database
PmiRExAt | http://pmirexat.nabi.res.in/ | miRNA-expression database
NONCODE | http://www.noncode.org/index.php | ncRNA database
PNRD | http://structuralbiology.cau.edu.cn/PNRD/index.php | miRNA database
scaRNAbase | http://gene.fudan.edu.cn/snoRNAbase.nsf | sno/scaRNA database
(1.2) Prediction tools
miRDeep-P2 | https://sourceforge.net/projects/mirdp2/ | Analyze miRNA transcriptome
miRBase | https://www.mirbase.org/search.shtml | Contain miRNAs and precursors
psRNATarget | https://www.zhaolab.org/psRNATarget/analysis | Predict target genes of small RNA
TarBase | http://carolina.imis.athena-innovation.gr/diana_tools/web/index.php?r=tarbasev8%2Findex | miRNA-target interactions
PeTMbase | http://tools.ibg.deu.edu.tr/petmbase/ | miRNA-target mimics
RNAComposer | http://rnacomposer.ibch.poznan.pl/ | Predict 3D structure of ncRNA
COME | https://github.com/lulab/COME | Annotate lncRNAs
comPARE | https://mpss.danforthcenter.org/tools/mirna_apps/comPARE.php | Predict miRNAs and their targets
comTAR | http://rnabiology.ibr-conicet.gov.ar/comtar/ | Evolutionary analysis of miRNA and target
plantDARIO | http://plantdario.bioinf.uni-leipzig.de/index.py | Predict ncRNA from RNA-seq data
miRNEST | http://rhesus.amu.edu.pl/mirnest/copy/ | miRNAs and targets
PhasiRNAnalyzer | https://cbi.njau.edu.cn/PPSA/ | Identify phasiRNAs and their target genes
miRTarBase | https://mirtarbase.cuhk.edu.cn/miRTarBase/miRTarBase_2019/php/index.php | Interaction between miRNAs and their target genes
NPInter | http://bigdata.ibp.ac.cn/npinter4/ | Interaction between ncRNAs and biomolecules
RNAshapes | https://bibiserv.cebitec.uni-bielefeld.de/rnashapes | Predict ncRNA structure
RNAcon | http://crdd.osdd.net/raghava/rnacon/ | Predict and classify the ncRNAs
miR-PREFeR | https://github.com/hangelwen/miR-PREFeR | Predict miRNAs and precursors
CNCI | https://github.com/www-bioinfo-org/CNCI | Classify lncRNAs
CPAT | http://rna-cpat.sourceforge.net/ | Annotate lncRNAs
Infernal | http://eddylab.org/infernal/ | Predict ncRNA-secondary sequences
PsRobot | http://omicslab.genetics.ac.cn/psRobot/index.php | Predict stem-loop structure and target of ncRNA
NUPACK | http://www.nupack.org/partition/new | Analyze and design ncRNA structures
MiSolRNAdb | http://www.misolrna.org/ | Map position of miRNA and targets
TAPIR | http://bioinformatics.psb.ugent.be/webtools/tapir/ | Predict binding sites of miRNA and target
RNAz | https://www.tbi.univie.ac.at/software/RNAz/ | Predict ncRNA secondary structures
CentroidAlign | http://www.ncrna.org/software/centroidalign/ | Multiple alignments of ncRNAs
CleaveLand4 | https://github.com/MikeAxtell/CleaveLand4/blob/master/CleaveLand4.pl | Predict the binding sites of miRNAs in target
RNAfold | http://rna.tbi.univie.ac.at//cgi-bin/RNAWebSuite/RNAfold.cgi | Provide information of ncRNA secondary structures
fRNAdb | https://dbarchive.biosciencedbc.jp/en/frnadb/download.html | ncRNA sequences, prediction tools
Randfold | https://github.com/erbon7/randfold | Predict secondary structures of ncRNAs
Mfold | http://www.unafold.org/mfold/applications/rna-folding-form.php | Predict the nucleic acid folding and hybridization

(2) Deep learning models
Models | Main function | References
(2.1) Finding novel ncRNAs or classification
ncRDense | ncRNA classification | Chantsalnyam et al., 2021
linc2function | lncRNA identification | Ramakrishnaiah et al., 2021
ncDLRES | ncRNA identification | Wang et al., 2021b
ncRDeep | ncRNA classification | Chantsalnyam et al., 2020
2L-piRNADNN | piRNA identification | Khan et al., 2020
ncPro-ML | ncRNA promoter identification | Tang et al., 2020
circDeep | circular RNA classification | Chaabane et al., 2020
PredLnc-GFStack | ncRNA identification | Liu et al., 2019
LncADeep | lncRNA identification | Yang et al., 2018
nRC | ncRNA classification | Fiannaca et al., 2017
DARIO | ncRNA identification | Fasold et al., 2011
(2.2) ncRNA-biomolecular interaction
NPI-RGCNAE | ncRNA–protein interaction | Yu et al., 2021
DeepLPI | ncRNA–protein interaction | Shaw et al., 2021
PRPI-SC | lncRNA–protein interaction | Zhou et al., 2021
LGFC-CNN | lncRNA–protein interaction | Huang et al., 2021
Capsule-LPI | lncRNA–protein interaction | Li et al., 2021
EDLMFC | ncRNA–protein interaction | Wang et al., 2021a
NPI-GNN | ncRNA–protein interaction | Shen Z. A. et al., 2021
PmliPEMG | miRNA–lncRNA interaction | Kang et al., 2021
lncIBTP | lncRNA-biomolecule interaction | Zhang et al., 2021
RPI-SE | ncRNA–protein interaction | Yi et al., 2020
DRPLPI | lncRNA–protein interaction | Wekesa et al., 2020c
HFC-RPI | ncRNA–protein interaction | Dai et al., 2020
GPLPI | ncRNA–protein interaction | Wekesa et al., 2020b
LPI-DL | lncRNA–protein interaction | Wekesa et al., 2020a
LPI-CNNCP | lncRNA–protein interaction | Zhang S. W. et al., 2020
CIRNN | miRNA–lncRNA interaction | Zhang P. et al., 2020
LncMirNet | miRNA–lncRNA interaction | Yang et al., 2020
PmliPred | miRNA–lncRNA interaction | Kang et al., 2020
LMI-DForest | miRNA–lncRNA interaction | Wang et al., 2020
MD-MLI | miRNA–lncRNA interaction | Song et al., 2020
RPITER | ncRNA–protein interaction | Peng et al., 2019
BGFE | ncRNA–protein interaction | Zhan et al., 2019
DM-RPIs | ncRNA–protein interaction | Cheng et al., 2019
PLRPI | lncRNA–protein interaction | Zhou et al., 2019
CFRP | ncRNA–protein interaction | Dai et al., 2019
McBel-Plnc | lncRNA–protein interaction | Navamajiti et al., 2019
LPI-BLS | lncRNA–protein interaction | Fan and Zhang, 2019
LightGBM | ncRNA–protein interaction | Zhan et al., 2018
IPMiner | ncRNA–protein interaction | Pan et al., 2016
FlaiMapper | small ncRNA identification | Hoogstrate et al., 2015

(a) Details of the websites and software are listed in .


ncRNA Databases

ncRNA databases collect ncRNA-related information, such as sequences, interactions between ncRNAs and target genes, and expression profiles. In addition, some tools contain information that is supported by experimental evidence. For example, miRBase (Kozomara et al., 2019) integrates published mature miRNA sequences and their hairpin precursors. miRTarBase (Huang et al., 2020) and NPInter (Teng et al., 2020) not only record interactions between ncRNAs and other biomolecules but also provide the relevant experimental evidence. To date, miRNAs, lncRNAs, and circRNAs have been collected in different online databases (Chu et al., 2017; Jin et al., 2021). Details of these tools, including the application system, input format, and included species, are listed in Supplementary Table S1. The development of high-throughput sequencing technologies and the growing amount of published genomic data make it possible to collect diverse information from different species, which facilitates the analysis of ncRNA evolution and the construction of conserved models for predicting and exploring novel ncRNAs. However, applying these databases to cross-species research still faces enormous challenges. On the one hand, species-conserved and species-specific ncRNAs are rarely distinguished; on the other hand, it is difficult to determine in a predictive manner which ncRNAs are functional.

ncRNA Prediction Tools

ncRNA prediction includes both the recognition of ncRNAs and the prediction of their functions. At present, there are many tools for ncRNA recognition but relatively few for function prediction. We classify the existing prediction tools according to their categories (Table 1, Supplementary Table S1). How to accurately predict novel ncRNAs and their target genes has always been a focus of research. Current tool development concentrates on three aspects: predicting the sequences of miRNAs and their precursors (Wu et al., 2012; Lei and Sun, 2014; Fei et al., 2021); predicting the binding sites of ncRNAs on their targets (Bonnet et al., 2010; Brousse et al., 2014); and predicting or visualizing the secondary or three-dimensional structure of ncRNAs (Steffen et al., 2006; Byun and Han, 2009; Biesiada et al., 2016). Sequence alignment is the basis for these predictions, and the divergence of ncRNA sequences is an important factor affecting their accuracy. In general, ancient ncRNAs (especially those related to plant development) remain highly conserved across species (Willmann and Poethig, 2007), whereas recently evolved ncRNAs appear to be highly species-specific (Cuperus et al., 2011). Furthermore, the degree of conservation across species appears to vary among different classes of ncRNAs (Wu et al., 2020). Improving the accuracy of ncRNA prediction is therefore an important problem to be solved. With the accumulation of ncRNA data, building a conserved model for each ncRNA family may be one solution.
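To make the idea of binding-site prediction by sequence complementarity concrete, the following is a minimal sketch that slides a miRNA along a transcript and counts mismatches against its reverse complement. It is a toy illustration, not the algorithm of any tool in Table 1: the function names, the example sequence, and the `max_mismatches` threshold are all assumptions, and G:U wobble pairs, gaps, and position-dependent penalties are deliberately ignored.

```python
# Watson-Crick pairing only; G:U wobble pairs are not modelled (assumption).
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence (A, C, G, U only)."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def count_mismatches(mirna: str, site: str) -> int:
    """Count positions where a target site (5'->3') fails to pair with the miRNA."""
    if len(mirna) != len(site):
        raise ValueError("miRNA and site must have the same length")
    expected = reverse_complement(mirna)
    return sum(1 for a, b in zip(expected, site) if a != b)


def find_candidate_sites(mirna: str, transcript: str, max_mismatches: int = 3):
    """Return (start, mismatches) for every window within the mismatch limit."""
    mirna = mirna.upper().replace("T", "U")
    transcript = transcript.upper().replace("T", "U")
    n = len(mirna)
    hits = []
    for start in range(len(transcript) - n + 1):
        mismatches = count_mismatches(mirna, transcript[start:start + n])
        if mismatches <= max_mismatches:
            hits.append((start, mismatches))
    return hits


# Arbitrary 20-nt example sequence and a synthetic transcript containing
# a perfectly complementary site at offset 4.
example_mirna = "UGACAGAAGAGAGUGAGCAC"
example_transcript = "AAAA" + reverse_complement(example_mirna) + "CCCC"
print(find_candidate_sites(example_mirna, example_transcript))
```

Because the sketch relies on near-perfect complementarity, it also illustrates why sequence divergence between species directly lowers prediction accuracy: each diverged base consumes part of the mismatch budget.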

Application of Deep Learning in ncRNA Study

Deep learning has in recent years shown strong potential for addressing bioinformatics problems in ncRNA research. For example, the RPITER model can be used to predict interactions between ncRNAs and proteins based on sequence and structure information (Peng et al., 2019). Compared with traditional approaches, deep learning can incorporate more information into the computation, which helps improve the accuracy of the predicted results (Zhang S. W. et al., 2020). So far, at least 41 deep learning models have been built to predict ncRNA classification (Amin et al., 2019), ncRNA-protein interaction (Peng et al., 2019), ncRNA interaction, as well as ncRNA identification and functions (Khan et al., 2020; Zhang P. et al., 2020) (Table 1, Supplementary Table S1). The basic process of applying deep learning is shown in Figure 1A. After choosing an appropriate framework, researchers input data to train a model, and the resulting model is then used for the prediction step. Compared with other prediction methods, the advantage of deep learning is that it can effectively reduce the prediction bias caused by imperfectly designed parameters. Its limitation is that the accuracy of predictions depends heavily on the accuracy of the model, which usually requires a sufficiently large dataset to build and train. The dataset used to build a deep learning model therefore strongly affects the accuracy of prediction and analysis. How to obtain a large amount of ncRNA data from different species for building more accurate models is a serious problem that needs to be solved. In addition, most of these models are distributed for Linux systems, so proficient computational skills are a precondition for using them. Developing software with a user-friendly interface (Xu et al., 2021) is crucial for the application of these models.
However, the limited portability of these models across operating systems and the differences in how well they fit different data types are the main obstacles to building user-friendly software from new models.
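The workflow of encoding input data, training a model, and then predicting can be sketched in a few lines. The example below is a deliberately small stand-in, not a deep network and not any model from Table 1: it encodes sequences as k-mer frequencies and trains a single logistic unit by gradient descent. The function names, the toy sequences, the labels, and the hyperparameters are all hypothetical.

```python
import math
from itertools import product


def kmer_features(seq: str, k: int = 2) -> list:
    """Encode a sequence as the relative frequency of each possible k-mer."""
    seq = seq.upper().replace("T", "U")
    kmers = ["".join(p) for p in product("ACGU", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if kmer in counts:  # skip k-mers containing ambiguous bases
            counts[kmer] += 1
            total += 1
    return [counts[m] / total if total else 0.0 for m in kmers]


def train_logistic(X, y, epochs: int = 200, lr: float = 0.5):
    """Fit one logistic unit with stochastic gradient descent (stand-in for a deep model)."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            z = sum(wj * xj for wj, xj in zip(w, xi)) + b
            p = 1.0 / (1.0 + math.exp(-z))
            err = p - yi  # gradient of the log-loss with respect to z
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
            b -= lr * err
    return w, b


def predict(model, seq: str, k: int = 2) -> float:
    """Return the predicted probability that a sequence belongs to class 1."""
    w, b = model
    z = sum(wj * xj for wj, xj in zip(w, kmer_features(seq, k))) + b
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical toy data: class 1 is GC-rich, class 0 is AU-rich.
train_seqs = ["GGCCGCGC", "AAUUAUAU", "CGCGGCCG", "UAUAAUUA"]
train_labels = [1, 0, 1, 0]
model = train_logistic([kmer_features(s) for s in train_seqs], train_labels)
print(round(predict(model, "GCGCGGCC"), 3), round(predict(model, "AUAUUAAU"), 3))
```

With only four training sequences the model learns nothing beyond base composition, which illustrates the limitation noted above: prediction quality is bounded by the size and diversity of the training data.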
Figure 1

Deep learning processes (A) and experimental techniques (B) for studying ncRNAs. sRNA-Seq, small RNA sequencing; CLIP-Seq, crosslinking and immunoprecipitation with sequencing; 5′RLM-RACE, 5′ RNA ligase-mediated rapid amplification of cDNA ends; ssRNA-Seq, strand-specific RNA sequencing; CAGE-Seq, cap analysis gene expression sequencing.


Experimental Technologies

Although bioinformatics predictions are becoming more accurate as knowledge of ncRNAs accumulates, experimental technologies are still needed to validate the predicted results. Given the characteristics of ncRNAs, at least three aspects need to be verified: first, a functional ncRNA should have transcriptional activity (Bazzini et al., 2009); second, there should be an expression correlation between the ncRNA and its target genes (Bai et al., 2018); third, if the ncRNA functions by degrading its target genes, the cleavage site of the ncRNA on the target genes should be verified (Gao et al., 2018). Moreover, experimental strategies should vary for different types of ncRNAs. Figure 1B summarizes the main sequencing and experimental techniques used in ncRNA studies. In detail, the combination of sRNA-Seq and RNA-Seq is commonly used for the global identification of novel small ncRNAs (Huang et al., 2019). After low-quality reads are removed, the remaining high-quality reads are annotated against databases such as miRBase and Rfam (Deforges et al., 2019; Huang et al., 2019). Small RNAs are far shorter than the transcripts of protein-coding genes and lncRNAs. Therefore, unlike other RNAs, small RNAs are extracted using dedicated RNA-extraction kits (Gao et al., 2019) or TRIzol reagent (Tan et al., 2018). For miRNAs, it is necessary to identify the binding sites between miRNAs and their target genes (Shen W. et al., 2021). Degradome sequencing and crosslinking and immunoprecipitation with sequencing (CLIP-Seq) can be used to analyze these binding sites (Han et al., 2016; Chipman and Pasquinelli, 2019). Furthermore, 5′ RNA ligase-mediated rapid amplification of cDNA ends (5′RLM-RACE) assays are used to directly verify the predicted binding sites (Cui et al., 2020).
Novel lncRNAs can be detected by multiple RNA-Seq strategies, such as isoform sequencing, strand-specific RNA-Seq (ssRNA-Seq), and cap analysis gene expression sequencing (CAGE-Seq) with polyA-Seq technologies (Zheng et al., 2021). Although lncRNAs are considered ncRNAs, some of them retain a weak ability to be translated into small peptides. Therefore, ribosome profiling has become one of the strategies used to detect lncRNAs (Wu et al., 2019). Meanwhile, the recently developed high-precision single-base CRISPR/Cas9 technology can effectively create ncRNA-related mutants to explore the relationships between ncRNAs and target genes (Jacobs et al., 2015). In addition to research on the function of small RNAs, increasing attention has been paid to circRNAs in recent years. However, most studies focus on the discovery of novel circRNAs by sequencing technologies. One challenge of circRNA sequencing is to improve the accuracy of circRNA detection and quantification, which is limited by their lack of poly(A) tails and their low expression levels (Wang et al., 2019; Zhao et al., 2019). Treating total RNA with ribonuclease R or increasing sequencing depth may resolve these issues (Chen et al., 2017). Meanwhile, functional research techniques also need to be developed to further clarify the biological functions of circRNAs.
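The second verification criterion in this section, an expression correlation between an ncRNA and its target gene, can be checked with a Pearson correlation coefficient. The sketch below is one simple way to do this, under stated assumptions: the function name and the expression values are hypothetical, and Pearson correlation is chosen here for illustration rather than taken from the cited studies.

```python
import math


def pearson_r(x, y) -> float:
    """Pearson correlation coefficient between two equal-length samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two samples of equal length (at least 2 values)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical normalized expression values across five samples.
mirna_expr = [1.0, 2.5, 4.0, 6.5, 8.0]
target_expr = [9.0, 7.5, 5.0, 3.0, 1.5]
print(round(pearson_r(mirna_expr, target_expr), 3))
```

For a miRNA that acts by degrading its target transcript, a strongly negative coefficient is the expected pattern, although correlation alone does not demonstrate cleavage; that still requires the degradome or 5′RLM-RACE evidence described above.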

Conclusion

In conclusion, although many tools and technologies have been developed to study ncRNAs in plants, both opportunities and challenges remain in this field. In bioinformatics, because ncRNAs differ significantly between species, collecting as much data as possible from different species will benefit ncRNA research. Meanwhile, because ncRNAs in the same family are highly conserved, it is possible to build models to discover novel ncRNAs. Moreover, most prediction tools and deep learning models are developed for Linux systems, and the development of user-friendly Windows versions will help more researchers analyze different kinds of ncRNAs. As ncRNAs play a regulatory role in plants, how to manipulate ncRNAs through genetic engineering to regulate specific biological processes remains to be resolved.

Author Contributions

JZ conceived the study. DX collected and synthesized the data and drafted the manuscript. JZ, WY, CF, BL, and M-ZL revised the manuscript. All authors contributed to the article and approved the final version.

Funding

This work was supported by the Key Scientific and Technological Grant of Zhejiang for Breeding New Agricultural Varieties (2021C02070-1), the National Key Research and Development Program of China (2021YFD2200205 and 2021YFD2200700), the National Science Foundation of China (32171814), the Natural Science Foundation of Zhejiang Province for Distinguished Young Scholars (LR22C160001), and the Zhejiang A&F University Research and Development Fund Talent Startup Project (2021LFR013).

Conflict of Interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher's Note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
  78 in total

1.  PlantcircBase: A Database for Plant Circular RNAs.

Authors:  Qinjie Chu; Xingchen Zhang; Xintian Zhu; Chen Liu; Lingfeng Mao; Chuyu Ye; Qian-Hao Zhu; Longjiang Fan
Journal:  Mol Plant       Date:  2017-03-16       Impact factor: 13.164

2.  Predicting the interaction biomolecule types for lncRNA: an ensemble deep learning approach.

Authors:  Yu Zhang; Cangzhi Jia; Chee Keong Kwoh
Journal:  Brief Bioinform       Date:  2021-07-20       Impact factor: 11.622

3.  A deep learning model for plant lncRNA-protein interaction prediction with graph attention.

Authors:  Jael Sanyanda Wekesa; Jun Meng; Yushi Luan
Journal:  Mol Genet Genomics       Date:  2020-05-15       Impact factor: 3.291

4.  ncRDense: A novel computational approach for classification of non-coding RNA family by deep learning.

Authors:  Tuvshinbayar Chantsalnyam; Arslan Siraj; Hilal Tayara; Kil To Chong
Journal:  Genomics       Date:  2021-07-07       Impact factor: 5.736

5.  PhasiRNAnalyzer: an integrated analyser for plant phased siRNAs.

Authors:  Yuhan Fei; Jiejie Feng; Rui Wang; Baoyi Zhang; Hongsheng Zhang; Ji Huang
Journal:  RNA Biol       Date:  2021-02-04       Impact factor: 4.652

6.  DARIO: a ncRNA detection and analysis tool for next-generation sequencing experiments.

Authors:  Mario Fasold; David Langenberger; Hans Binder; Peter F Stadler; Steve Hoffmann
Journal:  Nucleic Acids Res       Date:  2011-05-27       Impact factor: 16.971

7.  miRBase: from microRNA sequences to function.

Authors:  Ana Kozomara; Maria Birgaoanu; Sam Griffiths-Jones
Journal:  Nucleic Acids Res       Date:  2019-01-08       Impact factor: 16.971

8.  RPI-SE: a stacking ensemble learning framework for ncRNA-protein interactions prediction using sequence information.

Authors:  Hai-Cheng Yi; Zhu-Hong You; Mei-Neng Wang; Zhen-Hao Guo; Yan-Bin Wang; Ji-Ren Zhou
Journal:  BMC Bioinformatics       Date:  2020-02-18       Impact factor: 3.169

9.  Exogenous miRNAs induce post-transcriptional gene silencing in plants.

Authors:  Federico Betti; Maria Jose Ladera-Carmona; Daan A Weits; Gianmarco Ferri; Sergio Iacopino; Giacomo Novi; Benedetta Svezia; Alicja B Kunkowska; Antonietta Santaniello; Alberto Piaggesi; Elena Loreti; Pierdomenico Perata
Journal:  Nat Plants       Date:  2021-10-14       Impact factor: 15.793

10.  RPITER: A Hierarchical Deep Learning Framework for ncRNA-Protein Interaction Prediction.

Authors:  Cheng Peng; Siyu Han; Hui Zhang; Ying Li
Journal:  Int J Mol Sci       Date:  2019-03-01       Impact factor: 5.923
