BACKGROUND: Animal genomes contain thousands of long noncoding RNA (lncRNA) genes, a growing subset of which are thought to be functionally important. This functionality is often mediated by short sequence elements scattered throughout the RNA sequence that correspond to binding sites for small RNAs and RNA-binding proteins. Throughout vertebrate evolution, the sequences of lncRNA genes have changed extensively, so that it is often impossible to obtain significant alignments between sequences of lncRNAs from evolutionarily distant species, even when synteny is evident. This often precludes identifying conserved lncRNAs that are likely to be functional or prioritizing constrained regions for experimental interrogation. RESULTS: We introduce here LncLOOM, a novel algorithmic framework for the discovery and evaluation of syntenic combinations of short motifs. LncLOOM is based on a graph representation of the input sequences and uses integer linear programming to efficiently compare dozens of sequences that have thousands of bases each and to evaluate the significance of the recovered motifs. We show that LncLOOM is capable of identifying specific, biologically relevant motifs that are conserved throughout vertebrates and beyond in lncRNAs and 3'UTRs, including novel functional RNA elements in the CHASERR lncRNA that are required for regulation of CHD2 expression. CONCLUSIONS: We expect that LncLOOM will become a broadly used approach for the discovery of functionally relevant elements in the noncoding genome.
Keywords: 3′UTR; CHASERR; CHD2; Homology; Integer linear programming; Long noncoding RNA; MicroRNA; Molecular evolution; OIP5-AS1