Alina Munteanu (1, 2), Neelanjan Mukherjee (1), Uwe Ohler (1, 2). (1) Berlin Institute for Medical Systems Biology, Max Delbrück Center for Molecular Medicine in the Helmholtz Association, Berlin, Germany. (2) Department of Computer Science, Humboldt University, Berlin, Germany.
Abstract
Motivation: RNA-binding proteins (RBPs) regulate every aspect of RNA metabolism and function. Eukaryotic genomes encode hundreds of RBPs, and each recognizes its RNA targets through a specific combination of RNA sequence and structure properties. For most RBPs, however, only a primary sequence motif has been determined, while the structure of the binding sites remains uncharacterized.

Results: We developed SSMART, an RNA motif finder that simultaneously models the primary sequence and the structural properties of RNA target sites. The sequence-structure motifs are represented as consensus strings over a degenerate alphabet that extends the IUPAC codes for nucleotides to account for secondary structure preferences. Evaluation on synthetic data showed that SSMART recovers both sequence and structure motifs implanted into 3'UTR-like sequences, across varying degrees of structured and unstructured binding sites. Applied to high-throughput in vivo and in vitro data, SSMART not only recovers the known sequence motif but also provides insight into the structural preferences of the RBP.

Availability and implementation: SSMART is freely available at https://ohlerlab.mdc-berlin.de/software/SSMART_137/.

Supplementary information: Supplementary data are available at Bioinformatics online.
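To make the idea of a consensus string over a degenerate sequence-structure alphabet more concrete, the following is a minimal sketch and not SSMART's implementation. The nucleotide codes are the standard IUPAC ambiguity codes (RNA alphabet); the structure codes (`p` paired, `u` unpaired, `x` either), the function names `matches` and `find_sites`, and the toy sequence are all assumptions introduced here for illustration. SSMART's actual alphabet and scoring are not described in this abstract.

```python
# Standard IUPAC nucleotide ambiguity codes, written for RNA (U instead of T).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "U": "U",
    "R": "AG", "Y": "CU", "S": "CG", "W": "AU", "K": "GU", "M": "AC",
    "B": "CGU", "D": "AGU", "H": "ACU", "V": "ACG", "N": "ACGU",
}

# HYPOTHETICAL structure codes (an assumption, not SSMART's alphabet):
# "p" = paired, "u" = unpaired, "x" = either state is accepted.
STRUCT = {"p": "p", "u": "u", "x": "pu"}


def matches(motif, seq, struct, start):
    """Return True if the motif matches seq/struct at position start.

    motif  : list of (nucleotide_code, structure_code) pairs
    seq    : RNA sequence over A, C, G, U
    struct : per-base annotation over "p"/"u", same length as seq
    """
    if start < 0 or start + len(motif) > len(seq):
        return False
    for offset, (nt_code, st_code) in enumerate(motif):
        i = start + offset
        if seq[i] not in IUPAC[nt_code] or struct[i] not in STRUCT[st_code]:
            return False
    return True


def find_sites(motif, seq, struct):
    """Return all start positions where the motif matches."""
    if len(seq) != len(struct):
        raise ValueError("seq and struct must have the same length")
    last_start = len(seq) - len(motif)
    return [i for i in range(last_start + 1) if matches(motif, seq, struct, i)]


# Toy input: two UGUA occurrences, only the first lies in an unpaired region.
seq = "ACUGUAUGUA"
struct = "ppuuuupppp"

# UGUA where the first three bases must be unpaired.
unpaired_ugua = [("U", "u"), ("G", "u"), ("U", "u"), ("A", "x")]
# The same sequence motif with no structural constraint.
any_ugua = [("U", "x"), ("G", "x"), ("U", "x"), ("A", "x")]

print(find_sites(unpaired_ugua, seq, struct))  # [2]
print(find_sites(any_ugua, seq, struct))       # [2, 6]
```

In this toy case the sequence-only motif matches at both occurrences, while adding the structural constraint keeps only the site in the unpaired region, which is the kind of distinction a joint sequence-structure motif is meant to capture.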