
MODOMICS: a database of RNA modification pathways. 2017 update.

Pietro Boccaletto1, Magdalena A Machnicka1,2, Elzbieta Purta1, Pawel Piatkowski1, Blazej Baginski1, Tomasz K Wirecki1, Valérie de Crécy-Lagard3, Robert Ross4, Patrick A Limbach4, Annika Kotter5, Mark Helm5, Janusz M Bujnicki1,6.   

Abstract

MODOMICS is a database of RNA modifications that provides comprehensive information concerning the chemical structures of modified ribonucleosides, their biosynthetic pathways, the location of modified residues in RNA sequences, and RNA-modifying enzymes. In the current database version, we included the following new features and data: extended mass spectrometry and liquid chromatography data for modified nucleosides; links between human tRNA sequences and MINTbase - a framework for the interactive exploration of mitochondrial and nuclear tRNA fragments; new, machine-friendly system of unified abbreviations for modified nucleoside names; sets of modified tRNA sequences for two bacterial species, updated collection of mammalian tRNA modifications, 19 newly identified modified ribonucleosides and 66 functionally characterized proteins involved in RNA modification. Data from MODOMICS have been linked to the RNAcentral database of RNA sequences. MODOMICS is available at http://modomics.genesilico.pl.
© The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

Year:  2018        PMID: 29106616      PMCID: PMC5753262          DOI: 10.1093/nar/gkx1030

Source DB:  PubMed          Journal:  Nucleic Acids Res        ISSN: 0305-1048            Impact factor:   16.971


INTRODUCTION

The presence of modified nucleosides in RNA, beyond the basic A, U, C and G, has been recognized for more than half a century. However, their importance to RNA biochemistry and cell biology has been underappreciated, mainly because very few modifications (at defined positions in defined RNA molecules) were found to be truly indispensable for basic biological processes [review: (1)]. We now know that 163 post-transcriptional modifications of RNA introduce a functional diversity that allows the four basic ribonucleotide residues to gain diverse functions, akin to those of the side chains of amino acid residues, which may be, e.g., polar, charged, aliphatic or aromatic. Modifications can directly influence RNA structure by promoting or disrupting certain intramolecular interactions; they can make the RNA molecule more rigid or more flexible. They can also influence RNA interactions with other molecules, in particular proteins. Overall, they contribute strongly to the diversity of functions fulfilled by RNA molecules, especially within complex regulatory networks, where subtle structural changes can bring about significant changes to cellular metabolism (2).

In recent years, new types of RNA modifications have been found, and biochemical and physiological roles have been elucidated for many known modified ribonucleosides (3–5). Some of these advances were driven by the use of liquid chromatography/mass spectrometry (LC/MS)-based methods, which provide highly precise quantification of changes in the spectrum of modified ribonucleosides in RNA from any organism, facilitating the study of translational control of cellular responses and phenotypes (6). Moreover, a number of previously unknown RNA-modifying enzymes have been identified and characterized. Important new roles for long-known RNA modifications were also discovered. Prominent examples include the involvement of N6-methyladenosine (m6A) in regulating gene expression by influencing transcript stability, splicing, translation efficiency and cap-independent translation, and in promoting circular RNA translation [review: (7)]. Mutations in many human genes encoding RNA modification enzymes have been linked to diseases, such as cancer, cardiovascular diseases, metabolic diseases, neurological disorders, and mitochondria-related defects [review: (8)].

To adequately represent the recent accumulation of knowledge, we have added to both the variety and the volume of data in the MODOMICS database. The most significant additions are: (i) extensively updated datasets: new modifications, new enzymes, and new RNA sequences with modifications; (ii) a new category of LC/MS data for modifications (Figure 1); (iii) a new naming/numbering convention for modified residues in RNA sequences; (iv) replacement of Jmol by JSmol for 3D structure viewing.
Figure 1.

Example of LC/MS data for N2,N2-dimethylguanosine (m2,2G), illustrating the new data type: LC/MS information displayed on the modification detail page.

DATABASE CONTENT

MODOMICS has been developed to house and distribute collections of RNA modification pathways, chemical structures of modified nucleosides, sequences of modified RNAs, enzymes responsible for individual reactions, and a catalog of ‘building blocks’ for chemical synthesis of modified RNA, and to be expanded to include new data types. The database was created as a single resource to organize and present all these data in a convenient and straightforward way, and it is currently the most comprehensive source of information among all existing RNA modification databases. Information about modified residues is also available in the RNAMDB database (9), while information about modified nucleosides identified in high-throughput experiments such as Pseudo-seq, CeU-seq, m6A-seq, Aza-IP and RiboMeth-seq is hosted by RMBase (10). Recently, MODOMICS was linked to RNAcentral, a database of non-coding RNA sequences (11), and it serves as a source of modified tRNA and rRNA sequences.

At present, MODOMICS contains 163 different modifications that have been identified in RNA molecules. A typical entry for a modified ribonucleoside contains information about its fundamental chemical properties, chemical structure, localization in known RNA molecule types, phylogenetic distribution with respect to the Domains of Life, and the known enzymes responsible for its biosynthesis. Among other details, information related to MS analyses of modified RNAs is provided (see the ‘LC/MS data for modified nucleosides’ section). Many of the products of modification reactions are substrates for further reactions, and the formation of hypermodified residues occurs in complex pathways, which are displayed as graphs in the PATHWAYS section of the database. Pathways are divided into six categories according to their starting point: four categories correspond to the standard bases (A, G, C and U), one presents the incorporation and hypermodification pathway of queuosine, and one presents the modifications of the RNA 5′ cap.

MODOMICS provides a collection of modified RNA sequences of different types. For families of homologous RNAs, multiple sequence alignments are available. Sequences are visualized with all modifications highlighted and linked to the corresponding modification records. The current set of sequences comprises 691 tRNA, 19 rRNA, 46 snRNA and 25 snoRNA sequences.

The MODOMICS database currently contains information about 340 functionally characterized proteins involved in RNA modification, including both functional enzymes and protein co-factors necessary for multi-protein enzymatic activities. For each protein, the following details are provided: identifiers and accession numbers from relevant resources and databases (NCBI GI, UniProt ID, COG number, and the PDB ID of the structure, if available); the amino acid sequence; the corresponding ORF; and information about the catalyzed reaction, the position of the modification and the modified RNA(s) (if available). For proteins that are parts of enzymatic complexes, the name of the complex is provided.

MODOMICS also contains human and yeast snoRNAs involved in RNA-guided RNA modification by the C/D box and H/ACA box ribonucleoproteins, linked to the corresponding modification sites in human and yeast RNAs, as well as the catalog of ‘building blocks’ for the chemical synthesis of naturally occurring modified nucleosides. Several options for database searching and querying are implemented in MODOMICS, including the BLAST (12) search of protein sequences and the PARALIGN (13) search of nucleic acid sequences collected in MODOMICS, as well as a utility that sends a protein sequence from a MODOMICS entry to BLAST on the NCBI web server.

Updated modifications section

Since the previous release of MODOMICS (14), 19 new modifications have been added to the database. Among them are four types of geranylated nucleosides discovered in bacterial tRNA (3), 5-cyanomethyluridine (cnm5U) (15), and 2′-O-methyluridine 5-oxyacetic acid methyl ester (mcmo5Um) (5). LC/MS analyses of tRNAs from Bacillus subtilis, plants, and Trypanosoma brucei revealed the presence of 2-methylthio cyclic N6-threonylcarbamoyladenosine (ms2ct6A), a derivative of N6-threonylcarbamoyladenosine (t6A), at position 37 of tRNAs that recognize codons starting with adenosine (16). 3D chemical structures of modified nucleosides are now displayed with JSmol (17), an open-source JavaScript library and HTML5 viewer for 3D chemical structures which, in contrast to the previously used Jmol tool, does not require installation of the Java software package. JSmol can also be used on systems that no longer support Java applets due to security concerns, or for which Java is not available, such as smartphones and tablets. It does not use hardware graphics acceleration, so it can run in any browser that supports HTML5 standards.

LC/MS data for modified nucleosides

This MODOMICS release features a new section on the modification detail page that hosts the LC/MS data for the modified nucleoside. The new fields report the product ions, the protonated mass [M+H]+, the LC elution order and its characteristics, the normalized LC elution time, and the corresponding literature references. The LC elution time is normalized to guanosine (G) and was measured on an RP C-18 column with acetonitrile/ammonium acetate as the mobile phase; the elution order is based on the retention times of C, U, G, A and m6A, which cover all areas of the chromatogram. The LC data are intended to give the novice LC-MS user guidance on the relative hydrophobicity of modified nucleosides and an estimated elution region for the stated stationary and mobile phases. Currently, only 48 modifications (for instance, m1A, m3C and m2,2G) are associated with LC information, while 138 modifications are associated with MS information. The system is designed to be extended to all modifications in the database as soon as new data become available. LC/MS-based methods allow semi-quantitative RNA modification profiling of different organisms for newly detected as well as known modifications, and one can expect that this approach will be extended further. The LC/MS data for the MODOMICS collection of modified nucleosides provide a comprehensive source of information for mapping the identity and position of modified residues in RNA sequences.
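As a rough illustration of two of the fields described above, the following Python sketch computes a protonated mass [M+H]+ from an elemental formula and normalizes a retention time to guanosine. It is not MODOMICS code. The function names are hypothetical, the retention times in the usage note are made up, and treating "normalized to guanosine" as a simple ratio of retention times is an assumption, since the text does not state the exact formula. The elemental formulas for guanosine (C10H13N5O5) and m2,2G (C12H17N5O5) and the isotope masses are standard values.

```python
# Monoisotopic masses (in u) of the most abundant isotopes; standard values.
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.0030740,
    "O": 15.9949146,
}
PROTON_MASS = 1.00727647  # mass of a proton (H+), in u

# Elemental compositions as {element: count}.
GUANOSINE = {"C": 10, "H": 13, "N": 5, "O": 5}
M22G = {"C": 12, "H": 17, "N": 5, "O": 5}  # N2,N2-dimethylguanosine


def monoisotopic_mass(formula):
    """Sum the monoisotopic masses of all atoms in the formula."""
    return sum(MONOISOTOPIC[element] * count for element, count in formula.items())


def protonated_mass(formula):
    """Mass of the singly protonated species [M+H]+."""
    return monoisotopic_mass(formula) + PROTON_MASS


def normalized_elution_time(retention_time, guanosine_retention_time):
    """Retention time relative to guanosine (ASSUMPTION: a plain ratio)."""
    if guanosine_retention_time <= 0:
        raise ValueError("guanosine retention time must be positive")
    return retention_time / guanosine_retention_time
```

For example, `round(protonated_mass(M22G), 4)` gives 312.1302, and with hypothetical retention times of 12.0 min for a nucleoside and 8.0 min for guanosine, `normalized_elution_time(12.0, 8.0)` gives 1.5, i.e. the nucleoside elutes later than G on that run.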

New nomenclature for modified nucleosides

The old systems used to encode modified residues in RNA sequences have been very cumbersome for automated data processing, especially in the case of special characters (often interpreted as special actions) or names containing letters such as ‘c’ or ‘i’ that could be confused with different bases. Thus, we developed a new naming convention that uses only digits in addition to the standard letters (A, G, C, U), and which makes names distinct from one another. In the newly proposed system, a number is introduced before the modified residue, so that software for sequence processing can recognize the original residue type before modification, as well as identify the specific modification(s) introduced. For the most common modification type, i.e., simple methylations, we use single-digit numbers that indicate, whenever possible, the methylated positions, with 0 representing the 2′-OH group. Consequently, the modifications Am, m1A, m6A and m5C are indicated as 0A, 1A, 6A and 5C, respectively. Residues with several methylations list all methylated positions, sorted in ascending order, e.g., m1Am, m2,2,7G and m4,4Cm are indicated as 01A, 227G and 044C, respectively. Some other modifications also use single digits for convenience, e.g., I and Ψ are indicated as 9A and 9U. Other modifications are indicated with additional numbers, usually following the position of the modification. For example, i6A, io6A and k2C are indicated as 61A, 60A and 21C, respectively. Some naming decisions, especially in the case of very complex modifications, were arbitrary, keeping in mind that ambiguity in the numbering must always be avoided, i.e., that a given sequence of digits preceding a letter corresponds to a unique modification. For each modified nucleoside, the code in the new nomenclature is available on the nucleoside site in the Modifications section of the database. The nomenclature was also implemented in the Sequences section by providing the option to display modifications in sequences using this nomenclature instead of one-letter symbols. As the next step, we intend to develop and provide format conversion tools that allow exporting and importing RNA sequences with modifications and, e.g., running sequence searches that take modifications into account.
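The digit-prefix convention described in this section lends itself to simple machine parsing. The Python sketch below is an illustration only, not a MODOMICS tool: the function names are hypothetical, the lookup table contains just the examples quoted in the text, and the rule in `methylation_code` covers only the simple-methylation case (sorted position digits, with 0 for the 2′-OH group). It assumes that every residue is written as zero or more digits followed by one of A, G, C or U.

```python
import re

# Short name -> new code, limited to the examples quoted in the text.
EXAMPLES = {
    "Am": "0A", "m1A": "1A", "m6A": "6A", "m5C": "5C",
    "m1Am": "01A", "m2,2,7G": "227G", "m4,4Cm": "044C",
    "I": "9A", "Ψ": "9U",
    "i6A": "61A", "io6A": "60A", "k2C": "21C",
}
CODE_TO_NAME = {code: name for name, code in EXAMPLES.items()}

# One residue: optional digit prefix, then the original base letter.
TOKEN = re.compile(r"(\d*)([AGCU])")


def tokenize(sequence):
    """Split a sequence into (digits, base) pairs; '' means unmodified."""
    tokens, pos = [], 0
    while pos < len(sequence):
        match = TOKEN.match(sequence, pos)
        if match is None:
            raise ValueError(f"unexpected character at position {pos}")
        tokens.append((match.group(1), match.group(2)))
        pos = match.end()
    return tokens


def unmodified(sequence):
    """Recover the original residue types by dropping all digit prefixes."""
    return "".join(base for _, base in tokenize(sequence))


def short_names(sequence):
    """Translate each residue to a short name; '?' if not in the small table."""
    return [CODE_TO_NAME.get(digits + base, "?") if digits else base
            for digits, base in tokenize(sequence)]


def methylation_code(short_name):
    """Derive the code for a simple methylation, e.g. 'm1Am' -> '01A'."""
    match = re.fullmatch(r"(?:m([\d,]+))?([AGCU])(m?)", short_name)
    if match is None or not (match.group(1) or match.group(3)):
        raise ValueError(f"not a simple methylation: {short_name!r}")
    positions = [int(p) for p in match.group(1).split(",")] if match.group(1) else []
    if match.group(3):  # trailing 'm' marks 2'-O-methylation, encoded as 0
        positions.append(0)
    return "".join(str(p) for p in sorted(positions)) + match.group(2)
```

With a made-up sequence such as `"GC1A6AU9U"`, `unmodified` returns `"GCAAUU"` and `short_names` returns `["G", "C", "m1A", "m6A", "U", "Ψ"]`. Because the digits always precede the base letter, the original residue can be recovered without any lookup table, which is the property the text highlights.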

tRNA sequences section update and development

For this database release, 102 new tRNA sequences were added, and a major update of mammalian tRNA modifications was performed based on reference (18). The new sequences include sets of tRNAs from two bacterial species, Streptomyces griseus and Lactococcus lactis (19,20), with 60 and 26 sequences, respectively. As technology and methods improve, well-studied sequences continue to undergo revision (21). We have also introduced links from human tRNA sequences to MINTbase (22), a framework for the interactive exploration of mitochondrial and nuclear tRNA fragments. The MINTbase link in MODOMICS opens a page listing the latest profile of expressed tRNA fragments, aligned against the sequence from MODOMICS.

Updated collection of proteins, enzymatic activities, and pathways

The MODOMICS collection of functionally characterized proteins involved in RNA modification is under constant development. Since the previous release, 66 new proteins have been added. The collections of protein sequences and enzymatic activities are updated in parallel, which resulted in 105 new enzymatic activities. The proteins added in this release include a collection of human RNA modification enzymes, among them dimethyladenosine transferase TFB1M (23), tRNA pseudouridine synthase PUS3 (24), tRNA m5C methyltransferases NSUN3 (25) and NSUN6 (26), and rRNA m5C methyltransferase NSUN4 (27). Apart from the addition of newly characterized enzymes, data entries for many enzymes and associated pathways were updated.

Future prospects

The number of experimentally identified modifications and RNA-modifying enzymes keeps growing. New modified nucleosides are being discovered, in particular in RNAs from recently adopted model systems such as extremophilic prokaryotes. Although a considerable amount of information is available about the enzymes responsible for introducing specific modifications, there are still many modified positions in well-characterized RNA molecules for which the responsible enzymes are not known, e.g., m6Am at the 5′ end of human mRNAs, or m5U and m4C in 12S mitochondrial rRNA. To help us keep up with new discoveries, we encourage the users of MODOMICS to submit suggestions for additions to the database. We also encourage developers of other computational resources to contact us so that our databases can be linked to each other. For the next release of MODOMICS, we plan to update the visualization options and to refurbish the website, to keep up with changing trends in web design. We also intend to renew the data structures, to make MODOMICS more compatible with other databases and web servers, to facilitate automated data exchange, and to introduce the ability to search sequences while taking modifications into account.

AVAILABILITY

The data are freely accessible for research purposes at http://modomics.genesilico.pl.
  25 in total

1.  A homozygous truncating mutation in PUS3 expands the role of tRNA modification in normal cognition.

Authors:  Ranad Shaheen; Lu Han; Eissa Faqeih; Nour Ewida; Eman Alobeid; Eric M Phizicky; Fowzan S Alkuraya
Journal:  Hum Genet       Date:  2016-04-07       Impact factor: 4.132

2.  Systematic identification of tRNAome and its dynamics in Lactococcus lactis.

Authors:  Pranav Puri; Collin Wetzel; Paul Saffert; Kirk W Gaston; Susan P Russell; Juan A Cordero Varela; Pieter van der Vlies; Gong Zhang; Patrick A Limbach; Zoya Ignatova; Bert Poolman
Journal:  Mol Microbiol       Date:  2014-08-06       Impact factor: 3.501

3.  Methylation of 12S rRNA is necessary for in vivo stability of the small subunit of the mammalian mitochondrial ribosome.

Authors:  Metodi D Metodiev; Nicole Lesko; Chan Bae Park; Yolanda Cámara; Yonghong Shi; Rolf Wibom; Kjell Hultenby; Claes M Gustafsson; Nils-Göran Larsson
Journal:  Cell Metab       Date:  2009-04       Impact factor: 27.287

4.  The RNA Modification Database, RNAMDB: 2011 update.

Authors:  William A Cantara; Pamela F Crain; Jef Rozenski; James A McCloskey; Kimberly A Harris; Xiaonong Zhang; Franck A P Vendeix; Daniele Fabris; Paul F Agris
Journal:  Nucleic Acids Res       Date:  2010-11-10       Impact factor: 16.971

5.  RNAcentral: an international database of ncRNA sequences.

Authors:  Anton I Petrov; Simon J E Kay; Richard Gibson; Eugene Kulesha; Dan Staines; Elspeth A Bruford; Mathew W Wright; Sarah Burge; Robert D Finn; Paul J Kersey; Guy Cochrane; Alex Bateman; Sam Griffiths-Jones; Jennifer Harrow; Patricia P Chan; Todd M Lowe; Christian W Zwieb; Jacek Wower; Kelly P Williams; Corey M Hudson; Robin Gutell; Michael B Clark; Marcel Dinger; Xiu Cheng Quek; Janusz M Bujnicki; Nam-Hai Chua; Jun Liu; Huan Wang; Geir Skogerbø; Yi Zhao; Runsheng Chen; Weimin Zhu; James R Cole; Benli Chai; Hsien-Da Huang; His-Yuan Huang; J Michael Cherry; Artemis Hatzigeorgiou; Kim D Pruitt
Journal:  Nucleic Acids Res       Date:  2014-10-28       Impact factor: 16.971

6.  RMBase: a resource for decoding the landscape of RNA modifications from high-throughput sequencing data.

Authors:  Wen-Ju Sun; Jun-Hao Li; Shun Liu; Jie Wu; Hui Zhou; Liang-Hu Qu; Jian-Hua Yang
Journal:  Nucleic Acids Res       Date:  2015-10-12       Impact factor: 16.971

7.  The human 18S rRNA base methyltransferases DIMT1L and WBSCR22-TRMT112 but not rRNA modification are required for ribosome biogenesis.

Authors:  Christiane Zorbas; Emilien Nicolas; Ludivine Wacheul; Emmeline Huvelle; Valérie Heurgué-Hamard; Denis L J Lafontaine
Journal:  Mol Biol Cell       Date:  2015-04-07       Impact factor: 4.138

8.  MODOMICS: a database of RNA modification pathways--2013 update.

Authors:  Magdalena A Machnicka; Kaja Milanowska; Okan Osman Oglou; Elzbieta Purta; Malgorzata Kurkowska; Anna Olchowik; Witold Januszewski; Sebastian Kalinowski; Stanislaw Dunin-Horkawicz; Kristian M Rother; Mark Helm; Janusz M Bujnicki; Henri Grosjean
Journal:  Nucleic Acids Res       Date:  2012-10-30       Impact factor: 16.971

9.  Discovery and biological characterization of geranylated RNA in bacteria.

Authors:  Christoph E Dumelin; Yiyun Chen; Aaron M Leconte; Y Grace Chen; David R Liu
Journal:  Nat Chem Biol       Date:  2012-09-16       Impact factor: 15.040

10.  NSUN4 is a dual function mitochondrial protein required for both methylation of 12S rRNA and coordination of mitoribosomal assembly.

Authors:  Metodi Dimitrov Metodiev; Henrik Spåhr; Paola Loguercio Polosa; Caroline Meharg; Christian Becker; Janine Altmueller; Bianca Habermann; Nils-Göran Larsson; Benedetta Ruzzenente
Journal:  PLoS Genet       Date:  2014-02-06       Impact factor: 5.917

