
The Functional RNA Database 3.0: databases to support mining and annotation of functional RNAs.

Toutai Mituyama, Kouichirou Yamada, Emi Hattori, Hiroaki Okida, Yukiteru Ono, Goro Terai, Aya Yoshizawa, Takashi Komori, Kiyoshi Asai.

Abstract

We developed a pair of databases that support two important tasks: annotation of anonymous RNA transcripts and discovery of novel non-coding RNAs. The database combo is called the Functional RNA Database and consists of two databases: a rewrite of the original version of the Functional RNA Database (fRNAdb) and the latest version of the UCSC GenomeBrowser for Functional RNA. The former is a sequence database equipped with a powerful search function and hosts a large collection of known/predicted non-coding RNA sequences acquired from existing databases as well as novel/predicted sequences reported by researchers of the Functional RNA Project. The latter is a UCSC Genome Browser mirror with large additional custom tracks specifically associated with non-coding elements. It also includes several functional enhancements such as a presentation of a common secondary structure prediction at any given genomic window ≤500 bp. Our GenomeBrowser supports user authentication and user-specific tracks. The current version of the fRNAdb is a complete rewrite of the former version, hosting a larger number of sequences and with a much friendlier interface. The current version of UCSC GenomeBrowser for Functional RNA features a larger number of tracks and richer features than the former version. The databases are available at http://www.ncrna.org/.

Year:  2008        PMID: 18948287      PMCID: PMC2686472          DOI: 10.1093/nar/gkn805

Source DB:  PubMed          Journal:  Nucleic Acids Res        ISSN: 0305-1048            Impact factor:   16.971


INTRODUCTION

Large-scale transcription analyses such as the H-invitational (1) and FANTOM (2) projects reported a large number of transcripts that could not be associated with coding genes and were therefore left unclassified. Several investigations revealed that these unclassifiable transcripts contain novel non-coding genes (3–5). The Functional RNA Database (fRNAdb) 1.0 (6) focused on acquiring and providing lines of evidence for inferring the non-coding nature of these unclassifiable transcripts, to help filter out candidates for non-coding genes. However, drastic changes in the landscape of non-coding RNA research spurred us to move on to the next phase of database development. Transcriptome analysis of natural RNA transcripts by high-throughput sequencing is one of the most active topics in recent research. Because deep sequencing produces abundant sequence data, computational analysis plays an important role in rapid sequence mapping and in the annotation of anonymous sequences, and a sequence database is the most crucial component of that analysis. Total RNAs extracted from a cell tend to have diverse compositions even when the RNAs are extracted via immunoprecipitation of specific proteins (7–9). They contain tRNAs, rRNAs, coding mRNAs, various transposons and non-coding RNAs including miRNAs and snoRNAs, together with a fair number of anonymous transcripts that match no existing annotation although they can be mapped to a genome. Such transcripts may contain evidence of novel non-coding RNA genes. To accommodate the large-scale sequence data from deep sequencing, we have completely redesigned and rebuilt fRNAdb. The major changes include an increase in the number of hosted sequences (from 13 693 to 509 795), sequence ontology (SO, http://song.sourceforge.net/) classification, a keyword search function and a Blast search service. The next section details the features that are new in the current version.

fRNAdb

fRNAdb is a sequence database hosting a large collection of non-coding RNA sequence data from public non-coding databases: H-invDB rel. 5.0 (1), FANTOM3 (2), miRBase 10.0 (10), NONCODE v1.0 (11), Rfam v8.1 (12), RNAdb v2.0 (13) and snoRNA-LBME-db rel. 3 (14). Because these databases contain many identical sequences, fRNAdb consolidates them into a set of unique sequences. Therefore, one fRNAdb sequence can have multiple accessions and multiple source organisms. A sequence can have one or more mapping loci in multiple genomes, gene associations derived from the mapping information, sequence similarity information relating it to other registered sequences, and reference information. All sequences are mapped to multiple genomes (humans, mice, rats and fruit flies) in order to determine potential loci and potential homologs. The mapping loci can be viewed in our UCSC GenomeBrowser for Functional RNA for visual inspection, with a number of tracks showing versatile genomic elements provided by the original UCSC Genome Browser and our additional tracks detailed in the next section. fRNAdb allows users to search the sequences through keywords associated with them. Various kinds of information are associated with a sequence, as shown in Figure 1. The keywords are extracted from the identifier, description text, accession, SO, source organism, cross reference information, associated gene names, title/abstract/author text of reference papers, genome/chromosome/cytoband and sequence length. Common English words that may hinder efficient keyword search are eliminated from the index using the English dictionary of the open source spell checker aspell (http://aspell.net/).
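The consolidation and keyword indexing described above can be illustrated with a minimal Python sketch. It is not the fRNAdb implementation: the record layout, the function names `consolidate` and `build_keyword_index`, the T-to-U normalization, the toy records and the tiny stop-word set (standing in for the aspell English dictionary) are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical stand-in for the aspell English word list mentioned in the text.
STOP_WORDS = {"the", "of", "and", "a", "in", "for"}


def normalize(seq):
    """Upper-case a sequence and write it in the RNA alphabet (assumed rule)."""
    return seq.upper().replace("T", "U")


def consolidate(records):
    """Merge records that share an identical (normalized) sequence.

    `records` is an iterable of (accession, organism, description, sequence)
    tuples. Returns a dict keyed by the normalized sequence, so one unique
    sequence can carry several accessions and several source organisms.
    """
    merged = defaultdict(
        lambda: {"accessions": set(), "organisms": set(), "descriptions": []}
    )
    for accession, organism, description, sequence in records:
        entry = merged[normalize(sequence)]
        entry["accessions"].add(accession)
        entry["organisms"].add(organism)
        entry["descriptions"].append(description)
    return dict(merged)


def build_keyword_index(merged):
    """Map each keyword that is not a stop word to its associated sequences."""
    index = defaultdict(set)
    for sequence, entry in merged.items():
        words = set()
        for text in entry["descriptions"]:
            words.update(text.lower().split())
        words.update(accession.lower() for accession in entry["accessions"])
        for word in words - STOP_WORDS:
            index[word].add(sequence)
    return dict(index)


# Toy records invented for this example; accessions are not real identifiers.
records = [
    ("ACC_A1", "Homo sapiens", "toy microRNA entry of the first source", "UGAGGUAGUAGGUUGUAUAGUU"),
    ("ACC_B7", "Mus musculus", "toy microRNA entry", "TGAGGTAGTAGGTTGTATAGTT"),
    ("ACC_C3", "Homo sapiens", "toy small nucleolar RNA", "GGCUAGCAUGAUGA"),
]
merged = consolidate(records)
index = build_keyword_index(merged)
```

In this sketch the first two records collapse into one entry with two accessions and two organisms, and a lookup such as `index["microrna"]` returns the set of unique sequences tagged with that keyword.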
Figure 1.

Diagram showing a registered sequence and its associations to other information.

Statistics of keywords associated with fRNAdb sequences can be browsed at the fRNAdb::Statistics page, where frequently used keywords corresponding to canonical terms in various ontology sets are presented. These statistics provide an overview of the entire non-coding RNA sequence collection from multiple aspects, using different ontologies such as SO, taxonomy and several ontologies of the Open Biomedical Ontologies (http://www.obofoundry.org/): human disease ontology and gene ontology (biological/molecular processes). fRNAdb also provides sequence homology search using Blastn (15). To improve usability, we divided our database into two parts: one contains sequences longer than 50 bases and the other contains sequences of 50 bases or shorter, since some users are not interested in small sequences, which include a large number of deep sequencing products. fRNAdb::Blast automatically adjusts some parameters according to the length of a query sequence in order to improve performance for short (<50 bases) query sequences. The adaptive parameters are gap opening/extension cost, E-value, and word size. All Blast parameters can be overridden by users. More details about fRNAdb are provided on the fRNAdb::Help page.
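The length-based split and the length-adaptive Blast parameters can be sketched as follows. This is a hedged illustration only: the text names the adaptive parameters (gap opening/extension cost, E-value, word size) and the 50-base boundary but not the values used, so the numbers below are placeholders, and `BlastParams`, `partition_by_length` and `params_for` are hypothetical names.

```python
from dataclasses import dataclass, replace

LENGTH_LIMIT = 50  # bases; the boundary mentioned in the text


@dataclass(frozen=True)
class BlastParams:
    gap_open: int
    gap_extend: int
    evalue: float
    word_size: int


# Placeholder values for illustration; the settings actually used by
# fRNAdb::Blast are not given in the text.
LONG_DEFAULTS = BlastParams(gap_open=5, gap_extend=2, evalue=10.0, word_size=11)
SHORT_DEFAULTS = BlastParams(gap_open=5, gap_extend=2, evalue=1000.0, word_size=7)


def partition_by_length(sequences, limit=LENGTH_LIMIT):
    """Split sequences into (longer than limit, limit or shorter)."""
    long_part = [s for s in sequences if len(s) > limit]
    short_part = [s for s in sequences if len(s) <= limit]
    return long_part, short_part


def params_for(query, **overrides):
    """Pick defaults from the query length, then apply user overrides."""
    base = SHORT_DEFAULTS if len(query) < LENGTH_LIMIT else LONG_DEFAULTS
    return replace(base, **overrides)


short_query = "UGAGGUAGUAGGUUGUAUAGUU"
default_params = params_for(short_query)
user_params = params_for(short_query, evalue=1e-3)
```

Keeping the defaults in immutable records and layering user overrides on top mirrors the behaviour described in the text, where every automatically chosen parameter can still be overridden.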

UCSC GENOME BROWSER FOR FUNCTIONAL RNA

This database is an extended mirror of the UCSC Genome Browser (16) hosting genomes of humans (hg17 and hg18), mice (mm9), rats (rn3) and fruit flies (dm3). This database has been updated extensively. There were 15 original tracks in the previous version (6). We re-organized our tracks and added more custom tracks. For hg18, our extension includes 26 essential tracks for the ncRNA Prediction and Mapping Tracks group, five essential tracks for the Misc. Genomic Element Tracks group, and five essential tracks for the miRNA-related Tracks group. Tracks for the whole human tiling array of Affymetrix Transfrags (17) are available (currently supported only on hg17). We have developed several tracks to support an improved presentation. For example, the miRNA Atlas (18) track can present the expression profile of multiple miRNAs residing inside the GenomeBrowser window (Figure 2). Another example is the tissue-specific enhancers and target loci (19) track. This track indicates an enhancer region with an orange box and its associated gene locus with a green bar, which is rendered in darker green when the locus is activated in more tissues. Yet another extension is given to the conservation track, which shows not only a multiple genome alignment but also predicted common RNA secondary structures. When a user clicks on the conservation track in a window showing a genomic region ≤500 bp, the prediction is performed dynamically on both strands. The browser then presents a predicted secondary structure, the minimum free energy and the number of base pairs for each strand. The estimated secondary structure is downloadable as PDF graphics and in Stockholm format, an alignment file format annotated with secondary structure. This file can be used to search a database for homologous secondary structures using the Infernal software package (http://infernal.janelia.org).
A complete listing and details of the extension tracks can be found on the Project Specific Custom Tracks page (http://www.ncrna.org/custom-tracks).
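As an illustration of how a downloaded Stockholm file relates to the reported number of base pairs, the following Python sketch reads the consensus structure annotation and counts its pairs. It assumes the structure is stored on `#=GC SS_cons` lines using matched bracket characters, which is the usual Stockholm convention; the function names and the two-sequence example alignment are invented for this sketch and do not come from the browser.

```python
# Closing bracket -> matching opening bracket (bracket types assumed here).
PAIRS = {")": "(", ">": "<", "]": "[", "}": "{"}
OPENERS = set(PAIRS.values())


def read_ss_cons(stockholm_text):
    """Concatenate all '#=GC SS_cons' annotation lines of an alignment."""
    parts = []
    for line in stockholm_text.splitlines():
        fields = line.split()
        if len(fields) == 3 and fields[0] == "#=GC" and fields[1] == "SS_cons":
            parts.append(fields[2])
    return "".join(parts)


def base_pairs(structure):
    """Return sorted (i, j) column pairs encoded by a bracket string."""
    stacks = {opener: [] for opener in OPENERS}
    pairs = []
    for i, ch in enumerate(structure):
        if ch in OPENERS:
            stacks[ch].append(i)
        elif ch in PAIRS:
            stack = stacks[PAIRS[ch]]
            if not stack:
                raise ValueError(f"unbalanced structure at column {i}")
            pairs.append((stack.pop(), i))
    if any(stacks.values()):
        raise ValueError("unbalanced structure: unclosed bracket")
    return sorted(pairs)


# Invented toy alignment, not output of the GenomeBrowser.
EXAMPLE = """# STOCKHOLM 1.0
seq1         GGGAAAUCCC
seq2         GGCAAAUGCC
#=GC SS_cons (((....)))
//
"""

structure = read_ss_cons(EXAMPLE)
pairs = base_pairs(structure)
```

For the toy alignment, `len(pairs)` gives the number of base pairs in the consensus structure; any character that is not one of the bracket types above is treated as unpaired.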
Figure 2.

Mammalian miRNA Expression Atlas track showing miR-302a/b/c/d highly expressed at 3p (A). The detailed page shows expression profiles for these miRNAs with a heat map and actual read numbers previously reported by (20) (B).

FUNDING

This work was supported by the Functional RNA Project funded by New Energy and Industrial Technology Development Organization (NEDO). Funding for open access charge: Japan Biological Informatics Consortium (JBIC). Conflict of interest statement. None declared.
  19 in total

1.  Basic local alignment search tool.

Authors:  S F Altschul; W Gish; W Miller; E W Myers; D J Lipman
Journal:  J Mol Biol       Date:  1990-10-05       Impact factor: 5.469

2.  Identification and expression analysis of putative mRNA-like non-coding RNA in Drosophila.

Authors:  Sachi Inagaki; Koji Numata; Takefumi Kondo; Masaru Tomita; Kunio Yasuda; Akio Kanai; Yuji Kageyama
Journal:  Genes Cells       Date:  2005-12       Impact factor: 1.891

3.  Finding noncoding RNA transcripts from low abundance expressed sequence tags.

Authors:  Chenghai Xue; Fei Li; Fei Li
Journal:  Cell Res       Date:  2008-06       Impact factor: 25.617

4.  Drosophila endogenous small RNAs bind to Argonaute 2 in somatic cells.

Authors:  Yoshinori Kawamura; Kuniaki Saito; Taishin Kin; Yukiteru Ono; Kiyoshi Asai; Takafumi Sunohara; Tomoko N Okada; Mikiko C Siomi; Haruhiko Siomi
Journal:  Nature       Date:  2008-05-07       Impact factor: 49.962

5.  An endogenous small interfering RNA pathway in Drosophila.

Authors:  Benjamin Czech; Colin D Malone; Rui Zhou; Alexander Stark; Catherine Schlingeheyde; Monica Dus; Norbert Perrimon; Manolis Kellis; James A Wohlschlegel; Ravi Sachidanandam; Gregory J Hannon; Julius Brennecke
Journal:  Nature       Date:  2008-05-07       Impact factor: 49.962

6.  The transcriptional landscape of the mammalian genome.

Authors:  P Carninci; T Kasukawa; S Katayama; J Gough; M C Frith; N Maeda; R Oyama; T Ravasi; B Lenhard; C Wells; R Kodzius; K Shimokawa; V B Bajic; S E Brenner; S Batalov; A R R Forrest; M Zavolan; M J Davis; L G Wilming; V Aidinis; J E Allen; A Ambesi-Impiombato; R Apweiler; R N Aturaliya; T L Bailey; M Bansal; L Baxter; K W Beisel; T Bersano; H Bono; A M Chalk; K P Chiu; V Choudhary; A Christoffels; D R Clutterbuck; M L Crowe; E Dalla; B P Dalrymple; B de Bono; G Della Gatta; D di Bernardo; T Down; P Engstrom; M Fagiolini; G Faulkner; C F Fletcher; T Fukushima; M Furuno; S Futaki; M Gariboldi; P Georgii-Hemming; T R Gingeras; T Gojobori; R E Green; S Gustincich; M Harbers; Y Hayashi; T K Hensch; N Hirokawa; D Hill; L Huminiecki; M Iacono; K Ikeo; A Iwama; T Ishikawa; M Jakt; A Kanapin; M Katoh; Y Kawasawa; J Kelso; H Kitamura; H Kitano; G Kollias; S P T Krishnan; A Kruger; S K Kummerfeld; I V Kurochkin; L F Lareau; D Lazarevic; L Lipovich; J Liu; S Liuni; S McWilliam; M Madan Babu; M Madera; L Marchionni; H Matsuda; S Matsuzawa; H Miki; F Mignone; S Miyake; K Morris; S Mottagui-Tabar; N Mulder; N Nakano; H Nakauchi; P Ng; R Nilsson; S Nishiguchi; S Nishikawa; F Nori; O Ohara; Y Okazaki; V Orlando; K C Pang; W J Pavan; G Pavesi; G Pesole; N Petrovsky; S Piazza; J Reed; J F Reid; B Z Ring; M Ringwald; B Rost; Y Ruan; S L Salzberg; A Sandelin; C Schneider; C Schönbach; K Sekiguchi; C A M Semple; S Seno; L Sessa; Y Sheng; Y Shibata; H Shimada; K Shimada; D Silva; B Sinclair; S Sperling; E Stupka; K Sugiura; R Sultana; Y Takenaka; K Taki; K Tammoja; S L Tan; S Tang; M S Taylor; J Tegner; S A Teichmann; H R Ueda; E van Nimwegen; R Verardo; C L Wei; K Yagi; H Yamanishi; E Zabarovsky; S Zhu; A Zimmer; W Hide; C Bult; S M Grimmond; R D Teasdale; E T Liu; V Brusic; J Quackenbush; C Wahlestedt; J S Mattick; D A Hume; C Kai; D Sasaki; Y Tomaru; S Fukuda; M Kanamori-Katayama; M Suzuki; J Aoki; T Arakawa; J Iida; K Imamura; M Itoh; T Kato; H Kawaji; N Kawagashira; T 
Kawashima; M Kojima; S Kondo; H Konno; K Nakano; N Ninomiya; T Nishio; M Okada; C Plessy; K Shibata; T Shiraki; S Suzuki; M Tagami; K Waki; A Watahiki; Y Okamura-Oho; H Suzuki; J Kawai; Y Hayashizaki
Journal:  Science       Date:  2005-09-02       Impact factor: 47.728

7.  snoRNA-LBME-db, a comprehensive database of human H/ACA and C/D box snoRNAs.

Authors:  Laurent Lestrade; Michel J Weber
Journal:  Nucleic Acids Res       Date:  2006-01-01       Impact factor: 16.971

8.  Rfam: annotating non-coding RNAs in complete genomes.

Authors:  Sam Griffiths-Jones; Simon Moxon; Mhairi Marshall; Ajay Khanna; Sean R Eddy; Alex Bateman
Journal:  Nucleic Acids Res       Date:  2005-01-01       Impact factor: 16.971

9.  NONCODE v2.0: decoding the non-coding.

Authors:  Shunmin He; Changning Liu; Geir Skogerbø; Haitao Zhao; Jie Wang; Tao Liu; Baoyan Bai; Yi Zhao; Runsheng Chen
Journal:  Nucleic Acids Res       Date:  2007-11-13       Impact factor: 16.971

10.  Integrative annotation of 21,037 human genes validated by full-length cDNA clones.

Authors:  Tadashi Imanishi; Takeshi Itoh; Yutaka Suzuki; Claire O'Donovan; Satoshi Fukuchi; Kanako O Koyanagi; Roberto A Barrero; Takuro Tamura; Yumi Yamaguchi-Kabata; Motohiko Tanino; Kei Yura; Satoru Miyazaki; Kazuho Ikeo; Keiichi Homma; Arek Kasprzyk; Tetsuo Nishikawa; Mika Hirakawa; Jean Thierry-Mieg; Danielle Thierry-Mieg; Jennifer Ashurst; Libin Jia; Mitsuteru Nakao; Michael A Thomas; Nicola Mulder; Youla Karavidopoulou; Lihua Jin; Sangsoo Kim; Tomohiro Yasuda; Boris Lenhard; Eric Eveno; Yoshiyuki Suzuki; Chisato Yamasaki; Jun-ichi Takeda; Craig Gough; Phillip Hilton; Yasuyuki Fujii; Hiroaki Sakai; Susumu Tanaka; Clara Amid; Matthew Bellgard; Maria de Fatima Bonaldo; Hidemasa Bono; Susan K Bromberg; Anthony J Brookes; Elspeth Bruford; Piero Carninci; Claude Chelala; Christine Couillault; Sandro J de Souza; Marie-Anne Debily; Marie-Dominique Devignes; Inna Dubchak; Toshinori Endo; Anne Estreicher; Eduardo Eyras; Kaoru Fukami-Kobayashi; Gopal R Gopinath; Esther Graudens; Yoonsoo Hahn; Michael Han; Ze-Guang Han; Kousuke Hanada; Hideki Hanaoka; Erimi Harada; Katsuyuki Hashimoto; Ursula Hinz; Momoki Hirai; Teruyoshi Hishiki; Ian Hopkinson; Sandrine Imbeaud; Hidetoshi Inoko; Alexander Kanapin; Yayoi Kaneko; Takeya Kasukawa; Janet Kelso; Paul Kersey; Reiko Kikuno; Kouichi Kimura; Bernhard Korn; Vladimir Kuryshev; Izabela Makalowska; Takashi Makino; Shuhei Mano; Regine Mariage-Samson; Jun Mashima; Hideo Matsuda; Hans-Werner Mewes; Shinsei Minoshima; Keiichi Nagai; Hideki Nagasaki; Naoki Nagata; Rajni Nigam; Osamu Ogasawara; Osamu Ohara; Masafumi Ohtsubo; Norihiro Okada; Toshihisa Okido; Satoshi Oota; Motonori Ota; Toshio Ota; Tetsuji Otsuki; Dominique Piatier-Tonneau; Annemarie Poustka; Shuang-Xi Ren; Naruya Saitou; Katsunaga Sakai; Shigetaka Sakamoto; Ryuichi Sakate; Ingo Schupp; Florence Servant; Stephen Sherry; Rie Shiba; Nobuyoshi Shimizu; Mary Shimoyama; Andrew J Simpson; Bento Soares; Charles Steward; Makiko Suwa; Mami Suzuki; Aiko Takahashi; Gen Tamiya; 
Hiroshi Tanaka; Todd Taylor; Joseph D Terwilliger; Per Unneberg; Vamsi Veeramachaneni; Shinya Watanabe; Laurens Wilming; Norikazu Yasuda; Hyang-Sook Yoo; Marvin Stodolsky; Wojciech Makalowski; Mitiko Go; Kenta Nakai; Toshihisa Takagi; Minoru Kanehisa; Yoshiyuki Sakaki; John Quackenbush; Yasushi Okazaki; Yoshihide Hayashizaki; Winston Hide; Ranajit Chakraborty; Ken Nishikawa; Hideaki Sugawara; Yoshio Tateno; Zhu Chen; Michio Oishi; Peter Tonellato; Rolf Apweiler; Kousaku Okubo; Lukas Wagner; Stefan Wiemann; Robert L Strausberg; Takao Isogai; Charles Auffray; Nobuo Nomura; Takashi Gojobori; Sumio Sugano
Journal:  PLoS Biol       Date:  2004-04-20       Impact factor: 8.029
