
NRED: a database of long noncoding RNA expression.

Marcel E Dinger, Ken C Pang, Tim R Mercer, Mark L Crowe, Sean M Grimmond, John S Mattick.

Abstract

In mammals, thousands of long non-protein-coding RNAs (ncRNAs) (>200 nt) have recently been described. However, the biological significance and function of the vast majority of these transcripts remain unclear. We have constructed a public repository, the Noncoding RNA Expression Database (NRED), which provides gene expression information for thousands of long ncRNAs in human and mouse. The database contains both microarray and in situ hybridization data, much of which is described here for the first time. NRED also supplies a rich tapestry of ancillary information for featured ncRNAs, including evolutionary conservation, secondary structure evidence, genomic context links and antisense relationships. The database is available at http://jsm-research.imb.uq.edu.au/NRED, and the web interface enables both advanced searches and data downloads. Taken together, NRED should significantly advance the study and understanding of long ncRNAs, and provides a timely and valuable resource to the scientific community.

Year: 2008; PMID: 18829717; PMCID: PMC2686506; DOI: 10.1093/nar/gkn617

Source DB: PubMed; Journal: Nucleic Acids Res; ISSN: 0305-1048; Impact factor: 16.971


INTRODUCTION

Non-protein-coding RNAs (ncRNAs) are currently the subject of intense research activity. Just a decade ago, known ncRNAs were restricted to a small number of housekeeping RNAs (including ribosomal RNAs, transfer RNAs and spliceosomal RNAs) and an even more limited collection of regulatory RNAs, such as lin-4 in Caenorhabditis elegans (1) and H19 and Xist in mammals (2,3). Since then, discovery of novel ncRNAs has increased dramatically. Thousands of short ncRNAs have been identified, and various classes (including microRNAs, endogenous short interfering RNAs, PIWI-interacting RNAs and small nucleolar RNAs) can now be readily distinguished on the basis of length, biogenesis, structural/sequence features and function (4,5). Large numbers of long ncRNAs (>200 nt) have also been discovered using full-length cDNA cloning/sequencing and genomic tiling array technologies to comprehensively profile the transcriptome (6–9). In the mouse genome, for instance, long ncRNAs are estimated to number ∼30 000 (7,10), and in the human genome the majority of transcription occurs as long ncRNAs (9).
In recent years, long ncRNAs have been implicated in a variety of regulatory processes, ranging from X chromosome inactivation, genomic imprinting and chromatin modification to transcriptional activation, transcriptional interference and nuclear trafficking (11,12). The exact mechanisms by which these long ncRNAs exert their effects remain unclear. Nevertheless, it has become apparent that long ncRNAs can act both in cis (13) and in trans (14), and that some function as precursors for short ncRNAs (9,15–17), while others act independently as long transcripts. Despite this recent progress, the function of the vast majority of long ncRNAs remains unknown. Indeed, doubts have been raised as to whether these remaining transcripts are functional at all (18). Certainly, long ncRNAs lack discernible features to facilitate categorization and functional prediction.
And yet, there are several reasons to believe that many of these long ncRNAs are likely to be functional. First, their expression is often tissue- and/or cell-specific and localized to specific sub-cellular compartments (19–21), which suggests they are regulated and biologically significant. Second, as mentioned earlier, there are already numerous precedents of long ncRNAs having function, and the number of examples will continue to grow as research in this fledgling area continues. Finally, Willingham and colleagues (22) recently screened several hundred novel long ncRNAs for function in a limited battery of cell-based assays and successfully identified multiple functional ncRNAs, which highlights the untapped functional potential of these transcripts. To begin to explore the function of the thousands of remaining novel long ncRNAs, we have recently undertaken a range of large-scale expression analyses of long ncRNAs. First, using in situ hybridization (ISH) data from the Allen Brain Atlas (ABA) (23), we identified >800 long ncRNAs that are expressed in the adult mouse brain, the majority of which were associated with specific anatomical regions, cell types or subcellular compartments (20). Second, we found that >900 long ncRNAs were expressed during mouse embryonic stem (ES) cell differentiation using a custom-designed oligonucleotide microarray, and subsequently showed that some of these ncRNAs appear to have a role in the epigenetic regulation of differentiation (21). Using the same custom array platform, we have also profiled the expression of several thousand long mouse ncRNAs during immune cell activation, neural stem cell differentiation, myoblast differentiation and gonadal ridge development. Finally, we have identified organ- and cell-specific expression data for large numbers of long ncRNAs from both human and mouse, using publicly available data from the Genomics Institute of the Novartis Research Foundation (GNF) (24). 
In this report, we introduce the Noncoding RNA Expression Database (NRED). The database is available at http://jsm-research.imb.uq.edu.au/NRED, and its primary aim is to provide a dedicated resource for expression data on long ncRNAs. At this stage, NRED brings together each of the datasets described above, with more expected to follow in the near future. Short RNAs are already well catered for by a range of other resources (25–27), and are not directly featured in this database. As well as providing detailed expression data, NRED enables researchers to characterize and select long ncRNAs based on various bioinformatic criteria, including predicted secondary structure, evolutionary conservation, and genomic context. In this way, NRED sheds light on a vast and largely unexplored territory of the mammalian transcriptome, and should stimulate and guide future functional studies of long ncRNAs.

DATABASE CONTENT

NRED currently features multiple datasets based on three different experimental platforms (Table 1), each of which is described subsequently.
Table 1. Summary of NRED datasets

Dataset                        Organism    Number of noncoding probes (a)
Custom noncoding microarray    Mouse       4926
GNF SymAtlas                   Human       1287
GNF SymAtlas                   Mouse       5692
Allen Brain Atlas              Mouse       1308

(a) Probes that exclusively target ncRNAs were identified using a previously described classification pipeline (20) (see Supplementary Materials), and numbers reflect the classification as at 24 September 2008.

Custom ncRNA microarray

We designed a custom microarray that contained probes uniquely targeting 9225 protein-coding transcripts and 4926 noncoding transcripts from mouse (Supplementary Material 1). The array was interrogated with RNA samples from a range of experimental systems (Supplementary Material 1). These included: (i) ES cell differentiation over a 16-day time course; (ii) macrophage activation in response to lipopolysaccharide; (iii) CD8+ T-cell differentiation and activation; (iv) neural stem cell (NSC) differentiation; (v) C2C12 myoblast differentiation; and (vi) testis and ovary development. The results of our profiling experiments during ES cell differentiation have been recently reported (21), and demonstrate the utility of our custom microarrays in facilitating in-depth functional study of long ncRNAs. Across the six experimental systems currently featured in NRED, a total of 2913 ncRNAs were expressed above background levels (Supplementary Material 1). Of these, 1475 were differentially expressed in at least one setting (B-statistic >3).
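The B-statistic cut-off can be made more concrete with a short worked example. The sketch below assumes that the B-statistic is the natural-log posterior odds of differential expression, which is its usual meaning in empirical Bayes microarray analysis; the text itself does not define it. Under that assumption, B > 3 corresponds to a posterior probability of roughly 0.95. The function names and the example values are hypothetical.

```python
import math


def b_to_probability(b: float) -> float:
    """Convert a log-odds B-statistic into a posterior probability.

    Assumes B = ln(p / (1 - p)), so p = 1 / (1 + exp(-B)).
    """
    return 1.0 / (1.0 + math.exp(-b))


def differentially_expressed(b_values, threshold: float = 3.0) -> bool:
    """True if the probe exceeds the threshold in at least one setting."""
    return any(b > threshold for b in b_values)


# Hypothetical B-statistics for one probe in three experimental settings.
probe_b = [-1.2, 0.4, 3.6]

print(round(b_to_probability(3.0), 3))    # 0.953
print(differentially_expressed(probe_b))  # True
```

The "at least one setting" rule is what makes a probe count towards the 1475 differentially expressed ncRNAs; a stricter rule (for example, every setting) would be a different analysis.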

GNF SymAtlas

The GNF previously compiled a large-scale atlas of mammalian gene expression using custom-designed whole-genome gene expression arrays (24). This resource utilized RNAs from 79 human and 61 mouse tissues, and featured the expression of 44 775 human and 36 182 mouse transcripts. We downloaded this publicly available dataset for further analysis (http://symatlas.gnf.org/). Although the probe set used by GNF was originally designed to target the protein-coding transcriptome, we found that 1287 human and 5692 mouse probes uniquely recognized long ncRNAs (Supplementary Material 2). Of these, 733 and 3403 were expressed in human and mouse, respectively.

Allen Brain Atlas

The ABA provides a comprehensive catalogue of gene expression within the adult mouse brain (23). Data were generated using automated high-throughput ISH techniques, and advanced image-based informatics methods enabled automated quantification and mapping of expression information. Through its web interface (http://www.brain-map.org), the atlas permits high-resolution visualization of the expression of ∼20 000 protein-coding transcripts and comprehensive data mining. We downloaded this publicly available dataset for further analysis, and discovered that the ABA also contains ISH data for 1308 ncRNAs (Supplementary Material 2). Of these, 849 are expressed in mouse brain, the majority of which are associated with specific neuroanatomical regions, cell types and/or subcellular compartments (20).
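As a quick consistency check on the counts reported in this section, the fraction of noncoding probes with detectable expression can be computed per dataset. This is a minimal sketch that uses only the numbers stated above; the dictionary layout and function name are illustrative.

```python
# (expressed, total) noncoding counts as reported in the text above.
DATASETS = {
    "Custom microarray (mouse)": (2913, 4926),
    "GNF SymAtlas (human)": (733, 1287),
    "GNF SymAtlas (mouse)": (3403, 5692),
    "Allen Brain Atlas (mouse)": (849, 1308),
}


def expressed_fraction(expressed: int, total: int) -> float:
    """Fraction of noncoding probes or targets with detected expression."""
    return expressed / total


for name, (expressed, total) in DATASETS.items():
    fraction = expressed_fraction(expressed, total)
    print(f"{name}: {expressed}/{total} = {fraction:.1%}")
```

On all three platforms the fraction falls between roughly 57% and 65%, although the definition of "expressed" differs between the microarray and ISH datasets, so the values are not strictly comparable.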

DATABASE ACCESS

Implementation

NRED is available at http://jsm-research.imb.uq.edu.au/NRED. Datasets are stored in relational form in a MySQL database. The web application is implemented in Perl 5, with rich client functionality provided via AJAX and other dynamic HTML procedures. Context-sensitive documentation is provided via jQuery, which allows the user to obtain help on almost any function by hovering the mouse over the relevant item on the website. Results tables can be sorted by any field in real time by clicking on the column headings.
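To illustrate what storing expression data "in relational form" might look like, the following sketch builds a two-table layout and runs a sorted query against it. The actual NRED schema is not described in this article, so every table name, column name and value here is hypothetical, and Python's built-in sqlite3 module stands in for the MySQL server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # in-memory stand-in for the MySQL server
conn.executescript("""
CREATE TABLE probe (
    probe_id       INTEGER PRIMARY KEY,
    platform       TEXT NOT NULL,
    target_name    TEXT NOT NULL,
    classification TEXT NOT NULL       -- 'noncoding' or 'coding'
);
CREATE TABLE expression (
    probe_id    INTEGER NOT NULL REFERENCES probe(probe_id),
    setting     TEXT NOT NULL,         -- experimental system
    m_value     REAL,                  -- fold change
    b_statistic REAL
);
""")

# Invented rows, for illustration only.
conn.executemany("INSERT INTO probe VALUES (?, ?, ?, ?)", [
    (1, "custom_array", "ncRNA_A", "noncoding"),
    (2, "custom_array", "ncRNA_B", "noncoding"),
    (3, "custom_array", "gene_C", "coding"),
])
conn.executemany("INSERT INTO expression VALUES (?, ?, ?, ?)", [
    (1, "ES differentiation", 1.8, 4.2),
    (2, "ES differentiation", -0.3, 0.5),
    (2, "macrophage activation", 2.1, 5.0),
    (3, "ES differentiation", 2.5, 6.0),
])

SORTABLE = {"b_statistic", "m_value", "target_name"}


def noncoding_hits(conn, platform, min_b, order_by="b_statistic"):
    """Noncoding probes above a B-statistic threshold, sorted by one column."""
    if order_by not in SORTABLE:  # whitelist, since column names cannot be bound
        raise ValueError(f"cannot sort by {order_by!r}")
    sql = f"""
        SELECT p.target_name, e.setting, e.m_value, e.b_statistic
        FROM probe AS p
        JOIN expression AS e ON e.probe_id = p.probe_id
        WHERE p.platform = ? AND p.classification = 'noncoding'
          AND e.b_statistic > ?
        ORDER BY {order_by} DESC
    """
    return conn.execute(sql, (platform, min_b)).fetchall()


for row in noncoding_hits(conn, "custom_array", 3.0):
    print(row)
```

Separating probe annotation from per-setting measurements lets one probe carry any number of expression records, which suits a database that expects further datasets to be added.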

Query interface

NRED can be queried in various ways via the web interface (Figure 1).
Figure 1. NRED user interface.

To examine the expression of individual ncRNAs, gene-centric searches can be performed across each of the experimental platforms using the ‘Probe Search Term’ field. For example, queries based on gene name (e.g. ‘Xist’, ‘Air’) or a unique gene identifier (e.g. Genbank accessions, MGI identifiers and UniGene Cluster identifiers) can be used to readily display expression data for a given ncRNA of interest.
To identify ncRNAs that are expressed in a particular organ/region/cell type of interest or under particular conditions, an experimental platform must first be selected (e.g. ‘Allen Brain Atlas’). This brings up a series of platform-dependent menus, from which a user can then choose a relevant expression sub-system if desired (e.g. ‘Cerebellum’). Then, to restrict the query to those probes that exclusively recognize ncRNAs, one must specify ‘Noncoding only’ under the Target Classification menu, since the probes contained within the NRED datasets include those that recognize protein-coding transcripts as well.
The two basic query strategies described above (gene- and platform-centric searches) can be refined further by applying various filters. Expression-based filters permit searches to be modified based upon various statistics, such as significance thresholds (e.g. P-values, B-statistics, q-values), fold change (M-values) and expression intensity (e.g. A-values, Affymetrix Present/Absent calls). In this way, users can select their own criteria by which differentially expressed transcripts are identified. A series of other filters can also be applied based on information related to the probe target itself. For example, probes can be selected depending upon whether their targets are spliced or unspliced. Similarly, users can filter search results based on whether target ncRNAs show evidence of evolutionary conservation or predicted secondary structure using the PhastCons and RNAz tools, respectively (28,29) (Supplementary Material 3).
In addition, we have previously developed a method for classifying the genomic context of target ncRNAs (20) (Supplementary Material 4). Using this information, probes can also be filtered depending on whether they map in a sense, cis-antisense and/or bi-directional orientation to other transcripts (including protein-coding transcripts, miRNAs, snoRNAs or other ncRNAs).
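The way these filters combine can be sketched in a few lines. The example below is not the NRED implementation; it is a minimal in-memory model, with hypothetical field names and invented records, of a query in which each filter that the user leaves unset is simply not applied.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class ProbeResult:
    target: str
    classification: str       # "noncoding" or "coding"
    spliced: bool
    conserved: bool           # evidence of conservation (e.g. PhastCons)
    structured: bool          # predicted secondary structure (e.g. RNAz)
    context: Optional[str]    # "sense", "cis-antisense", "bidirectional" or None
    b_statistic: float
    m_value: float


def search(
    probes: Iterable[ProbeResult],
    noncoding_only: bool = True,
    min_b: Optional[float] = None,
    min_abs_m: Optional[float] = None,
    spliced: Optional[bool] = None,
    conserved: Optional[bool] = None,
    context: Optional[str] = None,
) -> List[ProbeResult]:
    """Return the probes that pass every filter that was supplied.

    A filter left as None is not applied.
    """
    hits = []
    for p in probes:
        if noncoding_only and p.classification != "noncoding":
            continue
        if min_b is not None and p.b_statistic <= min_b:
            continue
        if min_abs_m is not None and abs(p.m_value) < min_abs_m:
            continue
        if spliced is not None and p.spliced != spliced:
            continue
        if conserved is not None and p.conserved != conserved:
            continue
        if context is not None and p.context != context:
            continue
        hits.append(p)
    return hits


# Hypothetical records; all values are invented for illustration.
PROBES = [
    ProbeResult("ncRNA_A", "noncoding", True, True, False, "cis-antisense", 4.1, 1.6),
    ProbeResult("ncRNA_B", "noncoding", False, False, True, "bidirectional", 3.4, -2.2),
    ProbeResult("ncRNA_C", "noncoding", True, True, True, None, 0.7, 0.2),
    ProbeResult("gene_D", "coding", True, True, False, "sense", 5.9, 2.8),
]

for hit in search(PROBES, min_b=3.0, context="cis-antisense"):
    print(hit.target)  # ncRNA_A
```

Defaulting `noncoding_only` to True mirrors the ‘Noncoding only’ setting described above, since the underlying datasets also contain probes for protein-coding transcripts.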

Data output

Query results are probe-centric, and can be customised to include any number of associated data fields using a simple format output menu (Figure 1). Thus, for any given probe, users can opt to display unique probe target identifiers (e.g. Genbank accession), selected expression data (e.g. B-statistics, M-values, etc.), overlapping sense and antisense transcript information, RNAz predictions and PhastCons data, to name just a few. Results can be displayed in several output formats. The default is to view the results as an online table, but users have the alternative option of obtaining information as a downloadable, tab-delimited text file. Finally, to enable users to use the search results in downstream applications [e.g. via the UCSC Genome Browser (30)], probe data can also be downloaded as individual .bed files.
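For readers unfamiliar with the .bed format, the sketch below shows how a probe locus could be written as a six-column BED line of the kind accepted by the UCSC Genome Browser. BED coordinates are 0-based and half-open, so a conversion from 1-based inclusive coordinates is included. The exact columns present in NRED's downloads are not specified in this article; the probe name and coordinates here are invented.

```python
from typing import NamedTuple


class ProbeLocus(NamedTuple):
    chrom: str
    start: int    # 0-based, inclusive
    end: int      # 0-based, exclusive
    name: str
    strand: str   # "+", "-" or "."


def from_one_based(chrom: str, first: int, last: int, name: str, strand: str) -> ProbeLocus:
    """Convert 1-based inclusive coordinates to BED's 0-based half-open form."""
    return ProbeLocus(chrom, first - 1, last, name, strand)


def to_bed_line(locus: ProbeLocus, score: int = 0) -> str:
    """Format a locus as a tab-delimited BED6 line (no trailing newline)."""
    if locus.start < 0 or locus.end <= locus.start:
        raise ValueError("BED requires 0 <= start < end")
    if locus.strand not in {"+", "-", "."}:
        raise ValueError("strand must be '+', '-' or '.'")
    fields = [locus.chrom, str(locus.start), str(locus.end),
              locus.name, str(score), locus.strand]
    return "\t".join(fields)


# Hypothetical probe covering bases 101-160 (1-based) on the minus strand.
locus = from_one_based("chrX", 101, 160, "probe_demo", "-")
print(to_bed_line(locus))
```

Because the end coordinate is exclusive, the feature length is simply `end - start` (60 bases in this example).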

FUTURE DIRECTIONS

We have recently designed and manufactured second-generation custom ncRNA microarrays. These new arrays will profile 12 000 and 16 000 ncRNAs in mouse and human, respectively. As expression results become available using this new platform, we will update NRED accordingly. Submission of other publicly available expression datasets that might be suitable for NRED is also invited, and should be sent to m.dinger@imb.uq.edu.au.

CITING NRED

To reference NRED, please cite this article. When referring to specific data from the database, the following format is suggested: ‘These data were retrieved from NRED, Institute for Molecular Bioscience, Brisbane, Australia (http://jsm-research.imb.uq.edu.au/NRED) [Date when you retrieved the data.]’.

SUPPLEMENTARY DATA

Supplementary Data are available at NAR Online.

FUNDING

National Health & Medical Research Council (to K.C.P.); the Foundation for Research, Science and Technology, New Zealand (to M.E.D.); the Australian Research Council, the Queensland State Government and the University of Queensland (to J.S.M.). Funding for open access charge: The University of Queensland.

Conflict of interest statement. None declared.
  30 in total

1.  Mapping of conserved RNA secondary structures predicts thousands of functional noncoding RNAs in the human genome.

Authors:  Stefan Washietl; Ivo L Hofacker; Melanie Lukasser; Alexander Hüttenhofer; Peter F Stadler
Journal:  Nat Biotechnol       Date:  2005-11       Impact factor: 54.908

2.  A strategy for probing the function of noncoding RNAs finds a repressor of NFAT.

Authors:  A T Willingham; A P Orth; S Batalov; E C Peters; B G Wen; P Aza-Blanc; J B Hogenesch; P G Schultz
Journal:  Science       Date:  2005-09-02       Impact factor: 47.728

3.  Experimental validation of the regulated expression of large numbers of non-coding RNAs from the mouse genome.

Authors:  Timothy Ravasi; Harukazu Suzuki; Ken C Pang; Shintaro Katayama; Masaaki Furuno; Rie Okunishi; Shiro Fukuda; Kelin Ru; Martin C Frith; M Milena Gongora; Sean M Grimmond; David A Hume; Yoshihide Hayashizaki; John S Mattick
Journal:  Genome Res       Date:  2005-12-12       Impact factor: 9.043

4.  Genome-wide atlas of gene expression in the adult mouse brain.

Authors:  Ed S Lein; Michael J Hawrylycz; Nancy Ao; Mikael Ayres; Amy Bensinger; Amy Bernard; Andrew F Boe; Mark S Boguski; Kevin S Brockway; Emi J Byrnes; Lin Chen; Li Chen; Tsuey-Ming Chen; Mei Chi Chin; Jimmy Chong; Brian E Crook; Aneta Czaplinska; Chinh N Dang; Suvro Datta; Nick R Dee; Aimee L Desaki; Tsega Desta; Ellen Diep; Tim A Dolbeare; Matthew J Donelan; Hong-Wei Dong; Jennifer G Dougherty; Ben J Duncan; Amanda J Ebbert; Gregor Eichele; Lili K Estin; Casey Faber; Benjamin A Facer; Rick Fields; Shanna R Fischer; Tim P Fliss; Cliff Frensley; Sabrina N Gates; Katie J Glattfelder; Kevin R Halverson; Matthew R Hart; John G Hohmann; Maureen P Howell; Darren P Jeung; Rebecca A Johnson; Patrick T Karr; Reena Kawal; Jolene M Kidney; Rachel H Knapik; Chihchau L Kuan; James H Lake; Annabel R Laramee; Kirk D Larsen; Christopher Lau; Tracy A Lemon; Agnes J Liang; Ying Liu; Lon T Luong; Jesse Michaels; Judith J Morgan; Rebecca J Morgan; Marty T Mortrud; Nerick F Mosqueda; Lydia L Ng; Randy Ng; Geralyn J Orta; Caroline C Overly; Tu H Pak; Sheana E Parry; Sayan D Pathak; Owen C Pearson; Ralph B Puchalski; Zackery L Riley; Hannah R Rockett; Stephen A Rowland; Joshua J Royall; Marcos J Ruiz; Nadia R Sarno; Katherine Schaffnit; Nadiya V Shapovalova; Taz Sivisay; Clifford R Slaughterbeck; Simon C Smith; Kimberly A Smith; Bryan I Smith; Andy J Sodt; Nick N Stewart; Kenda-Ruth Stumpf; Susan M Sunkin; Madhavi Sutram; Angelene Tam; Carey D Teemer; Christina Thaller; Carol L Thompson; Lee R Varnam; Axel Visel; Ray M Whitlock; Paul E Wohnoutka; Crissa K Wolkey; Victoria Y Wong; Matthew Wood; Murat B Yaylaoglu; Rob C Young; Brian L Youngstrom; Xu Feng Yuan; Bin Zhang; Theresa A Zwingman; Allan R Jones
Journal:  Nature       Date:  2006-12-06       Impact factor: 49.962

Review 5.  Eukaryotic regulatory RNAs: an answer to the 'genome complexity' conundrum.

Authors:  Kannanganattu V Prasanth; David L Spector
Journal:  Genes Dev       Date:  2007-01-01       Impact factor: 11.361

6.  Evolutionarily conserved elements in vertebrate, insect, worm, and yeast genomes.

Authors:  Adam Siepel; Gill Bejerano; Jakob S Pedersen; Angie S Hinrichs; Minmei Hou; Kate Rosenbloom; Hiram Clawson; John Spieth; Ladeana W Hillier; Stephen Richards; George M Weinstock; Richard K Wilson; Richard A Gibbs; W James Kent; Webb Miller; David Haussler
Journal:  Genome Res       Date:  2005-07-15       Impact factor: 9.043

7.  A mammalian gene with introns instead of exons generating stable RNA products.

Authors:  K T Tycowski; M D Shu; J A Steitz
Journal:  Nature       Date:  1996-02-01       Impact factor: 49.962

8.  The transcriptional landscape of the mammalian genome.

Authors:  P Carninci; T Kasukawa; S Katayama; J Gough; M C Frith; N Maeda; R Oyama; T Ravasi; B Lenhard; C Wells; R Kodzius; K Shimokawa; V B Bajic; S E Brenner; S Batalov; A R R Forrest; M Zavolan; M J Davis; L G Wilming; V Aidinis; J E Allen; A Ambesi-Impiombato; R Apweiler; R N Aturaliya; T L Bailey; M Bansal; L Baxter; K W Beisel; T Bersano; H Bono; A M Chalk; K P Chiu; V Choudhary; A Christoffels; D R Clutterbuck; M L Crowe; E Dalla; B P Dalrymple; B de Bono; G Della Gatta; D di Bernardo; T Down; P Engstrom; M Fagiolini; G Faulkner; C F Fletcher; T Fukushima; M Furuno; S Futaki; M Gariboldi; P Georgii-Hemming; T R Gingeras; T Gojobori; R E Green; S Gustincich; M Harbers; Y Hayashi; T K Hensch; N Hirokawa; D Hill; L Huminiecki; M Iacono; K Ikeo; A Iwama; T Ishikawa; M Jakt; A Kanapin; M Katoh; Y Kawasawa; J Kelso; H Kitamura; H Kitano; G Kollias; S P T Krishnan; A Kruger; S K Kummerfeld; I V Kurochkin; L F Lareau; D Lazarevic; L Lipovich; J Liu; S Liuni; S McWilliam; M Madan Babu; M Madera; L Marchionni; H Matsuda; S Matsuzawa; H Miki; F Mignone; S Miyake; K Morris; S Mottagui-Tabar; N Mulder; N Nakano; H Nakauchi; P Ng; R Nilsson; S Nishiguchi; S Nishikawa; F Nori; O Ohara; Y Okazaki; V Orlando; K C Pang; W J Pavan; G Pavesi; G Pesole; N Petrovsky; S Piazza; J Reed; J F Reid; B Z Ring; M Ringwald; B Rost; Y Ruan; S L Salzberg; A Sandelin; C Schneider; C Schönbach; K Sekiguchi; C A M Semple; S Seno; L Sessa; Y Sheng; Y Shibata; H Shimada; K Shimada; D Silva; B Sinclair; S Sperling; E Stupka; K Sugiura; R Sultana; Y Takenaka; K Taki; K Tammoja; S L Tan; S Tang; M S Taylor; J Tegner; S A Teichmann; H R Ueda; E van Nimwegen; R Verardo; C L Wei; K Yagi; H Yamanishi; E Zabarovsky; S Zhu; A Zimmer; W Hide; C Bult; S M Grimmond; R D Teasdale; E T Liu; V Brusic; J Quackenbush; C Wahlestedt; J S Mattick; D A Hume; C Kai; D Sasaki; Y Tomaru; S Fukuda; M Kanamori-Katayama; M Suzuki; J Aoki; T Arakawa; J Iida; K Imamura; M Itoh; T Kato; H Kawaji; N Kawagashira; T Kawashima; M Kojima; S Kondo; H Konno; K Nakano; N Ninomiya; T Nishio; M Okada; C Plessy; K Shibata; T Shiraki; S Suzuki; M Tagami; K Waki; A Watahiki; Y Okamura-Oho; H Suzuki; J Kawai; Y Hayashizaki
Journal:  Science       Date:  2005-09-02       Impact factor: 47.728

9.  Argonaute--a database for gene regulation by mammalian microRNAs.

Authors:  Priyanka Shahi; Serguei Loukianiouk; Andreas Bohne-Lang; Marc Kenzelmann; Stefan Küffer; Sabine Maertens; Roland Eils; Herrmann-Josef Gröne; Norbert Gretz; Benedikt Brors
Journal:  Nucleic Acids Res       Date:  2006-01-01       Impact factor: 16.971

10.  Distinguishing protein-coding from non-coding RNAs through support vector machines.

Authors:  Jinfeng Liu; Julian Gough; Burkhard Rost
Journal:  PLoS Genet       Date:  2006-04-28       Impact factor: 5.917
