
Noncoding RNAs database (ncRNAdb).

Maciej Szymanski, Volker A Erdmann, Jan Barciszewski.

Abstract

The noncoding RNA database (ncRNAdb) was created as a source of information on RNA molecules that do not possess protein-coding capacity. It is now widely accepted that, in addition to constitutively expressed housekeeping or infrastructural RNAs, there is a wide variety of RNAs participating in mechanisms that regulate gene expression at all levels of transmission of genetic information from DNA to proteins. The activities of noncoding RNAs include chromatin structure remodeling, transcriptional and translational regulation of gene expression, modulation of protein function and regulation of the subcellular distribution of RNAs as well as proteins. Noncoding transcripts have been identified in organisms belonging to all domains of life. Currently, the ncRNAdb contains >30,000 ncRNA sequences from Eukaryotes, Eubacteria and Archaea, but does not include housekeeping transcripts or the microRNAs and snoRNAs for which more specialized databases are available. The contents of the database can be accessed via the WWW at http://biobases.ibch.poznan.pl/ncRNA/.

Year:  2006        PMID: 17169980      PMCID: PMC1781251          DOI: 10.1093/nar/gkl994

Source DB:  PubMed          Journal:  Nucleic Acids Res        ISSN: 0305-1048            Impact factor:   16.971


INTRODUCTION

In recent years, we have witnessed a growing interest in the involvement of RNA molecules in controlling gene expression. Numerous studies have demonstrated that regulatory noncoding or non-protein-coding RNAs (ncRNAs or npcRNAs) play as important a role as protein transcription factors in determining the repertoire of expressed genes in virtually all living organisms. The significance of ncRNAs is evident from analyses of the protein-coding capacity of sequenced genomes. The contribution of protein-coding regions decreases as the complexity of organisms increases. In bacteria, unicellular eukaryotes and invertebrates, the coding sequences constitute ∼95, 30 and 20% of the genomic DNA, respectively. In mammals, the open reading frames account for only ∼1.5–2% of the genomes (1,2). Similar proportions of coding and noncoding regions are observed within the sequences which are actually transcribed (3). There are also very few differences between the mammalian proteomes, which suggests that the diversity observed at the phenotypic level is determined not by different sets of proteins, but rather by the programs which govern their expression (4). Support for this view came recently from the discovery of the brain-specific HAR1F RNA expressed during cerebral cortex development. Accelerated evolution of the HAR1F-encoding region may have contributed to the evolution of the human brain (5).

Regulatory ncRNAs have been identified in all domains of life and have been shown to be involved in numerous mechanisms controlling expression of genes at all levels of transmission of genetic information from DNA to proteins (6). These mechanisms include epigenetic modification of chromatin structure (methylation and modification of histones), regulation of transcription by modulation of the activity of RNA polymerase and transcription factors, RNA modification, mRNA stability and translation. Unlike the infrastructural ncRNAs (e.g. tRNAs, rRNAs, snRNAs and snoRNAs), regulatory RNAs are not transcribed constitutively in all cells. In many cases, their expression depends on the cell or tissue type or developmental stage, or is controlled by epigenetic factors or environmental conditions (e.g. hormones and stress). Specific changes in the expression of particular ncRNAs have also been linked to human neurobehavioral and developmental disorders and cancer (7).

Initially, virtually all ncRNAs were discovered by chance and there were no systematic approaches to identify the whole contents of the transcriptome. At the time of the publication of the first ncRNA database in 1999 (8), only a handful of ncRNAs were known, and they were regarded as curiosities. Since then, there has been a steady growth of reports on the identification of new noncoding transcripts. In recent years, a large number of ncRNA sequences have been determined in large scale sequencing projects of cDNAs (9,10) and small cytoplasmic RNAs (11). There has also been a growing number of reports describing novel methods for computational identification of ncRNA-encoding genes (12,13). Despite the advances in identification of new ncRNAs in both prokaryotes and eukaryotes, our knowledge of the spectrum of their activities is rather limited. Relatively few noncoding transcripts have been at least partially characterized in terms of function or expression. This growth of interest in RNA biology has motivated several databases published in recent years (e.g. 14,15).

THE DATABASE

The purpose of the database is to provide and organize information concerning regulatory noncoding transcripts from all groups of organisms. In addition to the RNAs for which regulatory activities have been documented, the database also contains sequences of ncRNAs which are known to be expressed but whose role in the cell is still unknown. In contrast to other ncRNA databases that have appeared in recent years, ncRNAdb does not focus on any particular class of transcripts or taxonomic group. The RNAs included in the database have been demonstrated or are suspected to function without being translated into proteins, and they are not constitutively expressed housekeeping transcripts (e.g. tRNAs, rRNAs and snRNAs). As of August 2006, the noncoding regulatory RNA database includes over 30 000 sequences from 99 organisms. The significant growth in the amount of data compared with the previous edition published in 2003, which contained 300 sequences (16), is primarily due to the systematic identification of noncoding transcripts in mammals by means of large scale cDNA sequencing (8,9). These sequences account for over 90% of the data. On the other hand, certain groups of RNAs (microRNAs and snoRNAs), which were present in previous editions, were removed from our service to avoid redundancy with other specialized databases.

DATA SOURCES

The primary source of the sequences included in the database was GenBank (17) records. Human and mouse ncRNA sequences are in part derived from the H-Invitational (18) and FANTOM3 (8) full length cDNA databases, respectively. Computationally identified small cytoplasmic, bacterial RNAs which are not annotated in the genomic sequences were derived from Rfam, the database of RNA families (15). Since many of the primary transcripts of eukaryotic ncRNAs are subject to alternative splicing, the various splicing variants derived from the same gene are presented as separate entries. For ncRNAs for which there are no individual GenBank records, predicted transcripts were extracted from genomic sequences using information provided in the feature tables or based on multiple sequence alignments. Apart from the sequences, the entries contain supplementary information (when available) on experimentally verified activities, expression patterns and chromosomal localization. The annotations and genome mapping information for the sequences rely on data provided in the original GenBank records, the H-Invitational Database of Annotated Human Genes (release 3.4, July 2006), the FANTOM3 Database (March 2006 release) and the UCSC Genome Browser Database (19). Most of the entries for eukaryotic transcripts are linked to the UCSC Genome Browser (20).
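
The extraction of a predicted transcript from a genomic sequence using feature-table coordinates can be illustrated with a minimal Python sketch. It is not the procedure used to build ncRNAdb, which the text does not describe in detail. The function names are hypothetical, and the sketch assumes that exon coordinates are 1-based and inclusive, as in GenBank feature tables, and that a feature on the complementary strand is obtained by reverse-complementing the joined segments.

```python
from __future__ import annotations

# Translation table for complementing DNA bases (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def extract_transcript(
    genome: str,
    exons: list[tuple[int, int]],
    complement: bool = False,
) -> str:
    """Build a predicted transcript from a genomic sequence.

    `exons` holds (start, end) pairs, 1-based and inclusive (assumed
    GenBank-style coordinates). Segments are joined in genomic order;
    if `complement` is True the joined sequence is reverse-complemented,
    mirroring a `complement(join(...))` location.
    """
    parts = [genome[start - 1:end] for start, end in sorted(exons)]
    spliced = "".join(parts)
    return reverse_complement(spliced) if complement else spliced


if __name__ == "__main__":
    # Toy genomic sequence and made-up coordinates, for illustration only.
    genome = "AAACCCGGGTTT"
    print(extract_transcript(genome, [(1, 3), (7, 9)]))
    print(extract_transcript(genome, [(1, 3), (7, 9)], complement=True))
```

Representing each splicing variant as its own list of exon coordinates matches the convention described above, in which variants derived from the same gene are stored as separate entries.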

DATABASE ACCESS

Individual nucleotide sequences can be retrieved in FASTA format as separate entries or downloaded as batch files. The data can be searched using transcript names, accession numbers or organism names. In addition to providing access to the database records, the search results are also linked to full GenBank entries. The current version of the database also includes a BLAST server, which allows sequence similarity searches against the full ncRNA database (∼64 Mb). The results of these searches are linked to the full database records. The browser section of the database is intended as a source of basic information on noncoding transcripts which have been at least partially characterized in terms of function or expression patterns. For such RNAs we provide short descriptions of known activities described in the literature and relevant citations linked to the Medline database. The browser entries also give access to the nucleotide sequences from the database (FASTA) or the entire GenBank records. The database can be accessed through the WWW at the following URL: http://biobases.ibch.poznan.pl/ncRNA/.
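
A downloaded FASTA batch file could be searched locally along the following lines. This is a minimal sketch only: the layout of ncRNAdb FASTA headers is not specified in the text, so the code simply assumes that the accession number, organism name and transcript name all appear somewhere in the header line. The function names and the example records are hypothetical.

```python
from typing import Iterable, Iterator, List, Tuple

Record = Tuple[str, str]  # (header without ">", sequence)


def parse_fasta(text: str) -> Iterator[Record]:
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header = None
    chunks: List[str] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def search(records: Iterable[Record], term: str) -> List[Record]:
    """Return records whose header contains `term` (case-insensitive)."""
    needle = term.lower()
    return [(h, s) for h, s in records if needle in h.lower()]


if __name__ == "__main__":
    # Made-up records; real headers may be laid out differently.
    batch = ">ACC0001 Example organism transcriptA\nACGT\nACGT\n>ACC0002 Other organism transcriptB\nGGCC\n"
    records = list(parse_fasta(batch))
    for header, sequence in search(records, "transcriptb"):
        print(header, len(sequence))
```

Matching on a substring of the header keeps the sketch independent of any particular header layout, at the cost of not distinguishing a transcript name from an organism name.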

REFERENCES (10 of 20 listed)

Review 1.  Regulation by RNA.

Authors:  Maciej Szymański; Jan Barciszewski
Journal:  Int Rev Cytol       Date:  2003

2.  Noncoding regulatory RNAs database.

Authors:  Maciej Szymański; Volker A Erdmann; Jan Barciszewski
Journal:  Nucleic Acids Res       Date:  2003-01-01       Impact factor: 16.971

3.  The UCSC Genome Browser Database.

Authors:  D Karolchik; R Baertsch; M Diekhans; T S Furey; A Hinrichs; Y T Lu; K M Roskin; M Schwartz; C W Sugnet; D J Thomas; R J Weber; D Haussler; W J Kent
Journal:  Nucleic Acids Res       Date:  2003-01-01       Impact factor: 16.971

4.  The amazing complexity of the human transcriptome.

Authors:  Martin C Frith; Michael Pheasant; John S Mattick
Journal:  Eur J Hum Genet       Date:  2005-08       Impact factor: 4.246

5.  Collection of mRNA-like non-coding RNAs.

Authors:  V A Erdmann; M Szymanski; A Hochberg; N de Groot; J Barciszewski
Journal:  Nucleic Acids Res       Date:  1999-01-01       Impact factor: 16.971

Review 6.  A new frontier for molecular medicine: noncoding RNAs.

Authors:  Maciej Szymanski; Miroslawa Z Barciszewska; Volker A Erdmann; Jan Barciszewski
Journal:  Biochim Biophys Acta       Date:  2005-09-25

7.  Rfam: annotating non-coding RNAs in complete genomes.

Authors:  Sam Griffiths-Jones; Simon Moxon; Mhairi Marshall; Ajay Khanna; Sean R Eddy; Alex Bateman
Journal:  Nucleic Acids Res       Date:  2005-01-01       Impact factor: 16.971

8.  RNAdb--a comprehensive mammalian noncoding RNA database.

Authors:  Ken C Pang; Stuart Stephen; Pär G Engström; Khairina Tajul-Arifin; Weisan Chen; Claes Wahlestedt; Boris Lenhard; Yoshihide Hayashizaki; John S Mattick
Journal:  Nucleic Acids Res       Date:  2005-01-01       Impact factor: 16.971

9.  GenBank.

Authors:  Dennis A Benson; Ilene Karsch-Mizrachi; David J Lipman; James Ostell; David L Wheeler
Journal:  Nucleic Acids Res       Date:  2005-01-01       Impact factor: 16.971

10.  Integrative annotation of 21,037 human genes validated by full-length cDNA clones.

Authors:  Tadashi Imanishi; Takeshi Itoh; Yutaka Suzuki; Claire O'Donovan; Satoshi Fukuchi; Kanako O Koyanagi; Roberto A Barrero; Takuro Tamura; Yumi Yamaguchi-Kabata; Motohiko Tanino; Kei Yura; Satoru Miyazaki; Kazuho Ikeo; Keiichi Homma; Arek Kasprzyk; Tetsuo Nishikawa; Mika Hirakawa; Jean Thierry-Mieg; Danielle Thierry-Mieg; Jennifer Ashurst; Libin Jia; Mitsuteru Nakao; Michael A Thomas; Nicola Mulder; Youla Karavidopoulou; Lihua Jin; Sangsoo Kim; Tomohiro Yasuda; Boris Lenhard; Eric Eveno; Yoshiyuki Suzuki; Chisato Yamasaki; Jun-ichi Takeda; Craig Gough; Phillip Hilton; Yasuyuki Fujii; Hiroaki Sakai; Susumu Tanaka; Clara Amid; Matthew Bellgard; Maria de Fatima Bonaldo; Hidemasa Bono; Susan K Bromberg; Anthony J Brookes; Elspeth Bruford; Piero Carninci; Claude Chelala; Christine Couillault; Sandro J de Souza; Marie-Anne Debily; Marie-Dominique Devignes; Inna Dubchak; Toshinori Endo; Anne Estreicher; Eduardo Eyras; Kaoru Fukami-Kobayashi; Gopal R Gopinath; Esther Graudens; Yoonsoo Hahn; Michael Han; Ze-Guang Han; Kousuke Hanada; Hideki Hanaoka; Erimi Harada; Katsuyuki Hashimoto; Ursula Hinz; Momoki Hirai; Teruyoshi Hishiki; Ian Hopkinson; Sandrine Imbeaud; Hidetoshi Inoko; Alexander Kanapin; Yayoi Kaneko; Takeya Kasukawa; Janet Kelso; Paul Kersey; Reiko Kikuno; Kouichi Kimura; Bernhard Korn; Vladimir Kuryshev; Izabela Makalowska; Takashi Makino; Shuhei Mano; Regine Mariage-Samson; Jun Mashima; Hideo Matsuda; Hans-Werner Mewes; Shinsei Minoshima; Keiichi Nagai; Hideki Nagasaki; Naoki Nagata; Rajni Nigam; Osamu Ogasawara; Osamu Ohara; Masafumi Ohtsubo; Norihiro Okada; Toshihisa Okido; Satoshi Oota; Motonori Ota; Toshio Ota; Tetsuji Otsuki; Dominique Piatier-Tonneau; Annemarie Poustka; Shuang-Xi Ren; Naruya Saitou; Katsunaga Sakai; Shigetaka Sakamoto; Ryuichi Sakate; Ingo Schupp; Florence Servant; Stephen Sherry; Rie Shiba; Nobuyoshi Shimizu; Mary Shimoyama; Andrew J Simpson; Bento Soares; Charles Steward; Makiko Suwa; Mami Suzuki; Aiko Takahashi; Gen Tamiya; Hiroshi Tanaka; Todd Taylor; Joseph D Terwilliger; Per Unneberg; Vamsi Veeramachaneni; Shinya Watanabe; Laurens Wilming; Norikazu Yasuda; Hyang-Sook Yoo; Marvin Stodolsky; Wojciech Makalowski; Mitiko Go; Kenta Nakai; Toshihisa Takagi; Minoru Kanehisa; Yoshiyuki Sakaki; John Quackenbush; Yasushi Okazaki; Yoshihide Hayashizaki; Winston Hide; Ranajit Chakraborty; Ken Nishikawa; Hideaki Sugawara; Yoshio Tateno; Zhu Chen; Michio Oishi; Peter Tonellato; Rolf Apweiler; Kousaku Okubo; Lukas Wagner; Stefan Wiemann; Robert L Strausberg; Takao Isogai; Charles Auffray; Nobuo Nomura; Takashi Gojobori; Sumio Sugano
Journal:  PLoS Biol       Date:  2004-04-20       Impact factor: 8.029
