
NONCODE v2.0: decoding the non-coding.

Shunmin He, Changning Liu, Geir Skogerbø, Haitao Zhao, Jie Wang, Tao Liu, Baoyan Bai, Yi Zhao, Runsheng Chen.

Abstract

The NONCODE database is an integrated knowledge database designed for the analysis of non-coding RNAs (ncRNAs). Since NONCODE was first released 3 years ago, the number of known ncRNAs has grown rapidly, and there is growing recognition that ncRNAs play important regulatory roles in most organisms. In the updated version of NONCODE (NONCODE v2.0), the number of collected ncRNAs has reached 206 226, including a wide range of microRNAs, Piwi-interacting RNAs and mRNA-like ncRNAs. The improvements to the database include not only new and updated ncRNA data sets, but also the incorporation of a BLAST alignment search service and access through our custom UCSC Genome Browser. NONCODE is available at http://www.noncode.org or http://noncode.bioinfo.org.cn.

Year:  2007        PMID: 18000000      PMCID: PMC2238973          DOI: 10.1093/nar/gkm1011

Source DB:  PubMed          Journal:  Nucleic Acids Res        ISSN: 0305-1048            Impact factor:   16.971


INTRODUCTION

The considerable number of non-coding RNAs (ncRNAs) detected in the past few years was largely unexpected (1–3). Although the functions of many recently identified ncRNAs remain mostly unknown, increasing evidence supports the notion that ncRNAs represent a diverse and important functional output of most genomes (4). NONCODE is an integrated knowledge database dedicated to ncRNAs. All ncRNAs in NONCODE were filtered automatically from GenBank (5) and the literature, and were then manually curated. With the exception of rRNAs and tRNAs, all classes of reported ncRNAs are included. The aim of the database is to provide a platform that will facilitate both bioinformatic and experimental research. In addition to containing sequence data, NONCODE provides a user-friendly interface, a visualization platform and a convenient search option, allowing efficient recovery of sequences, regulatory elements in the flanking sequences, related publications and other information.

DATA COLLECTION AND ANNOTATION

Data collection and annotation for NONCODE v2.0 were carried out in much the same way as for version 1.0 and can be briefly described as follows. GenBank entries constituted the major source of NONCODE. We searched PubMed (6) with a list of ncRNA keywords, such as ‘ncRNA’, ‘snoRNA’, ‘snRNA’, ‘tmRNA’, ‘SRP RNA’, ‘gRNA’, etc., then consulted the matching literature and extracted further ncRNA keywords. The downloaded GenBank files (gbfiles) were then filtered using these keywords, and the filtered entries were subsequently confirmed by manual curation. For all ncRNA records obtained, basic information on sequence, name, alias, length, ncRNA class, organism, references and GenBank accession number was extracted and entered into the NONCODE database. Each ncRNA sequence was checked for redundancy using Perl scripts, and each cluster of redundant sequences was given a non-redundant NONCODE accession number (UniqID, i.e. unique ncRNA i.d.). In addition to the ‘traditional’ ncRNA classification system, NONCODE v1.0 introduced the alternative ‘process function class (PfClass)’ system, based on the biological processes or functions in which an ncRNA is involved, and one or more of the 26 PfClasses were also assigned to all ncRNAs in NONCODE v2.0. Moreover, a subset of ncRNAs has been divided into nine additional categories according to whether they are gender- or tissue-specific, associated with tumors and diseases, etc. Where possible, NONCODE also provides additional annotations, such as information on function, cellular role, cellular location, chromosomal localization and splicing. The annotations and the genomic mapping information of the sequences were taken from the original GenBank records, the FANTOM3 Database (2), the UCSC Genome Browser Database (7), or directly from the reference literature.
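The keyword filtering and redundancy check described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline, which used Perl scripts and manual curation. The `Record` type, the accession values, the `UNIQ000001` identifier format and the rule that "redundant" means an identical sequence after normalization are all assumptions made for illustration; only the keywords are taken from the text.

```python
import re
from dataclasses import dataclass

# Keywords quoted in the text; the full list used by the authors is not given.
NCRNA_KEYWORDS = ("ncRNA", "snoRNA", "snRNA", "tmRNA", "SRP RNA", "gRNA")


@dataclass
class Record:
    accession: str    # GenBank accession number (made-up values below)
    description: str  # free-text description line of the entry
    sequence: str


def matches_keywords(record, keywords=NCRNA_KEYWORDS):
    """Return True if any keyword occurs as a whole word in the description."""
    return any(
        re.search(r"\b" + re.escape(kw) + r"\b", record.description, re.IGNORECASE)
        for kw in keywords
    )


def normalize(sequence):
    """Upper-case the sequence and write U as T so RNA/DNA spellings compare equal."""
    return sequence.upper().replace("U", "T")


def assign_uniq_ids(records):
    """Group records with identical normalized sequences under one identifier.

    Assumption: redundancy is exact sequence identity; the identifier
    format is hypothetical and is not NONCODE's real UniqID scheme.
    """
    id_by_sequence = {}  # normalized sequence -> identifier
    clusters = {}        # identifier -> accessions sharing that sequence
    for rec in records:
        key = normalize(rec.sequence)
        if key not in id_by_sequence:
            id_by_sequence[key] = f"UNIQ{len(id_by_sequence) + 1:06d}"
        clusters.setdefault(id_by_sequence[key], []).append(rec.accession)
    return clusters


# Illustrative records only; none of these are real GenBank entries.
records = [
    Record("ACC0001", "hypothetical snoRNA, complete sequence", "ACGUACGU"),
    Record("ACC0002", "hypothetical protein kinase mRNA", "ATGGCCAAG"),
    Record("ACC0003", "hypothetical small ncRNA", "acgtacgt"),
]
candidates = [r for r in records if matches_keywords(r)]
clusters = assign_uniq_ids(candidates)
```

Here the two keyword-matching records carry the same sequence once case and U/T spelling are normalized, so they fall into one cluster and share one identifier. In the workflow described in the text, such automatically filtered candidates were still confirmed by manual curation.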

DATABASE CONTENT AND CLASSIFICATION

The purpose of the database is to serve the research community by organizing information concerning all types of ncRNAs (except tRNAs and rRNAs) from all groups of organisms. As of August 2007, the NONCODE database includes 206 226 non-redundant sequences from 861 organisms. The significant growth in the amount of data, compared with the 5339 non-redundant sequences in the previous edition published in 2005, is primarily due to systematic identification of mRNA-like ncRNA transcripts (2) and the discovery of Piwi-interacting RNAs (piRNAs) through large-scale cDNA sequencing (1,3,8). Other novel ncRNAs, such as stem-bulge RNAs (sbRNAs) (9), snRNA-like RNAs (snlRNAs) (9) and a number of unclassified ncRNA transcripts, were mainly obtained from our laboratory and other published literature (10–12). According to the traditional classification system, NONCODE v2.0 contains three novel classes of ncRNAs (the sbRNAs, the snlRNAs and the piRNAs), whereas the number of PfClasses is the same as in NONCODE v1.0 (i.e. 26), with sbRNAs and snlRNAs assigned to the ‘Miscfunction_snm’ PfClass and piRNAs to the ‘RNA-processing_cleavage’ PfClass.

DATABASE ACCESS

All sequences can be downloaded directly from the webpage. Sequences can be searched by GenBank accession number, name, traditional class, PfClass, organism and NONCODE UniqID. In addition to giving access to NONCODE database records, search results are also linked to the full GenBank entries (Figure 1). In the current version of the database, we have also included an online BLAST service (NCBI wwwBLAST version 2.2.17), which allows sequence similarity searches against the entire NONCODE v2.0 database.
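As a rough illustration of the field-based search described above, the following Python sketch filters an in-memory list of entries by any combination of the listed fields. It does not reflect how the NONCODE web interface is implemented. The `NcRnaEntry` type, the `search` function and all identifier, accession and name values are hypothetical; the class and PfClass names follow the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NcRnaEntry:
    uniq_id: str      # NONCODE UniqID (hypothetical format)
    accession: str    # GenBank accession number (made-up values below)
    name: str
    ncrna_class: str  # traditional class, e.g. piRNA
    pf_class: str     # process function class
    organism: str


def search(entries, **criteria):
    """Return entries whose fields equal every given criterion (case-insensitive)."""
    unknown = set(criteria) - set(NcRnaEntry.__dataclass_fields__)
    if unknown:
        raise ValueError(f"unknown search field(s): {sorted(unknown)}")

    def hit(entry):
        return all(
            getattr(entry, field).lower() == str(value).lower()
            for field, value in criteria.items()
        )

    return [entry for entry in entries if hit(entry)]


# Illustrative entries only; they are not real NONCODE records.
entries = [
    NcRnaEntry("UNIQ000001", "ACC0001", "example-piRNA-1", "piRNA",
               "RNA-processing_cleavage", "Mus musculus"),
    NcRnaEntry("UNIQ000002", "ACC0002", "example-sbRNA-1", "sbRNA",
               "Miscfunction_snm", "Caenorhabditis elegans"),
    NcRnaEntry("UNIQ000003", "ACC0003", "example-piRNA-2", "piRNA",
               "RNA-processing_cleavage", "Rattus norvegicus"),
]
pirnas = search(entries, ncrna_class="pirna")
mouse = search(entries, ncrna_class="piRNA", organism="Mus musculus")
```

Combining criteria narrows the result: `pirnas` holds both piRNA entries, while `mouse` holds only the one from Mus musculus. Rejecting unknown field names early keeps a misspelled criterion from silently matching nothing.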

Figure 1. Links between the NONCODE ncRNA annotations, the Genome Browser and NCBI. (A) The NONCODE database window with ncRNA annotations. (B) The corresponding NCBI annotation. (C) The corresponding Genome Browser window. (D) The link from Genome Browser to NONCODE.

In this updated version of NONCODE, a UCSC Genome Browser for NONCODE was constructed for Saccharomyces cerevisiae, Caenorhabditis elegans and Homo sapiens. The ncRNA loci of these species may be viewed through the NONCODE track in the Genome Browser. Other common tracks carrying basic information on these species, such as mRNA genes and ESTs, have also been retrieved from the UCSC Genome Browser Database. For these three species, ncRNA entries in the NONCODE database are directly linked to the Genome Browser; similarly, NONCODE ncRNA annotations may be accessed through the Genome Browser (Figure 1). The database can be accessed at http://www.noncode.org/ or http://noncode.bioinfo.org.cn.

FUTURE DIRECTIONS

As new ncRNAs are progressively discovered, we will continue to update the NONCODE database. Submissions of new ncRNAs are invited and should be sent to noncode@ict.ac.cn. Within the coming year, we will add Genome Browser services for other model organisms, such as mouse and fly. Given the increasing amount of ncRNA data and the emergence of ncRNA prediction software [e.g. QRNA (13), RNAz (14)], we will attempt to establish a service for ncRNA prediction based on these programs and the information in the NONCODE database.

REFERENCES (14 in total)

1.  A novel class of small RNAs bind to MILI protein in mouse testes.

Authors:  Alexei Aravin; Dimos Gaidatzis; Sébastien Pfeffer; Mariana Lagos-Quintana; Pablo Landgraf; Nicola Iovino; Patricia Morris; Michael J Brownstein; Satomi Kuramochi-Miyagawa; Toru Nakano; Minchen Chien; James J Russo; Jingyue Ju; Robert Sheridan; Chris Sander; Mihaela Zavolan; Thomas Tuschl
Journal:  Nature       Date:  2006-06-04       Impact factor: 49.962

2.  A germline-specific class of small RNAs binds mammalian Piwi proteins.

Authors:  Angélique Girard; Ravi Sachidanandam; Gregory J Hannon; Michelle A Carmell
Journal:  Nature       Date:  2006-06-04       Impact factor: 49.962

3.  Fast and reliable prediction of noncoding RNAs.

Authors:  Stefan Washietl; Ivo L Hofacker; Peter F Stadler
Journal:  Proc Natl Acad Sci U S A       Date:  2005-01-21       Impact factor: 11.205

4.  Non-coding RNA. [Review]

Authors:  John S Mattick; Igor V Makunin
Journal:  Hum Mol Genet       Date:  2006-04-15       Impact factor: 6.150

5.  A combined computational and experimental analysis of two families of snoRNA genes from Caenorhabditis elegans, revealing the expression and evolution pattern of snoRNAs in nematodes.

Authors:  Zhan-Peng Huang; Chong-Jian Chen; Hui Zhou; Bei-Bei Li; Liang-Hu Qu
Journal:  Genomics       Date:  2007-01-11       Impact factor: 5.736

6.  Organization of the Caenorhabditis elegans small non-coding transcriptome: genomic features, biogenesis, and expression.

Authors:  Wei Deng; Xiaopeng Zhu; Geir Skogerbø; Yi Zhao; Zhuo Fu; Yudong Wang; Housheng He; Lun Cai; Hong Sun; Changning Liu; Biao Li; Baoyan Bai; Jie Wang; Dong Jia; Shiwei Sun; Hang He; Yan Cui; Yu Wang; Dongbo Bu; Runsheng Chen
Journal:  Genome Res       Date:  2005-12-12       Impact factor: 9.043

7.  Expression of Arabidopsis MIRNA genes.

Authors:  Zhixin Xie; Edwards Allen; Noah Fahlgren; Adam Calamar; Scott A Givan; James C Carrington
Journal:  Plant Physiol       Date:  2005-07-22       Impact factor: 8.340

8.  Characterization of the piRNA complex from rat testes.

Authors:  Nelson C Lau; Anita G Seto; Jinkuk Kim; Satomi Kuramochi-Miyagawa; Toru Nakano; David P Bartel; Robert E Kingston
Journal:  Science       Date:  2006-06-15       Impact factor: 47.728

9.  The transcriptional landscape of the mammalian genome.

Authors:  P Carninci; T Kasukawa; S Katayama; J Gough; M C Frith; N Maeda; R Oyama; T Ravasi; B Lenhard; C Wells; R Kodzius; K Shimokawa; V B Bajic; S E Brenner; S Batalov; A R R Forrest; M Zavolan; M J Davis; L G Wilming; V Aidinis; J E Allen; A Ambesi-Impiombato; R Apweiler; R N Aturaliya; T L Bailey; M Bansal; L Baxter; K W Beisel; T Bersano; H Bono; A M Chalk; K P Chiu; V Choudhary; A Christoffels; D R Clutterbuck; M L Crowe; E Dalla; B P Dalrymple; B de Bono; G Della Gatta; D di Bernardo; T Down; P Engstrom; M Fagiolini; G Faulkner; C F Fletcher; T Fukushima; M Furuno; S Futaki; M Gariboldi; P Georgii-Hemming; T R Gingeras; T Gojobori; R E Green; S Gustincich; M Harbers; Y Hayashi; T K Hensch; N Hirokawa; D Hill; L Huminiecki; M Iacono; K Ikeo; A Iwama; T Ishikawa; M Jakt; A Kanapin; M Katoh; Y Kawasawa; J Kelso; H Kitamura; H Kitano; G Kollias; S P T Krishnan; A Kruger; S K Kummerfeld; I V Kurochkin; L F Lareau; D Lazarevic; L Lipovich; J Liu; S Liuni; S McWilliam; M Madan Babu; M Madera; L Marchionni; H Matsuda; S Matsuzawa; H Miki; F Mignone; S Miyake; K Morris; S Mottagui-Tabar; N Mulder; N Nakano; H Nakauchi; P Ng; R Nilsson; S Nishiguchi; S Nishikawa; F Nori; O Ohara; Y Okazaki; V Orlando; K C Pang; W J Pavan; G Pavesi; G Pesole; N Petrovsky; S Piazza; J Reed; J F Reid; B Z Ring; M Ringwald; B Rost; Y Ruan; S L Salzberg; A Sandelin; C Schneider; C Schönbach; K Sekiguchi; C A M Semple; S Seno; L Sessa; Y Sheng; Y Shibata; H Shimada; K Shimada; D Silva; B Sinclair; S Sperling; E Stupka; K Sugiura; R Sultana; Y Takenaka; K Taki; K Tammoja; S L Tan; S Tang; M S Taylor; J Tegner; S A Teichmann; H R Ueda; E van Nimwegen; R Verardo; C L Wei; K Yagi; H Yamanishi; E Zabarovsky; S Zhu; A Zimmer; W Hide; C Bult; S M Grimmond; R D Teasdale; E T Liu; V Brusic; J Quackenbush; C Wahlestedt; J S Mattick; D A Hume; C Kai; D Sasaki; Y Tomaru; S Fukuda; M Kanamori-Katayama; M Suzuki; J Aoki; T Arakawa; J Iida; K Imamura; M Itoh; T Kato; H Kawaji; N Kawagashira; T Kawashima; M Kojima; S Kondo; H Konno; K Nakano; N Ninomiya; T Nishio; M Okada; C Plessy; K Shibata; T Shiraki; S Suzuki; M Tagami; K Waki; A Watahiki; Y Okamura-Oho; H Suzuki; J Kawai; Y Hayashizaki
Journal:  Science       Date:  2005-09-02       Impact factor: 47.728

10.  Noncoding RNA gene detection using comparative sequence analysis.

Authors:  E Rivas; S R Eddy
Journal:  BMC Bioinformatics       Date:  2001-10-10       Impact factor: 3.169
