Ruth L Seal, Ling-Ling Chen, Sam Griffiths-Jones, Todd M Lowe, Michael B Mathews, Dawn O'Reilly, Andrew J Pierce, Peter F Stadler, Igor Ulitsky, Sandra L Wolin, Elspeth A Bruford.
Abstract
Research on non-coding RNA (ncRNA) is a rapidly expanding field. Providing an official gene symbol and name for each ncRNA gene brings order to what could otherwise be chaos, as it allows unambiguous communication about each gene. The HUGO Gene Nomenclature Committee (HGNC, www.genenames.org) is the only group with the authority to approve symbols for human genes. The HGNC works with specialist advisors for different classes of ncRNA to ensure that ncRNA nomenclature is accurate and, where possible, informative. Here, we review each major class of ncRNA that is currently annotated in the human genome and describe how each class is assigned a standardised nomenclature.
Keywords: gene nomenclature; gene symbols; non-coding RNA
Year: 2020 PMID: 32090359 PMCID: PMC7073466 DOI: 10.15252/embj.2019103777
Source DB: PubMed Journal: EMBO J ISSN: 0261-4189 Impact factor: 11.598
Figure 1. The number of HGNC gene symbols by type of ncRNA
A full list of locus types, along with numbers of genes per category, can be found at our Statistics & Downloads webpage (https://www.genenames.org/download/statistics-and-files/).
The HGNC hosts gene group pages for different types of non‐coding RNA genes. These pages follow a hierarchical structure, and all of them can be browsed starting from the highest‐level gene group page, labelled “Non‐coding RNAs”.
| Gene group name | Gene group URL | Description |
|---|---|---|
| Non‐coding RNAs | | Overview page of all non‐coding RNAs in the HGNC project. Can be used as a starting point to browse through all types of named ncRNAs |
| MicroRNAs | | Starting page for all microRNAs, which are split into curated human families where possible. MicroRNAs not in a defined family are listed on the first page |
| MicroRNA host genes | | A curated list of microRNA host genes, which is split into protein coding and non‐coding subgroups |
| Transfer RNAs | | Starting page for all transfer RNA genes, with subgroups “Mitochondrially encoded transfer RNAs” and “Cytoplasmic transfer RNAs” (this page also has the subsets “Cytoplasmic transfer RNA pseudogenes” and “Low confidence cytoplasmic transfer RNAs”) |
| Small nuclear RNAs | | Lists all canonical small nuclear RNA genes; variant snRNA genes are shown as a subgroup |
| Small nucleolar RNAs | | Starting page for snoRNAs with the subgroups “Small Cajal body‐specific RNAs”, “Small nucleolar RNAs, C/D box” and “Small nucleolar RNAs, H/ACA box” |
| Small nucleolar RNA host genes | | A curated list of snoRNA host genes, which is split into protein coding and non‐coding subgroups |
| Ribosomal RNAs | | Starting page for all ribosomal RNAs, split into the major subgroups “Mitochondrially encoded ribosomal RNAs” and “Cytoplasmic ribosomal RNAs”, which is further split into subtypes of rRNAs |
| Vault RNAs | | Full list of vault RNA genes |
| Y RNAs | | Full list of Y RNA genes |
| Small NF90 (ILF3) associated RNAs | | Full list of SNAR genes |
| Long non‐coding RNAs | | Starting page for all long non‐coding RNA genes. Divided into subgroups: Long intergenic non‐protein coding RNAs, MicroRNA non‐coding host genes, Overlapping transcripts, Intronic transcripts, Antisense RNAs, Divergent transcripts, Small nucleolar RNA non‐coding host genes, Long non‐coding RNAs with non‐systematic symbols, Long non‐coding RNAs with FAM root symbol |
Figure 2. The microRNA gene MIR17 is part of a cluster of microRNA genes that are hosted within an intron of the long non‐coding RNA gene MIR17HG (miR‐17‐92a‐1 cluster host gene)
The symbol MIR17 represents the gene; the symbol mir‐17 represents the miRNA precursor stem‐loop structure; and the symbol miR‐17 represents the active mature microRNA, which interacts with an AGO protein to form the AGO/miRNA silencing complex.
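The three forms described above differ only in prefix, case and hyphenation, which can be sketched in a few lines of Python. This is a minimal illustration covering only the simple case in the caption; the function name `mirna_forms` is hypothetical, and identifiers with letter or numeric suffixes are not handled.

```python
def mirna_forms(identifier: str) -> dict:
    """Return the gene, precursor and mature forms for a microRNA identifier.

    `identifier` is the numeric part only, e.g. "17".
    Hypothetical helper: it mirrors the MIR17 / mir-17 / miR-17 pattern
    from the caption and nothing more.
    """
    return {
        "gene": f"MIR{identifier}",        # approved gene symbol
        "precursor": f"mir-{identifier}",  # precursor stem-loop
        "mature": f"miR-{identifier}",     # active mature microRNA
    }


forms = mirna_forms("17")
print(forms["gene"], forms["precursor"], forms["mature"])
```

Keeping the three forms together in one mapping makes it harder to confuse the gene with its products when writing about them.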
Figure 3. An annotated tRNA gene symbol explaining what each part of the approved gene symbol represents
Figure 4. Schematic showing the two types of ribosomal RNA (rRNA) gene cluster found within the human genome
The 5S cluster has a variable copy number between individuals, with 98 being the average copy number, while the current human reference genome, GRCh38, has just 17 copies. The HGNC has approved symbols for the 17 annotated copies as shown above. There are five separate 45S rRNA clusters, which are named RNR1‐RNR5. These clusters are not currently represented on GRCh38. The HGNC has approved root symbols for the 45S rRNA genes and their post‐transcriptionally processed transcripts (root symbols shown in dark blue text). The light blue symbols show the format that will be approved in the future for individual 45S rRNA genes and transcripts once the clusters are included and annotated on the human reference genome.
Selected examples of lncRNA genes with equivalent approved symbols in human and mouse. For human and mouse lncRNA genes to be considered orthologous and named as such, the HGNC requires that the genes are at a conserved syntenic location and have detectable sequence similarity. Note that human gene symbols are uppercase, while mouse symbols are title case and do not contain hyphens.
| Human symbol | Human gene name | Mouse symbol | References |
|---|---|---|---|
| | antisense of IGF2R non‐protein coding RNA | | Yotova |
| | Differentiation antagonising non‐protein coding RNA | | Chalei |
| | Eosinophil granule ontogeny transcript | | Rose and Stadler ( |
| | FOXF1 adjacent non‐coding developmental regulatory RNA | | Grote |
| | Firre intergenic repeating RNA element | | Hacisuleyman |
| | Growth arrest specific 5 | | Smith and Steitz ( |
| | HOX transcript antisense RNA | | Rinn |
| | HOXA distal transcript antisense RNA | | Wang |
| | KCNQ1 opposite strand/antisense transcript 1 | | Gicquel |
| | Metastasis associated lung adenocarcinoma transcript 1 | | Ji |
| | Maternally expressed 3 | | Miyoshi |
| | miR‐17‐92a‐1 cluster host gene | | Dews |
| | Nuclear paraspeckle assembly transcript 1 | | Clemson |
| | Non‐coding repressor of NFAT | | Willingham |
| | POU3F3 adjacent non‐coding transcript 1 | | Clark and Blackshaw ( |
| | PAX6 upstream antisense RNA | | Vance |
| | Paternally expressed 13 | | Court |
| | Pvt1 oncogene | | Carramusa |
| | RNA component of 7SK nuclear ribonucleoprotein | | Driscoll |
| | Small nucleolar RNA host gene 3 | | Pelczar and Filipowicz ( |
| | SOX1 overlapping transcript | | Ahmad |
| | TSIX transcript, XIST antisense RNA | | Lee |
| | Taurine up‐regulated 1 | | Young |
| | TCL1 upstream neural differentiation‐associated RNA | | Ulitsky |
| | X inactive specific transcript | | Brown |
| | ZNFX1 antisense RNA 1 | | Askarian‐Amiri |
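The formatting difference between human and mouse symbols noted in the table caption (uppercase versus title case, and no hyphens in mouse) can be illustrated with a short Python sketch. The function name `mouse_style` is hypothetical, and the sketch applies only the case and hyphen convention: real mouse symbols are assigned by the relevant nomenclature committee and may differ from a mechanical conversion. `GENE1-AS1` is a made-up symbol used purely for illustration.

```python
def mouse_style(human_symbol: str) -> str:
    """Apply the mouse formatting convention to a human-style symbol.

    Assumption: only the two rules from the caption are applied
    (drop hyphens, title case). This does not look up real orthologs.
    """
    without_hyphens = human_symbol.replace("-", "")
    # str.capitalize() uppercases the first character and lowercases the rest
    return without_hyphens.capitalize()


print(mouse_style("MIR17HG"))
print(mouse_style("GENE1-AS1"))  # hypothetical symbol
```

A conversion like this is useful as a first guess when searching for a mouse ortholog, but the orthology criteria in the caption (conserved synteny and detectable sequence similarity) still have to be met before the same name is assigned.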
Figure 5. LncRNA naming schema for lncRNA genes with no published information at the time of naming
LncRNAs that are intergenic with respect to protein coding genes are assigned the root symbol LINC# followed by a 5‐digit number.
LncRNAs that are antisense to the genomic span of a protein coding gene are assigned the symbol format [protein coding gene symbol]‐AS#.
LncRNAs that are divergent to (share a bidirectional promoter with) a protein coding gene are assigned the symbol format [protein coding gene symbol]‐DT.
LncRNAs that are contained within an intron of a protein coding gene on the same strand are assigned the symbol format [protein coding gene symbol]‐IT#.
LncRNAs that overlap a protein coding gene on the same strand are assigned the symbol format [protein coding gene symbol]‐OT#.
LncRNAs that contain microRNA or snoRNA genes within introns or exons are named as host genes. See the main text for details on how these microRNA host genes and snoRNA host genes are named.
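The positional rules in this schema can be sketched as a small Python function. This is an illustration under stated assumptions, not the HGNC's own tooling: the names `Relation` and `lncrna_symbol` are hypothetical, `GENE1` is a made-up protein coding gene symbol, zero-padding the LINC number to five digits is an assumption about how the "5‐digit number" is written, and host genes are left out because their naming is described in the main text rather than here.

```python
from enum import Enum
from typing import Optional


class Relation(Enum):
    """Position of a lncRNA relative to protein coding genes (hypothetical names)."""
    INTERGENIC = "LINC"
    ANTISENSE = "AS"
    DIVERGENT = "DT"
    INTRONIC = "IT"
    OVERLAPPING = "OT"


def lncrna_symbol(relation: Relation,
                  coding_symbol: Optional[str] = None,
                  number: Optional[int] = None) -> str:
    """Build a placeholder-style lncRNA symbol from the schema's rules."""
    if relation is Relation.INTERGENIC:
        if number is None:
            raise ValueError("intergenic lncRNAs need a number")
        # Assumption: the 5-digit number is zero-padded
        return f"LINC{number:05d}"
    if coding_symbol is None:
        raise ValueError("a protein coding gene symbol is required")
    if relation is Relation.DIVERGENT:
        # The divergent format carries no trailing number in the schema
        return f"{coding_symbol}-DT"
    if number is None:
        raise ValueError(f"{relation.name.lower()} lncRNAs need a number")
    return f"{coding_symbol}-{relation.value}{number}"


print(lncrna_symbol(Relation.INTERGENIC, number=1))
print(lncrna_symbol(Relation.ANTISENSE, "GENE1", 1))
print(lncrna_symbol(Relation.DIVERGENT, "GENE1"))
```

Encoding the suffixes as enum values keeps the mapping between genomic context and symbol format in one place, so each rule can be checked against the caption line it comes from.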
| RNA resource | Resource URL | Description |
|---|---|---|
| RNAcentral | | Centralised database of non‐coding RNA sequences collated from expert non‐coding RNA member databases, model organism databases and sequence accession databases |
| miRBase | | Searchable database of microRNA sequences and annotations. Also hosts the miRBase registry where researchers can submit prospective new microRNAs |
| GtRNAdb | | The genomic tRNA database, which contains tRNA genes predicted by the tRNAscan‐SE program for many different species |
| snoRNABase | | Database of human snoRNA genes; useful resource but no longer being updated |
| LNCipedia | | Database of human long non‐coding RNA sequences and manually curated lncRNA articles |
| Ensembl | | Genome browser for vertebrate genomes that hosts the GENCODE annotation models for human and mouse non‐coding RNA genes |
| NCBI Gene | | Integrated annotation and related information for many different genomes. Includes RefSeq manual annotation of human and mouse non‐coding RNA genes |