Jiayan Wu1, Jingfa Xiao1, Zhang Zhang1, Xumin Wang1, Songnian Hu1, Jun Yu2.
Abstract
Ribonucleic acid (RNA) deserves not only a dedicated field of biological research (a discipline or branch of knowledge) but also explicit definitions of its roles in cellular processes and molecular mechanisms. Ribogenomics is the study of the biology of cellular RNAs, including their origin, biogenesis, structure and function. On the informational track, messenger RNAs (mRNAs) are the major component of ribogenomes: they encode proteins, serve as one of the four major components of the translation machinery, and have their expression regulated at multiple levels by other operational RNAs. On the operational track, there are several diverse types of RNAs, with length distribution perhaps the simplest way to stratify them, that are involved in major cellular activities such as chromosomal structure and organization, DNA replication and repair, transcriptional/post-transcriptional regulation, RNA processing and routing, translation and cellular energy/metabolism regulation. An all-out effort exceeding the magnitude of the Human Genome Project is essential to construct just the mammalian transcriptomes in multiple contexts, including embryonic development, circadian and seasonal rhythms, defined life-span stages, pathological conditions and anatomy-driven tissue/organ/cell types.
Keywords: Informational; Operational; RNA; Ribogenomics; Transcriptome
Year: 2014 PMID: 24769101 PMCID: PMC4411354 DOI: 10.1016/j.gpb.2014.04.002
Source DB: PubMed Journal: Genomics Proteomics Bioinformatics ISSN: 1672-0229 Impact factor: 7.691
Figure 1. Schematic view from genotype to phenotype with informational and operational tracks
Ribogenomics in the context of a genotype-to-phenotype view. Genotype becomes one of the deterministic factors that include ribogenomic, epigenomic, homoeostatic, compartmental and plastic tracks. For the sake of discussion, here we simply classify non-coding RNAs (ncRNAs) into small ncRNAs (sncRNAs) and long ncRNAs (lncRNAs). To emphasize the influence of transcript-centric mutations, we identify expression-related simple nucleotide variations (eSNVs) as an important class of sequence variations that are not yet considered in the context of traditional population genetics and evolution.
Some informational and operational RNAs summarized in the literature for human, mouse, rice and Arabidopsis
| Size class | RNA type | Human | Mouse | Rice | Arabidopsis |
| --- | --- | --- | --- | --- | --- |
| | mRNA | 130,029 | 80,383 | 44,118 | 30,633 |
| | guideRNA | 210 | NA | NA | NA |
| Large | lncRNA | 53,000 | NA | NA | 13,000 |
| Large | lincRNA | 27,500 | NA | NA | 6480 |
| Small | miRNA | 4450 | 3094 | 1305 | 635 |
| Small | snoRNA | 403 | NA | 46 | 587 |
| Small | piRNA | 114 | 2710 | NA | NA |
Note: We do not estimate the number of genes here but merely count the number of mRNAs recorded in the UniGene database (http://www.ncbi.nlm.nih.gov/unigene/statistics/). The different numbers of RNAs identified in the databases or publications reflect the incomplete nature of the studies and collections. Only some representative classes of RNAs are listed here. NA, not yet available.
Distribution of mRNA abundance in different cell types based on a theoretical model
| Copies per mRNA | mRNAs (500 K) | % | mRNAs (1000 K) | % | mRNAs (5000 K) | % |
| --- | --- | --- | --- | --- | --- | --- |
| <1 | 42,665 | 55 | 26,977 | 35 | 1114 | 1.40 |
| 1–5 | 22,866 | 30 | 31,104 | 40 | 25,863 | 34 |
| 5–10 | 4822 | 6 | 7450 | 10 | 15,688 | 20 |
| 10–50 | 5089 | 6 | 8415 | 11 | 22,866 | 30 |
| 50–100 | 855 | 1 | 1496 | 2 | 4822 | 6.30 |
| 100–500 | 763 | 1 | 1421 | 2 | 5089 | 6.60 |
| >500 | 92 | | 289 | | 1710 | 2.20 |
| Mean copies per mRNA | 6.48 | | 12.96 | | 64.81 | |
| Median copies per mRNA | 0.83 | | 1.66 | | 8.28 | |
Note: The total number of mRNAs per cell in different cell types is estimated based on our theoretical model or previous studies [54–56]. 500 K indicates the total number of mRNA copies in a transcript-rich cell, such as stem cells and cells from cerebrum and testis; 1000 K indicates the total number of mRNA copies in a transcript-poor cell, such as various cell lines and epithelial cells; 5000 K indicates the total number of mRNA copies in a hypothetical cell used for data analysis in this theoretical model [54].
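The "mean copies per mRNA" row can be checked with simple arithmetic: it appears to be the total number of mRNA copies per cell divided by the number of distinct mRNAs summed over all abundance bins. The following is a minimal sketch under that assumption; the names `BINS`, `COUNTS` and `summarize` are illustrative and do not come from the article, and the bin counts are copied from the table above.

```python
# Distinct-mRNA counts per abundance bin, copied from the table above.
# Keys are the assumed total mRNA copies per cell (500 K, 1000 K, 5000 K).
BINS = ["<1", "1-5", "5-10", "10-50", "50-100", "100-500", ">500"]
COUNTS = {
    500_000: [42665, 22866, 4822, 5089, 855, 763, 92],
    1_000_000: [26977, 31104, 7450, 8415, 1496, 1421, 289],
    5_000_000: [1114, 25863, 15688, 22866, 4822, 5089, 1710],
}


def summarize(total_copies, counts):
    """Return (distinct mRNAs, mean copies per mRNA, percent share per bin)."""
    n_mrnas = sum(counts)
    mean_copies = total_copies / n_mrnas
    shares = [100 * c / n_mrnas for c in counts]
    return n_mrnas, mean_copies, shares


for total, counts in COUNTS.items():
    n_mrnas, mean_copies, shares = summarize(total, counts)
    print(f"total copies = {total}: {n_mrnas} mRNAs, "
          f"mean = {mean_copies:.2f} copies per mRNA")
    for label, share in zip(BINS, shares):
        print(f"  {label:>8}: {share:.2f}%")
```

Under this reading, each of the three columns sums to the same 77,152 distinct mRNAs, and the computed means (6.48, 12.96 and 64.81) agree with the values reported in the table. The median row cannot be recovered from the binned counts alone, so it is not recomputed here.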
| Universal: shared by all tissues/organs/cells | |
| Tissue-specific: shared by a single or limited number of tissues (such as nerves, muscles and epithelia) | |
| Cell-specific: unique to a single cell type | |
| Near universal: shared by most tissues but not all | |
| Rationally shared: genes that are shared between unrelated tissues or cell types based on function | |
| Expression-variable (majority; genes vary in expression among tissues) | |
| Expression-constant (minority; genes are expressed constantly in all cell types) | |
| Highly-expressed (>1000s of copies) | |
| Moderately-expressed (10s–100s of copies) | |
| Lowly-expressed (<10 copies) | |
| Size: large (>500 kb) | |
| GC/purine content: GC-rich | |
| CpG islands: high, moderate and low density | |
| Minimal-intron-containing | |
| Biologically-defined repetitive sequence element associated | |
| Gene cluster-associated | |
| Transcript-centric variation | |
| Germline-specific | |
| Genes under purifying (Ka/Ks < 1) or positive (Ka/Ks > 1) selection | |
| Mitochondrion-associated | |
| Chloroplast-associated | |
| Nucleolus-associated | |
| Circadian-regulated | |
| Cell cycle-regulated | |
| Stem cell-differentiation | |
| Translation machinery | |
| Splicing machinery | |
| Nuclear exporting machinery | |
| Embryonic development | |
| Epidermal differentiation | |
| Phenotypic plasticity: | |
| Pathological conditions | |
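Two of the criteria in the list above are numeric and can be expressed as simple rules: the expression-level classes (by copy number) and the selection classes (by Ka/Ks). The sketch below is illustrative only. The list gives approximate ranges ("<10", "10s–100s", ">1000s"), so the exact cut-offs of 10 and 1000 copies are assumptions, as are the function names `expression_class` and `selection_class`. Treating Ka/Ks = 1 as neutral is the standard convention rather than something stated in the list.

```python
def expression_class(copies_per_cell: float) -> str:
    """Assign an expression-level class from mRNA copies per cell.

    The cut-offs (10 and 1000) are assumed boundaries for the approximate
    ranges in the list; they are not values given by the authors.
    """
    if copies_per_cell < 0:
        raise ValueError("copies_per_cell must be non-negative")
    if copies_per_cell < 10:
        return "lowly-expressed"
    if copies_per_cell < 1000:
        return "moderately-expressed"
    return "highly-expressed"


def selection_class(ka: float, ks: float) -> str:
    """Classify a gene by its Ka/Ks ratio.

    Ka/Ks < 1 is read as purifying selection and Ka/Ks > 1 as positive
    selection; a ratio of exactly 1 is labeled neutral by convention.
    """
    if ks <= 0:
        raise ValueError("ks must be positive to form the Ka/Ks ratio")
    ratio = ka / ks
    if ratio < 1:
        return "purifying"
    if ratio > 1:
        return "positive"
    return "neutral"


print(expression_class(6.48))      # a gene at the 500 K mean copy number
print(selection_class(0.02, 0.4))  # hypothetical Ka and Ks values
```

In practice a Ka/Ks estimate close to 1 would need a statistical test before being called either way; the hard thresholds here only mirror how the list words the two categories.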