Georges St Laurent, Yuri Vyatkin, Philipp Kapranov.
Abstract
In the past decade, numerous studies have made connections between sequence variants in human genomes and predisposition to complex diseases. However, most of these variants lie outside of the charted regions of the human genome whose function we understand; that is, the sequences that encode proteins. Consequently, the general concept of a mechanism that translates these variants into predisposition to diseases has been lacking, potentially calling into question the validity of these studies. Here we make a connection between the growing class of apparently functional RNAs that do not encode proteins and whose function we do not yet understand (the so-called 'dark matter' RNAs) and the disease-associated variants. We review advances made in a different genomic mapping effort - unbiased profiling of all RNA transcribed from the human genome - and provide arguments that the disease-associated variants exert their effects via perturbation of regulatory properties of non-coding RNAs existing in mammalian cells.
Year: 2014 PMID: 24924000 PMCID: PMC4054906 DOI: 10.1186/1741-7015-12-97
Source DB: PubMed Journal: BMC Med ISSN: 1741-7015 Impact factor: 8.775
Figure 1. Discovery of genome-wide association studies (GWAS) variants in different genomic elements and disease types. GWAS variants were assigned to a disease type (y axis) or non-disease traits. For each disease type, the P-value (x axis) for a skew towards a particular genomic category (x axis) was calculated (see Additional file 1: Supplementary Text). Numbers of unique GWAS variants for each of the genomic categories are shown as the purple bars; the corresponding numbers for each of the disease types are shown in parentheses. Only disease types with >100 GWAS variants are shown. CDS, coding DNA sequence (coding regions of known genes); UTR, untranslated region (non-coding regions of known genes; promoter and intronic regions are those of known genes). See Additional file 1: Supplementary Text for details of the analysis.
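The test behind the skew P-values in Figure 1 is described in Additional file 1 and is not reproduced here. Purely as an illustration of how such a skew could be quantified, the sketch below uses a one-sided hypergeometric test, which is a common choice for over-representation but is an assumption on our part, not the authors' stated method. The function name and all counts are hypothetical.

```python
from math import comb


def hypergeom_upper_tail(total, in_category, drawn, observed):
    """Return P(X >= observed) under a hypergeometric model.

    total       -- all GWAS variants considered (hypothetical count)
    in_category -- variants falling in one genomic category, e.g. introns
    drawn       -- variants assigned to one disease type
    observed    -- variants of that disease type that fall in the category
    """
    upper = min(in_category, drawn)
    denom = comb(total, drawn)
    # Sum the probability of every outcome at least as extreme as observed.
    # math.comb(n, k) returns 0 when k > n, so impossible terms vanish.
    numer = sum(
        comb(in_category, k) * comb(total - in_category, drawn - k)
        for k in range(observed, upper + 1)
    )
    return numer / denom


# Made-up counts for illustration only; they are not from the paper.
p_value = hypergeom_upper_tail(total=1000, in_category=100, drawn=50, observed=12)
print(f"one-sided P-value for over-representation: {p_value:.3g}")
```

A small P-value here would indicate that the disease type has more variants in the category than expected by chance under this model. A real analysis would also need to correct for testing many disease types and categories.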
Glossary of technical terms
| Term | Meaning |
| --- | --- |
| Chromatin signaling | A system of regulation of gene activity in a cell that works by affecting the immediate surroundings of DNA, for example, by modifying various proteins that coat DNA inside the nucleus. Depending on the exact nature of the modification, DNA becomes either more or less accessible to cellular machinery that activates genes |
| Enhancer | A sequence of DNA that can regulate a target gene or genes over long distances |
| DNase I hypersensitivity region | A region of DNA identified in an assay where chromatin is digested with DNase I, an enzyme that degrades DNA. More accessible regions of chromatin, typically containing regulatory elements such as promoters and enhancers, are more susceptible to DNase digestion and thus are enriched in DNase I hypersensitivity regions |
| Gene Ontology (GO) term | GO is an international initiative aimed at assigning controlled vocabulary, consisting of |
| H1 embryonic stem cells | A line of human embryonic stem cells maintained in culture |
| H3K27 trimethylation | A certain type of chemical modification of a protein that binds DNA. Important for reversible deactivation of targeted portions of the genome |
| Intron | Part of an RNA molecule that is included immediately after transcription and removed during maturation of that molecule |
| Intronic RNA | RNA encoded by a DNA sequence that also encodes an intron of another transcript |
| lincRNA-p21 | A non-coding RNA activated upon DNA damage and in various tumor cell lines |
| | A gene encoding an important regulator controlling activity of many genes. This gene has been associated with many cancers |
| Normal human epidermal keratinocytes (NHEK) | A line of primary keratinocytes maintained in culture |
| Non-coding RNA | RNA that is not used as a template for protein synthesis |
| Pervasive transcription | Massive transcription from unannotated regions of the genome |
| PolyA+ RNA | A molecule of RNA containing a long stretch of adenosine residues at the end |
| PRC2 chromatin signaling complex | A complex composed of multiple protein molecules that reversibly modifies chromatin and silences target genes |
| Promoter | A sequence of DNA that is located immediately adjacent to a target gene and regulates its activity |
| Pseudogene | A copy of a gene, presumed to be non-functional, although a number of recent examples describe both non-coding functions and occasionally coding functions for some of these loci |
| Regulation in trans | Regulation via interaction with molecules encoded by distal regions of the genome |
| RNA Pol II | A complex composed of multiple protein molecules responsible for synthesis of RNA, which is used as a template for protein synthesis |
| Transcript | A molecule of RNA produced by transcription, that is, copying of RNA from the DNA template |
| Transcription factor | A protein that regulates expression of genes by binding to their promoters and/or enhancers |
| Transcription factor motif | A short DNA sequence recognized by a transcription factor or group of transcription factors, typically found in promoters and enhancers |
| Transcriptome | A collection of all the RNA molecules (transcripts) in a cell or a tissue |
| Transcriptomics | Study of the transcriptome |
| | Oocytes from frogs of genus |
Figure 2. A genomic view of the 8q24 region upstream of the gene. For details, see text.