Abstract
The ENCyclopedia Of DNA Elements (ENCODE) project is an international research consortium that aims to identify all functional elements in the human genome sequence. The second phase of the project comprised 1640 datasets from 147 different cell types, yielding a set of 30 publications across several journals. These data revealed that 80.4% of the human genome displays some functionality in at least one cell type. Many of these regulatory elements are physically associated with one another and further form networks or three-dimensional conformations that affect gene expression. These elements are also related to sequence variants associated with diseases or traits. Together, these findings provide new insights into the organization and regulation of genes and the genome, and serve as an expansive resource for understanding human health and disease.
Year: 2013 PMID: 23722115 PMCID: PMC4357814 DOI: 10.1016/j.gpb.2013.05.001
Source DB: PubMed Journal: Genomics Proteomics Bioinformatics ISSN: 1672-0229 Impact factor: 7.691
Summary of ENCODE experiments
| Experiment | Description |
|---|---|
| DNA methylation | In 82 human cell lines and tissues: |
| TF ChIP-seq | A total of 119 TFs: |
| Histone ChIP-seq | A total of 12 types: |
| DNase-seq | In 125 cell types or treatments: |
| DNase footprint | In 41 cell types: |
| MNase-seq | In GM12878 and K562 |
| 3C-carbon copy (5C) | In GM12878, K562, HeLa-S3 and H1-hESC |
| GWAS SNP targeting | 296 noncoding GWAS SNPs were assigned a target promoter |
Summary of GENCODE v7 gene annotation
| Category | Number |
|---|---|
| Protein-coding genes | 20,687 |
| Novel noncoding transcripts | 33,977 |
| Long non-coding RNA loci | 9,640 |
| lincRNA loci | 5,058 |
| Antisense loci | 3,214 |
| Sense intronic loci | 378 |
| Pseudogenes | 11,216 |
| Transcribed pseudogenes | 863 |
| Non-transcribed pseudogenes | 10,353 |
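
The arithmetic behind the table can be checked with a short sketch. It assumes, without confirmation from the source, that the transcribed and non-transcribed rows are subcategories of pseudogenes and that the lincRNA, antisense and sense intronic rows are subcategories of long non-coding RNA loci. The variable names are illustrative only.

```python
# Counts copied from the GENCODE v7 summary table.
pseudogenes_total = 11216
pseudogenes_transcribed = 863
pseudogenes_non_transcribed = 10353

lncrna_loci = 9640
# Assumption: these three rows are subcategories of the lncRNA loci row.
lncrna_subtypes = {"lincRNA": 5058, "antisense": 3214, "sense intronic": 378}

# The two pseudogene rows should add up to the pseudogene total.
pseudogene_sum = pseudogenes_transcribed + pseudogenes_non_transcribed

# The listed lncRNA subtypes need not cover every lncRNA locus;
# the remainder is whatever the table does not break out.
lncrna_listed_sum = sum(lncrna_subtypes.values())
lncrna_unlisted = lncrna_loci - lncrna_listed_sum

print(pseudogene_sum == pseudogenes_total)
print(lncrna_listed_sum, lncrna_unlisted)
```

Under these assumptions the pseudogene rows account for the full total, while the three listed lncRNA subtypes leave a remainder that the table does not itemize.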

Multi-dimensional regulation of gene expression
Transcriptional regulation is controlled by complex interactions between regulatory elements. Hypermethylated CpGs are located in closed chromatin regions, whereas CpGs in open regions are generally lowly methylated; these open regions are enriched for DNase I hypersensitive sites (DHSs) and histone modifications associated with transcriptional activity. Within DHSs, DNase I cleavage leaves footprints where bound TFs protect the DNA from cleavage. A SNP occurring in a TF recognition sequence (motif) can affect TF binding occupancy. Furthermore, distal DHSs harboring disease-associated SNPs can be brought into proximity with a promoter through long-range chromosomal interactions, recruiting TF binding complexes that affect gene function. Multiple TFs interact with DNA in two ways: TFs bind to neighboring sites (co-binding), or one TF binds to another TF that is itself bound to DNA (tethered binding).
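
The statement that a SNP within a TF motif can change binding occupancy can be illustrated with a toy position weight matrix. This is a minimal sketch, not the ENCODE analysis: the count matrix, the sequences, the uniform background and the function name `log_odds_score` are all hypothetical and chosen only for illustration.

```python
import math

# Hypothetical 4-position count matrix for an illustrative TF motif
# (made-up numbers, not taken from ENCODE or any motif database).
COUNTS = {
    "A": [8, 1, 1, 1],
    "C": [1, 8, 1, 1],
    "G": [1, 1, 8, 1],
    "T": [1, 1, 1, 8],
}
BACKGROUND = 0.25  # assumed uniform background base frequency


def log_odds_score(site):
    """Sum of per-position log2(observed frequency / background)."""
    score = 0.0
    for i, base in enumerate(site):
        column_total = sum(COUNTS[b][i] for b in "ACGT")
        freq = COUNTS[base][i] / column_total
        score += math.log2(freq / BACKGROUND)
    return score


ref_site = "ACGT"  # hypothetical reference allele, matches the motif consensus
alt_site = "ACTT"  # hypothetical SNP: G>T at the third motif position

ref_score = log_odds_score(ref_site)
alt_score = log_odds_score(alt_site)
print(f"reference: {ref_score:.2f}, alternate: {alt_score:.2f}")
```

A lower score for the alternate allele is the usual in-silico proxy for weakened binding; whether occupancy actually changes in a given cell type has to be established experimentally, for example from footprinting or ChIP-seq data.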