Daniil Nikitin, Dmitry Penzar, Andrew Garazha, Maxim Sorokin, Victor Tkachev, Nicolas Borisov, Alexander Poltorak, Vladimir Prassolov, Anton A Buzdin.
Abstract
Endogenous retroviruses and retrotransposons, also termed retroelements (REs), are mobile genetic elements that were active until recently in human genome evolution. REs regulate gene expression by actively reshaping chromatin structure or by directly providing transcription factor binding sites (TFBSs). We aimed to identify the molecular processes most deeply impacted by REs in human cells at the level of TFBS regulation. Using ENCODE data, we identified ~2 million TFBS overlapping with putatively regulation-competent human REs located in the 5-kb neighborhoods of gene promoters (~17% of all TFBS in promoter neighborhoods; ~9% of all RE-linked TFBS). Most REs hosting TFBS were highly diverged repeats; the evolutionarily young (0–8% diverged) elements accounted for only ~7% of all RE-linked TFBS. The gene-specific distributions of RE-linked TFBS generally correlated with the distributions for all TFBS. However, several groups of molecular processes were highly enriched in RE-linked TFBS regulation. These were strongly connected with immunity and response to pathogens, negative regulation of gene transcription, ubiquitination and protein degradation, extracellular matrix organization, regulation of STAT signaling, fatty acid metabolism, regulation of GTPase activity, protein targeting to Golgi, regulation of cell division and differentiation, and the development and functioning of perception organs and the reproductive system. By contrast, the processes most weakly affected by REs were linked with the conservative aspects of embryo development. We also identified differences in regulation by the younger and older fractions of REs. Regulation by the older fraction of REs was linked mainly with immunity, cell adhesion, cAMP, IGF1R, Notch, Wnt, and integrin signaling, neuronal development, chondroitin sulfate and heparin metabolism, and endocytosis.
The younger REs regulate other aspects of immunity, cell cycle progression and apoptosis, PDGF, TGF beta, EGFR, and p38 signaling, transcriptional repression, structure of the nuclear lumen, catabolism of phospholipids and heterocyclic molecules, insulin and AMPK signaling, retrograde Golgi-ER transport, and estrogen signaling. The immunity-linked pathways were highly represented in both categories, but their functional roles were different and did not overlap. Our results point to the most quickly evolving molecular pathways in the recent and ancient evolution of the human genome.
Keywords: endogenous retrovirus; evolution; human genome evolution; immunity; molecular pathway; retroelement; retrotransposon; transcription factor binding site
Year: 2018 PMID: 29441061 PMCID: PMC5797644 DOI: 10.3389/fimmu.2018.00030
Source DB: PubMed Journal: Front Immunol ISSN: 1664-3224 Impact factor: 7.561
Figure 1. Distribution of RE-linked transcription factor binding sites (TFBS) (A) outside and (B) inside 10 kb neighborhoods of TSS among the different groups of REs. Numbers are given for the mapped TFBS of each category. Green columns denote TFBS for the evolutionarily young REs (0–8% divergence from the respective consensus sequence). Blue columns show the TFBS distribution for all REs.
Relative concentration of RE-linked transcription factor binding sites (TFBS) in the 5-kb neighborhood of human gene transcription start sites.
| Class of RE | TFBS per kilobase (all REs) | TFBS per kilobase (young REs) | Fold change |
|---|---|---|---|
| LINE | 2.939 | 0.093 | −31.6 |
| SINE | 3.892 | 0.416 | −9.4 |
| LR/ERV | 1.457 | 0.107 | −13.6 |
| Total REs | 8.287 | 0.616 | −13.5 |
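
The fold-change column can be reproduced from the two concentration columns. The sketch below assumes that a decrease is reported as the negative ratio of the "All" to the "Young" concentration; this convention is inferred from the numbers in the table and is not stated in the source. The name `signed_fold_change` is illustrative only.

```python
# TFBS per kilobase from the table above: (all REs, young REs).
CONCENTRATIONS = {
    "LINE": (2.939, 0.093),
    "SINE": (3.892, 0.416),
    "LR/ERV": (1.457, 0.107),
    "Total REs": (8.287, 0.616),
}


def signed_fold_change(all_res: float, young: float) -> float:
    """Signed fold change of the young fraction relative to all REs.

    Assumed convention (inferred from the table, not stated in the source):
    a decrease is reported as -(all / young), an increase as +(young / all).
    """
    if all_res <= 0 or young <= 0:
        raise ValueError("concentrations must be positive")
    if young < all_res:
        return -(all_res / young)
    return young / all_res


# Round to one decimal place, as in the table.
fold_changes = {
    re_class: round(signed_fold_change(all_res, young), 1)
    for re_class, (all_res, young) in CONCENTRATIONS.items()
}

for re_class, value in fold_changes.items():
    print(f"{re_class}: {value}")
```

Under this assumed convention, the recomputed values match the published column for all four rows.
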
Figure 2. Distribution of GRE scores among known human genes. (A) Distribution of GRE for the young fraction of REs (0–8% divergence from the respective consensus sequence). (B) Distribution of GRE for all REs.
Figure 3. Comparison of GRE scores (ordinate) and GTE scores (abscissa) for known human genes. The color scale shows the density of points on the plot. Each dot represents a single gene. Pearson r: Pearson correlation coefficient; p: Pearson p-value.
Gene ontology (GO) functional annotation clusters in the top and bottom 6% of human genes sorted by their NGRE scores.
| Cluster, GO terms | Top 6% (percentage of clusters) | Bottom 6% (percentage of clusters) |
|---|---|---|
| Immunity and response to pathogens | 32 | – |
| Organ development and embryogenesis | 7 | 45 |
| Gene transcription and negative regulation | 6 | – |
| Chromatin assembly | 6 | 16 |
| Protein targeting to Golgi | 6 | – |
| Ubiquitination and protein degradation | 4 | – |
| Extracellular matrix organization | 4 | – |
| Regulation of STAT signaling | 4 | – |
| Perception organ development | 4 | – |
| Negative regulation of metabolism | 4 | – |
| Peptide modifications | 3 | – |
| Regulation of GTPases | 3 | – |
| Reproductive system development | 3 | – |
| Regulation of differentiation and cell proliferation | 3 | 2 |
| Regulation of body fluids | 2 | – |
| Transcription and processing of RNA | – | 16 |
| Ribosome assembly and protein translation regulation | – | 8 |
| Regulation of apoptosis | – | 2 |
| Proteasomal degradation | – | 2 |
| Steroid receptor signaling | – | 2 |
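
Selecting the top and bottom 6% of a score-ranked list can be sketched as follows. This is a minimal illustration only: the gene names and scores are hypothetical placeholders, not NGRE values from the study, and the rounding rule (integer floor, at least one item per tail) is an assumption because the source does not state how the cut-offs were drawn.

```python
def top_and_bottom(
    scores: dict[str, float], percent: int = 6
) -> tuple[list[str], list[str]]:
    """Return the names in the top and bottom `percent` of a score ranking.

    Names are ranked from highest to lowest score. The tail size is
    len(scores) * percent // 100, with a minimum of one (assumed rule).
    """
    if not 0 < percent <= 50:
        raise ValueError("percent must be in the range 1..50")
    ranked = sorted(scores, key=scores.get, reverse=True)
    k = max(1, len(ranked) * percent // 100)
    return ranked[:k], ranked[-k:]


# Hypothetical toy input: 100 placeholder genes with scores 0.0 .. 99.0.
toy_scores = {f"GENE_{i:03d}": float(i) for i in range(100)}

top, bottom = top_and_bottom(toy_scores)
print("top 6%:   ", top)
print("bottom 6%:", bottom)
```

The two name lists would then be passed to a functional annotation tool to obtain GO clusters such as those tabulated above.
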
Functional groups of top and bottom molecular pathways sorted by their NPII scores.
| Functional group | Top 6% (percentage) | Bottom 6% (percentage) |
|---|---|---|
| Fatty acids metabolism | 19 | – |
| Immunity and response to pathogens | 15 | 14 |
| Nuclear transport | 9 | – |
| Maturation of RNA (mRNA and small RNAs) | 9 | – |
| DNA repair and replication | 6 | – |
| Alpha-synuclein signaling | 5 | – |
| Ubiquitination and protein degradation | 3 | – |
| Protein targeting to Golgi | 3 | – |
| Nerve growth and neuronal signaling | – | 24 |
| Organ development, embryogenesis, and cell adhesion | – | 35 |
| IGF1R signaling and regulation of glucose metabolism | – | 9 |
Figure 4. Comparison of normalized transcription factor binding site distributions between the young (0–8% divergence from the respective consensus sequence) and total fractions of REs. (A) Comparison of NGRE scores (gene level of regulation); each dot represents a single gene. (B) Comparison of NPII scores (pathway level of regulation); each dot represents a single molecular pathway. Pearson r: Pearson correlation coefficient; p: Pearson p-value.
Functional groups of top and bottom molecular pathways sorted by the ratios of NPII scores for all and young RE-linked transcription factor binding sites.
| Functional group | Top 6% (percentage) | Bottom 6% (percentage) |
|---|---|---|
| Cell adhesion, Notch, Wnt, and integrin signaling | 20 | – |
| Immunity and response to pathogens | 20 | 17 |
| Nerve growth and neuronal signaling | 17 | – |
| Metabolism of chondroitin sulfate and heparin | 8 | – |
| Metabolism of cAMP | 6 | – |
| Endocytosis | 3 | – |
| IGF1R signaling and regulation of glucose metabolism | 3 | – |
| Cell cycle progression and regulation of apoptosis | – | 21 |
| PDGF, TGF beta, EGFR, and p38 signaling | – | 12 |
| Histone deacetylation and DNA methylation | – | 10 |
| Phospholipid metabolism | – | 9 |
| Insulin and AMPK signaling | – | 6 |
| Protein targeting to Golgi | – | 3 |
| Estrogen signaling and oocyte maturation | – | 3 |
Figure 5. Model of gene expression regulation by RE-linked transcription factor binding sites (TFBS). (A) Schematic representation of TFBS (arrows) that may overlap with REs (black boxes) close to the transcription start sites of known human genes (shown as “TSS”). (B) Outline of RE-linked TFBS regulation from the long-term and recent evolutionary perspectives. This figure aggregates data from both the single-gene and molecular pathway analyses.
Top and bottom immunity-linked molecular pathways sorted by the ratios of NPII scores for all and young RE-linked transcription factor binding sites.
| Pathway name | All NPII | Young NPII | Ratio (A/Y) |
|---|---|---|---|
| NCI downstream signaling in naive CD8 T cells pathway (pathway regulation of survival gene product expression) | 1.02 | 0.18 | 5.59 |
| NCI downstream signaling in naive CD8 T cells main pathway | 0.93 | 0.43 | 2.16 |
| Reactome toll like receptor 4 TLR4 cascade main pathway | 1.19 | 0.32 | 3.72 |
| Reactome interleukin receptor SHC signaling main pathway | 0.85 | 0.24 | 3.51 |
| Cytokine network pathway | 1.04 | 0.32 | 3.24 |
| NCI CXCR3 mediated signaling events pathway (cell adhesion) | 1.06 | 0.39 | 2.76 |
| NCI CXCR3 mediated signaling events pathway (actin polymerization or depolymerization) | 1.02 | 0.40 | 2.57 |
| NCI LPA receptor mediated events pathway (cAMP biosynthetic process) | 0.87 | 0.35 | 2.46 |
| Biocarta lck and fyn tyrosine kinases in initiation of tcr activation main pathway | 0.98 | 0.42 | 2.36 |
| NCI IL2 mediated signaling events pathway (T cell proliferation) | 1.00 | 0.46 | 2.19 |
| NCI BCR signaling pathway (reentry into mitotic cell cycle) | 0.88 | 0.42 | 2.12 |
| NCI IL4-mediated signaling events main pathway | 0.96 | 0.46 | 2.08 |
| KEGG inflammatory bowel disease IBD main pathway | 1.01 | 0.49 | 2.06 |
| Reactome IRAK2 mediated activation of TAK1 complex main pathway | 1.17 | 1.40 | 0.84 |
| Reactome IRAK2 mediated activation of TAK1 complex upon TLR7 8 or 9 stimulation main pathway | 1.17 | 1.40 | 0.84 |
| KEGG Fanconi anemia main pathway | 0.98 | 1.17 | 0.83 |
| Reactome Fanconi anemia main pathway | 0.89 | 1.08 | 0.82 |
| Reactome CD28 dependent Vav1 main pathway | 1.13 | 1.37 | 0.83 |
| Reactome thromboxane signaling through TP receptor main pathway | 0.94 | 1.15 | 0.82 |
| NCI Thromboxane A2 receptor signaling pathway (JNK cascade) | 0.95 | 1.20 | 0.79 |
| NCI Fc epsilon receptor I signaling in mast cells pathway (regulation of mast cell degranulation) | 0.82 | 1.03 | 0.79 |
| IL-10 pathway IL-10 responsive genes transcription of BCLXL cyclin-D1 D2 D3 Pim1 c-myc and P19 (INK4D) | 0.99 | 1.31 | 0.76 |
| IL-10 pathway inflammatory cytokine genes expression | 0.99 | 1.31 | 0.76 |
| Reactome membrane binding and targeting of GAG proteins main pathway | 0.94 | 1.25 | 0.75 |
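
The ratio column is the "All NPII" score divided by the "Young NPII" score. The sketch below recomputes it for a few rows copied from the table. The 0.1 tolerance is an assumption: the published ratios were presumably computed from unrounded scores, so values recomputed from the two-decimal columns can differ slightly. The labels returned by `classify` are illustrative shorthand, not terminology from the source.

```python
# (All NPII, Young NPII, published ratio) for selected rows of the table above.
ROWS = {
    "Reactome toll like receptor 4 TLR4 cascade main pathway": (1.19, 0.32, 3.72),
    "Cytokine network pathway": (1.04, 0.32, 3.24),
    "KEGG Fanconi anemia main pathway": (0.98, 1.17, 0.83),
    "Reactome membrane binding and targeting of GAG proteins main pathway": (0.94, 1.25, 0.75),
}

TOLERANCE = 0.1  # assumed allowance for rounding of the tabulated scores


def classify(all_npii: float, young_npii: float) -> tuple[float, str]:
    """Return the A/Y ratio and a rough label for which RE fraction dominates."""
    if young_npii <= 0:
        raise ValueError("young NPII must be positive")
    ratio = all_npii / young_npii
    if ratio > 1:
        label = "relatively stronger for all (older) REs"
    elif ratio < 1:
        label = "relatively stronger for young REs"
    else:
        label = "balanced"
    return ratio, label


results = {}
for pathway, (all_npii, young_npii, published) in ROWS.items():
    ratio, label = classify(all_npii, young_npii)
    consistent = abs(ratio - published) <= TOLERANCE
    results[pathway] = (round(ratio, 2), label, consistent)
    print(f"{pathway}: ratio={ratio:.2f} (published {published}), {label}")
```

Rows with a ratio above 1 fall in the top group of the table, and rows with a ratio below 1 fall in the bottom group.
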