Konstantin Kruse, Martin Nettling, Nadine Wappler, Alexander Emmer, Malte Kornhuber, Martin S Staege, Ivo Grosse.
Abstract
More than eight percent of the human genome consists of human endogenous retroviruses (HERVs). Typically, the expression of HERVs is repressed, but varying activities of HERVs have been observed in diseases ranging from cancer to neurodegeneration. Such activities can include the transcription of HERV-derived open reading frames, which can be translated into proteins. However, as a consequence of mutations that disrupt open reading frames, most HERV-like sequences have lost their protein-coding capacity. Nevertheless, these loci can still influence the expression of adjacent genes and, hence, mediate biological effects. Here, we present WebHERV (http://calypso.informatik.uni-halle.de/WebHERV/), a web server that enables the computational prediction of active HERV-like sequences in the human genome. The prediction is based on a comparison of the genome coordinates of expressed sequences uploaded by the user with the genome coordinates of HERV-like sequences stored in the specialized key-value store DRUMS. Using WebHERV, we predicted putative candidates of active HERV-like sequences in Hodgkin lymphoma (HL) cell lines, validated one of them by a modified SMART (switching mechanism at 5' end of RNA template) technique, and identified a new alternative transcription start site for cytochrome P450, family 4, subfamily Z, polypeptide 1 (CYP4Z1).
Keywords: BLAST; CYP4Z1; DRUMS; HERVs; Hodgkin lymphoma; database; endogenous retroviruses; web server
Year: 2018 PMID: 30455669 PMCID: PMC6231192 DOI: 10.3389/fmicb.2018.02384
Source DB: PubMed Journal: Front Microbiol ISSN: 1664-302X Impact factor: 5.640
Figure 1. WebHERV data flow. The user can upload a file with genome coordinates or a file with probe set IDs, which are then transformed into genome coordinates using the integrated probe set database. The user can then set search parameters and start the search for HERV-like sequences in the chromosomal neighborhood of these genome coordinates.
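The neighborhood search described in this caption can be illustrated with a minimal Python sketch. It is not the WebHERV implementation and does not use DRUMS: the in-memory index, the function names (`build_index`, `gap`, `hits_in_neighborhood`), the closed-interval convention, and the example coordinates are all assumptions made for illustration only.

```python
from collections import defaultdict


def build_index(herv_loci):
    """Group (chrom, start, end) loci by chromosome, sorted by start.

    Hypothetical stand-in for the stored HERV-like sequence coordinates.
    """
    index = defaultdict(list)
    for chrom, start, end in herv_loci:
        index[chrom].append((start, end))
    for loci in index.values():
        loci.sort()
    return index


def gap(a_start, a_end, b_start, b_end):
    """Distance in bp between two closed intervals; 0 if they overlap."""
    return max(0, b_start - a_end, a_start - b_end)


def hits_in_neighborhood(index, chrom, start, end, max_distance):
    """Return loci on `chrom` at most `max_distance` bp from [start, end]."""
    return [
        (s, e)
        for s, e in index.get(chrom, [])
        if gap(start, end, s, e) <= max_distance
    ]


# Made-up coordinates, for illustration only.
index = build_index([
    ("chr1", 1000, 1500),
    ("chr1", 5000, 5600),
    ("chr2", 300, 900),
])
# An expressed region on chr1 lying 200 bp downstream of the first locus.
print(hits_in_neighborhood(index, "chr1", 1700, 1800, max_distance=200))
```

A linear scan per query is enough for a sketch; a service answering many queries would more plausibly use an interval tree or a sorted-coordinate lookup.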
Figure 2. Probe sets with high signal intensities in HL cells are located preferentially in the neighborhood of HERV-like sequences. (A) Chromosomal neighborhoods of probe sets with high expression in HL cells (closed circles; GEO data set GSE47686) or normal blood cells (open circles; GEO data set GSE18838) were analyzed for the presence of HERV-like sequences. Shown are the percentages of probe sets with hits at distances between 1 and 10,000 bp. (B) Shown is the ratio of the two percentages of probe sets with hits, calculated from up-regulated and down-regulated probe sets. The HERV association of HL-specific probe sets is most pronounced at a distance of approximately 200 bp from the probe set.
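The quantities plotted in this figure, the percentage of probe sets with a hit within a given distance and the ratio of those percentages between two groups, can be sketched as follows. This is only an illustration under stated assumptions: the input is assumed to be a list holding, for each probe set, the distance in bp to its nearest HERV-like sequence (`None` if there is none), and the function names and numbers are hypothetical rather than taken from the paper.

```python
def percent_with_hits(nearest_distances, max_distance):
    """Percentage of probe sets whose nearest HERV-like sequence
    lies within `max_distance` bp (None means no sequence found)."""
    if not nearest_distances:
        return 0.0
    n_hits = sum(
        1 for d in nearest_distances
        if d is not None and d <= max_distance
    )
    return 100.0 * n_hits / len(nearest_distances)


def hit_ratio(group_a, group_b, max_distance):
    """Ratio of the two hit percentages; None if group_b has no hits."""
    pct_a = percent_with_hits(group_a, max_distance)
    pct_b = percent_with_hits(group_b, max_distance)
    return pct_a / pct_b if pct_b > 0 else None


# Made-up nearest-HERV distances (bp) for two groups of probe sets.
up_regulated = [50, 150, 800, None]
down_regulated = [5000, 120, 9000, 20000]

for max_distance in (10, 200, 10000):
    print(
        max_distance,
        percent_with_hits(up_regulated, max_distance),
        percent_with_hits(down_regulated, max_distance),
        hit_ratio(up_regulated, down_regulated, max_distance),
    )
```

Scanning `max_distance` over a range and plotting the ratio would give a curve of the kind described for panel (B); with real data, a peak in that curve marks the distance at which the association is strongest.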
Figure 3. Signal intensities of six up-regulated genes in HL cell lines in the vicinity of HERV-like sequences. Shown are means and standard deviations for probe sets identified as associated with HERV-like sequences in HL cell lines (closed bars; GEO data set GSE47686) and normal blood cells (open bars; GEO data set GSE18838). Strong up-regulation in HL was found for a probe set corresponding to the gene cytochrome P450, family 4, subfamily Z, polypeptide 1 (CYP4Z1).
Figure 4. CYP4Z1 is an HL-associated gene. Expression of CYP4Z1 in HL samples (from GEO data sets GSE12453, GSE12427, GSE20011, GSE25986, GSE39134) and normal tissues (from GEO data set GSE7307) was assessed in micro-array data sets from the GEO database (Affymetrix HG-U133Plus2.0 micro-array data). High signal intensities were observed in the majority of HL samples. Among the normal tissues, only mammary gland expresses CYP4Z1.
Figure 5. Identification of alternative transcripts and promoters of human CYP4Z1. (A) The SMART technique was used for the identification of the 5′ end of CYP4Z1 transcripts. Three different transcripts were identified. A + B: RNA from two different samples of the HL cell line L-428 was used for SMART; M: DNA size marker. (B) Primers specific for exons 1 and 3 of CYP4Z1 amplified two CYP4Z1 splice variants (1 and 2) in 3/5 HL cell lines but not in normal peripheral blood mononuclear cells (PBMC). Primers specific for exon 10 and the adjacent HERV-like sequence detected transcripts (3) with an alternative transcription start site in HL cell lines but not in PBMC; ntc, no template control; M, DNA size marker. (C) Schematic representation of identified CYP4Z1 transcripts. Exons are indicated by blue boxes. The position of the HERV-like sequence is indicated by a red box.