Martha V Koerner1, Kashyap Chhatbar1, Shaun Webb1, Justyna Cholewa-Waclaw1, Jim Selfridge1, Dina De Sousa1, Bill Skarnes2, Barry Rosen2, Mark Thomas2, Joanna Bottomley2, Ramiro Ramires-Solis2, Christopher Lelliott2, David J Adams2, Adrian Bird1,2.
Abstract
Most human genes are associated with promoters embedded in non-methylated, G + C-rich CpG islands (CGIs). Not all CGIs are found at annotated promoters, however, raising the possibility that many serve as promoters for transcripts that do not code for proteins. To test this hypothesis, we searched for novel transcripts in embryonic stem cells (ESCs) that originate within orphan CGIs. Among several candidates, we detected a transcript that included three members of the let-7 micro-RNA family: let-7a-1, let-7f-1, and let-7d. Deletion of the CGI prevented expression of the precursor RNA and depleted the included miRNAs. Mice homozygous for this mutation were sub-viable and showed growth and other defects. The results suggest that despite the identity of their seed sequences, members of the let-7 miRNA family exert distinct functions that cannot be complemented by other members.
Keywords: Let-7; micro-RNA; mouse genetics
Year: 2019 PMID: 31354981 PMCID: PMC6660310 DOI: 10.3390/epigenomes3010007
Source DB: PubMed Journal: Epigenomes ISSN: 2075-4655
Figure 1. A conserved long ncRNA overlapping three miRNAs is expressed from an orphan CpG island. (A) Mouse RNA-Seq data displayed on the UCSC genome browser on mouse chromosome 13 (mm9). RNA-Seq reads mapping to the + and - strands are shown separately for RNA harvested from embryonic stem cells (ESCs), embryoid bodies (EBs), neuronal cells, and adult mouse brain. CpG islands were annotated according to [12] and grouped into “all CGIs” (green bars) and “orphan CGIs” (purple bars). During the course of the study, a transcript originating from this orphan CGI was annotated as KY467470, a transcript overlapping three annotated miRNAs. The region that was deleted in mice is shown as a black bar. Primers for qPCR assays (a, b, c) are shown as black lines. H3K4me3 and conservation tracks from the UCSC genome browser are shown. (B) Human RNA-Seq data displayed on the UCSC genome browser on human chromosome 9 (hg19). RNA-Seq data from human wildtype LUMES cells are plotted in green. This transcript starts from a CGI syntenic to the mouse orphan CGI in (A) (green bar) and overlaps the same three miRNAs. (C) Sequences of the let-7 family. Bases differing from let-7a are shown in red, missing bases as -. The species of origin of each miRNA is indicated as mouse (mm) and/or human (hs). miRNAs framed in red are those shown in (A) and (B).
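The base-by-base comparison in panel (C) can be sketched in a few lines of Python for the three miRNAs in the cluster. This is only an illustration: the sequences are assumed to match the mature 5p entries in miRBase and should be checked against the database, and the helper names `seed` and `diff_positions` are hypothetical. The seed is taken here as nucleotides 2 to 8, the usual convention.

```python
# Mature let-7 sequences (ASSUMED to match miRBase 5p entries; verify before use).
LET7 = {
    "let-7a": "UGAGGUAGUAGGUUGUAUAGUU",
    "let-7f": "UGAGGUAGUAGAUUGUAUAGUU",
    "let-7d": "AGAGGUAGUAGGUUGCAUAGUU",
}


def seed(mirna: str) -> str:
    """Return the seed region, taken as nucleotides 2-8 (1-based)."""
    return mirna[1:8]


def diff_positions(reference: str, other: str) -> list[int]:
    """Return 1-based positions where two equal-length sequences differ."""
    if len(reference) != len(other):
        raise ValueError("sequences must have equal length for this simple comparison")
    return [i + 1 for i, (a, b) in enumerate(zip(reference, other)) if a != b]


if __name__ == "__main__":
    ref = LET7["let-7a"]
    for name, seq in LET7.items():
        # Report the seed and every position that differs from let-7a.
        print(name, seed(seq), diff_positions(ref, seq))
```

Under these assumed sequences, all three miRNAs share the same seed and differ from let-7a only outside it, which is the observation behind the abstract's remark on seed identity.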
Figure 2. KY467470 originates from the orphan CGI, shows widespread expression, and is capped and nuclear localised. (A) 5′ RACE maps the transcription start site (TSS) of KY467470. A plot of the CpG frequency is shown at the top. Below, a purple bar indicates the orphan CGI as mapped by [12,14]. Arrows indicate the three TSSs identified by sequencing of the 5′ RACE products, with bp coordinates shown next to the arrows. Numbers in brackets indicate the number of sequence reads for each TSS. The purple wavy line indicates KY467470. The photograph of the gel below shows the products of the outer and inner PCRs according to the 5′ RACE procedure. (B) qPCR analysis measuring the relative abundance of KY467470 in different mouse cells/tissues. The location of the primers is shown in Figure 1A. Tissues are denoted as follows: embryonic stem cells (ES); embryoid bodies (EB); neuronal cells (N); liver (LIV); spleen (SPL); kidney (KID); lung (LUN); heart (HEA); thigh muscle (MUS); brain (BRA); testis (TES); 12.5 dpc embryonic trunk (EMB); 12.5 dpc embryonic head (E. HEAD); visceral yolk sac (VYS); placenta (PLA). Shown are the mean and standard deviation for two (ES, EB, N) or three (all other tissues) biological replicates. (C) qPCR analysis of 5′-Cap IP. Mock IP is shown in black, Cap-IP in grey. MeCP2, Cyclophilin A (CypA), and Gapdh were controls for capped mRNAs, whereas 5S rRNA, 5.8S rRNA, and 28S rRNA were controls for uncapped RNAs. (D) qPCR analysis of relative transcript abundance in the nuclear (N; black) or cytoplasmic (C; grey) fraction in embryoid bodies (EB). Controls for cytoplasmic RNAs were Cyclophilin A (CypA) and Gapdh; controls for nuclear-localised RNAs were Airn, Kcnq1ot1, and the 45S pre-rRNA.
Figure 3. Knock-out of the orphan CGI leads to sub-viability, increased weight, and increased body length. (A) Scheme showing the strategy to knock out the orphan CGI. At the wildtype locus, the orphan CGI (purple bar) drives expression of KY467470 (purple wavy line). A loxP site was integrated 5′ of the orphan CGI. Downstream of the CGI, a STOP/selection cassette containing a flippase recognition target (FRT) site, a beta-globin stop cassette, a rox site, a neomycin-resistance cassette, a second rox site, a second FRT site, and finally a second loxP site was inserted to create a STOP allele. The STOP allele was converted into a KO allele by CRE recombination, deleting the orphan CGI as well as the STOP/selection cassette. (B) qPCR analysis was used to detect the relative abundance of KY467470 in wildtype, heterozygous, and homozygous KO animals. Shown are three different assays along the length of KY467470 (a, b, c; for location see Figure 1A). The mean and standard deviation of three animals/genotype are shown. Levels were normalised to Cyclophilin A. (C) qPCR analysis was used to detect the relative abundance of the let-7a mature miRNA in wildtype, heterozygous, and homozygous KO animals. The mean and standard deviation of three animals/genotype are shown. Levels were normalised to Snord68. (D) qPCR analysis was used to detect the relative abundance of the let-7d mature miRNA in wildtype, heterozygous, and homozygous KO animals. The mean and standard deviation of three animals/genotype are shown. Levels were normalised to Snord68. (E) qPCR analysis was used to detect the relative abundance of the let-7f mature miRNA in wildtype, heterozygous, and homozygous KO animals. The mean and standard deviation of three animals/genotype are shown. Levels were normalised to Snord68. (F) qPCR analysis was used to detect the relative abundance of Zfp169. Details as in (B). (G) qPCR analysis was used to detect the relative abundance of Ptpdc1. Details as in (B).
(H) The number of offspring at P14 that are wildtype, heterozygous, or homozygous for the KO allele. Chi squared = 17.340 (2 degrees of freedom), 2-tailed p value = 0.0002. (I) The number of offspring at E14.5 that are wildtype, heterozygous, or homozygous for the KO allele. Chi squared = 2.733 (2 degrees of freedom), 2-tailed p value = 0.2550. (J) Scheme indicating the detection of phenotypes by measuring various parameters. Mutant data were compared to a reference range generated from age-, sex-, and strain-matched wildtypes. A blue square indicates the absence of a detectable difference from the wildtype, and a red square indicates an automatically detected phenotype. A red square with a blue triangle indicates that the automatic call found no significant difference from the wildtype, but that this call was manually overridden based on observation, resulting in a significant phenotype. (K) Body weight in females homozygous for the KO allele (purple), or controls (black). Shown are the mean and standard deviation. Baseline values (median, 97.5% and 2.5%) are shown in grey. Here, a manual call for a significant phenotype was made, which was supported by observations in males. (L) Body weight in males. A manual call was based on the appearance of a phenotype that, although observed at a low frequency in the mutant, is rarely observed in the baseline wildtype population. (M) Body length in females. The box extends from the 25th to the 75th percentile, the horizontal line is the median, and whiskers denote the minimum and maximum values. Here, a manual call for a significant phenotype was made, which was supported by an observation in males. (N) Body length in males. A manual call for significance was based on the appearance of a phenotype that, although observed at a low frequency in the mutant, is rarely observed in the baseline wildtype population.
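The p values reported in panels (H) and (I) can be checked from the chi-squared statistics alone, because with 2 degrees of freedom the upper-tail probability has the closed form p = exp(-χ²/2). The Python sketch below does this check and also shows how such a statistic would be computed from genotype counts. The 1:2:1 expected ratio is an assumption (it is the Mendelian expectation for a heterozygous intercross, which the caption does not state), the function names are hypothetical, and the counts in the usage line are placeholders, not data from the study.

```python
import math


def chi_square_gof(observed, expected_ratio=(1, 2, 1)):
    """Goodness-of-fit statistic for genotype counts (wildtype, het, hom).

    expected_ratio defaults to 1:2:1, an ASSUMED Mendelian expectation.
    """
    total = sum(observed)
    ratio_sum = sum(expected_ratio)
    expected = [total * r / ratio_sum for r in expected_ratio]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def p_value_df2(chi2):
    """Upper-tail p value of a chi-squared statistic with exactly 2 degrees of freedom."""
    return math.exp(-chi2 / 2.0)


if __name__ == "__main__":
    # Statistics as reported in the caption.
    print(f"P14:   p = {p_value_df2(17.340):.4f}")
    print(f"E14.5: p = {p_value_df2(2.733):.4f}")
    # Placeholder counts that fit 1:2:1 exactly (NOT data from the study).
    print(chi_square_gof([10, 20, 10]))
```

Both reported p values (0.0002 at P14 and 0.2550 at E14.5) are reproduced by the closed form, consistent with a deviation from the expected genotype distribution after birth but not at E14.5.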
Figure 4. Knock-out of the orphan CGI affects plasma chemistry and gamma delta T cells. (A) Triglyceride levels in females homozygous for the KO allele (purple), or controls (black). The box extends from the 25th to the 75th percentile, the horizontal line is the median, and whiskers denote the minimum and maximum values. Baseline values (median, 97.5% and 2.5%) are shown in grey. Automatic significance called. (B) Triglyceride levels in males. Automatic significance called. (C) Cholesterol levels in females. The manual call for significance is supported by an observation in another parameter (triglycerides). It is based on mutant data clustering at the periphery of, and outside, the reference range. The auto-call rule is considered too stringent on this occasion. (D) Cholesterol levels in males. No significant difference. (E) Amylase levels in females. Automatic significance called. (F) Amylase levels in males. Automatic significance called. (G) Gamma delta T cell number in females. Automatic significance called. (H) Gamma delta T cell number in males. No significant difference. (I) Gamma delta T cell % in females. The manual call is supported by an observation in another parameter (gamma delta T cell number). (J) Gamma delta T cell % in males. No significant difference.
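The grey baseline values in these panels (median, 2.5% and 97.5%) describe a reference range drawn from wildtype animals. A minimal Python sketch of how such a range might be computed, and how mutant values could be compared against it, is given below. It is not the phenotyping pipeline's actual auto-call rule, which the captions do not specify; the linear-interpolation percentile, the function names, and the example values are all assumptions made for illustration.

```python
import math


def percentile(values, q):
    """Percentile by linear interpolation between order statistics, q in [0, 100]."""
    data = sorted(values)
    if not data:
        raise ValueError("no values supplied")
    pos = (len(data) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    return data[lo] + (data[hi] - data[lo]) * (pos - lo)


def reference_range(baseline):
    """Return (2.5th percentile, median, 97.5th percentile) of wildtype baseline values."""
    return percentile(baseline, 2.5), percentile(baseline, 50), percentile(baseline, 97.5)


def fraction_outside(mutants, low, high):
    """Fraction of mutant values falling outside the reference range."""
    outside = [v for v in mutants if v < low or v > high]
    return len(outside) / len(mutants)


if __name__ == "__main__":
    # Placeholder baseline and mutant values (NOT data from the study).
    baseline = list(range(101))
    low, median, high = reference_range(baseline)
    print(low, median, high)
    print(fraction_outside([1, 50, 99, 60], low, high))
```

How large a fraction of mutant values must fall outside the range before a phenotype is called is a pipeline-specific decision, which is why the captions distinguish automatic calls from manual overrides.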