Ming Sun, Yunfei Wang, Caishang Zheng, Yanjun Wei, Jiakai Hou, Peng Zhang, Wei He, Xiangdong Lv, Yao Ding, Han Liang, Chung-Chau Hon, Xi Chen, Han Xu, Yiwen Chen.
Abstract
BACKGROUND: The human genome encodes over 14,000 pseudogenes, which are evolutionary relics of protein-coding genes and are commonly considered nonfunctional. Emerging evidence suggests that some pseudogenes may exert important functions. However, the extent to which human pseudogenes are functionally relevant remains unclear. There has been no large-scale characterization of pseudogene function because of technical challenges, including high sequence similarity between pseudogenes and their parent genes, and poor annotation of transcription start sites.
Keywords: CRISPR interference; Cancer; FOXA1; FOXM1; GTEx; Luminal A breast cancer; Nucleus; Pseudogene; TCGA; Transcriptional regulation; Unitary pseudogene
Year: 2021 PMID: 34425866 PMCID: PMC8381491 DOI: 10.1186/s13059-021-02464-2
Source DB: PubMed Journal: Genome Biol ISSN: 1474-7596 Impact factor: 17.906
Fig. 1 An integrated computational pipeline for designing a CRISPRi sgRNA library to screen for functional human pseudogenes and parent genes. A A workflow of the sgRNA library design for the pseudogene-focused CRISPRi screen. B The number of sgRNAs targeting pseudogenes and parent genes, and of positive and negative control sgRNAs, that were included in the screen. C The pie chart showing the percentage of different types of pseudogenes included in the screen. D The distribution of the number of pseudogenes per parent gene that were included in the screen
Fig. 2 CRISPRi screen reveals functional human pseudogenes. A Schema depicting the workflow for construction of lentiviral vectors encoding the sgRNA library and the experimental design of the CRISPRi screens. B The scatter plot showing the log2 fold change (FC) and the statistical significance (−log10 P value) of sgRNA abundance difference between day 21 and day 0 for the negatively selected pseudogenes (log2FC < 0), and the pie chart showing the percentage of unitary, processed, and unprocessed pseudogenes among all pseudogene hits in MCF7 cells. The dots corresponding to the pseudogene hits are shown in red. The examples of pseudogene hits MGAT4EP, DDX12P, TUBBP5, and PRELID1P1 are highlighted in different colors. C The scatter plot showing the log2FC and the statistical significance (−log10 P value) of sgRNA abundance difference between day 21 and day 0 for the negatively selected parent genes (log2FC < 0). The dots corresponding to the parent gene hits are shown in red. The examples of parent gene hits RPL27A, MRPS31, EIF3E, and LHDA are shown in different colors. D The scatter plot showing the log2FC of sgRNA abundance difference between day 21 and day 0 for pseudogenes and their corresponding parent genes. The dots corresponding to the pseudogene hits whose parent gene does not pass the statistical significance threshold are shown in red. The examples of these pseudogene hits PSPHP1, COX20P1, and CD99P1 are shown in different colors
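The log2FC plotted in Fig. 2 compares sgRNA abundance at day 21 with day 0. The record does not say how read counts were normalized or which software was used, so the following is only a minimal sketch under assumed conventions: counts-per-million normalization and a pseudocount to avoid division by zero. The function name and parameters are hypothetical.

```python
import math


def log2_fold_change(count_day0, count_day21, total_day0, total_day21,
                     pseudocount=1.0):
    """log2 ratio of normalized sgRNA abundance (day 21 vs. day 0).

    Assumptions (not from the source): counts-per-million normalization
    and an additive pseudocount.
    """
    # Scale raw read counts by library size so the two time points are comparable.
    cpm_day0 = count_day0 / total_day0 * 1e6
    cpm_day21 = count_day21 / total_day21 * 1e6
    # A negative value means the sgRNA was depleted (negative selection).
    return math.log2((cpm_day21 + pseudocount) / (cpm_day0 + pseudocount))


# An sgRNA dropping from 400 to 100 reads at equal sequencing depth.
lfc = log2_fold_change(400, 100, 1_000_000, 1_000_000, pseudocount=0.0)
```

With equal library sizes and no pseudocount, a fourfold drop in reads gives a log2FC of about -2, which would fall in the negatively selected region (log2FC < 0) described in the caption.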
Fig. 3 Validation of top pseudogene hits in MCF7 cells. A A bar graph shows the log2FC of sgRNA abundance difference between day 21 and day 0 for the top-ranked (by log2FC) pseudogene hits in MCF7 cells that showed a significant upregulation in breast cancer compared with normal breast tissues based on TCGA data. B qRT-PCR analysis of the RNA level of MGAT4EP, DDX12P, PRELID1P1, and TUBBP5 in MCF7-dCas9 cells transduced with negative control non-targeting sgRNA (sg-NT) or gene-specific sgRNA. GAPDH was used as an internal control. The growth of MCF7-dCas9 cells transduced with sg-NT or gene-specific sgRNA for C MGAT4EP, D DDX12P, E PRELID1P1, and F TUBBP5 was monitored (OD450 absorbance for WST-8 formazan) every 24 h with CCK-8 assay for 96 h. G The representative pictures of clonogenic growth and H the bar graph quantifying the colonies formed by MCF7-dCas9 cells transduced with sg-NT or gene-specific sgRNAs for MGAT4EP, DDX12P, PRELID1P1, and TUBBP5, after cells were cultured for 2 weeks. All data are shown as mean ± standard deviation (SD), n = 3. The Student’s t test was used to assess the statistical significance of difference in mean between two experimental groups (*p < 0.05; **p < 0.01; ns: not significant, p ≥ 0.05)
Fig. 4 MGAT4EP is predominantly localized in the nucleus and interacts with transcription factor FOXA1. A The RNA level of MGAT4EP in the nuclear and cytoplasmic fractions of MCF7 and T47D cells was measured by qRT-PCR. MALAT1 RNA and GAPDH mRNA were used as positive controls for the nuclear and cytoplasmic fractions, respectively. B The proteins retrieved by RNA pull-down with MGAT4EP RNA and negative control antisense RNA (AS) were visualized by silver staining and subjected to mass spectrometry (MS) analysis. C RNA pull-down coupled with western blot validated the interaction between MGAT4EP and FOXA1 that was identified from MS analysis. SP1, which was not found in MS analysis, was used as a negative control. D RNA pull-down of the antisense, full-length, and serial deletion mutants of MGAT4EP RNA followed by anti-FOXA1/anti-SP1 western blotting. The four serial deletion mutants of MGAT4EP RNA were generated by deleting 1–700, 700–1400, 1400–2100, or 2100–2819 bp, respectively. E RIP-qPCR analysis with anti-FOXA1 or anti-IgG antibody validated the association of FOXA1 with MGAT4EP RNA, where MALAT1 and GAPDH RNA were used as negative controls. All data are shown as mean ± SD, n = 3. The Student’s t test was used to assess the statistical significance of difference in mean between two experimental groups (*p < 0.05; **p < 0.01; ns: not significant, p ≥ 0.05)
Fig. 5 MGAT4EP upregulates the expression of FOXM1, a FOXA1 target, and enhances FOXA1 binding to its promoter. A Schema depicting the workflow of identifying potential protein-coding gene targets that were co-regulated by MGAT4EP and FOXA1 and were important for mediating their tumor-promoting function in luminal A breast cancer. B qRT-PCR analysis of FOXM1 mRNA expression and C western blot for measuring FOXM1 protein expression in MCF7 and T47D cells that were treated with negative control non-targeting siRNA (si-NC) or MGAT4EP-targeting siRNAs. D qRT-PCR analysis of FOXM1 mRNA expression and E western blot for measuring FOXM1 protein expression in MCF7 cells and T47D cells that were treated with si-NC or FOXA1-targeting siRNAs. F The signal track of FOXA1 ChIP-seq and the corresponding input in MCF7 and T47D cells. The identified ChIP-seq peaks were drawn as horizontal lines above the signal track. G ChIP-qPCR analysis was performed with anti-FOXA1 or anti-IgG antibody in MCF7 and T47D cells to confirm the enrichment of DNA fragments covering the FOXA1 ChIP-seq peak in the FOXM1 promoter. The effect of si-NC or MGAT4EP-targeting siRNAs on the binding of FOXA1 to the same region was assessed by ChIP-qPCR. H Western blot for measuring FOXM1 protein expression in MCF7 cells and T47D cells that were transduced with negative control non-targeting shRNA (sh-NC) or FOXM1-targeting shRNAs. I The growth of MCF7 and T47D cells transduced with sh-NC or FOXM1-targeting shRNAs was monitored (OD450 absorbance for WST-8 formazan) every 24 h with CCK-8 assay for 96 h. All data are shown as mean ± SD, n = 3. The Student’s t test was used to assess the statistical significance of difference in mean between two experimental groups (*p < 0.05; **p < 0.01; ns: not significant, p ≥ 0.05)
Fig. 6 Integrative analyses of TCGA data reveal clinically relevant unitary pseudogenes in human cancer. A The unitary pseudogenes with significant up-/downregulation in tumors vs. the corresponding normal tissues in at least four cancer types, based on TCGA data, are shown. The circle size is proportional to the significance level and the log2(fold change) between tumors and normal tissues is shown by color scale. The unitary pseudogenes whose expression was significantly B associated with patient OS in at least three cancer types or C associated with patient RFS in at least two cancer types are shown. The circle size is proportional to the significance level. The unitary pseudogenes whose expression showed a positive or negative natural logarithm of hazard ratio (HR) in a given cancer type are colored in red and blue, respectively. D Higher CMAHP expression was associated with better patient OS in LUAD and SKCM. E Higher expression of unitary pseudogene CPHL1P was associated with worse patient OS in KIRC and worse patient RFS in PRAD. F Higher MYH16 expression was associated with both worse patient OS and RFS in PAAD. The p values shown in the figures were calculated based on the log-rank test. The Kaplan-Meier survival curves are plotted as solid lines accompanied by 95% confidence intervals
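For readers unfamiliar with the Kaplan-Meier curves in Fig. 6, the product-limit estimate behind them can be sketched in a few lines. This is an illustrative implementation on toy data, not the authors' analysis code. The function name and input layout are assumptions, and the confidence intervals and log-rank test are omitted.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with an observed event.

    times:  follow-up durations per patient
    events: 1 if the event (e.g. death) was observed, 0 if censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            # Product-limit step: multiply by the fraction surviving past t.
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        # Both events and censored patients leave the risk set after t.
        n_at_risk -= removed
    return curve


# Toy cohort: the patient at t=2 is censored.
km_curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
```

Censored patients do not lower the curve directly, but they shrink the risk set, so later events produce larger steps.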