Robin-Lee Troskie, Yohaann Jafrani, Tim R Mercer, Adam D Ewing, Geoffrey J Faulkner, Seth W Cheetham.
Abstract
Pseudogenes are gene copies presumed to mainly be functionless relics of evolution due to acquired deleterious mutations or transcriptional silencing. Using deep full-length PacBio cDNA sequencing of normal human tissues and cancer cell lines, we identify here hundreds of novel transcribed pseudogenes expressed in tissue-specific patterns. Some pseudogene transcripts have intact open reading frames and are translated in cultured cells, representing unannotated protein-coding genes. To assess the biological impact of noncoding pseudogenes, we CRISPR-Cas9 delete the nucleus-enriched pseudogene PDCL3P4 and observe hundreds of perturbed genes. This study highlights pseudogenes as a complex and dynamic component of the human transcriptional landscape.
Keywords: CRISPR; Long-read; PacBio; Pseudogene; lncRNA
Year: 2021 PMID: 33971925 PMCID: PMC8108447 DOI: 10.1186/s13059-021-02369-0
Source DB: PubMed Journal: Genome Biol ISSN: 1474-7596 Impact factor: 13.583
Fig. 1 Long-read cDNA sequencing elucidates the human pseudogene transcriptome. a Full-length consensus PacBio cDNA reads from normal tissues and cell lines were compared to Gencode annotations to generate a pseudogene transcriptome. b Most transcribed pseudogenes identified here were absent from Gencode. c The transcription start sites (TSSs) of full-length pseudogene transcripts are enriched for CAGE-seq signal (data from FANTOM5 [24]). d Open reading frame (ORF) lengths of potentially coding-independent pseudogene transcripts. e Fraction of parental ORF length found intact in transcribed potentially coding-independent pseudogenes. f HMGB1P1 has a novel 5′ exon and is transcribed from an upstream CAGE-confirmed TSS. g Expression of 3XHA-tagged pseudogene ORFs in HEK293T cells detected by Western blot. HMGB1P1 and AK4P3 are translated when expressed in cultured cells. h A novel isoform of retinoblastoma (RB1) is transcribed from a TSS located within the pseudogene PPP1R26P1. The pseudogene sequence adds 179 codons to the RB1 ORF
Fig. 2 Deletion of PDCL3P4 impacts the transcriptome of haploid cells. a PDCL3P4 is a pseudogene transcribed in HAP1 cells from the canonical long terminal repeat (LTR) promoter of an upstream human endogenous retrovirus-K (HERV-K) sequence. Grey bars within the PacBio reads represent exons and light blue bars represent introns. b CRISPR-Cas9 genome engineering removes the retroposed portion of PDCL3P4 from the HAP1 genome in three independent clones. c PDCL3P4 expression is ablated in PDCL3P4 mutant clones. d PDCL3P4 ablation disrupted the expression of more than 137 genes, while PDCL3 expression was unaffected. e PDCL3P4 transcripts were enriched in the nucleus. The nuclear-localised noncoding RNA MALAT1 and mRNA ACTB act as controls