Liang Wu1, Xiaolong Zhang2, Zhikun Zhao3, Ling Wang4, Bo Li1, Guibo Li5, Michael Dean6, Qichao Yu7, Yanhui Wang1, Xinxin Lin1, Weijian Rao1, Zhanlong Mei1, Yang Li1, Runze Jiang1, Huan Yang1, Fuqiang Li1, Guoyun Xie1, Liqin Xu1, Kui Wu1, Jie Zhang1, Jianghao Chen4, Ting Wang4, Karsten Kristiansen8, Xiuqing Zhang9, Yingrui Li10, Huanming Yang11, Jian Wang11, Yong Hou5, Xun Xu1.
Abstract
BACKGROUND: Viral infection causes multiple forms of human cancer, and HPV infection is the primary factor in cervical carcinomas. Recent single-cell RNA-seq studies highlight the tumor heterogeneity present in most cancers, but virally induced tumors have not been studied. HeLa is a well-characterized HPV+ cervical cancer cell line. RESULT: We developed a new high-throughput platform to prepare single-cell RNA on a nanoliter scale based on a customized microwell chip. Using this method, we successfully amplified full-length transcripts of 669 single HeLa S3 cells, 40 of which were randomly selected for single-cell RNA sequencing. Based on these data, we obtained a comprehensive understanding of the heterogeneity of HeLa S3 cells in gene expression, alternative splicing and fusions. Furthermore, we identified a high diversity of HPV-18 expression and splicing at the single-cell level. By co-expression analysis, we identified 283 E6 and E7 co-regulated genes, including CDC25, PCNA, PLK4, BUB1B and IRF1, which are known to interact with HPV viral proteins.
Keywords: Cancer; HPV; HeLa; RNA splicing; Single-cell transcriptome; Tumor heterogeneity; Virus
Year: 2015 PMID: 26550473 PMCID: PMC4635585 DOI: 10.1186/s13742-015-0091-4
Source DB: PubMed Journal: Gigascience ISSN: 2047-217X Impact factor: 6.524
Fig. 1 The schematic diagram of MIRALCS. a The flowchart of MIRALCS. b Box plots of the Ct (left) and Tm (right) values of the 20 % Percoll solution (negative control) and 10 pg total RNA (positive control), respectively. c The distribution of Ct and Tm values for the 20 % Percoll solution, 10 pg total RNA, non-target wells and target wells during the cDNA amplification process in microwells. The target wells (wells with cells) and the non-target wells (wells without cells) were validated with an Agilent 2100 Bioanalyzer. The line denotes the Ct median. Horizontal bars denote ± 0.5
Fig. 2 High sensitivity, accuracy and reproducibility of MIRALCS. a Comparison of gene number between a single cell (the smaller circle) and the 5 ng bulk sample (the larger circle). Left, a typical cell; right, 5 randomly selected cells (randomly sampling 0.4 million reads per cell) vs. the 5 ng bulk sample (2 million reads). b Gene detection in MIRALCS single cells, regular tube-based single cells and the 5 ng bulk RNA sample. c The distribution of gene number on gene expression along sequencing depths. d The correlation between the mean expression (FPKM) and the number of input molecules of spike-ins across all MIRALCS single-cell libraries. e The read coverage along the transcript position from the 5′ to the 3′ end. Error bars denote the standard deviation. f The correlation of spike-in expression (FPKM) between two randomly selected MIRALCS single cells. g Heat map of correlation coefficients of spike-in expression levels with input molecules >1 for each library (n = 19). h The correlation of gene expression (FPKM) between technical replicates. Left: two randomly selected MIRALCS 10 pg replicates. Right: two randomly selected tube-based 10 pg replicates. i The pair-wise correlation in MIRALCS 10 pg RNA replicates and tube-based 10 pg RNA replicates
Fig. 3 Heterogeneity of gene expression in HeLa S3 single cells. a The mRNA molecule number in single cells and 10 pg RNA replicates. b Heat map of the FPKM values of extremely highly expressed genes (FPKM > 500 in bulk RNA) in single cells and 10 pg replicates. c Identification of single-cell subpopulations based on cell cycle-related genes. Underlined cells are in G2/M phase. d Gene co-expression modules derived from 19 single cells based on RNA molecule number (modules are distinguished by colors). Details of what each module represents are shown in Additional file 4: Table S7. The weighted gene correlation network was constructed using the WGCNA R package [38]
Fig. 4 Heterogeneity of alternative splicing and distributions of splices in single cells. a The sequencing depth for the genes NPM1, YWHAB, YWHAQ and GAPDH in single cells. b The frequency distribution of detected annotated and novel splice junctions. c The distributions of the ψ scores of annotated and novel splice junctions in the bulk RNA (upper) and single cells (lower)
Fig. 5 The landscape of HPV-18/cellular fusions and the diversity of HPV-host splicing and expression in HeLa S3 cells. a Overview of HPV-18/cellular fusions based on the HeLa cell transcriptome. Blue lines denote fusion events. b The read coverage of the HPV-18 genome in single cells and the bulk RNA. Colored vertical lines denote nucleotides of SNPs detected in the transcriptome. Light green, A; red, T; orange, G; blue, C. c The read coverage of the host region on chromosome 8 in single cells and the bulk RNA. d Schematic diagram of the inferred HPV integration structure (upper) and splicing forms (lower). RPM stands for reads per million
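As a reading aid for the RPM unit in Fig. 5, the sketch below shows the usual reads-per-million normalization: a raw read count scaled by the library's total mapped reads. It is a minimal illustration, not the authors' pipeline. The function name, the parameter names and the example counts are hypothetical and do not come from the study.

```python
def reads_per_million(region_reads: int, total_mapped_reads: int) -> float:
    """Scale a raw read count to reads per million mapped reads (RPM)."""
    if total_mapped_reads <= 0:
        raise ValueError("total_mapped_reads must be positive")
    return region_reads * 1_000_000 / total_mapped_reads


# Hypothetical counts for illustration only (not values from the paper):
# 250 reads covering a region in a library of 2 million mapped reads.
example_rpm = reads_per_million(region_reads=250, total_mapped_reads=2_000_000)
print(example_rpm)
```

Normalizing by library size in this way lets coverage from single-cell libraries of different sequencing depths be compared with each other and with the bulk RNA sample.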