Lu Pan, Huy Q Dinh, Yudi Pawitan, Trung Nghia Vu.
Abstract
MOTIVATION: RNA expression at the isoform level is biologically more informative than at the gene level and can potentially reveal cellular subsets and corresponding biomarkers that are not visible at the gene level. However, because of the strong 3' bias of the sequencing protocol, mRNA quantification for high-throughput single-cell RNA sequencing such as Chromium Single Cell 3' 10x Genomics is currently performed at the gene level.
Year: 2021 PMID: 34864849 PMCID: PMC8826380 DOI: 10.1093/bioinformatics/btab807
Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937
Fig. 1. Scasa workflow and its performance against existing quantification tools. (a) The Scasa workflow consists of three main parts: (i) fitting the statistical model using an AEM algorithm, based on (ii) mapping of the scRNA-Seq data to produce count matrix Y and (iii) the in silico construction of the transcription clusters and isoform paralogs to obtain the initial X matrices. (b) A simulation study (n = 3955 cells) indicates that Scasa performs well against existing methods in terms of isoform quantification. Isoform-level estimates are plotted against the true values. (c) From left to right: boxplots of the APE ratios of Kallisto and Salmon against Scasa for all isoforms (n = 12 203 truly expressed isoforms) and for non-paralog isoforms (n = 1342); boxplots of APE ratios for gene-level quantification methods against Scasa (n = 8052 truly expressed genes); boxplots of APE ratios for singleton genes (n = 2318).
Fig. 2. Isoform-level quantification from Scasa unveils a new cell-type cluster and differentially expressed isoforms. (a) UMAP of isoform quantification from Scasa using a bone marrow dataset and cell annotations queried from the original study (12). The arrow points to a distinct subgroup of CD14 monocytes (called TY32.25 Mono here) discovered via Scasa isoform quantification. (b) Heatmap of median expression of isoforms in each cluster, including TY32.25 Mono. Each cluster (row) is annotated by the most dominant cell type among the single cells in that cluster. A mutually exclusive expression pattern of TYROBP isoforms (highlighted in purple and green in the x-axis labels) is observed in the TY32.25 Mono group. The size of each square is proportional to the percentage of cells expressing the isoform in the corresponding cell-type group. The color gradient represents the mean expression of the isoforms in each cell-type group. For convenience, clusters with low total expression of the isoforms are excluded from the plot. NM-0012425 (24–25) refers to two isoforms, NM-001242524 and NM-001242525. (c) Boxplot of the isoforms of the TYROBP gene in the TY32.25 Mono group and the other mono groups from the Smart-Seq2 dataset. Full-length-transcript Smart-Seq2 data from the bone marrow cells of 15 human fetuses (14) is used in this validation. (d) Patterns of the differentially expressed isoforms of the TYROBP gene observed in all cells, with a distinct pattern in the TY32.25 Mono group as indicated in (a).