
SinCHet: a MATLAB toolbox for single cell heterogeneity analysis in cancer.

Jiannong Li, Inna Smalley, Michael J Schell, Keiran S M Smalley, Y Ann Chen.

Abstract

SUMMARY: Single-cell technologies allow characterization of transcriptomes and epigenomes for individual cells under different conditions and provide unprecedented resolution for researchers to investigate cellular heterogeneity in cancer. The SinCHet (Single Cell Heterogeneity) toolbox is developed in MATLAB and has a graphical user interface (GUI) for visualization and user interaction. It analyzes both continuous (e.g. mRNA expression) and binary omics data (e.g. discretized methylation data). The toolbox not only quantifies cellular heterogeneity using the Shannon Profile (SP) at different clonal resolutions but also detects heterogeneity differences between two populations using a D statistic, defined as the area under the Profile of Shannon Difference (PSD). This flexible tool provides a default clonal resolution using the change point of the PSD detected by a multivariate adaptive regression splines model; it also allows user-defined clonal resolutions for further investigation. The tool provides insights into emerging or disappearing clones between conditions, and enables the prioritization of biomarkers for follow-up experiments based on heterogeneity or marker differences between and/or within cell populations.
AVAILABILITY AND IMPLEMENTATION: The SinCHet software is freely available for non-profit academic use. The source code, example datasets, and the compiled package are available at http://labpages2.moffitt.org/chen/software/ . CONTACT: ann.chen@moffitt.org. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
© The Author(s) 2017. Published by Oxford University Press.

Year:  2017        PMID: 28472395      PMCID: PMC5870537          DOI: 10.1093/bioinformatics/btx297

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


1 Introduction

Tumor heterogeneity between and within tumors plays a critical role in tumor aggression and the development of drug resistance. Understanding and characterizing clonal heterogeneity enables us to gain insights into the progression of cancer and to guide effective therapeutic strategies (Marusyk and Polyak, 2010). New high-throughput single-cell technologies provide unprecedented resolution for researchers to explore cellular heterogeneity in cancer (Tirosh et al.). However, these technologies pose new challenges in data analysis and interpretation. Several single-cell analysis tools are currently available, such as SCATT, TSCAN, SPADE and viSNE (Amir et al.; Anchang et al.; Ji and Ji, 2016; Mitra et al.). Each tool has its own strengths and limitations (Supplementary Table S1). Only a limited number of tools are available for quantifying cellular heterogeneity, comparing heterogeneities quantitatively between populations, and identifying markers based on heterogeneity. Therefore, we have developed the SinCHet toolbox in MATLAB, with a GUI for visualization and user interaction, originally for cancer research but with the potential to be used for any single-cell research. The toolbox has four parts (Supplementary Fig. S1): (1) imports continuous or categorical omics data and outputs the figures and results for review; (2) performs exploratory analyses including hierarchical cluster analyses and principal components analyses (PCA); (3) estimates the clonal heterogeneity using the Shannon Profile (SP), and provides a Profile of Shannon Differences (PSD) to characterize the heterogeneity differences between populations and a novel D statistic to quantitatively compare heterogeneities between populations; (4) prioritizes biomarkers based on between- and/or within-group cellular heterogeneity.

2 Materials and methods

2.1 Clonal richness and heterogeneity estimation

We assume proper normalization has been performed prior to using the tool. Although data normalization is beyond the scope of this application, some considerations for data pre-processing and normalization are discussed in the supplemental materials. Currently, the tool is developed for a two-group comparison setting. Hierarchical cluster analyses with different linkage methods are performed to cluster cells into phenotypic clonal groups, referred to as clones in this application note, based on the similarities in the input dataset (Supplementary Figs. S2A and S3A). The cophenetic correlation coefficient (Sokal, 1962) is used to choose the default linkage method. PCA is available to visualize the relationships and patterns of the samples (Supplementary Figs. S2B and S3B). Clonal richness (i.e. the count of clones) and the Shannon index (Hernandez-Walls and Trujillo-Ortiz, 2010; Southwood and Henderson, 2000) are used to quantify clonal diversity and heterogeneity (Supplementary equation S1). SinCHet provides a Shannon Profile (SP) under each condition by evaluating the heterogeneity using the Shannon index at different heights along the dendrogram (Supplementary Figs. S2C and S3C). The PSD, the profile of the differences in Shannon index calculated along the same X-axis as the SP, is used to characterize the heterogeneity differences between two conditions (or populations; Supplementary Figs. S2D and S3D). A novel D statistic is then defined as the area under the PSD or, equivalently, the difference of the areas under the SPs of the two groups (Supplementary equation S2). We have shown that this D statistic is empirically robust to the choice of linkage method for the hierarchical cluster analyses (Supplementary Results). Permutation is used to evaluate its statistical significance (Supplementary equation S3).
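The relationship between the Shannon index, the SP, the PSD and the D statistic can be illustrated with a minimal Python sketch. It is not the SinCHet implementation (which is in MATLAB) and it rests on several assumptions: clone labels at each clonal resolution are supplied directly rather than obtained by cutting a dendrogram, the X-axis is the number of clones rather than dendrogram height, the PSD is taken as group "A" minus group "B", the area is computed with the trapezoid rule, and significance is assessed by shuffling the group labels. All function names and the toy data are hypothetical.

```python
import math
import random
from collections import Counter


def shannon_index(clone_labels):
    """Shannon index H = -sum(p_k * ln p_k) over clone proportions p_k."""
    n = len(clone_labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(clone_labels).values())


def shannon_profile(labels_by_resolution, group, which):
    """Shannon index of the cells of one group at each clonal resolution."""
    return [
        shannon_index([lab for lab, g in zip(labels, group) if g == which])
        for labels in labels_by_resolution
    ]


def d_statistic(labels_by_resolution, group, x):
    """Area under the PSD (SP of group 'A' minus SP of group 'B'), trapezoid rule."""
    sp_a = shannon_profile(labels_by_resolution, group, "A")
    sp_b = shannon_profile(labels_by_resolution, group, "B")
    psd = [a - b for a, b in zip(sp_a, sp_b)]
    return sum((x[i + 1] - x[i]) * (psd[i] + psd[i + 1]) / 2 for i in range(len(x) - 1))


def permutation_p_value(labels_by_resolution, group, x, n_perm=999, seed=0):
    """Two-sided permutation P value for D, obtained by shuffling group labels."""
    rng = random.Random(seed)
    observed = abs(d_statistic(labels_by_resolution, group, x))
    shuffled = list(group)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(d_statistic(labels_by_resolution, shuffled, x)) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Toy data (hypothetical): 8 cells, clone labels at 1-, 2- and 3-clone resolution.
group = ["A"] * 4 + ["B"] * 4
labels_by_resolution = [
    [1, 1, 1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1, 1, 2, 2],
    [1, 1, 1, 1, 1, 1, 2, 3],
]
n_clones = [1, 2, 3]

D = d_statistic(labels_by_resolution, group, n_clones)
p = permutation_p_value(labels_by_resolution, group, n_clones)
print(f"D = {D:.4f}, permutation P = {p:.3f}")
```

In this toy example all cells of group "A" stay in one clone while group "B" splits, so D is negative under the assumed sign convention, indicating higher heterogeneity in group "B".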
To identify the number of existing clones under each condition, a multivariate adaptive regression splines (MARS) model (Friedman, 1991; Jekabsons, 2016) is used to detect the change points in the PSD (Supplementary Figs. S2D and S3D). The higher the number of clones, the fewer cells there are in each clone, which reduces the statistical power for comparison. Therefore, the minimum of the change points determined by MARS is chosen as the default to provide the clonal snapshot (Supplementary Figs. S2E–F and S3E–F), which supplies the information on clonal compositions used for the biomarker analyses (Supplementary Figs. S2G–H and S3G–H). The SinCHet toolbox also allows a user-defined number of clones along the profile for further exploration.
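The idea of locating a change point in a profile can be sketched with a deliberately simplified stand-in for MARS: a single hinge function max(0, x - t) fitted by ordinary least squares, with the knot t chosen by grid search over the interior X values. A full MARS model builds many hinge terms with forward selection and backward pruning, so this Python sketch only illustrates the principle. The function name and the toy profile are hypothetical, and x is assumed to be sorted in ascending order.

```python
def hinge_change_point(x, y):
    """Return the interior x value t that minimizes the squared error of
    y ~ a + b * max(0, x - t), fitted by ordinary least squares."""
    best_t, best_sse = None, float("inf")
    n = len(x)
    for t in x[1:-1]:  # candidate knots: interior points only
        h = [max(0.0, xi - t) for xi in x]
        h_mean, y_mean = sum(h) / n, sum(y) / n
        var_h = sum((hi - h_mean) ** 2 for hi in h)
        if var_h == 0:
            continue  # hinge is constant, slope cannot be estimated
        b = sum((hi - h_mean) * (yi - y_mean) for hi, yi in zip(h, y)) / var_h
        a = y_mean - b * h_mean
        sse = sum((yi - (a + b * hi)) ** 2 for hi, yi in zip(h, y))
        if sse < best_sse:
            best_t, best_sse = t, sse
    return best_t


# Toy profile (hypothetical): flat up to x = 4, then rising linearly.
x = list(range(10))
psd = [0.0, 0.0, 0.0, 0.0, 0.0, 2.0, 4.0, 6.0, 8.0, 10.0]
knot = hinge_change_point(x, psd)
print("estimated change point:", knot)
```

When several change points are detected by a full model, the text above describes taking the smallest one as the default clonal resolution.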

2.2 Biomarkers prioritization

The within- and between-population comparisons are performed and all results are saved for further investigation (Supplementary Tables S2 and S3). Each comparison could have its own biological significance, and results for the top-ranked markers can be visualized individually (Supplementary Fig. S1 and Supplementary Figs. S2H and S3H). Given the large amount of information generated by each single-cell experiment, a composite score, the Generalized Fisher Product Score (GF), is devised to summarize the overall difference across the between- and within-population comparisons and to prioritize biomarkers for further investigation. For categorical data, GF is an aggregation of evidence from three separate Fisher’s exact tests for each biomarker (e.g. methylation site), where $p_{ij}$ is the P value from Fisher’s exact test for the $i$th biomarker and $j$th comparison (when $j = 1$, it is the dominant clone comparison between groups; when $j = 2$ or 3, the tests are comparisons between the dominant clone and the remaining minor clones within each population). When the biomarker is a continuous variable, three rank sum tests are performed to compare the expression levels. Markers with large differences are often desired by researchers for validation experiments; therefore, the fold change (FC) for each of the three comparisons is also incorporated in the GF score, where $FC_{ij}$ is the FC for biomarker $i$ at comparison $j$ as described above.
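How evidence from three Fisher's exact tests might be aggregated for one categorical biomarker can be sketched as follows. The exact GF formula is not reproduced in this text, so the sketch substitutes the classical Fisher combination, -2 times the sum of the natural logs of the P values, as an assumption; it also omits the fold-change term used for continuous data. The function names and the 2x2 tables are hypothetical.

```python
from math import comb, log


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test P value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(k):
        # Hypergeometric probability of k in the top-left cell, margins fixed.
        return comb(row1, k) * comb(row2, col1 - k) / comb(n, col1)

    p_obs = prob(a)
    support = range(max(0, col1 - row2), min(row1, col1) + 1)
    # Sum the probabilities of all tables at least as extreme as the observed one.
    total = sum(prob(k) for k in support if prob(k) <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def fisher_combined_score(p_values):
    """Classical Fisher combination: -2 * sum(ln p). Assumed stand-in for GF."""
    return -2.0 * sum(log(p) for p in p_values)


# Hypothetical counts (methylated, unmethylated) for one locus in three comparisons:
# 1) dominant clone of group 1 vs dominant clone of group 2,
# 2) dominant vs minor clones within group 1,
# 3) dominant vs minor clones within group 2.
tables = [(3, 0, 0, 3), (3, 0, 0, 3), (2, 1, 1, 2)]
p_values = [fisher_exact_two_sided(*t) for t in tables]
score = fisher_combined_score(p_values)
print("P values:", p_values, "combined score:", round(score, 3))
```

Ranking biomarkers by such a combined score in decreasing order would place loci with consistently small P values across the three comparisons at the top.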

3 Results

We applied the SinCHet toolbox to published single-cell expression and methylation datasets (Cheow et al.). Data processing procedures are summarized in the supplement. For the gene expression dataset, the toolbox identified higher heterogeneity in the EGFR-mutant lung cancer tumors than in the wild-type group (D = -63.8, P < 0.001, Supplementary Fig. S2C). This is supported by a previous report (Bai et al.). Nine clones were identified by SinCHet using the default setting (Supplementary Fig. S2E and F). The dominant clones from each group identified by SinCHet, i.e. Clone 1 from the wild-type tumors and Clone 2 from the mutant tumors, were in general agreement with the two clusters identified in the original paper. Additional clonal heterogeneity was characterized by SinCHet, with seven additional clones identified (Supplementary Results). Furthermore, using the GF score, SinCHet not only identified the same reported top genes (e.g. MUC1, SFTPC and KRT7, which differed significantly between EGFR-mutant and wild-type tumors) but also identified novel markers such as CD44 and MT2A within each subpopulation (Supplementary Table S2). For the methylation dataset, SinCHet identified the significantly hypermethylated loci HOXA9, PROM1 and PAX3 that the original paper reported in EGFR-mutant cells. In addition, the SinCHet top-ranked hypermethylated loci PAX5, SOX9 and SPINT1, found in subpopulations within EGFR-mutant cells, might suggest that some of the subpopulations acquire stochastic epigenetic aberrations during tumor evolution, as discussed in the original paper (Supplementary Table S3). SinCHet can quantify cellular heterogeneity and identify novel candidate biomarkers, considering heterogeneity both between and/or within groups. It provides unique insights into emerging or disappearing clones at different clonal resolutions between cell populations in different contexts.
It could easily be applied to compare heterogeneity between groups with versus without mutations, or before versus after acquired drug resistance. It could also be applied to quantify heterogeneity during the course of cancer treatment, potentially changing the face of cancer therapeutic strategies in the future.
  8 in total

1.  Single-cell multimodal profiling reveals cellular epigenetic heterogeneity.

Authors:  Lih Feng Cheow; Elise T Courtois; Yuliana Tan; Ramya Viswanathan; Qiaorui Xing; Rui Zhen Tan; Daniel S W Tan; Paul Robson; Yuin-Han Loh; Stephen R Quake; William F Burkholder
Journal:  Nat Methods       Date:  2016-08-15       Impact factor: 28.547

2.  Visualization and cellular hierarchy inference of single-cell data using SPADE.

Authors:  Benedict Anchang; Tom D P Hart; Sean C Bendall; Peng Qiu; Zach Bjornson; Michael Linderman; Garry P Nolan; Sylvia K Plevritis
Journal:  Nat Protoc       Date:  2016-06-16       Impact factor: 13.491

3.  Single-cell analysis of targeted transcriptome predicts drug sensitivity of single cells within human myeloma tumors.

Authors:  A K Mitra; U K Mukherjee; T Harding; J S Jang; H Stessman; Y Li; A Abyzov; J Jen; S Kumar; V Rajkumar; B Van Ness
Journal:  Leukemia       Date:  2015-12-29       Impact factor: 11.528

4.  TSCAN: Pseudo-time reconstruction and evaluation in single-cell RNA-seq analysis.

Authors:  Zhicheng Ji; Hongkai Ji
Journal:  Nucleic Acids Res       Date:  2016-05-13       Impact factor: 16.971

5.  Tumor heterogeneity: causes and consequences.

Authors:  Andriy Marusyk; Kornelia Polyak
Journal:  Biochim Biophys Acta       Date:  2009-11-18

6.  viSNE enables visualization of high dimensional single-cell data and reveals phenotypic heterogeneity of leukemia.

Authors:  El-ad David Amir; Kara L Davis; Michelle D Tadmor; Erin F Simonds; Jacob H Levine; Sean C Bendall; Daniel K Shenfeld; Smita Krishnaswamy; Garry P Nolan; Dana Pe'er
Journal:  Nat Biotechnol       Date:  2013-05-19       Impact factor: 54.908

7.  Dissecting the multicellular ecosystem of metastatic melanoma by single-cell RNA-seq.

Authors:  Itay Tirosh; Benjamin Izar; Sanjay M Prakadan; Marc H Wadsworth; Daniel Treacy; John J Trombetta; Asaf Rotem; Christopher Rodman; Christine Lian; George Murphy; Mohammad Fallahi-Sichani; Ken Dutton-Regester; Jia-Ren Lin; Ofir Cohen; Parin Shah; Diana Lu; Alex S Genshaft; Travis K Hughes; Carly G K Ziegler; Samuel W Kazer; Aleth Gaillard; Kellie E Kolb; Alexandra-Chloé Villani; Cory M Johannessen; Aleksandr Y Andreev; Eliezer M Van Allen; Monica Bertagnolli; Peter K Sorger; Ryan J Sullivan; Keith T Flaherty; Dennie T Frederick; Judit Jané-Valbuena; Charles H Yoon; Orit Rozenblatt-Rosen; Alex K Shalek; Aviv Regev; Levi A Garraway
Journal:  Science       Date:  2016-04-08       Impact factor: 47.728

8.  Detection and clinical significance of intratumoral EGFR mutational heterogeneity in Chinese patients with advanced non-small cell lung cancer.

Authors:  Hua Bai; Zhijie Wang; Yuyan Wang; Minglei Zhuo; Qinghua Zhou; Jianchun Duan; Lu Yang; Meina Wu; Tongtong An; Jun Zhao; Jie Wang
Journal:  PLoS One       Date:  2013-02-13       Impact factor: 3.240
