Kiley Graim, Tiffany Ting Liu, Achal S Achrol, Evan O Paull, Yulia Newton, Steven D Chang, Griffith R Harsh, Sergio P Cordero, Daniel L Rubin, Joshua M Stuart.
Abstract
BACKGROUND: Patient stratification to identify subtypes with different disease manifestations, severity, and expected survival time is a critical task in cancer diagnosis and treatment. While stratification approaches that use various biomarkers (including high-throughput gene expression measurements) for patient-to-patient comparisons have succeeded in elucidating previously unseen subtypes, there remains untapped potential in incorporating additional genotypic and phenotypic data to discover novel or improved groupings.
Keywords: Cancer; Clustering; Community detection; MRI; Magnetic resonance imaging; Molecular subtyping; Mutation
Year: 2017 PMID: 28359308 PMCID: PMC5374737 DOI: 10.1186/s12920-017-0256-3
Source DB: PubMed Journal: BMC Med Genomics ISSN: 1755-8794 Impact factor: 3.063
Fig. 1 a Social network approach to clustering patient samples. First we transform/encode the mutation/voxel data, then compute all patient-patient similarities. At each order of similarity, clustering is based on the similarities in that order, resulting in different clustering solutions. Shown here from left to right: features, 1st-order, 2nd-order, 3rd-order, 'true' communities. Note that links between the same sample at different orders are not shown (e.g. A always has a strong self link), but they are used in the similarity calculations. b Flow diagram of HOCUS analysis: feature data (such as imaging voxels or mutations) are supplied to HOCUS to generate higher-order features, from which sample-sample similarities are calculated. The HOCUS order is selected by comparing sample-sample similarity kernels with an external criterion. Clustering is then performed, followed by survival and downstream analyses
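The idea of recomputing similarities at successive orders can be illustrated with a minimal sketch. It assumes that the first-order similarity is the Pearson correlation between encoded feature vectors and that each higher order is the Pearson correlation between rows of the previous order's similarity matrix. That recursion, the function names, and the toy mutation matrix are illustrative assumptions, not the published HOCUS implementation.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)

def similarity_matrix(rows):
    """All-pairs Pearson similarity between the given row vectors."""
    return [[pearson(r, s) for s in rows] for r in rows]

def higher_order_similarity(features, order):
    """Assumed recursion: order 1 compares feature vectors,
    order k+1 compares rows of the order-k similarity matrix."""
    sim = similarity_matrix(features)
    for _ in range(order - 1):
        sim = similarity_matrix(sim)
    return sim

# Hypothetical binary mutation matrix: 4 patients x 6 genes (made-up data).
features = [
    [1, 1, 0, 0, 1, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 1],
    [0, 0, 1, 1, 1, 1],
]
first = higher_order_similarity(features, 1)
second = higher_order_similarity(features, 2)
```

In this toy example the first two patients are only moderately correlated at first order, but they relate to the other patients in nearly the same way, so their second-order similarity is higher.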
Fig. 2 HOCUS (first through fourth order) and Pearson clustering. a GBM, c OV, and e BLCA survival p-values (log-rank test) vs number of clusters. b GBM, d OV, and f BLCA Kaplan-Meier plots for selected HOCUS clustering solutions (starred in yellow on the survival p-value plots a, c, e). Clusters with fewer than five samples are excluded from the KM analyses
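As a rough illustration of the log-rank test behind these survival p-values, the sketch below compares two groups only. The figure compares several clusters at once, which requires the multi-group form of the test. The function name and the survival data are made up for illustration.

```python
from math import erfc, sqrt

def logrank_two_groups(times_a, events_a, times_b, events_b):
    """Two-group log-rank test. events are 1 (death observed) or 0 (censored).
    Returns (chi-square statistic, p-value with 1 degree of freedom)."""
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in data if e})
    obs_minus_exp = 0.0
    variance = 0.0
    for t in event_times:
        # Patients still at risk just before time t, per group.
        n_a = sum(1 for ti, _, g in data if ti >= t and g == 0)
        n_b = sum(1 for ti, _, g in data if ti >= t and g == 1)
        # Deaths observed exactly at time t, per group.
        d_a = sum(1 for ti, e, g in data if ti == t and e and g == 0)
        d_b = sum(1 for ti, e, g in data if ti == t and e and g == 1)
        n, d = n_a + n_b, d_a + d_b
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0, 1.0
    chi2 = obs_minus_exp ** 2 / variance
    # Survival function of a chi-square variable with 1 degree of freedom.
    return chi2, erfc(sqrt(chi2 / 2))

# Made-up survival times in days for two hypothetical clusters.
short_t, short_e = [5, 8, 12, 15, 20], [1, 1, 1, 1, 1]
long_t, long_e = [200, 250, 300, 350, 400], [1, 1, 0, 1, 0]
chi2, p_value = logrank_two_groups(short_t, short_e, long_t, long_e)
```

Repeating such a test for each candidate number of clusters would give one p-value per clustering solution, which is the kind of curve plotted in panels a, c, and e.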
Fig. 3 Oncoprint showing a subset of mutations in BLCA. Line plots above the oncoprint show the total number of mutations per sample. The grey dotted lines indicate median mutational frequency across the cohort. This BLCA oncoprint includes genes with the smallest p-values in a χ² test of independence when compared to mutation rates outside the cluster. We compared each cluster to all others combined
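The per-gene comparison of one cluster against all others combined can be sketched as a χ² test of independence on a 2x2 table. The version below omits any continuity correction, which is an assumption since the caption does not say whether one was applied. The function name and the counts are hypothetical.

```python
from math import erfc, sqrt

def chi2_independence_2x2(mut_in, wt_in, mut_out, wt_out):
    """Chi-square test of independence for a 2x2 table
    (mutated vs wild-type, inside vs outside the cluster), no continuity correction.
    Returns (statistic, p-value with 1 degree of freedom)."""
    a, b, c, d = mut_in, wt_in, mut_out, wt_out
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        # A row or column is empty, so the test is undefined; report no evidence.
        return 0.0, 1.0
    chi2 = n * (a * d - b * c) ** 2 / denom
    return chi2, erfc(sqrt(chi2 / 2))

# Hypothetical gene: mutated in 12 of 20 samples in the cluster, 5 of 80 outside.
stat, p = chi2_independence_2x2(12, 8, 5, 75)
```

Ranking genes by this p-value within each cluster would select the genes with the smallest p-values, as described for the oncoprint.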
Fig. 4 Visualization of sample pair frequencies of image-based metrics compared to the survival outcome metric; results for a first-order, b second-order, c third-order, and d fourth-order HOCUS. For this visualization only, data were restricted to patients with a death event, then sample-based pairwise correlations were calculated using 1st-, 2nd-, 3rd-, and 4th-order HOCUS metrics, as well as the difference in length of survival, in days, between each pair of patients. Each plot shows the conditional density, in which the distribution of all sample pairs is depicted as a density map. On the left-hand side of each plot, a series of plots is shown in which the feature-based measure is divided into five bands of equal size, and differences in survival time (the outcome metric) are plotted as histograms for the sample pairs restricted to each band. The kernel similarity between the HOCUS metric and the co-survival metric is shown in the top right corner of each plot
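The pair-level data behind such a plot can be sketched as follows. The sketch assumes that "bands of equal size" means equal-width intervals of the similarity measure, which is one possible reading. The function names, similarity matrix, and survival times are made up for illustration.

```python
from itertools import combinations

def pair_table(similarity, survival_days, died):
    """For every pair of patients with a death event, return
    (similarity, absolute difference in survival days)."""
    idx = [i for i, d in enumerate(died) if d]
    return [
        (similarity[i][j], abs(survival_days[i] - survival_days[j]))
        for i, j in combinations(idx, 2)
    ]

def band_pairs(pairs, n_bands=5):
    """Split pairs into n_bands equal-width similarity bands (assumed reading)
    and collect the survival differences that fall in each band."""
    sims = [s for s, _ in pairs]
    lo, hi = min(sims), max(sims)
    width = (hi - lo) / n_bands or 1.0  # avoid zero width when all values are equal
    bands = [[] for _ in range(n_bands)]
    for s, diff in pairs:
        k = min(int((s - lo) / width), n_bands - 1)
        bands[k].append(diff)
    return bands

# Made-up similarity matrix and outcomes for four hypothetical patients.
similarity = [
    [1.0, 0.9, 0.2, 0.1],
    [0.9, 1.0, 0.3, 0.0],
    [0.2, 0.3, 1.0, 0.8],
    [0.1, 0.0, 0.8, 1.0],
]
survival_days = [100, 120, 400, 900]
died = [True, True, True, False]

pairs = pair_table(similarity, survival_days, died)
bands = band_pairs(pairs)
```

Each entry of `bands` holds the survival differences for one band and could be drawn as one of the histograms on the left-hand side of a panel.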
Fig. 5 HOCUS of GBM MR images. a P-values of survival separation (log-rank test) for each order of clustering across a range of k clusters. b Kaplan-Meier plot of the third-order HOCUS clusters. c Images of tumors within each cluster projected onto the MNI brain atlas, shown in sagittal, coronal, and axial views. Brightness of color indicates the number of patients with tumor at a given location. Generated using Slicer [10]. d Violin plot showing tumor volumes within each third-order cluster. e Molecular (gene expression based) subtypes within the clusters
Fig. 6 PathMark analysis of the poor-surviving third-order cluster vs others. Node size and color indicate differential expression levels