Jocelyn S Gandelman, Michael T Byrne, Akshitkumar M Mistry, Hannah G Polikowsky, Kirsten E Diggins, Heidi Chen, Stephanie J Lee, Mukta Arora, Corey Cutler, Mary Flowers, Joseph Pidala, Jonathan M Irish, Madan H Jagasia.
Abstract
The application of machine learning in medicine has been productive in multiple fields, but has not previously been applied to analyze the complexity of organ involvement by chronic graft-versus-host disease. Chronic graft-versus-host disease is classified by an overall composite score as mild, moderate or severe, which may overlook clinically relevant patterns in organ involvement. Here we applied a novel computational approach to chronic graft-versus-host disease with the goal of identifying phenotypic groups based on the subcomponents of the National Institutes of Health Consensus Criteria. Computational analysis revealed seven distinct groups of patients with contrasting clinical risks. The high-risk group had an inferior overall survival compared to the low-risk group (hazard ratio 2.24; 95% confidence interval: 1.36-3.68), an effect that was independent of graft-versus-host disease severity as measured by the National Institutes of Health criteria. To test clinical applicability, knowledge was translated into a simplified clinical prognostic decision tree. Groups identified by the decision tree also stratified outcomes and closely matched those from the original analysis. Patients in the high- and intermediate-risk decision-tree groups had significantly shorter overall survival than those in the low-risk group (hazard ratio 2.79; 95% confidence interval: 1.58-4.91 and hazard ratio 1.78; 95% confidence interval: 1.06-3.01, respectively). Machine learning and other computational analyses may better reveal biomarkers and stratify risk than the current approach based on cumulative severity. This approach could now be explored in other disease models with complex clinical phenotypes. External validation must be completed prior to clinical application. Ultimately, this approach has the potential to reveal distinct pathophysiological mechanisms that may underlie clusters. ClinicalTrials.gov identifier: NCT00637689.
Year: 2018 PMID: 30237265 PMCID: PMC6312024 DOI: 10.3324/haematol.2018.193441
Source DB: PubMed Journal: Haematologica ISSN: 0390-6078 Impact factor: 9.941
Figure 1. A machine-learning workflow reveals clusters of patients with chronic graft-versus-host disease with shared organ involvement phenotypes. t-SNE/viSNE plots show organ scores (heat) for each patient (represented by a dot) on a scale where heat indicates organ involvement. Patients who are closer together are more similar, while those who are farther apart are generally more different from each other. All organ domains shown were used to generate the viSNE plots, except National Institutes of Health-Severity, which was not used as a parameter to generate the viSNE maps. FlowSOM clustering is shown (right) for the seven clusters of patients, with each cluster color overlaid as a dimension on the viSNE plot. For example, Cluster 7 is pink.
Figure 2. Computational analysis of organ scores reveals phenotypic clusters of patients with chronic graft-versus-host disease who were stratified for overall survival. (A) Patients were grouped into seven clusters by the machine-learning workflow (Online Supplementary Figure S1) and described using marker enrichment modeling (MEM) labels (left), which captured features enriched (▲) or specifically lacking (▼) from each group relative to the others in the cohort. Risk coefficients (right) were then calculated for each group. Risk scores below −0.25 or above 0.25 were considered low and high risk, respectively, and 0 was the average risk for the cohort. Clusters 1-3 were lower risk, Cluster 4 was intermediate risk, and Clusters 5-7 were higher risk. (B) Overall survival probability was stratified for the patients with chronic graft-versus-host disease based on the low-, intermediate-, and high-risk clusters defined by the computational analysis.
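The risk thresholds stated in the Figure 2 caption can be written as a small classification rule. The following is a minimal sketch in Python, not the authors' code: the function name `risk_category` and its `threshold` parameter are hypothetical, and the handling of coefficients exactly equal to ±0.25 (treated here as intermediate) is an assumption, since the caption does not specify it.

```python
def risk_category(coefficient: float, threshold: float = 0.25) -> str:
    """Map a cluster risk coefficient to a risk group.

    Follows the caption: below -0.25 is low risk, above 0.25 is high risk.
    Values between the cut-offs (and, by assumption, exactly on them)
    are labeled intermediate. A coefficient of 0 is the cohort average.
    """
    if coefficient < -threshold:
        return "low"
    if coefficient > threshold:
        return "high"
    return "intermediate"


# Illustrative coefficients only; these are NOT values from the study.
for value in (-0.6, 0.0, 0.4):
    print(value, risk_category(value))
```

The coefficients in the loop are made-up inputs chosen to exercise each branch; the actual per-cluster coefficients are shown in the figure itself.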
Figure 3. A simple, physician-driven decision tree defines chronic graft-versus-host disease phenotypes. A decision tree designed to separate patients into groups with phenotypes and clinical risks similar to those revealed by the machine-learning approach in Figure 1 is shown. The decision tree is read from the top down and sequentially identifies and segregates patients in the most phenotypically distinct clusters (Y=Yes, N=No). Patients meeting the criteria at a decision point are assigned to that cluster, and patients who do not meet the criteria advance further through the tree logic. Each circled number represents a cluster of patients. For Cluster 2, two decision points were used to identify patients (arrows above and below the encircled 2). The length of the horizontal arrow is proportional to the risk coefficient and the width of the arrow is proportional to the percentage of patients in this cohort who were assigned to the cluster.
Figure 4. A simple, physician-driven decision tree created groups of patients with chronic graft-versus-host disease that were similar to computational patient clusters and stratified for overall survival. (A) Cluster numbers, newly calculated marker enrichment modeling (MEM) labels, phenotype interpretations (italics), risk coefficients, and group frequencies (n=339) are shown for the new groups of patients defined using the decision tree in Figure 3. MEM labels and risk were calculated as before (Figure 1 and Methods). Phenotype interpretations were assigned by expert physicians based on analysis of MEM labels and risk. Decision tree groups 1-3 were lower risk, groups 4-5 were intermediate risk, and groups 6-7 were higher risk. (B) Overall survival probability was stratified for patients with chronic graft-versus-host disease identified in the low-, intermediate-, and high-risk groups defined by the physician-driven decision tree.
Figure 5. Time from stem cell transplantation to chronic graft-versus-host disease in decision tree Cluster 2 versus other clusters. Patients in decision-tree-identified Cluster 2 (sclerotic phenotype) had a significantly longer time from stem cell transplantation to chronic graft-versus-host disease (cGvHD) than patients in all other clusters.
Figure 6. The physician-driven decision tree recapitulates the machine-learning workflow and finds clusters with stable risk. (A) A scatter plot shows the same patients in groups resulting from the decision tree (y-axis) or computational analysis (x-axis). Patients within or touching the black boxes were those with the same group classification in both workflows (86% of patients, n=339). (B) Bootstrapping analysis revealed stability of cluster risk across ten decision-tree analysis runs, each using 130 of the 339 patients, randomly sampled. The risk coefficient was calculated for each cluster in each run. The standard deviation of the ten risk coefficients was calculated and was <0.7 for all clusters except Cluster 3.
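The resampling check described in panel (B) can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: `subsample_sd` and its parameters are hypothetical names, the caption does not say whether sampling was with or without replacement (without replacement is assumed here), it does not say whether the sample or population standard deviation was used (the sample standard deviation is assumed), and the risk-coefficient calculation itself is defined in the paper's Methods, so the caller supplies it as the `statistic` function.

```python
import random
import statistics
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def subsample_sd(
    patients: Sequence[T],
    statistic: Callable[[Sequence[T]], float],
    n_runs: int = 10,        # ten runs, as in the caption
    sample_size: int = 130,  # 130 of 339 patients per run, as in the caption
    seed: int = 0,           # hypothetical; fixed only for reproducibility
) -> float:
    """Standard deviation of a statistic across repeated random subsamples."""
    rng = random.Random(seed)
    values = []
    for _ in range(n_runs):
        # Assumption: each run draws patients without replacement.
        subset = rng.sample(patients, sample_size)
        values.append(statistic(subset))
    # Assumption: sample (n - 1) standard deviation.
    return statistics.stdev(values)


# Synthetic stand-in data: 339 made-up numeric scores, NOT study data.
# The mean is a placeholder for the risk-coefficient calculation.
data_rng = random.Random(1)
toy_scores = [data_rng.gauss(0.0, 1.0) for _ in range(339)]
print(subsample_sd(toy_scores, statistics.fmean))
```

A smaller standard deviation across runs indicates that the statistic for a cluster is less sensitive to which patients happen to be sampled, which is the sense of "stable risk" in the caption.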