Hunter A Miller (1), Victor H van Berkel (2,3), Hermann B Frieboes (4,5,6,7)
1. Department of Pharmacology and Toxicology, University of Louisville, Louisville, USA. 2. UofL Health-Brown Cancer Center, University of Louisville, Louisville, USA. 3. Department of Cardiovascular and Thoracic Surgery, University of Louisville, Louisville, USA. 4. Department of Pharmacology and Toxicology, University of Louisville, Louisville, USA. 5. UofL Health-Brown Cancer Center, University of Louisville, Louisville, USA. 6. Department of Bioengineering, University of Louisville, Lutz Hall 419, Louisville, KY, 40292, USA. 7. Center for Predictive Medicine, University of Louisville, Louisville, USA. Correspondence (Hermann B Frieboes): hbfrie01@louisville.edu.
Abstract
INTRODUCTION: Predicting short- versus long-term survival in lung cancer is clinically relevant for patient management and therapy selection, yet reliable biomarkers of survival have proven difficult to identify. Metabolomic markers from tumor core biopsies have been shown to reflect cancer metabolic dysregulation and to hold prognostic value.
OBJECTIVES: Implement and validate a novel ensemble machine learning approach to evaluate survival based on metabolomic biomarkers from tumor core biopsies.
METHODS: Data were obtained from tumor core biopsies evaluated with high-resolution 2DLC-MS/MS. Unlike analysis of biofluid samples, analysis of tumor tissue is expected to accurately reflect the cancer metabolism and its impact on patient survival. A comprehensive suite of machine learning algorithms was trained as base learners and then combined into a stacked-ensemble meta-learner for predicting "short" versus "long" survival on an external validation cohort. An ensemble method of feature selection was employed to find a reliable set of biomarkers with potential clinical utility.
RESULTS: In the external validation cohort, overall survival (OS) was predicted with an AUROC_TEST of 0.881 by a support vector machine meta-learner, and progression-free survival (PFS) was predicted with an AUROC_TEST of 0.833 by a boosted logistic regression meta-learner. Both models outperformed a nomogram that used covariate data (staging, age, sex, treatment vs. non-treatment) as predictors. Increased relative abundance of guanine, choline, and creatine corresponded with shorter OS, while increased leucine and tryptophan corresponded with shorter PFS. In patients who died, N6,N6,N6-trimethyl-L-lysine, L-pyroglutamic acid, and benzoic acid were increased, while cystine, methionine sulfoxide, and histamine were decreased. In patients with progression, itaconic acid, pyruvate, and malonic acid were increased.
CONCLUSION: This study demonstrates the feasibility of an ensemble machine learning approach to accurately predict patient survival from tumor core biopsy metabolomic data.
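The AUROC values reported above summarize how well a model's scores rank "short" survivors above "long" survivors in the validation cohort. As a minimal sketch of how such a value can be computed, the function below uses the rank-based (Mann-Whitney) definition: the fraction of positive/negative pairs in which the positive case receives the higher score, with ties counted as half. The function name `auroc`, the choice of "short" survival as the positive class (label 1), and the toy labels and scores are all assumptions for illustration; they are not the study's code or data.

```python
def auroc(labels, scores):
    """Rank-based AUROC: probability that a randomly chosen positive case
    scores higher than a randomly chosen negative case (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie: half credit
    return wins / (len(pos) * len(neg))


# Hypothetical toy cohort: 1 = "short" survival, 0 = "long" survival.
toy_labels = [1, 1, 1, 0, 0, 0]
# Hypothetical meta-learner scores (higher = more likely "short").
toy_scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]

toy_auroc = auroc(toy_labels, toy_scores)
print(f"toy AUROC = {toy_auroc:.3f}")
```

In this toy cohort, 8 of the 9 positive/negative pairs are ordered correctly, so the function returns 8/9. An AUROC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation.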