Faraz Faghri¹, Fabian Brunn², Anant Dadu³, Elisabetta Zucchi⁴, Ilaria Martinelli⁵, Letizia Mazzini⁶, Rosario Vasta⁷, Antonio Canosa⁷, Cristina Moglia⁷, Andrea Calvo⁷, Michael A Nalls⁸, Roy H Campbell², Jessica Mandrioli⁹, Bryan J Traynor¹⁰, Adriano Chiò¹¹.
1. Neuromuscular Diseases Research Section, Laboratory of Neurogenetics, US National Institute on Aging, Bethesda, MD, USA; Center for Alzheimer's and Related Dementias, US National Institute on Aging, Bethesda, MD, USA; Data Tecnica International, Glen Echo, MD, USA; Department of Computer Science, University of Illinois at Urbana-Champaign, Champaign, IL, USA.
2. Department of Computer Science, University of Illinois at Urbana-Champaign, Champaign, IL, USA.
3. Center for Alzheimer's and Related Dementias, US National Institute on Aging, Bethesda, MD, USA; Data Tecnica International, Glen Echo, MD, USA; Department of Computer Science, University of Illinois at Urbana-Champaign, Champaign, IL, USA.
4. Department of Biomedical, Metabolic and Neural Sciences, University of Modena and Reggio Emilia, Modena, Italy.
5. Neurology Unit, Department of Neurosciences, Azienda Ospedaliero Universitaria di Modena, Modena, Italy.
6. ALS Centre, Department of Neurology, Maggiore della Carità University Hospital, Novara, Italy.
7. Rita Levi Montalcini, Department of Neuroscience, University of Turin, Turin, Italy.
8. Center for Alzheimer's and Related Dementias, US National Institute on Aging, Bethesda, MD, USA; Data Tecnica International, Glen Echo, MD, USA.
9. Department of Biomedical, Metabolic and Neural Sciences, University of Modena and Reggio Emilia, Modena, Italy; Neurology Unit, Department of Neurosciences, Azienda Ospedaliero Universitaria di Modena, Modena, Italy.
10. Neuromuscular Diseases Research Section, Laboratory of Neurogenetics, US National Institute on Aging, Bethesda, MD, USA; Department of Neurology, Johns Hopkins University Medical Center, Baltimore, MD, USA; Reta Lila Weston Institute, UCL Queen Square Institute of Neurology, University College London, London, UK. Electronic address: traynorb@mail.nih.gov.
11. Rita Levi Montalcini, Department of Neuroscience, University of Turin, Turin, Italy; Institute of Cognitive Sciences and Technologies, CNR, Rome, Italy; Neurology 1 and ALS Centre, Azienda Ospedaliero Universitaria Città della Salute e della Scienza, Turin, Italy.
Abstract
BACKGROUND: Amyotrophic lateral sclerosis (ALS) is known to represent a collection of overlapping syndromes. Various classification systems based on empirical observations have been proposed, but it is unclear to what extent they reflect ALS population substructures. We aimed to use machine-learning techniques to identify the number and nature of ALS subtypes to obtain a better understanding of this heterogeneity, enhance our understanding of the disease, and improve clinical care.
METHODS: In this retrospective study, we applied unsupervised (Uniform Manifold Approximation and Projection [UMAP]) modelling, semi-supervised (neural network UMAP) modelling, and supervised (ensemble learning based on LightGBM) modelling to a population-based discovery cohort of patients who were diagnosed with ALS while living in the Piedmont and Valle d'Aosta regions of Italy, for whom detailed clinical data, such as age at symptom onset, were available. We excluded patients with missing Revised ALS Functional Rating Scale (ALSFRS-R) feature values from the unsupervised and semi-supervised steps. We replicated our findings in an independent population-based cohort of patients who were diagnosed with ALS while living in the Emilia Romagna region of Italy.
FINDINGS: Between Jan 1, 1995, and Dec 31, 2015, 2858 patients were entered in the discovery cohort. After excluding 497 (17%) patients with missing ALSFRS-R feature values, data for 42 clinical features across 2361 (83%) patients were available for the unsupervised and semi-supervised analysis. We found that semi-supervised machine learning produced the optimum clustering of the patients with ALS. These clusters roughly corresponded to the six clinical subtypes defined by the Chiò classification system (ie, bulbar, respiratory, flail arm, classical, pyramidal, and flail leg ALS). Between Jan 1, 2009, and March 1, 2018, 1097 patients were entered in the replication cohort. After excluding 108 (10%) patients with missing ALSFRS-R feature values, data for 42 clinical features across 989 patients were available for the unsupervised and semi-supervised analysis. All 1097 patients were included in the supervised analysis. The same clusters were identified in the replication cohort. By contrast, other ALS classification schemes, such as the El Escorial categories, Milano-Torino clinical staging, and King's clinical stages, did not adequately label the clusters. Supervised learning identified 11 clinical parameters that predicted ALS clinical subtypes with high accuracy (area under the curve 0·982 [95% CI 0·980-0·983]).
INTERPRETATION: Our data-driven study provides insight into the ALS population substructure and confirms that the Chiò classification system successfully identifies ALS subtypes. Additional validation is required to determine the accuracy and clinical use of these algorithms in assigning clinical subtypes. Nevertheless, our algorithms offer a broad insight into the clinical heterogeneity of ALS and help to determine the actual subtypes of disease that exist within this fatal neurodegenerative syndrome. The systematic identification of ALS subtypes will improve clinical care and clinical trial design.
FUNDING: US National Institute on Aging, US National Institutes of Health, Italian Ministry of Health, European Commission, University of Torino Rita Levi Montalcini Department of Neurosciences, Emilia Romagna Regional Health Authority, and Italian Ministry of Education, University, and Research.
TRANSLATIONS: For the Italian and German translations of the abstract see the Supplementary Materials section.
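The cohort counts reported in the findings can be checked with a few lines of arithmetic. The sketch below is illustrative only and is not part of the study's analysis code: the function name `cohort_flow` is hypothetical, and the 90% retained figure for the replication cohort is derived here rather than stated in the abstract.

```python
def cohort_flow(enrolled, excluded_missing):
    """Return (retained, pct_excluded, pct_retained) for one cohort.

    Percentages are rounded to whole numbers, as in the abstract.
    """
    retained = enrolled - excluded_missing
    pct_excluded = round(100 * excluded_missing / enrolled)
    pct_retained = round(100 * retained / enrolled)
    return retained, pct_excluded, pct_retained


# Discovery cohort: 2858 enrolled, 497 excluded for missing ALSFRS-R values.
discovery = cohort_flow(2858, 497)

# Replication cohort: 1097 enrolled, 108 excluded for missing ALSFRS-R values.
replication = cohort_flow(1097, 108)

print("discovery:", discovery)
print("replication:", replication)
```

Both results agree with the reported figures: 2361 patients (83%) retained in the discovery cohort and 989 retained in the replication cohort after a 10% exclusion. The exclusions apply only to the unsupervised and semi-supervised steps; the supervised analysis of the replication cohort used all 1097 patients.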