Andrew Paul Cox (1), Mireia Raluy-Callado (2), Meng Wang (2), Abdel Magid Bakheit (3), Austen Peter Moore (4), Jerome Dinet (5).
1. Evidera, Metro Building, 6th Floor, 1 Butterwick, London W6 8DL, United Kingdom. Electronic address: andrew.cox@evidera.com.
2. Evidera, Metro Building, 6th Floor, 1 Butterwick, London W6 8DL, United Kingdom.
3. Moseley Hall Hospital, Alcester Road, Birmingham, West Midlands B13 8JL, United Kingdom.
4. Walton Centre NHS Foundation Trust, Lower Lane, Fazakerley, Liverpool, Merseyside L9 7LJ, United Kingdom.
5. Ipsen Pharma, 65, quai Georges Gorse, 92650 Boulogne Billancourt Cedex, France.
Abstract
PURPOSE OF THE RESEARCH: Spasticity is a well-recognized complication of stroke that may give rise to pain and limit patients' ability to perform daily activities. The predisposing factors and direct effects of post-stroke spasticity (PSS) also involve high management costs in terms of healthcare resources, and case-control designs are required for establishing such differences. Using 'The Health Improvement Network' (THIN) database, such a study would not provide reliable estimates, since the prevalence of post-stroke spasticity was found to be 2%, substantially below the most conservative previously reported estimates. The objective of this study was to use predictive analysis techniques to determine whether there is a substantial number of potentially under-recorded patients with post-stroke spasticity.

METHODS: This study used retrospective data from adult patients registered in THIN with a diagnostic code for stroke between 2007 and 2011. Two algorithm approaches were developed and compared: a statistically validated data-trained algorithm and a clinician-trained algorithm.

RESULTS: A data-trained algorithm using Random Forest showed better prediction performance than the clinician-trained algorithm, with higher sensitivity and only marginally lower specificity. Overall accuracy was 75% and 72%, respectively. The data-trained algorithm predicted an additional 3912 records consistent with patients developing spasticity in the 12 months following a stroke.

CONCLUSIONS: Using machine learning techniques, additional unrecorded post-stroke spasticity patients were identified, increasing the condition's prevalence in THIN from 2% to 13%. This work shows the potential for under-reporting of PSS in primary care data, and provides a method for improved identification of case and control records for future studies.
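The abstract compares the two algorithms on sensitivity, specificity, and overall accuracy. As a reminder of how these three measures relate to one another, the following is a minimal sketch in Python. It is not the authors' code: the function names and the toy label vectors are hypothetical and chosen only for illustration, and they do not reproduce any figure reported in the study.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = spasticity)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def classification_metrics(y_true, y_pred):
    """Return sensitivity, specificity and accuracy as a dict of floats."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    total = tp + fp + tn + fn
    return {
        # Share of true cases that the algorithm flags.
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        # Share of true non-cases that the algorithm leaves unflagged.
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
        # Share of all records classified correctly.
        "accuracy": (tp + tn) / total if total else float("nan"),
    }


# Hypothetical toy labels, NOT study data: 4 recorded cases and 6 non-cases.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
pred_a = [1, 1, 1, 0, 1, 1, 0, 0, 0, 0]  # flags more records
pred_b = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]  # flags fewer records

metrics_a = classification_metrics(y_true, pred_a)
metrics_b = classification_metrics(y_true, pred_b)
print(metrics_a)
print(metrics_b)
```

In this toy example the first set of predictions trades some specificity for higher sensitivity, the same kind of trade-off the results describe qualitatively for the data-trained algorithm.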