
Autoencoder as a New Method for Maintaining Data Privacy While Analyzing Videos of Patients With Motor Dysfunction: Proof-of-Concept Study.

Marcus D'Souza¹, Caspar E P Van Munster², Jonas F Dorn³, Alexis Dorier³, Christian P Kamm⁴,⁵, Saskia Steinheimer⁵, Frank Dahlke³, Bernard M J Uitdehaag², Ludwig Kappos¹, Matthew Johnson⁶.

Abstract

BACKGROUND: In chronic neurological diseases, especially in multiple sclerosis (MS), clinical assessment of motor dysfunction is crucial to monitor the disease in patients. Traditional scales are not sensitive enough to detect slight changes. Video recordings of patient performance are more accurate and increase the reliability of severity ratings. When these recordings are automated, quantitative disability assessments by machine learning algorithms can be created. Creation of these algorithms involves non-health care professionals, which is a challenge for maintaining data privacy. However, autoencoders can address this issue.
OBJECTIVE: The aim of this proof-of-concept study was to test whether coded frame vectors of autoencoders contain relevant information for analyzing videos of the motor performance of patients with MS.
METHODS: In this study, 20 pre-rated videos of patients performing the finger-to-nose test were recorded. An autoencoder created encoded frame vectors from the original videos and decoded the videos again. The original and decoded videos were shown to 10 neurologists at an academic MS center in Basel, Switzerland. The neurologists tested whether the 200 videos were human-readable after decoding and rated the severity grade of each original and decoded video according to the Neurostatus-Expanded Disability Status Scale definitions of limb ataxia. Furthermore, the neurologists tested whether ratings were equivalent between the original and decoded videos.
RESULTS: In total, 172 of 200 (86.0%) videos were of sufficient quality to be ratable. The intrarater agreement between the original and decoded videos was 0.317 (Cohen weighted kappa). The average difference in the ratings between the original and decoded videos was 0.26, in which the original videos were rated as more severe. The interrater agreement between the original videos was 0.459 and that between the decoded videos was 0.302. The agreement was higher when no deficits or very severe deficits were present.
CONCLUSIONS: The vast majority of videos (172/200, 86.0%) decoded by the autoencoder contained clinically relevant information and had fair intrarater agreement with the original videos. Autoencoders are a potential method for enabling the use of patient videos while preserving data privacy, especially when non–health care professionals are involved.

©Marcus D'Souza, Caspar E P Van Munster, Jonas F Dorn, Alexis Dorier, Christian P Kamm, Saskia Steinheimer, Frank Dahlke, Bernard M J Uitdehaag, Ludwig Kappos, Matthew Johnson. Originally published in the Journal of Medical Internet Research (http://www.jmir.org), 08.05.2020.


Keywords: Neurostatus-EDSS; autoencoder; deep neuronal network; machine learning algorithms; video-rating

Year: 2020 PMID: 32191621 PMCID: PMC7244995 DOI: 10.2196/16669

Source DB: PubMed Journal: J Med Internet Res ISSN: 1438-8871 Impact factor: 5.428

Introduction

In chronic neurological diseases, especially multiple sclerosis (MS), clinical assessment of motor dysfunction is crucial to monitor the disease in patients [1]. Traditional scales used to assess MS, such as the Expanded Disability Status Scale (EDSS), are not sensitive enough to detect slight changes in motor performance [2]. Video recordings of patient performance are more accurate and increase the reliability of severity ratings [3,4]. Moreover, when these recordings are automated, quantitative disability assessments by machine learning algorithms (MLA) can be created [5]. Machine learning algorithms are potentially more sensitive in detecting small changes between images; however, they require high-resolution images because of the high dimensionality of the data [6,7]. Creation of these algorithms usually involves non–health care professionals, which is a potential challenge for maintaining data privacy. Autoencoders can address this issue. They embed visual information into a lower-dimensional latent space that preserves the information needed for algorithm development but is not visually interpretable by humans [6]. An autoencoder consists of an encoder, which turns each video into a sequence of coded frame vectors, and a paired decoder, which transforms the coded frame vectors back into a video. Videos encoded in this way can be shared with non–health care professionals, while the decoder can be used to verify whether the essential information from the video has been captured. However, it is unknown whether the condensed data in the coded frame vectors contain clinically relevant information. Therefore, the aim of this proof-of-concept study was to test whether coded frame vectors of autoencoders contain relevant information for analyzing videos of the motor performance of patients with MS.

Methods

Study Design and Participants

This study was a subproject of the ASSESS MS study [5] and was approved by the local ethics committees. All participants gave their written informed consent prior to inclusion. In the ASSESS MS study, 9 standardized movements were recorded on video; these movements covered overall motor function, including upper extremity function, truncal stability, and mobility. A detailed description of the movements can be found elsewhere [8]. For this study, we used recordings of the finger-to-nose test. The execution of the finger-to-nose test was standardized using a detailed protocol: Each participant was instructed to close their eyes and abduct their arms to 90° at the shoulder in full extension before touching their nose with the tip of their index finger. Both sides were tested. Original and decoded videos of 20 participants were shown to 10 neurologists at an academic MS center in Basel, Switzerland. The neurologists tested whether these 200 videos in total were human-readable after decoding and rated the severity grade of each original and decoded video according to the Neurostatus-EDSS definitions of limb ataxia [9] (subscore grade 0=no ataxia; grade 1=signs only; grade 2=tremor or clumsy movements easily seen, minor interference with function; grade 3=tremor or clumsy movements that interfere with function in all spheres; and grade 4=most functions are very difficult). The decoded videos were shown first, and after an interval of 2-3 weeks, the original videos were shown in the same order to minimize recall bias.

Autoencoder

A variational autoencoder was trained on 2230 videos comprising the 9 standardized motor performances included in the ASSESS MS study. The autoencoder was structured so that the frames of each video were encoded into a lower-dimensional space and then decoded into their original form. Figure 1 depicts the structure of the autoencoder [10]. An encoder network was presented with a single frame from the video without further context. The frame passed through 5 encoding blocks. In each block, the input was processed in a block inspired by a densely connected convolutional network [11], wherein a skip connection was provided between the input and output layers in addition to a convolutional layer/batch normalization sequence. Each block halved the resolution of the image and doubled the feature depth. This network predicted the mean and variance of a normal distribution, which was then sampled to produce a code. The code was presented to a second network that consisted of 5 decoding blocks. Each decoding block consisted of a skip connection (which performed a simple upsampling process) and a transposed convolutional block like that used in a deep convolutional generative adversarial network [12]. Each block doubled the resolution and halved the feature depth. The network was trained using a multi-scale structural similarity–based perceptual loss function [13] with Kullback-Leibler regularization as per Kingma and Welling [10]. The input images were 256×256 RGB-D images with a code length of 256. The training hyperparameters were as follows: the learning rate was 0.001, the convolutional kernel size was 5, and the number of initial filters was 8. The model was trained for 400 epochs.
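The bookkeeping behind this description can be sketched in a few lines of plain Python. This is an illustrative sketch only, not the authors' implementation: the function names are hypothetical, the assumption that the first encoding block outputs the 8 initial filters (with each later block doubling the depth) is ours, and parameterizing the variance as a log-variance is a common convention that the text does not confirm. The sketch traces the feature-map shape through the 5 encoding blocks, computes how many input values are condensed into one 256-element code, and shows the sampling step and the Kullback-Leibler term against a standard normal prior.

```python
import math
import random


def trace_encoder_shapes(resolution=256, initial_filters=8, num_blocks=5):
    """Return (resolution, feature_depth) after each encoding block.

    Assumption: the first block outputs `initial_filters` feature maps and
    each later block doubles the depth; every block halves the resolution.
    """
    shapes = []
    depth = initial_filters
    for _ in range(num_blocks):
        resolution //= 2
        shapes.append((resolution, depth))
        depth *= 2
    return shapes


def values_per_code_element(resolution=256, channels=4, code_length=256):
    """Input values (RGB-D has 4 channels) per element of the coded vector."""
    return (resolution * resolution * channels) // code_length


def sample_code(mean, log_var, rng):
    """Draw a code as mean + sigma * eps with eps ~ N(0, 1).

    Assumption: the encoder output is interpreted as a log-variance.
    """
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mean, log_var)]


def kl_to_standard_normal(mean, log_var):
    """KL divergence from N(mean, exp(log_var)) to N(0, 1), summed over dims."""
    return 0.5 * sum(m * m + math.exp(lv) - 1.0 - lv
                     for m, lv in zip(mean, log_var))


if __name__ == "__main__":
    print(trace_encoder_shapes())
    print(values_per_code_element())
    rng = random.Random(0)
    code = sample_code([0.0] * 256, [0.0] * 256, rng)
    print(len(code), kl_to_standard_normal([0.0] * 256, [0.0] * 256))
```

Under these assumptions, a 256×256 RGB-D frame holds 262,144 values, so each element of the 256-element code stands in for 1024 input values.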

Figure 1

Structure of the variational autoencoder.

Neural networks are a machine learning approach inspired by biological neuronal computation; these networks have demonstrated exceptional performance in complex image-related tasks in recent years [15-17]. Given this success, we used a neural network approach called a variational autoencoder [18]. An autoencoder as described above reduces the dimensionality of the input data (in our case, videos) by passing the data through an “information bottleneck” [14]. The resulting coded, or latent, space describes the data well enough to allow an accurate partial reconstruction. The shared latent embedding is optimized to represent the salient information that is similar across frames of multiple videos (in our case, the movement), whereas dissimilar aspects (eg, background aspects, details of physical features) are less well conserved.

A variational autoencoder has at its center a coded vector of vastly reduced dimensionality. The key property of interest to us was that when a frame is in its coded form, it is computationally prohibitive to decipher it without access to the decoder [6]. This is because the decoder requires millions of floating point values to be set precisely before the coded vector can be successfully decoded into an image. At the same time, the coded vector contains all the information necessary to reconstruct that frame; interestingly, due to the variational constraints during training, it also has semantically meaningful cosine distances to the coded vectors of other visually similar frames. This property is very useful for machine learning tasks that operate on these coded vectors because the coded frames can be used in place of the original video frames without the possibility that a human could use them to recognize the depicted participant.
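As a minimal illustration of what a cosine distance between two coded frame vectors means, the following sketch computes it in plain Python. The function name and the short 4-element vectors are hypothetical and chosen only for readability; the codes in the study had 256 elements, and no values here come from the study data.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cos(angle between u and v); 0 means identical direction."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_u * norm_v)


if __name__ == "__main__":
    # Hypothetical codes: frames a and b are meant to look alike, c is not.
    frame_a = [0.9, 0.1, -0.3, 0.5]
    frame_b = [0.8, 0.2, -0.2, 0.6]
    frame_c = [-0.7, 0.9, 0.4, -0.1]
    print(cosine_distance(frame_a, frame_b))
    print(cosine_distance(frame_a, frame_c))
```

A downstream algorithm can compare frames through such distances without ever reconstructing the image itself.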

Statistics

Intrarater agreement between the ratings of the original and the decoded videos was assessed using the Cohen weighted kappa with linear weights (ie, disagreements of 1, 2, and 3 were weighted by factors of 1, 2, and 3, respectively). A Cohen kappa of 0 corresponds to chance agreement; 0-0.2, to slight agreement; 0.21-0.4, to fair agreement; 0.41-0.6, to moderate agreement; 0.61-0.8, to substantial agreement; and 0.81-1, to almost perfect agreement [19]. All analyses were performed in MATLAB (MathWorks, Inc).
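The linearly weighted kappa can be written as 1 minus the ratio of observed to chance-expected weighted disagreement, where a disagreement between grades i and j carries the weight |i - j|. The following Python sketch illustrates that definition and the interpretation bands listed above; it is not the MATLAB code used in the study, the function names are hypothetical, and the label for values at or below 0 is our own wording.

```python
def linear_weighted_kappa(ratings_a, ratings_b, num_grades=5):
    """Cohen kappa with linear weights |i - j| for grades 0..num_grades-1."""
    n = len(ratings_a)
    if n == 0 or n != len(ratings_b):
        raise ValueError("need two non-empty rating lists of equal length")

    # Observed joint proportions of (grade in A, grade in B).
    observed = [[0.0] * num_grades for _ in range(num_grades)]
    for a, b in zip(ratings_a, ratings_b):
        observed[a][b] += 1.0 / n

    # Marginal proportions give the joint distribution expected by chance.
    row = [sum(observed[i]) for i in range(num_grades)]
    col = [sum(observed[i][j] for i in range(num_grades))
           for j in range(num_grades)]

    grades = range(num_grades)
    observed_dis = sum(abs(i - j) * observed[i][j]
                       for i in grades for j in grades)
    expected_dis = sum(abs(i - j) * row[i] * col[j]
                       for i in grades for j in grades)
    if expected_dis == 0.0:
        raise ValueError("kappa is undefined when only one grade is used")
    return 1.0 - observed_dis / expected_dis


def agreement_label(kappa):
    """Map a kappa value to the interpretation bands listed in the text."""
    if kappa <= 0.0:
        return "no better than chance"
    for upper, label in [(0.2, "slight"), (0.4, "fair"), (0.6, "moderate"),
                         (0.8, "substantial")]:
        if kappa <= upper:
            return label
    return "almost perfect"


if __name__ == "__main__":
    # Hypothetical ratings of the same 6 videos, for illustration only.
    original = [0, 1, 2, 3, 4, 2]
    decoded = [0, 1, 1, 2, 4, 2]
    kappa = linear_weighted_kappa(original, decoded)
    print(kappa, agreement_label(kappa))
```

Applied to the values reported in this study, `agreement_label(0.317)` falls in the fair band and `agreement_label(0.459)` in the moderate band.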

Results

The characteristics of the study population and the participating neurologists are summarized in Table 1.

Table 1

Characteristics of the patients and neurologists who participated in the study.

Characteristic			Value
Patient characteristics (n=20)
	Age (years), mean (95% CI)	44.4 (27-74)
	Gender (female/male), n (%)	12 (63%)/7 (37%)
	Disease duration (years), mean (95% CI)	13.2 (1-40)
	Median EDSS^a (range)	3.5 (0-6.5)
	Type of MS^b (RRMS^c/SPMS^d), n (%)	19 (95%)/1 (5%)
Neurologists (n=10)
	Gender (female/male), n (%)	5 (50%)/5 (50%)
	Years of experience in neurology, mean (range)	8.8 (3 to >30)

^a EDSS: Expanded Disability Status Scale.

^b MS: multiple sclerosis.

^c RRMS: relapsing remitting multiple sclerosis.

^d SPMS: secondary progressive multiple sclerosis.

In total, 172/200 (86.0%) videos were of sufficient quality to be ratable. The Cohen weighted kappa indicating intrarater agreement between the original and decoded videos was 0.317. The average difference in the ratings between the original and decoded videos was 0.26, with the original videos rated as more severe. The interrater agreements for the original and decoded videos were 0.459 and 0.302, respectively. As depicted in Figure 2, agreement was higher when no deficits (grade 0) or very severe deficits (grade 4) were present. Note that most of the videos judged not ratable were judged so by neurologists 2 and 5.

Figure 2

Ratings by 10 neurologists of the original and decoded videos. The colored squares represent the different grades for limb ataxia in the finger-to-nose test according to the Neurostatus-Expanded Disability Status Scale subscores: black=0, dark grey=1, grey=2, bright grey=3, and white=4. The blue squares represent videos that were judged as not ratable by the neurologists.


Discussion

Principal Findings

In this proof-of-concept study, 172/200 (86.0%) of the decoded videos were of sufficient quality to be ratable. We found fair intrarater agreement between the original and decoded videos. Agreement was higher when no deficits or very severe deficits in motor function were present.

Data security and privacy are increasingly requested by health care professionals for data capture, analysis, and storage [20]. At the same time, machine learning algorithms and deep neural network techniques, as subdomains of artificial intelligence, are increasingly entering all areas of health care [21,22]. The use of new technologies and electronic tools for capture and automated analysis of clinical data generally requires the involvement of non–health care professionals, which creates challenges regarding data privacy. To our knowledge, this is the first study to use an autoencoder to allow the analysis of patient videos while preserving data privacy.

Patients with MS may present with slight changes in motor performance over their disease course. Clinical assessment of these changes is notoriously difficult. Video analysis of motor performance allows automated analysis and quantification of disability by using machine learning algorithm–based analysis systems such as those used in the ASSESS MS study; however, it requires a huge data set [5]. Since the creation of machine learning algorithms usually involves nonmedical collaborators, encoding of these videos is essential.

The intrarater agreement between the original and decoded videos in this study was fair. It is unclear whether this is due to the quality of the decoded videos or to the test-retest reliability of the finger-to-nose test. To our knowledge, no data are available regarding this psychometric property of the finger-to-nose test.

Limitations

A limitation of this proof-of-concept study is the class imbalance of the patient videos according to the four grades of limb ataxia for the finger-to-nose test [9,21]. Further iterations of the deep neural network are necessary to increase the intrarater reliability.

Conclusions

In this proof-of-concept study, we have shown that the vast majority (172/200, 86.0%) of videos decoded by an autoencoder contained clinically relevant information regarding upper extremity motor performance represented by the finger-to-nose test and had fair intrarater agreement. Autoencoders are a potential method for enabling the use of patient videos while preserving data privacy, especially when non–health care professionals are involved.


1. Convolutional Networks with Dense Connectivity.

Authors: Gao Huang; Zhuang Liu; Geoff Pleiss; Laurens Van Der Maaten; Kilian Weinberger
Journal: IEEE Trans Pattern Anal Mach Intell Date: 2019-05-23 Impact factor: 6.226

2. Video-Based Pairwise Comparison: Enabling the Development of Automated Rating of Motor Dysfunction in Multiple Sclerosis.

Authors: Jessica Burggraaff; Jonas Dorn; Marcus D'Souza; Cecily Morrison; Christian P Kamm; Peter Kontschieder; Prejaas Tewarie; Saskia Steinheimer; Abigail Sellen; Frank Dahlke; Ludwig Kappos; Bernard Uitdehaag
Journal: Arch Phys Med Rehabil Date: 2019-08-30 Impact factor: 3.966

3. The measurement of observer agreement for categorical data.

Authors: J R Landis; G G Koch
Journal: Biometrics Date: 1977-03 Impact factor: 2.571

Review 4. Using deep learning to investigate the neuroimaging correlates of psychiatric and neurological disorders: Methods and applications.

Authors: Sandra Vieira; Walter H L Pinaya; Andrea Mechelli
Journal: Neurosci Biobehav Rev Date: 2017-01-10 Impact factor: 8.989

Review 5. Outcome Measures in Clinical Trials for Multiple Sclerosis.

Authors: Caspar E P van Munster; Bernard M J Uitdehaag
Journal: CNS Drugs Date: 2017-03 Impact factor: 5.749

6. Reference videos reduce variability of motor dysfunction assessments in multiple sclerosis.

Authors: Marcus D'Souza; Saskia Steinheimer; Jonas Dorn; Cecily Morrison; Jacques Boisvert; Kristina Kravalis; Jessica Burggraaff; Caspar Ep van Munster; Manuela Diederich; Abigail Sellen; Christian P Kamm; Frank Dahlke; Bernard Mj Uitdehaag; Ludwig Kappos
Journal: Mult Scler J Exp Transl Clin Date: 2018-08-09

Review 7. Applications of Machine Learning in Real-Life Digital Health Interventions: Review of the Literature.

Authors: Andreas K Triantafyllidis; Athanasios Tsanas
Journal: J Med Internet Res Date: 2019-04-05 Impact factor: 5.428

8. Using deep autoencoders to identify abnormal brain structural patterns in neuropsychiatric disorders: A large-scale multi-sample study.

Authors: Walter H L Pinaya; Andrea Mechelli; João R Sato
Journal: Hum Brain Mapp Date: 2018-10-11 Impact factor: 5.038

9. Towards a Stakeholder-Oriented Blockchain-Based Architecture for Electronic Health Records: Design Science Research Study.

Authors: Jan Heinrich Beinke; Christian Fitte; Frank Teuteberg
Journal: J Med Internet Res Date: 2019-10-07 Impact factor: 5.428

10. Usability and Acceptability of ASSESS MS: Assessment of Motor Dysfunction in Multiple Sclerosis Using Depth-Sensing Computer Vision.

Authors: Cecily Morrison; Marcus D'Souza; Kit Huckvale; Jonas F Dorn; Jessica Burggraaff; Christian Philipp Kamm; Saskia Marie Steinheimer; Peter Kontschieder; Antonio Criminisi; Bernard Uitdehaag; Frank Dahlke; Ludwig Kappos; Abigail Sellen
Journal: JMIR Hum Factors Date: 2015-06-24
