A Anguera, J M Barreiro, J A Lara, D Lizcano.
Abstract
One of the major challenges in the medical domain today is how to exploit the huge amount of data that this field generates. This requires approaches capable of discovering knowledge that is useful for medical decision making. Time series are a common data type in the medical domain and require specialized analysis techniques and tools, especially when the information of interest to specialists is concentrated within particular regions of the time series, known as events. This research followed the steps of the knowledge discovery in databases (KDD) process to discover knowledge from medical time series derived from stabilometric (396 series) and electroencephalographic (200 series) patient electronic health records (EHR). The view offered in the paper is based on the experience gathered as part of the VIIP project. Knowledge discovery in medical time series poses a number of difficulties and implications, which are highlighted by applying several techniques that cover the entire KDD process in two case studies. The paper illustrates the application of different knowledge discovery techniques for the purposes of classification within these two domains. The accuracy for the two classes considered in each case is 99.86% and 98.11% for epilepsy diagnosis in the electroencephalography (EEG) domain and 99.4% and 99.1% for early-age sports talent classification in the stabilometry domain. The KDD techniques achieve better results than traditional neural network-based classification techniques.
Keywords: Electronic Health Record; Knowledge Discovery; Medical Data Mining; Time Series
Year: 2016 PMID: 27293535 PMCID: PMC4887593 DOI: 10.1016/j.csbj.2016.05.002
Source DB: PubMed Journal: Comput Struct Biotechnol J ISSN: 2001-0370 Impact factor: 7.271
Fig. 1 Event taken from an EEG time series.
Fig. 2 Patient performing a test on a stabilometric platform.
Fig. 3 Fall event taken from a stabilometric time series.
Fig. 4 Generic UML model.
Fig. 5 Part of the conceptual model of the stabilometry domain [65].
Fig. 6 Stabilometric time series snippet.
Fig. 7 XML snippet containing patient stabilometric data.
Fig. 8 Definition of events for the US stabilometric test.
Overall results of applying the event definition language to the EEG domain.
| #Series | # | # | # | |
|---|---|---|---|---|
| 200 | 1446 | 1496 | 1412 | 0.959 |
Statistical analysis of the events identified in the EEG domain.
| Class | Avg(#Events) | Avg(Duration) (ms) | Avg(Amplitude) |
|---|---|---|---|
| | 9.47 | 195 | 78 |
| | 5.49 | 56 | 54 |
Overall results of the application of the event definition language to the stabilometric domain.
| #Series | # | # | # | |
|---|---|---|---|---|
| 396 | 942 | 954 | 931 | 0.982 |
Statistical analysis of events identified in the stabilometric domain.
| Class | Avg(#Events) | Avg(Duration) (ms) | Avg(Intensity) |
|---|---|---|---|
| | 3.37 | 754 | 107 |
| | 1.45 | 346 | 83 |
Fig. 9 Overview of the process enacted in the case study (aligned with the KDD process).
Confusion matrix for the application of the outlier detection method to the EEG domain.
| | Language: Yes | Language: No |
|---|---|---|
| Experts: Yes | 11 | 1 |
| Experts: No | 3 | 185 |
Overall results for the application of the outlier detection method to the EEG domain.
| Indicator | Value (%) |
|---|---|
| Precision | 78.6 |
| Recall | 91.7 |
| Specificity | 98.4 |
| Accuracy | 98.0 |
Confusion matrix for the application of the outlier detection method to the stabilometric domain.
| | Language: Yes | Language: No |
|---|---|---|
| Experts: Yes | 13 | 2 |
| Experts: No | 4 | 377 |
Overall results for the application of the outlier detection method to the stabilometric domain.
| Indicator | Value (%) |
|---|---|
| Precision | 76.5 |
| Recall | 86.7 |
| Specificity | 99 |
| Accuracy | 98.5 |
Comparison of the classification of patients by different methods in the EEG domain (accuracy)
| Patient type | Knowledge discovery in time series | AFINN | Multilayer perceptron |
|---|---|---|---|
| | 99.86% | 96.26% | 96.61% |
| | 98.11% | 95.12% | 93% |
Reference models built by our proposal for EEG.
| Model | #Event | Duration (ms) | Amplitude | #Timestamp |
|---|---|---|---|---|
| | 1 | 56 | 74 | 345 |
| | 2 | 321 | 81 | 1022 |
| | 3 | 68 | 73 | 3429 |
| | 4 | 167 | 87 | 2879 |
| | 5 | 189 | 83 | 1758 |
| | 6 | 245 | 92 | 895 |
| | 7 | 76 | 101 | 2210 |
| | 1 | 89 | 45 | 1355 |
| | 2 | 145 | 32 | 3652 |
| | 3 | 110 | 37 | 345 |
| | 4 | 57 | 21 | 1384 |
Comparison of the classification of patients by different methods in the stabilometric domain (accuracy)
| Patient type | Knowledge discovery in time series | AFINN | Multilayer perceptron |
|---|---|---|---|
| | 99.4% | 98.8% | 97.7% |
| | 99.1% | 98.1% | 97.1% |
Reference models built by the proposed method for stabilometry.
| Model | #Event | Duration (ms) | Intensity | #Timestamp |
|---|---|---|---|---|
| | 1 | 854 | 103 | 345 |
| | 2 | 723 | 96 | 783 |
| | 3 | 1099 | 117 | 267 |
| | 1 | 234 | 73 | 211 |
Fig. 10 Model interpretation interface.
| | Predicted label: Positive | Predicted label: Negative |
|---|---|---|
| Known label: Positive | True Positive (TP) | False Negative (FN) |
| Known label: Negative | False Positive (FP) | True Negative (TN) |
Precision = TP / (TP + FP)
Recall = TP / (TP + FN)
Specificity = TN / (TN + FP)
Accuracy = (TP + TN) / (TP + TN + FP + FN)
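
As a quick check, the four indicators defined above can be recomputed from the two outlier detection confusion matrices reported for the EEG and stabilometric domains. The following is a minimal sketch, not the authors' implementation: the `ConfusionMatrix` class and `as_percentages` helper are hypothetical names introduced here for illustration, and it assumes the expert judgement plays the role of the known label while the method's output plays the role of the predicted label.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionMatrix:
    """Two-class confusion matrix (hypothetical helper, not from the paper)."""
    tp: int  # experts: yes, method: yes
    fn: int  # experts: yes, method: no
    fp: int  # experts: no,  method: yes
    tn: int  # experts: no,  method: no

    def precision(self) -> float:
        return self.tp / (self.tp + self.fp)

    def recall(self) -> float:
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)

    def accuracy(self) -> float:
        return (self.tp + self.tn) / (self.tp + self.tn + self.fp + self.fn)


def as_percentages(cm: ConfusionMatrix, digits: int = 1) -> dict:
    """Return the four indicators as percentages rounded to `digits` places."""
    return {
        "precision": round(100 * cm.precision(), digits),
        "recall": round(100 * cm.recall(), digits),
        "specificity": round(100 * cm.specificity(), digits),
        "accuracy": round(100 * cm.accuracy(), digits),
    }


# Counts copied from the two outlier detection confusion matrices.
eeg = ConfusionMatrix(tp=11, fn=1, fp=3, tn=185)
stabilometry = ConfusionMatrix(tp=13, fn=2, fp=4, tn=377)

if __name__ == "__main__":
    for name, cm in (("EEG", eeg), ("Stabilometry", stabilometry)):
        print(name, as_percentages(cm))
```

Under these assumptions the recomputed values agree with the reported indicator tables, in the order precision, recall, specificity, accuracy: 78.6, 91.7, 98.4 and 98.0 for EEG, and 76.5, 86.7, 99.0 and 98.5 for stabilometry.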