Kenan Li, Katherine Sward, Huiyu Deng, John Morrison, Rima Habre, Meredith Franklin, Yao-Yi Chiang, Jose Luis Ambite, John P Wilson, Sandrah P Eckel.
Abstract
Advances in measurement technology are producing increasingly time-resolved environmental exposure data. We aim to gain new insights into exposures and their potential health impacts by moving beyond simple summary statistics (e.g., means, maxima) to characterize more detailed features of high-frequency time series data. This study proposes a novel variant of the Self-Organizing Map (SOM) algorithm called Dynamic Time Warping Self-Organizing Map (DTW-SOM) for unsupervised pattern discovery in time series. The algorithm uses DTW, a similarity measure that optimally aligns interior patterns of sequential data, both as the similarity measure and as the training guide of the neural network. We applied DTW-SOM to a panel study monitoring indoor and outdoor residential temperature and particulate matter air pollution (PM2.5) for 10 patients with asthma from 7 households near Salt Lake City, UT; the patients were followed for up to 373 days each. Compared to previous SOM algorithms that use timestamp alignment on time series data, the DTW-SOM algorithm produced lower quantization errors and more detailed diurnal patterns. DTW-SOM identified the expected typical diurnal patterns in outdoor temperature, which varied by season, as well as diurnal patterns in PM2.5 that may be related to daily asthma outcomes. In summary, DTW-SOM is an innovative feature engineering method that can be applied to highly time-resolved environmental exposures assessed by sensors to identify typical diurnal (or hourly or monthly) patterns and provide new insights into the health effects of environmental exposures.
Year: 2021 PMID: 34912034 PMCID: PMC8674322 DOI: 10.1038/s41598-021-03515-1
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Dynamic Time Warping (DTW) alignment of two 24-h time series of outdoor temperature (at minute-level resolution, so 60 × 24 = 1440 min long) from two measurement sites. The color gradient displays all pairwise Euclidean distances between time points (blue indicates the shortest distances), and the red line shows the shortest path.
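The alignment in this figure can be illustrated with a minimal sketch of classic DTW. This is a textbook dynamic-programming formulation, not the authors' implementation. The function name `dtw` is hypothetical, and the pointwise distance is assumed to be the absolute difference (which equals the Euclidean distance for scalar readings).

```python
from typing import List, Tuple


def dtw(a: List[float], b: List[float]) -> Tuple[float, List[Tuple[int, int]]]:
    """Return the DTW cost and warping path between two 1-D series.

    Hypothetical helper: classic DTW with steps (i-1, j), (i, j-1), (i-1, j-1).
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # acc[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    acc = [[inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])  # assumed pointwise distance
            acc[i][j] = cost + min(acc[i - 1][j], acc[i][j - 1], acc[i - 1][j - 1])

    # Backtrack from the end to recover the shortest path (the "red line").
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        candidates = [
            (acc[i - 1][j - 1], i - 1, j - 1),
            (acc[i - 1][j], i - 1, j),
            (acc[i][j - 1], i, j - 1),
        ]
        _, i, j = min(candidates)
    path.reverse()
    return acc[n][m], path


# Toy series: the second one is the first with its initial value repeated.
cost, path = dtw([0.0, 1.0, 2.0, 3.0], [0.0, 0.0, 1.0, 2.0, 3.0])
```

For two 1440-point series this fills a 1441 × 1441 table, which corresponds to the grid of pairwise distances shown as the color gradient.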
Figure 2. Conceptual diagrams showing the use of SOM to discover: (a) multipollutant patterns, as in Pierce et al. (2014); and (b) diurnal patterns of a single pollutant, as proposed in this study.
Figure 3. Calculation of new neuron weights N using DTW alignment. N’ and N’ are the weighted means of two adjacent DTW pairs; however, neither matches an existing timestamp. To calculate N, we approximate its value using N’ and N’.
Pseudocode of the DTW-SOM updating rules.
Figure 4. Visual comparison of the trained weights resulting from one iteration of the Euclidean training rule vs. the DTW training rule applied to a randomly selected observation from the 24-h outdoor temperature time series. Input data refers to the standard time series, trained weights refer to the weights after the iteration, and BMU weights refer to the weights at the previous iteration.
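For reference, one iteration of the Euclidean (timestamp-aligned) rule can be sketched as the classic Kohonen update applied to the best matching unit (BMU). This is a minimal sketch of the standard rule only; the DTW training rule is defined in the paper and is not reproduced here. The function names and the `learning_rate` value are assumptions for illustration.

```python
import math
from typing import List


def euclidean(x: List[float], w: List[float]) -> float:
    """Euclidean distance between two equal-length series."""
    return math.sqrt(sum((xi - wi) ** 2 for xi, wi in zip(x, w)))


def find_bmu(x: List[float], weights: List[List[float]]) -> int:
    """Index of the neuron whose weights are closest to the input x."""
    return min(range(len(weights)), key=lambda k: euclidean(x, weights[k]))


def update_timestamp_aligned(
    x: List[float], w: List[float], learning_rate: float
) -> List[float]:
    """Classic SOM update: move each weight toward the input value
    at the SAME timestamp (no warping)."""
    return [wi + learning_rate * (xi - wi) for xi, wi in zip(x, w)]


# Toy example with two neurons and a 3-point "time series".
weights = [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]]
x = [1.0, 2.0, 1.0]
bmu = find_bmu(x, weights)
trained = update_timestamp_aligned(x, weights[bmu], learning_rate=0.5)
```

Because each weight is pulled toward the input at the same index, a peak that occurs slightly earlier or later in the input is averaged against a non-peak weight, which is the behavior the DTW rule is compared against.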
Figure 5. Quantization error as a function of the number of neurons in DTW-SOM.
Quantitative comparison of the trained weights resulting from one iteration of the timestamp training rule vs. the DTW training rule on a randomly selected observation from the 24-h outdoor temperature time series.
| Summary statistic | Timestamp Alignment | DTW Alignment |
|---|---|---|
| Euclidean distance between the trained weight and the initial (BMU) weight | 7.77 | 7.37 |
| Variance of the trained weightsᵃ | 0.012 | 0.014 |
ᵃ Variance of the input time series was 0.017 and the variance of the initial BMU weights was 0.010.
Quantization errors from applying the three SOM algorithms, each with a 5 × 5 output space (25 neurons), separately to each of the four residential sensor readings.
| Algorithm | Outdoor temperature | Indoor temperature | Outdoor PM2.5 | Indoor PM2.5 |
|---|---|---|---|---|
| Standard SOM with Euclidean distance measurement | 1.675 | 2.476 | 0.924 | 0.676 |
| SOM with DTW distance measurement | 1.706 | 2.404 | 0.909 | 0.717 |
| DTW-SOM | 1.189 | 1.800 | 0.907 | 0.538 |
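The quantity compared in this table can be sketched as follows, assuming the common definition of quantization error as the mean distance between each input and its best matching neuron. The paper's exact definition (for example, whether the distance is Euclidean or DTW) is not stated in this record, so the Euclidean choice and the function names below are assumptions.

```python
import math
from typing import List


def euclidean(x: List[float], w: List[float]) -> float:
    """Euclidean distance between two equal-length series."""
    return math.sqrt(sum((xi - wi) ** 2 for xi, wi in zip(x, w)))


def quantization_error(data: List[List[float]], weights: List[List[float]]) -> float:
    """Mean distance from each observation to its closest neuron.

    Assumed definition; a DTW-based variant would swap `euclidean`
    for a DTW distance function.
    """
    total = 0.0
    for x in data:
        total += min(euclidean(x, w) for w in weights)
    return total / len(data)


# Toy example: two observations, first with one neuron, then with two.
data = [[0.0, 0.0], [3.0, 4.0]]
qe_one_neuron = quantization_error(data, [[0.0, 0.0]])
qe_two_neurons = quantization_error(data, [[0.0, 0.0], [3.0, 4.0]])
```

Adding neurons can only keep this value the same or lower it for a fixed data set when existing neurons are retained, which is why the error is usually examined as a function of map size.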
Figure 6. Diurnal patterns in outdoor temperature identified using standard SOM (left) and DTW-SOM (right). Each cell represents a neuron, the curved line represents its final weights, and the bar plot indicates the distribution by season of the input observations best matching that neuron.
Figure 7. Diurnal patterns in indoor (left) and outdoor (right) residential PM2.5 identified using DTW-SOM. The fraction in each cell represents the number of days with inhaler usage over the number of days matching the diurnal pattern.
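The per-cell fraction described in this caption can be computed by counting, for each neuron, the days assigned to it and the subset of those days with inhaler usage. The sketch below uses made-up toy inputs; the function name and data layout (one BMU index and one usage flag per day) are assumptions for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def inhaler_fractions(
    bmu_per_day: List[int], inhaler_used: List[bool]
) -> Dict[int, Tuple[int, int]]:
    """Map each neuron index to (days with inhaler usage, days matching it)."""
    counts = defaultdict(lambda: [0, 0])
    for bmu, used in zip(bmu_per_day, inhaler_used):
        counts[bmu][1] += 1  # day matches this diurnal pattern
        if used:
            counts[bmu][0] += 1  # day also had inhaler usage
    return {bmu: (c[0], c[1]) for bmu, c in counts.items()}


# Toy data: five days assigned to two neurons.
fractions = inhaler_fractions(
    bmu_per_day=[0, 0, 1, 1, 1],
    inhaler_used=[True, False, False, False, True],
)
```

Keeping the numerator and denominator separate, rather than a single ratio, preserves the number of days behind each pattern, as the figure does.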