Rachel Cavill, Jos Kleinjans, Jacob-Jan Briedé.
Abstract
When studying time courses of biological measurements and comparing these to other measurements, e.g. gene expression and phenotypic endpoints, the analysis is complicated by the fact that, although the associated elements may show the same patterns of behaviour, the changes do not occur simultaneously. In these cases standard correlation-based measures of similarity will fail to find significant associations. Dynamic time warping (DTW) is a technique which can be used in these situations to find the optimal match between two time courses, which may then be assessed for its significance. We implement DTW4Omics, a tool for performing DTW in R. This tool extends existing R scripts for DTW, making them applicable to "omics" datasets where thousands of entities may need to be compared with a range of markers and endpoints. It includes facilities to estimate the significance of the matches between the supplied data, and provides a set of plots to enable the user to easily visualise the output. We illustrate the utility of this approach using a dataset linking the exposure of the colon carcinoma Caco-2 cell line to oxidative stress by hydrogen peroxide (H2O2) and menadione across 9 timepoints, and show that on average 85% of the genes found are not obtained from a standard correlation analysis between the genes and the measured phenotypic endpoints. We then show that when the genes identified by DTW4Omics as significantly associated with a marker for oxidative DNA damage (8-oxodG) are analysed for over-representation, an Oxidative Stress pathway is identified as the most over-represented pathway, demonstrating that the genes found by DTW4Omics are biologically relevant. In contrast, when the positively correlated genes were similarly analysed, no pathways were found. The tool is implemented as an R package and is available, along with a user guide, from http://web.tgx.unimaas.nl/svn/public/dtw/.
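To make the idea of an "optimal match between two time courses" concrete, the following is a minimal sketch of the textbook DTW recurrence, written in Python purely for illustration. It is not the DTW4Omics implementation (which is an R package); the function name `dtw`, the absolute-difference cost, and the toy series `gene` and `endpoint` are all assumptions made for this example.

```python
def dtw(x, y):
    """Return (distance, path) using the classic DTW recurrence with |a - b| cost."""
    n, m = len(x), len(y)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost of aligning x[:i] with y[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(x[i - 1] - y[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    # Backtrack from the last cell to recover which timepoints were matched.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        steps = [
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        ]
        _, i, j = min(steps)
    path.reverse()
    return cost[n][m], path


# Hypothetical toy data: the same pulse, with the endpoint lagging two timepoints behind.
gene = [0, 0, 1, 3, 1, 0, 0, 0, 0]
endpoint = [0, 0, 0, 0, 1, 3, 1, 0, 0]

distance, path = dtw(gene, endpoint)
pointwise = sum(abs(a - b) for a, b in zip(gene, endpoint))

print(distance)   # 0.0: warping absorbs the lag completely
print(pointwise)  # 8: comparing timepoint by timepoint penalises the lag
```

In this toy case the two series are identical up to a delay, so the warped distance is zero while the timepoint-by-timepoint difference is not. This is the situation the abstract describes, where same-time similarity measures miss a delayed association.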
Year: 2013 PMID: 23977154 PMCID: PMC3748037 DOI: 10.1371/journal.pone.0071823
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Figure 1. Dynamic time warping.
Panel A shows how time series can be aligned using dynamic time warping from the original (left) to the warped (right). Panel B shows the two varieties of DTW provided by DTW4Omics, showing the comparisons which are considered in each case.
Figure 2. Output plots.
A: Gene and endpoint before and after warping. B: Histogram of real and permuted distances obtained through DTW. C: Histogram of the permuted distances obtained for the most significant gene, with the real distance marked by an X.
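The histograms of real versus permuted distances suggest a permutation-style significance estimate. The sketch below shows one common way to build such a null distribution, in Python for illustration only. How DTW4Omics actually generates its permutations is not stated here, so shuffling the timepoints of one series, the add-one p-value estimate, and the names `dtw_distance` and `permutation_p_value` are all assumptions.

```python
import random


def dtw_distance(x, y):
    """Classic DTW distance with absolute-difference cost (two-row dynamic programme)."""
    inf = float("inf")
    prev = [0.0] + [inf] * len(y)
    for a in x:
        curr = [inf] * (len(y) + 1)
        for j, b in enumerate(y, start=1):
            curr[j] = abs(a - b) + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return prev[len(y)]


def permutation_p_value(x, y, n_perm=999, seed=0):
    """Compare the real DTW distance with distances obtained after shuffling x."""
    rng = random.Random(seed)
    real = dtw_distance(x, y)
    shuffled = list(x)
    permuted = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # assumption: the null is built by permuting timepoints
        permuted.append(dtw_distance(shuffled, y))
    # Add-one estimate, so the p-value can never be exactly zero.
    p = (1 + sum(d <= real for d in permuted)) / (n_perm + 1)
    return real, permuted, p


# Hypothetical toy data: the endpoint lags the gene by two timepoints.
gene = [0, 0, 1, 3, 1, 0, 0, 0, 0]
endpoint = [0, 0, 0, 0, 1, 3, 1, 0, 0]

real, permuted, p = permutation_p_value(gene, endpoint)
print(real, p)
```

A histogram of `permuted` with `real` marked on it would correspond, in spirit, to panel C. Small p-values indicate that the real distance is unusually low compared with shuffled data.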
Comparing DTW4Omics selected genes with correlation analysis.
| Oxidant | Measure | G1 | S | 8-oxo-dG | Time |
|---|---|---|---|---|---|
| Menadione | No. of genes with positive correlation | 34 | 53 | 51 | 127 |
| Menadione | No. of genes with DTW | 99 | 102 | 141 | 145 |
| Menadione | Overlap with correlated genes | 13 | 18 | 22 | 86 |
| Menadione | Percentage of DTW genes not found by correlation | 87% | 82% | 84% | 41% |
| Menadione | Percentage of positive correlation genes found by DTW | 38% | 34% | 43% | 68% |
| H2O2 | No. of genes with DTW | 14 | 0 | 17 | 13 |
| H2O2 | Percentage of DTW genes not found by correlation | 100% | – | 100% | 100% |
G2 is not shown as it gave no correlated genes under any conditions. With H2O2, no genes were found with correlation, so these rows are omitted.
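The percentage rows in the table follow directly from the gene counts and the overlap. As a worked check, the short Python sketch below recomputes them for the first block of rows (the one that includes correlation counts). The dictionary layout and the helper name `pct` are illustrative choices only.

```python
# Counts copied from the first block of the table.
rows = {
    "G1": {"corr": 34, "dtw": 99, "overlap": 13},
    "S": {"corr": 53, "dtw": 102, "overlap": 18},
    "8-oxo-dG": {"corr": 51, "dtw": 141, "overlap": 22},
    "Time": {"corr": 127, "dtw": 145, "overlap": 86},
}


def pct(part, whole):
    """Percentage rounded to the nearest whole number."""
    return round(100 * part / whole)


# DTW genes that correlation missed, as a share of all DTW genes.
dtw_only = {k: pct(v["dtw"] - v["overlap"], v["dtw"]) for k, v in rows.items()}
# Correlated genes that DTW also found, as a share of all correlated genes.
corr_found = {k: pct(v["overlap"], v["corr"]) for k, v in rows.items()}

print(dtw_only)    # {'G1': 87, 'S': 82, '8-oxo-dG': 84, 'Time': 41}
print(corr_found)  # {'G1': 38, 'S': 34, '8-oxo-dG': 43, 'Time': 68}

# All seven reported "not found by correlation" percentages, both blocks.
reported = [87, 82, 84, 41, 100, 100, 100]
print(round(sum(reported) / len(reported)))  # 85
```

Averaging the seven reported percentages gives about 85%, which is consistent with the figure quoted in the abstract. The source does not state how that average was computed, so this is only one plausible reading.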
Figure 3. Results from the simulated data for matched DTW.
Both panels show the results of the different levels of noise (low noise at top, high noise at bottom) and different amounts of time shift between the sequences (0 shift on left, 9 units shift on right). All time series were 20 units long. Panel A shows the percentage of elements which were (correctly) recognised as matches using an FDR of 5% and the matched DTW function. Panel B shows the percentage of matches where the matched pattern exactly matched the intended differences in the time courses.
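Panel A reports matches called "using an FDR of 5%". The caption does not say which FDR procedure was used, so the Python sketch below shows the Benjamini-Hochberg step-up procedure purely as an assumed, commonly used choice. The function name `benjamini_hochberg` and the p-values are hypothetical.

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Return a list of booleans: True where the test is called significant."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Step-up rule: find the largest rank k whose sorted p-value is <= (k / m) * fdr.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            cutoff_rank = rank
    # Everything ranked at or below the cutoff is rejected, even if it failed its own threshold.
    rejected = [False] * m
    for idx in order[:cutoff_rank]:
        rejected[idx] = True
    return rejected


# Hypothetical p-values for six simulated gene/endpoint pairs.
p_values = [0.04, 0.001, 0.9, 0.03, 0.041, 0.02]
rejected = benjamini_hochberg(p_values, fdr=0.05)

print(rejected)  # [True, True, False, True, True, True]
print(100 * sum(rejected) / len(rejected))  # percentage of pairs recognised as matches
```

In this example the fifth-smallest p-value (0.041) passes its threshold of 5/6 × 0.05, so the four smaller p-values are accepted with it even though three of them miss their own rank-specific thresholds. That behaviour is what distinguishes a step-up FDR procedure from a fixed per-test cutoff.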