Huanfei Ma, Siyang Leng, Kazuyuki Aihara, Wei Lin, Luonan Chen.
Abstract
Future state prediction for nonlinear dynamical systems is a challenging task, particularly when only a few time series samples of high-dimensional variables are available from real-world systems. In this work, we propose a model-free framework, named randomly distributed embedding (RDE), to achieve accurate future state prediction from short-term, high-dimensional data. Specifically, from the observed data of high-dimensional variables, the RDE framework randomly generates a sufficient number of low-dimensional "nondelay embeddings" and maps each of them to a "delay embedding," which is constructed from the data of a target variable to be predicted. Each of these mappings serves as a low-dimensional weak predictor of the future state, and together they generate a distribution of predicted future states. This distribution patches the pieces of association information carried by the various embeddings, whether biased or unbiased, into the whole dynamics of the target variable; processed with appropriate estimation strategies, it yields a stronger predictor that makes prediction more reliable and robust. By applying the RDE framework to data from both representative models and real-world systems, we show that high dimensionality is no longer an obstacle but rather a source of information crucial to accurate prediction from short-term data, even under noise deterioration.
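The ensemble scheme described in the abstract can be sketched in a few lines. This is a simplified illustration, not the authors' implementation: each weak predictor here is a plain linear least-squares map from a random variable subset (a "nondelay embedding") to the target's next value, whereas the paper fits nonlinear maps; the function name `rde_predict` and the median aggregation are illustrative choices.

```python
import numpy as np

def rde_predict(X, y, n_embeddings=200, embed_dim=3, rng=None):
    """Sketch of RDE-style one-step prediction.

    X : (T, D) array of high-dimensional observations.
    y : (T,) time series of the target variable.
    Returns the aggregated prediction of y at time T and the full
    distribution of weak predictions.
    """
    rng = np.random.default_rng(rng)
    T, D = X.shape
    preds = []
    for _ in range(n_embeddings):
        # A random "nondelay embedding": a small subset of the variables.
        idx = rng.choice(D, size=embed_dim, replace=False)
        A = np.column_stack([X[:-1, idx], np.ones(T - 1)])  # states at t = 0..T-2
        coef, *_ = np.linalg.lstsq(A, y[1:], rcond=None)    # weak map to y(t+1)
        x_now = np.append(X[-1, idx], 1.0)                  # current state (t = T-1)
        preds.append(x_now @ coef)                          # weak prediction of y(T)
    preds = np.asarray(preds)
    # Aggregate the prediction distribution; the median is one robust choice.
    return np.median(preds), preds
```

On a linear benchmark system every random subset admits an exact linear map, so the weak predictions cluster tightly; on nonlinear systems the spread of the distribution is what the estimation step must exploit.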
Keywords: high-dimensional data; nonlinear dynamics; prediction; short-term data; time series
Year: 2018 PMID: 30297422 PMCID: PMC6205453 DOI: 10.1073/pnas.1802987115
Source DB: PubMed Journal: Proc Natl Acad Sci U S A ISSN: 0027-8424 Impact factor: 11.205
Fig. 1.Sketch of embedding the original attractor in a high-dimensional space into a reconstructed attractor in a low-dimensional space.
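The reconstruction sketched in Fig. 1 is the classical delay-coordinate (Takens-style) embedding of a scalar series. A minimal sketch, assuming a uniform delay `tau` and embedding dimension `dim` (the helper name `delay_embed` is hypothetical, not from the paper):

```python
import numpy as np

def delay_embed(series, dim, tau):
    """Map a scalar series s(t) to delay vectors
    [s(t), s(t+tau), ..., s(t+(dim-1)*tau)], reconstructing the
    attractor in a dim-dimensional space."""
    s = np.asarray(series, dtype=float)
    n = len(s) - (dim - 1) * tau          # number of complete delay vectors
    if n <= 0:
        raise ValueError("series too short for this (dim, tau)")
    return np.column_stack([s[i * tau : i * tau + n] for i in range(dim)])
```

In the RDE framework such a delay embedding of the target variable is the common codomain that every randomly chosen nondelay embedding is mapped to.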
Fig. 2.Distribution of prediction errors by random embedding under different noise levels for a benchmark model of a linear system. (A–C) Probability distributions under different noise strengths.
Fig. 3.General principle of the RDE framework.
Fig. 4.Ten time points of one-step prediction for one variable in the benchmark model of a linear system. (A) The distribution of predictions by random embeddings, from which the final predicted values are obtained. (B) The original data and the predicted data, together with box plots of the distribution showing median values, upper and lower quartiles, bounds, and outliers.
Fig. 5.Validations with synthetic data. (A) Prediction for the nonlinear 90D coupled Lorenz system, where multistep prediction up to 30 steps for six components is shown. (B) For the coupled Lorenz system of 90 dimensions, the training data as well as the predicted data cover only small segments of the underlying attractor. Each circle represents a data point corresponding to the data in A; some data points are buried within the attractor. (C) Prediction for grids of a spiral pattern in an ISCAM model, where four selected samples of one-step prediction for the pattern are shown.
Fig. 6.Real-world data. (A) One-step prediction for five probes from the gene dataset. (B) A 1-h prediction for the wind speed in the Tokyo capital region based on data collected from five geographically local stations and delays up to 5 h. (C) A 1-d prediction for the daily cardiovascular disease admissions with trend, where the standard deviation (STD) is shown as a shaded area.
Fig. 7.Performance comparisons of different methods with different lengths of training data and different levels of noise. Two criteria are used to evaluate the prediction quality: the correlation and the rms error between the predicted series and the test data. (A and B) The length test based on 100 randomly chosen sections for each length of training data. (C and D) The noise test based on 100 independent trials. Here, the median, the upper quartile, and the lower quartile are shown. SNR, signal-to-noise ratio.