C J Moore, A J K Chua, C P L Berry, J R Gair.
Abstract
Gaussian process regression (GPR) is a non-parametric Bayesian technique for interpolating or fitting data. The main barrier to further uptake of this powerful tool rests in the computational costs associated with the matrices which arise when dealing with large datasets. Here, we derive some simple results which we have found useful for speeding up the learning stage in the GPR algorithm, and especially for performing Bayesian model comparison between different covariance functions. We apply our techniques to both synthetic and real data and quantify the speed-up relative to using nested sampling to numerically evaluate model evidences.Entities:
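To make the computational cost concrete, the quantity at the heart of the GPR learning stage is the log marginal likelihood (evidence), whose evaluation requires factorising an n×n covariance matrix at O(n³) cost. The sketch below is a generic illustration of that standard computation, not the paper's code; the squared-exponential kernel, hyperparameter values and toy data are all illustrative assumptions.

```python
import numpy as np

def sq_exp_kernel(x1, x2, sigma_f=1.0, length=1.0):
    """Squared-exponential covariance k(x, x') = sigma_f^2 exp(-(x-x')^2 / (2 l^2))."""
    d = x1[:, None] - x2[None, :]
    return sigma_f**2 * np.exp(-0.5 * (d / length) ** 2)

def log_marginal_likelihood(x, y, sigma_f, length, sigma_n):
    """GP log evidence log p(y | x, hyperparameters).

    Uses a Cholesky factorisation of K + sigma_n^2 I -- the O(n^3) step
    whose cost for large datasets motivates the speed-ups discussed here.
    """
    n = len(x)
    K = sq_exp_kernel(x, x, sigma_f, length) + sigma_n**2 * np.eye(n)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    return (-0.5 * y @ alpha
            - np.sum(np.log(np.diag(L)))
            - 0.5 * n * np.log(2 * np.pi))

# Toy data: noisy samples of a smooth function.
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50)
y = np.sin(x) + 0.1 * rng.standard_normal(50)
print(log_marginal_likelihood(x, y, sigma_f=1.0, length=1.0, sigma_n=0.1))
```

Model comparison then amounts to evaluating (or approximating) this evidence under each candidate covariance function and taking the ratio.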
Keywords: Gaussian processes; data analysis; inference; regression
Year: 2016 PMID: 27293793 PMCID: PMC4892455 DOI: 10.1098/rsos.160125
Source DB: PubMed Journal: R Soc Open Sci ISSN: 2054-5703 Impact factor: 2.963
Figure 1. Realizations of the GPs k1(t,t′) and k2(t,t′) from (3.1) and (3.2) for values of t=1,2,3,…,100 are shown in panels (a) and (b), respectively. The horizontal black lines indicate the length scales associated with the different terms in the covariance functions. The hyperparameters for k1 were chosen to be σ=1, ϕ0=3.5, ϕ1=1.5 and x1=0. The hyperparameters for k2 were chosen to be the same as for k1, with ϕ2=3 and x2=0. In both cases, the noise was fixed to σ=10^−2.
Table 1. A summary of the results of the analysis of synthetic data for three different-sized datasets. The first pair of columns is for a dataset drawn from the k2 covariance function and analysed with the k1 covariance function: the first column is the hyperevidence estimated using the Laplace approximation of equation (2.13), while the second is the numerically calculated hyperevidence. The second pair of columns shows results for the same data, but analysed with the k2 covariance function. The final pair of columns shows the log Bayes factor calculated using the approximate and numerical values of the hyperevidence.
| n | k1, Laplace | k1, numerical | k2, Laplace | k2, numerical | log Bayes factor, Laplace | log Bayes factor, numerical |
| --- | --- | --- | --- | --- | --- | --- |
| 30 | −17.77 | −17.87±0.08 | − | −17.73±0.09 | −1.05 | 0.14±0.12 |
| 100 | −20.17 | −20.17±0.10 | −19.22 | −19.22±0.11 | 0.95 | 0.95±0.15 |
| 300 | −49.94 | −50.12±0.11 | −40.21 | −40.36±0.13 | 9.73 | 9.76±0.17 |
Figure 2. The one- and two-dimensional marginalized posterior distributions on the hyperparameters of the k2 covariance function, obtained from the largest (n=300) synthetic dataset. The posterior is well approximated by a normal distribution. The black curves overlaid on the one-dimensional marginalized posteriors along the diagonal are the normal approximations obtained by using the techniques described in §2 to locate the posterior peak and evaluate the Hessian there. Using the Hessian to approximate the integral of this distribution (the hyperevidence) leads to an error of approximately 10% (table 1).
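The Hessian-based evidence estimate referenced in this caption is a Laplace approximation: once the posterior peak θ̂ is found, log Z ≈ log p(θ̂) + (d/2) log 2π − (1/2) log det H, where H is the Hessian of −log p at θ̂. A minimal generic sketch follows, assuming the peak has already been located; the function name and the central finite-difference Hessian are illustrative, not the paper's implementation (which computes these quantities analytically).

```python
import numpy as np

def laplace_log_evidence(log_post, theta_hat, eps=1e-5):
    """Laplace approximation to log Z = log ∫ exp(log_post(θ)) dθ,
    given the location theta_hat of the posterior peak:
        log Z ≈ log_post(θ̂) + (d/2) log 2π − (1/2) log det H,
    where H is the Hessian of −log_post at θ̂, estimated here by
    central finite differences.
    """
    d = len(theta_hat)
    H = np.empty((d, d))
    for i in range(d):
        for j in range(d):
            ei = np.zeros(d); ei[i] = eps
            ej = np.zeros(d); ej[j] = eps
            H[i, j] = -(log_post(theta_hat + ei + ej)
                        - log_post(theta_hat + ei - ej)
                        - log_post(theta_hat - ei + ej)
                        + log_post(theta_hat - ei - ej)) / (4 * eps**2)
    _, logdet = np.linalg.slogdet(H)
    return log_post(theta_hat) + 0.5 * d * np.log(2 * np.pi) - 0.5 * logdet

# Sanity check on a standard 2-D Gaussian, for which the Laplace
# approximation is exact: ∫ exp(−θᵀθ/2) dθ = 2π, so log Z = log 2π.
print(laplace_log_evidence(lambda t: -0.5 * t @ t, np.zeros(2)))  # ≈ 1.8379
```

For a posterior as close to Gaussian as the one in figure 2, this approximation is accurate at the ~10% level quoted in the caption while avoiding the cost of nested sampling.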
Figure 3. The main figure shows six lunar months of tidal height data (black), from which the lunar tidal cycle can be discerned. The inset plot shows several days of the tidal data (black points), from which the daily cycles can be clearly seen. Overlaid in the inset plot are both GP interpolants (blue), which are identical on this time scale.