Zhe Sun, Alexander T Archibald.
Abstract
Accurately simulating the geographical distribution and temporal variability of global surface ozone has long been one of the principal components of chemistry-climate modelling. However, simulation outcomes have been reported to vary significantly because of the complex mixture of uncertain factors that control the tropospheric ozone budget. Resolving the cross-model discrepancies to achieve higher-accuracy predictions of surface ozone is thus a priority, and methods that overcome structural biases in models, going beyond naïve averaging of model simulations, are urgently required. Building on the Coupled Model Intercomparison Project Phase 6 (CMIP6), we have adapted a conventional ensemble learning approach and also constructed an innovative 2-stage enhanced space-time Bayesian neural network to fuse an ensemble of 57 simulations together with a prescribed ozone dataset. Both approaches perform strongly (R² > 0.95, RMSE < 2.12 ppbv). The conventional ensemble learning approach is computationally cheaper and achieves higher overall performance, but at the expense of overestimating oceanic ozone and of an uninterpretable learning process. The Bayesian approach generalises better spatially and is more interpretable, but imposes a heavier computational burden. Both of these multi-stage machine learning-based approaches provide frameworks for improving the fidelity of composition-climate model outputs for use in future impact studies.
Keywords: CCM; CMIP6; Data fusion; Model ensemble; Space-time Bayesian neural network; Surface ozone
Year: 2021 PMID: 36156995 PMCID: PMC9488062 DOI: 10.1016/j.ese.2021.100124
Source DB: PubMed Journal: Environ Sci Ecotechnol ISSN: 2666-4984
Summary of the institutes and models participating in the CMIP6 historical experiments, with their chemistry schemes, spatial grids, and experiment realisation, physics, and forcing settings. The names of institutes and coupled earth system models are abbreviated. The three-dimensional spatial resolutions are given as longitudinal × latitudinal × vertical grids. The tropospheric and stratospheric chemistry schemes are denoted as interactive (I), prescribed (P) or none (N) in the “Trop” and “Strat” columns. The realisation, physics and forcing indices identify ensemble experiment members. The “Fusion” column indicates whether the simulation experiments are included in the multi-model fusion. Full names of the CMIP6 participant research institutes are listed in the Appendix.
| Institute | Model | Trop | Strat | Grids | Realisations | Physics | Forcing | Fusion | Refs |
|---|---|---|---|---|---|---|---|---|---|
| | | P# | P | 192 × 96 × 47 | | | | | [ |
| | | I | P | 128 × 64 × 26 | | | | ✓ | [ |
| | | P | P | 320 × 160 × 19 | | | | | [ |
| | | N | I | 256 × 128 × 91 | | | | | [ |
| | | N | I | 256 × 128 × 91 | | | | | [ |
| | | I | P | 192 × 96 × 47 | | | | ✓ | [ |
| | | P | P | 144 × 143 × 79 | | | | | [ |
| | | I | I | 192 × 144 × 85 | | | | ✓ | [ |
| | | I | I | 192 × 144 × 85 | | | | ✓ | |
| | | I | I | 192 × 144 × 85 | | | | ✓ | |
| | | P | P | 384 × 192 × 95 | | | | | [ |
| | | I | I | 128 × 64 × 80 | | | | ✓ | [ |
| | | I | I | 144 × 90 × 40 | | | | ✓ | [ |
| | | I | I | 144 × 90 × 40 | | | | ✓ | |
| | | I | I | 144 × 90 × 40 | | | | ✓ | |
| | | I | I | 144 × 90 × 40 | | | | ✓ | |
| | | I | I | 288 × 192 × 70 | | | | ✓ | [ |
| | | I | P | 288 × 192 × 32 | | | | ✓ | [ |
| | | I | I | 192 × 144 × 85 | | | | ✓ | [ |
| | | I | I | 288 × 180 × 49 | | | | ✓ | [ |
∥ Each institute's earth system model is distinct, but several coincidentally share the name ESM followed by a version number. To distinguish them, models are referred to as institute + model name hereafter in this paper (e.g. CNRM-ESM2.1 is not an updated version of BCC-ESM1, but a new version of CNRM-ESM1) [110].
# AWI-ESM, BCC-CSM2, IPSL-CM6A, and MPI-M-ESM1.2-HR use the same prescribed ozone for the whole earth system modelling instead of simulating ozone, so the surface ozone concentrations reported by these 4 models are essentially the same. The single prescribed ozone dataset (input4MIPs) [63] is therefore used in place of the 4 models to avoid duplication.
⊥ All realisations of the climate equilibrium start from 1850 and are therefore marked with the same initialisation index, i1. The ensemble experiment variant labels are defined by a combination of realisation, initialisation, physics, and forcing indices, e.g. r1i1p1f1.
∗ The 2 CNRM models are not considered for surface ozone multi-model fusion as they do not include a tropospheric ozone module.
§ Full name HAMMOZ-Consortium, marked as HAM in the model name.
† MOHC, MO-NERC and NIMS-KMA ran the same UKESM1 model with the same configuration but contributed different ensemble experiments, and are therefore referred to collectively as UKESM1-0-LL hereafter in this paper.
‡ NCC ran NorESM at two different coupling resolutions: low atmospheric-medium ocean resolution (LM) and medium atmospheric-medium ocean resolution (MM). To achieve higher performance in multi-model fusion and to avoid duplication, only the higher spatial-resolution simulation, MM, is considered.
Fig. 1. Schematic diagram of machine learning-based multi-model fusion by the aggressive and conservative approaches. Stacked source-data layers denote collections of datasets at the same level of model training; ellipses indicate elemental machine learning methods; and rectangles represent the raw outputs of the machine learning treatments. A total of 57 physical model simulations and 1 prescribed O3 concentration dataset (input4MIPs) are considered.
Abbreviations: RFR, random forest regression; GBR, gradient boosting decision tree regression; CNN, convolutional neural network regression; SFP, semi-final product; BNN, Bayesian neural network regression. Denotations in the diagram cover the re-scaling factor, systematic bias corrector, individual model weight, bias corrector, physical model identifier, location index, temporal index, and random noise.
Fig. 2. Model-observation evaluation of the raw CMIP6 surface ozone simulation ensemble and of the multi-model fusion by both the aggressive and conservative approaches. a–c: Simulation-observation synchronicity and absolute and relative biases for the 57 + 1 CMIP6 simulation ensemble. Model evaluations are conducted at sites covered by TOAR observations across 1990–2014. d–g: Evaluations of aggressively and conservatively integrated surface ozone concentrations in terms of overall model-observation synchronicity and bias. h–i: Historical global surface ozone concentrations assimilated from the multi-model ensemble and TOAR observations by the aggressive and conservative approaches. The 25-year average surface ozone concentrations during 1990–2014 are mapped as a summary. All spatial resolutions are set to 2° × 2°, and the temporal resolution is monthly.
Evaluation summary of the aggressive and conservative multi-model fusion for surface ozone. The model evaluation metrics include the coefficient of determination (R²) for cross-validation (CV), the test set and the full dataset, the root mean squared error (RMSE), the normalised mean bias (NMB), and the linear regression slope (k) and intercept (b). Both statistical models are evaluated separately for each 5-year period, season and continent to assess their spatiotemporal performance.
| | Aggressive CV R² | Aggressive test R² | Aggressive overall R² | Aggressive RMSE (ppbv) | Aggressive NMB | Aggressive k | Aggressive b | Conservative CV R² | Conservative test R² | Conservative overall R² | Conservative RMSE (ppbv) | Conservative NMB | Conservative k | Conservative b |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1990–1994 | 0.91 | 0.90 | 0.94 | 2.00 | 3.41 | 1.11 | −1.62 | 0.92 | 0.91 | 0.93 | 2.00 | 0.02 | 0.98 | 0.59 |
| 1995–1999 | 0.90 | 0.90 | 0.94 | 1.74 | 1.71 | 1.09 | −1.26 | 0.92 | 0.91 | 0.92 | 2.10 | 0.84 | 0.97 | 0.66 |
| 2000–2004 | 0.91 | 0.91 | 0.95 | 1.71 | 0.88 | 1.09 | −1.16 | 0.91 | 0.91 | 0.93 | 2.28 | 0.71 | 0.97 | 0.95 |
| 2005–2009 | 0.91 | 0.91 | 0.96 | 1.68 | 1.11 | 1.09 | −1.17 | 0.91 | 0.91 | 0.91 | 2.22 | 0.83 | 0.97 | 0.82 |
| 2010–2014 | 0.94 | 0.93 | 0.96 | 1.71 | 0.88 | 1.09 | −1.16 | 0.92 | 0.91 | 0.94 | 2.28 | 0.71 | 0.97 | 0.95 |
| Europe | 0.91 | 0.91 | 0.94 | 1.94 | 2.40 | 1.12 | −1.61 | 0.92 | 0.91 | 0.92 | 2.02 | 1.27 | 0.98 | 0.37 |
| North America | 0.93 | 0.93 | 0.96 | 1.61 | 1.27 | 1.08 | −1.19 | 0.91 | 0.91 | 0.93 | 1.96 | −0.04 | 0.97 | 0.94 |
| South America | 0.90 | 0.87 | 0.95 | 1.22 | 3.12 | 1.10 | −0.89 | 0.83 | 0.81 | 0.83 | 2.55 | 3.06 | 0.92 | 1.51 |
| Asia | 0.92 | 0.92 | 0.95 | 2.14 | 4.03 | 1.12 | −1.65 | 0.90 | 0.90 | 0.92 | 2.96 | 1.85 | 0.96 | 0.90 |
| Africa | 0.90 | 0.86 | 0.90 | 2.13 | 2.82 | 1.19 | −2.33 | 0.82 | 0.80 | 0.84 | 3.69 | −3.81 | 0.93 | 2.88 |
| Oceania | 0.94 | 0.91 | 0.96 | 0.91 | 0.68 | 1.08 | −0.78 | 0.83 | 0.81 | 0.84 | 2.13 | −1.05 | 0.88 | 2.65 |
| March–May | 0.93 | 0.90 | 0.97 | 1.91 | 0.84 | 1.13 | −0.65 | 0.94 | 0.91 | 0.96 | 2.06 | 0.89 | 0.99 | 0.97 |
| June–August | 0.94 | 0.92 | 0.98 | 1.78 | 1.12 | 1.09 | −0.86 | 0.94 | 0.92 | 0.95 | 2.14 | 0.74 | 0.97 | 0.75 |
| September–November | 0.93 | 0.89 | 0.98 | 1.75 | 3.09 | 1.12 | −0.57 | 0.93 | 0.90 | 0.95 | 2.07 | 0.10 | 0.98 | 0.69 |
| December–February | 0.93 | 0.90 | 0.98 | 1.80 | 3.05 | 1.14 | −0.60 | 0.93 | 0.90 | 0.95 | 2.19 | 0.54 | 0.98 | 0.51 |
| Overall | 0.94 | 0.89 | 0.96 | 1.81 | 2.01 | 1.05 | −1.35 | 0.90 | 0.88 | 0.95 | 2.12 | 0.57 | 0.97 | 0.71 |
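The evaluation metrics named in the caption above can be illustrated with a minimal sketch. It rests on assumptions that the source does not confirm: R² is computed as 1 − SS_res/SS_tot (the paper may instead use the squared correlation), NMB is expressed in percent, and the slope k and intercept b come from regressing predictions on observations. The function name `evaluate` and the sample values are hypothetical.

```python
import math

def evaluate(obs, pred):
    """Return R2, RMSE, NMB (percent, assumed), slope k and intercept b.

    Assumption: k and b come from an ordinary least-squares fit of
    predictions against observations (pred = k * obs + b).
    """
    n = len(obs)
    mean_o = sum(obs) / n
    mean_p = sum(pred) / n
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))  # residual sum of squares
    ss_tot = sum((o - mean_o) ** 2 for o in obs)           # total sum of squares
    sxy = sum((o - mean_o) * (p - mean_p) for o, p in zip(obs, pred))
    k = sxy / ss_tot
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "rmse": math.sqrt(ss_res / n),
        "nmb": 100.0 * sum(p - o for o, p in zip(obs, pred)) / sum(obs),
        "k": k,
        "b": mean_p - k * mean_o,
    }

# Hypothetical monthly-mean ozone values (ppbv), not taken from the paper.
metrics = evaluate([20.0, 30.0, 40.0, 50.0], [22.0, 31.0, 43.0, 54.0])
print(metrics)
```

Applying such a function separately to each period, continent or season subset would give one row of a table like the one above.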
Fig. 3. Spatiotemporal variability parametrisation of the surface ozone concentrations assimilated from the CMIP6 multi-model ensemble during 1990–2014 by the conservative approach. The ensemble-learning predicted concentrations are clustered by month. a: Quality of the Fourier-series curve fit to the grid-specific surface ozone time series, quantified by R². b: Annual rate of increase of the yearly average surface ozone concentrations, estimated by exp(12a₁) − 1. c: Annual average intra-year variation amplitude, taken as the peak-valley gap and estimated by 2b₀. d: Annual average linear change rate of the intra-year variation amplitude, estimated by 24b₁. e–f: Average annual change rates of the peak and valley concentrations, deduced from the fitted second-order Fourier-series function.
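The three closed-form expressions quoted in the Fig. 3 caption can be evaluated directly once the coefficients are known. The sketch below only performs that arithmetic. The full fitted function is not given in this excerpt, so the reading that a1 and b1 are per-month coefficients (hence the factors of 12 and 24) is an assumption, and the function name `summarise_trend` and the coefficient values are hypothetical.

```python
import math

def summarise_trend(a1, b0, b1):
    """Evaluate the caption's expressions for one grid cell.

    a1, b0, b1 are fitted coefficients; their exact role in the
    Fourier-series function is assumed, not taken from the source.
    """
    return {
        # exp(12 * a1) - 1, computed with expm1 for accuracy at small a1
        "annual_increase_ratio": math.expm1(12 * a1),
        # peak-valley gap of the intra-year cycle: 2 * b0
        "peak_valley_gap": 2 * b0,
        # linear change rate of that gap per year: 24 * b1
        "gap_change_per_year": 24 * b1,
    }

# Hypothetical coefficients for illustration only.
example = summarise_trend(a1=0.001, b0=3.0, b1=0.01)
print(example)
```

For small a1 the annual increase ratio is close to 12·a1, which is why a per-month log-scale slope translates almost linearly into a yearly percentage change.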