Yujie Wang, Philipp Köhler, Renato K Braghiere, Marcos Longo, Russell Doughty, A Anthony Bloom, Christian Frankenberg.
Abstract
Land and Earth system modeling is moving towards more explicit biophysical representations, which require an increasing variety of datasets for initialization and benchmarking. However, researchers often have difficulty identifying and integrating non-standardized datasets from various sources. We aim to provide a standardized database and a one-stop distribution method for global datasets. Here, we present GriddingMachine as (1) a database of global-scale datasets commonly used to parameterize or benchmark the models, from plant traits to vegetation indices and geophysical information, and (2) cross-platform open-source software to download datasets or request a subset of them with only a few lines of code. The GriddingMachine datasets can be accessed either manually through traditional HTTP, or automatically using modern programming languages including Julia, Matlab, Octave, Python, and R. The GriddingMachine collections can be used for any land and Earth modeling framework and for ecological research at the regional and global scales, and the number of datasets will continue to grow to meet the increasing needs of research communities.
Year: 2022 PMID: 35650204 PMCID: PMC9160223 DOI: 10.1038/s41597-022-01346-x
Source DB: PubMed Journal: Sci Data ISSN: 2052-4463 Impact factor: 8.501
Fig. 1 Pathway used to assemble and distribute the GriddingMachine database. Each dataset is (a) reprocessed to meet our standards for distribution, (b) compressed as a tar.gz file along with an empty label file GRIDDINGMACHINE, and (c) stored on publicly available HTTP servers. The meta information for each dataset is then stored in Artifacts.toml, which includes the tag name, SHA-1 hash, SHA-256 hash, and download URLs. Users can access the datasets manually via HTTP or automatically through Julia, Matlab, Octave, Python, and R functions aided by Artifacts.toml.
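The packaging and verification steps in Fig. 1 can be illustrated with a minimal Python sketch that uses only the standard library. It is not the GriddingMachine implementation: the function names `make_archive`, `sha256_hex`, and `verify_archive` and the file name `example.nc` are hypothetical, and the sketch only assumes that a client compares a downloaded tar.gz against the SHA-256 value recorded for it before unpacking.

```python
import hashlib
import io
import tarfile


def make_archive(files):
    """Pack {name: bytes} into an in-memory tar.gz.

    An empty label file named GRIDDINGMACHINE is added, as in step (b) of Fig. 1.
    """
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as tar:
        for name, payload in {**files, "GRIDDINGMACHINE": b""}.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))
    return buffer.getvalue()


def sha256_hex(blob):
    """Return the hexadecimal SHA-256 digest of a byte string."""
    return hashlib.sha256(blob).hexdigest()


def verify_archive(blob, expected_sha256):
    """Return the sorted member names if the checksum matches; raise otherwise."""
    if sha256_hex(blob) != expected_sha256:
        raise ValueError("sha256 mismatch: archive is corrupted or outdated")
    with tarfile.open(fileobj=io.BytesIO(blob), mode="r:gz") as tar:
        return sorted(tar.getnames())


# Hypothetical usage: build an archive, record its hash, then verify it.
archive = make_archive({"example.nc": b"placeholder bytes"})
recorded_hash = sha256_hex(archive)
members = verify_archive(archive, recorded_hash)
```

Checking the hash before opening the archive means a truncated or stale download is rejected before any file is read.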
Datasets within GriddingMachine collections.
| Dataset type | LABEL | EXTRALABEL | IX | JT | YEAR | VK | Reference |
|---|---|---|---|---|---|---|---|
| Biomass | BIOMASS | ROOT | 120X | 1Y | — | V1 | [ |
| Biomass | BIOMASS | SHOOT | 120X | 1Y | — | V2 | [ |
| Canopy height | CH | — | 20X | 1Y | — | V1 | [ |
| Canopy height | CH | — | 2X | 1Y | — | V2 | [ |
| Clumping index | CI | — | 2X, 240X | 1Y | — | V1 | [ |
| Clumping index | CI | — | 2X | 1Y | — | V2 | [ |
| Elevation | ELEV | — | 4X | 1Y | — | V1 | [ |
| Gross primary productivity | GPP | MPI_RS | 2X | 1M, 8D | 2001–2019 | V1 | [ |
| Gross primary productivity | GPP | VPM | 5X, 12X | 8D | 2000–2019 | V2 | [ |
| Leaf area index | LAI | MODIS | 2X, 10X, 20X | 1M, 8D | 2000–2020 | V1 | [ |
| Land mask | LM | — | 4X | 1Y | — | V1 | ERA5 |
| Leaf nitrogen content | LNC | — | 2X | 1Y | — | V1 | [ |
| Leaf nitrogen content | LNC | — | 2X | 1Y | — | V2 | [ |
| Leaf phosphorus content | LPC | — | 2X | 1Y | — | V1 | [ |
| Plant functional type | PFT | — | 2X | 1Y | — | V1 | [ |
| Surface area | SA | — | 1X, 2X | 1Y | — | V1 | [ |
| Soil color class | SC | — | 2X | 1Y | — | V1 | [ |
| Solar-induced chlorophyll fluorescence | SIF | TROPOMI_740 | 1X, 2X, 4X, 5X, 12X | 1M, 8D | 2018–2020 | V1 | [ |
| Solar-induced chlorophyll fluorescence | SIF | TROPOMI_740DC | 1X, 2X, 4X, 5X, 12X | 1M, 8D | 2018–2020 | V1 | [ |
| Solar-induced chlorophyll fluorescence | SIF | TROPOMI_683 | 1X, 2X, 4X, 5X, 12X | 1M, 8D | 2018–2020 | V2 | [ |
| Solar-induced chlorophyll fluorescence | SIF | TROPOMI_683DC | 1X, 2X, 4X, 5X, 12X | 1M, 8D | 2018–2020 | V2 | [ |
| Solar-induced chlorophyll fluorescence | SIF | OCO2_757 | 5X | 1M | 2014–2020 | V3 | [ |
| Solar-induced chlorophyll fluorescence | SIF | OCO2_757DC | 5X | 1M | 2014–2020 | V3 | [ |
| Solar-induced chlorophyll fluorescence | SIF | OCO2_771 | 5X | 1M | 2014–2020 | V3 | [ |
| Solar-induced chlorophyll fluorescence | SIF | OCO2_771DC | 5X | 1M | 2014–2020 | V3 | [ |
| Solar-induced luminescence | SIL | — | 20X | 1Y | — | V1 | [ |
| Specific leaf area | SLA | — | 2X | 1Y | — | V1 | [ |
| Specific leaf area | SLA | — | 2X | 1Y | — | V2 | [ |
| Soil hydraulics | SOIL | SWCR | 12X, 120X | 1Y | — | V1 | [ |
| Soil hydraulics | SOIL | SWCS | 12X, 120X | 1Y | — | V1 | [ |
| Soil hydraulics | SOIL | VGA | 12X, 120X | 1Y | — | V1 | [ |
| Soil hydraulics | SOIL | VGN | 12X, 120X | 1Y | — | V1 | [ |
| Soil hydraulics | SOIL | KSAT | 100X | 1Y | — | V2 | [ |
| Tree density | TD | — | 2X, 120X | 1Y | — | V1 | [ |
| Maximum carboxylation rate | VCMAX | — | 2X | 1Y | — | V1 | [ |
| Maximum carboxylation rate | VCMAX | — | 2X | 1Y | — | V2 | [ |
| Wood density | WD | — | 2X | 1Y | — | V1 | [ |
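The column headers of the table above read like the pieces of a dataset tag name. As a sketch only, and assuming (this is not stated in the text here) that a tag is formed by joining LABEL, the optional EXTRALABEL, IX, JT, the optional YEAR, and VK with underscores, a tag could be composed as follows. The helper `build_tag` is hypothetical and is not part of any GriddingMachine package.

```python
def build_tag(label, spatial, temporal, version, extra_label=None, year=None):
    """Join the table columns into one tag string.

    Assumed order: LABEL[_EXTRALABEL]_IX_JT[_YEAR]_VK. The optional parts
    are skipped when the table shows a dash for them.
    """
    parts = [label]
    if extra_label:
        parts.append(extra_label)
    parts += [spatial, temporal]
    if year is not None:
        parts.append(str(year))
    parts.append(version)
    return "_".join(parts)


# Rows taken from the table: MPI_RS GPP (2X, monthly, one year in the
# listed range) and the 20X canopy height dataset.
gpp_tag = build_tag("GPP", "2X", "1M", "V1", extra_label="MPI_RS", year=2019)
ch_tag = build_tag("CH", "20X", "1Y", "V1")
```

Under this assumption each combination of resolution, year, and version listed in a row maps to its own tag, which is why a single row can stand for many distributable datasets.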
Fig. 2 Framework of the GriddingMachine.jl package (v0.2). GriddingMachine.jl contains three sub-modules: Collector, Indexer, and Requestor. Collector downloads and manages the datasets; Indexer reads the downloaded datasets; and Requestor requests a subset of data directly from the server without downloading the datasets.
Fig. 3 Example of simple dataset requests using GriddingMachine. The requested data are gross primary productivity (GPP) from [27] (MPI GPP, olive symbols) and [28] (VPM GPP, cyan symbols), and day-length-corrected solar-induced chlorophyll fluorescence (dcSIF) from [18] (TROPOMI SIF, red symbols).
Fields and attributes of the reprocessed NetCDF datasets.
| Field | Dimension | Description | Attributes |
|---|---|---|---|
| lon | 1D array | Longitude at the center of a grid cell | unit: ° |
| | | | description: Longitude |
| lat | 1D array | Latitude at the center of a grid cell | unit: ° |
| | | | description: Latitude |
| ind | 1D array | Cycle index (only available in 3D datasets) | unit: - |
| | | | description: Cycle index |
| data | 2D/3D array | Data at the center of a grid cell | longname: long name of the data |
| | | | unit: unit of the data |
| | | | about: general information |
| | | | authors: authors of the dataset source publication |
| | | | year: year of the data source publication |
| | | | title: title of the data source publication |
| | | | journal: journal of the data source publication |
| | | | doi: DOI tag of the data source publication |
| | | | changeN: Change log of the |
| std | 2D/3D array | Error of the data at the center of a grid cell | same as “data” field |
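Because `lon` and `lat` hold grid-cell centers, locating the cell for a given point reduces to a nearest-center search along each axis. The sketch below assumes a global grid spanning -180° to 180° in longitude and -90° to 90° in latitude, and reads a spatial label such as 2X as two cells per degree. Both points are assumptions made for illustration, as are the helper names `cell_centers` and `nearest_index`.

```python
def cell_centers(start, stop, cells_per_degree):
    """Return the centers of equally sized cells covering [start, stop]."""
    step = 1.0 / cells_per_degree
    count = round((stop - start) * cells_per_degree)
    return [start + (i + 0.5) * step for i in range(count)]


def nearest_index(centers, value):
    """Return the index of the cell center closest to value."""
    return min(range(len(centers)), key=lambda i: abs(centers[i] - value))


# Assumed 2X grid: 0.5 degree cells, so 720 longitudes and 360 latitudes.
lon = cell_centers(-180.0, 180.0, 2)
lat = cell_centers(-90.0, 90.0, 2)

# Hypothetical query point at 34.2 N, 118.1 W.
i_lon = nearest_index(lon, -118.1)
i_lat = nearest_index(lat, 34.2)
```

With these indices, a 2D `data` array would be read at one element, and a 3D array would return one value per cycle index in `ind`. The order of the array dimensions is not given here, so it would have to be checked against the actual files.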