Ruben Remelgado1,2, Sherzod Zaitov3, Shavkat Kenjabaev3, Galina Stulina3, Murod Sultanov4, Mirzakhayot Ibrakhimov4, Mustakim Akhmedov5, Victor Dukhovny3, Christopher Conrad6,7.
Abstract
Land cover is a key variable in the context of climate change. In particular, crop type information is essential to understand the spatial distribution of water usage and to anticipate the risk of water scarcity and the consequent danger of food insecurity. This applies to arid regions such as the Aral Sea Basin (ASB), Central Asia, where agriculture relies heavily on irrigation. Here, remote sensing is valuable for mapping crop types, but the quality of such maps depends on consistent ground-truth data. Yet, in the ASB, such data are missing. To address this issue, we collected thousands of polygons on crop types, 97.7% of which are in Uzbekistan and the remainder in Tajikistan. We collected 8,196 samples between 2015 and 2018, 213 in 2011 and 26 in 2008. Our data compile samples for 40 crop types and are dominated by "cotton" (40%) and "wheat" (25%). These data were meticulously validated using expert knowledge and remote sensing data and relied on transferable, open-source workflows that will assure the consistency of future sampling campaigns.
Year: 2020 PMID: 32724036 PMCID: PMC7387449 DOI: 10.1038/s41597-020-00591-2
Source DB: PubMed Journal: Sci Data ISSN: 2052-4463 Impact factor: 6.444
Fig. 1 Spatial and phenological distribution of samples. Location of the study region (a), described by the maximum extent of the collected ground-truth data (b). Panel (c) is highlighted in blue in panel (b) and exemplifies the spatial detail of these data within Fergana, Uzbekistan, collected on a field-by-field basis. These samples depict different crop types, which compose distinct phenological class groups (d). “Winter” crops are solely represented by winter wheat, while “summer” combines “cotton”, “rice” and “maize”, and “permanent” combines “orchards”, “vineyards” and “alfalfa”. “Fallow” is described by samples from unused agricultural land, lacking a clear phenological peak. Crop types that lacked a characteristic phenological behaviour (e.g. potatoes, onions) were labelled as “unclear”. “Double” corresponds to “winter” crops that were followed by a “summer” crop after the first harvest.
Fig. 2 Comparison of field samples. Image (a) shows a typical example of a communitarian urban garden. The land parcel is divided between several individuals and used to plant multiple types of crops, including fruit trees. Such cases are hard to interpret with remote sensing due to mixed pixel effects. Image (b) depicts a field where crops share the managed areas with buildings. Excluding the building is essential to accurately depict the spectral signature of the cultivated area. Image (c) shows an ideal case, where the field was cultivated with a single crop with no additional structures within it. Due to their homogeneity, such fields help discriminate crop specific spectral signatures.
Fig. 3 Spatial homogeneity of field samples. Translation of a field polygon (left) into centroid coordinates (centre), where each point reports on percent overlap between the respective pixel and the polygon (right). The overlap is controlled by the pixel resolution. The higher the resolution, the higher the proportion of pixels with a high homogeneity.
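The polygon-to-pixel translation described in this caption can be sketched as follows. This is a minimal illustration, not the authors' workflow: it assumes a square pixel grid aligned to the coordinate origin and a simple (non-self-intersecting) polygon in projected coordinates, and it clips the polygon against each pixel with the Sutherland-Hodgman algorithm. The function names (`overlap_grid`, `pixel_overlap`) are hypothetical.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def shoelace_area(poly: List[Point]) -> float:
    """Absolute area of a polygon given as an ordered vertex list."""
    if len(poly) < 3:
        return 0.0
    total = 0.0
    for i in range(len(poly)):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % len(poly)]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def clip_half_plane(poly: List[Point], axis: int, bound: float,
                    keep_greater: bool) -> List[Point]:
    """One Sutherland-Hodgman step: keep the part of poly on one side
    of the line x = bound (axis 0) or y = bound (axis 1)."""
    def inside(p: Point) -> bool:
        return p[axis] >= bound if keep_greater else p[axis] <= bound

    def intersect(p: Point, q: Point) -> Point:
        t = (bound - p[axis]) / (q[axis] - p[axis])
        return (p[0] + t * (q[0] - p[0]), p[1] + t * (q[1] - p[1]))

    out: List[Point] = []
    for i in range(len(poly)):
        cur, prev = poly[i], poly[i - 1]
        if inside(cur):
            if not inside(prev):
                out.append(intersect(prev, cur))
            out.append(cur)
        elif inside(prev):
            out.append(intersect(prev, cur))
    return out


def pixel_overlap(poly: List[Point], x0: float, y0: float, size: float) -> float:
    """Percent of the pixel [x0, x0+size] x [y0, y0+size] covered by poly."""
    clipped = poly
    for axis, bound, keep_greater in ((0, x0, True), (0, x0 + size, False),
                                      (1, y0, True), (1, y0 + size, False)):
        clipped = clip_half_plane(clipped, axis, bound, keep_greater)
        if not clipped:
            return 0.0
    return 100.0 * shoelace_area(clipped) / (size * size)


def overlap_grid(poly: List[Point], size: float) -> Dict[Point, float]:
    """Map each pixel centroid to its percent overlap with poly,
    keeping only pixels that the polygon actually covers."""
    xs = [p[0] for p in poly]
    ys = [p[1] for p in poly]
    result: Dict[Point, float] = {}
    for col in range(math.floor(min(xs) / size), math.ceil(max(xs) / size)):
        for row in range(math.floor(min(ys) / size), math.ceil(max(ys) / size)):
            pct = pixel_overlap(poly, col * size, row * size, size)
            if pct > 0.0:
                result[(col * size + size / 2, row * size + size / 2)] = pct
    return result
```

For a 60 × 30 m rectangular field offset by half a pixel, `overlap_grid(field, 30.0)` yields only partially covered pixels, whereas `overlap_grid(field, 15.0)` yields fully covered ones, which mirrors the caption's point that finer resolution increases the share of homogeneous pixels.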
Fig. 4 Comparison of phenological classes. Comparison of profiles for different phenological class groups. This information is used as a reference to re-classify crop types based on their respective NDVI profiles. “Winter” includes only winter wheat, which is planted in the last months of the year. “Summer” includes e.g. cotton and maize, which are planted at the beginning of the year and usually harvested in mid- to late-summer. “Double” describes the existence of crop rotation (i.e. two phenological peaks) while “Permanent” depicts persistent agricultural practices, such as orchards.
Attributes of “CAWa_cropType_samples.shp”.
| Column | Format | Description |
|---|---|---|
| sampler | Character | Institution leading the field campaign |
| country | Character | Country of sampling |
| region | Character | Geographic region of sampling |
| date | Numeric | Sampling date (format: yyyy-mm-dd) |
| year | Character | Sampling year |
| label_1 | Character | Crop type; double cropping is labeled as CROP1-CROP2 (e.g. “wheat-other”) |
| label_2 | Numeric | Phenological class |
| area | Character | Polygon area in m² |
Description of the ESRI shapefile containing polygons of field samples on crop types.
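As a small illustration of the `label_1` convention, the sketch below splits a label into its crop components and flags double cropping. It assumes that a hyphen appears only as the separator between the two crops, which the table implies but does not guarantee; the helper name `split_crop_label` is hypothetical.

```python
from typing import List, Tuple


def split_crop_label(label: str) -> Tuple[List[str], bool]:
    """Split a label_1 value into crop names and report double cropping.

    Assumption: "-" is used only to join CROP1 and CROP2.
    """
    crops = [part.strip() for part in label.split("-") if part.strip()]
    return crops, len(crops) > 1
```

For example, `split_crop_label("wheat-other")` returns the two crops together with `True`, while a single-crop label such as `"cotton"` returns one crop and `False`.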
Attributes of “CAWa_cropType_polygon-Info.csv”.
| Column | Format | Description |
|---|---|---|
| min.cover | Numeric | Minimum percent overlap between the polygon and the underlying 30 × 30 m Landsat pixels |
| max.cover | Numeric | Maximum percent overlap between the polygon and the underlying 30 × 30 m Landsat pixels |
| mean.cover | Numeric | Mean percent overlap between the polygon and the underlying 30 × 30 m Landsat pixels |
| count | Numeric | Number of 30 × 30 m pixels in a polygon |
These data describe the pixel homogeneity of each polygon in the sample dataset, based on 30 × 30 m satellite data.
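The four attributes above can be derived from the per-pixel overlap values of a polygon, as in the following sketch. It assumes the overlaps are already available as a list of percentages, and it counts every listed pixel; whether the published `count` includes partially covered pixels is not stated here, so that choice is an assumption. The names `cover_summary` and `is_homogeneous` and the threshold are hypothetical.

```python
from statistics import mean
from typing import Dict, Sequence


def cover_summary(overlaps: Sequence[float]) -> Dict[str, float]:
    """Summarise per-pixel percent overlaps for one polygon."""
    if not overlaps:
        raise ValueError("a polygon needs at least one overlapping pixel")
    return {
        "min.cover": min(overlaps),
        "max.cover": max(overlaps),
        "mean.cover": mean(overlaps),
        "count": len(overlaps),
    }


def is_homogeneous(summary: Dict[str, float], threshold: float = 90.0) -> bool:
    """Illustrative filter: keep polygons whose mean overlap is high.

    The 90% default is an arbitrary example value, not a recommendation
    from the dataset authors.
    """
    return summary["mean.cover"] >= threshold
```

A filter of this kind is one way a user might select polygons that are less affected by mixed pixels before training a classifier.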
Content of “CAWa_cropType_time-series.csv”.
| Column | Format | Description |
|---|---|---|
| 1, …, 353 | Numeric | NDVI for a given day of year (given by the column name) |
These data contain the weighted-mean NDVI time series used to validate the crop-type labels initially assigned to each sample during the respective field surveys. Each row corresponds to a unique sample, and the 23 columns provide equidistant NDVI values at a 16-day interval. The name of each column gives the day of the year to which the respective NDVI value refers.
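The column layout follows directly from the 16-day interval: starting at day 1, 23 steps end at day 353. The sketch below reproduces these column names and shows one plausible form of a weighted mean over the pixels of a polygon. The weighting scheme is an assumption: the text only says "weighted-mean", and using the percent overlap of each pixel as its weight is a guess made for illustration. The name `weighted_mean_ndvi` is hypothetical.

```python
from typing import List, Sequence

# Day-of-year column names: 1, 17, 33, ..., 353 (23 values, 16 days apart).
DOY_COLUMNS: List[int] = list(range(1, 354, 16))


def weighted_mean_ndvi(pixel_series: Sequence[Sequence[float]],
                       weights: Sequence[float]) -> List[float]:
    """Weighted mean of several per-pixel NDVI series, date by date.

    pixel_series: one NDVI series per pixel, all of equal length.
    weights: one non-negative weight per pixel (assumed here to be the
             percent overlap between pixel and polygon).
    """
    if len(pixel_series) != len(weights):
        raise ValueError("need exactly one weight per pixel series")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    n_dates = len(pixel_series[0])
    return [
        sum(series[t] * w for series, w in zip(pixel_series, weights)) / total
        for t in range(n_dates)
    ]
```

With overlap-based weights, pixels that lie mostly outside the field contribute little to the profile, which is the usual motivation for weighting rather than taking a plain mean.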
Content of “CAWa_CropType_samples.rds”.
| List element name | R Class | Description |
|---|---|---|
| samples | SpatialPolygonsDataFrame | Content described in Table |
| Info | data.frame | Content described in Table |
| ndvi | data.frame | Content described in Table |
The data described in Tables 1–3 are combined into an RDS file, which is an R-specific file format. When read into R, the data are organized in a list composed of three elements, as described in the present table.
| Measurement(s) | crop type • land cover • Uncertainty • Normalized Difference Vegetation Index |
| Technology Type(s) | GPS navigation system • satellite imaging • computational modeling technique |
| Factor Type(s) | year of data collection • crop type label |
| Sample Characteristic - Environment | agricultural field |
| Sample Characteristic - Location | Uzbekistan • Tajikistan |