Upland vegetation mapping using Random Forests with optical and radar satellite data.

Brian Barrett, Christoph Raab, Fiona Cawkwell, Stuart Green.

Abstract

Uplands represent unique landscapes that provide a range of vital benefits to society, but are under increasing pressure from the management needs of a diverse number of stakeholders (e.g. farmers, conservationists, foresters, government agencies and recreational users). Mapping the spatial distribution of upland vegetation could benefit management and conservation programmes and allow for the impacts of environmental change (natural and anthropogenic) in these areas to be reliably estimated. The aim of this study was to evaluate the use of medium spatial resolution optical and radar satellite data, together with ancillary soil and topographic data, for identifying and mapping upland vegetation using the Random Forests (RF) algorithm. Intensive field survey data collected at three study sites in Ireland as part of the National Parks and Wildlife Service (NPWS) funded survey of upland habitats was used in the calibration and validation of different RF models. Eight different datasets were analysed for each site to compare the change in classification accuracy depending on the input variables. The overall accuracy values varied from 59.8% to 94.3% across the three study locations and the inclusion of ancillary datasets containing information on the soil and elevation further improved the classification accuracies (between 5 and 27%, depending on the input classification dataset). The classification results were consistent across the three different study areas, confirming the applicability of the approach under different environmental contexts.

Keywords:  Radar; random forests; remote sensing; satellite data; uplands; vegetation mapping

Year: 2016; PMID: 31423326; PMCID: PMC6686255; DOI: 10.1002/rse2.32

Source DB: PubMed; Journal: Remote Sens Ecol Conserv; ISSN: 2056-3485


Introduction

Regular monitoring of vegetation in upland areas is important for biodiversity conservation, land management, carbon storage and, within a European context, European Union (EU) policy compliance. Approximately 19% of the area of the Republic of Ireland supports upland habitats, and these have not been adequately described or their distribution adequately mapped (Perrin et al. 2009). These upland areas contain the nation's largest expanse of semi‐natural habitats and provide many benefits to society: water supply, climate regulation, maintenance of biodiversity, and provision of recreational activities, to name but a few. Notwithstanding this, the uplands are under increasing pressure from a myriad of issues: grazing management, scrub encroachment, diminished supports, an ageing farming population and abandonment of land, all of which will lead to major landscape changes into the future (MacDonald et al. 2000; Reed et al. 2009). These stresses have serious consequences for the composition, extent and conservation status of important vegetation habitats in these areas (Mehner et al. 2004). The inaccessibility and scale of the uplands, along with constraints in time and finance, make it difficult to monitor changes in vegetation over large areas using traditional field‐based surveys (Lees and Ritman 1991; Buchanan et al. 2005; Rhodes et al. 2015).
The use of Earth Observation (EO) data can help overcome this problem (Kerr and Ostrovsky 2003; Gillespie et al. 2008; Vanden Borre et al. 2011) and help comply with reporting obligations under the Birds Directive (Council Directive 79/409/EEC 1979) and the Habitats Directive (Council Directive 92/43/EEC 1992). The number of EO satellites orbiting the Earth is increasing and, concurrent with algorithmic advances in information extraction capabilities (Lausch et al. 2015), EO datasets offer a real possibility to provide reliable, high‐quality and spatially explicit maps of habitat distribution and to monitor habitat fragmentation at intervals determined by management needs (Nagendra et al. 2013; Barrett et al. 2014; Pettorelli et al. 2014a). In order to obtain sufficient information at the ecotope level, hyperspectral data can be the preferred choice in some studies and have been successfully demonstrated under various conditions (Lawrence et al. 2006; Chan and Paelinckx 2008; Chan et al. 2012; Delalieux et al. 2012; Lucas et al. 2015). However, hyperspectral data are not commonly available and, in their absence, multispectral data have also proved useful for habitat mapping (e.g. Feilhauer et al. 2014 and Corbane et al. 2015).
The difficulty of acquiring cloud‐free observations in temperate areas prone to persistent cloud coverage, especially during spring and summer, often limits the capability of classifying habitats with optical data, as it is not possible to capture the seasonal variability in the spectral response of the vegetation (Lucas et al. 2011). Additionally, upland areas are usually more prone to cloud cover due to the effects of orographic lift. Consequently, Synthetic Aperture Radar (SAR) data are increasingly being investigated for landscape monitoring as they are largely unaffected by atmospheric conditions (Baghdadi et al. 2009; Waske and Braun 2009; Evans and Costa 2013; Barrett et al. 2014). SAR data are sensitive to vegetation structure and moisture content and, in combination with optical data, may help to further improve discrimination of habitats that are structurally different but spectrally similar. The choice and availability of suitable EO data will ultimately determine the amount of information that can be extracted to map and monitor habitats to varying degrees of resolution.
In general, most studies are concerned with using the highest spatial resolution data possible, which often introduces further challenges in terms of the maximum coverage that is attainable and financial constraints in acquiring the data. In many cases, extremely detailed imagery may not be needed for a widespread conservation status assessment and the use of medium spatial resolution (>10 and ≤20 m) data may be sufficient to capture the broad extent and spatial patterns of habitats and meet the local needs of stakeholders along with national requirements in terms of reporting under the EU Directives (Lucas et al. 2007; Varela et al. 2008; Nagendra et al. 2013). Within this context, the objective of this study is to evaluate the use of medium spatial resolution optical and radar satellite data, together with ancillary soil and topographic data for identifying and mapping upland vegetation to complement field studies and help contribute to national policy in the area of upland management and the future sustainable development of the uplands. The definition used in this study for upland habitats is taken from Perrin et al. (2009) and is the same as that used by the NPWS of Ireland, whereby uplands are defined as unenclosed areas of land over 150 m in elevation, and contiguous areas of related habitats that descend below this value. Consequently, the study also includes areas below the 150 m cut‐off to include the broad band of transitional vegetation and land management that exists between lowland and upland habitats.

Materials and Methods

Study sites

Suitable study areas were selected from a list of candidate sites for an upland monitoring network proposed by Perrin et al. (2009), which is designed to meet part of Ireland's obligations under Articles 6, 11 and 17 of the Habitats Directive (92/43/EEC). Figure 1 displays the three areas selected for this study: Mount Brandon, the Galtee Mountains, and the Comeragh Mountains.
Figure 1

Location of the three upland study sites in Ireland. (A) Mount Brandon, (B) Galtee Mountains, and (C) Comeragh Mountains showing topography in shaded relief.


Mount Brandon

Mount Brandon is located on the Dingle Peninsula in west Kerry, in south western Ireland. It is a mountainous area that includes the second highest peak in Ireland – Mount Brandon at 952 m. It is a designated candidate Special Area of Conservation (cSAC 000375, Lat: 52.22, Long: −10.07) and the area has an oceanic climate with a mean temperature range of between 7 and 13°C and a mean annual rainfall of 1560 mm (calculated from the 1981–2010 averages of the nearest synoptic weather station at Valentia), although the upland summits often receive over 3000 mm per annum. The area of the studied upland site is 162 km2 (16,212 ha) while the area of the entire region in Figure 1(A) is 1030 km2.

Galtee Mountains

The Galtee Mountains span across three counties: Cork, Tipperary and Limerick, and are the highest inland mountain range in Ireland (Galtymore at 920 m). It is a designated candidate Special Area of Conservation (cSAC 000646, Lat: 52.36, Long: −8.14), where the mean recorded temperature range is 5–13°C with a mean annual precipitation of 820 mm for the lowlands, rising to 1900 mm for upland regions (meteorological data recorded at the closest synoptic weather station at Moorepark in Fermoy, Cork). The area of the studied upland site is 83 km2 (8279 ha) while the area of the entire region in Figure 1(B) is 619 km2.

Comeragh Mountains

The Comeragh Mountains are located in county Waterford and are a designated Special Area of Conservation (SAC 001952, Lat: 52.23, Long: −7.56). The central area of the mountains features a boggy plateau and reaches a maximum elevation of 792 m. Moorepark is also the closest synoptic weather station to the Comeragh Mountains and so the historical meteorological data is the same as for the Galtee Mountains. The area of the studied upland site is 103 km2 (10,329 ha) while the area of the entire region in Figure 1(C) is 943 km2.

Satellite data

The Advanced Visible and Near Infrared Radiometer type 2 (AVNIR‐2) instrument onboard the Advanced Land Observing Satellite (ALOS) was a multispectral sensor that acquired data in four visible and near infrared wavebands corresponding to the blue (0.42–0.50 μm), green (0.52–0.60 μm), red (0.61–0.69 μm) and near infrared (0.76–0.89 μm) spectral channels. The ALOS satellite was launched by the Japan Aerospace Exploration Agency (JAXA) on 19th January 2006 and operated until 12th May 2011. The satellite also carried a Phased Array‐type L‐band Synthetic Aperture Radar (PALSAR) instrument that operated at L‐band (wavelength (λ) = 23.6 cm). PALSAR level 1.1 data (single‐look‐complex [SLC]) obtained from the European Space Agency (ESA) were used in this study. Two different modes were selected: fine beam single (FBS) and fine beam dual (FBD) polarization. The characteristics of the satellite data used in this study are displayed in Table 1. AVNIR‐2 and PALSAR data were selected for this study based on their spatial resolution, availability and closeness in acquisition to the field measurements.
Table 1

Satellite data used for each of the study sites. Azimuth corresponds to the solar azimuth and elevation corresponds to the sun elevation angle, both in degrees. D corresponds to acquisitions from a descending orbit and A corresponds to acquisitions from an ascending orbit.

| Site | Sensor | Date | Track | Frame | Pass | Azimuth | Elevation |
|---|---|---|---|---|---|---|---|
| Mount Brandon | AVNIR‐2 | 2009‐09‐14 | 358 | 2540/2550 | D | 166.53/166.19 | 40.09/40.54 |
| | PALSAR FBD | 2010‐05‐14 | 7 | 1030/1040 | A | | |
| | | 2010‐06‐29 | 7 | 1030/1040 | A | | |
| | PALSAR FBS | 2010‐03‐29 | 7 | 1030/1040 | A | | |
| Galtee Mountains | AVNIR‐2 | 2010‐10‐11 | 354 | 2540 | D | 169.91 | 30.02 |
| | PALSAR FBD | 2010‐06‐07 | 3 | 1040 | A | | |
| | PALSAR FBS | 2010‐03‐07 | 3 | 1040 | A | | |
| | | 2011‐03‐10 | 3 | 1040 | A | | |
| Comeragh Mountains | AVNIR‐2 | 2010‐10‐11 | 354 | 2540 | D | 169.91 | 30.02 |
| | PALSAR FBD | 2010‐05‐21 | 2 | 1030/1040 | A | | |
| | | 2010‐07‐06 | 2 | 1030/1040 | A | | |
| | PALSAR FBS | 2011‐02‐21 | 2 | 1030/1040 | A | | |

AVNIR‐2, advanced visible and near infrared radiometer type 2; FBD, fine beam dual; FBS, fine beam single; PALSAR, phased array‐type L‐band synthetic aperture radar.

Image pre‐processing

AVNIR‐2

All data were received as level 1B2 products: two acquisitions from 2009 and one acquisition from 2010 were analysed for this study. The spatial resolution of the four AVNIR‐2 bands is 10 m, and these were resampled to 15 m using bilinear resampling to match the resolution of the PALSAR data. Each of the AVNIR‐2 scenes was geo‐rectified using ground control points (GCPs) collected from the Ordnance Survey of Ireland (OSi) orthophotography, yielding a root‐mean‐square (rms) error of less than 0.56 pixels. Atmospheric correction was performed using the MODTRAN® correction model as implemented in ATCOR‐3® (Richter and Schlapfer 2011). A C‐factor topographic correction (Teillet et al. 1982) was applied to the data using a sun illumination terrain model derived from a NextMap® 5 m Digital Elevation Model (DEM) (Intermap Technologies, 2008) covering the scene. The topographic correction was implemented in GRASS (GRASS Development Team, 2012). Cloud cover was present in each of the scenes and a mask (manually digitized on screen) was applied in order to exclude areas affected by cloud and cloud shadow. Shadows cast by topography were identified using the shadow file output from ATCOR‐3 and subsequently masked.
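As an illustration of the C‐factor correction step, the sketch below uses the commonly cited form of the method: reflectance is regressed on the cosine of the local solar incidence angle (band = m·cos i + b), the ratio c = b/m is formed, and each pixel is rescaled by (cos θz + c)/(cos i + c), where θz is the solar zenith angle (90° minus the sun elevation in Table 1). This is not the GRASS implementation used in the study; the function name, the argument names and the use of NumPy are assumptions made for illustration only.

```python
import numpy as np

def c_correction(band, cos_i, solar_zenith_deg, valid=None):
    """Apply a C-factor topographic correction to one band (illustrative sketch).

    band: 2-D reflectance array.
    cos_i: cosine of the local solar incidence angle per pixel, as would be
           derived from a DEM-based illumination model (assumed to be given).
    solar_zenith_deg: solar zenith angle in degrees (90 - sun elevation).
    """
    band = np.asarray(band, dtype=float)
    cos_i = np.asarray(cos_i, dtype=float)
    if valid is None:
        # Ignore masked pixels (e.g. cloud) stored as NaN when fitting.
        valid = np.isfinite(band) & np.isfinite(cos_i)
    # Linear fit band = m * cos_i + b; the empirical constant is c = b / m.
    m, b = np.polyfit(cos_i[valid], band[valid], 1)
    c = b / m
    cos_sz = np.cos(np.radians(solar_zenith_deg))
    corrected = band * (cos_sz + c) / (cos_i + c)
    return corrected, c
```

In practice the regression would normally be fitted per band, and possibly per land cover stratum; those choices are not described in the text and are left open here.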

Vegetation indices

Vegetation indices (VIs) have been used extensively for monitoring, analysing, and mapping vegetation dynamics and are often used to remove the variability caused by bare soil, illumination angles and atmospheric conditions when estimating vegetation parameters (Sarker and Nichol 2011). A selection of commonly used vegetation indices were generated using the atmospherically corrected AVNIR‐2 data in order to assess their additional information contribution to the classification process (see Table 2). In addition to the VIs, simple reflectance ratios were calculated for all four bands (blue/green; blue/red; blue/NIR; green/red; green/NIR; and red/NIR) and included as input in the classifications. Although many of the VIs are highly correlated, the use of multiple VIs could offer a more complete characterization of the upland vegetation classes.
Table 2

Vegetation Indices selected for this study

| Vegetation Index | Reference |
|---|---|
| Renormalized difference vegetation index (RDVI) | (Roujean and Breon 1995) |
| Difference vegetation index (DVI) | (Tucker 1979) |
| Modified nonlinear index (MNLI) | (Yang et al. 2008) |
| Normalized difference vegetation index (NDVI) | (Rouse et al. 1974) |
| Soil adjusted vegetation index (SAVI) | (Huete 1988) |
| Optimized soil adjusted vegetation index (OSAVI) | (Rondeaux et al. 1996) |
| Transformed vegetation index (TVI) | (Deering and Rouse 1975) |
| Corrected transformed vegetation index (CTVI) | (Perry and Lautenschlager 1984) |
| Thiam's transformed vegetation index (TTVI) | (Thiam 1997) |
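The indices in Table 2 and the six band ratios can be computed from the red and near infrared reflectances as sketched below. The text does not print the index formulas, so the forms used here are the commonly cited ones and should be read as assumptions: the soil adjustment factor L = 0.5 for SAVI and MNLI, and the 0.16 constant for OSAVI without the optional 1.16 multiplier. The function names and band keys are hypothetical.

```python
from itertools import combinations

def vegetation_indices(red, nir, soil_l=0.5):
    """Return the Table 2 indices for red and NIR reflectance (0-1 scale).

    Only arithmetic operators are used, so the inputs may be floats or
    NumPy arrays. TVI is real-valued only where NDVI >= -0.5, and CTVI is
    undefined where NDVI == -0.5 (division by zero).
    """
    ndvi = (nir - red) / (nir + red)
    shifted = ndvi + 0.5
    return {
        "NDVI": ndvi,
        "DVI": nir - red,
        "RDVI": (nir - red) / (nir + red) ** 0.5,
        "SAVI": (1 + soil_l) * (nir - red) / (nir + red + soil_l),
        "OSAVI": (nir - red) / (nir + red + 0.16),
        "MNLI": (1 + soil_l) * (nir ** 2 - red) / (nir ** 2 + red + soil_l),
        "TVI": shifted ** 0.5,
        "CTVI": shifted / abs(shifted) * abs(shifted) ** 0.5,
        "TTVI": abs(shifted) ** 0.5,
    }

def band_ratios(bands):
    """Simple ratios for every pair of bands, in the order the bands are given."""
    return {f"{a}/{b}": bands[a] / bands[b] for a, b in combinations(bands, 2)}
```

Passing the bands in the order blue, green, red, NIR yields the six ratios listed in the text (blue/green through red/NIR).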

Texture measures

Eight texture measures (mean, homogeneity, contrast, variance, dissimilarity, entropy, correlation, and second moment) based on the grey‐level co‐occurrence matrix (GLCM) (Haralick et al. 1973) of the near infra‐red band and radar backscatter were created using a 3 × 3 kernel size. These measures were included as they often provide unique information concerning the spatial pattern and variation in surface features and have been shown to improve classification accuracy (Lu and Weng 2007; Paneque‐Gálvez et al. 2013).
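To make the GLCM measures concrete, the sketch below builds a symmetric, normalised co‐occurrence matrix for a single quantised window and derives several of the measures named above. The horizontal offset, the number of grey levels and the quantisation are assumptions, as the text specifies only the 3 × 3 kernel; variance and correlation are omitted for brevity, and the function names are hypothetical.

```python
import math

def glcm(window, levels, offset=(0, 1)):
    """Symmetric, normalised grey-level co-occurrence matrix of a small window.

    window: 2-D list of integer grey levels in the range 0..levels-1.
    offset: (row, column) displacement between paired pixels (assumed here).
    """
    dr, dc = offset
    rows, cols = len(window), len(window[0])
    counts = [[0.0] * levels for _ in range(levels)]
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = window[r][c], window[r2][c2]
                counts[i][j] += 1
                counts[j][i] += 1  # count both directions (symmetric GLCM)
    total = sum(map(sum, counts))
    return [[v / total for v in row] for row in counts]

def texture_measures(p):
    """Haralick-style measures computed from a normalised GLCM p."""
    cells = [(i, j, v) for i, row in enumerate(p) for j, v in enumerate(row)]
    return {
        "mean": sum(i * v for i, _, v in cells),
        "contrast": sum(v * (i - j) ** 2 for i, j, v in cells),
        "dissimilarity": sum(v * abs(i - j) for i, j, v in cells),
        "homogeneity": sum(v / (1 + (i - j) ** 2) for i, j, v in cells),
        "second_moment": sum(v * v for _, _, v in cells),
        "entropy": -sum(v * math.log(v) for _, _, v in cells if v > 0),
    }
```

Applied as a moving window over the near infrared band or the backscatter image, each measure produces one texture layer per input band.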

PALSAR

The FBS and FBD data were multi‐looked by a factor of 2 in range and 4 in azimuth, and 1 in range and 4 in azimuth, respectively, to generate 15 × 15 m pixels. The FBS and FBD scenes for each study area were co‐registered and speckle filtered using a multi‐temporal de Grandi filter (De Grandi et al. 1997), and subsequently radiometrically and geometrically calibrated and converted to dB using a range‐Doppler approach and a NextMap 5 m spatial resolution DEM. The radar backscatter returned to the sensor is affected by the topography of the surface, and terrain‐induced distortions are present in areas of high topographic relief. These areas were masked for each of the study sites. For Mt Brandon, 15.08 km2 of the 162 km2 upland area was masked (due to the presence of cloud and/or shadow and terrain‐induced distortions), corresponding to 9.3% of the upland area. For the larger scene (see Fig. 4), 62.29 km2 was masked out of a total of 581.18 km2 (land area only), which corresponds to 10.7% of the total land surface area of the scene. For the Comeraghs, 2.74 km2 of the 103 km2 upland area and 13.38 km2 of the 943 km2 total area were masked, corresponding to 2.7% and 1.4% respectively. For the Galtees, 2.27 km2 was masked, corresponding to 2.7% and 0.4% of the upland (83 km2) and total (619 km2) areas respectively.
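The multi‐looking and dB conversion steps can be sketched as follows. Multi‐looking is shown here as block averaging of the intensity (squared amplitude of the SLC samples), and the dB conversion as 10·log10 of the linear backscatter. The array layout (rows as azimuth, columns as range) and the function names are assumptions, and calibration, co‐registration and speckle filtering are not represented.

```python
import numpy as np

def multilook(intensity, looks_azimuth, looks_range):
    """Average an intensity image over blocks of (azimuth x range) pixels.

    Assumes rows run along azimuth and columns along range.
    """
    rows, cols = intensity.shape
    rows -= rows % looks_azimuth  # trim edge pixels that do not fill a block
    cols -= cols % looks_range
    blocks = intensity[:rows, :cols].reshape(
        rows // looks_azimuth, looks_azimuth, cols // looks_range, looks_range
    )
    return blocks.mean(axis=(1, 3))

def to_db(linear):
    """Convert linear backscatter to decibels."""
    return 10.0 * np.log10(linear)
```

Under these assumptions an FBS scene would be processed as `multilook(np.abs(slc) ** 2, looks_azimuth=4, looks_range=2)` and an FBD scene with `looks_range=1`, matching the look factors given in the text.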
Figure 4

Maps derived from the optical and radar datasets (vii) for (A) Mount Brandon, (B) Galtee Mountains, and (C) Comeragh Mountain study areas. The delineated regions correspond to the upland areas of interest within each area.

Ancillary variables

Two different groups of ancillary variables were chosen for inclusion in the classifications: (1) topographic (elevation and slope) and (2) soils. Soil and subsoil information was derived from the Teagasc‐EPA Soils and Subsoils dataset (Fealy et al. 2009), which has a nominal working scale of 1:50,000, and elevation and slope data were obtained from a NextMap 5 m DEM. These parameters can influence the spatial distribution of upland vegetation species by affecting the amount of solar radiation and rainfall intercepted by the surface (Bennie et al. 2008, 2010), along with soil nutrient availability and moisture‐holding capacity (Franklin 1995).

Classification schema and reference data

The broad‐scale habitat classification scheme of Fossitt (2000) has been widely adopted by government authorities and the ecological community for habitat mapping in Ireland. The classification schema adopted for the NPWS‐funded National Survey of Upland Habitats (NSUH) is principally based on Fossitt (2000) and has been used in this study (see Table 3). A total of 15 classes (level 2) were identified and a stratified random sampling approach adopted for the selection of sample points. The three study sites have different class distributions and the proportion of each class varies relative to each site. Some classes are not present at some sites (e.g. lowland blanket bog) and some classes have a lower occurrence at other sites (e.g. exposed rock and montane heath at the Galtees). The sample set reflects these differences and as much as possible, the class proportions of the sample data are representative of actual class proportions in the study area landscape. User interpretation of NSUH field survey data for Mount Brandon (collected between May – Aug 2011), Galtee Mountains (Aug –Sept 2011), Comeragh Mountains (Mar – May 2010), Forest Inventory and Planning System (FIPS) and Microsoft® Bing Imagery aided the distinction between the different classes.
Table 3

Classification Schema and number of training samples. Class descriptions are adopted from Fossitt (2000)

| Level 0 | Level 1 | Level 2 | BR | GT | CM | Description |
|---|---|---|---|---|---|---|
| G Grassland | GA Improved | GA1 Improved | 340 | 507 | 407 | Grassland on well drained soils, usually consists of highly managed pastures |
| | GS Semi‐improved | GS3 Dry humid grassland | 266 | 186 | 237 | Semi‐improved grassland over acid soils |
| | | GS4 Wet grassland | 101 | 106 | 114 | Semi‐improved grassland on poorly drained soils |
| H Heath | HH Heath | HH1 Dry siliceous heath | 213 | 258 | 158 | Usually occurs on free‐draining acid soils where the vegetation is open and dwarf shrubs are present |
| | | HH3 Wet heath | 236 | 171 | 122 | Usually found on lower slopes of upland areas on peaty soils |
| | | HH4 Montane heath | 116 | 57 | 111 | Substantial cover of dwarf shrubs occurring at high elevation and/or very exposed locations |
| | | HD1 Dense bracken | 111 | 162 | 135 | Areas of open vegetation dominated by Bracken |
| P Peatland | PBR Raised Bog | PB4 Cutover bog | / | / | / | Mostly located in the lowlands of central and mid‐west Ireland where there are accumulations of deep peat (3–12 m) |
| | PBB Blanket Bog | PB2 Upland blanket bog | 271 | 383 | 467 | Usually occurs on flat or gently sloping ground (above 150 m elevation) on variable peat depths (>0.5 m depth) |
| | | PB3 Lowland blanket bog | 129 | / | / | Usually confined to wetter regions along the western seaboard. Occurs on flat or gently sloping ground below 150 m elevation |
| W Woodland | | | 381 | 311 | 674 | Areas dominated by trees and woody vegetation |
| E Exposed Rock | ER Exposed Rock | ER1/ER3 Exposed siliceous rock/scree and loose rock | 163 | 55 | 109 | Areas of natural and artificial exposure of bedrock and loose rock (excluding sea cliffs) |
| | DG Disturbed Ground | ED1/ED2 Exposed sand, gravel or till | / | 67 | 102 | Areas of exposed sand, gravel or till |
| B Built land | | | 139 | 252 | 338 | All developed land, including transportation infrastructure and human settlements |
| C Coastland | | | 134 | / | / | Includes sea cliffs and sand dunes |
| M Water body | | | 305 | 45 | 242 | Bodies of permanent fresh and/or salt water |
| Total | | | 2905 | 2560 | 3216 | |

Classification

The Random Forests (RF) machine learning classifier (Breiman 2001) was used to relate the vegetation types to the satellite and ancillary data. RF was chosen as the preferred classification method as it has consistently performed well for vegetation mapping using various types of data (Cutler et al. 2007; Chapman et al. 2010; Bradter et al. 2011; Rodriguez‐Galiano et al. 2012; Barrett et al. 2014; Feilhauer et al. 2014), can handle high‐dimensional datasets and is relatively robust to overfitting (Belgiu and Drăguţ 2016). RF builds an ensemble of individual decision‐like trees from which a final prediction is made using a majority voting scheme. Each tree is trained using a bootstrap sample of the training data (approximately 2/3 of samples), with the remaining samples (approximately 1/3) used to test the classification and estimate the out‐of‐bag (OOB) error. In this study, RF models consisted of 200 trees. Separate models were generated to analyse the performance of the different data types separately and collectively. Eight different datasets were analysed to compare the change in classification accuracy depending on the selected input variables. These models concentrated on the use of optical only, radar only and various combinations of optical‐derived and radar‐derived variables along with certain ancillary variables. The influence of the different input variables was calculated and the variable importances (based on the Gini importance) in the initial models were used to improve model fit and model parsimony. RF was implemented in Python 2.7.8 using the scikit‐learn library (Pedregosa et al. 2011).
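A minimal sketch of this kind of model in scikit‐learn is given below. It is not the authors' script: the data are synthetic stand‐ins for the per‐sample predictors and class labels, the random seeds are arbitrary, and the code targets current Python 3 releases of scikit‐learn rather than the Python 2.7.8 environment reported above. Only the 200 trees, the OOB estimate and the Gini‐based importances are taken from the text.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in data (assumption): 600 samples, 6 predictors, of which
# only the first two carry class information; four classes in total.
rng = np.random.default_rng(42)
X = rng.normal(size=(600, 6))
y = 2 * (X[:, 0] > 0).astype(int) + (X[:, 1] > 0).astype(int)

# 200 trees as in the study; oob_score=True requests the out-of-bag estimate.
rf = RandomForestClassifier(n_estimators=200, oob_score=True, random_state=0)
rf.fit(X, y)

oob_accuracy = rf.oob_score_  # accuracy on samples left out of each bootstrap
# feature_importances_ holds the impurity-based (Gini) importances.
ranking = np.argsort(rf.feature_importances_)[::-1]
print(f"OOB accuracy: {oob_accuracy:.3f}; predictors by importance: {ranking}")
```

Ranking the predictors in this way is one route to the model pruning described above, where low‐importance variables are dropped before refitting.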

Accuracy assessment

The results of all classifications were assessed using a standard confusion matrix to calculate the overall accuracy and the user's and producer's accuracies (Congalton 1991). An additional independent validation was also carried out for comparison with the RF OOB accuracies. A total of 876, 839, and 881 samples were randomly selected throughout the Mt Brandon, Galtee Mts, and Comeragh Mts study areas, respectively, to create the independent accuracy assessment dataset. The statistical significance of the differences between the classification datasets was evaluated using McNemar's test (Foody 2004), with the following formula:

$$z = \frac{f_{12} - f_{21}}{\sqrt{f_{12} + f_{21}}}$$

where $f_{12}$ indicates the number of samples correctly classified in the first classification but incorrectly in the second classification, and $f_{21}$ represents the number of samples correctly classified in the second classification but incorrectly classified in the first classification. McNemar's test has been commonly used in previous studies to evaluate the variability between classifications (Duro et al. 2012; Belgiu et al. 2014). In this study, the significance level α is set at 0.05 (z critical value = 1.96).
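The test statistic can be computed directly from the two sets of predictions and the reference labels, as in the sketch below. The function name and the handling of the case with no discordant samples are assumptions for illustration.

```python
import math

def mcnemar_z(reference, pred_a, pred_b):
    """McNemar z statistic for two classifications of the same samples.

    f12: samples classified correctly by A but incorrectly by B.
    f21: samples classified correctly by B but incorrectly by A.
    """
    triples = list(zip(reference, pred_a, pred_b))
    f12 = sum(a == r and b != r for r, a, b in triples)
    f21 = sum(a != r and b == r for r, a, b in triples)
    if f12 + f21 == 0:
        # No discordant samples: the classifications cannot be distinguished.
        return 0.0, f12, f21
    return (f12 - f21) / math.sqrt(f12 + f21), f12, f21
```

With α = 0.05, the two classifications would be judged significantly different when |z| exceeds 1.96.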

Results

The accuracy assessments (overall accuracy, user's and producer's accuracy) for the different classification datasets are shown in Table 4. The overall accuracy (OA) values among the datasets vary from 59.8% to 94.3% across the three study locations. The highest overall accuracies (93.2–94.3%) were obtained for the combined optical, radar and ancillary data (viii), across all three study areas. Most datasets achieved high accuracies (>~85%), with the exception of the radar and texture measures dataset (≤68%). The RF classifier displayed a relatively consistent overall performance across the three study sites; however, certain differences between classes were observed.
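Table 4

Level 2 classification results

Datasets: (i) Optical and texture; (ii) Radar and texture; (iii) Optical and texture and VIs; (iv) Optical and ancillary data; (v) Radar and ancillary data; (vi) Optical and Radar; (vii) Optical and Radar (inc texture and VIs); (viii) Optical and Radar and ancillary data.

| Class | Metric | (i) BR | (i) GT | (i) CM | (ii) BR | (ii) GT | (ii) CM | (iii) BR | (iii) GT | (iii) CM | (iv) BR | (iv) GT | (iv) CM | (v) BR | (v) GT | (v) CM | (vi) BR | (vi) GT | (vi) CM | (vii) BR | (vii) GT | (vii) CM | (viii) BR | (viii) GT | (viii) CM |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Overall Accuracy (%) | OA | 85.3 | 86.1 | 85.6 | 60.0 | 59.8 | 68.0 | 84.6 | 85.5 | 84.9 | 93.0 | 92.4 | 93.1 | 87.4 | 83.2 | 88.7 | 85.6 | 85.8 | 86.2 | 88.4 | 86.4 | 88.5 | 94.3 | 93.2 | 93.8 |
| Improved grassland (GA1) | PA | 0.98 | 0.97 | 0.96 | 0.63 | 0.65 | 0.70 | 0.97 | 0.96 | 0.95 | 0.97 | 0.99 | 0.98 | 0.95 | 0.94 | 0.95 | 0.98 | 0.97 | 0.96 | 0.97 | 0.97 | 0.97 | 0.99 | 0.99 | 0.97 |
| | UA | 0.97 | 0.95 | 0.98 | 0.83 | 0.82 | 0.87 | 0.96 | 0.95 | 0.98 | 0.98 | 1.00 | 1.00 | 0.96 | 0.98 | 0.99 | 0.98 | 0.96 | 0.99 | 0.97 | 0.97 | 0.99 | 0.98 | 1.00 | 1.00 |
| Dry humid grassland (GS3) | PA | 0.94 | 0.78 | 0.87 | 0.40 | 0.49 | 0.45 | 0.92 | 0.78 | 0.87 | 0.95 | 0.91 | 0.91 | 0.80 | 0.81 | 0.72 | 0.93 | 0.77 | 0.88 | 0.92 | 0.80 | 0.86 | 0.95 | 0.94 | 0.92 |
| | UA | 0.98 | 0.76 | 0.93 | 0.51 | 0.22 | 0.36 | 0.97 | 0.75 | 0.91 | 0.98 | 0.90 | 0.94 | 0.80 | 0.68 | 0.83 | 0.98 | 0.76 | 0.91 | 0.97 | 0.77 | 0.92 | 0.98 | 0.91 | 0.93 |
| Wet grassland (GS4) | PA | 0.94 | 0.75 | 0.63 | 0.32 | 0.56 | 0.74 | 0.93 | 0.75 | 0.59 | 0.96 | 0.96 | 0.83 | 0.88 | 0.84 | 0.80 | 0.94 | 0.77 | 0.63 | 0.95 | 0.76 | 0.71 | 0.97 | 0.97 | 0.87 |
| | UA | 0.92 | 0.77 | 0.54 | 0.12 | 0.13 | 0.12 | 0.91 | 0.75 | 0.49 | 0.91 | 0.88 | 0.78 | 0.65 | 0.61 | 0.71 | 0.94 | 0.77 | 0.58 | 0.94 | 0.75 | 0.54 | 0.94 | 0.87 | 0.81 |
| Dry siliceous heath (HH1) | PA | 0.78 | 0.77 | 0.71 | 0.44 | 0.46 | 0.49 | 0.76 | 0.77 | 0.69 | 0.87 | 0.86 | 0.81 | 0.79 | 0.73 | 0.71 | 0.83 | 0.76 | 0.74 | 0.84 | 0.76 | 0.75 | 0.90 | 0.85 | 0.84 |
| | UA | 0.75 | 0.82 | 0.63 | 0.54 | 0.54 | 0.22 | 0.78 | 0.83 | 0.61 | 0.91 | 0.88 | 0.68 | 0.85 | 0.74 | 0.65 | 0.79 | 0.86 | 0.65 | 0.83 | 0.84 | 0.68 | 0.92 | 0.89 | 0.72 |
| Wet heath (HH3) | PA | 0.58 | 0.65 | 0.57 | 0.32 | 0.35 | 0.43 | 0.57 | 0.60 | 0.60 | 0.83 | 0.77 | 0.76 | 0.70 | 0.62 | 0.77 | 0.56 | 0.56 | 0.56 | 0.63 | 0.60 | 0.71 | 0.86 | 0.81 | 0.81 |
| | UA | 0.65 | 0.51 | 0.32 | 0.36 | 0.15 | 0.05 | 0.63 | 0.49 | 0.36 | 0.73 | 0.71 | 0.71 | 0.70 | 0.66 | 0.42 | 0.62 | 0.40 | 0.27 | 0.72 | 0.43 | 0.32 | 0.79 | 0.78 | 0.70 |
| Montane heath (HH4) | PA | 0.78 | 0.69 | 0.60 | 0.57 | 0.27 | 0.29 | 0.76 | 0.68 | 0.58 | 0.86 | 0.81 | 0.86 | 0.92 | 0.92 | 0.81 | 0.81 | 0.72 | 0.49 | 0.83 | 0.75 | 0.70 | 0.87 | 0.84 | 0.88 |
| | UA | 0.78 | 0.72 | 0.36 | 0.35 | 0.05 | 0.11 | 0.75 | 0.74 | 0.34 | 0.83 | 0.77 | 0.84 | 0.73 | 0.58 | 0.55 | 0.83 | 0.68 | 0.39 | 0.82 | 0.70 | 0.52 | 0.90 | 0.75 | 0.77 |
| Dense bracken (HD1) | PA | 0.80 | 0.87 | 0.71 | 0.63 | 0.48 | 0.33 | 0.81 | 0.84 | 0.70 | 0.96 | 0.90 | 0.88 | 0.92 | 0.78 | 0.81 | 0.84 | 0.86 | 0.69 | 0.88 | 0.84 | 0.79 | 0.97 | 0.91 | 0.88 |
| | UA | 0.77 | 0.89 | 0.55 | 0.26 | 0.17 | 0.27 | 0.75 | 0.88 | 0.59 | 0.93 | 0.94 | 0.87 | 0.81 | 0.81 | 0.84 | 0.78 | 0.86 | 0.64 | 0.74 | 0.90 | 0.76 | 0.91 | 0.95 | 0.94 |
| Upland blanket bog (PB2) | PA | 0.60 | 0.77 | 0.78 | 0.38 | 0.42 | 0.47 | 0.58 | 0.79 | 0.79 | 0.84 | 0.87 | 0.93 | 0.74 | 0.75 | 0.82 | 0.59 | 0.76 | 0.77 | 0.70 | 0.76 | 0.76 | 0.83 | 0.88 | 0.92 |
| | UA | 0.60 | 0.88 | 0.93 | 0.54 | 0.71 | 0.89 | 0.59 | 0.86 | 0.92 | 0.93 | 0.93 | 0.96 | 0.92 | 0.88 | 0.94 | 0.61 | 0.87 | 0.92 | 0.73 | 0.86 | 0.95 | 0.95 | 0.93 | 0.96 |
| Lowland blanket bog (PB3) | PA | 0.66 | / | / | 0.32 | / | / | 0.67 | / | / | 0.84 | / | / | 0.81 | / | / | 0.65 | / | / | 0.77 | / | / | 0.93 | / | / |
| | UA | 0.55 | / | / | 0.05 | / | / | 0.50 | / | / | 0.75 | / | / | 0.61 | / | / | 0.53 | / | / | 0.56 | / | / | 0.78 | / | / |
| Woodland (W) | PA | 0.99 | 0.97 | 0.99 | 0.93 | 0.86 | 0.95 | 1.00 | 0.97 | 0.98 | 1.00 | 0.98 | 0.99 | 0.96 | 0.86 | 0.99 | 1.00 | 0.99 | 0.99 | 1.00 | 0.99 | 0.99 | 1.00 | 0.99 | 0.99 |
| | UA | 0.99 | 0.99 | 0.98 | 0.98 | 0.98 | 0.95 | 0.99 | 0.98 | 0.97 | 0.99 | 0.99 | 0.98 | 0.99 | 1.00 | 0.97 | 1.00 | 0.99 | 0.98 | 1.00 | 0.99 | 0.99 | 1.00 | 1.00 | 0.99 |
| Exposed Rock (ER1) | PA | 0.84 | 0.83 | 0.70 | 0.61 | 0.64 | 0.62 | 0.84 | 0.81 | 0.65 | 0.90 | 0.93 | 0.75 | 0.86 | 0.85 | 0.78 | 0.84 | 0.80 | 0.69 | 0.89 | 0.90 | 0.71 | 0.94 | 0.91 | 0.79 |
| | UA | 0.82 | 0.69 | 0.68 | 0.26 | 0.25 | 0.32 | 0.84 | 0.64 | 0.64 | 0.90 | 0.76 | 0.83 | 0.87 | 0.60 | 0.78 | 0.80 | 0.67 | 0.72 | 0.88 | 0.67 | 0.77 | 0.91 | 0.76 | 0.85 |
| Disturbed ground (ED1) | PA | / | 0.87 | 0.55 | / | 0.65 | 0.39 | / | 0.87 | 0.55 | / | 0.93 | 0.93 | / | 1.00 | 0.89 | / | 0.93 | 0.71 | / | 0.96 | 0.75 | / | 0.96 | 0.85 |
| | UA | / | 0.78 | 0.97 | / | 0.16 | 0.13 | / | 0.72 | 0.97 | / | 0.81 | 0.89 | / | 0.22 | 0.91 | / | 0.76 | 0.77 | / | 0.78 | 0.87 | / | 0.79 | 0.96 |
| Builtland (B) | PA | 0.89 | 1.00 | 0.99 | 0.69 | 0.80 | 0.87 | 0.91 | 1.00 | 0.99 | 0.97 | 1.00 | 0.99 | 0.94 | 0.95 | 0.95 | 0.91 | 0.98 | 0.99 | 0.94 | 0.99 | 0.99 | 1.00 | 1.00 | 1.00 |
| | UA | 0.93 | 0.98 | 1.00 | 0.71 | 0.89 | 0.92 | 0.92 | 0.98 | 1.00 | 0.99 | 0.99 | 1.00 | 0.90 | 0.92 | 0.98 | 0.92 | 0.97 | 1.00 | 0.95 | 0.99 | 1.00 | 0.99 | 1.00 | 1.00 |
| Coastland (C) | PA | 0.95 | / | / | 0.79 | / | / | 0.94 | / | / | 1.00 | / | / | 1.00 | / | / | 0.96 | / | / | 0.98 | / | / | 0.99 | / | / |
| | UA | 0.93 | / | / | 0.66 | / | / | 0.92 | / | / | 0.97 | / | / | 0.99 | / | / | 0.93 | / | / | 0.96 | / | / | 1.00 | / | / |
| Water body (M) | PA | 1.00 | 1.00 | 0.99 | 0.94 | 0.98 | 0.90 | 1.00 | 1.00 | 0.98 | 1.00 | 1.00 | 0.99 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 0.98 | 1.00 | 1.00 | 0.99 | 1.00 | 1.00 | 1.00 |
| | UA | 1.00 | 0.87 | 0.89 | 0.95 | 0.91 | 0.94 | 1.00 | 0.91 | 0.88 | 1.00 | 0.98 | 0.99 | 1.00 | 1.00 | 0.99 | 1.00 | 0.98 | 0.99 | 1.00 | 0.96 | 1.00 | 1.00 | 0.98 | 1.00 |

PA, producer accuracy; UA, user accuracy for the different datasets at each of the three study sites. BR, Mount Brandon; GT, Galtee Mountains; CM, Comeragh Mountains.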

Figure 2 displays the producer's (PA) and user's (UA) accuracies for each of the eight classification datasets and provides more insight into the classification errors that are unique to specific classes. The producer's and user's accuracies are the complements of the omission and commission errors, respectively. The radar dataset (ii) displays the largest differences between PA and UA for many of the vegetation classes, indicating that the radar and texture data tend to overestimate these classes and, when used alone, cannot reliably separate them. The lowest values for most datasets are confined to the heath classes, where differences between the study areas become more readily apparent. When both optical and radar datasets are combined with the ancillary datasets (viii), these differences between the study areas are less obvious.
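For reference, the three accuracy measures reported in Table 4 follow from a confusion matrix as sketched below. The orientation of the matrix (rows as reference classes, columns as mapped classes), the function name and the example counts are assumptions for illustration, not values from the study.

```python
def accuracy_metrics(matrix, labels):
    """Overall, producer's and user's accuracies from a confusion matrix.

    matrix[i][j] is the number of samples of reference class i that were
    assigned to class j (rows: reference, columns: map).
    """
    total = sum(map(sum, matrix))
    correct = sum(matrix[k][k] for k in range(len(labels)))
    producers, users = {}, {}
    for k, label in enumerate(labels):
        ref_total = sum(matrix[k])                 # row sum: reference samples
        map_total = sum(row[k] for row in matrix)  # column sum: mapped samples
        # PA = 1 - omission error; UA = 1 - commission error.
        producers[label] = matrix[k][k] / ref_total if ref_total else float("nan")
        users[label] = matrix[k][k] / map_total if map_total else float("nan")
    return correct / total, producers, users
```

A class with a high PA but a low UA, as seen for several classes in the radar dataset (ii), is one that is mapped more widely than it occurs in the reference data.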
Figure 2

Producer's and user's accuracies, shown as the first and second column at each of the three study sites, for the eight classification datasets (i–viii). The datasets correspond to those presented in Table 4.

It can be seen that the increases in accuracy achieved in some of the datasets by the addition of certain variables are not large. RF produces a measure of variable importance by analysing the deterioration of the predictive ability of the model when each predictor variable is replaced in turn by random noise (Vincenzi et al. 2011). In general, the texture measures and radar data have low importance scores. The class‐specific contributions of different variables to the models are shown in Figure 3. Due to their negligible influence, the texture measures (optical and radar) have been omitted. In all three study areas, all models relied strongly on distinct spectral bands and band ratios. The influence of the ancillary data varies between classes and study sites.
The RF models were applied across the entire study areas to obtain vegetation cover for the whole regions (see Fig. 4), while the upland subsets in these study areas are shown in Figure 5. These maps were created using the (vii) dataset, without the inclusion of the soils and elevation ancillary data. A 3 × 3 pixel majority filter was applied to the thematic outputs to improve the homogeneity of the final product. As can be seen from Figure 4, the dominant vegetation cover in all areas is grassland, which is typical of most areas in Ireland. There is very little forest cover on the Dingle Peninsula, while both the Galtee and Comeragh study areas have considerably larger forest areas, especially along the lower slopes of the upland areas. These areas usually represent lands that are marginal for agriculture and, since the 1950s, large extents have been afforested, supported through various government and EU incentive programmes.
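The 3 × 3 majority filter applied to the thematic outputs can be illustrated with the sketch below. The edge handling (windows truncated at the image border) and the rule that a pixel is relabelled only when one class holds a strict majority of its window are assumptions, as the text does not specify them.

```python
from collections import Counter

def majority_filter(classes, size=3):
    """Relabel each pixel with the majority class of its size x size window.

    classes: 2-D list of class labels. Windows are truncated at the edges,
    and a pixel keeps its label unless one class holds a strict majority.
    """
    half = size // 2
    rows, cols = len(classes), len(classes[0])
    out = [row[:] for row in classes]
    for r in range(rows):
        for c in range(cols):
            window = [
                classes[rr][cc]
                for rr in range(max(0, r - half), min(rows, r + half + 1))
                for cc in range(max(0, c - half), min(cols, c + half + 1))
            ]
            label, count = Counter(window).most_common(1)[0]
            if count * 2 > len(window):
                out[r][c] = label
    return out
```

An isolated pixel of one class surrounded by another is absorbed into its neighbourhood, which is the homogenising effect described above.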
Concentrating on the upland subsets in Figure 5, the true value of upland areas in terms of habitat diversity is apparent. Mount Brandon (Fig. 5A) has extensive areas of wet heath, semi‐improved (dry‐humid acid) grasslands, blanket bog and dry siliceous heath. Large areas of montane heath are observed, especially along the western edge of the area making it quite distinctive when compared to the Galtee and Comeragh Mountains. From Figure 5(B), the dominant classes for the Galtee Mountains are dry‐humid acid grassland along the north‐west of the area, dry siliceous heath and blanket bog. Wet heath occurs less frequently, compared to the Mount Brandon area, though there are increased areas of wet grassland. Similar to the Galtees, the dominant classes in the Comeragh Mountains area (Fig. 5C) are blanket bog, dry siliceous heath and dry‐humid acid grassland. Small areas of wet heath are scattered throughout the area and areas of dense bracken are prevalent along the eastern edges of the upland area.
Figure 3

Variable importance scores of the different classes for the three study areas. Apart from the mean, all texture measures were excluded as their importance was negligible. Radar backscatter data (black) represent the first four (Galtee Mountains) and five (Mount Brandon and Comeragh Mountains) variables followed by the four spectral bands (b1, b2, b3, b4) and spectral band ratios (b1b2, b1b3, b1b4, b2b3, b2b4, b3b4) in green. The vegetation indices (NDVI, SAVI, OSAVI, DVI, CTVI, TVI, TTVI, RDVI, and MNLI) are in blue with the band 4 mean, HH polarization mean, and HV polarization in light grey. The final four variables are the soil, subsoil, elevation, and slope (dark grey). NDVI, normalized difference vegetation index; OSAVI, optimized soil adjusted vegetation index; RDVI, renormalized difference vegetation index; SAVI, soil adjusted vegetation index.

Figure 4

Maps derived from the optical and radar datasets (vii) for (A) Mount Brandon, (B) Galtee Mountains, and (C) Comeragh Mountain study areas. The delineated regions correspond to the upland areas of interest within each area.

Figure 5

Maps of the upland areas of (A) Mount Brandon, (B) Galtee Mountains, and (C) Comeragh Mountain study areas. These areas correspond to the delineated regions in Figure 4.

The results of McNemar's test between classification (vii) and the others are displayed in Table 5 for all study sites. McNemar's test is non‐parametric and based on the classifiers' confusion matrices, with the null hypothesis of no significant difference between classifications (e.g. (i) = (vii)). For all three sites, the differences between (vii) and (ii) and between (vii) and (v) were significant (P < 0.001). The difference between (vii) and (iii) was significant (P < 0.001) at both the Mt Brandon and Comeragh Mts sites.
Mt Brandon displayed significant differences (P < 0.05) between all classifications except (vii) and (iv) while the Comeragh Mts also displayed significant differences (P < 0.001) between (vii) and (iii) and (vii) and (vi).
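As an illustration of how a |z| value and its P value arise in McNemar's test, the sketch below uses the common form without continuity correction, in which only the two discordant counts enter the statistic. Whether the study used this exact form is an assumption, and the counts in the example are hypothetical.

```python
import math

def mcnemar_z(f12, f21):
    """z statistic from the two discordant counts.

    f12: samples correct in classification 1 but wrong in classification 2
    f21: samples wrong in classification 1 but correct in classification 2
    """
    if f12 + f21 == 0:
        raise ValueError("no discordant samples; the test is undefined")
    return (f12 - f21) / math.sqrt(f12 + f21)

def two_sided_p(z):
    """Two-sided P value for a standard normal z statistic."""
    return math.erfc(abs(z) / math.sqrt(2))

z = mcnemar_z(30, 10)  # hypothetical discordant counts
p = two_sided_p(z)
```

Under the normal approximation, a value such as |z| = 3.035 corresponds to a two‐sided P of roughly 0.002, and |z| must exceed about 1.96 for P < 0.05.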
Table 5

Summary of the classification comparisons for the three study areas

                     Mt Brandon          Galtees Mts         Comeragh Mts
Class 1   Class 2    |z|       P value   |z|       P value   |z|       P value
(i)       (vii)      3.035     0.002     1.331     0.183     2.373     0.176
(ii)      (vii)      14.284    <0.001    13.844    <0.001    12.736    <0.001
(iii)     (vii)      3.428     <0.001    1.825     0.068     3.582     <0.001
(iv)      (vii)      0.447     0.655     1.281     0.200     0.681     0.496
(v)       (vii)      6.167     <0.001    6.972     <0.001    3.618     <0.001
(vi)      (vii)      2.331     0.020     1.543     0.122     2.592     <0.001
(viii)    (vii)      4.587     <0.001    1.643     0.100     1.709     0.087

Discussion

The results from this study demonstrate the advantage of integrating EO satellite data from multiple sensors to improve vegetation mapping in upland regions. Even though it may not be surprising that the multispectral data outperforms the radar data, there is merit in incorporating both data types in the classifier models. One of the first published studies to investigate radar differences between upland and lowland vegetation was by Krohn et al. (1983) using L‐band SEASAT data. Since then, few published studies on the use of radar for mapping uplands can be found in the literature. The results from this study reveal that a short time series of L‐band radar data cannot exclusively separate all the distinct vegetation classes used in this analysis. The results show that combined optical and radar data obtain the highest classification accuracies, in agreement with previous studies (e.g. Bagan et al. (2012)). The inclusion of ancillary datasets containing information on the soil and elevation further improves the classification accuracies (between 5 and 27%, depending on the input classification dataset) and is similar to that found in previous studies for both optical (Sesnie et al. 2008) and radar data (Barrett et al. 2014). When several vegetation classes are grouped into broader habitat types, classification accuracies also show an improvement. There is little difference between level 0 and level 1 accuracies and in most cases, the lower level classifications show only a marginal improvement upon level 2 accuracies (see Fig. S1 and Tables S1, S2). To determine the stability of the level 2 classification results, 25 iterations of the RF classifications were run for the optical and radar dataset (vii) where the maximum variation observed in OA for Mt Brandon was 1.01%, Galtee Mts was 0.71%, and Comeragh Mts was 0.69%.

Relative importance of explanatory variables

It can be seen from Figure 3 that the radar data has low importance scores for most of the vegetation classes, with the lowest scores obtained for the GS3 and PB2 classes. This is likely due to the long wavelength of the radar signal (λ = 23.6 cm) which penetrates through the vegetation canopy and returns mostly information about the underlying soil properties. Shorter wavelength (e.g. C‐ or X‐band) backscatter is influenced more by the vegetation canopy and may provide more information on the plant geometry that could facilitate the distinction of different upland vegetation classes. Within the optical domain, the NIR signal is particularly useful for discriminating between grassland types (GA1, GS3 and GS4) while the green and red spectral bands perform well for distinguishing between the heath classes (HH1, HH3 and HH4) and blanket bog (PB2). The spectral band ratios (blue/red and green/red) performed especially well in separating dense bracken (HD1), and in general performed better than the vegetation indices. The greater importance of these band ratios is likely due to the higher reflectance of bracken compared to other vegetation in autumn, especially in the red wavelengths due to the higher amount of underlying dead litter. Similar findings were observed by Holland and Aplin (2013) for winter acquisitions. Factors such as the bare soil, moisture conditions, solar zenith angle and the atmosphere can impact on the effective use of VIs for distinguishing vegetation types (Jackson and Huete 1991). Soil‐adjusted indices such as SAVI and OSAVI minimise the soil background influence but do not outperform other VIs, indicating a likely negligible influence of bare soil on the classifications. In fact, the nine VIs investigated in this study perform similarly across the different study areas. 
The exception is for the improved grassland (GA1) class where the DVI and renormalized difference vegetation index (RDVI) revealed the highest discriminatory power for the Mount Brandon and Comeragh Mountains. In both of these areas, the NIR channel also had a higher influence than other spectral bands or indices. This is likely due to the strong absorption of electromagnetic radiation in the red wavelengths (0.61–0.69 μm) by chlorophyll in pastures and its high reflectance in the NIR region. RDVI is similar to NDVI but tends to be more sensitive to changes in vegetation coverage under low leaf area index conditions. Elevation is one of the most important factors determining the broad‐scale distribution of upland vegetation as it influences precipitation and temperature. Thus, elevation controls the ecological and physiological adaptations of various plant species (Lomolino 2001), and the significance of this variable and, to a lesser extent, the slope can be seen across most of the classes. The high explanatory power of these variables is not surprising, as upland grasslands and heaths tend to occur on sloping ground, and montane heaths generally occur at high elevations. Similarly, blanket bogs usually occur on level ground or gentle slopes. Furthermore, they generally occur on deep peaty soils and the results indicate that the soil and subsoil variables had a high importance in this class also. The particular importance of soil characteristics for vegetation mapping has been demonstrated in previous studies by Rogan et al. (2003), Barrett et al. (2014), and Gartzia et al. (2014). Studies within different scientific disciplines (e.g. bioinformatics, statistics, ecology) suggest RF variable importance measures can display a bias towards highly correlated variables (Strobl et al. 2008; Genuer et al. 2010; Ellis et al. 2012). 
This bias can be lessened by increasing the subsample size of input variables at each node but at the expense of increasing the generalization error and decreasing the overall accuracy (Breiman 2001). Although not considered here, approaches such as the conditional permutation method (Strobl et al. 2008) could be explored as an alternative importance measure in future studies.
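For reference, several of the vegetation indices discussed in this subsection can be written out from red and NIR reflectance. The sketch below uses commonly cited formulations; the exact definitions and constants used in the study are assumed rather than confirmed (L = 0.5 for SAVI and 0.16 for OSAVI are the usual defaults), and the reflectance values are hypothetical.

```python
import math

def ndvi(nir, red):
    """Normalized difference vegetation index."""
    return (nir - red) / (nir + red)

def dvi(nir, red):
    """Difference vegetation index."""
    return nir - red

def rdvi(nir, red):
    """Renormalized difference vegetation index, equal to sqrt(NDVI * DVI)."""
    return (nir - red) / math.sqrt(nir + red)

def savi(nir, red, soil_factor=0.5):
    """Soil adjusted vegetation index; soil_factor is the adjustment term L."""
    return (1 + soil_factor) * (nir - red) / (nir + red + soil_factor)

def osavi(nir, red):
    """Optimized soil adjusted vegetation index with the usual 0.16 constant."""
    return (nir - red) / (nir + red + 0.16)

# Hypothetical reflectances for a green pasture pixel: low red, high NIR.
nir, red = 0.45, 0.05
values = {f.__name__: f(nir, red) for f in (ndvi, dvi, rdvi, savi, osavi)}
```

Because all five indices are built from the same red and NIR difference, they are strongly correlated, which is in line with the observation that the VIs perform similarly and is relevant to the correlation bias noted above.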

Predicted output map uncertainty

The retrieval of habitat information in upland areas using EO data is challenging due to the variable topography and the difficulty of obtaining cloud‐free acquisitions in these regions. Furthermore, habitat delineation is more difficult to achieve as the landscape is more heterogeneous (in terms of composition and structure) and consists of a number of interlinked habitats at different scales (spatial, temporal and spectral) (Varela et al. 2008). In this study, misclassification has occurred within and between subclasses of the main vegetation classes of interest (grassland, heaths, and blanket bog). An important feature of the RF algorithm is the ability to compute class probabilities in order to quantify the level of uncertainty in the predicted output maps. The probability of correct classification for each class was calculated to make this uncertainty explicitly available, whereby the relative proportion of each vegetation class per pixel is provided in Figure 6. The predicted probabilities of the main vegetation classes are shown, where the darkest areas represent the pixels with the lowest uncertainty of the assigned class. The classes with the highest overall probabilities in each of the study sites are dry humid acid grasslands (GS3), blanket bogs (PB2), and dry siliceous heath (HH1). Ireland is the most important European country for blanket bog habitats and contains almost 8% of the worldwide blanket bog resource, thus these areas are of prime conservation value. Furthermore, these expanses represent a significant active natural carbon sink (Tomlinson 2005; Bullock et al. 2012).
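A minimal sketch of how per‐pixel class probabilities can be derived from an RF ensemble is given below. It assumes the probability is taken as the fraction of trees voting for each class; some implementations instead average per‐tree class probabilities. The vote counts are hypothetical.

```python
from collections import Counter

def class_probabilities(tree_votes):
    """Proportion of trees voting for each class at one pixel."""
    counts = Counter(tree_votes)
    n_trees = len(tree_votes)
    return {label: k / n_trees for label, k in counts.items()}

def assign_with_probability(tree_votes):
    """Return the winning class and its probability for one pixel."""
    probs = class_probabilities(tree_votes)
    label = max(probs, key=probs.get)
    return label, probs[label]

# Hypothetical votes from a 100-tree forest for a single pixel.
votes = ["PB2"] * 70 + ["HH3"] * 20 + ["GS3"] * 10
label, prob = assign_with_probability(votes)
```

Mapping the winning probability for every pixel gives a surface of the kind described above, with high values marking pixels where the assigned class is least uncertain.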
Figure 6

Prediction probabilities for the main classes of interest for the upland areas of Mount Brandon (left), Galtee Mountains (middle) and Comeragh Mountains (right). Darker areas represent higher probabilities while the lighter areas indicate low probabilities. Class designations correspond to those in Table 3.

Comparison with additional independent validation dataset

The OOB accuracies reported by the RF algorithm have generally been shown to be a reliable measure of classification accuracy (Lawrence et al. 2006; Devaney et al. 2015). Belgiu and Drăguţ (2016) suggest that this claim requires further validation using a variety of datasets and application areas. In this study, an additional independent validation was performed and the results are presented in Table S3. In all cases, the accuracies obtained for the independent validation were lower than the OOB accuracies, on average by 5.1 ± 2.5% across the three study areas. The radar and ancillary dataset (v) had the largest differences, ranging between 8.4 and 11.6%, while the optical and radar dataset (including texture measures and VIs) (vii) had the smallest, ranging between 2.5 and 3.3%. Although many studies have demonstrated the ability of RF to perform well on high dimensional data, Millard and Richardson (2015) found that RF can underestimate the error and recommend reducing the dimensionality of high dimensional datasets to significantly reduce the difference between OOB and independent assessment accuracies.
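The accuracy measures compared here (overall, producer's and user's accuracy) all derive from a confusion matrix, whether it is built from OOB predictions or from an independent sample. The sketch below assumes rows hold the reference classes and columns the mapped classes; this orientation and the counts are illustrative assumptions only.

```python
def accuracies(matrix):
    """Overall, producer's and user's accuracies from a confusion matrix.

    matrix[i][j] = number of samples of reference class i mapped as class j.
    """
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    overall = sum(matrix[i][i] for i in range(n)) / total
    # Producer's accuracy: correct samples divided by the reference (row) total.
    producers = [matrix[i][i] / sum(matrix[i]) for i in range(n)]
    # User's accuracy: correct samples divided by the mapped (column) total.
    users = [matrix[i][i] / sum(matrix[j][i] for j in range(n)) for i in range(n)]
    return overall, producers, users

# Hypothetical three-class confusion matrix with 100 validation samples.
cm = [[45, 5, 0],
      [10, 25, 5],
      [0, 0, 10]]
overall, producers, users = accuracies(cm)
```

Running the same function on the OOB and the independent confusion matrices and differencing the overall accuracies gives the kind of gap reported above.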

EO data acquisition timing and spatial resolution

The similarity of accuracies between the study areas may be attributable in part to the similar acquisition periods of the optical and radar data for each of the study areas. The AVNIR‐2 scenes were acquired in September (Mount Brandon) and in October (Galtee and Comeragh Mountains), while the radar scenes were acquired between February and March (FBS mode) and between May and July (FBD mode). The different modes of PALSAR data were only available for certain times of year, as part of JAXA's systematic observation strategy, whereby FBS mode acquisitions were available between January and April, and FBD mode acquisitions were available between May and October. Vegetation has unique spectral signatures which evolve with the plant life cycle during the year. Characteristics such as pigmentation, water content and physiological structure affect the reflectance, absorption, and transmittance of plant leaves, stems and flowers. In this regard, the time of year of image acquisition will have a strong bearing on the classification accuracy and the ability to distinguish different types of vegetation. Nonetheless, it is difficult to identify an optimal temporal window for operational monitoring of all upland vegetation types (Cole et al. 2014), although acquisitions around September are considered optimum as most upland vegetation types are fully developed (Mills et al. 2006). The spectral similarity between different vegetation types during the summer often limits the ability of acquisitions during these months to reliably distinguish between vegetation types. Ideally, a dense time series of data would allow this to be investigated further as the use of multitemporal data can account for the seasonal variation in vegetation and provide more accurate classifications (Gillanders et al. 2008). This could also open up the possibilities of monitoring grazing management (under‐ and over‐grazing) more effectively and of identifying burning. 
In addition to multitemporal data, a higher discrimination between classes where misclassifications were high could be achieved with data from several spectral bands. For example, Feilhauer et al. (2014) successfully demonstrated the use of simulated multispectral data at 6 m, 10 m, 20 m and 60 m spatial resolution in providing detailed information on the distribution of habitat types. Similarly, Holland and Aplin (2013) found 4 m spatial resolution IKONOS imagery not to be comprehensively superior to Landsat (30 m spatial resolution) for mapping bracken at an uplands site in the UK. Similar findings were observed by Rocchini (2007) and Nagendra et al. (2010). All of these studies found spectral information to be much more important than spatial resolution. With the successful launch of medium spatial resolution sensors such as Sentinel‐2 on 23rd June 2015 and future launch of the environmental mapping and analysis program (EnMAP) hyperspectral satellite (providing global coverage at 30 m spatial resolution in 232 spectral channels) in 2018, a valuable and inexpensive source of information to derive spatially complete vegetation information for upland areas in a consistent and regular manner can be provided. Moreover, the perceived inadequacy of medium spatial resolution data may be overcome by incorporating information on the class probabilities as a measure of quantifying the level of uncertainty in the predicted output maps.

Conclusion

In upland areas, meteorological, hydrological and ecological conditions often change substantially over relatively short distances, and these areas thus contain a high diversity of habitats and species. Improving our knowledge on upland environments will give valuable insights into holistic environmental processes, aiding the development of sustainable land management strategies for managing the effects of climate change, dormancy and promoting conservation of terrestrial and aquatic biodiversity (Nogués‐Bravo et al. 2007; Ramchunder et al. 2009; Hodd et al. 2014). EO provides the only means of measuring the characteristics of habitats across broad areas and detecting environmental changes that occur as a result of human or natural processes in these areas on a frequent basis (Kerr and Ostrovsky 2003; Turner et al. 2003; Duro et al. 2007; Nagendra et al. 2014). With the current availability of satellite EO data at low or no cost and an increased number of satellites in orbit or planned, there has never been a better time to incorporate EO data into operational vegetation mapping and monitoring programmes. EO data will likely never provide the fine‐scale information that only field measurements can provide but can offer a powerful complementary information source (Spanhove et al. 2012; Feilhauer et al. 2014; Pettorelli et al. 2014b; O'Connor et al. 2015). From this study, it can be concluded that medium spatial resolution (~15 m) satellite data acquired from optical and microwave sensors offer a basis for supporting mapping and monitoring of upland vegetation. The mapping approach has been demonstrated over large areas in three distinctive upland regions, indicating the consistency and the transferability of the method.

Conflicts of Interest

The authors declare no conflicts of interest.

Figure S1. Overall accuracies for the classification datasets at level 0, level 1, and level 2 for (A) Mount Brandon, (B) Galtee Mountains, and (C) Comeragh Mountains. The classification datasets (i–viii) correspond to those as presented in Table 4.

Table S1. Level 1 classification results for the different datasets at each of the three study sites. BR, Mount Brandon; GT, Galtee Mountains; CM, Comeragh Mountains.

Table S2. Level 0 classification results for the different datasets at each of the three study sites. BR, Mount Brandon; GT, Galtee Mountains; CM, Comeragh Mountains.

Table S3. Level 2 classification results (PA, producer accuracy; UA, user accuracy) for the different datasets at each of the three study sites for the independent validation. BR, Mount Brandon; GT, Galtee Mountains; CM, Comeragh Mountains.
  9 in total

1.  Gradient forests: calculating importance gradients on physical predictors.

Authors:  Nick Ellis; Stephen J Smith; C Roland Pitcher
Journal:  Ecology       Date:  2012-01       Impact factor: 5.499

2.  Soil carbon stocks and changes in the Republic of Ireland.

Authors:  R W Tomlinson
Journal:  J Environ Manage       Date:  2005-07       Impact factor: 6.789

3.  Random forests for classification in ecology.

Authors:  D Richard Cutler; Thomas C Edwards; Karen H Beard; Adele Cutler; Kyle T Hess; Jacob Gibson; Joshua J Lawler
Journal:  Ecology       Date:  2007-11       Impact factor: 5.499

4.  Automatic habitat classification methods based on satellite images: a practical assessment in the NW Iberia coastal mountains.

Authors:  R A Díaz Varela; P Ramil Rego; S Calvo Iglesias; C Muñoz Sobrino
Journal:  Environ Monit Assess       Date:  2007-10-21       Impact factor: 2.513

5.  Satellite remote sensing, biodiversity research and conservation of the future.

Authors:  Nathalie Pettorelli; Kamran Safi; Woody Turner
Journal:  Philos Trans R Soc Lond B Biol Sci       Date:  2014-04-14       Impact factor: 6.237

6.  Projected range contractions of European protected oceanic montane plant communities: focus on climate change impacts is essential for their future conservation.

Authors:  Rory L Hodd; David Bourke; Micheline Sheehy Skeffington
Journal:  PLoS One       Date:  2014-04-21       Impact factor: 3.240

7.  Conditional variable importance for random forests.

Authors:  Carolin Strobl; Anne-Laure Boulesteix; Thomas Kneib; Thomas Augustin; Achim Zeileis
Journal:  BMC Bioinformatics       Date:  2008-07-11       Impact factor: 3.169

8.  Forest Cover Estimation in Ireland Using Radar Remote Sensing: A Comparative Analysis of Forest Cover Assessment Methodologies.

Authors:  John Devaney; Brian Barrett; Frank Barrett; John Redmond; John O Halloran
Journal:  PLoS One       Date:  2015-08-11       Impact factor: 3.240

9.  Quantitative evaluation of variations in rule-based classifications of land cover in urban neighbourhoods using WorldView-2 imagery.

Authors:  Mariana Belgiu; Lucian Drăguţ; Josef Strobl
Journal:  ISPRS J Photogramm Remote Sens       Date:  2014-01       Impact factor: 8.979
