
Wide-area mapping of small-scale features in agricultural landscapes using airborne remote sensing.

Jerome O'Connell¹, Ute Bradter¹, Tim G Benton¹.

Abstract

Natural and semi-natural habitats in agricultural landscapes are likely to come under increasing pressure with the global population set to exceed 9 billion by 2050. These non-cropped habitats are primarily made up of trees, hedgerows and grassy margins, and their amount, quality and spatial configuration can have strong implications for the delivery and sustainability of various ecosystem services. In this study high spatial resolution (0.5 m) colour infrared aerial photography (CIR) was used in object based image analysis for the classification of non-cropped habitat in a 10,029 ha area of southeast England. Three classification scenarios were devised using 4 and 9 class structures. The machine learning algorithm Random Forest (RF) was used to reduce the number of variables used for each classification scenario by 25.5% ± 2.7%. The proportions of votes from the 4 class hierarchy were made available to the 9 class scenarios and were the highest ranked variables in all cases. This approach allowed for misclassified parent objects to be correctly classified at a lower level. A single object hierarchy with 4 class proportion of votes produced the best result (kappa 0.909). Validation of the optimum training sample size in RF showed no significant difference between mean internal out-of-bag error and external validation. As an example of the utility of this data, we assessed habitat suitability for a declining farmland bird, the yellowhammer (Emberiza citrinella), which requires hedgerows associated with grassy margins. We found that ∼22% of hedgerows were within 200 m of margins with an area >183.31 m². The results from this analysis can form a key information source at the environmental and policy level in landscape optimisation for food production and ecosystem service sustainability.


Keywords: Aerial photography; Agriculture; Classification; Object orientated; Random forest; Spatial analysis

Year: 2015 PMID: 26664131 PMCID: PMC4643754 DOI: 10.1016/j.isprsjprs.2015.09.007

Source DB: PubMed Journal: ISPRS J Photogramm Remote Sens ISSN: 0924-2716 Impact factor: 8.979

Introduction

Agricultural land covers approximately 38% of the earth’s terrestrial surface (FAO, 2014) and therefore plays a key role in biodiversity, conservation and ecosystem service delivery at a variety of spatial scales (Billeter et al., 2008, Tscharntke et al., 2005). However, the fragmented habitats within these landscapes are likely to face increasing pressure, with the global population set to exceed 9 billion by 2050 driving demand that may require a doubling of food production (Godfray et al., 2010, Tilman et al., 2011). In agricultural landscapes non-cropped habitats are primarily made up of features such as trees, hedgerows and grassy margins. For the purpose of this study a margin is a buffer strip 2 m wide and a hedgerow is defined as a length of small trees and shrubs 20 m long and 5 m wide (Maddock, 2008). Trees are defined as having an individual crown >6 m² and may occur within a hedgerow, in isolation or as part of a woodland/forest. In the UK the total length of hedgerows fell from an estimated 800,000 km in 1956 to under 500,000 km in 1994 (Cornulier et al., 2011) and to 477,000 km in 2007 (Carey et al., 2008). Recent reforms to the Common Agricultural Policy (CAP) have encouraged farmers to manage such features by means of financial payments through various agri-environmental schemes overseen by the Department for Environment, Food and Rural Affairs (DEFRA) in England. Features such as hedgerows and margins are protected under UK (DEFRA, 1997, DEFRA, 2004) and EU (EU, 2007) law due to their importance as an ecological network across mono-cultured landscapes, with their distribution and connectivity having significant effects on landscape scale biodiversity and regional biota (Benton et al., 2003, Power, 2010). Many ecosystem services depend on the amount, quality and configuration of non-cropped land, as well as the landuse within fields (Benton, 2007, Billeter et al., 2008, Power, 2010). 
For example, simplification of the landscape through increased field size and reduced natural vegetation cover, especially of grassland areas, has been shown to increase pest damage due to lower populations of natural enemies (Gardiner et al., 2009). Given the importance of the amount, connectivity, heterogeneity and quality of non-cropped habitat for biodiversity and other ecosystem services, spatially extensive knowledge on the location and state of these habitats has now been recognized as a key variable in mapping and modeling ecosystem service delivery and sustainability at local, regional and national scales (Dale and Polasky, 2007, Kremen et al., 2007, Watson et al., 2011). Many of the processes involved need to be assessed at landscape level which makes traditional field based surveys expensive and time consuming. Remote sensing has long since been used to map biodiversity at a variety of spatial scales (Turner et al., 2003). However non-cropped features in the UK are often below the spatial resolution of many satellite sensors (i.e. <5 m), therefore alternative platforms are required for accurate delineation of their extent (O’Connell et al., 2013a). Some studies have used image fusion to enhance the spatial resolution of multispectral bands (Aksoy et al., 2010) while others have looked at sub-pixel image classification for the detection of small scale woody elements in the landscape (Foschi and Smith, 1997, Thornton et al., 2007). Many other approaches focus on the use of edge detection kernels to detect spectral boundaries which may be indicative of hedgerows or margins in agricultural landscapes (Fauvel et al., 2012, Rydberg and Borgefors, 2001). All these approaches are pixel based and rely on either high spatial resolution panchromatic data for contrast or multispectral data to classify the features based on spectral response. 
Other approaches have classified trees, hedge and shrub vegetation by combining multispectral and structural data via stereo imaging (Tansey et al., 2009), Light Detection And Ranging (LiDAR) (Hellesen and Matikainen, 2013) and radar (Scholefield et al., 2012). While these approaches offer robust mapping of tall vegetation, they generally do not enhance the classification of surface vegetation such as grassy margins, can be costly to acquire and are generally not suitable for regional scale mapping due to low spatial resolution or low spatial coverage. An alternative to the use of pixels is the use of objects, which can add additional information to features of interest that can then be utilised in an Object Based Image Analysis (OBIA) protocol (Blaschke, 2010). OBIA uses a variety of spectral, textural, geometric, thematic and contextual attributes built from the aggregation of homogeneous pixels into real world objects; the size of the uncorrelated feature space is therefore significantly increased when compared to traditional pixel based approaches (Benz et al., 2004, Mallinis et al., 2008, Myint et al., 2011). Non-cropped features in structured agricultural landscapes generally have distinctive geometric and contextual properties; e.g. margins typically have a high length-to-width ratio, show a high contrast to neighbouring features such as hedges and are located at the edge of fields. Several studies have used OBIA in the classification of non-cropped features such as trees and hedges with varying levels of success (Bock et al., 2005, Mueller et al., 2004, Sheeren et al., 2009, Tansey et al., 2009, Vannier and Hubert-Moy, 2008). Classification in OBIA has been dominated by algorithms such as maximum likelihood, nearest neighbour and Knowledge Based Classifiers (KBC) (Blaschke, 2010). 
A KBC incorporates expert knowledge in building a set of rules that utilise the attributes of each object in the image, and such classifiers have proved successful in the classification of non-cropped areas in the UK (O’Connell et al., 2013a, Tansey et al., 2009). However, the creation of a robust KBC is an iterative process which can be time consuming with respect to the selection of suitable features and thresholds (Stumpf and Kerle, 2011). Model transferability can be a significant issue where the thresholds and membership functions within the rule-base break down when the KBC is applied to data which was taken at a different time or location (O’Connell et al., 2013a). An alternative approach which is gaining popularity within the remote sensing community is the use of machine learning or ensemble algorithms such as Support Vector Machine (SVM) and Random Forest (RF) in OBIA (Duro et al., 2012). Such algorithms have advantages over conventional algorithms based on their ability to detect subtle and complex patterns in high dimensional data using robust statistical techniques with a high degree of automation (Blaschke, 2010, Rodriguez-Galiano et al., 2012). Previous experience of RF (Breiman, 2001) from some of the authors of this study in the classification of complex habitats in the Yorkshire Dales National Park in the UK (Bradter et al., 2011) gave an indication of its potential when applied to environmental and remote sensing data. RF is based on an ensemble of decision trees, each grown on a bootstrap sample (drawn with replacement) that contains roughly two thirds of the distinct observations. This “bagging” approach makes the algorithm less sensitive to noise in the data (Rodriguez-Galiano et al., 2012), including variations in reflectance due to solar zenith or the Bidirectional Reflectance Distribution Function (BRDF) (Chan and Paelinckx, 2008). This concept of machine learning by randomisation over multiple iterations allows discernible patterns to emerge from highly dimensional data. 
The set of variables used at each decision node is randomly selected, which can reduce the strength of individual trees but also reduces correlation between trees and thus reduces the generalisation error (Liaw and Wiener, 2002). The ratio of misclassifications to the total number of Out Of Bag (OOB) elements (i.e. the remaining one third of data) contributes to an unbiased estimate of generalisation error. This error converges as the number of trees increases; therefore adding more trees does not overfit the data (Cutler et al., 2007). RF uses the “best” variables at each node based on node purity. Several options to calculate variable importance exist, including permutation importance, which is calculated by randomly permuting all values in the selected variable and using the difference in OOB error as an indication of the importance of that variable to the classifier. Pruning of trees is not necessary as the final classification is based on the majority vote of all trees within the forest. The package randomForest 4.6–7 (Liaw and Wiener, 2002) was used in the R (3.0.1) statistical coding environment (R Development Core Team, 2014), which is based on the original Fortran programs by Breiman and Cutler (https://www.stat.berkeley.edu/~breiman/RandomForests/cc_software.htm). The objective of this study was to create a novel and robust classification protocol for the mapping of non-cropped features in a “case study” agricultural landscape. The protocol needed to be semi-automated to enable wide area mapping of such features with a minimal number of variables. Based on these criteria and results from previous studies (Bradter et al., 2013, O’Connell et al., 2013a) as well as a review of some comparative studies with other algorithms (Chan and Paelinckx, 2008, Duro et al., 2012, Lawrence et al., 2006, Pal, 2005, Rodriguez-Galiano et al., 2012), it was felt that OBIA with the ensemble classifier RF may yield the best results. 
The combination of both approaches in remote sensing is a relatively new area of research and yields some uncertainties in the areas of optimisation of RF parameters, model transparency, training and validation size, hierarchical accuracy assessment and feature selection/importance with respect to class and object hierarchy. This study addresses some of the aforementioned uncertainties through the mapping of non-cropped areas using OBIA and the machine learning classifier RF.
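The bagging and OOB mechanism described above can be illustrated with a minimal sketch. It uses only the Python standard library and is not the randomForest implementation; the function names are hypothetical. It simulates bootstrap draws of the same size as the 5713-object training set used later in this study and reports the mean share of objects left out of bag, which for large samples approaches 1/e (about 37%, i.e. roughly the "one third" referred to above).

```python
import random


def bootstrap_split(n_samples, rng):
    """Draw one bootstrap sample with replacement.

    Returns (in_bag, out_of_bag) lists of object indices.
    """
    in_bag = [rng.randrange(n_samples) for _ in range(n_samples)]
    drawn = set(in_bag)
    out_of_bag = [i for i in range(n_samples) if i not in drawn]
    return in_bag, out_of_bag


def mean_oob_fraction(n_samples, n_trees, seed=0):
    """Average share of objects left out of bag over n_trees bootstrap draws."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_trees):
        _, out_of_bag = bootstrap_split(n_samples, rng)
        total += len(out_of_bag) / n_samples
    return total / n_trees


# 5713 is the training sample size reported in this study; 100 trees is an
# arbitrary choice made only to keep the illustration fast.
oob_fraction = mean_oob_fraction(n_samples=5713, n_trees=100)
print(f"mean out-of-bag fraction: {oob_fraction:.3f}")
```

Each tree is therefore evaluated on objects it never saw during training, which is why the OOB error can serve as an internal estimate of generalisation error.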

Materials

Study area

The study area was located in East Anglia, England (52°19′07″ N, 0°49′43″ E) in an intensively managed agricultural landscape of arable crops and temperate grassland over a mosaic of lime rich loam, clay loam and sandy soils (NSRI, 2011). The topography of the site was undulating with an elevation range of 22–73 m, mean of 45 m and total area of 10,029 ha. Annual rainfall for the region was 810 mm, with an average of 130 days of rain per year (Met, 2012). Despite being intensively managed, the site contained various designated areas (Fig. 1) including; 12.4% under the England Habitat Network (Catchpole, 2007), 5.9% under Countryside Stewardship/Environmental Stewardship and 0.3% under priority grassland habitat via the UK Biodiversity Action Plan (BAP) (JNCC, 2007).

Fig. 1

Location of the study area. Designated areas are outlined: Countryside Stewardship (CSS)/Environmental Stewardship (ES), England Habitat Network (EHN), Mire Fen Bog (MFB) and Lowland meadows/Lowland dry acid grassland (BAP priority habitats). Strategi data downloaded from the EDINA (Edinburgh Data and Information Access) Digimap OS service. ©Crown Copyright/database right 2009. An Ordnance Survey/EDINA supplied service. Cities Revealed® aerial photography copyright The GeoInformation® Group 2012.

Data

An airborne survey of the study area was commissioned for the summer of 2012 and completed on the 6th of September that year. The survey used a Vexcel UltraCam-Xp (Vexcel, 2008) pan-sharpened to 25 cm spatial resolution at an altitude of 4230 m ± 1 m. The data was projected to British National Grid (EPSG: 27700) with level 3 geometric correction and cubic convolution resampling. Landcover and boundary information was acquired from the Ordnance Survey (OS) MasterMap topography layer (OS, 2010). The topography layer gave information on nine themes: land area classifications, buildings, roads, tracks and paths, rail, water, terrain and height, heritage and antiquities, structures and administrative boundaries at 1:1250 to 1:10000 scale. Accuracy assessment of the various categories (kappa 0.68) from a previous study (O’Connell et al., 2013a) was used to identify themes that gave the most robust representation with respect to this study. Field data was also collected between late June and late August 2012 in the cropped and non-cropped areas. The sampled farms were initially stratified based on soils data (NSRI, 2011), with some constraints imposed due to land access permission. For each farm, a representative sample of vegetation communities present within the non-cropped areas was made. Homogeneous sections of hedgerow were mapped using a Global Positioning System (GPS) with waypoints at either end. Individual trees were also mapped with a GPS, with tree height and species also recorded. Crop type in fields was recorded as part of the hedge/tree survey. Additional visual ground-truthing was aided by a July 2008 GeoEye 1 image and commissioned 25 cm aerial photography from June 2014 (ARSF, 2014). Information on designated areas (Fig. 1) was accessed from the Natural England GIS database (NE, 2012).

Methods

The processing chain for this study was divided up into four key processes (bold in Fig. 2), each of which is outlined in the following sections.

Fig. 2

Processing chain identifying the four main processing steps in bold, where RF is Random Forest and signifies the classification process and Masking indicates the rule based classifier used to mask out certain MasterMap classes.

Preprocessing

Fifty-two image tiles (13 × 4 tiles) were produced from the airborne survey; 17,310 pixels across track × 11,310 pixels along track (pan-sharpened), with a 60% overlap along track and 40% overlap across track. Sub-pixel accuracy image mosaicing and histogram based colour matching was undertaken using Erdas Imagine’s MosaicPro software (Intergraph, 2013). To assist in the classification process the Normalised Difference Vegetation Index (NDVI) (Sellers, 1985) and Enhanced Vegetation Index 2 (EVI2) (Jiang et al., 2008) were calculated. NDVI correlates with vegetation health and vigor (Carlson and Ripley, 1997) and EVI2 has been shown to be more sensitive than traditional indices in highly vegetated landscapes or where soil reflectance can be prominent (Jiang et al., 2008, O’Connell et al., 2013b). The Canny edge-detection algorithm (Canny, 1986) was used on the EVI2 (O’Connell et al., 2013a) image to emphasize boundaries using eCognition software (Trimble, 2013a).
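For reference, the two vegetation indices can be written as short functions. This is a minimal sketch assuming per-pixel near-infrared and red reflectance values in the range 0 to 1; the function names are illustrative, and the 8 bit scaling applied to EVI2 in this study is not reproduced because its exact form is not stated. The EVI2 coefficients follow Jiang et al. (2008).

```python
def ndvi(nir, red):
    """Normalised Difference Vegetation Index: (NIR - Red) / (NIR + Red)."""
    denominator = nir + red
    if denominator == 0:
        return 0.0  # avoid division by zero on empty or no-data pixels
    return (nir - red) / denominator


def evi2(nir, red):
    """Two-band Enhanced Vegetation Index (Jiang et al., 2008).

    EVI2 = 2.5 * (NIR - Red) / (NIR + 2.4 * Red + 1)
    """
    return 2.5 * (nir - red) / (nir + 2.4 * red + 1.0)


# Example with hypothetical reflectances for a healthy vegetated pixel.
example_nir, example_red = 0.5, 0.1
print(f"NDVI = {ndvi(example_nir, example_red):.3f}")
print(f"EVI2 = {evi2(example_nir, example_red):.3f}")
```

Because the EVI2 denominator includes a constant term, the index saturates less quickly than NDVI over dense canopies, which is consistent with the sensitivity noted above.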

Masking and class structure

The MasterMap topography layer (OS, 2010) was used to create a mask using landcover categories of known accuracy based on a previous study (O’Connell et al., 2013a). The topography layer was first converted to objects in eCognition on a separate map using the chessboard segmentation algorithm (Trimble, 2013b) and then synchronised with the airborne data on the main processing level projected to British National Grid (EPSG: 27700). A simple rule based classifier (Appendix A) using the MasterMap FeatCode, DescTerm and DesGroup attributes (OS, 2010) and a vegetation threshold was used to extract all objects for the predefined classes. The primary classes used, with FeatCode in parentheses, were: Buildings (10021), Manmade (10056), Trees (10111), Mixed (10053) and Water (10086). The area coverage of these classes (6.75% of the study area) was then used as a mask in the segmentation of the airborne data on the main processing level (Fig. 2). The class structure was based on an initial 30 spectral classes derived from the Iterative Self-Organizing Data algorithm (ISODATA) (Ball and Hall, 1965) and landcover information from the field data. A class hierarchy was present for some classification scenarios (Fig. 3) depending on the segmentation procedure (Section 3.3); however, all scenarios were completed using the same final nine class structure. Many of the arable crops were merged into two distinct classes, closed canopy (Crop 1) and open canopy (Crop 2), as the focus was on mapping non-cropped areas. Grass was defined as intensively managed grassland and Scrub was characterised as having a heterogeneous structure and often occurring in field corners, urban parks or around woodlands. The class Crop (4 class hierarchy) was based on a merger of all vegetation classes that were not non-crop (Fig. 3). Sparse vegetation was based on a mean spectral threshold (EVI2 < 45, 8 bit scaled) which was obtained from MasterMap sealed surface classes, field data and visual examination. 
The Sparse and Shadow classes were present as complete objects in all segmentation scenarios (Section 3.3), therefore both classes were included in both the 4 and 9 class scenarios (Fig. 3).

Fig. 3

Schematic of 4 and 9 class structure with classification scenarios: H1 and F1 are based on the 4 class structure and H1H2, F1F2 and F2 on the 9 class structure. H1 and F1 proportion of votes are used as input variables for H1H2 and F1F2. F2 and F1F2 use the same objects.

Image segmentation

In this study the multiresolution segmentation algorithm (Baatz and Schäpe, 2000) was used in the main segmentation process using eCognition. To segment, we used two approaches: flat (single) and hierarchical (multi-scale) segmentation. Multi-scale segmentation creates a hierarchical object structure based on the premise that real world features are found in an image at different scales of analysis (Blaschke, 2010). The object hierarchy was based on the class structure, with a top-down approach adopted to classify cropped objects at larger scale factors (e.g. H1 in Fig. 3), thereby focusing the analysis on the smaller objects in the non-cropped areas at lower segmentation scales (e.g. H1H2 in Fig. 3). Optimisation of the scale parameter for both the flat and hierarchical segmentation used the Estimation of Scale Parameter (ESP 2) tool (Drăguţ et al., 2014), which eliminated the subjectivity of reference objects (Witharana and Civco, 2014). For the hierarchical approach the ESP 2 settings were a top down hierarchy with a starting scale and step size of 1 for level 1, 5 and 25 for level 2 and 100 and 25 for level 3 and number of loops set to 400. For the flat approach the starting scale and step size for level 1 was 1, 1 and 10 for level 2 and the number of loops set to 300. The tool outputs a graph with local variance and rate of change of local variance plotted as a function of scale factor. This was used to cross-reference the retained scale factors by visually examining break points in the rate of change curve after continuous decay (Drăguţ et al., 2010).
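The rate of change of local variance used to read the ESP graph can be sketched as follows. The formula is the percentage change in local variance between successive scale levels, as described by Drăguţ et al. (2010). The automated rule for flagging break points is an assumption made for illustration only, since in this study the curve was examined visually; the local variance values and function names are hypothetical.

```python
def rate_of_change(local_variance):
    """Percentage change in local variance (LV) between successive scales.

    ROC = (LV_current - LV_previous) / LV_previous * 100
    """
    return [
        (current - previous) / previous * 100.0
        for previous, current in zip(local_variance, local_variance[1:])
    ]


def candidate_scales(scales, local_variance):
    """Flag scales where the rate of change rises after having decayed.

    Illustrative stand-in for the visual inspection of break points.
    roc[i] describes the step from scales[i] to scales[i + 1].
    """
    roc = rate_of_change(local_variance)
    picks = []
    for i in range(1, len(roc)):
        if roc[i] > roc[i - 1]:
            picks.append(scales[i + 1])
    return picks


# Hypothetical local variance values for six scale factors.
demo_scales = [10, 20, 30, 40, 50, 60]
demo_lv = [100.0, 120.0, 132.0, 138.6, 152.46, 155.51]
print(candidate_scales(demo_scales, demo_lv))
```

In this toy series the rate of change decays from 20% to 5% and then jumps to about 10% at scale 50, which is the kind of break the visual inspection looks for.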

Classification

Training data

Training data were extrapolated from the field data, where multiple points were created from individual non-cropped polygons delineated from GPS waypoints and crop varieties in fields. This extrapolation of field data labelled 5.59% (4799) of the total population of objects for the hierarchical and 3.22% (5713) for the flat scenario. Previous studies have shown that classifiers like RF tend to produce higher accuracies for majority over minority classes (Chen et al., 2004, Lin and Chen, 2012) and show greater variability in variable importance values as a result of class imbalance (Janitza et al., 2013). The 4 class objects in this study were therefore resampled to 15% of the smallest sample size (i.e. Shadow) to maintain a sufficient level of class balance. Despite the increased use of machine learning algorithms in remote sensing, there has been little reporting of the influence of training data size on classification accuracy. It has also been reported that RF does not need external validation due to the prediction accuracy calculated using the OOB sample (Breiman, 2001). However, the robustness of this calculation is based on several assumptions, one of which concerns the size of training sets and subsequent OOB sample sizes. In this study we compared the error rates between the internal (OOB error) and external validation. A total sample size of 5713 objects was used, with 30% of the data used for external validation. Training data were sampled randomly from the remaining 70% for the external validation and from the full dataset for the internal validation. Training data size was varied between 10% and 70% in steps of 10% with external and internal (OOB) error rates calculated. Validation was repeated 10 times for each training data size using a new random sample (for the 70% size of the external validation all remaining points were used). Error rates for the internal and external validation were compared using a generalised least squares (GLS) model. 
In the GLS model different variances per factor level were allowed (likelihood ratio test between model with and without allowing for different variances per stratum: L = 37.0, df = 13, p = 0.0004), with sample size fitted as a fixed factor. Error stabilisation as a function of sample size was also investigated for the OOB error to see if an asymptote would be reached with respect to increasing sample size. Again 10 random samples per size category were used (10–100% in steps of 10%), with a GLS model allowing for different variances per sample size (likelihood ratio test between model with and without allowing for different variances per stratum: L = 72.4, df = 9, p < 0.0001). Differences between different levels of the factor sample size were explored using contrasts.
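The sampling design for the internal versus external comparison can be sketched as below. This is an illustration under stated assumptions: the hold-out set is taken to be fixed across repetitions (the text does not say whether it was redrawn), training sizes are expressed as fractions of the full 5713 objects, and the function name and return structure are hypothetical. Only index sets are produced; fitting the forests and the GLS model is outside the sketch.

```python
import random


def validation_design(n_objects, holdout_frac=0.30, repetitions=10, seed=0):
    """Build index sets for the internal (OOB) and external validation runs.

    Returns (holdout, design) where design is a list of tuples
    (train_fraction, repetition, external_train, internal_train).
    """
    rng = random.Random(seed)
    indices = list(range(n_objects))
    rng.shuffle(indices)

    n_holdout = round(n_objects * holdout_frac)
    holdout = indices[:n_holdout]   # used only for external validation
    pool = indices[n_holdout:]      # remaining 70%, source of external training data

    design = []
    for tenth in range(1, 8):       # training sizes 10% to 70% in steps of 10%
        frac = tenth / 10
        # At 70% the request equals the whole pool, so all remaining points are used.
        n_train = min(round(n_objects * frac), len(pool))
        for rep in range(repetitions):
            external_train = rng.sample(pool, n_train)     # never overlaps hold-out
            internal_train = rng.sample(indices, n_train)  # drawn from the full dataset
            design.append((frac, rep, external_train, internal_train))
    return holdout, design


holdout, design = validation_design(n_objects=5713)
print(len(holdout), len(design), len(design[-1][2]))
```

Keeping the external training draws disjoint from the hold-out set is what makes the external error an independent check on the OOB estimate.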

Variable selection

Once the segmentation process was completed object variables (i.e. features) were calculated for each level in each scenario. The eCognition software offers a significant array of object based variables. However many of these variables are highly correlated and while RF has been shown to deal well with correlated predictors (Chan and Paelinckx, 2008, Gislason et al., 2006), the additional computational effort required to process them is inefficient for a large scale mapping protocol. Therefore a variable selection process was used (Genuer et al., 2010) which selected variables based on ranked mean importance as a function of OOB error. The function used (Bradter, 2013) was written in R using the parameters; ntree: 2000, mtry: square root of the number of variables with the result averaged over 50 repetitions (Genuer et al., 2010). The variables used were divided up into 5 main categories (spectral, geometric, neighbourhood, textural and thematic) with the total number of variables changing for each classification scenario (Fig. 3b). Spectral referred to variables that were derived directly from the 6 spectral images and the Canny image. Geometric variables were based on within object geometric properties (e.g. length/width) and ratio based properties (e.g. area to super object). Neighbourhood variables were derived from spectral and textural ratios, constrained by a distance or object count window. Texture variables were calculated after Haralick analysis (Haralick et al., 1973) and based in all directions for directional invariance (Trimble, 2013b). Thematic variables were based on the FeatCode attribute from neighbouring and parent objects as well as proportion of votes from the 4 class models (see Section 3.4.3).

Random forest

All RF models were run using the package randomForest (Liaw and Wiener, 2002) in R 3.0.1 (R Development Core Team, 2014). Predictions using the selected variables were made using the following parameters: 50 iterations, ntree set to 500, proximity and importance set to true (importance based on mean decrease in accuracy) and all remaining parameters as default. The mtry parameter was varied between 1 and 15 to assess its effect on OOB error. The proportion of votes was used instead of the majority prediction as a variable for the 9 class scenarios. This gave a better indication of the confidence of the 4 class result rather than a single categorical value which would have resulted from a majority vote. This “soft” class hierarchy methodology is ideally suited to RF as it allows for discernible patterns to emerge at each level without error propagation due to local classifiers. Classification accuracy was assessed using both flat and hierarchical measures. The flat approach used overall, user, producer and kappa measures (Congalton and Green, 2008) which were derived from a confusion matrix generated from the OOB data using the R package Caret (6.0–37) (Kuhn, 2015). Hierarchical assessment differs from traditional approaches in that it encompasses the multi-level class structure in the final estimation of accuracy. The hierarchical assessment in this study was based on the hierarchical F measure described by Kiritchenko et al. (2005) and recommended by Silla and Freitas (2011). In short (see Appendix B for more detail), the measure extends the regular precision, recall and F measures by accounting for the location of each observed and predicted class of each case (object) in the class hierarchy (Fig. 3). Once completed, randomForest classification results were exported into the eCognition software where the MasterMap masked classes (i.e. Buildings, Manmade, Trees, Mixed and Water) were segmented using the same scale factors as the classification scenarios. 
A k-Nearest-Neighbour (k = 1) classifier was built for each MasterMap class using all the objects classified in the RF model as training data. Post-classification, various morphological processes (e.g. growing and shrinking) were used to adjust class boundaries, as previous work had shown the MasterMap data to have poor delineation of many natural and manmade features (O’Connell et al., 2013a). A simple set of rule based classifiers was also created to remove individual errors; e.g. “classify Crop 2 as Trees if the object is completely enclosed by Trees, is <5 × 5 pixels and is within ±0.0125 EVI2 of the mean of the class Trees”. A random sample of 450 objects was selected from the kNN classification to assess its accuracy based on the RF training data.
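The "soft" class hierarchy described in this section, in which the proportion of votes from the 4 class model is passed to the 9 class model as additional variables, can be sketched as follows. The parent class names are those of the 4 class structure; the function names, the toy votes and the toy feature values are hypothetical, and the sketch does not call randomForest.

```python
from collections import Counter

# Parent classes of the 4 class structure.
PARENT_CLASSES = ("Sparse", "Crop", "Noncrop", "Shadow")


def vote_proportions(tree_votes):
    """Share of trees voting for each parent class, in PARENT_CLASSES order."""
    counts = Counter(tree_votes)
    n_trees = len(tree_votes)
    return [counts[name] / n_trees for name in PARENT_CLASSES]


def add_parent_votes(features, tree_votes):
    """Append the parent-level vote proportions to an object's feature vector."""
    return list(features) + vote_proportions(tree_votes)


# Toy example: 10 trees, 6 vote Crop and 4 vote Noncrop for one object.
votes = ["Crop"] * 6 + ["Noncrop"] * 4
props = vote_proportions(votes)
child_features = add_parent_votes([0.42, 17.5], votes)
print(props)
```

A hard majority vote would label this object Crop and discard the 40% of trees that disagreed. Passing the proportions instead lets the 9 class model still assign a non-crop child class when the other variables support it.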

Spatial analysis

To explore the utility of the classification map, we assessed the spatial distribution of non-cropped features within the study area. Spatial clustering was assessed in ArcGIS (ESRI, 2012) using nearest neighbour analysis on margins and hedgerows, based on Euclidean distance across the whole study site for both classes. To examine the degree of spatial clustering as a function of area, incremental spatial autocorrelation (Moran’s I) was used on margins and hedgerows over 15 stages at increments of 30 m starting at 300 m. Habitat fragmentation was assessed for hedgerows and margins using 6 categories of fragmentation (interior, perforated, edge, transitional, patch, and undetermined) as outlined by Riitters et al. (2000). This was done using the geoscientific software SAGA (SAGA, 2015) and the add-on package Module Fragmentation (Conrad, 2008) with a maximum and minimum neighbourhood setting of 10 and 3 respectively. To provide a specific focus, we used the map to identify potential nesting habitat (see Appendix D) for the yellowhammer (Emberiza citrinella). The criteria were to identify large areas of margin that were in close proximity to long lengths of hedgerow (Douglas et al., 2010, Morris et al., 2001).
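The proximity criterion for nesting habitat can be illustrated with a simplified sketch. It assumes hedgerows and margins are reduced to centroid coordinates in metres (for example British National Grid eastings and northings), which is a simplification of the polygon-based GIS analysis. The 200 m distance and 183.31 m² area thresholds are those quoted in the abstract; the function name and toy coordinates are hypothetical.

```python
import math


def share_of_hedgerows_near_margins(hedgerows, margins, max_dist=200.0, min_area=183.31):
    """Proportion of hedgerow centroids within max_dist of a large margin.

    hedgerows: list of (x, y) centroids in metres.
    margins:   list of (x, y, area_m2) tuples.
    """
    large_margins = [(x, y) for x, y, area in margins if area > min_area]
    if not hedgerows:
        return 0.0
    near = 0
    for hx, hy in hedgerows:
        if any(math.hypot(hx - mx, hy - my) <= max_dist for mx, my in large_margins):
            near += 1
    return near / len(hedgerows)


# Toy data: one large margin and one margin below the area threshold.
toy_hedgerows = [(0, 0), (500, 0), (1000, 1000), (150, 0)]
toy_margins = [(100, 0, 250.0), (520, 0, 100.0)]
toy_share = share_of_hedgerows_near_margins(toy_hedgerows, toy_margins)
print(f"{toy_share:.0%} of hedgerows are near a large margin")
```

The hedgerow at (500, 0) lies close to a margin, but that margin is too small to count, so only two of the four toy hedgerows qualify.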

Results

ESP 2 analysis identified a scale parameter of 295 for H1 giving 19,880 objects and a scale parameter of 110 for H2 giving 85,849 objects (Fig. 3). For the flat approach a single scale parameter of 96 was selected from a possible three (i.e. 422, 256, 96) giving 177,419 objects.

Training sample size

For training sample size the interquartile range within each sample size decreased with increasing sample size (Fig. 4).

Fig. 4

Box plots showing External (a) and Internal (OOB) (b) error as a function of sample size over 10 repetitions, where P10 is 10% sample size, P20 is 20% sample size etc. Whiskers represent the maximum and minimum, the top and bottom of each box the 3rd and 1st quartiles, and the centreline the median. The Y axis applies to both plots.

There was no significant difference found in the OOB error rate between sample sizes of 90% and 100% (parameter estimate: −0.0014 ± 0.0007, p = 0.0632) and differences between sample sizes of 80% and 90% were already very small (parameter estimate: −0.0024 ± 0.0010, p = 0.025). For training sample size the mean error rates from the internal OOB validation were not significantly different to those from the external validation with Likelihood ratio value (L) of 1.06 (p-value = 0.304) (Zuur et al., 2009).

Variable selection

Results from the mtry parameter tests showed little variation in OOB error (±0.0025) when varying mtry. For this reason mtry was set to the square root of the total number of variables, which is the default setting for classification in randomForest (Liaw and Wiener, 2002). A total of 90 variables for the F2, 94 for the F1F2 and 108 for the H1H2 model (Appendix C) were created in eCognition. Variable selection (Bradter, 2013) reduced the number of variables by 25.5% ± 2.7% across the three models: 27.8% for the F2 model, 26.6% for F1F2 and 22.2% for the H1H2 model (see Appendix C for more detail). Variable importance showed some similarities across the 3 classification models when proportion of votes is excluded (Table 1). Classification accuracy for the H1 and F1 models, expressed as kappa, was 0.794 and 0.920 respectively (Table 2). Of the 9 class scenarios, F2 performed poorest and F1F2 performed best, indicating the importance of the 4 class proportion of votes, as this was the only difference between the two models. Overall accuracy for F1F2 was also very high (Table 3), with a precision, recall and F-measure of 0.959, which compares very favourably with other non-cropped mapping studies (Aksoy et al., 2010, Rydberg and Borgefors, 2001, Sheeren et al., 2009, Tansey et al., 2009).
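The flat accuracy measures can be recomputed directly from the F1F2 error matrix reported in Table 3. The sketch below is plain Python rather than the caret routine used in the study; the function name is illustrative. Rows are taken as the classified classes and columns as the reference classes, which matches the user and producer values in the table.

```python
# Error matrix for the F1F2 model, copied from Table 3.
# Rows: classified classes. Columns: reference classes.
CLASSES = ["Sparse", "Grass", "Crop 1", "Crop 2", "Scrub",
           "Trees", "Hedges", "Margins", "Shadow"]
MATRIX = [
    [755, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 471, 2, 5, 21, 0, 0, 0, 0],
    [5, 13, 697, 6, 11, 0, 0, 0, 0],
    [0, 10, 2, 906, 26, 0, 0, 0, 0],
    [0, 21, 4, 35, 724, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 459, 44, 13, 0],
    [0, 0, 0, 0, 0, 63, 459, 73, 0],
    [1, 0, 0, 0, 0, 12, 92, 476, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 307],
]


def accuracy_measures(matrix):
    """Return overall accuracy, Cohen's kappa, user and producer accuracies."""
    k = len(matrix)
    n = sum(sum(row) for row in matrix)
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(matrix[i][j] for i in range(k)) for j in range(k)]
    diagonal = [matrix[i][i] for i in range(k)]

    overall = sum(diagonal) / n
    # Agreement expected by chance from the row and column marginals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / n ** 2
    kappa = (overall - expected) / (1 - expected)

    user = [d / r for d, r in zip(diagonal, row_totals)]
    producer = [d / c for d, c in zip(diagonal, col_totals)]
    return overall, kappa, user, producer


overall, kappa, user, producer = accuracy_measures(MATRIX)
print(f"overall = {overall:.3f}, kappa = {kappa:.3f}")
for name, u, p in zip(CLASSES, user, producer):
    print(f"{name:8s} user = {u * 100:6.2f}  producer = {p * 100:6.2f}")
```

The recomputed kappa rounds to 0.909, in agreement with the F1F2 value in Table 2. The lowest user and producer accuracies fall on Hedges and Margins, the two classes most often confused with each other.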

Table 1

The top 20 variable importance rankings across all 9 classes for the 3 different classification models (i.e. F2, F1F2 and H1H2). Bracketed letters indicate variable category. See Appendix C for more details.

Rank	F2	F1F2	H1H2
1	GLCM_Dissi [d]	Crop [e]	Crop [e]
2	Mean_Vis1 [a]	Noncrop [e]	GLCM_Dissi [d]
3	Mean_Vis3 [a]	Sparse [e]	Noncrop [e]
4	GLCM_Corre [d]	GLCM_Dis_1 [d]	GLCM_Corre [d]
5	Mean_Vis2 [a]	Mean_Vis3 [a]	Ratio_NDVI [a]
6	Length/Width [b]	Length/Width [b]	Asymmetry [b]
7	Asymmetry [b]	GLCM_Corre [d]	Mean_Vis3 [a]
8	GLCM_Mean_ [d]	Mean_Vis1 [a]	Mean_Vis2 [a]
9	Border length [b]	Mean_Vis2 [a]	Circular_M [b]
10	FeatCode [e]	Asymmetry [b]	Mean_Vis1 [a]
11	Edge_Contr [c]	Shadow [e]	Ratio_EVI2 [a]
12	StD NIR [a]	GLCM_Mean_ [d]	Brightness [a]
13	Diff to Canny [a]	StD NIR [a]	GLCM_Ang_2 [d]
14	Mean_EVI2 [a]	Dist to Sparse [c]	GLCM_Mean_ [d]
15	StD Vis3 [a]	HSI_Transf [a]	Dist to border [c]
16	Diff to EVI2 [a]	Length m [b]	HSI Transf [a]
17	Mean NDVI [a]	GLCM_Ang_2 [d]	Dist to Mixed [c]
18	Mean NIR [a]	GLCM Entro [d]	Sparse [e]
19	HSI Transf [a]	Mean_EVI2 [a]	StD NIR [a]
20	Area m² [b]	Diff to Canny [a]	FeatCode [e]

[a] Spectral.

[b] Geometry.

[c] Neighbourhood.

[d] Texture.

[e] Thematic.

Table 2

Kappa, OOB error and standard deviation in OOB (StD OOB) error (based on 50 forests) for all 4 and 9 class models.

Accuracy
Scenario	Kappa	OOB	StD OOB
4 Class
H1	0.794	0.114	0.002
F1	0.920	0.051	0.001

9 Class
H1H2	0.882	0.102	0.001
F1F2	0.909	0.080	0.001
F2	0.865	0.119	0.001

Table 3

Error matrix and accuracy measure for the F1F2 model showing overall, user, producer, kappa, precision, recall and F-measure values.

	Sparse	Grass	Crop 1	Crop 2	Scrub	Trees	Hedges	Margins	Shadow	Total	User
Sparse	755	0	0	0	0	0	0	0	0	755	100.00
Grass	0	471	2	5	21	0	0	0	0	499	94.39
Crop 1	5	13	697	6	11	0	0	0	0	732	95.22
Crop 2	0	10	2	906	26	0	0	0	0	944	95.97
Scrub	0	21	4	35	724	0	0	0	0	784	92.35
Trees	0	0	0	0	0	459	44	13	0	516	88.95
Hedges	0	0	0	0	0	63	459	73	0	595	77.14
Margins	1	0	0	0	0	12	92	476	0	581	81.93
Shadow	0	0	0	0	0	0	0	0	307	307	100.00
Total	761	515	705	952	782	534	595	562	307	5713
Producer	99.21	91.46	98.87	95.17	92.58	85.96	77.14	84.70	100.00

			Overall	Kappa	Precision	Recall	F-meas
			91.97	0.9087	0.9593	0.9593	0.9593
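The overall accuracy, kappa, and user's and producer's accuracies in Table 3 follow directly from the error matrix. The sketch below recomputes them with the standard formulas, taking user's accuracy along rows and producer's accuracy along columns, as the table is laid out. It does not attempt to reproduce the precision, recall and F-measure figures, because the averaging used for those is not stated.

```python
# Error matrix for the F1F2 model, copied from Table 3 (same class order
# for rows and columns).
classes = ["Sparse", "Grass", "Crop 1", "Crop 2", "Scrub",
           "Trees", "Hedges", "Margins", "Shadow"]
matrix = [
    [755,   0,   0,   0,   0,   0,   0,   0,   0],
    [  0, 471,   2,   5,  21,   0,   0,   0,   0],
    [  5,  13, 697,   6,  11,   0,   0,   0,   0],
    [  0,  10,   2, 906,  26,   0,   0,   0,   0],
    [  0,  21,   4,  35, 724,   0,   0,   0,   0],
    [  0,   0,   0,   0,   0, 459,  44,  13,   0],
    [  0,   0,   0,   0,   0,  63, 459,  73,   0],
    [  1,   0,   0,   0,   0,  12,  92, 476,   0],
    [  0,   0,   0,   0,   0,   0,   0,   0, 307],
]

n = sum(sum(row) for row in matrix)
row_totals = [sum(row) for row in matrix]
col_totals = [sum(col) for col in zip(*matrix)]
diagonal = [matrix[i][i] for i in range(len(classes))]

# Observed agreement, chance agreement, and Cohen's kappa.
overall = sum(diagonal) / n
expected = sum(r * c for r, c in zip(row_totals, col_totals)) / n ** 2
kappa = (overall - expected) / (1 - expected)

# Per-class accuracies in percent.
user = {c: 100 * d / r for c, d, r in zip(classes, diagonal, row_totals)}
producer = {c: 100 * d / t for c, d, t in zip(classes, diagonal, col_totals)}

print(f"overall = {100 * overall:.2f}%, kappa = {kappa:.4f}")
for c in classes:
    print(f"{c:8s} user = {user[c]:6.2f}  producer = {producer[c]:6.2f}")
```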

Spatial statistics

The classification provides a wide-area, fine-scaled map of non-cropped habitat (Fig. 6). As a proportion of the total area, Trees, Hedges and Margins covered 12.39%, 1.90% and 3.58% respectively. The Sparse class had the highest percentage cover (36.25%) and, based on the field data and the image acquisition date, it was assumed that the vast majority of this coverage was associated with annual cereals such as wheat, oats and barley. Nearest neighbour analysis of margins and hedgerows revealed that both classes were significantly clustered in space, with margins having a nearest neighbour ratio of 0.64 (z score = −78.73, P < 0.005) and hedgerows 0.65 (z score = −64.53, P < 0.005). Based on area, incremental spatial clustering analysis of margins showed no significant autocorrelation, whereas hedgerows showed a single peak (Appendix D) at 668 m (Moran's I = 0.04, z score 30.72, P < 0.005). Habitat fragmentation analysis on the distribution of hedgerows classified 20.9% as transitional, 12.52% as edge and 66.44% as patch. The results for margins were similar for transitional (19.56%) and patch (55.47%), with the edge category at 21.43%. The declining farmland bird, the Yellowhammer, prefers nesting in hedgerows and foraging for its young in margins within 200 m of the nest site (Douglas et al., 2010, Douglas et al., 2009). Based on these criteria 9.35% of the total population of hedgerows were within 25 m of areas of margin that were >183.31 m2 (Table 4). A cumulative total of 21.53% (40.21 ha) of all hedgerows were within 200 m of large margin areas (Appendix D).
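To make the nearest neighbour ratio concrete: a ratio below 1 with a negative z score indicates clustering relative to a random pattern. The sketch below uses the common Clark-Evans style formulation found in GIS toolboxes. The text does not state which software or exact formulation was used, so this is an assumption, and the point coordinates are made-up stand-ins for object centroids.

```python
import math


def average_nearest_neighbour(points, area):
    """Return (ratio, z_score) for the average nearest neighbour statistic.

    ratio = observed mean nearest-neighbour distance / expected mean distance
    under complete spatial randomness, with expected = 0.5 / sqrt(n / area).
    """
    n = len(points)
    if n < 2:
        raise ValueError("at least two points are required")
    nearest = []
    for i, (xi, yi) in enumerate(points):
        # Brute-force search: fine for a toy example, slow for large maps.
        nearest.append(min(
            math.hypot(xi - xj, yi - yj)
            for j, (xj, yj) in enumerate(points) if j != i
        ))
    observed = sum(nearest) / n
    expected = 0.5 / math.sqrt(n / area)
    std_error = 0.26136 / math.sqrt(n * n / area)
    return observed / expected, (observed - expected) / std_error


# Hypothetical data: four centroids packed into one corner of a
# 100 m x 100 m window, i.e. a strongly clustered pattern.
toy_points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
ratio, z = average_nearest_neighbour(toy_points, area=100.0 * 100.0)
print(f"ratio = {ratio:.2f}, z = {z:.2f}")
```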

Fig. 6

Non-cropped map with the 9 class scenario for the F1F2 model (a) for the whole study area, (b) enlargement based on red square in (a) showing image objects and (c) enlargement of non-cropped map for the same area. Cities Revealed® aerial photography copyright The GeoInformation® Group 2012. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

Table 4

Percentage area relative to total area and area (ha) for hedgerows based on their intersection with multiple ring buffers (m) around margin hotspot areas.

Distance	25	50	100	200	300	400	500
% area	0.77	0.57	1.41	2.78	3.06	3.68	1.99
Area (ha)	1.45	1.07	2.63	5.20	5.71	6.87	3.72

Discussion

The selected scale parameter for image segmentation was chosen as it gave the best delineation of the target features (trees, hedgerows and margins). The increased number of objects for the flat hierarchy was due to the absence of any connection to larger scale parent objects (i.e. H1). Results from the analysis of training sample size showed that the OOB error provided a similar accuracy measure to the mean overall error from an external validation. The stabilisation of the mean error rate indicates that the full training dataset was representative of the study area. Plots of mean class and overall error based on 50 RF models showed a clear trend of error stabilisation above 100 trees for all classification scenarios, validating the ntree parameter, as RF does not overfit when further trees are added beyond the convergence point (Breiman, 2001). Variable selection was not performed on the 4 class models (i.e. F1, H1) because fewer variables were produced at this level and because only the proportion of votes from this level was passed to two of the 9 class models. Class specific variable importance also showed clear trends, with Sparse, Shadow, Grass, Crop 1 and Crop 2 dominated by spectral and textural features and Scrub, Trees, Hedges and Margins dominated by spectral and geometric features (Table 1). The distinction between Grass and Scrub was predominantly driven by textural and geometric variables, with Scrub having higher standard deviation in GLCM and lower area. For H1H2 and F1F2, the top ranked variable for 7 of the 9 classes was proportion of votes, with the 4 class vote correctly attributed to the 9 class category (i.e. Noncrop for Trees) in all cases. A trend of significant decrease in permutation importance for the first ten ranked variables was present for all three models (Fig. 5b). Variables ranked 20th and lower had little influence on classification accuracy, as mean OOB error had stabilised by that point (Fig. 5a).

Fig. 5

Plot of mean OOB error based on 50 repetitions over the cumulative number of variables used (a) and plot of variable importance measures based on 50 repetitions in descending order (b) for the model F1F2.

Lower user and producer accuracy for Hedges was evident as there was some commission and omission with Margins and Trees. All three non-cropped classes were often in a similar spatial and spectral domain with similarly ranked variables; therefore it was unsurprising that some confusion did exist. Scrub, while having high user and producer accuracy, did have some confusion with Crop 1, Crop 2 and Grass. This was because the Scrub class transcended many spectral, textural, geometric and thematic variables and therefore on some occasions did not have a key set of variables that best defined its class description. Crop 2 had minor confusion with Scrub and this was attributed to a high mean and standard deviation in GLCM, which was characteristic of both classes. It is likely that the confusion in this study between the cropped classes would have been significantly reduced by the inclusion of additional images taken throughout the growing season. In this case the different management practices for the different crops and their associated variation in spectral response would increase the discrimination between classes. Various studies have demonstrated the benefits of a multi-level object hierarchy in landcover classification (Benz et al., 2004, Blaschke, 2010, Lucas et al., 2007, Mallinis et al., 2008), with parent objects representing general landuse categories and child objects representing more specialised classes. However, in this study a single object level with a multi class hierarchy gave the highest overall classification accuracy. This was attributed to the fact that the smaller objects for the F1F2 model were well defined for both the 4 and 9 class scenarios through the large number of spectral, textural, geometric and thematic variables. Some objects for the H1 model may have had both cropped and non-cropped features, as signified by the low kappa value.
These mixed objects would have had a negative effect on the accuracy of the proportion of votes for the subsequent H1H2 model, reducing its variable importance as a result. Because the unit area for the F1 and F1F2 models was the same, the high kappa value for F1 was directly transferred to the F1F2 model via the proportion of votes, resulting in higher importance values for these variables. Spatial analysis of the non-cropped map showed that although there was significant spatial clustering for margins, this clustering had no correlation with margin size. For hedgerows the autocorrelation with area was at a landscape scale, which may have implications for various bird species that depend on the proximity of large clusters of hedgerows. The high percentage of hedgerow in the patch category (fragmentation analysis) was indicative of the discontinuous nature of their distribution as indicated by the nearest neighbour analysis, occurring in small clumps rather than as continuous linear features across the landscape. The increase in the edge category for margins was attributed to the location of the margins relative to other classes in the study area, where the margins typically delineated a transition between crop and non-cropped areas. The analysis of Yellowhammer nesting characteristics indicates the potential predictive utility of this classification approach by algorithmically identifying suitable nesting areas of a red-listed bird species (Baillie et al., 2014). The classification protocol outlined in this study is potentially scalable over large areas of the UK due to the availability of high resolution CIR aerial photography (Landmap, 2014) and the semi-automated fashion of the methodology.
While previous studies have demonstrated the potential of mapping hedgerows using various techniques (Aksoy et al., 2010, Foschi and Smith, 1997, Sheeren et al., 2009, Tansey et al., 2009, Vannier and Hubert-Moy, 2008), the protocol outlined in this study can also identify field margins and scrub areas that are separate to hedgerows and are therefore vital to ecosystem services like pollination. The enhanced spatial resolution of this approach over traditional mapping protocols in the UK (CEH, 2007) is key in assessing small scale habitat features across large landscapes. While the inclusion of datasets derived from active sensors like LiDAR or radar would increase classification accuracy, the coverage of such datasets is still quite limited and this study has demonstrated a very high level of classification accuracy without the need for such data. Wide area mapping and monitoring of non-cropped areas in the UK could form a key information source at the environmental and policy level with the development of model-based decision, or discussion, tools to optimise landscapes for multiple ecosystem service delivery functions. Knowledge of the characteristics of the non-cropped areas, which are a habitat for a considerable proportion of farmland biodiversity, can be an initial proxy for many difficult-to-assess ecological factors in the landscape, such as natural pest control and pollination services, as well as biodiversity. By simulating landscapes based on marginal changes to the existing non-cropped areas, it would be possible to find the interventions, such as the best siting and types of agri-environment schemes, that have the biggest impacts. As there are increasing calls to manage ecosystem services at the landscape scale (Benton, 2012, Gonthier et al., 2014, Tscharntke et al., 2012), it is necessary to develop tools that are evidence-based but feasible to parameterize from a cost perspective.
This study provides some of the methodology which would allow such models to be developed. Such models would aid UK and EU policy goals for the creation of sustainable, multifunctional agricultural landscapes as well as help enable sustainable intensification: increasing the production of farmland whilst reducing the impact on the environment. Further work is needed on the spatial and temporal transferability of RF models in the mapping of non-cropped features before a robust tool can be validated for wide area mapping. The inclusion of a temporal stack of imagery will also be investigated as it is likely to significantly aid in the discrimination between the various cropped and margin areas.
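The proportion-of-votes transfer discussed above can be illustrated with a short sketch. It is not the authors' implementation: the 4 class labels are inferred from the thematic variables in Table 1, and the feature values, vote counts and function names are hypothetical. The idea is that the fraction of trees voting for each 4 class label is appended to an object's own variables before the 9 class model is trained or applied.

```python
from collections import Counter

# 4 class labels, inferred from the thematic variables listed in Table 1.
PARENT_CLASSES = ["Crop", "Noncrop", "Sparse", "Shadow"]


def proportion_of_votes(tree_votes, classes=PARENT_CLASSES):
    """Fraction of trees voting for each parent class, in a fixed class order."""
    if not tree_votes:
        raise ValueError("tree_votes must not be empty")
    counts = Counter(tree_votes)
    return [counts[c] / len(tree_votes) for c in classes]


def extend_features(object_features, tree_votes):
    """Append the 4 class vote proportions to an object's own feature vector."""
    return list(object_features) + proportion_of_votes(tree_votes)


# Hypothetical object: three made-up feature values plus the votes of a
# 10-tree, 4 class forest for that object.
features = [0.42, 17.0, 3.1]
votes = ["Noncrop"] * 7 + ["Crop"] * 2 + ["Shadow"]
child_input = extend_features(features, votes)
print(child_input)
```

Because the proportions rather than the winning label are passed down, an object that the 4 class model gets wrong by majority vote can still carry a non-zero vote share for the correct parent class into the 9 class model.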

Conclusions

In this study the integration of the machine learning algorithm RF in an object-oriented environment was investigated under three classification scenarios. A flat approach, signified by a single object hierarchy and a 4 and 9 class structure, gave the highest overall accuracy. Parameterisation of RF models was straightforward, with mtry values greater than 1 producing optimal results. Variable importance analysis showed that proportion of votes, followed by spectral and textural variables, was most important for classification accuracy. The inclusion of the proportion of votes was a novel approach to transferring class hierarchical information from one level to another and had a significant influence on overall classification accuracy. Analysis of training sample size showed no significant difference between mean internal OOB error and external validation. The classification protocol outlined in this study offers a robust methodology for mapping of cropped and non-cropped areas where high resolution imagery is available. The protocol also allows for the integration of existing landcover and auxiliary datasets where available; however, the inclusion of such data is not deemed critical to overall accuracy. The study has shown the potential of machine learning to correctly classify fine scale ecological features to a high accuracy (on par with field surveys) at a wide spatial extent. The use of such a protocol for mapping and monitoring of cropped and non-cropped areas at local and regional scales could provide a key information source at the environmental and policy level in landscape optimisation for food production and ecosystem service sustainability.

Classes:

Buildings

Manmade

Mixed

Non Veg

and (min)

[0–60]: Mean EVI2

Scrub

and (min)

Threshold: FeatCode: MasterMap = 10111

Threshold: Mean EVI2 > 100

or (max)

Threshold: DescTerm: MasterMap = “Rough Grassland”

Threshold: DescTerm: MasterMap = “Rough Grassland,Scrub”

Threshold: DescTerm: MasterMap = “Scrub”

Threshold: DescTerm: MasterMap = “Scrub,Rough Grassland”

Small Objects

and (min)

Threshold: Area <= 35 m²

or (max)

Threshold: Border to Water >= 1 Pxl

Threshold: Border to Trees >= 1 Pxl

Threshold: Border to Scrub >= 1 Pxl

Threshold: Border to Non Veg >= 1 Pxl

Threshold: Border to Mixed >= 1 Pxl

Threshold: Border to Buildings >= 1 Pxl

Threshold: Border to Manmade >= 1 Pxl

Trees

and (min)

or (max)

Threshold: DescTerm: MasterMap = “Heath”

Threshold: DescTerm: MasterMap <> “Scrub,Rough Grassland”

Threshold: DescTerm: MasterMap <> “Scrub”

Threshold: DescTerm: MasterMap <> “Rough Grassland,Scrub”

Threshold: DescTerm: MasterMap <> “Rough Grassland”

Threshold: DescTerm: MasterMap = “Scrub,Nonconiferous Trees”

Threshold: FeatCode: MasterMap = 10111

Water

and (min)

Threshold: Mean EVI2 <= 100

Threshold: Theme: MasterMap = “Water”

Threshold: FeatCode: MasterMap = 10089
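The class descriptions above combine conditions with eCognition's "and (min)" and "or (max)" operators, which take the minimum or maximum of the membership values of their operands. The sketch below mimics that logic for the Water and Scrub classes. The dictionary keys and the example objects are hypothetical, and treating the "or (max)" group as one operand of the enclosing "and (min)" is an assumption, because the original indentation was lost.

```python
def threshold(condition: bool) -> float:
    """A crisp threshold gives a membership of 1.0 (met) or 0.0 (not met)."""
    return 1.0 if condition else 0.0


def water_membership(obj: dict) -> float:
    # "and (min)": every condition must hold, so take the minimum.
    return min(
        threshold(obj["mean_evi2"] <= 100),
        threshold(obj["theme"] == "Water"),
        threshold(obj["feat_code"] == 10089),
    )


SCRUB_TERMS = ("Rough Grassland", "Rough Grassland,Scrub",
               "Scrub", "Scrub,Rough Grassland")


def scrub_membership(obj: dict) -> float:
    # "or (max)": any one descriptive term is enough, so take the maximum.
    any_scrub_term = max(threshold(obj["desc_term"] == t) for t in SCRUB_TERMS)
    # Assumed nesting: the "or" group is one operand of the outer "and (min)".
    return min(
        threshold(obj["feat_code"] == 10111),
        threshold(obj["mean_evi2"] > 100),
        any_scrub_term,
    )


# Hypothetical image objects with MasterMap attributes attached.
pond = {"mean_evi2": 40, "theme": "Water", "feat_code": 10089, "desc_term": ""}
rough = {"mean_evi2": 130, "theme": "Land", "feat_code": 10111, "desc_term": "Scrub"}
print(water_membership(pond), scrub_membership(pond))
print(water_membership(rough), scrub_membership(rough))
```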

Process: Main:

1. MasterMap extraction

copy map: on main : copy map to ‘MasterMap’. Extraction chessboard

segmentation: on MasterMap : chess board: 99999999 creating ‘Level 1’

2. classification: on MasterMap unclassified at Level 1: Trees assign class: on MasterMap unclassified with Theme: MasterMap = “Buildings” at Level 1: Buildings

3. assign class: on MasterMap unclassified with Make: MasterMap = “Manmade” and DescGroup: MasterMap = “General Surface” at Level 1: Manmade

4. assign class: on MasterMap unclassified with FeatCode: MasterMap = 10053 and DescTerm: MasterMap = “Multi Surface” at Level 1: Mixed

5. classification: on MasterMap unclassified at Level 1: Water

6. classification: on MasterMap unclassified at Level 1: Scrub

7. edge extraction canny: on main : edge extraction canny (Canny’s Algorithm) ‘EVI2’ => ‘Canny’

Segmentation

Level 1

multiresolution segmentation: 108 [shape:0.1 compct.:0.5] creating ‘Level 1’

synchronize map: on MasterMap Buildings, Manmade, Mixed, Scrub, Trees, Water at Level 1: synchronize map ‘main’

classification: on main unclassified at Level 1: Non Veg

classification: unclassified at Level 1: Small Objects

remove objects: Small Objects at Level 1: remove objects into Buildings, Manmade, Mixed, Non Veg, Scrub, Trees, Water (merge by color)

Table C1

Object variables used for F2, F1F2 and H1H2 classifiers before and after variable selection. Note the variables under the Thematic category consist of MasterMap categories (i.e. FeatCode) and 4 class proportion of votes from level 1 (e.g. Crop). All variables in the flat classifier are also included in the hierarchical classifier before variable selection with additional variables for the hierarchical classifier due to parent/child features. Other variable names are abbreviated in accordance with eCognition; see the eCognition reference manual for more details (Trimble, 2013b).

		F2		F1F2		H1H2	
Category	Variable	Before	After	Before	After	Before	After
Spectral	Mean	7	6	7	6	7	6
	Standard deviation	7	6	7	6	7	7
	Pixel based	7	6	7	6	7	6
	To super object	0	0	0	0	5	5
	To scene	5	3	5	3	5	4
	HSI	3	1	3	1	3	1
Geometry	Extent	6	4	6	4	6	5
	Shape	5	5	5	5	5	5
	To super object	0	0	0	0	4	3
	Based on polygons	5	3	5	3	5	3
	Based on skeletons	5	3	5	3	5	4
Thematic	Border to	9	5	9	5	9	7
	Distance to	9	7	9	7	9	7
	Difference to	9	6	9	6	9	7
	To super object	0	0	0	0	5	0
	MasterMap	1	1	1	1	1	1
	4 class votes	0	0	4	4	4	4
Texture	GLCM	9	9	9	9	9	9
	GLDV	3	0	3	0	3	0
Total		90	65	94	69	108	84

13 in total

1. Global food demand and the sustainable intensification of agriculture.

Authors: David Tilman; Christian Balzer; Jason Hill; Belinda L Befort
Journal: Proc Natl Acad Sci U S A Date: 2011-11-21 Impact factor: 11.205

Review 2. Landscape moderation of biodiversity patterns and processes - eight hypotheses.

Authors: Teja Tscharntke; Jason M Tylianakis; Tatyana A Rand; Raphael K Didham; Lenore Fahrig; Péter Batáry; Janne Bengtsson; Yann Clough; Thomas O Crist; Carsten F Dormann; Robert M Ewers; Jochen Fründ; Robert D Holt; Andrea Holzschuh; Alexandra M Klein; David Kleijn; Claire Kremen; Doug A Landis; William Laurance; David Lindenmayer; Christoph Scherber; Navjot Sodhi; Ingolf Steffan-Dewenter; Carsten Thies; Wim H van der Putten; Catrin Westphal
Journal: Biol Rev Camb Philos Soc Date: 2012-01-24

Review 3. Food security: the challenge of feeding 9 billion people.

Authors: H Charles J Godfray; John R Beddington; Ian R Crute; Lawrence Haddad; David Lawrence; James F Muir; Jules Pretty; Sherman Robinson; Sandy M Thomas; Camilla Toulmin
Journal: Science Date: 2010-01-28 Impact factor: 47.728

4. Ecology. Managing farming's footprint on biodiversity.

Authors: Tim G Benton
Journal: Science Date: 2007-01-19 Impact factor: 47.728

5. Random forests for classification in ecology.

Authors: D Richard Cutler; Thomas C Edwards; Karen H Beard; Adele Cutler; Kyle T Hess; Jacob Gibson; Joshua J Lawler
Journal: Ecology Date: 2007-11 Impact factor: 5.499

6. A computational approach to edge detection.

Authors: J Canny
Journal: IEEE Trans Pattern Anal Mach Intell Date: 1986-06 Impact factor: 6.226

Review 7. Biodiversity conservation in agriculture requires a multi-scale approach.

Authors: David J Gonthier; Katherine K Ennis; Serge Farinas; Hsun-Yi Hsieh; Aaron L Iverson; Péter Batáry; Jörgen Rudolphi; Teja Tscharntke; Bradley J Cardinale; Ivette Perfecto
Journal: Proc Biol Sci Date: 2014-09-22 Impact factor: 5.349

Review 8. Ecosystem services and agriculture: tradeoffs and synergies.

Authors: Alison G Power
Journal: Philos Trans R Soc Lond B Biol Sci Date: 2010-09-27 Impact factor: 6.237

9. An AUC-based permutation variable importance measure for random forests.

Authors: Silke Janitza; Carolin Strobl; Anne-Laure Boulesteix
Journal: BMC Bioinformatics Date: 2013-04-05 Impact factor: 3.169

10. Automated parameterisation for multi-scale image segmentation on multiple layers.

Authors: L Drăguţ; O Csillik; C Eisank; D Tiede
Journal: ISPRS J Photogramm Remote Sens Date: 2014-02 Impact factor: 8.979
