
A multi-layer perceptron-based approach for early detection of BSR disease in oil palm trees using hyperspectral images.

Chee Cheong Lee¹, Voon Chet Koo¹, Tien Sze Lim¹, Yang Ping Lee², Haryati Abidin².

Abstract

Basal Stem Rot (BSR) disease, caused by Ganoderma boninense, is regarded as the biggest threat to the oil palm industry in Malaysia and results in significant yield losses. Effective BSR disease detection is important for plantation management to ensure stable palm oil production. The existing method relies on visual inspection by experienced personnel, which is very time consuming. The rapid development of unmanned aerial vehicles (UAV) and machine learning has the potential to address this issue with higher efficiency. This paper proposes a new framework that automates BSR disease detection from UAV images to improve time efficiency. The proposed method has two steps: hyperspectral image (HSI) pre-processing, followed by disease detection with an artificial neural network. A Multilayer Perceptron (MLP) model is introduced to learn spectral features from different infection stages. The model is trained with ground truth collected by trained surveyors. The HSI sample consists of 2 healthy trees, 5 Stage A (mild infection), 5 Stage B (moderate infection), and 3 Stage C (severe infection) trees. Performance is compared with a support vector machine (SVM), a 1-dimensional convolutional neural network (1D CNN), and several vegetation indices, namely Normalized Difference Vegetation Index (NDVI), Normalized Difference Red Edge (NDRE), Optimised Soil-Adjusted Vegetation Index (OSAVI), and MERIS Terrestrial Chlorophyll Index (MTCI). All machine learning algorithms can segregate infection stages; the MLP model had the highest overall accuracy at 86.67%, compared with 66.67% for SVM and 73.33% for 1D CNN. The vegetation indices, in contrast, can only detect Stage C trees and cannot differentiate between Healthy, Stage A and Stage B trees. In terms of computational cost, the MLP model offers balanced performance, with moderate training time but faster inference time. The results demonstrate its effectiveness in BSR disease detection, even at the early infection stage.

Entities: Chemical

Keywords: Basal stem rot disease; Computer vision; Hyperspectral image; Machine learning; Multilayer perceptron; Vegetation index

Year: 2022 PMID: 35445158 PMCID: PMC9014396 DOI: 10.1016/j.heliyon.2022.e09252

Source DB: PubMed Journal: Heliyon ISSN: 2405-8440

Introduction

BSR is a major disease in the Malaysian oil palm industry. It causes a yield reduction of 0.04 t/ha for 10-year-old plantings and 4.34 t/ha for 22-year-old plantings. In a survey conducted by Mohd Shukri I. et al. covering 37,351.81 ha of oil palm, BSR disease affected 9.2% of the surveyed area [1]. It is therefore crucial for plantation management to detect BSR disease, especially at its early stage, and to know the extent of the infected area. Infected plants can then be treated to retain their economic value and to prevent them from infecting others. Tree health can be categorized into four levels: Healthy (healthy palm, no Ganoderma fruiting bodies), Stage A (mild infection, Ganoderma fruiting bodies present but leaves look healthy), Stage B (moderate infection, Ganoderma fruiting bodies present and leaves look unhealthy or yellowish), and Stage C (severe infection, dead trees with Ganoderma fruiting bodies). Ground evaluation of the disease is done by visual inspection. The inspector looks for symptoms that identify infected plants, such as several unopened new fronds, green fronds that wilt and hang downwards, yellowish fronds, a small canopy, and the presence of basidiomata on the trunk. Localization is done simultaneously: the inspector marks the location of infected plants with a Global Positioning System (GPS) device. The process is labour intensive and time consuming [2]. As a result, the reinspection interval is often very long, which makes real-time detection difficult in large plantation areas. BSR disease decays the base of the stem, from which basidiocarps emerge. Stem rot limits the uptake of water and nutrients to the fronds and causes chlorosis [3]. Chlorophyll is essential in photosynthesis, which allows plants to convert light energy to chemical energy. Healthier plants have higher chlorophyll levels; they reflect more green and near-infrared light and absorb more red light.
Some of this non-visible light can be captured using hyperspectral sensors such as spectrometers and hyperspectral cameras. In recent years, drone-based hyperspectral imaging has become popular for detecting BSR disease because it allows non-destructive detection at large scale and offers flexibility in mounting different sensors [4]. A hyperspectral sensor or spectroradiometer is usually mounted on a structure such as scaffolding to capture data. With the rapid development of unmanned aerial vehicles (UAV), a hyperspectral sensor can be integrated into a UAV to enable large-area monitoring, which is more practical in field work. Compared with a spectroradiometer, a UAV-mounted hyperspectral sensor can capture both spatial and spectral information. Existing studies show that large-scale detection is possible. The AISA airborne hyperspectral imaging spectrometer was used to investigate six vegetation indices and four red-edge techniques for detecting BSR disease; these techniques produced accuracies ranging from 73% to 84% [9]. Izzuddin M.A. et al. [12] evaluated similar vegetation indices and red-edge techniques but obtained a different result: the overall accuracy of the six evaluated vegetation indices was between 30% and 40%, whereas continuum removal gave promising results in detecting the early stage of the disease. That study suggests that more analysis should be conducted and validated to establish the methods' robustness. Machine learning, especially the convolutional neural network and its variants, has emerged as the state-of-the-art method in image classification [24, 25] and object detection [26] in recent years. Spectral information is important for classification tasks in HSI. However, not all bands are critical to the task; some bands carry more important information than others. Extracting spectral information manually is not feasible.
Several traditional machine learning algorithms extract useful information from spectral bands, such as principal component analysis [27], independent component discriminant analysis [28], and linear discriminant analysis [29]. However, these methods are simple linear processing and may not be suitable for handling complex features in spectral bands [30]. Artificial neural networks were introduced to address this problem. Wei Hu et al. [37] proposed a 1D CNN to extract spectral features from HSI and reported better overall accuracy than the support vector machine. Y. Chen et al. [38], J. M. Haut et al. [39] and X. Yang et al. [40] also evaluated the effectiveness of CNNs in HSI classification. In addition, classification accuracy can be improved by extracting spatial features from HSI. A 3D CNN can extract spatial-spectral information effectively. Y. Li et al. [31] implemented a 3D CNN for classification and achieved high accuracy on different datasets. A few other studies also evaluated the efficiency of deep neural networks in hyperspectral image classification [13, 14, 15]. These studies show that deep neural networks are effective in learning and extracting spectral-spatial features and achieve very good results in scene classification. Aside from scene classification, some research has also applied machine learning to plant disease detection. Xin Zhang et al. [16] established a deep convolutional neural network method to automate crop disease detection using UAV hyperspectral images. The proposed deep CNN uses multiple Inception-ResNet [17, 18] layers to extract features from both the spatial and spectral domains for yellow rust detection. It achieves an overall detection accuracy of 85%, which is higher than that of a conventional machine learning technique, the random forest-based classifier [19], at 77% overall accuracy. Qiao Pan et al. [20] proposed another deep CNN, a pyramid scene parsing network semantic segmentation model, for wheat yellow rust disease detection and achieved 94% accuracy.
This suggests that deep neural networks are effective in plant disease detection. Some existing studies have also applied machine learning to BSR disease detection in oil palm trees, as summarized in Table 1. Nur A. Husin et al. [21] employed Terrestrial Laser Scanning (TLS) data and machine learning to classify the health level of oil palm trees. TLS scans each tree from four different locations and generates five tree features as input to the machine learning algorithm. This method can detect early infection, and the best accuracy is achieved by the Kernel Naïve Bayes method, with an overall accuracy of 85%. Izrahayu et al. [22] showed that an MLP applied to the backscatter variable from synthetic aperture radar (SAR) images can differentiate infected and healthy trees. Camille C. D. Lelong et al. [4] measured the plant canopy with a field spectroradiometer and scaffolding. Their Partial Least Squares Discriminant Analysis (PLS-DA) method can classify plants into four levels of disease severity with 94% accuracy. Parisa Ahmadi et al. [8] improved the method by applying an artificial neural network to raw, first-derivative, and second-derivative spectroradiometer datasets, with satisfactory results of 83.3% and 100% accuracy in detecting healthy and infected plants. Aiman Nabilah Noor Azmi et al. [7] used visible near-infrared (NIR) hyperspectral images for BSR disease detection and found a significant difference in the NIR spectrum. 35 bands were used as support vector machine inputs for classification, with excellent results. Other studies [32, 33, 34, 35, 36] have also shown the effectiveness of machine learning techniques in BSR disease detection. Spectrometers, thermal sensors, spectroradiometers, laser scanners and dielectric spectroscopy require on-site measurement tree by tree, so they are impractical for large-scale plantations. Multispectral and hyperspectral sensors can be used with UAVs and satellites to capture images of large-scale plantations.
However, the multispectral camera does not yield convincing results, and the study in [7] was conducted in a closed environment. SAR images allow large-scale detection; nevertheless, they are not capable of detecting early infection.

Table 1

Summary of different ML Methods in BSR Disease Detection.

Sensor Type	Machine Learning Technique	Accuracy	Disease Severity Separation	Reference
Laser Scanner	Kernel Naïve Bayes	85%	Yes	[21]
Synthetic Aperture Radar	Multilayer Perceptron	95.65%	No	[22]
Spectroradiometer	PLS-DA	94%	Yes	[4]
Spectroradiometer	Artificial Neural Network	83%–100%	Yes	[8]
Hyperspectral Imaging	Support Vector Machine	93%–100%	Yes	[7]
Spectroradiometer	K-Nearest Neighbor	97%	Yes	[32]
Spectrometer	Linear Discriminant Analysis	90%	Yes	[33]
Multispectral Imaging	Neural Net	41.3%	Yes	[34]
Dielectric Spectroscopy	Support Vector Machine	74.77%	Yes	[35]
Thermal Imaging	Random Forest	73.9%–90.2%	No	[36]

In this paper, a Multilayer Perceptron (MLP) neural network method is proposed to perform BSR disease detection from high-resolution UAV hyperspectral images (HSI). Image pre-processing is applied to improve the quality of images captured in an open environment. The MLP is designed to learn spectral information for detection. The proposed method is tested at an experimental oil palm plantation site. The performance of the MLP is compared with several vegetation indices. This paper is organized as follows: Section 2 describes the datasets, Section 3 explains the methods and experimental setup, Section 4 presents the results and discussion, and Section 5 concludes.

Datasets

Study Area

In this work, HSI samples were collected at 2.4460672 N, 102.4009867 E, Melaka, Malaysia, across 5 sites. The location is illustrated in Figure 1. An expert accompanied the team and marked plant health as ground truth. Plants are categorized as Stage A, B, C or Healthy according to the expert's evaluation. A UAV was deployed to capture red, green, blue (RGB) images and HSI. The RGB images are used for marking the ground truth. A hyperspectral camera mounted on the same UAV acquired the HSI, which covers visible to near-infrared bands from 510 nm to 900 nm. The spectral resolution is 1 nm and the data output has 392 bands. Raw data were recorded at 1024 × 1024-pixel resolution. In total, 15 HSI were collected; the label of each tree is tabulated in Table 2.

Figure 1

(a) Study Area on Malaysia Map. (b) Location of Study Area in Melaka. (c) Ground truth marked by expert in RGB Images.

Table 2

HSI sample and label.

Site	Healthy	Stage A	Stage B	Stage C
1	N/A	N/A	N/A	S1–1C, S1–2C
2	N/A	S2-1A, S2-2A	S2–3B	N/A
3	S3–4H	S3-2A	S3–1B, S3–3B	N/A
4	S4–5H	S4-2A, S4-3A	S4–1B, S4–4B	N/A
15	N/A	N/A	N/A	S15–1C

Note: The label S3–4H is referring to Site 3, Plant Number 4 (Healthy). N/A means sample not available.


Data collection

Training and validation datasets are taken from the Site 3 HSI. The other sites are used as test images for performance evaluation. Training and validation samples are collected manually by randomly selecting pixels from plants at different infection stages, from healthy plants, and from the ground. They are labelled A, B, H and G respectively. There is no separate training and validation dataset for Stage C, as nothing but soil is left at a Stage C infection. Figure 2 visualizes the selection of training and validation data from the Site 3 HSI.

Figure 2

(a) Training data – G; (b) training data – B; (c) training data – A; (d) training data – B; (e) training data – H.

A total of 5739 pixel samples were collected: 1290 Stage A, 1345 Stage B, 1292 Healthy and 1812 Ground. The samples are split into training and validation sets in a proportion of around 7.5:2.5. The dataset distribution is shown in Table 3.

Table 3

Datasets distribution.

Classes	Description	Train	Validation
A	Stage A- Early Infection	990	300
B	Stage B – Mild Infection	1045	300
G	Stage C – Late Stage/Ground	992	300
H	Healthy	1312	500


Methods

The overall framework for BSR disease detection is explained in this section. Given a hyperspectral image (HSI), the aim is to find unhealthy plants and to detect them as early as possible. In this study, a Multilayer Perceptron (MLP) neural network model is proposed to classify each pixel of the HSI into one of these classes: Stage A, Stage B, Stage C, Healthy and Background. The framework includes (1) image alignment, (2) HSI denoising using minimum noise fraction (MNF) to enhance image quality, (3) dataset preparation for training the proposed MLP model, and (4) MLP inference and output visualization, as shown in Figure 3.

Figure 3

Block Diagram of proposed method.

To assess the proposed method, several experiments were carried out, focusing on the following aspects: (1) The accuracy of the network with two training datasets, namely the ABGH and BGH datasets, to evaluate the effect of the Stage A and Stage B spectral signatures on MLP performance. (2) The effect of the width and depth of the network: the number of neurons per layer is set to 128, 256, and 392 and the number of hidden layers is set to 1, 2, 3, and 4. Each model is trained with the same dataset and parameter settings. (3) Comparison with other ML methods and with the traditional vegetation index method. In this work, the results are compared with a support vector machine (SVM), a 1-dimensional convolutional neural network (1D CNN), and four vegetation indices: Normalized Difference Vegetation Index (NDVI), Normalized Difference Red Edge (NDRE), Optimised Soil-Adjusted Vegetation Index (OSAVI), and MERIS Terrestrial Chlorophyll Index (MTCI).

Data-preprocessing

HSI band alignment

The captured HSI are misaligned in certain bands. One cause of misalignment between bands is UAV flight stability. During data capture, the UAV is programmed to hover above the plants. In actual flight, however, it may be disturbed by wind, which results in undesired translational, rotational or shear motion. This motion carries over to the bands of the HSI, so any two bands within the HSI cube are related by a motion model. The misalignment can be visualized by comparing band 280 and band 380. The visualized output for dataset S4–1B is shown in Figure 4(a), with band 280 shown in blue and band 380 in green. The misalignment is most significant in translational motion, especially along the X-axis.

Figure 4

(a) Before image alignment. (b) After image alignment.

Realignment of the band images can be treated as an image registration problem. To simplify the problem, one band image is chosen as the template to which the rest of the band images are aligned. An ideal template image has good contrast and features. In this study, band 240 is used as the template image. Registration is done using Parametric Image Alignment Using Enhanced Correlation Coefficient Maximization [5]. This method requires a motion model to estimate the geometric transformation between each of the remaining bands and the template band image. An affine transformation is used as the motion model, as it can express rotation, translation, and shear of an image. Figure 4(b) shows that the misalignment is reduced after realignment.

Image denoising

Minimum Noise Fraction (MNF) is a widely used algorithm for hyperspectral image denoising. MNF transforms hyperspectral data into MNF space, where components are ordered by descending signal-to-noise ratio (SNR), so the MNF output images show steadily decreasing image quality. The eigenvalue of each component equals one plus its SNR in the transformed space; hence, components with eigenvalues near unity are noise dominated. One can spatially filter the noisiest components and then apply the inverse MNF transform to obtain a smoothed image without serious signal degradation [6]. Figure 5 illustrates components 2, 34, and 70 in the MNF domain. Component 2 has a very strong signature, while component 34 contains noise. Features can still be observed in component 70 despite strong noise. The MNF images are truncated at 80 components, which contain useful features, and components 81 to 392, which are dominated by noise, are eliminated. The inverse MNF transform is then applied to the truncated MNF images to return to HSI. The HSI is smoother after denoising, and salt-and-pepper noise is removed. Figure 6 compares band 280 of dataset S4–1B before and after denoising.
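The truncation and inverse transform described above can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the implementation used in this work: the function name `mnf_denoise`, the noise estimate taken from differences between horizontally adjacent pixels, and the small synthetic cube are all choices made for the example.

```python
import numpy as np

def mnf_denoise(cube, n_keep):
    """Truncated-MNF denoising of a (rows, cols, bands) cube (illustrative sketch)."""
    rows, cols, bands = cube.shape
    cube = cube.astype(np.float64)
    pixels = cube.reshape(-1, bands)
    mean = pixels.mean(axis=0)
    centred = pixels - mean

    # Assumption: noise is estimated from differences of horizontally adjacent
    # pixels; halving compensates for the variance of a two-sample difference.
    diffs = (cube[:, 1:, :] - cube[:, :-1, :]).reshape(-1, bands)
    cov_noise = np.cov(diffs, rowvar=False) / 2.0
    cov_data = np.cov(centred, rowvar=False)

    # Whiten the noise so that its covariance becomes the identity.
    noise_vals, noise_vecs = np.linalg.eigh(cov_noise)
    noise_vals = np.maximum(noise_vals, 1e-12)
    whiten = noise_vecs / np.sqrt(noise_vals)

    # Eigen-decompose the noise-whitened data covariance (eigenvalues = 1 + SNR).
    vals, vecs = np.linalg.eigh(whiten.T @ cov_data @ whiten)
    order = np.argsort(vals)[::-1]          # descending SNR
    forward = whiten @ vecs[:, order]       # pixels -> MNF components
    inverse = np.linalg.inv(forward)        # MNF components -> pixels

    components = centred @ forward
    components[:, n_keep:] = 0.0            # drop noise-dominated components
    restored = components @ inverse + mean
    return restored.reshape(rows, cols, bands)


# Synthetic example: two smooth spatial patterns mixed into 20 bands plus noise.
rng = np.random.default_rng(0)
yy, xx = np.mgrid[0:32, 0:32] / 32.0
spatial = np.stack([np.sin(2 * np.pi * xx), yy], axis=-1)   # (32, 32, 2)
spectra = rng.uniform(0.5, 1.5, size=(2, 20))               # (2, 20)
clean = spatial @ spectra                                   # (32, 32, 20)
noisy = clean + rng.normal(0.0, 0.1, size=clean.shape)
denoised = mnf_denoise(noisy, n_keep=2)

mse_before = float(np.mean((noisy - clean) ** 2))
mse_after = float(np.mean((denoised - clean) ** 2))
print(f"MSE before: {mse_before:.5f}, after: {mse_after:.5f}")
```

For the data in this work the corresponding call would keep 80 of the 392 components; the synthetic cube keeps 2 of 20 only to stay small.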

Figure 5

MNF output (a) component 2; (b) component 34; (c) component 70.

Figure 6

(a) Before denoise (b) after denoise.


Multilayer Perceptron

In the early stage of infection, there is no difference in plant shape from the UAV view. By the time a plant shows symptoms such as leaves hanging vertically downwards from their point of attachment to the trunk, giving it a skirt-like appearance, it is already at an advanced stage of infection. Spatial features are therefore not the key features for early BSR disease detection. Feature extraction is instead designed to extract information from the hyperspectral signature using a Multilayer Perceptron (MLP) model. In contrast to a convolutional neural network, this network performs no spatial feature extraction. The MLP has only 392 inputs in this work, which scales down the network size and the number of hyperparameters and hence reduces computational cost, making it an efficient network compared with other deep learning networks. Classification based on hyperspectral features can be seen as a nonlinear input-output mapping that aims to find a mathematical function y = f(x) that maps an m-dimensional Euclidean input space (x) to an M-dimensional Euclidean output space (y). The universal approximation theorem states that an MLP with a single hidden layer and a finite number of neurons is sufficient to compute an approximation to a given training set represented by a set of inputs and a desired output. Even though one sufficiently large hidden layer can approximate a non-linear function, networks with two hidden layers have been found to generally perform better than networks with a single hidden layer [11]. Hence, an MLP with two hidden layers is implemented in this work. In addition, dropout regularization is added at the hidden layer. The dropout rate is set to 30%, meaning that roughly one in three inputs is randomly excluded from each update cycle. This prevents the MLP from overfitting and makes it more robust to its inputs. Lastly, the output layer has 4 neurons with a Softmax activation function to classify each pixel into one of four categories: Stage A, Stage B, Stage C, and Healthy. Table 4 shows the network architecture of the proposed MLP model.

Table 4

Proposed MLP network architecture.

No	Layer Type	Activation Function	Size
1	Fully Connected	Sigmoid	392
2	Fully Connected (dropout 0.3)	Sigmoid	392
3	Fully Connected	Softmax	4
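As a rough illustration of the architecture in Table 4, the forward pass can be written in a few lines of NumPy. This is a sketch only: the weights are random and untrained, the function names are hypothetical, and applying dropout to the output of the second hidden layer (as inverted dropout, during training only) is an assumption about where the 0.3 dropout acts.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)   # stabilise the exponentials
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def init_params(rng, n_in=392, n_hidden=392, n_out=4):
    """Random, untrained weights and zero biases for the three layers of Table 4."""
    sizes = [(n_in, n_hidden), (n_hidden, n_hidden), (n_hidden, n_out)]
    return [(rng.normal(0.0, 0.05, size=s), np.zeros(s[1])) for s in sizes]

def forward(params, spectra, rng=None, dropout_rate=0.3):
    """Map (n_pixels, 392) spectra to (n_pixels, 4) class probabilities."""
    (w1, b1), (w2, b2), (w3, b3) = params
    h1 = sigmoid(spectra @ w1 + b1)
    h2 = sigmoid(h1 @ w2 + b2)
    if rng is not None:
        # Training mode (assumed placement): inverted dropout on the second layer.
        keep = rng.random(h2.shape) >= dropout_rate
        h2 = h2 * keep / (1.0 - dropout_rate)
    return softmax(h2 @ w3 + b3)

rng = np.random.default_rng(0)
params = init_params(rng)
pixels = rng.random((5, 392))              # five hypothetical pixel spectra
probs = forward(params, pixels)            # inference: dropout disabled
labels = probs.argmax(axis=1)              # one class index per pixel
print(probs.shape, labels)
```

Because each pixel is classified from its 392-band spectrum alone, the same function can be applied to every pixel of an image in one batched matrix product.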

The network loss is minimized and the weights are updated using the Adaptive Moment Estimation (ADAM) optimizer [23]. In contrast to the conventional gradient descent algorithm, which has a constant learning rate and momentum, the ADAM optimizer accelerates weight updates by taking into account exponential moving averages of the gradient. The weights are updated as described in Eq. (1):

$$w_t = w_{t-1} - \eta \, \frac{\hat{m}_t}{\sqrt{\hat{v}_t} + \epsilon} \qquad (1)$$

where $w$ is the weight, $\eta$ is the step size, $\hat{v}_t$ is the bias-corrected second raw moment estimate, $\hat{m}_t$ is the bias-corrected first moment estimate and $\epsilon$ is a small value that avoids division by zero when the gradient is almost zero. $\hat{m}_t$ and $\hat{v}_t$ can be calculated using Eqs. (2) and (3):

$$\hat{m}_t = \frac{m_t}{1 - \beta_1^t} \qquad (2)$$

$$\hat{v}_t = \frac{v_t}{1 - \beta_2^t} \qquad (3)$$

where $m_t$ is the biased first moment estimate, $v_t$ is the biased second raw moment estimate, and $\beta_1$ and $\beta_2$ are the exponential decay rates for the moment estimates $m_t$ and $v_t$ correspondingly. $m_t$ and $v_t$ can be derived using Eqs. (4) and (5):

$$m_t = \beta_1 m_{t-1} + (1 - \beta_1) \, g_t \qquad (4)$$

$$v_t = \beta_2 v_{t-1} + (1 - \beta_2) \, g_t^2 \qquad (5)$$

where $g_t$ is the calculated gradient. From the above equations, there are four parameters, $\eta$, $\beta_1$, $\beta_2$ and $\epsilon$, to tune the ADAM optimizer.
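The ADAM update can be made concrete with a short NumPy sketch of the standard algorithm from [23]. The function name `adam_step` and the toy quadratic objective are assumptions for illustration; the default parameter values are the ones reported for training in Section 4, while the toy loop uses a larger step size so that it converges in a few thousand steps.

```python
import numpy as np

def adam_step(w, g, m, v, t, eta=0.001, beta1=0.9, beta2=0.999, eps=1e-7):
    """One ADAM update for weights w given gradient g at step t (t starts at 1)."""
    m = beta1 * m + (1.0 - beta1) * g              # biased first moment, Eq. (4)
    v = beta2 * v + (1.0 - beta2) * g ** 2         # biased second raw moment, Eq. (5)
    m_hat = m / (1.0 - beta1 ** t)                 # bias correction, Eq. (2)
    v_hat = v / (1.0 - beta2 ** t)                 # bias correction, Eq. (3)
    w = w - eta * m_hat / (np.sqrt(v_hat) + eps)   # weight update, Eq. (1)
    return w, m, v

# Toy problem (hypothetical): minimise f(w) = sum((w - 3)^2), gradient 2 (w - 3).
w = np.zeros(2)
m = np.zeros_like(w)
v = np.zeros_like(w)
for t in range(1, 5001):
    g = 2.0 * (w - 3.0)
    w, m, v = adam_step(w, g, m, v, t, eta=0.01)
print(w)
```

Note that the very first step moves each weight by approximately the step size regardless of the gradient magnitude, because the bias-corrected ratio of the first moment to the square root of the second moment is close to one in absolute value.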

Vegetation index

A vegetation index is generated by applying band math to multiple spectral bands to produce a single value that reflects plant properties. Various vegetation indices have been developed to capture different plant properties. Among the common ones are the Normalized Difference Vegetation Index (NDVI) [41], Normalized Difference Red Edge (NDRE), Optimised Soil-Adjusted Vegetation Index (OSAVI) [42], and MERIS Terrestrial Chlorophyll Index (MTCI) [43], as summarized in Table 5.

Table 5

List of vegetation indices.

Vegetation Indices	Equation
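Normalised Difference Vegetation Index (NDVI)	$\mathrm{NDVI} = \frac{\mathrm{NIR} - \mathrm{Red}}{\mathrm{NIR} + \mathrm{Red}}$ (6), where NIR = 800 nm, Red = 660 nm
Normalised Difference Red Edge (NDRE)	$\mathrm{NDRE} = \frac{\mathrm{NIR} - \mathrm{RedEdge}}{\mathrm{NIR} + \mathrm{RedEdge}}$ (7), where NIR = 800 nm, Red Edge = 720 nm
Optimized Soil Adjusted Vegetation Index (OSAVI)	$\mathrm{OSAVI} = \frac{(1 + Y)(R_{800} - R_{670})}{R_{800} + R_{670} + Y}$ (8), where Y = 0.16, R800 = 800 nm, R670 = 670 nm
MERIS Terrestrial Chlorophyll Index (MTCI)	$\mathrm{MTCI} = \frac{R_{850} - R_{730}}{R_{730} + R_{675}}$ (9), where R850 = 850 nm, R730 = 730 nm, R675 = 675 nm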

NDVI is calculated from the visible red light and near-infrared light reflected by vegetation, as expressed in equation (6). It is commonly used to quantify green biomass and vegetation coverage. As NDVI correlates with chlorophyll, it is also used as an indicator of plant health. NDVI always results in a number between -1 and +1; a high value indicates high vegetation density. NDRE is calculated similarly to NDVI but uses the red edge instead of visible red light, as described in equation (7). Compared with the red band, the red edge penetrates leaves better and is sensitive to the chlorophyll content of leaves. NDRE plays an important role for late-season crops, in which the concentration of chlorophyll is relatively higher. OSAVI is a modification of NDVI that adds a correction factor to reduce the influence of soil reflectance; it can be calculated from equation (8). It enhances vegetation spectral features where vegetation cover is low and soil is exposed, and it also works well in areas with high vegetation density. MTCI was designed mainly for MERIS data to estimate chlorophyll content. Equation (9) shows that the NIR, red edge and red bands are used in this index, hence it is sensitive to a wide range of chlorophyll concentrations. S. Mori et al. [10] mentioned that MTCI is among the most efficient algorithms.
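The four indices can be computed per pixel from an HSI cube as sketched below. The band-to-wavelength mapping is an assumption: 392 band centres are taken as evenly spaced between 510 nm and 900 nm, and each index uses the band nearest to the wavelength listed in Table 5. The helper names and the synthetic vegetation-like spectrum are hypothetical.

```python
import numpy as np

# Assumption: 392 band centres evenly spaced over the sensor range (nm).
WAVELENGTHS = np.linspace(510.0, 900.0, 392)

def band(cube, wavelength_nm):
    """Return the band whose assumed centre is closest to wavelength_nm."""
    idx = int(np.argmin(np.abs(WAVELENGTHS - wavelength_nm)))
    return cube[..., idx].astype(np.float64)

def ndvi(cube):
    nir, red = band(cube, 800), band(cube, 660)
    return (nir - red) / (nir + red)

def ndre(cube):
    nir, red_edge = band(cube, 800), band(cube, 720)
    return (nir - red_edge) / (nir + red_edge)

def osavi(cube, y=0.16):
    r800, r670 = band(cube, 800), band(cube, 670)
    return (1.0 + y) * (r800 - r670) / (r800 + r670 + y)

def mtci(cube):
    r850, r730, r675 = band(cube, 850), band(cube, 730), band(cube, 675)
    return (r850 - r730) / (r730 + r675)   # denominator as it appears in Table 5

# Hypothetical vegetation-like spectrum: low red reflectance, red-edge ramp,
# high near-infrared plateau, replicated over a 2 x 2 pixel image.
spectrum = np.interp(WAVELENGTHS, [510, 690, 750, 900], [0.05, 0.05, 0.5, 0.5])
cube = np.tile(spectrum, (2, 2, 1))        # shape (2, 2, 392)

print(ndvi(cube)[0, 0], ndre(cube)[0, 0], osavi(cube)[0, 0], mtci(cube)[0, 0])
```

Each function returns one index value per pixel, so thresholding the resulting image is all that a vegetation-index detector requires; no training is involved.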

Classification process

An HSI of 1024 × 1024 pixels is flattened to 1,048,576 pixels and fed forward through the MLP to generate a classification output of 1,048,576 pixels, which is then reshaped into 1024 × 1024 pixels to form a classification map. Each pixel is categorized into one of the classes described in Table 3. A region of interest (ROI) of 300 × 300 pixels at the centre of the plant in the classification map is used to determine the infection stage. The plant's infection stage is classified from the pixel composition of the classes within the ROI. For the MLP with A, B, G, H classes, healthy plants have a high percentage of H pixels (>70%), while Stage C has more G pixels (>50%) than the other classes. Stage A and Stage B are distinguished by which of the two classes has the higher percentage in the ROI. For the MLP with B, G, H classes, on the other hand, there is no class A pixel, so Stage A and Stage B are differentiated by the ratio between H pixels and B pixels. A higher ratio (≥0.5) means the plant has relatively more H pixels, hence it is Stage A; otherwise it is Stage B. Table 6 summarizes the classification conditions.
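The flatten, classify, reshape and ROI steps can be sketched as follows. The `predict` argument stands in for the trained MLP, and the class order, function names and the small 400 × 400 toy cube are assumptions for illustration; the ROI slice also assumes the plant centre lies far enough from the image border.

```python
import numpy as np

CLASSES = ("A", "B", "G", "H")   # class index order is an assumption

def classification_map(cube, predict):
    """Classify every pixel spectrum and return a (rows, cols) map of class indices."""
    rows, cols, bands = cube.shape
    flat = cube.reshape(rows * cols, bands)   # e.g. 1024 x 1024 -> 1,048,576 spectra
    labels = predict(flat)                    # one class index per pixel
    return labels.reshape(rows, cols)

def roi_composition(class_map, centre, size=300):
    """Percentage of each class inside a size x size ROI centred on the plant."""
    r, c = centre
    half = size // 2
    roi = class_map[r - half:r + half, c - half:c + half]
    counts = np.bincount(roi.ravel(), minlength=len(CLASSES))
    return {name: 100.0 * counts[i] / roi.size for i, name in enumerate(CLASSES)}

# Toy cube: left half responds in the "H" channel, right half in the "G" channel.
cube = np.zeros((400, 400, 4))
cube[:, :200, 3] = 1.0
cube[:, 200:, 2] = 1.0
stand_in_mlp = lambda spectra: spectra.argmax(axis=1)   # placeholder classifier

class_map = classification_map(cube, stand_in_mlp)
composition = roi_composition(class_map, centre=(200, 200))
print(composition)
```

The resulting percentages are the quantities compared against the thresholds that decide the infection stage of each plant.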

Table 6

Plant infection stage classification condition.

Dataset	Healthy	Stage A	Stage B	Stage C
A, B, G, H	>70% H	A > B	B > A	>50% G
B, G, H	>60% H	H/B ≥ 0.5	H/B < 0.5	>40% G

Figure 7 shows an example of a classification map and its ROI pixel composition, which consists of 77.08% H pixels, 14.97% B pixels and 7.96% G pixels.

Figure 7

Example classification map and pixel composition.
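The decision conditions of Table 6 can be expressed as two small functions. This is a sketch under one assumption not stated in the text: the conditions are checked in the order Healthy, then Stage C, then Stage A versus Stage B. The function names are hypothetical, and the H/B ≥ 0.5 test is written as H ≥ 0.5 × B to avoid division by zero.

```python
def classify_abgh(h, a, b, g):
    """Infection stage from ROI percentages of the MLP trained on A, B, G, H."""
    if h > 70.0:
        return "H"
    if g > 50.0:
        return "C"
    return "A" if a > b else "B"

def classify_bgh(h, b, g):
    """Infection stage from ROI percentages of the MLP trained on B, G, H."""
    if h > 60.0:
        return "H"
    if g > 40.0:
        return "C"
    return "A" if h >= 0.5 * b else "B"   # equivalent to H/B >= 0.5

# ROI composition of the example in Figure 7: 77.08% H, 14.97% B, 7.96% G.
print(classify_bgh(77.08, 14.97, 7.96))
```

With this ordering of checks, the functions reproduce the detection results listed for every plant in Tables 7 and 8, for example Stage C for S1–1C and Stage B for S2–3B under the BGH model.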

Performance evaluation

To evaluate the performance of the proposed method, overall accuracy, recall, and precision are chosen as accuracy metrics. The overall accuracy is calculated by adding all correctly classified samples and dividing by the total number of samples in all classes. Recall and precision are calculated from True Positives (TP), False Positives (FP) and False Negatives (FN), as described in Eqs. (10) and (11):

$$\mathrm{Recall} = \frac{TP}{TP + FN} \qquad (10)$$

$$\mathrm{Precision} = \frac{TP}{TP + FP} \qquad (11)$$

In addition, computational cost is evaluated. All machine learning methods are trained and evaluated with the same datasets, and training and inference times are tabulated for comparison. As the vegetation indices are purely mathematical calculations, only their accuracy is compared. The results are generated on a PC equipped with an Intel Core i5 3.3 GHz processor, 16 GB of memory and an Nvidia GTX745 GPU.
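As a worked example of these metrics, the sketch below computes overall accuracy, recall and precision from per-plant labels. The helper name `evaluate` is hypothetical; the two label lists are the ground truth and the MLP (BGH) detection results in the row order of Table 8 (S1–1C to S15–1C).

```python
from collections import Counter

def evaluate(truth, predicted, classes):
    """Overall accuracy plus per-class recall and precision."""
    pairs = list(zip(truth, predicted))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    tp = Counter(t for t, p in pairs if t == p)   # true positives per class
    actual = Counter(truth)                       # TP + FN per class
    detected = Counter(predicted)                 # TP + FP per class
    recall = {c: tp[c] / actual[c] for c in classes if actual[c]}
    precision = {c: tp[c] / detected[c] for c in classes if detected[c]}
    return accuracy, recall, precision

# Ground truth and MLP (BGH) detection result for the 15 plants of Table 8.
truth = list("CCAABBABHBAABHC")
detected = list("CCABBBABHBAAAHC")

accuracy, recall, precision = evaluate(truth, detected, classes="HABC")
print(f"overall accuracy: {100 * accuracy:.2f}%")
print("recall:", recall)
print("precision:", precision)
```

Thirteen of the fifteen plants are labelled correctly, which gives the overall accuracy of 86.67% and the Stage A and Stage B recall and precision of 0.8 reported for MLP (BGH).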

Result and discussion

MLP with A, B, G, H datasets

The model is trained with the A, B, G, H datasets, with a batch size of 4 and a total of 500 training iterations. The ADAM optimizer parameters are step size η = 0.001, decay rates β1 = 0.9 and β2 = 0.999, and ε = 1e−7. The validation loss approaches 0.1 after 100 iterations. Similarly, the validation accuracy rises above 0.95 after 100 iterations. The model finally achieves a training loss of 0.0595, a training accuracy of 0.9825, a validation loss of 0.0355 and a validation accuracy of 0.9864, as shown in Figure 8.

Figure 8

Training performance for each iteration (ABGH) (a) Model Loss (b) Model Accuracy.

Classification maps for the 15 HSI are generated. To visualize a classification map, G pixels are set to greyscale value 0 (black), B pixels to 80 (dark grey), A pixels to 160 (light grey), and H pixels to 240 (white). Figures 9, 10, 11, and 12 show classification maps and the composition of H, A, B and G pixels. Detection results are tabulated in Table 7.
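The greyscale rendering of a classification map amounts to a lookup table, sketched below. The class index order and the function name are assumptions; the grey levels are the ones stated above for the ABGH model.

```python
import numpy as np

CLASS_ORDER = ("A", "B", "G", "H")                 # assumed class index order
GREY_LEVEL = {"G": 0, "B": 80, "A": 160, "H": 240}

# Lookup table: class index -> grey level.
LUT = np.array([GREY_LEVEL[name] for name in CLASS_ORDER], dtype=np.uint8)

def to_greyscale(class_map):
    """Convert a map of class indices into an 8-bit greyscale image."""
    return LUT[class_map]

toy_map = np.array([[2, 1], [0, 3]])               # G, B / A, H
image = to_greyscale(toy_map)
print(image)
```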

Figure 9

Detection result (ABGH) S3–4H. 87.78% H, 0.28% A, 3.60% B, 8.36% G.

Figure 10

Detection result (ABGH) S3–2A. 0.46% H, 87.55% A, 1.12% B, 10.88% G.

Figure 11

Detection result (ABGH) S3–3B. 4.44% H, 0.67% A, 80.99% B, 13.90% G.

Figure 12

Detection result (ABGH) S1–1C. 0.37% H, 0.32% A, 45.69% B, 53.62% G.

Table 7

Composition of pixel in different classes and detection result from MLP (ABGH).

Dataset	H (%)	A (%)	B (%)	G (%)	Detection Result	Ground Truth
S1–1C	0.37	0.32	45.69	53.62	C	C
S1–2C	0.26	0.68	22.86	76.20	C	C
S2–1A	21.68	5.60	25.85	46.86	B	A
S2–2A	8.69	8.30	40.11	42.90	B	A
S2–3B	14.87	11.80	42.54	30.79	B	B
S3–1B	0.88	0.17	68.91	30.04	B	B
S3–2A	0.46	87.55	1.12	10.88	A	A
S3–3B	4.44	0.67	80.99	13.90	B	B
S3–4H	87.78	0.28	3.60	8.36	H	H
S4–1B	5.64	7.35	61.92	25.10	B	B
S4–2A	21.37	4.13	43.70	30.80	B	A
S4–3A	29.68	5.14	30.31	34.86	B	A
S4–4B	27.99	8.31	35.42	28.28	B	B
S4–5H	79.57	0.38	11.42	8.63	H	H
S15–1C	6.32	0.94	26.36	66.37	C	C

Note: MLP (ABGH) denotes the MLP model trained and run for inference with the A, B, G, H datasets.

The MLP achieves 100% accuracy in separating diseased and healthy plants. When further differentiating Stage A, B, and C, the MLP still performs with 78.57% accuracy. From the detection results, the MLP can separate Healthy and Stage C plants successfully; nevertheless, Stage A and Stage B cannot be differentiated. It is also observed that only one plant is classified as Stage A (S3–2A), and it is a plant used for training. This could indicate that the network overfits on Stage A. Considering the simplicity of the network and the dropout regularization used during training, it may also indicate that distinct spectral signatures for Stage A and Stage B do not exist, and that there are only healthy, infected, and ground spectral signatures.

MLP with B, G, H datasets

Based on the finding above, the network is tuned further. It is trained with the B, G, and H datasets only, where B represents infected pixels. Following the same training steps, it achieves a training loss of 0.0479, a training accuracy of 0.9842, a validation loss of 0.0457 and a validation accuracy of 0.9891, as shown in Figure 13.

Figure 13

Training performance for each iteration (BGH) (a) Model Loss (b) Model Accuracy.

Classification maps for the 15 HSI are generated. Figures 14, 15, 16, and 17 show example classification maps and the composition of H, B and G pixels. To visualize the classification output, G pixels are set to greyscale value 0 (black), B pixels to 120 (grey), and H pixels to 240 (white). Detection results are tabulated in Table 8.

Figure 14

Detection result (BGH) S3–4H. 89.79% H, 3.03% B, 7.18% G.

Figure 15

Detection result (BGH) S3-2A. 47.08% H, 31.84% B, 21.08% G.

Figure 16

Detection result (BGH) S3–3B. 4.76% H, 84.74% B, 10.50% G.

Figure 17

Detection result (BGH) S1–1C. 0.84% H, 50.90% B, 48.26% G.

Table 8

Composition of pixel in different classes and detection result from MLP (BGH).

Dataset	H (%)	B (%)	G (%)	H/B Ratio	Detection Result	Ground Truth
S1–1C	0.84	50.90	48.26	0.02	C	C
S1–2C	1.19	37.14	61.66	0.03	C	C
S2–1A	43.34	36.55	20.12	1.19	A	A
S2–2A	20.53	47.62	31.85	0.43	B	A
S2–3B	27.08	54.94	17.96	0.49	B	B
S3–1B	1.55	70.54	27.91	0.02	B	B
S3–2A	47.08	31.84	21.08	1.48	A	A
S3–3B	4.76	84.74	10.50	0.06	B	B
S3–4H	89.79	3.03	7.18	29.68	H	H
S4–1B	12.11	68.91	18.97	0.18	B	B
S4–2A	30.71	48.38	20.91	0.63	A	A
S4–3A	40.34	31.84	27.82	1.27	A	A
S4–4B	37.54	42.14	20.32	0.89	A	B
S4–5H	80.93	13.12	5.95	6.17	H	H
S15–1C	9.49	29.87	60.64	0.32	C	C

Note: MLP (BGH) represents the MLP model trained and run for inference with the B, G, and H datasets. The H/B ratio is the ratio of H pixels to B pixels within the ROI of the classification map.
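As an illustration of how a pixel composition could be turned into a stage label, the sketch below computes the H/B ratio and applies a simple rule. The three cut-offs (`g_cut`, `healthy_cut`, `mild_cut`) are assumptions chosen only so that the sketch agrees with the rows of Table 8; this section does not state the thresholds the authors used, so the rule should be read as hypothetical.

```python
def hb_ratio(h_pct, b_pct):
    """Ratio of healthy to infected pixel share inside the ROI."""
    if b_pct == 0:
        return float("inf")  # no infected pixels at all
    return h_pct / b_pct


def stage_from_composition(h_pct, b_pct, g_pct,
                           g_cut=40.0, healthy_cut=5.0, mild_cut=0.5):
    """Hypothetical decision rule; every cut-off here is an assumption."""
    if g_pct >= g_cut:  # ground dominates the ROI
        return "C"
    ratio = hb_ratio(h_pct, b_pct)
    if ratio >= healthy_cut:
        return "H"
    return "A" if ratio >= mild_cut else "B"


# (H %, B %, G %) values copied from a few rows of Table 8.
rows = {
    "S1–1C": (0.84, 50.90, 48.26),
    "S2–1A": (43.34, 36.55, 20.12),
    "S2–2A": (20.53, 47.62, 31.85),
    "S3–4H": (89.79, 3.03, 7.18),
    "S4–4B": (37.54, 42.14, 20.32),
}
detected = {name: stage_from_composition(*vals) for name, vals in rows.items()}
print(detected)
```

With these assumed cut-offs, S2–2A comes out as Stage B and S4–4B as Stage A, the same two misclassifications listed in Table 8.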

As in the previous experiment, the MLP achieves 100% accuracy in separating diseased from healthy plants. It also differentiates Stage A, Stage B, Stage C, and Healthy plants with an overall accuracy of 86.67%. Only one Stage A plant is wrongly categorized as Stage B, and one Stage B plant is wrongly categorized as Stage A. Table 9 compares the results of MLP (ABGH) and MLP (BGH). Although the two experiments yield comparable overall accuracy, the MLP (ABGH) model has lower recall for Stage A and lower precision for Stage B, which shows that MLP (ABGH) cannot differentiate Stage A from Stage B well.

Table 9

Classification result between MLP (ABGH) and MLP (BGH).

Category	MLP (ABGH)	MLP (BGH)
Healthy	2/2	2/2
Stage A	1/5	4/5
Stage B	5/5	4/5
Stage C	3/3	3/3
Overall Accuracy	78.57%	86.67%
Recall – Stage A	0.2	0.8
Recall – Stage B	1.0	0.8
Precision – Stage A	1.0	0.8
Precision – Stage B	0.56	0.8
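The MLP (BGH) column of Table 9 can be recomputed from the ground-truth and detection columns of Table 8. The sketch below is a minimal check using the standard definitions of recall and precision; the helper names are illustrative and not taken from the paper.

```python
def per_class_metrics(pairs, label):
    """Recall and precision for one label from (truth, predicted) pairs."""
    tp = sum(1 for t, p in pairs if t == label and p == label)
    actual = sum(1 for t, _ in pairs if t == label)
    predicted = sum(1 for _, p in pairs if p == label)
    recall = tp / actual if actual else 0.0
    precision = tp / predicted if predicted else 0.0
    return recall, precision


def overall_accuracy(pairs):
    """Percentage of trees whose detected stage equals the ground truth."""
    return 100.0 * sum(1 for t, p in pairs if t == p) / len(pairs)


# (ground truth, detection result) for the 15 trees in Table 8, in row order.
bgh = [
    ("C", "C"), ("C", "C"), ("A", "A"), ("A", "B"), ("B", "B"),
    ("B", "B"), ("A", "A"), ("B", "B"), ("H", "H"), ("B", "B"),
    ("A", "A"), ("A", "A"), ("B", "A"), ("H", "H"), ("C", "C"),
]
print(round(overall_accuracy(bgh), 2))
print(per_class_metrics(bgh, "A"), per_class_metrics(bgh, "B"))
```

Thirteen of the 15 trees are classified correctly, and Stage A and Stage B each have four true positives out of five actual and five predicted trees.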

MLP width and depth

From experiments 4.1 and 4.2, the B, G, and H datasets are sufficient for early-stage detection. The MLP is tuned further to target better accuracy or faster processing time by varying the number of hidden layers and neurons. Table 10 shows the overall accuracy of each setting. The maximum overall accuracy of 86.67% occurs four times: three times with 392 neurons per layer (2, 3, and 4 hidden layers) and once with 4 hidden layers of 128 neurons. In general, a deeper network (4 hidden layers) or a higher number of neurons (392) performs better. In terms of recall and precision, these four settings achieve 100% for Healthy and Stage C detection, except the MLP with 3 hidden layers and 392 neurons. For Stage A and Stage B, none of the settings separates them perfectly (see Tables 11, 12, 13, 14).
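To give a rough sense of how width and depth change model size, the sketch below counts trainable parameters for each setting in the grid. It assumes a plain fully connected network with biases, 392 input bands, and 3 output classes (B, G, H); these architectural details are assumptions made for illustration and are not confirmed in this section.

```python
def mlp_parameter_count(n_inputs, hidden_layers, neurons, n_outputs):
    """Weights plus biases of a fully connected network (assumed layout)."""
    sizes = [n_inputs] + [neurons] * hidden_layers + [n_outputs]
    return sum(a * b + b for a, b in zip(sizes, sizes[1:]))


# Grid of settings explored in the width and depth experiment.
grid = {
    (layers, neurons): mlp_parameter_count(392, layers, neurons, 3)
    for layers in (1, 2, 3, 4)
    for neurons in (128, 256, 392)
}
for setting, count in sorted(grid.items()):
    print(setting, count)
```

Under these assumptions the setting with 2 hidden layers of 392 neurons has roughly three times as many parameters as the setting with 4 hidden layers of 128 neurons.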

Table 10

Overall accuracy (%).

Hidden Layer \ Neuron	128	256	392
1	60.00	73.33	73.33
2	73.33	53.33	86.67
3	80.00	80.00	86.67
4	86.67	73.33	86.67

Table 11

Recall and precision – healthy.

Hidden Layer \ Neuron	128	256	392
1	1.0/0.5	0.67/1.0	0.67/0.67
2	0.67/1.0	1.0/1.0	1.0/1.0
3	1.0/1.0	1.0/0.67	0.67/1.0
4	1.0/1.0	1.0/1.0	1.0/1.0

Table 12

Recall and precision – stage A

Hidden Layer \ Neuron	128	256	392
1	0.0/0.0	0.6/0.6	0.75/0.6
2	0.6/0.6	0.0/0.0	0.8/0.8
3	1.0/0.4	0.8/0.8	0.8/0.8
4	1.0/0.6	1.0/0.4	0.71/1.0

Table 13

Recall and precision – stage B.

Hidden Layer \ Neuron	128	256	392
1	0.59/1.0	0.75/0.6	0.67/0.8
2	1.0/0.6	1.0/0.6	0.8/0.8
3	0.62/1.0	0.67/0.8	1.0/0.8
4	0.71/1.0	0.67/0.8	1.0/0.6

Table 14

Recall and precision – stage C.

Hidden Layer \ Neuron	128	256	392
1	0.75/1.0	1.0/1.0	1.0/1.0
2	0.75/1.0	0.33/1.0	1.0/1.0
3	1.0/1.0	1.0/1.0	1.0/1.0
4	1.0/1.0	0.6/1.0	1.0/1.0

Different settings also result in different training and inference times. As tabulated in Tables 15 and 16, a greater number of layers or neurons generally increases processing time. Training time is not the major concern, as a neural network does not require frequent retraining. Furthermore, the maximum training time is 752 s, which can be considered fast. From the results, two MLP settings give good detection results with better inference time: (1) 2 hidden layers with 392 neurons and (2) 4 hidden layers with 128 neurons.

Table 15

Training time (s).

Hidden Layer \ Neuron	128	256	392
1	300	326	360
2	362	406	531
3	411	474	639
4	489	538	752

Table 16

Average inference time per image (s).

Hidden Layer \ Neuron	128	256	392
1	12.2	12.8	13.9
2	11.9	13.0	13.8
3	12.6	14.3	14.8
4	13.3	14.6	18.6
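The choice of the two preferred settings can be reproduced by keeping only the settings that reach the maximum overall accuracy in Table 10 and ranking them by the average inference time in Table 16. The sketch below is one possible way to do this; the selection criterion is an interpretation of the text, not a procedure stated by the authors.

```python
NEURONS = (128, 256, 392)

# Overall accuracy (%) from Table 10, keyed by number of hidden layers.
accuracy = {
    1: (60.00, 73.33, 73.33),
    2: (73.33, 53.33, 86.67),
    3: (80.00, 80.00, 86.67),
    4: (86.67, 73.33, 86.67),
}
# Average inference time per image (s) from Table 16.
inference_s = {
    1: (12.2, 12.8, 13.9),
    2: (11.9, 13.0, 13.8),
    3: (12.6, 14.3, 14.8),
    4: (13.3, 14.6, 18.6),
}


def best_settings(accuracy, inference_s):
    """Settings with the top accuracy, sorted by inference time."""
    top = max(max(row) for row in accuracy.values())
    winners = [
        (inference_s[layers][i], layers, NEURONS[i])
        for layers, row in accuracy.items()
        for i, acc in enumerate(row)
        if acc == top
    ]
    return sorted(winners)


ranked = best_settings(accuracy, inference_s)
for seconds, layers, neurons in ranked:
    print(f"{layers} hidden layers, {neurons} neurons: {seconds} s")
```

The two fastest of the four top-accuracy settings are 4 hidden layers with 128 neurons (13.3 s) and 2 hidden layers with 392 neurons (13.8 s).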

Table 17 tabulates the accuracy scores and Table 18 the computational cost of several ML algorithms: MLP, 1D-CNN, linear SVM, and Radial Basis Function kernel SVM (RBF SVM). All algorithms are trained and run for inference with the B, G, and H datasets only. MLP is found to be the most effective ML model compared to SVM and 1D-CNN, showing the highest overall accuracy together with high recall and precision across the infection stages. It is also the fastest model in average inference time. Unlike SVM, the neural network methods need significantly longer training time to find the best curve to segregate the classes; both SVMs train in around 1 s or less. Nevertheless, the SVMs have lower classification accuracy (66.67%) than the MLP model (86.67%) and the 1D-CNN model (73.33%). Despite their fast training, the SVMs take much longer at inference: 107 s for linear SVM and 350 s for RBF SVM, which is 5–18 times slower than the MLP model. This has a huge impact on practicability, because actual inference will be done on images hundreds or thousands of times larger than the samples used in this study.

Table 17

Performance - accuracy.

Model	Overall Accuracy (%)	Recall/Precision H	Recall/Precision A	Recall/Precision B	Recall/Precision C
MLP	86.67	1.0/1.0	0.71/1.0	1.0/0.6	1.0/1.0
1D-CNN	73.33	1.0/0.2	0.56/1.0	1.0/1.0	1.0/1.0
Linear SVM	66.67	0.6/0.6	0.67/0.4	0.6/1.0	1.0/1.0
RBF SVM	66.67	0.67/0.4	0.6/0.6	0.6/1.0	1.0/1.0

Table 18

Performance – computational cost.

Model	Training Time (s)	Inference Time (s)
MLP	752	18.6
1D-CNN	512	30.7
Linear SVM	0.56	107
RBF SVM	1.05	350
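The slowdown relative to MLP can be checked directly from the inference times in Table 18. The sketch below also projects the cost for an image 1000 times larger; that projection assumes inference time scales linearly with image size, which is an assumption made here for illustration and was not measured in the study.

```python
# Average inference time per image (s), copied from Table 18.
inference_s = {"MLP": 18.6, "1D-CNN": 30.7, "Linear SVM": 107, "RBF SVM": 350}

# How many times slower each model is than the MLP.
slowdown = {name: t / inference_s["MLP"] for name, t in inference_s.items()}

# Hypothetical projection: hours per image if the image were 1000x larger
# and inference time grew linearly with image size (an assumption).
SCALE = 1000
projected_hours = {name: t * SCALE / 3600 for name, t in inference_s.items()}

for name in inference_s:
    print(f"{name}: {slowdown[name]:.1f}x, {projected_hours[name]:.1f} h")
```

Under the linear-scaling assumption, the gap between the MLP and the RBF SVM grows from minutes per image to days per image.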

All ML models have perfect recall and precision for Stage C infection. MLP has the highest recall and precision for healthy plants, indicating that it correctly separates infected from non-infected plants. On the other hand, 1D-CNN shows the highest recall and precision for Stage B and Stage C; early detection does not work well with 1D-CNN, but it separates the later stages (B and C) correctly. SVM has the worst recall and precision despite its moderate overall accuracy, which shows that it does not perform well in early-stage detection.

The proposed method is also compared with vegetation indices, which transform the spectral properties of plants into an index. In the NDVI results, shown in Figures 18–32, the histogram distribution for Stage C stands out from the rest. For Stage A, Stage B, and Healthy plants, pixel values are concentrated at 200 and above, while Stage C is distributed across 150–250. This indicates that NDVI can detect late-stage infection but cannot differentiate early-stage infection from healthy plants. The NDRE results, illustrated in Figures 33–47, are similar to those of NDVI: Stage C pixel values concentrate at 160, whereas Healthy, Stage A, and Stage B peak at 190. The only exception is S3–1B, which shows the Stage C histogram pattern. Figures 48–62 show the detection results from MTCI, where there is no significant pattern to differentiate Healthy, Stage A, Stage B, and Stage C. Figures 63–77 present the results for OSAVI: Healthy, Stage A, and Stage B have pixel values concentrated between 0 and 20, while Stage C has more values distributed across 180–250.

Overall, the vegetation indices, except MTCI, separate Stage C from the rest well, but they cannot separate Healthy, Stage A, and Stage B plants. S3–1B shows the Stage C histogram pattern in NDRE, NDVI, and OSAVI. This suggests that the key spectral features for separating infected trees at different stages are not contained within the bands used to calculate NDRE, NDVI, OSAVI, and MTCI.
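For reference, the four vegetation indices can be written as simple band ratios. The sketch below uses their commonly published definitions; which of the 392 bands the authors used for NIR, red edge, and red, and how the index values were scaled to greyscale histograms, are not stated in this section, so the sample reflectances are purely hypothetical.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index."""
    return (nir - red) / (nir + red)


def ndre(nir, red_edge):
    """Normalized Difference Red Edge index."""
    return (nir - red_edge) / (nir + red_edge)


def osavi(nir, red, soil_factor=0.16):
    """Optimised Soil-Adjusted Vegetation Index (form without the 1.16 gain)."""
    return (nir - red) / (nir + red + soil_factor)


def mtci(nir, red_edge, red):
    """MERIS Terrestrial Chlorophyll Index."""
    return (nir - red_edge) / (red_edge - red)


# Hypothetical reflectances for a single vegetated pixel.
nir, red_edge, red = 0.50, 0.30, 0.05
print(round(ndvi(nir, red), 4))
print(round(ndre(nir, red_edge), 4))
print(round(osavi(nir, red), 4))
print(round(mtci(nir, red_edge, red), 4))
```

Each index draws on only two or three bands, which is consistent with the observation that the indices miss features carried by the rest of the spectrum. Some implementations multiply OSAVI by 1.16; the form shown here omits that gain.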

Figure 18

NDVI detection result: S1–1C.

Figure 19

NDVI detection result: S1–2C.

Figure 20

NDVI detection result: S2–1A.

Figure 21

NDVI detection result: S2–2A.

Figure 22

NDVI detection result: S2–3B.

Figure 23

NDVI detection result: S3–1B.

Figure 24

NDVI detection result: S3–2A.

Figure 25

NDVI detection result: S3–3B.

Figure 26

NDVI detection result: S3–4H.

Figure 27

NDVI detection result: S4–1B.

Figure 28

NDVI detection result: S4–2A.

Figure 29

NDVI detection result: S4–3A.

Figure 30

NDVI detection result: S4–4B.

Figure 31

NDVI detection result: S5–5H.

Figure 32

NDVI detection result: S15–1C.

Figure 33

NDRE detection result: S1–1C.

Figure 34

NDRE detection result: S1–2C.

Figure 35

NDRE detection result: S2–1A.

Figure 36

NDRE detection result: S2–2A.

Figure 37

NDRE detection result: S2–3B.

Figure 38

NDRE detection result: S3–1B.

Figure 39

NDRE detection result: S3–2A.

Figure 40

NDRE detection result: S3–3B.

Figure 41

NDRE detection result: S3–4H.

Figure 42

NDRE detection result: S4–1B.

Figure 44

NDRE detection result: S4–3A.

Figure 45

NDRE detection result: S4–4B.

Figure 46

NDRE detection result: S5–5H.

Figure 47

NDRE detection result: S15–1C.

Figure 48

MTCI detection result: S1–1C.

Figure 49

MTCI detection result: S1–2C.

Figure 50

MTCI detection result: S2–1A.

Figure 51

MTCI detection result: S2–2A.

Figure 52

MTCI detection result: S2–3B.

Figure 53

MTCI detection result: S3–1B.

Figure 54

MTCI detection result: S3–2A.

Figure 55

MTCI detection result: S3–3B.

Figure 56

MTCI detection result: S3–4H.

Figure 57

MTCI detection result: S4–1B.

Figure 58

MTCI detection result: S4–2A.

Figure 59

MTCI detection result: S4–3A.

Figure 60

MTCI detection result: S4–4B.

Figure 61

MTCI detection result: S5–5H.

Figure 62

MTCI detection result: S15–1C.

Figure 63

OSAVI detection result: S1–1C.

Figure 64

OSAVI detection result: S1–2C.

Figure 65

OSAVI detection result: S2–1A.

Figure 66

OSAVI detection result: S2–2A.

Figure 67

OSAVI detection result: S2–3B.

Figure 68

OSAVI detection result: S3–1B.

Figure 69

OSAVI detection result: S3–2A.

Figure 70

OSAVI detection result: S3–3B.

Figure 71

OSAVI detection result: S3–4H.

Figure 72

OSAVI detection result: S4–1B.

Figure 73

OSAVI detection result: S4–2A.

Figure 74

OSAVI detection result: S4–3A.

Figure 75

OSAVI detection result: S4–4B.

Figure 76

OSAVI detection result: S5–5H.

Figure 77

OSAVI detection result: S15–1C.

In summary, vegetation indices cannot segregate early-stage infection from healthy plants. NDVI, NDRE, and OSAVI can detect Stage C infection, whereas MTCI shows no obvious histogram pattern across the different stages. The result is shown in Table 19. This is within expectation because only a few of the 392 bands are used in each calculation, and the bands used may not be significant for detecting BSR disease. On the other hand, the MLP learns spectral features from all 392 bands and decides which bands are useful for disease detection.

Table 19

Comparison between MLP and vegetation index.

	Healthy	Stage A	Stage B	Stage C
Ground Truth	2	5	5	3
MLP	2	4	4	3
NDVI	11			4
NDRE	11			4
OSAVI	11			4

Conclusion

In this work, a Multilayer Perceptron neural network is proposed to detect BSR disease in UAV hyperspectral images. The network is optimized with 2 hidden layers and 392 neurons in each layer. Benefitting from the self-learning ability of neural networks, it uses spectral information to learn and act as a novel vegetation index. The MLP has been validated with training data and unseen ground truth. Compared with various machine learning algorithms and vegetation indices, it shows convincing results. The results also suggest that there may be no distinct spectral signatures for Stage A and Stage B, only spectral signatures for Healthy, Infected, and Ground. Stage A, Stage B, and Healthy plants can be differentiated by the percentage composition of these pixel types in the ROI. Nonetheless, the amount of data for each stage in this study is considerably small; more field experiments and data collection are to be carried out to further validate the proposed method. This study can be further enhanced with a two-stage neural network: the first stage is the proposed MLP, which learns spectral signatures and classifies each pixel into its corresponding class, and the second stage learns the ROI pixel composition to identify the plant disease stage.

Declarations

Author contribution statement

Chee Cheong Lee: Conceived and designed the experiments; Performed the experiments; Analyzed and interpreted the data; Contributed reagents, materials, analysis tools or data; Wrote the paper. Voon Chet Koo: Conceived and designed the experiments; Contributed reagents, materials, analysis tools or data; Wrote the paper. Tien Sze Lim: Conceived and designed the experiments; Contributed reagents, materials, analysis tools or data. Yang Ping Lee: Contributed reagents, materials, analysis tools or data; Wrote the paper. Haryati Abidin: Contributed reagents, materials, analysis tools or data.

Funding statement

This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

Data availability statement

The authors do not have permission to share data.

Declaration of interests statement

The authors declare no conflict of interest.

Additional information

No additional information is available for this paper.
