Feng Qin, Dongxia Liu, Bingda Sun, Liu Ruan, Zhanhong Ma, Haiguang Wang.
Abstract
Common leaf spot (caused by Pseudopeziza medicaginis), rust (caused by Uromyces striatus), Leptosphaerulina leaf spot (caused by Leptosphaerulina briosiana) and Cercospora leaf spot (caused by Cercospora medicaginis) are four common alfalfa leaf diseases. Timely and accurate diagnosis of these diseases is critical for disease management, alfalfa quality control and the healthy development of the alfalfa industry. In this study, the identification and diagnosis of the four diseases were investigated using pattern recognition algorithms based on image-processing technology. A sub-image with one or more typical lesions was manually cropped from each acquired digital disease image. The sub-images were then segmented using twelve lesion segmentation methods, each combining a clustering algorithm (K_means clustering, fuzzy C-means clustering or K_median clustering) with a supervised classification algorithm (logistic regression analysis, Naive Bayes algorithm, classification and regression tree (CART), or linear discriminant analysis). After a comprehensive comparison, the segmentation method integrating the K_median clustering algorithm and linear discriminant analysis was chosen to obtain lesion images. After lesion segmentation with this method, a total of 129 texture, color and shape features were extracted from the lesion images. Based on the features selected using three methods (ReliefF, 1R and correlation-based feature selection (CFS)), disease recognition models were built using three supervised learning methods: random forest, support vector machine (SVM) and K-nearest neighbor (KNN). The recognition results of the models were then compared. The results showed that when the ReliefF method was used for feature selection, the SVM model built with the 45 most important features (selected from the 129 features) was the optimal model.
For this SVM model, the recognition accuracies of the training set and the testing set were 97.64% and 94.74%, respectively. Semi-supervised models for disease recognition were built based on the 45 effective features that were used for building the optimal SVM model. For the optimal semi-supervised models built with three ratios of labeled to unlabeled samples in the training set, the recognition accuracies of the training set and the testing set were both approximately 80%. The results indicated that image recognition of the four alfalfa leaf diseases can be implemented with high accuracy. This study provides a feasible solution for lesion image segmentation and image recognition of alfalfa leaf disease.
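The clustering half of the chosen segmentation method can be illustrated with a minimal K-median sketch in plain Python. This is not the authors' implementation: the abstract does not state the color space, the number of clusters, the initialization or how the clusters were combined with linear discriminant analysis, so the function name `k_medians`, the L1 distance, the RGB tuples and the starting centers below are all assumptions made for illustration.

```python
from statistics import median


def k_medians(points, centers, iterations=20):
    """Cluster points (tuples of numbers) around the given initial centers.

    Assumed variant: assignment by Manhattan (L1) distance, update by the
    component-wise median of each cluster's members.
    """
    centers = [tuple(c) for c in centers]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: index of the nearest center under L1 distance.
        labels = [
            min(range(len(centers)),
                key=lambda k: sum(abs(a - b) for a, b in zip(p, centers[k])))
            for p in points
        ]
        # Update step: move each center to the component-wise median.
        new_centers = []
        for k, old in enumerate(centers):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centers.append(tuple(median(dim) for dim in zip(*members)))
            else:
                new_centers.append(old)  # an empty cluster keeps its center
        if new_centers == centers:
            break  # converged: centers no longer move
        centers = new_centers
    return labels, centers
```

Called on a handful of hypothetical RGB pixels with one bright and one dark starting center, the function separates the brighter pixels from the darker ones; in a real pipeline the resulting cluster labels would only be the input to the supervised step.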
Year: 2016 PMID: 27977767 PMCID: PMC5158033 DOI: 10.1371/journal.pone.0168274
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Fig 1. Workflow diagram of the main steps for lesion image segmentation.
Extracted image features (excluding Hu invariant moments) and calculation formulas.
| Feature parameter | Calculation formula | Reference |
|---|---|---|
| Contrast | | |
| Energy | | |
| Homogeneity | | |
| First moment | | |
| Second moment | | |
| Third moment | | |
| Color ratio | | |
| Color ratio | | |
| Color ratio | | |
| Circularity | | |
| Complexity | | |
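The formula column of this table did not survive in this record. For orientation only, the block below gives commonly used textbook definitions of several of the listed features; they are assumptions and may differ from the formulas the authors actually used (for example, homogeneity is sometimes defined with $1+(i-j)^2$ in the denominator). Here $p(i,j)$ denotes a normalized gray-level co-occurrence matrix, $x_1,\dots,x_N$ the pixel values of one color channel, $R$, $G$, $B$ channel values, and $A$ and $P$ the lesion area and perimeter. Complexity is left out because its definition varies between sources.

```latex
% Assumed standard definitions, not taken from the source table.
\begin{align*}
\text{Contrast}      &= \sum_{i,j} (i-j)^2 \, p(i,j) \\
\text{Energy}        &= \sum_{i,j} p(i,j)^2 \\
\text{Homogeneity}   &= \sum_{i,j} \frac{p(i,j)}{1+|i-j|} \\
\text{First moment}  &= \mu = \frac{1}{N}\sum_{k=1}^{N} x_k \\
\text{Second moment} &= \sigma = \Bigl(\frac{1}{N}\sum_{k=1}^{N} (x_k-\mu)^2\Bigr)^{1/2} \\
\text{Third moment}  &= s = \Bigl(\frac{1}{N}\sum_{k=1}^{N} (x_k-\mu)^3\Bigr)^{1/3} \\
\text{Color ratio}   &= \frac{R}{R+G+B} \quad \text{(and analogously for } G \text{ and } B\text{)} \\
\text{Circularity}   &= \frac{4\pi A}{P^2}
\end{align*}
```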
Performance evaluations of the twelve segmentation methods based on the sub-images of four alfalfa leaf diseases.
| Image dataset | Clustering method | Supervised classification method | Recall (mean) | Recall (median) | Precision (mean) | Precision (median) | Score (mean) | Score (median) |
|---|---|---|---|---|---|---|---|---|
| Alfalfa common leaf spot | K_means | Logistic regression analysis | 0.6443 | 0.6727 | 0.8940 | 0.9127 | 0.7691 | 0.7925 |
| Alfalfa common leaf spot | K_means | Naive Bayes algorithm | 0.6318 | 0.6379 | 0.8922 | 0.9047 | 0.7620 | 0.7644 |
| Alfalfa common leaf spot | K_means | CART | 0.5547 | 0.5683 | 0.8716 | 0.8829 | 0.7132 | 0.7300 |
| Alfalfa common leaf spot | K_means | Linear discriminant analysis | 0.7694 | 0.7981 | 0.9235 | 0.9392 | 0.8465 | 0.8743 |
| Alfalfa common leaf spot | Fuzzy C-means | Logistic regression analysis | 0.6117 | 0.6455 | 0.8859 | 0.9119 | 0.7488 | 0.7783 |
| Alfalfa common leaf spot | Fuzzy C-means | Naive Bayes algorithm | 0.5725 | 0.5894 | 0.8785 | 0.8876 | 0.7255 | 0.7484 |
| Alfalfa common leaf spot | Fuzzy C-means | CART | 0.4856 | 0.4684 | 0.8570 | 0.8686 | 0.6713 | 0.6639 |
| Alfalfa common leaf spot | Fuzzy C-means | Linear discriminant analysis | 0.7389 | 0.8036 | 0.9155 | 0.9357 | 0.8272 | 0.8736 |
| Alfalfa common leaf spot | K_median | Logistic regression analysis | 0.6989 | 0.7261 | 0.9010 | 0.9272 | 0.7999 | 0.8291 |
| Alfalfa common leaf spot | K_median | Naive Bayes algorithm | 0.6850 | 0.6785 | 0.8982 | 0.9234 | 0.7916 | 0.7949 |
| Alfalfa common leaf spot | K_median | CART | 0.6153 | 0.5926 | 0.8806 | 0.9001 | 0.7479 | 0.7435 |
| Alfalfa common leaf spot | K_median | Linear discriminant analysis | 0.7905 | 0.8199 | 0.9220 | 0.9366 | 0.8562 | 0.8810 |
| Alfalfa rust | K_means | Logistic regression analysis | 0.7508 | 0.7714 | 0.9459 | 0.9568 | 0.8484 | 0.8626 |
| Alfalfa rust | K_means | Naive Bayes algorithm | 0.7200 | 0.7354 | 0.9396 | 0.9517 | 0.8298 | 0.8444 |
| Alfalfa rust | K_means | CART | 0.7021 | 0.7370 | 0.9372 | 0.9518 | 0.8197 | 0.8413 |
| Alfalfa rust | K_means | Linear discriminant analysis | 0.8303 | 0.8376 | 0.9583 | 0.9639 | 0.8943 | 0.9013 |
| Alfalfa rust | Fuzzy C-means | Logistic regression analysis | 0.6741 | 0.6772 | 0.9338 | 0.9423 | 0.8039 | 0.8091 |
| Alfalfa rust | Fuzzy C-means | Naive Bayes algorithm | 0.6366 | 0.6424 | 0.9266 | 0.9386 | 0.7816 | 0.7872 |
| Alfalfa rust | Fuzzy C-means | CART | 0.5998 | 0.6156 | 0.9197 | 0.9314 | 0.7598 | 0.7721 |
| Alfalfa rust | Fuzzy C-means | Linear discriminant analysis | 0.8051 | 0.8116 | 0.9549 | 0.9609 | 0.8800 | 0.8870 |
| Alfalfa rust | K_median | Logistic regression analysis | 0.8166 | 0.8384 | 0.9542 | 0.9644 | 0.8854 | 0.9025 |
| Alfalfa rust | K_median | Naive Bayes algorithm | 0.8019 | 0.8288 | 0.9458 | 0.9595 | 0.8738 | 0.8968 |
| Alfalfa rust | K_median | CART | 0.7915 | 0.8341 | 0.9475 | 0.9640 | 0.8695 | 0.9019 |
| Alfalfa rust | K_median | Linear discriminant analysis | 0.8516 | 0.8596 | 0.9606 | 0.9671 | 0.9061 | 0.9137 |
| Alfalfa Leptosphaerulina leaf spot | K_means | Logistic regression analysis | 0.8329 | 0.8736 | 0.9634 | 0.9722 | 0.8982 | 0.9248 |
| Alfalfa Leptosphaerulina leaf spot | K_means | Naive Bayes algorithm | 0.8561 | 0.8908 | 0.9635 | 0.9713 | 0.9098 | 0.9335 |
| Alfalfa Leptosphaerulina leaf spot | K_means | CART | 0.7665 | 0.7933 | 0.9571 | 0.9697 | 0.8618 | 0.8803 |
| Alfalfa Leptosphaerulina leaf spot | K_means | Linear discriminant analysis | 0.9002 | 0.9285 | 0.9657 | 0.9716 | 0.9330 | 0.9510 |
| Alfalfa Leptosphaerulina leaf spot | Fuzzy C-means | Logistic regression analysis | 0.8102 | 0.8334 | 0.9622 | 0.9733 | 0.8862 | 0.9040 |
| Alfalfa Leptosphaerulina leaf spot | Fuzzy C-means | Naive Bayes algorithm | 0.8181 | 0.8407 | 0.9623 | 0.9718 | 0.8902 | 0.9078 |
| Alfalfa Leptosphaerulina leaf spot | Fuzzy C-means | CART | 0.7188 | 0.7458 | 0.9536 | 0.9674 | 0.8362 | 0.8569 |
| Alfalfa Leptosphaerulina leaf spot | Fuzzy C-means | Linear discriminant analysis | 0.8900 | 0.9170 | 0.9652 | 0.9723 | 0.9276 | 0.9441 |
| Alfalfa Leptosphaerulina leaf spot | K_median | Logistic regression analysis | 0.8919 | 0.9255 | 0.9613 | 0.9691 | 0.9266 | 0.9481 |
| Alfalfa Leptosphaerulina leaf spot | K_median | Naive Bayes algorithm | 0.9091 | 0.9480 | 0.9583 | 0.9626 | 0.9337 | 0.9565 |
| Alfalfa Leptosphaerulina leaf spot | K_median | CART | 0.8629 | 0.9156 | 0.9556 | 0.9656 | 0.9093 | 0.9409 |
| Alfalfa Leptosphaerulina leaf spot | K_median | Linear discriminant analysis | 0.9287 | 0.9495 | 0.9636 | 0.9690 | 0.9462 | 0.9583 |
| Alfalfa Cercospora leaf spot | K_means | Logistic regression analysis | 0.5851 | 0.6044 | 0.8250 | 0.8471 | 0.7051 | 0.7172 |
| Alfalfa Cercospora leaf spot | K_means | Naive Bayes algorithm | 0.5296 | 0.5394 | 0.8044 | 0.8173 | 0.6670 | 0.6823 |
| Alfalfa Cercospora leaf spot | K_means | CART | 0.4240 | 0.4184 | 0.7627 | 0.7672 | 0.5933 | 0.5908 |
| Alfalfa Cercospora leaf spot | K_means | Linear discriminant analysis | 0.7656 | 0.7824 | 0.8932 | 0.9041 | 0.8294 | 0.8431 |
| Alfalfa Cercospora leaf spot | Fuzzy C-means | Logistic regression analysis | 0.5808 | 0.5954 | 0.8231 | 0.8401 | 0.7019 | 0.7184 |
| Alfalfa Cercospora leaf spot | Fuzzy C-means | Naive Bayes algorithm | 0.5089 | 0.5184 | 0.7968 | 0.8122 | 0.6529 | 0.6680 |
| Alfalfa Cercospora leaf spot | Fuzzy C-means | CART | 0.4184 | 0.4120 | 0.7620 | 0.7667 | 0.5902 | 0.5878 |
| Alfalfa Cercospora leaf spot | Fuzzy C-means | Linear discriminant analysis | 0.7508 | 0.7695 | 0.8877 | 0.9027 | 0.8192 | 0.8330 |
| Alfalfa Cercospora leaf spot | K_median | Logistic regression analysis | 0.6237 | 0.6330 | 0.8362 | 0.8612 | 0.7300 | 0.7488 |
| Alfalfa Cercospora leaf spot | K_median | Naive Bayes algorithm | 0.5705 | 0.5885 | 0.8171 | 0.8373 | 0.6938 | 0.7149 |
| Alfalfa Cercospora leaf spot | K_median | CART | 0.4876 | 0.4702 | 0.7824 | 0.7985 | 0.6350 | 0.6292 |
| Alfalfa Cercospora leaf spot | K_median | Linear discriminant analysis | 0.7786 | 0.7938 | 0.8951 | 0.9109 | 0.8369 | 0.8496 |
| Aggregated image dataset | K_means | Logistic regression analysis | 0.6789 | 0.7040 | 0.8846 | 0.9116 | 0.7818 | 0.8057 |
| Aggregated image dataset | K_means | Naive Bayes algorithm | 0.6510 | 0.6474 | 0.8730 | 0.8960 | 0.7620 | 0.7690 |
| Aggregated image dataset | K_means | CART | 0.5650 | 0.5728 | 0.8481 | 0.8791 | 0.7065 | 0.7138 |
| Aggregated image dataset | K_means | Linear discriminant analysis | 0.8105 | 0.8317 | 0.9242 | 0.9419 | 0.8674 | 0.8870 |
| Aggregated image dataset | Fuzzy C-means | Logistic regression analysis | 0.6567 | 0.6633 | 0.8808 | 0.9044 | 0.7687 | 0.7814 |
| Aggregated image dataset | Fuzzy C-means | Naive Bayes algorithm | 0.6132 | 0.6041 | 0.8658 | 0.8830 | 0.7395 | 0.7412 |
| Aggregated image dataset | Fuzzy C-means | CART | 0.5288 | 0.5180 | 0.8430 | 0.8679 | 0.6859 | 0.6922 |
| Aggregated image dataset | Fuzzy C-means | Linear discriminant analysis | 0.7940 | 0.8184 | 0.9201 | 0.9388 | 0.8571 | 0.8777 |
| Aggregated image dataset | K_median | Logistic regression analysis | 0.7282 | 0.7648 | 0.8916 | 0.9270 | 0.8099 | 0.8468 |
| Aggregated image dataset | K_median | Naive Bayes algorithm | 0.7022 | 0.7042 | 0.8795 | 0.9119 | 0.7908 | 0.8028 |
| Aggregated image dataset | K_median | CART | 0.6407 | 0.6714 | 0.8600 | 0.9035 | 0.7504 | 0.7868 |
| Aggregated image dataset | K_median | Linear discriminant analysis | 0.8294 | 0.8514 | 0.9249 | 0.9424 | 0.8771 | 0.8997 |
Note: The aggregated image dataset was obtained after aggregation of four image datasets of alfalfa common leaf spot, alfalfa rust, alfalfa Leptosphaerulina leaf spot and alfalfa Cercospora leaf spot.
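The table does not define how Recall, Precision and Score were computed. The mean columns are consistent with Score being the arithmetic mean of recall and precision (for the first row, (0.6443 + 0.8940) / 2 ≈ 0.769), so the sketch below assumes that definition together with the usual pixel-wise recall and precision against a reference mask. The function name `segmentation_scores` and the flat 0/1 mask representation are illustrative choices, not taken from the source.

```python
def segmentation_scores(predicted, reference):
    """Pixel-wise recall, precision and their average for two binary masks.

    Both masks are flat sequences of 0/1 values of equal length, where 1
    marks a lesion pixel. Score = (recall + precision) / 2 is an assumption
    inferred from the table values, not a definition given in the source.
    """
    tp = sum(1 for p, r in zip(predicted, reference) if p and r)
    fp = sum(1 for p, r in zip(predicted, reference) if p and not r)
    fn = sum(1 for p, r in zip(predicted, reference) if not p and r)
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    score = (recall + precision) / 2
    return recall, precision, score
```

Under this reading, the pattern in the table (precision well above recall for most methods) would mean that the segmented regions rarely include healthy tissue but often miss part of the lesion.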
Fig 2. Results of automatic segmentation of sub-images of four alfalfa leaf diseases using the segmentation method integrated with K_median clustering algorithm and linear discriminant analysis.
A: Sub-image of alfalfa common leaf spot. B: Image after segmentation of alfalfa common leaf spot. C: Sub-image of alfalfa rust. D: Image after segmentation of alfalfa rust. E: Sub-image of alfalfa Leptosphaerulina leaf spot. F: Image after segmentation of alfalfa Leptosphaerulina leaf spot. G: Sub-image of alfalfa Cercospora leaf spot. H: Image after segmentation of alfalfa Cercospora leaf spot.
Names of image features extracted and results of feature selection using the ReliefF method, the 1R method and the CFS method.
| Feature name | Feature ranking based on the ReliefF method | Feature ranking based on the 1R method | Feature name | Feature ranking based on the ReliefF method | Feature ranking based on the 1R method | Feature name | Feature ranking based on the ReliefF method | Feature ranking based on the 1R method |
|---|---|---|---|---|---|---|---|---|
| φLab_L1 | 61 | 81 | φHSV_H2 | 71 | 30 | Second moment RGB_B | 4 | 127 |
| φLab_L2 | 77 | 103 | φHSV_H3 | 101 | 40 | Second moment HSV_H | 32 | 55 |
| φLab_L3 | 102 | 106 | φHSV_H4 | 85 | 43 | Second moment HSV_S | 38 | 67 |
| φLab_L4 | 87 | 128 | φHSV_H5 | 106 | 24 | Second moment HSV_V | 50 | 114 |
| φLab_L5 | 95 | 113 | φHSV_H6 | 88 | 28 | Second moment Lab_L | 28 | 18 |
| φLab_L6 | 94 | 112 | φHSV_H7 | 129 | 49 | Second moment Lab_b | 14 | 9 |
| φLab_L7 | 126 | 126 | φHSV_S1 | 75 | 38 | Second moment Lab_a | 41 | 83 |
| φLab_a1 | 45 | 1 | φHSV_S2 | 100 | 41 | Third moment RGB_R | 49 | 48 |
| φLab_a2 | 70 | 31 | φHSV_S3 | 110 | 72 | Third moment RGB_G | 54 | 53 |
| φLab_a3 | 84 | 77 | φHSV_S4 | 111 | 75 | Third moment RGB_B | 42 | 118 |
| φLab_a4 | 79 | 58 | φHSV_S5 | 113 | 65 | Third moment HSV_H | 44 | 76 |
| φLab_a5 | 115 | 123 | φHSV_S5 | 113 | 65 | Third moment HSV_S | 36 | 69 |
| φLab_a6 | 120 | 98 | φHSV_S6 | 112 | 50 | Third moment HSV_V | 53 | 110 |
| φLab_a7 | 116 | 124 | φHSV_V1 | 63 | 74 | Third moment Lab_L | 30 | 19 |
| φLab_b1 | 56 | 64 | φHSV_V2 | 74 | 85 | Third moment Lab_b | 13 | 11 |
| φLab_b2 | 68 | 46 | φHSV_V3 | 96 | 102 | Third moment Lab_a | 39 | 79 |
| φLab_b3 | 105 | 42 | φHSV_V4 | 83 | 109 | Contrast RGB_R | 59 | 66 |
| φLab_b4 | 119 | 61 | φHSV_V5 | 98 | 104 | Energy RGB_R | 35 | 71 |
| φLab_b5 | 117 | 121 | φHSV_V6 | 90 | 86 | Homogeneity RGB_R | 18 | 25 |
| φLab_b6 | 128 | 63 | φHSV_V7 | 124 | 111 | Contrast RGB_G | 64 | 100 |
| φLab_b7 | 114 | 122 | Circularity | 2 | 5 | Energy RGB_G | 55 | 73 |
| φRGB_R1 | 62 | 68 | Complexity | 69 | 4 | Homogeneity RGB_G | 22 | 39 |
| φRGB_R2 | 73 | 80 | φshape1 | 66 | 37 | Contrast RGB_B | 21 | 52 |
| φRGB_R3 | 97 | 99 | φshape2 | 72 | 44 | Energy RGB_B | 1 | 34 |
| φRGB_R4 | 82 | 91 | φshape3 | 80 | 59 | Homogeneity RGB_B | 16 | 47 |
| φRGB_R5 | 99 | 107 | φshape4 | 81 | 32 | Contrast HSV_H | 47 | 6 |
| φRGB_R6 | 89 | 94 | φshape5 | 107 | 97 | Energy HSV_H | 6 | 27 |
| φRGB_R7 | 125 | 101 | φshape6 | 93 | 45 | Homogeneity HSV_H | 31 | 3 |
| φRGB_G1 | 51 | 90 | φshape7 | 127 | 108 | Contrast HSV_S | 33 | 16 |
| φRGB_G2 | 76 | 62 | First moment RGB_R | 25 | 17 | Energy HSV_S | 52 | 54 |
| φRGB_G3 | 104 | 119 | First moment RGB_G | 27 | 29 | Homogeneity HSV_S | 9 | 8 |
| φRGB_G4 | 86 | 120 | First moment RGB_B | 24 | 36 | Contrast HSV_V | 60 | 70 |
| φRGB_G5 | 92 | 93 | Color ratio RGB_R | 3 | 14 | Energy HSV_V | 37 | 84 |
| φRGB_G6 | 91 | 95 | Color ratio RGB_G | 10 | 12 | Homogeneity HSV_V | 19 | 23 |
| φRGB_G7 | 123 | 125 | Color ratio RGB_B | 23 | 92 | Contrast Lab_L | 67 | 87 |
| φRGB_B1 | 48 | 78 | First moment HSV_H | 8 | 33 | Energy Lab_L | 46 | 96 |
| φRGB_B2 | 78 | 60 | First moment HSV_S | 7 | 51 | Homogeneity Lab_L | 20 | 22 |
| φRGB_B3 | 109 | 116 | First moment HSV_V | 26 | 21 | Contrast Lab_a | 58 | 2 |
| φRGB_B4 | 103 | 117 | First moment Lab_L | 29 | 20 | Energy Lab_a | 5 | 26 |
| φRGB_B5 | 121 | 105 | First moment Lab_b | 15 | 10 | Homogeneity Lab_a | 43 | 15 |
| φRGB_B6 | 108 | 89 | First moment Lab_a | 40 | 82 | Contrast Lab_b | 65 | 7 |
| φRGB_B7 | 122 | 129 | Second moment RGB_R | 34 | 115 | Energy Lab_b | 11 | 35 |
| φHSV_H1 | 12 | 13 | Second moment RGB_G | 17 | 57 | Homogeneity Lab_b | 57 | 56 |
Note: Features marked with an asterisk (*) in the table were selected based on the CFS method.
Recognition results for four alfalfa leaf diseases using random forest models based on selected features using the ReliefF method, the 1R method and the CFS method.
| Number of decision trees | ReliefF method: recognition accuracy of training set (%) | ReliefF method: recognition accuracy of testing set (%) | ReliefF method: number of applied features | 1R method: recognition accuracy of training set (%) | 1R method: recognition accuracy of testing set (%) | 1R method: number of applied features | CFS method: recognition accuracy of training set (%) | CFS method: recognition accuracy of testing set (%) | CFS method: number of applied features |
|---|---|---|---|---|---|---|---|---|---|
| 10 | 99.82 | 90.56 | 74 | 99.73 | 89.29 | 90 | 99.64 | 89.29 | 21 |
| 20 | 99.91 | 91.47 | 57 | 99.91 | 90.20 | 88 | 99.91 | 88.57 | 21 |
| 30 | 100 | 92.38 | 52 | 99.91 | 90.56 | 129 | 99.91 | 88.75 | 21 |
| 40 | 100 | 92.56 | 61 | 100 | 90.56 | 126 | 100 | 89.47 | 21 |
| 50 | 100 | 92.38 | 59 | 99.91 | 90.74 | 76 | 100 | 88.02 | 21 |
| 60 | 100 | 92.20 | 65 | 100 | 91.29 | 128 | 100 | 90.20 | 21 |
| 70 | 100 | 92.74 | 62 | 100 | 90.74 | 119 | 100 | 89.11 | 21 |
| 80 | 100 | 92.56 | 57 | 100 | 90.56 | 105 | 100 | 88.38 | 21 |
| 90 | 100 | 92.38 | 55 | 100 | 90.93 | 107 | 100 | 88.57 | 21 |
| 100 | 100 | 92.20 | 54 | 100 | 90.93 | 114 | 100 | 89.11 | 21 |
Note: For each number of decision trees, only the best random forest model for the recognition of the four alfalfa leaf diseases is shown when the features were selected using the ReliefF method or 1R method.
Recognition results for four alfalfa leaf diseases using SVM models based on selected features using the ReliefF method, the 1R method and the CFS method.
| Model | Feature selection method | | | Recognition accuracy of training set (%) | Recognition accuracy of testing set (%) | Number of applied features |
|---|---|---|---|---|---|---|
| Model 4 | The ReliefF method | 6.964 | 0.435 | 97.64 | 94.74 | 45 |
| Model 5 | The 1R method | 36.758 | 0.144 | 97.91 | 94.37 | 122 |
| Model 6 | The CFS method | 21.112 | 0.758 | 95.18 | 91.83 | 21 |
Note: Only the best SVM model for the image recognition of the four alfalfa leaf diseases is shown when the features were selected using the ReliefF method or 1R method.
Recognition results for four alfalfa leaf diseases using KNN models based on selected features using the ReliefF method, the 1R method and the CFS method.
| K value | ReliefF method: recognition accuracy of training set (%) | ReliefF method: recognition accuracy of testing set (%) | ReliefF method: number of applied features | 1R method: recognition accuracy of training set (%) | 1R method: recognition accuracy of testing set (%) | 1R method: number of applied features | CFS method: recognition accuracy of training set (%) | CFS method: recognition accuracy of testing set (%) | CFS method: number of applied features |
|---|---|---|---|---|---|---|---|---|---|
| 5 | 93.55 | 90.38 | 68 | 92.36 | 88.93 | 71 | 92.27 | 87.30 | 21 |
| 9 | 91.27 | 89.66 | 39 | 90.00 | 88.75 | 71 | 90.64 | 87.11 | 21 |
| 13 | 90.00 | 89.66 | 38 | 88.64 | 88.20 | 84 | 90.18 | 86.93 | 21 |
Note: Only the best KNN model for the image recognition of the four alfalfa leaf diseases is shown when the features were selected using the ReliefF method or 1R method.
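As a reminder of what the number of neighbors (5, 9 or 13 in the first column) controls, the following is a minimal KNN classifier in plain Python. It is a sketch under assumptions: the source does not state the distance metric or any feature scaling, so Euclidean distance on raw feature tuples is assumed, and the name `knn_predict` and all feature values used with it are hypothetical.

```python
import math
from collections import Counter


def knn_predict(train_features, train_labels, query, k=5):
    """Predict the label of `query` by majority vote among its k nearest
    training samples, using Euclidean distance (an assumed metric)."""
    # Sort training samples by distance to the query point.
    neighbors = sorted(
        (math.dist(x, query), label)
        for x, label in zip(train_features, train_labels)
    )
    # Count the labels of the k closest samples and return the most common.
    votes = Counter(label for _, label in neighbors[:k])
    return votes.most_common(1)[0][0]
```

A larger K averages the vote over more neighbors, which is consistent with the table's trend of training accuracy falling as K grows from 5 to 13 while testing accuracy changes much less.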
Recognition results of each alfalfa leaf disease using the optimal model (Model 4).
| Individual disease | Recognition accuracy of training set (%) | Recognition accuracy of testing set (%) |
|---|---|---|
| Alfalfa common leaf spot | 89.19 | 75.00 |
| Alfalfa rust | 99.63 | 96.24 |
| Alfalfa Leptosphaerulina leaf spot | 97.30 | 96.76 |
| Alfalfa Cercospora leaf spot | 99.15 | 97.74 |
Fig 3. Changes in cumulative contribution rates with increasing number of principal components based on 45 features used for building Model 4.
Fig 4. Recognition results for four alfalfa leaf diseases using semi-supervised models at a ratio of labeled to unlabeled samples of 2:1.
Fig 5. Recognition results for four alfalfa leaf diseases using semi-supervised models at a ratio of labeled to unlabeled samples of 1:1.
Fig 6. Recognition results for four alfalfa leaf diseases using semi-supervised models at a ratio of labeled to unlabeled samples of 1:2.
Recognition results of four alfalfa leaf diseases using optimal semi-supervised models with various ratios of labeled to unlabeled samples.
| Model | Ratio of labeled to unlabeled samples | Number of principal components | Cumulative contribution rate (%) | Recognition accuracy of training set (%) | Recognition accuracy of testing set (%) |
|---|---|---|---|---|---|
| Model 10 | 2:1 | 9 | 92.22 | 82.82 | 82.76 |
| Model 11 | 1:1 | 10 | 93.45 | 80.36 | 80.58 |
| Model 12 | 1:2 | 10 | 93.45 | 79.18 | 80.58 |
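The cumulative contribution rate in this table is, in standard principal component analysis, the percentage of total variance explained by the first few components. The sketch below shows that calculation from a list of eigenvalues. How the authors chose 9 or 10 components is not stated here, so the threshold argument is an assumption, and the function name `components_for_threshold` and any eigenvalues passed to it are hypothetical.

```python
def components_for_threshold(eigenvalues, threshold_percent):
    """Return (number of components, cumulative contribution rate in %)
    for the smallest leading set of principal components whose cumulative
    share of total variance reaches `threshold_percent`."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    if total <= 0:
        raise ValueError("eigenvalues must have a positive sum")
    cumulative = 0.0
    for count, value in enumerate(ordered, start=1):
        cumulative += value
        rate = 100.0 * cumulative / total
        if rate >= threshold_percent:
            return count, rate
    # Threshold above 100%: all components are needed.
    return len(ordered), 100.0
```

With made-up eigenvalues 4, 3, 2 and 1, the first three components account for 90% of the variance, so a 90% threshold returns three components.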