André Dantas de Medeiros, Nayara Pereira Capobiango, José Maria da Silva, Laércio Junio da Silva, Clíssia Barboza da Silva, Denise Cunha Fernandes Dos Santos Dias.
Abstract
New computer vision solutions combined with artificial intelligence algorithms can help recognize patterns in biological images, reducing subjectivity and optimizing the analysis process. The aim of this study was to propose an approach based on interactive and traditional machine learning methods to classify soybean seeds and seedlings according to their appearance and physiological potential. In addition, we correlated the appearance of seeds with their physiological performance. Images of soybean seeds and seedlings were used to develop models using low-cost approaches and free-access software. The models developed showed high performance, with overall accuracy reaching 0.94 for both seed and seedling classification. The high precision of the models developed with interactive and traditional machine learning demonstrated that the method can easily be used to classify soybean seeds according to their appearance, as well as to classify soybean seedling vigor quickly and non-subjectively. The appearance of soybean seeds is strongly correlated with their physiological performance.
Year: 2020 PMID: 32647230 PMCID: PMC7347887 DOI: 10.1038/s41598-020-68273-y
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Confusion matrices and metrics of the interactive classification of soybean seeds according to their visual appearance.
| Classᵃ | High-quality seed | Kneaded seed | Purple stained seed | Broken seed | Seed coat tear | Moisture damaged seed | Greenish seed |
|---|---|---|---|---|---|---|---|
| n = 630 | | | | | | | |
| High-quality seed | 70 | 1 | 3 | 0 | 2 | 27 | 0 |
| Kneaded seed | 0 | 76 | 17 | 0 | 0 | 0 | 6 |
| Purple stained seed | 1 | 2 | 64 | 0 | 0 | 0 | 2 |
| Broken seed | 0 | 0 | 0 | 87 | 0 | 1 | 0 |
| Seed coat tear | 0 | 3 | 3 | 3 | 88 | 1 | 2 |
| Moisture damaged seed | 19 | 0 | 1 | 0 | 0 | 61 | 3 |
| Greenish seed | 0 | 8 | 2 | 0 | 0 | 0 | 77 |
| Accuracy | 0.92 | 0.95 | 0.94 | 0.99 | 0.98 | 0.92 | 0.96 |
| Kappa | 0.68 | 0.78 | 0.77 | 0.97 | 0.91 | 0.65 | 0.85 |
| Precision | 0.68 | 0.77 | 0.93 | 0.99 | 0.88 | 0.73 | 0.89 |
| Sensitivity | 0.78 | 0.84 | 0.71 | 0.97 | 0.98 | 0.68 | 0.86 |
| Specificity | 0.94 | 0.96 | 0.99 | 1 | 0.98 | 0.96 | 0.98 |
The Random Forest classifier was applied and 10% of total images were used for training.
ᵃ In the columns are the true seed classes, and in the rows are the estimated classes.
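The per-class metrics in the table above can be recomputed from the confusion matrix by treating each class in turn as "positive" and all other classes as "negative". The source does not state the formulas used, so the following is a minimal sketch that assumes the standard one-vs-rest definitions, including Cohen's kappa on the resulting 2 × 2 table. The names `CLASSES`, `CONFUSION` and `one_vs_rest_metrics` are illustrative and do not come from the paper.

```python
# Confusion matrix copied from the table above.
# Rows are the estimated classes, columns are the true classes.
CLASSES = [
    "High-quality seed", "Kneaded seed", "Purple stained seed", "Broken seed",
    "Seed coat tear", "Moisture damaged seed", "Greenish seed",
]
CONFUSION = [
    [70, 1, 3, 0, 2, 27, 0],
    [0, 76, 17, 0, 0, 0, 6],
    [1, 2, 64, 0, 0, 0, 2],
    [0, 0, 0, 87, 0, 1, 0],
    [0, 3, 3, 3, 88, 1, 2],
    [19, 0, 1, 0, 0, 61, 3],
    [0, 8, 2, 0, 0, 0, 77],
]


def one_vs_rest_metrics(matrix, k):
    """Metrics for class k against all other classes (assumed definitions)."""
    n = sum(sum(row) for row in matrix)
    tp = matrix[k][k]                        # estimated k, truly k
    fp = sum(matrix[k]) - tp                 # estimated k, truly another class
    fn = sum(row[k] for row in matrix) - tp  # truly k, estimated as another class
    tn = n - tp - fp - fn
    accuracy = (tp + tn) / n
    # Agreement expected by chance, from the marginals of the 2 x 2 table.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2
    return {
        "accuracy": accuracy,
        "kappa": (accuracy - expected) / (1 - expected),
        "precision": tp / (tp + fp),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


for k, name in enumerate(CLASSES):
    metrics = one_vs_rest_metrics(CONFUSION, k)
    print(name, {key: round(value, 2) for key, value in metrics.items()})
```

Under these assumed definitions the sketch reproduces the tabulated values for the high-quality and broken seed classes; other entries may differ in the last digit, depending on how the original software computed or rounded them.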
The number of seeds correctly classified in each class and metrics for the external seed classification models using the seed descriptors generated by Ilastik software.
| Method | Class | Training set (n = 490) | Cross-validation | Validation set (n = 210) |
|---|---|---|---|---|
| | Hits (total) | | | |
| LDA | High-quality seed | 68 (70) | – | 29 (30) |
| | Kneaded seed | 64 (70) | – | 29 (30) |
| | Purple stained seed | 69 (70) | – | 28 (30) |
| | Broken seed | 68 (70) | – | 30 (30) |
| | Seed coat tear | 70 (70) | – | 30 (30) |
| | Moisture damaged seed | 66 (70) | – | 25 (30) |
| | Greenish seed | 69 (70) | – | 27 (30) |
| | Overall accuracy | 0.98 | 0.93 ± 0.03 | 0.94 |
| | Kappa | 0.96 | 0.92 ± 0.04 | 0.93 |
| | Precision | 0.97 | 0.93 ± 0.04 | 0.94 |
| | Sensitivity | 0.97 | 0.93 ± 0.03 | 0.94 |
| | Specificity | 0.99 | 0.99 ± 0.01 | 0.99 |
| RF | High-quality seed | 64 (70) | – | 25 (30) |
| | Kneaded seed | 65 (70) | – | 27 (30) |
| | Purple stained seed | 70 (70) | – | 28 (30) |
| | Broken seed | 69 (70) | – | 30 (30) |
| | Seed coat tear | 70 (70) | – | 29 (30) |
| | Moisture damaged seed | 64 (70) | – | 24 (30) |
| | Greenish seed | 66 (70) | – | 26 (30) |
| | Overall accuracy | 0.96 | 0.89 ± 0.06 | 0.90 |
| | Kappa | 0.95 | 0.87 ± 0.07 | 0.88 |
| | Precision | 0.96 | 0.90 ± 0.06 | 0.90 |
| | Sensitivity | 0.96 | 0.89 ± 0.07 | 0.90 |
| | Specificity | 0.99 | 0.98 ± 0.07 | 0.98 |
| SVM | High-quality seed | 67 (70) | – | 27 (30) |
| | Kneaded seed | 63 (70) | – | 26 (30) |
| | Purple stained seed | 68 (70) | – | 28 (30) |
| | Broken seed | 68 (70) | – | 30 (30) |
| | Seed coat tear | 70 (70) | – | 30 (30) |
| | Moisture damaged seed | 64 (70) | – | 24 (30) |
| | Greenish seed | 67 (70) | – | 26 (30) |
| | Overall accuracy | 0.95 | 0.92 ± 0.03 | 0.91 |
| | Kappa | 0.95 | 0.91 ± 0.04 | 0.89 |
| | Precision | 0.95 | 0.92 ± 0.03 | 0.91 |
| | Sensitivity | 0.95 | 0.92 ± 0.03 | 0.90 |
| | Specificity | 0.99 | 0.99 ± 0.01 | 0.98 |
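The validation-set overall accuracy of each method follows directly from the per-class hit counts: it is the total number of hits divided by the total number of validation seeds. The short sketch below recomputes it from the counts in the table; the names `VALIDATION_HITS`, `SEEDS_PER_CLASS` and `overall_accuracy` are illustrative only, and the definition of overall accuracy as hits over total is an assumption, since the source does not give a formula.

```python
# Validation-set hits per class, copied from the table above, in the order:
# high-quality, kneaded, purple stained, broken, seed coat tear,
# moisture damaged, greenish.
VALIDATION_HITS = {
    "LDA": [29, 29, 28, 30, 30, 25, 27],
    "RF": [25, 27, 28, 30, 29, 24, 26],
    "SVM": [27, 26, 28, 30, 30, 24, 26],
}
SEEDS_PER_CLASS = 30  # each class contributes 30 validation seeds (n = 210)


def overall_accuracy(hits, per_class_total):
    """Fraction of all validation seeds assigned to their true class."""
    return sum(hits) / (per_class_total * len(hits))


for method, hits in VALIDATION_HITS.items():
    print(method, round(overall_accuracy(hits, SEEDS_PER_CLASS), 2))
```

Because the validation classes are balanced, this value also equals the mean per-class sensitivity.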
Confusion matrices and metrics of the interactive Ilastik machine learning classification of soybean seeds according to their physiological quality.
| Classᵃ | Vigorous seedling | Weak seedling | Non-germinated seed |
|---|---|---|---|
| n = 600 | | | |
| Vigorous seedling | 230 | 18 | 0 |
| Weak seedling | 1 | 157 | 9 |
| Non-germinated seed | 0 | 10 | 174 |
| Accuracy | 0.97 | 0.94 | 0.97 |
| Kappa | 0.93 | 0.85 | 0.93 |
| Precision | 0.93 | 0.94 | 0.95 |
| Sensitivity | 0.99 | 0.85 | 0.95 |
| Specificity | 0.95 | 0.98 | 0.98 |
ᵃ In the columns are the true seed classes, and in the rows are the estimated classes.
The number of seedlings classified correctly in each class and metrics for external classification using Ilastik descriptors of soybean seeds and seedlings according to their physiological quality.
| Method | Class | Training set (n = 422) | Cross-validation | Validation set (n = 178) |
|---|---|---|---|---|
| | Hits (total) | | | |
| LDA | Vigorous seedlings | 159 (162) | – | 64 (69) |
| | Weak seedlings | 122 (131) | – | 44 (55) |
| | Non-germinated seeds | 129 (129) | – | 52 (54) |
| | Overall accuracy | 0.97 | 0.92 ± 0.04 | 0.90 |
| | Kappa | 0.96 | 0.89 ± 0.06 | 0.85 |
| | Precision | 0.97 | 0.92 ± 0.04 | 0.90 |
| | Sensitivity | 0.97 | 0.93 ± 0.04 | 0.90 |
| | Specificity | 0.99 | 0.96 ± 0.02 | 0.95 |
| RF | Vigorous seedlings | 160 (162) | – | 67 (69) |
| | Weak seedlings | 120 (131) | – | 49 (55) |
| | Non-germinated seeds | 128 (129) | – | 52 (54) |
| | Overall accuracy | 0.97 | 0.93 ± 0.03 | 0.94 |
| | Kappa | 0.95 | 0.89 ± 0.04 | 0.92 |
| | Precision | 0.97 | 0.93 ± 0.03 | 0.94 |
| | Sensitivity | 0.97 | 0.93 ± 0.03 | 0.94 |
| | Specificity | 0.98 | 0.96 ± 0.02 | 0.97 |
| SVM | Vigorous seedlings | 158 (162) | – | 60 (69) |
| | Weak seedlings | 110 (131) | – | 42 (55) |
| | Non-germinated seeds | 127 (129) | – | 48 (54) |
| | Overall accuracy | 0.94 | 0.89 ± 0.03 | 0.84 |
| | Kappa | 0.90 | 0.83 ± 0.04 | 0.76 |
| | Precision | 0.94 | 0.89 ± 0.02 | 0.84 |
| | Sensitivity | 0.93 | 0.89 ± 0.03 | 0.84 |
| | Specificity | 0.97 | 0.94 ± 0.02 | 0.92 |
Figure 1. Star plot for the metrics of the machine learning classifiers tested. Classifications based on seed appearance (a) and physiological quality (b).
Figure 2. Relationship between seed appearance and seedling growth. Biplot of principal component analysis showing the importance of the seed quality parameters for class dispersion (a), probability of generating vigorous seedlings and weak seedlings or lack of germination of individual seeds according to their class (b), and seed appearance and the predominant aspect of the seedlings in each seed class (c).
Figure 3. Representation of interactive machine learning steps and physiological quality classification in soybean seeds.