Jorge Parraga-Alava, Kevin Cusme, Angélica Loor, Esneider Santander.
Abstract
In this article we introduce a robusta coffee leaf image dataset called RoCoLe. The dataset contains 1560 leaf images with visible red mites and spots (denoting coffee leaf rust presence) for infection cases and images without such structures for healthy cases. In addition, the dataset includes annotations regarding objects (leaves), state (healthy and unhealthy) and the severity of disease (leaf area with spots). All images were obtained under real-world conditions in the same coffee plant field using a smartphone camera. The RoCoLe dataset facilitates the evaluation of machine learning algorithms used in image segmentation and classification problems related to plant disease recognition. The current dataset is freely and publicly available at https://doi.org/10.17632/c5yvn32dzg.2.
Keywords: Coffee leaf rust; Hemileia vastatrix; Machine learning; Plant diseases recognition; Red spider mite; Tetranychus urticae
Year: 2019 PMID: 31516934 PMCID: PMC6727496 DOI: 10.1016/j.dib.2019.104414
Source DB: PubMed Journal: Data Brief ISSN: 2352-3409
Fig. 1. Aerial view of the coffee plants in the farmland of the study area (CIIDEA, Calceta, Manabí, Ecuador). The dashed line marks the zone containing the plants used to obtain images. Empty spaces correspond to plants that were not considered because they were still growing, or to unseeded areas.
Severity scale of rust in coffee leaf.
| Level | Affected leaf area (spots) |
|---|---|
| 1 | 1–5% |
| 2 | 6–20% |
| 3 | 21–50% |
| 4 | >50% |
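The severity scale above can be expressed as a small lookup function. This is a minimal sketch, not the authors' implementation: the function name `rust_severity_level` is hypothetical, and two boundary choices are assumptions that the table does not specify, namely that 0% affected area maps to level 0 (no rust) and that fractional values falling between the listed bands (for example 5.5%) are assigned to the higher band.

```python
def rust_severity_level(affected_pct: float) -> int:
    """Map the percentage of leaf area covered by rust spots to a severity level.

    Levels 1 to 4 follow the published scale. Level 0 (no spots) and the
    handling of values between bands are assumptions made for this sketch.
    """
    if not 0.0 <= affected_pct <= 100.0:
        raise ValueError("affected_pct must be between 0 and 100")
    if affected_pct == 0:
        return 0  # assumption: no visible spots, outside the published scale
    if affected_pct <= 5:
        return 1  # 1-5% (assumption: values below 1% also count as level 1)
    if affected_pct <= 20:
        return 2  # 6-20%
    if affected_pct <= 50:
        return 3  # 21-50%
    return 4      # >50%


if __name__ == "__main__":
    for pct in (0, 3, 12, 35, 80):
        print(pct, rust_severity_level(pct))
```

If the affected area is derived from a segmentation mask, the percentage would typically be the number of spot pixels divided by the number of leaf pixels, multiplied by 100.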
Fig. 2. Annotation examples of a segmentation mask in the RoCoLe dataset.
Fig. 3. Examples of coffee leaves in different states (classes). A) Healthy. B) Red spider mite. C) Rust level 1. D) Rust level 2. E) Rust level 3. F) Rust level 4.
Specifications Table
| Subject area | |
| More specific subject area | |
| Type of data | |
| How data was acquired | |
| Data format | |
| Experimental factors | The images were taken at a working distance of 200–300 mm from the coffee plants on cloudy, sunny and windy days, against a variety of backgrounds. Images show the upper and back sides of healthy and infected leaves. |
| Experimental features | |
| Data source location | |
| Data accessibility | Data are publicly available in the Mendeley Data public repository under DOI 10.17632/c5yvn32dzg.2, at |
Data can be used for training, testing and validation of classification algorithms for binary and multiclass problems, using images of healthy leaves or leaves with red spider mite presence and images graded by rust infection severity, respectively. Data can be used for benchmarking algorithms for coffee plant disease recognition and diagnosis, as well as for image segmentation. Data can serve as motivation for further research into plant diseases and machine learning methods for coffee pest identification. Images in the dataset include ground-truth annotations for objects (leaves, red spider mite presence, rust presence and infection severity), which can be used to improve the accuracy of image classification and segmentation algorithms trained on this dataset, as well as to extract new knowledge about diseases that affect the leaves of coffee plants.
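To illustrate how the same annotations could serve both binary and multiclass tasks, the sketch below collapses the six classes shown in Fig. 3 into the healthy/unhealthy state described in the abstract. The label strings and the function name `to_binary_state` are hypothetical; the actual annotation files in the repository may use different names and formats.

```python
# Hypothetical label strings for the six classes shown in Fig. 3.
MULTICLASS_LABELS = (
    "healthy",
    "red_spider_mite",
    "rust_level_1",
    "rust_level_2",
    "rust_level_3",
    "rust_level_4",
)


def to_binary_state(label: str) -> str:
    """Collapse a multiclass label into the binary state 'healthy' or 'unhealthy'."""
    if label not in MULTICLASS_LABELS:
        raise ValueError(f"unknown label: {label!r}")
    return "healthy" if label == "healthy" else "unhealthy"


if __name__ == "__main__":
    for name in MULTICLASS_LABELS:
        print(name, "->", to_binary_state(name))
```

Keeping the multiclass label as the stored value and deriving the binary state on demand means a single annotation set can feed both kinds of classifier.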