Literature DB >> 35198687

A dataset of cassava whitefly count images.

Joyce Nakatumba-Nabende1, Jeremy Francis Tusubira2, Claire Babirye2, Solomon Nsumba2, Christopher Omongo Abu3.   

Abstract

Whiteflies are insect vectors that affect a variety of plants such as tomatoes, cabbages, sweet potatoes, eggplants, and cassava. In Uganda, whiteflies are a major contributor to the spread of Cassava Brown Streak Disease (CBSD). By suckling on infected cassava plants, whiteflies can potentially transfer the Cassava Brown Streak Virus that causes CBSD to uninfected clean plants nearby when they migrate. When they attack the cassava plants in large numbers, whiteflies can also cause significant physical damage through suckling. This eventually can lead to leaf loss or plant death. Whiteflies also excrete "honeydew", which harbors a fungus known as "sooty mold" that covers the leaves, limiting access to sunlight which in turn affects plant food production. As part of their work, the cassava breeders often conduct studies to assess the population of whiteflies in cassava fields through a manual process of visual inspection which can be arduous and time-consuming. This paper presents a cassava whitefly dataset that has been curated to enable researchers to build solutions for the automation of the count and detection of whiteflies. The dataset contains 3,000 images captured in a whitefly trial site in Uganda. It depicts different variations of whitefly infestation from low to high infestation. This data has already been used to provide a proof-of-concept solution for whitefly counting based on Machine Learning approaches.
© 2022 The Authors. Published by Elsevier Inc.

Entities:  

Keywords:  CBSD; Cassava; Whitefly; Whitefly count

Year:  2022        PMID: 35198687      PMCID: PMC8845108          DOI: 10.1016/j.dib.2022.107911

Source DB:  PubMed          Journal:  Data Brief        ISSN: 2352-3409


Specifications Table

Value of the Data

The whitefly image data can be used to train machine learning models to perform automated whitefly surveillance tasks by cassava breeders. The automation of whitefly detection as shown in [2] can help to improve the efficiency and accuracy of traditional whitefly counting methods described in [5]. The availability of the images also provides the opportunity to conduct evidence-based verification of whitefly count field surveys and to easily conduct new studies on collected data. The primary beneficiaries of the whitefly image data are researchers who can apply computing algorithms in the agriculture domain. By conducting tasks such as computer vision, the data can be used to develop models and systems for performing tasks like whitefly counting. Cassava breeders can utilize the developed systems to potentially improve the efficacy of whitefly studies and develop new standards and methods for performing the critical task of whitefly counting. This cassava whitefly image dataset provides an intersection between the computing and agricultural communities to collaborate towards building tools adaptable to the agriculture domain. The dataset can be used as a benchmark to evaluate the ability of machine learning object detection models to generalize to data from a different domain. This data can also be used to train weights that can be used in transfer learning for performing similar tasks.

Data Description

We present a dataset that contains images of whitefly-infested cassava leaves captured from cassava fields located at the National Crop Resources Research Institute (NaCRRI). The data contains images of adult whiteflies, which are one of the leading contributors to the spread of Cassava Brown Streak Disease (CBSD). The images were captured from the top open cassava leaves of randomly selected cassava plants. The technique used to capture the image data is suitable for monitoring the populations of adult whiteflies in the field [3]. The data is presented through two figures and one table. Fig. 1 shows an example of a sample leaf captured with the whiteflies. Fig. 2 shows images that depict different levels of population count of the whiteflies on the cassava leaves. Table 1 provides a description of the category of whitefly count abundance. The dataset contains 3000 images and their corresponding annotation files. This raw dataset is publicly available as a Mendeley repository [1]. The dataset presented is sufficient for training object detection models. This is based on the experiments conducted in [2] where a sample of 2000 images was sufficient to train a whitefly detection model with high precision.
Fig. 1

A sample image captured showing whiteflies on the backside of the leaves.

Fig. 2

Sample images for each whitefly abundance category. The image on the left shows a cassava leaf with low abundance of whiteflies, the middle image shows a leaf with moderate abundance and the image on the right shows a cassava leaf with super abundance.

Table 1

Category of whitefly count abundance.

Category/AbundanceWhitefly count (WC)
LowLess than 10
ModerateBetween 10 - 99
SuperGreater than 99
A sample image captured showing whiteflies on the backside of the leaves. Sample images for each whitefly abundance category. The image on the left shows a cassava leaf with low abundance of whiteflies, the middle image shows a leaf with moderate abundance and the image on the right shows a cassava leaf with super abundance. The adult female whiteflies are found on the underside of the cassava leaves and the leaf stalks [8]. To capture the whitefly images, the cassava leaves were turned to reveal the underside of the leaves shown in Fig. 1. The whiteflies can be seen as tiny white specks on the infested leaves.

Experimental Design, Materials and Methods

Field data collection

Image cassava whitefly data was collected from Namulonge, the National Crops Resources Research Institute (NACRRI). Namulonge is located in the central region of Uganda in Wakiso District and is at 00 32″ N of the Equator and 320 37″ E. Namulonge has an average temperature of 20 C, Wind E at 2 km/h, and 90% Humidity. It is located 10 km north of Gayaza and 30 km North East of Kampala, the capital city of Uganda at an elevation of 1160 m above sea level. The image data was collected from cassava trial gardens set up for the purpose of studying whiteflies populations on cassava leaves, by the cassava breeders. The images were captured using a Tecno Spark 3 which is an android-based smartphone with a camera resolution of 13 megapixels. This produces JPEG images with a dimensions of 4000 × 1920 pixels or vice-versa depending on the orientation of the camera when the picture was taken as shown in Fig. 1. The cassava whitefly image data was collected from the NAROCASS 1 variety [9] which is a commercially grown cassava variety in Uganda. The cassava was at age three months after planting. Category of whitefly count abundance. Traditional approaches estimate whitefly count by sampling five of the top most fully grown cassava leaves on the plant [5] and counting the number of fully grown visible whiteflies on each leaf. According to [6] and as shown in Table 1 whitefly count (WC) can be grouped into three levels of abundance; Low abundance where the whitefly count is less than 10. Moderate abundance where the whitefly count is between 10 and 100. Super abundance where the whitefly count is greater than 100. Fig. 2 shows sample cassava image files for each of the whitefly abundance categories described in Table 1.

Data preprocessing

The data was labeled for the task of using computer vision techniques to identify and provide a count of the number of whiteflies in the images. Objects of interest were labeled by drawing bounding boxes around each whitefly and tagging each box with a corresponding text label which was “whitefly”. The bounding boxes were drawn using an open-source image annotation tool called LabelImg1 as shown in Fig. 3. The LabelImg tool stores the annotations in the PASCAL VOC (VisualObject Classes) [4] format which is a popular format for use in computer vision object detection tasks. Each annotation is an XML (Extensible Markup Language) [7] file that contains information about the positions of the whitefly objects in each image and the text label assigned to each object
Fig. 3

Image annotation using the LabelImg tool. Adapted from [2].

Image annotation using the LabelImg tool. Adapted from [2]. Using the WC abundance definition in [6], the images in this dataset have been placed into three categories; images with a low whitefly abundance, images with a moderate whitefly abundance and images with a super abundance of whiteflies. The WC in each image was determined by counting the number of annotated whitefly objects in the corresponding annotation file.

Ethics Statement

The work presented in this paper is original and has not been published or submitted for consideration anywhere else. All the required consent to collect and publish the data was gathered from the responsible parties and the data is available in the public domain.

CRediT authorship contribution statement

Joyce Nakatumba-Nabende: Conceptualization, Writing – original draft. Jeremy Francis Tusubira: Data curation, Methodology, Writing – review & editing. Claire Babirye: Writing – review & editing. Solomon Nsumba: Writing – review & editing. Christopher Omongo Abu: Writing – review & editing.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
SubjectComputer Vision, Machine Learning, Agriculture
Specific subject areaWhiteflies on Cassava Leaves
Type of dataRaw image data
How data were acquired13-megapixel smartphone cameras were used to capture the pictures of cassava infested whitefly leaves.
Data formatImages (JPEG) of cassava leaves infested with whiteflies.Annotations for the labeled whiteflies are stored as XML files.The data is organized in three categories based on the whitefly abundance on the images of the cassava leaves.
Parameters for data collectionImage data collected during cassava whitefly trial harvests from randomly selected plants.
Description of data collectionImages were collected in the field from cassava plants in Namulonge.
Data source locationInstitution: National Crop Resources Research Institute (NaCRRI)Town: NamulongeCountry: UgandaLocation: Cassava whitefly trial fields at NaCRRI (00 32° N of the Equator and 320 37° E)
Data accessibilityData is publicly available at Mendeley data.Repository name: MendeleyData identification number: 10.17632/5g38399z9p.2Direct URL to data: https://data.mendeley.com/datasets/5g38399z9p/2The dataset citation is in Ref [1].
Related research articleTusubira, J.F., Nsumba, S., Ninsiima, F., Akera, B., Acellam, G., Nakatumba, J., Mwebaze, E., Quinn, J. and Oyana, T., 2020. Improving In-field Cassava Whitefly Pest Surveillance with Machine Learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (pp. 68–69) [2].DOI: 10.1109/CVPRW50498.2020.00042
  4 in total

Review 1.  Methods of surveying the incidence and severity of cassava mosaic disease and whitefly vector populations on cassava in Africa: a review.

Authors:  P Sseruwagi; W S Sserubombwe; J P Legg; J Ndunguru; J M Thresh
Journal:  Virus Res       Date:  2004-03       Impact factor: 3.303

2.  Resistance of advanced cassava breeding clones to infection by major viruses in Uganda.

Authors:  Daniel Rogers Mukiibi; Titus Alicai; Robert Kawuki; Geoffrey Okao-Okuja; Fred Tairo; Peter Sseruwagi; Joseph Ndunguru; Elijah Miinda Ateka
Journal:  Crop Prot       Date:  2019-01       Impact factor: 2.571

3.  What has changed in the outbreaking populations of the severe crop pest whitefly species in cassava in two decades?

Authors:  Hadija M Ally; Hajar El Hamss; Christophe Simiand; M N Maruthi; John Colvin; Christopher A Omongo; Helene Delatte
Journal:  Sci Rep       Date:  2019-10-15       Impact factor: 4.379

Review 4.  Cassava whitefly, Bemisia tabaci (Gennadius) (Hemiptera: Aleyrodidae) in East African farming landscapes: a review of the factors determining abundance.

Authors:  S Macfadyen; C Paull; L M Boykin; P De Barro; M N Maruthi; M Otim; A Kalyebi; D G Vassão; P Sseruwagi; W T Tay; H Delatte; Z Seguni; J Colvin; C A Omongo
Journal:  Bull Entomol Res       Date:  2018-02-13       Impact factor: 1.750

  4 in total

北京卡尤迪生物科技股份有限公司 © 2022-2023.