Literature DB >> 34917715

FruitNet: Indian fruits image dataset with quality for machine learning applications.

Vishal Meshram1, Kailas Patil1.   

Abstract

Fast and precise fruit classification or recognition as per quality parameter is the unmet need of agriculture business. This is an open research problem, which always attracts researchers. Machine learning and deep learning techniques have shown very promising results for the classification and object detection problems. Neat and clean dataset is the elementary requirement to build accurate and robust machine learning models for the real-time environment. With this objective we have created an image dataset of Indian fruits with quality parameter which are highly consumed or exported. Accordingly, we have considered six fruits namely apple, banana, guava, lime, orange, and pomegranate to create a dataset. The dataset is divided into three folders (1) Good quality fruits (2) Bad quality fruits, and (3) Mixed quality fruits each consists of six fruits subfolders. Total 19,500+ images in the processed format are available in the dataset. We strongly believe that the proposed dataset is very helpful for training, testing and validation of fruit classification or reorganization machine leaning model.
© 2021 The Authors. Published by Elsevier Inc.

Entities:  

Keywords:  Computer vision; Convolutional neural network; Deep learning; Fruit classification; Fruit detection; Fruit image dataset; Machine learning

Year:  2021        PMID: 34917715      PMCID: PMC8668825          DOI: 10.1016/j.dib.2021.107686

Source DB:  PubMed          Journal:  Data Brief        ISSN: 2352-3409


Specifications Table Value of the Data The dataset is comprehensive which consist of 19500+ high-quality images of six different classes. The dataset consist of good quality, bad quality, and mixed quality fruit images. To the best of our knowledge this is the first open access dataset of indian fruits consistes of good, bad and mixed quality fruits. This dataset is useful to build applications of fruit classification and detection with quality. The dataset will be useful for training, testing and validation of fruit classification or reorganization model. The dataset is useful to build fruit classification with quality applications which are beneficial for farmers, agriculture industries, wholesalers, hawkers, and customers, and fruit export companies.

Data Description

The profit percentage share of fruit market is substantial with respect to the total agriculture output [1], [2], [3]. In the agro-industry fast and accurate fruit classification is the highest need. The fruits can be classified into different classes as per their external features like shape, size and color using some computer vision and deep learning techniques [4], [5], [6], [7], [8]. The FruitNet dataset was created to include Indian fruits along with its quality parameters for those which are highly consumed or exported as per [9]. It consists of six classes of Indian fruits namely apple, banana, guava, lime, orange, and pomegranate. They further categorized into good quality, bad quality, and mixed quality. The fruit images were taken with different background, in different light conditions in indoor and outdoor environment. The Fig. 1 shows the sample images in the dataset consisting of images taken in various environments.
Fig. 1

Partial images of the dataset.

Partial images of the dataset.

Experimental Design, Materials and Methods

Experimental design

The image data acquisition process is shown in Fig. 2. The fruit images were acquired using three different make of camera's i.e. iPhone6 (Apple), ZUK (Z2 Plus), and Realme (Realme 5 Pro) mobile's high resolution rear camera. In all 19500+ images were captured using camera and then were segregated and saved in respective folders as per their quality and classification.
Fig. 2

Fruits data acquisition process.

Fruits data acquisition process. The data acquisition process steps are shown in Table 1. The fruit images are captured in the natural and artificial lighting conditions with different angles and background in months of July to October. Images pre-processing is done using python script. In the pre-processing we changed the dimensions to 256 × 256 which is standard resolution required to build object classification or object detection model.
Table 1

Data acquisition steps.

Sr. No.StepDurationActivity
1.Data GatheringJuly to OctoberDaily captured the fruits images in the natural and artificial light with different angles and background.

2.Pre-processing and creating datasetNovemberRun the python script to pre-process the images (convert all images in 256 × 256 resolution) and save the images into respective folders as per their quality and classification (i.e. bad, good and mixed)
Data acquisition steps.

Materials or specification of image acquisition system

The fruit images are captured using Apple iphone6 with rear camera of 8 megapixels, Z2 plus with rear camera of 13 megapixel, and realme 5 pro with rear camera of 48 megapixels. All dataset images of original size 3024 × 3024 were resized to 256 × 256 dimensions using a python script. The images are in .jpg images. The images acquired in variety of environmental conditions such as different light conditions, different background, and from different angles. After capturing the images were organized as Bad quality, Good quality, and Mixed quality folders. Further each quality folder has six different folders of fruit classes i.e. apple, banana, guava, lime, orange, and pomegranate, respectively. The specifications of devices used for image acquisition and acquired images specifications are shown in Tables 2 and 3, respectively.
Table 2

Specification of image acquisition device.

Sr. No.Camera ParticularsDetails
1Camera makerAppleZUKRealme
2Camera ModeliPhone 6Z2 PlusRealme 5 Pro
3F-stopf/2.2f/2.2f/1.8
4Exposure time1/25 s1/214 s1/33 s
5ISO SpeedISO-250ISO-100ISO-1120
6Exposure bias0 step0 step0 step
7Focal length4 mm4 mm5 mm
8metering modePatternCentered Weighted AverageUnknown
9Flash modeNo flashNo flashNo flash
1035mm focal length29290
Table 3

Specification of images.

Details as per Fruit classes
Sr. No.ParticularsBad FruitGood FruitMixed Fruit
1Dimension256 × 256256 × 256256 × 192
2Width256 pixels256 pixels256 pixels
3Height256 pixels256 pixels192 pixels
4Horizontal Resolution72 dpi96 dpi72 dpi
5Vertical Resolution72 dpi96 dpi72 dpi
6Bit Depth242424
7Resolution unit222
8Color representationsRGBsRGBUncalibrated
Specification of image acquisition device. Specification of images.

Method

All fruit images are acquired using three mobile make with a high resolution rear camera in different angles and different backgrounds. The orignal images of size 3024 × 3024 were resized to 256 × 256 using a python script. Table 4 describes the classes, number of image taken and the environments in which images are taken.
Table 4

FruitNet dataset details.

Quality classesFruit classes ConsideredImage Taken in which DirectionImage Taken in different BackgroundsNo. of Images of each denominationTotal No. of Images
Bad qualityapple,banana,guava,lime,orange,pomegranateFront Direction, Top View, Backward Direction,Bottom View,Direction Rotated 180 degrees,Dark color, grass, light color, ground, multicolorapple - 1141banana - 1087guava - 1129lime - 1085orange - 1159pomegranate - 11876778

Good qualityapple,banana,guava,lime,orange,pomegranateFront Direction, Top View, Backward Direction,Bottom View,Direction Rotated 180 degrees,Dark color, grass, light color, ground, multicolorapple - 1149banana - 1113guava -1152lime -1094orange - 1216pomegranate - 594011664

Mixed qualityapple,banana,guava,lime,orange,pomegranateFront Direction, Top View, Backward Direction,Bottom View,Direction Rotated 180 degrees,Dark color, grass, light color, ground, multicolorapple – 113banana – 285guava – 148lime – 278orange – 125pomegranate - 1251074

Total Number of Images in the Dataset19526
FruitNet dataset details.

Ethics Statement

There is no funding for the present effort. There is no conflict of interest. The data is available in public domain.

CRediT authorship contribution statement

Vishal Meshram: Methodology, Software, Validation, Writing – original draft. Kailas Patil: Conceptualization, Supervision, Writing – review & editing.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
SubjectMachine learning, agriculture science, horticulture
Specific subject areaFruits image dataset with quality classification (good, bad, and mixed)
Type of dataIndian fruits images
How data were acquiredFruits images were using high resolution mobile phone camera in the natural and artificial light conditions with different backgrounds.
Data formatRaw
Parameters for data collectionThe fruit dataset images are .jpg images of 256 × 256 dimension and resolution is 72 dpi.
Description of data collectionThe fruits images were collected using high resolution mobile phones rear camera. The original .jpg images of fruits are of dimensions 3024 × 3024. These images are resized to 256 × 256 dimensions. The dataset is categorized into 3 subfolders Good Quality Fruits, Bad Quality Fruits, and Mixed Quality Fruits. Further each folder contain six fruits classes namely Apple, Banana, Guava, Lime, Orange, Pomegranate. The images were taken at the different backgrounds and in different lighting conditions. The proposed dataset can be used for training, testing and validation of fruit classification or reorganization model.
Data source locationVISHWAKARMA UNIVERSITYSurvey No. 2, 3, 4 Laxmi Nagar, Kondhwa Budruk, Pune - 411 048. Maharashtra, India.Latitude and longitude: 18.4603° N, 73.8836° EHUBTOWN COUNTRYWOODS SOCIETYTilekar Nagar, Kondhwa Budruk, Pune - 411 048. Maharashtra, India.Latitude and longitude: 18.442866 ° N, 73.884894° E
Data accessibilityRepository name: FruitNet: Indian Fruits Dataset with quality (Good, Bad & Mixed quality)Data identification number(doi): 10.17632/b6fftwbr2v.1Direct URL to data: https://data.mendeley.com/datasets/b6fftwbr2v/1
  2 in total

1.  PSFD-Musa: A dataset of banana plant, stem, fruit, leaf, and disease.

Authors:  Epsita Medhi; Nabamita Deb
Journal:  Data Brief       Date:  2022-06-28

2.  Dataset of Stagnant Water and Wet Surface Label Images for Detection.

Authors:  Sonali Bhutad; Kailas Patil
Journal:  Data Brief       Date:  2021-12-23
  2 in total

北京卡尤迪生物科技股份有限公司 © 2022-2023.