
Resources for image-based high-throughput phenotyping in crops and data sharing challenges.

Monica F Danilevicz1, Philipp E Bayer1, Benjamin J Nestor1, Mohammed Bennamoun2, David Edwards1.   

Abstract

High-throughput phenotyping (HTP) platforms are capable of monitoring the phenotypic variation of plants through multiple types of sensors, such as red, green, and blue (RGB) cameras, hyperspectral sensors, and computed tomography, which can be associated with environmental and genotypic data. Because of the wide range of information provided, HTP datasets represent a valuable asset for characterizing crop phenotypes. As HTP becomes widely employed and more tools and data are released, it is important that researchers are aware of these resources and how they can be applied to accelerate crop improvement. Researchers may exploit these datasets either for phenotype comparison or as a benchmark to assess tool performance and to support the development of tools that generalize better between different crops and environments. In this review, we describe the use of image-based HTP for yield prediction, root phenotyping, development of climate-resilient crops, detection of pathogen and pest infestation, and quantitative trait measurement. We emphasize the need for researchers to share phenotypic data, and offer a comprehensive list of available datasets to assist crop breeders and tool developers in leveraging these resources to accelerate crop breeding.
© The Author(s) 2021. Published by Oxford University Press on behalf of American Society of Plant Biologists.

Year:  2021        PMID: 34608963      PMCID: PMC8561249          DOI: 10.1093/plphys/kiab301

Source DB:  PubMed          Journal:  Plant Physiol        ISSN: 0032-0889            Impact factor:   8.340


Introduction

Plant phenotypic variation is the result of the complex interplay between genetics and environmental conditions (Boyer, 1982; Ficke et al., 2018; Frantzeskakis et al., 2020). Advances in genome sequencing have uncovered substantial genetic diversity within species (Hirsch et al., 2014; Golicz et al., 2016; Zhao et al., 2018; Hübner et al., 2019; Song et al., 2020). However, this wealth of genetic information is rarely translated into gains for real-world agricultural crops (Araus et al., 2018), partially due to the lack of phenotypic information associated with the genetic variation (Furbank and Tester, 2011; Mir et al., 2019). High-throughput phenotyping (HTP) has emerged to overcome this phenomics bottleneck. HTP platforms enable noninvasive data collection through several types of sensors that can be deployed in glasshouse facilities or on field monitoring devices, ranging from ground platforms to unmanned aerial vehicles (UAVs) and satellites (Li et al., 2014; Hank et al., 2015; Kirchgessner et al., 2016; Naito et al., 2017; Danzi et al., 2019). These platforms can capture temporal phenotypic variation for large populations across plant development, generating massive amounts of data. Systematic large-scale phenotyping platforms can be used for genetic dissection of targeted traits and assist the development of better performing varieties (Li et al., 2018; Mir et al., 2019). The increasing adoption of HTP platforms creates demand for new computer-based tools that can leverage these datasets and integrate associated information (e.g. experimental conditions, weather measurements, and genotypic data) to extract meaningful insights regarding crop development and performance (Tattaris et al., 2016; van Eeuwijk et al., 2019). HTP data analysis is a nontrivial task, requiring a high level of expertise in both computer science and plant development to understand the implications of phenotypic variation in the plant. 
Completeness of HTP metadata is crucial for plant physiologists to characterize the genetic and environmental conditions in which a phenotype occurs. The majority of available datasets lack a clear description of the conditions depicted; it is therefore important that new datasets include metadata and methods to collect environmental data in their experimental design. Mathematical models, machine learning, and most recently deep learning models can be used as guides to identify stress and predict crop performance under defined conditions (Bai et al., 2016; Atkinson et al., 2017a; Joalland et al., 2017; Moghadam et al., 2017; Naito et al., 2017; Fernandez-Gallego et al., 2018; Prey et al., 2019; Walter et al., 2019; Ducournau et al., 2020; Kerkech et al., 2020; Selvaraj et al., 2020). Deep learning models have the advantage of automatically extracting features from the image by constructing increasingly abstract representations of the relationships within the dataset (LeCun et al., 2015). In contrast, classic statistical approaches rely solely on the researcher to manually define the features before the analysis. Because deep learning models build features based solely on the dataset, they usually require large amounts of high-quality data to achieve high performance. A broad diversity of sensors enables capturing and quantifying previously undetectable phenotypic traits; combining the reflectance of different spectra, for example, allows for the detection of abiotic stresses such as nitrogen deficiency and frost damage. HTP has the potential to accelerate crop breeding, producing data that can be used to identify varieties with improved traits and higher performance, but there are technical challenges to overcome. Deep learning models are effective in plant phenotyping tasks due to their capacity to leverage highly complex and multidimensional data, but their performance is dependent on the quality and diversity of the dataset. 
A large effort is required to facilitate sharing high-quality phenotype datasets because they provide a key resource for developing tools for agronomic trait measurement and crop breeding. Developing a custom pipeline of software applications for processing HTP raw sensor data into traits, followed by its analysis, accounts for a major part of the cost of adopting HTP (Reynolds et al., 2019). However, if the data analysis pipeline is reused from a previous project, the cost of implementing the pipeline drops to 10%–20% (Reynolds et al., 2019); being able to employ all or part of a developed HTP analysis pipeline can therefore decrease the costs of adopting HTP in research projects. Many challenges prevent the research community from efficiently reusing data processing tools and analysis pipelines. For example, the lack of interoperability between processing tools (image processing, weather data transformation) and analysis models (trait quantification, classification), due to the absence of a standardized data processing methodology, prevents the utilization of previously published analysis models (Krajewski et al., 2015; Janssen et al., 2017; Yu et al., 2017; van Eeuwijk et al., 2019). The inconsistency of data processing pipelines can be partially overcome by providing tools to standardize data input for target analysis models (Busemeyer et al., 2013; Yu et al., 2017; Chopin et al., 2018; Selvaraj et al., 2020); however, a robust solution requires standardizing the syntax (formats) and semantics (definitions, ontology) of input/output files used by HTP data processing tools (Janssen et al., 2017). Data sharing is an important step for the advancement of crop breeding and the development of analysis pipelines (Zamir, 2013; Mir et al., 2019). The need to establish a repository to host raw phenotypic datasets with associated information has long been recognized (Zamir, 2013; Lobet, 2017). 
A centralized database with access to raw data and standardized metadata would increase discoverability and reutilization of the datasets, allowing researchers to reanalyze data using updated state-of-the-art tools, which may lead to the identification of novel and potentially interesting results (Zamir, 2013). Even though some platforms have been developed to host selected datasets (Granier et al., 2006; Lobet et al., 2013; Seren et al., 2017), the majority of datasets are insufficiently described, preventing plant researchers from properly analyzing phenotypic variations and leading to misinterpretation of results. The Minimum Information About a Plant Phenotyping Experiment (MIAPPE) initiative (https://www.miappe.org/) provides a framework for phenotypic data sharing designed to standardize data publication with a controlled ontology vocabulary referencing multiple previously established ontologies (Papoutsoglou et al., 2020). The MIAPPE guidelines are compatible with the Breeding Application Programming Interface (BrAPI), which aims to increase the interoperability of breeding datasets and provide easy access to breeding tools (Selby et al., 2019; Papoutsoglou et al., 2020). Adoption of these standardization guidelines for dataset description is a crucial step in transforming HTP datasets into data assets for plant researchers and breeders. Adequately described datasets can also be used to establish benchmark datasets (detailed in Box 1). Benchmark datasets provide a standard for comparing the performance of computer-based tools, helping uncover each tool's limitations and strengths (Zamir, 2013; Minervini et al., 2016; Lobet, 2017). Assessing tool performance guides the user to apply the most effective methodology for their experimental design and data (Lobet, 2017). This review reports on previously published image-based HTP datasets with the aim of assisting the community to access and benefit from their development. 
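
The kind of metadata completeness that MIAPPE-style guidelines promote can be illustrated with a small completeness check over a dataset record. This is only a sketch: the field names below are simplified stand-ins chosen for illustration, not the official MIAPPE vocabulary.

```python
# Illustrative metadata completeness check. REQUIRED_FIELDS uses
# simplified, hypothetical descriptor names, NOT the official MIAPPE schema.
REQUIRED_FIELDS = {
    "investigation_title", "study_start_date", "location",
    "species", "genotype", "growth_facility", "observed_variables",
}

def missing_metadata(record: dict) -> set:
    """Return the required descriptors absent from a dataset record."""
    return REQUIRED_FIELDS - set(record)

record = {
    "investigation_title": "Wheat drought trial 2021",
    "study_start_date": "2021-05-01",
    "location": "field site A",
    "species": "Triticum aestivum",
    "observed_variables": ["canopy height", "NDVI"],
}
print(sorted(missing_metadata(record)))  # flags the undocumented descriptors
```

A check of this kind, run before deposition, makes insufficiently described datasets visible to the submitter rather than to the downstream reuser.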
The main contributions presented are (1) highlighting the challenges faced by researchers when reusing HTP datasets; (2) describing some of the criteria required when creating an effective benchmark dataset; and (3) presenting a collection of image-based HTP datasets available as a resource for researchers to improve model development and analysis.

Box 1: What is a benchmark dataset?

A benchmark dataset is a comprehensive data collection that represents the real-life data a method or tool may encounter when performing a given task. Benchmark datasets are often employed as a standardized way to assess a new method's performance and to find its strengths and limitations (Lobet, 2017). General requisites for benchmark datasets in most of the applications described in this study are: (1) intentional, the dataset must be designed to be employed on specific tasks; (2) relevant, the data should be coherent with the event it attempts to describe and have its limitations identified and clearly stated; (3) representative, meaning that the dataset covers most cases commonly encountered when performing a task within the defined scope (Schaafsma and Vihinen, 2018), reporting any underrepresented classes; (4) sizable, the dataset must contain enough examples of each class or target to enable training machine learning and computer vision methods; (5) reliable, the data points must be experimentally obtained rather than artificially generated, and annotations must be performed by plant experts (Sasidharan Nair and Vihinen, 2013); and (6) descriptive, the dataset must have an extensive description of the data collection methodology (sensors, UAV altitude), biological information (species, genotype, growth stage), and experimental conditions (temperature, soil, water availability). The importance of these criteria changes depending on the purpose of dataset utilization. For computer tool developers, the first five criteria are probably more relevant, as they can directly impact the performance and robustness of a new method. For plant physiology researchers, the sixth criterion is particularly important, as it enables reutilization of the datasets to gain a deep understanding of the plant conditions, extract meaningful insights from plant phenotype analysis, and compare results with external phenotypic datasets.

Applications of HTP

Improving crop productivity

A myriad of components contribute to yield, as plant performance is regulated by a combination of genetic factors (G), environmental factors (E), and the interaction between them (G × E; Juliana et al., 2018; Montesinos-López et al., 2018). Because of the high complexity that underlies plant performance, breeders have to submit potential varieties to extensive field testing to determine their potential yield (Hunt et al., 2020). Field HTP can substantially accelerate the breeding process by allowing breeders to predict end-of-season traits, such as yield and biomass, at early growth stages. Early yield prediction allows researchers to bypass plant growth time, a key limiting factor in crop breeding. In a soybean (Glycine max) study in which 2,551 genotypes were grown in different locations, yield, plant maturity, and seed size could be predicted at an early stage using Cubist regression, which outperformed Partial Least Squares Regression, Random Forests, Artificial Neural Networks, and Support Vector Regression (Yuan et al., 2019). Similar results were observed for wheat (Triticum aestivum), barley (Hordeum vulgare), and other soybean genotypes (Bai et al., 2016; Nevavuori et al., 2019). Although promising, the results are constrained to the conditions evaluated, since interannual weather variation, changes in agroecological zones, differences in farm management practices, sensor use and specifications, and other factors can cause instability in model accuracy. The broad diversity of remote sensors enables capturing different aspects of the plant phenotype. Different combinations of RGB, multispectral, and thermal image data associated with weather and soil data have been employed to train deep learning models for crop yield forecasting (Vega et al., 2015; Gracia-Romero et al., 2019; Zhang et al., 2019a; Maimaitijiang et al., 2020; da Silva et al., 2020). 
These models can support differentiating crop performance in relation to irrigation regimes (Gracia-Romero et al., 2019), quantify growth rate under nitrogen treatment (Holman et al., 2016; Arroyo et al., 2017; Aranguren et al., 2020), estimate variation in wheat grain protein content (Rodrigues et al., 2018; Sharabiani et al., 2019), and monitor crop height variation during the season (Ziliani et al., 2018). A systematic review of machine learning models for crop yield prediction was published by van Klompenburg et al. (2020), showing that deep learning models are increasing in popularity. The most used architectures were Convolutional Neural Networks (CNNs) and Long Short-Term Memory (LSTM) networks. These architectures were created for different purposes: LSTM is designed specifically for sequence prediction tasks, while the CNN structure is suited to extracting features from complex image data. Both architectures can benefit from transfer learning, in which pretrained weights obtained with different data are implemented in the new model (He et al., 2016), allowing rapid training and high performance. Multimodal machine learning can be employed to analyze datasets with multiple data sources (rainfall, temperature, multispectral images, soil data); each data type is a modality that is analyzed separately and then combined with the others to increase model performance (Baltrušaitis et al., 2017). van Klompenburg et al. (2020) observed that temperature, rainfall, and soil type were the most used data types in machine learning models, but different feature combinations and the volume of data can directly impact model performance and should be tested during development. A few HTP datasets were recently released with the goal of improving yield prediction and, more specifically, supporting genotype-to-phenotype prediction. 
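
The late-fusion idea behind such multimodal models can be sketched in a few lines: each modality is encoded into a feature vector, the vectors are concatenated, and a prediction head operates on the fused representation. All shapes and (random) weights below are invented for illustration; a real yield model would use trained CNN/LSTM encoders per modality rather than these stand-in linear layers.

```python
import numpy as np

# Late-fusion sketch for multimodal yield prediction. Shapes, weights,
# and inputs are hypothetical; real encoders would be trained networks.
rng = np.random.default_rng(0)

def encode(x, w):
    """Stand-in per-modality encoder: one linear layer with ReLU."""
    return np.maximum(x @ w, 0.0)

# Three modalities: a flattened image patch, a weather-sequence summary,
# and soil covariates, each mapped to an 8-dimensional feature vector.
image = rng.normal(size=64)
weather = rng.normal(size=12)
soil = rng.normal(size=5)
w_img, w_wx, w_soil = (rng.normal(size=(n, 8)) for n in (64, 12, 5))

fused = np.concatenate([encode(image, w_img),
                        encode(weather, w_wx),
                        encode(soil, w_soil)])
w_head = rng.normal(size=24)           # linear head on the 24-d fused vector
yield_estimate = float(fused @ w_head)
print(fused.shape, yield_estimate)
```

The design choice illustrated here is that each modality keeps its own encoder, so a sensor can be added or dropped without retraining the others from scratch.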
The Genomes to Fields (G2F) datasets comprise genotype (single nucleotide polymorphism information), manual phenotype measurements, climatic data, soil information, inbred ear images, and UAV-collected multispectral and hyperspectral images of several maize (Zea mays) varieties grown over multiple years (Supplemental Data Set 1; McFarland et al., 2020). Detailed metadata are essential for understanding genotype-to-phenotype relationships in each season/environment. However, the G2F field trials were carried out in a single location, which limits the robustness of the traits identified. For sorghum (Sorghum bicolor) and wheat, the Transportation Energy Resources from Renewable Agriculture Phenotyping Reference Platform (TERRA-REF) database offers a comprehensive resource of sensor data (five thermal, spectral, and shape imaging sensors), phenotypic measurements, and environmental and genomic data, including genome sequencing of 384 varieties and genotyping by sequencing of 768 varieties (Supplemental Data Set 1; LeBauer et al., 2017; Burnette et al., 2018). TERRA-REF has four sensing platforms to collect image-based phenotyping and agronomic traits from both controlled-environment and field-grown plants. TERRA-REF maintains a manuscript management section on its website where researchers wishing to use the data can register their proposed manuscript to prevent overlap and encourage collaboration. Oftentimes, researchers will delay publishing datasets until their planned publications are completed. Nonetheless, the TERRA-REF approach to registering publications enables early publishing of the data and allows other groups to explore different aspects of the dataset or collaborate. Federated learning is another strategy that can be used when the data must be protected due to privacy or security concerns, and it is seeing increasing use in medical research (Lee et al., 2018; Huang et al., 2019; Rieke et al., 2020). 
Federated learning allows machine learning models to be trained collaboratively without exchanging the data. In this framework, each data-owning institution downloads the model and trains it locally; the trained parameters from each institution are then exported and aggregated, creating a model that benefits from previously inaccessible datasets while data governance and accessibility remain in the control of the data owner (Konečný et al., 2016). Both the G2F and TERRA-REF datasets present limitations regarding the types of environments represented, the species grown, and the treatments applied. Other similar phenotyping initiatives covering different locations and plant species (including noncrop plants) are needed to depict phenotypic variation. Nonetheless, the above datasets offer an extensive resource that can assist the identification of quantitative trait loci (QTLs) related to crop performance and the development of tools for genotype-to-phenotype prediction based on multidimensional data, and they could ultimately be used as benchmark datasets to assess tool performance. Moreover, smaller datasets for field trial experiments can be found at the Global Agricultural Research Data Innovation Acceleration Network and the International Maize and Wheat Improvement Center, described in Supplemental Data Set 1. Grain yield in wheat is directly related to spike head population density, size, and maturity stage. The Annotated Crop Image Dataset (ACID) provides images with coordinates to identify wheat spikes under greenhouse conditions (Pound et al., 2017b). ACID was designed for training novel tools for identifying spike heads and measuring individual head traits, but such tools could be further applied to new datasets and used to link measured traits with genotypic variability. 
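
The aggregation step of federated learning can be sketched as a sample-weighted average of locally trained parameters, which is the core of the federated averaging scheme; the sites, weight values, and sample counts below are hypothetical.

```python
import numpy as np

# Federated averaging sketch: each institution trains locally and only
# parameter vectors leave the site. Values here are made up for illustration.
local_params = {
    "site_A": (np.array([0.9, 1.1]), 400),  # (trained weights, n_samples)
    "site_B": (np.array([1.3, 0.7]), 100),
}

def federated_average(updates):
    """Aggregate local weights, weighting each site by its sample count."""
    total = sum(n for _, n in updates.values())
    return sum(w * n for w, n in updates.values()) / total

global_params = federated_average(local_params)
print(global_params)  # pulled toward site_A, which holds more samples
```

In a full system this averaging would repeat over many communication rounds, with the aggregated model redistributed to the sites between rounds.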
Limited metadata annotation in ACID prevents further exploration of the dataset itself for the identification of yield-related traits, because the genotypes and experimental conditions are not described. The Global Wheat Head dataset compiles multiple RGB wheat images collected in the field from several countries using different cameras (David et al., 2020). The dataset was used in a challenge hosted on Kaggle (https://www.kaggle.com) with the goal of benchmarking wheat head detection approaches. Top solutions used object detection deep learning architectures (EfficientDet, Faster-RCNN, and Yolo-v3), with data augmentation techniques playing a major role in their success. Data augmentation is a computer vision technique to increase dataset size through a series of transformations, such as flipping or rotating the image (Buslaev et al., 2020). It is important to note that field images exhibit greater variability of conditions, such as genotype differences, head orientation, and mixed developmental stages, which can cause an object detection model to show performance instability, such as mislabeling plant organs at a higher rate when conditions differ from those seen in the training data. The Global Wheat Head dataset provides a valuable resource for developing and benchmarking tools due to the high variability of wheat genotypes and conditions represented. Similar datasets for different species can be developed collaboratively by annotating previously released HTP data (such as the G2F and TERRA-REF datasets). This can decrease the cost of producing a dataset while benefiting from the already-described metadata.
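
A minimal version of the geometric augmentations mentioned above (flips and rotations) can be written directly on image arrays; libraries such as Albumentations (Buslaev et al., 2020) provide far richer, label-aware transformations, so this is only a sketch of the idea.

```python
import numpy as np

# Simple geometric augmentation of an image array: each source image
# yields several transformed copies, enlarging the training set.
def augment(image):
    """Yield the original plus flipped and rotated copies of an H x W image."""
    yield image
    yield np.fliplr(image)        # horizontal flip
    yield np.flipud(image)        # vertical flip
    for k in (1, 2, 3):
        yield np.rot90(image, k)  # 90/180/270 degree rotations

image = np.arange(16).reshape(4, 4)  # stand-in for an image patch
augmented = list(augment(image))
print(len(augmented))  # 6 variants from one source image
```

Note that for object detection tasks the bounding-box annotations must be transformed together with the pixels, which is the part that augmentation libraries handle and this sketch omits.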

Developing crops tolerant to abiotic stress

The development of climate-resilient crops must consider the effect of combined abiotic stresses occurring in the region (Cammarano et al., 2019). As a result, datasets featuring combined abiotic stresses provide a resource for understanding how their interaction impacts plant health and development. The Eschikon dataset (Supplemental Data Set 2) includes temporal images of beet (Beta vulgaris) under multiple independent and combined drought, nitrogen deficiency, and weed stresses (Khanna et al., 2019). This dataset was employed to develop 3D representations of the plants, from which the authors were able to extract canopy cover, height, and vegetation indices. These traits were used to classify stress in plants with 83%–93% accuracy, depending on the stress measured. The dataset can be further explored to measure agronomic traits related to each stress and to understand plant response; it can also be employed in further developing computer vision tools for stress classification (Khanna et al., 2019). Plant researchers aiming to predict the effects of climate change on crop species will require the creation (or release) of more datasets in which combined stresses are observed. These datasets must offer a detailed description of the environmental conditions and, if possible, of the genetic data to enable accurate interpretation of the results. Ideally, the aggregated datasets should depict the diversity of agroecological zones, including low-latitude locations, which are currently underrepresented. Crop water management is essential in regions currently facing or predicted to face water scarcity. Infrared thermography has been successfully implemented to assess water use by crops (Nhamo et al., 2020) and to measure genotype performance under salinity or water deficit stress (Raza et al., 2014; Kumar et al., 2017; Thapa et al., 2018; Hou et al., 2019; Zhang et al., 2019b; Kumar et al., 2020; Masina et al., 2020). 
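
Vegetation indices of the kind extracted from such imagery are computed directly from band reflectances. As one widely used example, the sketch below computes NDVI, the normalized difference of near-infrared and red reflectance, on synthetic values; stressed canopy typically scores lower than healthy canopy.

```python
import numpy as np

# NDVI = (NIR - Red) / (NIR + Red), computed pixel-wise. Reflectance
# values below are synthetic, chosen only to illustrate the contrast.
def ndvi(nir, red, eps=1e-9):
    """Normalized Difference Vegetation Index; eps avoids division by zero."""
    nir, red = np.asarray(nir, float), np.asarray(red, float)
    return (nir - red) / (nir + red + eps)

healthy = ndvi(nir=[0.60, 0.55], red=[0.08, 0.10])
stressed = ndvi(nir=[0.35, 0.30], red=[0.20, 0.22])
print(healthy.mean(), stressed.mean())  # healthy canopy scores higher
```

Per-plot averages of such an index over time are the kind of trait that can then be fed into a stress classifier.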
In cotton (Gossypium arboreum) monitored by infrared thermography, yield, fiber length, and micronaire were observed to decline after canopy temperature exceeded a given threshold (Conaty et al., 2015). Canopy temperature and evapotranspiration (ET) maps are used as a proxy for measuring the phenotypic response to both stresses, as they influence stomatal conductance (Fischer et al., 1998; Sirault et al., 2009), and can be observed from close range as well as from aerial and satellite platforms. Remotely sensed thermal data collected by satellite platforms allow mapping water resource use through the prediction of ET maps (Anderson et al., 2012). In 2018, a space station mission (ECOSTRESS) was launched to measure ET and identify plant stress (Fisher et al., 2020). It provides a higher spatial and temporal resolution (60 m at a 1–5 d interval) in comparison to Landsat (>60 m, 16-d interval) or MODIS (>375 m, daily; Anderson et al., 2012). The ECOSTRESS library provides satellite imagery associated with laboratory measurements of vegetation to help correlate the spectral patterns (Meerdink et al., 2019; Fisher et al., 2020). This dataset has been employed to assess plant species diversity in restoration areas, showing that sites with higher species diversity present lower temperatures (Hamberg et al., 2020). For this review, we chose to focus on HTP images collected by aerial or ground devices, since satellite images do not yet provide enough resolution for assessing plants at the field level. However, satellite HTP imagery offers the potential to help understand abiotic stress at a large scale (Anderson et al., 2012; Miralles et al., 2014); we have therefore included links to satellite libraries (ECOSTRESS, Landsat, and MODIS) among the resources in Supplemental Data Set 2. Besides infrared thermography, multispectral and hyperspectral sensors are also used in HTP. 
These sensors are capable of detecting physiological changes in leaf composition (Bruning et al., 2020). For example, decomposition of foliar hyperspectral signatures showed that C3 and C4 plants have divergent and well-defined patterns of reflectance (Baranoski et al., 2016). Hyperspectral images were employed to quantitatively rank salt tolerance between four wheat varieties using machine learning and dimensionality reduction (dataset described in Supplemental Data Set 2; Moghimi et al., 2018). The authors observed that multiple trait measurements would otherwise be required to correctly assess the plants, whereas with hyperspectral images they were able to score them in a fast, noninvasive way. Multispectral and hyperspectral images have also been employed to identify salt stress in sugarcane (Saccharum officinarum L.) and wheat irrigated with saline water (Hamzeh et al., 2013; El-Hendawy et al., 2019), acidic and heavy metal stress (Liu et al., 2011; Jin et al., 2013; Li et al., 2015; Zhang et al., 2017; Wang et al., 2018a), nutrient deficiency (Pacumbaba and Beyl, 2011), heat stress (Gautam et al., 2015; Trachsel et al., 2019), and frost (Fitzgerald et al., 2019; Nuttall et al., 2019; Murphy et al., 2020). Frost damage in wheat can have a major impact, as a single frost event can severely reduce quality and yield (Boer et al., 1993; Frederiks et al., 2012; Martino and Abbate, 2019). Rapid detection of frost damage would enable growers to take management decisions to avoid losses. A study using hyperspectral images indicated that, under controlled conditions, frosted and nonfrosted individuals present significant spectral differences (Murphy et al., 2020). Mixed results were observed when detecting frost under field conditions, indicating that more research is needed (Fitzgerald et al., 2019; Nuttall et al., 2019). 
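
Because hyperspectral pixels carry hundreds of highly redundant bands, a dimensionality-reduction step of the kind mentioned above is a standard preprocessing stage. The sketch below illustrates one common choice, PCA via singular value decomposition; the spectra are synthetic noise standing in for a real pixel-by-band matrix, and we make no claim that this is the exact method of any cited study.

```python
import numpy as np

# PCA on a synthetic hyperspectral matrix (200 pixels x 150 bands),
# projecting each pixel's spectrum onto the top 3 principal components.
rng = np.random.default_rng(1)
spectra = rng.normal(size=(200, 150))        # pixels x wavelength bands

centered = spectra - spectra.mean(axis=0)    # center each band
_, s, vt = np.linalg.svd(centered, full_matrices=False)
scores = centered @ vt[:3].T                 # 3-d representation per pixel

explained = (s[:3] ** 2).sum() / (s ** 2).sum()  # variance captured
print(scores.shape, round(float(explained), 3))
```

The low-dimensional scores, rather than the raw bands, are then what a classifier or ranking model would consume, reducing both noise and redundancy.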
The Frost nursery dataset provides multispectral images of several commercial wheat varieties grown in the field that were affected by frost at different developmental stages (AgReFed, 2019). This dataset includes final yield, leaf protein, and metabolite abundance, which can be used to characterize the effect of frost (Supplemental Data Set 2). The association of hyperspectral data with physiological measurements may assist frost damage quantification, which can support crop breeders in screening for more tolerant varieties. Hyperspectral imagery has the potential to capture traits related to the biochemical composition of target tissues. However, due to various technical factors, the recorded data are usually noisy, with nonnegligible redundancy (Mishra et al., 2019). Datasets including hyperspectral data may benefit from a detailed description of the experimental conditions and sensors used, helping guide researchers in extracting the information. Platforms such as Quantitative Plant (Lobet et al., 2013), Phenopsis (Granier et al., 2006), and BrAPI (Selby et al., 2019) are dedicated to assembling a wide range of phenotyping datasets that can be used to compare phenotypic responses to stress within and between species. These platforms are focused on making phenotypic datasets more findable. Information and website links for other abiotic stress datasets are given in Supplemental Data Set 2.

Pathogen and pest detection in the field

Changes in environmental conditions are likely to shift pathogen and pest regional distributions (Hovmøller et al., 2008; Shaw and Osborne, 2011; Bebber et al., 2013; Garrett, 2013; Mariette et al., 2016; Skelsey et al., 2016). To provide suitable crop varieties and agricultural management recommendations for these new conditions, it is necessary to gain a greater understanding of the ecological, phenotypic, and molecular basis of the interaction between plants and pathogens (Skelsey et al., 2016). Pathogen identification and disease severity estimation are an important part of characterizing pathogen distribution in the field (Ali and Hodson, 2017). Detection and quantification of disease are usually performed by visually assessing crop symptoms, which may be subject to bias and human error, besides being labor and time intensive. Various datasets have been released to assist the development of automated systems for disease identification and assessment (Supplemental Data Set 3). The PlantVillage, BRACOL, RoCoLe, citrus (Citrus sp.), cassava (Manihot esculenta), and apple (Malus sp.) datasets offer close-range annotated images of infected plant organs against a clean background, providing a resource for disease diagnosis and severity scoring in collected leaves (Mohanty, 2016; Arsenovic et al., 2019; Chouhan et al., 2019; Krohling, 2019; Parraga-Alava et al., 2019; Rauf et al., 2019; Tian et al., 2019; Nakatumba-Nabende et al., 2020; Singh et al., 2020). Machine learning models using support vector machines, CNNs, and self-attention CNNs trained on similar datasets were published recently (Abdu et al., 2020; El Abidine et al., 2020; Zeng and Li, 2020), some of which report increased efficiency when using segmented regions for pathogen identification (Esgario et al., 2020; Karlekar and Seal, 2020). A comprehensive review of machine learning for disease assessment in crops was published by Hasan et al. (2020). 
Although disease detection models trained with the above datasets can be used in the field, the input samples have to be manually collected and imaged, which can be time consuming. Hence, many researchers have focused on developing models that use UAV-collected images to accelerate disease detection (Vergara-Diaz et al., 2015; Sugiura et al., 2016; Moriya et al., 2019; Qiu et al., 2019; Tetila et al., 2020; Zhao et al., 2020). Marzougui et al. (2019) combined HTP images from greenhouse and field experiments to quantify Aphanomyces root rot resistance in lentils (Lens culinaris). The authors developed 12 normalized spectral indices that correlate with disease symptoms and severity, allowing breeders to objectively quantify genotype resistance. Another study used hyperspectral data and machine learning for early identification of charcoal rot disease in soybean, obtaining a classification accuracy of 90% for plants 3 d after infection (Nagasubramanian et al., 2018). These studies demonstrate the potential of image-based HTP to enable growers and breeders to automatically screen plants. To the best of our knowledge, there are no available datasets for disease-related tasks in the field, which prevents the development and benchmarking of computer vision-based tools. The requirements for benchmark datasets created for this task are shown in Box 1, with specific image annotations depending on the target task (disease detection, identification, severity scoring, and lesion segmentation). Field HTP is widely applied to the detection and quantification of pests. Rapid pest identification is important so growers can take action to control pest spread and limit damage to crops. A large benchmark dataset for insect pest detection was released containing 75,000 close-range images of annotated pests belonging to 102 categories (see Supplemental Data Set 3 for a detailed description; Wu et al., 2019). 
This benchmark dataset is a valuable resource for the development of crop monitoring and management approaches, allowing researchers to test model performance over a wide range of pests. This dataset can also be complemented with the mango (Mangifera indica) pest classification dataset, which has images of mango plants infected with 15 different categories of pests, with a large volume of augmented images to increase model robustness (Kusrini et al., 2020a). Precise pest detection algorithms can support the assessment of crop resistance by counting pests, identifying pest species in the field, and monitoring pest spread. Employing HTP datasets to measure plant–insect interactions can allow the use of RGB sensors to quantify leaf damage and defoliation (O'Neal et al., 2002). Thermal infrared and hyperspectral images can also be used to capture physiological changes, such as stomatal regulation (Backoulou et al., 2011; Nabity et al., 2013). Novel datasets targeting plant–insect interactions should follow the guidelines proposed in Box 1, with special attention to providing detailed metadata (see the MIAPPE project) and ground-truth measurements and labeling. The development of navigation maps is particularly important for weed management systems, in which the map can be used for targeted herbicide application or by weed-killing robots (Somerville et al., 2019; Gašparović et al., 2020; Hunter et al., 2020; Raja et al., 2020). Weed detection systems can reduce herbicide application by up to 60% in comparison to broadcast applications (Somerville et al., 2019; Hunter et al., 2020) and increase efficiency in organic production systems. A key challenge for implementing weed detection in the field using image-based HTP data is the difficulty in establishing robust computer vision-based models that can distinguish between crop and weed species under varying field conditions.
To help overcome this challenge, many datasets have been released consisting of RGB and multispectral images of a wide variety of weed and crop species, some of which contain pixel-level annotations to separate weeds from the background (Supplemental Data Set 3; Haug and Ostermann, 2015; Dos Santos Ferreira, 2017; Giselsson et al., 2017; Sa et al., 2018; Teimouri et al., 2018; Skovsen et al., 2019; Sudars et al., 2020). A few datasets feature images of weed seedlings, enabling the development of models that can detect weed infestation at an early stage. Studies using similar datasets employed computer vision and machine learning algorithms for weed detection, though these presented highly variable precision rates (69%–98%) depending on the crop field analyzed (Wang et al., 2007; dos Santos Ferreira et al., 2017; Pallottino et al., 2018; Umamaheswari et al., 2018; Bah et al., 2019; Partel et al., 2019). These results emphasize the need to produce more datasets with an increased variety of crop and weed species at different growth stages. Furthermore, the datasets need to reflect the management practices (e.g. sowing density) that the weed detection model would encounter in the field. Increasing model robustness to varied field conditions is essential to enable its adoption in agricultural management systems and allow plant researchers to quantify the efficiency of herbicide application or other weed control practices.
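A classic baseline for separating green vegetation (crop or weed) from soil, which deep models are now replacing, is the excess-green index ExG = 2g − r − b computed on channel-normalized RGB. A minimal sketch follows; the threshold is illustrative and must be tuned per field.

```python
import numpy as np

def excess_green(rgb: np.ndarray) -> np.ndarray:
    """ExG = 2g - r - b on per-pixel channel-normalized RGB (H, W, 3)."""
    total = rgb.sum(axis=-1, keepdims=True).astype(float)
    total[total == 0] = 1.0  # avoid division by zero on black pixels
    r, g, b = np.moveaxis(rgb / total, -1, 0)
    return 2 * g - r - b

def vegetation_mask(rgb: np.ndarray, threshold: float = 0.1) -> np.ndarray:
    """Binary vegetation mask; the threshold is an assumption, tune per field."""
    return excess_green(rgb) > threshold
```

Distinguishing crop from weed within the vegetation mask is the harder problem the datasets above target; a color index like this only separates plant matter from background.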

Root phenotyping

Root system architecture (RSA) greatly influences nutrient access, efficient water uptake, and plant tolerance to stress (Mary et al., 2018; York et al., 2018; Mattupalli et al., 2019; Busener et al., 2020; Griffiths et al., 2020; McKay Fletcher et al., 2020; Seo et al., 2020). Increased efforts in breeding for desirable RSA traits could drive a breakthrough in crop productivity and resource efficiency (Lynch, 2007). To leverage the potential of RSA in crop breeding, it is important that we improve current root phenotyping strategies. Noninvasive RSA imaging is extremely challenging due to soil opacity, while soil replacements such as transparent gels or hydroponic media often lead to phenotypes that diverge substantially from those observed in regular soil (Hargreaves et al., 2009; Wojciechowski et al., 2009; Clark et al., 2011; Ma et al., 2019). A wide variety of sensors can be employed to acquire 2D or 3D images of plant roots grown in the glasshouse, such as X-ray computed tomography, magnetic resonance imaging, positron emission tomography, and hyperspectral imaging (Jahnke et al., 2009; Garbout et al., 2012; Mooney et al., 2012; van Dusschoten et al., 2016; Bodner et al., 2018). In Supplemental Data Set 4, we list available RSA datasets, with metadata at varied levels of detail, from plants grown in multiple types of media, such as gellan gum, soil, and hydroponics. In addition, a synthetic root system dataset is available. This large dataset was produced for tool calibration and modeling; it provides ground-truth fibrous and tap-root images, which help identify artifacts generated by a model when dealing with complex, overlapping root structures. The data were produced using ArchiSimple with three levels of noise, and the roots present varying degrees of complexity (Lobet et al., 2017).
Field root phenotyping frequently requires the manual excavation of individual plants followed by imaging of the washed root crown for quantitative trait analysis (Trachsel et al., 2011; Bucksch et al., 2014; Colombi et al., 2015). Root crown datasets of multiple crop species are described in Supplemental Data Set 4, some of which were produced with the aim of automatically quantifying RSA traits using different tools. Noninvasive alternative approaches are not as commonly employed, but offer the potential to undertake time-series analyses of crop development. These include electrical resistance tomography, electromagnetic inductance, and ground penetrating radar (Diaz and Herrero, 1992; Zenone et al., 2008; Srayeddin and Doussan, 2009), which have been used to characterize root water uptake of wheat and vine plants in the field (Shanahan et al., 2015; Whalley et al., 2017; Mary et al., 2018). Overall, image-based RSA phenotyping has many applications, such as linking RSA traits to micronutrient concentration and heritability (Busener et al., 2020; McKay Fletcher et al., 2020), assessing the effect of dwarfing genes on seedling roots (Wojciechowski et al., 2009), characterizing changes in the root crown in response to disease (Corona-Lopez et al., 2019; Mattupalli et al., 2019), investigating root plasticity (Rosas et al., 2013), identifying genetically driven root architecture differences (Jiang et al., 2019), and QTL mapping of regions controlling RSA (Topp et al., 2013). Most of the studies cited above use a combination of tools for RSA trait extraction, such as DIRT (Das et al., 2015), RhizoVision (Seethepalli and York, 2019), RSA-GiA (Galkovskyi et al., 2012; Topp et al., 2013), or Rootscape (Ristova et al., 2013), followed by statistical analysis (variations of ANOVA, three-parameter logistic functions, PCA) or linear regression to test whether the observed traits relate to environmental or genetic data. The wide range of approaches used reflects the diversity of input data formats.
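The statistical step in these studies (dimensionality reduction of extracted RSA traits followed by regression against an environmental or genetic covariate) can be sketched with plain NumPy; the data and analysis choices here are synthetic placeholders, not any cited study's pipeline.

```python
import numpy as np

def pca(traits: np.ndarray, n_components: int = 2) -> np.ndarray:
    """Project a (samples x traits) matrix onto its top principal components."""
    centered = traits - traits.mean(axis=0)
    # SVD of the centered matrix gives the PCs without forming the covariance matrix
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

def fit_linear(x: np.ndarray, y: np.ndarray) -> np.ndarray:
    """Least-squares slope and intercept of y ~ x (e.g. trait PC vs covariate)."""
    design = np.column_stack([x, np.ones_like(x)])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef  # (slope, intercept)
```

A significant slope between a trait principal component and, say, water availability would then be followed up with the formal tests (ANOVA, mixed models) the studies above employ.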
The sensors employed to collect RSA traits are very diverse and capture different aspects of the root. Hence, which feature extraction tool and analysis method to implement must be decided case by case. Tool and data interoperability is all the more important here, as it allows researchers to explore these resources efficiently. Root image datasets from several major crop species can be downloaded from the Quantitative Plant platform (quantitative-plant.org/dataset) and the Zenodo database (zenodo.org/). The reconstruction of the data as 2D or 3D representations of the root system, and the segmentation of roots from the medium, usually assume a high contrast between root and background, which is not always the case (Atkinson et al., 2019). Machine and deep learning-based tools have been developed for root segmentation in 2D or 3D (Iyer-Pascuzzi et al., 2010; Bucksch et al., 2014; Falk et al., 2020; Yasrab et al., 2020a), including for very thin (1–3 pixels) roots grown in visible medium (RootNet; Yasrab et al., 2020b) and in soil (Soltaninejad et al., 2020), while other tools aim at RSA trait quantification (Atkinson et al., 2017a; Falk et al., 2020). Although there are many potential approaches to root segmentation, most are not suited to newer image data types. In addition, few tools are capable of linking observed RSA to genotypic information. Recently, deep learning models have been employed to attempt to bridge phenotype-to-genotype predictions (Pound et al., 2017a; Yasrab et al., 2020a) and can achieve results comparable with user-supervised methods for QTL identification (Pound et al., 2017a). However, effectively integrating high-throughput phenotype-to-genotype tools into the breeding process requires refined tools, capable of dealing with phenotype and sensor variability and of aggregating experimental metadata into the analysis.
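When the contrast assumption does hold, a global threshold such as Otsu's method is often enough to separate roots from the medium; a self-contained sketch follows. The machine and deep learning tools cited above exist precisely to handle the low-contrast cases this cannot.

```python
import numpy as np

def otsu_threshold(gray: np.ndarray) -> int:
    """Otsu's method: pick the threshold maximizing between-class variance."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(float)
    prob = hist / hist.sum()
    best_t, best_var = 0, -1.0
    for t in range(1, 256):
        w0, w1 = prob[:t].sum(), prob[t:].sum()
        if w0 == 0 or w1 == 0:
            continue  # all pixels on one side, variance undefined
        mu0 = (np.arange(t) * prob[:t]).sum() / w0
        mu1 = (np.arange(t, 256) * prob[t:]).sum() / w1
        var = w0 * w1 * (mu0 - mu1) ** 2
        if var > best_var:
            best_t, best_var = t, var
    return best_t

def segment_roots(gray: np.ndarray) -> np.ndarray:
    """Bright roots on a dark medium -> foreground mask (flip for the inverse case)."""
    return gray >= otsu_threshold(gray)
```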
The success in the development of such tools relies on the quality and size of the available datasets because these are the sole source of information for the deep learning model to adjust its internal parameters.

Quantitative plant morphology

The description of plant morphological traits, for example, number of leaves, canopy cover, number of flowers, and seeds, provides a foundation for characterizing plant phenotypic response, which is directly related to plant developmental stage, yield potential, and overall health (Kouressy et al., 2008). The quantification of agronomic traits often relies on manual measurements, which are costly, labor-intensive, and prone to errors. Several approaches, including neural networks and other machine learning models, have been published to perform leaf counting, area estimation, folding and plant growth stage classification, stem–leaf segmentation, and seed counting (Parmar et al., 2016; Pereira et al., 2016; Sodhi et al., 2017; Teimouri et al., 2018; Uzal et al., 2018; Jin et al., 2019; Rascio et al., 2020). Deep learning models are widely applied to image analysis due to the high complexity of the data; their potential for quantitative morphology lies partially in their capacity to segment the target object from nontarget objects in the image. Hence, it is possible to measure the traits of the segmented object (number of seeds, color, fruit shape, fruit or seed size). This measurement ability was shown in a study on fish morphology quantification that used Mask R-CNN for pixel-wise segmentation of the fish body followed by measurement of its morphological features (Yu et al., 2020). A variety of trait phenotyping datasets have been released to develop pipelines for trait measurement, such as the hypocotyl dataset with images of A. thaliana seedlings for length determination (Dobos et al., 2019), image time-series of A. thaliana growth that can be used to predict performance (Taghavi Namin et al., 2018), and species identification datasets (Kumar et al., 2012; Lee et al., 2015, 2017; Fricker et al., 2019; Zheng et al., 2019), as shown in Supplemental Data Set 5.
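Once objects have been segmented from the background, counting them (leaves, seeds, fruits) reduces to labeling connected components in the binary mask. A minimal flood-fill sketch follows; real pipelines must first separate touching objects, which this does not attempt.

```python
import numpy as np
from collections import deque

def count_components(mask: np.ndarray) -> int:
    """Count 4-connected foreground blobs in a boolean (H, W) mask."""
    seen = np.zeros_like(mask, dtype=bool)
    h, w = mask.shape
    count = 0
    for i in range(h):
        for j in range(w):
            if mask[i, j] and not seen[i, j]:
                count += 1
                queue = deque([(i, j)])  # flood-fill this blob
                seen[i, j] = True
                while queue:
                    y, x = queue.popleft()
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < h and 0 <= nx < w and mask[ny, nx] and not seen[ny, nx]:
                            seen[ny, nx] = True
                            queue.append((ny, nx))
    return count
```

Per-blob measurements (area, bounding box, color statistics) follow the same traversal, which is why segmentation quality dominates the accuracy of downstream trait quantification.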
PlantCV and Deep Plant Phenomics are two platforms that offer packaged pretrained deep learning models to run as applications for phenotyping (Fahlgren et al., 2015; Ubbens and Stavness, 2017). However, tools for quantitative morphology analysis can only guarantee performance under restricted imaging conditions and may require further image processing steps. Producing and sharing annotated datasets from a diverse set of species is the most efficient way to ensure that new tools can be developed to exploit them. The Plant Phenotyping Datasets (Supplemental Data Set 5) are a collection of annotated top-view images of A. thaliana and tobacco (Nicotiana tabacum) undergoing different treatments (Minervini et al., 2016). This benchmark dataset (Box 1) was employed in the leaf segmentation and leaf counting challenges at the Computer Vision Problems in Plant Phenotyping conference, and propelled the development of tools for leaf segmentation and counting (Aich and Stavness, 2017; Dobrescu et al., 2017; Giuffrida et al., 2018; Praveen Kumar and Domnic, 2020), which can later be used for assessing plant growth and biomass. Other datasets focus on seed and fruit organs. Some are useful for comparing variance in seed morphological traits (Ducournau et al., 2020), while others can be used for the development of computer vision tools for fruit counting and automatic quality assessment. In this category, there is a soybean image dataset to assess seed damage from mechanical and biological sources (Pereira et al., 2019), a dataset for the identification of Indian basmati rice (Oryza sativa) seed varieties (Sharma et al., 2020), sugar beet (Beta vulgaris) seed traits (Ducournau et al., 2020), a cocoa bean (Theobroma cacao) dataset for quality assessment (Santos et al., 2019), a banana (Musa sp.)
tier abnormality classification dataset (Piedad, 2019), and hyperspectral images of different loose tea varieties (Camellia sinensis; Mishra, 2018; Supplemental Data Set 5). Leaf inclination and distribution on the plant is an important morphological trait: it impacts the plant's spectral reflectance and is a mechanism to increase tolerance to abiotic stress, affecting leaf temperature, water loss, and drought tolerance (Ehleringer and Comstock, 1987; Fuchs, 1990; He et al., 1996; Werner et al., 1999). In common bean (Phaseolus vulgaris L.), the extent of leaf movement increases as water availability drops, allowing plants to maintain leaf temperature despite stomatal closure (Pastenes et al., 2005). A dataset for leaf angle estimation with ground-truth angles for 71 Eucalyptus species (Pisek and Adamson, 2020) is described in Supplemental Data Set 5; it contains images of Eucalyptus canopies that can be used to estimate leaf angle distribution in trees. Automated pipelines for leaf angle extraction have been developed and tested for A. thaliana, beet, apple (Malus domestica), maize, and sorghum (Müller-Linow et al., 2015; Kenchanmane Raju et al., 2020), allowing researchers to track leaf angle variability and distribution over time. Identifying varieties with a desired leaf angle distribution can help breeders select the varieties best adapted to specific environmental conditions, such as high planting densities, where a narrow angle prevents a leaf from being shadowed by others (Pepper et al., 1977; Lambert and Johnson, 1978). A multitask pipeline capable of phenotyping a comprehensive array of traits in different tissues would produce a snapshot that can be used to identify new QTLs. Genetic loci may affect traits in multiple tissues, causing variance across several traits (Li et al., 2018). Such a pipeline would provide a resource to detect QTLs and improve our understanding of the genetic basis of complex phenotypes (Topp et al., 2013).
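Given per-leaf inclination angles (0° horizontal, 90° vertical) extracted by such a pipeline, summarizing the distribution is straightforward; the canopy-type cutoffs below are illustrative assumptions, not values from the cited studies.

```python
import numpy as np

def summarize_leaf_angles(angles_deg: np.ndarray) -> tuple:
    """Mean leaf inclination plus a coarse canopy-type label (hypothetical cutoffs)."""
    mean = float(np.mean(angles_deg))
    if mean < 30:
        label = "planophile"   # mostly horizontal leaves
    elif mean > 60:
        label = "erectophile"  # mostly vertical leaves
    else:
        label = "plagiophile"  # intermediate
    return mean, label
```

Comparing such summaries across varieties and time points is how the tracking of leaf angle variability described above is typically operationalized.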
Trait phenotyping can also be used for the construction of 3D representations of the plant structure (Topp et al., 2013; Vadez et al., 2015; van Dusschoten et al., 2016; McCormick et al., 2016; Bengochea-Guevara et al., 2017; Sodhi et al., 2017; Vázquez-Arellano et al., 2018; Wang et al., 2018b). This avoids loss of information caused by 2D compression and prevents the generation of artifacts that can occur due to lighting, occlusion, and overlaps.

Concluding remarks

HTP platforms and tools are revolutionizing the way we capture plant phenotypic variation, allowing the quantification of agronomic traits and the identification of genetic traits with potential for crop breeding. Publishing the collected phenotypic datasets and associated information would help drive the development of high-performance crops, allowing growers to monitor their crops more effectively and giving breeders the opportunity to explore research from a new perspective with updated tools. The research community must adhere to standardized practices for dataset release, such as those proposed by MIAPPE, in order for the datasets to be explored and interpreted (see "Outstanding questions"). Because multiple types of data comprise an HTP dataset, it is important that terms are clearly defined so researchers from different fields (computer science, remote sensing, and plant biology) can collaborate. In cases where data sharing is unfeasible due to privacy or security concerns, federated learning offers an opportunity to train machine learning algorithms collaboratively without exchanging data. A variety of mathematical and machine learning methods have recently been applied to address the bottleneck of phenotypic quantitative analysis. However, without established benchmark datasets, it is difficult to compare the performance of these approaches, imposing a barrier to improvement and to our understanding of each technique's limitations. It is also important that novel tools are intuitive and well documented, allowing domain experts with minimal programming background to benefit (Klukas et al., 2014; Ubbens and Stavness, 2017). Plant phenotyping is a rapidly evolving field with a growing community; it is important that we use this growth to establish structures, such as public repositories and benchmarks, to support the field so that it may achieve its potential to accelerate crop breeding.
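In its simplest form (federated averaging), federated learning has each site train a model on its own images and share back only the parameters, which are averaged weighted by local sample counts, so raw data never leave the site. A NumPy sketch of the aggregation step:

```python
import numpy as np

def federated_average(local_weights, sample_counts) -> np.ndarray:
    """FedAvg aggregation: sample-count-weighted mean of per-site parameter vectors."""
    counts = np.asarray(sample_counts, dtype=float)
    stacked = np.stack(local_weights)  # shape: (sites, n_params)
    return (stacked * counts[:, None]).sum(axis=0) / counts.sum()
```

In each round, sites would resume local training from the averaged parameters and submit their updates again; only this aggregation ever crosses institutional boundaries.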
Outstanding questions

What is the best approach to address the high variability in HTP data collection and processing methodology? Should we define standard methodologies for these tasks or develop tools to detect variance?

How can we collate sufficient benchmark datasets to evaluate tool performance? Are the current benchmarks capable of exposing the limitations of the tools?

How should authors be encouraged to release datasets with their publications, similar to what is required when publishing the results of genomic dataset analyses? What structures are needed to support the release and maintenance of these datasets?

How can we increase data interoperability to integrate datasets from multiple sources (genomic, environmental data)? What is the minimum metadata needed to ensure this?

Supplemental data

The following materials are available in the online version of this article.

Available image-based HTP datasets for crop yield prediction.

Available image-based HTP datasets for abiotic stress phenotyping.

Available image-based HTP datasets for disease and pest detection.

Root phenotyping datasets.

Other miscellaneous databases that may be useful for applications not discussed in this review.

Funding

This work was supported by the Australian Government through the Australian Research Council (Projects DP200100762, DP1601004497, and LP140100537) and the Grains Research and Development Corporation (Projects 9177539 and 9177591). Benjamin J. Nestor is supported by a university postgraduate award at The University of Western Australia. Monica F. Danilevicz and Philipp E. Bayer received support from the Forrest Research Foundation. Monica F. Danilevicz and Benjamin J. Nestor are further supported by Research Training Program scholarships.

Conflict of interest statement. The authors declare no competing interests.
References

Lobet G (2017) Image Analysis in Plant Sciences: Publish Then Perish. Trends Plant Sci.

Huang L, Shea AL, Qian H, Masurkar A, Deng H, Liu D (2019) Patient clustering improves efficiency of federated machine learning to predict mortality and hospital stay time using distributed electronic medical records. J Biomed Inform.

Jiang N, Floro E, Bray AL, Laws B, Duncan KE, Topp CN (2019) Three-Dimensional Time-Lapse Analysis Reveals Multiscale Relationships in Maize Root Systems with Contrasting Architectures. Plant Cell.

Dobos O, Horvath P, Nagy F, Danka T, Viczián A (2019) A Deep Learning-Based Approach for High-Throughput Hypocotyl Phenotyping. Plant Physiol.

Yu K, Kirchgessner N, Grieder C, Walter A, Hund A (2017) An image analysis pipeline for automated classification of imaging light conditions and for quantification of wheat canopy cover time series in field phenotyping. Plant Methods.

Zhang J, Liu X, Liang Y, Cao Q, Tian Y, Zhu Y, Cao W, Liu X (2019) Using a Portable Active Sensor to Monitor Growth Parameters and Predict Grain Yield of Winter Wheat. Sensors (Basel).

McFarland BA, AlKhalifah N, Bohn M, et al. (2020) Maize genomes to fields (G2F): 2014-2017 field seasons: genotype, phenotype, climatic, soil, and inbred ear image datasets. BMC Res Notes.

Das A, Schneider H, Burridge J, Martinez Ascanio AK, Wojciechowski T, Topp CN, Lynch JP, Weitz JS, Bucksch A (2015) Digital imaging of root traits (DIRT): a high-throughput computing and collaboration platform for field-based root phenomics. Plant Methods.

Raza SA, Smith HK, Clarkson GJJ, Taylor G, Thompson AJ, Clarkson J, Rajpoot NM (2014) Automatic detection of regions in spinach canopies responding to soil moisture deficit using combined visible and thermal imagery. PLoS One.

Seo DH, Seomun S, Choi YD, Jang G (2020) Root Development and Stress Tolerance in rice: The Key to Improving Stress Tolerance without Yield Penalties. Int J Mol Sci.
