| Literature DB >> 29020747 |
Michael P Pound1, Jonathan A Atkinson2, Alexandra J Townsend2, Michael H Wilson3, Marcus Griffiths2, Aaron S Jackson1, Adrian Bulat1, Georgios Tzimiropoulos1, Darren M Wells2, Erik H Murchie2, Tony P Pridmore1, Andrew P French1,2.
Abstract
In plant phenotyping, it has become important to be able to measure many features on large image sets in order to aid genetic discovery. These datasets, now often captured robotically, are frequently too large for manual inspection, motivating a fully automated approach. Deep learning is an emerging field that promises unparalleled results on many data analysis problems. Building on artificial neural networks, deep approaches have many more hidden layers in the network, and hence have greater discriminative and predictive power. We demonstrate the use of such approaches as part of a plant phenotyping pipeline. We show the success offered by such techniques when applied to the challenging problem of image-based plant phenotyping and demonstrate state-of-the-art results (>97% accuracy) for root and shoot feature identification and localization. We use fully automated trait identification, based on deep learning detection of plant features, to identify quantitative trait loci (QTL) in root architecture datasets. The majority (12 out of 14) of manually identified QTL were also discovered using our automated approach. We have shown deep learning-based phenotyping to have very good detection and localization accuracy on validation and testing image sets. We have shown that such features can be used to derive meaningful biological traits, which in turn can be used in QTL discovery pipelines. This process can be completely automated. We predict a paradigm shift in image-based phenotyping brought about by such deep learning approaches, given sufficient training sets.
Keywords: Phenotyping; QTL; deep learning; image analysis; root; shoot
Year: 2017 PMID: 29020747 PMCID: PMC5632296 DOI: 10.1093/gigascience/gix083
Source DB: PubMed Journal: Gigascience ISSN: 2047-217X Impact factor: 6.524
Figure 1: A simplified example of a CNN architecture operating on a fixed-size image of part of an ear of wheat. The network performs alternating convolution and pooling operations (see the online methods for details). Each convolutional layer automatically extracts useful features, such as edges or corners, outputting a number of feature maps. Pooling operations shrink the size of the feature maps to improve efficiency. The number of feature maps is increased deeper into the network to improve classification accuracy. Finally, standard neural network layers comprise the classification layers, which output probabilities for each class.
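A minimal sketch of this alternating convolution/pooling pattern, written in PyTorch for illustration only; the patch size, channel counts, and two-class output are assumptions, not the authors' published network:

```python
# Illustrative sketch of the pattern in Figure 1, not the authors' exact network.
import torch
import torch.nn as nn

class TinyCNN(nn.Module):
    def __init__(self, num_classes=2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1),   # extracts low-level features (edges, corners)
            nn.ReLU(),
            nn.MaxPool2d(2),                               # pooling shrinks the feature maps
            nn.Conv2d(16, 32, kernel_size=3, padding=1),   # more feature maps deeper in the network
            nn.ReLU(),
            nn.MaxPool2d(2),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(32 * 8 * 8, 64),                     # standard fully connected layers
            nn.ReLU(),
            nn.Linear(64, num_classes),
        )

    def forward(self, x):
        # x: a batch of fixed-size 32x32 RGB patches (assumed size for illustration)
        logits = self.classifier(self.features(x))
        return torch.softmax(logits, dim=1)                # per-class probabilities

probs = TinyCNN()(torch.randn(1, 3, 32, 32))
```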
Figure 2: Example training and validation images from our root tip and shoot feature datasets. Positive samples were taken at locations annotated by a user. Negative samples were generated on the root system and at random for the root images, and on computed feature points on the shoot images.
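As a rough illustration of how such training samples might be cropped, a minimal sketch assuming fixed-size square patches around user-supplied (x, y) annotations; the 32-pixel patch size and random-negative strategy are illustrative assumptions, not the authors' exact procedure:

```python
import numpy as np

def extract_patch(image, x, y, size=32):
    """Crop a fixed-size patch centred on (x, y); image is an HxWx3 array.
    The 32 px patch size is an assumption for illustration."""
    half = size // 2
    h, w = image.shape[:2]
    # Clamp so the window stays inside the image
    x0 = int(np.clip(x - half, 0, w - size))
    y0 = int(np.clip(y - half, 0, h - size))
    return image[y0:y0 + size, x0:x0 + size]

# Positive samples come from user-annotated feature locations;
# negatives here are drawn at random image positions (hypothetical example).
rng = np.random.default_rng(0)
image = rng.random((480, 640, 3))
annotations = [(120, 200), (300, 410)]   # hypothetical (x, y) root-tip clicks
positives = [extract_patch(image, x, y) for x, y in annotations]
negatives = [extract_patch(image, rng.integers(0, 640), rng.integers(0, 480))
             for _ in range(len(positives))]
```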
Classification results for both root and shoot datasets
| Feature | Correctly classified | Misclassified | Accuracy (%) |
|---|---|---|---|
| Root tip | 2904 | 73 | 97.5 |
| Root tip negative | 5687 | 65 | 98.9 |

| Feature | Correctly classified | Misclassified | Accuracy (%) |
|---|---|---|---|
| Leaf tip | 2225 | 113 | 95.2 |
| Leaf base | 2299 | 52 | 97.8 |
| Ear tip | 686 | 15 | 97.9 |
| Ear base | 765 | 23 | 97.1 |
| Shoot negative | 6110 | 136 | 97.8 |
Leaf tips represent the hardest classification problem in the datasets, with large variations in orientation, size, shape, and colour. In all cases, accuracy remains above 95%, and the average accuracy of both networks exceeds 97%. The root tip network performs marginally better overall, perhaps to be expected given the simpler nature of the image data. Complete confusion matrices can be found in Additional file 3.
Figure 3: Localization examples. Images showing the response of our classifier using a sliding window over each input image. (a) Three examples of wheat root tip localization. Regions of high response from the classifier are shown in yellow. (b) Two examples of wheat shoot feature localization. Regions of high response from the classifier for leaf tips are highlighted in orange, leaf bases in yellow, ear tips in blue, and ear bases in pink. A portion of the second image has been zoomed and shown with and without features highlighted. More images can be seen in Additional file 1.
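The sliding-window procedure described here can be sketched as follows, assuming a trained patch classifier such as the TinyCNN example above; the window size and stride are illustrative choices, not the published settings:

```python
import torch

def sliding_window_response(model, image, window=32, stride=8):
    """Score every window position with a patch classifier and return a
    response map of the positive-class probability. `model` is assumed to be
    a trained classifier such as the TinyCNN sketch above; window and stride
    are illustrative values."""
    model.eval()
    _, h, w = image.shape                      # image: 3 x H x W tensor
    rows = (h - window) // stride + 1
    cols = (w - window) // stride + 1
    response = torch.zeros(rows, cols)
    with torch.no_grad():
        for i in range(rows):
            for j in range(cols):
                patch = image[:, i*stride:i*stride + window,
                                 j*stride:j*stride + window]
                probs = model(patch.unsqueeze(0))
                response[i, j] = probs[0, 1]   # probability of the feature class
    return response
```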
Testing results for our image scanning approach over 20 unseen root images and 20 unseen shoot images
| | Feature | False positive (%) | False negative (%) | Feature accuracy (%) |
|---|---|---|---|---|
| Roots | Root tip | 0.03 | 0.12 | 99.85 |
| Shoots | Leaf tip | 0.24 | 0.12 | 99.64 |
| | Leaf base | 0.22 | 0.10 | 99.68 |
| | Ear tip | 0.08 | 0.02 | 99.91 |
| | Ear base | 0.11 | 0.05 | 99.85 |
Feature accuracy is the number of true-positive and true-negative pixels, divided by the total number of pixels over the 20 images. Actual testing images and results can be seen in Additional file 2.
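A minimal sketch of this pixel-wise metric, assuming predicted and ground-truth feature locations are available as binary masks of equal size:

```python
import numpy as np

def pixel_metrics(pred, truth):
    """Pixel-wise false-positive rate, false-negative rate, and feature accuracy,
    computed as in the table note: (TP + TN) / total pixels. `pred` and `truth`
    are assumed to be boolean masks of the same shape."""
    tp = np.logical_and(pred, truth).sum()
    tn = np.logical_and(~pred, ~truth).sum()
    fp = np.logical_and(pred, ~truth).sum()
    fn = np.logical_and(~pred, truth).sum()
    total = pred.size
    return 100 * fp / total, 100 * fn / total, 100 * (tp + tn) / total
```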
List of root traits derived from the tip-detection CNN output and how they were computed
| Name | Description |
|---|---|
| Tip count | The number of connected components found in the CNN output |
| Hull area | The area of the convex hull derived from the centroids of all tips |
| Width/depth | The width and depth of the bounding box surrounding all tips |
| Width:depth ratio | Calculated as width divided by depth |
| Mean X/Y | The mean X and Y positions of all tips |
| Standard deviation X/Y | The standard deviation of the X and Y positions of all tips |
| Top 100/200/300px count | A count of the number of tips located in the top 100-, 200-, and 300-pixel strips below the seed position calculated above |
| Total length | An estimate for the length of the root system, calculated as the sum of the distances from each tip to the seed position |
| Centre of mass X/Y | The mean X, Y position of all tips |
Trait names reflect the quantities they estimate or represent.
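To illustrate how several of these traits could be derived from detected tip positions, a minimal sketch using NumPy and SciPy; the coordinate convention (y increasing downwards), the seed position input, and the helper name root_traits are assumptions for illustration, not the authors' implementation:

```python
import numpy as np
from scipy.spatial import ConvexHull

def root_traits(tips, seed):
    """Derive a few of the listed traits from tip centroids.
    `tips` is an (N, 2) array of (x, y) tip positions and `seed` the (x, y)
    seed position, both in pixels; a sketch, not the authors' exact code."""
    tips = np.asarray(tips, dtype=float)
    seed = np.asarray(seed, dtype=float)
    traits = {
        "tip_count": len(tips),
        # For a 2-D point set, ConvexHull.volume is the enclosed area
        "hull_area": ConvexHull(tips).volume if len(tips) >= 3 else 0.0,
        "width": np.ptp(tips[:, 0]),
        "depth": np.ptp(tips[:, 1]),
        "mean_x": tips[:, 0].mean(),
        "mean_y": tips[:, 1].mean(),
        "std_x": tips[:, 0].std(),
        "std_y": tips[:, 1].std(),
        # Sum of tip-to-seed distances as a length estimate
        "total_length": np.linalg.norm(tips - seed, axis=1).sum(),
        # Tips within the first 100 px below the seed (200/300 px analogous),
        # assuming y increases downwards
        "top_100px_count": int(((tips[:, 1] > seed[1]) &
                                (tips[:, 1] <= seed[1] + 100)).sum()),
    }
    traits["width_depth_ratio"] = (traits["width"] / traits["depth"]
                                   if traits["depth"] else np.nan)
    return traits
```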
QTL discovery results from user-supervised (RootNav, RN) and CNN-derived deep learning (DL) approaches
| Trait | RN Chr | RN Pos | RN LOD | RN CI | DL Chr | DL Pos | DL LOD | DL CI | Additive effect | Nearest marker |
|---|---|---|---|---|---|---|---|---|---|---|
| Centre of mass (x) | 1A | 70.3 | 2.5 | 47.7–163.6 | | | | | | |
| Width/depth ratio | 4D | 4.8 | 2.7 | 0.8–67.6 | 4D | 2.8 | 3.2 | 0.8–67.6 | 0.07 | IAAV5065 |
| Total root length | 6D | 4.4 | 24.0 | 2–53 | 6D | 4.4 | 12.7 | 2–53 | –2201 | wsnp_Ex_c4789_8550135 |
| Convex hull | 6D | 4.4 | 17.6 | 2–53 | 6D | 4.4 | 17.3 | 2–53 | –264 026 | wsnp_Ex_c4789_8550135 |
| Centre of mass (x) | 6D | 26 | 2.8 | 0–92.5 | 6D | 5 | 17.1 | 2–53 | –151 | wsnp_Ex_c4789_8550135 |
| Centre of mass (y) | 6D | 4.4 | 19.1 | 2–53 | 6D | 4.4 | 10.0 | 0–53 | –105 | wsnp_Ex_c4789_8550135 |
| Lateral count/tip count | 6D | 4.4 | 9.1 | 0–53 | 6D | 4.4 | 10.2 | 0–53 | –4.53 | wsnp_Ex_c4789_8550135 |
| Maximum depth | 6D | 4.4 | 22.7 | 2–53 | 6D | 4.4 | 25.1 | 2–53 | –388 | wsnp_Ex_c4789_8550135 |
| Maximum width | 6D | 4.4 | 6.4 | 0–53 | 6D | 6 | 15.0 | 2–53 | –241 | wsnp_Ex_c4789_8550135 |
| Total root length | 7D | 27 | 9.0 | 16–52 | 7D | 30 | 3.4 | 16–52 | –1122 | Kukri_c48125_714 |
| Lateral count/tip count | 7D | 29 | 2.4 | 16–101.8 | 7D | 29 | 4.5 | 16–101.8 | –2.76 | wsnp_Ra_c8297_14095831 |
| Centre of mass (x) | 7D | 19 | 2.7 | 16–38.8 | | | | | | |
| Convex hull | 7D | 34 | 3.5 | 16–62.4 | 7D | 34 | 4.4 | 16–62.4 | –123 896 | Kukri_c48125_714 |
| Maximum depth | 7D | 30 | 5.8 | 16–52 | 7D | 30 | 6.9 | 16–62.4 | –155 | wsnp_Ra_c8297_14095831 |
Note that 2 QTL identified using RN (rows with empty DL columns) were missed by the DL approach; all others were identified by both methods. Chr: chromosome; CI: confidence interval start and end positions; DL: deep learning; Pos: position; RN: RootNav.
Figure 4: The architecture of both convolutional neural networks (left: root, right: shoot). In each case, convolution and pooling layers reduce the spatial resolution to 1 × 1, while increasing the feature resolution. All convolutional layers used kernels of size 3 × 3 pixels, and the number of different filters is shown at the right of each layer. Following the convolution and pooling layers, the fully connected (neural network) layers perform classification of the images. We included rectified linear unit (ReLU) layers between all convolutional and fully connected layers, and dropout layers between each fully connected layer.
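As a hedged sketch of the pattern this caption describes (3 × 3 convolutions with ReLU, pooling until the spatial resolution reaches 1 × 1 while the number of feature maps grows, then fully connected layers separated by dropout), one might write the following; the starting filter count, growth factor, input size, and dropout rate are assumptions rather than the published configuration:

```python
# Hedged sketch of the Figure 4 pattern; filter counts, input size (32x32),
# and the 0.5 dropout rate are illustrative assumptions.
import torch.nn as nn

def build_network(input_size=32, in_channels=3, base_filters=16, num_classes=2):
    layers, channels, size = [], in_channels, input_size
    filters = base_filters
    while size > 1:
        layers += [nn.Conv2d(channels, filters, kernel_size=3, padding=1),
                   nn.ReLU(),
                   nn.MaxPool2d(2)]          # halves the spatial resolution
        channels, filters, size = filters, filters * 2, size // 2
    layers += [nn.Flatten(),                 # spatial resolution is now 1 x 1
               nn.Linear(channels, 64), nn.ReLU(),
               nn.Dropout(0.5),              # dropout between fully connected layers
               nn.Linear(64, num_classes)]
    return nn.Sequential(*layers)
```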