Wenlong Ma1,2, Zhixu Qiu1,3, Jie Song1,2, Jiajia Li1,3, Qian Cheng1,3, Jingjing Zhai1,2, Chuang Ma4,5. 1. State Key Laboratory of Crop Stress Biology for Arid Areas, Center of Bioinformatics, College of Life Sciences, Northwest A&F University, Yangling, 712100, Shaanxi, China. 2. Key Laboratory of Biology and Genetics Improvement of Maize in Arid Area of Northwest Region, Ministry of Agriculture, Northwest A&F University, Yangling, 712100, Shaanxi, China. 3. Biomass Energy Center for Arid and Semi-arid Lands, Northwest A&F University, Shaanxi, 712100, Yangling, China. 4. State Key Laboratory of Crop Stress Biology for Arid Areas, Center of Bioinformatics, College of Life Sciences, Northwest A&F University, Yangling, 712100, Shaanxi, China. cma@nwafu.edu.cn. 5. Key Laboratory of Biology and Genetics Improvement of Maize in Arid Area of Northwest Region, Ministry of Agriculture, Northwest A&F University, Yangling, 712100, Shaanxi, China. cma@nwafu.edu.cn.
Abstract
MAIN CONCLUSION: Deep learning is a promising technology for accurately selecting individuals with high phenotypic values based on genotypic data. Genomic selection (GS) is a promising breeding strategy in which the phenotypes of plant individuals are usually predicted from genome-wide genotypic markers. In this study, we present a deep learning method, named DeepGS, to predict phenotypes from genotypes. Built on a deep convolutional neural network, DeepGS uses hidden variables that jointly represent features in genotypes when making predictions; it also employs convolution, sampling and dropout strategies to reduce the complexity of high-dimensional genotypic data. We trained DeepGS on a large GS dataset and compared its performance with other methods. The experimental results indicate that DeepGS can be used as a complement to the commonly used RR-BLUP in the prediction of phenotypes from genotypes. This complementarity between DeepGS and RR-BLUP can be exploited with an ensemble learning approach to select individuals with high phenotypic values more accurately, even in the absence of outlier individuals and subsets of genotypic markers. The source code of DeepGS and the ensemble learning approach has been packaged into Docker images to facilitate their application in different GS programs.
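The idea of combining two complementary predictors can be illustrated with a minimal sketch. It is not the DeepGS implementation or the authors' ensemble procedure. It assumes a closed-form ridge regression with a fixed, hypothetical penalty `lam` as a simplified stand-in for RR-BLUP, a second ridge model fitted on a marker subset as a placeholder for DeepGS predictions, synthetic genotypes, and a simple grid search for the ensemble weight. All function and variable names are hypothetical.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge regression (simplified stand-in for RR-BLUP)."""
    x_mean = X.mean(axis=0)
    mu = y.mean()
    Xc = X - x_mean
    p = X.shape[1]
    # Solve (Xc'Xc + lam*I) beta = Xc'(y - mu) for the marker effects.
    beta = np.linalg.solve(Xc.T @ Xc + lam * np.eye(p), Xc.T @ (y - mu))
    return mu, x_mean, beta

def ridge_predict(model, X):
    mu, x_mean, beta = model
    return mu + (X - x_mean) @ beta

def pearson(a, b):
    return float(np.corrcoef(a, b)[0, 1])

def best_weight(pred_a, pred_b, y, grid=None):
    """Pick w in [0, 1] maximising the correlation of w*a + (1-w)*b with y."""
    if grid is None:
        grid = np.linspace(0.0, 1.0, 21)
    scores = [pearson(w * pred_a + (1.0 - w) * pred_b, y) for w in grid]
    return float(grid[int(np.argmax(scores))])

# Synthetic data (assumption): 200 individuals, 50 markers coded 0/1/2.
rng = np.random.default_rng(0)
X = rng.integers(0, 3, size=(200, 50)).astype(float)
y = X @ rng.normal(size=50) + rng.normal(scale=2.0, size=200)
X_train, y_train = X[:150], y[:150]
X_val, y_val = X[150:], y[150:]

# Model A: ridge on all markers (stand-in for RR-BLUP).
model_a = ridge_fit(X_train, y_train, lam=10.0)
pred_a = ridge_predict(model_a, X_val)

# Model B: ridge on the first 25 markers only (placeholder for a second,
# different predictor such as DeepGS, which is not reproduced here).
model_b = ridge_fit(X_train[:, :25], y_train, lam=1.0)
pred_b = ridge_predict(model_b, X_val[:, :25])

w = best_weight(pred_a, pred_b, y_val)
pred_ens = w * pred_a + (1.0 - w) * pred_b
print("weight on model A:", w)
print("r(A), r(B), r(ensemble):",
      pearson(pred_a, y_val), pearson(pred_b, y_val), pearson(pred_ens, y_val))
```

Because the grid includes the weights 0 and 1, the ensemble is never worse than either model on the data used to choose the weight. In practice the weight would be chosen on a separate validation fold and then assessed on held-out individuals.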
Keywords:
Deep learning; Ensemble learning; Genomic selection; Genotypic marker; High phenotypic values; Machine learning