Karansher Singh Sandhu1, Aalok Shiv2, Gurleen Kaur3, Mintu Ram Meena4, Arun Kumar Raja5, Krishnapriya Vengavasi5, Ashutosh Kumar Mall2, Sanjeev Kumar2, Praveen Kumar Singh2, Jyotsnendra Singh2, Govind Hemaprabha6, Ashwini Dutt Pathak2, Gopalareddy Krishnappa6, Sanjeev Kumar2.
Abstract
Marker-assisted selection (MAS) has been widely used in plant breeding programs over the last few decades for the mapping and introgression of genes for economically important traits, which has enabled the development of a number of superior cultivars in different crops. In sugarcane, the most important source of sugar and bioethanol, marker development work was initiated long ago; however, marker-assisted breeding in sugarcane has lagged, mainly due to its large, complex genome, high levels of polyploidy and heterozygosity, varied number of chromosomes, and the use of low/medium-density markers. Genomic selection (GS) is a proven technology in animal breeding and has recently been incorporated into plant breeding programs. GS is a potential tool for the rapid selection of superior genotypes and for accelerating the breeding cycle. However, its full potential could be realized through an integrated approach combining high-throughput phenotyping, genotyping, machine learning, and speed breeding with genomic selection. For a better understanding of GS integration, we comprehensively discuss the concept of genetic gain through the breeder's equation, GS methodology, prediction models, the current status of GS in sugarcane, challenges of prediction accuracy, challenges of GS in sugarcane, integrated GS, high-throughput phenotyping (HTP), high-throughput genotyping (HTG), machine learning, and speed breeding, followed by their prospective applications in sugarcane improvement.
Keywords: GEBV; breeding; genomic accuracy; genomic selection; high-throughput genotyping; high-throughput phenotyping; machine learning; prediction models; speed breeding; sugarcane
Year: 2022 PMID: 36015442 PMCID: PMC9412483 DOI: 10.3390/plants11162139
Source DB: PubMed Journal: Plants (Basel) ISSN: 2223-7747
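The abstract frames genetic gain through the breeder's equation. In its commonly used form, the expected gain per unit time is ΔG = (i · r · σ_A) / L, where i is the selection intensity, r the selection accuracy, σ_A the additive genetic standard deviation, and L the length of the breeding cycle. The sketch below is only an illustration of that arithmetic: the function name and every number in it are hypothetical and are not taken from the article.

```python
def genetic_gain_per_year(selection_intensity, accuracy, additive_sd, cycle_years):
    """Expected genetic gain per year from the breeder's equation: (i * r * sigma_A) / L."""
    if cycle_years <= 0:
        raise ValueError("cycle_years must be positive")
    return selection_intensity * accuracy * additive_sd / cycle_years


# Hypothetical scenario values, chosen only to show the trade-off:
# a shorter cycle can outweigh a lower selection accuracy.
conventional = genetic_gain_per_year(
    selection_intensity=1.76,  # roughly the top 10% selected
    accuracy=0.6,
    additive_sd=10.0,
    cycle_years=12.0,
)
genomic = genetic_gain_per_year(
    selection_intensity=1.76,
    accuracy=0.4,
    additive_sd=10.0,
    cycle_years=6.0,
)

print(f"conventional: {conventional:.2f} units/year")
print(f"genomic:      {genomic:.2f} units/year")
```

Under these assumed inputs, halving the cycle length more than compensates for the lower accuracy, which is the mechanism by which GS is expected to accelerate gain.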
Details of genomic selection studies in sugarcane.
| Number of Genotypes in Panel | Traits Considered | Number and Type of Marker Used in Genotyping | GS Prediction Models Tested | Prediction Accuracy | Conclusion/Remarks | Reference |
|---|---|---|---|---|---|---|
| 167 | 10 traits: | 1,499 DArT | Bayes LASSO, ridge regression, reproducing kernel Hilbert space | 0.13–0.21 (smut), ScYLV, BR, sugar content, 0.16–0.33 (Bagasse content), bagasse digestibility, plant morphology | Equivalent accuracy among the four predictive models for a given trait, and marked differences were observed among traits | [ |
| 173 F1s | Lignocellulosic | 21,895 single-dose | Additive, dominance, and epistasis model, | 0.44–0.77 | MAS performed reasonably well for traits controlled | [ |
| Three different panels: | Cane yield, CCS% | Affymetrix Axiom SNP Array | Bayes A, Bayes B, Bayes LASSO, ridge regression | 0.25–0.45 | Unbalanced dataset, ascertainment bias array | [ |
| 219 | Brown rust | 13,458 SNPs | K-nearest neighbor (KNN), support vector machine (SVM), Gaussian process (GP), decision tree (DT), random forest (RF), multilayer perceptron (MLP) neural network, adaptive boosting (AB), Gaussian naive Bayes (GNB) | Increase in | | [ |
| 3984 | Cane yield, CCS%, fibre content, flowering traits | 26 K SNP from Affymetrix Axiom | GBLUP, genomic single step (GenomicSS), BayesR | Cane yield: 0.3 | The single-step | [ |
| 432 | Orange rust, brown rust | 8825 coding region-based SNPs | Random regression BLUP | 0.28–0.43 BR | Parametric GS model outperformed non-parametric models | [ |
| 1000 | | 10,000 SNPs | An additive trait genetic model, | 0.3–0.5 | A combination of improving both additive and non-additive genetic effects holds the potential to improve long-term genetic gain in sugarcane breeding | [ |
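The prediction accuracies in the table above are typically reported as the correlation between predicted breeding values (GEBVs) and observed phenotypes in held-out genotypes. The following is a minimal sketch of that evaluation for ridge regression, one of the model families listed. It assumes NumPy, uses simulated markers with a diploid-style 0/1/2 coding (a simplification for polyploid sugarcane), and every name, dimension, and penalty value is hypothetical rather than drawn from the cited studies.

```python
import numpy as np


def ridge_marker_effects(X, y, lam):
    """Solve (X'X + lam * I) b = X'y for marker effects b (X and y already centered)."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)


def cv_prediction_accuracy(X, y, lam=1.0, k=5, seed=0):
    """Mean Pearson correlation between GEBVs and phenotypes over k held-out folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    accuracies = []
    for test_idx in folds:
        train_idx = np.setdiff1d(np.arange(len(y)), test_idx)
        # Center with training-set statistics only, to avoid leaking test information.
        mu_x = X[train_idx].mean(axis=0)
        mu_y = y[train_idx].mean()
        b = ridge_marker_effects(X[train_idx] - mu_x, y[train_idx] - mu_y, lam)
        gebv = (X[test_idx] - mu_x) @ b + mu_y
        accuracies.append(np.corrcoef(gebv, y[test_idx])[0, 1])
    return float(np.mean(accuracies))


# Simulated panel (hypothetical): 300 genotypes, 300 markers, 20 causal loci.
rng = np.random.default_rng(42)
n, p = 300, 300
X = rng.integers(0, 3, size=(n, p)).astype(float)
true_effects = np.zeros(p)
true_effects[:20] = rng.normal(size=20)
g = X @ true_effects
y = g + rng.normal(scale=g.std(), size=n)  # noise scaled for heritability near 0.5

accuracy = cv_prediction_accuracy(X, y, lam=200.0)
print(f"mean cross-validated prediction accuracy: {accuracy:.2f}")
```

The penalty `lam` plays the role of the residual-to-marker variance ratio; in practice it would be estimated from the data rather than fixed by hand as it is here.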
Figure 1. Various approaches that could be integrated into genomic selection in sugarcane for accelerated genetic gain.
Figure 2. Infrared thermal image of the canopy temperature of a sugarcane plant.
Figure 3. Various spectra under which different physio-biochemical and morphometric traits are measured in sugarcane high-throughput phenomics.
Description and source codes for the important machine and deep learning models for genomic selection in plant breeding programs.
| Prediction Model | Description | Code | Reference |
|---|---|---|---|
| Random Forests (RF) | RF uses different sets of nodes, branches, and depth for building a regression-based tree | [ | |
| Support Vector Machines (SVM) | SVM uses kernel and cost functions to model hyperplane for predictions | [ | |
| Partial Least Squares Regression (PLSR) | PLSR uses a dimension-reduction technique to produce latent variables, which are ultimately used to make predictions | [ | |
| Multilayer Perceptron (MLP) | MLP uses a set of input, hidden, and output layers, with a large number of neurons and activation functions, to model the trend in the data | [ | |
| Convolutional Neural Network (CNN) | CNN uses a set of convolutional, pooling, flattening, and dense layers to predict the output | [ | |
| DeepGS | DeepGS uses a set of input, convolutional, sampling, and fully connected layers to predict the output | [ | |
| Recurrent Neural Network (RNN) | RNN is mostly used for predicting longitudinal and time-series data | [ | |
| Arc-cosine Kernel (AK) | AK estimates the stepwise covariance matrix in model training | [ | |
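As an illustration of the last row, the sketch below builds an arc-cosine kernel of order 1 and composes it layer by layer, which is one plausible reading of the "stepwise" covariance matrix mentioned in the table. It then plugs the kernel into plain kernel ridge regression. This is an assumption-laden example using NumPy and simulated markers, not the code referenced in the table; all function names and parameter values are hypothetical.

```python
import numpy as np


def arc_cosine_kernel(X1, X2, layers=1):
    """Arc-cosine kernel of order 1 between rows of X1 and X2, composed `layers` times."""
    K = X1 @ X2.T
    d1 = np.einsum("ij,ij->i", X1, X1)  # k(x, x) for each row of X1
    d2 = np.einsum("ij,ij->i", X2, X2)  # k(x, x) for each row of X2
    # Guard against all-zero rows, which would otherwise divide by zero.
    norm = np.maximum(np.sqrt(np.outer(d1, d2)), 1e-12)
    for _ in range(layers):
        cos_t = np.clip(K / norm, -1.0, 1.0)
        theta = np.arccos(cos_t)
        K = norm * (np.sin(theta) + (np.pi - theta) * cos_t) / np.pi
        # For order 1, k(x, x) is unchanged by each step, so `norm` stays valid.
    return K


def kernel_ridge_predict(X_train, y_train, X_test, lam=1.0, layers=1):
    """Kernel ridge regression using the arc-cosine kernel as the covariance."""
    K = arc_cosine_kernel(X_train, X_train, layers)
    mu = y_train.mean()
    alpha = np.linalg.solve(K + lam * np.eye(len(y_train)), y_train - mu)
    return arc_cosine_kernel(X_test, X_train, layers) @ alpha + mu


# Simulated data (hypothetical): 120 genotypes, 200 markers, 10 causal loci.
rng = np.random.default_rng(1)
X = rng.integers(0, 3, size=(120, 200)).astype(float)
y = X[:, :10].sum(axis=1) + rng.normal(scale=2.0, size=120)
train, test = np.arange(100), np.arange(100, 120)
Xc = X - X[train].mean(axis=0)  # center markers with training means only

pred = kernel_ridge_predict(Xc[train], y[train], Xc[test], lam=50.0, layers=2)
print("held-out correlation:", np.corrcoef(pred, y[test])[0, 1])
```

The number of layers and the penalty would normally be tuned by cross-validation; they are fixed here only to keep the example short.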