Lívia Gomes Torres, Eder Jorge de Oliveira, Alex C Ogbonna, Guillaume J Bauchet, Lukas A Mueller, Camila Ferreira Azevedo, Fabyano Fonseca E Silva, Guilherme Ferreira Simiqueli, Marcos Deon Vilela de Resende.
Abstract
Genomic prediction (GP) offers great opportunities for accelerated genetic gains by optimizing the breeding pipeline. One key factor is how the training populations (TP) are composed in terms of genetic improvement and kinship/origin, and how that composition affects GP. Hydrogen cyanide (HCN) content is a determinant trait guiding the use and processing of cassava products. This work had the following objectives: (i) evaluate the feasibility of cross-country (CC) GP for HCN between the germplasm of Embrapa Mandioca e Fruticultura (Embrapa, Brazil) and the International Institute of Tropical Agriculture (IITA, Nigeria); (ii) assess the population structure of the joint dataset; and (iii) estimate genetic parameters based on single nucleotide polymorphisms (SNPs) and a haplotype approach. HCN datasets from the Embrapa and IITA breeding programs were analyzed separately and jointly, with 1,230, 590, and 1,820 clones, respectively. After quality control, ∼14K SNPs were used for GP. The genomic estimated breeding values (GEBVs) were predicted from SNP effects estimated with TPs composed of (i) Embrapa genotypic and phenotypic data, (ii) IITA genotypic and phenotypic data, and (iii) the joint dataset. GEBV estimates were compared under the hypothetical scenario in which a research institute/country lacks phenotypic data for a set of clones and must estimate their GEBVs using marker effects trained on germplasm from another research institute/country. The fixation index (FST) among the genetic groups identified within the joint dataset ranged from 0.002 to 0.091. The joint dataset provided improved accuracy (0.80-0.85) compared with either germplasm source individually (0.51-0.67).
CC GP proved to have potential use under the present study's scenario: the correlation between GEBVs predicted with TPs from Embrapa and from IITA was 0.55 for Embrapa's germplasm, whereas for IITA's it was 0.1. This seems to be among the first attempts to evaluate CC GP in plants. As such, it provides useful new information that can guide further research in this important and emerging field.
Keywords: Manihot esculenta; breeding; cross predictions; cyanide content; haplotype prediction; population structure
Year: 2021 PMID: 34956254 PMCID: PMC8692580 DOI: 10.3389/fpls.2021.742638
Source DB: PubMed Journal: Front Plant Sci ISSN: 1664-462X Impact factor: 5.753
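The cross-country scenario in the abstract (marker effects trained in one program, then applied to genotypes from another) can be illustrated with a minimal sketch. The record does not state which prediction model was fitted, so ridge regression on marker genotypes is used here purely as an assumption. The data are simulated, and every name (`ridge_marker_effects`, `predict_gebv`, the shrinkage value `lam`) is hypothetical.

```python
import numpy as np

def ridge_marker_effects(X, y, lam):
    """Ridge-regression estimate of marker effects (an RR-BLUP-like sketch).

    X: clones x markers genotype matrix coded 0/1/2; y: phenotypes.
    lam is a hypothetical shrinkage parameter, not a value from the study.
    """
    col_means = X.mean(axis=0)
    Xc = X - col_means                      # centre genotypes per marker
    p = X.shape[1]
    beta = np.linalg.solve(Xc.T @ Xc + lam * np.eye(p), Xc.T @ (y - y.mean()))
    return col_means, beta

def predict_gebv(X_new, col_means, beta):
    """GEBVs for new clones from marker effects trained elsewhere."""
    return (X_new - col_means) @ beta

# Simulated stand-ins for two breeding programs sharing the same markers.
rng = np.random.default_rng(0)
n_a, n_b, p = 200, 100, 50
freq = rng.uniform(0.1, 0.9, size=p)
X_a = rng.binomial(2, freq, size=(n_a, p)).astype(float)
X_b = rng.binomial(2, freq, size=(n_b, p)).astype(float)
true_beta = rng.normal(0.0, 0.3, size=p)
y_a = X_a @ true_beta + rng.normal(0.0, 1.0, size=n_a)
y_b = X_b @ true_beta + rng.normal(0.0, 1.0, size=n_b)

# Train marker effects separately in each program.
means_a, beta_a = ridge_marker_effects(X_a, y_a, lam=10.0)
means_b, beta_b = ridge_marker_effects(X_b, y_b, lam=10.0)

# Program B's clones predicted with B's own effects and with A's effects.
gebv_b_own = predict_gebv(X_b, means_b, beta_b)
gebv_b_cross = predict_gebv(X_b, means_a, beta_a)

# Correlation between the two sets of GEBVs for the same clones.
r = np.corrcoef(gebv_b_own, gebv_b_cross)[0, 1]
print(f"correlation between own-TP and cross-TP GEBVs: {r:.2f}")
```

The final correlation is the same kind of statistic the abstract reports (0.55 for Embrapa's germplasm, 0.1 for IITA's). The simulated value carries no information about the real data.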
Experimental areas, cities, geographical coordinates, years, number of phenotypical observations, and clones.
| Institute | City | Geographical coordinates | Years | Observations | Clones |
| Embrapa | Cruz das Almas | 12°40′42.4″S | 2016–2019 (4 years total) | 8,355 | 1,230 |
| IITA | Ibadan | 7°29′44.5″N | 1998–2012 (15 years total) | 5,158 | 590 |
Number of single nucleotide polymorphisms (SNP) in each data file after quality control, and the number of shared SNPs between data files after quality control.
| Analyses | # SNPs |
| Embrapa | 14,323 |
| IITA | 13,524 |
| Embrapa+IITA | 14,924 |
| Common SNPs | 11,883 |
FIGURE 1 Principal components analysis (PCA) of the genomic kinship coefficients between cassava clones. (A) PCA from the genomic relationship matrix between all cassava clones (N = 1,820; 14,924 SNPs) from Brazil (Embrapa) and Nigeria (IITA), showing the first three principal components, with the variance explained by each component in parentheses on the corresponding axis (58.24, 12.21, and 5.85% for PC1, PC2, and PC3, respectively). Embrapa clones are shown in black and IITA clones in grey. (B) PC diagram highlighting the 12 clusters identified by the DAPC analysis.
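A PCA of a genomic relationship matrix, as in Figure 1, can be sketched as follows. The record does not say how the kinship matrix was built, so the VanRaden-style formula below is an assumption. The genotypes are simulated, and the function names `vanraden_g` and `pca_variance_explained` are hypothetical.

```python
import numpy as np

def vanraden_g(M):
    """Genomic relationship matrix from 0/1/2 genotypes (VanRaden-style, assumed)."""
    p = M.mean(axis=0) / 2.0                  # allele frequency per marker
    Z = M - 2.0 * p                           # centre by expected genotype
    return Z @ Z.T / (2.0 * np.sum(p * (1.0 - p)))

def pca_variance_explained(G, k=3):
    """Top-k principal component scores of G and their share of total variance."""
    eigvals, eigvecs = np.linalg.eigh(G)      # G is symmetric
    order = np.argsort(eigvals)[::-1]         # sort eigenvalues in descending order
    eigvals = np.clip(eigvals[order], 0.0, None)
    eigvecs = eigvecs[:, order]
    explained = eigvals[:k] / eigvals.sum()
    scores = eigvecs[:, :k] * np.sqrt(eigvals[:k])
    return scores, explained

# Two simulated groups with diverged allele frequencies (illustrative only).
rng = np.random.default_rng(1)
n_markers = 300
freq_a = rng.uniform(0.1, 0.9, size=n_markers)
freq_b = np.clip(freq_a + rng.normal(0.0, 0.2, size=n_markers), 0.05, 0.95)
M = np.vstack([
    rng.binomial(2, freq_a, size=(60, n_markers)),
    rng.binomial(2, freq_b, size=(40, n_markers)),
]).astype(float)

G = vanraden_g(M)
scores, explained = pca_variance_explained(G, k=3)
print("share of variance explained by PC1-PC3:", np.round(explained, 3))
```

Plotting `scores[:, 0]` against `scores[:, 1]`, coloured by group, would give a diagram of the same kind as panel (A).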
Genetic clusters inferred by discriminant analysis of principal components (DAPC), retaining the principal components that accounted for most (>95%) of the total genetic variability; the number of clusters (k = 12) was chosen based on the Bayesian Information Criterion (BIC).
| Cluster | N (Total) | n1 (Embrapa) | n2 (IITA) | Mean HCN |
| 1 | 143 | 134 | 9 | 4.39 ± 0.94 |
| 2 | 299 | 10 | 289 | 5.37 ± 0.84 |
| 3 | 86 | – | 86 | 5.51 ± 0.89 |
| 4 | 79 | – | 79 | 5.53 ± 0.79 |
| 5 | 133 | 133 | – | 7.15 ± 0.84 |
| 6 | 269 | 255 | 14 | 5.44 ± 1.38 |
| 7 | 297 | 296 | 1 | 6.80 ± 0.97 |
| 8 | 112 | – | 112 | 5.60 ± 1.02 |
| 9 | 173 | 173 | – | 5.26 ± 1.37 |
| 10 | 61 | 61 | – | 3.75 ± 0.68 |
| 11 | 68 | 68 | – | 4.69 ± 1.37 |
| 12 | 100 | 100 | – | 7.08 ± 0.71 |
| Total | 1820 | 1230 | 590 | 5.57 ± 1.38 |
Total number of clones (N), number of clones from Embrapa (n1), number of clones from IITA (n2), and mean HCN content per cluster.
Fixation index (FST) estimated by single nucleotide polymorphisms (SNPs) between cassava clones from Embrapa and IITA and between clusters identified within the joint dataset (Embrapa+IITA).
FST between genotypic data from Embrapa and IITA = 0.072
| Cluster | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
| 1 | 0.004 | 0.013 | 0.007 | 0.005 | 0.009 | 0.018 | 0.084 | 0.084 | 0.074 | 0.023 | 0.030 |
| 2 | 0.000 | 0.004 | 0.002 | 0.009 | 0.003 | 0.019 | 0.086 | 0.086 | 0.073 | 0.019 | 0.026 |
| 3 | 0.004 | 0.000 | 0.003 | 0.020 | 0.005 | 0.023 | 0.090 | 0.091 | 0.076 | 0.020 | 0.024 |
| 4 | 0.002 | 0.003 | 0.000 | 0.011 | 0.002 | 0.018 | 0.087 | 0.086 | 0.072 | 0.020 | 0.023 |
| 5 | 0.009 | 0.020 | 0.011 | 0.000 | 0.013 | 0.023 | 0.090 | 0.090 | 0.082 | 0.030 | 0.037 |
| 6 | 0.003 | 0.005 | 0.002 | 0.013 | 0.000 | 0.018 | 0.084 | 0.083 | 0.070 | 0.023 | 0.024 |
| 7 | 0.019 | 0.023 | 0.018 | 0.023 | 0.018 | 0.000 | 0.032 | 0.030 | 0.023 | 0.007 | 0.009 |
| 8 | 0.086 | 0.090 | 0.087 | 0.090 | 0.084 | 0.032 | 0.000 | 0.004 | 0.023 | 0.043 | 0.047 |
| 9 | 0.086 | 0.091 | 0.086 | 0.090 | 0.083 | 0.030 | 0.004 | 0.000 | 0.020 | 0.040 | 0.043 |
| 10 | 0.073 | 0.076 | 0.072 | 0.082 | 0.070 | 0.023 | 0.023 | 0.020 | 0.000 | 0.027 | 0.027 |
| 11 | 0.019 | 0.020 | 0.020 | 0.030 | 0.023 | 0.007 | 0.043 | 0.040 | 0.027 | 0.000 | 0.008 |
| 12 | 0.026 | 0.024 | 0.023 | 0.037 | 0.024 | 0.009 | 0.047 | 0.043 | 0.027 | 0.008 | 0.000 |
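To make the pairwise FST values above more concrete, the sketch below computes FST between two groups from SNP genotypes. The record does not name the estimator used, so Hudson's estimator (ratio of averages across SNPs) is swapped in as an assumption. The genotypes are simulated and the function name `hudson_fst` is hypothetical.

```python
import numpy as np

def hudson_fst(M1, M2):
    """Hudson's FST between two groups of diploid 0/1/2 genotypes.

    Uses the ratio of the summed numerators to the summed denominators
    across SNPs. This estimator is an assumption, not the study's method.
    """
    n1 = 2 * M1.shape[0]                      # sampled allele copies, group 1
    n2 = 2 * M2.shape[0]                      # sampled allele copies, group 2
    p1 = M1.mean(axis=0) / 2.0
    p2 = M2.mean(axis=0) / 2.0
    num = (p1 - p2) ** 2 - p1 * (1 - p1) / (n1 - 1) - p2 * (1 - p2) / (n2 - 1)
    den = p1 * (1 - p2) + p2 * (1 - p1)
    return num.sum() / den.sum()

rng = np.random.default_rng(2)
n_markers = 500
freq = rng.uniform(0.1, 0.9, size=n_markers)
freq_shifted = np.clip(freq + rng.normal(0.0, 0.15, size=n_markers), 0.05, 0.95)

# Two groups drawn from the same frequencies, and one from shifted frequencies.
group_a = rng.binomial(2, freq, size=(200, n_markers))
group_b = rng.binomial(2, freq, size=(200, n_markers))
group_c = rng.binomial(2, freq_shifted, size=(200, n_markers))

fst_same = hudson_fst(group_a, group_b)       # expected to be close to zero
fst_diverged = hudson_fst(group_a, group_c)   # expected to be clearly positive
print(f"FST, same frequencies: {fst_same:.4f}")
print(f"FST, shifted frequencies: {fst_diverged:.4f}")
```

Applying such a function to every pair of clusters would fill a symmetric matrix with a zero diagonal, which is the layout of the table above.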
FIGURE 2 Heatmap of the kinship G matrix (A) by Institute/Country and (B) by DAPC cluster (1–12).
Estimates of phenotype-based heritability (H2), SNP-based or haplotype-based heritability (h2), predicted mean, coefficient of residual variation (CVe), genetic variance, variance due to the combination of year and replication, residual variance, predictive ability (PA), accuracy (Ac), and bias from the individual-marker and haplotype-block genomic analyses of hydrogen cyanide (HCN) content in cassava.
| Analysis | Dataset | H2 | h2 | Predicted mean | CVe | Genetic variance | Year × replication variance | Residual variance | PA | Ac | Bias |
| Individual markers | Embrapa | 0.76 | 0.97 | 7.11 | 0.13 | 31.90* | 0.25* | 0.79 | 0.59 ± 0.05 | 0.67 ± 0.06 | 0.41 ± 0.06 |
| Individual markers | IITA | 0.18 | 0.20 | 4.95 | 0.28 | 0.64* | 0.62* | 1.92 | 0.26 ± 0.12 | 0.61 ± 0.29 | 0.39 ± 0.37 |
| Individual markers | Embrapa+IITA | 0.56 | 0.65 | 5.57 | 0.21 | 3.68* | 0.59* | 1.39 | 0.64 ± 0.05 | 0.85 ± 0.07 | 0.23 ± 0.06 |
| Haplotype blocks | Embrapa | 0.76 | 0.96 | 5.91 | 0.16 | 31.00* | 0.27* | 0.86 | 0.48 ± 0.07 | 0.56 ± 0.08 | 0.58 ± 0.08 |
| Haplotype blocks | IITA | 0.18 | 0.20 | 5.05 | 0.27 | 0.65* | 0.62* | 1.92 | 0.22 ± 0.14 | 0.51 ± 0.34 | 0.48 ± 0.38 |
| Haplotype blocks | Embrapa+IITA | 0.56 | 0.62 | 5.74 | 0.21 | 3.31* | 0.59* | 1.42 | 0.60 ± 0.04 | 0.80 ± 0.05 | 0.30 ± 0.05 |
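The genetic-parameter estimates above can be cross-checked with simple arithmetic. The record does not give formulas, so the three relations below are assumptions: h2 as genetic variance over the sum of the three variance components, CVe as the residual standard deviation over the predicted mean, and Ac as PA divided by the square root of H2. The sketch also assumes that the values in each row follow the order in which the caption lists the quantities. It uses only the first three data rows, and the dictionary keys are hypothetical labels.

```python
import math

# First three data rows of the table, values in caption order (assumed mapping).
table = {
    "Embrapa": dict(H2=0.76, h2=0.97, mean=7.11, cve=0.13,
                    var_g=31.90, var_yr=0.25, var_e=0.79, pa=0.59, ac=0.67),
    "IITA": dict(H2=0.18, h2=0.20, mean=4.95, cve=0.28,
                 var_g=0.64, var_yr=0.62, var_e=1.92, pa=0.26, ac=0.61),
    "Embrapa+IITA": dict(H2=0.56, h2=0.65, mean=5.57, cve=0.21,
                         var_g=3.68, var_yr=0.59, var_e=1.39, pa=0.64, ac=0.85),
}

def recompute(row):
    """Recompute h2, CVe and Ac from the other columns (assumed formulas)."""
    total_var = row["var_g"] + row["var_yr"] + row["var_e"]
    return {
        "h2": row["var_g"] / total_var,
        "cve": math.sqrt(row["var_e"]) / row["mean"],
        "ac": row["pa"] / math.sqrt(row["H2"]),
    }

max_gap = 0.0
for name, row in table.items():
    for key, value in recompute(row).items():
        max_gap = max(max_gap, abs(value - row[key]))
        print(f"{name:13s} {key:3s} reported={row[key]:.2f} recomputed={value:.3f}")
print(f"largest absolute difference: {max_gap:.4f}")
```

Under these assumptions the recomputed values agree with the reported ones to within rounding. That agreement is consistent with the assumed formulas and column order, but does not prove the authors used them.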
FIGURE 3 Boxplots contrasting the hydrogen cyanide predictions from the own-country and cross-country datasets, showing the sweet, intermedium, and bitter classes. (A) Hydrogen cyanide for Embrapa's clones based on GEBVs predicted with Embrapa's dataset, IITA's dataset, and the two datasets together, respectively. (B) Hydrogen cyanide for IITA's clones based on GEBVs predicted with IITA's dataset, Embrapa's dataset, and the two datasets together, respectively.
The number of cassava clones allocated in each class of HCN content, for Embrapa and IITA, according to the GEBVs estimated with marker effects from the three referred analyses.
| Institute | Embrapa | Embrapa | Embrapa | IITA | IITA | IITA |
| Classification | Analysis 1 | Analysis 2 | Analysis 3 | Analysis 1 | Analysis 2 | Analysis 3 |
| Sweet | 263 | 0 | 243 | 18 | 36 | 22 |
| Intermedium | 165 | 79 | 187 | 260 | 72 | 171 |
| Bitter | 802 | 1151 | 800 | 312 | 482 | 397 |
| Total clones | 1230 | 1230 | 1230 | 590 | 590 | 590 |