Abstract
Cassava (Manihot esculenta Crantz) is a major staple crop in Africa, Asia, and South America, and its starchy roots provide nourishment for 800 million people worldwide. Although native to South America, cassava was brought to Africa 400-500 years ago and is now widely cultivated across sub-Saharan Africa, but it is subject to biotic and abiotic stresses. To assist in the rapid identification of markers for pathogen resistance and crop traits, and to accelerate breeding programs, we generated a framework map for M. esculenta Crantz from reduced representation sequencing [genotyping-by-sequencing (GBS)]. The composite 2412-cM map integrates 10 biparental maps (comprising 3480 meioses) and organizes 22,403 genetic markers on 18 chromosomes, in agreement with the observed karyotype. We used the map to anchor 71.9% of the draft genome assembly and 90.7% of the predicted protein-coding genes. The chromosome-anchored genome sequence will be useful for breeding improvement by assisting in the rapid identification of markers linked to important traits, and in providing a framework for genomic selection-enhanced breeding of this important crop.
Keywords: F1 cross; SNP; composite genetic map; genotyping-by-sequencing; pseudomolecules
Year: 2014 PMID: 25504737 PMCID: PMC4291464 DOI: 10.1534/g3.114.015008
Source DB: PubMed Journal: G3 (Bethesda) ISSN: 2160-1836 Impact factor: 3.154
Mapping populations used in this study
| Population | Female Parent | Male Parent | Cross Type | No. of Individuals Sequenced | No. of Validated Progeny | Purpose of Cross (Segregating Traits) |
|---|---|---|---|---|---|---|
| ARAL | AR40-6 | Albert | F1 | 154 | 129 | CBSD and green mite resistance |
| KAR | Kiroba | AR37-80 | F1 | 192 | 132 | CBSD and green mite resistance |
| MP4 | TMS-IBA30001 | TMS-IBA961089A | F1 | 190 | 177 | Starch, dry matter content, CMD resistance, and root rot |
| MP5 | TMS-IBA961089A | TMS-IBA30001 | F1 | 187 | 162 | Starch, dry matter content, CMD resistance, and root rot |
| MT | Mkombozi | Unknown | F1 | 157 | 135 | CBSD resistance |
| NCAR | Nachinyaya | AR37-80 | F1 | 240 | 233 | CBSD and green mite resistance |
| NDLAR | NDL06/132 | AR37-80 | F1 | 247 | 244 | CBSD and green mite resistance |
| NxA | Namikonga | Albert | F1 | 303 | 256 | CBSD resistance |
| TMEB419-S1 | TMEB419 | TMEB419 | S1 | 149 | 117 | Starch content |
| 412×425 | TMS-IBA4(2)1425 | TMS-IBA011412 | F1 | 177 | 155 | Root carotenoid content, CMD resistance |
Nine biparental (F1) and one self-pollinated (S1) populations were generated in which a variety of disease and agronomic traits were segregating. After sequencing, individuals that were not full sibs and/or had insufficient read depth for accurate variant calling were removed prior to map construction. CBSD, cassava brown streak disease; CMD, cassava mosaic disease.
Ferguson laboratory, International Institute of Tropical Agriculture (IITA) Nairobi, Kenya.
Rabbi laboratory, IITA Ibadan, Nigeria.
Egesi laboratory, National Root Crops Research Institute (NRCRI) Umudike, Nigeria.
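The figure of 3480 meioses quoted in the abstract can be reproduced from the "No. of Validated Progeny" column above. The sketch below assumes that each validated progeny contributes two meioses (one per parental gamete, including in the S1 population); the source gives only the total, so this counting rule is an interpretation, and the function name `count_meioses` is illustrative.

```python
# Validated progeny per population, copied from the table above.
validated_progeny = {
    "ARAL": 129, "KAR": 132, "MP4": 177, "MP5": 162, "MT": 135,
    "NCAR": 233, "NDLAR": 244, "NxA": 256, "TMEB419-S1": 117, "412x425": 155,
}

def count_meioses(progeny_counts):
    """Assumption: every progeny samples two meioses, one per parental gamete."""
    return 2 * sum(progeny_counts.values())

total_progeny = sum(validated_progeny.values())
total_meioses = count_meioses(validated_progeny)
print(total_progeny, total_meioses)
```

Under this assumption the 1740 validated progeny correspond to the 3480 meioses stated in the abstract.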
Sequence reads and variability
| Population Name | Total Reads | Average no. of reads/barcode | Coefficient of Variation |
|---|---|---|---|
| ARAL | 1,004,686,146 | 6,440,295 | 0.3689 |
| KAR | 913,517,846 | 3,713,487 | 0.3669 |
| MP4 | 734,460,036 | 3,672,300 | 0.6334 |
| MP5 | 735,540,482 | 3,733,708 | 0.7023 |
| MT | 609,010,716 | 3,421,408 | 0.7470 |
| NCAR | 1,076,417,780 | 4,375,682 | 0.3830 |
| NDLAR | 1,329,433,056 | 5,113,204 | 0.3741 |
| NxA | 1,189,124,090 | 3,823,550 | 0.4403 |
| TMEB419-S1 | 693,520,182 | 4,532,811 | 0.3051 |
| 412×425 | 1,183,937,274 | 6,399,660 | 0.5171 |
Summary statistics are shown for the 10 mapping populations used in this study. The coefficient of variation for a given population is the average across the libraries sequenced for that population. The total (for reads) or average (for average reads per barcode and coefficient of variation) is shown on the last line (bold).
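As a minimal sketch of how a per-library coefficient of variation could be computed from reads per barcode: the source does not state whether a population or sample standard deviation was used, so the population form is assumed here, and the read counts in the example are hypothetical.

```python
from statistics import mean, pstdev

def coefficient_of_variation(reads_per_barcode):
    """CV = standard deviation / mean (population standard deviation assumed)."""
    return pstdev(reads_per_barcode) / mean(reads_per_barcode)

def population_cv(libraries):
    """Average the per-library CVs, as the table note describes."""
    return mean(coefficient_of_variation(lib) for lib in libraries)

# Hypothetical read counts (in millions) for two small libraries.
library_a = [2, 4, 4, 4, 5, 5, 7, 9]
library_b = [5, 5, 5, 5]
cv_a = coefficient_of_variation(library_a)
cv_pop = population_cv([library_a, library_b])
print(cv_a, cv_pop)
```

A CV near zero indicates evenly represented barcodes; higher values, such as those of MP5 and MT in the table, indicate more uneven read depth across individuals.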
Figure 1. Data analysis pipeline used in this study. Using a combination of publicly available and custom tools (gray text), the pipeline starts with sequence data and generates a map for each population (Pop) through a series of analyses (white text on blue). Finally, the maps are merged using LPmerge to generate a single composite map (Endelman and Plomion 2014).
Figure 2. Analysis of parentage and sibling relationships. (A) The fraction of non-Mendelian genotypes detected in each individual in a population is plotted as a function of the minimum genotype quality (GQ) threshold. Off-types can be detected by a consistently high rate of Mendelian violation (black dotted lines), while, for legitimate progeny (solid gray), the non-Mendelian genotype rate is consistently lower than that of the putative parents (solid black lines). (B) A bivariate clustering analysis was performed on each population to verify parentage and full-sibling relationships. In this plot, we show the ARAL population as an example. Pairwise kinship coefficients phi are calculated between progeny and parents (Manichaikul ). Each putative progeny is represented as a point in two-dimensional space colored by its cluster assignment: full-sib F1 progeny are in red and S1 progeny of AR40-6 (therefore unrelated to Albert) are in green; individuals consistent with being progeny from a full sib of Albert crossed to AR40-6 are in purple; individuals consistent with being the progeny from an S1 of AR40-6 being crossed to Albert are in blue.
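The quantity plotted in panel A can be illustrated with a short sketch. It assumes biallelic SNPs with genotypes coded as alternate-allele counts (0, 1, 2) and a single GQ value per site taken as the lowest GQ among the three calls; these conventions, the function names, and the example data are assumptions for illustration and not the authors' pipeline.

```python
def gametes(genotype):
    """Alleles (0 = ref, 1 = alt) that a parent with this alt-allele count can transmit."""
    return {0: {0}, 1: {0, 1}, 2: {1}}[genotype]

def is_mendelian(child, mother, father):
    """True if the child genotype can be formed from one allele of each parent."""
    return any(m + f == child for m in gametes(mother) for f in gametes(father))

def violation_rate(trios, min_gq):
    """Fraction of non-Mendelian sites among those with GQ >= min_gq.

    trios: iterable of (child_gt, mother_gt, father_gt, gq) tuples.
    Returns None when no site passes the threshold.
    """
    kept = [t for t in trios if t[3] >= min_gq]
    if not kept:
        return None
    bad = sum(not is_mendelian(c, m, f) for c, m, f, _ in kept)
    return bad / len(kept)

# Hypothetical calls for one putative progeny; the second site is a violation.
trios = [(1, 0, 2, 40), (2, 0, 1, 10), (0, 0, 0, 50), (2, 1, 1, 30)]
rates = {gq: violation_rate(trios, gq) for gq in (0, 20, 40)}
print(rates)
```

For a legitimate progeny, violations are expected to come mostly from low-quality calls and to fall as the threshold rises; for an off-type, the rate would stay high across thresholds.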
Mapping parameters and statistics
| Population | No. of Markers | No. of framework markers | LOD Threshold | LGs |
|---|---|---|---|---|
| ARAL | 6765 | 3657 | 13.0 | 21 |
| KAR | 3047 | 1952 | 8.0 | 19 |
| MP4 | 3392 | 1903 | 6.0 | 19 |
| MP5 | 3388 | 1803 | 6.0 | 21 |
| MT | 3991 | 2301 | 10.0 | 18 |
| NCAR | 5192 | 2894 | 16.0 | 20 |
| NDLAR | 4460 | 2385 | 12.0 | 20 |
| NxA | 3940 | 2241 | 15.0 | 18 |
| TMEB419-S1 | 4340 | 1943 | 10.0 | 21 |
| 412×425 | 5942 | 2975 | 7.0 | 18 |
For each population and the composite map, we report the total number of markers and the number of markers used by JoinMap for map estimation, the minimum LOD threshold for defining LGs, and the number of LGs output from the grouping procedure. NA, not applicable.
Figure 3. Pairwise comparison of single population maps. (A) An example of a pairwise comparison between maps. Every marker is plotted at a position corresponding to its genetic distance in each map (for shared markers) or along the axis (for markers unique to one map). Shared markers reveal the correspondences and relative orientations between LGs produced by JoinMap (arbitrarily numbered). Runs of shared markers appearing as approximately straight, continuous lines demonstrate marker orders consistent between the two maps; a positive or negative gradient indicates identical or opposite orientation, respectively, in the two maps. In addition, LGs to be joined or split are revealed, respectively, as multiple unusually small LGs in one map corresponding to a single LG in another map (NCAR LG 3, and KAR LGs 13 and 14) and as a single unusually large LG in one map corresponding to multiple LGs in the other (NCAR LGs 7 and 9, and KAR LG 1). (B) An example of a “V”-shaped dot plot pattern, typically observed in LGs, that required (and could be corrected with) JoinMap parameters that increased the sensitivity (see Materials and Methods).
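The gradient rule in panel A can be sketched as a sign test on the covariance of shared-marker positions. This is only an illustration of the idea: the function name, the return labels, and the example positions are hypothetical, and the source does not describe how orientation was scored computationally.

```python
def relative_orientation(shared_positions):
    """Classify the relative orientation of two LGs from their shared markers.

    shared_positions: list of (cM in map A, cM in map B) pairs.
    A positive covariance corresponds to a positive gradient in the dot plot.
    """
    n = len(shared_positions)
    if n < 2:
        return "undetermined"
    mean_a = sum(a for a, _ in shared_positions) / n
    mean_b = sum(b for _, b in shared_positions) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in shared_positions)
    if cov > 0:
        return "same"
    if cov < 0:
        return "opposite"
    return "undetermined"

# Hypothetical shared markers between an LG in one map and its counterpart in another.
forward = [(0.0, 1.0), (10.0, 12.0), (25.0, 30.0)]
flipped = [(0.0, 40.0), (10.0, 28.0), (25.0, 5.0)]
print(relative_orientation(forward), relative_orientation(flipped))
```

A single sign cannot capture a "V"-shaped pattern like the one in panel B, where the gradient changes partway along the LG, so such cases would still need inspection of the dot plot itself.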
Figure 4. Marker distribution in the composite map. The scale on the left shows the map distance in cM. Our composite map consists of 18 LGs with a marker density of 1.95 nonredundant markers per cM. Previous maps did not recapitulate 18 LGs with a clear one-to-one mapping to our map, so we adopted a Roman numeral numbering system, ordered by decreasing genetic size.
Linkage groups in the composite map
| Chromosome (LG) | No. of Markers | Length (cM) | Average Marker Density (markers/cM) | Maximum Inter-marker Distance (cM) | No. of Scaffolds Anchored | No. of Bases Anchored |
|---|---|---|---|---|---|---|
| I | 2323 | 164.78 | 2.72 | 3.63 | 90 | 26,714,966 |
| II | 1366 | 164.22 | 2.40 | 7.85 | 81 | 24,343,195 |
| III | 1326 | 155.60 | 2.04 | 5.97 | 116 | 22,858,152 |
| IV | 1459 | 148.73 | 1.64 | 18.41 | 93 | 21,649,806 |
| V | 1330 | 146.87 | 1.57 | 6.12 | 81 | 23,125,959 |
| VI | 1462 | 144.73 | 2.56 | 3.42 | 79 | 22,319,908 |
| VII | 848 | 141.96 | 1.65 | 6.73 | 88 | 18,888,952 |
| VIII | 1212 | 137.48 | 2.01 | 7.21 | 111 | 23,111,533 |
| IX | 1207 | 137.35 | 1.68 | 6.01 | 92 | 20,667,830 |
| X | 1011 | 133.31 | 2.03 | 5.58 | 95 | 20,387,986 |
| XI | 1330 | 132.16 | 2.12 | 2.56 | 96 | 20,727,479 |
| XII | 863 | 128.85 | 1.30 | 10.09 | 103 | 22,667,256 |
| XIII | 865 | 125.09 | 1.89 | 4.92 | 107 | 20,100,115 |
| XIV | 1346 | 120.26 | 1.48 | 11.74 | 88 | 17,859,824 |
| XV | 1548 | 117.89 | 2.54 | 2.56 | 72 | 20,107,995 |
| XVI | 920 | 117.13 | 1.56 | 5.71 | 75 | 18,834,636 |
| XVII | 974 | 104.95 | 1.71 | 7.48 | 95 | 20,464,108 |
| XVIII | 1013 | 90.99 | 2.41 | 8.46 | 90 | 17,820,998 |
The number of markers and genetic distances are shown for the 18 LGs. The average marker density is calculated for genetically nonredundant markers only, i.e., only one marker at a given genetic position was included.
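As a consistency check, the per-LG marker counts and lengths in the table can be summed and compared with the composite totals in the abstract (22,403 markers, 2412 cM). The sketch also shows one way to compute a nonredundant marker density, counting each distinct genetic position once; the helper name `nonredundant_density` and the example positions are hypothetical.

```python
# (No. of markers, length in cM) per linkage group, copied from the table above.
linkage_groups = {
    "I": (2323, 164.78), "II": (1366, 164.22), "III": (1326, 155.60),
    "IV": (1459, 148.73), "V": (1330, 146.87), "VI": (1462, 144.73),
    "VII": (848, 141.96), "VIII": (1212, 137.48), "IX": (1207, 137.35),
    "X": (1011, 133.31), "XI": (1330, 132.16), "XII": (863, 128.85),
    "XIII": (865, 125.09), "XIV": (1346, 120.26), "XV": (1548, 117.89),
    "XVI": (920, 117.13), "XVII": (974, 104.95), "XVIII": (1013, 90.99),
}

total_markers = sum(n for n, _ in linkage_groups.values())
total_length_cm = sum(length for _, length in linkage_groups.values())

def nonredundant_density(positions_cm, length_cm):
    """Markers per cM, counting only one marker per distinct genetic position."""
    return len(set(positions_cm)) / length_cm

# Hypothetical LG: five markers at three distinct positions over 3 cM.
example_density = nonredundant_density([0.0, 0.0, 1.5, 1.5, 3.0], 3.0)
print(total_markers, round(total_length_cm), example_density)
```

Because co-located markers are collapsed, the density column in the table is lower than the raw ratio of markers to length for each LG.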
Figure 5. Additional maps incorporate more markers, scaffolds, and anchored bases. These plots show the effects of adding maps to the framework map. Each additional map incorporates more genetically nonredundant markers (A) into the framework map, but the number of scaffolds incorporated is saturating (B) and the number of mapped bases (C) is reaching a plateau. This is because the scaffolds being added in later maps are getting smaller and smaller and, hence, adding ever fewer bases.