Sanjeev Kumar Sharma, Daniel Bolser, Jan de Boer, Mads Sønderkær, Walter Amoros, Martin Federico Carboni, Juan Martín D'Ambrosio, German de la Cruz, Alex Di Genova, David S Douches, Maria Eguiluz, Xiao Guo, Frank Guzman, Christine A Hackett, John P Hamilton, Guangcun Li, Ying Li, Roberto Lozano, Alejandro Maass, David Marshall, Diana Martinez, Karen McLean, Nilo Mejía, Linda Milne, Susan Munive, Istvan Nagy, Olga Ponce, Manuel Ramirez, Reinhard Simon, Susan J Thomson, Yerisf Torres, Robbie Waugh, Zhonghua Zhang, Sanwen Huang, Richard G F Visser, Christian W B Bachem, Boris Sagredo, Sergio E Feingold, Gisella Orjeda, Richard E Veilleux, Merideth Bonierbale, Jeanne M E Jacobs, Dan Milbourne, David Michael Alan Martin, Glenn J Bryan.
Abstract
The genome of potato, a major global food crop, was recently sequenced. The work presented here details the integration of the potato reference genome (DM) with a new sequence-tagged site marker-based linkage map and other physical and genetic maps of potato and the closely related species tomato. Primary anchoring of the DM genome assembly was accomplished by the use of a diploid segregating population, which was genotyped with several types of molecular genetic markers to construct a new ~936 cM linkage map comprising 2469 marker loci. In silico anchoring approaches used genetic and physical maps from the diploid potato genotype RH89-039-16 (RH) and tomato. This combined approach has allowed 951 superscaffolds to be ordered into pseudomolecules corresponding to the 12 potato chromosomes. These pseudomolecules represent 674 Mb (~93%) of the 723 Mb genome assembly and 37,482 (~96%) of the 39,031 predicted genes. The superscaffold order and orientation within the pseudomolecules are closely collinear with independently constructed high density linkage maps. Comparisons between marker distribution and physical location reveal regions of greater and lesser recombination, as well as regions exhibiting significant segregation distortion. The work presented here has led to a greatly improved ordering of the potato reference genome superscaffolds into chromosomal "pseudomolecules".
Keywords: Solanaceae; genetic map; genome anchoring; physical map; potato; pseudomolecules; scaffold orientation; sequence-tagged sites
Year: 2013 PMID: 24062527 PMCID: PMC3815063 DOI: 10.1534/g3.113.007153
Source DB: PubMed Journal: G3 (Bethesda) ISSN: 2160-1836 Impact factor: 3.154
Distribution of 1751 markers comprising four different classes across the 12 chromosomes in the DMDD population, with the concomitant map and interval lengths (cM) for each chromosome
| Chr | Mapped Markers | Map Length, cM | Interval Spacing, cM/interval |
|---|---|---|---|
| 01 | 201 | 93.0 | 0.46 |
| 02 | 221 | 77.4 | 0.35 |
| 03 | 134 | 101.8 | 0.77 |
| 04 | 143 | 99.7 | 0.70 |
| 05 | 107 | 64.1 | 0.61 |
| 06 | 134 | 70.5 | 0.53 |
| 07 | 108 | 67.1 | 0.63 |
| 08 | 176 | 67.8 | 0.39 |
| 09 | 152 | 87.9 | 0.58 |
| 10 | 144 | 68.9 | 0.48 |
| 11 | 108 | 62.9 | 0.59 |
| 12 | 123 | 75.2 | 0.62 |
| All | 1751 | 936.2 | 0.54 |
SSR, simple sequence repeat.
Based on the SSRs mapped in previous studies and further confirmed by using in silico approaches.
Excluding 718 co-segregating markers; when the segregation pattern of two or more markers was identical, only a single marker per set of identical markers was retained to generate the maps; 128 ungrouped markers (including 15 unassigned co-segregating markers) that did not fit any linkage group were also excluded.
Calculated as the map length divided by the number of intervals (mapped markers minus 1 for each chromosome; for the “All” row, mapped markers minus 12, that is, minus 1 per linkage group).
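The interval-spacing footnote can be checked with a short calculation. The sketch below is illustrative only: the function name `interval_spacing` and its `linkage_groups` parameter are assumptions introduced here, not part of the original analysis, and the input values are copied from the table above.

```python
def interval_spacing(map_length_cm, mapped_markers, linkage_groups=1):
    """Return the mean spacing in cM per interval.

    Each linkage group with n markers has n - 1 intervals, so a
    genome-wide figure subtracts one marker per linkage group.
    (Hypothetical helper, named here for illustration only.)
    """
    intervals = mapped_markers - linkage_groups
    if intervals <= 0:
        raise ValueError("need more markers than linkage groups")
    return map_length_cm / intervals


# Chromosome 03: 134 markers over 101.8 cM -> 133 intervals
chr03 = round(interval_spacing(101.8, 134), 2)  # 0.77

# All chromosomes: 1751 markers over 936.2 cM, 12 linkage groups
all_chr = round(interval_spacing(936.2, 1751, linkage_groups=12), 2)  # 0.54

print(chr03, all_chr)
```

Subtracting 12 rather than 1 for the genome-wide row reflects that no interval spans two chromosomes.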
Figure 1. Pipeline for anchoring of markers to the potato genome assembly.
Figure 2. Step-wise linkage group assignment and ordering of DM superscaffolds using genetic-anchoring information successively from the DM, RH, and tomato genetic maps.
Anchoring statistics by chromosome for the three different physical maps, de novo (DM) and in silico (RH and tomato)
| Chromosome | DM Map: DMB Anchored | DM Map: Cumulative Length, Mb | DM Map: No. of Markers | RH Map: DMB Anchored | RH Map: Cumulative Length, Mb | RH Map: No. of Markers | Tomato Map: DMB Anchored | Tomato Map: Cumulative Length, Mb | Tomato Map: No. of Markers |
|---|---|---|---|---|---|---|---|---|---|
| 01 | 39 | 45 | 162 | 69 | 80 | 208 | 43 | 41 | 271 |
| 02 | 35 | 43 | 175 | 35 | 43 | 120 | 33 | 40 | 233 |
| 03 | 19 | 24 | 108 | 28 | 27 | 73 | 41 | 45 | 194 |
| 04 | 34 | 47 | 138 | 51 | 57 | 168 | 40 | 39 | 174 |
| 05 | 20 | 27 | 74 | 33 | 45 | 137 | 25 | 30 | 112 |
| 06 | 29 | 34 | 108 | 44 | 46 | 119 | 34 | 34 | 133 |
| 07 | 26 | 24 | 89 | 35 | 39 | 122 | 32 | 31 | 136 |
| 08 | 32 | 32 | 152 | 24 | 23 | 57 | 40 | 32 | 129 |
| 09 | 27 | 28 | 109 | 34 | 33 | 91 | 40 | 39 | 136 |
| 10 | 31 | 38 | 106 | 34 | 44 | 102 | 26 | 32 | 110 |
| 11 | 20 | 26 | 113 | 36 | 38 | 110 | 22 | 26 | 116 |
| 12 | 22 | 26 | 72 | 47 | 52 | 164 | 26 | 28 | 109 |
| Total | 334 | 394 | 1406 | 470 | 527 | 1471 | 402 | 417 | 1853 |
DM, doubled monoploid reference clone; RH, RH89-039-16; DMB, DM superscaffold.
Only markers mapped in DMDD and uniquely and reliably anchored to DM assembly are included.
Figure 3. Summary of DM genome assembly anchoring using three different map resources. The number of uniquely and jointly anchored superscaffolds for each resource is given in the appropriate intersection. The cumulative size (Mb) of superscaffolds anchored in each category is shown in parentheses. The total of 649 anchored superscaffolds represents 623 Mb of the assembled DM potato genome. Figure updated from the Potato Genome Sequencing Consortium (2011).
Figure 4. Depiction of the “Link-peak” walk strategy, taking superscaffold PGSC0003DMB000000159 as an example. (A) Custom GBrowse “Link-peak” intensity track features (shown as red and blue arrows) provided ordered navigation through superscaffolds using the aggregated PEMP. Link peaks to the right (red arrow) indicate the “suggested path” downstream of the AGP, whereas those to the left (blue arrow) indicate the converse. Reversal of this trend indicates a negative strand for the superscaffold in question. Traversing from one superscaffold to another by taking leads from these “Link-peak” intensity tracks assisted in manually curating all 12 PMs. (B) Visualization of the underlying PEMP data.
Figure 5. Assembled BAC sequence for LuSP197F07. Each scaffold assembly is derived from PE sequences of a combined pool of 82 DM BACs (spanning scaffolding gaps on chromosome 4) and single-end sequence at greater read depth from one of the six subpools derived from the same BACs. The assemblies show a direct sequence running from PGSC0003DMB000000278 (− orientation, full length, cyan) through into PGSC0003DMB000000051 (+ orientation, blue), in accordance with the AGP and fully validating the decision to split PGSC0003DMB000000278 at position 824768 and to split PGSC0003DMB000000051 at position 1859342, as indicated in the AGP file. Regions of good alignment (>98% identity, >1000 bases) are indicated as thick lines. Thin lines indicate no good alignment between the superscaffold and BAC sequences. The BAC end sequences are labeled with their GenBank IDs and are indicated at each end of the plot by black arrows. Breakpoints in the BAC sequences are indicated by orange diagonal lines and annotated with the assigned breakpoint coordinate from the AGP.
Figure 6. Enhanced accuracy of the current DM PMs. Panels A and E show anchoring of superscaffolds to the PM versions 4.03 and 2.1.11, respectively. Superscaffolds with known and unknown orientations are depicted in alternating shades of blue and red, respectively. Gaps between the superscaffolds are marked in gray. Black areas in panel E represent unanchored superscaffolds (version 2.1.11) that were eventually anchored and ordered in PM version 4.03. Panels B and C show gene and repeat region densities, respectively, in 1-Mb bins of PM version 4.03. Gene and repeat region densities range from 0 to >150 genes/Mb and 0 to >900 repeats/Mb, respectively. Panels D and F show the correspondence of the genetic maps (D84, green; DRH, black), adapted from Felcher et al., to PM versions 4.03 and 2.1.11, respectively. Graphs show the genetic (cM) positions plotted against the physical coordinates (Mb) for the SolCAP SNP markers; panels G (D84) and H (DRH) show elaborated examples of good correspondence from chromosome 9.
Figure 7. Illustration of the chromosome 1 PM integrated with the DM and RH genetic maps. STS and AFLP markers anchor sequence locations in the chromosome 1 PM to the DMDD and RH genetic maps, respectively. The AFLP marker positions in the PM were identified through sequence tag alignment of BAC clones from the RH WGP physical map. Superscaffolds comprising the PM are shown as alternating gray and white rectangular blocks. The layout of the PM for each of the genetic maps is shown separately but is identical, with superscaffold IDs depicted in the middle. The pachytene idiogram is adapted from the potato reference genome publication (Potato Genome Sequencing Consortium 2011). The putative centromere region and pericentromeric/heterochromatic boundaries are demarcated by an asterisk and dashed lines, respectively. Each DMDD marker type is color coded: blue = DArTs, yellow = SNPs, green = SSRs. Blue and magenta lines emerging from the RH genetic map represent AFLP anchors, and the intensity of green color corresponds to the AFLP marker density per bin as reported by Van Os et al. Magenta lines represent AFLP markers with a relatively inaccurate mapping position on the RH genetic map, covering an interval of 5 or more bins. Regions in the central heterochromatin where superscaffold order and orientation are not completely resolved are indicated in yellow. Inversions with the tomato sequence are indicated with red interval bars.
Improvements in DM PMs before and after execution of the link peak-based orientation strategy
| Chr | Stage I: DMB Anchored, No. (Size in Mb) | Stage I: DMB Anchored, N50 | Stage II: DMB Anchored, No. (Size in Mb) | Stage II: DMB Anchored, N50 | Stage II: DMB Oriented, No. (Size in Mb) | Stage II: DMB Oriented, Percentage |
|---|---|---|---|---|---|---|
| 01 | 83 (79.7) | 1.7 | 123 (82.6) | 2.6 | 121 (79.8) | 96.6 |
| 02 | 51 (45.0) | 1.3 | 68 (45.3) | 2.2 | 68 (45.3) | 100.0 |
| 03 | 53 (45.3) | 1.6 | 103 (57.2) | 4.3 | 103 (57.2) | 100.0 |
| 04 | 73 (60.9) | 1.2 | 120 (66.3) | 2.9 | 119 (62.1) | 93.7 |
| 05 | 41 (44.8) | 1.7 | 52 (49.5) | 2.9 | 47 (40.4) | 81.6 |
| 06 | 63 (54.0) | 1.4 | 90 (55.1) | 2.7 | 90 (55.1) | 100.0 |
| 07 | 52 (50.6) | 1.8 | 78 (52.9) | 7.2 | 78 (52.9) | 100.0 |
| 08 | 51 (41.6) | 1.2 | 91 (52.4) | 4.9 | 91 (52.4) | 100.0 |
| 09 | 61 (50.6) | 1.2 | 86 (57.3) | 8.3 | 85 (55.9) | 97.7 |
| 10 | 50 (51.4) | 1.5 | 77 (56.0) | 4.1 | 74 (55.4) | 99.0 |
| 11 | 35 (34.4) | 1.4 | 60 (42.5) | 5.7 | 60 (42.5) | 100.0 |
| 12 | 61 (58.5) | 1.5 | 77 (57.4) | 1.9 | 76 (56.0) | 97.7 |
| Total | 674 (616.8) | 1.5 | 1025 | 4.1 | 1012 | 97.2 |
DM, doubled monoploid reference clone; PMs, pseudomolecules; DMB, DM superscaffold.
Refers to the status of PMs before execution of the “Link-peak” walk strategy.
Refers to the status of PMs after execution of the “Link-peak” walk strategy.
Only attempted at stage II.
Total.
Average.
Chimeric superscaffolds have been included more than once (net number of DMBs anchored = 951).
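Two columns of the table above rest on simple calculations that a short sketch can make concrete. The code below is a minimal illustration, not the authors' pipeline: the helper names `n50` and `oriented_percentage` and the toy scaffold lengths are assumptions made here. Reading the "Percentage" column as oriented size over anchored size is an inference from the tabulated values (for example, 79.8 Mb of 82.6 Mb on chromosome 01), not something the source states explicitly.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    cover at least half of the total length.
    (Hypothetical helper, named here for illustration only.)
    """
    if not lengths:
        raise ValueError("no lengths given")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length


def oriented_percentage(oriented_mb, anchored_mb):
    """Percentage of anchored sequence (by size) that is oriented.

    Assumption: the table's percentage is computed by size, not count.
    """
    return 100.0 * oriented_mb / anchored_mb


# Toy lengths (hypothetical, in Mb): total is 23, so half is 11.5;
# the two largest (8 + 5 = 13) already cover half, giving N50 = 5.
print(n50([8, 5, 4, 3, 2, 1]))

# Chromosome 01, stage II values from the table: 79.8 Mb of 82.6 Mb
print(round(oriented_percentage(79.8, 82.6), 1))
```

Working by size rather than by superscaffold count matters here: on chromosome 01, 121 of 123 superscaffolds would give about 98.4%, whereas the size ratio reproduces the tabulated 96.6%.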
Figure 8. NUCmer sequence alignment dot plots for the twelve potato chromosomes using current (ver4.03, plotted on x-axis) and previous (ver2.1.11, plotted on y-axis) versions of DM PMs. Sequences aligned in forward and reverse orientations are represented by red and blue lines, respectively. Scaffold misplacements are shown as horizontal or vertical shifts in parts of the aligned blocks.