Ignazio Verde1, Jerry Jenkins2, Luca Dondini3, Sabrina Micali4, Giulia Pagliarani3, Elisa Vendramin4, Roberta Paris3,5, Valeria Aramini4, Laura Gazza4,6, Laura Rossini7,8, Daniele Bassi7, Michela Troggio9, Shengqiang Shu10, Jane Grimwood2, Stefano Tartarini3, Maria Teresa Dettori4, Jeremy Schmutz2,10.
Abstract
BACKGROUND: The availability of the peach genome sequence has fostered relevant research in peach and related Prunus species, enabling the identification of genes underlying important horticultural traits as well as the development of advanced tools for genetic and genomic analyses. The first release of the peach genome (Peach v1.0) represented a high-quality WGS (Whole Genome Shotgun) chromosome-scale assembly with high contiguity (contig L50 214.2 kb), large portions of mapped sequences (96%) and high base accuracy (99.96%). The aim of this work was to improve the quality of the first assembly by increasing the portion of mapped and oriented sequences, correcting misassemblies and improving the contiguity and base accuracy using high-throughput linkage mapping and deep resequencing approaches.
Keywords: Centromeric regions; Gap patching; Linkage mapping; NGS resequencing; Prunus persica; Recombination rates; SNPs; SSRs; WGS assembly
Year: 2017 PMID: 28284188 PMCID: PMC5346207 DOI: 10.1186/s12864-017-3606-9
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Anchoring statistics of the Peach v2.0 assembly
| | Mapping progeny | Chromosome (LG)/pseudomolecule 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | Total |
|---|---|---|---|---|---|---|---|---|---|---|
| Number of markers integrated^a | TxE | 256 (186) | 143 (101) | 155 (113) | 120 (83) | 123 (90) | 159 (121) | 143 (107) | 125 (94) | |
| | PxF | 269 | 207 | 292 | 408 | 153 | 219 | 224 | 202 | |
| | CxA | 29 | 24 | 36 | 42 | 12 | 29 | 25 | 28 | |
| Number of scaffolds anchored | TxE | 10 | 9 | 10 | 7 | 5 | 8 | 4 | 4 | |
| | PxF | 9 | 8 | 8 | 7 | 5 | 7 | 4 | 3 | |
| | CxA | 9 | 8 | 9 | 6 | 3 | 9 | 4 | 4 | |
| Genetic distances covered (cM) | TxE | 77.5 | 42.7 | 44.1 | 51.7 | 47.6 | 81 | 70.6 | 57.5 | |
| | PxF | 117.4 | 70.9 | 69.9 | 69.3 | 62.1 | 81.1 | 67.3 | 67.6 | |
| | PxF F1 | 139.8 | 88.1 | 71.9 | 76.7 | 59.1 | 91 | 63.6 | 68.4 | |
| | PxF recurrent | 53.7 | 31 | -- | 44.2 | 27.6 | 45.9 | 19.6 | 10.4 | |
| | CxA | 98 | 36.3 | 67.7 | 64.5 | 36.5 | 79.9 | 62.3 | 64.4 | |
| Physical distance covered with the integrated markers (bp) | TxE | 47,190,243 | 29,794,491 | 27,174,422 | 24,974,520 | 17,300,580 | 30,384,999 | 21,009,142 | 22,199,033 | |
| | PxF | 46,854,330 | 29,975,524 | 26,162,111 | 25,167,755 | 17,989,526 | 29,985,579 | 22,201,468 | 20,421,932 | |
| | PxF F1 | 46,854,330 | 29,652,167 | 26,162,111 | 25,167,755 | 17,989,526 | 29,209,364 | 20,322,548 | 20,087,434 | |
| | PxF recurrent | 39,665,700 | 22,115,897 | -- | 19,074,112 | 5,684,854 | 26,868,520 | 3,306,229 | 3,031,158 | |
| | CxA | 45,955,086 | 19,656,032 | 27,246,203 | 25,053,083 | 7,348,324 | 29,848,481 | 21,600,273 | 22,073,557 | |
| Total number of anchored bases (bp) | | | | | | | 30,767,194 | 22,388,614 | | |
For each map and for each chromosome, the number of markers, the number of anchored scaffolds, the genetic and physical distances covered with the integrated markers, and the total number of anchored bases are reported.
^a In brackets: the bin-mapped markers in TxE
Genetic/physical ratio (cM/Mb) for each map and each chromosome
| LG-Pp^a | TxE | PxF | PxF F1 | PxF recurrent | CxA | WxB | DvsS |
|---|---|---|---|---|---|---|---|
| 1 | 1.642 | 2.506 | 2.984 | 1.354 | 2.133 | 2.261 | 1.858 |
| 2 | 1.433 | 2.365 | 2.971 | 1.402 | 1.847 | 2.141 | 1.435 |
| 3 | 1.623 | 2.672 | 2.748 | -- | 2.485 | 2.340 | 2.174 |
| 4 | 2.070 | 2.754 | 3.048 | 2.317 | 2.575 | 2.544 | 2.384 |
| 5 | 2.751 | 3.452 | 3.285 | 4.855 | 4.967 | 3.777 | 2.672 |
| 6 | 2.666 | 2.705 | 3.115 | 1.708 | 2.677 | 2.393 | 1.770 |
| 7 | 3.360 | 3.031 | 3.130 | 5.928 | 2.884 | 3.003 | 1.930 |
| 8 | 2.590 | 3.310 | 3.405 | 3.431 | 2.918 | 2.775 | 1.812 |
| Total | 2.148 | 2.768 | 3.057 | 1.941 | 2.564 | 2.553 | 1.954 |
^a LG: linkage group; Pp: pseudomolecule
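The ratios in this table can be recovered by dividing the genetic distance covered (cM) by the physical distance covered (bp, converted to Mb) from the anchoring statistics table. The following is a minimal Python sketch for the TxE map only; the function and variable names are illustrative assumptions and do not come from the paper.

```python
# Genetic (cM) and physical (bp) distances covered by the TxE map,
# copied from the anchoring statistics table (chromosomes 1 to 8).
TXE_CM = [77.5, 42.7, 44.1, 51.7, 47.6, 81.0, 70.6, 57.5]
TXE_BP = [47_190_243, 29_794_491, 27_174_422, 24_974_520,
          17_300_580, 30_384_999, 21_009_142, 22_199_033]


def cm_per_mb(genetic_cm, physical_bp):
    """Return the genetic/physical ratio in cM/Mb (illustrative helper)."""
    return genetic_cm / (physical_bp / 1_000_000)


# One ratio per chromosome, rounded to three decimals as in the table.
per_chromosome = [round(cm_per_mb(cm, bp), 3) for cm, bp in zip(TXE_CM, TXE_BP)]

# Genome-wide ratio: total cM over total Mb, not the mean of the
# per-chromosome ratios.
genome_wide = round(cm_per_mb(sum(TXE_CM), sum(TXE_BP)), 3)

for chrom, ratio in enumerate(per_chromosome, start=1):
    print(f"Pp{chrom:02d}: {ratio:.3f} cM/Mb")
print(f"Total: {genome_wide:.3f} cM/Mb")
```

With these inputs the sketch matches the TxE column, for example 1.642 cM/Mb for chromosome 1 and 2.148 cM/Mb overall, which suggests the "Total" row is a ratio of sums rather than an average of the per-chromosome ratios.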
Summary statistics of the Peach v2.0 chromosome-scale assembly and comparison with v1.0
| | Peach v2.0 | Peach v1.0 |
|---|---|---|
| Number of scaffolds | 191 | 202 |
| Number of contigs | 2,525 | 2,730 |
| Scaffold sequence | 227.4 Mb | 227.3 Mb |
| Mapped scaffold sequence | 225.7 Mb (99.2%) | 218.4 Mb (96%) |
| Oriented scaffold sequence | 223.3 Mb (98.2%) | 194.6 Mb (85.6%) |
| Contig sequence | 224.6 Mb | 224.6 Mb |
| Scaffold N/L50 | 4/27.4 Mb | 4/26.8 Mb |
| Contig N/L50 | 250/255.4 kb | 294/214.2 kb |
| Number of scaffolds > 50 kb | 11 | 21 |
| % main genome in scaffolds > 50 kb | 99.4% | 99.4% |
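In this table the "N/L50" pairs give a count followed by a length (for example, 4 scaffolds and 27.4 Mb), so N50 is the number of sequences needed to reach half of the total length and L50 is the length of the shortest sequence in that set; other sources use the two names the other way round. The following Python sketch illustrates the calculation under that convention. The function name and the toy lengths are hypothetical and are not peach data.

```python
def n_l50(lengths):
    """Return (N50, L50) using the convention of the table above.

    N50: smallest number of sequences whose summed length reaches half
         of the total length.
    L50: length of the shortest sequence in that set.
    """
    if not lengths:
        raise ValueError("at least one sequence length is required")
    half_total = sum(lengths) / 2
    running = 0
    # Walk from the longest sequence to the shortest.
    for count, length in enumerate(sorted(lengths, reverse=True), start=1):
        running += length
        if running >= half_total:
            return count, length


# Hypothetical scaffold lengths in Mb (illustrative only).
toy_lengths = [40, 30, 20, 5, 3, 2]
n50, l50 = n_l50(toy_lengths)
print(f"N/L50 = {n50}/{l50} Mb")
```

For the toy input, the two longest sequences (40 + 30 = 70) are the first to reach half of the total of 100, so the sketch reports 2/30 Mb.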
Summary of gap patching and indel and SNP correction
| Pseudomolecules | No. of contigs | No. of gaps closed | Gap bases patched | Initial contig length | Post gap-patching contig length | Bases gained | Indels corrected | SNPs corrected |
|---|---|---|---|---|---|---|---|---|
| Pp01 | 36 | 36 | 6,820 | 47,412,656 | 47,417,444 | 4,788 | 269 | 143 |
| Pp02 | 27 | 27 | 6,884 | 29,982,897 | 29,985,295 | 2,398 | 185 | 117 |
| Pp03 | 29 | 30 | 4,831 | 27,022,361 | 27,022,947 | 586 | 167 | 128 |
| Pp04 | 24 | 24 | 3,895 | 25,545,546 | 25,549,276 | 3,730 | 133 | 132 |
| Pp05 | 19 | 21 | 10,824 | 18,291,031 | 18,295,669 | 4,638 | 110 | 58 |
| Pp06 | 25 | 26 | 9,635 | 30,419,305 | 30,423,361 | 4,056 | 197 | 117 |
| Pp07 | 29 | 30 | 7,501 | 22,049,797 | 22,053,146 | 3,349 | 157 | 75 |
| Pp08 | 17 | 18 | 3,485 | 22,391,144 | 22,392,798 | 1,654 | 129 | 89 |
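The "Bases gained" column of the gap patching summary is the difference between the post gap-patching and the initial contig lengths. The Python sketch below checks that relationship for each pseudomolecule; the dictionary layout and variable names are illustrative assumptions, while the numbers are copied from the table.

```python
# (initial contig length, post gap-patching contig length, reported bases gained),
# all in bp, copied from the gap patching summary table.
GAP_PATCHING = {
    "Pp01": (47_412_656, 47_417_444, 4_788),
    "Pp02": (29_982_897, 29_985_295, 2_398),
    "Pp03": (27_022_361, 27_022_947, 586),
    "Pp04": (25_545_546, 25_549_276, 3_730),
    "Pp05": (18_291_031, 18_295_669, 4_638),
    "Pp06": (30_419_305, 30_423_361, 4_056),
    "Pp07": (22_049_797, 22_053_146, 3_349),
    "Pp08": (22_391_144, 22_392_798, 1_654),
}

mismatches = []
for name, (initial, patched, reported_gain) in GAP_PATCHING.items():
    gained = patched - initial  # net change in contig length
    if gained != reported_gain:
        mismatches.append(name)
    print(f"{name}: {gained:,} bp gained")

# Sum over the eight pseudomolecules listed above.
total_gained = sum(patched - initial for initial, patched, _ in GAP_PATCHING.values())
print(f"Sum over Pp01 to Pp08: {total_gained:,} bp")
print("Rows that disagree with the table:", mismatches or "none")
```

The net gain is smaller than the "Gap bases patched" value in every row, so the two columns should not be read as the same quantity.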
Fig. 1 Plots of genetic-by-physical distances (MareyMap). Comparison of v1.0 and v2.0 physical distances (Mb, on the horizontal axis) and PxF genetic distances (cM, on the vertical axis). Dots represent the mapped markers. The vertical bars indicate the putative position of the centromere. The solid line represents the recombination rate plotted along the 8 pseudomolecules.
Comparison of the peach genome to other published plant genomes
| Genome release [Reference] | Coverage | Assembled scaffold sequence (Mb) | Mapped sequences, Mb (%) | Scaffold WGS^a N50 | Scaffold WGS^a L50 (Mb) | Scaffold Chr^b N50 | Scaffold Chr^b L50 (Mb) | Contig N50 | Contig L50 (kb) | Sequencing methods |
|---|---|---|---|---|---|---|---|---|---|---|
| Peach | 8.47x | 227.4 | 225.7 | 10 | 7.3 | 4 | 27.4 | 250 | 255.4 | Sanger (WGS) |
| Apple | 16.9x | 598.3 | 528.3 | 80 | 2 | -- | -- | 16171 | 13.4 | Sanger, 454 (WGS) |
| Arabidopsis | -- | 119.7 | 119.7 | -- | -- | 3 | 23.5 | -- | -- | Sanger (BAC by BAC) |
| Rice | -- | 382.2 | 382.2 (100) | -- | -- | 6 | 30.8 | -- | -- | Sanger (BAC by BAC) |
| Soybean | 8.04x | 955.4^f | 932.5 (97.6) | -- | -- | 10 | 48.6^f | 1548^f | 182.8^f | Sanger (WGS) |
| Poplar | 9.44x^f | 423^f | 388 (91.7) | -- | -- | 8^f | 19.5^f | 206^f | 552.8^f | Sanger (WGS) |
| Grape | 8.4x | 467.5 | 290.2 (62.1) | -- | 2.1 | 14 | 13.9 | 2012 | 66.4 | Sanger (WGS) |
| Papaya | <3x | 271.7 | 235 (86.5) | -- | -- | 74 | 1.3 | 7109 | 10.6 | Sanger (WGS) |
| | 9.4x | 271.9 | 271.1 (99.8) | -- | -- | 3 | 59.3 | -- | 22000^f | Sanger (WGS) |
| | 8.5x | 697.6 | 625.6 (89.7) | -- | -- | 6 | 62.4 | -- | 1200^f | Sanger (WGS) |
| | 7x | 212.6 | -- (--) | 38 | 1.7 | -- | -- | 515 | 119.8 | Sanger (WGS) |
| | 8.92x | 466.7 | -- (--) | 86 | 1.7 | 12^f | 17.4^f | 312^f | 464.9^f | Sanger (WGS) |
| Tomato | 25x | 781.3 | 759.9 (97.3) | 52 | 4.5 | 6 | 64.8 | 3641 | 55.7 | Sanger, 454, SOLiD, Illumina (WGS) |
| Banana | 20.5x | 472.2 | 331.8 (70.3) | 65 | 1.3 | 8 | 28.6 | 2113 | 43.1 | Sanger, 454 (WGS) |
| Citrus | 6.97x | 309.9 | 288.6 (93.1) | -- | 6.8 | -- | 31.4 | -- | 115.9 | Sanger (WGS) |
| Watermelon | 108.64x | 353.5 | 330 (93.4) | 42 | 2.4 | -- | -- | 3315 | 26.4 | Illumina (WGS) |
| | 30x | 706 | -- | 50 | 4.9 | -- | -- | 6448 | 29.4 | Sanger, 454, Illumina (WGS) |
| | -- | 328.9 | 297.1 (90.3) | 53 | 1.27 | 4 | 38.9 | -- | -- | Sanger, 454, Illumina (WGS, BAC by BAC) |
| Melon | 13.52x | 361.4 | 316.3 (87.5) | 26 | 4.68 | 6^g | 17.7^g | -- | 18.2 | Sanger, 454 (WGS) |
| Coffee | 30x | 568.6 | 364.1 (64.0) | 108 | 1.3 | 5^g | 32^g | 2290 | 51.1 | Sanger, 454, Illumina (WGS) |
| Cotton | 103.6x | 775.2 | 567.2 (73.2) | 95 | 2.3 | -- | -- | 4918 | 44.9 | Illumina (WGS) |
| Pineapple | 410x | 381.9 | 315.8 (82.7) | -- | 0.64 | 13^g | 11.8 | -- | 126.5 | PacBio, Illumina, 454, Moleculo (WGS) |
^a N50/L50 statistics of the WGS assembly prior to pseudomolecule build
^b N50/L50 statistics of the chromosome-scale assembly
^c The Arabidopsis assembly, obtained using the BAC by BAC approach, represents the gold standard for plant genomes. Statistics were calculated from the TAIR10 release (http://www.ncbi.nlm.nih.gov/mapview/stats/BuildStats.cgi?taxid=3702&build=9&ver=2)
^d The rice assembly, obtained using the BAC by BAC approach, represents the gold standard for plant genomes. Statistics were calculated from IRGSP Releases Build 4.0 (http://rgp.dna.affrc.go.jp/IRGSP/Build4/build4.html)
^e Data retrieved from Schmutz et al. [88]; they recalculated the original statistics to better match chromosome-scale assemblies
^f Data from recent releases retrieved from Phytozome
^g Data were recalculated based on the original statistics reported in the paper
Tukey's pairwise comparison test among the different maps
| | DvsS | WxB | CxA | PxF recurrent | PxF F1 | PxF |
|---|---|---|---|---|---|---|
| TxE | 0.9335 | 0.6958 | 0.9596 | 0.2531 | | 0.2337 |
| PxF | | 0.9843 | 0.7918 | | 0.9592 | |
| PxF F1 | | 0.5816 | 0.2327 | | | |
| PxF recurrent parent | 0.8660 | | | | | |
| CxA | 0.4103 | 0.9959 | | | | |
| WxB | 0.1362 | | | | | |
Significant p-values (α = 0.05) are reported in bold