Jan M de Boer, Erwin Datema, Xiaomin Tang, Theo J A Borm, Erin H Bakker, Herman J van Eck, Roeland C H J van Ham, Hans de Jong, Richard G F Visser, Christian W B Bachem.
Abstract
BACKGROUND: In flowering plants it has been shown that de novo genome assemblies of different species and genera show a significant drop in the proportion of alignable sequence. Within a plant species, however, it is assumed that different haplotypes of the same chromosome align well. In this paper we have compared three de novo assemblies of potato chromosome 5 and report on the sequence variation and the proportion of sequence that can be aligned.
Year: 2015 PMID: 25958312 PMCID: PMC4470070 DOI: 10.1186/s12864-015-1578-1
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Statistics of sequenced chromosome 5 BACs
| Statistic | Value |
|---|---|
| Total sequence length (bp) | 70,278,808 |
| Minimum BAC sequence length (bp) | 14,260 |
| Maximum BAC sequence length (bp) | 216,437 |
| Average BAC sequence length (bp) | 122,651 |
| N50 BAC sequence length (bp) | 129,138 |
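The N50 in this table is a length-weighted median: the length L such that sequences of length L or longer together make up at least half of the total sequence. The sketch below shows one common way to compute it. The function name `n50` and the example lengths are illustrative assumptions; the individual BAC lengths are not listed here, so the example does not reproduce the reported value.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths (in bp)."""
    if not lengths:
        raise ValueError("need at least one sequence length")
    half_total = sum(lengths) / 2
    running = 0
    # Walk from the longest sequence down until half the total is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= half_total:
            return length


# Hypothetical BAC lengths in bp, for illustration only.
example_lengths = [216_437, 150_000, 129_138, 110_000, 90_000, 14_260]
print(n50(example_lengths))
```

Because the N50 weights each sequence by its length, it is less sensitive to many short sequences than the plain average is.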
Chromosome 5 BAC tiling path sequence and AFLP marker statistics
| Statistic | Value |
|---|---|
| Minimum length (BACs) | 1 |
| Maximum length (BACs) | 25 |
| Average length (BACs) | 5.36 |
| N50 length (BACs) | 8 |
| Minimum length (bp) | 69,329 |
| Maximum length (bp) | 2,803,458 |
| Average length (bp) | 518,193 |
| N50 length (bp) | 748,401 |
| BAC tiling paths with AFLP markers | 91 |
| Number of AFLP markers in tiling paths (a) | 211 |
| Average number of AFLP markers per tiling path | 2.32 |
| Minimum AFLP markers per tiling path | 1 |
| Maximum AFLP markers per tiling path | 10 |
| N50 number of AFLP markers per tiling path | 3 |
(a) Includes 9 markers identified from the sequence data.
Figure 1. Alignment of the potato chromosome 5 genetic, cytogenetic and sequence maps. A. Chr-5 genetic map of genotype RH [16]. The map is divided into 78 bin segments, whose gray-scale intensity corresponds to the number of AFLP markers per bin. Bin 46 has the highest marker density (174 markers) and contains the centromere. B. Digitally stretched cytogenetic map of RH pachytene Chr-5. Intense DAPI staining (white) marks the pericentromeric heterochromatin. Coloured foci mark the FISH positions of 35 BAC clones from the RH sequence tiling paths. For selected BAC clones, the connections to the RH genetic (A) and sequence (C) maps are shown. C. Alignment of the RH and DM genomic sequences. Positions of RH BAC tiling path sequences of haplotypes {0} and {1} are shown as green and red blocks, respectively, along the DM pseudomolecule sequence map (dark violet) of Chr-5. In the central heterochromatin, RH BAC MTPs whose exact position is unknown are placed in arbitrary order and are shown in lighter colours. Likewise, DM sequence scaffolds without alignment to the RH sequences are shown in a lighter colour. The DM superscaffold sequences are marked only by their ID numbers, e.g. sequence block 103 at the start of Chr-5 is superscaffold PGSC0003DMB000000103 [18]. D. Model for the distribution of homozygous and polymorphic regions on RH Chr-5. E. Classification of sequence collinearity in overlap regions between RH and DM sequences. RH {0} vs RH {1} is the comparison between both sequence haplotypes in RH. RH {0} vs DM and RH {1} vs DM are comparisons of RH haplotype {0} and {1} sequences with the DM reference sequence.
Quantitative distribution of the non-redundant BAC MTP sequences across regions of chromosome 5
| Sequence | | | | | | Total |
|---|---|---|---|---|---|---|
| RH homozygous | 1,364,250 | | | | 1,300,000 (a) | 2,664,250 |
| RH unknown | | | 132,838 | | | 132,838 |
| RH haplotype {0} | 4,884,186 | 4,819,059 | 14,443,961 | 1,193,223 | 2,552,327 | 27,892,756 |
| RH haplotype {1} | 5,519,970 | 4,328,956 | 12,886,551 | 578,280 | 1,443,101 | 24,756,858 |
| RH totals | 11,768,406 | 9,148,015 | 27,463,350 | 1,771,503 | 5,295,428 | 55,446,702 |
| DM sequence | 9,973,565 | 12,734,159 | 17,195,885 | 1,791,234 | 7,825,315 | 49,520,158 |
(a) Estimated length of homozygous sequence incorporated in haplotype {0} MTP 6915 and haplotype {1} MTP 6844.
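The figures in this table can be cross-checked arithmetically: each row should add up to its last cell, and the four RH rows should add up, column by column, to the "RH totals" row. The sketch below does that check. The column headers are not available here, so the placement of the two sparse rows (RH homozygous in the first and fifth value columns, RH unknown in the third) is an inference from where the sums balance. The helper names are hypothetical.

```python
# Values transcribed from the table; None marks an empty cell.
# The last element of each row is the row total.
rh_rows = {
    "RH homozygous":    [1_364_250, None, None, None, 1_300_000, 2_664_250],
    "RH unknown":       [None, None, 132_838, None, None, 132_838],
    "RH haplotype {0}": [4_884_186, 4_819_059, 14_443_961, 1_193_223, 2_552_327, 27_892_756],
    "RH haplotype {1}": [5_519_970, 4_328_956, 12_886_551, 578_280, 1_443_101, 24_756_858],
}
rh_totals = [11_768_406, 9_148_015, 27_463_350, 1_771_503, 5_295_428, 55_446_702]
dm_row = [9_973_565, 12_734_159, 17_195_885, 1_791_234, 7_825_315, 49_520_158]


def column_sums(table):
    """Sum each column over all rows, treating empty cells as zero."""
    width = len(next(iter(table.values())))
    return [sum(cells[i] or 0 for cells in table.values()) for i in range(width)]


def row_is_consistent(cells):
    """True if the region values add up to the final (total) cell."""
    return sum(c or 0 for c in cells[:-1]) == cells[-1]


print(column_sums(rh_rows) == rh_totals)
print(all(row_is_consistent(cells) for cells in rh_rows.values()))
print(row_is_consistent(dm_row))
```

All three checks hold with the cell placement given above, which is what supports that placement.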
Figure 2. Detailed cytogenetic map of RH chromosome 5. This map was created by simultaneous fluorescence in situ hybridization of 35 BAC clones from the sequence MTPs to a pachytene chromosome spread. Intense DAPI staining of the condensed DNA of the pericentromeric heterochromatin is visible as white background to the colored BAC fluorescence signals. An asterisk marks the centromeric constriction. Subtelomeric BAC clone RH042N03 (yellow) produced a second fluorescence signal at the south end of Chr-5 (labelled in brackets). Eighty-three percent of clone RH042N03 consists of the CL14 subtelomeric repeat [73,74], and this ectopic hybridisation signal is most likely caused by the presence of a similar subtelomeric repeat at the south terminus of Chr-5. Tiling path and marker data for the hybridized BACs are listed in Table 4.
Physical and genetic locations of BAC clones hybridized to potato chromosome 5
| Chromosome region (a) | BAC clone | | Colour | Physical location (b) | BAC MTP | Haplotype (e) | | AFLP marker | Bin (f) |
|---|---|---|---|---|---|---|---|---|---|
| 5L (EC) | RH042N03 | 1,2 | Yellow | 0.40 ± 0.06 | 4846 | 0 | 20179 | EAAGMACC_272 | 4 |
| 5L (EC) | RH201A02 | 1,2 | Blue | 1.14 ± 0.30 | 1070 | 0 | | | (4) |
| 5L (EC) | RH081B09 | 1,2 | Green | 2.70 ± 0.42 | 2129 | 0 | 7241 | EAACMCAG_149.7 | 4 |
| 5L (EC) | RH199B23 | 1,2 | Yellow | 3.20 ± 0.66 | 3171 | - | | | |
| 5L (EC) | RH202I20 | 1,2 | Red | 5.00 ± 0.42 | 6457 | 0 | 7202 | EAGGMAGA_385 | 1-4 |
| 5L (EC) | RH133F12 | 1,2 | Magenta | 5.88 ± 0.35 | 25 | 0 | 7235 | EATCMCTC_236.2 | 4 |
| 5L (EC) | RH043P07 | 1,2 | Green | 7.01 ± 0.33 | 941 | 1 | | | (12) |
| 5L (EC) | RH133I18 | 1,2 | Blue | 8.52 ± 0.37 | 382 | 0 | 7260 | PGA/MATG_101.5 | 12 |
| 5L (EC) | RH085N22 | 1,2 | Magenta | 8.66 ± 0.76 | 1015 | 1 | 7265 | EACAMACC_361.6 | 16 |
| 5L (EC) | RH052G17 | 1,2 | Red | 10.07 ± 0.85 | 2478 | 0 | 20145 | EAACMCTT_435H | 16-17 |
| 5L (EC) | RH095I08 | 1,2,3 | Green/Blue | 12.71 ± 1.15 | 1015 | 1 | 7280 | EAACMAGG_167 | 21 |
| 5L (EC) | RH052M07 | 1,2 | Yellow | 13.46 ± 1.13 | 903 | 0 | 7287 | CAAGMCAT_174.6 | 26-27 |
| 5L (EC) | RH035K21 | 1,2 | Blue | 16.64 ± 2.34 | (c) | | | | |
| 5L (EC) | RH082N16 | 1,2 | Magenta | 16.64 ± 2.34 | 1382 | M | | | |
| 5L (EC) | RH196F10 | 1,2 | Red | 19.51 ± 1.25 | 178 | M | | | |
| 5L (EC) | RH076O08 | 1,2 | Blue | 20.66 ± 1.32 | 799 | 1 | 20640 | EAGAMACC_230 | 37 |
| 5L (EC) | RH103P18 | 1,2 | Green | 21.76 ± 1.04 | 1841 | M | | | |
| 5L (EC) | RH078E16 | 1,2 | Red | 23.02 ± 1.27 | 1800 | 1 | 7296 | EAGGMACA_500 | 37 |
| 5L (EC) | RH085C18 | 1,2 | Blue | 25.21 ± 2.02 | 1114 | 0 | | | (37) |
| 5L (HC) | RH167O23 | 1,2 | Yellow | 24.76 ± 0.51 | 3004 | 1 | 11894 | EATCMCAG_17H | 37 |
| 5L (HC) | RH176P24 | 1,2 | Magenta | 25.21 ± 2.02 | 672 | 0 | | | (46–47) |
| 5L (HC) | RH017I02 | 1,2 | Red | 25.67 ± 2.33 | 672 | 0 | | | (46–47) |
| 5L (HC) | RH048J07 | 1,2 | Yellow | 35.34 ± 2.78 | 1685 | 1 | 12427 | EAGGMAAG_2H | 47 |
| Centromere | | 1,2,3 | Asterisk | 41.67 | | | | | |
| 5S (HC) | RH056O13 | 1,2 | Magenta | 45.11 ± 2.68 | 5325 | 1 | 11936 | EACAMACC_270.9H | 46-47 |
| 5S (HC) | RH071D16 | 1,2 | Green | 46.78 ± 2.45 | 3424 | 0 | 7590 | EAGTMAGC_997 | 46 |
| 5S (HC) | RH102K09 | 3 | Green | 47.86 ± 2.38 | 1645 | 1 | 12129 | EAGAMCCT_586.0H | 46 |
| 5S (HC) | RH138C23 | 3 | Magenta | 47.93 ± 2.00 | 1645 | 1 | | | (46–47) |
| 5S (HC) | RH095M08 | 3 | Yellow | 51.94 ± 2.13 | 1054 | 0 | 7392 | EAAGMCAT_15 | 45 |
| 5S (HC) | RH193O24 | 3 | Red | 52.03 ± 2.45 | 1054 | 0 | | | (47) |
| 5S (HC) | RH013O07 | 1,2 | Red | 57.28 ± 1.74 | (d) | | | | (46–47) |
| 5S (EC) | RH089A21 | 1,2 | Blue | 60.68 ± 1.87 | 6472 | 1 | 12520 | EACTMCTA_188.9H | 55 |
| 5S (EC) | RH044A21 | 1,2 | Magenta | 68.57 ± 1.80 | 4210 | 0 | 7626 | EAACMCTC_205 | 60 |
| 5S (EC) | RH136O01 | 1,2 | Green | 70.33 ± 1.47 | 6915 | 0 | 7629 | EAAGMCTT_144.9 | 65 |
| 5S (EC) | RH028L14 | 1,2 | Blue | 71.12 ± 1.98 | 6915 | 0 | | | |
| 5S (EC) | RH144F10 | 1,2 | Yellow | 72.16 ± 1.14 | 6915 | 0 | 7631 | CATAMCCA_322.3 | 66-69 |
| 5S (EC) | RH060F21 | 1,2 | Green | 73.51 ± 0.84 | 6844 | M | | | |
| 5S (EC) | RH093O07 | 1,2 | Yellow | 74.36 ± 1.14 | 429 | 0 | | | |
| 5S (EC) | RH089M16 | 1,2 | Magenta | 75.42 ± 1.17 | 6844 | 1 | 12569 | EATCMCTC_27H | 78 |
| 5S (EC) | RH075N11 | 1,2 | Red | 76.70 ± 0.93 | 648 | 0 | 7651 | EAGTMCCA_208.5 | 78 |
(a) 5L = long arm; 5S = short arm; EC = euchromatin; HC = heterochromatin.
(b) Physical location (= fraction length) was calculated as (S/T) × binT, where S = the distance in μm from the FISH hybridization site to the north end of Chr-5, T = the total length of Chr-5 in μm, and binT = the total bin value (78) of Chr-5.
(c) Unsequenced BAC clone aligning to DM superscaffold PGSC0003DMB000000210.
(d) Unsequenced BAC clone with marker GP188 and aligning to DM superscaffold PGSC0003DMB000000328.
(e) ‘-’ = haplotype undetermined; M = monomorphic region (i.e. no difference between haplotypes).
(f) Bin numbers in brackets are from AFLP anchors in adjacent BAC clones in the RH physical map. Bin values for AFLP markers can deviate slightly from the true bin value due to missing scores, e.g. in practice markers mapped to the bin 45–47 region all belong to the bin 46 segment of the genetic map.
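Footnote (b) rescales a cytogenetic distance onto the 0 to 78 bin scale of the genetic map, so that FISH positions and bin numbers can be read side by side. A minimal sketch of that conversion follows. The function names and the example measurements (a 60 μm chromosome with a signal at 30 μm) are hypothetical, since the raw S and T values are not given here.

```python
BIN_TOTAL = 78  # total bin value of Chr-5, from footnote (b)


def physical_location(distance_um, total_length_um, bin_total=BIN_TOTAL):
    """Scale a FISH position (S/T) to bin units: (S / T) * binT."""
    if total_length_um <= 0:
        raise ValueError("total chromosome length must be positive")
    if not 0 <= distance_um <= total_length_um:
        raise ValueError("distance must lie within the chromosome")
    return distance_um / total_length_um * bin_total


def fraction_from_location(location, bin_total=BIN_TOTAL):
    """Invert the scaling: recover S/T from a value on the bin scale."""
    return location / bin_total


# Hypothetical measurement: signal 30 um from the north end of a 60 um chromosome.
print(physical_location(30, 60))
# Centromere position from the table (41.67) expressed as a fraction of chromosome length.
print(round(fraction_from_location(41.67), 3))
```

On this scale the centromere value of 41.67 corresponds to a little over half of the chromosome length measured from the north end.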
Figure 3. Dot plot alignment between RH and DM in a gene cluster in the north euchromatin of chromosome 5. A. Alignment of haplotype {0} BAC MTP 25 versus the DM reference sequence shows collinearity throughout a 150 kb region with sequence duplications. The position of BAC clone RH133F12 (magenta) is shown in MTP 25 {0}. This clone was used for in situ hybridization in Figures 1B and 2. B. At the same chromosome location, the RH haplotype {1} MTP 2322 has no collinearity to DM in the duplication region. This duplication region caused problems in the DM genome assembly, and the DM sequence used here for alignment is a concatenation of sequence fragments from four superscaffolds (410, 1176, 251 and 51), from which 150 kb of pseudomolecule gap spacing sequence has been removed.
Figure 4. Regions with substantial loss of sequence collinearity in the proximal part of the north euchromatin of chromosome 5. A. Alignment of RH BAC MTP 178 {−} versus the DM reference sequence. The grey zone marks 50 kb of spacer sequence that fills a gap of unknown length between superscaffolds 540 and 216 in the DM pseudomolecule sequence. The position of BAC clone RH196F10 is shown in MTP 178 {−}. This clone was used for in situ hybridization in Figures 1B and 2. B. Alignment of RH haplotype {1} BAC MTP 799 versus the DM reference sequence. The position of BAC clone RH076O08 (blue) is shown in MTP 799 {1}. This clone was used for in situ hybridization in Figures 1B and 2. C. RH haplotype {0} BAC MTP 2540 versus the same DM region as in (B). The dark boxes in alignment (C), which are also partly visible in alignment (B), are caused by a 180 bp tandem repeat.
Figure 5. Examples of large breaks in sequence collinearity in the heterochromatin borders. A. Alignment of RH haplotype {0} MTP 1058 to the DM reference genome. The start of MTP 1058 {0} is missing in DM. The yellow band indicates an unsequenced region in MTP 1058 {0}, estimated to be 80 kb in length. B. Alignment of RH haplotype {1} MTP 1685 to the same section of the DM reference genome. Alignment gaps of 145 kb and 164 kb to DM are found at different positions compared to (A). The position of BAC clone RH048J07 (yellow) is shown in MTP 1685 {1}. This clone was used for in situ hybridization in Figures 1B and 2. C. Alignment of the two RH haplotypes shows two large inserts of 188 kb and approximately 266 kb in MTP 1058 {0}. The position of BAC clone RH048J07 (yellow) is shown in MTP 1685 {1}. This clone was used for in situ hybridization in Figures 1B and 2. D. The terminal 270 kb of RH MTP 3074 {0} from the south arm has no overlap with the DM sequence. The remaining part of MTP 3074 {0} has a very fragmented collinearity with DM, which disappears in the boxed area. This box marks a sequence duplication region that contains, among others, five DNA repair helicase genes. The 200 kb scale bar applies to all figure panels.
Quantification of overlaps between unaligned sequences in the central heterochromatin of chromosome 5
| | Number | Length | RH haplotype {0} | RH haplotype {1} | DM |
|---|---|---|---|---|---|
| RH haplotype {0} MTPs | 24 | 12764 | -------- | 2550 | 4608 |
| RH haplotype {1} MTPs | 13 | 10004 | 25.5 (a) | -------- | 2104 |
| DM superscaffolds | 10 | 15192 | 36.0 (b) | 21.0 (b) | -------- |
(a) Percent phase 1 sequence recovered in phase 0 sequence.
(b) Percent RH sequence recovered in DM.
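The percentages in footnotes (a) and (b) appear to follow from dividing each pairwise overlap by the total length of the sequence set being recovered. The sketch below works through that arithmetic. It assumes, since the column headers are not available here, that the second numeric column is the total unaligned length of each set and that the values above the diagonal are pairwise overlap lengths in the same unit. The helper name `percent_recovered` is hypothetical.

```python
# Assumed reading of the table: total length per sequence set and
# pairwise overlap lengths, all in the same (unstated) unit.
totals = {"RH {0}": 12764, "RH {1}": 10004, "DM": 15192}
overlaps = {
    ("RH {0}", "RH {1}"): 2550,
    ("RH {0}", "DM"): 4608,
    ("RH {1}", "DM"): 2104,
}


def percent_recovered(query, subject):
    """Percent of the query set's length that overlaps the subject set."""
    key = (query, subject) if (query, subject) in overlaps else (subject, query)
    return 100 * overlaps[key] / totals[query]


for query, subject in [("RH {1}", "RH {0}"), ("RH {0}", "DM"), ("RH {1}", "DM")]:
    print(f"{query} recovered in {subject}: {percent_recovered(query, subject):.1f}%")
```

Under these assumptions the computed ratios agree with the tabulated 25.5, 36.0 and 21.0 to within about 0.1 percentage point; the small residual for the RH {0} versus DM pair may reflect rounding of the underlying lengths.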
Figure 6. Dot-plot alignment of putative allelic RH BAC tiling paths from the central heterochromatin. RH MTP 1054 of haplotype {0} has fragmented collinearity with RH MTP 1645 of haplotype {1} over a length of 350 kb of sequence, suggesting that these MTPs are allelic. The BAC clones that make up these tiling paths are depicted on the axes. Four BAC clones (colored) have been selected for cytogenetic mapping by FISH in order to verify this possible allelism. The blue ID numbers on the axes correspond to AFLP markers from bin 46 of the genetic map, which anchor these sequenced BACs to the corresponding haplotypes of RH Chr-5.
Figure 7. Pachytene BAC FISH verification of putative allelism of MTPs 1054 and 1645 from the central heterochromatin of chromosome 5. BAC clones RH095M08 and RH193O24 from the haplotype {0} MTP 1054 map to a cytogenetic position that is different from the location of clones RH102K09 and RH138C23 from haplotype {1} MTP 1645. This means that these two MTPs are not allelic, despite their partial sequence overlap. Clone RH102K09 gives an additional signal at MTP 1054 {0}, which is presumably cross-hybridisation to sequence from the overlap area in MTP 1054 {0} that is not covered by hybridisation with the shorter clone RH095M08. Clone RH095I08 marks the north euchromatic arm of Chr-5.