
Structure, evolution, and comparative genomics of tetraploid cotton based on a high-density genetic linkage map.

Ximei Li1, Xin Jin2, Hantao Wang2, Xianlong Zhang2, Zhongxu Lin3.   

Abstract

A high-density linkage map was constructed using 1,885 newly obtained loci and 3,747 previously published loci; the resulting map included 5,152 loci with a total length of 4696.03 cM and a mean distance of 0.91 cM between loci. Homology analysis in the cotton genome further confirmed the 13 expected homologous chromosome pairs and revealed an obvious inversion on Chr10 or Chr20 and repeated inversions on Chr07 or Chr16. In addition, two reciprocal translocations, between Chr02 and Chr03 and between Chr04 and Chr05, were confirmed. Comparative genomics between the tetraploid cotton and the diploid cottons showed that no major structural changes exist between DT and D chromosomes, whereas such changes do exist between AT and A chromosomes. Blast analysis between the tetraploid cotton genome and the mixed genome of two diploid cottons showed that most AD chromosomes, regardless of whether they came from the AT or DT genome, preferentially matched the corresponding homologous chromosome in the diploid A genome, and then the corresponding homologous chromosome in the diploid D genome, indicating that the diploid D genome underwent converted evolution by the diploid A genome to form the DT genome during polyploidization. In addition, the results indicated that a series of chromosomal translocations occurred among Chr01/Chr15, Chr02/Chr14, Chr03/Chr17, Chr04/Chr22, and Chr05/Chr19.
© The Author 2016. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.


Keywords:  comparative genomics; evolution; genetic linkage map; genome structure; tetraploid cotton


Year:  2016        PMID: 27084896      PMCID: PMC4909315          DOI: 10.1093/dnares/dsw016

Source DB:  PubMed          Journal:  DNA Res        ISSN: 1340-2838            Impact factor:   4.458


Introduction

Cotton (Gossypium) is an important cash crop and the leading source of textile fibre; G. hirsutum and G. barbadense account for 90 and 8% of world cotton production, respectively.[1] Genome structure analysis of allotetraploid cotton is therefore important, but the allotetraploid (2n = 4x = 52) species has a large genome of ∼2,246 Mb,[2] which makes it especially difficult to generate a reference genome sequence. As a result, a high-density genetic linkage map is an alternative tool for revealing the genome structure and chromosomal architecture of allotetraploid cotton.

Currently, there are four comparatively dense linkage maps in cotton based on experimental data. Rong et al.[3] (2,584 loci, 4447.9 cM in length with an average distance of 1.72 cM) revealed all 13 expected homologous chromosome pairs and reported that there were no major structural changes between DT and D chromosomes, but two reciprocal translocations between AT and A chromosomes and several inversions. Guo et al.[4] (2,247 loci, 3440.4 cM in length with an average distance of 1.58 cM) again reported the translocations that occurred between some chromosomes (Chr02 and Chr03, Chr04 and Chr05), providing a glimpse of cotton genome complexity. Based on a genome-wide simple sequence repeat (SSR) genetic map (2,316 loci, 4418.9 cM in length with an average distance of 1.91 cM) constructed in our laboratory,[5] 21 segregation distortion regions (SDRs) were found, and 3 segregation distorted chromosomes (Chr02, Chr16, and Chr18) were identified with 99.9% of distorted markers segregating towards the heterozygous allele. A genetic linkage map (2,072 loci, 3379.9 cM in length with an average distance of 1.63 cM) constructed by Yu et al.[6] showed that the allotetraploid cotton genome produced equivalent recombination frequencies in its two genomes and revealed that the genetically smallest homologous chromosome pair was Chr04 and Chr22, and the largest was Chr05 and Chr19.

Just prior to this publication, a sequence-based interspecific genetic map composed of 4,999,048 SNP loci was reported, which contains only 4,049 recombination bins and covers 4,042 cM with an average interbin genetic distance of 1.0 cM.[7] First, this map played a role in genomic assembly of allotetraploid cotton. Second, some structural variations were detected not only in the AT genome but also in the DT genome, including 15 simple translocations reported for the first time. Third, centromeric regions of tetraploid cotton were predicted. All of the above genetic maps played fundamental roles in understanding the cotton genome structure and in studying cotton evolutionary genomics.

With the release of the G. raimondii genome sequence, genetic relationship analysis showed that G. raimondii and Theobroma cacao belong to a common subclade.[8] In addition, the two genomes possess a moderate syntenic relationship, with 463 collinear blocks (with ≥5 genes per block) covering 64.8 and 74.41% of the assembled G. raimondii and T. cacao genomes, respectively.[8] Subsequently, the G. arboreum genome sequence was released,[9] and a close collinear relationship between G. arboreum and T. cacao was also discovered. Moreover, collinearity analysis between G. raimondii and G. arboreum revealed that chromosomes 1, 4–6, and 9–13 were highly collinear, whereas large-scale rearrangements were observed on chromosomes 2 and 3 of G. raimondii, and deletions/insertions were observed on chromosomes 7 and 8 of G. arboreum. In 2015, the G. hirsutum genome sequence was released,[10] which showed a conserved order between the ATDT genome and the already sequenced diploid genomes.[8,9] However, another version of the ATDT genome (G. hirsutum)[11] showed that such collinearity was not obvious with either G. arboreum[9] or G. raimondii.[8] Instead, it is largely conserved with another version of the D-progenitor genome (G. raimondii).[12] In addition, the overall gene order and collinearity are largely conserved between the AT and DT genomes, although at least 9 translocations and 28 inversions were identified.

In the present study, a high-density genetic linkage map including 5,152 loci for tetraploid cotton was constructed mainly based on SSR markers. With this map, we aimed to reveal the cotton genome structure by homologous chromosome comparisons and to explain the polyploidization of tetraploid cotton by comparing this map with two sequenced progenitor diploid cottons.

Materials and methods

Plant materials

Gossypium hirsutum cv. Emian22 and G. barbadense acc. 3–79 were used to detect polymorphisms of all the new markers. Emian22 is a high-yield cultivar with moderate fibre quality and no resistance to verticillium wilt, whereas 3–79 is the genetic and cytogenetic standard line for G. barbadense, with superior fibre quality and high resistance to verticillium wilt. To improve the performance of Emian22 by backcrossing and molecular marker-assisted selection, a cross between these two materials was performed. Subsequently, the BC1 population [(Emian22 × 3–79) × Emian22], comprising 141 progeny, was used as the mapping population; the same population had been used to construct a 2,316-locus map.[5]

Molecular markers

During map construction in this study, 5,299 new primer pairs were applied, including 4,569 newly published SSRs (2,937 MON-prefixed, 664 NBRI-prefixed, 670 CCRI-prefixed, 200 HAU-prefixed, and 98 NAU-prefixed; http://www.cottongen.org/) and 730 other batches of primers (579 GhirPIP-prefixed,[13] 115 cg-prefixed,[14] and 36 cot-prefixed[14]).

Genotyping analysis

Polymorphism detection of all the new SSRs was performed as previously described.[15] For the SSRs that remained monomorphic, single-strand conformation polymorphism (SSCP) analysis was applied, following the method described by Li et al.[16] For the other three batches of primers (GhirPIP-prefixed, cg-prefixed, and cot-prefixed), SSCP analysis was the only genotyping method used. Subsequently, the whole population was genotyped with the polymorphic primers under the corresponding conditions. All DNA fragments were detected with silver staining.

Map construction

The mapping data for each BC1 individual were scored according to the conventions of JoinMap 3.0.[17] For each segregating marker, a χ2 analysis was performed to determine whether it deviated from the expected 1 : 1 segregation ratio. During map construction, the logarithm of odds (LOD) threshold was ≥8.0, and the maximum recombination rate was 0.4. Map distances in centiMorgans (cM) were calculated using the Kosambi mapping function.[18] In the resulting linkage map, a region with at least three adjacent loci showing significant segregation distortion (P < 0.05) was defined as an SDR.[19]
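The two calculations named in this subsection can be illustrated with a minimal sketch. It is not the JoinMap implementation. The function names and the example genotype counts are hypothetical; only the 1 : 1 expectation, the P < 0.05 cut-off, the population size of 141, and the use of the Kosambi function come from the text.

```python
import math

def chi_square_1to1(n_a, n_b):
    """Chi-square test (1 d.f.) against a 1:1 segregation ratio.

    Returns (chi2, p_value). For 1 d.f. the upper-tail probability of the
    chi-square distribution equals erfc(sqrt(chi2 / 2)).
    """
    expected = (n_a + n_b) / 2.0
    chi2 = ((n_a - expected) ** 2 + (n_b - expected) ** 2) / expected
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

def kosambi_cm(r):
    """Kosambi map distance in cM for a recombination fraction 0 <= r < 0.5."""
    if not 0.0 <= r < 0.5:
        raise ValueError("recombination fraction must be in [0, 0.5)")
    return 25.0 * math.log((1.0 + 2.0 * r) / (1.0 - 2.0 * r))

# Hypothetical marker scored in 141 BC1 plants (counts are made up):
# 85 plants carry one genotype class and 56 carry the other.
chi2, p = chi_square_1to1(85, 56)
is_distorted = p < 0.05          # the P < 0.05 criterion used in the text
distance = kosambi_cm(0.10)      # a recombination fraction of 0.10, in cM
```

The Kosambi function gives 0 cM at r = 0 and grows without bound as r approaches 0.5, so a cap on the recombination rate (0.4 in this study) keeps only pairs with a finite, moderate distance.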

Collinearity and comparative genomic analysis

Based on the high-density genetic linkage map and the sequences corresponding to the markers, collinearity analysis between homologous chromosome pairs was performed using a BLASTN search with E ≤ 1e−5, identity ≥80%, and matched length ≥200 bp. Next, the best hit for each marker was chosen, and all the best hits were visualized using the Circos drawing tools (http://circos.ca/). Comparative genomic analysis between tetraploid cotton and diploid cottons was performed using a similar method.
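The filtering step described above can be sketched as follows. Several things here are assumptions rather than details from the paper: that the BLASTN results are in the standard 12-column tabular format (`-outfmt 6`), that "matched length" means the alignment length column, and that the "best hit" is the passing hit with the highest bit score. The marker and chromosome names in the example are hypothetical.

```python
from collections import namedtuple

Hit = namedtuple("Hit", "query subject identity length evalue bitscore")

def parse_outfmt6(lines):
    """Yield Hit records from BLAST tabular lines (12 default columns assumed)."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        f = line.split("\t")
        # Columns: qseqid sseqid pident length mismatch gapopen
        #          qstart qend sstart send evalue bitscore
        yield Hit(f[0], f[1], float(f[2]), int(f[3]), float(f[10]), float(f[11]))

def best_hits(hits, max_evalue=1e-5, min_identity=80.0, min_length=200):
    """Keep hits passing the thresholds from the text; one best hit per query."""
    best = {}
    for h in hits:
        if h.evalue > max_evalue or h.identity < min_identity or h.length < min_length:
            continue
        # Assumption: "best" means the highest bit score among passing hits.
        if h.query not in best or h.bitscore > best[h.query].bitscore:
            best[h.query] = h
    return best

# Hypothetical example records (not data from the study).
example = [
    "markerA\tChr15\t95.0\t350\t10\t2\t1\t350\t100\t449\t1e-120\t520.0",
    "markerA\tChr01\t97.5\t400\t8\t1\t1\t400\t900\t1299\t1e-150\t650.0",
    "markerB\tChr20\t78.0\t500\t90\t5\t1\t500\t10\t509\t1e-60\t300.0",
    "markerC\tChr07\t92.0\t150\t10\t1\t1\t150\t10\t159\t1e-40\t200.0",
]
result = best_hits(parse_outfmt6(example))
```

In this example markerB fails the identity threshold and markerC fails the length threshold, so only markerA receives a best hit.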

Results

Marker information

To construct a high-density genetic linkage map that meets the demands of cotton genetics and breeding, 200 novel EST-SSRs (HAU3599-HAU3798) were developed in this study. In detail, 3,647 unique sequences were obtained from a normalized adversity cDNA library of G. barbadense acc. Hai7124 and were searched against the cotton EST database (posted date: 22 March 2009, with a total of 375,349 sequences) with an E-value cut-off of <1e−10. As a result, 255 had no BLAST hits to known sequences. Subsequently, a total of 200 novel EST-SSRs were developed. During map construction, traditional genotyping analysis identified 26 (13.00%) polymorphic primer pairs generating 26 polymorphic loci, and SSCP analysis of the remaining monomorphic markers identified 14 (8.05%) polymorphic primer pairs generating 16 polymorphic loci. In total, 40 (20.00%) HAU-prefixed EST-SSRs showed polymorphism and generated 42 polymorphic loci, an average of 1.05 loci per polymorphic primer pair.

Among the 670 CCRI-prefixed and 98 NAU-prefixed EST-SSRs, traditional genotyping analysis identified 75 (11.19%) and 66 (67.35%) polymorphic EST-SSRs generating 78 and 79 polymorphic loci, respectively. SSCP analysis was then applied to the remaining monomorphic markers; 67 (11.26%) and 11 (34.38%) were polymorphic, with 73 and 11 polymorphic loci, respectively. In total, 142 (21.19%) CCRI-prefixed and 77 (78.57%) NAU-prefixed EST-SSRs were polymorphic and generated 151 and 90 polymorphic loci, an average of 1.06 and 1.17 loci per polymorphic primer pair, respectively.

The 2,937 MON-prefixed SSRs consist of 2,521 gSSRs and 416 EST-SSRs. After traditional genotyping analysis, 879 (34.87%) gSSRs and 129 (31.01%) EST-SSRs showed polymorphism and generated 1,005 and 140 polymorphic loci, respectively. SSCP analysis was then conducted on the remaining monomorphic markers; 126 (7.67%) gSSRs and 12 (4.18%) EST-SSRs showed polymorphism and generated 138 and 13 polymorphic loci, respectively. In total, 1,005 (39.87%) MON-prefixed gSSRs and 141 (33.89%) MON-prefixed EST-SSRs showed polymorphism and generated 1,143 and 153 polymorphic loci, an average of 1.14 and 1.09 loci per polymorphic primer pair, respectively. Taking the 2,937 MON-prefixed SSRs as a whole, 1,146 (39.02%) primers were polymorphic and generated 1,296 polymorphic loci, an average of 1.13 loci per polymorphic primer pair.

The 664 NBRI-prefixed SSRs consist of 263 gSSRs and 401 EST-SSRs. After traditional genotyping analysis, 85 (32.32%) gSSRs and 122 (30.42%) EST-SSRs showed polymorphism and generated 97 and 136 polymorphic loci, respectively. SSCP analysis was then performed on the remaining monomorphic markers; 11 (6.18%) gSSRs and 15 (5.38%) EST-SSRs showed polymorphism and generated 11 and 17 polymorphic loci, respectively. In total, 96 (36.50%) NBRI-prefixed gSSRs and 137 (34.16%) NBRI-prefixed EST-SSRs showed polymorphism and generated 108 and 153 polymorphic loci, an average of 1.13 and 1.12 loci per polymorphic primer pair, respectively. Taking the 664 NBRI-prefixed SSRs as a whole, 233 (35.09%) primers showed polymorphism and generated 261 polymorphic loci, an average of 1.12 loci per polymorphic primer pair.

For the 579 GhirPIP-prefixed, 115 cg-prefixed, and 36 cot-prefixed primer pairs, SSCP analysis was applied directly. As a result, 27 (4.66%) GhirPIP-prefixed, 13 (11.30%) cg-prefixed, and 5 (13.89%) cot-prefixed primer pairs showed polymorphism, and each polymorphic primer pair produced one polymorphic locus.
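As a cross-check on the counts above, the short sketch below recomputes the polymorphism rate and the number of loci per polymorphic primer pair for each marker set, then sums the primer pairs and loci. The counts are transcribed from this section; the dictionary layout and function name are illustrative only.

```python
# (primer pairs screened, polymorphic primer pairs, polymorphic loci)
MARKER_SETS = {
    "HAU EST-SSR":  (200, 40, 42),
    "CCRI EST-SSR": (670, 142, 151),
    "NAU EST-SSR":  (98, 77, 90),
    "MON gSSR":     (2521, 1005, 1143),
    "MON EST-SSR":  (416, 141, 153),
    "NBRI gSSR":    (263, 96, 108),
    "NBRI EST-SSR": (401, 137, 153),
    "GhirPIP":      (579, 27, 27),
    "cg":           (115, 13, 13),
    "cot":          (36, 5, 5),
}

def summarize_sets(sets):
    """Return {name: (polymorphism %, loci per polymorphic primer pair)}."""
    out = {}
    for name, (screened, polymorphic, loci) in sets.items():
        rate = 100.0 * polymorphic / screened
        loci_per_pair = loci / polymorphic
        out[name] = (round(rate, 2), round(loci_per_pair, 2))
    return out

summary = summarize_sets(MARKER_SETS)
total_primer_pairs = sum(s for s, _, _ in MARKER_SETS.values())
total_new_loci = sum(l for _, _, l in MARKER_SETS.values())
```

The two totals reproduce the 5,299 new primer pairs listed under Molecular markers and the 1,885 newly obtained loci used for map construction.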

Map construction and overview

The 1,885 loci obtained in this study were added to the 3,747 loci updated in our laboratory,[5,16,20-27] and a total of 5,632 loci were used for map construction. After calculation, a linkage map with 5,152 loci was constructed, and it was 4696.03 cM in total length and 0.91 cM in mean distance (Supplementary Fig. S1 and Table 1). The chromosomes were built with LODs ranging from 8.0 to 15.0 (Table 1). The AT genome contained 2,473 loci with 2359.36 cM in total length and 0.95 cM in mean distance, whereas the DT genome contained 2,679 loci with 2336.67 cM in total length and 0.87 cM in mean distance.
Table 1.

Characteristics of the linkage map constructed from the BC1 population

Chromosome  LOD   Total loci  Size (cM)  Mean distance (cM)  Largest gap (cM)  Gaps >10 cM
Chr01       10.0  143         186.87     1.31                21.22             1
Chr02       12.0  130         156.03     1.20                10.67             1
Chr03       15.0  169         164.93     0.98                9.49              0
Chr04       12.0  122         149.82     1.23                7.85              0
Chr05       13.0  285         242.76     0.85                9.16              0
Chr06       9.0   165         171.43     1.04                10.41             1
Chr07       14.0  163         105.78     0.65                3.42              0
Chr08       11.0  213         151.02     0.71                9.38              0
Chr09       10.0  180         148.83     0.83                4.15              0
Chr10       10.0  180         200.94     1.12                17.93             1
Chr11       13.0  271         234.77     0.87                6.09              0
Chr12       14.0  241         238.05     0.99                6.26              0
Chr13       10.0  211         208.14     0.99                6.56              0
AT genome   -     2,473       2,359.36   0.95                21.22             4
Chr14       15.0  180         164.12     0.91                9.45              0
Chr15       11.0  215         197.10     0.92                11.05             1
Chr16       14.0  179         94.32      0.53                4.68              0
Chr17       9.0   150         162.23     1.08                22.56             2
Chr18       10.0  188         146.95     0.78                12.18             1
Chr19       13.0  306         252.27     0.82                5.36              0
Chr20       10.0  183         117.61     0.64                8.48              0
Chr21       13.0  265         256.03     0.97                8.85              0
Chr22       8.0   147         169.93     1.16                16.39             1
Chr23       10.0  209         193.19     0.92                12.84             2
Chr24       10.0  225         198.85     0.88                11.81             1
Chr25       12.0  194         172.19     0.89                11.41             1
Chr26       9.0   238         211.90     0.89                14.80             2
DT genome   -     2,679       2,336.67   0.87                22.56             11
Total       -     5,152       4,696.03   0.91                22.56             15

LOD, logarithm of odds.

The chromosome with the most loci was Chr19 (306 loci), whereas Chr04 had the fewest (122 loci); the average was 198 loci per chromosome. The longest chromosome was Chr21 (256.03 cM) and the shortest was Chr16 (94.32 cM), with an average chromosome length of 180.62 cM. The largest average distance between markers was on Chr01 (1.31 cM), and the smallest was on Chr16 (0.53 cM). The largest gap between markers was 22.56 cM on Chr17, and there were 15 gaps >10 cM in total, 4 on the AT genome and 11 on the DT genome.

Among the 5,632 polymorphic loci used for map construction, 1,006 loci (17.86%) showed segregation distortion (P < 0.05). Of these, 776 distorted loci, accounting for 15.06% of the mapped loci, were placed on the map and were unevenly distributed, with 5–125 distorted loci per chromosome (Supplementary Fig. S1). The most distorted loci were on Chr02 (71), Chr16 (125), and Chr18 (106); on each of these chromosomes >50% of loci were distorted, and together they accounted for 38.92% of the mapped distorted loci. A total of 62 SDRs were found on 18 cotton chromosomes. The most SDRs were found on Chr02 (9), Chr16 (11), and Chr18 (10), which were the chromosomes with the most distorted loci.
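The summary figures can be recomputed from the per-chromosome values in Table 1, as in the sketch below. The per-chromosome values are transcribed from Table 1; the variable and function names are illustrative. Two points are worth noting. The reported mean distances match total length divided by the number of loci (rather than by the number of intervals). Summing the rounded per-chromosome lengths reproduces the reported genome lengths only to within a few hundredths of a cM.

```python
# (total loci, size in cM, number of gaps >10 cM), transcribed from Table 1
TABLE1 = {
    "Chr01": (143, 186.87, 1), "Chr02": (130, 156.03, 1),
    "Chr03": (169, 164.93, 0), "Chr04": (122, 149.82, 0),
    "Chr05": (285, 242.76, 0), "Chr06": (165, 171.43, 1),
    "Chr07": (163, 105.78, 0), "Chr08": (213, 151.02, 0),
    "Chr09": (180, 148.83, 0), "Chr10": (180, 200.94, 1),
    "Chr11": (271, 234.77, 0), "Chr12": (241, 238.05, 0),
    "Chr13": (211, 208.14, 0), "Chr14": (180, 164.12, 0),
    "Chr15": (215, 197.10, 1), "Chr16": (179, 94.32, 0),
    "Chr17": (150, 162.23, 2), "Chr18": (188, 146.95, 1),
    "Chr19": (306, 252.27, 0), "Chr20": (183, 117.61, 0),
    "Chr21": (265, 256.03, 0), "Chr22": (147, 169.93, 1),
    "Chr23": (209, 193.19, 2), "Chr24": (225, 198.85, 1),
    "Chr25": (194, 172.19, 1), "Chr26": (238, 211.90, 2),
}

def summarize_map(chromosomes):
    """Total loci, length, mean distance (length / loci), and large gaps."""
    loci = sum(TABLE1[c][0] for c in chromosomes)
    length = sum(TABLE1[c][1] for c in chromosomes)
    gaps = sum(TABLE1[c][2] for c in chromosomes)
    return {"loci": loci, "length_cM": round(length, 2),
            "mean_cM": round(length / loci, 2), "gaps_gt_10cM": gaps}

AT = ["Chr%02d" % i for i in range(1, 14)]    # Chr01-Chr13
DT = ["Chr%02d" % i for i in range(14, 27)]   # Chr14-Chr26
at, dt, whole = summarize_map(AT), summarize_map(DT), summarize_map(AT + DT)

# Share of the 776 mapped distorted loci on Chr02, Chr16, and Chr18.
share_top3 = round(100.0 * (71 + 125 + 106) / 776, 2)
```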

Collinearity in the cotton genome

Based on the 4,807 available marker-derived sequences in the genetic linkage map, homology analysis showed that homologous chromosomes between AT and DT had the highest homology (Fig. 1). In detail, seven homologous chromosome pairs (Chr01 and Chr15, Chr06 and Chr25, Chr08 and Chr24, Chr09 and Chr23, Chr11 and Chr21, Chr12 and Chr26, and Chr13 and Chr18) showed high collinearity. However, Chr10 and Chr20 were collinear over only part of their length (Supplementary Fig. S2A), whereas Chr07 and Chr16 had good homology but hardly any collinearity (Supplementary Fig. S2B). In addition, although most of Chr02 had good collinearity with Chr14, the rest had comparatively good collinearity with Chr17 (Supplementary Fig. S3A). Likewise, most of Chr03 had good collinearity with Chr17, but the rest had comparatively good collinearity with Chr14 (Supplementary Fig. S3B). A similar pattern also appeared among Chr04, Chr05, Chr19, and Chr22 (Supplementary Fig. S3C and D).
Figure 1.

Collinearity among 26 chromosomes in tetraploid cotton. This figure is available in black and white in print and in colour at DNA Research online.


Comparative genomics between the tetraploid cotton and diploid cottons

Using the 4,807 available nucleotide sequences of the 5,152 mapped markers and the genome sequence of G. arboreum or G. raimondii, comparative genomics showed that the sequences of 3,534 and 3,584 markers had homologous matches on the physical maps of G. arboreum and G. raimondii, with alignment proportions of 68.59 and 69.57%, respectively (calculated relative to the 5,152 mapped markers). The 3,534 markers aligned to the G. arboreum genome included 1,717 markers from the AT genome (69.43% of the 2,473 AT loci) and 1,817 markers from the DT genome (67.82% of the 2,679 DT loci). The 3,584 markers aligned to the G. raimondii genome included 1,631 markers from the AT genome (65.95%) and 1,953 markers from the DT genome (72.90%). In general, most chromosomes had good homology with their corresponding chromosomes in the diploid A or D genome (Figs 2 and 3, and Table 2). As exceptions, in the comparison with the diploid A genome, three homologous chromosome pairs, Chr01/Chr15, Chr02/Chr14, and Chr03/Chr17, also showed good homology with A2, A5, and A7, respectively (Fig. 2 and Table 2). In the comparison with the diploid D genome, Chr02, Chr03, Chr04, and Chr05 also showed good homology with D3, D5, D9, and D12, respectively (Fig. 3 and Table 2).
Figure 2.

Intuitive diagram of affinity between the linkage map and G. arboreum. This figure is available in black and white in print and in colour at DNA Research online.

Figure 3.

Intuitive diagram of affinity between the linkage map and G. raimondii. This figure is available in black and white in print and in colour at DNA Research online.

Table 2.

Chromosome collinearity between tetraploid cotton and diploid cotton

Entries give the diploid chromosome of highest/higher homology with each tetraploid chromosome.

Tetraploid chromosome  G. arboreum  G. raimondii  Tetraploid chromosome  G. arboreum  G. raimondii
Chr01 (AT1)            A7/A2        D2            Chr15 (DT1)            A7/A2        D2
Chr02 (AT2)            A5/A2        D3/D5         Chr14 (DT2)            A5           D5
Chr03 (AT3)            A5/A7        D5/D3         Chr17 (DT3)            A7/A2        D3
Chr04 (AT4)            A12          D12/D9        Chr22 (DT4)            A12          D12
Chr05 (AT5)            A10          D9/D12        Chr19 (DT5)            A10          D9
Chr06 (AT6)            A8           D10           Chr25 (DT6)            A8           D10
Chr07 (AT7)            A1           D1            Chr16 (DT7)            A1           D1
Chr08 (AT8)            A3           D4            Chr24 (DT8)            A3           D4
Chr09 (AT9)            A11          D6            Chr23 (DT9)            A11          D6
Chr10 (AT10)           A9           D11           Chr20 (DT10)           A9           D11
Chr11 (AT11)           A4           D7            Chr21 (DT11)           A4           D7
Chr12 (AT12)           A6           D8            Chr26 (DT12)           A6           D8
Chr13 (AT13)           A13          D13           Chr18 (DT13)           A13          D13

Blast analysis between the tetraploid cotton and the mixed genome of two diploid cottons

A blast analysis of markers on the tetraploid cotton genetic linkage map was conducted against the G. arboreum and G. raimondii genomes taken together. Regardless of whether they belonged to the AT or DT genome, the chromosomes of all homologous pairs, except for Chr01/Chr15, Chr02/Chr14, Chr03/Chr17, and Chr05/Chr19, had the highest homology with the corresponding homologous chromosome in the diploid A genome and the second highest homology with the corresponding homologous chromosome in the diploid D genome (Fig. 4 and Table 3). Chr01 was predominantly homologous to A7, followed by A2 and D2; Chr15 had the highest homology with D2, followed by A2 and A7. Chr02 had the highest homology with A5, followed by A2, but very low homology with D5; Chr14 had the highest homology with A5, followed by D5, but very low homology with A2. Chr03 had the highest homology with A5, followed by A7, and low homology with D3; Chr17 had the highest homology with D3, followed by A7, and no homology with A5. Chr04 had the highest homology with A12, followed by D12 and D9; Chr22 had the highest homology with A12, followed by D12. Chr05 had the highest homology with A10, followed by D9 and A12; Chr19 had the highest homology with D9, followed by A10 (Fig. 4, Supplementary Fig. S4, and Table 3).
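One way to derive a "highest" and "second highest" homology ranking of this kind is to count, for each tetraploid chromosome, how many marker best hits land on each diploid chromosome. The sketch below assumes that definition; the paper does not spell out the exact measure. The function name and the hit counts in the example are made up for illustration.

```python
from collections import Counter, defaultdict

def rank_homologs(best_hit_pairs, top=2):
    """Rank diploid chromosomes by number of marker best hits.

    best_hit_pairs: iterable of (tetraploid_chr, diploid_chr), one per marker.
    Returns {tetraploid_chr: [diploid chromosomes, most hits first]}.
    """
    counts = defaultdict(Counter)
    for tetraploid_chr, diploid_chr in best_hit_pairs:
        counts[tetraploid_chr][diploid_chr] += 1
    return {chrom: [name for name, _ in c.most_common(top)]
            for chrom, c in counts.items()}

# Hypothetical best hits against the combined A + D diploid genomes.
pairs = (
    [("Chr05", "A10")] * 6 + [("Chr05", "D9")] * 2 + [("Chr05", "A12")] * 1 +
    [("Chr19", "D9")] * 5 + [("Chr19", "A10")] * 3 + [("Chr19", "D12")] * 1
)
ranking = rank_homologs(pairs)
```

With these made-up counts the ranking mirrors the pattern reported for Chr05 (A10, then D9) and Chr19 (D9, then A10). Ties would need an explicit rule, which this sketch does not define.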
Figure 4.

Intuitive diagram of blast analysis between tetraploid cotton and the mixed genome of two diploid cottons. This figure is available in black and white in print and in colour at DNA Research online.

Table 3.

Blast analysis between the tetraploid cotton and the mixed genome of two diploid cottons

Entries give the chromosome of highest and second highest homology in the mixed genome of the two diploid cottons.

Tetraploid chromosome  Highest  Second highest  Tetraploid chromosome  Highest  Second highest
Chr01 (AT1)            A7       A2              Chr15 (DT1)            D2       A2
Chr02 (AT2)            A5       A2              Chr14 (DT2)            A5       D5
Chr03 (AT3)            A5       A7              Chr17 (DT3)            D3       A7
Chr04 (AT4)            A12      D12             Chr22 (DT4)            A12      D12
Chr05 (AT5)            A10      D9              Chr19 (DT5)            D9       A10
Chr06 (AT6)            A8       D10             Chr25 (DT6)            A8       D10
Chr07 (AT7)            A1       D1              Chr16 (DT7)            A1       D1
Chr08 (AT8)            A3       D4              Chr24 (DT8)            A3       D4
Chr09 (AT9)            A11      D6              Chr23 (DT9)            A11      D6
Chr10 (AT10)           A9       D11             Chr20 (DT10)           A9       D11
Chr11 (AT11)           A4       D7              Chr21 (DT11)           A4       D7
Chr12 (AT12)           A6       D8              Chr26 (DT12)           A6       D8
Chr13 (AT13)           A13      D13             Chr18 (DT13)           A13      D13

Discussion

Different polymorphism ratios of markers

High-density genetic linkage maps are becoming increasingly important. Because polymorphism in cotton is low,[28,29] more researchers have paid attention to developing novel markers. Our laboratory has also endeavoured to develop more molecular markers to construct high-density linkage maps in cotton and so facilitate our cotton genetics and breeding studies. In this study, 200 new HAU-prefixed SSRs were developed from newly released ESTs from an adversity cDNA library of G. barbadense acc. Hai7124, which supplement our previously published markers.

Among all the SSR markers applied in this study, NAU-prefixed EST-SSRs showed the highest polymorphism ratio (78.57%), mainly because they had been deliberately selected as polymorphic SSRs on the basis of published results.[30] The polymorphism ratios of HAU (20.00%), CCRI (21.19%), MON (33.89%), and NBRI (34.16%)-prefixed EST-SSRs were all lower than those of MON (39.87%) and NBRI (36.50%)-prefixed gSSRs, which is in accordance with previous reports that ESTs are more conserved because of stronger selective pressure.[16,31] The polymorphism ratios of MON- and NBRI-prefixed EST-SSRs were much higher than those of HAU- and CCRI-prefixed EST-SSRs for the following reasons. MON-prefixed SSRs reliably generated two or more amplicons,[32] which raised the chance that each primer pair would detect a polymorphism. NBRI-prefixed SSRs were developed from G. herbaceum (AA);[33] when they are used in tetraploid cottons (AADD), high polymorphism between G. hirsutum and G. barbadense may be detected. Generally speaking, the SSR markers in this study were comparatively highly polymorphic. One cause may be the large difference between the two parents, and another may be the application of the SSCP method during genotyping analysis.

GhirPIP-prefixed markers[13] (containing intron single-nucleotide polymorphisms and intron length polymorphisms with a predicted ratio of 3 : 1), cg-prefixed SNP/InDel markers[14] (developed from 3′ end, 5′ end, or intron sequences), and cot-prefixed SNP/InDel markers[14] (developed from conserved orthologous sets) all showed low polymorphism (4.66, 11.30, and 13.89%, respectively), mainly because the SSCP method applied in this study has recognized limits in detecting SNPs/InDels. GhirPIP-prefixed markers may have shown such a low polymorphism rate (4.66%) because they were developed from introns of G. hirsutum predicted from the complete genome sequences of model plants, a prediction that undoubtedly introduced errors. Thus, together with previous reports that SSR markers are highly reproducible across species[34] and can provide more convenient assays of collinearity between different genomes,[35] these results indicate that SSRs are ideal markers for map construction.

Characteristics of genome structure in tetraploid cotton

As previously reported, a high-density genetic linkage map could accelerate genome structure analysis. The present map (5,152 loci, 4696.03 cM in length with an average distance of 0.91 cM; Supplementary Fig. S1 and Table 1) revealed that (i) more loci were found on the DT genome than on the AT genome, consistent with the results of Guo et al.[4] and Yu et al.,[5] but inconsistent with those of Rong et al.[3] and Yu et al.;[6] (ii) the DT genome was shorter than the AT genome, consistent with the results of Rong et al.,[3] Yu et al.,[5] and Yu et al.,[6] but inconsistent with those of Guo et al.;[4] (iii) the average marker distance of the DT genome was shorter than that of the AT genome, consistent with the results of Rong et al.,[3] Guo et al.,[4] and Yu et al.,[5] but inconsistent with those of Yu et al.;[6] and (iv) there were more gaps (>10 cM) on the DT genome than on the AT genome, consistent with the results of Yu et al.,[5] but inconsistent with those of Rong et al.,[3] Guo et al.,[4] and Yu et al.[6] Variations among the five genetic linkage maps are likely the result of differences in the mapping population types and sizes, as well as in the numbers and sources of molecular markers.

In this study, 776 of the total 1,006 distorted loci were mapped on the 26 cotton chromosomes, and 62 SDRs were discovered. Further analysis showed that three chromosomes (Chr02, Chr16, and Chr18) had extreme segregation distortion (>50% of loci showing distortion, accounting for 38.92% of the mapped distorted loci, and containing 30 SDRs), as had been identified in our previous study.[5] Because segregation distortion is increasingly recognized as a potentially powerful evolutionary force,[36] the present results may find wider application. In addition, the segregation distortion mechanisms of these three chromosomes are being investigated using eight reciprocal backcrossing populations, work financially supported by the National Science Foundation of China (Grant No. 31171593).

Tetraploid cotton, containing AT and DT genomes, was formed from an interspecific hybridization event between diploid A and diploid D cotton species that may have evolved from a common ancestor.[37,38] Accordingly, the 13 expected homologous AT/DT chromosome pairs were further confirmed by the present finding that the highest homology or collinearity existed between them (Fig. 1), which is in accordance with previous reports.[3] These results may be useful in identifying candidate genes. For example, if a quantitative trait locus (QTL) is detected on an AT chromosome, a candidate gene controlling the same trait could be predicted on the corresponding collinear segment of the DT chromosome, so that the functional divergence of homologous genes can be studied. However, it is notable that an obvious inversion appeared on Chr10 or Chr20, and repeated inversions appeared on Chr07 or Chr16 (Fig. 1 and Supplementary Fig. S2). In addition, two reciprocal translocations, between Chr02 and Chr03 and between Chr04 and Chr05, were confirmed in this study (Fig. 1 and Supplementary Fig. S3), corresponding to previous reports.[3,6,28,39] All of the above may result from genome rearrangements during or after the polyploidization of the two ancestral diploid genomes, indicating complex but linear features of the tetraploid cotton genome.[6] Another possibility is that the conformation of Chr02, Chr03, Chr04, and Chr05 in the AT genome results from breakage and fusion of ancient chromosomes, since diploid cotton is a paleohexaploid with a base chromosome number of seven.[12]

Good collinearity between the AT/DT genome in tetraploid and diploid A/D genomes

Comparative genomic analysis between tetraploid cotton and diploid cottons showed that 12 chromosomes in the AT genome (all except Chr02 (AT2)) showed good collinearity with chromosomes in the diploid A genome (Fig. 2 and Table 2), and that all 13 chromosomes in the DT genome showed good collinearity with chromosomes in the diploid D genome (Fig. 3 and Table 2). These findings are consistent with the results of Li et al.[9] and Wang et al.,[8] respectively, and with the report that no major structural changes exist between the DT and D chromosomes, whereas such changes do exist between the AT and A chromosomes.[3,40] Moreover, homology between the AT genome and the diploid D genome (Fig. 3 and Table 2), as well as between the DT genome and the diploid A genome (Fig. 2 and Table 2), supplied further evidence to define 13 homologous chromosome pairs in tetraploid cotton. It also provided evidence for collinearity between the diploid A and D genomes, as well as homology between chromosomes from different diploid genomes. The present results are fully consistent with reports based entirely on the genome sequences of the two diploid species.[9]

The diploid A genome dominates the tetraploid cotton genome

It is well known that tetraploid cotton was formed from an interspecific hybridization event between an A-genome species and a D-genome species,[4] and that the diploid A genome is nearly 2-fold larger than the D genome, although they diverged from a common ancestor ∼5–10 million years ago.[38] However, previous studies[3-6] and the present study found only a minor difference between the AT and DT genomes in tetraploid cotton. Previous reports that the AT genome is more stable than the DT genome,[28] together with the present finding that most chromosomes, regardless of an AT or DT genome origin, had the highest homology with the corresponding homologous chromosomes in the diploid A genome (Fig. 4 and Table 3), support the inference that during the hybridization evolution of tetraploid cotton, the DT genome underwent invasion by the diploid A genome (Fig. 5). This result may partly explain why the diploid D-genome species do not produce spinnable fibre,[41] while many QTLs for fibre-related traits have been detected in the DT genome of tetraploid cotton.[42-44]
Figure 5.

Schematic drawing that illustrates the evolution of most chromosomes. This figure is available in black and white in print and in colour at DNA Research online.

If the formation of Chr04/Chr22 and Chr05/Chr19 had been similar to that of the other homologous chromosome pairs, their structure should be as shown in Fig. 6A. However, blast analysis between the tetraploid cotton genome and the mixed genome of two diploid cottons showed that Chr04 had the highest homology with A12, followed by D12 and D9, which were equal to each other. Additionally, Chr05 had the highest homology with A10, followed by D9 and A12, which were equal to each other (Fig. 4 and Supplementary Fig. S4). Thus, it is predicted that Chr04 should contain a segment of D9, and Chr05 should contain a segment of A12 (as shown in Fig. 6C). This prediction is also supported by comparative genomics between the tetraploid cotton and diploid cottons (Figs 2 and 3, and Table 2). If so, chromosomal translocations must have occurred during the interspecific hybridization events, as shown in Fig. 6B. In detail, one chromosome translocation occurred between Chr04 and Chr05, and another occurred between Chr04 and Chr19. Similarly, considering the results of comparative genomics between the tetraploid cotton and diploid cottons, and the results of blast analysis between the tetraploid cotton genome and the mixed genome of two diploid cottons, a series of chromosomal translocations among Chr01/Chr15, Chr02/Chr14, and Chr03/Chr17 could also be predicted (Fig. 7). All of the above suggests that these five homologous chromosome pairs changed in a complex and dramatic manner as they coevolved during polyploidization, whereas the other chromosomes in the AT genome underwent only minor variations and those in the DT genome underwent only invasion by the corresponding homologous chromosome in the diploid A genome.
Figure 6.

Schematic drawing that illustrates the evolution of Chr04/Chr22 and Chr05/Chr19. This figure is available in black and white in print and in colour at DNA Research online.

Figure 7.

Schematic drawing that illustrates the evolution of Chr01/Chr15, Chr02/Chr14, and Chr03/Chr17. This figure is available in black and white in print and in colour at DNA Research online.


Applications of the high-density linkage map in tetraploid cotton genomics, genetics, and breeding

The present high-density genetic linkage map, with an average interval of 0.91 cM (approximately 400 kb) between genetic markers on the basis of a consensus genome size estimate of ∼2,246 Mb, provides a foundation to facilitate genome sequencing and sequence assembly.[45] It also clearly revealed the genome structure of tetraploid cotton and provided preliminary hypotheses about the formation of tetraploid cotton from two diploid cotton species. Cotton breeders have long been working to transfer superior genes controlling fibre quality from G. barbadense to G. hirsutum, and introgression lines have proved to be a viable way to overcome the high sterility and abnormal segregation of interspecific hybrid progeny. Superior introgression lines are usually constructed by marker-assisted selection, which necessarily depends on a genetic linkage map. The present high-density linkage map could also provide markers that are tightly linked with target traits, which should be very useful for molecular breeding and genetic improvement.
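The relation between the genetic and physical marker intervals can be checked with a short calculation. The sketch below uses only the figures quoted in this article (4696.03 cM, 5,152 loci, ∼2,246 Mb). The function names are hypothetical, and dividing by the number of loci is a simplifying assumption: it ignores that the number of intervals is slightly smaller than the number of loci and that recombination is not uniform along chromosomes.

```python
# Figures quoted in the article.
GENOME_SIZE_MB = 2246.0   # consensus genome size estimate (~2,246 Mb)
MAP_LENGTH_CM = 4696.03   # total length of the linkage map
N_LOCI = 5152             # number of mapped loci

def mean_interval_cm(map_length_cm, n_loci):
    """Average genetic distance between adjacent loci (simple ratio)."""
    return map_length_cm / n_loci

def kb_per_cm(genome_size_mb, map_length_cm):
    """Genome-wide average physical distance per centimorgan, in kb."""
    return genome_size_mb * 1000.0 / map_length_cm

def mean_interval_kb(genome_size_mb, map_length_cm, n_loci):
    """Average physical distance between adjacent loci, in kb."""
    return kb_per_cm(genome_size_mb, map_length_cm) * mean_interval_cm(
        map_length_cm, n_loci
    )

if __name__ == "__main__":
    print(f"{mean_interval_cm(MAP_LENGTH_CM, N_LOCI):.2f} cM per interval")
    print(f"{kb_per_cm(GENOME_SIZE_MB, MAP_LENGTH_CM):.0f} kb per cM")
    print(f"{mean_interval_kb(GENOME_SIZE_MB, MAP_LENGTH_CM, N_LOCI):.0f} kb per interval")
```

Under these assumptions the average interval comes out at roughly 0.91 cM and a little over 400 kb, in line with the figures stated above.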

Supplementary data

Supplementary data are available at www.dnaresearch.oxfordjournals.org.

Funding

This work was financially supported by the National Science Foundation of China (grant no. 31171593) and Genetically Modified Organisms Breeding Major Project of China (no. 2014ZX08009). Funding to pay the Open Access publication charges for this article was provided by the National Science Foundation of China (grant no. 31171593).