Zhaoen Yang, Xiaoyang Ge, Zuoren Yang, Wenqiang Qin, Gaofei Sun, Zhi Wang, Zhi Li, Ji Liu, Jie Wu, Ye Wang, Lili Lu, Peng Wang, Huijuan Mo, Xueyan Zhang, Fuguang Li.
Abstract
Multiple cotton genomes (diploid and tetraploid) have been assembled. However, genomic variations between cultivars of allotetraploid upland cotton (Gossypium hirsutum L.), the most widely planted cotton species in the world, remain unexplored. Here, we use single-molecule long-read and Hi-C sequencing technologies to assemble the genomes of two upland cotton cultivars, TM-1 and zhongmiansuo24 (ZM24). Comparisons among the TM-1 and ZM24 assemblies and the genomes of the diploid ancestors reveal a large number of genetic variations. Among them, the three longest structural variations are located on chromosome A08 of the tetraploid upland cotton and together account for ~30% of the total length of this chromosome. Haplotype analyses of the mapping population derived from these two cultivars and of the germplasm panel show suppressed recombination rates in this region. This study provides additional genomic resources for the community, and the identified genetic variations, especially the reduced meiotic recombination on chromosome A08, will help future breeding.
Year: 2019 PMID: 31278252 PMCID: PMC6611876 DOI: 10.1038/s41467-019-10820-x
Source DB: PubMed Journal: Nat Commun ISSN: 2041-1723 Impact factor: 14.919
Global statistical comparison of TM-1 and ZM24 genomes
| Category | TM-1: Numbers | TM-1: N50 | TM-1: Longest | TM-1: Size | TM-1: Percentage of assembly | ZM24: Numbers | ZM24: N50 | ZM24: Longest | ZM24: Size | ZM24: Percentage of assembly |
|---|---|---|---|---|---|---|---|---|---|---|
| Contigs | 1283 | 4760 | 23 | 2286 | 100 | 3718 | 1976 | 14 | 2309 | 100 |
| Anchored and oriented | 699 | 4856 | 23 | 2226 | 97.4 | 1497 | 2151 | 14 | 2150 | 93.2 |
| Gene annotated | 73,624 | | | 228 | 10.0 | 73,707 | | | 243 | 10.5 |
| Repeat sequence | NA a | NA | NA | 1686 | 73.7 | NA | NA | NA | 1665 | 72.1 |
a NA, not applicable.
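The N50 columns in the table above summarize assembly contiguity. As a reminder of what that statistic means, here is a minimal sketch of the usual N50 definition: the length of the contig at which the running total of contigs, sorted from longest to shortest, first reaches half of the assembly size. The function name `n50` and the toy lengths are illustrative assumptions and are not taken from either assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    Contigs are sorted from longest to shortest; N50 is the length of the
    contig at which the running total first reaches half of the summed length.
    """
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    raise ValueError("lengths must be non-empty")


# Toy contig lengths, made up for illustration only.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
toy_n50 = n50(toy_contigs)  # total is 300, half is 150, reached at the 70 contig
```

A higher N50 at a similar total size, as reported for TM-1 relative to ZM24, indicates that more of the assembly sits in long contigs.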
Fig. 1 Comparison of the assembled genomes with the genomes of their diploid progenitors. a Whole-genome comparison among the TM-1 At subgenome, the G. arboreum genome (A2), and the ZM24 At subgenome. b Whole-genome comparison among the TM-1 Dt subgenome, the G. raimondii genome (D5), and the ZM24 Dt subgenome. c, d TM-1 Hi-C data mapped to TM-1 A06 and G. arboreum Chr06. The dotted boxes indicate the inversions between A06 and Chr06. e Genome comparison of TM-1 A06 and G. arboreum Chr06. The double-arrow dashed lines indicate the same inversions in the Hi-C maps and the chromosome comparison. Source data of (a, b) are provided in a Source Data file.
Fig. 2 Genomic landscape between TM-1 and ZM24 genomes. a Subgenomes of TM-1 (green) and ZM24 (orange). b, c Transposable elements and gene density in 1 Mb sliding windows. d Introgression from chromosomes of the Dt subgenome to its counterpart chromosomes in the At subgenome in 1 Mb sliding windows. e Distribution of PAV sequences in 1 Mb sliding windows. f InDels between the TM-1 and ZM24 At subgenomes and G. arboreum (A2) in 1 Mb sliding windows. g InDels between TM-1 and ZM24 At subgenomes in 1 Mb sliding windows. h SNPs between At subgenomes and A2 in 1 Mb sliding windows. i SNPs between TM-1 and ZM24 At subgenomes in 1 Mb sliding windows. j Large-scale variations between the TM-1 and ZM24 subgenomes. Source data are provided in a Source Data file.
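Most tracks in Fig. 2 are densities in 1 Mb sliding windows. The sketch below shows one plausible way to produce such a track from a list of variant positions. The step size is not stated in the caption, so the default here (step equal to window, i.e. non-overlapping bins) is an assumption, as are the function name `window_counts` and the toy positions.

```python
WINDOW = 1_000_000  # 1 Mb window size, as stated in the caption


def window_counts(positions, chrom_length, window=WINDOW, step=WINDOW):
    """Count positions falling in each window [start, start + window).

    `step` controls how far the window slides each time; step == window
    gives non-overlapping bins (an assumption, the caption gives no step).
    Positions are treated as 0-based coordinates.
    """
    counts = []
    start = 0
    while start < chrom_length:
        end = min(start + window, chrom_length)
        n = sum(1 for p in positions if start <= p < end)
        counts.append((start, end, n))
        start += step
    return counts


# Toy SNP positions on a hypothetical 3 Mb chromosome (illustration only).
toy_positions = [100, 500_000, 1_200_000, 2_999_999]
toy_track = window_counts(toy_positions, chrom_length=3_000_000)
```

Passing a smaller `step` (for example 500,000) gives overlapping windows, which smooths the track at the cost of counting each variant more than once.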
Variations within genes between TM-1 and ZM24 genomes
| Variation type | Syntenic orthologous gene pairs: At | Percent | Syntenic orthologous gene pairs: Dt | Percent | Nonsyntenic orthologous gene pairs: At | Percent | Nonsyntenic orthologous gene pairs: Dt | Percent |
|---|---|---|---|---|---|---|---|---|
| Structurally conserved genes | 27,428 | 80.10% | 27,440 | 80.34% | 222 | 56.78% | 160 | 44.19% |
| Without amino acid substitutions | 23,963 | 69.98% | 24,144 | 70.69% | 104 | 26.60% | 60 | 16.57% |
| No DNA variation in CDS region | 23,290 | 68.01% | 23,463 | 68.69% | 77 | 19.69% | 45 | 12.40% |
| No DNA variation in CDS and intron region | 19,130 | 55.87% | 19,415 | 56.84% | 41 | 10.49% | 27 | 7.45% |
| No DNA variation in genic region a | 7712 | 22.52% | 8302 | 24.31% | 2 | 0.51% | 10 | 2.76% |
| Same sense mutation | 673 | 1.97% | 681 | 1.99% | 27 | 6.91% | 15 | 4.14% |
| With amino acid changes | 3465 | 10.12% | 3296 | 9.65% | 118 | 30.18% | 100 | 27.55% |
| With missense mutation in CDS | 2340 | 6.83% | 2222 | 6.51% | 97 | 24.81% | 78 | 21.62% |
| With 3n InDel in CDS | 1125 | 3.29% | 1074 | 3.14% | 21 | 5.37% | 22 | 6.07% |
| Genes with large-effect mutations | 2185 | 6.38% | 2214 | 6.48% | 48 | 12.28% | 62 | 17.12% |
| With 3n ± 1 InDel in CDS | 629 | 1.84% | 550 | 1.61% | 17 | 4.35% | 19 | 5.24% |
| Start-codon mutation | 921 | 2.69% | 1127 | 3.30% | 11 | 2.81% | 21 | 5.80% |
| Stop-codon mutation | 390 | 1.14% | 327 | 0.96% | 15 | 3.84% | 15 | 4.14% |
| Splice-acceptor mutation | 23 | 0.07% | 20 | 0.06% | 1 | 0.26% | 0 | 0.00% |
| Splice-donor mutation | 222 | 0.65% | 190 | 0.56% | 4 | 1.02% | 7 | 1.93% |
| Genes with large structural variations | 4630 | 13.52% | 4502 | 13.18% | 121 | 30.95% | 140 | 38.67% |
| At least one CDS missing | 4370 | 12.76% | 4164 | 12.19% | 90 | 23.02% | 104 | 28.73% |
| Total | 34,243 b | 100.00% | 34,156 b | 100.00% | 391 b | 100.00% | 362 b | 100.00% |
a Genic regions include 2 kb upstream and downstream of the gene body.
b Only genes and their orthologs in the counterpart genome anchored in 26 chromosomes were included for the analysis.
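The rows of this table are nested, although the indentation that would show it is not preserved here. A short arithmetic check makes the nesting explicit for the syntenic At column. The grouping into three top-level classes is inferred from the fact that the counts add up, not from an explicit statement in the source, and the dictionary keys are illustrative names.

```python
# Counts for syntenic At gene pairs, copied from the table above.
at_syntenic = {
    "structurally_conserved": 27428,
    "without_aa_substitutions": 23963,
    "with_aa_changes": 3465,
    "large_effect_mutations": 2185,
    "indel_3n_plus_minus_1": 629,
    "start_codon": 921,
    "stop_codon": 390,
    "splice_acceptor": 23,
    "splice_donor": 222,
    "large_structural_variations": 4630,
    "total": 34243,
}

c = at_syntenic

# Three top-level classes partition the total (inferred grouping).
top_level_sum = (
    c["structurally_conserved"]
    + c["large_effect_mutations"]
    + c["large_structural_variations"]
)

# Structurally conserved genes split by whether the protein changes.
conserved_sum = c["without_aa_substitutions"] + c["with_aa_changes"]

# Large-effect mutations split by mutation type.
large_effect_sum = (
    c["indel_3n_plus_minus_1"]
    + c["start_codon"]
    + c["stop_codon"]
    + c["splice_acceptor"]
    + c["splice_donor"]
)

# Percentage of structurally conserved genes, rounded to two decimals.
conserved_pct = round(100 * c["structurally_conserved"] / c["total"], 2)
```

Under this reading, roughly four in five syntenic At gene pairs are structurally conserved between TM-1 and ZM24, and the remainder carry either a large-effect mutation or a large structural variation.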
Fig. 3 Genetic effect of inversions present on A08. a Large-scale inversions (SV1, SV2, and SV3) on A08 between TM-1 and ZM24. b ZM24 Hi-C data mapped to TM-1 A08. c Genotyping of breakpoints in SV1 and SV3. The 421 accessions are colored by the A08 inversion allele (orange, TM-1-like allele; green, ZM24-like allele; grey, unassigned) as assessed by the breakpoints for SV1 and SV3. The accessions are ordered based on the haplotype clustering shown in (d). d Haplotype clustering based on 315,868 genome-wide SNPs (MAF ≥ 0.05), revealing two distinct clusters (TM-1-like and ZM24-like groups) that match the distributions of the two A08 inversion alleles exactly. e Principal component analysis of the 421 accessions. f Genetic differentiation (FST) between the TM-1-like group and the ZM24-like group. g Haplotype diversity for the TM-1-like and ZM24-like groups. h Local impact of large inversions on meiotic recombination rate, using a RIL population derived from a cross of TM-1 and ZM24. Black line, cM/Mb values across the chromosome; colored boxes, locations of inversions. Source data of (e–h) are provided in a Source Data file.
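Panel h plots recombination rate in cM/Mb along the chromosome. A simple way to obtain such values, sketched here under the assumption that each marker has both a physical position and a genetic-map position, is to divide the genetic distance between adjacent markers by their physical distance. The function name `recombination_rates` and the marker values are hypothetical, and this is not necessarily the procedure used for the figure.

```python
def recombination_rates(markers):
    """Compute cM/Mb for each interval between adjacent markers.

    `markers` is a list of (position_bp, genetic_position_cM) tuples,
    sorted by physical position. Returns (start_bp, end_bp, cM_per_Mb).
    """
    rates = []
    for (bp0, cm0), (bp1, cm1) in zip(markers, markers[1:]):
        interval_mb = (bp1 - bp0) / 1e6
        if interval_mb <= 0:
            raise ValueError("markers must be sorted with distinct positions")
        rates.append((bp0, bp1, (cm1 - cm0) / interval_mb))
    return rates


# Toy markers: the middle 8 Mb interval gains only 0.5 cM, mimicking a
# region of suppressed recombination (values are made up for illustration).
toy_markers = [
    (0, 0.0),
    (2_000_000, 4.0),
    (10_000_000, 4.5),
    (12_000_000, 8.5),
]
toy_rates = recombination_rates(toy_markers)
```

In the toy data the flanking intervals come out at 2.0 cM/Mb while the middle interval is 0.0625 cM/Mb, which is the kind of local drop that an inversion heterozygous in the cross would be expected to produce.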