Peirong Li1,2,3, Tongbing Su1,2,3, Xiuyun Zhao1,2,3, Weihong Wang1,2,3, Deshuang Zhang1,2,3, Yangjun Yu1,2,3, Philipp E Bayer4, David Edwards4, Shuancang Yu1,2,3, Fenglan Zhang1,2,3.
Abstract
Brassica rapa displays a wide range of morphological diversity, which is exploited for a variety of food crops. Here we present a high-quality genome assembly for pak choi (Brassica rapa L. subsp. chinensis), an important non-heading leafy vegetable, and its comparison with the genomes of heading-type Chinese cabbage and the oilseed form, yellow sarson. Gene presence-absence variation (PAV) and genomic structural variations (SV) were identified, together with single nucleotide polymorphisms (SNPs). The structure and expression of genes for leaf morphology and flowering were compared among the three morphotypes, revealing candidate genes for these traits in B. rapa. The pak choi genome assembly and its comparison with other B. rapa genome assemblies provide a valuable resource for the genetic improvement of this important vegetable crop and a model for understanding the diversity of morphological variation across Brassica species.
Keywords: Brassica rapa; flowering genes; genetic variation; genome; leaf-heading genes; pak choi
Year: 2021 PMID: 33283404 PMCID: PMC8131043 DOI: 10.1111/pbi.13522
Source DB: PubMed Journal: Plant Biotechnol J ISSN: 1467-7644 Impact factor: 9.803
Comparisons of assembly statistics and annotation of the pak choi genome and published Chinese cabbage and yellow sarson genome assemblies
| | Pak choi genome | Chinese cabbage genome | Yellow sarson genome |
|---|---|---|---|
| Total assembly size (Mb) | 370.42 | 353.14 | 401.92 |
| Total chromosome size (Mb) | 341.14 | 296.58 | 357.07 |
| Contig number | 1985 | 1498 | 1037 |
| Contig N50 (Mb) | 2.82 | 1.45 | 2.27 |
| Longest length (Mb) | 22.37 | 9.42 | 22.13 |
| Gene model | 45 363 | 46 250 | 46 721 |
| Percentage of anchored genes (%) | 98.50 | 98.58 | 98.14 |
See Table S1.
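For readers unfamiliar with the contig N50 statistic reported in the table, the following is a minimal sketch of how it is conventionally computed: contigs are sorted from longest to shortest, and N50 is the length of the contig at which the running total first reaches half of the summed length. The function name and the toy lengths are illustrative assumptions and are not taken from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    Illustrative helper (not from the paper): sort contigs from longest to
    shortest and return the length at which the cumulative sum first
    reaches half of the total.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("no contig lengths given")


# Toy contig lengths (arbitrary units), invented for illustration only.
toy_contigs = [8, 6, 5, 4, 3, 2, 2]
print(n50(toy_contigs))
```

A higher N50 indicates that more of the assembly sits in long contigs, which is why it is reported alongside contig number and total size.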
Figure 1. Genomic landscape of pak choi, Chinese cabbage and yellow sarson. (a) Genomic comparison of pak choi (PC) and Chinese cabbage (CC). (b) Genomic comparison of pak choi and yellow sarson (YS). (c) Genomic comparison of Chinese cabbage and yellow sarson. The distribution of SVs, repeat element density, gene density, distribution of PAV sequences, number of indels, number of SNPs and gene pairs between the two genomes are shown as identified using the best‐hit method. All the components were calculated using 1‐Mb sliding windows.
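The per-window densities in Figure 1 can be pictured with a small sketch that counts features (for example SNP positions) in fixed-size windows along a chromosome. The caption gives a 1‐Mb window but not the step size, so the default step of one window length below is an assumption, as are the function name and the toy positions.

```python
from bisect import bisect_left


def window_counts(positions, chrom_length, window=1_000_000, step=1_000_000):
    """Count features per window along one chromosome.

    positions: 0-based feature coordinates (e.g. SNP sites).
    Returns a list of (start, end, count) with half-open windows [start, end).
    The step size is an assumption; the source only states the window size.
    """
    ordered = sorted(positions)
    counts = []
    for start in range(0, chrom_length, step):
        end = min(start + window, chrom_length)
        # Number of positions p with start <= p < end.
        n = bisect_left(ordered, end) - bisect_left(ordered, start)
        counts.append((start, end, n))
    return counts


# Toy coordinates on a hypothetical 3-Mb chromosome.
toy_positions = [100, 999_999, 1_000_000, 2_500_000]
for start, end, n in window_counts(toy_positions, 3_000_000):
    print(start, end, n)
```

Passing a `step` smaller than `window` gives overlapping windows, which smooths the track at the cost of counting each feature more than once.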
Figure 2. Synteny, phylogenetic evolution and mummerplot comparisons of the three Brassica rapa morphotypes. (a) Gene order comparison. Each line connects a pair of orthologous genes identified as best bidirectional hits between the pak choi, Chinese cabbage and yellow sarson genomes. (b) Phylogenetic analysis of pak choi with other Brassiceae species. The estimated species divergence time (Mya) and 95% confidence intervals are shown at each branch site. The divergence used for time recalibration is highlighted by red dots. (c) Nucmer comparison of the pak choi genome and Chinese cabbage genome assembly. (d) Nucmer comparison of the pak choi genome and yellow sarson genome assembly. (e) Nucmer comparison of the Chinese cabbage genome and yellow sarson genome assembly. Chromosomal inversions and breakage in assembly orientation in relation to the chromosomal sequence on the X‐axis are shown in blue.
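The best bidirectional hit criterion mentioned for panel (a) can be sketched as follows: a gene pair is kept only when each gene is the other's top-scoring hit. This is a generic illustration of the idea, not the authors' pipeline; the tuple layout `(query, subject, score)` and the gene identifiers are hypothetical.

```python
def best_hits(hits):
    """Map each query to its highest-scoring subject.

    hits: iterable of (query, subject, score) tuples (hypothetical layout).
    """
    best = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return sorted (gene_a, gene_b) pairs that are each other's best hit."""
    best_ab = best_hits(a_vs_b)
    best_ba = best_hits(b_vs_a)
    return sorted((a, b) for a, b in best_ab.items() if best_ba.get(b) == a)


# Invented alignment scores for two toy genomes A and B.
a_vs_b = [("a1", "b1", 200), ("a1", "b2", 150), ("a2", "b2", 90)]
b_vs_a = [("b1", "a1", 210), ("b2", "a1", 140), ("b2", "a2", 80)]
print(reciprocal_best_hits(a_vs_b, b_vs_a))
```

In the toy data, `a2` is dropped because its best hit `b2` prefers `a1`, which is the kind of one-sided match the bidirectional requirement filters out.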
Gene variations between the pak choi (PC) genome compared with the Chinese cabbage (CC) and yellow sarson (YS) genomes
| Variation type | PC vs CC | PC vs YS | Classification |
|---|---|---|---|
| Structurally conserved genes | 35 874 | 32 582 | Classification Ⅰ |
| Without amino acid substitutions | 16 396 | 10 737 | Classification Ⅰ |
| No DNA variation in the CDS region | 12 753 | 7006 | Classification Ⅰ |
| No DNA variation in the CDS and intron region | 9968 | 4905 | Classification Ⅰ |
| No DNA variation in the genic region | 1887 | 346 | Classification Ⅰ |
| With 3n indel in CDS | 5022 | 6018 | Classification Ⅰ |
| With amino acid changes | 19 478 | 21 845 | Classification Ⅰ |
| With missense mutation in CDS | 19 462 | 21 822 | Classification Ⅰ |
| Genes with large‐effect mutations | 5135 | 7036 | Classification Ⅱ |
| Premature stop codon | 3689 | 5184 | Classification Ⅱ |
| Splice acceptor mutation | 343 | 471 | Classification Ⅱ |
| Splice donor mutation | 217 | 304 | Classification Ⅱ |
| Start codon mutation | 258 | 366 | Classification Ⅱ |
| Stop codon mutation | 386 | 505 | Classification Ⅱ |
| With 3n±1 indel in CDS | 2346 | 3544 | Classification Ⅱ |
| Genes with incomplete CDS | 3124 | 4307 | Classification Ⅲ |
| At least one exon missing | 1995 | 2841 | Classification Ⅲ |
Genic regions include the gene body plus 2 kb upstream and downstream. Classifications Ⅰ, Ⅱ and Ⅲ denote structurally conserved genes, genes with large‐effect mutations and genes with large structural variation, respectively.
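The table separates CDS indels whose length is a multiple of three (3n, listed under Classification Ⅰ) from those that are not (3n±1, listed under Classification Ⅱ). The distinction follows from the triplet codon structure: a 3n indel adds or removes whole codons, whereas any other length shifts the reading frame. The sketch below illustrates that rule only; the function name and the alleles are hypothetical, and it is not the authors' annotation code.

```python
def classify_cds_indel(ref, alt):
    """Label a coding-sequence indel by its effect on the reading frame.

    ref, alt: reference and alternative allele strings (hypothetical inputs).
    """
    delta = abs(len(alt) - len(ref))
    if delta == 0:
        raise ValueError("alleles have equal length; not an indel")
    # A length change divisible by 3 keeps the downstream codons in frame.
    return "3n (in-frame)" if delta % 3 == 0 else "3n±1 (frameshift)"


print(classify_cds_indel("A", "ATTT"))  # 3-bp insertion
print(classify_cds_indel("ATG", "A"))   # 2-bp deletion
```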
Figure 3. Gene variations and expression patterns in the putative genes involved in leaf shape and polarity between the pak choi (PC) and Chinese cabbage (CC) genomes. IGV alignments showing the variations of BrKAN1.1 (a), BrKAN1.2 (b), BrKAN2.2 (c), BrKAN3.1 (d) and BrKAN3.2 (e) using long reads of the pak choi genome. The exon (green box), intron (black line) and 1 kb upstream sequence are shown in the figure. (f) The expression heatmap of 13 genes at six growing stages. PC LR is short for pak choi long reads.
The SV and PAV genes for leaf‐heading morphotype
| Genes | Arabidopsis ID | Annotation | Variation type |
|---|---|---|---|
| | | Auxin response factor ARF3 | SV: Premature stop codon, Splice acceptor mutation, with 3n±1 indel in CDS |
| | | Homeobox‐leucine zipper ATHB‐15 | PAV: CC absent |
| | | Brevis radix | SV: Premature stop codon |
| | | Brevis radix‐like 2 | SV: Premature stop codon, Splice acceptor mutation, Splice donor mutation |
| | | Brevis radix‐like 2 | PAV: PC absent |
| | | Transcription repressor KANADI 1 | SV: Stop codon mutation |
| | | Transcription repressor KANADI 1 | PAV: PC absent |
| | | Transcription factor KANADI 2 | PAV: PC absent |
| | | Transcription factor KANADI 3 | SV: Premature stop codon |
| | | Transcription factor KANADI 3 | SV: Premature stop codon, with 3n±1 indel in CDS |
| | | Homeobox protein knotted‐1‐like 4 | SV: Premature stop codon |
| | | Homeobox‐leucine zipper REVOLUTA | SV: Premature stop codon |
| | | Axial regulator YABBY 1 | SV: Premature stop codon |
Figure 4. Gene variations and expression patterns in the putative genes involved in the flowering pathway between the pak choi (PC) and yellow sarson (YS) genomes. IGV alignments showing the variations of BrMAF4 (a), BrSVP (b), BrCSTF77 (c), BrBBX19 (d), BrTOE2 (e) and BrAP1A (f) using long reads of the pak choi genome. (g) The expression heatmap of 18 genes at five growing stages. (h) Expression pattern of BrFLC2, BrPHYA and BrMAF4 at five stages. The exon (green box), intron (black line) and 1 kb upstream sequence are shown in the figure. (i) Real‐time PCR of BrMAF4 at five stages. PC LR is short for pak choi long reads.