Jianping Zhang1, Yanni Qi2, Limin Wang2, Lili Wang3, Xingchu Yan4, Zhao Dang2, Wenjuan Li2, Wei Zhao2, Xinwu Pei5, Xuming Li3, Min Liu3, Meilian Tan4, Lei Wang4, Yan Long5, Jing Wang3, Xuewen Zhang3, Zhanhai Dang6, Hongkun Zheng7, Touming Liu8. 1. Institute of Crop Research, Gansu Academy of Agricultural Sciences, Lanzhou, Gansu, China. Electronic address: zhangjpzw3@gsagr.ac.cn. 2. Institute of Crop Research, Gansu Academy of Agricultural Sciences, Lanzhou, Gansu, China. 3. Biomarker Technologies Corporation, Beijing, China. 4. Oil Crops Research Institute, Chinese Academy of Agricultural Sciences, Wuhan, Hubei, China. 5. Institute of Biotechnology, Chinese Academy of Agricultural Sciences, Beijing, China. 6. Institute of Crop Research, Gansu Academy of Agricultural Sciences, Lanzhou, Gansu, China. Electronic address: 13669338239@163.com. 7. Biomarker Technologies Corporation, Beijing, China. Electronic address: zhenghk@biomarker.com.cn. 8. Institute of Bast Fiber Crops and Center of Southern Economic Crops, Chinese Academy of Agricultural Sciences, Changsha, Hunan, China. Electronic address: liutouming@caas.cn.
Abstract
Flax has been cultivated for its oil and fiber for thousands of years. However, it remains unclear how the modifications of agronomic traits occurred on the genetic level during flax cultivation. In this study, we conducted genome-wide variation analyses on multiple accessions of oil-use, fiber-use, landraces, and pale flax to identify the genomic variations during flax cultivation. Our findings indicate that, during flax domestication, genes relevant to flowering, dehiscence, oil production, and plant architecture were preferentially selected. Furthermore, regardless of origins, the improvement of the modern oil-use flax preceded that of the fiber-use flax, although the dual selection on oil-use and fiber-use characteristics might have occurred in the early flax domestication. We also found that the expansion of MYB46/MYB83 genes may have contributed to the unique secondary cell wall biosynthesis in flax and the directional selections on MYB46/MYB83 may have shaped the morphological profile of the current oil-use and fiber-use flax.
Flax (Linum usitatissimum L.) is one of the earliest domesticated crops, with records spanning more than 8,000 years, and provides a source of oil and fiber for humans (Fu, 2011, van Zeist and Bakker-Heeres, 1975). There are two primary morphotypes of cultivated flax, oil-use flax and fiber-use flax, which display remarkable differences in morphology and agronomic performance: oil-use flax is shorter, has more branches, and produces larger seeds that contain ∼40% oil, whereas fiber-use flax is comparatively taller, less branched, and produces fewer seeds. Primitive cultivated flax is thought to have descended from a wild flax species, pale flax (L. bienne Mill.), which is a winter annual or perennial that possesses narrow leaves, dehiscent capsules, and lodging-prone stems (Zohary and Hopf, 2000, Allaby et al., 2005). Multiple domestication processes then gave rise to cultivated flax, in which traits such as indehiscence, winter hardiness, oil content, and fiber content were improved. Owing to differences in genetic markers and sampling strategies, previous flax population analyses often drew inconsistent conclusions regarding which trait-specific group was established first (Fu, 2011, Fu, 2012, Fu et al., 2012). Although molecular evidence suggests that the domestication of modern oil-use flax occurred before that of fiber-use flax, studies of early flax domestication were probably complicated by the fact that flax was domesticated as an oil-fiber dual-use crop from prehistoric times, as revealed by archaeological records (Helback, 1959, van Zeist and Bakker-Heeres, 1975). Notably, pale flax has a very wide biogeographic range spanning Europe, Africa, and Asia (Helback, 1959, Diederichsen and Hammer, 1995), unlike many relic wild progenitors of crops that were confined to a single geographic location. 
Therefore, multiple independent domestication events might have occurred in the flax domestication history (Fu, 2012, Fu and Peterson, 2012).
Artificial selection during crop domestication and improvement often substantially reduces genetic variation. For many conventional crops such as rice (Zhang et al., 2014, Stein et al., 2018), soybean (Li et al., 2014, Xie et al., 2019), maize (Yang et al., 2017), cassava (Bredeson et al., 2016), sunflower (Hübner et al., 2019), pepper (Qin et al., 2014), tomato (Bolger et al., 2014, Gao et al., 2019), Brassica (Golicz et al., 2016), and citrus (Wang et al., 2018), both the selection targeting desirable traits in the domesticated crops and the genomic diversity of their wild progenitors have been extensively studied. For example, sequencing of 725 representative tomato samples showed that selection on the TomLoxC promoter affected tomato flavor during domestication (Gao et al., 2019); analysis of wild and landrace mandarins showed that the aconitate hydratase (ACO) gene, which regulates citrate content, was under selection during domestication (Wang et al., 2018); genes related to biotic stress response were introgressed from wild species into cultivated sunflower (Hübner et al., 2019); and the progenitor Malus sylvestris contributed alleles for fruit quality and production traits to dessert apple cultivars (Duan et al., 2017). However, similar studies for flax are still lacking. In previous studies, a variety of molecular markers were used to investigate the genetic diversity and lineage relationships in cultivated and pale flax (Allaby et al., 2005, Fu et al., 2002a, Fu et al., 2002b, Soto-Cerda et al., 2012, Smykal et al., 2011, Xie et al., 2018). Some selective loci responsible for the agricultural improvement of flax were identified through genetic mapping and genome-wide association studies (Cloutier et al., 2011, Kumar et al., 2015, Xie et al., 2018). 
However, in these studies, the low coverage of the flax genome potentially clouded the conclusions. For example, by analyzing the sad2 locus, Fu et al. (2012) deduced that the increase in oil content occurred prior to capsular indehiscence, whereas an analysis based on another set of 49 EST-SSRs identified capsular dehiscence as the earliest domesticated trait (Fu, 2011). In addition to the low genome coverage, the lack of a pale flax genome sequence prevented the inference of genome-wide variations during flax cultivation. In this study, we de novo assembled three flax genomes and resequenced 83 cultivated flax accessions. Through this, we sought to identify and understand the genetic variations that resulted from flax domestication and improvement at the whole-genome level.
Results
De Novo Assembly of Three Flax Genomes
Whole-genome shotgun sequencing was performed on the oil-use flax variety “Longya-10,” the fiber-use flax variety “Heiya-14,” and pale flax (Table S1 and Figure S1). A total of 68.2, 73.5, and 49.1 billion high-quality base pairs (133-, 142-, and 93-fold genome coverage, respectively) were assembled into 306.0-, 303.7-, and 293.5-Mb genomes for Longya-10, Heiya-14, and pale flax, with contig N50/scaffold N50 lengths of 131 kb/1,235 kb, 156 kb/700 kb, and 59 kb/384 kb, respectively (Tables S2–S6 and Table 1). The total gap length in the Longya-10, Heiya-14, and pale flax genomes was 5.8, 2.8, and 5.6 Mb, respectively (Table 1). To further improve the assembly quality, we used Hi-C technology and a genetic map to improve the Longya-10 genome, resulting in 434 scaffolds (295.7 Mb in total) for the chromosome-level assembly (Tables S7 and S8 and Figures S2 and S3). Approximately 43,500 protein-coding genes and ∼2,600–2,800 non-coding RNAs were identified in each genome. In addition, 288,633 (∼122.2 Mb), 275,796 (∼115.4 Mb), and 244,460 (∼109.4 Mb) repetitive sequences were found in the Longya-10, Heiya-14, and pale flax genomes, respectively (Figure 1 and Tables S9–S12). Phylogenetic analysis revealed that cultivated flax and pale flax diverged about 2.32 million years ago (Figure S4). Two whole-genome duplication (WGD) events (Ks = 0.13 and Ks = 0.77, respectively) were identified since the ancient hexaploidization that occurred during angiosperm evolution (Table S13 and Figure S5).
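The contig N50 and scaffold N50 values quoted above are contiguity summaries. As a minimal sketch of how such a statistic is commonly defined (not the authors' pipeline; the function name `nx` and the toy lengths are hypothetical), Nx is the length of the shortest sequence in the smallest set of longest sequences that together cover at least x% of the assembly:

```python
def nx(lengths, x=50):
    """Return the Nx statistic (e.g. N50, N90) for a list of sequence lengths.

    Sequences are sorted from longest to shortest and accumulated until the
    running total reaches x% of the summed length; the length of the sequence
    that crosses that threshold is Nx.
    """
    ordered = sorted(lengths, reverse=True)
    threshold = sum(ordered) * x / 100
    running = 0
    for length in ordered:
        running += length
        if running >= threshold:
            return length
    raise ValueError("no sequence lengths were given")


# Toy scaffold lengths (illustrative only, not taken from the flax assemblies).
toy_scaffolds = [80, 70, 50, 40, 30, 20, 10]
print("N50:", nx(toy_scaffolds, 50))
print("N90:", nx(toy_scaffolds, 90))
```

Applied to the real scaffold lengths of an assembly, the same function would yield values comparable to the Scaffold N50 and Scaffold N90 columns of Table 1, assuming the same definition was used.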
Table 1
Assembly Statistics for Longya-10, Heiya-14, and Pale Flax
| Accession | Scaffold Number | Total Scaffold Length (bp) | Scaffold N50 (bp) | Scaffold N90 (bp) | Longest Scaffold (bp) | Total Gap Length (bp) |
| --- | --- | --- | --- | --- | --- | --- |
| Longya-10 | 1,865 | 305,975,888 | 1,235,007 | 270,149 | 4,613,305 | 5,817,576 |
| Heiya-14 | 2,748 | 303,668,802 | 699,937 | 156,528 | 3,040,329 | 2,841,264 |
| Pale flax | 2,609 | 293,538,124 | 383,912 | 88,775 | 3,507,611 | 5,635,035 |
Figure 1
Characterization of the Three Flax Genomes
The outermost to innermost tracks indicate GC content, repeat sequence density, gene density, noncoding RNA distribution, and colinear gene pairs (a set of quadruplicate collinear regions were highlighted). The outer to inner layers of each track indicate pale flax, Longya-10, and Heiya-14 data. See also Tables S11 and S12.
Genomic Comparison of Two Cultivars and Wild Pale Flax
We generated a phylogenetic tree combining our four sequenced genomes (an additional L. grandiflorum individual was also shotgun sequenced) and the available GenBank data of another ten Linum species, which supports the hypothesis that modern cultivated flax might have originated from pale flax (Allaby et al., 2005, Diederichsen and Hammer, 1995, Fu et al., 2002a, Fu et al., 2002b, Gill, 1966, Gill, 1987, Tammes, 1928) (Figure S6). We then explored the genomic variations between the two cultivars and pale flax to understand the molecular mechanisms underlying the selection of key agronomic traits during flax domestication. In the Longya-10 genome, a total of 3,623,057 single nucleotide variations (SNVs) and 555,580 insertions and deletions (InDels) were identified, and 3,686,366 SNVs and 557,691 InDels were identified in the Heiya-14 genome (Figure 2A and Table S14). Approximately 13.7% of the SNVs in Longya-10 and 14.2% of the SNVs in Heiya-14 fell into coding regions, more than half of which were nonsynonymous variations (covering more than 31,000 protein-coding genes in each genome; Figure 2B). In addition, 482 genes containing these nonsynonymous SNVs were positively selected in the two cultivars compared with pale flax (Table S15), and 23 of these genes are homologs of genes involved in oil and fiber biosynthesis (Table S16). Only 4.26% and 4.51% of InDels were located in CDS regions of the Longya-10 and Heiya-14 genomes (covering ∼11,000 genes in each genome), respectively (Figure 2B and Table S14).
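The split of coding SNVs into synonymous and nonsynonymous classes rests on translating the reference and altered codons. The following is a minimal sketch of that idea using the standard genetic code; it is not the annotation tool used in the study, and the function name `classify_snv` and the toy coding sequence are hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_snv(cds, position, alt_base):
    """Classify a single-base change in a coding sequence.

    cds      -- in-frame coding sequence (DNA, starting at the first codon base)
    position -- 0-based index of the changed base within cds
    alt_base -- the alternative base at that position
    Returns "synonymous" if the encoded amino acid is unchanged, otherwise
    "nonsynonymous" (this sketch counts stop gains and losses as nonsynonymous).
    """
    offset = position % 3
    codon_start = position - offset
    ref_codon = cds[codon_start:codon_start + 3].upper()
    alt_codon = ref_codon[:offset] + alt_base.upper() + ref_codon[offset + 1:]
    if CODON_TABLE[ref_codon] == CODON_TABLE[alt_codon]:
        return "synonymous"
    return "nonsynonymous"


# Toy CDS: ATG GCT TAA (Met, Ala, stop); illustrative only.
toy_cds = "ATGGCTTAA"
print(classify_snv(toy_cds, 5, "C"))  # GCT -> GCC, both alanine
print(classify_snv(toy_cds, 3, "A"))  # GCT -> ACT, alanine to threonine
```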
Figure 2
Genomic Variations between Longya-10, Heiya-14, and Pale Flax
(A) Distribution and density of genomic variations across the flax genomes. The outer to inner circles of each track show SNVs and InDels. The outer to inner layers of each track indicate variations between pale flax and Longya-10 and variations between Heiya-14 and Longya-10. See also Table S14.
(B) Distribution of SNVs and InDels in intergenic, intron, and CDS regions between pale flax and Longya-10 and pale flax and Heiya-14. In CDS, SNVs were classified into synonymous and nonsynonymous SNVs. See also Table S14.
(C) KEGG enrichment of genes carrying nonsynonymous SNVs between cultivars (Longya-10 and Heiya-14) and pale flax. An asterisk indicates a significantly enriched pathway. See also Table S20.
(D) KEGG enrichment of genes carrying InDels between cultivars (Longya-10 and Heiya-14) and pale flax. An asterisk indicates a significantly enriched pathway. See also Table S20.
(E–H) InDels in LuFCA, LuMYB83-1, LuALC, and LuLEC1. Gene structures of LuFCA, LuMYB83-1, LuALC, and LuLEC1 in Longya-10 are shown at the top (exons are shown in orange and introns as black lines); nucleotide and amino acid sequences are shown at the bottom. Red indicates InDels in Longya-10 and Heiya-14 compared with pale flax. At the bottom, the layers from top to bottom indicate pale flax, Longya-10, and Heiya-14. See also Table S18 and Figure S7.
To identify genomic variants that are likely important in flax domestication, we annotated the genes harboring the common nonsynonymous SNVs and InDels in the two cultivars. 
The results show that InDel variations occurred in the homologs of the flowering time-related gene FCA, the fruit dehiscence-related gene ALCATRAZ (ALC), the secondary cell wall biosynthesis-related gene MYB83, and the seed oil biosynthesis-related gene LEAFY COTYLEDON 1 (LEC1) during flax domestication (Figures 2E–2H and S7 and Tables S17 and S18) (Simpson et al., 2010, Rajani and Sundaresan, 2001, Zhong et al., 2007, Tang et al., 2018). Importantly, LuALC, a gene related to the MYC/bHLH family of transcription factors, carries a frameshift variation caused by a 4-bp insertion in the two cultivars compared with pale flax; LuMYB83-1, a homolog of the lodging-related gene AtMYB83, has a 21-bp insertion in the C-terminal domain in the two cultivars. These large-effect variations (nonsynonymous SNV, frameshift, premature stop, etc.) were possibly maintained from the original selection for favorable agronomic traits in flax domestication. Additionally, the expression levels of LuFCA, LuMYB83-1, and LuLEC1, but not LuALC, were markedly elevated in the two cultivars (Table S19 and Figure S8). In Arabidopsis, AtALC expression promotes cell separation in fruit dehiscence (Rajani and Sundaresan, 2001), whereas in cultivated flax, a low level of LuALC expression is maintained until fruit harvest. This reduced expression of LuALC may indicate selection for indehiscent flax lineages during flax cultivation. Functional enrichment analysis of genes carrying SNVs and InDels shows that genes involved in plant hormone signal transduction (ko04075, ko00905), pentose and glucuronate interconversions (ko00040), starch and sucrose metabolism (ko00500), and glycosphingolipid biosynthesis (ko00603) are significantly overrepresented (Figures 2C and 2D and Table S20), indicating that plant architecture (plant height, leaf shape, branching pattern, upright/prostrate habit, etc.), seed yield, and/or nutritional quality were the primary domestication objectives.
Divergence of the Cultivated Flax Population
Cultivated flax is divided into two major morphotypes: oil-use flax and fiber-use flax. To understand the genomic basis of the divergence of oil-use and fiber-use flax during improvement, we performed a population analysis using 83 flax accessions (24 landraces, 47 oil-use cultivars, and 12 fiber-use cultivars; Table S21 and Figure S9). Resequencing of these 83 accessions generated a total of 4.88 billion paired-end reads (∼615 Gb) with an average depth of 11.2× and coverage of 97.4%. By aligning all sequencing reads against the Longya-10 genome, a total of 2,245,463 SNPs and 394,658 InDels were detected in the 83 accessions (Tables S22 and S23). We constructed a phylogenetic tree and conducted a population structure analysis using whole-genome SNPs, which together indicate that the 83 flax accessions fall into three large groups corresponding to the landrace, oil-use, and fiber-use flax groups (Figures 3A and S10). These three groups were further validated by principal component analysis (Figure 3B). The phylogenetic tree and population structure analyses resolved a closer relationship between the oil-use group and the landrace group. Additionally, the lowest population diversity (π = 9.80 × 10⁻⁴) and the longest linkage disequilibrium (LD) decay distance (66.7 kb) were observed in the fiber-use flax group (Figures 3C and 3D). Climate oscillations and artificial directional selection on crop traits can dramatically diminish genetic diversity and in turn influence effective population size (Ne). Using SMC++ (Terhorst et al., 2017), we inferred that all three flax populations experienced sharp bottlenecks, mirrored by continual declines in Ne over the last 20,000 years, coinciding with the period of the Last Glacial Maximum (about 20,000 years ago) and the onset of flax cultivation (about 10,000 years ago, Figure S11, Kleman and Hättestrand, 1999, Hillman, 1975, van Zeist and Bakker-Heeres, 1975, Zohary and Hopf, 2000).
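Nucleotide diversity (π) and the LD statistic used for decay curves can both be computed from allele frequencies. The sketch below is illustrative only and is not the software used in the study; it assumes phased, biallelic haplotypes coded as 0/1, and the function names and the toy haplotype matrix are hypothetical.

```python
def nucleotide_diversity(haplotypes, sequence_length):
    """Per-site nucleotide diversity (pi) for biallelic sites.

    haplotypes      -- list of haplotypes, each a list of 0/1 alleles per SNP
    sequence_length -- number of sites surveyed (variant and invariant)
    Uses the unbiased estimator: sum over SNPs of 2p(1-p) * n/(n-1), divided
    by the surveyed sequence length.
    """
    n = len(haplotypes)
    total = 0.0
    for site in range(len(haplotypes[0])):
        p = sum(h[site] for h in haplotypes) / n
        total += 2 * p * (1 - p) * n / (n - 1)
    return total / sequence_length


def r_squared(haplotypes, site_a, site_b):
    """Squared allelic correlation (r^2) between two biallelic sites."""
    n = len(haplotypes)
    p_a = sum(h[site_a] for h in haplotypes) / n
    p_b = sum(h[site_b] for h in haplotypes) / n
    p_ab = sum(1 for h in haplotypes if h[site_a] == 1 and h[site_b] == 1) / n
    denominator = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denominator == 0:
        raise ValueError("both sites must be polymorphic")
    d = p_ab - p_a * p_b  # linkage disequilibrium coefficient D
    return d * d / denominator


# Toy data: four haplotypes, three SNPs (illustrative only).
toy_haplotypes = [
    [0, 0, 1],
    [0, 0, 0],
    [1, 1, 1],
    [1, 1, 0],
]
print(nucleotide_diversity(toy_haplotypes, sequence_length=1000))
print(r_squared(toy_haplotypes, 0, 1))  # perfectly correlated sites
print(r_squared(toy_haplotypes, 0, 2))  # independent sites
```

An LD decay curve such as the one in Figure 3D is then obtained by averaging r² over SNP pairs binned by physical distance; the binning scheme is left out of this sketch.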
Figure 3
Flax Populations
(A) A neighbor-joining tree of 83 flax accessions (24 landraces, 47 oil-use flax, and 12 fiber-use flax) using SNPs detected in whole-genome resequencing data.
(B) Principal component analysis plots of the first two components of 83 accessions.
(C) Nucleotide diversity (π) within groups and population divergence (FST) across groups.
(D) Decay of LD measured by r² for each of the three groups.
Selective Sweeps during Flax Improvement
Crop improvement frequently causes a drastic loss of diversity in genomic regions that contain genes conferring favorable agronomic traits (known as selective sweeps). To illuminate the molecular mechanisms underlying the divergence of traits in flax improvement, we identified potential selective sweeps by comparing the oil-use and fiber-use groups with the landrace group separately (designated as landrace-to-oil and landrace-to-fiber, respectively). A total of 108 putative selective sweeps (15.5 Mb in length, 1,958 genes) and 60 putative selective sweeps (8.2 Mb in length, 1,018 genes) were detected in the landrace-to-oil and landrace-to-fiber comparisons, respectively, among which 27 selective sweeps overlapped with each other (Tables S24, S25, and S26 and Figure S12).
Variations of genes in the selective sweeps unique to either oil-use or fiber-use flax might be specifically required for the improvement of oil or fiber properties. Therefore, we investigated the 1,547 and 780 genes in the unique sweeps of the landrace-to-oil and landrace-to-fiber comparisons, respectively. Annotations of the genes carrying large-effect variations show that oil-related genes encoding alpha biotin carboxyl carrier protein (LuBCCP), lipoxygenase (LuLOX), fatty acyl-ACP thioesterase A (LuFatA), lipid transfer protein (LuLTP), and the E2 component of the pyruvate dehydrogenase complex (LuPDH-E2), as well as the seed size-related genes BRASSINOSTEROID INSENSITIVE 2 (LuBIN2) and LuGW5, were detected in the landrace-to-oil comparison, whereas homologs of secondary cell wall biosynthesis-related genes (LuMYB46-1, LuXTH, and LuROPGAP3) and plant stem length-related genes (LuGA3ox, LuGA20ox, and LuGID1; Figures 4A, 4C–4E, and S13, Tables S27 and S28) were found in the landrace-to-fiber comparison. 
Along with the differential gene expression patterns associated with fatty acid and secondary cell wall biosynthesis during stem and seed development (Figure S14), these results illustrate that the direction and strength of artificial selection on oil-use and fiber-use flax diverged during modern flax breeding.
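Figure 4 defines selection signals by the top 5% of π ratio and FST values. A minimal sketch of such a joint-outlier window scan is given below. It is not the authors' implementation: the orientation of the π ratio (reference group over selected group), the per-window inputs, and the function names are assumptions made for illustration.

```python
def top_fraction_threshold(values, fraction):
    """Return the smallest value that still belongs to the top `fraction`."""
    ordered = sorted(values, reverse=True)
    k = max(1, int(len(ordered) * fraction))
    return ordered[k - 1]


def candidate_sweep_windows(pi_reference, pi_selected, fst, fraction=0.05):
    """Indices of windows in the top `fraction` for both pi ratio and FST.

    pi_reference -- per-window pi in the reference group (e.g. landraces)
    pi_selected  -- per-window pi in the selected group (e.g. oil-use flax)
    fst          -- per-window FST between the two groups
    A window with zero diversity in the selected group gets an infinite ratio.
    """
    ratios = [
        ref / sel if sel > 0 else float("inf")
        for ref, sel in zip(pi_reference, pi_selected)
    ]
    ratio_cutoff = top_fraction_threshold(ratios, fraction)
    fst_cutoff = top_fraction_threshold(fst, fraction)
    return [
        index
        for index, (ratio, f) in enumerate(zip(ratios, fst))
        if ratio >= ratio_cutoff and f >= fst_cutoff
    ]


# Toy scan over 20 windows (illustrative values only): window 7 has both
# strongly reduced diversity in the selected group and elevated FST.
toy_pi_reference = [1.0] * 20
toy_pi_selected = [1.0] * 20
toy_fst = [0.1] * 20
toy_pi_selected[7] = 0.1
toy_fst[7] = 0.8
print(candidate_sweep_windows(toy_pi_reference, toy_pi_selected, toy_fst))
```

Adjacent candidate windows would normally be merged into sweep regions before counting sweeps and genes; that merging step is omitted here.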
Figure 4
Detection and Functional Annotation of Selective Sweeps
(A and B) Selection signals in the landrace-to-oil and oil-to-fiber comparisons were defined by the top 5% of πratio and FST values (the genomic regions below and above the horizontal lines, respectively). The arrows indicate the genes associated with several important agronomic traits. (A) Landrace-to-oil comparison; (B) oil-to-fiber comparison.
(C–G) The πratio and FST values for candidate genes are shown at the top; the amino acid substitutions resulting from the large-effect SNP mutations in those candidate genes are shown at the bottom. Red indicates amino acid substitutions between landrace, oil, and fiber flax. Landrace, oil, and fiber flax groups are shown from the top to the bottom layers.
Considering that modern fiber-use flax cultivars were often bred from oil-use flax (Allaby et al., 2005, Fu et al., 2012), we also identified 47 potential selective sweeps (6.5 Mb in length, 867 genes) in the oil-to-fiber comparison, of which 50.9% (441/867 genes) are also in the selective sweeps found in the landrace-to-fiber comparison, suggesting that these genomic regions were continuously subjected to strong selective pressure during the improvement of fiber-use flax (Figures 4B, 4F, 4G, and S12, Tables S24, S25, and S26). The remaining genes, approximately half of the total (426/867 genes), were located only in the sweeps of the oil-to-fiber comparison. Annotations of these unique genes carrying large-effect variations identified the homologs of genes encoding endo-β-1,4-glucanase (LuKorrigan), pectin methyl esterase (LuPME), and copalyl pyrophosphate synthase (LuCPS) (Tables S27 and S28). 
These divergent selections in fiber-use flax, corroborated by the transcriptome analysis results (Figure S14), imply that multiple rounds of selection on diverse genomic loci contributed to the improvement of flax fiber properties.
To further investigate the contributions of selective sweeps to flax improvement, we compared our selective sweeps with previously reported quantitative trait locus/genome-wide association study (QTL/GWAS) loci (Soto-Cerda et al., 2014, Kumar et al., 2015; Xie et al., 2018). We found two oil-use selective sweeps that overlap with two QTLs for stearic acid and one fiber-use selective sweep that overlaps with a GWAS locus for stem length. Interestingly, we also found another three fiber-use selective sweeps that intersect with three oil biosynthesis QTL/GWAS loci. This phenomenon, in conjunction with the common selective sweeps found in the landrace-to-oil and landrace-to-fiber comparisons, implies a dual selection for oil-use and fiber-use traits, also called domestication/improvement of “syndrome” traits (Table S29 and Figure S15).
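Comparing selective sweeps with published QTL/GWAS loci amounts to an interval-overlap test on shared chromosomes. The following sketch shows one simple way to do this; it is not the procedure used in the study, and the function names and all coordinates are hypothetical.

```python
def intervals_overlap(a, b):
    """True if two (chromosome, start, end) intervals share at least one base.

    Coordinates are treated as closed intervals on the same chromosome.
    """
    return a[0] == b[0] and a[1] <= b[2] and b[1] <= a[2]


def overlapping_pairs(sweeps, loci):
    """Return every (sweep, locus) pair whose genomic intervals overlap."""
    return [(s, q) for s in sweeps for q in loci if intervals_overlap(s, q)]


# Hypothetical coordinates for illustration; not taken from Tables S24-S29.
toy_sweeps = [("chr1", 100, 200), ("chr2", 50, 80)]
toy_loci = [("chr1", 150, 300), ("chr2", 90, 120), ("chr3", 100, 200)]
for sweep, locus in overlapping_pairs(toy_sweeps, toy_loci):
    print(sweep, "overlaps", locus)
```

The pairwise loop is quadratic in the number of intervals, which is adequate for a few hundred sweeps and loci; larger sets would call for sorted intervals or an interval tree.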
Evolution of MYB46/MYB83 Genes and Their Roles in the Secondary Cell Wall Biosynthesis in Flax
Fibers are a type of specialized plant cell with a thickened secondary cell wall. It is well known that AtMYB46 and AtMYB83 are two master regulators of secondary cell wall biosynthesis in Arabidopsis (Zhong et al., 2007). Phylogenetic analysis of MYB46/MYB83 genes from eleven species revealed that at least two copies of MYB46/MYB83 existed within the ancestral lineages of eudicots, belonging to the MYB46 and MYB83 gene lineages, respectively (Figure S16). Subsequently, species-specific duplications of MYB46/MYB83 genes occurred in flax, poplar, apple, alfalfa, and cassava. In our study, four of the eight identified LuMYB46/LuMYB83 homologs displayed elevated expression in Longya-10 or Heiya-14 in comparison with pale flax (Table S19 and Figure S8). Additionally, many genomic variations of LuMYB46-1, LuMYB46-2, and LuMYB83-1 were found in cultivated flax. LuMYB83-1 carries a 21-bp insertion in the two cultivars in comparison with pale flax (Figure 2F), and LuMYB46-1 underwent strong selection during flax improvement (Figures 4B and 4G). LuMYB46-2 also has divergent insertion/deletion variations in Longya-10 and/or Heiya-14 (Figure S17). Because MYB46/MYB83 genes are important for secondary cell wall biosynthesis (Zhong et al., 2007, Zhong and Ye, 2012), the evolution of LuMYB46/LuMYB83 was likely essential in reshaping the biosynthesis of the secondary cell wall during flax domestication and improvement.
In flax, four pairs of MYB46/MYB83 sister genes are located in collinear genomic regions, and the latest split happened around the time of the most recent WGD (Ks = 0.13, Table S30), implying that this WGD event led to the latest expansion of MYB46/MYB83 genes in flax. A comparison of the collinear blocks between flax and grape supports the hypothesis that two additional block duplications caused the expansion of MYB46/MYB83 genes (Table S31 and Figure S18). 
The deteriorated collinearity between the non-sister blocks and the high Ks values of MYB46/MYB83 gene pairs (all Ks > 1 except for the sister MYB46/MYB83 gene pairs) seemingly exclude the possibility that the expansion of MYB46/MYB83 genes stemmed from the early WGD event (Ks = 0.77) or from other block duplications that happened in that period (Table S32). Of course, the divergence status of MYB46/MYB83 genes might be blurred by dynamic changes in evolutionary rate and by genome fractionation during repeated polyploidization and diploidization. Regardless of how and under what circumstances they duplicated, the expansion of MYB46/MYB83 genes provided potential activators of secondary cell wall biosynthesis. These MYB46/MYB83 homologs, also observed in several other plants, might be specifically required for secondary cell wall biosynthesis by regulating the expression of downstream genes (Zhao and Dixon, 2011, Zhong et al., 2007, Zhong and Ye, 2015). To test this hypothesis, we examined the expression of 49 genes associated with secondary cell wall biosynthesis in Longya-10, Heiya-14, and pale flax (Table S33). Of the 40 differentially expressed genes identified, eight showed more than a 10-fold increase in at least one cultivar, and the expression levels of three genes encoding xyloglucan endotransglycosylases/hydrolases, which participate in fiber elongation, were more than 100-fold higher in Heiya-14 than in Longya-10 and pale flax (Figure S19). A more comprehensive expression profile of 1,199 genes associated with secondary cell wall biosynthesis in Tianshuixian (a landrace accession), Longya-10, and Heiya-14 was further investigated using RNA sequencing (Figure S20). The results reveal that highly expressed genes tend to be enriched in Heiya-14, suggesting that artificial selection for fiber properties was intensified in fiber-use flax.
Discussion
A previous study produced a fragmented genome assembly for an oil-use cultivar, CDC Bethune, consisting of 88,384 scaffolds (116,602 contigs) (Wang et al., 2012). Recently, a chromosome-level assembly of the CDC Bethune genome was constructed using BioNano genome optical map technology (You et al., 2018). However, a large number of discontinuous contigs remained in the flax genome assembly. In this study, we de novo assembled the genome of another oil-use cultivar, Longya-10, reducing the number of contigs and scaffolds to 6,521 and 2,006, respectively (You et al., 2018), and 96.7% of the assembly could be further scaffolded into 15 pseudochromosomes by combining Hi-C interaction signals with a genetic map. This improved flax reference genome can deepen evolutionary genomics analyses. Under long-term artificial selection for beneficial agronomic traits, cultivated flax has acquired phenotypes distinct from those of pale flax: a decreased growing period (60 versus 300 days), indehiscent capsules, increased yield (∼5 versus ∼1 g/1,000 seeds), and modifications in plant architecture (upright versus prostrate; 70 versus 40 cm in plant height; ∼5 versus ∼70 in branching number). The genetic changes behind these phenotypic changes from pale flax to cultivated flax had not previously been examined by a genome-wide comparative analysis. With the aid of the assemblies of two flax cultivars and a pale flax in our study, we found 804 flax genes with large-effect variations whose homologs are considered to regulate domestication-related traits in plants (Badouin et al., 2017, Fang et al., 2017, Li et al., 2014, Varshney et al., 2017). Importantly, homologs of the FCA, ALC, LEC1, and MYB83-1 genes are important for flowering, fruit dehiscence, oil synthesis, and secondary cell wall biosynthesis, respectively. 
Published studies revealed that activated FCA promotes early flowering by repressing the mRNA accumulation of the floral repressor FLOWERING LOCUS C (FLC); overexpression of LEC1 in Arabidopsis and Arachis hypogaea can enhance the production of fatty acids; overexpression of MYB83 is capable of thickening secondary cell walls in the xylem vessels; and wild-type siliques in Arabidopsis form a nonlignified cell layer at the site of separation, whereas the alc mutant fails to differentiate such a cell layer, leading to the production of indehiscent fruits (Simpson et al., 2010, Tang et al., 2018, Zhu et al., 2018, McCarthy et al., 2009, Rajani and Sundaresan, 2001). The novel variations found in these genes in cultivated flax may help to reveal the early footprints of flax domestication. Additionally, based on the functional enrichment of genes with large-effect variations in the two cultivars compared with pale flax, we speculate that modified regulation of plant hormones (gibberellin and brassinosteroid) profoundly affected flax plant architecture during domestication.
The Ne analysis implies that the ancestors of flax experienced strong bottlenecks owing to prehistoric climatic oscillations and subsequent human selection. Furthermore, in agreement with previous studies, our population analysis supports the view that the domestication of oil-use flax preceded that of fiber-use flax, although the small number of fiber-use accessions (12) probably caused a loss of information on the pedigree relationships. It is noteworthy that most flax cultivars investigated to date are products of modern flax breeding programs since the 1900s, whereas landrace and oil-fiber dual-purpose flax are thought to be more closely related to the primitive domesticated flax lineages (Fu et al., 2012). As a consequence, the selective sweeps explored in our study can provide hints about modern oil-use and fiber-use flax improvement. 
As expected, oil-use and fiber-use flax have undergone divergent selection owing to their respective uses. Similar to previous studies of oil-use flax domestication history, the unique selective sweeps found in the landrace-to-fiber and oil-to-fiber comparisons imply divergent geographic origins or multiple rounds of selection for fiber-use characteristics, despite the monophyletic clustering of fiber-use flax in our population phylogeny. Unlike other crop progenitors, pale flax has a worldwide biogeographical distribution. Furthermore, because flax is a principal source of oil and fiber, its domestication started in prehistoric times (Zohary and Hopf, 2000). Therefore, it is likely that a suite of landrace flax populations formed independently in situ, from which oil-use and fiber-use flax were gradually domesticated and improved. Moreover, the repeated selection on the same genomic regions important for both oil and fiber characteristics suggests that a suite of domestication syndrome traits evolved together during flax cultivation.
The MYB transcription factor family participates in a wide range of biological processes in plants (Cominelli and Tonelli, 2009, Xie et al., 2010). MYB46 and MYB83, as master switch genes, can activate secondary cell wall biosynthesis in fibers and vessels (Zhong and Ye, 2012). In flax, the number of MYB46/MYB83 genes has expanded 4-fold since the divergence from the ancestral eudicot lineages, and the latest expansion of MYB46/MYB83 genes resulted from the most recent WGD event. The continual duplication and functional divergence of MYB46/MYB83 genes potentially shaped the unique regulation of secondary cell wall biosynthesis in flax. During domestication and improvement, agronomically beneficial variations in MYB46/MYB83 genes were retained by artificial selection in the oil-use and fiber-use flax populations, helping to make flax a popular crop worldwide. 
By uncovering genes with major effects on flax domestication and improvement, our data will facilitate future molecular breeding.
Limitations of the Study
Because no wild flax (pale flax) populations were available, the domestication history from pale flax to landrace flax was studied by genomic comparison between one pale flax assembly and two cultivated flax assemblies. Although the fiber-use flax accessions were gathered from four countries (Belgium, France, Holland, and China), genetic diversity within the fiber-use flax population may be substantially underestimated because only twelve individuals were investigated.
Methods
All methods can be found in the accompanying Transparent Methods supplemental file.
Authors: Zhiwen Wang; Neil Hobson; Leonardo Galindo; Shilin Zhu; Daihu Shi; Joshua McDill; Linfeng Yang; Simon Hawkins; Godfrey Neutelings; Raju Datla; Georgina Lambert; David W Galbraith; Christopher J Grassa; Armando Geraldes; Quentin C Cronk; Christopher Cullis; Prasanta K Dash; Polumetla A Kumar; Sylvie Cloutier; Andrew G Sharpe; Gane K-S Wong; Jun Wang; Michael K Deyholos Journal: Plant J Date: 2012-08-14 Impact factor: 6.417
Authors: Gordon G Simpson; Rebecca E Laurie; Paul P Dijkwel; Victor Quesada; Peter A Stockwell; Caroline Dean; Richard C Macknight Journal: Plant Cell Date: 2010-11-12 Impact factor: 11.277
Authors: Liubov V Povkhova; Nataliya V Melnikova; Tatiana A Rozhmina; Roman O Novakovskiy; Elena N Pushkova; Ekaterina M Dvorianinova; Alexander A Zhuchenko; Anastasia M Kamionskaya; George S Krasnov; Alexey A Dmitriev Journal: Plants (Basel) Date: 2021-11-28
Authors: Nadezhda L Bolsheva; Nataliya V Melnikova; Ekaterina M Dvorianinova; Liudmila N Mironova; Olga Y Yurkevich; Alexandra V Amosova; George S Krasnov; Alexey A Dmitriev; Olga V Muravenko Journal: Plants (Basel) Date: 2022-02-27