Sarah Schiessl1, Bruno Huettel2, Diana Kuehn2, Richard Reinhardt2, Rod Snowdon1. 1. Department of Plant Breeding, Justus Liebig University, IFZ Research Centre for Biosystems, Land Use and Nutrition, Heinrich-Buff-Ring 26-32, 35392 Giessen, Germany. 2. Max Planck Institute for Breeding Research, Carl-von-Linné-Weg 10, 50829 Cologne, Germany.
Abstract
Genetic models for polyploid crop adaptation provide important information relevant for future breeding prospects. A well-suited model is Brassica napus, a recent allopolyploid closely related to Arabidopsis thaliana. Flowering time is a major adaptation trait determining life cycle synchronization with the environment. Here we unravel natural genetic variation in B. napus flowering time regulators and investigate associations with evolutionary diversification into different life cycle morphotypes. Deep sequencing of 35 flowering regulators was performed in 280 diverse B. napus genotypes. High sequencing depth enabled high-quality calling of single-nucleotide polymorphisms (SNPs), insertion-deletions (InDels) and copy number variants (CNVs). By combining these data with genotyping data from the Brassica 60 K Illumina® Infinium SNP array, we performed a genome-wide marker distribution analysis across the 4 ecogeographical morphotypes. Twelve haplotypes, including Bna.FLC.A10, Bna.VIN3.A02 and the Bna.FT promoter on C02_random, were diagnostic for the diversification of winter and spring types. The subspecies split between oilseed/kale (B. napus ssp. napus) and swedes/rutabagas (B. napus ssp. napobrassica) was defined by 13 haplotypes, including genomic rearrangements encompassing copies of Bna.FLC, Bna.PHYA and Bna.GA3ox1. De novo variation in copies of important flowering-time genes in B. napus arose during allopolyploidisation, enabling sub-functionalisation that allowed different morphotypes to appropriately fine-tune their lifecycle.
Polyploid crops like wheat, potato, oats and rapeseed have been enormously successful as field crops because of their huge adaptation potential. Indeed, the fact that all flowering plants derive from ancient or recent polyploidisation events12 points to an enormous evolutionary advantage associated with polyploidy. On the other hand, most polyploid events do not lead to the successful establishment of a new species3. Understanding how polyploids achieve adaptive potential has important implications for breeding in the context of environmental change.
However, the complexity of polyploid genomes has considerably restricted large-scale genetic studies of polyploid species456, so broad conclusions are often drawn from diploid model plants like Arabidopsis thaliana. The polyploid crop most closely related to A. thaliana is rapeseed (Brassica napus), making it an excellent system for transferring information from the model to the crop. Despite its very recent origin and strong allopolyploidisation bottleneck6, rapeseed can be grown from boreal to subtropical and semi-arid areas, a result of strong differentiation into distinct morphotypes7.
The morphotype with the highest seed yields is the biennial winter oilseed type8. The prerequisites for this lifecycle are winter hardiness for winter survival, along with a vernalisation requirement to avoid pre-winter flowering7. In subtropical areas, cultivation of semi-winter types that can be vernalised at warmer temperatures is possible7. Boreal or semi-arid regions have periods of low plant survival, due either to severe winter freezing or to extreme heat stress. In these regions, annual spring types predominate. These are neither winter-hardy nor vernalisation-dependent, and the short growing season strongly limits yield potential. B. napus can also be grown as beet-like forms, known as swedes or rutabagas, which form a different subspecies (ssp. napobrassica)7. 
Swedes are generally of winter type, but have limited winter hardiness and require extended vernalisation to flower (Fig. 1). No wild forms of B. napus are known, hence the species is assumed to have arisen in cultivation7, with at least one origin believed to be as recent as a few hundred years ago9. The different cultivated forms are bred in separate breeding pools, and introgression between morphotypes is undertaken only where the expected benefit is very large, because it necessitates tedious backcrossing programs to restore the required ecogeographic adaptation characters10. Knowledge of the factors determining lifecycle traits like vernalisation requirement and flowering time is crucial for successful exchange of genetic material between B. napus gene pools10.
Figure 1
Schematic representation of the life cycles of the four different Brassica napus morphotypes.
Periods of cold required for vernalisation in the respective morphotypes are indicated by blue boxes. Relative seed production is indicated by the number of grains.
Although the mechanisms of vernalisation have been studied in depth in Arabidopsis, specific winter or spring alleles have not yet been defined for B. napus. The predominant assumption is that the underlying genetic mechanisms are identical or very similar across crucifer species. The allopolyploid B. napus carries two almost intact subgenomes from the ancestors Brassica rapa (A subgenome donor) and Brassica oleracea (C subgenome donor). Both ancestral subgenomes arose from a common, hexaploid ancestor, raising the theoretical copy number of Arabidopsis gene homologs to six. Due to post-polyploidisation genome reduction, the average gene copy number is 4.4 (ref. 11), although considerable variation has been observed among gene families, with copy numbers ranging from 1 to 12 (ref. 12). Homology-driven chromosome rearrangements during allopolyploidisation are a key driver of such variation612. Copy number variations (CNVs) have been found to impart large phenotypic influence in several plant species like Arabidopsis13, wheat14, potato15 and maize16, but also in domestic animals17 and humans18.
In Arabidopsis, FLOWERING LOCUS C (FLC) is the major repressor of the central flowering transcription factor FLOWERING LOCUS T (FT)19. FT cannot be expressed until FLC protein levels drop19; once this occurs, FT can be activated by the photoperiod pathway via the transcription factor CONSTANS (CO)20. Downregulation of FLC takes place at the transcriptional level. The FLC chromatin is modified and rearranged in order to stabilize a new inactive form2122. Different mechanisms are involved in the structural regulation of FLC gene activity, including both autonomous regulators and the vernalisation pathway2223. 
Three different mechanisms may exist for the breakdown of vernalisation requirement: (i) alteration of FLC regulating factors like FRIGIDA (FRI); (ii) alteration of FLC gene sequence or activity; (iii) alteration of FLC binding sites or FT promoter sequences. Arabidopsis annuals and biennials have been found to differ either in FRI or in FLC24, indicating that the Arabidopsis winter-spring split is governed by the first two levels of regulation. As a consequence, research on B. napus vernalisation has focused heavily on FLC homologs252627. Indeed, a number of QTL studies in different mapping populations have suggested FLC loci as candidates for flowering time in B. napus, including in populations without vernalisation requirement2829303132. Moreover, it has been reported that a transposon insertion in the first intron of Bna.FLC.A10 is associated with the vernalisation requirement of winter-type rapeseed27.
The aim of the present work was therefore to define morphotype-specific alleles or haplotypes that might further our understanding of vernalisation control in a complex allopolyploid, and simultaneously allow breeders to successfully select for desirable lifecycle traits. By comparing results of vernalisation experiments with data from genome-wide marker distribution analysis, targeted deep sequencing of essential flowering time regulators and the FT promoter, and coverage analysis to estimate CNV, we provide novel insights that reveal the complexity of post-polyploidisation morphological diversification in an important crop species.
Material and methods
Plant material and phenotyping
A panel of 280 genetically diverse B. napus inbred lines (selfed for 5 or more generations) was grown in Giessen, Germany (50° 35′ N, 8° 40′ E) in 2012. The plant material was part of the ERANET-ASSYST B. napus diversity set that has been described previously3334. Winter-type rapeseed and swede accessions were grown in autumn-sown trials, whereas spring-type and semi-winter accessions were grown in spring-sown trials. Plots were sown in a completely randomized block design with a harvest plot size of 3 × 1.25 m in a single replicate (containing around 200 plants).
In a separate experiment, a selection of 33 genotypes from the same set was grown in the greenhouse under semi-controlled conditions (20 °C). These genotypes were selected to represent spring, winter and swede material with different CNV patterns for Bna.FLC. Twenty seeds were sown in vermiculite and transplanted after one week into plates in soil, with 5 replicates per treatment. Four weeks after planting, these plants were either transferred to a climate chamber for vernalisation at 4 °C under short-day conditions for 6 weeks (mild vernalisation) or 12 weeks (strong vernalisation), or kept in the greenhouse (no vernalisation). The beginning of flowering (BBCH 61) was recorded daily for every single plant.
DNA isolation
Leaf material for genomic DNA extraction was harvested in spring 2012 from the field trial in Giessen, Germany. Pooled leaf samples were taken from at least 5 different plants per genotype, immediately shock-frozen in liquid nitrogen and kept at −20 °C until extraction. Leaf material was ground in liquid nitrogen with a mortar and pestle. DNA was extracted using a common CTAB protocol modified from Doyle and Doyle (1990) as described earlier12. DNA concentration was determined using a Qubit fluorometer and the Qubit dsDNA BR assay kit (Life Technologies, Darmstadt, Germany) according to the manufacturer’s protocol. DNA quantity and purity were further checked on a 0.5% agarose gel (3 V/cm, 0.5× TBE, 120 min).
Selection of target genes
As described previously12, a set of 29 A. thaliana flowering time genes was selected to cover the entire genetic network controlling flowering time, including circadian clock regulators (CYCLING DOF FACTOR 1 (CDF1), EARLY FLOWERING 3 (ELF3), GIGANTEA (GI) and ZEITLUPE (ZTL)), the input pathways for vernalisation (EARLY FLOWERING 7 (ELF7), EARLY FLOWERING IN SHORT DAYS (EFS), FLOWERING LOCUS C (FLC), FRIGIDA (FRI), SHORT VEGETATIVE PHASE (SVP), SUPPRESSOR OF FRIGIDA 4 (SUF4), TERMINAL FLOWER 2 (TFL2), VERNALISATION 2 (VRN2), VERNALISATION INSENSITIVE 3 (VIN3)), photoperiod sensitivity (CONSTANS (CO), CRYPTOCHROME 2 (CRY2), PHYTOCHROME A (PHYA), PHYTOCHROME B (PHYB)) and gibberellin (GIBBERELLIN-3-OXIDASE 1 (GA3ox1)), along with downstream signal transducers (AGAMOUS-LIKE 24 (AGL24), APETALA 1 (AP1), CAULIFLOWER (CAL), FLOWERING LOCUS D (FD), FLOWERING LOCUS T (FT), FRUITFUL (FUL), LEAFY (LFY), SQUAMOSA PROMOTER BINDING PROTEIN-LIKE 3 (SPL3), SUPPRESSOR OF CONSTANS 1 (SOC1), TEMPRANILLO 1 (TEM1), TERMINAL FLOWER 1 (TFL1)). In addition, we included CIRCADIAN CLOCK ASSOCIATED 1 (CCA1), FLAGELLIN-SENSITIVE 2 (FLS2), GLYCINE-RICH PROTEIN 7 (GRP7), GLYCINE-RICH PROTEIN 8 (GRP8), GORDITA (GORD) and SENSITIVITY TO RED LIGHT REDUCED 1 (SRR1).
A full list of gene names and putative functions is provided in Supplementary Table 1.
Bait development
In order to perform target enrichment, complementary sequences of 120 nt length were first developed for each target region. A group of 120mer oligonucleotide sequences covering a certain target region is hereinafter referred to as a bait group for that target region, while collectively all bait groups are referred to as the bait group pool. In the present study the bait group pool for the sequence capture, developed mainly using gene sequences from B. rapa or B. oleracea12, was modified in order to improve specificity. Enriched regions captured in our previous study12 were classified into target regions and non-target regions. The bait pool was then compared by BLAST against target and non-target regions with an E-value cut-off of 10⁻¹⁰. Baits which had excessive non-target hits were manually removed. This was the case for bait groups on FT, FUL and PHYA. For some bait groups (AP1, CO, SOC1), too many baits (>30%) were deleted. In these cases, bait groups were created using a pre-publication draft (version 4.0) of the B. napus ‘Darmor-Bzh’ reference genome sequence assembly, which was kindly made available prior to public release by INRA, France, Unité de Recherche en Génomique Végétale6, using the Agilent Genomic Workbench program SureDesign (Agilent Inc., Santa Clara, CA, USA). These replaced the corresponding bait groups developed previously using B. rapa or B. oleracea. Bait groups were created using the ‘Bait Tiling’ tool. The parameters were set as follows: Sequencing Technology: ‘Illumina’, Sequencing Protocol: ‘Paired-End long Read (75 bp+)’, ‘Use Optimized Parameters (Bait length 120, Tiling Frequency 1x)’, Avoid Overlap: ‘20’, ‘User defined genome’, ‘Avoid Standard Repeat Masked Regions’. Baits for genes on the minus-strand were developed in sense, while baits on the plus-strand were developed in antisense.
In total, 63 bait groups were created for B. rapa copies of the target genes, 71 bait groups for B. oleracea copies and 24 bait groups for B. napus copies.
Sequence capture and sequencing
Custom bait production was carried out by Agilent Technologies (Agilent Inc., Santa Clara, CA, USA) using the output oligonucleotide sequences from eArrayXD. Sequence capture was performed using the SureSelectXT 1 kb–499 kb Custom Kit (Agilent Inc., Santa Clara, CA, USA) according to the manufacturer’s instructions. The resulting TruSeq DNA library (Illumina Inc., San Diego, CA, USA) was sequenced on an Illumina HiSeq 2500 sequencer at the Max Planck Institute for Breeding Research (Cologne, Germany) in 100 bp single-read mode.
Sequence data analysis
Quality control of the raw sequencing data was performed using FASTQC. Reads were mapped onto version 4.1 of the B. napus ‘Darmor-Bzh’ reference genome sequence assembly6. Mapping was performed using the SOAPaligner algorithm35, with default settings and the option r = 0 to extract uniquely aligned reads. Removal of duplicates, sorting and indexing were carried out with samtools version 0.1.19 (ref. 36). Alignments were visualised using the IGV browser version 2.3.12 (ref. 37). Enriched regions and coverage differences were calculated using multiBamCov from the bedtools software38. Calling of single nucleotide polymorphisms (SNPs) was performed with the mpileup algorithm in the samtools toolkit. SNP and InDel annotation was performed using CooVar39. Target regions were defined using the gene annotation list from the B. napus ‘Darmor-Bzh’ v4.1 reference genome6 and BLAST position results of the bait pool (E-value cut-off 10⁻¹⁰⁰) on the mapping reference, and used to calculate the fraction of target covered. For InDel calling, a separate mapping using Bowtie2 (ref. 40) was performed, as described previously41. Removal of duplicates, sorting and indexing were carried out with samtools version 0.1.19. An initial InDel calling was performed using samtools mpileup, and realignment of reads around InDels was performed using GATK RealignerTargetCreator, version 3.1.1 (ref. 42). A final InDel calling was then performed as described above. InDels were filtered for a minimum mapping quality of 30 and a read depth of 10 or more using vcftools43.
Read coverage for each captured region was normalised as follows: normalised coverage = (number of reads per region × total length of genome) / (total number of aligned reads per genotype × average read length). 
Copy number variation (CNV) in a given target region was assumed if the ratio of normalised coverage(genotype)/normalised coverage(all genotypes) was smaller than 0.5 or higher than 1.5.
Sequencing data for 3 genotypes from a former experiment (Silona, Campino, Magres Pajberg)12 were analysed separately with the same pipeline to allow inclusion in the marker distribution analysis.
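The coverage normalisation and the 0.5/1.5 ratio thresholds described above can be illustrated with a minimal Python sketch. It is not the pipeline used in the study. The function names, the toy read counts and the choice of the population mean as "normalised coverage(all genotypes)" are assumptions made for illustration, as are the labels "deletion" and "duplication" for the two tails.

```python
def normalised_coverage(reads_in_region, genome_length,
                        total_aligned_reads, mean_read_length):
    """Normalised coverage as given in the text:
    (reads per region * genome length) / (aligned reads per genotype * read length)."""
    return (reads_in_region * genome_length) / (total_aligned_reads * mean_read_length)


def cnv_state(genotype_coverage, population_coverage, low=0.5, high=1.5):
    """Classify a region from the ratio genotype coverage / population coverage.
    The labels are illustrative names for the two tails of the ratio."""
    ratio = genotype_coverage / population_coverage
    if ratio < low:
        return "deletion"
    if ratio > high:
        return "duplication"
    return "no CNV"


# Toy numbers only, not data from the study.
per_genotype = {
    "line_A": normalised_coverage(1200, 1_000_000, 2_000_000, 100),
    "line_B": normalised_coverage(400, 1_000_000, 2_000_000, 100),
    "line_C": normalised_coverage(2600, 1_000_000, 2_000_000, 100),
}
# Assumption: the population reference is the mean over all genotypes.
population_mean = sum(per_genotype.values()) / len(per_genotype)
calls = {name: cnv_state(cov, population_mean)
         for name, cov in per_genotype.items()}
```

In this toy example the three lines have normalised coverages of 6, 2 and 13 against a population mean of 7, so only the first falls inside the 0.5 to 1.5 band.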
SNP genotyping and pre-processing
The 283 accessions were genotyped using the Brassica 60 K Illumina® Infinium SNP array by TraitGenetics GmbH (Gatersleben, Germany). We used the SNP positions published in ref. 44. Heterozygous calls were treated as missing values. In addition, we used the deep sequencing data to include all confidently called biallelic SNPs located in the analysed regions. Confidently called InDels were included by coding reference alleles as AA, insertions as CC, deletions as TT and heterozygous calls as missing values. The SNP matrix from the SNP array and the SNP and InDel data from deep sequencing were combined into a single marker file and sorted by position. The resulting marker set contained 43733 markers. After filtering for markers with a proportion of non-missing values >0.9 and a minor allele frequency >0.01, and for individuals (genotypes) with a proportion of non-missing markers >0.8, we retained 33944 unique SNP markers and a population of 271 individuals for marker distribution analysis. Data pre-processing was performed with R (version 3.1.0) using the package GenABEL45.
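The InDel recoding and the three filter thresholds described above can be sketched in plain Python as follows. The study itself used R with GenABEL, so this is only an illustration. The function names, the input labels ("ref", "ins", "del"), the dictionary layout and the order of the filters (markers first, then individuals) are assumptions, and the four-individual matrix is made up.

```python
from collections import Counter


def code_indel(call):
    """Recode an InDel call: reference -> AA, insertion -> CC, deletion -> TT.
    Anything else (e.g. a heterozygous call) becomes missing (None)."""
    return {"ref": "AA", "ins": "CC", "del": "TT"}.get(call)


def call_rate(values):
    """Proportion of non-missing entries."""
    if not values:
        return 0.0
    return sum(v is not None for v in values) / len(values)


def minor_allele_frequency(genotypes):
    """Minor allele frequency of a biallelic marker coded as e.g. 'AA' / 'CC'."""
    alleles = Counter(a for g in genotypes if g is not None for a in g)
    if len(alleles) < 2:
        return 0.0  # monomorphic or entirely missing
    return min(alleles.values()) / sum(alleles.values())


def filter_markers(matrix, min_marker_call=0.9, min_maf=0.01, min_ind_call=0.8):
    """matrix maps marker name -> list of genotypes, one per individual.
    Assumed order: filter markers first, then individuals."""
    kept = {m: g for m, g in matrix.items()
            if call_rate(g) > min_marker_call
            and minor_allele_frequency(g) > min_maf}
    n_individuals = len(next(iter(matrix.values())))
    kept_individuals = [i for i in range(n_individuals)
                        if call_rate([g[i] for g in kept.values()]) > min_ind_call]
    filtered = {m: [g[i] for i in kept_individuals] for m, g in kept.items()}
    return filtered, kept_individuals


# Made-up toy matrix with four individuals.
toy = {
    "m1": ["AA", "AA", "CC", "CC"],   # complete and polymorphic: kept
    "m2": ["AA", None, None, "AA"],   # call rate 0.5: dropped
    "m3": ["AA", "AA", "AA", "AA"],   # monomorphic: dropped
    "m4": ["AA", "CC", None, "CC"],   # call rate 0.75: dropped
}
filtered, kept_individuals = filter_markers(toy)
```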
Population structure
Population structure analysis and visualization were performed in R (version 3.1.0) using the package SelectionTools (http://fb09-pg-s207.agrar.uni-giessen.de/~frisch-m/), which applies principal component analysis based on genetic distances calculated according to the Euclidean modified Rogers’ distance method. The most likely number of population subclusters was determined to be 3 by plotting the within-cluster sum of squares against the possible number of clusters, ranging from 1 to 15. K-means clustering was then performed in R using SelectionTools.
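Two quantities named above, a modified Rogers' distance between lines and the within-cluster sum of squares used to choose the number of clusters, can be written out in a short Python sketch. It does not reproduce SelectionTools. The function names and the 0/1 coding of inbred lines are assumptions, and the textbook form of the modified Rogers' distance is used, which for biallelic markers reduces to the root mean squared difference in alternative-allele frequency.

```python
import math


def modified_rogers_distance(x, y):
    """x, y: alternative-allele frequencies per marker (0.0 or 1.0 for inbred
    lines), with None for missing calls. For biallelic markers the general
    form sqrt(1/(2m) * sum over loci and alleles of (p - q)^2) reduces to
    sqrt(1/m * sum (x - y)^2)."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    if not pairs:
        raise ValueError("no markers with calls in both lines")
    return math.sqrt(sum((a - b) ** 2 for a, b in pairs) / len(pairs))


def within_cluster_sum_of_squares(points, labels):
    """Sum of squared Euclidean distances of each point to its cluster centroid."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    total = 0.0
    for members in clusters.values():
        centroid = [sum(coord) / len(members) for coord in zip(*members)]
        total += sum((v - c) ** 2
                     for point in members
                     for v, c in zip(point, centroid))
    return total


# Toy coordinates standing in for principal component scores.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
wcss_one_cluster = within_cluster_sum_of_squares(points, [0, 0, 0, 0])
wcss_two_clusters = within_cluster_sum_of_squares(points, [0, 0, 1, 1])
```

Plotting this sum for cluster assignments with 1 to 15 clusters and looking for the point where it stops dropping steeply is the criterion the text describes; in the toy data it falls from 104 to 4 when the two obvious groups are separated.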
Marker distribution analysis
For every marker, we determined the frequency of the alternative allele in each morphotype pool. The ratio between the frequency of the allele in the winter pool (winter + swedes) and the spring pool (semi-winter + spring) was used to assign the allele as a winter or spring allele. If the ratio was <1, the alternative allele was denoted spring (s); if it was >1, the alternative allele was denoted winter (w). We first tested whether the marker would be suitable to explain a morphotype split, by comparing the observed distribution of w alleles in the winter pool (without swedes) and s alleles in the spring pool (without semi-winter) with the expected distribution (139/114), using a χ² test. Only markers which did not show significant deviation from this distribution (p-value > 0.1) were considered further. In the second step, we tested the distribution against a random distribution between the pools, by comparing the observed distribution of w alleles in winter/s alleles in spring/s alleles in winter/w alleles in spring against the expected random distribution of 69.5/57/69.5/57. We then considered the markers in the top 0.1% of −log(p-value) as split markers. The same was done for the swede and non-swede material.
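The two χ² steps above can be sketched in Python using only the standard library. The function names are hypothetical. Three further points are assumptions not stated in the text: the first test scales the 139/114 expectation to the number of observed calls, the second test uses three degrees of freedom, and the logarithm is base 10. With these assumptions the counts of the first row of Table 1 (132, 13, 4, 101) give a score close to the tabulated 39.3.

```python
import math


def chi2_statistic(observed, expected):
    """Pearson chi-square statistic."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def chi2_sf(x, df):
    """Upper-tail probability of the chi-square distribution (df = 1 or 3 only)."""
    if df == 1:
        return math.erfc(math.sqrt(x / 2))
    if df == 3:
        return (math.erfc(math.sqrt(x / 2))
                + math.sqrt(2 * x / math.pi) * math.exp(-x / 2))
    raise ValueError("only df = 1 and df = 3 are implemented in this sketch")


def passes_frequency_check(w_in_winter, s_in_spring,
                           n_winter=139, n_spring=114, alpha=0.1):
    """Step 1: keep markers whose w/s counts do not deviate from 139/114.
    Assumption: the expectation is scaled to the number of observed calls."""
    total = w_in_winter + s_in_spring
    pool = n_winter + n_spring
    expected = [total * n_winter / pool, total * n_spring / pool]
    p = chi2_sf(chi2_statistic([w_in_winter, s_in_spring], expected), df=1)
    return p > alpha


def split_score(w_in_winter, s_in_spring, s_in_winter, w_in_spring,
                n_winter=139, n_spring=114):
    """Step 2: -log10(p) against the random expectation 69.5/57/69.5/57.
    Assumptions: three degrees of freedom and a base-10 logarithm."""
    expected = [n_winter / 2, n_spring / 2, n_winter / 2, n_spring / 2]
    observed = [w_in_winter, s_in_spring, s_in_winter, w_in_spring]
    p = chi2_sf(chi2_statistic(observed, expected), df=3)
    return -math.log10(p)


# Counts for marker Bn-A01-p4641747 as listed in Table 1.
score = split_score(w_in_winter=132, s_in_spring=101, s_in_winter=4, w_in_spring=13)
suitable = passes_frequency_check(w_in_winter=132, s_in_spring=101)
```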
Results
Deep sequencing and variant calling
We defined regions as genomic regions with a mean coverage across the population of at least 10. In total, we analyzed 1184 regions, of which 637 were annotated as genes. Of these, 184 corresponded to the intended target genes. Two target gene copies of VERNALISATION INSENSITIVE 3 (VIN3) (Bna.VIN3.A01 and Bna.VIN3.C01) had insufficient coverage for this analysis and were not considered. Among the non-genic regions, we found 33 regions giving a BLAST hit to the FLOWERING LOCUS T (FT) promoter. A further 12 regions were identified as pseudogenes of the target genes. Regions assigned to one of these classes (target genes, target pseudogenes and FT promoter) were collectively termed target regions. A gene group is defined as all copies of a specific gene.
We called and annotated 13053 SNPs, of which 4806 were located in the target regions. InDel calling revealed a total of 1894 InDels, with 506 in the target regions. Only 25 InDels were frameshifts, amino acid insertions or splice variants. All gene groups showed potentially functional variation, i.e. at least one copy of the gene group carried either a non-synonymous SNP, stop codon mutation, amino acid insertion, splice variant or frameshift InDel. Altogether, only 7 copies were completely conserved, while 16 copies carried only silent or synonymous variation. Interestingly, no functional variation was observed in two copies of Bna.FLOWERING LOCUS C (FLC) (on chromosomes A02/C02) or in two copies of Bna.FT (also on chromosomes A02/C02). On the other hand, other copies of Bna.FLC (A03, A10) and Bna.FT (C06) carried a surprisingly large range of variation. Among the genes with frameshift variants were copies of Bna.FRIGIDA (FRI), Bna.PHYTOCHROME A (PHYA), Bna.EARLY FLOWERING IN SHORT DAYS (EFS), Bna.EARLY FLOWERING 7 (ELF7), Bna.PHYTOCHROME B (PHYB), Bna.VERNALISATION 2 (VRN2) and Bna.LEAFY (LFY) (Fig. 2). We also calculated copy number variation (CNV) based on read depth. 
No gene group was found without CNV, and only two lines were found which did not carry any CNV among the target copies. The distribution of SNPs, InDels and CNVs is shown in Fig. 2.
Figure 2
Distribution of SNPs/InDels (above) and CNV events (below) over all target gene copies.
The chromosomal locations of the copies are given below the common Arabidopsis gene name (white background), with colors representing the respective type of sequence variation observed (see color code below each diagram). Upper panel: Silent SNPs are not indicated if synonymous or non-synonymous SNPs are present in the same copy, and synonymous SNPs are not indicated if non-synonymous SNPs are present in the same copy. Lower panel: Gene copies showing two different colors are deleted in some lines and duplicated in some others.
Among the analysed population of 271 accessions, there were 139 winter-type accessions, 7 semi-winter-type accessions, 114 spring-type accessions and 11 swedes. Principal component analysis (PCA) of this population revealed a strong population substructure: the first principal component explained 24.1% of the variation, while further components explained 5.4, 2.5 and 2.1%, respectively (Fig. 3). The population fell into three main clusters: the first contained 137 winter-type accessions; the second contained 93 spring-type accessions and one semi-winter-type accession; and the third and most diverse cluster contained 11 swedes, 6 semi-winter-type accessions, 21 spring-type accessions and 2 winter-type accessions. We concluded that the winter material was genetically least diverse, while spring material was more diverse, followed by semi-winter and swede material. Overlap between the winter and spring pools was minimal, while all other types showed more overlap, although swedes were more distant from the winter and spring core clusters.
Figure 3
2D plots of PCA for the total population.
The explained variance is given in brackets. Colors indicate the cluster. Cluster 1 is shown in red, cluster 2 in green and cluster 3 in black. Letters indicate the morphotype: w for winter, s for spring, e for semi-winter and d for swedes.
To analyse marker distribution on a genome-wide scale, we used SNP data from the Brassica 60 K Illumina® Infinium SNP array and combined it with data from deep sequencing (SNPs and InDels). In order to find the most indicative marker patterns for the differential flowering behaviour of winter and spring material, we analyzed the differential marker pattern between the different morphotypes using the χ2 test. We first defined “winter” and “spring” alleles by allele frequencies in the different morphotypes and assessed their distribution in both pools. First, we excluded all markers with non-suitable allele frequencies. We regard all markers as non-suitable if their minor allele frequency was too low to explain a population split. This was tested in a foregoing χ2 test (see Methods). The remaining markers were tested against random distribution in the respective morphotypes. The same was done for swedes against non-swede accessions. Choosing a cut-off which considers the top 0.1% of markers (-log[p-value] = 38.9 for winter/spring and 55.8 for swedes), we detected 12 regions on chromosomes A01, A02, A03, A07, A09, A10, C03, C06 and C09 for the winter-spring split (Fig. 4), and 13 regions on chromosomes A03, A04, A06, A09, C01, C08 and C09 for the swede split.
Figure 4
Genome-wide distribution of χ2 p-values tested against equal distribution in winter and spring material.
The chromosomes are coloured differently. The solid lines indicate the marker cut-off threshold of 0.1%.
Analysis of split regions
We subsequently counted how many of the 12 winter-spring split regions had a clear winter or spring pattern in each genotype, i.e. the number of cases where every split marker in the haplotype corresponded to the winter or spring state (Table 1). The distribution of lines carrying clear winter and spring haplotypes is shown in Fig. 5. Mixed haplotypes were excluded here, as they account for less than 5% of the haplotypes. From this distribution, we concluded that characterizing these regions for their haplotype pattern is sufficient to distinguish winter from spring morphotypes, but not to distinguish semi-winter or swede morphotypes. The same analysis on the 13 split regions identified for the swede vs. non-swede split revealed a more explicit distribution (Table 2 and Fig. 6). Genotyping these loci is therefore sufficient to distinguish swede morphotypes from non-swedes.
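The per-genotype haplotype counting described above can be sketched in Python as follows. The codes ww, ss, ws and 00 are those used in the column headers of Table 1. Everything else is an assumption made for illustration: the function names, the treatment of a region with no called marker as a deletion (00), and ignoring individual missing calls when other markers in the region are called.

```python
def classify_haplotype(states):
    """states: one entry per split marker in a region, 'w' (winter allele),
    's' (spring allele) or None (no call).
    Assumptions: a region with no call at all is reported as a deletion ('00');
    single missing calls are ignored when other markers are called."""
    called = [s for s in states if s is not None]
    if not called:
        return "00"
    if all(s == "w" for s in called):
        return "ww"   # clear winter haplotype
    if all(s == "s" for s in called):
        return "ss"   # clear spring haplotype
    return "ws"       # mixed haplotype


def count_clear_regions(regions):
    """regions maps region name -> marker states for one genotype.
    Returns (number of clear winter regions, number of clear spring regions)."""
    classes = [classify_haplotype(states) for states in regions.values()]
    return classes.count("ww"), classes.count("ss")


# Made-up genotype with four split regions.
genotype = {
    "A01": ["w", "w", "w"],
    "A02": ["w", "s"],
    "A10": [None, None],
    "C09": ["s", "s", None],
}
clear_winter, clear_spring = count_clear_regions(genotype)
```

Tallying the two counts over all genotypes, grouped by morphotype, gives the kind of distribution summarised in Fig. 5.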
Table 1
Marker distributions of SNP markers associated with the winter-spring morphotype split in B. napus, along with the most closely associated markers from deep sequencing for the winter-spring split.
| Marker name | Chromosome | Position | Winter allele, winter pop | Winter allele, spring pop | Spring allele, winter pop | Spring allele, spring pop | −log(p) | Located in gene | Clear winter (ww) | Clear spring (ss) | Mixed (ws) | Deletions (00) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Split markers | | | | | | | | | | | | |
| Bn-A01-p4641747 | chrA01 | 4261277 | 132 | 13 | 4 | 101 | 39.3 | BnaA01g08820D | | | | |
| Bn-A01-p4641802 | chrA01 | 4261332 | 132 | 13 | 4 | 101 | 39.3 | BnaA01g08820D | 136 | 102 | 12 | 3 |
| Bn-A01-p4803773 | chrA01 | 4413051 | 128 | 9 | 6 | 105 | 39.8 | | | | | |
| Bn-A02-p3207085 | chrA02 | 695308 | 131 | 7 | 4 | 106 | 42.8 | | | | | |
| Bn-A02-p3208275 | chrA02 | 700390 | 129 | 7 | 5 | 106 | 41.7 | | | | | |
| Bn-A02-p3295898 | chrA02 | 786193 | 130 | 8 | 5 | 101 | 39.9 | BnaA02g01700D | 129 | 104 | 12 | 8 |
| Bn-A02-p3297592 | chrA02 | 787627 | 128 | 6 | 6 | 104 | 40.6 | | | | | |
| Bn-A02-p3299206 | chrA02 | 789246 | 130 | 6 | 5 | 105 | 42.1 | BnaA02g01710D | | | | |
| Bn-A02-p3300731 | chrA02 | 790766 | 128 | 6 | 6 | 105 | 40.9 | BnaA02g01710D | | | | |
| Bn-A02-p3302725 | chrA02 | 792753 | 126 | 6 | 7 | 105 | 39.8 | | | | | |
| Bn-A02-p3361391 | chrA02 | 849106 | 126 | 6 | 8 | 104 | 39.1 | BnaA02g01860D | | | | |
| Bn-A02-p5907701 | chrA02 | 3096806 | 136 | 12 | 1 | 99 | 41.9 | | | | | |
| Bn-A02-p5917045 | chrA02 | 3104382 | 137 | 11 | 2 | 101 | 42.9 | | 148 | 101 | 2 | 2 |
| Bn-A03-p6576575 | chrA03 | 5874703 | 126 | 3 | 9 | 111 | 42.6 | BnaA03g12910D | 129 | 113 | 9 | 1 |
| Bn-A03-p6636780 | chrA03 | 5928259 | 131 | 5 | 6 | 109 | 44.0 | | | | | |
| Bn-A03-p9836757 | chrA03 | 9057095 | 128 | 8 | 7 | 103 | 39.1 | | 136 | 110 | na | 7 |
| Bn-A07-p15352802 | chrA07 | 17269795 | 126 | 7 | 9 | 106 | 39.0 | BnaA07g22720D | 133 | 115 | na | 5 |
| Bn-A09-p30805314 | chrA09 | 28557636 | 125 | 4 | 9 | 107 | 40.2 | BnaA09g40670D | | | | |
| Bn-A09-p30805387 | chrA09 | 28557709 | 125 | 4 | 9 | 110 | 41.4 | BnaA09g40670D | | | | |
| Bn-A09-p30887157 | chrA09 | 28628531 | 131 | 11 | 5 | 103 | 39.9 | | 129 | 101 | 19 | 4 |
| Bn-A09-p30909393 | chrA09 | 28655613 | 131 | 9 | 4 | 105 | 41.7 | BnaA09g40920D | | | | |
| Bn-A09-p30918224 | chrA09 | 28662308 | 131 | 13 | 4 | 101 | 38.9 | | | | | |
| Bn-A09-p30921980 | chrA09 | 28664405 | 131 | 11 | 4 | 103 | 40.3 | BnaA09g40940D | | | | |
| Bn-A10-p7357442 | chrA10 | 9020292 | 128 | 7 | 9 | 106 | 39.8 | BnaA10g10600D | | | | |
| Bn-A10-p7357555 | chrA10 | 9020402 | 127 | 6 | 9 | 106 | 39.8 | BnaA10g10600D | 135 | 115 | 0 | 3 |
| Bn-scaff_17109_2-p79906 | chrA10 | 14916811 | 121 | 1 | 17 | 111 | 38.9 | BnaA10g21860D | 122 | 128 | na | 3 |
| Bn-scaff_16002_1-p1767743 | chrC03 | 12604057 | 127 | 7 | 9 | 105 | 39.0 | | 134 | 114 | na | 5 |
| Bn-scaff_18206_3-p62755 | chrC06 | 18959652 | 131 | 12 | 4 | 100 | 38.9 | | 143 | 104 | na | 6 |
| Bn-scaff_16912_1-p190291 | chrC09 | 12697195 | 129 | 9 | 8 | 104 | 39.0 | BnaC09g15770D | | | | |
| Bn-scaff_20836_1-p198809 | chrC09 | 12804839 | 129 | 8 | 9 | 105 | 39.4 | | | | | |
| Bn-scaff_20836_1-p198391 | chrC09 | 12805246 | 129 | 8 | 9 | 105 | 39.4 | | 137 | 112 | 2 | 2 |
| Bn-scaff_20836_1-p197940 | chrC09 | 12805697 | 130 | 8 | 9 | 105 | 39.8 | | | | | |
| Bn-scaff_20836_1-p197387 | chrC09 | 12806250 | 129 | 8 | 9 | 105 | 39.4 | | | | | |
| Bn-scaff_20836_1-p196601 | chrC09 | 12807036 | 129 | 8 | 9 | 104 | 39.0 | | | | | |
| Regions from deep sequencing | | | | | | | | | | | | |
| chrA02_3321143 | chrA02 | 3321143 | 127 | 9 | 12 | 105 | 37.2 | Bna.SRR1.A02 | | | | |
| chrA02_3862842 | chrA02 | 3862842 | 120 | 3 | 19 | 111 | 37.1 | Bna.VIN3.A02 | | | | |
| chrA03_5891342 | chrA03 | 5891342 | 126 | 6 | 13 | 107 | 38.3 | protein agamous-like 71 | | | | |
| chrA09_random_3749261 | chrA09_random | 3749261 | 126 | 12 | 13 | 102 | 34.4 | Bna.CCR1.A09_random | | | | |
| chrA10_14998726 | chrA10 | 14998726 | 127 | 10 | 12 | 104 | 36.5 | Bna.FLC.A10 | | | | |
| chrC02_random_990005 | chrC02_random | 990005 | 125 | 11 | 14 | 103 | 34.4 | Bna.FT.C02_random promoter | | | | |
The table shows the marker name, chromosomal position and the number of lines carrying either a winter or a spring allele in the respective winter-type and spring-type populations. The table also gives the −log(p-value) used to determine the split markers, along with the gene ID where the marker is located. If empty, the marker is non-genic. The markers with the highest −log(p-value) in each split region are shown in bold letters. The last four columns of the table show how many clear winter or spring haplotypes were counted, along with the number of mixed haplotypes and deletions. For regions only containing one marker, mixed haplotypes do not apply (na).
Figure 5
Distribution of clear winter haplotypes (above) and clear spring haplotypes (below) in the total population for all identified split regions.
Mixed haplotypes were not counted. The distribution on morphotypes is colour-coded.
Table 2
Marker distributions of SNP markers associated with the swede vs. non-swede morphotype split in B. napus, along with the most closely associated markers from deep sequencing for the swede vs. non-swede split.
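| Marker name | Chromosome | Position | Non-swede allele, non-swede pop | Non-swede allele, swede pop | Swede allele, non-swede pop | Swede allele, swede pop | −log(p) | Located in gene | Clear non-swede (nn) | Clear swede (ss) | Mixed (sn) | Deletions (00) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Split markers | | | | | | | | | | | | |
| chrA03_4639027 | chrA03 | 4639027 | 260 | 0 | 0 | 11 | 57.7 | Bna.VIN3.A03 | 260 | 11 | na | 0 |
| chrA04_12696607 | chrA04 | 12696607 | 0 | 8 | 260 | 3 | 55.8 | | 263 | 8 | na | 0 |
| chrA06_5607262 | chrA06 | 5607262 | 2 | 11 | 258 | 0 | 56.0 | Bna.CCR1.A06 | 257 | 6 | 8 | 0 |
| chrA06_5607744 | chrA06 | 5607744 | 0 | 9 | 260 | 2 | 56.3 | Bna.CCR1.A06 | | | | |
| chrA06_5608016 | chrA06 | 5608016 | 1 | 11 | 259 | 0 | 56.9 | Bna.CCR1.A06 | | | | |
| chrA06_5608089 | chrA06 | 5608089 | 1 | 11 | 259 | 0 | 56.9 | Bna.CCR1.A06 | | | | |
| chrA06_5614815 | chrA06 | 5614815 | 0 | 8 | 260 | 3 | 55.8 | | | | | |
| chrA08_14983629 | chrA08 | 14983629 | 260 | 3 | 0 | 8 | 55.8 | Bna.TEM1.A08 | 263 | 8 | na | 0 |
| chrA09_11993194 | chrA09 | 11993194 | 0 | 8 | 260 | 3 | 55.8 | BnaA09g19070D | 263 | 8 | 0 | 0 |
| chrA09_11993662 | chrA09 | 11993662 | 0 | 8 | 260 | 3 | 55.8 | BnaA09g19070D | | | | |
| chrA09_11995810 | chrA09 | 11995810 | 260 | 3 | 0 | 8 | 55.8 | | | | | |
| Bn-A09-p21922383 | chrA09 | 19312044 | 0 | 8 | 260 | 3 | 55.8 | | 263 | 8 | na | 0 |
| chrA09_32435440 | chrA09 | 32435440 | 2 | 11 | 258 | 0 | 56.0 | Bna.PHYA.A09 | 258 | 9 | 4 | 0 |
| chrA09_32435455 | chrA09 | 32435455 | 2 | 11 | 258 | 0 | 56.0 | Bna.PHYA.A09 | | | | |
| chrA09_32437048 | chrA09 | 32437048 | 0 | 10 | 260 | 1 | 56.9 | Bna.PHYA.A09 | | | | |
| chrA09_32441986 | chrA09 | 32441986 | 260 | 2 | 0 | 9 | 56.3 | BnaA09g48430D | | | | |
| chrA10_17106726 | chrA10 | 17106726 | 0 | 8 | 260 | 3 | 55.8 | Bna.ELF7.A10 | 263 | 8 | 0 | 0 |
| chrA10_17106744 | chrA10 | 17106744 | 0 | 8 | 260 | 3 | 55.8 | Bna.ELF7.A10 | | | | |
| chrA10_17108533 | chrA10 | 17108533 | 260 | 3 | 0 | 8 | 55.8 | Bna.ELF7.A10 | | | | |
| chrC01_1447013 | chrC01 | 1447013 | 0 | 8 | 260 | 3 | 55.8 | | 262 | 8 | 1 | 0 |
| chrC01_1447235 | chrC01 | 1447235 | 0 | 9 | 260 | 2 | 56.3 | Bna.FD.C01 | | | | |
| chrC01_1447273 | chrC01 | 1447273 | 260 | 2 | 0 | 9 | 56.3 | Bna.FD.C01 | | | | |
| chrC01_1447516 | chrC01 | 1447516 | 0 | 9 | 260 | 2 | 56.3 | Bna.FD.C01 | | | | |
| chrC01_1447693 | chrC01 | 1447693 | 260 | 2 | 0 | 9 | 56.3 | Bna.FD.C01 | | | | |
| chrC01_1447972 | chrC01 | 1447972 | 0 | 9 | 260 | 2 | 56.3 | Bna.FD.C01 | | | | |
| Bn-scaff_16770_1-p1357882 | chrC08 | 24087909 | 0 | 8 | 260 | 3 | 55.8 | | 263 | 8 | na | 0 |
| chrC08_36752901 | chrC08 | 36752901 | 260 | 1 | 0 | 10 | 56.9 | BnaC08g42670D | 259 | 9 | 3 | 0 |
| chrC08_36752954 | chrC08 | 36752954 | 0 | 10 | 260 | 1 | 56.9 | BnaC08g42670D | | | | |
| chrC08_36753586 | chrC08 | 36753586 | 1 | 10 | 259 | 1 | 56.1 | | | | | |
| chrC08_36754224 | chrC08 | 36754224 | 0 | 9 | 260 | 2 | 56.3 | BnaC08g42680D | | | | |
| chrC08_36755379 | chrC08 | 36755379 | 1 | 10 | 259 | 1 | 56.1 | | | | | |
| chrC08_36755562 | chrC08 | 36755562 | 0 | 9 | 260 | 2 | 56.3 | | | | | |
| Bn-scaff_16389_1-p12505 | chrC08 | 38113567 | 0 | 8 | 260 | 0 | 56.8 | | 260 | 8 | na | 3 |
| chrC09_43739821 | chrC09 | 43739821 | 0 | 8 | 260 | 3 | 55.8 | Bna.CO-li.C09 | 263 | 8 | na | 0 |
| Regions from deep sequencing | | | | | | | | | | | | |
| chrA01_random_477115 | chrA01_random | 477115 | 5 | 8 | 255 | 3 | 51.6 | mads-box protein | | | | |
| chrA03_6053137 | chrA03 | 6053137 | 8 | 9 | 252 | 2 | 49.6 | Bna.FRI.A03 | | | | |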
chrA03_6243410
chrA03
6243410
12
8
248
2
46.2
Bna.FLC.A03
chrA04_12695445
chrA04
12695445
3
8
257
3
53.3
Bna.ELF3.A04
chrA05_5425314
chrA05
5425314
1
8
259
3
55.0
Bna.SPL3.A05
chrA05_9211460
chrA05
9211460
4
9
256
2
52.9
ubiquitin-conjugating enzyme family protein
chrA07_23775522
chrA07
23775522
3
8
257
3
53.3
cinnamoyl- reductase 2-2
chrA10_1357187
chrA10
1357187
1
8
259
3
55.0
Bna.CRY2.A10
chrA10_13359226
chrA10
13359226
1
9
259
2
55.4
Bna.CO.A10
chrA10_14998679
chrA10
14998679
3
10
257
1
54.4
Bna.FLC.A10
chrAnn_random_610372
chrAnn_random
610372
10
11
250
0
49.4
Bna.TFL1.Ann.random
chrAnn_random_20504534
chrAnn_random
20504534
253
3
7
8
49.9
Bna.VRN2.Ann.random
chrC03_8403949
chrC03
8403949
12
8
248
3
45.9
Bna.FLC.C03
chrC03_random_5400150
chrC03_random
5400150
259
2
1
9
55.4
Bna.FD.C03.random
The table shows the marker name, chromosomal position and the number of lines carrying either a non-swede or a swede allele in the respective non-swede and swede populations. The table also gives the −log(p-value) used to determine the split markers, along with the gene ID or the name of the gene where the marker is located. If empty, the marker is non-genic. The markers with the highest −log(p-value) in each split region are shown in bold letters. The last four columns of the table show how many clear non-swede or swede haplotypes were counted, along with the number of mixed haplotypes and deletions. For regions only containing one marker, mixed haplotypes do not apply (na).
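The statistical test behind the −log(p) column is not restated in this section. Purely as an illustration of how an allele-by-population count table can be turned into such a score, the sketch below applies a two-sided Fisher's exact test to the four counts of one marker. The choice of test, the base-10 logarithm and the function names are assumptions, and the sketch does not attempt to reproduce the values reported in Table 2.

```python
from math import comb, log10


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Rows are the two alleles, columns the two populations.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2
    denom = comb(n, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x lines in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum every table that is no more probable than the observed one.
    total = sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_obs * (1 + 1e-9)
    )
    return min(1.0, total)


def neg_log10_p(a: int, b: int, c: int, d: int) -> float:
    """-log10 of the Fisher p-value (base 10 is an assumption)."""
    return -log10(fisher_exact_two_sided(a, b, c, d))


# Counts of marker chrA03_4639027 from Table 2:
# non-swede allele 260 / 0, swede allele 0 / 11 (non-swede pop / swede pop).
print(round(neg_log10_p(260, 0, 0, 11), 1))
```

A table with a perfect separation such as this one receives the smallest p-value its margins allow, which is why markers carried by all swedes and no other line rank at the top of the split markers.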
Figure 6
Distribution of clear non-swede haplotypes (above) and clear swede haplotypes (below) in the total population for all 13 identified split regions.
Mixed haplotypes were not counted. The distribution on morphotypes is colour-coded.
In order to exclude candidates for the respective morphotype split, we specifically looked at the variant distributions from deep sequencing in our marker set. Because these were derived from sequence data, a poorly fitting distribution excludes the sequence from being a major cause for this morphotype, as sequencing covers the total variation of a gene. This is not the case for data from the SNP array, as even genic SNPs are not always completely predictive for their neighbouring SNPs. For the winter-spring split, only 6 sequenced regions with a distribution comparable to the detected split markers could be found, among them Bna.VIN3.A02, Bna.FLC.A10 and the Bna.FT promoter on the non-assembled scaffolds of C02_random (Table 1). With the exception of Bna.FLC.A10 (R10P mutation), all those SNPs are either synonymous or located in an intron. For the non-swede vs. swede split, we found 17 sequenced regions carrying variants with an acceptable distribution, for example three copies of Bna.FLC, two copies of Bna.CO and a further copy of Bna.FLOWERING LOCUS D (FD) (Table 2).
Upstream of the gene Bna.FT on C02_random we found two regions, spanning 4622 and 4904 bp, respectively, which retrieved BLAST hits to the A02 or C02 copies of the Bna.FT promoter listed by NCBI. We therefore identified these sequences as the promoter of Bna.FT on C02_random. Both sequences contain a CArG box core motif, whereby the first sequence also contains 3 additional FLC binding sites known from A. thaliana [46] and the second sequence contains 2 such FLC binding sites. No SNP is located in those motifs. We found that most (143 of 145) winter types are unchanged in both sequences or carry only minor changes, whereas most (71 of 116) of the spring population carried one of two distinctive haplotype patterns involving a SNP at position C02_random:980227 (Fig. 7). These patterns were shared by only two putative winter-type accessions, one of which is an exotic accession that may not need vernalisation, whereas the other is an accession which the vernalisation experiments revealed to have vernalisation-independent flowering (see below).
Figure 7
Haplotype distribution for the Bna.FT promoter on C02_random.
The distribution on morphotypes is colour-coded. The haplotype patterns are provided in the text.
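The check that no SNP falls inside a CArG box can be sketched as a simple motif scan. The sketch assumes the textbook CArG consensus CC[A/T]6GG; the exact motif definition used in the study is not given in this section, and the toy sequence and function names are hypothetical.

```python
import re
from typing import List, Tuple

# Assumption: textbook CArG-box consensus CC[A/T]6GG. The pattern equals its
# own reverse complement, so scanning one strand is sufficient.
CARG_BOX = re.compile(r"CC[AT]{6}GG")


def find_carg_boxes(sequence: str) -> List[Tuple[int, str]]:
    """Return (1-based start, matched motif) for every CArG-box match."""
    seq = sequence.upper()
    return [(m.start() + 1, m.group(0)) for m in CARG_BOX.finditer(seq)]


def snp_hits_motif(snp_pos: int, hits: List[Tuple[int, str]]) -> bool:
    """True if a 1-based SNP position lies inside any reported motif."""
    return any(start <= snp_pos < start + len(motif) for start, motif in hits)


# Toy sequence for illustration only, not the Bna.FT promoter.
TOY_PROMOTER = "ACGTCCAAATTTGGTTGACCTTTTTTGGA"
hits = find_carg_boxes(TOY_PROMOTER)
print(hits)
print(snp_hits_motif(16, hits))
```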
We furthermore compared the numbers of deletion and duplication events in the different morphotype pools. For the winter-spring split, we found no specific pattern for the total population. In contrast, we found several patterns of deletions and duplications which were almost exclusive to swedes, concerning split regions on A08, A09, A10, C08 and C09 (Table 3). The regions on A09/C08 (containing copies of Bna.PHYA and Bna.GIBBERELLIN 3 OXIDASE 1 (GA3ox1)) and A10/C09 (containing copies of Bna.FLC) are homoeologous to each other. Some of these regions, particularly those on C08, appear to involve larger homoeologous exchanges that probably affect not only the detected genes.
Table 3
Distribution of deletion and duplication events in the non-swede and swede populations.
| Gene ID | Chromosome | start | stop | deletions (nonswede population) | deletions (swede population) | duplications (nonswede population) | duplications (swede population) | mean coverage | Gene name |
|---|---|---|---|---|---|---|---|---|---|
| BnaA08g15780D | chrA08 | 13097823 | 13098361 | 11 | 9 | 0 | 0 | 1641.5 | no annotation |
| BnaA09g48410D | chrA09 | 32434233 | 32438771 | 31 | 0 | 10 | 8 | 1136.2 | Bna.PHYA.chrA09 |
| BnaA09g57140D | chrA09_random | 4043861 | 4045464 | 1 | 0 | 4 | 8 | 1675.6 | Bna.GA3ox.chrA09.random |
| BnaA10g22080D | chrA10 | 14998617 | 15003197 | 2 | 0 | 2 | 9 | 1321.5 | Bna.FLC.chrA10 |
| BnaC08g38580D | chrC08 | 34776298 | 34779240 | 9 | 8 | 1 | 0 | 1226.1 | Bna.CCR1.chrC08 |
| BnaC08g38810D | chrC08 | 34907098 | 34908735 | 4 | 9 | 0 | 0 | 1581.1 | Bna.GA3ox.chrC08 |
| BnaC08g42660D | chrC08 | 36746642 | 36751390 | 6 | 10 | 15 | 0 | 1537.7 | Bna.PHYA.chrC08 |
| BnaC08g42670D | chrC08 | 36752307 | 36753108 | 7 | 9 | 14 | 0 | 1660.2 | germin like protein |
| BnaC09g46500D | chrC09 | 46345350 | 46350092 | 2 | 9 | 13 | 0 | 1096.9 | Bna.FLC.chrC09 |
| BnaC09g46540D | chrC09 | 46366645 | 46371180 | 3 | 9 | 11 | 0 | 1031.5 | Bna.FLC.chrC09 |
The table shows the gene ID, chromosomal position and the number of lines in the respective non-swede and swede populations which carry either a deletion or duplication. The table also gives the mean coverage of the respective gene and the common gene name.
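To make the phrase "almost exclusive to swedes" concrete, the counts in Table 3 can be converted into per-population frequencies. The sketch below does this for four rows of the table. The population sizes (260 non-swede and 11 swede lines) are taken from the allele counts in Table 2 and are an assumption for Table 3; the thresholds and names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class CnvCounts:
    gene_id: str
    del_nonswede: int
    del_swede: int
    dup_nonswede: int
    dup_swede: int


# Counts copied from four of the ten rows of Table 3.
TABLE3_ROWS = [
    CnvCounts("BnaA09g48410D", 31, 0, 10, 8),  # Bna.PHYA.chrA09
    CnvCounts("BnaA10g22080D", 2, 0, 2, 9),    # Bna.FLC.chrA10
    CnvCounts("BnaC08g42660D", 6, 10, 15, 0),  # Bna.PHYA.chrC08
    CnvCounts("BnaC09g46500D", 2, 9, 13, 0),   # Bna.FLC.chrC09
]

# Assumption: population sizes inferred from the allele counts in Table 2.
N_NONSWEDE = 260
N_SWEDE = 11


def is_swede_enriched(
    nonswede_events: int,
    swede_events: int,
    min_swede_freq: float = 0.5,     # hypothetical threshold
    max_nonswede_freq: float = 0.1,  # hypothetical threshold
) -> bool:
    """True if an event is frequent in swedes and rare in non-swedes."""
    return (
        swede_events / N_SWEDE >= min_swede_freq
        and nonswede_events / N_NONSWEDE <= max_nonswede_freq
    )


def summarise(row: CnvCounts) -> Dict[str, bool]:
    return {
        "deletion": is_swede_enriched(row.del_nonswede, row.del_swede),
        "duplication": is_swede_enriched(row.dup_nonswede, row.dup_swede),
    }


for row in TABLE3_ROWS:
    print(row.gene_id, summarise(row))
```

Under these assumptions the two A-subgenome genes are flagged for swede-enriched duplications and the two C-subgenome genes for swede-enriched deletions.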
For the winter-spring split, we found 234 genes with flowering-related gene ontology terms within 1 Mb of one of the diagnostic split markers. Examples are shown in Table 4. In the 13 regions showing a split between swedes and non-swedes, we found 260 candidate genes within 1 Mb. In this analysis, several split markers lay directly in candidate genes covered by deep sequencing, for example Bna.VIN3, Bna.PHYA (2 copies), Bna.TEMPRANILLO1 (TEM1), Bna.ELF7, Bna.CONSTANS (CO), Bna.CO-like or Bna.CINNAMOYL COA REDUCTASE 1 (CCR1) (Table 5). Some of those markers were non-synonymous SNPs (in Bna.CCR1, Bna.TEM1, Bna.CO, Bna.CO-like), whereas others were either synonymous or located in introns or untranslated regions (UTRs).
Table 4
Selected candidate genes for all 12 regions associated with the split between winter-type and spring-type B. napus accessions.
| Chromosome | Marker with highest −log(p-value) | Candidate | Distance from closest split marker [kbp] |
|---|---|---|---|
|  |  | chromatin structure-remodeling complex protein syd | 445.8 |
|  |  | protein early flowering 4-like | 164.3 |
|  |  | btb poz domain-containing protein | 48.3 |
| chrA07 | Bn-A07-p15352802 | two-component response regulator arr5 | 352.9 |
|  |  | sin3 histone deacetylase complex | 654.6 |
|  |  | protein argonaute 7-like | 894.1 |
|  |  | floral homeotic protein apetala 1 | 964.5 |
| chrA09 | Bn-A09-p30909393 | protein early flowering 3-like | 17.9 |
|  |  | e3 ubiquitin-protein ligase orthrus 2 | 67.9 |
|  |  | dna methyltransferase | 727.3 |
| chrA10 | Bn-A10-p7357555 | topoisomerase i | 651.5 |
|  |  | swinger | 285.1 |
|  |  | histone acetyltransferase type b catalytic subunit-like | 187.2 |
| chrA10 | Bn-scaff_17109_2-p79906 | transcription factor hy5 | 261.7 |
|  |  | flowering-promoting factor 1-like | 99.2 |
|  |  | flowering locus c | 81.8 |
| chrC03 | Bn-scaff_16002_1-p1767743 | btb poz domain-containing protein | 67.5 |
|  |  | two-component response regulator arr16 | 12.8 |
|  |  | phy rapidly regulated 1 | 743.0 |
| chrC06 | Bn-scaff_18206_3-p62755 | histone z | 939.5 |
|  |  | squamosa-promoter binding protein | 213.7 |
|  |  | shatterproof1 | 749.7 |
| chrC09 | Bn-scaff_20836_1-p197940 | set domain isoform 1 | 845.4 |
The table lists the chromosome and the marker with the most significant deviation from the expected random distribution, together with candidate genes selected based on gene ontology and literature. The last column specifies the distance of the gene to the closest marker within the split-associated region. The candidate gene closest to the split region is shown in italics.
Table 5
Selected candidate genes for all defined regions associated with the split between swede and non-swede B. napus morphotypes.
| Chromosome | Marker with highest −log(p-value) | Candidate | Distance from closest split marker [kbp] |
|---|---|---|---|
| chrA03 | chrA03_4639027 | vernalization insensitive 3 | 0.0 |
|  |  | histone acetyltransferase type b catalytic subunit | 163.4 |
| chrA04 | chrA04_12696607 | protein ovule abortion 4 | 45.0 |
|  |  | protein early flowering 3-like | 0.1 |
|  |  | terminal flowering 1 protein 1 | 413.7 |
| chrA06 | chrA06_5608089 | gibberellin 3-oxidase | 181.8 |
|  |  | cinnamoyl- reductase | 0.0 |
|  |  | transcriptional factor b3 family protein | 248.7 |
|  |  | histone acetyltransferase hac12 | 263.1 |
|  |  | protein elf4-like 4 | 473.9 |
| chrA08 | chrA08_14983629 | ap2-erebp rave subfamily protein rav2 | 0.0 |
|  |  | cycling dof factor 2 | 170.7 |
|  |  | cullin 3 | 187.3 |
| chrA09 | chrA09_11993662 | cullin 4 | 982.5 |
|  |  | della protein | 348.6 |
|  |  | agamous-like mads-box protein agl3 | 367.7 |
| chrA09 | Bn-A09-p21922383 | mads-box protein gordita | 875.9 |
|  |  | protein suppressor of fri 4 | 759.6 |
|  |  | nuclear transcription factor y subunit a-7 | 151.7 |
| chrA09 | chrA09_32437048 | phytochrome a | 0.0 |
|  |  | phytochrome interacting factor 3 | 11.9 |
|  |  | histone-lysine n-methyltransferase atx2 | 810.9 |
| chrA10 | chrA10_17106744 | probable lysine-specific demethylase elf6-like | 457.6 |
|  |  | protein lhy cca1-like 1 | 51.2 |
|  |  | pseudo-response regulator 7 | 41.0 |
|  |  | protein early flowering 7 | 0.0 |
| chrC01 | chrC01_1447516 | bzip transcription factor | 0.0 |
| chrC08 | Bn-scaff_16770_1-p1357882 | dek domain-containing chromatin associated protein | 942.4 |
| chrC08 | chrC08_36752954 | phytochrome a | 1.5 |
|  |  | phytochrome interacting factor 3 | 10.5 |
|  |  | medea | 551.6 |
| chrC08 | Bn-scaff_16389_1-p12505 | histone-lysine n-methyltransferase atx2 | 434.0 |
|  |  | dna helicase | 405.8 |
|  |  | tata-box-binding protein 2 | 107.5 |
| chrC09 | chrC09_43739821 | chromo domain-containing protein lhp1-like | 985.9 |
|  |  | della protein | 866.0 |
|  |  | col1 protein | 0.0 |
|  |  | coa | 5.9 |
|  |  | sepallata2 | 47.5 |
The table lists the chromosome and the marker with the most significant deviation from the expected random distribution, together with candidate genes selected based on gene ontology and literature. The last column specifies the distance of the gene to the closest marker within the split-associated region. A value of “0.0” indicates that the marker lies within the gene. The candidate gene closest to the split region is shown in italics.
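The distance column of Tables 4 and 5 can be illustrated with a short sketch that measures the gap between a gene and the closest split marker on the same chromosome and reports 0.0 when a marker lies within the gene. Measuring from the gene boundaries, rounding to one decimal and the function name are assumptions; the coordinates are taken from Tables 2 and 3.

```python
from typing import Iterable, Optional


def distance_to_closest_marker_kbp(
    gene_start: int,
    gene_stop: int,
    marker_positions: Iterable[int],
    window_bp: int = 1_000_000,
) -> Optional[float]:
    """Distance in kbp from a gene to the closest marker on its chromosome.

    Returns 0.0 if a marker lies within the gene, and None if no marker
    lies within `window_bp` of the gene (1 Mb, as in the text).
    """
    best: Optional[int] = None
    for pos in marker_positions:
        if gene_start <= pos <= gene_stop:
            dist = 0
        elif pos < gene_start:
            dist = gene_start - pos
        else:
            dist = pos - gene_stop
        if best is None or dist < best:
            best = dist
    if best is None or best > window_bp:
        return None
    return round(best / 1000, 1)


# Split markers of the chrC08 region listed in Table 2.
CHR_C08_MARKERS = [36752901, 36752954, 36753586, 36754224, 36755379, 36755562]

# Gene coordinates from Table 3.
print(distance_to_closest_marker_kbp(36746642, 36751390, CHR_C08_MARKERS))  # BnaC08g42660D
print(distance_to_closest_marker_kbp(36752307, 36753108, CHR_C08_MARKERS))  # BnaC08g42670D
```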
Vernalisation trials
To test whether the swede-specific pattern of Bna.FLC deletions and duplications affects vernalisation dependency, we conducted a vernalisation trial with a reduced set of lines (11 lines each from the winter, spring and swede panels). These were selected to represent either lines without CNV in Bna.FLC (as a control), or lines which have alternative patterns of deletion and duplication in Bna.FLC. The plants were either not vernalised or subjected to 6 or 12 weeks of vernalisation. We then scored the time until opening of the first flower. We found that all spring lines and one winter line were vernalisation independent. The vernalisation-independent winter line was one of the two genotypes carrying a strongly divergent Bna.FT promoter on C02_random. All swedes and three winter lines were found to be strongly vernalisation dependent, meaning that no plants flowered after mild vernalisation (6 weeks). At the end of the experiment, one winter line and 8 swede lines had not flowered at all, meaning that 12 weeks of vernalisation were not sufficient to induce flowering (Fig. 8).
Figure 8
Distribution of flowering plants in the vernalisation trial.
The height of the bars represents the number of replications which could be phenotyped. Yellow indicates flowering plants, grey indicates non-flowering plants. The different treatments are framed in different colours, with green indicating no vernalisation, blue mild vernalisation (6 weeks) and red strong vernalisation (12 weeks).
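The response classes used in the vernalisation trial can be written as a small decision rule over the three treatments. This is an illustrative sketch only: the inputs record whether any plant of a line flowered under a given treatment, and the intermediate class "vernalisation dependent" as well as the function name are assumptions that do not appear in the text.

```python
def classify_vernalisation_response(
    flowered_nv: bool,
    flowered_mv: bool,
    flowered_sv: bool,
) -> str:
    """Classify a line from flowering under NV, MV (6 weeks) and SV (12 weeks)."""
    if flowered_nv:
        # Flowers without any cold treatment.
        return "vernalisation independent"
    if flowered_mv:
        # Assumed intermediate class: 6 weeks of cold are sufficient.
        return "vernalisation dependent"
    if flowered_sv:
        # No flowering after mild vernalisation, as defined in the text.
        return "strongly vernalisation dependent"
    # Still strongly dependent, but 12 weeks were not enough either.
    return "strongly vernalisation dependent, not induced by 12 weeks"


# Hypothetical lines, for illustration only.
print(classify_vernalisation_response(True, True, True))
print(classify_vernalisation_response(False, False, True))
print(classify_vernalisation_response(False, False, False))
```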
For the spring types, we found that lines with altered Bna.FLC patterns flower significantly later than lines without such changes (Fig. 9). All the same, spring lines carrying a swede pattern in Bna.FLC were not vernalisation dependent, indicating that this pattern alone is not sufficient to confer a vernalisation requirement.
Figure 9
Barplots of flowering time recorded in days after sowing (DAS) for FLC-affected and non-affected plants among the spring genotypes in the vernalisation trial.
Whiskers show standard errors. The asterisks denote the level of significance for Student’s t-test (*p-value < 0.05, ***p-value < 0.001). NV: No vernalisation, MV: mild vernalisation (6 weeks), SV: strong vernalisation (12 weeks).
Discussion
Our study aimed at identifying genetic variants which are responsible for the separation of the different morphotype pools in B. napus. According to population structure analyses performed in this and other studies [33, 34, 47], winter-type B. napus accessions tend to separate almost completely from other accessions, while some spring types along with semi-winter and swede material are more diverse. Here, we defined a total of 12 variant haplotypes which are diagnostic for the winter-spring split, and 13 variant haplotypes for the swede-non-swede split. Moreover, we found one winter type without vernalisation requirement that was nevertheless winter-hardy, and spring types with some degree of vernalisation responsiveness. Swedes were found to be extremely vernalisation dependent, with some variation among the accessions, presumably because swede forms have been bred to maintain their vegetative state for as long as possible. Vernalisation is a quantitative process [48], so there is natural variation in responsive temperature range and vernalisation duration [21, 49, 50, 51]. Markers for such life cycle traits are extremely important in order to introgress desirable traits between ecogeographical or morphotype gene pools, for example seed quality traits from spring to winter oilseed forms [52] or resistance traits from swede to non-swede material [53].
An R10P mutation in the MADS box domain of Bna.FLC.A10 was revealed as one candidate for the winter-spring split in B. napus; however, our data show that this mutation is neither the only candidate nor the best one. Neither the other Bna.FLC homologues nor the detected copies of Bna.FRI showed an appropriate variant distribution to explain the winter-spring split. Similar results were found for natural variation in flowering time for A. thaliana [54, 55].
This excludes the possibility that genetic variation within Bna.FLC gene sequences, besides Bna.FLC.A10, is causal for vernalisation requirement, thus indicating that other Bna.FLC copies are either not responsible for vernalisation or that there is variation in cis-regulatory elements. As shown by the vernalisation trials, spring-type plants without CNV in Bna.FLC copies flower earlier, indicating that Bna.FLC still plays a role in modulating flowering time in the absence of vernalisation requirement. Spring genotypes with a high Bna.FLC copy number showed accelerated flowering under vernalisation, indicating that they established weak vernalisation responsiveness. FLC is known to bind many other genes in A. thaliana [46], and it also regulates other developmental processes like germination [56], hence additional copies might be assumed to be under strong selection. The differential degree of conservation between the copies suggests sub-functionalisation, whereby conserved Bna.FLC.A02 and Bna.FLC.C02 presumably retain more general roles, whereas Bna.FLC.A10 might be more specialized towards flowering regulation. Sub-functionalisation events are characteristic for the evolution of MADS box transcription factors [57, 58].
On the other hand, Bna.SRR1.A02, Bna.VIN3.A02, Bna.AGL71.A03, Bna.CCR1.A09_random and the Bna.FT promoter on C02_random represent further candidates for the morphotype split. SRR1 is a clock-associated gene found to regulate CO, FT and CYCLING DOF FACTOR 1 (CDF1) in A. thaliana [59]. Moreover, A. thaliana srr1 mutants have reduced levels of FLC and respond only weakly to vernalisation [59]. Similarly, Bna.VIN3.A02 represents a copy of another vernalisation candidate upstream of Bna.FLC [54]. In Arabidopsis, VIN3 is expressed during cold and associates with the PRC2 complex to downregulate FLC gene activity [22, 60].
AGAMOUS-LIKE 71 (AGL71) is closely related to the flowering integrator SUPPRESSOR OF OVEREXPRESSION OF CONSTANS 1 (SOC1) and seems to be involved in gibberellin-dependent flowering pathways [61]. Its promoter contains a CArG box for FLC binding in A. thaliana [62]. CCR1 is a biosynthetic enzyme in lignin production and leaf development, which is known to regulate the concentration of the antioxidative compound ferulic acid [63]. This could be related to cold perception, as cold is partly perceived via the redox state [64]. Although all observed candidate SNPs are synonymous or silent, they may still have strong potential consequences for cis-regulatory elements, methylation, small RNA regulation and chromatin structure, and associated changes in the promoter. Moreover, alternative splicing was found to occur abundantly in resynthesized B. napus, also for copies of Bna.FLC [65]. On the other hand, we also cannot fully exclude that the observed effects are caused by linkage to additional genes in the neighbourhood of the investigated flowering-time regulators.
We also found patterns in the Bna.FT promoter on C02_random associating with the winter-spring split. Haplotype analysis showed that only two winter lines showed a strongly varying pattern, resembling most of the spring morphotypes. One of these was an exotic line, while the other was found to be vernalisation independent. This indicates that a functional promoter sequence for this gene is necessary to build up vernalisation requirement. A change in vernalisation requirement through variation in an FT promoter was already found in different Brassicas [66], narrow-leafed lupine [67] and litchi [68]. In A. thaliana, FLC was shown to bind to a CArG box located in the first intron of FT [69], although in cereals the binding site lies in the promoter [70]. Indeed, the Bna.FT copy on C02_random contains a CArG box motif. It is possible that both the promoter and the intron are responsible for Bna.FLC binding.
All the same, in both winter and spring types it has been reported [66] that the A02 copy in B. napus is constitutively expressed and the C02 copy is completely silenced, whereas the copies on A07 and C06 appear to be specifically silenced in winter morphotypes but transcribed in spring morphotypes [66]. As our Bna.FT copy on C02_random corresponds to the C02 copy reported in the aforementioned study [66], we assume that either there was a problem with the RT-PCR due to allelic variation, or the regulatory mechanism is more complex. However, in an independent transcriptome study we were unable to detect constitutive expression of Bna.FT.A02, or of any other Bna.FT copy, in winter-type B. napus before vernalisation (C. Obermeier, unpublished data).
Genomic rearrangements are common in B. napus [6, 41, 71, 72, 73, 74, 75]. They are particularly predominant in the first generations after allopolyploidisation [72], but the process is ongoing and believed to have an important role in speciation [76]. Different studies found indications for genomic rearrangements between B. napus morphotypes [6, 12, 29]. In the present study we found CNVs concerning copies of Bna.FLC, Bna.PHYA and Bna.GA3ox1 to involve duplications in the A subgenome and corresponding homoeologous deletions in the C subgenome. This indicates replacement of the C-subgenome regions by the respective A-subgenome regions, a process known as homoeologous non-reciprocal translocation (HNRT) [77]. A de novo HNRT will erase any sub-functionalisation which may have occurred prior to the rearrangement. Our data concur with the hypothesis [27] that the Bna.FLC.A10 copy is most specifically involved in flowering regulation. A duplication in Bna.FLC.A10 would therefore increase vernalisation requirement. This hypothesis fits with the strong vernalisation requirement we observed in lines carrying this duplication. Differential expression of Bna.FLC in highly rearranged, resynthesized rapeseed was observed before [26].
All the same, this pattern can only be effective when the vernalisation system is functional, as two spring lines with the same pattern are not vernalisation dependent. One of these presumably has a defective Bna.FT promoter on C02_random, and neither of them carries the swede-specific Bna.VIN3.A03 marker.
Other genes affected by such HNRTs are Bna.PHYA and Bna.GA3ox1. PHYA is a red/far-red perceiving photoreceptor which has a stabilising role for CO under long days [78]. This might represent a necessary co-adaptation of the photoperiodic pathway due to the strong vernalisation requirement, as later flowering means that the day length is longer at the time of flowering. This assumption is underlined by the finding that a D94G mutation in a copy of Bna.CO-like is a candidate for the swede split. GA3ox1, a biosynthetic key gene involved in GA production, is regulated by PHYB and by feedback mechanisms of downstream pathways [79]. GA also affects other developmental processes like seed germination, hypocotyl elongation and fruit set [79, 80]. Bna.GA3ox1 is therefore also a candidate for the swede morphotype, which is characterized by an enlarged hypocotyl and low seed-set. This might also apply to Bna.CCR1 as a candidate for the swede split. In Arabidopsis, CCR1 is involved in lignin biosynthesis, leaf development regulation and regulation of the redox state [63] (see above).
All swede lines share a silent mutation in Bna.VIN3.A03, which is not shared by any other line. Although the consequence of this mutation is unclear, Bna.VIN3 is a strong candidate for vernalisation requirement, particularly because another copy is a candidate for the winter-spring split (see above). Copies of VIN3 have previously been named as candidates for vernalisation requirement and flowering time in A. thaliana and B. napus [54, 81]. In B. oleracea, it was found that BoVIN3 was upregulated much faster than A. thaliana VIN3 [82], indicating that the expression is more sensitive to cold.
Another upstream candidate for the strong vernalisation requirement is Bna.ELF7. ELF7 is involved in chromatin remodeling of FLC during vernalisation [83]. A further gene variant which was not found outside the swede population lay in Bna.TEM1. TEM1 encodes another repressor of FT, which competes with CO for the same genetic region to fulfil its function [84]. The variant is a non-synonymous T167R mutation that potentially affects binding to the UTR of FT. A further candidate, Bna.FD, possibly modulates FT protein effectiveness, as FD is a direct and essential interaction partner of FT in the shoot apex [85].
These results represent an excellent base for further experiments to transfer morphotype features between B. napus genetic pools. Moreover, they also shed light on the evolution of major flowering time genes in the aftermath of allopolyploidisation, and their role in morphotype diversification and ecogeographical adaptation. We clearly demonstrate that different copies of important flowering regulators play different regulatory roles across the vernalisation and flowering pathways. The scarcity of non-synonymous mutations, along with the observed variation in the Bna.FT promoter, underlines the importance of cis-regulatory mechanisms in flowering time regulation.
Additional Information
How to cite this article: Schiessl, S. et al. Post-polyploidisation morphotype diversification associates with gene copy number variation. Sci. Rep. 7, 41845; doi: 10.1038/srep41845 (2017).
Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.