Introduction: Mushroom-forming fungi comprise diverse species that develop complex multicellular structures. In cultivated species, both ecological adaptation and artificial selection have driven genome evolution. However, little is known about the connections among genotype, phenotype and adaptation in mushroom-forming fungi. Objectives: This study aimed to (1) uncover the population structure and demographic history of Lentinula edodes, (2) dissect the genetic basis of adaptive evolution in L. edodes, and (3) determine if genes related to fruiting body development are involved in adaptive evolution. Methods: We analyzed genomes and fruiting body-related traits (FBRTs) in 133 L. edodes strains and conducted RNA-seq analysis of fruiting body development in the YS69 strain. Combined methods of genomic scan for divergence, genome-wide association studies (GWAS), and RNA-seq were used to dissect the genetic basis of adaptive evolution. Results: We detected three distinct subgroups of L. edodes via single nucleotide polymorphisms, which showed robust phenotypic and temperature response differentiation and correlation with geographical distribution. Demographic history inference suggests that the subgroups diverged 36,871 generations ago. Moreover, L. edodes cultivars in China may have originated from the vicinity of Northeast China. A total of 942 genes were found to be related to genetic divergence by genomic scan, and 719 genes were identified to be candidates underlying FBRTs by GWAS. Integrating results of genomic scan and GWAS, 80 genes were detected to be related to phenotypic differentiation. A total of 364 genes related to fruiting body development were involved in genetic divergence and phenotypic differentiation. Conclusion: Adaptation to the local environment, especially temperature, triggered genetic divergence and phenotypic differentiation of L. edodes. A general model for genetic divergence and phenotypic differentiation during adaptive evolution in L. edodes, which involves in signal perception and transduction, transcriptional regulation, and fruiting body morphogenesis, was also integrated here.
Introduction: Mushroom-forming fungi comprise diverse species that develop complex multicellular structures. In cultivated species, both ecological adaptation and artificial selection have driven genome evolution. However, little is known about the connections among genotype, phenotype and adaptation in mushroom-forming fungi. Objectives: This study aimed to (1) uncover the population structure and demographic history of Lentinula edodes, (2) dissect the genetic basis of adaptive evolution in L. edodes, and (3) determine if genes related to fruiting body development are involved in adaptive evolution. Methods: We analyzed genomes and fruiting body-related traits (FBRTs) in 133 L. edodes strains and conducted RNA-seq analysis of fruiting body development in the YS69 strain. Combined methods of genomic scan for divergence, genome-wide association studies (GWAS), and RNA-seq were used to dissect the genetic basis of adaptive evolution. Results: We detected three distinct subgroups of L. edodes via single nucleotide polymorphisms, which showed robust phenotypic and temperature response differentiation and correlation with geographical distribution. Demographic history inference suggests that the subgroups diverged 36,871 generations ago. Moreover, L. edodes cultivars in China may have originated from the vicinity of Northeast China. A total of 942 genes were found to be related to genetic divergence by genomic scan, and 719 genes were identified to be candidates underlying FBRTs by GWAS. Integrating results of genomic scan and GWAS, 80 genes were detected to be related to phenotypic differentiation. A total of 364 genes related to fruiting body development were involved in genetic divergence and phenotypic differentiation. Conclusion: Adaptation to the local environment, especially temperature, triggered genetic divergence and phenotypic differentiation of L. edodes. A general model for genetic divergence and phenotypic differentiation during adaptive evolution in L. edodes, which involves in signal perception and transduction, transcriptional regulation, and fruiting body morphogenesis, was also integrated here.
Fungi are ideal organisms for illuminating the molecular mechanisms of adaptive evolution in eukaryotes because of their contrasting ecological niches, short generation times, generally compact genomes, and simple morphologies [1]. Adaptive evolution in fungi is driven by various environmental factors, such as temperature, water, light, heavy metals, and salinity [1], [2]. Recently, the population genomics approach has been widely used in studies on adaptive evolution and domestication history. In fungi, molecular evolutionary processes of diverse groups have been extensively studied based on population genomics. These include studies of allopatric isolation [3] and responses to environmental factors such as temperature [4], [5], [6], salinity [7], and domestication [8], [9], [10]. However, studies on adaptive evolution of mushroom-forming fungi using population genomics methods are still in their infancy, and are exemplified by reports focused on the mycorrhizal species Suillus brevipes and S. luteus
[4], [7], [11]. Saline environments might have driven divergence of two S. brevipes populations in California, in which a gene enhancing salt tolerance was found to be under strong selection [7]. In S. brevipes populations across North America, genes under positive selection and significantly associated with environmental variables are mainly related to transmembrane transport and helicase activity, which are potentially involved in cold stress response [4]. Besides, a S. luteus population in Belgium was inferred to have undergone adaptive divergence driven by soil heavy metal contamination, as genes involved in heavy metal metabolism were detected to be under selection [11].Many traits potentially involved in adaptive evolution show quantitative inheritance [12]. Thereby, understanding the genetic architecture of complex adaptive traits is a major challenge. Genome-wide association studies (GWAS), which aim to identify causal variants responsible for traits of interest, have proven to be powerful for inferring quantitative trait architectures in model fungi such as Saccharomyces cerevisiae
[13], [14] and Neurospora crassa
[15], as well as several fungal pathogens [16], [17], [18]. There is every reason to expect that GWAS could be a powerful tool to dissect the genetic basis of adaptive evolution in mushroom-forming fungi.As the most complex multicellular structures in fungi [19], [20], fruiting bodies are subject to natural selection and environmental adaptation [21]. Environmental factors, such as temperature, water, light, and CO2 concentration, are critical for fruiting body induction and development [20], [22]. Recently, a transcriptomic atlas of mushroom development has unveiled conserved developmentally regulated genes in Agaricomycetes (mushroom-forming basidiomycetes), including genes related to fungal cell wall remodeling, targeted protein degradation, transcriptional regulation, and signal transduction, etc. [23], [24]. In addition, genes related to environmental response, cell cycle regulation, metabolism and transport are also reported to function in fruiting body development [22], [25], [26].Lentinula edodes (shiitake or xiang gu) is an edible mushroom reported to have antiviral and immunomodulating properties [27]. L. edodes was first cultivated in China more than 800 years ago [28] and now has the highest total production among all domesticated mushrooms in the world [29]. Our previous study [6] has reported the genomes of 60 Chinese L. edodes strains, including cultivars and wild strains, and identified three distinct subgroups. Fruiting body-related traits (FBRTs), including precocity (time to fruiting) and morphology, were found to be significantly differentiated among the three subgroups, with cultivars characterized by a higher precocity and better fruiting body quality than the wild strains. However, the molecular mechanisms of adaptation, and more specifically of domestication, are still elusive in Chinese L. edodes.In the present study, we analyzed the genomes and FBRTs in 133 L. edodes strains and performed GWAS. First, the population structure and demographic history of L. edodes were uncovered by population genomics analyses. Second, the genetic basis of adaptive evolution in L. edodes was dissected, focusing on signatures of genetic divergence and phenotypic differentiation of FBRTs. Genes related to population genetic divergence were determined by genomic scan and those underlying FBRTs were identified by GWAS. Genes identified by both genomic scan and GWAS were considered to be related to phenotypic differentiation. Finally, We conducted RNA-seq analysis of fruiting body development in the YS69 strain to explore if genes related to fruiting body development are involved in adaptive evolution. This is the first report of the evolutionary history of an economically important mushroom using population genomic methods. Our study would provide insights into the genetic underpinnings of adaptation in mushroom-forming fungi, which may facilitate breeding of L. edodes cultivars for desirable traits.
Materials and methods
L. edodes strains for sequencing and cultivation trial
A total of 133 dikaryotic strains of L. edodes were used in this study, including 99 wild strains and 34 cultivated strains (Fig. 1, Supplementary Dataset 1). Most of the strains were from China, except four cultivars from South Korea, Brazil, Australia, and Finland (L. edodes does not occur naturally outside Asia), and one wild isolate from Vietnam. Among the 133 L. edodes strains, 60 were previously sequenced and deposited in the NCBI Sequenced Read Archive under the accession PRJNA320211 [6]. The other 73 strains were newly sequenced in this study with the accession PRJNA535345.
Fig. 1
Distribution of 99 wild strains of L. edodes used in this study. Strains belonging to the Cultivars, Wild1 and Wild2 subgroups are displayed in green, red and blue, respectively. Two outlier strains between Cultivars and Wild1, YS84 and YS88, were excluded from the three subgroups and are shown in gray. Regions corresponding to Southwest China, Northeast China, and Central and Northwest China are indicated in atrovirens, darkred and indigo, respectively.
Distribution of 99 wild strains of L. edodes used in this study. Strains belonging to the Cultivars, Wild1 and Wild2 subgroups are displayed in green, red and blue, respectively. Two outlier strains between Cultivars and Wild1, YS84 and YS88, were excluded from the three subgroups and are shown in gray. Regions corresponding to Southwest China, Northeast China, and Central and Northwest China are indicated in atrovirens, darkred and indigo, respectively.
Cultivation trials
Two cultivation trials were performed in the Huazhong Agricultural University (114.35°E, 30.48°N) in 2013 and 2017. All the strains were cultivated in mushroom houses in accordance with the randomized-block design [30]. Here, we studied 10 previously described FBRTs [6], including the time interval in days from incubation to formation of the first primordium (FP), the time interval in days from incubation to harvest of the first fruiting body (FB), pileus diameter in mm (PD), pileus thickness in mm (PT), pileus weight in gram (PW), stipe length in mm (SL), stipe diameter in mm (SD), stipe weight in gram (SW), number of fruiting bodies per bag (NF), and weight of a single fruiting body in gram (WF). Cultivation trials were performed in simple mushroom houses without environmental condition control. Two temperature indexes, namely the minimum temperature (tem_min) and maximum temperature (tem_max) required for fruiting body formation for each strain, were also analyzed. Phenotypes were measured from at least 10 fruiting bodies. Pearson correlation analysis was used to determine the correlations of FBRTs or temperature indexes between two years, as well as those of different FBRTs or temperature indexes in the same year. Analysis of variance (ANOVA) and Duncan's multiple range test were used for analyzing the differentiation of FBRTs and the two temperature indexes among subgroups in two years. These analyses were conducted by SPSS software.
Genome sequencing, mapping, and genotype calling
The L. edodes Le (Bin) 0899 genome was sequenced with the Pacific Biosciences Sequel platform at the Joint Genome Institute (JGI). The 45.6 Megabase L. edodes genome was assembled into 128 scaffolds, with N50 of 690 kb and L50 of 22 scaffolds, and contains 14,079 predicted genes. Details of the genome sequence, assembly and annotation are available via the JGI fungal genome portal MycoCosm (https://mycocosm.jgi.doe.gov/Lenedo1) [31]. The genome data was also deposited at GenBank under accession (TO BE PROVIDED UPON PUBLICATION). Further details of genome sequencing are given in the Supplementary Methods.For the re-sequencing of L. edodes strains, DNA samples from mycelia of the 133 strains were extracted as previously described [32]. DNA samples were sent to Berry Genomics Co. Ltd. (Beijing, China) for library preparation and sequencing. A total amount of 1.5 μg genomic DNA per sample was used to construct paired-end sequencing libraries with an insert size of approximately 500 bp according to the manufacturer’s instructions (Illumina). The libraries were sequenced on an Illumina HiSeq2500 platform with 125 bp paired-end reads. Trimmomatic version 0.33 was used to remove adapters and low-quality reads with default parameters [33].All clean reads were mapped to the reference genome using BWA with the parameters “mem -t 20 -R -M” [34]. Picard version 2.81 (http://broadinstitute.github.io/picard/) was used to convert the mapping results into the BAM format and remove duplicate reads. Re-sequencing depth and genome coverage were calculated with BamCoverage (https://github.com/BGI-shenzhen/BamCoverage). Single nucleotide polymorphisms (SNPs) were called among the 133 L. edodes strains using samtools and bcftools [35]. Only alignments with a mapping quality ≥ 30 and bases with quality ≥ 30 were used to identify variations. In order to reduce the SNP detection error rate, SNPs were filtered as follows: (1) SNP with a quality score > 40; (2) SNP with a mean mapping quality > 40; (3) polymorphic loci covered by at least 5 reads; (4) for heterozygous genotypes, with 0.3 < DR/(DA + DR) < 0.7, where DR and DA denote read counts for the reference and alternative alleles, respectively; (5) for loci that failed the above criteria, their individual genotypes were assigned as missing. After the above processing, SNPs with>50% missing data or a minor allele frequency (MAF) < 0.05 were also excluded.
Population genetics analysis
A neighbor-joining phylogenetic tree was constructed by PHYLIP 3.69 under the p-distance nucleotide substitution model by using fourfold degenerate SNPs [36], [37], and a distance matrix was generated by VCF2Dis (https://github.com/BGI-shenzhen/VCF2Dis). Gymnopus luxurians (Accession no. JJNP00000000) was used as an outgroup for its closest kinship with L. edodes
[38]. Population structure was investigated using the Bayesian clustering program fastStructure version 1.0 [39] with K = 1–20. The ‘chooseK.py’ program in fastStructure was used to determine a K value that can best explain the number of population structure and group membership for each accession. Principal component analysis (PCA) was performed using GCTA version 1.26.0 [40]. LD (linkage disequilibrium) for all pairs of SNPs (MAF > 0.05) was calculated using PopLDdecay version 3.31 with the parameters -MaxDist 50 -MAF 0.05 [41]. LD decay was calculated based on r value and physical distance between SNP pairs.
Diversity and divergence analyses
Both π and Watterson’s theta (θw) indicate nucleotide diversity. To compare the genetic diversity in each subgroup, π values were calculated for each locus across the L. edodes genome using VCFtools v0.1.13 [42], and Watterson’s theta (θw) values were computed by PopGenome [43]. Population differentiation was evaluated by the fixation index Fst using VCFtools v0.1.13 [42], [44], and pairwise Fst values were calculated among subgroups. The allele frequencies for SNPs in each subgroup were calculated by VCFtools v0.1.13 [42], and polymorphic and fixed SNPs between subgroups were extracted manually and used for estimating population differentiation. According to the allele frequencies, heterozygous loci and the heterozygosity of each strain were also evaluated.
Demographic history
The diffusion approximation method implemented in δaδi software was used to estimate the effective population size and demographic history [45] as previously described [46]. The best parameters for fitting were estimated in three-population models with 18 divergence models (Supplementary Fig. S1). The best divergence model in the three-population models was determined based on the maximum value of the likelihoods and Akaike information criterion. To obtain 95% confidence intervals based on the best fitting parameters, simulation was carried out 100 times. The parameters inferred by δaδi were scaled by 2Ne, where Ne is the ancestral population size. We estimated the ancestral population size using the formula 4Ne × μ × L = θ, where μ is the mutation rate (10−9) as in previous reports [7], [47], L is the genome size of L. edodes (45.6 M) and θ is the effective mutation rate of the ancestral population. 2Ne was estimated to be 16.90 × 105 here. All the parameters outputted by δaδi were then scaled by 2Ne to estimate the generation time and population size.
Genomic scan for divergence
The combination of Fst and average number of nucleotide differences per site between populations (Dxy) methods [7] was used to identify genomic regions of high genetic divergence among the three subgroups, which we called here Cultivars, Wild1, and Wild2. Pairwise comparisons were conducted between Cultivars and Wild1, Cultivars and Wild2, and Wild1 and Wild2. The level of Fst between two subgroups was first measured using a 5 kb window with a step size of 500 bp. Windows with the top 5% of the highest Fst were then selected and the consecutive windows merged into a large window. Similarly, windows with the top 5% of the highest Dxy were identified. Genomic regions identified by both the Dxy and Fst methods were considered to be regions of high genetic divergence (RHGD). Genes within RHGD were considered candidates related to genetic divergence and were functionally annotated with Blast2GO [48]. GO enrichment analyses were performed by clusterProfiler [49]. Transcription factors were identified based on Interpro domains using in-house Perl scripts. Genes encoding putative carbohydrate-active enzymes were annotated using a method described by Krizsán et al. [24].
Genome-wide association studies (GWAS)
GWAS is powerful method to identify associations between genotype and phenotype on a genome-wide scale, and is used to identify candidate genes underlying phenotypic traits. Based on phenotypic datasets of the 10 FBRTs surveyed in 2013 and 2017 and genome-wide SNPs of the 133 strains, GWAS was conducted with a linear mixture model implemented in FaST-LMM [50]. PCA was used to model population structure and the first five principle components were applied to control for the effects of population structure. The genome-wide significance thresholds of FBRTs were P = 9.03 × 10-7 (calculated by P = 1/n, where n is the number of SNPs, 1,106,800 here) [51]. If a significant FBRTs-associated SNP was included within a gene sequence or its flanking noncoding regions, the gene was regarded as a candidate of FBRTs-associated gene. The functions of such genes were annotated using the same methods as in genomic scan for divergence. Genes detected both as FBRTs-associated genes by GWAS and candidates related to genetic divergence by genomic scan were defined as candidates related to phenotypic differentiation.
Transcriptome for fruiting body development
Transcriptome analysis was performed to illustrate if genes related to fruiting body development are involved in genetic divergence and phenotypic differentiation. RNA-seq data were retrieved from five fruiting body development stages from the YS69 strain with three biological replicates per stage (NCBI accession No. PRJNA607848): mycelium, primordium, young fruiting body, mature fruiting body before pileus opening, and mature fruiting body after pileus opening (Supplementary Fig. S2). Fresh tissues were immediately frozen in liquid nitrogen and ground to fine powder. Total RNA was extracted from L. edodes samples using Trizol [52]. NEBNext UltraTM II RNA Library Prep Kit (New England Biolabs, Inc, USA) was used to construct cDNA library. The L. edodes samples were sequenced on an Illumina X-Ten sequencing platform, generating > 4 Gb of clean data per sample. RNA extraction, library construction and sample sequencing were performed by Wuhan Genoseq Technology Co., Ltd. Trimmomatic was used to remove adapters and low-quality reads with the default parameters. Clean reads were aligned to the L. edodes reference genome using HISAT2 [53]. Number of mapped clean reads was estimated by HTseq [54], followed by conversion to Transcripts Per Million (TPM) data via TBtools [55]. Differentially expressed genes were screened using edgeR [56] with | log2 (fold change) | > 1 and a false discovery rate (FDR) < 0.01. Two groups of developmentally regulated genes were identified as by Krizsan et al. [24]. Fruiting body initiation genes (FB_init) are those with expression levels increased by at least twofold from vegetative mycelium to the primordium stage (Supplementary Fig. S2, comparison between samples ① and ②), whereas fruiting body development genes (FB_dev) are those changed by at least twofold between any two development stages from primordium to mature fruiting body (Supplementary Fig. S2, comparison between samples ②, ③, ⑤ and ⑦; or comparison between samples ②, ④, ⑥ and ⑧). These two groups of genes are related to fruiting body development.
Data availability
Background data, including SNP datasets, results of genomic scan for divergence (Fst and Dxy) between subgroups, and gene expression levels (TPM values) in different stages of fruiting body development in the YS69 strain, were deposited into Figshare (https://doi.org/10.6084/m9.figshare.14259296).
Results
Genome re-sequencing
At least 2 Gb of clean DNA reads with Q30 > 80% was obtained for each of the 133 L. edodes strains and mapped to the reference genome. The mean genome coverage was 93.19% and the mean depth was 39.87×. The average mapping rate of reads was 75.74% (Supplementary Dataset 1). We identified 1,557,350 SNPs that are covered by ≥ 5 reads and present in ≥ 50% of the 133 strains.Among the 1,557,350 SNPs, we detected 1,106,800 SNPs across the genome with a MAF > 0.05 and < 10% missing data in the 133 strains, and 49,172 SNPs at fourfold degenerate sites with a MAF > 0.05 and no genotypic missing. The fourfold degenerate SNPs were used in the analyses of phylogeny, population structure and demographic history since they do not cause amino-acid changes and thus should be under low selective pressure and reliably reflect the population structure and demography [57]. The 1,106,800 SNPs with a MAF > 0.05 and < 10% missing data in the 133 strains were utilized for GWAS and PCA. The remaining analyses of population genomics were conducted by using the 1,557,350 SNPs covered by ≥ 5 reads and present in ≥ 50% of the 133 strains.
Population structure of L. edodes
Three major subgroups of L. edodes were revealed in the neighbor-joining tree by using the fourfold degenerate SNPs (Fig. 2), including one subgroup containing all but one of the cultivated strains (Cultivars) and two subgroups containing mostly the wild strains (Wild1 and Wild2). Wild1 contained 58 wild strains, including 55 from Southwest China (Sichuan and Yunnan Provinces), two from Central and Northwest China (Anhui and Shaanxi Provinces), and one from Vietnam (Fig. 1, Supplementary Dataset 1). Wild2 contained 22 wild strains and one cultivar, with a broad distribution across Southwest (Guizhou, Sichuan and Yunnan Provinces), Central (Anhui, Hunan and Hubei Provinces), and Northwest China (Gansu and Shaanxi Provinces). The sole cultivar in Wild2, ZP85, was reported to be recently developed from a Chinese wild strain [58]. The Cultivars subgroup contained 33 cultivated strains and 17 wild strains. Twelve wild strains in the Cultivars subgroup co-occurred with Wild1 and Wild2 strains in five Provinces in Southwest, Northwest and Central China, but five strains uniquely occurred in Jiangxi Province (Central China, one strain) and Northeast China (Jilin and Liaoning Provinces, four strains). The wild strains from Northeast China formed a sister clade to the rest of the Cultivars subgroup (Fig. 2). A group containing two strains (YS84 and YS88) from Sichuan Province was resolved as a sister group to the clade containing the Cultivars and Wild2 subgroups (Fig. 2). YS84 and YS88 were independent from the three subgroups, and were excluded from all comparative analyses among subgroups. We analyzed the numbers of heterozygous loci and inbreeding coefficients in the 133 strains. The inbreeding coefficients of 10 strains, including YS84 and YS88, and eight strains (YS4, YS13, YS21, YS44, YS45, YS47, YS48, and YS87) in Wild2, were lower than that of the other strains, while their ratio of heterozygous loci were higher than that of the others, implying an excess of heterozygous loci among them (Supplementary Table S1). These strains were named highly heterozygous strains (HHSs).
Fig. 2
Neighbor-joining tree of L. edodes strains generated by using 49,172 SNPs at fourfold degenerate sites. Gymlu1 represents the outgroup Gymnopus luxurians. The origin of each wild strain is labeled beside the strain number. Red circles denote cultivated strains; stars denote highly heterozygous strains.
Neighbor-joining tree of L. edodes strains generated by using 49,172 SNPs at fourfold degenerate sites. Gymlu1 represents the outgroup Gymnopus luxurians. The origin of each wild strain is labeled beside the strain number. Red circles denote cultivated strains; stars denote highly heterozygous strains.Results of STRUCTURE analysis and PCA were congruent with those of the neighbor-joining tree (Fig. 3A-B). In STRUCTURE analysis, the marginal likelihood reached the maximum (-0.515) when K = 3 (Supplementary Fig. S3), confirming that the 133 L. edodes strains formed three subgroups. For K = 2, Wild2 and Cultivars clustered into one subgroup, indicating a closer kinship between them.
Fig. 3
Population structure and differentiation of 133 L. edodes strains. A Bayesian model-based clustering of the 133 L. edodes strains with the number of ancestry kinship (K) from 2 to 3. Stars denote highly heterozygous strains (HHSs). Each strain is shown by a vertical bar, and the y-axis quantifies cluster memberships. When K = 3, the 133 strains assigned to the three subgroups are labeled in red (Wild1), blue (Wild2) or green (Cultivars). B Principal component analysis (PCA) of the tested L. edodes strains. HHSs between Cultivars and Wild1 (YS84 and YS88) are shown in gray and circled in red, whereas eight HHSs between Cultivars and Wild2 are shown in blue and circled in red. C LD decay of the three subgroups measured by r. D Proportion of polymorphic and fixed SNPs, as well as population differentiation (Fst) across the three subgroups. E Best-fit demographic model. Simultaneous split, symmetric migration between all populations is the best-fit demographic model predicted using δaδi software. T1 (36,871 generations) indicates the divergence time of the three subgroups, whereas m1 (0.278), m2 (2.560) and m3 (7.194) indicate effective migration per generation among the subgroups: m1 (between Cultivars and Wild1), m2 (between Wild1 and Wild2), and m3 (between Cultivars and Wild2), respectively.
Population structure and differentiation of 133 L. edodes strains. A Bayesian model-based clustering of the 133 L. edodes strains with the number of ancestry kinship (K) from 2 to 3. Stars denote highly heterozygous strains (HHSs). Each strain is shown by a vertical bar, and the y-axis quantifies cluster memberships. When K = 3, the 133 strains assigned to the three subgroups are labeled in red (Wild1), blue (Wild2) or green (Cultivars). B Principal component analysis (PCA) of the tested L. edodes strains. HHSs between Cultivars and Wild1 (YS84 and YS88) are shown in gray and circled in red, whereas eight HHSs between Cultivars and Wild2 are shown in blue and circled in red. C LD decay of the three subgroups measured by r. D Proportion of polymorphic and fixed SNPs, as well as population differentiation (Fst) across the three subgroups. E Best-fit demographic model. Simultaneous split, symmetric migration between all populations is the best-fit demographic model predicted using δaδi software. T1 (36,871 generations) indicates the divergence time of the three subgroups, whereas m1 (0.278), m2 (2.560) and m3 (7.194) indicate effective migration per generation among the subgroups: m1 (between Cultivars and Wild1), m2 (between Wild1 and Wild2), and m3 (between Cultivars and Wild2), respectively.
Genetic diversity and differentiation of the L. edodes subgroups
The genetic diversity and divergence of different L. edodes subgroups were evaluated. The number of SNPs in the Wild1 (1,276,581), Wild2 (1,106,054) and Cultivars subgroups (734,914) accounted for 81.97%, 71.02%, and 47.19%, of the total SNPs in L. edodes, respectively (Supplementary Table S2), indicating that Wild1 and Wild2 retained more genetic variation than Cultivars. Watterson’s theta (θw) was 5.435 × 10-3 in Wild1, 5.701 × 10-3 in Wild2 and 3.209 × 10-3 in Cultivars. θw in Wild1 was significantly higher than that in Cultivars (Kruskal-Wallis test, P < 0.05), and that was also the case between Wild2 and Cultivars (P < 0.01). However, there was no significant difference in θw between Wild1 and Wild2 (P > 0.05). The π values also suggested that Wild1 (7.878 × 10-3) and Wild2 (7.155 × 10-3) possessed a higher genetic diversity than Cultivars (4.951 × 10-3) (Supplementary Table S2).LD decay curve of the three subgroups showed a sharp decline within regions of hundreds base pairs, demonstrating small linkage blocks in L. edodes. LD decay, measured as the physical distance at r declining to 50% [59], was 1,039 bp for Cultivars, 187 bp for Wild1 and 533 bp for Wild2 (Fig. 3C). Therefore, Cultivars showed less recombination footprints than Wild1 and Wild2. LD decay in L. edodes was comparable to that in other basidiomycetes, such as Schizophyllum commune and Hetreobasidion annosum (∼750 bp) [59]. As for the fixed SNPs among subgroups, 67.3% of SNPs were fixed in one or two subgroups between Wild1 and Cultivars (Fig. 3D), which were higher than those in the other two comparisons, suggesting a relatively higher differentiation between Wild1 and Cultivars. Pairwise Fst was 0.390 between Cultivars and Wild1, 0.257 between Cultivars and Wild2, and 0.275 between Wild1 and Wild2, indicating a great genetic differentiation among the three L. edodes subgroups (Fig. 3D). However, Wild2 was relatively close to the other two subgroups, compared to that between Wild1 and Cultivars. Collectively, Wild1 and Wild2 had more genetic diversity than Cultivars. Similar to the S. brevipes population from North America, the Chinese shiitake population displayed robust subgroup differentiation and a high genetic diversity [4].
Demographic history analysis
The demographic history of L. edodes was explored using δaδi
[45] and 18 three-population models (Cultivars, Wild1 and Wild2) were tested to guide the development of demographic models for L. edodes (Supplementary Fig. S1). The maximum values of the likelihoods and Akaike information criterion suggest that the best-fit demographic model was a simultaneous split with symmetric migration between all populations (Supplementary Table S3), implying that the three subgroups simultaneously originated from a common ancestral population with symmetric migration among them (Fig. 3E). The effective population size of the subgroups was estimated to be 21,617 (Wild1), 14,404 (Wild2) and 4,547 (Cultivars) (Table 1).
Table 1
Best-fit population demographic parameters.
sim_split_sym_mig_all a
Value
95% confidence interval b
Ancestral population size (Ne)
98,993
98,993–112,005
Divergence time (generation)
36,871
35,223–59,945
Population size of Wild1
21,617
21,617–30,599
Population size of Wild2
14,404
14,404–19,204
Population size of Cultivars
4,547
4,269–5,306
sim_split_sym_mig_all: simultaneous split and symmetric migration between all populations.
95% confidence intervals were analyzed from 100 simulated datasets by using the maximum likelihood estimates.
Best-fit population demographic parameters.sim_split_sym_mig_all: simultaneous split and symmetric migration between all populations.95% confidence intervals were analyzed from 100 simulated datasets by using the maximum likelihood estimates.Based on the demography model, the estimated divergence time of the three L. edodes subgroups was 36,871 generations with a 95% confidence interval of 35,223–59,945 generations (Table 1). In nature, L. edodes fructifies about once a year (although its generation time can be shortened dramatically under cultivation), suggesting that its minimum generation time is one year. Thus, the split among the three L. edodes subgroups occurred roughly 37,000 years ago. The divergence time of L. edodes was comparable to two populations of the mushroom-forming fungus S. brevipes, whose generation time is also assumed to be about one year and which were reported to have diverged roughly 25,000 generations ago [7]. It has been suggested that many species, including those of fungi, differentiated during the Last Glaciation (12,000–110,000 years ago) because of drastic climate changes [60], [61].
Genomic scan for divergence and candidate genes related to genetic divergence among L. edodes subgroups
Based on Dxy and Fst methods, 182, 154 and 178 genomic regions of high genetic divergence were detected between Cultivars and Wild1 (Cul_Wild1), Cultivars and Wild2 (Cul_Wild2), and Wild1 and Wild2 (Wild1_Wild2), respectively (Fig. 4A-C, Supplementary Datasets 2–4). These regions contained 455, 402 and 485 genes related to genetic divergence (Fig. 4D).
Fig. 4
Genomic regions of high genetic divergence between the L. edodes subgroups identified by intersection of Dxy and Fst values. A-C Distribution of Dxy and Fst values, evaluated by using a 5 kb sliding window with a 500 bp step size. High genetic divergent regions are shown with red dots. Data points on the right of the vertical dashed line correspond to the highest 5% empirical Fst distribution, and those above the horizontal dashed line correspond to the highest 5% empirical Dxy distribution. (A Cultivars_vs_Wild1. B Cultivars_vs_Wild2. C Wild1_vs_Wild2). D Distribution of genes related to genetic divergence across the three subgroups (genes located in genomic regions with both highest 5% Dxy and Fst values).
Genomic regions of high genetic divergence between the L. edodes subgroups identified by intersection of Dxy and Fst values. A-C Distribution of Dxy and Fst values, evaluated by using a 5 kb sliding window with a 500 bp step size. High genetic divergent regions are shown with red dots. Data points on the right of the vertical dashed line correspond to the highest 5% empirical Fst distribution, and those above the horizontal dashed line correspond to the highest 5% empirical Dxy distribution. (A Cultivars_vs_Wild1. B Cultivars_vs_Wild2. C Wild1_vs_Wild2). D Distribution of genes related to genetic divergence across the three subgroups (genes located in genomic regions with both highest 5% Dxy and Fst values).Thirteen genes were found to be within RHGD in all comparisons among the three subgroups, suggesting that they have important roles in the genetic divergence of L. edodes (Supplementary Table S4). Besides these genes, only 94 genes were shared in Cul_Wild1 and Cul_Wild2. These genes were considered to be critical for divergence of Cultivars (Fig. 4D). Similarly, 273 genes and seven genes were critical for divergence of Wild1 and Wild2, respectively (Fig. 4D). As a whole, 942 genes were found to be related to genetic divergence among the L. edodes subgroups (Supplementary Dataset 5).Many GO terms were significantly enriched in candidate genes related to genetic divergence among subgroups (P < 0.05). For biological process, the numbers of enriched GO terms were 36 (Cul_Wild1), 34 (Cul_Wild2), and 34 (Wild1_Wild2). Those in GO terms of molecular function were 26 (Cul_Wild1), 20 (Cul_Wild2), and 32 (Wild1_Wild2) (Supplementary Tables S5-S7). Enriched GO terms that are potentially relevant to functions in fruiting body development included (1) environmental response, (2) signal transduction, (3) transcription regulation, (4) cell cycle regulation, (5) fungal cell wall remodeling, (6) protein degradation, and (7) metabolism and transport (Supplementary Table S8).
Differentiation of fruiting body-related traits and temperature response among L. edodes subgroups
All 133 strains were used in cultivation trials over two years (2013 and 2017), but those that formed < 10 fruiting bodies were excluded from further phenotypic analyses. The number of strains used for phenotypic analyses were 119 and 121 for year 2013 and 2017, respectively (Supplementary Dataset 1). Descriptive statistics of the 10 FBRTs and the two temperature indexes are shown in Supplementary Table S9. Pearson correlation analysis between FBRTs indicated that FP (the time interval from incubation to formation of the first primordium) and FB (the time interval from incubation to harvest of the first fruiting body), i.e., FBRTs related to precocity, were highly positively correlated (r = 0.997 in 2013, P < 0.01; r = 0.996 in 2017, P < 0.01). Similarly, FBRTs related to the size of individual fruiting bodies, WF (weight of a single fruiting body), PD (pileus diameter), PT (pileus thickness), PW (pileus weight), SD (stipe diameter), SL (stipe length), and SW (stipe weight), were highly positively correlated. In contrast, NF (number of fruiting bodies) was negatively correlated with the seven size-related traits (Supplementary Tables S10 and S11). Accordingly, FP and FB were defined as precocity traits, WF, PD, PT, PW, SD, SL, and SW were classified as fruiting body morphologic traits (FBM), and NF was considered as a third trait category. Moreover, the two temperature indexes were also positively correlated (r = 0.308 in 2013, P < 0.01; r = 0.425 in 2017, P < 0.01) (Supplementary Tables S10 and S11).All the 10 FBRTs were significantly differentiated among subgroups (P < 0.01) (Supplementary Figs. S4 and S5). In both years, tem_min and tem_max in Wild1 were higher than those in the other two subgroups (P < 0.01), while tem_min and tem_max in Wild2 were between those in Wild1 and Cultivars (Supplementary Figs. S4 and S5). Both temperature indexes were positively correlated with the precocity traits (P < 0.01) and negatively correlated with FBM (P < 0.01) except SL (Supplementary Tables S10 and S11). This implies that the higher the temperature, the later the fruiting body formation, and the smaller the size of fruiting body. In general, strains in the Cultivars subgroup fructified earlier at a relatively low temperature and produced the fewest fruit bodies with the largest fruiting body size. In contrast, strains in Wild1 fructified later at a relatively high temperature and produced the maximum numbers of fruiting bodies with the smallest fruiting body size. Phenotypic values of the traits in Wild2 were between those of Wild1 and Cultivars (Supplementary Figs. S4 and S5).
Candidate genes associated with FBRTs detected by GWAS
Based on the 2013 cultivation trial, 364 SNPs associated with 250 genes and 157 SNPs associated with 67 genes were significantly associated with FBM (fruiting body morphologic traits) and NF (number of fruiting bodies) traits, respectively (Supplementary Dataset 6 and Supplementary Table S12). Based on the 2017 cultivation trial, 16 SNPs associated with 12 genes, 723 SNPs associated with 394 genes, and 22 SNPs loci associated with 14 genes were significantly associated with precocity, FBM, and NF traits, respectively (Supplementary Dataset 7 and Supplementary Table S12). All the above-mentioned genes were potential candidate genes underlying the corresponding traits. For the candidate genes, one associated with NF and 13 associated with FBM were repeatedly detected by using both years’ cultivation datasets (Supplementary Table S13).In both years, a total of 719 FBRTs-associated genes were mainly involved in GO terms as follows: (1) Biological process: biological regulation, regulation of biological process, metabolic process, response to stimulus, and signaling; (2) Molecular function: transcription regulator activity, transporter activity, and catalytic activity (Supplementary Fig. S6).
Candidate genes related to phenotypic differentiation among L. edodes subgroups
Combining results of genomic scan and GWAS, candidate genes related to phenotypic differentiation were explored. In the 2017 cultivation trial, one, three and 35 genes associated with precocity, NF, and FBM traits, respectively, were found to be within RHGD. In the 2013 cultivation trial, 14 and 27 genes associated with NF and FBM, respectively, were found to be within RHGD (Supplementary Table S12 and S14). Here, a total of 80 genes were related to phenotypic differentiation among the L. edodes subgroups (Supplemental Fig. S7).
Important candidate genes for genetic divergence and phenotypic differentiation of fruiting body-related traits in L. edodes
Numerous genes are involved in the genetic divergence and phenotypic differentiation among L. edodes subgroups. Here, we focused on the functions of 29 important genes related to phenotypic differentiation following seven categories below (Two FBRTs-associated genes, jgi.p|Lenedo1|1235377 and jgi.p|Lenedo1|1087721, might be related to phenotypic differentiation since they were within regions of the top 5% Fst or Dxy) (Table 2). Other genes, despite not being related to phenotypic differentiation, play important roles in genetic divergence and their potential functions were further described in the Supplementary Results.
Table 2
Candidate genes for genetic divergence and phenotypic differentiation of fruiting body-related traits in L. edodes.
Functional categories
Gene ID
Descriptions (encoded proteins)
High divergence between subgroups
Associated traitsd
Developmental regulatione
Cul_Wild1
Cul_Wild2
Wild1_Wild2
1.Responses to environmental factors
1) Temperature
jgi.p|Lenedo1|1054370
lanosterol 14-alpha-demethylase (Erg11, CYP51)
Y
FBM
FB_init
jgi.p|Lenedo1|1077413a
HSP20-like chaperone
Y
Y
FB_dev
jgi.p|Lenedo1|1055325a
HSP70-like protein
Y
Y
jgi.p|Lenedo1|296243a
Pbs2-like MAPKK kinase
Y
FB_dev
jgi.p|Lenedo1|115619a
Slt2 MAPK kinase
Y
jgi.p|Lenedo1|1036799a
DEAD box RNA helicase Ded1p
Y
Y
FB_dev
jgi.p|Lenedo1|1151588a
DEAD box RNA helicase Ded1p
Y
Y
FB_dev
jgi.p|Lenedo1|1055459a
HSF1-like protein
Y
Y
2) Light
jgi.p|Lenedo1|835872
velvet factor
Y
FBM
FB_dev
jgi.p|Lenedo1|1235377
velvet factor
Yb
Yb
FBM
/
jgi.p|Lenedo1|680571a
WC-1-like protein
Y
Y
FB_dev
jgi.p|Lenedo1|1130408a
COP9 signalosome complex subunit 1
Y
Y
/
3) Oxygen
jgi.p|Lenedo1|1185465
Ofd1-like protein (O2 sensor)
Y
Y
NF
FB_init
jgi.p|Lenedo1|1039749
C4-methyl sterol oxidase (ERG25)
Y
FBM
FB_dev
2. Signal transduction
1)ROS signal pathway
jgi.p|Lenedo1|1041804
NoxR protein
Y
FBM
/
jgi.p|Lenedo1|1087721
activator protein of Rho-like small GTPases
Yc
NF
/
jgi.p|Lenedo1|995875
thioredoxin-like protein
Y
Y
Y
NF
/
jgi.p|Lenedo1|1262226a
Rac1
Y
Y
jgi.p|Lenedo1|1058969a
Cdc42
Y
jgi.p|Lenedo1|1042285a
glutaredoxin
Y
jgi.p|Lenedo1|1071454a
glutaredoxin
Y
FB_dev
2)Responses to pheromones
jgi.p|Lenedo1|1100231
pheromone-processing carboxypeptidase KEX1
Y
NF
/
jgi.p|Lenedo1|56273a
heterotrimeric G protein alpha subunit 4
Y
Y
/
3. Transcriptional regulation
jgi.p|Lenedo1|1055685
fungal-specific transcription factor
Y
FBM
FB_dev
jgi.p|Lenedo1|1078491
AT_hook transcription factor
Y
FBM
FB_init
jgi.p|Lenedo1|1122213
Fungal trans, Zn(2)C6 fungal transcription factor
Y
NF
/
jgi.p|Lenedo1|1033446a
PriA
Y
Y
FB_init
jgi.p|Lenedo1|1039314a
PriB
Y
Y
FB_dev
jgi.p|Lenedo1|904589a
PriB
Y
Y
/
4. Cell cycle regulation
jgi.p|Lenedo1|726979
cell cycle checkpoint protein Rad1
Y
FBM
FB_init
jgi.p|Lenedo1|1160258
kinase with PIKKc_ATR domain
Y
FBM
FB_dev
jgi.p|Lenedo1|399146a
Hus1-like protein
Y
/
5. Fungal cell wall remodeling
jgi.p|Lenedo1|1063270
Endoglucanase, glycoside hydrolase family 12 protein
Y
NF
FB_dev
jgi.p|Lenedo1|1212909
Xyloglucosyltransferase, glycoside hydrolase family 16 protein
Y
Y
FBM
FB_dev
jgi.p|Lenedo1|1271474
Acetylesterase,carbohydrate esterase family 16 protein
Y
FBM
FB_dev
jgi.p|Lenedo1|1165047
Cellulose/chitin loosening
Y
Y
FBM
FB_dev
6. Protein degradation
jgi.p|Lenedo1|1036019
small ubiquitin-related modifier
Y
FBM
/
jgi.p|Lenedo1|1100553
RING-type zinc-finger protein
Y
Y
FBM
/
jgi.p|Lenedo1|1174915
ubiquitin-conjugating enzyme
Y
FBM
FB_init
jgi.p|Lenedo1|1068427
F-box domain contained protein
Y
Y
FBM
/
jgi.p|Lenedo1|1234515
small ubiquitin-related modifier
Y
FBM
/
7.Metabolism and transport
jgi.p|Lenedo1|1042643
acetoin dehydrogenase-like protein
Y
Y
FBM
FB_init
jgi.p|Lenedo1|646249
aminoacylase 1-like protein 2
Y
Y
FBM
FB_dev
jgi.p|Lenedo1|1055634
sucrose transporter
Y
NF
/
jgi.p|Lenedo1|323702
sugar transporter
Y
FBM
FB_dev
jgi.p|Lenedo1|58514
MFS polyamine transporter
Y
FBM
/
jgi.p|Lenedo1|1036823
MFS general substrate transporter
Y
Y
Y
FBM
/
These genes are within regions of highly genetic divergence but not FBRTs-associated.
Gene jgi.p|Lenedo1|1235377 is within the genomic region with the top 5% of Fst in both Cul_Wild1 and Wild1_Wild2.
Gene jgi.p|Lenedo1|1087721 is within the genomic region with the top 5% of Dxy in Wild1_Wild2, and several SNPs within it were highly divergent (Fst > 0.9) in Wild1_Wild2.
FBM: fruiting body morphological traits; NF: number of fruiting bodies.
FB_init: Fruiting body initiation; FB_dev: Fruiting body development.
Candidate genes for genetic divergence and phenotypic differentiation of fruiting body-related traits in L. edodes.These genes are within regions of highly genetic divergence but not FBRTs-associated.Gene jgi.p|Lenedo1|1235377 is within the genomic region with the top 5% of Fst in both Cul_Wild1 and Wild1_Wild2.Gene jgi.p|Lenedo1|1087721 is within the genomic region with the top 5% of Dxy in Wild1_Wild2, and several SNPs within it were highly divergent (Fst > 0.9) in Wild1_Wild2.FBM: fruiting body morphological traits; NF: number of fruiting bodies.FB_init: Fruiting body initiation; FB_dev: Fruiting body development.Genes related to environmental response.Temperature. Environmental factors, especially temperature, are vital for fruiting body formation of L. edodes[19], [62], as well as in fungal adaptive evolution [63], [64]. Ergosterol stabilises the membranes of fungal cells against heat stress in yeast [65]. A gene encoding lanosterol 14-alpha-demethylase (jgi.p|Lenedo1|1054370) was detected to be related to phenotypic differentiation (Table 2), whose homolog participates in ergosterol biosynthesis [66], and might function in temperature response.Seven other genes crucial in fungal temperature response were found to be related to genetic divergence, although they are not related to phenotypic differentiation (Table 2, Supplementary Results).Light. Light is a critical factor for fruiting body formation of L. edodes
[22]. The velvet complex coordinates light signal with fungal development [67]. Two velvet factor encoding genes (jgi.p|Lenedo1|835872 and jgi.p|Lenedo1|1235377) were candidate genes for FBRTs (Fig. 5G, Table 2); one was within RHGD and the other within region of the top 5% Fst (Fig. 5A–C, Table 2).
Fig. 5
Trait-associated genes detected to be within regions of high genetic divergence (RHGD) among the L. edodes subgroups. These genes are related to phenotypic differentiation. A-F Distribution of Fst (green) and Dxy (yellow) across Scaffolds_7 and Scaffolds_80 among subgroups. The horizontal dashed lines indicate the highest 5% Fst (blue) and 5% Dxy (brown) values, whereas the vertical dashed lines (gray) indicate genomic regions containing important candidate genes related to phenotypic differentiation. A-C RHGD in Scaffolds_7. The important candidate genes include two velvet factor genes (jgi.p|Lenedo1|1235377 and jgi.p|Lenedo1|835872) which are located in order across Scaffolds_7, and associated with PW (belonging to FBM). D-F RHGD in Scaffolds_80, which contains an important candidate gene of NF encoding thioredoxin (jgi.p|Lenedo1|995875). G-H Manhattan plots of GWAS association peaks related to PW in 2017 on Scaffold_7 (G) and related to NF in 2013 on Scaffold_80 (H). The graph plots display genomic position (x axis) against its significance expressed as -log10 (P) value (y axis).The red solid lines indicate the Bonferroni significance threshold of GWAS (P = 9.03 × 10-7, -log10 (P) = 6.04). Significantly associated loci within above-mentioned candidate genes are marked in red and indicated by purple arrowhead.
Trait-associated genes detected to be within regions of high genetic divergence (RHGD) among the L. edodes subgroups. These genes are related to phenotypic differentiation. A-F Distribution of Fst (green) and Dxy (yellow) across Scaffolds_7 and Scaffolds_80 among subgroups. The horizontal dashed lines indicate the highest 5% Fst (blue) and 5% Dxy (brown) values, whereas the vertical dashed lines (gray) indicate genomic regions containing important candidate genes related to phenotypic differentiation. A-C RHGD in Scaffolds_7. The important candidate genes include two velvet factor genes (jgi.p|Lenedo1|1235377 and jgi.p|Lenedo1|835872) which are located in order across Scaffolds_7, and associated with PW (belonging to FBM). D-F RHGD in Scaffolds_80, which contains an important candidate gene of NF encoding thioredoxin (jgi.p|Lenedo1|995875). G-H Manhattan plots of GWAS association peaks related to PW in 2017 on Scaffold_7 (G) and related to NF in 2013 on Scaffold_80 (H). The graph plots display genomic position (x axis) against its significance expressed as -log10 (P) value (y axis).The red solid lines indicate the Bonferroni significance threshold of GWAS (P = 9.03 × 10-7, -log10 (P) = 6.04). Significantly associated loci within above-mentioned candidate genes are marked in red and indicated by purple arrowhead.Oxygen. Aeration induces fruiting body formation and much oxygen is consumed during development [62]. An O2 sensor gene encoding the Ofd1-like protein (jgi.p|Lenedo1|1185465) is critical for genetic divergence of Wild1 and related to phenotypic differentiation (Table 2). In yeast, prolyl 4-hydroxylase-like 2-oxoglutarate-Fe(II) dioxygenase Ofd1 accelerates the degradation of Sre1N (the N-terminal transcription factor domain of Sre1) in the presence of oxygen [68]. Yeast Sre1 responds to changes in oxygen-dependent sterol synthesis, acting as an indirect measure of oxygen availability [69]. Besides, the homolog of a C4-methyl sterol oxidase encoding gene (jgi.p|Lenedo1|1039749) is involved in maintaining ergosterol biosynthesis and plays a role in adaptation to hypoxia stress in Aspergillus fumigatus
[70].Genes related to signal transduction. ROS (reactive oxygen species) production controlled by Nox is required in regulating fruiting body formation [71], [72]. The Nox regulator gene (NoxR) functions in fruiting body development in Ganoderma lucidum
[73], and a NoxR protein encoding gene (jgi.p|Lenedo1|1041804) was detected to be related to phenotypic differentiation here (Table 2). Small Rho GTPases adjust the activity of the Nox complex [74]. The gene encoding an activator protein of Rho-like small GTPase (jgi.p|Lenedo1|1087721) was a potential gene related to phenotypic differentiation identified here since it was detected as a candidate gene of NF in both cultivation trials and within region of the top 5% Dxy in Wild1_Wild2 (Table 2).Thioredoxin is also a ROS-related protein because it controls cellular redox potential [75]. A gene encoding a thioredoxin-like protein (jgi.p|Lenedo1|995875) was related to genetic divergence involving in all the subgroups and was a candidate gene of NF (Table 2, Fig. 5D-F, 5H).Fruiting body formation is induced not only by external environmental factors but also by internal pheromone signals. A gene encoding a pheromone-processing carboxypeptidase KEX1 (jgi.p|Lenedo1|1100231) [76] was detected as a gene related to phenotypic differentiation (Table 2).Genes related to transcriptional regulation. The drastic morphological and physiological changes during fungal fruiting body development require genome-wide changes in gene expression. A total of 71 transcription factor (TF) genes were detected in 942 genes related to genetic divergence (Supplementary Table S15), including Zinc finger, C2H2, HMG box, Fungal trans, Zn(2)C6 fungal, and Gti1/Pac2, etc. Three TF genes (jgi.p|Lenedo1|1055685, jgi.p|Lenedo1|1078491and jgi.p|Lenedo1|1122213) were related to phenotypic differentiation (Table 2).Genes related to cell cycle regulation. Checkpoints are vital regulatory pathways that enable correct cell cycle progression coordinating with cellular morphogenesis [77]. Four checkpoint genes were identified to be related to genetic divergence, and two of them were related to phenotypic differentiation (Supplementary Table S16). Rad1 is an evolutionarily conserved protein required for cell cycle checkpoint control in Schizosaccharomyces pombe[78]. ATR kinase gene serves essential and dispensable roles in cell cycle [79]. Rad1 encoding gene (jgi.p|Lenedo1|726979) and gene encoding kinase with the PIKKc_ATR domain (jgi.p|Lenedo1|1160258) were identified as phenotypic differentiation-related genes here (Table 2).Genes related to fungal cell wall remodeling. Carbohydrate-active enzymes (CAZymes) play vital roles in cell wall remodeling during fruiting body development. CAZymes generate fruiting body-specific cell wall architecture, affect adhesive properties of neighboring hyphae, and modify cell wall plasticity for growth by cell expansion [24]. Thirty-nine CAZyme genes were related to genetic divergence (Supplementary Table S17), among which four were candidate genes related to phenotypic differentiation (Table 2).Genes related to protein degradation. Continual changes and remodeling of proteins at the genome-wide level are essential for the development of multicellular fruiting body. Ubiquitination is an important post-translational protein modification. The ubiquitin–proteasome system is the main pathway of protein degradation in cell [80]. Twenty-five ubiquitination-related genes were involved in genetic divergence (Supplementary Table S18). Five of them (jgi.p|Lenedo1|1036019, jgi.p|Lenedo1|1100553, jgi.p|Lenedo1|1174915, jgi.p|Lenedo1|1068427 and jgi.p|Lenedo1|1234515) were associated with phenotypic differentiation (Table 2).Genes related to metabolism and transport. Genes related to metabolism of lipids, amino acids, carbohydrates, and nucleotides were found to be involved in genetic divergence (Supplementary Table S19), as were putative transporter genes which may mediate transport of carbohydrate and polyamine (Supplementary Table S20). Among them, two metabolic process-related genes (jgi.p|Lenedo1|1042643 and jgi.p|Lenedo1|646249) and four transporter genes (jgi.p|Lenedo1|1055634, jgi.p|Lenedo1|323702, jgi.p|Lenedo1|58514 and jgi.p|Lenedo1|1036823) were associated with phenotypic differentiation (Table 2).
Candidate genes related to fruiting body development
Transcriptome analysis of fruiting body development in the YS69 strain identified candidate genes related to fruiting body development, including 1,842 FB_init genes and 2,816 FB_dev genes (Supplementary Dataset 8). Among the genes related to genetic divergence, 166 (36.48%), 152 (37.81%) and 196 (40.41%) were genes related to fruiting body development, in Cul_Wild1, Cul_Wild2, and Wild1_Wild2, respectively (Supplementary Datasets 2–4). As a whole, 364 genes related to genetic divergence were relevant to fruiting body development (Supplementary Fig. S7). For the 719 FBRTs-associated genes, 283 (39.36%) were related to fruiting body development, including 103 FB_init genes and 180 FB_dev genes (Supplementary Datasets 6 and 7). Among the 80 candidate genes of phenotypic differentiation, 40 of them were related to fruiting body development (Supplementary Fig. S7), including 16 FB_init genes and 24 FB_dev genes (Supplementary Dataset 8). In paticular, 16 out of the 29 important genes for phenotypic differentiation described above were detected as genes related to fruiting body development (Table 2).
Discussion
In this study, the population structure and demographic history of L. edodes were disclosed by population genomics analysis of 133 strains. Genes related to genetic divergence and phenotypic differentiation among L. edodes subgroups were then dissected by combined analyses of genomic scan for divergence and GWAS. Genes related to fruiting body development were further confirmed to be involved in adaptive evolution by transcriptome analysis.
Origin of modern Chinese cultivars of L. edodes
Based on the results of population structure and demographic history analyses, we infer that L. edodes cultivars in China were not domesticated from the wild strains in Wild1 and Wild2, in line with the conclusions of our previous work [6]. However, the origin of L. edodes cultivars in China might be inferred from the geographic origins of the wild strains in the Cultivars subgroup.The Cultivars subgroup contained 13 wild strains from Central, Northwest and Southwest China (Fig. 1, Fig. 2, Supplementary Dataset 1). We infer that these wild strains are cultivars that have escaped into the wild, rather than native wild strains, for the following reasons. First, population structure analysis has indicated that wild strains from the same or adjoining regions tend to cluster in SNP-based phylogenetic trees and PCA plots, suggesting that the genetic relationships of wild L. edodes strains are highly correlated with their geographic distributions. However, the 13 wild strains in the Cultivars subgroup are relatively distantly related to the wild strains from the same regions in Wild1 and Wild2 (Fig. 2). Second, nine of the 13 wild strains in the Cultivars subgroup were collected from regions that are within the main production areas of L. edodes cultivars in China. The nine strains (YS14, YS16, YS51, YS66, YS67, YS68, YS69, YS70, and YS72) were collected from the junction areas of Shaanxi, Gansu and Sichuan Provinces, which represent a main shiitake production area in China. In fact, dispersal of spores (monokaryons) and escapes of mycelia (dikaryons) of cultivars into wild populations are common in the course of artificial cultivation [81].The phylogenetic status of an Asian L. edodes population was reappraised in a previous study. Results indicated that all strains from Northeast Asia (two from Northeast China, six from Japan, two from North Korea, and one from Russia) were included in the same phylogenetic group [82], among which YS083 from Liaoning was employed here. Thus, the genetic relationships of Shiitake strains from Northeast Asia are correlated with their geographical distribution.Four wild strains from Liaoning and Jilin Provinces, which are in Northeast China, formed a sister clade to the rest of the Cultivars subgroup. These four strains were collected from nature reserves and remote mountainous areas far from the cultivation sites; for example, YS01332 and YS01346 were derived from Changbai Mountains. Modern cultivation of L. edodes started in the 1930s [83], suggesting that the current cultivars are recently domesticated. The four wild strains from Liaoning and Jilin Provinces appear to be genetically differentiated from all current cultivars in our dataset. If the four Northeast strains are escapes of cultivars dispersed early in the cultivation period (∼1930s), they could not differentiate enough in such a short period of less than 100 years and would be within the same clade with the other cultivars in the Cultivars subgroup. Thus, we infer that these four isolates are true wild strains native to Northeast China and not escaped cultivars.The Chinese L. edodes cultivars may have been imported from Japan, but we cannot confirm that the ancestral strains are native to Japan because Japanese breeders collected L. edodes strains throughout Asia for strain development [84]. Some of the Japanese strains, such as the 7401–7405 series and the 79 series, have become main cultivars in China and served as parents for breeding schemes [85]. Although there are no wild-collected strains from Japan in our dataset, some cultivars are introduced from Japan, including ZP9, ZP10, ZP42, ZP65, and ZP271, and some others are hybrids of Japanese strains, such as ZP6 and ZP49. Northeast China is geographically close to Japan (Fig. 1), therefore modern L. edodes cultivars in China may originate from the vicinity of Northeast China, including Japan and Korea. Additionally, the four wild strains from Liaoning and Jilin Provinces are rich in genetic variation as a whole (θw = 4.315 × 10-3), which is relatively higher than that in the other 46 strains in the Cultivars subgroup (θw = 3.022 × 10-3). There is moderate genetic differentiation between the four Northeast strains and the other 46 strains in the Cultivars subgroup (Fst = 0.135). Crops are frequently deemed to have a genetic diversity center near where they were originally domesticated [86]. Thus, we infer that modern L. edodes cultivars might originate from the vicinity of Northeast China because of the abundant genetic diversity in the Northeast strains and their close genetic relationship with the cultivated strains. However, to verify the exact origin of modern Chinese cultivars of L. edodes, additional wild strains from Japan should be included in further studies.Collectively, we infer that the population structure of L. edodes in China is correlated to the geographical distribution. Strains in Wild1 are distributed in Yunnan Plateau and Hengduanshan Mountains of Southwest China with a relatively low latitude and high altitude; those in Cultivars are distributed in the vicinity of Northeast China with a relatively high latitude; and those in Wild2 are mainly distributed in Central and Northwest China with a latitude between the two (Fig. 1).
Genetic divergence and phenotypic differentiation among L. edodes subgroups
Increasing evidence from fungal population genomics studies has suggested how environmental adaptation could drive population differentiation [4], [5], [7], [11], [87]. Among the environmental factors, temperature has a strong effect on many important fungal traits, including growth, development and reproduction [88], and has already shown to be vital in fungal adaptation [89]. In N. crassa, two genes related to temperature and circadian rhythm within the genomic island of divergence are essential for fungal growth in cold temperature, and the genomic islands of divergence might be the result of local adaptation to low temperature [5].Population genomics has uncovered three L. edodes subgroups, which are distributed across regions with significantly different environmental conditions (Fig. 1). Fruiting body formation is conceived to be a stress-response process, since the fungus senses environmental stresses and generates a fruiting body to disseminate its spores for propagation [6]. Fruiting body development is often triggered by environmental changes and fruiting body morphology is affected by environmental factors [19], [22]. Environmental factors causing L. edodes fructification are temperature fluctuations, light, aeration, and nutrient depletion [62]. Thus, genes related to environmental responses could be important in the local adaptation for FBRTs. Here, we have identified genes involving in the responses to temperature, light and oxygen that appear to be related to genetic divergence (Supplementary Dataset 5), some of which are related to phenotypic differentiation among the L. edodes subgroups (Table 2).The variation in two temperature indexes observed in our L. edodes cultivation trials suggests different temperature requirements for fruiting body induction and development in different subgroups. Although the cultivation trials were not conducted in the original ecological niches where the tested strains grew, phenotypes were determined in a mushroom house wherein the environmental conditions were not controlled, which simulates a changing natural environment. Therefore, the phenotypes of different strains represented the original genetic characteristics formed by natural selection or artificial selection. The temperature required for fruiting body formation was highest in Wild1, followed by Wild2, and lowest in Cultivars (Supplementary Figs. S4 and S5). For the three subgroups, the Wild1 strains are distributed in areas with the lowest latitude and the highest temperatures, and the Cultivars strains occur in areas with the highest latitude and lowest temperatures (Fig. 1), whereas the latitude and the temperature conditions that the Wild2 strains occur are in between. Thus, the L. edodes population in China showed local adaptation in responses to environmental conditions, especially temperature. In this study, 14 genes responding to environmental factors are detected to be related to genetic divergence and/or phenotypic differentiation, eight of which are involved in temperature response, such as those encoding lanosterol 14-alpha-demethylase, heat shock proteins, and DEAD box RNA helicase (Table 2). As an outcome of local adaptation, genetic divergence and phenotypic differentiation occurred among the subgroups.
Genes related to fruiting body development are involved in adaptive evolution
Environmental factors are important drivers for fungal adaptive evolution [1], [2], and often influence fruiting body development in mushroom-forming fungi [22]. As the main part of mushroom-forming fungi, fruiting bodies are subject to natural selection and environmental adaptation [21]. In the domestication of L. edodes, FBRTs are the main targets of artificial selection. Findings here also present significant phenotypic differentiation of FBRTs among the subgroups. Therefore, to dissect the genetic basis of adaptive evolution in L. edodes, genes related to fruiting body development must be considered; this is because such genes manipulate fruiting body development and eventually result in morphological changes of the fruiting bodies. In this study, many candidate genes underlying FBRTs and genes related to genetic divergence and phenotypic differentiation are involved in fruiting body development, including 39.36% of the FBRTs-associated genes, roughly 40% of genes related to genetic divergence, and 40 genes of phenotypic differentiation. Our study has suggested that genes related to fruiting body development are involved in adaptive evolution and ultimately lead to genetic divergence and phenotypic differentiation among L. edodes subgroups. Among the 29 important genes described here that are involved in phenotypic differentiation, 16 are related to fruiting body development (Table 2), implying their key roles in adaptive evolution of L. edodes. In particular, several important genes related to fruiting body development were identified to be involved in adaptive evolution for the first time in L. edodes, including those encoding O2 sensor Ofd1-like protein, thioredoxin, and cell cycle regulation proteins (Table 2).
Conclusion
Three distinct L. edodes subgroups distributed across regions with different latitudes, altitudes and environmental conditions in China were detected. Phenotypes of FBRTs and temperature responses were significantly differentiated among the subgroups. Genes responsible for genetic divergence among the L. edodes subgroups were identified, some of them were associated with FBRTs and involved in phenotypic differentiation (Table 2, Supplementary Dataset 5). Our findings suggest that genetic divergence and phenotypic differentiation among the L. edodes subgroups result from adaptation to the local environment.A general model for genetic divergence and phenotypic differentiation during adaptive evolution in L. edodes was integrated here (Fig. 6). Environmental and endogenous stimuli are sensed by and transduced into L. edodes cells, the organism then activates the transcription factor regulation process to respond to the extracellular and intracellular stimuli. Subsequently, biological processes including protein degradation, metabolism and transport, fungal cell wall remodeling, and cell cycle regulation are activated. All the genes involved in the above-mentioned processes are related to genetic divergence, some of which are related to phenotypic differentiation, and some present candidate genes related to fruiting body development (Table 2, Supplementary Tables S15-S20, Supplementary Dataset 5). Through more than 30,000 years of environmental adaptation and divergence, genes underlying fruiting body-related traits are highly divergent, eventually driving phenotypic differentiation of these traits among the subgroups. This model provides a framework for dissecting the genetic underpinnings of population evolution in mushroom-forming fungi.
Fig. 6
A model of genetic divergence and phenotypic differentiation in fruiting body-related traits among L. edodes subgroups during adaptive evolution. The fruiting body development process was classified into three stages: I. Signal (environmental factor and endogenous factor) perception and transduction, II. Transcriptional regulation, and III. Fruiting body morphogenesis. Several key genes in stage I were identified to be related to genetic divergence and phenotypic differentiation (Table 2), including those encoding light sensor velvet factors and WC-1, oxygen sensor Ofd1-like protein, key stress (e.g. heat stress and osmotic stress) response factors Pbs2 (high-osmolarity glycerol MAPK pathway) and Slt2 (cell wall integrity MAPK pathway), regulator of NOX complex (NoxR), small GTPases Rac1 and Cdc42, thioredoxin and glutaredxoin, and heterotrimeric G-protein. In stage II, many transcription factors were detected to mediate transcriptional regulation of fruiting body development of L. edodes. Seventy-one transcription factor genes were related to genetic divergence, and three of them were related to phenotypic differentiation (Supplementary Table S15). Stage III includes several biological processes, such as protein degradation, fungal cell wall remodeling, metabolism and transport, and the critical cell cycle regulation. In Stage III, many genes related to the above-mentioned process were detected to be involved in genetic divergence, and some of them were related to phenotypic differentiation (for details, refer to Table 2 and Supplementary Tables S15-S20). * Candidate genes related to phenotypic differentiation. Others without asterisk denote genes identified to be related to genetic divergence among subgroups but not to be associated with phenotypic differentiation (Supplemantary Results). ROS (reactive oxygen species) production controlled by Nox is required in regulating fruiting body formation, and it interacts with NoxR and small GTPases Rac1 and Cdc42. Although the Nox complex genes were neither within regions of high genetic divergence nor trait-associated, they are still presented in this figure. During adaptive evolution, genes underlying fruiting body related traits are highly divergent, eventually causing phenotypic differentiation of these traits among the subgroups. Strains in the Cultivars subgroup produced the fewest fruiting bodies with the largest fruiting body size. In contrast, strains in Wild1 produced the maximum numbers of fruiting bodies with the smallest fruiting body size. Phenotypic values of these traits in Wild2 were between those of Wild1 and Cultivars.
A model of genetic divergence and phenotypic differentiation in fruiting body-related traits among L. edodes subgroups during adaptive evolution. The fruiting body development process was classified into three stages: I. Signal (environmental factor and endogenous factor) perception and transduction, II. Transcriptional regulation, and III. Fruiting body morphogenesis. Several key genes in stage I were identified to be related to genetic divergence and phenotypic differentiation (Table 2), including those encoding light sensor velvet factors and WC-1, oxygen sensor Ofd1-like protein, key stress (e.g. heat stress and osmotic stress) response factors Pbs2 (high-osmolarity glycerol MAPK pathway) and Slt2 (cell wall integrity MAPK pathway), regulator of NOX complex (NoxR), small GTPases Rac1 and Cdc42, thioredoxin and glutaredxoin, and heterotrimeric G-protein. In stage II, many transcription factors were detected to mediate transcriptional regulation of fruiting body development of L. edodes. Seventy-one transcription factor genes were related to genetic divergence, and three of them were related to phenotypic differentiation (Supplementary Table S15). Stage III includes several biological processes, such as protein degradation, fungal cell wall remodeling, metabolism and transport, and the critical cell cycle regulation. In Stage III, many genes related to the above-mentioned process were detected to be involved in genetic divergence, and some of them were related to phenotypic differentiation (for details, refer to Table 2 and Supplementary Tables S15-S20). * Candidate genes related to phenotypic differentiation. Others without asterisk denote genes identified to be related to genetic divergence among subgroups but not to be associated with phenotypic differentiation (Supplemantary Results). ROS (reactive oxygen species) production controlled by Nox is required in regulating fruiting body formation, and it interacts with NoxR and small GTPases Rac1 and Cdc42. Although the Nox complex genes were neither within regions of high genetic divergence nor trait-associated, they are still presented in this figure. During adaptive evolution, genes underlying fruiting body related traits are highly divergent, eventually causing phenotypic differentiation of these traits among the subgroups. Strains in the Cultivars subgroup produced the fewest fruiting bodies with the largest fruiting body size. In contrast, strains in Wild1 produced the maximum numbers of fruiting bodies with the smallest fruiting body size. Phenotypic values of these traits in Wild2 were between those of Wild1 and Cultivars.Artificial selection has also contributed to the phenotypic changes of L. edodes cultivars since strains with excellent agronomic traits are preferentially selected during cultivation. To understand the role of artificial selection, additional wild strains from the vicinity of Northeast China should be sampled and studied, especially from Japan.
Credit author statement
Y.X., Y.B. and H.S.K. conceived and designed the study. J.Z., C.L., X.X. and G.L. performed cultivation trails. J.Z., C.L. and G.L. prepared DNA samples for population genome sequencing. J.Z., N.S. and Y.X. performed the analyses. J.Z., Y.X., N.S., D.S.H., Y.L., M.K.C., and H.S.K. wrote and revised the manuscript. S.P., K.B., W.A., A.L., R.R., G.H., M.Y. and I.V.G. sequenced, assembled and annotated the reference genome of L. edodes.
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.