Nicholas R LaBonte, Peng Zhao, Keith Woeste.
Abstract
Chestnuts (Castanea) are major nut crops in East Asia and southern Europe, and are unique among temperate nut crops in that the harvested seeds are starchy rather than oily. Chestnut species have been cultivated for three millennia or more in China, so it is likely that artificial selection has affected the genome of orchard-grown chestnuts. The genetics of Chinese chestnut (Castanea mollissima Blume) domestication are also of interest to breeders of hybrid American chestnut, especially if the low-growing, branching habit of Chinese chestnut, an impediment to American chestnut restoration, is partly the result of artificial selection. We resequenced genomes of wild and orchard-derived Chinese chestnuts and identified selective sweeps based on pooled whole-genome SNP datasets. We present candidate gene loci for chestnut domestication and discuss the potential phenotypic effects of candidate loci, some of which may be useful genes for chestnut improvement in Asia and North America. Selective sweeps included predicted genes potentially related to flower phenology and development, fruit maturation, and secondary metabolism, and included some genes homologous to domestication candidates in other woody plants.
Keywords: Fagaceae; Illumina sequencing; chestnut; crop domestication; nut tree; pool-seq; selective sweep; woody perennial
Year: 2018 PMID: 29988533 PMCID: PMC6026767 DOI: 10.3389/fpls.2018.00810
Source DB: PubMed Journal: Front Plant Sci ISSN: 1664-462X Impact factor: 5.753
Figure 1. Map of the People's Republic of China showing locations from which wild trees were sampled (open squares) and the location of orchards sampled (filled triangles).
Castanea mollissima DNA sample pools, with individuals (n) per sample site.
| Pool | Location | Source | n | Total bases | Coverage (×) | Variant sites (depth >6) | Sites with alt. allele frequency ≥0.2 | Mean depth |
|---|---|---|---|---|---|---|---|---|
| Y1 | Yunnan: 26.013° N, 101.0932° E | Forest | 9 | 7,375,355,857 | 9.46 | 8,564,349 | 6,359,398 | 12.38 |
| Y2 | Yunnan: Fengqing County | Forest | 10 | 11,619,019,171 | 14.89 | 11,431,073 | 9,371,773 | 17.56 |
| S1 | Shaanxi: Zhuque and Heihe Forests | Forest | 13 | 7,344,176,568 | 9.42 | 13,492,467 | 8,279,766 | 9.46 |
| S2 | Shaanxi: Ningshan County | Forest | 10 | 8,630,350,892 | 11.06 | 7,632,501 | 5,074,304 | 9.12 |
| S3 | Shaanxi: Liuba County | Forest | 10 | 5,612,880,521 | 7.19 | 10,272,008 | 8,155,228 | 13.09 |
| S4 | Shaanxi: 33.772° N, 108.766° E | Orchard | 10 | 9,890,960,153 | 12.68 | 17,086,140 | 12,681,410 | 13.96 |
| GZ | Guizhou: 26.236° N, 105.1676° E | Forest | 10 | 8,594,920,262 | 11.02 | 10,046,443 | 7,949,407 | 14.10 |
| HB | Hebei: 40.597° N, 118.399° E | Orchard | 10 | 6,175,704,628 | 7.92 | 7,796,678 | 5,479,063 | 11.03 |
| BY | Beijing: Yanqing County | Orchard | 10 | 5,240,552,948 | 6.72 | 6,844,594 | 4,301,595 | 9.03 |
| ECC | Ohio, U.S.A. | Orchard | 12 | 3,498,422,441 | 4.49 | 4,939,429 | 2,079,207 | 7.10 |
Number of individuals in pool.
Sites with a variant called in a given pool with read depth >6.
Sites with an alternate allele of frequency at least 0.2.
Average observed depth at variant sites numbered in the 8th column.
Grown in Ohio, USA; derived from northern Chinese orchard cultivars.
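The site-count footnotes above describe two successive filters on pooled variant calls: a read-depth threshold (>6) and an alternate-allele frequency threshold (at least 0.2). The following is a minimal Python sketch of that kind of filtering, not the authors' pipeline. The `PoolSite` record, the `count_sites` function, and the assumption that the second filter is applied to sites that already passed the first are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass
class PoolSite:
    """Read counts at one variant site in one pooled sample (hypothetical record)."""
    ref_reads: int
    alt_reads: int

    @property
    def depth(self) -> int:
        return self.ref_reads + self.alt_reads

    @property
    def alt_freq(self) -> float:
        # Alternate-allele frequency estimated from read counts in the pool.
        return self.alt_reads / self.depth if self.depth else 0.0


def count_sites(sites, min_depth=6, min_alt_freq=0.2):
    """Return (called, informative, mean_depth).

    called:      sites with at least one alternate read and depth > min_depth
    informative: called sites whose alternate-allele frequency >= min_alt_freq
    mean_depth:  average depth over the informative sites (0.0 if none)
    """
    called = [s for s in sites if s.alt_reads > 0 and s.depth > min_depth]
    informative = [s for s in called if s.alt_freq >= min_alt_freq]
    mean_depth = (
        sum(s.depth for s in informative) / len(informative) if informative else 0.0
    )
    return len(called), len(informative), mean_depth


if __name__ == "__main__":
    example = [PoolSite(10, 0), PoolSite(5, 1), PoolSite(8, 2),
               PoolSite(18, 2), PoolSite(3, 9)]
    print(count_sites(example))
```

In this toy input, one site has no alternate reads and one has a depth of exactly 6, so neither counts as called; of the three remaining sites, two reach the 0.2 frequency threshold.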
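Notable regions under selection during chestnut domestication with statistical support and functional annotations.
| Linkage group | Start position | Predicted gene | Annotation | Tajima's D (orchard) | Tajima's D (wild) | Permutation p | Mean heterozygosity | π (Cm) | π (Cd) | π ratio (Cd/Cm) | F |
|---|---|---|---|---|---|---|---|---|---|---|---|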
| LGA | 17560000 | lga_g2116 | 1-aminocyclopropane-1-carboxylate oxidase | −1.57 | 0.31 | 0.006 | 0.19 | 0.0014 | 0.0015 | 1.11 | 0.76 |
| LGA | 46000000 | lga_g5798 | Flowering time control protein FCA | −2.03 | −0.47 | 0.020 | 0.00 | 0.0000 | 0.0005 | 483.33 | 0.81 |
| LGA | 46360000 | lga_g5850 | Alkane hydroxylase MAH1 | −1.98 | −0.05 | 0.005 | 0.05 | 0.0013 | 0.0032 | 2.54 | 0.92 |
| LGA | 53300000 | lga_g6764 | Transcription factor bHLH78 | −1.25 | 0.44 | 0.013 | 0.11 | 0.0009 | 0.0016 | 1.73 | 0.82 |
| LGA | 53710000 | lga_g6816 | Late embryogenesis abundant protein LEA5 | −1.70 | −0.07 | 0.016 | 0.10 | 0.0001 | 0.0007 | 4.56 | 0.85 |
| LGA | 58690000 | lga_g7476 | Probable polygalacturonase ADPG2 | −2.12 | −0.47 | 0.015 | 0.10 | 0.0008 | 0.0014 | 1.66 | 0.64 |
| LGA | 72490000 | lga_g9205 | Anthocyanidin 3-O-glucosyltransferase 2 | −1.59 | 0.07 | 0.019 | 0.10 | 0.0008 | 0.0017 | 2.08 | 0.89 |
| LGA | 104070000 | lga_g13074 | Transcription factor GATA-type | −1.86 | 0.13 | 0.003 | 0.07 | 0.0012 | 0.0022 | 1.83 | 0.86 |
| LGB | 3070000 | lgb_g404 | Homeobox-leucine zipper protein ATHB-14 | −1.92 | −0.08 | 0.011 | 0.12 | 0.0009 | 0.0012 | 1.30 | 0.74 |
| LGB | 7750000 | lgb_g1007 | Transcription factor bHLH147 | −1.64 | 0.43 | 0.004 | 0.14 | 0.0031 | 0.0043 | 1.37 | 0.38 |
| LGB | 8690000 | lgb_g1139 | ZF-BED domain protein RICESLEEPER 1 | −1.71 | 0.17 | 0.009 | 0.20 | 0.0012 | 0.0018 | 1.50 | 0.81 |
| LGC | 6500000 | lgc_g807 | Cytochrome P450 CYP71D312 | −1.65 | 0.07 | 0.013 | 0.07 | 0.0019 | 0.0125 | 6.61 | 0.82 |
| LGC | 21310000 | lgc_g2594 | Reticuline oxidase | −1.68 | 0.56 | < 0.001 | 0.07 | 0.0008 | 0.0024 | 2.84 | 0.86 |
| LGC | 30360000 | lgc_g3816 | Histone-lysine N-methyltransferase SUVH4 | −1.61 | 0.27 | 0.007 | 0.23 | 0.0012 | 0.0008 | 0.72 | 0.68 |
| LGC | 49350000 | lgc_g6157 | 1-aminocyclopropane-1-carboxylate oxidase 4 | −1.52 | 0.15 | 0.015 | 0.04 | 0.0001 | 0.0017 | 17.21 | 0.91 |
| LGC | 50850000 | lgc_g6330 | Transcription factor RAX2 | −1.53 | 0.49 | 0.003 | 0.02 | 0.0005 | 0.0029 | 5.90 | 0.97 |
| LGD | 7620000 | lgd_g1017 | SHOOT GRAVITROPISM 5, IDD15 | −1.44 | 0.33 | 0.009 | 0.10 | 0.0002 | 0.0008 | 3.34 | 0.58 |
| LGD | 18020000 | lgd_g2376 | Major allergens Pru1 | −2.51 | 0.17 | < 0.001 | 0.12 | 0.0006 | 0.0033 | 5.34 | 0.88 |
| LGE | 50780000 | lge_g6427 | POLLENLESS 3, MS5 | −1.71 | 0.39 | 0.001 | 0.04 | 0.0014 | 0.0116 | 8.25 | 0.63 |
| LGG | 13830000 | lgg_g2955 | Peroxisome biogenesis protein 1 | −1.87 | 0.09 | 0.003 | 0.00 | 0.0001 | 0.0027 | 23.02 | 0.94 |
| LGG | 24830000 | lgg_g3130 | LEA protein D-29 | −1.29 | 0.52 | 0.007 | 0.15 | 0.0011 | 0.0012 | 1.07 | 0.79 |
| LGG | 49050000 | lgg_g6369 | Syntaxin-132, isocitrate lyase | −2.08 | −0.29 | 0.008 | 0.13 | 0.0008 | 0.0015 | 1.84 | 0.73 |
| LGH | 780000 | lgh_g105 | Ras-related protein RAA4b | −2.23 | 0.35 | < 0.001 | 0.17 | 0.0025 | 0.0046 | 1.88 | 0.67 |
| LGI | 10470000 | lgi_g1363 | Syntaxin-132 | −1.59 | 0.34 | 0.008 | 0.19 | 0.0010 | 0.0006 | 0.58 | 0.68 |
| LGI | 33370000 | lgi_g4214 | FT (Flowering Time) -interacting protein 1 | −2.08 | 0.33 | < 0.001 | 0.10 | 0.0017 | 0.0023 | 1.36 | 0.77 |
| LGI | 40500000 | lgi_g5153 | Peroxidase 24 | −1.94 | 0.33 | 0.001 | 0.10 | 0.0017 | 0.0073 | 4.39 | 0.66 |
| LGJ | 16440000 | lgj_g2112 | Transcription factor VRN1 | −2.09 | 0.03 | 0.002 | 0.23 | 0.0024 | 0.0043 | 1.76 | 0.52 |
| LGJ | 16890000 | lgj_g2169 | Universal stress protein PHOS34 | −1.78 | 0.28 | 0.003 | 0.17 | 0.0019 | 0.0030 | 1.63 | 0.39 |
| LGL | 38090000 | lgl_g4810 | 1-aminocyclopropane-1-carboxylate synthase 7 | −2.08 | 0.25 | < 0.001 | 0.24 | 0.0059 | 0.0038 | 0.65 | 0.50 |
Bolded entries indicate intervals with local false discovery rates < 0.25, calculated by qvalue (Storey, .
Linkage group and starting position of sweep region in pseudochromosome draft assembly (Staton et al.);
Predicted gene selected from sweep based on annotation and statistical significance;
Annotation of selected gene;
Tajima's D for orchard and wild Chinese chestnut pools across the sweep interval;
Permutation-derived p value for sweep;
Average heterozygosity for the selected gene in an independent sample of 12 individual Chinese chestnut genomes;
Nucleotide diversity (π) among Castanea mollissima (Cm) and Castanea dentata (Cd) for the selected gene;
Factor by which π was greater in wild Cd than in domesticated Cm;
F.
Local false discovery rate < 0.05.
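Several columns in the table above report Tajima's D, which contrasts two estimates of diversity: mean pairwise differences and the number of segregating sites. As a reminder of what the statistic measures, here is a minimal Python sketch of the classical estimator for individually sequenced haplotypes. It is not the method used for the pooled data in this study, since pooled reads generally call for pool-specific corrections. The function name `tajimas_d` and the '0'/'1' string encoding of haplotypes are assumptions made for illustration.

```python
from math import sqrt


def tajimas_d(haplotypes):
    """Classical Tajima's D for equal-length haplotype strings of '0'/'1'.

    Returns None when there are no segregating sites (D is undefined).
    """
    n = len(haplotypes)
    if n < 4:
        # For n < 4 the variance term is zero, so D cannot be computed.
        raise ValueError("need at least 4 haplotypes")
    length = len(haplotypes[0])
    if any(len(h) != length for h in haplotypes):
        raise ValueError("haplotypes must have equal length")

    # Count of the '1' allele at each site; keep only segregating sites.
    counts = [sum(h[i] == "1" for h in haplotypes) for i in range(length)]
    seg = [k for k in counts if 0 < k < n]
    s = len(seg)
    if s == 0:
        return None

    # Mean number of pairwise differences (pi, summed over sites).
    pairs = n * (n - 1) / 2
    pi = sum(k * (n - k) for k in seg) / pairs

    # Standard constants from Tajima's variance derivation.
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)

    return (pi - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))


if __name__ == "__main__":
    # Excess of rare variants (all singletons) gives a negative D.
    print(tajimas_d(["0000", "1000", "0100", "0011"]))
    # Intermediate-frequency variants give a positive D.
    print(tajimas_d(["00", "00", "11", "11"]))
```

A strongly negative D in one group alongside a near-zero or positive D in another is the pattern the orchard and wild columns are meant to contrast.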
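Additional regions under selection due to domestication and regional climatic variation identified by allelic fixation at SNPs.
| Linkage group | Start position (10 kb window) | Predicted gene | Annotation | Mean major allele frequency (orchard) | Mean major allele frequency (wild) | Support for sweep | Mean heterozygosity | π (Cm) | π (Cd) | π ratio (Cd/Cm) | F |
|---|---|---|---|---|---|---|---|---|---|---|---|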
| LGA | 50030000 | lga_g6327 | THOC6_ARATH THO complex subunit, ABA signaling | 0.93 | 0.63 | >3 | 0.05 | 0.0004 | 0.0019 | 4.47 | 0.82 |
| LGA | 65859000 | lga_g8373 | RPE_SOLTU Ribulose-phosphate epimerase | 0.86 | 0.56 | >3 | 0.04 | 0.0006 | 0.0021 | 3.53 | 0.93 |
| LGC | 31478000 | lgc_g3960 | PP14_ARATH Serine/threonine protein phosphatase | 0.86 | 0.64 | >3 | 0.04 | 0.0006 | 0.0019 | 3.27 | 0.72 |
| LGE | 25481000 | lge_g3428 | MYBF_ARATH, transcription factor | 0.87 | 0.71 | >3 | 0.08 | 0.0007 | 0.0020 | 2.87 | 0.67 |
| LGE | 29262000 | lge_g3727 | CE101_ARATH lectin receptor kinase | 0.93 | 0.66 | >3 | 0.07 | 0.0003 | 0.0057 | 18.48 | 0.82 |
| LGE | 44249000 | lge_g5605 | DRE2D_ARATH Dehydration-responsive element | 0.92 | 0.61 | >5 | 0.03 | 0.0002 | 0.0015 | 6.67 | 0.85 |
| LGF | 16717000 | lgf_g2053 | ALsF1_ARATH Aldehyde dehydrogenase | 0.84 | 0.60 | >3 | 0.09 | 0.0002 | 0.0009 | 5.13 | 0.86 |
| LGF | 27956000 | lgf_g3433 | SAUR32_ARATH Auxin responsive element | 0.84 | 0.63 | >3 | 0.09 | 0.0004 | 0.0007 | 1.81 | 0.90 |
| LGG | 8420000 | lgg_g5871 | CDPK_SOYBN calcium-dependent protein kinase SK5 | 0.96 | 0.64 | >3 | 0.11 | 0.0015 | 0.0025 | 1.64 | 0.49 |
| LGG | 2054000 | lgg_g2558 | GONS1_ARATH GDP-mannose transporter | 0.90 | 0.68 | >3 | 0.08 | 0.0006 | 0.0016 | 2.63 | 0.86 |
| LGG | 23410000 | lgg_g1699 | LEA34_GOSHI late-embryogenesis-abundant protein | 0.85 | 0.59 | >3 | 0.20 | 0.0004 | 0.0015 | 3.82 | 0.68 |
| LGG | 33377000 | lgg_g4266 | MEE14_ARATH CCG-binding AGAMOUS interactor | 0.92 | 0.64 | >3 | 0.11 | 0.0009 | 0.0029 | 3.16 | 0.78 |
Linkage group and starting position of 10 kb sweep region in pseudochromosome draft assembly (Staton et al.);
Predicted gene selected from sweep based on annotation and statistical significance;
Annotation of selected gene;
Average major allele frequency for SNPs in orchard and wild Chinese chestnut pools across the sweep interval (10 kb);
Permutation-derived p value for sweep;
Average heterozygosity for the selected gene in an independent sample of 12 individual Chinese chestnut genomes;
Nucleotide diversity (π) among orchard-derived Chinese chestnut, Castanea mollissima (Cm), and American chestnut, Castanea dentata (Cd), for the selected gene;
Factor by which π was greater in wild Cd than in domesticated Cm;
F.
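The allele-frequency columns in the table above are averages of the major allele frequency over SNPs in a 10 kb window, computed separately for each pool. A minimal Python sketch of that calculation follows. It assumes that the major allele is defined per pool as whichever allele has more reads, and that each SNP is given as a hypothetical `(position, ref_reads, alt_reads)` tuple; neither detail is specified in the text.

```python
def major_allele_freq(ref_reads, alt_reads):
    """Frequency of the more common allele at one SNP in one pool."""
    depth = ref_reads + alt_reads
    if depth == 0:
        raise ValueError("site has no reads")
    return max(ref_reads, alt_reads) / depth


def window_mean_major_af(snps, start, width=10_000):
    """Mean major allele frequency over SNPs with start <= position < start + width.

    snps: iterable of (position, ref_reads, alt_reads) tuples for one pool.
    Returns None when the window contains no SNPs.
    """
    freqs = [
        major_allele_freq(ref, alt)
        for pos, ref, alt in snps
        if start <= pos < start + width
    ]
    return sum(freqs) / len(freqs) if freqs else None


if __name__ == "__main__":
    # Invented read counts for illustration only.
    orchard = [(50030100, 18, 2), (50035000, 1, 19), (50041000, 10, 10)]
    print(window_mean_major_af(orchard, start=50030000))
```

Running the same function on an orchard pool and a wild pool for the same window gives the pair of values being compared; a window near fixation in orchard trees but not in wild trees is the signal of interest.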
Figure 2. Tajima's D statistic in an independent sample of 8 orchard-derived chestnut whole-genome sequences, graphed over putative selective sweeps on LGC, LGD, LGL, and LGI of the Chinese chestnut genome identified using pooled whole-genome data. Approximate locations of predicted chestnut genes that were the best alignment for genes in domestication-associated selective sweeps of apple (red), grape (purple) and peach (orange) are labeled with the name of the aligned apple, grape, or peach gene.
Evidence of synteny between chestnut domestication candidate loci and domestication-associated chromosomal regions in other woody plant crops.
| LGA:50030000 | 0 | 0 | 3 |
| LGA:65859000 | 1 | 0 | 3 |
| LGB: | 1 | 0 | 0 |
| LGC:6505000 | 1 | 1 | 1 |
| LGC:30360000 | 2 | 0 | 0 |
| LGD:5350000 | 9 | 1 | 0 |
| LGE:25481000 | 0 | 2 | 0 |
| LGF:27956000 | 2 | 0 | 0 |
| LGI:27225000 | 1 | 2 | 0 |
| LGI:33385000 | 3 | 0 | 0 |
| LGL:23280000 | 2 | 0 | 3 |
Number of predicted proteins from a domestication region in apple, peach, or grape that were the best alignment for a protein in the indicated chestnut selective sweep region, in an alignment of all chestnut proteins vs. all apple, peach, and grape proteins.
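Putative loci differentially selected among northern and southern samples of wild Chinese chestnut, identified by comparing allele frequencies among pools of chestnut, with annotations based on the best UniProt alignments of predicted genes.
| Linkage group | Start | End | Mean major allele frequency (northern wild) | Mean major allele frequency (southern wild) | SD from mean difference | Annotation |
|---|---|---|---|---|---|---|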
| LGA | 72907000 | 72988000 | 0.56 | 0.87 | >2 | C94A2_VICSA: cytochrome P450, fatty acid oxidation |
| LGA | 79800000 | 79880000 | 0.67 | 0.95 | >3 | Y2060_ARATH: BTB/POZ domain ubiquitination protein |
| LGA | 80300000 | 80330000 | 0.61 | 0.89 | >3 | SD25_ARATH: protein kinase |
| LGA | 82239000 | 82355000 | 0.65 | 0.99 | >4 | PLY19_ARATH: pectate lyase 19 |
| LGB | 15342000 | 15410000 | 0.64 | 0.91 | >3 | E134_MAIZE: endo-1,3;1,4-beta-D-glucanase |
| LGB | 6540000 | 6678000 | 0.71 | 0.95 | >3 | SEOB_ARATH |
| LGC | 48510000 | 48811632 | 0.66 | 0.83 | >2 | SIB1_ARATH: sigma binding factor, pathogen defense |
| LGC | 50000000 | 50186000 | 0.69 | 0.95 | >3 | CNR2_MAIZE |
| LGC | 53870000 | 53947000 | 0.57 | 0.86 | >3 | PP413_ARATH: pentatricopeptide repeat-containing protein |
| LGE | 16600000 | 16700000 | 0.67 | 1.00 | >4 | CCR1_ARATH: cinnamoyl-CoA reductase, lignin synthesis |
| LGG | 43890000 | 43990000 | 0.70 | 0.99 | >3 | HMDH1_GOSHI: isoprenoid precursor (mevalonate) synthesis |
| LGG | 48970000 | 49040000 | 0.61 | 0.88 | >3 | CPC_ARATH: trichome development transcription factor |
| LGI | 4295000 | 4336000 | 0.62 | 0.89 | >4 | SILD_FORIN, ILR1_ARATH: lignin biosynthesis |
| LGL | 58890000 | 59190000 | 0.69 | 0.97 | >4 | ERF25_ARATH, DRE1B_ARATH: cold tolerance |
| LGE | 34000000 | 34100000 | 0.91 | 0.65 | >3 | CADH_EUCBO: cinnamoyl alcohol dehydrogenase, lignin synthesis |
| LGH | 18500000 | 18547000 | 0.81 | 0.62 | >3 | SAG13_ARATH: senescence-associated protein |
Gray shading indicates loci with evidence of directional selection in northern Chinese samples.
Average major allele frequency in northern Chinese wild trees for the given interval of 10 predicted genes;
Average major allele frequency in southern Chinese wild trees for the given interval of 10 predicted genes;
Standard deviations greater than the average difference in major allele frequency between orchard and wild pools.
Standard deviations from mean difference in allele frequency between northern and southern pools;
Annotation of predicted chestnut gene (AUGUSTUS) based on alignment to the UniProtKB/Swiss-Prot database.
Also identified as putative domestication loci due to low Tajima's D-value in orchard samples.
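The footnotes above express support for each locus as the number of standard deviations by which its allele-frequency difference departs from the mean difference between pools. A minimal Python sketch of that standardization is given below. It assumes the mean and standard deviation are taken over all intervals genome-wide and that the population standard deviation is used; the function names and the example values are hypothetical.

```python
from statistics import mean, pstdev


def sd_from_mean(diffs):
    """Standardize per-interval allele-frequency differences.

    diffs: list of (northern - southern) mean major allele frequency
           differences, one per interval.
    Returns one z-score per interval: (difference - mean) / standard deviation.
    """
    mu = mean(diffs)
    sigma = pstdev(diffs)
    if sigma == 0:
        # All intervals identical: no interval departs from the mean.
        return [0.0 for _ in diffs]
    return [(d - mu) / sigma for d in diffs]


def flag_outliers(diffs, threshold):
    """Indices of intervals whose |z-score| exceeds the threshold."""
    return [i for i, z in enumerate(sd_from_mean(diffs)) if abs(z) > threshold]


if __name__ == "__main__":
    # Invented differences: nine unremarkable intervals and one divergent one.
    diffs = [0.0] * 9 + [0.3]
    print(sd_from_mean(diffs)[-1])
    print(flag_outliers(diffs, threshold=2))
```

With real data the list would hold one difference per genomic interval, and a threshold such as 3 would select only the most divergent intervals.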