Hamid Razifard, Alexis Ramos, Audrey L Della Valle, Cooper Bodary, Erika Goetz, Elizabeth J Manser, Xiang Li, Lei Zhang, Sofia Visa, Denise Tieman, Esther van der Knaap, Ana L Caicedo.
Abstract
The process of plant domestication is often protracted, involving underexplored intermediate stages with important implications for the evolutionary trajectories of domestication traits. Previously, tomato domestication history has been thought to involve two major transitions: one from wild Solanum pimpinellifolium L. to a semidomesticated intermediate, S. lycopersicum L. var. cerasiforme (SLC) in South America, and a second transition from SLC to fully domesticated S. lycopersicum L. var. lycopersicum in Mesoamerica. In this study, we employ population genomic methods to reconstruct tomato domestication history, focusing on the evolutionary changes occurring in the intermediate stages. Our results suggest that the origin of SLC may predate domestication, and that many traits considered typical of cultivated tomatoes arose in South American SLC, but were lost or diminished once these partially domesticated forms spread northward. These traits were then likely reselected in a convergent fashion in the common cultivated tomato, prior to its expansion around the world. Based on these findings, we reveal complexities in the intermediate stage of tomato domestication and provide insight on trajectories of genes and phenotypes involved in tomato domestication syndrome. Our results also allow us to identify underexplored germplasm that harbors useful alleles for crop improvement.
Keywords: GWAS; domestication history; population genomics; tomato; whole-genome sequencing
Year: 2020 PMID: 31912142 PMCID: PMC7086179 DOI: 10.1093/molbev/msz297
Source DB: PubMed Journal: Mol Biol Evol ISSN: 0737-4038 Impact factor: 16.240
Population delimitation of SLL and closely related groups. (a) A coalescent-based phylogenetic tree (central area) constructed in SVDQuartets using whole-genome 4D SNPs; population structure (color bars associated with tree tips) inferred in fastSTRUCTURE using all genomic SNPs after quality filtering; the population assignment of each accession is shown with country abbreviations and lines in different colors. Main populations as well as unassigned accessions (black) were delimited following the results of ancestral structure, phylogenetic clustering, geographic origin, and passport data (see Materials and Methods). (b) Distribution map of the accessions, excluding the unassigned accessions and SLL accessions from outside South America and Mesoamerica (see Materials and Methods). (c) PCA based on genomic SNPs of all accessions except those unassigned in (a). The first two principal components explain 13.21% and 8.04% of the total variation. (d) Pairwise population differentiation estimated using FST.
Genome-Wide Estimates of π, Watterson’s Θ, and Tajima’s D from 10-kb Windows for Populations Delimited in This Study.
| Population | | | π | Watterson’s Θ | Tajima’s D |
|---|---|---|---|---|---|
| SP SECU | 0.36684 | 0.06948 | 0.00071 | 0.00092 | 0.20780 |
| SP PER | 0.15502 | 0.04953 | 0.00038 | 0.00059 | −0.71840 |
| SP NECU | 0.25151 | 0.04624 | 0.00066 | 0.00088 | −0.26170 |
| SLC ECU | 0.19438 | 0.02074 | 0.00069 | 0.00079 | 0.00608 |
| SLC PER | 0.17290 | 0.01690 | 0.00038 | 0.00055 | −0.69670 |
| SLC San Martin | 0.24020 | 0.02304 | 0.00021 | 0.00026 | −0.38550 |
| SLC MEX-CA-NSA | 0.21702 | 0.02823 | 0.00048 | 0.00064 | −0.29230 |
| SLC MEX | 0.14028 | 0.00791 | 0.00017 | 0.00030 | −0.95790 |
| SLL | 0.04781 | 0.01076 | 0.00003 | 0.00012 | −1.39500 |
| SLL ex. 10 | 0.06583 | 0.01158 | 0.00003 | 0.00009 | −1.19800 |
Note: “SLL ex. 10” refers to SLL after excluding ten potentially modern admixed accessions (supplementary table 3, Supplementary Material online).
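As a reading aid for the three named statistics in the table above, the following is a minimal sketch of how π, Watterson’s Θ, and Tajima’s D can be computed for a single window from derived-allele counts, using the standard textbook definitions. The function name, its inputs, and the toy counts are illustrative assumptions; this is not the pipeline used in the study, and it ignores missing data and site filtering.

```python
from math import sqrt

def diversity_stats(derived_counts, n, window_bp):
    """Return (pi per site, Watterson's theta per site, Tajima's D) for one window.

    derived_counts: derived-allele count at each segregating site (0 < c < n)
    n: number of sampled chromosomes
    window_bp: window length in base pairs (e.g. 10_000 for a 10-kb window)
    """
    S = len(derived_counts)
    if S == 0:
        return 0.0, 0.0, float("nan")
    # Mean number of pairwise differences, summed over sites
    k = sum(c * (n - c) for c in derived_counts) / (n * (n - 1) / 2)
    # Harmonic sums used by Watterson's estimator and Tajima's variance term
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    theta_w = S / a1
    # Constants from Tajima (1989)
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n**2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    d = (k - theta_w) / sqrt(e1 * S + e2 * S * (S - 1))
    return k / window_bp, theta_w / window_bp, d

# Toy window (hypothetical data): 4 chromosomes, 3 singleton sites, 10 bp
pi, theta, taj_d = diversity_stats([1, 1, 1], n=4, window_bp=10)
```

An excess of rare variants, as in the toy window, makes π smaller than Θ and D negative, which is the pattern seen for SLL in the table.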
Gene flow and ADAFs. TreeMix topologies without gene flow (a) and with one suggested gene flow event (b) are presented. Bootstrap support values are provided for each node. The suggested gene flow event (orange arrow) was further evaluated using a four-taxon test (c). ADAFs for SNPs at which the SLL reference genome and Solanum pennellii carry different alleles are presented in (d). Derived alleles occur at relatively high frequency in all populations, owing to the distance between the outgroup and the studied populations. However, derived alleles occurring in SLL are most common in SLC MEX, suggesting a close relationship.
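The caption does not say which four-taxon statistic was used. One widely used form is Patterson's D (the ABBA-BABA test), sketched here under that assumption for a tree (((P1, P2), P3), outgroup) with alleles coded 0/1. The function name and input layout are hypothetical, and a real analysis would also need a significance test such as a block jackknife.

```python
def patterson_d(sites):
    """Patterson's D from biallelic site patterns.

    sites: iterable of (p1, p2, p3, outgroup) alleles, each coded 0 or 1.
    Returns (ABBA - BABA) / (ABBA + BABA), or NaN if no informative sites.
    """
    abba = baba = 0
    for p1, p2, p3, out in sites:
        if p1 == out and p2 == p3 and p2 != out:
            abba += 1  # P2 and P3 share the derived allele
        elif p2 == out and p1 == p3 and p1 != out:
            baba += 1  # P1 and P3 share the derived allele
    total = abba + baba
    return (abba - baba) / total if total else float("nan")

# Toy input (hypothetical): 3 ABBA sites, 1 BABA site, 1 uninformative site
example = [(0, 1, 1, 0)] * 3 + [(1, 0, 1, 0)] + [(1, 1, 0, 0)]
d_stat = patterson_d(example)
```

Under this convention a value near zero is consistent with no gene flow, while a clearly positive value points to excess allele sharing between P2 and P3.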
Inferred tomato phylogeny and domestication history based on the combined results of population history analyses and comparisons of median fruit size. Estimates of divergence times and changes in population sizes using ∂a∂i are provided for the major events in tomato domestication history (left panel): origin of SLC (1), northward spreads of SLC (2), and redomestication of SLL (3). Asymmetric gene flow between groups is shown by double arrows, with the black arrow indicating the direction of stronger gene flow. The width of the branches on the summary tree represents population expansion or contraction. Dotted lines represent populations included only in the models examining the origin of SLC (1). In the right panel, a transition to wild-like fruit sizes is evident in the northernmost populations of SLC (SLC MEX and SLC MEX-CA-NSA). A scale bar (=10 cm) applicable to all fruit images is provided. Black arrows represent northward spreads of SLC. The gray arrow represents gene flow between SP PER and SLC PER.
Phenotypic and genotypic changes of eight agriculturally important fruit traits (a–h) through tomato domestication history. For each trait, the phenotypic distribution (left) and the genotype frequencies of the most significantly associated SNP from GWAS (right) are presented. In the genotype frequency plots, alleles frequent in SP were considered ancestral, and homozygotes for these alleles are shown in turquoise; homozygotes for derived alleles are shown in blue. The frequency of heterozygotes is shown in purple. Bar widths are proportional to the sample size of each population. An N- (or И-) shaped trend (see Results) is evident in many of the phenotypes presented.
A List of SNPs with the Highest Association (lowest P value below Bonferroni cutoff) with the Phenotypes Examined in Figure 4.
| Trait | Top Associated SNP (chromosome, position) | Associated Gene if No Known Candidate | Known Candidate Locus in Region | Function |
|---|---|---|---|---|
| Dry weight | 6, 41594907 | Solyc06g066330 | None | Poly polymerase catalytic domain |
| Locule number | 2, 47183456 | NA | | Determining stem cell fate |
| Pericarp thickness | 2, 53794883 | NA | CNR (Solyc02g090730) | SQUAMOSA promoter binding protein-like |
| Soluble solids | 9, 3477979 | NA | | Beta-fructofuranosidase |
| Beta-carotene | 4, 11582333 | Solyc04g019320 | None | PIF1-like helicase |
| Citric acid | 9, 12911725 | Solyc09g018100 | None | Ulp1 protease family C-terminal catalytic domain containing protein |
| Malic acid | 6, 44929013 | Solyc06g072840 | None | Hydrogen peroxide-induced protein 1 |
| Glucose | 9, 10833093 | NA | Solyc09g015490 | Transmembrane 9 superfamily member |
Note: Information on the gene ID and function of the gene associated with each variant, or of known candidate genes in the GWA region, is also provided.
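The selection rule in the table title (the lowest P value, provided it falls below the Bonferroni cutoff) can be sketched as follows. The significance level of 0.05, the function name, and the toy P values are assumptions for illustration; they are not taken from the study.

```python
def top_snp(pvalues, alpha=0.05):
    """Pick the most strongly associated SNP if it passes a Bonferroni cutoff.

    pvalues: non-empty dict mapping (chromosome, position) -> GWAS P value
    alpha: family-wise error rate (0.05 is an assumed default)
    Returns (snp, p, cutoff); snp and p are None if nothing passes.
    """
    cutoff = alpha / len(pvalues)  # Bonferroni: alpha divided by number of tests
    snp, p = min(pvalues.items(), key=lambda item: item[1])
    if p < cutoff:
        return snp, p, cutoff
    return None, None, cutoff

# Toy P values at hypothetical positions
toy = {("chr1", 100): 1e-9, ("chr1", 200): 0.2, ("chr2", 50): 0.01}
best, best_p, cutoff = top_snp(toy)
```

With three tests the cutoff is 0.05 / 3, so only the first toy SNP is retained; with genome-wide SNP sets the cutoff becomes correspondingly stricter.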
Signals of selective sweeps during the main events in the domestication history of tomato. Support (y-axis) for each sweep is defined as the number of statistics (π ratio, FST, and/or CLR) with a top 2% value for a potential sweep (see Materials and Methods) at a chromosomal position (x-axis). The positions of well-studied genes are marked with dotted lines.
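The support score described in the caption can be illustrated with a short sketch that counts, for each genomic window, how many statistics fall in their own top 2%. It assumes every statistic is oriented so that larger values mean stronger sweep evidence, and that all statistics are reported for the same windows; the function name, tie handling, and toy values are hypothetical and not the authors' implementation.

```python
def sweep_support(stats, top_fraction=0.02):
    """Count, per window, how many statistics are in their top fraction.

    stats: dict mapping statistic name -> list of per-window values,
           all lists the same length, larger = stronger sweep signal
    top_fraction: fraction of windows treated as outliers (2% by default)
    """
    n = len(next(iter(stats.values())))
    k = max(1, int(n * top_fraction))  # number of outlier windows per statistic
    support = [0] * n
    for values in stats.values():
        threshold = sorted(values, reverse=True)[k - 1]
        for i, v in enumerate(values):
            if v >= threshold:  # ties at the threshold are all counted
                support[i] += 1
    return support

# Toy data (hypothetical): 100 windows, three statistics
toy = {
    "pi_ratio": list(range(100)),
    "fst": list(range(100)),
    "clr": list(range(99, -1, -1)),
}
support = sweep_support(toy)
```

In the toy data the last two windows are outliers for two statistics and the first two windows for one, so the support values range from 0 to 2.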