Erik Gobbo, Nicolas Lartillot, Jack Hearn, Graham N Stone, Yoshihisa Abe, Christopher W Wheat, Tatsuya Ide, Fredrik Ronquist.
Abstract
Gall wasps (Hymenoptera: Cynipidae) induce complex galls on oaks, roses, and other plants, but the mechanism of gall induction is still unknown. Here, we take a comparative genomic approach to revealing the genetic basis of gall induction. We focus on Synergus itoensis, a species that induces galls inside oak acorns. Previous studies suggested that this species evolved the ability to initiate gall formation recently, as it is deeply nested within the genus Synergus, whose members are mostly inquilines that develop inside the galls of other species. We compared the genome of S. itoensis with those of three related Synergus inquilines to identify genomic changes associated with the origin of gall induction. We used a novel Bayesian selection analysis, which accounts for branch-specific and gene-specific selection effects, to search for signatures of selection in 7,600 single-copy orthologous genes shared by the four Synergus species. We found that the terminal branch leading to S. itoensis had more genes with a significantly elevated dN/dS ratio (positive signature genes) than the other terminal branches in the tree; the S. itoensis branch also had more genes with a significantly decreased dN/dS ratio. Gene set enrichment analysis showed that the positive signature gene set of S. itoensis, unlike those of the inquiline species, is enriched in several biological process Gene Ontology terms, the most prominent of which is "Ovarian Follicle Cell Development." Our results indicate that the origin of gall induction is associated with distinct genomic changes, and provide a good starting point for further characterization of the genes involved.
Keywords: Bayesian selection analysis; Cynipidae; codon models; gall induction; gene duplication; gene set enrichment analysis
Year: 2020 PMID: 32986797 PMCID: PMC7674688 DOI: 10.1093/gbe/evaa204
Source DB: PubMed Journal: Genome Biol Evol ISSN: 1759-6653 Impact factor: 3.416
Fig. 1. Phylogenetic relationships among cynipids and other gall-associated cynipoids based on a combination of molecular, morphological, and life-history traits (from Ronquist et al. [2015]). Numbers on branches are Bayesian posterior probabilities. Gall inducers and inquilines are marked with different colors.
Fig. 2. Phylogenetic relationships among species of Synergus and related genera in the tribe Synergini sensu stricto based on COI and 28S data, 1.1 kb in total (modified from Ide et al. [2018]). Numbers on branches represent Bayesian posterior probabilities and bootstrap support values from a maximum likelihood analysis. The star indicates species that are known or suspected to be true gall inducers.
Fig. 3. Unrooted tree depicting relationships among the four studied Synergus species. The tree was inferred using MrBayes from the protein sequences of 300 randomly selected single-copy genes present in all species. The topology was supported in all sampled trees. Numbers on branches are average branch lengths (expected substitutions per amino acid site).
Results from Analysis of the Gene-Wise Branch Model Using Codeml (PAML)
| Species/Branch | PSGs (Muscle) | PSGs (ClustalOmega) | PSGs (Intersection) | NSGs (Muscle) | NSGs (ClustalOmega) | NSGs (Intersection) |
|---|---|---|---|---|---|---|
| | 219 | 256 | 180 | 66 | 68 | 50 |
| | 74 | 91 | 62 | 318 | 324 | 283 |
| | 75 | 92 | 60 | 195 | 208 | 169 |
| Internal branch | 340 | 398 | 312 | 76 | 83 | 54 |
Note.—Positive signature genes (PSGs) and negative signature genes (NSGs) were identified as those genes with a log-likelihood ratio >3.8415 (P ≤ 0.05) for the hypothesis of a higher dN/dS in the foreground (PSGs) or in the background (NSGs) subtree. Values for the gall inducer (S. itoensis) in bold.
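The criterion in this note can be sketched in a few lines. The sketch assumes that the reported threshold applies to the likelihood ratio statistic 2(lnL_alt − lnL_null) and that the statistic is compared against a chi-square distribution with one degree of freedom, for which 3.8415 is the 95% quantile. The function names and the input values are hypothetical and do not reproduce Codeml output.

```python
import math

CHI2_1DF_95 = 3.8415  # threshold quoted in the table note


def lrt_statistic(lnl_null, lnl_alt):
    """Likelihood ratio statistic 2 * (lnL_alt - lnL_null), floored at zero."""
    return max(2.0 * (lnl_alt - lnl_null), 0.0)


def chi2_1df_pvalue(stat):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2.0))


def classify_gene(lnl_null, lnl_alt, omega_fg, omega_bg):
    """Return 'PSG', 'NSG', or None for one gene on one foreground branch.

    omega_fg and omega_bg are the dN/dS estimates for the foreground and
    background subtrees under the alternative (two-ratio) model.
    """
    if lrt_statistic(lnl_null, lnl_alt) <= CHI2_1DF_95:
        return None  # the two-ratio model is not significantly better
    return "PSG" if omega_fg > omega_bg else "NSG"


if __name__ == "__main__":
    # Hypothetical log-likelihoods and dN/dS values for a single gene.
    print(classify_gene(-1000.0, -997.0, omega_fg=0.8, omega_bg=0.2))
```

A gene is counted only when the test is significant, and the direction of the dN/dS difference then decides whether it is a PSG or an NSG.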
Positive Signature Genes (PSGs) Identified in the Integrative Gene–Branch Interaction Model Using Bayescode
| Species/Branch | Muscle: Run 1 | Muscle: Run 2 | Muscle: Both Runs | ClustalOmega: Run 3 | ClustalOmega: Run 4 | ClustalOmega: Both Runs | All Runs |
|---|---|---|---|---|---|---|---|
| | 154 | 155 | 148 | 183 | 185 | 177 | 108 |
| | 159 | 155 | 150 | 189 | 196 | 182 | 121 |
| | 142 | 142 | 138 | 161 | 164 | 160 | 110 |
| Internal branch | 119 | 122 | 114 | 130 | 130 | 120 | 100 |
Note.—PSGs were identified as those genes with posterior probability ≥ 0.95 for the hypothesis of the gene having a dN/dS ratio higher than expected from the interaction between the gene factor and the branch factor. The gall inducer S. itoensis consistently displays the highest number of PSGs (values in bold).
Negative Signature Genes (NSGs) Identified in the Integrative Gene–Branch Interaction Model Using Bayescode
| Species/Branch | Muscle: Run 1 | Muscle: Run 2 | Muscle: Both Runs | ClustalOmega: Run 3 | ClustalOmega: Run 4 | ClustalOmega: Both Runs | All Runs |
|---|---|---|---|---|---|---|---|
| | 133 | 129 | 119 | 148 | 150 | 138 | 90 |
| | 63 | 60 | 54 | 82 | 83 | 75 | 44 |
| | 54 | 49 | 45 | 55 | 60 | 48 | 36 |
| Internal branch | 157 | 159 | 146 | 168 | 179 | 153 | 102 |
Note.—The NSGs were identified as those genes with posterior probability ≥ 0.95 for the hypothesis of the gene having a dN/dS ratio lower than expected from the interaction between the gene factor and the branch factor. The gall inducer S. itoensis consistently displays the highest number of NSGs (values in bold).
Fig. 4. Distribution of the Bayescode posterior probabilities, P, of the gene effects in the gene–branch interaction model being positive in each of the five branches. Data from run 2, based on the Muscle alignment. The posterior probability of the gene effect being negative, q, is the complement of P, that is, q = 1 − P. If P is very high, say >95%, then the gene is likely to have had an elevated dN/dS on the selected branch compared with what would have been expected for that gene and that branch. Conversely, if P is very low (and q high), the gene is very likely to have had a lowered dN/dS on the selected branch.
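The thresholding described in this caption and in the notes to the two Bayescode tables can be sketched as follows. The sketch assumes that each run yields one posterior probability P per gene for a given branch, and that the "Both Runs" and "All Runs" counts are the genes passing the threshold in every run considered. The data layout, function names, and gene identifiers are hypothetical and are not Bayescode output.

```python
def classify_by_posterior(posteriors, threshold=0.95):
    """Split genes into PSGs (P >= threshold) and NSGs (q = 1 - P >= threshold).

    posteriors maps a gene identifier to P, the posterior probability that
    the gene effect on the branch is positive.
    """
    psgs = {gene for gene, p in posteriors.items() if p >= threshold}
    nsgs = {gene for gene, p in posteriors.items() if 1.0 - p >= threshold}
    return psgs, nsgs


def shared_across_runs(gene_sets):
    """Genes that pass the threshold in every run (set intersection)."""
    gene_sets = list(gene_sets)
    return set.intersection(*gene_sets) if gene_sets else set()


if __name__ == "__main__":
    # Hypothetical posterior probabilities for one branch in two runs.
    run_a = {"geneA": 0.99, "geneB": 0.97, "geneC": 0.02, "geneD": 0.50}
    run_b = {"geneA": 0.96, "geneB": 0.80, "geneC": 0.01, "geneD": 0.40}
    psgs_a, nsgs_a = classify_by_posterior(run_a)
    psgs_b, nsgs_b = classify_by_posterior(run_b)
    print(sorted(shared_across_runs([psgs_a, psgs_b])))
    print(sorted(shared_across_runs([nsgs_a, nsgs_b])))
```

Taking the intersection keeps only genes whose signal is reproducible across runs and alignments, which is why the "All Runs" counts are the smallest in each row.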
Results from the Gene Set Enrichment Analysis (GSEA) of Gene Ontology (GO) Terms
| GO ID | GO term | Muscle: Run 1 | Muscle: Run 2 | Clustal: Run 3 | Clustal: Run 4 | Nonspecific: Number of Occurrences | Nonspecific: Lowest P |
|---|---|---|---|---|---|---|---|
| GO:0030707 | Ovarian follicle cell development | 0.00015 | 0.00051 | 0.00292 | 4.2e-05 | — | |
| GO:0007507 | Heart development | 0.00025 | 0.0021 | 0.00049 | 0.00049 | 2 | 0.0227 |
| GO:0007409 | Axonogenesis | 0.00043 | 0.0082 | 7.8e-05 | 0.00066 | — | |
| GO:0061564 | Axon development | 0.00097 | 0.01536 | 0.00021 | 0.00166 | — | |
| GO:0042221 | Response to chemical | 0.00099 | 0.00171 | 0.03052 | — | — | |
| GO:0007320 | Insemination | — | — | 0.00075 | 0.02168 | — | |
| GO:0006928 | Movement of cell or subcellular component | 0.00227 | 0.00305 | 0.00078 | 9.4e-05 | — | |
| GO:0007498 | Mesoderm development | — | — | 0.00431 | 0.00035 | — | 0.02539 |
| GO:0040011 | Locomotion | 0.00768 | 0.00858 | 0.0037 | 0.00054 | 3 | 0.02232 |
Note.—The P value of the parent–child Fisher’s test is given for all the Bayescode runs in which a term is overrepresented among the PSGs of Synergus itoensis. When the term is overrepresented also in a branch other than S. itoensis, the number of nonspecific occurrences is given together with the lowest P value among those runs.
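The idea behind a parent–child Fisher's test can be illustrated with a simplified sketch. It assumes a one-sided test for overrepresentation and a GO term with a single parent, so that the gene universe is restricted to the genes annotated to that parent; terms with several parents need a rule for combining the parents' gene sets, which is not shown here. The function names and gene identifiers are hypothetical, and this is not the implementation used to produce the table.

```python
from math import comb


def hypergeom_upper_tail(k, K, n, N):
    """P(X >= k) when n genes are drawn from N, of which K carry the term."""
    total = comb(N, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(K, n) + 1))
    return tail / total


def parent_child_pvalue(term_genes, parent_genes, selected_genes):
    """One-sided enrichment P value of a term, conditional on its parent.

    Only genes annotated to the parent term are counted, so a child term is
    not called enriched merely because its parent is enriched.
    """
    parent = set(parent_genes)
    term = set(term_genes) & parent
    selected = set(selected_genes) & parent
    overlap = len(term & selected)
    return hypergeom_upper_tail(overlap, len(term), len(selected), len(parent))


if __name__ == "__main__":
    # Hypothetical annotation: 10 genes under the parent, 3 under the child.
    parent = [f"g{i}" for i in range(10)]
    child = ["g0", "g1", "g2"]
    psgs = ["g0", "g1", "g2", "gX"]  # gX lies outside the parent and is ignored
    print(parent_child_pvalue(child, parent, psgs))
```

Restricting the universe to the parent's genes is what distinguishes this test from a standard Fisher's test against all annotated genes.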