Mathieu Seppey, Panagiotis Ioannidis, Brent C Emerson, Camille Pitteloud, Marc Robinson-Rechavi, Julien Roux, Hermes E Escalona, Duane D McKenna, Bernhard Misof, Seunggwan Shin, Xin Zhou, Robert M Waterhouse, Nadir Alvarez.
Abstract
BACKGROUND: The diversity and evolutionary success of beetles (Coleoptera) are proposed to be related to the diversity of plants on which they feed. Indeed, the largest beetle suborder, Polyphaga, mostly includes plant eaters among its approximately 315,000 species. In particular, plants defend themselves with a diversity of specialized toxic chemicals. These may impose selective pressures that drive genomic diversification and speciation in phytophagous beetles. However, evidence of changes in beetle gene repertoires driven by such interactions remains largely anecdotal and without explicit hypothesis testing.
Keywords: Beetle diversification; Beetle-plant trophic interactions; Detoxification enzymes; Gene family evolution
Year: 2019 PMID: 31101123 PMCID: PMC6525341 DOI: 10.1186/s13059-019-1704-5
Source DB: PubMed Journal: Genome Biol ISSN: 1474-7596 Impact factor: 13.583
Fig. 1. The ultrametric species phylogeny with gene family expansions and contractions quantified for nodes of interest and bar charts showing completeness of the genomic and transcriptomic datasets studied. The species tree was built from 405 single-copy orthologs and constrained to have Geadephaga (C. frigidum, E. aureus, C. hybrida) and Hydradephaga (the six other Adephaga) as monophyletic sister clades (e.g., following [6]). Branch lengths are scaled in millions of years. Maximum likelihood bootstrap support was 99 or 100% for all branches. The [G] symbol indicates data from species with sequenced genomes; the remaining species are represented by transcriptomes. The numbers of orthologous groups (OGs) with expansions (+) and contractions (−) are displayed at the root node of each suborder. Pie charts show proportions of OGs with gene losses (black) and gene gains (green) with respect to OGs with no significant losses or gains for all considered OGs (gray) and only the candidate OGs (blue). While gains constitute only a small subset of all OGs in both suborders, the proportion of gains is much larger among candidate OGs in Polyphaga. The nodes indicated by blue circles in the Polyphaga subtree lead to species-rich clades containing species that are largely phytophagous (e.g., Chrysomelidae and Curculionidae, respectively Chrys. and Curc.) and experienced larger proportions of gains among the candidate OGs. The Benchmarking Universal Single-Copy Ortholog (BUSCO) scores indicate the relative levels of completeness and putative gene duplications for the genome-based and transcriptome-based datasets in terms of 1658 BUSCOs from the insecta_odb9 assessment dataset.
Table 1. Beetle genomes and transcriptomes included in the study. Taxonomic classifications are listed with data sources, as well as completeness (Benchmarking Universal Single-Copy Ortholog, BUSCO, score, C = complete, S = complete single-copy, D = complete duplicated, F = fragmented, M = missing), number of predicted proteins, and number of orthologous groups (OGs) with genes from each species. The outgroup species used in the phylogeny, Stylops melittae, belongs to the order Strepsiptera, which is the sister group of Coleoptera [33].
| Species | Short form | Suborder | Family | Type of assembly | Accession | Bioproject | Source | BUSCO score (1658 genes) | Predicted proteins | OrthoDB groups (OG) |
|---|---|---|---|---|---|---|---|---|---|---|
| | CHYBR | Adephaga | Carabidae | Transcriptome | GDMH01000000 | PRJNA286505 | 1KITE, this study | C: 90.4% [S: 84.9%, D: 5.5%], F: 3.4%, M: 6.2% | 13,916 | 8111 |
| | CFRIG | Adephaga | Carabidae | Transcriptome | GDLF01000000 | PRJNA286499 | 1KITE, this study | C: 78.3% [S: 73.5%, D: 4.8%], F: 9.8%, M: 11.9% | 9844 | 6742 |
| | EAURE | Adephaga | Carabidae | Transcriptome | GDPI01000000 | PRJNA286520 | 1KITE, this study | C: 92.4% [S: 85.9%, D: 6.5%], F: 2.2%, M: 5.4% | 12,808 | 8020 |
| | NCLAV | Adephaga | Noteridae | Transcriptome | GDNA01000000 | PRJNA286561 | 1KITE, Vasilikopoulos et al. [ | C: 90.9% [S: 84.1%, D: 6.8%], F: 4.4%, M: 4.7% | 12,981 | 7918 |
| | HFLUV | Adephaga | Haliplidae | Transcriptome | GDMW01000000 | PRJNA286525 | 1KITE, Vasilikopoulos et al. [ | C: 91.8% [S: 76.3%, D: 15.5%], F: 4.3%, M: 3.9% | 19,408 | 8528 |
| | CLATE | Adephaga | Dytiscidae | Transcriptome | GDLH01000000 | PRJNA286512 | 1KITE, Vasilikopoulos et al. [ | C: 88.2% [S: 83.1%, D: 5.1%], F: 3.7%, M: 8.1% | 13,916 | 7256 |
| | SWRAS | Adephaga | Aspidytidae | Transcriptome | GDNH01000000 | PRJNA286492 | 1KITE, Vasilikopoulos et al. [ | C: 87.0% [S: 76.7%, D: 10.3%], F: 4.0%, M: 9.0% | 13,392 | 7721 |
| | DINEU | Adephaga | Gyrinidae | Transcriptome | GDNB01000000 | PRJNA286516 | 1KITE, Vasilikopoulos et al. [ | C: 71.9% [S: 51.6%, D: 20.3%], F: 13.6%, M: 14.5% | 14,644 | 7089 |
| | GMARI | Adephaga | Gyrinidae | Transcriptome | GAUY01000000 | PRJNA219564 | 1KITE, Misof et al. [ | C: 81.5% [S: 79.0%, D: 2.5%], F: 8.1%, M: 10.4% | 13,867 | 7663 |
| | ACURT | Polyphaga | Staphylinidae | Transcriptome | GATW01000000 | PRJNA219522 | 1KITE, Misof et al. [ | C: 91.4% [S: 87.5%, D: 3.9%], F: 4.0%, M: 4.6% | 20,280 | 8513 |
| | AGLAB | Polyphaga | Cerambycidae | Genome | GCF_000390285 | PRJNA167479 | I5k, McKenna et al. [ | C: 96.9% [S: 95.8%, D: 1.1%], F: 2.7%, M: 0.4% | 22,035 | 10,959 |
| | APLAN | Polyphaga | Buprestidae | Genome | GCF_000699045 | PRJNA230921 | I5k, unpublished | C: 92.5% [S: 91.2%, D: 1.3%], F: 4.5%, M: 3.0% | 15,497 | 9089 |
| | DPOND | Polyphaga | Curculionidae | Genome | GCF_000355655 | PRJNA162621 | Keeling et al. [ | C: 91.2% [S: 86.0%, D: 5.2%], F: 4.1%, M: 4.7% | 13,457 | 8518 |
| | LDECE | Polyphaga | Chrysomelidae | Genome | GCF_000500325 | PRJNA171749 | I5k, Schoville et al. [ | C: 88.9% [S: 87.5%, D: 1.4%], F: 9.9%, M: 1.2% | 24,671 | 11,149 |
| | LTESS | Polyphaga | Curculionidae | Transcriptome | 10.5281/zenodo.1336288 | N/A | This study, Seppey et al. [ | C: 93.8% [S: 91.9%, D: 1.9%], F: 1.4%, M: 4.8% | 18,448 | 8616 |
| | MVIOL | Polyphaga | Meloidae | Transcriptome | GATA01000000 | PRJNA219578 | 1KITE, Misof et al. [ | C: 90.3% [S: 85.6%, D: 4.7%], F: 5.9%, M: 3.8% | 14,295 | 8480 |
| | OTAUR | Polyphaga | Scarabaeidae | Genome | GCF_000648695 | PRJNA167478 | I5k, unpublished | C: 96.2% [S: 93.9%, D: 2.3%], F: 2.5%, M: 1.3% | 17,483 | 9315 |
| | TCAST | Polyphaga | Tenebrionidae | Genome | GCF_000002335 | PRJNA12540 | Richards et al. [ | C: 97.0% [S: 96.5%, D: 0.5%], F: 1.6%, M: 1.4% | 16,645 | 9429 |
| | SMELI | Outgroup | Stylopidae | Transcriptome | GAZM02000000 | PRJNA219603 | 1KITE, Misof et al. [ | C: 76.5% [S: 55.0%, D: 21.5%], F: 7.1%, M: 16.4% | 13,026 | 6104 |
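The BUSCO summary strings in the table follow a fixed notation in which C should equal S + D and C + F + M should total 100%. A minimal sketch of parsing and sanity-checking such a string is given below; the function names `parse_busco` and `is_consistent` and the rounding tolerance are illustrative assumptions, not part of BUSCO or of the study's pipeline.

```python
import re

# Pattern for BUSCO summary strings as printed in the table, e.g.
# "C: 90.9% [S: 84.1%, D: 6.8%], F: 4.4%, M: 4.7%".
# Percent signs and spaces after colons are optional to tolerate typesetting slips.
BUSCO_RE = re.compile(
    r"C:\s*(?P<C>[\d.]+)%?\s*\[S:\s*(?P<S>[\d.]+)%?,\s*D:\s*(?P<D>[\d.]+)%?\],"
    r"\s*F:\s*(?P<F>[\d.]+)%?,\s*M:\s*(?P<M>[\d.]+)%?"
)


def parse_busco(summary):
    """Return the five BUSCO percentages as floats keyed by C, S, D, F, M."""
    match = BUSCO_RE.search(summary)
    if match is None:
        raise ValueError(f"unrecognized BUSCO summary: {summary!r}")
    return {key: float(value) for key, value in match.groupdict().items()}


def is_consistent(scores, tol=0.15):
    """Check C == S + D and C + F + M == 100, within a rounding tolerance (assumed)."""
    complete_ok = abs(scores["C"] - (scores["S"] + scores["D"])) <= tol
    total_ok = abs(scores["C"] + scores["F"] + scores["M"] - 100.0) <= tol
    return complete_ok and total_ok


# TCAST row from the table.
tcast = parse_busco("C: 97.0% [S: 96.5%, D: 0.5%], F: 1.6%, M:1.4%")
tcast_ok = is_consistent(tcast)
```

A check of this kind is a cheap way to catch transcription slips in a score column before the values are used downstream.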
Table 2. Candidate gene categories with the keywords and identifiers used to select them from the full sets of sequences annotated with InterProScan. To be included as candidate orthologous groups (OGs) in the category, OGs were required to have at least one sequence matching both a UniRef and an InterProScan entry, and an additional gene ontology term in the case of serine proteases.
| Gene family category | InterProScan (Pfam or InterPro identifiers) or Gene Ontology | UniRef keyword | Number of OGs |
|---|---|---|---|
| UDP-glycosyltransferases (UGTs) | PF00201 | name:“cluster UDP glucuronosyltransferase” OR name:“cluster UDP glycosyltransferase” | 4 |
| Cytochrome P450 oxidases (P450s) | PF00067 | name:“cluster Cytochrome P450” | 22 |
| Carboxylesterases (CEs) | PF02230, PF00135 | name:“cluster carboxylesterase” OR name:“carboxylic ester hydrolase” | 19 |
| Glutathione S-transferases (GSTs) | PF00043, PF02798 | name:“cluster Glutathione S-transferase” | 6 |
| Serine proteases (SERs) | PF00450, PF12146, PF05577, GO:0008236 | name:“cluster Serine protease” OR name:“cluster Serine peptidase” | 4 |
| Cysteine proteases (CYSs) | PF00112 | name:“cluster cysteine protease” OR name:“cluster cystein protease” OR name:“cluster Papain” | 7 |
| ABC transporters (ABCs) | PF00005, PF00664 | name:“cluster ABC” | 28 |
| Glycoside hydrolases (GHs) | IPR000334, IPR000743, IPR001360, IPR001547 | name:“cluster Glycoside hydrolase” | 1 |
| Total | | | 91 |
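The inclusion rule in the caption can be sketched as a small filter. This is only an illustration under stated assumptions: the `Sequence` record, the `CATEGORIES` dictionary layout, the case-insensitive substring matching of UniRef names, and the reading that a single sequence must satisfy all criteria at once are hypothetical choices, not the authors' implementation. Only two categories from the table are shown.

```python
from dataclasses import dataclass, field


@dataclass
class Sequence:
    """Hypothetical annotation record for one protein sequence."""
    uniref_name: str = ""
    interpro_ids: set = field(default_factory=set)  # Pfam or InterPro accessions
    go_terms: set = field(default_factory=set)


# Two categories from the table; keywords are lower-cased for matching.
CATEGORIES = {
    "GST": {
        "keywords": ["glutathione s-transferase"],
        "ids": {"PF00043", "PF02798"},
        "go": set(),
    },
    "SER": {
        "keywords": ["serine protease", "serine peptidase"],
        "ids": {"PF00450", "PF12146", "PF05577"},
        "go": {"GO:0008236"},  # extra requirement for serine proteases
    },
}


def sequence_matches(seq, rule):
    """True if one sequence has a keyword hit, a domain hit, and any required GO term."""
    name = seq.uniref_name.lower()
    has_keyword = any(keyword in name for keyword in rule["keywords"])
    has_domain = bool(seq.interpro_ids & rule["ids"])
    has_go = rule["go"] <= seq.go_terms  # trivially true when no GO term is required
    return has_keyword and has_domain and has_go


def is_candidate_og(sequences, rule):
    """An OG qualifies if at least one of its sequences matches the rule."""
    return any(sequence_matches(seq, rule) for seq in sequences)


og = [Sequence("Cluster: Glutathione S-transferase", {"PF00043"})]
gst_hit = is_candidate_og(og, CATEGORIES["GST"])
```

Requiring agreement between two independent annotation sources trades recall for precision, which is consistent with the small number of candidate OGs per category in the table.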
Table 3. Candidate orthologous groups (OGs) with CAFE overall p values < 0.01 for which a model favoring selection for larger sizes in Polyphaga showed a greater likelihood. OG identifiers for functional category cytochrome P450s (P450), carboxylesterases (CE), glutathione S-transferases (GST), and cysteine proteases (CYS) are from OrthoDB v8 (ODB8 ID). Akaike Information Criterion corrected for small sample size (AICc) values are reported for all tested models. BM1 (Brownian motion with a single rate for the whole tree), BMS (Brownian motion with different rates for each regime), OU1 (selection towards the same optimum for both regimes) all represent the null hypothesis. OUM (selection towards two optima, same variance) and OUMV (selection towards two optima, two variances) represent the alternative hypotheses. The mean values in each suborder (Adephaga vs. Polyphaga) are presented in the last two columns. Values in italics indicate the preferred (maximum likelihood) model. A delta AICc > 2 is required for H1 to be retained.
| Category | ODB8 ID | BM1 AICc H0.1 | BMS AICc H0.2 | OU1 AICc H0.3 | OUM AICc H1.1 | OUMV AICc H1.2 | Mean Adephaga | Mean Polyphaga |
|---|---|---|---|---|---|---|---|---|
| P450 | EOG805VG7 | 148.37 | 153.21 | 143.35 | 148.10 | | 34.13 | 34.47 |
| CE | EOG87DCWX | 143.23 | 143.77 | 143.63 | | 141.15 | 6.55 | 18.78 |
| CE | EOG8KD911 | 87.08 | 91.23 | 82.90 | 89.10 | | 0.89 | 2.86 |
| CE | EOG876NDC | 80.64 | 85.67 | 80.08 | 87.42 | | 1.72 | 3.48 |
| GST | EOG87WR3Z | 86.24 | 87.76 | 76.05 | 74.40 | | 1.71 | 3.16 |
| GST | EOG81RS7Z | 108.77 | 114.44 | 109.19 | 113.76 | | 6.85 | 11.69 |
| GST | EOG85F05D | 117.62 | 115.88 | 111.53 | 107.62 | | 5.69 | 9.16 |
| CYS | EOG8JDKNM | 91.85 | 91.62 | 89.74 | | 88.25 | 1.80 | 3.78 |
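The model comparison described in the caption can be illustrated with a short sketch. The `aicc` function uses the standard small-sample correction; the decision rule in `retained_model` (the best alternative model must beat the best null model by more than 2 AICc units) is one plausible reading of the caption and is an assumption. The scores in the example are hypothetical and are not taken from the table.

```python
NULL_MODELS = ("BM1", "BMS", "OU1")   # H0.1, H0.2, H0.3
ALT_MODELS = ("OUM", "OUMV")          # H1.1, H1.2


def aicc(log_likelihood, k, n):
    """AICc = -2 lnL + 2k + 2k(k + 1) / (n - k - 1), for k parameters and n observations."""
    if n - k - 1 <= 0:
        raise ValueError("AICc requires n > k + 1")
    return -2.0 * log_likelihood + 2.0 * k + (2.0 * k * (k + 1)) / (n - k - 1)


def retained_model(scores, min_delta=2.0):
    """Return the best alternative model only if it beats the best null by > min_delta."""
    best_null = min(NULL_MODELS, key=scores.__getitem__)
    best_alt = min(ALT_MODELS, key=scores.__getitem__)
    if scores[best_null] - scores[best_alt] > min_delta:
        return best_alt
    return best_null


# Hypothetical AICc values, for illustration only.
example = {"BM1": 100.0, "BMS": 101.5, "OU1": 98.0, "OUM": 97.0, "OUMV": 94.5}
choice = retained_model(example)
```

Here the best null model is OU1 (98.0) and the best alternative is OUMV (94.5), a difference of 3.5, so the alternative is retained under the assumed rule.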
Fig. 2. Molecular phylogeny from the largest glutathione S-transferase (GST) orthologous group among those exhibiting lineage-specific expansions driven by selection. Red labels indicate genes belonging to species of Polyphaga, accounting for 98 out of 152 genes (their Ornstein-Uhlenbeck per-species optimum is 11.69 vs. 6.85 for Adephaga (blue labels), see Table 3). The presence of several clades of polyphagan and adephagan genes delineates duplication events following the divergence of the two suborders. Encircling the gene labels are red bars that highlight polyphagan clades with bootstrap support of > 50% and yellow bars that highlight intra-specific duplications with bootstrap support of > 50%. Corresponding full names of species are given in Table 1. Branch lengths represent substitutions per site and bootstrap support below 50% is not displayed.
Table 4. Gene family category and candidate orthologous group (OG) enrichments among positive results. The top panel presents the statistical significance of each test for enrichment of candidate gene families among the positive results when compared to the background, for Polyphaga. The lower panel indicates the number of positive results in both suborders, for candidate OGs and background. Significant values at the 0.05 threshold are shown in italics.
| Category | Positive/total OGs | Category enrichment FDR |
|---|---|---|
| P450 | 1/22 | 0.36268 |
| CE | 3/19 | |
| GST | 3/6 | |
| CYS | 1/7 | 0.16627 |
| UGT | 0/4 | 1.00000 |
| SER | 0/4 | 1.00000 |
| ABC | 0/28 | 1.00000 |
| GH | 0/1 | 1.00000 |

| Category | Positive/total OGs | Candidate vs. background enrichment |
|---|---|---|
| Background (Polyphaga) | 88/9720 | 0 |
| Candidates (Polyphaga) | 8/91 | |
| Background (Adephaga) | 21/9720 | 1 |
| Candidates (Adephaga) | 0/91 | |
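The counts in the lower panel can be turned into an enrichment probability with a one-sided hypergeometric test. The record does not name the test the authors used, so the sketch below is an assumption, as is the treatment of the candidate and background OG sets as disjoint (giving 9720 + 91 OGs in total). The function name `hypergeom_upper_tail` is illustrative.

```python
from fractions import Fraction
from math import comb


def hypergeom_upper_tail(k, n, K, N):
    """P(X >= k) when n OGs are drawn without replacement from N OGs, K of them positive."""
    total = comb(N, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(n, K) + 1))
    return float(Fraction(tail, total))  # exact integer arithmetic, converted at the end


# Counts from the table; disjoint candidate and background sets are assumed.
N_TOTAL = 9720 + 91

# Polyphaga: 8 of 91 candidate OGs are positive, 88 + 8 positives overall.
p_polyphaga = hypergeom_upper_tail(k=8, n=91, K=88 + 8, N=N_TOTAL)

# Adephaga: 0 of 91 candidate OGs are positive, 21 + 0 positives overall.
p_adephaga = hypergeom_upper_tail(k=0, n=91, K=21, N=N_TOTAL)
```

With zero positive candidates the upper tail covers every possible outcome, so `p_adephaga` is exactly 1, in line with the value shown for Adephaga in the table. For Polyphaga, roughly one positive candidate would be expected by chance against eight observed, so the tail probability is small.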