Stefanos Siozios, Jack Pilgrim, Alistair C Darby, Matthew Baylis, Gregory D D Hurst.
Abstract
BACKGROUND: It is estimated that 13% of arthropod species carry the heritable symbiont Cardinium hertigii. 16S rRNA and gyrB sequences divide this species into at least four groups (A-D), with the A group infecting a range of arthropods, the B group infecting nematode worms, the C group infecting Culicoides biting midges, and the D group associated with the marine copepod Nitocra spinipes. To date, genome sequences have been available only for strains from groups A and B, impeding general understanding of the evolutionary history of the radiation. We present a draft genome sequence for a C group Cardinium, motivated both by the paucity of genomic information outside the A and B groups and by the importance of Culicoides biting midge hosts as arbovirus vectors.
Keywords: Cardinium hertigii; Culicoides biting midges; Gene family expansion; Genome sequence; Heritable symbionts; Phylogenomic analysis
Year: 2019 PMID: 30809447 PMCID: PMC6387759 DOI: 10.7717/peerj.6448
Source DB: PubMed Journal: PeerJ ISSN: 2167-8359 Impact factor: 2.984
Genome features of the cCpun draft genome and its closest relatives.
| Feature | cCpun | cEper1 | cBtQ1 | cSfur | cHgTN10 | cPpe | Amoebophilus asiaticus |
|---|---|---|---|---|---|---|---|
| Number of scaffolds | 57 | 1 | 11 | 1 | 1 | 27 | 1 |
| Plasmids | 0 | 1 | 1 | 0 | 0 | 0 | 0 |
| Total size (kb) | 1,137 | 887 (58) | 1,013 (52) | 1,103 | 1,193 | 1,358 | 1,884 |
| GC content (%) | 33.7 | 36.6 (31.5) | 35 (32) | 39.2 | 38.2 | 35.8 | 35 |
| CDS | 917 | 841 (65) | 709 (30) | 795 | 974 | 1,131 | 1,557 |
| Avg. CDS length (bp) | 993 | 911 (733) | 1,033 (1,389) | 1,052 | 997 | 941 | 990 |
| Coding density (%) | 80 | 85.5 (82.1) | 79.7 (80.1) | 75.7 | 81.4 | 78.3 | 81.8 |
| rRNAs | 3 | 3 | 3 | 3 | 3 | 3 | 3 |
| tRNAs | 37 | 37 | 35 | 35 | 37 | 34 | 35 |
| Ankyrin repeat proteins | 46 | 18-19 | 26 | 29 | 27 | 32 | 54 |
| Reference | this study | Penz et al. (2012) | Santos-Garcia et al. (2014) | Zeng et al. (2018) | Showmaker et al. (2018) | Brown et al. (2018) | Schmitz-Esser et al. (2010) |
Notes:
contigs > 500 bp.
Paired values are given as chromosome (plasmid).
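The coding density row can be roughly cross-checked against the CDS count, average CDS length, and total size rows. The sketch below is only an approximation and not the authors' method: it assumes coding density is the summed CDS length divided by genome size, ignores overlapping genes, and uses a hypothetical helper named `coding_density`. The input values are those of the first data column (the draft genome from this study) and the last data column of the table.

```python
def coding_density(n_cds: int, avg_cds_len_bp: float, genome_size_kb: float) -> float:
    """Approximate coding density (%) as total CDS length over genome size.

    Assumption: CDSs do not overlap, so n_cds * avg_cds_len_bp is the coding span.
    """
    return 100.0 * n_cds * avg_cds_len_bp / (genome_size_kb * 1000.0)


# First data column: 917 CDS, 993 bp average length, 1,137 kb total size.
first_column = coding_density(917, 993, 1137)
# Last data column: 1,557 CDS, 990 bp average length, 1,884 kb total size.
last_column = coding_density(1557, 990, 1884)

print(f"{first_column:.1f}%  {last_column:.1f}%")
```

Both estimates land within a fraction of a percentage point of the tabulated values (80 and 81.8), which suggests the rows are internally consistent.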
Figure 1. Phylogenetic relationships of Cardinium strains.
(A) The phylogenetic tree was inferred from the concatenated analysis of 278 single-copy core proteins and, separately, from a subset of 46 core ribosomal proteins, using the Maximum Likelihood method as implemented in IQ-TREE v1.6.6 (model: LG+F+R4). Both datasets retrieved the same tree topology, and only the first is presented here. The numbers on the branches represent support values based on 1,000 bootstrap replicates (black bold values: complete matrix; blue values: ribosomal dataset). The three major Cardinium groups A, B, and C are denoted with different color shading. Cyclobacterium marinum and Marivirga tractuosa, two free-living members of the Bacteroidetes, were used as outgroups. (B, C) Distribution of the phylogenetic signal in the Cardinium concatenated ML phylogeny. The gene-wise differences in log-likelihood scores (ΔGLS) between the concatenated Maximum Likelihood tree in (A) and two alternative topologies, A,C-groups monophyletic relative to B-group (B) and B,C-groups monophyletic relative to A-group (C), were calculated as described in Shen, Hittinger & Rokas (2017) and plotted in descending order. The red bars represent the genes supporting the Maximum Likelihood tree, while the blue bars represent the genes supporting each of the alternative topologies.
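The ΔGLS calculation described for panels (B) and (C) can be illustrated with a minimal sketch. It assumes per-gene log-likelihoods have already been obtained under the ML tree and under one alternative topology; the gene names and values below are invented toy data, and `delta_gls` is a hypothetical helper, not code from the study.

```python
def delta_gls(lnl_ml_tree: dict, lnl_alt_tree: dict) -> list:
    """Return (gene, delta) pairs sorted in descending order of delta.

    delta = lnL(gene | ML tree) - lnL(gene | alternative tree);
    a positive delta means the gene favours the ML tree.
    """
    shared_genes = lnl_ml_tree.keys() & lnl_alt_tree.keys()
    diffs = {g: lnl_ml_tree[g] - lnl_alt_tree[g] for g in shared_genes}
    return sorted(diffs.items(), key=lambda kv: kv[1], reverse=True)


# Toy per-gene log-likelihoods (hypothetical values for illustration only).
lnl_ml = {"gene_a": -1200.5, "gene_b": -980.2, "gene_c": -1500.0}
lnl_alt = {"gene_a": -1203.0, "gene_b": -979.1, "gene_c": -1500.8}

ranked = delta_gls(lnl_ml, lnl_alt)
n_support_ml = sum(1 for _, d in ranked if d > 0)   # "red bars"
n_support_alt = sum(1 for _, d in ranked if d < 0)  # "blue bars"

for gene, d in ranked:
    print(f"{gene}\t{d:+.2f}")
```

Plotting the ranked values as bars, colored by sign, gives the kind of display the caption describes.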
Figure 2. Genome content comparison across the seven Amoebophilaceae genomes.
UpSet plot showing unique and overlapping protein ortholog clusters across the seven Amoebophilaceae genomes cCpun, cEper1, cBtQ1, cSfur, cHgTN10, cPpe, and Amoebophilus asiaticus. The intersection matrix is sorted in descending order. Green bars represent the orthogroup size for each genome, ordered by their phylogenetic relationships. Connected dots represent intersections of overlapping orthogroups, while vertical bars show the size of each intersection. The core orthogroup and the cCpun unique orthogroup cluster are shown with the blue and the orange bars, respectively. The plot was generated using the UpSetR package in R (Conway et al., 2017).
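The counts behind such a plot can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: the figure was produced with UpSetR in R, whereas the sketch uses plain Python; each orthogroup is assumed to be counted once, in the exact combination of genomes that contain it; and the orthogroup identifiers and memberships are invented toy data for three of the seven genomes.

```python
from collections import Counter


def exclusive_intersections(membership: dict) -> Counter:
    """Count orthogroups per exact combination of genomes containing them."""
    all_orthogroups = set().union(*membership.values())
    counts = Counter()
    for og in all_orthogroups:
        combo = frozenset(g for g, ogs in membership.items() if og in ogs)
        counts[combo] += 1
    return counts


# Toy presence/absence data (hypothetical orthogroup ids).
membership = {
    "cCpun": {"OG1", "OG2", "OG3", "OG4", "OG6", "OG7"},
    "cEper1": {"OG1", "OG2", "OG5", "OG6"},
    "cBtQ1": {"OG1", "OG3", "OG6"},
}

counts = exclusive_intersections(membership)
core = counts[frozenset(membership)]      # present in every genome
ccpun_unique = counts[frozenset({"cCpun"})]  # present only in cCpun

# Sort intersections by size, largest first, as in the plot's matrix.
for combo, n in sorted(counts.items(), key=lambda kv: -kv[1]):
    print(sorted(combo), n)
```

The `core` and `ccpun_unique` entries correspond to the two intersections that the caption highlights in blue and orange.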
Figure 3. Organization and comparison of the antifeeding prophage (Afp-like) gene clusters in the seven Amoebophilaceae genomes.
The phylogeny of the Afp-like secretion system was inferred with Maximum Likelihood based on the concatenated alignment of the 15 constituent protein sequences using IQ-TREE v1.6.6. Conserved regions are connected with a gradient of red shading based on tblastx identities. The dashed rectangles denote a duplicated region in the cPpe strain described in Brown et al. (2018). The synteny and the phylogenetic tree of the Afp-like gene clusters were visualized using the genoPlotR package (Guy, Roat Kultima & Andersson, 2010).
Figure 4. DUF1703 expansion in the cCpun genome.
(A) Phylogenetic analysis of the cCpun DUF1703 gene family. The unrooted phylogeny was inferred using maximum likelihood from the amino acid sequences of 156 DUF1703 homologs using IQ-TREE v1.6.6 (method: automated best model selection). Cardinium, Simkania, and Rickettsia homologs are shaded in blue, red, and green, respectively. (B) The unique expansion of the cCpun DUF1703 gene family within the Amoebophilaceae. (C) Phylogenetic network showing the reticulated evolution of the cCpun DUF1703 paralogs.
Figure 5. Planet DUF1703.
Abundance and taxonomic distribution of DUF1703 proteins in the Pfam database. *: cCpun genome. The graph was constructed using Circos v0.69 (Krzywinski et al., 2009).
Examples of cCpun genes that likely originated from horizontal gene transfer (HGT).
| Gene id | Length (AA) | Annotation | Taxonomy of the best BLAST hit (GenBank accession) | E-value | AA identity (%) |
|---|---|---|---|---|---|
| CCPUN_00040 | 308 | Hypothetical protein, putative transposase | | 2E-128 | 64 |
| CCPUN_00530 | 328 | Hypothetical protein, putative transposase | | 3E-124 | 62 |
| CCPUN_01090 | 346 | Hypothetical protein, putative transposase | Rickettsiales bacterium, ( | 6E-133 | 58 |
| CCPUN_02050 | 379 | Hypothetical protein, putative transposase | Rickettsiales bacterium, ( | 5E-55 | 44 |
| CCPUN_04150 | 328 | Hypothetical protein, putative transposase | | 9E-125 | 59 |
| CCPUN_04430 | 297 | Hypothetical protein, putative transposase | Rickettsiales bacterium, ( | 9E-136 | 65 |
| CCPUN_01120 | 218 | Carbonic anhydrase | | 2E-95 | 59 |
| CCPUN_03570 | 551 | DNA repair protein RecN | Rickettsiales bacterium, ( | 2E-175 | 48 |
| CCPUN_03900 | 258 | Hypothetical protein, putative transposase | | 3E-114 | 67 |
| CCPUN_06490 | 469 | Arginine/agmatine antiporter | Gammaproteobacteria bacterium 39-13, ( | 4E-112 | 43 |
| CCPUN_07910 | 266 | Chromosome-partitioning protein Spo0J | | 9E-101 | 57 |
| CCPUN_07920 | 327 | Sporulation initiation inhibitor protein Soj | | 5E-135 | 62 |
| CCPUN_08840 | 436 | Folylpolyglutamate synthase | Wolbachia pipientis, ( | 0E+00 | 76 |
| CCPUN_08910 | 340 | Hypothetical protein | | 2E-155 | 73 |
| CCPUN_03830 | 426 | Hypothetical protein | | 2E-60 | 39 |
| CCPUN_08280 | 1,360 | Hypothetical protein | Aedes albopictus, ( | 5E-72 | 27 |