Allison K Hansen, Ariana N Sanchez, Younghwan Kwak.
Abstract
Candidatus (Ca.) Liberibacter taxa are economically important bacterial plant pathogens that are not culturable; however, genome-enabled insights can help us develop a deeper understanding of their host-microbe interactions and evolution. The draft genome of a recently identified Liberibacter taxon, Ca. Liberibacter capsica, was curated and annotated here, with a total draft genome size of 1.1 Mb and 1,036 proteins, which is comparable to other Liberibacter species with complete genomes. A total of 459 orthologous clusters were identified among Ca. L. capsica, Ca. L. asiaticus, Ca. L. psyllaurous, Ca. L. americanus, Ca. L. africanus, and L. crescens, and the genes within these clusters consisted of housekeeping and environmental response functions. We estimated the rates of molecular evolution for each of the 443 one-to-one ortholog clusters and found that all Ca. L. capsica orthologous pairs were under purifying selection when the synonymous substitutions per synonymous site (dS) were not saturated. These results suggest that these genes are largely maintaining their conserved functions. We also identified the most divergent single-copy orthologous proteins in Ca. L. capsica by analyzing the ortholog pairs with the highest nonsynonymous substitutions per nonsynonymous site (dN) values for each pairwise comparison. From these analyses, we found that 21 proteins known to be involved in pathogenesis and host-microbe interactions, including the Tad pilus complex, were consistently divergent between Ca. L. capsica and the majority of other Liberibacter species. These results further our understanding of the evolutionary genetics of Ca. L. capsica and, more broadly, the evolution of Liberibacter.
IMPORTANCE "Candidatus" (Ca.) Liberibacter taxa are economically important plant pathogens vectored by insects; however, these host-dependent bacterial taxa are extremely difficult to study because they are unculturable. Recently, we identified a new Ca. Liberibacter lineage (Ca. Liberibacter capsica) from a rare insect metagenomic sample. In the current study, we report that the draft genome of Ca. Liberibacter capsica is similar in genome size and protein content to those of the other Ca. Liberibacter taxa. We provide evidence that many of their shared genes, which encode housekeeping and environmental response functions, are evolving under purifying selection, suggesting that these genes are maintaining similar functions. Our study also identifies 21 proteins that are rapidly accumulating amino acid changes in Ca. Liberibacter capsica compared to the majority of other Liberibacter taxa. Many of these proteins represent key genes involved in Liberibacter-host interactions and pathogenesis and are valuable candidate genes for future studies.
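The purifying-selection test described in the abstract can be illustrated with a minimal sketch. It assumes the conventional reading of the ratio dN/dS (below 1 suggests purifying selection) and skips pairs whose dS is saturated. The cutoff `ds_max = 3.0`, the function name, and the toy values are all hypothetical and are not taken from the study.

```python
from typing import Optional


def classify_selection(dn: float, ds: float, ds_max: float = 3.0) -> Optional[str]:
    """Label one ortholog pair from its dN and dS estimates.

    ds_max is a hypothetical saturation cutoff (an assumption, not the
    study's value). Pairs with dS <= 0 or dS >= ds_max return None because
    the dN/dS ratio is not interpretable for them.
    """
    if ds <= 0 or ds >= ds_max:
        return None
    omega = dn / ds  # dN/dS ratio
    if omega < 1:
        return "purifying"
    if omega > 1:
        return "positive"
    return "neutral"


# Toy (dN, dS) values for three made-up ortholog pairs.
pairs = {
    "orthoA": (0.12, 1.40),
    "orthoB": (0.30, 4.20),  # dS above the assumed cutoff, so it is skipped
    "orthoC": (0.05, 0.60),
}
labels = {name: classify_selection(dn, ds) for name, (dn, ds) in pairs.items()}
print(labels)
```

Returning `None` for saturated pairs keeps them visibly separate from pairs that were actually tested, which mirrors the abstract's qualification "when dS were not saturated".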
Keywords: Ca. Liberibacter capsica; Candidatus (Ca.) Liberibacter; Tad pilus complex; horizontal gene transfer; purifying selection
Year: 2022 PMID: 35900091 PMCID: PMC9430466 DOI: 10.1128/spectrum.02091-22
Source DB: PubMed Journal: Microbiol Spectr ISSN: 2165-0497
Genome statistics of the draft genome of Ca. Liberibacter capsica compared to fully sequenced reference genomes of Liberibacter species
| Species | NCBI ID | nt length | Number of proteins | Clusters | Singletons |
|---|---|---|---|---|---|
| Ca. L. capsica | This study | 1,111,229 | 1,036 | 580 | 409 |
| | | 1,192,232 | 1,075 | 912 | 131 |
| | | 1,195,201 | 992 | 897 | 69 |
| | | 1,225,162 | 1,082 | 916 | 131 |
| | | 1,258,278 | 1,132 | 955 | 95 |
| | | 1,522,119 | 1,331 | 871 | 396 |
FIG 1 Orthologous gene clusters among Ca. L. capsica, Ca. L. asiaticus, Ca. L. psyllaurous, Ca. L. americanus, Ca. L. africanus, and L. crescens. (A) Overlap of orthologous gene clusters that are unique and shared between the six Liberibacter species. Numbers on the Venn diagram indicate the number of orthologous clusters shared by the species with overlapping sectors. Note that multiple genes per taxon can be present within an orthologous gene cluster. See Tables S2 and S3 for details. (B) A total of 459 orthologous clusters are shared among all six Liberibacter taxa, and these clusters belong to 172 unique biological GO term categories, which are represented as circles in the figure (see Table S2 for details). Multidimensional scaling (MDS) was used in Revigo to reduce the dimensionality of the matrix for these 172 GO terms based on pairwise similarities in biological function. The size and color of each circle indicate the relative number of proteins within each unique GO term category. The 17 gene clusters that contain the most proteins are labeled in the figure.
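The distinction between clusters shared by all six species, one-to-one clusters, and species-unique clusters can be sketched as follows. This is an illustrative toy, not the clustering pipeline used in the study: the cluster IDs, gene names, and short species labels are hypothetical.

```python
from collections import defaultdict

# Short labels for the six taxa compared in the figure.
SPECIES = ["capsica", "asiaticus", "psyllaurous", "americanus", "africanus", "crescens"]

# Hypothetical clusters: each is a list of (species, gene) members.
cluster_members = {
    "cl1": [(s, f"{s}_g1") for s in SPECIES],                   # one gene per species
    "cl2": [("capsica", "capsica_g2"), ("capsica", "capsica_g3")],  # paralogs, one species
    "cl3": [("asiaticus", "asiaticus_g2"), ("crescens", "crescens_g2")],
    "cl4": [(s, f"{s}_g4") for s in SPECIES] + [("crescens", "crescens_g5")],
}


def species_in(members):
    """Set of species represented in one cluster."""
    return {species for species, _gene in members}


# Clusters containing at least one gene from every species.
core = sorted(cid for cid, m in cluster_members.items() if species_in(m) == set(SPECIES))

# Core clusters with exactly one gene per species (one-to-one orthologs).
one_to_one = sorted(cid for cid in core if len(cluster_members[cid]) == len(SPECIES))

# Clusters whose members all come from a single species.
unique = defaultdict(list)
for cid, m in cluster_members.items():
    present = species_in(m)
    if len(present) == 1:
        unique[next(iter(present))].append(cid)

print(core, one_to_one, dict(unique))
```

Because a cluster may hold several genes from one taxon, the set of one-to-one clusters is a subset of the shared clusters, which is consistent with the 443 one-to-one clusters reported out of 459 shared clusters.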
Annotated protein clusters that are unique for each Liberibacter species
| Species | Protein count | Swiss-Prot hit | GO annotation |
|---|---|---|---|
| | 3 | | GO:0006101; P:citrate metabolic process |
| | 3 | | GO:0006260; P:DNA replication |
| | 2 | | GO:0009307; P:DNA restriction-modification system |
| | 3 | | GO:0009253; P:peptidoglycan catabolic process |
| | 2 | | NA |
| | 6 | | GO:0016021; C:integral component of membrane |
| | 4 | | GO:0009253; P:peptidoglycan catabolic process |
| | 2 | | GO:0015293; F:symporter activity |
| | 2 | | GO:0006535; P:cysteine biosynthetic process from serine |
| | 2 | | GO:0042493; P:response to drug |
| | 2 | | GO:0016747; F:transferase activity |
| | 2 | | GO:0000105; P:histidine biosynthetic process |
| | 2 | | GO:0042883; P:cysteine transport |
| | 2 | | GO:0006865; P:amino acid transport |
FIG 2 Phylogenetic analysis of MFS transporter proteins in Ca. L. americanus. The tree was rooted with the Firmicute outgroup. Bootstrap values are indicated for nodes with support of 50% or above.
FIG 3 Box and whisker plots showing the divergence between 1:1 orthologs of Ca. L. capsica and other Liberibacter species based on nonsynonymous substitutions per nonsynonymous site (dN). All points shown are outliers, and red points are dN values for proteins that are consistently in the top 10% per species pair for all Liberibacter species comparisons. All dN values are shown per pairwise comparison (N = 443).
Annotations of the most diverged one-to-one orthologs between Ca. L. capsica and other Liberibacter species
| Protein description | Scientific name | % identity | E value | KO |
|---|---|---|---|---|
| 4-hydroxybenzoate octaprenyltransferase | Ca. L. europaeus | 73 | 8.00E-70 | |
| 1 pyruvate dehydrogenase complex dihydrolipoamide acetyltransferase | Ca. L. americanus | 69.9 | 2.55E-72 | |
| trypsin-like peptidase domain-containing protein, DegP* | Ca. L. americanus | 64.1 | 6.30E-216 | |
| hypothetical protein* | Ca. L. americanus | 68.1 | 1.18E-72 | |
| DNA polymerase III subunit chi* | Ca. L. europaeus | 67.9 | 3.05E-19 | |
| PAS domain S-box protein* | Ca. L. americanus | 72.9 | 0 | |
| hypothetical protein, COG5462* | Ca. L. americanus | 65.5 | 7.63E-28 | |
| Flp pilus assembly protein, secretin CpaC | Ca. L. americanus | 74.8 | 6.17E-208 | |
| Ribosomal protein L10* | Ca. L. americanus | 75 | 2.66E-47 | |
| preprotein translocase subunit SecG* | Ca. L. americanus | 60.3 | 1.51E-37 | |
| hypothetical protein* | Ca. L. americanus | 70.8 | 7.00E-236 | |
| pilus assembly protein, TadG/RcpC* | Ca. L. americanus | 51.6 | 7.22E-23 | |
| DUF1217 domain-containing protein; similar to FlgF* | Ca. L. americanus | 68.3 | 5.47E-145 | |
| 50S ribosomal protein L9 | Ca. L. americanus | 65.7 | 4.80E-20 | |
| pilus assembly protein, N-terminal domain-containing, CpaI* | Ca. L. americanus | 44.8 | 4.58E-36 | |
| hypothetical protein* | Ca. L. americanus | 70.5 | 2.66E-85 | |
| outer membrane protein assembly factor BamA* | Ca. L. americanus | 70.6 | 2.69E-306 | |
| RIP metalloprotease RseP* | Ca. L. americanus | 62.4 | 2.18E-118 | |
| M23 family metallopeptidase | Ca. L. americanus | 61.9 | 6.15E-109 | |
| disulfide bond formation protein B* | Ca. L. americanus | 73.7 | 8.12E-78 | |
| peptidylprolyl isomerase* | Ca. L. americanus | 69.3 | 1.56E-89 | |
Proteins that are consistently the most divergent (top 10% of dN values) for all Liberibacter pairwise comparisons with Ca. L. capsica are indicated with an asterisk (*). Proteins that are not marked with an asterisk consistently have the highest dN values (top 10%) for all Liberibacter pairwise comparisons except the comparison with the most divergent species, L. crescens. Protein descriptions, closest sequence match, identity, and E values are based on BLASTP against the NCBI nr database.
KO term based on BlastKOALA's best hit.
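The "consistently in the top 10% of dN values" criterion from the table footnote can be sketched as an intersection of per-comparison top sets. This is a hedged illustration only: the exact percentile rule used in the study is not stated here, so the sketch assumes a simple rank cut (the top `ceil(n * fraction)` proteins per comparison), and the protein IDs, comparison labels, and dN values are hypothetical.

```python
import math


def top_fraction(dn_by_protein, fraction=0.10):
    """Proteins with the highest dN in one pairwise comparison.

    Assumes a rank-based cut: the top ceil(n * fraction) proteins,
    with at least one protein always returned.
    """
    n_top = max(1, math.ceil(len(dn_by_protein) * fraction))
    ranked = sorted(dn_by_protein, key=dn_by_protein.get, reverse=True)
    return set(ranked[:n_top])


def consistently_divergent(dn_tables, fraction=0.10):
    """Proteins in the top fraction of dN for every comparison supplied."""
    tops = [top_fraction(table, fraction) for table in dn_tables.values()]
    return set.intersection(*tops)


# Toy dN values for five made-up orthologs in two pairwise comparisons.
dn_tables = {
    "vs_americanus": {"p0": 0.90, "p1": 0.80, "p2": 0.10, "p3": 0.12, "p4": 0.08},
    "vs_asiaticus": {"p0": 0.85, "p1": 0.15, "p2": 0.70, "p3": 0.11, "p4": 0.09},
}

# A larger fraction is used here only because the toy set is tiny.
hits = consistently_divergent(dn_tables, fraction=0.4)
print(hits)
```

Leaving one comparison out of `dn_tables` (for example the most divergent species) before calling `consistently_divergent` would correspond to the proteins listed without an asterisk.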
FIG 4 Phylogenetic analyses of Tad pilus genes in Liberibacter taxa and relatives for (A) TadG, (B) CpaI, and (C) CpaC genes. Branches are colored according to their bacterial classes within the Proteobacteria, and ranges of bootstrap values are indicated by filled or open circles as indicated in the key. See Fig. S1, S2, and S3 in the supplemental material for trees with detailed taxa names and accession numbers.