Nicolas Helmstetter, Aleksandra D Chybowska, Christopher Delaney, Alessandra Da Silva Dantas, Hugh Gifford, Theresa Wacker, Carol Munro, Adilia Warris, Brian Jones, Christina A Cuomo, Duncan Wilson, Gordon Ramage, Rhys A Farrer.
Abstract
Candida glabrata is the second most common etiological cause of worldwide systemic candidiasis in adult patients. Genome analysis of 68 isolates from 8 hospitals across Scotland, together with 83 global isolates, revealed insights into the population genetics and evolution of C. glabrata. Clinical isolates of C. glabrata from across Scotland are highly genetically diverse, including at least 19 separate sequence types that have been recovered previously in globally diverse locations, and 1 newly discovered sequence type. Several sequence types had evidence for ancestral recombination, suggesting transmission between distinct geographical regions has coincided with genetic exchange arising in new clades. Three isolates were missing MATα1, potentially representing a second mating type. Signatures of positive selection were identified in every sequence type, including enrichment for epithelial adhesins thought to facilitate fungal adhesion to human epithelial cells. In-patient microevolution was identified from 7 sets of recurrent cases of candidiasis, revealing an enrichment for nonsynonymous and frameshift indels in cell surface proteins. Microevolution within patients also affected epithelial adhesin genes and several genes involved in drug resistance, including the ergosterol synthesis gene ERG4 and the echinocandin target FKS1/2, the latter coinciding with a marked drop in fluconazole minimum inhibitory concentration. In addition to nuclear genome diversity, the C. glabrata mitochondrial genome was particularly diverse, with reduced conserved sequence and conserved protein-encoding genes in all nonreference ST15 isolates. Together, this study highlights the genetic diversity within the C. glabrata population that may impact virulence and drug resistance, and 2 major mechanisms generating this diversity: microevolution and genetic exchange/recombination.
Keywords: Candida glabrata; candidiasis; drug-resistance; epidemiology; evolution; genome sequencing; microevolution; mitochondria
Year: 2022 PMID: 35199143 PMCID: PMC9071574 DOI: 10.1093/genetics/iyac031
Source DB: PubMed Journal: Genetics ISSN: 0016-6731 Impact factor: 4.402
Fig. 1. Candida glabrata isolates were collected from 8 health boards across Scotland in 2012, belonging to 20 separate sequence types, including the newly described ST204. Duplicate isolates stemming from the same patient at different time points have been excluded.
Fig. 2. A NeighborNet network using SplitsTree, with ST labels replacing isolate names at the nodes. Green = found in Scotland, purple = not found in Scotland. The scale bar represents nucleotide substitutions per site.
Fig. 3. a) A RAxML phylogenetic tree with 1000 bootstrap support of all Saccharomycetaceae species that had a genome assembly in NCBI or JGI Mycocosm and nucleotide diversity (π). Note: π for C. glabrata is calculated from whole-genome sequence data presented in this study, while the other species are based on ITS sequences only (Irinyi ). b) π (based on ITS sequences only) for all Saccharomycotina and non-Saccharomycotina that are listed in the ISHAM ITS reference DNA barcoding database (Irinyi ). c) Nonoverlapping 5, 10, and 20 kb windows of π (π for all sites in the genome divided by window length).
Fig. 4. Population genetics of C. glabrata STs. a) PCA of whole-genome SNPs using SmartPCA revealed little evidence of sub-clustering among STs (isolates are calculated and plotted individually, but labeled by their ST alone for clarity). SmartPCA failed to calculate the eigenvalues for some isolates, including those belonging to ST4. b) The CV error from running unsupervised ADMIXTURE for variant sites across the C. glabrata population, testing K-values between 1 and 35. K = 20 provided the lowest CV error. c) ADMIXTURE plot for all isolates using K = 20, revealing several isolates with evidence of mixed ancestry. Isolates are ordered according to the neighbor-joining tree constructed with PAUP in Supplementary Fig. 1.
Fig. 5. Breadth of coverage and depth of coverage across each of the 37 mitochondrial encoded genes for all 151 isolates compared in this study (each point represents an isolate). a) Breadth of coverage as a % across each gene. b) The normalized depth of coverage for each gene (total read depth for each gene/total read depth across both nuclear and mitochondrial genomes). c) Breadth of coverage as a % across each gene, categorized by sequence type (ST). d) Normalized depth of coverage for each gene, categorized by ST.
GO-term and PFAM enrichment [2-tailed Fisher exact test with FDR-corrected P-values (q) of < 0.05] for genes with dN/dS (ω) > 1, and genes with either microevolutionary frameshifts or nonsynonymous mutations across the 7 sets of serial isolates.
| Category | GO/PFAM term | Set 1 count | Set 2 count | Fisher P | q | Rel. prop | GO/PFAM description |
|---|---|---|---|---|---|---|---|
| dN/dS > 1 | GO:0003723 | 428 | 25 | 2.26E-04 | 3.04E-02 | 1.95 | RNA binding |
| dN/dS > 1 | GO:0003735 | 152 | 3 | 7.38E-05 | 1.49E-02 | 5.78 | Structural constituent of ribosome |
| dN/dS > 1 | GO:0003824 | 1745 | 155 | 8.61E-05 | 1.64E-02 | 1.28 | Catalytic activity |
| dN/dS > 1 | GO:0005515 | 1100 | 81 | 4.48E-06 | 1.41E-03 | 1.55 | Protein binding |
| dN/dS > 1 | GO:0005740 | 278 | 11 | 5.08E-05 | 1.06E-02 | 2.88 | Mitochondrial envelope |
| dN/dS > 1 | GO:0005759 | 160 | 4 | 1.90E-04 | 2.65E-02 | 4.56 | Mitochondrial matrix |
| dN/dS > 1 | GO:0006412 | 287 | 11 | 2.68E-05 | 6.05E-03 | 2.98 | Translation |
| dN/dS > 1 | GO:0019693 | 100 | 1 | 3.63E-04 | 4.35E-02 | 11.4 | Ribose phosphate metabolic process |
| dN/dS > 1 | GO:0022626 | 87 | 0 | 1.16E-04 | 1.99E-02 | N/A | Cytosolic ribosome |
| dN/dS > 1 | GO:0022857 | 277 | 13 | 4.08E-04 | 4.70E-02 | 2.43 | Transmembrane transporter activity |
| dN/dS > 1 | GO:0036094 | 724 | 52 | 2.58E-04 | 3.31E-02 | 1.59 | Small molecule binding |
| dN/dS > 1 | GO:0043168 | 689 | 48 | 1.89E-04 | 2.65E-02 | 1.64 | Anion binding |
| dN/dS > 1 | GO:0044281 | 498 | 27 | 1.57E-05 | 3.86E-03 | 2.1 | Small molecule metabolic process |
| dN/dS > 1 | GO:0044391 | 143 | 3 | 2.32E-04 | 3.05E-02 | 5.44 | Ribosomal subunit |
| dN/dS > 1 | GO:0071840 | 1325 | 113 | 2.76E-04 | 3.47E-02 | 1.34 | Cellular component organization or biogenesis |
| dN/dS > 1 | GO:1901362 | 364 | 19 | 1.92E-04 | 2.65E-02 | 2.18 | Organic cyclic compound biosynthetic process |
| dN/dS > 1 | PF00624.20 | 4 | 23 | 6.53E-20 | 7.42E-17 | 0.02 | Flocculin repeat |
| dN/dS > 1 | PF10528.11 | 12 | 10 | 1.87E-05 | 1.07E-02 | 0.13 | GLEYA domain |
| dN/dS > 1 | PF00514.25 | 8 | 8 | 5.72E-05 | 2.17E-02 | 0.11 | Armadillo repeat |
| | | | | | | | |
| Microevolution | GO:0009986 | 30 | 9 | 7.53E-12 | 4.26E-08 | 0.03 | Cell surface |
| Microevolution | GO:0009987 | 3616 | 17 | 1.20E-05 | 3.38E-02 | 1.7 | Cellular process |
| Microevolution | PF05001.15 | 0 | 17 | 1.10E-39 | 1.25E-36 | 0 | RNA polymerase Rpb1 C-terminal repeat |
| Microevolution | PF10528.11 | 13 | 9 | 5.54E-15 | 3.15E-12 | 0.01 | GLEYA domain |
| Microevolution | PF00399.21 | 24 | 9 | 4.09E-13 | 1.55E-10 | 0.02 | Yeast PIR protein repeat |
| Microevolution | PF08238.14 | 14 | 6 | 2.64E-09 | 7.49E-07 | 0.02 | Sel1 repeat |
| Microevolution | PF11765.10 | 6 | 3 | 2.58E-05 | 5.86E-03 | 0.01 | Hyphally regulated cell wall protein N-terminal |
| Microevolution | PF09770.11 | 0 | 2 | 4.79E-05 | 9.08E-03 | 0 | Topoisomerase II-associated protein PAT1 |
| | | | | | | | |
| Microevolution | GO:0009986 | 28 | 11 | 7.88E-10 | 4.45E-06 | 0.06 | Cell surface |
| Microevolution | PF10528.11 | 12 | 10 | 3.14E-11 | 3.58E-08 | 0.03 | GLEYA domain |
| Microevolution | PF11765.10 | 4 | 5 | 1.03E-06 | 5.87E-04 | 0.02 | Hyphally regulated cell wall protein N-terminal |
The relative proportion (Rel. prop) was calculated as (number of terms in set 1/number of terms in set 2) × (genes with any terms in set 2/genes with any terms in set 1).
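The relative-proportion formula can be written as a small function. This is a minimal sketch: the function name and argument names are illustrative, and the totals of annotated genes (4000 and 400 below) are hypothetical placeholders because they are not reported in this record, so the printed value is not expected to match the table.

```python
def relative_proportion(terms_set1, terms_set2, annotated_set1, annotated_set2):
    """(terms in set 1 / terms in set 2) x (annotated genes in set 2 / annotated genes in set 1).

    Returns None when the ratio is undefined (no occurrences in set 2),
    which corresponds to an N/A entry in the table.
    """
    if terms_set2 == 0 or annotated_set1 == 0:
        return None
    return (terms_set1 / terms_set2) * (annotated_set2 / annotated_set1)


# Counts for GO:0003723 (428 and 25) come from the table;
# the annotated-gene totals are hypothetical.
print(relative_proportion(428, 25, 4000, 400))

# A term absent from set 2 (e.g. GO:0022626, 87 vs 0) has no defined ratio.
print(relative_proportion(87, 0, 4000, 400))
```

A term absent from set 1 gives 0, consistent with the rows where the first count is 0.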
Summary of microevolution across 7 sets of between 2 and 9 C. glabrata isolates from recurrent cases of candidiasis.
| Case ID | Initial | Relapse | ST | All Mutations | Coding | Noncoding | Coding Indel | Coding Indel (frameshift) | Coding (Non.Syn.) | Coding Nonsense | Coding (Syn.) | Coding (revert to ref.) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | CG107A | CG107B | 36 | 97 | 83 | 14 | 18 | 9 | 16 | 0 | 17 | 32 |
| 2 | CG18A | CG18B | 10 | 140 | 127 | 13 | 20 | 7 | 24 | 0 | 20 | 63 |
| 3 | CG191A | CG191B | 10 | 64 | 53 | 11 | 13 | 4 | 11 | 0 | 8 | 21 |
| 3 | CG191B | CG191C | 10 | 78 | 66 | 12 | 14 | 9 | 10 | 0 | 8 | 34 |
| 3 | CG191C | CG191D | 10 | 83 | 72 | 11 | 17 | 12 | 14 | 0 | 10 | 31 |
| 3 | CG191D | CG191E | 10 | 96 | 85 | 11 | 14 | 4 | 12 | 0 | 9 | 50 |
| 3 | CG191E | CG191F | 10 | 87 | 77 | 10 | 23 | 16 | 19 | 0 | 17 | 18 |
| 4 | CG48A | CG48F | 7 | 92 | 79 | 13 | 10 | 3 | 11 | 0 | 13 | 45 |
| 5 | CG84F | CG84G | 67 | 76 | 64 | 12 | 17 | 5 | 8 | 0 | 5 | 34 |
| 5 | CG84G | CG84H | 67 | 71 | 59 | 12 | 21 | 5 | 9 | 0 | 3 | 26 |
| 6 | CG93A | CG93B | 162 | 125 | 114 | 11 | 20 | 4 | 15 | 0 | 16 | 63 |
| 6 | CG93B | CG93C | 162 | 119 | 105 | 14 | 23 | 10 | 23 | 0 | 18 | 41 |
| 6 | CG93C | CG93D | 162 | 124 | 110 | 14 | 19 | 7 | 16 | 0 | 9 | 66 |
| 6 | CG93D | CG93E | 162 | 112 | 97 | 15 | 24 | 10 | 20 | 1 | 12 | 40 |
| 6 | CG93E | CG93H | 162 | 119 | 105 | 14 | 21 | 5 | 20 | 0 | 18 | 46 |
| 6 | CG93H | CG93I | 162 | 116 | 102 | 14 | 20 | 6 | 13 | 0 | 14 | 55 |
| 6 | CG93I | CG93J | 162 | 96 | 82 | 14 | 22 | 13 | 13 | 0 | 11 | 36 |
| 6 | CG93J | CG93K | 162 | 135 | 120 | 15 | 21 | 11 | 22 | 1 | 28 | 48 |
| 7 | CG97A | CG97B | 25 | 79 | 66 | 13 | 14 | 3 | 15 | 0 | 10 | 27 |
| 7 | CG97B | CG97C | 25 | 86 | 75 | 11 | 11 | 4 | 12 | 0 | 18 | 34 |
We documented 1,995 mutations between all serial isolates, which were either in protein-coding regions (Coding) or in intergenic and intronic regions (Noncoding). Coding mutations were further characterized as indels (Coding Indel), some of which caused frameshifts [Coding Indel (frameshift)], nonsynonymous mutations [Coding (Non.Syn.)], nonsense mutations (Coding Nonsense), synonymous mutations [Coding (Syn.)], and bases that reverted to the ST15 CBS138 reference base (either from a previous microevolutionary change or a pre-existing variant between the initial isolate and the reference ST15 CBS138).
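The stated total can be cross-checked against the table. The sketch below copies the All Mutations, Coding, and Noncoding columns for the 20 pairwise comparisons and confirms that each row partitions into coding plus noncoding and that the rows sum to the reported total; the variable names are illustrative.

```python
# (All Mutations, Coding, Noncoding) for each serial-isolate comparison,
# in the order the rows appear in the table.
ROWS = [
    (97, 83, 14), (140, 127, 13),                       # cases 1 and 2
    (64, 53, 11), (78, 66, 12), (83, 72, 11),           # case 3
    (96, 85, 11), (87, 77, 10),
    (92, 79, 13),                                       # case 4
    (76, 64, 12), (71, 59, 12),                         # case 5
    (125, 114, 11), (119, 105, 14), (124, 110, 14),     # case 6
    (112, 97, 15), (119, 105, 14), (116, 102, 14),
    (96, 82, 14), (135, 120, 15),
    (79, 66, 13), (86, 75, 11),                         # case 7
]

# Every mutation is either coding or noncoding.
rows_consistent = all(total == coding + noncoding for total, coding, noncoding in ROWS)

total_mutations = sum(total for total, _, _ in ROWS)
print(rows_consistent, total_mutations)  # True 1995
```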
Fig. 6. Microevolutionary changes across 7 sets of C. glabrata isolates. a) A RAxML phylogenetic tree of the serial isolates using the general-time-reversible model and CAT rate approximation with 100 bootstrap support. Branch lengths indicate the mean number of changes per site. b) The number of serial mutations in total (All) and those within protein-coding sequence (CDS), intergenic, and intronic regions. c) Those same serial mutations per kb [calculated as the count of serial mutations divided by the total length of the feature (where All = whole genome) and multiplied by 1,000]. d) Serial mutations within CDS categorized by their effect on the sequence: insertion/deletion (Indel), synonymous mutation (Syn.), nonsynonymous mutation (Non.Syn.), and nonsense mutation. e) Those same serial mutations within CDS per kb.
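The per-kb normalization described for panels c) and e) amounts to one line of arithmetic. A minimal sketch follows; the function name is illustrative and the feature length in the example is a hypothetical round number, not a value taken from the study.

```python
def mutations_per_kb(mutation_count, feature_length_bp):
    """Count of serial mutations divided by feature length (bp), multiplied by 1,000."""
    if feature_length_bp <= 0:
        raise ValueError("feature length must be positive")
    return mutation_count * 1000 / feature_length_bp


# Hypothetical example: 50 mutations across a 100,000 bp feature.
print(mutations_per_kb(50, 100_000))  # 0.5
```

Normalizing by length makes counts comparable between features of very different sizes, such as the whole genome versus intronic sequence.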
MIC values of fluconazole for each of the serial isolates.
| Case ID | Strain | MIC (µg/ml) | Change |
|---|---|---|---|
| 1 | 107a | 8 | |
| 1 | 107b | 16 | 8 |
| 2 | 18a | 8 | |
| 2 | 18b | 8 | |
| 3 | 191a | 8 | |
| 3 | 191b | 8 | |
| 3 | 191c | 8 | |
| 3 | 191d | 8 | |
| 3 | 191e | 8 | |
| 3 | 191f | 8 | |
| 4 | 48a | 4 | |
| 4 | 48f | 4 | |
| 5 | 84f | 8 | |
| 5 | 84g | 8 | |
| 5 | 84h | 4 | −4 |
| 6 | 93a | 4 | |
| 6 | 93b | 4 | |
| 6 | 93c | 4 | |
| 6 | 93d | 4 | |
| 6 | 93e | 4 | |
| 6 | 93h | >64 | >60 |
| 6 | 93i | >64 | |
| 6 | 93j | >64 | |
| 6 | 93k | 4 | −60 |
| 7 | 97a | 4 | |
| 7 | 97b | 4 | |
| 7 | 97c | 8 | 4 |
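The Change column appears to be the difference in MIC from the preceding isolate of the same case, left blank when the value is unchanged. The sketch below reproduces that convention under stated assumptions: a ">" reading is treated as its numeric bound, the ">" prefix is carried over only when the later reading is above range, and an ASCII minus sign is used. The function names are illustrative.

```python
def parse_mic(text):
    """Return (value, above_range) for an MIC string such as '8' or '>64'."""
    text = text.strip()
    above_range = text.startswith(">")
    return float(text.lstrip(">")), above_range


def mic_changes(mics):
    """Change in MIC relative to the preceding isolate; '' when unchanged or first."""
    changes = [""] if mics else []
    for previous, current in zip(mics, mics[1:]):
        prev_value, _ = parse_mic(previous)
        cur_value, cur_above = parse_mic(current)
        diff = cur_value - prev_value
        if diff == 0:
            changes.append("")
        else:
            changes.append((">" if cur_above else "") + f"{diff:g}")
    return changes


# Case 6 (isolates 93a to 93k), MIC values in µg/ml as listed in the table.
case6 = ["4", "4", "4", "4", "4", ">64", ">64", ">64", "4"]
print(mic_changes(case6))
# ['', '', '', '', '', '>60', '', '', '-60']
```

Because ">64" is only a lower bound, the drop from 93j to 93k is at least 60 µg/ml in magnitude rather than exactly 60.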