Rhys A Farrer, Christopher A Desjardins, Sharadha Sakthikumar, Sharvari Gujja, Sakina Saif, Qiandong Zeng, Yuan Chen, Kerstin Voelz, Joseph Heitman, Robin C May, Matthew C Fisher, Christina A Cuomo.
Abstract
Cryptococcus gattii is a fungal pathogen of humans, causing pulmonary infections in otherwise healthy hosts. To characterize genomic variation among the four major lineages of C. gattii (VGI, -II, -III, and -IV), we generated, annotated, and compared 16 de novo genome assemblies, including the first for the rarely isolated lineages VGIII and VGIV. By identifying syntenic regions across assemblies, we found 15 structural rearrangements, which were almost exclusive to the VGI-III-IV lineages. Using synteny to inform orthology prediction, we identified a core set of 87% of C. gattii genes present as single copies in all four lineages. Remarkably, 737 genes are variably inherited across lineages and are overrepresented for response to oxidative stress, mitochondrial import, and metal binding and transport. Specifically, VGI has an expanded set of iron-binding genes thought to be important to the virulence of Cryptococcus, while VGII has expansions in the stress-related heat shock proteins relative to the other lineages. We also characterized genes uniquely absent in each lineage, including a copper transporter absent from VGIV, which influences Cryptococcus survival during pulmonary infection and the onset of meningoencephalitis. Through inclusion of population-level data for an additional 37 isolates, we identified a new transcontinental clonal group that we name VGIIx, mitochondrial recombination between VGII and VGIII, and positive selection of multidrug transporters and the iron-sulfur protein aconitase along multiple branches of the phylogenetic tree. Our results suggest that gene expansion or contraction and positive selection have introduced substantial variation with links to mechanisms of pathogenicity across this species complex.
IMPORTANCE: The genetic differences between phenotypically different pathogens provide clues to the underlying mechanisms of those traits and can lead to new drug targets and improved treatments for those diseases. In this paper, we compare 16 genomes belonging to four highly differentiated lineages of Cryptococcus gattii, which cause pulmonary infections in otherwise healthy humans and other animals. Half of these lineages have not had their genomes previously assembled and annotated. We identified 15 ancestral rearrangements in the genome and over 700 genes that are unique to one or more lineages, many of which are associated with virulence. In addition, we found evidence for recent transcontinental spread, mitochondrial genetic exchange, and positive selection in multidrug transporters. Our results suggest that gene expansion/contraction and positive selection are diversifying the mechanisms of pathogenicity across this species complex.
Year: 2015 PMID: 26330512 PMCID: PMC4556806 DOI: 10.1128/mBio.00868-15
Source DB: PubMed Journal: mBio Impact factor: 7.867
FIG 1 Phylogeny, gene content, and synteny of 16 de novo assemblies of C. gattii. (Top) Phylogenetic tree inferred by using RAxML from single-copy (1:1) orthologs among four lineages of C. gattii and the outgroup C. neoformans. Numbers above tree branches indicate gene gain and loss events, and asterisks indicate 100% bootstrap support from 1,000 replicates. The central table details the origin and source of each isolate, as well as the number of contigs and total length (megabases) of each assembly. The bar chart shows the numbers of lineage-specific (LS) and multilineage-specific (MLS) genes, divergent 1:1 orthologs (unclustered by OrthoMCL but identified via synteny), paralogous clusters, and auxiliary (present in ≥1 isolate but not all isolates of the encompassing lineage) and unique genes. (Bottom) Visualization of the synteny (gray) and structural variants (red) between representatives for each lineage (VGI WM276, VGII R265, VGIII CA1280, and VGIV IND107). Genes are shown as small black boxes, while LS and MLS are shown above in red and green, respectively (corresponding to the bar chart). Scaffold numbers or letters are shown along with orientation (+/−).
Top 10 significantly enriched, nonambiguous Pfam domains (q value, <0.05) identified across each lineage(s)
| Lineage | Pfam accession no. | Pfam description | Corrected P value |
|---|---|---|---|
| VGI specific | PF02301.13 | HORMA (HORMA domain) | 3.62E−08 |
| | PF01794.14 | Ferric reduct (ferric reductase-like transmembrane component) | 1.27E−06 |
| | PF08022.7 | FAD binding 8 (FAD-binding domain) | 2.29E−06 |
| | PF03151.11 | TPT (triose-phosphate transporter family) | 3.85E−06 |
| | PF08030.7 | NAD binding 6 (ferric reductase NAD-binding domain) | 5.42E−06 |
| | PF00098.18 | zf-CCHC (zinc knuckle) | 1.02E−05 |
| | PF00628.24 | PhD (PhD-finger) | 3.92E−05 |
| | PF01408.17 | GFO IDH MocA (oxidoreductase family, NAD-binding Rossmann fold) | 1.37E−04 |
| | PF00005.22 | ABC tran (ABC transporter) | 3.02E−03 |
| | PF04982.8 | HPP (HPP family) | 7.59E−03 |
| VGI-VGIII specific | PF00098.18 | zf-CCHC (zinc knuckle) | 5.58E−13 |
| | PF00160.16 | Pro isomerase (cyclophilin-type peptidyl-prolyl | 9.37E−12 |
| VGI-VGIII lost | PF00070.22 | Pyr redox (pyridine nucleotide-disulfide oxidoreductase) | 8.55E−17 |
| | PF07110.6 | EthD (EthD domain) | 3.04E−12 |
| | PF07992.9 | Pyr redox 2 (pyridine nucleotide-disulfide oxidoreductase) | 1.17E−15 |
| VGIII specific | PF05970.9 | PIF1 (PIF1-like helicase) | 1.07E−02 |
| VGIII lost | PF03952.11 | Enolase N (enolase, N-terminal domain) | 4.68E−43 |
| | PF00113.17 | Enolase C (enolase, C-terminal TIM barrel domain) | 4.21E−42 |
| | PF01176.14 | eIF-1a (translation initiation factor 1A/IF-1) | 1.68E−35 |
| | PF07766.8 | LETM1 (LETM1-like protein) | 1.68E−35 |
| | PF02627.15 | CMD (carboxymuconolactone decarboxylase family) | 1.61E−33 |
| | PF00732.14 | GMC oxred N (GMC oxidoreductase) | 3.76E−28 |
| | PF05199.8 | GMC oxred C (GMC oxidoreductase) | 3.76E−28 |
| | PF07476.6 | MAAL C (methylaspartate ammonia-lyase C terminus) | 1.13E−27 |
| | PF00199.14 | Catalase (catalase) | 3.39E−25 |
| | PF06628.7 | Catalase-rel (catalase-related immune responsive) | 3.39E−25 |
| VGIV specific | PF13650.1 | Asp protease 2 (aspartyl protease) | 7.38E−03 |
| VGIV lost | PF07883.6 | Cupin 2 (Cupin domain) | 1.51E−44 |
| | PF01758.11 | SBF (sodium bile acid symporter family) | 1.08E−22 |
| | PF02678.11 | Pirin (Pirin) | 1.08E−22 |
| | PF00190.17 | Cupin 1 (Cupin) | 6.01E−22 |
| | PF04145.10 | Ctr (Ctr copper transporter family) | 6.01E−22 |
| | PF13344.1 | Hydrolase 6 (haloacid dehalogenase-like hydrolase) | 1.42E−20 |
| | PF13242.1 | Hydrolase-like (HAD-hydrolase-like) | 1.47E−18 |
| | PF00702.21 | Hydrolase (haloacid dehalogenase-like hydrolase) | 4.10E−15 |
| | PF00631.17 | G-gamma (GGL domain) | 5.09E−08 |
| VGII specific | PF04144.8 | SCAMP (SCAMP family) | 4.21E−16 |
| | PF13865.1 | FoP duplication (C-terminal duplication domain of Friend of PRMT1) | 4.21E−16 |
| | PF00722.16 | Glyco hydro 16 (glycosyl hydrolase family 16) | 2.32E−12 |
| | PF02567.11 | PhzC-PhzF (phenazine biosynthesis-like protein) | 2.32E−12 |
| | PF02893.15 | Gram (GRAM domain) | 2.32E−12 |
| | PF05071.11 | NDUFA12 (NADH ubiquinone oxidoreductase subunit NDUFA12) | 2.32E−12 |
| | PF00326.16 | Peptidase S9 (prolyl oligopeptidase family) | 9.61E−11 |
| | PF00657.17 | Lipase GDSL (GDSL-like lipase/acylhydrolase) | 9.61E−11 |
| | PF02441.14 | Flavoprotein (flavoprotein) | 9.61E−11 |
| | PF01619.13 | Pro dh (proline dehydrogenase) | 1.30E−10 |
| VGII lost | PF02170.17 | PAZ (PAZ domain) | 1.05E−24 |
| | PF02171.12 | Piwi (Piwi domain) | 1.05E−24 |
| | PF11790.3 | Glyco hydro ml (glycosyl hydrolase catalytic core) | 6.10E−19 |
| | PF01902.12 | ATP bind 4 (ATP-binding region) | 1.13E−16 |
| | PF00784.12 | MyTH4 (MyTH4 domain) | 6.24E−15 |
| | PF02897.10 | Peptidase S9 N (prolyl oligopeptidase, N-terminal beta-propeller domain) | 6.24E−15 |
| | PF08660.6 | Alg14 (oligosaccharide biosynthesis protein Alg14-like) | 6.24E−15 |
| | PF12862.2 | Apc5 (anaphase-promoting complex subunit 5) | 6.24E−15 |
| | PF00141.18 | Peroxidase (peroxidase) | 5.18E−11 |
| | PF01713.16 | Smr (smr domain) | 5.18E−11 |
Domains belong to genes with homology to essential genes in Saccharomyces cerevisiae, and similar nucleotide sequence was detected in the corresponding C. gattii genome using tBLASTn.
Abbreviations: FAD, flavin adenine dinucleotide; HAD, haloacid dehalogenase.
Corrected P values were calculated from the two-tailed Fisher exact test with q-value FDR.
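As a rough illustration of the enrichment statistics behind this table, the sketch below computes a two-tailed Fisher exact test for one Pfam domain and then adjusts a set of P values for multiple testing. The counts are hypothetical, the function names are illustrative, and the Benjamini-Hochberg adjustment is used here only as a simple stand-in for the q-value FDR method named in the footnote; it is not the authors' pipeline.

```python
from math import comb


def fisher_exact_two_tailed(a, b, c, d):
    """Two-tailed Fisher exact P value for the 2x2 table [[a, b], [c, d]].

    a: genes in the lineage set carrying the domain
    b: genes in the lineage set without the domain
    c: remaining genes carrying the domain
    d: remaining genes without the domain
    """
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of observing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at least as unlikely as the observed
    # one; the small tolerance guards against floating-point ties.
    p_value = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )
    return min(1.0, p_value)


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical counts for three domains (not taken from the paper):
# (in set with domain, in set without, background with, background without)
tables = [(8, 2, 1, 5), (3, 47, 20, 930), (1, 49, 19, 931)]
raw_p = [fisher_exact_two_tailed(*t) for t in tables]
adjusted_p = benjamini_hochberg(raw_p)
significant = [adj < 0.05 for adj in adjusted_p]
```

A domain would be reported as enriched only when its adjusted value falls below the 0.05 threshold given in the table title.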
FIG 2 RAxML phylogeny of 53 nuclear genomes of C. gattii. All sites that were homozygous in all isolates and had an SNP in ≥1 isolate were used (1,432,518 sites, or 8.3% of the total length). Isolate names are colored according to geographic origin (blue, Pacific Northwest [PNW]; red, South America; green, Australia) and labeled. Isolates labeled USA are non-PNW. The asterisk indicates 100% bootstrap support using 1,000 replicates. The branch site model (BSM) of selection in Codeml was employed across 17 subclades highlighted in the table (to the right of the tree) to identify genes under selection across the internal branches within each subclade. Multiple testing correction was performed using both the Storey-Tibshirani and Benjamini-Hochberg methods (requiring q values of <0.05 for each). The number of genes identified as under positive selection is reported as the second column of the table. Last, enrichment Pfam domains from these genes compared with the remaining unselected genes were assessed using two-tailed Fisher’s exact test with q-value FDR, shown in the final column of the table.
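The site-selection rule in this caption (homozygous in all isolates, with an SNP in at least one) can be sketched as a simple filter. The data layout, the function name, and the choice to skip sites with missing calls are assumptions made for illustration; the caption does not describe how the calls were stored or how missing data were handled.

```python
def select_phylogeny_sites(reference, calls):
    """Return positions that are homozygous in every isolate and variant in >= 1.

    reference: dict mapping position -> reference base
    calls: dict mapping isolate name -> dict of position -> (allele1, allele2)
    """
    kept = []
    for position, ref_base in sorted(reference.items()):
        genotypes = [calls[isolate].get(position) for isolate in calls]
        # Assumption: a site with a missing call in any isolate is skipped.
        if any(genotype is None for genotype in genotypes):
            continue
        # Require a homozygous call in every isolate.
        if any(genotype[0] != genotype[1] for genotype in genotypes):
            continue
        # Require at least one isolate to differ from the reference base.
        if any(genotype[0] != ref_base for genotype in genotypes):
            kept.append(position)
    return kept


# Hypothetical toy data (isolate names and calls are invented).
reference = {1: "A", 2: "C", 3: "G", 4: "T"}
calls = {
    "iso1": {1: ("A", "A"), 2: ("T", "T"), 3: ("G", "A"), 4: ("T", "T")},
    "iso2": {1: ("A", "A"), 2: ("C", "C"), 3: ("G", "G")},
}
selected = select_phylogeny_sites(reference, calls)
```

In this toy input only position 2 passes: position 1 is invariant, position 3 is heterozygous in one isolate, and position 4 lacks a call in one isolate.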
FIG 3 Topological discordance between the nuclear (left) and mitochondrial (right) genomes of C. gattii. Orange, VGI; red, VGII; green, VGIII; yellow, VGIV. (A) RAxML trees for the nuclear and mitochondrial genomes (boldface branches, 100% bootstrap support using 1,000 replicates). (B) Principal component analysis for the nuclear and mitochondrial genomes. (C) SNPs across the mitochondrial genomes of 7 isolates, including representatives from each lineage. Differences from the nuclear tree were most visible in four isolates: progeny 5, E566, Ru294, and EJB2. SNPs are colored according to the lineage to which they are unique. VGII-auxiliary (VGI-VGII, VGII-VGIII, VGII-VGIV, VGI-VGII-VGIII, VGI-VGII-VGIV, VGII-VGIII-VGIV, and VGI-VGII-VGIII-VGIV) is also colored red, while VGIV-auxiliary (VGI-VGIV, VGIII-VGIV, and VGI-VGIII-VGIV) is colored yellow. N, NADH ubiquinone oxidoreductase; ND, NADH dehydrogenase; A, ATP synthase; CCO, cytochrome c oxidase; Cb, cytochrome b; SRP, small ribosomal protein; c, chain; s, subunit.