Leszek P Pryszcz, Tibor Németh, Attila Gácser, Toni Gabaldón.
Abstract
Invasive candidiasis is the most commonly reported invasive fungal infection worldwide. Although Candida albicans remains the main cause, the incidence of emerging Candida species, such as C. parapsilosis, is increasing. It has been postulated that C. parapsilosis clinical isolates result from a recent global expansion of a virulent clone. However, the availability of a single genome for this species has so far prevented testing this hypothesis at genomic scales. We present here the sequence of three additional strains from clinical and environmental samples. Our analyses reveal unexpected patterns of genomic variation, shared among distant strains, that argue against the clonal expansion hypothesis. All strains carry independent expansions involving an arsenite transporter homolog, pointing to the existence of directional selection in the environment, and independent origins of the two clinical isolates. Furthermore, we report the first evidence for the existence of recombination in this species. Altogether, our results shed new light on the dynamics of genome evolution in C. parapsilosis.
Keywords: Candida; genome comparison; pathogens; recombination
Year: 2013 PMID: 24259314 PMCID: PMC3879973 DOI: 10.1093/gbe/evt185
Source DB: PubMed Journal: Genome Biol Evol ISSN: 1759-6653 Impact factor: 3.416
Basic Strain and Assembly Statistics for the Genomes Obtained within This Work and the Reference C. parapsilosis Strain
| Strain | Origin | Library | Number of Reads (millions) | Coverage (×) | Scaffolds | Assembly Size (kb) | Heterozygous Sites (%) | Genes |
|---|---|---|---|---|---|---|---|---|
| CBS1954 | Environmental, olive fruit, Italy | 76 bp; GAIIx; paired at 300 bp | 87.51 | 510 | 20 | 13,107 | 0.00637 | 5,696 |
| CBS6318 | Environmental, healthy skin, USA | 76 bp; GAIIx; paired at 300 bp | 42.45 | 248 | 24 | 13,138 | 0.00457 | 5,677 |
| GA1 | Clinical, human blood, Germany | 96 bp; HiSeq; paired at 600 bp | 36.51 | 269 | 27 | 13,017 | 0.00717 | 5,692 |
| CDC317 | Clinical, skin, USA | Sanger | 0.23 | 12 | 8 | 13,030 | 0.01576 | 5,836 |
Note.—For each strain, the table provides, in this order: strain name; geographical origin and sampling context; sequencing strategy, indicating read length, sequencing platform, and insert size (for paired reads); number of reads obtained; average fold coverage; number of scaffolds; total assembly size; fraction of heterozygous sites; number of predicted genes.
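The coverage column can be roughly cross-checked from the other columns. The sketch below assumes that fold coverage is the total number of sequenced bases (reads × read length) divided by genome size, and that the genome size used is the CDC317 reference assembly (13,030 kb). Neither assumption is stated in the record; they are inferred because they reproduce the tabulated values for the three Illumina libraries. The Sanger library is omitted because no read length is listed for it. All function and variable names are illustrative.

```python
# Reads (millions) and read length (bp) as listed in the table above.
LIBRARIES = {
    "CBS1954": (87.51, 76),
    "CBS6318": (42.45, 76),
    "GA1": (36.51, 96),
}

# Assumption: coverage is normalised by the CDC317 reference
# assembly size (13,030 kb = 13.030 Mb), not by each strain's own assembly.
REFERENCE_SIZE_MB = 13.030


def fold_coverage(reads_millions, read_length_bp, genome_size_mb=REFERENCE_SIZE_MB):
    """Average fold coverage = total sequenced bases / genome size."""
    total_megabases = reads_millions * read_length_bp
    return total_megabases / genome_size_mb


def coverage_table():
    """Return the rounded fold coverage for each Illumina library."""
    return {
        strain: round(fold_coverage(reads, length))
        for strain, (reads, length) in LIBRARIES.items()
    }


if __name__ == "__main__":
    for strain, cov in coverage_table().items():
        print(f"{strain}: ~{cov}x")
```

Under these assumptions the rounded results match the table (510, 248, and 269).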
Figure: (A) Number of orthologous groups (that is, orthologous genes together with their recently duplicated paralogs) among the three newly sequenced C. parapsilosis strains; the number of orthologous groups that are single copy across all strains is given in brackets. (B) Shared SNPs using CDC317 as a reference for the three newly sequenced strains. (C) SNP-based phylogeny of four C. parapsilosis strains, using the closely related C. orthopsilosis as an outgroup.
Figure: Maximum likelihood phylogenetic tree representing the evolutionary relationships among ALS family members in the four C. parapsilosis strains (marked in red/brown) and in other related Candida species (black). For simplicity, some clades of intraspecific paralogs have been collapsed, with the number of sequences involved indicated. The complete tree, including labeled nodes and sequence names, is available in newick format as supplementary file S5 (Supplementary Material online). The naming of ALS clades in C. parapsilosis corresponds to CDC317 genes.
Figure: Schematic view of the region surrounding a CBS1954-specific deletion of 6.8 kb (DEL 31). The density of mapped reads along the coordinates of the reference strain (CDC317) is indicated. A 6.8 kb region with no mapped reads is identified as a deletion present in CBS1954 only. This deletion is predicted to have produced an in-frame fusion of two genes, CPAR2_600430 and CPAR2_600440.
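The idea behind this caption, a stretch of the reference with no mapped reads read as a deletion, can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name, the per-base depth list, and the `min_length` threshold are hypothetical, and a real analysis would also need to handle low-mappability regions and paired-read evidence.

```python
def zero_coverage_runs(depth, min_length=1):
    """Return half-open (start, end) intervals where depth is 0
    for at least `min_length` consecutive positions."""
    runs = []
    start = None
    for pos, d in enumerate(depth):
        if d == 0:
            if start is None:
                start = pos  # a zero-depth run begins here
        elif start is not None:
            if pos - start >= min_length:
                runs.append((start, pos))
            start = None
    # Close a run that extends to the end of the sequence.
    if start is not None and len(depth) - start >= min_length:
        runs.append((start, len(depth)))
    return runs


if __name__ == "__main__":
    # Toy per-base depth profile (hypothetical values).
    toy_depth = [30] * 10 + [0] * 5 + [28] * 10 + [0] * 2
    print(zero_coverage_runs(toy_depth, min_length=3))
```

Raising `min_length` filters out short gaps that are more likely to reflect mapping noise than a true deletion.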
Figure: CNVs involving a homolog of ARR3 in the four C. parapsilosis strains examined. Data for the reference strain CDC317 were obtained by remapping the raw reads of that genome project. Excess coverage in a discrete genomic region indicates the presence of a higher number of copies. Although the region coding for ARR3 is included in the CNVs detected in all strains, the distinct boundaries of the duplicated blocks clearly indicate that they originate from independent duplication events. The inferred number of copies, based on depth of coverage, is indicated.
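As a rough illustration of how copy number can be inferred from depth of coverage, and of why differing block boundaries point to independent events, consider the sketch below. It assumes that the copy number is the ratio of the median depth inside the block to the median depth of a single-copy background. The source does not describe the exact procedure, and all names, depth values, and coordinates here are hypothetical.

```python
from statistics import median


def estimate_copies(region_depth, background_depth, baseline_copies=1):
    """Estimate copy number as the depth ratio between a region and a
    background assumed to be present in `baseline_copies` copies."""
    ratio = median(region_depth) / median(background_depth)
    return round(ratio * baseline_copies)


def share_boundaries(block_a, block_b, tolerance=0):
    """True if two (start, end) blocks have the same boundaries within
    `tolerance` bases; differing boundaries suggest separate events."""
    return (abs(block_a[0] - block_b[0]) <= tolerance
            and abs(block_a[1] - block_b[1]) <= tolerance)


if __name__ == "__main__":
    # Hypothetical depths: about 3x the background inside the block.
    print(estimate_copies([150, 160, 155], [50, 52, 48, 51]))
    # Hypothetical block coordinates in two strains.
    print(share_boundaries((1000, 5000), (1200, 6400), tolerance=50))
```

The median is used instead of the mean so that a few extreme positions do not dominate the estimate.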