Matthew J Dunn, Matthew Z Anderson.
Abstract
Genome instability often leads to cell death but can also give rise to innovative genotypic and phenotypic variation through mutation and structural rearrangements. Repetitive sequences and chromatin architecture in particular are critical modulators of recombination and mutability. In Candida albicans, four major classes of repeats exist in the genome: telomeres, subtelomeres, the major repeat sequence (MRS), and the ribosomal DNA (rDNA) locus. Characterization of these loci has revealed how their structure contributes to recombination and either promotes or restricts sequence evolution. The mechanisms of recombination that give rise to genome instability are known for some of these regions, whereas others are generally unexplored. More recent work has revealed additional repetitive elements, including expanded gene families and centromeric repeats that facilitate recombination and genetic innovation. Together, the repeats facilitate C. albicans evolution through construction of novel genotypes that underlie C. albicans adaptive potential and promote persistence across its human host.
Keywords: genome stability, telomere, subtelomere, gene family expansion, LTR, MRS, Candida albicans
Year: 2019 PMID: 31671659 PMCID: PMC6896093 DOI: 10.3390/genes10110866
Source DB: PubMed Journal: Genes (Basel) ISSN: 2073-4425 Impact factor: 4.096
Figure 1. Major classes of genetic repeats in Candida albicans. C. albicans contains four major categories of repeat sequences: the telomeres that contain multiple copies of a 23 bp repeat; the Major Repeat Sequence (MRS) composed of repetitive sequence (RPS) repeats; the rDNA locus, which encodes the polycistronic rRNA transcripts; and the subtelomeres, a telomere-proximal region containing transposable elements and a gene family expansion. Centromeres are indicated by grey circles.
Repeat composition summary in C. albicans.
| Repeat Element | Repeat Composition | Size | Repeat Copy Number (Haploid Genome) | Mechanisms of Instability |
|---|---|---|---|---|
| Telomeres | 5’-ACTTCTTGGTGTACGGATGTCTA-3’ | 500 bp–5 kb | >20 repeats | Recombination, t-circles |
| Subtelomeres | Expanded gene families, LTRs, non-LTR retrotransposons | 15 kb proximal to the telomeric repeats | 14 | Recombination, LTR insertions, LOH, copy number variations |
| Centromeres | Unique regions without common sequence motifs, inverted repeats | Core 3 kb regions | Eight loci | Interchromosomal recombination, isochromosome formation |
| MRS | RPS repeat units often flanked by non-repetitive RB2 and HOK elements | 50 kb on average | Nine complete MRS loci, 14 RB2 elements, and two HOK elements | Chromosome translocations, intrachromosomal recombination |
| rDNA | 18S, 5.8S, 25S, and 5S rRNAs organized as tandem repeating units | 11.6–12.5 kb per repeating unit | 21 to 176 copies of the repeating unit | Intrachromosomal recombination |
| | Tandemly repeated Ser/Thr-rich domain | ~3–6 kb | Eight genes | Intragenic recombination, intergenic recombination |
| | 20 bp or longer repeats present more than once in the genome | 65–6499 bp | 1974 long repeats (2.87% of the haploid reference genome) | Recombination, LOH, chromosomal inversions, copy number variations |
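
The size and copy-number columns above can be cross-checked with simple arithmetic. The following is a minimal sketch, not part of the original review: it assumes a telomeric tract made up only of perfect tandem copies of the 23 bp unit, and it treats the rDNA extremes (fewest copies of the shortest unit, most copies of the longest unit) as rough bounds on array size. The function names are hypothetical.

```python
# 23 bp telomeric repeat unit, copied from the table above.
TELOMERE_UNIT = "ACTTCTTGGTGTACGGATGTCTA"


def whole_copies(tract_bp: int, unit_bp: int) -> int:
    """Number of complete repeat units that fit in a tract of tract_bp.

    Assumption: the tract consists only of perfect tandem copies.
    """
    return tract_bp // unit_bp


def array_size_kb(copies: int, unit_kb: float) -> float:
    """Total size of a tandem array, in kb, for a given copy number."""
    return copies * unit_kb


unit_bp = len(TELOMERE_UNIT)

# Telomere tract sizes from the table: 500 bp to 5 kb.
telomere_min_copies = whole_copies(500, unit_bp)
telomere_max_copies = whole_copies(5000, unit_bp)

# rDNA values from the table: 21 to 176 copies of an 11.6 to 12.5 kb unit.
rdna_min_kb = array_size_kb(21, 11.6)
rdna_max_kb = array_size_kb(176, 12.5)

print(f"telomere unit length: {unit_bp} bp")
print(f"telomere copies: {telomere_min_copies} to {telomere_max_copies}")
print(f"rDNA array size: {rdna_min_kb:.1f} to {rdna_max_kb:.1f} kb")
```

Under these assumptions, the shortest listed telomere tract (500 bp) holds 21 complete copies of the unit, which agrees with the ">20 repeats" entry, and the rDNA array spans roughly 0.24 to 2.2 Mb.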
C. albicans characterized subtelomeric genes.
| Gene Name | Description 1 | Genomic Location |
|---|---|---|
| | Member of a family of telomere-proximal genes of unknown function; hypha-induced expression; rat catheter biofilm repressed | Ca22chrRA_C_albicans_SC5314:9111 to 9863 |
| | Inositol-1-phosphate synthase; antigenic in human; repressed by farnesol in biofilm or by caspofungin; upstream inositol/choline regulatory element; glycosylation predicted; rat catheter, flow model induced; Spider biofilm repressed | Ca22chrRA_C_albicans_SC5314:2157044 to 2155482 |
| | Putative transcription factor/activator; Med2 mediator complex domain; transcript is upregulated in an RHE model of oral candidiasis; member of a family of telomere-proximal genes; Efg1, Hap43-repressed | Ca22chrRA_C_albicans_SC5314:2286198 to 2285377 |
| | Subunit of the mitochondrial F1F0 ATP synthase; sumoylation target; protein newly produced during adaptation to the serum; Spider biofilm repressed | Ca22chrRA_C_albicans_SC5314:2285228 to 2284743 |
| | D-xylulose reductase; immunogenic in mice; soluble protein in hyphae; induced by caspofungin, fluconazole, Hog1 and during cell wall regeneration; Mnl1-induced in weak acid stress; stationary phase enriched; flow model biofilm induced | Ca22chrRA_C_albicans_SC5314:2284454 to 2283372 |
| | Alpha-glucosidase; hydrolyzes sucrose for sucrose utilization; transcript regulated by Suc1, induced by maltose, repressed by glucose; Tn mutation affects filamentous growth; upregulated in RHE model; rat catheter and Spider biofilm induced | Ca22chrRA_C_albicans_SC5314:2276745 to 2278457 |
| | Protein similar to S. cerevisiae Iah1p, which is involved in acetate metabolism; mutation confers hypersensitivity to tunicamycin; transposon mutation affects filamentous growth | Ca22chrRA_C_albicans_SC5314:2276572 to 2275766 |
| | Putative component of the monopolin complex with role in rDNA silencing, homologous chromosome segregation, protein localization to nucleolar rDNA repeats | Ca22chrRA_C_albicans_SC5314:2272097 to 2272774 |
| | Putative transcription factor; Med2 mediator domain; activates transcription in 1-hybrid assay in S. cerevisiae; repressed by Efg1; member of a family of telomere-proximal genes; Tbf1-induced | Ca22chr1A_C_albicans_SC5314:10718 to 11485 |
| | Transcriptional corepressor; represses filamentous growth; regulates switching; role in germ tube induction, farnesol response; in repression pathways with Nrg1, Rfg1; farnesol upregulated in biofilm; rat catheter, Spider biofilm repressed | Ca22chr1A_C_albicans_SC5314:12163 to 13701 |
| | Mevalonate diphosphate decarboxylase; functional homolog of S. cerevisiae Erg19; possible drug target; regulated by carbon source, yeast-hypha switch, growth phase, antifungals; gene has intron; rat catheter, Spider biofilm repressed | Ca22chr1A_C_albicans_SC5314:13778 to 14917 |
| | Member of a family of telomere-proximal genes of unknown function; transcript induced in an RHE model of oral candidiasis; Hap43-repressed | Ca22chr1A_C_albicans_SC5314:3186663 to 3186463 |
| | Protein encoded in retrotransposon Zorro2 with similarity to retroviral endonuclease-reverse transcriptase proteins; lacks an ortholog in S. cerevisiae; transposon mutation affects filamentous growth | Ca22chr1A_C_albicans_SC5314:3185097 to 3184381 |
| | Predicted component of the sorting and assembly machinery (SAM complex) of the mitochondrial outer membrane, involved in protein import into mitochondria | Ca22chr1A_C_albicans_SC5314:3177333 to 3176578 |
| | Putative Rab GDP-dissociation inhibitor; GlcNAc-induced protein; Spider biofilm repressed | Ca22chr1A_C_albicans_SC5314:3172988 to 3171639 |
| | Member of a family of telomere-proximal genes of unknown function; may be spliced in vivo | Ca22chr2A_C_albicans_SC5314:4248 to 4778 |
| | Protein with a predicted role in recruitment of RNA polymerase I to rDNA; caspofungin induced; flucytosine repressed; repressed in core stress response; repressed by prostaglandins | Ca22chr2A_C_albicans_SC5314:5665 to 7335 |
| | Putative SSU processome and 90S preribosome component; repressed in core stress response; repressed by prostaglandins | Ca22chr2A_C_albicans_SC5314:11063 to 9360 |
| | Putative alpha-1,6-mannanase; induced by mating factor in MTLa/MTLa opaque cells | Ca22chr2A_C_albicans_SC5314:12443 to 11100 |
| | GPI-anchored cell surface protein of unknown function; Hap43p-repressed gene; fluconazole-induced; possibly an essential gene, disruptants not obtained by UAU1 method | Ca22chr2A_C_albicans_SC5314:14415 to 13261 |
| | Putative DNA-dependent ATPase with a predicted role in DNA recombination and repair; transcriptionally induced by interaction with macrophages | Ca22chr2A_C_albicans_SC5314:2221521 to 2223911 |
| | Member of a family of telomere-proximal genes of unknown function; may be spliced in vivo; rat catheter biofilm repressed | Ca22chr3A_C_albicans_SC5314:13756 to 14265 |
| | Member of a family of telomere-proximal genes of unknown function; may be spliced in vivo | Ca22chr3A_C_albicans_SC5314:1788114 to 1787605 |
| | Essential GDP-mannose pyrophosphorylase; makes GDP-mannose for protein glycosylation; functional in S. cerevisiae psa1; on yeast-form, not hyphal cell surface; alkaline induced; induced on adherence to polystyrene; Spider biofilm repressed | Ca22chr3A_C_albicans_SC5314:1786177 to 1785089 |
| | Member of a family of telomere-proximal genes of unknown function; Hap43p-repressed gene | Ca22chr4A_C_albicans_SC5314:983 to 1660 |
| | Na+/H+ antiporter; required for wild-type growth, cell morphology, and virulence in a mouse model of systemic infection; not transcriptionally regulated by NaCl; fungal-specific (no human or murine homolog) | Ca22chr4A_C_albicans_SC5314:5033 to 7435 |
| | Putative beta-1,3-glucanosyltransferase with similarity to the A. fumigatus GEL family; fungal-specific (no human or murine homolog); possibly an essential gene, disruptants not obtained by UAU1 method | Ca22chr4A_C_albicans_SC5314:12793 to 14309 |
| | Member of a family of telomere-proximal genes of unknown function | Ca22chr4A_C_albicans_SC5314:1597812 to 1597159 |
| | Subunit of the NuA4 histone acetyltransferase complex; soluble protein in hyphae; Spider biofilm repressed | Ca22chr4A_C_albicans_SC5314:1596377 to 1594317 |
| | Member of a family of telomere-proximal genes of unknown function; may be spliced in vivo | Ca22chr5A_C_albicans_SC5314:1918 to 2427 |
| | Septin; cell and hyphal morphology, agar-invasive growth, full virulence and kidney tissue invasion in mouse, but not kidney colonization, immunogenicity; hyphal and cell-cycle-regulated phosphorylation; rat catheter biofilm repressed | Ca22chr5A_C_albicans_SC5314:8946 to 10154 |
| | Putative threonyl-tRNA synthetase; transcript regulated by Mig1 and Tup1; repressed upon phagocytosis by murine macrophages; stationary phase enriched protein; Spider biofilm repressed | Ca22chr5A_C_albicans_SC5314:14674 to 12554 |
| | Putative transcription factor; positive regulator of gene expression; Efg1-repressed; member of a family of telomere-proximal genes; transcript upregulated in RHE model of oral candidiasis | Ca22chr5A_C_albicans_SC5314:1182868 to 1182110 |
| | Putative tRNA-His synthetase; downregulated upon phagocytosis by murine macrophage; stationary phase enriched protein; Spider biofilm repressed | Ca22chr5A_C_albicans_SC5314:1180902 to 1179397 |
| | Putative delta-4 sphingolipid desaturase; planktonic growth-induced gene | Ca22chr5A_C_albicans_SC5314:1178288 to 1179400 |
| | Telomerase subunit; allosteric activator of catalytic activity, but not required for catalytic activity; has TPR domain | Ca22chr5A_C_albicans_SC5314:1176356 to 1178194 |
| | Transcription factor; transposon mutation affects filamentous growth | Ca22chr6A_C_albicans_SC5314:5 to 346 |
| | Member of a family of telomere-proximal genes of unknown function; may be spliced in vivo; overlaps orf19.6337.1, which is a region annotated as blocked reading frame | Ca22chr6A_C_albicans_SC5314:5545 to 6069 |
| | Putative GPI-anchored adhesin-like protein; fluconazole-downregulated; induced in oropharyngeal candidiasis; Spider biofilm induced | Ca22chr6A_C_albicans_SC5314:9894 to 7276 |
| | Putative sphingolipid transfer protein; involved in localization of glucosylceramide which is important for virulence; Spider biofilm repressed | Ca22chr6A_C_albicans_SC5314:12543 to 11950 |
| | Putative transporter; fungal-specific; similar to Nag3p and to S. cerevisiae Ypr156Cp and Ygr138Cp; required for wild-type mouse virulence and wild-type cycloheximide resistance; gene cluster encodes enzymes of GlcNAc catabolism | Ca22chr6A_C_albicans_SC5314:1026875 to 1025130 |
| | Putative MFS transporter; similar to Nag4; required for wild-type mouse virulence and cycloheximide resistance; in gene cluster that includes genes encoding enzymes of GlcNAc catabolism; Spider biofilm repressed | Ca22chr6A_C_albicans_SC5314:1024922 to 1023237 |
| | N-acetylglucosamine-6-phosphate (GlcNAcP) deacetylase; N-acetylglucosamine utilization; required for wild-type hyphal growth and virulence in mouse systemic infection; gene and protein are GlcNAc-induced; Spider biofilm induced | Ca22chr6A_C_albicans_SC5314:1021987 to 1023228 |
| | Glucosamine-6-phosphate deaminase; required for normal hyphal growth and mouse virulence; converts glucosamine 6-P to fructose 6-P; reversible reaction in vitro; gene and protein are GlcNAc-induced; Spider biofilm induced | Ca22chr6A_C_albicans_SC5314:1021746 to 1021000 |
| | N-acetylglucosamine (GlcNAc) kinase; involved in GlcNAc utilization; required for wild-type hyphal growth and mouse virulence; GlcNAc-induced transcript; induced by alpha pheromone in SpiderM medium | Ca22chr6A_C_albicans_SC5314:1019276 to 1020757 |
| | Protein required for endocytosis; contains a BAR domain, which is found in proteins involved in membrane curvature; null mutant exhibits defects in hyphal growth, virulence, cell wall integrity, and actin patch localization | Ca22chr7A_C_albicans_SC5314:2040 to 1246 |
| | Ortholog of S. cerevisiae Rad3; 5′ to 3′ DNA helicase, nucleotide excision repair and transcription, subunit of RNA polII initiation factor TFIIH and Nucleotide Excision Repair Factor 3 (NEF3) | Ca22chr7A_C_albicans_SC5314:5761 to 3464 |
| | Putative GTPase activating protein (GAP) for Rho1; repressed upon adherence to polystyrene; macrophage/pseudohyphal-repressed; transcript is upregulated in RHE model of oral candidiasis and in clinical oral candidiasis | Ca22chr7A_C_albicans_SC5314:9430 to 7583 |
| | Surface antigen on elongating hyphae and buds; strain variation in repeat number; ciclopirox, filament induced, alkaline induced by Rim101; Efg1-, Cph1, Hap43-regulated; required for WT RPMI biofilm formation; Bcr1-induced in a/a biofilms | Ca22chr7A_C_albicans_SC5314:13080 to 10024 |
| | Putative ferric reductase; alkaline induced by Rim101; fluconazole-downregulated; upregulated in the presence of human neutrophils; possibly adherence-induced; regulated by Sef1, Sfu1, and Hap43 | Ca22chr7A_C_albicans_SC5314:14047 to 15825 |
| | Member of a family of telomere-proximal genes of unknown function; may be spliced in vivo | Ca22chr7A_C_albicans_SC5314:943352 to 942825 |
| | Putative Golgi integral membrane protein; transcript regulated by Mig1 | Ca22chr7A_C_albicans_SC5314:941499 to 940993 |
| | Putative transcription elongation factor; transposon mutation affects filamentous growth; transcript induced in an RHE model of oral candidiasis and in clinical isolates from oral candidiasis | Ca22chr7A_C_albicans_SC5314:934878 to 939083 |
1 As assigned in the Candida Genome Database [49].
Figure 2. Organization of subtelomeric repetitive sequences in C. albicans. The telomere-associated (TLO) genes and other subtelomeric repetitive sequences are marked on each C. albicans chromosome arm for the genome reference strain SC5314. The TLO open reading frame (ORF; cyan) is commonly flanked by two long terminal repeat (LTR) elements, indicated in black. The TLO recombination element (TRE; orange) overlaps with the TLO 3’ untranslated region (UTR) and extends towards the centromere to encompass the Bermuda Triangle Sequence (BTS; red). All repetitive sequences are oriented similarly on all chromosome arms. Both the TRE and BTS contribute to subtelomeric recombination. Telomeric repeats are denoted by “~~~”. Figure adapted from [66].
C. albicans subtelomeric LTRs and retrotransposons.
| ORF Name | Description 1 | Genomic Location |
|---|---|---|
| | Solo copy of the long terminal repeat (LTR) associated with the transposon Tca2; about 280 bp long, 5-10 copies per genome | Ca22chr1A_C_albicans_SC5314:3187028 to 3187306 |
| | Solo copy of the long terminal repeat (LTR) associated with the transposon Tca2; about 280 bp long, 5-10 copies per genome | Ca22chr1A_C_albicans_SC5314:3187383 to 3187662 |
| | Non-LTR retrotransposon, encodes a potential DNA-binding zinc-finger protein and a polyprotein similar to pol with conserved endonuclease and reverse transcriptase domains; member of L1 clade of transposons | Ca22chr1A_C_albicans_SC5314:3185887 to 3181663 |
| | Solo copy of the long terminal repeat (LTR) associated with the transposon Tca4; about 381 bp long, 1-4 copies per genome | Ca22chr1A_C_albicans_SC5314:3178575 to 3178194 |
| | Long terminal repeat (LTR); about 275 bp long, 13 copies per genome | Ca22chr2A_C_albicans_SC5314:4611 to 4885 |
| | Long terminal repeat (LTR); about 275 bp long, 13 copies per genome | Ca22chr3A_C_albicans_SC5314:14098 to 14372 |
| | Long terminal repeat (LTR); about 199 bp long, 19 copies per genome | Ca22chr3A_C_albicans_SC5314:14751 to 14553 |
| | Long terminal repeat (LTR); about 275 bp long, 13 copies per genome | Ca22chr3A_C_albicans_SC5314:1787772 to 1787498 |
| | Long terminal repeat (LTR); about 275 bp long, 13 copies per genome | Ca22chr5A_C_albicans_SC5314:2260 to 2534 |
| | Long terminal repeat (LTR); about 275 bp long, 13 copies per genome | Ca22chr6A_C_albicans_SC5314:5902 to 6176 |
| | Long terminal repeat (LTR) associated with the transposon Tca10; about 192 bp long, 11 copies per genome | Ca22chr6A_C_albicans_SC5314:1032389 to 1032200 |
| | Long terminal repeat (LTR) associated with the transposon Tca6; about 280 bp long, 10-15 copies per genome | Ca22chr6A_C_albicans_SC5314:1028505 to 1028227 |
| | Long terminal repeat (LTR); about 167 bp long, 9 copies per genome | Ca22chr6A_C_albicans_SC5314:1027905 to 1028066 |
| | Solo copy of the long terminal repeat (LTR) associated with the transposon Tca4; about 381 bp long, 1-4 copies per genome | Ca22chr7A_C_albicans_SC5314:942855 to 942475 |
1 As assigned in the Candida Genome Database [49].
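
The "Genomic Location" strings in the tables above follow a regular `chromosome:first to second` pattern, so feature lengths can be checked against the lengths quoted in the descriptions. The sketch below is illustrative only. It assumes the coordinates are 1-based and inclusive, and that an entry whose first coordinate exceeds its second lies on the reverse strand; neither convention is stated in the tables. `Location` and `parse_location` are hypothetical names.

```python
import re
from typing import NamedTuple


class Location(NamedTuple):
    chrom: str
    start: int   # smaller coordinate
    end: int     # larger coordinate
    strand: str  # "+" or "-", inferred from coordinate order (assumption)
    length: int  # end - start + 1, assuming inclusive coordinates


_LOCATION = re.compile(r"^(?P<chrom>\S+):(?P<first>\d+) to (?P<second>\d+)$")


def parse_location(text: str) -> Location:
    """Parse a string such as 'Ca22chr2A_C_albicans_SC5314:4611 to 4885'."""
    match = _LOCATION.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized location string: {text!r}")
    first = int(match["first"])
    second = int(match["second"])
    strand = "+" if first <= second else "-"
    start, end = min(first, second), max(first, second)
    return Location(match["chrom"], start, end, strand, end - start + 1)


# Three LTR entries copied from the table above.
ltr_chr2 = parse_location("Ca22chr2A_C_albicans_SC5314:4611 to 4885")
ltr_chr3 = parse_location("Ca22chr3A_C_albicans_SC5314:14751 to 14553")
ltr_tca2 = parse_location("Ca22chr1A_C_albicans_SC5314:3187028 to 3187306")

for loc in (ltr_chr2, ltr_chr3, ltr_tca2):
    print(loc.chrom, loc.start, loc.end, loc.strand, loc.length)
```

Under the inclusive-coordinate assumption, the computed lengths (275, 199, and 279 bp) match the "about 275 bp", "about 199 bp", and "about 280 bp" values given in the corresponding descriptions.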
Figure 3. Rates of loss of heterozygosity (LOH) increase towards chromosome ends. An average LOH rate for each genomic region is depicted for chromosome internal sequences (black), the subtelomeres (yellow), and telomeres (orange) based on published (chromosome internal and subtelomeric) [28,56] and unpublished results (telomeric). All data are derived from LOH assays in which a URA3 marker was inserted within different genomic regions and its location determined by either sequencing or contour-clamped homogeneous electric field (CHEF) gel analysis and Southern blotting. Centromeres are indicated by grey circles.
Figure 4. MRS elements in the C. albicans genome. The nine complete MRS loci (blue), 14 RB2 elements (green) and two HOK elements (orange) have been placed on their relative chromosome positions in C. albicans strain SC5314.