Theo N Kirkland, Anna Muszewska, Jason E Stajich.
Abstract
Coccidioides immitis and C. posadasii are primary pathogenic fungi that cause disease in immunologically normal animals and people. The organisms are found exclusively in arid regions of the southwestern United States, Mexico, and South America. This study is a detailed analysis of the transposable elements (TEs) in Coccidioides spp. As is common in most fungi, Class I and Class II transposons were identified, and the LTR Gypsy superfamily was the most common. A minority of Coccidioides Gypsy transposons contained regions highly homologous to polyprotein domains. Phylogenetic analysis of the integrase and reverse transcriptase sequences revealed that many, but not all, of the Gypsy reverse transcriptase and integrase domains clustered by species, suggesting extensive transposition after speciation of the two Coccidioides spp. The TEs were clustered, and their distribution was enriched at the ends of contigs. Analysis of gene expression data from C. immitis found that protein-coding genes within 1 kb of hAT or Gypsy TEs were poorly expressed. The expression of C. posadasii genes within 1 kb of Gypsy TEs was also significantly lower than that of all genes, but the difference in expression was smaller than in C. immitis. C. posadasii orthologs of C. immitis Gypsy-associated genes were also likely to be TE-associated. In both C. immitis and C. posadasii, the TEs were preferentially associated with genes annotated with protein kinase gene ontology terms. These observations suggest that TEs may play a role in influencing gene expression in Coccidioides spp. Our hope is that these bioinformatic studies of the potential TE influence on expression and evolution of Coccidioides will prompt the development of testable hypotheses to better understand the role of TEs in the biology and gene regulation of Coccidioides spp.
Keywords: Coccidioides spp.; fungus; genomics; transcriptome; transposable elements
Year: 2018 PMID: 29371508 PMCID: PMC5872316 DOI: 10.3390/jof4010013
Source DB: PubMed Journal: J Fungi (Basel) ISSN: 2309-608X
Predicted TEs in Coccidioides spp.
| Type | Number | Mean Length | ORFs > 300 nt (%) | GC Content (%) | Number | Mean Length | ORFs > 300 nt (%) | GC Content (%) |
|---|---|---|---|---|---|---|---|---|
| DNA/ | 286 | 884 | 113 (40) | 37 | 575 | 991 | 456 (79) | 36 |
| -Ant1 | 63 | | | | 31 | | | |
| -Fot1 | 134 | | | | 430 | | | |
| -Pogo | 27 | | | | 29 | | | |
| -Tc1 | 66 | | | | 83 | | | |
| DNA/ | 100 | 2416 | 80 (80) | 36 | 37 | 1232 | 15 (40) | 31 |
| LTR/ | 1204 | 2089 | 1557 (130) | 33 | 1199 | 2046 | 1387 (116) | 34 |
| LTR/ | 287 | 1351 | 151 (52) | 38 | 190 | 937 | 131 (69) | 39 |
| LINE | 364 | 1331 | 239 (66) | 28 | 225 | 1237 | 144 (64) | 31 |
Polyprotein domains in Gypsy TEs.
| | Total | Pol Domain | None | Total | Pol Domain | None |
|---|---|---|---|---|---|---|
| Number of TEs | 1204 | 338 (28%) | 866 (72%) | 1199 | 260 (22%) | 938 (78%) |
| Associated loci | 571 | 60 (11%) | 511 (89%) | 341 | 24 (7%) | 317 (91%) |
Figure 1. Phylogenetic relationship between RT and INT domains in C. immitis and C. posadasii. The black lines represent C. immitis Gypsy RT and INT domains; the orange lines represent C. posadasii Gypsy RT and INT domains. The size of the black dot represents the bootstrap value.
Figure 2. The genomic distribution of protein-coding loci (blue track) and the TEs (red track) in C. immitis (a) and C. posadasii (b) assemblies. These data show an inverse relationship between predicted genes and TEs and the tendency of TEs to cluster at the ends of contigs, reflecting both assembly difficulties and potential preferential accumulation regions. The C. immitis genome is mapped on six contigs; C. posadasii on 20.
Figure 3. Histogram showing the frequency of TEs in the two largest C. immitis contigs. The number of C. immitis TEs is plotted on the Y axis and the position on contigs 1 and 2 is plotted on the X axis. A bin width of 5% of the contig length was used.
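The binning behind a histogram like Figure 3 can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `bin_te_positions`, the use of 0-based TE start coordinates, and the example positions are all assumptions made for the sketch. Only the 5% bin width comes from the caption.

```python
def bin_te_positions(te_starts, contig_length, bin_fraction=0.05):
    """Count TEs per bin, where each bin spans bin_fraction of the contig.

    te_starts: 0-based start coordinates of TEs on one contig (assumed input).
    Returns a list of counts, one per bin, ordered along the contig.
    """
    n_bins = round(1 / bin_fraction)  # 5% bins give 20 bins per contig
    counts = [0] * n_bins
    for pos in te_starts:
        if not 0 <= pos < contig_length:
            raise ValueError(f"position {pos} lies outside the contig")
        # Integer arithmetic keeps bin edges exact for large coordinates.
        counts[pos * n_bins // contig_length] += 1
    return counts


# Hypothetical positions on a hypothetical 1000 bp contig.
example_counts = bin_te_positions([0, 10, 49, 50, 975, 999], contig_length=1000)
print(example_counts)
```

Expressing bins as a fraction of contig length, rather than a fixed number of bases, lets contigs of different sizes be compared on the same relative scale, which is what makes enrichment at contig ends visible in the first and last bins.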
Figure 4. Expression (median log2 FPKM) of C. immitis and C. posadasii loci within 1 kb of one or more TEs. The p-values for expression of all C. immitis groups flanked by at least one TE, compared to control expression levels (“None”), are less than 1 × 10⁻⁴. The p-value for C. posadasii genes flanked by at least two TEs is 7 × 10⁻³ compared to the control (“None”). The remaining C. posadasii gene expression values are not significantly lower than the control value.
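The grouping used in Figure 4 (genes classed by the number of TEs within 1 kb, then summarized by median log2 FPKM) can be sketched as below. This is a hedged illustration under stated assumptions: intervals are `(contig, start, end)` tuples with `start <= end`, distance is the gap between the nearest edges (zero when the intervals overlap), and all names and example values are hypothetical rather than taken from the study.

```python
from collections import defaultdict
from statistics import median


def count_nearby_tes(gene, tes, window=1000):
    """Count TEs on the same contig whose nearest edge is within `window` bp."""
    contig, g_start, g_end = gene
    n = 0
    for t_contig, t_start, t_end in tes:
        if t_contig != contig:
            continue
        # Gap between intervals; 0 when the TE overlaps the gene.
        gap = max(t_start - g_end, g_start - t_end, 0)
        if gap <= window:
            n += 1
    return n


def te_category(n):
    """Map a TE count to the labels used in the figure: None, 1, 2, 3, ≥4."""
    if n == 0:
        return "None"
    return "≥4" if n >= 4 else str(n)


def median_expression_by_category(genes, tes, log2_fpkm, window=1000):
    """genes: name -> interval; log2_fpkm: name -> expression value."""
    groups = defaultdict(list)
    for name, interval in genes.items():
        groups[te_category(count_nearby_tes(interval, tes, window))].append(log2_fpkm[name])
    return {cat: median(values) for cat, values in groups.items()}


# Hypothetical genes, TEs, and expression values for illustration only.
genes = {"g1": ("c1", 5000, 6000), "g2": ("c1", 20000, 21000), "g3": ("c1", 30000, 31000)}
tes = [("c1", 6500, 7000), ("c1", 1000, 3999), ("c1", 1000, 4000),
       ("c2", 5000, 5500), ("c1", 5500, 5600)]
print(median_expression_by_category(genes, tes, {"g1": 1.0, "g2": 4.0, "g3": 5.0}))
```

The nested loop is quadratic in the number of genes and TEs; for whole-genome annotations an interval index or a tool such as bedtools would be the more usual choice.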
C. posadasii orthologous protein-encoding loci associated with TEs.
| | Number of TEs | | | | |
|---|---|---|---|---|---|
| | ≥4 | 3 | 2 | 1 | None |
| 27 | 17 | 2 | 5 | 4 | 1 |
Figure 5. Mean expression values (log2 FPKM) of C. immitis and C. posadasii loci within 1 kb of TE superfamilies compared to all loci. The p-value for C. posadasii genes associated with Gypsy was 5 × 10⁻³; all others were not significant. The p-values for C. immitis genes associated with any TE superfamily were all ≤1 × 10⁻⁴; p-values for C. immitis genes associated with Gypsy or hAT TEs were ≤1 × 10⁻¹⁰.
Relationship between gene location relative to TEs and gene expression in C. immitis.
| | Control | Upstream a | | Overlap b | | Downstream c | |
|---|---|---|---|---|---|---|---|
| | Mean | Number | Mean | Number | Mean | Number | Mean |
| All genes | 4.032 | 992 | 773 | | | | |
| | | 115 | 3.734 | 131 | 3.759 | 115 | 3.734 |
| | | 75 | 1.134 | 73 | 1.135 | 77 | 1.292 |
| | | 119 | 3.242 | 69 | 3.343 | 137 | 4.229 |
| | | 272 | 3.027 | 217 | 2.001 | 312 | 2.156 |
a. Gene upstream of TE; b. TE overlaps gene; c. Gene downstream of TE.
GO enrichment analysis of genes associated with TEs.
| ID | Name | Odds Ratio | p-value | Odds Ratio | p-value |
|---|---|---|---|---|---|
| GO:0016310 | phosphorylation | 3.35 | 1.43 × 10⁻⁷ | 3.34 | 7.35 × 10⁻⁷ |
| GO:0006468 | protein phosphorylation | 3.47 | 1.56 × 10⁻⁷ | 3.60 | 2.10 × 10⁻⁶ |
| GO:0036211 | protein modification process | 2.72 | 3.87 × 10⁻⁶ | 2.48 | 2.90 × 10⁻⁴ |
| GO:0006464 | cellular protein modification process | 2.72 | 3.87 × 10⁻⁶ | 2.48 | 2.90 × 10⁻⁴ |
| GO:0006796 | phosphate-containing compound metabolic process | 2.45 | 1.17 × 10⁻⁵ | 2.07 | 4.41 × 10⁻³ |
| GO:0006793 | phosphorus metabolic process | 2.43 | 1.36 × 10⁻⁵ | 2.05 | 5.00 × 10⁻³ |
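For readers unfamiliar with how an odds ratio and an enrichment p-value of this kind are obtained, here is a minimal sketch. It assumes a 2 × 2 contingency table and a one-sided Fisher's exact test; the record does not state which test or software the authors used, so this is an illustration of the general approach only. The function names and the counts in the example are hypothetical.

```python
from math import comb


def odds_ratio(a, b, c, d):
    """Odds ratio for a 2x2 table.

    a: TE-associated genes with the GO term    b: TE-associated genes without it
    c: other genes with the GO term            d: other genes without it
    """
    if b * c == 0:
        return float("inf")
    return (a * d) / (b * c)


def fisher_exact_greater(a, b, c, d):
    """One-sided p-value: probability of seeing at least `a` annotated genes
    among the TE-associated set under the hypergeometric null."""
    total = a + b + c + d
    with_term = a + c      # genes annotated with the GO term
    te_genes = a + b       # genes associated with a TE
    upper = min(with_term, te_genes)
    numerator = sum(
        comb(with_term, k) * comb(total - with_term, te_genes - k)
        for k in range(a, upper + 1)
    )
    return numerator / comb(total, te_genes)


# Hypothetical counts, not taken from the study.
a, b, c, d = 10, 90, 30, 870
print(round(odds_ratio(a, b, c, d), 2), fisher_exact_greater(a, b, c, d))
```

An odds ratio above 1 with a small p-value indicates that the term is over-represented among TE-associated genes. When many GO terms are tested, as in an enrichment analysis, the p-values would normally also be corrected for multiple testing.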