Gabriela Aguileta, Damien M de Vienne, Oliver N Ross, Michael E Hood, Tatiana Giraud, Elsa Petit, Toni Gabaldón.
Abstract
From their origin as an early alpha proteobacterial endosymbiont to their current state as cellular organelles, large-scale genomic reorganization has taken place in the mitochondria of all main eukaryotic lineages. So far, most studies have focused on plant and animal mitochondrial (mt) genomes (mtDNA), but fungi provide new opportunities to study highly differentiated mtDNAs. Here, we analyzed 38 complete fungal mt genomes to investigate the evolution of mtDNA gene order among fungi. In particular, we looked for evidence of nonhomologous intrachromosomal recombination and investigated the dynamics of gene rearrangements. We investigated the effect that introns, intronic open reading frames (ORFs), and repeats may have on gene order. Additionally, we asked whether the distribution of transfer RNAs (tRNAs) evolves independently of that of mt protein-coding genes. We found that fungal mt genomes display remarkable variation between and within the major fungal phyla in terms of gene order, genome size, composition of intergenic regions, and presence of repeats, introns, and associated ORFs. Our results support previous evidence for the presence of mt recombination in all fungal phyla, a process conspicuously lacking in most Metazoa. Overall, the patterns of rearrangements may be explained by the combined influences of recombination (i.e., most likely nonhomologous and intrachromosomal), accumulated repeats, especially at intergenic regions, and to a lesser extent, mobile element dynamics.
Keywords: Basidiomycota; basal fungi; endosymbiosis; fungal phylogeny; genome size reduction; rearrangement rates; sordariomycetes
Year: 2014 PMID: 24504088 PMCID: PMC3942027 DOI: 10.1093/gbe/evu028
Source DB: PubMed Journal: Genome Biol Evol ISSN: 1759-6653 Impact factor: 3.416
List of the Species Analyzed, Accessions, and References
| Species | Taxonomy | GenBank Accession | Reference |
|---|---|---|---|
| | Ur | NC_001715 | |
| | E | NC_012830 | |
| | S | NC_017842 | |
| | S1 | NC_018046 | |
| | S2 | NC_004691 | |
| | S | NC_015893 | |
| | S | NC_013145 | |
| | B | NC_004336 | |
| | S1 | NC_010166 | |
| | S1 | NC_013147 | |
| | S | NC_017930 | |
| | S | NC_009493 | Herring et al. (unpublished) |
| | S2 | NC_006077 | |
| | S | NC_004514 | |
| | S | NC_008068 | |
| | D | NC_010222 | |
| | E | NC_012832 | |
| | S1 | NC_013255 | |
| | B | NC_005927 | |
| | B | NC_020353 | Lang (unpublished) |
| | S2 | NC_012621 | |
| | S1 | NC_014805 | |
| | B | NC_014344 | |
| | E | NC_007935 | |
| | D | NC_016955 | |
| | E | NC_005256 | |
| | D | NC_009746 | |
| | S1 | NC_015384 | |
| | B | NC_009905 | |
| | S | NC_001329 | |
| | Ur | NC_003053 | |
| | X | NC_004332 | |
| | B | NC_003049 | |
| | B | NC_010651 | Yi et al. (unpublished) |
| | B | NC_013933 | |
| | B | NC_008368 | Kennell and Bohmer (unpublished) |
| | S2 | NC_009638 | Scanell et al. (unpublished) |
| | X | NC_002659 | |
(a) Taxonomy: B, basidiomycetes; S1, saccharomycetes1; D, dothideomycetes; E, eurotiomycetes; S, sordariomycetes; Ur, early diverging or basal; X, other; S2, saccharomycetes2.
ML phylogeny of our sampled taxa including the 38 species in the dikarya data set. The gene tree was inferred from a concatenated alignment of 14 single-copy orthologous genes (atp6, atp8, atp9, nad1–nad6, nad4L, cob, and cox1–cox3). RAxML v.7.2.6 (Stamatakis 2006) was used assuming the LG substitution matrix and default parameters. On the right side of each taxon name is a series of colored boxes representing the mt gene order according to GenBank annotation. Bootstrap support appears next to each node. bsGOL values are shown next to each species name, and they are estimated by minimizing the following expression: $L = \sum_{i<j} \left( \sum_{k} b_{ijk} x_k - \mathrm{GOL}_{ij} \right)^2$, where $b_{ijk}$ is a Boolean variable that specifies the branches that are relevant for the estimation of a particular bsGOL (i.e., 0 if it is not relevant and 1 if it is), $x_k$ is obtained by minimizing $L$ and is the actual bsGOL value, and $\mathrm{GOL}_{ij}$ are the estimated values from the pairwise comparisons, in other words, $\mathrm{GOL}_{ij} = 1 - \mathrm{GOC}_{ij}$ (see Fischer et al. 2006 for more details). Significant tRNA clustering was found in species marked with a spiral. This figure was made using the ETE python environment for tree exploration (Huerta-Cepas et al. 2010).
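The least-squares criterion in the caption above can be illustrated with a minimal sketch. It assumes a toy three-taxon star tree in which every pairwise path consists of the two terminal branches, and the pairwise GOL values are invented for illustration; they are not values from this study. The function names `solve_linear` and `branch_specific_gol` are hypothetical, and any constraints used in the original method (for example, non-negative branch values) are not modelled here.

```python
def solve_linear(a, y):
    """Solve the square system a x = y by Gaussian elimination with partial pivoting."""
    n = len(y)
    m = [row[:] + [y[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def branch_specific_gol(b, gol):
    """Minimize L = sum over pairs of (sum_k b[pair][k] * x[k] - gol[pair])**2.

    The minimum of this quadratic is found from the normal equations
    (B^T B) x = B^T gol, which assumes B^T B is invertible.
    """
    k = len(b[0])
    btb = [[sum(row[i] * row[j] for row in b) for j in range(k)] for i in range(k)]
    btg = [sum(row[i] * g for row, g in zip(b, gol)) for i in range(k)]
    return solve_linear(btb, btg)


# Toy star tree with taxa A, B, C; columns are the branches leading to A, B, C.
# Row order: pair (A, B), pair (A, C), pair (B, C).
b_matrix = [
    [1, 1, 0],
    [1, 0, 1],
    [0, 1, 1],
]
pairwise_gol = [0.3, 0.4, 0.5]  # illustrative values, GOL = 1 - GOC

bs_gol = branch_specific_gol(b_matrix, pairwise_gol)
print([round(v, 6) for v in bs_gol])
```

With more taxa than this toy case, the system is overdetermined and the same normal equations return the best-fitting branch values rather than an exact solution.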
GOC between pairs of genomes of the dikarya data set as a function of their phylogenetic (patristic) distance. Distances were estimated using the branch lengths in figure 3, listed in table 2. Models are fitted by nonlinear regression. Model 0: $\mathrm{GOC} = 2/(1 + e^{\alpha t})$. Model 1: $\mathrm{GOC} = 1 - \sqrt{\alpha t}$. Model 2: $1/\mathrm{GOC} = \alpha t + 1$. Model 3: $\mathrm{GOC} = p^{t}$, where parameter α is adjusted by regression and t is the patristic distance between the two compared taxa.
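As a rough illustration of fitting one-parameter decay models to GOC versus patristic distance, the sketch below estimates α by minimizing the sum of squared errors with a golden-section search. Several things here are assumptions: Model 0 is read as GOC = 2/(1 + e^(αt)), the search interval for α is arbitrary, the data are synthetic (generated from Model 2 with α = 2), and Model 3 is left out. The regression software used in the study is not stated in the caption, so this is not a reproduction of the authors' procedure.

```python
import math

# One-parameter models of GOC as a function of patristic distance t.
MODELS = {
    "model0": lambda alpha, t: 2.0 / (1.0 + math.exp(alpha * t)),
    "model1": lambda alpha, t: 1.0 - math.sqrt(alpha * t),
    "model2": lambda alpha, t: 1.0 / (alpha * t + 1.0),
}


def golden_section_min(f, lo, hi, tol=1e-8):
    """Locate the minimum of a unimodal function f on the interval [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    c = hi - inv_phi * (hi - lo)
    d = lo + inv_phi * (hi - lo)
    while hi - lo > tol:
        if f(c) < f(d):
            hi = d
        else:
            lo = c
        c = hi - inv_phi * (hi - lo)
        d = lo + inv_phi * (hi - lo)
    return (lo + hi) / 2.0


def fit_alpha(model, t_values, goc_values, lo=0.0, hi=10.0):
    """Return (alpha, sse) for the alpha in [lo, hi] that minimizes squared error."""
    def sse(alpha):
        return sum((model(alpha, t) - g) ** 2 for t, g in zip(t_values, goc_values))
    alpha = golden_section_min(sse, lo, hi)
    return alpha, sse(alpha)


# Synthetic data only: distances and GOC generated from Model 2 with alpha = 2.
distances = [0.1, 0.2, 0.4, 0.8, 1.2]
goc_observed = [1.0 / (2.0 * t + 1.0) for t in distances]

fits = {name: fit_alpha(model, distances, goc_observed) for name, model in MODELS.items()}
best_model = min(fits, key=lambda name: fits[name][1])
for name, (alpha, sse) in fits.items():
    print(f"{name}: alpha={alpha:.4f}, SSE={sse:.6f}")
print("lowest SSE:", best_model)
```

Comparing models by raw SSE is reasonable here only because each model has the same number of free parameters.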
bsGOL, NPRS Rates, and Summary of the Number of Different Genomic Elements per Species
| Taxon | bsGOL (a) | Branch Lengths (b) | GOL Rate (c) | Normalized GOL Rate (d) | NPRS Rates (e) | tRNAs (f) | Intronic ORFs (g) | Introns (h) | Repeats (i) | Intergenic Repeats (j) | Genome Size (k) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| | 3.50E-01 | 0.7 | 1.05E+00 | 2.96E+00 | 0.002 | 20 | 1 | 2 | 128 | 9 | 24,874 |
| | 1.16E-01 | 0.22 | 3.36E-01 | 7.89E-01 | 0.001 | 30 | 11 | 13 | 1,648 | 272 | 109,103 |
| | 3.12E-01 | 0.25 | 5.62E-01 | 1.32E+00 | 0.0007 | 20 | 33 | 22 | 1,003 | 380 | 107,808 |
| | 2.38E-01 | 0.7 | 9.38E-01 | 2.11E+00 | 0.002 | 24 | 2 | 5 | 1,898 | 31 | 31,825 |
| | 1.69E-01 | 0.15 | 3.19E-01 | 6.38E-01 | 0.0009 | 24 | 9 | 12 | 645 | 3 | 73,242 |
| | 1.39E-01 | 0.32 | 4.59E-01 | 9.09E-01 | 0.002 | 27 | 0 | 0 | 1,718 | 61 | 49,704 |
| | 8.97E-02 | 0.2 | 2.90E-01 | 5.74E-01 | 0.0008 | 24 | 0 | 0 | 568 | 52 | 65,147 |
| | 3.00E-01 | 0.16 | 4.60E-01 | 9.11E-01 | 0.0006 | 25 | 26 | 26 | 1,196 | 143 | 91,500 |
| | 1.71E-01 | 0.14 | 3.11E-01 | 6.16E-01 | 0.0005 | 23 | 11 | 12 | 337 | 37 | 56,814 |
| | 2.37E-01 | 0.24 | 4.77E-01 | 1.22E+00 | 0.003 | 30 | 0 | 2 | 279 | 40 | 40,420 |
| | 1.64E-01 | 0.19 | 3.54E-01 | 8.64E-01 | 0.004 | 25 | 4 | 4 | 389 | 20 | 29,462 |
| | 2.15E-01 | 0.26 | 4.75E-01 | 1.16E+00 | 0.003 | 25 | 3 | 5 | 3,968 | 939 | 76,453 |
| | 2.16E-01 | 0.2 | 4.16E-01 | 9.83E-01 | 0.004 | 25 | 11 | 14 | 710 | 31 | 39,107 |
| | 1.41E-01 | 0.4 | 5.41E-01 | 1.54E+00 | 0.004 | 26 | 6 | 6 | 944 | 46 | 41,719 |
| | 1.29E-01 | 0.21 | 3.39E-01 | 8.76E-01 | 0.002 | 25 | 3 | 6 | 1,080 | 21 | 35,683 |
| | 1.97E-01 | 0.16 | 3.57E-01 | 8.43E-01 | 0.0007 | 27 | 0 | 0 | 218 | 66 | 43,964 |
| | 9.28E-02 | 0.2 | 2.93E-01 | 8.35E-01 | 0.001 | 26 | 17 | 20 | 468 | 31 | 62,785 |
| | 2.03E-01 | 0.18 | 3.83E-01 | 1.09E+00 | 0.0009 | 27 | 2 | 4 | 667 | 36 | 49,761 |
| | 7.66E-02 | 0.02 | 9.66E-02 | 2.48E-01 | 0.0007 | 25 | 1 | 1 | 325 | 14 | 24,105 |
| | 7.66E-02 | 0.02 | 9.66E-02 | 2.38E-01 | 0.0007 | 25 | 1 | 1 | 344 | 10 | 23,943 |
| | 1.54E-01 | 0.24 | 3.94E-01 | 9.26E-01 | 0.001 | 25 | 3 | 13 | 552 | 175 | 71,335 |
| | 2.38E-01 | 0.16 | 3.98E-01 | 9.19E-01 | 0.0007 | 28 | 9 | 11 | 303 | 16 | 35,438 |
| | 2.90E-02 | 0.002 | 3.10E-02 | 7.38E-02 | 0.0002 | 25 | 5 | 5 | 284 | 10 | 29,961 |
| | 1.45E-01 | 0.1 | 2.45E-01 | 6.29E-01 | 0.0008 | 28 | 3 | 3 | 927 | 113 | 127,206 |
| | 5.53E-02 | 0.009 | 6.43E-02 | 1.65E-01 | 0.0005 | 24 | 4 | 5 | 295 | 10 | 32,263 |
| | 5.37E-02 | 0.01 | 6.37E-02 | 1.66E-01 | 0.0006 | 25 | 1 | 2 | 298 | 1 | 34,477 |
| | 5.67E-02 | 0.02 | 7.67E-02 | 2.00E-01 | 0.0006 | 28 | 33 | 35 | 529 | 48 | 95,676 |
| | 8.50E-02 | 0.05 | 1.35E-01 | 3.54E-01 | 0.002 | 25 | 1 | 1 | 241 | 9 | 24,499 |
| | 1.46E-01 | 0.12 | 2.66E-01 | 6.97E-01 | 0.001 | 24 | 1 | 1 | 220 | 2 | 24,673 |
| | 2.01E-01 | 0.09 | 2.91E-01 | 6.87E-01 | 0.0008 | 27 | 31 | 34 | 326 | 48 | 100,314 |
| | 1.97E-01 | 0.3 | 4.97E-01 | 7.44E-01 | 0.0004 | 25 | 10 | 28 | 219 | 14 | 57,473 |
| | 3.65E-01 | 1.2 | 1.56E+00 | 3.84E+00 | 0.002 | 7 | 1 | 1 | 809 | 13 | 68,834 |
| | 5.08E-01 | 0.87 | 1.38E+00 | 3.38E+00 | 0.002 | 25 | 3 | 2 | 928 | 94 | 80,059 |
| | 2.00E-01 | 0.56 | 7.60E-01 | 1.76E+00 | 0.002 | 27 | 15 | 17 | 792 | 72 | 47,916 |
| | 9.56E-02 | 0.09 | 1.86E-01 | 2.78E-01 | 0.001 | 23 | 3 | 3 | 797 | 68 | 20,063 |
| | 3.49E-01 | 0.14 | 4.89E-01 | 1.09E+00 | 0.001 | 22 | 3 | 3 | 1,190 | 279 | 40,291 |
| | 8.18E-02 | 0.2 | 2.82E-01 | 7.33E-01 | 0.003 | 23 | 0 | 0 | 7,896 | 3,791 | 107,123 |
| | 2.62E-01 | 0.17 | 4.32E-01 | 1.06E+00 | 0.002 | 23 | 0 | 0 | 1,258 | 186 | 21,684 |
(a) bsGOL values were estimated by minimizing the following expression: $L = \sum_{i<j} \left( \sum_{k} b_{ijk} x_k - \mathrm{GOL}_{ij} \right)^2$, where $b_{ijk}$ is a Boolean variable that specifies the branches that are relevant for the estimation of a particular bsGOL (i.e., 0 if it is not relevant and 1 if it is), $x_k$ is obtained by minimizing $L$ and is the actual bsGOL value, and $\mathrm{GOL}_{ij}$ are the estimated values from the pairwise comparisons, in other words, $\mathrm{GOL}_{ij} = 1 - \mathrm{GOC}_{ij}$ (see Fischer et al. [2006] for more details).
(b) These branch lengths were obtained by ML phylogenetic reconstruction with RAxML (Stamatakis 2006).
(c) bsGOL normalized by the branch length.
(d) bsGOL rates normalized relative to the mean GOL rate value.
(e) Rates obtained with r8s with the nonparametric method minimizing local transformations (NPRS) and optimization via Powell’s method (Sanderson 2003).
(f) Number of tRNAs (GenBank).
(g) Number of intronic ORFs (GenBank).
(h) Number of introns (GenBank).
(i) Number of repeats (whole genome) detected with RepeatMasker (Smit AFA, Hubley R, Green P. RepeatMasker Open-3.0. 1996–2010; http://www.repeatmasker.org, last accessed February 18, 2014).
(j) Number of intergenic repeats detected with mreps (Kolpakov et al. 2003).
(k) Genome size in bp (GenBank).
Pearson’s correlation between bsGOC values and branch lengths (R = 0.7, P value < 0.0005) for the dikarya data set.
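For readers who want to see how such a coefficient is computed, here is a minimal sketch of Pearson's R in plain Python. The input uses only the first five rows of the bsGOL and branch-length columns of the table above, chosen purely for illustration, so the result will not match the R reported for the full data set, and no P value is computed. The function name `pearson_r` is hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# First five rows of the table above (illustrative subset only).
bs_gol_subset = [0.350, 0.116, 0.312, 0.238, 0.169]
branch_length_subset = [0.7, 0.22, 0.25, 0.7, 0.15]

r_subset = pearson_r(bs_gol_subset, branch_length_subset)
print(f"R on the five-row subset: {r_subset:.3f}")
```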
Number of Intergenic Repeats Normalized by the Number of Species in Each Fungal Cluster, with and without Outliers, and Rearrangement Events per Fungal Cluster
| Fungal Cluster | Intergenic Repeats | Intergenic Repeats (without Outliers) | Rearrangement Events |
|---|---|---|---|
| Basidiomycetes | 988 (109.78) | 336 (37.33) | 414 |
| Sordariomycetes | 241 (30.13) | 32 (4) | 42 |
| Dothideomycetes | 133 (44.33) | 133 (44.33) | 24 |
| Eurotiomycetes | 215 (53.75) | 40 (10) | 22 |
| Saccharomycetes1 | 1,097 (182.83) | 158 (26.33) | 156 |
| Saccharomycetes2 | 4,324 (1,081) | 254 (63.5) | 22 |
| Basals | 27 (13.5) | 27 (13.5) | 14 |
(a) Outliers are defined as the species that have higher than average repeat content relative to their cluster: Dekkera bruxellensis, Paracoccidioides brasiliensis, Microbotryum violaceum-Sl, Moniliophthora perniciosa, Chaetomium thermophilum, Gibberella zeae, Podospora anserina, Nakaseomyces bacillisporus, and Kluyveromyces lactis.
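The parenthesized values in the table above can be checked with a few lines of arithmetic. In this sketch the species count per cluster is tallied from the taxonomy codes in the species list (9 B, 8 S, 3 D, 4 E, 6 S1, 4 S2, 2 Ur); treating those tallies as the normalizing denominators is an assumption, although it reproduces every reported value to within rounding. The same denominator appears to be used for the column without outliers.

```python
# cluster: (intergenic repeats, repeats without outliers,
#           reported per-species value, reported value without outliers,
#           species count tallied from the species list)
CLUSTERS = {
    "Basidiomycetes":   (988,  336, 109.78, 37.33, 9),
    "Sordariomycetes":  (241,  32,  30.13,  4.0,   8),
    "Dothideomycetes":  (133,  133, 44.33,  44.33, 3),
    "Eurotiomycetes":   (215,  40,  53.75,  10.0,  4),
    "Saccharomycetes1": (1097, 158, 182.83, 26.33, 6),
    "Saccharomycetes2": (4324, 254, 1081.0, 63.5,  4),
    "Basals":           (27,   27,  13.5,   13.5,  2),
}


def per_species(count, n_species):
    """Repeat count normalized by the number of species in the cluster."""
    return count / n_species


def check_table(tolerance=0.01):
    """Return the clusters whose recomputed values differ from the reported ones."""
    mismatches = []
    for name, (total, trimmed, rep_total, rep_trimmed, n) in CLUSTERS.items():
        ok_total = abs(per_species(total, n) - rep_total) < tolerance
        ok_trimmed = abs(per_species(trimmed, n) - rep_trimmed) < tolerance
        if not (ok_total and ok_trimmed):
            mismatches.append(name)
    return mismatches


mismatched_clusters = check_table()
print("clusters that do not match:", mismatched_clusters)
```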
Pairwise GOC values as a function of the phylogenetic (patristic) distance between the compared genomes, for the dikarya data set. Here, we conducted a randomization test as follows: for each genome, we listed the order of genes, made all possible pairwise comparisons of these lists, estimated the GOC score (Rocha 2006), randomly shuffled the gene order, and estimated a new GOC value. We obtained 100,000 reshufflings and compared the original GOC to the distribution of the shuffled GOCs. We applied the Bonferroni correction for multiple testing and determined the significance (P values) of the comparisons. The red dots represent significant P values, which correspond to the group of sordariomycetes (in fig. 1, the clade grouping Chaetomium thermophilum, Podospora anserina, Gibberella zeae, F. oxysporum, Lecanicillium muscarium, Cordyceps bassiana, Beauveria bassiana, and Metarhizium anisopliae).
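The randomization test described in this caption can be sketched as follows. This is a simplified stand-in, not the authors' code: GOC is approximated here as the number of gene adjacencies shared by two circular gene orders divided by the number of shared genes, ignoring strand; only one of the two genomes is shuffled; the test is one-sided; and far fewer reshufflings are used than the 100,000 in the study. The rearranged gene order is hypothetical, and all function names are illustrative.

```python
import random


def adjacencies(order):
    """Unordered neighbouring gene pairs of a circular gene order."""
    n = len(order)
    return {frozenset((order[i], order[(i + 1) % n])) for i in range(n)}


def goc(order_a, order_b):
    """Simplified GOC: shared adjacencies divided by the number of shared genes."""
    shared = set(order_a) & set(order_b)
    if len(shared) < 3:
        raise ValueError("need at least three shared genes")
    a = [g for g in order_a if g in shared]
    b = [g for g in order_b if g in shared]
    return len(adjacencies(a) & adjacencies(b)) / len(shared)


def shuffle_p_value(order_a, order_b, n_shuffles=2000, seed=0):
    """Empirical one-sided P value: how often a shuffled order_b conserves
    at least as much gene order with order_a as the observed order_b does."""
    rng = random.Random(seed)
    observed = goc(order_a, order_b)
    shuffled = list(order_b)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)
        if goc(order_a, shuffled) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_shuffles + 1)


def bonferroni(p_values):
    """Bonferroni correction: multiply each P value by the number of tests, cap at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# The 14 protein-coding genes named in the phylogeny caption, in an arbitrary order.
genes = ["atp6", "atp8", "atp9", "nad1", "nad2", "nad3", "nad4",
         "nad5", "nad6", "nad4L", "cob", "cox1", "cox2", "cox3"]
# Hypothetical second genome: the same order with one block of genes inverted.
rearranged = genes[:3] + genes[3:8][::-1] + genes[8:]

observed_goc, p_value = shuffle_p_value(genes, rearranged)
print(f"GOC = {observed_goc:.3f}, empirical P = {p_value:.4f}")
```

In a full analysis, `bonferroni` would be applied to the P values from all pairwise genome comparisons before deciding which pairs are significant.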