Thierry Rouxel, Jonathan Grandaubert, James K Hane, Claire Hoede, Angela P van de Wouw, Arnaud Couloux, Victoria Dominguez, Véronique Anthouard, Pascal Bally, Salim Bourras, Anton J Cozijnsen, Lynda M Ciuffetti, Alexandre Degrave, Azita Dilmaghani, Laurent Duret, Isabelle Fudal, Stephen B Goodwin, Lilian Gout, Nicolas Glaser, Juliette Linglin, Gert H J Kema, Nicolas Lapalu, Christopher B Lawrence, Kim May, Michel Meyer, Bénédicte Ollivier, Julie Poulain, Conrad L Schoch, Adeline Simon, Joseph W Spatafora, Anna Stachowiak, B Gillian Turgeon, Brett M Tyler, Delphine Vincent, Jean Weissenbach, Joëlle Amselem, Hadi Quesneville, Richard P Oliver, Patrick Wincker, Marie-Hélène Balesdent, Barbara J Howlett.
Abstract
Fungi are of primary ecological, biotechnological and economic importance. Many fundamental biological processes that are shared by animals and fungi are studied in fungi due to their experimental tractability. Many fungi are pathogens or mutualists and are model systems to analyse effector genes and their mechanisms of diversification. In this study, we report the genome sequence of the phytopathogenic ascomycete Leptosphaeria maculans and characterize its repertoire of protein effectors. The L. maculans genome has an unusual bipartite structure with alternating distinct guanine and cytosine (GC)-equilibrated and adenine and thymine (AT)-rich blocks of homogeneous nucleotide composition. The AT-rich blocks comprise one-third of the genome and contain effector genes and families of transposable elements, both of which are affected by repeat-induced point mutation, a fungal-specific genome defence mechanism. This genomic environment for effectors promotes rapid sequence diversification and underpins the evolutionary potential of the fungus to adapt rapidly to novel host-derived constraints.
Year: 2011 PMID: 21326234 PMCID: PMC3105345 DOI: 10.1038/ncomms1189
Source DB: PubMed Journal: Nat Commun ISSN: 2041-1723 Impact factor: 17.694
Figure 1: Phylogenetic relationships between Dothideomycetes and an example of microsynteny between related species.
(a) An example of microsynteny between L. maculans and the closely related Dothideomycetes P. nodorum, C. heterostrophus and P. tritici-repentis, showing the integration of an AT-rich genomic region (grey boxes) between two orthologous genes encoding fungal transcription factors (red and green arrows) of the three other species, along with generation of one novel small-secreted protein-encoding gene (blue arrow) in L. maculans only. Grey arrow, P. nodorum predicted gene. The ID of each gene in the corresponding genome sequence is indicated. The intergenic distance (expressed in kb) is shown. (b) A phylogenetic tree and estimated divergence times of major lineages in Ascomycota with a selection of plant pathogenic lineages in Dothideomycetes. The phylogenetic analysis was performed using RAxML (ref. 44) and the chronogram, calibrated using recent data from the literature and fossil dates, was produced using r8s (ref. 45). Classes outside of the Dothideomycetes were collapsed in TreeDyn, except for Sordariomycetes where the order Hypocreales represented an important calibration point. The blue vertical lines correlate with divergence times when the root of the tree was fixed at 500 MYA, whereas the green lines represent a fixed root of 650 MYA. The ranges of dates for the emergence of Dothideomycetes and Pleosporineae are highlighted with stippled lines. Thickened branches on the tree represent nodes that had more than 70% bootstrap values in a RAxML run. Species with genome data are marked with a DNA logo.
Assembly statistics for the L. maculans genome.
| | SuperContigs | Contigs |
| --- | --- | --- |
| Number | 76 | 1,743 |
| Size (Mb) | 45.12 | 43.76 |
| N50 (kb) | 1,770 | 61 |
| Min/max size (kb) | 0.49/4,258.57 | 0.22/395.37 |
| Mean size (kb) | 594 | 26 |
| Median size (kb) | 29 | 11 |
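The N50 values in the table summarize assembly contiguity: N50 is the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the assembly. A minimal Python sketch of that standard definition follows. The function name `n50` and the toy lengths are illustrative assumptions, not data or code from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Standard definition: sort lengths from longest to shortest and
    return the length at which the running total first reaches half
    of the summed length. Not taken from the paper's own pipeline.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy lengths (hypothetical, arbitrary units), not the L. maculans contigs.
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_lengths))
```

Because N50 is weighted by length, a few very long SuperContigs can give a high N50 even when the median size is small, which is consistent with the gap between the N50 and median rows above.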
Features of genomes of L. maculans and other related Dothideomycetes.
| | L. maculans | P. nodorum | P. tritici-repentis | C. heterostrophus | A. brassicicola | M. graminicola |
| --- | --- | --- | --- | --- | --- | --- |
| No. of chromosomes | 17–18 | 19 | 11 | 15–16 | 9–11 | 21 |
| Genome size (Mb) | 45.1 | 36.6 | 37.8 | 34.9 | 30.3 | 39.7 |
| No. of contigs | 1,743 | 496 | 703 | 400 | 4,039 | 21 |
| No. of SuperContigs (SCs) | 76 | 107 | 47 | 89 | 838 | 21 |
| SC N50 (Mb) | 1.8 | 1.1 | 1.9 | 1.3 | 2.4 | NA† |
| Gaps (%) | 2.5 | 0.4 | 1.7 | 1.1 | 5.4 | 0.01 |
| No. of predicted genes | 12,469 | 10,762 | 12,141 | 9,633 | 10,688 | 10,952 |
| Average gene length (bp) | 1,323 | 1,326 | 1,618 | 1,836 | 1,523 | 1,600 |
| GC content (%) | 44.1 | 50.3 | 50.4 | 52–54 | 50.5 | 55.0 |
| Repeat content (%) | 34.2 | 7.1 | 16.0 | 7.0 | 9.0 | 18.0 |
| 'Core' genome size (Mb)‡ | 29.7 | 34.5 | 31.7 | 32.5 | 27.6 | 32.6 |
| Gene density/core genome (no. of genes per 10 kb) | 4.2 | 3.1 | 3.8 | 3.0 | 3.9 | 3.4 |
*References for the genomes as follows: L. maculans (ref. 54), P. nodorum (ref. 55), P. tritici-repentis (ref. 56), C. heterostrophus (ref. 57), A. brassicicola (ref. 58), M. graminicola (ref. 59); unpublished reannotation of P. nodorum genome was provided by J. K. Hane and R. P. Oliver.
†Not applicable, as the M. graminicola genome is finished; that is, each SC corresponds to a chromosome.
‡'Core' genome excluding the repeated elements, but including the gaps in the genome sequence.
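The gene-density row can be recomputed from the two rows above it as predicted genes divided by 'core' genome size, scaled to 10 kb. The sketch below assumes the table columns follow the species order given in the reference footnote; the dictionary and function names are illustrative only.

```python
# Predicted gene counts and 'core' genome sizes (Mb) copied from the table.
# Assumption: columns follow the species order of the reference footnote.
GENOMES = {
    "L. maculans": (12_469, 29.7),
    "P. nodorum": (10_762, 34.5),
    "P. tritici-repentis": (12_141, 31.7),
    "C. heterostrophus": (9_633, 32.5),
    "A. brassicicola": (10_688, 27.6),
    "M. graminicola": (10_952, 32.6),
}


def gene_density_per_10kb(n_genes, core_genome_mb):
    """Number of genes per 10 kb of core genome."""
    core_bp = core_genome_mb * 1_000_000
    return n_genes / core_bp * 10_000


for species, (n_genes, core_mb) in GENOMES.items():
    density = gene_density_per_10kb(n_genes, core_mb)
    print(f"{species}: {density:.1f} genes per 10 kb")
```

Rounded to one decimal place, these ratios reproduce the tabulated densities, for example 12,469 / 29.7 Mb giving about 4.2 genes per 10 kb for L. maculans.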
Figure 2: Main features of the L. maculans genome as exemplified by chromosome 5 SuperContig 1.
(a) Transposable element (TE) distribution and gene density along the SuperContig (SC). TE density is drawn in green and gene density in blue. (b) Location of genes encoding small secreted proteins (SSPs). Blue arrowheads, SSPs in AT-blocks, corresponding to TE-rich regions in a; red arrowheads, SSPs in GC-blocks, corresponding to gene-rich genome regions in a. (c) GC content along the SC showing alternating GC-equilibrated and AT-rich regions, with the location of a polyketide synthase-encoding gene, PKS4. (d) Relationship between genetic distance (upper part, expressed in centiMorgans, cM) and physical distance (expressed in kb) as a function of the isochore-like structure. Lower part: physical location of genetic markers. Upper part of the panel: genetic map built using MapMaker/Exp 3.0 with parameters set at likelihood ratio value >3.0 and minimum distance=20 cM. Only markers drawn from the sequence data are represented.
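A GC-content profile such as the one described for panel c is typically obtained by sliding a fixed window along the sequence. The sketch below is a generic illustration of that idea. The window and step sizes, the function name and the toy sequence are assumptions, since the caption does not state the parameters used.

```python
def gc_content_windows(seq, window=1000, step=500):
    """Return (start, GC percent) for successive windows along seq.

    Only A, C, G and T are counted, so ambiguous bases such as N do
    not dilute the percentage. A window with no unambiguous base
    yields None. Window and step sizes are arbitrary choices here.
    """
    seq = seq.upper()
    profile = []
    last_start = max(len(seq) - window, 0)
    for start in range(0, last_start + 1, step):
        chunk = seq[start:start + window]
        gc = chunk.count("G") + chunk.count("C")
        acgt = gc + chunk.count("A") + chunk.count("T")
        percent = 100.0 * gc / acgt if acgt else None
        profile.append((start, percent))
    return profile


# Toy sequence: a GC-only block followed by an AT-only block.
toy = "GC" * 10 + "AT" * 10
print(gc_content_windows(toy, window=20, step=20))
```

On a real SuperContig, abrupt drops in such a profile would mark the transitions from GC-equilibrated to AT-rich blocks.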
Comparative features of SSP-encoding genes occurring in diverse genome environments.
| No. | 12,469 | 529 (4.2%) | 407 (3.3%) | 91 (0.7%) | 65 (0.5%) | 57 (0.5%) |
| BLAST hits (%)† | 71.3 | 48.4 | 60.2 | 34.1 | 15.4 | 8.8 |
| GC content (%) | 54.1 | 54.6 | 52.9 | 48.9 | 51.1 | 48.2 |
| TpA/ApT | 1.04 | 1.20 | 1.12 | 1.49 | 1.19 | 1.44 |
| TpA/ApT >1.5 (%)‡ | 6.9 | 16.4 | 11.5 | 36.3 | 20.0 | 38.6 |
| EST, transcriptomic or proteomic support (%) | 84.8 | 77.1 | 73.7 | 54.9 | 56.9 | 60.0 |
| No. of genes present on the NimbleGen array, and with transcriptomic support | 10,524 | 396 | 298 | 47 | 35 | 33 |
| Genes overexpressed at 7 dpi (%)§ | 9.9 | 19.1 | 11.1 | 36.2 | 13.9 | 72.7 |
| Genes overexpressed at 14 dpi (%)§ | 11.0 | 15.4 | 11.8 | 8.5 | 22.2 | 24.2 |
| Average protein size (amino acid) | 418.4 | 167.7 | 396.1 | 192.4 | 111.6 | 98.6 |
| % Cysteines in the predicted protein | 1.7 | 2.9 | 1.9 | 2.1 | 3.8 | 4.5 |
TpA/ApT, frequency of occurrence of dinucleotide TA over dinucleotide AT.
* 'Borders' refer to 859±385 bp transition regions between AT-rich and GC-equilibrated genomic regions.
†BLAST against nr, cutoff=1e−10.
‡Percentage of genes showing a TpA/ApT RIP index above 1.5. This cutoff corresponds to that observed in the majority of RIP-inactivated AvrLm6 alleles (ref. 39).
§Genes with more than 1.5-fold change in transcript level and an associated P-value <0.05 were considered as significantly differentially expressed during infection (7 or 14 dpi) compared with growth in vitro; expressed as a percent of genes with transcriptomic support. The same genes may be overexpressed at 7 and 14 dpi.
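The TpA/ApT index used in the table is the count of TA dinucleotides divided by the count of AT dinucleotides, with values above 1.5 treated as a RIP signature. A minimal sketch follows. The function names and toy sequences are illustrative, and overlapping dinucleotide counting is an assumption about how the index is tallied.

```python
RIP_INDEX_CUTOFF = 1.5  # threshold quoted in the table footnote


def count_dinucleotide(seq, dinuc):
    """Count occurrences of a dinucleotide, allowing overlaps."""
    return sum(1 for i in range(len(seq) - 1) if seq[i:i + 2] == dinuc)


def tpa_apt_index(seq):
    """TpA/ApT ratio of a sequence, or None if it has no ApT."""
    seq = seq.upper()
    ta = count_dinucleotide(seq, "TA")
    at = count_dinucleotide(seq, "AT")
    if at == 0:
        return None
    return ta / at


def looks_ripped(seq):
    """True when the TpA/ApT index is strictly above the cutoff."""
    index = tpa_apt_index(seq)
    return index is not None and index > RIP_INDEX_CUTOFF


# Toy sequences, not real gene sequences.
print(tpa_apt_index("TTATAA"), looks_ripped("TTATAA"))
print(tpa_apt_index("ATAT"), looks_ripped("ATAT"))
```

The ratio rises under RIP because C-to-T transitions in CpA contexts create new TpA dinucleotides, while ApT frequency serves as a composition-matched baseline.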
Main families and characteristics of transposable elements and other repeats in the L. maculans genome.
| Retrotransposons (class I), total | 12.30 Mb (27.26%) | | | | | | |
| RLG_ | 3.06 Mb | 7,246 | 250 | 1,085 | 187 | 0.172 | |
| RLG_ | 2.97 Mb | 6,928 | 179 | 1,014 | 164 | 0.162 | |
| RLG_ | 2.24 Mb | 11,875 | 235 | 594 | 46 | 0.077 | |
| RLC_ | 3.10 Mb | 6,981 | 281 | 1,020 | 83 | 0.081 | |
| RLG_ | 0.30 Mb | 6,620 | 228 | 85 | 30 | 0.353 | |
| RLC_ | 0.16 Mb | 5,306 | 177 | 97 | 14 | 0.144 | |
| RLx_ | 0.02 Mb | 803 | 259 | 57 | 5 | 0.088 | Unknown |
| RLx_ | 0.40 Mb | 10,397 | 217 | 164 | 8 | 0.049 | Unknown |
| RLG_ | 0.05 Mb | 7,289 | None | 22 | 3 | 0.136 | |
| DNA transposons (class II), total | 1.19 Mb (2.64%) | | | | | | |
| DTF_ | 199.3 kb | 2,173 | 57 | 158 | 54 | 0.342 | |
| DTM_ | 25.6 kb | 3,489 | 49 | 36 | 3 | 0.083 | |
| DTx_ | 5.3 kb | 866 | 49 | 15 | 4 | 0.267 | Unknown |
| DTx_ | 11.5 kb | 1,793 | 37 | 73 | 2 | 0.027 | Unknown |
| DTT_ | 2.7 kb | 529 | 29 | 7 | 4 | 0.571 | |
| DTT_ | 10.8 kb | 523 | 29 | 31 | 11 | 0.355 | |
| DTT_ | 7.4 kb | 806 | 29 | 15 | 4 | 0.267 | |
| DTM_ | 782.8 kb | 5,992 | None | 873 | 49 | 0.056 | |
| DTx_ | 112.9 kb | 606 | None | 279 | 51 | 0.183 | Unknown |
| DTM_ | 33.4 kb | 3,582 | 37 | 48 | 1 | 0.021 | |
| Uncharacterized repeats (11 families) | 159.9 kb (0.35%) | ||||||
| rDNA repeats† | 767 kb (1.70%) | 7,800‡ | >100 | 50 | |||
| Telomeric repeats§ | 935.0 kb (2.07%) | | | | | | |
* Classification of TEs according to Wicker et al. (ref. 16): the three-letter code refers to class (R, retrotransposon; D, DNA transposon), order (L, long terminal repeat, LTR; T, terminal inverted repeat, TIR; P, Penelope-like element, PLE) and superfamily (G, Gypsy; C, Copia; P, Penelope; F, Fot1-Pogo; T, Tc1-Mariner; M, Mutator; x, unknown superfamily) followed by the family (or subfamily) name italicized.
†Including a rDNA-specific LINE element.
‡Excluding variable length short-tandem repeats flanking almost every rDNA repeat.
§Including telomere-associated Penelope-like retroelement RPP-Circe and RecQ telomere-linked helicase.
Figure 3: Repeat-induced point (RIP) mutation in ribosomal DNA of L. maculans shown as RIPCAL output.
(a) Schematic representation of the rDNA unit in L. maculans (ITS, internal transcribed spacers; IGS, intergenic spacer); (b) a schematic multiple alignment of the 7.8 kb 'complete' ribosomal DNA (rDNA) units occurring in SuperContigs 2 and 19. Polymorphic nucleotides are coloured as a function of the type of RIP mutation observed, with black, invariant nucleotide; red, CpA→TpA or TpG→TpA mutations; dark blue, CpC→TpC or GpG→GpA mutations; pale blue, CpT→TpT or ApG→ApA mutations; green, CpG→TpG or CpG→CpA mutations; (c) RIP mutation frequency plot over a rolling sequence window, corresponding to the multiple alignment directly above. Nucleotide polymorphisms (against the alignment consensus, which is also the highest GC-content sequence) mostly correspond to CpA→TpA or TpG→TpA (red curve) and CpG→TpG or CpG→CpA (green curve).
Figure 4: Dynamics of transposable elements in the L. maculans genome.
A phylogenetic analysis was used to retrace the evolutionary history of each transposable element (TE) family after elimination of mutations due to repeat-induced point mutation. Terminal fork branch lengths were assumed to correspond to an evolutionary distance used to estimate the age of the last transposition activity. The divergence values were converted to estimated divergence times (expressed as 'million years ago', MYA) using a substitution rate of 1.05×10⁻⁹ substitutions per site per year (refs 52, 53). (a) Box plot of divergence times. The red line represents the median value; the boxes include values between the first and the third quartile of the distribution; squares and circles, first and ninth decile, respectively. (b) Kernel density plots of divergence. An R script was written to plot a histogram of the terminal fork branch lengths with a kernel density estimate for each family.
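The conversion from branch length to age described in this caption can be sketched as a division by the substitution rate. The example below assumes the simplest reading, time = branch length / rate. The caption does not say whether any further correction was applied, such as halving a pairwise distance, so this is an illustration and not the authors' script. The branch lengths are hypothetical.

```python
from statistics import median

# Substitution rate quoted in the caption (substitutions per site per year).
SUBSTITUTION_RATE = 1.05e-9


def branch_length_to_mya(branch_length, rate=SUBSTITUTION_RATE):
    """Convert a branch length (substitutions per site) to an age in MYA.

    Assumes time = branch length / rate, with no extra correction.
    """
    years = branch_length / rate
    return years / 1_000_000


# Hypothetical terminal fork branch lengths for one TE family.
toy_branch_lengths = [0.0021, 0.0042, 0.0105]
ages = [branch_length_to_mya(b) for b in toy_branch_lengths]
print([round(age, 2) for age in ages])
print(round(median(ages), 2))
```

The median of the converted ages corresponds to the kind of per-family summary that the red line in panel a reports.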