Stephen B Goodwin, Sarrah Ben M'barek, Braham Dhillon, Alexander H J Wittenberg, Charles F Crane, James K Hane, Andrew J Foster, Theo A J Van der Lee, Jane Grimwood, Andrea Aerts, John Antoniw, Andy Bailey, Burt Bluhm, Judith Bowler, Jim Bristow, Ate van der Burgt, Blondy Canto-Canché, Alice C L Churchill, Laura Conde-Ferràez, Hans J Cools, Pedro M Coutinho, Michael Csukai, Paramvir Dehal, Pierre De Wit, Bruno Donzelli, Henri C van de Geest, Roeland C H J van Ham, Kim E Hammond-Kosack, Bernard Henrissat, Andrzej Kilian, Adilson K Kobayashi, Edda Koopmann, Yiannis Kourmpetis, Arnold Kuzniar, Erika Lindquist, Vincent Lombard, Chris Maliepaard, Natalia Martins, Rahim Mehrabi, Jan P H Nap, Alisa Ponomarenko, Jason J Rudd, Asaf Salamov, Jeremy Schmutz, Henk J Schouten, Harris Shapiro, Ioannis Stergiopoulos, Stefano F F Torriani, Hank Tu, Ronald P de Vries, Cees Waalwijk, Sarah B Ware, Ad Wiebenga, Lute-Harm Zwiers, Richard P Oliver, Igor V Grigoriev, Gert H J Kema.
Abstract
The plant-pathogenic fungus Mycosphaerella graminicola (asexual stage: Septoria tritici) causes septoria tritici blotch, a disease that greatly reduces the yield and quality of wheat. This disease is economically important in most wheat-growing areas worldwide and threatens global food production. Control of the disease has been hampered by a limited understanding of the genetic and biochemical bases of pathogenicity, including mechanisms of infection and of resistance in the host. Unlike most other plant pathogens, M. graminicola has a long latent period during which it evades host defenses. Although this type of stealth pathogenicity occurs commonly in Mycosphaerella and other Dothideomycetes, the largest class of plant-pathogenic fungi, its genetic basis is not known. To address this problem, the genome of M. graminicola was sequenced completely. The finished genome contains 21 chromosomes, eight of which could be lost with no visible effect on the fungus and thus are dispensable. This eight-chromosome dispensome is dynamic in field and progeny isolates, is different from the core genome in gene and repeat content, and appears to have originated by ancient horizontal transfer from an unknown donor. Synteny plots of the M. graminicola chromosomes versus those of the only other sequenced Dothideomycete, Stagonospora nodorum, revealed conservation of gene content but not order or orientation, suggesting a high rate of intra-chromosomal rearrangement in one or both species. This observed "mesosynteny" is very different from synteny seen between other organisms. A surprising feature of the M. graminicola genome compared to other sequenced plant pathogens was that it contained very few genes for enzymes that break down plant cell walls, which was more similar to endophytes than to pathogens. The stealth pathogenesis of M. graminicola probably involves degradation of proteins rather than carbohydrates to evade host defenses during the biotrophic stage of infection and may have evolved from endophytic ancestors.
Year: 2011 PMID: 21695235 PMCID: PMC3111534 DOI: 10.1371/journal.pgen.1002070
Source DB: PubMed Journal: PLoS Genet ISSN: 1553-7390 Impact factor: 5.917
Sizes and gene contents of the 21 chromosomes of Mycosphaerella graminicola isolate IPO323.
| Chromosome number | Chromosome size (bp) | All genes: number | All genes: annotated | Unique genes: number | Unique genes: annotated | Signal peptides | Average gene size (bp) | Genes/Mb DNA | Percent G+C | Percent repetitive | milRNAs/Mb DNA |
| 1 | 6,088,797 | 1,980 | 1,258 | 1,067 | 497 | 208 | 1338.6 | 325 | 53.1 | 9.5 | 9.7 |
| 2 | 3,860,111 | 1,136 | 650 | 607 | 238 | 108 | 1402.7 | 294 | 52.4 | 15.7 | 9.6 |
| 3 | 3,505,381 | 1,071 | 630 | 583 | 246 | 122 | 1337.1 | 306 | 52.6 | 14.2 | 6.3 |
| 4 | 2,880,011 | 821 | 498 | 421 | 182 | 81 | 1388.6 | 285 | 52.2 | 16.1 | 13.2 |
| 5 | 2,861,803 | 778 | 489 | 389 | 180 | 91 | 1352.6 | 272 | 52.0 | 19.1 | 18.9 |
| 6 | 2,674,951 | 692 | 427 | 328 | 152 | 66 | 1353.0 | 259 | 51.4 | 22.2 | 12.3 |
| 7 | 2,665,280 | 766 | 357 | 462 | 131 | 96 | 1202.7 | 287 | 52.6 | 14.0 | 16.1 |
| 8 | 2,443,572 | 689 | 397 | 384 | 159 | 62 | 1311.2 | 282 | 51.7 | 17.6 | 13.5 |
| 9 | 2,142,475 | 604 | 353 | 305 | 134 | 69 | 1345.1 | 282 | 51.5 | 20.8 | 18.7 |
| 10 | 1,682,575 | 516 | 298 | 266 | 110 | 46 | 1418.7 | 307 | 52.5 | 14.1 | 9.5 |
| 11 | 1,624,292 | 488 | 279 | 270 | 115 | 65 | 1352.5 | 300 | 52.8 | 10.5 | 5.5 |
| 12 | 1,462,624 | 408 | 227 | 232 | 96 | 59 | 1254.3 | 279 | 52.3 | 14.5 | 10.9 |
| 13 | 1,185,774 | 330 | 183 | 165 | 68 | 47 | 1195.7 | 278 | 52.0 | 17.8 | 17.7 |
| 14 | 773,098 | 114 | 25 | 48 | 5 | 3 | 920.1 | 147 | 48.5 | 36.7 | 23.3 |
| 15 | 639,501 | 86 | 6 | 44 | 1 | 2 | 773.7 | 134 | 51.0 | 34.4 | 25.0 |
| 16 | 607,044 | 88 | 5 | 40 | 1 | 5 | 898.5 | 145 | 51.5 | 25.6 | 31.3 |
| 17 | 584,099 | 78 | 6 | 36 | 1 | 1 | 777.9 | 134 | 52.0 | 26.4 | 18.8 |
| 18 | 573,698 | 64 | 7 | 28 | 4 | 0 | 965.1 | 112 | 48.6 | 40.3 | 33.1 |
| 19 | 549,847 | 87 | 8 | 53 | 3 | 4 | 658.3 | 158 | 51.3 | 25.1 | 23.6 |
| 20 | 472,105 | 79 | 4 | 41 | 2 | 4 | 863.1 | 167 | 51.5 | 21.1 | 25.4 |
| 21 | 409,213 | 58 | 4 | 21 | 1 | 2 | 921.6 | 142 | 51.9 | 30.1 | 14.7 |
| Total | 39,686,251 | 10,933 | 6,111 | 5,790 | 2,326 | 1,141 | | | | | 13.5 |
Unique genes: at a BLAST cutoff value of 1×e−20.
milRNAs: predicted numbers of loci for pre-microRNA-like small RNAs.
This chromosome contains two internal gaps of unclonable DNA marked by gaps of 1.4 and 4.5 kb; all other chromosomes are complete.
The sequence of one telomere is missing from this chromosome; all other telomeres are complete.
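The Genes/Mb DNA column can be recovered from the size and gene-count columns. The following is a minimal sketch using three rows copied from the table; the function name `genes_per_mb` is illustrative only, and the calculation assumes 1 Mb = 1,000,000 bp.

```python
# Rows copied from the table: chromosome -> (size in bp, number of genes)
CHROMOSOMES = {
    1: (6_088_797, 1_980),
    13: (1_185_774, 330),
    14: (773_098, 114),
}


def genes_per_mb(size_bp: int, n_genes: int) -> float:
    """Gene density in genes per megabase (assumes 1 Mb = 1,000,000 bp)."""
    return n_genes / (size_bp / 1_000_000)


for chrom, (size_bp, n_genes) in CHROMOSOMES.items():
    print(f"chromosome {chrom}: {genes_per_mb(size_bp, n_genes):.0f} genes/Mb")
```

Rounded to whole numbers, the three values agree with the tabulated densities of 325, 278 and 147 genes/Mb, and show the drop in gene density between the smallest core chromosome (13) and the largest dispensable one (14).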
Figure 1. Features of chromosome 2 of Mycosphaerella graminicola and alignment to genetic linkage maps.
A, Plot of GC content. Areas of low GC usually correspond to regions of repetitive DNA. B, Repetitive regions of the M. graminicola genome. C, Single-copy (red) regions of the M. graminicola genome. D, Locations of genes for proteins containing signal peptides. E, Locations of homologs involved in pathogenicity or virulence that have been experimentally verified in species pathogenic to plant, animal or human hosts. F, Approximate locations of quantitative trait loci (QTL) for pathogenicity to wheat. G, Alignments between the genomic sequence and two genetic linkage maps of crosses involving isolate IPO323. Top half, Genetic linkage map of the cross between IPO323 and the Algerian durum wheat isolate IPO95052. Bottom half, Genetic linkage map of the cross between bread wheat isolates IPO323 and IPO94269. The physical map represented by the genomic sequence is in the center. Lines connect mapped genetic markers in each linkage map to their corresponding locations on the physical map based on the sequences of the marker loci. Exceptions to the almost perfect alignment between the three maps are indicated by crossed lines, most likely due to occasional incorrect scorings of the marker alleles. Chromosome 2 was used for this illustration because no QTL mapped to chromosome 1.
Figure 2. Box plots of comparative genome hybridizations (CGH) of DNA from five isolates of Mycosphaerella graminicola to a whole-genome tiling array made from the finished sequence of isolate IPO323.
A, CGH between IPO323 and the Dutch field isolate IPO94269. B, CGH between IPO323 and progeny isolate #51 from the cross between IPO323 and IPO94269. C, CGH between IPO323 and progeny isolate #2133 from the cross between IPO323 and IPO95052. D, CGH between IPO323 and Algerian field isolate IPO95052, which was isolated from and is adapted to durum (tetraploid) wheat. The genomic difference between the strains for each CGH is shown by 21 box plots, one for each chromosome of M. graminicola. The horizontal line in each box is the median log ratio of hybridization signals of the two strains; the lower and upper ends of a box represent the first and third quartiles (25% and 75%). The whiskers extending from each box indicate 1.5 times the interquartile range, the distance between the first and third quartiles. The larger the deviation from 0, the greater the difference between the strains for a particular chromosome. Pink boxes that are significantly below the zero line indicate missing chromosomes. The purple boxes in panel B (chromosomes 4 and 18) that are significantly above the zero line indicate chromosomes that are disomic.
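The box-plot quantities named in this caption (median, quartiles, and 1.5 times the interquartile range) can be sketched as follows. This is an illustration only: the function name `box_stats` and the input values are hypothetical, not data from the study, and the quartile interpolation method is an assumption, since the caption does not say which one was used.

```python
from statistics import median, quantiles


def box_stats(log_ratios):
    """Return the median, quartiles and 1.5 x IQR fences of a list of log ratios."""
    # "inclusive" interpolation is an assumption; other conventions exist.
    q1, _, q3 = quantiles(log_ratios, n=4, method="inclusive")
    iqr = q3 - q1
    return {
        "median": median(log_ratios),
        "q1": q1,
        "q3": q3,
        "lower_fence": q1 - 1.5 * iqr,
        "upper_fence": q3 + 1.5 * iqr,
    }


# Hypothetical log ratios for the probes of one chromosome (not data from the paper).
example = [-0.2, -0.1, 0.0, 0.1, 0.2]
print(box_stats(example))
```

A chromosome whose box sits well below zero would correspond to the missing chromosomes described above, and one well above zero to a disomic chromosome. Many plotting tools draw each whisker to the most extreme data point that lies inside these fences, not to the fence itself.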
Differences between essential and dispensable chromosomes in the genome of Mycosphaerella graminicola isolate IPO323.
| Statistic | Core chromosomes (1–13) | Dispensable chromosomes (14–21) | Combined (1–21) |
| Size in bp | | | |
| Total | 35,077,646 | 4,608,605 | 39,686,251 |
| Mean | 2,698,280 | 576,076 | 1,889,821 |
| Percent | 88.4 | 11.6 | 100.0 |
| All genes | | | |
| Total | 10,279 | 654 | 10,933 |
| Mean | 790.7 | 81.8 | 521 |
| Percent of total | 94.0 | 6.0 | 100.0 |
| Unique genes | | | |
| Total | 5,479 | 311 | 5,790 |
| Mean | 421.5 | 38.9 | 276 |
| Percent of all | 53.3 | 47.6 | 53.0 |
| Annotated genes | | | |
| Total | 6,046 | 65 | 6,111 |
| Mean | 465.1 | 8.1 | 291.0 |
| Percent of all | 58.8 | 9.9 | 55.9 |
| Unique total | 2,308 | 18 | 2,326 |
| Unique mean | 177.5 | 2.3 | 110.8 |
| Transcript size, mean in bp | 1327.1 | 847.3 | 1144.3 |
| Gene density, mean genes/Mb DNA | 288.9 | 142.4 | 233.1 |
| Repetitive DNA, mean | 15.9% | 30.0% | 21.2% |
| G+C, mean | 52.3% | 50.9% | 51.7% |
Unique genes: at a BLAST cutoff value of 1×e−20.
***The mean for the dispensable chromosomes is significantly different from that for the essential chromosomes at P<0.001 by one-tailed t test.
**The mean for the dispensable chromosomes is significantly different from that for the essential chromosomes at P = 0.012 by one-tailed t test.
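The gene-density means in this table can be reproduced from the per-chromosome Genes/Mb DNA values in the chromosome table, and the same values can be fed into a two-sample t statistic. The sketch below uses Welch's unequal-variance form as an assumption, because the footnotes say only "one-tailed t test". It stops at the t statistic because the Python standard library has no t distribution from which to derive a P value.

```python
from math import sqrt
from statistics import mean, variance

# Genes/Mb DNA per chromosome, copied from the chromosome table.
CORE = [325, 294, 306, 285, 272, 259, 287, 282, 282, 307, 300, 279, 278]
DISPENSABLE = [147, 134, 145, 134, 112, 158, 167, 142]


def welch_t(a, b):
    """Welch's t statistic for two independent samples (unequal variances)."""
    standard_error = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / standard_error


print(f"core mean: {mean(CORE):.1f}")
print(f"dispensable mean: {mean(DISPENSABLE):.1f}")
print(f"Welch t statistic: {welch_t(CORE, DISPENSABLE):.1f}")
```

The two means round to 288.9 and 142.4, matching the gene-density row of the table. A large positive t statistic is in line with the reported significant difference between core and dispensable chromosomes.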
Figure 3. Analysis of genes that are shared between dispensable chromosome 14 and the 13 core chromosomes of Mycosphaerella graminicola isolate IPO323.
Each chromosome is drawn to scale as a numbered bar around the outer edge of the circle, and the sequence was masked for repetitive DNA prior to analysis. Lines connect regions of 100 bp or larger that are similar between each core chromosome and the corresponding region on chromosome 14 at 1×e−5 or lower. Chromosome 14 is an amalgamation of genes from all of the core chromosomes but they are mixed together with no synteny. Genes on the other dispensable chromosomes were not included in this analysis.
Figure 4. Principal Component Analysis of: S, observed genes on the dispensome; O, observed samples of genes on the core chromosomes before mutation; and x, samples of genes from the core chromosomes after mutation.
Mutation was simulated using observed frequencies of all mutations in families of transposable elements with ten or more copies, and included mutations from RIP and other processes. Mutating the samples of genes from the core chromosomes always made them more similar to the observed value for the dispensome but only rarely included the dispensome value (see panel C). This occurred primarily with codon preference and GC content by amino acid, which are the quantities that are least subject to natural selection for protein function. A, amino acid frequency using the values for the aligned sequence with the highest GC content to build the table of mutation frequencies; B, codon preference using the consensus of the aligned sequences to make the table of mutation frequencies covering only the 5′ portion of each gene; C, codon preference using the values for the aligned sequence with the highest GC content to build the table of mutation frequencies covering only the 5′ portion of each gene; D, codon usage using the values for the aligned sequence with the highest GC content to build the table of mutation frequencies but with all mutation frequencies cut in half; E, codon usage using the values for the aligned sequence with the highest GC content to build the table of mutation frequencies; and F, GC skew using the consensus of the aligned sequences to make the table of mutation frequencies. The first principal component always separated out the pre- and post-mutated chromosome samples. The locations of the observed values for the dispensome (S) are circled.
Figure 5. Comparisons of Mycosphaerella graminicola genome assembly versions 1 and 2 against that of Stagonospora nodorum isolate SN15.
Scaffolds/chromosomes are ordered along their respective axes according to both decreasing length and increasing number. The 6-frame translations of both genomes were compared via MUMMER 3.0 [53]. Homologous regions are plotted as dots, which are color coded for percent similarity as per the bar on the right. Amendments made in the version 2 assembly and their corresponding regions in assembly version 1 are circled in red. Version 2 chromosomes 5 (B, circle II), 7 (B, circle I) and 10 (B, circle III) were derived from joined version 1 scaffolds 7 and 17 (A, circle II), 10 and 14 (A, circle I) and 12 and 22 (A, circle III), respectively, validating the method. Observation of the mesosyntenic pattern also could be used to identify inappropriately joined scaffolds. For example, M. graminicola v2 chromosomes 6 and 16 (B, circle IV) and 12 and 21 (B, circle V) were derived from split version 1 scaffolds 4 (A, circle IV) and 9 (A, circle V), respectively. These scaffolds are characterized by an abrupt termination of the mesosyntenic block at the split point as indicated by red lines (A, circles IV and V). A total of 21 predictions was made and 14 were validated.
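Mesosynteny, as described in the abstract, means that gene content is conserved between two chromosomes while gene order and orientation are not. A toy sketch of that distinction follows. The gene names, both function names, and the two measures themselves are hypothetical illustrations; they are not the MUMMER-based dot-plot method used for this figure.

```python
def content_overlap(a, b):
    """Fraction of genes shared between two chromosomes (Jaccard index)."""
    genes_a, genes_b = set(a), set(b)
    return len(genes_a & genes_b) / len(genes_a | genes_b)


def adjacency_conservation(a, b):
    """Fraction of neighbouring gene pairs in `a` that are also neighbours in `b`.

    Pairs are unordered, so an inverted pair still counts as conserved.
    """
    pairs_b = {frozenset(pair) for pair in zip(b, b[1:])}
    pairs_a = [frozenset(pair) for pair in zip(a, a[1:])]
    return sum(pair in pairs_b for pair in pairs_a) / len(pairs_a)


# Hypothetical gene orders on two homologous chromosomes (not real data).
chrom_x = ["g1", "g2", "g3", "g4", "g5", "g6"]
chrom_y = ["g4", "g1", "g6", "g3", "g5", "g2"]

print(content_overlap(chrom_x, chrom_y))         # same gene content
print(adjacency_conservation(chrom_x, chrom_y))  # no conserved neighbours
```

In this toy case the content overlap is complete while no neighbouring pair is preserved, which is the pattern that appears in a dot plot as a scattered block of dots confined to one chromosome pair instead of a diagonal line.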
Figure 6. Numbers of genes for proteases and plant cell wall (PCW) degrading polysaccharidases in the genomes of seven fungi with sequenced genomes.
Genes for PCW-polysaccharidases were severely reduced in the genome of Mycosphaerella graminicola, but the number of protease genes was about the same as in the other species. The overall enzyme profile of M. graminicola was more similar to that of T. reesei than to that of any of the other plant pathogens. Species analyzed included the saprophytes Aspergillus nidulans (Anid), Neurospora crassa (Ncra), and Trichoderma reesei (Trees), and the plant pathogens Fusarium graminearum (Fgram), Mycosphaerella graminicola (Mgram), Magnaporthe oryzae (Moryz), and Stagonospora nodorum (Snod).
Numbers of predicted enzymes degrading cellulose across seven ascomycete species with sequenced genomes.
| | Saprophytes | | | Pathogens | | | |
| CAZy family | Anid | Ncra | Trees | Fgram | Mgram | Moryz | Snod |
| GH5 cellulases | 3 | 1 | 2 | 2 | 0 | 2 | 3 |
| GH6 | 2 | 3 | 1 | 1 | 0 | 3 | 4 |
| GH7 | 3 | 5 | 2 | 2 | 1 | 6 | 5 |
| GH12 | 1 | 1 | 2 | 4 | 1 | 3 | 4 |
| GH45 | 1 | 1 | 1 | 1 | 1 | 1 | 3 |
| GH61 | 9 | 14 | 3 | 15 | 2 | 17 | 30 |
| GH74 | 2 | 1 | 1 | 1 | 0 | 1 | 0 |
| CBM1 | 8 | 19 | 15 | 12 | 0 | 22 | 13 |
| Total cellulases | 29 | 45 | 27 | 38 | 5 | 55 | 62 |
Species analyzed included the saprophytes Aspergillus nidulans (Anid), Neurospora crassa (Ncra), and Trichoderma reesei (Trees), and the plant pathogens Fusarium graminearum (Fgram), Mycosphaerella graminicola (Mgram), Magnaporthe oryzae (Moryz), and Stagonospora nodorum (Snod).
Families defined in the Carbohydrate-active enzymes database (www.cazy.org).
GH5 is a family containing many different enzyme activities; only those targeting cellulose are included.
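The "Total cellulases" row is the column-wise sum of the eight family rows. A short sketch that recomputes it from the counts in the table (the variable names are illustrative only):

```python
SPECIES = ["Anid", "Ncra", "Trees", "Fgram", "Mgram", "Moryz", "Snod"]

# Counts copied from the table, one list per CAZy family, in SPECIES order.
COUNTS = {
    "GH5 cellulases": [3, 1, 2, 2, 0, 2, 3],
    "GH6": [2, 3, 1, 1, 0, 3, 4],
    "GH7": [3, 5, 2, 2, 1, 6, 5],
    "GH12": [1, 1, 2, 4, 1, 3, 4],
    "GH45": [1, 1, 1, 1, 1, 1, 3],
    "GH61": [9, 14, 3, 15, 2, 17, 30],
    "GH74": [2, 1, 1, 1, 0, 1, 0],
    "CBM1": [8, 19, 15, 12, 0, 22, 13],
}

# Sum each species column across all families.
totals = {
    species: sum(row[i] for row in COUNTS.values())
    for i, species in enumerate(SPECIES)
}

for species, total in totals.items():
    print(f"{species}: {total}")
```

The recomputed totals match the table, with M. graminicola at 5 against 27 to 62 for the other six species.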