Fabrizio Menardo, Coraline R Praz, Thomas Wicker, Beat Keller.
Abstract
BACKGROUND: Grass powdery mildew (Blumeria graminis, Ascomycota) is a major pathogen of cereal crops and has become a model organism for obligate biotrophic fungal pathogens of plants. The sequenced genomes of two formae speciales (ff.spp.), B.g. hordei and B.g. tritici (pathogens of barley and wheat), were found to be enriched in candidate effector genes (CEGs). Similar to other filamentous pathogens, CEGs in B. graminis are under positive selection. Additionally, effectors are more likely to have presence-absence polymorphisms than other genes among different strains.
Keywords: Blumeria graminis; Effectors; Powdery mildew
Year: 2017 PMID: 29089018 PMCID: PMC5664452 DOI: 10.1186/s12862-017-1064-2
Source DB: PubMed Journal: BMC Evol Biol ISSN: 1471-2148 Impact factor: 3.260
Fig. 1 Phylogenetic tree of lineages of B. graminis. Simplified phylogenetic tree of the five lineages of B. graminis used in this study (modified from Menardo et al. [22]). The median estimation for the divergence time is reported at each bifurcation of the tree. B. graminis growing on Lolium has a different notation because it was never formally designated as a f.sp.
Basic statistics for the three newly assembled Bg genomes
| Lineage | Number of contigs | Assembled size (bp) | N50 (bp) |
|---|---|---|---|
| | 28,511 | 52,497,740 | 5645 |
| | 51,877 | 67,455,197 | 4840 |
| | 74,353 | 86,469,665 | 3777 |
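
The N50 column can be read as the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the assembled size. A minimal sketch of that calculation follows; the function name `n50` and the toy contig lengths are illustrative assumptions, not data or code from the study.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths (in bp).

    Contigs are visited from longest to shortest; the N50 is the length
    of the contig at which the running sum first reaches half of the
    total assembled size.
    """
    if not contig_lengths:
        raise ValueError("need at least one contig length")
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Toy contig lengths (hypothetical, not taken from the assemblies above).
toy_contigs = [10, 8, 6, 4, 2]
print(n50(toy_contigs))  # 10 + 8 = 18 covers at least half of 30, so N50 is 8
```

With a lower N50, a larger share of the assembly sits in short contigs, which is consistent with the table: the assembly with the most contigs also has the smallest N50.
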
Fig. 2 Bioinformatic pipeline for the annotation of B. graminis genome assemblies
Results of gene annotation and effector identification in B. graminis
| Taxa | Annotated genes | Percentage of core eukaryotic genes annotated | Annotated effectors | Proportion of effectors in gene set |
|---|---|---|---|---|
| | 7073 | 96.1% | 734 | 10.4% |
| | 6949 | 96.7% | 722 | 10.4% |
| | 6322 | 96.5% | 362 | 5.7% |
| | 6391 | 96.5% | 408 | 6.3% |
| | 6575 | 96.7% | 572 | 8.7% |
| | 9733 | 98.6% | – | – |
| | 10,604 | 99.8% | – | – |
The 20 largest families of effectors in B. graminis
| ID (a) | Total (b) | B.g. tritici (c) | B.g. hordei (d) | B.g. avenae (e) | B.g. on Lolium (f) | B.g. poae (g) | SignalP (h) | Length (aa) (i) | Protein domains (j) | Fast evolution (k) | λ (l) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 609 | 161 | 151 | 113 | 69 | 115 | 127 | 288 | | * | 80.187 |
| 2 | 141 | 30 | 37 | 22 | 11 | 41 | 112 | 153 | | * | 14.130 |
| 3 | 131 | 9 | 5 | 12 | 6 | 99 | 49 | 317 | 5 microbial_RNases | * | 1.901 |
| 4 | 103 | 23 | 50 | 7 | 8 | 15 | 74 | 332 | 16 microbial_RNases | * | 0.062 |
| 5 | 103 | 29 | 29 | 14 | 12 | 19 | 60 | 323 | 23 microbial_RNases | * | 0.048 |
| 6 | 101 | 47 | 40 | 6 | 8 | 0 | 95 | 114 | | * | 2.298 |
| 7 | 91 | 9 | 21 | 10 | 11 | 40 | 61 | 145 | | * | 16.931 |
| 8 | 68 | 19 | 14 | 11 | 20 | 4 | 38 | 331 | 12 microbial_RNases | * | 0.105 |
| 9 | 63 | 16 | 15 | 8 | 8 | 16 | 7 | 253 | | | 0.027 |
| 10 | 62 | 25 | 12 | 8 | 8 | 9 | 44 | 133 | | * | 0.095 |
| 11 | 52 | 18 | 17 | 6 | 7 | 4 | 45 | 114 | 2 TROVE | * | 0.073 |
| 12 | 50 | 4 | 18 | 6 | 7 | 15 | 20 | 133 | | * | 0.132 |
| 13 | 44 | 19 | 10 | 6 | 8 | 1 | 14 | 262 | | * | 0.108 |
| 14 | 37 | 10 | 5 | 4 | 6 | 12 | 27 | 178 | 4 microbial_RNases | | 0.045 |
| 15 | 33 | 4 | 6 | 5 | 6 | 12 | 18 | 312 | 9 microbial_RNases | | 0.095 |
| 16 | 31 | 15 | 7 | 2 | 4 | 3 | 29 | 150 | | * | 0.170 |
| 17 | 30 | 6 | 9 | 3 | 4 | 8 | 28 | 175 | 6 ML | * | 0.029 |
| 18 | 27 | 10 | 5 | 3 | 4 | 5 | 25 | 157 | | * | 0.037 |
| 19 | 26 | 3 | 10 | 5 | 6 | 2 | 18 | 169 | | * | 0.069 |
| 20 | 26 | 3 | 4 | 2 | 3 | 14 | 11 | 181 | | | 0.023 |
(a) Family identifier
(b) Total number of genes in the family
(c) Number of family members in B.g. tritici
(d) Number of family members in B.g. hordei
(e) Number of family members in B.g. avenae
(f) Number of family members in B.g. infecting Lolium
(g) Number of family members in B.g. poae
(h) Number of genes in the family with predicted signal peptide (SignalP)
(i) Average length of the protein sequences
(j) Conserved protein domains found in the NCBI CDD (only domains found in at least 2 genes are reported)
(k) Families that evolved significantly differently (p-value < 0.01) compared to the null model inferred over all non-effector gene families
(l) Turnover rate (average number of duplications and disruptions that a gene undergoes in a million years)
Fig. 3 Alignment of family 17 with the ML domain. Alignment of nine representative protein sequences of family 17 with nine representative sequences of the ML domain from different species (complete alignment in Additional file 3). The alignment starts 20 amino acids after the signal peptide and covers the entire protein length of both effector proteins and ML domain sequences. Two cysteines and a glycine are perfectly conserved (indicated with asterisks). Moreover, the spacing between hydrophobic amino acids is conserved throughout the alignment. B. graminis genes are named according to their f.sp. (Bg_POA = B.g. poae; Bgh = B.g. hordei; Bgt = B.g. tritici; Bg_AVE = B.g. avenae; Bg_LOL = B.g. infecting Lolium). The members of the ML domain are named with their Uniprot ID: GENENAME_SPECIES (ARATH = Arabidopsis thaliana; VITVI = Vitis vinifera; USTMA = Ustilago maydis; CRYNJ = Cryptococcus neoformans; PICGU = Meyerozyma guilliermondii; LODEL = Lodderomyces elongisporus; CANAL = Candida albicans; PICST = Scheffersomyces stipitis)
Fig. 4 Phylogenetic tree of effector gene family 8. (a) Simplified phylogenetic tree of the five lineages of B. graminis used in this study (modified from Menardo et al. [22]). The median estimation for the divergence time is reported at each bifurcation of the tree. (b) Maximum likelihood tree of CEG family 8. Branches are colored according to the lineage of B. graminis to which the respective effector gene belongs. The species tree with the color code is represented in panel a. Branch labels report the bootstrap support for the clade inferred with 1000 replications. The scale is in expected amino-acid substitutions per site. The lowest part of the tree contains one of the few effector genes in B. graminis for which we found a single copy in all lineages and for which the gene tree is concordant with the species tree. Two large lineage-specific expansions (18 genes in B.g. on Lolium and 9 genes in B.g. poae) are present in the upper part of the figure
Fig. 5 Phylogenetic tree of effector gene family 3. (a) Simplified phylogenetic tree of the five lineages of B. graminis used in this study (modified from Menardo et al. [22]). The median estimation for the divergence time is reported at each bifurcation of the tree. (b) Maximum likelihood tree of CEG family 3. Branches are colored according to the lineage of B. graminis to which the respective effector gene belongs. The species tree with the color code is represented in panel a. Branch labels report the bootstrap support for the clade inferred with 1000 replications. The scale is in expected amino-acid substitutions per site. This family contains the largest lineage-specific expansion: a clade composed of 90 genes found in the B.g. poae genome
Fig. 6 Phylogenetic tree of a transporter gene family. (a) Simplified phylogenetic tree of the five lineages of B. graminis used in this study (modified from Menardo et al. [22]). The median estimation for the divergence time is reported at each bifurcation of the tree. (b) Maximum likelihood tree of a major facilitator family (transmembrane transporters). This family was randomly picked among non-effector gene families to illustrate how the evolution of non-CEG families differs from that of CEG families. Branches are colored according to the lineage of B. graminis to which the respective gene belongs. The species tree with the color code is represented in panel a. Branch labels report the bootstrap support for the clade inferred with 1000 replications. The scale is in expected amino-acid substitutions per site. In this family, the average number of substitutions after divergence of the different lineages is much lower than in CEG families, as shown by the short terminal branches; the higher number in CEG families could be explained by positive selection that fixes non-synonymous substitutions. Moreover, nine different genes are clearly recognizable. For five of them there is one copy in each lineage, and for two of them the gene tree is concordant with the species tree. This contrasts with effector gene families, where it is often impossible to identify orthologous genes in the different lineages
Fig. 7 Most parsimonious reconciliation cost (log10) in CEG families and non-CEG families. Most CEG families have a higher reconciliation cost (the number of gene duplications, losses and transfers weighted by their cost) compared to other gene families
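
The reconciliation cost described in the caption above is a weighted sum of event counts. The sketch below illustrates that sum and the log10 transform used for plotting. The per-event weights (`dup_cost`, `transfer_cost`, `loss_cost`) and the event counts are hypothetical placeholders chosen for illustration; the source does not state which values or which reconciliation tool were used.

```python
import math


def reconciliation_cost(duplications, transfers, losses,
                        dup_cost=2.0, transfer_cost=3.0, loss_cost=1.0):
    """Weighted sum of duplication, transfer and loss events.

    The default weights are hypothetical placeholders, not the costs
    used in the study.
    """
    return (duplications * dup_cost
            + transfers * transfer_cost
            + losses * loss_cost)


# Hypothetical event counts for a single gene family.
cost = reconciliation_cost(duplications=3, transfers=1, losses=4)
print(cost)              # 3*2 + 1*3 + 4*1 = 13.0
print(math.log10(cost))  # value that would be plotted on a log10 axis
```

A family whose gene tree matches the species tree exactly has a cost of zero, for which log10 is undefined, so such families would need separate handling before plotting.
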
Fig. 8 Turnover rate (log10(λ)) in CEG families and non-CEG families. Most CEG families have a higher turnover rate compared to other gene families
Fig. 9 Transposable element composition of sequences surrounding B. graminis genes. Genes were separated into CEGs and non-CEGs. The 5 kb upstream and downstream of the predicted start and end point of the CDS were divided into 10 sequence bins. For each bin, average TE composition was determined across all sequences in the dataset
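
The binning described in the Fig. 9 caption can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: each flanking region is represented as a per-base 0/1 mask (1 = base annotated as TE), each flank is binned separately into 10 equal bins, and the function names are hypothetical. The toy flanks are 20 bases long to keep the example readable; a 5 kb flank would give 500 bp bins.

```python
def te_fraction_per_bin(te_mask, n_bins=10):
    """Fraction of TE-annotated bases in each of n_bins equal-sized bins.

    te_mask is a sequence of 0/1 flags, one per base of a flanking region.
    """
    if len(te_mask) == 0 or len(te_mask) % n_bins != 0:
        raise ValueError("mask length must be a positive multiple of n_bins")
    size = len(te_mask) // n_bins
    return [sum(te_mask[i * size:(i + 1) * size]) / size
            for i in range(n_bins)]


def average_profile(masks, n_bins=10):
    """Average the per-bin TE fractions across all flanking regions."""
    profiles = [te_fraction_per_bin(mask, n_bins) for mask in masks]
    return [sum(column) / len(profiles) for column in zip(*profiles)]


# Two toy flanking regions (hypothetical data, 20 bases each).
flank_a = [1, 1] + [0] * 18  # first bin fully covered by a TE
flank_b = [1, 0] + [0] * 18  # first bin half covered by a TE
print(average_profile([flank_a, flank_b]))  # first bin averages to 0.75
```

Running `average_profile` once on the flanks of CEGs and once on those of non-CEGs would give the two profiles that the figure compares.
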