Antoine Persoons, Emmanuelle Morin, Christine Delaruelle, Thibaut Payen, Fabien Halkett, Pascal Frey, Stéphane De Mita, Sébastien Duplessis.
Abstract
Melampsora larici-populina is a fungal pathogen responsible for foliar rust disease on poplar trees, which causes damage to forest plantations worldwide, particularly in Northern Europe. The reference genome of the isolate 98AG31 was previously sequenced using a whole-genome shotgun strategy, revealing a large genome of 101 megabases containing 16,399 predicted genes, which included secreted protein genes representing poplar rust candidate effectors. In the present study, the genomes of 15 isolates collected over the past 20 years throughout the French territory, representing distinct virulence profiles, were characterized by massively parallel sequencing to assess genetic variation in the poplar rust fungus. Comparison to the reference genome revealed striking structural variations. Analysis of coverage and sequencing depth identified large missing regions between isolates related to the mating type loci. More than 611,824 single-nucleotide polymorphism (SNP) positions were uncovered overall, indicating a remarkable level of polymorphism. Based on the accumulation of non-synonymous substitutions in coding sequences and the relative frequencies of synonymous and non-synonymous polymorphisms (i.e., PN/PS), we identified candidate genes that may be involved in fungal pathogenesis. Correlation between non-synonymous SNPs in genes encoding secreted proteins (SPs) and pathotypes of the studied isolates revealed candidate genes potentially related to virulences 1, 6, and 8 of the poplar rust fungus.
Keywords: Pucciniales; effector; genomics; obligate biotroph; polymorphism; virulence
Year: 2014 PMID: 25309551 PMCID: PMC4164029 DOI: 10.3389/fpls.2014.00450
Source DB: PubMed Journal: Front Plant Sci ISSN: 1664-462X Impact factor: 5.753
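Summary of the Melampsora larici-populina isolates.
| Isolate | Year | Location | Coordinates | Pathotype | Host |
| --- | --- | --- | --- | --- | --- |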
| 93ID6 | 1993 | Champenoux (NE France) | N 48° 45′ 02″, E 06° 20′ 20″ | 3-4 | |
| 02Y5 | 2002 | Charrey-sur-Saône (NE France) | N 47° 05′ 18″, E 05° 09′ 11″ | 2-3-4-7-8 | |
| 09BS12 | 2009 | Mirabeau (SE France) | N 43° 41′ 29″, E 05° 40′ 21″ | 4-6 | |
| 94ZZ15 | 1994 | Saulchoy (N France) | N 50° 21′, E 01° 50′ | 3-4-5-7 | |
| 94ZZ20 | 1994 | Nogent-sur-Vernisson (Central France) | N 47° 50′, E 02° 45′ | 3-4-7 | |
| 08EA47 | 2008 | Prelles (SE France) | N 44° 51′ 00″, E 06° 34′ 47″ | 2-4 | |
| 95XD10 | 1995 | Rogécourt (N France) | N 49° 39′, E 03° 25′ | 3-4-5-7 | |
| 08EA20 | 2008 | Prelles (SE France) | N 44° 51′ 00″, E 06° 34′ 47″ | 4 | |
| 08EA77 | 2008 | Prelles (SE France) | N 44° 51′ 00″, E 06° 34′ 47″ | 4-6 | |
| 97CF1 | 1997 | Champenoux (NE France) | N 48° 45′ 02″, E 06° 20′ 20″ | 3-4-7 | |
| 08KE26 | 2008 | Mirabeau (SE France) | N 43° 41′ 29″, E 05° 40′ 21″ | 4 | |
| 9683B13 | 1996 | Orléans (Central France) | N 47° 49′ 39″, E 01° 54′ 40″ | 1-3-4-5-6-7 | |
| 98AG31 | 1998 | Moy-de-l'Aisne (N France) | N 49° 45′, E 03° 21′ | 3-4-7 | |
| 93JE3 | 1993 | Champenoux (NE France) | N 48° 45′ 02″, E 06° 20′ 20″ | 2-4 | |
| 98AR1 | 1998 | Geraardsbergen (Flanders, Belgium) | N 50° 45′, E 03° 52′ | 1-3-4-5-7-8 | |
Isolate name, year, and location of sampling are indicated. Host indicates the poplar species/cultivar on which the isolate was sampled. The pathotype profile (combination of virulences) was confirmed in triplicate by inoculation on a differential set of poplar cultivars carrying the eight known resistances to M. larici-populina.
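The pathotype column lends itself to a simple lookup of which isolates carry a given virulence, which is the starting point for relating variants to virulences 1, 6, and 8. The sketch below copies the pathotypes from the table above; the function names and the exact-match criterion in `matches_virulence` are illustrative assumptions and are not taken from the study's methods.

```python
# Pathotypes copied from the isolate table (isolate name -> virulence profile).
PATHOTYPES = {
    "93ID6": "3-4", "02Y5": "2-3-4-7-8", "09BS12": "4-6", "94ZZ15": "3-4-5-7",
    "94ZZ20": "3-4-7", "08EA47": "2-4", "95XD10": "3-4-5-7", "08EA20": "4",
    "08EA77": "4-6", "97CF1": "3-4-7", "08KE26": "4", "9683B13": "1-3-4-5-6-7",
    "98AG31": "3-4-7", "93JE3": "2-4", "98AR1": "1-3-4-5-7-8",
}


def virulent_isolates(virulence):
    """Return the set of isolates whose pathotype includes the given virulence."""
    return {
        name
        for name, profile in PATHOTYPES.items()
        if virulence in {int(v) for v in profile.split("-")}
    }


def matches_virulence(carriers, virulence):
    """Hypothetical criterion: a variant 'matches' a virulence when it is
    carried by exactly the isolates that display that virulence."""
    return set(carriers) == virulent_isolates(virulence)


if __name__ == "__main__":
    for v in (1, 6, 8):
        print(v, sorted(virulent_isolates(v)))
```

An exact-match rule is deliberately strict; a real analysis might instead tolerate heterozygous genotypes or missing calls.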
General mapping information for the 15 M. larici-populina isolates.
| 93ID6 | 3,594,455,577 | 2,656,764,147 | 73.9 | 226,296,523 | 84.4 | 26.3 |
| 02Y5 | 3,691,995,994 | 3,218,997,193 | 87.2 | 269,105,383 | 85.4 | 31.8 |
| 09BS12 | 6,230,429,688 | 4,717,557,005 | 75.7 | 479,213,815 | 84.2 | 46.6 |
| 94ZZ15 | 3,653,741,644 | 3,290,238,877 | 90.1 | 278,395,986 | 85.3 | 32.5 |
| 94ZZ20 | 3,387,309,786 | 3,045,158,939 | 89.9 | 253,470,401 | 85.2 | 30.1 |
| 08EA47 | 4,659,300,813 | 3,460,505,640 | 74.3 | 352,258,523 | 83.3 | 34.2 |
| 95XD10 | 4,701,407,950 | 3,993,529,488 | 84.9 | 396,812,163 | 83.7 | 39.5 |
| 08EA20 | 4,829,802,826 | 3,034,419,164 | 62.8 | 290,972,918 | 83.2 | 30.0 |
| 08EA77 | 4,259,571,919 | 3,840,082,037 | 90.2 | 340,127,111 | 84.7 | 38.0 |
| 97CF1 | 3,570,560,916 | 3,083,826,749 | 86.4 | 270,564,864 | 84.5 | 30.5 |
| 08KE26 | 5,407,393,523 | 4,626,803,739 | 85.6 | 434,085,871 | 85.0 | 45.8 |
| 9683B13 | 6,378,404,736 | 2,537,206,558 | 39.8 | 223,243,679 | 83.1 | 25.1 |
| 98AG31 | 2,779,485,081 | 2,529,716,573 | 91.0 | 218,294,868 | 85.2 | 25.0 |
| 93JE3 | 4,310,048,066 | 2,796,054,256 | 64.9 | 258,175,513 | 84.1 | 27.6 |
| 98AR1 | 2,562,143,464 | 2,227,530,892 | 86.9 | na | 76.0 | 22.0 |
Illumina reads of each genome were mapped onto the 98AG31 JGI reference genome. na, not applicable.
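The percentage and depth values in this table appear to follow from the first two numeric columns. The sketch below assumes, since the column headers are not preserved here, that those columns are total and mapped nucleotides, and it uses the approximate 101 Mb genome size quoted in the abstract; both are assumptions, so the depth only approximates the tabulated value.

```python
# Approximate reference genome size from the abstract (101 megabases).
# Treating it as the denominator for depth is an assumption of this sketch.
GENOME_SIZE_BP = 101_000_000


def mapping_summary(total_bases, mapped_bases, genome_size=GENOME_SIZE_BP):
    """Return (percent of bases mapped, average sequencing depth)."""
    if total_bases <= 0 or genome_size <= 0:
        raise ValueError("total_bases and genome_size must be positive")
    percent_mapped = 100.0 * mapped_bases / total_bases
    average_depth = mapped_bases / genome_size
    return percent_mapped, average_depth


if __name__ == "__main__":
    # Values for isolate 93ID6, copied from the first row of the table.
    pct, depth = mapping_summary(3_594_455_577, 2_656_764_147)
    print(f"{pct:.1f}% mapped, about {depth:.1f}x depth")
```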
Figure 1. Patterns of sequencing depth along scaffold 90 in 15 M. larici-populina isolates. Illumina reads from 15 isolates were mapped onto the 98AG31 reference genome. Scaffold 90 is presented here to illustrate distinct patterns of sequencing depth between groups of isolates: pattern A (red box) with coverage and sequencing depth similar to 98AG31, pattern B (green box) presenting four regions of lower coverage, and pattern C (orange box) with overall reduced coverage. Graphical outputs in blue represent the local sequencing depth along scaffold 90, normalized to the maximum depth measured in each isolate. Average coverage and sequencing depth are detailed for each isolate on the right. The bars below represent scaffold 90 from the JGI reference genome website (red blocks indicate gaps) and predicted gene models (38 in total). Scale in nucleotides is presented at the top. The total scaffold length is 319,043 bp.
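The normalization described in the caption, where local depth is scaled to the maximum depth measured in each isolate, can be sketched as follows. The windowing helper and its window size are assumptions added for illustration; the caption does not say how depth was binned.

```python
def window_means(depths, window):
    """Average depth in consecutive, non-overlapping windows (hypothetical binning)."""
    if window <= 0:
        raise ValueError("window must be positive")
    means = []
    for start in range(0, len(depths), window):
        chunk = depths[start:start + window]
        means.append(sum(chunk) / len(chunk))
    return means


def normalize_depth(depths):
    """Scale depths to the isolate's own maximum, giving values in [0, 1]."""
    peak = max(depths, default=0)
    if peak == 0:
        return [0.0 for _ in depths]
    return [d / peak for d in depths]


if __name__ == "__main__":
    raw = [10, 20, 40, 40, 0, 5]  # made-up per-position depths
    print(normalize_depth(window_means(raw, 2)))
```

Because each isolate is scaled to its own maximum, the profiles compare the shape of the depth curve rather than absolute depth.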
Genomic variants identified in 15 M. larici-populina isolates.
| 93ID6 | 84,849 | 88,855 | 3534 | 4198 | 3302 | 162,670 | 173,704 | 179,274 |
| 02Y5 | 76,511 | 95,418 | 3514 | 4399 | 3348 | 160,668 | 171,929 | 177,658 |
| 09BS12 | 91,934 | 54,500 | 3170 | 4020 | 2835 | 136,409 | 146,434 | 151,298 |
| 94ZZ15 | 84,155 | 82,478 | 3485 | 4287 | 3160 | 155,701 | 166,633 | 172,001 |
| 94ZZ20 | 80,851 | 80,541 | 3385 | 4085 | 3002 | 150,920 | 161,392 | 166,613 |
| 08EA47 | 85,423 | 75,527 | 3435 | 4158 | 3026 | 150,331 | 160,950 | 166,117 |
| 95XD10 | 68,735 | 87,520 | 2909 | 3554 | 2903 | 146,889 | 156,255 | 160,886 |
| 08EA20 | 90,268 | 91,000 | 3723 | 4354 | 3469 | 169,722 | 181,268 | 187,146 |
| 08EA77 | 89,765 | 83,569 | 3599 | 4275 | 3222 | 162,238 | 173,334 | 178,887 |
| 97CF1 | 75,954 | 76,585 | 3061 | 3902 | 2958 | 142,618 | 152,539 | 157,525 |
| 08KE26 | 102,244 | 55,022 | 3578 | 4268 | 3100 | 146,320 | 157,266 | 162,670 |
| 9683B13 | 70,974 | 82,208 | 3004 | 3708 | 2866 | 143,604 | 153,182 | 157,967 |
| 98AG31 | 14,219 | 78,970 | 1626 | 2945 | 1741 | 86,877 | 93,189 | 96,099 |
| 93JE3 | 91,933 | 75,793 | 3277 | 3938 | 3182 | 157,329 | 167,726 | 172,951 |
| 98AR1 | 77,267 | 88,799 | 3315 | 3932 | 3130 | 155,689 | 166,066 | 170,921 |
MNV, Multiple Nucleotide Variant; SNV, Single Nucleotide Variant (i.e., Single Nucleotide Polymorphism).
Analysis of polymorphism in 15 M. larici-populina isolates.
| 93ID6 | 2.30 | 33,428 | 16,489 | 112,753 | 15,950 | 5.0 | 5.7 | 18.3 | 20.5 |
| 02Y5 | 2.30 | 32,904 | 16,325 | 111,439 | 15,553 | 5.5 | 5.4 | 15.9 | 20.5 |
| 09BS12 | 2.34 | 26,086 | 13,365 | 96,958 | 12,905 | 5.2 | 5.6 | 16.8 | 19.1 |
| 94ZZ15 | 2.29 | 31,938 | 16,056 | 107,707 | 15,252 | 6.2 | 5.7 | 17.2 | 20.5 |
| 94ZZ20 | 2.29 | 31,035 | 15,461 | 104,424 | 14,859 | 5.2 | 5.2 | 17.7 | 20.6 |
| 08EA47 | 2.30 | 29,848 | 14,986 | 105,497 | 14,493 | 5.3 | 5.5 | 17.3 | 19.9 |
| 95XD10 | 2.42 | 29,932 | 14,817 | 101,940 | 15,950 | 4.5 | 4.9 | 14.8 | 20.4 |
| 08EA20 | 2.30 | 35,069 | 17,230 | 117,423 | 16,911 | 5.3 | 5.4 | 18.0 | 20.7 |
| 08EA77 | 2.35 | 32,152 | 16,383 | 113,703 | 15,653 | 5.4 | 5.4 | 17.1 | 19.8 |
| 97CF1 | 2.33 | 29,566 | 14,649 | 98,403 | 14,218 | 5.7 | 6.1 | 17.9 | 20.7 |
| 08KE26 | 2.36 | 27,137 | 13,886 | 105,297 | 13,862 | 5.3 | 5.6 | 16.5 | 18.5 |
| 9683B13 | 2.33 | 29,776 | 14,719 | 99,109 | 14,442 | 5.1 | 5.5 | 17.7 | 20.7 |
| 98AG31 | 2.27 | 18,749 | 9335 | 58,793 | 8825 | 6.6 | 5.8 | 19.9 | 21.6 |
| 93JE3 | 2.36 | 32,155 | 15,684 | 109,490 | 15,651 | 4.9 | 5.4 | 18.0 | 20.4 |
| 98AR1 | 2.21 | 31,389 | 15,352 | 108,948 | 14,441 | 4.7 | 5.2 | 17.0 | 20.2 |
CDS, Coding DNA sequence; Tr/Tv, ratio of transitions to transversions; MNV, Multiple Nucleotide Variant; SNV/SNP, Single Nucleotide Variant/Polymorphism.
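As an illustration of the Tr/Tv statistic defined in the footnote, the sketch below counts transitions (purine to purine, or pyrimidine to pyrimidine) and transversions in a list of single-nucleotide variants. The input format, a list of (reference, alternate) base pairs, is an assumption; the variant-calling software used in the study may report this ratio directly.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def is_transition(ref, alt):
    """True for A<->G or C<->T substitutions."""
    pair = {ref.upper(), alt.upper()}
    return len(pair) == 2 and (pair <= PURINES or pair <= PYRIMIDINES)


def tr_tv_ratio(snvs):
    """Ratio of transitions to transversions over (ref, alt) base pairs."""
    transitions = 0
    transversions = 0
    for ref, alt in snvs:
        if ref.upper() == alt.upper():
            continue  # not a substitution; ignore
        if is_transition(ref, alt):
            transitions += 1
        else:
            transversions += 1
    if transversions == 0:
        raise ValueError("no transversions: ratio is undefined")
    return transitions / transversions


if __name__ == "__main__":
    print(tr_tv_ratio([("A", "G"), ("C", "T"), ("A", "C")]))
```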
Top 30 genes accumulating non-synonymous (NS) single-nucleotide polymorphisms (SNPs).
| 66139 | 5273 | 15819 | 227 | 66 | AAA+ ATPase | 0003677 | 1808 |
| 84101 | 1325 | 3975 | 95 | 57 | Hypothetical protein | No hit | No hit |
| 93626 | 1737 | 5211 | 82 | 54 | Hypothetical protein | No hit | No hit |
| 62079 | 1821 | 5463 | 73 | 47 | Hypothetical protein, telomere-length maintenance and DNA damage repair domain | 0001584 | No hit |
| 106057 | 2195 | 6585 | 136 | 45 | Hypothetical protein, NAM-like protein C-terminal domain | No hit | No hit |
| 92944 | 1135 | 3405 | 71 | 45 | Hypothetical protein, DNA breaking-rejoining enzymes, C-terminal catalytic domain | No hit | No hit |
| 95670 | 893 | 2679 | 87 | 45 | Hypothetical protein | No hit | 1187 |
| 66458 | 929 | 2787 | 55 | 45 | Hypothetical protein | No hit | 1245 |
| 70222 | 1542 | 4626 | 73 | 44 | DEAD-like helicase superfamily | No hit | 0351 |
| 101154 | 1470 | 4410 | 79 | 44 | Hypothetical protein | No hit | No hit |
| 114610 | 948 | 2844 | 91 | 41 | Hypothetical protein | No hit | No hit |
| 85441 | 1256 | 3768 | 56 | 40 | Hypothetical protein | No hit | 0714 |
| 92226 | 1393 | 4179 | 59 | 38 | Hypothetical protein | No hit | No hit |
| 67208 | 1203 | 3609 | 76 | 37 | Hypothetical protein | No hit | 1015 |
| 108793 | 931 | 2793 | 54 | 37 | Hypothetical protein | No hit | No hit |
| 96388 | 1344 | 4032 | 63 | 36 | Hypothetical protein | No hit | No hit |
| 108574 | 2851 | 8553 | 114 | 35 | Hypothetical protein, down-regulated in metastasis domain | No hit | No hit |
| 91870 | 1131 | 3393 | 72 | 35 | Hypothetical protein, alpha kinase domain family | 0004674 | 3614 |
| 118268 | 1649 | 4947 | 108 | 34 | Hypothetical protein, sister-chromatid cohesion C-terminus domain | 0006520 | No hit |
| 68278 | 1507 | 4521 | 54 | 34 | Hypothetical protein | No hit | 4475 |
| 65221 | 568 | 1704 | 44 | 34 | Hypothetical protein | No hit | No hit |
| 91258 | 771 | 2313 | 51 | 33 | Hypothetical protein, GCM transcription factor family motif | No hit | 2992 |
| 88323 | 575 | 1725 | 55 | 33 | Hypothetical protein | No hit | No hit |
| 60895 | 698 | 2094 | 58 | 33 | Hypothetical protein | No hit | 2992 |
| 84177 | 639 | 1917 | 57 | 33 | Hypothetical protein | No hit | No hit |
| 92190 | 551 | 1653 | 52 | 33 | Hypothetical protein | 0006306 | No hit |
| 101664 | 1102 | 3306 | 63 | 32 | Hypothetical protein | No hit | No hit |
| 95815 | 1486 | 4458 | 46 | 32 | Hypothetical protein | No hit | 1245 |
| 107058 | 720 | 2160 | 51 | 32 | Hypothetical protein | No hit | No hit |
| 64441 | 1107 | 3321 | 45 | 31 | Hypothetical protein | No hit | No hit |
Protein ID number, Eukaryotic Orthologous Group (KOG) and Gene Ontology (GO) annotations were retrieved from the 98AG31 reference genome at the Joint Genome Institute MycoCosm website.
Figure 2. Functional categories over-represented among genes exhibiting five non-synonymous polymorphisms or more. Percentages of genes falling in the different KOG categories among genes exhibiting five non-synonymous polymorphisms or more (NS ≥ 5) relative to the global gene distribution are shown. Black and white bars correspond to selected NS ≥ 5 genes and all genes, respectively. The category “No hits” corresponding to genes with no KOG annotation (~75% in both sets) is not represented on the graph to facilitate visualization of other categories. Significantly over-represented KOG categories are indicated by asterisks (Fisher's exact test, p < 0.05).
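The over-representation test named in the caption can be sketched with the standard library alone. The function below computes a one-sided Fisher's exact p-value from a 2×2 table of gene counts; the caption does not state whether the test was one- or two-sided, so the one-sided form is an assumption, as are the argument names.

```python
from math import comb


def fisher_over_representation(sel_in, sel_out, rest_in, rest_out):
    """One-sided Fisher's exact p-value for enrichment of a category.

    sel_in / sel_out:   selected genes (e.g. NS >= 5) inside / outside the category
    rest_in / rest_out: remaining genes inside / outside the category
    """
    total = sel_in + sel_out + rest_in + rest_out
    in_category = sel_in + rest_in
    selected = sel_in + sel_out
    denominator = comb(total, selected)
    upper = min(in_category, selected)
    # Hypergeometric tail: probability of seeing sel_in or more category
    # genes among the selected genes by chance.
    tail = sum(
        comb(in_category, k) * comb(total - in_category, selected - k)
        for k in range(sel_in, upper + 1)
    )
    return tail / denominator


if __name__ == "__main__":
    # Made-up counts, for illustration only.
    print(fisher_over_representation(3, 1, 1, 3))
```

With many KOG categories tested at once, a multiple-testing correction would normally be considered; the caption reports only the p < 0.05 threshold.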
Top 30 genes encoding secreted proteins accumulating non-synonymous SNPs/Kb.
| 124497 | 77 | 231 | 5 | 5 | 21.6 | hypothetical secreted protein of 8 kDa | No hit | No hit |
| 124050 | 151 | 453 | 13 | 9 | 19.9 | hypothetical secreted protein of 17 kDa | No hit | No hit |
| 124361 | 88 | 264 | 5 | 5 | 18.9 | hypothetical secreted protein of 9 kDa | No hit | No hit |
| 109910 | 230 | 690 | 17 | 13 | 18.8 | hypothetical secreted protein | No hit | No hit |
| 123541 | 75 | 225 | 6 | 4 | 17.8 | hypothetical secreted protein of 8 kDa | No hit | No hit |
| 123852 | 135 | 405 | 55 | 7 | 17.3 | hypothetical secreted protein of 15 kDa | No hit | No hit |
| 104907 | 117 | 351 | 6 | 6 | 17.1 | hypothetical secreted protein | 1245 | No hit |
| 123868 | 139 | 417 | 15 | 7 | 16.8 | hypothetical secreted protein of 15 kDa | No hit | No hit |
| 66458 | 929 | 2787 | 55 | 45 | 16.1 | hypothetical secreted protein | No hit | No hit |
| 103402 | 151 | 453 | 15 | 7 | 15.5 | hypothetical secreted protein | No hit | No hit |
| 101262 | 131 | 393 | 18 | 6 | 15.3 | hypothetical secreted protein | No hit | No hit |
| 124304 | 200 | 600 | 10 | 9 | 15.0 | hypothetical secreted protein of 22 kDa | No hit | No hit |
| 107425 | 268 | 804 | 28 | 12 | 14.9 | hypothetical secreted protein | No hit | No hit |
| 124511 | 67 | 201 | 3 | 3 | 14.9 | hypothetical secreted protein of 7 kDa | No hit | No hit |
| 124264 | 90 | 270 | 5 | 4 | 14.8 | hypothetical secreted protein of 10 kDa | No hit | 9055 |
| 107508 | 720 | 2160 | 51 | 32 | 14.8 | hypothetical secreted protein | No hit | No hit |
| 124351 | 92 | 276 | 7 | 4 | 14.5 | hypothetical secreted protein of 10 kDa | No hit | No hit |
| 95362 | 301 | 903 | 18 | 13 | 14.4 | hypothetical secreted protein | No hit | No hit |
| 64885 | 188 | 564 | 23 | 8 | 14.2 | hypothetical secreted protein of 21 kDa | No hit | No hit |
| 58423 | 142 | 426 | 10 | 6 | 14.1 | hypothetical secreted protein of 14 kDa | No hit | No hit |
| 124524 | 71 | 213 | 3 | 3 | 14.1 | hypothetical secreted protein of 8 kDa | No hit | No hit |
| 63656 | 315 | 945 | 22 | 13 | 13.8 | hypothetical secreted protein | No hit | No hit |
| 70838 | 97 | 291 | 9 | 4 | 13.7 | hypothetical secreted protein of 10 kDa | No hit | No hit |
| 123559 | 146 | 438 | 10 | 6 | 13.7 | hypothetical secreted protein of 16 kDa | No hit | No hit |
| 61241 | 392 | 1176 | 39 | 16 | 13.6 | hypothetical secreted protein, PLECKSTRIN homology domain | No hit | No hit |
| 68348 | 247 | 741 | 18 | 10 | 13.5 | hypothetical secreted protein | No hit | No hit |
| 123552 | 150 | 450 | 12 | 6 | 13.3 | hypothetical secreted protein of 17 kDa | No hit | No hit |
| 124134 | 125 | 375 | 14 | 5 | 13.3 | hypothetical secreted protein of 14 kDa | No hit | No hit |
| 108793 | 931 | 2793 | 54 | 37 | 13.2 | hypothetical secreted protein | No hit | No hit |
| 36743 | 179 | 537 | 8 | 7 | 13.0 | hypothetical secreted protein of 21 kDa, peptidase M, neutral zinc metallopeptidase | No hit | 8237 |
Protein ID number, Eukaryotic Orthologous Group (KOG) and Gene Ontology (GO) annotations were retrieved from the 98AG31 reference genome at the Joint Genome Institute MycoCosm website.
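The ranking metric in this table, non-synonymous SNPs per kb, appears to be the NS SNP count divided by the coding length in kilobases. The sketch below assumes, since the column headers are not preserved here, that the third column is the coding length in nucleotides and the fifth is the NS SNP count; under that reading it reproduces the tabulated values for the rows shown.

```python
def ns_snps_per_kb(ns_snps, cds_length_nt):
    """Non-synonymous SNPs per kilobase of coding sequence."""
    if cds_length_nt <= 0:
        raise ValueError("cds_length_nt must be positive")
    return 1000.0 * ns_snps / cds_length_nt


if __name__ == "__main__":
    # (protein ID, NS SNPs, length in nt), copied from rows of the table.
    rows = [(124497, 5, 231), (124050, 9, 453), (66458, 45, 2787)]
    for protein_id, ns, length in rows:
        print(protein_id, round(ns_snps_per_kb(ns, length), 1))
```

Normalizing by length explains why short secreted proteins of 7 to 22 kDa dominate this table, whereas the ranking by raw NS SNP count favors long genes.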
Figure 3. Distribution of PN/PS ratios. Ratios of non-synonymous to synonymous polymorphisms (PN/PS) between 0 and 1 are shown for SP genes and non-SP genes. The inset shows the distribution of genes with PN/PS > 1. Numbers of non-SP genes were divided by 5 for representation. Note the different scales of the y-axes in the main figure and the inset.
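A per-gene ratio of this kind can be sketched as below. The definition used here, non-synonymous polymorphisms per non-synonymous site divided by synonymous polymorphisms per synonymous site, is a common convention and an assumption of this sketch; the exact estimator used in the study is not given in this record.

```python
def pn_ps(n_nonsyn, n_syn, nonsyn_sites, syn_sites):
    """Per-site ratio of non-synonymous to synonymous polymorphism.

    Returns None when no synonymous polymorphism is observed, because the
    ratio is then undefined.
    """
    if nonsyn_sites <= 0 or syn_sites <= 0:
        raise ValueError("site counts must be positive")
    if n_syn == 0:
        return None
    pn = n_nonsyn / nonsyn_sites  # non-synonymous polymorphisms per site
    ps = n_syn / syn_sites        # synonymous polymorphisms per site
    return pn / ps


if __name__ == "__main__":
    # Made-up counts for a single gene, for illustration only.
    print(pn_ps(n_nonsyn=6, n_syn=4, nonsyn_sites=300, syn_sites=100))
```

Values above 1 indicate proportionally more non-synonymous than synonymous polymorphism, which is why those genes are plotted separately.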
Figure 4. Conservation protein profile of CPG5464 family members and AvrP4 homologs. The profile was designed using WebLogo with 40 sequences corresponding to the 12 members in the CPG5464 family (Hacquard et al., 2012), six variants deduced from the 15 genomes sequenced in this study, 22 AvrP4 homologs sequenced from 9 Melampsora spp. (Van der Merwe et al., 2009) and 16 Melampsora lini AvrP4 variants. The predicted signal peptide and K/R and D/E rich regions previously shown in Hacquard et al. (2012) are depicted on the profile. Green arrows point to sites under selection in Barrett et al. (2009). Red arrows point to sites of substitution observed in M. larici-populina variants. Asterisks in the red box indicate amino acids under positive selection in Van der Merwe et al. (2009) and asterisks in the blue box indicate amino acids under positive selection in Hacquard et al. (2012).