Jonathan K Richards, Nathan A Wyatt, Zhaohui Liu, Justin D Faris, Timothy L Friesen.
Abstract
Parastagonospora nodorum, the causal agent of Septoria nodorum blotch in wheat, has emerged as a model necrotrophic fungal organism for the study of host-microbe interactions. To date, three necrotrophic effectors have been identified and characterized from this pathogen, including SnToxA, SnTox1, and SnTox3. Necrotrophic effector identification was greatly aided by the development of a draft genome of Australian isolate SN15 via Sanger sequencing, yet it remained largely fragmented. This research presents the development of nearly finished genomes of P. nodorum isolates Sn4, Sn2000, and Sn79-1087 using long-read sequencing technology. RNAseq analysis of isolate Sn4, consisting of eight time points covering various developmental and infection stages, mediated the annotation of 13,379 genes. Analysis of these genomes revealed large-scale polymorphism between the three isolates, including the complete absence of contig 23 from isolate Sn79-1087, and a region of genome expansion on contig 10 in isolates Sn4 and Sn2000. Additionally, these genomes exhibit the hallmark characteristics of a "two-speed" genome, being partitioned into two distinct GC-equilibrated and AT-rich compartments. Interestingly, isolate Sn79-1087 contains a lower proportion of AT-rich segments, indicating a potential lack of evolutionary hotspots. These newly sequenced genomes, consisting of telomere-to-telomere assemblies of nearly all 23 P. nodorum chromosomes, provide a robust foundation for the further examination of effector biology and genome evolution.
Keywords: Genome Report; Parastagonospora nodorum; RNAseq; effector; genome sequencing; wheat
Year: 2018 PMID: 29233913 PMCID: PMC5919747 DOI: 10.1534/g3.117.300462
Source DB: PubMed Journal: G3 (Bethesda) ISSN: 2160-1836 Impact factor: 3.154
SMRT sequencing and assembly statistics
| | Sn4 | Sn2000 | Sn79-1087 |
|---|---|---|---|
| Sequencing reads | 485,091 | 366,428 | 354,610 |
| Total sequenced bases | 5,400,955,164 | 5,484,796,333 | 5,359,857,877 |
| Average read length | 11,134 | 14,968 | 15,115 |
| Nuclear contigs | 24 | 24 | 22 |
| Nuclear contigs with both telomeres | 21 | 19 | 20 |
| Nuclear genome (bp) | 37,694,868 | 37,459,375 | 34,991,254 |
| Mitochondrial genome (bp) | 75,092 | 68,589 | 68,282 |
| L50 (contigs) | 9 | 9 | 8 |
| L90 (contigs) | 20 | 20 | 19 |
| N50 (bp) | 1,657,153 | 1,711,973 | 1,583,228 |
| N90 (bp) | 1,090,035 | 1,118,796 | 1,122,469 |
L50: smallest number of contigs whose combined length makes up 50% of the genome assembly.
L90: smallest number of contigs whose combined length makes up 90% of the genome assembly.
N50: length of the smallest contig in the ordered set of largest contigs that together make up 50% of the assembly length.
N90: length of the smallest contig in the ordered set of largest contigs that together make up 90% of the assembly length.
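The contiguity statistics defined above can be illustrated with a minimal Python sketch. It assumes the common convention that contigs are sorted from longest to shortest and accumulated until the running total reaches at least the requested fraction of the assembly. The function name `nx_lx` and the toy contig lengths are hypothetical and are not taken from the Sn4, Sn2000, or Sn79-1087 assemblies.

```python
def nx_lx(contig_lengths, fraction):
    """Return (Nx, Lx) for a list of contig lengths.

    Nx is the length of the shortest contig in the smallest set of
    longest contigs whose combined length reaches `fraction` of the
    assembly; Lx is the number of contigs in that set.
    """
    if not contig_lengths:
        raise ValueError("contig_lengths must not be empty")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")

    # Longest contigs first, as in the table footnotes.
    ordered = sorted(contig_lengths, reverse=True)
    threshold = fraction * sum(ordered)

    running_total = 0
    for count, length in enumerate(ordered, start=1):
        running_total += length
        if running_total >= threshold:
            return length, count

    # Only reachable through floating-point rounding at fraction == 1.
    return ordered[-1], len(ordered)


# Toy assembly of five contigs (illustrative values only, total = 20).
toy_lengths = [8, 5, 4, 2, 1]
n50, l50 = nx_lx(toy_lengths, 0.5)
n90, l90 = nx_lx(toy_lengths, 0.9)
print(f"N50={n50} L50={l50} N90={n90} L90={l90}")
```

In the toy assembly, the two longest contigs (8 + 5 = 13) already cover half of the 20 total units, so L50 is 2 and N50 is 5; four contigs are needed to reach 90%, giving L90 of 4 and N90 of 2.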
Figure 1. Dot plots illustrating whole-genome alignments of P. nodorum isolates Sn4 and Sn2000 (A), Sn4 and Sn79-1087 (B), and Sn2000 and Sn79-1087 (C). Black arrows indicate the complete absence of contig 23 in the genome of isolate Sn79-1087. Gray arrows highlight a large expansion present in contig 10 of isolates Sn4 and Sn2000 compared with Sn79-1087.
Annotated gene properties
| | Sn4 | Sn2000 | Sn79-1087 |
|---|---|---|---|
| Annotated genes | 13,379 | 13,532 | 13,294 |
| Mean gene length (bp) | 1402.0 | 1376.8 | 1384.1 |
| Mean exon count | 2.7 | 2.6 | 2.6 |
| Predicted secreted proteins | 1361 | 1328 | 1247 |
| Predicted effector proteins | 287 | 281 | 237 |
| Conserved Ascomycota orthologs (%) | 97.3 | 97.5 | 97.9 |
Predicted secreted proteins: proteins harboring a predicted signal sequence via SignalP (Petersen ).
Predicted effector proteins: secreted proteins predicted to be effectors via EffectorP (Sperschneider ).
Conserved Ascomycota orthologs: proportion of 1315 conserved Ascomycota orthologous genes present in the annotated gene set as determined via BUSCO (Simão ).
Figure 2. Distribution of GC content in the genomes of Sn4 (A), Sn2000 (B), and Sn79-1087 (C). GC content (%) is illustrated on the x-axis and the proportion of the genome (0–1.00) is shown on the y-axis.
Genomic GC contents
| | Sn4 | Sn2000 | Sn79-1087 |
|---|---|---|---|
| High GC content range (%) | 34.3–100.0 | 34.2–100.0 | 32.7–100.0 |
| High GC content genome proportion (%) | 91.0 | 91.5 | 97.2 |
| High GC content peak (%) | 52.3 | 52.3 | 52.2 |
| Low GC content range (%) | 0.0–34.3 | 0.0–34.2 | 0.0–32.7 |
| Low GC content genome proportion (%) | 9.0 | 8.5 | 2.8 |
| Low GC content peak (%) | 26.0 | 26.3 | 26.4 |
| Gene density in high GC regions (genes/Mbp) | 389.0 | 394 | 391 |
| Gene density in low GC regions (genes/Mbp) | 12.6 | 15.9 | 3.1 |
Genomic GC content attributes as derived from analysis via OcculterCut (Testa ) indicating a two-speed genome.
Range of GC content in the elevated GC regions of the genome.
Proportion of the genome containing a relatively higher GC content.
Peak GC content within the high GC genomic compartment.
Range of GC content in the relatively lower GC regions of the genome.
Proportion of the genome containing a relatively lower GC content.
Peak GC content within the low GC genomic compartment.
Density of annotated genes within the relatively high GC regions of the genome.
Density of annotated genes within the relatively low GC regions of the genome.
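To make the quantities in the GC content table concrete, the following Python sketch computes GC percentage in fixed-size windows, the fraction of windows falling below a cutoff, and gene density in genes/Mbp. This is not OcculterCut's method, which is not described here; fixed windows are a simplifying assumption, and all function names, the window size, and the toy sequence are hypothetical. The 34.3% cutoff is the boundary reported for Sn4 in the table above.

```python
def gc_percent(seq):
    """GC content (%) of a sequence, ignoring non-ACGT symbols such as N."""
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        return None  # window contains no unambiguous bases
    gc = seq.count("G") + seq.count("C")
    return 100.0 * gc / counted


def window_gc(seq, window):
    """GC content (%) of consecutive non-overlapping windows."""
    values = []
    for start in range(0, len(seq), window):
        value = gc_percent(seq[start:start + window])
        if value is not None:
            values.append(value)
    return values


def low_gc_fraction(gc_values, cutoff):
    """Fraction of windows with GC content below `cutoff`.

    This approximates a genome proportion only when all windows
    have the same length.
    """
    if not gc_values:
        raise ValueError("gc_values must not be empty")
    return sum(1 for value in gc_values if value < cutoff) / len(gc_values)


def gene_density_per_mbp(gene_count, region_bp):
    """Gene density expressed as genes per megabase pair."""
    return gene_count / (region_bp / 1_000_000)


# Toy 60 bp sequence: one AT-only, one GC-only, and one mixed window.
toy_seq = "AT" * 10 + "GC" * 10 + "ATGC" * 5
toy_gc = window_gc(toy_seq, window=20)
toy_low = low_gc_fraction(toy_gc, cutoff=34.3)
print(toy_gc, toy_low)
```

On real assemblies the window size strongly affects how sharply the two compartments separate, so any cutoff derived this way should be treated as approximate.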