Scott Bringans, James K Hane, Tammy Casey, Kar-Chun Tan, Richard Lipscombe, Peter S Solomon, Richard P Oliver.
Abstract
BACKGROUND: Stagonospora nodorum, a fungal ascomycete in the class Dothideomycetes, is a damaging pathogen of wheat. It is a model for necrotrophic fungi that cause necrotic symptoms via the interaction of multiple effector proteins with cultivar-specific receptors. A draft genome sequence and annotation were published in 2007. A second-pass gene prediction using a training set of 795 fully EST-supported genes predicted a total of 10762 version 2 nuclear-encoded genes, with an additional 5354 less reliable version 1 genes also retained.
Year: 2009 PMID: 19772613 PMCID: PMC2753851 DOI: 10.1186/1471-2105-10-301
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. Peptide clusters which did not confirm an existing gene model or conflict with an existing gene model in the opposing orientation were compared to grouped blastx HSPs from related dothideomycete proteins. Some of these peptide clusters were linked back to existing gene models as illustrated by Peptide Cluster 1 and Gene 1, which share a homology relationship with Blastx HSP group 1. This provides strong evidence for the reannotation of Gene 1 to become Modified Gene 1. Other peptide clusters could not be linked to existing gene annotations such as Peptide Cluster 2, which provides evidence for the creation of New Gene 2.
Sub-cellular localisation predictions using WolfPsort of proteins over-represented within the Mascot peptide supported gene annotations of S. nodorum.1
| Localisation | Peptide supported genes (all) | Peptide supported genes (v1) | Peptide supported genes (v2) | Unsupported genes | Expected genes | Total Genes |
|---|---|---|---|---|---|---|
| | 605 | 24 | 581 | 2118 | 361 | 2723 |
| | 87 | 7 | 80 | 491 | 77 | 578 |
| | 1442 | 157 | 1285 | 11346 | 1094 | 12788 |
| | 0 | 0 | 0 | 27 | 4 | 27 |
1 Numbers of peptide supported genes in each category were compared to expected counts from a random sampling of the whole genome via Fisher's exact test at a significance threshold of p < 0.05.
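The test named in the footnote can be illustrated on the first row of the table above. This is a minimal sketch using only the standard library. It assumes a 2×2 layout of (in category / not in category) × (peptide supported / unsupported), with genome-wide totals of 2134 supported and 13982 unsupported genes obtained by summing the four rows. The exact contingency layout used by the authors is not stated here, so the layout and the function name `fisher_exact_two_sided` are assumptions.

```python
from math import exp, lgamma


def log_comb(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def log_p(x):
        # Hypergeometric log-probability of x in the top-left cell,
        # with all row and column totals held fixed.
        return log_comb(row1, x) + log_comb(row2, col1 - x) - log_comb(n, col1)

    lo, hi = max(0, col1 - row2), min(row1, col1)
    observed = log_p(a)
    # Sum the probabilities of every table that is no more likely than the observed one.
    total = sum(exp(log_p(x)) for x in range(lo, hi + 1)
                if log_p(x) <= observed + 1e-7)
    return min(1.0, total)


# Assumed layout for the first row: 605 supported and 2118 unsupported genes
# in the category, the remainder of each column total outside it.
supported_total, unsupported_total = 2134, 13982  # column sums over the four rows
p = fisher_exact_two_sided(605, 2118,
                           supported_total - 605, unsupported_total - 2118)
print(f"p = {p:.3g}; below 0.05: {p < 0.05}")
```

Log-gamma is used instead of exact binomial coefficients so that the probabilities stay within floating-point range for gene counts in the thousands.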
Sub-cellular localisation predictions using SignalP of proteins over-represented within the Mascot peptide supported gene annotations of S. nodorum.1
| | Peptide supported genes (all) | Peptide supported genes (v1) | Peptide supported genes (v2) | Unsupported genes | Expected genes | Total Genes |
|---|---|---|---|---|---|---|
| | 1847 | 163 | 1684 | 11259 | 1773 | 13106 |
| | 287 | 25 | 262 | 2723 | 436 | 3010 |
1 Numbers of peptide supported genes in each category were compared to expected counts from a random sampling of the whole genome via Fisher's exact test at a significance threshold of p < 0.05.
Relative molecular masses of predicted proteins significantly over-represented within the Mascot peptide supported gene annotations of S. nodorum.1
| | Peptide supported genes (all) | Peptide supported genes (v1) | Peptide supported genes (v2) | Unsupported genes | Expected genes | Total Genes |
|---|---|---|---|---|---|---|
| | 284 | 60 | 224 | 4495 | 633 | 4779 |
| | 1536 | 121 | 1415 | 8816 | 1371 | 10352 |
| | 310 | 7 | 303 | 669 | 130 | 979 |
| | 4 | 0 | 4 | 2 | 1 | 6 |
1 Numbers of peptide supported genes in each category were compared to expected counts from a random sampling of the whole genome via Fisher's exact test at a significance threshold of p < 0.05.
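One plausible reading of the "Expected genes" column is the category total scaled by the genome-wide fraction of peptide-supported genes. The sketch below assumes that reading; it is not confirmed as the authors' procedure, and the function name `expected_supported` is hypothetical. The totals are sums over the four rows of the relative molecular mass table above. Under this assumption the rounded values agree with that table's Expected genes column.

```python
def expected_supported(category_total, supported_total, genome_total):
    """Expected peptide-supported genes in a category if support were spread evenly."""
    return category_total * supported_total / genome_total


# (peptide supported genes (all), total genes) for each row of the table above.
rows = [(284, 4779), (1536, 10352), (310, 979), (4, 6)]

supported_total = sum(supported for supported, _ in rows)  # all peptide-supported genes
genome_total = sum(total for _, total in rows)             # all annotated genes

expected = [round(expected_supported(total, supported_total, genome_total))
            for _, total in rows]

for (supported, total), exp_count in zip(rows, expected):
    print(f"total={total:6d}  observed={supported:5d}  expected={exp_count:5d}")
```

The genome total obtained this way equals the sum of the version 1 and version 2 gene counts given in the abstract (5354 + 10762).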
Figure 2. Comparison of . Version 2 genes are derived from EST alignments and a second round of EST-trained gene predictions. Version 2 genes are considered to be more reliable than the remaining tentative version 1 gene annotations.
Summary of 12947 6-frame translated genome-mapped peptides and 1840 peptide clusters corresponding to annotated gene features categorised by either direct overlap or close proximity (within 200 bp).
| Match Type | Overlap | Within 200 bp | No match | Total |
|---|---|---|---|---|
| Peptides | 11635 | 323 | 989 | 12947 |
| Peptide clusters | 1520 | 54 | 266 | 1840 |
1 CDS features are coding exons (gene sequence from translation start to stop, excluding introns). The number of proximal peptide and peptide cluster matches is greatly reduced in whole gene comparisons relative to CDS matches, indicating significant mis-annotations in EST-supported UTR regions and/or intron regions.
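The three match categories (overlap, within 200 bp, no match) can be sketched as an interval comparison. This is an illustrative sketch, not the authors' pipeline: it assumes 1-based inclusive coordinates on a single sequence, measures proximity as the coordinate difference between the nearest ends, and ignores strand. The function names are hypothetical.

```python
def classify_match(pep_start, pep_end, feat_start, feat_end, window=200):
    """Classify a mapped peptide span against one annotated feature."""
    # Two closed intervals overlap when each starts before the other ends.
    if pep_start <= feat_end and feat_start <= pep_end:
        return "overlap"
    # Otherwise measure the distance between the nearest ends.
    if pep_end < feat_start:
        gap = feat_start - pep_end
    else:
        gap = pep_start - feat_end
    return "within_window" if gap <= window else "no_match"


def best_match(peptide, features, window=200):
    """Return the strongest category found for a peptide across all features."""
    rank = {"overlap": 0, "within_window": 1, "no_match": 2}
    results = [classify_match(*peptide, *feature, window=window)
               for feature in features]
    return min(results, key=rank.get, default="no_match")


# Hypothetical coordinates for illustration only.
cds_features = [(1000, 1300), (1500, 1800)]
print(best_match((1250, 1280), cds_features))  # lies inside the first feature
print(best_match((1850, 1880), cds_features))  # just downstream of the second
print(best_match((5000, 5030), cds_features))  # far from both
```

Running the same classification against whole-gene spans instead of coding exons would shift peptides that fall in introns or UTRs from the proximity category into the overlap category, which is the comparison the footnote describes.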
Counts of S. nodorum version 1 and 2 gene annotations matching 6-frame translated genome-mapped peptide clusters.1
| Annotation version | Confirmed no conflict | UTR/intron conflict (EST support) | UTR/intron conflict (no EST support) | No-match | Total |
|---|---|---|---|---|---|
| Version 1 | 11 | 0 | 30 | 5313 | 5354 |
| Version 2 | 300 | 355 | 209 | 9898 | 10762 |
1 Un-translated region (UTR)/intron conflicts were determined based on peptide matches outside of coding exon regions but within either 200 bp or within the boundaries of known UTRs. UTRs were known for genes for which EST alignments were available, where UTR regions were defined as EST-aligned regions not corresponding to coding exons or introns. 41% of peptide-supported version 2 (reliable) genes with UTR regions confirmed by EST support have suspected UTR/intron conflicts (355 out of 864).
Summary of frame conflicts within coding-exon (CDS) annotations confirmed by overlapping 6-frame translated genome-mapped peptides.
| Category | Count |
|---|---|
| Total peptide-CDS matches in frame | 11224 |
| Total peptide-CDS matches out of frame | 482 |
| Genes with all peptide matches to CDS in frame | 715 |
| of which version 1 | 13 |
| of which version 2 | 702 |
| Genes with all peptide matches to CDS out of frame | 86 |
| of which version 1 | 10 |
| of which version 2 | 76 |
| Genes with peptide matches to CDS both in and out of frame | 58 |
| of which version 1 | 1 |
| of which version 2 | 57 |
The majority of peptide matches agree with current coding-exon frames; however, 144 gene annotations (86 + 58) require frame reassessment.
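The in-frame / out-of-frame distinction can be sketched as a modulo-3 check on genomic offsets. This is a minimal illustration and not the authors' implementation. It assumes forward-strand features and a GFF3-style phase, meaning the number of bases to skip at the start of the coding exon to reach the first complete codon. The function names are hypothetical.

```python
def in_frame(pep_start, cds_start, cds_phase=0):
    """True if a peptide's first base falls on a codon boundary of the coding exon.

    Forward strand only; cds_phase follows the GFF3 convention (0, 1 or 2).
    """
    return (pep_start - (cds_start + cds_phase)) % 3 == 0


def classify_gene(matches):
    """Summarise a gene from its (pep_start, cds_start, cds_phase) peptide matches."""
    if not matches:
        raise ValueError("gene has no peptide matches to classify")
    flags = [in_frame(*match) for match in matches]
    if all(flags):
        return "all_in_frame"
    if not any(flags):
        return "all_out_of_frame"
    return "mixed"


# Hypothetical coordinates for illustration only.
print(classify_gene([(101, 101, 0), (131, 101, 0)]))  # both offsets are multiples of 3
print(classify_gene([(102, 101, 0)]))                 # offset of 1 base
print(classify_gene([(101, 101, 0), (102, 101, 0)]))  # one of each
```

In these terms, the 144 annotations flagged for reassessment are the genes classified as `all_out_of_frame` (86) or `mixed` (58).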
Figure 3. Summary of version 1 and 2 . Genes were identified as candidates for re-annotation if the 6-frame translated genome-matched peptides indicated: conflicts in annotated coding-exon open reading frames (A); peptide-genome matches residing within annotated introns or untranslated regions (UTRs) (B); or peptide-genome matches which could be linked back to a gene model via tblastn homology between the genome sequence and selected dothideomycete genomes (C). 47 new gene candidates were identified by multiple methods: 3 peptide clusters which could not be linked to an existing gene annotation via a tblastn homolog; 29 unassembled read-contigs matching dothideomycete proteins via blastx but not matching S. nodorum proteins; and 15 unassembled read-singletons matching dothideomycete proteins via blastx but not matching S. nodorum proteins.
Summary of the 266 6-frame translated genome-mapped peptide clusters1 not confirming existing S. nodorum CDS annotations by either overlap or proximity within 200 bp.
| Category | Peptide clusters |
|---|---|
| In conflicting orientation with existing gene annotation | 113 |
| No conflict, no supporting evidence | 135 |
| Overlaps genomic tblastn hit | 18 |
| of which genomic tblastn hit links to an existing gene | 15 |
| of which genomic tblastn hit does not link to an existing gene | 3 |
1 Peptide clusters were assessed for orientation conflicts with genes in the opposing strand and for overlap with grouped regions of tblastn homology to related dothideomycete genomes (L. maculans, P. tritici-repentis, C. heterostrophus, A. brassicicola, M. graminicola and M. fijiensis).
Summary of 6-frame translated unassembled reads supported by MASCOT peptides.
| Category | Read contigs | Read singletons |
|---|---|---|
| Total | 939 | 10839 |
| With peptide support | 423 | 651 |
| With peptide support and a blastx hit to dothideomycetes | 270 | 437 |
| of these, also hitting S. nodorum proteins | 241 | 422 |
| of these, not hitting S. nodorum proteins | 29 | 15 |
| With peptide support but no blastx hit to dothideomycetes | 153 | 214 |
| Without peptide support | 516 | 10188 |
15455 reads that were not included in the main genome assembly of S. nodorum were re-assessed for overlap. 4616 reads were clustered into 939 contigs, with 10839 singleton reads remaining.
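The counts in this summary nest: each group should equal the sum of its sub-groups. The sketch below checks that arithmetic. The tree structure reflects one reading of how the categories nest, which is an assumption, and the helper name `inconsistent_nodes` is hypothetical.

```python
def inconsistent_nodes(node):
    """Return labels of nodes whose children do not sum to the node's own count.

    A node is a tuple (label, count, children), where children is a list of nodes.
    """
    label, count, children = node
    bad = []
    if children and sum(child[1] for child in children) != count:
        bad.append(label)
    for child in children:
        bad.extend(inconsistent_nodes(child))
    return bad


def peptide_tree(label, total, supported, blastx, hits, no_hits, no_blastx, unsupported):
    """Build the assumed nesting for one group (contigs or singletons)."""
    return (label, total, [
        ("with peptide support", supported, [
            ("with blastx hit", blastx, [
                ("hits S. nodorum protein", hits, []),
                ("does not hit S. nodorum protein", no_hits, []),
            ]),
            ("without blastx hit", no_blastx, []),
        ]),
        ("without peptide support", unsupported, []),
    ])


contigs = peptide_tree("contigs", 939, 423, 270, 241, 29, 153, 516)
singletons = peptide_tree("singletons", 10839, 651, 437, 422, 15, 214, 10188)
# Reads outside the main assembly: those clustered into contigs plus the singletons.
reads = ("unassembled reads", 15455, [("clustered reads", 4616, []), singletons])

for tree in (contigs, reads):
    problems = inconsistent_nodes(tree)
    print(tree[0], "consistent" if not problems else f"inconsistent at {problems}")
```

The contig tree is checked separately because its unit is contigs rather than reads, so it cannot be summed into the 15455-read total.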