Rebecca L Poole, Gary L A Barker, Kay Werner, Gaia F Biggi, Jane Coghill, J George Gibbings, Simon Berry, Jim M Dunwell, Keith J Edwards.
Abstract
BACKGROUND: Serial Analysis of Gene Expression (SAGE) is a powerful tool for genome-wide transcription studies. Unlike microarrays, it has the ability to detect novel forms of RNA such as alternatively spliced and antisense transcripts, without the need for prior knowledge of their existence. One limitation of using SAGE on an organism with a complex genome and lacking detailed sequence information, such as the hexaploid bread wheat Triticum aestivum, is accurate annotation of the tags generated. Without accurate annotation it is impossible to fully understand the dynamic processes involved in such complex polyploid organisms. Hence we have developed and utilised novel procedures to characterise, in detail, SAGE tags generated from the whole grain transcriptome of hexaploid wheat.
Year: 2008 PMID: 18847483 PMCID: PMC2584110 DOI: 10.1186/1471-2164-9-475
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Summary of SAGE libraries
| Library | Total tag count | Number of Unique tags (%) | Number of singletons (%) | Number of tags with a count of >3 (cumulative count) |
| Xi19 (normal) | 13,286 | 9,471 (71) | 8,382 (63) | 313 (3167) |
| Xi19 (normal) tech. rep | 10,978 | 7,890 (72) | 6,999 (64) | 217 (2474) |
| Scorpion 25 (normal) | 13,875 | 9,853 (71) | 8,713 (63) | 304 (3295) |
| Scorpion 25 (normal) tech. rep | 9,786 | 4,850 (50) | 4,323 (44) | 527 (5393) |
| Xi19 (hot and dry) | 12,460 | 7,818 (63) | 6,942 (56) | 260 (4136) |
| Scorpion 25 (hot and dry) | 11,545 | 6,289 (54) | 5,508 (48) | 344 (5141) |
Percentages displayed are of the total cumulative tag count.
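The columns of the library summary can be derived from a raw list of observed tags. The following is a minimal sketch, not the authors' script: the function name `summarise_library` and the dictionary keys are hypothetical, and the input is assumed to be one tag sequence per observation. Percentages are taken against the total tag count, as the table note states.

```python
from collections import Counter


def summarise_library(tags, min_count=3):
    """Summarise one SAGE library from an iterable of observed tag sequences.

    Hypothetical helper: percentages are relative to the total tag count,
    and "abundant" tags are those with a count strictly greater than min_count.
    """
    counts = Counter(tags)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("library contains no tags")
    unique = len(counts)
    singletons = sum(1 for c in counts.values() if c == 1)
    abundant = [c for c in counts.values() if c > min_count]
    return {
        "total": total,
        "unique": unique,
        "unique_pct": round(100 * unique / total),
        "singletons": singletons,
        "singleton_pct": round(100 * singletons / total),
        "abundant_tags": len(abundant),
        "abundant_cumulative": sum(abundant),
    }


if __name__ == "__main__":
    # Toy library: one tag seen 5 times, one seen twice, three singletons.
    toy = ["AAAA"] * 5 + ["CCCC"] * 2 + ["GGGG", "TTTT", "ACGT"]
    print(summarise_library(toy))
```

Applied to the first row of the table, the same arithmetic gives 9,471 / 13,286 ≈ 71% unique tags and 8,382 / 13,286 ≈ 63% singletons.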
Figure 1. Schematic diagram of the assignment and annotation of SAGE tags. Each processing step was performed using a custom Perl script (Additional file 1). UniGenes are assigned annotations by BLASTX, with the UniGene sequences searched against the non-redundant (nr) protein database. Tags are preferentially assigned to UniGenes with annotations and, in cases of multiple matches, to the UniGene with the highest cumulative frequency, to reduce redundancy within the data. Fuzzy matching tolerates up to 2 bp of mismatch between the tag and the representative UniGene sequence.
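The assignment rules in the Figure 1 caption can be illustrated with a short sketch. This is not the authors' Perl pipeline; it is a Python approximation under stated assumptions: each UniGene is reduced to a single tag-length sequence, the `UniGene` record and its `frequency` field are hypothetical, mismatches are counted as Hamming distance, and perfect matches are tried before fuzzy ones. Only the 2 bp tolerance, the preference for annotated UniGenes, and the frequency tie-break come from the caption.

```python
from dataclasses import dataclass
from typing import Optional

MAX_MISMATCH = 2  # tolerance stated in the Figure 1 caption


@dataclass
class UniGene:
    uid: str
    tag: str                   # assumed: the tag-length sequence expected from this UniGene
    annotation: Optional[str]  # BLASTX annotation, or None for "no hit"
    frequency: int             # hypothetical cumulative frequency used to break ties


def hamming(a, b):
    """Number of mismatching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def assign_tag(tag, unigenes):
    """Return (UniGene, match_type); match_type is 'perfect', 'fuzzy' or 'none'."""
    scored = [(hamming(tag, u.tag), u) for u in unigenes if len(u.tag) == len(tag)]
    perfect = [u for d, u in scored if d == 0]
    fuzzy = [u for d, u in scored if 0 < d <= MAX_MISMATCH]
    for label, candidates in (("perfect", perfect), ("fuzzy", fuzzy)):
        if candidates:
            # Prefer annotated UniGenes, then the highest cumulative frequency.
            best = max(candidates, key=lambda u: (u.annotation is not None, u.frequency))
            return best, label
    return None, "none"


if __name__ == "__main__":
    db = [
        UniGene("A", "ACGTACGTAC", None, 100),
        UniGene("B", "ACGTACGTAC", "gliadin", 10),
        UniGene("C", "ACGTACGTTT", "kinase", 50),
    ]
    hit, kind = assign_tag("ACGTACGTAC", db)
    print(hit.uid, kind)
```

In the toy database, the unannotated UniGene `A` has the highest frequency, yet the tag is assigned to the annotated UniGene `B`, which reflects the stated preference for annotated matches.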
Figure 2. SAGE tag classification and spatial distribution. In total, assignment to a UniGene (NCBI build #38) sequence was attempted for 5,304 unique tags with a count ≥2. The tags were classified into 5 categories according to the sequence alignment (a): perfect forward matches (yellow), perfect reverse matches (black), fuzzy forward matches (red), fuzzy reverse matches (blue) and no match to a UniGene (green). Distribution analysis of the forward (b) and reverse (c) tags across the length of the transcript was performed on total tag count data for tags with an annotation and a count ≥2, and reveals that the majority of tags are derived from the 3' most CATG site (position 1) of the respective transcripts. The perfect matched tags (blue) follow the same pattern as the fuzzy matches (red).
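The position numbering used in panels (b) and (c) can be sketched as follows. This is an illustrative example rather than the authors' code: the function names are hypothetical, and the tag length of 10 bp downstream of the CATG anchoring site is an assumption, since the text above does not state it. Position 1 is the 3' most CATG site, and reverse tags are obtained by applying the same procedure to the reverse complement of the transcript.

```python
ANCHOR = "CATG"  # anchoring enzyme recognition site named in the caption
TAG_LEN = 10     # assumption: tag length is not stated in the text above

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an upper- or lower-case DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def tags_by_position(transcript, tag_len=TAG_LEN):
    """Return {position: tag}, where position 1 is the 3' most CATG site."""
    seq = transcript.upper()
    sites = []
    start = seq.find(ANCHOR)
    while start != -1:
        sites.append(start)
        start = seq.find(ANCHOR, start + 1)
    result = {}
    for position, site in enumerate(reversed(sites), start=1):
        begin = site + len(ANCHOR)
        tag = seq[begin:begin + tag_len]
        if len(tag) == tag_len:  # skip sites too close to the 3' end
            result[position] = tag
    return result


if __name__ == "__main__":
    transcript = "AACATGAAAAACCCCCTTCATGGGGGGTTTTTAA"
    print(tags_by_position(transcript))                      # forward tags
    print(tags_by_position(reverse_complement(transcript)))  # reverse tags
```

A tag observed in a library can then be looked up in this mapping to find which CATG site it derives from, which is the quantity plotted in the distribution panels.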
Figure 3. Alignment of SAGE tags to the Pina (a) and Pinb (b) mRNA complete sequence from the Ha (hardness) locus [GenBank accession: CR626934] (Chantret et al. [41]). All anchoring enzyme sites are denoted by upper case letters and SAGE tags are in bold (reverse tags are additionally italicised); the coding sequence is delimited by open arrow heads. Putative polyadenylation signals are indicated by asterisks and the termination sites of the truncated transcripts are highlighted by block arrow heads (Gautier et al. [43]). Cumulative tag counts across all six libraries are indicated in boxes beneath each tag. In both cases the penultimate (and non-canonical) tag has the highest frequency.
Summary of 40 most abundant antisense UniGenes
| gnl|UG|Ta#S17980503 | no hit | Unknown | 374 | 0 | 2 | No |
| gnl|UG|Ta#S12872250 | no hit | Unknown | 338 | 0 | 4 | No |
| gnl|UG|Ta#S12922882 | Alpha/beta-gliadin A-II precursor (Prolamin) | Storage | 238 | 5 | 5 | Yes |
| gnl|UG|Ta#S32420068 | PREDICTED: similar to rRNA intron-encoded homing endonuclease | Reproduction | 190 | 1 | 3 | Yes |
| gnl|UG|Ta#S32610130 | no hit | Unknown | 166 | 0 | 1 | No |
| gnl|UG|Ta#S18010719 | putative inositol-(1,4,5) trisphosphate 3-kinase [Oryza sativa] | Signalling | 97 | 0 | 1 | No |
| gnl|UG|Ta#S16057965 | putative argonaute protein [Oryza sativa] | Reproduction | 85 | 0 | 2 | No |
| gnl|UG|Ta#S17985265 | putative AT-hook DNA-binding protein [Oryza sativa] | Reproduction | 84 | 0 | 1 | No |
| gnl|UG|Ta#S15823985 | no hit | Unknown | 83 | 0 | 1 | No |
| gnl|UG|Ta#S12923304 | gamma-gliadin [Triticum aestivum] | Storage | 64 | 5 | 0 | Yes |
| gnl|UG|Ta#S26027296 | UBX domain, putative [Oryza sativa (japonica cultivar-group)] | Unknown | 60 | 0 | 1 | No |
| gnl|UG|Ta#S12923123 | gliadin gamma | Storage | 56 | 3 | 0 | Yes |
| gnl|UG|Ta#S12923126 | low molecular weight glutenin subunit LMW-Di31 [Triticum turgidum] | Storage | 48 | 1 | 1 | Yes |
| gnl|UG|Ta#S17988646 | putative glucose-6-phosphate dehydrogenase [Oryza sativa] | Metabolism | 47 | 0 | 1 | No |
| gnl|UG|Ta#S19133035 | low-molecular-weight glutenin subunit group 3 type II | Storage | 46 | 5 | 0 | Yes |
| gnl|UG|Ta#S16466298 | no hit | Unknown | 44 | 0 | 2 | No |
| gnl|UG|Ta#S22389847 | no hit | Unknown | 39 | 0 | 2 | Yes |
| gnl|UG|Ta#S12922884 | alpha-gliadin [Triticum aestivum] | Storage | 35 | 1 | 0 | Yes |
| gnl|UG|Ta#S32643313 | OSJNBa0070C17.22 (CpG binding domain*) | Reproduction | 34 | 0 | 2 | No |
| gnl|UG|Ta#S13111511 | wound-inducible basic protein – kidney bean | Defense | 30 | 0 | 1 | No |
| gnl|UG|Ta#S18010204 | choline kinase [Oryza sativa] | Membrane | 29 | 0 | 1 | No |
| gnl|UG|Ta#S13179349 | no hit | Unknown | 27 | 0 | 3 | No |
| gnl|UG|Ta#S13005586 | gamma-gliadin [Triticum aestivum] | Storage | 25 | 1 | 0 | Yes |
| gnl|UG|Ta#S15880157 | no hit | Unknown | 24 | 1 | 0 | Yes |
| gnl|UG|Ta#S17883810 | putative serine/threonine protein phosphatase PP1 [Oryza sativa] | Signalling | 23 | 0 | 1 | No |
| gnl|UG|Ta#S12966614 | putative receptor protein kinase PERK1 [Oryza sativa] | Signalling | 23 | 0 | 1 | No |
| gnl|UG|Ta#S32583944 | unknown protein [Oryza sativa] | Unknown | 21 | 0 | 1 | No |
| gnl|UG|Ta#S16191894 | putative wall-associated protein kinase [Oryza sativa] | Signalling | 21 | 0 | 2 | No |
| gnl|UG|Ta#S12923306 | gamma-gliadin [Triticum aestivum] | Storage | 21 | 3 | 1 | Yes |
| gnl|UG|Ta#S17975314 | no hit | Unknown | 20 | 0 | 1 | No |
| gnl|UG|Ta#S32572951 | Nucleolar GTP-binding protein 1-like [Oryza sativa] | Signalling | 20 | 0 | 1 | No |
| gnl|UG|Ta#S22379110 | putative branched-chain alpha-keto acid decarboxylase E1 beta | Cell Wall | 20 | 0 | 1 | No |
| gnl|UG|Ta#S12917789 | no hit | Unknown | 20 | 0 | 1 | No |
| gnl|UG|Ta#S16228057 | no hit | Unknown | 20 | 0 | 1 | Yes |
| gnl|UG|Ta#S22368491 | protein phosphatase 2C, putative, expressed [Oryza sativa] | Signalling | 19 | 0 | 1 | No |
| gnl|UG|Ta#S16058509 | high-molecular-weight glutenin subunit Bx17 [Triticum aestivum] | Storage | 19 | 1 | 1 | Yes |
| gnl|UG|Ta#S12932494 | unknown protein; 58745–68005 [Arabidopsis thaliana] | Unknown | 19 | 0 | 1 | Yes |
| gnl|UG|Ta#S32736316 | no hit | Unknown | 18 | 1 | 0 | No |
| gnl|UG|Ta#S17886389 | LacZ-alpha [Shuttle vector pLPV111] | Unknown | 18 | 1 | 1 | No |
| gnl|UG|Ta#S32503514 | DNA polymerase delta small subunit, putative, expressed | Reproduction | 18 | 0 | 1 | No |
PM – Unique perfect match tags.
FM – Unique fuzzy matched tags
* – Annotation obtained by manual search
Figure 4. Sense gene expression across the Ha locus. Each bar represents the median relative intensity of hybridisation to a 30 mer oligo. Oligo names represent the position of the first base in the oligo within the Ha locus sequence [GenBank accession CR626934]. Hybridisations were performed with cDNA from 14 dpa endosperm and revealed the ability of the microarray approach to predict the genic regions as defined in GenBank accession CR626934. Thin black lines (under the graph) indicate the gene regions, with the thick black lines highlighting the coding sequences. The array also highlights areas of transcription found in the inter-genic regions (indicated by an asterisk).
Figure 5. Expression profiles of Pina and Pinb. Mean relative intensities of Pina (a) and Pinb (b) sense oligos across the tiled array; each bar represents the median relative intensity of hybridisation of cDNA from 14 dpa endosperm to a 30 mer oligo. The thin black lines under the graphs indicate the gene regions, with the thick black lines representing the coding sequence. Mean relative intensities of the Pina sense (c), Pinb sense (d), Pina antisense (e) and Pinb antisense (f) transcripts were calculated over development using all antisense oligos, including both the tiled oligos and the ORF oligos. Expression of both sense and antisense transcripts peaks around 10 dpa; the sense transcripts remain abundant during the middle phase of development, whilst the antisense transcripts have declined by 14 dpa. All oligo sequences are provided in additional material 7.