Rowan A C Mitchell, Nathalie Castells-Brooke, Jan Taubert, Paul J Verrier, David J Leader, Christopher J Rawlings.
Abstract
Wheat biologists face particular problems because of the lack of genomic sequence and because the three homoeologous genomes give rise to three very similar forms of many transcripts. However, the more than 1.3 million public-domain Triticeae ESTs (of which approximately 850,000 are from wheat) and the full rice genomic sequence can be used to estimate the transcript sequences likely to be present in any wheat cDNA sample, against which PCR primers may then be designed. The Wheat Estimated Transcript Server (WhETS) is designed to do this in a convenient form, and to provide information on the number of matching EST and high-quality cDNA (hq-cDNA) sequences, their tissue distribution and likely intron positions inferred from rice. Triticeae EST and hq-cDNA sequences are mapped onto rice loci and stored in a database. The user selects a rice locus (directly or via Arabidopsis), and the matching Triticeae sequences are assembled according to user-defined filter and stringency settings. Assembly is performed first with the CAP3 program and then with a single nucleotide polymorphism (SNP)-analysis algorithm designed to separate homoeologues. An alignment of the resulting contigs and singlets against the rice template sequence is then displayed. Sequences and assembly details are available for download in FASTA and ACE formats, respectively. WhETS is accessible at http://www4.rothamsted.bbsrc.ac.uk/whets.
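The idea behind the second assembly step, separating near-identical homoeologous reads by their SNPs, can be illustrated with a small sketch. This is not the WhETS algorithm, whose details are not given in the abstract; it is a simplified stand-in that assumes the reads in a contig are already aligned to equal length. The function names `snp_columns` and `split_by_snps` and the `min_support` threshold are hypothetical. A column is treated as a putative SNP only if at least two different bases are each seen in `min_support` or more reads, so that isolated sequencing errors are ignored.

```python
from collections import Counter


def snp_columns(reads, min_support=2):
    """Return alignment columns where two or more bases each occur
    in at least `min_support` reads (putative SNPs, not errors)."""
    cols = []
    for i in range(len(reads[0])):
        counts = Counter(r[i] for r in reads if r[i] in "ACGT")
        supported = sum(1 for n in counts.values() if n >= min_support)
        if supported >= 2:
            cols.append(i)
    return cols


def split_by_snps(reads, min_support=2):
    """Group read indices by the bases they carry at the SNP columns.

    Assumption: all reads are pre-aligned, equal-length strings.
    Returns (snp column list, {haplotype string: [read indices]}).
    """
    cols = snp_columns(reads, min_support)
    groups = {}
    for idx, read in enumerate(reads):
        key = "".join(read[i] for i in cols)
        groups.setdefault(key, []).append(idx)
    return cols, groups


if __name__ == "__main__":
    # Toy aligned reads: two haplotypes plus one read with a lone error.
    reads = [
        "ACGTACGT",
        "ACGTACGT",
        "ACCTACAT",
        "ACCTACAT",
        "ACGTACGA",
    ]
    cols, groups = split_by_snps(reads)
    print(cols)
    print(groups)
```

In this toy input the last read differs from the first two only at a position supported by a single read, so it stays in the same group. A real implementation would also need to handle reads that cover only part of the contig, which this sketch does not attempt.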
Year: 2007 PMID: 17439966 PMCID: PMC1933201 DOI: 10.1093/nar/gkm220
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. WhETS database preparation steps. Dashed arrows indicate steps which are repeated in automatic weekly updates.
Figure 2. Output from WhETS for Os06g04200.1. The black line at the top corresponds to the rice gene CDS, with intron position and size indicated as red vertical lines with triangles. Thin horizontal lines below this indicate the coverage of hits from the Triticeae sequences. The rows below show these hits, with BLAST HSPs for contigs and singlets shown as lines coloured according to percentage identity, and coordinates aligned to the rice template. The CAP3 step gives three contigs; one of these (contig 1) is then divided into five new contigs by the SNP analysis step (contigs 1.1, 1.2, etc.). The genome of origin (A, B, D) has been added to the screenshot according to 100% identity matches of the contig consensus to the known-homoeologue transcript sequences (exons of accessions AB019622, AB019623 and AB019624). Contigs 1.1 and 1.2 are not combined because of a lack of substantial overlap. Contigs 1.4 and 1.5 appear to be splice variants with an indel in the 5′UTR. Contigs 2 and 3 appear quite different and may be paralogues.
Figure 3. Display window showing details for a contig, opened by clicking on the contig 1.1 link in Figure 2.