Xiang Jia Min, Gregory Butler, Reginald Storms, Adrian Tsang.
Abstract
TargetIdentifier is a webserver that identifies full-length cDNA sequences from expressed sequence tag (EST)-derived contig and singleton data. To accomplish this, TargetIdentifier uses BLASTX alignments as a guide to locate protein-coding regions and potential start and stop codons. This information is then used to determine whether the EST-derived sequences include their translation start codons. The algorithm also uses the BLASTX output to assign putative functions to the query sequences. The server is available at https://fungalgenome.concordia.ca/tools/TargetIdentifier.html.
Year: 2005 PMID: 15980559 PMCID: PMC1160197 DOI: 10.1093/nar/gki436
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. Categories of algorithm-predicted cDNA clones. (A) A full-length sequence that includes one or more stop codons in the predicted 5′-UTR, a completely sequenced protein coding region and a 3′-UTR. (B) A sequence similar to those described in (A) except that the 3′ end of the ORF region is not sequenced. (C) A sequence having a start codon but lacking a stop codon in the 5′-UTR; whether it contains a potential translation start codon is determined by comparing the BLASTX alignment between its predicted protein and the subject. (D) A sequence having a stop codon in the 5′-UTR but lacking an in-frame start codon. This is an ambiguous sequence. (E) A sequence that includes a coding region but neither a stop codon nor a start codon in the sequenced portion. The length of the low quality sequence removed by Lucy (15) is taken into consideration when predicting whether or not it is a ‘possible full-length’ sequence. Asterisk: stop codon upstream of the start codon (5′ end stop codon); solid circle: predicted translation initiation codon; solid triangle: a stop codon downstream from the start codon (3′ end stop codon); question mark: a check for whether a 3′ stop codon exists; (X): the first amino acid in the alignment of the HSP in BLASTX; (M): methionine; (d1): the length of the predicted peptide from a predicted start codon to X; (d2): the length from M to X in the subject sequence of the HSP in BLASTX; (d3): the length of EST sequence trimmed by Lucy, which can include a portion of a vector, an adaptor and a low quality region of a cDNA sequence; thick solid line: sequences retained after processing by Lucy; thin solid line: the low quality sequence removed from the 5′ end by Lucy; dashed line: amino acid sequence of the subject in BLASTX.
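The symbols in the Figure 1 legend can be made concrete with a small sketch. The code below is not TargetIdentifier's implementation; it only illustrates how, starting from the codon aligned to X, one might walk upstream in frame to find a 5′ stop codon and a candidate start codon, and derive d1. The names `scan_upstream`, `UpstreamScan` and `x_pos` are hypothetical. Two choices are assumptions made here rather than rules stated in the source: the most upstream in-frame ATG found before the nearest upstream stop is taken as the start codon, and d1 is counted from the start-codon methionine through X inclusive.

```python
from typing import NamedTuple, Optional

STOP_CODONS = {"TAA", "TAG", "TGA"}
START_CODON = "ATG"


class UpstreamScan(NamedTuple):
    start_pos: Optional[int]        # 0-based index of the chosen in-frame ATG, or None
    five_prime_stop: Optional[int]  # 0-based index of the nearest upstream in-frame stop, or None
    d1: Optional[int]               # residues from the start-codon M through X, or None


def scan_upstream(seq: str, x_pos: int) -> UpstreamScan:
    """Walk upstream, codon by codon, from the codon aligned to X.

    seq   : nucleotide sequence on the coding strand
    x_pos : 0-based index of the first base of the codon aligned to X
            (this fixes the reading frame)
    """
    seq = seq.upper()
    start_pos = None
    stop_pos = None
    pos = x_pos
    while pos >= 0:
        codon = seq[pos:pos + 3]
        if codon in STOP_CODONS:
            stop_pos = pos          # nearest in-frame stop upstream of X
            break
        if codon == START_CODON:
            start_pos = pos         # assumption: keep the most upstream ATG seen so far
        pos -= 3
    # assumption: d1 counts residues inclusively, from M through X
    d1 = None if start_pos is None else (x_pos - start_pos) // 3 + 1
    return UpstreamScan(start_pos, stop_pos, d1)


if __name__ == "__main__":
    # CC | TAA GCC ATG AAA GGG TTT, with GGG as the codon aligned to X
    print(scan_upstream("CCTAAGCCATGAAAGGGTTT", 14))
```

A result with both `start_pos` and `five_prime_stop` set corresponds to the situation drawn in panels (A) and (B); a missing `five_prime_stop` corresponds to (C) or (E), and a missing `start_pos` to (D) or (E).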
Figure 2. A decision tree for EST-derived sequence classification. The definitions of each category of EST-derived sequences are described in detail in the text. Start codon: ATG; 5′ stop codon: stop codon (TAA, TAG, or TGA) in the 5′-UTR; d1: the predicted length of the peptide that extends from the start codon encoded methionine to the first amino acid of the query in the HSP alignment in the output of BLASTX; d2: the subject's beginning position in the HSP alignment in the output of BLASTX; d3: the estimated length of the low quality sequence removed by Lucy (15).
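As a rough illustration of the first branches of such a decision tree, the sketch below maps three yes/no observations onto the panel letters (A) to (E) of Figure 1. It is a simplification under stated assumptions, not the published algorithm: the function name `classify` and its boolean parameters are hypothetical, and the follow-up comparisons involving d1, d2 and d3 that refine categories (C) and (E) are deliberately left out because their exact thresholds are not given in this record.

```python
def classify(has_5p_stop: bool, has_start: bool, has_3p_stop: bool) -> str:
    """Return the Figure 1 panel letter for an EST-derived sequence.

    has_5p_stop : an in-frame stop codon (TAA, TAG or TGA) lies in the predicted 5'-UTR
    has_start   : an in-frame ATG lies upstream of the BLASTX-aligned region
    has_3p_stop : an in-frame stop codon closes the coding region at the 3' end
    """
    if has_5p_stop and has_start:
        # Start codon confirmed by an upstream stop; only the 3' end distinguishes A from B.
        return "A" if has_3p_stop else "B"
    if has_start:
        # No 5' stop: the source resolves this case by comparing the BLASTX
        # alignment (d1 versus d2); that comparison is not modelled here.
        return "C"
    if has_5p_stop:
        # 5' stop but no in-frame start codon: described as ambiguous.
        return "D"
    # Neither codon found: the source additionally weighs d3, the length
    # trimmed by Lucy; that step is not modelled here.
    return "E"


if __name__ == "__main__":
    for flags in [(True, True, True), (True, True, False),
                  (False, True, False), (True, False, False),
                  (False, False, False)]:
        print(flags, "->", classify(*flags))
```

Ordering the checks this way means the 3′ stop codon is consulted only once a start codon has been confirmed by an upstream stop, which mirrors how panels (A) and (B) differ in the caption.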