Corinna Benz, Daniel Nilsson, Björn Andersson, Christine Clayton, D Lys Guilbride.
Abstract
In Kinetoplastids, protein-coding genes are transcribed polycistronically by RNA polymerase II. Individual mature mRNAs are generated from polycistronic precursors by 5' trans splicing of a 39-nt capped leader RNA and 3' polyadenylation. It was previously known that trans splicing generally occurs at an AG dinucleotide downstream of a polypyrimidine tract, and that polyadenylation is coupled to downstream trans splicing. The few polyadenylation sites that had been examined were 100-400 nt upstream of the polypyrimidine tract which marked the adjacent trans splice site. We wished to define the sequence requirements for trypanosome mRNA processing more tightly and to generate a predictive algorithm. By scanning all available Trypanosoma brucei cDNAs for splicing and polyadenylation sites, we found that trans splicing generally occurs at the first AG following a polypyrimidine tract of 8-25 nt, giving rise to 5'-UTRs of a median length of 68 nt. We also found that in general, polyadenylation occurs at a position with one or more A residues located between 80 and 140 nt from the downstream polypyrimidine tract. These data were used to calibrate free parameters in a grammar model with distance constraints, enabling prediction of polyadenylation and trans splice sites for most protein-coding genes in the trypanosome genome. The data from the genome analysis and the program are available from: .
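The two positional rules stated in the abstract can be illustrated with a rough sketch. This is not the authors' grammar model or program: the function name `predict_sites`, the requirement that a tract be an uninterrupted run of C/T, and the decision to report every A in the 80-140 nt window are all assumptions made for illustration. Only the numeric ranges (8-25 nt tract, first AG downstream, 80-140 nt upstream window) come from the text.

```python
import re

# Ranges taken from the abstract; everything else here is an illustrative assumption.
PPT_MIN, PPT_MAX = 8, 25          # polypyrimidine tract length (nt)
POLYA_MIN, POLYA_MAX = 80, 140    # distance from poly(A) site to tract start (nt)


def predict_sites(seq):
    """Apply the abstract's two distance rules to a DNA/RNA string.

    Returns one dict per qualifying polypyrimidine tract with:
      ppt              - (start, end) of the tract, end exclusive
      splice_ag        - index of the first AG after the tract
      polya_candidates - indices of A residues 80-140 nt upstream of the tract
    """
    seq = seq.upper().replace("U", "T")
    results = []
    # Assumption: a tract is a maximal uninterrupted run of C/T.
    for match in re.finditer(r"[CT]+", seq):
        start, end = match.start(), match.end()
        if not PPT_MIN <= end - start <= PPT_MAX:
            continue
        ag = seq.find("AG", end)  # first AG downstream of the tract
        if ag == -1:
            continue
        # Window of positions lying 80-140 nt upstream of the tract start.
        lo = max(0, start - POLYA_MAX)
        hi = start - POLYA_MIN
        candidates = [i for i in range(lo, hi + 1) if seq[i] == "A"]
        results.append(
            {"ppt": (start, end), "splice_ag": ag, "polya_candidates": candidates}
        )
    return results
```

Calling `predict_sites` on an intergenic sequence returns zero-based indices. A real predictor would need to tolerate interrupted tracts and rank candidates, which is presumably what the calibrated free parameters of the grammar model handle.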
Year: 2005 PMID: 15993496 DOI: 10.1016/j.molbiopara.2005.05.008
Source DB: PubMed Journal: Mol Biochem Parasitol ISSN: 0166-6851 Impact factor: 1.759