Vladimír Klimeš, Eleni Gentekaki, Andrew J Roger, Marek Eliáš.
Abstract
Termination codons in mRNA molecules are typically specified directly by the sequence of the corresponding gene. However, in mitochondria of a few eukaryotic groups, some mRNAs contain the termination codon UAA deriving one or both adenosines from transcript polyadenylation. Here, we show that a similar phenomenon occurs for a substantial number of nuclear genes in Blastocystis spp., divergent unicellular eukaryote gut parasites. Our analyses of published genomic data from Blastocystis sp. subtype 7 revealed that polyadenylation-mediated creation of termination codons occurs in approximately 15% of all nuclear genes. As this phenomenon has not been noticed before, the procedure previously employed to annotate the Blastocystis nuclear genome sequence failed to correctly define the structure of the 3'-ends of hundreds of genes. From sequence data we have obtained from the distantly related Blastocystis sp. subtype 1 strain, we show that this phenomenon is widespread within the Blastocystis genus. Polyadenylation in Blastocystis appears to be directed by a conserved GU-rich element located four nucleotides downstream of the polyadenylation site. Thus, the highly precise positioning of the polyadenylation in Blastocystis has allowed reduction of the 3'-untranslated regions to the point that, in many genes, only one or two nucleotides of the termination codon are left.
Keywords: Blastocystis; evolution; gene expression; mRNA processing; polyadenylation; termination codons; translation
Year: 2014 PMID: 25015079 PMCID: PMC4159000 DOI: 10.1093/gbe/evu146
Source DB: PubMed Journal: Genome Biol Evol ISSN: 1759-6653 Impact factor: 3.416
Polyadenylation-mediated creation of termination codons in Blastocystis sp. subtype 7. (A) An example of a gene from Blastocystis sp. subtype 7 with polyadenylation-dependent creation of a termination codon. The figure shows the 3′-end of a gene (FN668683.1, positions 539586–539266) encoding a GTPase of the RHO family, which is characterized by the presence of a C-terminal CXXX motif. Although the current prediction of the respective protein sequence (CBK24281.2) lacks the expected C-terminal motif, the protein sequence predicted by taking into account an EST sequence (FQ805196.1) does exhibit the motif (CTVM in bold). The termination codon in the EST sequence (TAA) created by polyadenylation is highlighted in red, the conserved polyadenylation motif (see the text) is boxed, introns are marked by the characters “>.” The sequences displayed in the figure are defined above by their GenBank accession numbers. (B) A sequence logo for the conserved motif downstream from the polyadenylation site. The logo was created using WebLogo (Crooks et al. 2004; http://weblogo.berkeley.edu/, last accessed July 15, 2014) on the basis of 2,419 individual sequences.
Example of a gene from Blastocystis sp. subtype 1 displaying polyadenylation-mediated creation of the termination codon. The termination codon in the transcript (TAA) is highlighted in red, the putative polyadenylation motif is boxed, an intron is marked by the characters “>.” Homologous proteins from Blastocystis sp. subtype 7 and Phytophthora infestans are shown for comparison. Note that the protein sequence from Blastocystis sp. subtype 7 is most likely incorrectly predicted, because the gene also seems to rely on polyadenylation-mediated creation of a termination codon, as is indicated by the presence of a putative polyadenylation motif (boxed). Hence, the triplet TTT (in red) is probably changed to TAA in a corresponding transcript, which would shorten the predicted coding sequence of the gene and make the encoded protein more similar to its homologs. Subtype 7 DNA sequence: FN668683.1, positions 458734–458973; subtype 7 predicted model: CBK24259.2; Phytophthora infestans protein sequence: XP_002895556.1; subtype 1 DNA sequence: a part of a scaffold from a genome assembly based on 454 and Illumina reads (Gentekaki E, Curtis B, Roger AJ, unpublished data); subtype 1 EST sequence: EC650050.1; subtype 1 predicted protein sequence: a theoretical inference based on the EST sequence.
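The TTT-to-TAA change described in this caption can be illustrated with a minimal Python sketch. It is not the authors' annotation procedure. The toy gene, the function names `polyadenylate` and `first_stop`, the cleavage positions, and the tail length are all hypothetical and chosen only for illustration; the toy sequence is not taken from the Blastocystis genome and makes no attempt to reproduce the conserved downstream motif.

```python
# Illustrative sketch only: all sequences and names here are hypothetical.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def polyadenylate(gene: str, cleavage_pos: int, tail_len: int = 20) -> str:
    """Cut the genomic sequence at cleavage_pos and append a poly(A) tail.

    The transcript is written in the DNA alphabet (T instead of U),
    as EST sequences usually are.
    """
    if not 0 <= cleavage_pos <= len(gene):
        raise ValueError("cleavage_pos must lie within the gene sequence")
    return gene[:cleavage_pos].upper() + "A" * tail_len


def first_stop(seq: str, frame: int = 0):
    """Return the index of the first in-frame stop codon, or None."""
    for i in range(frame, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return i
    return None


# Toy gene: six sense codons (ATG GCT TGC ACC GTG ATG) followed by the
# genomic triplet TTT and some arbitrary downstream sequence.
gene = "ATGGCTTGCACCGTGATG" + "TTTGTGTTGC"

# The genomic sequence itself contains no in-frame stop codon.
genomic_stop = first_stop(gene)

# Cleavage right after the first T of the TTT triplet: the poly(A) tail
# supplies both adenosines, so the transcript reads ...ATG TAA AAA...
transcript = polyadenylate(gene, cleavage_pos=19)
transcript_stop = first_stop(transcript)

# Cleavage one nucleotide further downstream gives ...ATG TTA AAA...,
# which is not a stop codon, so precise positioning matters.
shifted_stop = first_stop(polyadenylate(gene, cleavage_pos=20))
```

In this toy case the genome alone encodes no in-frame stop, which is why a gene model built only from genomic sequence would run past the true 3′-end, whereas the polyadenylated transcript terminates at the newly formed TAA.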