Shuqi E Wang1, Abdul S Amir1, Tai Nguyen1, Anthony M Poole1, Augusto Simoes-Barbosa2.
Abstract
BACKGROUND: The human protozoan parasite Trichomonas vaginalis is an organism of interest for understanding eukaryotic evolution. Despite having an unusually large genome and a rich gene repertoire among protists, spliceosomal introns in T. vaginalis appear rare: only 62 putative introns have been annotated in this genome, and little or no experimental evidence exists to back up these predictions.
Keywords: Deep-branching eukaryote; Introns; Spliceosome; Splicing; Trichomonas vaginalis
Year: 2018 PMID: 30482228 PMCID: PMC6260720 DOI: 10.1186/s13071-018-3196-7
Source DB: PubMed Journal: Parasit Vectors ISSN: 1756-3305 Impact factor: 3.876
Fig. 1 A Venn diagram showing the overlap in the number of introns predicted by TrichDB (blue) and reported in previous publications (red: Vanacova et al. (2005) [15]; green: Deng et al. (2008) [16]), as indicated
Summary of the PCR validation for the 62 putative introns distributed in 61 protein-coding genes as annotated by TrichDB
| Categoryᵃ | Gene IDᵇ | Notes |
|---|---|---|
| (A) Functional introns | TVAG_110020, TVAG_390460, TVAG_126240, TVAG_225200, TVAG_087980, TVAG_413420, TVAG_388620, TVAG_176980, TVAG_053820, TVAG_020880, TVAG_110580, TVAG_350500, TVAG_148640, TVAG_460790, TVAG_198230, TVAG_125100, TVAG_085780, TVAG_065500, TVAG_014960 | Described by Vanacova et al. [ |
| | TVAG_383350 | Described by Deng et al. [ |
| | TVAG_324910, TVAG_416520, TVAG_134480, TVAG_089630, TVAG_043580, TVAG_306990, TVAG_147850, TVAG_217460, TVAG_242770, TVAG_056030, TVAG_203580 | No early references |
| (B) Non-functional introns | TVAG_355610, TVAG_411060, TVAG_107710, TVAG_593670, TVAG_045310, | |
| (C) Undetermined introns | TVAG_066220, TVAG_115540, TVAG_249380, | |
ᵃ Based on the RT-PCR results (Additional file 3: Figure S1), these introns were categorized as (A) Functional, (B) Non-functional or (C) Undetermined
ᵇ Genes in which introns were predicted to lie in the untranslated regions (UTRs) rather than in the coding sequences (CDS) are shown in bold
ᶜ This is the only gene from the list that was claimed to contain 2 introns instead of 1
Fig. 2 The exon boundaries of the 11 newly discovered intron-containing genes in T. vaginalis. Gel images, cropped from Additional file 3: Figure S1, show the unspliced and spliced amplicons in the first and second lanes, respectively, followed by the 100 bp DNA ladder (New England Biolabs). The TrichDB ID of each gene is shown at the top of its gel image, followed by the expected size (bp) of the unspliced | spliced amplicons. Each gel image is accompanied by part of the DNA sequencing chromatogram, where the line indicates the precise boundary between exons 1 and 2. The actual DNA sequence is shown under the chromatogram, with the arrow indicating the nucleotide boundary between exons. The image in the box contains the 100 bp DNA ladder for reference
Features of the 11 newly discovered introns and intron-containing genes in T. vaginalis
| Gene ID | Predicted function | Intron length (bp) | Exon/intron GC content (%) | IPᵃ | RIPᵇ | Intron phase | Exon/exon nucleotide sequence | Exon/exon amino acid sequenceᶜ |
|---|---|---|---|---|---|---|---|---|
| TVAG_306990 | CMGC family protein kinase | 93 | 43.18/29.03 | 307 | 0.25 | 0 | ATAC/ATTA | LAY/IKA |
| TVAG_217460 | Hypothetical protein | 72 | 35.51/31.94 | 120 | 0.1 | 2 | TAGA/CTGT | YEL/dCE |
| TVAG_147850 | CAMK family protein kinase | 68 | 36.72/25.0 | 533 | 0.45 | 1 | CCAA/AATA | GSP/kYV |
| TVAG_416520 | Hypothetical protein | 26 | 41.63/30.77 | 109 | 0.17 | 0 | TGAA/AATT | FSE/NYV |
| TVAG_043580 | Mob1 phocein family | 25 | 35.15/28.0 | 21 | 0.03 | 2 | AAAT/GCAT | FSK/mHS |
| TVAG_056030 | Hypothetical protein | 25 | 40.33/24.0 | 137 | 0.42 | 1 | GTTT/ATGG | RPV/yGL |
| TVAG_089630 | AGC family protein kinase | 25 | 37.39/24.0 | 78 | 0.06 | 2 | GGAT/TGAG | DNR/iEI |
| TVAG_134480 | Putative protein kinase | 25 | nd/24.0 | 78 | 0.08 | 2 | GGAT/TGAG | DNR/iEI |
| TVAG_242770 | Hypothetical protein | 25 | 37.21/36.0 | 93 | 0.07 | 2 | AATA/TTTG | IIK/yLK |
| TVAG_324910 | Hypothetical protein | 25 | nd/24.0 | 327 | nd | 2 | AAAT/ATAT | LIE/iYK |
| TVAG_203580 | Hypothetical protein | 25 | 34.77/24.0 | 165 | 0.12 | 2 | AATA/TATT | LTE/yIL |
ᵃ Intron position (IP) indicates the amino acid position of the intron relative to the first ATG in the open reading frame
ᵇ Relative intron position (RIP) indicates the intron position relative to the total gene ORF length
ᶜ Amino acids interrupted by phase 1 or 2 introns are shown in lower case
Abbreviations: nd, not determined (because of ambiguity of DNA sequence or incomplete length of CDS, as per TrichDB)
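The two derived columns in this table can be illustrated with a minimal sketch. It assumes the conventional definition of intron phase (the number of coding nucleotides upstream of the intron, modulo 3) and reads RIP as IP divided by the ORF length in amino acids, as the footnote suggests. The function names and the example numbers are hypothetical and are not taken from the paper.

```python
def intron_phase(cds_nt_before_intron: int) -> int:
    """Return the intron phase (0, 1 or 2).

    Assumed convention: phase 0 falls between two codons, phase 1 after
    the first nucleotide of a codon, phase 2 after the second.
    """
    if cds_nt_before_intron < 0:
        raise ValueError("nucleotide count must be non-negative")
    return cds_nt_before_intron % 3


def relative_intron_position(ip_aa: int, orf_length_aa: int) -> float:
    """Return IP / ORF length, rounded to two decimals.

    Assumes both values are expressed in amino acids.
    """
    if orf_length_aa <= 0:
        raise ValueError("ORF length must be positive")
    return round(ip_aa / orf_length_aa, 2)


# Hypothetical values, for illustration only.
print(intron_phase(920))                  # 920 = 3 * 306 + 2
print(relative_intron_position(50, 200))  # intron a quarter of the way in
```

Under this reading, a low RIP means the intron sits near the 5' end of the ORF.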
Fig. 3 The 11 newly discovered T. vaginalis introns are classified into two types based on their sequence properties. Types A (top) and B (bottom) fit closely with the introns described by Vanacova et al. [15] and Deng et al. [16], respectively. Nucleotides of the newly discovered introns that are identical to those of the previously identified introns [15, 16] are shaded in grey. The branch site sequence, initially described as identical to the yeast consensus [26], is underlined, with the red arrowhead indicating the branch adenosine. The distance in nucleotides (nt) between the 5' SS and the motif that encompasses the BS and 3' SS is indicated. The consensus sequences for intron types A and B, derived from the intron nucleotide sequences, are shown below each alignment. Ambiguity codes: ‘W’ denotes A or T; ‘Y’ denotes T or C; ‘H’ denotes A, C or T; and ‘{1,2}’ specifies that ‘H’ occurs once or twice
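The ambiguity notation in this caption maps directly onto regular-expression syntax, which the following minimal sketch illustrates. The helper name `consensus_to_regex` and the motif `GTWYH{1,2}A` are hypothetical placeholders chosen only to exercise the codes listed above; they are not the consensus sequences reported in the figure.

```python
import re

# Only the codes named in the caption, plus the four plain bases.
AMBIGUITY = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "W": "[AT]",   # A or T
    "Y": "[CT]",   # T or C
    "H": "[ACT]",  # A, C or T
}

QUANTIFIER_CHARS = set("{},0123456789")


def consensus_to_regex(consensus: str) -> str:
    """Translate a consensus string into a regular expression.

    Quantifiers such as '{1,2}' are passed through unchanged, because
    regex syntax already reads them as 'repeat the previous symbol'.
    """
    parts = []
    for ch in consensus:
        if ch in AMBIGUITY:
            parts.append(AMBIGUITY[ch])
        elif ch in QUANTIFIER_CHARS:
            parts.append(ch)
        else:
            raise ValueError(f"unsupported symbol: {ch!r}")
    return "".join(parts)


# Hypothetical motif, for illustration only.
pattern = consensus_to_regex("GTWYH{1,2}A")
print(pattern)
print(bool(re.fullmatch(pattern, "GTATCA")))
print(bool(re.fullmatch(pattern, "GTGTCA")))
```

Restricting the lookup table to the symbols the caption defines makes any other character fail loudly instead of being matched literally.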
Fig. 4 A newly identified type B intron in the gene TVAG_269270. (a) Experimental validation by RT-PCR and DNA sequencing. Left, the gel image of the RT-PCR is labelled with the gene ID at the top, followed by the expected size (bp) of the unspliced vs spliced amplicons. Lanes A-D were loaded with PCR products obtained from water, RNA, gDNA and cDNA templates, respectively. Lane L contains a molecular weight marker (100 bp DNA ladder, New England Biolabs) with band sizes indicated. Right, part of the DNA sequencing chromatogram, where the line and arrow indicate the precise exon-exon boundary. The actual DNA sequence is shown under the chromatogram, with the arrow indicating the nucleotide boundary between exons. (b) Partial sequence of gene TVAG_269270. The nucleotide sequence shows the entire 5' UTR, in which this 25-nt intron, highlighted in grey, was found. This is followed by part of the CDS (in red), accompanied by its translation into one-letter amino acid codes. The primers flanking the intron, which were used for the PCR here, are shown as underlined sequences. In contrast to the other characterized introns (Table 1), this type B intron is found in the 5' UTR of the transcript