Scott J Westenberger, Gustavo C Cerqueira, Najib M El-Sayed, Bianca Zingales, David A Campbell, Nancy R Sturm.
Abstract
BACKGROUND: The mitochondrial DNA of kinetoplastid flagellates is distinctive in the eukaryotic world due to its massive size, complex form and large sequence content. Comprised of catenated maxicircles that contain rRNA and protein-coding genes and thousands of heterogeneous minicircles encoding small guide RNAs, the kinetoplast network has evolved along with an extreme form of mRNA processing in the form of uridine insertion and deletion RNA editing. Many maxicircle-encoded mRNAs cannot be translated without this post-transcriptional sequence modification.
Year: 2006 PMID: 16553959 PMCID: PMC1559615 DOI: 10.1186/1471-2164-7-60
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Figure 1. All annotated genes are shown as arrows indicating coding direction. The non-coding regions of both genomes are distinct from one another, with the exception of a duplicated conserved element lying between the repetitive region and the 12S rRNA.
Gene positions and lengths on CL Brener and Esmeraldo maxicircle consensus sequences
| GENE | Editing of mRNA | CL Brener position | Esmeraldo position | CL Brener length | Esmeraldo length |
| 12S rRNA | | 1–1161 | 1–1153 | 1161 | 1153 |
| 9S rRNA | | 1200–1808 | 1197–1804 | 608 | 607 |
| ND8 | Pan-edited | 1853–2131 | 1857–2127 | 279 | 271 |
| ND9 (revcomp) | Pan-edited | 2195–2532 | 2192–2545 | 338 | 354 |
| MURF5 (revcomp) | 5' end edited * | 2568–2715 | 2576–2722 | 148 | 147 |
| ND7 | Pan-edited | 2857–3611 | 2862–3617 | 755 | 756 |
| COIII | Pan-edited | 3678–4100 | 3685–4109 | 424 | 425 |
| Cyb | 5' end edited | 4175–5254 | 4167–5246 | 1080 | 1080 |
| ATPase6 | 5' half edited | 5292–5627 | 5288–5622 | 336 | 335 |
| MURF1 (revcomp) | 5' end edited | 5675–7015 | 5681–7029 | 1340 | 1347 |
| CR3 | Pan-edited | 7002–7120 ** | 7016–7136 ** | uncertain | uncertain |
| ND1 (revcomp) | 5' end edited | 7116–8057 | 7132–8073 | 942 | 942 |
| COII | Internal editing +4Us | 8071–8699 | 8088–8716 | 629 | 629 |
| MURF2 | 5' end edited | 8725–9780 | 8750–9794 | 1056 | 1045 |
| COI (revcomp) | Not edited | 9771–11420 | 9785–11434 | 1650 | 1650 |
| CR4 (revcomp) | Pan-edited | 11471–11677 | 11487–11658 | 207 | 172 |
| ND4 | Not edited | 11782–13095 | 11659–12872 | 1314 | 1214 |
| ND3 (revcomp) | Pan-edited | 13087–13279 | 12864–13051 | 193 | 188 |
| RPS12 | Pan-edited | 13357–13547 | 13129–13315 | 191 | 187 |
| ND5 | Not edited | 13568–15337 | 13335–15105 | 1770 | 1771 |
(revcomp) indicates reverse complemented genes encoded on the opposite strand
Gene positions are given relative to the start of the 12S rRNA
* MURF5 start position is uncertain, so 5' end editing is assumed to create the start codon
** CR3 5' and 3' end positions are uncertain; the editing pattern and the start and stop codons are unknown
Nucleotide composition of T. cruzi maxicircle regions
| | Coding Region | | | | Non-coding Region | | | | Overall | | | |
| | CLB | Esmo | Tb | Lt | CLB | Esmo | Tb | Lt | CLB | Esmo | Tb | Lt |
| % A | 38 | 37 | 37 | 35 | 46 | 52 | 59 | 52 | 39 | 40 | 44 | 38 |
| % C | 11 | 11 | 10 | 10 | 12 | 14 | 9 | 9 | 11 | 11 | 9 | 10 |
| % G | 15 | 15 | 16 | 13 | 10 | 6 | 8 | 6 | 14 | 13 | 13 | 12 |
| % T | 37 | 37 | 37 | 43 | 33 | 28 | 24 | 33 | 36 | 36 | 34 | 41 |
| % A+T | 74 | 74 | 74 | 77 | 79 | 80 | 83 | 85 | 75 | 76 | 78 | 79 |
| % G+C | 26 | 26 | 26 | 23 | 21 | 20 | 17 | 15 | 25 | 24 | 22 | 21 |
| AT skew | 0.01 | -0.01 | 0.00 | -0.10 | 0.17 | 0.30 | 0.43 | 0.23 | 0.04 | 0.04 | 0.14 | -0.03 |
| GC skew | 0.15 | 0.16 | 0.24 | 0.15 | -0.08 | -0.38 | -0.02 | -0.17 | 0.10 | 0.05 | 0.19 | 0.10 |
The coding region is defined as the contiguous nucleotide sequence of the maxicircle consensus from the start of 12S rRNA through the end of the ND5 gene including intergenic regions, as defined for T. cruzi in Table 1. CLB, Esmo, Tb, and Lt represent T. cruzi CL Brener strain, T. cruzi Esmeraldo strain, T. brucei maxicircle coding sequence [GenBank:M94286] and variant region sequence [GenBank:Z15118], and L. tarentolae maxicircle sequence [GenBank:M10126], respectively.
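The composition and skew rows of the table above can be reproduced from a raw sequence with a few lines of code. The sketch below is illustrative only: the function name `composition_and_skew` is hypothetical, and the skew convention is an assumption, taken as AT skew = (A − T)/(A + T) and GC skew = (G − C)/(G + C). That convention is consistent with the rounded percentages in the table (for the CLB coding region, (38 − 37)/(38 + 37) ≈ 0.01 and (15 − 11)/(15 + 11) ≈ 0.15), but the record does not state the formula explicitly.

```python
from collections import Counter


def composition_and_skew(seq: str) -> dict:
    """Return base percentages and AT/GC skew for a DNA sequence.

    Assumed convention (not stated in the source record):
        AT skew = (A - T) / (A + T)
        GC skew = (G - C) / (G + C)
    Characters other than A, C, G, T (e.g. N) are ignored.
    """
    counts = Counter(seq.upper())
    a, c, g, t = (counts[base] for base in "ACGT")
    total = a + c + g + t
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")

    def skew(x: int, y: int) -> float:
        # Guard against a window with neither base present.
        return (x - y) / (x + y) if (x + y) else 0.0

    return {
        "pct_A": 100 * a / total,
        "pct_C": 100 * c / total,
        "pct_G": 100 * g / total,
        "pct_T": 100 * t / total,
        "pct_AT": 100 * (a + t) / total,
        "pct_GC": 100 * (g + c) / total,
        "at_skew": skew(a, t),
        "gc_skew": skew(g, c),
    }
```

Calling `composition_and_skew` on a coding-region or non-coding-region sequence would give one column of the table; small differences from the published values are expected because the table entries are rounded.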
Figure 2. Strong nucleotide biases in the coding region correlate with positions of unedited and pre-edited genes. TA and GC skew analyses reflect the T-richness in protein-coding genes requiring little or no editing and G-richness in genes requiring extensive post-transcriptional RNA editing to create their coding regions. Window size = 100 bp.
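A windowed skew profile of the kind described in this caption could be computed roughly as follows. This is a minimal sketch under stated assumptions: TA skew is taken as (T − A)/(T + A) and GC skew as (G − C)/(G + C); the 100 bp window comes from the caption, while the step size, the 0-based window coordinates, and the function name `windowed_skew` are illustrative choices not given in the source.

```python
def windowed_skew(seq: str, window: int = 100, step: int = 100) -> list:
    """Return (start, ta_skew, gc_skew) for each full window of the sequence.

    Assumed convention (not stated in the source record):
        TA skew = (T - A) / (T + A)
        GC skew = (G - C) / (G + C)
    `start` is a 0-based offset; trailing partial windows are skipped.
    """
    seq = seq.upper()
    profile = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        a, c, g, t = (chunk.count(base) for base in "ACGT")
        ta = (t - a) / (t + a) if (t + a) else 0.0
        gc = (g - c) / (g + c) if (g + c) else 0.0
        profile.append((start, ta, gc))
    return profile
```

Under these conventions, a positive TA skew marks T-rich windows and a positive GC skew marks G-rich windows, which is the pattern the caption associates with unedited and extensively edited genes respectively.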
Figure 3. Dot plot analyses generated using Dottup comparing A) CL Brener and T. brucei, and B) CL Brener and L. tarentolae maxicircles. Dots represent an exact match of 10 bp.
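The idea behind an exact-word dot plot can be sketched as below. This is not Dottup's implementation; it is a simplified illustration that indexes every 10 bp word of one sequence and reports the coordinate pairs at which the same word occurs in the other. The function name `word_matches` is hypothetical, coordinates are 0-based, and only the forward strand is compared.

```python
from collections import defaultdict


def word_matches(seq_a: str, seq_b: str, word_size: int = 10) -> list:
    """Return (i, j) pairs where seq_a[i:i+word_size] == seq_b[j:j+word_size].

    Each pair corresponds to one dot in a forward-strand dot plot.
    """
    seq_a, seq_b = seq_a.upper(), seq_b.upper()

    # Index every word of seq_a by its start position.
    index = defaultdict(list)
    for i in range(len(seq_a) - word_size + 1):
        index[seq_a[i:i + word_size]].append(i)

    # Look up every word of seq_b in the index.
    dots = []
    for j in range(len(seq_b) - word_size + 1):
        for i in index.get(seq_b[j:j + word_size], ()):
            dots.append((i, j))
    return dots
```

Plotting the returned pairs with `i` on one axis and `j` on the other gives diagonal runs wherever the two sequences share longer identical stretches; comparing a sequence against itself highlights repeats in the same way.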
Average percent identities among CL Brener, Esmeraldo, T. brucei and L. tarentolae rRNAs, gene coding regions and inferred protein sequences
| Comparison | rRNAs | Edited genes | Non-edited genes | Non-edited proteins | Entire coding region |
| CL Brener vs. Esmeraldo | 92.6% | 86.2% | 89.9% | 81.7% | 88.2% |
| CL Brener vs. T. brucei | 79.6% | 57.1% | 77.8% | 79.3% | 73.3% |
| Esmeraldo vs. T. brucei | 79.8% | 55.2% | 78.5% | 81.0%** | 72.5% |
| CL Brener vs. L. tarentolae | 78.6% | 41.8%* | 75.4% | 76.0% | 65.2% |
| Esmeraldo vs. L. tarentolae | 78.9% | 41.6%* | 76.0% | 67.5% | 64.7% |
| T. brucei vs. L. tarentolae | 78.0% | 42.4%* | 76.6% | 76.4% | 64.7% |
Non-edited genes include ND5, ND4, COI, COII, ND1, MURF1, MURF2, Cyb
Extensively-edited genes include COIII, ATPase6, ND7, ND8, ND9, CR4, CR5, RPS12
MURF5 and CR3 were not included because both 5'- and 3'-gene boundaries are uncertain
* COIII, ND7 and ATPase6 were not included in Lt comparisons due to different editing patterns
** ND5 not included due to frameshifts in Esmeraldo
Entire coding region includes contiguous sequence from the start of 12S rRNA to the end of ND5
Figure 4. A) Dot plot analyses generated using Dottup comparing CL Brener maxicircle consensus vs. itself showing a repetitive region of short motifs (red) and a longer duplicated conserved element in the variable region (green). B) Expanded view of the repetitive non-coding region of the CL Brener dotplot. C) Dot plot comparison generated using Dotter software of CL Brener and Esmeraldo variable regions shows the presence of a conserved element (gray bars). For this analysis the CL Brener and Esmeraldo sequences of interest were joined for a combined comparison to self and to each other. Enlarged inset shows a conserved element with cross-structure indicative of a palindrome. D) The 39-bp imperfect palindrome within the variable region conserved element. E) Schematic representation of percent identity of variable region conserved element among all CL Brener and Esmeraldo assemblies.
Figure 5. Esmeraldo has a strain-specific deletion truncating the 5' ends of two genes. A) Schematic representation of the 236-bp deletion in Esmeraldo (34 nt of CR4, 105 nt of intergenic region, and 98 nt of ND4). B) Alignments of the N-terminus of predicted proteins from T. cruzi strains CL Brener and Esmeraldo, T. brucei and L. tarentolae. Conserved, alternative downstream start-site methionines are boxed.
Figure 6. Summary of CL Brener and Esmeraldo coding region polymorphisms. Specific indels lying in regions not processed by RNA editing in T. brucei are indicated for each strain. Indels resulting in the creation of a downstream termination codon are highlighted by an asterisk (*).
Figure 7. Postulated inheritance of maxicircle genomes superimposed upon the two hybridization events in T. cruzi. The maxicircle clades defined by Machado and Ayala [31] are overlaid on the schema of the evolutionary history [23] of T. cruzi sub-groups. In agreement with the phylogenetic comparisons among the three clades, the maxicircle donor in the first hybridization event between the DTU I (clade A, yellow) and DTU IIb (clade C, orange) strains was the DTU I parent. Open circles and dotted lines represent maxicircle inheritance. Over time, the maxicircle in the new hybrid line accumulated a unique set of mutations distinguishing it from the DTU I parental maxicircle, designated clade B (green), as seen in sibling DTUs IIa and IIc. In the second hybridization event between strains from DTU IIc and DTU IIb, the clade B maxicircle was passed on to the progeny represented by DTUs IId and IIe. The sequences assembled for this manuscript represent maxicircles from clades B (CL Brener) and C (Esmeraldo).