Jean-Stéphane Varré, Nunzio D'Agostino, Pascal Touzet, Sophie Gallina, Rachele Tamburino, Concita Cantarella, Elodie Ubrig, Teodoro Cardi, Laurence Drouard, José Manuel Gualberto, Nunzia Scotti.
Abstract
Mitochondrial genomes (mitogenomes) in higher plants can induce cytoplasmic male sterility and are implicated in nuclear-cytoplasmic interactions that affect plant growth and agronomic performance. They are larger and more complex than those of other eukaryotes, owing to their recombinogenic nature. For most plants, the mitochondrial DNA (mtDNA) can be represented as a single circular chromosome, the so-called master molecule, which includes repeated sequences that recombine frequently, generating sub-genomic molecules in various proportions. Given the worldwide importance of the potato crop, we report here the complete mtDNA sequence of two S. tuberosum cultivars, Cicero and Désirée, and a comprehensive study of its expression based on high-coverage RNA sequencing data. We found that the potato mitogenome has a multi-partite architecture, divided into at least three independent molecules that, according to our data, should behave as autonomous chromosomes. No inter-cultivar variability was detected, while comparative analyses with other species of the Solanaceae family allowed us to investigate the evolutionary history of their mitogenomes. The RNA-seq data revealed peculiarities in transcriptional and post-transcriptional processing of mRNAs. These included co-transcription of genes with open reading frames that are probably expressed, methylation of an rRNA at a position that should impact translation efficiency, and extensive RNA editing, with a high proportion of partial editing implying frequent mis-targeting by the editing machinery.
Keywords: RNA editing; Solanaceae family; comparative genomics; mitochondria; mtDNA; multichromosomal structure; potato; repeated sequences
Year: 2019 PMID: 31561566 PMCID: PMC6801519 DOI: 10.3390/ijms20194788
Source DB: PubMed Journal: Int J Mol Sci ISSN: 1422-0067 Impact factor: 5.923
Figure 1. S. tuberosum mitochondrial genome assembly. (A) The 6 unitigs output by HGAP for the Cicero mitogenome. Colored blocks on the unitigs show repeats. Labeled ticks give primer positions, with those in black in forward orientation and those in light red in reverse orientation. (B) The 3 contigs obtained after PCR validation. Colored blocks on the contigs show repeats. The larger contig could not be circularized.
List of transcribed protein-encoding genes and orfs. Genes are clustered according to transcription units.
| Gene | Start | Stop | Strand | Comments |
|---|---|---|---|---|
| Molecule 1 | | | | |
| nad1e | 3860 | 4118 | + | |
| orf119 | 12570 | 12929 | – | |
| atp1 | 13197 | 14732 | – | |
| mttB | 31292 | 32122 | – | |
| orf265a | 32244 | 33041 | – | N-term atp8 (20 codons), ELF-domain (pfam03317). Co-transcribed with mttB |
| 26S | 33385 | 36879 | – | |
| nad2cde | 50394 | 50581 | – | |
| | 52050 | 52622 | – | |
| | 55094 | 55254 | – | |
| nad5ab | 61695 | 62910 | – | |
| | 63755 | 63984 | – | |
| nad4 | 64989 | 65077 | – | |
| | 67705 | 68127 | – | |
| | 71259 | 71773 | – | |
| | 73185 | 73645 | – | |
| orf125 | 73849 | 74226 | – | |
| orf247 | 78169 | 78912 | – | |
| rps4 | 90192 | 91013 | + | No evidence of TAG created by editing. Possible non-canonical initiation at GTG codon |
| nad6 | 91707 | 92360 | + | Transcript is processed upstream of stop codon |
| nad4L | 98027 | 98329 | + | |
| atp4 | 98518 | 99114 | + | |
| orf438 | 109344 | 110660 | – | RdRp-like |
| orf141 | 110784 | 111209 | – | |
| 5S | 111720 | 111838 | – | |
| 18S | 112001 | 113946 | – | |
| orf304 | 117275 | 118189 | + | N-term pfam12725. C-term has 59% identity to hypothetical protein RirG_027070 |
| nad1d | 118391 | 118449 | + | |
| matR | 119111 | 121087 | + | |
| nad5de | 125647 | 125793 | – | |
| | 126889 | 127283 | – | |
| orf152 | 127397 | 127855 | – | |
| orf105 | 128459 | 128776 | – | |
| orf159 | 128839 | 129318 | – | RdRp-like |
| orf137 | 139197 | 139610 | – | CMS-associated protein |
| nad1a | 148955 | 149339 | + | |
| rps19 | 159585 | 159869 | + | |
| rps3 | 159883 | 159956 | + | |
| | 161026 | 162643 | + | |
| rpl16 | 162534 | 163049 | + | Editing site 162570 (96%) creates internal stop codon. Editing is conserved in Arabidopsis. It implies that there is no re-initiation of translation inside rps3. Possible initiation at GTG codon. |
| cox2 | 163299 | 163680 | + | |
| | 165066 | 165466 | + | |
| ccmC | 172753 | 173583 | + | ORF overlaps with tRNA and transcript is processed at the tRNA 5’ end, without stop codon. |
| rps19 | 208754 | 209038 | + | |
| rps3 | 209052 | 209125 | + | |
| | 210195 | 211812 | + | |
| rpl16 | 211703 | 212218 | + | |
| cox2 | 212468 | 212849 | + | |
| | 214235 | 214635 | + | |
| orf77 | 233142 | 233375 | + | |
| nad1e | 233431 | 233689 | + | |
| atp6 | 233940 | 235106 | + | Editing site 235065 creates stop codon making orf 13 aa shorter, with the same C-term as Arabidopsis atp6. |
| atp9 | 243146 | 243379 | + | Editing site 243368 creates stop codon making protein 3 aa shorter. With editing it is the same C-term as Arabidopsis atp9. |
| nad5c | 246225 | 246246 | + | |
| orf161 | 246314 | 246799 | + | |
| nad7 | 271765 | 272026 | – | |
| | 273770 | 274477 | – | |
| | 275936 | 276004 | – | |
| | 276920 | 277062 | – | |
| orf103 | 278419 | 278730 | – | |
| nad1bc | 292250 | 292441 | – | |
| | 293925 | 294007 | – | |
| rps13 | 294547 | 294897 | – | |
| nad1d | 295623 | 295681 | – | |
| orf304 | 295883 | 296797 | – | hypothetical protein RirG ( |
| 18S | 300126 | 302071 | + | |
| 5S | 302234 | 302352 | + | |
| orf141 | 302863 | 303288 | + | Similarities to region 3’ UTR of orf247 |
| orf438 | 303412 | 304728 | + | RdRp-like |
| orf320 | 310786 | 311748 | – | Chimeric orf: 5’ of atp1, 3’ region upstream nad5c. Promoter of atp1 that is present in repeat R5 |
| Molecule 2 | | | | |
| ccmFC | 6136 | 6685 | – | |
| | 7635 | 8401 | – | |
| cob | 33524 | 34705 | – | |
| sdh4 | 51176 | 51589 | – | Overlaps cox3. Real ATG might be at codon 24. The transcript is processed about 8 codons before stop codon. Internal stop codon created by partial editing (26%) at codon 93. |
| cox3 | 51517 | 52314 | – | |
| atp8 | 52947 | 53417 | – | |
| orf118 | 54391 | 54747 | – | Chimeric orf: C-term is from atp6 |
| rps1 | 54964 | 55635 | – | |
| ccmFN | 71926 | 73737 | – | |
| cox1 | 76329 | 77825 | – | Initiation codon created by editing |
| rps10 | 78075 | 78187 | – | Stop codon created by editing (78107) |
| | 78963 | 79212 | – | Initiation codon created by editing (79211). |
| rps14 * | 81682 | 82051 | – | Pseudo-gene |
| rpl5 | 82053 | 82613 | – | |
| rps12 | 99496 | 99867 | – | |
| nad3 | 99916 | 100272 | – | |
| orf265b | 100423 | 101220 | – | N-term atp8, ELF-domain (pfam03317) |
| Molecule 3 | | | | |
| ccmB | 9945 | 10565 | + | Editing site 10238 creates stop codon in about 50% of the transcripts, in the middle of the ORF. |
| rpl10 | 16440 | 16919 | – | |
| rpl2 | 17204 | 17320 | – | |
| | 19230 | 20114 | – | |
| orf210 | 20294 | 20926 | – | |
| sdh3 | 35400 | 35726 | + | |
| nad2ab | 36302 | 36453 | + | |
| | 37470 | 37862 | + | |
| nad9 | 41589 | 42161 | – | |
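
The orf names in the table appear to follow the length of the predicted protein. The following minimal sketch checks this for a few rows, assuming (this is not stated in the source) that Start and Stop are 1-based inclusive coordinates and that each interval includes the stop codon. The helper names `orf_codons` and `encoded_residues` are hypothetical.

```python
# Coordinates copied from the gene table (name, start, stop).
ORFS = [
    ("orf119", 12570, 12929),
    ("orf265a", 32244, 33041),
    ("orf125", 73849, 74226),
    ("orf247", 78169, 78912),
]


def orf_codons(start: int, stop: int) -> int:
    """Number of codons in an interval, assuming inclusive coordinates."""
    length = stop - start + 1  # inclusive interval length in nucleotides
    if length % 3 != 0:
        raise ValueError(f"interval length {length} is not a multiple of 3")
    return length // 3


def encoded_residues(start: int, stop: int) -> int:
    """Amino acids encoded, assuming the last codon is the stop codon."""
    return orf_codons(start, stop) - 1


for name, start, stop in ORFS:
    print(name, encoded_residues(start, stop))
```

Under these assumptions the residue counts match the numeric part of each orf name. The check applies only to single-exon entries; split genes such as nad2cde span several rows and would need their exon lengths summed first.
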
Figure 2. Organization of the potato mtDNA. Molecules 1, 2 and 3 are represented in green, light blue and red respectively. Gene sequences are shown below the sequence line, with protein genes in dark blue and rRNA and tRNA genes in red. Repeated sequences larger than 100 bp are shown above the sequence as orange arrows. Green bars indicate sequences of plastidial origin. Bent green and red lines indicate 5′ and 3′ transcript boundaries, respectively. Grey horizontal arrows represent major transcripts. Green horizontal arrows indicate consensus promoters found upstream of transcripts. Thin vertical lines indicate editing sites.
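
Several comments in the gene table report stop or initiation codons created by editing. The sketch below enumerates which codons a single edit could turn into a stop or start codon, assuming the C-to-U editing typical of plant mitochondria (the text above does not state the editing type). The function names are hypothetical and the code is illustrative only, not the authors' analysis pipeline.

```python
from itertools import product

STOP_CODONS = {"UAA", "UAG", "UGA"}
START_CODONS = {"AUG"}


def single_c_to_u(codon: str) -> list[str]:
    """All codons reachable by converting exactly one C to U."""
    return [codon[:i] + "U" + codon[i + 1:]
            for i, base in enumerate(codon) if base == "C"]


def codons_edited_to(targets: set[str]) -> dict[str, str]:
    """Map each unedited codon to the target codon one C-to-U edit produces."""
    hits = {}
    for bases in product("ACGU", repeat=3):
        codon = "".join(bases)
        for edited in single_c_to_u(codon):
            if edited in targets:
                hits[codon] = edited
    return hits


print("stop-creating edits:", codons_edited_to(STOP_CODONS))
print("start-creating edits:", codons_edited_to(START_CODONS))
```

Under this assumption only three codons (CAA, CAG, CGA) can become stop codons and only ACG can become AUG, which is why a created stop or start codon can be traced to a single editing site.
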
List of tRNA genes. Plastid-like tRNA genes are tagged by ‘*’. Expression is according to [46]. Absence of expression for plastid-like Cys and Val tRNA genes has been checked by northern blot (Figure S3). nd = not determined.
| tRNA Gene | Start | Stop | Strand | Editing NGS | Expression |
|---|---|---|---|---|---|
| Molecule 1 | | | | | |
| trnP(UGG) | 17220 | 17294 | - | + | |
| trnF(GAA) | 17545 | 17618 | - | + | + |
| trnS(GCU) | 17981 | 18068 | - | + | |
| trnMf(CAU) | 37458 | 37531 | - | + | |
| trnY(GUA) | 55772 | 55854 | - | + | |
| trnN(GUU) * | 56390 | 56461 | - | + | |
| trnC(GCA) | 58606 | 58676 | - | + | + |
| trnC(GCA) * | 171005 | 171076 | + | - | |
| trnI(CAU) * | 173488 | 173561 | + | nd | |
| trnMe(CAU) * | 189363 | 189435 | - | + | |
| trnG(GCC) | 201326 | 201397 | + | + | |
| trnQ(UUG) | 204702 | 204773 | + | + | |
| trnI(CAU) | 260636 | 260709 | + | + | |
| Molecule 2 | | | | | |
| trnN(GUU) * | 28511 | 28582 | + | + | |
| trnS(UGA) | 35798 | 35884 | - | + | |
| trnD(GUC) * | 43141 | 43214 | + | + | |
| trnS(GGA) * | 43901 | 43987 | + | + | |
| trnV(GAC) * | 64733 | 64804 | - | - | |
| Molecule 3 | | | | | |
| trnK(UUU) | 7326 | 7398 | - | + | |
| trnE(UUC) | 23306 | 23377 | - | + | |
| trnW(CCA) * | 40411 | 40484 | - | + | |
| trnP(UGG) | 40642 | 40715 | - | + | |
| trnH(GUG) * | 45897 | 45971 | - | + | |
List of stem-loops and T-elements.
| Transcript | End | Structure | Processing enzyme | |
|---|---|---|---|---|
| ccmFC | 5’ | trnG | RNAseZ | No |
| rps3 | 5’ | trnK | RNAseZ | No |
| rps4 | 5’ | t-element | RNAseZ | No |
| ccmFN1 | 5’ | t-element | RNAseZ | No |
| cox1 | 5’ | t-element | RNAseZ | No |
| rpl5 | 5’ | Acceptor stem-like | RNAseZ | No |
| rpl5 | 5’ | Acceptor stem-like | RNAseZ | No |
| atp6-2 | 5’ | Acceptor stem-like | RNAseZ | No |
| nad7 | 5’ | Acceptor stem-like | RNAseP | No |
| atp6-1 | 3’ | trnS | RNAseP | No |
| atp6-2 | 3’ | trnS | RNAseP | No |
| atp9 | 3’ | Double stem-loop | RNAseZ | No |
| nad1e | 3’ | Double stem-loop | RNAseZ | No |
| cox2 | 3’ | Stem-loop | RNAseZ | No |
| ccmC | 3’ | t-element | RNAseP | trnI |
| nad6 | 3’ | t-element | RNAseP | Yes |
| | | | | |
| rrn26S | 5’ precursor | trnfM | RNAseZ | |
| nad2cde | 5’ | trnY | RNAseZ | |
| non-coding RNA | 5’ | trnC | RNAseZ | |
| cob | 5’ | trnS | RNAseZ | |
| ccmC | 5’ | trnC | RNAseZ | |
| ccmC | 3’ | trnI | RNaseP | |
| atp1 | 3’ | Double stem-loop | RNAseZ | 12397–12441 |
| mttB | 3’ | Stem-loop | RNAseZ | 31111–31148 |
| nad5ab | 3’ | Double stem-loop | RNAseZ | 60956–60986 |
| orf247 | 3’ | Double stem-loop | RNAseZ | 76895–76938 |
| nad6 | 3’ | t-element | RNAseZ | 92309–92353 |
| atp4 | 3’ | Stem-loop | RNAseZ | 99142–99166 |
| nad1a | 3’ | Stem-loop | RNAseZ | 150600–150637 |
| orf438 | 3’ | Stem-loop | RNAseZ | 305247–305291 |
Figure 3. Methylation of the 18S rRNA. In about half of the reads, RNA-seq data revealed an A-to-T mismatch at position 960 of the 18S rRNA (position 968 in domain 31 of E. coli 16S rRNA). Such a mismatch is diagnostic of mis-incorporation during cDNA synthesis because of m1A methylation. (A) Comparison of the corresponding domains of the E. coli 16S rRNA (which is methylated at positions 966 and 967, m2G and m5C respectively) and of the potato 18S rRNA. Sequence differences are in lower case. (B) In the 3D structure of the bacterial ribosome, these nucleotides are close to the anticodon of the tRNA at the P site (dark blue nucleotides in the tRNA structure pairing with the mRNA), with base 966 stacking with the first anticodon nucleotide.
Mitochondrial genes encoding proteins and transcribed open reading frames (orfs) among Solanaceae species available in GenBank. The symbol ● indicates the presence of the gene; ψ, a pseudogene; -, gene loss.
| Gene |
|
|
|
|
|
|
|
|
|---|---|---|---|---|---|---|---|---|
|
| ● | ● | ● | ● | ● | ● | ● | ● |
|
| ● | ● | ● | ● | ● | ● | ● | ● |
|
| ● | ● a | ● | ● | ● | ● | ● | ● |
|
| ● | ● | ● | ● | ● | ● | ● | ● |
|
| ● | ● | ● | ● | ● | ● | ● | ● |
|
| ● | ● | ● | ● | ● | ● | ● | ● |
|
| ● | ● | ● | ● | ● | ● | ● | ● |
|
| ● | ● | ● | ● | ● | ● | ● | ● |
|
| ● | ● | ● | ● | ● | ● | ● | ● |
|
| ● | ● | ● | ● | ● | ● | ● | ● |
|
| ● | ● | ● | ● | ● | ● | ● | ● |
|
| ● | ● | ● | ● | ● | ● | ● | ● |
|
| ● | ● | ● | ● | ● | ● | ● | ● |
|
| ● | ● | ● | ● | ● | ● | ● | ● |
|
| ● | ● | ● | ● | ● | ● | ● | ● |
|
| ● | ● | ● | ● | ● | ● | ● | ● |
|
| ● | ● | ● | ● | ● | ● | ● | ● |
|
| ● | ● | ● | ● | ● | ● | ● | ● |
|
| ● | ● | ● | ● | ● | ● | ● | ● |
|
| ● | ● | ● | ● | ● | ● | ● | ● |
|
| ● | ● | ● | ● | ● | ● | ● | ● |
|
| ● | ● | ● | ● | ● | ● | ● | ● |
|
| ● | ● | ● | ● | ● | ● | ● | ● |
|
| ● | ● | ● | ● | ● | ● | ● | ● |
|
| ● | ● | ● | ● | ● | ● | ● | ● |
|
| ● | ● | ● | ● | ● | ● | ● | ● |
|
| ● | ● | ● | ● | ● | ● | ● | ● |
|
| ● | ● | ● | ● | ● | ● | ● | ● |
|
| ● | ● | ● | ● | ● | ● | ● | ● |
|
| ● | ● | ● | ● | ● | ● | ● | ● |
|
| ● | ● | ● | ● | ● | ● | ● | ● |
|
| ● | ● | ● | ● | ● | ● | ● | ● |
|
| ● | ● | ● | ● | ● | ● | ● | ● |
|
| ● | ● | ● | ● | ● | ● | ● | ● |
|
| ψ | ψ | ψ | ψ | ψ | ψ | ψ | ψ |
|
| ● | ● | ● | ● | ● | ● | ● | ● |
|
| ● | ● | ● | ● | ● | ● | ● | ● |
|
| ● | ● | ● | ● | ● | ● | ● | ● |
|
| ● | ● | ● | ● | ● | ● | ● | - |
|
| ● | ● | ● | ● | ● | ● | ● | ● |
|
| ● | ● | ● | ● | ● | ● | ● | - |
|
| ● | ● | ψ | ψ | - | - | - | - |
|
| ● | ● | ● | ● | ● | ● | ● | ● |
|
| ● | - | - | - | - | - | - | - |
|
| ● | - | - | - | - | - | - | - |
|
| ● | ● | ● | ● | - | - | - | - |
|
| ● | ● | ● | ● | ● | ● | ● | - |
|
| ● | ● | ● | ● | ψ | - | - | - |
|
| ● | ● | ● | ● | ● | ● | ● | ψ |
|
| ● | ● | ● | ● | - | - | - | - |
|
| ● | ● | ● | ● | - | - | - | - |
|
| ● | ● | ● | ● | ● | ● | ● | ψ |
|
| ● | ● | ● | ● | ● | ● | ● | ψ |
|
| ● | ● | ● | ● | ψ | - | - | - |
|
| ● | ● | ● | ● | ● | ● | ● | ● |
|
| ● | ● | ● | ● | - | - | - | - |
a = atp6 sequences available in GenBank (MF989960.1 and MF989961.1 accessions) were incomplete.
Figure 4. Representative examples of variable mitochondrial protein sequences in Solanaceae species. The primary structures of S. tuberosum ATP6 (A) and COX2 (B) proteins were compared with those of other Solanaceae species available in GenBank. (A) Red and green bars indicate non-conserved or conserved aa sequences in ATP6, respectively. (B) Cultivated (S. lycopersicum) and wild (S. pennellii) tomato have an additional 194 aa at the C-terminus of the protein compared with potato and other Solanaceae species. The numbers above red and green bars indicate the start and end positions of potato mitochondrial proteins.
Figure 5. Phylogenetic tree of Solanaceae species. Phylogram of the best maximum-likelihood (ML) tree as determined using the RAxML software from the concatemer of the coding sequences of 15 protein-coding genes. Numbers associated with branches are ML bootstrap support values. Vitis vinifera and Ipomoea nil were used as outgroups.