Denis O Omelchenko1, Maxim S Makarenko1, Artem S Kasianov1, Mikhail I Schelkunov1,2, Maria D Logacheva1,2, Aleksey A Penin1.
Abstract
Shepherd's purse (Capsella bursa-pastoris) is a cosmopolitan annual weed and a promising model plant for studying allopolyploidization in the evolution of angiosperms. Though plant mitochondrial genomes are a valuable source of genetic information, they are hard to assemble. At present, C. rubella is the only species of the genus Capsella for which a complete mitogenome is available. In this work, we assembled the complete mitogenome of C. bursa-pastoris using high-precision PacBio SMRT third-generation sequencing technology. It is 287,799 bp long and contains 32 protein-coding genes, 3 rRNAs, 25 tRNAs corresponding to 15 amino acids, and 8 open reading frames (ORFs) supported by RNAseq data. Though many repeat regions were found, none of them is longer than 1 kbp, and the most frequent structural variant originating from these repeats is present in only 4% of the mitogenome copies. The mitochondrial DNA sequence of C. bursa-pastoris differs from that of C. rubella, but not from that of C. orientalis, by two long inversions, suggesting that C. orientalis could be its maternal progenitor species. In total, 377 C-to-U RNA editing sites were detected. All genes except cox1 and atp8 contain RNA editing sites, and most of these sites lead to non-synonymous amino acid changes. Most of the identified RNA editing sites are identical to the corresponding RNA editing sites in A. thaliana.
Keywords: Capsella bursa-pastoris; RNA editing; SMRT PacBio; complete mitochondrial genome; structural variants
Year: 2020; PMID: 32276324; PMCID: PMC7238199; DOI: 10.3390/plants9040469
Source DB: PubMed; Journal: Plants (Basel); ISSN: 2223-7747
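The abstract states that most C-to-U editing sites cause non-synonymous amino acid changes. As an illustration of what that classification means, here is a minimal Python sketch that compares the codon before and after a single C-to-U edit. It assumes the standard genetic code; the names `CODON_TABLE` and `classify_c_to_u` are hypothetical and are not taken from the paper, whose own annotation pipeline is not described in this record.

```python
from itertools import product

# Standard genetic code, codons enumerated in U, C, A, G order (assumption:
# the standard code applies; '*' marks a stop codon).
_AMINO = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product("UCAG", repeat=3), _AMINO)
}


def classify_c_to_u(codon, pos):
    """Return (aa_before, aa_after, is_synonymous) for a C-to-U edit.

    codon: three-letter codon (DNA or RNA alphabet).
    pos:   0-based position of the edited C within the codon.
    """
    codon = codon.upper().replace("T", "U")
    if len(codon) != 3 or pos not in (0, 1, 2):
        raise ValueError("expected a 3-letter codon and a position in 0..2")
    if codon[pos] != "C":
        raise ValueError("the edited position must be a C")
    edited = codon[:pos] + "U" + codon[pos + 1:]
    before, after = CODON_TABLE[codon], CODON_TABLE[edited]
    return before, after, before == after


# Illustrative codons only, not sites reported in the paper.
print(classify_c_to_u("UCA", 1))  # ('S', 'L', False): non-synonymous
print(classify_c_to_u("CUC", 2))  # ('L', 'L', True): synonymous
```

Edits at the first or second codon position change the amino acid far more often than edits at the third position, so a tally of this kind separates non-synonymous from synonymous sites.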
The structural variants in the mitogenome of C. bursa-pastoris supported by at least two circular consensus sequencing (CCS) reads.
| Repeat ID | Structural Variant Type | Number of Supporting CCS Reads | Repeat Length (bp) | First Repeat Unit Position (bp) | Second Repeat Unit Position (bp) | Repeat Units’ Sequence Identity |
|---|---|---|---|---|---|---|
| Rep_1 | inversion | 13 | 854 | 54,219–55,072 | 49,506–48,653 | 98.9% |
| Rep_2 | inversion | 8 | 635 | 158,147–158,781 | 106,515–105,881 | 100.0% |
| Rep_3 | duplication | 6 | 538 | 134,019–134,556 | 116,143–116,679 | 99.8% |
| Rep_7 | deletion | 5 | 418 | 220,620–221,037 | 54,655–55,072 | 99.8% |
| Rep_10 | inversion | 4 | 420 | 220,623–221,042 | 49,067–48,648 | 97.9% |
| Rep_11 | duplication | 3 | 356 | 64,729–65,084 | 26,047–26,402 | 100.0% |
| Rep_12 | duplication | 3 | 327 | 93,763–94,089 | 62,446–62,772 | 100.0% |
| Rep_9 | deletion | 2 | 404 | 238,171–238,573 | 17,371–17,774 | 99.8% |
| Rep_11 | deletion | 2 | 356 | 64,729–65,084 | 26,047–26,402 | 100.0% |
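In the table above, every row labelled as an inversion lists its second repeat unit with descending coordinates, while the deletion and duplication rows list both units in ascending order. The following minimal Python sketch, which assumes only the coordinate format shown in the table, recovers unit length and relative orientation from those strings. The helper names `parse_span`, `span_length`, and `describe_repeat` are illustrative and do not come from the paper.

```python
from typing import Dict, Tuple


def parse_span(span: str) -> Tuple[int, int]:
    """Turn a coordinate string such as '54,219–55,072' into (start, end)."""
    # The table separates coordinates with an en dash; accept a hyphen too.
    cleaned = span.replace(",", "").replace("-", "–")
    start, end = cleaned.split("–")
    return int(start), int(end)


def span_length(span: str) -> int:
    """Length in bp of a repeat unit, regardless of its orientation."""
    start, end = parse_span(span)
    return abs(end - start) + 1


def is_reverse(span: str) -> bool:
    """True when coordinates are descending (unit on the reverse strand)."""
    start, end = parse_span(span)
    return end < start


def describe_repeat(first: str, second: str) -> Dict[str, object]:
    """Summarise a repeat pair: unit lengths and relative orientation."""
    same_strand = is_reverse(first) == is_reverse(second)
    return {
        "first_len": span_length(first),
        "second_len": span_length(second),
        "orientation": "direct" if same_strand else "inverted",
    }


# Rep_1 (inversion) and Rep_7 (deletion), coordinates copied from the table.
print(describe_repeat("54,219–55,072", "49,506–48,653"))
# {'first_len': 854, 'second_len': 854, 'orientation': 'inverted'}
print(describe_repeat("220,620–221,037", "54,655–55,072"))
# {'first_len': 418, 'second_len': 418, 'orientation': 'direct'}
```

A check of this kind also shows where the coordinates and the stated repeat length disagree by one base, which can be useful when transcribing such tables.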
Figure 1. Full-genome alignment of mtDNA of C. bursa-pastoris (top) and C. rubella (bottom). The 47,173 bp inversion block is blue, and the 5021 bp inversion block is yellow.
Figure 2. Map of the mitogenome of C. bursa-pastoris. Colored blocks on the circular axis denote genes, and white blocks within them are introns. The gray histogram on the inner ring shows the guanine-cytosine (GC) composition, with the thin dark gray line denoting 50% GC content. Arcs indicate repeats associated with structural variants: deletions are red, duplications are yellow, inversions are blue, and the orange arc marks a repeat associated with both a deletion and a duplication. Two blue regions on the circular axis represent inversions in comparison with the C. rubella mitogenome.
RNAseq-supported open reading frames (ORFs) of the C. bursa-pastoris mitogenome.
| Name * | InterProScan Predictions | BLASTp Similarity |
|---|---|---|
| | Cytochrome c oxidase, subunit II ( | |
| | 1 transmembrane region | |
| | Nothing found | hypothetical protein |
| | Member of Protein TIC214 ( | |
| | Signal peptide (located 1–17 aa) and 2 transmembrane regions | |
| | Member of Ribosomal protein L2 ( | |
| | Member of ATP synthase, F0 complex, subunit C ( | |
| | Nothing found | hypothetical protein |
* ORFs are named by their amino acid length.
Figure 3. The absolute number of RNA editing substitutions per gene (blue bars) and the relative number of RNA editing substitutions by gene length normalized to 100 bp (red bars).
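The normalization named in the Figure 3 caption can be read as the number of editing sites divided by gene length and scaled to 100 bp. A minimal Python sketch of that reading follows. The function name `edits_per_100bp` and the example numbers are hypothetical, and the exact gene length the authors used (for example, with or without introns) is not stated in this record.

```python
def edits_per_100bp(n_edits: int, gene_length_bp: int) -> float:
    """Editing sites per 100 bp of gene length (assumed normalization)."""
    if gene_length_bp <= 0:
        raise ValueError("gene length must be a positive number of base pairs")
    if n_edits < 0:
        raise ValueError("number of editing sites cannot be negative")
    return 100.0 * n_edits / gene_length_bp


# Hypothetical gene: 12 editing sites over 1,500 bp.
density = edits_per_100bp(12, 1500)
print(f"{density:.2f} edits per 100 bp")  # 0.80 edits per 100 bp
```

Scaling by length lets short, heavily edited genes be compared with long genes that carry more sites in absolute terms.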