Literature DB >> 31818248

Mitochondrial genomes of the early land plant lineage liverworts (Marchantiophyta): conserved genome structure, and ongoing low frequency recombination.

Shanshan Dong1,2, Chaoxian Zhao1,3, Shouzhou Zhang1, Li Zhang1, Hong Wu2, Huan Liu4, Ruiliang Zhu3, Yu Jia5, Bernard Goffinet6, Yang Liu7,8.   

Abstract

BACKGROUND: In contrast to the highly labile mitochondrial (mt) genomes of vascular plants, the architecture and composition of mt genomes within the main lineages of bryophytes appear stable and invariant. The available mt genomes of 18 liverwort accessions representing nine genera and five orders are syntenous except for Gymnomitrion concinnatum whose genome is characterized by two rearrangements. Here, we expanded the number of assembled liverwort mt genomes to 47, broadening the sampling to 31 genera and 10 orders spanning much of the phylogenetic breadth of liverworts to further test whether the evolution of the liverwort mitogenome is overall static.
RESULTS: Liverwort mt genomes range in size from 147 Kb in Jungermanniales (clade B) to 185 Kb in Marchantiopsida, mainly due to the size variation of intergenic spacers and number of introns. All newly assembled liverwort mt genomes hold a conserved set of genes, but vary considerably in their intron content. The loss of introns in liverwort mt genomes might be explained by localized retroprocessing events. Liverwort mt genomes are strictly syntenous in genome structure with no structural variant detected in our newly assembled mt genomes. However, by screening the paired-end reads, we do find rare cases of recombination, which means multiple concurrent genome structures may exist in the vegetative tissues of liverworts. Our phylogenetic analyses of the nuclear encoded double stand break repair protein families revealed liverwort-specific subfamilies expansions.
CONCLUSIONS: The low repeat recombination level, selection, along with the intensified nuclear surveillance, might together shape the structural evolution of liverwort mt genomes.

Entities:  

Keywords:  Bryophytes; introns; recombination; repeats; structural evolution

Mesh:

Substances:

Year:  2019        PMID: 31818248      PMCID: PMC6902596          DOI: 10.1186/s12864-019-6365-y

Source DB:  PubMed          Journal:  BMC Genomics        ISSN: 1471-2164            Impact factor:   3.969


Background

Mitochondrial (mt) genomes of vascular plants are highly variable in size (i.e., from ca. 66 Kb in Viscum scurruloideum [1] to 11.3 Mb in Silene conica [2]), and in gene content (i.e., 13 to 64 [3], excluding duplicated genes and open reading frames (ORFs)). By contrast, the mitogenomes within the three bryophyte lineages are conserved in terms of gene content, and exhibit rather narrow size variation [4]. The 40 mt genomes from 29 moss genera (e.g., [5, 6]) are typically smaller (101–141 Kb, median ~ 107 Kb) than those of other land plants, and constantly encode 40 protein-coding genes (PCGs), 24 tRNAs, and 3 rRNAs. The mt genomes of hornworts sampled from four genera [7-9] are larger in size (185–242 Kb) but smaller in gene content (21–23 PCGs, 18–23 tRNAs, 3 rRNAs). The 18 mt genomes from seven liverwort genera [10-14] are intermediate in size between those of mosses and hornworts (142–187 Kb, median ~ 164 Kb) and may comprise more genes (i.e., 39–42 PCGs, 25–27 tRNAs, and 3 rRNAs). Vascular plant mt genomes contain a variable set of introns, ranging in numbers from three (Viscum album [3]) to 37 (Selaginella moellendorffii [15]), with an average of 21. Bryophyte mt genomes tend to hold more introns than those of vascular plants, i.e., 28–38 (average 33) in hornworts [7], 26 or 27 in mosses [5], and 23–30 (average 28) in liverworts [13]. Each major land plant lineage contains a number of unique mt introns, and not a single intron is shared across all mt genomes of land plants [16]. As a result, the intron content is conserved within but not among land plant lineages [13, 16], suggesting that vast intron gains and losses happened during the early evolution of land plants, and conservative evolution maintained their stable content in the descendant lineages. Although each bryophyte group appears to hold a stable set of introns that parallels their conserved evolution of mt genomes of overall structure, recent small-scale mitogenomic studies provided evidences for distinct intron losses in leafy liverworts, and suggested retroprocessing as the likely causes [13]. Comprehensive studies with expanded taxon samplings are still needed to provide the framework for reconstructing the evolution of introns in liverwort mt genomes, and to assess the underlying causes for their variation in number. Mt genomes of vascular plants hold abundant repeated sequences, including some large repeats (> 1000 bp), more medium sized repeats (100–1000 bp), and numerous small repeats (50–100 bp). Increased repeat length and identity facilitate intragenomic recombinations [16], and may hence account for the fluid genome structure of vascular mt genomes. In contrast, the mt genomes of the three bryophyte lineages usually have fewer repeated sequences and are structurally more stable. The mt genome of mosses contains only few and moreover small repeated sequences and is hence structurally nearly static [5]. Hornworts contain relatively more repeated sequences, which may explain the few rearrangements distinguishing the four assembled mitogenomes [7]. Finally, the mt genome of liverworts contains repeated sequences of intermediate abundance and size between those of mosses and hornworts [10]. The structure of their mt genome is highly conserved except for that of Gymnomitrion concinnatum [11], for which two inversions are needed to restore collinearity with the other liverwort mt genomes. The nuclear encoded double-strand break repair (DSBR) proteins are involved in the suppression of the recombination between repeated DNA sequences [17]. Mutation of DSBR proteins in model organisms, such as Arabidopsis and Physcomitrella, often results in increased rearrangements between repeated sequences in organellar genomes that can impact the function of plastids and/or mitochondria [18-21]. The evolution of plant mt genome structure is therefore shaped by intrinsic parameters, such as repeat-mediated recombination, and extrinsic ones, such as nuclear DSBR proteins [17]. The structure of the mt genome of bryophytes is considered highly conserved and stable [5] but the possible underlying causes such as the repeat recombination level and the evolution of the nuclear encoded DSBR genes remain unexplored. At present, only representatives of five of the currently recognized 15 orders of liverworts have seen their mitogenome assembled [22]. Here we broadened the sampling to exemplars of 10 orders to 1) test whether the mt gene content and 2) the structure of the mt genome is conserved across a broader phylogenetic breadth of liverworts; 3) investigate the repeat recombination rates in liverwort mitochondrion; 4) explore the possible intrinsic and extrinsic factors responsible for the structural evolutionary pattern of liverwort mt genomes.

Results and Discussion

Liverwort mitogenomes are small in size but abundant in genes

Assemblies of high throughput sequences from total DNA extracts yielded complete circular mt genomes for 29 liverworts, representing 27 genera, 23 families, and 10 orders, of which five orders were newly sampled for their mt genomes. The sequencing depth ranges from 98 to 756 × (Additional file 1: Table S1). Including the published mt genomes from 13 species, complete liverwort mt genomes are thus available from 37 species, 31 genera, and 10 orders (Additional file 2: Table S2). On average, liverwort mt genomes are smaller (~ 167 Kb) than those of vascular plants (average, ~ 476 Kb) or hornworts (average, ~ 197 Kb), but larger than those of mosses (average, ~ 107 Kb). Among liverworts, complex thalloids hold the largest (average, ~ 185 Kb) mt genomes, about 1.3 times the size of those of the Jungermanniopsida clade B that have the smallest mt genomes (average, ~ 147 Kb). Thus, variation in liverwort mitogenome size is much narrower compared to that in angiosperms (i.e., ~ 200-fold range). The trend in the evolution of mitogenome size during the diversification of liverworts is ambiguous (Fig. 1a), compared to a distinct increase during the evolution of land plants (Fig. 1b). The rather stable exome is consistently the smallest component (~ 23%) of liverwort mt genomes (~ 37 Kb). Consequently, changes in total mitogenome size in liverworts are shaped by the variation in intron content and, as shown in other land plants, intergenic spacer size (Fig. 1b).
Fig. 1

Comparison of mt genome size, intergenic, intronic, and exonic contents among (a) liverwort genera, and (b) land plant major lineages. The order of liverworts and land plants in (a) and (b) are sorted by the phylogeny in Fig. 3. Major liverwort clades, HAP (Haplomitriopsida), MAR (Marchantiopsida), PEL (Pelliidae), MET (Metzgeriidae), POR (Porellales), JUNA (Jungermanniales clade A), JUNB (Jungermanniales clade B), JUNC (Jungermanniales clade C), are indicated in (a). The number of taxa analyzed for each group is indicated in the bracket following each group’s name in (b)

Comparison of mt genome size, intergenic, intronic, and exonic contents among (a) liverwort genera, and (b) land plant major lineages. The order of liverworts and land plants in (a) and (b) are sorted by the phylogeny in Fig. 3. Major liverwort clades, HAP (Haplomitriopsida), MAR (Marchantiopsida), PEL (Pelliidae), MET (Metzgeriidae), POR (Porellales), JUNA (Jungermanniales clade A), JUNB (Jungermanniales clade B), JUNC (Jungermanniales clade C), are indicated in (a). The number of taxa analyzed for each group is indicated in the bracket following each group’s name in (b)
Fig. 3

Heat map of mt genome rearrangements in pairwise comparisons of land plants along a phylogenetic tree based on mt nucleotide sequences (Additional file 3: Figure S1). All nodes are maximally supported (i.e., 100% bootstrap support) unless otherwise marked

The mitochondrial genome of liverworts spanning a broad phylogenetic breadth (Additional file 3: Figure S1) holds a nearly constant and large set of PCGs (39–43, Additional file 4: Figure S2). Treubia possesses the smallest set of mt PCGs (i.e., 39) among liverworts, and lacks all Cytochrome c genes (ccmB, ccmC, ccmFN and ccmFC), a characteristic first reported by Liu et al. [23] and confirmed here based on a new accession. A complete set of Cytochrome c genes is also lacking in the mitogenome of hornworts [7], lycophytes [24], and algae [16]. The nad7 gene is widespread among algae, mosses, and vascular plants, but is pseudogenized in liverworts, except in the Haplomitriopsida (Haplomitrium and Treubia) that hold the intact and potentially functional nad7, lending support to the hypothesis of the earliest divergence of this lineage in the diversification of liverworts [25]. Ribosomal protein genes have been transferred from the mitochondria to the nucleus many times during the evolution of angiosperms [26]. By contrast, all liverwort mt genome share an identical set of ribosomal protein genes, including rps8 and rps10 that rarely occur in other land plant mt genomes (Additional file 5: Table S3). Liverwort mt genomes also encode a rich set of RNA genes among land plants (Additional file 6: Figure S3), including three rRNA genes (rrn5, rrn18 and rrn26) and 25–27 tRNA genes (i.e., ± trnRucg and trnTggu). Among land plants (Additional file 5: Table S3), trnRucg is restricted to the complex thalloid liverworts, and trnTggu occurs in only few liverwort lineages (i.e., Blasia, Marchantia and Haplomitrium), as well as in mosses and hornworts. Overall, the gene content of mitogenomes seems to be highly conserved during the evolution of liverworts, which spans for at least 400 million years [27].

Intron content of liverwort mitogenomes is much variable than that of mosses

Liverwort mt genomes hold a rich intron set, ranging from 23 introns in some leafy species to 32 in all the complex thalloid species, with an average of 28 introns (Fig. 2). Species of Haplomitriopsida and Jungermanniopsida possess an average of 29 and 25 introns, respectively. Simple thalloid Pelliidae contains an average of 28 introns, whereas Metzgeriidae hold an average of 31 introns. Leafy Jungermanniidae show most cases of intron losses, especially in the mt genomes of Frullania, Radula, clade B and C of Jungermanniales (Fig. 2). Among the 17 genes disrupted by at least one intron, six (i.e., atp1, cob, cox1, nad4L, rrn18, and rrn26) vary in their intron content (Fig. 2). The three introns of cob gene are commonly present in liverworts, except for 3′ end intron cobi824g2 that is absent from Radula and clade A of Jungermanniales. The nine introns of cox1 gene are shared by all thalloid liverworts, and the first three at the 5′ end (cox1i44g2, cox1i178g2, and cox1i375g1) by almost all liverworts, but the remaining six variably lost among leafy liverworts. The respiratory chain complex I (or nad) genes possess a stable intron content, except for nad4L, for which one intron (nad4Li100g2) is lacking in a few Jungermanniales (clade B, and two species from clade A: Plagiochila, Heteroscyphus), and another one (nad4Li283g2) in Treubia. The liverwort unique rrn18 group II intron (rrn18i1065g2) is present only in Haplomitrium and complex thalloids. Two introns were found in the ribosomal gene rrn26, and intron rrn26i827g2 is present in all liverworts but clade B of the Jungermanniales, and rrn26i2352g1, a group I intron, is here newly reported and only observed in Haplomitrium.
Fig. 2

Intron content of liverwort mitochondrial genomes. The species are ordered as in the phylogeny in Fig. 3. The black circle indicates the presence of an intron; the white square the absence of an intron. Intron nomenclature follows Dombrovska and Qiu (2004) and Knoop (2004). The number of species analyzed for each genus is indicated in the bracket following each genus’s name. The numbers at the bottom of each column indicate the total intron number

Intron content of liverwort mitochondrial genomes. The species are ordered as in the phylogeny in Fig. 3. The black circle indicates the presence of an intron; the white square the absence of an intron. Intron nomenclature follows Dombrovska and Qiu (2004) and Knoop (2004). The number of species analyzed for each genus is indicated in the bracket following each genus’s name. The numbers at the bottom of each column indicate the total intron number Based on our phylogeny reconstructed from a concatenated nucleotide sequences of 41 mitochondrial protein-coding genes (Fig. 2), the early diverging lineages Haplomitriopsida and Marchantiopsida tend to hold a more complete set of introns compared to the derived clades (especially clade B of Jungermanniales) that seem to have undergone parallel reduction in intron content. The alignments (Additional file 7: Figure S4; Additional file 8: Figure S5) of the four protein coding genes with intron number variations and RNA editing site distributions in liverworts support localized retroprocessing events [13] as the most possible causes for the intron loss phenomenon observed in liverworts. First, all exons are intact and the lost introns appear to be precisely cut off from the splicing site in liverwort mt PCGs. Moreover, liverwort intron losses are more frequent toward the 3′ end of genes (i.e., cob, cox1), which is a strong characteristic of retroprocessing induced intron losses [28]. Furthermore, in some cases, the intron losses in liverworts seem to be accompanied by RNA editing site losses near the exon-intron splicing boundaries.

Intergenic spacer variations mirror the genome size variations in liverwort mitogenomes

Intergenic spacers compose the bulk of the mt genome of land plants, accounting, for example, for about 80% in vascular plant mt genome size [16]. Intergenic spacers comprise repeated sequences [29], sequences transferred from plastid [30] or nuclear genomes [31], and DNA fragments horizontally transferred from foreign donors [32, 33]. In liverworts, the intergenic spacers constitute the largest portion of mt genomes (average, ~ 81 Kb, ~ 49%), matching the combined exonic (~ 37 Kb, ~ 24%) and intronic (~ 44 Kb, ~ 27%) components (Table 1). The pattern in spacer length variation parallels that of genome size variation in liverworts (Fig. 1a). On average the complex thalloid mt genomes hold the largest spacer component (~ 91 Kb or ~ 49%), followed by that of Haplomitrium (~ 89 Kb or ~ 50%). The average spacer size decreases to ~ 83 Kb (~ 48%) in Pelliidae, 79 Kb (~ 50%) in Porellales, and ~ 74 Kb (~ 45%) in Metzgeriidae. In the Jungermanniales, the average spacer size is relatively small but varies between 84 Kb (~ 51%), 69 Kb (~ 46%), and 82 Kb (~ 51%) in clade A, B, C, respectively (Additional file 2: Table S2).
Table 1

Statistics of 47 available liverwort mitochondrial genomes. Taxa are sorted alphabetically. All ratios are calculated to the percentage of spacer length unless otherwise indicated. ORF = open reading frame; SSR = simple sequence repeat

TaxonGenBank No.Total spacer size (bp) (% of whole genome)ORF (%)Repeats (%)aSSR (%)Plastid origin (%)Nuclear origin (%)bPseudogene (%)
Aneura pinguisNC_02690175,732 (45.73)20.872.700.0015.006.4
Aneura pinguisKR81758275,139 (45.54)19.602.910.1015.105.7
Aneura pinguisKU14042774,612 (45.48)21.702.330.1015.006.2
Aneura pinguisKY24238674,865 (45.46)20.382.450.1016.105.9
Aneura pinguisKY70272276,316 (46.01)20.302.380.2014.606.1
Aneura pinguisKY70272377,158 (46.19)19.622.730.0015.306.3
Aneura pinguisMK23092577,098 (46.19)20.792.730.0015.376.3
Bazzania japonicaMK23092686,267 (53.05)16.423.820.4017.726.2
Bazzania tridensMK31853486,304 (53.06)16.073.930.4017.706.2
Blasia pusillaMK23092790,173 (48.76)10.946.380.0028.307.3
Calypogeia argutaNC_03597879,802 (50.17)8.561.940.1017.105.7
Calypogeia integristipulaNC_03597782,607 (50.66)13.163.560.1019.006.7
Calypogeia neogaeaNC_03598082,896 (51.12)14.713.470.1019.306.5
Calypogeia suecicaNC_03597982,723 (51.08)15.283.530.2017.706.6
Conocephalum conicumMK23092891,366 (49.08)12.356.090.1028.936.3
Dumortiera hirsutaMK23092986,433 (48.56)15.285.360.2028.216.9
Fossombronia cristulaMK23093685,625 (47.66)14.224.250.1020.986.6
Frullania orientalisMK23093775,479 (50.53)12.114.340.3019.594.4
Gymnomitrion concinnatumNC_04013280,799 (49.70)11.923.930.1018.205.3
Haplomitrium mnioidesMK23093189,333 (49.96)26.242.710.0012.815.3
Haplomitrium mnioidesMK23093289,333 (49.96)26.242.710.0012.815.3
Herbertus ramosusMK23094678,882 (49.82)12.794.190.2022.088.5
Heteroscyphus zollingeriMK23094779,048 (50.09)12.922.520.3016.376.4
Lepidozia trichodesMK23095487,006 (53.25)20.833.530.2015.786.7
Makinoa crispataMK23095881,459 (48.73)19.356.610.1024.837.3
Marchantia paleaceaNC_00166091,942 (49.27)14.047.610.1030.907.0
Marchantia polymorpha subsp. ruderalisNC_03750891,328 (49.05)12.537.610.1030.797.1
Metacalypogeia alternifoliaMK23095386,713 (51.99)18.183.360.2017.707.3
Metzgeria leptoneuraMK23095680,637 (47.11)18.083.700.1018.266.6
Monosolenium tenerumMK23093192,474 (49.29)15.156.820.1029.426.8
Monosolenium tenerumMK23093292,476 (49.29)15.156.800.1029.426.8
Nowellia curvifoliaMK23095261,971 (41.83)11.662.930.1014.346.7
Odontoschisma grosseverrucosumMK23095170,494 (45.28)17.433.080.2016.566.6
Plagiochila subtropicaMK23094882,736 (51.26)14.443.210.3017.176.5
Pleurozia purpureaMK23095677,352 (45.90)16.284.200.1020.487.3
Pleurozia purpureaNC_01344477,347 (45.90)15.814.190.1020.507.3
Plicanthus hirtellusMK23095972,309 (49.32)16.994.380.2019.876.8
Porella plumosaMK23093888,966 (49.80)16.815.630.1021.086.6
Radula japonicaMK23093973,160 (48.90)20.684.420.3015.135.7
Riccardia latifronsMK23095561,901 (41.62)16.472.300.2015.866.0
Riccardia planifloraMK31853562,537 (41.87)18.131.890.11014.905.1
Riccia cavernosaMK23093492,483 (49.23)12.376.770.1029.456.4
Scapania ornithopodioidesMK23095068,944 (48.22)11.813.400.1017.427.3
Treubia lacunosaNC_01612271,724 (47.45)16.602.320.0018.556.1
Treubia lacunosaMK23094372,567 (47.75)15.502.260.0018.606.1
Trichocolea tomentellaMK23094989,913 (51.74)16.733.620.8018.756.3
Tritomaria quinquedentataMG64057069,393 (48.69)15.193.100.2021.106.2

aRepeated sequences were detected by BLASTN, and sequences with similarity higher than 85% and aligned length longer than 50 bp were considered as repeat

bPutative nuclear-derived sequences were detected by blasting the whole-intergenic spacer of each species against the Marchantia polymorpha nuclear genome; blast hits with E value 0.00001 were summarized. For length calculation, overlapping regions in mitochondrial spacers were removed

Statistics of 47 available liverwort mitochondrial genomes. Taxa are sorted alphabetically. All ratios are calculated to the percentage of spacer length unless otherwise indicated. ORF = open reading frame; SSR = simple sequence repeat aRepeated sequences were detected by BLASTN, and sequences with similarity higher than 85% and aligned length longer than 50 bp were considered as repeat bPutative nuclear-derived sequences were detected by blasting the whole-intergenic spacer of each species against the Marchantia polymorpha nuclear genome; blast hits with E value 0.00001 were summarized. For length calculation, overlapping regions in mitochondrial spacers were removed The spacer region of liverwort mt genomes is composed of ORFs, pseudogene fragments, nuclear homologous sequences, dispersed repeated sequences, SSRs, and other non-coding sequences of unknown origin. Plastid-derived or horizontally transferred DNA sequences are seemingly lacking in liverwort mt genomes (Table 1), as they were in those of mosses [5]. Nuclear derived sequences generally make up the largest component with an average of 20%, ranging from 13% in Haplomitrium to over 30% in the Marchantiopsida. Considering that we used the nuclear genome of Marchantia polymorpha, the only available nuclear genome for liverworts, as the reference for the blast search, it is not surprising that Marchantia polymorpha mt spacers returned the highest percentage hit (31%) on homologous sequences. The fast evolving/long branch lineage Haplomitrium (13%) and some simple thalloid taxa (such as Aneura and Riccardia) show the lowest percentage (i.e., 15%) of homologous sequence hits. Generally, the total size of the nuclear homologous sequence content in liverworts (average, ~ 15.7 Kb) is comparable to that in mosses (average, ~ 14.7 Kb), but the percentage relative to the total spacer size is only half that of mosses (average, ~ 42% [5]), which reflects the smaller size of moss mt spacers. ORFs usually make up the second largest part of the liverwort mt spacers, with an average ratio of 16%, ranging from 12% in some Jungermanniales species to 26% in Haplomitrium. The ratio of ORF to spacer size is smallest in the Marchantiopsida (average, ~ 13%), followed by the Jungermanniopsida (average, ~ 15%), then simple thalloids (average, ~ 17%). The ORF content in liverworts is distinctively larger than in mosses (i.e., 5%). As in mosses, SSR sequences compose generally less than 1% of the combined spacer region in liverwort mt genomes. Whereas moss mitogenomes lack repeated sequences in their intergenic spacers, liverworts hold on average 4% of repeated sequences in their intergenic spacer regions, with only ~ 2% in the Haplomitriopsida and ~ 7% in the Marchantiopsida. The relatively higher repeated sequence content might allow for higher potential for mt genome recombination, as a positive correlation between the number of repeated sequences and the gene rearrangements has been suggested [5].

Liverwort mitogenomes are conserved in gene order despite repeat mediated recombinations

Repeated sequences play an important role in plant mt genome structure stability since they can mediate intragenomic homologous recombinations leading to inversions and translocations of genomic regions [1, 16, 34]. Except for the mitogenome of Gymnomitrion concinnatum of the Jungermanniales [11], all remaining available liverwort mt genomes (i.e., 46) share exactly the same gene order (Fig. 3). The mt genome of G. concinnatum needs two inversions to make it collinear with the other liverwort mt genomes, which is in stark contrast to the situation in vascular plants wherein any two mt genomes require on average 31 rearrangements to gain collinearity [5]. The stable mt genome structure of liverworts might suggest either very low recombination level in liverwort mitochondrion and/or intensified nuclear surveillance over repeat recombinations. Heat map of mt genome rearrangements in pairwise comparisons of land plants along a phylogenetic tree based on mt nucleotide sequences (Additional file 3: Figure S1). All nodes are maximally supported (i.e., 100% bootstrap support) unless otherwise marked Compared to the compact moss mt genomes with only a few small repeats shorter than 100 bp, liverwort mt genomes are much inflated, containing on average 14 pairs of small repeated sequences of 50–100 bp, and 14 pairs of medium-sized repeats of 100–900 bp (Additional file 9: Table S4). We examined the recombination rates of all repeats (535 pairs, Additional file 10: Table S5) within the 50–250 bp range for all the 29 samples and found recombination evidence for 26 repeats from 16 species (Table 2), 40% of the liverwort species show no evidence of recombination. The mitogenomes of nine leafy liverworts appear to recombine more frequently than those of complex thalloids (three species) and simple thalloids (two species). The two representatives of the early diverging Haplomitriopsida have recombinants detected. The repeats actively involved in recombination have an average size of the 103 bp with ten repeats (38%) exceeding 100 bp in length. Recombination rates range from 0.67 to 60% with a median value of 4.45%. About two thirds of the recombination events were mediated by small repeats (i.e., shorter than 100 bp) with a median recombination rate of 11.46%. Repeat length and recombination rate are not positively correlated. About half (12 out of 26) of the repeat mediated recombinations cause gene order changes and direct repeat recombinations affect genes more often than their inverted counterparts (62.5% vs 30%). Most (eight out of 10) inverted and four out of 16 direct repeat recombinations cause gene order changes. These gene order changes could give rise to alternative genome conformations, which, if they occurred in germ cells would be passed on to the offspring [35].
Table 2

Recombination frequency of repeats within 50–250 bp range for 16 liverwort mitogenomes

SpeciesLength (bp)Identity (%)DirectionPositionReads support master circle conformationReads support alternative conformationGene order state after recombinationGenes affected
Bazzania japonica5194.12direct142,103–142,15330 (68.18%)14 (31.82%)unchangedtrnSUGA
141,693–141,743
6687.88direct70,639–70,70439 (84.78%)7 (15.22%)unchangedatp8
70,053–70,118
Conocephalum conicum13893.48direct85,167–85,304148 (99.33%)1 (0.67%)unchangedcox2 and cox3
84,355–84,492
Fossombronia cristula7398.63inverted60,943–61,01467 (57.76%)49 (42.24%)unchanged
58,015–58,087
9586.32direct45,965–46,05967 (60.91%)43 (39.09%)unchanged
46,130–46,221
9387.1inverted61,247–61,33986 (81.9%)19 (18.1%)unchanged
58,038–58,130
5296.15direct55,570–55,61980 (97.57%)2 (2.43%)unchanged
55,443–55,494
7398.63direct61,341–61,41380 (98.73%)1 (1.23%)unchanged
60,943–61,014
Frullania orientalis7596inverted12,671–12,74512 (92.31%)1 (7.69%)changed
12,671–12,745
Haplomitrium mnioides I7788.31direct87,784–87,86048 (97.96%)1 (2.04%)changedcox3 and trnSGCU
40,562–40,637
241100direct140,823–141,06395 (98.96%)1 (1.04%)changed
132,233–132,473
5190.2direct14,195–14,24535 (97.2%)1 (2.8%)unchanged
14,085–14,135
Heteroscyphus zollingeri25396.05direct66,972–67,22327 (96.43%)1 (3.57%)changedrps7
21,339–21,591
Monosolenium tenerum I13892.03direct86,651–86,78794 (98.95%)1 (1.05%)unchangedcox2 and cox3
85,840–85,977
Nowellia curvifolia120100inverted88,658–88,77731 (60.78%)20 (39.22%)changedcob
5529–5648
13898.55inverted88,962–89,09962 (78.48%)17 (21.52%)changedcob
5392–5529
6493.75inverted14,0191–14,025330 (96.77%)1 (3.23%)changed
14,0191–14,0253
Odontoschisma grosseverrucosum5294.23inverted94,954–95,00532 (40%)48 (60%)changedcob
5637–5688
Pleurozia purpurea14098.57direct113,528–113,66721 (95.46%)1 (4.54%)unchangedatp1
108,644–108,788
Plicanthus hirtellus5498.15inverted112,052–112,10531 (48.44%)33 (51.56%)changed
84,436–84,489
Porella plumosa6290.32inverted45,426–45,48520 (95.24%)1 (4.76%)changed
45,426–45,485
Radula japonica11489.47direct68,120–68,22749 (98%)1 (2%)unchangedcox2 and cox3
67,321–67,434
18999.47direct142,227–142,41567 (98.53%)1 (1.47%)changed
79,402–79,590
Riccia cavernosa7787.01inverted110,110–110,18566 (95.65%)3 (4.35%)changed
110,110–110,185
Treubia lacunosa13793.43direct72,748–72,88482 (98.8%)1 (1.2%)unchangedcox2 and cox3
72,001–72,137
Trichocolea tomentella6392.06direct151,893–15,195514 (82.35%)3 (17.65%)unchangedtrnSUGA
151,475–15,1537
Recombination frequency of repeats within 50–250 bp range for 16 liverwort mitogenomes Although empirical studies suggest that repeats longer than 50 bp and with an identity above 85% may mediate recombinations [17, 34], the recombination activity of repeats is actually positively correlated with the length of repeated sequences, with small repeats (< 100 bp) rarely inducing recombinations [36, 37]. The detection of repeat recombinations for two thirds of liverwort species with an average of two active repeats per species might indicate repeat recombinations occurred, but in low frequencies. Considering the average of five repeats (per species) longer than 250 bp not investigated for recombinations, it is very likely that some of these larger repeats (250–900 bp) may also allow for recombination. As recombination and structure fluidity are supposed to be positively correlated [1, 4, 5], the stable mt genome structure across liverwort diversity that spans over than 400 myr [27] is surprising given that liverwort mt genomes indeed recombine and alternative genome conformations coexisted. The apparent paradox of structural stasis of the mitogenome during the evolution of liverworts despite evidence of ongoing recombination may be addressed in the following three contexts. First, as some authors reported the adaptive value of repeats and recombinations in plant mitochondrion [37], recombinations happen more frequently when the cells are in stress and/or under some environmental stimuli [38], it is likely that the low recombinations observed in liverworts might primarily happen in the old vegetative cells rather than in young differentiating cells and germ cells, therefore the liverwort progeny inherited the master circle conformation. Second, if repeat recombinations occurred in the reproductive cells, those recombinants with genes affected or genes missing might be selected against. Recent studies in Drosophila also suggested a generalizable mechanism for selection against deleterious alternative mt configurations: ATP selection after mitochondrion fragmentation can retain those mtDNA fragments that contain the ‘correct’ mt genome with the complete set of the genes [39]. Nevertheless, those alternative conformations with altered gene order but no genes affected may survive this ATP selection and might possibly lead to offspring with rearranged gene order. Finally, It is also possible that maintaining the organization of genes is essential to the transcription of polycistronic operons in liverworts and mosses [5], hence an identical gene order is selected across all liverwort lineages despite the existence of alternative genome conformations.

Liverwort mt genomes might be under intensified nuclear surveillance

The structural stability of liverwort mt genomes, in accordance with the remarkable structural conservatism of liverwort plastomes [40], might also be shaped by nuclear encoded DSBR proteins that suppress the error-prone ectopic recombinations across small direct repeats during DSBR in mitochondrion and plastid [41]. We here characterized six frequently reported DSBR gene families in the transcriptome assemblies of 125 land plant representatives (Additional file 11: Table S6). Functional studies have confirmed the mtDNA repair function of the moss orthologs in Physcomitrella patens for four of these gene families, RecA, RecG, RecX, and MSH1. In our phylogenetic analyses (Additional file 12: Figure S6), four DSBR gene families (RecA, RecG, RecX, and OSB) show notable liverwort-specific subfamily expansions, suggesting peculiar evolutionary pattern of DNA repair mechanisms in liverworts. The subcellular localization of these DSBR proteins (as predicted by TargetP) yielded ambiguous results for most of these DSBR proteins, including those expanded ones, suggesting either truncated protein sequences or incomplete query database. Although the subcellular localization (as predicted by TargetP), and the in vivo function of these expanded liverwort gene family members remain elusive, we could not rule out the possibility that these liverwort specific expansions might also contribute to the mtDNA sequence stability of liverworts. Specifically, selective pressure might have driven the functional divergence of DSBR gene families under different evolutionary constraints, and that those diversified DSBR proteins may exhibit a wider range of characteristics and specificities, which possibly help intensify the DNA repair mechanisms in liverworts.

Conclusions

Based on the assembly of mitochondrial genomes for a widely expanded sample of liverworts spanning over 400 million years of evolution [27], we a) confirm that the mitogenome of liverworts is highly conserved in both gene content and gene order, while having a large and variable set of introns (23–32 introns) subject to a series of localized retroprocessing events, b) reveal ongoing intragenomic recombination mediated by small repeats and c) show liverwort specific DSBR subfamily expansions in four gene families. Although the mt genomes are relatively conserved in each of the three major bryophyte lineages (i.e., liverworts, mosses and hornworts), they differ in the content of repeat sequences and frequency of rearrangements, suggesting that the mechanisms maintaining mitogenome structure differ among these lineages. The conserved evolution of liverwort mt genomes might be explained by low recombination level, selection, and intensified nuclear surveillance.

Methods

Materials and DNA extractions

Wild samples of liverwort materials of 29 accessions were collected in field trips from China (Xizang, Sichuan, Yunnan, Fujian, and Guangzhou Provinces), Madagascar, the United States (Connecticut), Vietnam, and New Zealand (Additional file 1: Table S1). No specific permissions were needed on current study of these samples. These samples were studied and identified by Qiang He, and the voucher specimens were deposited in SZG (Herbarium of Shenzhen Fairy lake Botanical Garden, Shenzhen, China). Our sampling spreads across the liverwort phylogeny, representing 10 of the 15 orders of extant liverworts. Each sample was cleaned with distilled water and dried using lab paper. The clean shoots of individual samples were isolated under the dissecting microscope and used for DNA extractions using the modified CTAB methods [42]. The DNA quality and quantity of each sample were examined using 1% Agarose gel electrophoresis and Nanodrop 2000 spectrophotometer.

Mitochondrial genome sequencing and assembly

For genomic DNA sequencing, 1 μg high quality genomic DNA was sheared using the Covaris M220 (Woburn, MA, USA), DNA fragments of 300–500 bp were used to generate sequencing libraries using the Illumina TruSeq™ DNA PCR-free library preparation kit (Illumina, CA, USA) following the manufacturer’s instructions. The libraries were paired-end (2 × 150 bp) sequenced on an Illumina HiSeq 2000 sequencing platform at the WuXiNextCode (Shanghai, China). Approximately 10 Gb sequencing data were generated for each sample. The raw NGS data were trimmed and filtered for adaptors, low quality reads, undersized inserts, and duplicate reads using Trimmomatic [43]. The filtered reads for each species were de novo assembled using CLC Genomics Workbench v5.5 (CLC Bio, Aarhus, Denmark). All assembled contigs were blasted to the Marchantia polymorpha mt genome (GenBank accession: NC_001660) to identify mt contigs. As the genomes of the three cellular compartments are significantly different in copy numbers and hence read coverage in sequencing [44], the read depth for mt contigs is distinctly lower than that of plastid contigs, but significantly higher than that of nuclear contigs [45]. For each species, every resulting mt contigs (usually with one to five per species) was first checked for read depth to ensure they are in a similar range, and then each of these mt contig was elongated at both ends in Geneious v10.0.2 (Biomatters, New Zealand) using Bowtie2 [46] till their ends overlapped with one another by at least 5000 bp. Altogether, 29 circular mt genomes were assembled for liverworts. Finally, the corresponding genomic reads were mapped back to the complete mitochondrial genomes to double check for sequencing depth (Additional file 1: Table S1) of the assembled mt genomes. All the 29 liverwort mt genomes newly assembled in this study received constant depth along the mt genomes.

Genome annotation and comparative analysis

The mt genomes were annotated following the steps described by Li et al. [9] and Xue et al. [8]. In summary, the protein-coding and rRNA genes were annotated by Blastn searches of the non-redundant database of National Center for Biotechnology Information (NCBI). The exact gene and exon/intron boundaries were further verified in Geneious v10.0.2 (Biomatters, New Zealand) by aligning orthologous genes from the published annotated liverwort mitochondrial genomes. The tRNA genes were annotated using tRNAscan-SE v2.0 [47]. Mitochondrial RNA editing sites were obtained from our previous study on liverwort organellar RNA editing [48]. We summarized and compared the structural evolution of all the formally published (as by June 2019) liverwort mt genomes (Additional file 2: Table S2) from the gff3 annotation files. As accessions from the same species/genus show very similar characters in mt genomes in all aspects, directly averaging the values across all organisms would bias towards better-represented lineages. Therefore, to avoid such biases we calculated the average values under three different strategies, if multiple accessions of a species exist: (1) we average them first and then calculate the average value among species (e.g., genome/exon/intron length, and gene/intron number); (2) similarly for among genera, we averaged the values of species, and (3) for among clades, we averaged the value of each genus.

Repeats and recombination analysis

Repeat sequences are annotated and analyzed (Additional file 9: Table S4) using the python tool developed by Wynn & Christensen [37]. Considering at least 50 bp matches at both ends of the insert fragment and an average size of insert fragments being 350 bp in our sequencing libraries, we can estimate the recombination rate of all non-overlapping repeats between 50 bp and 250 bp. Altogether, we analyzed 534 repeats for the 29 liverwort mt genomes (Additional file 10: Table S5) following Dong et al. [36]. Specifically, for each repeat pair, we built four or eight reference sequences, each with 200 bp up- and down-stream of the two template sequences (original sequences), and the two (for repeat pair with identity = 100) or six (for repeat pair with identity < 100) recombined sequences (alternative configurations) constructed from the putative recombination products. All the recombined templates were blasted against the corresponding mt genome sequences to identify those on the genome. Then, we searched the reference sequences against the corresponding paired-end reads database, and counted the number of matching read pairs with a blast identity above 98%, and a hit coverage of at least 50 bp in both flanking regions of each repeat sequence. After that, the best matched read pairs for all the recombinants were extracted and mapped to the corresponding mt genomes using Bowtie2 [46] and the resultant bam file was visualized in Geneious v10.0.2 (Biomatters, New Zealand) to confirm the existence of the alternative configurations in reads.

Gene alignment and phylogenetic reconstruction

For the phylogenetic reconstruction, sequences of each of the 41 mitochondrial genes were aligned using TranslatorX [49], which first translates the nucleotide (nt) sequence into amino acids (aa) using the standard, universal genetic code, and then uses MAFFT [50] to create an amino acid alignment. The alignment is further trimmed for ambiguous portions by GBLOCKS [51] with the least stringent settings. The cleaned aa alignment is then used as a guide to generate the nt sequence alignment. The resultant nt alignments of individual genes were then combined as one dataset in Geneious v10.0.2 (Biomatters, New Zealand). The concatenated nt dataset was analyzed by the maximum likelihood (ML) method as implemented in the RAxML software [52] with codon-partitioned GTRGAMMA model. Branch support for each internode was evaluated with 300 bootstrap approximation (Additional file 3: Figure S1). The mitochondrial phylogeny of liverworts is mostly congruent with that of the plastid phylogeny [53].

Identification of DSBR proteins and phylogenetic analysis

We used HMMER [54] to conduct profile hidden Markov model (HMM) searches using the Pfam RecA (PF00154), SSB (PF00436), Whirly (PF08536), RecX (PF02631), MutS_V (PF00488), DEAD and Helicase_C (PF00270, PF00271) domains as queries to search the annotated proteins from 125 plant species (Additional file 11: Table S6) for characterization of the gene family members of RecA, OSB, Why, RecX, MSH, and RecG, respectively. For gene loci with multiple isoforms predicted, the primary isoform was used if primary isoform annotation is available; otherwise the longest protein was used. We considered sequences with Pfam domain identified by HMMER with an E-value of 1e-6, and the hit alignment length of at least 50% of the domain length to be candidate proteins. Those candidate proteins from the HMMER search were further manually confirmed using the SMART [55] and Pfam [56] databases. The proteins sequences were filtered for redundancy using CD-hit [57] with a similarity threshold of 0.95, aligned using MATTF [50], and trimmed for ambiguous portions by GBLOCKS with the least stringent settings [51]. The final protein alignment were analyzed by the maximum likelihood (ML) method as implemented in the IQ-TREE software [58] with 1000 ultrafast bootstrap replicates. The subcellular localizations of these DSBR proteins were predicted using online TargetP 1.1 server [59]. Additional file 1: Table S1. Overview of the 29 newly generated liverwort mitochondrial genomes. Additional file 2: Table S2. Comparison of the genome content of the 47 liverwort mitochondrial genomes. Additional file 3: Figure S1. ML phylogram of selected land plants with an emphasis on liverworts based on a concatenated nucleotide data set. Additional file 4: Figure S2. Distribution of mitochondrial protein genes in 31 liverwort genera. Additional file 5: Table S3. Comparison of gene content and gene order from 88 selected land plant mitochondrial genomes. Additional file 6: Figure S3. Distribution of mitochondrial RNA genes in 31 liverwort genera. Additional file 7: Figure S4. RNA editing site variations and intron losses in the cox1 gene as an exemplar. Additional file 8: Figure S5. RNA editing site distributions on three gene alignments in a phylogenetic context. Additional file 9: Table S4. Distribution of repeat sizes among liverwort mitochondrial genomes. Additional file 10: Table S5. Repeat recombinations for 534 repeats within 50 to 300 bp range in liverwort mitochondria. Additional file 11: Table S6. Taxa list for DSBR protein identification. Additional file 12: Figure S6. Phylogenetic trees of DSBR protein sequences from 125 Viridiplantae taxa inferred by Iqtree.
  49 in total

1.  Small, repetitive DNAs contribute significantly to the expanded mitochondrial genome of cucumber.

Authors:  J W Lilly; M J Havey
Journal:  Genetics       Date:  2001-09       Impact factor: 4.562

Review 2.  Small repeated sequences and the structure of plant mitochondrial genomes.

Authors:  C André; A Levy; V Walbot
Journal:  Trends Genet       Date:  1992-04       Impact factor: 11.639

3.  Mitochondrial genome dynamics in plants and animals: convergent gene fusions of a MutS homologue.

Authors:  Ricardo V Abdelnoor; Alan C Christensen; Saleem Mohammed; Bryan Munoz-Castillo; Hideaki Moriyama; Sally A Mackenzie
Journal:  J Mol Evol       Date:  2006-07-07       Impact factor: 2.395

4.  Cd-hit: a fast program for clustering and comparing large sets of protein or nucleotide sequences.

Authors:  Weizhong Li; Adam Godzik
Journal:  Bioinformatics       Date:  2006-05-26       Impact factor: 6.937

5.  Improvement of phylogenies after removing divergent and ambiguously aligned blocks from protein sequence alignments.

Authors:  Gerard Talavera; Jose Castresana
Journal:  Syst Biol       Date:  2007-08       Impact factor: 15.683

6.  Origins and recombination of the bacterial-sized multichromosomal mitochondrial genome of cucumber.

Authors:  Andrew J Alverson; Danny W Rice; Stephanie Dickinson; Kerrie Barry; Jeffrey D Palmer
Journal:  Plant Cell       Date:  2011-07-08       Impact factor: 11.277

7.  The plant-specific ssDNA binding protein OSB1 is involved in the stoichiometric transmission of mitochondrial DNA in Arabidopsis.

Authors:  Vincent Zaegel; Benoît Guermann; Monique Le Ret; Charles Andrés; Denise Meyer; Mathieu Erhardt; Jean Canaday; José M Gualberto; Patrice Imbault
Journal:  Plant Cell       Date:  2006-12-22       Impact factor: 11.277

8.  Predicting subcellular localization of proteins based on their N-terminal amino acid sequence.

Authors:  O Emanuelsson; H Nielsen; S Brunak; G von Heijne
Journal:  J Mol Biol       Date:  2000-07-21       Impact factor: 5.469

9.  Massive gene loss in mistletoe (Viscum, Viscaceae) mitochondria.

Authors:  G Petersen; A Cuenca; I M Møller; O Seberg
Journal:  Sci Rep       Date:  2015-12-02       Impact factor: 4.379

10.  The complete mitochondrial genome of the early flowering plant Nymphaea colorata is highly repetitive with low recombination.

Authors:  Shanshan Dong; Chaoxian Zhao; Fei Chen; Yanhui Liu; Shouzhou Zhang; Hong Wu; Liangsheng Zhang; Yang Liu
Journal:  BMC Genomics       Date:  2018-08-14       Impact factor: 3.969

View more
  7 in total

1.  An empirical analysis of mtSSRs: could microsatellite distribution patterns explain the evolution of mitogenomes in plants?

Authors:  Karine E Janner de Freitas; Carlos Busanello; Vívian Ebeling Viana; Camila Pegoraro; Filipe de Carvalho Victoria; Luciano Carlos da Maia; Antonio Costa de Oliveira
Journal:  Funct Integr Genomics       Date:  2021-11-09       Impact factor: 3.410

2.  The complete mitochondrial genome of new species candidate of Rosa rugosa (Rosaceae).

Authors:  Jongsun Park; Hong Xi; Yongsung Kim; Suhwan Nam; Kyeong-In Heo
Journal:  Mitochondrial DNA B Resour       Date:  2020-09-29       Impact factor: 0.658

3.  Comparative mitogenome analyses uncover mitogenome features and phylogenetic implications of the subfamily Cobitinae.

Authors:  Peng Yu; Li Zhou; Wen-Tao Yang; Li-Jun Miao; Zhi Li; Xiao-Juan Zhang; Yang Wang; Jian-Fang Gui
Journal:  BMC Genomics       Date:  2021-01-14       Impact factor: 3.969

4.  The complete mitochondrial genome of Scapania ampliata Steph., 1897 (Scapaniaceae, Jungermanniales).

Authors:  Seung Se Choi; Juhyeon Min; Woochan Kwon; Jongsun Park
Journal:  Mitochondrial DNA B Resour       Date:  2021-03-01       Impact factor: 0.658

5.  The complete mitochondrial genome of Wiesnerella denudata (Mitt.) Steph. (Wiesnerellaceae, Marchantiophyta): large number of intraspecific variations on mitochondrial genomes of W. denudata.

Authors:  Seung Se Choi; Juhyeon Min; Woochan Kwon; Jongsun Park
Journal:  Mitochondrial DNA B Resour       Date:  2020-09-21       Impact factor: 0.658

6.  The complete mitochondrial genome of Cycas debaoensis revealed unexpected static evolution in gymnosperm species.

Authors:  Sadaf Habib; Shanshan Dong; Yang Liu; Wenbo Liao; Shouzhou Zhang
Journal:  PLoS One       Date:  2021-07-22       Impact factor: 3.240

7.  The draft mitochondrial genome of Magnolia biondii and mitochondrial phylogenomics of angiosperms.

Authors:  Shanshan Dong; Lu Chen; Yang Liu; Yaling Wang; Suzhou Zhang; Leilei Yang; Xiaoan Lang; Shouzhou Zhang
Journal:  PLoS One       Date:  2020-04-15       Impact factor: 3.240

  7 in total

北京卡尤迪生物科技股份有限公司 © 2022-2023.