
The first complete plastid genomes of Melastomataceae are highly structurally conserved.

Marcelo Reginato, Kurt M Neubig, Lucas C Majure, Fabian A Michelangeli.

Abstract

BACKGROUND: In the past three decades, several studies have predominantly relied on a small sample of the plastome to infer deep phylogenetic relationships in the species-rich Melastomataceae. Here, we report the first full plastid sequences of this family, compare general features of the sampled plastomes to other sequenced Myrtales, and survey the plastomes for highly informative regions for phylogenetics.
METHODS: Genome skimming was performed for 16 species spread across the Melastomataceae. Plastomes were assembled, annotated and compared to eight sequenced plastids in the Myrtales. Phylogenetic inference was performed using Maximum Likelihood on six different data sets, where putative biases were taken into account. Summary statistics were generated for all introns and intergenic spacers with suitable size for polymerase chain reaction (PCR) amplification and used to rank the markers by phylogenetic information.
RESULTS: The majority of the plastomes sampled are conserved in gene content and order, as well as in sequence length and GC content within plastid regions and sequence classes. Departures include the putative presence of rps16 and rpl2 pseudogenes in some plastomes. Phylogenetic analyses of the majority of the schemes analyzed resulted in the same topology with high values of bootstrap support. Although there is still uncertainty in some relationships, in the highest supported topologies only two nodes received bootstrap values lower than 95%.
DISCUSSION: Melastomataceae plastomes are no exception to the general patterns observed in the genomic structure of land plant chloroplasts, being highly conserved and structurally similar to most other Myrtales. Although the full plastome phylogeny shares most of its clades with the previously widely used reduced data set, some changes are still observed and bootstrap support is higher. The plastome data set presented here is a step towards phylogenomic analyses in the Melastomataceae and will be a useful resource for future studies.

Keywords:  Chloroplast; Genome skimming; Melastomataceae; Myrtales; NGS; Phylogenomics; Plastome

Year:  2016        PMID: 27917315      PMCID: PMC5131623          DOI: 10.7717/peerj.2715

Source DB:  PubMed          Journal:  PeerJ        ISSN: 2167-8359            Impact factor:   2.984


Introduction

The Melastomataceae Juss. has over 5,000 species distributed predominantly across the tropical regions. The observed levels of diversity, endemism or abundance of its members across different habitats make the family an important ecological group, as well as an excellent model for a variety of evolutionary studies. The Melastomataceae belong to the Myrtales, where the family is sister to the small CAP clade (Crypteroniaceae, Alzateaceae and Penaeaceae); together they form a clade sister to Myrtaceae + Vochysiaceae (Berger et al., 2016). Plastid markers along with the nuclear ribosomal spacers (nrETS and nrITS) have been the major, and very often the exclusive, source of phylogenetic information in the family. The debut of Melastomataceae in molecular phylogenies was in a Myrtales-focused study, based on a partial amino acid sequence of the rbcS gene (Martin & Dowd, 1986). This study was followed by a more comprehensive nucleotide-based phylogeny, where the plastid rbcL gene was analyzed (Conti, Litt & Sytsma, 1996). The first Melastomataceae-wide phylogeny used a plastid data set including the rbcL and ndhF genes plus the rpl16 intron (Clausing & Renner, 2001). This plastid data set is still the most employed source of information in studies focusing on generic relationships across the family (Fritsch et al., 2004; Renner, 2004; Amorim, Goldenberg & Michelangeli, 2009; Michelangeli et al., 2011; Goldenberg et al., 2012; Michelangeli, Ulloa & Sosa, 2014; Goldenberg et al., 2015; Zeng et al., 2016). Phylogenetic studies within lower lineages of Melastomataceae have predominantly used the plastid spacers accD-psaI, atpF-atpH, psbK-psbI, and trnS-trnG, along with the ribosomal spacers nrETS and nrITS (Bécquer-Granados et al., 2008; Reginato, Michelangeli & Goldenberg, 2010; Kriebel, Michelangeli & Kelly, 2015; Reginato & Michelangeli, 2016). Recently, the latter data set has also been used in deeper-level studies (Michelangeli et al., 2013; Rocha et al., 2016).
Family-wide phylogenetic studies based on plastid markers have uncovered major relationships in the Melastomataceae, with several implications for the classification and evolutionary understanding of the family. Early studies consolidated the sister relationship of Olisbeoideae and the remaining Melastomataceae, settling on the currently accepted family circumscription (Conti, Litt & Sytsma, 1996; Angiosperm Phylogeny Group, 1998; but see Clausing & Renner, 2001 for a different perspective). Later studies focused on some tribal rearrangements (Fritsch et al., 2004; Penneys et al., 2010; Michelangeli et al., 2011), generic placement (Amorim, Goldenberg & Michelangeli, 2009; Goldenberg et al., 2012; Michelangeli, Ulloa & Sosa, 2014; Goldenberg et al., 2015; Kriebel, 2016; Rocha et al., 2016; Zeng et al., 2016), phylogenetic evaluation of higher species-rich lineages (Michelangeli et al., 2004; Stone, 2006; Goldenberg et al., 2008; Martin et al., 2008; Michelangeli et al., 2008; Michelangeli et al., 2013), and lower taxon phylogenies (Bécquer-Granados et al., 2008; Reginato, Michelangeli & Goldenberg, 2010; Penneys, 2013; Kriebel, Michelangeli & Kelly, 2015; Gamba-Moreno & Almeda, 2014; Majure et al., 2015; Reginato & Michelangeli, 2016). Even in family-wide phylogenies, the level of variation across these few sampled plastid markers is unsatisfactory, as evidenced by low statistical support for many relationships in different published analyses. This issue becomes more prominent in phylogenetic analyses of lineages within Melastomataceae, where the plastid phylogeny is overall weakly supported, and concatenated results tend to be dominated by the more variable nuclear ribosomal data (Reginato, Michelangeli & Goldenberg, 2010; Reginato & Michelangeli, 2016). Phylogenomic studies are sparse in the Myrtales and absent in the Melastomataceae.
Currently, there are 54 full plastids of Myrtales on the NCBI database, covering three out of the nine families in the order (Lythraceae, Myrtaceae and Onagraceae). Full plastomes can potentially improve hypotheses of phylogenetic relationships within the family, as well as in the Myrtales, and provide basic information for other aspects of molecular biology (e.g., DNA barcoding, plastome evolution, development of molecular markers). Here, we present the first complete plastid genomes in the Melastomataceae, covering 16 species spread across the family. The objectives of this study are to describe the structure of the sampled plastomes; compare main features of the plastomes within the family and to other available Myrtales plastomes; and survey the plastomes for highly informative phylogenetic markers for future use.

Material and Methods

Taxon sampling, DNA extraction and sequencing

Genome skimming was performed for 16 species of Melastomataceae. Sampling was based on previous family-wide phylogenetic studies (Michelangeli, Ulloa & Sosa, 2014; Goldenberg et al., 2015), where each sample belongs to a different major lineage of the family, either with a formal tribe status or not. Voucher information along with GenBank accession codes are presented in Table 1. Total genomic DNA was isolated from silica-dried tissue either with the Qiagen DNeasy Plant Mini Kit (Qiagen, Valencia, CA, USA), following the protocol suggested by Alexander et al. (2007), or with a modified CTAB extraction in which the aqueous supernatant was silica-column purified (Neubig et al., 2014). Total DNA samples were quantified using a NanoDrop Spectrophotometer (Thermo Scientific, Waltham, MA, USA) or Qubit 2.0 (Invitrogen, Carlsbad, CA, USA). Total genomic library preparation and barcoding were performed at Cold Spring Harbor Laboratories or at Rapid Genomics (Gainesville, FL, USA) for sequencing on an Illumina HiSeq2000 platform (Illumina, Inc., San Diego, CA, USA).
Table 1

Voucher information and GenBank accessions of the chloroplast genomes sequenced in the Melastomataceae.

| Species | Tribe/“clade” | GenBank | Voucher | Herbarium |
|---|---|---|---|---|
| Allomaieta villosa (Gleason) Lozano | Cyphostyleae | KX826819 | David, H. 2188 | HUA, NY |
| Bertolonia acuminata Gardner | Bertolonieae | KX826820 | Goldenberg, R. 810 | NY, UPCB |
| Blakea schlimii (Naudin) Triana | Blakeeae | KX826821 | Michelangeli, F.A. 1227 | NY |
| Eriocnema fulva Naudin | “Eriocnema” | KX826822 | Almeda, F. 8416 | CAS |
| Graffenrieda moritziana Triana | Merianieae | KX826823 | Michelangeli, F.A. 832 | NY |
| Henriettea barkeri (Urb. & Ekman) Alain | Henrietteeae | KX826824 | Ionta, G. 2029 | FLAS |
| Merianthera pulchra Kuhlm. | “Cambessedesia” | KX826825 | Goldenberg, R. 1153 | NY, UPCB |
| Miconia dodecandra Cogn. | Miconieae | KX826826 | Michelangeli, F.A. 758 | NY |
| Nepsera aquatica (Aubl.) Naudin | “Marcetia” | KX826827 | Michelangeli, F.A. 1998 | NY |
| Opisthocentra clidemioides Hook. f. | Unplaced | KX826828 | Caddah, M.K. 578 | NY, UPCB |
| Pterogastra divaricata (Bonpl.) Naudin | Melastomeae | KX826829 | Michelangeli, F.A. 540 | NY |
| Rhexia virginica L. | Rhexieae | KX826830 | Michelangeli, F.A. 1448 | NY |
| Rhynchanthera bracteata Triana | Microlicieae | KX826831 | Zenteno, F. 8801 | NY |
| Salpinga maranoniensis Wurdack | Merianieae | KX826832 | Clark, J.L. 13577 | UNA |
| Tibouchina longifolia (Vahl) Baill. | Melastomeae | KX826833 | Majure, L. 4277 | FLAS |
| Triolena amazonica (Pilg.) Wurdack | “Triolena” | KX826834 | Michelangeli, F.A. 1366 | NY |

Note:

Informal clades are quoted.


Plastid genome assembly and annotation

The total read yield averaged ca. 11.5 Gb per sample (s.d. = 6 Gb). Paired reads were imported into Geneious 7.1 (Biomatters Ltd., Auckland, New Zealand), trimmed by quality (at 0.05 probability) and de novo assembled (Geneious Assembler, “low sensitivity” option, default settings). Filtered assembled contigs (length > 1 kb) were blasted against the Eucalyptus polybractea plastome (NC022393). The identified plastid contigs were then reference assembled against the E. polybractea plastome in order to generate a single contig to construct the circular maps. Any remaining short gaps were filled by iteratively mapping the total paired reads against the contig ends. Plastid annotation was performed in Geneious 7.1 with Arabidopsis thaliana (NC000932) and Eucalyptus polybractea (NC022393) as references. Graphical representations of the plastid circular and linear maps were generated with OGDRAW (Lohse et al., 2013) and the R package genoPlotR (R Development Core Team, 2016; Guy, Kultima & Andersson, 2010). Plastome structure, gene content, and general characteristics of the plastid genome were compared among the 16 Melastomataceae plastomes and to eight published plastomes of Myrtales, covering all families in this order available on the NCBI website. The Myrtales plastomes included one species in the Lythraceae (Lagerstroemia fauriei–NC029808), one Onagraceae (Oenothera grandiflora–NC029211) and six Myrtaceae (Allosyncarpia ternata–NC022413; Angophora costata–NC022412; Corymbia gummifera–NC022407; Eucalyptus polybractea–NC022393; Eugenia uniflora–NC027744; and Stockwellia quadrifida–NC022414).

Phylogenetic analyses

Three major data sets were generated for phylogenetic inference. The first included the non-coding regions (ncs data set), the second included 78 protein-coding genes (cds data set), and the third consisted of fully assembled plastomes (full data set). In all data sets one of the IR sequences was removed to reduce overrepresentation of duplicated sequences. Full plastids were aligned with MAFFT v. 7 using the FFT-NS-i × 1,000 strategy (Katoh & Standley, 2013). Coding sequences were extracted from the full alignment, resulting in the cds and ncs data sets. Each gene in the cds data set was re-aligned using its translation under the same strategy as the full data set and then concatenated. Given that phylogenetic inference might be biased by poorly aligned regions with ambiguous homology, heterogeneous rates of substitution in the different codon positions, synonymous substitutions in Arginine, Leucine and Serine codons, among others (Misof & Misof, 2009; Cox et al., 2014), we further divided the three major data sets into six different schemes where we attempted to circumvent those issues. Poorly aligned regions of the ncs data set were removed using aliscore.pl with the −N and −r options (Misof & Misof, 2009), and in the cds data set all codons coding for Arginine, Leucine and Serine were ambiguated. Thus, the final six schemes included: 1. all ncs data set (ncs); 2. ncs data set without poorly aligned sites (ncs filtered); 3. all cds data set (cds); 4. cds with Arg, Leu and Ser codons ambiguated (cds ambiguated); 5. translated cds (protein); 6. ncs filtered plus all cds non-ambiguated (full). Additionally, in order to carry out a more objective comparison with previous phylogenetic hypotheses, we also analyzed a reduced data set that included only the three most commonly used markers for family-wide phylogenies in the Melastomataceae (ndhF and rbcL genes along with the rpl16 intron, concatenated).
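The codon recoding step can be illustrated with a short sketch. It is written in Python (not a tool used in the study) and rests on an assumption: that "ambiguating" a codon means replacing it with the IUPAC-degenerate triplet covering all synonymous codons of that amino acid (MGN for Arginine, YTN for Leucine, WSN for Serine). The exact recoding used by the authors is not stated in the text, and the function name `ambiguate_codons` is hypothetical.

```python
# Sketch: recode Arg, Leu and Ser codons with IUPAC ambiguity codes.
# ASSUMPTION: the degenerate triplets are the IUPAC union of each amino
# acid's synonymous codons (standard genetic code); not taken from the paper.

ARG = {"CGA", "CGC", "CGG", "CGT", "AGA", "AGG"}
LEU = {"CTA", "CTC", "CTG", "CTT", "TTA", "TTG"}
SER = {"TCA", "TCC", "TCG", "TCT", "AGC", "AGT"}

RECODE = {}
RECODE.update({codon: "MGN" for codon in ARG})  # M = A or C
RECODE.update({codon: "YTN" for codon in LEU})  # Y = C or T
RECODE.update({codon: "WSN" for codon in SER})  # W = A or T, S = C or G


def ambiguate_codons(cds: str) -> str:
    """Return an in-frame coding sequence with Arg/Leu/Ser codons recoded."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    codons = (cds[i:i + 3] for i in range(0, len(cds), 3))
    # Codons that are not in the lookup (other amino acids, gaps) pass through.
    return "".join(RECODE.get(codon, codon) for codon in codons)


if __name__ == "__main__":
    # ATG (Met) is left untouched; CTG (Leu), AGA (Arg), TCC (Ser) are recoded.
    print(ambiguate_codons("ATGCTGAGATCC"))
```

After recoding, synonymous changes within these three amino acids no longer appear as substitutions, which is the bias the "cds ambiguated" scheme is meant to remove.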
Phylogenetic inference for all schemes was performed using Maximum Likelihood implemented in RAxML 8.2.4 (Stamatakis, 2014). The GTR+G model was employed for all nucleotide data and the PROT+G model for the protein sequences. Support was estimated through 1,000 bootstrap replicates. Protein-coding sequences were partitioned by codon position in all schemes, while no partitioning was employed for the non-coding regions.
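Partitioning by codon position can be sketched as follows. The Python helpers below write partition lines in the `DNA, name = start-end\3` style that RAxML partition files use. The coordinates, partition names and function names are hypothetical, and the layout (one coding block followed by one non-coding block) is an assumption for illustration, not the authors' actual alignment.

```python
# Sketch: build RAxML-style partition lines for a concatenated alignment.
# ASSUMPTION: the coding block is in frame and starts at codon position 1.

def codon_position_partitions(start: int, end: int, prefix: str = "cds") -> list[str]:
    """Three partition lines, one per codon position, for an in-frame block."""
    if start < 1 or (end - start + 1) % 3 != 0:
        raise ValueError("block must be 1-based and a multiple of 3 in length")
    # "\3" means "every third site", starting from the given first site.
    return [f"DNA, {prefix}_pos{k + 1} = {start + k}-{end}\\3" for k in range(3)]


def single_partition(start: int, end: int, name: str = "ncs") -> str:
    """One unpartitioned block, e.g. the concatenated non-coding regions."""
    return f"DNA, {name} = {start}-{end}"


if __name__ == "__main__":
    # Hypothetical layout: 300 bp of coding sequence, then 200 bp non-coding.
    lines = codon_position_partitions(1, 300) + [single_partition(301, 500)]
    print("\n".join(lines))
```

Keeping the non-coding block as a single partition mirrors the statement that no partitioning was employed for the non-coding regions.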

Phylogenetic informative regions

In order to identify and rank highly phylogenetically informative regions in the Melastomataceae plastomes, all introns (19) and variable intergenic spacers with suitable size for PCR amplification (22) were selected and compared. Each individual marker was aligned with MAFFT (FFT-NS-i × 1,000 strategy), and its maximum likelihood tree was inferred with RAxML (not partitioned, GTR+G model, 100 bootstrap replicates). For each marker, we report the number of variable sites, number of parsimony informative sites, mean sequence distance (under K80 model), alignment length, mean sequence length, mean bootstrap support and distance to the full scheme plastid tree (RF distance; Robinson & Foulds, 1981). The metrics were retrieved using functions of the R packages ape and phangorn (Paradis, Claude & Strimmer, 2004; Schliep, 2011). Markers were ranked by phylogenetic information using a weighted mean of relative values of the following metrics: number of variable sites (weight = 1), mean bootstrap (weight = 2) and distance to the full plastid tree (weight = 3). For the top 10 markers identified in the previous step, we designed primer pairs for PCR amplification. Primers flanking the target regions were designed with Primer3, using the default settings (Rozen & Skaletsky, 2000). All metrics reported, as well as primer design, considered only the ingroup (the 16 Melastomataceae plastids).
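Three of the per-marker metrics are simple enough to sketch directly. The study computed them with the R packages ape and phangorn; the Python functions below are an independent illustration, not the authors' code. As an assumption, only unambiguous A/C/G/T states are counted, so gaps and ambiguity codes are ignored, which may differ from the settings used in the paper. The K80 distance follows the standard Kimura two-parameter formula, with P and Q the proportions of transitions and transversions.

```python
import math
from collections import Counter

BASES = set("ACGT")
PURINES = set("AG")


def site_counts(alignment):
    """Return (variable sites, parsimony informative sites) for aligned sequences."""
    variable = informative = 0
    for column in zip(*alignment):
        counts = Counter(b.upper() for b in column if b.upper() in BASES)
        if len(counts) >= 2:
            variable += 1
            # Informative: at least two states, each seen in two or more sequences.
            if sum(1 for n in counts.values() if n >= 2) >= 2:
                informative += 1
    return variable, informative


def k80_distance(seq1, seq2):
    """Kimura two-parameter distance between two aligned sequences."""
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in BASES or b not in BASES:
            continue  # ASSUMPTION: skip gaps and ambiguity codes
        compared += 1
        if a != b:
            if (a in PURINES) == (b in PURINES):
                transitions += 1
            else:
                transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


if __name__ == "__main__":
    toy = ["ACGTAC", "ACGTAC", "ATGTGC", "ATGTGT"]  # hypothetical alignment
    print(site_counts(toy))
    print(k80_distance(toy[0], toy[2]))
```

The mean sequence distance reported per marker would then be the average of `k80_distance` over all pairs of ingroup sequences.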

Results

Plastome structure

All plastomes have a quadripartite organization, with one large single copy region (LSC), one small single copy region (SSC) and two inverted repeats (IRs). A circular map of the Miconia dodecandra plastome is presented in Fig. 1 and linear maps of all Melastomataceae plastomes in Fig. 2. Sequence depth ranged from 42 to 705 (mean = 289) and plastome length from 153,311 to 157,216 bp (mean = 155,806 bp). Sequence length and GC content of the different regions across the Melastomataceae plastomes are presented in Table 2. Overall, GC content is similar across species within the same plastid region, while the LSC region has the greatest standard deviation in sequence length (s.d. = 616 bp), followed by the IR (s.d. = 250 bp) and the SSC (s.d. = 126 bp).
Figure 1

Map of the Miconia dodecandra plastid genome.

Genes shown outside the outer circle are transcribed counterclockwise and genes inside the outer circle are transcribed clockwise. Genes in different functional groups are color coded following the legend. The shaded area inside the inner circle indicates the GC content, with dark shading indicating GC percent.

Figure 2

Maximum likelihood tree recovered with the full data set (left).

On the right, linear plastid maps of the 16 Melastomataceae species. All genes are depicted as arrows (indicating transcription direction) and color coded following the legend of Fig. 1. Gray lines link the same genes on contiguous maps. LSC, large single copy region; SSC, small single copy region; IRA, inverted repeat A; IRB, inverted repeat B.

Table 2

Comparison of plastid genome size and GC content across different regions in the 16 Melastomataceae species.

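| Species | Coverage (mean) | LSC (bp) | LSC GC | SSC (bp) | SSC GC | IR (bp) | IR GC | Full (bp) | Full GC |
|---|---|---|---|---|---|---|---|---|---|
| Allomaieta villosa | 278 | 85,915 | 0.347 | 16,975 | 0.306 | 26,781 | 0.425 | 156,452 | 0.369 |
| Bertolonia acuminata | 189 | 85,571 | 0.347 | 17,008 | 0.308 | 26,733 | 0.425 | 156,045 | 0.370 |
| Blakea schlimii | 170 | 85,370 | 0.349 | 16,998 | 0.308 | 26,747 | 0.425 | 155,862 | 0.370 |
| Eriocnema fulva | 42 | 85,431 | 0.348 | 16,953 | 0.308 | 26,805 | 0.425 | 155,994 | 0.370 |
| Graffenrieda moritziana | 683 | 85,341 | 0.347 | 16,924 | 0.309 | 26,734 | 0.425 | 155,733 | 0.370 |
| Henriettea barkeri | 130 | 85,991 | 0.347 | 17,036 | 0.306 | 26,750 | 0.425 | 156,527 | 0.369 |
| Merianthera pulchra | 56 | 85,621 | 0.348 | 17,001 | 0.307 | 26,773 | 0.424 | 156,168 | 0.370 |
| Miconia dodecandra | 318 | 86,609 | 0.348 | 16,999 | 0.310 | 26,804 | 0.425 | 157,216 | 0.370 |
| Nepsera aquatica | 705 | 84,644 | 0.348 | 17,066 | 0.310 | 26,700 | 0.426 | 155,110 | 0.371 |
| Opisthocentra clidemioides | 100 | 85,866 | 0.348 | 16,942 | 0.309 | 26,772 | 0.425 | 156,352 | 0.370 |
| Pterogastra divaricata | 184 | 84,718 | 0.351 | 17,156 | 0.312 | 26,537 | 0.425 | 154,948 | 0.372 |
| Rhexia virginica | 683 | 84,459 | 0.351 | 16,924 | 0.311 | 26,626 | 0.425 | 154,635 | 0.372 |
| Rhynchanthera bracteata | 304 | 85,093 | 0.347 | 16,729 | 0.307 | 26,643 | 0.426 | 155,108 | 0.370 |
| Salpinga maranoniensis | 537 | 85,128 | 0.353 | 16,653 | 0.317 | 25,765 | 0.428 | 153,311 | 0.374 |
| Tibouchina longifolia | 195 | 86,297 | 0.349 | 17,124 | 0.311 | 26,684 | 0.425 | 156,789 | 0.371 |
| Triolena amazonica | 48 | 86,200 | 0.347 | 16,970 | 0.307 | 26,741 | 0.425 | 156,652 | 0.369 |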

Notes:

Length, length in bp; GC, GC content %.

LSC, large single copy region; SSC, small single copy region; IR, inverted repeat; Full, full plastome.
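Because the plastome is quadripartite, its full length should equal LSC + SSC + 2 × IR. The short Python check below applies that identity to three rows copied from Table 2; it is only a consistency sketch, and the function name is hypothetical.

```python
# (species, LSC, SSC, IR, Full) in bp, copied from Table 2.
ROWS = [
    ("Miconia dodecandra", 86609, 16999, 26804, 157216),
    ("Rhexia virginica", 84459, 16924, 26626, 154635),
    ("Salpinga maranoniensis", 85128, 16653, 25765, 153311),
]


def expected_full_length(lsc: int, ssc: int, ir: int) -> int:
    """Full plastome length: two single-copy regions plus two IR copies."""
    return lsc + ssc + 2 * ir


def inconsistent_rows(rows):
    """Return the species whose reported full length breaks the identity."""
    return [
        species
        for species, lsc, ssc, ir, full in rows
        if expected_full_length(lsc, ssc, ir) != full
    ]


if __name__ == "__main__":
    print(inconsistent_rows(ROWS))  # an empty list means all rows agree
```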

Most plastomes have 84 protein-coding genes (CDS), 37 transfer RNA (tRNA) and eight ribosomal RNA (rRNA) genes, totaling 129 genes (including duplicates and ycf1, ycf2, ycf3 and ycf4). Among the duplicated genes in the IR, there are six CDS, seven tRNA, and four rRNA. As for the plastid regions, GC content is similar across different species within the same sequence class (CDS, tRNA, rRNA, intron and intergenic spacers), whereas the greatest variation in sequence length is observed across intergenic spacers (s.d. = 617 bp). A comparative summary of length and GC content in the different sequence classes across the Melastomataceae plastomes is given in Table 3. In the majority of the species sampled, gene content and order are similar to other Myrtales plastids, such as Lagerstroemia fauriei (NC029808) and Eucalyptus polybractea (NC022393). The exceptions are rps16 and rpl2, which are putative pseudogenes in some plastids. The former seems to have been pseudogenized in Graffenrieda moritziana and Pterogastra divaricata (where the first exon is absent) and in Salpinga margaritacea (with several insertions changing the reading frame in the second exon); the second copy of the rpl2 gene (in the IRB) is likely a pseudogene in Salpinga margaritacea due to a shift in the IRB-LSC boundary in that plastid, which resulted in the loss of the second exon. Additionally, some variation is observed in all region boundaries across the Melastomataceae plastomes.
The LSC-IRA boundary is located in the rps19 gene in most species, except in S. margaritacea where it is located in the intron of the rpl2 gene; the IRA-SSC boundary is located in the overlapping ψycf1 and ndhF; the SSC-IRB in the ycf1; and the IRB-LSC in the rpl2-trnH spacer or in the trnH gene. Introns are found in 17 genes in all Melastomataceae plastomes, including six tRNA genes and 11 CDS, of which three have two introns (clpP, rps12 and ycf3). A comparison of the number of genes, regions and plastome length of one Melastomataceae (M. dodecandra) and eight Myrtales plastids is presented in Table 4. The sequence lengths of the full plastome and its regions in the Melastomataceae sampled here are in the range observed for other Myrtales.
Table 3

Comparison of length and GC content across different sequence classes in the plastome of the 16 Melastomataceae species.

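| Species | Protein-coding (bp) | Protein-coding GC | tRNA (bp) | tRNA GC | rRNA (bp) | rRNA GC | Intron (bp) | Intron GC | Intergenic (bp) | Intergenic GC |
|---|---|---|---|---|---|---|---|---|---|---|
| Allomaieta villosa | 80,826 | 0.374 | 3,348 | 0.497 | 9,050 | 0.425 | 20,553 | 0.347 | 42,675 | 0.316 |
| Bertolonia acuminata | 80,670 | 0.375 | 3,356 | 0.497 | 9,050 | 0.425 | 20,437 | 0.347 | 42,532 | 0.316 |
| Blakea schlimii | 80,742 | 0.375 | 3,348 | 0.498 | 9,050 | 0.425 | 20,541 | 0.347 | 42,181 | 0.319 |
| Eriocnema fulva | 80,628 | 0.375 | 3,354 | 0.497 | 9,050 | 0.425 | 20,540 | 0.347 | 42,422 | 0.318 |
| Graffenrieda moritziana | 80,286 | 0.375 | 3,349 | 0.497 | 9,050 | 0.425 | 19,691 | 0.347 | 43,357 | 0.317 |
| Henriettea barkeri | 80,781 | 0.374 | 3,363 | 0.495 | 9,050 | 0.425 | 20,571 | 0.347 | 42,762 | 0.315 |
| Merianthera pulchra | 80,751 | 0.375 | 3,364 | 0.498 | 9,050 | 0.425 | 20,478 | 0.347 | 42,525 | 0.318 |
| Miconia dodecandra | 80,586 | 0.376 | 3,354 | 0.498 | 9,050 | 0.425 | 20,548 | 0.347 | 43,678 | 0.317 |
| Nepsera aquatica | 80,646 | 0.375 | 3,370 | 0.496 | 9,050 | 0.425 | 20,619 | 0.347 | 41,425 | 0.318 |
| Opisthocentra clidemioides | 80,643 | 0.376 | 3,360 | 0.496 | 9,050 | 0.425 | 20,641 | 0.347 | 42,658 | 0.317 |
| Pterogastra divaricata | 80,427 | 0.377 | 3,339 | 0.498 | 9,050 | 0.425 | 19,911 | 0.347 | 42,221 | 0.318 |
| Rhexia virginica | 80,466 | 0.377 | 3,353 | 0.496 | 9,050 | 0.425 | 20,260 | 0.347 | 41,506 | 0.319 |
| Rhynchanthera bracteata | 80,415 | 0.375 | 3,241 | 0.502 | 9,048 | 0.425 | 20,538 | 0.347 | 41,866 | 0.317 |
| Salpinga maranoniensis | 79,326 | 0.376 | 3,349 | 0.500 | 9,050 | 0.425 | 18,991 | 0.347 | 42,595 | 0.326 |
| Tibouchina longifolia | 80,682 | 0.377 | 3,348 | 0.497 | 9,050 | 0.425 | 20,666 | 0.347 | 43,043 | 0.317 |
| Triolena amazonica | 80,619 | 0.375 | 3,337 | 0.496 | 9,050 | 0.425 | 20,476 | 0.347 | 43,170 | 0.316 |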

Note:

Length, length in bp; GC, GC content %.

Table 4

Comparison of plastid genome size of one Melastomataceae species (Miconia dodecandra) with eight other Myrtales.

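| Family | Species | Coding | tRNA | rRNA | LSC | SSC | IR | Full |
|---|---|---|---|---|---|---|---|---|
| Melastomataceae | Miconia dodecandra | 84 | 37 | 8 | 86,609 | 16,999 | 26,804 | 157,216 |
| Myrtaceae | Allosyncarpia ternata | 84 | 37 | 8 | 88,218 | 18,571 | 26,402 | 159,563 |
| Myrtaceae | Angophora costata | 84 | 37 | 8 | 88,769 | 18,773 | 26,392 | 160,326 |
| Myrtaceae | Corymbia gummifera | 84 | 37 | 8 | 88,310 | 17,197 | 27,603 | 160,713 |
| Myrtaceae | Eucalyptus polybractea | 84 | 37 | 8 | 88,944 | 18,530 | 26,397 | 160,268 |
| Myrtaceae | Eugenia uniflora | 84 | 37 | 8 | 87,459 | 18,318 | 26,334 | 158,445 |
| Lythraceae | Lagerstroemia fauriei | 84 | 37 | 8 | 83,923 | 16,933 | 25,792 | 152,440 |
| Onagraceae | Oenothera grandiflora | 84 | 38 | 8 | 89,862 | 19,035 | 28,824 | 166,545 |
| Myrtaceae | Stockwellia quadrifida | 84 | 37 | 8 | 88,247 | 18,544 | 26,385 | 159,561 |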

Notes:

Protein-coding, tRNA and rRNA (number of genes); LSC, large single copy region, length in bp; SSC, small single copy region, length in bp; IR, inverted repeat, length in bp and Full (length in bp).

The majority of the six analytical schemes recovered the same topology (Figs. 2 and 3B). The only exception was the “all non-coding” scheme (i.e., the full non-coding regions without filtering of dubiously aligned base pairs), where Blakea + Opisthocentra, Triolena + Merianthera and Rhynchanthera assume a different position (Fig. 3A). Pairwise tree distances among all schemes are depicted in Fig. 3C, and all Maximum Likelihood trees with bootstrap support values are given in Fig. S1. Bootstrap support is highest in the “full” and “cds” schemes and lower in the “protein” and “all non-coding” schemes (Fig. 3D). In the highest supported topologies, there are only two nodes with bootstrap values lower than 95, and those involve the relationship disagreements between the two alternate topologies (Figs. 3A and 3B). While filtering the poorly aligned non-coding sites improved bootstrap support and also changed the topology (“ncs” vs. “ncs filtered,” Fig. 3), ambiguating common amino acids in the coding sequences did not have any apparent effect on the topology or support values (“cds” vs. “cds ambiguated;” Fig. 3D).
Figure 3

Maximum likelihood trees of the all non-coding–ncs (A) and all coding genes–cds (B) data sets.

Bootstrap support is given adjacent to the nodes. (C) Tree distance (RF) pairwise matrix between all six schemes analyzed. (D) Mean bootstrap support of all six schemes analyzed.
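The RF tree distance used in panel (C) counts the bipartitions present in one tree but not in the other. The study obtained it with R packages; the Python sketch below is an independent illustration that assumes trees are given as nested tuples of taxon names (a hypothetical input format) and share the same taxa.

```python
# Sketch: Robinson-Foulds distance between two trees given as nested tuples.

def leaf_names(node):
    """All taxon names below a node."""
    if isinstance(node, str):
        return frozenset([node])
    return frozenset().union(*(leaf_names(child) for child in node))


def nontrivial_splits(tree):
    """Return (taxa, set of non-trivial bipartitions) for a tree."""
    taxa = leaf_names(tree)
    reference = min(taxa)  # each split is stored as the side lacking this taxon
    found = set()

    def visit(node):
        if isinstance(node, str):
            return frozenset([node])
        clade = frozenset().union(*(visit(child) for child in node))
        side = taxa - clade if reference in clade else clade
        # Splits with fewer than two taxa on a side occur in every tree.
        if 2 <= len(side) <= len(taxa) - 2:
            found.add(side)
        return clade

    visit(tree)
    return taxa, found


def rf_distance(tree_a, tree_b):
    """Number of bipartitions found in exactly one of the two trees."""
    taxa_a, splits_a = nontrivial_splits(tree_a)
    taxa_b, splits_b = nontrivial_splits(tree_b)
    if taxa_a != taxa_b:
        raise ValueError("trees must share the same taxa")
    return len(splits_a ^ splits_b)


if __name__ == "__main__":
    # Hypothetical five-taxon trees, not taken from the study.
    t1 = (("A", "B"), ("C", ("D", "E")))
    t2 = (("A", "C"), ("B", ("D", "E")))
    print(rf_distance(t1, t2))
```

For two fully resolved trees of 16 taxa, each tree has 16 − 3 = 13 non-trivial bipartitions, so the distance can be at most 26.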

The commonly used plastid data set in previous family-wide studies (rbcL, ndhF and rpl16 intron) also resulted in a different topology from the “full” scheme, although with most clades in common (Fig. S2). Disagreements involved the position of Allomaieta, Triolena + Merianthera, Blakea + Opisthocentra, and Rhynchanthera; these disagreements manifest in nodes of low bootstrap support; in the reduced data set, bootstrap values range from 24 to 100 (mean = 73).

Phylogenetically informative regions

Summary statistics for all intron and intergenic spacers with suitable size for PCR amplification are presented in Table S1. A list of the top 10 markers ranked by phylogenetic information, taking into account topological distance to the tree based on the “full” scheme (Fig. 2), mean bootstrap support and number of variable sites is given in Table 5, and the full list is available in Table S1. All single marker phylogenies presented some disagreement to the tree based on the “full” scheme (RF tree distance ranging from 4 to 22). Bootstrap support ranged from 26 to 82 (mean = 63) and number of variable sites from 12 to 507 (mean = 224). Primer pair sequences for PCR amplification are provided for the top 5 markers in Table 6.
Table 5

Summary statistics for the top 10 introns and intergenic spacers with suitable size for PCR amplification.

Markers are ranked by phylogenetic information based on a weighted mean of relative values of number of variable sites (weight = 1), mean bootstrap (weight = 2) and distance to the full plastid tree (weight = 3).

| Marker | Bases | Aligned (bp) | Variable sites | PIS | DNA distance (mean) | Tree distance | Bootstrap (mean) |
|---|---|---|---|---|---|---|---|
| 1. trnS-trnG spacer | 780 [628, 884] | 1,125 | 438 (38.9%) | 128 (11.4%) | 0.104 | 4 | 82 |
| 2. ndhF-rpl32 spacer | 898 [849, 965] | 1,266 | 507 (40%) | 171 (13.5%) | 0.114 | 6 | 71 |
| 3. trnG intron | 762 [743, 790] | 846 | 236 (27.9%) | 76 (9%) | 0.059 | 4 | 75 |
| 4. ndhC-trnV spacer | 734 [504, 821] | 991 | 330 (33.3%) | 98 (9.9%) | 0.081 | 4 | 63 |
| 5. ndhA intron | 1,016 [939, 1,045] | 1,127 | 250 (22.2%) | 74 (6.6%) | 0.046 | 4 | 64 |
| 6. trnG-atpA spacer | 641 [550, 750] | 895 | 353 (39.4%) | 136 (15.2%) | 0.114 | 6 | 65 |
| 7. atpH-atpI spacer | 898 [638, 980] | 1,178 | 323 (27.4%) | 92 (7.8%) | 0.062 | 8 | 76 |
| 8. psbE-petL spacer | 1,058 [570, 1,165] | 1,396 | 381 (27.3%) | 132 (9.5%) | 0.068 | 8 | 70 |
| 9. petA-psbJ spacer | 736 [420, 944] | 1,062 | 285 (26.8%) | 90 (8.5%) | 0.076 | 8 | 76 |
| 10. trnE-trnT spacer | 842 [478, 1,029] | 1,345 | 406 (30.2%) | 121 (9%) | 0.089 | 8 | 63 |

Note:

PIS, parsimony informative sites; Tree distance, RF distance.
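The weighted ranking can be sketched in Python using the Table 5 values. How the "relative values" were computed is not specified in the text, so the scaling below is an assumption: each metric is divided by its maximum across all markers (507 variable sites, bootstrap of 82 and RF distance of 22, taken from the ranges reported in the Results), and the tree distance is inverted so that smaller distances score higher. The function and constant names are hypothetical.

```python
# (marker, variable sites, mean bootstrap, RF distance to the "full" tree),
# copied from Table 5 in its published order.
MARKERS = [
    ("trnS-trnG spacer", 438, 82, 4),
    ("ndhF-rpl32 spacer", 507, 71, 6),
    ("trnG intron", 236, 75, 4),
    ("ndhC-trnV spacer", 330, 63, 4),
    ("ndhA intron", 250, 64, 4),
    ("trnG-atpA spacer", 353, 65, 6),
    ("atpH-atpI spacer", 323, 76, 8),
    ("psbE-petL spacer", 381, 70, 8),
    ("petA-psbJ spacer", 285, 76, 8),
    ("trnE-trnT spacer", 406, 63, 8),
]

# Maxima across all markers, from the ranges given in the Results.
MAX_VARIABLE, MAX_BOOTSTRAP, MAX_RF = 507, 82, 22
WEIGHTS = (1, 2, 3)  # variable sites, mean bootstrap, tree distance


def score(variable: int, bootstrap: float, rf: int) -> float:
    """Weighted mean of relative values (ASSUMED scaling: value / maximum)."""
    relative = (
        variable / MAX_VARIABLE,
        bootstrap / MAX_BOOTSTRAP,
        1 - rf / MAX_RF,  # smaller distance to the full tree is better
    )
    return sum(w * r for w, r in zip(WEIGHTS, relative)) / sum(WEIGHTS)


def rank(markers):
    """Markers sorted from most to least informative under this score."""
    return sorted(markers, key=lambda m: score(*m[1:]), reverse=True)


if __name__ == "__main__":
    for name, *metrics in rank(MARKERS):
        print(f"{score(*metrics):.3f}  {name}")
```

Under this assumed scaling the ten markers come out in the same order as in Table 5, although other scalings (for example min-max) could reorder markers with close scores.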

Table 6

Primer pair sequences for the identified top five highly informative markers across the 16 plastomes of Melastomataceae.

MarkerPrimer forward (5′–3′)Primer reverse (5′–3′)Ta (°C)
1. trnS-trnG spacerCACTCAGCCATCTCTCCCAAACCCGCTACAATGCCATTATTG55
2. ndhF-rpl32 spacerAGGAAAGGACCACATACGTCGTCCTTGCTCATTGATTTTGATCCA55
3. trnG intronGGTCCCTCGGATTTGCTTCAGAACCCGCATCGTTAGCTTG55
4. ndhC-trnV spacerAGATGAACTCCTAGGGAATGTGACCGAGAAGGTCTACGGTTCG55
5. ndhA intronCGCTAGTCCAGAACCGTACAACCCCATGATTGGTTGATTAGTGA55


Discussion

Plastid genomes of higher plants are of relatively small size, ranging from 115 to 165 kb in most groups, with an average of 90 CDS across most land plants (Ravi et al., 2008; Wicke et al., 2011). In general, the quadripartite organization, gene content and order are conserved, and GC content is usually stable within plastid regions and sequence classes (Ravi et al., 2008; Wicke et al., 2011). Melastomataceae plastomes are no exception to these patterns, being highly conserved and structurally similar to most other Myrtales, as well as to an ordinary angiosperm plastome. The mean length of Melastomataceae plastomes (156 kb) is closer to the upper bound observed across most plants (165 kb), while the number of genes and GC content are around the average (90 genes, GC = 37%; Ravi et al., 2008). High conservation in genomic structure of plastomes among the Myrtales has been previously suggested (Gu et al., 2016) and is extended here to include Melastomataceae. The greatest variation in sequence length among different region classes in Melastomataceae is observed in the intergenic spacers, which is another general pattern in plastomes (Ravi et al., 2008; Gu et al., 2016). Additionally, the boundaries of the IRs vary, as observed in some Myrtales and other groups (Bayly et al., 2013). Conservation in gene order and content and the virtual lack of recombination make the plastome a useful tool for plant phylogenetic studies (Ravi et al., 2008). An updated comprehensive phylogenetic hypothesis for the entire Melastomataceae is overdue, and full plastid sequences would contribute greatly to such an endeavor. Additionally, as sampling increases in the Myrtales, full plastids also might help to narrow down phylogenetic uncertainty in the order (e.g., Combretaceae position, Berger et al., 2016).
Although the full plastome phylogeny recovered here shares most of its clades with the widely used “rbcL + ndhF + rpl16” tree, some changes are still observed and bootstrap support is higher. A more conclusive account of the extent of such changes will require more taxa to be sampled. Here, we provide a list of potentially highly informative plastid markers for Melastomataceae. We acknowledge that the information descriptors employed are very sensitive to the taxa under analysis. Nonetheless, this ranked list can be used as guidance for sampling design of future studies, whereas the new family-specific primers will increase the plastid options for Sanger sequencing-based phylogenies. There has been some debate as to whether the availability of full plastome sequencing (and other NGS tools) would render Sanger sequencing obsolete (Hert, Fredlake & Barron, 2008). Here we show that a full plastome phylogeny is an improvement on phylogenies based on single or few plastid loci, especially in the level of statistical support. However, considering scalability, computational complexity and budget limitations, a comprehensive NGS-based phylogeny for the mega-diverse Melastomataceae might not be achieved in the short term. Nonetheless, an expanded full plastome data set, along with the more abundant Sanger-based sequences available, could be coupled in future studies. A hybrid NGS and Sanger sequencing approach has been employed for other groups (Xi et al., 2012; Leaché et al., 2014; Gardner et al., 2016), and could help clarify the backbone of a comprehensive Melastomataceae phylogeny. Recalcitrant phylogenetic backbones are a widespread and challenging phenomenon in angiosperms (Xi et al., 2012; Straub et al., 2014), and their resolution is critical to increase the confidence of ancestral state reconstructions, historical biogeographical scenarios and other evolutionary hypotheses.
Although full plastomes, or an expanded sample of plastid markers, may help to improve the confidence of phylogenetic relationships within the Melastomataceae, we also recognize the need for parallel sampling of additional independent genealogies (i.e., nuclear and mitochondrial genomes) to further refine the Melastomataceae tree.

Figure S1.

Maximum likelihood trees of all six analyzed schemes in this study. Bootstrap support is given adjacent to the nodes.

Figure S2.

Comparison of the Maximum Likelihood tree of the full data set (ncs filtered + cds; on the left) with a reduced data set of commonly used markers for family wide phylogenies in the Melastomataceae (ndhF, rbcL and rpl16 intron; on the right). Bootstrap support is given adjacent to the nodes.

Table S1.

Summary statistics for all introns and intergenic spacers with suitable size for PCR amplification. Markers are ranked by phylogenetic information based on a weighted mean of relative values of number of variable sites (weight = 1), mean bootstrap (weight = 2) and distance to the full plastid tree (weight = 3). PIS = parsimony informative sites; Tree distance = RF distance; NA = not applicable.
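The ranking described for Table S1 can be illustrated with a minimal sketch. The weights (1, 2, 3) come from the caption. Everything else is an assumption made for illustration: the function and field names are hypothetical, the "relative values" are taken to be each statistic divided by its maximum across markers, the relative tree distance is inverted so that markers closer to the full plastid tree score higher, and the three markers and their numbers are invented placeholders, not data from this study.

```python
from typing import Dict, List, Tuple

# Weights as stated in the Table S1 caption.
WEIGHTS = {"variable_sites": 1.0, "mean_bootstrap": 2.0, "tree_distance": 3.0}


def relative(values: List[float]) -> List[float]:
    """Scale values to [0, 1] by dividing by the maximum (assumed normalization)."""
    top = max(values)
    return [v / top if top > 0 else 0.0 for v in values]


def rank_markers(stats: Dict[str, Dict[str, float]]) -> List[Tuple[str, float]]:
    """Return (marker, score) pairs sorted from most to least informative."""
    names = list(stats)
    var = relative([stats[n]["variable_sites"] for n in names])
    boot = relative([stats[n]["mean_bootstrap"] for n in names])
    # Assumption: a smaller RF distance to the full plastid tree is better,
    # so the relative distance is inverted before weighting.
    dist = [1.0 - d for d in relative([stats[n]["tree_distance"] for n in names])]

    total_weight = sum(WEIGHTS.values())
    scores = {}
    for i, name in enumerate(names):
        weighted_sum = (
            WEIGHTS["variable_sites"] * var[i]
            + WEIGHTS["mean_bootstrap"] * boot[i]
            + WEIGHTS["tree_distance"] * dist[i]
        )
        scores[name] = weighted_sum / total_weight
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical markers with invented values, for illustration only.
example = {
    "marker_A": {"variable_sites": 100, "mean_bootstrap": 80, "tree_distance": 10},
    "marker_B": {"variable_sites": 50, "mean_bootstrap": 40, "tree_distance": 20},
    "marker_C": {"variable_sites": 100, "mean_bootstrap": 80, "tree_distance": 20},
}
ranking = rank_markers(example)
```

Because tree distance carries the largest weight, two markers with identical variability and bootstrap support (marker_A and marker_C in the placeholder data) are separated by how closely their trees match the full plastid tree.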
  23 in total

1.  Divergence times, historical biogeography, and shifts in speciation rates of Myrtales.

Authors:  Brent A Berger; Ricardo Kriebel; Daniel Spalink; Kenneth J Sytsma
Journal:  Mol Phylogenet Evol       Date:  2015-11-14       Impact factor: 4.286

2.  Untangling the phylogeny of Leandra s.str. (Melastomataceae, Miconieae).

Authors:  Marcelo Reginato; Fabián A Michelangeli
Journal:  Mol Phylogenet Evol       Date:  2015-12-14       Impact factor: 4.286

3.  Chloroplast genome analysis of Australian eucalypts--Eucalyptus, Corymbia, Angophora, Allosyncarpia and Stockwellia (Myrtaceae).

Authors:  Michael J Bayly; Philippe Rigault; Antanas Spokevicius; Pauline Y Ladiges; Peter K Ades; Charlotte Anderson; Gerd Bossinger; Andrew Merchant; Frank Udovicic; Ian E Woodrow; Josquin Tibbits
Journal:  Mol Phylogenet Evol       Date:  2013-07-19       Impact factor: 4.286

4.  Bayesian analysis of combined chloroplast loci, using multiple calibrations, supports the recent arrival of Melastomataceae in Africa and Madagascar.

Authors:  Susanne S Renner
Journal:  Am J Bot       Date:  2004-09       Impact factor: 3.844

5.  Phylogeny and circumscription of the near-endemic Brazilian tribe Microlicieae (Melastomataceae).

Authors:  Peter W Fritsch; Frank Almeda; Susanne S Renner; Angela B Martins; Boni C Cruz
Journal:  Am J Bot       Date:  2004-07       Impact factor: 3.844

6.  Phylogenomics and a posteriori data partitioning resolve the Cretaceous angiosperm radiation Malpighiales.

Authors:  Zhenxiang Xi; Brad R Ruhfel; Hanno Schaefer; André M Amorim; M Sugumaran; Kenneth J Wurdack; Peter K Endress; Merran L Matthews; Peter F Stevens; Sarah Mathews; Charles C Davis
Journal:  Proc Natl Acad Sci U S A       Date:  2012-10-08       Impact factor: 11.205

7.  MAFFT multiple sequence alignment software version 7: improvements in performance and usability.

Authors:  Kazutaka Katoh; Daron M Standley
Journal:  Mol Biol Evol       Date:  2013-01-16       Impact factor: 16.240

8.  Conflicting phylogenies for early land plants are caused by composition biases among synonymous substitutions.

Authors:  Cymon J Cox; Blaise Li; Peter G Foster; T Martin Embley; Peter Civán
Journal:  Syst Biol       Date:  2014-01-06       Impact factor: 15.683

9.  OrganellarGenomeDRAW--a suite of tools for generating physical maps of plastid and mitochondrial genomes and visualizing expression data sets.

Authors:  Marc Lohse; Oliver Drechsel; Sabine Kahlau; Ralph Bock
Journal:  Nucleic Acids Res       Date:  2013-04-22       Impact factor: 16.971

10.  The Complete Plastid Genome of Lagerstroemia fauriei and Loss of rpl2 Intron from Lagerstroemia (Lythraceae).

Authors:  Cuihua Gu; Luke R Tembrock; Nels G Johnson; Mark P Simmons; Zhiqiang Wu
Journal:  PLoS One       Date:  2016-03-07       Impact factor: 3.240

  12 in total

1.  Characteristics analysis of the complete Wurfbainia villosa chloroplast genome.

Authors:  Wenli An; Jing Li; Zerui Yang; Yuying Huang; Song Huang; Xiasheng Zheng
Journal:  Physiol Mol Biol Plants       Date:  2020-03-19

2.  Structure and features of the complete chloroplast genome of Melastoma dodecandrum.

Authors:  Xiasheng Zheng; Changwei Ren; Song Huang; Jing Li; Ying Zhao
Journal:  Physiol Mol Biol Plants       Date:  2019-03-12

3.  Comparative analysis of chloroplast genome structure and molecular dating in Myrtales.

Authors:  Xiao-Feng Zhang; Jacob B Landis; Hong-Xin Wang; Zhi-Xin Zhu; Hua-Feng Wang
Journal:  BMC Plant Biol       Date:  2021-05-15       Impact factor: 4.215

4.  Characterization of the complete chloroplast genome sequence of Blastus cochinchinensis (Melastomataceae).

Authors:  Wenchun Zhang; Zhenying Wen; Sijin Zeng; Liang Luo; Donghui Peng
Journal:  Mitochondrial DNA B Resour       Date:  2019-07-10       Impact factor: 0.658

5.  The complete chloroplast genome sequence of monotypic Cyphotheca (Melastomataceae), an endemic genus in China.

Authors:  Zhenying Wen; Sijin Zeng; Tingzhang Li; Guoqiang Zhang; Donghui Peng
Journal:  Mitochondrial DNA B Resour       Date:  2019-07-11       Impact factor: 0.658

6.  Extremely low nucleotide diversity among thirty-six new chloroplast genome sequences from Aldama (Heliantheae, Asteraceae) and comparative chloroplast genomics analyses with closely related genera.

Authors:  Benoit Loeuille; Verônica Thode; Carolina Siniscalchi; Sonia Andrade; Magdalena Rossi; José Rubens Pirani
Journal:  PeerJ       Date:  2021-02-24       Impact factor: 2.984

7.  The complete chloroplast genome sequence of Blastus pauciflorus (Melastomataceae).

Authors:  Zhen Ying Wen; Si Jin Zeng; Wenchao Han; Bin Chen; Dong Hui Peng
Journal:  Mitochondrial DNA B Resour       Date:  2019-11-06       Impact factor: 0.658

8.  Comparative analyses of Mikania (Asteraceae: Eupatorieae) plastomes and impact of data partitioning and inference methods on phylogenetic relationships.

Authors:  Verônica A Thode; Caetano T Oliveira; Benoît Loeuille; Carolina M Siniscalchi; José R Pirani
Journal:  Sci Rep       Date:  2021-06-24       Impact factor: 4.379

9.  Comparative analyses of plastid genomes from fourteen Cornales species: inferences for phylogenetic relationships and genome evolution.

Authors:  Chao-Nan Fu; Hong-Tao Li; Richard Milne; Ting Zhang; Peng-Fei Ma; Jing Yang; De-Zhu Li; Lian-Ming Gao
Journal:  BMC Genomics       Date:  2017-12-08       Impact factor: 3.969

10.  Plastome Rearrangements in the "Adenocalymma-Neojobertia" Clade (Bignonieae, Bignoniaceae) and Its Phylogenetic Implications.

Authors:  Luiz H M Fonseca; Lúcia G Lohmann
Journal:  Front Plant Sci       Date:  2017-11-01       Impact factor: 5.753

