Literature DB >> 34193929

Complete plastomes of six species of Wikstroemia (Thymelaeaceae) reveal paraphyly with the monotypic genus Stellera.

Liefen He1, Yonghong Zhang2, Shiou Yih Lee3.   

Abstract

Wikstroemia (Thymelaeaceae) is a diverse genus that extends from Asia to Australia and has been recorded on the Hawaiian Islands. Despite its medicinal properties and resource utilization in pulp production, genetic studies of the species in this important genus have been neglected. In this study, the plastome sequences of six species of Wikstroemia were sequenced and analysed. The plastomes ranged in size between 172,610 bp (W. micrantha) and 173,697 bp (W. alternifolia) and exhibited a typical genome structure consisting of a pair of inverted repeat (IR) regions separated by a large single-copy (LSC) region and a small single-copy (SSC) region. The six plastomes were similar in the 138 or 139 genes predicted, which consisted of 92 or 93 protein-coding genes, 38 tRNA genes, and 8 rRNA genes. The overall GC contents were identical (36.7%). Comparative genomic analyses were conducted with the inclusion of two additional published species of Wikstroemia in which the sequence divergence and expansion of IRs in the plastomes were determined. When compared to the coding sequences (CDSs) of Aquilaria sinensis, five genes, namely, rpl2, rps7, rps18, ycf1 and ycf2, indicated positive selection in W. capitata. The plastome-based phylogenetic analysis inferred that Wikstroemia in its current state is paraphyletic to Stellera chamaejasme, while the ITS-based tree analyses could not properly resolve the phylogenetic relationship between Stellera and Wikstroemia. This finding rekindled interest in the proposal to synonymize Stellera with Wikstroemia, which was previously proposed but rejected due to taxonomic conflicts. Nevertheless, this study provides valuable genomic information to aid in the taxonomic implications and phylogenomic reconstruction of Thymelaeaceae.

Entities:  

Year:  2021        PMID: 34193929      PMCID: PMC8245458          DOI: 10.1038/s41598-021-93057-3

Source DB:  PubMed          Journal:  Sci Rep        ISSN: 2045-2322            Impact factor:   4.379


Introduction

Wikstroemia Engl. (Thymelaeaceae) is a diverse genus of approximately 70 species. Members of Wikstroemia are widely distributed in the Asian and Oceanian regions and scattered around the Hawaiian Islands[1]. The species are mostly fibrous trees, shrubs or subshrubs with a woody rhizome. Several species are cultivated as raw material for pulp production[2,3], while a handful of them are reported to have medicinal properties[4,5]. However, studies of Wikstroemia have been confined to its utilization in pulp production and pharmacological applications; reports on genetic studies of Wikstroemia are scarce. The only reports on the genetic diversity to date include one on Wikstroemia ganpi in Korea using inter simple sequence repeat (ISSR) markers[6] and two, published, complete plastome sequences of Wikstroemia chamaedaphne and Wikstroemia indica[7,8]. Due to the lack of molecular evidence, taxonomic studies of Wikstroemia have relied solely on morphological characteristics[9]. Ironically, the continuous nature of morphological variation in members of Wikstroemia has led to much taxonomic confusion in attempts to distinguish species and has resulted in ambiguities in taxonomic classifications between Wikstroemia and its sister genera[9,10]. Among the key morphological characteristics proposed to differentiate Wikstroemia from allied genera is the presence of petaloid scales in the flower[11]. However, the presence and characteristics of the disc in the flowers of Wikstroemia was not emphasized. Failure to analyse this character may result in misidentifications due to overlap in this feature during classification[10]. The subgeneric classification of Wikstroemia, consisting of only the subgenera Wikstroemia and Diplomorpha, is generally accepted[12,13]. Another problem in classification is the difficulty in detecting natural hybridization among the species due to the possibility of low reproductive isolation and high genetic similarity, suggesting that Wikstroemia represents a large complex of species[14]. The plastome is a circular double-stranded DNA molecule. In plants, the plastome is mostly maternally inherited and not disturbed by gene recombination[15]. A typical plant plastome ranges in size from 120 to 217 kb[16]. The complete plastome has a typical quadripartite structure, including a large single-copy (LSC) region, a small single-copy (SSC) region, and two separate inverted regions (IRs)[17]. Owing to its slow rate of evolution and ease of sequencing and assembly due to its small size, the plastome has been receiving much attention among biologist and taxonomist because it is highly informative and provides evolutionary and genetics insights[18,19]. The taxonomic placement of Wikstroemia has been controversial. This genus has experienced a complicated classification history in reviews of members of the Thymelaeaceae. Stellera chamaejasme of the monotypic genus Stellera was reported to be sister to Wikstroemia based on combined plastid DNA sequences (trnT-trnL, trnL-trnF, trnL intron, and rpl16 intron)[20], while Wikstroemia, along with 14 sister genera based on palynology findings, has been taxonomically placed in the Daphne group of the tribe Daphneae[21,22]. Although phylogenetic studies in Thymelaeaceae are ongoing[23], phylogenetic relationships in Wikstroemia are likely to be understudied. Constituent genera in Thymelaeaceae have experienced similar molecular challenges, in which poor phylogenetic resolution is likely due to low genetic variation in the selected molecular markers[23]. Such conflicts can be overcome by utilizing genome-scale datasets[24]. At the same time, highly divergent regions may be identified through genome comparisons, which could aid in future phylogenetic studies of such a diverse genus as Wikstroemia. In this study, we sequenced the complete plastomes of six species of Wikstroemia, W. alternifolia, W. canescens, W. capitata, W. dolicantha, W. micrantha, and W. scytophylla, to analyse and compare genomes using bioinformatic tools. Our aims were to (1) characterize the plastomes of the six species of Wikstroemia; (2) examine the variation in sequence repeats and codon usage in the six plastome sequences; (3) identify highly divergent regions in the plastome sequences; and (4) improve the understanding of the intrageneric/intergeneric phylogeny of Wikstroemia within Thymelaeaceae based on plastome sequences and the nuclear ribosomal DNA internal transcribed spacer (ITS) region.

Results

Plastome features of six species of Wikstroemia

The total length of the plastomes of the six species of Wikstroemia analysed in this study ranged from 172,610 bp (W. micrantha) to 173,697 bp (W. alternifolia). All six plastomes exhibited a typical quadripartite structure (Table 1, Fig. 1) consisting of a pair of inverted repeat (IR) regions (41,850–42,073 bp) separated by an LSC region (86,111–86,701 bp) and an SSC region (2799–2871 bp). All six plastomes had the same GC content at 36.7%. However, the GC content in the plastome of each species of Wikstroemia was unevenly distributed. The IR region accounted for the highest GC content (38.8–38.9%), followed by the LSC region (34.8–34.9%), while the SSC region showed the lowest GC content (28.7–29.6%).
Table 1

Plastome features of six species of Wikstroemia.

SpeciesOriginCollector and collection numberCoordinates (longitude, latitude)PlastomePlastid genesGenBank accession number
Total (bp)GC content (%)LSC (bp)GC content (%)SSC (bp)GC content (%)IR (bp)GC content (%)TotalCDStRNArRNAPlastomeITS
Wikstroemia alternifoliaBatang County, SichuanY. H. Zhang et al., RXK3029°19′27″N, 99°18′40″E173,69736.786,69434.8285729.542,07338.813993388MW073913MW075476
Wikstroemia canescensBatang County, SichuanY. H. Zhang et al., RXK3229°19′27″N, 99°18′40″E173,66736.786,70134.8285429.642,05638.813993388MW073911MW075477
Wikstroemia capitataGuanyang, Wushan County, ChongqingY. H. Zhang and W. G. Sun, RXK3331°28′02″N, 109°55′53″E172,84936.786,15434.8287129.441,91238.913892388MW073909MW075480
Wikstroemia dolicanthaKunming, YunnanY. H. Zhang et al. RXK3925°07′48″N, 102°42′24″E172,80436.786,23034.8285428.741,86038.913892388MW073912MW075475
Wikstroemia micranthaChangshou County, ChongqingY. H. Zhang et al. S0693-Zhang429°49′02″N, 107°4′35″E172,61036.786,11134.9279929.541,85038.913993388MN756675MW075479
Wikstroemia scytophyllaKunming Botanical GardenY. H. Zhang, et al. RXK4825°08′36″N, 102°44′27″E173,25436.786,33834.8284029.442,03838.813993388MW073910MW075474
Figure 1

Gene map for the plastomes of six species of Wikstroemia used in this study. Genes on the inside of the map are transcribed in the clockwise direction; genes on the outside of the map are transcribed in the counterclockwise direction. Darker grey in the inner circle represents the GC content, whereas light grey corresponds to the AT content. Different functional groups of genes are shown in different colours. The gene map was generated using OGDRAW[45].

Plastome features of six species of Wikstroemia. Gene map for the plastomes of six species of Wikstroemia used in this study. Genes on the inside of the map are transcribed in the clockwise direction; genes on the outside of the map are transcribed in the counterclockwise direction. Darker grey in the inner circle represents the GC content, whereas light grey corresponds to the AT content. Different functional groups of genes are shown in different colours. The gene map was generated using OGDRAW[45]. The six plastomes of Wikstroemia displayed an identical gene content and gene order with no structural reconfigurations. A total of 138 to 139 genes were detected in the six species used in this study, comprising 92 to 93 protein-coding genes , 38 transfer RNA (tRNA) genes, and 8 ribosomal RNA (rRNA) genes (Table 1). However, 27 genes were duplicated in the IR regions, including 15 protein-coding genes (ccsA, ndhA, ndhB, ndhD, ndhE, ndhH, ndhG, ndhI, psaC, rpl2, rpl23, rps7, rps15, ycf1, ycf2), eight tRNA genes (trnA-UGC, trnI-CAU, trnI-GAU, trnL-CAA, trnL-UAG, trnN-GUU, trnR-ACG and trnV-GAC) and four rRNAs (rrn4.5, rrn5, rrn16 and rrn23) (Table 2). Fifteen genes contained an intron, five of which (ndhA, ndhB, rpl2, trnA-UGC and trnI-GAU) were located in the IR region, and the remaining 10 genes (atpF, petB, petD, rpl16, rpoC1, rps16, trnG-UCC, trnL-UAA, trnK-UUU and trnV-UAC) were located in the LSC region (see Supplementary Table S1 online). Only the ycf3 gene, which was present in the LSC region, was detected to contain a pair of introns. Upon comparison, we found that the trnK-UUU gene had the longest intron, ranging from 2498 to 2508 bp, in all six genomes.
Table 2

Genes present in the plastomes of six species of Wikstroemia used in this study.

Genes
RNAs, ribosomalrrn4.5(×2), rrn5(×2), rrn16(×2), rrn23(×2)
RNAs, transfertrnA-UGC(×2), trnC-GCA, trnD-GUC, trnE-UUC, trnF-GAA, trnfM-CAU, trnG-GCC, trnG-UCC, trnH-GUG, trnI-CAU(×2), trnI-GAU(×2), trnK-UUU, trnL-CAA(×2), trnL-UAA, trnL-UAG(×2), trnM-CAU, trnN-GUU(×2), trnP-UGG, trnQ-UUG, trnR-ACG(×2), trnR-UCU, trnS-GCU, trnS-GGA, trnS-UGA, trnT-GGU, trnT-UGU, trnV-GAC(×2), trnV-UAC, trnW-CCA, trnY-GUA
Transcription and splicingmatK, rpoA, rpoB, rpoC1, rpoC2
Small subunitrps3, rps4, rps7(×2), rps8, rps11, rps12, rps14, rps15(×2), rps16, rps18, rps19
Large subunitrpl2(×2), rpl14, rpl16, rpl20, rpl22, rpl23(×2), rpl32, rpl33, rpl36
ATP synthaseatpA, atpB, atpE, atpF, atpH, atpI
Photosystem IpsaA, psaB, psaC(×2), psaI, psaJ
Photosystem IIpsbA, psbB, psbC, psbD, psbE, psbF, psbH, psbI, psbJ, psbK, psbL, psbM, psbN, psbT, psbZ
Calvin cyclerbcL
Cytochrome complexpetA, petB, petD, petG, petL, petN
NADH dehydrogenasendhA(×2), ndhB(×2), ndhC, ndhD(×2), ndhE(×2), ndhF, ndhG(×2), ndhH(×2), ndhI(×2), ndhJ, ndhK
OthersaccD, ccsA(×2), cemA, ycf1(×2), ycf2(×2), ycf3, ycf4
Genes present in the plastomes of six species of Wikstroemia used in this study.

Repetitive sequence analysis

The total number of short sequence repeats (SSRs) in the plastome sequences of W. alternifolia, W. canescens, W. capitata, W. dolicantha, W. micrantha, and W. scytophylla were 127, 128, 110, 87, 90 and 110, respectively (Fig. 2A). No hexanucleotide sequences, however, were detected in the plastome sequences of W. alternifolia, W. canescens and W. scytophylla. The majority of SSRs (W. alternifolia: 70.87%; W. canescens: 70.31%; W. capitata: 68.18%; W. dolicantha: 63.22%; W. micrantha: 61.11%, W. scytophylla: 63.64%) were located in the LSC regions rather than in the other two regions of the plastome (Fig. 2B).
Figure 2

Distribution of small sequence repeats (SSRs) in the plastomes of six accessions of Wikstroemia. (A) Number of different SSR types detected in the plastomes of six species of Wikstroemia; (B) Frequencies of identified SSRs in large single-copy (LSC), small single-copy (SSC) and inverted repeat (IR) regions.

Distribution of small sequence repeats (SSRs) in the plastomes of six accessions of Wikstroemia. (A) Number of different SSR types detected in the plastomes of six species of Wikstroemia; (B) Frequencies of identified SSRs in large single-copy (LSC), small single-copy (SSC) and inverted repeat (IR) regions. All six species of Wikstroemia contained the same number of long repeats (Fig. 3A). In general, all of them contained 24 forward repeats and 25 palindromic repeats, except for W. canescens and W. capitata. Long forward repeats that ranged between 30 and 40 bp were the most abundant in W. dolicantha and W. micrantha, while W. alternifolia, W. canescens, W. capitata, and W. scytophylla were noted to have a higher number of long forward repeats with lengths of 41 to 60 bp (Fig. 3B). Long palindromic repeats were equally abundant in W. alternifolia and W. canescens, ranging from 40 to 60 bp and above 60 bp (Fig. 3C), while long palindromic repeats were abundant in the range of 30 to 60 bp in W. capitata, W. dolicantha, W. micrantha and W. scytophylla. Long reverse repeats, mostly within the range of 30 to 40 bp, were detected only in W. canescens and W. capitata (Fig. 3D).
Figure 3

Analysis of long repeat sequences in the plastomes of six species of Wikstroemia. (A) Quantities of long repeats based on type; (B) frequencies of forward repeats by length; (C) frequencies of palindromic repeats by length; and (D) frequencies of reverse repeats by length.

Analysis of long repeat sequences in the plastomes of six species of Wikstroemia. (A) Quantities of long repeats based on type; (B) frequencies of forward repeats by length; (C) frequencies of palindromic repeats by length; and (D) frequencies of reverse repeats by length.

Analysis of codon usage

Thirty preferred codons (relative synonymous codon usage; RSCU > 1.00) were recorded in W. alternifolia, W. canescens, W. capitata, W. dolicantha, W. micrantha and W. scytophylla (see Supplementary Table S2 online). The stop codon UAA was most abundant and preferred over the other two stop codons, UAG and UGA, in all six species. Preferred codons mostly ended with the amino acids A or U, except for the leucine-encoded (Leu) codon UUG. The Leu-encoded codons had the greatest occurrence (9.38%), while cysteine-encoded (Cys) codons had the fewest occurrences (3.13%) among all six species of Wikstroemia.

Sequence divergence analysis

The plastome sequence alignment of the eight species of Wikstroemia, using the W. chamaedaphne plastome as a reference, indicated high sequence conservatism across the plastomes of eight species but not in the plastome of W. indica (Fig. 4). Overall, the size and gene order of the plastomes in Wikstroemia were well conserved, but a distinct large gap was observed beginning within the ycf1 gene sequence of the IRa to 5′ region of the trnL-UAG in the IRb of W. indica. Both single-copy regions were recorded as having greater sequence divergence than the IR region (Fig. 5). With a Pi-value cut-off point of 0.025, eight highly variable gene regions were identified: ndhD-ndhF, ndhF-rpl32, ndhJ, petL-petG, psbI-trnS-GCU, trnG-UCC, trnK-UUU-rps16 and the trnL-UAA-trnF-GAA intergenic spacer regions. Six of the highly variable regions were located in the LSC, while two of them were in the SSC region.
Figure 4

Complete plastome comparison of eight species of Wikstroemia using the plastome of W. chamaedaphne as reference.

Figure 5

Sliding window analysis of complete plastome sequences among eight species of Wikstroemia (window length: 1000 bp; step size: 500 bp).

Complete plastome comparison of eight species of Wikstroemia using the plastome of W. chamaedaphne as reference. Sliding window analysis of complete plastome sequences among eight species of Wikstroemia (window length: 1000 bp; step size: 500 bp).

Contraction and expansion in the IR region

Genes adjacent to the IR borders were consistent across members of Wikstroemia, except in W. indica, which varied in its adjacent genes at the IRb/SSC (JSB) and IRa/SSC (JSA) borders (Fig. 6). In contrast to the rpl32 and ndhF genes in the SSC region, adjacent to JSB and JSA, respectively, the ycf1 gene was located across both JSA and JSB in the plastomes of W. indica. The trnL-UAG gene was also adjacent to JSA in the SSC region of the W. indica plastome. In comparison, six species (W. alternifolia, W. chamaedaphne, W. dolicantha, W. indica, W. micrantha and W. scytophylla) had their rps19 gene crossing the IRb/LSC (JLB) border.
Figure 6

Comparison of borders between LSC, SSC and IR regions across the plastomes of eight species of Wikstroemia. Image was generated with IRscope[50].

Comparison of borders between LSC, SSC and IR regions across the plastomes of eight species of Wikstroemia. Image was generated with IRscope[50].

Selection pressure

Sixty-nine shared protein-coding genes were included in the selection pressure analysis between Aquilaria sinensis and W. capitata (Table 3). When analysed separately, the Ka/Ks values indicated that five genes, namely, rpl2, rps7, rps18, ycf1 and ycf2, displayed positive selection; 61 genes indicated purifying selection, and three genes did not exhibit any synonymous (Ks) values indicative of selection due to the constraints of the model used. The Ka/Ks value for the combined dataset revealed that the overall selection pressure of the 69 shared protein-coding genes was 0.435, showing signals of purifying selection.
Table 3

Selection pressure analysis of 69 shared protein-coding gene sequences for Aquilaria sinensis (GenBank accession MN720647) and Wikstroemia capitata, analysed separately and combined.

No.GeneKaKsKa/KsP-valueLength of alignment (bp)No. of substitutions
1accD0.13780.24060.57250.00391254176
2atpA0.02010.14160.14170.0000151867
3atpB0.00730.09560.07600.0000149442
4atpE0.02030.08730.23210.008339914
5atpH0.01130.10090.11160.00482438
6atpI0.01980.06510.30350.006074122
7ccsA0.01230.02690.45640.117996915
8matK0.05800.12290.47170.00031512106
9ndhA0.29320.31610.92730.70272001488
10ndhB0.00500.01350.36990.0630205214
11ndhC0.01370.15610.08770.000036013
12ndhD0.00910.01340.68090.4815151815
13ndhE0.00000.03550.00000.00002972
14ndhF0.06990.32060.21800.00002124216
15ndhG0.00490.01680.29230.22135284
16ndhH0.00890.02900.30520.0175117915
17ndhI0.00730.01170.62360.53135014
18ndhJ0.00270.11440.02380.000047412
19ndhK0.02710.14670.18490.000068134
20petA0.01540.06220.24750.000796025
21petB0.03430.06180.55610.0379138055
22petD0.04550.06230.73030.2785116756
23petG0.00000.15170.00000.00001084
24petL0.00000.10070.00000.0000932
25petN0.00000.09470.00000.0000872
26psaA0.00490.08770.05550.0000225057
27psaB0.00480.08490.05650.0000220250
28psaC0.00000.01990.00000.00002431
29psaI0.02420.07860.30740.24241114
30psaJ0.00000.09270.00000.00001323
31psbA0.00000.14730.00000.0000105936
32psbB0.00930.11450.08150.0000152446
33psbC0.00190.06590.02940.0000141926
34psbD0.00620.05390.11500.0000105918
35psbE0.00000.10380.00000.00002495
36psbF0.01220.12990.09380.02341175
37psbH0.05320.12610.42220.115621914
38psbI0.00000.13440.00000.00001084
39psbJ0.02210.03640.60920.54521203
40psbK0.03070.0000NA0.05651834
41psbL0.00000.03880.00000.00001141
42psbM0.03960.04330.91360.66021024
43psbT0.00000.21670.00000.0000995
44psbZ0.00000.14320.00000.00001865
45rbcL0.01000.09270.10770.0000143138
45rpl20.00920.00481.9090.5022140711
46rpl140.03280.13220.24820.004336619
48rpl200.04960.08380.59140.381934819
49rpl220.05920.16710.35400.007250137
50rpl230.00910.0000NA0.00002792
51rpl320.12660.43130.29360.007515322
52rpl330.05030.12370.40680.100619813
53rpl360.03570.04140.86270.64081114
54rpoA0.04630.13710.33780.000198461
55rpoB0.01700.10100.16860.00003210117
56rpoC20.03460.09650.35890.00004092190
57rps20.03080.05550.55420.138970825
58rps30.04930.14590.33780.001165442
59rps40.04410.09460.46560.047560332
60rps70.03620.03201.1310.977246516
61rps80.03610.08380.43010.079040218
62rps110.07990.17490.45670.019741438
63rps140.03510.06280.55860.262230012
64rps150.03170.0000NA0.00732706
65rps180.06340.03381.8790.513027615
66rps190.04960.10940.45340.195527016
67ycf10.07460.05081.4690.00755316353
68ycf20.04240.01413.0070.00006717232
69ycf40.02320.05950.39020.056756117
70Concatenated dataset0.03730.08570.43500.000065,1723019

NA not available.

Selection pressure analysis of 69 shared protein-coding gene sequences for Aquilaria sinensis (GenBank accession MN720647) and Wikstroemia capitata, analysed separately and combined. NA not available.

Phylogenetic analysis

The maximum-likelihood (ML) and Bayesian inference (BI) trees based on the complete plastome sequences excluding the IRa sequences and the dataset of the intergenic spacer (IGS) sequences revealed that all the branch nodes for eight species of Wikstroemia included in the phylogenetic tree were supported with high bootstrap values and Bayesian posterior probabilities (ML: ≥ 90%; BI: ≥ 95%) (Fig. 7). For the dataset of the total gene sequences containing protein-coding genes, tRNAs, and rRNAs that are shared by all species, strong posterior probabilities were recorded in most of the branch nodes of the BI tree but not in the ML tree, in which moderate bootstrap support was recorded for the backbone structure of the Wikstroemia clade (see Supplementary Figure S1 online). The molecular placement of W. capitata and W. indica, forming sister to each other under low branch support, in the ML tree and BI tree based on the dataset of all protein-coding genes was incongruent with the phylogenetic trees based on the datasets using complete plastome sequences excluding the IRa and intergenic spacer (IGS) sequences. The ML trees and BI trees based on the datasets of the first, second, and third codons of each amino acid in the protein-coding sequences did not display matching molecular placement of Wikstroemia when compared with each other; most of the branches were poorly supported in the Wikstroemia clade (see Supplementary Figure S2 online). The phylogenetic tree using complete plastome sequences excluding the IRa sequences suggested that a paraphyletic relationship was present in Wikstroemia. Two species, W. alternifolia and W. canescens, were clustered with Stellera chamaejasme, while six species of Wikstroemia (W. capitata, W. chamaedaphne, W. dolicantha, W. indica, W. micrantha and W. scytophylla) formed a monophyletic group.
Figure 7

Maximum likelihood (ML) and Bayesian inference (BI) of Wikstroemia and allied genera based on the complete plastome sequences excluding the inverted repeat A (IRa) region, and a dataset of the intergenic spacer (IGS) regions of 17 taxa representing 5 genera of Thymelaeaceae, analysed separately. Branch nodes that were calculated with reliable support values (ML: bootstrap ≥ 75%; BI: posterior probability ≥ 0.90) are indicated with an asterisk (*). Sequences obtained through this study are indicated in bold; two species, Psidium guajava (KY635879) and Gossypium gossypioides (HQ901195), were included as outgroups.

Maximum likelihood (ML) and Bayesian inference (BI) of Wikstroemia and allied genera based on the complete plastome sequences excluding the inverted repeat A (IRa) region, and a dataset of the intergenic spacer (IGS) regions of 17 taxa representing 5 genera of Thymelaeaceae, analysed separately. Branch nodes that were calculated with reliable support values (ML: bootstrap ≥ 75%; BI: posterior probability ≥ 0.90) are indicated with an asterisk (*). Sequences obtained through this study are indicated in bold; two species, Psidium guajava (KY635879) and Gossypium gossypioides (HQ901195), were included as outgroups. The ITS-based ML tree revealed a paraphyletic relationship between Wikstroemia and S. chamaejasme, while most of the branch nodes within the Wikstroemia clade were not highly supported (Fig. 8A). Strong bootstrap support was recorded for the sistership between W. alternifolia and W. canescens and between W. micrantha and W. stenophylla. Weakly supported sisterships were present between W. dolicantha and W. scytophylla and between W. capitata and W. ligustrina. In contrast, the BI analysis displayed a monophyletic relationship within the Wikstroemia clade (Fig. 8B). Similar to the ML tree, sisterships were strongly supported between W. alternifolia and W. canescens and between W. micrantha and W. stenophylla but not between W. dolicantha and W. scytophylla or between W. capitata and W. ligustrina in the BI tree.
Figure 8

Phylogenetic analyses of Thymelaeaceae based on nuclear ribosomal DNA internal transcribed spacer (ITS) gene sequences of 34 taxa representing 6 genera of Thymelaeaceae. (A) Maximum-likelihood (ML) and (B) Bayesian inference (BI) tree analyses were conducted with 1000 bootstrap replicates. Branch nodes that were calculated with reliable support values (ML: bootstrap ≥ 75%; BI: posterior probability ≥ 0.90) are indicated with an asterisk (*). Two species, Psidium guajava (MN2953604) and Gossypium australe (AF057763), were included as outgroups.

Phylogenetic analyses of Thymelaeaceae based on nuclear ribosomal DNA internal transcribed spacer (ITS) gene sequences of 34 taxa representing 6 genera of Thymelaeaceae. (A) Maximum-likelihood (ML) and (B) Bayesian inference (BI) tree analyses were conducted with 1000 bootstrap replicates. Branch nodes that were calculated with reliable support values (ML: bootstrap ≥ 75%; BI: posterior probability ≥ 0.90) are indicated with an asterisk (*). Two species, Psidium guajava (MN2953604) and Gossypium australe (AF057763), were included as outgroups.

Discussion

The plastomes of the species in Wikstroemia examined in this study were highly conserved, which is similar to the situation in other angiosperms. The length of the plastomes of the six species of Wikstroemia varied little and were similar in size to typical angiosperms[25]. The same number and contents of the genes were predicted in this study, suggesting that the evolution of the gene sequences was consistent across the six species. Similar to most angiosperms, sequence repeats for A/T were more abundant than those of G/C in the Wikstroemia plastomes and may represent bias in the base composition, which is potentially affected by the tendency of the genome to change to A-T rather than to G-C[26]. An additional validation step for these SSRs, for which five novel SSR primer sets were designed, was conducted for the six species of Wikstroemia reported in this study. Details of the newly designed SSR primer sets and the resulting pherograms are included for reference (see Supplementary Table S3 and Data S1 online). Expansion and contraction of the IR region are major evolutionary events that influence the length of the plastomes[27]. The IR junctions in the plastomes reported in this study were placed and annotated with Geneious Prime[28] and further validated with GeSeq[29] as well as Sanger sequencing using novel specific primer sets (see Supplementary Table S4 and Data S2 online). Our study indicated that the contractions and expansions of the IR regions exhibited relatively stable patterns within Wikstroemia, with slight variation; gene recombination between the repetitive sequence or poly-A structure and tRNA could be one of the reasons for the change in length in the IR region[30]. However, W. indica indicated dissimilarity in its IR borders, which differed from most angiosperms[31]. We suspect that the plastome IR contraction and expansion in W. indica is severe and may be due to extensive gene transfer and larger IR expansion due to the results of the double strand break repair mechanism[32-34]. Interestingly, when compared to other species of Wikstroemia sequenced in this study, the plastome of W. indica was smaller (151,731 bp) and had a greater GC content (37.4%)[8]. We found that the plastome of W. indica had a shorter IR region and larger SSC region than other species of Wikstroemia. Changes in the placement of the IR borders in the plastome of W. indica were likely due to contraction of the IR region, causing a loss in the number and content of the genes. Among the genes that were not found in W. indica but were present in other species of Wikstroemia, ndhA, ndhG, and ndhI were supposed to be present in the IR region; genes such as ccsA, ndhD, ndhE, ndhH, psaC, rps15, and trnL-UAG that are commonly duplicated in the IR regions were reduced to only one copy and were transferred to the SSC region, while the ndhF and rpl32 genes, common genes in the SSC region, were not detected. Therefore, it can be concluded that the contraction of the IR region that caused gene loss contributed to the difference in plastome content between W. indica and the other seven species of Wikstroemia. Molecular evidence based on plastome sequences revealed a nonmonophyletic relationship between the species of Wikstroemia due to W. alternifolia and W. canescens clustering with Stellera chamaejasme. Information on the phylogenetic relationships of Wikstroemia species is scarce. Although taxonomic work is challenging in a genus with diverse species, continuous efforts among taxonomists studying members of the Thymelaeaceae have provided some insights into the taxonomic status of Wikstroemia. To provide better insight into the phylogenetic relationships at the nuclear level, we used ITS sequences to perform ML and BI analyses. Unlike phylogenomic tree analyses on complete plastome sequences, low bootstrap support and Bayesian posterior probabilities were observed at the species level in Wikstroemia. The molecular placement of the species of Wikstroemia, however, was identical in both the ML and BI trees, while the most distinct difference between both phylogenetic trees was the placement of S. chamaejasme. In the ML tree based on the ITS sequences, S. chamaejasme clustered within the Wikstroemia clade, but it was sister to Wikstroemia in the BI tree. The discordance between the plastid and nuclear phylogenies in this study may be due to phylogenetic sorting, convergence, unequal rates of evolution, long branch attraction, and introgression[35]. However, low branch node support in both the ITS-based ML and BI trees suggested that either the inclusion of additional nuclear gene sequences or the application of the restriction site-associated DNA sequence (RAD-Seq) technique that integrates up to 10% of the nuclear genome[36] could be helpful in resolving the phylogenetic relationships within Wikstroemia. Evidently, in this study, the use of a single nuclear gene sequence, i.e., ITS, which was suspected to be useful in delimiting many plants at the species level[37], was insufficient for resolving the phylogenetic relationships between Stellera and Wikstroemia. Members of Wikstroemia currently comprise species previously placed under Capura L., Daphne L., Diplomorpha Meisn., Daphnimorpha Nakai, Lonicera L., Passerina L., Restella Pobed., and Stellera L.[1,38]. The monotypic genus Stellera, which exhibits strikingly similar morphological characteristics, has troubled some taxonomists who compared it to Wikstroemia. At least five species were placed under Stellera before they were transferred to Wikstroemia; others were transferred to allied genera, such as Daphne, Diarthron and Thymelaea in the tribe Daphneae[38]. This is understandable, as Stellera has a longer taxonomic history, i.e., back to 1747, when compared to other genera in the Daphneae. As a result, S. chamaejasme, as the type species, is the only species left in the genus. Based on the literature, we found that Wikstroemia has an interesting nomenclatural history in which two genera, Diplomorpha and Daphnimorpha, were synonymized and excluded. Combining Stellera with Wikstroemia was previously proposed by transferring the type species S. chamaejasme to the monotypic subgenus Chamaejasme[11,39]. However, the proposal was rejected, as Stellera has priority over Wikstroemia[40], and based on the Rules of Nomenclature, the combination can only be accepted if Stellera is proposed as a nomen genus rejiciendum (nom. gen. rejic.)[12]. Therefore, we do not exclude the possibility that Stellera should be synonymized with Wikstroemia. In that case, Wikstroemia would be synonymized under Stellera. One should not jump to such a conclusion rashly, based on the current situation, as the taxonomic dispute on whether Wikstroemia should be synonymized with Daphne is yet unresolved[41]. Unless Daphne is considered in a subsequent taxonomic treatment, based on the phylogenetic trees in this study, we could only conclude that Wikstroemia is not monophyletic and that Stellera is unquestionably closely related to Wikstroemia. While phylogenetic analyses based on the plastome sequences of Wikstroemia have proven to be promising, we suggest that larger sampling is required to resolve the taxonomic dispute in Wikstroemia through a molecular approach. We foresee that the genetic information in the complete plastome sequences of Wikstroemia is deemed sufficient and could aid in the classification of Wikstroemia, both at the genus level and at the species level.

Conclusion

To the best of our knowledge, this study presents the first genome-scale analysis of species of Wikstroemia. The findings revealed high conservation of genes in the plastomes. The identification of highly variable gene regions in the plastome sequences of Wikstroemia could potentially be useful in resolving phylogenetic relationships in the genus. A strong sistership between Wikstroemia and the monotypic genus Stellera was present. The ML and BI trees based on the plastome sequences revealed that all the branch nodes for eight species of Wikstroemia included in the phylogenetic tree were supported with high bootstrap values and Bayesian posterior probabilities (ML: ≥ 90%; BI: ≥ 95%), while the ITS-based tree analyses could not properly resolve the phylogenetic relationship between Stellera and Wikstroemia. Nevertheless, the molecular data obtained in this study will serve as a valuable resource for providing greater insights into the taxonomy and phylogeny of Thymelaeaceae.

Materials and methods

Plant materials and DNA extraction

Fresh leaf materials of six species of Wikstroemia, W. alternifolia, W. canescens, W. capitata, W. dolicantha, W. micrantha and W. scytophylla, were collected from botanical gardens and natural populations in China (Table 1). Species identification was carried out by Yonghong Zhang, and the voucher specimens were deposited in the Herbarium of Yunnan Normal University (YNUB)[42]. Based on the local guidelines and legislation on plant study, permissions for collections and research were unnecessary, as the samples were not collected in protected areas or recorded as threatened species. However, W. scytophylla was collected under permit record number w2021005, which was authorized in the Kunming Botanical Garden, Chinese Academy of Science, China. All collections are permitted and legal. Total genomic DNA was extracted using the Axygen AxyPrep Multisource Genomic Miniprep DNA kit (Corning, USA) following the manufacturer’s protocol.

Plastome sequencing, assembly and annotation

A sequence library was constructed, and sequencing was performed on the Illumina HiSeq 2500-PE150 platform (Illumina, USA). All raw reads were filtered using NGS QC Toolkit version 2.3.3 with default parameters to obtain clean reads[43]. The plastome was de novo assembled using NOVOPlasty[44] with the rbcL gene sequence of Daphne kiusiana (GenBank accession KY991380) as the seed sequence. Gene annotation was performed in Geneious Prime[28] using the complete plastome sequence of W. chamaedaphne (GenBank accession MN563132) as the reference genome. The circular physical map of the plastome was generated using OGDRAW[45].

Repeat analyses

SSRs were identified using MISA-web[46], in which parameters for the identification of perfect mono-, di-, tri-, tetra-, penta-, and hexanucleotide motifs were set for a minimum of 10, 5, 4, 3, 3, and 3 repeats, respectively. Long repeats, including forward, palindrome, reverse and complement repeats, were determined using REPuter[47] with a Hamming distance of 3 and a minimal repeat size of 30 bp.

Codon usage

Coding sequences of each plastome were extracted, and the RSCU was analysed using MEGA7[48].

Comparative genome and divergence analyses

The complete plastome sequences of two species of Wikstroemia, W. chamaedaphne (GenBank accession MN563132) and W. indica (GenBank accession MN453832), which were available in NCBI GenBank, were downloaded and included in subsequent analyses. By using the plastome sequences of W. chamaedaphne as the reference genome, nucleotide variation in the plastome sequence alignment of the eight species of Wikstroemia was visualized using mVISTA[49] in Shuffle-LAGAN mode. To detect the expansion and contraction of the IR region in the plastomes across the eight species, the IR/SC boundaries of the plastomes were visualized using IRscope[50]. To detect the mutational hotspots and divergence regions in the plastomes of the eight species, sequence alignment of the plastome sequences was carried out using Geneious Prime[28]. Calculations of the nucleotide variability (Pi) among the eight plastomes were performed using DnaSP v5[51] with a window length of 1000 bp and a step size of 500 bp.

Selection pressure analysis

The ratio of nonsynonymous to synonymous substitutions (Ka/Ks) of protein-coding genes was calculated for Aquilaria sinensis (GenBank accession MN720647) and Wikstroemia capitata. Calculations were conducted for two sets of data: (1) shared genes analysed separately and (2) a combined dataset containing all shared genes. Prior to sequence alignment using MUSCLE embedded in MEGA7[48], the plastome sequence of A. sinensis was reannotated to ensure uniformity. For the combined dataset, the coding sequences were concatenated manually. Selection pressure acting on these genes was estimated using KaKs_Calculator 2.0[52] based on the Yang and Nielsen codon frequency (YN) model, with parameters for the initial ratio of transition to transversion frequency (K) set between 0.3 and 0.7. A Ka/Ks value equal to or less than 1.0 indicates the presence of purifying selection, in which changes in gene residues of amino acids that may favour excess synonymous versus nonsynonymous substitutions have been avoided, while the presence of positive selection is specified if the Ka/Ks value is more than 1.0.

Polymerase chain reaction and Sanger sequencing

Polymerase chain reaction (PCR) amplification was carried out in a 20 µL reaction volume using the ITS universal primer set: 5F: 5ʹ-GGAAGTAAAAGTCGTAA-CAAGG-3ʹ (forward) and 4R: 5ʹ-TCCTCCGCTTATTGATATGC-3ʹ (reverse). The PCRs for the nuclear ribosomal DNA ITS region contained 10 µL of 2× Taq PCR Starmix with loading dye (Genstar Biosolutions, China), 0.4 µM of each primer and 20 ng of genomic DNA as a template. PCR amplifications were conducted on a T100 Thermal Cycler (Bio-Rad, USA), with initial denaturation at 93 °C for 5 min; 40 cycles of denaturation at 93 °C for 30 s, annealing at 60 °C for 30 s, and extension at 72 °C for 30 s; and a final extension at 72 °C for 5 min. PCR products were sent for direct Sanger sequencing at both ends using an ABI 3730 DNA Analyzer (Applied Biosystems, USA).

Phylogenetic analyses

Phylogenetic analyses were conducted based on the plastome or gene sequences of 17 selected taxa from Thymelaeaceae. Two species, Psidium guajava (Myrtaceae; GenBank accession KY635879) and Gossypium gossypioides (Malvaceae; GenBank accession HQ901195), were included as outgroups. Seven datasets, including the (1) complete plastome sequences excluding IRa, (2) the total gene sequences containing protein-coding genes, tRNAs, and rRNAs that are shared by all species, (3) the intergenic spacer (IGS) sequences, (4) all protein-coding genes that are shared by all species, and (5) three additional subdatasets at the codon level for the first/second/third codons of each amino acid in the protein-coding sequences, were used to perform phylogenetic inferences. Part of the complete plastome sequences excluding the Ira and the targeted genic and intergenic regions in the plastomes, was extracted and concatenated using Geneious Prime[28], while the first/second/third codons of each amino acid in the shared genes were extracted using MEGA7[48]. Sequence alignment was carried out using MAFFT v7.450[53]. The ML tree was constructed based on all the sequence datasets using RAxML 8.2.11[54]. The general-time-reversible (GTR) and gamma distributed (+ G) (+ GTR + G) DNA substitution model was selected, and all branch nodes were calculated under 1000 bootstrap replicates. BI analysis was conducted for all the datasets[54,55]. BI analysis was executed through the MrBayes[55] pipeline available in the CIPRESS Science Gateway web portal[56]. Markov chain Monte Carlo (MCMC) was conducted with 2,000,000 generations, and sampling was collected every 100 cycles. The final tree was visualized using FigTree[57] and edited manually. The ITS sequences were aligned and manually trimmed for their primer sequences to obtain clean sequences. A total of 26 additional ITS sequences derived from members of the Thymelaeaceae were downloaded from the NCBI GenBank and MUSCLE-aligned against the ITS sequences of the six species of Wikstroemia used in this study using MEGA7[48]. Two species, P. guajava (Myrtaceae; GenBank accession MN295360) and Gossypium australe (Malvaceae; GenBank accession AF057763), were included as outgroups. The alignment was trimmed using trimAL v1.2[58] with the gappyout method to reduce systematic errors produced by poor alignment. The optimal DNA substitution model for the ML analysis using the “Find Best DNA/Protein Model (ML)” function embedded in MEGA7[48] was calculated to be the Kimura two-parameter (K2P) model with the discrete Gamma model (+ G4) and invariant sites included (+ I) (= K2P + G + I). ML analysis was performed using MEGA7[48] with 1000 bootstrap replicates. BI analysis was conducted with a previously described method[55]. Supplementary Information 1. Supplementary Information 2.
  31 in total

1.  OrganellarGenomeDRAW (OGDRAW): a tool for the easy generation of high-quality custom graphical maps of plastid and mitochondrial genomes.

Authors:  Marc Lohse; Oliver Drechsel; Ralph Bock
Journal:  Curr Genet       Date:  2007-10-24       Impact factor: 3.886

2.  DnaSP v5: a software for comprehensive analysis of DNA polymorphism data.

Authors:  P Librado; J Rozas
Journal:  Bioinformatics       Date:  2009-04-03       Impact factor: 6.937

3.  IRscope: an online program to visualize the junction sites of chloroplast genomes.

Authors:  Ali Amiryousefi; Jaakko Hyvönen; Peter Poczai
Journal:  Bioinformatics       Date:  2018-09-01       Impact factor: 6.937

Review 4.  Bioactive components and pharmacological action of Wikstroemia indica (L.) C. A. Mey and its clinical application.

Authors:  Yan-Min Li; Liang Zhu; Jian-Guo Jiang; Li Yang; Ding-Yong Wang
Journal:  Curr Pharm Biotechnol       Date:  2009-12       Impact factor: 2.837

5.  Geneious Basic: an integrated and extendable desktop software platform for the organization and analysis of sequence data.

Authors:  Matthew Kearse; Richard Moir; Amy Wilson; Steven Stones-Havas; Matthew Cheung; Shane Sturrock; Simon Buxton; Alex Cooper; Sidney Markowitz; Chris Duran; Tobias Thierer; Bruce Ashton; Peter Meintjes; Alexei Drummond
Journal:  Bioinformatics       Date:  2012-04-27       Impact factor: 6.937

6.  MrBayes 3.2: efficient Bayesian phylogenetic inference and model choice across a large model space.

Authors:  Fredrik Ronquist; Maxim Teslenko; Paul van der Mark; Daniel L Ayres; Aaron Darling; Sebastian Höhna; Bret Larget; Liang Liu; Marc A Suchard; John P Huelsenbeck
Journal:  Syst Biol       Date:  2012-02-22       Impact factor: 15.683

7.  KaKs_Calculator 2.0: a toolkit incorporating gamma-series methods and sliding window strategies.

Authors:  Dapeng Wang; Yubin Zhang; Zhang Zhang; Jiang Zhu; Jun Yu
Journal:  Genomics Proteomics Bioinformatics       Date:  2010-03       Impact factor: 7.691

8.  MISA-web: a web server for microsatellite prediction.

Authors:  Sebastian Beier; Thomas Thiel; Thomas Münch; Uwe Scholz; Martin Mascher
Journal:  Bioinformatics       Date:  2017-08-15       Impact factor: 6.937

9.  Characterization of the complete chloroplast genome of a medicinal plant, Wikstroemia indica (Thymelaeaceae).

Authors:  Shao-Juan Qian; Yong-Hong Zhang
Journal:  Mitochondrial DNA B Resour       Date:  2019-12-12       Impact factor: 0.658

10.  Dynamics and evolution of the inverted repeat-large single copy junctions in the chloroplast genomes of monocots.

Authors:  Rui-Jiang Wang; Chiao-Lei Cheng; Ching-Chun Chang; Chun-Lin Wu; Tian-Mu Su; Shu-Miaw Chaw
Journal:  BMC Evol Biol       Date:  2008-01-31       Impact factor: 3.260

View more

北京卡尤迪生物科技股份有限公司 © 2022-2023.