Literature DB >> 28588587

A Dynamic Tandem Repeat in Monocotyledons Inferred from a Comparative Analysis of Chloroplast Genomes in Melanthiaceae.

Hoang Dang Khoa Do1, Joo-Hwan Kim1.   

Abstract

Chloroplast genomes (cpDNA) are highly valuable resources for evolutionary studies of angiosperms, since they are highly conserved, are small in size, and play critical roles in plants. Slipped-strand mispairing (SSM) was assumed to be a mechanism for generating repeat units in cpDNA. However, research on the employment of different small repeated sequences through SSM events, which may induce the accumulation of distinct types of repeats within the same region in cpDNA, has not been documented. Here, we sequenced two chloroplast genomes from the endemic species Heloniopsis tubiflora (Korea) and Xerophyllum tenax (USA) to cover the gap between molecular data and explore "hot spots" for genomic events in Melanthiaceae. Comparative analysis of 23 complete cpDNA sequences revealed that there were different stages of deletion in the rps16 region across the Melanthiaceae. Based on the partial or complete loss of rps16 gene in cpDNA, we have firstly reported potential molecular markers for recognizing two sections (Veratrum and Fuscoveratrum) of Veratrum. Melathiaceae exhibits a significant change in the junction between large single copy and inverted repeat regions, ranging from trnH_GUG to a part of rps3. Our results show an accumulation of tandem repeats in the rpl23-ycf2 regions of cpDNAs. Small conserved sequences exist and flank tandem repeats in further observation of this region across most of the examined taxa of Liliales. Therefore, we propose three scenarios in which different small repeated sequences were used during SSM events to generate newly distinct types of repeats. Occasionally, prior to the SSM process, point mutation event and double strand break repair occurred and induced the formation of initial repeat units which are indispensable in the SSM process. SSM may have likely occurred more frequently for short repeats than for long repeat sequences in tribe Parideae (Melanthiaceae, Liliales). Collectively, these findings add new evidence of dynamic results from SSM in chloroplast genomes which can be useful for further evolutionary studies in angiosperms. Additionally, genomics events in cpDNA are potential resources for mining molecular markers in Liliales.

Entities:  

Keywords:  Heloniopsis tubiflora; Liliales; Melanthiaceae; Parideae; Xerophyllum tenax; chloroplast genome; slipped-strand mispairing; tandem repeats

Year:  2017        PMID: 28588587      PMCID: PMC5438981          DOI: 10.3389/fpls.2017.00693

Source DB:  PubMed          Journal:  Front Plant Sci        ISSN: 1664-462X            Impact factor:   5.753


Introduction

Chloroplast genome sequences provide useful information for phylogenetic studies of higher level taxa, including families and orders (Zomlefer et al., 2001; Ji et al., 2006; Barrett et al., 2013; Kim and Kim, 2013; Kim et al., 2013, 2016a; Nguyen et al., 2013; Ruhfel et al., 2014). Structural changes such as small and large inversions, gene contents (duplication, triplication, and deletion), and pseudogenization have provided valuable resources for examining genome evolution among plants. Gene duplications have been reported in previous studies (Lee et al., 2007; Cai et al., 2008; Schmickl et al., 2009). Specifically, different copies of trnF_GAA were found in several genera of Brassicaceae (Schmickl et al., 2009). Additionally, repeated DNA sequences, which were assumed to have originated from different mechanisms such as gene conversion, unequal recombination, and slipped-strand mispairing (SSM), are main resources for genomic events of duplication, deletion, and rearrangement in chloroplast genomes (Levinson and Gutman, 1987; Cai et al., 2008; Huang et al., 2014; Sveinsson and Cronk, 2014). Melanthiaceae is a family within the order Liliales that includes 16 genera divided into five tribes: Melanthieae (7 genera), Heloniadeae (3 genera), Parideae (3 genera), Chionographideae (2 genera), and Xerophylleae (1 genus) (Angiosperm Phylogeny Group, 2009, 2016; Govaerts, 2016; WCSP, 2016). Prior to its grouping within Liliales, these genera were classified into different orders of Dioscoreales and Melanthiales based on the morphological characteristics of their extrorse anthers and ovaries, and often with the presence of three styles (Rudall et al., 2000). The tribe Parideae, comprising Paris, Psedotrillium, and Trillium, was formerly treated as an independent family, Trilliaceae (Thorne, 1992; Takhtajan, 1997). However, based on molecular and morphological data, this tribe was later reclassified as monophyletic within Liliales (Chase et al., 2000; Angiosperm Phylogeny Group, 2009; Kim and Kim, 2013). Recently, Pellicer et al. (2014) identified the extreme variations in genome size and a significant reduction in the number of chromosomes in Parideae. The evolution of the chloroplast genome (cpDNA) has been investigated in Veratrum patulum (Do et al., 2013), Chionographis japonica (Bodin et al., 2013), Paris verticillata (Do et al., 2014), Trillium species (Kim et al., 2016b), and Paris sp. (Huang et al., 2016), which represent three tribes, Melanthieae, Chionographideae, and Parideae, respectively. Specifically, different numbers of trnI_CAU and repeat sequences in rpl23-ycf2 regions and inversion was detected in tribe Parideae (Do et al., 2014; Huang et al., 2016; Kim et al., 2016b). The rps16 gene was completely lost in C. japoncica and partially deleted in V. patulum (Bodin et al., 2013; Do et al., 2014). Collectively, these findings suggest that Melanthiaceae possess evidence of different genomic events in cpDNA. Nonetheless, these genomic events in Melanthiaceae have not been fully characterized because of the lack of cpDNA data. In this study, we sequenced the complete chloroplast genomes of Heloniopsis tubiflora (GenBank Accession number KM078036) and Xerophyllum tenax (GenBank Accession number KM078035), representing the two unreported tribes of Heloniadeae and Xerophylleae, to cover the gap of cpDNA data within Melanthiaceae. Based on the complete cpDNA sequences, we characterized the differentiation, including gene loss, duplication, and fluctuation of IR-LSC boundary, among five tribes of Melanthiaceae. Then, we applied these features to create the first potential molecular marker for recognizing two sections of Veratrum. Additionally, we questioned the pattern and the mechanism of repeat's accumulation in rpl23-ycf2 regions. Therefore, we sequenced this region among representatives from other families in Liliales and conducted comparative analyses of sequence data to (1) investigate the pattern of repeat's accumulation within rpl23-ycf2 regions of examined species, and (2) propose hypothetical scenarios for the duplication process.

Materials and methods

Sample collection, DNA extraction, whole-genome sequencing, and assembly

Fresh leaves of H. tubiflora were collected in Deogyusan National Park, South Korea. Voucher specimens were deposited in the Herbarium of Gachon University (GCU). Dried leaves of X. tenax were obtained from the Forestfarm Plant Nursery (Williams, Oregon, USA). The plant materials used in this study were collected through the KNRRC (Medicinal Plants Resources Bank NRF-2010-0005790), supported by the Korea Research Foundation (resources provided by the Ministry of Education, Science and Technology in 2014). Total DNA was extracted using a DNAEasy Plant Mini Kit (Qiagen, Seoul, South Korea). These DNA samples were sequenced using the 454 system for H. tubiflora and the Hiseq2000 system for X. tenax. After removal of reads with ambiguous “N” bases, the remaining reads were trimmed with no more than a 5% chance of error per base before being mapped to the reference chloroplast genome sequences of C. japonica (Bodin et al., 2013) and P. verticillata (Do et al., 2014), to isolate chloroplast genome sequences using Geneious (Biomatters Ltd., Auckland, New Zealand). Based on the tribal relationship in Melanthiaceae (Kim et al., 2016a), we mapped the reads of H. tubiflora and X. tenax to cpDNA sequences of C. japonica and P. verticillata, respectively. The assembled reads were then extracted and reanalyzed in Geneious using the De Novo Assembly tool with the option of “no gaps or mismatches per read.” The consensus sequences generated from De Novo Assembly were used as references to reassemble raw reads. These steps were repeated until the complete cpDNA sequences were identified. Occasionally, gaps were present among chloroplast contigs. These remaining gaps were closed using the Sanger method with newly designed primers based on homologous sequences between the reads and reference sequences. Additionally, borders between the LSC, small single copy (SSC), and IR regions, as well as ambiguous regions (i.e., insertion and deletion events in coding regions and low coverage regions) were confirmed by Sanger sequencing methods. A total of 1,093,684 and 8,719,277 reads of H. tubiflora and X. tenax were generated, respectively. The results showed that cpDNA of H. tubiflora consisted of 37,973 (3.47%) out of 1,093,684 reads with a coverage rate of 19.2 x. For X. tenax, 196,299 reads (2.25%) belonged to chloroplast genome sequences with a coverage rate of 112.6 × over the cpDNA. The complete cpDNA sequences of X. tenax and H. tubiflora were deposited into GenBank under accession numbers KM078035 and KM078036, respectively (Table 1).
Table 1

Comparison of the features of cpDNAs from five families of Liliales.

FamilySpeciesAccession numberLength (bp)AT content (%)GC content (%)Protein coding genestRNAsrRNAs
TotalLSC regionSSC regionIR region
MelanthiaceaeParis verticillataKJ433485157,37982,72617,09728,37362.437.681304
Paris quadrifoliaKX784051157,09783,77218,28727,92462.337.781304
Paris cronquisitiiKX784041157,71084,50218,31627,44662.737.381304
Paris dunnianaKX784042157,98484,48218,36427,56962.837.281304
Paris fargesiiKX784043157,51884,54918,31127,32962.737.381304
Paris forrestiiKX784044158,34584,39618,67127,63962.737.381304
Paris luquanensisKX784045158,45184,40818,40327,82062.737.381304
Paris maireiKX784046157,89184,42018,36127,55562.737.381304
Paris marmorataKX784047157,56684,22118,30127,52262.737.381304
Paris polyphylla var. chinensisKX784048158,30785,18718,17527,47362.837.281304
Paris polyphylla var. yunnanensisKX784049157,54784,22418,31927,50262.737.381304
Paris vietnamensisKX784050158,22484,79418,36027,53562.837.281304
Trillium maculatumKR780075157,35986,34019,94925,53562.537.581304
Trillium tschonoskiiKR780076156,85283,98119,86926,50162.537.581304
Xerophyllum tenaxKM078035156,74683,91018,09627,37062.237.881304
Heloniopsis tubifloraKM078036157,94084,84018,01827,54162.537.581304
Chionographis japonicaKF951065154,64681,65318,19527,39962.337.780304
Veratrum patulumKF437397153,69983,37217,60726,36062.337.781304
ColchicaceaeColchicum autumnaleKP125337156,46284,24616,73427,74162.437.680304
Gloriosa superbaKP125338157,92485,01216,78628,06362.437.681304
SmilacaceaeSmilax chinaHM536959157,87884,60818,53627,37662.7537.2580304
LiliaceaeLilium longiflorumKC968977152,79382,23017,52326,52062.9837.0281304
AlstroemeriaceaeAlstroemeria aureaKC968976155,51084,24117,86726,70162.7437.2680304

LSC, Large single copy; SSC, small single copy; IR, inverted repeat; bp, base pairs.

Comparison of the features of cpDNAs from five families of Liliales. LSC, Large single copy; SSC, small single copy; IR, inverted repeat; bp, base pairs.

Genome annotation, comparison, visualization, and characterization of repeat sequences

The complete cpDNA sequences of H. tubiflora and X. tenax were annotated using Geneious. All tRNA sequences were confirmed using the online web-based tool tRNAScan-SE (Schattner et al., 2005). The Mauve alignment, embedded in Geneious, was used to compare the 23 complete cpDNA sequences and to identify significant differentiation, including gene loss, duplication, and fluctuation of IR-LSC boundary with default settings (Darling et al., 2004). Genome maps were generated using OGDraw v1.2 (Lohse et al., 2013), followed by manual modifications. The maps of cpDNA sequences of X. tenax and H. tubiflora, which illustrate the genome structure and gene composition and order, were shown in Figure 1. The locations of repeat sequences were identified using Phobos (Mayer, 2006) with default settings.
Figure 1

Map of . Genes shown outside of the outer circle are transcribed counterclockwise, whereas those shown inside are transcribed clockwise. The thick lines in the small circles indicate the inverted repeat regions. The dark gray area in the inner circle indicates the CG content of the chloroplast genome. The colors represent different groups of genes in cpDNA. LSC, Large single copy; SSC, small single copy; IRA, inverted repeat region A; IRB, inverted repeat region B.

Map of . Genes shown outside of the outer circle are transcribed counterclockwise, whereas those shown inside are transcribed clockwise. The thick lines in the small circles indicate the inverted repeat regions. The dark gray area in the inner circle indicates the CG content of the chloroplast genome. The colors represent different groups of genes in cpDNA. LSC, Large single copy; SSC, small single copy; IRA, inverted repeat region A; IRB, inverted repeat region B.

Characterization of rps16 loss and junction of IR-LSC regions

After conducting alignment of complete cpDNA sequences using Geneious, we designed primer pairs for amplifying trnK_UUU-trnQ_UUG region, containing rps16 (data not shown). Because of different PCR results of verifying rps16 loss in tribe Melanthieae, primer pairs (rps16-F: 5′-GTCAATATGAATGTTGATAA-3′ and rps16-R: 5′-TTTTCTATTCCATACACATG-3′) were designed using Primer3 (Untergasser et al., 2012). The PCR profile consisted of denaturation at 94°C for 3 min followed by 35 cycles at 94°C for 1 min, 54°C for 1 min, and 72°C for 1 min, with a final extension at 72°C for 7 min. These newly designed primer pairs were applied for Veratrum species of two sections including Veratrum (V. oxysepalum—JK Hong 015; V. lobelianum—Chase 19618; V. grandiflorum—KUN0303565) and Fuscoveratrum (V. versicolor—GCU1411788; V. maackii—KWU03358; V. nigrum—GCU121205181). All PCR products were purified using the MEGAquick-spin Total Fragment DNA Purification Kit (iNtRON Biotechnology, Seoul, Korea) and sequenced using the BigDye Terminator Cycle Sequencing Kit (Applied Biosystems, Foster City, CA, USA) according to the manufacturer's instructions. These sequences were assembled and annotated in Geneious. For identifying junction of IR/LSC regions, we designed primer pairs for rpl2-psbA and rpl2-rps3 regions (data not shown). The PCR products were sequenced and then annotated in Geneious. After getting annotation results, these sequences were aligned to identify the border among examined species.

Confirmation of duplication events in liliales

To confirm the duplication patterns in Liliales, we sampled 68 taxa of 9 families within Liliales, except Corsiaceae which contains mycoheterotrophic species (Table 2). Total genomic DNA was extracted from dried leaves using a modified CTAB method based on Doyle and Doyle (1987). Primer pairs that amplify the entire rpl23-ycf2 IGS were applied with the same PCR profile from the description of Kim et al. (2016b). The further steps for PCR product's treatment were conducted identically as the description above.
Table 2

List of the surveyed taxa with the length of .

SpeciesVoucher (Herbarium) or accession number or referencesLength of rpl23-ycf2 IGS (bp)Number of trnI_CAU copyLength of IGS of rpl23-trnI and trnI-ycf2Total number of repeat units (>=18 bp)Hypothetical scenarios
Melanthiaceae
Paris verticillataS.C Kim 009 (GCU)5913172/673III-B
Paris quadrifoliaIan Christie—SRGC5913172/673III-B
Paris incompletaBONN 227066403164/923
Paris japonicaChase 29052 (KEW)8182159/1462#
Paris dulongensisKUN 03015427241159/49116III-A
Paris fargesiiKim et al., 2016b7081159/47516III-A
Paris maireiKim et al., 2016b6121159/37912III-A
Paris luquanensisKim et al., 2016b6361159/40313III-A
Paris rugosaKim et al., 2016b6121159/37912III-A
Paris dunnianaKim et al., 2016b4201159/1874III-A
Paris thibeticaKim et al., 2016b4851159/2527III-A
ParisaxialisKim et al., 2016b5011159/2686III-A
Paris vietnamensisKim et al., 2016b5001159/2676III-A
Paris polyphylla var. chinensisKim et al., 2016b5231159/29012III-A
Paris polyphylla var. polyphyllaKim et al., 2016b6121159/2908III-A
Paris polyphylla var. stenophyllaKim et al., 2016b6361159/40313III-A
Pseudotrillium rivaleKim et al., 2016b2991164/610
Trillium undulatumKim et al., 2016b3071165/680
Trillium decumbensNC_0272824021205/1230
Trillium cuneatumNC_0271853781210/940
Trillium smaliiKim et al., 2016b4471173/2007III-A
Trillium tschonoskiiKR7800764831240/1697III-A
Trillium flexipesKim et al., 2016b6371173/39018#III-A*
Trillium simileKim et al., 2016b6511173/40419#III-A*
Trillium rugeliiKim et al., 2016b6651169/42220#III-A*
Trillium erectumKim et al., 2016b6691173/42220#III-A*
Trillium underwoodiiKim et al., 2016b6022205/1132#II*
Trillium chloropetalumKim et al., 2016b5812205/1022II
Trillium luteumKim et al., 2016b5652210/942II*
Trillium sessileKim et al., 2016b5652210/942II*
Trillium maculatumKR7800755692214/942II*
Trillium govanianumKim et al., 2016b6123164/653I
Xerophyllum tenaxKM0780353061164/680
Xerophyllum asphodeloidesKim et al., 2016b3061164/680
Chionographis japonicaKF9510653021160/680
Heloniopsis tubifloraKM0780363021160/680
Veratrum patulumKF4373973041157/730
Anticlea elegans*RBGE 19560436A3221157/912
Stenanthium densum*RBGE 19661391A3221157/912
Toxicoscordion micranthusRBGE 19951902A3041157/730
Schoenocaulon coricifoliumChase 18523041157/730
Zigadenus glaberrimusChase 1532991157/680
Liliaceae
Lilium longiflorumKC9689773071165/680
Fritillaria cirrosaNC_0247283071165/680
Calochortus venustusGCU062003071165/680
Tricyrtis macropodaGCU061993071165/680
Gagea trifloraTUT330423071165/680
Erythronium japonicumGCU051683071165/680
Tulipa sylvestrisAvon Bulbs 131-37B53071165/680
Clintonia udensisHong 0533071165/680
Streptopus ovalisKWU014113071165/680
Smilacaceae
Smilax chinaHM5369593071165/680
Smilax nipponicaHong 0083071165/680
Smilax glyciphyllaKim and Bodin 2013-2 (GCU)3071165/680
Heterosmilax chinaC.X Fu (KUN)3071165/680
Philesiaceae
Philesia magellanicaCL0BR19850151 Botanical Garden Maise, Belgium3071165/680
Rhipogonaceae
Rhipogonum scandensF2010036. Garden of Auckland, New Zealand3071165/680
Colchicaceae
Colchicum autumnaleKP1263372981165/590
Gloriosa superbaKP1253383071165/680
Wurmbea burtiiPeter Brownless, Royal Botanic Garden Edinburgh3021160/680
Tripladenia cunninghamiiKim and Bodin 2013-1 (GCU)2991157/680
Uvularia grandifloraFloden et al., 12463071165/680
Disporum smilacinumS.C. Kim 05267 (GCU)3071165/680
Alstroemeriaceae
Alstroemeria aureaKC9689763071165/680
Bomarea edulisKM2336413071165/680
Luzuriaga radicansKM2336403071165/680
Petermanniaceae
Petermannia cirrosaKim and Bodin 2013-3(GCU)3071165/680
Campynemataceae
Campynema lineareNC_0267853121170/680

KUN, Herbarium of Kunming Institute of Botany; TUT, Daejeon University Herbarium; GCU, Gachon University Herbarium; KEW, the Royal Botanic Gardens Kew; SRGC, Scottish Rock Garden Club. The dagger (≠) indicates complex duplication events which are not able to be divided into three scenarios. The number sign (#) represent the species which possess two type of repeat sequences. The asterisk (.

List of the surveyed taxa with the length of . KUN, Herbarium of Kunming Institute of Botany; TUT, Daejeon University Herbarium; GCU, Gachon University Herbarium; KEW, the Royal Botanic Gardens Kew; SRGC, Scottish Rock Garden Club. The dagger (≠) indicates complex duplication events which are not able to be divided into three scenarios. The number sign (#) represent the species which possess two type of repeat sequences. The asterisk (.

Results

Features of cpDNA among five families of liliales and five tribes of melanthiaceae

The lengths of circular double-stranded DNA molecules differed among the families, ranging from 152,793 bp (Lilium longiflorum, Liliaceae) to 158,451 bp (Paris luquanensis, Melanthiaceae, Table 1). In Melanthiaceae, V. patulum (tribe Melanthieae) possesses the smallest cpDNA whereas the biggest cpDNA belongs to Paris species (tribe Parideae, Table 1). Although, the lengths varied, the AT and GC contents were relatively stable among the observed taxa (Table 1). Most of the examined species have 81 protein-coding genes, 30 tRNAs and 4 rRNAs in cpDNA sequences. However, there were 80 protein-coding genes in C. japonica, Colchicum autumnale, and Alstroemeria aurea because of the complete loss of rps16 in C. japonica and the deletion of whole region of ycf15 in C. autumnale and A. aurea. Comparative genomic analysis among five representative taxa of Melanthiaceae revealed that the deletion of rps16 was only found in tribes Chionographideae and Melanthieae. Further investigation on rps16 among genera of tribe Melanthieae revealed that the loss of rps16 was not common and only found in Veratrum, Toxicoscordion, and Schoenocaulon (Figure 2). Specifically, exon 1 of rps16 was deleted in Schoenocaulon whereas a part of exon 2 of this gene (47 bp) remained in Toxicoscordion. In contrast to complete loss of rps16 in C. japonica, only exon 2 of this gene was deleted in V. patulum, suggesting that there were different stages of this event in Veratrum genus which was divided into two sections: Veratrum and Fuscoveratrum. To track the loss of rps16, one primer pairs which covered the whole coding regions of rps16 was designed and applied for Veratrum species (Figure 3A). As expected, the PCR results revealed that there were two types of deletion of rps16 in Veratrum (Figure 3B). The first type was found in section Veratrum of which exon 2 of rps16 was lost. Remaining of exon 1 of rps16 resulted in a PCR product of ~1.5 kb in three examined taxa of section Veratrum (V. oxysepalum, V. lobelianum, and V. grandiflorum; Figures 3A,B). In contrast, the deletion of whole coding region of rps16 in section Fuscoveratrum caused a 400 bp-PCR product in V. versicolor, V. maackii, and V. nigrum (Figures 3A,B). In Liliaceae, Smilacaceae, Altroemeriaceae, and Colchicaceae, the intact coding sequence of rps16 was found (Table 1, Figure 2).
Figure 2

Comparison of . The simplified phylogenetic tree was based on Kim et al. (2016a). The dashed line box indicates the lost sequences and the dotted line showed the significantly changed sequences in comparison with others. Par, Parideae; Xer, Xerophylleae; Hel, Heloniadeae; Chi, Chionographideae; Mel, Melanthieae; Lil, Liliaceae; Smi, Smilacaceae; Col, Colchicaceae. The question mark (?) means missing data. The asterisk shows the genus which has two types of rps16 loss (A) completely lost and (B) partially lost.

Figure 3

Confirmation of The design of primers (Forward primer: rps16-F; Reverse primer: rps16-R) which cover the whole coding sequences of rps16 based on the cpDNA sequence of V. patulum (Accession number KF437397) and their positions in two types of rps16 gene loss. Expected lengths are ~1.5 kb in section Veratrum and 400 bp in section Fuscoveratrum. (B) Results of PCR among species of Veratrum. The section Veratrum includes V. oxysepalum, V. Lobelianum, and V. grandiflorum. The section Fuscoveratrum includes V. versicolor, V. maackii, and V. nigrum.

Comparison of . The simplified phylogenetic tree was based on Kim et al. (2016a). The dashed line box indicates the lost sequences and the dotted line showed the significantly changed sequences in comparison with others. Par, Parideae; Xer, Xerophylleae; Hel, Heloniadeae; Chi, Chionographideae; Mel, Melanthieae; Lil, Liliaceae; Smi, Smilacaceae; Col, Colchicaceae. The question mark (?) means missing data. The asterisk shows the genus which has two types of rps16 loss (A) completely lost and (B) partially lost. Confirmation of The design of primers (Forward primer: rps16-F; Reverse primer: rps16-R) which cover the whole coding sequences of rps16 based on the cpDNA sequence of V. patulum (Accession number KF437397) and their positions in two types of rps16 gene loss. Expected lengths are ~1.5 kb in section Veratrum and 400 bp in section Fuscoveratrum. (B) Results of PCR among species of Veratrum. The section Veratrum includes V. oxysepalum, V. Lobelianum, and V. grandiflorum. The section Fuscoveratrum includes V. versicolor, V. maackii, and V. nigrum. Sequences flanking the LSC/IR junction were compared between other taxa of the Liliales (Figure 4). The IR/LSC borders varied among the taxa. Specifically, the IR/LSC borders located in coding region of rps19 in Liliaceae and Colchicaceae. Meanwhile, it expanded to a part of rpl22 in Smilacaceae (Figure 4). In Melanthiaceae, it occurred in the trnH_GUG/rps19 intergenic spacer in Veratrum and Toxicoscordion. However, in other taxa, it expanded into a part of rps19 (350 bp in Trillium), into the rps19/rpl22 intergenic spacer (IGS; Anticlea and Stenanthium), into the rpl22/rps3 IGS (Heloniopsis), and into a section of rps3 (6 bp in Xerophyllum and Paris; 83 bp in Chionographis; 65 bp in Schoenocaulon; 161 bp in Zigadenus). Compared to other tribes, Melanthieae possessed a wide range of IR/LSC junctions (Figure 4).
Figure 4

Comparison of borders between IR and LSC regions among species of Melanthiaceae and other families. The simplified phylogenetic tree was based on Kim et al. (2016a). The dotted line indicates the junction of IR-LSC regions. The (Ψ) represents pseudogenes. Par, Parideae; Xer, Xerophylleae; Hel, Heloniadeae; Chi, Chionographideae; Mel, Melanthieae; Lil, Liliaceae; Smi, Smilacaceae; Col, Colchicaceae. The question mark (?) means missing data.

Comparison of borders between IR and LSC regions among species of Melanthiaceae and other families. The simplified phylogenetic tree was based on Kim et al. (2016a). The dotted line indicates the junction of IR-LSC regions. The (Ψ) represents pseudogenes. Par, Parideae; Xer, Xerophylleae; Hel, Heloniadeae; Chi, Chionographideae; Mel, Melanthieae; Lil, Liliaceae; Smi, Smilacaceae; Col, Colchicaceae. The question mark (?) means missing data.

Accumulation of repeat sequences in tribe parideae

Further investigation of repeat sequences showed that the IGS between rpl23 and ycf2 containing trnI_CAU was extremely variable in length, ranging from 299 to 818 bp among Paris and Trillium, while a more stable length was detected in other species (Table 2). In Paris, the trnI-ycf2 IGS was ranged from 67 to 491 bp. As is the case in Paris, Trillium has varying lengths of the trnI–ycf2 IGS (from 68 to 422 bp). Notably, nearly all species in other Liliales families have equal lengths of the trnI-ycf2 IGS (68 bp). The length variation in the rpl23-ycf2 IGS of tribe Parideae can be attributed to the presence of tandem repeat sequences (Table 2, Supplementary Data S1). The results of repeat analysis in rpl23-ycf2 IGS regions among Liliales species showed that accumulation of repeat occurred only in tribes Parideae and Melanthieae of Melanthiaceae. Additionally, the number of repeat units was different among examined species (Table 2). In Paris species, this region contained a ranging copy number from 2 to 16 whereas the number of copy varied from 2 to 20 in Trillium taxa (Table 2, Supplementary Data S1). Also, the length of these repeats was different in both genera, ranging from 24 to 155 bp in Paris and from 18 to 209 bp in Trillium (Supplementary Data S1). Additionally, we observed upstream and downstream of repeats because of the important role of initial repeats in SSM mechanism. As a result, we found two groups of small conserved repeated sequences in most of the surveyed taxa (Figure 5A, Supplementary Data S1). In the first group, there were two 7 bp—direct repeats which were located upstream and within the coding sequence of trnI_CAU. In the rest group, there was a cluster of direct repeats including R1 (5′-CAAATTCCAAT-3′), R1a (5′-CCAATTCCAAT-3′), and R1b (5′-ATTCCA-3′).
Figure 5

Conserved repeated sequences in The representative sequence of rpl23-ycf2 IGS extracted from Campynema lineare (Accession number NC026785) and positions and two groups of conserved repeats in Liliales. Blue shaded and squared yellow shaded sequences represent two groups of repeats. The letters of R1, R1a, and R1b above sequence indicate variety of repeats. The bold letters indicate the coding region of trnI_CAU in rpl23-ycf2 IGS sequence. (B) Hypothetical scenarios for formation of repeats. The blue and yellow bars represent repeat units. Black bars indicate new generated repeats. HIFR stands for homology facilitated illegitimate recombination. Asterisks showed the trnI_CAU sequence which can be included in repeat units.

Conserved repeated sequences in The representative sequence of rpl23-ycf2 IGS extracted from Campynema lineare (Accession number NC026785) and positions and two groups of conserved repeats in Liliales. Blue shaded and squared yellow shaded sequences represent two groups of repeats. The letters of R1, R1a, and R1b above sequence indicate variety of repeats. The bold letters indicate the coding region of trnI_CAU in rpl23-ycf2 IGS sequence. (B) Hypothetical scenarios for formation of repeats. The blue and yellow bars represent repeat units. Black bars indicate new generated repeats. HIFR stands for homology facilitated illegitimate recombination. Asterisks showed the trnI_CAU sequence which can be included in repeat units.

Discussion

Comparative characteristics of cpDNA among melanthiaceae species and its implication

The cpDNA structures of representative species of Melanthiaceae consist of typical double-stranded DNA molecules and are highly conserved, as reported in previous angiosperm cpDNA studies (Palmer, 1991; Yang et al., 2010; Liu et al., 2012; Huang et al., 2013, 2014, 2016; Kim and Kim, 2013; Luo et al., 2014; Nguyen et al., 2015). In this study, length variations were identified among the Melanthiaceae taxa (Table 1). The longer sequences of cpDNA were found in Paris, Trillium, Xerophyllum, and Heloniopsis species which possessed either repeat units in rpl23-ycf2 regions or expansion of IR/LSC border. Although, C. japonica has the expansion of IR/LSC junction to rps3 (83 bp), the loss of rps16 caused a shorten length of its cpDNA. Therefore, it is suggested that the length variations within Melanthiaceae could have been led by the deletion and duplication of genes, as well as the expansion of IR regions. The comparative analysis among five families of Liliales revealed a notable variety of length and different losses of genes in cpDNA (Table 1). However, further studies, which cover all 10 families, should be conducted to investigate the overall trends of genomic events in Liliales. The rps16 gene, encoding ribosomal protein S16, is commonly detected in the plant chloroplast genomes. However, the loss of this gene was also recorded in different taxa including Connarus, Epifagus, Pinus, Viola, Fagus, legume species, and etc. (Downie and Palmer, 1992; Doyle et al., 1995). For understanding this loss, it was proposed that the rps16 gene was transferred to the nucleus and its protein product was able to target both chloroplast and mitochondria in the case of Medicago truncatula and Populus alba (Ueda et al., 2008). Additionally, deletion of rps16 was found in a moss species of Physcomitrella patens subsp. patens (Sugiura et al., 2003). These results suggest that the transfer event of rps16 occurred independently at the early divergence of plants. It was lost in C. japonica and partially or completely deleted in V. patulum, S. densum, and T. micranthus among the Melanthiaceae; therefore, ribosomal protein S16 was predicted to be untranscribed and untranslated from cpDNA. However, this deficiency could be compensated from nuclear rps16 products as described in a previous study (Ueda et al., 2008). In contrast to the deletion of exon 2 and complete loss of rps16 in Chionographis and Veratrum, the deletion of exon 1 and remains of a piece of exon 2 were recorded in Schoenocaulon and Toxicoscordion, respectively. Additionally, 22 out of 26 species of Schoenocaulon are endemic from the Southern of United States of America to Peru (Zomlefer and Judd, 2008). Therefore, this genomic feature might contribute to investigating the evolution of cpDNA in this genus. Further studies which cover all species of Schoenocaulon and Toxicoscordion should be conducted to clarify the overview of this feature in the tribe Melanthieae. Furthermore, two types of rps16 deletions were found in two sections of Veratrum which were distinguished by characteristics of leaf, style, and sheath of stem base (Chen and Takahashi, 2000; Zomlefer et al., 2003; Figure 3). Previously, genomic events in chloroplast genome sequences were specifically detected in some species and could be molecular markers. For example, the inversion of the trnV_UAC-atpB region was only detected in species of Trillium subgenus Phyllantherum of Melanthiaceae and the loss of ycf15 was observed in tribe Colchiceae of Colchicaceae (Nguyen et al., 2015; Kim et al., 2016b). In this study, based on the finding of partial or complete loss of rps16, we provide the first potential molecular maker for recognizing two sections among Veratrum (Figure 3). From these results, it is likely that genomic events in chloroplast genomes are effective for making molecular markers and reflect the phylogeny among Liliales taxa. In general, IR expansion affects length variation in cpDNA. For example, the expansion of the IR region (36,501 bp) into psbB in Mahonia bealei cpDNA led to an increased total genome length (164,792 bp; Ma et al., 2013). In Melanthiaceae, the IR/LSC junctions were also variable (Figure 4). This variability affected the total length of the cpDNA region. For instance, the IR/LSC junction expansion from trnH_GUG into rps3 resulted in an increased length of the IR region from 26,360 bp in V. patulum to 28,373 bp in P. verticillata (Table 1). Wang et al. (2008) suggested that the IR/LSC junctions in the Liliales taxa contained the trnH_GUGpsbA cluster, but variable patterns of junction existed in the order Liliales (Figure 4). Within the same family of monocots, IR/LSC junctions contained similarities; for example, the boundaries located in the rps19 and rpl22 genes of the Arecaceae and Orchidaceae, respectively (Huang et al., 2013; Luo et al., 2014). A similar trend was observed in dicots species of the Araliaceae, in which a common IR/LSC boundary was detected in the rps19 gene (Li et al., 2013). In contrast, the border of IR/LSC varied among Melanthiaceae in which the IR region was expanded from trnH_GUG into a part of rps3 (Figure 4). Significantly, there were three different borders in tribe Melanthieae (Figure 4). The unique expansion into 161 bp of rps3 might be a potentially molecular marker for monotypic species—Z. glaberrimus (Figure 4).

Hypothetical scenarios for dynamic SSM events in cpDNA

Previously, the DSB mechanism induced recombination in Chlomydomonas reinhardtii cpDNA (Dürrenberger et al., 1996). Kwon et al. (2010) reported the DSB repair pathways from both microhomology and no homology in Arabidopsis. Additionally, cpDNA sequences typically contain two inverted repeat regions which can be reversely used as a template for repairing the break of DNA through recombination. Tandem repeats ranging from 6 to 33 bp in IR region was discovered in Oenothera species (Onagraceae, Myrtales) (Blasko et al., 1988; Nimzyk et al., 1993; Sears et al., 1996), and tandem repeats comprising a 29-bp sequence have been found in the rps8-rpl14 IGS of the LSC region of Oenothera (Wolfson et al., 1991). The copy correction of IR regions after imprecise alignment, replication slippage, and recombination have also been proposed as a mechanism for the accumulation of tandem repeats in Oenothrea cpDNA (Blasko et al., 1988; Wolfson et al., 1991; Sears et al., 1996). Recently, Massouh et al. (2016) surveyed and found spontaneous mutants in chloroplast genomes of Oenothera which were mostly caused by the replication slippage events. SSM was believed to be a major factor for DNA evolution (Levinson and Gutman, 1987). Although, results of SSM were previously reported in cpDNA of angiosperms, there have not been records of utilization of different small conserved repeats in the same region of cpDNA for generating newly repeated sequences. In this study, due to the presence of the conserved regions which flanked tandem repeats, we proposed three different patterns for generating the repeated sequences among Parideae taxa (Figure 5A). In the first scenario (I), the R1b sequence was utilized through SSM mechanism to form three tandem repeats of 164 bp in Trillium govanianum which includes the whole trnI_CAU sequence (Supplementary Data S1). Within the second pathway (II), prior to the process of SSM, a point mutation which changed the adenine base to cytosine base to create a perfect direct repeat between R1 and R1a occurred. The present of direct repeat (R1a) induced SSM process which resulted in formation of two repeats in Trillium taxa. Generally, in the SSM mechanism, initial repeats play an important role. Therefore, in the third case (III), we proposed the formation of initial repeats through the double-strand break (DSB) repair mechanism (Figure 5B). Specifically, two repair mechanisms may be involved in this case due to the difference in repeat contents among species. First, in the III-A subcase, unequal recombination occurred downstream of trnI_CAU and induced the formation of initial repeats which were employed in SSM process. Meanwhile, in the second subcase (III-B), repairing mechanism of DSB through homology facilitated illegitimate recombination (HFIR) occurred based on direct repeat sequences of 7 bp (5′-ATGGATG-3′) to create a longer 16 bp- repeat unit (5′-ATGGATGCTTAACAGG-3′) which was assumed to be an initial repeat unit for SSM event. Because of the different initial repeat units, SSM events occurred and resulted in newly distinct types of repeat sequences in both Paris and Trillium species (Figure 5B, Table 2, Supplementary Data S1). Albeit the sequence data supported our hypothetical scenarios, there was not essential evidence of in vivo experiment in this study. However, GuhaMajumdar et al. (2008) previously attempted to trace replication slippage in vivo and successfully confirmed this event from results of deletion and duplication in C. reinhardtii and Escherichia coli. Although, this study employed only one type of short tandem repeat, it fundamentally supported the reliability of three hypothetical scenarios in our study. Further studies, which use more types of small sequence repeats in the same region, should be conducted to provide substantial evidence for our hypothesis. Recently, Kim et al. (2016b) used the number of trnI_CAU to classify the type of duplication events across Parideae. This classification was incongruent with infrageneric circumscription of Paris members, but not for Trillium species. In contrast, in terms of the origin of repeat sequences, there were no relationships between the classification within tribe Parideae and mechanisms of repeat's accumulation. For instance, in Paris, the formation of repeats could be explained by the (III) scenario, except P. incompleta whose repeats have likely arisen from the (I) pathway followed by point mutation events and P. japonica which reflected complex duplication processes (Table 2, Supplementary Data S1). In Trillium, all three pathways can be found. For example, the (I) scenario was recorded only in T. govanianum. Meanwhile, the (II) and (III-A) pathways could be found in subgenus Phylantherum and subgenus Trillium, respectively. Moreover, repeats were not found in rpl23-ycf2 IGS of T. undulatum (Subgenus Trillium), or in T. decumbens and T. cuneatum (subgenus Phylantherum). These findings suggested that SSM events occurred independently across the tribe Parideae. Additionally, the number of <24 bp—repeat units was more abundant than those of over 24 bp in length, suggesting that that SSM occurred more frequently for short repeats than for long repeat sequences in the tribe Parideae (Melanthiaceae, Liliales). In Anticlea elegans and S. densum, two repeats (19 bp) were found (Table 2), suggesting that the accumulation of repeats may also occur in tribe Melanthieae, which is composed of 7 genera and 78 species of Melanthiceae. Although, sequence data provided evidence for different scenarios of SSM and its independence, there was not enough evidence regarding the alternation of initial repeats during the SSM process in Parideae. Notably, small conserved units were found in most of the examined taxa; however, within Liliales, the repeats in the rpl23-ycf2 IGS were mainly present in the tribe Parideae of Melanthiaceae. Therefore, variation within this region may be due to a unique genomic event in Parideae. Previous studies have found diverse genome sizes among Melanthiaceae (Pellicer et al., 2014). In contrast to the trend of reduced genome size in other tribes, Parideae exhibit significant increases in chromosome size and possess the largest nuclear genome in Melanthiaceae. This trend can also been seen in the patterns of repeats between Parideae and other tribes. It is likely that the causes of chromosome changes in Parideae might be related to the accumulation of repeats within this tribe. More studies should be conducted to shed light on the significance of these two unique features of the genomes of Parideae. Additionally, accumulation of repeat sequences was also found in rpl23-ycf2 IGS of monocots such as Acorus calamus (Accession number NC_007407), Sagittaria lichuanensis (Accession number NC_029815), Anomochloa marantoidea (Accession number NC_014062), Eustrephus latifolius (Accession number KM_233639), Curcuma roscoeana (Accession number KF_601574), and Musa acuminata subp malaccensis (Accession number HF677508; Data not shown), suggesting that the rpl23-ycf2 IGS may be one of the “hot spots” for genomic events in angiosperm species.

Conclusions

In conclusion, comparative analysis of cpDNA in Melanthiaceae revealed that genomic events including pseudogenization, duplication, and deletion in the chloroplast genome are precise sources for mining molecular marker in plants. Specifically, gene loss events of rps16 were potentially valuable molecular data for identifying two sections of the Veratrum species. Melanthiaceae also exhibits a significant change in junctions between LSC and IR regions. Additionally, we provided the first evidence of different employments of small repeat sequences for SSM in chloroplast genomes of monocots species. Though the origin of these differences remains unclear, these data highlight the dynamic molecular evolution in chloroplast genomes. With the increasing number of complete organelle genomes, these patterns could be detected in other species and be useful references for tracing genomic evolution among plants.

Author contributions

HDKD carried out the genomic experiment and drafted the manuscript. HDKD and J-HK participated in the design of the study and revised the manuscript. All authors read and approved the final manuscript.

Funding

This work was supported by the National Research Foundation of Korea (NRF) Grant Fund (MEST 2010-0029131) and Scientific Research of Korea National Arboretum (KNA) Grant Fund (KNA 1-2-13, 14-2).

Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
  36 in total

1.  Creation of a chloroplast microsatellite reporter for detection of replication slippage in Chlamydomonas reinhardtii.

Authors:  Monica GuhaMajumdar; Ethan Dawson-Baglien; Barbara B Sears
Journal:  Eukaryot Cell       Date:  2008-02-08

2.  The complete chloroplast genome of colchicine plants (Colchicum autumnale L. and Gloriosa superba L.) and its application for identifying the genus.

Authors:  Pham Anh Thi Nguyen; Jung Sung Kim; Joo-Hwan Kim
Journal:  Planta       Date:  2015-04-23       Impact factor: 4.116

3.  The complete chloroplast genome sequence of Mahonia bealei (Berberidaceae) reveals a significant expansion of the inverted repeat and phylogenetic relationship with other angiosperms.

Authors:  Ji Ma; Bingxian Yang; Wei Zhu; Lianli Sun; Jingkui Tian; Xumin Wang
Journal:  Gene       Date:  2013-07-27       Impact factor: 3.688

4.  Gene relocations within chloroplast genomes of Jasminum and Menodora (Oleaceae) are due to multiple, overlapping inversions.

Authors:  Hae-Lim Lee; Robert K Jansen; Timothy W Chumley; Ki-Joong Kim
Journal:  Mol Biol Evol       Date:  2007-02-28       Impact factor: 16.240

5.  Substitution of the gene for chloroplast RPS16 was assisted by generation of a dual targeting signal.

Authors:  Minoru Ueda; Tomotaro Nishikawa; Masaru Fujimoto; Hideki Takanashi; Shin-Ichi Arimura; Nobuhiro Tsutsumi; Koh-Ichi Kadowaki
Journal:  Mol Biol Evol       Date:  2008-05-02       Impact factor: 16.240

6.  The complete chloroplast genome sequence of date palm (Phoenix dactylifera L.).

Authors:  Meng Yang; Xiaowei Zhang; Guiming Liu; Yuxin Yin; Kaifu Chen; Quanzheng Yun; Duojun Zhao; Ibrahim S Al-Mssallem; Jun Yu
Journal:  PLoS One       Date:  2010-09-15       Impact factor: 3.240

7.  Comparative genomics of four Liliales families inferred from the complete chloroplast genome sequence of Veratrum patulum O. Loes. (Melanthiaceae).

Authors:  Hoang Dang Khoa Do; Jung Sung Kim; Joo-Hwan Kim
Journal:  Gene       Date:  2013-08-23       Impact factor: 3.688

8.  Complete sequencing of five araliaceae chloroplast genomes and the phylogenetic implications.

Authors:  Rong Li; Peng-Fei Ma; Jun Wen; Ting-Shuang Yi
Journal:  PLoS One       Date:  2013-10-18       Impact factor: 3.240

9.  OrganellarGenomeDRAW--a suite of tools for generating physical maps of plastid and mitochondrial genomes and visualizing expression data sets.

Authors:  Marc Lohse; Oliver Drechsel; Sabine Kahlau; Ralph Bock
Journal:  Nucleic Acids Res       Date:  2013-04-22       Impact factor: 16.971

10.  From algae to angiosperms-inferring the phylogeny of green plants (Viridiplantae) from 360 plastid genomes.

Authors:  Brad R Ruhfel; Matthew A Gitzendanner; Pamela S Soltis; Douglas E Soltis; J Gordon Burleigh
Journal:  BMC Evol Biol       Date:  2014-02-17       Impact factor: 3.260

View more
  5 in total

1.  The newly developed single nucleotide polymorphism (SNP) markers for a potentially medicinal plant, Crepidiastrum denticulatum (Asteraceae), inferred from complete chloroplast genome data.

Authors:  Hoang Dang Khoa Do; Joonhyung Jung; JongYoung Hyun; Seok Jeong Yoon; Chaejin Lim; Keedon Park; Joo-Hwan Kim
Journal:  Mol Biol Rep       Date:  2019-04-12       Impact factor: 2.316

2.  The implication of plastid transcriptome analysis in petaloid monocotyledons: A case study of Lilium lancifolium (Liliaceae, Liliales).

Authors:  Hoang Dang Khoa Do; Joo-Hwan Kim
Journal:  Sci Rep       Date:  2019-04-30       Impact factor: 4.379

3.  Comparative analysis and implications of the chloroplast genomes of three thistles (Carduus L., Asteraceae).

Authors:  Joonhyung Jung; Hoang Dang Khoa Do; JongYoung Hyun; Changkyun Kim; Joo-Hwan Kim
Journal:  PeerJ       Date:  2021-01-14       Impact factor: 2.984

4.  Phylogenomic Analyses of Hepatica Species and Comparative Analyses Within Tribe Anemoneae (Ranunculaceae).

Authors:  Kyu Tae Park; SeonJoo Park
Journal:  Front Plant Sci       Date:  2021-06-04       Impact factor: 5.753

5.  Comparative Chloroplast Genomics of Fritillaria (Liliaceae), Inferences for Phylogenetic Relationships between Fritillaria and Lilium and Plastome Evolution.

Authors:  Jiao Huang; Yan Yu; Yan-Mei Liu; Deng-Feng Xie; Xing-Jin He; Song-Dong Zhou
Journal:  Plants (Basel)       Date:  2020-01-21
  5 in total

北京卡尤迪生物科技股份有限公司 © 2022-2023.