Literature DB >> 26933149

Insight into infrageneric circumscription through complete chloroplast genome sequences of two Trillium species.

Sang-Chul Kim1, Jung Sung Kim1, Joo-Hwan Kim2.   

Abstract

Genomic events including gene loss, duplication, pseudogenization and rearrangement in plant genomes are valuable sources for exploring and understanding the process of evolution in angiosperms. The family Melanthiaceae is distributed in temperate regions of the Northern Hemisphere and divided into five tribes (Heloniadeae, Chionographideae, Xerophylleae, Melanthieae and Parideae) based on the molecular phylogenetic analyses. At present, complete chloroplast genomes of the Melanthiaceae have been reported from three species. In the previous genomic study of Liliales, atrnI-CAU gene duplication event was reported fromParis verticillata, a member of Parideae. To clarify the significant genomic events of the tribe Parideae, we analysed the complete chloroplast genome sequences of twoTrilliumspecies representing two subgenera:TrilliumandPhyllantherum InTrillium tschonoskii(subgenusTrillium), the circular double-stranded cpDNA sequence of 156 852 bp consists of two inverted repeat (IR) regions of 26 501 bp each, a large single-copy (LSC) region of 83 981 bp and a small single-copy (SSC) region of 19 869 bp. The chloroplast genome sequence ofT. maculatum(subgenusPhyllantherum) is 157 359 bp in length, consisting of two IRs (25 535 bp), one SSC (19 949 bp) and one LSC (86 340 bp), and is longer than that ofT. tschonoskii The results showed that the cpDNAs of Parideae are highly conserved across genome structure, gene order and contents. However, the chloroplast genome ofT. maculatumcontained a 3.4-kb inverted sequence betweenndhCandrbcLin the LSC region, and it was a unique feature for subgeneraPhyllantherum In addition, we found three different types of gene duplication in the intergenic spacer betweenrpl23andycf2containingtrnI-CAU, which were in agreement with the circumscription of subgenera and sections in Parideae excludingT. govanianum These genomic features provide informative molecular markers for identifying the infrageneric taxa ofTrilliumand improve our understanding of the evolution patterns of Parideae in Melanthiaceae. Published by Oxford University Press on behalf of the Annals of Botany Company.

Entities:  

Keywords:  Chloroplast genome; Trillium maculatum; Trillium tschonoskii, trnI-CAU; comparative genomics; gene duplication; single inversion

Year:  2016        PMID: 26933149      PMCID: PMC4823371          DOI: 10.1093/aobpla/plw015

Source DB:  PubMed          Journal:  AoB Plants            Impact factor:   3.276


Introduction

The chloroplast that characterizes all green plants (Viridiplantae) originated from an endosymbiotic event between independent living cyanobacteria and a non-photosynthetic host (Dyall ). Chloroplast genomes of flowering plants are typically circular double-stranded DNA molecules, and usually contain two inverted repeat (IR) regions (IRA and IRB) separated by a large single-copy and a small single-copy (LSC and SSC, respectively) regions (Ravi ). The plastid genome is mostly stable in structure, gene content and gene order across land plant lineages (Jansen ). Due to this stability, it demonstrated great utility for developing phylogenetic hypotheses across the plant tree of life (Jansen ; Zhang ; Li ). Within seed plants, plastid genomes usually contain 101–118 unique genes with the majority of those 66–82 coding for proteins involved in photosynthesis and gene expression, 29–32 of these genes code for transfer RNAs and 4 code for the ribosomal RNA genes (Jansen and Ruhlman 2012). The advance of next-generation sequencing has facilitated rapid growth of complete chloroplast genomes due to time-saving and low-cost advantages (Shendure and Ji 2008). To date, ∼500 complete chloroplast DNA genome sequence data have been released in GenBank's Organelle Genome Resources (http://www.ncbi.nlm.nih.gov/genome). Melanthiaceae, a member of Liliales, comprises 17 genera and ∼178 species of perennial herbs that are mostly distributed in the temperate regions of the Northern Hemisphere (Zomlefer ). Species of this family are characterized by their extrorse anthers and carpels bearing three distinct styles (Rudall ). The family has been divided into five tribes: Heloniadeae, Chionographideae, Xerophylleae, Melanthieae and Parideae (The Angiosperm Phylogeny Group 2009). Prior to any molecular systematic analyses, Melanthiaceae were divided into several taxonomically independent families by Takhtajan (1997) due to their unique autapomorphies. Trilliaceae, which is now recognized as tribe Parideae (Trillieae), are unique in having solitary flowers, berries, membranous nectary and large chromosomes with five chromosomes as the base number. The phylogeny of species within the Trilliaceae (now Parideae) was highly debated by many researchers using molecular and morphological data (Kato ; Osaloo ; Osaloo and Kawano 1999; Farmer and Schilling 2002; Farmer 2006). Tribe Parideae includes three genera: Paris, Trillium and Pseudotrillium. Paris has 4–15 leaves in a whorl, flowers 4-merous or more and inner perianth segments that are much narrower than outer ones, while Trillium has only 3 leaves in a whorl, flowers 3-merous and inner perianth segments that are a little narrower than the outer ones. Pseudotrillium has thick, tough, heart-shaped leaves, spotted petals and flower stalks that extend until the ripe fruit touches the ground. Trillium has been divided into two subgenera differing in the presence of pedicel: subgenus Trillium (with pedicels) and Phyllantherum (without pedicels) (Freeman 1969, 1975). The monophyly of subgenus Phyllantherum was strongly supported in many previous studies (Osaloo ; Osaloo and Kawano 1999; Farmer and Schilling 2002; Farmer 2006). On the other hand, subgenus Trillium is rendered a paraphyletic group by the inclusion of Phyllantherum. Currently, complete chloroplast genomes of the Melanthiaceae have been reported from Paris verticillata (KJ433485; Do ), Veratrum patulum (KF437397; Do ) and Chionographis japonica (KF951065; Bodin ), which represent three tribes of Parideae, Melanthieae and Chionographideae, respectively. In this study, we analysed complete chloroplast genome sequences of subgenera Trillium and Phyllanthrum of Trillium to better understand the evolution of the chloroplast genomes in tribe Parideae and across the Melanthiaceae. We analysed the sequence variation between two subgenera and proposed novel molecular markers for phylogenetic studies by comparing the two newly generated genome sequences. In addition, we characterized the trnI-CAU duplication event in Parideae, detected in P. verticillata chloroplast genome (KJ433485), to determine the origin of the repeating unit. Consequently, these results provide additional knowledge about the patterns of the chloroplast genome evolution within tribe Parideae.

Methods

DNA extraction, sequencing and annotation

We collected Trillium tschonoskii from Ulleung Island, South Korea. The voucher specimen and plant materials were deposited at the herbarium (GCU) and Medicinal Plant Resources Bank (MPRB) of Gachon University. Trillium maculatum was obtained from the Abraham Baldwin Agricultural College, USA (voucher No. Susan Farmer 19990006). We used silica gel-dried leaves from each species to extract total genomic DNA using the DNeasy Plant Mini Kit (Qiagen, Seoul, South Korea). The Hiseq 2000 system was employed to sequence chloroplast genomes of T. tschonoskii and T. maculatum. Raw data were assembled using Geneious ver. 7. 1 (Biomatters Ltd, New Zealand) with default settings. After trimming the sequences, we mapped pair-end reads to the reference sequence of P. verticillata (KJ433485). Aligned contigs were ordered according to the reference genome and the gaps were filled via direct sequencing of polymerase chain reaction (PCR) products with newly designed primers. In addition, the ambiguous sequences including low assembly coverage regions and the borders of the four junctions between LSC, SSC and IR regions were confirmed using the Sanger method. Complete chloroplast genomes of both species were annotated by Geneious ver. 7. 1 (Biomatters Ltd), with manual corrections for putative start and stop codons. The exon positions of protein-coding genes and intron were determined using released Liliales chloroplast genome sequences as references. All tRNA sequences were confirmed utilizing the web-based online tool of tRNAScan-SE (Schattner ) with default settings to corroborate tRNA boundaries identified by Geneious. The genome maps were generated using OGDraw (OrganellarGenomeDRAW; Lohse ) followed by manual modification.

Comparison of the chloroplast genome sequences of two subgenera

The simple sequence repeats (SSRs) were analysed using Phobos Version 3.3.12 (Kraemer ), with thresholds of eight repeat units for mononucleotide SSRs, four repeat units for dinucleotide, trinucleotide SSRs and three repeat units for tetranucleotide, pentanucleotide and hexanucleotide SSRs. All the detected repeats were manually verified, and the redundant results were removed. We aligned the plastid genome sequences of two Trillium using MAFFT (Katoh ). The identified insertion/deletion mutations (indels) from the results were confirmed by reassembling the whole reads generated by HiSeq 2000. The single nucleotide polymorphisms (SNPs) were analysed using Geneious 7.1 (Kearse ), and each indel and SNP were separated based on the position excluding one of IR regions. Since we are comparing only two genomes, we quantified the sequence divergence as the ratio of aligned nucleotide sites within specifically different regions (p-distance). Sanger sequences and assembled genomes were calculated using mean p-distance in MEGA 6.0 (Tamura ). Twenty-nine species, representing the two subgenera of Trillium in Parideae, were selected for comparative sequencing of inversion. The PCR amplification primers were designed based on the sequence comparisons among three chloroplast genome sequences of two Trillium species (in this study), and P. verticillata (KJ433485). Presence and absence PCR amplifications were carried out using various combinations of the three primers (I1F: 5′-CCC TAG GTT TTT TTC TTC AAG-3′, I1R: 5′-TTA TGT AGC TTA TCC TTT AGA CC-3′ and I2R: 5′-AGA AGG TCT ACG GTT CGA G-3′).

trnI-CAU duplication pattern in the tribe Parideae

To clarify the trnI-CAU duplication pattern in the tribe Parideae, we designed two primers (Primer 1: 5′-GAA GAG TTC GAC CCA ATG CT-3′, Primer 2: 5′-TTA TGA AAC TCT TTG ACC CC-3′) for amplifying the intergenic spacer (IGS) region of rpl23-ycf2 based on the identical sequence among the three species (P. verticillata, T. maculatum and T. tschonoskii). The PCR condition for IGS region of rpl23-ycf2 was at initial denaturation at 94 °C for 5 min, followed by 30 cycles of denaturation at 94 °C for 1 min, annealing at 50 °C for 1 min and extension at 72 °C for 2 min, with a final extension at 72 °C for 5 min. We obtained variously sized PCR products ranging from 500 to 1200 bp, and compared the sequences of this region from 33 species covering the infrageneric classification of the tribe. Sequence editing and assembly were performed using Sequencher (ver. 5.1). The sequence alignment was initially performed using MAFFT (Katoh ) and was adjusted manually.

Results

Comparison of the complete chloroplast genomes of subgenera Trillium and Phyllantherum

We sequenced the complete chloroplast genome sequence of two Trillium species, T. tschonoskii (subgenera Trillium; GenBank accession number KR780076) and T. maculatum (subgenera Phyllantherum; GenBank accession number KR780075) (Fig. 1). In total, 4 292 702 (T. tschonoskii) and 18 348 134 (T. maculatum) paired-end reads were generated. Out of those, 60 805 and 246 240 reads were identified as the chloroplast genome sequences for T. tschonoskii and T. maculatum, respectively. The chloroplast genome of T. tschonoskii was composed of 156 852 bp in length (AT content 62.5 %), and it comprised a LSC region (83 981 bp), a SSC region (19 869 bp) and two IR regions (26 501 bp), while T. maculatum was 157 359 bp in length (AT content 62.5 %, 86 340 bp of LSC, 19 949 bp of SSC and 25 535 bp of IRs).
Figure 1.

Gene maps and summary of the T. tschonoskii Maxim. and T. maculatum Raf. chloroplast genomes. IR, inverted repeat; LCS, large single-copy region; SSC, small single-copy region.

Gene maps and summary of the T. tschonoskii Maxim. and T. maculatum Raf. chloroplast genomes. IR, inverted repeat; LCS, large single-copy region; SSC, small single-copy region. The gene content and order were slightly different between both species because of the rpl22 position in the IR-LSC boundary and trnI-CAU duplication in IR. While the rpl22 gene remained in the LSC region of the T. maculatum plastid genome, this gene was present in the IR region of T. tschonoskii plastid genome (Fig. 2). In total, 116 genes of T. maculatum were identified and consisted of 78 coding genes, 4 rRNA genes, 31 tRNA genes and 3 pseudogenes, while those of T. tschonoskii were 115 genes without tRNA gene duplication [see . In addition, T. tschonoskii has 7 coding genes, 4 rRNA genes, 9 tRNA genes, 2 pseudogenes, whereas T. maculatum has 8 coding genes, 4 rRNA genes, 8 tRNA genes, 2 pseudogenes, duplicated in the IR region, making a total of 138 genes and 137 genes presented in the T. tschonoskii and T. maculatum chloroplast genome, respectively. Among these genes, 22 intron-containing genes were found including 15 protein-coding genes and 7 tRNA genes. Among them, ycf3 and clpP gene contained two introns. The trnK-UUU has the largest intron (T. tschonoskii: 2614 bp, T. maculatum: 2640 bp) including the matK gene. Ycf15 and ycf68 in the IR region were pseudogenized because of the presence of several internal stop codons. Furthermore, the cemA gene located in the LSC of both genomes was also pseudogenized.
Figure 2.

Comparison of the IR boundaries among five species within Melanthiaceae.

Comparison of the IR boundaries among five species within Melanthiaceae.

Characterization of single inversion in subgenus Phyllantherum

Based on comparison of T. maculatum, T. tschonoskii and P. verticillata, a single inversion of 3.4 kb is characterized in the chloroplast genome of T. maculatum. This inversion is located between the ndhC and rbcL genes. We designed three different primers including I1F (5′-CCC TAG GTT TTT TTC TTC AAG-3′), I1R (5′-TTA TGT AGC TTA TCC TTT AGA CC-3′) and I2R (5′-AGA AGG TCT ACG GTT CGA G-3′) to confirm and clarify the distribution of this inversion throughout the genus Trillium. Specifically, the primer pairs of I1F and I1R worked only in the normal type, while I2R and I1R primer pairs were utilized for the recognized inversion type among examined species. The results showed that the inversion occurred in all examined species of the subgenera Phyllantherum (Fig. 3A and B).
Figure 3.

Confirmation of inversion (3492 bp) between ndhC and rbcL in the genus Trillium. (A) Design of primer to amplify junction regions between atpB and rbcL regions. The positions of atpB and rbcL genes in LSC regions are drawn based on the sequence assembly results of T. tschonoskii, T. maculatum in this study (red text). *The data downloaded from the NCBI. The forward primer I1F contains the sequence in atpB region. The sequence of the reverse primer (I1R) is located in the rbcL gene. Polymerase chain reaction amplification of IGS between atpB and rbcL. Relationships of Parideae lineages followed the phylogenetic trees of S. C. Kim, J. S. Kim, W. C. Mark, F. F. Michael and J. H. Kim (unpublished data). (B) Primers were designed to amplify junction regions between trnV-UAC and rbcL regions. The positions of trnV-UAC and rbcL genes in LSC regions are drawn based on the sequence assembly results of T. tschonoskii, T. maculatum in this study (red text). *The data downloaded from the NCBI. The forward primer I2R contains the sequence in trnV-UAC region. The sequence of the reverse primer I1R is located in the rbcL gene. Polymerase chain reaction amplification of IGS between trnV-UAC and rbcL. Relationships of Parideae lineages followed the phylogenetic trees of S. C. Kim, J. S. Kim, W. C. Mark, F. F. Michael and J. H. Kim (unpublished data).

Confirmation of inversion (3492 bp) between ndhC and rbcL in the genus Trillium. (A) Design of primer to amplify junction regions between atpB and rbcL regions. The positions of atpB and rbcL genes in LSC regions are drawn based on the sequence assembly results of T. tschonoskii, T. maculatum in this study (red text). *The data downloaded from the NCBI. The forward primer I1F contains the sequence in atpB region. The sequence of the reverse primer (I1R) is located in the rbcL gene. Polymerase chain reaction amplification of IGS between atpB and rbcL. Relationships of Parideae lineages followed the phylogenetic trees of S. C. Kim, J. S. Kim, W. C. Mark, F. F. Michael and J. H. Kim (unpublished data). (B) Primers were designed to amplify junction regions between trnV-UAC and rbcL regions. The positions of trnV-UAC and rbcL genes in LSC regions are drawn based on the sequence assembly results of T. tschonoskii, T. maculatum in this study (red text). *The data downloaded from the NCBI. The forward primer I2R contains the sequence in trnV-UAC region. The sequence of the reverse primer I1R is located in the rbcL gene. Polymerase chain reaction amplification of IGS between trnV-UAC and rbcL. Relationships of Parideae lineages followed the phylogenetic trees of S. C. Kim, J. S. Kim, W. C. Mark, F. F. Michael and J. H. Kim (unpublished data).

Indels, SNPs and SSR between two subgenera of Trillium

A total of 402 indels were detected between T. maculatum and T. tschonoskii, and most indels were located in the IGS regions (78.2 %). 66.2 % of the total number of indels were found in the LSC, while 22.1 and 11.7 % were present in the SSC and IR regions, respectively [Table 1, see ]. The average length of indels was 74.8 bp, and the largest indel was located in ycf1 and ycf2. The frequency of 1 bp indels was 10.6 %, while 79.3 % of all indels were over 20 bp in length. In rRNA sequences, one indel of 3 bp and four indels of 5 bp were found in 16S rRNA and 23S rRNA. In addition, indel events were identified in 20 coding genes of both species (accD, atpB, ccsA, cemA, clpP, infA, matK, ndhF, rpl2, rpl20, rpl22, rpl32, rpoC1, rpoC2, rps11, rps15, rps18, rps19, ycf1 and ycf2).
Table 1.

The number and total length of insertion–deletion mutations between the chloroplast genomes of T. tschonoskii and T. maculatum in Parideae.

RegionNumber of indelsTotal length of indels
IGS2625139
Intron51528
Coding gene863005
The number and total length of insertion–deletion mutations between the chloroplast genomes of T. tschonoskii and T. maculatum in Parideae. A total of 2861 SNPs were detected between T. maculatum and T. tschonoskii (Table 2), and 1620 SNPs were transversions. In total, 1707 (59.7 %) SNPs were located in the coding regions, and 1154 (40.3 %) were within IGS regions or within introns.
Table 2.

Single nucleotide polymorphisms found between the plastid genomes of T. tschonoskii and T. maculatum. (A) Single nucleotide polymorphisms in coding gene. (B) Single nucleotide polymorphisms in intron. (C) Single nucleotide polymorphisms in IGS regions. Bold values represent p-distance >0.08.

GeneAligned length (bp)No. SNPp-DistanceGeneAligned length (bp)No. SNPp-DistanceGeneAligned length (bp)No. SNPp-Distance
(A)
psbD106220.002ycf455540.007rpl2828140.017
psaB220550.002cemA69550.007rrn5S12120.017
psaA225340.002rrn23S2814190.007rpl3611420.018
atpE40510.002ndhD1503100.007rrn4.5S10320.019
psbB152730.002ndhE30620.007rrn16S1494300.02
petB64810.002ndhI54340.007rpl1436990.024
psbA106230.003rps1527630.007ycf272091460.024
ndhK76820.003rpoB3213250.008rps7468110.024
ndhC36310.003psbF12010.008rpl2038790.025
rbcL143450.003rps839930.008rps1836380.026
petA96330.003psbN13210.008rps3657180.027
ndhB153350.003ycf1523420.009trnT-UGU7320.027
ccsA96930.003pebT10810.009trnE-UUC7320.027
rps1625210.004rps1430330.01trnC-GCA7120.028
atpA152460.004rps460660.01trnQ-UUG7220.028
atpH24610.004rpoA1023100.01rps12372110.03
psbC142260.004infA24320.01rpl3320170.035
psbE25210.004rpoC12097220.011rpl22387130.035
petD48320.004trnS-GCU8810.011rpl3215660.04
atpB152170.005rpl2328230.011trnP-UGG7430.041
ycf6837620.005ndhF2232260.012trnI-CAU7430.041
ndhG53130.006matK1554220.014rps11405190.048
ndhA108370.006trnH-GUG7410.014rps19351240.084
ndhH118270.006trnW-CCA7410.014clpP639650.111
atpF55540.007rpl1641160.015ycf167786640.121
rpoC24140270.007rps2711120.017accD15663230.23
GeneAligned length (bp)No. SNPp-DistanceGeneAligned length (bp)No. SNPp-Distance
(B)
atpF intron812140.018rpoC1 intron71490.013
clpP intron 1709120.034rps16 intron783110.015
clpP intron 2983290.012trnI-GAU intron93620.002
ndhA intron107760.006trnK-UUU intron1109340.012
ndhB intron69510.001trnL-UAA intron53820.004
petB intron82340.005trnV-UAC59540.007
petD intron74750.007ycf3 intron173770.006
rpl16 intron1075260.026ycf3 intron273880.01
rpl2 intron66430.005
IGSAligned length (bp)No. SNPp-DistanceIGSAligned length (bp)No. SNPp-Distance
(C)
trnL-CAA_ndhB57810.002rrn23S_rrn4.5S10220.02
ndhB_rps732310.003rpl32_trnL-UAG938160.02
atpI_rps224210.004ndhH_rps1511020.02
rps12_trnV-GAC190580.004rpoB_trnC-GCA859160.022
psaI_ycf437620.005rps19_trnH-GUG14730.022
petB_petD20510.005atpA_atpF9220.023
psbE_petL95260.006trnG-GCC_trnfM-CAU13230.023
psbB_psbT16810.006petA_psbJ1139260.023
ycf4_cemA78550.007petG_trnW-CCA13830.023
rrn16S_trnI-GAU29620.007rps8_rpl1417940.023
trnA-UGC_rrn23S14410.007ndhE_ndhG30140.023
trnR-ACG_trnN-GUU57240.007rps15_ycf142990.023
rps14_psaB13210.008trnK-UUU_rps16783230.024
psbN_psbH12410.008rpl20_clpP1208250.024
infA_rps829620.008trnL-UAG_ccsA8220.025
trnC-GCA_petN83180.01ndhD_psaC11930.025
trnD-GUC_trnY-GUA41240.01cemA_petA24260.026
trnS-GGA_rps430730.01trnQ-UUG_psbK36090.027
ycf15_trnL-CAA67420.01trnG-UCC_trnR-UCU15040.027
psbM_trnD-GUC1031110.011trnT-UGU_trnL-UAA739210.029
trnT-GGU_psbD1016120.011rpoA_rps116820.029
ndhJ_ndhK8910.011atpH_atpI655190.031
psaJ_rpl3348230.011rps16_trnQ-UUG1204370.037
petD_rpoA17920.011trnW-CCA_trnP-UGG16760.037
ndhG_ndhI28330.011ndhF_rpl32778280.038
trnF-GAA_ndhJ68680.012psbK_psbI396150.039
rps2_rpoC224630.013psaC_ndhE380140.04
ycf3_trnS-GGA75980.013psbH_petB13450.042
rps4_trnT-UGU32040.013trnE-UUC_trnT-GGU724260.044
trnL-UAA_trnF-GAA38650.013rpl33_rps1820080.049
atpF_atpH47470.015clpP_psbB507230.049
psbZ_trnG-GCC29640.015rps11_rpl3615170.051
trnM_CAU- atpE20630.015trnS-GCU_trnG-UCC1178570.052
psaA_ycf3642100.016psbI_trnS-GCU12460.054
petL_petG18330.016psbC__trnS-UGA14080.057
rpl36_infA15420.016rpl23_trnI-CAU210110.065
rpl14_rpl1612620.016rpoC1_rpoB3720.077
psbA_trnK-UUU243100.017trnN-GUU_ndhF782310.086
petN_psbM712120.017rpl22_rps1911540.091
trnY-GUA_trnE-UUC5910.017rps3_rpl228050.098
ccsA_ndhD24240.018accD_psaI285200.099
rpoC2_rpoC115430.019psbT_psbN65100.154
trnP-UGG_psaJ38870.019trnH-GUG_rpl24480.186
rpl16_rps316630.019trnI-CAU_ycf2210200.238
rps7_rps125410.019rbcL_accD20902360.291
Single nucleotide polymorphisms found between the plastid genomes of T. tschonoskii and T. maculatum. (A) Single nucleotide polymorphisms in coding gene. (B) Single nucleotide polymorphisms in intron. (C) Single nucleotide polymorphisms in IGS regions. Bold values represent p-distance >0.08. In our result of SNPs, p-distance values in coding regions range from 0.002 to 0.23 and the average value was 0.02. On the other hand, the average p-distance value in non-coding regions was 0.034. Figure 4 shows the average p-distance for five classes of genomic regions: protein-coding genes, tRNAs, rRNAs, IGSs and introns. The IGS divergence is almost double that of the next highest class (genes). Introns hold the lowest sequence divergence, at an average of 0.011%.
Figure 4.

Average p-distance across five classes of genomic regions between two Trillium.

Average p-distance across five classes of genomic regions between two Trillium. We detected SSRs longer than 8 bp in T. maculatum, T. tschonoskii and P. verticillata chloroplast genomes by the method of Qian . According to Qian et al., the threshold was set because 8 bp or longer SSRs are prone to slipstrand mispairing, which is thought to be the primary mutational mechanism causing their high level of polymorphism. In this analysis, the total number of SSRs was 204 in P. verticillata, 205 in T. maculatulatum and 213 in T. tschonoskii (Table 3). The most abundant type of SSR in Parideae was a mononucleotide, with 138 in P. verticillata, 121 in T. maculatulatum and 133 in T. tschonoskii. In addition to mononucleotide SSRs, there are 52 dinucleotide SSRs in P. verticillata, 57 in T. maculatulatum and 53 in T. tschonoskii. Trinucleotide SSRs were less frequent with 6, 14 and 7 in P. verticillata, T. maculatum and T. tschonoskii, respectively. The hexanucleotide SSRs were found only in Trillium species. The majority of mononucleotide repeats were A-T rich (Table 3).
Table 3.

Number of SSRs present in the three Parideae chloroplast genomes.

TaxonParis verticillataTrillium maculatumTrillium tschonoskii
Genome size157 379157 359156 852
No. of SSRs
 A/T133117127
 C/G546
 AC/GT333
 AG/CT171918
 AT/TA323532
 AAG/CTT22
 AAT/ATT6103
 ACT/AGT21
 ATC/GAT1
 AAAG/CTTT1
 AAAT/ATTT344
 AAGG/CCTT111
 AATC/GATT111
 AATG/CATT113
 AGAT/ATCT111
 ACTAT/ATAGT11
 AAAAT/ATTTT11
 AATAT/ATATT21
 AATATG/CATATT1
 AAAATC/GATTTT1
 ATATCC/GGATAT1
 AAAAAT/ATTTTT2
 AAGACT/AGTCTT1
 AACTAC/GTAGTT1
 AAAGAG/CTCTTT1
 Total204205213
Number of SSRs present in the three Parideae chloroplast genomes.

Type of trnI-CAU of Parideae

We compared the sequences of the IGS region between rpl23 and ycf2 using 33 species including Xerophyllum to understand the evolutionary implication of trnI-CAU duplication, which was reported from the Paris chloroplast genome (Do ). Based on the results, we found that this region is of highly variable length among the species, and we distinguished three major types based on the number of copies of trnI-CAU (Fig. 5). Type A was composed of a single trnI-CAU and was found in Xerophyllum, Pseudotrillium rivale and T. undulatum. It was also identified in several Trillium and Paris species, but with variable lengths: in subgenus Trillium species, the sequences ranged from 207 to 445 bp, in which there are two tandem repeats of ‘CAG GTA TTA TCA TAC TGA AA’ (20 bp) and ‘CAT ATT ATC ATA CTG AAA’ (18 bp). Similarly, in subgenus Daiswa of Paris, there were 24 bp random tandem repeats of TAT AAC TTA ACA GGA ATC ATC GTA. Type B contained two copies of trnI-CAU. This type is found in subgenus Phyllantherum of Trillium and section Kinugasa of subgenus Paris. The lengths of tandem repeat sequences were 180 bp (subgenera Phyllantherum) and 155 bp (section Kinugasa of subgenus Paris), which included 74 bp of trnI-CAU. Remarkably, section Kinugasa (Paris japonica) has the longest length of IGS between rpl23 and ycf2 among the tribe Parideae. Type C, possessing three copies of trnI-CAU genes in the sequenced region, was detected in T. govanianum and section Paris of subgenus Paris. They included three fully repeated units including trnI-CAU, and the lengths were 155 and 139 bp, respectively.
Figure 5.

Summary of three types of trnI-CAU gene duplication in the tribe Parideae. *Including tandem repeats.

Summary of three types of trnI-CAU gene duplication in the tribe Parideae. *Including tandem repeats.

Discussion

Comparison of complete plastid genomes of subgenera Trillium and Phyllantherum

The plastid genome structure of the two Trillium species, T. maculatum and T. tschonoskii, have a typical form found in most angiosperms (Zhang ; Kim and Kim 2013; Li ; Qian ). The T. tschonoskii chloroplast genome was 507 bp shorter than T.maculatum, and we confirmed that the length variation among Parideae chloroplast genomes including Paris verticillata occurred by gene deletion and duplication as well as its IR expansion. Although chloroplast genomes are considered highly conserved among land plants, sequence polymorphisms were often observed among closely related species. From the T. tschonoskii and T. maculatum chloroplast genome sequences, we confirmed that 402 indels and 2861 SNPs were present between the two species. In addition, we found that SSRs (i.e. microsatellites), composed of 1–6 bp in length per unit, are distributed throughout both genomes. The SSRs have been accepted as one of the major molecular markers for genome variation between species or within populations due to their high polymorphism within the species and have been widely practiced for analysing plant population structure, diversity, differentiation and maternity analysis (Liu ). Simple sequence repeats have successfully been applied to the study of Poaceae, Brassicaceae and Solanaceae (Provan , 1999; Bryan ; Flannery ). Simple sequence repeats detected in the present study will provide basic information for the further analysis of genetic diversity in Parideae. Based on our results, the IR/LSC boundary and the IR/SSC boundary differed between the two subgenera of Trillium. Inverted repeat/large single-copy junction was expanded to a part of rps3 in T. tschonoskii, whereas that of T. maculatum was found at rps19. The ycf1 was completely located in SSC of T. tschonoskii, but a part of the ycf1 gene was duplicated in IR of T. maculatum. Within the Parideae, the IR boundary pattern of T. tschonoskii was more similar to P. verticillata than T. maculatum (Fig. 2).

Inversion events in Melanthiaceae

Inversions caused by the recombination between repeated sequences are considered to be a main mechanism for changes in gene order among plastid genomes (Jansen and Ruhlman 2012). Most of the reported inversions in plastid genomes are in the LSC region (Kim ). In subtribe Phaseolinae of Fabaceae, there is a 78 kb inversion between trnH/rpl14 and rps19/rps8 in the chloroplast genome (Tangphatsornruang ). Additionally, Kim reported that the inversion occurred in the spacer between tRNAGly and tRNASer genes of Lactuca sativa. Also, they defined two inversions that characterize Asteraceae. The two inversions were identical across all members of Asteraceae, suggesting that the inversion events are likely to occur simultaneously or within a short period of time following the origin of the family. In Campanulaceae, >50 large inversions occurred during diversification of the family, in which at least 20 occurred in Cyphia, and a minimum of 53 are now known in Lobelia (Knox 2014). Fabaceae are known to exhibit a number of unusual phenomena in their chloroplast genome: Trifolium subterraneum has undergone extensive genomic reconfiguration, including the loss of six genes and two introns and numerous gene order changes, attributable to 14–18 inversions (Cai ). Our results confirmed a single inversion in Melanthiaceae. It was remarkable that a single inversion of 3492 bp embedded four genes between ndhC and rbcL genes, which specifically occurred in the monophyletic subgenus Phyllantherum (Fig. 3). This event is thought to have occurred after the evolutionary divergence between subgenus Phyllantherum and subgenus Trillium. This new finding may be an effective molecular marker for classifying subgenera of the genus Trillium.

Diverse patterns of trnI-CAU duplication in Parideae

Gene duplication is an important process in organellar genome evolution. Most duplicated genes occur within the IR regions due to the mechanisms underlying IR expansion and contraction (Xiong ). Gene duplication in plastid genome has been reported in tRNA genes (Hipkins ; Vijverberg and Bachmann 1999; Schmickl ) and in some protein-coding genes. Most of the duplications can be detected only in rearranged chloroplast genomes, as in grasses, legumes and conifers. Hipkins compared the number of direct repeats between partially duplicated trnY-GUA and the complete trnY-GUA gene in Pseudotsuga. They found that the length-variable region in Pseudotsuga comprised imperfect tandem direct repeats based on the trnY gene sequence. Schmickl used the 5′-trnL-UAA_trnF-GAA region for phylogeographic reconstructions, gene diversity calculations and phylogenetic analyses among the genera Arabidopsis and Boechera. The Cruciferous taxa are characterized by these pseudogenes in at least four independent phylogenetic lineages. In addition, the tRNA gene as well as the coding gene could be confirmed by duplication events in Jasminum and Menodora, which have the duplicated rbcL_psaI region. Most chloroplast gene duplications outside of the IR involve tRNAs, as in the case of Oleaceae (Lee ). A total of 30–32 tRNA genes are present within the chloroplast genome of land plants (Tsudzuki ; Vijverberg and Bachmann 1999), and they may be involved in chloroplast genome rearrangements through their secondary structure (Howe ). These genes are dispersed throughout the genome, but five to eight genes are located in the IR (Maréchal-Drouard ). We found that three major types of trnI-CAU gene duplication are located between rpl23 and ycf2 at the IR of tribe Parideae (Fig. 5). Traditionally, Parideae included two genera, Paris and Trillium; however, Trillium was separated into two genera Trillium and Pesudotrillium in recent classifications (Farmer and Schilling 2002). Using the various duplication patterns of trnI-CAU in the IR region, the infrageneric circumscription of Parideae member was strongly supported. The type of trnI-CAU that had been discovered in Xerophyllum, Pesudotrillium and T. undulatum with one trnI-CAU between rpl23 and ycf2 was seen to be similar to the ancestor of Parideae (Type A, Fig. 5). This type was found also in most chloroplast genomes of Liliales (Liu ; Bodin ; Do ; Kim and Kim 2013). It was modified in subgenus Trillium of Trillium and subgenus Daiswa of Paris to be extended by the tandem repeat between trnI-CAU and ycf2. Type B was found in subgenus Phyllantherum of Trillium and section Kinugasa of subgenus Paris although section Kinugasa possessed the additional tandem repeat between trnI-CAU units. Type C, which was found in T. govanianum and section Paris of subgenus Paris, has three copies of the trnI-CAU gene. From the results, we suggested that duplicate events of trnI-CAU have occurred independently in the tribe Parideae of Melanthiaceae, and it provided useful information for determining the infrageneric circumscription. However, T. govanianum, which was classified into another genus Trillidium by Farmer and Schilling (2002) based on morphological characters and geographical distribution, was more similar to Paris than Trillium. Also, this result showed that the trnI-CAU gene duplication pattern of T. govanianum was more similar to Paris than Trillium. Interestingly, it was positioned at the same clade together with the North American species T. undulatum in the molecular phylogenetic tree (Farmer and Schilling 2002; S. C. Kim, J. S. Kim, W. C. Mark, F. F. Michael, J. H. Kim, unpublished data). Further studies are necessary to clarify the relationship between both species.

Conclusions

We analysed the complete chloroplast genomes of two species of T. tschonoskii (subgenus Trillium) and T. maculatum (subgenus Phyllantherum) to verify the specific feature in the genome level. As a result, we found a 3.4 kb inverted sequence between ndhC and rbcL in the LSC region in the chloroplast genome of T. maculatum, which was unique to subgenus Phyllantherum. In addition, three different gene duplication patterns of trnI-CAU gene were found and they were the informative molecular markers for identifying the infrageneric taxa of Trillium.

Sources of Funding

This work was supported by the National Research Foundation of Korea (NRF) Grant Foundation (MEST 2010-0029131) and Korea National Arboretum (KNA1-2-13,14-2).

Contributions by the Authors

J.-H.K conceived and designed the experiments, S.-C.K. performed the experiments, S.-C.K. and J.S.K. analysed the data and S.-C.K. and J.-H.K. wrote the paper.

Conflict of Interest Statement

None declared.

Supporting Information

The following additional information is available in the online version of this article – Table S1. Genes found in Trillium tschonoskii and T. maculatum chloroplast genomes. Table S2. The detailed list of insertion–deletion mutations between the chloroplast genomes of T. tschonoskii and T. maculatum in Parideae. Table S3. Sequences of rpl23_ycf2 IGS among Parideae species.
  33 in total

1.  Two chloroplast DNA inversions originated simultaneously during the early evolution of the sunflower family (Asteraceae).

Authors:  Ki-Joong Kim; Keung-Sun Choi; Robert K Jansen
Journal:  Mol Biol Evol       Date:  2005-05-25       Impact factor: 16.240

Review 2.  Gene duplication, transfer, and evolution in the chloroplast genome.

Authors:  Ai-Sheng Xiong; Ri-He Peng; Jing Zhuang; Feng Gao; Bo Zhu; Xiao-Yan Fu; Yong Xue; Xiao-Feng Jin; Yong-Sheng Tian; Wei Zhao; Quan-Hong Yao
Journal:  Biotechnol Adv       Date:  2009 Jul-Aug       Impact factor: 14.227

3.  Common features of three inversions in wheat chloroplast DNA.

Authors:  C J Howe; R F Barker; C M Bowman; T A Dyer
Journal:  Curr Genet       Date:  1988-04       Impact factor: 3.886

4.  Polymorphic chloroplast simple sequence repeat primers for systematic and population studies in the genus Hordeum

Authors: 
Journal:  Mol Ecol       Date:  1999-03       Impact factor: 6.185

5.  Molecular evolution of a tandemly repeated trnF(GAA) gene in the chloroplast genomes of Microseris (Asteraceae) and the use of structural mutations in phylogenetic analyses.

Authors:  K Vijverberg; K Bachmann
Journal:  Mol Biol Evol       Date:  1999-10       Impact factor: 16.240

6.  A mutation hotspot in the chloroplast genome of a conifer (Douglas-fir: Pseudotsuga) is caused by variability in the number of direct repeats derived from a partially duplicated tRNA gene.

Authors:  V D Hipkins; K A Marshall; D B Neale; W H Rottmann; S H Strauss
Journal:  Curr Genet       Date:  1995-05       Impact factor: 3.886

7.  Analysis of 81 genes from 64 plastid genomes resolves relationships in angiosperms and identifies genome-scale evolutionary patterns.

Authors:  Robert K Jansen; Zhengqiu Cai; Linda A Raubeson; Henry Daniell; Claude W Depamphilis; James Leebens-Mack; Kai F Müller; Mary Guisinger-Bellian; Rosemarie C Haberle; Anne K Hansen; Timothy W Chumley; Seung-Bum Lee; Rhiannon Peery; Joel R McNeal; Jennifer V Kuehl; Jeffrey L Boore
Journal:  Proc Natl Acad Sci U S A       Date:  2007-11-28       Impact factor: 11.205

8.  Comparative genomics of four Liliales families inferred from the complete chloroplast genome sequence of Veratrum patulum O. Loes. (Melanthiaceae).

Authors:  Hoang Dang Khoa Do; Jung Sung Kim; Joo-Hwan Kim
Journal:  Gene       Date:  2013-08-23       Impact factor: 3.688

9.  Geneious Basic: an integrated and extendable desktop software platform for the organization and analysis of sequence data.

Authors:  Matthew Kearse; Richard Moir; Amy Wilson; Steven Stones-Havas; Matthew Cheung; Shane Sturrock; Simon Buxton; Alex Cooper; Sidney Markowitz; Chris Duran; Tobias Thierer; Bruce Ashton; Peter Meintjes; Alexei Drummond
Journal:  Bioinformatics       Date:  2012-04-27       Impact factor: 6.937

10.  Complete sequencing of five araliaceae chloroplast genomes and the phylogenetic implications.

Authors:  Rong Li; Peng-Fei Ma; Jun Wen; Ting-Shuang Yi
Journal:  PLoS One       Date:  2013-10-18       Impact factor: 3.240

View more
  9 in total

1.  The newly developed single nucleotide polymorphism (SNP) markers for a potentially medicinal plant, Crepidiastrum denticulatum (Asteraceae), inferred from complete chloroplast genome data.

Authors:  Hoang Dang Khoa Do; Joonhyung Jung; JongYoung Hyun; Seok Jeong Yoon; Chaejin Lim; Keedon Park; Joo-Hwan Kim
Journal:  Mol Biol Rep       Date:  2019-04-12       Impact factor: 2.316

2.  A Dynamic Tandem Repeat in Monocotyledons Inferred from a Comparative Analysis of Chloroplast Genomes in Melanthiaceae.

Authors:  Hoang Dang Khoa Do; Joo-Hwan Kim
Journal:  Front Plant Sci       Date:  2017-05-22       Impact factor: 5.753

3.  Chloroplast Genomic Resource of Paris for Species Discrimination.

Authors:  Yun Song; Shaojun Wang; Yuanming Ding; Jin Xu; Ming Fu Li; Shuifang Zhu; Naizhong Chen
Journal:  Sci Rep       Date:  2017-06-13       Impact factor: 4.379

4.  The implication of plastid transcriptome analysis in petaloid monocotyledons: A case study of Lilium lancifolium (Liliaceae, Liliales).

Authors:  Hoang Dang Khoa Do; Joo-Hwan Kim
Journal:  Sci Rep       Date:  2019-04-30       Impact factor: 4.379

5.  Species Identification of Dracaena Using the Complete Chloroplast Genome as a Super-Barcode.

Authors:  Zhonglian Zhang; Yue Zhang; Meifang Song; Yanhong Guan; Xiaojun Ma
Journal:  Front Pharmacol       Date:  2019-11-29       Impact factor: 5.810

6.  Comparative analysis and implications of the chloroplast genomes of three thistles (Carduus L., Asteraceae).

Authors:  Joonhyung Jung; Hoang Dang Khoa Do; JongYoung Hyun; Changkyun Kim; Joo-Hwan Kim
Journal:  PeerJ       Date:  2021-01-14       Impact factor: 2.984

7.  Analysis of Complete Chloroplast Genome Sequences Improves Phylogenetic Resolution in Paris (Melanthiaceae).

Authors:  Yuling Huang; Xiaojuan Li; Zhenyan Yang; Chengjin Yang; Junbo Yang; Yunheng Ji
Journal:  Front Plant Sci       Date:  2016-11-29       Impact factor: 5.753

8.  Molecular Phylogeny and Dating of Forsythieae (Oleaceae) Provide Insight into the Miocene History of Eurasian Temperate Shrubs.

Authors:  Young-Ho Ha; Changkyun Kim; Kyung Choi; Joo-Hwan Kim
Journal:  Front Plant Sci       Date:  2018-02-05       Impact factor: 5.753

9.  Reconfiguration of the plastid genome in Lamprocapnos spectabilis: IR boundary shifting, inversion, and intraspecific variation.

Authors:  Seongjun Park; Boram An; SeonJoo Park
Journal:  Sci Rep       Date:  2018-09-11       Impact factor: 4.379

  9 in total

北京卡尤迪生物科技股份有限公司 © 2022-2023.