
The Repeat Sequences and Elevated Substitution Rates of the Chloroplast accD Gene in Cupressophytes.

Jia Li1, Yingjuan Su2,3, Ting Wang4.   

Abstract

The plastid accD gene encodes a subunit of the acetyl-CoA carboxylase (ACCase) enzyme. The accD gene has been reported to be expanded in Cryptomeria japonica, Taiwania cryptomerioides, Cephalotaxus, Taxus chinensis, and Podocarpus lambertii, mainly because of tandemly repeated sequences. However, it is still unknown whether accD has also expanded in other cupressophytes. To investigate how widespread this phenomenon is, we sequenced and analyzed accD and its surrounding regions from 18 cupressophyte species. Together with 39 GenBank sequences, our taxon sampling covered all the extant gymnosperm orders. The repetitive elements and substitution rates of accD among 57 gymnosperm species were analyzed. The results show that: (1) the reading frame length of accD has also expanded in the 18 cupressophyte species; (2) many repetitive elements are present in accD of cupressophyte lineages; (3) the synonymous and non-synonymous substitution rates of accD are accelerated in cupressophytes; and (4) accD is located at rearrangement endpoints. These results suggest that repetitive elements may mediate chloroplast genome rearrangement and accelerate substitution rates.


Keywords:  accD; cupressophytes; rearrangement; repeat sequences; substitution rates

Year:  2018        PMID: 29731764      PMCID: PMC5920036          DOI: 10.3389/fpls.2018.00533

Source DB:  PubMed          Journal:  Front Plant Sci        ISSN: 1664-462X            Impact factor:   5.753


Introduction

Cupressophytes, also called non-Pinaceae conifers, comprise about 380 species in 58 genera of five families: Araucariaceae, Podocarpaceae, Sciadopityaceae, Taxaceae (including Cephalotaxaceae), and Cupressaceae (including Taxodiaceae) (Christenhusz et al., 2011). Most species of Araucariaceae and Podocarpaceae are distributed in the Southern Hemisphere, while the other three families occur in the Northern Hemisphere. Some cupressophyte species are of economic and ecological value to humans. For instance, most species of Cupressaceae are valued for timber production or as ornamentals, and the secondary metabolite paclitaxel (taxol), extracted from the bark of Taxus, is a chemotherapy drug used to treat ovarian and breast cancer.
Dispersed repetitive DNA sequences are scattered throughout the chloroplast genome. Most studies have concentrated on detecting repeat sequences at a chloroplast genome-wide scale (Saski et al., 2005; Guo et al., 2007; Haberle et al., 2008; Tangphatsornruang et al., 2011); only a few reports address the presence and structure of repetitive DNA in a specific gene across many lineages (Hipkins et al., 1995; Erixon and Oxelman, 2008). Some repeats occur in the coding regions of chloroplast genes. For instance, repeat sequences have been found in ycf1 and ycf2 of Panax ginseng (Zhao et al., 2014), Taxus chinensis (Zhang et al., 2014), Podocarpus lambertii (Vieira et al., 2014), Cephalotaxus oliveri (Yi et al., 2013), and globe artichoke (Curci et al., 2015), as well as in accD of T. chinensis (Zhang et al., 2014), P. lambertii (Vieira et al., 2014), Medicago truncatula (Gurdon and Maliga, 2014), C. oliveri (Yi et al., 2013), Capsicum annuum (Jo et al., 2011), Pisum sativum and Lathyrus sativus (Magee et al., 2010).
Many studies have suggested that highly rearranged chloroplast genomes generally possess a large number of repetitive sequences associated with rearrangement endpoints. This phenomenon has been observed in several land plants: Douglas-fir (Pseudotsuga menziesii) (Tsai and Strauss, 1989), Pelargonium (Chumley et al., 2006), Trifolium (Cai et al., 2008), Trachelium (Haberle et al., 2008), Oleaceae (Lee et al., 2007), and Asteraceae (Kim et al., 2005; Timme et al., 2007). The size and number of repeats also correlate with the extent of genome rearrangement (Guisinger et al., 2011). The genome with the highest frequency of long repeats is usually the most reorganized.
Acetyl-CoA carboxylase (ACCase) catalyzes the carboxylation of acetyl-CoA to form malonyl-CoA and is thought to regulate de novo fatty acid biosynthesis (Konishi and Sasaki, 1994; Sasaki and Nagano, 2004). Most higher plants, except for Gramineae, have two forms of ACCase: a prokaryotic type made up of several subunits in the stroma of plastids, and a eukaryotic form composed of a single multifunctional polypeptide located in the cytosol (Konishi et al., 1996). The prokaryotic ACCase consists of the α-carboxyl transferase, the biotin carboxyl carrier, the biotin carboxylase, and the β-carboxyl transferase subunits (Gornicki et al., 1997). The β-carboxyl transferase subunit is encoded by the plastid accD gene, whereas the other three subunits are all nucleus-encoded. The plastid-localized accD gene is essential for leaf growth and for maintaining the plastid compartment in tobacco (Kode et al., 2005). Elevated accD expression raised the total amount of ACCase in plastids and significantly increased the fatty acid content of tobacco leaves (Madoka et al., 2002). Furthermore, expression of accD is considered essential during embryo development in Arabidopsis (Bryant et al., 2011).
The accD gene is widely distributed in plants, including the reduced chloroplast genomes of parasitic and non-photosynthetic plants (Wolfe et al., 1992; de Koning and Keeling, 2006). However, accD has been lost several times from the chloroplast genomes of some angiosperm lineages: Acoraceae (Goremykin et al., 2005), Poaceae (Konishi and Sasaki, 1994; Harris et al., 2012), Campanulaceae (Haberle et al., 2008), Geraniaceae (Guisinger et al., 2008), and Fabaceae (Magee et al., 2010). In Poaceae, the plastid-located prokaryotic form of ACCase is functionally replaced by the nuclear-encoded eukaryotic type (Konishi et al., 1996; Gornicki et al., 1997). The loss of accD from the chloroplast genomes of Campanulaceae and Fabaceae is also consistent with the presence of an additional ACCase counterpart in the nucleus (Magee et al., 2010; Rousseau et al., 2013). In Trifolium repens (Fabaceae), scanning of high-throughput EST sequence data showed that accD is fused with a nuclear gene for plastid lipoamide dehydrogenase (LPD2) (Magee et al., 2010); in Trachelium caeruleum (Campanulaceae), a transit peptide is combined with an abridged accD gene that encodes only 331 amino acids (Rousseau et al., 2013).
In contrast to the loss of this gene in the above species, the length of accD has increased in cupressophyte species including Cryptomeria japonica, Taiwania cryptomerioides, Cephalotaxus wilsoniana, C. oliveri, T. chinensis, and P. lambertii (Hirao et al., 2008; Wu et al., 2011; Yi et al., 2013; Vieira et al., 2014; Zhang et al., 2014). The extension of accD is mainly caused by the insertion of a large number of tandemly repeated sequences in this region. However, the repetitive elements of the gene differ among Cephalotaxus, T. cryptomerioides, T. chinensis, and P. lambertii (Yi et al., 2013; Vieira et al., 2014; Zhang et al., 2014).
The evolutionary mechanisms underlying the occurrence of repetitive elements in the accD gene of cupressophytes therefore remain poorly studied. Sequence data from a wider phylogenetic breadth of cupressophytes are needed to clarify the evolutionary history of accD. A study of four mammalian genomes and one bird genome suggested that regions surrounding tandem repeats evolve faster than other regions that do not contain repeats (Simon and Hancock, 2009). One explanation is that regions near repeat sequences have evolved under weaker negative selection than the rest of the region in which they are embedded (Djian et al., 1996; Faux et al., 2007). Another explanation is that repeat sequences give rise to more substitutions in their flanking sequences (Huntley and Clark, 2007). Recent evidence also suggests that the insertion of repeat sequences elevates the substitution rate of the entire sequence (Huntley and Clark, 2007). It has also been reported that repeat sequences themselves evolve faster than their flanking sequences (Huntley and Golding, 2000). Although accD contains many repeat elements, it is unknown whether the substitution rates of the repeat sequences or of their flanking sequences have accelerated. To elucidate the evolutionary patterns of repeat sequences in the chloroplast genome, we characterized the substitution rate pattern of accD in this study. To gain better insight into the evolution of accD in cupressophytes, we sequenced accD genes from 18 cupressophyte species. The aims of this study were: (1) to investigate whether accD gene length in cupressophytes tends to increase; (2) to explore whether accD in other cupressophyte species contains specific repetitive elements like those of Cephalotaxus, T. cryptomerioides, T. chinensis, and P. lambertii; (3) to determine the substitution rate pattern of accD in cupressophytes; and (4) to identify the gene order around accD and examine the association among repetitive elements, substitution rates, and genome rearrangement.

Materials and Methods

Plant Sampling

Fresh leaves of 18 conifer species were sampled from Wuhan Botanical Garden, Chinese Academy of Sciences (CAS); the Institute of Botany, CAS; and Sun Yat-sen University (Table ). The materials used for DNA extraction were preserved in silica gel.

DNA Extraction and Sequencing

Total genomic DNA was isolated from leaves using the CTAB method (Gawel and Jarret, 1991). The quality of the genomic DNA was assessed by 1% agarose gel electrophoresis. The accD gene was amplified by polymerase chain reaction (PCR). PCR primers (Supplementary Table 1) were designed from conserved regions of the sequences of four gymnosperms (C. japonica, NC_010548; T. cryptomerioides, NC_016065; C. wilsoniana, NC_016063; C. oliveri, KC136217). The PCR system was as described previously (Li et al., 2016). The PCR products were then cloned into the pCR 2.1 plasmid vector (Invitrogen, Carlsbad, CA, United States) and transformed into Escherichia coli DH5α. At least three randomly chosen positive clones were sequenced using an ABI 3730xl DNA Analyzer (Applied Biosystems, Foster City, CA, United States).

Sequence Assembly and Annotation

The sequences generated from different primers were assembled into a single sequence using BioEdit (Hall, 1999), with overlaps of 150–300 bp. Contigs were initially annotated with DOGMA (Dual Organellar GenoMe Annotator). Genes that could not be confirmed by DOGMA were identified using Blastx[1] and ORF Finder[2]. The tRNA genes were annotated with tRNAscan-SE v1.21 (Lowe and Eddy, 1997).

Repeat Sequence Analyses

The sequences were initially scanned with REPuter using a minimum repeat length of 20 bp and a sequence similarity above 90% (Kurtz et al., 2001). Sequences were further processed with the Tandem Repeats Finder software (Benson, 1999).
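As a rough illustration of what a tandem-repeat scan looks for, the following minimal sketch finds exact, back-to-back copies of a short unit in a sequence. It is not the algorithm used by REPuter or Tandem Repeats Finder, which tolerate mismatches and use statistical scoring; the function name, parameters, and the toy peptide are hypothetical.

```python
def find_tandem_repeats(seq, min_unit=3, max_unit=60, min_copies=2):
    """Return (start, unit, copies) for exact tandem repeats in seq.

    Simplified illustration only: units must match exactly, and at each
    start position the repeat covering the longest span is reported.
    """
    hits = []
    n = len(seq)
    i = 0
    while i < n:
        best = None
        longest_unit = min(max_unit, (n - i) // 2)
        for unit_len in range(min_unit, longest_unit + 1):
            unit = seq[i:i + unit_len]
            copies = 1
            # Count how many times the unit follows itself directly.
            while seq[i + copies * unit_len:i + (copies + 1) * unit_len] == unit:
                copies += 1
            if copies >= min_copies:
                if best is None or copies * unit_len > best[2] * len(best[1]):
                    best = (i, unit, copies)
        if best is not None:
            hits.append(best)
            i += len(best[1]) * best[2]  # skip past the reported repeat
        else:
            i += 1
    return hits


# Toy peptide (hypothetical): three direct copies of "PEDS".
toy = "MK" + "PEDS" * 3 + "LLQ"
print(find_tandem_repeats(toy))
```

On real accD sequences the copies are only nearly identical, so an exact-match scan like this would miss many of the elements that the dedicated tools report.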

The Estimation of Substitution Rate

For the analysis in Figure , we first constructed a maximum likelihood (ML) tree using rbcL sequences. The analysis was performed in RAxML v8.1.x with the GTR+I+G model. In addition, following the strongly supported relationship published elsewhere (Lu et al., 2014), Podocarpaceae and Araucariaceae were constrained to be sister groups in the rbcL ML tree. This tree was then used for the subsequent substitution rate calculations. To compare the substitution rate of accD with those of two other widely used chloroplast markers, rbcL and matK, we also downloaded sequences of these two genes from GenBank. The branch lengths in terms of non-synonymous (dN) and synonymous (dS) nucleotide substitutions for the accD, matK, and rbcL trees were calculated using the free-ratio model implemented in the codeml program of PAML.
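For readers who wish to set up a comparable run, the sketch below writes a minimal codeml control file in which `model = 1` selects the free-ratio model (an independent dN/dS ratio for each branch) on a fixed user tree. Only the use of the free-ratio model comes from the text; the file names, the codon frequency setting, the starting omega, and the gap handling are assumptions chosen for illustration, not the authors' settings.

```python
def codeml_free_ratio_ctl(seqfile, treefile, outfile):
    """Build the text of a codeml control file for a free-ratio run."""
    settings = [
        ("seqfile", seqfile),    # codon alignment (hypothetical file name)
        ("treefile", treefile),  # fixed topology, e.g. the constrained rbcL tree
        ("outfile", outfile),
        ("runmode", 0),          # evaluate the user-supplied tree
        ("seqtype", 1),          # 1 = codon sequences
        ("CodonFreq", 2),        # F3x4 codon frequencies (assumption)
        ("model", 1),            # 1 = free-ratio: one omega per branch
        ("NSsites", 0),          # no variation in omega among sites
        ("fix_omega", 0),        # estimate omega rather than fixing it
        ("omega", 0.4),          # starting value (assumption)
        ("cleandata", 0),        # keep sites with gaps (assumption)
    ]
    return "\n".join(f"{key:>10} = {value}" for key, value in settings) + "\n"


ctl_text = codeml_free_ratio_ctl("accD_codon.phy", "rbcL_ML.tre", "accD_free.out")
print(ctl_text)
```

The returned text can be saved as `codeml.ctl` and run once per gene (accD, matK, rbcL) against the same tree so that the dN and dS branch lengths are directly comparable.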

Results

The General Features of accD Gene in Cupressophytes

The sequences acquired in this study were deposited in GenBank under accession numbers KT030780–KT030797. A comparison of 57 gymnosperm accD sequences showed that approximately 200 amino acids at the end of this gene are highly conserved (Supplementary Figures 1-9, 1-10, positions 1200–1400 in the alignment). This C-terminal region is functionally important for the ACCD protein (Zhang et al., 2003). However, the residues in the N-terminal and middle regions showed low similarity (Supplementary Figures 1-1 to 1-8). The major difference among the 57 gymnosperm accD sequences is the presence of large insertions in the N-terminal and middle regions of cupressophyte accD (Supplementary Figures 1-1 to 1-8). Furthermore, the open reading frame has not been disrupted by these insertions. The accD gene length in cupressophytes has undergone an extraordinary expansion. The accD gene in the Podocarpaceae lineage exceeds 600 codons (Table ). Cephalotaxus hainanensis, analyzed in this study, has the largest accD gene, reaching 1070 codons (Table ), which is approximately three times the length found in Pinaceae species. The accD gene length also varies significantly within families. In Taxaceae, the accD gene of Taxus has expanded dramatically, reaching 735, 736, 753, 759, and 767 codons in T. mairei, T. wallichiana, T. media, T. chinensis, and T. yunnanensis, respectively (Table ); Pseudotaxus chienii has only 657 codons; Torreya fargesii and Torreya grandis have 871 and 879 codons, respectively; and Amentotaxus argotaenia has the longest accD gene in Taxaceae, with 916 codons (Table ). In general, our results support the hypothesis of Hirao et al. (2008) that accD gene length has expanded in cupressophytes.
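The "approximately three times" comparison can be checked with simple arithmetic on the codon counts listed in Table 1. The sketch below copies the 17 Pinaceae values and four cupressophyte values from that table; using the Pinaceae mean as the baseline is an assumption made here for illustration, since the text does not state which reference length was used.

```python
from statistics import mean

# accD lengths in codons, copied from Table 1 (all 17 Pinaceae entries).
pinaceae_codons = [320, 322, 323, 322, 324, 326, 320, 319, 321,
                   320, 320, 321, 320, 320, 320, 321, 321]

# Selected cupressophyte entries from the same table.
cupressophyte_codons = {
    "Cephalotaxus hainanensis": 1070,
    "Amentotaxus argotaenia": 916,
    "Taxus yunnanensis": 767,
    "Pseudotaxus chienii": 657,
}


def fold_expansion(codons, baseline):
    """Ratio of a gene length to a baseline length."""
    return codons / baseline


baseline = mean(pinaceae_codons)  # assumption: Pinaceae mean as reference
folds = {species: fold_expansion(n, baseline)
         for species, n in cupressophyte_codons.items()}

for species, fold in folds.items():
    print(f"{species}: {fold:.2f}x the Pinaceae mean")
```

With these values the Cephalotaxus hainanensis gene comes out at a little over three times the Pinaceae mean, in line with the statement in the text.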

Repetitive Amino Acid Elements in accD

To investigate the mechanisms underlying accD length variation, we used REPuter and Tandem Repeats Finder to search for repetitive sequences. As expected, accD gene length variation is explained by insertions consisting of tandemly repeated sequences. The repetitive sequences in accD fall into a total of 31 categories, each present in 2–13 nearly identical copies, all of which are in the same (i.e., direct) orientation relative to each other (Supplementary Table 2 and Figure ). Cycadaceae, Ginkgoaceae, and the Pinaceae species, which have a relatively small gene size (Table ), do not have repetitive elements. In comparison, accD in the cupressophytes investigated in this study possesses a large number of repetitive sequences.
Ten repetitive elements were identified in the accD gene of Cephalotaxaceae (Supplementary Table 2 and Figure ). Some repetitive elements, represented by R5, R9, and R8, were found exclusively in either C. wilsoniana or C. hainanensis, whereas others, such as R1, R2, R3, R4, and R10, were found in all Cephalotaxus species. The R1, R2, and R10 repetitive elements were all duplicated in the five Cephalotaxus species. The copy numbers of R3, R4, R6, and R7 vary among species. For instance, C. hainanensis has 13 copies of R3, whereas C. sinensis, C. wilsoniana, and C. fortunei have 12 copies and C. oliveri has only six.
Three copies of R11 and four copies of R12 were found in Cunninghamia lanceolata and Calocedrus formosana, respectively (Supplementary Table 2 and Figure ). Juniperus virginiana has two copies of R14, while the other Juniperus species have only one copy of this repetitive element (Supplementary Table 2 and Figure ). The main difference in repetitive elements between the two Taiwania species is the copy number of R16 and R18: T. cryptomerioides has six copies of R16 and three copies of R18, while T. flousiana has seven copies of R16 and two copies of R18 (Supplementary Table 2 and Figure ). The R19 repetitive element is common to Cupressaceae species except for C. lanceolata and Taiwania (Supplementary Table 2 and Figure ).
In Taxus and Pseudotaxus, the accD gene contains four kinds of repetitive elements: R20, R21, R22, and R23 (Supplementary Table 2 and Figure ). Two copies of R21 were found in T. wallichiana, while the other Taxus species have only one copy of R21. T. chinensis and T. cuspidata both have four copies of R20 and T. yunnanensis has only two, while the remaining four Taxus species each have three copies of this repetitive element. P. chienii also has the R20 element, but only as a single copy. The copy number of R22 in Taxus also varies, ranging from two in T. wallichiana to four in T. yunnanensis. The largest tandem repetitive element, R25, spans 59 amino acids and occurs in Torreya and Amentotaxus (Supplementary Table 2 and Figure ); two copies of R25 were identified in each of these genera. Amentotaxus has two genus-specific repetitive elements, R24 and R26, whose copy numbers also differ between Amentotaxus argotaenia and Amentotaxus formosana (Supplementary Table 2 and Figure ).
Podocarpaceae has only a few repetitive elements (Supplementary Table 2 and Figure ). P. macrophyllus and P. neriifolius each contain three copies of R27, while P. lambertii contains two. Other Podocarpaceae species contain two copies of R28. Araucaria cunninghamii and Agathis dammara have eight and seven copies of R31, respectively, which is also lineage specific (Supplementary Table 2 and Figure ). The consensus sequence of R31 was also found in Podocarpaceae, but only as a single copy (not repeated), suggesting that the R31 repetitive element was duplicated only in Araucariaceae. Furthermore, no pairs of direct repetitive sequences were identified flanking the inserted repetitive elements of cupressophytes.
The repetitive elements of the accD gene in five groups of species. Sequence alignments were performed individually for each group using MEGA. (A) Alignment of ACCD amino acid sequences from five Cephalotaxaceae species. The figure shows only regions 364 to 422 and 488 to 876. (B) Alignment of ACCD amino acid sequences from ten Cupressaceae species. The figure shows only regions 1 to 61, 276 to 290, and 522 to 744. (C) Alignment of ACCD amino acid sequences from seven Taxus species and P. chienii. The figure shows only regions 1 to 50 and 436 to 636. (D) Alignment of ACCD amino acid sequences from two Torreya and two Amentotaxus species. The figure shows only region 1 to 297. (E) Alignment of ACCD amino acid sequences from eight Podocarpaceae and two Araucariaceae species. The figure shows only regions 171 to 237, 434 to 497, and 530 to 707. Different repetitive elements are marked with boxes of different colors. Arrows indicate repetitive elements that have only one copy. Three dots mark the spacer between two fragments.

Rapid Evolution of accD in Cupressophytes

The dN and dS values for the accD, rbcL, and matK genes are represented as branch lengths in Figure . In the dN trees, rbcL and matK have a relatively low substitution rate across the entire tree. The branch leading to the clade of Cupressaceae, Taxaceae, and Cephalotaxaceae in the accD dN tree is longer than other branches, suggesting that accD evolves faster in this clade. In addition, the branch leading to Podocarpaceae and Araucariaceae in the accD dN tree is longer than those of other gymnosperms (Figure ). Interestingly, the accD gene length also starts to expand in the lineage of Podocarpaceae and Araucariaceae. In the matK and rbcL dS trees, most gymnosperm species evolve slowly and at a consistent rate, except for the branch leading to Podocarpaceae. However, the dS of accD is much higher in cupressophytes than in many of the Pinaceae species. Compared with rbcL and matK, accD shows a high level of divergence among cupressophyte species. In general, accD has experienced an acceleration of substitution rates, and this acceleration is locus and lineage specific.
dN and dS trees for the accD, matK, and rbcL genes. Branch lengths are in terms of dN and dS as estimated by PAML under a constrained topology. The topologies of the accD dN tree, accD dS tree, rbcL dN tree, and rbcL dS tree are identical to each other. The matK dN and dS trees are similar to the rbcL and accD trees after removing C. hainanensis. The gray boxes denote branches whose dN or dS has been accelerated.

Gene Order Around accD in Gymnosperms

The gene order around accD can be classified into six types (Figure ). At high taxonomic levels, the gene order tends to be conserved across Cycadaceae, Ginkgoaceae, and Pinaceae, with the arrangement rbcL-trnR-accD-psaI. The gene order in Araucariaceae and Podocarpaceae, excluding Podocarpus totara, is nearly identical to that of Cycadaceae, Ginkgoaceae, and Pinaceae, except that an extra trnD gene is found between rbcL and trnR. In P. totara, the gene order is psbM-trnD-accD-psaI, which differs from that of the other three Podocarpus species even though they belong to the same genus. In Taxaceae, C. japonica, Taiwania, M. glyptostroboides, and C. lanceolata, rbcL and clpP are located near accD. The gene order of Cephalotaxus differs from that of Taxaceae by the inversion of clpP and the translocation of rps16. Compared with Taxaceae, rpl23 takes the place of clpP in Juniperus and C. formosana, giving the gene order rbcL-accD-rpl23. Because gymnosperm chloroplast genomes differ so much in gene organization around accD, we speculate that accD has been involved in rearrangement events of the gymnosperm chloroplast genome.
Gene organization around the accD locus in gymnosperms. The green star indicates the lineage in which accD has expanded. Genes shown above the line are transcribed from left to right, while those below the line are transcribed in the opposite direction. The half-height region in rps16 represents an intron. The topology (not drawn to scale) on the left is the same as that of the accD dN tree in Figure . The Roman numerals I–VI denote the six types of gene organization around accD.

Discussion

The accD Gene Length and Repetitive Elements

In gymnosperms, the reading frame length of accD varies considerably. At present, six complete chloroplast (cp) genomes of Gnetales have been published. However, accD could not be found in these cp genomes, suggesting that accD has been lost from the cp genomes of Gnetales (Wu et al., 2009). The accD genes of Cycas (359 codons) and Ginkgo (323 codons) are relatively short. In Pinaceae, the accD gene length ranges from 319 (Picea abies) to 326 (Picea morrisonicola) codons. In contrast, we found that the accD gene length in cupressophytes has undergone an extraordinary expansion. From the alignment of 57 gymnosperm accD sequences, we infer that the enlarged accD gene in cupressophytes is mainly caused by numerous inserted repetitive sequences in the middle region. Many different repetitive elements were identified in the inserted sequences. The repetitive elements show relatively low similarity among genera (Supplementary Table 2 and Figure ), suggesting that they probably do not have a common origin and formed independently.

The Function of Repetitive Elements in accD

In addition to cupressophytes, repeat sequences in accD have been reported in two legume species (P. sativum and L. sativus) (Magee et al., 2010), pepper (C. annuum) (Jo et al., 2011), and M. truncatula (Gurdon and Maliga, 2014). This supports the idea that some proteins are more prone to generating repeats during evolution (Mularoni et al., 2010). The accD genes of both P. sativum and L. sativus contain many repeat sequences in their middle region, but the repetitive elements of these two species show low similarity. The repetitive elements in P. sativum and L. sativus also differ from those in cupressophytes, suggesting that the repetitive elements are species-specific. In pepper, seven repeats of an 18-bp element were observed. Interestingly, one pair of short direct repeat sequences was located near the inserted repeat sequences. However, no such sequences were found near the inserted repeat sequences of accD in cupressophytes and legumes, suggesting that these direct repeats are not necessary for the formation of the repeat sequences. Transcription of accD in pepper was confirmed by reverse transcriptase PCR, so the expanded accD gene in pepper is presumed to be functional. Furthermore, a large number of complex repeats were found in different ecotypes of M. truncatula. It has been suggested that these inserted repeat sequences are not very important for ACCase function (Gurdon and Maliga, 2014). On the other hand, the reading frame of this gene was not disrupted, so we speculate that the repetitive elements in these species may play a role in regulating protein function. Together, these results suggest that accD is a gene that tends to form repeat sequences independently, and that these repeat sequences are species-specific and detected only in some species.
The accD gene encodes the carboxyltransferase β subunit of ACCase. It is essential for leaf development in tobacco, as knocking out accD may be lethal (Kode et al., 2005). Three points strongly indicate that the function of this gene has not been lost in cupressophytes. First, despite containing repeats, the original reading frame of accD is maintained, suggesting that the gene remains functional in cupressophytes. Second, three sites are considered important for accD in potato: an acetyl-CoA binding site, a CoA-carboxylation catalytic site, and a carboxy-biotin binding site. All three sites are located in the C-terminal region of the protein in all gymnosperm species. Third, only the cupressophyte lineages contain a large number of complex repeats. The accD genes of Cycas (359 amino acids), Ginkgo (323 amino acids) (Table ), and most angiosperms have not expanded and do not contain repeat sequences. Yi et al. (2013) confirmed that accD in C. oliveri remains functional after expansion. The alignment of accD from 57 gymnosperms shows that the gene is considerably conserved at the 3′ end. By comparison, the nuclear copies of accD in Trachelium and Trifolium each encode only the 3′ end region of the chloroplast gene (Rousseau et al., 2013). The 3′ end region of accD encodes the carboxyl transferase, which is the most important functional region discovered in this gene (Zhang et al., 2003). It is therefore reasonable that the 3′ end of accD is more conserved than the highly variable 5′ end, owing to functional constraint.

The Acceleration of Substitution Rate

Gene-specific rate acceleration is considered a common feature of chloroplast genome evolution (Jansen et al., 2007; Park et al., 2017). Among the three genes (rbcL, matK, accD) analyzed in this study, only accD shows an obvious acceleration in both dN and dS in cupressophytes (Figure ), the lineage that is also rich in repeat sequences (Supplementary Table 2 and Figure ). We also searched rbcL and matK for repetitive elements using Tandem Repeats Finder, but none were found. There therefore appears to be a positive correlation between repetitive elements and substitution rate acceleration. Many studies support the view that species-specific rate acceleration is associated with genomic rearrangement (Jansen et al., 2007; Guisinger et al., 2008, 2010). However, the relationship between repeat sequences and rate acceleration has been little documented. Park et al. (2017) attributed the accelerated substitution rate of accD in Geraniaceae to inserted sequences. However, we found that repetitive elements also exist in these inserted sequences. The insertion of repetitive elements may have made the sequence more variable, thus leading to the acceleration of the substitution rate.

Genome Rearrangement Happened Near accD

Dispersed repeat elements are thought to be located at rearrangement endpoints. In this study, we also sequenced the genes near accD and found six types of gene order around accD (Figure ). This indicates that accD is located at inversion endpoints. Apart from the relocation or complete loss of trnD or trnR, rbcL is generally located on one side of accD. In Cephalotaxus, a large inversion relative to other gymnosperms has relocated accD near rps16 rather than rbcL (Wu et al., 2011). We speculate that repetitive elements may induce rearrangement near accD. One explanation for this correlation is that recombination between repeat sequences can lead to genome rearrangements (Rogalski et al., 2006; Gray et al., 2009). In addition to accD, other rearrangement endpoints caused by inversions also exist. For instance, two inversions were identified between the Agathis dammara and Nageia nagi chloroplast genomes (Wu and Chaw, 2014), which make the intergenic region between ycf1 and clpP, as well as rpl23, rpl20, and petG, rearrangement endpoints. However, repeat elements were detected only in the intergenic region between ycf1 and clpP; rpl23, rpl20, and petG did not contain repeat sequences. This suggests that not all rearrangement endpoints contain repeat elements.

Conclusion

The accD gene in cupressophytes has undergone an extraordinary length expansion, which was mainly caused by abundant, independently formed repetitive elements. Along with the repetitive elements, the dN and dS of accD are also accelerated. In addition, accD has been involved in many rearrangement events. All these results suggest that the repetitive elements may promote the acceleration of substitution rates and mediate genome rearrangement. Our study provides a typical case for research on the relationship among repetitive elements, genome rearrangement, and substitution rate.

Author Contributions

JL carried out the experiments, conducted data analysis, and wrote the manuscript. TW wrote the manuscript. YS designed the study and wrote the manuscript.

Conflict of Interest Statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest. The handling Editor declared a shared affiliation, though no other collaboration, with one of the authors YS.
Table 1

List of taxa and sequences analyzed in this study.

| Family | Genus | Species | Sampling site | accD GenBank | rbcL GenBank | matK GenBank | accD gene length (codons) |
|---|---|---|---|---|---|---|---|
| Cycadaceae | Cycas | Cycas taitungensis | | NC_009618 | NC_009618 | NC_009618 | 359 |
| | | Cycas revoluta | | NC_020319 | NC_020319 | NC_020319 | 359 |
| Ginkgoaceae | Ginkgo | Ginkgo biloba | | NC_016986 | NC_016986 | NC_016986 | 323 |
| Pinaceae | Cathaya | Cathaya argyrophylla | | NC_014589 | NC_014589 | NC_014589 | 320 |
| | Cedrus | Cedrus deodara | | NC_014575 | NC_014575 | NC_014575 | 322 |
| | Keteleeria | Keteleeria davidiana | | NC_011930 | NC_011930 | NC_011930 | 323 |
| | Pseudotsuga | Pseudotsuga sinensis | | NC_016064 | NC_016064 | NC_016064 | 322 |
| | Larix | Larix decidua | | NC_016058 | NC_016058 | NC_016058 | 324 |
| | Picea | Picea morrisonicola | | NC_016069 | NC_016069 | NC_016069 | 326 |
| | | Picea sitchensis | | NC_011152 | NC_011152 | NC_011152 | 320 |
| | | Picea abies | | AJ001004 | AJ001004 | AJ001004 | 319 |
| | Pinus | Pinus thunbergii | | NC_001631 | NC_001631 | NC_001631 | 321 |
| | Strobus | Pinus koraiensis | | NC_004677 | NC_004677 | NC_004677 | 320 |
| | | Pinus lambertiana | | NC_011156 | NC_011156 | NC_011156 | 320 |
| | | Pinus contorta | | NC_011153 | NC_011153 | AB080921 | 321 |
| | | Pinus gerardiana | | NC_011154 | NC_011154 | NC_011154 | 320 |
| | | Pinus krempfii | | NC_011155 | NC_011155 | NC_011155 | 320 |
| | | Pinus nelsonii | | NC_011159 | NC_011159 | NC_011159 | 320 |
| | | Pinus massoniana | | NC_021439 | NC_021439 | NC_021439 | 321 |
| | | Pinus taeda | | NC_021440 | NC_021440 | NC_021440 | 321 |
| Podocarpaceae | Podocarpus | Podocarpus macrophyllus | WBG | KT030792 | AY450866 | JQ512476 | 693 |
| | | Podocarpus neriifolius | WBG | KT030793 | AY450867 | HM593765 | 693 |
| | | Podocarpus totara | | NC_020361 | NC_020361 | NC_020361 | 683 |
| | | Podocarpus lambertii | | KJ010812 | KJ010812 | KJ010812 | 683 |
| | Retrophyllum | Retrophyllum piresii | | NC_024827 | NC_024827 | NC_024827 | 683 |
| | Nageia | Nageia nagi | WBG | KT030794 | AY450868 | AB644449 | 683 |
| | | Nageia nagi | | NC_023120 | NC_023120 | NC_023120 | 683 |
| | | Nageia fleuryi | WBG | KT030795 | HM593612 | HM593714 | 683 |
| Araucariaceae | Araucaria | Araucaria cunninghamii | WBG | KT030797 | JQ512518 | JQ512394 | 849 |
| | Agathis | Agathis dammara | | NC_023119 | NC_023119 | NC_023119 | 819 |
| Taxaceae | Taxus | Taxus media | WBG | KT030780 | EF660722 | EF660664 | 753 |
| | | Taxus wallichiana | WBG | KT030784 | EF660717 | DQ478792 | 736 |
| | | Taxus chinensis | WBG | KT030782 | AY450855 | AF228103 | 759 |
| | | Taxus cuspidata | WBG | KT030781 | EF660720 | JX174691 | 759 |
| | | Taxus yunnanensis | IB | KT030783 | AY450857 | JX174720.1 | 767 |
| | | Taxus mairei | | NC_020321 | NC_020321 | NC_020321 | 735 |
| | | Taxus mairei | | KJ123824 | KJ123824 | KJ123824 | 773 |
| | Pseudotaxus | Pseudotaxus chienii | SYSU | KT030788 | AY450858 | AF228105 | 657 |
| | Torreya | Torreya fargesii | WBG | KT030785 | EF660735 | AF228107 | 871 |
| | | Torreya grandis | WBG | KT030786 | EF660733 | AF228108 | 879 |
| | Amentotaxus | Amentotaxus argotaenia | IB | KT030787 | AY450859 | JQ512393 | 916 |
| | | Amentotaxus formosana | | NC_024945 | NC_024945 | NC_024945 | 910 |
| Cephalotaxaceae | Cephalotaxus | Cephalotaxus hainanensis | IB | KT030790 | AY450862 | | 1070 |
| | | Cephalotaxus sinensis | WBG | KT030789 | AY450864 | AB023988 | 1038 |
| | | Cephalotaxus fortunei | WBG | KT030791 | AY450863 | FJ600911 | 1038 |
| | | Cephalotaxus wilsoniana | | NC_016063 | NC_016063 | NC_016063 | 1056 |
| | | Cephalotaxus oliveri | | KC136217 | KC136217 | KC136217 | 936 |
| Cupressaceae | Taiwania | Taiwania cryptomerioides | | NC_016065 | NC_016065 | NC_016065 | 800 |
| | | Taiwania flousiana | | NC_021441 | NC_021441 | NC_021441 | 795 |
| | Cryptomeria | Cryptomeria japonica | | NC_010548 | NC_010548 | NC_010548 | 700 |
| | Metasequoia | Metasequoia glyptostroboides | WBG | KT030796 | JQ512563 | JQ512439 | 697 |
| | Calocedrus | Calocedrus formosana | | NC_023121 | NC_023121 | NC_023121 | 800 |
| | Cunninghamia | Cunninghamia lanceolata | | NC_021437 | NC_021437 | NC_021437 | 740 |
| | Juniperus | Juniperus virginiana | | KF866300.1 | KF866300.1 | KF866300.1 | 902 |
| | | Juniperus monosperma | | KF866298 | KF866298 | KF866298 | 896 |
| | | Juniperus scopulorum | | KF866299 | KF866299 | KF866299 | 884 |
| | | Juniperus bermudiana | | KF866297 | KF866297 | KF866297 | 899 |
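
The length expansion summarized in Table 1 can be checked with a few lines of arithmetic. The sketch below is illustrative only and is not part of the authors' analysis: the codon counts are transcribed from the last column of Table 1, each row is assigned to the most recent family label above it, and the names `ACCD_CODONS`, `CUPRESSOPHYTE_FAMILIES`, `summarize`, and `split_groups` are hypothetical helpers introduced here. The grouping assumes that the five families Podocarpaceae, Araucariaceae, Taxaceae, Cephalotaxaceae, and Cupressaceae are the cupressophytes, as the table and the Introduction suggest.

```python
from statistics import mean

# accD reading-frame lengths (codons), transcribed from Table 1 by family.
ACCD_CODONS = {
    "Cycadaceae": [359, 359],
    "Ginkgoaceae": [323],
    "Pinaceae": [320, 322, 323, 322, 324, 326, 320, 319, 321,
                 320, 320, 321, 320, 320, 320, 321, 321],
    "Podocarpaceae": [693, 693, 683, 683, 683, 683, 683, 683],
    "Araucariaceae": [849, 819],
    "Taxaceae": [753, 736, 759, 759, 767, 735, 773, 657, 871, 879, 916, 910],
    "Cephalotaxaceae": [1070, 1038, 1038, 1056, 936],
    "Cupressaceae": [800, 795, 700, 697, 800, 740, 902, 896, 884, 899],
}

# Assumption: these families are treated as cupressophytes in this sketch.
CUPRESSOPHYTE_FAMILIES = {
    "Podocarpaceae", "Araucariaceae", "Taxaceae",
    "Cephalotaxaceae", "Cupressaceae",
}


def summarize(lengths_by_family):
    """Return {family: (n, shortest, mean rounded to 0.1, longest)}."""
    return {
        family: (len(v), min(v), round(mean(v), 1), max(v))
        for family, v in lengths_by_family.items()
    }


def split_groups(lengths_by_family, cupressophytes):
    """Pool codon counts into (cupressophyte, other gymnosperm) lists."""
    inside, outside = [], []
    for family, v in lengths_by_family.items():
        (inside if family in cupressophytes else outside).extend(v)
    return inside, outside


if __name__ == "__main__":
    for family, (n, lo, avg, hi) in summarize(ACCD_CODONS).items():
        print(f"{family:16s} n={n:2d} min={lo} mean={avg} max={hi}")
    cup, other = split_groups(ACCD_CODONS, CUPRESSOPHYTE_FAMILIES)
    print(f"shortest cupressophyte accD: {min(cup)} codons")
    print(f"longest non-cupressophyte accD: {max(other)} codons")
```

Under these assumptions the two groups do not overlap: every cupressophyte accD in the table is longer than every accD from Cycadaceae, Ginkgoaceae, or Pinaceae.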