
Complete chloroplast genome sequencing of five Salix species and its application in the phylogeny and taxonomy of the genus.

Jie Zhou, Zhongyi Jiao, Jiahui Guo, Bao Song Wang, Jiwei Zheng.

Abstract

In this study, the whole chloroplast genomes of five Salix species (S. argyracea, S. dasyclados, S. eriocephala, S. integra 'Hakuro Nishiki', and S. suchowensis) were sequenced. These chloroplast genomes were 155,605, 155,763, 155,552, 155,538, and 155,550 bp in length, respectively, harboring 131 genes (77 unigenes), comprising 37 tRNA genes, 8 rRNA genes, and 86 mRNA genes. The genes ycf1, psaI, ycf2-2, rpoC2, rpl22, atpF, and ndhF were under positive selection among the 21 Salix species. The genes psaI, ycf2-2, atpF, and ycf1-2 were under positive selection between the tree willows and shrub willows, and rpoC2, rpl22, and ycf1-2 were positively selected among the shrub genomes. The gene rps7 was the most variable among the genomes. Phylogenetic analysis of 21 Salix species and Chosenia arbutifolia indicates that the cp genome data only partially support the relationships described by the traditional taxonomic concepts in the Flora of China. These chloroplast genomes help elucidate Salix taxonomy and provide evidence for evolutionary research.
© 2021 The Author(s). Published by Informa UK Limited, trading as Taylor & Francis Group.


Keywords:  Salix; Salix argyracea; Salix dasyclados; Salix eriocephala; Salix integra ‘Hakuro Nishiki’; Salix suchowensis; chloroplast genome; phylogeny; taxonomy

Year:  2021        PMID: 34345693      PMCID: PMC8284152          DOI: 10.1080/23802359.2021.1950055

Source DB:  PubMed          Journal:  Mitochondrial DNA B Resour        ISSN: 2380-2359            Impact factor:   0.658


Introduction

Chloroplast DNA (cpDNA) is maternally inherited, thus providing essential information for molecular markers, breeding of new varieties, and plant phylogeny (Cui et al. 2019; Njuguna et al. 2019). The willow genus (Salix spp.) is composed of 350–520 species that are distributed worldwide. In the ‘Flora of China’, the species distributed in China are classified into 37 groups (Wang and Shi 2019). The five species sequenced here (S. argyracea, S. dasyclados, S. eriocephala, S. integra ‘Hakuro Nishiki’, and S. suchowensis) are widely planted in Jiangsu Province and produce a large amount of biomass. Salix eriocephala was introduced from the United States for its high biomass yield and as a source of bioenergy. All these species absorb the heavy metal cadmium (Cd) in their roots and are the most promising candidates for phytoremediation among the willow species. In addition, the leaves and flowers have great ornamental value. Salix integra ‘Hakuro Nishiki’ is available from nurseries in shrub and tree form with vibrant white and pink leaves. Salix argyracea, S. suchowensis, and S. dasyclados are widely used in crafts for wickerwork and decorations. Thus, sequencing of the cpDNA and molecular marker mining will be effective methods to segregate willow germplasms and reveal phylogenetic relationships.

Materials and methods

Plant materials

The five Salix species were collected and deposited in the willow collection at Jiangsu Academy of Forestry (31.861947°N, 118.777145°E). The voucher specimens of S. argyracea, S. dasyclados, S. eriocephala, S. integra ‘Hakuro Nishiki’, and S. suchowensis were deposited at the herbarium of Jiangsu Academy of Forestry under the voucher numbers P102, P126, 87, P646, and P63, respectively. The person in charge of the sample collection can be contacted at zjwin718@126.com.

cpDNA sequencing and de novo assembly

Fresh leaves were collected for DNA isolation and library construction, and the DNA samples were stored at the Key Laboratory of Jiangsu Academy of Forestry, Nanjing, China. Genomic sequencing was performed on the Illumina NovaSeq PE150 platform (San Diego, CA, USA). The raw reads were filtered using fastp (version 0.20.0, https://github.com/OpenGene/fastp) to obtain clean data. De novo assembly was then performed using SPAdes v3.10.1 (http://cab.spbu.ru/software/spades/) to obtain the complete pseudo genome.
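The read filtering and assembly steps can be sketched as two command lines. The Python snippet below only builds the commands and does not run them; the file names, output directory, and thread count are hypothetical, and the paper does not report the options that were actually used, so only the basic paired-end input and output flags of fastp and SPAdes are shown.

```python
import shlex


def build_fastp_cmd(raw_r1, raw_r2, clean_r1, clean_r2):
    """Return a fastp command that filters one pair of FASTQ files."""
    return ["fastp", "-i", raw_r1, "-I", raw_r2, "-o", clean_r1, "-O", clean_r2]


def build_spades_cmd(clean_r1, clean_r2, outdir, threads=8):
    """Return a SPAdes command for de novo assembly of the filtered read pair."""
    return ["spades.py", "-1", clean_r1, "-2", clean_r2,
            "-o", outdir, "-t", str(threads)]


# Hypothetical file names, used for illustration only.
fastp_cmd = build_fastp_cmd("raw_1.fq.gz", "raw_2.fq.gz",
                            "clean_1.fq.gz", "clean_2.fq.gz")
spades_cmd = build_spades_cmd("clean_1.fq.gz", "clean_2.fq.gz", "assembly_out")

# Print the commands so they can be inspected before being executed.
print(shlex.join(fastp_cmd))
print(shlex.join(spades_cmd))
```

Keeping the commands as argument lists makes it straightforward to pass them to `subprocess.run` once the paths have been checked.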

Chloroplast gene annotation, selective pressure analysis and phylogenetic analysis

The cpDNA coding sequences were annotated using GeSeq (https://chlorobox.mpimp-golm.mpg.de/geseq-app.html) and visually checked in Geneious v8.0.2 (Kearse et al. 2012). The rRNA and tRNA genes were predicted using HMMER v3.1b2 (http://hmmer.org/) and ARAGORN v1.2.38 (Laslett and Canback 2004). The sequences were aligned using MAFFT v7.427 (https://mafft.cbrc.jp/alignment/software/). The Ka/Ks value was calculated using KaKs_Calculator v2.0 (https://sourceforge.net/projects/kakscalculator2/). VCFtools was used to calculate the nucleotide diversity (Pi) of every gene. The phylogenetic tree was constructed in MrBayes v3.2.7 with the Markov chain Monte Carlo (MCMC) method and 1000 bootstrap replicates.
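To make the Pi statistic concrete, the sketch below computes nucleotide diversity as the mean proportion of differing sites over all pairs of aligned sequences. This is not the VCFtools procedure used in the study, which works from variant calls; it is a simplified stand-in that works directly on an alignment, and the toy sequences are invented for illustration.

```python
from itertools import combinations


def nucleotide_diversity(aligned_seqs):
    """Mean pairwise proportion of differing sites in an alignment.

    Sites where either sequence has a gap or an ambiguous base are skipped.
    """
    if len(aligned_seqs) < 2:
        raise ValueError("need at least two aligned sequences")
    length = len(aligned_seqs[0])
    if any(len(s) != length for s in aligned_seqs):
        raise ValueError("sequences must have equal aligned length")

    valid = set("ACGT")
    total = 0.0
    n_pairs = 0
    for a, b in combinations(aligned_seqs, 2):
        diffs = 0
        compared = 0
        for x, y in zip(a.upper(), b.upper()):
            if x in valid and y in valid:
                compared += 1
                if x != y:
                    diffs += 1
        if compared:
            total += diffs / compared
            n_pairs += 1
    return total / n_pairs if n_pairs else 0.0


# Toy alignment of one gene from three genomes (invented data).
toy_gene = ["ACGTACGT", "ACGTACGA", "ACGAACGT"]
print(round(nucleotide_diversity(toy_gene), 4))
```

Applied gene by gene, a function like this yields one Pi value per gene, which is the form in which the values are compared in the Results.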

Results

Characterization of chloroplast genomes in Salix

The complete chloroplast (cp) genomes of S. argyracea, S. dasyclados, S. eriocephala, S. integra ‘Hakuro Nishiki’, and S. suchowensis were 155,605, 155,763, 155,552, 155,538, and 155,550 bp in size, respectively. The GC content of the IR, LSC, and SSC regions was approximately 41%, 30%, and 34%, respectively. The genomes encode 131 genes (77 unigenes), comprising 37 tRNA genes, 8 rRNA genes, and 86 mRNA genes. The genomes exhibited a typical quadripartite structure with the LSC region (84,414–84,588 bp), SSC region (16,214–16,275 bp), and IRs (27,384–27,479 bp). Fourteen genes (ndhA, ndhB, petB, petD, atpF, rpl16, rpl2, rpoC1, trnA-UGC, trnG-GCC, trnI-GAU, trnK-UUU, trnL-UAA, and trnV-UAC) had one intron, and three genes (rps12, clpP, and ycf3) had two introns (Table 1).
Table 1.

Annotated genes of the chloroplast genome of the five Salix species.

| Category | Gene group | Gene name |
| --- | --- | --- |
| Photosynthesis | Subunits of photosystem I | psaA, psaB, psaC, psaI, psaJ |
| | Subunits of photosystem II | psbA, psbB, psbC, psbD, psbE, psbF, psbH, psbI, psbJ, psbK, psbL, psbM, psbN, psbT, psbZ |
| | Subunits of NADH dehydrogenase | ndhA*, ndhB*(2), ndhC, ndhD, ndhE, ndhF, ndhG, ndhH, ndhI, ndhJ, ndhK |
| | Subunits of cytochrome b/f complex | petA, petB*, petD*, petG, petL, petN |
| | Subunits of ATP synthase | atpA, atpB, atpE, atpF*, atpH, atpI |
| | Large subunit of rubisco | rbcL |
| | Subunits of protochlorophyllide reductase | |
| Self-replication | Proteins of large ribosomal subunit | rpl14, rpl16*, rpl2*(2), rpl20, rpl22, rpl23(2), rpl33, rpl36 |
| | Proteins of small ribosomal subunit | rps11, rps12**(2), rps14, rps15, rps18, rps19(2), rps2, rps3, rps4, rps7(2), rps8 |
| | Subunits of RNA polymerase | rpoA, rpoB, rpoC1*, rpoC2 |
| | Ribosomal RNAs | rrn16(2), rrn23(2), rrn4.5(2), rrn5(2) |
| | Transfer RNAs | trnA-UGC*(2), trnC-GCA, trnD-GUC, trnE-UUC, trnF-GAA, trnG-GCC*, trnG-UCC, trnH-GUG, trnI-CAU(2), trnI-GAU*(2), trnK-UUU*, trnL-CAA(2), trnL-UAA*, trnL-UAG, trnM-CAU, trnN-GUU(2), trnP-UGG, trnQ-UUG, trnR-ACG(2), trnR-UCU, trnS-GCU, trnS-GGA, trnS-UGA, trnT-GGU, trnT-UGU, trnV-GAC(2), trnV-UAC*, trnW-CCA, trnY-GUA, trnfM-CAU |
| Other genes | Maturase | matK |
| | Protease | clpP** |
| | Envelope membrane protein | cemA |
| | Acetyl-CoA carboxylase | accD |
| | c-type cytochrome synthesis gene | ccsA |
| | Translation initiation factor | #infA |
| | Other | |
| Genes of unknown function | Conserved hypothetical chloroplast open reading frame (ORF) | ycf1(2), ycf15(2), ycf2(2), ycf3**, ycf4 |

*: Genes with one intron; **: Genes with two introns; #: Pseudogene; (2) after gene name: Number of copies of multi-copy genes.
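The quadripartite structure implies a simple length identity: the genome length equals LSC + SSC + 2 × IR, because the inverted repeat occurs twice. The sketch below checks the reported region ranges against the reported genome sizes and includes a small GC-content helper; the function names are illustrative and are not taken from any tool used in the study.

```python
def quadripartite_length(lsc, ssc, ir):
    """Total genome length for one LSC, one SSC, and two IR copies."""
    return lsc + ssc + 2 * ir


def gc_content(seq):
    """Fraction of G and C among the unambiguous bases of a sequence."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / acgt


# Region length ranges reported in the Results (bp).
shortest = quadripartite_length(84_414, 16_214, 27_384)
longest = quadripartite_length(84_588, 16_275, 27_479)

# Genome sizes reported for the five species (bp).
genome_sizes = [155_605, 155_763, 155_552, 155_538, 155_550]

print(shortest, longest)
print(all(shortest <= size <= longest for size in genome_sizes))
```

All five genome sizes fall inside the interval spanned by the region ranges, so the reported lengths are mutually consistent.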


Positive selection genes

The nonsynonymous substitution rate (Ka), synonymous substitution rate (Ks), and their ratio (Ka/Ks) are commonly used to assess the direction and strength of selection acting on protein-coding genes. The genes ycf1, psaI, ycf2-2, rpoC2, rpl22, atpF, and ndhF were under positive selection in the 21 Salix species (Ka/Ks > 1) (Table 2). The gene rps7, located in the IR region, had the highest Pi value (Figure 1), indicating that it is the most variable gene among the 21 Salix genomes and could serve as a potential molecular marker.
Table 2.

Positive selection genes among the cp genomes.

| Sequence | Ka/Ks | p-Value (Fisher) |
| --- | --- | --- |
| S. argyracea_psaI vs S. babylonica_psaI | 1.10156 | 0.317442 |
| S. argyracea_ycf2-2 vs S. babylonica_ycf2-2 | 1.11572 | 0 |
| S. argyracea_atpF vs S. paraplesia_atpF | 1.26289 | 0.621226 |
| S. argyracea_atpF vs S. tetrasperma_atpF | 1.26289 | 0.621226 |
| S. dasyclados_psaI vs S. babylonica_psaI | 1.10156 | 0.317442 |
| S. dasyclados_ycf2-2 vs S. babylonica_ycf2-2 | 1.17334 | 0 |
| S. dasyclados_atpF vs S. paraplesia_atpF | 1.26289 | 0.621226 |
| S. dasyclados_atpF vs S. tetrasperma_atpF | 1.26289 | 0.621226 |
| S. eriocephala_psaI vs S. babylonica_psaI | 1.10156 | 0.317442 |
| S. eriocephala_ycf2-2 vs S. babylonica_ycf2-2 | 1.11572 | 0 |
| S. eriocephala_atpF vs S. paraplesia_atpF | 1.26289 | 0.621226 |
| S. eriocephala_atpF vs S. tetrasperma_atpF | 1.26289 | 0.621226 |
| S. integra_psaI vs S. babylonica_psaI | 1.10156 | 0.317442 |
| S. integra ‘Hakuro Nishiki’_ycf2-2 vs S. babylonica_ycf2-2 | 1.11572 | 0 |
| S. integra ‘Hakuro Nishiki’_atpF vs S. paraplesia_atpF | 1.26289 | 0.621226 |
| S. integra ‘Hakuro Nishiki’_atpF vs S. tetrasperma_atpF | 1.26289 | 0.621226 |
| S. suchowensis_psaI vs S. babylonica_psaI | 1.10156 | 0.317442 |
| S. suchowensis_atpF vs S. paraplesia_atpF | 1.26289 | 0.621226 |
| S. suchowensis_ycf1-2 vs C. arbutifolia_ycf1-2 | 1.17113 | 0 |
| S. argyracea_ycf1-2 vs C. arbutifolia_ycf1-2 | 1.17113 | 0 |
| S. dasyclados_matK vs C. arbutifolia_matK | 1.28293 | 0.349676 |
| S. dasyclados_ycf1-2 vs C. arbutifolia_ycf1-2 | 1.17264 | 0 |
| S. eriocephala_ycf1-2 vs C. arbutifolia_ycf1-2 | 1.17113 | 0 |
| S. integra ‘Hakuro Nishiki’_ycf1-2 vs C. arbutifolia_ycf1-2 | 1.20107 | 0 |
| S. suchowensis_ycf1-2 vs C. arbutifolia_ycf1-2 | 1.17113 | 0 |
| S. argyracea_rpoC2 vs S. dasyclados_rpoC2 | 1.27437 | 0.34553 |
| S. argyracea_ycf1-2 vs S. integra_ycf1-2 | 1.35202 | 0.379283 |
| S. argyracea_rpl22 vs S. interior_rpl22 | 1.24719 | 0.353952 |
| S. dasyclados_rpoC2 vs S. brachista_rpoC2 | 1.27468 | 0.345621 |
| S. dasyclados_rpl20 vs S. brachista_rpl20 | 1.3916 | 0.787437 |
| S. dasyclados_rpoC2 vs S. eriocephala_rpoC2 | 1.27437 | 0.34553 |
| S. dasyclados_rpoC2 vs S. gracilistyla_rpoC2 | 1.27437 | 0.34553 |
| S. dasyclados_rpl22 vs S. interior_rpl22 | 1.24719 | 0.353952 |
| S. dasyclados_rpoC2 vs S. minjiangensis_rpoC2 | 1.27437 | 0.34553 |
| S. dasyclados_ndhF vs S. oreinoma_ndhF | 1.07235 | 0.288682 |
| S. dasyclados_rpoC2 vs S. rehderiana_rpoC2 | 1.27437 | 0.34553 |
| S. dasyclados_rpoC2 vs S. suchowensis_rpoC2 | 1.27437 | 0.34553 |
| S. dasyclados_rpoC2 vs S. taoensis_rpoC2 | 1.32528 | 0.365324 |
| S. eriocephala_rpl22 vs S. interior_rpl22 | 1.24719 | 0.353952 |
| S. integra ‘Hakuro Nishiki’_ycf1-2 vs S. brachista_ycf1-2 | 1.35049 | 0.378801 |
| S. integra ‘Hakuro Nishiki’_ycf1-2 vs S. hypoleuca_ycf1-2 | 1.12598 | 0.305479 |
| S. integra ‘Hakuro Nishiki’_ycf1-2 vs S. koriyanagi_ycf1-2 | 1.35202 | 0.379283 |
| S. integra ‘Hakuro Nishiki’_ndhF vs S. oreinoma_ndhF | 1.83448 | 0.846122 |
| S. integra ‘Hakuro Nishiki’_ycf1-2 vs S. rorida_ycf1-2 | 1.35185 | 0.379231 |
| S. integra ‘Hakuro Nishiki’_ycf1-2 vs S. suchowensis_ycf1-2 | 1.35202 | 0.379283 |
| S. suchowensis_rpl22 vs S. interior_rpl22 | 1.24719 | 0.353952 |
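The Ka/Ks values in Table 2 follow the usual reading: a ratio above 1 suggests positive selection, a ratio near 1 neutral evolution, and a ratio below 1 purifying selection. A minimal sketch of that classification, assuming each pairwise comparison is stored as a (label, ratio) pair, is given below. The helper names are hypothetical, only three rows are reproduced, and the sketch does not recompute Ka or Ks, which the study obtained from KaKs_Calculator.

```python
def classify_ka_ks(ratio, tol=1e-9):
    """Interpret a Ka/Ks ratio under the usual convention."""
    if ratio > 1 + tol:
        return "positive"
    if ratio < 1 - tol:
        return "purifying"
    return "neutral"


def gene_of(comparison):
    """Extract the gene name from a label such as 'S. a_psaI vs S. b_psaI'."""
    return comparison.rsplit("_", 1)[1].strip()


def positively_selected_genes(rows):
    """Return the sorted gene names with at least one comparison above 1."""
    return sorted({gene_of(label) for label, ratio in rows
                   if classify_ka_ks(ratio) == "positive"})


# Three comparisons taken from Table 2 as (label, Ka/Ks).
rows = [
    ("S. argyracea_psaI vs S. babylonica_psaI", 1.10156),
    ("S. argyracea_atpF vs S. paraplesia_atpF", 1.26289),
    ("S. dasyclados_ndhF vs S. oreinoma_ndhF", 1.07235),
]
print(positively_selected_genes(rows))
```

Grouping by gene rather than by comparison mirrors how the text summarizes the table, where each gene is listed once regardless of how many species pairs exceed the threshold.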
Figure 1.

Nucleotide variability (Pi) values of 22 chloroplast genomes of Salix and C. arbutifolia.


Phylogenetic analysis

With Eucalyptus spathulata as the outgroup, a phylogenetic tree of 21 Salix (5 newly sequenced and 16 published), 1 Chosenia arbutifolia, and 8 Populus complete cp genomes was constructed from sequences aligned with MAFFT (auto mode) (Figure 2). Salix formed one robust monophyletic clade. The 21 species within Salix were clustered into two subclades. Of the 5 species newly sequenced in this study, S. argyracea, S. suchowensis, and S. eriocephala formed a clade together with S. gracilistyla, whereas S. dasyclados clustered with S. integra ‘Hakuro Nishiki’. Based on the phylogenetic relationships inferred from the cp genomes, the genus Salix in China can be divided into two major groups.
Figure 2.

Phylogenetic analysis of 21 Salix species, C. arbutifolia, and 8 Populus species based on the complete chloroplast genomes. The maximum likelihood method was based on the auto-model. The bootstrap values are shown next to the branches.


Discussion

Five Salix species were sequenced, and the complete cp genomes of 16 previously published Salix species and that of C. arbutifolia were annotated. The cp genome size of the five Salix species was ∼155 kb, similar to that of the other 17 previously published species (154–156 kb). The GC content of the IR region was high, similar to the previously reported cp genomes of plants (Huang et al. 2017). The results revealed that the structure and synteny of the 21 Salix species and C. arbutifolia were highly conserved.

Positively selected genes are vital for pinpointing specific targets in adaptive evolution processes, such as environmental, geographical, and host response (Wang et al. 2017). In a photosynthetic organism, loss of activity of atpF could impair respiratory activity and affect morphology (Lapaille et al. 2010). The positive selection on psaI, which encodes photosystem I reaction center subunit VIII, suggests that selection was associated with changes in photosynthesis during evolution. The ndhF gene has been reported to be under positive selection because of its involvement in adaptation to hot and dry climates (Carbonell-Caballero et al. 2015; Caspermeyer 2015). These positively selected genes are central to evolutionary patterns and might have driven the successful adaptation of the genus Salix.

The taxonomy and systematic phylogeny of the genus Salix have been obscure. Chosenia arbutifolia was within the clade comprising Salix species (Figure 2), which is consistent with previous reports (Chen 2008). In the ‘Flora of China’ (Wu and Raven 1999), S. dasyclados and S. integra ‘Hakuro Nishiki’ are assigned to the same section as S. suchowensis and S. koriyanagi. However, the cp genome data only partially support the relationships described by these traditional taxonomic concepts. The rps7 gene encodes ribosomal protein S7 (uS7), which is crucial for the assembly and stability of the ribosome. rps7 was the most variable gene among the 21 genomes, indicating that it could serve as a molecular marker for species identification. Therefore, the cp genomes reported here provide valuable molecular resources for further taxonomic, phylogenetic, and evolutionary studies in Salix.

References

1.  ARAGORN, a program to detect tRNA genes and tmRNA genes in nucleotide sequences.

Authors:  Dean Laslett; Bjorn Canback
Journal:  Nucleic Acids Res       Date:  2004-01-02       Impact factor: 16.971

2.  Most Comprehensive Study to Date Reveals Evolutionary History of Citrus.

Authors:  Joseph Caspermeyer
Journal:  Mol Biol Evol       Date:  2015-05-22       Impact factor: 16.240

3.  Loss of mitochondrial ATP synthase subunit beta (Atp2) alters mitochondrial and chloroplastic function and morphology in Chlamydomonas.

Authors:  Marie Lapaille; Marc Thiry; Emilie Perez; Diego González-Halphen; Claire Remacle; Pierre Cardol
Journal:  Biochim Biophys Acta       Date:  2010-04-21

4.  Geneious Basic: an integrated and extendable desktop software platform for the organization and analysis of sequence data.

Authors:  Matthew Kearse; Richard Moir; Amy Wilson; Steven Stones-Havas; Matthew Cheung; Shane Sturrock; Simon Buxton; Alex Cooper; Sidney Markowitz; Chris Duran; Tobias Thierer; Bruce Ashton; Peter Meintjes; Alexei Drummond
Journal:  Bioinformatics       Date:  2012-04-27       Impact factor: 6.937

5.  A Phylogenetic Analysis of 34 Chloroplast Genomes Elucidates the Relationships between Wild and Domestic Species within the Genus Citrus.

Authors:  Jose Carbonell-Caballero; Roberto Alonso; Victoria Ibañez; Javier Terol; Manuel Talon; Joaquin Dopazo
Journal:  Mol Biol Evol       Date:  2015-04-14       Impact factor: 16.240

6.  Comparative Analysis of the Complete Chloroplast Genome of Four Known Ziziphus Species.

Authors:  Jian Huang; Ruihong Chen; Xingang Li
Journal:  Genes (Basel)       Date:  2017-11-24       Impact factor: 4.096

7.  Characterization of the Two-Speed Subgenomes of Fusarium graminearum Reveals the Fast-Speed Subgenome Specialized for Adaption and Infection.

Authors:  Qinhu Wang; Cong Jiang; Chenfang Wang; Changjun Chen; Jin-Rong Xu; Huiquan Liu
Journal:  Front Plant Sci       Date:  2017-02-14       Impact factor: 5.753
