Qiwen Zhong, Shipeng Yang, Xuemei Sun, Lihui Wang, Yi Li.
Abstract
Jerusalem artichoke (Helianthus tuberosus L.) is widely cultivated in Northwest China and is a rapidly developing economic crop. Because of its high inulin content and high resistance, it is widely used in functional food, inulin processing, feed, and ecological management. In this study, Illumina sequencing was used to assemble and annotate the complete chloroplast genome of Jerusalem artichoke. The genome is 151,431 bp long and comprises four conserved regions: a pair of inverted repeat regions (IRa, 24,568 bp; IRb, 24,603 bp), a large single-copy region (LSC, 83,981 bp), and a small single-copy region (SSC, 18,279 bp). The genome contains 115 genes, 19 of which are present in the reverse direction in the IR region. A total of 36 simple sequence repeats (SSRs) were identified in the coding and non-coding regions, most of them biased toward A/T bases; 32 of these SSRs lie in non-coding regions. A comparative analysis of the chloroplast genome of Jerusalem artichoke and those of other species of the Asteraceae (composite family) showed that chloroplast genome sequences within the family are highly conserved. Differences were observed at 24 gene loci in the coding region, with the ycf2 gene showing the most pronounced divergence. A phylogenetic analysis showed that H. petiolaris subsp. fallax, also a member of the genus Helianthus, is the closest relative of Jerusalem artichoke. Selection site detection was performed on the ycf2 gene of eight species of the composite family to explore adaptive evolution of ycf2 in Jerusalem artichoke. The results show significant and extremely significant positive selection sites at the 1239N and 1518R loci, respectively, indicating that the ycf2 gene has been subject to adaptive evolution.
Insights from our assessment of the complete chloroplast genome sequences of Jerusalem artichoke will aid in the in-depth study of the evolutionary relationship of the composite family and provide significant sequencing information for the genetic improvement of Jerusalem artichoke.
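The region lengths quoted in the abstract can be checked against the reported total with a few lines of arithmetic. The sketch below uses only the numbers given above; the variable names are illustrative and not part of the original study.

```python
# Region lengths (bp) as reported in the abstract.
regions = {
    "LSC": 83_981,
    "SSC": 18_279,
    "IRa": 24_568,
    "IRb": 24_603,
}
reported_total = 151_431

# The four regions should add up to the full genome length.
computed_total = sum(regions.values())
print("regions sum to reported total:", computed_total == reported_total)

# Share of the genome occupied by each region, in percent.
shares = {
    name: round(100 * length / reported_total, 2)
    for name, length in regions.items()
}
for name, share in shares.items():
    print(f"{name}: {share}%")
```

The same check is a quick sanity test for any newly assembled chloroplast genome before annotation.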
Keywords: Asteraceae; Chloroplast genome; Helianthus tuberosus L.; Positively selected sites; ycf2 gene
Year: 2019 PMID: 31531272 PMCID: PMC6718157 DOI: 10.7717/peerj.7596
Source DB: PubMed Journal: PeerJ ISSN: 2167-8359 Impact factor: 2.984
Figure 1. Gene map of the Helianthus tuberosus L. chloroplast genome.
Genes drawn outside of the circle are transcribed counter-clockwise, while genes shown on the inside of the circle are transcribed clockwise. Genes belonging to different functional groups are color-coded. The darker gray in the inner circle indicates GC content, while the lighter gray corresponds to AT content.
List of genes in the chloroplast genome of Helianthus tuberosus L.
| Groups of genes | Category | Names of genes |
|---|---|---|
| Protein synthesis and DNA replication | Ribosomal RNAs | |
| | Transfer RNAs | |
| | Ribosomal protein small subunit | |
| | Ribosomal protein large subunit | |
| | Subunits of RNA polymerase | |
| Photosynthesis | Photosystem I | |
| | Photosystem II | |
| | Cytochrome b/f complex | |
| | ATP synthase | |
| | NADH-dehydrogenase | |
| | Large subunit Rubisco | |
| Miscellaneous group | Translation initiation factor IF-1 | |
| | Acetyl-CoA carboxylase | |
| | Cytochrome c biogenesis | |
| | Maturase | |
| | ATP-dependent protease | |
| | Inner membrane protein | |
| Pseudogenes of unknown function | Conserved hypothetical chloroplast open reading frame | |
Characteristics of genes including introns and exons in the chloroplast genome of Helianthus tuberosus L.
| Gene | Region | Exon I (bp) | Intron I (bp) | Exon II (bp) | Intron II (bp) | Exon III (bp) |
|---|---|---|---|---|---|---|
| | LSC | 51 | 2,528 | 36 | | |
| | LSC | 29 | 864 | 226 | | |
| | LSC | 431 | 733 | 1,727 | | |
| | LSC | 144 | 714 | 391 | | |
| | LSC | 152 | 746 | 229 | 700 | 123 |
| | LSC | 36 | 436 | 49 | | |
| | LSC | 36 | 574 | 37 | | |
| | LSC | 68 | 792 | 290 | 624 | 227 |
| | LSC | 5 | 775 | 641 | | |
| | LSC | 8 | 712 | 473 | | |
| | LSC | 392 | 663 | 434 | | |
| | IR | 755 | 671 | 776 | | |
| | IR | 41 | 776 | 34 | | |
| | IR | 37 | 822 | 34 | | |
| | SSC | 552 | 1,095 | 538 | | |
| | LSC-IR | 113 | 230 | 29 | | |
Figure 2. Distribution frequency in Helianthus tuberosus L. cp genome.
(A) Frequency of repeats by length and number of repeats. (B) Percentage distribution by gene region.
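As an illustration of how perfect SSRs of the kind counted here can be located, the following is a minimal regex-based sketch. It is not the authors' pipeline: the record does not name the tool or the thresholds used, so the `MIN_REPEATS` values and the `find_ssrs` helper are assumptions chosen for demonstration only.

```python
import re

# Minimum repeat count per motif length (hypothetical thresholds, not from the study).
MIN_REPEATS = {1: 10, 2: 5, 3: 4, 4: 3, 5: 3, 6: 3}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs (0-based start)."""
    seq = seq.upper()
    hits = []
    for motif_len, min_count in sorted(min_repeats.items()):
        # A motif of motif_len bases followed by at least (min_count - 1) exact copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_count - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are themselves repeats of a shorter unit, e.g. "AA" or "ATAT".
            if any(
                motif == motif[:k] * (motif_len // k)
                for k in range(1, motif_len)
                if motif_len % k == 0
            ):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // motif_len))
    return sorted(hits)


# Toy sequence for demonstration; not real chloroplast data.
demo = "GC" + "A" * 12 + "GCGG" + "AT" * 6 + "CCG"
for start, motif, count in find_ssrs(demo):
    at_only = set(motif) <= {"A", "T"}
    print(start, motif, count, "A/T-only" if at_only else "contains G/C")
```

Flagging motifs made up only of A and T, as in the loop above, is one simple way to quantify the A/T bias reported for these repeats.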
Comparison of chloroplast and plastid differential genes in Helianthus tuberosus L.
| Gene | NCBI accession | Difference site: T (%) | Difference site: C (%) | Difference site: A (%) | Difference site: G (%) | Difference position and base |
|---|---|---|---|---|---|---|
| | | 36.8 | 15.6 | 31.6 | 16.0 | |
| | | 36.9 | 15.5 | 31.6 | 16.0 | 579T |
| | | 36.8 | 15.6 | 31.6 | 16.0 | |
| | | 36.9 | 15.5 | 31.6 | 16.0 | 348G |
| | | 28.9 | 18.0 | 28.6 | 24.5 | 361–363 null |
| | | 29.1 | 18.1 | 28.3 | 24.5 | 362G/363C/70,361T |
| | | 34.7 | 19.6 | 27.6 | 18.0 | |
| | | 34.8 | 19.5 | 27.9 | 17.8 | 778–819 null |
| | | 31.0 | 15.2 | 30.9 | 22.9 | |
| | | 30.9 | 15.2 | 30.9 | 23.0 | 822G |
| | | 34.1 | 16.2 | 31.5 | 18.2 | |
| | | 33.9 | 16.4 | 31.5 | 18.2 | 433C |
| | | 28.9 | 19.3 | 30.8 | 21.0 | |
| | | 28.9 | 19.3 | 30.7 | 21.1 | 705G |
| | | 32.9 | 19.0 | 27.5 | 20.5 | |
| | | 32.9 | 19.0 | 27.7 | 20.3 | 9A |
| | | 22.9 | 18.2 | 33.5 | 25.4 | |
| | | 22.9 | 18.3 | 33.5 | 25.3 | 392–394 null |
| | | 30.0 | 16.9 | 32.4 | 20.7 | 2–22 null |
| | | 30.0 | 16.9 | 32.4 | 20.7 | 4, 5, 8, 10, 11, 22A/3, 6, 9, 12, G/7, 17, 20, C/2, 13, 14, 15, 16, 18, 19, 21T. |
| | | 29.4 | 17.9 | 32.5 | 20.2 | |
| | | 29.4 | 17.9 | 32.6 | 20.2 | |
| | | 23.7 | 21.3 | 33.1 | 21.9 | 347 null |
| | | 24.6 | 21.6 | 30.8 | 23.0 | 346, 356A/347, 349, 351, 354G, 352T/358-376 null |
| | | 28.5 | 17.2 | 33.0 | 21.3 | |
| | | 28.6 | 16.5 | 33.7 | 21.2 | 43–54 null |
| | | 30.6 | 14.2 | 39.6 | 15.6 | |
| | | 30.6 | 14.2 | 39.7 | 15.5 | 1A. |
| | | 31.1 | 18.5 | 31.2 | 19.2 | |
| | | 31.1 | 18.5 | 31.2 | 19.1 | 4,562–4,597 null |
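The T/C/A/G values in each row above sum to roughly 100, so they read as base-composition percentages for each gene copy. A minimal sketch of how such percentages, and position-and-base differences such as "579T", could be derived from two aligned gene sequences is given below. The function names and the use of "-" as an alignment gap rendered as "null" are assumptions for illustration, not the authors' method.

```python
from collections import Counter


def base_composition(seq):
    """Percentage of T, C, A and G in a DNA sequence, rounded to one decimal place."""
    counts = Counter(base for base in seq.upper() if base in "TCAG")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return {base: round(100 * counts[base] / total, 1) for base in "TCAG"}


def differing_sites(ref, alt):
    """List 1-based differences of alt relative to ref for two aligned sequences.

    A substituted base is written as '<position><base>'; a gap ('-') in alt is
    written as '<position> null' (assumed reading of the table's notation).
    """
    if len(ref) != len(alt):
        raise ValueError("sequences must be aligned to the same length")
    sites = []
    for pos, (r, a) in enumerate(zip(ref.upper(), alt.upper()), start=1):
        if r != a:
            sites.append(f"{pos} null" if a == "-" else f"{pos}{a}")
    return sites


# Toy aligned sequences for demonstration; not real gene data.
reference = "ATGCATTA"
variant = "ATGGAT-A"
print(base_composition(reference))
print(differing_sites(reference, variant))
```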
Figure 3. Comparison of the Helianthus tuberosus L. chloroplast and plastid genomes using BRIG.
Comparison of cp genomes among eight composite species.
| Species | Total size (bp) | LSC (bp) | IR (bp) | SSC (bp) | G+C (%) | Protein-coding genes | rRNAs | tRNAs | GenBank accession |
|---|---|---|---|---|---|---|---|---|---|
| | 153,675 | 83,606 | 25,407 | 19,156 | 37.4 | 89 | 4 | 30 | |
| | 150,689 | 84,815 | 23,755 | 18,358 | 37.5 | 80 | 4 | 28 | |
| | 150,689 | 82,855 | 24,777 | 18,277 | 37.3 | 79 | 4 | 29 | |
| | 152,772 | 84,105 | 25,034 | 18,599 | 37.5 | 78 | 4 | 20 | |
| | 151,431 | 83,981 | 24,568 | 18,279 | 37.6 | 84 | 4 | 27 | |
| | 151,862 | 83,845 | 24,588 | 18,149 | 37.6 | 80 | 4 | 27 | |
| | 151,678 | 83,799 | 24,502 | 18,121 | 37.6 | 82 | 4 | 27 | |
| | 151,104 | 83,530 | 24,633 | 18,308 | 37.6 | 79 | 4 | 27 | |
Figure 4. Percent identity plot for the comparison of eight composite chloroplast genomes.
The whole chloroplast genome was divided into four parts, and the gene names are displayed in sequence on the top line of each part (arrows indicate the transcriptional direction). The sequence similarity between the aligned regions of Jerusalem artichoke and the seven other species is shown as the fill color in each black stripe. The x-axis indicates the position in the chloroplast genome, and the y-axis indicates the average percent sequence identity (50–100%) between each species and Jerusalem artichoke at that position. Coding sequences (exons), rRNA, tRNA, and conserved non-coding sequences (CNS) in the genomic region are shown in different colors.
Figure 5. Comparison of the similarity of chloroplast genomes between Jerusalem artichoke and seven other species of crops in the composite family.
Figure 6. Comparison of the ycf2 gene sequence in chloroplast genomes between Jerusalem artichoke and seven other species of crops in the composite family.
The white vacancy corresponds to the missing amino acid sequence.
Figure 7. Molecular phylogenetic tree of 16 composite species based on a neighbor joining analysis.
Numbers above and below nodes are bootstrap support values 50%.
Likelihood ratio statistics of positive selection models against their null models (2Δ ln L).
| Comparison between models | 2Δ lnL | d.f. | p-value |
|---|---|---|---|
| M0 vs. M3 | 15.2245 | 4 | 0.0043 < 0.01 |
| M1a vs. M2a | 13.5353 | 2 | 0.0012 < 0.01 |
| M7 vs. M8 | 15.0177 | 2 | 0.0005 < 0.01 |
| M8a vs. M8 | 13.5241 | 1 | 0.0002 < 0.01 |
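A likelihood ratio test compares 2Δ lnL against a chi-square distribution whose degrees of freedom equal the difference in parameter count between the two nested models. The sketch below recomputes the upper-tail probabilities for the four comparisons in the table using only the Python standard library; `chi2_sf` is a helper written for this illustration (closed-form expressions for integer degrees of freedom), not a function from the software used in the study.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for a chi-square variable with integer df."""
    if x < 0 or df < 1:
        raise ValueError("x must be non-negative and df a positive integer")
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd df: start from the df = 1 case and add one correction term per extra 2 df.
    sf = math.erfc(math.sqrt(half))
    term = math.sqrt(2.0 * x / math.pi) * math.exp(-half)
    for j in range(1, (df - 1) // 2 + 1):
        if j > 1:
            term *= x / (2 * j - 1)
        sf += term
    return sf


# (comparison, 2*delta lnL, degrees of freedom) taken from the table above.
tests = [
    ("M0 vs. M3", 15.2245, 4),
    ("M1a vs. M2a", 13.5353, 2),
    ("M7 vs. M8", 15.0177, 2),
    ("M8a vs. M8", 13.5241, 1),
]
for name, statistic, df in tests:
    print(f"{name}: p = {chi2_sf(statistic, df):.4f}")
```

The recomputed values agree with the tabulated p-values to within rounding, and all four fall below the 0.01 threshold.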
Positively selected amino acid sites and parameter estimates for ycf2 in eight species of the Compositae family.
| Models | Np | lnL | Estimates of parameters | Positive sites (NEB) | Positive sites (BEB) |
|---|---|---|---|---|---|
| M0 (one-ratio) | 15 | −9,464.31 | ω = 0.93903 | Not allowed | Not allowed |
| M3 (Discrete) | 19 | −9,456.70 | | 1125W 0.602 | Not allowed |
| M1a (Near neutral) | 16 | −9,463.47 | | Not allowed | Not allowed |
| M2a (Selection) | 18 | −9,456.70 | | 1125W 0.602 | 331I 0.726 |
| M7 (beta) | 16 | −9,464.36 | | Not allowed | Not allowed |
| M8 (beta & ω) | 18 | −9,460.27 | | 1125W 0.600 | 331I 0.882 |
| M8a (beta & ω = 1) | 17 | −9,463.50 | | Not allowed | Not allowed |
Note:
Positively selected sites (*p > 95%; **p > 99%).