Chao Luo1,2,3,4, Wulue Huang1,2,3, Huseyin Yer4,5, Troy Kamuda4, Xinyi Li1,2,3, Yang Li1,2,3, Yuhong Rong1,2,3, Bo Yan1,2,3, Yonghui Wen1,2,3, Qiong Wang1,2,3, Meijuan Huang1,2,3, Haiquan Huang1,2,3.
Abstract
Impatiens L., the largest genus in the family Balsaminaceae with approximately 1,000 species, is taxonomically controversial. Because morphological features conflict and genomic resources are insufficient, studies of its systematic evolution and taxonomic identification remain very limited. We therefore sequenced the complete chloroplast genomes of three ornamental species (Impatiens balsamina, I. hawkeri, and I. walleriana) and compared them with previously published data from wild species. A detailed comparison showed that the genomes are highly similar in basic structure, size, GC content, gene number, gene order, and functional arrangement. Most of the divergent genes detected had also been reported in previous work. Mutational regions containing highly variable nucleotide hotspots were identified and may serve as potential markers for species identification and taxonomy. Furthermore, a phylogenetic analysis of the Balsaminaceae species based on whole chloroplast genome data placed them all in a single clade. The three phenotypically different ornamental species clustered together, suggesting that they are very likely closely related. In summary, we obtained and characterized the plastid genome structures, identified the divergence hotspots, and determined the phylogenetic and taxonomic positions of the three cultivated species within Impatiens. The results suggest that the chloroplast genome can be used to resolve phylogenetic problems within Impatiens or between it and other genera, and they provide genomic resources for studying the systematics and evolution of the Balsaminaceae.
Keywords: Balsaminaceae; Impatiens; chloroplast genome; comparative analysis; phylogenetic relationship
Year: 2022 PMID: 35432470 PMCID: PMC9006450 DOI: 10.3389/fgene.2022.816123
Source DB: PubMed Journal: Front Genet ISSN: 1664-8021 Impact factor: 4.599
Basic information on the Impatiens species sequenced in this study.
| Species | Altitude (m) | Latitude and Longitude | Location | Voucher Specimen |
|---|---|---|---|---|
| | 1,953.7 | 102°76′43″E, 25°06′15″N | Arboretum of Southwest Forestry University, Yunnan Province, China | SWFU-IBXJNY20180811 |
| | 1,953.7 | 102°76′44″E, 25°06′23″N | Arboretum of Southwest Forestry University, Yunnan Province, China | SWFU-IBSD |
| | 1,094.4 | 104°71′32″E, 23°12′28″N | Malipo Laoshan Nature Reserve, Wenshan City, Yunnan Province, China | SWFU-IBLH |
Characteristics of complete chloroplast genomes for Impatiens species.
| Species | | | | | | |
|---|---|---|---|---|---|---|
| Length (bp) | 152,271 | 151,691 | 151,953 | 152,236 | 152,260 | 154,189 |
| LSC (bp) | 83,497 | 83,030 | 82,906 | 83,115 | 83,261 | 84,865 |
| IR (bp) | 25,249 | 25,584 | 25,710 | 25,755 | 25,631 | 25,622 |
| SSC (bp) | 18,276 | 17,493 | 17,627 | 17,611 | 17,737 | 18,080 |
| Total genes | 114 | 114 | 114 | 114 | 108 | 112 |
| CDS | 81 | 81 | 81 | 81 | 80 | 81 |
| tRNA | 30 | 30 | 30 | 30 | 29 | 30 |
| rRNA | 4 | 4 | 4 | 4 | 4 | 4 |
| Total GC content (%) | 36.7 | 36.8 | 36.8 | 36.9 | 36.8 | 36.9 |
| GC content in LSC (%) | 34.3 | 34.4 | 34.4 | 34.5 | 34.5 | 34.7 |
| GC content in IR (%) | 43.2 | 43.2 | 43.2 | 43.1 | 43.1 | 43.1 |
| GC content in SSC (%) | 29.3 | 29.6 | 29.4 | 29.3 | 29.4 | 29.9 |
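The region sizes in this table can be cross-checked against each other. A chloroplast genome with the usual quadripartite layout consists of one LSC, one SSC, and two IR copies, so the total length should equal LSC + SSC + 2 × IR. The following is a minimal sketch of that check, assuming the two IR copies are of equal length; the function name `ir_length` is illustrative, and the columns are indexed by position because the table does not label them here.

```python
# Region sizes (bp) copied from the table above, one tuple per data column,
# left to right. Each tuple is (total length, LSC, SSC).
columns = [
    (152271, 83497, 18276),
    (151691, 83030, 17493),
    (151953, 82906, 17627),
    (152236, 83115, 17611),
    (152260, 83261, 17737),
    (154189, 84865, 18080),
]


def ir_length(total, lsc, ssc):
    """Length of one inverted repeat implied by total = LSC + SSC + 2 * IR.

    Assumes the two IR copies are identical in length (hypothetical helper,
    not part of the original study).
    """
    remainder = total - lsc - ssc
    if remainder < 0 or remainder % 2:
        raise ValueError("region sizes are inconsistent with two equal IR copies")
    return remainder // 2


implied_ir = [ir_length(*col) for col in columns]
print(implied_ir)
```

Comparing the printed values with the IR row is a quick way to spot a mistyped or truncated entry in any column.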
FIGURE 1. Chloroplast genome structure of three Impatiens species (I. balsamina, I. hawkeri, and I. walleriana).
The list of genes in the chloroplast genomes of Impatiens species.
| Function of genes | Gene groups | Gene names |
|---|---|---|
| Photosynthesis-related genes | Rubisco | rbcL |
| | Photosystem I | psaA psaB psaC psaI psaJ |
| | Assembly and stability of Photosystem I | ycf3•• ycf4 |
| | Photosystem II | psbA psbB psbC psbD psbE psbF psbH psbI psbJ psbK psbL psbM psbN psbT psbZ |
| | ATP synthase | atpA atpB atpE atpF• atpH atpI |
| | Cytochrome b/f complex | petA petB• petD petG petL petN |
| | Cytochrome c synthesis | ccsA |
| | NADPH dehydrogenase | ndhA• ndhB•(2) ndhC ndhD ndhE ndhF ndhG ndhH ndhI ndhJ ndhK |
| Transcription and translation-related genes | Transcription | rpoA rpoB rpoC1• rpoC2 |
| | Ribosomal proteins | rpl2•(2) rpl14 rpl16 rpl20 rpl22 rpl23 (2) rpl33 rpl36 rps2 rps3 rps4 rps7 (2) rps8 rps11 rps12•(2) rps14 rps15 rps16• rps18 rps19 (2) |
| RNA genes | Ribosomal RNA | rrn4.5 rrn5 rrn16 rrn23 |
| | Transfer RNA | trnA-UGC•(2) trnC-GCA trnD-GUC trnE-UUC trnF-GAA trnfM-CAU trnG-GCC• trnG-UCC trnH-GUG trnI-CAU*(2) trnI-GAU•(2) trnK-UUU• trnL-CAA (2) trnL-UAG trnL-UAA• trnM-CAU trnN-GUU (2) trnP-UGG trnQ-UUG trnR-ACG (2) trnR-UCU trnS-GCU trnS-GGA trnS-UGA trnT-GGU trnT-UGU trnV-GAC (2) trnV-UAC• trnW-CCA trnY-GUA |
| Other genes | RNA processing | matK |
| | Carbon metabolism | cemA |
| | Fatty acid synthesis | accD |
| | Proteolysis | clpP•• |
| Genes of unknown function | Conserved reading frames | |
(2) indicates that the number of the repeat unit is 2; • gene contains one intron; •• gene contains two introns.
FIGURE 2. Analysis of repeated sequences in the I. balsamina, I. hawkeri, I. walleriana, I. piufanensis, I. glandulifera, and H. triflora chloroplast genomes. (A) Number of repeats of the four types by length in the six species; (B) total number of repeats of the four types in the six species.
FIGURE 3. Analysis of simple sequence repeats (SSRs) in the chloroplast genomes of I. balsamina, I. hawkeri, I. walleriana, I. piufanensis, I. glandulifera, and H. triflora. (A) The number of different SSR types detected in each species; (B) type and frequency of each identified SSR.
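To make the SSR counts in Figure 3 more concrete, the sketch below shows one simple way to locate perfect SSRs in a sequence with regular expressions. It is not the tool used in the study. The minimum repeat counts in `MIN_REPEATS` and the function name `find_ssrs` are assumptions chosen for illustration, and the thresholds should be replaced with whatever the analysis at hand specifies.

```python
import re

# Assumed minimum number of repeat units per motif length (mono- to
# hexanucleotide). These thresholds are illustrative, not from the source.
MIN_REPEATS = {1: 10, 2: 5, 3: 4, 4: 3, 5: 3, 6: 3}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs.

    start is a 0-based position in seq. Motifs that are themselves repeats
    of a shorter unit (e.g. "AA" or "ATAT") are skipped so that each SSR is
    reported only under its shortest motif.
    """
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in sorted(min_repeats.items()):
        # A motif of unit_len bases followed by at least (min_rep - 1) copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            is_compound = any(
                motif == motif[:k] * (unit_len // k)
                for k in range(1, unit_len)
                if unit_len % k == 0
            )
            if is_compound:
                continue
            hits.append((match.start(), motif, len(match.group(0)) // unit_len))
    return sorted(hits)


# Toy sequence: a 12-base poly-A run and six AT repeats.
example = "GC" + "A" * 12 + "GCGC" + "AT" * 6 + "CCG"
ssrs = find_ssrs(example)
print(ssrs)
```

Counting the returned tuples by motif length gives the per-type totals of the kind plotted in panel (A), and counting by motif gives the frequencies of the kind shown in panel (B).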
FIGURE 4. Comparison of sequence arrangement in the chloroplast genomes of six Balsaminaceae species.
FIGURE 5. Comparison of the borders of four different regions (LSC, SSC, and IRs) among I. balsamina, I. hawkeri, I. walleriana, I. piufanensis, I. glandulifera, and H. triflora chloroplast genomes.
FIGURE 6. Alignment of the six chloroplast genomes. Sequence identity plot generated with mVISTA, comparing the other five chloroplast genomes against I. balsamina as the reference.
FIGURE 7. Sliding window analysis based on the chloroplast genomes of three Balsaminaceae species. Window length: 2000 bp; step size: 200 bp. X-axis: the position of the midpoint of a window. Y-axis: nucleotide diversity of each window.
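The sliding window analysis in Figure 7 can be illustrated with a short sketch. It computes nucleotide diversity as the mean proportion of differing sites over all sequence pairs, for each window of an alignment, using the window length (2000 bp) and step size (200 bp) given in the caption as defaults. The function names and the gap handling (sites are skipped for a pair when either sequence has a gap or ambiguous base) are assumptions for illustration and may differ from the software used in the study.

```python
from itertools import combinations


def pairwise_diff(a, b):
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap or a non-ACGT character are
    skipped (an assumed convention, not taken from the source).
    """
    compared = diffs = 0
    for x, y in zip(a, b):
        if x in "ACGT" and y in "ACGT":
            compared += 1
            diffs += x != y
    return diffs / compared if compared else 0.0


def sliding_pi(alignment, window=2000, step=200):
    """Return (midpoint, nucleotide diversity) for each window.

    alignment is a list of equal-length aligned sequences. The midpoint is
    a 0-based alignment coordinate.
    """
    if len(alignment) < 2:
        raise ValueError("need at least two aligned sequences")
    length = len(alignment[0])
    if any(len(s) != length for s in alignment):
        raise ValueError("all sequences must have the same aligned length")
    seqs = [s.upper() for s in alignment]
    pairs = list(combinations(range(len(seqs)), 2))
    results = []
    for start in range(0, length - window + 1, step):
        end = start + window
        total = sum(
            pairwise_diff(seqs[i][start:end], seqs[j][start:end])
            for i, j in pairs
        )
        results.append((start + window // 2, total / len(pairs)))
    return results


# Toy alignment with a small window so the result is easy to follow.
toy = ["AAAAAAAA", "AAAATAAA", "AAAATAAC"]
toy_pi = sliding_pi(toy, window=4, step=2)
print(toy_pi)
```

Plotting the second element of each tuple against the first reproduces the kind of curve described in the caption, where peaks mark candidate divergence hotspots.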
FIGURE 8. Phylogenetic tree based on whole chloroplast genome sequences from 6 Balsaminaceae species and 23 other species, inferred using maximum likelihood (ML) and Bayesian methods. The ML topology is shown, with ML bootstrap support (LBS) values/Bayesian posterior probabilities (PP) given at each node. Asterisks indicate nodes with both PP = 1 and LBS = 100%. Black triangles indicate the cp genomes of the three Impatiens species examined in this study.