Andrew W Gichira1,2,3, Zhizhong Li1,2, Josphat K Saina1,2,3, Zhicheng Long1,2, Guangwan Hu1,3, Robert W Gituru3,4, Qingfeng Wang1,3, Jinming Chen1,3.
Abstract
Hagenia is an endangered monotypic genus endemic to the tropical mountains of Africa. The only species, Hagenia abyssinica (Bruce) J.F. Gmel, is an important medicinal plant producing bioactive compounds that African communities have traditionally used as a remedy for gastrointestinal ailments in both humans and animals. Complete chloroplast genomes have been applied in resolving phylogenetic relationships within plant families. We employed high-throughput sequencing technologies to determine the complete chloroplast genome sequence of H. abyssinica. The genome is a circular molecule of 154,961 base pairs (bp), with a pair of inverted repeats (IR) of 25,971 bp each, separated by two single-copy regions: a large single copy (LSC, 84,320 bp) and a small single copy (SSC, 18,696 bp). The chloroplast genome of H. abyssinica has a GC content of 37.1% and encodes 112 unique genes: 78 protein-coding genes, 30 tRNA genes and four rRNA genes. A comparative analysis with twenty other species from the family Rosaceae sequenced to date revealed similarities in structural organization, gene content and arrangement. The observed size differences are attributed to the contraction/expansion of the inverted repeats. The translational initiation factor gene (infA), which had been previously reported in other chloroplast genomes, was conspicuously missing in H. abyssinica. A total of 172 microsatellites and 49 large repeat sequences were detected in the chloroplast genome. A maximum likelihood analysis of 71 protein-coding genes placed Hagenia in Rosoideae. The availability of a complete chloroplast genome, the first in the Sanguisorbeae tribe, is beneficial for further molecular studies on taxonomic and phylogenomic resolution within the Rosaceae family.
Keywords: Afromontane; Chloroplast genome; East Africa; Hagenia abyssinica; Phylogeny; Rosaceae
Year: 2017 PMID: 28097059 PMCID: PMC5228516 DOI: 10.7717/peerj.2846
Source DB: PubMed Journal: PeerJ ISSN: 2167-8359 Impact factor: 2.984
Comparison of complete chloroplast genomes in 21 taxa of Rosaceae; size, contraction/expansion of the inverted repeats and gene arrangement around the four IR/SC junctions.
| GenBank No. | Species | Sub-family | Genome size (bp) | LSC length (bp) | SSC length (bp) | IR length (bp) | IRa/LSC | | IRa/SSC | | IRb/SSC | | IRb/LSC | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | | | 160,041 | 88,119 | 19,204 | 26,359 | 119 | 9 | 11 | 1,073 | −190 | 129 | −38 | |
| | | | 159,922 | 87,901 | 19,237 | 26,392 | 21 | −92 | −90 | 110 | 975 | −289 | 149 | −3 |
| | | | 159,161 | 87,694 | 19,205 | 26,396 | 8 | −79 | −114 | 113 | 493 | −520 | 141 | −91 |
| | | | 159,328 | 85,239 | 18,485 | 26,302 | 178 | −107 | −110 | −32 | 978 | −3,398 | 179 | −91 |
| | | | 158,955 | 87,667 | 18,872 | 26,208 | 38 | −109 | 5 | 19 | 1,035 | −109 | * | −22 |
| | | | 157,882 | 85,969 | 19,121 | 26,396 | 177 | −248 | 13 | −2 | 1,045 | −248 | 162 | −24 |
| | | | 157,859 | 85,978 | 19,121 | 26,380 | 179 | −250 | 18 | −21 | 1,040 | −250 | 185 | −46 |
| | | | 157,852 | 85,848 | 19,134 | 26,435 | 216 | −287 | 13 | −2 | 1,045 | −287 | 221 | −21 |
| | | | 157,833 | 85,952 | 19,121 | 26,381 | 179 | −250 | 17 | −21 | 1,040 | −250 | 185 | −46 |
| | | | 157,790 | 85,968 | 19,060 | 26,381 | 95 | −167 | −81 | 96 | 946 | −338 | 182 | −3 |
| | | | 157,736 | 85,755 | 19,209 | 26,386 | 181 | −252 | 5 | 9 | 1,050 | −338 | 182 | −79 |
| | | | 157,712 | 85,830 | 19,094 | 26,394 | 196 | −267 | −102 | −17 | 1,018 | −298 | 206 | −2 |
| | | | 156,634 | 85,767 | 18,761 | 26,053 | −14 | −55 | 57 | −44 | 1,105 | −54 | * | −4 |
| | | | 156,612 | 84,970 | 18,942 | 26,350 | 152 | −223 | 0 | 40 | 1,057 | −222 | 151 | −35 |
| | | | 155,691 | 85,606 | 18,175 | 25,555 | −10 | −55 | 31 | −93 | 1,091 | −54 | * | −35 |
| | | | 155,621 | 85,587 | 18,146 | 25,944 | −13 | −54 | 12 | −33 | 1,091 | −54 | * | −34 |
| | | | 155,603 | 85,568 | 18,147 | 25,944 | −13 | −54 | 12 | −33 | 1,091 | −54 | * | −34 |
| | | | 155,596 | 85,515 | 18,171 | 25,955 | −13 | −54 | 12 | 59 | 1,091 | −54 | * | −34 |
| | | | 155,554 | 85,569 | 18,059 | 25,963 | −13 | −55 | 21 | −50 | 1,091 | −54 | * | −34 |
| | | | 154,961 | 84,320 | 18,696 | 25,971 | −130 | −57 | 53 | 12 | 1,082 | −57 | * | −3 |
| | | | 154,959 | 85,137 | 18,762 | 25,530 | −1,016 | −489 | −476 | 400 | 1,040 | −60 | * | −3 |
Notes.
SSC: small single copy
LSC: large single copy
IR (a/b): inverted repeat (a/b)
bp: base pairs
Ψ: pseudogene; *: missing
The negative (−) numbers indicate the size of the gap between the IR/SC junction and the gene involved. Except for Ψrps19, the other numbers show the size of the gene that is located in the IR.
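Because the genome is described as one LSC, one SSC and two IR copies, each reported genome size can be cross-checked as LSC + SSC + 2 × IR. The sketch below does this for the 21 rows as they appear in this record; the function names are illustrative, and the values are copied from the table as transcribed here, so any row it flags may reflect a transcription problem rather than the published data.

```python
# Rows copied from the comparison table as transcribed in this record:
# (reported genome size, LSC length, SSC length, IR length), all in bp.
ROWS = [
    (160041, 88119, 19204, 26359),
    (159922, 87901, 19237, 26392),
    (159161, 87694, 19205, 26396),
    (159328, 85239, 18485, 26302),
    (158955, 87667, 18872, 26208),
    (157882, 85969, 19121, 26396),
    (157859, 85978, 19121, 26380),
    (157852, 85848, 19134, 26435),
    (157833, 85952, 19121, 26381),
    (157790, 85968, 19060, 26381),
    (157736, 85755, 19209, 26386),
    (157712, 85830, 19094, 26394),
    (156634, 85767, 18761, 26053),
    (156612, 84970, 18942, 26350),
    (155691, 85606, 18175, 25555),
    (155621, 85587, 18146, 25944),
    (155603, 85568, 18147, 25944),
    (155596, 85515, 18171, 25955),
    (155554, 85569, 18059, 25963),
    (154961, 84320, 18696, 25971),  # Hagenia abyssinica (sizes match the abstract)
    (154959, 85137, 18762, 25530),
]


def quadripartite_total(lsc, ssc, ir):
    """Expected genome size: LSC + SSC + two copies of the IR."""
    return lsc + ssc + 2 * ir


def find_mismatches(rows):
    """Return (row number, reported, expected, reported - expected) for rows that do not add up."""
    mismatches = []
    for number, (reported, lsc, ssc, ir) in enumerate(rows, start=1):
        expected = quadripartite_total(lsc, ssc, ir)
        if expected != reported:
            mismatches.append((number, reported, expected, reported - expected))
    return mismatches


if __name__ == "__main__":
    for number, reported, expected, diff in find_mismatches(ROWS):
        print(f"row {number}: reported {reported}, LSC+SSC+2*IR = {expected}, difference {diff:+d}")
```

On the transcribed values, 16 of the 21 rows add up exactly and five do not (rows 3, 4, 9, 15 and 20, counting from the top). The Hagenia row is one of them: its regions sum to 154,958 bp against the reported 154,961 bp. This record alone cannot say which of the four figures in a flagged row is off.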
Figure 1: A gene map of the Hagenia abyssinica chloroplast genome.
The GC content is represented by the dark shading on the inner side of the small circle, whereas the light shading represents the AT content. The genes are color-coded by functional group.
List of genes in the chloroplast genome of Hagenia abyssinica.
| Category | Gene type | Gene |
|---|---|---|
| Self-replication | Ribosomal RNA | |
| | Transfer RNA | |
| | Small ribosomal units | |
| | Large ribosomal units | |
| | RNA polymerase sub-units | |
| Photosynthesis genes | NADH dehydrogenase | |
| | Photosystem I | |
| | Photosystem II | |
| | Cytochrome b/f complex | |
| | ATP synthase | |
| | Large subunit of rubisco | |
| Other genes | Maturase | |
| | Protease | |
| | Acetyl-CoA-carboxylase sub-unit | |
| | Envelope membrane protein | |
| | Component of TIC complex | |
| | c-type cytochrome synthesis | |
| Unknown | Hypothetical genes reading frames | |
Notes.
Genes with a single intron.
Genes with two introns.
Characterization of simple sequence repeats discovered in the chloroplast genome of Hagenia abyssinica.
| Microsatellite sequence | Number of repeats | | | | | | | | | | | | | Total |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| A | – | – | – | – | – | 25 | 14 | 10 | 5 | 6 | 3 | 1 | 1 | 65 |
| C | – | – | – | – | – | 6 | 3 | 1 | – | – | – | – | – | 10 |
| G | – | – | – | – | – | 3 | 1 | – | – | – | – | – | – | 4 |
| T | – | – | – | – | – | 30 | 20 | 14 | 6 | 2 | 1 | 1 | 1 | 75 |
| AT | – | – | 2 | 2 | – | – | – | – | – | – | – | – | – | 4 |
| TA | – | – | 5 | – | – | – | – | – | – | – | – | – | – | 5 |
| TC | – | – | 1 | – | – | – | – | – | – | – | – | – | – | 1 |
| AAAT | 2 | – | – | – | – | – | – | – | – | – | – | – | – | 2 |
| AATA | 1 | – | – | – | – | – | – | – | – | – | – | – | – | 1 |
| ATGT | 1 | – | – | – | – | – | – | – | – | – | – | – | – | 1 |
| TAAA | 1 | – | – | – | – | – | – | – | – | – | – | – | – | 1 |
| TAAT | 1 | – | – | – | – | – | – | – | – | – | – | – | – | 1 |
| TTTA | 2 | – | – | – | – | – | – | – | – | – | – | – | – | 2 |
| Total | | | | | | | | | | | | | | 172 |
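The row totals and the grand total of 172 can be re-derived from the individual cells. The following sketch tallies the non-empty cells of each row and groups them by motif length; the dictionary layout and function name are illustrative, and the counts are copied from the table above.

```python
# Non-empty cells of each row of the microsatellite table, left to right.
SSR_COUNTS = {
    "A": [25, 14, 10, 5, 6, 3, 1, 1],
    "C": [6, 3, 1],
    "G": [3, 1],
    "T": [30, 20, 14, 6, 2, 1, 1, 1],
    "AT": [2, 2],
    "TA": [5],
    "TC": [1],
    "AAAT": [2],
    "AATA": [1],
    "ATGT": [1],
    "TAAA": [1],
    "TAAT": [1],
    "TTTA": [2],
}


def totals_by_motif_length(counts):
    """Sum the SSR counts per motif length (1 = mono-, 2 = di-, 4 = tetranucleotide)."""
    totals = {}
    for motif, cells in counts.items():
        totals[len(motif)] = totals.get(len(motif), 0) + sum(cells)
    return totals


if __name__ == "__main__":
    per_length = totals_by_motif_length(SSR_COUNTS)
    print(per_length)                # {1: 154, 2: 10, 4: 8}
    print(sum(per_length.values()))  # 172
```

The tally gives 154 mononucleotide, 10 dinucleotide and 8 tetranucleotide repeats, which matches the 172 microsatellites reported in the abstract.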
List and location of long repeat sequences in the chloroplast genome of Hagenia abyssinica.
| Repeat size (bp) | Repeat 1 start | Repeat 2 start | Repeat type | Location 1 | Location 2 |
|---|---|---|---|---|---|
| 69 | 26,722 | 26,745 | F | IGS ( | IGS ( |
| 67 | 52,510 | 52,510 | P | IGS ( | IGS |
| 59 | 52,514 | 52,514 | P | IGS ( | IGS ( |
| 56 | 10,134 | 10,134 | P | IGS ( | IGS ( |
| 46 | 26,722 | 26,768 | F | IGS ( | IGS ( |
| 40 | 98,746 | 120,719 | F | IGS ( | IGS ( |
| 40 | 120,719 | 140,493 | P | IGS ( | IGS ( |
| 39 | 44,079 | 98,748 | F | IGS ( | |
| 39 | 44,079 | 140,492 | P | IGS ( | |
| 38 | 44,079 | 120,721 | F | IGS | |
| 37 | 12,859 | 12,859 | P | IGS ( | IGS ( |
| 34 | 8,342 | 45,240 | P | IGS ( | |
| 30 | 8,346 | 45,240 | P | IGS ( | |
| 30 | 107,688 | 107,720 | F | IGS ( | IGS ( |
| 30 | 107,688 | 131,529 | P | IGS ( | IGS ( |
| 30 | 107,720 | 131,561 | P | IGS ( | IGS ( |
| 30 | 131,529 | 131,561 | F | IGS ( | IGS ( |
| 29 | 35,992 | 36,014 | F | IGS ( | IGS ( |
| 28 | 67,251 | 67,275 | F | IGS ( | IGS ( |
| 30 | 47,381 | 47,381 | P | IGS ( | IGS ( |
| 24 | 36,841 | 36,841 | P | IGS ( | IGS ( |
| 24 | 67,255 | 67,279 | F | IGS (psaJ-rpl33) | IGS ( |
| 27 | 9,748 | 36,800 | F | IGS ( | |
| 29 | 7,294 | 125,722 | R | IGS ( | |
| 29 | 8,344 | 35,778 | F | IGS ( | |
| 23 | 26,722 | 26,791 | F | IGS ( | IGS ( |
| 31 | 96,104 | 96,104 | P | IGS ( | IGS ( |
| 31 | 96,104 | 143,144 | F | IGS ( | IGS ( |
| 31 | 143,144 | 143,144 | P | IGS ( | IGS ( |
| 28 | 10,275 | 10,275 | P | IGS ( | IGS ( |
| 28 | 59,119 | 59,119 | P | IGS ( | IGS ( |
| 22 | 35,852 | 45,182 | P | IGS ( | IGS ( |
| 22 | 56,966 | 56,966 | R | IGS ( | IGS ( |
| 22 | 80,656 | 80,656 | P | IGS ( | IGS ( |
| 25 | 8,348 | 35,782 | F | ||
| 25 | 35,782 | 45,243 | P | ||
| 30 | 7,017 | 7,021 | R | IGS ( | IGS ( |
| 30 | 28,761 | 98,934 | R | IGS ( | IGS ( |
| 30 | 28,761 | 140,315 | C | IGS ( | IGS ( |
| 30 | 39,041 | 41,265 | F | ||
| 30 | 81,696 | 120,708 | F | IGS ( | IGS ( |
| 27 | 10,259 | 36,722 | P | IGS ( | IGS ( |
| 27 | 56,961 | 56,966 | R | IGS ( | IGS ( |
| 21 | 8,352 | 35,786 | F | ||
| 21 | 12,792 | 68,281 | F | IGS ( | |
| 21 | 30,092 | 30,092 | R | IGS ( | IGS ( |
| 21 | 35,786 | 45,243 | P | ||
| 21 | 63,682 | 63,682 | R | ||
| 29 | 32,026 | 32,026 | P | IGS ( | IGS ( |
Notes.
F: forward
R: reverse
P: palindromic
C: complementary
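The four repeat types relate the second copy of a repeat to the first: identical (forward), read backwards (reverse), base-complemented (complementary), or reverse-complemented (palindromic). Below is a minimal sketch of those definitions as they are commonly used for repeat pairs. It assumes exact matches and uses a hypothetical helper name; the source does not say which tool or mismatch tolerance produced the table.

```python
# Translation table that swaps each base for its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def classify_repeat(first, second):
    """Label how `second` relates to `first`: F, P, R, C, or None if unrelated.

    Exact matching only. Some sequences satisfy more than one relation
    (for example, a sequence equal to its own reverse complement), so the
    order of the checks below decides which label wins.
    """
    first, second = first.upper(), second.upper()
    if second == first:
        return "F"  # forward: same sequence, same orientation
    if second == first[::-1].translate(COMPLEMENT):
        return "P"  # palindromic: reverse complement
    if second == first[::-1]:
        return "R"  # reverse: same bases read backwards
    if second == first.translate(COMPLEMENT):
        return "C"  # complementary: complemented, same orientation
    return None


if __name__ == "__main__":
    for pair in [("AAGCT", "AAGCT"), ("AAGCT", "TCGAA"),
                 ("AAGCT", "AGCTT"), ("AAGCT", "TTCGA")]:
        print(pair, classify_repeat(*pair))
```

Rows in the table whose two start positions coincide and are typed P fit this scheme: a stretch that equals its own reverse complement pairs with itself.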
Figure 2: Phylogenetic relationship of 21 species of Rosaceae based on maximum likelihood analysis of 71 protein-coding genes.