Abdul Latif Khan1, Ahmed Al-Harrasi1, Sajjad Asaf2, Chang Eon Park2, Gun-Seok Park2, Abdur Rahim Khan2, In-Jung Lee2, Ahmed Al-Rawahi1, Jae-Ho Shin2.
Abstract
Boswellia sacra (Burseraceae), a keystone endemic species, is famous for its production of fragrant oleo-gum resin. However, its genetic make-up, especially the genomic information of its chloroplast, is still unknown. Here, we describe for the first time the chloroplast (cp) genome of B. sacra. The complete cp sequence revealed a circular genome of 160,543 bp with 37.61% GC content. The cp genome has a typical quadripartite structure, with a pair of inverted repeats (IRs; 26,763 bp each) separated by a small single copy (SSC; 18,962 bp) region and a large single copy (LSC; 88,055 bp) region. De novo assembly and annotation showed the presence of 114 unique genes, including 83 protein-coding genes. Phylogenetic analysis revealed that the B. sacra cp genome is closely related to the cp genomes of Azadirachta indica and Citrus sinensis, while most of the syntenic differences were found in the non-coding regions. The pairwise distance among 76 shared genes of B. sacra and A. indica was highest for atpA, rpl2, rps12 and ycf1. The cp genome of B. sacra is a novel resource that could be used in further studies to understand the diversity, taxonomy and phylogeny of the species.
Year: 2017 PMID: 28085925 PMCID: PMC5235384 DOI: 10.1371/journal.pone.0169794
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Fig 1. Boswellia sacra: habitat and leaf morphology.
This tree grows wild in the Dhofar region of Oman.
Fig 2. Gene map of the Boswellia sacra chloroplast genome.
A pair of thick lines in the inner circle represents the inverted repeats (IRa and IRb; 26,763 bp each), which separate the large single copy region (LSC; 88,055 bp) from the small single copy region (SSC; 18,962 bp). Genes drawn inside the circle are transcribed clockwise, while those drawn outside the circle are transcribed counterclockwise.
Summary of the chloroplast genome characteristics in Boswellia sacra and Azadirachta indica.
| Attribute | B. sacra | A. indica |
|---|---|---|
| Size (bp) | 160,543 | 160,737 |
| Overall GC content (%) | 37.6 | 37.5 |
| LSC size in bp (% total) | 88,055 (54.8%) | 88,136 (54.8%) |
| SSC size in bp (% total) | 18,962 (11.8%) | 18,635 (11.6%) |
| IR size in bp (% total) | 26,763 (16.7%) | 26,983 (16.8%) |
| Protein-coding regions size in bp (% total) | 80,289 (50.0%) | 79,685 (49.6%) |
| rRNA and tRNA size in bp (% total) | 11,978 (7.5%) | 11,863 (7.38%) |
| Introns size in bp (% total) | 17,621 (11.0%) | 18,993 (11.8%) |
| Intergenic spacer size in bp (% total) | 57,025 (35.5%) | 52,699 (32.8%) |
| Number of different genes | 114 | 112 |
| Number of different protein-coding genes | 83 | 78 |
| Number of different rRNA genes | 4 | 4 |
| Number of different tRNA genes | 27 | 30 |
| Number of different genes duplicated by IR | 24 | 19 |
| Number of different genes with introns | 16 | 17 |
a Each cp genome contains two copies of the inverted repeat (IR).
b According to the original annotation; infA, ycf15, ycf68, orf42 and orf56 are not included.
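The region sizes in the summary table can be cross-checked with simple arithmetic: in a quadripartite genome, the LSC, the SSC and the two IR copies should add up to the total size. The sketch below is only an illustration of that check. The sizes are copied from the table, while the dictionary layout and the function names are assumptions made for this example.

```python
# Region sizes in bp, copied from the summary table above.
# The dictionary layout and function names are illustrative only.
GENOMES = {
    "B. sacra": {"total": 160_543, "LSC": 88_055, "SSC": 18_962, "IR": 26_763},
    "A. indica": {"total": 160_737, "LSC": 88_136, "SSC": 18_635, "IR": 26_983},
}


def quadripartite_sum(genome):
    """LSC + SSC + two IR copies; should equal the reported genome size."""
    return genome["LSC"] + genome["SSC"] + 2 * genome["IR"]


def percent_of_total(genome, region):
    """Percentage of the genome covered by a single copy of the region."""
    return 100.0 * genome[region] / genome["total"]


if __name__ == "__main__":
    for name, genome in GENOMES.items():
        consistent = quadripartite_sum(genome) == genome["total"]
        shares = {r: round(percent_of_total(genome, r), 1) for r in ("LSC", "SSC", "IR")}
        print(name, "regions add up:", consistent, shares)
```

For both species the four regions add up exactly to the reported genome size. The IR percentage in the table refers to a single IR copy.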
List of genes found in the cp genome of Boswellia sacra.
| Group of gene | Name of gene |
|---|---|
Differences between the B. sacra and A. indica cp genomes.

| Length (bp) | Count |
|---|---|
| 1 | 2,743 |
| 2–10 | 1,216 |
| 11–21 | 12 |
| Sum: 6,353 | 3,971 |
| Percentage | |

| Type | Count |
|---|---|
| A<->T | 1,571 |
| C<->G | 837 |
| A<->C | 1,416 |
| T<->C | 1,828 |
| A<->G | 1,777 |
| T<->G | 1,500 |
| Sum | 8,929 |
| Percentage | |

| Region | Length (bp) | Pairwise distance |
|---|---|---|
| trnH-GUG–psbA | 457 | 0.47 |
| accD–psaI | 746 | 0.32 |
| atpH–atpI | 1,167 | 0.31 |
| psbZ–trnG-UCC | 546 | 0.30 |
| trnK-UUU–rps16 | 1,034 | 0.25 |
| petN–psbM | 1,139 | 0.25 |
| ycf4–cemA | 942 | 0.25 |
| ccsA–ndhD | 307 | 0.24 |
| trnR-UCU–atpA | 198 | 0.23 |
| trnL-UAG–ccsA | 111 | 0.21 |

a Relative to the length of B. sacra.
b Length in B. sacra.
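The counts in this table lend themselves to a quick consistency check and to one derived quantity, the transition/transversion ratio. The following sketch recomputes the two "Sum" rows and splits the substitutions into transitions (A<->G, T<->C) and transversions (all other types). The ratio is not reported in the source and is shown only as an illustration; the variable and function names are assumptions.

```python
# Counts copied from the table above; names are illustrative only.
INDELS = {"1": 2743, "2-10": 1216, "11-21": 12}
SUBSTITUTIONS = {
    "A<->T": 1571,
    "C<->G": 837,
    "A<->C": 1416,
    "T<->C": 1828,
    "A<->G": 1777,
    "T<->G": 1500,
}
# Purine<->purine and pyrimidine<->pyrimidine changes are transitions.
TRANSITIONS = {"A<->G", "T<->C"}


def transition_transversion(counts):
    """Return (transitions, transversions, ratio) for a table of substitution counts."""
    ts = sum(n for kind, n in counts.items() if kind in TRANSITIONS)
    tv = sum(n for kind, n in counts.items() if kind not in TRANSITIONS)
    return ts, tv, ts / tv


if __name__ == "__main__":
    print("indel events:", sum(INDELS.values()))
    print("substitutions:", sum(SUBSTITUTIONS.values()))
    ts, tv, ratio = transition_transversion(SUBSTITUTIONS)
    print(f"transitions={ts} transversions={tv} ratio={ratio:.2f}")
```

Both recomputed sums agree with the "Sum" rows of the table (3,971 indels and 8,929 substitutions).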
Fig 3. Visualization of the alignment of the chloroplast genome sequences of B. sacra and A. indica.
VISTA-based similarity graph showing the sequence identity of the B. sacra cp genome, with the A. indica cp genome as reference. Thick black lines show the inverted repeats (IRs) in the chloroplast genomes. Genome regions are color-coded as protein-coding, rRNA-coding, tRNA-coding or conserved noncoding sequences (CNS), whereas arrows show the presence of genes.
List of simple sequence repeats.
| Repeat unit | Length (bp) | Number of SSRs | Start position |
|---|---|---|---|
| | 10 | 5 | |
| | 12 | 2 | 72021; 118991 |
| | 10 | 1 | |
| | 12 | 1 | 49380 |
| | 10 | 1 | |
| | 12 | 1 | |
| | 12 | 3 | |
| | 12 | 1 | |
| | 12 | 1 | 50027 |
| | 12 | 2 | |
| | 12 | 1 | 6383 |
| | 12 | 1 | 117168 |
| | 12 | 1 | 34089 |
| | 12 | 1 | |
| | 12 | 1 | 51864 |
| | 12 | 1 | 38353 |
| | 12 | 1 | 125090 |
| | 18 | 1 | 118904 |
a Coding regions containing SSRs are indicated in parentheses. SSRs that are identical in the A. indica chloroplast genome are highlighted in bold.
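To illustrate how a list of simple sequence repeats like the one above can be produced, here is a minimal regular-expression sketch. It is not the method used in the study. The minimum repeat counts in `MIN_UNITS` are assumptions chosen so that the shortest hits are 10, 12 and 18 bp long, matching the lengths seen in the table, and `find_ssrs` is a hypothetical helper name.

```python
import re

# Minimum number of repeat units per motif length.
# These thresholds are assumptions for illustration, not values from the source.
MIN_UNITS = {1: 10, 2: 6, 3: 4, 4: 3, 6: 3}


def find_ssrs(seq, min_units=MIN_UNITS):
    """Return sorted (start, motif, units, length) tuples; start is 1-based."""
    seq = seq.upper()
    hits = []
    for k, n in sorted(min_units.items()):
        # A motif of k bases followed by at least n - 1 further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "ATAT" is "AT" twice).
            if any(motif == motif[:d] * (k // d) for d in range(1, k) if k % d == 0):
                continue
            length = len(m.group(0))
            hits.append((m.start() + 1, motif, length // k, length))
    return sorted(hits)


if __name__ == "__main__":
    demo = "GC" + "A" * 10 + "CGC" + "AT" * 6 + "GG"
    for start, motif, units, length in find_ssrs(demo):
        print(start, motif, units, length)
```

The start positions are 1-based so that they can be compared directly with genome coordinates such as those in the "Start position" column.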
List of long repeat sequences.
| Type | Start position of first repeat | Start position(s) of other copies | Location | Region |
|---|---|---|---|---|
| P | 9578 | 48299 | | LSC |
| D | 30965 | 31304 | IGS ( | LSC |
| D, P | 95755 | 95773, 152797, 152815 | | IR |
| D | 50042 | 50183 | IGS ( | LSC |
| D | 15474 | 15696, 16029 | IGS ( | LSC |
| P | 5863 | 5863 | intron ( | LSC |
| D, P | 103143 | 125725, 145421 | | IR, SSC |
| D | 64428 | 64797 | IGS ( | LSC |
| D | 49991 | 50135 | IGS ( | LSC |
| D | 64386 | 64759 | IGS ( | LSC |
| P | 444 | 444 | IGS ( | LSC |
| P | 31815 | 31815 | | LSC |
| D | 15507 | 16061 | IGS ( | LSC |
| D | 31107 | 31522 | IGS ( | LSC |
| D | 41575 | 43799 | | LSC |
| D | 64274 | 64648 | IGS ( | LSC |
| D | 15417 | 15639 | IGS ( | LSC |
| D | 15729 | 16061 | IGS ( | LSC |
| D | 62442 | 62769 | | LSC |
a D: direct repeat; P: palindromic (inverted) repeat.
b IGS: intergenic spacer region. Sequences conserved in the A. indica chloroplast genome are highlighted in bold.
* The 65 bp sequence is a truncated copy of the 204 bp sequence.
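The distinction between direct (D) and palindromic (P) repeats can be made concrete with a short sketch. A direct repeat is the same sequence found again on the same strand, whereas a palindromic repeat matches the reverse complement. The code below uses exact matching only, which is a simplification assumed for this example, and `classify_repeat` is a hypothetical helper rather than part of any repeat-finding tool.

```python
# Translation table for complementing DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def classify_repeat(first, second):
    """Return the repeat types ('D', 'P') that relate two equally long segments.

    Exact matching only; tolerance for mismatches is deliberately left out.
    """
    first, second = first.upper(), second.upper()
    types = []
    if first == second:
        types.append("D")  # same sequence on the same strand
    if reverse_complement(first) == second:
        types.append("P")  # second copy is the reverse complement
    return types


if __name__ == "__main__":
    print(classify_repeat("ACGTTG", "ACGTTG"))  # direct repeat
    print(classify_repeat("ACGTTG", "CAACGT"))  # palindromic repeat
    print(classify_repeat("GAATTC", "GAATTC"))  # sequence equal to its own reverse complement
```

A sequence that equals its own reverse complement is reported with both labels, which is one way a single entry could carry the type "D, P".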
Distribution of tandem repeats in the B. sacra chloroplast genome.
| Indices | Repeat Length | Copy Number | Percent Indels | Percent Matches | Location |
|---|---|---|---|---|---|
| 347—387 | 16 | 2.6 | 11 | 88 | |
| 4566—4594 | 12 | 2.4 | 0 | 100 | |
| 4819—4874 | 19 | 2.9 | 5 | 78 | |
| 7784—7814 | 16 | 1.9 | 0 | 100 | |
| 8690—8726 | 18 | 2.1 | 0 | 94 | |
| 9763—9828 | 27 | 2.4 | 5 | 84 | |
| 9783—9831 | 14 | 3.6 | 5 | 83 | |
| 9779—9867 | 27 | 3.5 | 13 | 85 | |
| 9863—9892 | 14 | 2.1 | 0 | 100 | |
| 11501—11572 | 31 | 2.5 | 18 | 81 | |
| 11504—11573 | 25 | 2.7 | 17 | 76 | |
| 11535—11583 | 22 | 2.4 | 12 | 80 | |
| 16355—16385 | 14 | 2.2 | 0 | 94 | |
| 29582—29606 | 12 | 2.1 | 0 | 100 | |
| 31751—31776 | 13 | 2 | 0 | 100 | |
| 49997—50046 | 18 | 2.9 | 11 | 82 | |
| 60254—60278 | 12 | 2.1 | 0 | 100 | |
| 62281—62429 | 76 | 2 | 5 | 88 | |
| 67916—67954 | 19 | 2.1 | 4 | 90 | |
| 69384—69454 | 12 | 6.1 | 18 | 72 | |
| 72176—72214 | 21 | 1.9 | 5 | 94 | |
| 73287—73312 | 13 | 2 | 0 | 100 | |
| 95746—95802 | 18 | 3.2 | 0 | 97 | |
| 112196—112261 | 32 | 2.1 | 0 | 97 | |
| 113767—113809 | 20 | 2.2 | 0 | 100 | |
| 119576—119605 | 15 | 2 | 0 | 93 | |
| 129059—129085 | 13 | 2.1 | 0 | 100 | |
| 134790—134832 | 20 | 2.2 | 0 | 100 | |
| 136338—136403 | 32 | 2.1 | 0 | 97 | |
| 152797—152853 | 18 | 3.2 | 0 | 97 |
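The columns of the tandem repeat table are related: for a repeat without indels, the copy number is approximately the length of the span given under "Indices" divided by the repeat length. The sketch below checks this on a few rows copied from the table. Reading "Copy Number" as span divided by repeat length is an assumption made here, and rows with indels are expected to deviate from it.

```python
# (start, end, repeat length, reported copy number, percent indels),
# copied from a few rows of the table above.
ROWS = [
    (4566, 4594, 12, 2.4, 0),
    (7784, 7814, 16, 1.9, 0),
    (95746, 95802, 18, 3.2, 0),
    (9783, 9831, 14, 3.6, 5),
    (69384, 69454, 12, 6.1, 18),
]


def approx_copy_number(start, end, repeat_length):
    """Span length (inclusive coordinates) divided by the repeat unit length."""
    return (end - start + 1) / repeat_length


if __name__ == "__main__":
    for start, end, repeat_length, reported, indels in ROWS:
        estimate = approx_copy_number(start, end, repeat_length)
        print(f"{start}-{end}: estimate {estimate:.2f}, reported {reported}, indels {indels}%")
```

For the indel-free rows the estimate agrees with the reported copy number after rounding to one decimal place; for rows with indels the two values differ slightly.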
Fig 4. Comparison of the border positions of the LSC, SSC, and IR regions in three Sapindales species.
Fig 5. Neighbor-joining phylogeny of the representative Malvidae lineages.
The common grape vine (Vitis vinifera) is included as the outgroup to root the tree. A total of 33 complete chloroplast genomes were aligned. In total, 33 nodes were resolved. The position of B. sacra is shown in bold.