Samaila S Yaradua1,2, Dhafer A Alzahrani1, Enas J Albokhary1, Abidina Abba1, Abubakar Bello2.
Abstract
The complete chloroplast genome of J. flava, an endangered medicinal plant in Saudi Arabia, was sequenced and compared with the cp genomes of three Acanthaceae species to characterize the cp genome, identify SSRs, and detect variation among the cp genomes of the sampled Acanthaceae. NOVOPlasty was used to assemble the complete chloroplast genome from the whole genome data. The cp genome of J. flava is 150,888 bp in length with a GC content of 38.2% and has a quadripartite structure: one pair of inverted repeats (IRa and IRb, 25,500 bp each) separated by a large single copy region (LSC, 82,995 bp) and a small single copy region (SSC, 16,893 bp). There are 132 genes in the genome, including 80 protein coding genes, 30 tRNA, and 4 rRNA; 113 are unique while the remaining 19 are duplicated in the IR regions. The repeat analysis indicates that the genome contains all types of repeats, with palindromic repeats occurring more frequently; the analysis also identified a total of 98 simple sequence repeats (SSRs), the majority of which are A/T mononucleotides located in the intergenic spacers. The comparative analysis with the other sampled cp genomes indicated that the inverted repeat regions are more conserved than the single copy regions and that the noncoding regions show a higher rate of variation than the coding regions. All the genomes have the ndhF and ycf1 genes at the border junction of IRb and SSC. Sequence divergence analysis of the protein coding genes showed that seven genes (petB, atpF, psaI, rpl32, rpl16, ycf1, and clpP) are under positive selection. The phylogenetic analysis revealed that Justiceae is sister to Ruellieae. This study reports the first cp genome of the largest genus in Acanthaceae and provides resources for studying the genetic diversity of J. flava as well as resolving phylogenetic relationships within the core Acanthaceae.
Year: 2019 PMID: 31467890 PMCID: PMC6699374 DOI: 10.1155/2019/4370258
Source DB: PubMed Journal: Biomed Res Int Impact factor: 3.411
Figure 1. Gene map of the J. flava chloroplast genome. Genes outside the circles are transcribed in the counterclockwise direction and those inside in the clockwise direction. Known functional genes are indicated by the coloured bar. The GC and AT content are denoted by dark grey and light grey in the inner circle, respectively. LSC indicates large single copy; SSC, small single copy; and IR, inverted repeat.
Base composition in the J. flava chloroplast genome.
| Region | T(U) (%) | C (%) | A (%) | G (%) | Total (bp) |
|---|---|---|---|---|---|
| cp genome | 31 | 19 | 31 | 19 | 150888 |
| LSC | 32 | 19 | 31 | 18 | 82995 |
| SSC | 34 | 17 | 34 | 15 | 16893 |
| IRA | 28 | 23 | 28 | 21 | 25500 |
| IRB | 28 | 21 | 28 | 23 | 25500 |
| 1st Position | 31 | 20 | 30 | 19 | 50296 |
| 2nd Position | 31 | 19 | 31 | 19 | 50296 |
| 3rd Position | 31 | 20 | 30 | 19 | 50296 |
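The region lengths reported in this table can be cross-checked with simple arithmetic, since a quadripartite genome is the sum of the LSC, the SSC, and the two IR copies. The short sketch below uses only the lengths given in the table; the variable names are illustrative and not taken from the study.

```python
# Region lengths (bp) as reported in the base composition table.
regions = {"LSC": 82995, "SSC": 16893, "IRA": 25500, "IRB": 25500}
genome_length = 150888  # reported total for the cp genome

# The four regions should add up to the full genome length.
total = sum(regions.values())
print("sum of regions:", total, "matches genome:", total == genome_length)

# Share of the genome occupied by each region, in percent.
shares = {name: round(100 * bp / genome_length, 1) for name, bp in regions.items()}
for name, pct in shares.items():
    print(f"{name}: {pct}%")
```

The check confirms that the reported region lengths are internally consistent with the 150,888 bp total.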
Genes present in the chloroplast genome of J. flava.
| Category | Group of genes | Name of genes |
|---|---|---|
| RNA genes | Ribosomal RNA genes (rRNA) | |
| | Transfer RNA genes (tRNA) | |
| Ribosomal proteins | Small subunit of ribosome | |
| Transcription | Large subunit of ribosome | |
| | DNA dependent RNA polymerase | |
| Protein genes | Photosystem I | |
| | Photosystem II | |
| | Subunit of cytochrome | |
| | Subunit of synthase | |
| | Large subunit of rubisco | |
| | NADH dehydrogenase | |
| | ATP dependent protease subunit P | |
| | Chloroplast envelope membrane protein | |
| Other genes | Maturase | |
| | Subunit acetyl-coA carboxylase | |
| | C-type cytochrome synthesis | |
| | Hypothetical proteins | |
| | Component of TIC complex | |
+ Gene with one intron. ++ Gene with two introns. a Gene with copies.
Genes with introns in the J. flava chloroplast genome and the lengths of their introns and exons.
| Gene | Location | Exon I (bp) | Intron I (bp) | Exon II (bp) | Intron II (bp) | Exon III (bp) |
|---|---|---|---|---|---|---|
| | LSC | 143 | 659 | 467 | | |
| | LSC | 443 | 769 | 1628 | | |
| | LSC | 128 | 682 | 227 | 721 | 152 |
| | LSC | 68 | 763 | 293 | 635 | 227 |
| | IR | 392 | 664 | 434 | | |
| | IR | 776 | 680 | 755 | | |
| | SSC | 551 | 954 | 539 | | |
| | LSC | 36 | 2460 | 37 | | |
| | LSC | 31 | 667 | 59 | | |
| | LSC | 36 | 501 | 49 | | |
| | LSC | 37 | 593 | 36 | | |
| | IR | 41 | 938 | 34 | | |
| | IR | 37 | 818 | 34 | | |
Codon-anticodon recognition patterns and codon usage of the J. flava chloroplast genome.
| Codon | Amino Acid | RSCU | tRNA | Codon | Amino Acid | RSCU | tRNA |
|---|---|---|---|---|---|---|---|
| UUU | Phe | 1.18 | | UAU | Tyr | 1.38 | |
| UUC | Phe | 0.82 | | UAC | Tyr | 0.62 | |
| UUA | Leu | 1.31 | | UAA | Stop | 1.01 | |
| UUG | Leu | 1.31 | | UAG | Stop | 1.03 | |
| CUU | Leu | 1.23 | | CAU | His | 1.26 | |
| CUC | Leu | 0.65 | | CAC | His | 0.74 | |
| CUA | Leu | 0.92 | | CAA | Gln | 1.37 | |
| CUG | Leu | 0.59 | | CAG | Gln | 0.63 | |
| AUU | Ile | 1.22 | | AAU | Asn | 1.34 | |
| AUC | Ile | 0.82 | | AAC | Asn | 0.66 | |
| AUA | Ile | 0.95 | | AAA | Lys | 1.29 | |
| AUG | Met | 1 | | AAG | Lys | 0.71 | |
| GUU | Val | 1.45 | | GAU | Asp | 1.44 | |
| GUC | Val | 0.65 | | GAC | Asp | 0.56 | |
| GUG | Val | 0.74 | | GAA | Glu | 1.38 | |
| GUA | Val | 1.16 | | GAG | Glu | 0.62 | |
| UCU | Ser | 1.46 | | UGU | Cys | 1.14 | |
| UCC | Ser | 0.95 | | UGC | Cys | 0.86 | |
| UCG | Ser | 0.76 | | UGA | Stop | 0.96 | |
| UCA | Ser | 1.33 | | UGG | Trp | 1 | |
| CCU | Pro | 1.21 | | CGU | Arg | 0.84 | |
| CCC | Pro | 0.87 | | CGC | Arg | 0.35 | |
| CCA | Pro | 1.13 | | CGA | Arg | 1.16 | |
| CCG | Pro | 0.79 | | CGG | Arg | 0.77 | |
| ACU | Thr | 1.22 | | AGA | Arg | 1.8 | |
| ACC | Thr | 0.87 | | AGG | Arg | 1.07 | |
| ACG | Thr | 0.72 | | AGU | Ser | 0.92 | |
| ACA | Thr | 1.2 | | AGC | Ser | 0.58 | |
| GCU | Ala | 1.33 | | GGU | Gly | 1.18 | |
| GCC | Ala | 0.76 | | GGC | Gly | 0.54 | |
| GCA | Ala | 1.18 | | GGA | Gly | 1.28 | |
| GCG | Ala | 0.72 | | GGG | Gly | 0.99 | |
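Relative synonymous codon usage (RSCU) is conventionally the observed count of a codon divided by the mean count of all codons that encode the same amino acid, so values above 1 mark codons used more often than expected under equal usage. The following is a minimal sketch of that calculation; the function name `rscu` and the codon counts are hypothetical and were chosen only to illustrate the arithmetic, not taken from the J. flava data.

```python
from collections import defaultdict

def rscu(codon_counts, codon_to_aa):
    """Return RSCU per codon: observed count divided by the mean count
    of all synonymous codons for the same amino acid."""
    totals = defaultdict(int)       # summed counts per amino acid
    family_size = defaultdict(int)  # number of synonymous codons per amino acid
    for codon, aa in codon_to_aa.items():
        totals[aa] += codon_counts.get(codon, 0)
        family_size[aa] += 1

    result = {}
    for codon, aa in codon_to_aa.items():
        if totals[aa] == 0:
            result[codon] = 0.0  # amino acid never observed
            continue
        expected = totals[aa] / family_size[aa]
        result[codon] = codon_counts.get(codon, 0) / expected
    return result

# Hypothetical counts (assumption): two Phe codons and the single Met codon.
codon_to_aa = {"UUU": "Phe", "UUC": "Phe", "AUG": "Met"}
counts = {"UUU": 59, "UUC": 41, "AUG": 30}

values = rscu(counts, codon_to_aa)
print({codon: round(v, 2) for codon, v in values.items()})
```

With these illustrative counts, the two Phe values sum to 2 (the number of synonymous codons) and the single-codon amino acid Met has an RSCU of 1, the same pattern seen for Phe, Met, and Trp in the table.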
Figure 2. Amino acid frequencies in J. flava chloroplast genome protein coding sequences.
Predicted RNA editing sites in the J. flava chloroplast genome.
| Gene | Nucleotide Position | Amino Acid Position | Codon Conversion | Amino Acid Conversion | Score |
|---|---|---|---|---|---|
| | 800 | 267 | TCG => TTG | S => L | 0.8 |
| | 844 | 282 | CCC => TCC | P => S | 0.8 |
| | | | | | |
| | 776 | 259 | ACC => ATC | T => I | 1 |
| | 914 | 305 | TCA => TTA | S => L | 1 |
| | 1270 | 424 | CCC => TCC | P => S | 1 |
| | | | | | |
| | 92 | 31 | CCA => CTA | P => L | 0.86 |
| | | | | | |
| | 404 | 135 | GCT => GTT | A => V | 1 |
| | 620 | 207 | TCA => TTA | S => L | 1 |
| | | | | | |
| | 640 | 214 | CAT => TAT | H => Y | 1 |
| | 1249 | 417 | CAT => TAT | H => Y | 1 |
| | | | | | |
| | 326 | 109 | ACT => ATT | T => I | 1 |
| | 566 | 189 | TCA => TTA | S => L | 1 |
| | 922 | 308 | CTT => TTT | L => F | 1 |
| | | | | | |
| | 149 | 50 | TCA => TTA | S => L | 1 |
| | 467 | 156 | CCA => CTA | P => L | 1 |
| | 586 | 196 | CAT => TAT | H => Y | 1 |
| | 737 | 246 | CCA => CTA | P => L | 1 |
| | 746 | 249 | TCT => TTT | S => F | 1 |
| | 830 | 277 | TCA => TTA | S => L | 1 |
| | 836 | 279 | TCA => TTA | S => L | 1 |
| | 1292 | 431 | TCC => TTC | S => F | 1 |
| | 1481 | 494 | CCA => CTA | P => L | 1 |
| | | | | | |
| | 2 | 1 | ACG => ATG | T => M | 1 |
| | 32 | 11 | GCA => GTA | A => V | 1 |
| | 878 | 293 | TCA => TTA | S => L | 1 |
| | 1445 | 482 | GCT => GTT | A => V | 1 |
| | | | | | |
| | 124 | 42 | CTT => TTT | L => F | 1 |
| | 671 | 224 | CCA => CTA | P => L | 1 |
| | 713 | 238 | GCT => GTT | A => V | 0.8 |
| | 1505 | 502 | TCT => TTT | S => F | 1 |
| | 1667 | 556 | CCC => CTC | P => L | 1 |
| | 2173 | 725 | CTC => TTC | L => F | 1 |
| | | | | | |
| | 314 | 105 | ACA => ATA | T => I | 0.8 |
| | | | | | |
| | 617 | 206 | CCA => CTA | P => L | 1 |
| | | | | | |
| | 22 | 8 | CCT => TCT | P => S | 1 |
| | 28 | 10 | CTT => TTT | L => F | 0.86 |
| | | | | | |
| | 596 | 199 | GCG => GTG | A => V | 0.86 |
| | | | | | |
| | 308 | 103 | TCA => TTA | S => L | 0.86 |
| | | | | | |
| | 338 | 113 | TCT => TTT | S => F | 1 |
| | 473 | 158 | TCA => TTA | S => L | 0.86 |
| | 551 | 184 | TCA => TTA | S => L | 1 |
| | 566 | 189 | TCG => TTG | S => L | 1 |
| | 593 | 198 | GCT => GTT | A => V | 0.86 |
| | 2000 | 667 | TCT => TTT | S => F | 1 |
| | 2426 | 809 | TCA => TTA | S => L | 0.86 |
| | | | | | |
| | 2290 | 764 | CGG => TGG | R => W | 1 |
| | 3202 | 1068 | CTT => TTT | L => F | 0.86 |
| | 3719 | 1240 | TCA => TTA | S => L | 0.86 |
| | | | | | |
| | 248 | 83 | TCA => TTA | S => L | 1 |
| | 266 | 89 | ACA => ATA | T => I | 0.86 |
| | | | | | |
| | 80 | 27 | TCA => TTA | S => L | 1 |
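Every predicted site in this table is a C-to-T change in the DNA codon, and the amino acid position follows from the nucleotide position by integer division. The sketch below reproduces both columns for a few rows, assuming that nucleotide positions are 1-based within the coding sequence and that the standard codon assignments apply; the function names are illustrative and not part of any prediction tool.

```python
# Standard codon assignments, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def amino_acid_position(nt_position):
    """Map a 1-based nucleotide position to its 1-based codon number."""
    return (nt_position + 2) // 3

def apply_c_to_t_edit(codon, nt_position):
    """Edit the base of `codon` that sits at `nt_position` from C to T and
    return (edited codon, original amino acid, edited amino acid)."""
    offset = (nt_position - 1) % 3  # 0, 1 or 2 within the codon
    if codon[offset] != "C":
        raise ValueError(f"no C at codon offset {offset} in {codon}")
    edited = codon[:offset] + "T" + codon[offset + 1:]
    return edited, CODON_TABLE[codon], CODON_TABLE[edited]

# Three rows from the table: (nucleotide position, original codon).
for nt_position, codon in [(800, "TCG"), (844, "CCC"), (2290, "CGG")]:
    edited, before, after = apply_c_to_t_edit(codon, nt_position)
    print(nt_position, amino_acid_position(nt_position),
          f"{codon} => {edited}", f"{before} => {after}")
```

For these rows the computed codon numbers and amino acid changes agree with the tabulated values (267, S => L; 282, P => S; 764, R => W).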
Repeat sequences present in the J. flava chloroplast genome.
| S/N | Repeat Size | Repeat Position 1 | Repeat Type | Repeat Location 1 | Repeat Position 2 | Repeat Location 2 | E-Value |
|---|---|---|---|---|---|---|---|
| 1 | 41 | 97201 | F | IGS | 117423 | IGS | 1.32E-15 |
| 2 | 41 | 117423 | P | IGS | 136591 | IGS | 1.32E-15 |
| 3 | 39 | 42922 | F | | 97203 | IGS | 2.12E-14 |
| 4 | 39 | 42922 | F | | 117425 | IGS | 2.12E-14 |
| 5 | 39 | 42922 | P | | 136591 | IGS | 2.12E-14 |
| 6 | 30 | 7623 | P | IGS- | 44420 | IGS- | 5.55E-09 |
| 7 | 26 | 86664 | P | | 86664 | | 1.42E-06 |
| 8 | 26 | 86664 | F | | 147143 | | 1.42E-06 |
| 9 | 26 | 120785 | F | IGS | 120808 | IGS | 1.42E-06 |
| 10 | 26 | 147143 | P | | 147143 | | 1.42E-06 |
| 11 | 25 | 118049 | F | ndhA Intron | 118074 | | 5.69E-06 |
| 12 | 24 | 58638 | F | IGS | 58661 | IGS | 2.27E-05 |
| 13 | 23 | 31057 | R | IGS | 31057 | IGS | 9.10E-05 |
| 14 | 22 | 9120 | F | | 35746 | | 3.64E-04 |
| 15 | 21 | 7629 | F | | 34812 | | 1.46E-03 |
| 16 | 21 | 11762 | R | IGS | 11762 | | 1.46E-03 |
| 17 | 21 | 34812 | P | | 44423 | trnS-GGA | 1.46E-03 |
| 18 | 21 | 45956 | R | | 45956 | trnT-UGU | 1.46E-03 |
| 19 | 21 | 106777 | F | IGS | 123747 | | 1.46E-03 |
| 20 | 21 | 123747 | P | | 127035 | IGS | 1.46E-03 |
| 21 | 20 | 29704 | R | IGS | 29704 | IGS | 5.82E-03 |
| 22 | 20 | 30622 | P | IGS | 30622 | IGS | 5.82E-03 |
| 23 | 20 | 49490 | R | | 49490 | | 5.82E-03 |
| 24 | 20 | 51290 | P | | 101924 | | 5.82E-03 |
| 25 | 20 | 51290 | F | | 131889 | | 5.82E-03 |
| 26 | 20 | 92988 | F | IGS | 93005 | IGS | 5.82E-03 |
| 27 | 20 | 92988 | P | IGS | 140808 | IGS | 5.82E-03 |
| 28 | 20 | 93005 | P | IGS | 140825 | IGS | 5.82E-03 |
| 29 | 20 | 140808 | F | IGS | 140825 | IGS | 5.82E-03 |
| 30 | 19 | 11760 | P | IGS | 80056 | IGS | 2.33E-02 |
| 31 | 19 | 43979 | F | IGS | 43997 | IGS | 2.33E-02 |
| 32 | 19 | 58511 | R | IGS | 58511 | IGS | 2.33E-02 |
| 33 | 19 | 108471 | R | | 108471 | | 2.33E-02 |
| 34 | 19 | 120905 | R | IGS | 120905 | IGS | 2.33E-02 |
| 35 | 19 | 123419 | F | | 123443 | | 2.33E-02 |
| 36 | 18 | 235 | P | IGS | 269 | IGS | 9.32E-02 |
| 37 | 18 | 5052 | F | IGS | 5069 | IGS | 9.32E-02 |
| 38 | 18 | 7694 | F | | 34882 | | 9.32E-02 |
| 39 | 18 | 7758 | P | IGS | 67000 | IGS | 9.32E-02 |
| 40 | 18 | 11761 | F | | 54061 | IGS | 9.32E-02 |
| 41 | 18 | 21857 | F | | 27884 | IGS | 9.32E-02 |
| 42 | 18 | 30254 | P | IGS | 30254 | IGS | 9.32E-02 |
| 43 | 18 | 30895 | R | IGS | 30895 | IGS | 9.32E-02 |
| 44 | 18 | 37376 | F | | 39591 | | 9.32E-02 |
| 45 | 18 | 37925 | F | | 40149 | | 9.32E-02 |
| 46 | 18 | 46008 | P | IGS | 46008 | IGS | 9.32E-02 |
| 47 | 18 | 50038 | C | IGS | 80055 | IGS | 9.32E-02 |
| 48 | 18 | 53019 | F | | 125148 | | 9.32E-02 |
| 49 | 18 | 54060 | F | IGS | 106779 | IGS | 9.32E-02 |
Figure 3. Number of different repeats in the four chloroplast genomes of Acanthaceae. P = palindromic, F = forward, R = reverse, and C = complement.
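The four repeat classes named in the caption describe how the second copy of a repeat relates to the first. The sketch below assumes the commonly used definitions (forward: identical; reverse: read backwards; complement: bases complemented; palindromic: reverse complement) and exact matching only. It is an illustration with a hypothetical function name, not the repeat-finding method used in the study, and real repeat finders typically tolerate a few mismatches.

```python
# Translation table that swaps each base for its complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def classify_repeat(first, second):
    """Return 'F', 'P', 'R' or 'C' for an exact repeat pair, or None.

    When a pair satisfies more than one definition (for example a
    homopolymer), the first matching class in the order F, P, R, C wins.
    """
    first, second = first.upper(), second.upper()
    if second == first:
        return "F"  # forward: same sequence, same orientation
    if second == first[::-1].translate(COMPLEMENT):
        return "P"  # palindromic: reverse complement
    if second == first[::-1]:
        return "R"  # reverse: same bases read backwards
    if second == first.translate(COMPLEMENT):
        return "C"  # complement: complemented, same direction
    return None

# Toy sequences (assumption), one per class.
print(classify_repeat("AAGCT", "AAGCT"))
print(classify_repeat("AAGCT", "AGCTT"))
print(classify_repeat("AAGCT", "TCGAA"))
print(classify_repeat("AAGCT", "TTCGA"))
```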
Simple sequence repeats in the chloroplast genome of J. flava.
| Repeat | Length (bp) | Number | Start position |
|---|---|---|---|
| A | 8 | 17 | 1,941; 4,077; 7,866; 11,688; 13,037; 17,946; 41,684; 43,752; 51,259; 54,200; 68,846; 66,730; 95,631; 110,063; 111,010; 113,860; 118,261; 150,742; 150,809 |
| | 11 | 3 | 7,592; 14,787; 15,599 |
| | 10 | 3 | 14,665; 21,867; 28,269 |
| | 9 | 8 | 27,894; 43,551; 45,670; 63,290; 88,481; 114,354; 132,510; 150,784 |
| | 14 | 1 | 80,057 |
| | 13 | 1 | 127,037 |
| C | 12 | | 4,487 |
| G | 8 | 2 | 57,723; 74,644 |
| T | 8 | 25 | 7,383; 25,533; 32,688; 59,519; 60,372; 65,495; 66,271; 66,377; 68,096; 68,556; 74,497; 81,050; 82,410; 82,710; 83,018; 83,085; 109,106; 109,136; 112,173; 112,563; 113,364; 123,499; 123,541; 125,326; 138,196 |
| | 10 | 4 | 9,482; 30,515; 53,621; 123,624 |
| | 15 | 1 | 11,766 |
| | 9 | 15 | 12,488; 15,906; 17,805; 70,662; 74,580; 75,746; 81,096; 83,050; 101,316; 109,159; 121,849; 123,004; 123,606; 123,638; 145,345 |
| | 11 | 3 | 30,883; 31,611; 123,292 |
| | 12 | 3 | 35,533; 77,108; 124,143 |
| | 14 | 2 | 50,040; 54,066 |
| | 13 | 2 | 106,785; 123,755 |
| AT | 6 | 1 | 7,215 |
| | 5 | 1 | 20,206 |
| TA | 5 | 2 | 19,175; 30,628 |
| TTC | 4 | 1 | 34,429 |
| TAT | 4 | 1 | 84,991 |
| TGA | 4 | 1 | 90,256 |
| TCT | 5 | 1 | 124,548 |
| ATC | 4 | 1 | 143,566 |
| ATA | 4 | 1 | 148,832 |
| TAA | 4 | 1 | 62,980 |
| TTTC | 3 | 1 | 5,222 |
| ATTG | 3 | 1 | 5,410 |
| ATAA | 3 | 1 | 58,759 |
| TAAA | 4 | 1 | 66,121 |
| AAAC | 3 | 1 | 67,128 |
| AATA | 3 | 1 | 112,973 |
| AATC | 3 | 1 | 118,083 |
| AATT | 3 | 1 | 122,503 |
| CAATA | 3 | 1 | 30,293 |
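A table of this kind can be produced by scanning the sequence for tandem runs of short motifs. The sketch below is a minimal regular-expression scanner and not the tool used in the study. The minimum repeat counts (8 for mono-, 5 for di-, 4 for tri-, 3 for tetra- and pentanucleotides) are assumptions chosen to match the smallest values that appear in the table, and `find_ssrs` is a hypothetical name.

```python
import re

# Assumed minimum number of tandem copies per motif size (bp).
MIN_REPEATS = {1: 8, 2: 5, 3: 4, 4: 3, 5: 3}

def is_primitive(motif):
    """True if the motif is not itself a repetition of a shorter unit
    (so 'AA' or 'ATAT' are rejected, 'AT' is kept)."""
    n = len(motif)
    return not any(
        n % k == 0 and motif == motif[:k] * (n // k) for k in range(1, n)
    )

def find_ssrs(sequence):
    """Return sorted (1-based start, motif, copy number) tuples."""
    sequence = sequence.upper()
    hits = []
    for size, min_copies in MIN_REPEATS.items():
        # One motif of `size` bases followed by at least min_copies - 1 copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (size, min_copies - 1))
        pos = 0
        while True:
            match = pattern.search(sequence, pos)
            if match is None:
                break
            motif = match.group(1)
            if is_primitive(motif):
                hits.append((match.start() + 1, motif, len(match.group(0)) // size))
                pos = match.end()
            else:
                # Skip runs already reported under a shorter motif.
                pos = match.start() + 1
    return sorted(hits)

# Toy sequence (assumption): a 9 bp A run and five copies of AT.
toy = "GC" + "A" * 9 + "GC" + "AT" * 5 + "G"
print(find_ssrs(toy))
```

Start positions are reported as 1-based coordinates, which is the convention the table appears to follow.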
Figure 4. Frequency of different SSR motifs in different repeat types in the J. flava chloroplast genome.
Figure 5. Number of SSR types in the complete genome, protein coding regions, and noncoding genes.
Figure 6. Number of different SSR types in the four chloroplast genomes of Acanthaceae.
Figure 7. Sequence alignment of four chloroplast genomes in the Acanthaceae family performed with mVISTA using the annotation of J. flava as reference. The top arrow shows the transcription direction; blue indicates protein coding regions, pink indicates conserved noncoding sequences (CNS), and light green indicates tRNAs and rRNAs. The x-axis represents the coordinates in the cp genome, while the y-axis represents percentage identity within 50-100%.
Figure 8. Comparison of the borders of the IR, SSC, and LSC regions among the four chloroplast genomes of Acanthaceae.
Figure 9. The synonymous (dS) and dN/dS ratio values of 78 protein coding genes from four Acanthaceae cp genomes (Jf: J. flava; Rb: R. breedlovei; El: E. longzhouensis).
Figure 10. Phylogenetic tree reconstruction of 9 taxa based on the complete chloroplast genome using Bayesian Inference (BI) and Maximum Parsimony (MP) methods, showing relationships within the four species of Acanthaceae. The numbers at the branch nodes represent bootstrap percentage (BP)/posterior probability (PP).