Sajjad Asaf, Abdul Latif Khan, Adil Khan, Arif Khan, Gulzar Khan, In-Jung Lee, Ahmed Al-Harrasi.
Abstract
Plantago ovata (Plantaginaceae) is an economically and medicinally important species, but little is known about its genomics and evolution. Here, we report the first complete plastome of P. ovata and compare it with previously published plastomes of related species from Plantaginaceae. The P. ovata plastome is 162,116 bp long and has the typical quadripartite structure, with a large single-copy (LSC) region of 82,084 bp and a small single-copy (SSC) region of 5,272 bp. The genome has a markedly enlarged inverted repeat (IR) of 37.4 kb, suggesting a large-scale inversion of 13.8 kb within the expanded IR regions. In addition, the P. ovata plastome contains 149 different genes, including 43 tRNA, 8 rRNA, and 98 protein-coding genes. The analysis revealed 139 microsatellites, of which 71 were in non-coding regions. Approximately 32 forward, 34 tandem, and 17 palindromic repeats were detected. The complete genome sequences, 72 shared genes, the matK gene, and the rbcL gene from related species produced the same phylogenetic signal, and phylogenetic analysis showed that P. ovata forms a single clade with P. maritima and P. media. Divergence time estimation in BEAST indicated that P. ovata diverged from P. maritima and P. media about 11.0 million years ago (Mya; 95% highest posterior density, 10.06-12.25 Mya). In conclusion, P. ovata showed significant variation in the IR region, suggesting a more stable P. ovata plastome than that of other Plantaginaceae species.
Year: 2020 PMID: 32127603 PMCID: PMC7054531 DOI: 10.1038/s41598-020-60803-y
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Summary of complete chloroplast genomes.
| Feature | Plantago ovata | Plantago media | Plantago maritima | Veronica nakaiana | Veronica persica | Veronicastrum sibiricum | Digitalis lanata |
|---|---|---|---|---|---|---|---|
| Size (bp) | 162116 | 164130 | 158358 | 152319 | 150198 | 152930 | 153108 |
| Overall GC content (%) | 38.1 | 38.0 | 38.6 | 37.9 | 37.9 | 38.3 | 38.6 |
| LSC size in bp | 82084 | 82757 | 82222 | 83194 | 81849 | 83615 | 83934 |
| SSC size in bp | 5272 | 4577 | 8665 | 17702 | 17419 | 17801 | 17688 |
| IR size in bp | 37380 | 38398 | 33736 | 25711 | 25465 | 25757 | 25743 |
| Protein coding regions size in bp | 76904 | 88383 | 85374 | 80376 | 79587 | 80142 | 78693 |
| tRNA size in bp | 3211 | 2871 | 2942 | 2798 | 3153 | 2803 | 2777 |
| rRNA size in bp | 9048 | 9062 | 9058 | 9051 | 9051 | 9050 | 9052 |
| Number of genes | 149 | 140 | 137 | 133 | 130 | 131 | 130 |
| Number of protein-coding genes | 98 | 94 | 90 | 88 | 86 | 86 | 85 |
| Number of rRNA | 8 | 8 | 8 | 8 | 8 | 8 | 8 |
| Number of tRNA | 43 | 38 | 39 | 37 | 36 | 37 | 37 |
| Genes with introns | 16 | 16 | 13 | 15 | 14 | 15 | 15 |
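A quadripartite plastome consists of one LSC region, one SSC region, and two copies of the IR, so the reported total size should equal LSC + SSC + 2 × IR. The sketch below runs that arithmetic check on the values transcribed from the summary table and also reports the share of each genome taken up by the two IR copies. The helper names `quadripartite_gap` and `ir_fraction` are illustrative and do not come from the article. Any non-zero gap is best checked against the deposited sequence record rather than read as a structural difference.

```python
# Values transcribed from the summary table (all lengths in bp).
PLASTOMES = {
    "Plantago ovata":          {"size": 162116, "lsc": 82084, "ssc": 5272,  "ir": 37380},
    "Plantago media":          {"size": 164130, "lsc": 82757, "ssc": 4577,  "ir": 38398},
    "Plantago maritima":       {"size": 158358, "lsc": 82222, "ssc": 8665,  "ir": 33736},
    "Veronica nakaiana":       {"size": 152319, "lsc": 83194, "ssc": 17702, "ir": 25711},
    "Veronica persica":        {"size": 150198, "lsc": 81849, "ssc": 17419, "ir": 25465},
    "Veronicastrum sibiricum": {"size": 152930, "lsc": 83615, "ssc": 17801, "ir": 25757},
    "Digitalis lanata":        {"size": 153108, "lsc": 83934, "ssc": 17688, "ir": 25743},
}


def quadripartite_gap(rec):
    """Reported size minus (LSC + SSC + 2 * IR), in bp; 0 means the row is consistent."""
    return rec["size"] - (rec["lsc"] + rec["ssc"] + 2 * rec["ir"])


def ir_fraction(rec):
    """Fraction of the reported genome size occupied by the two IR copies."""
    return 2 * rec["ir"] / rec["size"]


if __name__ == "__main__":
    for name, rec in PLASTOMES.items():
        gap = quadripartite_gap(rec)
        share = ir_fraction(rec)
        print(f"{name:24s} gap = {gap:+d} bp   IR share = {share:.1%}")
```

Reading the IR share across the seven rows puts the expanded IRs of the three Plantago plastomes beside the roughly 25.5-25.8 kb IRs of the other four species.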
Figure 1. Gene map of the P. ovata plastome. Genes drawn inside the circle are transcribed clockwise, and those outside the circle are transcribed counterclockwise. Red and green asterisks indicate intron-containing and trans-spliced genes, respectively. Genes belonging to different functional groups are colour-coded. The darker grey in the inner circle corresponds to GC content, and the lighter grey corresponds to AT content.
Genes in the sequenced P. ovata chloroplast genome.
| Category | Group of genes | Name of genes |
|---|---|---|
| Self-replication | Large subunit of ribosomal proteins | |
| | Small subunit of ribosomal proteins | |
| | DNA-dependent RNA polymerase | |
| | rRNA genes | |
| | tRNA genes | |
| Photosynthesis | Photosystem I | |
| | Photosystem II | |
| | NADH oxidoreductase | |
| | Cytochrome b6/f complex | |
| | ATP synthase | |
| | Rubisco | |
| Other genes | Maturase | |
| | Protease | |
| | Envelope membrane protein | |
| | Subunit of acetyl-CoA carboxylase | |
| | c-type cytochrome synthesis gene | |
| Unknown | Conserved open reading frames | |
*Genes containing introns; ᵃduplicated genes (genes present in the IR regions).
The genes with introns in the P. ovata chloroplast genome and the length of exons and introns.
| Gene | Location | Exon I (bp) | Intron I (bp) | Exon II (bp) | Intron II (bp) | Exon III (bp) |
|---|---|---|---|---|---|---|
| | LSC | 141 | 706 | 411 | | |
| | LSC | 6 | 715 | 642 | | |
| | LSC | 12 | 693 | 474 | | |
| | IR | 391 | 676 | 438 | | |
| | LSC | 9 | 1602 | 393 | | |
| | LSC | 40 | 862 | 227 | | |
| | LSC | 453 | 741 | 1611 | | |
| | | 114 | — | 232 | 535 | 26 |
| | LSC | 69 | 727 | 291 | 567 | 237 |
| | IR | 552 | 1073 | 531 | | |
| | IR | 726 | 675 | 753 | | |
| | LSC | 124 | 713 | 230 | 740 | 150 |
| | IR | 38 | 815 | 35 | | |
| | IR | 42 | 805 | 35 | | |
| | LSC | 37 | 507 | 50 | | |
| | LSC | 37 | 2434 | 35 | | |
| | LSC | 37 | 483 | 37 | | |
Figure 2. Alignment visualization of the P. ovata plastome sequence. VISTA-based identity plot showing sequence identity among the seven species, using P. ovata as the reference. The vertical scale indicates percent identity, ranging from 50% to 100%. The horizontal axis indicates the coordinates within the chloroplast genome. Arrows indicate the annotated genes and their transcription direction. The thick black lines show the inverted repeats (IRs).
Figure 3. Pairwise sequence distance between P. ovata and related species across 72 shared genes.
Figure 4. Analysis of simple sequence repeats (SSRs) in the seven Plantaginaceae plastomes. (A) Number of SSRs detected in the seven species; (B) Frequency of identified SSR motifs in different repeat class types; (C) Frequency of identified SSRs in coding, non-coding, rRNA and tRNA regions; (D) Frequency of identified SSRs in LSC, SSC and IR regions.
Figure 5. Analysis of repeated sequences in the seven Plantaginaceae plastomes. (A) Total numbers of the three repeat types; (B) Number of palindromic repeats by length; (C) Number of tandem repeats by length; (D) Number of forward repeats by length.
Figure 6. Comparison of border distance between adjacent genes and junctions of the LSC, SSC, and two IR regions among the plastomes of P. ovata and its relatives. Boxes above or below the main line indicate the adjacent border genes. The figure is not to scale with respect to sequence length and only shows relative changes at or near the IR/LSC or IR/SSC borders.
Figure 7. Phylogenetic trees were constructed for thirty-five species from eight families, representing 22 genera, using different methods; the tree shown is the ML tree for the whole-genome sequence data set. The whole-genome data set was analysed with four methods: Bayesian inference (BI), maximum likelihood (ML), maximum parsimony (MP) and neighbour-joining (NJ). Numbers above the branches are the BI posterior probabilities and the ML, MP and NJ bootstrap values, respectively. Black dots mark the position of P. ovata.