Harpreet Kaur, Bhupinder Pal Singh, Harpreet Singh, Avinash Kaur Nagpal.
Abstract
The availability of complete plastid genomes for ten solanaceous species (Atropa belladonna, Capsicum annuum, Datura stramonium, Nicotiana sylvestris, Nicotiana tabacum, Nicotiana tomentosiformis, Nicotiana undulata, Solanum bulbocastanum, Solanum lycopersicum, and Solanum tuberosum) allowed us to conduct an in-depth in silico comparative analysis. The complete chloroplast genomes and the LSC and SSC regions of the three Solanum species are smaller than those of any other species studied to date (exception: the SSC region of A. belladonna). The AT content of coding regions was found to be lower than that of noncoding regions. A duplicate copy of the trnH gene in C. annuum and two alternative tRNA genes for proline in D. stramonium were observed for the first time in this analysis. Further, a homology search revealed the presence of an rps19 pseudogene and infA genes in A. belladonna and D. stramonium, a region identical to the rps19 pseudogene in C. annuum, and orthologues of the sprA gene in another six species. Among the eighteen intron-containing genes, three have two introns and fifteen have one intron. The longest insertion was found in the accD gene of C. annuum. Phylogenetic analysis using concatenated protein-coding sequences gave two clades, one for the Nicotiana species and another for Solanum, Capsicum, Atropa, and Datura.
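The AT-content comparison between coding and noncoding regions can be illustrated with a minimal sketch. The function name `at_content` and the two toy sequences are hypothetical and are not taken from the study; the sketch only shows how such a percentage is typically computed, ignoring ambiguous bases.

```python
def at_content(seq: str) -> float:
    """Return the percentage of A and T among the unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    counts = {base: seq.count(base) for base in "ACGT"}
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return 100.0 * (counts["A"] + counts["T"]) / total


# Hypothetical toy sequences, for illustration only (not real plastid data).
toy_coding = "ATGGCCGATTGCAAGTAA"
toy_noncoding = "ATTTAAATATTTGCAATA"

print(f"coding:    {at_content(toy_coding):.2f}% AT")
print(f"noncoding: {at_content(toy_noncoding):.2f}% AT")
```

In practice the coding and noncoding sequences would be extracted from the annotated genome records before the percentages are compared.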
Year: 2014 PMID: 25477958 PMCID: PMC4248371 DOI: 10.1155/2014/424873
Source DB: PubMed Journal: Adv Bioinformatics ISSN: 1687-8027
Properties of the solanaceous chloroplast genomes.
| Property | ABE | CAN | DST | NSY | NTA | NTO | NUN | SBU | SLY | STU |
|---|---|---|---|---|---|---|---|---|---|---|
| Genome size (bp) | 156687 | 156781 | 155871 | 155941 | 155943 | 155745 | 155863 | 155371 | 155461 | 155296 |
| LSC (bp) (coordinates)* | 86869 (1–86869) | 87366 (1–87366) | 86297 (1–86297) | 86684 (1–86684) | 86686 (1–86686) | 86392 (1–86392) | 86633 (1–86633) | 85785 (1–85785) | 85882 (1–85882) | 85737 (1–85737) |
| IRB (bp) (coordinates)* | 25905 (86870–112774) | 25783 (87367–113149) | 25563 (86298–111860) | 25342 (86685–112026) | 25343 (86687–112029) | 25429 (86393–111821) | 25331 (86634–111964) | 25588 (85786–111373) | 25608 (85883–111490) | 25593 (85738–111330) |
| SSC (bp) (coordinates)* | 18008 (112775–130782) | 17849 (113150–130998) | 18448 (111861–130308) | 18573 (112027–130599) | 18571 (112030–130600) | 18495 (111822–130316) | 18568 (111965–130532) | 18381 (111374–129754) | 18363 (111491–129853) | 18373 (111331–129703) |
| IRA (bp) (coordinates)* | 25905 (130783–156687) | 25783 (130999–156781) | 25563 (130309–155871) | 25342 (130600–155941) | 25343 (130601–155943) | 25429 (130317–155745) | 25331 (130533–155863) | 25588 (129755–155342) | 25608 (129854–155461) | 25593 (129704–155296) |
| Coding regions (%) | 58.89 | 58.50 | 59.19 | 61.49 | 61.12 | 61.58 | 63.12 | 58.52 | 58.91 | 58.45 |
| Introns (%) | 12.51 | 12.71 | 11.62 | 12.70 | 12.70 | 12.68 | 12.69 | 12.82 | 12.47 | 12.49 |
| Intergenic regions (%) | 28.60 | 28.79 | 29.19 | 25.81 | 26.18 | 25.73 | 24.19 | 28.66 | 28.62 | 29.06 |
| AT content (%) | | | | | | | | | | |
| Overall | 62.44 | 62.27 | 62.12 | 62.15 | 62.15 | 62.21 | 62.12 | 62.12 | 62.14 | 62.12 |
| Coding regions | 59.86 | 59.68 | 59.65 | 59.85 | 59.79 | 59.79 | 59.70 | 59.61 | 59.65 | 59.59 |
| Noncoding regions | 66.13 | 65.93 | 65.72 | 65.84 | 65.87 | 66.09 | 66.27 | 65.66 | 65.71 | 65.68 |
| tRNAs | 47.70 | 47.38 | 47.08 | 47.06 | 47.05 | 47.10 | 47.08 | 47.12 | 47.01 | 47.06 |
| rRNAs | 44.64 | 44.73 | 44.63 | 44.64 | 44.64 | 44.64 | 44.64 | 44.66 | 44.66 | 44.65 |
| Protein-coding genes | 62.01 | 61.83 | 61.79 | 61.91 | 61.86 | 61.84 | 61.68 | 61.76 | 61.80 | 61.74 |
| LSC | 64.37 | 64.25 | 64.04 | 64.05 | 64.05 | 64.12 | 64.01 | 63.99 | 64.01 | 63.99 |
| SSC | 68.35 | 67.99 | 67.72 | 67.94 | 67.93 | 68.03 | 67.87 | 67.87 | 67.97 | 67.91 |
| IR | 57.14 | 56.94 | 56.87 | 56.78 | 56.78 | 56.84 | 56.78 | 56.93 | 56.91 | 56.90 |
ABE: Atropa belladonna, CAN: Capsicum annuum, DST: Datura stramonium, NSY: Nicotiana sylvestris, NTA: Nicotiana tabacum, NTO: Nicotiana tomentosiformis, NUN: Nicotiana undulata, SBU: Solanum bulbocastanum, SLY: Solanum lycopersicum, STU: Solanum tuberosum, LSC: large single copy region, SSC: small single copy region, and IR: inverted repeat region.
*Start and end nucleotide positions of the region in the genome.
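Because a plastid genome is partitioned into LSC, IRB, SSC, and IRA, the four region lengths in the table should add up to the genome size. The sketch below is a simple consistency check on the values as they are listed in the table above; the names `REGIONS` and `region_mismatches` are hypothetical and are not part of the original analysis.

```python
# Species code -> (genome size, LSC, IRB, SSC, IRA), all in bp,
# copied from the table of chloroplast genome properties.
REGIONS = {
    "ABE": (156687, 86869, 25905, 18008, 25905),
    "CAN": (156781, 87366, 25783, 17849, 25783),
    "DST": (155871, 86297, 25563, 18448, 25563),
    "NSY": (155941, 86684, 25342, 18573, 25342),
    "NTA": (155943, 86686, 25343, 18571, 25343),
    "NTO": (155745, 86392, 25429, 18495, 25429),
    "NUN": (155863, 86633, 25331, 18568, 25331),
    "SBU": (155371, 85785, 25588, 18381, 25588),
    "SLY": (155461, 85882, 25608, 18363, 25608),
    "STU": (155296, 85737, 25593, 18373, 25593),
}


def region_mismatches(table):
    """Return {species: genome size minus sum of regions} for inconsistent rows."""
    mismatches = {}
    for species, (genome, lsc, irb, ssc, ira) in table.items():
        total = lsc + irb + ssc + ira
        if total != genome:
            mismatches[species] = genome - total
    return mismatches


print(region_mismatches(REGIONS))
```

With the values as listed here, nine species sum exactly to their genome size. The SBU column sums to 155342 bp against a listed genome size of 155371 bp, so that column is worth checking against the corresponding sequence record.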
The lengths of introns and exons for the split genes of ten solanaceous species.
| Gene (region) | Exon/intron | ABE | CAN | DST | NSY | NTA | NTO | NUN | SBU | SLY | STU |
|---|---|---|---|---|---|---|---|---|---|---|---|
| trnK-UUU (LSC) | Exon I | 37 | 37 | 37 | 37 | 37 | 37 | 37 | 37 | 37 | 37 |
| | Intron I | 2519 | 2500 | 2506 | 2526 | 2526 | 2526 | 2521 | 2501 | 2514 | 2512 |
| | Exon II | 36 | 35 | 35 | 35 | 35 | 35 | 35 | 35 | 35 | 35 |
| rps16 (LSC) | Exon I | 40 | 40 | 40 | 40 | 40 | 40 | 40 | 40 | 40 | 40 |
| | Intron I | 822 | 865 | 866 | 860 | 860 | 860 | 859 | 855 | 864 | 855 |
| | Exon II | 227 | 227 | 227 | 218 | 218 | 218 | 218 | 227 | 227 | 227 |
| trnG-UCC (LSC) | Exon I | 23 | 23 | 23 | 23 | 23 | 23 | 23 | 23 | 23 | 23 |
| | Intron I | 692 | 692 | 694 | 692 | 692 | 690 | 691 | 701 | 695 | 692 |
| | Exon II | 48 | 48 | 48 | 48 | 48 | 48 | 48 | 37 | 48 | 48 |
| atpF (LSC) | Exon I | 145 | 145 | 145 | 145 | 145 | 145 | 145 | 144 | 144 | 145 |
| | Intron I | 715 | 693 | 700 | 695 | 695 | 692 | 692 | 693 | 686 | 693 |
| | Exon II | 410 | 410 | 410 | 410 | 410 | 410 | 410 | 411 | 411 | 410 |
| rpoC1 (LSC) | Exon I | 432 | 453 | 453 | 453 | 453 | 432 | 453 | 453 | 453 | 453 |
| | Intron I | 737 | 742 | 737 | 737 | 737 | 709 | 733 | 737 | 737 | 737 |
| | Exon II | 1614 | 1614 | 1614 | 1614 | 1614 | 1614 | 1623 | 1614 | 1614 | 1614 |
| ycf3 (LSC) | Exon I | 124 | 124 | 124 | 124 | 124 | 124 | 124 | 124 | 124 | 124 |
| | Intron I | 739 | 742 | 740 | 739 | 738 | 731 | 735 | 730 | 729 | 727 |
| | Exon II | 230 | 230 | 230 | 230 | 230 | 230 | 230 | 230 | 230 | 230 |
| | Intron II | 763 | 744 | 753 | 783 | 783 | 779 | 781 | 750 | 750 | 750 |
| | Exon III | 153 | 153 | 159 | 153 | 153 | 153 | 153 | 153 | 153 | 153 |
| trnL-UAA (LSC) | Exon I | 35 | 35 | 35 | 35 | 35 | 35 | 35 | 37 | 35 | 35 |
| | Intron I | 497 | 426 | 501 | 503 | 503 | 497 | 498 | 502 | 497 | 497 |
| | Exon II | 50 | 50 | 50 | 50 | 50 | 50 | 50 | 50 | 50 | 50 |
| trnV-UAC (LSC) | Exon I | 38 | 38 | 38 | 38 | 38 | 38 | 38 | 38 | 38 | 38 |
| | Intron I | 572 | 575 | 569 | 571 | 571 | 572 | 573 | 569 | 571 | 571 |
| | Exon II | 35 | 35 | 37 | 35 | 35 | 35 | 35 | 37 | 35 | 35 |
| rps12* | Exon I | 114 | 114 | 114 | 114 | 114 | 114 | 114 | 114 | 114 | 114 |
| | Intron I | — | — | — | — | — | — | — | — | — | — |
| | Exon II | 232 | 232 | 232 | 232 | 232 | 232 | 232 | 232 | 232 | 232 |
| | Intron II | 535 | 536 | 536 | 536 | 536 | 536 | 536 | 536 | 536 | 536 |
| | Exon III | 26 | 26 | 26 | 26 | 26 | 26 | 26 | 26 | 26 | 26 |
| clpP (LSC) | Exon I | 71 | 71 | 71 | 71 | 71 | 71 | 71 | 71 | 71 | 71 |
| | Intron I | 799 | 811 | 792 | 807 | 807 | 789 | 789 | 789 | 798 | 789 |
| | Exon II | 292 | 292 | 292 | 292 | 292 | 292 | 292 | 292 | 292 | 292 |
| | Intron II | 622 | 626 | 624 | 637 | 637 | 634 | 631 | 625 | 617 | 620 |
| | Exon III | 228 | 228 | 234 | 228 | 228 | 228 | 228 | 234 | 258 | 234 |
| petB (LSC) | Exon I | 6 | 6 | 6 | 6 | 6 | 6 | 6 | 6 | 6 | 6 |
| | Intron I | 759 | 755 | 746 | 753 | 753 | 753 | 753 | 747 | 747 | 747 |
| | Exon II | 642 | 642 | 642 | 642 | 642 | 642 | 642 | 642 | 642 | 642 |
| petD (LSC) | Exon I | 8 | 8 | 9 | 8 | 8 | 8 | 8 | 6 | 8 | 8 |
| | Intron I | 742 | 742 | 748 | 742 | 742 | 742 | 742 | 739 | 738 | 739 |
| | Exon II | 475 | 475 | 474 | 475 | 475 | 475 | 475 | 477 | 475 | 475 |
| rpl16 (LSC) | Exon I | 9 | 9 | 9 | 9 | 9 | 9 | 9 | 9 | 9 | 9 |
| | Intron I | 1019 | 1026 | 1025 | 1020 | 1020 | 1021 | 1020 | 1014 | 1018 | 1014 |
| | Exon II | 396 | 396 | 396 | 396 | 396 | 396 | 396 | 396 | 396 | 396 |
| rpl2 (IR) | Exon I | 391 | 391 | 393 | 391 | 391 | 391 | 391 | 390 | 391 | 391 |
| | Intron I | 664 | 665 | 669 | 666 | 666 | 666 | 666 | 666 | 666 | 666 |
| | Exon II | 434 | 434 | 429 | 434 | 434 | 434 | 434 | 435 | 434 | 434 |
| ndhB (IR) | Exon I | 777 | 777 | 777 | 777 | 777 | 777 | 777 | 777 | 777 | 777 |
| | Intron I | 679 | 679 | 679 | 679 | 679 | 679 | 679 | 679 | 679 | 679 |
| | Exon II | 756 | 756 | 756 | 756 | 756 | 756 | 756 | 756 | 756 | 756 |
| trnI-GAU (IR) | Exon I | 37 | 37 | 42 | 37 | 37 | 37 | 37 | 42 | 37 | 37 |
| | Intron I | 717 | 722 | 717 | 707 | 707 | 716 | 716 | 717 | 722 | 722 |
| | Exon II | 34 | 35 | 35 | 35 | 35 | 35 | 35 | 35 | 35 | 35 |
| trnA-UGC (IR) | Exon I | 38 | 38 | 38 | 38 | 38 | 38 | 38 | 38 | 38 | 38 |
| | Intron I | 681 | 811 | 811 | 709 | 709 | 709 | 709 | 811 | 811 | 811 |
| | Exon II | 35 | 35 | 35 | 35 | 35 | 35 | 35 | 35 | 35 | 35 |
| ndhA (SSC) | Exon I | 553 | 553 | 552 | 553 | 553 | 553 | 553 | 552 | 553 | 553 |
| | Intron I | 1150 | 1157 | 1154 | 1148 | 1148 | 1149 | 1148 | 1158 | 1133 | 1158 |
| | Exon II | 539 | 539 | 537 | 539 | 539 | 539 | 539 | 540 | 539 | 539 |
*rps12 is a divided gene: its 3′ portion is located in the IR region, while its 5′ portion is located in the LSC region.
ABE: Atropa belladonna, CAN: Capsicum annuum, DST: Datura stramonium, NSY: Nicotiana sylvestris, NTA: Nicotiana tabacum, NTO: Nicotiana tomentosiformis, NUN: Nicotiana undulata, SBU: Solanum bulbocastanum, SLY: Solanum lycopersicum, and STU: Solanum tuberosum.
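The intron counts stated in the abstract (eighteen intron-containing genes, three with two introns and fifteen with one) can be cross-checked against the table of split genes. The sketch below simply records how many intron rows each gene has in that table; the dictionary name `INTRONS_PER_GENE` is hypothetical, and rps12 is counted with two introns because the table lists both Intron I and Intron II for it, even though no length is given for Intron I.

```python
from collections import Counter

# Gene -> number of intron rows listed for it in the table of split genes.
INTRONS_PER_GENE = {
    "trnK-UUU": 1, "rps16": 1, "trnG-UCC": 1, "atpF": 1, "rpoC1": 1,
    "ycf3": 2, "trnL-UAA": 1, "trnV-UAC": 1, "rps12": 2, "clpP": 2,
    "petB": 1, "petD": 1, "rpl16": 1, "rpl2": 1, "ndhB": 1,
    "trnI-GAU": 1, "trnA-UGC": 1, "ndhA": 1,
}

# Tally how many genes carry one intron and how many carry two.
tally = Counter(INTRONS_PER_GENE.values())
two_intron_genes = sorted(g for g, n in INTRONS_PER_GENE.items() if n == 2)

print(len(INTRONS_PER_GENE), "intron-containing genes")
print(tally[1], "genes with one intron;", tally[2], "genes with two introns")
print("two-intron genes:", ", ".join(two_intron_genes))
```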
InDels in nucleotide sequences of 9 genes of ten solanaceous plastid genomes.
| S. number | Gene (region) | Total number of InDels | InDel length (bp) |
|---|---|---|---|
| 1 | accD (LSC) | 4 | 24, 9, 141, 6 |
| 2 | clpP (LSC) | 24 | 8(I), 14(I), 13(I), 7(I), 1(I), 2–3(I), 7(I), 1–7(I), 3(I), 2(I), 3(I), 1–7(I), 1–3(I), 1(I), 1(I), 1(I), 1–5(I), 4–7(I), 1(I), 9(I), 1–2(I), 3(I), 5(I), 24–30 |
| 3 | ndhA (SSC) | 14 | 9(I), 5–6(I), 3(I), 1(I), 9(I), 3(I), 4(I), 1–4(I), 1–2(I), 1–23(I), 1–2(I), 2(I), 1(I), 3(I) |
| 4 | rpl32 (SSC) | 2 | 2–3, 4 |
| 5 | rps16 (LSC) | 11 | 1–38, 9(I), 1(I), 1(I), 5(I), 1–2(I), 5(I), 4(I), 6(I), 1(I), 9 |
| 6 | sprA (SSC) | 2 | 109, 7 |
| 7 | trnA-UGC (IR) | 1 | 102–130 |
| 8 | trnL-UAA (LSC) | 4 | 1, 6, 71, 4 |
| 9 | ycf1 (SSC) | 31 | 3, 18, 18, 21, 6, 6, 48, 9, 6, 6, 42, 3, 6, 30, 3, 15, 12–39, 18, 6, 9–36, 6, 6, 6, 9, 9, 12, 6, 6, 6, 57, 6 |
Gene locations are given in parentheses (LSC, SSC, or IR). I: InDels present in introns.
InDels in amino acid sequences of 5 proteins of ten solanaceous plastid genomes.
| S. number | Protein | Total number of InDels | InDel length (amino acids) |
|---|---|---|---|
| 1 | accD | 4 | 8, 3, 47, 2 |
| 2 | clpP | 2 | 2, 10 |
| 3 | rpl32 | 1 | 1–2 |
| 4 | rps16 | 1 | 3 |
| 5 | ycf1 | 29 | 1, 6, 6, 7, 2, 2, 7, 3, 2, 2, 14, 1–10, 1, 5, 4–13, 6, 2, 3–12, 2, 2, 2, 3, 3, 4, 2, 2, 2, 19, 2 |
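An in-frame InDel in a coding sequence changes the protein by one residue per three nucleotides. As a minimal sketch of that relationship, the code below converts the accD nucleotide InDel lengths listed for the gene sequences (24, 9, 141, 6) into residue counts and compares them with the accD values in the protein table. The function name `codon_indel_lengths` is hypothetical, and the check is shown for accD only; it is not claimed to hold entry by entry for the other genes.

```python
def codon_indel_lengths(nt_lengths):
    """Convert in-frame nucleotide InDel lengths to amino acid InDel lengths."""
    aa_lengths = []
    for n in nt_lengths:
        if n % 3 != 0:
            # A length that is not a multiple of 3 would shift the reading frame.
            raise ValueError(f"InDel of {n} nt is not in frame")
        aa_lengths.append(n // 3)
    return aa_lengths


ACCD_NT = [24, 9, 141, 6]   # accD InDel lengths in the nucleotide alignment
ACCD_AA = [8, 3, 47, 2]     # accD InDel lengths in the protein alignment

print(codon_indel_lengths(ACCD_NT) == ACCD_AA)
```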
Figure 1: Partial multiple sequence alignment of accD, clpP, ndhA, rpl32, rps16, sprA, tRNA-Ala(UGC), tRNA-Leu(UAA), and ycf1 gene sequences of ten solanaceous species showing location of InDels indicated by hyphens.
Figure 2: Partial multiple sequence alignment of amino acid sequences of genes, namely, accD, clpP, ndhA, rpl32, rps16, sprA, tRNA-Ala(UGC), tRNA-Leu(UAA), and ycf1, of ten solanaceous species showing location of InDels indicated by hyphens.
Figure 3: Maximum likelihood phylogenetic tree derived using concatenated nucleotide sequences of 75 protein-coding genes of ten solanaceous species and two outgroup species.
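The tree is built from concatenated sequences of many genes. The sketch below shows one plausible way to join per-gene alignments into a single sequence per species before tree inference. The function name `concatenate_alignments`, the gene names, and the toy sequences are all hypothetical; the software actually used in the study is not stated in this excerpt.

```python
def concatenate_alignments(alignments, gene_order):
    """Join per-gene alignments into one concatenated sequence per taxon.

    alignments: {gene: {taxon: aligned sequence}}
    gene_order: list of gene names giving the concatenation order
    """
    taxa = sorted(alignments[gene_order[0]])
    parts = {taxon: [] for taxon in taxa}
    for gene in gene_order:
        aln = alignments[gene]
        if sorted(aln) != taxa:
            raise ValueError(f"{gene}: taxon set differs from the first gene")
        if len({len(seq) for seq in aln.values()}) != 1:
            raise ValueError(f"{gene}: sequences are not aligned to equal length")
        for taxon in taxa:
            parts[taxon].append(aln[taxon])
    return {taxon: "".join(chunks) for taxon, chunks in parts.items()}


# Hypothetical toy alignments for two taxa and two genes (not real data).
TOY = {
    "geneA": {"NTA": "ATGAAA", "SLY": "ATGA-A"},
    "geneB": {"NTA": "ATGCCC", "SLY": "ATGCCA"},
}

concatenated = concatenate_alignments(TOY, ["geneA", "geneB"])
for taxon, seq in concatenated.items():
    print(taxon, seq)
```

Checking that every gene covers the same taxa and that each alignment has a uniform length keeps the columns of the concatenated matrix homologous across species.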