Wanjun Lei1, Dapeng Ni2, Yujun Wang1, Junjie Shao3, Xincun Wang3, Dan Yang3, Jinsheng Wang1, Haimei Chen3, Chang Liu3.
Abstract
Astragalus membranaceus is an important medicinal plant in Asia. Several of its varieties are used interchangeably as raw materials for commercial production, so high-resolution genetic markers are urgently needed to distinguish them. Here, we sequenced and analyzed the chloroplast genome of A. membranaceus (Fisch.) Bunge var. mongholicus (Bunge) P.K. Hsiao using next-generation DNA sequencing. The genome was assembled using ABySS, genes were predicted using CPGAVAS, and repeats were analyzed using MISA, Tandem Repeats Finder, and REPuter. Finally, the genome was subjected to phylogenetic and comparative genomic analyses. The complete genome is 123,582 bp long and contains only one copy of the inverted repeat. Gene prediction revealed 110 genes encoding 76 proteins, 30 tRNAs, and four rRNAs. Five intra-specific hypermutation loci were identified, three of which are heteroplasmic. Furthermore, three gene losses and two large inversions were identified. Comparative genomic analyses demonstrated the dynamic nature of the Papilionoideae chloroplast genomes, which show numerous hypermutation loci, frequent gene losses, and fragment inversions. These results elucidate the complex evolutionary history of chloroplast genomes and lay the foundation for identifying genetic markers to distinguish A. membranaceus varieties.
Year: 2016 PMID: 26899134 PMCID: PMC4761949 DOI: 10.1038/srep21669
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Genes predicted in the chloroplast genome of A. membranaceus.
| Category of genes | Group of genes | Name of genes |
|---|---|---|
| Self-replication | rRNA genes | |
| | tRNA genes | 30 |
| | Small subunit of ribosome | |
| | Large subunit of ribosome | |
| | DNA dependent RNA polymerase | |
| Genes for photosynthesis | Subunits of NADH-dehydrogenase | |
| | Subunits of photosystem I | |
| | Subunits of photosystem II | |
| | Subunits of cytochrome b/f complex | |
| | Subunits of ATP synthase | |
| | Subunit of rubisco | |
| Other genes | Maturase | |
| | Protease | |
| | Envelope membrane protein | |
| | Subunit of Acetyl-CoA-carboxylase | |
| | C-type cytochrome synthesis gene | |
| Genes of unknown function | Conserved open reading frames | |
*The number of asterisks after the gene names indicates the number of introns contained in the genes.
Figure 1. Schematic representation of the A. membranaceus chloroplast genome.
The predicted genes are shown, and colors represent the functional classifications listed at the bottom left. Genes drawn outside the circle are transcribed clockwise, whereas those drawn inside the circle are transcribed counter-clockwise. The inner circle shows the GC content. The large single copy (LSC), small single copy (SSC) and inverted repeat (IR) regions are shown in the inner circle. The three hypermutation regions (AB, BC and DE) are indicated with arrows.
Distribution of tri-, tetra-, and penta- nucleotide SSR loci in the chloroplast genome of A. membranaceus.
| SSR type | SSR sequence | Start | End | Location |
|---|---|---|---|---|
| tri | (TAT)4 | 4483 | 4494 | IGS |
| tri | (ATA)4 | 32341 | 32352 | IGS |
| tri | (TAT)4 | 45484 | 45495 | IGS |
| tri | (ATT)4 | 51038 | 51049 | IGS |
| tri | (TAT)4 | 53563 | 53574 | IGS |
| tri | (TAT)4 | 54247 | 54258 | IGS |
| tri | (TAT)4 | 60883 | 60894 | IGS |
| tri | (ATA)4 | 64011 | 64022 | IGS |
| tri | (TAA)4 | 83455 | 83463 | IGS |
| tri | (AAT)4 | 94710 | 94721 | IGS |
| tri | (ATA)4 | 114770 | 114781 | IGS |
| tri | (TAA)4 | 120393 | 120404 | IGS |
| tetra | (ATAG)3 | 1686 | 1697 | IGS |
| tetra | (TTTA)3 | 10123 | 10134 | IGS |
| tetra | (CTTA)3 | 47735 | 47746 | IGS |
| tetra | (ATAG)3 | 55121 | 55132 | IGS |
| tetra | (TCTT)3 | 62405 | 62416 | IGS |
| tetra | (TAAT)3 | 83444 | 83455 | IGS |
| tetra | (ATAG)3 | 90687 | 90698 | IGS |
| tetra | (AGGT)3 | 101290 | 101301 | CDS |
| tetra | (CAAA)3 | 108493 | 108504 | CDS |
| tetra | (TATT)3 | 118107 | 118118 | CDS |
| tetra | (AAAT)3 | 119565 | 119576 | IGS |
| penta | (TATAT)3 | 65384 | 65398 | IGS |
IGS: intergenic spacer region; CDS: coding sequence.
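The loci in this table share a simple pattern: a 3-, 4-, or 5-nucleotide motif repeated back to back. As a rough illustration of how such loci can be located, the Python sketch below scans a sequence with regular expressions. It is not MISA and not the authors' pipeline. The function name `find_ssrs` and the minimum copy numbers (4 for tri-, 3 for tetra- and penta-nucleotide motifs) are assumptions inferred from the smallest entries in the table, and the 1-based inclusive coordinates are chosen to match the Start and End columns.

```python
import re

# Minimum copy number per motif length. ASSUMPTION: inferred from the table
# entries such as (TAT)4, (ATAG)3 and (TATAT)3, not taken from the paper's methods.
MIN_COPIES = {3: 4, 4: 3, 5: 3}


def find_ssrs(seq, min_copies=MIN_COPIES):
    """Return (motif_length, motif, copies, start, end) tuples, sorted by start.

    Coordinates are 1-based and inclusive.
    """
    seq = seq.upper()
    hits = []
    for unit_len, copies in sorted(min_copies.items()):
        # ([ACGT]{k}) captures a motif; \1{n-1,} demands at least n-1 further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, copies - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are repeats of a shorter unit (AAA, ATAT, ...),
            # since those belong to a shorter SSR class.
            if motif in (motif + motif)[1:-1]:
                continue
            n_copies = len(match.group(0)) // unit_len
            hits.append((unit_len, motif, n_copies, match.start() + 1, match.end()))
    return sorted(hits, key=lambda hit: hit[3])


if __name__ == "__main__":
    demo = "GGC" + "TAT" * 4 + "CCT" + "ATAG" * 3 + "CC"
    for hit in find_ssrs(demo):
        print(hit)
```

A regex scan reports the leftmost phase of each repeat, so the motif it names can be a rotation of the one a dedicated tool would report; classifying hits as IGS or CDS would additionally require the gene annotation.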
Repeat sequences identified in the chloroplast genome of A. membranaceus.
| Repeat number | Repeat size (bp) | Type | Location | Repeat unit sequence |
|---|---|---|---|---|
| 1 | 52 | F | CDS | CTATGGCTGACCGATATTGCACATCATCATTTAGCTATTGCAATTCTTTTTC |
| 2 | 48 | F | IGS | CAAAAAAGAACAGGTACAAATATAAAATTGAGGTACCCATTTTATGAT |
| 3 | 41 | F | introns | TTACAGAACCGTACATGAGATTTTCACCTCATACGGCTCCT |
| 4 | 38 | F | IGS | GTCTGGATTCAAATCCTACTGAAAGGTCCAGTAGAGAT |
| 5 | 30 | F | IGS | AAATAATAATCTAATTGAAGTTTAGTAATT |
| 6 | 83 | F | IGS | TATTATAACATAACAAATTATAACATAACAAAATCATATATATAATTATCATATTATAACATAACAAATTATAACATAACAAA |
| 7 | 114 | F | IGS | TATATAATTATCATATTATAACATAACAAATTATAACATAACAAATAACATAACAAAATCATACATATAACATATAATTATCATATTATAACATAACAAATTATAACATAACAA |
| 8 | 42 | T | IGS | AAAGAGGAGGACTCAATGATT (×2) |
| 9 | 72 | T | IGS | ATTATTTATATTATATAT (×4) |
| 10 | 30 | T | IGS | AATTAATTAT (×3) |
| 11 | 280 | T | IGS | ATTATAACATAACAAAATAACATAACAAAACATACATATAATATAATTATCATATTATAACATAACAA (×4) |
| 12 | 32 | T | IGS | ATATATTATAATATAT (×2) |
| 13 | 36 | T | IGS | TAAATATTCTTATATTAC (×2) |
CDS: coding sequences; IGS: intergenic spacers.
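For the type T rows, the listed repeat size can be compared with the unit length multiplied by the copy number. The short Python sketch below does that arithmetic for repeats 8 to 13, with the values copied from the table; the names `TANDEM_REPEATS` and `size_gap` are illustrative and not part of any tool used in the paper.

```python
# Type T rows from the table: repeat number -> (reported size in bp, unit, copies).
TANDEM_REPEATS = {
    8: (42, "AAAGAGGAGGACTCAATGATT", 2),
    9: (72, "ATTATTTATATTATATAT", 4),
    10: (30, "AATTAATTAT", 3),
    11: (280, "ATTATAACATAACAAAATAACATAACAAAACATACATATAATATAATTATCATATTATAACATAACAA", 4),
    12: (32, "ATATATTATAATATAT", 2),
    13: (36, "TAAATATTCTTATATTAC", 2),
}


def size_gap(reported_bp, unit, copies):
    """Reported size minus unit length times copy number (0 means they agree)."""
    return reported_bp - len(unit) * copies


# Gap per repeat number; a nonzero value flags a row worth a second look.
gaps = {number: size_gap(*row) for number, row in TANDEM_REPEATS.items()}

if __name__ == "__main__":
    for number, gap in gaps.items():
        print(number, gap)
```

A nonzero gap does not necessarily indicate an error: Tandem Repeats Finder reports fractional copy numbers, so a rounded copy count in a table may not multiply out to the full repeat length.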
Figure 2. Alignment of sequences from the PCR products for the identification of highly polymorphic regions in the A. membranaceus chloroplast genomes.
Panels (A,B) show the sequences obtained from the region AB. Panel (C) is for the region BC. Panels (D,E) show the sequences obtained from the region DE. The ID of each sequence is shown on the left side of each panel. The ID is the concatenation of region name, plant individual ID, clone ID and primer direction (F: forward; R: reverse).
Figure 3. Molecular phylogenetic analysis of the Papilionoideae subfamily.
The tree was constructed with the sequences of 67 proteins present in all 38 species (Lupinus albus, Lupinus luteus, Robinia pseudoacacia, Lotus japonicus, Glycyrrhiza glabra, Astragalus membranaceus, Cicer arietinum, Pisum sativum, Lathyrus sativus, Trifolium repens, Trifolium meduseum, Trifolium subterraneum, Trifolium glanduliferum, Trifolium strictum, Trifolium aureum, Trifolium grandiflorum, Trifolium boissieri, Medicago truncatula, Indigofera tinctoria, Apios americana, Vigna angularis, Vigna radiata, Vigna unguiculata, Phaseolus vulgaris, Pachyrhizus erosus, Glycine max, Glycine soja, Glycine cyrtoloba, Glycine stenophita, Glycine tomentella, Glycine syndetika, Glycine canescens, Glycine dolichocarpa, Glycine falcata, Millettia pinnata, Arachis hypogaea, Arabidopsis thaliana, Nicotiana tabacum), using the Maximum Likelihood method implemented in RAxML. Two taxa, Nicotiana tabacum and Arabidopsis thaliana, were used as outgroups. The tribe to which each species belongs is shown to the right of the tree. Bootstrap support values were calculated from 1000 replicates. Genes lost in a particular branch are indicated with the following symbols: ○(rps16), ▲(ycf4), △(accD), •(rpl23) and ●(rpl33).
Gene losses in the chloroplast genomes of the Papilionoideae subfamily.
| Name of species | rpl22 | rps16 | ycf4 | accD | rpl33 | rpl23 | ndhD | psaI | rpl32 | rps18 | rps19 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Total number of missing genes | | | | | | | | | | | |
a: genes located at the boundaries of the 50 kb inversions.
Figure 4. Synteny analyses of chloroplast genomes from A. membranaceus and N. tabacum.
(A) Global synteny view; the LSC, IRa, IRb and SSC regions are shown at the bottom of the alignment. I, II, III, and IV represent the border regions of the two inversions (enlarged and shown below); (B) Detailed alignments of the border regions of the two inversions between A. membranaceus and N. tabacum. The coding regions of genes are represented by lines below the synteny maps, with their names shown on top of the lines. Blue and red indicate genes transcribed clockwise and counter-clockwise, respectively. The genes lost in A. membranaceus are enclosed in parentheses.
Figure 5. Comparative genomic analyses of thirteen chloroplast genomes.
The chloroplast genome of A. membranaceus was aligned with those of twelve species. Each horizontal black line represents a genome. The species names are shown to the right of the corresponding line. The conserved regions are bridged by lines. The numbers on the right of each panel indicate the group to which the chloroplast genomes have been assigned.