Dongmei Yin, Yun Wang, Xingguo Zhang, Xingli Ma, Xiaoyan He, Jianhang Zhang.
Abstract
Peanut (Arachis hypogaea L.) is an important oilseed and cash crop worldwide. Wild Arachis spp. are potential sources of novel genes for the genetic improvement of cultivated peanut. Understanding their genetic relationships with cultivated peanut is important for the efficient use of wild species in breeding programmes. However, only a few genetic resources have been explored for this genus so far. In this study, new chloroplast genomic resources were developed for the genus Arachis from the whole chloroplast genomes of seven species, sequenced using next-generation sequencing technologies. The chloroplast genomes ranged in length from 156,275 to 156,395 bp, and their gene contents, gene orders, and GC contents were similar to those of other Fabaceae species. Comparative analyses among the seven chloroplast genomes revealed 643 variable sites, comprising 212 singletons and 431 parsimony-informative sites. We also identified 101 SSR loci and 85 indel mutation events. Thirty-seven SSR loci were found to be polymorphic by in silico comparative analyses. Eleven highly divergent DNA regions, suitable for phylogenetic analysis and species identification, were detected in the seven chloroplast genomes. A molecular phylogeny based on the complete chloroplast genome sequences provided the best resolution of the seven Arachis species.
Year: 2017 PMID: 28912544 PMCID: PMC5599657 DOI: 10.1038/s41598-017-12026-x
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Details of the complete chloroplast genomes of seven Arachis species.
| Feature | | | | | | | |
|---|---|---|---|---|---|---|---|
| Total length (bp) | 156,394 | 156,275 | 156,393 | 156,378 | 156,395 | 156,343 | 156,381 |
| LSC (bp) | 85,946 | 85,863 | 85,951 | 85,934 | 85,951 | 85,868 | 85,932 |
| SSC (bp) | 18,800 | 18,786 | 18,794 | 18,796 | 18,796 | 18,849 | 18,801 |
| IR (bp) | 25,824 | 25,813 | 25,824 | 25,824 | 25,824 | 25,813 | 25,824 |
| Total genes | 110 | 110 | 110 | 110 | 110 | 110 | 110 |
| Protein-coding genes | 76 | 76 | 76 | 76 | 76 | 76 | 76 |
| rRNA genes | 4 | 4 | 4 | 4 | 4 | 4 | 4 |
| tRNA genes | 30 | 30 | 30 | 30 | 30 | 30 | 30 |
| GC content | 36.4% | 36.4% | 36.4% | 36.4% | 36.4% | 36.4% | 36.4% |
Figure 1. Map of the Arachis chloroplast genome. The genes inside and outside of the circle are transcribed in the clockwise and counterclockwise directions, respectively. Genes belonging to different functional groups are shown in different colors. Thick lines indicate the extent of the inverted repeats (IRa and IRb) that separate the genomes into small single-copy (SSC) and large single-copy (LSC) regions.
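The quadripartite layout described in the caption implies that each genome length should equal LSC + SSC + 2 × IR. The sketch below checks this against the seven columns of the genome table; the function name and the tuple layout are illustrative choices, and the species labels are left out because they are not available here.

```python
# (total, LSC, SSC, IR) in bp, copied column by column from the genome table.
GENOMES = [
    (156394, 85946, 18800, 25824),
    (156275, 85863, 18786, 25813),
    (156393, 85951, 18794, 25824),
    (156378, 85934, 18796, 25824),
    (156395, 85951, 18796, 25824),
    (156343, 85868, 18849, 25813),
    (156381, 85932, 18801, 25824),
]


def quadripartite_length(lsc, ssc, ir):
    """Length of a genome with one LSC, one SSC and two IR copies (IRa, IRb)."""
    return lsc + ssc + 2 * ir


# Collect any column whose parts do not add up to the reported total.
mismatches = [g for g in GENOMES if quadripartite_length(*g[1:]) != g[0]]
print(mismatches)  # prints []
```

All seven reported totals match the sum of their parts, which is consistent with the IR being present in two copies.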
Genes identified in the chloroplast genome of Arachis species.
| Category for genes | Group of gene | Name of gene |
|---|---|---|
| Photosynthesis related genes | Photosystem I | |
| | Photosystem II | |
| | Cytochrome b/f complex | |
| | ATP synthase | |
| | Cytochrome c synthesis | |
| | Assembly/stability of photosystem I | |
| | NADPH dehydrogenase | |
| | Rubisco | |
| Transcription and translation related genes | Transcription | |
| | Ribosomal proteins | |
| RNA genes | Ribosomal RNA | |
| | Transfer RNA | |
| Other genes | RNA processing | |
| | Carbon metabolism | |
| | Fatty acid synthesis | |
| | Proteolysis | |
| Genes of unknown function | Conserved reading frames | |
Intron-containing genes are marked by asterisks (*).
Variable site analyses in the seven Arachis chloroplast genomes.
| Region | Number of sites | Variable sites (number) | Variable sites (%) | Parsimony-informative sites (number) | Parsimony-informative sites (%) | Nucleotide diversity |
|---|---|---|---|---|---|---|
| LSC | 88,262 | 460 | 0.52% | 298 | 0.34% | 0.00185 |
| SSC | 18,898 | 135 | 0.71% | 91 | 0.48% | 0.0025 |
| IR | 25,829 | 24 | 0.09% | 21 | 0.08% | 0.00037 |
| Complete cp genome | 156,818 | 643 | 0.41% | 431 | 0.27% | 0.00144 |
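As a worked check of the arithmetic in this table, the sketch below recomputes the percentage columns and the whole-genome counts from the per-region rows. It assumes that the percentages are simply count / number of sites, and that the whole-genome row counts the IR twice (IRa and IRb); neither convention is stated in the table itself, and the names `REGIONS`, `percent` and `genome_total` are illustrative.

```python
# region -> (number of sites, variable sites, parsimony-informative sites)
REGIONS = {
    "LSC": (88262, 460, 298),
    "SSC": (18898, 135, 91),
    "IR": (25829, 24, 21),
}


def percent(count, sites):
    """Share of sites, as a percentage rounded to two decimals."""
    return round(100 * count / sites, 2)


def genome_total(index):
    """Sum one column over LSC + SSC + 2 x IR (assumes both IR copies count)."""
    return REGIONS["LSC"][index] + REGIONS["SSC"][index] + 2 * REGIONS["IR"][index]


for name, (sites, variable, informative) in REGIONS.items():
    print(name, percent(variable, sites), percent(informative, sites))

print(genome_total(1), genome_total(2))  # prints 643 431
```

Under these assumptions the per-region percentages and the whole-genome counts (643 variable, 431 parsimony-informative) are reproduced. The same sum applied to the "Number of sites" column gives 158,818 rather than the tabulated 156,818, so that column may follow a different convention.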
Figure 3. Sliding window analysis of the complete chloroplast genomes of seven Arachis species (window length: 600 bp, step size: 200 bp). X-axis: position of the window midpoint; Y-axis: nucleotide diversity within each window.
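The sliding window analysis in the caption can be sketched as follows. This is a minimal illustration, not the software used in the study: it assumes the input is a list of pre-aligned, equal-length sequences, takes nucleotide diversity as the mean number of pairwise differences per site, and skips alignment columns containing gaps or ambiguous bases. The function names are hypothetical; the default window and step sizes are those given in the caption.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Mean pairwise differences per site, ignoring columns with gaps or N."""
    columns = [col for col in zip(*seqs) if all(base in "ACGT" for base in col)]
    if len(seqs) < 2 or not columns:
        return 0.0
    pairs = list(combinations(range(len(seqs)), 2))
    diffs = sum(col[i] != col[j] for col in columns for i, j in pairs)
    return diffs / (len(pairs) * len(columns))


def sliding_pi(seqs, window=600, step=200):
    """Return (window midpoint, nucleotide diversity) for each full window."""
    length = len(seqs[0])
    results = []
    for start in range(0, length - window + 1, step):
        chunk = [s[start:start + window] for s in seqs]
        results.append((start + window // 2, nucleotide_diversity(chunk)))
    return results


# Toy alignment with a small window so the behaviour is easy to follow.
toy = ["AAAAAAAA", "AAATAAAA", "AATTAAAA"]
for midpoint, pi in sliding_pi(toy, window=4, step=2):
    print(midpoint, round(pi, 4))
```

Windows with high values in such a scan are candidates for the highly divergent regions mentioned in the abstract.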
Nucleotide substitutions and sequence divergence in seven complete chloroplast genomes in Arachis.
| | | | | | | | |
|---|---|---|---|---|---|---|---|
| | - | 0.00272 | 0.00086 | 0.00095 | 0.00092 | 0.00271 | 0.00088 |
| | 424 | - | 0.00270 | 0.00281 | 0.00275 | 0.00015 | 0.00273 |
| | 134 | 421 | - | 0.00057 | 0.00061 | 0.00266 | 0.00074 |
| | 149 | 438 | 89 | - | 0.00058 | 0.00278 | 0.00081 |
| | 144 | 428 | 96 | 91 | - | 0.00274 | 0.00077 |
| | 422 | 23 | 415 | 433 | 427 | - | 0.00270 |
| | 138 | 425 | 116 | 126 | 121 | 420 | - |
The lower triangle shows the number of nucleotide substitutions between the genomes. The upper triangle indicates the calculated sequence divergence for the seven complete chloroplast genomes.
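The two triangles are linked: for example, 424 substitutions alongside a divergence of 0.00272 corresponds to roughly 156,000 compared sites. The sketch below shows one way to obtain both numbers for a pair of aligned sequences, assuming the divergence is the uncorrected proportion of differing sites (p-distance) with gapped or ambiguous positions excluded. The source does not state which distance measure was used, and the function name is hypothetical.

```python
def substitutions_and_p_distance(seq_a, seq_b):
    """Count substitutions and the p-distance between two aligned sequences.

    Positions where either sequence has a gap or an ambiguous base are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    substitutions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a != b:
            substitutions += 1
    distance = substitutions / compared if compared else 0.0
    return substitutions, distance


print(substitutions_and_p_distance("ACGT-ACGT", "ACGA-ACTT"))  # prints (2, 0.25)
```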
Figure 4. Indels identified in the cp genomes of seven Arachis species. (A) Numbers of individual indels shown by sequence length. (B) Relative frequency of indel occurrence in introns, exons, and spacer regions.
Figure 2. Analyses of simple sequence repeats (SSRs) in the Arachis chloroplast genomes. (A) Number of different SSR types detected by MISA. (B) Frequency of identified SSR motifs in the different repeat classes.
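To illustrate what an SSR scan looks for, here is a simplified regex-based stand-in. It is not MISA and does not reproduce its options; the minimum repeat counts in `MIN_REPEATS` are hypothetical thresholds chosen for the example, since the settings used in the study are not given here.

```python
import re

# Hypothetical thresholds: motif length -> minimum number of repeats.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 4}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in sorted(min_repeats.items()):
        # A motif of unit_len bases followed by at least min_rep - 1 copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are repeats of a shorter unit, e.g. "AA" or "ATAT".
            if any(
                motif == motif[:k] * (unit_len // k)
                for k in range(1, unit_len)
                if unit_len % k == 0
            ):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // unit_len))
    return sorted(hits)


example = "GC" + "A" * 12 + "GC" + "AT" * 7 + "GG"
print(find_ssrs(example))  # prints [(2, 'A', 12), (16, 'AT', 7)]
```

The two hits correspond to the notation used in the SSR table, (A)12 and (AT)7.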
SSRs identified from in silico comparative analysis of the seven Arachis cp genomes.
| No. | Position | Region | Location | SSR type | Forward sequence | Reverse sequence | Length (bp) |
|---|---|---|---|---|---|---|---|
| 1 | trnK-rbcL | LSC | spacer | (A)10 | TACCATTGAGTTAGCAACCCCC | CGATTTCTTCACGTTACAGAGGC | 248 |
| 2 | trnK-rbcL | LSC | spacer | (A)12 | CGATTTCTTCACGATCGGATTA | AATATAATCAAATTCGATTTA | 141 |
| 3 | rbcL-atpB | LSC | spacer | (A)12 | TCATATGTATGGCGCAACCCAA | TTCATGGGCGAGCATACAATTT | 189 |
| 4 | trnV intron | LSC | intron | (T)12 | TCAAAAACGCAAGGGCTATAGC | TACTGGACGTCTCAACCCTTTG | 190 |
| 5 | trnF-trnL | LSC | spacer | (A)15 | ACTCGAATCCATTTGTGAAAGACT | TCCCTCTATCCCCAAAAGACCT | 131 |
| 6 | trnL-trnT | LSC | spacer | (T)10 | TTGCGATTAGAATCGCATTAA | AGATTCGACAAAATCTGGATA | 151 |
| 7 | trnL-trnT | LSC | spacer | (T)11 | ATTACTGTAACTGTAATAGAA | ATGCTCTAACCTCTGAGCTA | 246 |
| 8 | ycf3 2nd intron | LSC | intron | (A)11 | TGATCTGTCATTACGTGCGACT | TCTTTACGGCGCTTCCTCTATC | 208 |
| 9 | ycf3-psaA | LSC | spacer | (T)12 | TGAAGATCACAGGGCGTTCTTA | TGGATGGACTGATGTAGACAACA | 280 |
| 10 | ycf3-psaA | LSC | spacer | (AT)7 | TAGTTCTATTTATATTATTC | ATTTAAATGAAATATGCATTA | 143 |
| 11 | ycf3-psaA | LSC | spacer | (T)10 | ATTCAAAAAGGTCCGTTGAGCG | CTCCTTCCGGACAACACATACA | 230 |
| 12 | psbD-trnT | LSC | spacer | (A)14 | GTGAAGCCATGATTTGATGTA | ATTAGTCGATATTTACGATTA | 193 |
| 13 | psbD-trnT | LSC | spacer | (A)10 | GAATCTTGAGGAACGGGAGGAT | AGTGGACCTAACCCATTGAATCA | 158 |
| 14 | psbD-trnT | LSC | spacer | (T)13 | TTGATTATCATTCATTAGAAT | GTAAGGCGTAAGTCATCGGT | 243 |
| 15 | trnT-trnE | LSC | spacer | (A)12 | TCCTGCTCTTGAACCGATTCTT | GTTGGTTTGCTAGAAAAGGCGT | 188 |
| 16 | trnT-trnE | LSC | spacer | (G)11 | TGGAATTATAGATTGGCGATT | ATGTCCTGGACCACTAGACGA | 223 |
| 17 | trnD-psbM | LSC | spacer | (A)13 | CCCGTCAGTCCCGAATGAATAA | CGATTCATCGTCGAGAATGGAA | 256 |
| 18 | petN-trnC | LSC | spacer | (T)10 | AAGATTTACTATATCCATGTG | TTGACTCTGTACCAGCGATT | 182 |
| 19 | trnC-rpoB | LSC | spacer | (AT)6 | GAAAAAGGATTTGCAGTCCCCC | GGTTCCGTTTTGTCCTTCCATT | 140 |
| 20 | trnC-rpoB | LSC | spacer | (A)10 | GGTGTGTAAACTCTCCCACCTT | AAATCGACTCGGGATTTGTTCG | 227 |
| 21 | atpH-atpF | LSC | spacer | (T)10 | TACAAGCGGTATTCAAGCCCT | CAATTAATAGAATCAGAATTCA | 227 |
| 22 | atpH-atpF | LSC | spacer | (T)11 | ATTCAGTTCTTCGGTCGAACGA | ACCGTAAACCAATTGTTCGTGT | 259 |
| 23 | atpF intron | LSC | intron | (A)10 | AAAGCAAAGCTAGGCATAGGCA | ACGTAGGTCATCGATTTCGCAT | 259 |
| 24 | trnQ-accD | LSC | spacer | (A)13 | TGCAAGCAAAAGTGTATTCCGG | ACTTGGTCCAGGATCTTTTAGCT | 167 |
| 25 | psaJ-rpl33 | LSC | spacer | (T)10 | CTATTGATCGAAATCAATCGT | CCATTGAAGCCTGTACCAGAT | 235 |
| 26 | rpl20-rps12 | LSC | spacer | (T)12 | GAGTTGGTTTAGATCAATCT | ATGTCAGCAGCAGAAGCTCA | 231 |
| 27 | rps12-clpP | LSC | spacer | (A)14 | GTGACATTTCGGATTGGCTGTC | ATTGTTGATCTTGTCGCGGTTG | 276 |
| 28 | clpP intron 1 | LSC | intron | (T)15 | AGATCAGCATCAGTAAATGAT | ATCGGAAGCCTATTTCAGTGTC | 249 |
| 29 | clpP-psbB | LSC | spacer | (A)11 | CACACCACCATTGCGTATTGTT | GAACACGATACCAAGGCAAACC | 271 |
| 30 | rps11-rpl36 | LSC | spacer | (TA)6 | GAGATGTATGGATATATTCAT | TTGAATGAATATAGAAATTCTA | 297 |
| 31 | rps11-rpl36 | LSC | spacer | (T)11 | AGTTTGAATTTCAATATCTA | GATCCGAGATTAAGTTGAAGGA | 251 |
| 32 | rpl16 intron | LSC | intron | (TA)7 | TCTACAATGGAGCCTCGCAAAT | ACAAATCAAGAGCACCGAGTCA | 104 |
| 33 | rpl16 intron | LSC | intron | (TTTC)4 | TGTTGATGCTTTATTACACTTCCCC | TCATCGCTTCGCATTATCTGGA | 272 |
| 34 | rpl2 intron | IR | intron | (T)10 | TTGCAATCAGTTTCGCTACAGC | CTTGTACAGTTTGGGAAGGGGT | 161 |
| 35 | ndhF-rpl32 | SSC | spacer | (A)10 | GAACTGGAAGCGGAATGAAAGG | AGAAGTATTGTGCAAAGATTCAG | 212 |
| 36 | ndhF-rpl32 | SSC | spacer | (A)10 | ACAGATATCTATGTTTGGCA | TGCCATGCAACTGATATAGT | 200 |
| 37 | ndhG-ndhI | SSC | spacer | (T)10 | ATAGAACAGATATCGAAATGA | AATAGATATGAAACAGAATA | 142 |
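A polymorphic SSR locus shows up in silico as a difference in predicted product length between genomes. The sketch below illustrates that idea under two assumptions that the table does not state explicitly: that the forward and reverse sequences are primers written 5' to 3', and that the length column is the expected product size on a reference genome. The function names and the toy templates are hypothetical.

```python
def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def amplicon_length(template, forward, reverse):
    """Predicted product length for exact primer matches, or None if absent.

    The forward primer is searched on the given strand and the reverse primer
    as its reverse complement downstream of the forward primer.
    """
    template = template.upper()
    start = template.find(forward.upper())
    if start == -1:
        return None
    reverse_site = reverse_complement(reverse)
    end = template.find(reverse_site, start + len(forward))
    if end == -1:
        return None
    return end + len(reverse_site) - start


# Two toy templates that differ only in the length of a poly-T repeat.
short_allele = "GG" + "ACGTAC" + "T" * 10 + "GGATTC" + "AA"
long_allele = "GG" + "ACGTAC" + "T" * 12 + "GGATTC" + "AA"
print(amplicon_length(short_allele, "ACGTAC", "GAATCC"))  # prints 22
print(amplicon_length(long_allele, "ACGTAC", "GAATCC"))   # prints 24
```

Exact string matching keeps the example short; real primer searches usually tolerate mismatches and check both strands.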
Figure 5. Phylogenetic relationships of the seven Arachis species constructed from the complete chloroplast genome sequences using maximum likelihood (ML) and Bayesian inference (BI). The ML topology is shown, with the ML bootstrap support value/Bayesian posterior probability given at each node.