Liqun Shen1, Qijie Guan1, Awais Amin1, Wei Zhu1, Mengzhu Li1, Ximin Li2, Lin Zhang1, Jingkui Tian1.
Abstract
Eriobotrya japonica (Thunb.) Lindl (loquat) is an evergreen Rosaceae fruit tree widely distributed in subtropical regions. Its leaves are used in traditional Chinese medicine and are of high medicinal value, especially for treating cough and emesis. We therefore sequenced the complete plastid genome of E. japonica to better utilize this important species. The complete plastid genome is 159,137 bp in length and has a typical quadripartite structure, with a pair of inverted repeats (IR, 26,326 bp) separated by large (LSC, 87,202 bp) and small (SSC, 19,283 bp) single-copy regions. The genome encodes 112 unique genes: 78 protein-coding genes, 30 tRNA genes and 4 rRNA genes. Gene structure and content are highly conserved and similar to those of other Rosaceae species. Five large indels are unique to E. japonica in comparison with Pyrus pyrifolia and Prunus persica and could be used as molecular markers. A total of 72 simple sequence repeats (SSRs) were detected, most of them mononucleotide repeats composed of A or T, indicating a strong A/T bias in base composition. The Ka/Ks ratios of most genes are lower than 1, which suggests that most genes are under purifying selection. Phylogenetic analysis resolved the evolutionary relationships within Rosaceae and fully supported a close relationship between E. japonica and P. pyrifolia.
Keywords: Chloroplast genome; Eriobotrya; Gene evolution; Loquat; Rosaceae
Year: 2016 PMID: 27995013 PMCID: PMC5127920 DOI: 10.1186/s40064-016-3702-3
Source DB: PubMed Journal: Springerplus ISSN: 2193-1801
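The quadripartite structure described in the abstract implies that the total genome length equals LSC + SSC + 2 × IR. The following is a minimal sketch of that consistency check, using the total, IR and SSC lengths reported in this record; the function name `expected_lsc` is illustrative and not taken from the paper.

```python
# Lengths in bp as reported for E. japonica in this record.
TOTAL_LENGTH = 159_137
IR_LENGTH = 26_326      # one copy; the genome carries two
SSC_LENGTH = 19_283


def expected_lsc(total: int, ir: int, ssc: int) -> int:
    """Return the LSC length implied by total = LSC + SSC + 2 * IR."""
    return total - ssc - 2 * ir


if __name__ == "__main__":
    print(expected_lsc(TOTAL_LENGTH, IR_LENGTH, SSC_LENGTH))
```

The value obtained this way is 87,202 bp, which agrees with the LSC length listed for accession KT633951 in the summary table of Rosaceae plastid genome features.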
Fig. 1 Gene map of the complete E. japonica chloroplast genome. Exons are shown as coloured boxes and introns as white boxes. Genes drawn inside the large circle are transcribed clockwise and those outside are transcribed counterclockwise. The boundaries of the LSC, SSC and IR regions are marked on the inner circle. GC content is shown as the dark grey graph within the inner circle
List of genes located in E. japonica
| Group of genes | Gene names |
|---|---|
| Ribosomal RNA genes | |
| Transfer RNA genes | |
| Small subunit of ribosome | |
| Large subunit of ribosome | |
| DNA dependent RNA polymerase | |
| Subunits of photosystem I | |
| Subunits of photosystem II | |
| Subunits of cytochrome | |
| Subunits of ATP synthase | |
| ATP-dependent protease | |
| Large subunit of Rubisco | |
| Subunits of NADH dehydrogenase | |
| Maturase | |
| Envelope membrane protein | |
| Subunit of Acetyl-CoA-carboxylase | |
| c-Type cytochrome synthesis gene | |
| Conserved open reading frames | |
* Gene with intron
#Gene duplicated in IR
Fig. 2 Alignment of infA coding regions of seven Rosaceae plastid genomes. Asterisks indicate stop codons and dots indicate gaps
Codon usage and RSCU analysis of E. japonica cp genome
| Amino acid | Codon | No. | RSCU | tRNA | Amino acid | Codon | No. | RSCU | tRNA |
|---|---|---|---|---|---|---|---|---|---|
| Phe | UUU | 850 | 1.34 | | Tyr | UAU | 691 | 1.61 | |
| | UUC | 420 | 0.66 | | | UAC | 166 | 0.39 | |
| Leu | UUA | 814 | 2.05 | | TER | UAA | 43 | 1.65 | |
| | UUG | 485 | 1.22 | | | UAG | 19 | 0.73 | |
| | CUU | 495 | 1.24 | | His | CAU | 414 | 1.54 | |
| | CUC | 145 | 0.36 | | | CAC | 123 | 0.46 | |
| | CUA | 305 | 0.77 | | Gln | CAA | 634 | 1.55 | |
| | CUG | 144 | 0.36 | | | CAG | 182 | 0.45 | |
| Ile | AUU | 987 | 1.50 | | Asn | AAU | 834 | 1.55 | |
| | AUC | 364 | 0.55 | | | AAC | 244 | 0.45 | |
| | AUA | 622 | 0.95 | | Lys | AAA | 904 | 1.53 | |
| Met | AUG | 537 | 1.00 | | | AAG | 281 | 0.47 | |
| Val | GUU | 472 | 1.47 | | Asp | GAU | 743 | 1.62 | |
| | GUC | 134 | 0.42 | | | GAC | 175 | 0.38 | |
| | GUA | 504 | 1.57 | | Glu | GAA | 897 | 1.50 | |
| | GUG | 173 | 0.54 | | | GAG | 299 | 0.50 | |
| Ser | UCU | 485 | 1.71 | | Cys | UGU | 190 | 1.51 | |
| | UCC | 259 | 0.92 | | | UGC | 62 | 0.49 | |
| | UCA | 327 | 1.15 | | TER | UGA | 16 | 0.62 | |
| | UCG | 155 | 0.55 | | Trp | UGG | 396 | 1.00 | |
| Pro | CCU | 363 | 1.56 | | Arg | CGU | 301 | 1.33 | |
| | CCC | 172 | 0.74 | | | CGC | 97 | 0.43 | |
| | CCA | 268 | 1.15 | | | CGA | 312 | 1.38 | |
| | CCG | 130 | 0.55 | | | CGG | 99 | 0.44 | |
| Thr | ACU | 483 | 1.64 | | Ser | AGU | 364 | 1.29 | |
| | ACC | 211 | 0.72 | | | AGC | 109 | 0.39 | |
| | ACA | 363 | 1.23 | | Arg | AGA | 407 | 1.80 | |
| | ACG | 123 | 0.41 | | | AGG | 138 | 0.61 | |
| Ala | GCU | 590 | 1.85 | | Gly | GGU | 528 | 1.35 | |
| | GCC | 194 | 0.61 | | | GGC | 166 | 0.42 | |
| | GCA | 349 | 1.10 | | | GGA | 620 | 1.58 | |
| | GCG | 141 | 0.44 | | | GGG | 254 | 0.65 | |
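Relative synonymous codon usage (RSCU) is conventionally defined as a codon's observed count divided by the count expected if all synonymous codons for that amino acid were used equally. The sketch below applies that standard definition to the Phe and stop-codon (TER) counts from the table above; the function name `rscu` and the small hand-written codon map are illustrative assumptions, not the authors' pipeline.

```python
from collections import defaultdict


def rscu(counts: dict[str, int], codon_to_aa: dict[str, str]) -> dict[str, float]:
    """Compute RSCU = count * (number of synonymous codons) / (family total).

    Every synonymous codon of a family must be present in `counts`
    (with 0 if unused), otherwise the family size is underestimated.
    """
    family_total = defaultdict(int)
    family_size = defaultdict(int)
    for codon, n in counts.items():
        aa = codon_to_aa[codon]
        family_total[aa] += n
        family_size[aa] += 1
    return {
        codon: n * family_size[codon_to_aa[codon]] / family_total[codon_to_aa[codon]]
        for codon, n in counts.items()
    }


# Counts copied from the codon usage table (Phe and TER families only).
counts = {"UUU": 850, "UUC": 420, "UAA": 43, "UAG": 19, "UGA": 16}
codon_to_aa = {"UUU": "Phe", "UUC": "Phe", "UAA": "TER", "UAG": "TER", "UGA": "TER"}

values = rscu(counts, codon_to_aa)

if __name__ == "__main__":
    for codon, value in values.items():
        print(codon, round(value, 2))
```

Rounded to two decimals, these reproduce the tabulated RSCU values for the five codons (1.34, 0.66, 1.65, 0.73 and 0.62). An RSCU above 1 means the codon is used more often than expected under equal usage.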
Summary of seven Rosaceae plastid genome features
| | | | | | | | |
|---|---|---|---|---|---|---|---|
| Accession | KT633951 | NC_015996 | NC_016921 | NC_014697 | NC_019602 | NC_021455 | KF753637 |
| Length | 159,137 | | 156,612 | 157,790 | | 156,328 | 156,634 |
| LSC | 87,202 | | | 85,968 | 85,586 | 85,239 | 85,767 |
| SSC | 19,283 | 19,237 | 18,941 | 19,060 | | 18,485 | 18,761 |
| IR | 26,326 | | 26,351 | 26,381 | | 26,302 | 26,053 |
| GC% overall | 36.7 | | 36.8 | 36.8 | | 36.9 | |
| in LSC | 34.5 | | 34.6 | 34.6 | 35.1 | 34.7 | |
| in IR | 42.7 | 42.7 | | 42.6 | | 42.7 | 42.7 |
| in SSC | | 30.4 | 30.6 | 30.6 | 31.1 | 30.5 | |
Numbers in bold italics indicate the largest value in each row
Numbers in italics indicate the smallest value in each row
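The GC% rows above report the share of G and C bases in the whole genome and in each region. As a minimal sketch of how such a percentage can be computed from a sequence string: the function name `gc_percent` is hypothetical, the example sequence is a toy string rather than genome data, and ignoring ambiguous bases such as N is an assumption of this sketch.

```python
def gc_percent(seq: str) -> float:
    """Return GC content as a percentage of unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    return 100.0 * (seq.count("G") + seq.count("C")) / acgt


if __name__ == "__main__":
    toy = "ATGCGCNNAT"  # toy sequence, not taken from the E. japonica genome
    print(round(gc_percent(toy), 1))
```

Applied to a region extracted by its coordinates (for example the IR), the same function would give the per-region values of the kind listed in the table.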
Fig. 3 Pairwise alignment of six Rosaceae plastid genomes with E. japonica. The y-axis shows sequence identity from 50 to 100%. Blue represents exons of protein-coding genes, lime represents tRNA or rRNA genes and red represents non-coding regions
Distribution of SSRs (mononucleotide) loci in the E. japonica chloroplast genome. Each cell gives the number of loci, with start positions in parentheses
| Size (bp) | A stretch | C stretch | T stretch | G stretch |
|---|---|---|---|---|
| 10 | 9 (13897, 62517, 68030, 84710, 115751, 117249, 124942, 125608, 143104) | 0 | 17 (174, 4752, 8304, 9472, 9942, 11831, 13310, 14438, 14463, 16746, 26698, 56906, 84155, 85388, 86446, 103167, 130945) | 0 |
| 11 | 6 (16759, 44490, 46845, 51587, 68061, 158959) | 0 | 6 (2753, 6523, 9110, 12905, 18997, 87311) | 0 |
| 12 | 3 (187, 27779, 79320) | 0 | 4 (1612, 51543, 67720, 85645) | 0 |
| 13 | 2 (48460, 74027) | 0 | 1 (71138) | 0 |
| 14 | 1 (37903) | 2 (25627, 116685) | 6 (12551, 32565, 37948, 65963, 73345, 123131) | 0 |
| 15 | 1 (6806) | 0 | 3 (71854, 74083, 116600) | 0 |
| 16 | 3 (69960, 80822, 131576) | 0 | 2 (14900, 125047) | 0 |
| 17 | 1 (7681) | 0 | 1 (82414) | 0 |
| 18 | 0 | 0 | 0 | 0 |
| 19 | 0 | 0 | 1 (59616) | 0 |
| 20 | 0 | 0 | 1 (83566) | 0 |
| Total | 26 | 2 | 42 | 0 |
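Mononucleotide SSRs like those tabulated above are uninterrupted runs of a single base. The sketch below finds such runs with a regular expression and reports 1-based start positions. The 10 bp minimum is an assumption taken from the smallest size class in the table, not a documented parameter of the authors' tool, and the function name and test sequence are illustrative.

```python
import re


def find_mono_ssrs(seq: str, min_len: int = 10) -> list[tuple[int, str, int]]:
    """Return (1-based start, base, run length) for each run of >= min_len bases."""
    # One base followed by at least (min_len - 1) copies of itself;
    # the greedy quantifier captures the whole run.
    pattern = re.compile(r"([ACGT])\1{%d,}" % (min_len - 1))
    return [
        (match.start() + 1, match.group(1), len(match.group(0)))
        for match in pattern.finditer(seq.upper())
    ]


if __name__ == "__main__":
    # Toy sequence: a 12 bp A run, a 9 bp T run (too short), a 10 bp T run.
    toy = "GC" + "A" * 12 + "CG" + "T" * 9 + "C" + "T" * 10
    for start, base, length in find_mono_ssrs(toy):
        print(start, base, length)
```

Grouping the results by base and run length would give counts in the same layout as the table.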
Synonymous rate, nonsynonymous rate, transition (Ts) and transversion (Tv) in LSC, IR and SSC regions among E. japonica, P. pyrifolia and P. persica
| Region | Ks | Ka | Ts | Tv | Ks | Ka | Ts | Tv |
|---|---|---|---|---|---|---|---|---|
| LSC | 0.0062 | 0.0014 | 209 | 313 | 0.0717 | 0.0091 | 1863 | 1706 |
| IR | 0.0035 | 0.00004 | 3 | 6 | 0.0110 | 0.0041 | 81 | 77 |
| SSC | 0.0091 | 0.0026 | 60 | 76 | 0.0942 | 0.0175 | 584 | 572 |
| All | 0.0053 | 0.0024 | 272 | 395 | 0.0694 | 0.0138 | 2528 | 2355 |
| Ratio | Ka/Ks = 0.4528 | | Ts/Tv = 0.6858 | | Ka/Ks = 0.2126 | | Ts/Tv = 1.0964 | |
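The Ts and Tv columns count transitions (purine to purine or pyrimidine to pyrimidine changes) and transversions (purine to pyrimidine changes or the reverse) between aligned sequences. The following is a minimal sketch of that classification for one pairwise alignment. The function name and the toy alignment are hypothetical, and skipping gapped or ambiguous columns is an assumption that may differ from the authors' method.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def count_ts_tv(seq_a: str, seq_b: str) -> tuple[int, int]:
    """Count transitions and transversions between two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    ts = tv = 0
    for x, y in zip(seq_a.upper(), seq_b.upper()):
        # Skip identical columns and anything that is not a plain base (gaps, N).
        if x == y or x not in "ACGT" or y not in "ACGT":
            continue
        if {x, y} <= PURINES or {x, y} <= PYRIMIDINES:
            ts += 1  # A<->G or C<->T
        else:
            tv += 1  # purine <-> pyrimidine
    return ts, tv


if __name__ == "__main__":
    ts, tv = count_ts_tv("ACGTAC-A", "GCTTCTAA")  # toy alignment
    print(ts, tv, ts / tv if tv else float("nan"))
```

The Ts/Tv ratio is then the first count divided by the second, computed per region or over the whole alignment.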
Fig. 4 Nucleotide substitution analysis of different functional groups among E. japonica, P. pyrifolia and P. persica. Groups A to K refer, respectively, to small subunit of ribosome, large subunit of ribosome, RNA polymerase subunits, ATP synthase gene, NADH dehydrogenase, Cytochrome b/f complex, Photosystem I, Photosystem II, Large chain of Rubisco, Other genes and Unknown functions. a Synonymous (Ks) and nonsynonymous (Ka) ratios among three species. b Transition (Ts) and transversion (Tv) ratios among three species
Large indels identified among E. japonica, P. pyrifolia and P. persica
| Type | Location | Size (bp) | Repeat motifs | Location | Size (bp) | Repeat motifs |
|---|---|---|---|---|---|---|
| Deletion | | 182 | polyA | | 151 | |
| | | 417 | AAT | | 40 | |
| | | 54 | | | 58 | |
| | | 52 | polyA | | 48 | |
| | | 56 | | | 44 | |
| | | 48 | polyT | | 45 | TTTTG |
| | | 50 | | | 50 | AA |
| | | 96 | TA | | | |
| Insertion | | 79 | TTCG | | 138 | |
| | | 42 | CTCAAATATATGTTTATCAAT | | 151 | |
| | | 69 | | | 47 | |
| | | 129 | AA | | 124 | |
| | | 148 | | | | |
| | | 48 | | | | |
| | | 42 | | | | |
| | | 78 | | | | |
| | | 61 | TTTAT | | | |
| | | 146 | | | | |
| | | 191 | AATTT | | | |
| | | 59 | | | | |
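Large indels of the kind listed above appear as long gap runs in one sequence of a pairwise alignment. The sketch below scans an aligned sequence for gap runs at or above a size threshold. The 40 bp default is an assumption based on the smallest indel in the table, and the function name and toy alignment row are hypothetical rather than the authors' procedure.

```python
import re


def large_gaps(aligned_seq: str, min_len: int = 40) -> list[tuple[int, int]]:
    """Return (1-based alignment start, length) of each gap run >= min_len."""
    pattern = re.compile(r"-{%d,}" % min_len)
    return [
        (match.start() + 1, len(match.group(0)))
        for match in pattern.finditer(aligned_seq)
    ]


if __name__ == "__main__":
    # Toy alignment row: one 42-column gap and one 5-column gap.
    row = "ACGT" + "-" * 42 + "TTGA" + "-" * 5 + "C"
    print(large_gaps(row))
```

A long gap run in one sequence corresponds to a deletion in that sequence or, equivalently, an insertion in the other, so the direction depends on which genome is taken as the reference.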
Fig. 5 Comparison of IR boundaries among seven Rosaceae plants. Annotated genes are represented by black boxes. The letter psi (Ψ) denotes a pseudogene
Fig. 6 Maximum likelihood (ML) analysis within the Rosaceae family using 78 protein-coding genes. Bootstrap values are displayed at the nodes. The scale bar indicates substitutions per site