Xin Zhang, Chunxiao Rong, Ling Qin, Chuanyuan Mo, Lu Fan, Jie Yan, Manrang Zhang.
Abstract
Malus hupehensis belongs to the genus Malus (Rosaceae) and is a wild crabapple indigenous to China. The species has received increasing attention because of its medicinal, ornamental, and economic value. In this study, the complete chloroplast (cp) genome of M. hupehensis, sequenced on a HiSeq X Ten platform, is reported. The M. hupehensis cp genome is 160,065 bp in size, containing a large single copy region (LSC) of 88,166 bp and a small single copy region (SSC) of 19,193 bp, separated by a pair of inverted repeats (IRs) of 26,353 bp each. It contains 112 genes, including 78 protein-coding genes (PCGs), 30 transfer RNA genes (tRNAs), and four ribosomal RNA genes (rRNAs). The overall GC content is 36.6%. A total of 96 simple sequence repeats (SSRs) were identified, most of which were mononucleotide repeats composed of A/T. In addition, 49 long repeats were identified, including 24 forward repeats, 21 palindromic repeats, and four reverse repeats. A comparison of the IR boundaries of nine complete Malus chloroplast genomes showed slight variations at the IR/SC boundary regions. A maximum likelihood (ML) phylogenetic analysis based on 26 chloroplast genomes indicates that M. hupehensis clusters more closely with M. baccata, M. micromalus, and M. prunifolia than with M. tschonoskii. The complete chloroplast genome reported here provides reliable genetic information for future studies of the taxonomy and phylogenetic evolution of Malus and related species.
Keywords: Malus hupehensis; chloroplast genome; comparative analysis; phylogenetic analysis
Year: 2018 PMID: 30413097 PMCID: PMC6278565 DOI: 10.3390/molecules23112917
Source DB: PubMed Journal: Molecules ISSN: 1420-3049 Impact factor: 4.411
Summary of complete chloroplast genomes for nine Malus species.
| Genome Characteristics | M. hupehensis | | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|
| Accession number | MK020147 | KX499858 | KX499862 | KX499863 | KX499859 | MF062434 | KU851961 | KX499861 | MH394388 |
| Genome size (bp) | 160,065 | 160,207 | 159,712 | 160,053 | 160,163 | 159,834 | 160,041 | 159,584 | 160,068 |
| LSC length (bp) | 88,166 | 88,107 | 87,710 | 88,137 | 88,267 | 87,950 | 88,119 | 87,670 | 88,245 |
| SSC length (bp) | 19,193 | 19,316 | 19,250 | 19,210 | 19,188 | 19,176 | 19,204 | 19,168 | 19,211 |
| IR length (bp) | 26,353 | 26,392 | 26,376 | 26,353 | 26,354 | 26,354 | 26,359 | 26,373 | 26,306 |
| No. of different genes | 112 | 110 | 110 | 110 | 109 | 111 | 111 | 110 | 112 |
| No. of different protein-coding genes | 78 | 76 | 77 | 77 | 76 | 77 | 77 | 77 | 78 |
| No. of different tRNA genes | 30 | 30 | 29 | 29 | 29 | 30 | 30 | 29 | 30 |
| No. of different rRNA genes | 4 | 4 | 4 | 4 | 4 | 4 | 4 | 4 | 4 |
| % GC content in LSC | 34.2 | 34.2 | 34.3 | 34.2 | 34.2 | 34.3 | 34.2 | 34.4 | 34.2 |
| % GC content in SSC | 30.4 | 30.3 | 30.4 | 30.4 | 30.4 | 30.4 | 30.4 | 30.4 | 30.4 |
| % GC content in IR | 42.7 | 42.6 | 42.6 | 42.7 | 42.7 | 42.7 | 42.7 | 42.6 | 42.7 |
| % GC content of genome | 36.6 | 36.5 | 36.6 | 36.5 | 36.5 | 36.6 | 36.6 | 36.6 | 36.5 |
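The figures in the summary table can be cross-checked against the quadripartite structure described in the abstract: each genome size should equal LSC + SSC + 2 × IR, and each gene total should equal the sum of the protein-coding, tRNA, and rRNA counts. The sketch below is only an illustration; the names `GENOMES`, `check_row`, and `inconsistent_accessions` are hypothetical, and the numbers are copied from the table, keyed by accession number.

```python
# Values copied from the summary table, keyed by accession number.
# Tuple order: genome size, LSC, SSC, IR, genes, protein-coding, tRNA, rRNA.
GENOMES = {
    "MK020147": (160065, 88166, 19193, 26353, 112, 78, 30, 4),
    "KX499858": (160207, 88107, 19316, 26392, 110, 76, 30, 4),
    "KX499862": (159712, 87710, 19250, 26376, 110, 77, 29, 4),
    "KX499863": (160053, 88137, 19210, 26353, 110, 77, 29, 4),
    "KX499859": (160163, 88267, 19188, 26354, 109, 76, 29, 4),
    "MF062434": (159834, 87950, 19176, 26354, 111, 77, 30, 4),
    "KU851961": (160041, 88119, 19204, 26359, 111, 77, 30, 4),
    "KX499861": (159584, 87670, 19168, 26373, 110, 77, 29, 4),
    "MH394388": (160068, 88245, 19211, 26306, 112, 78, 30, 4),
}


def check_row(row):
    """Return (size_ok, genes_ok) for one column of the table."""
    size, lsc, ssc, ir, genes, pcg, trna, rrna = row
    size_ok = size == lsc + ssc + 2 * ir      # the IR is present twice
    genes_ok = genes == pcg + trna + rrna     # gene classes sum to the total
    return size_ok, genes_ok


def inconsistent_accessions(genomes=GENOMES):
    """List accessions whose reported totals do not match their parts."""
    return [acc for acc, row in genomes.items() if check_row(row) != (True, True)]


if __name__ == "__main__":
    for acc, row in GENOMES.items():
        print(acc, check_row(row))
```

An empty list from `inconsistent_accessions()` means every column of the table is internally consistent under these two identities.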
Figure 1. Gene map of the M. hupehensis chloroplast genome. Genes shown outside the outer circle are transcribed clockwise and those inside are transcribed counterclockwise. The colored bars indicate different functional groups. The dark gray inner circle corresponds to the GC content, and the light gray to the AT content.
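A GC-content track like the one in the gene map is usually computed over windows along the sequence. The sketch below shows one simple way to do that; it is not the method used to draw the figure, and the function names, window size, and step are assumptions chosen for illustration.

```python
def gc_fraction(seq):
    """Fraction of G and C among the A/C/G/T bases of seq (0.0 if there are none)."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    return gc / total if total else 0.0


def windowed_gc(seq, window=1000, step=1000):
    """Yield (start, gc_fraction) for windows along a linear sequence.

    The window and step defaults are illustrative assumptions only.
    A sequence shorter than one window yields a single value.
    """
    last_start = max(len(seq) - window + 1, 1)
    for start in range(0, last_start, step):
        yield start, gc_fraction(seq[start:start + window])
```

Applied to a full cp genome sequence, `gc_fraction` gives the whole-genome value and `windowed_gc` gives the per-window profile; ambiguous bases such as N are ignored in both.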
Gene contents of the M. hupehensis chloroplast genome, based on genome annotation.
| Group of Genes | Gene Name |
|---|---|
| DNA-dependent RNA polymerase | |
| tRNA genes | |
| Ribosomal small subunit | |
| Ribosomal large subunit | |
| rRNA genes | |
| ATP synthase | |
| Photosystem I | |
| Photosystem II | |
| NADH dehydrogenase | |
| Cytochrome b/f complex | |
| Large subunit of rubisco | |
| Maturase | |
| Subunit of acetyl-CoA carboxylase | |
| Envelope membrane protein | |
| Protease | |
| c-type cytochrome synthesis | |
| Conserved open reading frames | |
Note. # indicates genes with one intron and ## genes with two introns; genes in the IR regions are followed by the (×2) symbol.
Location and length of intron-containing genes within the M. hupehensis chloroplast genome.
| Gene | Location | Exon I (bp) | Intron I (bp) | Exon II (bp) | Intron II (bp) | Exon III (bp) |
|---|---|---|---|---|---|---|
| | LSC | 37 | 2497 | 35 | | |
| | LSC | 23 | 698 | 48 | | |
| | LSC | 37 | 514 | 50 | | |
| | LSC | 39 | 592 | 37 | | |
| | IR | 42 | 943 | 35 | | |
| | IR | 38 | 807 | 35 | | |
| | LSC | 114 | - | 232 | 541 | 26 |
| | LSC | 40 | 864 | 221 | | |
| | LSC | 9 | 983 | 399 | | |
| | IR | 390 | 686 | 435 | | |
| | LSC | 435 | 741 | 1611 | | |
| | SSC | 552 | 1134 | 540 | | |
| | IR | 777 | 669 | 756 | | |
| | SSC | 126 | 708 | 228 | 744 | 153 |
| | LSC | 6 | 797 | 642 | | |
| | LSC | 144 | 737 | 411 | | |
| | LSC | 71 | 826 | 292 | 627 | 228 |
| | LSC | 8 | 724 | 475 | | |
Note. * The rps12 gene is trans-spliced, with the two duplicated 3′-end exons in the IR regions and the 5′-end exon in the LSC region.
Figure 2. Codon content of the 20 amino acids and the stop codons in the 84 coding genes of the M. hupehensis cp genome.
Figure 3. Repeat analyses. (A) SSR repeat units and their counts in the M. hupehensis cp genome. (B) Presence of the different SSR types among all SSRs of the nine Malus chloroplast genomes. (C) SSRs in the nine Malus cp genomes. (D) Repeated sequences in the nine Malus cp genomes. (E) Frequency of the four repeat types by length in the nine Malus chloroplast genomes.
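Perfect SSRs of the kind counted in the repeat analysis can be located with a regular expression that looks for a short motif repeated in tandem. The sketch below is a minimal illustration, not the tool or settings used in the study: the function names are hypothetical, and the minimum repeat counts in `MIN_REPEATS` are assumed values, since the thresholds are not stated here.

```python
import re

# Minimum tandem copies per motif length. These thresholds are an assumption
# for illustration; the values used in the study are not given in this text.
MIN_REPEATS = {1: 10, 2: 5, 3: 4, 4: 3, 5: 3, 6: 3}


def is_primitive(motif):
    """True if motif is not itself a repetition of a shorter unit (e.g. 'AA', 'ATAT')."""
    return motif not in (motif + motif)[1:-1]


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, copies) tuples for perfect SSRs in seq."""
    seq = seq.upper()
    hits = []
    for k, n in sorted(min_repeats.items()):
        # Lookahead so every start position is tested; group 1 is the whole
        # tract, group 2 the motif of length k repeated at least n times.
        pattern = re.compile(r"(?=(([ACGT]{%d})\2{%d,}))" % (k, n - 1))
        last_end = 0
        for m in pattern.finditer(seq):
            start, tract, motif = m.start(), m.group(1), m.group(2)
            if start < last_end or not is_primitive(motif):
                continue  # shifted copy of a recorded tract, or a shorter motif in disguise
            hits.append((start, motif, len(tract) // k))
            last_end = start + len(tract)
    return sorted(hits)
```

For example, `find_ssrs("GC" + "A" * 12 + "GCGC" + "AT" * 6 + "CC")` reports one mononucleotide and one dinucleotide repeat. Tracts of different motif lengths are searched independently, so adjacent repeats may overlap by a base; a production tool would also handle compound and imperfect repeats.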
Figure 4. Comparison of the border positions of LSC, SSC, and IR regions among the nine Malus chloroplast genomes.
Figure 5. Comparison of the nine cp genomes using mVISTA, with the chloroplast genome of M. hupehensis as the reference. The grey arrows and thick black lines above the alignment indicate the position and direction of each gene. The y-axis represents the percentage identity (shown: 50–100%).
Figure 6. A maximum likelihood (ML) phylogenetic tree based on the chloroplast genomes of 26 species. Ficus racemosa and Morus mongolica (Moraceae) were used as the outgroup.