ZhouXian Ni, YouJu Ye, Tiandao Bai, Meng Xu, Li-An Xu.
Abstract
The chloroplast genome (CPG) of Pinus massoniana (genus Pinus, Pinaceae), a primary source of turpentine, was sequenced and analyzed in terms of gene rearrangements, ndh gene loss, and the contraction and expansion of short inverted repeats (IRs). The P. massoniana CPG has a typical quadripartite structure that includes a large single copy (LSC) region (65,563 bp), a small single copy (SSC) region (53,230 bp) and two IRs (IRa and IRb, 485 bp). A total of 108 unique genes were identified, including 73 protein-coding genes, 31 tRNAs, and 4 rRNAs. Most of the 81 simple sequence repeats (SSRs) identified in the CPG were mononucleotide motifs of the A/T type and were located in non-coding regions. Comparisons with related species revealed an inversion (21,556 bp) in the LSC region. The P. massoniana CPG lacks all 11 intact ndh genes: four were lost completely, five remain as truncated pseudogenes, and the other two remain as pseudogenes because of short insertions or deletions. A pair of short IRs was found instead of large IRs, and size variations among pine species were observed, which resulted from short insertions or deletions and non-synchronized variations between "IRa" and "IRb". Phylogenetic analyses based on whole CPG sequences of 16 conifers indicated that whole CPG sequences could be used as a powerful tool in phylogenetic analyses.
Keywords: comparative genomics; conifer species; genome annotation; phylogenetic analysis; structural inversion
Year: 2017 PMID: 28891993 PMCID: PMC6151703 DOI: 10.3390/molecules22091528
Source DB: PubMed Journal: Molecules ISSN: 1420-3049 Impact factor: 4.411
Figure 1. Chloroplast genome annotation map for Pinus massoniana. Genes lying outside the circle are transcribed in a clockwise direction, whereas genes inside are transcribed in a counterclockwise direction. Different colors represent different functional groups. The dashed darker and lighter gray in the inner circle denote the GC and AT contents of the chloroplast genome, respectively. LSC, SSC and IRs denote the large single copy, small single copy, and inverted repeat regions, respectively.
Gene contents of the P. massoniana chloroplast genome based on genome annotation.
| Category | Gene Contents |
|---|---|
| Subunits of photosystem I | |
| Subunits of photosystem II | |
| Small subunit of ribosome | |
| Large subunit of ribosome | |
| Subunits of cytochrome b/f complex | |
| Subunits of ATP synthase | |
| DNA-dependent RNA polymerase | |
| Chlorophyll biosynthesis | |
| Protease | |
| Maturase | |
| Envelope membrane protein | |
| Translation initiation factor | |
| Cytochrome c biogenesis | |
| Subunit of acetyl-CoA carboxylase | |
| Subunit of rubisco | |
| Ribosomal RNAs | |
| Conserved open reading frames | |
| Transfer RNA | |
a Gene copies in genome; b Intron-containing gene.
Figure 2. Synteny and rearrangements detected in the chloroplast genome sequences of four coniferous species using Mauve multiple-genome alignment. (a) Color bars indicate syntenic blocks, and connecting lines indicate corresponding blocks; (b) schematic illustration of the region in the red frame in (a): green boxes denote protein-coding genes and red boxes denote tRNAs. Boxes above and below the main line indicate the forward and reverse directions, respectively.
Figure 3. Distribution of each simple sequence repeat (SSR) category in the chloroplast genome (CPG) of Pinus massoniana. (a) Distribution of each SSR category in the whole chloroplast genome; (b) distribution of each SSR category in the coding sequence (CDS) and non-CDS regions of the CPG; (c) distribution of each SSR category in the LSC and SSC regions of the CPG.
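As an illustration of how perfect SSRs such as the mononucleotide A/T repeats can be located in a sequence, the following is a minimal regex-based sketch. The function name `find_ssrs` and the minimum repeat thresholds are assumptions chosen for illustration; they are not the tool or the parameters used in the study.

```python
import re

# Hypothetical thresholds: motif length -> minimum number of repeats.
# These are illustrative values, not those used in the paper.
MIN_REPEATS = {1: 10, 2: 5, 3: 4}

def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs.

    `start` is a 0-based index into `seq`.
    """
    seq = seq.upper()
    hits = []
    for motif_len, min_count in sorted(min_repeats.items()):
        # ([ACGT]{k}) captures one motif; \1{n,} demands at least n more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_count - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are just a single base repeated (e.g. "AA"),
            # since those are already reported as mononucleotide SSRs.
            if motif_len > 1 and len(set(motif)) == 1:
                continue
            hits.append((m.start(), motif, len(m.group(0)) // motif_len))
    return sorted(hits)

# Toy sequence: a 12-base poly-A run and six copies of "AT".
toy = "GC" + "A" * 12 + "GCGC" + "AT" * 6 + "CC"
for start, motif, count in find_ssrs(toy):
    print(start, motif, count)
```

Counting the hits per motif length, and intersecting their positions with annotated CDS or LSC/SSC coordinates, would give category distributions of the kind plotted in the figure.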
Figure 4. Dot plot analysis of seven ndh genes between P. massoniana and Cryptomeria japonica.
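A dot plot marks every pair of positions at which two sequences share an identical short word, so conserved stretches appear as diagonals and truncations as shortened or missing diagonals. The sketch below shows the general idea only; the function names and the word size of 3 are assumptions, and it is not the software used to produce the figure.

```python
def dot_plot(seq_a, seq_b, word=3):
    """Return the set of (i, j) pairs where the word of length `word`
    starting at seq_a[i] equals the word starting at seq_b[j]."""
    seq_a, seq_b = seq_a.upper(), seq_b.upper()
    # Index every word in seq_b by its start positions.
    index = {}
    for j in range(len(seq_b) - word + 1):
        index.setdefault(seq_b[j:j + word], []).append(j)
    dots = set()
    for i in range(len(seq_a) - word + 1):
        for j in index.get(seq_a[i:i + word], ()):
            dots.add((i, j))
    return dots

def render(dots, len_a, len_b, word=3):
    """Draw the dot plot as text: rows follow seq_a, columns follow seq_b."""
    rows = []
    for i in range(len_a - word + 1):
        rows.append("".join("*" if (i, j) in dots else "."
                            for j in range(len_b - word + 1)))
    return "\n".join(rows)

# Toy example: the second sequence is a truncated copy of the first.
full, truncated = "ACGTAC", "GTAC"
print(render(dot_plot(full, truncated), len(full), len(truncated)))
```

In the toy example the truncated sequence matches only the tail of the full one, which is the pattern expected when a gene survives as a truncated pseudogene.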
Figure 5. Variations of inverted repeats (IRs) shown by multiple alignment. (a) Variations between P. massoniana and P. taiwanensis; (b) variations between “IRa” and “IRb” in P. koraiensis and P. contorta. Variations are in red frames; single nucleotide polymorphisms (SNPs) are in yellow frames; microsatellites are in blue frames.
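Because the two copies of an inverted repeat are reverse complements of each other, non-synchronized substitutions between "IRa" and "IRb" can be found by comparing one copy with the reverse complement of the other. The following is a minimal sketch under the assumption that both copies have equal length; the function names are hypothetical, and copies that differ by insertions or deletions, as reported here for some pines, would need a proper alignment instead.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]

def ir_mismatches(ir_a, ir_b):
    """Return 0-based positions (in ir_a coordinates) where ir_a differs
    from the reverse complement of ir_b.

    Assumes equal-length copies (substitutions only, no indels).
    """
    rc_b = reverse_complement(ir_b).upper()
    ir_a = ir_a.upper()
    if len(ir_a) != len(rc_b):
        raise ValueError("IR copies differ in length; align them first")
    return [i for i, (x, y) in enumerate(zip(ir_a, rc_b)) if x != y]

# Toy example: a perfect IR pair, then a pair with one substitution.
print(ir_mismatches("ATGCCGTT", "AACGGCAT"))
print(ir_mismatches("ATGCCGTT", "AACGGCAA"))
```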
Figure 6. Phylogenetic trees constructed by maximum likelihood (ML) and Bayesian inference (BI) methods based on whole chloroplast genome sequences and 56 shared protein-coding genes of 16 conifers. (a) Phylogenetic tree based on whole chloroplast genome sequences of 16 conifers; (b) phylogenetic tree based on 56 shared protein-coding genes of the chloroplast genomes of 16 conifers. Ginkgo biloba was used as an outgroup; BI posterior probability/ML bootstrap support values are listed at each node.