Zhihua Wu, Songtao Gui, Zhiwu Quan, Lei Pan, Shuzhen Wang, Weidong Ke, Dequan Liang, Yi Ding.
Abstract
BACKGROUND: The chloroplast genome is important for plant development and plant evolution. Nelumbo nucifera is a relict plant that has survived from the late Cretaceous. A new sequencing platform, PacBio RS II, which performs SMRT (Single Molecule, Real-Time) sequencing, has recently been developed. Using SMRT sequencing to investigate the chloroplast genome of N. nucifera will help to elucidate the plastid evolution of basal eudicots.
Year: 2014 PMID: 25407166 PMCID: PMC4245832 DOI: 10.1186/s12870-014-0289-0
Source DB: PubMed Journal: BMC Plant Biol ISSN: 1471-2229 Impact factor: 4.215
Figure 1. Gene map of the chloroplast genome from the PacBio RS II platform. The inverted repeats are indicated by thick lines. Asterisks indicate genes containing introns. Genes on the outside of the circle are transcribed in a clockwise direction and genes on the inside of the circle are transcribed in a counter-clockwise direction.
Statistics of the chloroplast genome sequencing data from Illumina MiSeq and PacBio RS II
| | Illumina MiSeq | PacBio RS II |
|---|---|---|
| Library size (bp) | 400 | 20,000 |
| Number of raw reads | 12,164,066 | 226,904 |
| High-quality bases (M) | 394 | 845 |
| Mean read length (bp), raw data | 250 | 4,474 |
| CP average read depth (error-corrected) | 712× (n.a.) | 105× |
| Proportion of bases ≥ Q40 | 99.99% | 99.98% |
| SC average read depth | 493× | 83× |
| IR average read depth | 531× | 52× |
| No. of gaps | 2 | 0 |
| No. of contigs | 1 | 1 |
| Total length (bp) | 163,747 | 163,600 |
List of genes present in the chloroplast genome of
| Category | Group of genes (number of genes) |
|---|---|
| Protein synthesis and DNA replication | Ribosomal RNAs (8) |
| | Transfer RNAs (37) |
| | Ribosomal proteins, small subunit (14) |
| | Ribosomal proteins, large subunit (11) |
| | Subunits of RNA polymerase (4) |
| Photosynthesis | Photosystem I (5) |
| | Photosystem II (15) |
| | Cytochrome b/f complex (6) |
| | ATP synthase (6) |
| | NADH dehydrogenase (12) |
| | Large subunit of Rubisco (1) |
| Miscellaneous group | Translation initiation factor IF-1 (1) |
| | Acetyl-CoA carboxylase (1) |
| | Cytochrome c biogenesis (1) |
| | Maturase (1) |
| | ATP-dependent protease (1) |
| | Inner membrane protein (1) |
| Genes of unknown function | Conserved hypothetical chloroplast reading frames (5) |
Genes with introns are marked with asterisks (*).
The numbers in parentheses represent the number of genes.
Alternative start codon usage in the sequenced basal eudicots
| | | | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|
| | ACG | | | | | | | | |
| | ACG | ACG | ACG | ACG | ACG | ACC | ACG | ACG | |
| | GTG | GTG | GTG | GTG | | | | | |
| | ACG | ACG | ACG | ACG | ACG | ACG | | | |
| | ACG | ACG | ACG | ACG | ATA | ATA | ATA | ACG | ACG |
| | GTG | GTG | GTG | GTG | GTG | GTG | GTG | GTG | GTG |
| | GTG | | | | | | | | |
| Total No. | 6 | 4 | 3 | 5 | 4 | 4 | 4 | 5 | 3 |
Relative synonymous codon usage for 79 distinct chloroplast protein-coding genes in
| Codon | Count¹ | RSCU² | Codon | Count¹ | RSCU² | Codon | Count¹ | RSCU² | Codon | Count¹ | RSCU² |
|---|---|---|---|---|---|---|---|---|---|---|---|
| UUU(F) | 784 | 1.25 | UCU(S) | 475 | 1.65 | UAU(Y) | 658 | 1.6 | UGU(C) | 190 | 1.46 |
| UUC(F) | 475 | 0.75 | UCC(S) | 284 | 0.99 | UAC(Y) | 166 | 0.4 | UGC(C) | 70 | 0.54 |
| UUA(L) | 692 | 1.76 | UCA(S) | 368 | 1.28 | **UAA(\*)**³ | 36 | 1.37 | **UGA(\*)**³ | 21 | 0.8 |
| UUG(L) | 506 | 1.29 | UCG(S) | 148 | 0.51 | **UAG(\*)**³ | 22 | 0.84 | UGG(W) | 406 | 1 |
| CUU(L) | 497 | 1.26 | CCU(P) | 390 | 1.61 | CAU(H) | 427 | 1.5 | CGU(R) | 333 | 1.42 |
| CUC(L) | 164 | 0.42 | CCC(P) | 177 | 0.73 | CAC(H) | 141 | 0.5 | CGC(R) | 79 | 0.34 |
| CUA(L) | 331 | 0.84 | CCA(P) | 281 | 1.16 | CAA(Q) | 605 | 1.51 | CGA(R) | 325 | 1.39 |
| CUG(L) | 169 | 0.43 | CCG(P) | 123 | 0.51 | CAG(Q) | 197 | 0.49 | CGG(R) | 106 | 0.45 |
| AUU(I) | 945 | 1.45 | ACU(T) | 468 | 1.58 | AAU(N) | 822 | 1.53 | AGU(S) | 358 | 1.24 |
| AUC(I) | 408 | 0.63 | ACC(T) | 224 | 0.75 | AAC(N) | 253 | 0.47 | AGC(S) | 95 | 0.33 |
| AUA(I) | 605 | 0.93 | ACA(T) | 362 | 1.22 | AAA(K) | 860 | 1.48 | AGA(R) | 426 | 1.82 |
| AUG(M) | 547 | 1 | ACG(T) | 133 | 0.45 | AAG(K) | 304 | 0.52 | AGG(R) | 137 | 0.58 |
| GUU(V) | 455 | 1.43 | GCU(A) | 577 | 1.8 | GAU(D) | 750 | 1.59 | GGU(G) | 531 | 1.33 |
| GUC(V) | 156 | 0.49 | GCC(A) | 204 | 0.64 | GAC(D) | 196 | 0.41 | GGC(G) | 159 | 0.4 |
| GUA(V) | 479 | 1.5 | GCA(A) | 348 | 1.09 | GAA(E) | 899 | 1.49 | GGA(G) | 649 | 1.62 |
| GUG(V) | 185 | 0.58 | GCG(A) | 151 | 0.47 | GAG(E) | 307 | 0.51 | GGG(G) | 263 | 0.66 |
¹ Count is the number of times the codon is used in the 79 protein-coding genes.
² RSCU is the relative synonymous codon usage.
³ Codons in bold with an asterisk are stop codons.
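The RSCU values in the table can be checked from the counts. The sketch below assumes the usual definition of RSCU: the observed count of a codon divided by the mean count of all codons in its synonymous family. The function name `rscu` and the small input dictionaries are illustrative only and are not taken from the paper. The example uses the phenylalanine family (UUU, UUC) and the stop-codon family, whose counts 36, 22 and 21 are assigned here to UAA, UAG and UGA following the standard codon-table layout.

```python
from collections import defaultdict


def rscu(counts, codon_to_aa):
    """Return RSCU per codon.

    RSCU = observed count / mean count over the codon's synonymous family
    (assumed standard definition; names here are hypothetical).
    """
    # Group codons by the amino acid (or stop signal) they encode.
    families = defaultdict(list)
    for codon, aa in codon_to_aa.items():
        families[aa].append(codon)

    result = {}
    for codons in families.values():
        mean = sum(counts[c] for c in codons) / len(codons)
        for c in codons:
            # A family with no observations gets RSCU 0.0 to avoid dividing by zero.
            result[c] = counts[c] / mean if mean else 0.0
    return result


# Counts copied from the table above; family assignments per the standard genetic code.
counts = {"UUU": 784, "UUC": 475, "UAA": 36, "UAG": 22, "UGA": 21}
codon_to_aa = {"UUU": "F", "UUC": "F", "UAA": "*", "UAG": "*", "UGA": "*"}

values = rscu(counts, codon_to_aa)
for codon in counts:
    print(codon, round(values[codon], 2))
```

Under these assumptions the rounded results (1.25 and 0.75 for UUU and UUC; 1.37, 0.84 and 0.8 for the three stop codons) agree with the tabulated RSCU column. The three stop-codon counts also sum to 79, one per protein-coding gene.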
Figure 2. Phylogenetic tree of the 133 taxa based on 79 chloroplast protein-coding genes. The ML tree has a -lnL of −1601140.821388, with support values for ML provided at the nodes. Asterisks indicate ML BS = 100%. Taxa in blue are the two new genomes sequenced in this study.
Figure 3. Posterior estimates of divergence time of 133 taxa on the phylogenetic tree. The values at the nodes represent mean ages in a 95% highest posterior density (HPD) analysis. Estimations were performed with MCMCTree using the IR (independent rate) model.
Figure 4. Comparison of the boundaries of LSC, IR and SSC among eight chloroplast genomes of basal eudicots.