Sui Wang1, Chuanping Yang1, Xiyang Zhao1, Su Chen2, Guan-Zheng Qu3.
Abstract
BACKGROUND: Betula platyphylla is a common tree species in northern China that has high economic and medicinal value. Our laboratory has been devoted to genome research on B. platyphylla for approximately 10 years. As primary organelle genomes, complete chloroplast genome sequences are important for studying species divergence, RNA editing and phylogeny. In this study, we sequenced and analyzed the complete chloroplast (cp) genome sequence of B. platyphylla.
Keywords: Betula platyphylla; Chloroplast genome; Phylogeny; RNA editing; White birch
Year: 2018 PMID: 30572840 PMCID: PMC6302522 DOI: 10.1186/s12864-018-5346-x
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Statistics for the contigs
| Number | Size (bp) | N50 (bp) | N90 (bp) | Longest length (bp) | Shortest length (bp) |
|---|---|---|---|---|---|
| 35 | 150,362 | 21,011 | 2,253 | 35,505 | 139 |
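The N50 and N90 columns can be illustrated with a minimal sketch, assuming the usual assembly definition: the length of the contig at which the running total, taken from the longest contig downwards, first reaches 50% (or 90%) of the summed contig length. The function name `nx` and the contig lengths in the usage note are hypothetical and are not the 35 contigs reported in the table.

```python
def nx(contig_lengths, percent):
    """Return the Nx statistic (e.g. percent=50 for N50) of a contig set.

    Assumes the common definition: sort contigs from longest to shortest
    and report the length of the contig at which the cumulative length
    first reaches `percent` percent of the total assembly length.
    """
    if not contig_lengths:
        raise ValueError("contig_lengths must not be empty")
    if not 0 < percent <= 100:
        raise ValueError("percent must be in the range (0, 100]")

    ordered = sorted(contig_lengths, reverse=True)
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        # Integer comparison avoids floating-point rounding at the threshold.
        if running * 100 >= percent * total:
            return length
    return ordered[-1]  # defensive fallback; the loop always returns first
```

For a toy assembly with contigs of 10, 8, 6, 4 and 2 bp, `nx(lengths, 50)` gives 8 and `nx(lengths, 90)` gives 4. A large gap between N50 and N90, as in the table, suggests that a few long contigs carry most of the sequence while many short ones make up the remainder.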
Fig. 1 Chloroplast genome map of Betula platyphylla. As indicated by the arrows, genes inside the circle are transcribed clockwise, and genes outside are transcribed counter-clockwise. The grey inner circle corresponds to the GC content. Genes belonging to different functional groups are shown in different colours.
Group of genes within the B. platyphylla chloroplast genome
| Group of genes | Gene names |
|---|---|
| Photosystem I | |
| Photosystem II | |
| Cytochrome b/f complex | |
| ATP synthase | |
| NADH dehydrogenase | |
| RubisCO large subunit | |
| RNA polymerase | |
| Ribosomal proteins (SSU) | |
| Ribosomal proteins (LSU) | |
| Hypothetical chloroplast reading frames (ycf) | |
| Other genes | |
| Ribosomal RNAs | |
| Transfer RNAs | |
Fig. 2 Sequence logo and RNA-Seq mapping of the three genes. a: Sequence logo of the first 10 bp of the three genes across the species. b: RNA-Seq mapping of the first 10 bp of the three genes in B. platyphylla.
Fig. 3 Sequence alignment of five chloroplast genomes in Fagales using the mVISTA program with B. platyphylla as the reference. Grey arrows above the alignment indicate the transcriptional directions of genes. Genome regions are colour-coded as exons and conserved non-coding sequences (CNS). A cut-off of 50% identity was used for the plots. The Y-axis shows percent identity from 50 to 100%.
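The percent-identity axis and the 50% cut-off in Fig. 3 can be illustrated with a minimal sketch of windowed identity between two pre-aligned sequences. This is not mVISTA's own algorithm. The function names, the window size and the rule that gap columns count as mismatches are all assumptions made for illustration.

```python
def windowed_identity(reference, other, window=100, step=None):
    """Percent identity per window for two aligned, equal-length sequences.

    Assumptions (not taken from mVISTA): windows do not overlap unless
    `step` is given, a trailing partial window is ignored, and any column
    containing a gap character '-' counts as a mismatch.
    """
    if len(reference) != len(other):
        raise ValueError("aligned sequences must have the same length")
    if window <= 0:
        raise ValueError("window must be positive")
    step = step or window

    results = []
    for start in range(0, len(reference) - window + 1, step):
        ref_slice = reference[start:start + window]
        other_slice = other[start:start + window]
        matches = sum(
            1 for a, b in zip(ref_slice, other_slice) if a == b and a != "-"
        )
        results.append((start, 100.0 * matches / window))
    return results


def above_cutoff(windows, cutoff=50.0):
    """Keep only windows whose identity reaches the plotting cut-off."""
    return [(start, pct) for start, pct in windows if pct >= cutoff]
```

With a toy alignment such as `windowed_identity("ACGTACGT", "ACGTTCGA", window=4)`, the first window is fully identical and the second shares two of four positions, so both sit at or above a 50% cut-off.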
Fig. 4 Comparison of the borders of LSC, SSC and IR regions among the four Fagales genomes. ψ indicates a pseudogene. The figure is not strictly proportional. Gene length: B. platyphylla (rps19: 279 bp, rpl2: 1511 bp, ψycf1: 1237 bp, ndhF: 2247 bp, ycf1: 5751 bp, ψrps19: 23 bp); J. regia (rps19: 285 bp, rpl2: 1510 bp, ψycf1: 1155 bp, ndhF: 2226 bp, ycf1: 5676 bp); M. rubra (rps19: 279 bp, rpl2: 1509 bp, ψycf1: 1149 bp, ndhF: 2226 bp, ycf1: 5655 bp); C. mollissima (rps19: 285 bp, rpl2: 1509 bp, ψycf1: 1050 bp, ndhF: 2262 bp, ycf1: 5685 bp).
RNA editing sites and amino acid changes
| Gene | Subunit | Genome Position | Gene Position | Nucleotide Change | Codon Change | Edit Position within Codon | Amino Acid Change | Sequencing Depth | Editing Efficiency (%) |
|---|---|---|---|---|---|---|---|---|---|
| | exon | 2963 | 710 | G > A | UCU > UUU | 2 | S > F | 37 | 70 |
| | exon | 3518 | 155 | G > A | UCU > UUU | 2 | S > F | 38 | 84 |
| | exon | 5795 | 212 | G > A | UCA > UUA | 2 | S > L | 179 | 82 |
| | intron | 6389 | – | G > A | – | – | – | 47 | 81 |
| | exon | 12,099 | 914 | G > A | UCA > UUA | 2 | S > L | 120 | 98 |
| | exon | 12,222 | 791 | G > A | CCG > CUG | 2 | P > L | 189 | 98 |
| | exon | 14,294 | 92 | G > A | CCA > CUA | 2 | P > L | 199 | 81 |
| | IGR | 16,247 | – | G > A | – | – | – | 229 | 97 |
| | exon | 17,781 | 248 | G > A | UCA > UUA | 2 | S > L | 85 | 95 |
| | exon | 18,698 | 3761 | G > A | UCA > UUA | 2 | S > L | 41 | 98 |
| | exon | 20,661 | 1798 | G > T | GGU > AGU | 1 | G > S | 33 | 18 |
| | exon | 20,763 | 1696 | G > C | CUC > GUC | 1 | L > V | 34 | 12 |
| | exon | 24,185 | 488 | G > A | UCA > UUA | 2 | S > L | 79 | 80 |
| | exon | 25,453 | 41 | G > A | UCA > UUA | 2 | S > L | 20 | 65 |
| | exon | 26,307 | 2426 | G > A | UCA > UUA | 2 | S > L | 11 | 100 |
| | exon | 26,733 | 2000 | G > A | UCU > UUU | 2 | S > F | 15 | 93 |
| | exon | 27,555 | 1178 | G > A | UCG > UUG | 2 | S > L | 12 | 67 |
| | exon | 28,167 | 566 | G > A | UCG > UUG | 2 | S > L | 6 | 17 |
| | exon | 28,182 | 551 | G > A | UCA > UUA | 2 | S > L | 9 | 44 |
| | exon | 28,395 | 338 | G > A | UCU > UUU | 2 | S > F | 22 | 86 |
| | IGR | 35,186 | – | A > C | – | – | – | 23 | 17 |
| | exon | 39,947 | 50 | C > U | UCA > UUA | 2 | S > L | 176 | 26 |
| | exon | 41,250 | 149 | G > A | CCA > CUA | 2 | P > L | 210 | 90 |
| | exon | 41,319 | 80 | G > A | CCC > CUC | 2 | P > L | 240 | 88 |
| | exon | 44,617 | 1395 | G > A | CCC > CCU | 3 | P > P | 95 | 37 |
| | IGR | 51,329 | – | C > U | – | – | – | 614 | 17 |
| | exon | 54,554 | 65 | G > A | UCA > UUA | 2 | S > L | 826 | 89 |
| | exon | 54,719 | 323 | G > A | ACU > AUU | 2 | T > I | 816 | 96 |
| | IGR | 59,062 | – | G > A | – | – | – | 349 | 28 |
| | exon | 62,731 | 815 | C > U | UCG > UUG | 2 | S > L | 65 | 92 |
| | exon | 63,322 | 1406 | C > U | CCA > CUA | 2 | P > L | 302 | 82 |
| | exon | 64,351 | 88 | C > U | CAU > UAU | 1 | H > Y | 89 | 84 |
| | exon | 66,260 | 492 | C > U | UUC > UUU | 3 | F > F | 84 | 49 |
| | IGR | 67,928 | – | C > U | – | – | – | 31 | 13 |
| | exon | 69,182 | 77 | G > A | UCU > UUU | 2 | S > F | 26 | 96 |
| | 5’UTR | 69,260 | -2 | G > A | – | – | – | 39 | 62 |
| | exon | 69,306 | 214 | G > A | CCU > UCU | 1 | P > S | 46 | 96 |
| | exon | 70,671 | 5 | C > U | CCU > CUU | 2 | P > L | 28 | 46 |
| | IGR | 71,784 | – | C > U | – | – | – | 1163 | 99 |
| | exon | 74,986 | 559 | G > A | CAU > UAU | 1 | H > Y | 33 | 97 |
| | exon | 79,461 | 29 | G > A | UCU > UUU | 2 | S > F | 70 | 30 |
| | intron | 80,119 | – | C > T | – | – | – | 56 | 18 |
| | exon | 81,188 | 418 | C > U | CGG > UGG | 1 | R > W | 206 | 94 |
| | exon | 81,381 | 611 | C > U | CCA > CUA | 2 | P > L | 207 | 94 |
| | exon | 83,130 | 830 | G > A | UCA > UUA | 2 | S > L | 144 | 70 |
| | exon | 83,760 | 200 | G > A | UCU > UUU | 2 | S > F | 76 | 83 |
| | exon | 84,338 | 108 | G > A | UUC > UUU | 3 | F > F | 357 | 55 |
| | exon | 91,217 | 89 | G > A | UCA > UUA | 2 | S > L | 25 | 84 |
| | exon | 91,235 | 71 | G > A | UCU > UUU | 2 | S > F | 31 | 97 |
| | exon | 98,495 | 6863 | A > G | UAA > UGA | 2 | * > * | 6 | 50 |
| | exon | 99,901 | 1487 | G > A | CCA > CUA | 2 | P > L | 20 | 80 |
| | exon | 99,955 | 1433 | G > A | UCA > UUA | 2 | S > L | 15 | 40 |
| | exon | 100,133 | 1255 | G > A | CAU > UAU | 1 | H > Y | 17 | 100 |
| | exon | 100,276 | 1112 | G > A | UCA > UUA | 2 | S > L | 25 | 88 |
| | exon | 100,558 | 830 | G > A | UCA > UUA | 2 | S > L | 10 | 70 |
| | exon | 101,328 | 746 | G > A | UCU > UUU | 2 | S > F | 9 | 22 |
| | exon | 101,337 | 737 | G > A | CCA > CUA | 2 | P > L | 11 | 64 |
| | exon | 101,488 | 586 | G > A | CAU > UAU | 1 | H > Y | 11 | 82 |
| | exon | 101,607 | 467 | G > A | CCA > CUA | 2 | P > L | 23 | 91 |
| | exon | 101,925 | 149 | G > A | UCA > UUA | 2 | S > L | 23 | 91 |
| | intron | 103,149 | – | G > A | – | – | – | 87 | 67 |
| | intron | 103,290 | – | G > A | – | – | – | 115 | 85 |
| | IGR | 104,462 | – | G > A | – | – | – | 629 | 30 |
| | exon | 106,489 | 577 | G > A/U/C | – | – | – | 1404 | 49 |
| | exon | 107,021 | 45 | U > A/G | – | – | – | 1112 | 41 |
| | exon | 111,418 | 881 | U > A/G/C | – | – | – | 3163 | 73 |
| | exon | 116,076 | 1734 | C > U | AUG > AUA | 3 | M > I | 23 | 26 |
| | exon | 117,520 | 290 | G > A | UCA > UUA | 2 | S > L | 21 | 38 |
| | exon | 121,618 | 1298 | G > A | UCA > UUA | 2 | S > L | 52 | 92 |
| | exon | 122,029 | 887 | G > A | CCC > CUC | 2 | P > L | 12 | 58 |
| | exon | 122,242 | 674 | G > A | UCA > UUA | 2 | S > L | 40 | 85 |
| | exon | 122,317 | 599 | G > A | UCA > UUA | 2 | S > L | 46 | 76 |
| | exon | 122,533 | 383 | G > A | UCA > UUA | 2 | S > L | 28 | 75 |
| | exon | 122,914 | 2 | G > A | ACG > AUG | 2 | T > M | 33 | 64 |
| | exon | 123,626 | 233 | G > A | CCA > CUA | 2 | P > L | 110 | 9 |
| | exon | 125,765 | 961 | G > A | CCU > UCU | 1 | P > S | 334 | 94 |
| | exon | 127,589 | 341 | G > A | UCA > UUA | 2 | S > L | 99 | 79 |
| | exon | 128,810 | 303 | G > A | AUC > AUU | 3 | I > I | 44 | 25 |
| | exon | 131,464 | 4236 | G > A | CGC > CGU | 3 | R > R | 160 | 19 |
| | exon | 133,945 | 1755 | G > A | UUC > UUU | 3 | F > F | 250 | 17 |
Editing efficiency is the number of edited reads divided by the total number of reads mapped at the same site, expressed as a percentage
If the sequencing depth at an editing site is below 30, the estimated editing efficiency may carry a large error
Editing sites in the IR regions are counted only once
IGR: intergenic region
UTR: untranslated region; it is counted as part of the exon
*: Stop codon
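The footnote definitions can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' pipeline: the function names are hypothetical, the read counts in the usage note are invented, and the codon table assumes the standard amino acid assignments, which the plant plastid code shares.

```python
# Standard genetic code laid out in UCAG order (first, second, third base).
BASES = "UCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"  # first base U
    "LLLLPPPPHHQQRRRR"  # first base C
    "IIIMTTTTNNKKSSRR"  # first base A
    "VVVVAAAADDEEGGGG"  # first base G
)
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def editing_efficiency(edited_reads, total_reads):
    """Edited reads divided by all reads mapped at the site, in percent."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    if not 0 <= edited_reads <= total_reads:
        raise ValueError("edited_reads must lie between 0 and total_reads")
    return 100.0 * edited_reads / total_reads


def is_low_depth(total_reads, min_depth=30):
    """Flag sites whose depth is below 30, where the estimate is unreliable."""
    return total_reads < min_depth


def amino_acid_change(codon_before, codon_after):
    """Format the amino acid change implied by an RNA codon edit."""
    return f"{CODON_TABLE[codon_before]} > {CODON_TABLE[codon_after]}"
```

For example, a hypothetical site with 7 edited reads out of 10 has an efficiency of 70% but would be flagged as low depth, and `amino_acid_change("UCU", "UUU")` returns `"S > F"`, matching the codon and amino acid columns of the table.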
Fig. 5 Validation of inferred editing sites from RNA-Seq by Sanger sequencing. Sequencing chromatogram traces from two exemplary gene loci, accD and matK, are shown. The editing positions are highlighted by arrows. The top trace is genomic DNA (gDNA), and the bottom trace is complementary DNA (cDNA).
Fig. 6 Phylogenetic tree reconstruction of Fagales using maximum likelihood (ML) based on whole chloroplast genome sequences. The ML bootstrap support value is given at each node. Nicotiana tabacum was used as the outgroup.