
Deciphering tea tree chloroplast and mitochondrial genomes of Camellia sinensis var. assamica.

Fen Zhang1, Wei Li1, Cheng-Wen Gao2, Dan Zhang1, Li-Zhi Gao3,4.   

Abstract

Tea is the oldest and most popular non-alcoholic, caffeine-containing beverage in the world. In this study, we de novo assembled the chloroplast (cp) and mitochondrial (mt) genomes of C. sinensis var. assamica cv. Yunkang10 into a circular contig of 157,100 bp and two complete circular scaffolds (701,719 bp and 177,329 bp), respectively. We annotated a total of 141 cp genes and 71 mt genes. Comparative analysis shows that the mt genome is richer in repeats than the cp genome: it contains 37,878 bp of long repeat sequences and 665 SSRs, compared with 149 bp and 214 SSRs in the cp genome. We also detected 478 RNA-editing sites in 42 protein-coding mt genes, ~4.4-fold more sites per gene than the 54 RNA-editing sites detected in 21 protein-coding cp genes. The high-quality cp and mt genomes of C. sinensis var. assamica presented in this study will become an important resource for a range of genetic, functional, evolutionary and comparative genomic studies in tea tree and other Camellia species of the Theaceae family.

Year:  2019        PMID: 31624267      PMCID: PMC6797725          DOI: 10.1038/s41597-019-0201-8

Source DB:  PubMed          Journal:  Sci Data        ISSN: 2052-4463            Impact factor:   6.444


Background & Summary

Tea has been consumed since 3000 B.C. and is the oldest and most popular non-alcoholic, caffeine-containing beverage in the world[1,2]. Tea is made from the young leaves of Camellia sinensis var. sinensis and C. sinensis var. assamica; together with well-known ornamental camellias (e.g., C. japonica, C. reticulata and C. sasanqua) and the world-renowned woody oil crop C. oleifera[3], these plants give the genus Camellia great economic value within Theaceae. Besides its industrial, cultural and medicinal value, the genus has drawn increasing attention from botanists and evolutionary biologists. As a result of frequent hybridization and polyploidization, Camellia is commonly regarded as one of the most taxonomically and phylogenetically difficult taxa in flowering plants[4]. Taxonomic classification of Camellia species based on morphological characteristics has therefore long been problematic[5]. Chloroplast (cp) genomes can provide valuable information for taxonomic classification, tracing source populations[6,7] and reconstructing phylogenies to resolve complex evolutionary relationships[8-10], owing to their conserved genomic structure, maternal inheritance and fairly low recombination rate. cp genomes are more conserved than plant mitochondrial (mt) genomes, which are more heterogeneous in nature. However, the existence of NUPTs (nuclear plastid DNA) means that cp genomes assembled from whole-genome sequencing (WGS) data may include heterogeneity arising from cp DNA that has been transferred to the nucleus, which can result in erroneous phylogenetic inferences[11]. It has long been acknowledged that mtDNA tends to integrate DNA from various sources through intracellular and horizontal transfer[12-14]. Partly for these reasons, mt genomes range from ~200 Kbp to ~11.3 Mbp in some living organisms[15-17].
The dynamic nature of mt genome structure has been recognized, and plant mt genomes can adopt a variety of genomic configurations owing to recombination and differences in repeat content[18,19]. These characteristics make the plant mt genome a fascinating genetic system for investigating questions in evolutionary biology. A first effort sequenced 13 representative Camellia chloroplast genomes on the Illumina next-generation sequencing platform and yielded novel insights into global patterns of structural variation across Camellia cp genomes[4]. The reconstructed phylogenetic relationships among these representative Camellia species suggest that cp genomic resources can provide useful data for understanding their evolutionary relationships and classifying the ‘difficult taxa’. Growing interest in Camellia plants has led to the sequencing of thirty-eight cp genomes to date[20-37]. Recently, we decoded the first nuclear genome of C. sinensis var. assamica cv. Yunkang10, providing novel insights into the genomic basis of tea flavors[38]. However, the C. sinensis var. assamica cp genome is missing from the thirty-eight cp genomes sequenced in this genus[4,20-37], and, to date, no mt genome has been determined in the genus Camellia. In this study, we filtered cpDNA and mtDNA reads from the WGS genome sequencing project[38] and de novo assembled the mt and cp genomes of C. sinensis var. assamica. Together, the cp and mt genomes will help to build a comprehensive understanding of the taxonomy and evolution of the genus Camellia. These genome sequences will also facilitate the genetic modification of these economically important plants, for example, through chloroplast genetic engineering technologies.

Methods

Plant materials, DNA extraction and genome sequencing

Young, healthy leaves of an individual plant of C. sinensis var. assamica cultivar Yunkang10 were collected for genome sequencing in April 2009 from Menghai County, Yunnan Province, China. Fresh leaves were frozen in liquid nitrogen immediately after collection and stored at −80 °C in the laboratory until DNA extraction. High-quality genomic DNA was extracted from leaves using a modified CTAB method[39]. RNase A and proteinase K were used to remove RNA and protein contamination, respectively. The quality and quantity of the isolated DNA were checked by electrophoresis on a 0.8% agarose gel and with a NanoDrop D-1000 spectrophotometer (NanoDrop Technologies, Wilmington, DE), respectively. A total of eleven paired-end libraries, including four types of small-insert libraries (180 bp, 260 bp, 300 bp, 500 bp) and seven large-insert libraries (2 Kb, 3 Kb, 4 Kb, 5 Kb, 6 Kb, 8 Kb, 20 Kb), were prepared following Illumina’s instructions and sequenced on the Illumina HiSeq 2000 platform following the standard Illumina protocols (Illumina, San Diego, CA). In total, we generated ~707.88 Gb (~229.31×) of raw sequencing data[38]. Subsequent read quality-control filtering retained ~492.15 Gb (~159.43×) of high-quality data, which were used for genome assembly.
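The yield and depth figures quoted above can be cross-checked with simple arithmetic, since sequencing depth is total yield divided by genome size. The following Python sketch only tests whether the two (yield, depth) pairs imply the same genome size; the function name is illustrative, and the genome-size estimate actually used by the authors is not stated in this section.

```python
# Figures quoted in the text above: sequence yield (Gb) and fold coverage.
RAW_GB, RAW_DEPTH = 707.88, 229.31
CLEAN_GB, CLEAN_DEPTH = 492.15, 159.43


def implied_genome_size_gb(total_gb, depth):
    """Genome size (Gb) implied by: depth = total yield / genome size."""
    return total_gb / depth


raw_size = implied_genome_size_gb(RAW_GB, RAW_DEPTH)
clean_size = implied_genome_size_gb(CLEAN_GB, CLEAN_DEPTH)
retained_fraction = CLEAN_GB / RAW_GB  # share of raw data kept after filtering

print(f"implied genome size from raw data:      {raw_size:.3f} Gb")
print(f"implied genome size from filtered data: {clean_size:.3f} Gb")
print(f"fraction of raw data retained:          {retained_fraction:.1%}")
```

Both pairs point to a genome size of roughly 3.09 Gb, so the two depth values are mutually consistent, and about 70% of the raw data survived filtering.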

De novo chloroplast and mitochondria genome assemblies

To filter the chloroplast reads from the whole-genome Illumina sequencing data of C. sinensis var. assamica, we mapped all sequencing reads to the reference genomes[4] using bowtie2 (version 2.3.4.3)[40]. The mapped chloroplast reads were assembled into a circular contig of 157,100 bp with an overall GC content of 37.29% using CLC Genomics Workbench v. 3.6.1 (CLC Inc., Aarhus, Denmark) (Fig. 1). For mitochondrial genome assembly, the paired-end (PE) and mate-pair (MP) sequencing reads were used separately. Briefly, we first performed de novo assembly with VELVET v1.2.08[41], as previously described[42,43]. Scaffolds were constructed using SSPACE v.3.0[44]. False connections were manually removed based on the coverage and distances of paired reads. Gaps between scaffolds were then filled with GapCloser (version 1.12)[45,46] using all paired-end reads. From the de novo assembly of the filtered mitochondrial reads, we obtained two complete circular scaffolds (701,719 bp and 177,329 bp) of the C. sinensis var. assamica mt genome (Figs 2–4). The two scaffolds of the mt genome had overall GC contents of 45.63% and 45.81%, respectively. The completed chloroplast and mitochondrial genomes are publicly available in NCBI GenBank under accession numbers MH019307, MK574876 and MK574877 and in the BIG Genome Warehouse under WGS000271 and WGS000272.
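GC contents such as the 37.29% (cp) and 45.63%/45.81% (mt) values reported above are base-composition ratios. The following is a minimal Python sketch, assuming the assembled sequence is available as a plain string and that ambiguous bases such as N are excluded from the denominator; the source does not say how ambiguous bases were handled, and the function name `gc_content` is illustrative.

```python
def gc_content(seq):
    """Return GC content in percent, counting only unambiguous A/C/G/T bases."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    acgt = gc + seq.count("A") + seq.count("T")
    if acgt == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return 100.0 * gc / acgt


# Toy usage; a real run would pass the full contig or scaffold sequence.
print(f"{gc_content('ATGCGCNNAT'):.2f}%")
```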
Fig. 1

Genome map of C. sinensis var. assamica cv. Yunkang10. Genes lying outside the outer circle are transcribed clockwise, whereas genes inside are transcribed counterclockwise. Genes belonging to different functional groups are color-coded. In the inner circle, the darker gray area indicates GC content and the lighter gray area AT content of the genome.

Fig. 2

The assembly and annotation pipeline of the tea tree mitochondrial genome.

Fig. 4

Circular map of scaffold 2 in the C. sinensis var. assamica cv. Yunkang10 mitochondrial genome. The gene map shows 17 annotated genes, color-coded by functional group; genes on the outside of the outer circle are transcribed clockwise and those on the inside counterclockwise. The dark grey plot in the inner circle indicates GC content.


Genome annotation and visualization

The complete chloroplast genome of C. sinensis var. assamica was preliminarily annotated using the online program DOGMA[47] (Dual Organellar Genome Annotator), followed by manual correction. A total of 141 genes were annotated, of which 87 were protein-coding genes, 46 were tRNA genes and eight were rRNA genes (Table 1). MITOFY[15] was used to characterize the complement of protein-coding and rRNA genes in the mitochondrial genome. tRNA genes were searched using tRNAscan-SE (version 1.3.1)[48]. We annotated a total of 71 mt genes, including 44 protein-coding genes, 24 tRNA genes and 3 rRNA genes (Table 2). Circular genome maps were drawn with OrganellarGenomeDRAW[49] (Figs 3–4).
Table 1

Gene annotation of the C. sinensis var. assamica cp genome.

| Category | Group | Genes |
|---|---|---|
| Photosynthesis related genes | Rubisco | rbcL |
| | Photosystem I | psaA, psaB, psaC, psaI, psaJ |
| | Assembly/stability of Photosystem I | ycf3 |
| | Photosystem II | psbA, psbB, psbT, psbK, psbI, psbH, psbM, psbN, psbD, psbC, psbZ, psbJ, psbL, psbE, psbF |
| | ATP synthase | atpA, atpB, atpE, atpF, atpH, atpI |
| | Cytochrome b/f complex | petA, petB, petD, petN, petL, petG |
| | Cytochrome c synthesis | ccsA |
| | NADPH dehydrogenase | ndhA, ndhB (×2), ndhC, ndhD, ndhE, ndhF, ndhH, ndhG, ndhJ, ndhK, ndhI |
| Transcription and translation related genes | Transcription | rpoA, rpoC2, rpoC1, rpoB |
| | Ribosomal proteins | rps2, rps3, rps4, rps7 (×2), rps8, rps11, rps12, rps14, rps15, rps16, rps18, rps19, rpl2 (×2), rpl14, rpl16, rpl20, rpl22, rpl23 (×2), rpl32, rpl33, rpl36 |
| | Translation initiation factor | infA |
| RNA genes | Ribosomal RNA | rrn16S (×2), rrn23S (×2), rrn4.5 (×2), rrn5 (×2) |
| | Transfer RNA | trnH-GUG, trnK-UUU (×2), trnQ-UUG, trnS-GCU, trnG-UCC (×2), trnR-UCU, trnC-GCA, trnD-GUC, trnY-GUA, trnE-UUC, trnT-GGU, trnS-UGA, trnG-UCC, trnfM-CAU, trnS-GGA, trnT-UGU, trnL-UAA (×2), trnF-GAA, trnV-UAC (×2), trnM-CAU, trnW-CCA, trnP-UGG, trnI-CAU, trnL-CAA (×2), trnV-GAC, trnI-GAU (×3), trnA-UGC (×2), trnR-ACG (×2), trnN-GUU (×2), trnL-UAG, trnN-GUU, trnR-ACG, trnA-UGC (×2), trnV-GAC, trnI-CAU |
| Other genes | RNA processing | matK |
| | Carbon metabolism | cemA |
| | Fatty acid synthesis | accD |
| | Proteolysis | clpP |
| Genes of unknown function | Conserved ORFs | ycf1 (×2), cf2, ycf4, ycf2, ycf15 (×2) |
Table 2

Gene content of the C. sinensis var. assamica mt genome.

| Group of genes | Genes in Scaffold 1 | Genes in Scaffold 2 |
|---|---|---|
| Complex I | nad1, nad2, nad3, nad4, nad4L, nad5, nad6, nad7, nad9 (×2) | nad1, nad2 |
| Complex II | sdh3, sdh4 | sdh3 |
| Complex III | | cob |
| Complex IV | cox1, cox2, cox3 | |
| Complex V | atp1, atp4, atp6, atp8, atp9 | atp9 |
| Cytochrome c biogenesis | ccmFn, ccmB, ccmC | ccmFc |
| Ribosome large subunit | rpl2, rpl10, rpl16 | rpl5 |
| Ribosome small subunit | rps1, rps3, rps4, rps7, rps12, rps13, rps19 | rps14, rps19 |
| rRNA genes | rrn5, rrn18, rrn16 | |
| tRNA genes | trnS(Ser), trnD(Asp), trnK(Lys), trnfM(Met) (×2), trnI(Ile)-cp, trnE(Glu), trnH(His)-cp, trnP(Pro), trnW(Trp)-cp, trnG(Gly), trnQ(Gln), trnC(Cys), trnD(Asp), trnS(Ser), trnV(Val)-cp | trnI(Ile), trnM(Met)-cp, trnC(Cys), trnN(Asn)-cp, trnY(Tyr), trnS(Ser), trnF(Phe), trnP(Pro) |
| Chloroplast-derived genes | trnI(Ile)-cp, trnH(His)-cp, trnW(Trp)-cp, trnV(Val)-cp | trnM(Met)-cp, trnN(Asn)-cp |
| Other proteins | matR, mttB | |
Fig. 3

Circular map of scaffold 1 in the C. sinensis var. assamica cv. Yunkang10 mitochondrial genome. The gene map shows 54 annotated genes, color-coded by functional group; genes on the outside of the outer circle are transcribed clockwise and those on the inside counterclockwise. The dark grey plot in the inner circle indicates GC content.

Simple sequence repeats (SSRs) were identified and located using MISA (http://pgrc.ipk-gatersleben.de/misa/). All annotated SSRs were classified by the size and copy number of their tandemly repeated units: monomer (one nucleotide, n ≥ 8), dimer (two nucleotides, n ≥ 4), trimer (three nucleotides, n ≥ 4), tetramer (four nucleotides, n ≥ 3), pentamer (five nucleotides, n ≥ 3) and hexamer (six nucleotides, n ≥ 3). A total of 214 SSRs were identified in the cp genome, of which 74.42% were monomers, 19.07% dimers, 0.47% trimers, 4.65% tetramers and 0.93% hexamers (Table 3); no pentamers were found. In the mt genome, we identified 665 SSRs, distributed among monomers, dimers, trimers, tetramers, pentamers and hexamers at 31.53%, 45.35%, 4.95%, 15.17%, 2.70% and 0.15%, respectively (Table 3). Repeat sequences, including forward and palindromic repeats, were also searched with REPuter[50] using the following parameters: minimal length 50 nt; mismatch 3 nt. Long repeat sequences (repeat unit > 50 bp) of forward and palindromic repeats were further annotated, yielding 149 bp from 4 paired repeats in the cp genome (Table 4) and 37,878 bp from 58 paired repeats in the mt genome (Online-only Tables 1–2). Our repeat content analyses indicate that the mt genome of C. sinensis var. assamica is richer in repeat sequences and more variable than the cp genome (Table 4; Online-only Tables 1–2).
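The SSR classification thresholds above can be illustrated with a short regular-expression scan. This is a simplified Python sketch, not MISA itself: it assumes perfect (uninterrupted) repeats, reports 1-based start positions, and skips motifs that are themselves repeats of a shorter unit (for example "AA" counted as a dimer). The names `MIN_REPEATS` and `find_ssrs` are illustrative, and MISA's own handling of compound or overlapping SSRs may differ.

```python
import re

# Minimum copy number per motif length, as stated in the text.
MIN_REPEATS = {1: 8, 2: 4, 3: 4, 4: 3, 5: 3, 6: 3}


def find_ssrs(seq):
    """Return sorted (start_1based, unit_length, motif, copies) for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in MIN_REPEATS.items():
        # A motif of unit_len bases followed by at least (min_rep - 1) more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs built from a shorter repeating unit, e.g. "AA" or "ATAT".
            if any(
                motif == motif[:k] * (unit_len // k)
                for k in range(1, unit_len)
                if unit_len % k == 0
            ):
                continue
            copies = len(match.group(0)) // unit_len
            hits.append((match.start() + 1, unit_len, motif, copies))
    return sorted(hits)


# Toy sequence: one A-monomer run (9 copies) and one AT-dimer run (5 copies).
for hit in find_ssrs("GG" + "A" * 9 + "CC" + "AT" * 5 + "GCGT"):
    print(hit)
```

Counting hits per `unit_length` over a whole genome sequence would give a motif-class breakdown of the kind reported in Table 3, subject to the simplifications noted above.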
Table 3

Statistics of SSR motifs in the C. sinensis var. assamica mt and cp genomes.

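| SSR motif | mt genome: SSR number | mt genome: SSR % | cp genome: SSR number | cp genome: SSR % |
|---|---|---|---|---|
| Monomer | 210 | 31.53 | 160 | 74.42 |
| Dimer | 302 | 45.35 | 41 | 19.07 |
| Trimer | 33 | 4.95 | 1 | 0.47 |
| Tetramer | 101 | 15.17 | 10 | 4.65 |
| Pentamer | 18 | 2.70 | 0 | 0.00 |
| Hexamer | 1 | 0.15 | 2 | 0.93 |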
Table 4

Long repeats (repeat unit > 50 bp) in the C. sinensis var. assamica cp genome.

| Repeat length | Type* | Start of copy 1 | Start of copy 2 |
|---|---|---|---|
| 56 | F | 93938 | 93956 |
| 56 | P | 93938 | 149737 |
| 56 | P | 93956 | 149755 |
| 56 | F | 149737 | 149755 |

*P indicates palindromic repeats; F indicates forward repeats.

Overlapped repeats have been manually removed while calculating total length.

Online-only Table 1

Long repeats (repeat unit > 50 bp) in Scaffold 1 of the C. sinensis var. assamica mt genome.

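| Repeat length | Type* | Start of copy 1 | Start of copy 2 |
|---|---|---|---|
| 5119 | F | 207173 | 443366 |
| 2191 | F | 389017 | 391244 |
| 1963 | F | 210330 | 212292 |
| 1962 | F | 212292 | 446523 |
| 1930 | F | 383226 | 385188 |
| 1650 | F | 205522 | 207173 |
| 1650 | F | 205522 | 443366 |
| 1469 | F | 538290 | 539780 |
| 814 | F | 496567 | 498047 |
| 705 | F | 619432 | 621461 |
| 665 | F | 497382 | 498862 |
| 255 | P | 151984 | 200526 |
| 228 | P | 448476 | 544136 |
| 204 | F | 277002 | 363807 |
| 131 | P | 73675 | 482324 |
| 125 | F | 301855 | 468834 |
| 104 | F | 297204 | 623713 |
| 88 | F | 228824 | 559689 |
| 87 | F | 594334 | 641398 |
| 84 | F | 530415 | 646532 |
| 82 | P | 224027 | 395044 |
| 82 | F | 509347 | 623862 |
| 81 | P | 152363 | 200041 |
| 80 | F | 304361 | 306020 |
| 78 | P | 299987 | 587603 |
| 74 | F | 165777 | 570981 |
| 70 | F | 165878 | 571083 |
| 69 | F | 123050 | 384677 |
| 69 | F | 123050 | 386639 |
| 67 | F | 1849527472 (the two start positions are fused in the source) | |
| 66 | F | 299782 | 537227 |
| 66 | P | 364849 | 599005 |
| 66 | F | 684228 | 684285 |
| 65 | P | 508609 | 683320 |
| 64 | F | 542385 | 560020 |
| 63 | F | 605770 | 619261 |
| 62 | P | 70098 | 424512 |
| 62 | F | 151516 | 524252 |
| 62 | P | 156839 | 486845 |
| 61 | F | 123120 | 384747 |
| 61 | F | 123120 | 386709 |
| 61 | P | 142673 | 486240 |
| 60 | F | 302012 | 395122 |
| 59 | P | 265260 | 472040 |
| 58 | F | 285626 | 402303 |
| 57 | P | 152478 | 199950 |
| 57 | F | 276881 | 363698 |
| 56 | F | 402376 | 658389 |
| 55 | P | 41703 | 667438 |
| 55 | F | 258578 | 486959 |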

*P indicates palindromic repeats; F indicates forward repeats. Overlapped repeats have been manually removed while calculating total length.

Online-only Table 2

Long repeats (repeat unit > 50 bp) in Scaffold 2 of the C. sinensis var. assamica mt genome.

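| Repeat length | Type* | Start of copy 1 | Start of copy 2 |
|---|---|---|---|
| 704 | F | 30739 | 32294 |
| 156 | P | 29085 | 67620 |
| 86 | F | 67291 | 136332 |
| 67 | P | 4255 | 17574 |
| 67 | P | 23998 | 45730 |
| 62 | F | 67282 | 135282 |
| 55 | F | 120664 | 129253 |
| 53 | F | 135291 | 136332 |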

*P indicates palindromic repeats; F indicates forward repeats. Overlapped repeats have been manually removed while calculating total length.


Prediction of RNA-editing sites

Putative RNA-editing sites in protein-coding genes were predicted using the Web-based programs PREP-cp and PREP-mt (http://prep.unl.edu/)[51,52]. To balance the number of false-positive and false-negative sites, the cutoff score (C-value) was set to 0.8 for PREP-cp and 0.6 for PREP-mt[53]. Transcripts of almost all protein-coding genes in plant mitochondria are subject to RNA editing, the T-urf13 gene being an exception[54]. Our results showed that the extent of RNA editing varied by gene in both the cp and mt genomes of C. sinensis var. assamica. In the cp genome, we detected 54 RNA-editing sites in 21 protein-coding genes, ranging from one editing site in atpF, atpI, petB, psaI, psbE, psbF, rpoA, rps2 and rps8 to 8 editing sites in ndhB (Online-only Table 3). In the mt genome, we predicted 478 RNA-editing sites in 42 protein-coding genes, ranging from two editing sites in atp9 (scaffold 2), sdh3 (both scaffold 1 and scaffold 2) and rps14 (scaffold 2) to 35 editing sites in ccmFn (scaffold 1) (Online-only Tables 4–5).
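Each predicted site in the RNA-editing tables pairs a nucleotide position with an amino-acid position and a codon change. The Python sketch below shows how these columns relate for a single C-to-U edit (written C-to-T at the DNA level, as in the tables). It is not the PREP algorithm, which scores candidate sites against conserved homologs; it only applies one edit and translates the codon with the standard genetic code. The function names are illustrative, and stop codons appear as `*` here, whereas the tables write them as X.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def aa_position(nt_pos):
    """1-based amino-acid position of a 1-based nucleotide position in a CDS."""
    return (nt_pos + 2) // 3


def c_to_u_effect(codon, nt_pos):
    """Apply a single C-to-T edit at nt_pos; return (edited codon, old AA, new AA)."""
    offset = (nt_pos - 1) % 3  # position of the edited base within its codon
    if codon[offset] != "C":
        raise ValueError("no C at the edited position of this codon")
    edited = codon[:offset] + "T" + codon[offset + 1:]
    return edited, CODON_TABLE[codon], CODON_TABLE[edited]


# First cp site in Online-only Table 3: accD, nucleotide 64, CGG (R) => TGG (W).
print(aa_position(64), c_to_u_effect("CGG", 64))
```

Rows in which two adjacent positions of the same codon are edited (for example CCT to TTT) would need the function applied once per position.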
Online-only Table 3

Predicted RNA-editing sites in the C. sinensis var. assamica cp genome. (*The cutoff score (C-value) was set to 0.8).

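| No. | Gene | Nucleotide pos | AA pos | Effect | Score* |
|---|---|---|---|---|---|
| 1 | accD | 64 | 22 | CGG (R) => TGG (W) | 1 |
| 2 | accD | 1469 | 490 | CCT (P) => CTT (L) | 1 |
| 3 | atpA | 791 | 264 | CCA (P) => CTA (L) | 1 |
| 4 | atpA | 914 | 305 | TCA (S) => TTA (L) | 1 |
| 5 | atpF | 92 | 31 | CCA (P) => CTA (L) | 0.86 |
| 6 | atpI | 134 | 45 | GCT (A) => GTT (V) | 1 |
| 7 | matK | 445 | 149 | CAC (H) => TAC (Y) | 1 |
| 8 | matK | 467 | 156 | TCG (S) => TTG (L) | 1 |
| 9 | matK | 631 | 211 | CAT (H) => TAT (Y) | 1 |
| 10 | matK | 1234 | 412 | CAT (H) => TAT (Y) | 1 |
| 11 | ndhA | 341 | 114 | TCA (S) => TTA (L) | 1 |
| 12 | ndhA | 566 | 189 | TCA (S) => TTA (L) | 1 |
| 13 | ndhA | 1028 | 343 | TCT (S) => TTT (F) | 1 |
| 14 | ndhA | 1073 | 358 | TCT (S) => TTT (F) | 1 |
| 15 | ndhB | 149 | 50 | TCA (S) => TTA (L) | 1 |
| 16 | ndhB | 467 | 156 | CCA (P) => CTA (L) | 1 |
| 17 | ndhB | 586 | 196 | CAT (H) => TAT (Y) | 1 |
| 18 | ndhB | 611 | 204 | TCA (S) => TTA (L) | 0.8 |
| 19 | ndhB | 737 | 246 | CCA (P) => CTA (L) | 1 |
| 20 | ndhB | 746 | 249 | TCT (S) => TTT (F) | 1 |
| 21 | ndhB | 830 | 277 | TCA (S) => TTA (L) | 1 |
| 22 | ndhB | 1481 | 494 | CCA (P) => CTA (L) | 1 |
| 23 | ndhD | 20 | 7 | ACG (T) => ATG (M) | 1 |
| 24 | ndhD | 401 | 134 | TCA (S) => TTA (L) | 1 |
| 25 | ndhD | 692 | 231 | TCA (S) => TTA (L) | 1 |
| 26 | ndhD | 896 | 299 | TCA (S) => TTA (L) | 1 |
| 27 | ndhD | 905 | 302 | CCT (P) => CTT (L) | 1 |
| 28 | ndhD | 1328 | 443 | TCA (S) => TTA (L) | 0.8 |
| 29 | ndhF | 205 | 69 | CAT (H) => TAT (Y) | 0.8 |
| 30 | ndhF | 290 | 97 | TCA (S) => TTA (L) | 1 |
| 31 | ndhG | 166 | 56 | CAT (H) => TAT (Y) | 0.8 |
| 32 | ndhG | 314 | 105 | ACA (T) => ATA (I) | 0.8 |
| 33 | petB | 641 | 214 | CCA (P) => CTA (L) | 1 |
| 34 | psaI | 80 | 27 | TCT (S) => TTT (F) | 0.86 |
| 35 | psbE | 214 | 72 | CCT (P) => TCT (S) | 1 |
| 36 | psbF | 77 | 26 | TCT (S) => TTT (F) | 1 |
| 37 | rpoA | 368 | 123 | TCG (S) => TTG (L) | 1 |
| 38 | rpoB | 338 | 113 | TCT (S) => TTT (F) | 1 |
| 39 | rpoB | 473 | 158 | TCA (S) => TTA (L) | 0.86 |
| 40 | rpoB | 551 | 184 | TCA (S) => TTA (L) | 1 |
| 41 | rpoB | 566 | 189 | TCG (S) => TTG (L) | 1 |
| 42 | rpoB | 973 | 325 | CTT (L) => TTT (F) | 0.86 |
| 43 | rpoB | 2000 | 667 | TCT (S) => TTT (F) | 1 |
| 44 | rpoB | 2336 | 779 | ACA (T) => ATA (I) | 1 |
| 45 | rpoC1 | 41 | 14 | TCA (S) => TTA (L) | 1 |
| 46 | rpoC1 | 1556 | 519 | TCG (S) => TTG (L) | 1 |
| 47 | rpoC2 | 1505 | 502 | ACG (T) => ATG (M) | 0.86 |
| 48 | rpoC2 | 2290 | 764 | CGG (R) => TGG (W) | 1 |
| 49 | rpoC2 | 2726 | 909 | ACT (T) => ATT (I) | 1 |
| 50 | rpoC2 | 3728 | 1243 | TCA (S) => TTA (L) | 0.86 |
| 51 | rps2 | 248 | 83 | TCA (S) => TTA (L) | 1 |
| 52 | rps8 | 182 | 61 | TCA (S) => TTA (L) | 0.86 |
| 53 | rps14 | 80 | 27 | TCA (S) => TTA (L) | 1 |
| 54 | rps14 | 149 | 50 | CCA (P) => CTA (L) | 1 |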
Online-only Table 4

Predicted RNA-editing sites in Scaffold 1 of the C. sinensis var. assamica mt genome.

| No. | Gene | Nucleotide position | AA pos | Effect | Score* |
|---|---|---|---|---|---|
| 1 | matR | 32 | 11 | TCC (S) => TTC (F) | 0.62 |
| 2 | matR | 236 | 79 | TCC (S) => TTC (F) | 0.62 |
| 3 | matR | 326 | 109 | CCA (P) => CTA (L) | 1 |
| 4 | matR | 917 | 306 | TCA (S) => TTA (L) | 1 |
| 5 | matR | 1442 | 481 | GCC (A) => GTC (V) | 0.62 |
| 6 | matR | 1667 | 556 | TCC (S) => TTC (F) | 1 |
| 7 | matR | 1688 | 563 | CCT (P) => CTT (L) | 1 |
| 8 | matR | 1708 | 570 | CGC (R) => TGC (C) | 1 |
| 9 | matR | 1744 | 582 | CAC (H) => TAC (Y) | 1 |
| 10 | matR | 1775 | 592 | CCG (P) => CTG (L) | 1 |
| 11 | matR | 1814 | 605 | CCA (P) => CTA (L) | 0.88 |
| 12 | matR | 1832 | 611 | TCA (S) => TTA (L) | 0.88 |
| 13 | ccmFn | 38 | 13 | CCG (P) => CTG (L) | 1 |
| 14 | ccmFn | 98 | 33 | CCT (P) => CTT (L) | 1 |
| 15 | ccmFn | 137 | 46 | TCG (S) => TTG (L) | 1 |
| 16 | ccmFn | 142 | 48 | CGT (R) => TGT (C) | 1 |
| 17 | ccmFn | 151 | 51 | CCT (P) => TCT (S) | 0.83 |
| 18 | ccmFn | 248 | 83 | TCA (S) => TTA (L) | 1 |
| 19 | ccmFn | 256 | 86 | CGG (R) => TGG (W) | 1 |
| 20 | ccmFn | 283 | 95 | CTT (L) => TTT (F) | 0.83 |
| 21 | ccmFn | 334 | 112 | CAT (H) => TAT (Y) | 0.67 |
| 22 | ccmFn | 356 | 119 | TCC (S) => TTC (F) | 0.67 |
| 23 | ccmFn | 391 | 131 | CCT (P) => TCT (S) | 1 |
| 24 | ccmFn | 478 | 160 | CGT (R) => TGT (C) | 0.83 |
| 25 | ccmFn | 706 | 236 | CCT (P) => TTT (F) | 0.67 |
| 26 | ccmFn | 707 | 236 | CCT (P) => TTT (F) | 0.67 |
| 27 | ccmFn | 716 | 239 | TCA (S) => TTA (L) | 0.83 |
| 28 | ccmFn | 754 | 252 | CGT (R) => TGT (C) | 1 |
| 29 | ccmFn | 776 | 259 | TCA (S) => TTA (L) | 1 |
| 30 | ccmFn | 788 | 263 | CCA (P) => CTA (L) | 1 |
| 31 | ccmFn | 803 | 268 | TCA (S) => TTA (L) | 1 |
| 32 | ccmFn | 893 | 298 | GCG (A) => GTG (V) | 1 |
| 33 | ccmFn | 952 | 318 | CGC (R) => TGC (C) | 1 |
| 34 | ccmFn | 1270 | 424 | CGG (R) => TGG (W) | 1 |
| 35 | ccmFn | 1298 | 433 | CCA (P) => CTA (L) | 1 |
| 36 | ccmFn | 1315 | 439 | CAT (H) => TAT (Y) | 1 |
| 37 | ccmFn | 1330 | 444 | CGG (R) => TGG (W) | 1 |
| 38 | ccmFn | 1348 | 450 | CGG (R) => TGG (W) | 1 |
| 39 | ccmFn | 1381 | 461 | CGG (R) => TGG (W) | 1 |
| 40 | ccmFn | 1399 | 467 | CGT (R) => TGT (C) | 1 |
| 41 | ccmFn | 1442 | 481 | TCG (S) => TTG (L) | 1 |
| 42 | ccmFn | 1462 | 488 | CTT (L) => TTT (F) | 1 |
| 43 | ccmFn | 1466 | 489 | CCA (P) => CTA (L) | 1 |
| 44 | ccmFn | 1478 | 493 | TCA (S) => TTA (L) | 1 |
| 45 | ccmFn | 1487 | 496 | TCT (S) => TTT (F) | 1 |
| 46 | ccmFn | 1513 | 505 | CCC (P) => TCC (S) | 1 |
| 47 | ccmFn | 1561 | 521 | CGG (R) => TGG (W) | 0.67 |
| 48 | nad5 | 155 | 52 | CCG (P) => CTG (L) | 1 |
| 49 | nad5 | 238 | 80 | CCG (P) => TCG (S) | 0.8 |
| 50 | nad5 | 269 | 90 | TCC (S) => TTC (F) | 0.7 |
| 51 | nad5 | 355 | 119 | CCT (P) => TTT (F) | 1 |
| 52 | nad5 | 356 | 119 | CCT (P) => TTT (F) | 1 |
| 53 | nad5 | 371 | 124 | CCA (P) => CTA (L) | 0.9 |
| 54 | nad5 | 395 | 132 | TCT (S) => TTT (F) | 0.9 |
| 55 | nad5 | 503 | 168 | CCT (P) => CTT (L) | 1 |
| 56 | nad5 | 536 | 179 | CCT (P) => CTT (L) | 1 |
| 57 | nad5 | 626 | 209 | TCT (S) => TTT (F) | 0.9 |
| 58 | nad5 | 628 | 210 | CGC (R) => TGC (C) | 0.9 |
| 59 | nad5 | 673 | 225 | CTT (L) => TTT (F) | 0.9 |
| 60 | nad5 | 710 | 237 | TCG (S) => TTG (L) | 1 |
| 61 | nad5 | 722 | 241 | TCA (S) => TTA (L) | 1 |
| 62 | nad5 | 832 | 278 | CCA (P) => TCA (S) | 0.9 |
| 63 | nad5 | 872 | 291 | ACG (T) => ATG (M) | 1 |
| 64 | nad5 | 1307 | 436 | TCA (S) => TTA (L) | 1 |
| 65 | nad4 | 29 | 10 | TCC (S) => TTC (F) | 0.67 |
| 66 | nad4 | 74 | 25 | ACT (T) => ATT (I) | 0.89 |
| 67 | nad4 | 77 | 26 | CCT (P) => CTT (L) | 0.78 |
| 68 | nad4 | 107 | 36 | CCG (P) => CTG (L) | 1 |
| 69 | nad4 | 154 | 52 | CCC (P) => TCC (S) | 1 |
| 70 | nad4 | 158 | 53 | CCT (P) => CTT (L) | 1 |
| 71 | nad4 | 166 | 56 | CGG (R) => TGG (W) | 1 |
| 72 | nad4 | 197 | 66 | TCT (S) => TTT (F) | 1 |
| 73 | nad4 | 362 | 121 | ACA (T) => ATA (I) | 0.89 |
| 74 | nad4 | 368 | 123 | TCT (S) => TTT (F) | 1 |
| 75 | nad4 | 376 | 126 | CGT (R) => TGT (C) | 0.78 |
| 76 | nad4 | 403 | 135 | CGC (R) => TGC (C) | 1 |
| 77 | nad4 | 416 | 139 | CCT (P) => CTT (L) | 0.89 |
| 78 | nad4 | 433 | 145 | CTT (L) => TTT (F) | 1 |
| 79 | nad4 | 436 | 146 | CCC (P) => TTC (F) | 0.89 |
| 80 | nad4 | 437 | 146 | CCC (P) => TTC (F) | 0.89 |
| 81 | nad4 | 449 | 150 | CCA (P) => CTA (L) | 1 |
| 82 | nad4 | 547 | 183 | CTC (L) => TTC (F) | 0.67 |
| 83 | nad4 | 1336 | 446 | CAC (H) => TAC (Y) | 1 |
| 84 | nad4 | 1352 | 451 | CCG (P) => CTG (L) | 1 |
| 85 | nad4 | 1357 | 453 | CGC (R) => TGC (C) | 1 |
| 86 | atp6 | 37 | 13 | CCA (P) => TCA (S) | 0.75 |
| 87 | atp6 | 116 | 39 | TCA (S) => TTA (L) | 1 |
| 88 | atp6 | 167 | 56 | CCG (P) => CTG (L) | 1 |
| 89 | atp6 | 173 | 58 | CCG (P) => CTG (L) | 1 |
| 90 | atp6 | 224 | 75 | TCC (S) => TTC (F) | 1 |
| 91 | atp6 | 229 | 77 | CGC (R) => TGC (C) | 0.75 |
| 92 | atp6 | 236 | 79 | TCG (S) => TTG (L) | 0.67 |
| 93 | atp6 | 254 | 85 | TCG (S) => TTG (L) | 1 |
| 94 | atp6 | 262 | 88 | CGT (R) => TGT (C) | 1 |
| 95 | atp6 | 269 | 90 | CCC (P) => CTC (L) | 1 |
| 96 | atp6 | 401 | 134 | TCA (S) => TTA (L) | 1 |
| 97 | atp6 | 460 | 154 | CCT (P) => TCT (S) | 1 |
| 98 | atp6 | 463 | 155 | CAT (H) => TAT (Y) | 1 |
| 99 | atp6 | 485 | 162 | CCA (P) => CTA (L) | 1 |
| 100 | atp6 | 527 | 176 | TCA (S) => TTA (L) | 1 |
| 101 | atp6 | 548 | 183 | TCC (S) => TTC (F) | 1 |
| 102 | atp6 | 635 | 212 | CCG (P) => CTG (L) | 1 |
| 103 | atp6 | 656 | 219 | TCA (S) => TTA (L) | 1 |
| 104 | atp6 | 664 | 222 | CAT (H) => TAT (Y) | 1 |
| 105 | atp6 | 671 | 224 | TCT (S) => TTT (F) | 1 |
| 106 | atp6 | 680 | 227 | TCA (S) => TTA (L) | 1 |
| 107 | atp6 | 707 | 236 | ACA (T) => ATA (I) | 0.92 |
| 108 | atp6 | 718 | 240 | CAA (Q) => TAA (X) | 1 |
| 109 | mttB | 58 | 20 | CAT (H) => TAT (Y) | 0.88 |
| 110 | mttB | 83 | 28 | TCG (S) => TTG (L) | 0.88 |
| 111 | mttB | 91 | 31 | CCA (P) => TCA (S) | 1 |
| 112 | mttB | 127 | 43 | CGT (R) => TGT (C) | 0.88 |
| 113 | mttB | 134 | 45 | CCA (P) => CTA (L) | 0.62 |
| 114 | mttB | 164 | 55 | TCC (S) => TTC (F) | 0.75 |
| 115 | mttB | 196 | 66 | CCG (P) => TCG (S) | 1 |
| 116 | mttB | 253 | 85 | CGT (R) => TGT (C) | 0.62 |
| 117 | mttB | 290 | 97 | TCT (S) => TTT (F) | 1 |
| 118 | mttB | 299 | 100 | TCG (S) => TTG (L) | 0.75 |
| 119 | ccmB | 28 | 10 | CAT (H) => TAT (Y) | 0.89 |
| 120 | ccmB | 43 | 15 | CCC (P) => TCC (S) | 0.67 |
| 121 | ccmB | 71 | 24 | CCA (P) => CTA (L) | 1 |
| 122 | ccmB | 80 | 27 | TCG (S) => TTG (L) | 1 |
| 123 | ccmB | 128 | 43 | TCA (S) => TTA (L) | 1 |
| 124 | ccmB | 137 | 46 | TCC (S) => TTC (F) | 1 |
| 125 | ccmB | 149 | 50 | CCG (P) => CTG (L) | 1 |
| 126 | ccmB | 154 | 52 | CGG (R) => TGG (W) | 1 |
| 127 | ccmB | 160 | 54 | CCT (P) => TCT (S) | 0.67 |
| 128 | ccmB | 164 | 55 | CCG (P) => CTG (L) | 0.89 |
| 129 | ccmB | 172 | 58 | CCT (P) => TCT (S) | 0.89 |
| 130 | ccmB | 179 | 60 | CCT (P) => CTT (L) | 1 |
| 131 | ccmB | 193 | 65 | CCT (P) => TTT (F) | 0.89 |
| 132 | ccmB | 194 | 65 | CCT (P) => TTT (F) | 0.89 |
| 133 | ccmB | 286 | 96 | CGG (R) => TGG (W) | 1 |
| 134 | ccmB | 304 | 102 | CGT (R) => TGT (C) | 0.78 |
| 135 | ccmB | 313 | 105 | CGT (R) => TGT (C) | 0.89 |
| 136 | ccmB | 338 | 113 | CCG (P) => CTG (L) | 1 |
| 137 | ccmB | 367 | 123 | CGG (R) => TGG (W) | 0.78 |
| 138 | ccmB | 424 | 142 | CGT (R) => TGT (C) | 0.89 |
| 139 | ccmB | 428 | 143 | TCG (S) => TTG (L) | 1 |
| 140 | ccmB | 467 | 156 | TCG (S) => TTG (L) | 0.89 |
| 141 | ccmB | 476 | 159 | CCA (P) => CTA (L) | 0.89 |
| 142 | ccmB | 485 | 162 | TCA (S) => TTA (L) | 1 |
| 143 | ccmB | 494 | 165 | TCA (S) => TTA (L) | 1 |
| 144 | ccmB | 503 | 168 | CCA (P) => CTA (L) | 1 |
| 145 | ccmB | 512 | 171 | TCT (S) => TTT (F) | 1 |
| 146 | ccmB | 514 | 172 | CGT (R) => TGT (C) | 1 |
| 147 | ccmB | 551 | 184 | TCA (S) => TTA (L) | 1 |
| 148 | ccmB | 554 | 185 | TCG (S) => TTG (L) | 0.89 |
| 149 | ccmB | 566 | 189 | TCC (S) => TTC (F) | 0.78 |
| 150 | ccmB | 569 | 190 | TCT (S) => TTT (F) | 0.78 |
| 151 | ccmB | 572 | 191 | CCG (P) => CTG (L) | 1 |
| 152 | ccmB | 596 | 199 | TCG (S) => TTG (L) | 0.89 |
| 153 | rpl10 | 101 | 34 | TCG (S) => TTG (L) | 0.83 |
| 154 | rpl10 | 239 | 80 | TCG (S) => TTG (L) | 0.83 |
| 155 | rpl10 | 314 | 105 | TCA (S) => TTA (L) | 0.83 |
| 156 | rps7 | 152 | 51 | CCA (P) => CTA (L) | 0.75 |
| 157 | rps7 | 343 | 115 | CAC (H) => TAC (Y) | 0.62 |
| 158 | rps7 | 368 | 123 | TCA (S) => TTA (L) | 0.88 |
| 159 | atp1 | 1039 | 347 | CCC (P) => TCC (S) | 1 |
| 160 | atp1 | 1064 | 355 | TCG (S) => TTG (L) | 1 |
| 161 | atp1 | 1178 | 393 | TCA (S) => TTA (L) | 0.9 |
| 162 | atp1 | 1216 | 406 | CTT (L) => TTT (F) | 1 |
| 163 | atp1 | 1292 | 431 | CCG (P) => CTG (L) | 0.8 |
| 164 | atp1 | 1415 | 472 | CCA (P) => CTA (L) | 1 |
| 165 | atp1 | 1490 | 497 | CCA (P) => CTA (L) | 0.9 |
| 166 | atp9 | 20 | 7 | TCA (S) => TTA (L) | 1 |
| 167 | atp9 | 50 | 17 | TCA (S) => TTA (L) | 1 |
| 168 | atp9 | 82 | 28 | CTT (L) => TTT (F) | 1 |
| 169 | atp9 | 92 | 31 | TCG (S) => TTG (L) | 1 |
| 170 | atp9 | 134 | 45 | TCA (S) => TTA (L) | 1 |
| 171 | atp9 | 182 | 61 | TCG (S) => TTG (L) | 1 |
| 172 | atp9 | 191 | 64 | CCA (P) => CTA (L) | 1 |
| 173 | atp9 | 212 | 71 | TCA (S) => TTA (L) | 1 |
| 174 | atp9 | 215 | 72 | TCC (S) => TTC (F) | 1 |
| 175 | atp9 | 223 | 75 | CGA (R) => TGA (X) | 1 |
| 176 | sdh3 | 67 | 23 | CCC (P) => TCC (S) | 1 |
| 177 | sdh3 | 376 | 126 | CTC (L) => TTC (F) | 0.83 |
| 178 | rpl16 | 79 | 27 | CAG (Q) => TAG (X) | 1 |
| 179 | rpl16 | 227 | 76 | ACT (T) => ATT (I) | 1 |
| 180 | rpl16 | 355 | 119 | CTC (L) => TTC (F) | 0.89 |
| 181 | rpl16 | 524 | 175 | CCA (P) => CTA (L) | 1 |
| 182 | rpl16 | 530 | 177 | TCG (S) => TTG (L) | 0.75 |
| 183 | rps3 | 314 | 105 | CCA (P) => CTA (L) | 0.86 |
| 184 | rps3 | 647 | 216 | CCG (P) => CTG (L) | 1 |
| 185 | rps3 | 674 | 225 | CCG (P) => CTG (L) | 0.86 |
| 186 | rps3 | 785 | 262 | TCA (S) => TTA (L) | 1 |
| 187 | rps3 | 838 | 280 | CGT (R) => TGT (C) | 1 |
| 188 | rps3 | 902 | 301 | TCA (S) => TTA (L) | 0.86 |
| 189 | rps19 | 62 | 21 | TCG (S) => TTG (L) | 1 |
| 190 | rps19 | 109 | 37 | CCT (P) => TTT (F) | 1 |
| 191 | rps19 | 110 | 37 | CCT (P) => TTT (F) | 1 |
| 192 | rpl2 | 215 | 72 | CCA (P) => CTA (L) | 0.75 |
| 193 | rpl2 | 329 | 110 | CCA (P) => CTA (L) | 1 |
| 194 | rpl2 | 494 | 165 | GCG (A) => GTG (V) | 0.67 |
| 195 | rpl2 | 517 | 173 | CTC (L) => TTC (F) | 1 |
| 196 | rpl2 | 550 | 184 | CCC (P) => TCC (S) | 1 |
| 197 | atp8 | 47 | 16 | TCA (S) => TTA (L) | 1 |
| 198 | atp8 | 58 | 20 | CTC (L) => TTC (F) | 1 |
| 199 | atp8 | 452 | 151 | CCA (P) => CTA (L) | 0.75 |
| 200 | cox3 | 289 | 97 | CTT (L) => TTT (F) | 0.92 |
| 201 | cox3 | 304 | 102 | CGG (R) => TGG (W) | 1 |
| 202 | cox3 | 311 | 104 | TCT (S) => TTT (F) | 0.92 |
| 203 | cox3 | 314 | 105 | TCT (S) => TTT (F) | 0.92 |
| 204 | cox3 | 419 | 140 | CCC (P) => CTC (L) | 1 |
| 205 | cox3 | 422 | 141 | CCT (P) => CTT (L) | 0.92 |
| 206 | cox3 | 512 | 171 | TCA (S) => TTA (L) | 0.75 |
| 207 | cox3 | 653 | 218 | TCG (S) => TTG (L) | 1 |
| 208 | cox3 | 754 | 252 | CGG (R) => TGG (W) | 0.92 |
| 209 | cox3 | 764 | 255 | CCA (P) => CTA (L) | 0.92 |
| 210 | sdh4 | 155 | 52 | CCA (P) => CTA (L) | 0.88 |
| 211 | sdh4 | 203 | 68 | CCA (P) => CTA (L) | 0.75 |
| 212 | sdh4 | 259 | 87 | CAT (H) => TAT (Y) | 0.88 |
| 213 | cox1 | 155 | 52 | TCT (S) => TTT (F) | 1 |
| 214 | cox1 | 167 | 56 | TCT (S) => TTT (F) | 1 |
| 215 | cox1 | 265 | 89 | CCA (P) => TCA (S) | 1 |
| 216 | cox1 | 356 | 119 | TCA (S) => TTA (L) | 1 |
| 217 | cox1 | 365 | 122 | TCT (S) => TTT (F) | 1 |
| 218 | cox1 | 428 | 143 | TCC (S) => TTC (F) | 1 |
| 219 | cox1 | 464 | 155 | TCA (S) => TTA (L) | 1 |
| 220 | cox1 | 503 | 168 | CCA (P) => CTA (L) | 1 |
| 221 | cox1 | 581 | 194 | TCT (S) => TTT (F) | 1 |
| 222 | cox1 | 628 | 210 | CGG (R) => TGG (W) | 1 |
| 223 | cox1 | 659 | 220 | CCC (P) => CTC (L) | 1 |
| 224 | cox1 | 674 | 225 | TCC (S) => TTC (F) | 1 |
| 225 | cox1 | 758 | 253 | ACA (T) => ATA (I) | 1 |
| 226 | cox1 | 773 | 258 | TCT (S) => TTT (F) | 1 |
| 227 | cox1 | 950 | 317 | TCC (S) => TTC (F) | 1 |
| 228 | cox1 | 1099 | 367 | CAC (H) => TAC (Y) | 1 |
| 229 | cox1 | 1187 | 396 | CCG (P) => CTG (L) | 0.89 |
| 230 | cox1 | 1318 | 440 | CGT (R) => TGT (C) | 0.78 |
| 231 | cox1 | 1346 | 449 | TCA (S) => TTA (L) | 1 |
| 232 | cox1 | 1402 | 468 | CCA (P) => TCA (S) | 1 |
| 233 | cox1 | 1412 | 471 | TCG (S) => TTG (L) | 1 |
| 234 | nad7 | 38 | 13 | TCG (S) => TTG (L) | 0.75 |
| 235 | nad7 | 77 | 26 | TCA (S) => TTA (L) | 1 |
| 236 | nad7 | 83 | 28 | TCA (S) => TTA (L) | 1 |
| 237 | nad7 | 137 | 46 | TCA (S) => TTA (L) | 1 |
| 238 | nad7 | 205 | 69 | CAT (H) => TAT (Y) | 1 |
| 239 | nad7 | 212 | 71 | TCA (S) => TTA (L) | 1 |
| 240 | nad7 | 277 | 93 | CGT (R) => TGT (C) | 1 |
| 241 | nad7 | 296 | 99 | TCA (S) => TTA (L) | 0.88 |
| 242 | nad7 | 305 | 102 | TCA (S) => TTA (L) | 1 |
| 243 | nad7 | 344 | 115 | TCA (S) => TTA (L) | 1 |
| 244 | nad7 | 494 | 165 | TCC (S) => TTC (F) | 1 |
| 245 | nad7 | 539 | 180 | TCA (S) => TTA (L) | 0.88 |
| 246 | nad7 | 812 | 271 | TCA (S) => TTA (L) | 0.88 |
| 247 | nad7 | 859 | 287 | CCT (P) => TCT (S) | 0.88 |
| 248 | nad7 | 943 | 315 | CGT (R) => TGT (C) | 1 |
| 249 | nad7 | 965 | 322 | TCT (S) => TTT (F) | 1 |
| 250 | nad7 | 989 | 330 | TCT (S) => TTT (F) | 1 |
| 251 | nad7 | 1010 | 337 | CCA (P) => CTA (L) | 1 |
| 252 | nad7 | 1052 | 351 | TCT (S) => TTT (F) | 1 |
| 253 | nad9 | 428 | 143 | TCC (S) => TTC (F) | 0.73 |
| 254 | nad9 | 506 | 169 | TCT (S) => TTT (F) | 0.75 |
| 255 | nad9 | 527 | 176 | CCA (P) => CTA (L) | 0.92 |
| 256 | nad9 | 581 | 194 | TCG (S) => TTG (L) | 0.92 |
| 257 | nad9 | 604 | 202 | CAT (H) => TAT (Y) | 1 |
| 258 | nad9 | 712 | 238 | CCG (P) => TCG (S) | 0.83 |
| 259 | nad9 | 742 | 248 | CGG (R) => TGG (W) | 1 |
| 260 | nad9 | 782 | 261 | TCC (S) => TTC (F) | 1 |
| 261 | nad9 | 812 | 271 | TCA (S) => TTA (L) | 1 |
| 262 | nad9 | 853 | 285 | CTT (L) => TTT (F) | 1 |
| 263 | nad9 | 953 | 318 | TCT (S) => TTT (F) | 1 |
| 264 | nad4L | 11 | 4 | TCT (S) => TTT (F) | 1 |
| 265 | nad4L | 17 | 6 | TCA (S) => TTA (L) | 1 |
| 266 | nad4L | 25 | 9 | CGG (R) => TGG (W) | 1 |
| 267 | nad4L | 56 | 19 | CCT (P) => CTT (L) | 1 |
| 268 | nad4L | 65 | 22 | TCA (S) => TTA (L) | 1 |
| 269 | nad4L | 70 | 24 | CCA (P) => TCA (S) | 1 |
| 270 | nad4L | 80 | 27 | TCA (S) => TTA (L) | 1 |
| 271 | nad4L | 101 | 34 | TCG (S) => TTG (L) | 0.88 |
| 272 | nad4L | 128 | 43 | TCG (S) => TTG (L) | 1 |
| 273 | nad4L | 149 | 50 | TCA (S) => TTA (L) | 0.75 |
| 274 | nad4L | 158 | 53 | TCA (S) => TTA (L) | 0.88 |
| 275 | nad4L | 167 | 56 | CCA (P) => CTA (L) | 0.88 |
| 276 | nad4L | 200 | 67 | TCA (S) => TTA (L) | 1 |
| 277 | nad4L | 251 | 84 | TCT (S) => TTT (F) | 0.88 |
| 278 | atp4 | 71 | 24 | TCA (S) => TTA (L) | 1 |
| 279 | atp4 | 89 | 30 | TCA (S) => TTA (L) | 1 |
| 280 | atp4 | 118 | 40 | CGT (R) => TGT (C) | 0.71 |
| 281 | atp4 | 215 | 72 | TCG (S) => TTG (L) | 1 |
| 282 | atp4 | 248 | 83 | CCT (P) => CTT (L) | 1 |
| 283 | atp4 | 395 | 132 | TCA (S) => TTA (L) | 1 |
| 284 | atp4 | 407 | 136 | CCA (P) => CTA (L) | 0.71 |
| 285 | atp4 | 416 | 139 | ACT (T) => ATT (I) | 0.86 |
| 286 | ccmC | 76 | 26 | CGG (R) => TGG (W) | 0.78 |
| 287 | ccmC | 103 | 35 | CAT (H) => TAT (Y) | 1 |
| 288 | ccmC | 115 | 39 | CGG (R) => TGG (W) | 0.78 |
| 289 | ccmC | 133 | 45 | CTT (L) => TTT (F) | 0.67 |
| 290 | ccmC | 161 | 54 | CCG (P) => CTG (L) | 0.78 |
| 291 | ccmC | 179 | 60 | GCG (A) => GTG (V) | 0.78 |
| 292 | ccmC | 184 | 62 | CGG (R) => TGG (W) | 1 |
| 293 | ccmC | 299 | 100 | TCT (S) => TTT (F) | 1 |
| 294 | ccmC | 331 | 111 | CGG (R) => TGG (W) | 1 |
| 295 | ccmC | 395 | 132 | TCG (S) => TTG (L) | 1 |
| 296 | ccmC | 400 | 134 | CTT (L) => TTT (F) | 0.89 |
| 297 | ccmC | 421 | 141 | CGT (R) => TGT (C) | 0.78 |
| 298 | ccmC | 436 | 146 | CCT (P) => TCT (S) | 0.89 |
| 299 | ccmC | 446 | 149 | CCG (P) => CTG (L) | 0.78 |
| 300 | ccmC | 451 | 151 | CCT (P) => TCT (S) | 1 |
| 301 | ccmC | 458 | 153 | TCA (S) => TTA (L) | 0.78 |
| 302 | ccmC | 463 | 155 | CGT (R) => TGT (C) | 1 |
| 303 | ccmC | 467 | 156 | GCT (A) => GTT (V) | 0.78 |
| 304 | ccmC | 473 | 158 | CCG (P) => CTG (L) | 1 |
| 305 | ccmC | 497 | 166 | TCT (S) => TTT (F) | 1 |
| 306 | ccmC | 521 | 174 | TCG (S) => TTG (L) | 1 |
| 307 | ccmC | 548 | 183 | TCT (S) => TTT (F) | 1 |
| 308 | ccmC | 568 | 190 | CCT (P) => TCT (S) | 1 |
| 309 | ccmC | 575 | 192 | CCC (P) => CTC (L) | 1 |
| 310 | ccmC | 605 | 202 | TCC (S) => TTC (F) | 1 |
| 311 | ccmC | 608 | 203 | CCC (P) => CTC (L) | 0.89 |
| 312 | ccmC | 614 | 205 | TCA (S) => TTA (L) | 0.78 |
| 313 | ccmC | 619 | 207 | CGT (R) => TGT (C) | 0.78 |
| 314 | ccmC | 650 | 217 | CCT (P) => CTT (L) | 0.78 |
| 315 | ccmC | 656 | 219 | CCA (P) => CTA (L) | 0.89 |
| 316 | ccmC | 673 | 225 | CCT (P) => TCT (S) | 0.78 |
| 317 | cox2 | 71 | 24 | TCT (S) => TTT (F) | 1 |
| 318 | cox2 | 161 | 54 | TCA (S) => TTA (L) | 0.95 |
| 319 | cox2 | 163 | 55 | CGG (R) => TGG (W) | 1 |
| 320 | cox2 | 253 | 85 | CGG (R) => TGG (W) | 1 |
| 321 | cox2 | 278 | 93 | CCG (P) => CTG (L) | 1 |
| 322 | cox2 | 379 | 127 | CGG (R) => TGG (W) | 1 |
| 323 | cox2 | 443 | 148 | ACG (T) => ATG (M) | 1 |
| 324 | cox2 | 461 | 154 | CCA (P) => CTA (L) | 1 |
| 325 | cox2 | 476 | 159 | TCA (S) => TTA (L) | 1 |
| 326 | cox2 | 544 | 182 | CCT (P) => TCT (S) | 1 |
| 327 | cox2 | 557 | 186 | CCT (P) => CTT (L) | 1 |
| 328 | cox2 | 581 | 194 | TCA (S) => TTA (L) | 1 |
| 329 | cox2 | 632 | 211 | TCG (S) => TTG (L) | 0.84 |
| 330 | cox2 | 698 | 233 | ACG (T) => ATG (M) | 1 |
| 331 | cox2 | 742 | 248 | CGG (R) => TGG (W) | 1 |
| 332 | rps13 | 5 | 2 | TCA (S) => TTA (L) | 0.6 |
| 333 | rps13 | 26 | 9 | TCA (S) => TTA (L) | 0.9 |
| 334 | rps13 | 56 | 19 | TCA (S) => TTA (L) | 0.9 |
| 335 | rps13 | 100 | 34 | CGT (R) => TGT (C) | 0.9 |
| 336 | rps13 | 287 | 96 | TCG (S) => TTG (L) | 1 |
| 337 | rps4 | 133 | 45 | CCG (P) => TCG (S) | 0.67 |
| 338 | rps4 | 164 | 55 | TCA (S) => TTA (L) | 1 |
| 339 | rps4 | 184 | 62 | CCC (P) => TCC (S) | 0.83 |
| 340 | rps4 | 193 | 65 | CAT (H) => TAT (Y) | 1 |
| 341 | rps4 | 257 | 86 | CCA (P) => CTA (L) | 1 |
| 342 | rps4 | 266 | 89 | CCA (P) => CTA (L) | 0.83 |
| 343 | rps4 | 278 | 93 | TCG (S) => TTG (L) | 0.67 |
| 344 | rps4 | 290 | 97 | CCG (P) => CTG (L) | 0.83 |
| 345 | rps4 | 335 | 112 | CCG (P) => CTG (L) | 1 |
| 346 | rps4 | 482 | 161 | TCA (S) => TTA (L) | 1 |
| 347 | rps4 | 914 | 305 | TCG (S) => TTG (L) | 0.83 |
| 348 | rps4 | 925 | 309 | CAT (H) => TAT (Y) | 0.83 |
| 349 | rps4 | 935 | 312 | CCA (P) => CTA (L) | 0.67 |
| 350 | rps4 | 950 | 317 | TCT (S) => TTT (F) | 1 |
| 351 | rps4 | 1001 | 334 | CCA (P) => CTA (L) | 0.83 |
| 352 | rps4 | 1010 | 337 | CCT (P) => CTT (L) | 1 |
| 353 | rps4 | 1015 | 339 | CGG (R) => TGG (W) | 1 |
| 354 | nad1 | 8 | 3 | CCT (P) => CTT (L) | 0.9 |
| 355 | nad1 | 65 | 22 | TCC (S) => TTC (F) | 1 |
| 356 | nad1 | 100 | 34 | CCT (P) => TCT (S) | 0.9 |
| 357 | nad1 | 149 | 50 | GCG (A) => GTG (V) | 0.9 |
| 358 | nad1 | 209 | 70 | TCC (S) => TTC (F) | 1 |
| 359 | nad1 | 308 | 103 | TCA (S) => TTA (L) | 1 |
| 360 | nad1 | 434 | 145 | ACT (T) => ATT (I) | 1 |
| 361 | nad6 | 7 | 3 | CTT (L) => TTT (F) | 1 |
| 362 | nad6 | 83 | 28 | TCG (S) => TTG (L) | 1 |
| 363 | nad6 | 88 | 30 | CCC (P) => TTC (F) | 0.7 |
| 364 | nad6 | 89 | 30 | CCC (P) => TTC (F) | 0.7 |
| 365 | nad6 | 95 | 32 | CCA (P) => CTA (L) | 1 |
| 366 | nad6 | 103 | 35 | CGC (R) => TGC (C) | 1 |
| 367 | nad6 | 161 | 54 | CCA (P) => CTA (L) | 1 |
| 368 | nad6 | 169 | 57 | CAT (H) => TAT (Y) | 1 |
| 369 | nad6 | 191 | 64 | TCA (S) => TTA (L) | 1 |
| 370 | nad6 | 446 | 149 | TCC (S) => TTC (F) | 1 |
371 nad6 463155CCT (P) => TCT (S)0.8
372 nad6 569190TCT (S) => TTT (F)1
373 nad2 269TCC (S) => TTC (F)0.89
374 nad2 20368TCT (S) => TTT (F)0.67
375 nad2 20669TCC (S) => TTC (F)1
376 nad2 23077TCT (S) => TTT (F)1
377 nad2 23679TCC (S) => TTC (F)0.67
378 nad2 25184CCA (P) => CTA (L)1
379 nad2 26288CGC (R) => TGC (C)1
380 nad2 28997CAT (H) => TAT (Y)1
381 nad2 29699TCA (S) => TTA (L)1
382 nad2 323108CCT (P) => CTT (L)1
383 nad2 392131TCG (S) => TTG (L)1
384 rps12 7124TCG (S) => TTG (L)0.94
385 rps12 10034CGC (R) => TGC (C)1
386 rps12 10435CCG (P) => CTG (L)1
387 rps12 19666CAC (H) => TAC (Y)0.94
388 rps12 22174TCG (S) => TTG (L)0.88
389 rps12 26990TCG (S) => TTG (L)0.94
390 rps12 28495TCC (S) => TTC (F)0.76
391 nad3 52TCA (S) => TTA (L)0.79
392 nad3 4415CCG (P) => CTG (L)1
393 nad3 6221CCA (P) => CTA (L)0.95
394 nad3 8027CCA (P) => CTA (L)1
395 nad3 14649TCC (S) => TTC (F)1
396 nad3 20870CCT (P) => TTT (F)0.95
397 nad3 20970CCT (P) => TTT (F)0.95
398 nad3 21572CCG (P) => CTG (L)1
399 nad3 23077TCC (S) => TTC (F)0.86
400 nad3 24783CCT (P) => TCT (S)1
401 nad3 25184CCC (P) => CTC (L)0.91
402 nad3 26689CCG (P) => CTG (L)1
403 nad3 27592TCT (S) => TTT (F)1
404 nad3 317106TCT (S) => TTT (F)0.95
405 nad3 344115TCG (S) => TTG (L)1
406 nad3 349117CGG (R) => TGG (W)1
407 rps1 238CCT (P) => CTT (L)0.67
408 rps1 5619CCT (P) => CTT (L)0.67
409 rps1 380127TCA (S) => TTA (L)0.67

*The cutoff score (C-value) was set to 0.6.

Online-only Table 5

Predicted RNA-editing sites in Scaffold 2 of the C. sinensis var. assamica mt genome.

No.  Gene  Nucleotide Position  AA Pos  Effect  Score*
1  rps19  116  39  TCG (S) => TTG (L)  1
2  rps19  163  55  CCT (P) => TTT (F)  1
3  rps19  164  55  CCT (P) => TTT (F)  1
4  atp9  53  18  TCA (S) => TTA (L)  1
5  atp9  83  28  TCA (S) => TTA (L)  1
6  cob  118  40  CCG (P) => TCG (S)  0.92
7  cob  178  60  CAC (H) => TAC (Y)  1
8  cob  286  96  CTC (L) => TTC (F)  1
9  cob  298  100  CAC (H) => TAC (Y)  1
10  cob  325  109  CAT (H) => TAT (Y)  1
11  cob  358  120  CGG (R) => TGG (W)  1
12  cob  419  140  CCA (P) => CTA (L)  1
13  cob  568  190  CAT (H) => TAT (Y)  0.92
14  cob  680  227  TCT (S) => TTT (F)  1
15  cob  808  270  CCC (P) => TCC (S)  1
16  cob  853  285  CAT (H) => TAT (Y)  1
17  cob  908  303  CCA (P) => CTA (L)  1
18  cob  914  305  TCT (S) => TTT (F)  1
19  cob  982  328  CAC (H) => TAC (Y)  0.85
20  cob  1015  339  CGC (R) => TGC (C)  1
21  cob  1084  362  CCT (P) => TCT (S)  1
22  cob  1124  375  CCG (P) => CTG (L)  1
23  rps14  47  16  GCG (A) => GTG (V)  0.6
24  rps14  271  91  CCT (P) => TCT (S)  0.6
25  rpl5  35  12  TCA (S) => TTA (L)  0.78
26  rpl5  47  16  CCG (P) => CTG (L)  1
27  rpl5  59  20  CCG (P) => CTG (L)  0.89
28  rpl5  64  22  CAC (H) => TAC (Y)  1
29  rpl5  92  31  TCG (S) => TTG (L)  1
30  rpl5  172  58  CGC (R) => TGC (C)  0.89
31  rpl5  518  173  CCA (P) => CTA (L)  0.89
32  rpl5  521  174  CCG (P) => CTG (L)  1
33  nad2  110  37  TCT (S) => TTT (F)  1
34  nad2  125  42  TCC (S) => TTC (F)  1
35  nad2  272  91  TCT (S) => TTT (F)  0.67
36  nad2  284  95  TCA (S) => TTA (L)  1
37  nad2  293  98  TCT (S) => TTT (F)  1
38  nad2  412  138  CAT (H) => TAT (Y)  1
39  nad2  442  148  CGT (R) => TGT (C)  0.78
40  nad2  446  149  ACT (T) => ATT (I)  1
41  nad2  512  171  TCA (S) => TTA (L)  0.78
42  nad2  542  181  TCA (S) => TTA (L)  1
43  nad2  611  204  TCG (S) => TTG (L)  1
44  nad2  731  244  CCA (P) => CTA (L)  0.67
45  nad2  760  254  CGT (R) => TGT (C)  1
46  nad2  932  311  TCA (S) => TTA (L)  0.67
47  nad2  941  314  CCA (P) => CTA (L)  1
48  nad2  989  330  TCA (S) => TTA (L)  1
49  sdh3  67  23  CCA (P) => TCA (S)  1
50  sdh3  74  25  TCC (S) => TTC (F)  1
51  ccmFc  38  13  TCC (S) => TTC (F)  0.83
52  ccmFc  50  17  CCT (P) => CTT (L)  1
53  ccmFc  52  18  CGT (R) => TGT (C)  1
54  ccmFc  103  35  CCC (P) => TCC (S)  1
55  ccmFc  119  40  TCT (S) => TTT (F)  1
56  ccmFc  122  41  TCC (S) => TTC (F)  1
57  ccmFc  146  49  CCT (P) => CTT (L)  1
58  ccmFc  151  51  CCT (P) => TCT (S)  0.83
59  ccmFc  155  52  TCA (S) => TTA (L)  1
60  ccmFc  160  54  CCT (P) => TCT (S)  0.67
61  ccmFc  203  68  ACG (T) => ATG (M)  1
62  ccmFc  305  102  TCA (S) => TTA (L)  0.83
63  ccmFc  391  131  CGT (R) => TGT (C)  1
64  ccmFc  406  136  CGT (R) => TGT (C)  0.83
65  ccmFc  620  207  GCG (A) => GTG (V)  1
66  ccmFc  704  235  GCT (A) => GTT (V)  0.83
67  ccmFc  1100  367  CCA (P) => CTA (L)  1
68  ccmFc  1121  374  TCG (S) => TTG (L)  1
69  ccmFc  1276  426  CGA (R) => TGA (X)  1

*The cutoff score (C-value) was set to 0.6.
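
As a reading aid for the RNA-editing tables, the following minimal Python sketch shows how a single predicted C-to-U site relates the nucleotide position to the amino-acid position and the codon change. It is an illustration only, not the prediction method used in this study. The function name `c_to_u_edit` is hypothetical, the standard genetic code is assumed, and stop codons are written as `*` (the tables use X). Rows that report two edits in one codon (for example rps19 positions 163 and 164) list the combined codon change, which this single-site sketch does not reproduce.

```python
from itertools import product

# Standard genetic code (assumed here), built in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def c_to_u_edit(codon, nt_pos):
    """Apply one C-to-U edit (written as C-to-T) at a 1-based CDS position.

    Returns (amino-acid position, edited codon, original residue, edited residue).
    """
    offset = (nt_pos - 1) % 3          # 0, 1 or 2: position within the codon
    if codon[offset] != "C":
        raise ValueError(f"no C at codon offset {offset} in {codon}")
    edited = codon[:offset] + "T" + codon[offset + 1:]
    aa_pos = (nt_pos - 1) // 3 + 1     # 1-based amino-acid position
    return aa_pos, edited, CODON_TABLE[codon], CODON_TABLE[edited]


# Single-edit rows taken from Online-only Table 5: (gene, position, codon).
rows = [("rps19", 116, "TCG"), ("cob", 118, "CCG"), ("ccmFc", 1276, "CGA")]
for gene, pos, codon in rows:
    aa_pos, edited, before, after = c_to_u_edit(codon, pos)
    print(gene, pos, aa_pos, f"{codon} ({before}) => {edited} ({after})")
```

The amino-acid position is simply the nucleotide position divided by three and rounded up, which is why positions 163 and 164 map to the same residue.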

Phylogenetic analyses

To further determine the phylogenetic position of C. sinensis var. assamica, we performed a phylogenomic analysis of 20 complete cp genomes using the GTR + R + I model under maximum likelihood (ML) inference in MEGA v.7.0[55]. Besides C. sinensis var. assamica cv. Yunkang 10, we selected cp genomes from eighteen Camellia species (C. oleifera, C. crapnelliana, C. szechuanensis, C. mairei, C. elongata, C. grandibracteata, C. leptophylla, C. petelotii, C. pubicosta, C. reticulata, C. azalea, C. japonica, C. cuspidata, C. danzaiensis, C. impressinervis, C. pitardii, C. yunnanensis and C. taliensis), using Apterosperma oblata as the outgroup. Our results showed that C. sinensis var. assamica grouped with C. grandibracteata with 100% bootstrap support (Fig. 5).
Fig. 5

Phylogenetic relationships of 20 complete chloroplast genomes. Maximum likelihood phylogenetic tree of C. sinensis var. assamica cv. Yunkang 10 with 18 species in the genus Camellia based on complete chloroplast genome sequences. The chloroplast sequence of Apterosperma oblata was set as outgroup. The position of C. sinensis var. assamica cv. Yunkang 10 is shown in bold and bootstrap values are shown for each node.

The same method was used for phylogenetic analysis of the mt genome. A total of thirteen conserved mt protein-coding genes shared by C. sinensis var. assamica and 14 other plant species were individually aligned with ClustalW[56] and then concatenated into a contiguous sequence in the order cob, cox1, cox2, cox3, nad1, nad2, nad3, nad4, nad4L, nad5, nad6, nad7 and nad9. The 14 selected species include Cycas taitungensis, Ginkgo biloba, Triticum aestivum, Oryza sativa, Sorghum bicolor, Zea mays, Gossypium arboreum, G. barbadense, Carica papaya, Vitis vinifera, Hevea brasiliensis, Bupleurum falcatum, Glycine max and Salvia miltiorrhiza. The alignment file was used to construct a Neighbor-Joining tree with 1000 bootstrap replicates in MEGA 7.0.26[55]. Our results showed that C. sinensis var. assamica clearly grouped with the other dicots, which were separated from the monocots of the angiosperms, while the two gymnosperms (Cycas taitungensis and Ginkgo biloba) formed the basal clade (Fig. 6).
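
The concatenation of per-gene alignments described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' script: the function name `concatenate_alignments` and the input layout (a dictionary mapping each gene to a dictionary of species and aligned sequences) are hypothetical, and each per-gene alignment is assumed to have been produced beforehand by an aligner such as ClustalW.

```python
# Gene order as stated in the text.
GENE_ORDER = ["cob", "cox1", "cox2", "cox3", "nad1", "nad2", "nad3",
              "nad4", "nad4L", "nad5", "nad6", "nad7", "nad9"]


def concatenate_alignments(alignments, gene_order=GENE_ORDER):
    """Join per-gene alignments into one sequence per species.

    alignments: {gene: {species: aligned_sequence}} (hypothetical layout).
    Every gene must contain the same species, and all sequences within a
    gene must have the same aligned length.
    """
    species = sorted(alignments[gene_order[0]])
    parts = {sp: [] for sp in species}
    for gene in gene_order:
        aln = alignments[gene]
        if set(aln) != set(species):
            raise ValueError(f"species set differs for gene {gene}")
        if len({len(seq) for seq in aln.values()}) != 1:
            raise ValueError(f"unequal aligned lengths in gene {gene}")
        for sp in species:
            parts[sp].append(aln[sp])
    return {sp: "".join(chunks) for sp, chunks in parts.items()}


# Toy example with two genes and two taxa (made-up sequences).
toy = {
    "cob": {"A": "AT-G", "B": "ATGG"},
    "cox1": {"A": "CC", "B": "C-"},
}
print(concatenate_alignments(toy, ["cob", "cox1"]))
```

Checking the species set and aligned length for every gene before joining guards against silently shifting columns in the concatenated matrix.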
Fig. 6

Phylogeny inferred from 13 genes common to the 15 plant mitochondrial genomes. Neighbor-joining tree of C. sinensis var. assamica cv. Yunkang 10 with 14 other species based on 13 conserved protein-coding gene sequences, with bootstrap support values on each node. The mt sequences of Cycas taitungensis and Ginkgo biloba were set as outgroups.

Data Records

Raw Illumina reads are deposited in the NCBI Sequence Read Archive (SRA)[57-62] and the BIG Genome Warehouse[63]. The assembled cp genome sequence and accompanying gene annotations of C. sinensis var. assamica are deposited in NCBI GenBank[64] and the BIG Genome Warehouse[65]. The final mt genome assembly and accompanying gene annotations are deposited in NCBI GenBank[66,67] and the BIG Genome Warehouse[68]. The alignment and tree files for the chloroplast and mitochondrial genomes from the Camellia genus are deposited in the Figshare database[69].

Technical Validation

Quality filtering of raw reads

The raw sequencing reads were first evaluated in terms of the average quality score at each position, GC content distribution, quality distribution, base composition, and other metrics. Low-quality reads were then filtered out before genome assembly and gene structure annotation.
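
A simple form of such a filter can be sketched in Python. This is only an illustration under stated assumptions: the text does not give the filtering criteria or tool, so the mean-quality threshold of 20, the Phred+33 encoding, and the function names `mean_phred` and `filter_reads` are all hypothetical.

```python
def mean_phred(quality, offset=33):
    """Mean Phred score of a FASTQ quality string (Phred+33 assumed)."""
    return sum(ord(ch) - offset for ch in quality) / len(quality)


def filter_reads(records, min_mean_quality=20.0):
    """Keep (name, sequence, quality) records whose mean quality passes.

    The threshold is an assumed example value, not one from the study.
    Records with an empty quality string are dropped.
    """
    return [
        (name, seq, qual)
        for name, seq, qual in records
        if qual and mean_phred(qual) >= min_mean_quality
    ]


# Toy records: 'I' encodes Phred 40 and '#' encodes Phred 2 under Phred+33.
reads = [("r1", "ACGT", "IIII"), ("r2", "ACGT", "####")]
print(filter_reads(reads))
```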

Assembly and validation

The chloroplast reads were filtered from the whole-genome Illumina sequencing data of C. sinensis var. assamica. We mapped all cleaned reads to the reference chloroplast sequence[4] using bowtie2 (version 2.3.4.3)[40] with default parameters. The mapped chloroplast reads were then de novo assembled into the complete chloroplast genome. For mitochondrial genome assembly, the PE and MP sequencing reads were used separately. Briefly, we first performed de novo assembly with VELVET v1.2.08[41], as previously described[42,43]. Scaffolds were constructed using SSPACE v.3.0[44]. False connections were manually removed based on the coverage and distances of paired reads. Gaps between scaffolds were then filled with GapCloser (version 1.12)[45,46] using all paired-end reads.
Measurement(s): genome assembly
Technology Type(s): DNA sequencing
Sample Characteristic - Organism: Camellia sinensis