
Full Chloroplast Genome Assembly of 11 Diverse Watermelon Accessions.

Chao Shi1, Shuo Wang2, Fei Zhao1, Hua Peng1, Chun-Lei Xiang1.   


Keywords:  annotation; chloroplast genome; cucurbitaceae; genome assembly; watermelon

Year:  2017        PMID: 28458683      PMCID: PMC5394170          DOI: 10.3389/fgene.2017.00046

Source DB:  PubMed          Journal:  Front Genet        ISSN: 1664-8021            Impact factor:   4.599



Introduction

Watermelon [Citrullus lanatus (Thunb.) Matsum. and Nakai] is an important crop of the family Cucurbitaceae. Its large edible fruits contribute to diets throughout the world, and annual consumption of about 90 million tons places it among the five most consumed fresh fruits (http://www.fao.org/faostat/en/#home). Watermelon supplies not only a large amount of water but also important nutritional compounds such as sugars, lycopene, and amino acids that promote cardiovascular health (Hayashi et al., 2005; Collins et al., 2007). The domestication of wild C. lanatus and its worldwide cultivation have produced many modern watermelon varieties that differ in fruit shape, size, color, texture, flavor, and nutrient composition (Erickson et al., 2005). The domestication and breeding of crops from wild progenitors has long been an important topic in plant science (Meyer and Purugganan, 2013). C. lanatus is a good model species for studying this process because it includes three subspecies corresponding to wild, semi-wild, and cultivated groups (Fursa, 1972): the wild subspecies C. lanatus subsp. lanatus, an ancient group with natural populations in southern Africa; the semi-wild subspecies C. lanatus subsp. mucosospermus Fursa, the egusi watermelon group, which bears large seeds in an edible fleshy pericarp; and the cultivated subspecies C. lanatus subsp. vulgaris Fursa, the sweet (dessert) watermelon group (including the East-Asia and America ecotypes) that gave rise to modern cultivated watermelon (Erickson et al., 2005). A previous study revealed important genome-wide changes under human domestication and breeding (Guo et al., 2013), but the chloroplast genome sequence variation that accompanied this process has not been reported.
Chloroplast genomes contribute substantially to studies of plant genetic diversity and evolution (Green, 2011). They contain both conserved and variable protein-coding genes that can resolve phylogenetic relationships at either high (Jansen et al., 2007; Moore et al., 2007, 2010) or low taxonomic levels (Parks et al., 2009; Carbonell-Caballero et al., 2015). They also include highly variable non-genic markers that are widely used in plant barcoding (Taberlet et al., 2007; Dong et al., 2012) and population studies (Doorduin et al., 2011). In this study, we report the complete chloroplast genome sequences of 11 watermelon accessions representing morphologically and genetically differentiated taxa of all three subspecies. As a continuation of and supplement to the watermelon nuclear genome sequencing project (Guo et al., 2013), these chloroplast genome sequences further expand the genomic resources for watermelon genetic studies.

Materials and methods

All 11 watermelon accessions in this study came from the watermelon nuclear genome sequencing project, and all plant materials are conserved at the Beijing Academy of Agriculture and Forestry Sciences, Beijing, China (Guo et al., 2013). DNA was extracted from fresh leaves of these materials, and Illumina library construction and sequencing followed the manufacturer's instructions as previously described (Guo et al., 2013). The resulting Illumina paired-end read data (2 × 100 bp; FASTQ format) ranged from 1.1 to 2.1 GB. The 11 representative accessions included five major cultivated varieties of C. lanatus subsp. vulgaris (two East-Asia and three America ecotypes), three semi-wild varieties of C. lanatus subsp. mucosospermus, and three wild varieties of C. lanatus subsp. lanatus (Table 1).
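Table 1

Chloroplast genome information for 11 watermelon accessions in this study.

Accession     Species                                       Group                  Chloroplast genome size (bp)
JX-2          C. lanatus subsp. vulgaris East-Asia ecotype  Cultivated watermelon  156,907
JLM           C. lanatus subsp. vulgaris East-Asia ecotype  Cultivated watermelon  156,907
Black diamo   C. lanatus subsp. vulgaris America ecotype    Cultivated watermelon  156,906
Calhoun Gray  C. lanatus subsp. vulgaris America ecotype    Cultivated watermelon  156,907
Sugarlee      C. lanatus subsp. vulgaris America ecotype    Cultivated watermelon  156,906
PI1893        C. lanatus subsp. mucosospermus               Semi-wild watermelon   156,905
PI500301      C. lanatus subsp. mucosospermus               Semi-wild watermelon   156,905
PI249010      C. lanatus subsp. mucosospermus               Semi-wild watermelon   156,907
PI482276      C. lanatus subsp. lanatus                     Wild watermelon        156,699
PI482303      C. lanatus subsp. lanatus                     Wild watermelon        156,886
PI482326      C. lanatus subsp. lanatus                     Wild watermelon        156,891
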
Before assembly, the Illumina paired-end total DNA sequencing data of each accession were searched with NCBI BLAST+ version 2.2.31 (ftp://ftp.ncbi.nih.gov/blast/) against a reference data set containing all angiosperm chloroplast genome sequences available at the time (ftp://ftp.ncbi.nlm.nih.gov/refseq/release/plastid/) in order to select chloroplast DNA reads. The filtered chloroplast reads were then assembled de novo with SOAPdenovo2 (Luo et al., 2012), ABySS version 1.9.0 (Simpson et al., 2009), and SPAdes version 3.1.0 (Bankevich et al., 2012) over several runs until a single circular contig (FASTA format) was obtained for each accession. Annotation was performed with Dual Organellar GenoMe Annotator (DOGMA) (Wyman et al., 2004) using default parameters to predict protein-coding genes, tRNA genes, and ribosomal RNA (rRNA) genes. For genes with low sequence identity, the positions of start and stop codons were determined manually from the translated amino acid sequence using the chloroplast/bacterial genetic code. The final GenBank-format annotation was produced using Sequin (http://www.ncbi.nlm.nih.gov/). All records, in FASTA and GenBank formats, were deposited in and can be viewed at the National Center for Biotechnology Information (NCBI) database (http://www.ncbi.nlm.nih.gov/nuccore).
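The read-screening step can be illustrated with a minimal Python sketch. It assumes the BLAST search results were written in the standard 12-column tabular format (-outfmt 6), which the text does not state; the E-value cutoff, the function names hit_read_ids and filter_fastq, and the toy records are hypothetical and are not the authors' actual settings or data.

```python
import io

def hit_read_ids(blast_tab, max_evalue=1e-5):
    """Collect IDs of reads with at least one hit at or below max_evalue.

    blast_tab: iterable of lines in BLAST tabular format (-outfmt 6):
    qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
    The cutoff of 1e-5 is an assumed value, not taken from the paper.
    """
    ids = set()
    for line in blast_tab:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if float(fields[10]) <= max_evalue:  # column 11 is the E-value
            ids.add(fields[0])               # column 1 is the query (read) ID
    return ids

def filter_fastq(fastq, keep_ids):
    """Yield (header, seq, plus, qual) for reads whose ID is in keep_ids.

    Assumes plain four-line FASTQ records (no wrapped sequence lines).
    """
    lines = [ln.rstrip("\n") for ln in fastq]
    for i in range(0, len(lines) - 3, 4):
        header, seq, plus, qual = lines[i:i + 4]
        read_id = header[1:].split()[0]  # drop '@' and any description
        if read_id in keep_ids:
            yield header, seq, plus, qual

# Toy input: read1 has a strong hit, read3 only a weak one, read2 none.
blast = io.StringIO(
    "read1\tref_plastid\t99.0\t100\t1\t0\t1\t100\t500\t599\t1e-45\t180\n"
    "read3\tref_plastid\t80.0\t30\t6\t0\t1\t30\t900\t929\t0.5\t30.1\n"
)
fastq = io.StringIO(
    "@read1 desc\nACGT\n+\nIIII\n"
    "@read2\nTTTT\n+\nIIII\n"
    "@read3\nGGGG\n+\nIIII\n"
)
chloroplast_ids = hit_read_ids(blast)
kept = list(filter_fastq(fastq, chloroplast_ids))
print([rec[0] for rec in kept])  # ['@read1 desc']
```

For paired-end data, one would typically keep both mates whenever either mate has a hit, so that the assemblers still receive intact pairs; that step is omitted here for brevity.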

Results and discussion

The 11 chloroplast genomes ranged in size from 156,699 bp in PI482276 to 156,907 bp in JX-2, JLM, Calhoun Gray, and PI249010 (Table 1). All exhibited a typical quadripartite structure, consisting of a pair of inverted repeat regions (IRs) (25,989–26,108 bp) separated by a large single copy region (LSC) (86,472–86,633 bp) and a small single copy region (SSC) (18,187–18,289 bp). The genomes encoded an identical set of 133 genes, of which 114 were unique and 19 were duplicated in the IR regions. Among the unique genes, 15 contained one intron and two contained two introns. Coding regions accounted for 51.2–51.7% of the whole genome. Sequence similarity among the accessions was high (average 99.5%), although moderate sequence variation was observed in some genic regions (Figure 1). Three genes, psaB, psaA, and psbA, which belong to photosystem I (psa) and photosystem II (psb), respectively, showed the most sequence variation among all protein-coding genes. In addition, the wild subspecies C. lanatus subsp. lanatus exhibited higher sequence variation than both the semi-wild and cultivated groups, which may support the conclusion that domestication and breeding for high yield and desirable fruit quality have narrowed the genetic diversity of cultivated watermelon (Levi et al., 2001). Overall, the chloroplast genome sequences reported in this study provide new insights into chloroplast genome variation under human domestication and breeding.
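As a quick consistency check on the reported figures, the total length of a quadripartite chloroplast genome is LSC + SSC + 2 × IR. The short Python sketch below applies this to the region ranges given above. Because per-accession region lengths are not listed, it can only bound the total, and the function name quadripartite_bounds is hypothetical.

```python
# Region length ranges (bp) as reported in the text: (minimum, maximum).
LSC = (86_472, 86_633)
SSC = (18_187, 18_289)
IR = (25_989, 26_108)
GENOME = (156_699, 156_907)  # smallest and largest assembled genomes

def quadripartite_bounds(lsc, ssc, ir):
    """Lower and upper bounds on total length: LSC + SSC + two IR copies."""
    low = lsc[0] + ssc[0] + 2 * ir[0]
    high = lsc[1] + ssc[1] + 2 * ir[1]
    return low, high

low, high = quadripartite_bounds(LSC, SSC, IR)
print(low, high)  # 156637 157138

# The reported genome sizes should fall inside these bounds.
consistent = low <= GENOME[0] and GENOME[1] <= high
print(consistent)  # True

# Gene counts: unique genes plus IR-duplicated copies give the full set.
unique_genes, duplicated_in_ir = 114, 19
print(unique_genes + duplicated_in_ir)  # 133
```

The bounds are loose because the minimum (or maximum) LSC, SSC, and IR lengths need not occur in the same accession.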
Figure 1

Visualization of sequence alignments among the 11 watermelon chloroplast genomes. VISTA-based identity plots show sequence identity among the 11 sequenced chloroplast genomes, with JX-2 as the reference. Coding regions between 40 and 80 kbp are indicated by black boxes.
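The idea behind an identity plot can be sketched as a sliding-window percent-identity calculation over two sequences that have already been aligned to the same length. This is a simplified illustration and not VISTA's implementation; the function name window_identity, the window and step sizes, and the toy sequences are assumptions.

```python
def window_identity(ref, other, window=100, step=50):
    """Percent identity per window for two pre-aligned, equal-length sequences.

    Columns where both sequences have a gap ('-') are skipped; a gap against
    a base counts as a mismatch. Returns a list of (window_start, percent).
    """
    if len(ref) != len(other):
        raise ValueError("sequences must be aligned to the same length")
    results = []
    last_start = max(len(ref) - window, 0)
    for start in range(0, last_start + 1, step):
        pairs = [
            (a, b)
            for a, b in zip(ref[start:start + window], other[start:start + window])
            if not (a == "-" and b == "-")
        ]
        if not pairs:
            continue
        matches = sum(a == b for a, b in pairs)
        results.append((start, 100.0 * matches / len(pairs)))
    return results

# Toy aligned sequences with a single mismatch at position 4.
reference = "ACGTACGTAC"
query = "ACGTTCGTAC"
profile = window_identity(reference, query, window=5, step=5)
print(profile)  # [(0, 80.0), (5, 100.0)]
```

Plotting the percentage against the window start for each accession, with one sequence fixed as the reference, gives a profile of the same general kind as the one shown in the figure.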


Deposited data and information to the user

The assembled complete chloroplast genome sequences, with annotation, were submitted to NCBI GenBank under accession numbers KY430683-KY430693 (http://www.ncbi.nlm.nih.gov/nuccore). The raw reads in compressed FASTQ format were deposited in the NCBI SRA database under accession number SRA052158 (http://www.ncbi.nlm.nih.gov/sra). Users may download and reuse the data for research purposes only, with an acknowledgment to the authors and a citation of this paper as the reference for the data.
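For readers who want to retrieve the records programmatically, the Python sketch below expands the accession range into individual accession numbers and builds a download URL for each through the NCBI E-utilities efetch endpoint. The helper names expand_accession_range and genbank_url are hypothetical, no network request is made, and the mapping from each accession number to a particular watermelon accession is not given in the text and is not assumed here.

```python
import re

EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"

def expand_accession_range(first, last):
    """Expand a range such as KY430683..KY430693 into individual accessions."""
    m1 = re.fullmatch(r"([A-Z]+)(\d+)", first)
    m2 = re.fullmatch(r"([A-Z]+)(\d+)", last)
    if not m1 or not m2 or m1.group(1) != m2.group(1):
        raise ValueError("accessions must share the same letter prefix")
    prefix, width = m1.group(1), len(m1.group(2))
    start, stop = int(m1.group(2)), int(m2.group(2))
    return [f"{prefix}{n:0{width}d}" for n in range(start, stop + 1)]

def genbank_url(accession):
    """URL for the GenBank flat file of one nucleotide record."""
    return f"{EFETCH}?db=nuccore&id={accession}&rettype=gb&retmode=text"

accessions = expand_accession_range("KY430683", "KY430693")
print(len(accessions))  # 11
print(accessions[0], accessions[-1])  # KY430683 KY430693
print(genbank_url(accessions[0]))
```

Changing rettype to fasta in the URL would request the sequence alone rather than the annotated GenBank record.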

Author contributions

CS and CX conceived the study and acquired the funding; CS, SW, and FZ performed the genome assembly and analysis; CS, SW, HP, and CX drafted the manuscript. All authors approved the final manuscript.

Funding

The project was funded by the Youth Innovation Promotion Association, Chinese Academy of Sciences (No. 2013253).

Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

1.  Automatic annotation of organellar genomes with DOGMA.

Authors:  Stacia K Wyman; Robert K Jansen; Jeffrey L Boore
Journal:  Bioinformatics       Date:  2004-06-04       Impact factor: 6.937

2.  l-Citrulline and l-arginine supplementation retards the progression of high-cholesterol-diet-induced atherosclerosis in rabbits.

Authors:  Toshio Hayashi; Packiasamy A R Juliet; Hisako Matsui-Hirai; Asaka Miyazaki; Akiko Fukatsu; Jun Funami; Akihisa Iguchi; Louis J Ignarro
Journal:  Proc Natl Acad Sci U S A       Date:  2005-09-12       Impact factor: 11.205

3.  Using plastid genome-scale data to resolve enigmatic relationships among basal angiosperms.

Authors:  Michael J Moore; Charles D Bell; Pamela S Soltis; Douglas E Soltis
Journal:  Proc Natl Acad Sci U S A       Date:  2007-11-28       Impact factor: 11.205

4.  Phylogenetic analysis of 83 plastid genes further resolves the early diversification of eudicots.

Authors:  Michael J Moore; Pamela S Soltis; Charles D Bell; J Gordon Burleigh; Douglas E Soltis
Journal:  Proc Natl Acad Sci U S A       Date:  2010-02-22       Impact factor: 11.205

5.  Chloroplast genomes of photosynthetic eukaryotes.

Authors:  Beverley R Green
Journal:  Plant J       Date:  2011-04       Impact factor: 6.417

6.  An Asian origin for a 10,000-year-old domesticated plant in the Americas.

Authors:  David L Erickson; Bruce D Smith; Andrew C Clarke; Daniel H Sandweiss; Noreen Tuross
Journal:  Proc Natl Acad Sci U S A       Date:  2005-12-13       Impact factor: 11.205

7.  Watermelon consumption increases plasma arginine concentrations in adults.

Authors:  Julie K Collins; Guoyao Wu; Penelope Perkins-Veazie; Karen Spears; P Larry Claypool; Robert A Baker; Beverly A Clevidence
Journal:  Nutrition       Date:  2007-03       Impact factor: 4.008

8.  Analysis of 81 genes from 64 plastid genomes resolves relationships in angiosperms and identifies genome-scale evolutionary patterns.

Authors:  Robert K Jansen; Zhengqiu Cai; Linda A Raubeson; Henry Daniell; Claude W Depamphilis; James Leebens-Mack; Kai F Müller; Mary Guisinger-Bellian; Rosemarie C Haberle; Anne K Hansen; Timothy W Chumley; Seung-Bum Lee; Rhiannon Peery; Joel R McNeal; Jennifer V Kuehl; Jeffrey L Boore
Journal:  Proc Natl Acad Sci U S A       Date:  2007-11-28       Impact factor: 11.205

9.  The complete chloroplast genome of 17 individuals of pest species Jacobaea vulgaris: SNPs, microsatellites and barcoding markers for population and phylogenetic studies.

Authors:  Leonie Doorduin; Barbara Gravendeel; Youri Lammers; Yavuz Ariyurek; Thomas Chin-A-Woeng; Klaas Vrieling
Journal:  DNA Res       Date:  2011-03-28       Impact factor: 4.458

10.  SOAPdenovo2: an empirically improved memory-efficient short-read de novo assembler.

Authors:  Ruibang Luo; Binghang Liu; Yinlong Xie; Zhenyu Li; Weihua Huang; Jianying Yuan; Guangzhu He; Yanxiang Chen; Qi Pan; Yunjie Liu; Jingbo Tang; Gengxiong Wu; Hao Zhang; Yujian Shi; Yong Liu; Chang Yu; Bo Wang; Yao Lu; Changlei Han; David W Cheung; Siu-Ming Yiu; Shaoliang Peng; Zhu Xiaoqian; Guangming Liu; Xiangke Liao; Yingrui Li; Huanming Yang; Jian Wang; Tak-Wah Lam; Jun Wang
Journal:  Gigascience       Date:  2012-12-27       Impact factor: 6.524
