Draft genome sequence of Oryza sativa elite indica cultivar RP Bio-226.

Mettu M Reddy¹, Kandasamy Ulaganathan¹.

Keywords: Oryza sativa; RP-Bio-226; genome sequencing; nitrogen use efficiency; yield

Year: 2015 PMID: 26539206 PMCID: PMC4611092 DOI: 10.3389/fpls.2015.00896

Source DB: PubMed Journal: Front Plant Sci ISSN: 1664-462X Impact factor: 5.753

Introduction

Rice cultivars differ in many factors that influence yield, such as nitrogen use efficiency, root development, and stress tolerance (Suryapriya et al., 2009; Singh et al., 2014; Begum et al., 2015; Mickelbart et al., 2015). To better understand the factors influencing the yield of specific rice cultivars, it is necessary to apply high-throughput genomic methods such as whole genome sequencing, RNA-seq, small RNA-seq, and ChIP-seq. Improved Samba Mahsuri (RP Bio-226) is a bacterial leaf blight-resistant indica rice cultivar developed through marker-assisted selection from BPT5204 and SS1113. It is widely cultivated in Southern India because of its high yield, premium grain quality, and excellent cooking qualities. To understand this cultivar in terms of nitrogen use efficiency, we recently sequenced its urea nutrition-responsive transcriptome (Reddy and Ulaganathan, 2015). Further, with the aim of studying the genomic basis of yield in RP Bio-226, we have sequenced its genome and are analyzing it. Here we report the publicly available genomic dataset.

Materials and Methods

Plant Material

RP Bio-226 rice seeds (O. sativa ssp. indica) were surface-sterilized with 0.1% HgCl2 for 5 min and rinsed thoroughly with distilled water. The seeds were then transferred to Murashige and Skoog basal agar medium (Murashige and Skoog, 1962) for germination. Plants were grown in a culture room at 25°C, 60–80% relative humidity, and a 16 h (light)/8 h (dark) photoperiod for 10 days. The seedlings were then harvested and used for genomic DNA isolation. Genomic DNA was isolated from the young leaves using the cetyl trimethyl ammonium bromide (CTAB) method (Murray and Thompson, 1980). DNA concentration and purity were estimated using a NanoDrop spectrophotometer and a Qubit fluorometer.

Library Preparation and Sequencing

For library preparation, ∼3 μg of genomic DNA was sonicated using Covaris to obtain fragments of 200–500 bp. The size distribution was checked by running an aliquot of the sample on an Agilent HS DNA Chip. The fragmented DNA was cleaned up using High Prep PCR clean-up beads and then subjected to a series of enzymatic reactions that repair ends, phosphorylate the fragments, and add a single-base overhang for adapter ligation. Fragments of ∼350–600 bp were size-selected on a 2% low-melting agarose gel and cleaned using a MinElute column (QIAGEN). The adapter-ligated fragments were amplified by PCR (10 cycles) and cleaned up using High Prep PCR beads. The prepared libraries were quantified using a Qubit fluorometer and validated for quality by running an aliquot on a High Sensitivity Bioanalyzer Chip (Agilent). Whole genome sequencing was carried out on the Illumina NextSeq 500 system (Illumina, San Diego, CA, USA). The raw read files in FASTQ format were used for the genome assembly.

Preprocessing and Genome Assembly

Raw reads were quality-checked with FastQC and adapters were removed with Cutadapt (Andrews, 2010; Martin, 2011). After preprocessing, the reads were aligned to the reference genome using Bowtie2 (version 2.2.4) (Langmead and Salzberg, 2012). The indica rice 93-11 genome downloaded from BGI was used as the reference genome. Reference-based assembly involved indexing the reference genome, aligning the reads to it, and creating a SAM file. Samtools (version 0.1.18) and SnpEff (version 4.1) were used for further analysis (Li et al., 2009; Cingolani et al., 2012). The SAM file was converted into a binary BAM file, sorted, and indexed using the ‘view,’ ‘sort,’ and ‘index’ functions of Samtools. The quality of the assembly was checked by viewing the BAM file with the ‘BamView’ tool, and the BAM file was then used for variant calling. Using the mpileup function, Samtools generated a variation report from the Bowtie2 assembly with cutoffs of mapping quality >30 and read depth >20. This produced a ‘bcf’ file, which was converted into a ‘vcf’ file with Bcftools. Duplicate variants were removed with varFilter in vcfutils. The consensus sequence was created with Samtools. Further annotation was done with SnpEff (version 4.1) using a database built from the latest version of the representative genes of RAP-DB (Sakai et al., 2013). Gene prediction was carried out using the Fgenesh program (Solovyev et al., 2006).
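The steps described above can be sketched as a sequence of command lines. The following Python helper only builds the command strings and does not run anything. File names, the `prefix` argument, and the step labels are hypothetical, and mapping the reported cutoffs to `mpileup -q` and `varFilter -d` is an assumption: the text does not give the exact options that were used, and these flags apply inclusive thresholds. The syntax follows the Samtools 0.1.x series named in the text (newer releases use different sort and calling commands).

```python
import shlex
from typing import List, Tuple


def build_pipeline(ref_fasta: str, reads_1: str, reads_2: str, prefix: str,
                   min_mapq: int = 30, min_depth: int = 20) -> List[Tuple[str, str]]:
    """Return (step name, shell command) pairs for a reference-based workflow.

    Assumed tool syntax: Bowtie2 2.2.x and Samtools/Bcftools 0.1.x.
    """
    ref = shlex.quote(ref_fasta)
    r1, r2 = shlex.quote(reads_1), shlex.quote(reads_2)
    p = shlex.quote(prefix)
    return [
        # Index the reference genome for Bowtie2.
        ("index_reference", f"bowtie2-build {ref} {p}_index"),
        # Align paired-end reads and write a SAM file.
        ("align_reads", f"bowtie2 -x {p}_index -1 {r1} -2 {r2} -S {p}.sam"),
        # Convert SAM to BAM, then sort and index (0.1.x sort takes an output prefix).
        ("sam_to_bam", f"samtools view -bS -o {p}.bam {p}.sam"),
        ("sort_bam", f"samtools sort {p}.bam {p}.sorted"),
        ("index_bam", f"samtools index {p}.sorted.bam"),
        # Pile up reads above the mapping-quality cutoff and call variants into BCF.
        ("call_variants",
         f"samtools mpileup -q {min_mapq} -uf {ref} {p}.sorted.bam"
         f" | bcftools view -bvcg - > {p}.raw.bcf"),
        # Convert BCF to VCF and apply the minimum read-depth filter.
        ("filter_variants",
         f"bcftools view {p}.raw.bcf | vcfutils.pl varFilter -d {min_depth} > {p}.flt.vcf"),
    ]


if __name__ == "__main__":
    for name, command in build_pipeline("9311.fa", "reads_1.fastq", "reads_2.fastq", "rpbio226"):
        print(f"[{name}] {command}")
```

Keeping the commands as data makes it easy to log or review them before handing each string to a shell.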

Results

Whole Genome Sequencing of RP Bio-226

The sequencing produced a total of 32,491,164 paired-end reads of 151 bp length. These reads were checked for low quality and adapter contamination with FastQC, and low-quality reads and adapter sequences were removed with Cutadapt. The remaining reads were assembled onto the reference genome, the 93-11 indica rice genome, using Bowtie2 (Langmead and Salzberg, 2012). Over 90% of the reads aligned to the reference genome, and the overall coverage was estimated to be more than 20×. The SAM file produced by this reference-based assembly was used to generate the variation report (single nucleotide polymorphisms and indels) using Samtools (Li et al., 2009). The variants reported by Samtools were annotated with SnpEff (Cingolani et al., 2012). The assembled genome size was estimated to be 347 Mb, and a total of 35,700 protein-coding genes were predicted using the Softberry Fgenesh gene-finding tool (Solovyev et al., 2006) (Table 1).
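As a rough cross-check of the reported coverage, sequencing depth can be approximated as (number of reads × read length) / genome size. The sketch below applies this back-of-the-envelope formula to the read count, read length, and genome size listed in Table 1. The function name and the optional `mapped_fraction` scaling are illustrative assumptions, and this is not necessarily how the authors derived their figure.

```python
def estimate_depth(n_reads: int, read_length_bp: int, genome_size_bp: int,
                   mapped_fraction: float = 1.0) -> float:
    """Approximate mean sequencing depth as total sequenced bases / genome size."""
    if genome_size_bp <= 0:
        raise ValueError("genome_size_bp must be positive")
    if not 0.0 <= mapped_fraction <= 1.0:
        raise ValueError("mapped_fraction must be between 0 and 1")
    total_bases = n_reads * read_length_bp * mapped_fraction
    return total_bases / genome_size_bp


if __name__ == "__main__":
    # Inputs taken from Table 1: total reads, read length, estimated genome size.
    raw = estimate_depth(69_377_450, 151, 347_000_000)
    mapped = estimate_depth(69_377_450, 151, 347_000_000, mapped_fraction=0.9)
    print(f"raw estimate: {raw:.1f}x, scaled by mapping rate: {mapped:.1f}x")
```

With the Table 1 inputs this gives roughly 30× before any filtering and about 27× after scaling by the 90% mapping rate, which is consistent with the reported coverage of more than 20×. Read trimming and removal of low-quality reads would lower the figure further.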

Direct Link to Deposited Data and Information to Users

The dataset submitted to NCBI includes the assembled chromosomal sequences of O. sativa indica cultivar RP Bio-226 in FASTA format and the raw reads. These can be accessed at NCBI under accessions CP012609–CP012620 and SRP062659, respectively. Users can download and use the data freely for research purposes only, with acknowledgment to us and citing this paper as the reference for the data.
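For scripted retrieval it can help to expand the accession range into individual identifiers. The helper below is a minimal sketch: the function name is hypothetical, it assumes the range is contiguous, and it makes no claim about which accession corresponds to which chromosome.

```python
import re
from typing import List


def expand_accession_range(first: str, last: str) -> List[str]:
    """Expand a contiguous accession range such as CP012609..CP012620."""
    pattern = r"([A-Z]+)(\d+)"
    m_first = re.fullmatch(pattern, first)
    m_last = re.fullmatch(pattern, last)
    if not m_first or not m_last:
        raise ValueError("accessions must look like letters followed by digits")
    if m_first.group(1) != m_last.group(1):
        raise ValueError("accessions must share the same prefix")
    prefix = m_first.group(1)
    width = len(m_first.group(2))  # preserve zero padding
    start, end = int(m_first.group(2)), int(m_last.group(2))
    if end < start:
        raise ValueError("last accession must not precede the first")
    return [f"{prefix}{n:0{width}d}" for n in range(start, end + 1)]


if __name__ == "__main__":
    for accession in expand_accession_range("CP012609", "CP012620"):
        print(accession)
```

The range expands to 12 identifiers, one per submitted chromosomal sequence.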

Author Contributions

This work was planned by KU and carried out jointly by KU and MR.

Conflict of Interest Statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Table 1

Oryza sativa indica cultivar RP Bio-226 genome characteristics and resources.

Name	RP Bio-226 genome characteristic/resource
NCBI bioproject ID	PRJNA285384
NCBI biosample ID	SAMN03751782
NCBI SRA accession No.	SRP062659
NCBI genome accession numbers	CP012609-CP012620
Sequencing platform	Illumina NextSeq 500
Total number of reads	69,377,450
Read length	151
Overall coverage	20×
Mapped reads	90%
Estimated genome size	347 Mb
Predicted protein coding genes	35,700
Genome download link	RP Bio-226 genome
Browse the genome	RP Bio-226 Gbrowse

References

1. A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff: SNPs in the genome of Drosophila melanogaster strain w1118; iso-2; iso-3.

Authors: Pablo Cingolani; Adrian Platts; Le Lily Wang; Melissa Coon; Tung Nguyen; Luan Wang; Susan J Land; Xiangyi Lu; Douglas M Ruden
Journal: Fly (Austin) Date: 2012 Apr-Jun

2. Fast gapped-read alignment with Bowtie 2.

Authors: Ben Langmead; Steven L Salzberg
Journal: Nat Methods Date: 2012-03-04

3. Genetic mechanisms of abiotic stress tolerance that translate to crop yield stability.

Authors: Michael V Mickelbart; Paul M Hasegawa; Julia Bailey-Serres
Journal: Nat Rev Genet Date: 2015-03-10

4. Rapid isolation of high molecular weight plant DNA.

Authors: M G Murray; W F Thompson
Journal: Nucleic Acids Res Date: 1980-10-10

5. Physiological response of rice (Oryza sativa L.) genotypes to elevated nitrogen applied under field conditions.

Authors: Hukum Singh; Amit Verma; Mohammad Wahid Ansari; Alok Shukla
Journal: Plant Signal Behav Date: 2014

6. The Sequence Alignment/Map format and SAMtools.

Authors: Heng Li; Bob Handsaker; Alec Wysoker; Tim Fennell; Jue Ruan; Nils Homer; Gabor Marth; Goncalo Abecasis; Richard Durbin
Journal: Bioinformatics Date: 2009-06-08

7. Rice Annotation Project Database (RAP-DB): an integrative and interactive database for rice genomics.

Authors: Hiroaki Sakai; Sung Shin Lee; Tsuyoshi Tanaka; Hisataka Numa; Jungsok Kim; Yoshihiro Kawahara; Hironobu Wakimoto; Ching-chia Yang; Masao Iwamoto; Takashi Abe; Yuko Yamada; Akira Muto; Hachiro Inokuchi; Toshimichi Ikemura; Takashi Matsumoto; Takuji Sasaki; Takeshi Itoh
Journal: Plant Cell Physiol Date: 2013-01-07

8. Genome-wide association mapping for yield and other agronomic traits in an elite breeding population of tropical rice (Oryza sativa).

Authors: Hasina Begum; Jennifer E Spindel; Antonio Lalusin; Teresita Borromeo; Glenn Gregorio; Jose Hernandez; Parminder Virk; Bertrand Collard; Susan R McCouch
Journal: PLoS One Date: 2015-03-18