
Draft genome sequencing of the sugarcane hybrid SP80-3280.

Diego Mauricio Riaño-Pachón, Lucia Mattiello.

Abstract

Sugarcane commercial cultivar SP80-3280 has been used as a model for genomic analyses in Brazil. Here we present a draft genome sequence employing Illumina TruSeq Synthetic Long reads. The dataset is available from NCBI BioProject with accession PRJNA272769.


Keywords:  genomics; long reads; polyploid; sugarcane

Year:  2017        PMID: 28713559      PMCID: PMC5499785          DOI: 10.12688/f1000research.11859.2

Source DB:  PubMed          Journal:  F1000Res        ISSN: 2046-1402


Introduction

Sugarcane is an economically important crop used as a source of sugar, ethanol and electricity [1]. The haploid sugarcane genome is ~1 Gbp; however, modern sugarcane cultivars are polyploids derived from interspecific hybridization between Saccharum officinarum L. and S. spontaneum L., with up to 130 chromosomes distributed among ~12 homo(eo)logous groups [2, 3] and a total genome size of up to 10 Gbp [4]. This complex genome structure has hampered genome sequencing, assembly and annotation. Partial genomic sequences are available [5–8], as well as transcriptome sequences [9–11], but no whole genome assemblies are available to date. Here we used the Illumina TruSeq Synthetic Long Read sequencing technology to survey the genome of the polyploid cultivar SP80-3280. The generated long reads, their assembly and the genome annotation have been made public and will provide useful information for functional genomics studies.

Materials and methods

Leaf rolls of greenhouse-grown, two-month-old plants of sugarcane cultivar SP80-3280 (provided by Centro de Tecnologia Canavieira, Piracicaba, São Paulo) were collected and immediately frozen in liquid nitrogen. The plant tissue was ground to a fine powder, and high molecular weight DNA was extracted from 100 mg of fresh frozen tissue using CTAB (Sigma-Aldrich, USA) and chloroform:isoamyl alcohol (Sigma-Aldrich, USA) as previously described [12]. A total of 6 µg of DNA was sent to Illumina (CA, USA) for sequencing with TruSeq Synthetic Long Read technology [13] through their FastTrack Sequencing Service. Sequencing was performed on an Illumina HiSeq2000 system using paired-end chemistry. Nine long read libraries, each yielding approx. 600 Mbp, were generated, giving an estimated coverage of between 4x and 5x of the monoploid genome. In total, 1,378,917 long reads longer than 1.5 Kbp, amounting to 5,642,855,018 bases, were generated. The underlying 1,966,604,928 short reads amount to 393,320,985,600 bp, which translates to an estimated coverage of 393x of the haploid genome. The maximum long read length was 20,918 bp, with 36% of the reads being longer than 4.5 Kbp. Possible contaminants were removed by comparing the long reads against NCBI's nucleotide database using BLAST [14] and keeping only those with best hits against Viridiplantae, resulting in 1,224,061 reads useful for assembly. Prior to assembly, long reads originating from the mitochondrial (NC_008360.1) and chloroplast (NC_005878.2) genomes were excluded using mirabait (http://mira-assembler.sourceforge.net/). Reads longer than 1.5 Kbp were assembled using Celera's WGS Assembler v8.2 [15] with parameters similar to those previously described [13], i.e., ‘unitiger=bogart, merSize=31, ovlMinLen=100’, except that the error parameters ovlErrorRate, cnsErrorRate, cgwErrorRate, utgGraphErrorRate, utgGraphErrorLimit, utgMergeErrorRate and utgMergeErrorLimit were left at their default settings.
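The read statistics quoted above can be checked with simple arithmetic. The following is a minimal sketch, not part of the original analysis: the function and constant names are illustrative, and the monoploid genome size is assumed to be exactly 1 Gbp, the approximate figure given in the Introduction. The read and base counts are taken from the text.

```python
# Assumption: monoploid (haploid) genome size of ~1 Gbp, as stated
# approximately in the Introduction. All names here are illustrative.
MONOPLOID_GENOME_BP = 1_000_000_000

# Counts reported in the text.
SHORT_READS = 1_966_604_928
SHORT_READ_BASES = 393_320_985_600
LONG_READS = 1_378_917
LONG_READ_BASES = 5_642_855_018


def coverage(total_bases: int, genome_size_bp: int) -> float:
    """Depth of coverage: sequenced bases divided by genome size."""
    return total_bases / genome_size_bp


def mean_read_length(total_bases: int, n_reads: int) -> float:
    """Average number of bases per read."""
    return total_bases / n_reads


if __name__ == "__main__":
    print("short-read coverage:",
          round(coverage(SHORT_READ_BASES, MONOPLOID_GENOME_BP), 1))
    print("long-read coverage:",
          round(coverage(LONG_READ_BASES, MONOPLOID_GENOME_BP), 1))
    print("mean short-read length:",
          mean_read_length(SHORT_READ_BASES, SHORT_READS))
    print("mean long-read length:",
          round(mean_read_length(LONG_READ_BASES, LONG_READS)))
```

Under the 1 Gbp assumption the short reads work out to about 393x, matching the text, and the unfiltered long reads to roughly 5.6x. The result scales inversely with the genome size assumed, so the 4x to 5x estimate quoted for the long reads may rest on a slightly different size assumption.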
A non-redundant assembly was created using CD-HIT [16], merging 100% identical sequences and sub-sequences. For gene prediction, RNA-Seq data previously generated in our group [17] for the same cultivar were used with BRAKER1 [18] and PASA [19], and sugarcane transcript data (ESTs) and Sorghum bicolor proteins were aligned using Exonerate [20]. All gene evidence was integrated with EVidenceModeler [21] to generate a high-quality gene prediction set of 153,078 predicted protein-coding genes.
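To illustrate what merging 100% identical sequences and sub-sequences means, the sketch below removes any contig that is exactly contained in a longer one. It is a naive quadratic illustration, not CD-HIT's algorithm or command line; the function names and the `both_strands` option are hypothetical, and whether the reverse strand was considered in the original analysis is not stated in the text.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    complement = str.maketrans("ACGT", "TGCA")
    return seq.translate(complement)[::-1]


def remove_contained(seqs: dict[str, str],
                     both_strands: bool = True) -> dict[str, str]:
    """Keep only sequences that are not exact substrings of a kept one.

    Sequences are visited longest first, so every sequence is compared
    only against representatives of equal or greater length. Exact
    duplicates are dropped because a string contains itself.
    """
    kept: dict[str, str] = {}
    ordered = sorted(seqs.items(), key=lambda kv: (-len(kv[1]), kv[0]))
    for name, seq in ordered:
        query = seq.upper()
        rc = reverse_complement(query)
        contained = any(
            query in rep or (both_strands and rc in rep)
            for rep in kept.values()
        )
        if not contained:
            kept[name] = query
    return kept


if __name__ == "__main__":
    # Toy contigs (made up for illustration).
    contigs = {
        "c1": "ACGTACGTAA",
        "c2": "GTACG",       # sub-sequence of c1
        "c3": "ACGTACGTAA",  # identical to c1
        "c4": "TTACGT",      # reverse complement is a sub-sequence of c1
        "c5": "GGGGG",       # unrelated
    }
    print(sorted(remove_contained(contigs)))
```

A dedicated clustering tool is needed at genome scale, since comparing every sequence against every kept representative does not scale to over a million reads or contigs.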

Data availability

The data referenced by this article are under copyright with the following copyright statement: Copyright: © 2017 Riaño-Pachón DM and Mattiello L

Raw sequencing data are available at NCBI SRA: the long reads under accession number SRX845504, and the underlying short reads under accessions SRX853961 to SRX853969. The SP80-3280 assembly is available under accession number GCA_002018215.1. All data can be found under BioProject PRJNA272769. Genome annotation is available from https://figshare.com/projects/Sugarcane_SP80-3280_draft_genome_annotation/22327

The data note entitled 'Draft genome sequencing of the sugarcane hybrid SP80-3280' is perhaps the first report describing the whole genome of sugarcane, a complex polyploid, and its availability in NCBI will be a boon to sugarcane researchers. The study is well planned, well executed and well drafted. The data presented here would be particularly useful for functional genomic studies in sugarcane. I have read this submission. I believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.

Dear Dr. Mohan, thank you for your review of our data note. In version 2 of the note we have added links for the genome annotation in addition to the genome assembly. Best regards, Diego

Summary: The Data Note, "Draft genome sequencing of the sugarcane hybrid SP80-3280", describes a sugarcane genome assembly that is available at NCBI. The TruSeq method was applied to a monoploid sugarcane cultivar to generate a 1.2 gigabase assembly with a contig N50 of 8,433 according to GenBank. This is the first sugarcane genome assembly, so it will be of interest to the field. This data note is especially useful because it describes the sequence filtering by size, BLAST, mirabait and CD-HIT prior to release.

Suggestions: The sentence, “there are not whole genome assemblies available”, probably should say “there are no whole genome assemblies available”. The text could be made clearer by presenting all the statistics for the underlying short reads before getting to the synthetic long read stats, and by specifying that the BLAST filter was applied to the long reads. I would appreciate a reference for Celera Assembler, but that is just me. I have read this submission. I believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.

Dear Dr. Miller, thank you very much for your review of our data note. We have followed your main suggestions, and they are available as version 2 of the data note. Best regards, Diego
  18 in total

1.  Improving the Arabidopsis genome annotation using maximal transcript alignment assemblies.

Authors:  Brian J Haas; Arthur L Delcher; Stephen M Mount; Jennifer R Wortman; Roger K Smith; Linda I Hannick; Rama Maiti; Catherine M Ronning; Douglas B Rusch; Christopher D Town; Steven L Salzberg; Owen White
Journal:  Nucleic Acids Res       Date:  2003-10-01       Impact factor: 16.971

2.  BRAKER1: Unsupervised RNA-Seq-Based Genome Annotation with GeneMark-ET and AUGUSTUS.

Authors:  Katharina J Hoff; Simone Lange; Alexandre Lomsadze; Mark Borodovsky; Mario Stanke
Journal:  Bioinformatics       Date:  2015-11-11       Impact factor: 6.937

3.  De novo transcriptome assembly of sugarcane leaves submitted to prolonged water-deficit stress.

Authors:  A A Belesini; F M S Carvalho; B R Telles; G M de Castro; P F Giachetto; J S Vantini; S D Carlin; J O Cazetta; D G Pinheiro; M I T Ferro
Journal:  Genet Mol Res       Date:  2017-05-25

4.  Sugarcane genome sequencing by methylation filtration provides tools for genomic research in the genus Saccharum.

Authors:  Clícia Grativol; Michael Regulski; Marcelo Bertalan; W Richard McCombie; Felipe Rodrigues da Silva; Adhemar Zerlotini Neto; Renato Vicentini; Laurent Farinelli; Adriana Silva Hemerly; Robert A Martienssen; Paulo Cavalcanti Gomes Ferreira
Journal:  Plant J       Date:  2014-06-17       Impact factor: 6.417

5.  Physiological and transcriptional analyses of developmental stages along sugarcane leaf.

Authors:  Lucia Mattiello; Diego Mauricio Riaño-Pachón; Marina Camara Mattos Martins; Larissa Prado da Cruz; Denis Bassi; Paulo Eduardo Ribeiro Marchiori; Rafael Vasconcelos Ribeiro; Mônica T Veneziano Labate; Carlos Alberto Labate; Marcelo Menossi
Journal:  BMC Plant Biol       Date:  2015-12-29       Impact factor: 4.215

6.  Illumina TruSeq synthetic long-reads empower de novo assembly and resolve complex, highly-repetitive transposable elements.

Authors:  Rajiv C McCoy; Ryan W Taylor; Timothy A Blauwkamp; Joanna L Kelley; Michael Kertesz; Dmitry Pushkarev; Dmitri A Petrov; Anna-Sophie Fiston-Lavier
Journal:  PLoS One       Date:  2014-09-04       Impact factor: 3.240

7.  A survey of the complex transcriptome from the highly polyploid sugarcane genome using full-length isoform sequencing and de novo assembly from short read sequencing.

Authors:  Nam V Hoang; Agnelo Furtado; Patrick J Mason; Annelie Marquardt; Lakshmi Kasirajan; Prathima P Thirugnanasambandam; Frederik C Botha; Robert J Henry
Journal:  BMC Genomics       Date:  2017-05-22       Impact factor: 3.969

8.  CD-HIT: accelerated for clustering the next-generation sequencing data.

Authors:  Limin Fu; Beifang Niu; Zhengwei Zhu; Sitao Wu; Weizhong Li
Journal:  Bioinformatics       Date:  2012-10-11       Impact factor: 6.937

9.  BAC-Pool Sequencing and Assembly of 19 Mb of the Complex Sugarcane Genome.

Authors:  Vagner Katsumi Okura; Rafael S C de Souza; Susely F de Siqueira Tada; Paulo Arruda
Journal:  Front Plant Sci       Date:  2016-03-23       Impact factor: 5.753

10.  Initial genome sequencing of the sugarcane CP 96-1252 complex hybrid.

Authors:  Jason R Miller; Kari A Dilley; Derek M Harkins; Manolito G Torralba; Kelvin J Moncera; Karen Beeri; Karrie Goglin; Timothy B Stockwell; Granger G Sutton; Reed S Shabman
Journal:  F1000Res       Date:  2017-05-17
