Shoot transcriptome of the giant reed, Arundo donax.

Roberto A Barrero, Felix D Guerrero, Paula Moolhuijzen, John A Goolsby, Jason Tidwell, Stanley E Bellgard, Matthew I Bellgard.

Abstract

The giant reed, Arundo donax, is a perennial grass species that has become an invasive plant in many countries. Expansive stands of A. donax have significant negative impacts on available water resources, and efforts are underway to identify biological control agents against this species. The giant reed grows under adverse environmental conditions, displaying insensitivity to drought stress, flooding, heavy metals, salinity and herbaceous competition, thus hampering control programs. To establish a foundational molecular dataset, we used an Illumina HiSeq protocol to sequence the transcriptome of actively growing shoots from an invasive genotype collected along the Rio Grande, bordering Texas and Mexico. We report the assembly of 27,491 high confidence transcripts (≥200 bp) with at least 70% coverage of known genes in other Poaceae species. Of these, 13,080 (47.58%), 6,165 (22.43%) and 8,246 (30.0%) transcripts have sequence similarity to known, domain-containing and conserved hypothetical proteins, respectively. We also report 75,590 low confidence transcripts supported by both the trans-ABySS and Velvet-Oases de novo assembly pipelines. Within the low confidence subset, we identified partial hits to known (19,021; 25.16%), domain-containing (7,093; 9.38%) and conserved hypothetical (16,647; 22.02%) proteins. Additionally, 32,829 (43.43%) transcripts encode putative hypothetical proteins unique to A. donax. Functional annotation resulted in 5,550 and 6,070 transcripts with assigned Gene Ontology and KEGG pathway information, respectively. The most abundant KEGG pathways are spliceosome, ribosome, ubiquitin mediated proteolysis, plant-pathogen interaction, RNA degradation and the oxidative phosphorylation metabolic pathway. Furthermore, we found 12, 9, and 4 transcripts annotated as stress-related, heat stress, and water stress proteins, respectively. We envisage that these resources will promote and facilitate studies of the abiotic stress capabilities of this exotic plant species, which facilitate its invasive capacity.

Keywords:  Arundo donax; Giant reed; RNA de novo assembly; RNA-Seq; Shoot; Transcriptome

Year:  2015        PMID: 26217707      PMCID: PMC4509983          DOI: 10.1016/j.dib.2014.12.007

Source DB:  PubMed          Journal:  Data Brief        ISSN: 2352-3409


Specifications table

Value of data

This is the first transcriptome sequence data made available in GenBank/DDBJ/EMBL for the A. donax invasive Rio Grande Basin genotype. The A. donax shoot transcriptome dataset provides insights into one of the fastest growing terrestrial plants [1]. A. donax has high tolerance to abiotic stresses, and its highly invasive nature threatens many natural environments and ecosystems. The abundant biomass of A. donax plants makes it an ideal candidate for biofuel programs [2].

1. Experimental design, materials and methods

1.1. Plant tissue

Approximately 10 g of A. donax shoot tissue was excised from an actively growing shoot, approximately 20 cm above the soil surface, in a field plot at the Cattle Fever Tick Research Laboratory, Edinburg, TX, USA. The plants were propagated from plants collected at Laredo, TX in 2008 and designated the Invasive Rio Grande Basin genotype. Shoot tissue was excised under natural, non-stressed growth conditions, quickly transferred to small vials, placed on dry ice, and kept frozen at −80 °C until transfer into liquid N2 during the RNA purification steps.

1.2. RNA isolation

Shoot tissue was transferred from storage at −80 °C into liquid N2 and pulverized, and RNA was extracted using the ToTALLY RNA extraction kit according to the manufacturer's instructions (Life Technologies, Grand Island, NY, USA). A Polytron (Kinematica, Luzern, Switzerland) was used to grind the pulverized tissue for 30 s on ice in the presence of 50 ml of the kit's Denaturation Buffer. Following the LiCl precipitation step, a yield of 4 mg of total RNA was obtained. Any traces of contaminating DNA were removed by treating 10 μg aliquots of RNA with the TURBO DNA-free kit according to the manufacturer's instructions (Life Technologies). RNA quality was assessed by agarose gel electrophoresis followed by staining with Gelstar Nucleic Acid Stain (Lonza, Rockland, ME), which also helped verify that no genomic DNA contamination was present.

1.3. Sequencing and bioinformatics

Sequencing was performed at the National Center for Genome Resources (Santa Fe, NM, USA) using the standard Illumina RNA library preparation protocol and a single lane of HiSeq 100-base paired-end sequencing. A total of 181,972,782 paired-end Illumina raw reads were produced, and their quality was assessed using FastQC version 0.10.1 [http://www.bioinformatics.babraham.ac.uk/projects/fastqc]. The first 12 bases of all reads were trimmed using seqtk version 4.19 [https://github.com/lh3/seqtk] to remove sequencing biases. Contigs were assembled de novo with trans-ABySS version 1.4.8 [3] and Velvet-Oases version 0.2.08 [4] using k-mer sizes of 49, 53, 59 and 63. This yielded 368,848 and 1,477,609 transcripts (≥200 bp) from trans-ABySS and Velvet-Oases, respectively. Trans-ABySS assembled transcripts were further merged using CAP3 [8] at 99.9% sequence overlap identity, resulting in 43,822 merged contigs and 249,590 unmerged transcripts. Velvet-Oases has been shown to produce overall longer assembled transcripts than other assemblers [5,6]. We also found that Velvet-Oases can produce spurious isoforms, which can be removed by selecting representative transcripts for each locus [7]. We screened assembled transcripts against Poaceae proteins (NCBI NR) and defined as ‘high confidence genes’ those transcripts with sequence identity ≥30% and coverage ≥70% of a known Poaceae gene. We also classified as ‘low confidence genes’ those transcripts with partial or no hits to known Poaceae genes that were assembled by both the trans-ABySS and Velvet-Oases pipelines with 100% sequence identity and reciprocal transcript coverage greater than 90%. We report a total of 103,081 A. donax transcripts, of which 27,491 and 75,590 are high and low confidence genes, respectively (Table 1 and Fig. 1A).
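The confidence criteria described above can be sketched in Python as follows. This is a minimal illustration, not the authors' pipeline: the `ProteinHit` record layout, the function names, and the idea that the cross-assembler check (100% identity, reciprocal coverage greater than 90%) arrives as a precomputed set of transcript IDs are all assumptions made for the example. Coverage is approximated here by the aligned span on the known protein, ignoring gaps.

```python
from dataclasses import dataclass

# Thresholds quoted in the text for 'high confidence genes'.
MIN_IDENTITY_PCT = 30.0
MIN_SUBJECT_COVERAGE_PCT = 70.0


@dataclass
class ProteinHit:
    """One transcript-vs-protein alignment (hypothetical record layout)."""
    transcript_id: str
    percent_identity: float  # % identical positions in the alignment
    subject_start: int       # 1-based alignment start on the known protein
    subject_end: int         # 1-based alignment end on the known protein
    subject_length: int      # full length of the known Poaceae protein


def subject_coverage_pct(hit: ProteinHit) -> float:
    """Percentage of the known protein spanned by the alignment."""
    aligned = abs(hit.subject_end - hit.subject_start) + 1
    return 100.0 * aligned / hit.subject_length


def is_high_confidence(hit: ProteinHit) -> bool:
    """Apply the identity and coverage thresholds to a single hit."""
    return (hit.percent_identity >= MIN_IDENTITY_PCT
            and subject_coverage_pct(hit) >= MIN_SUBJECT_COVERAGE_PCT)


def classify(transcript_ids, hits, shared_by_both_assemblers):
    """Label each transcript 'high', 'low', or 'unclassified'.

    shared_by_both_assemblers: IDs that already passed the cross-assembler
    criterion; computing that set is outside the scope of this sketch.
    """
    high = {h.transcript_id for h in hits if is_high_confidence(h)}
    labels = {}
    for tid in transcript_ids:
        if tid in high:
            labels[tid] = "high"
        elif tid in shared_by_both_assemblers:
            labels[tid] = "low"
        else:
            labels[tid] = "unclassified"
    return labels


# Toy data: t1 covers 80% of its protein, t2 only 40%, t3 has no hit.
toy_hits = [
    ProteinHit("t1", 45.0, 1, 80, 100),
    ProteinHit("t2", 90.0, 1, 40, 100),
]
print(classify(["t1", "t2", "t3"], toy_hits, {"t2"}))
```

In this toy run, a transcript with a partial hit falls into the low confidence set only because it is also listed as shared by both assemblers, which mirrors the two-step definition in the text.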
More than 70% of the high confidence genes were functionally annotated, while only 34.55% of the low confidence genes had partial hits to known and domain-containing Poaceae genes (Fig. 1A). We used AutoFACT version 3.4 [9] to functionally annotate transcripts (Supplementary files 1 and 2). Fig. 1B shows the relative abundance of the top 20 KEGG pathways in the high confidence genes compared with the low confidence gene set. Relative to the low confidence set, the number of high confidence genes assigned to the spliceosome, purine metabolism and peroxisome pathways was 1.86-, 1.71- and 1.58-fold higher, respectively (Fig. 1B). Fig. 1C shows the top Gene Ontology annotations found among high and low confidence genes. Interestingly, two genes with copper ion binding and transport function were found only among the high confidence genes, while genes involved in nutrient reservoir activity and reproductive growth were found only among the low confidence genes (Fig. 1C). The resources generated in this study will facilitate comparative transcriptomics analyses of invasive plant species.
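A fold abundance of the kind reported for the KEGG pathways can be computed along the following lines. The text does not state how the counts were normalised, so this sketch assumes that each pathway count is divided by the size of its own gene set before the ratio is taken; the function name and the pathway counts in the example are hypothetical, and only the two set sizes come from the text.

```python
def fold_abundance(count_high: int, total_high: int,
                   count_low: int, total_low: int) -> float:
    """Fold abundance of a pathway in the high vs. low confidence set.

    Assumption: each count is normalised by the size of its own gene set,
    i.e. the result is (count_high / total_high) / (count_low / total_low).
    """
    if min(total_high, total_low) <= 0:
        raise ValueError("set sizes must be positive")
    if count_low == 0:
        raise ValueError("fold change is undefined when the low set has no hits")
    # Cross-multiplied form of the ratio of the two shares.
    return (count_high * total_low) / (count_low * total_high)


# Set sizes are taken from the text; the pathway counts are made up.
HIGH_TOTAL, LOW_TOTAL = 27_491, 75_590
example = fold_abundance(count_high=120, total_high=HIGH_TOTAL,
                         count_low=180, total_low=LOW_TOTAL)
print(f"hypothetical pathway: {example:.2f}-fold")
```

A value above 1 means the pathway makes up a larger share of the high confidence set than of the low confidence set, even when its raw count is smaller.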

Table 1

Arundo donax transcriptome assembly statistics.

| | High confidence genes | Low confidence genes |
|---|---|---|
| Number of transcripts | 27,491 | 75,590 |
| Total size of transcripts (nt) | 32,326,850 | 55,020,434 |
| Longest transcript (nt) | 14,995 | 8,091 |
| Shortest transcript (nt) | 200 | 200 |
| Number of transcripts > 1K nt | 13,877 (50.5%) | 14,879 (19.7%) |
| Number of transcripts > 10K nt | 2 (0.0%) | 0 (0.0%) |
| Number of transcripts > 100K nt | 0 (0.0%) | 0 (0.0%) |
| Mean transcript size (nt) | 1,176 | 728 |
| Median transcript size (nt) | 1,008 | 584 |
| N50 transcript length (nt) | 1,413 | 870 |
| L50 transcript count | 7,811 | 19,821 |
| Transcript %A | 24.16 | 26.16 |
| Transcript %C | 25.11 | 23.28 |
| Transcript %G | 26.36 | 24.02 |
| Transcript %T | 24.37 | 26.53 |
| Transcript %N | 0 | 0 |

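
The N50 and L50 rows of Table 1 follow the usual assembly-statistics definitions: with transcripts sorted from longest to shortest, N50 is the length of the transcript at which the running total first reaches half of the summed length, and L50 is the number of transcripts needed to get there. The sketch below shows that calculation on a toy list of lengths; the function name is hypothetical, and the tool actually used to produce the table is not stated in the text.

```python
def n50_l50(lengths):
    """Return (N50, L50) for a non-empty list of transcript lengths."""
    if not lengths:
        raise ValueError("lengths must be non-empty")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half_total:
            # 'length' is the N50; 'count' transcripts were needed (L50).
            return length, count


# Toy lengths: total is 20, so half is 10; 6 + 5 = 11 reaches it.
print(n50_l50([2, 3, 4, 5, 6]))

# The mean sizes in Table 1 can be re-derived from the totals it reports.
print(round(32_326_850 / 27_491), round(55_020_434 / 75_590))
```

The second print statement re-derives the two mean transcript sizes from the reported totals and counts, which is a quick consistency check on the reconstructed table.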
Fig. 1

Functional annotation of A. donax transcripts: (A) Classification of high confidence and low confidence transcripts based on comparison against the NCBI NR database. (B) Fold abundance of the top 20 KEGG pathways in high confidence transcripts compared with the low confidence subset. P1=Ribosome; P2=Spliceosome; P3=Ubiquitin mediated proteolysis; P4=Metabolic pathways, Oxidative phosphorylation; P5=Plant–pathogen interaction; P6=Proteasome; P7=Protein export; P8=Metabolic pathways, Purine metabolism, Pyrimidine metabolism, RNA polymerase; P9=RNA degradation; P10=Basal transcription factors; P11=Endocytosis; P12=Metabolic pathways, Starch and sucrose metabolism; P13=Peroxisome; P14=Metabolic pathways, N-Glycan biosynthesis; P15=Aminoacyl-tRNA biosynthesis; P16=Natural killer cell mediated cytotoxicity; P17=Base excision repair; P18=Regulation of autophagy; P19=Metabolic pathways, Pyrimidine metabolism; P20=Metabolic pathways, Porphyrin and chlorophyll metabolism. (C) Gene Ontology terms for biological process, molecular function, and cellular component were assigned using AutoFACT [9] and summarized using WEGO [10].

2. Direct link to deposited data

Deposited data can be found here: http://www.ncbi.nlm.nih.gov/GBRH01000000.

3. Nucleotide sequence accession number

The assembled and annotated A. donax USA genotype Rio Grande RNA transcriptome has been deposited at DDBJ/EMBL/GenBank under the project accession PRJNA256910. This Transcriptome Shotgun Assembly project has been deposited at DDBJ/EMBL/GenBank under the accession GBRH00000000. The version described in this paper is the first version, GBRH01000000.

Conflict of interest

The authors declare that there is no conflict of interest regarding any work published in this paper.

Specifications table

Subject area: Biology
More specific subject area: RNA-seq transcriptome data of Arundo donax
Type of data: Table, figure
How data was acquired: 2×100 HiSeq (single lane of 100-base paired-end sequencing)
Data format: Raw FASTQ and processed FASTA
Experimental factors: 10 g of actively growing shoot, excised approximately 20 cm above soil level
Experimental features: Assembled transcriptome of actively growing shoot tissue excised from A. donax grown in field plots
Data source location: Laredo, TX, USA
Data accessibility: Data is with this article and also available at http://www.ncbi.nlm.nih.gov/GBRH01000000. The assembled and annotated A. donax USA genotype Rio Grande RNA transcriptome has been deposited at DDBJ/EMBL/GenBank under the project accession PRJNA256910.

References

X Huang; A Madan. CAP3: A DNA sequence assembly program. Genome Res. 1999-09.

Inanç Birol; Shaun D Jackman; Cydney B Nielsen; Jenny Q Qian; Richard Varhol; Greg Stazyk; Ryan D Morin; Yongjun Zhao; Martin Hirst; Jacqueline E Schein; Doug E Horsman; Joseph M Connors; Randy D Gascoyne; Marco A Marra; Steven J M Jones. De novo transcriptome assembly with ABySS. Bioinformatics. 2009-06-15.

Marcel H Schulz; Daniel R Zerbino; Martin Vingron; Ewan Birney. Oases: robust de novo RNA-seq assembly across the dynamic range of expression levels. Bioinformatics. 2012-02-24.

Roberto A Barrero; Brett Chapman; Yanfang Yang; Paula Moolhuijzen; Gabriel Keeble-Gagnère; Nan Zhang; Qi Tang; Matthew I Bellgard; Deyou Qiu. De novo assembly of Euphorbia fischeriana root transcriptome identifies prostratin pathway related genes. BMC Genomics. 2011-12-13.

Liisa B Koski; Michael W Gray; B Franz Lang; Gertraud Burger. AutoFACT: an automatic functional annotation and classification tool. BMC Bioinformatics. 2005-06-16.

Jia Ye; Lin Fang; Hongkun Zheng; Yong Zhang; Jie Chen; Zengjin Zhang; Jing Wang; Shengting Li; Ruiqiang Li; Lars Bolund; Jun Wang. WEGO: a web tool for plotting GO annotations. Nucleic Acids Res. 2006-07-01.

Samuel E Fox; Matthew Geniza; Mamatha Hanumappa; Sushma Naithani; Chris Sullivan; Justin Preece; Vijay K Tiwari; Justin Elser; Jeffrey M Leonard; Abigail Sage; Cathy Gresham; Arnaud Kerhornou; Dan Bolser; Fiona McCarthy; Paul Kersey; Gerard R Lazo; Pankaj Jaiswal. De novo transcriptome assembly and analyses of gene expression during photomorphogenesis in diploid wheat Triticum monococcum. PLoS One. 2014-05-12.

Samuel E Fox; Justin Preece; Jeffrey A Kimbrel; Gina L Marchini; Abigail Sage; Ken Youens-Clark; Mitchell B Cruzan; Pankaj Jaiswal. Sequencing and de novo transcriptome assembly of Brachypodium sylvaticum (Poaceae). Appl Plant Sci. 2013-03-05.