De novo whole genome sequencing data of two mangrove-isolated microalgae from Terengganu coastal waters.

Kit Yinn Teh1,2, C L Wan Afifudeen1,2, Ahmad Aziz3, Li Lian Wong1,2,4, Saw Hong Loh3,1, Thye San Cha3,1,2.   

Abstract

Interest in harvesting the potential benefits of microalgae makes it necessary to investigate the many ecological niches that a single species can occupy. This dataset comprises de novo whole genome assemblies of two mangrove-isolated microalgae (division Chlorophyta), Chlorella vulgaris UMT-M1 and Messastrum gracile SE-MC4, from Universiti Malaysia Terengganu, Malaysia. Libraries were run as 2 × 150 base paired-end reads, and sequencing was conducted on the Illumina Novaseq 2500 platform. Sequencing yielded ∼11 Gb of raw reads for each species, which were then assembled de novo. Genome assembly resulted in genome sizes of 50.15 Mbp and 60.83 Mbp for UMT-M1 and SE-MC4, respectively. All filtered and assembled genomic sequence data have been submitted to the National Center for Biotechnology Information (NCBI) and can be found at DDBJ/ENA/GenBank under the accessions VJNP00000000 (UMT-M1) and VIYE00000000 (SE-MC4).
© 2019 The Authors.

Keywords:  Chlorophyta; IDBA-UD; Next generation sequencing; Oleaginous microalgae; Salinity

Year:  2019        PMID: 31720332      PMCID: PMC6838400          DOI: 10.1016/j.dib.2019.104680

Source DB:  PubMed          Journal:  Data Brief        ISSN: 2352-3409


Data

The response of microalgae to environmental stimuli is species-specific and may even vary from strain to strain [1,2]. Moreover, mangrove-dwelling microalgae are regularly exposed to alternating high and low tides, making them unique assemblages in a marginal ecosystem niche with possibly unique responses. Being able to regulate and control the outcome of those responses remains one of the most difficult challenges in phycology research. Both UMT-M1 and SE-MC4 used in this research are oleaginous native species isolated from mangrove areas in Terengganu, Malaysia. UMT-M1 has been intensively studied in our previous research for oil and fatty acid production under various culture conditions, such as nitrogen starvation [3] and phytohormone treatments [4–6], as well as for strain improvement through genetic modification [7,8]. SE-MC4, on the other hand, is a non-model species that has been observed in our laboratory to accumulate a total oil content of more than 50% of dry weight.

Exploring the novel genome of a non-model microalga is imperative to enrich the genome data available for further biodiesel development applications. Efforts to improve microalgae feedstock at the molecular level are often curtailed by the limited number of available microalgae genomes [9]. Moreover, the available C. vulgaris genome represents only a freshwater strain [10]. In that respect, the de novo WGS of C. vulgaris UMT-M1 featured in this report represents a mangrove-dwelling microalga that is able to adapt and survive in a wide range of salinity. In addition, exploration of potentially high-oil-producing non-model species such as M. gracile SE-MC4 is pertinent for adding genetic variety to the presently available genetic databank [11].

In UMT-M1, sequencing generated 73,495,318 raw reads, amounting to 11,097,793,018 bases (11.09 Gb) in total (Table 1). Overall, 89.58% of total bases achieved a Phred score of Q30, with a GC content of 62.29%. High-quality raw reads from Table 1 were then filtered, normalized and assembled de novo using the IDBA-UD assembler [12], which internally builds contigs into scaffolds. Scaffolds shorter than 200 bases were removed. The assembly produced 2547 scaffolds totalling 50,153,796 bases (50.15 Mbp). The scaffold N50 and N90 were 56,390 and 14,886 bases, respectively (Table 2).
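As a quick consistency check on the figures reported for both libraries, total bases can be divided by total reads to give the mean read length, and by the assembled length to give a rough raw sequencing depth. The sketch below is illustrative only: the function names are hypothetical, the numbers are copied from Tables 1 and 2, and the depth is computed from raw (unfiltered) bases against the scaffold total, so it overestimates the depth of the filtered reads actually used for assembly.

```python
# Values copied from Table 1 (reads, bases) and Table 2 (assembly length).
LIBRARIES = {
    "C. vulgaris UMT-M1": {
        "reads": 73_495_318,
        "bases": 11_097_793_018,
        "assembly": 50_153_796,
    },
    "M. gracile SE-MC4": {
        "reads": 72_742_158,
        "bases": 10_984_065_858,
        "assembly": 60_830_643,
    },
}


def mean_read_length(total_bases: int, total_reads: int) -> float:
    """Average number of bases per read in a library."""
    return total_bases / total_reads


def raw_depth(total_bases: int, assembly_length: int) -> float:
    """Raw bases sequenced per assembled base (an approximation of depth)."""
    return total_bases / assembly_length


for name, lib in LIBRARIES.items():
    length = mean_read_length(lib["bases"], lib["reads"])
    depth = raw_depth(lib["bases"], lib["assembly"])
    print(f"{name}: mean read length {length:.1f} bases, raw depth {depth:.1f}x")
```

With the reported totals, both libraries work out to 151 bases per read, and the raw depth is roughly 221× for UMT-M1 and 181× for SE-MC4.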
Table 1

Statistics of paired-end sequence library for C. vulgaris UMT-M1 and M. gracile SE-MC4.

Species | Total reads | Total bases | GC content (%) | Nt* > Q30 (%)
C. vulgaris UMT-M1 | 73,495,318 | 11,097,793,018 (11.09 Gb) | 62.29 | 89.58
M. gracile SE-MC4 | 72,742,158 | 10,984,065,858 (10.98 Gb) | 68.27 | 90.52

*Nt = nucleotides.
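
The GC content and Q30 columns of Table 1 are per-library summaries of the raw FASTQ records. As a minimal sketch of how such figures can be derived, the following assumes Phred+33 quality encoding and counts a base as passing when its score is at least 30. Both are assumptions, since the article does not state the encoding or whether the threshold is inclusive. The function name and the toy reads are hypothetical.

```python
from typing import Dict, Iterable, Tuple


def library_stats(
    reads: Iterable[Tuple[str, str]],
    offset: int = 33,      # assumption: Phred+33 encoding
    threshold: int = 30,   # assumption: a base passes when score >= 30
) -> Dict[str, float]:
    """Summarise (sequence, quality) pairs into Table 1 style statistics."""
    total_reads = total_bases = gc_bases = passing_bases = 0
    for sequence, quality in reads:
        if len(sequence) != len(quality):
            raise ValueError("sequence and quality strings must have equal length")
        total_reads += 1
        total_bases += len(sequence)
        upper = sequence.upper()
        gc_bases += upper.count("G") + upper.count("C")
        # Decode each quality character into its Phred score.
        passing_bases += sum(1 for ch in quality if ord(ch) - offset >= threshold)
    if total_bases == 0:
        raise ValueError("no bases to summarise")
    return {
        "total_reads": total_reads,
        "total_bases": total_bases,
        "gc_percent": 100.0 * gc_bases / total_bases,
        "q30_percent": 100.0 * passing_bases / total_bases,
    }


# Toy reads, not taken from the deposited data.
toy_reads = [("GCAT", "II?5"), ("GGCC", "IIII")]
stats = library_stats(toy_reads)
print(stats)
```

In practice these statistics would be computed over the full FASTQ files by a dedicated quality-control tool; the sketch only shows the arithmetic behind the two percentage columns.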

Table 2

De novo sequence statistics for C. vulgaris UMT-M1 and M. gracile SE-MC4.

Species | Number of scaffolds | Total length (base) | Max length (base) | Min length (base) | N50 | N90
C. vulgaris UMT-M1 | 2547 | 50,153,796 | 386,660 | 201 | 56,390 | 14,886
M. gracile SE-MC4 | 32,473 | 60,830,643 | 52,109 | 201 | 2915 | 802

In SE-MC4, total bases generated from sequencing amounted to 10,984,065,858 bp (10.98 Gb) with 68.27% GC content, and 90.52% of bases achieved a Phred score of Q30. Sequencing data statistics are summarised in Table 1. De novo assembly of SE-MC4 yielded 32,473 scaffolds with a total length of 60,830,643 bp (60.83 Mb), a maximum scaffold length of 52,109 bp and a minimum of 201 bp. The scaffold N50 is 2915 bp, while the N90 is 802 bp. Statistics of the genome assembly are shown in Table 2.
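
For readers unfamiliar with the N50 and N90 columns of Table 2, the sketch below shows the usual definition: scaffolds are sorted from longest to shortest, and Nx is the length of the scaffold at which the running total first reaches x% of the assembled bases. This is a generic illustration with a hypothetical function name and toy lengths; it is not the script used to produce Table 2.

```python
from typing import Sequence


def nx(lengths: Sequence[int], percent: int) -> int:
    """Return the Nx statistic (e.g. percent=50 for N50) of scaffold lengths."""
    if not lengths:
        raise ValueError("at least one scaffold length is required")
    if not 0 < percent <= 100:
        raise ValueError("percent must be in the range 1..100")
    ordered = sorted(lengths, reverse=True)
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        # Integer comparison avoids floating-point rounding at the threshold.
        if running * 100 >= percent * total:
            return length
    return ordered[-1]


# Toy scaffold lengths (total 300 bases), not taken from the assemblies.
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
n50 = nx(toy_lengths, 50)
n90 = nx(toy_lengths, 90)
mean_length = sum(toy_lengths) / len(toy_lengths)
print(n50, n90, round(mean_length, 2))
```

In the toy data the N50 (70) is well above the arithmetic mean (about 42.9), which illustrates that N50 is a length-weighted statistic and not an average scaffold length.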

Experimental design, materials, and methods

Sample preparation

Inoculum stock was obtained from the microalgae culture collection at Universiti Malaysia Terengganu. Stock cultures were maintained under axenic, sterile conditions in modified Guillard's F2 medium [3] prepared with artificial seawater (30 ppt). Cells were harvested at mid-stationary phase from 50 mL of culture by centrifugation at 7000 rpm for 5 min. DNA was extracted from the fresh pellet using the Wizard® Genomic DNA Purification Kit (Promega, USA), following the manufacturer's protocol for all extraction steps. Prior to sequencing, DNA quality was evaluated from the absorbance ratios (A260/280 and A260/230), the gel electrophoresis pattern and double-stranded DNA concentration measurements.

De novo WGS sequencing

Library preparation and sequencing were conducted by Theragen Bio Itex, South Korea. Libraries were prepared using the TruSeq Nano DNA Library Prep Kit (Illumina, USA). During library construction, DNA was size-selected and ligated with adaptors to produce an insert size of 350 bp [13]. Cluster generation on flow cells was performed with the constructed libraries on cBot equipment (Illumina, USA), and sequencing was then performed on the Illumina Novaseq 2500 platform with 2 × 150 base paired-end reads. Following sequencing, adapter sequences were trimmed from the raw reads with cutadapt v1.10 [14], and quality filtering was performed to remove contaminants. Reads that scored above Q30 were selected for assembly. De novo assembly of the high-quality reads was then carried out using the IDBA-UD assembler to form scaffolds [12]. Scaffolds shorter than 200 bp were removed manually.
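
The final length filter can be illustrated as follows. The article states that short scaffolds were removed manually and provides no script, so this is only a sketch of an equivalent filter: the function names and toy sequences are hypothetical, and the cut-off keeps scaffolds of 200 bases or more, following the "<200 bp removed" wording above.

```python
from typing import Iterable, Iterator, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def filter_scaffolds(
    records: Iterable[Tuple[str, str]], min_length: int = 200
) -> Iterator[Tuple[str, str]]:
    """Keep only scaffolds whose sequence has at least `min_length` bases."""
    for header, sequence in records:
        if len(sequence) >= min_length:
            yield header, sequence


# Toy FASTA content, not taken from the deposited assemblies.
toy_fasta = [
    ">scaffold_a", "ACGT" * 50, "ACGT" * 25,  # 300 bases over two lines
    ">scaffold_b", "A" * 199,                 # below the cut-off
    ">scaffold_c", "C" * 200,                 # exactly at the cut-off
]
kept = [name for name, _ in filter_scaffolds(read_fasta(toy_fasta))]
print(kept)
```

To apply the same filter to a real assembly, an open file handle can be passed to `read_fasta` in place of the toy list.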

Deposition of genome data

Raw sequence data and the assembled genomes were deposited in the NCBI repository. The step-by-step submission guidelines in the NCBI author guide (https://www.ncbi.nlm.nih.gov/genbank/genomesubmit/) were followed. A breakdown of the project accessions is shown in Table 3.
Table 3

Sequence accession numbers and directory links.

Species | Directory/Data | Accession number | Link
C. vulgaris UMT-M1 | BioProject | PRJNA550188 | https://www.ncbi.nlm.nih.gov/bioproject/PRJNA550188
C. vulgaris UMT-M1 | BioSample | SAMN12111214 | https://www.ncbi.nlm.nih.gov/biosample/SAMN12111214
C. vulgaris UMT-M1 | Raw sequence (SRA) | SRR9478717 | https://www.ncbi.nlm.nih.gov/sra/SRX6245806/
C. vulgaris UMT-M1 | Assembled genome | VJNP00000000 | https://www.ncbi.nlm.nih.gov/nuccore/VJNP00000000
M. gracile SE-MC4 | BioProject | PRJNA550185 | https://www.ncbi.nlm.nih.gov/bioproject/PRJNA550185
M. gracile SE-MC4 | BioSample | SAMN12111213 | https://www.ncbi.nlm.nih.gov/biosample/SAMN12111213
M. gracile SE-MC4 | Raw sequence (SRA) | SRR9587833 | https://www.ncbi.nlm.nih.gov/sra/SRX6353668
M. gracile SE-MC4 | Assembled genome | VIYE00000000 | https://www.ncbi.nlm.nih.gov/nuccore/VIYE00000000
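
Most of the links in Table 3 follow a simple pattern of NCBI base address, database name and accession number. The sketch below rebuilds them from the accessions; the helper name is hypothetical, and the pattern is inferred only from the links listed in Table 3. The SRA rows are left out because their links use experiment identifiers (SRX) that differ from the listed run accessions (SRR).

```python
NCBI_BASE = "https://www.ncbi.nlm.nih.gov"

# Accessions copied from Table 3, keyed by the database path used in its links.
ACCESSIONS = {
    "C. vulgaris UMT-M1": {
        "bioproject": "PRJNA550188",
        "biosample": "SAMN12111214",
        "nuccore": "VJNP00000000",
    },
    "M. gracile SE-MC4": {
        "bioproject": "PRJNA550185",
        "biosample": "SAMN12111213",
        "nuccore": "VIYE00000000",
    },
}


def ncbi_url(database: str, accession: str) -> str:
    """Build a record link in the form used by Table 3."""
    return f"{NCBI_BASE}/{database}/{accession}"


LINKS = {
    species: {db: ncbi_url(db, acc) for db, acc in records.items()}
    for species, records in ACCESSIONS.items()
}

for species, links in LINKS.items():
    for database, url in links.items():
        print(f"{species}\t{database}\t{url}")
```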

Specifications Table

Subject | Molecular Biology
Specific subject area | Whole genome sequencing (WGS)
Type of data | WGS data of: i) C. vulgaris UMT-M1; ii) M. gracile SE-MC4
How data were acquired | Paired-end sequencing on Illumina Novaseq 2500 platform followed by de novo assembly using IDBA-UD
Data format | Raw and filtered de novo genome sequences: FASTQ
Parameters for data collection | DNA extracted from axenic cultures
Description of data collection | DNA from fresh microalgae cells was extracted. DNA purity and concentration were measured before sequencing. Data were assembled de novo using IDBA-UD assembler.
Data source location | Institution: Institute of Marine Biotechnology, Universiti Malaysia Terengganu. City/Town/Region: Kuala Terengganu, Terengganu. Country: Malaysia. Latitude and longitude (and GPS coordinates) for collected samples/data: i) UMT-M1: 5° 24′ 11.39″ N, 103° 05′ 9.60″ E (Mengabang Telipot, Universiti Malaysia Terengganu); ii) SE-MC4: 5° 31′ 59.2″ N, 102° 56′ 52.2″ E (Setiu Wetland, Terengganu)
Data accessibility | Genomes of both species can be found at DDBJ/ENA/GenBank under the accession numbers: i) C. vulgaris UMT-M1: VJNP00000000; ii) M. gracile SE-MC4: VIYE00000000

Value of the Data

First complete chromosomal genome sequencing of two native microalgae isolated from mangrove area in tropical region.

Further enrich the currently limited WGS data collections of important microalgae species, aid in strain improvement and support interests of various biotechnology industries.

Benefit future works on comparative genome analysis and microalgae adaptation responses.

References

1.  IDBA-UD: a de novo assembler for single-cell and metagenomic sequencing data with highly uneven depth.

Authors:  Yu Peng; Henry C M Leung; S M Yiu; Francis Y L Chin
Journal:  Bioinformatics       Date:  2012-04-11       Impact factor: 6.937

2.  Gibberellin Promotes Cell Growth and Induces Changes in Fatty Acid Biosynthesis and Upregulates Fatty Acid Biosynthetic Genes in Chlorella vulgaris UMT-M1.

Authors:  Malinna Jusoh; Saw Hong Loh; Ahmad Aziz; Thye San Cha
Journal:  Appl Biochem Biotechnol       Date:  2018-12-10       Impact factor: 2.926

Review 3.  The potentials and challenges of algae based biofuels: a review of the techno-economic, life cycle, and resource assessment modeling.

Authors:  Jason C Quinn; Ryan Davis
Journal:  Bioresour Technol       Date:  2014-10-24       Impact factor: 9.642

Review 4.  Can Omics Approaches Improve Microalgal Biofuels under Abiotic Stress?

Authors:  El-Sayed Salama; Sanjay P Govindwar; Rahul V Khandare; Hyun-Seog Roh; Byong-Hun Jeon; Xiangkai Li
Journal:  Trends Plant Sci       Date:  2019-05-10       Impact factor: 18.313

Review 5.  Library construction for next-generation sequencing: overviews and challenges.

Authors:  Steven R Head; H Kiyomi Komori; Sarah A LaMere; Thomas Whisenant; Filip Van Nieuwerburgh; Daniel R Salomon; Phillip Ordoukhanian
Journal:  Biotechniques       Date:  2014-02-01       Impact factor: 1.993

6.  Indole-3-acetic acid (IAA) induced changes in oil content, fatty acid profiles and expression of four fatty acid biosynthetic genes in Chlorella vulgaris at early stationary growth phase.

Authors:  Malinna Jusoh; Saw Hong Loh; Tse Seng Chuah; Ahmad Aziz; Thye San Cha
Journal:  Phytochemistry       Date:  2015-01-09       Impact factor: 4.072

7.  Differential regulation of fatty acid biosynthesis in two Chlorella species in response to nitrate treatments and the potential of binary blending microalgae oils for biodiesel application.

Authors:  Thye San Cha; Jian Woon Chen; Eng Giap Goh; Ahmad Aziz; Saw Hong Loh
Journal:  Bioresour Technol       Date:  2011-09-16       Impact factor: 9.642

8.  Examination of triacylglycerol biosynthetic pathways via de novo transcriptomic and proteomic analyses in an unsequenced microalga.

Authors:  Michael T Guarnieri; Ambarish Nag; Sharon L Smolinski; Al Darzins; Michael Seibert; Philip T Pienkos
Journal:  PLoS One       Date:  2011-10-17       Impact factor: 3.240

9.  Genome Sequence of the Oleaginous Green Alga, Chlorella vulgaris UTEX 395.

Authors:  Michael T Guarnieri; Jennifer Levering; Calvin A Henard; Jeffrey L Boore; Michael J Betenbaugh; Karsten Zengler; Eric P Knoshaug
Journal:  Front Bioeng Biotechnol       Date:  2018-04-05