Kit Yinn Teh1,2, C L Wan Afifudeen1,2, Ahmad Aziz3, Li Lian Wong1,2,4, Saw Hong Loh3,1, Thye San Cha3,1,2.
Abstract
Harvesting the potential benefits of microalgae requires investigating the many ecological niches that a single species can occupy. This dataset comprises de novo whole-genome assemblies of two mangrove-isolated microalgae (division Chlorophyta), Chlorella vulgaris UMT-M1 and Messastrum gracile SE-MC4, from Universiti Malaysia Terengganu, Malaysia. Libraries were run as 2 × 150 base paired-end reads, and sequencing was conducted on the Illumina Novaseq 2500 platform. Sequencing yielded raw reads amounting to ∼11 Gb in total bases for each species, which were then assembled de novo. The resulting assemblies total 50.15 Mbp for UMT-M1 and 60.83 Mbp for SE-MC4. All filtered and assembled genomic sequence data have been submitted to the National Center for Biotechnology Information (NCBI) and are available at DDBJ/ENA/GenBank under the accessions VJNP00000000 (UMT-M1) and VIYE00000000 (SE-MC4).
Keywords: Chlorophyta; IDBA-UD; Next generation sequencing; Oleaginous microalgae; Salinity
Year: 2019 PMID: 31720332 PMCID: PMC6838400 DOI: 10.1016/j.dib.2019.104680
Source DB: PubMed Journal: Data Brief ISSN: 2352-3409
Statistics of paired-end sequence library for C. vulgaris UMT-M1 and M. gracile SE-MC4.
| Species | Total reads | Total bases | GC content (%) | Nt* > Q30 (%) |
|---|---|---|---|---|
| C. vulgaris UMT-M1 | 73,495,318 | 11,097,793,018 (11.09 Gb) | 62.29 | 89.58 |
| M. gracile SE-MC4 | 72,742,158 | 10,984,065,858 (10.98 Gb) | 68.27 | 90.52 |
*Nt = nucleotides.
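The library statistics can be cross-checked with simple arithmetic: total bases divided by total reads gives the mean read length, and total bases divided by the assembled length gives a rough raw sequencing depth. The sketch below is illustrative only. It assumes the two table rows follow the caption order (UMT-M1 first, SE-MC4 second), takes the assembly lengths from the de novo statistics table, and uses a hypothetical `Library` class that is not part of the original work. Depth computed this way uses raw rather than filtered reads, so it is an upper estimate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Library:
    """One paired-end library plus the length of its assembly (hypothetical helper)."""
    label: str
    total_reads: int
    total_bases: int
    assembly_length: int  # total scaffold length, in bases

    @property
    def mean_read_length(self) -> float:
        # Average number of bases per read.
        return self.total_bases / self.total_reads

    @property
    def raw_depth(self) -> float:
        # Raw bases per assembled base; an upper estimate of coverage.
        return self.total_bases / self.assembly_length


# Values copied from the tables; row-to-species mapping is an assumption.
LIBRARIES = [
    Library("C. vulgaris UMT-M1", 73_495_318, 11_097_793_018, 50_153_796),
    Library("M. gracile SE-MC4", 72_742_158, 10_984_065_858, 60_830_643),
]

for lib in LIBRARIES:
    print(
        f"{lib.label}: mean read length {lib.mean_read_length:.1f} bases, "
        f"raw depth ~{lib.raw_depth:.0f}x"
    )
```

Both rows work out to 151 bases per read, close to the stated 2 × 150 design, and to raw depths of roughly 221× and 181× relative to the assembled lengths.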
De novo sequence statistics for C. vulgaris UMT-M1 and M. gracile SE-MC4.
| Species | Number of scaffolds | Total length (bases) | Max length (bases) | Min length (bases) | N50 | N90 |
|---|---|---|---|---|---|---|
| C. vulgaris UMT-M1 | 2547 | 50,153,796 | 386,660 | 201 | 56,390 | 14,886 |
| M. gracile SE-MC4 | 32,473 | 60,830,643 | 52,109 | 201 | 2915 | 802 |
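For readers unfamiliar with the N50 and N90 columns, the following minimal sketch shows the usual definition: sort scaffolds from longest to shortest and report the length of the scaffold at which the running total first reaches 50% (or 90%) of the assembly. The function name `nx` and the toy scaffold lengths are hypothetical. The reported values cannot be recomputed from the summary table alone, since that requires the individual scaffold lengths from the deposited assemblies, and the assembler's own reporting may differ in detail.

```python
def nx(lengths, percent):
    """Return the Nx statistic (e.g. percent=50 for N50) of a list of lengths.

    Scaffolds are visited from longest to shortest; the result is the length
    of the scaffold at which the cumulative sum first reaches `percent` of
    the total assembly length. Integer arithmetic avoids rounding issues.
    """
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 100 >= percent * total:
            return length
    raise ValueError("lengths must be a non-empty list of positive integers")


# Toy scaffold lengths (hypothetical, not from the dataset); total = 300.
toy_scaffolds = [80, 70, 50, 40, 30, 20, 10]

print("N50:", nx(toy_scaffolds, 50))
print("N90:", nx(toy_scaffolds, 90))
```

Under this definition, a higher N50 with fewer scaffolds, as reported for UMT-M1, indicates a more contiguous assembly than the SE-MC4 one.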
Sequence accession numbers and directory links.
| Species | Directory/Data | Accession number | Links |
|---|---|---|---|
| C. vulgaris UMT-M1 | BioProject | PRJNA550188 | |
| C. vulgaris UMT-M1 | BioSample | SAMN12111214 | |
| C. vulgaris UMT-M1 | Raw sequence (SRA) | SRR9478717 | |
| C. vulgaris UMT-M1 | Assembled genome | VJNP00000000 | |
| M. gracile SE-MC4 | BioProject | PRJNA550185 | |
| M. gracile SE-MC4 | BioSample | SAMN12111213 | |
| M. gracile SE-MC4 | Raw sequence (SRA) | SRR9587833 | |
| M. gracile SE-MC4 | Assembled genome | VIYE00000000 | |
Specifications Table
| Subject | Molecular Biology |
| Specific subject area | Whole genome sequencing (WGS) |
| Type of data | WGS data of: |
| How data were acquired | Paired-end sequencing on Illumina Novaseq 2500 platform followed by |
| Data format | Raw and filtered |
| Parameters for data collection | DNA extracted from axenic cultures |
| Description of data collection | DNA from fresh microalgae cells was extracted. DNA purity and concentration were measured before sequencing. Data were assembled |
| Data source location | Institution: Institute of Marine Biotechnology, Universiti Malaysia Terengganu |
| Data accessibility | Genomes of both species can be found at DDBJ/ENA/GenBank under the accession numbers: |
These data represent the first complete chromosomal genome sequencing of two native microalgae isolated from a mangrove area in a tropical region. They further enrich the currently limited WGS data collections of important microalgae species, aid strain improvement, and support the interests of various biotechnology industries. They will also benefit future work on comparative genome analysis and microalgal adaptation responses.