Sujay Paul, Erika Salavarría, Katherine García, Alonso Reyes-Calderón, Patricia Gil-Kodaka, Ilanit Samolski, Aashish Srivastava, Anindya Bandyopadhyay, Gretty K Villena.
Abstract
Kelps, or brown algae, are a diverse group of marine macroalgae that play an important role in aquatic ecosystems and generally have high commercial value. To facilitate brown algal studies, we report the complete genome sequence of the largest kelp, Macrocystis pyrifera. The whole genome is ∼428 Mb in size, comprises 44,307 scaffolds with an average GC content of 47%, and is predicted to contain a total of 24,778 genes. 18S sequence-based phylogenetic analysis revealed that the littoral brown seaweed Scytosiphon lomentaria is the closest relative of M. pyrifera. Numerous genes identified in this dataset are involved in genetic information processing, signaling and cellular processes, carbohydrate metabolism, and terpenoid biosynthesis.
Keywords: Brown alga; Illumina; Macrocystis pyrifera; Nanopore; Valuable bioproducts; Whole genome sequencing
Year: 2022 PMID: 35356318 PMCID: PMC8958530 DOI: 10.1016/j.dib.2022.108068
Source DB: PubMed Journal: Data Brief ISSN: 2352-3409
General features of the M. pyrifera genome.
| Genome size (bp) | 427,916,191 |
| DNA coding (bp) | 3,868,356 |
| GC content (%) | 47.36 |
| Total assembled length (bp) | 427,916,191 |
| Number of DNA scaffolds | 44,307 |
| N50 scaffold size (bp) | 42,778 (42.8 kb) |
| Number of total genes | 24,778 |
| Number of total annotated proteins | 20,026 |
| Genes with Pfam domains | 9,705 |
| Genes with signal peptides | 521 |
| Repeated sequences (bp and % of genome) | 160,114,011 bp (37.42%) |
| Average gene length (bp) | 5,949 |
| Average coding sequence length (bp) | 978 |
| Average intron length (bp) | 1,514 |
| Average protein length (aa) | 326 |
| BUSCO completeness (%) | 49.9 |
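Several values in the table above are derived quantities, and the following is a minimal Python sketch of how such statistics are conventionally computed. It is not the authors' pipeline: the function names (`n50`, `gc_percent`, `percent`) and the toy scaffold lengths are assumptions made for illustration, and only the repeat and genome sizes are taken from the table.

```python
def n50(lengths):
    """Return the N50 of a list of scaffold lengths.

    N50 is the length L such that scaffolds of length >= L together
    cover at least half of the total assembled length.
    """
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    raise ValueError("no scaffold lengths given")


def gc_percent(sequences):
    """Return GC content (%) over all unambiguous bases (A, C, G, T)."""
    gc = 0
    acgt = 0
    for seq in sequences:
        seq = seq.upper()
        gc += seq.count("G") + seq.count("C")
        acgt += sum(seq.count(base) for base in "ACGT")
    return 100.0 * gc / acgt


def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


# Toy scaffold lengths (hypothetical, not from the assembly): total 300, half 150.
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
toy_n50 = n50(toy_lengths)

# Repeat fraction recomputed from the two table values.
repeat_pct = percent(160_114_011, 427_916_191)
```

With the toy lengths, the two longest scaffolds (80 + 70) already reach half of the total, so the N50 is 70. Recomputing the repeat fraction from the table values reproduces the reported 37.42% after rounding to two decimals.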
Fig. 1. (a) BUSCO evaluation of the completeness of the M. pyrifera genome. (b) 18S sequence-based phylogenetic analysis identified the littoral brown seaweed Scytosiphon lomentaria as the closest relative of M. pyrifera. The phylogenetic tree was constructed with MEGA-X v10.0.5 using the maximum likelihood method. Bootstrap analysis (1000 replicates) was performed to validate the nodes.
Fig. 2. (a) Protein-level comparative analysis of M. pyrifera against the related species Ectocarpus siliculosus, Nemacystus decipiens, and Cladosiphon okamuranus, using the OrthoVenn tool. (b) GO enrichment analysis of annotated proteins from M. pyrifera.
| Subject | Genomics |
| Specific subject area | Algal Genomics |
| Type of data | Tables, Figures, Charts |
| How the data were acquired | Illumina HiSeq 4000 (paired-end) and Nanopore GridION X5 |
| Data format | Raw, filtered, analyzed |
| Description of data collection | Genomic DNA was extracted and purified from apical frond tissue samples of Macrocystis pyrifera using the GeneJET Plant Genomic DNA Purification Kit (Thermo Scientific, USA) and sequenced on both the Illumina HiSeq 4000 (paired-end) and Nanopore GridION platforms. The short-read (Illumina) and long-read (Nanopore) data were demultiplexed using bcl2fastq (Illumina) and guppy (Oxford Nanopore Technologies), respectively. |
| Data source location | Punta San Juanito, Ica, Peru (Latitude 15°15′11.3′′S, Longitude 75°13′32.4′′W) |
| Data accessibility Repository name | The nucleotide sequences of raw reads and assembled draft genome are available at NCBI's Sequence Read Archive as BioProject PRJNA605694 ( |