André M Machado, Ole K Tørresen, Naoki Kabeya, Alvarina Couto, Bent Petersen, Mónica Felício, Paula F Campos, Elza Fonseca, Narcisa Bandarra, Mónica Lopes-Marques, Renato Ferraz, Raquel Ruivo, Miguel M Fonseca, Sissel Jentoft, Óscar Monroig, Rute R da Fonseca, L Filipe C Castro.
Abstract
Clupeiformes, such as sardines and herrings, represent an important share of worldwide fisheries. Among those, the European sardine (Sardina pilchardus, Walbaum 1792) exhibits significant commercial relevance. While the last decade showed a steady and sharp decline in capture levels, recent advances in culture husbandry represent promising research avenues. Yet, the complete absence of genomic resources from sardine imposes a severe bottleneck to understanding its physiological and ecological requirements. We generated 69 Gbp of paired-end reads using Illumina HiSeq X Ten and assembled a draft genome with an N50 scaffold length of 25,579 bp and BUSCO completeness of 82.1% (Actinopterygii). The estimated size of the genome ranges between 655 and 850 Mb. Additionally, we generated a relatively high-level liver transcriptome. To deliver a proof of principle of the value of this dataset, we established the presence and function of enzymes (Elovl2, Elovl5, and Fads2) that have pivotal roles in the biosynthesis of long chain polyunsaturated fatty acids, essential nutrients particularly abundant in oily fish such as sardines. Our study provides the first omics dataset from a valuable economic marine teleost species, the European sardine, representing an essential resource for its effective conservation, management, and sustainable exploitation.
Keywords: European sardine; comparative genomics; draft genome; long chain polyunsaturated fatty acids; teleosts
Year: 2018 PMID: 30304855 PMCID: PMC6210256 DOI: 10.3390/genes9100485
Source DB: PubMed Journal: Genes (Basel) ISSN: 2073-4425 Impact factor: 4.096
Figure 1. Photograph of a specimen of European sardine, Sardina pilchardus (photograph credits to Mónica Felício and André M. Machado).
Summary of genome and liver transcriptome statistics of the European sardine, Sardina pilchardus.
| Features | Genome # | Liver Transcriptome # |
|---|---|---|
| Raw sequencing reads | 456,775,568 | 122,806,922 |
| Clean reads | 412,914,751 | 111,524,231 |
| Number of contigs | 90,290 | 245,053 |
| Total contig size, Mb | 640.1 | 278.5 |
| Contig N50 size, bp | 10,878 | 1,760 |
| Longest contig, bp | 87,474 | 15,773 |
| GC/AT/N, % | 44.45 | 48.10 |
| Number of scaffolds | 45,321 | - |
| Total scaffold size, Mb | 641.5 | - |
| Scaffold N50 size, bp | 25,577 | - |
| Longest scaffold, bp | 285,113 | - |
| Genome coverage, × | 59 | - |
| BUSCO complete, % (Met/Ver/Actino) | 82.7/70.5/68.8 | 99.1/80.6/72.0 |
| BUSCO complete and single copy, % (Met/Ver/Actino) | 78.8/68.4/66.3 | 41.5/31.2/29.1 |
| BUSCO complete and duplicated, % (Met/Ver/Actino) | 3.9/2.1/2.5 | 57.6/49.4/42.9 |
| BUSCO fragmented, % (Met/Ver/Actino) | 9.2/19.0/13.3 | 0.6/10.5/8.6 |
| BUSCO missing, % (Met/Ver/Actino) | 8.1/10.5/17.9 | 0.3/8.9/19.4 |
| Total BUSCO found, % (Met/Ver/Actino) | 91.9/89.5/82.1 | 99.7/91.1/80.6 |
| Number of protein-coding genes | 29,701 | - |
| Number of functionally annotated proteins | 28,783 | - |
| Average CDS length, bp | 1,561.42 | - |
| Longest CDS, bp | 49,643 | - |
| Average protein length, aa | 373.45 | - |
| Longest protein, aa | 16,525 | - |
| Average number of exons per gene | 6.59 | - |
# All statistics are based on contigs/scaffolds of size ≥200 bp. Met: From a total of 978 genes of Metazoa library profile; Ver: From a total of 2586 genes of Vertebrata library profile; Actino: From a total of 4584 genes of Actinopterygii library profile.
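The BUSCO rows of the table follow simple arithmetic relations, which the short sketch below checks. It assumes the three values in each cell are ordered Met/Ver/Actino, as in the footnote, and that the usual BUSCO bookkeeping applies: complete = single copy + duplicated, total found = complete + fragmented, and found + missing = 100%. The dictionary layout and the helper name `check_busco` are hypothetical, introduced only for illustration; they are not part of the original study.

```python
# BUSCO percentages copied from the table, ordered Met / Ver / Actino.
BUSCO = {
    "genome": {
        "complete": (82.7, 70.5, 68.8),
        "single": (78.8, 68.4, 66.3),
        "duplicated": (3.9, 2.1, 2.5),
        "fragmented": (9.2, 19.0, 13.3),
        "missing": (8.1, 10.5, 17.9),
        "found": (91.9, 89.5, 82.1),
    },
    "liver transcriptome": {
        "complete": (99.1, 80.6, 72.0),
        "single": (41.5, 31.2, 29.1),
        "duplicated": (57.6, 49.4, 42.9),
        "fragmented": (0.6, 10.5, 8.6),
        "missing": (0.3, 8.9, 19.4),
        "found": (99.7, 91.1, 80.6),
    },
}

LIBRARIES = ("Met", "Ver", "Actino")


def check_busco(rows, tol=0.15):
    """Return a list of (relation, library) pairs that do not hold.

    `tol` allows for rounding of the published values to one decimal place.
    """
    problems = []
    for i, lib in enumerate(LIBRARIES):
        complete = rows["single"][i] + rows["duplicated"][i]
        found = rows["complete"][i] + rows["fragmented"][i]
        total = rows["found"][i] + rows["missing"][i]
        if abs(complete - rows["complete"][i]) > tol:
            problems.append(("complete = single + duplicated", lib))
        if abs(found - rows["found"][i]) > tol:
            problems.append(("found = complete + fragmented", lib))
        if abs(total - 100.0) > tol:
            problems.append(("found + missing = 100", lib))
    return problems


for dataset, rows in BUSCO.items():
    issues = check_busco(rows)
    print(dataset, "consistent" if not issues else issues)
```

Under these assumptions, the 82.1% BUSCO figure quoted in the abstract corresponds to the "Total BUSCO found" value for the Actinopterygii profile (complete plus fragmented), not to the "Complete" value alone.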
Figure 2. Genome evolution and phylogenomics. (A) Orthologous gene families across four fish genomes (European sardine, zebrafish, herring and blind cave fish). (B) Phylogeny of vertebrates (lamprey as the outgroup species); numbers at nodes represent bootstrap values.
Figure 3. Maximum likelihood phylogenetic analysis of fads2 (A) and elovl orthologues (B) analyzed in the present study; Clupeiformes species are highlighted and node numbers indicate bootstrap values. (C) Reconstructed genomic loci of fads2, elovl2, and elovl5 denote synteny conservation between the European sardine and Atlantic herring; scaffold coordinates and identified neighbouring genes are indicated, and broken lines and arrows denote reconstruction from overlapping scaffolds. (D) LC-PUFA biosynthesis pathway in the European sardine; the dashed line indicates the Δ5 desaturation capacity absent in the European sardine, and n-3 fatty acids are indicated in yellow. ALA: α-linolenic acid (18:3n-3); EPA: eicosapentaenoic acid (20:5n-3); DHA: docosahexaenoic acid (22:6n-3).