Genome of Tenualosa ilisha from the river Padma, Bangladesh.

Avizit Das1, Peter Ianakiev2, Abdul Baten3,4, Rifath Nehleen1, Tasneem Ehsan1, Oly Ahmed1, Mohammad Riazul Islam1, M Niamul Naser5, Mong Sano Marma6, Haseena Khan7.   

Abstract

OBJECTIVE: Hilsa shad (Tenualosa ilisha) is a popular fish of Bangladesh belonging to the family Clupeidae. Like salmon and many other migratory fish, it is an anadromous species that lives in the sea and travels to freshwater rivers to spawn. Over its lifetime, Tenualosa ilisha migrates from sea to freshwater and back. DATA DESCRIPTION: The genome of Tenualosa ilisha collected from the river Padma at Rajshahi, Bangladesh, has been sequenced, and its de novo hybrid assembly and structural annotation are reported here. Illumina and PacBio sequencing platforms were used for high-depth sequencing, and the draft genome assembly was found to be 816 Mb with an N50 of 188 kb. The MAKER gene annotation tool predicted 31,254 gene models. Benchmarking Universal Single-Copy Orthologs (BUSCO) analysis indicates 95% completeness of the assembled genome.

Keywords:  Clupeidae; Hilsa shad; NGS platform; Tenualosa ilisha; Whole genome sequence

Year:  2018        PMID: 30577879      PMCID: PMC6303923          DOI: 10.1186/s13104-018-4028-8

Source DB:  PubMed          Journal:  BMC Res Notes        ISSN: 1756-0500


Objective

Hilsa shad, known as ilish in Bangladesh, is popular for its taste and the texture of its flesh. It is a shad of the family Clupeidae. In addition to the Bay of Bengal and riverine Bangladesh (the Padma, Jamuna, Meghna, and other coastal rivers), the fish is also found in the Persian Gulf, Mediterranean Sea, Arabian Sea and China Sea [1]. Fisheries, a part of Bangladesh’s cultural heritage, have played an important role in its socioeconomic development through protein supply, employment generation and foreign currency earnings. According to the FAO, in 2018 Bangladesh ranked 3rd in the world in inland fish production. Hilsa (Tenualosa ilisha) is the most popular of the 650 or so marine and inland fish species found in Bangladesh. It contributes 11% of total fish production, 1% of national GDP and 3.00% of total export earnings, and about 2.5 million people in Bangladesh depend directly on Hilsa to provide for their families [2, 3]. At present, more than 60% of the global Hilsa catch is reported from Bangladesh, 20–25% from Myanmar, 15–20% from India and 5–10% from other countries (e.g., Iraq, Kuwait, Malaysia, Thailand and Pakistan). Recent Hilsa production in Bangladesh is about half a million metric tons [4]. Despite this importance, molecular genomic information on Hilsa is still lacking. The significance of such data for improving the sustainability and maintaining the diversity of this fish therefore cannot be overemphasized.

Data description

Fresh Tenualosa ilisha samples were collected from the river Padma at Rajshahi and immediately preserved on dry ice. White and red muscle tissue was used for DNA extraction, following a modified SDS (sodium dodecyl sulfate) method [5] optimized in our lab (detailed methodology in Data file 1, Table 1).
Table 1. Overview of data files/data sets

Label | Name of data file/data set | File types (file extension) | Data repository and identifier (DOI or accession number)
Data file 1 | DNA isolation and library preparation methodology | .docs file | https://figshare.com/s/467b8b670149f1a0617c
Data file 2 | Whole genome assembly data | FASTA | NCBI GenBank (accession number: GCA_003651195.1) (http://identifiers.org/ncbi/insdc.gca:GCA_003651195.1.)
Data file 3 | Whole genome sequence | FASTA | NCBI GenBank (accession numbers: QYSC01000001–QYSC01124209) (http://identifiers.org/ncbi/insdc:QYSC00000000.)
Data file 4 | Annotation data file | .tsv | https://figshare.com/s/270b54d9d076ef5e5901
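As a quick consistency check on Table 1, the accession range listed for Data file 3 can be converted into a record count. The sketch below is illustrative only: the helper name is hypothetical, and it assumes the usual layout of such accessions (a letter prefix, a two-digit version, then a running number).

```python
import re

# Assumed accession layout: letters + two-digit version + running number,
# e.g. QYSC01000001. This layout is an assumption, not stated in the source.
ACCESSION = re.compile(r"^([A-Z]+\d{2})(\d+)$")


def accession_range_size(first: str, last: str) -> int:
    """Count the records in an inclusive accession range."""
    m_first = ACCESSION.match(first)
    m_last = ACCESSION.match(last)
    if not m_first or not m_last:
        raise ValueError("unrecognised accession format")
    if m_first.group(1) != m_last.group(1):
        raise ValueError("accessions do not share the same prefix")
    return int(m_last.group(2)) - int(m_first.group(2)) + 1


# Range taken from Table 1 (Data file 3).
n_records = accession_range_size("QYSC01000001", "QYSC01124209")
print(n_records)
```

The resulting count can be compared with the number of scaffolds reported for the hybrid assembly.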
A paired-end library with an insert size of around 300 bp was constructed for Illumina sequencing using the NEB NEBNext Ultra II DNA kit (detailed methodology in Data file 1, Table 1). Genomic DNA was sequenced on the Illumina HiSeq 4000 and Pacific Biosciences Sequel single-molecule real-time (SMRT) sequencing platforms. Read quality was checked using FastQC [6]. MaSuRCA (Maryland Super-Read Celera Assembler) ver 3.2.6 was used for hybrid de novo assembly [7] of the combined Illumina and PacBio data. The genome assembly data have been deposited in NCBI GenBank under accession number GCA_003651195.1 (Data file 2; Table 1).

Illumina-only data produced a fragmented assembly with 91% BUSCO [8] completeness. Adding 15.7 Gbp of PacBio data significantly improved the quality and contiguity of the genome: compared with the Illumina-only assembly, N50 improved from 13 kb (kilobase pairs) to 188 kb, and the total number of scaffolds fell from 475,121 to 124,209. The assembled genome of Tenualosa ilisha Padma Bangladesh is now 816 Mb (megabase pairs), and approximately 82% of the genome has been assembled. The 95% BUSCO completeness, together with the markedly lower scaffold count and higher N50, indicates that the assembly is of high quality. The genome sequence data have been deposited in NCBI GenBank under accession numbers QYSC01000001-QYSC01124209 (Data file 3; Table 1).

The MAKER ver 3.0 pipeline [9] was used for structural annotation. The GC content of the genome was determined to be 43.61%. RepeatMasker and RepeatModeler, using the latest version of the Repbase database [10-12], identified 27.27% repeat elements. Altogether, 31,254 gene models were predicted with the MAKER pipeline, based on both de novo and reference-based predictions using genes/proteins from other fish species (Atlantic herring, carp, salmon, zebrafish).
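For readers unfamiliar with the N50 and GC content figures quoted above, the following minimal sketch shows how such statistics are commonly computed from scaffold lengths and sequences. It is not the authors' pipeline: the function names and toy inputs are illustrative, and excluding ambiguous bases (N) from the GC calculation is an assumption.

```python
def n50(lengths):
    """Return the N50: the length L such that sequences of length >= L
    together contain at least half of all assembled bases."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    raise ValueError("no sequence lengths given")


def gc_percent(seq):
    """Percentage of G and C among unambiguous bases (A, C, G, T).
    Ignoring N and other ambiguity codes is an assumption of this sketch."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    if gc + at == 0:
        raise ValueError("sequence has no unambiguous bases")
    return 100.0 * gc / (gc + at)


# Toy data, not taken from the Hilsa assembly.
toy_lengths = [100, 80, 50, 30, 20, 10, 10]  # total 300, half is 150
print(n50(toy_lengths))          # 100 + 80 reaches 150, so N50 is 80
print(gc_percent("GGCCATNN"))    # 4 GC out of 6 unambiguous bases
```

A higher N50 with fewer scaffolds, as reported for the hybrid assembly, means that more of the genome sits in long contiguous pieces.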
Of the 31,254 genes, 24,648 were annotated using InterProScan [13], and 16,078 were assigned at least one GO (Gene Ontology) term (Data file 4, Table 1). The Hilsa genome was found to be comparable to that of the Atlantic herring (807 Mb genome and 28,335 genes) [14] and to that of the common carp (1.8 Gb and 52,000 genes) [15].
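The counts reported in this section can be turned into proportions with simple arithmetic. The sketch below uses only figures stated in the text; the implied total genome size is a back-of-the-envelope derivation from the "approximately 82%" statement, not a value reported by the authors.

```python
# Figures as reported in the Data description.
TOTAL_GENE_MODELS = 31_254
INTERPRO_ANNOTATED = 24_648
WITH_GO_TERM = 16_078
ASSEMBLY_MB = 816           # assembled genome size, Mb
ASSEMBLED_FRACTION = 0.82   # "approximately 82% of the genome"


def percent(part, whole):
    """Share of `part` in `whole`, as a percentage."""
    return 100.0 * part / whole


interpro_pct = percent(INTERPRO_ANNOTATED, TOTAL_GENE_MODELS)
go_pct = percent(WITH_GO_TERM, TOTAL_GENE_MODELS)

# Derived here, not reported in the source: total size implied by 82%.
implied_genome_mb = ASSEMBLY_MB / ASSEMBLED_FRACTION

print(f"InterProScan-annotated gene models: {interpro_pct:.1f}%")
print(f"Gene models with at least one GO term: {go_pct:.1f}%")
print(f"Implied total genome size: about {implied_genome_mb:.0f} Mb")
```

On these numbers, roughly four in five gene models carry an InterProScan annotation and about half carry a GO term.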

Limitations

The assembly contains 4605 unassembled regions (gaps), which together span 2,268,925 bp.
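Gap statistics of this kind are typically obtained by counting runs of N in the assembled sequences. The sketch below illustrates the idea on toy sequences and then derives the mean gap length from the reported totals. It assumes gaps are encoded as runs of N, which the source does not state, and the function name is illustrative.

```python
import re

# Assumption: each maximal run of N (either case) marks one gap.
GAP_RUN = re.compile(r"[Nn]+")


def gap_summary(sequences):
    """Return (number of gap regions, total gap bases) across sequences."""
    n_gaps = 0
    gap_bases = 0
    for seq in sequences:
        for run in GAP_RUN.finditer(seq):
            n_gaps += 1
            gap_bases += run.end() - run.start()
    return n_gaps, gap_bases


# Toy sequences, not taken from the Hilsa assembly.
toy = ["ACGTNNNNACGT", "NNACGTNNNACG", "ACGT"]
print(gap_summary(toy))  # three runs of N covering 4 + 2 + 3 bases

# Totals reported in the Limitations section.
REPORTED_GAPS = 4605
REPORTED_GAP_BASES = 2_268_925
ASSEMBLY_BP = 816_000_000  # 816 Mb, as reported

mean_gap_bp = REPORTED_GAP_BASES / REPORTED_GAPS
gap_share_pct = 100.0 * REPORTED_GAP_BASES / ASSEMBLY_BP
print(f"Mean gap length: about {mean_gap_bp:.0f} bp")
print(f"Gap share of the assembly: about {gap_share_pct:.2f}%")
```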

1.  Repbase Update, a database of eukaryotic repetitive elements.

Authors:  J Jurka; V V Kapitonov; A Pavlicek; P Klonowski; O Kohany; J Walichiewicz
Journal:  Cytogenet Genome Res       Date:  2005       Impact factor: 1.636

2.  MAKER: an easy-to-use annotation pipeline designed for emerging model organism genomes.

Authors:  Brandi L Cantarel; Ian Korf; Sofia M C Robb; Genis Parra; Eric Ross; Barry Moore; Carson Holt; Alejandro Sánchez Alvarado; Mark Yandell
Journal:  Genome Res       Date:  2007-11-19       Impact factor: 9.043

3.  A non-invasive technique for rapid extraction of DNA from fish scales.

Authors:  Ravindra Kumar; Poonam Jayant Singh; N S Nagpure; Basdeo Kushwaha; S K Srivastava; W S Lakra
Journal:  Indian J Exp Biol       Date:  2007-11       Impact factor: 0.818

4.  The MaSuRCA genome assembler.

Authors:  Aleksey V Zimin; Guillaume Marçais; Daniela Puiu; Michael Roberts; Steven L Salzberg; James A Yorke
Journal:  Bioinformatics       Date:  2013-08-29       Impact factor: 6.937

5.  BUSCO: assessing genome assembly and annotation completeness with single-copy orthologs.

Authors:  Felipe A Simão; Robert M Waterhouse; Panagiotis Ioannidis; Evgenia V Kriventseva; Evgeny M Zdobnov
Journal:  Bioinformatics       Date:  2015-06-09       Impact factor: 6.937

6.  Genome sequence and genetic diversity of the common carp, Cyprinus carpio.

Authors:  Peng Xu; Xiaofeng Zhang; Xumin Wang; Jiongtang Li; Guiming Liu; Youyi Kuang; Jian Xu; Xianhu Zheng; Lufeng Ren; Guoliang Wang; Yan Zhang; Linhe Huo; Zixia Zhao; Dingchen Cao; Cuiyun Lu; Chao Li; Yi Zhou; Zhanjiang Liu; Zhonghua Fan; Guangle Shan; Xingang Li; Shuangxiu Wu; Lipu Song; Guangyuan Hou; Yanliang Jiang; Zsigmond Jeney; Dan Yu; Li Wang; Changjun Shao; Lai Song; Jing Sun; Peifeng Ji; Jian Wang; Qiang Li; Liming Xu; Fanyue Sun; Jianxin Feng; Chenghui Wang; Shaolin Wang; Baosen Wang; Yan Li; Yaping Zhu; Wei Xue; Lan Zhao; Jintu Wang; Ying Gu; Weihua Lv; Kejing Wu; Jingfa Xiao; Jiayan Wu; Zhang Zhang; Jun Yu; Xiaowen Sun
Journal:  Nat Genet       Date:  2014-09-21       Impact factor: 38.330

7.  InterProScan: protein domains identifier.

Authors:  E Quevillon; V Silventoinen; S Pillai; N Harte; N Mulder; R Apweiler; R Lopez
Journal:  Nucleic Acids Res       Date:  2005-07-01       Impact factor: 16.971

8.  The genetic basis for ecological adaptation of the Atlantic herring revealed by genome sequencing.

Authors:  Alvaro Martinez Barrio; Sangeet Lamichhaney; Guangyi Fan; Nima Rafati; Mats Pettersson; He Zhang; Jacques Dainat; Diana Ekman; Marc Höppner; Patric Jern; Marcel Martin; Björn Nystedt; Xin Liu; Wenbin Chen; Xinming Liang; Chengcheng Shi; Yuanyuan Fu; Kailong Ma; Xiao Zhan; Chungang Feng; Ulla Gustafson; Carl-Johan Rubin; Markus Sällman Almén; Martina Blass; Michele Casini; Arild Folkvord; Linda Laikre; Nils Ryman; Simon Ming-Yuen Lee; Xun Xu; Leif Andersson
Journal:  Elife       Date:  2016-05-03       Impact factor: 8.140

