The draft genome of Labeo catla.

Lakshman Sahoo1, Paramananda Das2, Bismay Sahoo1, Gargee Das1, Prem Kumar Meher1, Uday Kumar Udit1, Kanta Das Mahapatra1, Jitendra Kumar Sundaray1.   

Abstract

OBJECTIVE: Labeo catla (catla), one of the three Indian major carps, is native to the Indo-Gangetic riverine system of India as well as the rivers of Pakistan, Bangladesh, Nepal and Myanmar. Its comparatively high growth rate, compatibility with other major carps, surface-feeding habit and consumer preference have made it popular in carp polyculture systems among fish farmers in the Indian subcontinent. Recent advances in massively parallel sequencing have made it possible to accelerate genetic improvement of aquaculture species by integrating genomics tools. A draft genome and allied resources have so far been lacking for catla. In the present study, we therefore performed the first de novo genome assembly of Labeo catla. DATA DESCRIPTION: High molecular weight genomic DNA was extracted from a farm-reared male catla and sequenced on the Oxford Nanopore and Illumina platforms. Approximately 80× coverage of sequence data was assembled using a hybrid assembly strategy. The assembled catla genome is 1.01 Gb in size and comprises 5,345 scaffolds with an N50 of 0.7 Mb and more than 92% BUSCO completeness. Gene annotation yielded 25,812 predicted genes.

Keywords:  Genomics resource; Hybrid assembly; Indian major carp; Labeo catla

Year:  2020        PMID: 32883365      PMCID: PMC7469402          DOI: 10.1186/s13104-020-05240-w

Source DB:  PubMed          Journal:  BMC Res Notes        ISSN: 1756-0500


Objective

Aquaculture is a rapidly growing food production sector worldwide and is expected to become the primary source of fish and shellfish for human consumption [1]. Genetic improvement of performance traits has considerable potential to help meet the rising demand for quality animal protein as the human population grows. Well-designed breeding programmes integrated with genomics tools can accelerate gains in production and productivity. Recent advances in massively parallel sequencing have paved the way for expediting genetic improvement programmes in aquaculture species. Labeo catla (catla), one of the Indian major carps, is native to the Indo-Gangetic riverine system of India as well as the rivers of Pakistan, Bangladesh, Nepal and Myanmar. Its comparatively high growth rate, compatibility with other major carps, surface-feeding habit and consumer preference have made it popular in carp polyculture systems among fish farmers in India, Bangladesh, Myanmar, Laos, Pakistan and Thailand [2]. L. catla currently accounts for ~3.4% of total freshwater aquaculture production worldwide [3]. To generate a consolidated genomics resource in support of genetic improvement, we undertook the first de novo genome assembly of catla. The draft genome will also be an important resource for comparative genomics and for biological and evolutionary studies of cyprinid species.

Data description

One farm-reared, mature (2-year-old) male catla weighing approximately 1.7 kg was collected from the ICAR-Central Institute of Freshwater Aquaculture (CIFA) farm for this study. Before tissue sampling, the fish was anesthetized with MS-222 (300 mg/l) and then weighed. High molecular weight genomic DNA was isolated from testis tissue using the standard phenol–chloroform method [4]. The quality and quantity of the DNA were assessed with a NanoDrop spectrophotometer (Thermo Fisher Scientific, Wilmington, DE, USA) and a Qubit fluorometer (Invitrogen, Carlsbad, CA, USA), followed by electrophoresis on a 0.8% agarose gel. Genomic DNA was sheared using a Covaris S2 sonicator (Covaris, Woburn, MA, USA) to generate fragments in the range of 200 bp to 20 kb. Four paired-end libraries (insert size: 350 bp) for the Illumina NextSeq 500 platform and one library (mixed insert size) for Oxford Nanopore were prepared and sequenced according to the manufacturers' instructions. A total of 80.28 Gb of sequence data (Table 1, Data file 1) [5] was generated after quality checking with FastQC [6]. The de novo hybrid assembly was performed with default parameters using MaSuRCA 3.2.8 [7], followed by scaffolding with SSPACE v3.0 [8] and gap closing with GapCloser v1.12b [9]. This yielded 5,345 scaffolds with an N50 of 0.7 Mb (Table 1, Data file 2) [10] and a largest scaffold of 6.8 Mb. The assembled genome size of catla is 1.01 Gb (Table 1, Data file 3) [11], against an in silico estimated genome size of 0.95 Gb. Evaluation of the genome with Benchmarking Universal Single-Copy Orthologs (BUSCO) version 3.0 [12], using the Actinopterygii odb9 core gene set, revealed 92% complete (87.9% complete and single-copy, 4.1% complete and duplicated), 4.1% fragmented and 4.05% missing BUSCOs. RepeatModeler [13] was used for de novo repeat modelling, which indicated a repeat content of 47.58% in the catla genome. A total of 391,331 simple sequence repeats were identified genome-wide in the assembled catla genome.
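
The assembly statistics reported above (N50 and fold coverage) follow standard definitions, which can be sketched as below. This is a minimal illustration rather than the pipeline used in the study: the function names `n50` and `fold_coverage` and the toy scaffold lengths are assumptions introduced here, and only the 80.28 Gb, 1.01 Gb and 0.95 Gb figures come from the text.

```python
def n50(lengths):
    """Return the N50 of a set of scaffold lengths.

    N50 is the length L such that scaffolds of length >= L together
    contain at least half of all assembled bases.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Compare 2 * running with total to avoid fractional halves.
        if 2 * running >= total:
            return length


def fold_coverage(sequenced_bases, genome_size):
    """Average sequencing depth: total sequenced bases / genome size."""
    return sequenced_bases / genome_size


# Toy scaffold lengths (hypothetical, not from the catla assembly).
toy_lengths = [80, 70, 50, 40, 30, 20]
print("toy N50:", n50(toy_lengths))

# Figures taken from the text: 80.28 Gb of reads, 1.01 Gb assembly,
# 0.95 Gb in silico genome size estimate.
print("depth vs assembly size:", round(fold_coverage(80.28e9, 1.01e9), 1))
print("depth vs estimated size:", round(fold_coverage(80.28e9, 0.95e9), 1))
```

Dividing the sequenced bases by the assembled size gives a depth of about 79.5×, consistent with the approximately 80× coverage stated in the abstract; using the smaller in silico estimate gives a somewhat higher figure.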

Table 1

Overview of data files/data sets

Label | Name of data file/data set | File types (file extension) | Data repository and identifier (DOI or accession number)
Data file 1 | Sequence data | Table 1.docx | https://doi.org/10.6084/m9.figshare.12271589 [5]
Data file 2 | Assembly statistics | Table 2.docx | https://doi.org/10.6084/m9.figshare.12271619 [10]
Data file 3 | Assembly data | FASTA | https://www.ncbi.nlm.nih.gov/assembly/GCA_012976165.1 [11]
Data file 4 | Whole genome sequence data | FASTA | NCBI GenBank (Accession numbers VONZ01000001-VONZ01005345) https://identifiers.org/ncbi/insdc:VONZ00000000 [19]

The catla genome is predicted to contain 25,812 protein-coding genes. In addition, scaffold_2219 (16,600 bp) was found to be of mitochondrial origin, encoding 13 mRNAs, 22 tRNAs and 2 rRNAs. Functional annotation of the final set of predicted protein sequences was carried out with BLAST2GO v5.0. Of the 25,812 genes, 17,500 were assigned GO terms. The number of protein-coding genes identified in catla (25,812) is comparable to that of other sequenced diploid cyprinid genomes such as Labeo rohita [14], Ctenopharyngodon idellus [15], Danio rerio [16] and Anabarilius grahami [17]. Orthology analysis across these species with OrthoVenn [18] identified 8,494 orthologous gene clusters shared by all five species and 1,357 species-specific gene clusters. The whole genome sequence data have been deposited in GenBank (Table 1, Data file 4) [19].
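
The distinction between clusters shared by all five species and species-specific clusters can be illustrated with a small tally. The sketch below is not OrthoVenn's clustering method; it assumes a hypothetical input in which each cluster id maps to the set of species contributing at least one gene, and it only shows how the two counts are defined. The species labels, cluster ids and the function name `summarize_clusters` are illustrative assumptions.

```python
from collections import Counter

# The five cyprinids compared in the text (labels are illustrative).
SPECIES = frozenset({"L. catla", "L. rohita", "C. idellus", "D. rerio", "A. grahami"})


def summarize_clusters(clusters, species=SPECIES):
    """Count clusters shared by every species and clusters unique to one.

    `clusters` maps a cluster id to the set of species that have at
    least one member gene in that cluster (a hypothetical format).
    """
    shared_by_all = 0
    species_specific = Counter()
    for members in clusters.values():
        present = set(members) & species
        if present == species:
            shared_by_all += 1
        elif len(present) == 1:
            species_specific[next(iter(present))] += 1
    return shared_by_all, species_specific


# Toy, made-up clusters (not data from the study).
toy_clusters = {
    "c1": set(SPECIES),
    "c2": set(SPECIES),
    "c3": {"L. catla"},
    "c4": {"L. catla", "L. rohita"},
    "c5": {"D. rerio"},
}
shared, specific = summarize_clusters(toy_clusters)
print(shared, dict(specific))
```

Clusters present in some but not all species (such as `c4` in the toy data) fall into neither count, which is why the two reported totals do not sum to the full number of clusters.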

Limitations

The assembled genome of Labeo catla is 1.01 Gb in size and comprises 5,345 scaffolds. The assembly still contains 649 gaps (unassembled regions), which together span 0.8 Mb.
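
Gap counts of this kind are conventionally obtained by scanning scaffold sequences for runs of the ambiguity base N. The following is a minimal sketch under that assumption; the helper name `gap_stats` and the toy sequences are hypothetical, and the study does not state which tool produced its gap figures.

```python
import re

# A gap is taken here to be any run of one or more N/n characters.
GAP_RUN = re.compile(r"[Nn]+")


def gap_stats(sequences):
    """Return (number of gaps, total gap bases) over an iterable of sequences."""
    n_gaps = 0
    gap_bases = 0
    for seq in sequences:
        for match in GAP_RUN.finditer(seq):
            n_gaps += 1
            gap_bases += match.end() - match.start()
    return n_gaps, gap_bases


# Toy scaffolds (made up for illustration).
toy_scaffolds = ["ACGTNNNNACGT", "NNACGTNACGT", "ACGT"]
print(gap_stats(toy_scaffolds))
```

For scale, the reported figures imply a mean gap length of roughly 1.2 kb (0.8 Mb divided by 649 gaps).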
  9 in total

1.  Scaffolding pre-assembled contigs using SSPACE.

Authors:  Marten Boetzer; Christiaan V Henkel; Hans J Jansen; Derek Butler; Walter Pirovano
Journal:  Bioinformatics       Date:  2010-12-12       Impact factor: 6.937

2.  The MaSuRCA genome assembler.

Authors:  Aleksey V Zimin; Guillaume Marçais; Daniela Puiu; Michael Roberts; Steven L Salzberg; James A Yorke
Journal:  Bioinformatics       Date:  2013-08-29       Impact factor: 6.937

3.  The draft genome of the grass carp (Ctenopharyngodon idellus) provides insights into its evolution and vegetarian adaptation.

Authors:  Yaping Wang; Ying Lu; Yong Zhang; Zemin Ning; Yan Li; Qiang Zhao; Hengyun Lu; Rong Huang; Xiaoqin Xia; Qi Feng; Xufang Liang; Kunyan Liu; Lei Zhang; Tingting Lu; Tao Huang; Danlin Fan; Qijun Weng; Chuanrang Zhu; Yiqi Lu; Wenjun Li; Ziruo Wen; Congcong Zhou; Qilin Tian; Xiaojun Kang; Mijuan Shi; Wanting Zhang; Songhun Jang; Fukuan Du; Shan He; Lanjie Liao; Yongming Li; Bin Gui; Huihui He; Zhen Ning; Cheng Yang; Libo He; Lifei Luo; Rui Yang; Qiong Luo; Xiaochun Liu; Shuisheng Li; Wen Huang; Ling Xiao; Haoran Lin; Bin Han; Zuoyan Zhu
Journal:  Nat Genet       Date:  2015-05-04       Impact factor: 38.330

4.  BUSCO: assessing genome assembly and annotation completeness with single-copy orthologs.

Authors:  Felipe A Simão; Robert M Waterhouse; Panagiotis Ioannidis; Evgenia V Kriventseva; Evgeny M Zdobnov
Journal:  Bioinformatics       Date:  2015-06-09       Impact factor: 6.937

5.  The zebrafish reference genome sequence and its relationship to the human genome.

Authors:  Kerstin Howe; Matthew D Clark; Carlos F Torroja; James Torrance; Camille Berthelot; Matthieu Muffato; John E Collins; Sean Humphray; Karen McLaren; Lucy Matthews; Stuart McLaren; Ian Sealy; Mario Caccamo; Carol Churcher; Carol Scott; Jeffrey C Barrett; Romke Koch; Gerd-Jörg Rauch; Simon White; William Chow; Britt Kilian; Leonor T Quintais; José A Guerra-Assunção; Yi Zhou; Yong Gu; Jennifer Yen; Jan-Hinnerk Vogel; Tina Eyre; Seth Redmond; Ruby Banerjee; Jianxiang Chi; Beiyuan Fu; Elizabeth Langley; Sean F Maguire; Gavin K Laird; David Lloyd; Emma Kenyon; Sarah Donaldson; Harminder Sehra; Jeff Almeida-King; Jane Loveland; Stephen Trevanion; Matt Jones; Mike Quail; Dave Willey; Adrienne Hunt; John Burton; Sarah Sims; Kirsten McLay; Bob Plumb; Joy Davis; Chris Clee; Karen Oliver; Richard Clark; Clare Riddle; David Elliot; David Eliott; Glen Threadgold; Glenn Harden; Darren Ware; Sharmin Begum; Beverley Mortimore; Beverly Mortimer; Giselle Kerry; Paul Heath; Benjamin Phillimore; Alan Tracey; Nicole Corby; Matthew Dunn; Christopher Johnson; Jonathan Wood; Susan Clark; Sarah Pelan; Guy Griffiths; Michelle Smith; Rebecca Glithero; Philip Howden; Nicholas Barker; Christine Lloyd; Christopher Stevens; Joanna Harley; Karen Holt; Georgios Panagiotidis; Jamieson Lovell; Helen Beasley; Carl Henderson; Daria Gordon; Katherine Auger; Deborah Wright; Joanna Collins; Claire Raisen; Lauren Dyer; Kenric Leung; Lauren Robertson; Kirsty Ambridge; Daniel Leongamornlert; Sarah McGuire; Ruth Gilderthorp; Coline Griffiths; Deepa Manthravadi; Sarah Nichol; Gary Barker; Siobhan Whitehead; Michael Kay; Jacqueline Brown; Clare Murnane; Emma Gray; Matthew Humphries; Neil Sycamore; Darren Barker; David Saunders; Justene Wallis; Anne Babbage; Sian Hammond; Maryam Mashreghi-Mohammadi; Lucy Barr; Sancha Martin; Paul Wray; Andrew Ellington; Nicholas Matthews; Matthew Ellwood; Rebecca Woodmansey; Graham Clark; James D Cooper; James Cooper; Anthony Tromans; Darren Grafham; Carl Skuce; Richard Pandian; Robert Andrews; Elliot Harrison; Andrew Kimberley; Jane Garnett; Nigel Fosker; Rebekah Hall; Patrick Garner; Daniel Kelly; Christine Bird; Sophie Palmer; Ines Gehring; Andrea Berger; Christopher M Dooley; Zübeyde Ersan-Ürün; Cigdem Eser; Horst Geiger; Maria Geisler; Lena Karotki; Anette Kirn; Judith Konantz; Martina Konantz; Martina Oberländer; Silke Rudolph-Geiger; Mathias Teucke; Christa Lanz; Günter Raddatz; Kazutoyo Osoegawa; Baoli Zhu; Amanda Rapp; Sara Widaa; Cordelia Langford; Fengtang Yang; Stephan C Schuster; Nigel P Carter; Jennifer Harrow; Zemin Ning; Javier Herrero; Steve M J Searle; Anton Enright; Robert Geisler; Ronald H A Plasterk; Charles Lee; Monte Westerfield; Pieter J de Jong; Leonard I Zon; John H Postlethwait; Christiane Nüsslein-Volhard; Tim J P Hubbard; Hugues Roest Crollius; Jane Rogers; Derek L Stemple
Journal:  Nature       Date:  2013-04-17       Impact factor: 49.962

6.  OrthoVenn: a web server for genome wide comparison and annotation of orthologous clusters across multiple species.

Authors:  Yi Wang; Devin Coleman-Derr; Guoping Chen; Yong Q Gu
Journal:  Nucleic Acids Res       Date:  2015-05-11       Impact factor: 16.971

7.  Genome Assembly for a Yunnan-Guizhou Plateau "3E" Fish, Anabarilius grahami (Regan), and Its Evolutionary and Genetic Applications.

Authors:  Wansheng Jiang; Ying Qiu; Xiaofu Pan; Yuanwei Zhang; Xiaoai Wang; Yunyun Lv; Chao Bian; Jia Li; Xinxin You; Jieming Chen; Kunfeng Yang; Jinlong Yang; Chao Sun; Qian Liu; Le Cheng; Junxing Yang; Qiong Shi
Journal:  Front Genet       Date:  2018-12-04       Impact factor: 4.599

8.  SOAPdenovo2: an empirically improved memory-efficient short-read de novo assembler.

Authors:  Ruibang Luo; Binghang Liu; Yinlong Xie; Zhenyu Li; Weihua Huang; Jianying Yuan; Guangzhu He; Yanxiang Chen; Qi Pan; Yunjie Liu; Jingbo Tang; Gengxiong Wu; Hao Zhang; Yujian Shi; Yong Liu; Chang Yu; Bo Wang; Yao Lu; Changlei Han; David W Cheung; Siu-Ming Yiu; Shaoliang Peng; Zhu Xiaoqian; Guangming Liu; Xiangke Liao; Yingrui Li; Huanming Yang; Jian Wang; Tak-Wah Lam; Jun Wang
Journal:  Gigascience       Date:  2012-12-27       Impact factor: 6.524

9.  De novo Assembly and Genome-Wide SNP Discovery in Rohu Carp, Labeo rohita.

Authors:  Paramananda Das; Lakshman Sahoo; Sofia P Das; Amrita Bit; Chaitanya G Joshi; Basdeo Kushwaha; Dinesh Kumar; Tejas M Shah; Ankit T Hinsu; Namrata Patel; Siddhi Patnaik; Suyash Agarwal; Manmohan Pandey; Shreya Srivastava; Prem Kumar Meher; Pallipuram Jayasankar; Prakash G Koringa; Naresh S Nagpure; Ravindra Kumar; Mahender Singh; Mir Asif Iquebal; Sarika Jaiswal; Neeraj Kumar; Mustafa Raza; Kanta Das Mahapatra; Joykrushna Jena
Journal:  Front Genet       Date:  2020-04-21       Impact factor: 4.599
