Nematode genome announcement: The draft genome sequence of entomopathogenic nematode Heterorhabditis indica.

Chaitra G Bhat, Vishal S Somvanshi, Roli Budhwar, Jeffrey Godwin, Uma Rao.

Abstract

Heterorhabditis indica is one of the most widely used entomopathogenic nematodes for the biological control of agricultural insect pests worldwide. The draft genome of H. indica was sequenced from three genomic libraries with insert sizes of 300 bp, 600 bp and 5 kb on the Illumina HiSeq platform. The draft genome assembly was 91.26 Mb in size, comprising 3,538 scaffolds. Genome completeness analysis by BUSCO (Benchmarking Universal Single-Copy Orthologs) showed 84% complete and 6.5% fragmented BUSCOs. Further, 10,494 protein-coding genes were predicted. The H. indica draft genome will enable comparative and functional genomic studies in Heterorhabditis nematodes.
© 2021 Authors.

Keywords:  Draft genome sequence; Entomopathogenic nematode; Heterorhabditis indica; Illumina

Year:  2022        PMID: 35174333      PMCID: PMC8784978          DOI: 10.21307/jofnem-2021-101

Source DB:  PubMed          Journal:  J Nematol        ISSN: 0022-300X            Impact factor:   1.402


Entomopathogenic nematodes (EPNs) of the genera Heterorhabditis and Steinernema are used worldwide for the biological control of agricultural insect pests (Lacey and Georgis, 2012; Bhat et al., 2020). These EPNs are also an excellent and genetically tractable model for studying mutualism and parasitism (Campos-Herrera et al., 2012). Twenty-one Heterorhabditis and one hundred Steinernema species have been described from various parts of the world (Bhat et al., 2020). However, their full potential as bio-control agents and as a model system remains under-exploited. Omics resources such as genome and transcriptome data, together with genome interrogation and editing techniques such as RNAi and CRISPR-Cas9, are powerful tools for exploring EPN biology and lay the groundwork for improving their bio-control traits (Lu et al., 2016). Genomic information about EPNs is scant in the public domain. Presently, whole-genome information is available for only one heterorhabditid and seven steinernematid nematodes (Lu et al., 2016; Baniya and DiGennaro, 2021). Here, we present the first draft genome sequence assembly for Heterorhabditis indica, which is widely distributed in warmer and tropical climatic regions and is one of the most commercialized EPNs (Lacey and Georgis, 2012).
The H. indica strain IARI-EPN-Hms1 (Ganguly et al., 2010) was inbred for 20 generations (designated H. indica Hms1-i20) to obtain genetically homogeneous nematodes for genome sequencing. Inbreeding involved self-fertilizing the hermaphrodites for 20 generations by placing a single L4 nematode onto a lawn of Photorhabdus akhurstii on nutrient-agar medium supplemented with cholesterol. Inbred nematodes were lysed using Qiagen buffer G2 (Catalogue no. 19060, Qiagen, USA), and high molecular weight DNA was extracted using the phenol-chloroform method.
DNA concentration and quality were estimated using a Qubit 4.0 Fluorometer (ThermoFisher Scientific, USA) and agarose gel electrophoresis (0.6% agarose gel, 120 min run time at 100 V). DNA was fragmented by sonication using a Bioruptor (Diagenode, Seraing (Ougrée), Belgium). The size distribution was checked by running an aliquot of the fragmented DNA on an Agilent high-sensitivity Bioanalyzer (Agilent Technologies, Santa Clara, CA, USA). Two high-quality libraries with insert sizes of 300 bp and 600 bp were prepared following the Illumina TruSeq DNA sample preparation guide (Illumina, San Diego, CA, USA), and a mate-pair library with an insert size of 5 kb was prepared using the Illumina Nextera Mate Pair Library Prep Kit. Paired-end reads of 2 × 151 bp and mate-pair reads of 2 × 131 bp were generated on the Illumina HiSeq 4000 platform. A total of 22.74 Gb of sequence data comprising ~160 million reads was generated. The reads provided 750X coverage of the H. indica genome. Quality filtering of the raw reads with fastp version 0.22 (Chen et al., 2018) retained 155 million high-quality (HQ) reads. The raw and filtered read statistics for H. indica are presented in Table 1.
The HQ reads were used to generate the primary assembly with the SOAPdenovo assembler Release 1.0 (Li et al., 2010). Contaminating mitochondrial and bacterial sequences (of the Photorhabdus symbiont) were removed from the assembly using BlobTools v1.0.1 (Laetsch and Blaxter, 2017) and the NCBI server. Primary assembly contigs were further scaffolded using SSPACE ‘Standard’ version (Boetzer et al., 2011) to generate the final draft assembly. The final H. indica genome assembly was 91.26 Mb in size, comprising 3,538 scaffolds with an N50 of 587 kb and an average scaffold length of 25.79 kb. The GC content of the assembled genome was 35.31%, and Ns accounted for 7.33% of the assembly. Assembly statistics for H. indica Hms1-i20 are provided in Table 2.
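The N50 quoted above is the scaffold length at which scaffolds of that length or longer together contain at least half of the assembled bases. A minimal Python sketch of that calculation follows; the function name `n50` and the toy lengths are illustrative assumptions and are not taken from the authors' pipeline or from the H. indica assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    cover at least half of the total assembly length.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)  # longest scaffolds first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy scaffold lengths (hypothetical values, not the real assembly).
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_lengths))  # total is 300; 80 + 70 reaches 150, so N50 is 70
```

Because N50 weights scaffolds by the bases they contain, it can be far larger than the mean or median scaffold length when an assembly has a few long scaffolds and many short ones.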
Genome completeness was assessed with Benchmarking Universal Single-Copy Orthologs (BUSCO) v5.2.2 (Seppey et al., 2019) against the 3,131 BUSCOs in the nematoda_odb10 database. The H. indica Hms1-i20 genome contained 2,630 (84%) complete BUSCOs (complete and single-copy: 2,617 (83.6%); complete and duplicated: 13 (0.4%)), whereas 205 (6.5%) BUSCOs were fragmented and 296 (9.5%) were missing. Genome completeness assessed with the Core Eukaryotic Genes Mapping Approach (CEGMA) (Parra et al., 2007) against 248 ultra-conserved core eukaryotic genes was 97.17%. A total of 10,494 protein-coding genes were predicted in H. indica Hms1-i20 by SNAP (Korf, 2004), Augustus (Stanke and Morgenstern, 2005), GeneMark (Besemer and Borodovsky, 2005), and Maker (Cantarel et al., 2008). Functional annotation using NCBI Blastx+ against the RefSeq, SwissProt and UniProt databases annotated 9,596 genes. Identification of orthologous genes in the H. indica Hms1-i20 genome relative to four other nematode genomes by OrthoFinder (Emms and Kelly, 2019) revealed that H. indica shared 2,848 orthologous groups with H. bacteriophora, 4,526 with C. elegans, 4,059 with S. carpocapsae and 3,396 with Oscheius tipulae.
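The BUSCO percentages above follow directly from the category counts divided by the 3,131 BUSCOs searched. The short Python sketch below reproduces that arithmetic from the counts given in the text; the helper name `busco_summary` and its dictionary keys are assumptions made for illustration and are not part of the BUSCO software.

```python
def busco_summary(single, duplicated, fragmented, missing):
    """Summarise BUSCO category counts as percentages of the total searched."""
    total = single + duplicated + fragmented + missing
    complete = single + duplicated

    def pct(count):
        # Percentage of all BUSCOs searched, rounded to one decimal place.
        return round(100 * count / total, 1)

    return {
        "total": total,
        "complete": complete,
        "complete_pct": pct(complete),
        "single_pct": pct(single),
        "duplicated_pct": pct(duplicated),
        "fragmented_pct": pct(fragmented),
        "missing_pct": pct(missing),
    }


# Counts reported in the text for H. indica Hms1-i20 (nematoda_odb10).
summary = busco_summary(single=2617, duplicated=13, fragmented=205, missing=296)
for key, value in summary.items():
    print(f"{key}: {value}")
```

The four categories sum to 3,131, so the complete, fragmented and missing percentages account for the whole lineage dataset.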
Table 1. Read statistics for H. indica Hms1-i20 genome.

| Genomic library (insert size) | Raw reads (million) | Raw bases (Gb) | Raw GC content (%) | HQ reads (million) | HQ bases (Gb) | HQ GC content (%) | HQ reads (%) |
|---|---|---|---|---|---|---|---|
| 300 bp | 48.37 | 7.30 | 40.92 | 47.15 | 7.02 | 40.62 | 97.47 |
| 600 bp | 39.71 | 5.99 | 41.12 | 38.08 | 5.67 | 40.89 | 95.89 |
| 5 kb (mate-pair library) | 72.27 | 9.44 | 39.65 | 69.75 | 8.77 | 38.83 | 96.51 |
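As a quick consistency check on Table 1, the per-library retention rate and the overall read totals can be recomputed from the tabulated values. The Python sketch below does this; the variable names are illustrative, and because the inputs are the rounded values from the table, the recomputed percentages may differ from the tabulated ones in the second decimal place.

```python
# Read counts in millions, copied from Table 1.
libraries = {
    "300 bp": {"raw": 48.37, "hq": 47.15},
    "600 bp": {"raw": 39.71, "hq": 38.08},
    "5 kb mate-pair": {"raw": 72.27, "hq": 69.75},
}

# Percentage of raw reads retained as HQ reads, per library.
retained_pct = {
    name: 100 * reads["hq"] / reads["raw"] for name, reads in libraries.items()
}

total_raw = sum(reads["raw"] for reads in libraries.values())
total_hq = sum(reads["hq"] for reads in libraries.values())

for name, pct in retained_pct.items():
    print(f"{name}: {pct:.2f}% of raw reads retained")
print(f"Total raw reads: {total_raw:.2f} million")
print(f"Total HQ reads:  {total_hq:.2f} million")
```

The totals agree with the ~160 million raw reads and 155 million HQ reads stated in the text.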
Table 2. Genome assembly statistics of H. indica Hms1-i20 genome.

| Parameter | Statistics |
|---|---|
| Total sequences | 3,538 |
| Total bases | 91,263,125 |
| Min sequence length (bp) | 500 |
| Max sequence length (bp) | 2,409,384 |
| Average sequence length (bp) | 25,795.12 |
| Median sequence length (bp) | 3,410.50 |
| N50 length (bp) | 587,367 |
| As | 28.71% |
| Ts | 28.64% |
| Gs | 17.67% |
| Cs | 17.64% |
| (A + T)s | 57.35% |
| (G + C)s | 35.31% |
| Ns | 7.33% |
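Several entries in Table 2 are derived from others: the average sequence length is the total bases divided by the number of sequences, and the (A + T) and (G + C) rows are sums of the single-base rows. The Python sketch below recomputes them from the tabulated values. The final quantity, GC content among called (non-N) bases only, is an illustrative extra that the table does not report, and all variable names are assumptions.

```python
# Values copied from Table 2.
total_bases = 91_263_125
total_sequences = 3_538
composition = {"A": 28.71, "T": 28.64, "G": 17.67, "C": 17.64, "N": 7.33}  # percent

# Average scaffold length in bp.
mean_length = total_bases / total_sequences

# Combined base fractions, as percentages of all positions including Ns.
at_pct = composition["A"] + composition["T"]
gc_pct = composition["G"] + composition["C"]
all_pct = at_pct + gc_pct + composition["N"]

# Illustrative extra (not in Table 2): GC content among called bases only.
gc_of_called_bases = 100 * gc_pct / (at_pct + gc_pct)

print(f"Mean sequence length: {mean_length:.2f} bp")
print(f"A+T: {at_pct:.2f}%  G+C: {gc_pct:.2f}%  sum with Ns: {all_pct:.2f}%")
print(f"GC among non-N bases: {gc_of_called_bases:.2f}%")
```

The tabulated 35.31% GC content is therefore expressed relative to all assembled positions, gap bases (Ns) included.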
This genomic resource will facilitate functional and comparative genomic studies and genetic exploration in Heterorhabditis nematodes. Accession number(s): The raw sequence data has been deposited in GenBank under BioProject No. PRJNA720543, BioSample No. SAMN18671197 and SRA IDs SRR14181568, SRR14181569 and SRR14181570. This Whole Genome Shotgun project has been deposited at DDBJ/ENA/GenBank under the accession JAJAVP000000000. The version described in this paper is version JAJAVP010000000.
References (10 of 14 shown)

1.  Scaffolding pre-assembled contigs using SSPACE.

Authors:  Marten Boetzer; Christiaan V Henkel; Hans J Jansen; Derek Butler; Walter Pirovano
Journal:  Bioinformatics       Date:  2010-12-12       Impact factor: 6.937

2.  CEGMA: a pipeline to accurately annotate core genes in eukaryotic genomes.

Authors:  Genis Parra; Keith Bradnam; Ian Korf
Journal:  Bioinformatics       Date:  2007-03-01       Impact factor: 6.937

3.  MAKER: an easy-to-use annotation pipeline designed for emerging model organism genomes.

Authors:  Brandi L Cantarel; Ian Korf; Sofia M C Robb; Genis Parra; Eric Ross; Barry Moore; Carson Holt; Alejandro Sánchez Alvarado; Mark Yandell
Journal:  Genome Res       Date:  2007-11-19       Impact factor: 9.043

4.  De novo assembly of human genomes with massively parallel short read sequencing.

Authors:  Ruiqiang Li; Hongmei Zhu; Jue Ruan; Wubin Qian; Xiaodong Fang; Zhongbin Shi; Yingrui Li; Shengting Li; Gao Shan; Karsten Kristiansen; Songgang Li; Huanming Yang; Jian Wang; Jun Wang
Journal:  Genome Res       Date:  2009-12-17       Impact factor: 9.043

5.  Entomopathogenic nematodes for control of insect pests above and below ground with comments on commercial production.

Authors:  Lawrence A Lacey; Ramon Georgis
Journal:  J Nematol       Date:  2012-06       Impact factor: 1.402

6.  GeneMark: web software for gene finding in prokaryotes, eukaryotes and viruses.

Authors:  John Besemer; Mark Borodovsky
Journal:  Nucleic Acids Res       Date:  2005-07-01       Impact factor: 16.971

7.  AUGUSTUS: a web server for gene prediction in eukaryotes that allows user-defined constraints.

Authors:  Mario Stanke; Burkhard Morgenstern
Journal:  Nucleic Acids Res       Date:  2005-07-01       Impact factor: 16.971

8.  fastp: an ultra-fast all-in-one FASTQ preprocessor.

Authors:  Shifu Chen; Yanqing Zhou; Yaru Chen; Jia Gu
Journal:  Bioinformatics       Date:  2018-09-01       Impact factor: 6.937

9.  OrthoFinder: phylogenetic orthology inference for comparative genomics.

Authors:  David M Emms; Steven Kelly
Journal:  Genome Biol       Date:  2019-11-14       Impact factor: 13.583

10.  Gene finding in novel genomes.

Authors:  Ian Korf
Journal:  BMC Bioinformatics       Date:  2004-05-14       Impact factor: 3.169
