Literature DB >> 31113823

Hybrid Assembly of the Genome of the Entomopathogenic Nematode Steinernema carpocapsae Identifies the X-Chromosome.

Lorrayne Serra1, Marissa Macchietto2, Aide Macias-Muñoz1, Cassandra Joan McGill1, Isaryhia Maya Rodriguez1, Bryan Rodriguez1, Rabi Murad1, Ali Mortazavi3,4.   

Abstract

Entomopathogenic nematodes from the genus Steinernema are lethal insect parasites that quickly kill their insect hosts with the help of their symbiotic bacteria. Steinernema carpocapsae is one of the most studied entomopathogens due to its broad lethality to diverse insect species and its effective commercial use as a biological control agent for insect pests, as well as a genetic model for studying parasitism, pathogenesis, and symbiosis. In this study, we used long-reads from the Pacific Biosciences platform and BioNano Genomics Irys system to assemble the most complete genome of the S. carpocapsae ALL strain to date, comprising 84.5 Mb in 16 scaffolds, with an N50 of 7.36 Mb. The largest scaffold, with 20.9 Mb, was identified as chromosome X based on sex-specific genome sequencing. The high level of contiguity allowed us to characterize gene density, repeat content, and GC content. RNA-seq data from 17 developmental stages, spanning from embryo to adult, were used to predict 30,957 gene models. Using this improved genome, we performed a macrosyntenic analysis to Caenorhabditis elegans and Pristionchus pacificus and found S. carpocapsae's chromosome X to be primarily orthologous to C. elegans' and P. pacificus' chromosome II and IV. We also investigated the expansion of protein families and gene expression differences between adult male and female stage nematodes. This new genome and more accurate set of annotations provide a foundation for additional comparative genomic and gene expression studies within the Steinernema clade and across the Nematoda phylum.
Copyright © 2019 Serra et al.

Entities:  

Keywords:  Genetics of Sex; S. carpocapsae; Steinernema; Steinernema genome

Mesh:

Year:  2019        PMID: 31113823      PMCID: PMC6686942          DOI: 10.1534/g3.119.400180

Source DB:  PubMed          Journal:  G3 (Bethesda)        ISSN: 2160-1836            Impact factor:   3.154


Nematodes are round, non-segmented worms with a simple body plan that populate all known biological niches. Caenorhabditis elegans is by far the best-characterized nematode species and was the first metazoan to have its genome assembled (C. elegans Sequencing Consortium 1998). Studies in C. elegans have been crucial for the study of a multitude of biological processes such as development, cell specification, cell differentiation, apoptosis and genome evolution (Deppe ; Kaletta and Hengartner 2006; Rothman ; Segref ). Other free-living, androdiecious nematode genomes such as Pristionchus pacificus and Panagrellus redivivus have provided tools for genomic comparisons in developmental processes and phenotypic plasticity (Rödelsperger ; Srinivasan ). However, free-living nematodes represent only a subset of the entire nematode phylum as they exclude parasitic nematodes, which are of most agricultural and biomedical interest. Parasitic nematodes have a diverse group of hosts including insects, plants, and humans. Steinernema carpocapsae is an entomopathogenic nematode (EPN) that is a lethal insect parasite but is not harmful to humans or plants. S. carpocapsae is of great interest because of its wide host range with approximately 200 possible insect hosts and the possible orthology of its toxins to mammalian-parasitic nematodes (Shapiro-Ilan ; Yang ). In many countries, S. carpocapsae are commercialized for use as insect pest control (Dillman ; Dito ). Steinernema has a symbiotic relationship with pathogenic Xenorhabdus bacteria where they collectively infect and kill a host within a few days, reproduce separately within the host, and reassociate when forming infective juveniles that will seek out the next host (Hirao ). Steinernema and its bacterial symbiont have been extensively studied as genetic models to explore symbiosis and pathogenesis (Martens and Goodrich-Blair 2005; Sicard ). EPNs are also an excellent satellite organism to study mammalian parasitism as they are closely related to the Strongyloididae nematodes (Blaxter ). In particular, Steinernema’s mechanism of host seeking by olfactory and other sensory cues likely offers a great model for mammalian-parasitic nematodes (Gang & Hallem 2016). Previously published draft genomes done using Illumina sequencing for five Steinernema species opened the door to study evolutionary traits among these parasitic nematodes and compare them to free-living nematodes. These genomes helped elucidate gene families involved in parasitism that were expanded in Steinernema, such as proteases and protease inhibitors, while a comparative analysis to C. elegans revealed orthologous non-coding regulatory motifs (Dillman ). S. carpocapsae and S. feltiae were further used to study the extent of expression conservation in orthologous genes across nematode families that were associated with embryonic development (Macchietto ). This study found a funnel-shaped model of embryonic development based on high conservation of orthologous genes between Caenorhabditis and Steinernema (Macchietto ). The draft genomes also allowed for further studies in neuropeptide sensory perception in S. carpocapsae, and the study of lethal venom proteins in both S. carpocapsae and S. feltiae (Chang , Lu ; Morris ). The draft genome assembly of S. carpocapsae ALL strain had 1,578 contigs with an estimated genome size of 85.6 Mb, N50 of 300 kb and more than 28,000 predicted genes (Dillman ). Another study, also using short-reads assembled S. carpocapsae Breton strain into 347 scaffolds with an N50 of 1.24 Mb (Rougon-Cardoso ). The draft genome status, although helpful, can hinder further studies of protein families’ expansions involved in parasitism. The high fragmentation also deters the analysis of conserved gene sequences, which could play a significantly larger role in the evolution of this phylum. Therefore, a more contiguous assembly of the S. carpocapsae genome will contribute to further understanding the mechanisms of Steinernema genome evolution. In this study, we reassembled the genome with long-read sequencing from Pacific BioSciences (PacBio) in conjunction with optical mapping using the BioNano Irys system, which produces high throughput physical maps (Lam ; Rhoads and Au 2015). The newly assembled genome consists of 16 scaffolds with nearly 31 thousand predicted genes and we identified the largest scaffold as chromosome X. We observed clusters of expansion of proteases families and found gene expression differences in catalytic activity between adult male and female stage nematodes.

Materials and Methods

Axenic S. carpocapsae culture

S. carpocapsae infective Juveniles (IJs) were cultured on Xenorhabdus nematophila colonization defective mutant (HGB315) bacteria to produce axenic IJs. The IJs were first cultured in vivo in Galleria mellonella (waxworms) and surface sterilized (Gaugler and Kaya 1990). X. nematophila bacteria (HGB315) were cultured in tryptic soy broth (cat. No. A00169, BD Scientific) and shaken at 220rpm in a 26° incubator overnight. Cultures were plated on lipid agar (LA) plates (8 g/L of nutrient broth, 5 g/L of yeast extract, 2 g/L of MgCl2, 7 ml/L of corn syrup, 4 ml/L of corn oil, and 15 g/L of Bacto Agar), using 600μL of culture per plate. Approximately 300,000 IJs were plated across 20 lipid agar plates (∼15,000 IJs/plate) seeded with lawns of Xenorhabdus nematophila. Axenic IJs were collected from LA plates after 2-3 weeks by placing the LA plates in larger petri plates filled with DI water. The IJs were collected from the DI and surface sterilized (Gaugler and Kaya 1990). IJs were stored in Ringer’s solution in tissue culture flasks at a density < 15 IJs/μL until enough IJs (∼2 million) were collected for the DNA isolation.

PacBio DNA isolation

Sucrose float to remove dead IJs and bacterial contamination. Axenic IJs stored in the tissue culture flasks were transferred to conical tubes, spun down, and washed 3x with DI water. IJs were re-surface sterilized by adding 7.5 mL of distilled (DI) water to the worm pellet and 7.5 mL of egg solution (3 mL DI water, 4.5 mL 1 M NaOH, 2.5 mL of fresh Clorox bleach) and incubating for 5 min. The egg solution was immediately removed, and the IJs were rinsed 3 times by centrifuging at 2000 rpm for 1-2 min and then suspended in 7 mL of molecular grade water. 7 ml of cold 60% sucrose was added, and the sample was mixed and spun at 50x g (630 rpm/660 rpm) for 1 min at 4°, and then immediately at 1150x g (3000 rpm /3180 rpm) for 3 min at 4°. The live IJs were collected from the top layer and transferred to a sterile conical tube and washed 3x (centrifuging at 1650g (3600 rpm / 3810rpm) for 5 min) with molecular grade water. Grinding of the IJs with a mortar and pestle in liquid nitrogen to break nematode cuticles. A mortar, pestle, and four ultracentrifuge tubes were wrapped in aluminum foil and autoclaved for at least 20 min at 121°. The cleaned IJs were added to the autoclaved mortar and liquid nitrogen was poured into the mortar and ground with the pestle until the liquid nitrogen evaporated. IJs were ground for 10 min adding liquid nitrogen as needed. The ground powder was transferred into a 50mL conical tube on ice. IJ cell/tissue lysis and protein and RNA digestion. In a 15 mL conical, 9.5 mL Qiagen Buffer G2 from the Genomic DNA Buffer Set (Cat. No. 19060, Qiagen), 19 μL RNase A (100mg/mL; Cat. No. EN05231, Thermo Fisher) and 375 μL of proteinase K (> 800mili-absorbance units (mAu), Cat. No. P4850-5ML, Sigma-Aldrich) were mixed, added to the ground IJ sample, and incubated at 50° in a water bath for 3-4 hr until the lysate was clear. After incubated the sample was spun down at 5,000x g for 10 min at 4° to separate out any particulate matter that can clog the genomic tip column. The supernatant was transferred to a clean 15 mL conical tube. Genomic DNA measurements. The total amount of genomic DNA was measured with the Qubit fluorometer to determine which genomic tip size to use. We used the 100G genomic-tip because we had between 80-100 μg of gDNA. Genomic-tip protocol. A 100G Qiagen Genomic-tip (100/G) was equilibrated with 4 mL of Buffer QBT. Separately, the DNA sample was diluted with an equal volume of buffer QBT and vortexed for 10s at maximum speed and immediately applied to the equilibrated genomic-tip. The following steps of the genomic-tip were followed according to the manufacturer’s instructions, except that the ethanol and isopropanol precipitation centrifugation steps were performed at 10,000x g for 45 min each. The DNA was eluted in EB buffer overnight at 55°. The DNA was sheared using 10 pumps of a blunt 24-gauge needle followed by 10 pumps using a blunt 21-gauge needle. The gDNA concentration, absorbance ratios, and fragment size range were determined before SMRT-bell library preparation.

S. carpocapsae genome assembly

The S. carpocapsae genome was assembled with Illumina reads and PacBio reads. Then, further improved in contiguity with BioNano genomics Iris system. Previously published Illumina libraries were assembled into high quality contigs using the Platanus (version 1.2.4) command platanus assemble with a k-mer size of 51 (-k 51) and a minimum k-mer coverage cut-off of 5 (-c 5) (Dillman ; Kajitani ). Then, we sequenced six SMRT cells on the PacBio RS II and obtained a total of 500,026 reads (616,611 sub reads) with a N50 read length of 18,308 bp and a mean subread length of 13,036 bp. This translated to 76.8X PacBio read coverage of the 85 Mb genome. All PacBio reads (76.8X genome coverage) were used with the PBcR pipeline into a self-assembly (PB assembly) (Chakraborty ; Koren ). Then, a hybrid assembly was assembled using the longest error-corrected PacBio reads that provided 25X coverage of the genome, which was output from the PBcR pipeline, with the Illumina assembly. The hybrid assembly was generated using the following DBG2OLC parameters: k 17, KmerCovTh 10, MinOverlap 100, AdaptiveTh 0.02, PathCovTh 3, RemoveChimera 1. The hybrid assembly and self-assembly was corrected for errors and chimeras using the Illumina and PacBio reads with quiver from the smrtanalysis package from PacBio (version 2.3.0p5) and subsequently Pilon (version 1.16) (Walker ). Lastly, the PB-assembly and hybrid assembly were merged with quickmerge and MUMmer (version 3.23) using the default settings to generate a merged assembly (Chakraborty ; Kurtz ). BioNano: DNA was extracted from S. carpocapsae Infective Juveniles (IJs) using the Animal Tissue DNA Isolation kit (Bionano Genomics). Bionano Irys optical data were generated and assembled with IrysSolve 2.1. We then merged the Bionano assembly with the merged assembly using IrysSolve, retaining Bionano assembly features when the two assemblies disagreed (Figure S1A). The new S. carpocapsae genome assembly also went through Haplomerger to create an assembly with minimum possible number of haplotypes. First, the assembly was softmasked with Windowmasker (version 2.2.22) (Morgulis ). Next, the softmasked genome was cleaned using the faDnaPolishing.pl script provided by HaploMerger2 (Huang ). Then, the longest 5% of the genome was used as a target and the remaining 95% was used as a query for alignment with Lastz, which created a score matrix. The alignment threshold was kept at 95% to best identify heterozygosity. Lastly, the S. carpocapsae ALL strain was compared to the Breton strain through genomeevolution.org using synmap function with default settings (Lyons & Freeling 2008).

de novo developmental transcriptome assembly with Oases

Smartseq2 RNA-seq datasets spanning 16 developmental stages (zygote, 2-cell, 4-cell, 8-cell, 24-44-cell, 64-78-cell, comma, 1.fivefold, twofold, moving, L1, nonactivated IJ, 9-15h activated IJ, L4, male, and female) in replicates of 2-4 were assembled into stage-specific transcriptomes using Oases 0.2.8 and Velvet 1.2.10. Stage-specific transcriptomes were assembled using different kmer sizes from 22-35 in steps of 2, and then merged into a final assembly. The merged final transcriptome assemblies were compiled into a single fasta file to use for gene model prediction.

Genome annotation training and prediction with Augustus

539 S. carpocapsae genes sequences that matched 458 C. elegans CEGMA genes (OrthoMCL) were used together with de novo assembled stage-specific transcriptomes to automatically train Augustus gene model exon and intron prediction (Stanke ). C. elegans UTR models provided by Augustus were used for predicting the S. carpocapsae UTRs. A compiled set of de novo assembled stage-specific transcriptomes (from above) was mapped onto the merged genome using Blat version 36 with the following settings: -maxIntron = 70000 –minScore = 100 -minIdentity = 94. The best alignments were taken using the command pslReps with the setting: -singleHit, and were then sorted and converted into hints file for Augustus using blat2hints.pl from Augustus (version 3.2.1).

Assessing genome assembly completeness with Benchmarking Universal Single-Copy Orthologs (BUSCO)

Genome completeness was checked with BUSCO v3 software with default settings for genome and using near-universal single-copy orthologs selected from OrthoDB v9 nematoda_odb9 (Simão ). Nematode_odb9 has 982 groups for which 854 groups were found in the S. carpocapsae assembly, which is an 87% completeness (Figure S1B).

S. carpocapsae female and male DNA collection

IJs were cultured with X. nematophila for approximately 54 hr to collect 100 males and 62 hr to collect 50 females. Female and male DNA were extracted with DNeasy blood and Tissue kit (Qiagen Cat No. 69504). The DNA of females and males were processed following the steps F through J from protocol of Serra . Briefly, DNA was tagmented using the Nextera DNA library prep kit (Illumina, FC-121-1030). Then, tagmented DNA was amplified using Phusion High Fidelity PCR master mix with the amplification program set to 1) 72° 5 min. 2) 98° 30 sec. 3) 98° 10 sec, 63° 30 sec., 72° 1 min. (repeat 10x). 4) 4° Hold. PCR samples were cleaned with Ampure XP beads with a 1:1 ratio, concentrations measured with Qubit fluorometer and bioAnalyzed with Agilent 2100 Bioanalyzer. Libraries were prepared and sequenced as paired-end, 43 base pair reads on the Illumina Nextseq 500.

Genome analysis

To calculate coverage of genome for male and female, first, S. carpocapsae genome was indexed with bowtie (version 1.0.0) and DNA reads were mapped to the genome with the following options: -X 1500 -a -v 2–best–strata –S (Langmead ). Bam files were used to calculate coverage using DeepTools2 (version 3.1.1) with default settings for 100 kb (Ramírez ). Next, gene density was calculated and graphed with GenomicFeatures (version 3.8) and KaryoploteR (version 3.8) packages for R/ Bioconductor (Gel and Serra 2017; Lawrence ). GC content was calculated with perl script GC-content-in-sliding-window using default settings for 100-kb windows (Richard 2018). Number of repeats were calculated with RepeatModeler (version 1.0.8) and RepeatMasker (version 4.0.7) using default settings (Smit, Hubley, & Green, n.d.).

Orthology analysis

Orthologs were determined across S. carpocapsae and C. elegans, and S. carpocapsae and P. pacificus by blasting the longest protein sequence for each gene using OrthoMCL (version 1.4) with the default settings (Li ). Protein sequences were downloaded from WormBase Parasite for C. elegans (version PRJNA13758) and P. pacificus (version PRJNA12644).

Protease identification, location, and phylogeny

Protein sequences for the new S. carpocapsae genome were aligned to the MEROPS (Rawlings ) database using command-line BLAST (Camacho ) . An additional analysis was done by Neil Rawlings using a custom pipeline to search our genome against the MEROPS collection (Rawlings ). Matches with e-value greater than 1xe-10 were kept. Locations for serine and metalloprotease genes were obtained from GTF files. Plots for gene locations and clusters were done using ggplot2. Sequences for orthologs in C. elegans were obtained as mentioned above to generate phylogenetic trees for serine and metalloproteases. Genes were aligned in MEGA7 using MUSCLE and a maximum likelihood tree was done using bootstrap support of 100 replicates.

RNA-sequencing of S. carpocapsae time-course, male, and female

To collect RNA for the time-course, S. carpocapsae IJs (∼10,000) were cultured with X. nematophila in a lipid plate for each allotted time: 3, 6, 9, 12, 15, 24, 36, 48 hr, female and male. The nematodes were washed from the plate into a 1.5mL eppendorf tube and subsequently washed 2 times with Ringer’s and 3 times with HyPure water. Then, S. carpocapsae single nematode transcriptome sequencing was performed according to Serra . Briefly, a single nematode was picked, transferred to a PCR tube and cut with a 25-gauge needle. Then, 2 µl of lysis buffer (for recipe see Serra ) which has RNase inhibitor and proteinase K was added to the tube and incubated at 65° for 10min, 85° for 1min. Immediately after, 1 µl oligo-dT VN primer and 1 µl dNTP was added to the tube and incubated at 72° for 3min. After the incubation period samples were reverse transcribed and PCR amplified according to Serra et al. After PCR, samples were cleaned with Ampure XP beads, concentrations measured with Qubit fluorometer and bioanalyzed with Agilent 2100 Bioanalyzer to check for cDNA quality. If samples had a good BioAnalyzer profile, they were tagmented with Nextera DNA Library Prep Kit, amplified with adapters and sequenced.

RNA-seq analysis of time course, embryos, female and male

S. carpocapsae genome was indexed with a GTF file via RSEM (version 1.2.25) using the command rsem-prepare-reference (Li and Dewey 2011). Embryo data were downloaded from NCBI GEO datasets according to Macchietto (GSE86381).Unstranded, paired-end 43 bp RNA-seq reads for each worm were mapped using bowtie (version 1.0.0) with the following options: bowtie -X 1500 -a -m 200 -S -p 8–seedlen 25 -n 2 -v 3 (Langmead ). Gene expression was quantified with RSEM (version 1.2.25) with the command rsem-calculate-expression. For all analyses, gene expression was reported in Transcripts Per Million (TPM). The TPM generated by rsem-calculate-expression for S. carpocapsae samples were normalized according to groups using the R package limma because samples were collected, processed and sequenced in different batches (Ritchie ). For all RNA-seq analysis genes were kept if reached a minimum of 1TPM for three replicates. For heatmap, gene expression was mean-centered, normalized, and hierarchically clustered with Cluster 3.0 and visualized using Java Treeview (de Hoon ; Saldanha 2004). Differential gene expression was determined using edgeR (Robinson ). Genes were considered differentially expressed if the FDR < 0.05 and log2(fold change) > 2. Gene ontology enrichment analyses was calculated using Blast2GO fisher’s exact test and considered statistically significant if FDR < 0.01 (Conesa ). List of genes used in Blast2GO were differentially expressed according to edgeR or dynamically expressed according to maSigPro.

Data availability

This Whole Genome Shotgun project has been deposited at DDBJ/ENA/GenBank under the accession AZBU00000000. The version described in this paper is version AZBU02000000. RNA-seq reads and gene expression tables were submitted to Gene Expression Omnibus (GEO) under the accession number [GSE127823]. Supplemental material available at FigShare: https://doi.org/10.25387/g3.8069438.

Results

S. carpocapsae largest scaffold identified as chromosome X

We used PacBio sequencing along with BioNano optical mapping to generate a S. carpocapsae genome assembly consisting of 17 scaffolds with a N50 of 7.36 Mb and GC content of 45.7%. Comparison of our genome to the Breton strain from Rougon-Cardoso allowed us to merge our smallest contig of 150 kb with scaffold 7 for a total of 16 scaffolds. S. carpocapsae is known to have 5 chromosomes, and the largest scaffold of 20.9 Mb, could correspond to an assembled chromosome (Table S1). We performed Illumina sequencing of the DNA of males and females, being careful to collect females that had not mated to identify scaffolds that would encompass the chromosome X in our new assembly. The largest scaffold displayed a characteristic twofold difference in coverage between females (orange) and males (blue), which we therefore renamed as Chromosome X (Figure 1A). Scaffold 15, which is much smaller (207 kb or 1% of the size of chromosome X) also showed a similar difference in coverage and is likely a small part of the X chromosome. The remaining scaffolds of ≥1 Mb suggest that we have the remaining four chromosomes in 14 pieces at most.
Figure 1

Identifying the X chromosome in S. carpocapsae. (A) Illumina sequencing of females (orange) and males (blue) shows coverage in order of longest to shortest for scaffolds >1 Mb, in which scaffold 0 is chromosome X. (B) Gene density calculated for 100 kb intervals. (C) GC content for every 100 kb with a sliding window of 1 kb. (D) Number of repeats per 100 kb.

Identifying the X chromosome in S. carpocapsae. (A) Illumina sequencing of females (orange) and males (blue) shows coverage in order of longest to shortest for scaffolds >1 Mb, in which scaffold 0 is chromosome X. (B) Gene density calculated for 100 kb intervals. (C) GC content for every 100 kb with a sliding window of 1 kb. (D) Number of repeats per 100 kb. We used our RNA-seq transcriptomes from published studies (Marissa Macchietto ) and this study to reannotate the genes on this new assembly and identified 30,957 genes using Augustus (see methods). We then calculated the gene density along the scaffolds and observed a uniform distribution of genes on chromosome X, which is similar to observations in Caenorhabditis elegans and P. pacificus (Andersen ; Rödelsperger ). Gene density varied little along and among all other scaffolds (Figure 1B). Chromosome X has a rich %GC content in its center when compared to its arms. All other scaffolds either have uniform %GC content or show a distinct increase or decrease (Figure 1C). Lastly, we analyzed the repeat content of chromosome X and scaffolds. As expected from other assembled nematode chromosomes, chromosome X repeats are more frequent on the arms than in its central region (Hillier ; Rödelsperger ; Yin ). All other scaffolds that are part of autosomal chromosomes either have uniform repetitive sequences (such as scaffold 3) or scaffolds in which repetitive sequence either decrease or increase (such as scaffolds 2 and 4) (Figure 1D). In summary, we assembled the S. carpocapsae genome into 16 scaffolds, 14 of which are greater than 1Mb, including a nearly complete chromosome X that shows similar characteristics to other assembled nematode chromosomes.

S. carpocapsae chromosome X is primarily syntenic to C. elegans chromosomes II and IV

We performed a macrosyntenic analysis between S. carpocapsae and C. elegans genomes by identifying the one-to-one orthologs between both species and plotting these genes according to their position in S. carpocapsae scaffolds (Figure 2). We found that S. carpocapsae chromosome X is primarily homologous to sections of C. elegans chromosomes II, X, and IV. S. carpocapsae chromosome X has approximately 10 Mb homologous to C. elegans chromosome II, 7.5 Mb homologous to C. elegans chromosome IV, and 2.5 Mb homologous to C. elegans chromosome X. The other scaffolds are largely homologous to only one of the C. elegans chromosomes. Scaffolds 1, 9, and 11 are homologous to chromosome I. Scaffold 15 is homologous to chromosome II. Scaffolds 4, 7, 8, 13, and 16 are homologous to chromosome III. Scaffolds 2, 6, 10, and 14 are homologous to chromosome IV. Lastly, scaffolds 3, 5, and 12 are homologous to chromosome X.
Figure 2

Macrosynteny between S. carpocapsae and C. elegans. C. elegans one-to-one orthologous genes had their position predicted in the S. carpocapsae assembly. Each rectangle represents the fraction of C. elegans genes present per 500 kb window in S. carpocapsae. Red rectangle indicates no synteny.

Macrosynteny between S. carpocapsae and C. elegans. C. elegans one-to-one orthologous genes had their position predicted in the S. carpocapsae assembly. Each rectangle represents the fraction of C. elegans genes present per 500 kb window in S. carpocapsae. Red rectangle indicates no synteny. Next, we investigated whether S. carpocapsae represents the ancestral state by repeating the macrosyntenic analysis with Pristionchus pacificus (Figure S2). The P. pacificus chromosome configuration has been reported to represent an ancestral state while C. elegans and strongylid nematodes represent the derived state (Rödelsperger ). S. carpocapsae chromosome X is primarily syntenic to P. pacificus chromosomes II, X and IV. The other scaffolds are largely homologs to only one of P. pacificus’ chromosomes. Scaffold 15 is homologous to chromosome II. Scaffolds 2, 3, 5, 6, 7, 10, 12, and 14 are homologous to chromosome I. Scaffolds 4, 8, and 13 are homologous to chromosome III. Lastly, scaffolds 1, 9 and 11 are homologous to chromosome V. P. pacificus chromosomes II, III, IV, X are homologous to the same chromosomes in C. elegans (Rödelsperger ). However, P. pacificus chromosome I is partially homologous to C. elegans chromosome V and X. The similar pattern of homologies between chromosome X of S. carpocapsae and chromosomes II, X, and V of both C. elegans and P. pacificus is unsurprising, given that these species’ chromosomes are themselves homologous. S. carpocapsae chromosome X configuration is likely similar between C. elegans and P. pacificus because of the homology between these species for chromosomes II, IV, X. S. carpocapsae scaffolds do not share homologous regions with chromosome X of P. pacificus. However, several scaffolds are homologous with chromosome I (Figure S2). In summary, S. carpocapsae chromosome X is homologous to C. elegans and P. pacificus chromosomes II, IV, and X. Therefore, S. carpocapsae have a derived state and translocation has occurred in the branch leading to the ancestors of Steinernema.

Rapid metalloprotease expansion in S. carpocapsae

A previous study comparing nematode genomes found an expansion of proteases and protease inhibitors in Steinernema (Dillman ). S. carpocapsae in particular had 654 peptidase genes, approximately one-third of them encoding metalloproteases and one-third being serine proteases (Dillman ). It was suggested that the potential function of these genes may be to aid in parasitism (Dillman ). Further studies on the mechanisms of S. carpocapsae infection identified 472 venom proteins that included proteases and protease inhibitors (Lu ). Our new genome allows us to investigate potential molecular mechanisms by which these gene families may be expanding. In addition, an improved assembly allows us to determine the location of these genes and these genes are in fact numerous individual unique genes or misassembled genes. In our assembly, we identified 228 unique metalloprotease genes and 254 serine protease genes. Mapping the genes to their chromosomal locations revealed aggregation of gene clusters (Figure 3A-B). In order to visualize the location of venom genes, we colored them in red. Venomous metalloprotease genes form a cluster on scaffold 7, while venomous serine protein genes cluster on scaffolds 5, 8 and 14 (Figure 3A-3B). The location of these expanded genes suggest that they are evolving by tandem duplications. One-to-one orthologs to C. elegans shows that the 228 metalloprotease genes correspond to 90 C. elegans gene models and the 284 serine proteases correspond to 50 genes in C. elegans. The phylogenetic trees for these gene families inform us about their potential molecular history and sequence similarity. While many of the metalloproteins have orthologs in C. elegans with a few duplications, serine proteins seem to have undergone an extensive and rapid expansion in S. carpocapsae. Interestingly, there is a small grouping of metalloprotease duplicates that belong to the cluster on chromosome 7; these venomous genes all encode an M14A domain (Figure 2C). Similarly, many of the venomous serine proteases group within what we refer to as the “serine expansion”; these genes encode the S01A domain (Figure 3D).
Figure 3

Protease expansion in S. carpocapsae. A) Location of metalloprotease genes on S. carpocapsae scaffolds. Darker bands represent gene overlap or gene clusters. Red bands represent venom component-encoding genes. B) Location of serine proteases on S. carpocapsae scaffolds. Darker bands represent gene overlap or gene clusters. Red bands represent venom component-encoding genes. C) Maximum-likelihood phylogenetic tree of metalloproteases in S. carpocapsae and their orthologs in C. elegans (highlighted yellow) using a bootstrap support of 100 replicates. Gene names highlighted in red represent genes encoding metalloprotease venom components. D) Maximum-likelihood phylogenetic tree of genes encoding serine protease venom components in S. carpocapsae and their orthologs in C. elegans (highlighted yellow) using a bootstrap support of 100 replicates. Gene names highlighted in red represent venomous serine protease.

Protease expansion in S. carpocapsae. A) Location of metalloprotease genes on S. carpocapsae scaffolds. Darker bands represent gene overlap or gene clusters. Red bands represent venom component-encoding genes. B) Location of serine proteases on S. carpocapsae scaffolds. Darker bands represent gene overlap or gene clusters. Red bands represent venom component-encoding genes. C) Maximum-likelihood phylogenetic tree of metalloproteases in S. carpocapsae and their orthologs in C. elegans (highlighted yellow) using a bootstrap support of 100 replicates. Gene names highlighted in red represent genes encoding metalloprotease venom components. D) Maximum-likelihood phylogenetic tree of genes encoding serine protease venom components in S. carpocapsae and their orthologs in C. elegans (highlighted yellow) using a bootstrap support of 100 replicates. Gene names highlighted in red represent venomous serine protease.

Gene expression profiling of eleven post-activation developmental stages reveals sex-specific differences

We collected early-IJ activation developmental stages of S. carpocapsae at 3, 6, 9, 12, and 15 hr as well as later-IJ post-activation time points that are thought to correspond to Juvenile 4 such as 24 (Juvenile 4), 36, and 48 hr, adult females and males using single-nematode RNA-seq analysis (Strauch ,Serra ). We next compared the expression profiles of 11 embryonic stages from Macchietto (zygote, 2-cells, 4-cells, 8-cells, 24-44-cells, 64-78-cells, comma, 1.fivefold, twofold, threefold, L1) and 11 post-IJ developmental stages. We first generated a heatmap of all 22 developmental stages of all genes expressed with a minimum of 1 Transcript Per Million (TPM), where all the genes in red are upregulated and blue are downregulated (Figure 4A). All eleven embryonic stages and five early post-activation stages have distinct gene expression profiles from three late post-activation stages which has a similar gene expression profile to female and male adults. Interestingly, we found that two of the nematodes collected at 24 hr have a transcriptional profile similar to adult females, whereas the third nematode collected at the same hour is more similar to an adult male. Similarly, we found that two of the 36 hr nematodes have a male-like profile and one nematode has a female-like profile. At 48 hr all nematodes have a female transcriptome profile, which we assume simply represents our chance of collecting three out of three female samples randomly.
Figure 4

Gene expression profiling of S. carpocapsae development in 11 embryonic stages and 11 post-IJ development. (A) Heatmap showing 26,761 genes with minimum of 1 transcript per million (TPM). (B) Volcano plot of 12,461 differentially expressed genes with minimum of 1 TPM, between male (blue) and female (orange) and representative GO terms. (C) Principal Component Analysis of all genes expressed in embryos and during an activation time course. Developmental stages are separated from major life stages (IJ, adults) by PC3.

Gene expression profiling of S. carpocapsae development in 11 embryonic stages and 11 post-IJ development. (A) Heatmap showing 26,761 genes with minimum of 1 transcript per million (TPM). (B) Volcano plot of 12,461 differentially expressed genes with minimum of 1 TPM, between male (blue) and female (orange) and representative GO terms. (C) Principal Component Analysis of all genes expressed in embryos and during an activation time course. Developmental stages are separated from major life stages (IJ, adults) by PC3. We then conducted a comparative gene expression analysis between adult males and females. We found 2,328 genes downregulated in males and 1,026 upregulated in females (Figure 4B). GO terms for the 2,328 downregulated genes showed strong enrichment for lipid metabolic process (P = 1.6x10−8), catalytic activity (P = 2.3x10−11), immune response (P = 3.2x10−4) and embryo development (P = 3.0x10−2). In contrast, the 1,026 upregulated genes are related to male gamete generation (P = 8.7x10−3), sperm motility (P = 2.0x10−2), biological regulation (P = 1.6x10−9) and catalytic activity (P = 4.4x10−4). We generally recovered distinct major GO term categories between female and male except for the category “catalytic activity”, which led us to investigate whether the GO terms for females and males are enriched for distinct enzymes. Such distinct enzymes have been extensively studied in S. carpocapsae IJ activation but not in adults (Lu ). Male GO terms were enriched for phosphotransferase activity (P = 5.93x10−94), kinase activity (P = 1.14x10−23), phosphatase activity (P = 5.69x10−9) and phosphoric ester hydrolase activity (P = 1.38x10−7). In contrast, female GO terms are enriched for peptidase activity (P = 2.38x10−6), serine-type peptidase activity (P = 5.22x10−6), serine hydrolase activity (P = 5.22x10−6) and amine-lyase activity (P = 8.87x10−5). This indicates that females and males express different enzymes at higher levels in adulthood. In a separate analysis, we analyzed the genes from Figure 3B with highest expression in male and highest expression in female (Figure S3A). GO terms for highly expressed genes in females and males confirmed that females’ enzymatic activity is related to hydrolase, peptidase and serine activity while males are enriched for kinase activity, and phosphorous processes (Figure S3B). We then used a Principal Component Analysis (PCA) to assess how gene expression of embryos, late post-developmental stages and adults varies across stages (Figure 4C). Principal component 1 (PC1) accounts for 19.7% of the variance across stages and PC2 accounts for 14.2%. Interestingly, PC3 (10.3%) separated all stages into two clusters. One cluster (blue circle) has all the stages that are developing while the other cluster (pink circle) has all the stages that are fully developed or developmentally arrested (namely infective juveniles (IJs)). GO analysis of PC3 positive values were enriched for catalytic activity, embryo development, testosterone dehydrogenase, female sex differentiation, steroid dehydrogenase, regulation of hormone process and female genitalia development. Meanwhile, PC3 negative values were enriched for kinase activity, ligand-gated ion channel, calcium-release channel activity, nicotinate nucleotide metabolic process, NAD metabolic process and regulation of autophagy (Figure S4). In summary, profiling gene expression in 11 post-activation developmental stages of S. carpocapsae shows that starting at 24 hr nematodes are differentiating their gonads and are transcriptionally different from early post-developmental stages. We also found that a gene expression comparison between female and male reveals catalytic activity as differentially expressed with females enriched for peptidase and serine activity while males are enriched for phosphatases. Lastly, PCA separated embryos, late post-developmental stages and adults into two clusters based on their developmental activity. We investigated conservation of the C. elegans sex determination pathway in S. carpocapsae. We performed an orthology analysis of the genes described in Haag (2005) to identify which sex determination genes are conserved in S. carpocapsae, including genes with more than one paralog (Table S2). Out of the 27 C. elegans sex determination pathway genes, we found 10 that have one to one orthologs such as and , which are responsible for X dosage counting elements and are female-promoting (Hodgkin ). Another seven genes have one to many paralogs such as the case of , the transcriptional repressor, which have three homologs; however, as expected, and do not have orthologs in S. carpocapsae (Klein and Meyer 1993; Lieb ; Nonet and Meyer 1991). Interestingly, and , which are global repressors of , are orthologous to the same five genes in S. carpocapsae with the best hit for gene g2524 (Puoti & Kimble 2000). Genes and , which are germline translational repressors of , are orthologous to one gene g11152 which suggests that this duplication happened along the C. elegans lineage (Zhang ). Lastly, , which is a promoter of spermatogenesis, has one orthologous gene, contrary to which has orthologs in multiple nematodes species but not in S. carpocapsae (Chen ; Jin ). In summary, we found 19 out of the 27 genes to C. elegans in the sex-determination pathway, which might be the core genes important in the nematode sex determination pathway.

Discussion

In this study we improved the genome of S. carpocapsae ALL strain with PacBio technology and BioNano Iris system and identified chromosome X. The S. carpocapsae genome is in 16 scaffolds with an N50 of 7.36 Mb. The sum of the 10 largest scaffolds’ lengths achieves ∼90% of the genome size, while the top 4 scaffolds cover 50% of the genome. In addition, we used stage-specific developmental transcriptomes to re-annotate our new assembly and predicted 30,957 genes, which we used to infer patterns of rearrangements between scaffolds based on macrosyntenic analysis compared to C. elegans. This also allowed us to map the expansion of metalloproteases clusters and to identify the set of genes differentially expressed between males and females. S. carpocapsae has four autosomes and one sex chromosome (Curran 1989). Chromosome-level macrosynteny revealed that S. carpocapsae chromosome X is orthologous to both C. elegans and P. pacificus chromosome II and IV. In addition, S. carpocapsae chromosome X has a small 3 Mb section orthologous to P. pacificus and C. elegans chromosome X. The small 3 Mb of S. carpocapsae chromosome X section is the only section orthologous to P. pacificus chromosome X, while most of the scaffolds are orthologous to P. pacificus chromosome I. This small segment of chromosome X orthologous among S. carpocapsae, C. elegans and P. pacificus might represent a Nigon unit, which are proposed to be deeply conserved linkage groups found conserved between nematodes (Tandonnet ). A comparison of A. rhodensis chromosomes to four Caenorhabditis species including C. elegans pinpointed these Nigon units, one of which (NX) was found as a sole component of chromosome X or found combined with other Nigon units (Tandonnet ). Interestingly, S. carpocapsae scaffolds are orthologous to single C. elegans chromosomes, which is reminiscent of a macrosynteny analysis in other nematode clades that also found conservation of orthologous chromosomal parts (Fradin ). If the last common ancestor among clades had 6 chromosomes, then it is likely that S. carpocapsae had a fusion of C. elegans chromosomes II and IV, which also occurred in unichromosomal nematode Diploscapter pachys in which the order of chromosomes fused were I, X, III, II, IV, and V (Fradin ). The conservation of synteny of S. carpocapsae to both C. elegans and P. pacificus highlights the large-scale chromosomal rearrangements in nematode genome evolution. High quality genomes of other Steinernema species will allow for comparisons of evolution of parasitism in this genus. Thus, the nematode field would continue to benefit from assembling high quality genomes which would further the study of parasitology. Differential expression analysis between males and females uncovered an interesting divergence in catalytic activity. Females have high levels of hydrolase activity, serine-peptidase activity, and proteolysis. Similar catalytic activity is found in adult parasitic females (PF) of Strongyloides species (Hunt ). Strongyloides species have a free-living generation with gonochoristic females and males, and a parasitic female generation that lives inside the mammalian host (Hunt ). Although, in Steinernema, there are female sex-bias in emerging IJ populations, little is known about the role of Steinernema females in infecting a host populations (Alsaiyah ). On the other hand, the catalytic activity for males was enriched for phosphorylation, phosphorous activity and kinase activity. Protein phosphorylation and dephosphorylation regulates protein function with phosphatases important in spermatogenesis and regulation of sperm motility (Cottee ; Guillermet-Guibert ). Kinases are important in nematode male fertility and a target of drugs to treat human parasites (Guillermet-Guibert ; Nolan ). The role of females in infecting a host and the molecular biology of reproduction processes in EPNs are largely unexplored. The study of the reproductive and developmental processes in parasitic nematodes will be important because it could lead to better methods for biocontrol and for mammalian parasite control through the disruption or interruption of the reproductive cycle. In conclusion, we have improved the S. carpocapsae genome from 1,578 to 16 scaffolds and identified its X chromosome. A macrosynteny analysis found that chromosome X is orthologous to C. elegans and P. pacificus chromosomes II and IV. Our results point to a conserved region of chromosome X among the three species. We also found catalytic activity differences between adult females and males. Further analysis will be required to assess the role of female adults in EPN infection and regulatory relationships of male development. Lastly, we believe the improved genome of S. carpocapsae will advance the field of comparative nematode genomics and allow for the mining of new insights in the evolution of nematode parasitism.
  62 in total

1.  WindowMasker: window-based masker for sequenced genomes.

Authors:  Aleksandr Morgulis; E Michael Gertz; Alejandro A Schäffer; Richa Agarwala
Journal:  Bioinformatics       Date:  2005-11-15       Impact factor: 6.937

Review 2.  How to usefully compare homologous plant genes and chromosomes as DNA sequences.

Authors:  Eric Lyons; Michael Freeling
Journal:  Plant J       Date:  2008-02       Impact factor: 6.417

Review 3.  Polarity establishment, asymmetric division and segregation of fate determinants in early C. elegans embryos.

Authors:  Lesilee Rose; Pierre Gönczy
Journal:  WormBook       Date:  2014-12-30

Review 4.  Mechanisms of host seeking by parasitic nematodes.

Authors:  Spencer S Gang; Elissa A Hallem
Journal:  Mol Biochem Parasitol       Date:  2016-05-17       Impact factor: 1.759

5.  Blast2GO: a universal tool for annotation, visualization and analysis in functional genomics research.

Authors:  Ana Conesa; Stefan Götz; Juan Miguel García-Gómez; Javier Terol; Manuel Talón; Montserrat Robles
Journal:  Bioinformatics       Date:  2005-08-04       Impact factor: 6.937

6.  Novel Role for p110β PI 3-Kinase in Male Fertility through Regulation of Androgen Receptor Activity in Sertoli Cells.

Authors:  Julie Guillermet-Guibert; Lee B Smith; Guillaume Halet; Maria A Whitehead; Wayne Pearce; Diane Rebourcet; Kelly León; Pascale Crépieux; Gemma Nock; Maria Strömstedt; Malin Enerback; Claude Chelala; Mariona Graupera; John Carroll; Sabina Cosulich; Philippa T K Saunders; Ilpo Huhtaniemi; Bart Vanhaesebroeck
Journal:  PLoS Genet       Date:  2015-07-01       Impact factor: 5.917

7.  karyoploteR: an R/Bioconductor package to plot customizable genomes displaying arbitrary data.

Authors:  Bernat Gel; Eduard Serra
Journal:  Bioinformatics       Date:  2017-10-01       Impact factor: 6.937

8.  A core set of venom proteins is released by entomopathogenic nematodes in the genus Steinernema.

Authors:  Dennis Z Chang; Lorrayne Serra; Dihong Lu; Ali Mortazavi; Adler R Dillman
Journal:  PLoS Pathog       Date:  2019-05-01       Impact factor: 6.823

9.  Chromosome-Wide Evolution and Sex Determination in the Three-Sexed Nematode Auanema rhodensis.

Authors:  Sophie Tandonnet; Georgios D Koutsovoulos; Sally Adams; Delphine Cloarec; Manish Parihar; Mark L Blaxter; Andre Pires-daSilva
Journal:  G3 (Bethesda)       Date:  2019-04-09       Impact factor: 3.154

10.  Fate specification and tissue-specific cell cycle control of the Caenorhabditis elegans intestine.

Authors:  Alexandra Segref; Juan Cabello; Caroline Clucas; Ralf Schnabel; Iain L Johnstone
Journal:  Mol Biol Cell       Date:  2010-01-06       Impact factor: 4.138

View more
  3 in total

1.  A telomere-to-telomere assembly of Oscheius tipulae and the evolution of rhabditid nematode chromosomes.

Authors:  Pablo Manuel Gonzalez de la Rosa; Marian Thomson; Urmi Trivedi; Alan Tracey; Sophie Tandonnet; Mark Blaxter
Journal:  G3 (Bethesda)       Date:  2021-01-18       Impact factor: 3.154

2.  The effect of temperature conditioning (9°C and 20°C) on the proteome of entomopathogenic nematode infective juveniles.

Authors:  Peter E Lillis; Christine T Griffin; James C Carolan
Journal:  PLoS One       Date:  2022-04-07       Impact factor: 3.240

3.  Genome announcement of Steinernema khuongi and its associated symbiont from Florida.

Authors:  Anil Baniya; Peter DiGennaro
Journal:  G3 (Bethesda)       Date:  2021-04-15       Impact factor: 3.154

  3 in total

北京卡尤迪生物科技股份有限公司 © 2022-2023.