Identification of essential genes in Caenorhabditis elegans through whole-genome sequencing of legacy mutant collections.

Erica Li-Leger, Richard Feichtinger, Stephane Flibotte, Heinke Holzkamp, Ralf Schnabel, Donald G Moerman.

Abstract

It has been estimated that 15%-30% of the ∼20,000 genes in C. elegans are essential, yet many of these genes remain to be identified or characterized. With the goal of identifying unknown essential genes, we performed whole-genome sequencing on complementation pairs from legacy collections of maternal-effect lethal and sterile mutants. This approach uncovered maternal genes required for embryonic development and genes with apparent sperm-specific functions. In total, 58 putative essential genes were identified on chromosomes III-V, of which 52 genes are represented by novel alleles in this collection. Of these 52 genes, 19 (40 alleles) were selected for further functional characterization. The terminal phenotypes of embryos were examined, revealing defects in cell division, morphogenesis, and osmotic integrity of the eggshell. Mating assays with wild-type males revealed previously unknown male-expressed genes required for fertilization and embryonic development. The result of this study is a catalog of mutant alleles in essential genes that will serve as a resource to guide further study toward a more complete understanding of this important model organism. As many genes and developmental pathways in C. elegans are conserved and essential genes are often linked to human disease, uncovering the function of these genes may also provide insight to further our understanding of human biology.
© The Author(s) 2021. Published by Oxford University Press on behalf of Genetics Society of America.

Keywords:  C. elegans; embryogenesis; essential genes; fertilization; legacy mutants; maternal-effect; whole-genome sequencing

Year:  2021        PMID: 34550348      PMCID: PMC8664450          DOI: 10.1093/g3journal/jkab328

Source DB:  PubMed          Journal:  G3 (Bethesda)        ISSN: 2160-1836            Impact factor:   3.154


Introduction

Essential genes are those required for the survival or reproduction of an organism, and therefore encode elements that are foundational to life. This class of genes has been widely studied for a number of reasons. Essential genes are often well conserved and can offer insight into the principles that govern common biological processes (Hughes 2002; Jordan ; Georgi ). Researching these genes and their functions has important implications in understanding the cellular and developmental processes that form complex organisms, including humans. In addition, identifying genes that are lethal when mutated opens up new avenues through which drug development approaches can target parasites, pathogens, and cancer cells (e.g., Doyle ; Shi ; Vyas ; Zhang ). Finally, the concept of a minimal gene set that is comprised of all genes necessary for life has been the subject of much investigation and has recently been of particular interest in the field of synthetic biology (reviewed in Ausländer ). Studying essential genes in humans is complicated by practical and ethical considerations. Accordingly, model organisms have played a key role in identifying and understanding essential genes, and efforts have been made to identify all essential genes in a few model organisms. Systematic genome-wide studies of gene function in Saccharomyces cerevisiae have uncovered more than 1100 essential genes, many of which have phylogenetically conserved roles in fundamental biological processes such as cell division, protein synthesis, and metabolism (Winzeler ; Giaever ; Yu ; Li ). While an important contribution, this is only a fraction of all the essential genes in multicellular organisms. In more complex model organisms, identifying all essential genes in the genome has not been so straightforward. The discovery of RNA interference (RNAi; Fire ) enabled researchers to employ genome-wide reverse genetic screens to examine the phenotypic effects of gene knockdown (Fraser ; Kamath ). 
In general, this has been an effective, high-throughput method for identifying many genes with essential functions (Gönczy ; Sönnichsen ). However, there are limitations to using RNAi to screen for all essential genes, including incomplete gene knockdown, off-target effects, and RNAi resistance in certain tissues or cell types; thus, many genes of biological importance escape identification in high-throughput RNAi screens. This highlights the motivation to obtain null alleles for every gene in the genome, which has been the goal of several model organism consortia (Bradley ; C.; Varshney ), though it has not yet been achieved for any metazoan. Caenorhabditis elegans has been an important model in developmental biology for decades, and the ability to freeze and store populations of C. elegans indefinitely allows investigators to share their original mutant strains with others around the world. In the first few decades of C. elegans research, dozens of forward genetics screens were used to uncover mutants in hundreds of essential genes (e.g., Herman 1978; Meneely and Herman 1979; Rogalski ; Howell ; Clark ; Johnsen and Baillie 1988, 1991; Kemphues ; McKim , 1992; Howell and Rose 1990; Stewart ; Gönczy ). These early studies generated what we refer to here as legacy collections. The alleles were often mapped to a region of the genome through deficiency or linkage mapping. However, the process of identifying the molecular nature of the genetic mutations one-by-one using traditional methods was slow and laborious before the genome sequence was complete (The ) and next-generation sequencing technologies were developed (reviewed in Metzker 2010; Goodwin ). As whole-genome sequencing (WGS) has become widely adopted, methods for identifying mutant alleles have evolved to take advantage of these technological advances (Sarin ; Smith , 2016; Srivatsan ; Blumenstiel ; Schneeberger ; Doitsidou ; Flibotte ; Zuryn ). 
With WGS becoming increasingly affordable over time, mutant collections can now be mined for data in efficient ways that were not possible two decades ago. Performing WGS on a single mutant genome is often insufficient to identify a causal variant due to the abundance of background mutations in any given strain, particularly one that has been subjected to random mutagenesis (Denver ; Hillier ; Sarin ; Flibotte ). However, when paired with additional strategies such as deletion or SNP-based mapping or bulk segregant analysis, WGS becomes a valuable tool to expedite gene identification. Furthermore, if multiple independently derived allelic mutants exist, an even simpler approach can be taken. By sequencing two or more mutants within a complementation group and looking for mutations in the same gene, the need for additional mapping or crossing schemes is greatly reduced (Schneeberger and Weigel 2011; Nordström ). In the legacy mutant collections described above, where large numbers of mutants are isolated, it is feasible to obtain complementation groups with multiple alleles for many loci. In addition, the abundance of mutants obtained in these large-scale genetic screens suggests that some legacy mutant collections may harbor strains for which the mutations remain unidentified. If such collections are coupled with thorough annotations, they are valuable resources that can be mined with WGS. Indeed, some investigators have recently used such WGS-based approaches to uncover novel essential genes from legacy collections (Jaramillo-Lambert ; Qin ). These projects bring us closer to identifying all essential genes in C. elegans and also contribute to the ongoing efforts to obtain null mutations in every gene in the genome. There are currently 3755 C. elegans genes that have been annotated with lethal or sterile phenotypes from RNAi knockdown studies (data from WormBase version WS275). 
In comparison, the number of genes currently represented by lethal or sterile mutant alleles is 1885 (data from WormBase version WS275). These numbers should be considered minimums, as the database annotations are not necessarily up to date. The discrepancy in these numbers could be illustrative of the comparatively time-consuming and laborious nature of isolating and identifying mutants. In addition, some of the genes identified as essential in RNAi screens may belong to paralogous gene families whose redundant functions are masked in single gene knockouts. Although the total number of essential genes in C. elegans is unknown, extrapolation from saturation mutagenesis screens has led to estimates that approximately 15%–30% of the ∼20,000 genes in this organism are essential (Clark ; Howell and Rose 1990; Johnsen and Baillie 1997; The C.). This suggests the possibility that there are many essential genes in C. elegans that remain unidentified and/or lack representation by a null allele. In this study, we use WGS to revisit two C. elegans legacy mutant collections isolated more than 25 years ago. These collections are a rich resource for essential gene discovery; they comprise 75 complementation groups in which at least two alleles with sterile or maternal-effect lethal phenotypes have been found. With these collections, we sought to identify novel essential genes and to conduct a preliminary characterization of their roles in fertilization and development. Wild-type male rescue assays are used to attribute some mutant phenotypes to sperm-specific genetic defects. In addition, we examine arrested embryos using differential interference contrast (DIC) microscopy and document their terminal phenotypes. This work comprises a catalog of 125 alleles with mutations in 58 putative essential genes on chromosomes III–V. Of these 58 genes, 52 are represented by novel alleles in this collection. 
We present several genes which are reported here for the first time as essential genes and mutant alleles for genes that have only previously been studied with RNAi knockdown. This work aims to help accelerate research efforts by identifying essential genes and providing an entry point into further investigations of gene function. Advancing our understanding of essential genes is imperative to reaching a more comprehensive knowledge of gene function in C. elegans and may provide insight into conserved processes in developmental biology, parasitic nematology, and human disease.

Materials and methods

Generation of legacy mutant collections

Mutant strains were isolated in screens for maternal-effect lethal and sterile alleles in the early 1990s by Heinke Holzkamp and Ralf Schnabel (unpublished data), and Richard Feichtinger (Feichtinger 1995). Two balancer strains were used for mutagenesis. These were GE1532: unc-32(e189)/qC1 [dpy-19(e1259) glp-1(q339)] III; him-3(e1147) IV and GE1550: him-9(e1487) II; unc-24(e138)/nT1[let(m435)] IV; dpy-11(e224)/nT1[let(m435)] V. These parental strains were subjected to ethyl methanesulfonate (EMS) mutagenesis at 20°C as described by Brenner (1974), with a mutagen dose of 50–75 mM and a duration between 4 and 6 h. Following mutagenesis, L4 F1 animals were singled on plates at either 15°C or 17°C. Animals with homozygous markers in the F2 or F3 generation were transferred to 25°C and subsequently screened for the production of dead eggs, unfertilized oocytes, or no eggs laid. The two mutant collections analyzed in this study are summarized in Table 1.
Table 1

Summary of mutant collections

| Collection | Number of complementation groups with ≥2 alleles | Chromosome | Mutant genotypes |
| A | 32 | III | unc-32(e189) let(t…)/qC1 III; him-3(e1147) IV |
| B | 25 | IV | him-9(e1487) II; unc-24(e138) let(t…)/nT1 [let(m435)] IV; dpy-11(e224)/nT1 [let(m435)] V |
| B | 18 | V | him-9(e1487) II; unc-24(e138)/nT1 [let(m435)] IV; dpy-11(e224) let(t…)/nT1 [let(m435)] V |

List of strains

The wild-type Bristol N2 derivative PD1074 and strains with the following mutations were used: him-3(e1147), unc-32(e189), qC1[dpy-19(e1259) glp-1(q339)], him-9(e1487), unc-24(e138), dpy-11(e224, e1180), nT1[let(m435)] (IV; V), nT1[unc(n754)let] (IV; V). Strains carrying the following deletions were used for deficiency mapping: nDf16, nDf40, sDf110, sDf125, tDf5, tDf6, tDf7 (III); eDf19, nDf41, sDf2, sDf21, stDf7 (IV); ctDf1, itDf2, nDf32, sDf28, sDf35 (V). All sDfs were kindly provided by D. Baillie's Lab (Simon Fraser University), and some strains were kindly provided by the Caenorhabditis Genetics Center (University of Minnesota). Nematode strains were cultured as previously described by Brenner (1974).

Outcrossing, mapping, and complementation analysis

All mutant strains were outcrossed at least once to minimize background mutations on other chromosomes. Hermaphrodites of the mutant strains were outcrossed with males of GE1532 [unc-32(e189)/qC1 [dpy-19(e1259) glp-1(q339)] III; him-3(e1147) IV] for Collection A and males of GE1964: him-9(e1487) II; +/nT1[let(m435)] IV; dpy-11(e1180)/nT1[let(m435)] V for Collection B. Deficiency mapping was used to localize mutations to a chromosomal region using the deletion strains listed above. A detailed description of the outcrossing and mapping schemes for Collection B can be found in Supplementary File S1 and Feichtinger (1995). Complementation analysis of legacy mutants was performed by crossing 10 males of one mutant strain to 4 hermaphrodites of another strain. The presence of males with homozygous markers indicated successful crossing, and homozygous hermaphrodite progeny were transferred to new plates to determine whether viable offspring were produced and thus complementation occurred. Failure to complement was verified with additional homozygous animals or by repeating the cross. Complementation tests between CRISPR-Cas9 deletion strains and legacy mutants were performed by crossing heterozygous CRISPR-Cas9 deletion (GFP/+) males to heterozygous legacy mutant hermaphrodites. Twenty GFP hermaphrodite F1s were singled on new plates and those segregating viable Dpy and/or Unc progeny indicated complementation between the two alleles.

DNA extraction

Balanced heterozygous strains were grown on 100 mm nematode growth medium (NGM) agar plates (standard recipe with 3 times concentration of peptone) seeded with OP50 and harvested at starvation. Genomic DNA was extracted using a standard isopropanol precipitation technique previously described (Au ). DNA quality was assessed with a NanoDrop 2000c Spectrophotometer (Thermo Scientific) and DNA concentration was measured using a Qubit 2.0 Fluorometer and dsDNA Broad Range Assay kit (Life Technologies).

Whole-genome sequencing and analysis pipeline

DNA library preparation and WGS were carried out by The Centre for Applied Genomics (The Hospital for Sick Children, Toronto, Canada). Between 20 and 33 C. elegans mutant strains were run together on one lane of an Illumina HiSeq X to generate 150-bp paired-end reads. Sequencing analysis was done using a modified version of a previously designed custom pipeline (Flibotte ; Thompson ). Reads were aligned to the C. elegans reference genome (WS263; wormbase.org) using the short-read aligner BWA version 0.7.16 (Li and Durbin 2009). Single nucleotide variants (SNVs) and small insertions or deletions (indels) were called using SAMtools toolbox version 1.6 (Li ). To eliminate unreliable calls, variants at genomic locations for which the canonical N2 strain has historically had low read depth or poor quality (Thompson ) were removed as potential candidates. The variant calls were annotated with a custom Perl script and labeled heterozygous if represented by 20%–80% of the reads at that location. The remaining candidates were then subjected to a series of custom filters: (i) any variants that appeared in more than three strains from the same collection were removed; (ii) homozygous mutations were removed; (iii) only mutations affecting coding exons (indels, missense, and nonsense mutations) or splice sites (defined as the first two and last two base pairs in an intron) were kept, while all variants from other noncoding regions were removed; and (iv) only mutations on the chromosome to which the mutation had originally been mapped were selected, while variants on all other chromosomes were removed. For each pair of strains belonging to a complementation group, the final list of candidate mutations was compared and the gene or genes in common were identified. In cases where there was only one gene in common on both lists, this gene was designated the putative essential gene. 
For complementation groups with multiple candidate genes in common, additional information such as the nature of the mutations and existing knowledge about the genes was used to select a single candidate gene, when possible. When there was no gene candidate in common within a pair of strains, the list of variants was reanalyzed to look for larger deletions and rearrangements. If available, two additional alleles were sequenced to help identify the gene.
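The filtering and pairing steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' Perl pipeline: the `Variant` record, the consequence labels, and the toy strains and genes are all hypothetical. A variant site is identified here by chromosome and position, and calls below 20% of reads are discarded along with homozygous ones, a detail the text does not specify.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical consequence labels; the paper's annotation script is not shown.
KEPT_CONSEQUENCES = {"missense", "nonsense", "indel", "splice_site"}


@dataclass(frozen=True)
class Variant:
    strain: str
    chrom: str
    pos: int
    gene: str
    consequence: str
    alt_fraction: float  # fraction of reads supporting the variant


def is_heterozygous(v):
    """Heterozygous if 20%-80% of reads at the site carry the variant."""
    return 0.20 <= v.alt_fraction <= 0.80


def filter_candidates(variants, mapped_chrom, max_strains=3):
    """Apply filters (i)-(iv) to all variant calls from one collection.

    mapped_chrom maps each strain to the chromosome its mutation was mapped to.
    """
    # (i) collect the distinct strains carrying a variant at each site
    strains_at_site = defaultdict(set)
    for v in variants:
        strains_at_site[(v.chrom, v.pos)].add(v.strain)

    kept = []
    for v in variants:
        if len(strains_at_site[(v.chrom, v.pos)]) > max_strains:
            continue  # (i) shared by more than three strains
        if not is_heterozygous(v):
            continue  # (ii) homozygous (and, by assumption, <20%) calls dropped
        if v.consequence not in KEPT_CONSEQUENCES:
            continue  # (iii) coding-exon and splice-site changes only
        if v.chrom != mapped_chrom[v.strain]:
            continue  # (iv) not on the chromosome the mutation was mapped to
        kept.append(v)
    return kept


def common_genes(candidates, strain_a, strain_b):
    """Genes hit in both strains of a complementation pair."""
    genes_a = {v.gene for v in candidates if v.strain == strain_a}
    genes_b = {v.gene for v in candidates if v.strain == strain_b}
    return genes_a & genes_b


# Toy calls (invented): s1 and s2 form one complementation pair mapped to III.
calls = [
    Variant("s1", "III", 100, "geneX", "missense", 0.50),
    Variant("s2", "III", 200, "geneX", "nonsense", 0.45),
    Variant("s1", "III", 300, "geneY", "intron", 0.50),    # fails (iii)
    Variant("s2", "III", 400, "geneZ", "missense", 1.00),  # fails (ii)
    Variant("s1", "IV", 150, "geneV", "missense", 0.50),   # fails (iv)
]
# The same site in four strains fails (i).
calls += [Variant(s, "III", 500, "geneW", "missense", 0.50)
          for s in ("s1", "s2", "s3", "s4")]
mapped = {"s1": "III", "s2": "III", "s3": "III", "s4": "III"}

candidates = filter_candidates(calls, mapped)
shared = common_genes(candidates, "s1", "s2")
```

When `shared` holds exactly one gene, that gene would be the putative essential gene for the pair. An empty or multi-gene result corresponds to the manual follow-up cases described in the text.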

Validation of gene candidates

To validate the gene candidates derived from WGS analysis, the genomic position of each candidate gene was corroborated with the legacy data from deficiency mapping experiments. Approximate boundaries for the deletions were estimated from the map coordinates of genes known to lie internal or external to the deletions according to data from WormBase (WS275). For further validation of select gene candidates, deletion mutants were generated in an N2 wild-type background using a CRISPR-Cas9 genome editing strategy previously described (Norris ; Au ). Two guide RNAs were used to excise the gene of interest and replace it with a selection cassette expressing G418 drug resistance and pharyngeal GFP (loxP + Pmyo-2::GFP::unc-54 3′UTR + Prps-27::neoR::unc-54 3′UTR + loxP vector, provided by Dr. John Calarco, University of Toronto, Canada). Guide RNAs were designed using the C. elegans Guide Selection Tool (genome.sfu.ca/crispr) and synthesized by Integrated DNA Technologies (IDT). Repair templates were generated by assembling homology arms (450-bp gBlocks synthesized by IDT) and the selection cassette using the NEBuilder HiFi DNA Assembly Kit (New England Biolabs). Cas9 protein (a generous gift from Dr. Geraldine Seydoux) was assembled into a ribonucleoprotein (RNP) complex with the guide RNAs and tracrRNA (IDT) following the manufacturer’s recommendations. PD1074 animals were injected using standard microinjection techniques (Mello ; Kadandale ) with an injection mix consisting of: 50 ng/µl repair template, 0.5 µM RNP complex, 5 ng/µl pCFJ104 (Pmyo-3::mCherry), and 2.5 ng/µl pCFJ90 (Pmyo-2::mCherry). Injected animals were screened according to the protocol described in Norris and genomic edits were validated using the PCR protocol described in Au . Complementation tests between CRISPR-Cas9 alleles and legacy mutant alleles were performed to verify gene identities, as described above.
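The deficiency-mapping check at the start of this section amounts to asking whether a candidate's position agrees with the legacy complementation result for each deletion. A minimal sketch of that logic follows, assuming approximate deletion boundaries expressed as base-pair intervals. The `Deficiency` record, the name `hypDf1`, and all coordinates are hypothetical and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Deficiency:
    name: str
    chrom: str
    start: int  # approximate left boundary (bp)
    end: int    # approximate right boundary (bp)


def lies_within(chrom, pos, df):
    """True if the position falls inside the estimated deletion interval."""
    return chrom == df.chrom and df.start <= pos <= df.end


def consistent(chrom, pos, df, complemented):
    """True when the position and the legacy complementation result agree.

    A mutation uncovered by a deficiency fails to complement it, so a
    candidate inside the deletion is expected to show no complementation,
    and a candidate outside it is expected to complement.
    """
    return lies_within(chrom, pos, df) != complemented


# Hypothetical deficiency and candidate position, for illustration only.
hyp_df = Deficiency("hypDf1", "III", 9_000_000, 9_500_000)
agrees = consistent("III", 9_100_000, hyp_df, complemented=False)
conflicts = consistent("III", 9_100_000, hyp_df, complemented=True)
```

Because the boundaries are only estimates, a conflict near an interval edge would call for a closer look rather than outright rejection of the candidate.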

Analysis of orthologs, GO, and expression patterns

Previously reported phenotypes from RNAi experiments or mutant alleles were retrieved from WormBase (WS275) and GExplore (genome.sfu.ca/gexplore; Hutter ; Hutter and Suh 2016). Life stage-specific gene expression data from the modENCODE project (Hillier ; Gerstein , 2014; Boeck ) were also accessed through GExplore. Visual inspection of these data revealed genes with maternal expression patterns (high levels of expression in the early embryo and hermaphrodite gonad) as well as those predominantly expressed in males. Human orthologs of C. elegans genes were determined using Ortholist 2 (ortholist.shaye-lab.org; Kim ). For maximum sensitivity, the minimum number of programs predicting a given ortholog was set to one. For genes with no human orthologs, NCBI BLASTp (blast.ncbi.nlm.nih.gov; Altschul ) was used to examine distributions of homologs across species and potential nematode-specificity. Protein sequences from the longest transcript of each gene were used to query the nonredundant protein sequences (nr) database, with default parameters and a maximum of 1000 target sequences. The results were filtered with an E-value threshold of 10⁻⁵. Gene ontology (GO) term analysis was performed using PANTHER version 16.0 (Thomas ). The list of 58 candidate genes was used for an overrepresentation test, with the set of all C. elegans genes as a background list. Overrepresentation was analyzed with a Fisher’s Exact test and P-values were adjusted with the Bonferroni multiple testing correction.
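To make the overrepresentation statistic concrete, the sketch below computes a one-sided Fisher's exact P-value as a hypergeometric tail and applies a Bonferroni correction. It is not PANTHER's implementation, whose internal settings are not described here. The function names and all counts, other than the list size of 58, are hypothetical.

```python
from math import comb


def overrepresentation_p(k, n, K, N):
    """One-sided Fisher's exact P-value, P(X >= k), hypergeometric model.

    k: genes in the list annotated with the term
    n: size of the gene list
    K: genes in the background annotated with the term
    N: size of the background
    """
    total = comb(N, n)
    tail = sum(comb(K, i) * comb(N - K, n - i)
               for i in range(k, min(n, K) + 1))
    return tail / total


def bonferroni(p_values):
    """Multiply each P-value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Small case that can be checked by hand: (4*15 + 1*6) / 252 = 66/252.
p_small = overrepresentation_p(k=3, n=5, K=4, N=10)

# Hypothetical GO term: 10 of 58 list genes versus 400 of 20,000 background genes.
p_term = overrepresentation_p(k=10, n=58, K=400, N=20000)

# Correction across three hypothetical terms.
adjusted = bonferroni([p_term, 0.5, 0.01])
```

Exact integer binomials keep the tail sum free of rounding error until the final division, which is affordable for a list of this size.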

Temperature sensitivity and mating assays

To assay temperature sensitivity, heterozygous strains were propagated at 15°C and homozygous L4 animals were isolated on 60 mm NGM plates (2 × 6/plate or 3 × 3/plate). After 1 week at 15°C, plates were screened for the presence of viable homozygous progeny. If present, L4 homozygotes were transferred to new plates at 25°C and screened after 3 days to confirm lethality or sterility. Mating assays were carried out using PD1074 males and mutant hermaphrodites. Three L4-stage homozygous mutant hermaphrodites were isolated and crossed with 10 PD1074 males on each of three 60 mm NGM plates. Control plates consisted of three L4 hermaphrodite mutants without males. Mating assays were carried out at 25°C and observations were taken after 3 days, noting the absence or presence of viable cross progeny.

Microscopy

The terminal phenotypes of dead eggs from maternal-effect lethal mutants were observed using DIC microscopy. Young adult homozygous mutants were dissected to release their eggs in either M9 buffer with Triton X-100 (0.5%; M9+TX) or distilled water and embryos were left to develop at 25°C overnight (∼16 h). Embryos were mounted on 2% agarose pads and visualized using a Zeiss Axioplan 2 equipped with DIC optics. Images of representative embryos were captured using a Zeiss Axiocam 105 Color camera and ZEN 2.6 imaging software (Carl Zeiss Microscopy). For embryos incubated in distilled water, an osmotic integrity defective (OID) phenotype was noted for embryos that burst or swelled and filled the eggshell, as described by Sönnichsen .

Results

Identification of 58 putative essential genes

WGS was performed on a total of 157 strains, with depth of coverage ranging between 21x and 65x (average = 38x). A minimum of two alleles for each of 75 complementation groups were sequenced and a total of 58 putative essential genes were identified (Table 2). Literature searches revealed that 49 of these genes have been annotated with lethal or sterile phenotypes from either mutant alleles or RNAi studies. Furthermore, 46 of the 157 alleles have been previously mentioned in publications with some phenotypic description (Vatcher ; Gönczy , 2001; Molin ; Kaitna ; Brauchle ; Cockell ; Delattre ; Sonneville and Gönczy 2004; Bischoff and Schnabel 2006; Langenhan ; Nieto ; von Tobel ). Although 18 of these alleles have been previously sequenced, we were unaware of this when initially analyzing the data, and these alleles therefore served as a blind test set to validate our analysis approach. Eight of the nine genes represented in this set of 18 previously sequenced alleles were correctly identified by our pipeline. The gene cul-2 (Sonneville and Gönczy 2004) escaped identification due to an intronic mutation in one allele that did not pass our filtering criteria but was found upon manual inspection of the sequencing data. A complete list of previously published and sequenced alleles can be found in Supplementary File S2 with their associated publications.
Table 2

List of 58 putative essential genes with associated maternal-effect lethal or sterile alleles

| Legacy comp. group^a | Strain | Allele(s) | Gene | Chr. | Position | Base change | Mutation | Mutation type | Amino acid change^b | Protein size (amino acids)^b | Human ortholog(s) | Associated OMIM phenotype(s)^c |
| Y | GE2430 | t2135 | air-1 | V | 8221773 | C>T | SNV | Missense | R62C | 326 | AURKA, AURKB, AURKC, STK36 | Colorectal cancer, susceptibility to [114500]; Spermatogenic failure 5 [243060] |
| | GE2337 | t2095 | air-1 | V | 8223169 | CAT>C | Deletion | Frameshift | | | | |
| x | GE2314 | t1724 | aptf-2 | IV | 13414105 | A>G | SNV | Missense | L244P | 367 | TFAP2A, TFAP2B, TFAP2C, TFAP2D, TFAP2E | Char syndrome [169100]; Patent ductus arteriosus 2 [617035]; Branchiooculofacial syndrome [113620] |
| | GE2289 | t1836 | aptf-2 | IV | 13414263 | G>T | SNV | Nonsense | C191* | | | |
| H | GE1958 | t1726 | atg-7 | IV | 11079764 | G>A | SNV | Nonsense | Q367* | 647 | ATG7 | None |
| | GE1936 | t1738 | atg-7 | IV | 11079973 | C>T | SNV | Nonsense | W311* | | | |
| T | GE2449 | t2143 | atl-1 | V | 9635587 | C>T | SNV | Nonsense | W2346* | 2531 | ATR, PRKDC | Cutaneous telangiectasia and cancer syndrome, familial [614564]; Seckel syndrome 1 [210600]; Immunodeficiency 26 with or without neurologic abnormalities [615966] |
| | GE2467 | t2155 | atl-1 | V | 9637978 | C>T | SNV | Missense | E1710K | | | |
| gene-28 | GE2200 | t1480 | bckd-1A | III | 12969933 | G>A | SNV | Nonsense | Q174* | 432 | BCKDHA, TMEM91, AC011462.1 | Maple syrup urine disease [248600] |
| | GE1742 | t1461 | bckd-1A | III | 12971429 | G>A | SNV | Nonsense | Q109* | | | |
| gene-17 | GE2206 | t1514 | bckd-1A | III | 12971273 | G>A | SNV | Nonsense | Q161* | | | |
| | GE2627 | t1603 | bckd-1A | III | 12971305 | C>T | SNV | Nonsense | W150* | | | |
| vz | GE2890 | t1821 | C34D4.4 | IV | 7150054 | G>A | SNV | Nonsense | W101* | 205 | TVP23A, TVP23B, TVP23C, TVP23C-CDRT4 | None |
| | GE2840 | t1860 | C34D4.4 | IV | 7150143 | G>A | SNV | Nonsense | W131* | | | |
| a | GE2734 | t2029 | C56A3.8 | V | 13560728 | G>A | SNV | Missense | G62E | 402 | PI4K2A, PI4K2B | None |
| | GE2886 | t2055 | C56A3.8 | V | 13560787 | G>A | SNV | Missense | E243K | | | |
| | GE2487 | t2149 | C56A3.8 | V | 13561369 | C>T | SNV | Missense | P82L | | | |
| V | GE2142 | t2074 | ccz-1 | V | 13679756 | T>A | SNV | Nonsense | Y248* | 528 | CCZ1, CCZ1B | None |
| | GE2304 | t2129^d | ccz-1 | V | 13680792 | C>T | SNV | Nonsense | Q361* | | | |
| b | GE2047 | t2021 | cept-2 | V | 14349388 | G>A | SNV | Nonsense | W128* | 424 | CEPT1, CHPT1, SELENOI | Spastic paraplegia 81, autosomal recessive [618768] |
| | GE2122 | t2007 | cept-2 | V | 14349747 | G>A | SNV | Splice site | | | | |
| gene-4 | GE2275 | t1517 | cls-2 | III | 9055405 | G>A | SNV | Missense | R102Q | 1023 | CLASP1, CLASP2 | None |
| | GE2357 | t1527 | cls-2 | III | 9055440 | G>A | SNV | Missense | G114R | | | |
| R | GE2082 | t2053 | cpl-1 | V | 16593886 | G>A | SNV | Missense | S148F | 337 | CTSF, CTSK, CTSL, CTSS, CTSV | Pycnodysostosis [265800]; Ceroid lipofuscinosis, neuronal, 13 [615362] |
| | GE2451 | t2144 | cpl-1 | V | 16595201 | G>A | SNV | Nonsense | Q49* | | | |
| A | GE2447 | t1879 | cpt-2 | IV | 11180120 | C>T | SNV | Nonsense | Q141* | 646 | CPT2 | Carnitine palmitoyltransferase II deficiency [600649, 608836, 255110]; Encephalopathy, acute, infection-induced, susceptibility to, 4 [614212] |
| | GE1938 | t1742 | cpt-2 | IV | 11180603 | G>A | SNV | Nonsense | W194* | | | |
| gene-24 | GE2657 | t1704 | cra-1 | III | 6867181 | G>A | SNV | Nonsense | Q525* | 958 | NAA25 | None |
| | GE2242 | t1618 | cra-1 | III | 6868737 | C>T | SNV | Nonsense | W149* | | | |
| D | GE1929 | t1729 | csr-1 | IV | 7960467 | T>A | SNV | Missense | N708K | 1030 | None | None |
| | GE1929 | t1729 | csr-1 | IV | 7961246 | G>A | SNV | Missense | G922E | | | |
| | GE2452 | t1897 | csr-1 | IV | 7959252 | G>A | SNV | Splice site | | | | |
| gene-25 | GE2595 | t1662, t1718 | cup-5 | III | 7585568 | C>T | SNV | Nonsense | R263* | 668 | MCOLN1, MCOLN2, MCOLN3 | Mucolipidosis IV [252650] |
| | GE2355 | t1528 | cup-5 | III | 7590536 | G>A | SNV | Splice site | | | | |
| gene-30 | GE2345 | t1525^d | cyk-3 | III | 6020590 | C>T | SNV | Nonsense | Q98* | 1178 | USP15, USP32, USP6 | None |
| | GE2352 | t1535^d | cyk-3 | III | 6022863 | G>A | SNV | Nonsense | W723* | | | |
| J | GE2499 | t1877 | D2096.12 | IV | 8363937 | C>T | SNV | Nonsense | Q126* | 763 | None | None |
| | GE2407 | t1906 | D2096.12 | IV | 8365654 | T>A | SNV | Nonsense | L638* | | | |
| O | GE2135 | t2043 | dgtr-1 | V | 6497335 | G>A | SNV | Splice site | | 359 | AWAT1, AWAT2, DGAT2, DGAT2L6, MOGAT1, MOGAT2, MOGAT3 | None |
| | GE2063 | t2042 | dgtr-1 | V | 6498186 | G>A | SNV | Missense | G310R | | | |
| C | GE2028 | t1801 | dif-1 | IV | 7552230 | A>C | SNV | Nonsense | Y187* | 312 | SLC25A20 | Carnitine-acylcarnitine translocase deficiency [212138] |
| | GE1932 | t1732 | dif-1 | IV | 7552641 | C>T | SNV | Missense | G75D | | | |
| gene-13 | GE2612 | t1676^d | div-1 | III | 10245480 | G>A | SNV | Nonsense | Q489* | 581 | POLA2 | None |
| | GE2577 | t1642 | div-1 | III | 10248544 | C>T | SNV | Start ATG | M1I | | | |
| d | GE2335 | t2056 | dlat-1 | V | 14445907 | G>A | SNV | Nonsense | Q419* | 507 | DLAT | Pyruvate dehydrogenase E2 deficiency [245348] |
| | GE2541 | t2035 | dlat-1 | V | 14446981 | G>A | SNV | Missense | P83L | | | |
| u | GE2402 | t1940 | F21D5.1 | IV | 8727315 | C>T | SNV | Missense | A436V | 550 | PGM3 | Immunodeficiency 23 [615816] |
| | GE2445 | t1935 | F21D5.1 | IV | 8727668 | C>T | SNV | Missense | L539F | | | |
| t | GE2837 | t1791 | F56D5.2 | IV | 9397791 | G>A | SNV | Nonsense | Q214* | 385 | None | None |
| | GE2881 | t1744 | F56D5.2 | IV | 9398158 | G>A | SNV | Missense | S107F | | | |
| gene-26 | GE1715 | t1436 | gsp-2 | III | 7337087 | C>T | SNV | Nonsense | R95* | 333 | PPP1CA, PPP1CB, PPP1CC | Noonan syndrome-like disorder with loose anagen hair 2 [617506] |
| | GE2360 | t1481 | gsp-2 | III | 7337383 | G>A | SNV | Missense | G174E | | | |
| gene-32 | GE2545 | t1577 | gsr-1 | III | 3652401 | G>A | SNV | Missense | G335R | 473 | GSR, TXNRD1, TXNRD2, TXNRD3 | Hemolytic anemia due to glutathione reductase deficiency [618660]; Glucocorticoid deficiency 5 [617825] |
| | GE2644 | t1594 | gsr-1 | III | 3652407 | C>T | SNV | Nonsense | R337* | | | |
| gene-31 | GE2583 | t1654 | hcp-3 | III | 9615498 | G>A | SNV | Missense | R269C | 288 | CENPA | None |
| | GE2692 | t1717 | hcp-3 | III | 9615555 | C>T | SNV | Missense | E250K | | | |
| G | GE2455 | t1914 | klp-18 | IV | 7040335 | T>C | SNV | Missense | Y42H | 932 | KIF15 | None |
| | GE2000 | t1795 | klp-18 | IV | 7041203 | G>A | SNV | Missense | E316K | | | |
| gene-6 | GE2367 | t1563 | klp-19 | III | 13306451 | A>T | SNV | Missense | L230H | 1083 | KIF4A, KIF4B | Mental retardation, X-linked 100 [300923] |
| | GE2367 | t1563 | klp-19 | III | 13306457 | G>A | SNV | Missense | A228V | | | |
| | GE2264 | t1628 | klp-19 | III | 13306872 | C>T | SNV | Missense | G90R | | | |
| I | GE2003 | t1817 | let-99 | IV | 12569291 | C>T | SNV | Nonsense | Q447* | 698 | None | None |
| | GE2514 | t1912 | let-99 | IV | 12570199 | C>T | SNV | Missense | L617F | | | |
| gene-22 | GE2730 | t1550^d | lis-1 | III | 13375376 | C>T | SNV | Nonsense | W92* | 404 | PAFAH1B1 | Lissencephaly 1; Subcortical laminar heterotopia [607432] |
| | GE2653 | t1698^d | lis-1 | III | 13375401 | C>T | SNV | Splice site | | | | |
| z | GE2130 | t1765 | mbk-2 | IV | 13033086 | C>T | SNV | Missense | R533C | 817 | DYRK2, DYRK3, DYRK4 | None |
| | GE2503 | t1888 | mbk-2 | IV | 13033644 | C>T | SNV | Missense | P701L | | | |
| gene-10 | GE2740 | t1576^d | mel-32 | III | 6440655 | C>T | SNV | Missense | G395R | 507 | SHMT1, SHMT2 | None |
| | GE1731 | t1456^d | mel-32 | III | 6440831 | C>T | SNV | Missense | G336E | | | |
| M | GE1999 | t1793 | mex-5 | IV | 13354014 | T>G | SNV | Nonsense | Y79* | 468 | None | None |
| | GE2093 | t1800 | mex-5 | IV | 13354478 | T>A | SNV | Nonsense | L219* | | | |
| S | GE2511 | t2162 | mom-2 | V | 8356808 | T>G | SNV | Missense | C80G | 362 | WNT11, WNT9A, WNT9B | None |
| | GE2523 | t2180 | mom-2 | V | 8357121 | T>C | SNV | Missense | C139R | | | |
| W | GE2497 | t2137 | mre-11 | V | 10735712 | G>A | SNV | Missense | H269Y | 728 | MRE11 | Ataxia-telangiectasia-like disorder 1 [604391] |
| | GE2103 | t2092 | mre-11 | V | 10736080 | A>G | SNV | Missense | F146S | | | |
| v | GE2091 | t1772 | nstp-2 | IV | 6604731 | A>T | SNV | Missense | L277H | 324 | SLC35B4 | None |
| | GE2288 | t1835 | nstp-2 | IV | 6605266 | C>T | SNV | Missense | G131R | | | |
| F | GE2391 | t1932 | perm-5 | IV | 5696931 | A>T | SNV | Missense | C454S | 518 | None | None |
| | GE2453 | t1900 | perm-5 | IV | 5698096 | A>G | SNV | Missense | S323P | | | |
| gene-21 | GE2237 | t1614 | pod-1 | III | 13518266 | G>A | SNV | Missense | A912V | 1136 | CORO7, CORO7-PAM16 | None |
| | GE2605 | t1674 | pod-1 | III | 13518357 | G>A | SNV | Nonsense | R882* | | | |
| U | GE3128 | t2177 | pos-1 | V | 8414544 | G>A | SNV | Splice site | | 264 | None | None |
| | GE2101 | t2080 | pos-1 | V | 8414579 | T>A | SNV | Missense | V145D | | | |
| Z | GE2517 | t2175 | rad-50 | V | 12247914 | T>A | SNV | Nonsense | L350* | 1312 | RAD5, AC116366.3 | Nijmegen breakage syndrome-like disorder [613078] |
| | GE2476 | t2147 | rad-50 | V | 12250324 | T>A | SNV | Missense | I1101N | | | |
| E | GE2189 | t1750 | rad-51 | IV | 10282013 | A>T | SNV | Missense | I384N | 395 | DMC1, RAD51, RAD51B, RAD51C, RAD51D | Fanconi anemia, complementation group R, group O [617244, 613390]; Mirror movements 2 [614508]; Breast-ovarian cancer, familial, susceptibility to, 3 [613399] |
| | GE2433 | t1885 | rad-51 | IV | 10282328 | C>T | SNV | Missense | V323I | | | |
| gene-11 | GE2347 | t1519 | rmd-1 | III | 9759805 | G>A | SNV | Missense | G89R | 226 | RMDN2, RMDN3 | None |
| | GE2219 | t1501 | rmd-1 | III | 9759929 | G>A | SNV | Missense | R130H | | | |
| gene-18 | GE2211 | t1476^d | sas-1 | III | 12710102 | C>T | SNV | Missense | P419S | 570 | None | None |
| | GE2343 | t1521^d | sas-1 | III | 12710202 | G>A | SNV | Missense | G452E | | | |
| f | GE2078 | t2033^d | sas-5 | V | 11612449 | C>T | SNV | Missense | R397C | 404 | None | None |
| | GE2134 | t2079^d | sas-5 | V | 11612449 | C>T | SNV | Missense | R397C | | | |
| P | GE2469 | t2173 | spn-4 | V | 6783986 | A>T | SNV | Nonsense | L259* | 351 | RBFOX1, RBFOX2, RBFOX3 | None |
| | GE2317 | t2098 | spn-4 | V | 6784646 | A>T | SNV | Missense | V55D | | | |
| g | GE2386 | t2165 | sqv-4 | V | 10660827 | G>A | SNV | Missense | P182L | 481 | UGDH | Epileptic encephalopathy, early infantile, 84 [618792] |
| | GE2059 | t2025 | sqv-4 | V | 10661143 | G>A | SNV | Missense | S93L | | | |
| gene-5 | GE2277 | t1496 | such-1 | III | 11515520 | G>A | SNV | Missense | L686F | 798 | ANAPC5 | None |
| | GE2277 | t1496 | such-1 | III | 11515883 | G>A | SNV | Missense | H565Y | | | |
| | GE2666 | t1693 | such-1 | III | 11515540 | C>T | SNV | Missense | R679K | | | |
| q | GE2827 | t1786 | T22B11.1 | IV | 4692945 | G>A | SNV | Nonsense | W35* | 468 | None | None |
| | GE2895 | t1866 | T22B11.1 | IV | 4696017 | G>A | SNV | Nonsense | W356* | | | |
| gene-12 | GE1734 | t1438, t1477 | tlk-1 | III | 9707175 | C>T | SNV | Nonsense | Q412* | 965 | TLK1, TLK2, TLK2PS1 | Mental retardation, autosomal dominant 57 [618050] |
| | GE2613 | t1677 | tlk-1 | III | 9708080 | G>A | SNV | Missense | A694T | | | |
| gene-15 | GE2399 | t1559 | top-3 | III | 11951381 | G>A | SNV | Nonsense | Q602* | 759 | TOP3A | Progressive external ophthalmoplegia with mitochondrial DNA deletions, autosomal recessive 5 [618098]; Microcephaly, growth restriction, and increased sister chromatid exchange 2 [618097] |
| | GE2220 | t1516 | top-3 | III | 11958680 | C>T | SNV | Missense | G59R | | | |
| gene-35 | GE1735 | t1470 | top-3 | III | 11957525 | C>T | SNV | Nonsense | W114* | | | |
| | GE2958 | t1464, t1484 | top-3 | III | 11951669 | C>T | SNV | Missense | G506R | | | |
| L | GE2512 | t1909 | trcs-1 | IV | 9587541 | C>T | SNV | Missense | E373K | 428 | AADAC, AADACL2, AADACL3, AADACL4, NCEH1 | None |
| | GE1939 | t1745 | trcs-1 | IV | 9587985 | G>A | SNV | Nonsense | Q242* | | | |
| c | GE2112 | t2037 | unc-112 | V | 14692219 | C>T | SNV | Missense | R669Q | 720 | FERMT1, FERMT2, FERMT3 | Kindler syndrome [173650]; Leukocyte adhesion deficiency, type III [612840] |
| | GE2326 | t2106 | unc-112 | V | 14696546 | C>T | SNV | Splice site | | | | |
| gene-27 | GE1722 | t1435 | vps-33.1 | III | 8701605 | C>T | SNV | Nonsense | R159* | 603 | VPS33A, VPS33B, AC048338.1 | Mucopolysaccharidosis-plus syndrome [617303]; Arthrogryposis, renal dysfunction [208085] |
| | GE2366 | t1561 | vps-33.1 | III | 8702923 | G>A | SNV | Nonsense | W536* | | | |
| Q | GE2292 | t2114 | vps-39 | V | 14035713 | G>A | SNV | Nonsense | Q754* | 926 | VPS39 | None |
| | GE1937 | t2189 | vps-39 | V | 14036143 | C>T | SNV | Nonsense | W626* | | | |
| | GE2056 | t2016 | vps-39 | V | 14037839 | G>C | SNV | Nonsense | Y122* | | | |
| N | GE2153 | t1773 | wapl-1 | IV | 4444464 | C>T | SNV | Nonsense | W348* | 748 | WAPL | None |
| | GE2305 | t1867 | wapl-1 | IV | 4442749-4442872 | 122-bp deletion | Deletion | | | | | |
| p | GE2738 | t1833 | Y54G2A.73 | IV | 3000662 | A>T | SNV | Nonsense | L341* | 380 | None | None |
| | GE2387 | t1913 | Y54G2A.73 | IV | 3001767 | G>A | SNV | Nonsense | R252* | | | |
| | GE2884 | t1755 | Y54G2A.73 | IV | 3008481 | C>T | SNV | Splice site | | | | |
| gene-23 | GE1713 | t1433 | ZK688.9 | III | 7882477 | C>T | SNV | Nonsense | W135* | 281 | TIPRL | None |
| | GE2621 | t1587 | ZK688.9 | III | 7882717 | C>T | SNV | Splice site | | | | |
| gene-14 | GE2348 | t1518^d | zyg-8 | III | 12063671 | C>T | SNV | Nonsense | R312* | 802 | DCLK1, DCLK2, DCLK3, DCX | Lissencephaly, X-linked, 1; Subcortical laminar heterotopia, X-linked [300067] |
| | GE2362 | t1547^d | zyg-8 | III | 12063832 | G>A | SNV | Splice site | | | | |
| gene-33 | GE1718 | t1441^d | zyg-8 | III | 12069655 | A>G | SNV | Missense | D665G | | | |
| | GE2533 | t1638^d | zyg-8 | III | 12069369–12069742 | 372-bp deletion + 6-bp insertion | Deletion/insertion | | | | | |

^a Complementation group determined by complementation analysis of legacy mutants.

^b Amino acid position and size derived from the longest transcript (wormbase.org, version WS275).

^c Phenotypes retrieved from omim.org.

^d Previously sequenced allele.

There were 17 complementation groups that had no common gene candidates in the mapping region after our initial analysis. Three of these allele pairs were later shown to be allelic with other complementation groups and were assigned gene candidates accordingly (see below, Table 4, and Supplementary File S2). We were unable to confidently assign gene candidates for the remaining 14 complementation groups. However, Supplementary File S2 contains the full list of common gene mutations (in both coding and noncoding regions) for each complementation group. This list may be used in conjunction with additional genetic assays to elucidate identities for these genes in the future.
Table 4

Complementation tests for conflicting groups

| Legacy complementation group | Strain | Allele | Preliminary gene candidate | Mapped under | Complementation test results | Final gene assignment |
| --- | --- | --- | --- | --- | --- | --- |
| gene-28 | GE1742 | t1461 | bckd-1A | None of tested deficiencies | Fails to complement: GE2206, GE2627 | bckd-1A |
| gene-17 | GE2627 | t1603 | bckd-1A | tDf5 | Fails to complement: GE2206, GE1742 | bckd-1A |
| | GE2206 | t1514 | | tDf5 | Fails to complement: GE2627, GE1742 | |
| gene-15 | GE2220 | t1516 | top-3 | tDf5 | Fails to complement: GE2399, GE1735. Complements: GE2278 | top-3 |
| | GE2399 | t1559 | | tDf5 | Fails to complement: GE2220 | |
| gene-34 | GE2278 | t1502 | top-3 | None of tested deficiencies | Fails to complement: GE1735. Complements: GE2220 | unknown gene |
| gene-35 | GE1735 | t1470 | top-3 | None of tested deficiencies | Fails to complement: GE2278, GE2220 | double mutant: top-3 + unknown gene |
While the list of 58 genes includes many known essential genes, some of these known genes are represented here by novel alleles. Nineteen genes from this collection that had not previously been studied, or that were not represented by lethal or sterile mutants, were designated genes of interest (GOI; Table 3). These 19 GOI, represented by 40 alleles, were further characterized as part of this study. They include 14 genes (28 alleles) with a maternal-effect lethal phenotype and 5 genes (12 alleles) with a sterile phenotype.
Table 3

Genes of interest and associated phenotypes

| Strain | Allele | Gene | Protein function (a) | Amino acid change (a) | RNAi phenotype (b) | Mutant phenotype | Embryonic osmotic integrity defect |
| --- | --- | --- | --- | --- | --- | --- | --- |
| GE1936 | t1738 | atg-7 | E1 ubiquitin-activating-like enzyme orthologous to the autophagic budding yeast protein Apg7p | W311* | Growth variant; dauer body morphology variant; pathogen induced death increased; P granule localization defective; dauer development variant; protein aggregation variant; shortened life span; transgene subcellular localization variant; transgene expression variant; necrotic cell death variant; autophagy variant; antibody staining reduced | Dead embryos | No |
| GE1958 | t1726 | | | Q367* | | Dead embryos | No |
| GE2627 | t1603 | bckd-1A | Predicted mitochondrial protein with alpha-ketoacid dehydrogenase activity | W150* | Shortened life span; small | Dead embryos | Yes |
| GE2206 | t1514 | | | Q161* | | Dead embryos | Yes |
| GE2840 | t1860 | C34D4.4 | Predicted to have the following domain: Golgi apparatus membrane protein TVP23-like | W131* | | Unfertilized oocytes | N/A |
| GE2890 | t1821 | | | W101* | | Unfertilized oocytes | N/A |
| GE2734 | t2029 | C56A3.8 | Predicted to have 1-phosphatidylinositol 4-kinase activity | G62E | Larval lethal; accumulated germline cell corpses; germ cell compartment morphology variant; germline nuclear positioning variant; larval arrest; cell membrane organization biogenesis variant; embryonic lethal; rachis narrow; apoptosis variant; maternal sterile; reduced brood size | Unfertilized oocytes | N/A |
| GE2487 | t2149 | | | P82L | | Unfertilized oocytes | N/A |
| GE2886 | t2055 | | | E243K | | Unfertilized oocytes | N/A |
| GE2122 | t2007 | cept-2 | Predicted to have diacylglycerol cholinephosphotransferase activity and ethanolaminephosphotransferase activity | Splice site | Fat content reduced; embryonic lethal; long | Dead embryos | No |
| GE2047 | t2021 | | | W128* | | No eggs laid (Dead embryos) [ts] | Some |
| GE2275 | t1517 | cls-2 | Member of the CLASP family of microtubule-binding proteins | R102Q | Locomotion variant; mitosis variant; univalent meiotic chromosomes; no polar body formation; chromosome segregation variant karyomeres early emb; mitotic chromosome segregation variant; mitotic spindle defective early emb; chromosome segregation variant; embryonic lethal; meiotic spindle defective; meiotic progression during oogenesis variant; exploded through vulva; reduced brood size; antibody subcellular localization variant; meiotic chromosome segregation variant | Dead embryos | N/T |
| GE2357 | t1527 | | | G114R | | Dead embryos | No |
| GE1938 | t1742 | cpt-2 | Carnitine palmitoyl transferase | W194* | Embryonic lethal | Dead embryos | No |
| GE2447 | t1879 | | | Q141* | | Dead embryos | No |
| GE2407 | t1906 | D2096.12 | | L638* | Locomotion variant | Dead embryos | Some |
| GE2499 | t1877 | | | Q126* | | Dead embryos | Yes |
| GE2063 | t2042 | dgtr-1 | Acyl chain transfer enzyme | G310R | Sterile; sick; oocyte number decreased; germline nuclear positioning variant; oocyte septum formation variant; embryonic lethal; embryo OID early emb; oocyte morphology variant; pachytene region organization variant; reduced brood size; germ cell compartment expansion variant; oogenesis variant | Dead embryos | Some |
| GE2135 | t2043 | | | Splice site | | Dead embryos | Yes |
| GE2541 | t2035 | dlat-1 | Predicted to have dihydrolipoyllysine-residue acetyltransferase activity | P83L | Embryonic lethal; slow growth; receptor mediated endocytosis defective; pattern of transgene expression variant; sterile progeny; transgene expression increased; general pace of development defective early emb | Dead embryos | No |
| GE2335 | t2056 | | | Q419* | | Dead embryos | No |
| GE2402 | t1940 | F21D5.1 | Predicted to have phosphoacetyl-glucosamine mutase activity | A436V | Sterile; germ cell compartment size variant; rachis wide; rachis morphology variant; accumulated germline cell corpses; germ cell compartment morphology variant; germline nuclear positioning variant; embryonic lethal; embryo OID early emb; apoptosis variant; reduced brood size; oogenesis variant | Dead embryos | Yes |
| GE2445 | t1935 | | | L539F | | Dead embryos | Yes |
| GE2881 | t1744 | F56D5.2 | | S107F | | Unfertilized oocytes | N/A |
| GE2837 | t1791 | | | Q214* | | Unfertilized oocytes | N/A |
| GE2091 | t1772 | nstp-2 | Predicted to have UDP-N-acetylglucosamine and UDP-xylose transmembrane transporter activity | L277H | Lysosome-related organelle morphology variant; transgene subcellular localization variant; RAB-11 recycling endosome localization variant; RAB-11 recycling endosome morphology variant | Dead embryos | No |
| GE2288 | t1835 | | | G131R | | Dead embryos | No |
| GE2391 | t1932 | perm-5 | Predicted to have lipid binding activity | C454S | Sterile; apoptosis reduced; oocytes lack nucleus; oocyte number decreased; germ cell compartment morphology variant; germline nuclear positioning variant; germ cell compartment anucleate; oocyte septum formation variant; cell membrane organization biogenesis variant; embryonic lethal; embryo OID early emb; oogenesis variant; diplotene region organization variant | Dead embryos | Yes |
| GE2453 | t1900 | | | S323P | | Dead embryos | Yes |
| GE2827 | t1786 | T22B11.1 | | W35* | | Unfertilized oocytes [ts] | N/A |
| GE2895 | t1866 | | | W356* | | Unfertilized oocytes [ts] | N/A |
| GE2399 | t1559 | top-3 | Exhibits DNA topoisomerase type I (single strand cut, ATP-independent) activity | G59R | Chromosome morphology variant; hermaphrodite germline proliferation variant; antibody staining increased; somatic gonad development variant; gonad degenerate; chromosome instability; germ cell mitosis variant; gonad arm morphology variant; meiosis variant; oocyte morphology variant; nuclear appearance variant; fewer germ cells; oogenesis variant | Dead embryos | No |
| GE2220 | t1516 | | | Q602* | | Dead embryos | No |
| GE2512 | t1909 | trcs-1 | Putative arylacetamide deacetylase and microsomal lipase | E373K | Apoptosis reduced; diplotene absent during oogenesis; oocyte number decreased; embryo OID early emb; rachis narrow; chromosome condensation variant; pachytene region organization variant; membrane trafficking variant; pachytene progression during oogenesis variant; apoptosis fails to occur; egg laying variant; germ cell compartment expansion absent; embryonic lethal; cell membrane organization biogenesis variant; no oocytes; germ cell compartment expansion variant | Dead embryos [leaky ts] | Yes |
| GE1939 | t1745 | | | Q242* | | No eggs laid (dead embryos) [ts] | Yes |
| GE2884 | t1755 | Y54G2A.73 | | Splice site | | Unfertilized oocytes | N/A |
| GE2387 | t1913 | | | R252* | | Unfertilized oocytes | N/A |
| GE2738 | t1833 | | | L341* | | Unfertilized oocytes | N/A |
| GE1713 | t1433 | ZK688.9 | Predicted to have the following domain: TIP41-like protein (TOR signaling pathway regulator) | W135* | Egg laying variant; locomotion variant | Dead embryos | No |
| GE2621 | t1587 | | | Splice site | | Dead embryos | No |

[ts], temperature-sensitive; N/A, not applicable; N/T, not tested; —, no information available.

(a) From WormBase (WS275; wormbase.org); amino acid position derived from the longest transcript.

(b) Phenotypes retrieved from GExplore (genome.sfu.ca/gexplore).
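
The amino acid change notation used in Table 3 can be read mechanically: a trailing asterisk marks a stop codon (nonsense), and a trailing residue letter marks a substitution (missense). The following is a minimal Python sketch of that reading, applied to the five alleles flagged [ts] or [leaky ts] in Table 3. The function name and the dictionary are illustrative assumptions, not part of the study's analysis pipeline.

```python
import re
from collections import Counter


def classify_change(change: str) -> str:
    """Classify a Table 3 style amino acid change string (hypothetical helper)."""
    change = change.strip()
    if change.lower() == "splice site":
        return "splice site"
    # One-letter residue, position, then either a residue or '*' for a stop.
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z*])", change)
    if match is None:
        return "unknown"
    return "nonsense" if match.group(3) == "*" else "missense"


# Alleles marked [ts] or [leaky ts] in Table 3, with their listed changes.
TS_ALLELES = {
    "t2021": "W128*",   # cept-2
    "t1786": "W35*",    # T22B11.1
    "t1866": "W356*",   # T22B11.1
    "t1909": "E373K",   # trcs-1 (leaky ts)
    "t1745": "Q242*",   # trcs-1
}

ts_counts = Counter(classify_change(c) for c in TS_ALLELES.values())
print(dict(ts_counts))
```

Under this reading, four of the five temperature-sensitive alleles are nonsense alleles and one is a missense allele, in line with the counts given in the text.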


Validation of candidate gene assignments

After isolation, the mutant alleles were each localized to a chromosomal region through deficiency mapping. These data were used to corroborate the candidate genes derived from WGS analysis and to resolve complementation groups with more than one initial gene candidate. There were 53 complementation groups with only one common gene candidate when coding and splicing variants were restricted to the mapped region. This was considered to be strong evidence that we correctly identified the essential genes. For the five groups which had more than one gene candidate in the mapped region, the nature of the mutations and existing knowledge about the genes in question were used to select a single candidate (Supplementary File S2).
For the majority of complementation groups, the genomic position of the assigned gene is in agreement with the deficiency genetic mapping data (Figure 1). However, with limited information available, it was not possible to assign precise map coordinates to the molecular lesions of the deficiency strains which were used for mapping.
For three complementation groups, there is an apparent conflict between the deficiency mapping data and the gene candidates proposed through our analysis. These complementation groups were found to not map under any of the tested deficiencies, but were assigned gene candidates whose genomic coordinates fall into regions covered by the tested deficiencies (alleles of bckd-1A, top-3, and unc-112; Figure 1). In addition, two of these groups were assigned the same gene candidate as another, purportedly distinct, complementation group (Table 4). From WGS analysis, bckd-1A was the initial gene candidate for two different complementation groups, yet only one of these groups had been mapped to a deletion (tDf5) that covers the bckd-1A locus. Similarly, top-3 was the assigned gene candidate for three different complementation groups, only one of which was mapped under a deficiency (tDf5) encompassing that gene.
By performing complementation tests with select alleles (Table 4), we concluded that the two bckd-1A groups are not distinct, and indeed they contain mutations in the same gene. One of the groups (gene-35) originally identified as top-3 is a double mutant which fails to complement gene-15 (top-3) and gene-34 (unknown gene).
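
The candidate-selection logic described in this section (take the genes carrying coding or splicing variants in every allele of a complementation group, then restrict to the region defined by deficiency mapping) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function names, the variant tuple layout, the gene names, and the coordinates are all hypothetical placeholders.

```python
from typing import Iterable, Optional, Set, Tuple

# A variant is (gene, chromosome, position). This layout is an assumption
# made for illustration, not the format used in the study.
Variant = Tuple[str, str, int]
Region = Tuple[str, int, int]  # (chromosome, start, end)


def genes_in_region(variants: Iterable[Variant], region: Optional[Region]) -> Set[str]:
    """Genes hit by variants, optionally limited to a mapped region."""
    genes = set()
    for gene, chrom, pos in variants:
        if region is not None:
            r_chrom, start, end = region
            if chrom != r_chrom or not (start <= pos <= end):
                continue
        genes.add(gene)
    return genes


def common_candidates(alleles: Iterable[Iterable[Variant]],
                      region: Optional[Region] = None) -> Set[str]:
    """Genes mutated in every allele of one complementation group."""
    gene_sets = [genes_in_region(v, region) for v in alleles]
    if not gene_sets:
        return set()
    return set.intersection(*gene_sets)


# Hypothetical complementation group with two sequenced alleles.
allele_1 = [("geneA", "III", 9_700_000), ("geneB", "III", 2_000_000), ("geneC", "V", 500)]
allele_2 = [("geneA", "III", 9_708_000), ("geneB", "III", 2_100_000)]
mapped = ("III", 9_000_000, 10_000_000)  # hypothetical deficiency interval

print(common_candidates([allele_1, allele_2]))          # two shared genes
print(common_candidates([allele_1, allele_2], mapped))  # one gene remains
```

In this toy case the two alleles share two mutated genes genome-wide, and restricting to the mapped interval leaves a single candidate, which mirrors the situation reported for 53 of the complementation groups.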
Figure 1

Schematic of gene assignments and deficiency mapping. Genes and deficiencies are shown with their relative positions on chromosomes III–V (coordinates listed in Supplementary File S2). Approximate boundaries of each deficiency were determined by the coordinates of the closest gene known to lie outside of the deletion, when possible (indicated by a faded edge). If no such genes with physical coordinates are known, the outermost gene known to lie inside the deletion was used as the boundary (indicated by a sharp edge). Gene names are colored according to the deficiency under which the alleles were mapped. Gene names assigned to alleles that did not map under any of the tested deficiencies are highlighted in gray. top-3 and bckd-1A on chromosome III are represented by multiple complementation groups with conflicting results from deficiency mapping.

Three candidate genes (nstp-2, C34D4.4, and F56D5.2) were selected for additional validation by generating a deletion of the gene in a wild-type background using CRISPR-Cas9 genome editing (Norris ; Au ). These genes were chosen because they were expected to be of interest to the broader research community. The deletion alleles have been verified with the PCR protocol described by Au . Guide RNA sequences and deletion-flanking sequences are listed in Supplementary File S3. Complementation testing between the newly generated CRISPR-Cas9 deletion mutants and the legacy mutant strains confirmed that the mutations are allelic, and the genes assigned to the legacy strains are correct (Supplementary File S3).

Human orthologs, gene ontology, and expression patterns

Of the 58 essential gene candidates, 47 genes have predicted human orthologs (Table 2). Many of these genes in humans have been implicated in disease and are associated with OMIM disease phenotypes (Online Mendelian Inheritance in Man; omim.org). BLASTp searches revealed that the set of 19 GOI contains three nematode-specific genes (F56D5.2, perm-5, and T22B11.1) that have homologs in parasitic species, and two uncharacterized genes (D2096.12 and Y54G2A.73) that do not have homology outside the Caenorhabditis genus. To gain insight into the functions of the putative essential genes, an overrepresentation test was used to elucidate the most prominent GO terms associated with them. The biological process terms overrepresented in the set of 58 genes include such terms as organelle organization (GO:0006996), nuclear division (GO:0000280), cellular metabolic process (GO:0044237), and DNA repair (GO:0006281), as shown in Figure 2. In the molecular function category, binding (GO:0005488) and catalytic activity (GO:0003824) are overrepresented by 41 genes (adjusted P = 1.2E-07) and 28 genes (adjusted P = 1.8E-03), respectively. A complete list of overrepresented GO terms and associated genes can be found in Supplementary File S3.
Figure 2

Biological process GO terms overrepresented in the set of 58 putative essential genes. Bar length represents the number of genes in the set associated with each GO term. Overrepresentation was analyzed using PANTHER version 16.0 (Thomas ) and P-values were adjusted with the Bonferroni multiple testing correction. Results were filtered to include terms with adjusted P < 0.05 and edited to exclude redundant terms. A list of overrepresented GO terms and associated genes can be found in Supplementary File S3.
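
To make the idea of an overrepresentation test with Bonferroni correction concrete, the following Python sketch computes a one-sided hypergeometric P-value for a single GO term and then applies the correction. It is an illustration only: the statistical test and reference gene list used by PANTHER may differ, the function names are hypothetical, and the counts in the example are invented rather than taken from the study.

```python
from math import comb


def hypergeom_upper_tail(k, n, K, N):
    """P(X >= k): chance of seeing at least k annotated genes in a set of n
    drawn without replacement from N genes, K of which carry the term."""
    upper = min(n, K)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)


def bonferroni(p_values):
    """Multiply each P-value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical counts, for illustration only (not taken from the paper):
# a term annotated to 2,000 of 20,000 genes, found in 20 of a 58-gene set.
raw_p = hypergeom_upper_tail(k=20, n=58, K=2000, N=20000)
adjusted = bonferroni([raw_p, 0.2, 0.9])  # pretend three terms were tested
print(raw_p, adjusted)
```

With many GO terms tested at once, the multiplier in `bonferroni` becomes large, which is why only terms with very small raw P-values remain below the adjusted threshold of 0.05.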

To examine the timing of gene expression throughout the life cycle, gene expression data from the modENCODE project (Hillier ; Gerstein , 2014; Boeck ) was retrieved from GExplore (genome.sfu.ca/gexplore; Hutter ; Hutter and Suh 2016) for the 19 GOI (Supplementary File S4). For 10 of the GOI, these data show high levels of gene expression in the early embryonic stages as well as in adulthood, and particularly in the hermaphrodite gonad. This expression pattern is characteristic of a maternal-effect gene, for which gene products are passed on to the embryo from the parent. Five genes have a maternal gene expression pattern as well as expression throughout other stages of the life cycle, indicating an additional, zygotic role for the gene. Seven genes have elevated expression levels in males and L4-stage hermaphrodites. These genes are suspected to be involved in sperm production or fertilization, and the associated strains were subjected to mating assays (see below).

Temperature sensitivity and mating assays for genes of interest

The 40 alleles associated with the 19 GOI were further examined to gain insight into the phenotypic consequences of their mutations. Each allele was assayed for temperature sensitivity, as some of the original mutant screening was carried out at 25°C. Five alleles (marked with a [ts] phenotype in Table 3) were deemed temperature-sensitive and could proliferate as homozygotes at a permissive temperature of 15°C, while being maternal-effect lethal or sterile at a restrictive temperature of 25°C. Curiously, four of these temperature-sensitive alleles result from stop codons rather than missense mutations.
Seven candidate genes (16 alleles) were hypothesized to be involved in male fertility, based on the production of unfertilized oocytes by hermaphrodites and/or predominantly male gene expression patterns. These 16 strains were assayed for their ability to be rescued through mating with wild-type males. Fourteen of the strains were rescued by the mating assay, while two strains were not rescued (Table 5). Phenotypic rescue through mating was consistent among alleles of the same gene in five of the seven genes, while two genes (F56D5.2 and nstp-2) gave conflicting results between the two alleles in their complementation groups.
Table 5

Putative male fertility genes

| Strain | Allele | Gene | Observed mutant phenotype | Successful WT male rescue |
| --- | --- | --- | --- | --- |
| GE2627 | t1603 | bckd-1A | Dead embryos | Yes |
| GE2206 | t1514 | | Dead embryos | Yes |
| GE2840 | t1860 | C34D4.4 | Unfertilized oocytes | Yes |
| GE2890 | t1821 | | Unfertilized oocytes | Yes |
| GE2734 | t2029 | C56A3.8 | Unfertilized oocytes | Yes |
| GE2487 | t2149 | | Unfertilized oocytes | Yes |
| GE2886 | t2055 | | Unfertilized oocytes | Yes |
| GE2881 | t1744 | F56D5.2 | Unfertilized oocytes | No |
| GE2837 | t1791 | | Unfertilized oocytes | Yes |
| GE2091 | t1772 | nstp-2 | Dead embryos | No |
| GE2288 | t1835 | | Dead embryos | Yes |
| GE2827 | t1786 | T22B11.1 | Unfertilized oocytes [ts] | Yes |
| GE2895 | t1866 | | Unfertilized oocytes [ts] | Yes |
| GE2884 | t1755 | Y54G2A.73 | Unfertilized oocytes | Yes |
| GE2387 | t1913 | | Unfertilized oocytes | Yes |
| GE2738 | t1833 | | Unfertilized oocytes | Yes |

[ts], temperature-sensitive.
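
The per-gene consistency of the mating assay reported in Table 5 can be checked with a short tally. The Python sketch below transcribes the allele, gene, and rescue columns of Table 5 and groups them by gene. The variable names are illustrative, and the code is a reading aid rather than part of the study's methods.

```python
from collections import defaultdict

# (allele, gene, rescued by wild-type males), transcribed from Table 5.
TABLE_5 = [
    ("t1603", "bckd-1A", True), ("t1514", "bckd-1A", True),
    ("t1860", "C34D4.4", True), ("t1821", "C34D4.4", True),
    ("t2029", "C56A3.8", True), ("t2149", "C56A3.8", True), ("t2055", "C56A3.8", True),
    ("t1744", "F56D5.2", False), ("t1791", "F56D5.2", True),
    ("t1772", "nstp-2", False), ("t1835", "nstp-2", True),
    ("t1786", "T22B11.1", True), ("t1866", "T22B11.1", True),
    ("t1755", "Y54G2A.73", True), ("t1913", "Y54G2A.73", True), ("t1833", "Y54G2A.73", True),
]

# Collect the rescue outcomes observed for each gene.
outcomes_by_gene = defaultdict(set)
for allele, gene, rescued in TABLE_5:
    outcomes_by_gene[gene].add(rescued)

rescued_count = sum(rescued for _, _, rescued in TABLE_5)
# A gene is discordant when its alleles gave both outcomes.
discordant = sorted(g for g, outcomes in outcomes_by_gene.items() if len(outcomes) > 1)

print(rescued_count, "of", len(TABLE_5), "alleles rescued")
print("discordant genes:", discordant)
```

The tally reproduces the figures in the text: 14 of 16 alleles rescued, with F56D5.2 and nstp-2 as the two genes whose alleles disagree.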


Terminal phenotypes of maternal-effect lethal embryos

Using DIC microscopy, the terminal phenotypes of 28 maternal-effect lethal strains (a subset of the 40 GOI strains) were observed. Representative images were selected and compiled into a catalog of terminal phenotypes (Supplementary File S5). Ten strains showed an OID phenotype (as described in Sönnichsen ) in nearly all embryos after incubation in distilled water, while three additional strains had only some embryos that exhibited this phenotype (Table 3). The OID phenotype was evident in embryos that filled the eggshell completely [for example, dgtr-1(t2043), Figure 3A] and eggs that burst in their hypotonic surroundings. Early embryonic arrest was observed in embryos from the two dlat-1 mutant alleles (t2035 and t2056), which arrested most often with only one to four cells (e.g., Figure 3B). Eleven strains had embryos that terminated with approximately 100–200 cells [for example, ZK688.9(t1433), Figure 3C], while four strains developed into two- or threefold stage embryos that did not hatch and exhibited clear morphological defects, such as nstp-2(t1835) with a lumpy body wall and constricted nose tip (Figure 3D).
Figure 3

Embryonic arrest visualized with DIC microscopy for select maternal-effect lethal mutants. Eggs were dissected from homozygous mutants and imaged immediately (A) or incubated in distilled water overnight before imaging (B–D). (A) Eggs dissected from dgtr-1(t2043) homozygotes exhibit signs of an osmotic integrity defect, by filling the eggshell completely. (B) dlat-1(t2035) embryos exhibit early embryonic arrest, with most embryos consisting of four cells or less. (C) ZK688.9(t1433) embryos arrest with approximately 100 cells. (D) Terminal embryos of nstp-2(t1835) have a lumpy body wall morphology and constricted nose; most animals were moving inside the eggshell but did not hatch. All scale bars represent 10 μm.


Discussion

Revisiting legacy mutant collections with WGS

In this study, we focused on reexamining legacy collections of C. elegans mutants isolated before the complete genome sequence was published (The ) and long before massively parallel sequencing was widely available. With major advances in sequencing technology in the past 30 years (reviewed in Goodwin ), WGS has become affordable and accessible, making it possible to revisit past projects with new approaches and advanced capabilities. We have sequenced paired alleles from 75 complementation groups on chromosomes III–V, from which we identified 58 putative essential genes (Table 2). While WGS is a powerful tool, it does not stand alone as a solution to identifying mutant alleles. This study has shown the power of having multiple alleles in a complementation group when faced with the abundance of genomic variants found in WGS analysis. Indeed, when we sequenced four single alleles, which had no complementation pairs, we were unable to designate a single mutation as the variant responsible for maternal-effect lethality (data not shown). Our approach to gene identification proved to be effective and was validated by a combination of different methods. The blind test set of 18 previously sequenced alleles from which eight of nine genes were readily identified serves as an important validation of our analysis pipeline and gives confidence in the results we obtained. In addition, the deficiency mapping data, gene expression patterns from the modENCODE project, GO term analysis, and phenotypes documented from previous experiments provide evidence to support the gene candidates we assigned in these mutant collections. The CRISPR-Cas9 deletion alleles we generated for selected gene candidates provide additional validation and will be made available to the research community to serve as useful tools for future studies. 
While the mutant alleles from the original study have been outcrossed, the genetic balancer background and additional mutations that persist can complicate phenotypic analysis. In contrast, these new CRISPR-Cas9 deletion strains were made in a wild-type background, which makes it much easier to handle them and interpret their mutant phenotypes. Furthermore, the pharyngeal GFP expression introduced by the gene-editing approach acts as a dominant and straightforward marker for tracking the alleles in a heterozygous population. This is useful, as the homozygous animals do not produce viable progeny.
The complementation groups that could not be assigned gene candidates in our analysis may have been complicated by variants in noncoding regions, poor sequencing coverage, or inaccurate complementation pairing, among other possibilities. In future work, tracking down the genes we were unable to identify will require repeating complementation tests and re-tooling the analysis approach.
Among the 19 GOI are four temperature-sensitive alleles, all the result of nonsense mutations. While unusual, temperature-sensitive nonsense alleles are not unprecedented and have been found in several organisms (e.g., Golden and Riddle 1984; Samson ). Both nonsense alleles of T22B11.1 are temperature-sensitive. This leads us to suspect that the wild-type process in which this gene is involved is itself temperature-dependent. This idea stems from the observation that all alleles of the dauer constitutive genes daf-4 and daf-7 are temperature-sensitive (Golden and Riddle 1984). These genes have both amber stop alleles and missense alleles, and all are temperature-sensitive. If T22B11.1 were indeed involved in a temperature-dependent process, we would expect a deletion allele to also be temperature-sensitive. The gene trcs-1 also has a temperature-sensitive nonsense allele and, in addition, it has a leaky temperature-sensitive missense allele. Again, the product of this gene may be involved in a temperature-dependent process. A different explanation is required for cept-2, where we have identified two alleles and only one, the nonsense allele, is temperature-sensitive. In Drosophila, the elav gene has temperature-sensitive alleles that are nonsense alleles and yet they make full-length proteins (Samson ). A detailed study of this gene and its gene product concluded that, at some low level, an alternative amino acid is substituted at the stop site, allowing for a full-length but unstable protein (Samson ). This type of informational suppression suggests we may observe a low-abundance, full-length protein product for cept-2.

GO analysis reveals common themes and gaps in our knowledge

The underlying biological themes of the 58 putative essential genes were revealed by examining their GO terms. The biological processes represented in Figure 2 help to confirm the nature of this set, as a collection of genes that are required for essential functions such as cell division, metabolism, and development. Performing GO-term analysis also revealed that a number of the genes in this collection lacked sufficient annotation to be interpreted this way. We found four genes about which there is little to nothing known (D2096.12, F56D5.2, T22B11.1, and Y54G2A.73). For example, F56D5.2 is a gene with no associated GO terms, no known protein domains, and no orthologs in other model organisms. These wholly uncharacterized genes are intriguing candidates which may help uncover new biological processes and biochemical pathways that are evidently fundamental to life for this organism.

Examining expression patterns leads to discovery of genes involved in male fertility

The life stage-specific expression patterns (Supplementary File S4) provide some insight into the roles the genes in this collection play in development. Fifteen of the nineteen GOI are highly expressed in the early embryo and hermaphrodite gonad, which suggests that the gene product is passed on to the embryo from the parent. Five of these maternal genes also have elevated expression during late embryonic and larval stages, which suggests they are pleiotropic. The zygotic functions of these genes must be nonessential or else a zygotic lethal, rather than maternal-effect lethal, phenotype would be observed.
We also identified four genes that are most highly expressed in males and L4 hermaphrodites, as well as three genes that have prominent male expression in addition to characteristic maternal expression patterns. Mating assays confirmed that these male-expressed genes have an essential role in male fertility. Studies have shown that genes expressed in sperm are largely insensitive to RNAi (Fraser ; Gönczy ; Reinke ; del Castillo-Olivares ; Zhu ; Ma ), making these types of genes particularly difficult to identify in high-throughput RNAi screens. With the availability of RNA-seq data across different life stages for nearly every gene in the C. elegans genome (Hillier ; Gerstein , 2014; Boeck ; Tintori ; Packer ), screening for characteristic gene expression patterns may be a useful approach for identifying sterile and maternal-effect lethal genes that remain to be discovered.
We propose that the seven male-expressed genes are involved in sperm production and/or function (Table 5). These genes are mostly uncharacterized, and this is the first report of their involvement in male fertility. While the mutant hermaphrodites lay unfertilized oocytes (5 genes) or dead eggs (2 genes), this phenotype could be rescued in 14 of the 16 alleles by the introduction of wild-type sperm through mating. Each of the two alleles that could not be rescued had an allelic partner in the same complementation group that was rescued in the mating assay. One of these discrepancies, between F56D5.2(t1744) and F56D5.2(t1791), was resolved when we found a second mutation in a nearby essential gene that was likely responsible for the inability of one strain to be rescued (data not shown). The presence of additional lethal mutations in the genome is unsurprising given the nature of chemical mutagenesis, and it reinforces the advantage of having multiple alleles for a gene when interpreting mutant phenotypes.

Interpreting terminal phenotypes of maternal-effect lethal mutants

The catalog of terminal phenotypes (Supplementary File S5) created in this study provides a window into the roles the maternal-effect genes play in development. Some of these phenotypes corroborate previously observed phenotypes from RNAi studies. For example, RNAi knockdown experiments have shown that DLAT-1 is an enzyme involved in metabolic processes required for cell division in one-cell C. elegans embryos (Rahman ). We uncovered two alleles of dlat-1 in this study (t2035 and t2056) in which most embryos arrest at the one- to four-cell stage (Figure 3B). The mutant alleles presented here can confirm previously reported phenotypes and serve as new genetic tools for continuing the study of essential gene function.
We also identified alleles for six genes that exhibit an OID phenotype, resulting in embryos that filled the eggshell completely or burst in distilled water. More than 100 genes have been identified in RNAi screens as important for the osmotic integrity of developing embryos (reviewed in Stein and Golden 2018). Some of these genes have roles in lipid metabolism (Rappleye ; Benenati ), cellular trafficking (Rappleye ), and chitin synthesis (Johnston ). Four of the six genes identified with OID mutants in this study have been previously implicated in osmotic sensitivity: dgtr-1 is involved in lipid biosynthesis (Carvalho ; Olson ); trcs-1 is involved in lipid metabolism and membrane trafficking (Green ); perm-5 is predicted to have lipid binding activity; and F21D5.1 is an ortholog of human PGM3, an enzyme involved in the hexosamine pathway, which generates substrates for chitin synthase. We found OID mutants for two additional genes that were not previously characterized with this phenotype, bckd-1A and D2096.12. bckd-1A is a component of the branched-chain alpha-keto dehydrogenase complex, which is involved in fatty acid biosynthesis (Kniazeva ); this may be indicative of a role in generating or maintaining the lipid-rich permeability barrier. D2096.12 is a Caenorhabditis-specific gene with no known protein domains. Elucidating the function of this uncharacterized gene may lead to new insights about the biochemistry of eggshell formation and permeability in C. elegans embryos.
Most of the mutant strains we examined with DIC microscopy arrested around the 100- to 200-cell stage as a seemingly disorganized group of cells (e.g., Figure 3C). Others developed into two-fold or later stage embryos that moved inside the eggshell but did not hatch (e.g., Figure 3D). The terminal phenotypes documented here reveal how long the embryo can persist without the maternal contribution of gene products, and the developmental defects that ensue. Future studies might make use of fluorescent markers and automated cell lineage tracking (e.g., Thomas ; Schnabel ; Bao ; Wang ) as well as single-cell transcriptome data (Tintori ; Packer ) to further investigate these essential genes.

Relevance beyond C. elegans

In this collection of 58 putative essential genes, 47 (81%) have human orthologs, a two-fold enrichment over C. elegans genes as a whole, 41% of which have human orthologs (Kim ). This is in line with previous findings that essential genes are more often phylogenetically conserved than nonessential genes (Hughes 2002; Jordan ; Georgi ). Essential genes in model organisms are often associated with human diseases (Culetto and Sattelle 2000; Silverman ; Dickerson ; Qin ), making the alleles identified in this study potentially relevant to understanding human health. Indeed, there are OMIM disease phenotypes associated with a number of the human orthologs listed in Table 2. Novel mutant alleles in C. elegans may help us better understand genetic disorders by providing new opportunities to interrogate gene function, explore genetic interactions, and screen prospective therapeutics.
Nematode-specific genes that are essential are important to nematode biology in general and are particularly relevant in parasitic nematology. We found three genes in our GOI list (F56D5.2, perm-5, and T22B11.1) that have orthologs in parasitic nematode species and not in other phyla. With growing anthelminthic drug resistance around the world (Jabbar ), novel management strategies are needed to combat parasitic nematodes, which infect crops, livestock, and people worldwide (Nicol ; Wolstenholme ; Hotez ). Essential genes are desirable targets for drug development, yet identifying such genes in parasites experimentally is difficult (Kumar ; Doyle ). Thus, as a free-living nematode, C. elegans is a widely used model for genetically intractable parasitic species (Bürglin ; Hashmi ). Our identification of novel essential genes with orthologs in parasitic nematodes may provide new opportunities to explore management strategies.
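
The human-ortholog enrichment quoted in this section (47 of 58 genes, against a genome-wide rate of 41%) is simple arithmetic, and the minimal Python sketch below reproduces it. The function name `ortholog_enrichment` is a hypothetical helper introduced only for illustration; it is not part of the authors' analysis pipeline, and no statistical test is implied.

```python
def ortholog_enrichment(with_orthologs, total, background_fraction):
    """Return (fraction of the gene set with orthologs, fold enrichment over background)."""
    fraction = with_orthologs / total
    return fraction, fraction / background_fraction


# Counts taken from the text: 47 of 58 putative essential genes have human orthologs,
# compared with 41% of all C. elegans genes.
fraction, fold = ortholog_enrichment(47, 58, 0.41)
print(f"{fraction:.0%} of genes have human orthologs; fold enrichment = {fold:.2f}")
# prints: 81% of genes have human orthologs; fold enrichment = 1.98
```

The fold enrichment of about 1.98 is what the text rounds to "two-fold".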
It is our hope that the alleles and phenotypes presented here will serve as a starting point and guide future research to elucidate the specific roles these genes play in embryogenesis. All of the alleles presented in this study are available to the research community through the Caenorhabditis Genetics Center (cgc.umn.edu) and we anticipate they will serve as a valuable resource in the years to come. The wealth of material uncovered in this specific legacy collection will hopefully inspire similar explorations of other frozen mutant collections.

Data availability

The raw sequence data from this study have been deposited in the NCBI Sequence Read Archive (SRA; ncbi.nlm.nih.gov/sra) under accession number PRJNA628853. Supplemental material is available at figshare: Supplementary File S1 (Detailed experimental methods from the generation of Collection B); Supplementary File S2 (Gene candidate selection, Alleles and associated publications, Common gene hits with deficiency mapping for each complementation group); Supplementary File S3 (CRISPR-Cas9 deletion alleles and associated sequences, GO terms and associated genes); Supplementary File S4 (Life stage-specific expression patterns); Supplementary File S5 (Terminal phenotypes of maternal-effect lethal embryos). Figshare DOI: https://doi.org/10.25386/genetics.14702139

References (110 in total; first 10 listed)

1.  Whole-genome sequencing and variant discovery in C. elegans.

Authors:  LaDeana W Hillier; Gabor T Marth; Aaron R Quinlan; David Dooling; Ginger Fewell; Derek Barnett; Paul Fox; Jarret I Glasscock; Matthew Hickenbotham; Weichun Huang; Vincent J Magrini; Ryan J Richt; Sacha N Sander; Donald A Stewart; Michael Stromberg; Eric F Tsung; Todd Wylie; Tim Schedl; Richard K Wilson; Elaine R Mardis
Journal:  Nat Methods       Date:  2008-01-20       Impact factor: 28.547

2.  Four-dimensional imaging: computer visualization of 3D movements in living specimens.

Authors:  C Thomas; P DeVries; J Hardin; J White
Journal:  Science       Date:  1996-08-02       Impact factor: 47.728

3.  Systematic exploration of essential yeast gene function with temperature-sensitive mutants.

Authors:  Zhijian Li; Franco J Vizeacoumar; Sondra Bahr; Jingjing Li; Jonas Warringer; Frederick S Vizeacoumar; Renqiang Min; Benjamin Vandersluis; Jeremy Bellay; Michael Devit; James A Fleming; Andrew Stephens; Julian Haase; Zhen-Yuan Lin; Anastasia Baryshnikova; Hong Lu; Zhun Yan; Ke Jin; Sarah Barker; Alessandro Datti; Guri Giaever; Corey Nislow; Chris Bulawa; Chad L Myers; Michael Costanzo; Anne-Claude Gingras; Zhaolei Zhang; Anders Blomberg; Kerry Bloom; Brenda Andrews; Charles Boone
Journal:  Nat Biotechnol       Date:  2011-03-27       Impact factor: 54.908

4.  A Transcriptional Lineage of the Early C. elegans Embryo.

Authors:  Sophia C Tintori; Erin Osborne Nishimura; Patrick Golden; Jason D Lieb; Bob Goldstein
Journal:  Dev Cell       Date:  2016-08-22       Impact factor: 12.270

5.  A high-resolution C. elegans essential gene network based on phenotypic profiling of a complex tissue.

Authors:  Rebecca A Green; Huey-Ling Kao; Anjon Audhya; Swathi Arur; Jonathan R Mayers; Heidi N Fridolfsson; Monty Schulman; Siegfried Schloissnig; Sherry Niessen; Kimberley Laband; Shaohe Wang; Daniel A Starr; Anthony A Hyman; Tim Schedl; Arshad Desai; Fabio Piano; Kristin C Gunsalus; Karen Oegema
Journal:  Cell       Date:  2011-04-29       Impact factor: 41.582

6.  lis-1 is required for dynein-dependent cell division processes in C. elegans embryos.

Authors:  Moira M Cockell; Karine Baumer; Pierre Gönczy
Journal:  J Cell Sci       Date:  2004-09-01       Impact factor: 5.285

7.  Germline transformation of Caenorhabditis elegans by injection.

Authors:  Pavan Kadandale; Indrani Chatterjee; Andrew Singson
Journal:  Methods Mol Biol       Date:  2009

8.  Latrophilin signaling links anterior-posterior tissue polarity and oriented cell divisions in the C. elegans embryo.

Authors:  Tobias Langenhan; Simone Prömel; Lamia Mestek; Behrooz Esmaeili; Helen Waller-Evans; Christian Hennig; Yuji Kohara; Leon Avery; Ioannis Vakonakis; Ralf Schnabel; Andreas P Russ
Journal:  Dev Cell       Date:  2009-10       Impact factor: 12.270

9.  Maternal-effect lethal mutations on linkage group II of Caenorhabditis elegans.

Authors:  K J Kemphues; M Kusch; N Wolf
Journal:  Genetics       Date:  1988-12       Impact factor: 4.562

10.  GExplore: a web server for integrated queries of protein domains, gene expression and mutant phenotypes.

Authors:  Harald Hutter; Man-Ping Ng; Nansheng Chen
Journal:  BMC Genomics       Date:  2009-11-16       Impact factor: 3.969

