Zhaozhao Qin, Robert Johnsen, Shicheng Yu, Jeffrey Shih-Chieh Chu, David L Baillie, Nansheng Chen.
Abstract
Using combined genetic mapping, Illumina sequencing, bioinformatics analyses, and experimental validation, we identified 60 essential genes from 104 lethal mutations in two genomic regions of Caenorhabditis elegans totaling ∼14 Mb on chromosome III(mid) and chromosome V(left). Five of the 60 genes had not previously been shown to have lethal phenotypes by RNA interference depletion. By analyzing the regions around the lethal missense mutations, we identified four putative new protein functional domains. Furthermore, functional characterization of the identified essential genes shows that most are enzymes, including helicases, tRNA synthetases, and kinases in addition to ribosomal proteins. Gene Ontology analysis indicated that essential genes often encode enzymes that conduct nucleic acid binding activities during fundamental processes, such as intracellular DNA replication, transcription, and translation. Analysis of the essential genes shows that they have fewer paralogs, encode proteins that are in protein interaction hubs, and are highly expressed relative to nonessential genes. All these essential gene traits in C. elegans are consistent with those of human disease genes. Most human orthologs (90%) of the essential genes in this study are related to human diseases. Therefore, functional characterization of essential genes underlines their importance as proxies for understanding the biological functions of human disease genes.
Keywords: essential gene; functional characterization; genetic balancer; lethal; whole genome sequencing (WGS)
Year: 2018 PMID: 29339407 PMCID: PMC5844317 DOI: 10.1534/g3.117.300338
Source DB: PubMed Journal: G3 (Bethesda) ISSN: 2160-1836 Impact factor: 3.154
Figure 1. Physical deficiency map of LGIII(mid) and LGV(left). The top figure shows the middle region (4.5 Mb) of chromosome III that is balanced by a free duplication (sDp3), which has been subdivided into 14 main zones. The bottom figure shows the left half of chromosome V that is balanced by a reciprocal translocation (eT1), which has been subdivided into 22 main zones and several additional subzones by sets of overlapping rearrangements (deficiencies and one duplication).
Average number of SNVs per strain with 20–100% variation frequency in the eT1(V) balanced region, before and after the removal of background variations
| | Before Removal of Background Mutations/Strain | After Removal of Background Mutations/Strain |
|---|---|---|
| Chromosome V | 1357 | 72 |
| Genetic coding region | 282 | 13 |
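The filtering summarized in this table can be sketched in a few lines. This is a minimal illustration, not the authors' pipeline: it assumes that a background variant is one carried by more than one strain (the `min_strains` threshold, the function name, and the toy data are all hypothetical), and it keeps only SNVs whose variation frequency lies in the 20–100% window named in the caption.

```python
from collections import Counter


def filter_snvs(strain_snvs, min_freq=0.20, min_strains=2):
    """Keep SNVs in the frequency window that are not shared background.

    strain_snvs maps strain name -> {(chrom, pos, ref, alt): frequency}.
    Assumption (not from the source): a variant carried by `min_strains`
    or more strains is treated as background and removed.
    """
    # Count how many strains carry each variant within the frequency window.
    carriers = Counter(
        key
        for snvs in strain_snvs.values()
        for key, freq in snvs.items()
        if min_freq <= freq <= 1.0
    )
    background = {key for key, n in carriers.items() if n >= min_strains}
    return {
        strain: {
            key: freq
            for key, freq in snvs.items()
            if min_freq <= freq <= 1.0 and key not in background
        }
        for strain, snvs in strain_snvs.items()
    }


# Hypothetical toy data: one shared variant, one low-frequency variant.
toy = {
    "strainA": {
        ("V", 100, "G", "A"): 0.95,  # shared with strainB -> background
        ("V", 200, "C", "T"): 0.50,  # unique, kept
        ("V", 300, "A", "G"): 0.10,  # below 20%, dropped
    },
    "strainB": {
        ("V", 100, "G", "A"): 0.90,
        ("V", 400, "T", "C"): 0.45,
    },
}
filtered = filter_snvs(toy)
print({strain: sorted(snvs) for strain, snvs in filtered.items()})
```

Averaging `len(snvs)` over strains before and after the call would give per-strain counts of the kind reported in the table.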
The number of essential genes identified in this study
| Essential Genes Sequenced | Allele Sequenced | Genes Identified | Total Essential Genes Identified |
|---|---|---|---|
| 43 genes | 37 genes with single allele sequenced | 21 of 37 genes identified | 27 of 43 genes |
| | Six genes with two alleles sequenced | Six of six genes identified | |
| 46 genes | Six genes with single allele sequenced | Five of six genes identified | 35 of 46 genes |
| | 40 genes with two alleles sequenced | 30 of 40 genes identified | |
| Total: 89 genes | | | 62 (60 mapped molecularly) |
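The totals in this table follow from simple addition over the four allele groups. The short sketch below recomputes them; the variable and function names are illustrative only.

```python
# (genes sequenced, genes identified) for each of the four allele groups.
groups = [(37, 21), (6, 6), (6, 5), (40, 30)]
first_block, second_block = groups[:2], groups[2:]


def totals(block):
    """Sum sequenced and identified gene counts over a list of groups."""
    return sum(s for s, _ in block), sum(i for _, i in block)


total_sequenced, total_identified = totals(groups)
print(totals(first_block))    # (43, 27)
print(totals(second_block))   # (46, 35)
print(total_sequenced, total_identified)  # 89 62
```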
Figure 2. Three examples of PCR sequencing validation and rescue assay. (A) Gene validation of algn-1/T26AA5.4 for let-747 on chromosome III. (B) Gene validation of T08B1.1 for let-327 on chromosome V. The gene model in each figure shows that the designed primers cover all coding exons of each gene. The red arrows are forward primers while the arrows in dark blue are reverse primers. Under each gene model are portions of the Sanger sequencing results obtained using the forward and reverse PCR primers as sequencing primers. For let-747/algn-1/T26AA5.4, a point mutation (G->A) was found in the last exon, causing a premature stop codon. A C->T point mutation was found in the second exon of T08B1.1a and the first exon of T08B1.1b. This mutation causes a premature stop codon in both T08B1.1 transcripts. (C) Fosmids used for the rescue assay. The fosmid (WRM0619aE07) containing the candidate gene, F11A3.2, is in the red box while WRM066cA08 without F11A3.2 is in the green box. F11A3.2 is in the black box with the designed forward primer and reverse primer underneath. The gel image shows the presence of F11A3.2 in WRM0619aE07 with a target band of 794 bp and the absence of F11A3.2 in WRM066cA08 without the 794 bp PCR product. The weak band of ∼200 bp might be a nonspecific band because it appears in the products from both fosmid templates.
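How a single-base substitution can create a premature stop codon is easy to illustrate. The codons below (TGG and CAG) are hypothetical examples chosen for the sketch; the caption does not state which codons are affected in algn-1 or T08B1.1, and the function names are not from the source.

```python
# The three stop codons of the standard genetic code (DNA alphabet).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def mutate(codon, offset, alt):
    """Return the codon with the base at `offset` (0-2) replaced by `alt`."""
    return codon[:offset] + alt + codon[offset + 1:]


def is_nonsense(codon, offset, alt):
    """True if the substitution turns a sense codon into a stop codon."""
    return codon not in STOP_CODONS and mutate(codon, offset, alt) in STOP_CODONS


# Hypothetical G->A example: TGG (Trp) becomes TGA (stop).
print(mutate("TGG", 2, "A"), is_nonsense("TGG", 2, "A"))
# Hypothetical C->T example: CAG (Gln) becomes TAG (stop).
print(mutate("CAG", 0, "T"), is_nonsense("CAG", 0, "T"))
# A substitution that changes the amino acid but does not create a stop.
print(mutate("TGT", 1, "A"), is_nonsense("TGT", 1, "A"))
```

A real check would first place the mutated base in its reading frame on the coding strand of each transcript, which is why one genomic change can be scored separately for T08B1.1a and T08B1.1b.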
Biological functions of the 60 essential genes
| Chr | Gene Name | Protein Function | Pathway | Evolutionary Conservation |
|---|---|---|---|---|
| III | | Unknown | Unknown | N |
| III | | Unknown | Unknown | N |
| III | | Zinc-regulated transcription factor | In | N |
| III | | | In | N |
| V | | Enzyme: phosphoethanolamine N-methyltransferase | PMT-2 lacks known mammalian orthologs, but has orthologs in parasitic nematodes, fish, plants, and bacteria. PMT-1 only catalyzes the conversion of phosphoethanolamine to phospho-monomethylethanolamine, which is the first step in the PEAMT-pathway. | N |
| V | | Unknown | Unknown | N |
| V | | Unknown | Unknown | N |
| V | | Unknown | Unknown | N |
| III | | Extracellular matrix protein fibrillin | Fibrillin is a glycoprotein, which is essential for the formation of elastic fibers found in connective tissues. | N, I, M |
| III | | Unknown | Unknown | N, I, M |
| III | | Unknown | Unknown | N, I, M |
| V | | Member of solute carriers family | Predicted to have transmembrane transporter activity. | N, I, M |
| V | | Son of sevenless homolog | Its ortholog in mouse is a catalytic component of a trimeric complex that participates in transduction of signals from Ras to Rac by promoting the Rac-specific guanine nucleotide exchange factor activity. SOS-1 is involved in multiple Ras-dependent signaling pathways and also interacts with LET-23 and LET-60 during vulval development. | N, I, M |
| V | | Member of solute carriers family, riboflavin transporter | In | N, I, M |
| V | | Highly similar to the extracellular matrix proteins papilin and lacunin | In | N, I, M |
| V | | Sodium/calcium exchangers | Mediates the electrogenic exchange of Ca2+ against Na+ ions across the cell membrane, thereby contributing to the regulation of cytoplasmic Ca2+ levels and Ca2+-dependent cellular processes. | N, I, M |
| III | | Enzyme: 3-hydroxy-3-methylglutaryl-coenzyme A reductase | HMG-CoA reductase catalyzes the conversion of HMG-CoA to mevalonate, which is a rate-limiting step in sterol biosynthesis. | N, I, M, F |
| III | | mRNA splicing: pre-mRNA-processing-splicing factor | Its ortholog in yeast is a component of the U4/U6-U5 snRNP complex and participates in spliceosomal assembly through its interaction with the U1 snRNA. | N, I, M, F |
| III | | Enzyme: RNA polymerase II (B) subunit | The second largest subunit B150 of RNA polymerase II is the enzyme that produces the primary transcript. | N, I, M, F |
| III | | Enzyme: cyclin-dependent kinase | Its ortholog in yeast phosphorylates both the RNA pol II subunit to affect transcription, and the ribosomal protein to increase translational fidelity. | N, I, M, F |
| III | | Enzyme: arginyl(R) aminoacyl tRNA synthetase in mitochondria | The tRNA synthetase catalyzes the attachment of an amino acid to its cognate transfer RNA molecule. | N, I, M, F |
| III | | Enzyme: ATP-dependent RNA helicase | Its orthologs in yeast and human are involved in the biogenesis of ribosomal subunits. | N, I, M, F |
| III | | Enzyme: chitobiosyldiphosphodolichol β-mannosyltransferase | Mannosyltransferase is involved in asparagine-linked glycosylation in the endoplasmic reticulum. | N, I, M, F |
| III | | Ribosome-related protein: rRNA-processing protein | The protein is involved in ribosomal subunit biogenesis. | N, I, M, F |
| III | | mRNA splicing: splicing factor 3B subunit 1 | U2-snRNP associated splicing factor forms extensive associations with the branch site-3′ splice site-3′ exon region upon prespliceosome formation. | N, I, M, F |
| III | | Mammalian bystin (adhesion protein) related protein | Its ortholog in yeast is required for pre-rRNA processing and 40S ribosomal subunit synthesis. | N, I, M, F |
| III | | Enzyme: ubiquitin activating enzyme | RFL-1 activity is required for proper cytokinesis and spindle orientation. | N, I, M, F |
| III | | Ribosome-related protein: ribosomal protein, small subunit | Protein component of the small ribosomal subunit involved in protein biosynthesis. | N, I, M, F |
| III | | Component of the Sorting and Assembly Machinery (SAM) complex | SAM complex binds precursors of beta-barrel proteins and facilitates their outer membrane insertion. | N, I, M, F |
| III | | Kinesin-related motor protein | Its ortholog in yeast is required for mitotic spindle assembly and chromosome segregation. | N, I, M, F |
| III | | Enzyme: ATP-dependent RNA helicase | Its ortholog in yeast is involved in mRNA decay and rRNA processing. | N, I, M, F |
| III | | Ribosome-related protein: ribosomal protein (small) in mitochondria | Ribosomal subunit biogenesis | N, I, M, F |
| III | | Enzyme: mitochondrial ATP synthase subunit | Evolutionarily conserved enzyme complex that is required for ATP synthesis. | N, I, M, F |
| III | | Enzyme: 3-hydroxyisobutyryl-CoA hydrolase | Biological process unknown | N, I, M, F |
| III | | Heat shock protein | Its ortholog in yeast is an ATPase component of the heat shock protein Hsp90 chaperone complex and serves as nucleotide exchange factor to load ATP onto the SSA class of cytosolic Hsp70s. | N, I, M, F |
| V | | Enzyme: Rab family GTPase | Intracellular vesicle trafficking, such as the ER-to-Golgi step of the secretory pathway. | N, I, M, F |
| V | | Peroxisomal biogenesis factor | Its ortholog in yeast heterodimerizes with Pex1p and participates in the recycling of the peroxisomal signal receptor from the peroxisomal membrane to the cytosol. | N, I, M, F |
| V | | Enzyme: Mannose phosphate isomerase | Its ortholog in yeast catalyzes the interconversion of fructose-6-P and mannose-6-P, which is required for early steps in protein mannosylation. | N, I, M, F |
| V | | Member of solute carriers family | Predicted to have transmembrane transporter activity. | N, I, M, F |
| V | | Translation initiation factor | Based on the homology to yeast, the products of C37C3.2 are predicted to function during translation initiation as GTPase activators to stimulate GTP hydrolysis by eIF2-GTP-Met-tRNAi. | N, I, M, F |
| V | | Enzyme: RNA polymerase I/III (A/C) shared subunit | Common component of RNA polymerases I and III, which synthesize ribosomal RNA precursors and small RNAs, such as 5S rRNA and tRNAs. | N, I, M, F |
| V | | Cleavage and polyadenylation specificity factor | Required for 3′ processing, splicing, and transcriptional termination of mRNAs and snoRNAs. | N, I, M, F |
| V | | HEAT repeat-containing protein | SOAP is involved in a pathway that controls the apical delivery of E-cad and morphogenesis. | N, I, M, F |
| V | | Nucleolar complex protein-like DNA replication regulator | Its ortholog in yeast binds to chromatin at active replication origins and is required for pre-RC formation as well as maintenance during DNA replication licensing. | N, I, M, F |
| V | | Translation termination factor | Subunit of the heterodimeric translation release factor complex involved in the release of nascent polypeptides from ribosomes. | N, I, M, F |
| V | | α-soluble NSF attachment protein | Its ortholog in mouse is required for vesicular transport between the endoplasmic reticulum and the Golgi apparatus. | N, I, M, F |
| V | | Enzyme: asparagine synthase (glutamine-hydrolyzing) | Catalyzes the synthesis of L-asparagine from L-aspartate in the asparagine biosynthetic pathway. | N, I, M, F |
| V | | Enzyme: dihydrolipoamide S-succinyltransferase | Its ortholog in yeast is a component of the mitochondrial alpha-ketoglutarate dehydrogenase complex, which catalyzes the oxidative decarboxylation of alpha-ketoglutarate to succinyl-CoA in the TCA cycle. | N, I, M, F |
| V | | Exportin-1, an importin-β-like protein orthologous to | XPO-1 is predicted to function as a nuclear export receptor for proteins containing leucine-rich nuclear export signals. | N, I, M, F |
| V | | Nuclear-encoded mitochondrion-specific chaperone that is a member of the DnaK/Hsp70 superfamily of molecular chaperones | In | N, I, M, F |
| V | | Homeobox protein | In | N, I, M, F |
| V | | Transcription initiation factor IIA | Its ortholog in yeast is involved in transcriptional activation and acts as antirepressor or coactivator. | N, I, M, F |
| V | | Enzyme: adenylosuccinate synthetase | Catalyzes the first step in the synthesis of adenosine monophosphate from inosine 5′-monophosphate during purine nucleotide biosynthesis. | N, I, M, F |
| V | | Enzyme: hydroxymethylglutaryl-CoA synthase | Its ortholog in yeast catalyzes the formation of HMG-CoA from acetyl-CoA and acetoacetyl-CoA. | N, I, M, F |
| V | | Enzyme: valyl-aminoacyl tRNA synthetase in mitochondria | The tRNA synthetase catalyzes the attachment of an amino acid to its cognate transfer RNA molecule. | N, I, M, F |
| V | | mRNA splicing: crooked neck pre-mRNA splicing factor | The crooked neck gene of | N, I, M, F |
| V | | Ribosome-related protein: small subunit processome component | A nucleolar protein required for rRNA synthesis and ribosomal assembly. | N, I, M, F |
| V | | Enzyme: phospholipase C β | In yeast, the ortholog (Plc1p) and inositol polyphosphates are required for acetyl-CoA homeostasis, which regulates global histone acetylation. | N, I, M, F |
| V | | Enzyme: ATP synthase | Unknown | N, I, M, F |
| V | | Translation initiation factor | These proteins help stabilize the formation of the functional ribosome around the start codon and also provide regulatory mechanisms in translation initiation. | N, I, M, F |
The table is sorted by evolutionary conservation, chromosome (Chr), and lethal name (let-Name). N, nematodes; I, invertebrates (Drosophila); M, mammals (mouse, human); F, fungi (Saccharomycetaceae).
Figure 3. (A) Six missense mutations in six essential genes that are not in annotated functional domains. (B) One gene model (let-522/M05B5.2) that has a missense mutation h240 (C68Y) that does not reside in an annotated domain (Chu). The gene model was created using the Perl module FeatureStack. Exons are drawn to scale relative to each other in gray boxes; black lines with triangle shapes represent introns, which are not drawn to scale; white boxes represent UTRs. Blue bars indicate the missense mutation. Multiple alignments were performed between the essential genes and their orthologs in 25 other nematodes and four model organisms (Homo sapiens, Drosophila melanogaster, Mus musculus, and Danio rerio). However, no orthologs were found in the four model organisms. Left columns show the names of nematode species. The alignment figure shows a fragment of 100 amino acids. The mutated amino acid (C68Y), indicated by the black arrow, is in the center with 50 amino acids upstream and 49 amino acids downstream. The alignment color scheme is based on “ClustalX,” in which the color of a symbol depends on the residue type and the occurrence frequency in one column. Black boxes highlight the input genes with both wild-type and mutant copies.
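The 100-residue alignment fragment described in this caption (the mutated residue plus 50 residues upstream and 49 downstream) can be extracted as in the sketch below. The protein sequence is a made-up placeholder, not the M05B5.2 sequence, and the function names are hypothetical.

```python
def apply_missense(protein, position, ref, alt):
    """Return the protein with the 1-based `position` changed from ref to alt."""
    if protein[position - 1] != ref:
        raise ValueError(f"expected {ref} at position {position}")
    return protein[:position - 1] + alt + protein[position:]


def mutation_window(protein, position, upstream=50, downstream=49):
    """Return (fragment, index of the mutated residue within the fragment).

    `position` is 1-based. The window is clipped at the protein ends, so
    it can be shorter than upstream + 1 + downstream residues.
    """
    if not 1 <= position <= len(protein):
        raise ValueError("position outside the protein")
    idx = position - 1
    start = max(0, idx - upstream)
    end = min(len(protein), idx + downstream + 1)
    return protein[start:end], idx - start


# Placeholder 150-residue protein with a cysteine at position 68.
wild_type = "A" * 67 + "C" + "A" * 82
mutant = apply_missense(wild_type, 68, "C", "Y")  # C68Y

wt_fragment, center = mutation_window(wild_type, 68)
mut_fragment, _ = mutation_window(mutant, 68)
print(len(wt_fragment), center, wt_fragment[center], mut_fragment[center])
```

Applying the same window to each ortholog only makes sense after alignment, because residue 68 in one species need not be residue 68 in another.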
Figure 4. Genes from each group annotated with three ontology terms: (A) molecular function, (B) cellular component, and (C) biological process. The x-axis lists several GO terms in each GO category and the y-axis is the proportion of genes for each GO term over the total number of annotated genes in each group. Four groups are shown in separate colors: G1 in blue, G2 in orange, G3 in gray, and G4 in gold. Statistical difference was calculated for G1 vs. G2, G3, and G4 individually by using Fisher’s exact test (* P-value < 0.05, ** P-value < 0.01, *** P-value < 0.001).
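The group comparison named in this caption can be sketched as a two-sided Fisher's exact test on a 2x2 table (genes with and without a given GO term in two groups). This is a plain standard-library implementation written for illustration, not the authors' code; the counts in the example are invented, and `stars` simply mirrors the significance thresholds listed in the caption.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Rows are groups, columns are (with term, without term). The p-value
    sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    probs = {
        x: comb(row1, x) * comb(row2, col1 - x) / denom
        for x in range(lo, hi + 1)
    }
    # Small relative tolerance so that ties in probability are included.
    cutoff = probs[a] * (1 + 1e-9)
    return min(1.0, sum(p for p in probs.values() if p <= cutoff))


def stars(p):
    """Map a p-value to the star notation used in the caption."""
    if p < 0.001:
        return "***"
    if p < 0.01:
        return "**"
    if p < 0.05:
        return "*"
    return ""


# Invented counts: 40 of 100 genes in one group carry the term,
# versus 100 of 1000 genes in another group.
p_value = fisher_exact_two_sided(40, 60, 100, 900)
print(p_value, stars(p_value))
```

Exact integer binomials keep the calculation stable for large groups, at the cost of speed; a statistics library would be the usual choice in practice.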
Figure 5. Gene essentiality analysis. Group one (G1): essential genes that were isolated through genetic screens in the Baillie laboratory or collaborations and are fully sequenced or rescued by fosmids (Maciejowski; Jones; Chu; S. Ono, personal communication) (143 in total, including 60 genes from the current study). Group two (G2): essential genes that have published alleles supporting lethal phenotypes (1208 in total). Group three (G3): nonessential genes that have known alleles supporting nonlethal phenotypes (796 in total). Group four (G4): no-phenotype supporting genes that have no observable phenotypes caused by either RNAi or known alleles (12,811 in total). Given that this group contains the majority of protein-coding genes in C. elegans, most genes in this group are expected to be nonessential. The four groups are labeled in blue, orange, gray, and gold, respectively. (A) The proportion of singletons and duplicates in the four groups. The x-axis shows singletons and duplicates from the groups. Significant differences (*** P-value < 0.001) were observed between G1 and G3, and between G1 and G4. (B) The distribution of the number of interactions per protein (Stark). The x-axis shows the number of interactions per protein and the y-axis is the proportion of proteins. (C) The developmental stage-specific expression pattern and expression level of genes in the four groups. The x-axis shows 23 developmental stages, from four-cell embryos to young adults; m on the x-axis is short for minutes of embryonic development. Each bar represents the distribution of the normalized expression level (RPKM in log2 scale) of each gene in each of the four groups for each developmental stage. The linear regression trend line is drawn for each group in its respective color. The stages of embryonic elongation are underlined in red. The boxplot was computed in R using the ggplot2 package, which uses a 95% C.I.
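The expression measure and trend line mentioned for panel (C) can be sketched as follows. RPKM is computed from its standard definition (reads per kilobase of transcript per million mapped reads); the pseudocount added before taking log2 is an assumption made here to avoid log2(0), and the read counts, gene length, and stage values are invented for illustration.

```python
from math import log2


def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


def log2_rpkm(read_count, gene_length_bp, total_mapped_reads, pseudocount=1.0):
    """log2-scaled RPKM; the pseudocount is an assumption, not from the source."""
    return log2(rpkm(read_count, gene_length_bp, total_mapped_reads) + pseudocount)


def trend_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Invented example: 500 reads on a 2 kb gene in a library of 10 million reads.
value = rpkm(500, 2000, 10_000_000)
print(value, log2_rpkm(500, 2000, 10_000_000))

# Invented per-stage median expression for one group across four stages.
stages = [0, 1, 2, 3]
medians = [1.0, 3.0, 5.0, 7.0]
print(trend_line(stages, medians))
```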