Chong Peng, Yan Lin, Hao Luo, Feng Gao.
Abstract
Genes critical for the survival or reproduction of an organism in certain circumstances are classified as essential genes. Essential genes play a significant role in deciphering the survival mechanisms of life, and they have broad applications in pharmaceutics and synthetic biology. Continuous progress in experimental methods for essential gene identification has accelerated the accumulation of gene essentiality data, which in turn facilitates the in silico study of essential genes. In this article, we present available online resources related to gene essentiality, including bioinformatic software tools for transposon sequencing (Tn-seq) analysis, essential gene databases, and online services for predicting bacterial essential genes. We review several computational approaches that have been used to predict essential genes and summarize the features used for gene essentiality prediction. In addition, we evaluate the available online bacterial essential gene prediction servers against the experimentally validated essential gene sets of 30 bacteria from DEG. This article is intended as a quick reference guide for microbiologists interested in essential genes.
Keywords: Tn-seq analysis; essential gene; gene essentiality prediction; minimal gene set; synthetic biology
Year: 2017 PMID: 29230204 PMCID: PMC5711816 DOI: 10.3389/fmicb.2017.02331
Source DB: PubMed Journal: Front Microbiol ISSN: 1664-302X Impact factor: 5.640
Software tools to analyze transposon insertion sequencing data for identifying essential genes.
| Tool | Description | Programming language | Availability | Applicable organisms | Reference |
|---|---|---|---|---|---|
| ESSENTIALS | An open source, web-based software tool for rapid analysis of high throughput transposon insertion sequencing data | Perl and R | Web interface and source code | | |
| Tnseq | A zero-inflated Poisson model for insertion tolerance analysis of genes based on Tn-seq data | R | – | | |
| Tn-HMM | A method for analyzing Tn-Seq data using Hidden Markov Models | Python | | | |
| Bayesian analysis method | A Bayesian model to analyze gene essentiality based on sequencing of transposon insertion libraries | Python | | | |
| TRANSIT | A software tool for Himar1 Tn-Seq analysis | Python | https://github.com/mad-lab/transit | – | |
| ARTIST | Analysis of high-resolution transposon-insertion sequences technique | Matlab | | | |
| Tn-seq Explorer | A package of tools for exploration of the Tn-seq data | Java | | | |
| Bio-Tradis | A set of tools to analyze the output from TraDIS analyses | Perl and R | – | | |
| TSAS | Tn-seq analysis software | Java | | | |
| TnseqDiff | Identification of conditionally essential genes in transposon sequencing studies | R | – | | |
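
The tools above share one basic intuition: a gene that tolerates very few transposon insertions relative to its number of potential insertion sites is a candidate essential gene. The following is a minimal sketch of that intuition only, not the method of any listed tool (which use statistical models such as zero-inflated Poisson, hidden Markov, or Bayesian models). The `GeneCounts` class, the function names, the thresholds, and the toy counts are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class GeneCounts:
    """Hypothetical per-gene summary of a Tn-seq experiment."""
    name: str
    ta_sites: int   # potential insertion sites (e.g. TA dinucleotides for Himar1)
    hit_sites: int  # sites with at least one mapped insertion


def insertion_density(gene: GeneCounts) -> Optional[float]:
    """Fraction of potential sites that carry an insertion; None if no sites."""
    if gene.ta_sites == 0:
        return None
    return gene.hit_sites / gene.ta_sites


def flag_candidates(genes: List[GeneCounts],
                    max_density: float = 0.1,
                    min_sites: int = 5) -> List[str]:
    """Return names of genes whose insertion density is at or below max_density.

    Genes with fewer than min_sites potential sites are skipped, because a
    lack of insertions in a very short gene is weak evidence. Both thresholds
    are illustrative assumptions, not recommended values.
    """
    flagged = []
    for gene in genes:
        density = insertion_density(gene)
        if density is None or gene.ta_sites < min_sites:
            continue
        if density <= max_density:
            flagged.append(gene.name)
    return flagged


# Toy data, invented for illustration.
genes = [
    GeneCounts("geneA", ta_sites=40, hit_sites=1),
    GeneCounts("geneB", ta_sites=35, hit_sites=22),
    GeneCounts("geneC", ta_sites=3, hit_sites=0),
    GeneCounts("geneD", ta_sites=20, hit_sites=1),
]
candidates = flag_candidates(genes)
print(candidates)
```

A fixed density cutoff ignores library saturation and local insertion bias, which is one reason the dedicated tools rely on explicit statistical models instead.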
The basic information of essential gene databases.
| Database | Data sources | Species | Category | Bacteria | Archaea | Eukaryotes | Non-coding | Additional tool | URL |
|---|---|---|---|---|---|---|---|---|---|
| DEG | Experiment | 43 | Essential | 15,750(33)a | 519(1) | 33,989(9) | 680(6) | BLAST tools to perform species- and experiment-specific BLAST searches for a single gene, a list of genes, annotated or unannotated genomes. | |
| | | | Non-essential | 109,187(32) | 1,077(1) | 3,573(1) | – | | |
| pDEG | Prediction | 16 | Essential | 5,880(16) | – | – | – | – | |
| OGEE | Experiment and text-mining | 48 | Essential | 21,914(39) | – | 16,066(9) | – | Tools in the ‘Analyze’ page to visualize the PE% (proportion of essential genes) as a function of other gene properties, including whether a gene is a duplicate or singleton and whether a gene is involved in development. | |
| | | | Non-essential | 78,075(29) | – | 51,744(8) | – | | |
| EGGS | Experiment | 11 | Essential | 5,655(11) | – | – | – | Subsystem spreadsheet and Subsystem diagram. | |
| | | | Non-essential | 27,201(8) | – | – | – | | |
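
The PE% (proportion of essential genes) mentioned for OGEE can be computed for any grouping of genes as the number of essential genes divided by the number of tested genes in that group, times 100. The sketch below illustrates this for the duplicate versus singleton property. It is an assumption-laden example, not OGEE's own code: the record layout, the `pe_percent` function, and the toy genes are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Hashable, List


def pe_percent(records: List[dict], key: str) -> Dict[Hashable, float]:
    """Compute PE% (percentage of essential genes) per value of `key`.

    Each record is a dict with a boolean "essential" field and a grouping
    field named by `key` (hypothetical layout).
    """
    totals = defaultdict(lambda: [0, 0])  # group -> [essential count, total count]
    for rec in records:
        group = rec[key]
        totals[group][1] += 1
        if rec["essential"]:
            totals[group][0] += 1
    return {group: 100.0 * ess / n for group, (ess, n) in totals.items()}


# Toy annotations, invented for illustration.
records = [
    {"gene": "g1", "essential": True,  "duplicate": False},
    {"gene": "g2", "essential": False, "duplicate": False},
    {"gene": "g3", "essential": False, "duplicate": True},
    {"gene": "g4", "essential": False, "duplicate": True},
    {"gene": "g5", "essential": True,  "duplicate": False},
    {"gene": "g6", "essential": False, "duplicate": True},
    {"gene": "g7", "essential": True,  "duplicate": True},
    {"gene": "g8", "essential": False, "duplicate": False},
]
pe_by_dup = pe_percent(records, "duplicate")
print(pe_by_dup)  # singletons (False): 50.0, duplicates (True): 25.0
```

The totals in the table should not be combined this way directly, since the essential and non-essential counts come from different numbers of species.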
Summary of the online essential gene prediction servers.
| Name | Methodology | Input | Standalone version | Annotation | URL |
|---|---|---|---|---|---|
| CEG_Match | Based on gene function | Standard gene name | × | CEG_Match is applicable only to genes with known names. It is an appropriate tool when only the gene names are known and the complete genome is not at hand. | |
| Geptop | Based on orthology and phylogeny | Amino acid sequence | √ | Geptop is applicable only when the investigated genome has been completely sequenced. | |
| ZCURVE 3.0 | Based on orthology and phylogeny | Amino acid sequence of predicted genes | √ | ZCURVE 3.0 is a program to find genes in bacterial or archaeal genomes. It has an embedded Geptop program, which has an extended function of searching for essential genes. | |
| EGP | Machine learning-based method | Nucleotide sequence | × | The accuracy of EGP is lower than that of the other tools. Before using it, check which reference species were used in the EGP training set. Use it with caution when the host of the input gene does not belong to the same family as any of the reference species. | |
| BLAST | Homology search-based method | Nucleotide sequence or amino acid sequence | √ | DEG provides a set of customizable BLAST tools for homology searches against the essential gene sets in DEG. Single genes, multiple genes, annotated genomes, and unannotated genomes can be submitted for BLAST searches. | |
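
Evaluating a prediction server against an experimentally validated essential gene set comes down to comparing two sets of gene identifiers. The sketch below shows one common way to score such a comparison with precision, recall, and F-measure. This record does not state which metrics the article used, so the choice of metrics is an assumption, and the function name and toy gene identifiers are hypothetical.

```python
from typing import Dict, Iterable


def evaluate_prediction(predicted: Iterable[str],
                        reference_essential: Iterable[str],
                        all_genes: Iterable[str]) -> Dict[str, float]:
    """Score predicted essential genes against a validated reference set.

    Only genes present in `all_genes` (the genes of the organism under
    evaluation) are counted.
    """
    universe = set(all_genes)
    pred = set(predicted) & universe
    ref = set(reference_essential) & universe

    tp = len(pred & ref)            # predicted essential, validated essential
    fp = len(pred - ref)            # predicted essential, validated non-essential
    fn = len(ref - pred)            # missed essential genes
    tn = len(universe) - tp - fp - fn

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if (precision + recall) else 0.0)
    return {"tp": tp, "fp": fp, "fn": fn, "tn": tn,
            "precision": precision, "recall": recall, "f_measure": f_measure}


# Toy example, invented for illustration.
all_genes = [f"g{i}" for i in range(1, 11)]
reference = ["g1", "g2", "g3", "g4"]
predicted = ["g1", "g2", "g5"]
scores = evaluate_prediction(predicted, reference, all_genes)
print(scores)
```

Because essential genes are usually a minority of a genome, precision and recall tend to be more informative here than raw accuracy.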