Carlos G Acevedo-Rocha, Gang Fang, Markus Schmidt, David W Ussery, Antoine Danchin.
Abstract
A central undertaking in synthetic biology (SB) is the quest for the 'minimal genome'. However, 'minimal sets' of essential genes are strongly context-dependent and, in all prokaryotic genomes sequenced to date, not a single protein-coding gene is entirely conserved. Furthermore, a lack of consensus in the field as to what attributes make a gene truly essential adds another aspect of variation. Thus, a universal minimal genome remains elusive. Here, as an alternative to defining a minimal genome, we propose that the concept of gene persistence can be used to classify genes needed for robust long-term survival. Persistent genes, although not ubiquitous, are conserved in a majority of genomes, tend to be expressed at high levels, and are frequently located on the leading DNA strand. These criteria impose constraints on genome organization, and these are important considerations for engineering cells and for creating cellular life-like forms in SB.
Year: 2012 PMID: 23219343 PMCID: PMC3642372 DOI: 10.1016/j.tig.2012.11.001
Source DB: PubMed Journal: Trends Genet ISSN: 0168-9525 Impact factor: 11.639
Minimal gene sets obtained by direct and random experimental mutagenesis
| Microorganism | Minimal gene set | Method | Refs |
|---|---|---|---|
| | 205/499 | DGI | |
| | 217 | DGI | |
| | 658 | RTM | |
| | 480 | RTM | |
| | 620 | RTM | |
| | 303 | DGI | |
| | 302 | DGI | |
| | 1105 | DGI | |
| | 253 | RTM | |
| | 396 | RTM | |
| | 670 | RTM | |
| | 136/358 | RTM | |
| | 255–344 | RTM | |
| | ∼614 | RTM | |
| | 265–350 | RTM | |
| | 382 | RTM | |
| | 321 | RTM | |
| | 335 | RTM | |
| | 257–490 | RTM | |
| | 82 | RTM | |
| | 71 | RTM | |
| | 351 | RTM | |
| | 150/600 | Random antisense RNA | |
| | 789 | RTM | |
Protein-coding genes.
DGI, direct gene inactivation.
RTM, random transposon mutagenesis.
The study additionally suggested 43 RNA genes; that is, the first minimal gene set comprised 405 genes when RNA genes are included.
Figure 1. Different criteria for defining universally conserved genes. When 1000 genomes are compared, the number of orthologous genes shared by all of them falls to 0 (left). This number rises to about 500 persistent genes when orthologs need only be present in a quorum of genomes, similar or different, from evolutionarily distinct bacteria, above a threshold computed using a measure that retains frequent genes that tend to cluster together (right).
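The contrast drawn in Figure 1 can be illustrated with a minimal sketch. It assumes that each genome is reduced to a set of ortholog labels and that persistence is approximated by a plain frequency cutoff. The function name `conserved_genes`, the `quorum` parameter, and the toy data are hypothetical; the clustering-aware threshold described in the caption is not reproduced here.

```python
from collections import Counter


def conserved_genes(genomes, quorum=1.0):
    """Return ortholog labels present in at least `quorum` (fraction) of genomes.

    quorum=1.0 gives the strict universal core; a lower value gives a
    simplified, frequency-only stand-in for 'persistent' genes.
    """
    if not genomes:
        return set()
    # Count in how many genomes each ortholog label occurs.
    counts = Counter(label for genome in genomes for label in set(genome))
    needed = quorum * len(genomes)
    return {label for label, n in counts.items() if n >= needed}


# Toy data: hypothetical ortholog labels, not real genes.
toy_genomes = [
    {"A", "B", "C"},
    {"A", "B", "D"},
    {"A", "C", "D"},
    {"B", "C", "D"},
]

strict_core = conserved_genes(toy_genomes, quorum=1.0)
persistent = conserved_genes(toy_genomes, quorum=0.75)
```

In this toy set no label occurs in all four genomes, so the strict core is empty, whereas every label occurs in three of the four and passes a 75% quorum. The 75% value is chosen only for illustration and is not a threshold taken from the article.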
Figure 2. A universe of gene functions. In a particular environment, the sum of all microbial genes corresponds to the metagenome, which is in turn formed by pan-genomes. A pan-genome is the sum of all genomes of similar strains, each having similar (core genome) or distinct (cenomes) sets of nonpersistent genes. About 500 persistent genes form the paleome. As an example, the addition of 1500 nonpersistent genes to the 500 persistent genes of the paleome in E. coli makes a core genome of 2000 genes, whereas the sum of all cenomes of each individual E. coli strain comprises about 18 000 genes [71]. For the time being, the pan-genome of E. coli is composed of roughly 20 000 genes (2000 of the core genome and 18 000 of the cenomes), the majority of which (80%) are often colocalized on genomic islands [72]. For a particular E. coli strain with a genome of 4500 genes, the cenome alone would be about 4000 genes.
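The gene counts quoted for E. coli in Figure 2 can be checked with a few lines of arithmetic. The variable names below are illustrative only. The last step assumes that the 4000-gene figure is obtained by subtracting the paleome from the strain's genome, a reading that matches the numbers but is not stated explicitly in the caption.

```python
# Approximate gene counts quoted in the Figure 2 caption.
paleome = 500                # persistent genes
core_nonpersistent = 1500    # nonpersistent genes shared across strains
cenomes_total = 18_000       # sum of all strain-specific cenomes

core_genome = paleome + core_nonpersistent   # 2000 genes
pan_genome = core_genome + cenomes_total     # 20 000 genes

# Assumption: the single-strain cenome is the genome minus the paleome.
strain_genome = 4500
strain_cenome = strain_genome - paleome      # 4000 genes
```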