Norman F Goodacre, Dietlind L Gerloff, Peter Uetz.
Abstract
More than 20% of all protein domains are currently annotated as "domains of unknown function" (DUFs). About 2,700 DUFs are found in bacteria compared with just over 1,500 in eukaryotes. Over 800 DUFs are shared between bacteria and eukaryotes, and about 300 of these are also present in archaea. A total of 2,786 bacterial Pfam domains even occur in animals, including 320 DUFs. Evolutionary conservation suggests that many of these DUFs are important. Here we show that 355 essential proteins in 16 model bacterial species contain 238 DUFs, most of which represent single-domain proteins, clearly establishing the biological essentiality of DUFs. We suggest that experimental research should focus on conserved and essential DUFs (eDUFs) for functional analysis given their important function and wide taxonomic distribution, including bacterial pathogens.
IMPORTANCE: The functional units of proteins are domains. Typically, each domain has a distinct structure and function. Genomes encode thousands of domains, and many of the domains have no known function (domains of unknown function [DUFs]). They are often ignored as of little relevance, given that many of them are found in only a few genomes. Here we show that many DUFs are essential DUFs (eDUFs) based on their presence in essential proteins. We also show that eDUFs are often essential even if they are found in relatively few genomes. However, in general, more common DUFs are more often essential than rare DUFs.
Year: 2013 PMID: 24381303 PMCID: PMC3884060 DOI: 10.1128/mBio.00744-13
Source DB: PubMed Journal: MBio Impact factor: 7.867
FIG 1 Essential domains of unknown function (eDUFs) are common among bacteria. The table shows species for which essential genes have been determined. All numbers were derived using the reference proteome of either the DEG strain or a common (fully sequenced) strain. Different strains may have different numbers. Domains are all Pfam domains that are not DUFs, while eDUFs are a subset of DUFs. Many essential genes encode DUFs as their only domain. This table is based on Pfam v26 (2012). For a complete list of eDUFs, see Table S1F in the supplemental material.
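The definition used here (a DUF counts as an eDUF if it occurs in an essential protein) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function name `find_edufs`, the protein and domain names, and the toy data are all hypothetical, and the sketch assumes that DUFs can be recognized by a name starting with "DUF".

```python
from typing import Dict, List, Set, Tuple


def find_edufs(
    domains_by_protein: Dict[str, List[str]],
    essential_proteins: Set[str],
) -> Tuple[Set[str], Set[str]]:
    """Return (eDUFs, essential proteins whose only domain is a DUF).

    Assumption: a domain is a DUF if its name starts with "DUF".
    """
    edufs: Set[str] = set()
    single_domain: Set[str] = set()
    for protein in essential_proteins:
        domains = domains_by_protein.get(protein, [])
        dufs = [d for d in domains if d.startswith("DUF")]
        # Every DUF found in an essential protein counts as an eDUF.
        edufs.update(dufs)
        # Track essential proteins that consist of a single DUF domain.
        if len(domains) == 1 and dufs:
            single_domain.add(protein)
    return edufs, single_domain


# Hypothetical toy annotations; none of these names come from the paper.
toy_annotations = {
    "protA": ["DUF0001"],
    "protB": ["Kinase_like", "DUF0002"],
    "protC": ["DUF0003"],  # not essential, so DUF0003 is not an eDUF
    "protD": ["ABC_like"],
}
toy_essential = {"protA", "protB", "protD"}

edufs, single_domain = find_edufs(toy_annotations, toy_essential)
print(sorted(edufs))
print(sorted(single_domain))
```

In a real analysis the annotations would come from Pfam and the essential gene sets from DEG, and the counts would depend on the strain and proteome chosen, as the caption notes.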
FIG 2 Many essential domains of unknown function (eDUFs) are not highly conserved. Although eDUFs tend to be better conserved (as measured by the number of genomes they are encoded in), the correlation is weak. Even poorly conserved DUFs are often essential. The linear fit was performed using simple linear regression. The figure uses data from DEG version 8.5.
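The kind of fit mentioned in this caption can be sketched as an ordinary least-squares regression in plain Python. The variables are assumptions for illustration only: x is taken as the number of genomes encoding a DUF and y as some per-domain essentiality measure, and the numbers are invented toy values, not data from the paper or from DEG.

```python
from math import sqrt
from typing import Sequence, Tuple


def linear_fit(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float, float]:
    """Simple linear regression: return (slope, intercept, Pearson r)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each vary")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)
    return slope, intercept, r


# Hypothetical toy values: genomes encoding each DUF, and an assumed
# essentiality score per DUF (not taken from the paper).
genomes = [10, 50, 200, 800, 1500]
essentiality = [0.2, 0.1, 0.4, 0.3, 0.6]

slope, intercept, r = linear_fit(genomes, essentiality)
print(f"slope={slope:.5f} intercept={intercept:.3f} r={r:.3f}")
```

The correlation coefficient `r` is what would indicate how weak or strong the relationship is; a positive slope alone does not say how well conservation predicts essentiality.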