Abstract
Compositionally biased regions (BRs) occur when a few amino-acid types are enriched in a protein segment. The known protein universe may contain BR types that have not been characterized experimentally. The UniProt protein database has been surveyed for evidence of such compositional "dark matter". A "dark biased region" (DBR) is defined as a biased region with a low probability of being an individual structural domain or intrinsically disordered region. The bias annotation program fLPS is used to generate a list of >13 million BRs, which is then thoroughly filtered for structure and intrinsic disorder. About a third of BRs (31%) have both substantial intrinsic disorder and structure. After filtering, there are ≈0.9 million DBRs (≈7% of the original BRs, in ≈1.4% of proteins). These DBRs are hugely enriched in eukaryotes and hugely depleted in bacteria. They tend to be more hydrophobic than other protein regions, but are made of less extreme combinations of hydrophobic/hydrophilic residues. Under varying assumptions, the number of DBRs that might exist at the high bias levels examined (p-values < 1 × 10⁻⁶) has been estimated, giving a reasonable range of 0.7-7.2% of proteins having such DBRs. Hypotheses are examined about what such DBRs might be: that they come from unsampled or undersampled domain/region categories, or that they belong to unappreciated categories somewhat like existing ones.
Keywords: compositional bias; dark matter; dark proteome; intrinsic disorder; prion
Year: 2018 PMID: 30260558 DOI: 10.1002/pmic.201800069
Source DB: PubMed Journal: Proteomics ISSN: 1615-9853 Impact factor: 3.984