Mohamed Kamel, Pablo Mier, Abdelkamel Tari, Miguel A Andrade-Navarro.
Abstract
Low complexity regions (LCRs) in protein sequences have special properties that are very different from those of globular proteins. The rules that define secondary structure elements do not apply when the distribution of amino acids becomes biased. While there is a tendency towards structural disorder in LCRs, various examples, particularly homorepeats of single amino acids, suggest that very short repeats can adopt structures that are very difficult to predict. These structures are possibly variable and dependent on the context of intra- or inter-molecular interactions. In general, short repeats in LCRs can induce structure. This could explain the observation that very short (non-perfect) repeats are widespread and that many define regions with a function in protein interactions. For these reasons, we have developed an algorithm to quickly analyze local repeatability along protein sequences, that is, how close a protein fragment is to a perfect repeat. Using this algorithm we found that the proteins of the yeast Saccharomyces cerevisiae are depleted in short repeats (approximate or not) of odd length, while human proteins are not; that the fish Danio rerio has many proteins with repeats of length two; and that the plant Arabidopsis thaliana has an unusually large number of repeats of length seven. Our method (REpeatability Scanner, RES, accessible at http://cbdm-01.zdv.uni-mainz.de/~munoz/res/) finds regions with approximate short repeats in protein sequences and helps to characterize the variable use of LCRs and compositional bias in different organisms.
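The abstract does not specify how RES scores repeatability, so the following is only a minimal sketch of one plausible way to quantify "how close a protein fragment is to a perfect repeat". It assumes a simple measure: the fraction of residues that equal the residue one repeat unit earlier. The function names `repeatability` and `scan`, and the `window` and `period` parameters, are hypothetical and are not taken from RES.

```python
from typing import List


def repeatability(fragment: str, period: int) -> float:
    """Fraction of positions matching the residue `period` positions earlier.

    Illustrative measure only (an assumption, not the published RES score):
    a perfect tandem repeat with unit length `period` scores 1.0.
    """
    if period < 1 or len(fragment) <= period:
        raise ValueError("fragment must be longer than the period")
    comparisons = len(fragment) - period
    matches = sum(
        fragment[i] == fragment[i - period]
        for i in range(period, len(fragment))
    )
    return matches / comparisons


def scan(sequence: str, window: int, period: int) -> List[float]:
    """Score every window of `sequence` for repeat units of length `period`."""
    return [
        repeatability(sequence[start:start + window], period)
        for start in range(len(sequence) - window + 1)
    ]
```

Under this assumed measure, `repeatability("QAQAQAQA", 2)` is 1.0 for the perfect dipeptide repeat, and a single substitution lowers the score rather than eliminating the hit, which is the behaviour one would want for approximate repeats.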
Keywords: Amino acid short tandem repeats; Computational detection of sequence repeats; Homorepeats; Low complexity regions; Repeatability; Web tool
Year: 2019 PMID: 31408700 DOI: 10.1016/j.jsb.2019.08.003
Source DB: PubMed Journal: J Struct Biol ISSN: 1047-8477 Impact factor: 2.867