Cascaded walks in protein sequence space: use of artificial sequences in remote homology detection between natural proteins.

S Sandhya, R Mudgal, C Jayadev, K R Abhinandan, R Sowdhamini, N Srinivasan.

Abstract

Over the past two decades, many ingenious efforts have been made in protein remote homology detection. Because homologous proteins often diversify extensively in sequence, it is challenging to demonstrate such relatedness through entirely sequence-driven searches. Here, we describe a computational method for the generation of 'protein-like' sequences that serves to bridge gaps in protein sequence space. Sequence profile information, as embodied in a position-specific scoring matrix of multiply aligned sequences of bona fide family members, serves as the starting point in this algorithm. The observed amino acid propensity and the selection of a random number dictate the selection of a residue for each position in the sequence. In a systematic manner, and by applying a 'roulette-wheel' selection approach at each position, we generate parent family-like sequences and thus facilitate an enlargement of sequence space around the family. When generated for a large number of families, we demonstrate that they expand the utility of natural intermediately related sequences in linking distant proteins. In 91% of the assessed examples, inclusion of designed sequences improved fold coverage by 5-10% over searches made in their absence. Furthermore, with several examples from proteins adopting folds such as TIM, globin, lipocalin and others, we demonstrate that the success of including designed sequences in a database positively sensitized methods such as PSI-BLAST and Cascade PSI-BLAST and is a promising opportunity for enormously improved remote homology recognition using sequence information alone.
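The 'roulette-wheel' step described in the abstract can be illustrated with a minimal sketch. It assumes the profile is already available as per-position amino acid propensities (non-negative weights); the abstract does not say how these are derived from the position-specific scoring matrix, so that conversion is left out. The names `roulette_pick`, `design_sequence` and `toy_profile` are hypothetical and are not taken from the paper.

```python
import random
from bisect import bisect_right
from itertools import accumulate


def roulette_pick(propensities, rng):
    """Pick one residue with probability proportional to its propensity.

    `propensities` maps residue letters to non-negative weights
    (assumed input format, not the paper's data structure).
    """
    residues = [r for r, w in propensities.items() if w > 0]
    if not residues:
        raise ValueError("column has no residue with positive propensity")
    # Each residue owns a slice of the wheel equal to its weight.
    cumulative = list(accumulate(propensities[r] for r in residues))
    # The random number decides where the wheel stops.
    spin = rng.random() * cumulative[-1]
    index = bisect_right(cumulative, spin)
    # Guard against floating-point rounding at the upper edge of the wheel.
    return residues[min(index, len(residues) - 1)]


def design_sequence(profile, rng):
    """Build one family-like sequence, one roulette spin per position."""
    return "".join(roulette_pick(column, rng) for column in profile)


# Toy three-position profile (illustrative values only).
toy_profile = [
    {"M": 1.0},
    {"A": 0.6, "G": 0.3, "S": 0.1},
    {"L": 0.5, "I": 0.3, "V": 0.2},
]

if __name__ == "__main__":
    rng = random.Random(0)
    for _ in range(5):
        print(design_sequence(toy_profile, rng))
```

Repeated calls yield different sequences that follow the family's positional preferences, which is the sense in which such sequences enlarge the sequence space around a family. Filters on the generated sequences that a real pipeline might apply are outside this sketch.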

Year: 2012    PMID: 22692068    DOI: 10.1039/c2mb25113b

Source DB: PubMed    Journal: Mol Biosyst    ISSN: 1742-2051


  3 in total

1.  Profiles of Natural and Designed Protein-Like Sequences Effectively Bridge Protein Sequence Gaps: Implications in Distant Homology Detection.

Authors:  Gayatri Kumar; Narayanaswamy Srinivasan; Sankaran Sandhya
Journal:  Methods Mol Biol       Date:  2022

2.  NrichD database: sequence databases enriched with computationally designed protein-like sequences aid in remote homology detection.

Authors:  Richa Mudgal; Sankaran Sandhya; Gayatri Kumar; Ramanathan Sowdhamini; Nagasuma R Chandra; Narayanaswamy Srinivasan
Journal:  Nucleic Acids Res       Date:  2014-09-27       Impact factor: 16.971

3.  Use of designed sequences in protein structure recognition.

Authors:  Gayatri Kumar; Richa Mudgal; Narayanaswamy Srinivasan; Sankaran Sandhya
Journal:  Biol Direct       Date:  2018-05-09       Impact factor: 4.540
