
Dynamic sequence databank searching with templates and multiple alignment.

W R Taylor

Abstract

Sequence databank searches are often performed iteratively, taking the results of a search to form a probe (either a pattern or profile) for a subsequent scan of the databank. The advantage of this approach is that, as more sequences are drawn into the probe, it should, in principle, be possible to detect increasingly distant members of the family. This approach works well when supervised by an "expert" who has a good "eye" for the quality of the sequence alignment and for whether novel matches should be rejected or incorporated into the probe. However, attempts to automate the process have proved difficult, as the process is inherently unstable. Errors in the alignment, or the misalignment of a non-family member, lead to a deterioration of the probe specificity, so allowing further incorrect sequences to be identified. Here, a combination of two methods is used to provide a check on such instability. A pattern matching (template) search method is used (with a BLAST-like pre-filter for speed) to return sequence segments for alignment in a standard multiple alignment program (MULTAL). Sequences are aligned only to a fixed limit of similarity and any sequences or sub-families that have not joined the original "seed" family are rejected. The remaining core family then provides the basis for a subsequent pattern derivation and databank search. The constant check by the multiple alignment phase allows the search phase to be pushed continually towards the boundary of similarity. This is maintained by lowering the cutoff on the scores of acceptable sequences each time the family remains the same over successive search cycles. The procedure was observed to be stable under misalignments and to have an ability to recognise distantly related family members across super-families that was comparable to PSI-BLAST. The method is applied to the analysis of the hormone-binding domains of the insulin and related growth-factor receptors. Copyright 1998 Academic Press.
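The search cycle described in the abstract can be sketched as a small control loop. This is only an illustration under stated assumptions, not the published implementation: `search` and `join_seed` are hypothetical callables standing in for the template search and the MULTAL alignment check, and the cutoff values, step size and toy databank are invented for the example.

```python
from typing import Callable, FrozenSet, Iterable, Tuple

Family = FrozenSet[str]


def iterate_search(
    seed: Iterable[str],
    search: Callable[[Family, int], Iterable[str]],    # hypothetical: probe + cutoff -> hit ids
    join_seed: Callable[[Family, Family], Iterable[str]],  # hypothetical: seed + candidates -> core
    cutoff: int,
    step: int,
    min_cutoff: int,
    max_cycles: int = 50,
) -> Tuple[Family, int]:
    """Grow a family from a seed, lowering the cutoff only while the family is stable."""
    seed_set = frozenset(seed)
    family = seed_set
    for _ in range(max_cycles):
        # Search phase: scan the databank with a probe built from the current family.
        hits = frozenset(search(family, cutoff))
        # Alignment check: keep only candidates that join the original seed family.
        core = frozenset(join_seed(seed_set, family | hits))
        if core == family:
            # Family unchanged over this cycle: push towards more distant matches.
            if cutoff - step < min_cutoff:
                break
            cutoff -= step
        else:
            family = core
    return family, cutoff


# Toy stand-ins (assumptions for illustration only).
SCORES = {"a": 9, "b": 7, "c": 5, "x": 6}  # invented match scores
NON_FAMILY = {"x"}                          # a false hit the alignment should reject


def toy_search(probe: Family, cutoff: int) -> Iterable[str]:
    # Ignores the probe; a real search would derive a pattern from it.
    return [name for name, score in SCORES.items() if score >= cutoff]


def toy_join_seed(seed: Family, candidates: Family) -> Iterable[str]:
    return (candidates - NON_FAMILY) | seed


family, final_cutoff = iterate_search(
    {"a"}, toy_search, toy_join_seed, cutoff=8, step=1, min_cutoff=5
)
```

In this toy run the false hit `x` scores above the final cutoff but never enters the family, because the alignment check rejects it on every cycle; that is the stabilising role the abstract assigns to the multiple alignment phase.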


Year: 1998; PMID: 9665844; DOI: 10.1006/jmbi.1998.1853

Source DB: PubMed; Journal: J Mol Biol; ISSN: 0022-2836; Impact factor: 5.469


