
Rapid and sensitive sequence comparison with FASTP and FASTA.

W R Pearson.   

Abstract

The FASTA program can search the NBRF protein sequence library (2.5 million residues) in less than 20 min on an IBM-PC microcomputer and unambiguously detect proteins that shared a common ancestor billions of years in the past. FASTA is both fast and selective because it initially considers only amino acid identities. Its sensitivity is increased not only by using the PAM250 matrix to score and rescore regions with large numbers of identities but also by joining initial regions. The results of searches with FASTA compare favorably with results using NWS-based programs that are 100 times slower. FASTA is slightly less sensitive but considerably more selective. It is not clear that NWS-based programs would be more successful in finding distantly related members of the G-protein-coupled receptor family. The joining step that FASTA uses to calculate the initn score is especially useful for sequences that share regions of sequence similarity separated by variable-length loops. FASTP and FASTA were designed to identify protein sequences that have descended from a common ancestor, and they have proved very useful for this task. In many cases, a FASTA sequence search will result either in a list of high-scoring library sequences that are homologous to the query sequence or in a list of sequences with similarity scores that cannot be distinguished from the bulk of the library. In either case, the question of whether there are sequences in the library that are clearly related to the query sequence has been answered unambiguously. Unfortunately, the results often will not be so clear-cut, and careful analysis of similarity scores, statistical significance, the actual aligned residues, and the biological context is required.
In the course of analyzing the G-protein-coupled receptor family, several proteins were found that, because of a high initn score and a low init1 score that increased almost 2-fold with optimization, appeared to be previously unrecognized members of this family. RDF2 analysis showed borderline z values, and only a careful examination of the sequence alignments that focused on the conserved residues provided convincing evidence that the high scores were fortuitous. As sequence comparison methods become more powerful by becoming more sensitive, they also become more likely to mislead, and even greater care is required.
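
The identity-first step mentioned in the abstract can be illustrated with a minimal sketch. It is not the FASTA implementation: the function names `ktup_diagonal_hits` and `best_diagonal`, the word length `ktup=2`, and the toy sequences are assumptions made for illustration. The sketch only counts identical short words shared by two sequences and groups them by alignment diagonal. It does not rescore with PAM250 or join regions into an initn score.

```python
from collections import defaultdict


def ktup_diagonal_hits(query, subject, ktup=2):
    """Count identical k-tuples per diagonal (subject offset minus query offset)."""
    # Lookup table: each k-tuple in the query -> list of its start positions.
    positions = defaultdict(list)
    for i in range(len(query) - ktup + 1):
        positions[query[i:i + ktup]].append(i)

    # Scan the subject once; every shared k-tuple adds one hit to its diagonal.
    hits = defaultdict(int)
    for j in range(len(subject) - ktup + 1):
        for i in positions.get(subject[j:j + ktup], ()):
            hits[j - i] += 1
    return dict(hits)


def best_diagonal(query, subject, ktup=2):
    """Return (diagonal, hit_count) for the diagonal with most identities, or None."""
    hits = ktup_diagonal_hits(query, subject, ktup)
    if not hits:
        return None
    return max(hits.items(), key=lambda kv: kv[1])


if __name__ == "__main__":
    # Toy sequences (hypothetical): the subject is the query shifted by two residues.
    print(best_diagonal("ACDEFGHIK", "WWACDEFGHIK"))
```

Because only exact word matches are looked up in a table, most library sequences can be scored with a single pass and no dynamic programming, which is consistent with the speed advantage the abstract attributes to considering identities first.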

Year:  1990        PMID: 2156132     DOI: 10.1016/0076-6879(90)83007-v

Source DB:  PubMed          Journal:  Methods Enzymol        ISSN: 0076-6879            Impact factor:   1.600

