
An expert system for processing sequence homology data.

E L Sonnhammer, R Durbin.

Abstract

When confronted with the task of finding homology to large numbers of sequences, database searching tools such as BLAST and FASTA generate prohibitively large amounts of information. An automatic way of making most of the decisions a trained sequence analyst would make was developed by means of a rule-based expert system combined with an algorithm to avoid non-informative biased residue composition matches. The results found relevant by the system are presented in a very concise and clear way, so that the homology can be assessed with minimum effort. The expert system, HSPcrunch, was implemented to process the output of the programs in the BLAST suite. HSPcrunch embodies rules on detecting distant similarities when pairs of weak matches are consistent with a larger gapped alignment, i.e. when BLAST has broken a longer gapped alignment up into smaller ungapped ones. This way, more distant similarities can be detected with few or no additional spurious matches as a side effect. The rules for how small the gaps must be to be considered significant have been derived empirically. Currently a set of rules is used that operates on two different scoring levels, one for very weak matches that have very small gaps and one for medium weak matches that have slightly larger gaps. This set of rules proved to be robust for most cases and gives high-fidelity separation between real homologies and spurious matches. One of the most important rules for reducing the amount of output is to limit the number of overlapping matches to the same region of the query sequence. (ABSTRACT TRUNCATED AT 250 WORDS)
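
The two rules described in the abstract can be illustrated with a minimal Python sketch. It is not the HSPcrunch implementation: the `HSP` record, the `TIERS` thresholds, and the function names `consistent_pair` and `limit_overlaps` are hypothetical, the numeric cutoffs are placeholders (the abstract only says the real ones were derived empirically), and the overlap rule shown is just one simple reading of "limit the number of overlapping matches".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HSP:
    """One ungapped match; coordinates are 1-based and inclusive (assumed layout)."""
    q_start: int
    q_end: int
    s_start: int
    s_end: int
    score: float


# Hypothetical (minimum score, maximum gap) tiers, weakest first.
# Weaker matches are only allowed smaller gaps. Values are placeholders.
TIERS = [(35.0, 10), (50.0, 30)]


def max_gap_for(score):
    """Return the largest gap allowed at this score, or None if the score is too low."""
    allowed = None
    for min_score, max_gap in TIERS:
        if score >= min_score:
            allowed = max_gap
    return allowed


def consistent_pair(a, b):
    """True if two weak matches could be parts of one larger gapped alignment."""
    first, second = sorted((a, b), key=lambda h: h.q_start)
    q_gap = second.q_start - first.q_end - 1
    s_gap = second.s_start - first.s_end - 1
    if q_gap < 0 or s_gap < 0:
        # Overlapping or out-of-order segments cannot be joined by a simple gap.
        return False
    limit = max_gap_for(min(a.score, b.score))
    return limit is not None and max(q_gap, s_gap) <= limit


def limit_overlaps(hsps, max_cover):
    """Keep the best-scoring matches so that no query position is covered
    by more than max_cover kept matches."""
    kept = []
    coverage = {}
    for h in sorted(hsps, key=lambda h: h.score, reverse=True):
        positions = range(h.q_start, h.q_end + 1)
        if all(coverage.get(p, 0) < max_cover for p in positions):
            kept.append(h)
            for p in positions:
                coverage[p] = coverage.get(p, 0) + 1
    return kept
```

In this sketch the weaker of the two scores decides which gap limit applies, so a pair is only joined when both segments meet the tier. Visiting matches in descending score order means that, when a query region is crowded, the lowest-scoring matches are the ones dropped.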

Year:  1994        PMID: 7584413

Source DB:  PubMed          Journal:  Proc Int Conf Intell Syst Mol Biol        ISSN: 1553-0833


  6 in total

1.  BioViews: Java-based tools for genomic data visualization.

Authors:  G A Helt; S Lewis; A E Loraine; G M Rubin
Journal:  Genome Res       Date:  1998-03       Impact factor: 9.043

2.  Characterization of short tandem repeats from thirty-one human telomeres.

Authors:  M Rosenberg; L Hui; J Ma; H C Nusbaum; K Clark; L Robinson; L Dziadzio; P M Swain; T Keith; T J Hudson; L G Biesecker; J Flint
Journal:  Genome Res       Date:  1997-09       Impact factor: 9.043

3.  Genome sequence of Haloarcula marismortui: a halophilic archaeon from the Dead Sea.

Authors:  Nitin S Baliga; Richard Bonneau; Marc T Facciotti; Min Pan; Gustavo Glusman; Eric W Deutsch; Paul Shannon; Yulun Chiu; Rueyhung Sting Weng; Rueichi Richie Gan; Pingliang Hung; Shailesh V Date; Edward Marcotte; Leroy Hood; Wailap Victor Ng
Journal:  Genome Res       Date:  2004-11       Impact factor: 9.043

4.  A Fugu-Human Genome Synteny Viewer: web software for graphical display and annotation reports of synteny between Fugu genomic sequence and human genes.

Authors:  Mark Halling-Brown; Clare Sansom; David S Moss; Greg Elgar; Yvonne J K Edwards
Journal:  Nucleic Acids Res       Date:  2004-05-11       Impact factor: 16.971

5.  An optimized set of human telomere clones for studying telomere integrity and architecture.

Authors:  S J Knight; C M Lese; K S Precht; J Kuc; Y Ning; S Lucas; R Regan; M Brenan; A Nicod; N M Lawrie; D L Cardy; H Nguyen; T J Hudson; H C Riethman; D H Ledbetter; J Flint
Journal:  Am J Hum Genet       Date:  2000-06-22       Impact factor: 11.043

6.  A genome annotation-driven approach to cloning the human ORFeome.

Authors:  John E Collins; Charmain L Wright; Carol A Edwards; Matthew P Davis; James A Grinham; Charlotte G Cole; Melanie E Goward; Begoña Aguado; Meera Mallya; Younes Mokrab; Elizabeth J Huckle; David M Beare; Ian Dunham
Journal:  Genome Biol       Date:  2004-09-30       Impact factor: 13.583
