F Plewniak, J D Thompson, O Poch. Institut de Génétique et de Biologie Moléculaire et Cellulaire, Laboratoire de Biologie Structurale (CNRS/INSERM/ULP), BP 163, 67404 Illkirch Cedex, France. plewniak@igbmc.u-strasbg.fr
Abstract
MOTIVATION: Blast programs are very efficient at finding relatively strong similarities, but some very distantly related sequences are given a very high Expect value and are ranked very low in Blast results. We have developed Ballast, a program that predicts local maximum segments (LMSs, i.e. sequence segments conserved relative to their flanking regions) from a single Blast database search and highlights these divergent homologues. TBlastN database searches can also be processed with the help of information from a joint BlastP search.
RESULTS: We have applied the Ballast algorithm to BlastP searches performed with sequences belonging to well-described dispersed families (aminoacyl-tRNA synthetases; helicases) against the SwissProt 38 database. We show that Ballast is able to build an appropriate conservation profile and that the predicted LMSs are consistent with the signatures and motifs described in the literature. Furthermore, by comparing the Blast, PsiBlast and Ballast results obtained on a well-defined database of structurally related sequences, we show that the LMSs provide a scoring scheme that can concentrate distant homologues among the top-ranking hits better than Blast does. Using the graphical user interface available on the Web, specific LMSs may be selected to detect divergent homologues sharing the corresponding properties with the query sequence, without requiring any additional database search.
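The idea of a segment "conserved relative to its flanking regions" can be illustrated with a toy example. The abstract does not say how Ballast computes its conservation profile or delimits LMSs, so the sketch below is not the Ballast algorithm. It assumes a hypothetical input of Blast hits given as (query start, query end, weight) tuples, sums the weights per query position, and reports runs that rise above the mean of the profile. The function names, the weighting, the mean threshold and the minimum length are all assumptions made for illustration.

```python
from typing import List, Tuple

Hit = Tuple[int, int, float]  # (query_start, query_end, weight), 0-based, end-exclusive


def conservation_profile(query_len: int, hits: List[Hit]) -> List[float]:
    """Sum the weight of every hit covering each query position (assumed scheme)."""
    profile = [0.0] * query_len
    for start, end, weight in hits:
        for i in range(max(0, start), min(query_len, end)):
            profile[i] += weight
    return profile


def local_maximum_segments(profile: List[float], min_len: int = 3) -> List[Tuple[int, int]]:
    """Return (start, end) runs whose values exceed the profile mean.

    Each run is bounded by positions at or below the mean (or by a sequence
    end), so it is more conserved than its flanks. The mean threshold and
    min_len are illustrative choices, not values taken from Ballast.
    """
    if not profile:
        return []
    threshold = sum(profile) / len(profile)
    segments = []
    start = None
    for i, value in enumerate(profile):
        if value > threshold:
            if start is None:
                start = i  # a new run begins here
        elif start is not None:
            if i - start >= min_len:
                segments.append((start, i))
            start = None
    # Close a run that extends to the end of the profile.
    if start is not None and len(profile) - start >= min_len:
        segments.append((start, len(profile)))
    return segments


# Made-up hits on a 20-residue query, for demonstration only.
example_hits = [(2, 8, 1.0), (3, 7, 2.0), (12, 18, 1.0), (13, 16, 1.5)]
example_profile = conservation_profile(20, example_hits)
print(local_maximum_segments(example_profile))  # [(3, 7), (13, 16)]
```

In this toy data, two regions of the query are covered by several overlapping hits and stand out from their weakly covered flanks, which is the kind of pattern the abstract describes as an LMS.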