
Alignment of BLAST high-scoring segment pairs based on the longest increasing subsequence algorithm.

Hongyu Zhang

Abstract

MOTIVATION: The popular BLAST algorithm is based on a local similarity search strategy, so its high-scoring segment pairs (HSPs) carry no global alignment information. When scientists use BLAST to search for a target protein or DNA sequence in a huge database such as the human genome map, repeated fragments, homologues or pseudogenes in the genome often fill the BLAST result with redundant HSPs. A computational strategy is therefore needed to alleviate this problem.
RESULTS: In the gene discovery group of Celera Genomics, I developed a two-step method, a BLAST step followed by an LIS step, to align thousands of cDNA and protein sequences to the human genome map. The LIS step is based on a mature computational algorithm, the Longest Increasing Subsequence (LIS) algorithm. The idea is to use the LIS algorithm to find the longest series of consecutive HSPs in the BLAST output. Such a BLAST+LIS strategy can be used as an independent alignment tool or as a complement to other alignment programs such as Sim4 and GeneWise. It can also serve as a general-purpose processor of BLAST results in all sorts of BLAST searches. Two examples from Celera are shown in this paper.
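The LIS step described above can be illustrated with a minimal sketch. It is not the author's implementation, which the abstract does not describe in detail. It assumes each HSP is reduced to a query start and a subject (genome) start on a single strand, and it ignores HSP lengths, overlaps and scores. The `HSP` type and the function name `longest_colinear_chain` are hypothetical names introduced for illustration. HSPs are sorted by query position, and a standard O(n log n) longest increasing subsequence over the subject positions then picks the longest chain that is in the same order on both sequences.

```python
from bisect import bisect_left
from typing import List, NamedTuple


class HSP(NamedTuple):
    """Simplified HSP (hypothetical type): start coordinates only."""
    q_start: int  # start position on the query (cDNA or protein)
    s_start: int  # start position on the subject (genome)


def longest_colinear_chain(hsps: List[HSP]) -> List[HSP]:
    """Return the longest chain of HSPs whose subject starts strictly
    increase when the HSPs are ordered by query start."""
    # Ties on q_start are ordered by descending s_start so that two HSPs
    # sharing a query start can never both enter a strictly increasing chain.
    ordered = sorted(hsps, key=lambda h: (h.q_start, -h.s_start))

    tails: List[int] = []      # tails[k]: smallest s_start ending a chain of length k+1
    tails_idx: List[int] = []  # index in `ordered` of the HSP stored in tails[k]
    prev = [-1] * len(ordered)  # back-pointers for chain reconstruction

    for i, h in enumerate(ordered):
        k = bisect_left(tails, h.s_start)
        if k == len(tails):
            tails.append(h.s_start)
            tails_idx.append(i)
        else:
            tails[k] = h.s_start
            tails_idx[k] = i
        prev[i] = tails_idx[k - 1] if k > 0 else -1

    # Walk the back-pointers from the end of the longest chain.
    chain: List[HSP] = []
    i = tails_idx[-1] if tails_idx else -1
    while i != -1:
        chain.append(ordered[i])
        i = prev[i]
    chain.reverse()
    return chain


# Made-up coordinates: two HSPs (s_start 5000 and 120) are out of order
# relative to the main chain, as a repeat or pseudogene hit might be.
example = [HSP(1, 100), HSP(20, 5000), HSP(40, 150),
           HSP(60, 200), HSP(80, 120), HSP(100, 260)]
chain = longest_colinear_chain(example)
```

In this toy input the chain keeps the four HSPs that are colinear on query and subject and drops the two out-of-order hits. A practical tool would also have to handle the reverse strand and overlapping HSPs, which this sketch leaves out.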

Year: 2003    PMID: 12874051    DOI: 10.1093/bioinformatics/btg168

Source DB: PubMed    Journal: Bioinformatics    ISSN: 1367-4803    Impact factor: 6.937


