Accurate anchoring alignment of divergent sequences.

Weichun Huang, David M Umbach, Leping Li.

Abstract

MOTIVATION: Obtaining high quality alignments of divergent homologous sequences for cross-species sequence comparison remains a challenge.
RESULTS: We propose a novel pairwise sequence alignment algorithm, ACANA (ACcurate ANchoring Alignment), for aligning biological sequences at both local and global levels. Like many fast heuristic methods, ACANA uses an anchoring strategy. However, unlike others, ACANA uses a Smith-Waterman-like dynamic programming algorithm to recursively identify near-optimal regions as anchors for a global alignment. Performance evaluations using a simulated benchmark dataset and real promoter sequences suggest that ACANA is accurate and consistent, especially for divergent sequences. Specifically, we use a simulated benchmark dataset to show that ACANA has the highest sensitivity to align constrained functional sites compared to BLASTZ, CHAOS and DIALIGN for local alignment and compared to AVID, ClustalW, DIALIGN and LAGAN for global alignment. Applied to 6007 pairs of human-mouse orthologous promoter sequences, ACANA identified the largest number of conserved regions (defined as over 70% identity over 100 bp) compared to AVID, ClustalW, DIALIGN and LAGAN. In addition, the average length of conserved region identified by ACANA was the longest. Thus, we suggest that ACANA is a useful tool for identifying functional elements in cross-species sequence analysis, such as predicting transcription factor binding sites in non-coding DNA.
AVAILABILITY: ACANA software and test sequence data are publicly available at http://BioMedEmpire.org/

Year: 2005   PMID: 16301203   DOI: 10.1093/bioinformatics/bti772

Source DB: PubMed   Journal: Bioinformatics   ISSN: 1367-4803   Impact factor: 6.937


  19 in total

1.  EagleView: a genome assembly viewer for next-generation sequencing technologies.

Authors:  Weichun Huang; Gabor Marth
Journal:  Genome Res       Date:  2008-06-11       Impact factor: 9.043

2.  Inference of transcriptional networks in Arabidopsis through conserved noncoding sequence analysis.

Authors:  Jan Van de Velde; Ken S Heyndrickx; Klaas Vandepoele
Journal:  Plant Cell       Date:  2014-07-02       Impact factor: 11.277

3.  Sequence Comparison Without Alignment: The SpaM Approaches.

Authors:  Burkhard Morgenstern
Journal:  Methods Mol Biol       Date:  2021

4.  Function-based classification of carbohydrate-active enzymes by recognition of short, conserved peptide motifs.

Authors:  Peter Kamp Busk; Lene Lange
Journal:  Appl Environ Microbiol       Date:  2013-03-22       Impact factor: 4.792

5.  Deriving genotypes from RAD-seq short-read data using Stacks.

Authors:  Nicolas C Rochette; Julian M Catchen
Journal:  Nat Protoc       Date:  2017-11-30       Impact factor: 13.491

6.  ART: a next-generation sequencing read simulator.

Authors:  Weichun Huang; Leping Li; Jason R Myers; Gabor T Marth
Journal:  Bioinformatics       Date:  2011-12-23       Impact factor: 6.937

7.  Parameters for accurate genome alignment.

Authors:  Martin C Frith; Michiaki Hamada; Paul Horton
Journal:  BMC Bioinformatics       Date:  2010-02-09       Impact factor: 3.169

8.  Adaptive GDDA-BLAST: fast and efficient algorithm for protein sequence embedding.

Authors:  Yoojin Hong; Jaewoo Kang; Dongwon Lee; Damian B van Rossum
Journal:  PLoS One       Date:  2010-10-22       Impact factor: 3.240

9.  Unravelling cis-regulatory elements in the genome of the smallest photosynthetic eukaryote: phylogenetic footprinting in Ostreococcus.

Authors:  Gwenael Piganeau; Klaas Vandepoele; Sébastien Gourbière; Yves Van de Peer; Hervé Moreau
Journal:  J Mol Evol       Date:  2009-08-20       Impact factor: 2.395

10.  MapToGenome: a comparative genomic tool that aligns transcript maps to sequenced genomes.

Authors:  Srikrishna Putta; Jeramiah J Smith; Chuck Staben; S Randal Voss
Journal:  Evol Bioinform Online       Date:  2007-02-14       Impact factor: 1.625
