Jiannan Chao1, Furong Tang2,3, Lei Xu3.
Abstract
The continuous development of sequencing technologies has enabled researchers to obtain large amounts of biological sequence data, which has increased the demand for software that can perform sequence alignment quickly and accurately. A number of algorithms and tools for sequence alignment have been designed to meet the various needs of biologists. Here, the prevailing ideas in sequence alignment research and some quality estimation methods for multiple sequence alignment tools are summarized.
Keywords: alignment quality estimation; alignment refinement; alignment scoring; heuristic alignment algorithms; multiple sequence alignment
Year: 2022 PMID: 35454135 PMCID: PMC9024764 DOI: 10.3390/biom12040546
Source DB: PubMed Journal: Biomolecules ISSN: 2218-273X
Figure 1. Two heuristic algorithms for pairwise sequence alignment. (A) A dynamic programming matrix partitioned by several anchors, which are certain to be on the optimal path. (B) A shape-based bounded dynamic programming matrix in which the light blue block is not computed because these states are considered less likely to be on the optimal path.
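The bounded matrix in panel (B) can be illustrated with the simplest such shape, a fixed-width diagonal band. The following is a minimal sketch under stated assumptions, not the method of any particular tool: the function name `banded_global_score`, the `band` parameter, and the match, mismatch, and linear gap scores are all chosen here for illustration.

```python
def banded_global_score(a, b, band, match=1, mismatch=-1, gap=-2):
    """Global alignment score using only cells (i, j) with |i - j| <= band.

    Illustrative scoring scheme (an assumption): linear gap penalty and a
    simple match/mismatch score. Cells outside the band are never computed.
    """
    n, m = len(a), len(b)
    if abs(n - m) > band:
        raise ValueError("band too narrow to reach the final cell")

    neg_inf = float("-inf")

    # Row 0: only columns inside the band are reachable.
    prev = [neg_inf] * (m + 1)
    for j in range(min(m, band) + 1):
        prev[j] = j * gap

    for i in range(1, n + 1):
        cur = [neg_inf] * (m + 1)
        lo = max(0, i - band)   # leftmost column inside the band
        hi = min(m, i + band)   # rightmost column inside the band
        for j in range(lo, hi + 1):
            if j == 0:
                cur[j] = i * gap
                continue
            s = match if a[i - 1] == b[j - 1] else mismatch
            cur[j] = max(
                prev[j - 1] + s,   # align a[i-1] with b[j-1]
                prev[j] + gap,     # gap in b
                cur[j - 1] + gap,  # gap in a
            )
        prev = cur

    return prev[m]
```

With a band at least as wide as the longer sequence, this reduces to the full dynamic programming computation. A narrower band saves time and memory but can miss the optimal path if that path leaves the band.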
Figure 2. Two strategies for multiple sequence alignment. (A) The star alignment strategy performs multiple sequence alignment based on the consistency among the pairwise alignments between each sequence to be aligned and a center sequence, shown in light blue. (B) The progressive alignment strategy performs multiple sequence alignment along a pre-built guide tree, each internal node of which represents an alignment of two sequences, of one sequence with one profile, or of two profiles.
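The star strategy in panel (A) can be sketched as a merge step: given pairwise alignments of each sequence against the center, every gap opened in the center by any pairwise alignment is propagated to all rows. This is a minimal sketch that assumes the pairwise alignments are already computed. The function name `merge_star` and its input layout are hypothetical, and real tools differ in how they select the center and place insertions.

```python
def merge_star(center, pairwise):
    """Merge pairwise alignments against a common center into one MSA.

    pairwise: list of (aligned_center, aligned_other) string pairs, where
    removing '-' from aligned_center yields `center`.
    Returns the padded center row followed by one row per input pair.
    """
    n = len(center)
    for ac, ao in pairwise:
        if len(ac) != len(ao) or ac.replace("-", "") != center:
            raise ValueError("pairwise alignment does not match the center")

    # max_ins[k]: largest gap run opened in the center before its k-th
    # residue across all pairwise alignments (k == n means trailing gaps).
    max_ins = [0] * (n + 1)
    for ac, _ in pairwise:
        k = run = 0
        for ch in ac:
            if ch == "-":
                run += 1
            else:
                max_ins[k] = max(max_ins[k], run)
                k += 1
                run = 0
        max_ins[n] = max(max_ins[n], run)

    def expand(ac, ao):
        out = []
        i = 0
        for k in range(n + 1):
            run = 0
            # Copy residues inserted relative to the center at this slot.
            while i < len(ac) and ac[i] == "-":
                out.append(ao[i])
                i += 1
                run += 1
            # Pad so that every row has the same slot width.
            out.append("-" * (max_ins[k] - run))
            if k < n:
                out.append(ao[i])  # column aligned to center residue k
                i += 1
        return "".join(out)

    return [expand(center, center)] + [expand(ac, ao) for ac, ao in pairwise]
```

For example, merging `("ACGT", "A-GT")` and `("AC-GT", "ACTGT")` around the center `"ACGT"` yields three rows of equal length, with the gap that the second pair opens in the center also inserted into the first pair's row.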
Quality estimation methods of multiple sequence alignment software.
| | Structural Benchmark | Simulated Sequences | Commonality-Based |
|---|---|---|---|
| Scalability | Low | High | High |
| Pre-Built Alignment | Yes | Yes | No |
| Scoring Methods | Sum of pair score and true column score | Sum of pair score and true column score | Multiple overlap score and head-or-tail score |
| Dependency | Protein structure | Probabilistic model | / |
| Test Sets | Fixed | Configurable | Not limited |
| Drawbacks | Limited by the diversity of benchmarks | Adopted model may have defects | Tested software may share the same mistakes |
| Examples | BAliBASE | ROSE | MUMSA |
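The sum of pair score and true column score listed in the table compare a test alignment against a pre-built reference alignment. The sketch below uses one common formulation, which is an assumption here since benchmarks differ in details such as which columns are counted: the sum of pair score is the fraction of residue pairs aligned in the reference that are also aligned in the test, and the true column score is the fraction of reference columns reproduced exactly. The function names are hypothetical.

```python
from itertools import combinations


def column_tuples(alignment):
    """Describe each column by per-sequence residue indices (None for a gap)."""
    counters = [0] * len(alignment)
    columns = []
    for c in range(len(alignment[0])):
        col = []
        for s, row in enumerate(alignment):
            if row[c] == "-":
                col.append(None)
            else:
                col.append(counters[s])
                counters[s] += 1
        columns.append(tuple(col))
    return columns


def aligned_pairs(columns):
    """Set of (seq_s, residue_i, seq_t, residue_j) pairs sharing a column."""
    pairs = set()
    for col in columns:
        residues = [(s, i) for s, i in enumerate(col) if i is not None]
        for (s, i), (t, j) in combinations(residues, 2):
            pairs.add((s, i, t, j))
    return pairs


def sp_and_tc(test, reference):
    """Return (sum of pair score, true column score) of test vs. reference.

    Both alignments are lists of equal-length gapped strings over the same
    sequences in the same order. Assumes the reference aligns at least one
    residue pair.
    """
    ref_cols = column_tuples(reference)
    test_cols = column_tuples(test)

    ref_pairs = aligned_pairs(ref_cols)
    test_pairs = aligned_pairs(test_cols)
    sp = len(ref_pairs & test_pairs) / len(ref_pairs)

    test_col_set = set(test_cols)
    tc = sum(col in test_col_set for col in ref_cols) / len(ref_cols)
    return sp, tc
```

A test alignment identical to the reference scores 1.0 on both measures. Because the true column score requires an entire column to match, it is always less than or equal to the sum of pair score when every reference column contains at least two residues.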