HAlign: Fast multiple similar DNA/RNA sequence alignment based on the centre star strategy.

Quan Zou, Qinghua Hu, Maozu Guo, Guohua Wang.

Abstract

MOTIVATION: Multiple sequence alignment (MSA) is an important task, but bottlenecks arise in the massive MSA of homologous DNA or genome sequences. Most of the available state-of-the-art software tools either cannot handle large-scale datasets or run rather slowly. The similarity of homologous DNA sequences is often ignored, and the lack of parallelization remains a challenge for MSA research.
RESULTS: We developed two software tools to address the DNA MSA problem. The first employed trie trees to accelerate the centre star MSA strategy, reducing the expected time complexity from quadratic to linear. To address large-scale data, parallelism was applied using the Hadoop platform. Experiments demonstrated the performance of our proposed methods, including their running time, sum-of-pairs scores and scalability. Moreover, we supplied two massive DNA/RNA MSA datasets for further testing and research.
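
The centre star strategy named in the abstract can be illustrated with a minimal Python sketch. This is not the HAlign implementation: it uses plain quadratic dynamic programming for every pairwise alignment instead of the trie-based acceleration, and it does no parallelization. The function names (`align_pair`, `centre_star`) and the unit edit costs are assumptions chosen for illustration only.

```python
def align_pair(a, b):
    """Global alignment with unit edit costs (assumed scoring).

    Returns (a_aligned, b_aligned, cost), using '-' for gaps.
    """
    n, m = len(a), len(b)
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i
    for j in range(1, m + 1):
        dp[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dp[i][j] = min(
                dp[i - 1][j - 1] + (a[i - 1] != b[j - 1]),  # match or mismatch
                dp[i - 1][j] + 1,                           # gap in b
                dp[i][j - 1] + 1,                           # gap in a
            )
    # Trace back from the bottom-right corner to recover one optimal alignment.
    i, j, ra, rb = n, m, [], []
    while i > 0 or j > 0:
        if i > 0 and j > 0 and dp[i][j] == dp[i - 1][j - 1] + (a[i - 1] != b[j - 1]):
            ra.append(a[i - 1]); rb.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and dp[i][j] == dp[i - 1][j] + 1:
            ra.append(a[i - 1]); rb.append("-"); i -= 1
        else:
            ra.append("-"); rb.append(b[j - 1]); j -= 1
    return "".join(reversed(ra)), "".join(reversed(rb)), dp[n][m]


def centre_star(seqs):
    """Align all sequences against a centre sequence and merge the gaps."""
    k = len(seqs)
    # 1. The centre is the sequence with the smallest total distance to the others.
    cost = [[align_pair(seqs[i], seqs[j])[2] if i != j else 0 for j in range(k)]
            for i in range(k)]
    centre = seqs[min(range(k), key=lambda i: sum(cost[i]))]

    # 2. Pairwise-align every sequence (including the centre itself) to the centre.
    pairs = [align_pair(centre, s)[:2] for s in seqs]

    # 3. gaps[p] = largest number of gaps any alignment opens before centre[p]
    #    (p == len(centre) means after the last centre character).
    gaps = [0] * (len(centre) + 1)
    for c_aln, _ in pairs:
        p = run = 0
        for ch in c_aln:
            if ch == "-":
                run += 1
            else:
                gaps[p] = max(gaps[p], run)
                p, run = p + 1, 0
        gaps[p] = max(gaps[p], run)

    # 4. Pad each pairwise alignment so every row uses the merged gap layout.
    rows = []
    for c_aln, s_aln in pairs:
        out, p, run = [], 0, 0
        for c_ch, s_ch in zip(c_aln, s_aln):
            if c_ch == "-":
                out.append(s_ch)
                run += 1
            else:
                out.append("-" * (gaps[p] - run) + s_ch)
                p, run = p + 1, 0
        out.append("-" * (gaps[p] - run))
        rows.append("".join(out))
    return rows


if __name__ == "__main__":
    for row in centre_star(["ACGT", "ACGGT", "AGT"]):
        print(row)
```

In this sketch both the centre selection and each pairwise alignment take quadratic time in the sequence length; the abstract reports that the trie-based approach reduces the expected cost to linear for highly similar sequences.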

Year: 2015 PMID: 25812743 DOI: 10.1093/bioinformatics/btv177

Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937

Cited

56 in total

1. A Survey of Methods for Constructing Rooted Phylogenetic Networks.

Authors: Juan Wang
Journal: PLoS One Date: 2016-11-02 Impact factor: 3.240

2. Human Protein Subcellular Localization with Integrated Source and Multi-label Ensemble Classifier.

Authors: Xiaotong Guo; Fulin Liu; Ying Ju; Zhen Wang; Chunyu Wang
Journal: Sci Rep Date: 2016-06-21 Impact factor: 4.379

3. Accurate Identification of Cancerlectins through Hybrid Machine Learning Technology.

Authors: Jieru Zhang; Ying Ju; Huijuan Lu; Ping Xuan; Quan Zou
Journal: Int J Genomics Date: 2016-07-13 Impact factor: 2.326

4. Big Data: A Parallel Particle Swarm Optimization-Back-Propagation Neural Network Algorithm Based on MapReduce.

Authors: Jianfang Cao; Hongyan Cui; Hao Shi; Lijuan Jiao
Journal: PLoS One Date: 2016-06-15 Impact factor: 3.240

5. DNA binding protein identification by combining pseudo amino acid composition and profile-based protein representation.

Authors: Bin Liu; Shanyi Wang; Xiaolong Wang
Journal: Sci Rep Date: 2015-10-20 Impact factor: 4.379

6. Research on B Cell Algorithm for Learning to Rank Method Based on Parallel Strategy.

7. Two Dimensional Yau-Hausdorff Distance with Applications on Comparison of DNA and Protein Sequences.

8. Identification of Multi-Functional Enzyme with Multi-Label Classifier.

9. Identification of apolipoprotein using feature selection technique.

10. Pattern Recognition on Read Positioning in Next Generation Sequencing.