HAlign: Fast multiple similar DNA/RNA sequence alignment based on the centre star strategy.

Quan Zou, Qinghua Hu, Maozu Guo, Guohua Wang.

Abstract

MOTIVATION: Multiple sequence alignment (MSA) is an important task, but bottlenecks arise when aligning massive sets of homologous DNA or genome sequences. Most of the available state-of-the-art software tools either cannot handle large-scale datasets or run rather slowly. The similarity of homologous DNA sequences is often ignored. Lack of parallelization is still a challenge for MSA research.
RESULTS: We developed two software tools to address the DNA MSA problem. The first employed trie trees to accelerate the centre star MSA strategy. The expected time complexity was reduced from quadratic to linear. To address large-scale data, parallelism was applied using the Hadoop platform. Experiments demonstrated the performance of our proposed methods, including their running time, sum-of-pairs scores and scalability. Moreover, we supplied two massive DNA/RNA MSA datasets for further testing and research.
© The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
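
The abstract does not say how the trie is used, so the following is only an illustrative sketch, not HAlign's actual implementation. It assumes that the centre sequence is cut into fixed-length, non-overlapping segments, that those segments are stored in a trie, and that each other sequence is scanned for exact occurrences of them, giving anchor positions for the pairwise alignment against the centre. The names `build_segment_trie`, `find_anchors` and `seg_len` are hypothetical. For a fixed segment length the scan costs time proportional to the sequence length; a production tool would more likely use an Aho-Corasick style automaton than the simple per-position lookup shown here.

```python
from typing import List, Tuple

END = "$"  # marker key for a complete segment; assumed never to occur in a sequence


def build_segment_trie(centre: str, seg_len: int) -> dict:
    """Store the non-overlapping seg_len-long segments of `centre` in a trie.

    The node reached after seg_len characters keeps, under END, the list of
    offsets in `centre` at which that segment starts.
    """
    root: dict = {}
    for start in range(0, len(centre) - seg_len + 1, seg_len):
        node = root
        for ch in centre[start:start + seg_len]:
            node = node.setdefault(ch, {})
        node.setdefault(END, []).append(start)
    return root


def find_anchors(trie: dict, seq: str, seg_len: int) -> List[Tuple[int, int]]:
    """Return (centre_offset, seq_offset) pairs where a centre segment occurs in seq."""
    anchors: List[Tuple[int, int]] = []
    for i in range(len(seq) - seg_len + 1):
        node = trie
        for ch in seq[i:i + seg_len]:
            node = node.get(ch)
            if node is None:
                break  # no centre segment starts with this prefix
        else:
            # All seg_len characters matched: record every centre offset.
            for start in node.get(END, []):
                anchors.append((start, i))
    return anchors
```

For example, with the centre `"ACGTACGGTT"` and `seg_len = 4`, the trie holds the segments `ACGT` (offset 0) and `ACGG` (offset 4), and scanning `"TTACGTACGGA"` yields the anchors `(0, 2)` and `(4, 6)`. Only the regions between anchors would then need conventional dynamic-programming alignment, which is one plausible way similar sequences can be aligned in near-linear expected time.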

Year:  2015        PMID: 25812743     DOI: 10.1093/bioinformatics/btv177

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  56 in total

1.  A Survey of Methods for Constructing Rooted Phylogenetic Networks.

Authors:  Juan Wang
Journal:  PLoS One       Date:  2016-11-02       Impact factor: 3.240

2.  Human Protein Subcellular Localization with Integrated Source and Multi-label Ensemble Classifier.

Authors:  Xiaotong Guo; Fulin Liu; Ying Ju; Zhen Wang; Chunyu Wang
Journal:  Sci Rep       Date:  2016-06-21       Impact factor: 4.379

3.  Accurate Identification of Cancerlectins through Hybrid Machine Learning Technology.

Authors:  Jieru Zhang; Ying Ju; Huijuan Lu; Ping Xuan; Quan Zou
Journal:  Int J Genomics       Date:  2016-07-13       Impact factor: 2.326

4.  Big Data: A Parallel Particle Swarm Optimization-Back-Propagation Neural Network Algorithm Based on MapReduce.

Authors:  Jianfang Cao; Hongyan Cui; Hao Shi; Lijuan Jiao
Journal:  PLoS One       Date:  2016-06-15       Impact factor: 3.240

5.  DNA binding protein identification by combining pseudo amino acid composition and profile-based protein representation.

Authors:  Bin Liu; Shanyi Wang; Xiaolong Wang
Journal:  Sci Rep       Date:  2015-10-20       Impact factor: 4.379

6.  Research on B Cell Algorithm for Learning to Rank Method Based on Parallel Strategy.

Authors:  Yuling Tian; Hongxian Zhang
Journal:  PLoS One       Date:  2016-08-03       Impact factor: 3.240

7.  Two Dimensional Yau-Hausdorff Distance with Applications on Comparison of DNA and Protein Sequences.

Authors:  Kun Tian; Xiaoqian Yang; Qin Kong; Changchuan Yin; Rong L He; Stephen S-T Yau
Journal:  PLoS One       Date:  2015-09-18       Impact factor: 3.240

8.  Identification of Multi-Functional Enzyme with Multi-Label Classifier.

Authors:  Yuxin Che; Ying Ju; Ping Xuan; Ren Long; Fei Xing
Journal:  PLoS One       Date:  2016-04-14       Impact factor: 3.240

9.  Identification of apolipoprotein using feature selection technique.

Authors:  Hua Tang; Ping Zou; Chunmei Zhang; Rong Chen; Wei Chen; Hao Lin
Journal:  Sci Rep       Date:  2016-07-22       Impact factor: 4.379

10.  Pattern Recognition on Read Positioning in Next Generation Sequencing.

Authors:  Boseon Byeon; Igor Kovalchuk
Journal:  PLoS One       Date:  2016-06-14       Impact factor: 3.240
