Abstract
Protein structure alignment has become an important strategy for identifying evolutionary relationships between protein sequences. Several alignment tools are currently available for online comparison of protein structures. In this paper, we propose a parallel protein structure alignment service based on the Hadoop distribution framework. The service comprises a protein structure alignment algorithm, a refinement algorithm, and a MapReduce programming model. The refinement algorithm improves the result of the initial alignment. To process vast numbers of protein structures in parallel, both the alignment and refinement algorithms are implemented using MapReduce. We analyzed and compared the structure alignments produced by different methods using a dataset randomly selected from the PDB database. The experimental results verify that the proposed algorithm refines the resulting alignments more accurately than existing algorithms. Meanwhile, the computational performance of the proposed service is proportional to the number of processors used in our cloud platform.
Year: 2013 PMID: 23671842 PMCID: PMC3647543 DOI: 10.1155/2013/439681
Source DB: PubMed Journal: Int J Genomics ISSN: 2314-436X Impact factor: 2.326
Figure 1. Procedure of the refinement stage.
Figure 2. Implementation of Hadoop Map/Reduce model.
Figure 3. The architecture of the proposed cloud computing service.
Figure 4. Cloud service portal.
Figure 5. The webpage indicates a submitted request of protein structure alignment.
Figure 6. Protein structure alignment and 3D structural image produced by the cloud service.
RMSD computed by our proposed algorithm, DALI, and VAST.
| Alignment method | Original RMSD | Average RMSD after bipartite matching refinement | Average RMSD after the proposed refinement algorithm |
|---|---|---|---|
| DALI | 1.5767 | 1.5166 | 1.4713 |
| VAST | 1.5114 | 1.4664 | 1.4228 |
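
The table compares methods by RMSD (root-mean-square deviation), where a lower value means the aligned residues lie closer together. As a minimal sketch of how such a value is typically computed, the Python below applies the standard RMSD definition to paired coordinates. It assumes the two structures are already superimposed and that the residue pairing is given. The function name `rmsd` and the coordinates are hypothetical illustrations, not taken from the paper, and the sketch does not reproduce the paper's alignment or refinement algorithm.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two lists of (x, y, z) points.

    Assumption: both structures are already superimposed, and point i in
    coords_a is aligned with point i in coords_b.
    """
    if len(coords_a) != len(coords_b):
        raise ValueError("aligned coordinate lists must have the same length")
    if not coords_a:
        raise ValueError("at least one aligned pair is required")

    # Sum the squared Euclidean distance of every aligned pair.
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2

    # Average over the number of aligned pairs, then take the square root.
    return math.sqrt(total / len(coords_a))


# Hypothetical, made-up coordinates for three aligned residue pairs.
structure_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
structure_b = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (2.0, 0.0, 1.0)]
value = rmsd(structure_a, structure_b)
```

Because the result depends on which residue pairs are included, a refinement step that changes the pairing can change the RMSD even when the underlying structures stay the same.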
Figure 7. RMSD values produced by various numbers of Mappers.