Amit K Chattopadhyay, Diar Nasiev, Darren R Flower.
Abstract
MOTIVATION: Within bioinformatics, the textual alignment of amino acid sequences has long dominated the determination of similarity between proteins, with all that implies for shared structure, function and evolutionary descent. Despite the relative success of modern-day sequence alignment algorithms, so-called alignment-free approaches offer a complementary means of determining and expressing similarity, with potential benefits in certain key applications, such as regression analysis of protein structure-function studies, where alignment-based similarity has performed poorly.
Year: 2015 PMID: 25810434 PMCID: PMC4514925 DOI: 10.1093/bioinformatics/btv167
Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937
Fig. 1. An illustration of the first passage probability distribution across a threshold X(t) = X0
Fig. 2. Clustering from part of the Pfam dataset. Panel 1 represents the reference set of known clusters with groupings obtained from Pfam. Panel 2 is the clustering obtained using a built-in Matlab algorithm that uses only the number of clusters from Pfam (Panel 1) as input. The y-axis represents the average value of the order parameter m, while the x-axis represents the sequence. Separate clusters are shown using different colours
Fig. 3. Histograms of cluster sizes obtained from the clustering data in Fig. 2 and plotted on a log-log scale. The solid line represents actual data (panel 1 of Fig. 2), the dotted line represents the result from the built-in Matlab algorithm (panel 2 of Fig. 2)