A Guaranteed Similarity Metric Learning Framework for Biological Sequence Comparison.

Keru Hua, Qin Yu, Ruiming Zhang.   

Abstract

Sequence similarity is a key mathematical notion for classification and phylogenetic studies in biology. The distance and similarity between two sequences are important and widely studied. Over recent decades, similarity (distance) metric learning has been one of the most active topics in machine learning and data mining, as well as in their applications to bioinformatics. It is feasible to use machine learning to learn a similarity metric from biological data. In this paper, we propose a novel framework of guaranteed similarity metric learning (GMSL) to perform alignment of biological sequences in any feature vector space. It introduces the (ϵ, γ, τ)-goodness similarity theory to Mahalanobis metric learning. As a theoretically guaranteed similarity metric learning approach, GMSL guarantees that the learned similarity function performs well in classification and clustering. Our experiments on widely used datasets demonstrate that our approach outperforms state-of-the-art biological sequence alignment methods and other similarity metric learning algorithms in both accuracy and stability.
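To make the phrase "Mahalanobis metric learning in a feature vector space" concrete, the following is a minimal sketch, not the authors' GMSL method. It assumes a k-mer frequency embedding (the abstract does not name a feature map) and takes the matrix L as given rather than learned; the function names `kmer_vector` and `mahalanobis_distance` are hypothetical. It only shows the general form d_M(x, y) = sqrt((x - y)^T M (x - y)) with M = L^T L, which is positive semidefinite by construction.

```python
import itertools
import numpy as np


def kmer_vector(seq, k=2, alphabet="ACGT"):
    """Map a sequence to a normalized k-mer frequency vector.

    Assumption: this feature map is illustrative only; the abstract
    says "any feature vector space" and does not specify one.
    """
    kmers = ["".join(p) for p in itertools.product(alphabet, repeat=k)]
    index = {kmer: i for i, kmer in enumerate(kmers)}
    counts = np.zeros(len(kmers))
    for i in range(len(seq) - k + 1):
        j = index.get(seq[i:i + k])
        if j is not None:  # skip k-mers containing symbols outside the alphabet
            counts[j] += 1
    total = counts.sum()
    return counts / total if total > 0 else counts


def mahalanobis_distance(x, y, L):
    """Return sqrt((x - y)^T M (x - y)) with M = L^T L.

    Assumption: L is supplied by the caller; a metric learning
    algorithm would fit it to labelled data instead.
    """
    diff = L @ (x - y)
    return float(np.sqrt(diff @ diff))


if __name__ == "__main__":
    x = kmer_vector("ACGTACGT")
    y = kmer_vector("AAAACCCC")
    L = np.eye(len(x))  # identity L recovers the Euclidean distance
    print(mahalanobis_distance(x, y, L))
```

With L set to the identity the distance reduces to the ordinary Euclidean distance between the two k-mer profiles; a learned L would instead stretch or shrink directions of the feature space according to the training labels.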

Year:  2015        PMID: 26529778     DOI: 10.1109/TCBB.2015.2495186

Source DB:  PubMed          Journal:  IEEE/ACM Trans Comput Biol Bioinform        ISSN: 1545-5963            Impact factor:   3.710


  2 in total

1.  Analyzing similarities in genome sequences.

Authors:  I C Fonseca; E Nogueira; P H Figueirêdo; S Coutinho
Journal:  Eur Phys J E Soft Matter       Date:  2018-01-19       Impact factor: 1.890

2.  Machine learning-based chemical binding similarity using evolutionary relationships of target genes.

Authors:  Keunwan Park; Young-Joon Ko; Prasannavenkatesh Durai; Cheol-Ho Pan
Journal:  Nucleic Acids Res       Date:  2019-11-18       Impact factor: 16.971
