Detecting duplicate biological entities using Shortest Path Edit Distance.

Alex Rudniy, Min Song, James Geller.

Abstract

Duplicate entity detection in biological data is an important research task. In this paper, we propose a novel, context-sensitive Shortest Path Edit Distance (SPED) that extends and supplements our previous work on Markov Random Field-based Edit Distance (MRFED). SPED transforms the edit distance computation into the calculation of the shortest path between two selected vertices of a graph. We produce several variants of SPED by applying Levenshtein, arithmetic mean, histogram difference and TF-IDF techniques to solve subtasks. We compare SPED's performance with that of other well-known distance algorithms for biological entity matching. The experimental results show that SPED produces competitive outcomes.
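
The idea of computing an edit distance as a shortest path can be illustrated with a minimal sketch. The code below is not the authors' SPED, whose context-sensitive edge weights are not given in this abstract. It assumes the plain Levenshtein formulation instead: vertex (i, j) means "the first i characters of a have been turned into the first j characters of b", insertions and deletions cost 1, and a diagonal step costs 0 on a match and 1 on a substitution. The function name edit_distance_shortest_path is hypothetical, and Dijkstra's algorithm is one possible choice of shortest-path search.

```python
import heapq


def edit_distance_shortest_path(a: str, b: str) -> int:
    """Levenshtein distance as a shortest path from (0, 0) to (len(a), len(b))."""
    m, n = len(a), len(b)
    source, target = (0, 0), (m, n)
    dist = {source: 0}
    heap = [(0, source)]  # priority queue of (cost so far, vertex)

    while heap:
        d, (i, j) = heapq.heappop(heap)
        if (i, j) == target:
            return d
        if d > dist[(i, j)]:
            continue  # stale queue entry; a cheaper path was already found

        # Assumed unit-cost edges (plain Levenshtein, not SPED's weights).
        edges = []
        if i < m:
            edges.append(((i + 1, j), 1))  # delete a[i]
        if j < n:
            edges.append(((i, j + 1), 1))  # insert b[j]
        if i < m and j < n:
            # match costs 0, substitution costs 1
            edges.append(((i + 1, j + 1), 0 if a[i] == b[j] else 1))

        for vertex, cost in edges:
            candidate = d + cost
            if candidate < dist.get(vertex, float("inf")):
                dist[vertex] = candidate
                heapq.heappush(heap, (candidate, vertex))

    return dist[target]


print(edit_distance_shortest_path("kitten", "sitting"))  # 3
print(edit_distance_shortest_path("BRCA1", "BRCA2"))     # 1
```

Replacing the unit costs with weights that depend on the neighbouring characters would be one way to make such a graph context-sensitive, but how SPED actually does this is described in the paper, not here.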

Year:  2010        PMID: 20815139     DOI: 10.1504/ijdmb.2010.034196

Source DB:  PubMed          Journal:  Int J Data Min Bioinform        ISSN: 1748-5673            Impact factor:   0.667


  4 in total

1.  Building the process-drug-side effect network to discover the relationship between biological processes and side effects.

Authors:  Sejoon Lee; Kwang H Lee; Min Song; Doheon Lee
Journal:  BMC Bioinformatics       Date:  2011-03-29       Impact factor: 3.169

2.  Supervised Learning for Detection of Duplicates in Genomic Sequence Databases.

Authors:  Qingyu Chen; Justin Zobel; Xiuzhen Zhang; Karin Verspoor
Journal:  PLoS One       Date:  2016-08-04       Impact factor: 3.240

3.  Literature consistency of bioinformatics sequence databases is effective for assessing record quality.

Authors:  Mohamed Reda Bouadjenek; Karin Verspoor; Justin Zobel
Journal:  Database (Oxford)       Date:  2017-01-01       Impact factor: 3.451

4.  Duplicates, redundancies and inconsistencies in the primary nucleotide databases: a descriptive study.

Authors:  Qingyu Chen; Justin Zobel; Karin Verspoor
Journal:  Database (Oxford)       Date:  2017-01-10       Impact factor: 3.451

