The choice of optimal distance measure in genome-wide datasets.

Galina Glazko, Alexander Gordon, Arcady Mushegian.

Abstract

MOTIVATION: Many types of genomic data are naturally represented as binary vectors. Numerous tasks in computational biology can be cast as analysis of relationships between these vectors, and the first step is, frequently, to compute their pairwise distance matrix. Many distance measures have been proposed in the literature, but there is no theory justifying the choice of distance measure.
RESULTS: We examine the approaches to measuring distances between binary vectors and study the characteristic properties of various distance measures and their performance in several tasks of genome analysis. Most distance measures between binary vectors turn out to belong to a single parametric family, namely generalized average-based distance with different exponents. We show that descriptive statistics of distance distribution, such as skewness and kurtosis, can guide the appropriate choice of the exponent. On the contrary, the more familiar distance properties, such as metric and additivity, appear to have much less effect on the performance of distances.
AVAILABILITY: R code GADIST and Supplementary material are available at http://research.stowers-institute.org/bioinfo/
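
The abstract does not state the formula for the generalized average-based distance, so the following Python sketch is only one plausible reading and is not the authors' GADIST code. It assumes the distance is one minus the number of shared 1s divided by a power mean (with exponent p) of the two vectors' counts of 1s. Under that assumption, p = 1 gives the Dice distance, p = 0 the Ochiai distance, and p = -1 the Kulczynski distance. The function names `power_mean`, `ga_distance`, and `skewness_kurtosis` are hypothetical.

```python
import math
from itertools import combinations


def power_mean(x, y, p):
    """Generalized (power) mean of two positive numbers with exponent p."""
    if p == 0:
        return math.sqrt(x * y)          # limit p -> 0: geometric mean
    if math.isinf(p):
        return max(x, y) if p > 0 else min(x, y)
    return ((x ** p + y ** p) / 2) ** (1 / p)


def ga_distance(u, v, p):
    """ASSUMED form: 1 - shared_ones / power_mean(ones_in_u, ones_in_v, p)."""
    shared = sum(1 for a, b in zip(u, v) if a and b)
    ones_u, ones_v = sum(u), sum(v)
    if ones_u == 0 or ones_v == 0:
        raise ValueError("distance is undefined for an all-zero vector")
    return 1.0 - shared / power_mean(ones_u, ones_v, p)


def skewness_kurtosis(values):
    """Population skewness and excess kurtosis of a list of numbers.

    Raises ZeroDivisionError if all values are identical.
    """
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0


if __name__ == "__main__":
    # Made-up binary vectors, for illustration only.
    vectors = [
        [1, 1, 1, 1, 0, 0],
        [1, 1, 0, 0, 0, 0],
        [0, 1, 1, 0, 1, 0],
        [1, 0, 0, 1, 1, 1],
    ]
    for p in (-1, 0, 1):
        dists = [ga_distance(u, v, p) for u, v in combinations(vectors, 2)]
        skew, kurt = skewness_kurtosis(dists)
        print(f"p={p}: skewness={skew:.3f}, excess kurtosis={kurt:.3f}")
```

Running the script prints the skewness and excess kurtosis of the pairwise distance distribution for each exponent, which are the kind of descriptive statistics the abstract suggests for guiding the choice of exponent. How those statistics map to a recommended exponent is not given in the abstract and is not attempted here.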

Year:  2005        PMID: 16306389     DOI: 10.1093/bioinformatics/bti1201

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  15 in total

1.  ORTom: a multi-species approach based on conserved co-expression to identify putative functional relationships among genes in tomato.

Authors:  Laura Miozzi; Paolo Provero; Gian Paolo Accotto
Journal:  Plant Mol Biol       Date:  2010-04-22       Impact factor: 4.076

2.  Genotyping of human Campylobacter jejuni isolates in Greece by pulsed-field gel electrophoresis.

Authors:  Anastassios Ioannidis; Chryssoula Nicolaou; Nicholas John Legakis; Vasiliki Ioannidou; Eleni Papavasileiou; Aliki Voyatzi; Stylianos Chatzipanagiotou
Journal:  Mol Diagn Ther       Date:  2006       Impact factor: 4.074

3.  Measuring gene expression divergence: the distance to keep.

Authors:  Galina Glazko; Arcady Mushegian
Journal:  Biol Direct       Date:  2010-08-06       Impact factor: 4.540

4.  Testing the ortholog conjecture with comparative functional genomic data from mammals.

Authors:  Nathan L Nehrt; Wyatt T Clark; Predrag Radivojac; Matthew W Hahn
Journal:  PLoS Comput Biol       Date:  2011-06-09       Impact factor: 4.475

5.  The Australian dingo is an early offshoot of modern breed dogs.

Authors:  Matt A Field; Sonu Yadav; Olga Dudchenko; Meera Esvaran; Benjamin D Rosen; Ksenia Skvortsova; Richard J Edwards; Jens Keilwagen; Blake J Cochran; Bikash Manandhar; Sonia Bustamante; Jacob Agerbo Rasmussen; Richard G Melvin; Barry Chernoff; Arina Omer; Zane Colaric; Eva K F Chan; Andre E Minoche; Timothy P L Smith; M Thomas P Gilbert; Ozren Bogdanovic; Robert A Zammit; Torsten Thomas; Erez L Aiden; J William O Ballard
Journal:  Sci Adv       Date:  2022-04-22       Impact factor: 14.957

6.  Employing conservation of co-expression to improve functional inference.

Authors:  Carsten O Daub; Erik Ll Sonnhammer
Journal:  BMC Syst Biol       Date:  2008-09-22

7.  Similarity searches in genome-wide numerical data sets.

Authors:  Galina Glazko; Michael Coleman; Arcady Mushegian
Journal:  Biol Direct       Date:  2006-05-30       Impact factor: 4.540

8.  Integrative sequence and tissue expression profiling of chicken and mammalian aquaporins.

Authors:  Raphael D Isokpehi; Rajendram V Rajnarayanan; Cynthia D Jeffries; Tolulola O Oyeleye; Hari H P Cohly
Journal:  BMC Genomics       Date:  2009-07-14       Impact factor: 3.969

9.  Detection of biochemical pathways by probabilistic matching of phyletic vectors.

Authors:  Hua Li; David M Kristensen; Michael K Coleman; Arcady Mushegian
Journal:  PLoS One       Date:  2009-04-24       Impact factor: 3.240

10.  Evolutionary history of bacteriophages with double-stranded DNA genomes.

Authors:  Galina Glazko; Vladimir Makarenkov; Jing Liu; Arcady Mushegian
Journal:  Biol Direct       Date:  2007-12-06       Impact factor: 4.540
