Distance-based clustering of CGH data.

Jun Liu, Jaaved Mohammed, James Carter, Sanjay Ranka, Tamer Kahveci, Michael Baudis.

Abstract

MOTIVATION: We consider the problem of clustering a population of Comparative Genomic Hybridization (CGH) data samples. The goal is to develop a systematic way of placing patients with similar CGH imbalance profiles into the same cluster. Our expectation is that patients with the same cancer type will generally belong to the same cluster, as their underlying CGH profiles will be similar.
RESULTS: We focus on distance-based clustering strategies, which proceed in two steps: (1) distances between all pairs of CGH samples are computed; (2) CGH samples are clustered based on these distances. We develop three pairwise distance/similarity measures, namely raw, cosine and sim. The raw measure disregards correlation between contiguous genomic intervals and compares the aberrations in each genomic interval separately. The remaining measures assume that consecutive genomic intervals may be correlated. Cosine maps pairs of CGH samples into vectors in a high-dimensional space and measures the angle between them. Sim measures the number of independent common aberrations. We test our distance/similarity measures with three well-known clustering algorithms: bottom-up, top-down, and k-means with and without centroid shrinking. Our results show that sim consistently performs better than the remaining measures. This indicates that the correlation of neighboring genomic intervals should be considered in the structural analysis of CGH datasets. The combination of sim with top-down clustering emerged as the best approach.
AVAILABILITY: All software developed in this article and all the datasets are available from the authors upon request.
CONTACT: juliu@cise.ufl.edu.
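
The two-step procedure and the three measures named in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the exact definitions, so everything below is an assumption made for illustration: samples are encoded as one integer per genomic interval (+1 gain, -1 loss, 0 no change), raw counts matching aberrant intervals, cosine is taken directly on these vectors, and sim counts each run of contiguous shared aberrations once. The function names and the average-linkage bottom-up clustering are likewise hypothetical and are not the authors' software.

```python
from itertools import groupby
from math import sqrt

# Assumed encoding (not stated in the abstract): one integer per genomic
# interval, +1 = gain, -1 = loss, 0 = no change.


def raw_similarity(a, b):
    """Count intervals where both samples carry the same aberration."""
    return sum(1 for x, y in zip(a, b) if x != 0 and x == y)


def cosine_similarity(a, b):
    """Cosine of the angle between the two profile vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def sim_similarity(a, b):
    """Count runs of contiguous intervals that share the same aberration.

    Adjacent shared gains (or losses) count once, so neighboring intervals
    are treated as a single correlated event rather than several.
    """
    shared = [x if (x != 0 and x == y) else 0 for x, y in zip(a, b)]
    return sum(1 for value, _ in groupby(shared) if value != 0)


def pairwise(samples, measure):
    """Step 1: compute the similarity of every pair of samples."""
    n = len(samples)
    return [[measure(samples[i], samples[j]) for j in range(n)]
            for i in range(n)]


def bottom_up(matrix, k):
    """Step 2: merge the two most similar clusters until k remain.

    Uses average linkage on the similarity matrix (an illustrative choice).
    """
    clusters = [[i] for i in range(len(matrix))]

    def link(c1, c2):
        return sum(matrix[i][j] for i in c1 for j in c2) / (len(c1) * len(c2))

    while len(clusters) > k:
        p, q = max(
            ((p, q) for p in range(len(clusters))
             for q in range(p + 1, len(clusters))),
            key=lambda pq: link(clusters[pq[0]], clusters[pq[1]]),
        )
        merged = clusters.pop(q)  # q > p, so index p is unaffected
        clusters[p].extend(merged)
    return [sorted(c) for c in clusters]


# Toy profiles over eight intervals (made-up data, for illustration only).
samples = [
    [1, 1, 1, 0, -1, -1, 0, 0],
    [1, 1, 0, 0, -1, -1, 0, 1],
    [0, 0, 0, 1, 1, 0, -1, -1],
    [0, 0, 0, 1, 1, 0, -1, 0],
]
clusters = bottom_up(pairwise(samples, sim_similarity), k=2)
```

On the first two toy profiles, raw counts four matching intervals while sim counts two events (one shared gain segment and one shared loss segment), which is the distinction the abstract draws between ignoring and accounting for correlation among neighboring intervals. Bottom-up clustering is shown only because it is the shortest to write; the abstract reports that sim combined with top-down clustering performed best.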

Year:  2006        PMID: 16705014     DOI: 10.1093/bioinformatics/btl185

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  14 in total

1.  Affinity and Penalty Jointly Constrained Spectral Clustering With All-Compatibility, Flexibility, and Robustness.

Authors:  Pengjiang Qian; Yizhang Jiang; Shitong Wang; Kuan-Hao Su; Jun Wang; Lingzhi Hu; Raymond F Muzic
Journal:  IEEE Trans Neural Netw Learn Syst       Date:  2016-02-18       Impact factor: 10.451

2.  Cross-domain, soft-partition clustering with diversity measure and knowledge reference.

Authors:  Pengjiang Qian; Shouwei Sun; Yizhang Jiang; Kuan-Hao Su; Tongguang Ni; Shitong Wang; Raymond F Muzic
Journal:  Pattern Recognit       Date:  2016-02       Impact factor: 7.740

3.  Cluster Prototypes and Fuzzy Memberships Jointly Leveraged Cross-Domain Maximum Entropy Clustering.

Authors:  Pengjiang Qian; Yizhang Jiang; Zhaohong Deng; Lingzhi Hu; Shouwei Sun; Shitong Wang; Raymond F Muzic
Journal:  IEEE Trans Cybern       Date:  2016-01       Impact factor: 11.448

4.  A method for detecting significant genomic regions associated with oral squamous cell carcinoma using aCGH.

Authors:  Ki-Yeol Kim; Jin Kim; Hyung Jun Kim; Woong Nam; In-Ho Cha
Journal:  Med Biol Eng Comput       Date:  2010-03-20       Impact factor: 2.602

5.  Copy number abnormalities in sporadic canine colorectal cancers.

Authors:  Jie Tang; Shoshona Le; Liang Sun; Xiuzhen Yan; Mucheng Zhang; Jennifer Macleod; Bruce Leroy; Nicole Northrup; Angela Ellis; Timothy J Yeatman; Yanchun Liang; Michael E Zwick; Shaying Zhao
Journal:  Genome Res       Date:  2010-01-19       Impact factor: 9.043

6.  Robust unmixing of tumor states in array comparative genomic hybridization data.

Authors:  David Tolliver; Charalampos Tsourakakis; Ayshwarya Subramanian; Stanley Shackney; Russell Schwartz
Journal:  Bioinformatics       Date:  2010-06-15       Impact factor: 6.937

7.  A mathematical methodology for determining the temporal order of pathway alterations arising during gliomagenesis.

Authors:  Yu-Kang Cheng; Rameen Beroukhim; Ross L Levine; Ingo K Mellinghoff; Eric C Holland; Franziska Michor
Journal:  PLoS Comput Biol       Date:  2012-01-05       Impact factor: 4.475

8.  Detection of recurrent copy number alterations in the genome: taking among-subject heterogeneity seriously.

Authors:  Oscar M Rueda; Ramon Diaz-Uriarte
Journal:  BMC Bioinformatics       Date:  2009-09-23       Impact factor: 3.169

9.  Classification and feature selection algorithms for multi-class CGH data.

Authors:  Jun Liu; Sanjay Ranka; Tamer Kahveci
Journal:  Bioinformatics       Date:  2008-07-01       Impact factor: 6.937

10.  A forward-backward fragment assembling algorithm for the identification of genomic amplification and deletion breakpoints using high-density single nucleotide polymorphism (SNP) array.

Authors:  Tianwei Yu; Hui Ye; Wei Sun; Ker-Chau Li; Zugen Chen; Sharoni Jacobs; Dione K Bailey; David T Wong; Xiaofeng Zhou
Journal:  BMC Bioinformatics       Date:  2007-05-03       Impact factor: 3.169
