
A comparison of cluster analysis methods using DNA methylation data.

Kimberly D Siegmund, Peter W Laird, Ite A Laird-Offringa.

Abstract

MOTIVATION: Aberrant DNA methylation is common in cancer. DNA methylation profiles differ between tumor types and subtypes and provide a powerful diagnostic tool for identifying clusters of samples and/or genes. DNA methylation data obtained with the quantitative, highly sensitive MethyLight technology is not normally distributed; it frequently contains an excess of zeros. Established tools to analyze this type of data do not exist. Here, we evaluate a variety of methods for cluster analysis to determine which is most reliable.
RESULTS: We introduce a Bernoulli-lognormal mixture model for clustering DNA methylation data obtained using MethyLight. We model the outcomes using a two-part distribution having discrete and continuous components. It is compared with standard cluster analysis approaches for continuous data and for discrete data. In a simulation study, we find that the two-part model has the lowest classification error rate for mixture outcome data compared with other approaches. The methods are illustrated using DNA methylation data from a study of lung cancer cell lines. Compared with competing hierarchical clustering methods, the mixture model approaches have the lowest cross-validation error for detecting lung cancer subtype (non-small versus small cell). The Bernoulli-lognormal mixture assigns observations to subgroups with the lowest uncertainty.
AVAILABILITY: Software is available upon request from the authors.
SUPPLEMENTARY INFORMATION: http://www-rcf.usc.edu/~kims/SupplementaryInfo.html
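
To make the two-part distribution in RESULTS concrete, here is a minimal sketch, not the authors' software. It assumes each measurement is zero with probability 1 - p and otherwise lognormal with parameters mu and sigma, that markers are independent within a cluster, and that the mixture is fitted by EM. The function names (two_part_logpdf, fit_mixture, simulate), the random initialisation, and the numerical floors are all hypothetical choices made for illustration.

```python
import numpy as np

def two_part_logpdf(x, p, mu, sigma):
    """Log-density of a Bernoulli-lognormal outcome (illustrative).

    x == 0 occurs with probability 1 - p; with probability p the value
    is drawn from LogNormal(mu, sigma).
    """
    x = np.asarray(x, dtype=float)
    pos = x > 0
    safe = np.where(pos, x, 1.0)  # placeholder where x == 0; result unused there
    log_ln = (-np.log(safe * sigma * np.sqrt(2 * np.pi))
              - (np.log(safe) - mu) ** 2 / (2 * sigma ** 2))
    return np.where(pos, np.log(p) + log_ln, np.log1p(-p))

def fit_mixture(X, k, n_iter=100, seed=0):
    """EM for a k-cluster mixture of independent two-part markers.

    X has shape (samples, markers). Returns posterior cluster
    probabilities and the fitted parameters.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    posf = (X > 0).astype(float)
    logx = np.log(np.where(X > 0, X, 1.0))
    resp = rng.dirichlet(np.ones(k), size=n)  # random soft start (assumption)
    for _ in range(n_iter):
        # M-step: weighted estimates per cluster and marker
        w = resp.sum(axis=0) + 1e-9
        pi = w / w.sum()
        wpos = resp.T @ posf
        denom = np.maximum(wpos, 1e-9)
        p = np.clip(wpos / w[:, None], 1e-3, 1 - 1e-3)
        mu = (resp.T @ (posf * logx)) / denom
        var = (resp.T @ (posf * logx ** 2)) / denom - mu ** 2
        sigma = np.sqrt(np.maximum(var, 1e-4))  # floor avoids degenerate spikes
        # E-step: posterior probability of each cluster for each sample
        logp = np.stack(
            [two_part_logpdf(X, p[c], mu[c], sigma[c]).sum(axis=1)
             for c in range(k)], axis=1) + np.log(pi)
        logp -= logp.max(axis=1, keepdims=True)
        resp = np.exp(logp)
        resp /= resp.sum(axis=1, keepdims=True)
    return resp, pi, p, mu, sigma

def simulate(n, d, p, mu, sigma, rng):
    """Draw n samples of d independent two-part markers."""
    present = rng.random((n, d)) < p
    values = rng.lognormal(mean=mu, sigma=sigma, size=(n, d))
    return np.where(present, values, 0.0)

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = np.vstack([simulate(100, 10, 0.9, 3.0, 0.5, rng),
                   simulate(100, 10, 0.1, 0.0, 0.5, rng)])
    resp, pi, p, mu, sigma = fit_mixture(X, k=2)
    uncertainty = 1.0 - resp.max(axis=1)  # one common definition (assumption)
    print("cluster sizes:", np.bincount(resp.argmax(axis=1), minlength=2))
    print("mean uncertainty:", uncertainty.mean())
```

The discrete part (zero versus non-zero) and the continuous part (log of the positive values) both enter the likelihood, so clusters can separate on methylation frequency, methylation level, or both. The quantity 1 - max posterior probability is one common way to express the assignment uncertainty mentioned in the abstract; whether the authors used exactly this definition is not stated here.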

Year: 2004 | PMID: 15044245 | DOI: 10.1093/bioinformatics/bth176

Source DB: PubMed | Journal: Bioinformatics | ISSN: 1367-4803 | Impact factor: 6.937


