Youjuan Li1, Ji Zhu. 1. Department of Statistics, University of Michigan, Michigan, USA.
Abstract
MOTIVATION: The identification of DNA copy number changes provides insights that may advance our understanding of initiation and progression of cancer. Array-based comparative genomic hybridization (array-CGH) has emerged as a technique allowing high-throughput genome-wide scanning for chromosomal aberrations. A number of statistical methods have been proposed for the analysis of array-CGH data. In this article, we consider a fused quantile regression model based on three motivations: (1) quantile regression may provide a more comprehensive picture for the ratio profile of copy numbers than the standard mean regression approach; (2) for simplicity, most available methods assume uniform spacing between neighboring clones, while incorporating the information of physical locations of clones may be helpful and (3) most current methods have a set of tuning parameters that must be carefully tuned, which introduces complexity to the implementation. RESULTS: We formulate the detection of regions of gains and losses in a fused regularized quantile regression framework, incorporating physical locations of clones. We derive an efficient algorithm that computes the entire solution path for the resulting optimization problem, and we propose a simple estimate for the complexity of the fitted model, which leads to convenient selection of the tuning parameter. Three published array-CGH datasets are used to demonstrate our approach. AVAILABILITY: R code are available at http://www.stat.lsa.umich.edu/~jizhu/code/cgh/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
MOTIVATION: The identification of DNA copy number changes provides insights that may advance our understanding of initiation and progression of cancer. Array-based comparative genomic hybridization (array-CGH) has emerged as a technique allowing high-throughput genome-wide scanning for chromosomal aberrations. A number of statistical methods have been proposed for the analysis of array-CGH data. In this article, we consider a fused quantile regression model based on three motivations: (1) quantile regression may provide a more comprehensive picture for the ratio profile of copy numbers than the standard mean regression approach; (2) for simplicity, most available methods assume uniform spacing between neighboring clones, while incorporating the information of physical locations of clones may be helpful and (3) most current methods have a set of tuning parameters that must be carefully tuned, which introduces complexity to the implementation. RESULTS: We formulate the detection of regions of gains and losses in a fused regularized quantile regression framework, incorporating physical locations of clones. We derive an efficient algorithm that computes the entire solution path for the resulting optimization problem, and we propose a simple estimate for the complexity of the fitted model, which leads to convenient selection of the tuning parameter. Three published array-CGH datasets are used to demonstrate our approach. AVAILABILITY: R code are available at http://www.stat.lsa.umich.edu/~jizhu/code/cgh/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Authors: Chris D Greenman; Graham Bignell; Adam Butler; Sarah Edkins; Jon Hinton; Dave Beare; Sajani Swamy; Thomas Santarius; Lina Chen; Sara Widaa; P Andy Futreal; Michael R Stratton Journal: Biostatistics Date: 2009-10-15 Impact factor: 5.899