Zee Chung Ying Benny, Lee Jock Wai Jack, Wong Nathalie, Yeo Winnie, Lai Bo San Paul, Mok Shu Kam Tony, Chan Tak Cheung Anthony.
Abstract
Microarray techniques using cDNA arrays and comparative genomic hybridization (CGH) have been developed for several discovery applications, and in recent years they have been frequently applied to the prediction and diagnosis of cancer. Many studies have shown that integrating genomic data from different sources may increase the reliability of gene expression analysis results in understanding cancer progression. Developing a good prognostic model that deals simultaneously with different types of dataset is therefore important. The challenge with these types of data is high background noise. We describe an analytical two-stage framework with a multi-parallel data analysis method named wavelet-based generalized singular value decomposition and shaving (WGSVD-shaving). This method is proposed for de-noising and dimension reduction during early-stage prognosis modeling. We also applied a supervised gene clustering technique with penalized logistic regression with the Cox model to the integrated data. We show the accuracy of the method using a simulated dataset and a case study on hepatocellular carcinoma (HCC) cDNA and CGH data. The method shows improved results over GSVD-shaving and has application in the discovery of candidate genes associated with cancer.
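The de-noising and decomposition stage described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes a one-level Haar transform with soft thresholding applied along the gene axis, a GSVD built from a QR factorization of the stacked matrices (valid only when each matrix has at least as many genes as samples and the stacked matrix has full column rank), and a single-pass ranking by absolute loading in place of iterative shaving. The function names, the threshold of 0.5, and the random 60-gene by 20-sample matrices are hypothetical.

```python
import numpy as np

def haar_denoise(x, threshold):
    """One-level Haar transform along the last axis with soft-thresholded details."""
    x = np.asarray(x, dtype=float)
    if x.shape[-1] % 2:
        raise ValueError("last axis must have even length")
    even, odd = x[..., 0::2], x[..., 1::2]
    approx = (even + odd) / np.sqrt(2.0)
    detail = (even - odd) / np.sqrt(2.0)
    # Soft thresholding shrinks small detail coefficients to zero.
    detail = np.sign(detail) * np.maximum(np.abs(detail) - threshold, 0.0)
    out = np.empty_like(x)
    out[..., 0::2] = (approx + detail) / np.sqrt(2.0)
    out[..., 1::2] = (approx - detail) / np.sqrt(2.0)
    return out

def gsvd(a, b):
    """Return u, c, v, s, xt with a = u diag(c) xt and b = v diag(s) xt."""
    q, r = np.linalg.qr(np.vstack([a, b]))      # reduced QR of the stacked data
    q1, q2 = q[:a.shape[0]], q[a.shape[0]:]
    u, c, wt = np.linalg.svd(q1, full_matrices=False)
    q2w = q2 @ wt.T                             # columns are orthogonal
    s = np.linalg.norm(q2w, axis=0)             # s**2 = 1 - c**2
    v = q2w / s                                 # assumes every s > 0
    xt = wt @ r                                 # shared right basis
    return u, c, v, s, xt

def top_genes(loadings, keep):
    """Indices of the `keep` genes with the largest absolute loading."""
    return np.argsort(-np.abs(loadings))[:keep]

# Hypothetical data: 60 genes (rows) by 20 samples (columns) for each platform.
rng = np.random.default_rng(0)
cdna = rng.normal(size=(60, 20))
cgh = rng.normal(size=(60, 20))

# Assumption: de-noise along the gene axis, hence the transposes.
cdna_d = haar_denoise(cdna.T, 0.5).T
cgh_d = haar_denoise(cgh.T, 0.5).T

u, c, v, s, xt = gsvd(cdna_d, cgh_d)
kept = top_genes(u[:, 0], keep=10)
```

The ratio c/s for each component indicates how strongly that component is expressed in the first dataset relative to the second, which is one way to choose the component used for ranking genes.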
Keywords: Cox-model; HCC; generalized singular decomposition; wavelets
Year: 2008 PMID: 18795104 PMCID: PMC2532705 DOI: 10.6026/97320630002395
Source DB: PubMed Journal: Bioinformation ISSN: 0973-2063
Figure 1. The framework for wavelet-based prognosis analysis combined with the analysis of cDNA and CGH data.
Figure 2. Effect of random noise on gene lists. We compared the effects of additive noise on cDNA and CGH data using both GSVD-shaving and WGSVD-shaving algorithms.
Figure 3. Expression of the retained top 49 highest-variance genes across 20 HCC samples. The patterns show the highest parallel contributions to the iterative projections after all other genes are shaved out: (a) by the GSVD-shaving approach, (b) by the WGSVD-shaving approach.
Figure 4. Copy number ratio (transformed) of the retained top 49 highest-variance genes across 20 HCC samples. The patterns show the genes with the highest variation in copy number and the strongest correlation across all samples: (a) by the GSVD-shaving approach, (b) by the WGSVD-shaving approach.
Figure 5. Comparison of results between GSVD and wavelet-scaled GSVD at θmax.