Supervised clustering of high-dimensional data using regularized mixture modeling.

Wennan Chang¹, Changlin Wan¹, Yong Zang², Chi Zhang³, Sha Cao².

Abstract

Identifying relationships between genetic variations and their clinical presentations has been challenged by the heterogeneous causes of a disease. It is imperative to unveil the relationship between the high-dimensional genetic manifestations and the clinical presentations, while taking into account the possible heterogeneity of the study subjects. We proposed a novel supervised clustering algorithm using a penalized mixture regression model, called component-wise sparse mixture regression (CSMR), to deal with the challenges in studying the heterogeneous relationships between high-dimensional genetic features and a phenotype. The algorithm was adapted from the classification expectation maximization algorithm and offers a novel supervised solution to the clustering problem, with substantial improvement in both computational efficiency and biological interpretability. Experimental evaluation on simulated benchmark datasets demonstrated that CSMR can accurately identify the subspaces on which subsets of features are explanatory to the response variables, and that it outperformed the baseline methods. Application of CSMR to a drug sensitivity dataset again demonstrated its superior performance over the other methods: CSMR was powerful in recapitulating the distinct subgroups hidden in the pool of cell lines with regard to their coping mechanisms to different drugs. CSMR represents a big data analysis tool with the potential to resolve the complexity of translating the clinical representations of the disease to the real causes underpinning it. We believe that it will bring new understanding to the molecular basis of a disease and could be of special relevance in the growing field of personalized medicine.
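The general idea of a classification-EM loop around per-component sparse regressions can be illustrated with a minimal sketch. This is not the authors' CSMR implementation: the function names (lasso_cd, cem_sparse_mixture), the fixed penalty lam, the Gaussian error model, the random initialization, and the min_size guard are all assumptions made for illustration. Each iteration refits one lasso regression per cluster, then hard-assigns every sample to the component under which its response is most likely.

```python
import numpy as np

def lasso_cd(X, y, lam, n_sweeps=100):
    """Lasso by coordinate descent: minimize (1/2n)||y - b0 - X b||^2 + lam * ||b||_1."""
    n, p = X.shape
    beta = np.zeros(p)
    b0 = y.mean()
    resid = y - b0                      # residual for the current (b0, beta)
    col_sq = (X ** 2).sum(axis=0) / n
    for _ in range(n_sweeps):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            resid += X[:, j] * beta[j]  # take feature j out of the fit
            rho = X[:, j] @ resid / n
            # soft-thresholding gives the exact one-dimensional lasso update
            beta[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
            resid -= X[:, j] * beta[j]  # put it back with the new coefficient
        shift = resid.mean()            # unpenalized intercept update
        b0 += shift
        resid -= shift
    return b0, beta

def cem_sparse_mixture(X, y, k=2, lam=0.1, n_iter=30, min_size=5, seed=0):
    """Hard-assignment (classification) EM for a k-component mixture of lasso regressions.

    Assumes n >= k * min_size so that at least one component is always fitted.
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    labels = rng.integers(k, size=n)    # random initial partition (an assumption)
    intercepts, coefs = np.zeros(k), np.zeros((k, p))
    for _ in range(n_iter):
        score = np.full((n, k), -np.inf)
        for c in range(k):
            members = labels == c
            if members.sum() < min_size:
                continue                # near-empty component: nothing is assigned to it
            # M-step: sparse regression fitted on the current members only
            intercepts[c], coefs[c] = lasso_cd(X[members], y[members], lam)
            resid = y - intercepts[c] - X @ coefs[c]
            sigma = max(resid[members].std(), 1e-3)
            # log(mixing proportion) + Gaussian log-density, up to a constant
            score[:, c] = (np.log(members.mean()) - np.log(sigma)
                           - 0.5 * (resid / sigma) ** 2)
        # C-step: assign each sample to its most likely component
        new_labels = score.argmax(axis=1)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return labels, intercepts, coefs

# Toy data: two hidden subgroups whose response depends on different features.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 10))
truth = rng.integers(2, size=200)
y = np.where(truth == 0, 3 * X[:, 0], -3 * X[:, 1]) + 0.1 * rng.normal(size=200)
labels, intercepts, coefs = cem_sparse_mixture(X, y, k=2)
print(np.bincount(labels, minlength=2))   # cluster sizes
print(np.round(coefs, 2))                 # per-component sparse coefficients
```

Like any EM-type procedure started from a random partition, this sketch can stop at a local optimum, so in practice one would likely try several seeds and choose the penalty by a criterion such as cross-validation or BIC.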

Entities: Chemical

Keywords: disease heterogeneity; mixture modeling; supervised learning

Year: 2021 PMID: 34293851 PMCID: PMC8294591 DOI: 10.1093/bib/bbaa291

Source DB: PubMed Journal: Brief Bioinform ISSN: 1467-5463 Impact factor: 11.622

2 in total

1. Response to 'Letter to the Editor: on the stability and internal consistency of component-wise sparse mixture regression based clustering', Zhang et al.

Authors: Wennan Chang; Chi Zhang; Sha Cao
Journal: Brief Bioinform Date: 2022-07-18 Impact factor: 13.994

2. Letter to the Editor: on the stability and internal consistency of component-wise sparse mixture regression-based clustering.

Authors: Bo Zhang; Jianghua He; Jinxiang Hu; Devin C Koestler; Prabhakar Chalise
Journal: Brief Bioinform Date: 2022-01-17 Impact factor: 13.994