Zhiguang Huo, Ying Ding, Silvia Liu, Steffi Oesterreich, George Tseng.
Abstract
Disease phenotyping by omics data has become a popular approach that can potentially lead to better personalized treatment. Identifying disease subtypes via unsupervised machine learning is the first step towards this goal. In this paper, we extend a sparse K-means method to a meta-analytic framework to identify novel disease subtypes when expression profiles of multiple cohorts are available. The lasso regularization and meta-analysis identify a unique set of gene features for subtype characterization. An additional pattern matching reward function guarantees consistent subtype signatures across studies. The method was evaluated on simulations and on leukemia and breast cancer data sets. The identified disease subtypes from meta-analysis were characterized with improved accuracy and stability compared to single study analysis. The breast cancer model was applied to an independent METABRIC dataset and yielded an improved survival difference between subtypes. These results provide a basis for diagnosis and development of targeted treatments for disease subgroups.
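To illustrate the lasso-regularized feature selection that the abstract builds on, here is a minimal sketch of the feature-weight update in the standard single-study sparse K-means formulation. It is not the authors' meta-analytic method: the cross-study aggregation and the pattern matching reward are not modeled. The function names, the per-feature between-cluster sum of squares (BCSS) criterion, and the L1 bound `s` are assumptions made for this illustration.

```python
import numpy as np

def bcss_per_feature(X, labels):
    """Between-cluster sum of squares for each feature (column) of X."""
    total = ((X - X.mean(axis=0)) ** 2).sum(axis=0)
    within = np.zeros(X.shape[1])
    for k in np.unique(labels):
        Xk = X[labels == k]
        within += ((Xk - Xk.mean(axis=0)) ** 2).sum(axis=0)
    return np.maximum(total - within, 0.0)  # clip tiny negative rounding errors

def soft_threshold(a, delta):
    """Lasso soft-thresholding operator."""
    return np.sign(a) * np.maximum(np.abs(a) - delta, 0.0)

def update_weights(bcss, s, n_iter=50):
    """Feature weights with unit L2 norm and L1 norm at most s (assumes s >= 1).

    The threshold delta is found by bisection; a larger delta zeroes out
    more features and lowers the L1 norm of the normalized weights.
    """
    def normalized(delta):
        w = soft_threshold(bcss, delta)
        norm = np.linalg.norm(w)
        return w / norm if norm > 0 else w

    w = normalized(0.0)
    if w.sum() <= s:  # constraint already satisfied without thresholding
        return w
    lo, hi = 0.0, float(bcss.max())
    for _ in range(n_iter):
        mid = (lo + hi) / 2.0
        if normalized(mid).sum() > s:
            lo = mid
        else:
            hi = mid
    return normalized(hi)
```

In a full sparse K-means loop, this update would alternate with a weighted K-means clustering step until the weights stabilize. Features whose weight is driven to zero are dropped from the subtype signature.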
Keywords: Disease subtype discovery; K-means; Lasso; Meta-analysis; Unsupervised machine learning
Year: 2016 PMID: 27330233 PMCID: PMC4908837 DOI: 10.1080/01621459.2015.1086354
Source DB: PubMed Journal: J Am Stat Assoc ISSN: 0162-1459 Impact factor: 5.033