Yingzhou Lu1, Chiung-Ting Wu1, Sarah J Parker2, Zuolin Cheng1, Georgia Saylor3, Jennifer E Van Eyk2, Guoqiang Yu1, Robert Clarke4, David M Herrington3, Yue Wang1.
Abstract
Motivation: Ideally, a molecularly distinct subtype would be composed of molecular features that are expressed uniquely in the subtype of interest but in no others, so-called marker genes (MGs). MGs play a critical role in the characterization, classification or deconvolution of tissue or cell subtypes. We and others have recognized that the test statistics used by most methods do not exactly satisfy the MG definition and often identify inaccurate MGs.
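The MG definition above can be made concrete with a small sketch. As an illustrative assumption (not necessarily the statistic that COT or any peer method uses), each gene is scored by the cosine similarity between its vector of subtype-mean expression values and the closest ideal "one-hot" pattern, in which expression is non-zero in exactly one subtype. The function name `marker_scores` and the toy values are hypothetical, and expression is assumed to be non-negative.

```python
import numpy as np

def marker_scores(subtype_means):
    """Score genes against the ideal marker pattern (illustrative sketch).

    subtype_means: array of shape (genes, subtypes) holding non-negative
    mean expression per subtype. Returns, per gene, the index of the
    best-matching subtype and the cosine similarity to that subtype's
    one-hot pattern (1.0 means expressed in that subtype only).
    """
    x = np.asarray(subtype_means, dtype=float)
    norms = np.linalg.norm(x, axis=1)
    # Avoid division by zero for genes with all-zero expression.
    safe = np.where(norms > 0, norms, 1.0)
    # Cosine similarity with the k-th one-hot vector is x[k] / ||x||.
    cos = x / safe[:, None]
    return cos.argmax(axis=1), cos.max(axis=1)

# Toy data (hypothetical): three genes measured across three subtypes.
means = np.array([
    [10.0, 0.0, 0.0],  # ideal marker of subtype 0
    [5.0, 5.0, 5.0],   # equally expressed everywhere, not a marker
    [0.5, 8.0, 0.5],   # near-ideal marker of subtype 1
])
subtype, score = marker_scores(means)
```

Under this scoring, a gene expressed equally in all K subtypes gets a score of 1/sqrt(K), so scores close to 1 indicate patterns close to the ideal MG definition.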
Year: 2022 PMID: 35673616 PMCID: PMC9163574 DOI: 10.1093/bioadv/vbac037
Source DB: PubMed Journal: Bioinform Adv ISSN: 2635-0041
Fig. 1. Overall COT workflow and comparative evaluation. (A) Major steps and functions in the COT software tool. (B–E) pROC curves and pAUC values associated with COT and peer methods in the standard (K = 3, n = 3 × 3), more-subtype (K = 5, n = 3 × 5), more-sample (K = 3, n = 4 × 3) and complex-null (mixture of rotated and non-uniform Dirichlet distributions) experimental settings, respectively (Supplementary information).
Fig. 2. Verification of MGs detected by COT on a benchmark dataset (GSE28490). (A) The null distribution (histogram) superimposed with the FNM approximation using an FDR-embedded expectation-maximization algorithm (5 Gaussians, 4–6 iterations). (B) Empirical distribution of COT P-values and a concordance survey across different P-value thresholds, q-values, COT thresholds and numbers of accepted MGs. (C, D) Simplex plots and heatmaps of MGs (color-coded) detected by COT, OVR-test and a priori MG subset (column: protein, row: sample).
Fig. 3. Case study of detecting de novo protein MGs using proteomics data acquired from vascular specimens of three tissue subtypes. (A, B) Top 60 protein markers detected by COT on ‘pure’ vascular specimens (column: sample, row: protein). (C) The associated top enriched pathways in a perspective view. (D) KEGG map of the cholesterol metabolism pathway enriched with COT FP-MG.
Consistency between the top 60 MG detected by COT from purified specimens and by CAM from bulk tissues (51 FP-MG, 2 FS-MG, 7 NL-MG)
| CAM based on bulk | COT based on pure specimen (top 60 MG): MGFP (51) | MGFS (2) | MGNL (7) |
|---|---|---|---|
| Subtype 1 | 42 | 0 | 0 |
| Subtype 2 | 9 | 2 | 1 |
| Subtype 3 | 0 | 0 | 6 |
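
As a quick arithmetic check on the table, the sketch below assumes the three count columns correspond, in order, to the FP, FS and NL groups named in the caption, and confirms that the counts add up to the stated group sizes (51, 2, 7) and to 60 markers overall. The variable names are hypothetical.

```python
import numpy as np

# Rows: CAM subtypes 1-3 (bulk tissue).
# Columns: COT groups FP, FS, NL (pure specimens), assumed from the caption.
counts = np.array([
    [42, 0, 0],
    [9, 2, 1],
    [0, 0, 6],
])

col_totals = counts.sum(axis=0)  # markers per COT group
row_totals = counts.sum(axis=1)  # markers per CAM subtype
total = counts.sum()             # all markers in the comparison
```

The column totals reproduce the group sizes in the header, which supports reading the values as a 3 × 3 cross-tabulation of CAM subtypes against COT groups.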