Neo Christopher Chung, Howard Choi, Ding Wang, Bilal Mirza, Alexander R Pelletier, Dibakar Sigdel, Wei Wang, Peipei Ping.
Abstract
OBJECTIVE: During cardiovascular disease progression, molecular systems of the myocardium (e.g., the proteome) undergo diverse and distinct changes. Dynamic, temporally regulated alterations of individual molecules underlie the collective response of the heart to pathological drivers and the ultimate development of pathogenesis. Advances in high-throughput omics technologies have enabled cost-effective temporal profiling of targeted systems in animal models of human diseases. However, computational analysis of temporal patterns from omics data remains challenging. In particular, bioinformatic pipelines involving unsupervised statistical approaches to support cardiovascular investigations are lacking, which hinders one's ability to extract biomedical insights from these complex datasets. APPROACH AND ...
Keywords: Data science; Oxidative post-translational modification; Proteomics; Temporal molecular signatures; Time-course; Unsupervised clustering
Year: 2020; PMID: 32504647; PMCID: PMC7583079; DOI: 10.1016/j.yjmcc.2020.05.020
Source DB: PubMed; Journal: J Mol Cell Cardiol; ISSN: 0022-2828; Impact factor: 5.000
Fig. 1. Schematic Overview of CV.Signature.TCP. A computational platform, CV.Signature.TCP, was developed to discover temporal patterns of biological molecules related to the progression of diseases (e.g., cardiac hypertrophy). The temporal omics dataset is processed by three modules: (I) Preprocessing, (II) Clustering, and (III) Evaluation. Module I performs missing-data imputation and denoising simultaneously via cubic splines. Alternatively, principal component analysis (PCA) and singular value decomposition (SVD) are used. Module II identifies major temporal patterns using K-means with Euclidean distance (ED) and hierarchical clustering with dynamic time warping (DTW). Module III evaluates the significance of molecular variables (e.g., protein O-PTMs) in their clusters using the jackstraw test to obtain p-values and posterior inclusion probabilities (PIPs).
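To illustrate why Module II pairs Euclidean distance with dynamic time warping, the following is a minimal sketch in plain Python, not the platform's own code. The function names (`euclidean`, `dtw`) and the toy time courses (`early`, `late`, `flat`) are assumptions made for this example; the DTW shown is the textbook recurrence with an absolute-difference local cost and no warping-window constraint, which may differ from the variant used in CV.Signature.TCP.

```python
import math

def euclidean(a, b):
    """Pointwise Euclidean distance between two equal-length time courses."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def dtw(a, b):
    """Textbook dynamic time warping cost (absolute-difference local cost)."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of: step in a only, step in b only, step in both
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

# Hypothetical toy profiles: the same transient peak, early versus late,
# plus a flat profile with no response at all.
early = [0, 1, 2, 1, 0, 0, 0]
late = [0, 0, 0, 1, 2, 1, 0]
flat = [0, 0, 0, 0, 0, 0, 0]

print("ED  early vs late:", euclidean(early, late))
print("DTW early vs late:", dtw(early, late))
print("DTW early vs flat:", dtw(early, flat))
```

In this toy case Euclidean distance treats the two peaks as dissimilar because they occur at different time points, whereas DTW aligns them at zero cost and still separates both from the flat profile. That is the sense in which the two distances can surface different kinds of temporal pattern.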
Fig. 2. Analysis of Cysteine O-PTMs using CV.Signature.TCP. (A) A scree plot was used to determine a range for the possible number of clusters (K = 4–6). By comparing the clustering results across these K values, the optimal number of clusters was determined to be K = 5, which sufficiently captures the dynamics of cysteine O-PTMs during cardiac remodeling. (B) The CV.Signature.TCP platform was employed to extract 5 unique temporal patterns across 1605 cysteine O-PTMs. A heatmap was used to visualize the temporal changes in O-PTM occupancy for individual O-PTMs. (C) Cysteine O-PTMs in mice vary over time in response to cardiac remodeling. We applied cubic splines with cross-validation to impute and denoise 1605 cysteine O-PTMs. K-means clustering identified 5 clusters (top row). The jackstraw test for cluster membership was then applied, and 1426 O-PTMs with PIP > 80% were selected (bottom row). (D) Protein O-PTMs of temporal significance were further annotated by their temporal patterns (shown as five clusters) and their biological functions (BFs, shown as 10 essential pathways). Each circle represents a cluster of O-PTMs sharing both the temporal pattern and the BF attribute. The number of O-PTM occurrences (circle radius), the false discovery rate (*, FDR < 0.05), and the number of proteins (n) are labelled for each O-PTM cluster. BF1, neutrophil degranulation; BF2, response to elevated platelet cytosolic Ca²⁺; BF3, extracellular matrix organization; BF4, protein translation; BF5, post-translational protein phosphorylation; BF6, glucose metabolism; BF7, pyruvate metabolism and citric acid (TCA) cycle; BF8, respiratory electron transport; BF9, branched-chain amino acid (BCAA) catabolism; BF10, fatty acid metabolism.
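The choice of K in panel (A) can be illustrated with a small sketch. The caption does not say which quantity the scree plot tracks, so this example assumes one common variant: the within-cluster sum of squares (WCSS) of K-means with Euclidean distance, computed over a range of K. Everything here is hypothetical and for illustration only, including the helper names (`sq_dist`, `kmeans`), the deterministic farthest-point initialization, and the nine toy profiles; it is not the platform's implementation.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two time courses."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def assign(points, centers):
    """Index of the nearest center for every point."""
    return [min(range(len(centers)), key=lambda j: sq_dist(p, centers[j]))
            for p in points]

def kmeans(points, k, iters=50):
    """Lloyd's algorithm; returns (labels, within-cluster sum of squares)."""
    # Assumption: farthest-point initialization, chosen so the toy run is
    # reproducible without a random seed.
    centers = [list(points[0])]
    while len(centers) < k:
        far = max(points, key=lambda p: min(sq_dist(p, c) for c in centers))
        centers.append(list(far))
    for _ in range(iters):
        labels = assign(points, centers)
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centers.append(centers[j])  # keep the center of an empty cluster
        if new_centers == centers:
            break
        centers = new_centers
    labels = assign(points, centers)
    wcss = sum(sq_dist(p, centers[lab]) for p, lab in zip(points, labels))
    return labels, wcss

# Hypothetical toy data: three temporal shapes, three noisy copies of each.
shapes = [[0, 1, 2, 3], [3, 2, 1, 0], [0, 3, 3, 0]]  # rising, falling, transient
profiles = [[v + off for v in s] for s in shapes for off in (0.0, 0.1, -0.1)]

wcss_by_k = {k: kmeans(profiles, k)[1] for k in range(1, 5)}
labels3, _ = kmeans(profiles, 3)

for k in sorted(wcss_by_k):
    print(k, round(wcss_by_k[k], 3))
```

On these toy profiles the WCSS drops sharply up to K = 3 and then levels off, which is the "elbow" such a plot is read for. On real data the elbow is usually less clear-cut, which is consistent with the caption's approach of narrowing K to a range and then comparing the clustering results directly.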