Mohammad Nazmol Hasan, Masuma Binte Malek, Anjuman Ara Begum, Moizur Rahman, Md Nurul Haque Mollah.
Abstract
Background and objectives: Assessing drug toxicity and identifying the associated biomarker genes is one of the most important tasks in the pre-clinical phase of the drug development pipeline, as well as in toxicogenomic studies. A few statistical methods exist for assessing the toxicity of doses of drugs (DDs) and their associated biomarker genes. However, these methods are computationally expensive because they estimate the model parameters with iterative approaches based on the EM (expectation-maximization) algorithm. To overcome this problem, this paper proposes an alternative approach based on hierarchical clustering (HC) for the same purpose.
Methods and materials: There are several types of HC approaches, and their performance depends on the similarity/distance measure used. We therefore explored suitable combinations of distance measures and HC methods on the Japanese Toxicogenomics Project (TGP) datasets, both to achieve better clustering/co-clustering of DDs and genes and to detect toxic DDs and their associated biomarker genes.
Keywords: biomarker gene; doses of drugs; error rate; fold change gene expression; hierarchical clustering; toxicity
Year: 2019 PMID: 31398888 PMCID: PMC6723056 DOI: 10.3390/medicina55080451
Source DB: PubMed Journal: Medicina (Kaunas) ISSN: 1010-660X Impact factor: 2.430
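Important distance measures used in hierarchical clustering, given in their standard forms for two n-dimensional vectors x and y (p is the order of the Minkowski distance).
| Distance Measure | Mathematical Form |
|---|---|
| Euclidean | $d(x, y) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2}$ |
| Minkowski | $d(x, y) = \left( \sum_{i=1}^{n} \lvert x_i - y_i \rvert^{p} \right)^{1/p}$ |
| Manhattan | $d(x, y) = \sum_{i=1}^{n} \lvert x_i - y_i \rvert$ |
| Canberra | $d(x, y) = \sum_{i=1}^{n} \frac{\lvert x_i - y_i \rvert}{\lvert x_i \rvert + \lvert y_i \rvert}$ |
| Maximum | $d(x, y) = \max_{1 \le i \le n} \lvert x_i - y_i \rvert$ |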
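
As a minimal sketch (not the authors' code), the five distance measures named in the table above can be computed with SciPy. The vectors `x` and `y` and the Minkowski order `p=3` are made-up illustration values; SciPy calls the Manhattan and maximum distances `cityblock` and `chebyshev`.

```python
import numpy as np
from scipy.spatial import distance

# Hypothetical expression profiles for two doses of drugs (illustrative values only).
x = np.array([1.0, 2.0, 3.0])
y = np.array([2.0, 4.0, 7.0])

distances = {
    "euclidean": distance.euclidean(x, y),
    "minkowski": distance.minkowski(x, y, p=3),  # p=3 is an arbitrary choice
    "manhattan": distance.cityblock(x, y),       # SciPy name for Manhattan
    "canberra": distance.canberra(x, y),
    "maximum": distance.chebyshev(x, y),         # SciPy name for maximum
}

for name, value in distances.items():
    print(f"{name}: {value:.4f}")
```

With `p=2` the Minkowski distance reduces to the Euclidean distance, and with `p=1` to the Manhattan distance.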
Error rate (ER, in percent) for 35 combinations of distance measures and HC methods, calculated from the glutathione metabolism pathway (GMP) and PPAR signaling pathway (PPAR-SP) datasets.
| Sl | Combination of Distance and HC Clustering Methods | Drug Clustering ER for GMP Data | Drug Clustering ER for PPAR-SP Data | DDs Clustering ER for GMP Data | DDs Clustering ER for PPAR-SP Data |
|---|---|---|---|---|---|
| 2 | euclidean:single | 10 | 40 | 16.66666667 | 36.66666667 |
| 3 | euclidean:complete | 10 | 30 | 26.66666667 | 20 |
| 4 | euclidean:average | 10 | 40 | 26.66666667 | 20 |
| 5 | euclidean:mcquitty | 40 | 40 | 26.66666667 | 13.33333333 |
| 6 | euclidean:median | 40 | 40 | 3.333333333 | 26.66666667 |
| 7 | euclidean:centroid | 40 | 40 | 16.66666667 | 30 |
| 8 | maximum:ward | 10 | 0 | 16.66666667 | 10 |
| 9 | maximum:single | 10 | 40 | 16.66666667 | 36.66666667 |
| 10 | maximum:complete | 20 | 0 | 16.66666667 | 26.66666667 |
| 11 | maximum:average | 10 | 40 | 26.66666667 | 36.66666667 |
| 12 | maximum:mcquitty | 10 | 40 | 26.66666667 | 36.66666667 |
| 13 | maximum:median | 40 | 40 | 26.66666667 | 36.66666667 |
| 14 | maximum:centroid | 40 | 40 | 16.66666667 | 30 |
| 16 | manhattan:single | 40 | 40 | 16.66666667 | 36.66666667 |
| 17 | manhattan:complete | 10 | 30 | 3.333333333 | 20 |
| 18 | manhattan:average | 10 | 40 | 26.66666667 | 20 |
| 19 | manhattan:mcquitty | 10 | 40 | 3.333333333 | 20 |
| 20 | manhattan:median | 40 | 40 | 26.66666667 | 36.66666667 |
| 21 | manhattan:centroid | 40 | 40 | 16.66666667 | 30 |
| 22 | canberra:ward | 50 | 10 | 30 | 20 |
| 23 | canberra:single | 50 | 10 | 23.33333333 | 36.66666667 |
| 24 | canberra:complete | 50 | 10 | 30 | 20 |
| 25 | canberra:average | 50 | 10 | 30 | 23.33333333 |
| 26 | canberra:mcquitty | 50 | 40 | 30 | 23.33333333 |
| 27 | canberra:median | 50 | 40 | 40 | 36.66666667 |
| 28 | canberra:centroid | 50 | 40 | 33.33333333 | 36.66666667 |
| 30 | minkowski:single | 10 | 40 | 16.66666667 | 36.66666667 |
| 31 | minkowski:complete | 10 | 30 | 26.66666667 | 20 |
| 32 | minkowski:average | 10 | 40 | 26.66666667 | 20 |
| 33 | minkowski:mcquitty | 40 | 40 | 26.66666667 | 13.33333333 |
| 34 | minkowski:median | 40 | 40 | 3.333333333 | 26.66666667 |
| 35 | minkowski:centroid | 40 | 40 | 16.66666667 | 30 |
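
The comparison summarized in the table above can be sketched as follows, under stated assumptions. This is not the authors' implementation: the function names, the toy data, and the definition of the error rate (percentage of items misassigned under the best cluster-to-class matching) are assumptions made for illustration. The table's method names are mapped to SciPy's names (`mcquitty` corresponds to SciPy's `weighted` linkage).

```python
from itertools import permutations, product

import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import pdist

# Names as used in the table, mapped to SciPy's names for the same measures/linkages.
DISTANCES = {"euclidean": "euclidean", "maximum": "chebyshev",
             "manhattan": "cityblock", "canberra": "canberra",
             "minkowski": "minkowski"}  # SciPy's default Minkowski order is p=2
METHODS = {"ward": "ward", "single": "single", "complete": "complete",
           "average": "average", "mcquitty": "weighted",
           "median": "median", "centroid": "centroid"}


def error_rate(true_labels, cluster_ids):
    """Percent misassigned under the best cluster-to-class matching (assumed ER definition)."""
    true_labels = np.asarray(true_labels)
    cluster_ids = np.asarray(cluster_ids)
    classes = np.unique(true_labels)
    clusters = np.unique(cluster_ids)
    best = 0
    # Try every assignment of class labels to clusters; feasible for a few classes.
    for perm in permutations(classes, len(clusters)):
        mapping = dict(zip(clusters, perm))
        predicted = np.array([mapping[c] for c in cluster_ids])
        best = max(best, int(np.sum(predicted == true_labels)))
    return 100.0 * (1 - best / len(true_labels))


def evaluate_combinations(data, true_labels, n_clusters):
    """Return {"distance:method": ER} for all 5 x 7 = 35 combinations."""
    results = {}
    for (d_name, metric), (m_name, method) in product(DISTANCES.items(), METHODS.items()):
        condensed = pdist(data, metric=metric)        # pairwise distances between rows
        tree = linkage(condensed, method=method)      # hierarchical clustering
        ids = fcluster(tree, t=n_clusters, criterion="maxclust")  # cut into clusters
        results[f"{d_name}:{m_name}"] = error_rate(true_labels, ids)
    return results


# Toy data: two well-separated groups of three items each (made-up values).
toy = np.array([[0.1, 0.2], [0.2, 0.1], [0.15, 0.15],
                [5.0, 5.1], [5.2, 4.9], [5.1, 5.0]])
toy_labels = [0, 0, 0, 1, 1, 1]
results = evaluate_combinations(toy, toy_labels, n_clusters=2)
print(len(results), results["euclidean:ward"])
```

Two caveats apply. SciPy documents the `ward`, `centroid`, and `median` linkages as correctly defined only for Euclidean distances, so the other pairings run but should be interpreted with care. Also, with the default order `p=2` the Minkowski distance coincides with the Euclidean distance, which would be consistent with the identical euclidean and minkowski rows in the table.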
Figure 1. Doses of drugs (DDs) clustering of the GMP and PPAR-SP datasets based on the Euclidean distance in combination with the ward HC method. (A) DDs clustering of the GMP dataset at the 24 h time point. (B) DDs clustering of the GMP dataset at multiple (3 h, 6 h, 9 h, and 24 h) time points. (C) DDs clustering of the PPAR-SP dataset at the 24 h time point. (D) DDs clustering of the PPAR-SP dataset at multiple (3 h, 6 h, 9 h, and 24 h) time points.
Ranked co-cluster means of doses of drugs and genes for the glutathione metabolism pathway datasets, obtained with the Euclidean: ward combination of distance and hierarchical clustering methods.
| Co-cluster | Mean |
|---|---|
| Gene-Cluster-3(1): Compound-Cluster-3(1) | 2.5550390 |
| Gene-Cluster-2(2): Compound-Cluster-2(2) | 1.6619841 |
| Gene-Cluster-3(1): Compound-Cluster-2(2) | 0.8249199 |
| Gene-Cluster-3(1): Compound-Cluster-1(3) | 0.8129127 |
| Gene-Cluster-2(2): Compound-Cluster-3(1) | 0.5994644 |
| Gene-Cluster-1(3): Compound-Cluster-3(1) | 0.5991663 |
| Gene-Cluster-1(3): Compound-Cluster-2(2) | 0.4653372 |
| Gene-Cluster-2(2): Compound-Cluster-1(3) | 0.3402437 |
| Gene-Cluster-1(3): Compound-Cluster-1(3) | 0.2481545 |

| Co-cluster | Mean |
|---|---|
| Gene-Cluster-3(1): Compound-Cluster-2(1) | 1.2954907 |
| Gene-Cluster-1(2): Compound-Cluster-1(2) | 0.6118177 |
| Gene-Cluster-2(3): Compound-Cluster-1(2) | 0.5850958 |
| Gene-Cluster-3(1): Compound-Cluster-1(2) | 0.5157947 |
| Gene-Cluster-3(1): Compound-Cluster-3(3) | 0.3513179 |
| Gene-Cluster-1(2): Compound-Cluster-2(1) | 0.3360666 |
| Gene-Cluster-2(3): Compound-Cluster-2(1) | 0.3285539 |
| Gene-Cluster-1(2): Compound-Cluster-3(3) | 0.2478899 |
| Gene-Cluster-2(3): Compound-Cluster-3(3) | 0.2424664 |
Ranked co-cluster means of doses of drugs and genes for the PPAR signaling pathway datasets, obtained with the Euclidean: ward combination of distance and hierarchical clustering methods.
| Co-cluster | Mean |
|---|---|
| Gene-Cluster-1(1): Compound-Cluster-1(1) | 1.5972416 |
| Gene-Cluster-3(2): Compound-Cluster-2(2) | 0.6596625 |
| Gene-Cluster-3(2): Compound-Cluster-1(1) | 0.6522308 |
| Gene-Cluster-1(1): Compound-Cluster-2(2) | 0.4973316 |
| Gene-Cluster-2(3): Compound-Cluster-1(1) | 0.3994878 |
| Gene-Cluster-2(3): Compound-Cluster-2(2) | 0.2378871 |

| Co-cluster | Mean |
|---|---|
| Gene-Cluster-3(1): Compound-Cluster-2(1) | 1.5863836 |
| Gene-Cluster-1(2): Compound-Cluster-2(1) | 0.5842037 |
| Gene-Cluster-1(2): Compound-Cluster-1(2) | 0.4385611 |
| Gene-Cluster-3(1): Compound-Cluster-1(2) | 0.4025768 |
| Gene-Cluster-2(3): Compound-Cluster-2(1) | 0.2569643 |
| Gene-Cluster-2(3): Compound-Cluster-1(2) | 0.1757952 |
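
A ranked list of co-cluster means like the ones above could be produced along the following lines. This is only a sketch under assumptions: the function names are hypothetical, the toy fold-change matrix is made up, and taking the mean of absolute values within each block is an assumption (the source only says "co-clustering mean").

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage


def cluster_rows(matrix, n_clusters):
    """Cluster the rows with Euclidean distance and Ward linkage; ids run from 1."""
    tree = linkage(matrix, method="ward", metric="euclidean")
    return fcluster(tree, t=n_clusters, criterion="maxclust")


def ranked_cocluster_means(fold_change, n_gene_clusters, n_dd_clusters):
    """Mean absolute fold change of every (gene cluster, DD cluster) block, largest first."""
    gene_ids = cluster_rows(fold_change, n_gene_clusters)   # rows are genes
    dd_ids = cluster_rows(fold_change.T, n_dd_clusters)     # columns are doses of drugs
    means = []
    for g in np.unique(gene_ids):
        for d in np.unique(dd_ids):
            block = fold_change[np.ix_(gene_ids == g, dd_ids == d)]
            means.append((int(g), int(d), float(np.abs(block).mean())))
    return sorted(means, key=lambda item: item[2], reverse=True)


# Toy matrix (made-up values): genes 0-1 respond strongly to DDs 0-1 only.
toy_fc = np.array([[3.0, 3.2, 0.1, 0.2],
                   [2.9, 3.1, 0.2, 0.1],
                   [0.1, 0.2, 0.1, 0.1],
                   [0.2, 0.1, 0.2, 0.1]])
ranked = ranked_cocluster_means(toy_fc, n_gene_clusters=2, n_dd_clusters=2)
for gene_cluster, dd_cluster, mean in ranked:
    print(f"Gene-Cluster-{gene_cluster}: Compound-Cluster-{dd_cluster}  {mean:.4f}")
```

The top-ranked block pairs the gene cluster and the DD cluster with the largest average response, which is how the leading rows of the tables above can be read.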
Figure 2. Structural view of the co-clusters retrieved from the GMP and PPAR-SP datasets by our proposed HC-based co-clustering algorithm. (A) GMP dataset for the 24 h time point. (B) GMP dataset for multiple time points. (C) PPAR-SP dataset for the 24 h time point. (D) PPAR-SP dataset for multiple time points.
Biomarker co-clusters, consisting of biomarker genes and their regulatory doses of drugs, identified by the Euclidean: ward combination of distance and hierarchical clustering methods for the glutathione metabolism pathway datasets.
| Biomarker Genes | Regulatory Doses of Drugs |
|---|---|
Biomarker co-clusters, consisting of biomarker genes and their regulatory doses of drugs, identified by the Euclidean: ward combination of distance and hierarchical clustering methods for the PPAR signaling pathway datasets.
| Biomarker Genes | Regulatory Doses of Drugs |
|---|---|
KEGG pathway functional annotation of the biomarker genes in co-cluster-1 identified by the Euclidean: ward combination of distance and HC methods. Dataset: glutathione metabolism pathway at the 24 h time point.
| Term | Count | % | | FDR | Genes |
|---|---|---|---|---|---|
| rno00480: Glutathione metabolism | 2 | 66.66 | 7.48E−3 | 2.04E−38 | RGD1562107, Gpx6 |
KEGG pathway functional annotation of the biomarker genes in co-cluster-2 identified by the Euclidean: ward combination of distance and HC methods. Dataset: glutathione metabolism pathway at the 24 h time point.
| Term | Count | % | | Genes |
|---|---|---|---|---|
| rno00480: Glutathione metabolism | 10 | 100 | 3.85E−20 | Mgst2, Gpx2, G6pd, Gclm, Gsr, Gsta5, Gclc, Gclc, Gstp1, Gstm3, Gstm4 |
| rno00980: Metabolism of xenobiotics by cytochrome P450 | 5 | 50.0 | 7.43E−7 | Mgst2, Gsta5, Gstp1, Gstm3, Gstm4 |
| rno00982: Drug metabolism—cytochrome P450 | 5 | 50.0 | 7.87E−7 | Mgst2, Gsta5, Gstp1, Gstm3, Gstm4 |
| rno05204: Chemical carcinogenesis | 5 | 50.0 | 2.14E−6 | Mgst2, Gsta5, Gstp1, Gstm3, Gstm4 |
| rno04918: Thyroid hormone synthesis | 2 | 20.0 | 0.076 | Gpx2, Gsr |
KEGG pathway functional annotation of the biomarker genes in co-cluster-1 identified by the Euclidean: ward combination of distance and HC methods. Dataset: PPAR signaling pathway at the 24 h time point.
| Term | Count | % | | Genes |
|---|---|---|---|---|
| rno03320: PPAR signaling pathway | 13 | 76.47 | 4.88E−24 | Cpt1b, Aqp7, Cpt1c, Cpt1a, Cyp4a3, Cpt1a, Cpt2, Cyp8b1, Fabp3, Ehhadh, Acaa1a, Cyp4a1, Angptl4, Fabp5 |
| rno00071: Fatty acid degradation | 8 | 47.06 | 3.16E−13 | Cpt1b, Cpt2, Ehhadh, Acaa1a, Cpt1c, Cpt1a, Cyp4a3, Cpt1a, Cyp4a1 |
| rno01212: Fatty acid metabolism | 6 | 35.29 | 1.67E−8 | Cpt1b, Cpt2, Ehhadh, Acaa1a, Cpt1c, Cpt1a, Cpt1a |
| rno04920: Adipocytokine signaling pathway | 3 | 17.65 | 0.0067 | Cpt1b, Cpt1c, Cpt1a, Cpt1a |
| rno04922: Glucagon signaling pathway | 3 | 17.65 | 0.0117 | Cpt1b, Cpt1c, Cpt1a, Cpt1a |
| rno04152: AMPK signaling pathway | 3 | 17.65 | 0.0187 | Cpt1b, Cpt1c, Cpt1a, Cpt1a |
| rno01100: Metabolic pathways | 6 | 35.29 | 0.0500 | Me1, Me1, Cyp8b1, Ehhadh, Acaa1a, Cyp4a3, Cyp4a1 |
| rno00280: Valine, leucine and isoleucine degradation | 2 | 11.76 | 0.0885 | Ehhadh, Acaa1a |
KEGG pathway functional annotation of the biomarker genes in co-cluster-1 identified by the Euclidean: ward combination of distance and HC methods. Dataset: PPAR signaling pathway at the 3 h, 6 h, 9 h, and 24 h time points.
| Term | Count | % | | Genes |
|---|---|---|---|---|
| rno03320: PPAR signaling pathway | 7 | 63.63 | 5.49E−12 | Cpt1b, Cpt2, Ehhadh, Acaa1a, Cyp4a3, Angptl4, Cyp4a1 |
| rno00071: Fatty acid degradation | 6 | 54.54 | 1.37E−10 | Cpt1b, Cpt2, Ehhadh, Acaa1a, Cyp4a3, Cyp4a1 |
| rno01212: Fatty acid metabolism | 4 | 36.36 | 1.09E−5 | Cpt1b, Cpt2, Ehhadh, Acaa1a |
| rno01100: Metabolic pathways | 5 | 45.45 | 0.0172 | Me1, Ehhadh, Acaa1a, Cyp4a3, Cyp4a1 |
| rno00280: Valine, leucine and isoleucine degradation | 2 | 18.18 | 0.0486 | Ehhadh, Acaa1a |
| rno00590: Arachidonic acid metabolism | 2 | 18.18 | 0.0709 | Cyp4a3, Cyp4a1 |
| rno00830: Retinol metabolism | 2 | 18.18 | 0.0726 | Cyp4a3, Cyp4a1 |
| rno04146: Peroxisome | 2 | 18.18 | 0.0743 | Ehhadh, Acaa1a |
| rno04750: Inflammatory mediator regulation of TRP channels | 2 | 18.18 | 0.0994 | Cyp4a3, Cyp4a1 |
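
The "%" column in the table above appears to be the gene count divided by the size of the annotated gene list. The short check below illustrates this reading; the list size of 11 is inferred from the reported values and is not stated in the source, and the helper name `percent_of_list` is hypothetical.

```python
# (count, reported percent) pairs copied from the first five rows of the table above.
rows = [(7, 63.63), (6, 54.54), (4, 36.36), (5, 45.45), (2, 18.18)]
LIST_SIZE = 11  # inferred from the reported percentages, not stated in the source


def percent_of_list(count, list_size):
    """Share of the gene list that falls in a pathway, in percent."""
    return 100.0 * count / list_size


differences = []
for count, reported in rows:
    computed = percent_of_list(count, LIST_SIZE)
    differences.append(abs(computed - reported))
    print(f"count={count}: computed {computed:.2f}%, reported {reported}%")
```

The computed and reported values agree to within 0.01 percentage points, with the small discrepancies explained by truncation rather than rounding in some reported entries.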