Phillip D Yates, Nitai D Mukhopadhyay.
Abstract
BACKGROUND: Networks are ubiquitous in modern cell biology and physiology. A large literature exists for inferring/proposing biological pathways/networks using statistical or machine learning algorithms. Despite these advances, a formal testing procedure for analyzing network-level observations is in need of further development. Comparing the behaviour of a pharmacologically altered pathway to its canonical form is an example of a salient one-sample comparison. Locating which pathways differentiate disease from no-disease phenotype may be recast as a two-sample network inference problem.
Year: 2013 PMID: 23496778 PMCID: PMC3621801 DOI: 10.1186/1471-2105-14-94
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
[32] lists a variety of settings where these conditional inference procedures are useful. Some of the items listed that may apply to biological networks are: the distributional models for the responses are nonparametric, the distributional models are not well-specified or may rely on too many nuisance parameters, the asymptotic null sampling distribution is unknown or depends on unknown quantities, or the sample size is less than the number of responses. In addition, these procedures might prove useful for multivariate problems where some variables are categorical (e.g., edge) and others quantitative (e.g., weight), in select multivariate inference problems where the component variables have different degrees of importance (e.g., edge discrepancies may be more severe than weight differences), and when treatment effects impact more than one aspect of the network. Applying the permutation testing principle, as stated in [32], to the two-sample network comparison problem via the customary mechanics serves as the inferential foundation for our two-sample testing strategy.
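The customary mechanics of a two-sample permutation test can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure of the paper: `network_distance` is a hypothetical stand-in for the statistic D, the thresholded correlation estimator is only one possible network estimate, and all function names and default values are introduced here for illustration.

```python
import numpy as np

def thresholded_corr(x, rho):
    """Sample correlation matrix with entries below rho (in magnitude) set to zero."""
    r = np.corrcoef(x, rowvar=False)
    r = np.where(np.abs(r) >= rho, r, 0.0)
    np.fill_diagonal(r, 1.0)
    return r

def network_distance(a, b):
    """Hypothetical statistic: summed absolute weight difference over the upper triangle."""
    iu = np.triu_indices_from(a, k=1)
    return float(np.abs(a[iu] - b[iu]).sum())

def permutation_pvalue(x1, x2, rho=0.5, n_perm=999, seed=0):
    """Permutation p-value for equality of two networks (rows are samples)."""
    rng = np.random.default_rng(seed)
    n1 = x1.shape[0]
    pooled = np.vstack([x1, x2])
    observed = network_distance(thresholded_corr(x1, rho), thresholded_corr(x2, rho))
    exceed = 0
    for _ in range(n_perm):
        # Reassign group labels at random, then re-estimate both networks.
        idx = rng.permutation(pooled.shape[0])
        d = network_distance(
            thresholded_corr(pooled[idx[:n1]], rho),
            thresholded_corr(pooled[idx[n1:]], rho),
        )
        exceed += int(d >= observed)
    # The +1 terms count the observed labelling as one of the permutations.
    return (exceed + 1) / (n_perm + 1)
```

Because the null distribution is built from the pooled data alone, the sketch needs no parametric model for the responses, which is the property the listed settings rely on.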
Figure 1. One-sample tests for an Erdős-Rényi graph. P-value results from 100 independent tests of H₀: G(25,p) = G(25,0.20) versus H₁: G(25,p) > G(25,0.20). The y-axis is the observed resample p-value; the x-axis is the expected p-value under the null hypothesis. Panels (a) and (c), via a uniform distribution qq-plot, illustrate the Type I error rate using 2 settings for c. Panels (b) and (d) illustrate the performance of D under the alternative hypothesis for 2 settings of c; a horizontal line corresponding to an α = 0.05 level is provided.
Figure 2. One-sample comparison for a correlation network under H₁. One hundred resample p-values for a test of H₀: Ω = Ω₀ versus H₁: Ω ≠ Ω₀ under an assumed alternative hypothesis. The y-axis indicates the p-value obtained excluding the use of the neighbouring information in calculating D; the x-axis corresponds to the p-value obtained using the neighbouring information in calculating D.
Figure 3. Two-sample comparison for partial correlation networks under H₁. One hundred resample p-values for a test of H₀: Π₁ = Π₂ versus H₁: Π₁ ≠ Π₂ under an assumed alternative hypothesis. The y-axis indicates the p-value obtained excluding the use of the neighbouring information in calculating D; the x-axis corresponds to the p-value obtained using the neighbouring information in calculating D.
Figure 4. One-sample correlation network comparison of Type II diabetes versus Normal phenotype. Resample p-values for the 37 gene sets analyzed. Edge/no-edge indicates the inclusion/exclusion of the edge portion in calculating D. Neighbour/no-neighbour indicates the inclusion/exclusion of the neighbouring information in calculating D.
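The one-sample Erdős-Rényi setting of Figure 1 can be imitated with a small Monte Carlo test. The sketch below is only an illustration: the definition of D and the role of c are not reproduced in this excerpt, so the edge count is used as a hypothetical stand-in statistic, and the function names and resample count are assumptions.

```python
import numpy as np

def sample_er_graph(n, p, rng):
    """Draw a symmetric boolean adjacency matrix from G(n, p) with no self-loops."""
    upper = np.triu(rng.random((n, n)) < p, k=1)
    return upper | upper.T

def edge_count(adj):
    """Number of undirected edges (upper triangle only)."""
    return int(np.triu(adj, k=1).sum())

def one_sample_pvalue(adj_obs, p0, n_resample=999, seed=0):
    """One-sided resample p-value for H0: p = p0 against H1: p > p0."""
    rng = np.random.default_rng(seed)
    n = adj_obs.shape[0]
    observed = edge_count(adj_obs)
    null_stats = np.array(
        [edge_count(sample_er_graph(n, p0, rng)) for _ in range(n_resample)]
    )
    # Large edge counts are evidence against H0 in favour of p > p0.
    return float((np.sum(null_stats >= observed) + 1) / (n_resample + 1))
```

Repeating the test on many graphs drawn with p = 0.20 and plotting the sorted p-values against uniform quantiles would give a qq-plot in the spirit of panels (a) and (c).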
One-sample correlation network comparison of Type II diabetes versus Normal phenotype
| Gene set | p-value (ρ = 0.35) | p-value (ρ = 0.50) |
|---|---|---|
| 1 KET-HG-U133A probes | 0.38 | 0.828 |
| 2 MAP31 Inositol metabolism | 0.607 | 0.608 |
| 3 MAP40 Pentose&glucuronate interconversions | 0.599 | 0.574 |
| 4 MAP53 Ascorbate&aldarate metabolism | 0.455 | 0.809 |
| 5 MAP62 Fatty acid biosynthesis path 2 | 0.644 | 0.761 |
| 6 MAP72 Synthesis&degradation of ketone bodies | 0.588 | 0.915 |
| 7 MAP130 Ubiquinone biosynthesis | 0.122 | 0.115 |
| 8 MAP140 C21 Steroid hormone metabolism | 0.901 | 0.902 |
| 9 MAP271 Methionine metabolism | 0.49 | 0.879 |
| 10 MAP272 Cysteine metabolism | 0.522 | 0.584 |
| 11 MAP290 Valine leucine&isoleucine biosynthesis | 0.139 | 0.431 |
| 12 MAP400 Phenylalanine tyrosine&tryptophan biosyn | 0.443 | 0.804 |
| 13 MAP430 Taurine&hypotaurine metabolism | 0.782 | 0.705 |
| 14 MAP450 Selenoamino acid metabolism | 0.554 | 0.874 |
| 15 MAP460 Cyanoamino acid metabolism | 0.569 | 0.808 |
| 16 MAP472 D-Arginine&D-ornithine metabolism | 0.916 | 0.948 |
| 17 MAP511 N-Glycan degradation | 0.58 | 0.677 |
| 18 MAP512 O-Glycans biosynthesis | 0.613 | 0.673 |
| 19 MAP522 Erythromycin biosynthesis | 0.081 | 0.254 |
| 20 MAP532 Chondroitin Heparan sulfate biosynthesis | 0.726 | 0.882 |
| 21 MAP533 Keratan sulfate biosynthesis | 0.861 | 0.943 |
| 22 MAP580 Phospholipid degradation | 0.484 | 0.271 |
| 23 MAP601 Blood group glycolipid biosyn lact series | 0.571 | 0.588 |
| 24 MAP603 Globoside metabolism | 0.92 | 0.88 |
| 25 MAP630 Glyoxylate&dicarboxylate metabolism | 0.276 | 0.622 |
| 26 MAP631 1-2-Dichloroethane degradation | 0.473 | 0.797 |
| 27 MAP632 Benzoate degradation | 0.515 | 0.812 |
| 28 MAP680 Methane metabolism | 0.319 | 0.337 |
| 29 MAP720 Reductive carboxylate cycle CO2 fixation | 0.085 | 0.459 |
| 30 MAP740 Riboflavin metabolism | 0.231 | 0.583 |
| 31 MAP760 Nicotinate&nicotinamide metabolism | 0.581 | 0.899 |
| 32 MAP780 Biotin metabolism | 0.451 | 0.67 |
| 33 MAP900 Terpenoid biosynthesis | 0.802 | 0.835 |
| 34 MAP950 Alkaloid biosynthesis I | 0.666 | 0.6965 |
| 35 MAP3030 DNA polymerase | 0.877 | 0.707 |
| 36 PYR-HG-U133A probes | 0.524 | 0.75 |
| 37 ROS-HG-U133A probes | 0.658 | 0.756 |
Resample p-values for the 37 gene sets analyzed based on correlation networks using thresholds ρ = 0.35 and 0.50.
Valine leucine and isoleucine biosynthesis (MAP290) and D-Arginine and D - Ornithine Metabolism (MAP472) gene sets for the Normal and Type II (DM2) phenotypes
| Probe | 1 | 2 | 3 | 4 | 5 | 6 | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 1.0 | . | . | . | . | . | 1.0 | . | . | . | . | . |
| 2 | | 1.0 | 0.53 | 0.59 | 0.59 | . | | 1.0 | . | . | . | . |
| 3 | | | 1.0 | 0.73 | . | . | | | 1.0 | 0.81 | . | . |
| 4 | | | | 1.0 | 0.86 | . | | | | 1.0 | . | . |
| 5 | | | | | 1.0 | . | | | | | 1.0 | . |
| 6 | | | | | | 1.0 | | | | | | 1.0 |
| | | | | | | | | | | | | |
| 1 | 1.0 | 0.62 | 0.61 | 0.56 | 0.60 | 0.52 | 1.0 | 0.63 | 0.62 | 0.64 | 0.57 | 0.54 |
| 2 | | 1.0 | 0.99 | 0.98 | 0.97 | 0.94 | | 1.0 | 0.99 | 0.97 | 0.98 | 0.94 |
| 3 | | | 1.0 | 0.98 | 0.97 | 0.95 | | | 1.0 | 0.97 | 0.96 | 0.95 |
| 4 | | | | 1.0 | 0.97 | 0.97 | | | | 1.0 | 0.98 | 0.96 |
| 5 | | | | | 1.0 | 0.96 | | | | | 1.0 | 0.94 |
| 6 | | | | | | 1.0 | | | | | | 1.0 |
The numerical values are the thresholded correlation estimates. A ‘.’ denotes the lack of an edge between two probes, ‘1.0’ is a visual placeholder. The actual probes differ between the two gene sets.
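The dot notation used in this table can be read back into a symmetric weight matrix. The sketch below is a small illustration with hypothetical helper names; it parses the two blocks of the first (upper) six-probe matrix pair exactly as printed and counts their edges.

```python
import numpy as np

def parse_upper_triangle(rows):
    """Build a symmetric weight matrix from upper-triangular rows.

    rows[i] holds the entries for columns i..n-1 of row i;
    '.' means no edge and is stored as 0.0.
    """
    n = len(rows)
    w = np.zeros((n, n))
    for i, row in enumerate(rows):
        tokens = row.split()
        if len(tokens) != n - i:
            raise ValueError(f"row {i + 1} should have {n - i} entries")
        for offset, tok in enumerate(tokens):
            j = i + offset
            w[i, j] = w[j, i] = 0.0 if tok == "." else float(tok)
    return w

def edge_count(w):
    """Number of non-zero off-diagonal weights in the upper triangle."""
    return int(np.count_nonzero(np.triu(w, k=1)))

# Left and right blocks of the first matrix pair in the table above.
left = ["1.0 . . . . .", "1.0 0.53 0.59 0.59 .", "1.0 0.73 . .",
        "1.0 0.86 .", "1.0 .", "1.0"]
right = ["1.0 . . . . .", "1.0 . . . .", "1.0 0.81 . .",
         "1.0 . .", "1.0 .", "1.0"]

print(edge_count(parse_upper_triangle(left)))   # 5
print(edge_count(parse_upper_triangle(right)))  # 1
```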
Gaussian graphical model estimate details for three ovarian cancer phenotypes
| G1-S phase of the cell cycle | 5 | 0 | 0 | 3 |
| S-G2 phase of the cell cycle | 13 | 0 | 2 | 0 |
| Checkpoint | 6 | 0 | 3 | 4 |
| DNA damage and repair | 5 | 4 | 0 | 0 |
| DNA synthesis and replication | 13 | 0 | 30 | 40 |
Number of edges in the Gaussian graphical model estimate for each of the three phenotypes across the five processes categorized by Bracken et al. [47].
Two-sample comparison of select ovarian cancer phenotypes
| G1-S phase of the cell cycle | NA | 0.636 |
| S-G2 phase of the cell cycle | 0.676 | 0.691 |
| Checkpoint | 0.812 | 0.380 |
| DNA damage and repair | 0.637 | NA |
| DNA synthesis and replication | 0.368 | 0.142 |
Resample p-values for phenotypic comparisons of the form H₀: Π₁ = Π₂ versus H₁: Π₁ ≠ Π₂.
Estimated networks for the SCA1 and SCA3 phenotypes
| 1.0 | . | . | . | 0.60 | 0.65 | . | −0.75 | . | −0.59 | −0.59 | . | . | |
| | 1.0 | −0.64 | . | . | . | 0.53 | . | . | . | . | 0.52 | −0.48 | |
| | | 1.0 | −0.43 | . | −0.41 | . | . | 0.66 | . | . | 0.88 | −0.72 | |
| | | | 1.0 | . | . | −0.72 | . | 0.64 | . | . | 0.49 | −0.44 | |
| | | | | 1.0 | . | . | . | −0.46 | 0.55 | . | . | . | |
| | | | | | 1.0 | . | 0.66 | 0.47 | 0.50 | . | . | . | |
| | | | | | | 1.0 | 0.58 | . | . | . | . | . | |
| | | | | | | | 1.0 | . | . | −0.77 | . | . | |
| | | | | | | | | 1.0 | . | . | −0.62 | 0.80 | |
| | | | | | | | | | 1.0 | . | . | . | |
| | | | | | | | | | | 1.0 | 0.61 | −0.43 | |
| | | | | | | | | | | | 1.0 | 0.77 | |
| | | | | | | | | | | | | 1.0 | |
| | | | | | | | | | | | | | |
| 1.0 | −0.52 | . | . | 0.37 | 0.42 | 0.64 | . | 0.34 | . | . | 0.34 | −0.44 | |
| | 1.0 | 0.36 | −0.49 | 0.37 | . | 0.84 | . | 0.62 | 0.35 | . | . | −0.56 | |
| | | 1.0 | . | 0.47 | . | . | 0.47 | −0.40 | −0.60 | −0.40 | . | . | |
| | | | 1.0 | 0.60 | 0.34 | . | −0.40 | . | 0.37 | . | −0.43 | . | |
| | | | | 1.0 | −0.44 | −0.43 | . | . | . | . | 0.45 | . | |
| | | | | | 1.0 | −0.35 | . | . | . | . | . | . | |
| | | | | | | 1.0 | . | −0.42 | . | 0.39 | . | 0.70 | |
| | | | | | | | 1.0 | 0.34 | 0.82 | 0.49 | −0.46 | . | |
| | | | | | | | | 1.0 | −0.37 | . | . | 0.60 | |
| | | | | | | | | | 1.0 | −0.58 | 0.34 | . | |
| | | | | | | | | | | 1.0 | 0.46 | . | |
| | | | | | | | | | | | 1.0 | . | |
| | | | | | | | | | | | | 1.0 | |
The estimated SCA1 and SCA3 Gaussian graphical model networks for the DNA synthesis and replication genes. Off-diagonal non-zero weights indicate the presence of an edge between two genes, ‘.’ denotes the lack of an edge, and ‘1.0’ is a visual placeholder.
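In a Gaussian graphical model, edge weights of this kind are commonly reported as partial correlations, which follow from the precision (inverse covariance) matrix by rescaling and a sign change. The sketch below assumes the weights in the table are partial correlations and uses a plain matrix inverse; the sparse estimator actually used for these networks is not described in this excerpt, and the function names and tolerance are illustrative.

```python
import numpy as np

def partial_correlations(cov):
    """Partial correlation matrix from a covariance matrix.

    With precision matrix K = inv(cov), the partial correlation between
    variables i and j is -K[i, j] / sqrt(K[i, i] * K[j, j]).
    """
    precision = np.linalg.inv(cov)
    scale = np.sqrt(np.diag(precision))
    pcor = -precision / np.outer(scale, scale)
    np.fill_diagonal(pcor, 1.0)
    return pcor

def edges(pcor, tol=1e-8):
    """List (i, j, weight) for upper-triangle entries with magnitude above tol."""
    rows, cols = np.triu_indices_from(pcor, k=1)
    return [(int(i), int(j), float(pcor[i, j]))
            for i, j in zip(rows, cols) if abs(pcor[i, j]) > tol]
```

A zero partial correlation corresponds to a missing edge, shown as ‘.’ in the table, so counting the entries returned by `edges` gives the edge total of the estimated network.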