Dukyong Yoon, Hyosil Kim, Haeyoung Suh-Kim, Rae Woong Park, KiYoung Lee.
Abstract
BACKGROUND: Microarray analyses based on differentially expressed genes (DEGs) have been widely used to distinguish samples across different cellular conditions. However, studies based on DEGs have not been able to clearly determine significant differences between samples of pathophysiologically similar HIV-1 stages, e.g., between acute and chronic progressive (or AIDS) or between uninfected and clinically latent stages. We here suggest a novel approach to allow such discrimination based on stage-specific genetic features of HIV-1 infection. Our approach is based on co-expression changes of genes known to interact. The method can identify a genetic signature for a single sample, in contrast to existing protein-protein-based analyses with correlational designs.
Year: 2011 PMID: 22784566 PMCID: PMC3287475 DOI: 10.1186/1752-0509-5-S2-S1
Source DB: PubMed Journal: BMC Syst Biol ISSN: 1752-0509
Figure 1. Study overview. After acquiring expression data from the Gene Expression Omnibus (A, B), differentially expressed genes (DEGs) and differentially expressed gene pairs (DEPs) were selected and evaluated for usefulness (C–E). The results of network analysis determined the HIV-related network modules (F, G).
Protein pairs included in the top-30 DEPs
| Protein 1 | Name of Protein 1 | Protein 2 | Name of Protein 2 | P-value | G-mean | Significance |
|---|---|---|---|---|---|---|
| NUTF2 | nuclear transport factor 2 | NUP62 | nucleoporin 62kDa | 7.5E-06 | 0.37 | 1.90 |
| CDC7 | cell division cycle 7 homolog (S. cerevisiae) | MCM3 | minichromosome maintenance complex component 3 | 1.3E-05 | 0.34 | 1.66 |
| VAMP1 | vesicle-associated membrane protein 1 (synaptobrevin 1) | ARFGAP1 | ADP-ribosylation factor GTPase activating protein 1 | 4.3E-05 | 0.36 | 1.56 |
| HSPA8 | heat shock 70kDa protein 8 | TADA3L | transcriptional adaptor 3 | 1.0E-05 | 0.26 | 1.29 |
| TNR | tenascin R (restrictin, janusin) | NFASC | neurofascin | 1.6E-04 | 0.33 | 1.25 |
| ARHGEF2 | Rho/Rac guanine nucleotide exchange factor (GEF) 2 | PRKCI | protein kinase C, iota | 6.4E-05 | 0.28 | 1.17 |
| PDGFRB | platelet-derived growth factor receptor, beta polypeptide | SNX2 | sorting nexin 2 | 8.7E-05 | 0.28 | 1.14 |
| RFC5 | replication factor C (activator 1) 5, 36.5kDa | POLA1 | polymerase (DNA directed), alpha 1, catalytic subunit | 5.5E-05 | 0.27 | 1.13 |
| NFIB | nuclear factor I/B | RFX1 | regulatory factor X, 1 (influences HLA class II expression) | 3.4E-04 | 0.32 | 1.12 |
| EIF3I | eukaryotic translation initiation factor 3, subunit I | SUMO4 | SMT3 suppressor of mif two 3 homolog 4 (S. cerevisiae) | 1.7E-05 | 0.23 | 1.09 |
| COL17A1 | collagen, type XVII, alpha 1 | LAD1 | ladinin 1 | 3.9E-04 | 0.31 | 1.07 |
| IRS1 | insulin receptor substrate 1 | UBTF | upstream binding transcription factor, RNA polymerase I | 4.5E-04 | 0.32 | 1.05 |
| CAV1 | caveolin 1, caveolae protein, 22kDa | TRAF6 | TNF receptor-associated factor 6 | 1.0E-04 | 0.26 | 1.05 |
| RPS14 | ribosomal protein S14 | RPS27A | ribosomal protein S27a | 3.2E-03 | 0.41 | 1.03 |
| HNRNPA2B1 | heterogeneous nuclear ribonucleoprotein A2/B1 | HNRNPH1 | heterogeneous nuclear ribonucleoprotein H1 (H) | 4.1E-05 | 0.23 | 1.02 |
| TAF4 | TATA box binding protein (TBP)-associated factor, 135kDa | CBX3 | chromobox homolog 3 | 1.0E-03 | 0.34 | 1.01 |
| VPS11 | vacuolar protein sorting 11 homolog (S. cerevisiae) | VPS45 | vacuolar protein sorting 45 homolog (S. cerevisiae) | 2.4E-04 | 0.28 | 1.00 |
| ATP5F1 | ATP synthase, H+ transporting, mitochondrial Fo complex, subunit B1 | ATP5J2 | ATP synthase, H+ transporting, mitochondrial Fo complex, subunit F2 | 8.1E-05 | 0.24 | 0.99 |
| POLR2G | polymerase (RNA) II (DNA directed) polypeptide G | SF3B2 | splicing factor 3b, subunit 2, 145kDa | 2.6E-03 | 0.38 | 0.98 |
| PDPK1 | 3-phosphoinositide dependent protein kinase-1 | PRKCQ | protein kinase C, theta | 4.5E-05 | 0.22 | 0.97 |
| EP300 | E1A binding protein p300 | TF | transferrin | 5.3E-04 | 0.30 | 0.97 |
| RPS5 | ribosomal protein S5 | RPL28 | ribosomal protein L28 | 1.4E-03 | 0.34 | 0.96 |
| ELK1 | ELK1, member of ETS oncogene family | GRB10 | growth factor receptor-bound protein 10 | 9.7E-04 | 0.32 | 0.96 |
| RPS13 | ribosomal protein S13 | ATAD3A | ATPase family, AAA domain containing 3A | 5.8E-05 | 0.23 | 0.95 |
| PABPC1 | poly(A) binding protein, cytoplasmic 1 | RPS4Y1 | ribosomal protein S4, Y-linked 1 | 2.5E-03 | 0.37 | 0.95 |
| PRPF4 | PRP4 pre-mRNA processing factor 4 homolog (yeast) | PPIH | peptidylprolyl isomerase H (cyclophilin H) | 6.5E-04 | 0.29 | 0.93 |
| ZFP36 | zinc finger protein 36, C3H type, homolog (mouse) | EIF2C4 | eukaryotic translation initiation factor 2C, 4 | 6.3E-04 | 0.29 | 0.93 |
| RPLP2 | ribosomal protein, large, P2 | RPL29 | ribosomal protein L29 | 1.2E-03 | 0.32 | 0.93 |
| HSF1 | heat shock transcription factor 1 | STIP1 | stress-induced-phosphoprotein 1 | 1.5E-04 | 0.24 | 0.92 |
| GSK3B | glycogen synthase kinase 3 beta | FUS | fused in sarcoma | 2.7E-03 | 0.35 | 0.91 |
Here, significance is defined as −log10(P-value) × G-mean.
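The significance column can be recomputed from the other two numeric columns. The sketch below assumes a base-10 logarithm, which is what the tabulated values are consistent with; the function name `significance` and the tuple layout are illustrative choices, not taken from the original study. Because the table shows rounded P-values and G-means, recomputed scores can differ from the printed ones in the last digit.

```python
import math


def significance(p_value: float, g_mean: float) -> float:
    """Return -log10(p_value) * g_mean (base 10 is an assumption)."""
    if not 0.0 < p_value <= 1.0:
        raise ValueError("p_value must be in the interval (0, 1]")
    return -math.log10(p_value) * g_mean


# A few rows copied from the table: (protein 1, protein 2, P-value, G-mean, reported score)
rows = [
    ("NUTF2", "NUP62", 7.5e-06, 0.37, 1.90),
    ("CDC7", "MCM3", 1.3e-05, 0.34, 1.66),
    ("HSPA8", "TADA3L", 1.0e-05, 0.26, 1.29),
]

for prot1, prot2, p, g, reported in rows:
    score = significance(p, g)
    print(f"{prot1}-{prot2}: recomputed {score:.2f}, reported {reported:.2f}")
```

Ranking by this score favors pairs that combine a small P-value with a high G-mean, which is why a pair with a larger P-value can still outrank one with a smaller P-value.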
Figure 2. Comparison of DEPs and DEGs. (A, B) Expressions and correlations of DEPs. The X- and Y-axes represent the expression levels of each gene. Expression levels of the samples are represented by dots (X, Uninfected (UI); O, Acute (AT); *, Non-progressive (NP); +, Chronic (CN)). Median values of samples in the same stage are marked with larger black “+” and “×” symbols. Trend lines are shown with degrees of correlation. (C) Gene-ontology (GO) enrichment analysis of DEPs and DEGs. The numbers in circles indicate the counts of GO terms relevant to DEPs, DEGs, or both. Precise details and their -log10 p values are listed next to the numbers.
Figure 3. Result of principal-component analysis. (A) Global view of all expression data sets using principal component analysis. The first principal component accounts for as much variability in the expression of total genes as possible. The second and third components account for as much of the remaining variability as possible. Results of principal component analysis with DEGs and DEPs are shown in (B) and (C), respectively.
Figure 4. Clustering results using DEPs and DEGs. (A) Heat map for expression of DEGs across the samples. Each lane represents the expression profile of one sample. The result of hierarchical clustering with DEGs is shown at the top of the heat map (UI, Uninfected; AT, Acute; NP, Non-progressive; CN, Chronic). (B) Heat map for the co-expression scores of DEPs. (C) Four representative clusters of gene pairs. Of six groups of gene pairs clustered by K-means clustering using Pearson’s correlation between pairs and samples, four groups had co-expression scores in one stage that differed markedly from those in the other stages. The X-axis represents samples, and the Y-axis represents co-expression score. The top 10 GO terms related to biological processes are listed in descending order of p-values.
Figure 5. Performance of classification models using DEPs and DEGs. (A) Accuracies of the DEP- and DEG-based models. The accuracy of all models was estimated by the leave-one-out test. (B) Sensitivities and specificities of the DEP- and DEG-based models. (C) Confusion matrix of the DEP- and DEG-based models in (A) (UI, Uninfected; AT, Acute; NP, Non-progressive; CN, Chronic; DT, Decision tree; NN, Neural network). This matrix shows the actual stages of samples and the stages predicted by the classification methods. Each column represents a predicted stage, and each row represents an actual stage. Each cell gives the count for DEGs followed by the count for DEPs, separated by a slash. (D) Accuracy of SVMs according to cutoff values for selecting DEPs. (E) Accuracies according to classification models and the number of principal components used for building models. DEPs showed higher accuracy than DEGs regardless of the classification model or the number of principal components used.
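The leave-one-out test and the row/column convention of the confusion matrix in Figure 5 can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the nearest-centroid classifier is a simple stand-in for the SVM, decision-tree, and neural-network models named in the caption, the helper names are hypothetical, and the feature vectors are made-up toy values rather than data from the study.

```python
import math
from collections import defaultdict


def nearest_centroid_predict(train_x, train_y, x):
    """Predict the label whose class centroid is closest to x (stand-in classifier)."""
    sums = {}
    counts = defaultdict(int)
    for features, label in zip(train_x, train_y):
        if label not in sums:
            sums[label] = [0.0] * len(features)
        sums[label] = [s + f for s, f in zip(sums[label], features)]
        counts[label] += 1
    best_label, best_dist = None, math.inf
    for label, total in sums.items():
        centroid = [s / counts[label] for s in total]
        dist = math.dist(centroid, x)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def leave_one_out(xs, ys, labels):
    """Hold out each sample once; return (accuracy, confusion matrix).

    matrix[i][j] counts samples whose actual stage is labels[i] (row)
    and whose predicted stage is labels[j] (column).
    """
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        predicted = nearest_centroid_predict(train_x, train_y, xs[i])
        matrix[index[ys[i]]][index[predicted]] += 1
    correct = sum(matrix[k][k] for k in range(len(labels)))
    return correct / len(xs), matrix


# Toy, made-up feature vectors for two stages (not data from the study).
xs = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2], [1.0, 1.1], [1.2, 0.9], [0.9, 1.0]]
ys = ["UI", "UI", "UI", "AT", "AT", "AT"]

accuracy, matrix = leave_one_out(xs, ys, ["UI", "AT"])
print("accuracy:", accuracy)
print("confusion matrix (rows = actual, columns = predicted):", matrix)
```

With small sample counts, leave-one-out uses all but one sample for training in every round, which is the usual reason for preferring it over a single train/test split.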
Figure 6. HIV-related network modules. Thin lines represent PPIs, and thick lines denote DEPs. Nodes represent proteins, and node color denotes expression level: blue, yellow, and red represent increasingly higher expression. Edge color represents the sign of the co-expression score: red if the sign is positive, blue if it is negative. Representative modules for each stage were created using the median co-expression score and expression level of the samples in that stage. Network module 1 of the Chronic stage was made from 10 network modules listed in the second row.