Xuesong Wang, Lijing Li, Yuhu Cheng.
Abstract
BACKGROUND: Previous studies have shown modular structures in PPI (protein-protein interaction) networks. More recently, many genome and metagenome investigations have focused on identifying modules in PPI networks. However, most of the existing methods are insufficient when applied to networks with overlapping modular structures. In our study, we describe a novel overlapping module identification method (OMIM) to address this problem.
Year: 2012 PMID: 22595001 PMCID: PMC3348045 DOI: 10.1186/1471-2105-13-S7-S4
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. Artificial word association dataset. The artificial word association dataset is a small-scale network used to validate OMIM. It can be viewed as a two-layer network. Nine words constitute the first layer, in which the word 'day' acts as a hub. The second layer consists of 8 sub-networks centered on the other 8 words in the first layer, i.e., month, sunshine, camp, sleep, work, enjoy, long and sunny.
Results of the comparison on the word association dataset
| Algorithm | AC | OL | AVD | NUM_M | D_hub | P_hub |
|---|---|---|---|---|---|---|
| OMIM | 1.0000 | 1.0265 | 1.9817 | 8 | 1 | 8 |
| Newman | 0.9810 | 1.0000 | 1.8904 | 8 | - | - |
| MCL | 0.9934 | 1.0063 | 2.0132 | 9 | - | - |
| CPM | 0.0043 | 0.0199 | 0.0397 | 1 | - | - |
AC: accuracy. OL: overlapping rate. AVD: average degree. D_hub: date hub. P_hub: party hub. NUM_M: the number of modules obtained by each method. '-': the method was unable to discover party or date hubs.
Figure 2. Size distribution of PPI modules obtained by OMIM. The abscissa indicates the size of the modules, i.e., the number of proteins in each module. The ordinate shows the number of modules of that size.
Figure 3. Degree distribution of the PPI dataset. K represents the degree of a protein, and the ordinate P(K) is the fraction of proteins in the network with degree K.
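The quantity P(K) in Figure 3 can be illustrated with a minimal sketch. It assumes the network is given as an undirected list of interacting pairs. The function name `degree_distribution` and the toy edge list are hypothetical and are not taken from the paper or its dataset.

```python
from collections import Counter


def degree_distribution(edges):
    """Return {K: P(K)}: the fraction of nodes having degree K.

    `edges` is an iterable of (node_a, node_b) pairs of an undirected
    network. Self-interactions and duplicate pairs are ignored
    (an assumption made for this sketch).
    """
    neighbors = {}
    for a, b in edges:
        if a == b:
            continue  # skip self-interactions
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    # Count how many nodes have each degree, then normalize.
    counts = Counter(len(n) for n in neighbors.values())
    total = len(neighbors)
    return {k: c / total for k, c in sorted(counts.items())}


# Hypothetical toy network (not the PPI dataset).
toy_edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C")]
p_k = degree_distribution(toy_edges)
```

Nodes with no interactions never appear in an edge list, so this sketch reports P(K) only for K of at least 1.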
Enrichment analysis of 10 randomly selected modules
| Module | Protein | Main function: BP (-log P-value) | Main function: MF (-log P-value) | Main function: CC (-log P-value) |
|---|---|---|---|---|
| 3 | CDC39/MOT2/NOT3/NOT5/ | nuclear-transcribed mRNA poly(A) tail shortening (21.60) | ubiquitin-protein ligase activity (10.50) | CCR4-NOT core complex (24.36) |
| 5 | MSH2/MLH1/MSH3/MSH6/PMS1/ | meiotic mismatch repair (31.64) | mismatched DNA binding (33.15) | mismatch repair complex (33.61) |
| 11 | SEN15/SEN2/SEN34/SEN54/ | tRNA-type intron splice site recognition and cleavage (29.28) | endoribonuclease activity, producing 3'-phosphomonoesters (30.03) | tRNA-intron endonuclease complex (29.00) |
| 12 | DIS3/RRP4/RRP42/RRP43/SKI6/ | nuclear polyadenylation-dependent mRNA catabolic process (27.68) | molecular function unknown (RRP4/RRP42/RRP43/SKI6) | cytoplasmic exosome (RNase complex) (30.24) |
| 21 | CDC23/CDC16/APC9/APC4/APC2/APC11/APC1/APC5/CDC26/CDC27/DOC1/MND2/SWM1/ | anaphase-promoting complex-dependent proteasomal ubiquitin-dependent protein catabolic process (75.04) | ubiquitin-protein ligase activity (44.07) | anaphase-promoting complex (83.63) |
| 25 | MRS11/TIM12/TIM22/TIM18/TIM54/TIM10/MRS5/TIM9/ | protein import into mitochondrial inner membrane (38.73) | protein transporter activity (27.42) | mitochondrial inner membrane protein insertion complex (43.22) |
| 26 | TOM6/TOM5/TOM40/TOM20/TOM22/TOM7/TOM70/ | protein targeting to mitochondrion (31.14) | protein channel activity | mitochondrial outer membrane translocase complex (47.98) |
| 98 | YOL103w-b/PAN6/YOR142w-a/YER159c-a/YPR158w-a/ | transposition, RNA-mediated (14.06) | RNA binding (6.37) | retrotransposon nucleocapsid (13.24) |
| 103 | TRS85/TRS33/TRS130/TRS20/GSG1/TRS65/TRS31/TRS23/TRS120/BET3/SED5/SLY1/BOS1/BET5/DSS4/YPT1/BET1/SEC34/YKT6/YPT6/SEC22/KRE11/ | golgi vesicle transport (62.74) | rab guanyl-nucleotide exchange factor activity (35.95) | TRAPP complex (55.19) |
| 115 | rox3/sfl1/sin4/srb11/srb9/ | positive regulation of transcription from RNA polymerase II promoter (9.93) | transcription factor binding transcription factor activity (15.39) | mediator complex (18.12) |
Main functions: the GO terms obtained according to the -log P-values of all modules for biological process (BP), molecular function (MF) and cellular component (CC).
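The record does not state how the P-values were computed. As an illustration only, the sketch below assumes a one-sided hypergeometric enrichment test and a base-10 logarithm, both of which are assumptions. The function names and the toy numbers are hypothetical.

```python
import math


def hypergeom_tail(population, annotated, module_size, hits):
    """P(X >= hits) under a hypergeometric model.

    population  : number of proteins in the background set
    annotated   : background proteins carrying the GO term
    module_size : number of proteins in the module
    hits        : module proteins carrying the GO term
    """
    upper = min(annotated, module_size)
    total = math.comb(population, module_size)
    favorable = sum(
        math.comb(annotated, i) * math.comb(population - annotated, module_size - i)
        for i in range(hits, upper + 1)
    )
    return favorable / total


def neg_log_p(population, annotated, module_size, hits):
    """Return -log10 of the enrichment P-value (log base is an assumption)."""
    return -math.log10(hypergeom_tail(population, annotated, module_size, hits))


# Hypothetical toy case: 10 proteins, 3 annotated, a module of 3, all annotated.
score = neg_log_p(population=10, annotated=3, module_size=3, hits=3)
```

Larger values correspond to smaller P-values, so under these assumptions a module's main function would be the term with the highest score in each GO category.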
Figure 4. Cluster frequency of the 115 modules for categories BP, MF and CC. The abscissa indicates the module number and the ordinate the cluster frequency (%). Cluster frequencies for the three main function categories, BP (biological process), MF (molecular function) and CC (cellular component), are marked by different colors.
Comparison of OMIM with other competing algorithms on the PPI dataset
| Algorithm | Module number | Module size | Discard rate | GO BP (-log P-value) | GO MF (-log P-value) | GO CC (-log P-value) |
|---|---|---|---|---|---|---|
| OMIM | 115 | 25.81 | 44.26 | 7.27 | 7.69 | 7.44 |
| Newman | 115 | 24.18 | 44.26 | 7.18 | 7.39 | 7.22 |
| MCL | 319 | 7.40 | 52.68 | 7.17 | 6.72 | 7.16 |
| CPM | 66 | 10.96 | 85.51 | 8.39 | 7.60 | 8.53 |
Module number: the number of modules obtained by each algorithm. Module size: the average size of modules in each algorithm. GO: the average -log P-values of all modules for biological process (BP), molecular functions (MF) and cellular component (CC).