Francesco Gregoretti, Vincenzo Belcastro, Diego di Bernardo, Gennaro Oliva.
Abstract
The reverse engineering of gene regulatory networks from gene expression profile data has become crucial for gaining novel biological knowledge. Advances in microarray technologies are producing large amounts of data that need to be analyzed. Applying current reverse engineering algorithms to large data sets can be very computationally intensive. These emerging computational requirements can be met using parallel computing techniques. It has been shown that the Network Identification by multiple Regression (NIR) algorithm performs better than the other ready-to-use reverse engineering software. However, it cannot be used with large networks with thousands of nodes (as is the case in biological networks) because of its high time and space complexity. In this work we overcome this limitation by designing and developing a parallel version of the NIR algorithm. The new implementation achieves very good accuracy even for large gene networks, improving our understanding of gene regulatory networks, which is crucial for a wide range of biomedical applications.
Year: 2010 PMID: 20422008 PMCID: PMC2858156 DOI: 10.1371/journal.pone.0010179
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Results of the application of network inference algorithms on the simulated dataset.
| Data Sets | ARACNe PPV | ARACNe Se | BANJO PPV | BANJO Se | NIR PPV | NIR Se | Clustering PPV | Clustering Se | Random PPV |
| Local (steady-state) | | | | | | | | | |
| 10×10 | 0.53 | 0.61 | 0.41 | 0.50 | 0.63 | 0.96 | 0.39 | 0.38 | 0.36 |
| | | | 0.25 | 0.18 | 0.57 | 0.93 | | | 0.20 |
| | | | 0.15 | 0.05 | 0.57 | 0.93 | | | 0.10 |
| 100×100 | 0.56 | 0.28 | 0.71 | 0.00 | 0.97 | 0.87 | 0.29 | 0.18 | 0.19 |
| | | | 0.42 | 0.00 | 0.96 | 0.86 | | | 0.10 |
| | | | 0.60 | 0.00 | 0.96 | 0.86 | | | 0.05 |
| 1000×1000 | 0.66 | 0.65 | - | - | | | | | |
PPV: Positive Predicted Value (or accuracy), defined as $TP/(TP+FP)$, where $TP$ is the number of true positives and $FP$ the number of false positives; Se: Sensitivity, defined as $TP/(TP+FN)$, with $FN$ the number of false negatives. : directed graph; : undirected graph. In bold are the results obtained by using our parallel implementation of the NIR algorithm which could not be obtained in [8]. NIR performs significantly better than other software even for the 1000-gene networks.
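As an illustration of the two metrics defined in the caption, the following minimal sketch computes PPV and Se by comparing a set of predicted edges against a set of true edges. The function names `ppv_and_sensitivity` and `as_undirected`, and the toy edge lists, are hypothetical and not taken from the paper; the sketch only assumes that a network is represented as a collection of (regulator, target) pairs.

```python
def ppv_and_sensitivity(true_edges, predicted_edges):
    """Return (PPV, Se) for a predicted edge set against the true edge set.

    PPV = TP / (TP + FP) and Se = TP / (TP + FN), as in the table caption.
    Both arguments are iterables of hashable edges (hypothetical representation).
    """
    true_edges = set(true_edges)
    predicted_edges = set(predicted_edges)
    tp = len(true_edges & predicted_edges)   # edges predicted and present
    fp = len(predicted_edges - true_edges)   # edges predicted but absent
    fn = len(true_edges - predicted_edges)   # edges present but missed
    ppv = tp / (tp + fp) if (tp + fp) else 0.0
    se = tp / (tp + fn) if (tp + fn) else 0.0
    return ppv, se


def as_undirected(edges):
    """Drop edge direction so that (a, b) and (b, a) count as the same edge."""
    return {frozenset(edge) for edge in edges}


# Toy example (made-up edges, for illustration only).
true_net = [(0, 1), (1, 2), (2, 0)]
predicted_net = [(0, 1), (2, 1), (0, 3)]

directed_scores = ppv_and_sensitivity(true_net, predicted_net)
undirected_scores = ppv_and_sensitivity(
    as_undirected(true_net), as_undirected(predicted_net)
)
print("directed:   PPV=%.2f Se=%.2f" % directed_scores)
print("undirected: PPV=%.2f Se=%.2f" % undirected_scores)
```

Scoring the same prediction as an undirected graph can only keep or raise the number of true positives, because an edge predicted with the wrong direction still counts as a match.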
Total execution times in seconds.
| number of procs | time (secs) | speedup |
| 1 | 98896 | - |
| 10 | 9725 | 10.1 |
| 20 | 4798 | 20.6 |
| 40 | 2406 | 41.1 |
| 60 | 1643 | 60.2 |
| 80 | 1259 | 78.6 |
| 100 | 969 | 102.0 |
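The speedup column can be cross-checked from the timings alone. The sketch below assumes the usual definition, speedup = T(1) / T(p), and additionally reports parallel efficiency, speedup / p, which the table does not list. The helper name `speedup_and_efficiency` is hypothetical; the timings are copied from the table above.

```python
# Execution times in seconds, keyed by number of procs (values from the table).
times = {1: 98896, 10: 9725, 20: 4798, 40: 2406, 60: 1643, 80: 1259, 100: 969}


def speedup_and_efficiency(times):
    """Map each proc count p > 1 to (speedup, efficiency).

    speedup    = T(1) / T(p)
    efficiency = speedup / p   (not reported in the table; added for illustration)
    """
    t1 = times[1]
    return {
        p: (t1 / t, t1 / (t * p))
        for p, t in sorted(times.items())
        if p != 1
    }


for procs, (speedup, efficiency) in speedup_and_efficiency(times).items():
    print(f"{procs:>3} procs: speedup {speedup:6.2f}, efficiency {efficiency:.3f}")
```

Recomputed values can differ from the tabulated speedups by about 0.1, presumably because of rounding. Efficiency stays close to 1 across the whole range, which indicates near-linear scaling up to 100 procs.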