Francesco Gadaleta, Kristel Van Steen.
Abstract
Genome-wide association studies can potentially unravel the mechanisms behind complex traits and common genetic diseases. Despite the valuable results produced thus far, many questions remain unanswered. For instance, which specific genetic compounds are linked to the risk of the disease under investigation? Through what biological mechanisms do they act? How do they interact with environmental and other external factors? A driving force of computational biology is the constantly growing volume of data generated by high-throughput technologies. Networks provide a practical framework that can deal with this abundance of information and that allows genetic associations and interactions to be discovered. Unfortunately, high dimensionality, the presence of noise and the geometry of the data can make this problem extremely challenging. We propose a penalised linear regression approach that can deal with these issues, which commonly affect genetic data. We analyse the gene expression profiles of individuals with a common trait to infer the network structure of interactions among genes. The permutation-based approach leads to more stable and reliable networks inferred from synthetic microarray data. We show that a higher number of permutations determines the number of predicted edges, improves the overall sensitivity and controls the number of false positives.
Year: 2014 PMID: 25369052 PMCID: PMC4219691 DOI: 10.1371/journal.pone.0110451
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Algorithm of the variable selection and permutation-based stability test.
| 1: |
| 2: |
| 3: |
| 4: |
| 5: |
| 6: |
| 7: |
| 8: |
| 9: |
| 10: |
| 11: |
| 12: |
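The individual steps of the algorithm did not survive in this record, so the following is only a minimal sketch of the general idea named in the caption and the abstract: penalised (lasso) regression of one gene on all others, followed by a permutation test that keeps only coefficients larger than those obtained after shuffling the target gene. It is not LABNet's actual procedure. The function names `lasso_cd` and `stable_neighbours`, the penalty `lam`, the level `alpha` and the max-coefficient null statistic are all assumptions made for illustration.

```python
import numpy as np


def lasso_cd(X, y, lam, n_iter=50):
    """Lasso by cyclic coordinate descent.

    Minimises (1/2n) * ||y - X b||^2 + lam * ||b||_1.
    """
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    resid = np.array(y, dtype=float)  # residual y - X @ beta, with beta = 0
    for _ in range(n_iter):
        for k in range(p):
            if col_sq[k] == 0.0:
                continue
            resid += X[:, k] * beta[k]  # partial residual without predictor k
            rho = X[:, k] @ resid / n
            beta[k] = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[k]
            resid -= X[:, k] * beta[k]
    return beta


def stable_neighbours(expr, target, lam=0.1, n_perm=100, alpha=0.05, seed=0):
    """Return indices of genes whose lasso coefficient for `target`
    exceeds the (1 - alpha) quantile of a permutation null.

    expr: (samples x genes) expression matrix (hypothetical input layout).
    """
    rng = np.random.default_rng(seed)
    z = (expr - expr.mean(axis=0)) / expr.std(axis=0)  # standardise each gene
    y = z[:, target]
    X = np.delete(z, target, axis=1)
    others = np.delete(np.arange(expr.shape[1]), target)

    observed = np.abs(lasso_cd(X, y, lam))

    # Null distribution: largest |coefficient| after breaking the
    # gene-to-gene relationship by permuting the target gene.
    null_max = np.empty(n_perm)
    for b in range(n_perm):
        null_max[b] = np.abs(lasso_cd(X, rng.permutation(y), lam)).max()

    threshold = np.quantile(null_max, 1.0 - alpha)
    return others[observed > threshold]


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    expr = rng.normal(size=(100, 6))
    expr[:, 0] = expr[:, 1] + 0.2 * rng.normal(size=100)  # gene 0 depends on gene 1
    print(stable_neighbours(expr, target=0, n_perm=50))
```

Running this once per gene and joining each gene to its surviving neighbours would give one possible network estimate. Raising `n_perm` makes the null quantile, and therefore the edge set, less dependent on the random seed.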
Parameters of LABNet for both 50-node and 200-node networks.
|
| 3-fold on 10% genes |
|
| 80% genes |
|
| 1 |
|
| 0–500 |
Figure 1. Number of predicted edges and false positives vs. number of permutations.
Figure 2. False positive rate vs. number of permutations.
Figure 3. Number of predicted edges and false negatives vs. number of permutations.
Figure 4. True positives and Matthews Correlation Coefficient vs. number of permutations.
Figure 5. Degree correlation across real and predicted nodes.
Figure 6. Betweenness correlation across real and predicted nodes.
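The quantities plotted in Figures 1 to 4 (false positives, false negatives, true positives, false positive rate and the Matthews Correlation Coefficient) all derive from the confusion counts of predicted against true edges. The sketch below uses the standard definitions. It assumes an undirected network without self-loops, and the function names are hypothetical rather than taken from the paper.

```python
import math


def confusion_counts(true_edges, predicted_edges, n_nodes):
    """Count TP, TN, FP, FN over all unordered node pairs."""
    true_set = {frozenset(e) for e in true_edges}
    pred_set = {frozenset(e) for e in predicted_edges}
    tp = len(true_set & pred_set)
    fp = len(pred_set - true_set)
    fn = len(true_set - pred_set)
    tn = n_nodes * (n_nodes - 1) // 2 - tp - fp - fn
    return tp, tn, fp, fn


def false_positive_rate(fp, tn):
    """FPR = FP / (FP + TN); defined as 0.0 when there are no negatives."""
    return fp / (fp + tn) if (fp + tn) else 0.0


def matthews_corrcoef(tp, tn, fp, fn):
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)).

    Returns 0.0 when the denominator vanishes (a common convention).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


if __name__ == "__main__":
    truth = [(0, 1), (1, 2)]
    predicted = [(1, 0), (2, 3)]
    tp, tn, fp, fn = confusion_counts(truth, predicted, n_nodes=4)
    print(tp, tn, fp, fn)
    print(false_positive_rate(fp, tn), matthews_corrcoef(tp, tn, fp, fn))
```

MCC is useful here because true edges are rare in sparse gene networks, so accuracy alone would be dominated by the many true negatives.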
Timings for four different executions of LABNet, run sequentially (1 CPU) and in parallel (2 to 4 CPUs) on general-purpose hardware (1.3 GHz Intel Core i5, 4 GB RAM).
| Genes | Perm | 1 CPU | 2 CPU | 3 CPU | 4 CPU |
| 50 | 500 | 269 | 175 | 153 | 135 |
| 50 | 1000 | 547 | 347 | 282 | 276 |
| 200 | 500 | 2846 | 2073 | 1997 | 1942 |
Perm indicates the number of permutations per gene.
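To read the timing table in terms of parallel scaling, one can compute the speedup (sequential time divided by parallel time) and the parallel efficiency (speedup divided by the number of CPUs). The short sketch below does this for the values in the table. The dictionary layout and function names are illustrative, and the time unit is left as reported because the table does not state it.

```python
# Timings copied from the table: (genes, permutations) -> times on 1, 2, 3, 4 CPUs.
timings = {
    (50, 500): [269, 175, 153, 135],
    (50, 1000): [547, 347, 282, 276],
    (200, 500): [2846, 2073, 1997, 1942],
}


def speedup(t_sequential, t_parallel):
    """Speedup = sequential time / parallel time."""
    return t_sequential / t_parallel


def efficiency(t_sequential, t_parallel, n_cpu):
    """Parallel efficiency = speedup / number of CPUs."""
    return speedup(t_sequential, t_parallel) / n_cpu


if __name__ == "__main__":
    for (genes, perm), times in timings.items():
        for n_cpu, t in enumerate(times, start=1):
            print(
                f"genes={genes} perm={perm} cpus={n_cpu} "
                f"speedup={speedup(times[0], t):.2f} "
                f"efficiency={efficiency(times[0], t, n_cpu):.2f}"
            )
```

On these figures the speedup at 4 CPUs stays below 2 for every configuration, so the gain from each additional CPU diminishes quickly.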