Haja N Kadarmideen, Nathan S Watson-Haigh.
Abstract
Gene co-expression networks (GCN), built using high-throughput gene expression data, are fundamental to systems biology. The main aim of this study was to compare two popular approaches to building and analysing GCN. We used real ovine microarray transcriptomics datasets representing four different treatments with Metyrapone, an inhibitor of cortisol biosynthesis. We conducted several microarray quality control checks before applying GCN methods to the filtered datasets. We then compared the outputs of the two methods using connectivity as a criterion, as it measures how well a node (gene) is connected within a network. The two GCN construction methods used were Weighted Gene Co-expression Network Analysis (WGCNA) and Partial Correlation and Information Theory (PCIT). Nodes were ranked by their connectivity measures in each of the four networks created by WGCNA and PCIT, and the node ranks from the two methods were compared to identify nodes that are highly differentially ranked (HDR). A total of 1,017 HDR nodes were identified across one or more of the four networks. We investigated HDR nodes by gene enrichment analyses in relation to their biological relevance to phenotypes. We observed that, in contrast to the WGCNA method, the PCIT algorithm removes many of the edges of the most highly interconnected nodes. Removing edges of the most highly connected nodes, or hub genes, will have consequences for downstream analyses and biological interpretations. In general, for large GCN construction (with > 20,000 genes), access to large computer clusters, particularly those with larger amounts of shared memory, is recommended.
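The rank comparison described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the toy adjacency matrices, and the `min_rank_gap` cut-off are all hypothetical, since the abstract does not state how large a rank difference qualifies a node as HDR. Connectivity is taken here as the sum of a node's edge weights.

```python
def connectivity(adjacency):
    """Connectivity of each node: sum of its edge weights to all other nodes."""
    n = len(adjacency)
    return [sum(adjacency[i][j] for j in range(n) if j != i) for i in range(n)]


def rank_descending(values):
    """Rank 1 is the highest value; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the window over all entries tied with the one at `pos`.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        average_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = average_rank
        pos = end + 1
    return ranks


def hdr_nodes(adj_a, adj_b, min_rank_gap):
    """Nodes whose connectivity ranks differ by at least min_rank_gap.

    min_rank_gap is a hypothetical threshold chosen for illustration only.
    """
    ranks_a = rank_descending(connectivity(adj_a))
    ranks_b = rank_descending(connectivity(adj_b))
    return [i for i in range(len(ranks_a)) if abs(ranks_a[i] - ranks_b[i]) >= min_rank_gap]


# Toy data (invented): a fully weighted network in which node 0 is the hub,
# and a sparse network in which most of node 0's edges have been removed.
weighted = [
    [0.0, 0.9, 0.8, 0.7],
    [0.9, 0.0, 0.6, 0.2],
    [0.8, 0.6, 0.0, 0.1],
    [0.7, 0.2, 0.1, 0.0],
]
sparse = [
    [0, 0, 0, 1],
    [0, 0, 1, 1],
    [0, 1, 0, 1],
    [1, 1, 1, 0],
]

print(hdr_nodes(weighted, sparse, min_rank_gap=2))
```

In this toy case the hub of the weighted network (node 0) drops to the bottom rank in the sparse network, which mirrors the abstract's observation that removing edges from highly connected nodes shifts their connectivity rank.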
Year: 2012 PMID: 23144540 PMCID: PMC3489090 DOI: 10.6026/97320630008855
Source DB: PubMed Journal: Bioinformation ISSN: 0973-2063
Figure 1. Plots relevant to the All, Control, Treatment, D60 and D67 networks (columns). Top row: TOM plots for the WGCNA networks. Heat maps show the level of topological overlap as measured by TOM, where dark red/orange represents a higher level of overlap between pairs of nodes in the network. Modules can be defined using the dark red/orange squares along the diagonal. Red bars above and to the left of each heat map indicate the location of the highly differentially ranked (HDR) nodes. All HDR nodes identified from all the networks are shown in the ALL network TOM plot. Middle row: Frequency distributions of all Pearson correlations (grey) used to generate the networks and of those edges remaining following PCIT (red). Bottom row: Plots of ranked connectivities calculated from the PCIT- and WGCNA-derived networks. Data points are semi-transparent, so dense regions of points appear as dark areas. The green dashed line is the line of equality. HDR nodes are shown in red, with all 1,017 indicated in the connectivity rank plot for the ALL network.
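For readers unfamiliar with the topological overlap shown in the heat maps, the following Python sketch computes it for a small network. It assumes the commonly cited unsigned definition, TOM_ij = (sum over u of a_iu * a_uj + a_ij) / (min(k_i, k_j) + 1 - a_ij), with adjacencies in [0, 1] and k the node connectivity. The caption does not say which variant or settings were used for the figure, so this is an illustration only; the function name and toy matrix are hypothetical.

```python
def topological_overlap(adj):
    """Topological overlap matrix for a symmetric adjacency matrix with entries in [0, 1]."""
    n = len(adj)
    # Connectivity k_i: sum of a node's edge weights, excluding the diagonal.
    k = [sum(adj[i][u] for u in range(n) if u != i) for i in range(n)]
    tom = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Weight of connections that i and j share through third nodes u.
            shared = sum(adj[i][u] * adj[u][j] for u in range(n) if u != i and u != j)
            value = (shared + adj[i][j]) / (min(k[i], k[j]) + 1 - adj[i][j])
            tom[i][j] = tom[j][i] = value
    return tom


# Toy path network 0 - 1 - 2: nodes 0 and 2 are not directly linked,
# but they share neighbour 1, so their overlap is non-zero.
path = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]
tom = topological_overlap(path)
print(tom[0][2])
```

Pairs with high overlap cluster into the dark squares along the diagonal of a TOM heat map once the nodes are ordered by a clustering of the overlap values.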