W James Dittmar, Lauren McIver, Pawel Michalak, Harold R Garner, Gregorio Valdez.
Abstract
The wealth of publicly available gene expression and genomic data provides unique opportunities for computational inference to discover groups of genes that function to control specific cellular processes. Such genes are likely to have co-evolved and be expressed in the same tissues and cells. Unfortunately, the expertise and computational resources required to compare tens of genomes and gene expression data sets make this type of analysis difficult for the average end-user. Here, we describe the implementation of a web server that predicts genes involved in affecting specific cellular processes together with a gene of interest. We termed the server 'EvoCor', to denote that it detects functional relationships among genes through evolutionary analysis and gene expression correlation. This web server integrates profiles of sequence divergence derived by a Hidden Markov Model (HMM) and tissue-wide gene expression patterns to determine putative functional linkages between pairs of genes. This server is easy to use and freely available at http://pilot-hmm.vbi.vt.edu/.
Year: 2014 PMID: 24848012 PMCID: PMC4086105 DOI: 10.1093/nar/gku442
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. Workflow of an EvoCor analysis. The evolutionary history and expression pattern of the input gene are first compared with those of all other genes (A). EvoCor then uses this information to rank genes by how closely their evolutionary history and expression pattern match those of the input gene (B) and generates a list of functionally related genes (C).
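The three steps in Figure 1 (compare, rank, list) can be illustrated with a minimal sketch. This record does not describe how EvoCor actually scores gene pairs, so everything here is an assumption made for illustration: the function names `pearson` and `rank_candidates`, the toy gene names and profiles, and the choice to average a Pearson correlation of divergence profiles with a Pearson correlation of expression profiles.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have equal length of at least 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no correlation signal
    return cov / sqrt(var_x * var_y)


def rank_candidates(query, divergence, expression, top_n=10):
    """Rank genes by similarity to `query`.

    ASSUMPTION: the combined score is the mean of the two correlations.
    This is an illustrative choice, not EvoCor's documented scoring.
    """
    scores = []
    for gene in divergence:
        if gene == query or gene not in expression:
            continue
        evo = pearson(divergence[query], divergence[gene])     # step A: evolutionary history
        expr = pearson(expression[query], expression[gene])    # step A: expression pattern
        scores.append((gene, (evo + expr) / 2))
    scores.sort(key=lambda item: item[1], reverse=True)        # step B: rank
    return scores[:top_n]                                      # step C: candidate list


# Hypothetical toy profiles: divergence per genome, expression per tissue.
divergence = {
    "GENE_A": [0.1, 0.2, 0.4, 0.8],
    "GENE_B": [0.2, 0.3, 0.5, 0.9],
    "GENE_C": [0.9, 0.7, 0.2, 0.1],
}
expression = {
    "GENE_A": [5.0, 1.0, 3.0],
    "GENE_B": [4.5, 1.2, 2.9],
    "GENE_C": [1.0, 5.0, 2.0],
}
ranked = rank_candidates("GENE_A", divergence, expression)
```

With these toy profiles, `GENE_B` tracks `GENE_A` in both data types and ranks ahead of `GENE_C`, whose profiles run in the opposite direction.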
Figure 2. DAVID was used to evaluate EvoCor's ability to predict genes with similar biological functions. The fraction of overlapping DAVID keywords was determined for gene sets clustered using EvoCor (green and blue lines) and for gene sets clustered randomly (red line). The percentage of overlapping DAVID terms is significantly higher in the gene sets generated using EvoCor with the human gene expression set (blue line) and with the mouse expression set (green line) than in randomly generated gene sets (red line). For the human expression dataset versus random, P-value = 2.20e-16 and D-value = 0.3286. For the mouse expression dataset versus random, P-value = 2.22e-16 and D-value = 0.1664.
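The caption for Figure 2 reports a D-value alongside each P-value but does not name the test. A D statistic is what a two-sample Kolmogorov-Smirnov comparison produces, so the sketch below assumes that test. It computes D as the largest gap between the empirical cumulative distributions of two samples. The function name `ks_statistic` and the overlap fractions are hypothetical and are not taken from the paper.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov D: the maximum absolute difference
    between the empirical CDFs of the two samples."""
    a, b = sorted(sample_a), sorted(sample_b)
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    d = 0.0
    # The empirical CDFs only change at observed values, so checking
    # every observed value is enough to find the maximum gap.
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d


# Hypothetical keyword-overlap fractions for two collections of gene sets.
evocor_overlap = [0.3, 0.4, 0.5, 0.6]
random_overlap = [0.1, 0.2, 0.3, 0.4]
d_value = ks_statistic(evocor_overlap, random_overlap)
```

D ranges from 0 (identical empirical distributions) to 1 (no overlap between the samples). Turning D into a P-value also requires the sample sizes, which this record does not report.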