Xiangqing Sun, Qing Lu, Shubhabrata Mukherjee, Paul K Crane, Robert Elston, Marylyn D Ritchie.
Abstract
Gene-gene interactions may contribute to the genetic variation underlying complex traits but have not always been taken fully into account. Statistical analyses that consider gene-gene interaction may increase the power of detecting associations, especially for low-marginal-effect markers, and may explain in part the "missing heritability." Detecting pair-wise and higher-order interactions genome-wide requires enormous computational power. Filtering pipelines increase the computational speed by limiting the number of tests performed. We summarize existing filtering approaches to detect epistasis, after distinguishing the purposes that lead us to search for epistasis. Statistical filtering includes quality control on the basis of single marker statistics to avoid the analysis of bad and least informative data, and limits the search space for finding interactions. Biological filtering includes targeting specific pathways, integrating various databases based on known biological and metabolic pathways, gene function ontology and protein-protein interactions. It is increasingly possible to target single-nucleotide polymorphisms that have defined functions on gene expression, though not belonging to protein-coding genes. Filtering can improve the power of an interaction association study, but also increases the chance of missing important findings.
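
To give a sense of the scale involved, the following minimal Python sketch counts the marker combinations an exhaustive scan would test and shows how a single-marker filter shrinks that number. It is an illustration only, not a method from the article: the function names, the marker IDs, the p-values, and the choice of a marginal p-value threshold as the single-marker statistic are all assumptions made for this example.

```python
from math import comb


def n_interaction_tests(n_markers: int, order: int = 2) -> int:
    """Number of distinct marker combinations of the given interaction order."""
    return comb(n_markers, order)


def filter_by_marginal_p(p_values: dict, threshold: float) -> list:
    """Keep markers whose single-marker p-value is at or below the threshold."""
    return sorted(m for m, p in p_values.items() if p <= threshold)


# Hypothetical single-marker p-values, invented for illustration only.
p_values = {"rs1": 0.001, "rs2": 0.20, "rs3": 0.04, "rs4": 0.75, "rs5": 0.03}
kept = filter_by_marginal_p(p_values, threshold=0.05)

print(n_interaction_tests(len(p_values)))  # pairs before filtering: 10
print(n_interaction_tests(len(kept)))      # pairs after filtering: 3
print(n_interaction_tests(500_000))        # pairwise scan of 500,000 markers: 124999750000
```

A filter of this kind discards markers with weak marginal signal, which is exactly where the abstract's caveat applies: markers that act mainly through interaction may be removed before any pair is tested.
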
Keywords: biological interaction; epistasis; filtering pipeline; genetic interaction; optimal search
Year: 2014 PMID: 24817878 PMCID: PMC4012196 DOI: 10.3389/fgene.2014.00106
Source DB: PubMed Journal: Front Genet ISSN: 1664-8021 Impact factor: 4.599
Biological information databases on gene ontology annotation, gene–gene interactions, pathways, disease related gene networks and systems.
| Database | URL | Description | Reference |
|---|---|---|---|
| KEGG | | KEGG is a collection of manually drawn pathway maps representing knowledge on the molecular interaction and reaction networks for metabolism, genetic information processing, environmental information processing, cellular processes, organismal systems, human diseases, and drug development. | |
| GO | | GO provides an ontology of defined terms representing gene product properties. The ontology covers three domains: cellular component, molecular function, and biological processes. | |
| DIP | | Databases of experimentally determined interactions between proteins. | |
| BioGRID | | A comprehensive resource of protein–protein and genetic interactions for all major model organism species. | |
| NetPath | | Resource of signal transduction pathways in humans. | |
| IntAct | | Database of molecular interactions that are derived from literature curation or direct user submissions. | |
| MINT | | MINT focuses on experimentally verified protein–protein interactions mined from the scientific literature by expert curators. MINT now uses the IntAct database infrastructure to limit the duplication of efforts and to optimize future software development. | |
| MIPS | | The MIPS mammalian protein–protein interaction Database is a collection of manually curated high-quality interactions. | |
| Pfam | | The Pfam database is a large collection of protein families, each represented by multiple sequence alignments and hidden Markov models. There are two kinds of entries in Pfam: Pfam-A entries are high quality, manually curated families; Pfam-B entries have lower quality. | |
| STRING | | A database of known and predicted protein interactions, including direct (physical) and indirect (functional) associations. | |
| MSigDB | | Molecular signatures database, a collection of annotated gene sets integrating canonical pathways representing biological processes. | |
| BioCarta | | Includes classical pathways as well as current suggestions for new pathways. | |
| Reactome | | The Reactome pathway database aims to provide intuitive bioinformatics tools for visualization, interpretation and analysis of pathway knowledge. | |
| T2DGADB | | A disease gene network database for type 2 diabetes. | |
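
Resources of this kind can be used for biological filtering by testing only those SNP pairs whose genes are already linked in a knowledge source. The Python sketch below shows one possible way to do that, under stated assumptions: the gene names, SNP IDs, gene-to-SNP mapping, and the function `snp_pairs_from_gene_pairs` are hypothetical, and no real database or API is queried.

```python
from itertools import product


def snp_pairs_from_gene_pairs(gene_pairs, gene_to_snps):
    """Enumerate SNP pairs whose genes form a pair in a knowledge source."""
    pairs = set()
    for gene_a, gene_b in gene_pairs:
        snps_a = gene_to_snps.get(gene_a, ())
        snps_b = gene_to_snps.get(gene_b, ())
        for snp_a, snp_b in product(snps_a, snps_b):
            if snp_a != snp_b:
                # Store each pair in a fixed order so (x, y) and (y, x) collapse.
                pairs.add(tuple(sorted((snp_a, snp_b))))
    return sorted(pairs)


# Hypothetical gene pairs, standing in for an export from an interaction database.
gene_pairs = [("GENE_A", "GENE_B"), ("GENE_B", "GENE_C")]

# Hypothetical mapping of genes to genotyped SNPs.
gene_to_snps = {
    "GENE_A": ["rs1", "rs2"],
    "GENE_B": ["rs3"],
    "GENE_C": ["rs4", "rs5"],
    "GENE_D": ["rs6"],  # no known partner, so its SNPs are never paired
}

candidate_pairs = snp_pairs_from_gene_pairs(gene_pairs, gene_to_snps)
print(candidate_pairs)
# [('rs1', 'rs3'), ('rs2', 'rs3'), ('rs3', 'rs4'), ('rs3', 'rs5')]
```

In this toy example only 4 of the 15 possible pairs among the six SNPs are tested. The trade-off is that any interaction involving a gene absent from the knowledge source, such as `GENE_D` here, cannot be found.
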