Olga Zolotareva, Maren Kleine.
Abstract
Modern high-throughput experiments provide us with numerous potential associations between genes and diseases. Experimental validation of all the discovered associations, let alone all the possible interactions between them, is time-consuming and expensive. To facilitate the discovery of causative genes, various approaches for prioritizing genes according to their relevance to a given disease have been developed. In this article, we explain the gene prioritization problem and provide an overview of computational tools for gene prioritization. From about a hundred published gene prioritization tools, we select and briefly describe the 14 most up-to-date and user-friendly ones. We also discuss the advantages and disadvantages of existing tools, the challenges of validating them, and directions for future research.
Keywords: Data integration; Gene prioritization; Human diseases
Year: 2019 PMID: 31494632 PMCID: PMC7074139 DOI: 10.1515/jib-2018-0069
Source DB: PubMed Journal: J Integr Bioinform ISSN: 1613-4516
Figure 1: Scheme of a gene prioritization tool. Gene prioritization tools extract information about the specified candidate and seed genes or phenotype terms from evidence sources and calculate a score that reflects how likely each gene is to be responsible for the development of a phenotype. In this example, genes with alleles that cause an early-onset autosomal dominant familial form of Alzheimer’s disease are used as seeds. Candidate genes were obtained from the GWAS Catalog [29]; each candidate gene has at least one variant associated with Alzheimer’s disease. The output of the program is a list of candidate genes ranked by their calculated scores.
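The workflow in Figure 1 (seeds and candidates in, evidence lookup, score, ranked list out) can be illustrated with a minimal sketch. The scoring rule here is an assumption chosen for simplicity, not the method of any surveyed tool: a candidate's score is the fraction of evidence sources that link it to at least one seed gene. The function name `rank_candidates`, the source names, and the gene labels (`S1`, `c1`, ...) are hypothetical toy data.

```python
from typing import Dict, FrozenSet, Iterable, List, Set, Tuple


def rank_candidates(
    candidates: Iterable[str],
    seeds: Iterable[str],
    evidence: Dict[str, Set[FrozenSet[str]]],
) -> List[Tuple[str, float]]:
    """Rank candidates by the share of evidence sources linking them to a seed.

    `evidence` maps a source name to a set of undirected gene-gene links,
    each link stored as a frozenset of two gene identifiers.
    """
    seed_set = set(seeds)
    scores = {}
    for gene in candidates:
        # Count the sources in which this candidate touches any seed gene.
        hits = sum(
            1
            for links in evidence.values()
            if any(frozenset((gene, seed)) in links for seed in seed_set)
        )
        scores[gene] = hits / len(evidence) if evidence else 0.0
    # Highest score first; ties are broken alphabetically for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Toy evidence sources (made up for illustration only).
evidence = {
    "ppi": {frozenset(("c1", "S1")), frozenset(("c2", "S2"))},
    "coexpression": {frozenset(("c1", "S2"))},
    "pathways": {frozenset(("c1", "S1")), frozenset(("c3", "c2"))},
}

ranking = rank_candidates(["c1", "c2", "c3"], ["S1", "S2"], evidence)
for gene, score in ranking:
    print(f"{gene}\t{score:.2f}")
```

Real tools typically weight evidence sources differently and normalize scores per source; the uniform weighting above only shows the shape of the input and output.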
Figure 2: Data representation models used by gene prioritization tools. A. Relational data structure. The first and third evidence sources provide relationships between genes, labeled G (seeds) or g (candidates), and diseases (d); the second source provides gene membership in pathways (p); and the last two evidence sources contain different kinds of interactions between genes. Vector representations of seed and candidate genes are shown on the left. The similarity between the coloring of gene g7 and that of the seed genes suggests that g7 is a promising candidate. B. Network data structure. Nodes depict genes and edges show relationships between genes. Seed genes are highlighted in red. Types of interactions and associations are shown on the right.
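The two data models in Figure 2 can be sketched in a few lines each. The caption does not say which similarity or propagation measure a given tool uses, so both choices below are assumptions: Jaccard similarity against the pooled seed features for the relational (vector) model, and a random walk with restart for the network model. All identifiers and the toy data are hypothetical; the walk assumes every seed gene appears in the network.

```python
def jaccard(a, b):
    """Jaccard similarity of two feature collections (0.0 if both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def profile_scores(features, seeds, candidates):
    """Model A: compare each candidate's features with the pooled seed features."""
    seed_profile = set().union(*(features[s] for s in seeds))
    return {g: jaccard(features[g], seed_profile) for g in candidates}


def random_walk_with_restart(edges, seeds, restart=0.3, iterations=50):
    """Model B: spread probability mass from seed genes over an undirected network."""
    neighbours = {}
    for u, v in edges:
        neighbours.setdefault(u, set()).add(v)
        neighbours.setdefault(v, set()).add(u)
    # The walk starts (and restarts) uniformly at the seed genes.
    start = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in neighbours}
    prob = dict(start)
    for _ in range(iterations):
        nxt = {n: restart * start[n] for n in neighbours}
        for node, p in prob.items():
            # Each node passes its remaining mass equally to its neighbours.
            share = (1.0 - restart) * p / len(neighbours[node])
            for m in neighbours[node]:
                nxt[m] += share
        prob = nxt
    return prob


# Model A toy data: features are disease (d) and pathway (p) memberships.
features = {
    "G1": {"d1", "p1"},
    "G2": {"d1", "p2"},
    "g7": {"d1", "p1", "p2"},
    "g8": {"d2"},
}
vector_scores = profile_scores(features, seeds=["G1", "G2"], candidates=["g7", "g8"])

# Model B toy data: g7 neighbours both seeds, g8 and g9 are further away.
edges = [("G1", "g7"), ("G2", "g7"), ("g7", "g8"), ("g8", "g9")]
network_scores = random_walk_with_restart(edges, seeds={"G1", "G2"})
```

In both toy examples `g7` scores highest among the candidates, mirroring the caption's observation that a gene resembling or neighbouring the seeds is a promising candidate.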
Figure 3: The process of selecting gene prioritization tools for detailed comparison.
The characterization of selected gene prioritization tools.
| strategy | approach type | interfaces | input | ||||||
|---|---|---|---|---|---|---|---|---|---|
| integrate gene-disease associations | search for genes associated with seeds | score aggregation | network analysis | web interface | programmatic access | seed genes | disease or phenotype terms | candidate genes | |
| + | + | + | + | no | yes | whole genome | |||
| + | + | + | no | yes | whole genome | ||||
| + | + | + | no | yes | optional | ||||
| + | + | + | + | no | yes | optional | |||
| + | + | + | no | yes | whole genome | ||||
| + | + | + | + | + | no | yes | optional | ||
| + | + | + | + | no | yes | whole genome | |||
| + | + | + | yes | no | yes | ||||
| + | + | + | + | yes | no | whole genome | |||
| + | + | + | + | yes | no | yes | |||
| + | + | + | yes | no | whole genome | ||||
| + | + | + | + | yes | no | yes | |||
| + | + | + | yes | no | yes | ||||
| + | + | + | yes | no | yes | ||||
Types of evidence sources used by each of the 14 gene prioritization tools.
| Gene Interactions | Gene Similarities | Gene-Disease associations | |||||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Physical PPI | Pathways | Genetic interactions | Regulation | Interologs | Co-expression | Co-localization | Functional annotations | Phenotype similarity | Shared domains | Sequence similarity | Phylogenetic profile similarity | Chemical interaction | Text Mining | Genetic associations | Differential expression | Animal models | Human phenotype similarity | Chemical information | Pathways | Text mining | |