Marina Esteban-Medina, María Peña-Chilet, Carlos Loucera, Joaquín Dopazo.
Abstract
BACKGROUND: Despite the abundance of genomic data, predictive models that describe phenotypes as a function of gene expression or mutations are difficult to obtain because they are affected by the curse of dimensionality: candidate genes far outnumber the available samples. The problem is especially acute where samples are hard to obtain, as is the case for rare diseases.
Keywords: Big data; Fanconi anemia; Genomics; Machine learning; Mathematical models; Signaling pathways
Year: 2019 PMID: 31266445 PMCID: PMC6604281 DOI: 10.1186/s12859-019-2969-0
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Fig. 1 Fanconi anemia curated map, based on the KEGG FA pathway. There are two protein complexes: RPA, composed of RPA1, RPA2, RPA3 and RPA4, and Core, composed of FANCM, FANCG, FANCL, FAAP100, FANCA, FANCB, UBE2T, STRA13, FANCC, FAAP24, HES1, FANCE, FANCF, BLM, RMI1, RMI2 and TOP3A. Next to each effector node, whose name is also used to name the corresponding circuit, is a description of the main functions triggered by that signaling circuit
Fig. 2 Schema of the procedure followed for the analysis
New genes and connections discovered that allow the expansion of the FA pathway. The first two columns correspond to the two interactor proteins, the third column refers to the type of interaction and the last column shows the supporting bibliographic evidence. Genes MAD2L2, RFWD3 and XRCC2 (in bold) did not appear in the original FA KEGG pathway and were added to the new curated FA pathway
| NODE 1 | NODE 2 | INTERACTION | REF. |
|---|---|---|---|
| | | binding | |
| | | binding/association | |
| | | activation | |
| | | binding/association | |
| | | activation | |
| | | binding/association | |
| | | activation | |
| | | binding/association | |
| | | binding/association | |
| | | binding/association | |
| | | binding/association | |
| | | binding/association | |
Differential circuit activity in a comparison of healthy versus FA bone marrow cells. Circuits are named after their effector nodes (see Fig. 1)
| CIRCUIT | ACTIVATION | STATISTIC | P-VALUE | FDR ADJ. P-VALUE |
|---|---|---|---|---|
| RAD51 | UP | 0.615 | 0.558 | 0.659 |
| MLH1-PMS2 | UP | 2.400 | 0.016 | 0.067 |
| REV3L | DOWN | −3.789 | 3.917 × 10⁻⁵ | 5.092 × 10⁻⁴ |
| RAD51C | DOWN | −1.924 | 0.056 | 0.162 |
| RPA* | UP | 3.412 | 6.923 × 10⁻⁴ | 4.500 × 10⁻³ |
| FANCM-STRA-FAAP24 | UP | 1.885 | 0.062 | 0.162 |
Fig. 3 Observed distribution of circuit activities in the comparison between healthy and FA bone marrow cells
Fig. 4 Observed distribution of circuit activities in blood, a tissue affected by the disease; in two tissues with a high rate of cell replication (skin and gastrointestinal), where DNA repair is expected to play a relevant role; and in a tissue with a low rate of cell replication (brain)
Fig. 5 Distributions of the cross-validated relevance values for the top 50 most relevant genes, ordered by their mean. Above a relevance value of 0.006, the relevance rendered by the ML procedure and the means obtained from the cross-validation are consistent. This value is therefore taken as the threshold
Fig. 6 Distribution of the R² score for each signaling circuit of the FA pathway across all the training/test splits. The R² score ranges from −∞ to 1, where 0 corresponds to a model that always predicts the mean for each task and 1 to a perfect model
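The interpretation of the R² score given in the Fig. 6 caption can be checked with a minimal sketch. It assumes the standard coefficient-of-determination definition, R² = 1 − SS_res / SS_tot. The function name `r2_score` and the activity values are hypothetical and are not taken from the study.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot (standard definition)."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R2 is undefined when the observed values are constant")
    return 1.0 - ss_res / ss_tot


# Hypothetical circuit activity values, for illustration only.
observed = [0.2, 0.4, 0.6, 0.8]
mean_value = sum(observed) / len(observed)

r2_perfect = r2_score(observed, observed)                   # predictions equal observations
r2_mean = r2_score(observed, [mean_value] * len(observed))  # always predicts the mean
r2_bad = r2_score(observed, list(reversed(observed)))       # worse than predicting the mean

print(r2_perfect, r2_mean, r2_bad)
```

A model that does worse than the constant mean predictor gets a negative score, which is why the range has no lower bound.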
List of the most relevant genes (relevance > 0.006) obtained by the model. Drug IDs in bold are approved for use according to the DrugBank database
| GENE NAME | SYMBOL | ENTREZ ID | RELEVANCE | TARGETING DRUGS (DrugBank ID) |
|---|---|---|---|---|
| NIMA related kinase 2 | | 4751 | 0.097324 | DB07180, |
| DNA topoisomerase II alpha | | 7153 | 0.078623 | |
| baculoviral IAP repeat containing 5 | | 332 | 0.052406 | |
| centromere protein E | | 1062 | 0.036961 | DB06097 |
| polo like kinase 1 | | 5347 | 0.036159 | DB06897, DB06963, DB07789 |
| cyclin dependent kinase 1 | | 983 | 0.022697 | DB05037, DB06195 |
| glutamate ionotropic receptor NMDA type subunit 1 | | 2902 | 0.019528 | DB01931, DB04620, DB05824, DB06741, |
| cholinergic receptor nicotinic beta 2 subunit | | 1141 | 0.013228 | DB05855 |
| synaptosome associated protein 25 | | 6616 | 0.012799 | |
| enhancer of zeste 2 polycomb repressive complex 2 subunit | | 2146 | 0.012543 | DB12887, DB14581 |
| methylenetetrahydrofolate dehydrogenase, cyclohydrolase and formyltetrahydrofolate synthetase 1 | | 4522 | 0.012111 | DB00116, DB02358, DB04322 |
| thymidylate synthetase | | 7298 | 0.009462 | |
| serpin family E member 1 | | 5054 | 0.009206 | DB05254 |
| cytochrome c oxidase subunit I | | 4512 | 0.008027 | |
| retinoic acid receptor alpha | | 5914 | 0.007607 | |
| sodium voltage-gated channel alpha subunit 2 | | 6326 | 0.006728 | DB13520 |
| kinesin family member 11 | | 3832 | 0.006366 | DB03996, DB04331, DB06040, DB07064, DB08032, DB08033, DB08037, DB08198, DB08239, DB08244, DB08246, DB08250 |
Fig. 7 Enrichment analysis with GO terms and rare diseases
Genes affected in Fanconi anemia according to the ORPHANET database (ORPHA:84)
| GENE NAME | SYMBOL | ENTREZ ID | ENSEMBL ID | OMIM |
|---|---|---|---|---|
| Fanconi Anemia complementation group F | | 2188 | ENSG00000183161 | 603467 |
| Fanconi Anemia complementation group C | | 2176 | ENSG00000158169 | 227645 |
| Breast cancer type 2 susceptibility protein | | 675 | ENSG00000139618 | 114480 |
| Breast cancer type 1 susceptibility protein | | 672 | ENSG00000012048 | 113705 |
| Fanconi Anemia complementation group E | | 2178 | ENSG00000112039 | 600901 |
| RAD51 recombinase | | 5888 | ENSG00000051180 | 114480 |
| Fanconi Anemia complementation group D2 | | 2177 | ENSG00000144554 | 227646 |
| Fanconi Anemia complementation group M | | 57697 | ENSG00000187790 | 609644 |
| DNA repair protein RAD51 homolog 3 | | 5889 | ENSG00000108384 | 602774 |
| Ubiquitin-conjugating enzyme E2 T | | 29089 | ENSG00000077152 | 610538 |
| Fanconi Anemia complementation group B | | 2187 | ENSG00000181544 | 300514 |
| Fanconi Anemia complementation group G | | 2189 | ENSG00000221829 | 602956 |
| Fanconi Anemia complementation group I | | 55215 | ENSG00000140525 | 609053 |
| Fanconi Anemia complementation group L | | 55120 | ENSG00000115392 | 608111 |
| partner and localizer of BRCA2 | | 79728 | ENSG00000083093 | 114480 |
| SLX4 structure-specific endonuclease subunit | | 84464 | ENSG00000188827 | 613278 |
| Ring finger and WD repeat domain 3 | | 55159 | ENSG00000168411 | 614151 |
| BRCA1 interacting protein C-terminal helicase 1 | | 83990 | ENSG00000136492 | 114480 |
| ERCC excision repair 4, endonuclease catalytic subunit | | 2072 | ENSG00000175595 | 133520 |
| Mitotic arrest deficient 2 like 2 | | 10459 | ENSG00000116670 | 604094 |
| X-ray repair cross complementing 2 | | 7516 | ENSG00000196584 | 600375 |
| Fanconi Anemia complementation group A | | 2175 | ENSG00000187741 | 227650 |