Dmitriy Shin (1), Gerald Arthur (2), Mihail Popescu (3), Dmitry Korkin (4), Chi-Ren Shyu (5)
1. University of Missouri, School of Medicine, Department of Pathology and Anatomical Sciences, Columbia, MO 65212, United States; University of Missouri, Graduate School, MU Informatics Institute, Columbia, MO 65211, United States. Electronic address: shindm@health.missouri.edu.
2. University of Missouri, School of Medicine, Department of Pathology and Anatomical Sciences, Columbia, MO 65212, United States; University of Missouri, Graduate School, MU Informatics Institute, Columbia, MO 65211, United States.
3. University of Missouri, School of Medicine, Department of Health Management and Informatics, Columbia, MO 65212, United States; University of Missouri, Graduate School, MU Informatics Institute, Columbia, MO 65211, United States; University of Missouri, College of Engineering, Department of Computer Science, Columbia, MO 65211, United States.
4. Worcester Polytechnic Institute, Department of Computer Science, Department of Biology and Biotechnology, Department of Applied Math, Worcester, MA 01609, United States.
5. University of Missouri, Graduate School, MU Informatics Institute, Columbia, MO 65211, United States; University of Missouri, College of Engineering, Department of Electrical and Computer Engineering, Columbia, MO 65211, United States.
Abstract
OBJECTIVES: We developed Resource Description Framework (RDF)-induced InfluGrams (RIIG), an informatics formalism for uncovering complex relationships among biomarker proteins and biological pathways using biomedical knowledge bases. We demonstrate an application of RIIG in morphoproteomics, a theranostic technique that comprehensively analyzes protein circuitries to design effective therapeutic strategies in a personalized medicine setting.
METHODS: RIIG uses an RDF "mashup" knowledge base that integrates publicly available pathway and protein data with ontologies. To mine for RDF-induced Influence Links, RIIG introduces the notions of RDF relevancy and RDF collider, which mimic conditional independence and the "explaining away" mechanism in probabilistic systems. Using these notions and constraint-based structure learning algorithms, the formalism generates morphoproteomic diagrams, which we call InfluGrams, for further analysis by experts.
RESULTS: RIIG recovered up to 90% of predefined influence links in a simulated environment using synthetic data and outperformed a naïve Monte Carlo sampling of random links. In clinical cases of Acute Lymphoblastic Leukemia (ALL) and Mesenchymal Chondrosarcoma, a significant level of concordance was observed between the RIIG-generated and expert-built morphoproteomic diagrams. In a clinical case of Squamous Cell Carcinoma, RIIG allowed selection of alternative therapeutic targets, the validity of which was supported by a systematic literature review. We also illustrated the ability of RIIG to discover novel influence links in the general case of ALL.
CONCLUSIONS: Applications of the RIIG formalism demonstrated its potential to uncover patient-specific complex relationships among biological entities and thereby find effective drug targets in a personalized medicine setting. We conclude that RIIG provides an effective means not only to streamline morphoproteomic studies but also, more generally, to bridge curated biomedical knowledge and causal reasoning with clinical data.