Cartik Kothari, Maxime Wack, Claire Hassen-Khodja, Sean Finan, Guergana Savova, Megan O'Boyle, Geraldine Bliss, Andria Cornell, Elizabeth J Horn, Rebecca Davis, Jacquelyn Jacobs, Isaac Kohane, Paul Avillach.
Abstract
The heterogeneity of patient phenotype data is an impediment to research into the origins and progression of neuropsychiatric disorders. This difficulty is compounded in the case of rare disorders such as Phelan-McDermid Syndrome (PMS) by the paucity of patient clinical data. PMS is a rare syndromic genetic cause of autism and intellectual deficiency. In this paper, we describe the Phelan-McDermid Syndrome Data Network (PMS_DN), a platform that facilitates research into phenotype-genotype correlation and progression of PMS by: a) integrating knowledge of patient phenotypes extracted from Patient Reported Outcomes (PRO) data and clinical notes (two heterogeneous, underutilized sources of knowledge about patient phenotypes) with curated genetic information from the same patient cohort and b) making this integrated knowledge, along with a suite of statistical tools, available free of charge to authorized investigators on a Web portal, https://pmsdn.hms.harvard.edu. PMS_DN is a Patient Centric Outcomes Research Initiative (PCORI) in which patients and their families are involved in all aspects of the management of patient data in driving research into PMS. To foster collaborative research, PMS_DN also makes patient aggregates from this knowledge available to authorized investigators using distributed research networks such as the PCORnet PopMedNet. PMS_DN is hosted on a scalable cloud-based environment and complies with all patient data privacy regulations. As of October 31, 2016, PMS_DN integrates high-quality knowledge extracted from the clinical notes of 112 patients and curated genetic reports of 176 patients with preprocessed PRO data from 415 patients.
Keywords: clinical notes; knowledge extraction; knowledge integration; neuropsychiatric disorders; patient reported outcomes; rare
Year: 2017 PMID: 28862395 PMCID: PMC5832521 DOI: 10.1002/ajmg.b.32579
Source DB: PubMed Journal: Am J Med Genet B Neuropsychiatr Genet ISSN: 1552-4841 Impact factor: 3.568
Figure 1. PMS_DN uses the Apache cTAKES NLP engine to extract occurrences of UMLS concepts in the clinical notes of PMS patients. The UMLS concepts are mapped to 20 different terminologies including ICD-9, ICD-10, SNOMED, MeSH, and NDFRT. The i2b2/tranSMART user interface allows for easy browsing, starting with broad biomedical concepts and drilling down to find specific patients and data of interest. The interface also displays the counts of patients (PC) and distinct terms (DTC) associated with each concept at all levels of the hierarchy [Color figure can be viewed at http://wileyonlinelibrary.com]
Figure 2. Hypothesis testing on the i2b2/tranSMART interface of PMS_DN. In STEP 1, the user drags and drops the “hypotonia” concept and the “Yes” and “No” values for this concept into the two different subset boxes. Then the user clicks the “Generate Summary Statistics” button. In STEP 2, the user drags and drops the “AGE IN YEARS” concept into the Summary section to test the hypothesis that hypotonia is correlated with the age of the patient. The RESULT shows that no significant correlation can be found.
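The two-subset comparison described in Figure 2 can be illustrated outside the interface. The following is a minimal sketch, assuming a Welch two-sample t statistic is an acceptable stand-in; the source does not state which test the i2b2/tranSMART summary statistics actually run. The function name `welch_t` and the age values are hypothetical and invented for illustration; they are not PMS_DN data.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Return (t, df) for Welch's unequal-variance two-sample t-test."""
    na, nb = len(a), len(b)
    # Squared standard error of each group mean (sample variance / n).
    va, vb = variance(a) / na, variance(b) / nb
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df


# Synthetic ages in years, invented for illustration only (not PMS_DN data).
ages_hypotonia_yes = [3, 5, 7, 9, 12, 15]
ages_hypotonia_no = [4, 6, 8, 10, 11, 14]

t, df = welch_t(ages_hypotonia_yes, ages_hypotonia_no)
print(f"t = {t:.3f}, df = {df:.1f}")
```

A t statistic close to zero relative to its degrees of freedom would be consistent with the "no significant correlation" outcome the caption reports, although a p-value from the t distribution is needed for a formal conclusion.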
Figure 3. PMS_DN users with advanced access privileges obtained from the PMS Foundation (following appropriate IRB clearances) can view the raw PMS patient data on PMS_DN and perform basic sorting operations on it (a), and can also export it (b).
Figure 4. The pop-up validation window allows clinical experts to cross-check the extracted instance of the “Pes Cavus” concept from the Human Phenotype Ontology (“BEFORE Validation” screenshot) against the raw text from which it was extracted (“Pop-up Validation Window” screenshots). Clicking on the grey icon next to the “Pes Cavus” concept brings up the pop-up validation window, where the user can see the raw sentences from which the concept was extracted. Verification by the expert (by deselecting the checkbox against the raw sentence for patient 2) results in the “Pes cavus” concept being displayed in a green font (“AFTER Validation” screenshot), indicating to future users that it has been verified by clinical experts. Note the change in the Patient Count value: from 2 in the “BEFORE Validation” screenshot to 1 in the “AFTER Validation” screenshot. This indicates that the knowledge base is updated immediately with the expert's input on the validation window.
Figure 5. The novel validation tool that can be used by clinical experts to cross-check the identified concepts against the sentences from which they were extracted. a) Apache cTAKES extracts instances of UMLS concepts from the raw text of clinical notes. b) The output of cTAKES is loaded into the PMS_DN database. c) An expert uses the validation tool to verify the extracted UMLS concept against the raw text source. d) The expert verifies that the extraction of the UMLS concept is valid (or otherwise). e) The input of the experts is used to update the knowledge in the PMS_DN database immediately.
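The validation loop in Figures 4 and 5 can be sketched as a small in-memory model. This is only an illustration under stated assumptions: the `Extraction` record, the `patient_count` and `review` helpers, and the placeholder sentences are all hypothetical and do not reflect the actual PMS_DN schema or the cTAKES output format. It shows how rejecting one patient's extraction lowers the concept's patient count from 2 to 1, as in Figure 4.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Extraction:
    """One concept mention extracted from a clinical note (hypothetical shape)."""
    patient_id: int
    concept: str
    sentence: str
    valid: Optional[bool] = None  # None means not yet reviewed by an expert


def patient_count(extractions, concept):
    """Count distinct patients with at least one non-rejected extraction."""
    return len({e.patient_id for e in extractions
                if e.concept == concept and e.valid is not False})


def review(extractions, patient_id, concept, is_valid):
    """Record an expert's verdict on every matching extraction."""
    for e in extractions:
        if e.patient_id == patient_id and e.concept == concept:
            e.valid = is_valid


# Placeholder sentences, not real clinical text.
store = [
    Extraction(1, "Pes cavus", "placeholder sentence A"),
    Extraction(2, "Pes cavus", "placeholder sentence B"),
]

before = patient_count(store, "Pes cavus")
review(store, 1, "Pes cavus", True)    # expert confirms patient 1
review(store, 2, "Pes cavus", False)   # expert rejects patient 2
after = patient_count(store, "Pes cavus")
print(before, after)
```

Because the count is derived from the stored verdicts on every call, an expert's input is reflected as soon as it is recorded, which mirrors the immediate update described in step e).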
Figure 6. Cloud-hosted architecture and data flow of PMS_DN.
Figure 7. As of Oct 31, 2016, PMS_DN integrates the knowledge extracted from the clinical notes of 112 patients with the curated genetic reports of 176 patients and the Patient Reported Outcomes data obtained from 415 patients. PMS_DN contains all three datasets (PRO, clinical notes, and curated genetic reports) for 70 patients.