David J Rotenberg, Qing Chang, Natalia Potapova, Andy Wang, Marcia Hon, Marcos Sanches, Nikola Bogetic, Nathan Frias, Tommy Liu, Brendan Behan, Rachad El-Badrawi, Stephen C Strother, Susan G Evans, Jordan Mikkelsen, Tom Gee, Fan Dong, Stephen R Arnott, Shuai Laing, Moyez Dharsee, Anthony L Vaccarino, Mojib Javadi, Kenneth R Evans, Damian Jankowicz.
Abstract
Investigations of mental illness have been enriched by the advent and maturation of neuroimaging technologies and by the rapid pace and increased affordability of molecular sequencing techniques. However, the increased volume, variety and velocity of research data present considerable technical and analytic challenges for curation, federation and interpretation. Aggregation of high-dimensional datasets across brain disorders can increase sample sizes and may help identify underlying causes of brain dysfunction, but additional barriers remain to effective data harmonization and integration for their combined use in research. To help realize the potential of multi-modal data integration for the study of mental illness, the Centre for Addiction and Mental Health (CAMH) constructed a centralized data capture, visualization and analytics environment, the CAMH Neuroinformatics Platform, based on the Ontario Brain Institute (OBI) Brain-CODE architecture. The platform is intended to support the curation of a standardized, consolidated, psychiatric hospital-wide research dataset that is directly coupled to high performance computing resources.
Keywords: LabKey; XNAT; collaborative brain science; medical informatics; neuroinformatics
Year: 2018 PMID: 30459587 PMCID: PMC6232622 DOI: 10.3389/fninf.2018.00077
Source DB: PubMed Journal: Front Neuroinform ISSN: 1662-5196 Impact factor: 4.081
Figure 1. Overview of the Centre for Addiction and Mental Health (CAMH) Neuroinformatics Platform. Data sources include XNAT (imaging), LabKey (molecular), REDCap (electronic case report forms, eCRFs) and electronic medical record (eMR) datasets (CERNER; electronic health records, eHRs), which are federated into a central DB2 database. Federated datasets are available to compute resources (compute and Hadoop clusters) and are accessible through dashboards and software notebooks in the Neuroinformatics Portal.
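The federation step described in Figure 1 can be pictured as a join on a shared research identifier. The following is a minimal sketch only, not the platform's implementation: the `research_id` key, the field names and the in-memory lists standing in for XNAT, LabKey and REDCap exports are all assumptions made for illustration, and no real XNAT, LabKey, REDCap or DB2 API is called.

```python
from collections import defaultdict


def federate(sources):
    """Merge per-source records into one row per research ID.

    `sources` maps a source name to a list of dict records, each carrying a
    hypothetical `research_id` key. Field names are prefixed with the source
    name so that columns from different systems cannot collide.
    """
    merged = defaultdict(dict)
    for source_name, records in sources.items():
        for record in records:
            rid = record["research_id"]
            for field, value in record.items():
                if field == "research_id":
                    continue
                merged[rid][f"{source_name}.{field}"] = value
    return dict(merged)


# Hypothetical exports; real sources would be queried, not hard-coded.
sources = {
    "xnat": [{"research_id": "R001", "modality": "fMRI"}],
    "labkey": [{"research_id": "R001", "assay": "genotyping"}],
    "redcap": [{"research_id": "R002", "form": "baseline"}],
}
federated = federate(sources)
```

In this sketch a participant present in several sources ends up with one combined row, while a participant present in only one source keeps a sparse row. If a source holds several records for the same ID, later records overwrite earlier ones, so a production design would need an explicit rule for repeated measures.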
Figure 2. CAMH Neuroinformatics Portal landing page (left) and Dashboard view for a multi-modal dataset (right). The filter function for data queries is illustrated in the Dashboard view.
Figure 3. Example “global” longitudinal quality assurance and quality control (QA/QC) dashboard for functional MRI (fMRI) data.
Figure 4. High-level schematic overview of data flow from the electronic medical health record (eMHR) system (CERNER) to the Neuroinformatics Platform database: (1) Electronic medical health care data are collected as part of clinical care and from clinical trials/translational clinical research; (2) Extract, Transform and Load (ETL) scripts extract data from the eMHR system to a curated intermediary database; (3) The NI extraction scripts are run, pulling only the agreed-upon variables and anonymous Research IDs. These data, including an up-to-date schema, are transferred to a secure location; (4) Anonymization scripts (sdcMicro; Templ et al., 2015) are run to determine whether the new extract fulfills the anonymization criteria. If not, data flow ceases and the data are triaged. The extract is revised until the thresholds are met; (5) Once the anonymization thresholds are met, data are transferred to the DB2 database, incorporating updated schemas; (6) Access to these data is provided securely to research teams with prior research ethics approval only.
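The gate in step (4) of Figure 4 can be illustrated with a simple k-anonymity check. The source states that sdcMicro (an R package) is used and does not specify the criteria or thresholds, so the sketch below swaps in a plain Python k-anonymity test as a stand-in. The quasi-identifier columns, the sample rows and the value of `k` are hypothetical.

```python
from collections import Counter


def smallest_group_size(rows, quasi_identifiers):
    """Return the size of the smallest group of rows that share the same
    combination of quasi-identifier values (0 for an empty extract)."""
    groups = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return min(groups.values()) if groups else 0


def anonymization_gate(rows, quasi_identifiers, k):
    """Return True only if every quasi-identifier combination occurs at
    least k times; otherwise the extract should be held back and triaged."""
    return smallest_group_size(rows, quasi_identifiers) >= k


# Hypothetical extract: agreed-upon variables plus anonymous research IDs.
extract = [
    {"research_id": "R001", "age_band": "30-39", "sex": "F", "diagnosis": "MDD"},
    {"research_id": "R002", "age_band": "30-39", "sex": "F", "diagnosis": "MDD"},
    {"research_id": "R003", "age_band": "40-49", "sex": "M", "diagnosis": "SCZ"},
]

# The third row is unique on (age_band, sex), so the gate fails for k=2.
passes = anonymization_gate(extract, ["age_band", "sex"], k=2)
```

Here the extract would be stopped because one participant is unique on the chosen quasi-identifiers. Revising the extract, for example by coarsening age bands or suppressing rows, and re-running the gate mirrors the "revise until thresholds are met" loop in the caption.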
Figure 5. Example clinical data cohort explorer dashboard, with visualizations of diagnosis, age, gender and average encounter (filterable by diagnosis). QC data and full table views are also available.
Figure 6. Illustration of the CAMH Compute Cluster architecture.
Figure 7. Overview of the Neuroinformatics Platform architecture, which leverages high performance storage system replication and virtual machines to support high availability, redundancy and robust failover.
Summary table of data currently stored in the Centre for Addiction and Mental Health (CAMH) Neuroinformatics Platform.

(A) Neuroinformatics Platform data summaries: number of primary records stored in each database (XNAT, REDCap, LabKey) and from clinical records.

| Primary database | Number of participants |
|---|---|
| XNAT (Medical Imaging) | 2,878 |
| REDCap (Assessments) | 13,514 |
| LabKey (Molecular) | 15,385 |
| eMHR (Clinical) | 330,000 |
| Total | 361,777 |

(B) Summary of neuroimaging data types currently stored in XNAT.

| Data type | Count |
|---|---|
| DTI | 2,277 |
| EEG | 1,837 |
| T1 | 2,600 |
| T2 | 4,322 |
| fMRI | 22,108 |
| Total | 33,144 |