Xi Shi1,2, Gorana Nikolic1, Scott Fischaber3, Michaela Black4, Debbie Rankin4, Gorka Epelde5,6, Andoni Beristain5,6, Roberto Alvarez5,6, Monica Arrue5,6, Joao Pita Costa7,8, Marko Grobelnik7,8, Luka Stopar7,8, Juha Pajula9, Adil Umer9, Peter Poliwoda10, Jonathan Wallace11, Paul Carlin12, Jarmo Pääkkönen13, Bart De Moor1.
Abstract
Background: Healthcare data is a rich yet underutilized resource due to its disconnected, heterogeneous nature. A means of connecting healthcare data and integrating it with additional open and social data in a secure way can support the monumental challenge policy-makers face in safely accessing all relevant data to assist in managing the health and wellbeing of all. The goal of this study was to develop a novel health data platform within the MIDAS (Meaningful Integration of Data Analytics and Services) project that harnesses the potential of latent healthcare data in combination with open and social data to support evidence-based health policy decision-making in a privacy-preserving manner.
Keywords: data visualization; decision support system; epidemiology; machine learning; public health
Year: 2022 PMID: 35433572 PMCID: PMC9008448 DOI: 10.3389/fpubh.2022.838438
Source DB: PubMed Journal: Front Public Health ISSN: 2296-2565
Figure 1. MIDAS Platform Overview. The platform consists of software hosted within a Policy Site Network for analyzing local data and applications hosted externally (External Network) for analyzing open and social data, as indicated by the colored boxes. The light-gray boxes indicate end user-facing web applications, connected to back-end applications and analytics platforms shown in the dark-gray boxes. The white boxes are the data sources, either internal (from the different pilots) or external (open and social data). Arrows between software components indicate direct interactions.
Research topics for all pilots.
| Pilot | Research topic | Data |
|---|---|---|
| Basque | To understand what drives childhood obesity and its etiology | Controlled and open data |
| Finland | To understand mental health issues of young people with the support of visual analytics and analysis of available diverse datasets | Controlled and open data |
| Northern Ireland | To analyze anonymized data extracted from the children's care system to provide new insights into a child's journey through the care system | Controlled and open data |
| Republic of Ireland | To study the cohort of persons with diabetes and determine the best distribution for diabetes services | Controlled and open data |
Figure 2. An overview of the system configured for MIDAS. The core data platform for the MIDAS stack was based on HDFS, Hive, and Spark. Data can be imported into the system through the filesystem, HDFS, or externally to Hive. HDFS was used to store files and raw data, and Hive was employed as a data warehouse for the structured data after processing. External data assets were also virtualized through the Hive interface, so they could be accessed by the MIDAS tools in the same way as locally loaded data assets and used within GYDRA. The UI of the analytics platform includes Jupyter Notebook with Python and PySpark for developing and testing the underlying analytics models before they were implemented within the MIDAS Analytics Backend. For managing and querying the databases in Hive and PostgreSQL, the open-source interactive editor Hue was used, and Zeppelin provided support for running Spark applications. User access was managed by a local LDAP server, which provided role-based access to the user applications and underlying data stores. [2021] IEEE. Reprinted, with permission, from (31).
Figure 3. GYDRA data preparation tool UI, provided through easy-to-use tab-based navigation that follows common data assessment and preparation steps (i.e., General Stats, Features, Missing Values, Correlation, and Outliers analysis tasks). Screenshots for two representative sections are included. (A) General stats: on the left side, general statistics on features and observations are provided; on the right side, the distribution of variable types is shown in a pie chart. The lower part includes a transformations pipeline, to which dataset transformations can be added as the need for them is identified. (B) Missing values: in the top left area, the missing value proportion is depicted for values, features, and observations; on the right side, indicators for complete and completely empty features and observations are provided. In the lower part, each feature is analyzed separately in bar charts representing its missing value percentage. Reprinted by permission from Springer Nature Customer Service Centre GmbH: Springer Nature, Business Information System Workshops, Chapter Enhancing the Interactive Visualization of a Data Preparation Tool from in-Memory Fitting to Big Data Sets by (14).
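The missing-value indicators described in panel (B) can be illustrated with a minimal sketch. This is not GYDRA's implementation: the function name `missing_value_summary`, the record layout (a list of dictionaries with `None` marking a missing value), and the sample data are all assumptions made for illustration, and the reading of "proportion for features and observations" as "share with at least one missing value" is an interpretation of the caption.

```python
def missing_value_summary(rows, features):
    """Summarize missing values (None) in a list of dict records.

    Hypothetical helper for illustration only; not part of GYDRA.
    """
    if not rows or not features:
        raise ValueError("need at least one row and one feature")
    n_obs = len(rows)

    # Share of missing values per feature (what a per-feature bar chart would show).
    per_feature = {
        f: sum(1 for r in rows if r.get(f) is None) / n_obs for f in features
    }

    # Overall share of missing cells across the whole table.
    missing_cells = sum(1 for r in rows for f in features if r.get(f) is None)
    values_missing = missing_cells / (n_obs * len(features))

    # Share of features and observations with at least one missing value.
    features_missing = sum(1 for p in per_feature.values() if p > 0) / len(features)
    obs_missing = sum(
        1 for r in rows if any(r.get(f) is None for f in features)
    ) / n_obs

    return {
        "per_feature": per_feature,
        "values_missing": values_missing,
        "features_with_missing": features_missing,
        "observations_with_missing": obs_missing,
        "complete_features": [f for f, p in per_feature.items() if p == 0],
        "empty_features": [f for f, p in per_feature.items() if p == 1],
    }


# Made-up sample records; None marks a missing value.
sample_rows = [
    {"age": 7, "bmi": 16.2, "school": None, "sex": "F"},
    {"age": None, "bmi": 18.0, "school": None, "sex": "M"},
    {"age": 9, "bmi": 17.5, "school": None, "sex": "F"},
    {"age": 8, "bmi": None, "school": None, "sex": "M"},
]
summary = missing_value_summary(sample_rows, ["age", "bmi", "school", "sex"])
print(summary)
```

In this toy data set, `school` is completely empty and `sex` is complete, which corresponds to the kind of "complete" and "completely empty" indicators the caption mentions.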
Figure 4. Cross-filter of the Finnish Pilot. Users can select a county or region in the map on the right, and all the charts update automatically. Gender, Region, and Age Group are the categorical variables shown in the lower panel next to the map; users can select subgroups either by pressing the buttons on the panel or by selecting the subgroups in the line chart or bar chart. The Finnish cross-filter consists of a matrix heatmap, a bar chart, a line chart, and the regional map of Finland. The cross-filters of other pilots have different components.
Pilot-specific analytics.
| Pilot | Analytics | Purpose |
|---|---|---|
| Basque | RandomForests/LASSO | To identify the risk factors of childhood obesity |
| Finland | Lexis diagram analysis | To aggregate, summarize and visualize the selected risk factors in a secure way to protect the privacy of patients |
| Finland | Descriptive analysis | To intuitively evaluate the health, social, and education status of inhabitants at the regional level on a yearly basis by using open data |
| Northern Ireland | Markov chain | To track patterns of behavior over time and to visualize intuitively how children move in and out of different types of care, by estimating the probability of transition between different types of care |
| Northern Ireland | LSTM Network | To predict the future status of children in order to improve child protection at the policy level |
| Republic of Ireland | ARIMA | To forecast the consumption of diabetic drugs |
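As an illustration of the Markov chain entry in the table above, the transition probabilities between care types can be estimated by counting observed moves between consecutive states. The following is a minimal sketch under stated assumptions, not the project's code: the function name `estimate_transition_matrix`, the care-type labels, and the journeys are hypothetical and made up for this example.

```python
from collections import Counter, defaultdict


def estimate_transition_matrix(sequences):
    """Estimate first-order Markov transition probabilities by counting.

    Each sequence is an ordered list of states; the estimate for
    P(next | current) is the relative frequency of that move.
    """
    counts = defaultdict(Counter)
    for seq in sequences:
        # Pair each state with its successor in the same sequence.
        for current, nxt in zip(seq, seq[1:]):
            counts[current][nxt] += 1

    matrix = {}
    for state, next_counts in counts.items():
        total = sum(next_counts.values())
        matrix[state] = {nxt: n / total for nxt, n in next_counts.items()}
    return matrix


# Hypothetical care journeys (labels and data invented for illustration).
journeys = [
    ["foster", "foster", "residential", "home"],
    ["foster", "home"],
    ["residential", "foster", "foster"],
]
transitions = estimate_transition_matrix(journeys)
for state, row in transitions.items():
    print(state, row)
```

Each row of the resulting dictionary sums to one, so it can be read directly as "given the current type of care, how likely is each next type". States that are never left in the data (here `home`) have no outgoing row.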
Figure 5. MIDAS UI screenshots. (A) Common view of a generated dashboard. At the top of the view are the menus to manage dashboards, add analytics, and use the external resources. The reporting tool is located in the middle (hidden in the figure), and the rest of the view is open space for widgets. Users can freely resize and organize widgets in this space. The analytic results come from the MIDAS Analytics Backend, together with the widgets developed externally. (B) Reporting tool open with a single research question and an answer with two associated widgets. This tool is used to generate a PDF file, defining the research questions and answers and attaching the most suitable figures describing them from the available widgets. [2021] IEEE. Reprinted, with permission, from (31).
Figure 6. Open and Social Analytics. (A) The Social Campaign Manager shows a high-level overview of the (i) sentiment and (ii) emotional analysis of the policy being studied through a Twitter social media campaign. The dashboard presents the sentiment and emotions found in responses to a public online survey that reaches out to the public to gather their voice on a specific health policy under consideration. Clicking into the dashboard provides further insight, including responses to particular questions and results processed using Natural Language Processing techniques that show the most common topics of conversation mentioned in the responses and the sentiment with which they were expressed. (B) MEDLINE custom widget that includes: (i) a list of the top ten MEDLINE articles, with the first part of the abstract serving as a short description; (ii) a tag cloud representing clusters of topics extracted from the MEDLINE articles, including the searched keywords; and (iii) a target-shaped pointer that the user can move through the tag cloud to change the ranking of the listed articles. (C) News custom widget that consists of: (i) a word cloud that represents the main topics of the listed news, giving a global perspective of the key topics before further activity; (ii) a list of news titles and first lines that are linked to the original news source; and (iii) the search choices on which the news list is based, defined by the filter and search options in the “Media Monitoring” menu of the external news dashboard.