Johanna Jean Loomba, Glenn S Wasson, Ravi Kiran Reddy Chamakuri, Pabitra Kumar Dash, Stephen G Patterson, Mary M A Potter, Jason Edward Krisch, Martha M Tenzer, Karen C Johnston, Don E Brown.
Abstract
OBJECTIVE: The integrated Translational Health Research Institute of Virginia (iTHRIV) aims to develop an information architecture to support data workflows throughout the research lifecycle for cross-state teams of translational researchers.
Keywords: computer systems; data commons; health research software; information storage and retrieval; patient-generated health data
Year: 2022 PMID: 34850002 PMCID: PMC8922196 DOI: 10.1093/jamia/ocab262
Source DB: PubMed Journal: J Am Med Inform Assoc ISSN: 1067-5027 Impact factor: 4.497
Figure 1. The dashboard of the iTHRIV Research Concierge Portal includes access to informational resources and events, concierge consultation services, and the iTHRIV Research Data Commons.
Figure 2. This diagram illustrates governance and oversight of the various object types in the iTHRIV Commons. Note that governance is largely managed at each local institution. Thus, the system provides the overall framework for governance while remaining agnostic to variations at the local level.
This table delineates ownership, governance, and oversight for each object type in the iTHRIV Commons
| Object type | Object ownership | Object governance and oversight |
|---|---|---|
| | | Any […] Final content requires approval by local […] |
| | | As metadata is stored locally at each institution, local […] |
| | | Note that permissions are stored locally and each institution’s […] |
| | | Although the research team manages the documents they upload, the system requires that certain documents be present in order to unlock features that require additional governance (as described in other rows of this table). |
| | | A single project may use data sets that are stewarded by individuals within or outside the project team. The […] |
| | | A Data Set Admin must first be added to a project by a Project Owner, but once added they independently control access to their respective data sets. They can assign access to a subset of the project team. |
| | | Data file uploads and downloads are done by data set admins and collaborators, with change history preserved for audit and prior versions remaining available to the study team. |
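
The data set access rule in the table above (a Project Owner adds a Data Set Admin, who then independently grants access to a subset of the project team) can be illustrated with a minimal sketch. This is not the iTHRIV Permissions Service implementation; the class names, method names, and user names below are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class DataSet:
    name: str
    admin: str  # the Data Set Admin who stewards this data set
    collaborators: set = field(default_factory=set)  # users granted access


@dataclass
class Project:
    owner: str
    team: set = field(default_factory=set)
    data_set_admins: set = field(default_factory=set)
    data_sets: dict = field(default_factory=dict)

    def add_data_set_admin(self, actor: str, admin: str) -> None:
        # Only the Project Owner may add a Data Set Admin to the project.
        if actor != self.owner:
            raise PermissionError("only the Project Owner can add a Data Set Admin")
        self.data_set_admins.add(admin)

    def create_data_set(self, actor: str, name: str) -> DataSet:
        # A user must already be a Data Set Admin on the project.
        if actor not in self.data_set_admins:
            raise PermissionError("actor is not a Data Set Admin on this project")
        data_set = DataSet(name=name, admin=actor)
        self.data_sets[name] = data_set
        return data_set

    def grant_access(self, actor: str, data_set_name: str, user: str) -> None:
        data_set = self.data_sets[data_set_name]
        # Once added, the Data Set Admin alone controls access to the data set.
        if actor != data_set.admin:
            raise PermissionError("only the Data Set Admin can grant access")
        # Access is limited to members of the project team.
        if user not in self.team:
            raise ValueError("user is not on the project team")
        data_set.collaborators.add(user)

    def can_read(self, user: str, data_set_name: str) -> bool:
        data_set = self.data_sets[data_set_name]
        return user == data_set.admin or user in data_set.collaborators
```

In this sketch the Project Owner gains no implicit access to a data set, which mirrors the statement that Data Set Admins control access independently; the Data Set Admin also need not be a member of the project team.
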
Figure 3. The iTHRIV Research Concierge Portal is a complex full-stack, cloud-hosted web application shared by all partner sites, with a limited view and a subset of services provided to the public. The component relationships are illustrated here.
The iTHRIV Research Concierge Portal full-stack components
| iTHRIV research concierge portal system components | |
|---|---|
| AWS instance | An image of Center for Internet Security (CIS) Ubuntu Linux 18.04 in AWS hosts the iTHRIV Research Concierge Portal. |
| iTHRIV web client host | Provides the client with code to run in the browser, dynamically rendering page content when API calls to the various services are executed. |
| iTHRIV web service | A RESTful API service that receives requests from code running on the client and returns content from the Portal backend. |
| iTHRIV portal commons backend service | Middleware (Python Flask application) that manages all metadata interactions between the client and the iTHRIV Commons Landing Service APIs hosted at the respective iTHRIV institutions. [Available to authenticated users only.] |
| PostgreSQL database | Resource and Event object metadata are stored in this database. |
| Elasticsearch | Supports indexed search of objects stored in the PostgreSQL database. |
| Amazon S3 bucket | Provides storage for files (including audio and video files) that are attached to Resource and Event objects in the Portal. |
| Jira Service Management | Licensed service management software that the iTHRIV Web Service integrates with via backend APIs. User consult requests in the Portal generate a ticket in one of more than 20 different iTHRIV Jira projects, depending on the user’s home institution and request type. Service teams manage the tickets through the Jira software, but users can access and track the status of their tickets at any time in the iTHRIV Portal interface. [Available to authenticated users only.] |
| Calendar service | This plug-in supports event features in the Portal, allowing the user to select dates and add events to their personal calendars. |
| Email service | An email relay service handles the sending of system emails, such as requests for resource page approval by iTHRIV admins. [Available to authenticated users only.] |
| Google Analytics | Provides iTHRIV with web application metrics. |
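
The Jira row above says that each consult request lands in one of more than 20 Jira projects depending on the user's home institution and request type. A minimal sketch of that kind of routing follows, assuming a simple lookup table. The institution names, request types, and project keys are hypothetical placeholders, and no Jira API is called; the source does not describe how the Portal implements this step.

```python
# Hypothetical routing table: (home institution, request type) -> Jira project key.
# None of these values come from the iTHRIV configuration.
ROUTES = {
    ("institution_a", "biostatistics"): "ABIOSTAT",
    ("institution_a", "informatics"): "AINFO",
    ("institution_b", "biostatistics"): "BBIOSTAT",
}


def route_consult_request(home_institution: str, request_type: str) -> str:
    """Return the Jira project key that should receive the consult ticket."""
    # Normalize inputs so that casing and stray whitespace do not affect routing.
    key = (home_institution.strip().lower(), request_type.strip().lower())
    try:
        return ROUTES[key]
    except KeyError:
        # Fail loudly rather than filing the ticket in an arbitrary project.
        raise LookupError(f"no Jira project configured for {key}") from None
```

Keeping the mapping in data rather than in branching code would let a new institution or request type be added without touching the routing function.
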
Figure 4. This diagram is a simple representation of the system interactions between an iTHRIV researcher’s computer, the iTHRIV Research Concierge Portal Web Application (collapsed here but with details available in Figure 3), ancillary authentication services, and the iTHRIV Research Data Commons Landing Service at a single site. Hardware and operating system support are provided by each participating institution, including firewalls, file storage, and servers that support file storage and the Landing Service software components.
The components of a local instance of an iTHRIV Research Data Commons Landing Service, including services and software deployed by the iTHRIV development team and existing local systems that are integrated into the product
| iTHRIV research data commons: system components at participating institutions | |
|---|---|
| Firewalls | Institutional firewalls are maintained according to local standards. An F5 appliance sits between the external (AWS and client browser) components and internal system components (institutional iTHRIV Commons Landing Service and local research system assets), and provides an additional overlay of Application Security Services. For example, F5 ensures that Landing Service API calls can only be made from the Portal or from IP addresses associated with approved IP ranges from iTHRIV partner sites. |
| iTHRIV Commons Landing Service APIs | Three custom Python Flask web applications, deployed by the central iTHRIV development team and installed behind each local firewall: (1) Commons Backend Metadata Service, which provides APIs for metadata CRUD; (2) Commons Backend Data Service, which provides APIs for data file upload, download, and delete operations; (3) Commons Backend Permissions Service, which provides APIs for managing user access permissions on projects and data sets. |
| MinIO software | Object store for storing and retrieving data files uploaded by users, as well as for storing a copy of project and data set metadata. Metadata is saved here to make it possible to rebuild the Elasticsearch metadata index and to support the metadata versioning feature. |
| Elasticsearch software | Provides a service for managing project and data set metadata and facilitates search. |
| Local IRB databases | The Landing Service APIs make calls to the local IRB Database APIs to retrieve protocols and study team information, supporting alignment between Commons permissions and IRB approvals for use of HSD. This integration ensures that the […] |
| Local research databases | REDCap integration allows researchers to request routine exports from their existing research databases to the Commons, where they control permissions for further sharing of their data. Each new export request is reviewed by local REDCap teams for regulatory compliance before they provide the Landing Service with the token for the scheduled routine extract. |
| Local storage | Data set metadata can point to other ancillary storage systems, allowing for files to remain in their current location while being indexed and managed in the Commons. |
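
The firewall row above states that Landing Service API calls are accepted only from the Portal or from approved IP ranges at iTHRIV partner sites. The check itself can be sketched with Python's standard `ipaddress` module. This is an illustration of the rule only, not F5 configuration; the addresses below are reserved documentation ranges (RFC 5737) used as hypothetical stand-ins, and the names `PORTAL_ADDRESSES` and `APPROVED_PARTNER_RANGES` are assumptions.

```python
import ipaddress

# Hypothetical values: RFC 5737 documentation addresses stand in for real ones.
PORTAL_ADDRESSES = {ipaddress.ip_address("192.0.2.10")}
APPROVED_PARTNER_RANGES = [
    ipaddress.ip_network("198.51.100.0/24"),
    ipaddress.ip_network("203.0.113.0/25"),
]


def is_allowed(source_ip: str) -> bool:
    """Return True if the caller is the Portal or inside an approved range."""
    addr = ipaddress.ip_address(source_ip)
    if addr in PORTAL_ADDRESSES:
        return True
    # Membership test: is the address contained in any approved network?
    return any(addr in network for network in APPROVED_PARTNER_RANGES)
```

For example, `is_allowed("203.0.113.5")` is true because the address falls inside the `/25` range, while `is_allowed("203.0.113.200")` is false because that range ends at `203.0.113.127`.
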
Figure 5. This diagram is a simple representation of the federated iTHRIV Research Data Commons, with an iTHRIV Research Data Commons Landing Service installed at each site. The current iTHRIV Partner institutions are listed here, but other institutions may join the federation in the future under appropriate product licensure. To see the full system components involved at each participating site labeled in this diagram, refer to Figure 3. To see the full iTHRIV Research Concierge Portal Stack components, refer to Figure 2.
This table provides current counts of resource content in the Portal by “type,” a system-defined set of 10 tags that are more generic than the 95 subcategories but that provide another method for refining a search.
| Resource type | Current count |
|---|---|
| Education | 396 |
| Other research resource | 158 |
| Informatics/analytics | 148 |
| Funding resource | 89 |
| Regulatory and compliance | 88 |
| Center or initiative | 82 |
| Administration | 59 |
| Research cores and labs | 58 |
| Learning shot | 44 |
| Health system | 22 |
Note: Each resource is assigned only one type with “Other Research Resource” being used for content that does not belong to the other specific types.
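
Because each resource carries exactly one type, the counts in the table above can be summed to give the number of typed resources and each type's share. The short sketch below does that arithmetic; the counts are copied from the table, and the variable names are illustrative.

```python
# Counts copied from the resource type table above.
counts = {
    "Education": 396,
    "Other research resource": 158,
    "Informatics/analytics": 148,
    "Funding resource": 89,
    "Regulatory and compliance": 88,
    "Center or initiative": 82,
    "Administration": 59,
    "Research cores and labs": 58,
    "Learning shot": 44,
    "Health system": 22,
}

# One type per resource, so the sum is the total number of typed resources.
total = sum(counts.values())

# Percentage share of each type, rounded to one decimal place.
shares = {name: round(100 * n / total, 1) for name, n in counts.items()}

print(total)                # 1144
print(shares["Education"])  # 34.6
```
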
Figure 6. A simple diagram representing planned public access to the iTHRIV Research Data Commons. Note that no authentication is required.