Lars Christoph Gleim, Md Rezaul Karim, Lukas Zimmermann, Oliver Kohlbacher, Holger Stenzhorn, Stefan Decker, Oya Beyan.
Abstract
BACKGROUND: Sharing sensitive data across organizational boundaries is often significantly limited by legal and ethical restrictions. Regulations such as the EU General Data Protection Regulation (GDPR) impose strict requirements concerning the protection of personal and privacy-sensitive data. Therefore, new approaches, such as the Personal Health Train initiative, are emerging to use data directly in their original repositories, circumventing the need to transfer data.
Keywords: Data access; Distributed systems; FAIR data; Linked data; Personal health train; Privacy; Query design; RDF; SPARQL; Schema extraction; Semantic web
Year: 2020 PMID: 32641124 PMCID: PMC7341611 DOI: 10.1186/s13326-020-00223-z
Source DB: PubMed Journal: J Biomed Semantics
Fig. 1. Illustration of several effects of entailment support of a SPARQL endpoint
Fig. 2. Directly instantiated schema extracted from Example 1
Fig. 3. Locally inferred relevant schema extracted from Example 1
Fig. 4. An example of a visual SPARQL query builder tool interacting with the schema introspection endpoint to enable assisted query design
Fig. 5. Workflow of the proposed architecture
HCLS core statistics [32] of evaluated datasets, vocabularies and extracted schemata
| Number of unique | triples | typed entities | subjects | properties | objects | classes | literals |
|---|---|---|---|---|---|---|---|
| Vocabulary Corpus | 833834 | 129827 | 171168 | 1209 | 145498 | 1469 | 180680 |
| Dataset | 150000 | 20000 | 20000 | 20 | 10003 | 3 | 70717 |
| schema.org (1) | 8427 | 1617 | 1619 | 15 | 476 | 31 | 3193 |
| foaf (2) | 631 | 84 | 86 | 15 | 38 | 9 | 154 |
| merge of (1), (2) | 9058 | 1701 | 1705 | 23 | 508 | 38 | 3335 |
| directly instantiated | 47 | 23 | 23 | 3 | 5 | 2 | 0 |
| locally inferred | 576 | 95 | 95 | 13 | 71 | 9 | 118 |
| LOV inferred | 2345 | 208 | 208 | 87 | 379 | 16 | 850 |
| Dataset | 11609 | 1123 | 1123 | 24 | 1232 | 13 | 5158 |
| GenDR Vocabulary | 192 | 20 | 20 | 8 | 6 | 5 | 116 |
| directly instantiated | 361 | 37 | 37 | 10 | 16 | 7 | 105 |
| locally inferred | 380 | 37 | 37 | 10 | 16 | 7 | 124 |
| LOV inferred | 911 | 71 | 71 | 58 | 127 | 12 | 370 |
| Dataset | 377947 | 28871 | 28871 | 38 | 42891 | 29 | 144773 |
| Orphanet Vocabulary | 402 | 40 | 40 | 9 | 7 | 5 | 239 |
| directly instantiated | 799 | 67 | 67 | 12 | 41 | 7 | 217 |
| locally inferred | 840 | 68 | 68 | 12 | 41 | 7 | 256 |
| LOV inferred | 1380 | 102 | 102 | 59 | 153 | 12 | 506 |
| Dataset | 7189742 | 869981 | 869981 | 14 | 1420471 | 10 | 2865019 |
| Homologene Vocabulary | 62 | 7 | 7 | 8 | 6 | 5 | 38 |
| directly instantiated | 184 | 24 | 24 | 10 | 13 | 7 | 40 |
| locally inferred | 190 | 24 | 24 | 10 | 13 | 7 | 46 |
| LOV inferred | 721 | 58 | 58 | 58 | 124 | 12 | 292 |