Vincent Ferretti, Isabel Fortier, Dany Doiron, Paul Burton, Yannick Marcon, Amadou Gaye, Bruce H R Wolffenbuttel, Markus Perola, Ronald P Stolk, Luisa Foco, Cosetta Minelli, Melanie Waldenberger, Rolf Holle, Kirsti Kvaløy, Hans L Hillege, Anne-Marie Tassé.
Abstract
BACKGROUND: Individual-level data pooling of large population-based studies across research centres in international research projects faces many hurdles. The BioSHaRE (Biobank Standardisation and Harmonisation for Research Excellence in the European Union) project aims to address these issues by building a collaborative group of investigators and developing tools for data harmonization, database integration and federated data analyses.
Year: 2013 PMID: 24257327 PMCID: PMC4175511 DOI: 10.1186/1742-7622-10-12
Source DB: PubMed Journal: Emerg Themes Epidemiol ISSN: 1742-7622
The Healthy Obese Project data harmonization and database federation step-by-step process
| Step | Description |
| Study recruitment and documentation | Studies are recruited to participate in the HOP and their key characteristics (e.g. design, sampling frame) are catalogued on the BioSHaRE website (www.bioshare.eu). |
| Harmonized variable selection and definition | A set of ‘target’ variables required to answer obesity-related research questions is identified at workshops bringing together BioSHaRE investigators. |
| Study variable identification and harmonization potential assessment | By analysing participating studies’ questionnaires, standard operating procedures, and data dictionaries, the potential for each study to generate this set of target variables is determined. Study-specific variables required to generate target variables are identified. |
| Data processing | Secure servers are set up in each study’s host institution and the subsets of data required to generate target variables are loaded onto each of these servers. Processing algorithms transforming study data into the target (i.e. harmonized) format are developed and implemented for each study whenever harmonization is deemed possible. |
| Harmonized data federation, dissemination and analysis | A password-protected web portal federates the servers located in the different study host institutions across Europe and allows remote retrieval of data summaries, descriptive statistics (frequencies, min, max, mean, standard deviation), and contingency tables. For more complex federated data analyses (e.g. linear regressions), the DataSHIELD method [ |
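The federation step above can be illustrated with a minimal sketch in which each study server returns only aggregate values and the portal combines them into pooled descriptive statistics. This is not the BioSHaRE software or the DataSHIELD method; the names `SiteSummary`, `summarize` and `pool`, and the example values, are hypothetical and chosen only for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class SiteSummary:
    """Aggregates a study server might return (hypothetical structure)."""
    n: int
    total: float
    total_sq: float
    minimum: float
    maximum: float


def summarize(values):
    """Runs locally at a study: reduce individual-level values to aggregates."""
    return SiteSummary(
        n=len(values),
        total=sum(values),
        total_sq=sum(v * v for v in values),
        minimum=min(values),
        maximum=max(values),
    )


def pool(summaries):
    """Runs at the portal: combine per-study aggregates into pooled statistics."""
    n = sum(s.n for s in summaries)
    total = sum(s.total for s in summaries)
    total_sq = sum(s.total_sq for s in summaries)
    mean = total / n
    # Sample variance from sums; clamp at 0 to guard against rounding error.
    variance = max((total_sq - n * mean * mean) / (n - 1), 0.0)
    return {
        "n": n,
        "min": min(s.minimum for s in summaries),
        "max": max(s.maximum for s in summaries),
        "mean": mean,
        "sd": math.sqrt(variance),
    }


# Illustrative values only; no real study data.
pooled = pool([summarize([1.0, 2.0, 3.0]), summarize([4.0, 5.0])])
print(pooled)
```

Only counts, sums, and extremes cross the network in this sketch, which is the general idea behind retrieving data summaries without moving individual-level records. A production system would also need safeguards such as minimum cell sizes, which are omitted here.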
Healthy Obese Project participating studies to date, number of participants, host institutions, and location
| Study | Acronym | Number of participants | Host institution | Location |
| Cooperative Health Research in South Tyrol Study | CHRIS | 1116 | European Academy of Bolzano | Bolzano, Italy |
| KORA Cooperative Health Research in the Region of Augsburg | KORA | 18 000 | Helmholtz Center Munich | Augsburg, Germany |
| LifeLines Cohort Study | LifeLines | 93 000 | University Medical Center Groningen | Groningen, The Netherlands |
| Microisolates in South Tyrol Study | MICROS | 1300 | European Academy of Bolzano | Bolzano, Italy |
| National Child Development Study | NCDS | 18 558 | University of Leicester | Leicester, United Kingdom |
| FINRISK 2007 Study | FINRISK 2007 | 10 000 | National Institute for Health and Welfare | Helsinki, Finland |
| Nord-Trøndelag Health Study | HUNT | 78 968 | Norwegian University of Science and Technology | Trondheim, Norway |
| Prevention of REnal and Vascular ENd-stage Disease study | PREVEND | 8592 | University Medical Center Groningen | Groningen, The Netherlands |
Figure 1. Example of data processing to obtain a common format: deriving the harmonized Fasting Glucose DataSchema variable for two studies.
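Since the figure itself is not reproduced here, the following is only a hedged sketch of what study-specific processing algorithms for a harmonized fasting glucose variable might look like. The two source formats are assumptions, not the formats of any HOP study: a hypothetical study A records glucose in mmol/L with a fasting flag, and a hypothetical study B records glucose in mg/dL with hours since the last meal. The 8-hour fasting threshold and the choice of mmol/L as target unit are likewise assumptions.

```python
from typing import Optional

# Glucose has a molar mass of about 180.16 g/mol, so 1 mmol/L = 18.016 mg/dL.
MGDL_PER_MMOLL = 18.016


def harmonize_study_a(glucose_mmol_l: Optional[float],
                      fasting: Optional[bool]) -> Optional[float]:
    """Hypothetical study A: value already in mmol/L, explicit fasting flag."""
    if glucose_mmol_l is None or fasting is not True:
        return None  # target variable is missing unless fasting is confirmed
    return round(glucose_mmol_l, 2)


def harmonize_study_b(glucose_mg_dl: Optional[float],
                      hours_since_meal: Optional[float],
                      min_fasting_hours: float = 8.0) -> Optional[float]:
    """Hypothetical study B: value in mg/dL, fasting inferred from meal timing."""
    if glucose_mg_dl is None or hours_since_meal is None:
        return None
    if hours_since_meal < min_fasting_hours:  # assumed threshold
        return None
    return round(glucose_mg_dl / MGDL_PER_MMOLL, 2)


# Illustrative calls with made-up values.
print(harmonize_study_a(5.4, True))
print(harmonize_study_b(90.08, 10.0))
print(harmonize_study_b(90.08, 2.0))
```

Each study keeps its own transformation, but both produce the same target variable in the same unit, which is what allows the harmonized values to be analysed together.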
Figure 2. Data harmonization and federated infrastructure for three HOP studies.