Michael Wolfson, Susan E Wallace, Nicholas Masca, Geoff Rowe, Nuala A Sheehan, Vincent Ferretti, Philippe LaFlamme, Martin D Tobin, John Macleod, Julian Little, Isabel Fortier, Bartha M Knoppers, Paul R Burton.
Abstract
BACKGROUND: Contemporary bioscience sometimes demands vast sample sizes and there is often then no choice but to synthesize data across several studies and to undertake an appropriate pooled analysis. This same need is also faced in health-services and socio-economic research. When a pooled analysis is required, analytic efficiency and flexibility are often best served by combining the individual-level data from all sources and analysing them as a single large data set. But ethico-legal constraints, including the wording of consent forms and privacy legislation, often prohibit or discourage the sharing of individual-level data, particularly across national or other jurisdictional boundaries. This leads to a fundamental conflict in competing public goods: individual-level analysis is desirable from a scientific perspective, but is prevented by ethico-legal considerations that are entirely valid.
Year: 2010 PMID: 20630989 PMCID: PMC2972441 DOI: 10.1093/ije/dyq111
Source DB: PubMed Journal: Int J Epidemiol ISSN: 0300-5771 Impact factor: 7.196
Figure 1. Schematic representation of structure of scientific problems that DataSHIELD is designed to address. (a) One file: all individual-level data pooled together in one large data file. (b) Partitioned: individual-level data held in six separate data files, one for each study.
Figure 2. Schematic representation of the structure of DataSHIELD. The computer controlling analysis (heavily shaded circle) is sited at the analysis centre (MP: master process). The data computers (lightly shaded circles) are each sited at one of the study centres involved in the collaborative analysis (SP: slave process). The arrows indicate the flow of analytic instructions and summary statistics. All potentially disclosive individual-level data are secured on the local data computers.
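
The flow described in the two figure captions (instructions go out, only summary statistics come back) can be illustrated with a minimal sketch. This is not the DataSHIELD implementation: it assumes ordinary least squares with a single covariate, for which a handful of per-study sums are sufficient statistics. The names `Summary`, `local_summary` and `pooled_fit`, and the toy data, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class Summary:
    """Aggregates returned by one data computer (no individual-level rows)."""
    n: int
    sum_x: float
    sum_y: float
    sum_xx: float
    sum_xy: float


def local_summary(x: Sequence[float], y: Sequence[float]) -> Summary:
    """Runs at a study centre; only the sums leave the local computer."""
    return Summary(
        n=len(x),
        sum_x=sum(x),
        sum_y=sum(y),
        sum_xx=sum(a * a for a in x),
        sum_xy=sum(a * b for a, b in zip(x, y)),
    )


def pooled_fit(summaries: List[Summary]) -> Tuple[float, float]:
    """Runs at the analysis centre; returns (intercept, slope) for y ~ x."""
    n = sum(s.n for s in summaries)
    sx = sum(s.sum_x for s in summaries)
    sy = sum(s.sum_y for s in summaries)
    sxx = sum(s.sum_xx for s in summaries)
    sxy = sum(s.sum_xy for s in summaries)
    slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    intercept = (sy - slope * sx) / n
    return intercept, slope


# Hypothetical toy data for two studies (not from the paper).
studies = [
    ([0.0, 1.0, 2.0], [1.0, 3.0, 5.0]),
    ([3.0, 4.0], [7.0, 9.0]),
]

# Partitioned analysis: each study contributes only its summary.
intercept, slope = pooled_fit([local_summary(x, y) for x, y in studies])

# "One file" analysis on the concatenated data, for comparison.
all_x = [v for x, _ in studies for v in x]
all_y = [v for _, y in studies for v in y]
one_file = pooled_fit([local_summary(all_x, all_y)])
```

Under these assumptions the partitioned fit and the one-file fit agree, because the sums are additive across studies. Models without closed-form sufficient statistics would need an iterative exchange of summaries, which this sketch does not attempt.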
Thereby producing a results matrix for each study: for example,
| | Estimate | Std. Error (b) |
|---|---|---|
| (Intercept) | 125.130 | 0.2629 |
| AGE | 0.203 | 0.0373 |
| SNP | 0.254 | 0.3907 |

(a) Here, the results shown are for simulated study 6.
(b) Standard error.

| | Estimate | SE | z-value | Pr(>\|z\|) |
|---|---|---|---|---|
| Coefficients: | | | | |
| (Intercept) | −0.32956 | 0.02838 | −11.612 | <2e-16 |
| BMI | 0.023 | 0.00621 | 3.703 | 0.000213 |
| BMI.456 | 0.04126 | 0.0114 | 3.62 | 0.000295 |
| SNP | 0.55173 | 0.03295 | 16.746 | <2e-16 |
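
The z-value and Pr(>|z|) columns are consistent with a Wald test, in which each estimate is divided by its standard error and compared with a standard normal distribution. The sketch below assumes that interpretation (the record itself does not state it) and reproduces the BMI row from the rounded values in the table; the helper name `wald_test` is hypothetical.

```python
import math
from typing import Tuple


def wald_test(estimate: float, se: float) -> Tuple[float, float]:
    """Return (z, two-sided p-value) assuming a standard normal reference."""
    z = estimate / se
    # Two-sided normal tail probability: P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


# BMI row of the table above: Estimate = 0.023, SE = 0.00621.
z_bmi, p_bmi = wald_test(0.023, 0.00621)
```

Because the inputs are the rounded values shown in the table, the recomputed figures can differ from the printed ones in the last digit.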