Birgit Schmidt, Birgit Gemeinholzer, Andrew Treloar.
Abstract
This paper presents the findings of the Belmont Forum's survey on Open Data, which targeted the global environmental research and data infrastructure community. It highlights users' perceptions of the term "open data", expectations of infrastructure functionalities, and barriers and enablers for the sharing of data. Respondents pointed out a wide range of good practice examples, which demonstrates a substantial uptake of data sharing through e-infrastructures and a further need for enhancement and consolidation. Among all policy responses, funder policies seem to be the most important motivator. This supports the conclusion that stronger mandates will strengthen the case for data sharing.
Year: 2016 PMID: 26771577 PMCID: PMC4714918 DOI: 10.1371/journal.pone.0146695
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Countries.
| Country | Frequency | Percentage |
|---|---|---|
| Germany | 205 | 16.4 |
| United States | 184 | 14.7 |
| Italy | 117 | 9.4 |
| United Kingdom | 88 | 7.1 |
| France | 68 | 5.4 |
| Australia | 45 | 3.6 |
| Spain | 43 | 3.4 |
| China | 39 | 3.1 |
| Netherlands | 34 | 2.7 |
| Canada | 32 | 2.6 |
| Norway | 29 | 2.3 |
| Switzerland | 29 | 2.3 |
| Belgium | 28 | 2.2 |
| Japan | 26 | 2.1 |
| Greece | 23 | 1.8 |
| India | 23 | 1.8 |
| Sweden | 23 | 1.8 |
| Austria | 18 | 1.4 |
| Finland | 15 | 1.2 |
| Russia | 14 | 1.1 |
| Brazil | 11 | 0.9 |
| Portugal | 10 | 0.8 |
| Other (less than 10 resp.) | 144 | 11.5 |
Table shows frequencies and valid percentages for each country (n = 1248).
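The valid percentages in this table are each frequency divided by the number of valid responses (n = 1248). A minimal Python sketch of that arithmetic, using three rows copied from the table; the function name `valid_percentages` is illustrative and does not come from the paper:

```python
def valid_percentages(freqs, n_valid):
    """Return each frequency as a percentage of valid responses, rounded to one decimal."""
    return {label: round(100 * count / n_valid, 1) for label, count in freqs.items()}

# Three rows copied from the country table; n_valid is the stated n = 1248.
countries = {"Germany": 205, "United States": 184, "Italy": 117}
shares = valid_percentages(countries, n_valid=1248)
print(shares)
```

Respondents who skipped the question are excluded from the denominator, which is why n differs slightly between the tables.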
Fig 1. Countries with more than 20 responses.
The majority of responses came from central Europe and the United States.
Age groups.
| Age | Frequency | Percentage |
|---|---|---|
| ≤20 | 1 | 0.1 |
| 21–25 | 22 | 1.8 |
| 26–30 | 142 | 11.4 |
| 31–35 | 232 | 18.6 |
| 36–40 | 208 | 16.7 |
| 41–45 | 168 | 13.5 |
| 46–50 | 152 | 12.2 |
| 51–55 | 116 | 9.3 |
| 56–60 | 101 | 8.1 |
| 61–65 | 53 | 4.3 |
| 66–70 | 32 | 2.6 |
| 71–75 | 9 | 0.7 |
| 76–80 | 7 | 0.6 |
| 81–85 | 3 | 0.2 |
| >90 | 1 | 0.1 |
Table shows frequencies and valid percentages for each age category (n = 1247).
Employment role.
| Employment | Frequency | Percentage |
|---|---|---|
| Academia | 878 | 70.1 |
| Government | 224 | 17.9 |
| Non profit | 70 | 5.6 |
| Other | 45 | 3.6 |
| Business | 32 | 2.6 |
| Media | 4 | 0.3 |
Table shows frequencies and valid percentages for each employment category (n = 1253).
Subject disciplines.
| Discipline | Frequency | Percentage |
|---|---|---|
| Earth sciences and environmental sciences | 846 | 68.7 |
| Climate and atmospheric sciences | 386 | 31.3 |
| Biological sciences | 258 | 20.9 |
| Physical sciences | 162 | 13.1 |
| Engineering | 88 | 7.1 |
| Computer sciences | 85 | 6.9 |
| Social sciences | 66 | 5.4 |
| Agricultural and veterinary sciences | 53 | 4.3 |
| Chemical sciences | 50 | 4.1 |
| Other discipline | 40 | 3.2 |
| Health sciences | 22 | 1.8 |
| Economics | 21 | 1.7 |
Table shows frequencies and valid percentages for each disciplinary category, multiple answers were allowed (n = 1232).
Overlap of disciplines in global environmental change research.
| | Phys | Chem | Earth | Bio | Agric. | Social | Comp | Clim | Health | Engin | Econ |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Physic. sc. | 162 | 23 | 115 | 17 | 6 | 6 | 17 | 86 | 4 | 22 | 3 |
| percentage | 100.0 | 14.2 | 71.0 | 10.5 | 3.7 | 3.7 | 10.5 | 53.1 | 2.5 | 13.6 | 1.9 |
| Chem. sc. | 23 | 50 | 40 | 11 | 4 | 4 | 8 | 22 | 1 | 3 | 4 |
| percentage | 46.0 | 100.0 | 80.0 | 22.0 | 8.0 | 8.0 | 16.0 | 44.0 | 2.0 | 6.0 | 8.0 |
| Earth sc. | 115 | 40 | 846 | 114 | 32 | 36 | 53 | 249 | 11 | 64 | 14 |
| percentage | 13.6 | 4.7 | 100.0 | 13.5 | 3.8 | 4.3 | 6.3 | 29.4 | 1.3 | 7.6 | 1.7 |
| Biol. sc. | 17 | 11 | 114 | 258 | 20 | 12 | 25 | 29 | 9 | 9 | 5 |
| percentage | 6.6 | 4.3 | 44.2 | 100.0 | 7.8 | 4.7 | 9.7 | 11.2 | 3.5 | 3.5 | 1.9 |
| Agric. sc. | 6 | 4 | 32 | 20 | 53 | 6 | 3 | 14 | 2 | 5 | 6 |
| percentage | 11.3 | 7.5 | 60.4 | 37.7 | 100.0 | 11.3 | 5.7 | 26.4 | 3.8 | 9.4 | 11.3 |
| Social sc. | 6 | 4 | 36 | 12 | 6 | 66 | 12 | 21 | 7 | 7 | 10 |
| percentage | 9.1 | 6.1 | 54.5 | 18.2 | 9.1 | 100.0 | 18.2 | 31.8 | 10.6 | 10.6 | 15.2 |
| Comp. sc. | 17 | 8 | 53 | 25 | 3 | 12 | 85 | 23 | 5 | 15 | 5 |
| percentage | 20.0 | 9.4 | 62.4 | 29.4 | 3.5 | 14.1 | 100.0 | 27.1 | 5.9 | 17.6 | 5.9 |
| Climate sc. | 86 | 22 | 249 | 29 | 14 | 21 | 23 | 386 | 9 | 26 | 11 |
| percentage | 22.3 | 5.7 | 64.5 | 7.5 | 3.6 | 5.4 | 6.0 | 100.0 | 2.3 | 6.7 | 2.8 |
| Health | 4 | 1 | 11 | 9 | 2 | 7 | 5 | 9 | 22 | 5 | 3 |
| percentage | 18.2 | 4.5 | 50.0 | 40.9 | 9.1 | 31.8 | 22.7 | 40.9 | 100.0 | 22.7 | 13.6 |
| Engin. | 22 | 3 | 64 | 9 | 5 | 7 | 15 | 26 | 5 | 88 | 5 |
| percentage | 25.0 | 3.4 | 72.7 | 10.2 | 5.7 | 8.0 | 17.0 | 29.5 | 5.7 | 100.0 | 5.7 |
| Econ. sc. | 3 | 4 | 14 | 5 | 6 | 10 | 5 | 11 | 3 | 5 | 21 |
| percentage | 14.3 | 19.0 | 66.7 | 23.8 | 28.6 | 47.6 | 23.8 | 52.4 | 14.3 | 23.8 | 100.0 |
Table shows frequencies and percentages for overlaps of disciplines, multiple answers were allowed (n = 1232).
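Each "percentage" row appears to express the overlap counts as a share of the row discipline's own total (the diagonal entry), so the counts are symmetric but the percentages are not. A short Python sketch of that reading, using a three-discipline excerpt of the table; the helper name `row_percentages` is an assumption made for illustration:

```python
# Counts copied from the overlap table: respondents who selected both disciplines.
overlap = {
    "Physical": {"Physical": 162, "Earth": 115, "Climate": 86},
    "Earth": {"Physical": 115, "Earth": 846, "Climate": 249},
    "Climate": {"Physical": 86, "Earth": 249, "Climate": 386},
}

def row_percentages(matrix):
    """Express each overlap as a share of the row discipline's total (the diagonal)."""
    result = {}
    for row, cells in matrix.items():
        base = cells[row]  # diagonal entry: all respondents in the row discipline
        result[row] = {col: round(100 * count / base, 1) for col, count in cells.items()}
    return result

pct = row_percentages(overlap)
print(pct["Physical"]["Earth"])  # share of physical scientists who also chose Earth sciences
print(pct["Earth"]["Physical"])  # share of Earth scientists who also chose physical sciences
```

The same 115 respondents are a large share of the physical sciences group but a small share of the much larger Earth sciences group.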
Fig 2. Perceived properties of open data.
The ability to assess the quality, to select based on metadata, and to easily access and (re)use the data were rated as most important (n = 944 to 973 responses).
Fig 3. Views on licenses for open data.
“Public Domain” and “Attribution” licenses were considered most useful for open data (n = 712 to 820 responses).
Fig 4. Expectations about functionalities of infrastructures.
Core expectations of users of data infrastructure were that attribution information is provided and that data is citable (n = 890 to 911 responses).
Fig 5. Importance of open data for disciplinary communities.
Four out of five respondents highlighted that open data is crucial for advancing research (n = 853 to 878 responses).
Fig 6. Motivators to publish data as open data.
The commitment to publish data as open data seems to be driven by research-intrinsic motives, combining general and personal motivations (n = 834 to 861 responses).
Motivators for data managers vs. all other data professionals.
| Motivator | n | low | interm. | high | p-value |
|---|---|---|---|---|---|
| Acceleration of scientific research and appl. | 235 | 2.15 | 22.32 | 75.54 | 0.5318 |
| Dissemination and recogn. of your work | 231 | 3.48 | 24.35 | 72.17 | 0.1487 |
| Personal commitment to open data | 232 | 5.65 | 24.35 | 70.00 | 0.0003*** |
| Requests from data users | 231 | 7.66 | 33.19 | 59.15 | 0.7284 |
| Funder policy | 230 | 15.22 | 32.17 | 52.61 | 0.9979 |
| Organizational institutional policy | 231 | 15.09 | 37.93 | 46.98 | 0.8039 |
| Scientific professional society policy | 230 | 13.85 | 39.83 | 46.32 | 0.8076 |
| Community norms | 233 | 12.55 | 45.45 | 41.99 | 0.3322 |
| Publisher policy | 230 | 20.78 | 47.19 | 32.03 | 0.4843 |
At a personal level, data managers who contributed to the survey were significantly more committed to open data than all other data professionals (p-values based on a two-sample Mann-Whitney-Wilcoxon test). Levels of significance: *: p<0.05, **: p<0.01, ***: p<0.001.
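As an illustration of the two-sample Mann-Whitney-Wilcoxon test named above, the following Python sketch computes the U statistic and a two-sided p-value from the normal approximation with tie correction. It is not the authors' analysis code: the ratings are hypothetical, the function name is an assumption, and no continuity correction is applied.

```python
import math
from collections import Counter

def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test (normal approximation, tie-corrected).

    Returns (U statistic for x, two-sided p-value).
    """
    n1, n2 = len(x), len(y)
    # U counts the pairs in which an x value exceeds a y value; ties count one half.
    u = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)
    n = n1 + n2
    tie_sizes = Counter(list(x) + list(y)).values()
    tie_term = sum(t**3 - t for t in tie_sizes) / (n * (n - 1))
    variance = n1 * n2 / 12 * ((n + 1) - tie_term)
    if variance == 0:  # every observation identical: no evidence of a difference
        return u, 1.0
    z = (u - n1 * n2 / 2) / math.sqrt(variance)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail of the standard normal
    return u, p

# Hypothetical ratings (1 = low, 2 = intermediate, 3 = high), NOT the survey data.
data_managers = [3, 3, 3, 2, 3, 3, 2, 3]
others = [2, 1, 3, 2, 2, 1, 3, 2]
u, p = mann_whitney_u(data_managers, others)
print(u, round(p, 4))
```

A rank-based test suits three-level ordinal ratings because it makes no assumption about the distance between "low", "intermediate" and "high".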
Fig 7. Barriers across countries.
The release of data was seen as a secondary step compared to publishing results (n = 825 to 854 responses).
Age groups (clustered).
| Age group | Frequency | Percentage |
|---|---|---|
| 20–35 | 397 | 31.8 |
| 36–50 | 528 | 42.3 |
| 51+ | 322 | 25.8 |
Table shows frequencies and valid percentages for each age group, derived from the original age groups (n = 1247).
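The clustered groups can be reproduced by summing the original age bins. A minimal Python sketch, assuming the youngest bin (20 or younger) is folded into the 20–35 group; the stand-in lower bounds for the two open-ended bins and the helper name `cluster_label` are assumptions made for illustration:

```python
# (label, lower bound of the bin, frequency) copied from the detailed age table.
# The open-ended bins use 20 and 91 as stand-in lower bounds (an assumption).
AGE_BINS = [
    ("<=20", 20, 1), ("21-25", 21, 22), ("26-30", 26, 142), ("31-35", 31, 232),
    ("36-40", 36, 208), ("41-45", 41, 168), ("46-50", 46, 152), ("51-55", 51, 116),
    ("56-60", 56, 101), ("61-65", 61, 53), ("66-70", 66, 32), ("71-75", 71, 9),
    ("76-80", 76, 7), ("81-85", 81, 3), (">90", 91, 1),
]

def cluster_label(lower_bound):
    """Map the lower bound of a five-year bin to its clustered age group."""
    if lower_bound <= 35:
        return "20-35"
    if lower_bound <= 50:
        return "36-50"
    return "51+"

clustered = {}
for _label, lower, freq in AGE_BINS:
    key = cluster_label(lower)
    clustered[key] = clustered.get(key, 0) + freq

total = sum(clustered.values())
shares = {group: round(100 * count / total, 1) for group, count in clustered.items()}
print(clustered, total, shares)
```

The summed frequencies and the total agree with the clustered table (n = 1247).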
Publishing before releasing data—by age groups.
| Age group | Minor Barrier | Barrier | Major Barrier | n |
|---|---|---|---|---|
| 20–35 | 12.20 (30) | 26.02 (64) | 61.79 (152) | 397 |
| 36–50 | 14.08 (49) | 34.48 (120) | 51.44 (179) | 528 |
| 51+ | 15.04 (34) | 36.28 (82) | 48.67 (110) | 322 |
Table shows percentages and frequencies for each age group.
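The percentages in this table appear to be computed over the respondents in each age group who rated this barrier, not over the group size n. The Python sketch below checks that reading against the frequencies in parentheses; the variable names are illustrative:

```python
# (minor barrier, barrier, major barrier) frequencies copied from the table.
ratings = {
    "20-35": (30, 64, 152),
    "36-50": (49, 120, 179),
    "51+": (34, 82, 110),
}

answered = {}
percentages = {}
for group, counts in ratings.items():
    answered[group] = sum(counts)  # respondents in the group who rated the barrier
    percentages[group] = tuple(round(100 * c / answered[group], 2) for c in counts)

print(answered)
print(percentages)
```

Under this reading, only part of each age group rated the barrier, so the row percentages sum to 100 over the respondents who answered.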
Fig 8. Desire to publish before releasing data by age groups.
The willingness to share data varies across age groups: the 31–35 year-olds expressed a significantly higher desire to publish results before releasing data. Because some categories have very few respondents, the plot does not display all age groups (ages ≤20 and >70 are not shown).
Fig 9. Discovery of data.
References in journal articles, web search engines and data repositories were identified as the most common discovery routes (n = 774 respondents selected at least one option).
Discovery routes.
| Discovery route | Frequency | Percentage |
|---|---|---|
| References in journal articles | 622 | 79.8 |
| Web search engines | 549 | 70.5 |
| Searching in specific data repositories | 492 | 63.2 |
| Direct requests to data providers | 314 | 40.3 |
| Newsletters or other publications | 201 | 25.8 |
| Government or institutional announcements | 164 | 21.1 |
| Directories or catalogues | 136 | 17.5 |
| Social media | 70 | 9.0 |
| Blogs | 63 | 8.1 |
| Other discovery | 36 | 4.6 |
Table shows frequencies and valid percentages for each discovery route, multiple answers were allowed (n = 779).
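Because multiple answers were allowed, the percentages are taken over respondents (n = 779) rather than over answers, so they sum to more than 100. A short Python sketch using the frequencies from the table; the derived average number of routes per respondent is an illustration and is not reported in the source:

```python
# Frequencies copied from the discovery-route table; n is the stated 779 respondents.
routes = {
    "References in journal articles": 622,
    "Web search engines": 549,
    "Searching in specific data repositories": 492,
    "Direct requests to data providers": 314,
    "Newsletters or other publications": 201,
    "Government or institutional announcements": 164,
    "Directories or catalogues": 136,
    "Social media": 70,
    "Blogs": 63,
    "Other discovery": 36,
}
n_respondents = 779

route_pct = {name: round(100 * count / n_respondents, 1) for name, count in routes.items()}
total_selections = sum(routes.values())
mean_routes = round(total_selections / n_respondents, 1)  # average routes ticked per respondent

print(route_pct["References in journal articles"], total_selections, mean_routes)
```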
Fig 10. Burden when accessing data.
Paying for data as well as varying data quality, standards and formats were considered least acceptable when accessing data (n = 687 to 731 responses).
Fig 11. Motivators by country.
Across all countries the acceleration of scientific research and applications, and dissemination and recognition of a researcher’s work were important reasons for sharing data. Funder policies as a motivator stood out in the UK and the U.S.
Motivators by Country.
| Item | Australia M | Australia SD | France M | France SD | Germany M | Germany SD | Italy M | Italy SD | UK M | UK SD | U.S. M | U.S. SD | ANOVA F; p |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| M1 | 2.59 | 0.56 | 2.59 | 0.58 | 2.47 | 0.61 | 2.56 | 0.61 | 2.30 | 0.68 | 2.53 | 0.60 | 1.63; 0.151 |
| M2 | 2.21 | 0.69 | 2.18 | 0.62 | 2.16 | 0.71 | 2.19 | 0.69 | 2.23 | 0.70 | 2.36 | 0.66 | 0.84; 0.519 |
| M3 | 2.35 | 0.73 | 2.19 | 0.73 | 2.18 | 0.76 | 2.26 | 0.67 | 2.36 | 0.67 | 2.41 | 0.70 | 0.97; 0.439 |
| M4 | 1.97 | 0.72 | 1.91 | 0.65 | 2.02 | 0.72 | 2.45 | 0.63 | 2.00 | 0.73 | 2.22 | 0.72 | 1.53; 0.178 |
| M5 | 2.21 | 0.81 | 2.38 | 0.66 | 2.25 | 0.74 | 2.14 | 0.67 | 2.64 | 0.60 | 2.60 | 0.61 | 3.99; 0.001 |
| M6 | 2.15 | 0.78 | 2.14 | 0.72 | 2.28 | 0.71 | 2.35 | 0.64 | 2.15 | 0.71 | 2.23 | 0.66 | 0.98; 0.429 |
| M7 | 2.53 | 0.75 | 2.69 | 0.56 | 2.60 | 0.60 | 2.76 | 0.48 | 2.63 | 0.63 | 2.61 | 0.58 | 1.06; 0.38 |
| M8 | 2.65 | 0.60 | 2.78 | 0.47 | 2.64 | 0.59 | 2.77 | 0.47 | 2.58 | 0.61 | 2.65 | 0.52 | 1.38; 0.232 |
| M9 | 2.32 | 0.64 | 2.50 | 0.63 | 2.51 | 0.68 | 2.41 | 0.67 | 2.63 | 0.58 | 2.65 | 0.60 | 3.72; 0.003 |
Table shows mean agreement and standard deviation for all countries with at least 45 responses. The p-values are based on univariate ANOVA tests within omnibus MANOVA (MANOVA: F(9, 466) = 1.81; p = 0.001***). Levels of significance: *: p<0.05, **: p<0.01, ***: p<0.001.
Tukey’s post-hoc analysis:
a: The UK differs significantly from Germany and Italy,
b: The U.S. differs significantly from Germany and Italy,
c: The U.S. differs significantly from Italy and Australia.
Motivators: M1: Requests from data users, M2: Community norms, M3: Organizational institutional policy, M4: Publisher policy, M5: Funder policy, M6: Scientific professional society policy, M7: Dissemination and recognition of your work/research, M8: Acceleration of scientific research and applications, M9: Personal commitment to open data.
Barriers by Country.
| Item | Australia M | Australia SD | France M | France SD | Germany M | Germany SD | Italy M | Italy SD | UK M | UK SD | U.S. M | U.S. SD | ANOVA F; p | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| B1 | 2.50 | 0.62 | 2.09 | 0.80 | 2.39 | 0.71 | 2.35 | 0.74 | 2.29 | 0.78 | 2.08 | 0.77 | 4.78; 0.0003 | |
| B2 | 2.18 | 0.80 | 2.20 | 0.78 | 2.00 | 0.76 | 2.13 | 0.78 | 1.98 | 0.73 | 2.06 | 0.70 | 0.56; 0.727 | |
| B3 | 2.09 | 0.90 | 2.02 | 0.81 | 2.01 | 0.79 | 2.24 | 0.73 | 2.00 | 0.80 | 1.81 | 0.75 | 3.06; 0.01 | |
| B4 | 1.85 | 0.67 | 1.93 | 0.78 | 2.13 | 0.84 | 2.34 | 0.72 | 1.97 | 0.76 | 1.86 | 0.81 | 4.4; 0.001 | |
| B5 | 1.91 | 0.87 | 2.00 | 0.76 | 1.96 | 0.80 | 2.23 | 0.73 | 2.09 | 0.77 | 2.08 | 0.76 | 2.13; 0.06 | |
| B6 | 2.03 | 0.76 | 2.19 | 0.74 | 2.16 | 0.74 | 2.25 | 0.75 | 2.15 | 0.78 | 2.11 | 0.77 | 0.4; 0.851 | |
| B7 | 1.82 | 0.72 | 1.88 | 0.66 | 2.09 | 0.73 | 2.17 | 0.74 | 1.91 | 0.72 | 1.82 | 0.65 | 2.96; 0.012 | |
| B8 | 1.82 | 0.76 | 1.60 | 0.63 | 1.85 | 0.70 | 2.00 | 0.69 | 1.83 | 0.72 | 1.70 | 0.72 | 2.2; 0.053 | |
| B9 | 1.68 | 0.73 | 1.77 | 0.74 | 1.83 | 0.72 | 2.09 | 0.73 | 1.83 | 0.79 | 1.78 | 0.77 | 1.97; 0.081 | |
| B10 | 2.06 | 0.78 | 2.23 | 0.81 | 2.52 | 0.68 | 2.29 | 0.75 | 2.45 | 0.64 | 2.38 | 0.72 | 3.28; 0.006 | |
Table shows mean agreement and standard deviation for all countries with at least 45 responses. The p-values are based on univariate ANOVA tests within omnibus MANOVA (MANOVA: F(10, 453) = 1.99; p = 5.011e-05***). Levels of significance: *: p<0.05, **: p<0.01, ***: p<0.001.
Tukey’s post-hoc analysis:
a: The U.S. differs significantly from Australia and Germany;
b: Italy differs significantly from the U.S.;
c: Italy differs significantly from the U.S. and Australia;
d: The U.S. differs significantly from Germany and Italy;
e: Australia differs significantly from Germany.
Barriers: B1: Legal constraints, B2: Organisational constraints, B3: Commercial use and exploitation, B4: Loss of control over intellectual property, B5: Misinterpretation or misuse, B6: Loss of credit or recognition, B7: Difficulty of clarifying rights (multiple inputs or authors), B8: Concerns about legal liability for data or release of data, B9: Concerns about impact of data release, B10: Desire to publish results before releasing data.
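The significance levels stated under the table can be applied mechanically to the printed p-values. The Python sketch below does so for the barrier items; the function name `significance_marker` is an assumption, and because the table prints rounded p-values, items sitting exactly on a threshold (B3 at 0.01, B4 at 0.001) may carry a different marker in the original article.

```python
def significance_marker(p):
    """Map a p-value to the markers defined under the table (strict inequalities)."""
    if p < 0.001:
        return "***"
    if p < 0.01:
        return "**"
    if p < 0.05:
        return "*"
    return ""

# Univariate ANOVA p-values as printed in the barriers table (rounded in the source).
barrier_p = {
    "B1": 0.0003, "B2": 0.727, "B3": 0.01, "B4": 0.001, "B5": 0.06,
    "B6": 0.851, "B7": 0.012, "B8": 0.053, "B9": 0.081, "B10": 0.006,
}
markers = {item: significance_marker(p) for item, p in barrier_p.items()}
significant = [item for item, mark in markers.items() if mark]
print(markers)
print(significant)
```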
Countries by continent.
| Continent | Frequency | Percentage |
|---|---|---|
| Europe | 797 | 63.9 |
| North America | 216 | 17.3 |
| Asia | 130 | 10.4 |
| Australia / New Zealand | 53 | 4.2 |
| South America | 38 | 3.0 |
| Africa | 14 | 1.1 |
Table shows frequencies and valid percentages for each regional category (n = 1248).