Chiara Gabella, Christine Durinx, Ron Appel.
Abstract
Millions of life scientists across the world rely on bioinformatics data resources for their research projects. Data resources can be very expensive, especially those with high added value, such as expert-curated knowledgebases. Despite the increasing need for such highly accurate and reliable sources of scientific information, most of them do not have secured funding for the near future and often depend on short-term grants that are much shorter than their planning horizon. Additionally, they are often evaluated as research projects rather than as research infrastructure components. In this work, twelve funding models for data resources are described and applied to the case study of the Universal Protein Resource (UniProt), a key resource for protein sequences and functional information. We show that most of the models are inconsistent with open access or equity policies, and that while some models cannot cover the total costs, they could potentially be used as a complementary income source. We propose the Infrastructure Model as a sustainable and equitable model for all core data resources in the life sciences. With this model, funding agencies would set aside a fixed percentage of their research grant volumes, which would subsequently be redistributed to core data resources according to well-defined selection criteria. This model, compatible with the principles of open science, is in agreement with several international initiatives such as the Human Frontiers Science Program Organisation (HFSPO) and the OECD Global Science Forum (GSF) project. Here, we have estimated that less than 1% of the total amount dedicated to research grants in the life sciences would be sufficient to cover the costs of the core data resources worldwide, including both knowledgebases and deposition databases.
Keywords: Bioinformatics; Data Resources; Funding; Knowledgebases; Long-Term Sustainability; Open Science
Year: 2017 PMID: 29333230 PMCID: PMC5747334 DOI: 10.12688/f1000research.12989.2
Source DB: PubMed Journal: F1000Res ISSN: 2046-1402
Figure 1. Funding models sources.
The 12 considered models are represented depending on the origin of the revenues.
Comparison of the 12 models in terms of open access, equity, stability and key dependency.
The aspects that favour open access, equity of users and stability over time are highlighted in bold.
| # | Name of the model | Compatible | Potential | Stability forecasted | Key dependency |
|---|---|---|---|---|---|
| 1 | | | | | National economic situation |
| 2 | | | | | Research spending by funding |
| 3 | | | | | Institutional funds availability |
| 4 | | | | Cyclic - grants | Infrastructure/research |
| 5 | | No | Low | Function of usage | Commercial partner |
| 6 | | No | Low | Function of usage | Usage |
| 7 | | Not | Low | Function of usage | Usage |
| 8 | | No | Low | Function of usage | Usage |
| 9 | | | | | Commercial partner |
| 10 | | | | Function of usage | Usage, commercial partners |
| 11 | | | | Highly dependent | Willingness to contribute |
| 12 | | | | | Partners |
Model 1, National funding.
Potential amounts from the top-10 UniProt user countries to sustain UniProt (orange columns) and the total core data resources (blue columns). Costs per country as a function of (1) usage, (2) Gross Domestic Product (GDP), (3) Net National Income (NNI) and (4) R&D domestic spending.
| # | Country | % of | UniProt: Tax based | UniProt: 0.00025 | UniProt: 0.00035 | UniProt: 0.014 ‰ | Total core data resources: Tax based | Total core data resources: 0.0024 ‰ | Total core data resources: 0.003 ‰ | Total core data resources: 0.13 ‰ |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | United States | 26.64 | 5,862 | 4,163 | 4,538 | 5,693 | 53,288 | 39,965 | 42,548 | 52,862 |
| 2 | China | 9.72 | 2,138 | 4,476 | 2,634 | 4,349 | 19,438 | 42,966 | 24,697 | 40,384 |
| 3 | United | 6.87 | 1,512 | 639 | 675 | 533 | 13,741 | 6,139 | 6,332 | 4,950 |
| 4 | Germany | 6.10 | 1,342 | 923 | 964 | 1,285 | 12,201 | 8,857 | 9,036 | 11,929 |
| 5 | India | 5.47 | 1,204 | 1,838 | 2,353 | 875 | 10,944 | 17,648 | 22,060 | 8,126 |
| 6 | Japan | 4.35 | 958 | 1,130 | 1,187 | 2,064 | 8,706 | 10,852 | 11,131 | 19,170 |
| 7 | France | 3.26 | 717 | 641 | 669 | 712 | 6,515 | 6,153 | 6,270 | 6,610 |
| 8 | Canada | 2.69 | 592 | 374 | 389 | 321 | 5,385 | 3,595 | 3,644 | 2,985 |
| 9 | Spain | 2.27 | 500 | 373 | 386 | 236 | 4,546 | 3,583 | 3,617 | 2,192 |
| 10 | Italy | 1.96 | 431 | 524 | 543 | 337 | 3,916 | 5,034 | 5,089 | 3,130 |
| … | … | | | | | | | | | |
| 14 | Switzerland | 1.49 | 328 | 120 | 122 | 162 | 2,986 | 1,149 | 1,142 | 1,500 |
| … | … | | | | | | | | | |
| | Total | 100 | €20 million | | | | €190 million | | | |
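The usage-based contributions in the table above can be read as a proportional split: a country's contribution is its share of usage multiplied by the total cost. The following Python sketch illustrates that arithmetic only. The function name `usage_based_share` is hypothetical, the amounts are assumed to be in thousands of euros, and the total is set to the rounded €20 million from the table, so the output is not expected to reproduce the tabulated values exactly.

```python
def usage_based_share(total_cost_keur: float,
                      usage_percent: dict[str, float]) -> dict[str, float]:
    """Split a total cost among countries in proportion to their usage share.

    total_cost_keur: total yearly cost, assumed here to be in thousand euros.
    usage_percent:   country -> share of usage, in percent.
    """
    return {country: total_cost_keur * pct / 100.0
            for country, pct in usage_percent.items()}


# Usage percentages taken from the table; the 20,000 k EUR total is the
# rounded figure from the "Total" row (an assumption for illustration).
usage = {"United States": 26.64, "China": 9.72, "Germany": 6.10}
shares = usage_based_share(20_000, usage)

for country, amount in shares.items():
    print(f"{country}: {amount:,.0f} k EUR")
```

Passing the total as a parameter keeps the sketch independent of which cost estimate is used; the same function applies to the total for all core data resources.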
Applicability of the models to the UniProt case study.
The table summarizes the income potential of each model and the complexity of its implementation. Refer to Section 5 for the calculations.
| # | Name of the model | Applicable to | Potential and condition for income for | Estimated | Pros (+) | Cons (-) |
|---|---|---|---|---|---|---|
| 1 | | Yes | 0.00025–0.00035‰ of the domestic | Several years | + Stable funding in the long term | - Requires negotiation with the |
| 2 | | Yes | ∼ 0.1% of total spending for life science | Months to years | + Stable funding in the long term | - Requires negotiation with the |
| 3 | | Current (SIB | 63% of the budget | Already existing | + Funding relatively stable | - Amounts insufficient to cover full |
| 4 | | Current (NIH | 32% of the budget | Already existing | + Open Access | - Amount insufficient to cover full |
| 5 | | Yes | Commercial licences for private | Months | + Stable funding in the long term | - Not Open Access |
| 6 | | Yes | Subscription fees for users of €20/month | Months | + Stable funding in the long term | - Not Open Access |
| 7 | | ? | ? | Months | + Potential for high income | - Has to be combined with another |
| 8 | | ? | ? | Months | + Potential for high income | - Has to be combined with another |
| 9 | | Yes | Consortium sharing the costs size allows | Months to years | + Potential for high income | - Requires negotiation with the |
| 10 | | Yes | €1.3 million (6.5% of the budget) | Months | + Open Access | - Has to be combined with another |
| 11 | | No | - | Months | + Open Access | - Has to be combined with another |
| 12 | | Yes | Voluntary donation of | Months | + Open Access | - Highly unpredictable |
Figure 2. The Infrastructure Model on the level of the funding agency.
On the left, the current model, in which databases compete cyclically for grants against research or resource projects. On the right, the Infrastructure Model, in which the funding agencies distribute research grants only to research projects. A percentage of each grant is retained and assigned to a budget for data stewardship, which is subsequently redistributed among the relevant infrastructures, including Data Management Plan providers, deposition databases and knowledgebases.
Figure 3. Distribution of the UniProt cost among the 5 funding agencies with the 3 variations of the Infrastructure Model.
Case (i) is the classic model, in which the cost is covered by 0.1% of the life science budget of each agency. In Case (ii), the five funding agencies are classified according to their life science spending, and the total cost is shared among the categories with different percentages that are constant within each group. In Case (iii), a fixed 2% entry fee is required from each funder (irrespective of its size) and the rest of the cost is covered by a contribution that depends on the classification (S - M - L), as in Case (ii).
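The three variations can be sketched in Python as follows. This is one possible reading of the caption, not the authors' calculation: the agency names, budgets, S/M/L classification and category shares are all hypothetical; Case (ii) is assumed to assign each category a fraction of the total cost, split equally among its members; and the 2% entry fee in Case (iii) is assumed to be 2% of the total cost per funder.

```python
from collections import Counter


def case_i(budgets: dict[str, float], rate: float = 0.001) -> dict[str, float]:
    """Case (i): each agency contributes a fixed rate (0.1%) of its budget."""
    return {agency: budget * rate for agency, budget in budgets.items()}


def case_ii(total_cost: float, categories: dict[str, str],
            category_share: dict[str, float]) -> dict[str, float]:
    """Case (ii): each size category covers a fraction of the cost (assumed),
    split equally among the agencies in that category."""
    members = Counter(categories.values())
    return {agency: total_cost * category_share[cat] / members[cat]
            for agency, cat in categories.items()}


def case_iii(total_cost: float, categories: dict[str, str],
             category_share: dict[str, float],
             entry_fee: float = 0.02) -> dict[str, float]:
    """Case (iii): a flat entry fee per funder (assumed to be 2% of the total
    cost), with the remainder shared as in Case (ii)."""
    fee = entry_fee * total_cost
    remainder = total_cost - fee * len(categories)
    rest = case_ii(remainder, categories, category_share)
    return {agency: fee + rest[agency] for agency in categories}


# Hypothetical inputs, in thousand euros; none of these values come from the paper.
budgets = {"A": 1_000_000, "B": 2_000_000, "C": 4_000_000,
           "D": 5_000_000, "E": 8_000_000}
categories = {"A": "S", "B": "S", "C": "M", "D": "M", "E": "L"}
category_share = {"S": 0.2, "M": 0.3, "L": 0.5}  # fractions must sum to 1
total_cost = 20_000

split_i = case_i(budgets)
split_ii = case_ii(total_cost, categories, category_share)
split_iii = case_iii(total_cost, categories, category_share)

for name, split in [("i", split_i), ("ii", split_ii), ("iii", split_iii)]:
    print(f"Case ({name}):", {a: round(v) for a, v in split.items()})
```

Under these assumptions, Cases (ii) and (iii) always sum to the total cost, whereas Case (i) raises whatever 0.1% of the combined budgets amounts to. The entry fee in Case (iii) shifts part of the burden from the largest category towards the smaller funders.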