Paul C Boutros, Adam A Margolin, Joshua M Stuart, Andrea Califano, Gustavo Stolovitzky.
Abstract
Rapid technological development has created an urgent need for improved evaluation of algorithms for the analysis of cancer genomics data. We outline how challenge-based assessment may help fill this gap by leveraging crowd-sourcing to distribute effort and reduce bias.
Year: 2014 PMID: 25314947 PMCID: PMC4318527 DOI: 10.1186/s13059-014-0462-7
Source DB: PubMed Journal: Genome Biol ISSN: 1474-7596 Impact factor: 13.583
Non-comprehensive list of important and current challenge efforts and platforms
| Challenge | Scope | Assessment type | Organizers |
|---|---|---|---|
| Assemblathon 1 & 2 | Sequence assembly | Objective scoring | UC Davis Genome Center |
| CAFA | Protein function prediction | Objective scoring | Community collaboration |
| CAGI | Systems biology | Objective scoring | UC Berkeley/University of Maryland |
| CAPRI | Protein docking | Objective scoring | Community collaboration |
| CASP | Structure prediction | Objective scoring | Community collaboration |
| ChaLearn | Machine learning | Objective scoring | ChaLearn Organization (not-for-profit) |
| CLARITY | Clinical genome interpretation | Objective scoring and evaluation by judges | Boston Children’s Hospital |
| DREAM | Network inference and systems biology | Objective scoring | Community collaboration & Sage Bionetworks |
| FlowCAP | Flow cytometry analysis | Objective scoring | Community collaboration |
| ICGC-TCGA DREAM Somatic Mutation Calling | Sequence analysis | Objective evaluation | Community collaboration & Sage Bionetworks |
| IMPROVER | Systems biology | Objective evaluation and crowd-verification | Philip Morris International |
| Innocentive | Topics in various industries | Objective scoring and evaluation by judges | Commercial platform |
| Kaggle | Topics in various industries | Objective scoring and evaluation by judges | Commercial platform |
| RGASP | RNA-seq analyses | Objective scoring | European Bioinformatics Institute |
| Sequence Squeeze | Sequence compression | Objective scoring and evaluation by judges | Pistoia Alliance |
| X-Prize | Technology | Evaluation by judges | X-Prize Organization (not-for-profit) |
|
The challenges were chosen based on their relevance to cancer genomics or their representativeness of a particular type of challenge. Different challenges specialize in specific areas of research (see ‘Scope’), and may use different assessment types, such as objective scoring against a gold standard, evaluation by judges, or community consensus (‘crowd-verification’). Organizers can be researchers from specific institutions (such as universities or hospitals), a group of diverse researchers from academia and industry collaborating in the challenge organization (community collaboration), not-for-profit associations, or commercial platforms that run challenges as their business model (such as Innocentive and Kaggle). Initiatives such as CAFA, CAGI, CAPRI, CASP, ChaLearn, DREAM, FlowCAP and IMPROVER organize several challenges each year, and only the generic project is listed in this table, with the exception of DREAM, for which we also show the ICGC-TCGA DREAM Somatic Mutation Calling Challenge because of its relevance to this paper. More information about these efforts can be found on their respective websites.
Figure 1. Typical design of a crowd-sourced challenge. A dataset is split into a training set, a validation (or leaderboard) set and a test set (or gold standard). Participants have access to the challenge input data, but the known answers are released only for the training set; for the validation and test sets, the answers to the challenge questions are withheld. In the open phase of the challenge, participants optimize their algorithms by making repeated submissions of predictions for the validation set. These submissions are scored and the scores returned to the participants, who can use this information to improve their methods. In the final evaluation phase, the optimized algorithms are submitted and evaluated against the final test set (the gold standard), and the resulting scores are used to compute the statistical significance and the ranking of the participating algorithms.
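As a concrete illustration of the design described in Figure 1, the sketch below simulates a challenge on synthetic data: the dataset is split into training, validation (leaderboard) and test sets, repeated ‘submissions’ are scored against the withheld validation answers during the open phase, and the selected method is scored once against the test-set gold standard. The synthetic data, the threshold-based ‘methods’ and the accuracy metric are purely illustrative assumptions, not taken from any actual challenge.

```python
# Minimal sketch of the challenge workflow in Figure 1 (illustrative only).
import numpy as np

rng = np.random.default_rng(0)

def split_dataset(n_samples, frac_train=0.6, frac_valid=0.2):
    """Partition sample indices into training, validation (leaderboard) and test sets."""
    idx = rng.permutation(n_samples)
    n_train = int(frac_train * n_samples)
    n_valid = int(frac_valid * n_samples)
    return idx[:n_train], idx[n_train:n_train + n_valid], idx[n_train + n_valid:]

def score(predictions, gold_standard):
    """Objective score against withheld answers (here: simple accuracy)."""
    return float(np.mean(predictions == gold_standard))

# Simulated challenge data: inputs are released to participants,
# answers (the gold standard) are known only to the organizers.
X = rng.normal(size=(1000, 20))
y = (X[:, 0] > 0).astype(int)            # withheld answers
train, valid, test = split_dataset(len(y))

# Open phase: participants repeatedly submit validation-set predictions
# and receive leaderboard scores they can use to tune their methods.
leaderboard = []
for threshold in (-0.5, 0.0, 0.5):        # stand-in for successive method versions
    submission = (X[valid, 0] > threshold).astype(int)
    leaderboard.append((threshold, score(submission, y[valid])))

best_threshold = max(leaderboard, key=lambda entry: entry[1])[0]

# Final evaluation phase: the frozen method is scored once on the test set.
final_prediction = (X[test, 0] > best_threshold).astype(int)
print("final test score:", score(final_prediction, y[test]))
```

In a real challenge the final test-set scores would additionally be compared across all participating teams to establish statistical significance and rankings, as described in the caption above.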
Some advantages and limitations of challenge-based methods assessment, along with barriers to participation in them
| Advantages | Limitations | Barriers to participation |
|---|---|---|
| Reduction of over-fitting | Narrower scope compared to traditional open-ended research | Incentives not strong enough to promote participation |
| Benchmarking individual methods | Ground truth needed for objective scoring | No funding available to support time spent participating in challenges |
| Impartial comparison across methods using the same datasets | Mostly limited to computational approaches | Fatigue resulting from many ongoing challenges |
| Fostering collaborative work, including code sharing | Requires data producers to share their data before publication | Time assigned by organizers to solve a difficult challenge question may be too short |
| Acceleration of research | Sufficient amount of high-quality data needed for meaningful results | Lack of computing capabilities |
| Enhancing data access and impact | Large number of participants not always available | New data modalities or datasets that are too complex or too large pose an entry barrier |
| Determination of problem solvability | Challenge questions may not be solvable with the data at hand | Challenge questions not interesting or impactful enough |
| Tapping the ‘Wisdom of Crowds’ | Traditional grant mechanisms not adequate to fund challenge efforts | Cumbersome approvals to acquire sensitive datasets |
| Objective assessment | Difficulty distributing datasets with sensitive information | |
| Standardizing experimental design | | |
Figure 2. Different researchers studying the same data may arrive at discordant conclusions. Benchmarking becomes essential as a way to separate true findings from spurious ones. (Illustration © Natasha Stolovitzky-Brunner, inspired by the parable of the six blind men and the elephant.)