Vlasta Sikimić, Sandro Radovanović.
Abstract
As more objections have been raised against grant peer-review for being costly and time-consuming, the legitimate question arises whether machine learning algorithms could help assess the epistemic efficiency of the proposed projects. As a case study, we investigated whether project efficiency in high energy physics (HEP) can be algorithmically predicted based on the data from the proposal. To analyze the potential of algorithmic prediction in HEP, we conducted a study on data about the structure (project duration, team number, and team size) and outcomes (citations per paper) of HEP experiments with the goal of predicting their efficiency. In the first step, we assessed the project efficiency using Data Envelopment Analysis (DEA) of 67 experiments conducted in the HEP laboratory Fermilab. In the second step, we employed predictive algorithms to detect which team structures maximize the epistemic performance of an expert group. For this purpose, we used the efficiency scores obtained by DEA and applied predictive algorithms - lasso and ridge linear regression, neural network, and gradient boosted trees - on them. The results of the predictive analyses show moderately high accuracy (mean absolute error equal to 0.123), indicating that they can be beneficial as one of the steps in grant review. Still, their applicability in practice should be approached with caution. Some of the limitations of the algorithmic approach are the unreliability of citation patterns, unobservable variables that influence scientific success, and the potential predictability of the model.
Keywords: Data envelopment analysis; Efficiency of experiments; Epistemic utility; High energy physics; Peer-review; Predictive analysis
Year: 2022 PMID: 35910078 PMCID: PMC9307966 DOI: 10.1007/s13194-022-00478-6
Source DB: PubMed Journal: Eur J Philos Sci ISSN: 1879-4912 Impact factor: 1.602
Fig. 1 Method for project efficiency estimation and prediction
DEA efficiency scores and benchmarks; the efficient projects are highlighted
| 21.67% | 29.20% | 72.58% |
| 28.90% | 25.57% | 15.45% |
| 33.21% | 16.73% | 20.69% |
| 29.84% | 19.13% | 52.53% |
| 26.29% | 43.55% | 19.83% |
| 36.28% | 39.39% | |
| 23.62% | 53.85% | 26.08% |
| 9.56% | 19.77% | 4.10% |
| 4.20% | 20.52% | 30.51% |
| 56.01% | 72.06% | 10.13% |
| 16.28% | 81.00% | |
| 30.44% | 10.52% | 21.92% |
| 39.83% | 32.53% | |
| 64.82% | 20.80% | |
| 29.66% | 59.09% | |
| 7.04% | 56.77% | |
| 6.81% | 18.22% | |
| 10.33% | 3.59% | |
| 7.73% | | |
| 63.66% | 18.72% | |
| 18.25% | | |
| 5.14% | 10.20% | |
| 6.52% | 20.29% | |
| 46.07% | 8.46% | |
| 52.73% | 19.67% | |
| 34.12% | 14.43% | |
| 3.63% | 42.82% | |
The efficient projects are bolded.
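Efficiency scores of this kind come from DEA, which rates each project by comparing its outputs with its inputs relative to the best performers in the sample. With several inputs, as in the study (project duration, number of teams, team size), each score requires solving a small linear program. The sketch below is not the authors' implementation. It covers only the special case of one input and one output and assumes constant returns to scale (the CCR model; the source does not state which DEA variant was used). In that case a project's efficiency reduces to its output-to-input ratio divided by the best ratio in the sample. The function name `ccr_efficiency_1x1`, the project labels, and all numbers are hypothetical.

```python
from typing import Dict, Tuple


def ccr_efficiency_1x1(projects: Dict[str, Tuple[float, float]]) -> Dict[str, float]:
    """Single-input, single-output DEA under constant returns to scale.

    `projects` maps a project label to (input, output), for example
    (duration in days, citations per paper). Returns a score in (0, 1]
    per project; a score of 1.0 marks an efficient project.
    """
    ratios = {}
    for name, (inp, out) in projects.items():
        if inp <= 0 or out < 0:
            raise ValueError(f"invalid input/output for project {name!r}")
        ratios[name] = out / inp  # productivity of this project

    best = max(ratios.values())  # productivity of the efficient frontier
    if best == 0:
        raise ValueError("at least one project needs a positive output")

    return {name: ratio / best for name, ratio in ratios.items()}


# Hypothetical data, NOT taken from the Fermilab sample:
# label -> (duration in days, citations per paper)
projects = {
    "A": (100.0, 50.0),
    "B": (200.0, 75.0),
    "C": (400.0, 50.0),
}

scores = ccr_efficiency_1x1(projects)
efficient = [name for name, score in scores.items() if score == 1.0]

for name, score in scores.items():
    print(f"{name}: {score:.2%}")
# A: 100.00%
# B: 75.00%
# C: 25.00%
```

In this toy sample only project A lies on the efficient frontier, so it serves as the benchmark for B and C. With more than one input the ratio shortcut no longer applies and a linear programming solver is needed.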
Predictive performances
| Algorithm | RMSE | MAE |
|---|---|---|
| Lasso linear regression | 0.030 ± 0.030 | 0.123 ± 0.045 |
| Ridge linear regression | 0.031 ± 0.026 | 0.130 ± 0.042 |
| Neural network | 0.054 ± 0.035 | 0.180 ± 0.057 |
| Gradient boosted trees | 0.035 ± 0.026 | 0.125 ± 0.039 |
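Error metrics such as those in the table compare predicted efficiency scores with the DEA scores on held-out projects. The sketch below shows one way to compute MAE and RMSE per validation fold and summarize them as mean ± standard deviation. Reading the "±" values as standard deviations across cross-validation folds is an assumption, since the source does not say how they were obtained. The helper names and the fold data are hypothetical.

```python
import math
from statistics import mean, stdev
from typing import Dict, List, Sequence, Tuple

Fold = Tuple[Sequence[float], Sequence[float]]  # (true scores, predicted scores)


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute error."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean squared error."""
    return math.sqrt(mean((t - p) ** 2 for t, p in zip(y_true, y_pred)))


def summarize(folds: List[Fold]) -> Dict[str, Tuple[float, float]]:
    """Return (mean, sample standard deviation) of each metric across folds."""
    maes = [mae(t, p) for t, p in folds]
    rmses = [rmse(t, p) for t, p in folds]
    return {
        "MAE": (mean(maes), stdev(maes)),
        "RMSE": (mean(rmses), stdev(rmses)),
    }


# Hypothetical efficiency scores (on a 0-1 scale) for two validation folds.
folds: List[Fold] = [
    ([0.2, 0.4, 1.0], [0.3, 0.4, 0.8]),
    ([0.5, 0.1], [0.4, 0.3]),
]

summary = summarize(folds)
for metric, (m, s) in summary.items():
    print(f"{metric}: {m:.3f} ± {s:.3f}")
```

Because efficiency scores lie between 0 and 1, an MAE of about 0.12 means predictions miss the DEA score by roughly 12 percentage points on average.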
Fig. 2 Heatmaps of efficiency. The three input variables (time in days, number of teams, and number of researchers) are plotted against each other. The color indicates efficiency, on a scale from efficient (blue) to inefficient (red). Experiments become inefficient if they last long or involve a high number of teams.
Summary: tendencies of efficient projects
| Aspect | Tendency |
|---|---|
| Time | The relationship between time and project efficiency has a saturation point. |
| Project structure | Efficient projects have a smaller number of teams. |