Moe Kyaw Thu, Shotaro Beppu, Masaru Yarime, Sotaro Shibayama.
Abstract
The progress of science increasingly relies on machine learning (ML), and machines work alongside humans in various domains of science. This study investigates the team structure of ML-related projects and analyzes the contribution of ML to scientific knowledge production under different team structures, drawing on bibliometric analyses of 25,000 scientific publications in various disciplines. Our regression analyses suggest that (1) interdisciplinary collaboration between domain scientists and computer scientists as well as the engagement of interdisciplinary individuals who have expertise in both domain and computer sciences are common in ML-related projects; (2) the engagement of interdisciplinary individuals seems more important in achieving high impact and novel discoveries, especially when a project employs computational and domain approaches interdependently; and (3) the contribution of ML and its implications for team structure depend on the depth of ML.
Year: 2022 PMID: 35951620 PMCID: PMC9371286 DOI: 10.1371/journal.pone.0272280
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.752
Descriptive statistics and correlation matrix.
| # | Variable | Mean | S.D. | Min | Max | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Impact | 1.010 | 1.173 | .000 | 8.218 | | | | | | | | | | | | |
| 2 | Novelty | .500 | .289 | .000 | 1.000 | -.053 | | | | | | | | | | | |
| 3 | Ln(#Author) | 1.715 | .670 | .000 | 5.198 | .077 | .071 | | | | | | | | | | |
| 4 | Ln(#Org) | 1.101 | .727 | .000 | 5.176 | .071 | .049 | .668 | | | | | | | | | |
| 5 | Ln(#Country) | .296 | .460 | .000 | 3.296 | .095 | .008 | .327 | .515 | | | | | | | | |
| 6 | Univ-Industry Collab | .064 | .246 | .000 | 1.000 | .033 | -.030 | .167 | .197 | .136 | | | | | | | |
| 7 | Comp-Domain Collab | .174 | .379 | .000 | 1.000 | .070 | .011 | .105 | .256 | .152 | .062 | | | | | | |
| 8 | Intra-Org Collab | .084 | .277 | .000 | 1.000 | .064 | .023 | .103 | .212 | .045 | .018 | .660 | | | | | |
| 9 | Inter-Org Collab | .154 | .361 | .000 | 1.000 | .064 | .010 | .124 | .274 | .193 | .079 | .932 | .515 | | | | |
| 10 | Multi-Affiliation | .100 | .300 | .000 | 1.000 | .071 | .019 | .075 | .226 | .131 | .049 | .727 | .582 | .684 | | | |
| 11 | Multi-Expertise | .151 | .358 | .000 | 1.000 | .017 | -.018 | -.131 | -.044 | .015 | .004 | .227 | .142 | .210 | .161 | | |
| 12 | ML-related | .103 | .304 | .000 | 1.000 | .095 | .004 | -.032 | .009 | -.010 | .086 | .192 | .134 | .171 | .130 | .231 | |
| 13 | DL-related | .040 | .196 | .000 | 1.000 | .046 | -.021 | -.033 | -.020 | -.014 | .055 | .130 | .082 | .117 | .086 | .156 | .602 |
Note. N = 24,641.
Fig 1. Team size.
Team size is estimated by ordinary least squares (OLS) regressions controlling for publication years and journals. The error bars indicate one standard error. Two-tailed test: *p<0.05, ***p<0.001.
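The estimation described in this caption (OLS with controls for publication year and journal) can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the toy data, and the choice to regress ln(#Author) on an ML-related dummy are assumptions made for the example, and the solver is a plain normal-equations implementation rather than a statistics library.

```python
def dummy_columns(labels):
    """One 0/1 column per level, dropping the first level as the reference category."""
    levels = sorted(set(labels))[1:]
    return [[1.0 if lab == lev else 0.0 for lev in levels] for lab in labels]


def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def ols_with_dummies(y, x, years, journals):
    """Return the OLS coefficient on x, controlling for year and journal dummies."""
    rows = [[1.0, xi] + yd + jd
            for xi, yd, jd in zip(x, dummy_columns(years), dummy_columns(journals))]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)[1]  # index 0 is the intercept, index 1 is x


# Hypothetical toy data: six papers, two years, two journals.
years = [2019, 2019, 2020, 2020, 2019, 2020]
journals = ["A", "B", "A", "B", "A", "B"]
ml_related = [0.0, 1.0, 0.0, 1.0, 1.0, 0.0]
# ln(#Author) built as 1.0 + 0.3*ML + 0.2*(year 2020) + 0.1*(journal B), no noise.
ln_authors = [1.0, 1.4, 1.2, 1.6, 1.3, 1.3]

coef = ols_with_dummies(ln_authors, ml_related, years, journals)
print(round(coef, 6))
```

Because the toy outcome is constructed without noise, the sketch recovers the ML-related coefficient used to build it. With real data one would also need standard errors, which this sketch does not compute.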
Fig 2. Collaboration form.
Collaboration forms are estimated by logit regressions controlling for publication years and journals. The error bars indicate one standard error. Two-tailed test: ***p<0.001.
Fig 3. Interdisciplinary expertise.
Collaboration forms are estimated by logit regressions controlling for publication years and journals. The error bars indicate one standard error. Two-tailed test: ***p<0.001.
Prediction of publication quality: ML-related vs. ML-unrelated projects.
(A)
| Variable | Model 1 | Model 2 |
|---|---|---|
| Outcome | Impact | Novelty |
| ln(#Author) | .171 (.010) | .020 (.005) |
| ln(#Org) | .013 (.009) | .008 (.004) |
| ln(#Country) | .079 (.011) | -.004 (.005) |
| Univ-Industry Collab | .046 (.018) | -.043 (.009) |
| ML-related | .478 (.015) | -.004 (.006) |
| Year dummies | Yes | Yes |
| Journal dummies | Yes | Yes |
| F stat | 499.220 | 37.113 |
| Adjusted R² | .660 | .173 |
| N | 24641 | 16433 |
(B)
| Variable | Model 1 | Model 2 |
|---|---|---|
| Outcome | Impact | Novelty |
| ln(#Author) | .161 (.009) | .019 (.005) |
| ln(#Org) | .012 (.009) | .011 (.004) |
| ln(#Country) | .066 (.011) | -.005 (.005) |
| Univ-Industry Collab | .017 (.017) | -0.40 (.009) |
| ML-related | .497 (.016) | -.001 (.008) |
| F stat | 241.507 | 155.761 |
| Adjusted R² | .828 | .797 |
| N | 24405 | 15938 |
Note. Unstandardized coefficients (standard errors in parentheses). Two-tailed test: †p<0.1, *p<0.05, **p<0.01, ***p<0.001.
Ordinary least squares (OLS). (B) ML-related papers are paired with ML-unrelated papers published in the same journal in the same year.
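The pairing described in this note can be illustrated with a short sketch. The note only states that ML-related papers are paired with ML-unrelated papers from the same journal and year; the one-to-one, first-available, without-replacement rule used here, and the field names `id`, `journal`, `year`, and `ml_related`, are assumptions for illustration only.

```python
from collections import defaultdict


def match_pairs(papers):
    """Pair each ML-related paper with one unused ML-unrelated paper
    from the same (journal, year) cell. Unmatched papers are dropped."""
    pool = defaultdict(list)
    for p in papers:
        if not p["ml_related"]:
            pool[(p["journal"], p["year"])].append(p)

    pairs = []
    for p in papers:
        if p["ml_related"]:
            candidates = pool[(p["journal"], p["year"])]
            if candidates:
                # Assumed rule: take the first available control, without replacement.
                pairs.append((p["id"], candidates.pop(0)["id"]))
    return pairs


# Hypothetical records.
papers = [
    {"id": "p1", "journal": "J1", "year": 2018, "ml_related": True},
    {"id": "p2", "journal": "J1", "year": 2018, "ml_related": False},
    {"id": "p3", "journal": "J1", "year": 2019, "ml_related": False},
    {"id": "p4", "journal": "J2", "year": 2018, "ml_related": True},
    {"id": "p5", "journal": "J1", "year": 2018, "ml_related": False},
]

pairs = match_pairs(papers)
print(pairs)
```

In this toy input, `p4` has no ML-unrelated counterpart in its journal and year, so it is left out of the matched sample.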
Prediction of publication quality by team structure.
(A)
| Variable | Model 1 | Model 2 | Model 3 | Model 4 | Model 5 | Model 6 | Model 7 | Model 8 |
|---|---|---|---|---|---|---|---|---|
| Outcome | Impact | Impact | Impact | Impact | Novelty | Novelty | Novelty | Novelty |
| ln(#Author) | .110 (.037) | .110 (.037) | .109 (.037) | .107 (.042) | -.012 (.013) | -.012 (.013) | -.009 (.013) | -.009 (.015) |
| ln(#Org) | .004 (.038) | -.001 (.038) | -.002 (.038) | .029 (.041) | .037 (.013) | .035 (.013) | .034 (.013) | .021 (.014) |
| ln(#Country) | .144 (.044) | .150 (.044) | .136 (.044) | .102 (.048) | -.003 (.015) | -.001 (.015) | -.003 (.015) | -.000 (.016) |
| Univ-Industry Collab | .032 (.051) | .033 (.051) | .032 (.051) | .002 (.058) | -.062 (.018) | -.061 (.018) | -.063 (.018) | -.061 (.020) |
| Comp-Domain Collab | .113 (.037) | | | | -.002 (.013) | | | |
| Intra-org Collab | | .089 (.048) | | | | .012 (.016) | | |
| Inter-org Collab | | .067 (.041) | | | | -.006 (.014) | | |
| Multi-Affiliation | | | .160 (.046) | | | | .017 (.016) | |
| Comp-Domain Collab | | | .056 (.046) | | | | -.021 (.016) | |
| Multi-Expertise | | | | .119 (.044) | | | | .026 (.015) |
| Comp-Domain Collab | | | | .139 (.053) | | | | -.013 (.018) |
| Year dummies | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| Journal dummies | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| F stat | 42.377 | 41.923 | 41.937 | 31.402 | 6.086 | 6.027 | 5.820 | 5.327 |
| Adjusted R² | .611 | .611 | .606 | .587 | .184 | .184 | .176 | .189 |
| N | 2530 | 2530 | 2505 | 2034 | 2137 | 2137 | 2117 | 1724 |
(B)
| Variable | Model 1 | Model 2 | Model 3 | Model 4 |
|---|---|---|---|---|
| Outcome | Impact | Impact | Novelty | Novelty |
| ln(#Author) | .170 (.010) | .159 (.011) | .021 (.005) | .025 (.005) |
| ln(#Org) | .007 (.009) | .011 (.010) | .007 (.005) | .003 (.005) |
| ln(#Country) | .080 (.011) | .069 (.012) | -.003 (.005) | .001 (.006) |
| Univ-Industry Collab | .049 (.018) | .048 (.020) | -.043 (.009) | -.040 (.009) |
| ML-related | .457 (.016) | .460 (.020) | -.007 (.007) | -.010 (.009) |
| Multi-Affiliation (ML-unrelated) | .006 (.017) | | .002 (.008) | |
| Multi-Affiliation (ML-related) | .119 (.033) | | .017 (.014) | |
| Multi-Expertise (ML-unrelated) | | -.006 (.016) | | -.006 (.008) |
| Multi-Expertise (ML-related) | | .006 (.031) | | .024 (.013) |
| Year dummies | Yes | Yes | Yes | Yes |
| Journal dummies | Yes | Yes | Yes | Yes |
| F stat | 486.705 | 412.463 | 34.530 | 30.739 |
| Adjusted R² | .654 | .665 | .164 | .176 |
| N | 24392 | 20095 | 16250 | 13386 |
Note. Unstandardized coefficients (standard errors in parentheses). Two-tailed test: †p<0.1, *p<0.05, **p<0.01, ***p<0.001.
Ordinary least squares (OLS). (B) To compare the impact of Multi-Affiliation and Multi-Expertise between ML-related and ML-unrelated projects, we interacted Multi-Affiliation (-Expertise) with ML-related. For example, Multi-Affiliation (ML-related) = 1 if Multi-Affiliation = 1 and ML-related = 1.
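The interaction coding in this note can be written out as a small sketch. The ML-related case follows the definition given in the note; the ML-unrelated counterpart (flag = 1 and ML-related = 0) is an assumption made by analogy, and the function and key names are hypothetical.

```python
def interaction_dummies(team_flag, ml_related):
    """Split a 0/1 team-structure flag (e.g. Multi-Affiliation) into two
    dummies according to whether the paper is ML-related."""
    return {
        # From the note: 1 if the flag is 1 and ML-related is 1.
        "ml_related": int(team_flag == 1 and ml_related == 1),
        # Assumed by analogy: 1 if the flag is 1 and ML-related is 0.
        "ml_unrelated": int(team_flag == 1 and ml_related == 0),
    }


# A Multi-Affiliation paper that is ML-related.
print(interaction_dummies(team_flag=1, ml_related=1))
# A Multi-Affiliation paper that is not ML-related.
print(interaction_dummies(team_flag=1, ml_related=0))
# A paper without Multi-Affiliation: both dummies are zero.
print(interaction_dummies(team_flag=0, ml_related=1))
```

Coding the two dummies separately lets the regression report one coefficient per group, so the Multi-Affiliation (or Multi-Expertise) association can be read off directly for ML-related and ML-unrelated projects.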
Use of machine: Computation-focused vs. computer-domain integrated projects (ML-related projects only).
| Variable | Model 1 | Model 2 | Model 3 | Model 4 | Model 5 | Model 6 | Model 7 | Model 8 |
|---|---|---|---|---|---|---|---|---|
| Outcome | Impact | Impact | Impact | Impact | Novelty | Novelty | Novelty | Novelty |
| Sample | Computation-focused | Computation-focused | Computer-Domain Integrated | Computer-Domain Integrated | Computation-focused | Computation-focused | Computer-Domain Integrated | Computer-Domain Integrated |
| ln(#Author) | .008 (.082) | .014 (.099) | .112 (.071) | .117 (.080) | -.017 (.031) | -.018 (.037) | -.004 (.023) | .001 (.026) |
| ln(#Org) | -.030 (.087) | .036 (.094) | -.015 (.068) | .047 (.073) | -.004 (.032) | -.016 (.034) | -.029 (.022) | -.021 (.024) |
| ln(#Country) | .157 (.095) | .162 (.108) | .172 (.078) | .112 (.084) | .018 (.034) | .038 (.038) | .049 (.025) | .047 (.027) |
| Univ-Industry Collab | .118 (.120) | .063 (.145) | -.079 (.090) | -.109 (.098) | .032 (.047) | .012 (.054) | -.078 (.029) | -.061 (.031) |
| Multi-Affiliation | .085 (.098) | | .186 (.074) | | .039 (.036) | | .058 (.024) | |
| Multi-Expertise | | .008 (.089) | | .164 (.074) | | -.013 (.032) | | .063 (.024) |
| Year dummies | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| Journal dummies | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| F stat | 19.116 | 11.190 | 21.415 | 18.534 | 2.857 | 2.735 | 3.492 | 3.230 |
| Adjusted R² | .611 | .532 | .665 | .662 | .160 | .186 | .211 | .217 |
| N | 613 | 476 | 751 | 646 | 487 | 380 | 664 | 572 |
Note. Unstandardized coefficients (standard errors in parentheses). Two-tailed test: †p<0.1, *p<0.05, **p<0.01, ***p<0.001.
Ordinary least squares (OLS).
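The significance thresholds in the note can be related to a coefficient and its standard error through the ratio of the two. The sketch below uses a standard normal approximation for the two-tailed p-value, which is an assumption: the source does not state the exact reference distribution or degrees of freedom, so results near a threshold may differ from the published markers. The function names are hypothetical, and the input pairs are the Multi-Affiliation and Multi-Expertise entries .186 (.074) and .008 (.089) from the table above.

```python
import math


def two_tailed_p(coef, se):
    """Approximate two-tailed p-value for coef / se under a standard normal."""
    z = abs(coef / se)
    return math.erfc(z / math.sqrt(2))


def marker(p):
    """Map a p-value to the markers listed in the table note."""
    if p < 0.001:
        return "***"
    if p < 0.01:
        return "**"
    if p < 0.05:
        return "*"
    if p < 0.1:
        return "†"
    return ""


for coef, se in [(0.186, 0.074), (0.008, 0.089)]:
    p = two_tailed_p(coef, se)
    print(coef, se, round(p, 3), marker(p))
```

A ratio of about 2.5 standard errors, as in the first pair, falls between the 0.05 and 0.01 thresholds under this approximation, while the second pair is far from any threshold.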
ML technologies.
(A)
| Variable | Model 1 | Model 2 |
|---|---|---|
| Outcome | Impact | Novelty |
| ln(#Author) | .170 (.010) | .021 (.005) |
| ln(#Org) | .014 (.009) | .007 (.004) |
| ln(#Country) | .079 (.011) | -.003 (.005) |
| Univ-Industry Collab | .045 (.018) | -.043 (.009) |
| ML-related | .401 (.018) | .017 (.008) |
| DL-related | .200 (.028) | -.054 (.012) |
| Year dummies | Yes | Yes |
| Journal dummies | Yes | Yes |
| F stat | 495.598 | 36.987 |
| Adjusted R² | .661 | .174 |
| N | 24641 | 16433 |
(B)
| Variable | Model 1 | Model 2 | Model 3 | Model 4 | Model 5 | Model 6 | Model 7 | Model 8 |
|---|---|---|---|---|---|---|---|---|
| Sample | Computation-focused | Computation-focused | Computer-Domain Integrated | Computer-Domain Integrated | Computation-focused | Computation-focused | Computer-Domain Integrated | Computer-Domain Integrated |
| ln(#Author) | -.014 (.082) | -.003 (.097) | .114 (.071) | .112 (.080) | -.013 (.031) | -.013 (.037) | (.023) | .003 (.026) |
| ln(#Org) | -.021 (.086) | .027 (.093) | -.011 (.069) | .055 (.073) | -.004 (.031) | -.015 (.034) | (.022) | -.024 (.024) |
| ln(#Country) | .168 (.094) | .179 (.106) | .173 (.078) | .110 (.084) | .016 (.033) | .035 (.038) | (.025) | .046 (.027) |
| Univ-Industry Collab | .100 (.119) | .066 (.143) | -.088 (.090) | -.124 (.099) | .031 (.046) | .004 (.055) | (.029) | -.052 (.032) |
| DL-related | .234 (.079) | .198 (.109) | .090 (.084) | .121 (.098) | -.023 (.029) | -.026 (.040) | (.027) | -.065 (.032) |
| Multi-Affiliation | -.012 (.137) | | .201 (.085) | | .100 (.052) | | (.027) | |
| Multi-Affiliation | .113 (.124) | | .127 (.135) | | .003 (.046) | | (.044) | |
| Multi-Expertise | | -.135 (.126) | | .214 (.086) | | .005 (.047) | | .042 (.028) |
| Multi-Expertise | | .102 (.114) | | .031 (.133) | | -.026 (.040) | | .118 (.043) |
| Year dummies | Yes | Yes | Yes | Yes | Yes | Yes | | Yes |
| Journal dummies | Yes | Yes | Yes | Yes | Yes | Yes | | Yes |
| F stat | 19.029 | 11.414 | 20.834 | 18.050 | 2.845 | 2.667 | 3.444 | 3.212 |
| Adjusted R² | .618 | .547 | .665 | .662 | .165 | .186 | .212 | .220 |
| N | 613 | 476 | 751 | 646 | 487 | 380 | 664 | 572 |
Note. Unstandardized coefficients (standard errors in parentheses). Two-tailed test: †p<0.1, *p<0.05, **p<0.01, ***p<0.001.
Ordinary least squares (OLS).