Ross D King, Maria Liakata, Chuan Lu, Stephen G Oliver, Larisa N Soldatova.
Abstract
The reuse of scientific knowledge obtained from one investigation in another investigation is basic to the advance of science. Scientific investigations should therefore be recorded in ways that promote the reuse of the knowledge they generate. The use of logical formalisms to describe scientific knowledge has potential advantages in facilitating such reuse. Here, we propose a formal framework for using logical formalisms to promote reuse. We demonstrate the utility of this framework by using it in a worked example from biology: demonstrating cycles of investigation formalization [F] and reuse [R] to generate new knowledge. We first used logic to formally describe a Robot scientist investigation into yeast (Saccharomyces cerevisiae) functional genomics [f(1)]. With Robot scientists, unlike human scientists, the production of comprehensive metadata about their investigations is a natural by-product of the way they work. We then demonstrated how this formalism enabled the reuse of the research in investigating yeast phenotypes [r(1) = R(f(1))]. This investigation found that the removal of non-essential enzymes generally resulted in enhanced growth. The phenotype investigation was then formally described using the same logical formalism as the functional genomics investigation [f(2) = F(r(1))]. We then demonstrated how this formalism enabled the reuse of the phenotype investigation to investigate yeast systems-biology modelling [r(2) = R(f(2))]. This investigation found that yeast flux-balance analysis models fail to predict the observed changes in growth. Finally, the systems biology investigation was formalized for reuse in future investigations [f(3) = F(r(2))]. These cycles of reuse are a model for the general reuse of scientific knowledge.
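
The alternation of formalization and reuse in the abstract can be read as a simple recurrence. The block below is only a compact restatement in subscript notation; the symbol $r_0$ for the original Robot scientist investigation is an assumption introduced here and does not appear in the abstract.

```latex
\[
  f_1 = F(r_0), \qquad
  r_n = R(f_n), \qquad
  f_{n+1} = F(r_n), \qquad n = 1, 2 .
\]
```

With $n = 1, 2$ this reproduces the five steps listed in the abstract: $f_1$, $r_1 = R(f_1)$, $f_2 = F(r_1)$, $r_2 = R(f_2)$ and $f_3 = F(r_2)$.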
Year: 2011 PMID: 21490004 PMCID: PMC3163424 DOI: 10.1098/rsif.2011.0029
Source DB: PubMed Journal: J R Soc Interface ISSN: 1742-5662 Impact factor: 4.118
Figure 1. Overall structure of the formalization (a fragment). The figure shows three investigations: the investigation into automation in science (in blue), the investigation into the reuse of the results of the investigation into automation in science (in brown) and the investigation into the FBA model (in green). The boxes represent parts of the investigations, the links are has-part relations.
Comparison of the predicted (sim.) change in growth rate (deletant − wild-type) with the experimentally measured (exp.) change in growth rate for the 20 manually studied gene deletants. MM denotes minimal medium; YPD is rich medium; n.a. means the reactions are not present in the iND750 model.
| reaction ID in iND750 | deleted gene (open reading frame) | exp. MM | sim. MM | exp. YPD | sim. YPD |
|---|---|---|---|---|---|
| R_AATA | YER152C | 0.009 | −0.733 | 0.019 | −0.222 |
| R_AATA | YGL202W | −0.024 | −0.733 | 0.024 | −0.222 |
| R_AATA | YJL060W | 0.013 | −0.733 | 0.024 | −0.222 |
| R_AGAT_SC | YDL052C | 0.009 | −0.733 | 0.034 | −0.805 |
| R_FTHFCLm | YER183C | 0.022 | 0 | 0.014 | 0 |
| R_G6PDA | YGR248W | 0.017 | 0 | 0.007 | 0 |
| R_G6PDA | YHR163W | −0.222 | 0 | 0.005 | 0 |
| R_G6PDA | YNR034W | 0.023 | 0 | 0.028 | 0 |
| R_GLUN | YIL033C | −0.079 | 0 | −0.205 | 0 |
| R_M1PD | YNR073C | 0.016 | 0 | 0.024 | 0 |
| R_MACACI | YLL060C | 0.011 | 0 | 0.014 | 0 |
| R_POLYAO2 | YMR020W | 0.016 | 0 | 0.023 | 0 |
| R_PUNP1 | YLR017W | 0.003 | 0 | 0.008 | 0 |
| R_PUNP1 | YLR209C | 0.017 | 0 | 0.004 | 0 |
| R_PYDXK | YNR027W | 0.013 | 0 | 0.023 | 0 |
| R_PYDXK | YPR121W | 0.036 | 0 | 0.025 | 0 |
| R_SERATi | YJL218W | 0.015 | −0.733 | 0.038 | 0 |
| n.a. | YDL168W | 0.018 | 0 | 0.024 | 0 |
| n.a. | YJL045W | 0.016 | 0 | 0 | 0 |
| n.a. | YLR070C | 0.012 | 0 | 0.019 | 0 |
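
The abstract's statement that flux-balance models fail to predict the observed growth changes can be checked against the rows of this table. The following is a minimal Python sketch, not the authors' analysis: the `Deletant` record, its field names and the `unpredicted_gains` helper are hypothetical names introduced for illustration, and the criterion used (measured change above zero while the predicted change is zero or negative) is an assumed simplification that ignores measurement noise.

```python
from typing import NamedTuple


class Deletant(NamedTuple):
    reaction: str    # reaction ID in iND750 ("n.a." if absent from the model)
    orf: str         # deleted open reading frame
    exp_min: float   # measured growth-rate change, minimal medium
    sim_min: float   # predicted growth-rate change, minimal medium
    exp_ypd: float   # measured growth-rate change, rich medium (YPD)
    sim_ypd: float   # predicted growth-rate change, rich medium (YPD)


# Values transcribed from the table above.
ROWS = [
    Deletant("R_AATA", "YER152C", 0.009, -0.733, 0.019, -0.222),
    Deletant("R_AATA", "YGL202W", -0.024, -0.733, 0.024, -0.222),
    Deletant("R_AATA", "YJL060W", 0.013, -0.733, 0.024, -0.222),
    Deletant("R_AGAT_SC", "YDL052C", 0.009, -0.733, 0.034, -0.805),
    Deletant("R_FTHFCLm", "YER183C", 0.022, 0.0, 0.014, 0.0),
    Deletant("R_G6PDA", "YGR248W", 0.017, 0.0, 0.007, 0.0),
    Deletant("R_G6PDA", "YHR163W", -0.222, 0.0, 0.005, 0.0),
    Deletant("R_G6PDA", "YNR034W", 0.023, 0.0, 0.028, 0.0),
    Deletant("R_GLUN", "YIL033C", -0.079, 0.0, -0.205, 0.0),
    Deletant("R_M1PD", "YNR073C", 0.016, 0.0, 0.024, 0.0),
    Deletant("R_MACACI", "YLL060C", 0.011, 0.0, 0.014, 0.0),
    Deletant("R_POLYAO2", "YMR020W", 0.016, 0.0, 0.023, 0.0),
    Deletant("R_PUNP1", "YLR017W", 0.003, 0.0, 0.008, 0.0),
    Deletant("R_PUNP1", "YLR209C", 0.017, 0.0, 0.004, 0.0),
    Deletant("R_PYDXK", "YNR027W", 0.013, 0.0, 0.023, 0.0),
    Deletant("R_PYDXK", "YPR121W", 0.036, 0.0, 0.025, 0.0),
    Deletant("R_SERATi", "YJL218W", 0.015, -0.733, 0.038, 0.0),
    Deletant("n.a.", "YDL168W", 0.018, 0.0, 0.024, 0.0),
    Deletant("n.a.", "YJL045W", 0.016, 0.0, 0.0, 0.0),
    Deletant("n.a.", "YLR070C", 0.012, 0.0, 0.019, 0.0),
]


def unpredicted_gains(rows, medium):
    """Count deletants that grew faster than wild-type in `medium`
    ("min" or "ypd") although the model predicted no gain."""
    exp_field, sim_field = f"exp_{medium}", f"sim_{medium}"
    return sum(
        1
        for row in rows
        if getattr(row, exp_field) > 0 and getattr(row, sim_field) <= 0
    )


if __name__ == "__main__":
    print("minimal medium:", unpredicted_gains(ROWS, "min"))
    print("YPD:", unpredicted_gains(ROWS, "ypd"))
```

Under this criterion, 17 of the 20 deletants in minimal medium and 18 of the 20 in YPD show a measured gain that the model does not predict.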
Figure 2. (a) The histogram shows the median observed differences in the growth rate of a knockout strain (k) and that of the wild-type (w) in both minimal and rich media; we used medians as they are more robust to outliers (black bars, k − w in minimal; grey bars, k − w in rich). (b) The histogram shows the median observed differences in global maximum OD between the knockout strain (k) and that of the wild-type (w) in both minimal and rich media (black bars, k − w in minimal; grey bars, k − w in rich). (c) The histogram shows the median observed differences in hours between the lag-time parameter of the wild-type grown in the presence of a nutrient and that of the wild-type grown on minimal medium (black bars).