Giovanni Boniolo, Marcello D'Agostino, Pier Paolo Di Fiore.
Abstract
We propose a formal language that allows for transposing biological information precisely and rigorously into machine-readable information. This language, which we call Zsyntax (where Z stands for the Greek word ζωή, life), is grounded on a particular type of non-classical logic, and it can be used to write algorithms and computer programs. We present it as a first step towards a comprehensive formal language for molecular biology in which any biological process can be written and analyzed as a sort of logical "deduction". Moreover, we illustrate the potential value of this language, both in the field of text mining and in that of biological prediction.
Year: 2010 PMID: 20209084 PMCID: PMC2831071 DOI: 10.1371/journal.pone.0009511
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
The linguistic and the biological interpretations of the theorems.
| | | LINGUISTIC CASE | BIOLOGICAL CASE (ZSYNTAX) |
| From | | | |
| Through | Inferential process | Classical logical rules and non-logical axioms | Non-standard logical rules and EVFs |
| To | | | |
Theorem representing the reactions of the glycolytic pathway leading from D-Glucose to Fructose-1,6-bisphosphate: Glc&HK&GPI&PFK&ATP&ATP ⊢ F1,6P.
| 1. Glc & HK & GPI & PFK & ATP & ATP | IA |
| 2. Glc & HK | From 1 by & |
| 3. GPI | From 1 by & |
| 4. PFK | From 1 by & |
| 5. ATP | From 1 by & |
| 6. ATP | From 1 by & |
| 7. Glc & HK → Glc*HK | EVF |
| 8. Glc*HK | From 2,7 by → |
| 9. (Glc*HK) & ATP | From 5,8 by & |
| 10. (Glc*HK) & ATP → (Glc*HK) *ATP | EVF |
| 11. (Glc*HK) *ATP | From 9,10 by → |
| 12. (Glc*HK) *ATP → G6P & HK & ADP | EVF |
| 13. G6P & HK & ADP | From 11,12 by → |
| 14. G6P | From 13 by & |
| 15. HK | From 13 by & |
| 16. ADP | From 13 by & |
| 17. G6P & GPI | From 3,14 by & |
| 18. G6P & GPI → G6P*GPI | EVF |
| 19. G6P*GPI | From 17,18 by → |
| 20. G6P*GPI → F6P & GPI | EVF |
| 21. F6P & GPI | From 19,20 by → |
| 22. F6P | From 21 by & |
| 23. GPI | From 21 by & |
| 24. F6P & PFK | From 4,22 by & |
| 25. F6P & PFK → F6P*PFK | EVF |
| 26. F6P*PFK | From 24,25 by → |
| 27. (F6P*PFK) & ATP | From 6,26 by & |
| 28. (F6P*PFK) & ATP → (F6P*PFK) *ATP | EVF |
| 29. (F6P*PFK) *ATP | From 27,28 by → |
| 30. (F6P*PFK) *ATP → F1,6P & PFK & ADP | EVF |
| 31. F1,6P & PFK & ADP | From 29,30 by → |
| 32. F1,6P | From 31 by & |
| 33. Glc & HK & GPI & PFK & ATP & ATP → F1,6P | From 1–32 by →I |
The reactions of the pathway are illustrated in full detail. Each line gives the conclusion of a rule application together with its justification, without keeping track of the initial aggregates (IA) on which it may depend. All the lines in this example depend on the IA of line 1, except for the EVFs, which do not depend on any IA, and the final theorem reported on line 33. There, the IA of line 1 is “discharged” by the application of rule →I, as indicated to the right of line 33 (From 1–32 by →I), so the final theorem does not depend on any IA. Abbreviations: D-Glucose, Glc; D-Glucose-6-phosphate, G6P; Hexokinase [EC 2.7.1.1], HK; Glucose-6-phosphate isomerase [EC 5.3.1.9], GPI; 6-Phosphofructokinase [EC 2.7.1.11], PFK; Fructose-6-phosphate, F6P; Fructose-1,6-bisphosphate, F1,6P. Note that no formula is used more than once in the derivation process.
Simplified version of the theorem.
| Glc&HK&GPI&PFK&ATP&ATP ⊢ F1,6P |
| 1. Glc & HK → Glc*HK |
| 2. (Glc*HK) & ATP → (Glc*HK) *ATP |
| 3. (Glc*HK) *ATP → G6P & HK & ADP |
| 4. G6P & GPI → G6P*GPI |
| 5. G6P*GPI → F6P & GPI |
| 6. F6P & PFK → F6P*PFK |
| 7. (F6P*PFK) & ATP → (F6P*PFK) *ATP |
| 8. (F6P*PFK) *ATP → F1,6P & PFK & ADP |
In Zsyntax, deductions can be written in a simpler way than that presented in Table 2. Here, the emphasis is on the main steps of the inferential process, while the inferential rules remain hidden. These rules are nonetheless applied implicitly, even though they are not mentioned. Abbreviations are as in Table 2.
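The implicit rule applications can be mimicked by treating each EVF as a rewrite on a multiset of molecules. The Python sketch below is an illustration only, not the authors' implementation: the function name `saturate`, the tuple encoding of EVFs, and the choice to treat a bound complex such as `Glc*HK` as a single token are assumptions made here. It fires EVFs greedily, without backtracking, which suffices for this linear pathway but is not a general proof search.

```python
from collections import Counter

# The eight EVFs of the simplified glycolysis table, as (reactants, products).
# Assumption: a bound complex such as "Glc*HK" is handled as one opaque token.
EVFS = [
    (["Glc", "HK"], ["Glc*HK"]),
    (["Glc*HK", "ATP"], ["(Glc*HK)*ATP"]),
    (["(Glc*HK)*ATP"], ["G6P", "HK", "ADP"]),
    (["G6P", "GPI"], ["G6P*GPI"]),
    (["G6P*GPI"], ["F6P", "GPI"]),
    (["F6P", "PFK"], ["F6P*PFK"]),
    (["F6P*PFK", "ATP"], ["(F6P*PFK)*ATP"]),
    (["(F6P*PFK)*ATP"], ["F1,6P", "PFK", "ADP"]),
]


def saturate(aggregate, evfs):
    """Fire applicable EVFs until none applies.

    Each firing removes its reactants from the pool and adds its products,
    so a resource can never be used twice.
    """
    pool = Counter(aggregate)
    fired = True
    while fired:
        fired = False
        for reactants, products in evfs:
            need = Counter(reactants)
            if all(pool[name] >= count for name, count in need.items()):
                pool -= need
                pool += Counter(products)
                fired = True
    return pool


final = saturate(["Glc", "HK", "GPI", "PFK", "ATP", "ATP"], EVFS)
print(sorted(final.items()))
```

With the six-item aggregate of the theorem, F1,6P appears in the final pool and the three enzymes are returned to it. With only one ATP in the aggregate, the run stops at the F6P*PFK complex, which reflects the resource-sensitive reading of the theorem.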
Theorem representing the regulatory loop involving the MDM2 gene (written _MDM2_ below), the MDM2 protein and TP53, and leading to TP53 degradation: TP53 & TP53 & _MDM2_ & U & P ⊢ d(TP53).
| 1. TP53 & TP53 & _MDM2_ & U & P | IA |
| 2. TP53 | From 1 by & |
| 3. TP53 | From 1 by & |
| 4. _MDM2_ | From 1 by & |
| 5. U | From 1 by & |
| 6. P | From 1 by & |
| 7. TP53 & _MDM2_ | From 2,4 by & |
| 8. TP53 & _MDM2_ → TP53*_MDM2_ | EVF |
| 9. TP53*_MDM2_ | From 7,8 by → |
| 10. TP53*_MDM2_ → MDM2 | EVF |
| 11. MDM2 | From 9,10 by → |
| 12. MDM2 & TP53 | From 3,11 by & |
| 13. MDM2 & TP53 → MDM2*TP53 | EVF |
| 14. MDM2*TP53 | From 12,13 by → |
| 15. (MDM2*TP53) & U | From 5,14 by & |
| 16. (MDM2*TP53) & U → (MDM2*TP53) *U | EVF |
| 17. (MDM2*TP53) *U | From 15,16 by → |
| 18. (MDM2*TP53) *U → MDM2 & (TP53*U) | EVF |
| 19. MDM2 & (TP53*U) | From 17,18 by → |
| 20. TP53*U | From 19 by & |
| 21. (TP53*U) & P | From 6,20 by & |
| 22. (TP53*U) & P → (TP53*U) *P | EVF |
| 23. (TP53*U) *P | From 21,22 by → |
| 24. (TP53*U) *P → d(TP53) & U & P | EVF |
| 25. d(TP53) & U & P | From 23,24 by → |
| 26. d(TP53) | From 25 by & |
| 27. TP53 & TP53 & _MDM2_ & U & P → d(TP53) | From 1–26 by →I |
TP53, the well-known tumor suppressor [22], binds to the MDM2 gene and activates its transcription, ultimately leading to synthesis of the MDM2 protein [23], [28]. But if TP53 binds the MDM2 protein, the latter acts as a ubiquitin ligase, leading to TP53 ubiquitination and ultimately to its proteasomal degradation [29], an event that we indicate by d(TP53). Thus, a complex regulatory loop exists involving TP53, the MDM2 gene and the MDM2 protein. The reactions of this pathway are illustrated, in Zsyntax, in the detailed form. In this form, the theorem is reported on line 27; the antecedent (IA, initial aggregate) is the multiset reported on line 1 and is “discharged” by the application of →I. Abbreviations: U, ubiquitin; P, proteasome. The reader can check that no formula (resource) is used more than once in the derivation process.
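The role of the second TP53 copy in the initial aggregate can be made concrete with a small resource-counting sketch. Everything named here is an assumption for illustration: the label `MDM2_gene` stands for the gene (as opposed to the protein `MDM2`), the first two EVFs are read as "TP53 binds the gene" and "the bound complex yields the MDM2 protein", and `derive` is a hypothetical helper that fires EVFs greedily and records the order in which they fired.

```python
from collections import Counter

# Assumed encoding of the seven EVFs of the regulatory loop.
# "MDM2_gene" is a hypothetical label for the gene; "MDM2" is the protein.
EVFS = [
    (("TP53", "MDM2_gene"), ("TP53*MDM2_gene",)),
    (("TP53*MDM2_gene",), ("MDM2",)),
    (("MDM2", "TP53"), ("MDM2*TP53",)),
    (("MDM2*TP53", "U"), ("(MDM2*TP53)*U",)),
    (("(MDM2*TP53)*U",), ("MDM2", "TP53*U")),
    (("TP53*U", "P"), ("(TP53*U)*P",)),
    (("(TP53*U)*P",), ("d(TP53)", "U", "P")),
]


def derive(aggregate, evfs):
    """Greedily fire EVFs; return the final pool and the 1-based EVF numbers fired."""
    pool = Counter(aggregate)
    trace = []
    progress = True
    while progress:
        progress = False
        for number, (reactants, products) in enumerate(evfs, start=1):
            need = Counter(reactants)
            if all(pool[name] >= count for name, count in need.items()):
                pool -= need                 # reactants are consumed
                pool += Counter(products)    # products become available
                trace.append(number)
                progress = True
    return pool, trace


two_copies, steps = derive(["TP53", "TP53", "MDM2_gene", "U", "P"], EVFS)
one_copy, _ = derive(["TP53", "MDM2_gene", "U", "P"], EVFS)
print("d(TP53)" in two_copies, "d(TP53)" in one_copy)
```

Under these assumptions, the aggregate with two TP53 copies reaches d(TP53), while the aggregate with a single copy stops once the MDM2 protein has been produced, because no TP53 is left for it to bind.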
Simplified version of the theorem representing the regulatory loop (_MDM2_ denotes the gene, MDM2 the protein).
| TP53 & TP53 & _MDM2_ & U & P ⊢ d(TP53) |
| 1. TP53 & _MDM2_ → TP53*_MDM2_ |
| 2. TP53*_MDM2_ → MDM2 |
| 3. MDM2 & TP53 → MDM2*TP53 |
| 4. (MDM2 *TP53) & U → (MDM2*TP53) *U |
| 5. (MDM2*TP53) *U → MDM2 & (TP53*U) |
| 6. (TP53*U) & P → (TP53*U) *P |
| 7. (TP53*U) *P → d(TP53) & U & P |
Theorem representing the feed forward loop (genes written _A_, _B_, _C_; proteins A, B, C): _A_ & _A_ & _B_ & _C_ & RA & RA & RB ⊢ C.
| 1. _A_ & _A_ & _B_ & _C_ & RA & RA & RB | IA |
| 2. _A_ | From 1 by & |
| 3. _A_ | From 1 by & |
| 4. _B_ | From 1 by & |
| 5. _C_ | From 1 by & |
| 6. RA | From 1 by & |
| 7. RA | From 1 by & |
| 8. RB | From 1 by & |
| 9. _A_ & RA | From 2,6 by & |
| 10. _A_ & RA → _A_*RA | EVF |
| 11. _A_*RA | From 9,10 by → |
| 12. _A_*RA → A | EVF |
| 13. A | From 11,12 by → |
| 14. A & RB | From 8,13 by & |
| 15. A & RB → A*RB | EVF |
| 16. A*RB | From 14,15 by → |
| 17. (A*RB) & _B_ | From 4,16 by & |
| 18. (A*RB) & _B_ → (A*RB) *_B_ | EVF |
| 19. (A*RB) *_B_ | From 17,18 by → |
| 20. (A*RB) *_B_ → B | EVF |
| 21. B | From 19,20 by → |
| 22. _A_ & RA | From 3,7 by & |
| 23. _A_ & RA → _A_*RA | EVF |
| 24. _A_*RA | From 22,23 by → |
| 25. _A_*RA → A | EVF |
| 26. A | From 24,25 by → |
| 27. A & B | From 21,26 by & |
| 28. A & B → A*B | EVF |
| 29. A*B | From 27,28 by → |
| 30. (A*B) & _C_ | From 5,29 by & |
| 31. (A*B) & _C_ → (A*B) *_C_ | EVF |
| 32. (A*B) *_C_ | From 30,31 by → |
| 33. (A*B) *_C_ → C | EVF |
| 34. C | From 32,33 by → |
| 35. _A_ & _A_ & _B_ & _C_ & RA & RA & RB → C | From 1–34 by →I |
A feed forward loop [30] is illustrated in the detailed form. In this form, the theorem is reported on line 35; the antecedent (IA, initial aggregate) is the multiset reported on line 1 and is “discharged” by the application of →I. The theorem illustrates the abstract case of a feed forward loop composed of three genes (_A_, _B_, _C_), their encoded proteins (A, B, C), and two regulatory proteins RA and RB, such that (i) gene _A_ is regulated by RA; (ii) gene _B_ by RB and the protein A; (iii) gene _C_ by the protein complex A*B. The reader can check that each formula (resource) is used at most once.
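Simplified version of the theorem representing the feed forward loop (genes written _A_, _B_, _C_; proteins A, B, C).
| _A_ & _A_ & _B_ & _C_ & RA & RA & RB ⊢ C |
| 1. _A_ & RA → _A_*RA |
| 2. _A_*RA → A |
| 3. A & RB → A*RB |
| 4. (A*RB) & _B_ → (A*RB) *_B_ |
| 5. (A*RB) *_B_ → B |
| 6. _A_ & RA → _A_*RA |
| 7. _A_*RA → A |
| 8. A & B → A*B |
| 9. (A*B) & _C_ → (A*B) *_C_ |
| 10. (A*B) *_C_ → C |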
Theorem concerning the phosphorylation of TP53.
| TP53 & ATP |
| 1. Kinase & ATP → Kinase*ATP |
| 2. (Kinase*ATP) & TP53 → (Kinase*ATP) *TP53 |
| 3. (Kinase*ATP) *TP53 → TP53-P & Kinase & ADP |
The four levels of mathematics into physics.
| “Enveloping” mathematics | Geometry | Logic | Dynamics (ODEs, PDEs) |
| Vectorial calculus | Euclidean geometry | Classic logic | Newton laws |
| Differential topology | Riemannian geometry | Classic logic | Einstein equations |
| Complex functions and Hilbert spaces | Euclidean geometry | Classic logic plus Quantum logic | Schrödinger equation |
Simplified version of the theorem concerning the degradation path of TP53.
| TP53&MDM2 |
| 1. MDM2 & TP53 → MDM2*TP53 |
| 2. (MDM2 *TP53) & U → (MDM2*TP53) *U |
| 3. (MDM2*TP53) *U → MDM2 & (TP53*U) |
| 4. (TP53*U) & P → (TP53*U) *P |
| 5. (TP53*U) *P → d(TP53) & U & P |
The eight hypothetical theorems for the NUMB, TP53, MDM2 regulatory loop.
| 1 | TP53&MDM2&NUMB&U&P ⊢ ((MDM2*NUMB) *TP53) &U&P |
| 2 | TP53&MDM2&NUMB&U&P ⊢ (MDM2*NUMB) &U& P& TP53 |
| 3 | TP53&MDM2&NUMB&U&P ⊢ ((MDM2*NUMB) *U)&TP53&P |
| 4 | TP53&MDM2&NUMB&U&P ⊢ ((MDM2*NUMB) *P)&TP53&U |
| 5 | TP53&MDM2&NUMB&U&P ⊢ (MDM2*NUMB)&(TP53*U)& P |
| 6 | TP53&MDM2&NUMB&U&P ⊢ (MDM2*NUMB)&(TP53*P)& U |
| 7 | TP53&MDM2&NUMB&U&P ⊢ (MDM2*NUMB)&(U*P)&TP53 |
| 8 | TP53&MDM2&NUMB&U&P ⊢ (MDM2*NUMB)&((TP53*P)*U) |
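One property of the eight candidates that can be checked mechanically is that each consequent regroups exactly the five species of the antecedent, with nothing created or lost. The sketch below is a hedged illustration: `atoms` and `conserves_atoms` are hypothetical helper names, and the regular expression simply reads every alphanumeric run as one species, which is adequate for these eight formulas but would not handle notation such as d(TP53).

```python
import re
from collections import Counter

# The eight hypothetical theorems, copied from the table above.
THEOREMS = [
    "TP53&MDM2&NUMB&U&P ⊢ ((MDM2*NUMB) *TP53) &U&P",
    "TP53&MDM2&NUMB&U&P ⊢ (MDM2*NUMB) &U& P& TP53",
    "TP53&MDM2&NUMB&U&P ⊢ ((MDM2*NUMB) *U)&TP53&P",
    "TP53&MDM2&NUMB&U&P ⊢ ((MDM2*NUMB) *P)&TP53&U",
    "TP53&MDM2&NUMB&U&P ⊢ (MDM2*NUMB)&(TP53*U)& P",
    "TP53&MDM2&NUMB&U&P ⊢ (MDM2*NUMB)&(TP53*P)& U",
    "TP53&MDM2&NUMB&U&P ⊢ (MDM2*NUMB)&(U*P)&TP53",
    "TP53&MDM2&NUMB&U&P ⊢ (MDM2*NUMB)&((TP53*P)*U)",
]


def atoms(formula):
    """Count the species in a formula, ignoring &, * and parentheses."""
    return Counter(re.findall(r"[A-Za-z0-9]+", formula))


def conserves_atoms(theorem):
    """True if both sides of the turnstile contain the same multiset of species."""
    antecedent, consequent = theorem.split("⊢")
    return atoms(antecedent) == atoms(consequent)


for number, theorem in enumerate(THEOREMS, start=1):
    print(number, conserves_atoms(theorem))
```

All eight candidates pass this check, so they differ only in how the same five resources are bound to one another. Conservation is a necessary sanity check here, not evidence that any of the eight is biologically correct.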