Martin Mann, Feras Nahar, Norah Schnorr, Rolf Backofen, Peter F Stadler, Christoph Flamm.
Abstract
Chemical reactions are rearrangements of chemical bonds. Each atom in an educt molecule thus appears again in a specific position of one of the reaction products. Chemical reaction databases do not report this bijection between educt and product atoms, however, which leaves the "Atom Mapping Problem" of finding it as an important computational task for many practical applications in computational chemistry and systems biology. Elementary chemical reactions feature a cyclic imaginary transition state (ITS) that imposes additional restrictions on the bijection between educt and product atoms; previous approaches do not take these restrictions into account. We demonstrate that Constraint Programming is well suited to solving the Atom Mapping Problem in this setting. The performance of our approach is evaluated on a manually curated subset of chemical reactions from the KEGG database featuring various ITS cycle layouts and reaction mechanisms.
Keywords: Atom-atom mapping; Chemical reaction; Constraint programming; Imaginary transition state
Year: 2014 PMID: 25484913 PMCID: PMC4256833 DOI: 10.1186/s13015-014-0023-3
Source DB: PubMed Journal: Algorithms Mol Biol ISSN: 1748-7188 Impact factor: 1.405
Figure 1. Diels-Alder reaction. Example of a Diels-Alder reaction, omitting hydrogen atoms. The imaginary transition state (ITS) is an alternating cycle defined by the bonds that are broken (dotted) and the bonds that are newly formed.
Figure 2. Supported ITS layouts. (top) ITS layouts found within the elementary reaction data set from [34]. The numbers within the vertices correspond to atomic oxidation state changes; broken bonds are dotted and given a negative bond label, while formed bonds show positive numbers. (left) Homovalent elementary reactions result in even-sized cycles with no oxidation state changes at the atoms (see Figure 1). (middle) Odd cycles with two oppositely charged atoms separated by a non-changing pseudo bond (dashed edge labeled 0; see Figure 5). (right) Similar layout involving two equivalent oxidation state changes. Note that the inverse layout was also found and used. (bottom) Additionally supported ITS layouts for ambivalent elementary reactions involving non-bonding electrons. These result in odd-sized cycles and an oxidation state change of one atom. Note that this situation is equivalent to a non-elementary cycle with alternating bond labeling (middle).
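The homovalent layout described in the captions of Figures 1 and 2 can be illustrated with a small sketch. It assumes a hypothetical atom numbering for the Diels-Alder reaction (butadiene atoms 1 to 4, ethene atoms 5 and 6) and an identity atom mapping; the function names `its_changes` and `is_alternating_cycle` are illustrative and are not taken from the authors' implementation. The code computes the bond order change of every bond and checks that the changed bonds form one cycle in which negative (broken) and positive (formed) labels alternate.

```python
from collections import defaultdict


def bond(a, b):
    """Undirected bond between two atom indices."""
    return frozenset((a, b))


# Bond orders with hydrogens omitted; the numbering 1..6 is an assumption.
EDUCT = {bond(1, 2): 2, bond(2, 3): 1, bond(3, 4): 2, bond(5, 6): 2}
PRODUCT = {
    bond(1, 2): 1, bond(2, 3): 2, bond(3, 4): 1,
    bond(4, 5): 1, bond(5, 6): 1, bond(6, 1): 1,
}


def its_changes(educt, product):
    """Return {bond: bond order change} for all bonds whose order changes."""
    changes = {}
    for b in set(educt) | set(product):
        delta = product.get(b, 0) - educt.get(b, 0)
        if delta != 0:
            changes[b] = delta
    return changes


def is_alternating_cycle(changes):
    """True if the changed bonds form a single cycle with alternating signs."""
    adj = defaultdict(list)
    for b, delta in changes.items():
        u, v = tuple(b)
        adj[u].append((v, delta))
        adj[v].append((u, delta))
    if not adj:
        return False
    for edges in adj.values():
        # Each atom needs exactly one weakened and one strengthened bond.
        if len(edges) != 2 or edges[0][1] * edges[1][1] >= 0:
            return False
    # Walk around the cycle and make sure it visits every atom.
    start = next(iter(adj))
    prev, cur, length = None, start, 0
    while True:
        nxt = [v for v, _ in adj[cur] if v != prev][0]
        prev, cur = cur, nxt
        length += 1
        if cur == start:
            break
    return length == len(adj)


changes = its_changes(EDUCT, PRODUCT)
print(len(changes), is_alternating_cycle(changes))
```

Because every atom has one bond with a negative and one with a positive label, the net change per atom is zero, which matches the caption's statement that homovalent reactions involve no oxidation state changes.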
Figure 5. Ambivalent reactions. (top) The Meisenheimer rearrangement [37] transforms nitroxides to hydroxylamines. It does not admit a simple alternating cycle as ITS when molecules are represented as graphs whose vertices are atoms. An extended representation, in which the additional electron at the oxygen is treated as a “pseudo-atom”, can fix this issue. (bottom) Note that such even-sized cycles with a virtual vertex for the moving charge (vertex label e⁻) can be represented by smaller odd cycles with two oppositely charged atoms separated by a non-changing pseudo bond (dashed edge labeled 0). See Figure 2 for further details of such an ITS layout.
Figure 3. Hydrogen symmetry problem. Symmetries resulting from interchangeable hydrogens. The figure presents three successive atom assignments within an ITS mapping. Bonds present in I are given in black; bonds to be formed to derive O are dotted and gray. The ITS describes the loss of a hydrogen by the carbon (bond order decrease) and the bond formation between the decoupled hydrogen and the oxygen next in the ITS. The 4 hydrogens are indistinguishable, which results in 4 possible symmetric ITS mappings.
Figure 4. Approach overview. A simplified overview of the extended CSP for a homovalent ITS of size k=6, where the extensions of the basic CSP are given in the gray box in the lower right.
Performance evaluation of the basic and extended CSP model for reactions from Table 2
| | | | | | | | | |
|---|---|---|---|---|---|---|---|---|
| R00013 | 14 | Basic | 6 | 0.03 | 1 | 346 | 0.8 | 0.03 |
| | | Ext. 〈2C〉 | | | | 80 | | 0.02 |
| R00018 | 36 | Basic | 4 | 10.4 | 1 | 73,924 | 2.62 | 19.9 |
| | | Ext. 〈2N〉 | | 0.28 | | 36 | 0.44 | 0.01 |
| R00048 | 30 | Basic | 4 | 0.1 | 2 | 26,178 | 1.44 | 6.1 |
| | | Ext. 〈2O〉 | | | | 24 | | 0.03 |
| R00059 | 44 | Basic | 4 | 0.34 | 1 | 194,210 | 9.45 | 63.15 |
| | | Ext. 〈H,C,N,O〉 | | | | 4 | | 0.01 |
| R00207 | 20 | Basic | 8 | 0.02 | 1 | 20,640 | 1.11 | 4.05 |
| | | Ext. 〈C,4O〉 | | | | 24 | | 0.02 |
Timings are given in seconds; minimal timings are highlighted in boldface. For extended CSPs, the minimal multiset of ITS participating atoms is listed in column 3. Column “Sol. CSP” gives the number of CSP solutions (ITS candidates) tested via VF2 for final atom mappings.
Elementary homovalent reactions from the KEGG REACTION database [42] used for the evaluation of the approach
| Reaction | Educts | Products |
|---|---|---|
| R00013 | C(=O)=O, C(C(=O)O)(C=O)O | 2×C(=O)(C=O)O |
| R00018 | N, N(CCCCN)CCCCN | 2×C(CCN)CN |
| R00048 | CC(O)CC(=O)OC(C)CC(O)=O, O | 2×CC(O)CC(O)=O |
| R00059 | N(C(=O)CCCCCN)CCCCCC(=O)O, O | 2×C(CC(=O)O)CCCN |
| R00207 | P(=O)(O)(O)O, O=O, CC(=O)C(=O)O | P(=O)(OC(=O)C)(O)O, OO, C(=O)=O |
The educt and product molecules are given in SMILES notation [38].
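An atom mapping is a bijection, so it can only exist if both sides of a reaction contain the same atoms. As a minimal sketch of that precondition, the code below counts the heavy atoms in the SMILES strings from the table above and compares educts with products. It is not a full SMILES parser: it assumes unbracketed, non-aromatic atoms from the organic subset (which holds for the five reactions listed), ignores implicit hydrogens, and reads the `2×` prefix as a stoichiometric coefficient. The names `heavy_atoms`, `side_atoms`, and `is_balanced` are hypothetical helpers.

```python
import re
from collections import Counter

# Educts and products as listed in the table above.
REACTIONS = {
    "R00013": (["C(=O)=O", "C(C(=O)O)(C=O)O"], ["2×C(=O)(C=O)O"]),
    "R00018": (["N", "N(CCCCN)CCCCN"], ["2×C(CCN)CN"]),
    "R00048": (["CC(O)CC(=O)OC(C)CC(O)=O", "O"], ["2×CC(O)CC(O)=O"]),
    "R00059": (["N(C(=O)CCCCCN)CCCCCC(=O)O", "O"], ["2×C(CC(=O)O)CCCN"]),
    "R00207": (
        ["P(=O)(O)(O)O", "O=O", "CC(=O)C(=O)O"],
        ["P(=O)(OC(=O)C)(O)O", "OO", "C(=O)=O"],
    ),
}

# Simplifying assumption: only unbracketed organic-subset element symbols.
ATOM = re.compile(r"Cl|Br|[BCNOPSFI]")


def heavy_atoms(term):
    """Count heavy atoms in one table entry such as '2×C(=O)(C=O)O'."""
    multiplier, smiles = 1, term
    match = re.match(r"(\d+)×(.+)", term)
    if match:
        multiplier, smiles = int(match.group(1)), match.group(2)
    counts = Counter(ATOM.findall(smiles))
    return Counter({el: n * multiplier for el, n in counts.items()})


def side_atoms(terms):
    """Sum the heavy-atom counts over all molecules on one reaction side."""
    total = Counter()
    for term in terms:
        total += heavy_atoms(term)
    return total


def is_balanced(educts, products):
    """True if both sides contain the same multiset of heavy atoms."""
    return side_atoms(educts) == side_atoms(products)


for rid, (educts, products) in REACTIONS.items():
    print(rid, dict(side_atoms(educts)), is_balanced(educts, products))
```

The per-reaction totals are heavy-atom counts only and are therefore smaller than counts that include hydrogens.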