Laura Guasch, Markus Sitzmann, Marc C Nicklaus.
Abstract
A compound exhibits (prototropic) tautomerism if it can be represented by two or more structures that are related by a formal intramolecular movement of a hydrogen atom from one heavy atom position to another. When the movement of the proton is accompanied by the opening or closing of a ring, it is called ring-chain tautomerism. This type of tautomerism is well documented in carbohydrates, but it also occurs in other molecules such as warfarin. In this work, we present an approach that allows for the generation of all ring-chain tautomers of a given chemical structure. Based on Baldwin's Rules, which estimate the likelihood that a ring closure reaction will occur, we have defined a set of transform rules covering the majority of ring-chain tautomerism cases. The rules automatically detect substructures in a given compound that can undergo a ring-chain tautomeric transformation. Each transformation is encoded in SMIRKS line notation. All work was implemented in the chemoinformatics toolkit CACTVS. We report on the application of our ring-chain tautomerism rules to a large database of commercially available screening samples in order to identify ring-chain tautomers.
Year: 2014 PMID: 25158156 PMCID: PMC4170818 DOI: 10.1021/ci500363p
Source DB: PubMed Journal: J Chem Inf Model ISSN: 1549-9596 Impact factor: 4.956
Figure 1. General isomerization scheme for ring–chain tautomers. (Top) Exocyclic ring closure. (Bottom) Endocyclic ring closure.
(Top) Baldwin’s Rules for Ring Closure and (Bottom) Adaptation to Our Ring–Chain Tautomer Rules
Columns represent the size of the ring formed (3–7) and the type of closure process (exocyclic and endocyclic). Rows specify the nature of the electrophilic carbon: tet (tetrahedral/sp3), trig (trigonal/sp2), and dig (digonal/sp). Green (with check mark) indicates favorable cyclizations, red (with cross-out) indicates unfavorable cyclizations, and gray indicates that no prediction was made.
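The colored table itself is not reproduced here, so the following is only a minimal sketch of how such a lookup could be encoded. It assumes the classical Baldwin's Rules as they are commonly stated in textbooks (an assumption, not a transcription of the table), and the function name `baldwin` is hypothetical. The 11 rule names are taken from the rule-frequency table further below.

```python
import re

FAVORED, DISFAVORED, NO_PREDICTION = "favored", "disfavored", "no prediction"

def baldwin(rule: str) -> str:
    """Classify a closure such as '5-exo-trig' under the classical rules.

    Assumption: the textbook form of Baldwin's Rules for ring sizes 3-7.
    """
    m = re.fullmatch(r"(\d+)-(exo|endo)-(tet|trig|dig)", rule)
    if m is None:
        raise ValueError(f"not a closure descriptor: {rule!r}")
    size, mode, geometry = int(m[1]), m[2], m[3]
    if not 3 <= size <= 7:
        return NO_PREDICTION
    if mode == "exo":
        # 3- and 4-exo-dig are disfavored; all other exo closures are favored.
        return DISFAVORED if geometry == "dig" and size <= 4 else FAVORED
    if geometry == "tet":
        # Only 5- and 6-endo-tet carry a prediction (disfavored).
        return DISFAVORED if size in (5, 6) else NO_PREDICTION
    if geometry == "trig":
        # 3- to 5-endo-trig are disfavored; 6- and 7-endo-trig are favored.
        return DISFAVORED if size <= 5 else FAVORED
    return FAVORED  # 3- to 7-endo-dig

# The 11 ring-chain rules named in the rule-frequency table of this paper.
PAPER_RULES = [
    "3-exo-trig", "4-exo-trig", "5-exo-trig", "6-exo-trig", "7-exo-trig",
    "5-exo-dig", "6-exo-dig", "7-exo-dig",
    "5-endo-trig", "6-endo-trig", "7-endo-trig",
]

not_favored = [r for r in PAPER_RULES if baldwin(r) != FAVORED]
print(not_favored)
```

Under this encoding, 5-endo-trig is the only one of the 11 rules that the classical formulation does not mark as favored, which is consistent with the table describing an adaptation of Baldwin's Rules rather than a direct copy.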
SMIRKS Transforms for Our 11 Ring–Chain Tautomer Rules
Figure 2. A case of ring–chain tautomerism not covered by our rules. It involves bridged ring systems (molecules D and E) in the second cyclization reaction of molecule A.
Figure 3. All experimentally found tautomers of warfarin. The cyclic tautomers are in the dashed-line box.
Distribution of the Number of Tautomers Predicted Per Molecule by Our Ring–Chain Rules, CACTVS Prototropic Rules, and Combined Application of Both Sets of Rules to the AMS (Aldrich Market Select) and CSDB (Chemical Structure DataBase) Databases (See Text)
| number of tautomers | ring–chain tautomerism, 11 rules, AMS (count) | ring–chain tautomerism, 11 rules, AMS (%) | prototropic tautomerism, 21 rules, AMS (count) | prototropic tautomerism, 21 rules, AMS (%) | prototropic tautomerism, 21 rules, CSDB (count) | prototropic tautomerism, 21 rules, CSDB (%) | both types of tautomerism, 32 rules, AMS (count) | both types of tautomerism, 32 rules, AMS (%) |
|---|---|---|---|---|---|---|---|---|
| no tautomers (single molecule) | 5 297 864 | 92.05 | 1 393 612 | 24.21 | 9 756 186 | 14.06 | 1 364 937 | 23.72 |
| one tautomer | 101 890 | 22.26 | 1 235 979 | 28.34 | 10 721 845 | 17.99 | 1 093 477 | 24.90 |
| 2–10 tautomers | 304 245 | 66.47 | 2 428 781 | 55.68 | 33 532 284 | 56.25 | 2 271 956 | 51.75 |
| 11–50 tautomers | 37 267 | 8.14 | 584 842 | 13.41 | 13 492 899 | 22.63 | 722 423 | 16.45 |
| 51–100 tautomers | 3905 | 0.85 | 72 832 | 1.67 | 1 136 066 | 1.91 | 136 558 | 3.11 |
| 101–200 tautomers | 7078 | 1.55 | 35 901 | 0.82 | 565 199 | 0.95 | 151 384 | 3.45 |
| 201–500 tautomers | 3017 | 0.66 | 3486 | 0.08 | 157 260 | 0.26 | 14 734 | 0.34 |
| 501–1000 tautomers | 308 | 0.07 | 141 | 0.00 | 6088 | 0.01 | 105 | 0.00 |
The reduction of the number of AMS molecules with 501–1000 tautomers when going from 21 to 32 rules stems from the fact that we cut off tautomer generation at 1000 generated tautomers (see text), and more rules pushed more molecules beyond this limit.
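The percentage columns appear to use two different denominators: the "no tautomers" row is relative to all molecules, while the remaining rows are relative to the molecules that have at least one tautomer. The short check below reproduces the ring–chain AMS column under that reading. It is an inference from the numbers, not a statement from the text, and the function name `table_percentages` is hypothetical.

```python
# Counts from the ring-chain, 11 rules, AMS column of the table above.
RING_CHAIN_AMS = {
    "none": 5_297_864,
    "1": 101_890,
    "2-10": 304_245,
    "11-50": 37_267,
    "51-100": 3_905,
    "101-200": 7_078,
    "201-500": 3_017,
    "501-1000": 308,
}

def table_percentages(counts: dict) -> dict:
    """Recompute the % column, assuming two denominators (see lead-in)."""
    total = sum(counts.values())
    with_tautomers = total - counts["none"]
    result = {}
    for label, n in counts.items():
        # Assumption: only the first row is a share of all molecules.
        denominator = total if label == "none" else with_tautomers
        result[label] = round(100 * n / denominator, 2)
    return result

pct = table_percentages(RING_CHAIN_AMS)
total_molecules = sum(RING_CHAIN_AMS.values())
molecules_with_tautomers = total_molecules - RING_CHAIN_AMS["none"]
print(total_molecules, molecules_with_tautomers, pct)
```

For this column the two totals agree with the AMS row of the database comparison table (5 755 574 molecules, of which 457 710 are capable of ring–chain tautomerism).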
Frequency of Application of Ring–Chain Rules in the Systematic Generation of Ring–Chain Tautomers in AMS Database
| SMIRKS rule | count | % |
|---|---|---|
| 3-exo-trig | 65 435 | 0.31 |
| 4-exo-trig | 10 560 | 0.05 |
| 5-exo-trig | 7 506 722 | 35.09 |
| 6-exo-trig | 5 289 114 | 24.72 |
| 7-exo-trig | 4 185 292 | 19.56 |
| 5-exo-dig | 179 567 | 0.84 |
| 6-exo-dig | 472 074 | 2.21 |
| 7-exo-dig | 3 293 445 | 15.40 |
| 5-endo-trig | 169 007 | 0.79 |
| 6-endo-trig | 156 371 | 0.73 |
| 7-endo-trig | 65 239 | 0.30 |
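To see how the rule applications split across closure classes, the counts above can be grouped by the closure type and electrophile geometry encoded in each rule name. This is a small illustrative aggregation of the tabulated values only; the helper name `totals_by_class` is hypothetical.

```python
from collections import Counter

# Rule application counts copied from the table above.
RULE_COUNTS = {
    "3-exo-trig": 65_435,
    "4-exo-trig": 10_560,
    "5-exo-trig": 7_506_722,
    "6-exo-trig": 5_289_114,
    "7-exo-trig": 4_185_292,
    "5-exo-dig": 179_567,
    "6-exo-dig": 472_074,
    "7-exo-dig": 3_293_445,
    "5-endo-trig": 169_007,
    "6-endo-trig": 156_371,
    "7-endo-trig": 65_239,
}

def totals_by_class(counts: dict) -> Counter:
    """Sum counts per closure class, e.g. 'exo-trig', ignoring ring size."""
    totals = Counter()
    for rule, n in counts.items():
        _size, closure, geometry = rule.split("-")
        totals[f"{closure}-{geometry}"] += n
    return totals

class_totals = totals_by_class(RULE_COUNTS)
grand_total = sum(RULE_COUNTS.values())
for name, n in class_totals.most_common():
    print(f"{name}: {n} ({100 * n / grand_total:.2f}%)")
```

Grouped this way, exo-trig closures account for roughly four fifths of all rule applications, and endo-trig closures for under 2%.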
Comparison of Natural Product and Drug Databases for the Occurrence Rates of the Possibility of Ring–Chain and Prototropic Tautomerism
| database | database type | total number of molecules | molecules capable of ring–chain tautomerism (count) | molecules capable of ring–chain tautomerism (%) | molecules capable of prototropic tautomerism (count) | molecules capable of prototropic tautomerism (%) |
|---|---|---|---|---|---|---|
| CHMIS-C | natural products (herbal medicines) | 8572 | 1141 | 13.3 | 4319 | 50.4 |
| NCI-NP | natural products (various types) | 124 701 | 22 875 | 18.3 | 73 005 | 58.5 |
| TCM | natural products (traditional Chinese medicines) | 9127 | 1381 | 15.1 | 4875 | 53.4 |
| KEGG MEDICUS | drugs | 12 041 | 2180 | 18.1 | 7313 | 60.7 |
| AMS | screening samples and building blocks | 5 755 574 | 457 710 | 8.0 | 4 369 069 | 75.9 |
Figure 4. Enumeration of all predicted ring–chain tautomers of 4-amino-2-(benzylideneamino)-4-oxobutanoic acid (molecule 0). Blue numbers indicate ring-closed tautomers, and red numbers indicate ring-opened tautomers.
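An exhaustive enumeration of this kind can be sketched as a breadth-first closure over transform rules, stopped at the 1000-tautomer limit mentioned in the footnote to the distribution table. The sketch below is not the CACTVS implementation. The structures are toy strings in which each character marks one hypothetical site as open (`o`) or closed (`c`), and the names `enumerate_tautomers` and `make_toggle` are illustrative. Whether the starting structure counts toward the limit is also an assumption.

```python
from collections import deque

def enumerate_tautomers(start, transforms, limit=1000):
    """Apply every transform repeatedly until no new structure appears.

    Returns (structures, truncated). Assumption: the start structure
    counts toward the limit.
    """
    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for transform in transforms:
            for product in transform(current):
                if product in seen:
                    continue
                if len(seen) >= limit:
                    return seen, True  # cut off: limit reached
                seen.add(product)
                queue.append(product)
    return seen, False

def make_toggle(i):
    """Toy stand-in for a ring-chain rule: open or close site i."""
    def toggle(state):
        flipped = "c" if state[i] == "o" else "o"
        return [state[:i] + flipped + state[i + 1:]]
    return toggle

# Three independent sites give 2**3 = 8 structures, well under the limit.
small, small_cut = enumerate_tautomers("ooo", [make_toggle(i) for i in range(3)])

# Eleven sites would give 2**11 = 2048 structures, so the limit applies.
large, large_cut = enumerate_tautomers("o" * 11, [make_toggle(i) for i in range(11)])
print(len(small), small_cut, len(large), large_cut)
```

The toy case also shows why adding rules can move molecules out of the 501–1000 bin: each additional applicable transform multiplies the number of reachable structures, so more enumerations hit the cutoff.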