Miki H Maeda, Tomoki Yonezawa, Tomomi Komaba.
Abstract
We have developed a three-dimensional structure database of natural metabolites (3DMET). Early development of the 3DMET database relied on content auto-generated from the 2D structures in other chemical databases. From 2009, we began manual curation, obtaining new compounds from published works. In the process of curation, problems in digitizing 3D structures from the structure drawings in documents accumulated. As with auto-generation, the stereochemistry of structure drawings requires careful attention. Our experiences in manual curation of 3DMET, as described herein, may be useful to others in this field of research and for the development of supporting systems for chemical structure databases. Manual curation is still necessary for proper database entry of the 3D configurations of chiral atoms, a problem encountered frequently among natural products.
Keywords: absolute configuration; chemical curation; chirality; molecular docking; natural products
Year: 2018 PMID: 29892514 PMCID: PMC5992871 DOI: 10.2142/biophysico.15.0_87
Source DB: PubMed Journal: Biophys Physicobiol ISSN: 2189-4779
Figure 1. Stereoisomers of olean: a, (R)-olean; b, (S)-olean.
Table 1. Curation issues that required detailed confirmation from the source publication
| Issue | Compounds | Previously reported | Newly reported |
|---|---|---|---|
| Compounds requiring confirmation | 2,507 (100%) | | |
| unclear drawing of defined chiral atoms | 1,579 (63.02%) | 778 | 801 |
| lacked compound name | 436 (17.39%) | 269 | 167 |
| correct name but wrong structure | 93 (3.71%) | 24 | 69 |
| inverted drawing of sugar | 77 (3.07%) | 33 | 44 |
| wrong name but correct structure | 43 (1.72%) | 13 | 30 |
| uncorresponding chiral definition | 52 (2.07%) | 1 | 51 |
| to NMR spectrum | 10 (0.40%) | 1 | 9 |
| to Mass spectrum | 3 (0.12%) | 0 | 3 |
| to Circular dichroism | 2 (0.08%) | 0 | 2 |
| to X-ray analysis | 1 (0.04%) | 0 | 1 |
| misspelling of name | 43 (1.72%) | 21 | 22 |
| different structure from previous report | 139 (5.54%) | 135 | 4 |
| inconsistent names within article | 11 (0.44%) | 2 | 9 |
| uncorresponding name and drawing | 15 (0.60%) | 15 | 0 |
| distortion in generated 3D-structure | 2 (0.08%) | 0 | 2 |
These data were determined from 13,881 natural compounds with newly defined structures, reported in the articles listed in Natural Product Updates from 2010 to 2011. The values in the columns are the numbers of compounds with the stated inaccuracy. Numbers in parentheses are percentages of all compounds requiring confirmation.
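The percentages in parentheses can be recomputed from the counts. The following is a minimal sketch, assuming each rate is simply the compound count divided by the 2,507 compounds requiring confirmation and rounded to two decimals; the names `issues` and `rate_percent` are illustrative and not part of the original work, and only three rows are copied for brevity.

```python
# Counts copied from three rows of the table above.
# 2,507 is the number of compounds requiring confirmation.
TOTAL_REQUIRING_CONFIRMATION = 2507

# issue -> (compounds, previously reported, newly reported)
issues = {
    "lacked compound name": (436, 269, 167),
    "correct name but wrong structure": (93, 24, 69),
    "inverted drawing of sugar": (77, 33, 44),
}


def rate_percent(count, total=TOTAL_REQUIRING_CONFIRMATION):
    """Return count as a percentage of total, rounded to two decimals."""
    return round(100 * count / total, 2)


for name, (count, previous, new) in issues.items():
    # Each compound count splits into previously and newly reported compounds.
    assert previous + new == count
    print(f"{name}: {count} ({rate_percent(count):.2f}%)")
```

The same check applies row by row to the rest of the table: the two right-hand columns should add up to the compound count.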
Table 2. Journals investigated as literature resources about structure definition of natural products
| Journal | Articles | New compounds (total) | New compounds (with chiral center) |
|---|---|---|---|
| J. Nat. Prod. | 173 | 892 | 721 |
| Fitoterapia | 96 | 288 | 228 |
| Nat. Prod. Commun. | 83 | 150 | 98 |
| Tetrahedron Lett. | 71 | 162 | 147 |
| Helv. Chim. Acta | 70 | 206 | 175 |
| Org. Lett. | 62 | 126 | 112 |
| Bioorg. Med. Chem. Lett. | 59 | 169 | 130 |
| Tetrahedron | 38 | 162 | 141 |
| Chem. Pharm. Bull. | 34 | 146 | 105 |
| Chem. Biodiversity | 28 | 92 | 75 |
| Heterocycles | 22 | 52 | 39 |
| Biosci. Biotechnol. Biochem. | 18 | 35 | 25 |
| Eur. J. Org. Chem. | 17 | 101 | 97 |
| Org. Biomol. Chem. | 6 | 38 | 35 |
| RSC Adv. | 5 | 14 | 14 |
| Chem. Eur. J. | 4 | 6 | 6 |
| Nat. Prod. Res. | 4 | 5 | 4 |
| J. Med. Chem. | 3 | 3 | 3 |
| Nat. Prod. Sci., Korea | 3 | 3 | 3 |
| Angew. Chem., Int. Ed. | 2 | 9 | 9 |
| Bull. Chem. Soc. Jpn. | 2 | 4 | 4 |
| Phytother. Res. | 2 | 3 | 1 |
| Chem. Commun. | 1 | 1 | 1 |
| Chem. Lett. | 1 | 1 | 1 |
| J. Am. Chem. Soc. | 1 | 1 | 1 |
| Z. Naturforsch., B: Chem. Sci. | 1 | 2 | 0 |
| Total | 806 | 2671 | 2175 |
The articles were selected from the database Natural Product Updates for 2013.
Figure 2. Defined chiral atom configurations of a molecule, shown as a Venn diagram. An overlap of two configurations means that a molecule includes parts defined by both configurations.
Table 3. Success rate of defining absolute configuration by each analytical method
| Method | Compounds completely defined | All compounds analyzed | Rate (%) |
|---|---|---|---|
| X-ray | 158 | 171 | 92.4 |
| NMR: NOESY/ROESY | 1298 | 1625 | 79.9 |
| NMR: Mosher’s method | 110 | 126 | 87.3 |
| Circular dichroism (CD) | 486 | 523 | 92.9 |
| Optical rotatory dispersion (ORD) | 193 | 201 | 96.0 |
NMR, nuclear magnetic resonance; NOESY, nuclear Overhauser effect spectroscopy; and ROESY, rotating frame nuclear Overhauser effect spectroscopy. The number of compounds completely defined by each method and the number of all compounds analyzed by the method are listed. Multiple methods could be applied to one compound (refer to Table 4).
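The rate column follows from the two count columns. Below is a short sketch, assuming the rate is the number of completely defined compounds divided by all compounds analyzed by that method, rounded to one decimal; `methods` and `success_rate` are illustrative names introduced here.

```python
# (completely defined, all analyzed) per method, copied from the table above.
methods = {
    "X-ray": (158, 171),
    "NMR: NOESY/ROESY": (1298, 1625),
    "NMR: Mosher's method": (110, 126),
    "Circular dichroism (CD)": (486, 523),
    "Optical rotatory dispersion (ORD)": (193, 201),
}


def success_rate(completely_defined, all_analyzed):
    """Percentage of analyzed compounds whose configuration was completely defined."""
    return round(100 * completely_defined / all_analyzed, 1)


rates = {name: success_rate(*counts) for name, counts in methods.items()}
for name, rate in rates.items():
    print(f"{name}: {rate:.1f}%")
```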
Table 4. Combination of analytical methods to determine absolute configuration
| X-ray | NMR: NOESY or ROESY | NMR: Mosher’s method | CD | ORD | Number |
|---|---|---|---|---|---|
| ✓ | | | | | 40 |
| ✓ | ✓ | | | | 79 |
| ✓ | ✓ | ✓ | | | 1 |
| ✓ | ✓ | | ✓ | | 20 |
| ✓ | ✓ | | | ✓ | 5 |
| ✓ | | ✓ | | | 1 |
| ✓ | | | ✓ | | 6 |
| ✓ | | | | ✓ | 3 |
| ✓ | | | ✓ | ✓ | 3 |
| | ✓ | | | | 708 |
| | ✓ | ✓ | | | 57 |
| | ✓ | | ✓ | | 301 |
| | ✓ | | | ✓ | 71 |
| | ✓ | ✓ | ✓ | | 31 |
| | ✓ | | ✓ | ✓ | 24 |
| | ✓ | ✓ | | ✓ | 1 |
| | | ✓ | | | 19 |
| | | | ✓ | | 81 |
| | | | | ✓ | 66 |
| | | | ✓ | ✓ | 20 |
| Total | | | | | 1537 |
All reported compounds were analyzed by 1D-NMR and MS. "Number" indicates the number of compounds defined by the one or more methods checked in that row. Abbreviations of the methods are the same as in Table 3. Determinations by minor techniques are not listed here, and the total for each method differs from the number “defined as absolute configuration” in Figure 2.
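Per-method totals can be derived from the combination rows by adding each row's number to every method checked in it. The sketch below illustrates this. The check-mark placement it uses is an assumption: it was chosen so that the per-method sums agree with the "completely defined" counts in Table 3, and it should be verified against the published table. The names `combinations` and `method_totals` are illustrative.

```python
from collections import Counter

X, N, M, CD, ORD = "X-ray", "NOESY/ROESY", "Mosher", "CD", "ORD"

# (methods checked in the row, number of compounds).
# ASSUMPTION: column placement of the check marks is inferred, not stated in the notes.
combinations = [
    ({X}, 40), ({X, N}, 79), ({X, N, M}, 1), ({X, N, CD}, 20),
    ({X, N, ORD}, 5), ({X, M}, 1), ({X, CD}, 6), ({X, ORD}, 3),
    ({X, CD, ORD}, 3), ({N}, 708), ({N, M}, 57), ({N, CD}, 301),
    ({N, ORD}, 71), ({N, M, CD}, 31), ({N, CD, ORD}, 24),
    ({N, M, ORD}, 1), ({M}, 19), ({CD}, 81), ({ORD}, 66),
    ({CD, ORD}, 20),
]

# A compound counts toward every method that contributed to its definition.
method_totals = Counter()
for methods_used, number in combinations:
    for method in methods_used:
        method_totals[method] += number

# Each compound appears in exactly one row, so the row numbers sum to the total.
grand_total = sum(number for _, number in combinations)

print(grand_total)
for method in (X, N, M, CD, ORD):
    print(method, method_totals[method])
```

Because one compound can be defined by several methods, the per-method totals add up to more than the 1537 compounds in the table.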