Andrés M Castillo, Andrés Bernal, Reiner Dieden, Luc Patiny, Julien Wist.
Abstract
BACKGROUND: We present "Ask Ernö", a self-learning system for the automatic analysis of NMR spectra, consisting of integrated chemical shift assignment and prediction tools. The output of the automatic assignment component initializes and improves a database of assigned protons that is used by the chemical shift predictor. In turn, the predictions provided by the latter facilitate improvement of the assignment process. Iteration on these steps allows Ask Ernö to improve its ability to assign and predict spectra without any prior knowledge or assistance from human experts.
Keywords: Automatic assignment; Chemical shift prediction; HOSE codes; Machine learning; Nuclear magnetic resonance; Peak-picking
Year: 2016 PMID: 27158267 PMCID: PMC4858875 DOI: 10.1186/s13321-016-0134-6
Source DB: PubMed Journal: J Cheminform ISSN: 1758-2946 Impact factor: 5.514
Fig. 1 The logic behind Ask Ernö. The automatic assignment of nuclei to their signals (right) produces entries to a database (middle) for chemical shift prediction (left). Predicted chemical shifts in turn provide further restrictions for assignment. Ask Ernö is trained by repeatedly looping on this assignment-prediction cycle
Fig. 2 n-spheres with radius 1–5 of a proton assigned to a chemical shift of 7.394 ppm. Dotted curves indicate aromatic bonds
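The n-spheres of Fig. 2 can be read as layers of atoms at increasing bond distance from the proton. The following is a minimal sketch of that layering, assuming the molecule is stored as a plain adjacency dictionary. The function name `spheres`, the atom labels and the toy graph are hypothetical and not taken from Ask Ernö, and the sketch ignores the bond orders, aromaticity and ring closures that HOSE codes also encode.

```python
def spheres(graph, start, max_radius):
    """Return the atoms found at bond distance 1..max_radius from `start`.

    `graph` maps each atom label to the labels of its bonded neighbours.
    This is an illustrative breadth-first layering, not a HOSE code generator.
    """
    seen = {start}
    frontier = [start]
    layers = []
    for _ in range(max_radius):
        next_frontier = []
        for atom in frontier:
            for neighbour in graph[atom]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    next_frontier.append(neighbour)
        # One layer per radius; a layer is empty once the molecule is exhausted.
        layers.append(sorted(next_frontier))
        frontier = next_frontier
    return layers


# Hypothetical toy fragment: H1-C1-C2-O1 (other hydrogens omitted).
toy_graph = {
    "H1": ["C1"],
    "C1": ["H1", "C2"],
    "C2": ["C1", "O1"],
    "O1": ["C2"],
}

layers = spheres(toy_graph, "H1", 5)
```

In this toy fragment the layers for radius 4 and 5 are empty, because every atom is reached within three bonds of the proton.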
Results of the automatic assignment of a 5-proton molecule performed based on integrals exclusively
| Solution | Proton a (ppm) | Proton b (ppm) | Proton c (ppm) | Proton d (ppm) | Proton e (ppm) |
|---|---|---|---|---|---|
| 1 | 1.30 | 2.52 | 4.16 | 7.47 | 8.27 |
| 2 | 2.52 | 1.30 | 4.16 | 7.47 | 8.27 |
| 3 | 1.30 | 2.52 | 4.16 | 8.27 | 7.47 |
| 4 | 2.52 | 1.30 | 4.16 | 8.27 | 7.47 |
Although the 4 possible solutions leave the assignment ambiguous, all of them assign proton c to the peak at 4.16 ppm. This nucleus-chemical shift pair is therefore deemed correct and selected to be learnt
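The selection rule described here, keeping only the pairs shared by every candidate solution, can be sketched as a set intersection. This is an illustration under stated assumptions rather than the authors' implementation: the function name `consensus_pairs` is hypothetical, and the proton labels a to e are assumed to follow the column order of the table.

```python
# Candidate solutions from the table: proton label -> chemical shift (ppm).
# Labels a-e are an assumption based on the column order of the table.
solutions = [
    {"a": 1.30, "b": 2.52, "c": 4.16, "d": 7.47, "e": 8.27},
    {"a": 2.52, "b": 1.30, "c": 4.16, "d": 7.47, "e": 8.27},
    {"a": 1.30, "b": 2.52, "c": 4.16, "d": 8.27, "e": 7.47},
    {"a": 2.52, "b": 1.30, "c": 4.16, "d": 8.27, "e": 7.47},
]


def consensus_pairs(candidates):
    """Return the proton -> shift pairs present in every candidate solution."""
    if not candidates:
        return {}
    common = set(candidates[0].items())
    for candidate in candidates[1:]:
        common &= set(candidate.items())
    return dict(common)


learnt = consensus_pairs(solutions)
print(learnt)  # {'c': 4.16}
```

Only the pair for proton c survives the intersection, since the other four protons each take two different shifts across the solutions.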
Fig. 3 Correlation between observed and predicted chemical shift values for the test molecules at each iteration of the training loop
Fig. 4 Evolution during the training loop of prediction error (top), prediction uncertainty (middle) and fraction of predicted chemical shifts (bottom) for the test molecules
Fig. 5 Evolution of the cumulative error distributions during training. The fraction of predictions is given relative to the total number of protons in the test set for which a chemical shift can be predicted (2007 protons in total). To generate these curves, the set of chemical shift prediction errors was split into 100 bins of 0.01 ppm, plus a last bin containing predictions with an error equal to or greater than 1 ppm. Because this last bin is wider than the others, the curves rise abruptly at their end
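The binning described in this caption can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `cumulative_error_fraction` and its parameters are hypothetical, and the error values in the usage example are made up for demonstration only.

```python
from itertools import accumulate


def cumulative_error_fraction(errors, total, bin_width=0.01, n_bins=100):
    """Cumulative fraction of predictions per error bin.

    Bins 0..n_bins-1 each span `bin_width` ppm; bin n_bins collects every
    error equal to or greater than n_bins * bin_width (1 ppm by default).
    `total` is the denominator, e.g. the number of predictable protons.
    """
    counts = [0] * (n_bins + 1)
    for error in errors:
        index = min(int(abs(error) / bin_width), n_bins)
        counts[index] += 1
    return [count / total for count in accumulate(counts)]


# Made-up absolute errors in ppm, for illustration only.
curve = cumulative_error_fraction([0.005, 0.015, 0.255, 1.5], total=4)
```

The returned list has 101 entries. Because the last entry absorbs all errors of 1 ppm or more, the curve can jump at its final point, which matches the sudden increase mentioned in the caption. Errors that fall exactly on a bin edge may land in the neighbouring bin because of floating-point division.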
Fig. 6 Correlation between observed and predicted chemical shift values after learning for different sphere radii (iteration 9)