Andreza Barbosa Silva Cavalcanti, Renata Priscila Costa Barros, Vicente Carlos de Oliveira Costa, Marcelo Sobral da Silva, Josean Fechine Tavares, Luciana Scotti, Marcus Tullius Scotti.
Abstract
Lamiaceae is one of the largest families of angiosperms and is classified into 12 subfamilies comprising 295 genera and 7775 species. Its species commonly contain secondary metabolites such as diterpenes, some of which are known chemotaxonomic markers. The aim of this work was to construct a database of diterpenes and to use it to perform a chemotaxonomic analysis among the subfamilies of Lamiaceae, using molecular descriptors and self-organizing maps (SOMs). A total of 4115 different diterpenes, corresponding to 6386 botanical occurrences distributed across eight subfamilies, 66 genera, 639 species and 4880 geographical locations, were added to SistematX. Molecular descriptors of the diterpenes and their respective botanical occurrences were used to generate the SOMs. All maps obtained showed a match rate higher than 80%, demonstrating a separation of the Lamiaceae subfamilies and corroborating the morphological and molecular data proposed by Li et al. This chemotaxonomic study can therefore help predict the subfamily in which a diterpene occurs and assist in the search for secondary metabolites with specific structural characteristics, such as compounds with potential biological activity.
Keywords: Lamiaceae; SOMs; chemotaxonomic; database; diterpenes
Year: 2019; PMID: 31671588; PMCID: PMC6864738; DOI: 10.3390/molecules24213908
Source DB: PubMed; Journal: Molecules; ISSN: 1420-3049; Impact factor: 4.411
Figure 1. Phylogenetic diagram of Lamiaceae subfamilies (adapted from Li et al. [3]).
Lamiaceae subfamilies listed according to Li et al. [3]. Abbreviations, botanical data, number of diterpenes and chemical occurrences added and used in SistematX (https://sistematx.ufpb.br).
| Subfamily | Acronym | Genera | Species | Diterpenes | Occurrences |
|---|---|---|---|---|---|
| Ajugoideae | Aju | 7 | 99 | 580 | 856 |
| Callicarpoideae | Cal | 1 | 14 | 71 | 86 |
| Lamioideae | Lam | 23 | 188 | 601 | 1183 |
| Nepetoideae | Nep | 30 | 289 | 2433 | 3644 |
| Peronematoideae | Per | 1 | 1 | 7 | 7 |
| Premnoideae | Pre | 2 | 7 | 85 | 92 |
| Scutellarioideae | Scu | 1 | 31 | 286 | 342 |
| Viticoideae | Vit | 1 | 10 | 130 | 169 |
| Total | | 66 | 639 | 4115 | 6379 |
Results of the self-organizing map with the values of the occurrences and the number of correct hits for clade III (Nep) and clade IV (Aju, Lam and Scu) of the family Lamiaceae, using the descriptors generated by the program DRAGON 7.0 [24].
| Subfamily | Diterpenes | Occurrences | No. of Hits (Molecular Descriptors) | % of Hits (Molecular Descriptors) | No. of Hits (Fingerprint) | % of Hits (Fingerprint) |
|---|---|---|---|---|---|---|
| Clade III | 2433 | 3644 | 3252 | 89.2 | 3366 | 92.3 |
| Clade IV | 1453 | 2381 | 1948 | 81.8 | 2031 | 85.3 |
| Total | 3886 | 6025 | 5200 | 86.3 | 5397 | 89.5 |
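The "% of Hits" columns can be recomputed from the hit and occurrence counts as hits / occurrences × 100. The short Python check below is only an illustration of that arithmetic using the values printed in the table above; the helper name `hit_rate` is hypothetical. The recomputed values agree with the published ones to within 0.1 percentage point, and the last digit can differ depending on how the value was rounded.

```python
# (label, hits, occurrences, published % of hits), copied from the table above
ROWS = [
    ("Clade III, molecular descriptors", 3252, 3644, 89.2),
    ("Clade III, fingerprint", 3366, 3644, 92.3),
    ("Clade IV, molecular descriptors", 1948, 2381, 81.8),
    ("Clade IV, fingerprint", 2031, 2381, 85.3),
    ("Total, molecular descriptors", 5200, 6025, 86.3),
    ("Total, fingerprint", 5397, 6025, 89.5),
]


def hit_rate(hits, occurrences):
    """Percentage of occurrences assigned to the correct group."""
    return 100.0 * hits / occurrences


for label, hits, occurrences, published in ROWS:
    recomputed = hit_rate(hits, occurrences)
    print(f"{label}: {recomputed:.2f}% (published {published}%)")
```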
Figure 2. Self-organizing map obtained by classification of the subfamilies of clade III (red) and clade IV (lilac) and generated descriptors: (a) SOM → molecular descriptors; U-matrix; O-056, nArOH and NRS. (b) SOM → fingerprint and U-matrix.
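To make the idea of a SOM "match rate" concrete, the following is a minimal pure-Python sketch, not the software or settings used in the study. It trains a small map on synthetic two-class data, labels each neuron with the majority class of the samples mapped to it, and counts a sample as a hit when its class equals the label of its best-matching neuron. The grid size, learning rate, neighbourhood width, toy data and all function names are assumptions made for illustration only.

```python
import math
import random
from collections import Counter, defaultdict


def best_matching_unit(weights, x):
    """Grid position of the neuron whose weight vector is closest to x."""
    return min(weights, key=lambda node: sum((a - b) ** 2
                                             for a, b in zip(weights[node], x)))


def train_som(samples, rows=4, cols=4, epochs=20, lr0=0.5, sigma0=2.0, seed=0):
    """Train a rows x cols SOM with a Gaussian neighbourhood (assumed settings)."""
    rng = random.Random(seed)
    dim = len(samples[0])
    # Initialise every neuron from a randomly chosen sample.
    weights = {(r, c): list(rng.choice(samples))
               for r in range(rows) for c in range(cols)}
    total_steps = epochs * len(samples)
    step = 0
    for _ in range(epochs):
        order = samples[:]
        rng.shuffle(order)
        for x in order:
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)                  # decaying learning rate
            sigma = max(sigma0 * (1.0 - frac), 0.5)  # shrinking neighbourhood
            bmu = best_matching_unit(weights, x)
            for node, w in weights.items():
                grid_d2 = (node[0] - bmu[0]) ** 2 + (node[1] - bmu[1]) ** 2
                h = math.exp(-grid_d2 / (2.0 * sigma * sigma))
                for i in range(dim):
                    w[i] += lr * h * (x[i] - w[i])
            step += 1
    return weights


def match_rate(weights, samples, labels):
    """Percentage of samples whose class equals the majority class of their neuron."""
    bmus = [best_matching_unit(weights, x) for x in samples]
    by_node = defaultdict(Counter)
    for node, label in zip(bmus, labels):
        by_node[node][label] += 1
    node_label = {node: counts.most_common(1)[0][0]
                  for node, counts in by_node.items()}
    hits = sum(node_label[node] == label for node, label in zip(bmus, labels))
    return 100.0 * hits / len(samples)


# Synthetic stand-in for descriptor vectors of two groups (not real data).
rng = random.Random(1)
samples, labels = [], []
for label, centre in (("Clade III", 0.0), ("Clade IV", 3.0)):
    for _ in range(60):
        samples.append([rng.gauss(centre, 1.0) for _ in range(3)])
        labels.append(label)

som = train_som(samples)
print(f"match rate: {match_rate(som, samples, labels):.1f}%")
```

Because each neuron takes the majority class of its own samples, the match rate on the data used for labelling can never fall below 50% for two classes; this is one reason a held-out test set is reported alongside the training match.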
Summary of results of training and test match (%) of clade III (Nep) and clade IV (Aju + Lam + Scu).
| Training | Fold 1 | Fold 2 | Fold 3 | Fold 4 | Fold 5 | Average |
|---|---|---|---|---|---|---|
| Clade III | 91.4 | 87.1 | 89.5 | 88.5 | 89.4 | 88.6 |
| Clade IV | 79.9 | 84.8 | 80.1 | 83.2 | 82.3 | 82.1 |
| Total | 86.9 | 86.2 | 85.8 | 86.4 | 86.6 | 86.4 |

| Test | Fold 1 | Fold 2 | Fold 3 | Fold 4 | Fold 5 | Average |
|---|---|---|---|---|---|---|
| Clade III | 89.4 | 87.7 | 89.8 | 87.5 | 87.2 | 88.3 |
| Clade IV | 76.9 | 84.2 | 84.2 | 78.6 | 81.1 | 81.0 |
| Total | 84.5 | 86.3 | 87.6 | 84.0 | 84.8 | 85.4 |
Summary of test match (%) corresponding to the results obtained from 5-fold models using self-organizing map (SOM), support vector machine (SVM) and k-nearest neighbors (k-NN) algorithms for clade III (Nep) and clade IV (Aju + Lam + Scu).
| Subfamily | SOM Average | SOM fingerprint Average | SVM Average | k-NN Average |
|---|---|---|---|---|
| Clade III | 88.3 | 88.1 | 92.2 | 96.2 |
| Clade IV | 81.0 | 80.0 | 88.9 | 92.2 |
| Total | 85.4 | 89.5 | 90.9 | 94.6 |
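As a rough illustration of how a 5-fold test match can be obtained for a classifier such as k-NN, the sketch below splits a synthetic data set into five folds, predicts each held-out fold from the other four, and reports the per-fold percentage of correct assignments. It is not the authors' pipeline: the toy data, the choice of k = 3, the unstratified split and all function names are assumptions made for this example.

```python
import random
from collections import Counter


def make_toy_data(n_per_class=50, seed=0):
    """Synthetic 4-dimensional 'descriptor' vectors for two groups (not real data)."""
    rng = random.Random(seed)
    data = []
    for label, centre in (("Clade III", 0.0), ("Clade IV", 3.0)):
        for _ in range(n_per_class):
            data.append(([rng.gauss(centre, 1.0) for _ in range(4)], label))
    return data


def knn_predict(train, x, k=3):
    """Majority vote among the k training samples closest to x (Euclidean)."""
    nearest = sorted(
        train,
        key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], x)),
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def five_fold_match(data, n_folds=5, k=3, seed=0):
    """Per-fold test match (%) using a simple shuffled, unstratified split."""
    rng = random.Random(seed)
    shuffled = data[:]
    rng.shuffle(shuffled)
    folds = [shuffled[i::n_folds] for i in range(n_folds)]
    rates = []
    for i, test in enumerate(folds):
        train = [item for j, fold in enumerate(folds) if j != i for item in fold]
        hits = sum(knn_predict(train, x, k) == label for x, label in test)
        rates.append(100.0 * hits / len(test))
    return rates


rates = five_fold_match(make_toy_data())
print("per-fold test match (%):", [round(r, 1) for r in rates])
print("average:", round(sum(rates) / len(rates), 1))
```

The "Average" columns in the table above correspond to the mean of five such per-fold values for each algorithm.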
Figure 3. Chemical structures of the diterpenes located in the SOM of clade III (Nep) and clade IV (Aju, Lam, Scu) and their respective botanical occurrences.
Results of the self-organizing maps with the occurrence values and the number of correct hits for the subfamilies belonging to clade IV (Aju, Lam and Scu), using the descriptors generated by the Dragon 7.0 program.
| Subfamily | Diterpenes | Occurrences | No. of Hits (Molecular Descriptors) | % of Hits (Molecular Descriptors) | No. of Hits (Fingerprint) | % of Hits (Fingerprint) |
|---|---|---|---|---|---|---|
| Aju | 580 | 856 | 776 | 90.7 | 742 | 86.6 |
| Lam | 601 | 1183 | 1122 | 94.8 | 1122 | 94.8 |
| Scu | 286 | 342 | 278 | 81.3 | 320 | 93.5 |
| Total | 1467 | 2381 | 2176 | 91.4 | 2184 | 91.7 |
Figure 4. Self-organizing map obtained by the classification of the subfamilies Aju (light blue), Lam (green) and Scu (dark blue) and generated descriptors: (a) SOM → molecular descriptors; U-matrix; nArCOOR, nR=Cp and nFuranes. (b) SOM → fingerprint and U-matrix.
Summary of the results of training and test match (%) of Aju, Lam and Scu.
| Training | Fold 1 | Fold 2 | Fold 3 | Fold 4 | Fold 5 | Average |
|---|---|---|---|---|---|---|
| Aju | 86.6 | 90.2 | 92.8 | 88.6 | 87.6 | 89.2 |
| Lam | 96.4 | 96.2 | 95.7 | 94.6 | 96.5 | 95.9 |
| Scu | 78.4 | 75.9 | 67.2 | 76.6 | 82.8 | 76.2 |
| Total | 91.4 | 91.1 | 90.6 | 89.9 | 91.3 | 90.9 |

| Test | Fold 1 | Fold 2 | Fold 3 | Fold 4 | Fold 5 | Average |
|---|---|---|---|---|---|---|
| Aju | 93.0 | 90.1 | 89.5 | 90.6 | 82.5 | 89.1 |
| Lam | 93.6 | 91.1 | 96.6 | 95.8 | 93.2 | 94.1 |
| Scu | 63.8 | 67.6 | 54.4 | 70.6 | 84.1 | 68.1 |
| Total | 89.1 | 87.4 | 88.0 | 90.3 | 88.0 | 88.6 |
Summary of test match (%) corresponding to the results obtained from 5-fold models using SOM, SVM and k-NN algorithms for the subfamilies Aju, Lam and Scu.
| Subfamily | SOM Average | SOM Fingerprint Average | SVM Average | k-NN Average |
|---|---|---|---|---|
| Aju | 89.1 | 83.6 | 94.6 | 95.3 |
| Lam | 94.1 | 87.3 | 97.0 | 97.6 |
| Scu | 68.1 | 95.8 | 87.4 | 88.0 |
| Total | 88.6 | 92.7 | 94.7 | 95.4 |
Figure 5. Chemical structures of diterpenes located in SOM and their respective botanical occurrences.
Summary of the five different training and test sets related to SOM obtained with diterpenes from clades III (Nep) and IV (Aju, Lam and Scu).
| Clade | Train Set (n) | Train Set (% of Total) | Test Set (n) | Test Set (% of Total) | Total |
|---|---|---|---|---|---|
| Clade III | 2915 | 80 | 728 | 20 | 3643 |
| Clade IV | 1905 | 79.9 | 477 | 20.1 | 2382 |
Summary of the five different sets of training and test related to SOM obtained with diterpenes only from Clade IV (Aju, Lam and Scu).
| Subfamily | Train Set (n) | Train Set (% of Total) | Test Set (n) | Test Set (% of Total) | Total |
|---|---|---|---|---|---|
| Aju | 684 | 79.9 | 172 | 20.1 | 856 |
| Lam | 947 | 80 | 236 | 20 | 1183 |
| Scu | 273 | 79.8 | 69 | 20.2 | 342 |
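The train and test sizes in the two tables above are consistent with splitting each group into five folds whose sizes differ by at most one, with one fold held out as the test set (roughly 80% / 20%). The sketch below illustrates that arithmetic only; the helper name `fold_sizes` is hypothetical and the actual partitioning procedure used in the study is not described here.

```python
def fold_sizes(n_samples, n_folds=5):
    """Sizes of n_folds near-equal folds: the first (n_samples % n_folds) folds get one extra."""
    base, extra = divmod(n_samples, n_folds)
    return [base + 1] * extra + [base] * (n_folds - extra)


# (group, total, published test-set size), copied from the tables above
GROUPS = [
    ("Clade III", 3643, 728),
    ("Clade IV", 2382, 477),
    ("Aju", 856, 172),
    ("Lam", 1183, 236),
    ("Scu", 342, 69),
]

for name, total, test_size in GROUPS:
    sizes = fold_sizes(total)
    share = 100.0 * test_size / total
    print(f"{name}: folds {sizes}, published test set {test_size} ({share:.1f}% of {total})")
```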