
Molecular Design Learned from the Natural Product Porphyra-334: Molecular Generation via Chemical Variational Autoencoder versus Database Mining via Similarity Search, A Comparative Study.

Yuki Harada¹, Makoto Hatakeyama¹,², Shuichi Maeda¹, Qi Gao³, Kenichi Koizumi¹, Yuki Sakamoto¹, Yuuki Ono³, Shinichiro Nakamura¹.

Abstract

A comparative study is presented. A method based on a chemical variational autoencoder (VAE) and a method based on similarity search are compared with respect to their ability to generate candidates for new functional molecular design. Taking the natural product porphyra-334 as a model molecule, we prepared three groups: mycosporine-like amino acid (MAA) molecules used as seeds (G_SEEDS), molecules generated via chemical VAE (G_VAE), and molecules gathered via similarity search (G_SIM). The numbers of molecules that satisfy the condition for the light absorption ability of porphyra-334 in G_SEEDS, G_VAE, and G_SIM are 52, 138, and 6, respectively. The chemical VAE method shows promising potential for future molecular design. By using quantum chemistry wave function properties together with chemical VAE, we find new molecules that are comparable to porphyra-334, including some with unexpected geometries. At the end, we show a group of molecules found with this method.

Entities: Chemical

Year: 2022 PMID: 35309498 PMCID: PMC8928499 DOI: 10.1021/acsomega.1c06453

Source DB: PubMed Journal: ACS Omega ISSN: 2470-1343

Introduction

UV radiation (UVR) has become a subject of environmental and green chemistry because of the thinning of the ozone layer, which hinders the transmission of UVR from the sun to the Earth's surface. Sunlight is the primary energy source of living organisms; however, UVR damages human skin and may cause skin cancers. Therefore, the development of efficient sunscreens without side effects is necessary. Porphyra-334 is a UV-resistant molecule found in nature. Mycosporine-like amino acids (MAAs), including porphyra-334, are chemicals that prevent UVR-induced damage. They have attracted attention because of their strong anti-UV effect.[1−4] We previously reported a study of the molecular-level mechanism of energy transformation from sunlight to heat in porphyra-334 using first-principles molecular dynamics simulations and quantum chemistry.[5,6] It revealed that UV-excited porphyra-334 releases its kinetic energy via vibrational modes to the surrounding water molecules. The structure of porphyra-334, which contains many hydrophilic functional groups, favors effective hydrogen bond formation with surrounding water molecules. Thus, the vibrational modes of the water molecules absorb the energy from the excited molecule. That study provided an interpretation of the excellence of a natural molecule, namely porphyra-334. An ambitious extension in molecular science is the design of such molecules. Therefore, we explore a design principle in an attempt to advance toward the natural products.

The design and selection of environmentally friendly and harmless materials and molecules are critical to establishing a sustainable society. They are mandatory for the development of functional molecules, drugs, and a wide range of materials. Achieving sustainable conditions in fact requires many expensive experiments. However, considering the time and cost to society, we must provide, in parallel, computational support for the design and selection of these molecules and materials. Historically, the methodology has been based on the analogy of geometrical appearances (shapes) of molecules and materials, starting from a lead molecule that is found more or less by chance. If such a methodology were sufficient, we would not be suffering from the current environmental problems.

Chemical space consists of the union of compounds. While the number of all feasible compounds is extremely high, estimated to be 10^60 possible structures, only a small fraction can be processed and analyzed at the same time.[7] Exploring the new horizon of chemical space is a challenge for cheminformatics and computational molecular design. An alternative approach that does not depend on the appearance or similarity of molecular shapes is necessary. We conducted a comparative study to find search criteria other than shapes and appearances, and the results are reported here. One of the hopeful design approaches is learning from nature-made molecules such as porphyra-334. Porphyra-334, a molecule that survived the long process of evolution, is considered to be the goal of UV-resistive natural products. We compare two approaches, one based on shapes and appearances and the other based on something different, as a clue to reaching this goal. In effect, we are comparing different processes of lead optimization.

We have carried out a comparison of the molecules generated via chemical variational autoencoder (ChemVAE)[8] versus the molecules gathered via Similarity Search (SimSearch).[9] Chemical VAE is a promising, recently proposed approach based on machine learning. It provides great opportunities to generate new molecules and to explore search methods in chemical space. In contrast, similarity search is a powerful, conventionally applied method. Note that Winter et al. proposed the application of chemical VAE in drug discovery,[10] and Gao et al. reported the applicability of chemical VAE to the generation of novel alternative drug candidates for eight existing market drugs.[11] We compare lead-optimization processes starting from the natural product porphyra-334. The group obtained via SimSearch is based on fingerprints from a chemical database. This cheminformatics method is a conventional search within an existing chemical space. The molecular generation via ChemVAE is based on machine learning structural recognition; it transforms the input data from SMILES into a vector representation. There is no need to manually specify mutation rules. As a result, unexpected jumps (to desired properties) in chemical space are possible. In the future, gradient-based optimization will be performed in combination with Bayesian statistics.[8]

Figure 1 presents the scheme of the current study. The design approaches begin from the seeds, which are derivatives of the molecule porphyra-334; hereafter, we will refer to them as G_SEEDS (in green in Figure 1). The first molecular group was gathered via SimSearch, and the second was generated via ChemVAE; hereafter, we will call them G_SIM (in blue) and G_VAE (in orange), respectively.

Figure 1

Scheme of the comparative study.

For each group of molecules, we obtained SMILES data; 3D MOL data, that is, (x, y, z) coordinates; and properties from quantum chemical calculations. The data for each molecule are represented as vector elements. The following three data mapping methods were then compared for G_SEEDS, G_VAE, and G_SIM: (I) a machine learning (ML)-based comparison, (II) the cheminformatic comparison from 3D MOL, and (III) the quantum chemistry properties comparison from DFT calculations (see right in Figure 1, light blue).

In this paper, we show that the new lead-optimization process via ChemVAE produces promising results, especially in connection with quantum chemical calculations. We believe that the current study provides an example of machine learning applications in the search for desired molecules in the vast chemical space.

Materials and Methods

Preparation of Three Molecular Groups

The Seed Structures (370) from MAA Molecules (19) (G_SEEDS)

A variety of UV-absorbing molecules, termed mycosporine-like amino acids (MAAs), have been reviewed by several researchers.[1−3,12−14] The MAAs from marine organisms are imine derivatives of mycosporines, as shown in Figure 2a. The MAA motif contains an aminocyclohexenimine ring linked to an amino acid, an amino alcohol, or an amino group; it absorbs UV light from 320 to 362 nm[12] and shows photoprotective and antioxidant functions.

Figure 2

(a) Structures of 19 natural MAAs molecules. (b) Examples of protonated MAA motifs of porphyra-334 (see the SI for others).

As an extension of our previous study on porphyra-334,[6] we study here the same family of molecules with stable structures. Taking into account that these molecules are ubiquitous photosensitive components of marine algae in a liquid water environment, we systematically and exhaustively enumerated all possible structural isomers and tautomers that can exist in the aqueous phase. Thus, from the 19 molecules shown in Figure 2a, treated as porphyra-334 derivatives, 370 seed structures (G_SEEDS) were generated on account of the equilibrium in water. From G_SEEDS, two further groups of molecules, namely G_VAE and G_SIM, were obtained via the ChemVAE method and the SimSearch method, respectively. Given the excellent properties of porphyra-334 in UV energy absorption and its dispersion mechanism,[1,6,14−18] we must include the protonated MAA motifs. Typical examples of protonated MAA motifs are shown in Figure 2b (see the SI for others). Thus, we added structures representing protonated and zwitterionic molecules (99 structures, which are included in the total of 370 in G_SEEDS; see the SI).

Molecular Generation via ChemVAE (G_VAE)

Gómez et al. reported a deep neural network model consisting of three coupled functions: an encoder, a decoder, and a predictor. It provides a machine learning-based de novo molecular design method.[8] The code and full training data sets are available on their GitHub page.[19] The model was trained on hundreds of thousands of existing chemical structures, which allowed us to automatically generate novel chemical structures. This system made the current study possible, that is, it produced the group of molecules G_VAE generated via ChemVAE. The autoencoder architecture is illustrated in Figure 3. Notations follow those of Gómez et al.[8] The trained autoencoder system has three latent representations: an embedding vector X_1, a latent vector z_1, and an embedding vector X_r. During training, canonical SMILES strings were used as the input to avoid confusion among chemically equivalent string representations. The encoder and the decoder shown in Figure 3 are recurrent neural networks (RNNs).

Figure 3

Scheme of the current ChemVAE.

The encoder RNN, which processes a given SMILES string, and the decoder RNN, which processes a given X_r, are stochastic operations. As a result, the same input (smi) may be decoded into different outputs (smi_r), reflecting the different intermediates (X_1, z_1, or X_r). The decoder RNN (from X_r to SMILES (smi)) may also produce chemically invalid strings. For the ChemVAE method, we collected the generated molecules iteratively, 2000 per SMILES decoding attempt. After removing duplicated strings, we obtained 550 784 strings, for which we employed RDKit[20] to validate the chemical structures of the output molecules and discard invalid ones. This left 2454 SMILES strings. Meaningless structures were then ruled out for the following reasons: having fewer than four heavy atoms, failing to generate a 3D structure for quantum chemistry calculations, terminating abnormally during quantum chemistry calculations, or being an unstable radical species. In total, 1572 molecules were excluded. Finally, 882 molecules (G_VAE) were generated via ChemVAE (in orange, left in Figure 1).
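The post-processing funnel described above (deduplicate the decoded strings, then discard invalid ones) can be sketched as follows. This is only a minimal illustration under stated assumptions: the function names are hypothetical, and `toy_is_valid` is a stand-in that merely checks for a non-empty string with balanced parentheses, whereas the study used RDKit to validate structures. The last lines simply re-check the counts reported in the text.

```python
from typing import Callable, Iterable, List


def dedupe_preserving_order(strings: Iterable[str]) -> List[str]:
    """Drop repeated strings while keeping the first occurrence of each."""
    seen = set()
    unique = []
    for s in strings:
        if s not in seen:
            seen.add(s)
            unique.append(s)
    return unique


def filter_valid(strings: Iterable[str], is_valid: Callable[[str], bool]) -> List[str]:
    """Keep only the strings accepted by the supplied validity check."""
    return [s for s in strings if is_valid(s)]


def toy_is_valid(smiles: str) -> bool:
    """Hypothetical stand-in for a real SMILES parser (the study used RDKit).

    Accepts any non-empty string whose parentheses are balanced.
    """
    depth = 0
    for ch in smiles:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False
    return bool(smiles) and depth == 0


# Toy decoder output: one duplicate, one unclosed branch, one empty string,
# and one stray closing parenthesis.
decoded = ["CCO", "CCO", "C(C", "OC(=O)C", "", "CC)O"]
unique = dedupe_preserving_order(decoded)
valid = filter_valid(unique, toy_is_valid)

# Counts reported in the text: 2454 valid strings, 1572 excluded, 882 kept.
remaining_after_exclusion = 2454 - 1572
```

Passing the validity check as a callable keeps the sketch independent of any particular toolkit; a real parser could be substituted for `toy_is_valid` without changing the rest.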

Similarity Search by Fingerprint (G_SIM)

The SimSearch procedure in chemical databases is a well-known and widely used process.[9,21,22] ZINC15 is a research tool for investigators to search chemical and biological targets. We downloaded the "Annotated" subset of 1 458 577 582 molecules from ZINC15 (as shown in Figure S23).[23,24] It includes compounds that are in catalogs (but not for sale). We did not apply any other specific standardization to the molecular database. We gathered SMILES strings in accordance with Tanimoto similarity by utilizing MACCS, ECFP, and FCFP fingerprints (see the SI for details). Note that fingerprints can be used for applications such as the current SimSearch as well as for molecular characterization, molecular diversity, and chemical database clustering. The MACCS keys are 166-bit structural key descriptors (a vector with 166 elements) in which each bit is associated with a SMARTS pattern.[25,26] Extended-connectivity fingerprints (ECFPs) are circular topological fingerprints designed for a wide variety of molecular studies and for structure–activity modeling.[27,28] The ECFP encodes substructure patterns from molecules into a bit string of length 1024 (the length can be varied). The FCFP is a variant of the ECFP: whereas the ECFP is intended to capture precise atom-environment substructural features, the FCFP is intended to capture more abstract, role-based substructural features. These keys are implemented in the open-source cheminformatics software package RDKit. We gathered from the database 1125 compounds derived from G_SEEDS (in green in Figure 1). We removed some compounds because 3D structures for quantum chemistry calculations could not be prepared for them. At the final stage, we obtained 1094 compounds (G_SIM) to be considered in the chemical space exploration (in blue, left in Figure 1).
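The Tanimoto-based gathering step can be illustrated with a small, dependency-free sketch. It assumes each fingerprint is given as the set of its "on" bit positions; the function names, the toy bit sets, and the 0.5 threshold are hypothetical and are not taken from the study, which computed MACCS, ECFP, and FCFP fingerprints with RDKit (see the SI for the actual settings).

```python
from typing import Dict, Iterable, List, Tuple


def tanimoto(bits_a: Iterable[int], bits_b: Iterable[int]) -> float:
    """Tanimoto coefficient |A and B| / |A or B| for two sets of on-bits."""
    a, b = frozenset(bits_a), frozenset(bits_b)
    union = len(a | b)
    if union == 0:
        # Convention chosen for this sketch: two empty fingerprints score 0.
        return 0.0
    return len(a & b) / union


def similarity_search(
    query_bits: Iterable[int],
    database: Dict[str, Iterable[int]],
    threshold: float,
) -> List[Tuple[str, float]]:
    """Return (name, score) pairs at or above the threshold, best first."""
    query = frozenset(query_bits)
    scored = [(name, tanimoto(query, bits)) for name, bits in database.items()]
    hits = [(name, score) for name, score in scored if score >= threshold]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Toy data: bit positions are illustrative only.
seed_fp = {1, 5, 9, 20}
toy_db = {
    "A": {1, 5, 9, 20},   # identical to the seed
    "B": {1, 5, 9, 33},   # shares 3 of 5 distinct bits
    "C": {2, 3},          # shares nothing
}
hits = similarity_search(seed_fp, toy_db, threshold=0.5)
```

Because the score depends only on shared and total on-bits, the same function applies unchanged to any of the three fingerprint types once they are expressed as bit sets.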

Quantum Chemistry Properties

To prepare geometric data for quantum chemistry calculations, the MMFF94 force field implemented in RDKit was applied to construct 3D structures for G_SEEDS, G_VAE, and G_SIM. We then performed calculations for the ground and excited states using density functional theory (DFT). We used the B3LYP hybrid functional and the 6-31G(d) basis set. The solvent effect of water was taken into account by the integral equation formalism of the polarizable continuum model (IEFPCM). We used the Gaussian 16 program package.[29] We first carried out geometry optimizations of the ground states, starting from the structures generated by RDKit. We then performed single-point calculations of the excited states using time-dependent density functional theory (TD-DFT). As shown in Table 1, we extracted 23 properties from the calculated results, such as total energies, the HOMO (highest occupied molecular orbital)–LUMO (lowest unoccupied molecular orbital) gap energies, three orbital energies around the HOMO and the LUMO, virial coefficients, dipole moments, quadrupole moments, the degrees of freedom in the structures, the trace of the quadrupole moment, and the coordinate invariants of the quadrupole moment. Ground-state properties are selected for versatility. In total, there are 84 elements for each vector. We then carried out the PCA analyses described later.

Table 1

Quantum Chemistry Properties Obtained from DFT Calculations and Some Physical Chemical Properties

detail	number of elements
estimated molecular volume[30]	1
difference of the orbital energies (eigen values) of the HOMO and LUMO	1
quadrupole moment	3
total dipole moment	1
total energy and the virial coefficient	1
electronic spatial extent	1
absorption wavelength (nm) of the nth excited state	20
absorption energy (eV) of the nth excited state	20
oscillator strength of the nth excited state	20
number of electrons	1
orbital energy (eigen value) of first through third highest occupied molecular orbitals	3
orbital energy (eigen value) of first through third lowest unoccupied molecular orbitals	3
rotational constants	3
degree of freedom	1
number of (H, C, N, O, and S) atoms	5
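As a quick consistency check, the element counts listed in Table 1 can be summed to confirm the 84-element vector length stated in the text. The dictionary below simply transcribes the table rows; the shortened key names are ours.

```python
# Number of vector elements contributed by each row of Table 1.
TABLE_1_ELEMENTS = {
    "estimated molecular volume": 1,
    "HOMO-LUMO orbital energy difference": 1,
    "quadrupole moment": 3,
    "total dipole moment": 1,
    "total energy and virial coefficient": 1,
    "electronic spatial extent": 1,
    "absorption wavelength (nm), nth excited state": 20,
    "absorption energy (eV), nth excited state": 20,
    "oscillator strength, nth excited state": 20,
    "number of electrons": 1,
    "orbital energies, 1st-3rd highest occupied MOs": 3,
    "orbital energies, 1st-3rd lowest unoccupied MOs": 3,
    "rotational constants": 3,
    "degree of freedom": 1,
    "number of (H, C, N, O, S) atoms": 5,
}

vector_length = sum(TABLE_1_ELEMENTS.values())
print(vector_length)
```

The three excited-state rows account for 60 of the elements, so the mapped vectors are dominated by TD-DFT quantities.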

Mapping

Representations of various vectors in chemical space[7,31] were applied for the comparison and exploration of their internal relations. For this purpose, high-dimensional, complex information must be mapped onto a low-dimensional space. One typical mapping method is principal component analysis (PCA),[32] which is used for exploratory data analysis and for building predictive models. It is commonly used for dimensionality reduction by projecting each data point onto only the first few principal components to obtain lower-dimensional data. We show the first two principal components; the cumulative contribution rates are given in the SI.
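The projection onto the first two principal components can be sketched without external libraries as below. This is an illustrative implementation under our own assumptions (power iteration with deflation on the sample covariance matrix, hypothetical function names, toy data), not the program used in the study; in practice a library routine would normally be preferred.

```python
import math
from typing import List, Tuple

Matrix = List[List[float]]


def center(rows: Matrix) -> Matrix:
    """Subtract the column means from every row."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows]


def covariance(centered: Matrix) -> Matrix:
    """Sample covariance matrix (divides by n - 1) of centered data."""
    n, d = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]


def top_eigenpair(mat: Matrix, iters: int = 500) -> Tuple[float, List[float]]:
    """Dominant eigenvalue and eigenvector by power iteration.

    Caveat of this sketch: the uniform start vector must not be orthogonal
    to the dominant eigenvector.
    """
    d = len(mat)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iters):
        w = [sum(mat[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            return 0.0, v
        v = [x / norm for x in w]
    eigval = sum(v[i] * sum(mat[i][j] * v[j] for j in range(d)) for i in range(d))
    return eigval, v


def pca(rows: Matrix, n_components: int = 2) -> Tuple[Matrix, List[float]]:
    """Return per-sample scores and the contribution rate of each component."""
    x = center(rows)
    cov = covariance(x)
    d = len(cov)
    total_var = sum(cov[i][i] for i in range(d))
    components, ratios = [], []
    for _ in range(n_components):
        val, vec = top_eigenpair(cov)
        components.append(vec)
        ratios.append(val / total_var if total_var else 0.0)
        # Deflation: remove the component just found before the next pass.
        cov = [[cov[i][j] - val * vec[i] * vec[j] for j in range(d)]
               for i in range(d)]
    scores = [[sum(r[j] * c[j] for j in range(d)) for c in components] for r in x]
    return scores, ratios


# Toy data lying exactly on the line y = 2x: all variance is on one axis.
toy = [[0.0, 0.0], [1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
scores, ratios = pca(toy)
```

The `ratios` list corresponds to the per-component contribution rates; their running sum gives the cumulative contribution rate mentioned in the text.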

Results and Discussions

Representation for Three Groups: G_SEEDS, G_VAE, and G_SIM

We present here the results obtained via ChemVAE generation and SimSearch mining. The three groups (G_SEEDS, G_VAE, and G_SIM) were compared by mapping from three different viewpoints: (I) ML-based, (II) cheminformatics, and (III) quantum chemistry (right in Figure 1). It is noteworthy that we used the ChemVAE method again in the mapping process. That is, in the process of (I) the ML-based process (Figure ), we use SMILES strings for G and G as the input (the second time) for the ChemVAE procedure, and we then obtain the output vectors X_1, z_1, and X_r, with which we carried out the PCA mapping. The results for X_1 and z_1 from the ChemVAE vectorization are shown below. For the X_r results, see the SI.

Mapping (I): ML-Based Comparison

Two chemical space representations were mapped by PCA via ChemVAE vectorization, as shown in Figure 4a and b (see the SI for the X_r results). First, the mapping results for the vector X_1 are shown in Figure 4a, where G is distributed slightly closer to G than G. The second mapping, for the vector z_1, is shown in Figure 4b. Here, we observe that G is distributed distinctly closer to G than G. PCA is only one of various possible mapping methods; we use it because of its well-known versatility.[33,34] We also show the results from t-SNE in Figures S15–21 in the SI. The main arguments are the same.

Figure 4

Principal component analysis mappings of the three groups, namely G_SEEDS (green), G_VAE (orange, via ChemVAE), and G_SIM (blue, via SimSearch), using (a) the ML-based vector X_1, (b) the ML-based vector z_1, (c) cheminformatics (ECFP), and (d) quantum chemistry properties.

Mapping (II): Cheminformatics Comparison

Chemical space is usually described by molecular descriptors, forming a so-called descriptor space. We adopted the ECFP fingerprint for the three groups, namely G_SEEDS, G_VAE, and G_SIM. The PCA mapping results are shown in Figure 4c (see the SI for results by MACCS and FCFP). The results show that G is closer to G than G. Interestingly, the groups G_VAE and G_SIM are located in different areas of the chemical space. This result shows that the two methods, ChemVAE and SimSearch, provide two distinct groups of molecules, suggesting the high potential of ChemVAE as a method that searches by criteria other than similarity and thereby reaches new areas of chemical space.

Mapping (III): Quantum Chemistry Properties

The chemical space spanned by vectors consisting of quantum chemistry properties is expressed by PCA and shown in Figure 4d. It can be seen from this result that the distribution of G is located closer to G than G. In contrast to the other mappings shown above, the distributions of the two groups G and G in Figure 4d scarcely overlap. Therefore, we can infer that the molecules in G are well differentiated from those in G when the vector elements consist of quantum chemistry properties. The results shown in Figure 4a–d indicate why it is critical to adopt a relevant vector for each molecule. As shown in Figure 4d, we have arrived at a mapping that enables us to distinguish among the three groups of our samples. By adopting a vector whose elements consist of quantum chemical properties, reflecting the wave function of each molecule, we can differentiate the groups well. The results suggest that we can obtain molecules (in orange) that might be comparable to porphyra-334. These differentiated molecules may potentially be new molecules.

We should mention here that other choices of vector may be possible. The relevant vectors led us to the best mapping in the molecular space for finding molecules comparable to porphyra-334. What is a rational procedure for finding such an optimal vector? To the best of our knowledge, there is no established methodology. This is a very important issue for the future. Recently, some ML-based fingerprints have been published. Among them is the promising fingerprint Mol2vec,[35] which has been applied to drug discovery,[36,37] solvation free energy prediction,[38] the prediction of pKa values of CH acids,[39] and other material designs. Other ML-based fingerprints include one that uses graph-convolution models[40] and another that proceeds by the evolution of the embedding step[41] (including an application for SAR/SPR). Obtaining a rational procedure for linking classical fingerprints and ML-based fingerprints will be a future subject.

Differences among the Three Groups from a Quantum Chemistry Point of View

The purpose of the current study is to find excellent molecules. Therefore, we examine the molecules obtained in the three groups from a physical chemistry point of view. The MAAs are known to possess high stability even under relatively strong UV irradiation.[42] The absorbed energy is expected to be dissipated very efficiently to the surrounding water environment.[5,7,29,31,42−45] This is the typical mechanism behind porphyra-334's characteristic UV resistance and nondestructive release of energy. Among the many properties of porphyra-334, we must consider the critical ones, that is, its hydrophilicity (log P), absorption wavelength (λmax), and oscillator strength (f). Although log P is widely used, we focus here on quantum chemistry properties and did not include log P. Including log P did not change the conclusion described below. The details of the results and arguments for log P are explained in the SI.

Since the excitation wavelength (λmax) in the UV–visible range and the oscillator strength (f) are indispensable to the optical properties of porphyra-334, we employed the TD-DFT method to calculate the excitation energies and oscillator strengths of the three groups G_SEEDS, G_VAE, and G_SIM. Among the various UV regions, namely UVB (280–315 nm), UVA1 (315–340 nm), and UVA2 (340–400 nm), we selected molecules whose calculated spectral characteristics fell in the 300–350 nm range, reflecting the absorbing range of porphyra-334. We paid special attention to the zwitterionic isomers, since the protonated MAA motifs are the critical isomers for the photoprotective and antioxidant functions, as reported in our previous study.[6] We extracted charge-neutral and zwitterionic forms of G_SIM via SimSearch and G_VAE via ChemVAE. The histogram of the calculated oscillator strengths is shown in Figure 5. The numbers of molecules that satisfied the spectral thresholds f > 0.1 and 300 < λ < 350 nm in G_SEEDS, G_VAE, and G_SIM are 52, 138, and 6, respectively. These filtered molecules were then scrutinized as described below.
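The spectral filter (f > 0.1 and 300 < λ < 350 nm) can be sketched as follows. One assumption is made loudly here: the text does not state whether the criterion is applied to a single λmax or to any calculated excited state, so this sketch accepts a molecule if at least one excited state satisfies both conditions. The class and function names and all example values are hypothetical. The wavelength-to-energy helper uses E (eV) ≈ 1239.84 / λ (nm).

```python
from typing import List, NamedTuple

HC_EV_NM = 1239.841984  # Planck constant times speed of light, in eV*nm


class ExcitedState(NamedTuple):
    wavelength_nm: float
    oscillator_strength: float


def ev_from_nm(wavelength_nm: float) -> float:
    """Convert an absorption wavelength (nm) to an excitation energy (eV)."""
    return HC_EV_NM / wavelength_nm


def passes_filter(
    states: List[ExcitedState],
    f_min: float = 0.1,
    lam_min: float = 300.0,
    lam_max: float = 350.0,
) -> bool:
    """True if any state has f > f_min and lam_min < wavelength < lam_max.

    Assumption of this sketch: one qualifying excited state is sufficient.
    Both inequalities are strict, as written in the text.
    """
    return any(
        s.oscillator_strength > f_min and lam_min < s.wavelength_nm < lam_max
        for s in states
    )


# Hypothetical molecules with made-up excited-state data.
molecules = {
    "mol_a": [ExcitedState(334.0, 0.85), ExcitedState(250.0, 0.02)],
    "mol_b": [ExcitedState(334.0, 0.05), ExcitedState(280.0, 0.90)],
    "mol_c": [ExcitedState(350.0, 0.50)],
}
selected = [name for name, states in molecules.items() if passes_filter(states)]
```

Here `mol_b` is rejected because its strong transition lies outside the window and its in-window transition is too weak, and `mol_c` is rejected because 350 nm sits exactly on the excluded boundary.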

Figure 5

Histogram of calculated oscillator strengths in the 300 < λ < 350 nm range for the three groups, namely G_SEEDS (green), G_VAE (orange, via ChemVAE), and G_SIM (blue, via SimSearch).

Mapping of the Final Selected Molecules

The ML-based, cheminformatics-based, and quantum chemistry-based mappings shown in Figure 4 were filtered by the criteria f > 0.1 and 300 < λ < 350 nm, and the results are shown in Figure 6. We then focused on the selected molecules and examined the features of these molecules. The results are shown in Figure .

Figure 6

Filtered molecules (f > 0.1 and 300 < λ < 350 nm) from those shown in Figure 4 for (a) X_1, (b) z_1, (c) ECFP (fingerprint), and (d) quantum chemistry.

All the points plotted in Figure 6 satisfy the conditions f > 0.1 and 300 < λ < 350 nm. As shown in Figure 6, the data points for X_1 and z_1 (each point corresponds to one molecule expressed by one vector from the ChemVAE vectorization) cannot be clearly divided into clusters. This is quite natural in the sense that the results at the X_1 or z_1 level are still those obtained by way of machine learning. By contrast, the data shown in Figure 6c separate relatively well into two clusters. One is the G_VAE group (orange), and the other consists of the G_SEEDS (green) and G_SIM (blue) groups. In the latter, the two groups (G_SEEDS and G_SIM) mostly overlap. These results suggest that we can explore new chemical space using vectors generated via ChemVAE, even though at this stage the elements consist only of structural information and do not yet include quantum chemistry information.

At the final stage, as shown in Figure 6d, the plots show a promising feature. These data were generated from vectors whose elements consist of quantum chemical properties. The G_VAE (orange) data show a distribution with large diversity, whereas the other two, G_SIM (blue) and G_SEEDS (green), are covered by the G_VAE (orange) zone; they stay in one section and do not spread, suggesting that their properties are less diverse. Comparing Figure 4d with Figure 6d shows that many molecules belonging to G were in fact rejected by the filtration criteria (f and λ). When we take the quantum chemical properties into account, we can explore the chemical space more widely via ChemVAE than via SimSearch.

It may be relevant to cite here the arguments given by Gómez et al.[8] and various researchers[46−49] as well as the reported studies in which quantum chemical properties were predicted by machine learning.[50,51] Moreover, some studies using transfer learning have been published.[47,52] A remaining subject for the future is how to find new string representations of molecules beyond SMILES. Currently we are using only SMILES strings; therefore, the performance of machine learning on chemical information is still limited. It is noteworthy that some research examples beyond SMILES have recently appeared, such as those based on graph theory[51] and those based on linear strings.[46] The current mapping in Figure 6d shows that quantum chemical properties do extend the search area to a new horizon. Methodologies based on molecular machine learning (ChemVAE) are thus promising when we add quantum chemical properties. The excellence of porphyra-334 may not be limited to its intramolecular properties. It may extend to its ability to form intermolecular interactions such as subtle hydrogen bond networks. If we can include molecular information derived from other dimensions, such as wave functions and properties responsive to the environment, instead of structures alone, the potential of machine learning will be further realized. The inclusion of such properties will be a future subject.

De Novo Molecules Generated via ChemVAE

Based on the calculated spectral properties and the mappings after filtration, we have now demonstrated the promising performance of the ChemVAE method. Representative examples of the filtered and selected final structures from G_VAE are shown in Figure 7.

Figure 7

Selected molecular structures from G_VAE.

To show the promising feature of ChemVAE molecular generation combined with quantum chemistry properties, we display eight representative molecules in Figure 7. Among the filtered (selected) molecules shown in Figure 6d, these eight representative molecules are located in the vicinity of G plots. The other G_VAE molecules are shown in the SI. By contrast, only six molecules from the G_SIM group satisfied the calculated spectral requirements (see the SI). As shown in Figure 7, the presence of molecules with a five-membered ring is noteworthy. In their papers, Losantos et al.[17,18] reported the protonated MAA motifs and also proposed protonated five-membered-ring motifs. Since natural bioactive MAAs have six-membered-ring motifs, their rational design is significant. Indeed, the five-membered-ring photoactive molecules they proposed have not been registered in the ZINC15 database until now. Even among the molecules in the G_SIM group obtained via SimSearch, we could not find the molecules that they designed. By contrast, we generated molecules with five-membered rings, as shown in Figure 7, in the G_VAE group via ChemVAE.

Conclusions

This study reports the results of a comparison between the ChemVAE method and the SimSearch method, focusing on their ability to generate new functional molecular designs. Taking the natural product porphyra-334 as a model molecule, we prepared three groups: MAA molecules as seeds, molecules generated via ChemVAE, and molecules gathered via SimSearch (G_SEEDS, G_VAE, and G_SIM, respectively). There were 52, 138, and 6 molecules that satisfied the condition for the light absorption ability of porphyra-334 (f > 0.1 and 300 < λ < 350 nm) in G_SEEDS, G_VAE, and G_SIM, respectively. The ChemVAE method shows promising potential for future molecular design. When we use quantum chemistry properties with the ChemVAE method, we can obtain molecules that are comparable to porphyra-334, including unexpected ones (with a five-membered ring).

Data and Software Availability

We used the Gaussian 16 program package[29] for the quantum chemistry calculations. We used RDKit[20] for the 3D structure construction (MMFF94 force field), the fingerprints (MACCS, ECFP, and FCFP), and the Tanimoto similarity of the fingerprints. We used the Open Babel toolkit[53] for the data I/O. The multivariate analysis and mapping were performed with our proprietary program but are not restricted to it.

40 in total

1. Reoptimization of MDL keys for use in drug discovery.

Authors: Joseph L Durant; Burton A Leland; Douglas R Henry; James G Nourse
Journal: J Chem Inf Comput Sci Date: 2002 Nov-Dec

2. Physical chemistry: Seaming is believing.

Authors: Todd J Martinez
Journal: Nature Date: 2010-09-23 Impact factor: 49.962

3. Reducing the dimensionality of data with neural networks.

Authors: G E Hinton; R R Salakhutdinov
Journal: Science Date: 2006-07-28 Impact factor: 47.728

4. Photostability via a sloped conical intersection: a CASSCF and RASSCF study of pyracylene.

Authors: Martial Boggio-Pasqua; Michael A Robb; Michael J Bearpark
Journal: J Phys Chem A Date: 2005-10-06 Impact factor: 2.781

Review 5. Progress in visual representations of chemical space.

Authors: Dmitry I Osolodkin; Eugene V Radchenko; Alexey A Orlov; Andrey E Voronkov; Vladimir A Palyulin; Nikolay S Zefirov
Journal: Expert Opin Drug Discov Date: 2015-06-22 Impact factor: 6.098

6. Mol2vec: Unsupervised Machine Learning Approach with Chemical Intuition.

Authors: Sabrina Jaeger; Simone Fulle; Samo Turk
Journal: J Chem Inf Model Date: 2018-01-10 Impact factor: 4.956

7. Convolutional Neural Networks for the Design and Analysis of Non-Fullerene Acceptors.

Authors: Shi-Ping Peng; Yi Zhao
Journal: J Chem Inf Model Date: 2019-11-21 Impact factor: 4.956

8. ZINC: a free tool to discover chemistry for biology.

Authors: John J Irwin; Teague Sterling; Michael M Mysinger; Erin S Bolstad; Ryan G Coleman
Journal: J Chem Inf Model Date: 2012-06-15 Impact factor: 4.956

9. Delfos: deep learning model for prediction of solvation free energies in generic organic solvents.

Authors: Hyuntae Lim; YounJoon Jung
Journal: Chem Sci Date: 2019-08-20 Impact factor: 9.825

Review 10. Unravelling the Photoprotective Mechanisms of Nature-Inspired Ultraviolet Filters Using Ultrafast Spectroscopy.

Authors: Temitope T Abiola; Abigail L Whittock; Vasilios G Stavros
Journal: Molecules Date: 2020-08-28 Impact factor: 4.411
