MoDoop: An Automated Computational Approach for COSMO-RS Prediction of Biopolymer Solubilities in Ionic Liquids.

Yunhan Chu, Xuezhong He.

Abstract

An automated computational framework (MoDoop) was developed to predict biopolymer solubilities in ionic liquids (ILs) on the basis of conductor-like screening model for real solvents (COSMO-RS) calculations of two thermodynamic properties: the logarithmic activity coefficient (ln γ) at infinite dilution and the excess enthalpy (HE) of the mixture. The calculations start from two-dimensional structures of biopolymer models and ILs, which are optimized by searching for the lowest-energy conformer and optimizing the molecular geometry. Three lignin models together with one IL dataset were used to evaluate the predictive ability of the method. The evaluation shows that ln γ is the more reliable property for predicting lignin solubilities in ILs and that the p-coumaryl alcohol model best represents lignin molecules. The MoDoop approach is efficient for rapid in silico screening of ionic liquids suitable for dissolving biopolymers.

Year:  2019        PMID: 31459475      PMCID: PMC6648271          DOI: 10.1021/acsomega.8b03255

Source DB:  PubMed          Journal:  ACS Omega        ISSN: 2470-1343


Introduction

Lignocellulosic biomass is the most abundant renewable biomaterial on Earth. It is a composite material with three main biopolymers: cellulose, hemicellulose, and lignin.[1] Cellulose and hemicellulose are typically used for the production of textiles, paper, pharmaceutical compounds, etc., whereas lignin is usually converted into liquid biofuels or used as a feedstock for chemicals such as binders, dispersants, surfactants, and emulsifiers.[2] Owing to growing concern about sustainable development and environmental protection, substantial attention has been paid to the conversion of lignocellulosic biomass into biofuels or valuable products through thermochemical/biological conversion.[3−5] A key step in the utilization of lignocellulosic biomass is to dissolve its constituent biopolymers (i.e., cellulose, hemicellulose, and lignin). Among them, lignin is a cross-linked polyphenolic polymer that mainly acts as a barrier protecting cellulose and hemicellulose from biological and physical attack.[1] The crystalline structure of cellulose and the cross-linked structure of lignin make them even more difficult to deconstruct than hemicellulose. Therefore, proper solvents should be identified to dissolve these biopolymers.
Ionic liquids (ILs) are regarded as green solvents and typically consist of a bulky, asymmetric organic cation and an anion that largely determines the physical and chemical properties.[6,7] Compared to conventional solvents, ILs have desirable properties such as high thermal stability, nonvolatility, high solvation ability, and low toxicity.[8−13] Moreover, cations and anions can be combined in many ways to produce new ILs with a wide spectrum of physical, chemical, and biological properties.[14,15] These advantages make ILs promising solvents for dissolving the biopolymers of lignocellulosic biomass.[16,17] However, because of the large diversity of ILs, experimentally screening a vast number of candidates for their ability to dissolve biopolymers is not practical, which highlights the importance of an automated, rapid tool to predict their dissolving ability. The conductor-like screening model for real solvents (COSMO-RS),[18−21] a well-founded approach that combines statistical thermodynamics and quantum chemistry, has recently received significant attention. COSMO-RS divides the molecular surfaces of the compounds into a large number of segments and assumes that each segment of one molecule overlaps perfectly with a segment of another. On this basis, it computes the charge distribution (σ-profile) on the molecular surface and the chemical potential distribution (σ-potential) of the molecule in the liquid mixture. The resulting chemical potential (μ) is the foundation for evaluating other equilibrium thermodynamic properties, e.g., the activity coefficient (γ) and the excess enthalpy (HE). Given its ability to predict the thermodynamic data of compounds, COSMO-RS can be used as an in silico tool to screen molecules for a specific problem solely on the basis of information derived from their molecular structures.
COSMO-RS has been proven to be effective for predicting the properties of ILs.[22−29] It accounts for the dominant interactions, such as electrostatic misfit, H-bonding, and van der Waals forces, to describe solvation in IL systems, so mixture calculations can be performed at different temperatures.[30] Compared to group contribution methods (e.g., the UNIFAC model[31−34]), COSMO-RS is an a priori predictive method, which allows systems to be calculated with qualitative accuracy.[35] Several studies have also reported the suitability of COSMO-RS for predicting the solubilities of cellulose[36−38] and drug molecules[30] in ILs. Given a database of compounds with precomputed quantum COSMO calculations, COSMO-RS is adequate for rapid in silico screening of a large number of solutes or solvents on the basis of their selected molecular models. However, the conformations of biopolymers/ILs strongly influence the COSMO-RS results, in that different conformations of the same molecule can give different predictions of thermodynamic properties.[23,39] Therefore, it is essential to use proper molecular models and conformations found by a stable routine to obtain qualitatively and quantitatively precise predictions. In this work, we present an automated computational framework (MoDoop) that allows COSMO-RS-based prediction of biopolymer solubilities in ILs. The framework is built on a script that calls different tools: ChemAxon Convert and Cxcalc,[40] OpenBabel,[41] MOPAC,[42] and Amsterdam density functional (ADF) COSMO-RS.[43−45] By selecting an appropriate force field and geometry optimization method, MoDoop generates a single thermodynamically stable conformer for both biopolymers and ILs. This conformer is used to calculate COSMO result files,[45] which permits rapid qualitative screening of ILs against selected biopolymer models on the basis of COSMO-RS.
To evaluate the developed MoDoop method, the solubilities of lignin in ILs were predicted. Lignin is represented by three different models: p-coumaryl, coniferyl, and sinapyl alcohol. The logarithmic activity coefficient (ln γ) of the lignin models in ILs at infinite dilution and the HE of the mixtures were calculated by COSMO-RS as qualitative measures of their solubilities in ILs. ln γ is correlated with differences in interaction strength among molecules arising from the dominant interactions, which determine the affinity between solutes and solvents.[39] HE, which follows from the temperature dependence of the Gibbs free energy, is a sensitive measure of the intermolecular interactions within a mixture and reflects the behavior of the species in solution. Linear regressions are conducted to compare the calculated ln γ and HE with available experimental solubilities of lignin, and R-squared (R2) and residual standard error (RSE) are used to measure the goodness of fit of the regression models, reflecting their prediction accuracies with respect to lignin solubilities in ILs. On the basis of the evaluation of the two thermodynamic properties, the best lignin model and suitable ILs are identified.
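For orientation, the two screening properties can be tied to chemical potentials through standard thermodynamic relations. The notation below is assumed for illustration and is not taken from the paper: μ_i^S is the chemical potential of solute i in solvent S, μ_i^pure that of pure i, x_i a mole fraction, G^E the excess Gibbs free energy, and H^E the quantity the text writes as HE.

```latex
\ln \gamma_i^{S} = \frac{\mu_i^{S} - \mu_i^{\mathrm{pure}}}{RT},
\qquad
G^{E} = RT \sum_i x_i \ln \gamma_i,
\qquad
H^{E} = -T^{2} \left( \frac{\partial \left( G^{E}/T \right)}{\partial T} \right)_{p,\,x}
```

A more negative ln γ means the solute is more stable in the solvent than in its own pure liquid, and a more negative H^E indicates exothermic, favorable mixing, which is why both are read here as qualitative indicators of higher solubility.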

Results and Discussion

σ-Potentials of Lignin Models

The σ-potential in COSMO-RS measures the affinity between the system S and a surface of polarity σ. The σ-potential distribution on the molecular surface can roughly be divided into the H-bond acceptor region, the nonpolar region, and the H-bond donor region.[23] As shown in Figure 1, the sinapyl alcohol model shows the strongest hydrogen-bond acceptor capacity due to a more negative σ-potential in the H-bond donor region and a more positive σ-potential in the H-bond acceptor region. The p-coumaryl alcohol model shows the strongest hydrogen-bond donor capacity due to a more negative σ-potential in the H-bond acceptor region and a more positive σ-potential in the H-bond donor region. The coniferyl alcohol model lies somewhere in between. Thus, the solubility ranking of the three lignin models in ILs is p-coumaryl alcohol > coniferyl alcohol > sinapyl alcohol, given that the IL dissolution process is anion dominated.
Figure 1

σ-Potentials of the three lignin models: p-coumaryl, coniferyl, and sinapyl alcohol predicted by COSMO-RS.

Model Validation

The ln γ and HE values calculated by MoDoop for the three proposed lignin models, along with experimental lignin solubilities in the four selected ILs from the IL dataset, are listed in Table 1. The experimental lignin solubilities are compared to the calculated ln γ and HE by linear regressions, and R2 and RSE are used to characterize the goodness of fit, as listed in Table 2. The lignin solubilities in the selected ILs are then predicted by each regression model from the calculated ln γ and HE, as shown in parentheses in Table 1. There are deviations between the predicted solubilities and the experimental data. However, the trends in dissolution ability are well predicted by these models, which can therefore be used for qualitative screening of suitable ILs for lignin dissolution.
Table 1

Experimental Solubilities of Lignin along with ln γ and HE (kJ mol–1) Calculated by MoDoop at 90 °C

IL          lignin solubility   ln γ (predicted solubility)                    HE (predicted solubility)
            (wt %)[49,50]       p-coumaryl     coniferyl      sinapyl          p-coumaryl     coniferyl      sinapyl
[Emim]Ac    30                  –8.04 (27.60)  –3.20 (27.09)  –3.22 (26.39)    –9.94 (28.29)  –8.06 (28.68)  –7.75 (28.38)
[Bmim]Cl    10                  –4.13 (14.70)  –1.45 (15.33)  –1.58 (16.12)    –6.16 (13.74)  –5.11 (13.19)  –5.09 (13.70)
[Bmim]BF4   4                   –0.30 (2.06)   0.64 (1.29)    0.81 (1.16)      –3.24 (2.49)   –3.13 (2.79)   –3.11 (2.77)
[Bmim]PF6   1                   0.15 (0.58)    0.64 (1.29)    0.78 (1.35)      –2.70 (0.42)   –2.67 (0.38)   –2.65 (0.23)
Table 2

Goodness of Fit (R2 and RSE) Reflected by Linear Regressions Conducted between Experimental Lignin Solubilities and Thermodynamic Properties (ln γ and HE) Calculated by MoDoop on the Basis of Three Different Lignin Models

goodness of fit   p-coumaryl        coniferyl         sinapyl
                  ln γ    HE        ln γ    HE        ln γ    HE
R2                0.91    0.94      0.87    0.96      0.83    0.95
RSE               3.99    3.12      4.71    2.62      5.42    3.03
The R2 values of the linear regressions listed in Table 2 for the three alcohol models show that the coniferyl alcohol model (with medium polarity) gives the best prediction of lignin solubility with HE (R2 = 0.96), whereas the p-coumaryl alcohol model (with the highest hydrogen-bond donor capacity) gives the best prediction with ln γ (R2 = 0.91). Nevertheless, both models give good predictions of the lignin solubilities in ILs, as shown in Figure 2.
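The regression step can be sketched with the four p-coumaryl ln γ values and experimental solubilities from Table 1. This is a minimal illustration using ordinary least squares in plain Python, not the authors' script; the function name `fit_line` and the returned keys are hypothetical. The paper does not say whether its R2 is the plain or the adjusted coefficient of determination, so both are computed. With these rounded inputs, the adjusted value and the RSE round to the tabulated 0.91 and 3.99, while the plain R2 comes out higher (about 0.94).

```python
import math


def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    syy = sum((yi - mean_y) ** 2 for yi in y)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares of the fitted line
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    return {
        "slope": slope,
        "intercept": intercept,
        "r2": 1.0 - sse / syy,
        # Adjusted R2 for one predictor: n - 2 residual degrees of freedom
        "adj_r2": 1.0 - (sse / (n - 2)) / (syy / (n - 1)),
        # Residual standard error
        "rse": math.sqrt(sse / (n - 2)),
    }


# Table 1: ln gamma (p-coumaryl model) and experimental solubility (wt %)
# for [Emim]Ac, [Bmim]Cl, [Bmim]BF4, [Bmim]PF6
ln_gamma = [-8.04, -4.13, -0.30, 0.15]
solubility = [30.0, 10.0, 4.0, 1.0]

fit = fit_line(ln_gamma, solubility)
predicted = [fit["intercept"] + fit["slope"] * x for x in ln_gamma]

print({k: round(v, 2) for k, v in fit.items()})
print([round(p, 2) for p in predicted])
```

The predicted solubilities land close to the parenthesized values in Table 1; small differences are expected because the tabulated ln γ values are rounded to two decimals.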
Figure 2

Linear regressions of experimental solubilities of lignin measured in four ILs at 90 °C against (a) ln γ calculated on the basis of the p-coumaryl alcohol model and (b) HE of mixture calculated on the basis of the coniferyl alcohol model.

(See the other prediction models in Appendix A: Figures S1–S4.) It should be noted that more experimental solubilities are probably needed to further validate the robustness of the developed models in future work.

Screening ILs

The predicted ln γ based on the p-coumaryl alcohol model and HE based on the coniferyl alcohol model in 450 ILs are depicted in Figure 3a,b, respectively. (The detailed values are given in Appendix B: Tables S1 and S2.) The cations and anions are mapped according to scaled values of ln γ and HE. The ILs with a higher dissolution capacity (highly negative values of ln γ and HE) are shown in the lower-left corner (blue region), whereas those with a lower dissolution capacity (highly positive ln γ and HE values) are shown in the upper-right corner (red region). Both thermodynamic properties, ln γ and HE, vary significantly with the anion but depend less on the cation, which indicates that the dissolution power is strongly dependent on anions. The ionic liquids containing the anions Ac–, HCOO–, MeH2NCOO–, MeHOCOO–, DEC–, MeHSCOO–, and BEN– are found to have a high dissolution power for lignin. On the other hand, the HE calculated on the basis of the coniferyl alcohol model varies only slightly with the cation, which makes it difficult to distinguish the dissolution power of ILs containing different cations. Thus, ln γ is regarded as the more reliable property, and the p-coumaryl alcohol model is considered the optimal model to predict lignin solubility in ILs.
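The screening logic described above amounts to scaling a property and ordering the ILs from most to least negative. The sketch below applies it to the four p-coumaryl ln γ values from Table 1 only. The text says the values were scaled but not how, so the min-max scaling used here is an assumption, and the helper name `min_max_scale` is hypothetical.

```python
def min_max_scale(values):
    """Map values linearly onto [0, 1]; 0 is the most negative entry."""
    lo = min(values.values())
    hi = max(values.values())
    return {name: (v - lo) / (hi - lo) for name, v in values.items()}


# ln gamma at infinite dilution, p-coumaryl alcohol model (Table 1)
ln_gamma = {
    "[Emim]Ac": -8.04,
    "[Bmim]Cl": -4.13,
    "[Bmim]BF4": -0.30,
    "[Bmim]PF6": 0.15,
}

scaled = min_max_scale(ln_gamma)

# More negative ln gamma is read as higher dissolution capacity,
# so an ascending sort puts the most promising IL first.
ranking = sorted(ln_gamma, key=ln_gamma.get)

for name in ranking:
    print(f"{name:10s} ln gamma = {ln_gamma[name]:6.2f}  scaled = {scaled[name]:.2f}")
```

Among the three [Bmim]+ ILs the ordering is set entirely by the anion, which mirrors the anion-dominated trend described in the text.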
Figure 3

(a) ln γ of lignin in 450 ILs at infinite dilution estimated on the basis of the p-coumaryl alcohol model and (b) HE of mixture calculated on the basis of the coniferyl alcohol model at 90 °C by COSMO-RS. The ln γ and HE values were scaled.

Conclusions

The automated computational framework MoDoop is used for COSMO-RS-based prediction of biopolymer solubilities in ILs. To conduct the COSMO-RS calculations, the COSMO result files are generated from the two-dimensional (2D) structures of biopolymers and ILs, using conformers found with specific force fields and geometries optimized by empirical and density functional theory (DFT) methods. The method allows a single thermodynamically stable conformer to represent biopolymers and ILs and thus enables rapid qualitative screening of ILs to dissolve biopolymers. Three selected lignin models have been used to predict the solubilities of lignin in 450 ILs at 90 °C following the developed MoDoop method. ln γ is found to be a reliable reference property, as it reflects the variation of the dissolution power of ILs with both cations and anions. The p-coumaryl alcohol model is selected as the best model to predict lignin solubility on the basis of ln γ, with a high R2 of 0.91. The ionic liquids containing the anions Ac– and HCOO– show a high dissolution power for lignin. The developed MoDoop approach is efficient for large-scale screening of suitable ILs for the dissolution of lignin and potentially other biopolymers.

Methods

Computational Framework

In the MoDoop framework, the COSMO-RS calculations of thermodynamic properties are based on ADF COSMO result files from quantum mechanical calculations of molecular structures generated by a specific geometry optimization route. An overview of the computational workflow of MoDoop is shown in Figure 4. The in-house MoDoop script automates the whole computational workflow and outputs the analysis results from the given 2D structures of biopolymers and ILs. It should be noted that generating the COSMO result files is the most time-consuming step, with a cost that mainly depends on the sizes of the models; however, it needs to be performed only once per molecule, and the generated COSMO result files are reusable in subsequent COSMO-RS computations. Moreover, in accordance with our previous calculation of ln γ of cellulose in ionic liquids,[47] the hydrogen-bond strengths of ILs containing halide ions (e.g., Cl–, Br–, and I–) are strongly underestimated compared to those of ILs without halogen elements. Therefore, the values of some key parameters of the COSMO-RS model, such as the subkeys CHB (chb) and SIGMAHBOND (σhb) of the key CRSPARAMETERS, are adjusted according to our previous work.[47]
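Because the COSMO result file for a molecule is generated once and then reused, the expensive step behaves like a cache keyed by molecule. The sketch below illustrates that idea only. The class `CosmoResultCache` and the stand-in function `placeholder_cosmo_job` are hypothetical and do not call ADF or reflect the in-house MoDoop script.

```python
from typing import Callable, Dict


class CosmoResultCache:
    """Run an expensive per-molecule job at most once and reuse its result."""

    def __init__(self, compute: Callable[[str], str]) -> None:
        self._compute = compute
        self._results: Dict[str, str] = {}
        self.calls = 0  # number of times the expensive job actually ran

    def get(self, molecule_id: str) -> str:
        if molecule_id not in self._results:
            self.calls += 1
            self._results[molecule_id] = self._compute(molecule_id)
        return self._results[molecule_id]


def placeholder_cosmo_job(molecule_id: str) -> str:
    # Hypothetical stand-in for the DFT/COSMO calculation; it only returns
    # a label so the example runs without any quantum chemistry package.
    return f"COSMO result for {molecule_id}"


cache = CosmoResultCache(placeholder_cosmo_job)
first = cache.get("p-coumaryl alcohol")
second = cache.get("p-coumaryl alcohol")  # served from the cache
print(first == second, cache.calls)
```

In a screening over many cation and anion combinations, each ion would be computed once under this scheme and then shared by every IL that contains it.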
Figure 4

Schematic workflow of MoDoop.

(1) Sketching the 2D structures of biopolymer models (e.g., lignin) and ILs by MarvinSketch;
(2) converting the 2D structures of biopolymer models to 3D by Molconvert;
(3) conducting the lowest-energy conformer search for biopolymers by Cxcalc with the Dreiding force field;[46]
(4) optimizing the geometries of the obtained lowest-energy conformers of biopolymers with the PM6 method in the MOPAC software;
(5) searching the lowest-energy IL conformers for isolated cations/anions by OpenBabel on the basis of the universal force field;
(6) optimizing the resulting geometries of biopolymers and ILs at the DFT level to generate COSMO result files by ADFprep[43] on the basis of the main parameterization GGA:BP/TZP;
(7) calculating the σ-profiles, σ-potentials, and thermodynamic properties (e.g., γ and HE) at a given temperature from the generated COSMO result files by COSMO-RS implemented in CRSprep;[43−45]
(8) reporting the calculated COSMO-RS properties by ADFreport;
(9) conducting data visualization, plotting, and reporting to create the overall analysis report.
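The steps above can be written down as an ordered table of stages, which makes it easy to see which route a biopolymer model or an IL ion takes. This is a descriptive sketch only: it records the tools named in the caption and invokes none of them, the names `Stage`, `STAGES`, and `stages_for` are hypothetical, and assigning the last three steps to both molecule types is an assumption.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Stage:
    action: str
    tool: str
    applies_to: Tuple[str, ...]


BOTH = ("biopolymer", "IL")

# Order follows the workflow caption; no external program is called here.
STAGES: List[Stage] = [
    Stage("sketch 2D structure", "MarvinSketch", BOTH),
    Stage("convert 2D to 3D", "Molconvert", ("biopolymer",)),
    Stage("lowest-energy conformer search (Dreiding)", "Cxcalc", ("biopolymer",)),
    Stage("geometry optimization (PM6)", "MOPAC", ("biopolymer",)),
    Stage("lowest-energy conformer search (universal force field)", "OpenBabel", ("IL",)),
    Stage("DFT optimization and COSMO result file (GGA:BP/TZP)", "ADFprep", BOTH),
    Stage("sigma-profiles, sigma-potentials, ln gamma, HE", "CRSprep", BOTH),
    Stage("report COSMO-RS properties", "ADFreport", BOTH),
    # The caption names no tool for the final reporting step.
    Stage("visualization, plotting, analysis report", "unspecified", BOTH),
]


def stages_for(kind: str) -> List[Stage]:
    """Return the stages a molecule of the given kind passes through, in order."""
    return [stage for stage in STAGES if kind in stage.applies_to]


il_tools = [stage.tool for stage in stages_for("IL")]
biopolymer_tools = [stage.tool for stage in stages_for("biopolymer")]
print(il_tools)
print(biopolymer_tools)
```

Listed this way, the two routes differ only in how the starting conformer is found: Cxcalc followed by MOPAC for biopolymer models, OpenBabel for isolated IL ions.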

Lignin Models

The quantum COSMO calculation is time-consuming; thus, it is impractical to conduct the computation on the whole biopolymer. A feasible way is to represent the biopolymer by a unit part that is compact enough for efficient quantum mechanical calculations yet retains the main characteristics of the molecule. Lignin is usually biosynthesized from up to three monomers: p-coumaryl, coniferyl, and sinapyl alcohols. However, their proportions in lignin vary with the source material (e.g., softwood, hardwood, and grasses). p-Coumaryl alcohol is a substructure of coniferyl alcohol, which in turn is a substructure of sinapyl alcohol. Therefore, these three alcohol structures were chosen to represent lignin molecules, and their 2D structures and COSMO molecular surfaces are shown in Figure 5a–c.
Figure 5

COSMO molecular surfaces and 2D structures of lignin models: (a) p-coumaryl alcohol, (b) coniferyl alcohol, and (c) sinapyl alcohol. On the molecular surface, the red area with the underlying molecular charge as negative marks positive COSMO charge density, and the blue area with the underlying molecular charge as positive marks negative COSMO charge density, whereas the yellow and green area marks nearly neutral charges.

IL Dataset

A set of 450 ILs was extracted from the literature,[38,47,48] which includes 18 cations (Table 3) based on methylimidazolium+, ethylmorpholinium+, methylpyrrolidinium+, and pyridinium+, with functional groups of allyl, ethyl, butyl, acryloyloxypropyl, 2-methoxyethyl, or 2-hydroxyethyl, and 25 anions (Table 4). The selected IL dataset was used for COSMO-RS calculations. In addition, four ILs with experimental solubilities for Kraft lignin (Indulin AT)[49,50] at 90 °C were used to validate the prediction ability of the MoDoop approach.
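The size of the dataset follows from pairing every cation with every anion: 18 × 25 = 450. The short sketch below shows that enumeration. The ion labels are placeholders, because the actual entries of Tables 3 and 4 are not reproduced here.

```python
from itertools import product

# Placeholder labels standing in for the 18 cations and 25 anions
cations = [f"cation_{i:02d}" for i in range(1, 19)]
anions = [f"anion_{j:02d}" for j in range(1, 26)]

# Every cation/anion pair defines one IL in the screening set
ionic_liquids = [f"[{c}][{a}]" for c, a in product(cations, anions)]

print(len(cations), len(anions), len(ionic_liquids))
```

Since each IL is a pair of separately modeled ions, only 18 + 25 = 43 ion structures need a COSMO result file to cover all 450 combinations.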
Table 3

Cations of the IL Dataset[38,47,48]

Table 4

Anions of the IL Dataset[38,47,48]
