Literature DB >> 29940733

High-Throughput Screening Approach for the Optoelectronic Properties of Conjugated Polymers.

Liam Wilbraham¹, Enrico Berardo², Lukas Turcani², Kim E Jelfs², Martijn A Zwijnenburg¹.

Abstract

We propose a general high-throughput virtual screening approach for the optical and electronic properties of conjugated polymers. This approach makes use of the recently developed xTB family of low-computational-cost density functional tight-binding methods from Grimme and co-workers, calibrated here to (Time-Dependent) Density Functional Theory ((TD)DFT) data computed for a representative diverse set of (co)polymers. Parameters drawn from the resulting calibration using a linear model can then be applied to the xTB derived results for new polymers, thus generating near DFT-quality data with orders of magnitude reduction in computational cost. As a result, after an initial computational investment for calibration, this approach can be used to quickly and accurately screen on the order of thousands of polymers for target applications. We also demonstrate that the (opto)electronic properties of the conjugated polymers show only a very minor variation when considering different conformers and that the results of high-throughput screening are therefore expected to be relatively insensitive with respect to the conformer search methodology applied.

Entities: Chemical Disease Gene Species

Mesh：

Substances：

Year: 2018 PMID： 29940733 PMCID： PMC6307085 DOI： 10.1021/acs.jcim.8b00256

Source DB: PubMed Journal: J Chem Inf Model ISSN： 1549-9596 Impact factor: 4.956

Introduction

High-throughput virtual screening (HTVS) has emerged as a powerful tool across many areas of materials discovery.[1] At its core, it concerns the automated data-driven discovery of promising candidate materials for a particular application. Low-computational-cost techniques are applied in an initial screening step to expose promising leads, which are then fed through more accurate, higher-cost techniques to refine computed properties.[2] HTVS is applied to explore vast, complex chemical spaces in order to accelerate the discovery of promising materials, as well as to reduce experimental effort spent on negative leads. Recent examples include the successful screening of both organic and inorganic materials for applications in areas including photovoltaics,[3−8] light emitting devices,[9,10] information storage,[11,12] gas storage,[13−19] and catalysis,[20,21] often resulting in materials being synthesized with record-breaking properties.[10,18,19] Conjugated polymers are used as light absorbers and electron donor layers in organic photovoltaics,[22−26] the light-emitting layer of organic light emitting diodes,[27,28] electrodes for batteries and supercapacitors,[29,30] and photocatalysts for proton reduction and overall water splitting.[31−35] All of these applications exploit a combination of the optoelectronic and/or redox properties of conjugated polymers in combination with their relatively facile tunability through copolymerization and chemical modification, as well as the earth abundant nature of their constituents (if not the catalysts used to make them via Suzuki or Stille coupling). Applying HTVS approaches to polymeric materials, however, remains challenging. Because of the size of the required oligomeric models, calculations on polymers are inherently more computationally costly than for small(er) molecules. Moreover, because of the large numbers of potential monomer sequences, compositions, and the potential for disorder along the polymer chains, extremely large numbers of calculations can be required. For example, constructing simple, ordered, two-component copolymers from a pool of 500 symmetric monomer units leads to over 120 000 possible polymers. If three-component copolymers are considered for the same monomer pool, the number of possible polymer increases to more than 20 million. Finally, most conjugated polymers form amorphous or poorly crystalline solids, ruling out periodic calculations. The fundamental optoelectronic properties of a material can be understood[36] in terms of (i) the optical gap (Δo), the minimum energy to form a excited electron–hole pair bound through their mutual electrostatic interaction (an exciton) and below which the material is transparent to light, (ii) the fundamental gap, the minimum energy to form a noninteracting excited-electron and hole, (iii) the exciton binding energy, the difference between the fundamental gap and optical gap, a measure of how strong the excited electron and hole are bound in the exciton, and (iv) the ionization potential (IP) and electron affinity (EA), the energy required to remove an electron and released upon adding an electron to an oligomer, respectively. IP and EA are commonly expressed in terms of standard reduction potentials vs a standard electrode, e.g., the standard hydrogen electrode (SHE). The fundamental gap, finally, can be defined in terms of the difference between IP and EA. Because of the lack of periodicity, properties such as the optical gap, IP, and EA of polymers are typically calculated for a single oligomeric strand; a description that can be improved upon by combining this description with embedding[37−39] in a dielectric continuum model to account for environmental effects. In the latter case, one assumes that all intermolecular interaction, be it with other polymer strands or a solvent or electrolyte, can be described in terms of an (isotropic) dielectric response. Generally, the underlying calculations are performed using Density Functional Theory (DFT) or the combination of DFT with GW theory, methods that, because of their computational cost, are not ideally suited for HTVS. Hutchison and co-workers pioneered the use of semiempirical methods to perform high-throughput virtual screening for properties of conjugated polymers.[3,7,40] Semiempirical methods such as AM1,[41] PM6,[42] and PM7[43] approximate Hartree–Fock or DFT by parametrizing integrals that are expensive to calculate. This parametrization, combined with the use of minimal basis-sets results, makes semiempirical calculations orders of magnitude faster than DFT but potentially less accurate. In this paper, we introduce an approach for the fast and accurate screening of the optoelectronic properties of conjugated (co)polymers. Besides direct use for HTVS we envisage that it can also be applied to generate data to train machine-learning models. We demonstrate that, by combining the recently developed xTB[44] family of density functional tight-binding methods with calibration using a linear model to properties obtained through (Time-Dependent) DFT ((TD)DFT) calculations, one can not only rapidly screen polymers but also achieve (TD)DFT-quality results. We predominantly compare with results of (TD)DFT calculations using the B3LYP density functional, since we previously showed that calculations with this density functional yield accurate potentials for conjugated polymers when compared to experimental photoelectron spectroscopy data on polymeric solids,[39] while also predicting reasonable optical gap values. The difference between our approach here and previous work[3,7,40,45] is 3-fold. Besides the use of xTB rather than alternative semiempirical methods, we calculate adiabatic IP and EA values directly rather than approximate them from orbital energies, include an explicit description of the dielectric environment of the polymers, and use an integrated family of methods for both electronic and optical properties of the polymers.

General workflow

Structure Library Generation

In this section, we give an overview of the general workflow of the proposed high throughput approach. As outlined in Figure , the process involves multiple steps. Starting from a simplified molecular-input line-entry system (SMILES)[46] representation of each monomer unit, linear polymer structures were generated using the Supramolecular Toolkit (stk),[47,48] a python library for the assembly, structure generation and property calculation of supramolecules, which takes base functionality from RDKit.[49]stk allows for flexible copolymer formation from arbitrary monomer units, control over monomer sequence within repeat units, and the automatic generation of different structural isomers where asymmetric monomer units (e.g., 2,5 linked pyridine) are concerned. We restrict polymer chain length in all cases to the approximate equivalent of 12 phenylene monomer units. For example, a copolymer of thiophene and phenylene contains 6 thiophene and 6 phenylene units (each count as 1), a copolymer of fluorene and phenylene contains 4 fluorene and 4 phenylene units (phenylene counts as 1, fluorene counts as 2), and a copolymer of fluorene and carbazole contains 3 fluorene and 3 carbazole units (each counts as 2), as shown in Figure b. Polymer models of this length have previously been shown to provide approximately converged properties with respect to oligomer length.[39]

Figure 1

Figure 2

(a) Repeat units of the polymers in the calibration set. (b) 3D geometries of the conformers of polymers with repeat units 8, 9, 23, 29, 32, and 38 as illustration of the types of conformers encountered.

(a) Workflow of the calibration procedure, resulting in the linear model used to calibrate semiempirically determined properties (IP, EA, and optical gap (Δo)) to DFT-level calculations. (b) Subsequent proposed high-throughput screening approach, where the linear model produced in panel a is applied to new polymer structures for optoelectronic property screening. (a) Repeat units of the polymers in the calibration set. (b) 3D geometries of the conformers of polymers with repeat units 8, 9, 23, 29, 32, and 38 as illustration of the types of conformers encountered.

Conformer Search Strategy

Linear copolymers generally contain large numbers of rotatable bonds, and therefore a conformer search on the initially generated polymer structures is warranted. Here, we make use of a stochastic rather than systematic approach, sampling the conformational space of the polymer randomly using the Experimental-Torsion Distance Geometry with additional basic knowledge (ETKDG) method,[50] where we typically generate 5000 conformers per polymer. The resulting conformers are embedded in 3D space and undergo a subsequent optimization and energy ranking procedure using the Merck Molecular Force Field (MMFF)[51] as implemented in RDKit. This second optimization step adds little computational cost to the overall conformer search, as the dominant contribution to computational cost stems from the initial stochastic conformer generation and embedding.

Property Calibration

At this stage, we take a set of around 40 simple copolymers that we will refer to as the calibration set. This is a set of copolymer structures composed of the monomer units shown in Figure a, for which the optoelectronic properties span a wide range of values. Polymer structures are generated and conformer searches are performed on each. For each system in the calibration set, both (TD)DFT and semiempirical calculations are performed on the lowest energy conformer, according to MMFF, to compute ground state geometries, ionization potentials (IP), electron affinities (EA), and optical gaps (for specific computational details, see below). This data is then used to fit linear models using Python 3.6 (numpy version 1.13.3) for IP, EA, and the optical gap; calibrating properties calculated with the semiempirical method to (TD)DFT-derived data. The motivation behind this approach is that the slopes and intercepts of this calibration step can be used to provide DFT-quality data at the cost of the associated semiempirical method. In turn, this allows a much more rapid exploration of chemical space for high-throughput screening of the intrinsic electronic properties of polymer materials, with the relatively small initial computational investment required for calibration. Naturally, this procedure relies on the semiempirical and DFT-derived data being strongly linearly correlated.

(TD)DFT Calculations

For the (TD)DFT calculations, IPs and EAs are computed using an adiabatic ΔDFT approach, based around the DFT total energies of the neutral polymer and its relaxed cationic/anionic counterparts. These calculations use the COSMO[52,53] solvation model to approximate the dielectric environment of a polymer chain (e.g., at a polymer–water interface, εr 80.1, or within the polymer bulk, εr 2.0). As in our previous work,[39,54,55] the (TD)DFT calculations use the B3LYP[56−59] density functional and the DZP[60] basis-set. The absorption spectra of the polymer models are approximated by vertical singlet excitations, calculated using (TD)B3LYP/DZP. Here, we define the optical gap, i.e., the onset of light absorption, as the energy of the lowest vertical excitation with nonzero oscillator strength. All (TD)DFT calculations described until now have been performed with Turbomole 7.01.[61−63] Additional optical gap calculations were performed using the CAM-B3LYP[64] and LC-ωHPBE[65] range separated hybrid functionals in combination with a 6-31G(d,p) basis set, all using Gaussian 16.[66]

Semiempirical Calculations

Though different semiempirical methods have been applied to conjugated polymers previously,[3,7,40] we make use of a recently developed density functional tight binding approach (GFN-xTB)[44] for structural optimizations of the neutral polymers. For IP/EA calculations, we use an extension of the parent GFN-xTB method, IPEA-xTB,[67] a differently parametrized variant of GFN-xTB for the calculation of ionization potentials and electron affinities. Essentially, IPEA-xTB performs vertical ΔSCF calculations, and has previously been shown to result in typical errors for computed vertical IP/EA values of 0.2–0.4 eV compared to DFT in vacuum.[67] For optical gaps, we use the tight binding simplified Tamm–Dancoff approach (sTDA)[68,69] applied to orbitals and orbital eigenvalues obtained through xTB (sTDA-xTB),[70] an approach capable of ultrafast computation of entire UV–vis absorption spectra. All GFN-xTB calculations were performed using the xtb code,[71] whereas the sTDA results were obtained using the sTDA[72] code. All GFN-xTB calculations, but not sTDA calculations, used the generalized Born surface area solvation model, with the default parameters for benzene and water distributed with the xtb code. Finally, for comparison we also perform analogous PM6,[42] PM7[43] (MOPAC 2016),[73] and ZINDO/S[74] (Gaussian 16)[66] calculations on the polymers to obtain IP and EA values.

Results

In the following, we explore each stage of the high-throughput screening methodology, assessing the accuracy of the combination of the semiempirical calculation and linear calibration model, robustness of the approach when applied to various oligomer lengths and compositions, and how the conformer searching strategy ultimately affects predicted properties.

Calibration–Ionization Potential and Electron Affinity

The IPs and EAs computed by IPEA-xTB and DFT/B3LYP for the calibration set are shown in Figure . For each dielectric environment, simple linear models are fitted to IP and EA, yielding intercept and gradient values that can be applied to calibrate semiempirical data to obtain B3LYP-quality results. All calibration parameters resulting from the regression models are given in Table and Table S12. There is a strong linear correlation between the B3LYP and IPEA-xTB data, with R2 values of 0.99 for both dielectric environments. For the aqueous environment, the mean absolute differences (MAD) between IPEA-xTB and DFT data are 0.37 and 0.16 V for IP and EA, respectively. After calibration using the linear model, these are reduced to 0.08 and 0.06, respectively. The considerable reduction in MAD after calibration demonstrates that not only the trends between different copolymer properties but also the absolute values predicted by DFT are recovered well by calibrated IPEA-xTB. Furthermore, the same calibration procedure for the nonpolar (benzene) environment data results in MAD of 0.10 (IP) and 0.06 (EA) between IPEA-xTB and DFT results, indicating that the calibration quality holds across strongly differing dielectric environments with high and low dielectric constants.

Figure 3

Table 1

Parameters (slope and intercept) of the Linear Models Applied to Calibrate xTB data (obtained with IPEA-xTB and sTDA-xTB for IP/EA and optical gaps, respectively) to DFT data, alongside Mean Absolute Errors of Calibrated Semiempirical Results vs DFT (B3LYP)

Property	Environment	Slope	Intercept	MAD after calibration (eV)
IP	Low dielectric	0.91	0.15	0.10
EA	Low dielectric	0.92	–1.05	0.06
IP and EA	Low dielectric	1.27	–0.32	0.18
IP	High dielectric	0.88	–0.24	0.08
EA	High dielectric	0.84	–0.75	0.06
IP and EA	High dielectric	1.02	–0.41	0.10
Optical Gap	Low dielectric	0.83	–0.21	0.13
Optical Gap	High dielectric	0.85	–0.25	0.13

IP (up triangles) and EA (down triangles) values calculated for the copolymer calibration set using IPEA-xTB and B3LYP/DZP. In each case, properties are computed within a low dielectric screening environment resembling benzene (left) and a high dielectric environment resembling water (right). Two different fits to linear models are performed in each case. (i) Single fit performed for both IP and EA, with the predictions of the resulting model and its 95% prediction bounds shown as dashed line and light blue shaded region, respectively. (ii) Two separate fits are performed for IP and EA, with the predictions of the resulting models and its 95% confidence bounds shown as solid lines and blue shaded region, respectively. All values are presented versus the standard hydrogen electrode potential (4.44 V). The above discussion was based on linear models where IP and EA values are fitted separately. We repeated this procedure, this time fitting the IP and EA results simultaneously (Figure ), a so-called global fit. Calibration using models based on a global fit results in slightly higher MAD values for both dielectric environments (0.18 and 0.10 V for the IP and EAs of low and high dielectric environments, respectively). The small difference in performance for each dielectric environment is reflected in the residual sum of squares, which is lower for the calibration performed in the high dielectric environment (1.23) than the low dielectric environment (2.44), indicating that, although they perform similarly, the global calibration procedure yields a better fit for the former than the latter. Here, we see that the effect of reducing the total number of parameters when moving from two separate models for IP and EA to one global fit is smallest in the high-dielectric range, indicating that a global fit may be appropriate when considering high-dielectric environments. For each linear model, 95% prediction bands indicate that with new polymers not used for fitting, we can expect an accuracy of ±0.25 V with respect to the B3LYP calculations. Below, unless explicitly stated otherwise, all calibrated IPEA-xTB data will have been obtained using separate models for IP and EA. We attempted analogous calibrations for PM6,[42] PM7,[43] and ZINDO/S[74] as examples of alternative semiempirical methods. For PM6 and PM7, we calculated IP and EA both using ΔSCF and by making the orbital approximation, while only applying the latter for ZINDO/S. The orbital approximation is often made in the literature,[7,75] here one approximates IP and EA by the negative of the HOMO and LUMO orbital energies of a given neutral polymer. The results of these calibration attempts can be found in the Supporting Information (Figures S15–S17). In summary, PM6/PM7 ΔSCF calculations and PM6/PM7/ZINDO/S calculations that make the orbital approximation yield poorer results than IPEA-xTB data, before, and importantly after, calibration (Table S12). However, we should note that ZINDO/S performs similarly to IPEA-xTB for IP but that IPEA-xTB offers significant improvements for EA (MAE of 0.18 V for ZINDO/S, 0.06 V for IPEA-xTB in the low dielectric case). The improved performance of IPEA-xTB over PM7, PM6, and ZINDO/S is likely to be at least partly due to the fact that IPEA-xTB was specifically developed to predict IP and EA values.

Calibration–Optical gap

Figure shows computed optical gaps (defined as the vertical excitation to the lowest excited state associated with nonzero oscillator strength) of the calibration set. In analogy to the above, we fit a linear model using data obtained with (TD)B3LYP and with sTDA-xTB. For both dielectric environments, (TD)DFT computed optical gaps are reproduced well by the combination of sTDA-xTB and a calibrated linear model. At least part of the latter shift is probably due to the fact that the (TD)B3LYP calculations employed a dielectric screening model, something that is unavailable for sTDA-xTB.

Figure 4

Optical gap, defined as lowest computed vertical excitations with nonzero oscillator strength, values calculated for the copolymer calibration set at sTDA-xTB and (TD)DFT level (using B3LYP, CAM-B3LYP, and LC-ωHPBE). In each case, properties are computed using a dielectric screening model for benzene. Corresponding results for the water dielectric model can be found in the Supporting Information. Linear model fits are performed in each case, with the predictions of the fitted model and the 95% prediction bounds shown as blue dashed lines and blue solid lines, respectively. To investigate effects of self-interaction and the potential overestimation of conjugation length by the B3LYP hybrid functional, we also employ the range-separated hybrid functional CAM-B3LYP, which has previously been shown to result in consistent agreement between experimental and DFT calculated optical gaps in conjugated polymer systems, particularly when applied to different oligomer lengths.[76] CAM-B3LYP indeed improves the fit quality relative to B3LYP. However, the improvement is marginal and results in a minor change in the slope and a largely rigid shift of the DFT calculated absorption energies to higher values. This improvement upon switching to a long-range corrected functional is also exemplified when reperforming a fit against data obtained with LC-ωHPBE. The improved fits are reflected in the reduced MAD of noncalibrated sTDA-xTB optical gaps compared to density functionals with improving asymptotic behavior, which induces a more even treatment of polymers with different conjugation lengths used in the calibration set (i.e., those with meta-lined monomer units). Specifically, MAD values of optical gaps computed for the calibration set in an aqueous environment compared to B3LYP, CAM-B3LYP and LC-ωHPBE are 0.72, 0.20, and 0.18 eV, respectively. Similarly, the MADs for the nonpolar (benzene) environment are 0.83, 0.22, and 0.18 eV respectively. As we have seen above for IP and EA, these MAD values are significantly reduced through calibration (0.15, 0.10, and 0.08 eV in an aqueous environment, 0.15, 0.10, and 0.09 eV for the nonpolar (benzene) environment for B3LYP, CAM-B3LYP, and LC-wHPBE, respectively). The effect of going from B3LYP to CAM-B3LYP/LC-ωHPBE can be attributed to the improved asymptotic behavior of the exchange-correlation (electronic) potentials of these density functionals relative to B3LYP, reducing self-interaction error and subsequent over- delocalization of π orbitals along the polymer backbones.[77] As the electronic potential in xTB displays the exact asymptotic behavior,[70] the better quality fit is obtained when a long-range corrected density functional is expected.

Calibration and Polymer Length

Following the observations described above, we further test the robustness and versatility of calibration through a linear model by considering oligomers of varying length. To do so, we extract a subset of polymer structures from the calibration set and consider how IP, EA, and optical gap values vary with oligomer length. The results are shown in Figures S5 and S6 of the Supporting Information compared to the predictions of the linear model and 95% prediction intervals from Figures and 4. Considering IP and EA, we can see that the linear relationship between IPEA-xTB and B3LYP holds for all oligomer lengths, indicating that issues with self-interaction observed in the optical gap calibration do not carry over to those of IP and EA. This is consistent with the excellent quality of fit observed in Figure for IP and EA, despite the fact that the calibration set contains meta-linked polymers with a formally shorter conjugation length. Now considering optical gaps, issues with self-interaction become apparent for B3LYP. As the oligomer length decreases from 12 to 8 and then to 4 equiv phenylene units, we see the systematic increase in optical gap relative to sTDA-xTB with decreasing oligomer length associated with the self-interaction error. Moving to CAM-B3LYP, this effect is mitigated, though not entirely eliminated, consistent with the improved (but not exact) asymptotic behavior of the exchange-correlation potential. Now considering LC-ωHPBE, the effect is further reduced compared to CAM-B3LYP, in line with the previous analysis of computed optical gaps in Figure .

Validation of the Approach

Here we demonstrate the robustness of the calibration procedure by introducing additional polymer systems and molecules not used during fitting. Figure shows the previous IP, EA, and optical gap calibrations for the benzene dielectric screening model, alongside results for this validation set. The validation set, see Figure S3, includes polymers potentially relevant to organic photovoltaics, e.g., those based on benzothiazole or thieno[3,2-b]thiophene and polymers from the literature on photocatalytic water splitting, e.g., the carbon nitride polymer melon, consisting of a linear chain of heptazine units linked by −NH– bridges. It also includes some relevant nonpolymeric molecules, e.g., the fullerene OPV acceptor material PCBM, and even two nonconjugated polymers (fully reduced polyaniline and poly(p-xylene)). Aside from application, a crucial feature of the validation set is that it is composed of polymers that are drastically different from those used for calibration. We can see from Figure that the properties of all systems in the validation set follow the same linear trends as identified for the calibration set and thus the calibration developed above is not biased to the polymers in the calibration set, but can be applied more generally to others. Differences between calibrated IPEA-xTB/sTDA-xTB and B3LYP data reflect this, with MAD values of 0.21 and 0.16 V for IP and EA and 0.17 eV for optical gaps calculated using both B3LYP and CAM-B3LYP. For the water dielectric screening model, MAD values of 0.16 and 0.14 V for IP and EA and 0.16 eV for optical gaps calculated using both B3LYP and CAM-B3LYP were obtained.

Figure 5

(a) IP (triangles)/EA (circles) and (b) optical gap (circles) values computed at IPEA-xTB/sTDA-xTB and B3LYP/DZP level for the validation set (Figure S3). In each case, a dielectric screening model for benzene is used. The corresponding linear regression prediction line and 95% prediction bounds from Figures and 4 are included. Analogous results obtained using a dielectric screening model for water can be found in Figure S19 of the Supporting Information.

Sensitivity of Results to Conformational and Repeat Unit Disorder

To investigate the sensitivity of calculated properties to polymer conformation, we calculate IP, EA, and optical gaps for 5000 randomly generated conformers of six selected copolymers. Each conformer is optimized at GFN-xTB level, with IP/EA and optical gaps calculated using IPEA-xTB and sTDA-xTB, respectively. Figure shows the properties calculated for each conformer plotted against the calculated Boltzmann factor relative to the lowest energy conformer. None of the properties are very sensitive to the polymer conformation, with the maximum variation of a given property with respect to conformation around 0.1 (e)V, in line with previous findings.[78] Further, the maximum dispersion in predicted properties lies at higher energy regions (low Boltzmann factor), with far lower variation in IP, EA, and optical gap in low-energy regions (approaching zero for IP and EA and ∼0.05 eV for optical cap). This highlights that, although an exhaustive search of all possible conformers of each polymers may be desirable, it is not necessary to predict the intrinsic electronic properties of a polymer accurately using an oligomer model. In turn, this supports our use of a stochastic approach coupled with a classical molecular mechanics optimization for conformer generation, which inherently may not necessarily be exhaustive.

Figure 6

(a) IP and (b) EA values computed with IPEA-xTB and (c) optical gap values computed with sTDA-xTB for 5000 randomly generated conformers of polymers shown in panel d. The energy of each conformer is expressed as a Boltzmann factor relative to the lowest energy conformer obtained for each polymer model. In this case, each conformer structure has been optimized via GFN-xTB. Symbol colors on plots match the colors of the polymer repeat units shown in panel d. Boltzmann factors are calculated at a temperature of 298 K. As a secondary observation, comparing the distribution of Boltzmann factors for each of the polymers, it appears that differences between the different polymers can be linked to different inter-repeat unit interactions. As an example, for polyphenylene, we calculate a broad distribution of conformers at different energies, while for polypyridine and polythiophene we obtain a narrow distribution centered at high energy, with far fewer conformers at low energy. This highlights a greater thermodynamic preference for polypyridine and polythiophene to assume particular structures, while polyphenylene and PPV remain more conformationally flexible from an energetic point of view. Finally, the lack of sensitivity of the computed properties with respect to conformation appears to hold even for two copolymers containing meta-linking 1,3-phenylene groups, where changes in conformation can lead to larger overall changes in geometry. To test the effects of repeat unit disorder along polymer chains, i.e., the fact that for asymmetric monomers there will be different regioregular and regiorandom structures, we also produced 100 conformers of 25 randomly generated isomers of polymers comprising asymmetric repeat units, generating analogous plots to Figure (see Supporting Information Figure S13). In the same vein as the above analysis, introducing random disorder along polymer chains seems to have little effect on this in all, with maximum deviations in IP, EA, and optical gap around 0.2 (e)V, which reduces to around 0.05 (e)V for lower-energy structures.

Expected Screening Rate

Within the entire process, the conformer-searching step is the clear bottleneck, predominantly because 5000 conformers per polymer are generated, but also because the full set of calculations with xTB on the lowest energy conformer are extremely rapid (typically resulting in <1 min of computation for a typical polymer model on a typical 24 core desktop machine). As an illustrative example, using such a setup we expect a screening rate of approximately 500 polymers per day. However, because of the observed relative insensitivity of IP, EA, and optical gap values to conformational degrees of freedom (Figure ), the number of conformers sampled per polymer can be further reduced and as a result the screening rate suggested above should be viewed as a lower limit.

Perspectives

As demonstrated above, both IPEA-xTB and sTDA-xTB perform well in terms of reproducing the ordering of the IP, EA, and optical gap values predicted by (TD)B3LYP for the polymers in the calibration set. After fitting to linear models for EA, IP, and the optical gap, calibrated IPEA-xTB and sTDA-xTB also accurately reproduce the corresponding absolute values. More importantly, calibrated IPEA-xTB and sTDA-xTB also perform well when reproducing the IP/EA and optical gap values predicted by (TD)B3LYP for the polymers in the validation set, which are not included in the fitting. This ability to accurately predict properties of systems outside of the calibration set is a critical feature for high-throughput virtual screening, where one expects to screen orders of magnitude more polymers than in the calibration set. Bearing in mind that high-throughput approaches may resort to higher-level (DFT) calculations after the identification of promising leads from a low-cost approach, we show that this step could effectively be avoided altogether as a result of the inherent accuracy of the calibrated xTB data. It is interesting to see that even if we performed our calibration exclusively for conjugated polymers, the properties of the nonconjugated polymers (fully reduced polyaniline and poly(p-xylene)) and a nonpolymeric molecule (PCBM) in our validation set are also reasonably well described. As such, the calibrated models proposed here are likely reasonably transferrable and can, for example, be used to predict open-circuit voltages and approximate energy conversions efficiencies using the Scharber model[79] for bulk-heterojunction organic solar cells using a conjugated polymer as a donor and PCBM (or other molecular materials) as an acceptor. We have performed separate calibrations for the low and high dielectric environment cases, relevant to organic photovoltaics/organic light emitting diodes and water splitting photocatalysis, respectively. However, one would expect that the same procedure would yield results of equal accuracy for the case of intermediate dielectric permittivity values relevant for polymers in contact with moderately polar solvents, relevant to the applications of conjugated polymers in batteries, as well as the characterization of their IP/EA values using cyclic voltammetry. The accuracy of the calibrated IPEA-xTB and sTDA-xTB results is, at best, that of the DFT functional they were calibrated to. Conceptually, it would be just as easy to calibrate IPEA-xTB and sTDA-xTB to experimental data or results from higher-level quantum chemical calculations. Our rationale for calibrating to DFT rather than experimental data is that, at least for IP and EA, there is very limited experimental data to calibrate to, especially in the presence of water. Similarly, there is limited data from accurate quantum chemical calculations for adiabatic potentials of polymers in the presence of a dielectric. Following on from the above, our usage of the word calibration is not meant to signify that (TD)DFT using a particular functional inherently yields results that are more accurate and as such we do not benchmark here the xTB family of methods or any of the other semiempirical method discussed above. Rather, the process of calibration and the fitting of a linear model means that the semiempirical methods can be used to predict the answer DFT would give for a fraction of the computational cost. A good illustration of this is the case of the optical gap, where, as discussed above, at least the trend in optical gap values predicted by sTDA-xTB might be more reliable than that predicted by (TD)B3LYP, as the former captures the correct asymptotic behavior of the electronic potential and the latter does not. The observed relative insensitivity of the predicted IP, EA, and optical gap values to conformational degrees of freedom is an attractive property for high-throughput virtual screening for two reasons. First, it implies that the effect of not finding the true lowest energy conformer on the predicted optical and electronic properties is only very minor and that hence a minimal conformer search will suffice. Second, and perhaps more importantly, it suggests that the effect of conformational disorder on the predicted properties is likely equally minor. Based on the good correlation between properties predicted by the xTB family of methods and (TD)B3LYP, it stands to reason that the observed insensitivity to conformational degrees of freedom will translate to property predictions by (TD)DFT, even if orders of magnitude more computationally expensive to probe.

Conclusions

We demonstrate that the xTB family of density functional tight-binding methods from Grimme and co-workers form a powerful basis for a high-throughput screening method for the optoelectronic properties of conjugated polymers. We show that after fitting a linear model using a calibration set of IPEA/sTDA-xTB and (TD)B3LYP results, one can use the former to predict the latter at a fraction of the computational cost. We also demonstrate that the (opto)electronic properties of conjugated polymers are relatively insensitive to conformational degrees of freedom and that hence a minimal conformer search will probably suffice as well as, perhaps more importantly, that the effect of conformational disorder on the predicted properties is likely minor.

7 in total

1. Identifying structure-absorption relationships and predicting absorption strength of non-fullerene acceptors for organic photovoltaics.

Authors: Jun Yan; Xabier Rodríguez-Martínez; Drew Pearce; Hana Douglas; Danai Bili; Mohammed Azzouzi; Flurin Eisner; Alise Virbule; Elham Rezasoltani; Valentina Belova; Bernhard Dörling; Sheridan Few; Anna A Szumska; Xueyan Hou; Guichuan Zhang; Hin-Lap Yip; Mariano Campoy-Quiles; Jenny Nelson
Journal: Energy Environ Sci Date: 2022-05-20 Impact factor: 39.714

2. Understanding structure-activity relationships in linear polymer photocatalysts for hydrogen evolution.

Authors: Michael Sachs; Reiner Sebastian Sprick; Drew Pearce; Sam A J Hillman; Adriano Monti; Anne A Y Guilbert; Nick J Brownbill; Stoichko Dimitrov; Xingyuan Shi; Frédéric Blanc; Martijn A Zwijnenburg; Jenny Nelson; James R Durrant; Andrew I Cooper
Journal: Nat Commun Date: 2018-11-23 Impact factor: 14.919

3. Photocatalytically active ladder polymers.

Authors: Anastasia Vogel; Mark Forster; Liam Wilbraham; Charlotte L Smith; Alexander J Cowan; Martijn A Zwijnenburg; Reiner Sebastian Sprick; Andrew I Cooper
Journal: Faraday Discuss Date: 2019-07-04 Impact factor: 4.008

4. Mapping binary copolymer property space with neural networks.

Authors: Liam Wilbraham; Reiner Sebastian Sprick; Kim E Jelfs; Martijn A Zwijnenburg
Journal: Chem Sci Date: 2019-04-01 Impact factor: 9.825

5. Accelerated Discovery of Organic Polymer Photocatalysts for Hydrogen Evolution from Water through the Integration of Experiment and Theory.

Authors: Yang Bai; Liam Wilbraham; Benjamin J Slater; Martijn A Zwijnenburg; Reiner Sebastian Sprick; Andrew I Cooper
Journal: J Am Chem Soc Date: 2019-05-22 Impact factor: 15.419

6. Screening of metal-organic frameworks for water adsorption heat transformation using structure-property relationships.

Authors: Min Xu; Zhangli Liu; Xiulan Huai; Lanting Lou; Jiangfeng Guo
Journal: RSC Adv Date: 2020-09-18 Impact factor: 3.361

7. Conformational Heterogeneity and Interchain Percolation Revealed in an Amorphous Conjugated Polymer.

Authors: Robert M Ziolek; Alejandro Santana-Bonilla; Raquel López-Ríos de Castro; Reimer Kühn; Mark Green; Christian D Lorenz
Journal: ACS Nano Date: 2022-09-14 Impact factor: 18.027

7 in total