Onur Çaylak1, Anil Yaman2, Björn Baumeier2. 1. Department of Mathematics and Computer Science & Institute for Complex Molecular Systems, Eindhoven University of Technology, P.O. Box 513, 5600MB Eindhoven, The Netherlands. 2. Department of Mathematics and Computer Science, Eindhoven University of Technology, P.O. Box 513, 5600MB Eindhoven, The Netherlands.
Abstract
We present a general framework for the construction of a deep feedforward neural network (FFNN) to predict distance and orientation dependent electronic coupling elements in disordered molecular materials. An evolutionary algorithm automatizes the selection of an optimal architecture of the artificial neural network within a predefined search space. Systematic guidance, beyond minimizing the model error with stochastic gradient descent based backpropagation, is provided by simultaneous maximization of a model fitness that takes into account additional physical properties, such as the field-dependent carrier mobility. As a prototypical system, we consider hole transport in amorphous tris(8-hydroxyquinolinato)aluminum. Reference data for training and validation is obtained from multiscale ab initio simulations, in which coupling elements are evaluated using density-functional theory, for a system containing 4096 molecules. The Coulomb matrix representation is chosen to encode the explicit molecular pair coordinates into a rotation and translation invariant feature set for the FFNN. The final optimized deep feedforward neural network is tested for transport models without and with energetic disorder. It predicts electronic coupling elements and mobilities in excellent agreement with the reference data. Such a FFNN is readily applicable to much larger systems at negligible computational cost, providing a powerful surrogate model to overcome the size limitations of the ab initio approach.
The dynamics of electronic excitations drives the functionality of molecular nanomaterials in many energy applications, e.g., in organic photovoltaics, photocatalysis, thermoelectricity, or energy storage. This dynamics is governed not only by the chemical structure, architecture, or electronic structure of the molecular building blocks but also by the local and global morphology of the materials and by molecular interactions on the mesoscale.[1−3] It is essential to understand how elementary dynamic processes, such as electron or energy transfer, emerge from an interplay of morphology and electronic structure. Such fundamental insight will eventually allow for controlling the above processes and pave the way for a rational design of molecular materials. While macroscale information can be obtained experimentally, zooming into the electronic dynamics at a (sub)atomic scale is nearly impossible.[4]

Computational modeling of, e.g., charge dynamics can provide valuable insight in this situation. The Gaussian disorder model and its various extensions[5−8] have been used to study general aspects of transport, such as the temperature or carrier-density dependence.[9,10] For material specificity, they require information about the width of the density of states, which is typically obtained by fitting to macroscale or device-scale observables, for instance, a current–voltage curve. These more descriptive models cannot provide detailed information about the underlying intermolecular processes.

In contrast, bottom-up simulations of charge and exciton dynamics in large-scale morphologies aim to explicitly zoom in on elementary charge transfer reactions at the molecular level and to predict the mesoscale charge dynamics using multiscale strategies which link the quantum and supramolecular scales.[11−13] Such approaches have shown a remarkable level of predictiveness[3,14−16] but come at the price of high computational costs. They typically involve the determination of bimolecular electron transfer rates in explicit material morphologies, which in turn requires the calculation of intermolecular electronic coupling elements,[17,18] or transfer integrals, of the form

\[ J_{ij} = \langle \Psi_i | \hat{H} | \Psi_j \rangle \qquad (1) \]

where |Ψ_i⟩ and |Ψ_j⟩ are diabatic states of molecules i and j, respectively, and Ĥ is the Hamiltonian of the coupled system. Practical evaluation of eq 1 relies on quantum-mechanical information about the relevant electronic states of the two individual monomers, as well as of the dimer formed by them. Based on density-functional theory (DFT), the necessary calculations for a morphology consisting of a few thousand molecules of moderate size can consume several hundreds of days of CPU time, even with techniques optimized for large-scale systems. Many relevant materials or processes, e.g., the system-size dependence of carrier mobility in dispersive transport, realistic carrier densities, or disordered interfaces in heterojunctions, can only be studied using significantly larger systems that are currently not accessible to multiscale models.

Surrogate stochastic models have been developed to overcome some of these limitations.[19,20] They represent the molecular morphology by (random) point patterns and assign coupling elements between them by drawing from distributions with distance-dependent means and standard deviations, fitted to microscopic data. These models manage to reproduce, e.g., the field dependence of the mobility only stochastically, i.e., averages over several realizations are required to obtain comparable behavior. While the generation of larger surrogate systems is computationally inexpensive, information about the molecular details is lost, and the models are not transferable to interfaces or heterostructures.

In this paper, we develop an alternative surrogate model which allows application to system sizes currently inaccessible to the multiscale ab initio approach while retaining its molecular-level details. The focus is on removing the computational bottleneck associated with the explicit quantum-mechanical evaluation of electronic couplings via eq 1 with the help of machine learning (ML). ML has attracted considerable interest as a tool to save computational costs in large-scale calculations[21] or in exploring chemical space, e.g., by predicting material properties.[22−25] Available models differ in the representation of the molecular information (features) and in the choice of a suitable ML algorithm, and their combinations have been optimized accordingly.[23,26]

For our goal of predicting electronic coupling elements, an appropriate data representation must accurately take into account the distance and mutual orientation between two molecules of the same type, as given by the explicit atomic coordinates. The machine learning algorithm must be capable of reliably predicting values of J that can easily span several orders of magnitude, in particular in amorphous molecular materials. In this situation we turn to biologically inspired computational models known as artificial neural networks (ANNs).[27] However, the construction of an appropriate network architecture is not trivial and may require a trial-and-error approach. We deal with this problem in a systematic way by using search algorithms such as evolutionary algorithms (EAs).[28] The advantage of using an EA approach to construct a neural network is that it not only minimizes the model error but is also capable of taking additional physical principles into account, providing systematic guidance to the design of architectures.

Specifically, we employ such an evolutionary method to construct a multilayered (deep) feedforward neural network (FFNN) for the prediction of electronic coupling elements based on the Coulomb matrix representation[26] of molecular orientations. As a prototypical system, we consider an amorphous morphology of tris(8-hydroxyquinolinato)aluminum (Alq3). An ab initio model of hole transport explicitly determined for a system containing 4096 molecules serves as a reference for the training of the FFNN. The electric-field dependence of the hole mobility is used as an additional fitness parameter in the evolutionary algorithm. The final FFNN model provides inexpensive predictions of J with which hole mobilities are obtained in excellent agreement with the ab initio data, both without and with energetic disorder. We demonstrate that it is readily applicable to systems of larger size containing 8192 and 32768 molecules, respectively.

This paper is organized as follows: In the next section, we briefly recapitulate the steps of the multiscale ab initio model used to obtain the reference data. Details about the data representation and processing and the evolutionary approach for constructing the feedforward neural network, including the definition of its fitness, are given in the Machine Learning Model section. Validation of the model and prediction results are discussed in the Results section. A brief summary concludes the paper.
Multiscale Ab Initio Model
In what follows, we briefly summarize the steps of the multiscale model of charge transport in disordered molecular materials that are needed to generate the ab initio reference for the FFNN. A more in-depth discussion of the procedures can be found in ref (11). The starting point is the simulation of an atomistic morphology using classical Molecular Dynamics (MD). 4096 Alq3 molecules are arranged randomly in a cubic box, which is first equilibrated above the glass transition temperature in an NpT ensemble at T = 700 K and p = 1 bar and subsequently annealed to T = 300 K. The Berendsen barostat[29] with a time constant of 2.0 ps and the velocity rescaling thermostat[30] with a time constant of 0.5 ps are used throughout. All calculations make use of an OPLS-based force field specifically developed[31] for Alq3 and are performed using Gromacs.[32]

In the obtained room-temperature morphology, the centers of mass of the molecules define the hopping sites for charge carriers. Pairs of molecules for which any intermolecular distance of the respective 8-hydroxyquinoline ligands is less than 0.8 nm are added to a neighbor list of possible charge transfer pairs. Transfer rates between two molecules i and j are calculated within the high-temperature limit of nonadiabatic transfer theory[33] using the Marcus expression

\[ \omega_{ij} = \frac{2\pi}{\hbar} \, \frac{J_{ij}^2}{\sqrt{4\pi\lambda k_\mathrm{B} T}} \, \exp\!\left[ -\frac{(\Delta G_{ij} - \lambda)^2}{4\lambda k_\mathrm{B} T} \right] \qquad (2) \]

where ℏ is the reduced Planck constant, T is the temperature, λ is the reorganization energy, k_B is the Boltzmann constant, ΔG_ij is the free energy difference between initial and final states, and J_ij is the electronic coupling element, as defined in eq 1. A hole reorganization energy of λ = 0.23 eV is obtained from DFT calculations with the B3LYP functional[34] and a triple-ζ basis set.[35] Site energies E_i are determined from a classical atomistic model that takes into account the effects of local electric fields and polarization in the morphology.[36,37] Their distribution is approximately Gaussian with σ = 0.20 eV. The influence of an externally applied electric field F is added to the site energy difference as ΔG_ij = ΔE_ij + ΔE_ij^ext = ΔE_ij + eF·r_ij, where r_ij is the distance vector between molecules i and j. Electronic coupling elements are calculated using a dimer-projection technique based on DFT[18] with the Perdew–Burke–Ernzerhof functional[38] and the triple-ζ basis. All DFT calculations are performed with the Orca software package,[39] while the VOTCA package[11,40] is used for all charge-transport-related steps.

The molecular centers of mass and the hopping rates between the molecules can be interpreted as the vertices and edges of a weighted directed graph. In a system with only one charge carrier, the time evolution of the occupation probabilities P_i is described by the Kolmogorov forward equation (Master equation)

\[ \frac{\partial P_i}{\partial t} = \sum_j \left( P_j \omega_{ji} - P_i \omega_{ij} \right) \qquad (3) \]

However, in this work we are interested in a system that is in a steady state. This restriction allows us to write eq 3 as

\[ 0 = \sum_j \left( p_j \omega_{ji} - p_i \omega_{ij} \right), \quad \text{i.e.,} \quad \mathbf{W}\mathbf{p} = 0 \qquad (4) \]

Here, the matrix W can be constructed from the Marcus rates ω_ij. The field-dependent mobility μ(F) can be obtained from the steady-state occupation probabilities via the relation

\[ \mu(\vec{F}) = \frac{\langle \vec{v} \rangle \cdot \vec{F}}{|\vec{F}|^2} \qquad (5) \]

where ⟨v⟩ is the average carrier velocity.
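To make these steps concrete, the following minimal sketch assembles the rate matrix W from Marcus rates (eq 2), solves the stationary Master equation (eq 4) for a single carrier, and evaluates the mobility (eq 5). It is an illustrative implementation under simplifying assumptions (array-based inputs, a least-squares null-space solve, no periodic boundary treatment, and a sign convention for ΔG chosen for illustration), not the VOTCA workflow used in the paper.

```python
# Sketch (not the authors' code): hole mobility from Marcus rates via the
# stationary Master equation, following eqs 2-5. Input shapes, the uniform
# reorganization energy, and the sign conventions are illustrative assumptions.
import numpy as np

KB = 8.617333262e-5     # Boltzmann constant in eV/K
HBAR = 6.582119569e-16  # reduced Planck constant in eV*s

def marcus_rate(J2, dG, lam=0.23, T=300.0):
    """Marcus rate (eq 2) from squared coupling J^2 (eV^2), free energy
    difference dG (eV), and reorganization energy lam (eV)."""
    pref = 2.0 * np.pi / HBAR * J2 / np.sqrt(4.0 * np.pi * lam * KB * T)
    return pref * np.exp(-(dG - lam) ** 2 / (4.0 * lam * KB * T))

def stationary_mobility(pairs, r_vec, J2, dE, field, lam=0.23, T=300.0):
    """Solve W p = 0 (eq 4) for one carrier and return mu (eq 5).

    pairs : (M, 2) array of neighbor-list site indices
    r_vec : (M, 3) distance vectors r_ij in m
    J2    : (M,)   squared couplings in eV^2
    dE    : (M,)   site-energy differences in eV
    field : (3,)   applied field in V/m
    """
    n = int(pairs.max()) + 1
    W = np.zeros((n, n))
    rates = {}
    for (i, j), r, j2, de in zip(pairs, r_vec, J2, dE):
        for a, b, sign in ((i, j, +1.0), (j, i, -1.0)):
            # dG_ab = dE_ab + e F . r_ab; F (V/m) dot r (m) is numerically the energy in eV
            dG = sign * de + np.dot(field, sign * r)
            w = marcus_rate(j2, dG, lam, T)
            rates[(a, b)] = (w, sign * r)
            W[b, a] += w      # gain of site b from site a
            W[a, a] -= w      # loss of site a
    # Stationary occupations: null space of W with the constraint sum(p) = 1
    A = np.vstack([W, np.ones(n)])
    rhs = np.zeros(n + 1); rhs[-1] = 1.0
    p, *_ = np.linalg.lstsq(A, rhs, rcond=None)
    # Average velocity <v> = sum_ab p_a w_ab r_ab and mobility along the field
    v = sum(p[a] * w * r for (a, b), (w, r) in rates.items())
    f = np.linalg.norm(field)
    return np.dot(v, field) / f**2   # in m^2/(V s); multiply by 1e4 for cm^2/(V s)
```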
Machine Learning Model
Data Representation
Explicit structural information on molecular pair geometries is extracted from the MD simulations and used to construct the features of the data set. Featurization is the process of encoding molecules into vectors, where each vector is assigned a label, in this case log10[(J/eV)²].

Coupling elements between molecular pairs are invariant under translation and rotation, which is not accounted for by the plain atomic coordinates R. The Coulomb matrix (CM)[23,26,41] representation captures this invariance and is used in the following to encode the crucial information into the data set's features.

For every molecular pair, the entries of the corresponding Coulomb matrix C are computed according to

\[ C_{ij} = \begin{cases} \tfrac{1}{2} Z_i^{2.4} & \text{for } i = j \\[4pt] \dfrac{Z_i Z_j}{|\mathbf{R}_i - \mathbf{R}_j|} & \text{for } i \neq j \end{cases} \qquad (6) \]

where Z_i is the nuclear charge of atom i and R_i its position. Figure 1 illustrates the building blocks of the CM representation applied to a pair of molecules. One obtains a symmetric matrix that consists of four equally sized block matrices. The upper-right block encodes the intermolecular orientation, whereas the upper-left and lower-right blocks represent the intramolecular conformations.
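As an illustration of eq 6, a minimal sketch of the pair Coulomb matrix construction is given below; the function name, array layout, and length unit are assumptions chosen for illustration, not part of the original workflow.

```python
# Sketch of the pair Coulomb matrix (eq 6) for two molecules; atomic numbers
# and coordinates are placeholder inputs, not data from the paper.
import numpy as np

def pair_coulomb_matrix(Z_a, R_a, Z_b, R_b):
    """Coulomb matrix of the dimer formed by molecules a and b.

    Z_a, Z_b : (n_a,), (n_b,) nuclear charges
    R_a, R_b : (n_a, 3), (n_b, 3) atomic coordinates (assumed in Angstrom)
    The upper-right n_a x n_b block encodes the intermolecular arrangement,
    the diagonal blocks the intramolecular conformations.
    """
    Z = np.concatenate([Z_a, Z_b])
    R = np.vstack([R_a, R_b])
    n = len(Z)
    C = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            if i == j:
                C[i, j] = 0.5 * Z[i] ** 2.4                        # diagonal: 0.5 Z_i^2.4
            else:
                C[i, j] = Z[i] * Z[j] / np.linalg.norm(R[i] - R[j])  # off-diagonal: Z_i Z_j / |R_i - R_j|
    return C
```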
Figure 1
Visualization of a Coulomb matrix for molecular pairs. (a) The Coulomb matrix contains values corresponding to the inter- and intramolecular electrostatic interactions. The lowest values (dark red) correspond to interactions involving hydrogen atoms, whereas interactions among heavy atoms of the ligands lead to values near 10 (yellow). (b) Schematic representation of how the arrangement of two Alq3 molecules is encoded in specific regions of the upper-right block of the CM.
Before being usable as input for the FFNN, the calculated CMs must be preprocessed, as illustrated in Figure 2(b,c). This step involves the removal of the lower triangular entries, including the diagonal elements, and scaling of the values to the interval [0,1]. Subsequent vectorization yields the instances of the descriptor space that serve as input to the artificial neural network.
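A possible realization of this preprocessing step is sketched below; column-wise min-max scaling is assumed here as one way to map the features to [0,1], and the helper names are illustrative.

```python
# Sketch of the preprocessing: keep only the strictly upper triangle of each
# Coulomb matrix, scale all features to [0, 1] over the data set, and use the
# resulting vectors as FFNN input. Min-max scaling is an assumed choice.
import numpy as np

def vectorize_upper_triangle(C):
    """Strictly upper-triangular entries of a Coulomb matrix as a 1D feature vector."""
    iu = np.triu_indices(C.shape[0], k=1)
    return C[iu]

def scale_to_unit_interval(X):
    """Column-wise min-max scaling of an (n_samples, n_features) matrix to [0, 1]."""
    lo, hi = X.min(axis=0), X.max(axis=0)
    return (X - lo) / np.where(hi > lo, hi - lo, 1.0)

# Example usage (coulomb_matrices is a placeholder list of CMs):
# X = scale_to_unit_interval(np.array([vectorize_upper_triangle(C) for C in coulomb_matrices]))
```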
Figure 2
Overview of the data flow from raw molecular dynamics information to the neural network input. (a) Explicit atomic coordinates of a molecular pair are extracted from an MD snapshot. (b) A symmetric Coulomb matrix, with dimension given by the sum of the number of atoms per molecule, is constructed. (c) To keep only relevant and nonredundant information in the vectorized form of the Coulomb matrix, preprocessing techniques such as feature selection (upper triangle) and data scaling (to [0,1]) are applied. (d) The final vectorized CM enters a feedforward neural network with hidden layers h_1, ..., h_5 to predict the electronic coupling elements J_FFNN.
Deep Feedforward Neural Networks and Evolutionary
Algorithms
Artificial neural networks consist of a number of artificial neurons, typically arranged in layers with a specific connectivity referred to as topology. Among the ANNs, feedforward neural networks arrange a certain number of consecutive layers of neurons in which each neuron of a layer is directionally connected to all neurons of the next layer. The activation a_i^(m) of neuron i in layer m is computed from the activations of the neurons in layer m−1 according to

\[ a_i^{(m)} = f\!\left( \sum_j \nu_{ij}^{(m)} \, a_j^{(m-1)} \right) \qquad (7) \]

where ν_ij^(m) is the weight of the connection between the neurons and f is an activation function. For our purposes, this activation function is given by a sigmoid function.
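As a minimal illustration of eq 7, the sketch below propagates an input vector through consecutive sigmoid layers; the weight matrices and layer sizes are placeholders, and no bias terms are included, matching the form of eq 7.

```python
# Minimal sketch of a feedforward pass with sigmoid activations (eq 7);
# layer sizes and weights are illustrative, not the optimized architecture.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def ffnn_forward(x, weights):
    """Propagate an input vector through consecutive fully connected layers.

    weights : list of (n_m, n_{m-1}) matrices nu^(m), one per layer
    """
    a = x
    for W in weights:
        a = sigmoid(W @ a)   # a_i^(m) = f(sum_j nu_ij^(m) a_j^(m-1))
    return a
```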
One of the conventional ways of training FFNNs is the backpropagation algorithm with stochastic gradient descent. However, the number of layers and the number of neurons per layer have to be defined before the training. These parameters are referred to as hyperparameters and play an important role in the performance of the networks. Although there are some "rule of thumb" guidelines established from empirical studies, the selection of a proper set of hyperparameters may require a great deal of expert knowledge and/or trial-and-error. This can be avoided by search algorithms such as Genetic Algorithms (GAs). GAs are population-based global search algorithms inspired by biological evolution.[28] The research field known as neuroevolution employs evolutionary computing approaches to optimize artificial neural networks.[42] Adopting the terminology from biology, the genetic material of a population of individuals encodes the parameters of the ANNs. The encoding depends on the parameters of the ANNs to be optimized (topology and/or weights).[43,44]

The workflow of such a genetic algorithm is shown schematically in Figure 3. It starts with randomly initializing a population of individuals (Figure 3(a)), where each individual is evaluated and assigned a fitness value (Figure 3(b)) to measure its performance. The main part of the algorithm performs selection, crossover, and mutation operations aimed at iteratively improving the fitness values. The selection operator (Figure 3(c)) selects the individuals with better fitness values to construct the next generation of individuals. One of the most commonly used selection operators is roulette wheel selection, in which individuals are selected with a probability proportional to their fitness values. The stochasticity of this selection process may occasionally cause the best individuals to disappear from the population. It can therefore be combined with an elitist selection scheme, which selects the top-ranked individuals, such as n_1 and n_2 in Figure 3(c), and transfers them unchanged directly to the next generation. The crossover operator combines two individuals selected by the roulette wheel operator (parents, n_3 and n_4 in Figure 3(d)) to generate two new individuals (offspring, n′ and n″). In particular, the 1-point crossover operator selects, with a probability p, a random crossover point and exchanges the corresponding parts of the two parents to generate the offspring. Subsequently, a mutation operator randomly flips bit values in components of the binary representation of the offspring individuals with a small probability p. Overall, the GA is run for a certain number of iterations or until a satisfactory solution, defined, e.g., by a threshold fitness value, is found.
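The following sketch illustrates one generation of such a GA with roulette wheel selection, elitism, 1-point crossover, and bit-flip mutation; the population, probabilities, and binary encoding are generic placeholders rather than the settings used later in this work.

```python
# Generic sketch of one GA generation (roulette-wheel selection, elitism,
# 1-point crossover, bit-flip mutation). All parameters are placeholders.
import random

def roulette_select(population, fitnesses):
    """Pick an individual with probability proportional to its fitness."""
    pick = random.uniform(0.0, sum(fitnesses))
    acc = 0.0
    for ind, fit in zip(population, fitnesses):
        acc += fit
        if acc >= pick:
            return ind
    return population[-1]

def one_point_crossover(p1, p2):
    """Exchange the tails of two parents at a random cut point."""
    cut = random.randrange(1, len(p1))
    return p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]

def mutate_bits(ind, p_m=0.01):
    """Flip each binary gene independently with probability p_m."""
    return [1 - g if random.random() < p_m else g for g in ind]

def next_generation(population, fitnesses, n_elite=2, p_c=1.0, p_m=0.01):
    ranked = [ind for _, ind in sorted(zip(fitnesses, population),
                                       key=lambda t: t[0], reverse=True)]
    new_pop = ranked[:n_elite]            # elitism: keep the best unchanged
    while len(new_pop) < len(population):
        p1 = roulette_select(population, fitnesses)
        p2 = roulette_select(population, fitnesses)
        c1, c2 = one_point_crossover(p1, p2) if random.random() < p_c else (p1[:], p2[:])
        new_pop += [mutate_bits(c1, p_m), mutate_bits(c2, p_m)]
    return new_pop[:len(population)]
```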
Figure 3
Flowchart of the evolutionary algorithm used for optimizing the deep FFNN. The initialization process consists of generating k = 30 arbitrary FFNN architectures {n_1, ..., n_k}. (a) The weights of the networks are then optimized with a backpropagation algorithm. (b) For every optimized network, the predicted electronic coupling elements J_FFNN are used in eq 2 to determine the matrix W. After solving the stationary Master equation (eq 4) and calculating the field-dependent mobility, the fitness of the architectures is evaluated. (c) The architectures are then ordered based on their fitness and selected according to the roulette wheel principle. (d) After applying the crossover and mutation operators, a new generation of feedforward neural networks is generated, and the whole process is repeated until one of the stopping criteria is satisfied.
Construction
of a Deep FFNN for Prediction of Electronic Coupling Elements
In this work, we use a genetic algorithm to optimize the topology parameters (the number of hidden layers and the number of neurons in each hidden layer) of the feedforward neural networks. Each individual in the population is represented as a five-dimensional string

\[ n = (s_1, s_2, s_3, s_4, s_5) \]

which encodes an ANN topology, with each component s_m giving the number of neurons in hidden layer m. The maximum number of hidden layers a network can have is therefore set to 5, and the number of neurons in each hidden layer is taken from a set of 21 discrete values. We limit the FFNN topologies in this manner to reduce the search space and the computational complexity. Consequently, our genetic algorithm aims to find the optimal model settings among 21^5 = 4,084,101 possible networks.
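A sketch of this encoding and the resulting search-space size is given below; the concrete set of 21 neuron counts shown is an assumption made only for illustration, as is the convention that a value of 0 drops the corresponding layer.

```python
# Sketch of the topology encoding: each individual is a 5-component string whose
# entries are drawn from 21 allowed neuron counts. The concrete set below is an
# illustrative assumption; 0 is assumed to mean "no layer at this position".
import random

ALLOWED = [0] + list(range(50, 1001, 50))   # 21 discrete values (assumed set)
assert len(ALLOWED) == 21
print(len(ALLOWED) ** 5)                    # 4,084,101 possible topologies

def random_individual():
    return [random.choice(ALLOWED) for _ in range(5)]

def decode_topology(individual):
    """Hidden-layer sizes of the encoded FFNN (zeros drop the layer)."""
    return [n for n in individual if n > 0]
```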
on a training data set using backpropagation to minimize the error
between the target log10[(JDFT/eV)2] (actual labels of the input data) and predicted
outputs log10[(JFFNN/eV)2].
We use three distinct snapshots extracted from different MD simulations
for training and validation. Each snapshot contains 4096 Alq3 molecules with approximately 24000 pairs in the neighbor list as
described in section . The first snapshot is used to optimize the weights of a chosen
neural network, while the second snapshot is used for selecting the
neural network architectures based on their fitness values. The third
data set is used to analyze the performance of the constructed final
neural network on completely unseen data.The fitness value
of a given feedforward neural network architecture is determined by
evaluating the mean squared error of the difference between the mobility
μFFNN and the reference mobility μDFTwhere N stands for the number of field values.
For each |F| = b·107 V/m with b ∈ {3,4,5,7,9}, the mobility is
obtained as an average over ±x, ±y, ±z directions of the applied field.Our GA starts with randomly initializing 30 individuals and evaluating
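A minimal sketch of this fitness evaluation is given below, assuming, as suggested by the units of Ξ reported in the Results, that the fitness is the inverse of the mean squared mobility error.

```python
# Sketch of the fitness (eq 8): inverse mean squared deviation of the FFNN-based
# mobilities from the DFT reference over the N applied field strengths.
import numpy as np

def fitness(mu_ffnn, mu_dft):
    """mu_ffnn, mu_dft : mobilities (cm^2/(V s)) at the N field values."""
    mse = np.mean((np.asarray(mu_ffnn) - np.asarray(mu_dft)) ** 2)
    return 1.0 / mse   # units V^2 s^2 / cm^4, to be maximized
```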
Our GA starts by randomly initializing 30 individuals and evaluating the fitness of the constructed FFNNs, as illustrated in Figure 3. We use roulette wheel selection combined with elitism and the standard 1-point crossover operator with a probability of p = 1. The mutation operator selects a random component of a string with a probability p = 0.1 and replaces it with a randomly selected value from the allowed set. The probability of selecting the value corresponding to an empty layer is 0.3, and the remaining values share the residual probability of 0.7 equally, which encourages smaller networks. The GA was run until there was no fitness improvement for 20 generations.
Results
In Figure 4 we show the calculated field dependence of the hole mobility in Alq3 for the various models, (a) without (μ(ΔE = 0)) and (b) with (μ(ΔE)) energetic disorder taken into account in eq 2. The data indicated by the orange squares have been obtained with the ab initio model as described in the Multiscale Ab Initio Model section and serve as the reference for the FFNN model.
Figure 4
Field-dependent hole mobility (Poole-Frenkel plot) of Alq3, for systems containing 4096, 8192, and 32768 molecules. In (a)
the mobility μ for the disorder-free case, i.e. ΔE = 0, is given, whereas (b)
illustrates the mobility μ for the case with disorder, i.e.,
ΔE ≠ 0.
In particular, we chose the disorder-free case as in Figure 4(a) in the evaluation of the fitness (eq 8) during the evolutionary FFNN optimization. Here, the rates, and concomitantly the mobility, are solely determined by the topological connectivity of the charge transporting network given by the electronic coupling elements.[45] The reference mobility has a slightly negative slope with increasing field strength, which is attributed to the system being in the inverted Marcus regime for ΔE = 0. Light green triangles show μ_FFNN(F) as it results from two individuals of the first FFNN generation, with vastly different performances. The first one yields completely unphysical behavior with mobilities on the order of 10^4 cm²/(V s), about 5 orders of magnitude larger than the ab initio reference. In comparison, the second model is much closer but underestimates μ_DFT(F) consistently by about a factor of 10. While this agreement appears acceptable, a closer inspection of Figure 4(a) reveals a low fitness value (Ξ = 1.5 × 10^3 V² s²/cm⁴). The predicted log10[(J_FFNN/eV)²] shows a MAE of 1.80 and is only weakly correlated with the DFT reference, as can be seen in Figure 5(c). From this starting point, the evolution of the FFNN results in an initially slowly increasing fitness. Over the first 25 generations the fitness improves by only a factor of 2. This slow growth is followed by a rapid fitness evolution that takes place within only six generations, after which it stops quickly, and the process ends up in an equilibrium situation. Such a phenomenon of near-instantaneous change is not unique to our evolutionary FFNN; similar patterns have also been observed in evolutionary biology in the fossil record, where they are known as punctuated equilibrium.[46] The last-generation FFNN consists of two layers with 800 and 550 neurons, respectively, and is characterized by a fitness value of Ξ = 2 × 10^6 V² s²/cm⁴, an improvement of 3 orders of magnitude over the first model. This is also reflected by the obtained hole mobility (circles) in Figure 4(a), which is practically indistinguishable from the ab initio reference; see also Table 1. The MAE is reduced to 0.55, and the correlation in Figure 5(b) is clearly improved.
Figure 5
(a) Fitness evolution of the best performing feedforward neural
network in each generation, showing a slow growth followed by rapid
improvement reminiscent of punctuated equilibrium. (b) Correlation
of predicted and reference data for the coupling elements of the final
optimal FFNN model, compared to (c) the one in the first generation.
Table 1
Hole Mobilities (in cm²/(V s)) of Alq3 for Different Values of the Externally Applied Electric Field^a

| system size | method | 3 × 10^7 V/m | 4 × 10^7 V/m | 5 × 10^7 V/m | 7 × 10^7 V/m | 9 × 10^7 V/m |
|---|---|---|---|---|---|---|
| without disorder, ΔE_ij = 0 | | | | | | |
| 4096 | DFT | 4.7 × 10^−2 ± 2.1 × 10^−4 | 4.6 × 10^−2 ± 1.7 × 10^−4 | 4.4 × 10^−2 ± 2.0 × 10^−4 | 4.2 × 10^−2 ± 2.1 × 10^−4 | 4.1 × 10^−2 ± 4.3 × 10^−4 |
| 4096 | FFNN | 4.6 × 10^−2 ± 1.8 × 10^−4 | 4.5 × 10^−2 ± 2.1 × 10^−4 | 4.3 × 10^−2 ± 2.5 × 10^−4 | 4.0 × 10^−2 ± 3.0 × 10^−4 | 3.7 × 10^−2 ± 4.3 × 10^−4 |
| 8192 | FFNN | 3.9 × 10^−2 ± 3.6 × 10^−4 | 3.8 × 10^−2 ± 3.8 × 10^−4 | 3.6 × 10^−2 ± 4.0 × 10^−4 | 3.3 × 10^−2 ± 4.3 × 10^−4 | 3.0 × 10^−2 ± 4.4 × 10^−4 |
| 32768 | FFNN | 3.0 × 10^−2 ± 5.1 × 10^−5 | 2.9 × 10^−2 ± 6.0 × 10^−5 | 2.8 × 10^−2 ± 6.7 × 10^−5 | 2.6 × 10^−2 ± 7.6 × 10^−5 | 2.4 × 10^−2 ± 7.8 × 10^−5 |
| with disorder, ΔE_ij ≠ 0 | | | | | | |
| 4096 | DFT | 8.2 × 10^−9 ± 1.5 × 10^−9 | 1.1 × 10^−8 ± 2.3 × 10^−9 | 1.5 × 10^−8 ± 4.4 × 10^−9 | 3.0 × 10^−8 ± 8.8 × 10^−9 | 4.1 × 10^−8 ± 9.8 × 10^−9 |
| 4096 | FFNN | 1.0 × 10^−8 ± 4.6 × 10^−10 | 1.3 × 10^−8 ± 1.2 × 10^−9 | 1.7 × 10^−8 ± 1.9 × 10^−9 | 2.5 × 10^−8 ± 2.6 × 10^−9 | 3.6 × 10^−8 ± 4.2 × 10^−9 |
| 8192 | FFNN | 9.8 × 10^−10 ± 4.6 × 10^−10 | 2.1 × 10^−9 ± 1.2 × 10^−9 | 3.6 × 10^−9 ± 1.9 × 10^−9 | 7.8 × 10^−9 ± 2.6 × 10^−9 | 1.5 × 10^−8 ± 4.2 × 10^−9 |
| 32768 | FFNN | 2.1 × 10^−10 ± 1.2 × 10^−10 | 3.2 × 10^−10 ± 1.7 × 10^−10 | 4.7 × 10^−10 ± 2.8 × 10^−10 | 1.5 × 10^−9 ± 1.1 × 10^−9 | 4.3 × 10^−9 ± 3.0 × 10^−9 |

^a Results for cases both without (ΔE = 0) and with energetic disorder (ΔE ≠ 0) are given for the three different system sizes considered, as obtained from DFT and FFNN based coupling elements, respectively.
With a FFNN model at hand that shows good characteristics and performs well for the disorder-free case, we now evaluate its applicability in the realistic scenario with energetic disorder taken into account. This constitutes an important test for the FFNN model of the electronic coupling elements. While optimization of the model based on the disorder-free case should, ideally, predict the connectivity of the charge transporting network accurately, it does so for a completely flat energy landscape. It cannot be excluded a priori that coupling elements that are below the percolation threshold (J² < 4 × 10^−7 eV²), and hence insignificant for μ(ΔE = 0), provide low-probability, but crucial, additional pathways to avoid or escape low-energy regions (traps) in the ΔE ≠ 0 case. With the large energetic disorder of σ = 0.20 eV, the ab initio reference model exhibits a mobility reduction of about 6 orders of magnitude, see the orange squares in Figure 4(b), and a noticeable positive field dependence known as Poole-Frenkel behavior, μ(F) = μ0 exp(α√F). The FFNN model reproduces this behavior extremely well, with an observed maximum error of 5.0 × 10^−9 cm²/(V s) and a MAE of 2.7 × 10^−9 cm²/(V s), which are both smaller than the average standard error of the mean of μ_DFT of 5.4 × 10^−9 cm²/(V s), see Table 1. Poole-Frenkel slopes of α_DFT = 4.2 × 10^−3 (cm/V)^1/2 and α_FFNN = 3.2 × 10^−3 (cm/V)^1/2 are in reasonable agreement with each other, taking the bounds of the respective errors of the individual mobility values as given in Table 1 into account.
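One way such Poole-Frenkel slopes can be extracted, assumed here purely for illustration, is a linear fit of ln μ against √F; the sketch below is not the authors' fitting procedure.

```python
# Sketch of extracting a Poole-Frenkel slope alpha from mu(F) = mu_0 exp(alpha sqrt(F))
# via a linear fit of ln(mu) versus sqrt(F). The field values follow the text;
# the fitting procedure itself is an assumption for illustration.
import numpy as np

def poole_frenkel_slope(fields_V_per_cm, mobilities):
    """Return (alpha in (cm/V)^(1/2), mu_0) from a linear fit of ln(mu) vs sqrt(F)."""
    x = np.sqrt(np.asarray(fields_V_per_cm, dtype=float))
    y = np.log(np.asarray(mobilities, dtype=float))
    alpha, ln_mu0 = np.polyfit(x, y, 1)
    return alpha, np.exp(ln_mu0)

# Example: fields |F| = b * 10^7 V/m = b * 10^5 V/cm with b in {3, 4, 5, 7, 9}
# alpha, mu0 = poole_frenkel_slope([b * 1e5 for b in (3, 4, 5, 7, 9)], mu_ffnn)
```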
Based on this comparison, we conclude that the FFNN provides very reliable predictions of electronic coupling elements in Alq3 over a wide range of magnitudes, taking into account the explicit details of molecular orientations in large-scale morphologies. It also comes at a significantly reduced computational cost compared to the ab initio model. For a single frame containing N molecules and (on average) κN hopping pairs, the total CPU time for the coupling elements is T = κN T_coupl, where T_coupl is a typical CPU time per coupling element. Using the DFT-based method as described in the Multiscale Ab Initio Model section consumes about T_coupl = 20 min on one thread of an Intel(R) Xeon(R) CPU E7-4830 v4 @ 2.00 GHz for Alq3. With κ ≈ 5.5, one obtains T_4096 ≈ 22500 pairs × 20 min ≈ 312 d. These calculations are, however, easily parallelizable in high-throughput settings, so that transport simulations can be performed within an acceptable total simulation time of, e.g., 1 week. For the 4096-molecule system of Alq3, this can be achieved by using about 45 threads simultaneously. It is apparent that, due to the linear scaling of T with the number of molecules, studying even slightly larger systems (which might be necessary when transport properties are system-size dependent or to investigate realistic carrier concentrations) implies a linear increase in either the total simulation time or the number of simultaneously used threads. Note that we are not addressing issues related to memory or storage. Compared to the cost of the DFT calculations of the reference data, the cost for training the FFNN is small. Within the search space, training of a single neural network, making use of an Nvidia Titan Xp GPU (3840 CUDA cores running at 1.6 GHz), takes about 5 min. Solving the Master equation (eq 4) for the fitness evaluation as part of the EA approach was performed on a single CPU thread in a few seconds. The complete training procedure as in Figure 3 takes in total about 3 days. With the trained FFNN at hand, the evaluation of a coupling element is practically instantaneous, which removes the above costs and restrictions of the ab initio model.
To demonstrate the applicability of the FFNN in this context, we have also simulated Alq3 morphologies containing 8192 and 32768 molecules, respectively, following the same procedure as before, except for the calculation of the J_DFT, which would have taken T_8192 = 624 d and T_32768 = 2496 d. Applying the FFNN model, the hole mobilities are calculated, and the results are shown in Figure 4 and in Table 1. In the disorder-free case (Figure 4(a)), the mobility is, as expected, practically independent of the system size. With energetic disorder taken into account, the situation in Figure 4(b) is markedly different. Doubling the system size from 4096 to 8192 molecules lowers the mobility by about 1 order of magnitude, while a further quadrupling reduces the mobility by about the same amount. Such behavior is indicative of dispersive transport[47] and is related to the fact that the mean transport energy of the charge carrier depends on the system size.

All in all, the FFNN constructed with the evolutionary approach described in this work, based on fitness evaluation in the ΔE = 0 case, has not only proven to work well for the more realistic, unseen ΔE ≠ 0 simulation but also in application to larger systems that are inaccessible to the standard ab initio model.
Summary
To summarize, we have presented a general framework for the construction
of a deep feedforward neural network to predict electronic coupling
elements. The final FFNN model constructed for an amorphous organic
semiconductor, tris(8-hydroxyquinolinato)aluminum, showed good agreement
with ab initio reference data with and without energetic disorder.
Additionally, we have shown that the final model is applicable to
larger systems, which makes the presented approach a promising candidate
to overcome system size limitations inherent to computationally expensive
multiscale approaches.