
Diversifying Databases of Metal Organic Frameworks for High-Throughput Computational Screening.

Sauradeep Majumdar¹, Seyed Mohamad Moosavi¹, Kevin Maik Jablonka¹, Daniele Ongari¹, Berend Smit¹.

Abstract

By combining metal nodes and organic linkers, an infinite number of metal organic frameworks (MOFs) can be designed in silico. Therefore, when making new databases of such hypothetical MOFs, we need to ensure that they not only contribute toward the growth of the count of structures but also add different chemistries to the existing databases. In this study, we designed a database of ∼20,000 hypothetical MOFs, which are diverse in terms of their chemical design space: metal nodes, organic linkers, functional groups, and pore geometries. Using machine learning techniques, we visualized and quantified the diversity of these structures. We find that on adding the structures of our database, the overall diversity metrics of hypothetical databases improve, especially in terms of the chemistry of metal nodes. We then assessed the usefulness of diverse structures by evaluating their performance, using grand-canonical Monte Carlo simulations, in two important environmental applications: post-combustion carbon capture and hydrogen storage. We find that many of these structures perform better than widely used benchmark materials such as Zeolite-13X (for post-combustion carbon capture) and MOF-5 (for hydrogen storage). All the structures developed in this study, and their properties, are provided on the Materials Cloud to encourage further use of these materials for other applications.

Entities: Chemical

Keywords: MOFs; carbon capture; diversity; hydrogen storage; machine learning; molecular simulations

Year: 2021 PMID: 34910455 PMCID: PMC8719320 DOI: 10.1021/acsami.1c16220

Source DB: PubMed Journal: ACS Appl Mater Interfaces ISSN: 1944-8244 Impact factor: 9.229

Introduction

Metal organic frameworks (MOFs) have been an exciting class of crystalline nanoporous materials since their discovery about 2 decades ago. By combining metal nodes and organic linkers, one can, in principle, make an infinite number of MOFs.[1] Over 100,000 MOFs have already been synthesized.[2−4] Owing to characteristics such as high surface area, large pore volume, and a wide range of pore sizes from the micro- to the mesoscale, MOFs have found applications in several areas, including gas storage,[5] catalysis,[6] and nondistillative separations.[7−9] In practice, however, we can synthesize only a small fraction of the millions of possible MOF structures, because synthesizing a new MOF and then characterizing and testing it can take many months.[10] Therefore, computational researchers have been building databases of hypothetical MOF structures for high-throughput screening. The idea is that efficient computational algorithms can generate MOF structures and evaluate them for different applications faster and at lower cost than experiments. Ranking these hypothetical MOFs by specific material properties then helps to identify the most promising materials for a given application,[4,11] so that experimental efforts can focus on synthesizing only those. Thus, along with the 100,000 experimental structures, there are also millions of hypothetical MOFs that have been generated computationally.[10,12−15] One of the earliest hypothetical MOF databases was generated by Wilmer et al.[14] and consisted of around 137,000 MOFs constructed using a ‘Tinkertoy’ algorithm. This database was used in various screening studies for gas storage and separations, and some of those hypothetical MOFs have been experimentally synthesized.[16] This approach to MOF construction, however, had a limitation. 
These 137,000 MOFs sample only six topologies, and most of them have the pcu topology. The algorithm sequentially connected the molecular building blocks (secondary building units, SBUs) until a periodic crystal was formed; in other words, it used a bottom-up approach to generate MOFs. Along with the building blocks, topologies also play an important role in MOF performance. The net of a MOF, also called its topology, represents the underlying connectivity of the metal nodes and organic linkers. Gomez-Gualdron et al.[17] showed in their computational study of Zr-MOFs for volumetric methane storage that Zr-MOFs based on the ftw topology outperform Zr-MOFs based on the scu and csq topologies, even when the same organic linkers are used for all three topologies. Subsequent algorithms therefore explore different topologies using a top-down MOF construction approach. One such topology-based MOF construction algorithm was developed by Boyd and Woo[12,18] and is called ToBasCCo. It was used to generate around 325,000 MOFs, which were screened for post-combustion carbon capture, and two of those structures, namely Al-PMOF and Al-PyrMOF, have also been synthesized. Another topology-based MOF construction algorithm was developed by Gomez-Gualdron et al.[13] and is called ToBaCCo. Hypothetical MOFs generated using ToBaCCo have been used for screening applications in hydrogen storage, methane storage, and xenon–krypton separation. Some structures from this database were also synthesized and tested.[13] Recently, a study by Lee et al.[15] presented an algorithm to explore a MOF space of over 100 trillion materials, which was used to find the optimal structures for methane storage. At present, there are thus several databases with millions of hypothetical MOF structures in total. The algorithms underlying these databases have focused on enumerating as many structures as possible for a given topology, metal node, organic linker, and functional group. 
The end result is that we have now reached such a large number of structures that it is practically impossible to screen all of them for a possible application. In addition, because the number of possible structures is infinite, this is a fundamental problem that we cannot solve with faster computers. It is therefore important to take a different approach: carefully select a representative set of diverse structures as a starting point for a screening study and, subsequently, add only novel structures where our set of most diverse materials has gaps. In this respect, a detailed analysis of the diversity of the computation-ready experimental metal–organic framework (CoRE MOF) database,[3] which represents the synthesized MOFs from the Cambridge Structural Database (CSD),[19] and hypothetical databases was performed by Moosavi et al.[20] Descriptors were built to capture features of a MOF, such as pore geometry, metal chemistry, linker chemistry, and functional groups, which combine to form the chemical design space for a MOF chemist. The chemical diversity of a MOF was then expressed in terms of these features. Moosavi et al.[20] concluded that with respect to pore geometry, linker chemistry, and functional groups, the hypothetical databases seem to be sufficiently well covered. However, with respect to metal chemistry, the hypothetical databases turn out to be less explored. The variety of metal chemistry in hypothetical databases was found to be surprisingly low compared to that in the experimental databases.[20] Hence, there are many MOF structures corresponding to these missing metal nodes, lying in the less explored regions of the material space. It is therefore important that we are able to generate such structures to study their properties. By harvesting these metal nodes using the algorithms developed for this purpose,[15,20,21] one could generate such structures. 
In this study, we thus designed a database of ∼20,000 hypothetical MOFs, keeping in mind their chemical diversity in terms of pore geometry, metal chemistry, linker chemistry, and functional groups. We focused on improving the diversity of metal nodes in hypothetical MOF databases by harvesting metal nodes from experimental structures. Diversity of metal nodes can matter for important environmental applications like carbon capture.[20] We are interested in carbon capture and storage because it is considered to be one of the most promising and viable technologies to address rising CO2 emissions into the atmosphere.[22,23] In this study, we have specifically looked into post-combustion carbon capture. Another application we have looked into here is the storage of hydrogen, a promising vehicular fuel.[24,25] Promising structures found in this screening study could then be added to the list of already available top-performing structures, making the resulting list chemically more diverse. This would also provide a wider range of structures to choose from when attempting synthesis.

Methods

Building Block Selection and Structure Generation

Moosavi et al.[20] developed a methodology to mine metal nodes from experimental MOF databases. Some of these metal nodes are not commonly used for structure generation in hypothetical MOF databases. Thus, in this work, we focused on some of these metal nodes, as a proof of concept to validate our argument of improving metal diversity. Figure 1 shows some of these metal nodes. In total, 14 metal nodes have been used and they are all listed in the Supporting Information. We have chosen metal nodes consisting of different metals such as nickel, zinc, cadmium, copper, manganese, cobalt, and lead. Additionally, we have included metal nodes with different connectivities, ranging from 4-connected nodes up to 12-connected nodes, as well as different coordination geometries such as triangular, tetranuclear, square-planar, and others. Several libraries of organic linkers have been reported in the literature, such as the ToBaCCo database[9,13] and the ToBasCCo database.[12,18] We selected our organic linkers from these reported libraries. We have also used several functional groups to decorate these organic linkers (see Supporting Information). Within ToBaCCo, there is a list of all the topologies from the Reticular Chemistry Structure Resource Database (RCSR)[26] (see Supporting Information for the list of all topologies used in this study). The topologies were selected from this list based on their compatibility with the building blocks (metal nodes and organic linkers). It is to be noted that the list of topologies used in this study is not exhaustive. One could, in principle, generate even more structures by exploring more topologies. Because our focus in this study was on the diversity of the structures and not on the number of structures per se, we did not explore all possible topologies for a given set of building blocks. The building blocks along with the topologies were then used to build the hypothetical MOFs using the ToBaCCo algorithm.
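The compatibility-based selection of topologies can be illustrated with a minimal sketch. It assumes, as a simplification, that a net is a candidate for a metal node whenever one of its vertex types has the same connectivity as the node; the actual ToBaCCo workflow also has to match geometry, which is not modeled here. The helper name `compatible_topologies` and the small net table are hypothetical choices for illustration, not part of ToBaCCo or of the list used in this study.

```python
# Vertex connectivities of a few RCSR nets mentioned in the text.
# This small table is an illustrative subset, not the list used in the study.
NET_VERTEX_CONNECTIVITY = {
    "pcu": {6},
    "ftw": {4, 12},
    "scu": {4, 8},
    "csq": {4, 8},
    "tfz-d": {3, 8},
}


def compatible_topologies(node_connectivity, nets=NET_VERTEX_CONNECTIVITY):
    """Return the nets that have a vertex with the given connectivity, sorted by name.

    Simplifying assumption: only the number of connection points is compared.
    """
    return sorted(name for name, conns in nets.items() if node_connectivity in conns)


# Hypothetical queries for an 8-connected and a 12-connected metal node.
print(compatible_topologies(8))
print(compatible_topologies(12))
```

A filter of this kind only narrows the candidate list; whether a structure can actually be assembled still depends on the shapes of the node and the linker.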

Figure 1

Some metal nodes used in this study to generate hypothetical MOFs. The metal type, connection type, and the metal node names as we used in this study are provided below each node. (a) 6-connected Ni metal node. (b) 8-connected Co metal node. (c) 12-connected Mn metal node. (d) 6-connected Ni metal node. (e) 6-connected Zn metal node. (f) 5-connected Cu metal node. These metal nodes are shown to highlight different metals, connectivities, and geometries used in this study. The entire list of metal nodes used in this study is provided in the Supporting Information.

Structure Optimization and Charge Generation

The hypothetical MOFs of our database were optimized using the universal force field (UFF).[27] The optimization of the structures was performed using LAMMPS,[28] the input and data files for which were generated using lammps_interface.[29] The EQeq (extended charge equilibration) method[16] was used to generate the partial charges of the framework atoms of the hypothetical MOFs designed in this work.[30] Additional details on the structure optimization and charge generation process are provided in the Supporting Information.

Diversity Analysis

To analyze the diversity of MOF databases, we used a set of descriptors to quantify the similarity of MOF structures. Because both pore geometry and material chemistry are important in gas separation applications, we need descriptors for both aspects. Several material descriptors have been developed to characterize different aspects of the similarity of MOF materials.[31−34] We used classic geometric characteristics, such as the largest included sphere, surface area, density, and pore volume to describe the pore geometry. These descriptors were computed using Zeo++.[35,36] We described the chemistry of MOF structures using revised-autocorrelations (RACs). RACs are the product or difference of atomic heuristics, for example, Pauling electronegativity, connectivity, and covalent radii, computed on a molecular or crystal graph.[37] While RACs were initially introduced for machine-learning open shell transition metal complex properties,[37−39] they were recently adapted to MOF chemistry[20] and shown to be successful in capturing structure–property relationships for gas adsorption[20] and photoelectronic properties (e.g., color) of MOFs.[40] In this approach, the MOF structure is described with three groups of features, describing the metal centres, organic linkers, and the functional groups. In total, 156 RAC descriptors were computed using the molSimplify package[38,41] to describe the chemistry of a MOF structure. We computed variety, balance, and disparity to assess the diversity of the material databases. The diversity metrics were calculated for each aspect of the material chemistry that includes chemistry of metals, linkers, functional groups, and the pore geometry. The chemical and geometric descriptors construct high-dimensional feature spaces. We first split these high-dimensional spaces into 1000 bins using the k-means clustering method. In this approach, the structures were assigned to their closest centroid. 
Then, the three diversity metrics were computed using this binning. Each diversity metric captures different information related to the diversity. These diversity metrics (variety, balance, and disparity) are also used in a wide range of other fields, such as the study of ecosystem stability and the social sciences.[42−46] Following Moosavi et al.,[20] variety has been calculated as the percentage of all the bins sampled by a given database, that is, how many distinct types of structures exist in a database, normalized by the 1000 unique bins. The balance of a database indicates how even the distribution of structures in the database is. For example, suppose that database 1 contains 100 structures of type A and 2 structures of type B, and database 2 contains 70 structures of type A and 50 structures of type B. The variety is the same in both databases, but the balance is very low in database 1. There is thus a bias toward structures of type A in database 1. We used Pielou’s evenness,[47] which measures how evenly the structures are distributed among the sampled bins, as a measure of the balance. The evenness of the distribution of structures could be computed using different methods, which are all transformations of the Shannon entropy.[20] The Shannon entropy is given by

$$H(X) = -\sum_{i=1}^{n} P(x_i)\,\log P(x_i)$$

where $P(x_i)$ is the fraction of structures in bin $i$ and $n$ is the number of bins. The maximum entropy, $H_{\max}(X) = \log n$, would be achieved in case of a uniform distribution. Therefore, normalizing the system entropy with the maximum entropy (when all bins are equally likely) would give us a metric for evenness, the relative entropy. One transformation of the entropy was introduced by Pielou

$$PL_{\mathrm{rel}}(X) = 1 - \frac{H(X)}{H_{\max}(X)}$$

We used 1 − $PL_{\mathrm{rel}}(X)$ in this study to measure the evenness of distribution, such that 1 is the maximum evenness, that is, uniform distribution. The disparity metric gives us a measure of the spread of the structures in a database. 
A high value of disparity would mean that the database contains significantly dissimilar structures that are far apart from each other in the material space. To compute disparity, we computed the area of the concave hull covered by a database in the map of the first two principal components. We normalized this number by the area covered by all databases together. The covered area was computed using the Shapely package with a circumference-to-area ratio cutoff of 1.[48] A detailed description of the material descriptors and diversity analysis can be found in the previous work.[20]
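The binning, variety, and balance calculations can be sketched in a few lines of plain Python. This is a minimal illustration, not the code used in the study: it assumes that the centroids are already available (for example from a k-means fit), that balance is the Shannon entropy divided by the logarithm of the number of sampled bins, and that a database occupying a single bin is assigned a balance of 0. The function names are hypothetical, and disparity is not covered.

```python
import math
from collections import Counter


def assign_to_bins(points, centroids):
    """Assign each point to the index of its nearest centroid (Euclidean distance)."""
    return [
        min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
        for p in points
    ]


def variety(bin_labels, n_bins):
    """Fraction of all bins that contain at least one structure."""
    return len(set(bin_labels)) / n_bins


def balance(bin_labels):
    """Evenness: Shannon entropy over the sampled bins divided by its maximum.

    Assumed normalization: H_max = log(number of sampled bins).
    """
    counts = Counter(bin_labels)
    if len(counts) < 2:
        return 0.0  # convention chosen here for a single occupied bin
    total = sum(counts.values())
    entropy = -sum((c / total) * math.log(c / total) for c in counts.values())
    return entropy / math.log(len(counts))


# The two-database example from the text; types A and B play the role of bins.
database_1 = ["A"] * 100 + ["B"] * 2
database_2 = ["A"] * 70 + ["B"] * 50
for name, db in (("database 1", database_1), ("database 2", database_2)):
    print(name, variety(db, n_bins=2), round(balance(db), 3))
```

Both databases occupy the same two bins and therefore have the same variety, while the balance of database 1 is far lower because almost all of its structures are of type A.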

Property Calculation

The pore limiting diameter and blocking spheres for each MOF were calculated using Zeo++.[35,36] For blocking spheres, we considered spherical probes with diameters of 3.05 Å for CO2 (oxygen’s sigma in TraPPE), 3.31 Å for N2 (nitrogen’s sigma in TraPPE), and 2.96 Å for H2 (hydrogen’s sigma in the Buch force field[49]). The force-field parameters for the framework atoms were extracted from UFF.[27] CO2 and N2 molecules were described by the TraPPE force field,[50] and H2 was described by the Buch force field[49] with the Feynman Hibbs correction[51] (see Supporting Information for the full list of parameters used). The gas–framework interactions were modeled using Lennard Jones potential, truncated at 12 Å (for CO2 and N2) and 12.8 Å (for H2), with tail corrections.[52] The Lennard Jones interactions between dissimilar atoms were approximated using Lorentz–Berthelot rules.[53] The Coulombic electrostatic interactions were computed using Ewald summation. The gas adsorption calculations were performed in RASPA.[54] Grand-canonical Monte Carlo (GCMC) simulations were used to compute the gas uptake of the MOFs. Each calculation consisted of 10,000 equilibration cycles followed by 10,000 production cycles. In RASPA, a cycle is defined as max(20,N) steps where N is the number of molecules.[54] The pure component CO2 uptakes were calculated at 1 bar and 298 K. We also calculated the uptakes of CO2 and N2 for a binary mixture of CO2 and N2 in the ratio of 0.15:0.85. For the binary mixture, we considered the flue gas to be adsorbed at 1 bar and 298 K and regenerated at 0.1 bar and 363 K. These conditions have been used in several studies for post-combustion CO2 capture.[12,55−57] The H2 uptakes were calculated at 100 bar and 77 K. These conditions have been used in several studies for hydrogen storage.[9,24]
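The interaction model described above can be made concrete with a small sketch of the Lorentz–Berthelot mixing rules, a truncated 12-6 Lennard-Jones energy, and the cycle length quoted for RASPA. It is an illustration under stated assumptions rather than the simulation setup itself: tail corrections and electrostatics are omitted, the function names are hypothetical, and apart from the 3.05 Å sigma of the CO2 oxygen all numerical parameters are placeholders.

```python
import math


def lorentz_berthelot(sigma_i, eps_i, sigma_j, eps_j):
    """Arithmetic mean for sigma (Lorentz) and geometric mean for epsilon (Berthelot)."""
    return 0.5 * (sigma_i + sigma_j), math.sqrt(eps_i * eps_j)


def lj_truncated(r, sigma, eps, cutoff):
    """12-6 Lennard-Jones energy, set to zero at and beyond the cutoff.

    The tail correction mentioned in the text is not included in this sketch.
    """
    if r >= cutoff:
        return 0.0
    sr6 = (sigma / r) ** 6
    return 4.0 * eps * (sr6 * sr6 - sr6)


def steps_per_cycle(n_molecules):
    """Cycle length as stated in the text: max(20, N) Monte Carlo steps."""
    return max(20, n_molecules)


# sigma = 3.05 A for the CO2 oxygen is quoted in the text; the remaining
# numbers are illustrative placeholders, not parameters from the paper.
sigma_mix, eps_mix = lorentz_berthelot(3.05, 79.0, 2.46, 62.0)
r_min = 2 ** (1 / 6) * sigma_mix  # position of the potential minimum
print(sigma_mix, eps_mix)
print(lj_truncated(r_min, sigma_mix, eps_mix, cutoff=12.0))
print(steps_per_cycle(5), steps_per_cycle(150))
```

At the minimum of the potential the energy equals minus the mixed epsilon, which gives a quick sanity check on the mixing rules.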

Results and Discussion

Diversity Analysis

Using the workflow described in the Methods section, a database of ∼20,000 MOFs was generated. We took the combination of all synthesized MOF structures and hypothetical MOF structures as the current chemical space of MOFs. This chemical space is described by the high-dimensional pore geometry and chemistry feature vectors, and we projected it onto two dimensions to visualize which regions of the material design space our hypothetical MOFs cover. For experimental structures, we considered the CoRE-2019 database.[3] For hypothetical structures, we considered the database developed by Anderson et al.,[58] the ToBaCCo database,[13] and a diverse subset of 20,000 structures from the database developed by Boyd and Woo.[12] These databases were used for the analysis in the work by Moosavi et al.[20] Figure 2 shows a dimensionality reduction visualization of all hypothetical MOF databases, including the database we have developed in this study, overlaid on the total set of all experimental and hypothetical MOF databases. The distributions of the databases are shown with respect to their pore geometry, metal chemistry, linker chemistry, and functional groups. For pore geometry, linker chemistry, and functional groups, the hypothetical databases cover and sample the design space well. For metal chemistry, we find that the sampling of the design space has improved on including the structures from this study, when compared to the previous distribution reported by Moosavi et al.[20] This overall improvement of the diversity in metal chemistry has been quantified below (Table 1).

Figure 2

Visualization of the material design space. The t-Distributed Stochastic Neighbour Embedding (t-SNE)[59] method was used to project the pore geometry, metal chemistry, linker chemistry, and functional groups descriptor spaces to 2D maps. The t-SNE method preserves local similarities, so that similar structures are mapped close to each other in two dimensions. (See principal component analysis figures in Supporting Information for the global similarities.) Only descriptors up to the second coordination shell were included for metal chemistry to emphasize the local metal chemistry environment. The entire known design space, containing the structures from all databases (experimental and hypothetical), is represented in gray. The structures from all the hypothetical databases were colored and overlaid on this design space. Thus, the colored regions represent those parts of the design space that are covered by the hypothetical databases. The gray regions represent those parts of the design space that are not covered by the hypothetical databases.

Table 1

Diversity Metrics for the Different Features of Hypothetical Databases

feature	hypothetical databases	variety	balance	disparity
geometric	excluding this study	0.977	0.849	0.874
geometric	including this study	0.988	0.775	0.933
metal center	excluding this study	0.068	0.334	0.078
metal center	including this study	0.107	0.296	0.104
linker chemistry	excluding this study	0.648	0.617	0.737
linker chemistry	including this study	0.684	0.446	0.798
functional group	excluding this study	0.722	0.213	0.782
functional group	including this study	0.851	0.323	0.834

We first split the high-dimensional spaces into 1000 bins using the k-means clustering method. Variety measures the percentage of all the bins sampled by a given database. Balance measures the evenness of the distribution of the structures among the sampled bins. Disparity measures the spread of the sampled bins and is normalized by the area covered by all databases together.

Figure 3 shows the regions of the material design space we have specifically contributed to through the database of this study. As discussed in the Methods section, we selected the organic linkers, functional groups, and topologies from the respective libraries reported in the literature. For metal nodes, however, we focused on the ones that have not been commonly used in the other hypothetical databases mentioned in this study. Thus, if we look at the metal chemistry map, we find that our metal nodes cover different regions of the space than the metal nodes previously used in the other hypothetical databases.[20] When we combine the metal nodes used in all the hypothetical databases, we get the map of metal chemistry shown in Figure 2.

Figure 3

Visualization of the material design space. The t-SNE method was used to project the pore geometry, metal chemistry, linker chemistry, and functional groups descriptor spaces to 2D maps. Only descriptors up to the second coordination shell were included for metal chemistry to emphasize the local metal chemistry environment. The entire known design space, containing the structures from all databases (experimental and hypothetical), is represented in gray. The structures from the hypothetical database developed in this study were colored and overlaid on this design space. Thus, the colored regions represent those parts of the design space that are covered by the hypothetical database developed in this study. The gray regions represent those parts of the design space that are not covered by the hypothetical database developed in this study.

We have quantified the diversity of the databases in terms of their variety, balance, and disparity (Table 1). The variety of a database indicates how many distinct types of structures exist in the database. Balance indicates how even the distribution of structures is. Disparity of a database reflects how dissimilar or distinct the structures of the database are. A high disparity would thereby indicate that we have structures from far apart points in the material design space. We calculated these metrics for the hypothetical databases in two scenarios: before adding the database of this study and after adding it. For the geometric features of the hypothetical databases, we find a slight increase in the variety and the disparity and a slight decrease in the balance. For the metal center features, we see that on adding the structures from the database of this study, the variety of structures has improved. The balance of the structures decreases slightly, and the disparity of the structures also improves. This gives us an indication that the overall diversity of structures with respect to the metal chemistry has improved upon adding the structures from the database of this study. For the linker chemistry, as with the geometric features, we see a slight increase in the variety and the disparity and a decrease in the balance. For functional groups, we see an improvement in all three diversity metrics. Now that we have designed a diverse set of hypothetical MOF structures, our next aim was to see if we could use them in some practical applications.
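To put the changes in Table 1 on a common scale, the relative change of each metric can be computed directly from the tabulated values. The short sketch below does only this arithmetic; the dictionary layout and the helper `relative_change` are hypothetical conveniences, and the numbers are copied from Table 1.

```python
# Values copied from Table 1: (excluding this study, including this study).
TABLE_1 = {
    "geometric": {"variety": (0.977, 0.988), "balance": (0.849, 0.775), "disparity": (0.874, 0.933)},
    "metal center": {"variety": (0.068, 0.107), "balance": (0.334, 0.296), "disparity": (0.078, 0.104)},
    "linker chemistry": {"variety": (0.648, 0.684), "balance": (0.617, 0.446), "disparity": (0.737, 0.798)},
    "functional group": {"variety": (0.722, 0.851), "balance": (0.213, 0.323), "disparity": (0.782, 0.834)},
}


def relative_change(before, after):
    """Relative change in percent with respect to the 'before' value."""
    return 100.0 * (after - before) / before


for feature, metrics in TABLE_1.items():
    changes = {name: round(relative_change(*pair), 1) for name, pair in metrics.items()}
    print(feature, changes)
```

In relative terms the metal-center variety rises by roughly 57%, the largest gain in variety among the four features, although its absolute value of 0.107 remains far below that of the other features.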

Post-Combustion Carbon Capture

Figure 4 shows the distribution of the hypothetical MOFs of the current study for the uptake of pure CO2 at 1 bar and 298 K. A reference line has been drawn to denote the pure CO2 uptake of Zeolite-13X, which is often used as a benchmark CO2 adsorbent.[12] From the distribution, we find that many structures perform as well as Zeolite-13X (pure CO2 uptake of ∼5 mmol g–1),[55,60] and around 800 structures surpass its performance.

Figure 4

Results from the computational screening of ∼20,000 MOFs for post-combustion carbon capture (pure CO2 adsorption at 1 bar and 298 K). This plot shows the distribution of the pure CO2 uptake of the MOFs. The blue reference line denotes the pure CO2 uptake of Zeolite-13X.

We further investigated the structures for their performance in separating CO2 from flue gas. For this, we considered a binary mixture of CO2 and N2. We then calculated the CO2 working capacity and CO2/N2 selectivity of these hypothetical structures. Figure 5 shows how the structures perform. Again, we find that many structures (around 250) surpass the performance of Zeolite-13X under dry conditions, that is, they have a CO2 working capacity greater than ∼2 mmol g–1 and a CO2/N2 selectivity greater than ∼50.[12,61]
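The two screening metrics can be written down explicitly. The sketch below assumes the commonly used definitions, which the text does not spell out: working capacity as the CO2 uptake under adsorption conditions minus the uptake under regeneration conditions, and selectivity as the adsorbed-phase CO2/N2 ratio divided by the gas-phase ratio of 0.15:0.85. The function names and the example uptakes are hypothetical.

```python
def working_capacity(uptake_adsorption, uptake_desorption):
    """CO2 uptake at adsorption conditions minus uptake at regeneration conditions."""
    return uptake_adsorption - uptake_desorption


def selectivity(q_co2, q_n2, y_co2=0.15, y_n2=0.85):
    """Adsorbed-phase CO2/N2 ratio divided by the gas-phase CO2/N2 ratio."""
    return (q_co2 / q_n2) / (y_co2 / y_n2)


def beats_reference(wc, sel, wc_ref=2.0, sel_ref=50.0):
    """True if both metrics exceed the approximate Zeolite-13X values quoted in the text."""
    return wc > wc_ref and sel > sel_ref


# Hypothetical uptakes in mmol/g for a single structure (not data from the paper).
wc = working_capacity(uptake_adsorption=3.0, uptake_desorption=0.75)
sel = selectivity(q_co2=3.0, q_n2=0.25)
print(wc, sel, beats_reference(wc, sel))
```

Requiring both thresholds at once mirrors the upper-right region of a working-capacity versus selectivity plot, where a structure must do well on both axes.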

Figure 5

Results from the computational screening of ∼20,000 MOFs for post-combustion carbon capture (15:85 CO2/N2 mixture with adsorption at 1 bar and 298 K and regeneration at 0.1 bar and 363 K). This plot shows the CO2 working capacity versus CO2/N2 selectivity of the MOFs. The color coding represents the number of MOFs according to the color bar on the right. The blue reference lines denote the CO2 working capacity and CO2/N2 selectivity of Zeolite-13X.

Based on diversity analysis, Moosavi et al.[20] concluded that for CO2 adsorption at low pressures, metal chemistry as a factor cannot be ignored. To illustrate the importance of using different metal nodes in our database and to see how some of the metal nodes in our study performed, we plotted the pure CO2 uptake at 1 bar and 298 K for some of the metal nodes used in this study. For example, Figure 6 shows the distribution of the pure CO2 uptake for the metal nodes mn1 (Ni-based), mn3 (Cu-based), mn2 (Zn-based), and mn12 (Ni-based). Nodes mn1, mn2, and mn3 have similar tetranuclear metal clusters. However, in mn3, Cu forms a five-connected cluster, Cu4(μ3–OH)2(COO)5,[62] and in mn1 and mn2, Ni and Zn form six-connected clusters, Ni4(μ3–OH)2(COO)6 and Zn4(μ3–OH)2(COO)6, respectively.[63,64] Many of the MOF structures of this study containing nodes mn1 and mn2 have pure CO2 uptakes above 5 mmol g–1, while hardly any of the structures containing node mn3 have pure CO2 uptakes above 5 mmol g–1. This shows that even for metal nodes of similar geometry, the type of metal can lead to nodes with different connectivities and to different CO2 adsorption characteristics of the MOF.

Figure 6

Pure CO2 uptake distribution for MOFs of different metal nodes at 1 bar and 298 K. (a) Distribution of MOFs of node mn1 (left) and the structure of node mn1 (right). (b) Distribution of MOFs of node mn3 (left) and the structure of node mn3 (right). (c) Distribution of MOFs of node mn2 (left) and the structure of node mn2 (right). (d) Distribution of MOFs of node mn12 (left) and the structure of node mn12 (right). The number of bins for the MOFs of metal node mn2 is smaller than for the other nodes because fewer MOFs were generated with this node (it was one of the last metal nodes to be added to our database). The blue reference lines denote the pure CO2 uptake of Zeolite-13X.

If we then compare the distributions for nodes mn1 and mn12, both metal nodes are made of Ni but have different geometries. In mn1, Ni forms a tetranuclear cluster, while in mn12, Ni forms a six-connected triangular cluster, Ni3(μ3–OH)2(COO)6.[65] The structures made of both these metal nodes perform quite well in their CO2 uptakes. This shows how the same metal can form two nodes of different geometries and that the structures of both nodes could be promising for CO2 capture. Capturing these variations in the metal nodes is important because when we include all these metal nodes, we get the final distribution shown in Figure 4, which is very different from the individual distributions. The presence of these different metal nodes also helps in obtaining many high performing structures and offers a wider range of metal nodes to choose from when synthesizing new MOFs for carbon capture. We find that most of the top performing structures contain the metal nodes mn1 (Ni-based, ∼40% of the top performing structures), mn12 (Ni-based, ∼30%), mn13 (Co-based, ∼8%), and mn2 (Zn-based, ∼5%). For the linkers, the top performing structures range from simple two-coordinated linkers like benzene dicarboxylic acids, through more complicated three-coordinated linkers like benzene-1,3,5-tricarboxylic acid and triazine, to six-coordinated linkers like bicyclooctanes, as well as combinations of different linkers. Figure 7 shows the structure of one of the high performing MOFs for post-combustion carbon capture.

Figure 7

Structure of a top performing MOF for post-combustion carbon capture—ddmof_559—metal node mn1 + organic edge oe31 + topology snk.

Hydrogen Storage

Figure 8 shows a plot of the gravimetric uptake versus the volumetric uptake of H2 in our structures at 100 bar and 77 K. This plot shows a volcano-type relationship between the two types of uptakes, as observed in previous studies.[9,24] Reference lines have been drawn to denote the gravimetric H2 uptake (9.20 wt %)[24] and volumetric H2 uptake (52.64 g L–1)[24] of MOF-5, a widely used benchmark material for H2 storage selected by the Hydrogen Storage Engineering Centre of Excellence (HSECoE).[24,66,67] We find that many structures from our database have a gravimetric uptake higher than that of MOF-5. An ideal H2 adsorbent should, however, exhibit a balance between high gravimetric uptake and high volumetric uptake.[24] This is because the volumetric uptake of the H2 storage system has a greater impact on the driving range of fuel cell vehicles (FCVs) than the gravimetric uptake.[24,66−69] In this respect, we find around 50 MOFs from our database that outperform MOF-5. Figure 9 shows the structure of one such promising hypothetical MOF of this study for hydrogen storage.
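The comparison against MOF-5 amounts to a simple filter on two uptakes. The sketch below assumes that "outperform" means exceeding MOF-5 in both gravimetric and volumetric uptake at the same time; the record layout, the structure names, and the uptake numbers are hypothetical placeholders, while the two reference values are the ones quoted above.

```python
MOF5_GRAVIMETRIC = 9.20  # wt %, value quoted in the text
MOF5_VOLUMETRIC = 52.64  # g/L, value quoted in the text


def outperforms_mof5(structure):
    """True if a structure exceeds MOF-5 in both gravimetric and volumetric uptake."""
    return (
        structure["gravimetric_wt_pct"] > MOF5_GRAVIMETRIC
        and structure["volumetric_g_per_L"] > MOF5_VOLUMETRIC
    )


# Hypothetical uptake records; names and numbers are placeholders, not results.
structures = [
    {"name": "mof_a", "gravimetric_wt_pct": 12.1, "volumetric_g_per_L": 48.0},
    {"name": "mof_b", "gravimetric_wt_pct": 9.8, "volumetric_g_per_L": 53.5},
    {"name": "mof_c", "gravimetric_wt_pct": 7.4, "volumetric_g_per_L": 55.0},
]
winners = [s["name"] for s in structures if outperforms_mof5(s)]
print(winners)
```

A structure such as the placeholder `mof_a` illustrates the trade-off discussed above: a high gravimetric uptake alone is not sufficient if the volumetric uptake falls below the benchmark.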

Figure 8

Results from the computational screening of ∼20,000 MOFs for hydrogen storage (pure H2 adsorption at 100 bar and 77 K). The colour coding represents the number of MOFs according to the colour bar on the right. The blue reference lines denote the volumetric H2 uptake and gravimetric H2 uptake of MOF-5.

Figure 9

Structure of a top-performing MOF for hydrogen storage: ddmof_6749 (metal node mn13 + organic node on1 + organic edges oe33, oe68 + tfz-d topology).

Here, we find that almost all of the top-performing structures have a tfz-d topology, as shown in Figure . Metal nodes that form structures in this topology are therefore more prevalent in these top 50 structures. In this study, these are mainly eight-connected metal nodes like mn13 (Co-based, ∼85% of the top-performing structures) and mn4 (Cd-based, ∼15% of the top-performing structures). The exploration of different topologies with these metal nodes led to the generation of MOFs with different pore geometries, which in turn made some of these structures promising for hydrogen storage. This again highlights the importance of having a diverse database. Here, we did not pre-bias the structures for a particular application. We tried to make a diverse set of structures that covers different aspects of MOF chemistry. Some features of these structures play an important role in one application and some features in other applications. Therefore, some of these structures turn out to be good for one application and some for another. In the case of hydrogen storage, it is the topology of a MOF, rather than its metal chemistry, that plays the more important role in making a top-performing storage material.

Figure 10

Results from the computational screening of ∼20,000 MOFs for hydrogen storage (pure H2 adsorption at 100 bar and 77 K). The structures with the tfz-d topology are highlighted in blue and all the remaining structures are in gray.

It is important to note here that our analysis is based on the current state-of-the-art methods used in screening studies, that is, generic force fields and rigid crystals. In our case, we used the UFF, which generally gives good predictions of the adsorption behavior, but for some classes of materials (those with open metal sites), it is known to underestimate the adsorption.[70] As open metal sites are very sensitive to water,[71] these materials are in practice less interesting for carbon capture applications, so we did not attempt to correct these results. For hydrogen adsorption at high pressures, the UFF is reported to work reasonably well.[72]

Conclusions

In this study, we have designed a database of ∼20,000 hypothetical MOFs with the aim of increasing the chemical diversity of the existing databases. We show that adding the structures of our database improves the overall diversity metrics of hypothetical databases, especially in terms of metal chemistry. To highlight the usefulness of these diverse structures, we evaluated their performance for two important environmental applications: post-combustion carbon capture and hydrogen storage. In the case of post-combustion carbon capture, we find that many of these structures outperform Zeolite-13X, a widely used benchmark material for carbon capture, in terms of their pure CO2 uptake (around 800 structures) and in terms of their CO2 working capacity and CO2/N2 selectivity (around 250 structures). For hydrogen storage, we find around 50 structures that outperform MOF-5, a widely used benchmark material for hydrogen storage, in terms of the balance between their gravimetric and volumetric H2 uptakes. For post-combustion carbon capture, we find that including different metal nodes helps in obtaining high-performing structures. In the case of hydrogen storage, we find that it is the topology of the MOF that plays the more dominant role in making a structure high-performing. The promising structures found in this study could be added to the existing list of promising structures in the literature, which would provide a more diverse range of materials to choose from for synthesis. Through this study, we thus show that by starting with a relatively small but diverse set of materials, one can still obtain interesting materials for different applications. This would help us to locate the interesting regions of the material space. To avoid brute-force screening of an infinite number of possible MOFs, as a next step, one could then explore around these interesting regions using active learning, Bayesian optimization,[73,74] or generative models.[75,76]


1. Systematic design of pore size and functionality in isoreticular MOFs and their application in methane storage.

2. A general framework for analysing diversity in science, technology and society.

3. Ab initio carbon capture in open-site metal-organic frameworks.

4. Applicability of Tail Corrections in the Molecular Simulations of Porous Materials.

5. Diversity and its decomposition into variety, balance and disparity.

6. Geometric landscapes for material discovery within energy-structure-function maps.

7. A data-driven perspective on the colours of metal-organic frameworks.

8. The Cambridge Structural Database.