
Too Many Materials and Too Many Applications: An Experimental Problem Waiting for a Computational Solution.

Daniele Ongari¹, Leopold Talirz¹,², Berend Smit¹.

Abstract

Finding the best material for a specific application is the ultimate goal of materials discovery. However, there is also the reverse problem: when experimental groups discover a new material, they would like to know all the possible applications this material would be promising for. Computational modeling can aim to fulfill this expectation, thanks to the sustained growth of computing power and the collective engagement of the scientific community in developing more efficient and accurate workflows for predicting materials' performances. We discuss the impact that reproducibility and automation of the modeling protocols have on the field of gas adsorption in nanoporous crystals. We envision a platform that combines these tools and enables effective matching between promising materials and industrial applications.


Year: 2020 PMID: 33274268 PMCID: PMC7706098 DOI: 10.1021/acscentsci.0c00988

Source DB: PubMed Journal: ACS Cent Sci ISSN: 2374-7943 Impact factor: 14.553

Introduction

In this Outlook, we are indulging in a luxury problem: too many materials with too many possible applications. In the field of nanoporous materials, more than 100 000 metal–organic frameworks (MOFs) have been reported (Figure 1), with large design spaces also guaranteed for covalent organic frameworks (COFs), zeolites, porous organic cages, etc. At the same time, the range of applications of these materials is expanding from gas separation[1] and gas storage[2] to fields such as catalysis[3,4] and sensing.[5,6] A new material is often designed and tested for one specific application; testing it for a wide range of applications may exceed the expertise, time resources, and/or infrastructure of the research group synthesizing the material. Conversely, research groups focused on the application side can no longer afford to test all materials of possible interest.

Figure 1

(a) Papers mentioning “Zeolite”, “Metal Organic Framework”, and “Covalent Organic Framework” in the title or the abstract, as parsed from Scopus in July 2020.[7] The right column collects histograms for the deposition of materials in publicly available databases. (b) Zeolite code types by year of assignment, from the database of the International Zeolite Association (IZA).[8] (c) MOF-subset of the Cambridge Structural Database (CSD, May 20 update) by year of publication (orange).[9] MOFs in the CoRE-2019 “All solvent Removed” (ASR) subset (purple) are selected from the CSD release of November 2017 with criteria such as three-dimensionality of the framework and permeability to small molecules.[10] (d) COFs in the CURATED-COFs database (June 20 update), by year of publication.[11,12]

Let us illustrate this point with a few examples. Al-PMOF was first synthesized for its photocatalytic activity but was later discovered to be promising for the separation of CO2 from wet flue gases.[13] MOF SBMOF-1 was synthesized to capture CO2 but turned out to be an excellent material for separating Xe from Kr.[14] UMCM-152 was first reported in 2010 and tested for H2 adsorption[15] but was rediscovered as a record-breaking material for oxygen storage eight years later.[16] In these and further examples,[17] computational screening studies discovered the potential of existing materials for new applications. Since the number of available materials is so large (and becomes even larger when including materials generated in silico[18]), computational modeling is at present the only feasible screening method. A typical computational screening study aims to rank a set of materials for a given application: the first step is to determine key performance indicators (KPIs) and the ranking criteria for the comparison of different materials.
KPIs are typically related to material properties, such as the electronic band structure (for optical or electronic KPIs), or adsorption isotherms (for KPIs related to gas storage or separations). While some properties, such as stability, remain difficult to predict from first principles, density functional theory (DFT) calculations provide access to a wide range of properties of MOFs, including the band gap and band structure,[19] mechanical properties,[20,21] and catalytic properties.[22,23] DFT also allows us to make accurate predictions of the interactions of guest molecules inside the pores.[24,25] In addition, classical molecular simulations enable the computation of thermodynamic and transport properties of these guest molecules.[26,27] We can envision taking this approach even one step further: What if the workflow infrastructure was made available for experimental groups to use? They could upload the crystal structure of a new nanoporous material (MOF, COF, or zeolite) and, with close to zero effort, obtain calculations of thermodynamic and transport properties for a range of guest molecules, as well as predictions of how this new material would perform in screening studies previously published in the literature (Figure 2). Or, if an experimental group develops a novel separation process, a database of thermodynamic data would allow this group to identify top-performing materials for their application. Finally, if a computational group improves an existing force field or protocol for a specific class of materials, the updated workflow could be available whenever a material of that class is uploaded.

Figure 2

Scheme of exemplary workflow. The user starts by uploading the atomic structure of a crystalline material in the CIF format, which triggers the refinement of the atomic positions, the computation of pore geometry, and thermodynamic and transport properties. Finally, its performance for specific applications is evaluated, and the material is ranked versus other candidates.


Databases of Nanoporous Materials

Curated Structures from Experimental Syntheses

Computational screening studies rely on large databases of materials. A first step is to collect all the reported structures in a consistent format: today, the Crystallographic Information File (CIF) format is the most common one. The minimal information needed to proceed with a computational screening study is the dimensions of the unit cell and the elements and coordinates of the atoms that compose the framework. This information is obtained ideally from single crystal X-ray diffraction studies. When obtaining single crystals is not possible, one can rely on powder X-ray diffraction or other indirect measurements instead.[28] In the MOF community, it is standard practice to publish new structures in the Cambridge Structural Database (CSD)[29] and to report the reference code assigned to the entry in the article. With its more than one million entries (102 508 of which were recognized as MOFs until May 2020),[9] the CSD is the oldest and largest data set of organic and metal–organic crystals.[30] Unfortunately, many of the reported structures in the CSD are not suitable for a computational screening study out of the box. From X-ray diffraction data, it is difficult to locate hydrogen. If the material is charged, locating the counterions can also be challenging. For porous MOFs, the crystal structure is often determined with solvent molecules present inside the framework, while many practical applications require activating the material by removing the solvent molecules. In addition, disorder in the material is often indicated via partial occupancies, which need to be resolved to unique structures (in most of the cases manually) before using the structures as input for simulations. The group around D. Sholl pioneered the extraction of MOFs from the CSD for computational screening purposes.
In two studies from 2012 (seeking materials for kinetic separation of noble gases[31] and CO2/N2[32]) they distilled a set of 3432 and 1163 MOFs, respectively, from which they discarded entries with atomic disorder, and they algorithmically removed solvent molecules. One year later, Siegel et al.[33] targeted hydrogen storage and identified ca. 38 800 crystals from the CSD as MOFs but had to exclude ca. 16 000 problematic structures due to missing H, disorder, etc. For these first studies, the final database of filtered and curated CIFs was only accessible upon request to the authors. In 2014, Chung et al. created a set of 5109 “Computation-Ready Experimental” (CoRE) structures, selected to be three-dimensional, porous (i.e., with a pore limiting diameter >2.4 Å), and fully desolvated, and made the database openly available for download.[34] Recently, this database was updated to include 14 142 MOFs, 546 of which were collected from sources other than the CSD.[10] This update also included a version of the database where solvent molecules coordinated to metal sites were not removed (i.e., the MOFs were not “computationally activated”). In two separate projects, Nazarian et al. used DFT to provide partial charges[35] and geometry-optimized structures[36] for the first set of CoRE-MOFs. Only 502 (i.e., ca. 10%) passed both refinements. The database with partial charges was used by other groups in several screening studies,[16,37−39] demonstrating the impact of providing an open-access, curated, and extensive database of structures to the computational community. Ideally, for every new MOF structure deposited in the CSD, a computation-ready structure would be generated as well.
At present, however, there is no standardized protocol for these steps (e.g., removal of solvents, addition of missing hydrogen atoms, resolution of partial occupancies, correction of atomic overlaps, and structural distortion), and as a result, each group may generate slightly different structures that make it difficult to compare predicted properties.[40,41] This cleaning procedure can be seen as a continuous process where more and more checks and fixes are added in a collaborative effort.[42,43] For the platform we envision, it would be highly desirable to eventually arrive at an internally consistent and extensive database of nanoporous structures that are “ready” for molecular simulations or electronic structure calculations, and made available in a way that satisfies the FAIR data principles: Findable, Accessible, Interoperable, and Reusable.[44,45] Extending this curation to different classes of materials, such as COFs, inevitably results in new considerations and challenges. For MOFs, the CSD imposes quality controls on the accuracy of the crystal structure. COFs typically have short-range crystallinity but long-range disorder, preventing the refinement of the crystal structure directly from X-ray diffraction data[46] and thus inclusion in the CSD. As a consequence, experimental groups develop their own protocols to generate the crystal structures reported with their publications. These difficulties motivated us to create a database that combines the advantages of both the CoRE and CSD protocols and provides a high level of transparency and consistency in the refinements including cell optimization and the calculation of partial charges. Branching off from the second version of the CoRE-COF database by Tong et al.,[47] we extended the database to 574 COFs in the June 2020 update.
The relevant literature is monitored by the @COF_Papers Twitter bot, and structures in CIF format are collected in a public repository, where researchers can suggest new additions or report errors.[12] All modifications, corrections, and additions are tracked by the Git version control system. Moreover, an automated routine, orchestrated by the AiiDA workflow manager,[48,49] computes the DFT-optimized structures and partial charges following a published protocol.[11] The results are made available periodically in the CURATED COF database hosted on the Materials Cloud open science platform.[50] A recent independent study compared the gas-separation performances as computed from the structures of this database and the original COF structures, highlighting the importance of the curation process.[51]

Hypothetical Structures

In addition to the databases of experimentally determined structures, there is an even larger number of structures generated in silico, which are further candidates for computational screenings. Replacing the experimental synthesis of new materials with computational algorithms increases the number of atomic structures that can be assembled (zeolites, MOFs, and COFs) by orders of magnitude. To give an idea of how large these databases can get, we just mention two very recent works, which reported 325 000 hypothetical MOFs[13] and 471 990 hypothetical COFs.[52] As the growth of the number of hypothetical structures even outpaces the continued increase of computational power, brute force screening will become increasingly unfeasible. A promising alternative is to select only a modest subset of most diverse structures to perform accurate calculations (comparable to those on experimental structures), train machine learning methods to capture the structure–performance relations, and use them to extend the screening to the remaining materials.[53] Indeed, the key to the success of this approach is to find effective metrics for the “diversity” between structures in the context of a particular application.[54]
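One simple way to pick a "modest subset of most diverse structures" is greedy farthest-point (max-min) sampling in a descriptor space. The sketch below is only an illustration of that idea, not the selection method used in the cited studies: the function name, the two-component toy descriptors, and the use of a plain Euclidean distance as the diversity metric are all assumptions made here, and the descriptor list is assumed to be non-empty.

```python
import math

def farthest_point_sampling(descriptors, n_select, start=0):
    """Greedy max-min selection of a diverse subset (illustrative sketch).

    descriptors: non-empty list of equal-length feature vectors (hypothetical,
    e.g. scaled pore volume and largest cavity diameter).
    Returns the indices of the selected structures in selection order.
    """
    n_select = min(n_select, len(descriptors))
    selected = [start]
    # Distance from every structure to its nearest already-selected structure.
    nearest = [math.dist(d, descriptors[start]) for d in descriptors]
    while len(selected) < n_select:
        # Pick the structure that is farthest from everything chosen so far.
        candidate = max(range(len(descriptors)), key=nearest.__getitem__)
        selected.append(candidate)
        for i, d in enumerate(descriptors):
            nearest[i] = min(nearest[i], math.dist(d, descriptors[candidate]))
    return selected

# Toy descriptors: two tight clusters plus one outlier (made-up numbers).
toy = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (1.0, 0.9), (5.0, 5.0)]
subset = farthest_point_sampling(toy, 3)
print(subset)  # one representative per cluster plus the outlier
```

In practice the choice of descriptors and distance is exactly the open question raised above: a metric that captures diversity for one application need not do so for another.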

Computation of Materials’ Properties

Gas Adsorption Properties

Once one has a set of computation-ready structures, one can start computing the properties that are relevant for the application(s) of interest, here gas adsorption in nanoporous materials. If interested in comparing properties and performances among many materials (and updating this ranking over time), one needs to pay particular attention to the consistency of results. Consistency means, for example, applying the same protocol to curate the crystals’ atomic structure, to estimate partial charges, to exclude inaccessible pores, etc. in order to enable the comparison of the final results. This includes both choices of the model, such as the DFT functional or the force field (UFF, DREIDING, TraPPE, etc.), as well as secondary parameters, such as the choice of the DFT basis set or whether to use tail-corrections in grand canonical Monte Carlo (GCMC) adsorption calculations.[55] A first step may involve a relaxation of the atomic positions using force fields, semiempirical methods, or DFT in order to ensure that the atomic structure is consistent with the computational method employed. This step can also help identify mistakes in input structures and take the effect of solvent removal on the framework into account. Gas–framework interactions are often evaluated using GCMC insertion techniques with classical force fields.[17] In this approach, the Coulomb interaction is modeled by partial charges, which are tabulated for popular gas molecules but need to be computed for the framework. If DFT was used for the geometry relaxation, partial charges can be computed at negligible extra cost from the electrostatic potential, usually preferring protocols that aim at reproducing the electrostatic potential (e.g., REPEAT, DDEC) over the others (e.g., Mulliken, Hirshfeld, Bader).[56,57] Alternatively, cheaper charge equilibration methods can be used, but with extreme care (see ref [37]).
The partial charges on the framework atoms need to be combined with a force field to describe the interaction between the gas molecule and the framework. Many studies opt for off-the-shelf parameters for the dispersion interaction, such as DREIDING[58] or UFF[59] for the framework and TraPPE[60] for the adsorbate. Steps outside the original design space of existing force fields or modifications of their parameters need to be carefully validated in order to ensure that the behavior of the gas molecules in the pores is reproduced with sufficient accuracy (e.g., adsorption isotherms, heats of adsorption, etc.). As one moves to larger numbers of structures and more complex workflows, it becomes increasingly challenging to manage the calculations and to provide all information required to reproduce a particular result. This is where workflow managers can come to the rescue, and numerous open-source infrastructures are available for orchestrating computational chemistry codes with advanced logic,[61] such as AiiDA,[48,49] FireWorks,[62] AFLOW,[63] or signac.[64]

Open Challenges

Once we obtain a reliable force field for molecule–framework interactions, we are still left with a number of open challenges. One challenge is the modeling of defects: as high-throughput computations typically assume perfect crystals, they will not capture properties that are dominated by crystal defects present in the experimental material. Another challenge is the modeling of framework mechanics upon adsorption: most screening studies assume a rigid framework. For many structures, this approximation is reasonable, but some materials are known to display structural changes upon gas adsorption, which can affect performance in relevant applications.[65] Assuming the structure to be rigid may lead to incorrect identification of pore accessibility for gas molecules. Algorithms based on geometrical assessment of channel diameters can easily recognize nonaccessible pores and exclude them from the adsorption calculation.[66] However, it is less trivial to routinely identify those cases where a small rotation of the ligands can allow the gas molecule to permeate (such as in the well-studied case of ZIF-8[67,68]). There are other material properties relevant to the process modeling of gas-related applications that can be evaluated from the unit cell, such as gas diffusion,[69] heat capacity,[70,71] mechanical stability,[72,73] and chemical stability. The studies cited above propose computational protocols for investigating these properties, which might be extended for high-throughput screenings. Combining all these properties in the same screening platform would allow the filtering out of structures that are unstable or show poor thermal or molecular diffusion and provide more information to the process model.
Moving beyond the field of gas adsorption brings yet more properties into focus, such as more accurate electronic properties for applications in sensing, semiconductors, and photocatalysis,[19,74] which put increased emphasis on the choice of the electronic structure method and are beyond the scope of this Outlook.

Ranking Materials

Accurately predicting material properties is the aim of molecular simulations, but it represents only half of the story: our ultimate goal is to rank materials for a given application, based on key performance indicators (KPIs). In the following, we discuss the KPIs for two important applications of nanoporous materials: hydrogen storage[33,75−79] and CO2 separation from nitrogen.[11,40,80−83] For H2 storage, the main KPI is the deliverable (or “working”) capacity, i.e., the difference in gas uptake between the loading conditions at higher pressure and/or lower temperature, and the discharging conditions at lower pressure and/or higher temperature. Therefore, the evaluation of hydrogen storage performance requires calculations at only these two conditions of temperature and pressure, and molecular simulations have been shown to be feasible for screening more than half a million structures.[78,79] Similar considerations are also valid for the evaluation of natural gas deliverable capacity.[34,47,84,85] Other important KPIs may focus on the diffusion of gas and heat inside the framework, in order to enable fast loading/discharge and heat dissipation. For CO2 capture, finding KPIs for the ranking is more complex. In 2012, our group developed a simplified thermodynamic model for carbon capture and sequestration (CCS) involving a temperature–pressure-swing adsorption process, considering inlet gases from a coal-fired power plant (14:86 ratio of CO2:N2), a natural-gas-fired power plant (4% CO2), or air (400 ppm of CO2). This model was used to evaluate different classes of nanoporous materials,[82,86] and we recently expanded the study to COFs.[11] Two KPIs were identified: The “parasitic energy” is defined as the energy needed to separate 1 kg of CO2 and compress the purified gas to 150 bar for underground storage. The parasitic energy can be taken as a measure of the operating cost of the separation (OPEX).
The working capacity, on the other hand, determines the amount of adsorbent material needed and thus the capital cost (CAPEX). The study showed that the minimal parasitic energy can be related to an optimal value for the Henry coefficient of CO2 around 10⁻³ mol/(kg Pa) (at 300 K) for power plants; a stronger affinity between CO2 and the framework would result in a higher energy needed for the regeneration of the adsorbent. For direct-air capture, the optimal value lies above 10⁻¹ mol/(kg Pa), and chemisorption appears to be a more promising solution. Recently, simulations have been coupled with more advanced models of the pressure-swing process,[87−89] indicating subtle relations between the properties of the material and their performance in the process. In particular, the often overlooked nitrogen isotherm was identified as a key indicator. The case of CO2 separation highlights the importance of connecting materials’ properties to process modeling: on one hand, the process modelers need to be aware of the uncertainties in material property predictions and how they propagate through their model. If small perturbations in the inputs alter the outcome significantly, this “butterfly effect” will compromise the reliability of the final ranking. On the other hand, the molecular simulation community should focus its efforts on improving the predictive accuracy for those properties that are shown to have the largest influence on the process models. In this context, modular workflows and automated provenance tracking can simplify investigations of individual workflow components and help trace their impact on the final rankings.
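To make the simplest of these KPIs concrete, the following minimal sketch evaluates a deliverable (working) capacity as the difference in uptake between loading and discharge conditions, and a selectivity as the ratio of two Henry coefficients. It assumes a single-site Langmuir isotherm as a stand-in for simulated isotherms; the function names and all parameter values are hypothetical and chosen only for illustration. The parasitic energy requires a process model and is not covered here.

```python
def langmuir_uptake(pressure, q_sat, b):
    """Single-site Langmuir uptake in mol/kg (assumed model, not from the text).

    pressure in Pa, q_sat in mol/kg, b in 1/Pa; the Henry coefficient of this
    model is q_sat * b in mol/(kg Pa).
    """
    return q_sat * b * pressure / (1.0 + b * pressure)

def working_capacity(isotherm_load, isotherm_discharge, p_load, p_discharge):
    """Uptake at loading conditions minus uptake at discharge conditions.

    Two separate isotherm callables are accepted so that loading and
    discharge can be evaluated at different temperatures.
    """
    return isotherm_load(p_load) - isotherm_discharge(p_discharge)

def henry_selectivity(k_henry_a, k_henry_b):
    """Low-coverage selectivity as the ratio of two Henry coefficients."""
    return k_henry_a / k_henry_b

# Hypothetical parameters for an isothermal pressure swing (100 bar -> 5 bar).
def isotherm(p):
    return langmuir_uptake(p, q_sat=10.0, b=1.0e-6)

wc = working_capacity(isotherm, isotherm, p_load=100e5, p_discharge=5e5)
sel = henry_selectivity(1.0e-5, 2.0e-6)
print(f"working capacity: {wc:.2f} mol/kg, selectivity: {sel:.1f}")
```

Because the working capacity needs only two state points per material, it is cheap to evaluate; this is what makes screening very large structure sets feasible for storage applications.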

Toward Best Practices

While the vision of a common platform with easily extendable and fully interoperable workflows may appear somewhat utopian today, there are a number of concrete practices that researchers can adopt to move closer toward this goal.

Reproducibility

In order for a reader of a scientific publication to be able to reproduce its results, the study should report, alongside the analysis and discussion of its scientific results, all the information needed to reproduce them. However, for screening studies that involve large numbers of materials and/or multistep workflows, achieving this “radical transparency” is easier said than done: manually collecting all necessary input files, postprocessing scripts, software versions, etc. can be time-consuming, and completeness is difficult to ensure. Here, workflow managers can help by tracking the provenance automatically and providing ways to export and share this information with peers. For example, our recent work on parsing COFs from the literature and assessing their performance for carbon capture tries to follow this approach, publishing both the full provenance graph of the study and the source code of the workflow used to orchestrate the calculations.[11,50,90] The provenance graph gives any interested researcher the ability to click on a data point and to inspect every step of the workflow that was used to compute it, try to reproduce an individual calculation themselves, or report mistakes they encounter. Sharing the source code of the corresponding workflows on collaborative platforms like GitHub further enables direct suggestions of bug fixes or improvements to the protocols, both in code and in narrative form.

Automation

Moving from the study of a few materials to hundreds or thousands of them puts an emphasis on automation. One needs an effective way not only to manage the sequence of calculations but also to handle common errors and perform preliminary data analysis. We illustrate this using simple, practical examples from gas adsorption in nanoporous materials: Before submitting a crystal structure to GCMC simulations one has to detect and block the inaccessible pores and expand the simulation cell to include twice the cutoff used for the potential. These operations may need some external packages or ad hoc scripts and can be fully automated using a workflow manager. Another notable case is the handling of DFT calculations in which the self-consistent field cycle fails to converge. Depending on the system under study, remedies can include automatic resubmission with slower, more conservative minimization schemes (e.g., switching from orbital transformation to diagonalization methods) or turning on electronic smearing.[11] When considering one’s own use cases, and realizing the many intermediate steps that would need to be automated to go from an input structure to the final result, one inevitably arrives at the question whether the effort of full automation is worth the time investment. While this determination needs to be made case by case, it is easy to forget that each manual operation makes results harder to reproduce and entails substantial time investments when others (or even our future self) go on to extend and build upon the work. At the same time, modern workflow managers provide time-saving convenience features, such as automatic translations of job parameters to the language of various queuing systems, automatic file transfers between the local workstation and the cluster, and automatic record keeping.
The development of a robust workflow can be challenging, but is a valuable outcome from a computational study on its own: it ensures that when new sets of materials are released, they can be included in the screening with minimal additional effort.
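As an example of the kind of small step that is worth automating, the cell expansion mentioned above can be sketched as follows. The code computes how many times a unit cell must be replicated along each lattice vector so that every perpendicular width of the simulation cell is at least twice the interaction cutoff. It is a minimal, standalone illustration using standard cell geometry; the function name and the example cell are assumptions made here, not part of any specific workflow package.

```python
import math

def cross(u, v):
    """Cross product of two 3-vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    """Dot product of two 3-vectors."""
    return sum(x * y for x, y in zip(u, v))

def norm(u):
    """Euclidean length of a 3-vector."""
    return math.sqrt(dot(u, u))

def replication_factors(a, b, c, cutoff):
    """Replicas along a, b, c so each perpendicular width is >= 2 * cutoff.

    a, b, c: lattice vectors (same length unit as cutoff, e.g. angstrom).
    The perpendicular width along a is volume / |b x c|, and cyclically.
    """
    volume = abs(dot(a, cross(b, c)))
    widths = (volume / norm(cross(b, c)),
              volume / norm(cross(c, a)),
              volume / norm(cross(a, b)))
    return tuple(math.ceil(2.0 * cutoff / w) for w in widths)

# Hypothetical orthorhombic cell of 10 x 25 x 30 angstrom, 12 angstrom cutoff.
factors = replication_factors((10.0, 0.0, 0.0),
                              (0.0, 25.0, 0.0),
                              (0.0, 0.0, 30.0),
                              cutoff=12.0)
print(factors)  # (3, 1, 1)
```

Using perpendicular widths rather than lattice-vector lengths matters for strongly skewed (triclinic) cells, where a long lattice vector can still correspond to a narrow cell.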

Open Source

The idea of open science does not stop at access to papers and data but extends naturally to the software used to obtain the data: the use of free and open-source software (FOSS) lowers the barriers to reusing, reproducing, and building upon prior work. This is particularly true for materials screening studies, where readers may want to compare a new material to the set of the already screened ones. When data, software, and workflows are made openly accessible, the barrier for such checks reduces to the marginal computational cost of screening just one more material. Despite these obvious benefits, the demand for FOSS has been notably missing from declarations in the open science context.[91] One of the reasons may be that developing and maintaining high-quality scientific software takes years of teamwork, and commercial licenses have proven to be a successful model for funding such efforts in the past.[92] Today, however, FOSS alternatives exist for most applications in computational materials science, and we do seem to observe a trend of increasing adoption of these codes vs their commercial competitors over the course of the past decade:[92,93] CASTEP[94,95] (restricted to academic use) and openMolcas[96] being two recent examples of codes that have decided to switch to a more open licensing model. We believe that the question of sustainable software development for open science needs to be on the table and discussed by all stakeholders.

Setting the Stage for Machine Learning

In recent years, machine learning (ML) has been rapidly mixing with molecular simulations,[97] and we expect the advent of automated workflows in the field of nanoporous materials modeling to amplify this trend.[53] Among the first applications of ML methods is the prediction of the Henry coefficient or a full isotherm in a fraction of a second, from conventional geometric properties of the crystal structure (such as pore volume and atoms’ connectivity) and/or more advanced descriptors.[98−101] This massive speed up enables the screening of even millions of materials at affordable computational cost, shifting the role of molecular simulation to providing sufficient training data for the ML. The main question for a new material then becomes: is this structure different enough from the others already included in the training set to justify the use of expensive molecular simulations over ML predictions?[54] However, at present, ML studies trained on data published by other groups are rare in the field. In many cases, data are recomputed specifically for the training, even when similar (but not identical and consistent!) data are available. In this context, moving as a community from delivering just a final set of results to including also the infrastructure needed to obtain them could allow ML experts to easily extend the training set with new consistent data. Reusing the same data sets in multiple ML studies would also enable effective assessments of the models themselves, which is less trivial when the training data differ. Finally, the training routines should be automated and made reproducible as well.[53]

Toward a Prototype of a Materials Matching Platform

As a first step toward realizing the ideas put forward in this Outlook, we have extended our work of curating COFs and screening these materials for carbon capture[11] to six new applications and 250 new COFs.[50] The new structures include mostly COFs published after the original work (as tracked by the @COF_Papers Twitter bot), and the applications are based on previous screening studies focused on gas storage (methane,[85] hydrogen,[78,79,102] and oxygen[16]) and gas separations (Xe/Kr,[14] H2S removal in wet gases). Provided that the input structure is chemically sound (e.g., no missing hydrogens or overlapping atoms) and is charge-neutral (no counterbalancing ions), the workflow CURATED 99% of the structures without human intervention. For all CURATED structures, we automatically computed the adsorption isotherms and/or Henry coefficients of CO2, N2, H2, CH4, O2, H2S, H2O, Xe, and Kr. From these isotherms, the KPIs were computed automatically and used to rank the materials as shown in Figure 3 for the extension of our previous study on CO2 capture[11] and in Figure 4 for the other new applications included. The full workflow typically takes 2–5 days from start to finish, using ≈1000 core hours. In other words, it costs about the price of three cups of coffee: two cups for the curation of the structure and one for all KPIs.[103] The full provenance graph of each workflow, shown in Figure 5, is tracked automatically by the AiiDA workflow manager.

Figure 3

Performance of COF structures for CO2 capture: parasitic energy required for the process versus gravimetric working capacity. Markers of the 250 new COFs are color-coded based on their ranking from high performance (low parasitic energy and high working capacity, green) to low performance (red). Markers of materials already included in ref (11) are shown in light gray.

Figure 4

Performance of CURATED-COFs for H2 storage at (a) cryogenic and (b) near-ambient conditions, (c) methane storage, (d) oxygen storage, (e) Xe/Kr separation, and (f) (H2S)/water separation. The ranking is color-coded from high performance (green) to low performance (red). Selectivities are computed as the ratio of the Henry coefficients of the two gases at 300 K. The coordinates of the markers for T-COF-2 and JUC-509 are highlighted by dashed and solid lines, respectively.
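
The selectivity definition used in this caption can be written compactly as follows. The symbols are our own notation, introduced here for illustration: $K_{\mathrm{H,A}}$ and $K_{\mathrm{H,B}}$ denote the Henry coefficients of the two gases (e.g., A = Xe and B = Kr), and $S_{\mathrm{A/B}}$ the resulting selectivity.

```latex
S_{\mathrm{A/B}} = \frac{K_{\mathrm{H,A}}(T)}{K_{\mathrm{H,B}}(T)}, \qquad T = 300\ \mathrm{K}
```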

Figure 5

AiiDA provenance graph of the workflow tracing the entire path from the initial CIF file to the properties and performance computed for it. The graph shows processes and data as nodes, together with their connections;[49] in an interactive visualization, each node can be browsed to explore the input parameters of the calculation, its output results, and the details of the processes.[104] Colors distinguish different modules of the workflow, whose source code is available online.[90] The modules make use of other popular open-source tools, such as CP2K,[105] Raspa,[106] Zeo++,[66] and chargemol.[107]
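
The idea of tracing a computed property back to its initial CIF file can be sketched with a plain directed graph. This is a conceptual illustration in standard Python only; it does not use the AiiDA API, and all node names below are hypothetical stand-ins for the data and process nodes of a real workflow.

```python
from collections import deque

# Hypothetical provenance graph: each node maps to the nodes it was derived from.
# Data nodes and process nodes alternate along the path.
PARENTS = {
    "cif_file": [],
    "geometry_optimization": ["cif_file"],
    "optimized_structure": ["geometry_optimization"],
    "charge_assignment": ["optimized_structure"],
    "charged_structure": ["charge_assignment"],
    "isotherm_calculation": ["charged_structure"],
    "co2_isotherm": ["isotherm_calculation"],
    "kpi_calculation": ["co2_isotherm"],
    "working_capacity": ["kpi_calculation"],
}


def trace_provenance(node, parents):
    """Return all ancestors of `node`, nearest first (breadth-first search)."""
    seen = {node}
    order = []
    queue = deque([node])
    while queue:
        current = queue.popleft()
        for parent in parents[current]:
            if parent not in seen:
                seen.add(parent)
                order.append(parent)
                queue.append(parent)
    return order


history = trace_provenance("working_capacity", PARENTS)
print(" <- ".join(["working_capacity"] + history))
```

A workflow manager records such links automatically at run time, so the graph does not have to be written by hand as it is in this sketch.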

It is interesting to discuss two examples of recently reported structures that were included in the update. T-COF-2 (Figure 6a) was synthesized in 2020 and tested for a photocatalysis application.[108] The simulations do not predict this material to be among the top performers for any of the gas adsorption applications implemented so far. Although this is the most likely outcome for any given material, the probability of discovering unexpected hits should increase as the screening studies are extended to more applications.

Figure 6

Crystal structures of (a) T-COF-2 and (b) JUC-509. Elements: H (white), C (gray), N (blue), O (red), S (yellow), Cl (green).

The other COF, JUC-509 (Figure 6b), seems more promising. This material was synthesized in 2019 for catalysis.[109] Based on the atomic structure of the material, our screening predicts it to be among the top-performing materials for the storage of H2, CH4, and O2 (Figure 4). We acknowledge that other factors may play a role in deciding whether it is worth testing JUC-509 for these applications, but it is our hope that cases like this one will lead to interesting, unexpected discoveries going forward.

Many things remain to be done in order to transform this prototype into a platform of broad impact. We have limited ourselves to COFs, and it is essential to extend the platform to MOFs and other porous materials. We have used relatively elementary KPIs to illustrate the concept, which we hope to replace with more advanced ones, and so far, the predictions rely on generic force fields, which may not always be the optimal choice. Finally, we would like to extend the range of applications beyond the current scope of gas storage and separation.

The results of this screening are updated periodically and can be accessed from materialscloud.org/discover/curated-cofs. Both the data and the source code of the underlying workflows are made available online.[110] Over time, we hope to inspire other research groups to build upon the existing open infrastructure and develop their own modules for new applications, resulting in “living” screening studies that are regularly updated with new materials. Infrastructure projects like these require long-term commitments, which are notoriously difficult to make in today’s research funding landscape. Thanks to support from the MARVEL National Centre of Competence for Research, we feel ready to accept the challenge and, given the enormous potential impact for the field, hope to be able to convince other funding agencies and possibly commercial partners to join.

In summary, the idea of this Outlook is to illustrate a dormant potential in the computational materials science community that can be unlocked by moving toward a more open, collaborative way of doing science: not necessarily inventing something spectacularly new, but simply putting together the pieces of a large puzzle. While we have made the case for the field of nanoporous materials for gas adsorption applications, the basic concept would seem extensible to further classes of materials and applications.
