Literature DB >> 33869695

Dataset on structure and physical properties of stable diatomic systems based on van der Waals density functional method.

Kiyou Shibata1, Eiki Suzuki1, Teruyasu Mizoguchi1.   

Abstract

With the influence of progress in the materials informatics, development of fundamental database has been attracting growing interest. The bonding between atoms is essential component of all kinds of materials and govern their structure, stability, and properties. When we try to understand a material by breaking it down into microscopic components, bonding of diatomic system is the most fundamental. In the field of spectroscopy, diatomic molecular spectroscopy data has been studied well, and the diatomic molecular spectroscopy database [1] has been constructed recently. Concerning electronic structure, however, there is no easily accessible database of diatomic system. In order to develop a database of diatomic systems, it is important to consider adequate interaction. In addition to covalent bonding, van der Waals (vdW) interaction is also known to play an essential role especially in describing weak bonding systems such as noble gas dimers, atomic or molecular absorption, and layered materials. Thus, vdW interaction must be considered to develop database of diatomic systems so that it can be used for general purposes. One of its theoretical implementations is vdW density functional (vdW-DF) method [2], which has been developed within the framework of density functional theory 3 (DFT) and has been showing its effectiveness as general-purpose method. In this data article, we provide a vdW-DF-based calculation dataset focusing on diatomic systems. All diatomic systems containing atoms from H (Z = 1) to Ra (Z = 88) were considered, and stable structures and properties of more than 2,900 stable diatomic systems has been calculated correctly. This cyclopedic dataset of diatomic systems with consideration of vdW interaction can be useful building blocks for understanding, describing, and predicting interaction of atoms.
© 2021 The Authors. Published by Elsevier Inc.

Entities:  

Keywords:  Binding energy; Chemical bonding; Density functional theory calculations; Diatomic molecule; First-principles calculations; Van der Waals interaction

Year:  2021        PMID: 33869695      PMCID: PMC8040128          DOI: 10.1016/j.dib.2021.106968

Source DB:  PubMed          Journal:  Data Brief        ISSN: 2352-3409


Specifications Table

Value of the Data

To understand atom adsorption and general chemical bonding, bonding states between atoms is essential. The most primitive form of inter-atomic interaction can be found in diatomic systems. This dataset considers all possible diatomic systems containing H to Ra, and contains stable bond length, binding energy, and density of states of over 2900 diatomic systems along with properties of isolated single atoms based on density functional theory with consideration of van der Waals interaction. This dataset provides basic knowledge for describing atom adsorption and general chemical bonding. This dataset is useful for researchers investigating atom adsorption or catalytic activities, or ones looking for datasets with versatile physical properties in the field of materials informatics. This cyclopedic dataset of diatomic systems with consideration of vdW interaction can be useful building blocks for understanding, describing, and predicting stability of bondings between atoms and molecules.

Data Description

Raw dataset

The most primitive data records are provided as set of raw VASP output files, OUTCAR and vasprun.xml, for both 3916 diatomic systems and 88 isolated atom systems. These data records are separately available as zip compressed files at Mendeley data [4]. These raw VASP file datasets are possibly useful for those who want to access density of states of the diatomic systems or want to run DFT calculation with other calculation condition.

Parsed dataset

We also provide parsed statistical datasets as python pickle files which can be loaded by pandas module and csv files. For just overviewing, we recommend the use of these parsed statistical datasets. This dataset contains properties obtained by parsing the VASP files (“vasprun.xml” and “OUTCAR”) and is suitable for overviewing and processing statistical data. The pickle files and csv files are provided for both diatomic systems and isolated atom systems separately, and they are available at Mendeley data [4]. A description of the data fields in the pandas DataFrame of the diatomic systems and isolated atom systems are given in Tables 1 and 2, respectively. Most parameters are obtained from attributes of classes of pymatgen (Vasprun and Outcar in pymatgen.io.vasp) and not modified. Note that some parameters such as “total_mag” obtained by Outcar.total_mag contain negative values. The csv files have the same table format as picklefiles, but they contain only numerical and string variables, namely except for “vasprun” and “outcar”.
Table 1

Description of the associated data fields in the diatomic system dataset, formats, types and units, where atom index i = 1 or 2, and orbital o = s, p, or d.

Data FieldDescriptionType (and Unit)
system_namesystem name (e.g. “H_H”)str
Vasprunpymatgen Vasprun objectpymatgen.io.vasp.Vasprun
Outcarpymatgen Outcar objectpymatgen.io.vasp.Outcar
no_errorwhether the calculation is failed with errorbool
Convergedwhether the calculation is convergedbool
converged_electronicwhether the calculation is electronically convergedbool
converged_ionicwhether the calculation is ionically convergedbool
Stabilizedwhether the binding energy is negative or positivebool
calc_statcalculation statusint
Distanceinter atomic distancefloat (Å)
binding energybinding energy of the diatomic systemfloat (eV)
final energytotal energy of the diatomic systemfloat (eV)
EfermiFermi energy of the diatomic system with respect to vacuum levelfloat (eV)
total_magtotal magnetization of the systemfloat (gμB/2)
atomic_symbol_iatomic symbol of atom istr
potcar_symbol_ipotcar symbol of atom istr
Z_iatomic number of atom iint
isolated_energy_ienergy of isolated atom ifloat (eV)
electrostatic_potential_ielectrostatic potential at the position of atom ifloat (V)
sampling_radii_isampling radius for calculating electrostatic potential of atom ifloat (Å)
charge_i_tottotal charge on atom i as a sum of charge_i_ofloat (C)
charge_i_ocharge on atom i of orbital ofloat (C)
magnetization_i_omagnetization on atom i of orbital ofloat (gμB/2)
Table 2

Description of the associated data fields in the isolated system dataset, formats, types and units, where orbital o = s, p, or d.

Data FieldDescriptionType (and Unit)
system_namesystem name (e.g. “H”)str
vasprunpymatgen Vasprun objectpymatgen.io.vasp.Vasprun
outcarpymatgen Outcar objectpymatgen.io.vasp.Outcar
no_errorwhether the calculation is failed with errorbool
convergedwhether the calculation is convergedbool
converged_electronicwhether the calculation is electronically convergedbool
final energytotal energy of the diatomic systemfloat (eV)
efermiFermi energy of the diatomic system with respect to vacuum levelfloat (eV)
total_magtotal magnetization of the systemfloat (gμB/2)
atomic_symbolatomic symbolstr
potcar_symbolpotcar symbol of atomstr
Zatomic numberint
electrostatic_potentialelectrostatic potential at the positionfloat (V)
sampling_radiisampling radius for calculating electrostatic potentialfloat (Å)
charge_tottotal charge as a sum of charge_ofloat (C)
magnetization_tottotal magnetization as a sum of magnetization_ofloat (gμB/2)
charge_ocharge of orbital ofloat (C)
magnetization_omagnetization of orbital ofloat (gμB/2)
Description of the associated data fields in the diatomic system dataset, formats, types and units, where atom index i = 1 or 2, and orbital o = s, p, or d. Description of the associated data fields in the isolated system dataset, formats, types and units, where orbital o = s, p, or d. We examined the validity of the calculation in several criteria. Some calculations on some atom pairs failed with errors. The physical parameters of these pairs are apparently unreliable and are not included in the record. Some did not converge within the convergence criteria we used, and convergence tends to be poor for pairs containing lanthanoid atoms. Among the converged calculations, some resulted in positive binding energy. Basically, the positive binding energy indicates that the relaxation was not enough or was not done correctly. This is because the total system energy should be equal to the sum of each energy of isolated atoms, namely the binding energy should be 0 eV when the atoms are separated far enough. However, we include these data as well because these data provide an insight of repulsive nature of the atom pairs. One of the reasons of the non-convergence is due to intrinsic instability. Atom pairs which have repulsive interaction cannot be relaxed completely at finite interatomic distance in the limited-sized calculation cell. Considering these problems, we classified atomic pairs into four classes: 0 = error, 1 = not converged, 2 = not stabilized, 3 = stabilized. The error was detected by the stop of calculations or failure in loading output files. The convergence was checked by the convergence attribute of the pymatgen.io.vasp.Vasprun objects. The stabilization was confirmed by the sign of the binding energy is negative, ΔE<0. The number of pairs of class 0, 1, 2, and 3 are 42, 771, 127, and 2976, respectively. Fig. 1 shows a heatmap of the calculation status. It should be noted that the data sets in class 3 are simply not problematic in terms of the above criteria as a result of calculation, and there is a possibility that calculation results are incorrect. However, since it is hard to set a clear, reasonable, and uniform standard for judging whether these data are wrong and all the process of each calculations can be trackable by analysing VASP output files, we do not dare to exclude any pairs by arbitrary way, for instance, removing outliers visually. Therefore, users should understand the premise of the calculation and use it with care by comparing it with multiple calculation and experimental results available depending on the scope of the data set to be used. For example, in the calculation, spin orbit interaction was not considered due to its large calculation cost. This might result in some deviation from actual properties especially in atom pairs containing heavy atoms.
Fig. 1

Heatmap of calculation status. The values correspond to calculation status values: 0 = error, 1 = not converged, 2 = not stabilized, 3 = stabilized.

Heatmap of calculation status. The values correspond to calculation status values: 0 = error, 1 = not converged, 2 = not stabilized, 3 = stabilized. Heatmap of bond length (r) of the stabilized diatomic molecules. Calculation status of class 1, 2, and 3 is indicated by cells with black line edge, dashed black line edge, and blank cell, respectively.

Visualization of physical parameters

Here we display typical calculation results by visualizing variation of physical parameters. Fig. 2 shows a heat map of bond length of stable diatomic systems. Fig. 3 shows a heat map of binding energies. Fig. 4 shows a heat map of Fermi energies. Fig. 5 shows a heat map of spin magnetic moment. Fig. 6 shows a scattering plot matrix for properties: binding energy, inter atomic distance, Fermi energy, and spin magnetic moment, by using only class 3 data.
Fig. 2

Heatmap of bond length (r) of the stabilized diatomic molecules. Calculation status of class 1, 2, and 3 is indicated by cells with black line edge, dashed black line edge, and blank cell, respectively.

Fig. 3

Heatmap of binding energy (∆E) of the stabilized diatomic molecules. Calculation status of class 0, 1, and 2 is indicated by blank cells, cells with dashed black line edge, and cells with black line edge, respectively.

Fig. 4

Heatmap of Fermi energy (EF) of the stabilized diatomic molecules with respect to vacuum level. Calculation status of class 1, 2, and 3 is indicated by cells with black line edge, dashed black line edge, and blank cell, respectively.

Fig. 5

Heatmap of absolute value of spin magnetic moment (µS) of the stabilized diatomic molecules. Calculation status of class 1, 2, and 3 is indicated by cells with black line edge, dashed black line edge, and blank cell, respectively.

Fig. 6

Scatter matrix of inter atomic distance (r), binding energy (∆E), Fermi energy (EF), spin magnetic moment (µS) of the stabilized diatomic molecules.

Heatmap of binding energy (∆E) of the stabilized diatomic molecules. Calculation status of class 0, 1, and 2 is indicated by blank cells, cells with dashed black line edge, and cells with black line edge, respectively. Heatmap of Fermi energy (EF) of the stabilized diatomic molecules with respect to vacuum level. Calculation status of class 1, 2, and 3 is indicated by cells with black line edge, dashed black line edge, and blank cell, respectively. Heatmap of absolute value of spin magnetic moment (µS) of the stabilized diatomic molecules. Calculation status of class 1, 2, and 3 is indicated by cells with black line edge, dashed black line edge, and blank cell, respectively. Scatter matrix of inter atomic distance (r), binding energy (∆E), Fermi energy (EF), spin magnetic moment (µS) of the stabilized diatomic molecules. All these heatmaps and scatter plot can be reproduced by the dataset and codes at Mendeley data [4].

Comparison with experimental values in literature

To validate our dataset, we also compared stable bond length and binding energy to reported experimental measurement dataset on diatomic system. We extracted bond length r from list of experimental diatomic bond lengths in Computational Chemistry Comparison and Benchmark DataBase (CCCBDB) [5]. Among 2976 valid (class 3) pairs in our database, 173 pairs are recorded in CCCBDB and are used for comparison. Fig. 7a shows a validation plot between binding energy in our dataset and the reported experimental values.
Fig. 7

Validation plots for comparing our data set to the previous literatures. a Validation plot of bond length compared with experimental diatomic bond length (r) in CCCBDB [5]. b Validation plot of binding energy compared with experimental value of binding energy (∆E) in bond dissociation energy [6]. c-e Validation plots with comparison of bond length (r) in our data set to that of calculated geometry in CCCBDB [5] calculated by methods with c predefined basis sets, d standard basis sets, and e effective core potentials, respectively. The plot ranges of c-e are limited for visibility. The same plots as c-e with view ranges containing all data points are presented in Supplementary Fig. 1.

Validation plots for comparing our data set to the previous literatures. a Validation plot of bond length compared with experimental diatomic bond length (r) in CCCBDB [5]. b Validation plot of binding energy compared with experimental value of binding energy (∆E) in bond dissociation energy [6]. c-e Validation plots with comparison of bond length (r) in our data set to that of calculated geometry in CCCBDB [5] calculated by methods with c predefined basis sets, d standard basis sets, and e effective core potentials, respectively. The plot ranges of c-e are limited for visibility. The same plots as c-e with view ranges containing all data points are presented in Supplementary Fig. 1. We extracted experimentally obtained binding energies from bond dissociation energy database [6] and compared them with binding energies in our dataset. The dissociation energy database records the binding energy in diatomic systems at 298 K and the authors approximate the value for the pairs of which value is not available at 298 K by considering the temperature dependent internal energy to be 3/2RT. Since our binding energies are obtained by the DFT calculations at 0 K, here we compare our value with the database values subtracted by 3/2R(298 K)=3.71818 J⋅mol−1. Among 2976 valid (class 3) pairs in our database, 828 pairs are recorded in the database and are used for comparison. Fig. 7b shows a validation plot between binding energy in our dataset and the reported experimental values. Note that experimental errors provided in the database are not considered in the plot.

Comparison with calculated values in literature

We compared our dataset with previously reported calculated values on diatomic systems. We extracted bond length r from calculated diatomic bond lengths in CCCBDB [5]. Among 2976 valid (class 3) pairs in our database, 199 pairs are recorded in CCCBDB and are used for comparison. For each pair, CCCBDB contains multiple values of bond length depending on calculation conditions. Fig. 7c-e shows validation plots of bond length with comparison between our dataset and that of the calculated diatomic bond lengths in CCCBDB by methods with predefined basis sets, standard basis sets, and effective core potentials, respectively. Note that the plot ranges of Fig. 7c-e is chosen to be from 0 to 5 Å for visibility since bond length in our dataset which are also in CCCBDB are in this range. The same plots with view ranges containing all data points are presented in Supplementary Fig. 1. The secondary dataset and codes for plotting the validation plots are provided as supplementary file.

Experimental Design, Materials and Methods

First-principles calculations

Diatomic systems of atom pairs containing element from H (Z = 1) to Ra (Z = 88), 3916 pairs in total, are considered. In addition to the diatomic systems, the 88 isolated atom systems were also calculated for evaluating binding energy. All the first-principles DFT calculations were performed with the projector augmented wave (PAW) method [7] using the Vienna Ab-initio Simulation Package (VASP) [8]. SCAN+rVV10 [9] was used as vdW-DF in the implementations of the VASP code [10,11]. We have examined six functionals, including SCAN+rVV10, Perdew-Burke-Ernzerhof (PBE) [12], Tkatchenko-Scheffler [13], DFT-D2 [14], optPBE [10], and optB88 [11], and concequently SCAN+rVV10 vdW was selected because it shows best stability and accuracy on the calculation convergence and results, respectively. Semi-core orbital was included in valence. The selection of the PAW potential can be confirmed by potcar_symbol entry in the parsed statistical datasets or by “OUTCAR” in the data set at Mendeley data [4]. Cut-off energy of 500 eV was used as a default value for most of the isolated atoms and diatomic systems, but it was altered manually for some pairs which did not converged. Spin polarization was considered, but spin orbit interaction was not considered. The Brillouin zone was sampled with a 1 × 1 × 1 Γ-centered k-point grid. All calculations were carried out in a 15 Å cubic cell, large so that interaction between mirror atoms becomes small. Each diatomic system or isolated atom was positioned at the center of the cubic cell. For diatomic systems, both atomic structure and electronic structure were relaxed. For isolated atoms, only electronic structure was relaxed. The initial interatomic distance of each diatomic system was set as the sum of covalent atomic radii [15] and was tuned manually for pairs which does not converged. Other all detailed calculation conditions can be confirmed by checking VASP output files (“vasprun.xml” and “OUTCAR”) available at Mendeley data [4].

Parsing data and creating database

The calculated data was parsed in Python using pymatgen package [16] and summarized using pandas [17,18] package. Final energy and Fermi energy were read from “OUTCAR” and “vasprun.xml”. The stabilized distance was obtained from “vasprun.xml”. For each diatomic system composed of atom 1 and atom 2, binding energy ΔE was calculated by ΔE=Etot-(Etot1+Etot2), where Etot is total energy of diatomic system, Etot1 is total energy of an isolated atom 1, and Etot2 is that of an isolated atom 2.

Code availability

The VASP code used for the DFT calculation is a proprietary code. The VASP input and output data was parsed, checked, and summarized in Python with freely available packages: numpy, pymatgen, pandas, matplotlib, seaborn, and jupyter. The csv files are text files and can be used by many softwares and programs. The python pickle data records can be loaded by python environments with pandas and pymatgen package installed. Along with the data records, we provide some python scripts which can be used for parsing the raw VASP files and visualizing the statistical properties. These codes for parsing datasets are also available at Mendeley data [4].

Ethics Statement

This work does not involve neither of the use of human subjects nor animal experiments nor data collected from social media platforms.

CRediT Author Statement

Kiyou Shibata: Software, Data curation, Validation, Visualization, Writing - Original Draft; Eiki Suzuki: Methodology, Software, Investigation; Teruyasu Mizoguchi: Conceptualization, Resources, Writing - Review & Editing, Supervision, Project administration, Funding acquisition.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships which have or could be perceived to have influenced the work reported in this article.
SubjectComputational Materials Science
Specific subject areaBonding behavior between various two atoms using van der Waals density functional method
Type of dataTable, Figure, python pickle file, VASP output files, python scripts for parsing and plotting
How data were acquiredThe first-principles calculations were carried out by projector augmented wave method using the Vienna Ab-initio Simulation Package (VASP) code.The raw VASP output files were parsed by scripts written in Python.
Data formatRawAnalysedFiltered
Parameters for data collectionSCAN+rVV10 van der Waals density functional with exchange correlation interaction by the Perdew-Burke-Ernzerhof generalized gradient approximation. All calculations were carried out in a 15 Å cubic cell. Spin polarization was considered, but spin orbit interaction was not considered. The Brillouin zone was sampled with a 1 × 1 × 1 Γ-centered k-point grid. For diatomic systems, both atomic structure and electronic structure were relaxed. For isolated atoms, only electronic structure was relaxed.
Description of data collectionStructure and basic physical properties were calculated for all diatomic molecules containing atoms from H (Z = 1) to Ra (Z = 88) through ionic and electronic structure optimization based on the density functional theory considering van der Waals interaction. Basic physical properties of 88 isolated atom systems from H (Z = 1) to Ra (Z = 88) were also calculated in the same manner without ionic structure optimization for obtaining binding energies of the diatomic molecules. The raw calculation data sets were parsed and classified based on some criteria of calculation errors and unphysical values.
Data source locationInstitution: Institute of Industrial Science, the University of Tokyo, 4–6–1 Komaba Meguro-ku, Tokyo, JapanPrimary data sources (for comparison):NIST computational chemistry comparison and benchmark database, NIST standard reference database 101, Editor: R. D. Johnson III, Release 21 (Aug. 2002). doi: https://doi.org/10.18434/T47C7Z. URL http://cccbdb.nist.gov/S. Fliszár, Bond Dissociation Energies, in: Atomic Charges, Bond Properties, and Molecular Energies, John Wiley & Sons, Inc., Hoboken, NJ, USA, 2008, pp. 151–166. doi: https://doi.org/10.1002/9780470405918.ch12.
Data accessibilityRepository name: Mendeley DataDirect URL to data: https://data.mendeley.com/datasets/yz5rrmvrgd/1
  8 in total

1.  Generalized Gradient Approximation Made Simple.

Authors: 
Journal:  Phys Rev Lett       Date:  1996-10-28       Impact factor: 9.161

2.  Semiempirical GGA-type density functional constructed with a long-range dispersion correction.

Authors:  Stefan Grimme
Journal:  J Comput Chem       Date:  2006-11-30       Impact factor: 3.376

3.  Accurate molecular van der Waals interactions from ground-state electron density and free-atom reference data.

Authors:  Alexandre Tkatchenko; Matthias Scheffler
Journal:  Phys Rev Lett       Date:  2009-02-20       Impact factor: 9.161

4.  Covalent radii revisited.

Authors:  Beatriz Cordero; Verónica Gómez; Ana E Platero-Prats; Marc Revés; Jorge Echeverría; Eduard Cremades; Flavia Barragán; Santiago Alvarez
Journal:  Dalton Trans       Date:  2008-04-07       Impact factor: 4.390

5.  Projector augmented-wave method.

Authors: 
Journal:  Phys Rev B Condens Matter       Date:  1994-12-15

6.  Chemical accuracy for the van der Waals density functional.

Authors:  Jiří Klimeš; David R Bowler; Angelos Michaelides
Journal:  J Phys Condens Matter       Date:  2009-12-10       Impact factor: 2.333

7.  van der Waals forces in density functional theory: a review of the vdW-DF method.

Authors:  Kristian Berland; Valentino R Cooper; Kyuho Lee; Elsebeth Schröder; T Thonhauser; Per Hyldgaard; Bengt I Lundqvist
Journal:  Rep Prog Phys       Date:  2015-05-15

8.  The diatomic molecular spectroscopy database.

Authors:  Xiangyue Liu; Stefan Truppe; Gerard Meijer; Jesús Pérez-Ríos
Journal:  J Cheminform       Date:  2020-05-11       Impact factor: 5.514

  8 in total

北京卡尤迪生物科技股份有限公司 © 2022-2023.