A band-gap database for semiconducting inorganic materials calculated with hybrid functional.

Sangtae Kim1, Miso Lee1, Changho Hong1, Youngchae Yoon1, Hyungmin An1, Dongheon Lee1, Wonseok Jeong1, Dongsun Yoo1, Youngho Kang2, Yong Youn3, Seungwu Han4.   

Abstract

Semiconducting inorganic materials with band gaps ranging between 0 and 5 eV constitute major components in electronic, optoelectronic and photovoltaic devices. Since the band gap is a primary material property that affects the device performance, large band-gap databases are useful in selecting optimal materials in each application. While there exist several band-gap databases that are theoretically compiled by density-functional-theory calculations, they suffer from computational limitations such as band-gap underestimation and metastable magnetism. In this data descriptor, we present a computational database of band gaps for 10,481 materials compiled by applying a hybrid functional and considering the stable magnetic ordering. For benchmark materials, the root-mean-square error in reference to experimental data is 0.36 eV, significantly smaller than 0.75-1.05 eV in the existing databases. Furthermore, we identify many small-gap materials that are misclassified as metals in other databases. By providing accurate band gaps, the present database will be useful in screening materials in diverse applications.

Year:  2020        PMID: 33177500      PMCID: PMC7658987          DOI: 10.1038/s41597-020-00723-8

Source DB:  PubMed          Journal:  Sci Data        ISSN: 2052-4463            Impact factor:   6.444


Background & Summary

The band gap (Eg) is a fundamental quantity that directly relates to usability of materials in optical, electronic, and energy applications. For instance, in photovoltaic devices, materials with a direct Eg of ∼1.3 eV[1,2], corresponding to the Shockley-Queisser limit, are favored as photo-absorbers that maximize the solar-cell efficiency. In power electronics, semiconductors with Eg ≥ 3 eV are employed to sustain high electric fields[3]. To increase the figure of merit in thermoelectric devices, materials with Eg of 10 kBTop where kB and Top are the Boltzmann constant and operating temperature, respectively, are selected[4]. Given the central role of Eg, a database of Eg over a wide range of materials can expedite the material selection in specific applications by factoring out suboptimal candidates rapidly. Currently, popular material databases such as the Materials Project[5], the Automatic Flow of Materials Discovery Library (AFLOW)[6], the Open Quantum Materials Database (OQMD)[7], and the Joint Automated Repository for Various Integrated Simulations (JARVIS)[8] provide theoretical Eg for up to one million inorganic materials. However, most of them were obtained by semilocal functionals with a generalized gradient approximation (GGA), which underestimates Eg by typically 30–40%[9]. (MatDB[10] provides accurate quasi-particle band gaps, but the number of data is limited to hundreds.) To compensate for this, AFLOW provides adjusted Eg using a linear fit to experimental data[11]. However, such a universal correction may not address element-dependent error fluctuations. We note that JARVIS provides Eg based on meta-GGA[12], which significantly improves the accuracy. As a related issue, many small-gap semiconductors such as Ge, InAs, PdO, Zn3As2, and Ag2O are misclassified as metals, which can affect selection of narrow-gap semiconductors in IR sensors[13], for instance. (In JARVIS, some of these errors are resolved by meta-GGA.) 
Besides the underestimation of Eg, all the databases consider only the ferromagnetic ordering for spin-polarized systems due to computational convenience, which can cause significant errors in Eg of antiferromagnetic materials. For instance, the antiferromagnetic NiO has an experimental Eg of 4.3 eV[14], but the computational Eg ranges over 2.2–2.6 eV in the ferromagnetic ordering and GGA functional[5-7] while the correct antiferromagnetic ordering produces 4.5 eV within the hybrid functional. Addressing limitations in the existing material databases, we herein report a theoretical dataset of fundamental and optical Eg computed by employing a hybrid functional and identifying the stable magnetic ordering, thus providing more accurate Eg than the existing databases. For the high-throughput computational workflow, we employ ‘Automated Ab initio Modeling of Materials Property Package’ (AMP2)[15], which is a fully automated package for density functional theory (DFT) calculations of crystalline properties. AMP2 addresses the band-gap underestimation in semilocal functionals with the help of a hybrid functional, thereby producing a more accurate Eg, even if the material is incorrectly metallic within the semilocal functional. Furthermore, the package finds the antiferromagnetic ground state based on an effective Ising model. The present database focuses on materials with 0 eV < Eg < 5 eV, which covers most semiconducting materials. The target materials are selected from Inorganic Crystal Structure Database (ICSD)[16] and partly filtered by information from the Materials Project database. In total, the database collects Eg for 10,481 materials that encompass most inorganic solids with Eg ranging between 0 and 5 eV. For 116 benchmark materials, the root-mean-square error (RMSE) with respect to experimental data is 0.36 eV, significantly smaller than 0.75–1.05 eV in the existing databases. The resulting data are available online at figshare[17] or SNUMAT[18].

Methods

High-throughput methodology: AMP2

The present database is constructed with AMP2, an automation script that operates VASP[19-21]. Starting only from the initial crystal structure, AMP2 provides the band structure, Eg, effective mass, density of states (DOS), and dielectric constants of the crystal by automatically pipelining the computational procedures. To summarize the computational settings relevant to the present work, we employ the GGA functional developed by Perdew, Burke, and Ernzerhof (PBE)[22] as the exchange-correlation functional for structural relaxation and for identifying band edges. The Eg is obtained by ‘one-shot’ hybrid functional calculations (specifically, HSE06[23], simply HSE hereafter), in which the package estimates Eg from HSE eigenvalues at the k points of the band edges found with PBE (crystal structures are also fixed to those relaxed by PBE). A previous study[24] demonstrated that band edges from PBE and HSE lie at the same k points, which is confirmed again in the present work for Si, SrS, BAs, BeS, AlAs, AgI, AgGaTe2, ZnSiAs2, and ZnIn2Se4. In addition, the small structural differences between PBE and HSE[25] would not affect the band gap significantly, except for small-gap semiconductors (see below). (This is also the case for systems that go from metallic in PBE to insulating in HSE.) These observations support that the one-shot scheme can produce Eg close to that of full hybrid calculations.
If the material is identified as a metal within PBE, AMP2 inspects the DOS, and if the DOS at the Fermi level normalized by that of the valence band (DF/DVB) is less than a threshold, the package further tests a possible gap opening by the one-shot hybrid calculation. The PBE+U method is applied on 3d orbitals[26] only when the material has a finite Eg. For pseudopotentials, we mostly employ those without any tags in the VASP database, which tends to reduce the number of valence electrons. For further details, we refer to the original publication[15].
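The one-shot scheme described above can be illustrated with a minimal sketch. It assumes a non-spin-polarized system with the same number of occupied bands at every k point, and it takes eigenvalues as plain arrays rather than reading VASP output; the function names `pbe_band_edges` and `one_shot_gap` and the toy numbers are hypothetical and are not part of AMP2.

```python
import numpy as np

def pbe_band_edges(eig_pbe, n_occ):
    """Locate the band-edge k points in a set of PBE eigenvalues.

    eig_pbe: array of shape (n_kpoints, n_bands), energies in eV.
    n_occ:   number of occupied bands (assumed identical at every k point).
    Returns (k index hosting the VBM, k index hosting the CBM).
    """
    eig_pbe = np.asarray(eig_pbe, dtype=float)
    k_vbm = int(np.argmax(eig_pbe[:, n_occ - 1]))  # highest occupied state
    k_cbm = int(np.argmin(eig_pbe[:, n_occ]))      # lowest empty state
    return k_vbm, k_cbm

def one_shot_gap(eig_pbe, eig_hse_at, n_occ):
    """Estimate the hybrid-functional gap from HSE eigenvalues that were
    evaluated only at the PBE band-edge k points.

    eig_hse_at: dict mapping a k-point index to its HSE band energies (eV).
    """
    k_vbm, k_cbm = pbe_band_edges(eig_pbe, n_occ)
    vbm = eig_hse_at[k_vbm][n_occ - 1]
    cbm = eig_hse_at[k_cbm][n_occ]
    return max(cbm - vbm, 0.0)  # a negative difference means no gap

# Toy data: three k points, one occupied and one empty band.
toy_pbe = np.array([[-1.0, 0.8], [-0.2, 1.5], [-0.6, 0.5]])
toy_hse = {1: np.array([-0.5, 2.0]), 2: np.array([-0.9, 1.0])}
toy_gap = one_shot_gap(toy_pbe, toy_hse, n_occ=1)
```

In this sketch the hybrid calculation is needed only at the two k points returned by `pbe_band_edges`, which is the source of the cost saving relative to a full hybrid band-structure calculation.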
Computational parameters used in the present work follow the default settings of AMP2, except that the package applies the PBE+U method on Ce 4f levels with a U value of 4 eV[27]. (The pseudopotentials for La and Ce treat f levels as valence.) Furthermore, for compounds including Tl, Pb, and Bi, Eg is recalculated with spin-orbit coupling (SOC) when the default Eg without SOC is smaller than 1 eV. (The band edges are also searched again with SOC.) This is because typical SOC corrections of ∼0.5 eV would be critical in these cases.
In identifying the stable collinear magnetic ordering, AMP2 applies a genetic algorithm to the Ising model[28]. This approach finds the stable magnetic ordering correctly for many compounds. However, the original formulation requires a large supercell to isolate exchange interactions from periodic images, which costs significant computational resources and also suffers from poor convergence in electronic iterations. To resolve this, we here develop an alternative method for obtaining exchange parameters. First, we choose a minimal supercell under the following two conditions: (i) A magnetic site α and its periodic images in other supercells are more than 5 Å apart (the cutoff range for magnetic interactions). (ii) If two magnetic sites α and β (not necessarily belonging to the same supercell) are within 5 Å, then the distance between α and β′, a periodic image of β (β′ ≠ β), is longer than 5 Å, except when α-β and α-β′ are symmetrically equivalent. Within the Ising model, the total energy of the supercell (E) can be expressed as follows:
$$E = E_0 + \sum_{I=1}^{m} J_I \left( N_I^{\mathrm{P}} - N_I^{\mathrm{AP}} \right) \qquad (1)$$
where E0 is the base energy excluding the magnetic interaction, and I is the index for independent exchange interactions (m interactions in total) with a maximum range of 5 Å and an exchange parameter of $J_I$. In Eq. (1), $N_I^{\mathrm{P}}$ and $N_I^{\mathrm{AP}}$ are the numbers of parallel and antiparallel spin pairs within the supercell corresponding to the interaction I, respectively.
Then, based on the ferromagnetic configuration (all spin-up), diverse spin configurations are sampled by spin-flipping a magnetic pair (both atoms) or a certain magnetic site. The number of resulting equations is larger than m and an optimal {E0, J} can be obtained by the pseudoinverse method. We find that this approach produces essentially the same parameters as the original scheme but is more reliable and efficient.
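The pseudoinverse step can be sketched as an ordinary linear least-squares problem. The sketch assumes the Ising energy is written as E = E0 + Σ_I J_I (N_P − N_AP)_I; this sign convention, the function name `fit_ising`, and the synthetic numbers are assumptions made for illustration, not values or code from AMP2.

```python
import numpy as np

def fit_ising(pair_diff, energies):
    """Least-squares fit of E = E0 + sum_I J_I * (N_P - N_AP)_I.

    pair_diff: (n_configs, m) array; entry [c, I] is N_P - N_AP for
               interaction I in sampled spin configuration c.
    energies:  (n_configs,) total energies of the same configurations.
    Returns (E0, J), where J has length m.
    """
    pair_diff = np.asarray(pair_diff, dtype=float)
    energies = np.asarray(energies, dtype=float)
    # Design matrix: a column of ones for E0, then one column per interaction.
    design = np.hstack([np.ones((pair_diff.shape[0], 1)), pair_diff])
    # The Moore-Penrose pseudoinverse gives the least-squares solution
    # when there are more configurations than unknowns.
    params = np.linalg.pinv(design) @ energies
    return params[0], params[1:]

# Synthetic check: build energies from known parameters and recover them.
sample_diff = np.array([[4, 2], [0, 2], [4, -2], [-4, 2], [0, -2]])
true_e0, true_j = -10.0, np.array([0.05, -0.02])
sample_energies = true_e0 + sample_diff @ true_j
fit_e0, fit_j = fit_ising(sample_diff, sample_energies)
```

With five sampled configurations and three unknowns (E0 and two exchange parameters), the system is overdetermined, which mirrors the situation described in the text.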

Selection of materials

Figure 1 schematizes the workflow for constructing the database. Starting from the ICSD, we only consider compounds consisting of elements with atomic number (Z) < 84. Among the lanthanides, we limit the elements to La and Ce. We remove structural duplicates and structures with partially occupied sites, and also omit large primitive cells that contain more than 40 atoms. For unary and binary compounds, all the structures are calculated with AMP2. For ternary and higher compounds, we utilize information on Eg and DOS in the Materials Project database (calculated by PBE) to filter out materials that are likely to be metallic or large-gap insulators. To be specific, we exclude materials with a PBE Eg larger than 3 eV since they are likely to have an HSE Eg larger than 5 eV. (Compiling data of 4,421 compounds from the previous screening studies[24,29-31], we find that 99.7% of materials with an HSE Eg < 5 eV have a PBE Eg < 3 eV.) We also include metallic materials with DF/DVB < 0.8 for possible gap opening (see above; a larger threshold is used because of the low-resolution DOS in the Materials Project). If a Materials Project entry has incomplete data for Eg or DOS, the material is included in the computation list. In this way, we could factor out 5,059 materials from the list of ternary and higher compounds. Finally, we calculate 21,353 materials with AMP2. After computation, we collect 10,481 materials with finite Eg (unary: 63, binary: 1,919, ternary: 5,074, quaternary: 2,804, quinary: 573, and higher: 48).
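The pre-screening rule for ternary and higher compounds can be written as a small predicate. This is a sketch of the stated criteria only: the function name `keep_for_calculation`, its argument names, and the use of `None` for a missing Materials Project entry are hypothetical choices, while the 3 eV and 0.8 thresholds are those quoted above.

```python
from typing import Optional

def keep_for_calculation(pbe_gap: Optional[float],
                         dos_ratio: Optional[float],
                         gap_max: float = 3.0,
                         dos_ratio_max: float = 0.8) -> bool:
    """Decide whether a ternary-or-higher compound goes on the AMP2 list.

    pbe_gap:   PBE band gap from the reference database in eV, or None if missing.
    dos_ratio: DOS at the Fermi level normalized by the valence band
               (DF/DVB), or None if missing.
    """
    # Incomplete entries cannot be screened, so they are always computed.
    if pbe_gap is None or dos_ratio is None:
        return True
    # Large PBE gaps are expected to exceed 5 eV with the hybrid functional.
    if pbe_gap > gap_max:
        return False
    # Finite PBE gaps up to the cutoff are kept.
    if pbe_gap > 0.0:
        return True
    # PBE metals are kept only if the Fermi-level DOS is small enough
    # that a gap might open with the hybrid functional.
    return dos_ratio < dos_ratio_max
```

For example, `keep_for_calculation(0.0, 0.5)` keeps a PBE metal with a low Fermi-level DOS, whereas `keep_for_calculation(3.5, 0.0)` drops a likely wide-gap insulator.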
Fig. 1

The computational workflow for collecting the dataset. Numbers in the parentheses indicate material counts.


Data Records

All the calculated properties for the 10,481 compounds can be downloaded from the Figshare repository[17]. The whole dataset, including metals, is also uploaded to SNUMAT (www.snumat.com), which provides easy search and visualization of materials through its own interactive interface. SNUMAT also supports a REST API[32] that lets authorized users search the materials. Authorization tokens expire 24 hours after they are issued.

File format

The data are stored in the JSON format. The name of each file is X_ICSD#.json, where X is the chemical formula and ICSD# is the ICSD collection code of the initial structure used for the calculation. Each JSON file includes the final relaxed structure information, the GGA and HSE band gaps, and the DOS. Table 1 summarizes the keys for metadata.
Table 1

Description of metadata keys in JSON file.

| Key | Type | Description |
| --- | --- | --- |
| SNUMAT_id | string | ID in the SNUMAT |
| ICSD_number | int | ICSD collection code |
| Band_gap_GGA | float | Calculated fundamental band gap in GGA (eV) |
| Band_gap_GGA_optical | float | Calculated direct band gap in GGA (eV) |
| Band_gap_HSE | float | Calculated fundamental band gap in HSE (eV) |
| Band_gap_HSE_optical | float | Calculated direct band gap in HSE (eV) |
| Direct_or_indirect | string | Type of band gap in GGA (direct or indirect) |
| Direct_or_indirect_HSE | string | Type of band gap in HSE (direct or indirect) |
| Structure_rlx | string | Relaxed structure information (VASP POSCAR format) |
| Space_group_rlx | int | Space group number of relaxed structure |
| Magnetic_ordering | string | Magnetic ordering of final structure |
| SOC | boolean | Spin-orbit coupling (True or False) |
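A record with the keys of Table 1 could be read along the following lines. This is a minimal sketch that assumes the metadata keys sit at the top level of each JSON object, which the text does not state; the helper names `parse_filename` and `summarize`, the file name `Si_51688.json`, and every value in the sample record are placeholders for illustration, not data from the repository.

```python
import json

def parse_filename(name):
    """Split a file name of the form X_ICSD#.json into (formula, ICSD code)."""
    stem = name[:-len(".json")] if name.endswith(".json") else name
    formula, icsd = stem.rsplit("_", 1)
    return formula, int(icsd)

def summarize(record):
    """Collect a few band-gap quantities from one metadata record."""
    return {
        "id": record["SNUMAT_id"],
        "gap_hse": record["Band_gap_HSE"],
        # How much the hybrid functional opens the gap relative to GGA.
        "hse_correction": record["Band_gap_HSE"] - record["Band_gap_GGA"],
        "is_direct_hse": record["Direct_or_indirect_HSE"].lower() == "direct",
    }

# Placeholder record; real files would be opened with json.load(open(path)).
sample_text = """{
  "SNUMAT_id": "EXAMPLE-ID",
  "ICSD_number": 51688,
  "Band_gap_GGA": 0.6,
  "Band_gap_HSE": 1.1,
  "Direct_or_indirect_HSE": "indirect",
  "SOC": false
}"""
sample_record = json.loads(sample_text)
sample_summary = summarize(sample_record)
```

Comparing `Band_gap_HSE` with `Band_gap_HSE_optical` in the same way would distinguish the fundamental gap from the smallest direct gap.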

Graphical representation of the data

In Fig. 2, we present the distribution of the GGA and HSE band gaps for 10,481 materials. Most materials with an HSE band gap > 5 eV (663 cases) are unary or binary compounds, for which AMP2 is applied to the whole structure dataset from ICSD.
Fig. 2

Distribution of the GGA and HSE band gaps, with occurrence histograms of each along the top and right margins.

Technical Validation

Comparison to experimental measurements and other databases

In Fig. 3a, we compare experimental and theoretical values for 116 benchmark materials with experimental Eg between 0 and 5 eV. The list of compounds is shown in Online-only Table 1. For comparison, theoretical results from other databases are also shown in Fig. 3b–d. The RMSE values are 0.36 eV (present work), 1.05 eV (Materials Project), 0.93 eV (AFLOW), 0.75 eV (AFLOW fitted), 1.02 eV (OQMD), and 0.85 eV (JARVIS). (The meta-GGA values of 19 materials, mostly with small Eg, are missing in JARVIS.) This confirms that the present database provides more accurate Eg than the existing databases on average. In particular, we correctly identify the semiconducting nature for small-gap semiconductors such as AgSbTe2, CdO, CoP3, Cu3AsSe4, Cu3SbS4, Cu3SbSe4, CuFeS2, Ge, Mg2Sn, RhSb3, and ZnSnSb2, which are mostly misreported as metals in other databases. In addition, other databases exhibit pronounced errors for every antiferromagnetic material (CuFeS2, CuO, FeF2, MnO, MnTe, and NiO) because these materials are considered as ferromagnetic or non-magnetic. (For non-magnetic materials in Online-only Table 1, the Eg calculated with pure PBE (without +U and SOC) by AMP2 agrees well with those from Materials Project (the mean absolute error is 0.034 eV).)
Fig. 3

Comparison of Eg for benchmark materials between experimental and theoretical data from (a) this work, (b) Materials Project, (c) AFLOW and (d) OQMD and JARVIS. AFLOW-fitted values are obtained from $E_g^{\mathrm{fit}} = 1.348\,E_g^{\mathrm{AFLOW}} + 0.913$ eV.

Online-only Table 1

The list of materials and their band gaps from experiment, present work, Materials Project, OQMD, AFLOW, AFLOW (fitted), and JARVIS.

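All band gaps are in eV. A dash marks a missing value.

| Name | ICSD | Space group | Exp. | Present work | Materials Project | OQMD | AFLOW | AFLOW (fitted) | JARVIS |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Mg2Pb | 104846 | 225 | 0.1 | 0 | 0 | 0 | 0 | 0 | 0.32 |
| InSb | 640411 | 216 | 0.16 | 0.01 | 0 | 0.26 | 0 | 0 | 0.13 |
| Bi2Te3 | 15753 | 166 | 0.21 | 0.02 | 0.53 | 0.52 | 0.28 | 1.29 | - |
| PbTe | 63098 | 225 | 0.26 | 0.42 | 0.87 | 0.97 | 0.8 | 1.99 | 1.4 |
| CdSnAs2 | 16737 | 122 | 0.26 | 0.04 | 0.05 | 0.3 | 0 | 0 | - |
| Sb2Te3 | 193341 | 166 | 0.3 | 0.43 | 0.01 | 0.21 | 0.04 | 0.97 | - |
| Cu3SbSe4 | 400652 | 121 | 0.31 | 0.11 | 0 | 0 | 0 | 0 | - |
| Te | 161690 | 152 | 0.33 | 0.69 | 0.41 | 0.5 | 0.15 | 1.12 | 0.62 |
| Bi2Se3 | 617072 | 166 | 0.35 | 0.23 | 0.54 | 0.42 | 0.49 | 1.58 | - |
| Mg2Sn | 104869 | 225 | 0.36 | 0.09 | 0 | 0 | 0 | 0 | 0.24 |
| InAs | 610687 | 216 | 0.36 | 0.1 | 0 | 0.37 | 0 | 0 | 0.4 |
| PbSe | 63096 | 225 | 0.37 | 0.36 | 0.47 | 0.69 | 0.4 | 1.45 | 0.96 |
| ZnSnSb2 | 42669 | 122 | 0.4 | 0.36 | 0 | 0 | 0 | 0 | 0.32 |
| CoP3 | 624594 | 204 | 0.43 | 0.11 | 0 | 0 | 0 | 0 | - |
| CdSb | 52830 | 61 | 0.46 | 0.52 | 0.14 | 0.19 | 0.09 | 1.03 | - |
| Cu3SbS4 | 2857 | 121 | 0.46 | 0.32 | 0 | 0 | 0 | 0 | - |
| PbS | 62190 | 225 | 0.5 | 0.47 | 0.47 | 0.69 | 0.47 | 1.54 | 1.3 |
| ZnSb | 43265 | 61 | 0.5 | 0.63 | 0.04 | 0.2 | 0.17 | 1.14 | - |
| CuFeS2 | 60166 | 122 | 0.53 | 1.6 | 0 | 0 | 0 | 0 | 0 |
| CdGeAs2 | 153593 | 122 | 0.53 | 0.09 | 0.08 | 0.34 | 0.03 | 0.96 | 0.52 |
| CoSb3 | 153504 | 204 | 0.63 | 0.75 | 0.17 | 0.22 | 0.15 | 1.12 | - |
| ZnSnAs2 | 18203 | 122 | 0.65 | 0.55 | 0.01 | 0.2 | 0 | 0 | 0.8 |
| Ge | 41980 | 227 | 0.67 | 0.17 | 0 | 0 | 0 | 0 | 0.61 |
| GaSb | 41675 | 216 | 0.67 | 0.41 | 0 | 0.37 | 0 | 0 | 0.59 |
| CoAs3 | 31111 | 204 | 0.69 | 0.52 | 0 | 0.15 | 0 | 0 | - |
| InN | 109463 | 186 | 0.7 | 0.5 | 0 | 0.28 | 0 | 0 | 0.76 |
| Mg2Ge | 52283 | 225 | 0.74 | 0.54 | 0.19 | 0.24 | 0.19 | 1.17 | 0.58 |
| TlSe | 44706 | 140 | 0.74 | 0.55 | 0.25 | 0 | 0.11 | 1.06 | 0.74 |
| Mg2Si | 104864 | 225 | 0.77 | 0.59 | 0.24 | 0.28 | 0.24 | 1.23 | 0.62 |
| RhSb3 | 650248 | 204 | 0.8 | 0.16 | 0 | 0.15 | 0 | 0 | - |
| CdO | 181294 | 225 | 0.8 | 0.76 | 0 | 0 | 0 | 0 | 1.31 |
| CuGaTe2 | 28738 | 122 | 0.82 | 0.96 | 0.2 | 0.4 | 0.43 | 1.49 | 0.04 |
| RhAs3 | 34052 | 204 | 0.85 | 0 | 0 | 0 | 0 | 0 | - |
| ZnGeAs2 | 16735 | 122 | 0.85 | 0.84 | 0.05 | 0.34 | 0.17 | 1.14 | 1.07 |
| HgIn2Te4 | 25652 | 82 | 0.86 | 1.12 | 0.52 | 0.61 | 0.56 | 1.67 | 1.35 |
| CuInSe2 | 73351 | 122 | 0.86 | 0.74 | 0.02 | 0.16 | 0 | 0 | 0.64 |
| Cu3AsSe4 | 610361 | 121 | 0.88 | 0.05 | 0 | 0 | 0 | 0 | - |
| Zn3As2 | 44091 | 137 | 0.93 | 0.54 | 0 | 0 | 0.13 | 1.09 | - |
| CuInTe2 | 169048 | 122 | 0.95 | 0.71 | 0.02 | 0.24 | 0.25 | 1.25 | 1.04 |
| AgInSe2 | 28751 | 122 | 0.96 | 0.74 | 0.06 | 0.16 | 0.4 | 1.45 | 1 |
| CuGaSe2 | 247513 | 122 | 0.96 | 1.08 | 0.04 | 0.25 | 0.4 | 1.45 | 1.39 |
| AgGaSe2 | 28748 | 122 | 1.1 | 1.24 | 0.25 | 0.41 | 0.7 | 1.86 | 1.56 |
| Si | 51688 | 227 | 1.12 | 1.19 | 0.62 | 0.77 | 0.61 | 1.74 | 1.28 |
| AgGaTe2 | 28749 | 122 | 1.15 | 0.9 | 0.19 | 0.32 | 0.52 | 1.62 | - |
| Cu2CdSnS4 | 238144 | 121 | 1.16 | 0.97 | 0 | 0.21 | 0.1 | 1.05 | - |
| CdSnP2 | 22183 | 122 | 1.17 | 0.9 | 0.27 | 0.51 | 0.14 | 1.1 | 1.21 |
| InSe | 185172 | 194 | 1.17 | 1.18 | 0.46 | 0.55 | 0.43 | 1.5 | 1.43 |
| IrSb3 | 640958 | 204 | 1.18 | 0.61 | 0.05 | 0.21 | 0 | 0 | 0.24 |
| AgInS2 | 28750 | 122 | 1.18 | 1.26 | 0.38 | 0.49 | 0.92 | 2.16 | 1.62 |
| ZnIn2Te4 | 25650 | 82 | 1.2 | 1.51 | 0.82 | 0.92 | 0.93 | 2.17 | 1.77 |
| CuInS2 | 66865 | 122 | 1.2 | 1.22 | 0.01 | 0.42 | 0.4 | 1.45 | 0.69 |
| Cu3AsS4 | 14285 | 31 | 1.24 | 0.96 | 0.01 | 0 | 0.2 | 1.18 | - |
| CdIn2Te4 | 25651 | 82 | 1.26 | 1.49 | 0.84 | 0.92 | 0.9 | 2.12 | 1.72 |
| InP | 53105 | 216 | 1.27 | 1.12 | 0.47 | 0.74 | 0.58 | 1.69 | 1.39 |
| GaAs | 41981 | 216 | 1.35 | 0.95 | 0.19 | 0.57 | 0.3 | 1.32 | 1.32 |
| ZnGa2Te4 | 290911 | 82 | 1.35 | 1.72 | 1.01 | 1.18 | 1.11 | 2.41 | 1.74 |
| CuO | 16025 | 15 | 1.4 | 1.94 | 0 | 0 | 0 | 0 | 0.01 |
| CdTe | 93944 | 216 | 1.44 | 1.63 | 0.59 | 0.87 | 0.71 | 1.87 | 1.64 |
| HgIn2Se4 | 25649 | 82 | 1.45 | 1.38 | 0.62 | 0.72 | 0.67 | 1.81 | 1.64 |
| ZnSnP2 | 22179 | 122 | 1.45 | 1.53 | 0.7 | 0.92 | 0.68 | 1.83 | 1.69 |
| MnTe | 174028 | 194 | 1.46 | 1.47 | 0 | 0 | 0 | 0 | 0 |
| BAs | 43871 | 216 | 1.5 | 1.86 | 1.21 | 1.42 | 1.2 | 2.53 | 1.93 |
| Se (gray) | 164264 | 152 | 1.5 | 1.76 | 1.06 | 0.88 | 0.98 | 2.23 | 1.71 |
| CdTe | 620518 | 186 | 1.5 | 1.67 | 0.62 | 0.87 | 0.71 | 1.92 | 1.76 |
| CdIn2Se4 | 151954 | 82 | 1.55 | 1.77 | 0.95 | 1.03 | 1.04 | 2.31 | 2.16 |
| CdSiAs2 | 22187 | 122 | 1.6 | 1.14 | 0.4 | 0.69 | 0.46 | 1.53 | 1.33 |
| AlSb | 44325 | 216 | 1.6 | 1.88 | 1.23 | 1.38 | 1.23 | 2.57 | 1.78 |
| AgGaS2 | 23698 | 122 | 1.66 | 1.99 | 0.96 | 1.09 | 1.5 | 2.94 | 2.36 |
| GaTe | 635512 | 12 | 1.69 | 1.88 | 1.06 | 0.93 | 0.89 | 2.12 | 1.72 |
| ZnSiAs2 | 22184 | 122 | 1.7 | 1.67 | 0.89 | 1.06 | 1.12 | 2.42 | 1.85 |
| CdSe | 41826 | 186 | 1.74 | 1.47 | 0.57 | 0.77 | 0.69 | 1.84 | 2.06 |
| CdGeP2 | 100467 | 122 | 1.8 | 1.33 | 0.64 | 0.9 | 0.7 | 1.85 | 1.65 |
| ZnIn2Se4 | 25647 | 82 | 1.82 | 1.83 | 0.98 | 1.12 | 1.13 | 2.44 | 2.22 |
| HgGa2Se4 | 188545 | 82 | 1.95 | 1.72 | 0.91 | 1.05 | 0.95 | 2.2 | 2.08 |
| GaSe | 63122 | 194 | 2.02 | 1.77 | 1.23 | 0.97 | 0.81 | 2.01 | 2.14 |
| CuAlTe2 | 28735 | 122 | 2.06 | 1.88 | 1.02 | 1.2 | 1.29 | 2.65 | 1.82 |
| BP | 615154 | 216 | 2.1 | 1.97 | 1.24 | 1.37 | 1.25 | 2.59 | 1.91 |
| AlAs | 185081 | 216 | 2.16 | 2.15 | 1.52 | 1.63 | 1.5 | 2.94 | 2.28 |
| ZnGeP2 | 16734 | 122 | 2.2 | 1.83 | 1.19 | 1.37 | 1.25 | 2.59 | 1.93 |
| CdSiP2 | 22188 | 122 | 2.2 | 1.99 | 1.43 | 1.4 | 1.41 | 2.81 | 2.02 |
| ZnGa2Se4 | 44887 | 82 | 2.2 | 2.34 | 1.39 | 1.54 | 1.55 | 3 | 2.84 |
| AgI | 164963 | 216 | 2.22 | 2.49 | 1.37 | 1.48 | 1.98 | 3.58 | 2.87 |
| GaP | 77087 | 216 | 2.24 | 2.43 | 1.59 | 1.75 | 1.64 | 3.13 | 2.37 |
| ZnTe | 185141 | 216 | 2.26 | 2.19 | 1.08 | 1.45 | 1.48 | 2.9 | 2.23 |
| ZnSiP2 | 22190 | 122 | 2.3 | 1.92 | 1.36 | 1.44 | 1.38 | 2.77 | 2.03 |
| β-SiC | 603798 | 216 | 2.3 | 2.26 | 1.39 | 1.59 | 1.37 | 2.76 | 2.31 |
| AgAlTe2 | 28746 | 122 | 2.35 | 1.84 | 1.05 | 1.2 | 1.36 | 2.75 | 1.99 |
| CdS | 154186 | 186 | 2.42 | 2.38 | 1.12 | 0.77 | 1.25 | 2.6 | 2.6 |
| CdGa2Se4 | 2287 | 82 | 2.43 | 2.22 | 1.35 | 1.45 | 1.41 | 2.82 | 2.72 |
| AlP | 24490 | 216 | 2.45 | 2.33 | 1.63 | 1.75 | 1.63 | 3.11 | 2.56 |
| AgAlSe2 | 28745 | 122 | 2.5 | 2.17 | 1.14 | 1.33 | 1.56 | 3.02 | 2.36 |
| ZnSe | 185134 | 216 | 2.58 | 2.54 | 1.17 | 1.5 | 1.7 | 3.2 | 2.63 |
| La2S3 | 15151 | 62 | 2.73 | 1.75 | 1.03 | 1.2 | 0.92 | 2.16 | - |
| AgI | 62789 | 186 | 2.63 | 2.52 | 1.4 | 1.55 | 1.98 | 3.61 | 2.85 |
| CuAlSe2 | 28734 | 122 | 2.65 | 2.08 | 0.9 | 1.07 | 1.24 | 2.58 | 2.51 |
| BeTe | 53945 | 216 | 2.8 | 2.69 | 2.02 | 2.15 | 2.02 | 3.63 | 2.82 |
| HgGa2S4 | 189737 | 82 | 2.84 | 2.57 | 1.57 | 0.72 | 1.62 | 3.1 | 2.83 |
| α-SiC | 164429 | 186 | 2.86 | 2.94 | 2.04 | 2.15 | 2.02 | 3.63 | 3.08 |
| CuBr | 23989 | 216 | 2.91 | 2.12 | 0.49 | 0.62 | 1.13 | 2.44 | 1.55 |
| CuI | 33724 | 216 | 2.95 | 2.56 | 1.14 | 1.29 | 1.65 | 3.14 | 2.2 |
| Al2Se3 | 14373 | 9 | 3.1 | 2.77 | 1.81 | 1.86 | 1.76 | 3.28 | 2.85 |
| CuCl | 23988 | 216 | 3.17 | 2.33 | 0.56 | 0.64 | 1.28 | 2.64 | 1.58 |
| CeO2 | 184584 | 225 | 3.2 | 3.36 | 1.87 | 2.1 | 2.42 | 4.17 | 2.01 |
| ZnO | 154486 | 186 | 3.2 | 2.66 | 0.73 | 1.04 | 1.82 | 3.37 | 2.47 |
| GaN | 153890 | 186 | 3.34 | 2.94 | 1.74 | 2.09 | 1.91 | 3.49 | 3.08 |
| CuAlS2 | 189083 | 122 | 3.35 | 3.04 | 1.69 | 1.88 | 2.05 | 3.67 | 2.49 |
| FeF2 | 14143 | 136 | 3.4 | 3.63 | 3.38 | 0.42 | 2.63 | 4.46 | 0 |
| ZnGa2S4 | 44886 | 82 | 3.4 | 3.31 | 2.27 | 2.41 | 2.44 | 4.2 | 3.86 |
| CdGa2S4 | 106362 | 82 | 3.44 | 3.12 | 2.12 | 1.45 | 2.21 | 3.89 | 3.71 |
| SnO2 | 154960 | 136 | 3.47 | 2.29 | 0.65 | 1.2 | 0 | 0 | 2.39 |
| ZnS | 77090 | 216 | 3.54 | 3.61 | 2.02 | 2.32 | 2.68 | 4.52 | 3.59 |
| BeSe | 616419 | 216 | 3.61 | 3.54 | 2.69 | 2.82 | 2.67 | 4.51 | 3.88 |
| MnO | 9864 | 225 | 3.9 | 2.85 | 1.72 | 0.42 | 2.62 | 4.44 | 0 |
| SrS | 651054 | 225 | 4.1 | 3.46 | 2.5 | 2.59 | 2.5 | 4.28 | 3.61 |
| BeS | 44724 | 216 | 4.17 | 4.1 | 3.15 | 3.23 | 3.14 | 5.15 | 4.38 |
| NiO | 43740 | 225 | 4.5 | 4.53 | 2.31 | 2.62 | 2.22 | 3.91 | 0.01 |
| RMSE | | | | 0.36 | 1.05 | 1.02 | 0.93 | 0.75 | 0.85 |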

The experimental data were adopted from the CRC Handbook[39], except for AgAlSe2[40], AgAlTe2[40], AgGaTe2[40], CdO[41], CdSnP2[42], CuAlS2[43], CuAlTe2[44], HgIn2Se4[45], and InN[46], for which more reliable measurements are cited, and CeO2[47] and La2S3[48], which represent the lanthanides.
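The error metric and the AFLOW correction used in Online-only Table 1 can be sketched as follows. The slope 1.348 and intercept 0.913 eV are inferred here from the tabulated AFLOW and AFLOW (fitted) columns and should be read as an assumption, as should the rule that zero gaps stay zero; the function names are hypothetical. The four sample rows (Si, GaAs, ZnO, NiO) are taken from the table, so the resulting RMSE describes this subset only, not the full 116-material benchmark.

```python
import math

def aflow_fitted(gap, slope=1.348, intercept=0.913):
    """Apply a linear correction to a semilocal band gap (eV).

    Gaps of exactly zero (metals) are left at zero, as in the table.
    """
    return 0.0 if gap == 0 else slope * gap + intercept

def rmse(calculated, reference):
    """Root-mean-square error between two equally long sequences."""
    if len(calculated) != len(reference):
        raise ValueError("sequences must have the same length")
    squared = [(c - r) ** 2 for c, r in zip(calculated, reference)]
    return math.sqrt(sum(squared) / len(squared))

# (experiment, present work) for Si, GaAs, ZnO, and NiO from the table.
experiment = [1.12, 1.35, 3.2, 4.5]
present_work = [1.19, 0.95, 2.66, 4.53]
subset_rmse = rmse(present_work, experiment)
```

Under these assumed coefficients, `aflow_fitted(0.28)` rounds to 1.29 and `aflow_fitted(3.14)` rounds to 5.15, matching the Bi2Te3 and BeS rows.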

In most cases, the present database provides Eg that agrees well with experiment. However, there are some materials with large errors of ≥0.5 eV, such as AgAlTe2, Cu3AsSe4, CuAlSe2, CuBr, CuCl, CuFeS2, CuO, Ge, IrSb3, La2S3, MnO, RhAs3, RhSb3, SnO2, SrS, and ZnO. For small-gap materials such as Cu3AsSe4, Ge, and IrSb3, Eg is sensitive to the lattice parameters, which are slightly overestimated by PBE. Employing experimental lattice parameters or those relaxed within HSE significantly improves the results[15]. For Cu-bearing materials, it is known that HSE often exhibits substantial errors in Eg due to nonlocal screening effects in Cu, which requires GW calculations[33,34]. We also note that van der Waals interactions are not described by semilocal functionals, and lattice parameters can be overestimated in layered structures such as transition-metal dichalcogenides[35]. This can significantly affect Eg, and so care is needed in referring to Eg in layered materials. The present results do not consider finite-temperature effects on Eg, which can be significant in some materials, for example, hybrid perovskites[36]. More generally, an Eg dataset with the ultimate theoretical accuracy would be obtained by quasiparticle approaches such as GW or the Bethe-Salpeter equation[37,38].
Measurement(s): band gap • semiconducting inorganic material
Technology Type(s): computational modeling technique
Sample Characteristic - Environment: material entity