Gaussian basis set

To solve the Hartree-Fock or Kohn-Sham DFT equations, the molecular orbitals are expanded as linear combinations of single-electron basis functions.

\[\varphi_{i}(r) = C_{1,i}\chi_{1}(r) + C_{2,i}\chi_{2}(r) + C_{3,i}\chi_{3}(r) + \dots + C_{N,i}\chi_{N}(r)\]
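
As a minimal illustration of this expansion, the sketch below evaluates one molecular orbital from two normalized s-type Gaussian basis functions. The centers, exponent, and coefficients are made-up values chosen only for illustration (they come from no real basis set), and the function names are hypothetical.

```python
import math

def s_gaussian(alpha, center, point):
    """Normalized s-type Gaussian: (2*alpha/pi)^(3/4) * exp(-alpha*|r-R|^2)."""
    r2 = sum((p - c) ** 2 for p, c in zip(point, center))
    norm = (2.0 * alpha / math.pi) ** 0.75
    return norm * math.exp(-alpha * r2)

def molecular_orbital(coeffs, basis, point):
    """phi_i(r) = sum over mu of C_{mu,i} * chi_mu(r)."""
    return sum(c * chi(point) for c, chi in zip(coeffs, basis))

# Two basis functions, one on each atom of a hypothetical diatomic (bohr).
basis = [
    lambda r: s_gaussian(0.5, (0.0, 0.0, -0.7), r),
    lambda r: s_gaussian(0.5, (0.0, 0.0, +0.7), r),
]

# Illustrative (not normalized) coefficient vectors C_{mu,i}.
bonding = [0.6, 0.6]
antibonding = [0.8, -0.8]

midpoint = (0.0, 0.0, 0.0)
phi_bonding_mid = molecular_orbital(bonding, basis, midpoint)
phi_antibonding_mid = molecular_orbital(antibonding, basis, midpoint)
```

The in-phase combination is positive at the bond midpoint, while the out-of-phase combination has a node there; in a real calculation the coefficients come from solving the SCF equations rather than being chosen by hand.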

In quantum chemistry calculations, the basis functions are purely mathematical constructs without direct physical meaning. In general, the more basis functions are used, the more accurate the result, although the accuracy also depends on how well the basis functions are designed. With infinitely many basis functions forming a complete set, the molecular orbitals can be expanded exactly; this is the complete basis set (CBS) limit. A finite basis set used in practice does not reach the CBS limit, and the resulting error in the calculated result is called the basis set incompleteness error.

A calculation produces as many molecular orbitals as there are basis functions, but usually only the occupied orbitals and the low-lying unoccupied orbitals (valence-level virtual orbitals) are chemically meaningful. If the basis functions are taken to be atomic orbitals, the expansion is called a linear combination of atomic orbitals (LCAO); however, this is only a concept from structural chemistry, and the basis functions used in actual calculations are not true atomic orbitals.

The commonly used basis functions in quantum chemistry are as follows.

Gaussian type orbital (GTO) basis functions: used by most quantum chemistry programs, because two-electron integrals over GTOs are mathematically easy to evaluate.
Slater type orbital (STO) basis functions: used by semi-empirical methods and by a few quantum chemistry programs (e.g. ADF). Two-electron integrals over STOs are difficult to evaluate, but their radial behavior is closer to that of real atomic orbitals, so a small number of STOs can match the accuracy of a larger number of GTOs.
Plane waves: basis functions suited specifically to periodic calculations; for isolated systems they are much less cost-effective than GTO basis functions.
Numerical atomic orbital (NAO) basis functions: supported by few programs, typically DMol3 and SIESTA. NAO basis functions have no analytic mathematical form and are instead tabulated on a discrete set of points.

Early versions of the BDF software used STO basis functions; GTO basis functions are mainly used now.

For GTO basis functions with orbital angular momentum L higher than p (e.g., d, f, etc.), there are two ways to represent them. One is the Cartesian form:

\[N x^{l_x} y^{l_y} z^{l_z} \exp(-\alpha r^2), \qquad L = l_x + l_y + l_z\]

It has $(L+1)(L+2)/2$ components; e.g., the d function contains xx, yy, zz, xy, xz, yz. The other is the spherical form (also called a spherical harmonic function or pure function):

\[N Y^L_m r^L {\rm exp}(-\alpha r^2)\]

It has $2L+1$ components; for example, the d function contains m = -2, -1, 0, +1, +2.

The advantage of the Cartesian functions is that integrals over them are easy to calculate, but they include redundant functions, whereas the spherical functions correspond exactly to the $2L+1$ magnetic quantum numbers. Quantum chemistry programs therefore usually calculate the integrals over Cartesian functions first and then combine them into integrals over spherical functions by a linear transformation [53].
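
The component counts quoted above can be checked with a short sketch. It enumerates the Cartesian exponent triples for a given L and compares their number with the number of spherical components; the helper names are hypothetical and the component ordering is arbitrary, not necessarily the one any particular program uses.

```python
def n_cartesian(L):
    """Number of Cartesian components: (L+1)(L+2)/2."""
    return (L + 1) * (L + 2) // 2

def n_spherical(L):
    """Number of spherical (pure) components: 2L+1."""
    return 2 * L + 1

def cartesian_components(L):
    """All exponent triples (lx, ly, lz) with lx + ly + lz = L."""
    return [(lx, ly, L - lx - ly)
            for lx in range(L, -1, -1)
            for ly in range(L - lx, -1, -1)]

def label(triple):
    """Text label such as 'xy' for (1, 1, 0); the s function gets 's'."""
    lx, ly, lz = triple
    text = "x" * lx + "y" * ly + "z" * lz
    return text if text else "s"

d_labels = [label(t) for t in cartesian_components(2)]

for L, name in enumerate("spdfg"):
    redundant = n_cartesian(L) - n_spherical(L)
    print(f"{name}: {n_cartesian(L)} Cartesian, {n_spherical(L)} spherical, "
          f"{redundant} redundant")
```

For s and p functions the two forms have the same number of components; the difference first appears at d, where the six Cartesian functions contain one combination (xx + yy + zz) that behaves like an s function.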

Attention

Most modern Gaussian basis sets are optimized with spherical basis functions, except for older basis sets such as the Pople type.
Cartesian basis functions offer no advantage in accuracy or efficiency, and in all-electron relativistic calculations in particular they can also cause numerical instability, so BDF calculations always use spherical basis functions.
Cartesian and spherical basis functions lead to different results. When reproducing BDF results with other quantum chemistry programs, check that spherical basis functions are used, in addition to making sure that the structure, method, and basis set are the same.

In the literature, data sets of GTO basis functions optimized for various atoms and purposes have been published under different names, by which quantum chemistry programs refer to them. These named GTO data sets are called Gaussian basis sets. The Gaussian basis sets built into BDF come mainly from the following basis set repositories, where the original literature on each basis set can also be found.

Basis Set Exchange [54]: all-electron basis sets and scalar ECP basis sets, which can be exported in BDF format (note: for ECP basis sets, the ECP data have to be repositioned manually). https://www.basissetexchange.org/
Stuttgart/Cologne pseudopotential basis set library: mainly SOECP basis sets, and a few early scalar ECP basis sets. http://www.tc.uni-koeln.de/PP/clickpse.en.html
Turbomole basis set library: all-electron basis sets, scalar ECP basis sets, and SOECP basis sets. http://www.cosmologic-services.de/basis-sets/basissets.php
Dyall Relativistic Basis Set: All-electron relativistic basis set. http://dirac.chem.sdu.dk/basisarchives/dyall/index.html
Sapporo basis set library: all-electron basis sets. http://sapporo.center.ims.ac.jp/sapporo/
Clarkson University ECP basis set library: SOECP basis set. https://people.clarkson.edu/~pchristi/reps.html
ccECP Basis Set Library: Scalar ECP Basis Sets. https://pseudopotentiallibrary.org/

In addition, the basis sets of some individual elements are built in directly from the original literature.

All-electron basis sets Dirac-RPF-4Z and Dirac-aug-RPF-4Z, covering s- and p-block elements [55], d-block elements [56], and f-block elements [57]
Pseudopotential basis sets Pitzer-AVDZ-PP, Pitzer-VDZ-PP, Pitzer-VTZ-PP [58]
Ce - Lu [59], Fr - Pu [60], Am - Og [61, 62] in the pseudopotential basis set CRENBL (note: the Am - Og basis set on the Basis Set Exchange is wrong!)
Am - Og [61, 62] in the pseudopotential basis set CRENBS (note: the Am - Og basis set on the Basis Set Exchange is wrong!)
Ac, Th, Pa [63], U [64] in the pseudopotential basis set Stuttgart-ECPMDFSO-QZVP

BDF users can use either the standard basis sets from the BDF basis set library or custom basis sets.

All-electron basis sets

All-electron basis sets are divided into two categories: uncontracted basis sets and contracted basis sets. The former can be used for both non-relativistic and relativistic calculations, but are used mainly for relativistic calculations, while the latter are further divided into non-relativistic contracted basis sets and relativistic contracted basis sets.
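
The two categories can be stated compactly. In the usual notation (the symbols $d_{k\mu}$, $g_k$, $\alpha_k$, and $K$ are introduced here only for illustration), a contracted basis function is a fixed linear combination of primitive Gaussians that share the same center and angular part:

\[\chi_{\mu}(r) = \sum_{k=1}^{K} d_{k\mu}\, g_{k}(r)\]

Here $g_{k}$ is a primitive GTO with exponent $\alpha_{k}$, and the contraction coefficients $d_{k\mu}$ are part of the basis set definition and are not varied during the calculation. In an uncontracted basis set every primitive $g_{k}$ serves as a basis function by itself.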

All-electron relativistic calculations use Hamiltonians that account for relativistic effects, such as DKH, ZORA, and X2C (see Relativistic effects). They require contracted basis sets optimized specifically for relativistic calculations, such as the cc-pVnZ-DK series, SARC, and ANO-RCC. Most relativistic contracted basis sets treat the nucleus as a point charge, but some take the finite size of the nuclear charge distribution into account when the contraction is generated, which affects the contraction coefficients of the s and p basis functions most strongly. For such basis sets, a finite nucleus model must also be used when the molecular integrals are calculated.

Pseudopotential basis sets

The effective core potential (ECP) includes the pseudopotential (PP) and the model core potential (MCP). The PP used in quantum chemical calculations is not fundamentally different from the PP used in plane wave calculations, except that it is expressed in a concise analytic form. Most quantum chemistry programs, including BDF, support PP, but few support MCP, so the names ECP and PP are commonly used interchangeably without ambiguity.

A pseudopotential basis set must be used together with its pseudopotential, and its basis functions describe only the valence electrons of the atom. When a system contains heavier elements, pseudopotential basis sets are usually used for them, while normal (all-electron) basis sets are used for the other atoms. The LanL series, the Stuttgart series, and the cc-pVnZ-PP series all belong to this category. For convenience, the pseudopotential basis sets of some lighter elements are actually non-relativistic all-electron basis sets under the same name, such as the Def2 series for elements before the fifth period.

The pseudopotential basis sets are divided into scalar pseudopotential basis sets and spin-orbit coupled pseudopotential (SOECP) basis sets, depending on whether the pseudopotential contains a spin-orbit coupling term or not.

Custom basis set files

BDF can also use basis sets that are not built in. The basis set data are saved in a plain-text basis set file placed in the calculation directory, and the file name is the basis set name by which the file is referenced in the BDF input.

Warning

The file name of a custom basis set file must be in all capital letters! When the basis set is referenced in the input file, however, the case is arbitrary.
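
The naming rule can be checked before a job is submitted. The following is a small sketch, assuming only what is stated here (the file sits in the calculation directory and its name is the upper-case basis set name); the function names are hypothetical and are not part of BDF.

```python
import tempfile
from pathlib import Path

def expected_file_name(basis_name):
    """File name implied by the rule above: the basis set name in upper case."""
    return basis_name.upper()

def diagnose(basis_name, directory):
    """Return a list of problems; an empty list means the file was found."""
    expected = expected_file_name(basis_name)
    names = [p.name for p in Path(directory).iterdir() if p.is_file()]
    if expected in names:
        return []
    problems = [f"{expected} not found"]
    for name in names:
        if name.upper() == expected:
            # Same name, but not written in capital letters.
            problems.append(f"{name}: rename to all upper case")
        elif name.upper().startswith(expected + "."):
            # Typical case: an editor silently appended an extension.
            problems.append(f"{name}: remove the extension")
    return problems

# Demonstration in a temporary directory standing in for the calculation directory.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "MYBAS-1.txt").write_text("# placeholder\n")
    before = diagnose("mybas-1", tmp)
    (Path(tmp) / "MYBAS-1.txt").rename(Path(tmp) / "MYBAS-1")
    after = diagnose("mybas-1", tmp)

print(before)
print(after)
```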

For example, create a text file MYBAS-1 in the calculation directory (note: a text file created under Windows may carry a hidden .txt extension, so that its actual name is MYBAS-1.txt) with the following contents:

# This is my basis set No. 1.               # blank lines and comment lines starting with # are allowed
# Supported elements: He and Al

****                                        # a line starting with four asterisks, followed by the basis set of one element
He      2    1                              # element symbol, nuclear charge number, highest angular momentum of the basis functions
S      4    2                               # S-type GTO basis functions: 4 primitive functions contracted to 2
               3.836000E+01                 # exponents of the four S-type Gaussian primitive functions
               5.770000E+00
               1.240000E+00
               2.976000E-01
      2.380900E-02           0.000000E+00   # two columns of contraction coefficients, one for each of the two contracted S-type GTO basis functions
      1.548910E-01           0.000000E+00
      4.699870E-01           0.000000E+00
      5.130270E-01           1.000000E+00
P      2    2                               # P-type GTO basis functions: 2 primitive functions contracted to 2
               1.275000E+00
               4.000000E-01
      1.0000000E+00           0.000000E+00
      0.0000000E+00           1.000000E+00
****                       # four asterisks end the basis set of He; the basis set of another element follows, or the file ends
Al     13    2
(omitted)

In the above basis set, the P function is not contracted and can also be written in the following form.

(S functions, omitted)
P      2    0              # 0 means uncontracted; no contraction coefficients are needed in this case
               1.275000E+00
               4.000000E-01
****
(omitted)
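
To make the layout concrete, here is a minimal reader for the valence part of such a file. It is a sketch based only on the format described above, not BDF's own parser: it assumes that everything after a # is a comment, that a shell with 0 contracted functions means "use each primitive as is", and that any ECP section is simply skipped. The function name and the returned dictionary layout are hypothetical.

```python
def parse_basis(text):
    """Parse the valence part of a basis set file laid out as described above."""
    # Drop comments (everything after '#') and blank lines.
    lines = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if line:
            lines.append(line)

    basis = {}
    i = 0
    while i < len(lines):
        if lines[i].startswith("****"):      # separator between elements
            i += 1
            continue
        symbol, charge, lmax = lines[i].split()
        i += 1
        shells = []
        while i < len(lines) and not lines[i].startswith("****"):
            if lines[i].upper() == "ECP":    # skip ECP data in this sketch
                while i < len(lines) and not lines[i].startswith("****"):
                    i += 1
                break
            kind, nprim, ncontr = lines[i].split()
            nprim, ncontr = int(nprim), int(ncontr)
            i += 1
            exponents = [float(lines[i + k]) for k in range(nprim)]
            i += nprim
            if ncontr == 0:
                # Uncontracted shell: equivalent to an identity coefficient matrix.
                coeffs = [[1.0 if r == c else 0.0 for c in range(nprim)]
                          for r in range(nprim)]
            else:
                # One row per primitive, one column per contracted function.
                coeffs = [[float(x) for x in lines[i + k].split()]
                          for k in range(nprim)]
                i += nprim
            shells.append({"type": kind.upper(),
                           "exponents": exponents,
                           "coefficients": coeffs})
        basis[symbol] = {"charge": int(charge), "lmax": int(lmax),
                         "shells": shells}
    return basis

# He data copied from the example above, with the P shell in uncontracted form.
SAMPLE = """\
# sample file
****
He      2    1
S      4    2
               3.836000E+01
               5.770000E+00
               1.240000E+00
               2.976000E-01
      2.380900E-02           0.000000E+00
      1.548910E-01           0.000000E+00
      4.699870E-01           0.000000E+00
      5.130270E-01           1.000000E+00
P      2    0
               1.275000E+00
               4.000000E-01
****
"""

he = parse_basis(SAMPLE)["He"]
```

Reading the P shell written with `2 0` gives the same 2x2 identity coefficient matrix as the explicit `2 2` form, which is why the two notations are interchangeable.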

For pseudopotential basis sets, it is also necessary to provide ECP data after the valence basis function. For example

****                                              # valence basis functions; the format is the same as above
Al     13    2
S       4    3
           14.68000000
            0.86780000
            0.19280000
            0.06716000
    -0.0022368000     0.0000000000     0.0000000000
    -0.2615913000     0.0000000000     0.0000000000
     0.6106597000     0.0000000000     1.0000000000
     0.5651997000     1.0000000000     0.0000000000
P       4    2
            6.00100000
            1.99200000
            0.19480000
            0.05655000
    -0.0034030000     0.0000000000
    -0.0192089000     0.0000000000
     0.4925534000    -0.2130858000
     0.6144261000     1.0000000000
D       1    1
            0.19330000
     1.0000000000
ECP                     # ECP data section
Al    10    2    2      # element symbol, number of core electrons, ECP maximum angular momentum, SOECP maximum angular momentum (optional)
D potential  4                                    # potential for the ECP maximum angular momentum (D), number of terms
   2      1.22110000000000     -0.53798100000000  # power of r, exponent, coefficient (the same below)
   2      3.36810000000000     -5.45975600000000
   2      9.75000000000000    -16.65534300000000
   1     29.26930000000000     -6.47521500000000
S potential  5                                    # S projector, number of terms
   2      1.56310000000000    -56.20521300000000
   2      1.77120000000000    149.68995500000000
   2      2.06230000000000    -91.45439399999999
   1      3.35830000000000      3.72894900000000
   0      2.13000000000000      3.03799400000000
P potential  5                                    # P projector, number of terms
   2      1.82310000000000     93.67560600000000
   2      2.12490000000000   -189.88896800000001
   2      2.57050000000000    110.24810400000000
   1      1.75750000000000      4.19959600000000
   0      6.76930000000000      5.00335600000000
P so-potential  5                                 # P spin-orbit projector, number of terms. Scalar ECP does not have this part
   2      1.82310000000000      1.51243200000000  # Scalar ECP does not have this part
   2      2.12490000000000     -2.94701800000000  # Scalar ECP does not have this part
   2      2.57050000000000      1.64525200000000  # Scalar ECP does not have this part
   1      1.75750000000000     -0.08862800000000  # Scalar ECP does not have this part
   0      6.76930000000000      0.00681600000000  # Scalar ECP does not have this part
D so-potential  4                                 # D spin-orbit projector, number of terms. Scalar ECP does not have this part
   2      1.22110000000000     -0.00138900000000  # Scalar ECP does not have this part
   2      3.36810000000000      0.00213300000000  # Scalar ECP does not have this part
   2      9.75000000000000      0.00397700000000  # Scalar ECP does not have this part
   1     29.26930000000000      0.03253000000000  # Scalar ECP does not have this part
****

For a scalar ECP, the SOECP maximum angular momentum is 0 (and may be omitted), and no data for the SO projectors need to be provided.
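
The ECP section can be read in the same spirit. The sketch below is inferred only from the example above and is not BDF's own reader: it assumes a line `ECP`, then a header line, then blocks of the form `<letter> potential <n>` or `<letter> so-potential <n>`, each followed by n rows of (power of r, exponent, coefficient). The function name and the returned layout are hypothetical, and the sample is truncated to the D terms of the Al data for brevity.

```python
def parse_ecp(text):
    """Read one ECP section laid out as in the example above."""
    lines = [raw.split("#", 1)[0].strip() for raw in text.splitlines()]
    lines = [line for line in lines if line]
    start = next(i for i, line in enumerate(lines) if line.upper() == "ECP")
    head = lines[start + 1].split()
    ecp = {
        "symbol": head[0],
        "ncore": int(head[1]),
        "lmax": int(head[2]),
        # The fourth field is optional; a scalar ECP has SO maximum 0.
        "so_lmax": int(head[3]) if len(head) > 3 else 0,
        "potentials": {},
        "so_potentials": {},
    }
    i = start + 2
    while i < len(lines) and not lines[i].startswith("****"):
        letter, kind, nterms = lines[i].split()
        nterms = int(nterms)
        terms = []
        for row in lines[i + 1 : i + 1 + nterms]:
            power, exponent, coeff = row.split()
            terms.append((int(power), float(exponent), float(coeff)))
        key = "so_potentials" if kind.lower() == "so-potential" else "potentials"
        ecp[key][letter.upper()] = terms
        i += 1 + nterms
    return ecp

# D terms of the Al example above; the S and P blocks are left out for brevity.
SAMPLE = """\
ECP
Al    10    2    2
D potential  4
   2      1.22110000000000     -0.53798100000000
   2      3.36810000000000     -5.45975600000000
   2      9.75000000000000    -16.65534300000000
   1     29.26930000000000     -6.47521500000000
D so-potential  4
   2      1.22110000000000     -0.00138900000000
   2      3.36810000000000      0.00213300000000
   2      9.75000000000000      0.00397700000000
   1     29.26930000000000      0.03253000000000
****
"""

ecp = parse_ecp(SAMPLE)
valence_electrons = 13 - ecp["ncore"]   # Al: nuclear charge 13, 10 core electrons
```

For a scalar ECP the header would carry only three fields and no so-potential blocks, in which case `so_lmax` is 0 and `so_potentials` stays empty.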

Once the above data are saved, the MYBAS-1 basis set can be used in a BDF input file through the following mixed input mode.

#!bdfbasis.sh
HF/genbas

Geometry
 .....
End geometry

$Compass
Basis
   mybas-1         # name of the basis set file in the current directory; it is not case-sensitive here
$End

A custom basis set must be entered in BDF's mixed input mode. In the second line the basis set is set to genbas, and the custom basis set file is given by the keyword Basis in the COMPASS module; the value mybas-1 means that the basis set file named MYBAS-1 is used.

Basis set designation

Use the same BDF built-in basis set for all atoms

In simple input mode, the basis set is specified as method/functional/basis set or method/basis set. Here, the basis set is one of the BDF built-in basis set names listed in the previous sections, and the name is case-insensitive, as in the following examples.

#! basisexample.sh
TDDFT/PBE0/3-21g

Geometry
H   0.000   0.000    0.000
Cl  0.000   0.000    1.400
End geometry

#! basisexample.sh
HF/lanl2dz

Geometry
H   0.000   0.000    0.000
Cl  0.000   0.000    1.400
End geometry

In advanced input mode, the basis set used for the calculation is specified in the compass module with the keyword basis, for example

$compass
Basis
 lanl2dz
Geometry
  H   0.000   0.000    0.000
  Cl  0.000   0.000    1.400
End geometry
$end

where lanl2dz selects the built-in LanL2DZ basis set (registered in the basisname file); the name is case-insensitive.

Specifying different basis sets for different elements

This requires the mixed input mode: set the basis set to genbas in method/functional/basis set, and add COMPASS module input that specifies the basis sets with the basis-block … end basis keyword.

Inside the basis-block … end basis block, the first line gives the default basis set, and each subsequent line assigns another basis set to one or more elements, in the format element = basis set name or element1, element2, …, elementn = basis set name.

For example, different basis sets are used for different elements in mixed input mode as follows.

#! multibasis.sh
HF/genbas

Geometry
H   0.000   0.000    0.000
Cl  0.000   0.000    1.400
End geometry

$compass
Basis-block
 lanl2dz
 H = 3-21g
End Basis
$end

In the above example, the 3-21G basis set is used for H, while the default LanL2DZ basis set is used for Cl which is not additionally defined.
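
The assignment rule of the basis-block (a default first line, then element = basis set name overrides) can be mimicked with a few lines of code. This is only an illustrative sketch of the rule as described here, not of how BDF reads its input; the function names are hypothetical, and treating element symbols as case-insensitive is an assumption.

```python
def parse_basis_block(block_lines):
    """Return (default_basis, overrides) from the lines inside a basis-block."""
    lines = [raw.split("#", 1)[0].strip() for raw in block_lines]
    lines = [line for line in lines if line]
    default = lines[0]                      # first line: default basis set
    overrides = {}
    for line in lines[1:]:
        labels, name = line.split("=", 1)   # "H = 3-21g" or "C, N, O = cc-pvdz"
        for element in labels.split(","):
            overrides[element.strip().upper()] = name.strip()
    return default, overrides

def basis_for(element, default, overrides):
    """Basis set used for one element: its override if present, else the default."""
    return overrides.get(element.upper(), default)

default, overrides = parse_basis_block([" lanl2dz", " H = 3-21g"])
assignment = {el: basis_for(el, default, overrides) for el in ("H", "Cl")}
print(assignment)
```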

In case of advanced input, the following is used.

$compass
Basis-block
 lanl2dz
 H = 3-21g
End Basis
Geometry
  H   0.000   0.000    0.000
  Cl  0.000   0.000    1.400
End geometry
$end

Assigning different basis sets to different atoms of the same element

BDF can also assign different basis sets to different atoms of the same element. Such atoms are distinguished by appending an arbitrary number to the element symbol. For example

#! CH4.sh
RKS/B3lyp/genbas

Geometry
  C       0.000   -0.000    0.000
  H1     -0.000   -1.009   -0.357
  H2     -0.874    0.504   -0.457
  H1      0.874    0.504   -0.357
  H2      0.000    0.000    1.200
End geometry

$compass
Basis-block
 6-31g
 H1= cc-pvdz
 H2= 3-21g
End basis
$end

In the above example, the cc-pVDZ basis set is used for the two hydrogen atoms labeled H1, the 3-21G basis set for the two hydrogen atoms labeled H2, and the 6-31G basis set for the carbon atom. Note that symmetry-equivalent atoms must use the same basis set, which the program checks; if symmetry-equivalent atoms have to use different basis sets, lower the point group symmetry with Group or turn symmetry off with Nosymm.

Auxiliary basis sets

Methods that use the density fitting approximation (RI) require an auxiliary basis set. The Ahlrichs family of basis sets, the Dunning correlation-consistent basis sets, and some other individual basis sets have specially optimized auxiliary basis sets. In BDF, auxiliary basis sets are specified by the RI-J, RI-K, and RI-C keywords in compass: RI-J assigns the Coulomb fitting basis set, RI-K the Coulomb-exchange fitting basis set, and RI-C the Coulomb-correlation fitting basis set. The auxiliary basis sets supported by BDF are stored in the corresponding folders under the $BDFHOME/basis_library path.

High-level density fitting basis sets can be used with lower-level orbital basis sets, e.g. cc-pVTZ/C can be used to do RI-J on cc-pVTZ, and for Pople-type basis sets such as 6-31G** that have no standard auxiliary basis set, cc-pVTZ/J can be used for RI-J or RIJCOSX. Conversely, combining a high-level orbital basis set with a low-level auxiliary basis set introduces a more significant error.

$Compass
Basis
  DEF2-SVP
RI-J
  DEF2-SVP
Geometry
  C          1.08411       -0.01146        0.05286
  H          2.17631       -0.01146        0.05286
  H          0.72005       -0.93609        0.50609
  H          0.72004        0.05834       -0.97451
  H          0.72004        0.84336        0.62699
End Geometry
$End

In the above example, the methane molecule $\ce{CH4}$ is calculated with the def2-SVP basis set, and the standard def2-SVP Coulomb fitting basis set is used to accelerate the calculation.

Hint

The RI functionality of BDF is intended to accelerate wave function methods such as MCSCF and MP2. It is not recommended for SCF, TDDFT, and similar calculations. The MPEC method does not depend on redundant (auxiliary) functions and is comparable to the RI method in both computational speed and accuracy.