The text for this page is taken mainly from my PhD thesis, chapter 2, and my paper in Int. J. Quant. Chem. (2006).

The energy profile for a molecule can be visualized as in the first figure, where the energy is given as a function of some intrinsic coordinate of the molecule. This intrinsic coordinate consists of a combination of all relevant coordinates of the molecule, but can be visualized most easily for a diatomic molecule: in that case, the intrinsic coordinate is simply the distance between the two atoms.

The equilibrium geometry, indicated in the first figure by the star, is found where the energy is lowest. In a geometry optimization, one aims to find this equilibrium geometry. One could use energy values only, but as this requires many energy evaluations, it is useful only for methods in which evaluating the energy is cheap (as is the case for semi-empirical or molecular mechanics methods). A more efficient approach is to also use the gradient \(g\), a vector containing the first derivatives of the energy \(U\) with respect to the Cartesian coordinates of the atoms \(r_{i}\):

$$g_{i} = \frac{\partial U}{\partial r_{i}}$$

In the second figure, the gradient belonging to the energy profile of the first figure is sketched as a function of the intrinsic coordinate. At the equilibrium geometry, it is zero.
On both sides of the equilibrium, a force pushes the molecule back towards equilibrium. This can be exploited in geometry optimizations: one has to find the geometry where the gradient is zero.
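As a minimal check of the gradient definition above, one can compare an analytic derivative with a central finite difference. The harmonic model potential, force constant, and distances below are hypothetical examples, not values from the text:

```python
# Sketch: gradient of a model diatomic potential, checked against a
# central finite difference. The harmonic form U(r) = 1/2 k (r - r0)^2
# and the parameters k, r0 are illustrative assumptions.

def energy(r, k=1.0, r0=1.4):
    """Harmonic model potential U(r) = 1/2 k (r - r0)^2."""
    return 0.5 * k * (r - r0) ** 2

def gradient(r, k=1.0, r0=1.4):
    """Analytic derivative dU/dr = k (r - r0); zero at equilibrium r0."""
    return k * (r - r0)

def fd_gradient(f, r, h=1e-6):
    """Central finite-difference approximation of dU/dr."""
    return (f(r + h) - f(r - h)) / (2.0 * h)

r = 1.7
print(gradient(r))             # analytic gradient
print(fd_gradient(energy, r))  # numerical gradient, agrees closely
```

At the equilibrium distance `r0` the gradient vanishes, which is exactly the condition a geometry optimization searches for.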

The simplest algorithm takes the negative gradient as the step in the geometry optimization (steepest descent). This requires many optimization steps, as convergence is usually slow. It can be improved upon by the conjugate gradient method, which constructs each step as a mixture of the current negative gradient and the previous search direction, thereby making it conjugate to the previous step. This improves the convergence to some extent, but it remains slow.
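The two gradient-based schemes above can be sketched on a model quadratic surface. The matrix `A`, the fixed step length, and the iteration counts below are illustrative choices for a minimal sketch, not the implementation described in the text:

```python
import numpy as np

# Sketch: steepest descent vs. Fletcher-Reeves conjugate gradient on a
# 2-D quadratic surface U(x) = 1/2 x^T A x with minimum at the origin.

A = np.array([[3.0, 0.0],
              [0.0, 1.0]])             # model Hessian of the quadratic

def grad(x):
    return A @ x

def steepest_descent(x, alpha=0.2, steps=50):
    for _ in range(steps):
        x = x - alpha * grad(x)        # step along the negative gradient
    return x

def conjugate_gradient(x, steps=2):
    # For an n-dimensional quadratic, n steps suffice.
    g = grad(x)
    d = -g                             # first search direction
    for _ in range(steps):
        alpha = (g @ g) / (d @ A @ d)  # exact line search on a quadratic
        x = x + alpha * d
        g_new = grad(x)
        beta = (g_new @ g_new) / (g @ g)  # Fletcher-Reeves mixing factor
        d = -g_new + beta * d          # mix gradient with old direction
        g = g_new
    return x

x0 = np.array([1.0, 1.0])
print(steepest_descent(x0))    # slowly approaches the minimum
print(conjugate_gradient(x0))  # converges in 2 steps on this surface
```

Even on this well-conditioned toy surface, steepest descent needs dozens of steps to get close to the minimum, while conjugate gradient reaches it in as many steps as there are dimensions.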

A further improvement of optimization techniques can be obtained by including the Hessian \(H\), the matrix of second derivatives of the energy \(U\) with respect to the atomic coordinates \(r_{i}\): $$H_{ij} = \frac{\partial^2 U}{\partial r_{i} \partial r_{j}}$$ One could compute this matrix after every geometry optimization step and use it to construct the next step, but as the calculation of the Hessian matrix may be very time-consuming, the use of approximate Hessians may be worthwhile. These approximate Hessians are not recalculated after every geometry optimization cycle, but simply updated based on the change in the gradient. One of the most popular and powerful update schemes is the Broyden-Fletcher-Goldfarb-Shanno (BFGS) scheme (see, e.g., Jensen). In this scheme, the Hessian is updated from \(H_{0}\) to \(H_{+}\) using the step vector \(s\) and the gradient difference \(\Delta g\):
$$H_{+} = H_{0} + \frac{\Delta g \otimes \Delta g}{\Delta g \cdot s} - \frac{(H_{0} \cdot s) \otimes (H_{0} \cdot s)}{s \cdot H_{0} \cdot s}$$
In this equation, the tensor product has been used:
$$(a \otimes b)_{ij} = a_{i} b_{j}$$
The BFGS scheme has, under certain weak conditions on the step vector \(s\) (in particular \(\Delta g \cdot s > 0\)), the property of preserving positive definiteness. This property is useful as it ensures that the quadratic model has a minimum, or, put otherwise, that the eigenvalues of the Hessian are all positive.
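The BFGS update can be written down directly from the formula above; `np.outer` implements the tensor product \((a \otimes b)_{ij} = a_i b_j\). The quadratic test surface `A`, initial Hessian guess, and step below are illustrative assumptions:

```python
import numpy as np

# Sketch of the BFGS update from the text:
#   H+ = H0 + (dg ⊗ dg)/(dg·s) - (H0·s) ⊗ (H0·s)/(s·H0·s)

def bfgs_update(H0, s, dg):
    """Update an approximate Hessian H0 given step s and gradient change dg."""
    Hs = H0 @ s
    return (H0
            + np.outer(dg, dg) / (dg @ s)
            - np.outer(Hs, Hs) / (s @ Hs))

# On a quadratic U = 1/2 x^T A x the gradient change is exactly dg = A·s;
# the matrix A and step s here are illustrative values.
A = np.array([[2.0, 0.5],
              [0.5, 1.0]])
H = np.eye(2)              # crude initial guess for the Hessian
s = np.array([0.3, -0.1])  # an optimization step
dg = A @ s                 # corresponding gradient difference
H = bfgs_update(H, s, dg)

print(np.linalg.eigvalsh(H))  # eigenvalues remain positive
print(H @ s, dg)              # secant condition: H+ · s equals dg
```

Two properties are visible in the output: the updated Hessian reproduces the observed gradient change along the step (the secant condition \(H_{+} \cdot s = \Delta g\)), and, since \(\Delta g \cdot s > 0\) here, it stays positive definite.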


Optimization techniques rely on a Taylor expansion of the energy \(E\) about the atomic coordinates \(X\):

$$E(X) = E(X_{0}) + g^{T} \Delta X + \frac{1}{2} \Delta X^{T} H \Delta X + \ldots$$

with \(\Delta X = X - X_{0}\).

Only close to the minimum is the energy surface actually quadratic, and only there can the Taylor expansion up to second order be trusted to be valid; this region is called the trust region, with radius \(\tau\). If the quasi-Newton (QN) or Newton–Raphson (NR) step is smaller than \(\tau\), the QN/NR step is taken; otherwise, the restricted second order (RSO) step is taken, which restricts the step to lie within the trust region.
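The trust-region logic can be sketched as follows. Simply rescaling the step back to the trust radius is a simplification of the level-shifted RSO step; the model Hessian, gradient, and radius \(\tau\) below are illustrative values:

```python
import numpy as np

# Sketch of trust-region logic around a quasi-Newton step s = -H^{-1} g.
# Scaling the step to length tau is a simplified stand-in for the
# restricted-second-order (level-shifted) step.

def trusted_step(H, g, tau=0.3):
    s = -np.linalg.solve(H, g)   # quasi-Newton / Newton-Raphson step
    norm = np.linalg.norm(s)
    if norm <= tau:
        return s                 # inside the trust region: take it as-is
    return s * (tau / norm)      # outside: restrict to radius tau

H = np.array([[2.0, 0.0],
              [0.0, 1.0]])       # model (approximate) Hessian
g = np.array([1.0, 1.0])         # model gradient
print(trusted_step(H, g))        # step restricted to length <= tau
```

Far from the minimum the gradient is large, the full QN/NR step overshoots the quadratic region, and the restriction kicks in; near the minimum the full step is accepted unchanged.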

Although at every individual point the QN/NR step is the best local option, the geometry optimization as a whole is enhanced by using GDIIS (geometry optimization by direct inversion in the iterative subspace), which constructs the new geometry as a linear combination of previously visited geometries.

We have implemented in QUILD the option to use either the step, the gradient, or the “energy” vector (e.g., \(B_{ij} = g_{i}^{T} \cdot H_{k}^{-1} \cdot g_{j}\)) as the error vector in the GDIIS procedure.
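The core of a DIIS-type procedure is solving for mixing coefficients that minimize the combined error vector under the constraint that the coefficients sum to one. The sketch below uses plain dot products of generic error vectors \(B_{ij} = e_i \cdot e_j\) (of which the “energy” variant quoted above is one choice); the error vectors themselves are illustrative values, not data from QUILD:

```python
import numpy as np

# Sketch of the DIIS coefficient equations used in GDIIS-type schemes:
# given error vectors e_i from previous cycles, build B_ij = e_i · e_j and
# minimize |sum_i c_i e_i|^2 subject to sum_i c_i = 1, enforced with a
# Lagrange multiplier in an augmented linear system.

def diis_coefficients(errors):
    n = len(errors)
    B = np.array([[e_i @ e_j for e_j in errors] for e_i in errors])
    # Augmented system: [[B, -1], [-1^T, 0]] · [c, lam] = [0, ..., 0, -1]
    A = np.zeros((n + 1, n + 1))
    A[:n, :n] = B
    A[:n, n] = -1.0
    A[n, :n] = -1.0
    rhs = np.zeros(n + 1)
    rhs[n] = -1.0
    return np.linalg.solve(A, rhs)[:n]   # drop the Lagrange multiplier

# Three hypothetical error vectors from three previous cycles:
errors = [np.array([0.40, -0.20, 0.10]),
          np.array([-0.10, 0.30, 0.20]),
          np.array([0.05, 0.02, -0.30])]
c = diis_coefficients(errors)
print(c, c.sum())                        # coefficients sum to 1
```

The resulting coefficients weight the stored geometries (or steps) of previous cycles; because the combination minimizes the interpolated error, the mixed geometry is usually a better starting point than the latest QN/NR point alone.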