Reviews of Modern Physics, in press.
The last decade has witnessed a proliferation in methodologies for numerically solving large-scale problems in electrostatics and electronic structure. The rapid growth has been driven by several factors. First, theoretical advances in the understanding of localization properties of electronic systems have justified at a fundamental level approaches which utilize localized density matrices or orbitals in their formulation (Kohn, 1996; Ismail-Beigi and Arias, 1998; Goedecker, 1999). Second, a wide variety of computational methods have exploited that physical locality, leading to linear scaling of the computing time with system size (Goedecker, 1999). Third, general algorithms for solving electrostatics and eigenvalue problems have been improved or newly developed including particle-mesh methods (Hockney and Eastwood, 1988; Darden et al., 1993; Pollock and Glosli, 1996), fast-multipole approaches (Greengard, 1994), multigrid techniques (Brandt, 1977, 1982, 1984; Hackbusch, 1985), and Krylov subspace and related algorithms (Booten and van der Vorst, 1996). Last, and perhaps not least, the ready availability of very fast processors for low cost has allowed for quantum modeling of systems of unprecedented size. These calculations can be performed on workstations or workstation clusters, thus creating opportunities for a wide range of researchers in fields both inside and outside of computational physics and chemistry (Bernholc, 1999). Several monographs and collections of reviews illustrate the great variety of problems recently addressed with electrostatics and electronic structure methods (Gross and Dreizler, 1994; Bicout and Field, 1996; Seminario, 1996; Springborg, 1997; Banci and Comba, 1997; Von Rague Schleyer, 1998; Jensen, 1999; Hummer and Pratt, 1999).
This review examines one subset of these new computational methods, namely real-space techniques. Real-space methods can loosely be categorized as one of three types: finite differences (FD), finite elements (FE), or wavelets. All three lead to structured, very sparse matrix representations of the underlying differential equations on meshes in real space. Applications of wavelets in electronic structure calculations have been thoroughly reviewed recently (Arias, 1999) and will therefore not be addressed here. This article discusses the fundamentals of FD and FE solutions of Poisson and nonlinear Poisson-Boltzmann equations in electrostatics and self-consistent eigenvalue problems in electronic structure. As implied in the title, the primary focus is on calculations in density functional theory (DFT); real-space methods are in no way limited to DFT, but since DFT calculations comprise a dominant theme in modern electrostatics and electronic structure, the discussion here will mainly be restricted to this particular theoretical approach.
Consider a physical system for which local approaches such as real-space methods are appropriate: a transition metal ion bound to several ligands embedded in a protein. These systems are of significance in a wide range of biochemical mechanisms (Banci and Comba, 1997). Treating the entire system with ab initio methods is presently impossible. However, if the primary interest is in the nature of the bonding structure and electronic states of the transition metal ion, one can imagine a three-tier approach (Fig. 1). The central region, including the metal ion and the ligands, is treated with an accurate quantum method such as DFT. A second neighboring shell is represented quantum mechanically but is not allowed to change during self-consistency iterations. The wavefunctions in the central zone must be orthogonalized to the fixed orbitals in the second region. Finally, the very distant portions of the protein are fixed in space and treated classically; the main factors to include from the far locations are the electrostatic field from charged or partially charged groups on the protein and the response of the solvent (typically treated as a dielectric continuum). Real-space methods provide a helpful language for representing such a problem. The real-space grid can be refined to account for the high resolution necessary around the metal ion and can be adjusted for a coarser treatment further away. There is clearly no need to allow the metal and ligand orbitals to extend far from the central zone, so a localized representation is advantageous. Also, the electrostatic potential can be generated over the entire domain (quantum, classical, and solvent zones) with a single real-space solver without requiring special techniques for matching conditions in the various regions. The same ideas could be applied to defects in a covalent solid or impurity atoms in a cluster.
In order to place the real-space methods in context, we first briefly examine other computational approaches. The plane-wave pseudopotential method has proven to be a powerful technique for locating the electronic ground state for many-particle systems in condensed phases (Payne et al., 1992). In this method the orbitals are expanded in the nonlocal plane-wave basis. The core states are removed via pseudopotential methods which allow for relatively smooth valence functions in the core region even for first-row and transition elements (Vanderbilt, 1990). Therefore, a reasonable number of plane-waves can be used to accurately represent most elements important for materials simulation. Strengths of this method include the use of efficient fast Fourier transform (FFT) techniques for updates of the orbitals and electrostatic potentials, lack of dependence of the basis on atom positions, and the rigorous control of numerical convergence of the approximation with decrease in wavelength of the highest Fourier mode. Algorithmic advances have led to excellent convergence characteristics of the method in terms of the number of required self-consistency steps (Payne et al., 1992; Hutter et al., 1994; Kresse and Furthmüller, 1996); only 5-10 self-consistency iterations are required to obtain tight convergence of the total energy, even for metals. In spite of the numerous advantages of this approach, there are restrictions centered around the nonlocal basis set. Even with the advances in pseudopotential methods, strong variations in the potential occur in the core regions (especially for first-row and transition elements), and local refinements would allow for smaller effective energy cutoffs away from the nuclei. This issue has been addressed by the development of adaptive-coordinate plane-wave methods (Gygi, 1993). If any information is required concerning the inner-shell electrons, plane-wave methods suffer severe difficulties. Of course such states can be represented with a sufficient number of plane waves (Bellaiche and Kunc, 1997), but the short-wavelength modes required to build in the rapidly varying local structure extend over the entire domain to portions of the system where such resolution is not necessary. Also, for localized systems like molecules, clusters, or surfaces, nontrivial effort is expended to accurately reproduce the vacuum; the zero-density regions must be of significant size in order to minimize spurious effects in a supercell representation. In addition, charged systems create technical difficulties since a uniform neutralizing background needs to be properly added and subtracted in computations of total energies. Lastly, without special efforts to utilize localized-orbital representations, the wavefunction orthogonality step scales as N3, where N is the number of electrons.
In quantum chemistry, localized basis sets built from either Slater-type orbitals (STOs) or Gaussian functions have predominated in the description of atoms and molecules (Szabo and Ostlund, 1989; Jensen, 1999). The molecular orbitals are constructed from linear combinations of the atomic orbitals (LCAO). An accurate representation can be obtained with less than thirty Gaussians for a first-row atom. In relation to plane-wave expansions, the localized nature of these basis functions is more in line with chemical concepts. With STOs or other numerical orbitals, the multicenter integrals in the Hamiltonian must be evaluated numerically, while with a Gaussian basis, the Coulomb integrals are available analytically. The `price' for using Gaussians is that more basis functions are required to accurately describe the electron states, since they do not exhibit the correct behavior at either small or large distances from the nuclei. Techniques such as direct inversion in the iterative subspace (DIIS) have been developed to significantly accelerate the convergence behavior of basis-set self-consistent electronic structure methods (Pulay, 1980, 1982; Hamilton and Pulay, 1986).1 The LCAO methods have led to a dramatic growth in accurate calculations on molecules with up to tens of atoms. It is now common to see papers devoted to detailed comparisons of experimental results and electronic structure calculations on systems with more than one hundred electrons (Rodriguez et al., 1998). Often in basis-set calculations, care must be taken to account for basis-set superposition errors which arise due to overlap of nonorthogonal atom centered functions for composite systems. Also, linear dependence is a problem for very large basis sets chosen to minimize the errors. These factors lead to difficulty in obtaining the basis-set limit for a given level of theory.2 The scaling of basis-set methods can be severe, but recent developments (see Section III) have brought the scaling down to linear for large systems.
With the successes of plane-wave and quantum chemical basis functions, what is the motivation to search for alternative algorithms? Ten years ago, a review article discussing the relevance of Gaussian basis-set calculations for lattice gauge theories argued for the utilization of Gaussian basis sets in place of grids (Wilson, 1990). The author states (concerning the growth of quantum chemistry): ``The most important algorithmic advance was the introduction of systematic algorithms using analytic basis functions in place of numerical grids, first proposed in the early 1950s." The point was illustrated by examination of core states for carbon: only a few Gaussians are required (with variable exponential parameters), while up to 8 ×106 grid points are necessary for the same energy resolution on a uniform mesh. What developments have occurred over the last decade which could begin to overcome such a large disparity in computational effort?
This review seeks to answer the above question by summarizing recent research on real-space mesh techniques. To locate them in relation to plane-wave expansions and LCAO methods, some general features are introduced here and further developed throughout the article. The representation of the physical problems is simple: the potential operator is diagonal in coordinate space and the Laplacian is nearly local, depending on the order of the approximation. The near-locality makes real-space methods well suited for incorporation into linear-scaling approaches. It also allows for relatively straightforward domain-decomposition parallel implementation. Finite or charged systems are easily handled. With higher-order FD and FE approximations, the size of the overall domain is substantially reduced from the estimate above. Adaptive mesh refinements or coordinate transformations can be employed to gain resolution in local regions of space, further reducing the grid overhead. Real-space pseudopotentials result in smooth valence functions in the core region, again leading to smaller required grids. As mentioned above, the grid-based matrix representation produces structured and highly banded matrices, in contrast to plane-wave and LCAO expansions (Payne et al, 1992; Challacombe, 2000). These matrix equations can be rapidly solved with efficient multiscale (or other preconditioning) techniques. However, while more banded than LCAO representations, the overall dimension of the Hamiltonian is substantially higher.3 In a sense, the real-space methods are closely linked to plane-wave approaches: they are both `fully numerical' methods with one or at most a few parameters controlling the convergence of the approximation, for example the grid spacing h or the wavevector of the highest mode k.4 On the other hand, the LCAO methods employ a better physical representation of the orbitals (thus requiring fewer basis functions); attached with this representation, however, are some of the problems discussed above related to the art of constructing nonorthogonal, atom- or bond-centered basis sets. The purposes of this paper are 1) to provide a basic introduction to real-space computational techniques, 2) to review their recent applications to chemical and physical problems, and 3) to relate the methods to other commonly used numerical approaches in electrostatics and electronic structure.
The numerical problems addressed in this review can be categorized into four types in order of increasing complexity:
| |||||||||||||||||||||||||||
The first expression symbolizes the generation of the Laplacian on the real-space grid. The second is the linear, elliptic Poisson equation. The third is the nonlinear Poisson-Boltzmann equation of electrostatics which describes the motion of small counterions in the field of fixed charges. The final equation is an eigenvalue equation such as the self-consistent Schrödinger equation occurring in electronic structure. Note that both the third and fourth equations are nonlinear. The Poisson-Boltzmann equation includes exponential driving terms on the rhs. The self-consistent eigenvalue problem is `doubly nonlinear': one must solve for both the eigenvalues and eigenfunctions, and the potential generally depends nonlinearly on the eigenfunctions. The multigrid method allows for solution of both linear and nonlinear problems with similar efficiencies.
The article is organized into several sections beginning with background discussion and then following the order of problems listed above. Section II introduces the central equations of DFT for electronic structure and charged classical systems. Section III reviews developments in linear-scaling computational algorithms and discusses their relationship to real-space methods. Section IV presents the fundamental aspects of representation in real space by examination of Poisson problems. Section V discusses multigrid methods for efficient solution of the resulting matrix representations. Section VI summarizes recent advances in electrostatics computations in real space including both Poisson and nonlinear Poisson-Boltzmann solvers. Applications in biophysics are illustrated with several examples. Section VII discusses real-space eigenvalue methods for self-consistent problems in electronic structure. Section VIII summarizes recent computations of optical response properties and excitation energies with real-space methods. The review concludes with a short summary and discussion of possible future directions for research.
Motivated by the fundamental Hohenberg-Kohn theorems (Hohenberg and Kohn 1964) of DFT, Kohn and Sham (1965) developed a set of accessible one-electron self-consistent eigenvalue equations. These equations have provided a practical tool for realistic electronic structure computations on a vast array of atoms, molecules, and materials (Parr and Yang, 1989). The Hohenberg-Kohn theorems have been extended to finite-temperature quantum systems by Mermin (1965) and to purely classical fluids in subsequent work (Hansen and McDonald, 1986; Ichimaru, 1994). An integral formulation of electronic structure has also been discovered in which the one-electron density is obtained directly without the introduction of orbitals (Harris and Pratt, 1985; Parr and Yang, 1989). This approach is in the spirit of the original Hohenberg-Kohn theorems, but to date this promising theory has not been used extensively in numerical studies. This section reviews the basic equations of DFT for electronic structure and charged classical systems. These equations provide the background for discussion of the real-space numerical methods.
The Kohn-Sham self-consistent eigenvalue equations for electronic structure can be written as follows (atomic units are assumed throughout):
| (5) |
where the density-dependent effective potential is
| (6) |
The classical electrostatic potential ves(r) is due to both the electrons and nuclei, and the (in principle) exact exchange-correlation potential vxc([r(r)]; r) incorporates all nonclassical effects. The exchange-correlation potential includes a kinetic contribution since the expectation value of the Kohn-Sham kinetic energy is that for a set of non-interacting electrons moving in the one-electron effective potential. The electron density, r(r), is obtained from the occupied orbitals (double occupation is assumed here):
| (7) |
The electrostatic portion of the potential for a system of electrons and nuclei (Hartree potential plus nuclear potential) is given by
| (8) |
This potential can be obtained by numerical solution of the Poisson equation:
| (9) |
where rtot(r) is the total charge density due to the electrons and nuclei.
If the exchange-correlation potential is taken as a local function (as opposed to functional) of the density with the value the same as for a uniform electron gas, the approximation is termed the local density approximation (LDA). Ceperley and Alder (1980) determined the exchange-correlation energy for the uniform electron gas numerically via Monte Carlo simulation. The data have been parametrized in various ways for implementation in computational algorithms (see, for example, Vosko et al., 1980). The LDA theory has been extended to handle spin-polarized systems (Parr and Yang, 1989). The LDA yields results with accuracies comparable to or often superior to Hartree-Fock, but generally leads to overbinding in chemical bonds among other deficiencies. One obtains the Hartree-Fock approximation if the local exchange-correlation potential in Eq. (6) is replaced by the nonlocal exact exchange operator.
In recent years, a great deal of effort has gone into developing more accurate exchange-correlation potentials (see Jensen, 1999, for a review). These advances involve both gradient expansions which incorporate information from electron density derivatives and hybrid methods which include some degree of exact Hartree-Fock exchange. With the utilization of these modifications, results of chemical accuracy can be obtained. Since the main focus of this review is on numerical approaches for solving the self-consistent equations, we do not further examine these developments.
Pseudopotential techniques allow for the removal of the core electrons. The valence electrons then move in a smoother (nonlocal) potential in the core region while exhibiting behavior the same as in an all-electron calculation outside the core. Recently developed real-space versions of the pseudopotentials allow for computations on meshes (Troullier and Martins, 1991a, 1991b; Goedecker et al., 1996; Briggs et al., 1996). Inclusion of the pseudopotentials substantially reduces the computational overhead since fewer orbitals are treated explicitly and the required mesh resolution can be coarser. However, truly local mesh-refinement techniques may allow for the efficient inclusion of core electrons when necessary (see Sections VI.A.2 and VII.D).
Self-consistent solution of the Kohn-Sham equations [Eq. (5)] for fixed nuclear locations is conceptually straightforward. An initial guess is made for the orbitals. This yields an electron density from which the effective potential is constructed by solution of the Poisson equation and generation of the exchange-correlation potential. The eigenvalue equation is solved with the current effective potential [Eq. (6)], resulting in a new set of orbitals. The process is repeated until the density or total energy change only to within some desired tolerance. Alternatively, the total energy can be minimized variationally using a technique such as conjugate gradients (Payne et al., 1992); the orbitals at the minimum correspond to those from the iterative process described above.
The ground-state theory discussed above has been extended to finite-temperature quantum and classical systems and has found wide use in the theory of fluids (Rowlinson and Widom, 1982; Hansen and McDonald, 1986; Ichimaru, 1994). Here I discuss the formulation for systems of charged point particles (mobile ions) moving in the external potential produced by other charged particles in the solution (for example, colloid spheres or cylinders). The solvent is assumed to be a uniform dielectric with dielectric constant e in these equations. The free energy for an ion gas of counterions can be written as the sum of an ideal term, the energy of the mobile ions in the external field due to the fixed colloid particles (this term incorporates both the electrostatic potential from the fixed charges on the colloids and an excluded-volume potential),
| (10) |
the Coulomb interaction potential energy of the mobile ions with each other, and a correlation free energy. The mobile-ion density rm (r) is the number density, not the charge density, in the solution. The charge on the counterions is q, and the approximate correlation free energy typically assumes a local density approximation for a one-component plasma. Thus the theory includes ion correlations, but the approximation is not systematically refineable, just as in the Kohn-Sham LDA equations. Löwen (1994) utilized this free energy functional in Car-Parrinello-type simulations (Car and Parrinello, 1985) of charged rods with surrounding counterions.
The equilibrium distribution is obtained by taking the functional derivative of the free energy with respect to the mobile-ion density and setting it to zero. It is clear that, if the correlation term is set to zero, the equilibrium density of the mobile counterions is proportional to the Boltzmann factor of the sum of the external and mobile ion Coulomb potentials:
| (11) |
The potential fm (r) is that due to the mobile ions only and b = (kT)-1.
Since the total charge (fixed charges on colloid particles and mobile ion charges) at equilibrium must satisfy the Poisson equation, the following nonlinear differential equation results for the equilibrium distribution of the mobile ions in the absence of correlations. The treatment is generalized here to account for the possibility of additional salt in the solution and a dielectric constant that can vary in space (Coalson and Beck, 1998):
| (12) |
where f(r) is the total potential due to the fixed colloid charges and mobile ions, rf(r) is the charge density of the fixed charges on the colloids, [`n]+ and [`n]- are the bulk equilibrium ion densities at infinity (determined self-consistently so as to conserve charge in the region of interest), and v(r) is a very large positive excluded-volume potential which prevents penetration of the mobile ions into the colloids. Fushiki (1992) performed molecular dynamics simulations of charged colloidal dispersions at the Poisson-Boltzmann level; the nonlinear Poisson-Boltzmann equation was solved numerically at each time step with FD techniques.
An alternative elegant statistical mechanical theory for the ion gas has been formulated (Coalson and Duncan, 1992). It uses field theoretic techniques to convert the Boltzmann factor for the ion interactions into a functional-integral representation of the partition function. The Poisson-Boltzmann-level theory results from a saddle-point approximation to the functional integral. The distinct advantage of this theory is that correlations can be systematically included by computing the corrections to the mean-field approximation via loop expansions. However, in practice the corrections are computationally costly for real-space grids of substantial size. This theory was used in simulations of colloids (Walsh and Coalson, 1994), and the deviations from mean-field theory were investigated. For realistic concentrations of monovalent background ions, the corrections are often small in magnitude, thus justifying the Poisson-Boltzmann-level treatment. Correlations must be considered, however, for accurate computations involving divalent ions (Guldbrand et al., 1984; Tomac and Gräslund, 1998; Patra and Yethiraj, 1999).
Several new methods have appeared for computations involving systems with long-range interactions. In this section, developments in linear-scaling methods for classical and quantum systems are summarized. Goedecker (1999) has clearly reviewed applications in electronic structure, so the discussion of this topic will be limited. The purpose is to illustrate the context in which real-space methods can be utilized in linear-scaling solvers for electrostatics and electronic structure.
Three algorithms have been most widely used in classical electrostatics calculations which require consideration of long-range forces. The first is the Ewald (1921) summation, which partitions the Coulomb interactions into a short-range sum handled in real space and a long-range contribution summed in reciprocal space. Both sums are convergent. The partitioning is effected by adding and subtracting localized Gaussian functions centered about the discrete charges (Pollock and Glosli, 1996). In the original Ewald method, the Coulomb interaction of the Gaussians is obtained analytically:
| (13) |
once the charge structure factor,
| (14) |
is computed. In Eq. (13), W is the cell volume and G is the Gaussian width. This method scales as N3/2 (where N is the number of particles) so long as an optimal exponential factor is used in the Gaussians. Discussion of the optimization equation which yields the N3/2 scaling is given in Pollock (1999). The Ewald technique has been used extensively in simulations of charged systems (Allen and Tildesley, 1987). An efficient alternative procedure for Madelung sums in electronic structure calculations on crystals was proposed by Harris and Monkhorst (1970).
The scaling of the Ewald method has been reduced by an alternative treatment for the interaction energy of the Gaussians. Instead of solving the problem analytically, 1) the charge density is assigned to a mesh, 2) the Poisson equation is solved using FFT methods, 3) the potential is differentiated, and lastly 4) the forces are interpolated to the particles. These methods are termed particle-particle particle-mesh (Hockney and Eastwood, 1988) or particle-mesh Ewald (Darden et al., 1993) [or an improved version called smooth particle-mesh Ewald (Essmann et al., 1995)]. Since the potential is generated numerically via FFT, the methods scale as N lnN (or NÖ{lnN} if an optimal G is used; see Pollock, 1999). The above-mentioned methods differ in how the four steps in the force generation are performed, but all three center on the use of FFT algorithms for their efficiency. Comparative studies have suggested that the original particle-particle particle-mesh method is more accurate than the particle-mesh Ewald versions; Deserno and Holm (1998) recommend its use with modifications obtained from particle-mesh Ewald. See also Sagui and Darden (1999), where it is shown that similar accuracies can be obtained with particle-mesh Ewald as compared to the particle-particle particle-mesh method.
The second algorithmic approach utilizes the fast multipole method (FMM) (Greengard, 1994) or related hierarchical techniques. In these methods, the near-field contributions are treated explicitly, while the far field is handled by clustering charges into spatial cells and representing the field with a multipole expansion. The methods are claimed to scale linearly with system size, but recent work contends the scaling is slightly higher (Pérez-Jordá and Yang, 1998). Fast multipole techniques and the quantum chemical tree code (QCTC) of Challacombe et al. (1996), have been widely applied in Gaussian-based electronic structure calculations. Since the classical Coulomb part of the problem is a significant or even dominant part of the overall computational effort, near linear scaling is required for an overall linear-scaling solver (Strain et al., 1996; White et al., 1996). Pérez-Jordá and Yang (1997) have developed an alternative efficient recursive bisection method for obtaining the Coulomb energy from electron densities. The FMM has also been utilized extensively in particle simulations. In comparative studies of periodic systems, Pollock and Glosli (1996) and Challacombe et al. (1997), have shown that, for the case of discrete particles, the particle-mesh related techniques are more efficient than the fast multipole method over a wide range of system sizes (up to 105 particles). However, for the case of continuous overlapping distributions, it is difficult to develop systematic ways in the particle-mesh approach to handle the charge penetration in large-scale Gaussian calculations (Challacombe, 1999a). Recently, Cheng et al. (1999) have developed a more efficient and adaptive version of the fast multipole method which will make the technique competitive with the particle-mesh method. Also, Greengard and Lee (1996) presented a method combining a local spectral approximation and the fast multipole method for the Poisson equation.
A third set of linear-scaling algorithms for classical electrostatics employs real-space methods, which will be discussed in-depth in subsequent sections. The problem is represented with FD equations, FE methods, or wavelets, and solved iteratively on the mesh. Since all operations are near-local in space, the application of the Laplacian to the potential is strictly linear scaling. However, the iterative process on the fine mesh typically suffers from slowing down in the solution process, so efficient preconditioning techniques must be employed to obtain the linear scaling. The multigrid method (Brandt, 1977, 1982, 1984, 1999; Hackbusch, 1985) is a particularly efficient method for solving the discrete equations. Linear-scaling real-space methods have been developed for solution of the Poisson problem in DFT (White et al., 1989; Merrick et al., 1995, 1996; Gygi and Galli, 1995; Briggs et al., 1995; Modine et al., 1997; Goedecker and Ivanov, 1998a). These studies have illustrated the accuracies and efficiencies of the real-space approach. One possible application of multigrid techniques which has not received attention is in solving for the Coulomb energy of the Gaussian charges in the particle-mesh algorithms. Since the multigrid techniques are highly efficient, scale linearly, and allow for variable resolution, they may provide a helpful counterpart to the FFT-based methods currently used. An advantageous feature of the multigrid solution during a charged-particle simulation is that, once the potential is generated for a given configuration, it can be saved for the next solution process for the updated positions which have changed only slightly. Thus the required number of iterations is likely to be low. Tsuchida and Tsukada (1998) utilized similar ideas in their FE method for electronic structure, where they employed MG acceleration for rapid solution of the Poisson equation and discussed the relation of their method to the particle-mesh approach.
Electronic structure calculations involve computational complexities which go well beyond the necessity for efficient solution of the Poisson equation. In order to obtain linear scaling, physical localization properties must be exploited either for the range of the density matrix or the orbitals. Goedecker (1999) categorized the various linear-scaling electronic structure methods as follows: Fermi operator expansion (FOE), Fermi operator projection (FOP), divide and conquer (DC), density matrix minimization (DMM), orbital minimization (OM), and optimal-basis density matrix minimization (OBDMM). He further classified the algorithms into those which employ small basis sets (LCAO-type approaches) and ones which utilize large basis sets (FD or FE).5 Clearly the methods most relevant to the present discussion are those which can be implemented with large basis sets (FOP, OM, and OBDMM). The two approaches most directly related to the FD and FE mesh techniques considered here are the OM and OBDMM methods, so we review their characteristics.
The OM method obtains the localized Wannier functions by minimization of the functional:
| (15) |
The minimization is unconstrained in that no orthogonalization is required; the orthonormality condition is automatically satisfied at convergence. In Eq. (15), W is the `grand potential', the cin are the expansion coefficients for the Wannier function n with basis function i, and the H¢i,j are the matrix elements of H - mI, where m is the chemical potential controlling the number of electrons and I is the identity matrix. The functional can be derived by making a Taylor expansion of the inverse of the overlap matrix occurring in the total energy expression (Mauri et al., 1993). Ordejón et al. (1995) presented an alternative derivation and related the OM functionals to the DMM approach. Assuming no localization restriction on the orbitals, it can be shown that the functional W gives the correct ground state at its minimum. However, some problems arise when localization constraints are imposed: 1) the functional can have multiple minima, 2) the number of required iterations to reach the ground state can be quite large, 3) there may be runaway solutions depending on the initial guess, and 4) the total charge is not conserved for all stages of the minimization (although charge is accurately conserved close to the minimum).
The original OM methods utilized underlying plane-wave (Mauri et al., 1993; Mauri and Galli, 1994) and tight-binding or LCAO-type bases (Kim, et al., 1995; Ordejón et al., 1995; Sánchez-Portal et al., 1997) for the representation of the localized orbitals. In the work of Sánchez-Portal et al. (1997) on very large systems, a fully numerical LCAO basis developed by Sankey and Niklewski (1989) was implemented for the orbitals, and the Hartree problem was solved via FFT techniques on a real-space grid. Lippert et al. (1997) developed a related hybrid Gaussian and plane-wave algorithm which uses Gaussians in place of the numerical atomic basis. Also, Haynes and Payne (1997) formulated a new localized spherical-wave basis which has features in common with plane waves in that a single parameter controls the convergence.
Real-space formulations have also applied OM ideas; since the real-space approach is inherently local, it provides a natural representation for the linear-scaling algorithms. Tsuchida and Tsukada (1998) incorporated unconstrained minimization into their FE electronic structure method. Hoshi and Fujiwara (1997) also employed unconstrained minimization in their FD self-consistent electronic structure solver. Finally, Bernholc et al. (1997) utilized the original localized-orbital functional of Galli and Parrinello (1992) in their FD multigrid method to obtain linear scaling. They are also investigating other order N functionals. These real-space algorithms will be the subject of extensive discussion in Section VII.
The OBDMM method is an efficient combination of density matrix and orbital-based methodologies. The optimization process to locate the ground state is divided into two minimization steps. In the inner loop, the usual DMM procedure is followed to obtain the density matrix for a fixed contracted basis. The density matrix F(r,r¢) is represented in terms of contracted basis functions yi and a matrix K which is a purified form from the DMM method:
| (16) |
and
| (17) |
where L is the contracted basis density matrix and O the overlap matrix. The matrix K is `purified' in that if the eigenvalues of L are close to zero or one, the eigenvalues of K will be even closer to those values. The outer loop searches for the optimal basis with fixed L. The OBDMM method was developed independently by Hierse and Stechel (1994) and Hernández and Gillan (1995). The two approaches differ in that the algorithm of Hernández and Gillan allows for a number of basis functions larger than the number of electrons. Also, Hierse and Stechel (1994) used tight-binding and Gaussian bases, while Hernández and Gillan (1994) employed a FD difference representation in their original work. Later, Hernández et al. (1997) developed a blip-function basis (a local basis of B-splines, see Strang and Fix, 1973), very closely related to FE methods.
In the quantum chemistry literature, efforts have focused on Gaussian basis-function algorithms. As discussed above, the Coulomb problem is typically solved with the FMM or other hierarchical techniques (White et al., 1996; Strain et al., 1996; Challacombe et al., 1996; Challacombe and Schwegler, 1997). Additional algorithmic advances include linear scaling for the exchange-correlation calculation in DFT (Stratmann et al., 1996), for the exact exchange matrix in Hartree-Fock theory (Schwegler and Challacombe, 1996; Schwegler et al., 1997), and for the diagonalization operation (Millam and Scuseria, 1997; Challacombe, 1999b). Alternative linear-scaling algorithms include the early Green's function based FD method of Baroni and Giannozzi (1992) and the finite-temperature real-space method of Alavi et al. (1994).
It is evident from the above discussion that real-space methods, in particular FD and FE approaches,6 are well suited for linear-scaling algorithms. In classical electrostatics calculations, the multigrid method provides an efficient and linear-scaling technique for solution of Poisson problems given a charge distribution on a mesh (finite or periodic systems). In electronic structure, FD and FE representations have been extensively employed in the OM and OBDMM localized-orbital linear-scaling contexts.
The early development of FD and FE methods for solving partial differential equations stemmed from engineering problems involving complex geometries, where analytical approaches were not possible (Strang and Fix, 1973). Example applications include structural mechanics and fluid dynamics in complicated geometries. However, even in the early days of quantum mechanics, attention was paid to FD numerical solutions of the Schrödinger equation (Kimball and Shortley, 1934; Pauling and Wilson, 1935). Also, fully converged numerical solutions of self-consistent electronic structure calculations have played an important role in atomic physics (see Mahan and Subbaswamy, 1990, for a discussion of the methodology for spherically symmetric systems) and more recently in molecular physics (Laaksonen et al., 1985; Becke, 1989).
Real-space calculations are performed on meshes; these meshes can be as simple as Cartesian grids or can be constructed to conform to the more demanding geometries arising in many applications. Finite-difference representations are most commonly constructed on regular Cartesian grids. They result from a Taylor series expansion of the desired function about the grid points. The advantages of FD methods lie in the simplicity of the representation and resulting ease of implementation in efficient solvers. Disadvantages are that the theory is not variational (in the sense of providing and upper bound, see below), and it is difficult to construct meshes flexible enough to conform with the physical geometry of many problems. Finite-element methods, on the other hand, have the advantages of significantly greater flexibility in the construction of the mesh and an underlying variational formulation. The cost of the flexibility is an increase in complexity and more difficulty in the implementation of multiscale or related solution methods. In this Section, we review the technical aspects of real-space FD and FE representations of differential equations by examination of Poisson problems.
The second-order FD representation of elliptic equations is very simple but serves to illustrate several key features. Consider the Poisson equation in one dimension (the 4p is left here since we will be considering three-dimensional problems):
| (18) |
where f(x) is the potential and r(x) the charge density. Expand the potential in the positive and negative directions about the grid point xi:
| ||||||||||||||||||||||||||
The grid spacing is h, here assumed uniform. If these two equations are added and the sum is solved for f¢¢(xi), the following approximation results:
| (20) |
The first contribution to the truncation error is second order in h with a prefactor involving the fourth derivative of the potential. Depending on the nature of the function f(x), the errors can be of either sign. When f(x) is used to compute a physical quantity such as the total electrostatic energy, the net errors in the energy can be either positive or negative. In this sense, the FD approximation is not variational. As we will see below, the solution can be obtained by minimizing an energy (or action) functional, which is a variational process, but the solution does not necessarily satisfy the variational theorem obtained in a basis-set method. So the FD approach is not a basis-set method.
In matrix form, the one-dimensional discrete Poisson equation is
| (21) |
This equation can be expressed symbolically as
| (22) |
where Lh is the discrete Laplacian, uexh is the exact solution on the grid, and fh is -4pr. The operator -L is positive definite. An observation from the matrix form Eq. (21) is that the Laplacian is highly sparse and banded in the FD representation; its application to the potential is thus a linear-scaling step. In one dimension the matrix is tridiagonal, while in two or three dimensions it is no longer tridiagonal but is still extremely sparse with nonzero values only near the diagonal. This differs from the wavelet representation, which is sparse but includes several bands in the matrix (Goedecker and Ivanov, 1998b, Fig. 7; Arias, 1999, Fig. 10).
In addition to the truncation error,
| (23) |
estimates can be made of the function error itself (see Strang and Fix, 1973, p. 19):
| (24) |
where ua is the exact solution to the continuous differential equation and e2 is proportional to the second derivative of the potential. Therefore, one can test the order of a given solver for a case with a known solution by computing errors over the domain and taking ratios for variable grid spacing h. For example, the ratio of the errors on a grid with spacing H = 2h to those on h for overlapping points should be close to 4.0 in a second-order calculation.
The two- and three-dimensional representations are obtained by summing the one-dimensional case along the two or three orthogonal coordinate axes (this holds for higher-order forms as well). Since the Laplacian is the dot product of two vector operators, off-diagonal terms are not necessary. The second-order two-dimensional Laplacian consists of five terms with a weight of -4 instead of -2 on the diagonal, and the three-dimensional case has seven terms with a weight of -6 along the diagonal. See Abramowitz and Stegun (1964, Sections 25.3.30 and 25.3.31), for the two-dimensional representation of the Laplacian.
Consider the action functional:
| (25) |
If the first term on the rhs is integrated by parts (assuming the function and/or its derivative go to zero at infinity or are periodic), one obtains
| (26) |
Take the functional derivative of the action with respect to variations of the potential, and a `force' term results,
| (27) |
which can be employed in a steepest-descent minimization process:
| (28) |
where t is a fictitious time variable.
Then discretize the problem in space and time, leading to (for simplicity of representation a one-dimensional form is given here)
| (29) |
The parameter w is 2dt/h2. The two- and three-dimensional expressions are easily obtained following the same procedure. Since the action as defined in Eq. (26) possesses only a single minimum, the iterative process eventually converges so long as a sufficiently small time step dt is chosen to satisfy the required stability criterion (below).
Several relaxation strategies result from the steepest-descent scheme of Eq. (29). As it is written, the method is termed weighted-Jacobi iteration. If the previously updated value f(xi-1)t+1 is used in place of f(xi-1)t on the right hand side, the relaxation steps are called Successive Over-Relaxation or SOR. If the parameter in SOR is taken as w = 1, the result is Gauss-Seidel iteration. Gauss-Seidel and SOR do not guarantee reduction in the action at each step since they use the previously updated value. Generally, Gauss-Seidel iteration is the best method for the smoothing steps in multigrid solvers (Brandt, 1984). If one cycles sequentially through the lattice points, the ordering is termed lexicographic. Higher efficiencies (and vectorization) can be obtained with red-black ordering schemes in which the grid points are partitioned into two interlinked sets and the red points are first updated, followed by the black (Brandt, 1984; Press et al., 1992). Similar techniques can be used for high orders with multicolor schemes. Conjugate-gradient methods (Press et al., 1992) significantly outperform the above relaxation methods when used on a single grid level. However, in a multigrid solver the main function of relaxation is only to smooth the high-frequency components of the errors on each level (see Section V.A), and simple relaxation procedures (especially Gauss-Seidel) do very well for less cost.
An important issue in iterative relaxation steps relates to the eigenvalues of the update matrix defined by Eq. 29 (Briggs, 1987). Solution of the Laplace equation using weighted-Jacobi iteration illustrates the basic problem. For that particular case, the eigenvalues of the update matrix are
| (30) |
where w is the relaxation parameter defined above, N+1 is the number of grid points in the domain, and k labels the mode in the Fourier expansion of the function. Generally, the Fourier component of the error with wavevector k is reduced in magnitude by a factor proportional to lkt in t iterations.
First, it is easy to see that if too large an w value (that is `time' step for fixed h) is taken, the magnitude of some modes will exceed one, leading to instability. This shows up very quickly in a numerical solver! Second, for the longest-wavelength modes, the eigenvalues are of the form:
| (31) |
As more grid points are used to obtain increased accuracy on a fixed domain, the eigenvalues of the longest-wavelength modes approach one. Therefore, these modes of the error are very slowly reduced. This fact leads to the phenomenon of critical slowing down in the iterative process (Fig. 2), which motivated the development of multigrid techniques. Multigrid methods utilize information from multiple length scales to overcome the critical slowing down (Section V).
Mathematical arguments lead to the conclusion that the FD scheme discussed above is convergent in the sense that Eh ® 0 as h ® 0 (Strang and Fix, 1973; Vichnevetsky, 1981). Therefore, one only needs to proceed to smaller grid spacings to obtain results with a desired accuracy. This neglects the practical issues of computer time and memory, however, and it has become apparent that orders higher than second are most often necessary to obtain sufficient accuracy in electronic structure calculations on reasonable-sized meshes (Chelikowsky, Troullier, and Saad, 1994).
The higher-order difference formulas are well known (Hamming, 1962; Vichnevetsky, 1981), and can easily be generated using computer algebra programs (see Appendix A). Why does it pay to use high-order approximations? Consider the three-dimensional Poisson equation with a singular-source charge density:
| (32) |
The Dirac delta function is approximated by a unit charge on a single grid point. Let us solve the FD version of Eq. (32) on a 653 domain using 2nd- and 4th-order Laplacians and compare the potential eight grid points away from the origin.7 In order to obtain the same numerical accuracy with a 2nd-order Laplacian, a grid spacing with one third that for the 4th-order case is required. This implies a 27-fold increase in storage and roughly a 14-fold increase in computer time, since the application of the Laplacian contains 7(13) terms for the 2nd(4th)-order calculations.
As a second example, we solve for five states of the hydrogen-atom eigenvalue problem using the fixed potential generated in the solution of Eq. (32). The grid parameters are the same as those used in the multigrid eigenvalue computations of Section VII.A.2. The variation of the eigenvalues, the first orbital moments, and the virial ratios with approximation order are presented in Figs. 3, 4, and 5. A possible accuracy target is the thermal energy at room temperature (kT » 0.001 au); this accuracy is achieved at 12th order. Clearly the results at 2nd order are not physically reasonable, but accurate results can be obtained with the higher orders. Merrick et al. (1995) and Chelikowsky, Trouller, Wu, and Saad (1994) have presented analyses of the impact of order on accuracy in DFT electrostatics and Kohn-Sham calculations; in the Kohn-Sham calculations, 8th or 12th orders were required for adequate convergence.
There exist alternative high-order discretizations such as the Mehrstellen form used in the work of Briggs et al. (1996). This discretization is 4th order and leads to terms which are off-diagonal in both the kinetic and potential operators. The advantage of the Mehrstellen approach is that both terms only require near-neighbor points on the lattice, while the high-order forms above include information from further points (which increases the communication overhead somewhat in parallel implementations). However, the 4th-order Mehrstellen operator involves 33 multiplies to apply the Hamiltonian to the wavefunction, while the standard 4th-order discretization requires only 14 (a 12th-order standard form uses 38 multiplies). Also, the Mehrstellen representation has only been applied to the 4th-order case, and for some applications higher orders may be required. The exact terms for the Mehrstellen representation of the real-space Hamiltonian are given in Briggs et al. (1996).
Consider again the action of Eq. (25) in one dimension:
| (33) |
This form of the action proves useful since the appearance of the first derivative as opposed to the second expands the class of functions which may be used to represent the potential. Now, expand the potential in a basis:
| (34) |
where the ui are the expansion coefficients and zi the basis functions. The action is then
| (35) |
The variational calculation is performed by minimizing the action with respect to variations in the expansion coefficients (assuming the original differential operator is positive definite):
| (36) |
The minimization equation leads to a matrix problem completely analogous to Eq. (22). In the present case, the grid index is replaced by the basis-function index. It is often necessary to perform the integral of the second term (which involves the charge density) numerically.
A more general origin of the FE method is termed the Galerkin approach which takes as its starting point the ``weak" formulation of the problem. This method allows one to handle problems which cannot be cast in the minimization format described above by requiring only an extremum of the action functional and not a minimum. Also, it does not require symmetric operators. Take the action functional of Eq. (33) and perturb it by the addition of a small term ev.8 The action becomes
| (37) |
By taking the derivative with respect to e, making e zero, and setting what remains to zero, the stationary point is obtained. This variational form results in the following integral equation:
| (38) |
This equation is valid for any test function v; solution requires finding the function f for which the equation holds for all v. Alternatively, Eq. 38 can be derived by simply left multiplying the differential equation by the test function v and integrating by parts. When the functions f and v are represented in the zi basis, a matrix equation the same as Eq. (36) is obtained. This basis-set manifestation of the weak formulation is termed the Galerkin method. If the test function space for v is taken to include all Dirac delta functions, and the problem is cast in the strong form < v,Lf+4pr > = 0 (where L is the differential operator, in this case the Laplacian), the collocation (or pseudospectral) approximation results when the problem is discretized (Orszag, 1972; Vichnevetsky, 1981; Ringnalda et al., 1990). Excellent reviews of the theory and application of finite elements are given in Strang and Fix (1973), Vichnevetsky (1981), Brenner and Scott (1994), and Reddy (1998).
Any linearly independent basis may be used to expand the potential. One choice would be to expand in trigonometric functions which span the whole domain. Then Fourier transform techniques could be used to solve the equations. In the FE method, the basis functions are rather taken as piecewise polynomials which are nonzero only in a local region of space (that is, have small support). The simplest possible basis consists of piecewise-linear functions whose values are one at the grid point about which they are centered and zero everywhere beyond the nearest-neighbor grid points. Then the coefficients ui correspond to the actual function values on the mesh. With this basis and a basis-set representation of the charge density r(x), the resulting matrix representation of the one-dimensional Poisson equation is identical to Eq. (21), except the right hand side is replaced by terms which are local averages of the charge density over three points. The local average is idential to Simpson's rule integration. Therefore, for uniform meshes, there is a close correspondence between FD and FE representations. Relaxation methods similar to those described above can be used to solve the FE equations.
Besides the variational foundation of the FE method, the key advantage over FD approaches is in the flexibility available to construct the mesh to conform to the physical geometry. This issue becomes particularly relevant for two- and three-dimensional problems. There is of course an immense literature on development of accurate and efficient basis sets for FE calculations in a wide variety of engineering and physical applications, and that topic cannot be covered properly here. Some representative bases are mentioned from recent three-dimensional electronic structure calculations. White et al. (1989) employed a cubic-polynomial basis and constructed an orthogonal basis from the nonorthogonal set. Ackermann et al. (1994) used a tetrahedral discretization with orders p = 1 - 5. Pask et al. (1999) utilized piecewise cubic functions (termed ``serendipity'' elements). Yu et al. (1994) employed a Lobatto-Gauss basis set with orders ranging from five to seven. Hernández et al. (1997) developed a B-spline basis which is closely related to traditional FE bases. Tsuchida and Tsukada (1998) used piecewise third-order polynomials in their self-consistent electronic structure calculations.
In relating the FD and FE methods, two points are worth noting. First, the FE bases are typically nonorthogonal and this issue must be dealt with in the formulation. Second, since the basis is local, the representation is banded with the width depending on the degree of the polynomials. For the FD representation, the high-order Laplacian includes 3p + 1 terms in a row of L for three-dimensional calculations. Alternatively, the FE method requires O(p3) terms along a row of L in the limit of high orders, although the exact number of terms depends on the particular elements (Pask, 1999). This issue of scaling of the bandwidth with order may become a significant one in development of efficient iterative solvers of the equations. Due to the relative merits of the two representations, there is no clear `answer' as to which one is preferable; the key feature for this review is that both are near-local leading to structured and sparse matrix representations of the differential equations. The wavelet basis method is closer in form to the FE representation but, as mentioned above, leads to more complicated matrix structures than either the FD or FE cases (Goedecker and Ivanov, 1998b; Arias, 1999).
The previous section discussed the basics of real-space formulations. The representations are near-local in space, and this locality manifests itself in the stalling process of iterative solvers induced by Eq. (31). The finer the resolution of the mesh, the longer it takes to remove the long-wavelength modes of the error. The multigrid technique was developed in order to overcome this inherent difficulty in real-space methods. Multigrid methods provide the optimal solvers for problems represented in real space.
The asymptotic convergence of an iterative solver on a given scale is controlled by Eq. (31). However, for shorter-wavelength modes it is easy to show that the convergence factor
| (39) |
where |eh| is the norm of the difference vector between the exact grid solution uexh and the current approximation uh, and [e\tilde]h is the vector for the next step of iteration, is of order 0.5 for Gauss-Seidel iteration on the Poisson equation (Brandt, 1984). Those components of the error are reduced by an order of magnitude in only three relaxation sweeps. Thus, relaxation steps on a given grid level are referred to as smoothing steps; the high-frequency components of the error are efficiently removed while the long-wavelength modes remain. Following the fine-scale smoothing, the key step of multigrid is then to pass the problem to a coarser level, say H = 2h (with appropriate rules for the construction of the problem on the coarse grid); smoothing steps on the coarse level efficiently remove errors of twice the wavelength. Finally, the fine-grid function is corrected with the error interpolated from the coarse level, and further iterations on the fine level remove remaining high-frequency components induced by the coarse-grid correction.
When this process is recursively followed through several levels, the stalling behavior can be completely removed and the solution is obtained in O(N) operations, where N is the number of unknowns. Typically, the problem can be solved to within the truncation errors in roughly ten total smoothing steps on the finest level. The previous discussion rests on a local-mode analysis of the errors (Brandt, 1977, 1984); additional mathematical arguments confirm the excellent convergence rates and linear scaling of multigrid solvers (Hackbusch, 1985).
For linear problems, the algebraic Eq. (22) can be rewritten as
| (40) |
where h is the finest grid spacing, eh = uexh - uh (grid error), and rh = fh - Lh uh (residual equation). During the multigrid correction cycles, the coarse-grid iterations only need to be performed on the error term eH which is subsequently interpolated to the fine grid to provide the correction. However, this rearrangement is not possible for nonlinear problems. Brandt (1977, 1984) developed the full approximation scheme (FAS) approach for handling such problems. Besides providing solutions to nonlinear differential equations like the Poisson-Boltzmann equation, the FAS strategy is well suited to handle eigenvalue problems and mesh-refinement approaches. The FAS form of multigrid is thus presented here due to its generality. In the case of linear problems, the FAS is equivalent to the error-iteration version mentioned above.
Consider a Poisson problem discretized on a Cartesian lattice with a FD representation on a fine grid with spacing h (Eq. 22). Now construct a sequence of coarser grids each with grid spacing twice the previous finer value. For a 4-level problem in three dimensions, the sequence of grids will consist of 173, 93, 53, and 33 points including the boundaries. If h = 1, the coarser grid spacings are 2, 4, and 8. The boundary values of the potential on each level are fixed based on the physics of the problem. For example, if there are a set of discrete charges inside the lattice, direct summation of the 1/r potential or a multipole expansion can be performed. Alternatively it is easy to apply periodic boundary conditions by wrapping the potential. On the coarsest grid, only the one central point is iterated during relaxation steps there.
Assume there are l levels for the general case; each level is labeled by the index k which runs from 1 (coarsest level) to l (finest level). The operator Lk is defined by the FD discretization on level k with grid spacing hk. The goal is to obtain the solution uexl of Eq. 22 on the finest level. The equations to be iterated on level k take the form:
| (41) |
where one starts from a trial uk and improves it. The initial uk on coarse levels is obtained by applying the full-weighting restriction operator Ik+1k to uk+1:
| (42) |
The restriction operator takes a local average of the finer-grid function. The average is over all 27 fine grid points (in three dimensions) including the central point which coincides with the coarse grid and the 26 neighboring points. The weights are: 1/8 for the central point, 1/16 for the 6 faces, 1/32 for the 12 edges, and 1/64 for the 8 corners. The restriction operator is a rectangular matrix of size Ngk+1 (columns) by Ngk (rows) where Ngk is the number of grid points on level k. Of course, only the weights need be stored. The coarse-grid charge density fk is obtained similary from fk+1. The defect correction tk is defined as
| (43) |
The defect correction is zero on the finest level l. Therefore the third term on the rhs is zero for the grid next-coarser to the fine scale. It is easy to show that if one had the exact grid solution uexl on the finest level, the coarse-grid equations (Eq. 41) would also be satisfied on all levels, illustrating zero correction at convergence. Another point of view is that the defect correction modifies the coarse-grid equations to `optimally mimic' the finer scales. The defect correction provides an approximate measure of the discretization errors and can be used in the construction of adaptive solvers (Brandt, 1984): higher resolution is placed in regions where the defect correction magnitude exceeds a prescribed value.
The solver begins with initial iterations on the finest level (typically two or three relaxation steps are adequate on each level). The problem is restricted to the next coarser level as outlined above, and relaxation steps are performed there. This process is repeated until the coarsest grid is reached. The solver then returns to the fine level by providing corrections to each next-finer level and applying relaxation steps there. The correction equation for grid k+1 is
| (44) |
The additional operator Ikk+1 is the interpolation operator. Most often it is acceptable to use linear interpolation, and the easiest way to apply the operator in three dimensions is to interpolate along the lines in each plane and finally to interpolate along lines between the planes. That is, the operator can be applied by a sequence of one-dimensional interpolations. For linear interpolation, the coarse-grid points which coincide with the fine grid are placed directly into the fine-grid function, and the intermediate points get a weight of 1/2 from each neighboring coarse-grid point. In the same way, high-order interpolation operators can be applied as a sequence of one-dimensional operations; see Beck (1999b) for a listing of the high-order interpolation weights. The high-order weights are used in interpolating to new fine levels in full multigrid eigenvalue solvers (Brandt et al., 1993) and in high-order local mesh-refinement multigrid methods (Beck, 1999b). The interpolation operator is a rectangular matrix of size Ngk (columns) by Ngk+1 rows. Only the weights need be stored, just as for the restriction operator. All of the operators defined above can be initialized once and used repeatedly throughout the algorithm. The multigrid cycle defined by the above discussion is termed a V-cycle which is shown schematically in Fig. 6. Alternative cycling methods have been employed as well, such as W-cycles. Reductions in the norm of the residual in one V-cycle are generally an order of magnitude. The same set of operations is employed in a high-order solver; the second-order Laplacian is simply replaced by the high-order version. The form of the multigrid solver is quite flexible; for example, a lower-order representation could be used on coarse levels during the correction cycles. In our own work, we have observed similar optimal convergence rates for high-order solvers as for second-order ones, so there is no degradation in efficiency with order. Applications in electrostatics and extensions for eigenvalue problems are discussed in Sections VI and VII.
The grid solution can be efficiently obtained with one or at most a few V-cycles described above. The process obeys linear scaling since the solution is obtained with a fixed number of multigrid cycles and each operation on the grid scales linearly with the number of grid points. In three dimensions, the total grid overhead is Ntot = Sl Nfine, where
| (45) |
and l is the number of levels. In the limit of many levels, Ntot thus approaches 1.143 Nfine. Another development in the multigrid approach, full multigrid (FMG), can even further accelerate the solution process beyond the V-cycle algorithm. The idea of FMG is to begin iterations on the coarsest level. The initial approximation there is interpolated to the next-finer level, iterated, and the new fine-grid approximation is corrected in a V-cycle on that level. This process is repeated until the finest scale is reached. The FMG solver for a Poisson problem is illustrated in Fig. 7. The advantage of this approach is that a good initial (or preconditioned) approximation to the fine-scale function is obtained on the left side of the final V-cycle. With this strategy, the solution to Poisson problems can be obtained with a single passage through the FMG solver. (Self-consistent problems may require two or more passages through the final V-cycle to obtain convergence.) Note that a direct passage via iterations and interpolation from coarse to fine scales without the correction cycles does not guarantee multigrid convergence behavior since residual long-wavelength errors can remain from coarser levels. The multigrid corrections on each level serve to remove those errors, leading to optimal convergence (Brandt, 1984; Hackbusch, 1985; Briggs, 1987; Wesseling, 1991).
Multigrid solvers have been applied to many problems in fluid dynamics, structural mechanics, electrostatics, eigenvalue problems, etc. The majority of applications have utilized FD-type representations, but significant effort has gone into developing efficient solvers for FE representations as well (Brandt, 1980; Deconinck and Hirsch, 1982; Hackbusch, 1985; Braess and Verfurth, 1990; Brenner and Scott, 1994). An additional difficulty with FE multigrid methods is a proper representation of the problem on coarse levels: the more regular the fine-scale mesh, the easier is the coarsening process.
The original formulation of the multigrid method was directed at solution of linear elliptic equations like the Poisson equation. Subsequently, methods were developed to handle nonlinear problems such as the Poisson-Boltzmann equation of ionic solution theory. In this section, applications of real-space methods to electrostatics problems are discussed. First, the high efficiency of the multigrid method is demonstrated by examination of a Poisson problem. Then, new mesh-refinement techniques which allow for treatment of widely varying length scales are examined. Poisson-Boltzmann numerical solvers are discussed, with presentation of some representative applications in biophysics.
We investigate a model atomic-like Poisson problem which has an analytic solution:
| (46) |
The analytic solution is f(r) = e-r/r. The source singularity is modeled as a single discrete charge at the origin, and the neutralizing background charge value at the origin is set to give a net charge of zero summed over the whole domain. Here we discretize the problem with a 12th-order Laplacian on a 653 lattice with fine grid spacing h = 0.25. The problem was solved with the FAS-FMG technique with a single passage through the FMG process. Linear interpolation and full-weighting restriction were employed for the grid transfers. The potential was initially set to zero over the whole domain. Three Gauss-Seidel smoothing steps were performed on each level. Several additional smoothing steps were taken for points just surrounding the singularity to accelerate the convergence there (Bai and Brandt, 1987). This requires virtually no additional effort since only few grid points are involved.
The solution is obtained to within the truncation errors with a total of six relaxation sweeps on the finest level (Fig. 8). Thus the entire solution process only requires roughly ten times the effort it takes to represent the differential equation on the grid. The total energy of the charge distribution is E = -S/4p, where S is the action of Eq. (26). After the single FMG cycle, the energy is converged to within 0.00029 au of the fully converged energy of 4.31800 au (obtained with repeated V-cycles on the finest level). The final residual (using the 1-norm divided by the total number of points, that is the average absolute value of the grid residuals) is 5 ×10-6. After 1200 Gauss-Seidel iterations on the finest level alone, the residual is still of magnitude 9 ×10-6. (With an optimal SOR parameter, the number of iterations to obtain a residual of 5 ×10-6 can be reduced to 200 iterations.) Thus there is an enormous acceleration due to the multiscale processing. Similar efficiencies are observed for an FAS-FMG eigenvalue solver (Section VII.A). Since the number of iterations is independent of the number of fine grid points, these efficiencies are quite general and can be routinely expected from a correctly functioning multigrid solver.
Next we compare the operations count for generating the solution to Eq. 46 from scratch using multigrid and FFT methods on the same 653 lattice.9 The FFT solver required 33 ×106 floating point operations. The multigrid solver required 27 ×106, 43 ×106, 75 ×106, and 106 ×106 operations for the 2nd, 4th, 8th, and 12th order solvers, respectively. Therefore, there appears to be no clear advantage in generating the Poisson potential from scratch with multigrid as opposed to FFT. However, there are some advantages to using the real-space multigrid approach: 1) finite and periodic systems are handled with equal ease, 2) in a quantum simulation where particles move only slightly from a previous configuration, the potential can be saved from the previous configuration, thus reducing the number of iterations, and 3) one can incorporate mesh refinements to reduce the computational overhead. For example, if the same problem is solved with three nested refinement patches centered on the singularity, the number of floating point operations is reduced by nearly two orders-of-magnitude while the accuracy is sufficient since the smooth parts of the potential away from the singularity can be well represented on coarser meshes. In addition, multigrid methods can be used to solve nonlinear problems such as the Poisson-Boltzmann equation with similar efficiencies.
A situation that arises in many applied electrostatics computations is that of strongly varying dielectric profiles. Analogous problems occur in steady-state diffusion problems with widely varying diffusion coeffcients such as those encountered in neutron diffusion. If the coefficients vary by orders of magnitude, multigrid efficiency can be lost (Alcouffe et al., 1981). The reason is that the correct continuity condition across the boundaries is e1 Ñf(r1) = e2 Ñf(r2) rather than continuity of the gradients themselves. Thus the gradients vary widely across the boundaries, and the standard smoothing steps do not properly reduce the errors in the function. Alcouffe et al. (1981) developed procedures based on the above continuity condition which restore the standard multigrid convergence. In biophysical applications, the dielectric constant varies from one to eighty, so such modifications prove useful for that case (Holst and Saied, 1993).
Many physical problems require consideration of a wide range of length scales. One example given in the Introduction is a transition metal ion buried inside a protein. A protein interacting with a charged membrane surface is another example: particular charged groups near the interaction region must be treated accurately, but distant portions of the protein and membrane do not require high resolution to obtain reliable energetics. In electronic structure, the electron density is very large near the nucleus but is diffuse further away. A significant strength of real-space methods lies in the ability to place adaptive refinements in regions where the desired functions vary rapidly while treating the distant zones with a coarser description.
Two approaches exist for such refinements in the FD method (FE methods allow quite easily for grid adaptation): grid curving and local mesh refinements. While grid curving is an elegant procedure for adapting higher resolution in certain regions of space, generally the coordinate transformations are global. Therefore, the higher resolution tends to spread some distance from where the refinement is necessary (Modine et al., 1997, Figs. 1, 4, and 5), and depending on the geometry of the problem it may be difficult or impossible to design an appropriate grid transformation. Also, the transformations can be quite complex leading to additional difficulties in the solution process. Finally, the grid-curving transformations alter the underlying spectral properties of the operators which can in principle lead to degradation of the multigrid efficiency in the solution process. However, this does not appear to have been a problem in the methods of Gygi and Galli (1995) and Modine et al. (1997), although the convergence behavior of their multigrid Poisson solvers was not extensively discussed in those works. Mesh-curving strategies for electronic structure calculations are discussed in Section VII.D.
An alternative procedure is to place nested uniform patches of refinement locally in space (Fig. 9). Then the overall structure of the multigrid solver is the same, except fine-level iterations are performed only over the nested patches. The same forms for the Laplacian, restriction, interpolation, and smoothing operators are maintained. This procedure is highly flexible since the nested refinements can be centered about any locations of space and can move as the problem evolves. The placement of the refinements can be adaptively controlled by examination of the defect correction tH; higher resolution should be placed in regions where tH is large. If an underlying FD representation is employed, it is relatively easy to extend the method to high-order solvers since the mesh of the refinement patch is uniform.
Bai and Brandt (1987) developed an FAS multigrid mesh-refinement method for treating widely varying length-scale Poisson-type problems. They first developed a l-FMG exchange-rate algorithm which minimizes the error obtained for a given amount of computational work. Since the number of visits to the coarse levels (which extend over the whole domain) is proportional to the number of patches, direct application of the multigrid algorithm does not scale strictly linearly for many levels; the l-FMG process restores the linear scaling for a solver including the mesh refinements. Second, they showed that extra local relaxations around structural singularities restore asymptotic convergence rates which can otherwise degrade. Third, they developed a conservative-differencing technique for handling source singularities.
To motivate the need for conservative differencing in the FAS-FMG mesh-refinement solver, consider Eq. (41) and a two-level problem with one nested patch. The defect correction on the coarse level H is initially defined only over the interior region of the patch. However, if one examines the sum of tH over the refinement, most of the terms in the interior cancel, but nonzero values remain near the boundaries. The remaining terms closely resemble flux operators at the boundary. The net effect is thus the introduction of additional sources in the Poisson equation, which pollutes the solution severely over the whole domain. By balancing the local fluxes with additional defect correction terms on the patch boundary, the correct source strength is restored. Bai and Brandt (1987) solved this problem for second-order equations and tested the method on a source-singularity problem in two dimensions.
Recently the method has been extended to high-order FD approximations by Beck (1999b, 2000). The boundary defect correction terms were determined by examination of the noncancelling terms for the high-order approximations. Without the conservative scheme, significant errors are apparent over the whole domain. With the inclusion of the boundary corrections, the sum of tH over the patch is zero to machine precision and the correct high-order behavior is obtained over the whole domain. The method was tested on a source-singularity problem in three dimensions for multiple nested patches. Typical multigrid efficiencies were observed. Additional corrections will be necessary at the boundaries for continuous charge distributions which cover the refinement boundaries, but these are independent of the order of the Laplacian. These techniques are currently being included in high-order FD electronic structure calculations. They will significantly reduce the grid overhead in comparison to uniform-grid calculations while still maintaining the linear-scaling properties of the multigrid method. It is not possible to handle truly local refinements with the FFT approach.
In related work, Goedecker and Ivanov (1998a) developed a linear-scaling multiresolution wavelet method for the Poisson equation which allows for treatment of widely varying length scales. They utilized second-generation interpolating wavelets since the mapping from grid values to expansion coefficients is easy for these functions, and they have a fast wavelet transform. They solved the Poisson equation for the challenging case of the all-electron uranium dimer. Their solver employed 22 hierarchical levels, and the potential was obtained to six significant digits.
As discussed in Section II.B, the Poisson-Boltzmann equation arises from the assumption of no ion correlations. That is, it is a mean-field treatment. Onsager (1933) showed that there exists an inherent asymmetry at the Poisson-Boltzmann level. Nevertheless, calculations performed at this level of theory can yield accurate energetics for monovalent ions at moderate concentrations (Honig and Nicholls, 1995; Tomac and Gräslund, 1998; Patra and Yethiraj, 1999). Linearization of the Poisson-Boltzmann equation restores the symmetry, but for many cases of experimental interest the linearization assumption is too severe. Solution of the Poisson-Boltzmann equation produces the electrostatic potential throughout space, which in turn generates the equilibrium mobile-ion charge densities and the total free energy of the ion gas (below). By computing the total energies for several macroion configurations, the potential of mean force due to electrostatic effects can be approximated (Rice, 1959). In this section, we focus on real-space numerical methods for solution of the nonlinear Poisson-Boltzmann equation [Eq. (12)].
Numerical solution of nonlinear partial differential equations is problematic. For the Poisson-Boltzmann case, the nonlinearities can be severe near fixed charges since the ratio of the potential to kT can be large. Also, strong dielectric discontinuities at the boundary of a large molecular ion and the solution create technical difficulties. However, it is known that there is a single stable minimum of the action functional whose derivative yields the Poisson-Boltzmann equation (Coalson and Duncan, 1992; Ben-Tal and Coalson, 1994). Therefore, properly constructed iterative processes can be expected to locate that minimum.
Early numerical work centered on FD representations. Nicholls and Honig (1991) developed an efficient single-level SOR method which included special techniques for memory allocation and for locating the optimal relaxation parameter. For the test cases considered, between 76 and 184 iterations were required for convergence. They also observed divergence for some highly nonlinear cases. Davis and McCammon (1989) and Luty et al. (1992) used instead a conjugate-gradients relaxation method. Between 90 and 118 iterations were required to obtain convergence. They observed a factor of at least two improvement in efficiency in comparison with SOR relaxation in their test calculations.
After the development of multigrid methods for solving linear Poisson-type problems, efforts focused on nonlinear problems. The FAS algorithm presented above is well suited for solving nonlinear problems (Brandt, 1984). Two modifications are needed: the driving term fh on a given level now i