1. Introduction
Point-charge electrostatic models have been a mainstay of molecular simulation for years. Increasing in sophistication, many authors have also implemented dipole interactions, such that the electrostatics on each nuclear site is modelled by the combination of a dipole and a charge, which requires that the simulation code be capable of calculating both charge–dipole and dipole–dipole interactions in addition to the usual charge–charge interactions required for point charge models. This presented particular problems for treatment of long-range interactions, as typically handled by an Ewald sum, or related schemes, as such algorithms were initially developed for the case of point charge models, and thus needed to be modified in order to account for dipole interactions. Fortunately, the problem of adapting Ewald sums for dipoles has largely been solved, and, Smith [
1] in particular has provided a formulation which is well organised and easy to follow.
In recent years, some groups have moved even beyond the dipole level, exploring the use of yet higher terms in the multipole expansion of the charge distribution, where the terms in the next three ranks are referred to by names quadrupole, octopole and hexadecapole. Again though, the problem is faced—how to implement these terms in a simulation code, and how to modify an Ewald sum, or similar algorithm, such that it can properly handle the higher order multipoles interactions.
There is currently a strong need for such methods, given that there now exist reliable convenient methods to partition the charge distribution from electronic-structure calculations into atomic multipoles residing on each nuclear site, with the most notable method being the Distributed Multipole Analysis (DMA) approach of Stone and coworkers [
2,
3,
4]. DMA provides a particularly appealing route for the construction of accurate empirical models, because researchers should now be able take the values of atomic multipoles as output from a DMA analysis as input for their models. However, doing this depends, of course, on the ability to write code which can calculate the required multipole interactions, and this is far from trivial.
Progress has been made, with popular molecular-dynamics packages, such as DL-POLY [
5] and TINKER [
6] having functionality to evaluate potential-energy surfaces with multipole interactions. So, in a sense, this is a solved problem, but it is still worth exploring, given the complexity of the current approaches, if there are any ways to simplify the mathematical and computational approaches for computing multipole interactions.
Smith’s method can, in principle, be used up to arbitrary rank, and his original paper even includes the terms up to quadrupole. However, there is a noticeable increase in complexity and computational cost incurred by going up to higher ranks, such that the quadrupole terms are both more difficult to both derive and to calculate than the dipole terms, and, it seems like it would be a significant undertaking to extend this approach to yet higher ranks.
However, it was recently shown by Lin [
7] that we do not have to derive the higher multipole terms “by hand”, as it were. Instead, it is possible to derive a more general formula which more or less automatically generates the expression for any degree multipole, without resorting to increasingly tedious algebra.
It is true that past formulations by Smith and others [
8,
9] are extendable to arbitrarily higher rank multipoles, but Lin’s work is, to our best knowledge, the first to give such a direct closed form expression for the interaction between multipoles of any rank. However, even Lin’s extension to Smith’s approach suffers from a drawback, which is that they both choose to evaluate the multipole interaction in Cartesians. Furthermore, this is a problem because, as it turns out, the Cartesian representation of multipoles is highly redundant. This is due to the presence of symmetries in the Cartesian representation, resulting in the use of far more numbers than is strictly required to represent each multipole, making it hard to calculate the multipole interactions in the minimal number of operations.
Bearing this in mind, it is possible to make many optimisations in the Cartesian implementation, such that these redundancies have relatively minimal impact. Most notably, it is possible to take advantage of the fact that the traces of the Cartesian tensors do not contribute to the total energies/forces of the multipole interaction, and so remove them, an approach that has been used with success by both Lin and Huang [
10,
11]. It is also possible to write the required sums in such a way that the redundancy from the remaining tensor symmetries is reduced.
However, to increase computational efficiency even further, what is really needed is an expression in which the multipoles are described by a completely non redundant basis. Furthermore, it has long been known that this is provided by moving from a Cartesian representation to spherical harmonics.
It could be argued that a Cartesian-multipole representation is more compatible with existing molecular dynamics codes, given they are normally mostly written in Cartesians. Furthermore, it is certainly true that most people prefer thinking in terms of Cartesians, than the somewhat complex spherical harmonics. However, the spherical harmonics approach ultimately proves to be faster, and more elegant, and so is arguably worth the extra effort for codes which make heavy use of multipoles.
Writing the interaction in terms of spherical harmonics is usually regarded as a formidable task, requiring mastery of the mathematics of manipulating these objects, and intimate knowledge of the spherical harmonics’ properties. Furthermore, the complexity is further multiplied when it comes to modifying the treatment of long-range interactions, such as when handled by Ewald sums. Even so, a derivation of the Ewald sum for multipoles has been given by Leslie [
12], who described his implementation of multipoles in the DL-MULTI software package, which can be interfaced with DL-POLY [
5] molecular dynamics software package. Leslie gave the expression for the Ewald sum form of the multipole interaction in terms of Stone’s S functions. These functions, which are derived in ref. [
13], act give the interaction tensor components between two multipoles on different sites, each described in their local reference frames, where the necessary orientational information is described by Wigner rotation matrices, and where the S function also contains a spherical harmonic, which acts to mediate the interaction between.
Another derivation has recently been given by Simmonett et al. [
14] who proposed a method for converting the Ewald sum into the spherical harmonic formulation of Stone, which required the aid of symbolic algebra package to derive. Furthermore, the resulting expression requires the calculation of hard to interpret ‘contamination terms’ in the tensorial interactions between the multipoles, which arise from the modification of the multipole interaction due to the Gaussian screening functions that are present in the real-space part of the Ewald sum. Although the above derivations appear complete and satisfactory, they do depend on fairly technical knowledge, which is presumably why many in the community have been slow to implement such approaches.
In the present contribution, we aim to rectify this situation by providing a surprisingly straightforward derivation of the multipole interaction in spherical harmonics which, unusually, does not require any detailed technical knowledge of the theory of spherical harmonics. Our method provides a direct connection between the Cartesian and spherical harmonic representations, such that it becomes straightforward to transform between the two representations, and our final expressions lend themselves to being easily implemented in Ewald sum type methods. Now, we seek to provide a comprehensive account of the full development of this Cartesian formulation, so this will necessarily involve some presentation of relatively well-known, pervious material in the theory and (spherical-harmonics) presentation and treatment of multipoles. Still, in the present contribution, we wish to provide a full, and self-contained development of a Cartesian formulation, in addition to an intuitive associated graphical representation.
We do not claim that the resulting expressions are superior to previous derivations, such as that given by Leslie and Simmonett et al.; we are full of admiration for their work. However, the approach presented here is significantly different to standard derivations, which makes it interesting in itself, and our resulting expressions are also in a different form, though they must be completely equivalent in their predictions, to those obtained with other methods. We also think our approach will be of interest in showing how the Smith and Lin method for Cartesian multipoles can be transformed into a spherical harmonic equivalent, while preserving the essential structure of their solution.
It would, of course, be of enormous interest to find a reasonably direct way of connecting our expressions for the multipole interaction in spherical harmonics to those produced by other derivations. However, this appears far from trivial, given we have taken what seems to be such a radically different approach to that used by other authors.
Our approach is also very much inspired by the work of Applequist [
15,
16], who published two papers detailing some quite beautiful theorems on the deep connection between homogeneous polynomials and Cartesian tensors, and also on the properties of the Maxwell Cartesian spherical harmonics, which are the natural Cartesian analogues of the spherical harmonics, both of which we will have cause to use in the course of this work.
Our approach is summarised as follows. We first essentially follow the approaches of Smith and Lin to give the multipole interactions in terms of Cartesians. We then proceed to show how this interaction can be converted into spherical harmonic form. This is done with the aid of three key ideas. (i) the use of a diagrammatic representation of the interaction between multipole sites, which greatly clarifies the math, as it turns out the complete interaction can be calculated in a ‘sum over diagrams’ type sense. (ii) the use of spherical harmonics to construct an orthogonal basis for the traceless multipole tensors, such that the Cartesian to spherical harmonic conversion can be achieved by way of Stone’s tables of spherical harmonics, as given in reference [
13]. (iii) the recognition that the multipole interaction can be written in terms of so-called tensor inner products, which we will show are proportional to inner products involving spherical harmonics, allowing us to convert the entire multipole interaction into spherical-harmonic form.
2. Results
2.1. The Multipole Expansion
Consider a cluster of
${N}_{c}$ charges, with charges
${q}_{m}$, at positions
$\mathit{r}+\Delta {\mathit{r}}_{m}$, where the
$\Delta $ is used to signify that the displacements are typically quite small (at least, they are closer to
$\mathit{r}$ than any test sites probing the fields from the distribution). We then define the rank-n Cartesian multipole tensors (
n = 0,1,…),
${\mathit{M}}^{\left(n\right)}$, of the distribution, with respect to
$\mathit{r}$, according to
where
$\Delta {\mathit{R}}^{m\left(n\right)}=\Delta {\mathit{r}}^{m}\Delta {\mathit{r}}^{m}\Delta {\mathit{r}}^{m}..$ is a tensor product, with Cartesian components
$\Delta {R}_{\alpha \beta \gamma ..}^{m\left(n\right)}=\Delta {r}_{\alpha}^{m}\Delta {r}_{\beta}^{m}\Delta {r}_{\gamma}^{m}..$, in which each Greek index is one of
$x,y,z$, and there being
$n$ factors of type
$\Delta {r}_{\alpha}^{m}$ in the product. Furthermore, where, for
n = 0, the tensor becomes unity:
$\Delta {\mathit{R}}^{m\left(0\right)}=1$ (see
Section 4.1).
Note: In Equation (1), and throughout, we are following Smith in subsuming the inverse factorial in the definition of each multipole, but we warn that other authors use different conventions, and it is important to check which convention is being used when comparing different works.
The rank 0 multipole, ${M}^{\left(0\right)}$, which is a scalar, is just the sum of the charge. The rank 1 multipole ${\mathit{M}}^{\left(1\right)}$ multipole is referred to as the dipole moment, and is a vector with components ${M}_{x}^{\left(1\right)}$, ${M}_{y}^{\left(1\right)}$, ${M}_{z}^{\left(1\right)}$. The rank 2 multipole, ${\mathit{M}}^{\left(2\right)}$ is called the quadrupole, and has 9 components: ${M}_{xx}^{\left(2\right)}$, ${M}_{xy}^{\left(2\right)}$, ${M}_{xz}^{\left(2\right)}$, etc., and the rank 3 multipole with 27 components is called the hexadecapole, and so on.
It is clear from their definition in Equation (1). that each multipole tensor is symmetric under any permutation of its indices; e.g., for the rank-3 multipole tensor, ${M}_{\alpha \beta \gamma}^{\left(3\right)}={M}_{\alpha \gamma \beta}^{\left(3\right)}$ $={M}_{\beta \gamma \alpha}^{\left(3\right)}={M}_{\beta \alpha \gamma}^{\left(3\right)}$ $={M}_{\gamma \alpha \beta}^{\left(3\right)}={M}_{\gamma \beta \alpha}^{\left(3\right)}$. Furthermore, it can be shown that the same is true in any axis frame. This permutation symmetry means that the multipole tensors belong to a class that are referred to as symmetric tensors, and we will have frequent cause to make use of this symmetry throughout this work.
The multipoles are useful because the electrostatic properties of the cluster can often be written in terms of a rapidly converging series over multipole moments of increasing rank, rather than having to sum over the individual charges themselves. To see this, let us place the charge distribution in a background non-uniform electrostatic potential,
$\varphi \left(\mathit{r}\right)$. Then, the electrostatic energy,
$U\left(\mathit{r}\right)$, of the cluster due to
$\varphi \left(\mathit{r}\right)$ is given by
Now, using the inner product notation
$\u27e8{\mathit{A}}^{\left(n\right)},{\mathit{B}}^{\left(n\right)}{\u27e9}_{t}$, where
$\u27e8{\mathit{A}}^{\left(n\right)},{\mathit{B}}^{\left(n\right)}{\u27e9}_{t}$ indicates an inner product over tensors
${\mathit{A}}^{\left(n\right)}$ and
${\mathit{B}}^{\left(n\right)}$ (see
Section 4.1)
the Taylor series expansion of
$\varphi \left(\mathit{r}+\Delta {\mathit{r}}_{m}\right)$ can be written as
where
${\nabla}^{\left(n\right)}=\nabla \nabla \nabla ..$ with components
${\nabla}_{\alpha \beta \gamma ..}^{\left(n\right)}={\partial}^{n}/\partial {r}_{\alpha}\partial {r}_{\beta}\partial {r}_{\gamma}..$ (again, see
Section 4.1).
Now, defining the rank n symmetric electrostatic field tensors
${\varphi}^{\left(n\right)}\left(\mathit{r}\right)$ from
Furthermore, substituting Equation (4) into Equation (2) and using Equation (1), we can rewrite the energy expression as
Similarly, it can be shown from the elementary theory of multipoles, and from using another Taylor series expansion, that the electrostatic potential at
$\mathit{r}$, from a charge cluster around the origin, with multipoles
${\mathit{M}}^{\left(n\right)}$ is given by
where we have set
$4\pi {\u03f5}_{0}=1$ to simplify the formulae.
Thus, assuming the above series converges, and we wish to calculate the electrostatic potential from the charge cluster, we are justified in ignoring the fine details of the charge distribution and just working with its multipoles up to a maximum rank, where these multipoles are calculated from the multipole expansion of the charge distribution in Equation (1), and where we imagine the multipoles are placed on the multipole expansion site, in this case at the origin.
2.2. Traceless Tensors and the Detracing Operator
The trace of a rank 2 tensor is given by the sum of its diagonal elements,
In general, and following Applequist [
15] (
Section 2.3), we will define the trace of a rank
n symmetric Cartesian tensor as
where the trace tensor,
$Tr\left({\mathit{A}}^{\left(n\right)}\right)$, is a symmetric tensor of rank
n − 2.
Tensors for which $Tr\left({\mathit{A}}^{\left(n\right)}\right)$ = $\mathbf{0}$ will be referred to as traceless. In fact, we have already encountered two such tensors: (i) the ${\varphi}^{\left(n\right)}\left(\mathit{r}\right)$ tensor, defined in Equation (5) above, and (ii) the ${\nabla}^{\left(n\right)}\left\{1/r\right\}$ tensor which occurs in Equation (7).
To show that the
${\varphi}^{\left(n\right)}\left(\mathit{r}\right)$ tensor is traceless, we combine Equations (5) and (9) to obtain
where we have used the fact that the electrostatic potential in free space is a solution to Laplace’s eqn,
${\nabla}^{2}\varphi \left(\mathit{r}\right)=0$. Thus,
${\varphi}^{\left(n\right)}\left(\mathit{r}\right)$ is traceless, and, given that
$1/r$ is just the potential from a unit charge, a similar argument shows that
${\nabla}^{\left(n\right)}\left\{1/r\right\}$ is also traceless.
As an aside, it may be argued that in real systems the Laplacian is not zero, because any real atomic site will experience a non-zero charge-density from the inter-atomic and inter-molecular charge-clouds from the other electrons. However, the standard multipole expansion for the interatomic electrostatic energy does not account for such effects, and they will henceforth be assumed to be zero.
The addition of any two rank n traceless tensors, or the multiplication of a traceless tensor by a scalar will result in another traceless tensor. Thus, the traceless tensors of rank n form a vector subspace of rank n tensors. As such we ought to be able to find a projection operator which projects out the non-traceless part of any rank n symmetric tensor. For example, we might (correctly) guess that the rank 2 tensor ${\mathit{A}}^{\left(2\right)}$ has the traceless projection ${A}_{\alpha \beta}^{t\left(2\right)}={r}_{\alpha}{r}_{\beta}-Tr\left\{{\mathit{A}}^{\left(2\right)}\right\}{\delta}_{\alpha ,\beta}/3$. Furthermore, we will now show how to find this projection in general.
Before we begin, it is convenient to introduce
${\mathit{R}}^{\left(n\right)}$ tensors,
${\mathit{R}}^{\left(n\right)}=\mathit{r}\mathit{r}\mathit{r}\mathit{r}..$, with components
${R}_{\alpha \beta \gamma ..}^{\left(n\right)}={r}_{\alpha}{r}_{\beta}{r}_{\gamma}{r}_{\delta}..$, and where
${\mathit{R}}^{\left(0\right)}=1$ by convention (see
Section 4.1).
We will now make use of a theorem due to Hobson, [
17] which has been extensively discussed by Applequist, ref [
15,
16] that the projection of the
${\mathit{R}}^{\left(n\right)}$ tensors onto the subspace of traceless tensors is given by
where
$\hat{D}$ is the so-called
detracing projection operator onto the subspace of traceless tensors, and where
${\mathit{R}}^{t\left(n\right)}=\hat{D}{\mathit{R}}^{\left(n\right)}$ is the traceless tensor resulting from the projection of
${\mathit{R}}^{\left(n\right)}$ into that traceless subspace. Furthermore, where that
${\mathit{R}}^{t\left(n\right)}$ is traceless follows immediately from the traceless nature of
${\nabla}^{\left(n\right)}\left\{1/r\right\}$.
Applequist refers to
${\mathit{R}}^{t\left(n\right)}$ as the
Maxwell Cartesian spherical harmonics, given that these gradients were investigated by James Clerk Maxwell [
18], and given that, although they are not orthogonal, they behave in many senses like the Cartesian analogue of the spherical harmonic polynomials, of which more later.
For the moment we can think of the $\hat{D}$ operator as being defined by Equation (11), and so our job is to show that $\hat{D}$ defined this way is indeed the detracing projection operator.
Firstly, we need a way to express the action of
$\hat{D}$ on a general degree n tensor. Applequist has provided a closed-form expression for the matrix representation of
$\hat{D}$ [
15], but here we will show a perhaps easier method, which is to infer these coefficients from Equation (11).
This is perhaps easiest shown by way of example. From Equation (11), and calculating the repeated derivatives of
$1/r$, we find that
where the term in brackets corresponds to a sum over all distinct permutations of the indices.
We can now define the action of the
$\hat{D}$ operator on a general tensor,
${\mathit{A}}^{\left(n\right)}$, through substituting components of
${\mathit{A}}^{\left(n\right)}$ for those of
${\mathit{R}}^{\left(n\right)}$, which gives
We will now show that the
$\hat{D}{\mathit{A}}^{\left(n\right)}$ are traceless. Starting from a result from Applequist [
15] (
Section 3.3) that it always possible to find
${\mathit{r}}^{i}$ vectors such that any degree n tensor can be expressed as a linear combination in
${\mathit{R}}^{i\left(n\right)}$, viz
We then apply the
$\hat{D}$ operator to both sides of the above to give
and given that every
$\hat{D}{\mathit{R}}^{i\left(n\right)}$ is traceless we have that
$\hat{D}{\mathit{A}}^{\left(n\right)}$ must also be traceless, which was to be shown.
To show that
$\hat{D}$ is a proection operator onto the subspace of traceless tensors we first note that every term in
${\left(\hat{D}{\mathit{R}}^{\left(n\right)}\right)}_{\alpha \beta \gamma ..}$ in Equations (12)–(15) is of the form
where
C is some constant and where it is easy to show that there must be
m Kronecker deltas if the term is to be dimensionally correct. That each term must be in this form can be shown from induction on the repeated directional derivatives of
$1/r$.
Looking again at Equations (12)–(15), we also see that, for each degree n, there is always a first term with m = 0, i.e., containing no factors of ${r}^{2}$, or Kronecker deltas, which is given by ${r}_{\alpha}{r}_{\beta}{r}_{\gamma}..$. Again, this can be shown from induction.
Now, consider what happens when we make the substitution from
${\mathit{R}}^{\left(n\right)}$ to
${\mathit{A}}^{\left(n\right)}$ in expression 22, which gives
where the sum involves taking
m traces of
${\mathit{A}}^{\left(n\right)}$. Then if
${\mathit{A}}^{\left(n\right)}={\mathit{A}}^{t\left(n\right)}$ is a traceless tensor, the above must be equal to zero for every term except the
m = 0 term, which is equal to
${A}_{\alpha \beta \gamma ..}^{t\left(n\right)}$. Furthermore, it follows that
$\hat{D}{\mathit{A}}^{t\left(n\right)}={\mathit{A}}^{t\left(n\right)}$.
In general, we have that ${\mathit{A}}^{\left(n\right)}$ is not traceless, but given that, for any ${\mathit{A}}^{\left(n\right)}$, we have already shown that $\hat{D}{\mathit{A}}^{\left(n\right)}$ is traceless, we also have that $\hat{D}\left(\hat{D}{\mathit{A}}^{\left(n\right)}\right)=\hat{D}{\mathit{A}}^{\left(n\right)}$, and so $\hat{D}\hat{D}=\hat{D}$. It follows that $\hat{D}$ is the detracing projection operator, which projects onto the entire subspace of traceless tensors.
2.3. The Multipole Interaction in Terms of Traceless Tensors
In this section, we will show how the multipole interaction can be written in terms of traceless tensors.
We begin with an observation that If ${\mathit{A}}^{\left(n\right)}$ is a rank n tensor and ${\mathit{B}}^{t\left(n\right)}$ is a traceless tensor of the same rank, then the inner product $\u27e8{\mathit{A}}^{\left(n\right)},{\mathit{B}}^{t\left(n\right)}{\u27e9}_{t}=\u27e8\hat{D}{\mathit{A}}^{\left(n\right)},{\mathit{B}}^{t\left(n\right)}{\u27e9}_{t}$.
To see this, write ${\mathit{A}}^{\left(n\right)}={\mathit{A}}^{t\left(n\right)}+{\mathit{A}}^{{}^{\prime}\left(n\right)}$, where ${\mathit{A}}^{t\left(n\right)}=\hat{D}{\mathit{A}}^{\left(n\right)}$ is the traceless part of ${\mathit{A}}^{\left(n\right)}$, and ${\mathit{A}}^{{}^{\prime}\left(n\right)}$ is the non-traceless remainder, then $\u27e8{\mathit{A}}^{{}^{\prime}\left(n\right)},{\mathit{B}}^{t\left(n\right)}{\u27e9}_{t}=0$, because ${\mathit{A}}^{{}^{\prime}\left(n\right)}$ is completely outside the traceless subspace as a result of this decomposition, by construction, and the result follows.
We will now show that the traces of the multipole tensors make no contribution to the energies forces and torques, and so can be set to zero if desired.
Given that ${\varphi}^{\left(n\right)}\left(\mathit{r}\right)$ is traceless, we can use the above theorem to show that the inner product in Equation (6) can be written as either $\u27e8{\mathit{M}}^{\left(n\right)},{\varphi}^{\left(n\right)}\left(\mathit{r}\right){\u27e9}_{t}=\u27e8{\mathit{M}}^{t\left(n\right)},{\varphi}^{\left(n\right)}\left(\mathit{r}\right){\u27e9}_{t}$, and similarly, the inner product in Equation (7) can also be written using either ${\mathit{M}}^{\left(n\right)}$ or ${\mathit{M}}^{t\left(n\right)}$, without making any difference to the result.
We have shown that the energy of a multipole in a background field is independent of its trace, and the field from a multipole, as given by Equation (7) is also independent of the multipole traces. Thus, the energies of any set of interacting multipoles are also independent of their traces, and the same is true for all the forces and torques being that they involve gradients of the energy.
Finally, we will show that the electrostatic potential from a multipole distribution at a given point in space can be written in terms of the traceless tensors
${\mathit{R}}^{t\left(n\right)}$ and
${\mathit{M}}^{t\left(n\right)}$. Substitution of Equation (11) into Equation (7) gives
Furthermore, given the claim at the start of this section, we could use the substitutions ${\u27e8{\mathit{M}}^{\left(n\right)},{\mathit{R}}^{t\left(n\right)}\u27e9}_{t}={\u27e8{\mathit{M}}^{t\left(n\right)},{\mathit{R}}^{t\left(n\right)}\u27e9}_{t}={\u27e8{\mathit{M}}^{t\left(n\right)},{\mathit{R}}^{\left(n\right)}\u27e9}_{t}$ in the above.
2.4. The Multipole Interaction Generating Formula
The ${\nabla}^{\left(n\right)}\left\{1/r\right\}$ gradients play a central role in the theory of multipoles, as should already be evident from the last section. However, as shown by Smith, the Ewald sum analogue of the multipole interaction requires the calculation of more general gradients, in which spherically symmetric functions, here written as $B\left(r\right)$, are substituted for the $1/r$ terms in equalities like Equation (7). That is, instead of calculating the ${\nabla}^{\left(n\right)}\left\{1/r\right\}$ gradients, we will now be interested in finding gradients ${\nabla}^{\left(n\right)}\left\{{B}_{0}\left(r\right)\right\}$, with the knowledge that, should we wish to find results for the non-Ewald regular multipole expansion then we can simply set $B\left(r\right)=1/r$ in the final expressions.
In particular, the Ewald sum analogue to the normal multipole formulae requires using the kernel $B\left(r\right)=erfc\left(\alpha r\right)/r$, where erfc(x) is the complimentary error function, which corresponds to the interaction of a unit charge with a negative Gaussian ‘screening’ density, where the screening density is of the form $\rho \left(r\right)=-Aexp\left(-{a}^{2}{r}^{2}\right)$ (suitably normalised).
To simplify these gradient calculations, Smith defines the radial functions,
${B}_{m}\left(r\right)$, where the zeroth term is given by
${B}_{0}\left(r\right)=B\left(r\right)$, and with the higher order terms defined according to
from which it follows that if
${B}_{0}\left(r\right)=1/r$, then
${B}_{1}\left(r\right)=1/{r}^{3}$,
${B}_{2}\left(r\right)=3/{r}^{5}$, and, in general,
${B}_{m}\left(r\right)=\left(2m-1\right)!!/{r}^{2m+1}$, where
$!!$ is the double factorial.
As might be expected, the expressions are somewhat more complicated when the ${B}_{0}\left(r\right)=erfc\left(\alpha r\right)/r$ kernel is used, but the repeated derivatives for these functions can be readily evaluated in terms of Gaussian functions.
It is instructive to calculate the first few terms in the directional gradients of the
${B}_{0}\left(r\right)$ functions. Begin by recalling that for a general spherically symmetric function,
$f\left(r\right)$, we have that
$\partial f\left(r\right)/\partial {r}_{\alpha}={f}^{\prime}\left(r\right){r}_{\alpha}/r$. It then follows that
from which the first two directional derivatives evaluate to
and
where the terms in brackets are the results for the particular choice of
${B}_{0}\left(r\right)=1/r$, and where we note that these terms can be written in terms of the Maxwell Cartesian spherical harmonics we encountered in Equation (11).
It is of course possible to continue calculating higher-order terms in this fashion, but the main point we want to make here is that the first two directional derivatives of ${B}_{0}\left(r\right)$ above contain terms in ${B}_{l}\left(r\right)$, for some value of $l$; in general, as a consequence of the chain rule, it can be shown that gradient terms in every order can be expressed as a series in ${B}_{l}\left(r\right)$, where the $l$th derivative requires calculated functions up to the $l$th term.
Returning to the multipole problem, suppose that there are two multipole sites,
$i$ and
$j$, at locations
${\mathit{r}}^{i}$ and
${\mathit{r}}^{j}$ respectively, each of which carry a set of multipoles of different ranks, where the multipoles are placed at the site locations. Then the electrostatic energy of this pair is due to each multipole on site
$i$ interacting with every multipole on site
$j$, which can be determined by first evaluating the field at
${\mathit{r}}^{j}$ from the multipoles at
${\mathit{r}}^{i}$ according to Equation (7), and then calculating the energy of the multipoles on
$j$ in the presence of that field according to Equation (5). Labelling this energy
${U}^{ji}\left({\mathit{r}}^{ji}\right)$, we obtain
where
${\mathit{r}}^{ji}={\mathit{r}}^{j}-{\mathit{r}}^{i}$ is the inter-site vector,
${r}^{ji}=\left|{\mathit{r}}^{ji}\right|$,
${\nabla}_{j}^{\left({d}_{i}\right)}$ are the directional derivatives with respect to
${\mathit{r}}^{j}$ and where we have used the more general form containing the
${B}_{0}\left({r}^{ji}\right)$ kernel.
Similarly to how we found that every order term in the repeated derivatives of
${B}_{0}\left(r\right)$ contain a series in
${B}_{l}\left(r\right)$, it is not difficult to see that the same must be true for the interaction energy,
${U}^{ji}\left({r}^{\mathit{j}i}\right)$, and, following Smith, we can make this clear by collecting terms in
${B}_{l}\left({r}^{ji}\right)$ to write
where the
${G}_{ji}^{l}\left({\mathit{r}}^{ji}\right)$ functions can be thought of as coefficients in
${B}_{l}\left({r}^{ji}\right)$.
Evaluating these functions would seem to require calculating and summing over the
${\nabla}^{\left(n\right)}\left\{B\left(r\right)\right\}$ terms ‘by hand’, as it were, which would involve much tedious algebra. However, a closed form solution has been given by Lin [
7], (cf. equation 4.3 in Lin’s paper) and we simply state without proof that it is given by
where
${\mathit{R}}^{\left(n\right)}=\mathit{r}\mathit{r}\mathit{r}..$ (see
Section 4.1); the multipoles are all assumed to be traceless; the notation
$\mathit{A}.d.\mathit{B}$ indicates a d-fold contraction over the tensor indices of
$\mathit{A}$ and
$\mathit{B}$ (again, see
Section 4.1);
${d}_{i}$ is the number of contractions in the bracket containing
${\mathit{M}}^{i}$;
${d}_{j}$ is the number of contractions in the bracket containing
${\mathit{M}}^{j}$;
${d}_{c}$ is the number of contractions acting in the centre, between the two brackets, which we have expressed as an inner product; and where
${C}_{{d}_{i},{d}_{c},{d}_{j}}$ are integer combinatorial coefficients, given by
We also have that the sum in Equation (31) is over all ${d}_{i},{d}_{j},{d}_{c}$, where ${d}_{i}+{d}_{c}+{d}_{j}=l$; ${d}_{i},{d}_{c},{d}_{j}\ge 0$, i.e., the sum is over all possible terms having $l$ contractions.
We will refer to Equation (31) as the multipole interaction generating formula, as it generates all the terms in the multipole-multipole expansion of the electrostatic energy. Its derivation is not too difficult, if a little tedious, and essentially involves expanding out the repeated derivatives of the ${B}_{0}\left(r\right)$ functions, and using combinatorics to find the number of symmetrically equivalent terms.
The
${G}_{ji}^{l}\left(\mathit{r}\right)$ functions up to rank 3 are given in
Section 4.1, from which it is readily apparent that the total number of dots, i.e., the number of contractions, in each term of the
$l$th
${G}_{ji}^{l}\left({\mathit{r}}^{ji}\right)$ function is equal to
$l$, which is also apparent from the structure of Equation (31).
Claim 1. The traces of both the $\mathit{M}$ tensors and the $\mathit{R}$ tensors make no contribution to the multipole interaction generating formula.
Sketch of Proof: For the multipoles, this follows straightforwardly from the trace condition that all multipole tensors must be traceless. Furthermore, that the same applies to the
$\mathit{R}$ tensors follows from the fact that their indices are completely contracted with the traceless
$\mathit{M}$ tensors, and then using a similar argument to that used in
Section 1 for the traceless multipoles. Q.E.D.
We have shown that it is possible to detrace the $\mathit{R}$ tensors in the sum of Equation (31). However, we will generally prefer to work with the tensors in their simpler, untraced, form, unless we have cause otherwise.
One of the real advantages of the Smith and Lin method for deriving the multipole interactions is that it provides a clean separation between the ${G}_{ji}^{l}\left({\mathit{r}}^{ji}\right)$ functions, which depend on the traceless tensor components, and the spherically symmetric kernel dependent part, which are described by the ${B}_{l}\left({r}^{ji}\right)$ functions. This suggests that the Smith and Lin approach may provide a useful starting point for converting the multipole interaction into spherical harmonics, and later on, we shall show how this can be done.
2.5. The Diagrammatic Representation
It is possible to construct a diagrammatic representation of the multipole interaction generating formula of Equation (31), where one such diagram is shown in
Figure 1. The figure shows one term in the formula for
$l=6$, corresponding to
$({\mathit{M}}^{i\left(5\right)}\vdots {\mathit{R}}^{ji\left(3\right)}):({\mathit{M}}^{j\left(3\right)}.{\mathit{R}}^{ji\left(1\right)})$. The rules being that each node represents a different tensor, where its rank is given by the number of spokes radiating from the node in question; the number of spokes shared between two nodes is equal to the degree of the contraction acting between the corresponding tensors; and the sign of the diagram is taken to be negative when there are an odd number of bonds connecting the
$j$ multipole with its
$\mathit{R}$ tensor.
Theorem 1. The complete expression for the $l$th ${G}_{ji}^{l}\left({\mathit{r}}^{ji}\right)$ term involves the sum over all $l$-bond diagrams, where the spokes on each multipole tensor are treated as distinct, and where no tensor is allowed to bond to itself.
Proof. The total number of bonds is equal to the total number of contractions which is given by ${d}_{i}+{d}_{c}+{d}_{j}=l$, which is the range of the sum in Equation (31). Assume that the spokes on each multipole are distinct, such that it is possible to label the spokes on both multipoles. Then the ${C}_{{d}_{i},{d}_{c},{d}_{j}}$ integer coefficients of Equation (32) are seen to be the number of unique ways of arranging the spokes on both the $i$ and $j$ multipoles for the topology described by contractions ${d}_{i},{d}_{c},{d}_{j}$. Furthermore, the negative sign mentioned above is a consequence of the ${(-1)}^{{d}_{j}}$ factor in Equation (32). Q.E.D. □
We also stated that no tensor may bond to itself. Given that a self-bond corresponds to taking the partial trace of a tensor, this rule follows directly from the fact that the traces of the $\mathit{M}$ and $\mathit{R}$ tensors make no contribution to the multipole interaction generating formula, as established in the last section.
The diagrammatic representation is by no means essential for following the math in this work, but it does provide a useful mnemonic for keeping track of the various terms in the multipole interaction. Furthermore, it will be interesting to see whether such a simple picture can still be constructed when we come to convert these expressions into spherical harmonics.
2.6. Forces, Fields, Angular Derivatives and Torques
We are generally interested in more than just calculating the total energy, and for completeness, this section will detail how to calculate the forces, multipole fields, angular derivatives and torques, all of which are commonly required when implementing multipole interactions into molecular-simulation code.
We begin with the forces. Let ${\mathit{F}}^{ji}\left({\mathit{r}}_{ji}\right)=-{\mathit{F}}^{ij}\left({\mathit{r}}_{ji}\right)$ be the force on multipole site $j$ due to its interaction with multipoles site $i$, where the total force on $j$ is given by ${\mathit{F}}^{j}={\displaystyle {\displaystyle \sum}_{i}^{N}}{\mathit{F}}^{ji}\left({\mathit{r}}^{ji}\right)$.
The force is found through taking the gradient of Equation (30) and evaluates to
where we have used Equation (26) to find the gradient of the
${B}_{l}\left({r}^{ji}\right)$ function, and where the
$\partial {G}_{ji}^{l}\left({\mathit{r}}^{ji}\right)/\partial {r}^{j}$ functions are found from taking the gradient of Equation (31), and are given by
The
$\partial {G}_{ji}^{l}\left({\mathit{r}}^{ji}\right)/\partial {\mathit{r}}^{j}$ functions up to rank 3 are listed in
Section 4.2.
A diagrammatic representation of one force term is shown in
Figure 2, which corresponds to taking the gradient of the interaction from
Figure 1. Comparing
Figure 1 and
Figure 2, it can be seen that taking the gradient results in breaking one of the bonds connecting either the
$i$ or
$j$ multipole with its
$\mathit{R}$ tensor. This leaves the multipole with a bare (or unbonded) spoke, which means that, taken as a whole, we have a diagrammatic representation of a rank 1 vector.
We have already provided an expression for the total energy of the system, but an arguably more elegant expression for the energy is in terms of multipole field tensors, according to
where
${\varphi}^{i\left(n\right)}$ is the rank
n field tensor on site
$i$, which is defined from
Equation (35), which is written in terms of multipole fields has the distinct advantage that it allows for an automatic decomposition of the total energy into contributions from the different multipole ranks, i.e., charge, dipole, quadrupole etc. Furthermore, another reason why we may want to calculate the fields is that it greatly simplifies calculation of angular derivatives and torques, to be given in the next section.
The fields can be calculated by finding the derivative of the energy with respect to each multipole. Looking at just one pair of multipole sites, the field on multipole site
$j$ due to the multipoles on site
$i$ is given by
where, taking the derivative of Equation (31) with respect to the multipoles we obtain
where SYMM indicates that the resulting tensor elements are to be symmetrised over all index permutations, and for their traces to be removed.
This final step is necessary to ensure that the field tensors have the same symmetries as their corresponding multipoles, given that per their definition, the field tensors must be unchanged with respect to any permutation of their indices. Furthermore, the removal of the traces is because the field traces make no contribution to the energy, and so it makes sense that these are set to zero.
A diagrammatic representation of the field calculation is shown in
Figure 3, in which the rank 5 field on
${\mathit{M}}^{i\left(5\right)}$ and the rank 3 field on
${\mathit{M}}^{j\left(3\right)}$, corresponding to the multipole interaction in
Figure 1, are shown. Both diagrams can be thought of as cutting a multipole free from
Figure 1, which corresponds to taking the derivative with respect to that multipole.
In order that we may calculate the angular derivatives of the energy, let
${O}_{\alpha \beta}$ be orthogonal rotation matrices which transform vector components from the reference frame to the laboratory frame (see
Section 4.1). Furthermore, suppose that
${O}_{\alpha \beta}={O}_{\alpha \beta}\left(\varphi ,\theta ,\psi \right)$, where
$\left(\varphi ,\theta ,\psi \right)$ are (three) Euler angles (e.g., see ref. [
19]) and we wish to find the energy derivatives with respect to these angles. Taking the Euler angle
$\psi $ as an example, use the chain rule to obtain the contribution from the rank n multipole as
where we have used Equation (78) from
Section 4.1 to relate the components of the multipole tensor to their components in the reference frame, and where the factor of
$n$ in the above derives from the index permutation symmetry of
${\varphi}_{\alpha \beta \gamma ..}^{\left(n\right)}$ and
${M}_{\alpha \beta \gamma ..}^{ref\left(n\right)}$, such that it can be shown that each of the
n terms involving derivatives of the rotation matrices are identical, and so can be added together.
The torques may be evaluated from the energy derivatives with respect to rotations about each axis. First consider an infinitesimal rotation by
$\Delta {\theta}_{\alpha}$ about the
$\alpha $ axis. If a vector
$\mathit{r}$ has the value
$\mathit{r}\left(\Delta {\theta}_{\alpha}\right)$ after the rotation, then we have
$\mathit{r}\left(\Delta {\theta}_{\alpha}\right)=\mathit{r}+\Delta {\theta}_{\alpha}\hat{\alpha}\times \mathit{r}$, which has components
where
${\u03f5}_{\alpha \beta \gamma}$ is the Levi-Civita symbol,
${\u03f5}_{xyz}={\u03f5}_{yzx}={\u03f5}_{zxy}=1$,
${\u03f5}_{zyx}={\u03f5}_{yxz}={\u03f5}_{xzy}=-1$, and
$\u03f5=0$ otherwise.
It will prove useful to recast Equation (40) in terms of a rotation matrix. We have
${r}_{\beta}\left(\Delta {\theta}_{\alpha}\right)={\displaystyle \sum _{\gamma}}{O}_{\beta \gamma}\left(\Delta {\theta}_{\alpha}\right){r}_{\gamma}$, where
${O}_{\beta \gamma}\left(\Delta {\theta}_{\alpha}\right)$ is given by
The components of the torque,
${t}_{\alpha}$, are obtained from substitution of the above into Equation (39) to give
where we have used
$\partial {O}_{\beta \gamma}\left({\theta}_{\alpha}\right)/\partial {\theta}_{\alpha}={\u03f5}_{\beta \alpha \gamma}$, and
${O}_{\beta \gamma}\left(0\right)={\delta}_{\beta ,\gamma}$, both of which can be deduced from Equation (41).
A diagrammatic representation of the torque for one multipole is given in
Figure 4.
2.7. Polynomials and Symmetric Tensors
This section explores the deep connection between homogeneous polynomials and symmetric tensors, covering similar ground to the treatment by Applequist [
15].
We begin with introducing some useful notation. Suppose that
${A}_{\alpha \beta \gamma ..}^{\left(n\right)}$ is a symmetric Cartesian tensor component where the
$\alpha \beta \gamma ..$ index contains
${n}_{x}$ occurrences of
$x$,
${n}_{y}$ occurrences of
$y$ and
${n}_{z}$ occurrences of
$z$. Defining
$\mathit{n}=\left({n}_{x},{n}_{y},{n}_{z}\right)$, it will be useful to introduce what Applequist refers to as compressed tensor notation, in which we write
where, in total, there is a multinomial of
$n!/\left({n}_{x}!{n}_{y}!{n}_{z}!\right)$ permutations of the
$\alpha \beta \gamma ..$ indices belonging to a particular
$\mathit{n}$.
Now suppose that ${p}^{\left(n\right)}\left(\mathit{r}\right)$ is a homogeneous polynomial of degree n, such that ${p}^{\left(n\right)}\left(\lambda \mathit{r}\right)={\lambda}^{n}{p}^{\left(n\right)}\left(\mathit{r}\right)$, where $\lambda $ is a scalar. For example, a degree 3 homogenous polynomial is given by ${p}^{a\left(3\right)}\left(\mathit{r}\right)=4{x}^{3}+2{y}^{2}x-5{y}^{3}$.
To express ${p}^{\left(n\right)}\left(\mathit{r}\right)$ in general form, we use compressed notation to write ${R}_{n}^{\left(n\right)}={x}^{{n}_{x}}{y}^{{n}_{y}}{z}^{{n}_{z}}$, and we will label the monomial coefficient in the ${R}_{n}^{\left(n\right)}$ component as ${\overline{P}}_{n}^{\left(n\right)}$. The bar is used to signify that the ${\overline{P}}_{n}^{\left(n\right)}$ are monomial coefficients, although we will shortly see that the ${\overline{P}}_{n}^{\left(n\right)}$ can also be interpreted as tensor components.
In terms of the polynomial coefficients
${\overline{P}}_{n}^{\left(n\right)}$, we can express any degree n homogeneous polynomial as
where the sum is over all values of
$\mathit{n}=\left({n}_{x},{n}_{y},{n}_{z}\right)$ for which
${n}_{x}+{n}_{y}+{n}_{z}=n$, and where we have also introduced the symmetric Cartesian tensor,
${\mathit{P}}^{\left(n\right)}$.
To clarify the above, it may help to take an example. The second order polynomial,
${p}^{\left(2\right)}\left(\mathit{r}\right)$, can be written as either
or
where we have used the permutation symmetry of
${\mathit{P}}^{\left(2\right)}$.
Equating terms in the polynomial, we have
${\overline{P}}_{\left(2,0,0\right)}^{\left(2\right)}={P}_{xx}^{\left(2\right)}$, and
${\overline{P}}_{\left(1,1,0\right)}^{\left(2\right)}=2{P}_{xy}^{\left(2\right)}$, and similarly for other terms. In general,
${P}_{\alpha \beta \gamma ..}^{\left(n\right)}$ is related to
${\overline{P}}_{n}^{\left(n\right)}$ by
where the inverse multinomial coefficients are required due to the permutation symmetry of the
$\alpha \beta \gamma ..$ indices.
We can also form inner products,
${\u27e8{\mathit{P}}^{\left(n\right)},{\mathit{A}}^{\left(n\right)}\u27e9}_{t}$, which can be evaluated as
Given that ${\mathit{P}}^{\left(n\right)}$ encodes all the information in the polynomial coefficients, i.e., there is a one to one mapping between the polynomial coefficients and the tensor components, we can think of ${\mathit{P}}^{\left(n\right)}$ as being the symmetric tensor form of ${\overline{P}}_{n}^{\left(n\right)}$.
Technically,
${\mathit{P}}^{\left(n\right)}$ is a so-called
covariant tensor, which transforms under rotations in the opposite sense to
${\mathit{R}}^{\left(n\right)}$. To see why, consider a rotation of the coordinate system
where
${\mathit{O}}^{\left(n\right)}$ is the orthogonal tensor rotation matrix,
${O}_{\alpha \beta \gamma ..,{\alpha}^{\prime}{\beta}^{\prime}{\gamma}^{\prime}}^{\left(n\right)}={O}_{\alpha {\alpha}^{\prime}}{O}_{\beta {\beta}^{\prime}}{O}_{\gamma {\gamma}^{\prime}}..$ (See
Section 4.1.)
This shows that we can find $p\left(\mathit{O}\mathit{r}\right)$ from either transforming the components of ${\mathit{R}}^{\left(n\right)}$ by the usual orthogonal rotation ${\mathit{O}}^{\left(n\right)}$, or, we can do it by transforming the components of ${\mathit{P}}^{\left(n\right)}$ by the inverse rotation ${\mathit{O}}^{\left(n\right)}{}^{T}={\mathit{O}}^{\left(n\right)}{}^{-1}$.
The
${\mathit{P}}^{\left(n\right)}$ tensor may also be obtained from the polynomial itself by way of a linear operator,
$\hat{L}=\left(1/n!\right){\nabla}^{\left(n\right)}$, according to
which may be seen by applying the above gradient operator to Equation (44). Note that because we are taking n derivatives of a degree-n homogeneous polynomial that the LHS above is independent of
$\mathit{r}$.
Furthermore, we have that the inverse to Equation (50) is given by
We now introduce the so-called harmonic polynomials, ${h}^{\left(n\right)}\left(\mathit{r}\right)$, which are a subset of the ${p}^{\left(n\right)}\left(\mathit{r}\right)$ polynomials which have a vanishing Laplacian, ${\nabla}^{2}{h}^{\left(n\right)}\left(\mathit{r}\right)=0$. As an example, one such polynomial is given by ${h}^{a\left(3\right)}=2{x}^{3}z+3{x}^{2}y-6x{y}^{2}z-{y}^{3}$, for which it may be confirmed that $\Delta {h}^{a\left(3\right)}=0$.
The harmonic polynomials are of particular interest, as their corresponding symmetric tensors, given by
${\mathit{H}}^{\left(n\right)}=\left(1/n!\right){\nabla}^{\left(n\right)}{h}^{\left(n\right)}\left(\mathit{r}\right)$, are traceless, due to the vanishing Laplacian condition on
${h}^{\left(n\right)}\left(\mathit{r}\right)$, i.e.,
Thus, the linear operator, $\hat{L}=\left(1/n!\right){\nabla}^{\left(n\right)}$, together with its inverse in Equation (51), establishes an isomorphism between the vector spaces of harmonic polynomials of degree n, and symmetric traceless tensors of rank n.
The harmonic polynomials, or equivalently the rank n traceless tensors are spanned by 2n + 1 linearly independent vectors. To see this, first consider the components, ${A}_{\mathit{n}}$, of a rank n symmetric Cartesian tensor for which it is a simple matter to show that there are ${N}_{s}=\left(n+1\right)\left(n+2\right)/2$ possible values of $\mathit{n}=\left({n}_{x},{n}_{y},{n}_{z}\right)$ for which ${n}_{x}+{n}_{y}+{n}_{z}=n$. Furthermore, the trace condition imposes ${N}_{t}=n\left(n-1\right)/2$ constraints. To see this, take the trace tensor, $Tr\left({\mathit{A}}^{\left(n\right)}\right)$ from Equation (9), which is a degree n − 2 symmetric tensor, having components $Tr{({\mathit{A}}^{\left(n\right)})}_{\alpha \beta \gamma ..}={\displaystyle \sum _{\chi}}{A}_{\alpha \beta \gamma ..\chi \chi}^{\left(n\right)}$, where clearly, each component is independent, and as such the trace tensor is described by $n\left(n-1\right)/2$ linearly independent vectors, all of which must be independently equal to zero in order that ${\mathit{A}}^{\left(n\right)}$ be traceless. This leaves us with ${N}_{st}={N}_{s}-{N}_{t}=2n+1$ degrees of freedom for both the rank n traceless tensors and the degree n harmonic polynomials.
2.8. Spherical Harmonics as a Basis for Traceless Symmetric Tensors
The discussion at the end of the last section referred to the fact that symmetric traceless tensors can in principle be spanned by a minimal set of 2n + 1 linearly independent vectors. It would obviously be advantageous to work in a representation in which just this number of components are used, and in this section we shall show how this can be done using spherical harmonics, which provide a natural orthonormal basis for traceless tensors.
We begin by defining what we will refer to as the spherical inner product,
$\u27e8,{\u27e9}_{s}$, not to be confused by our tensor inner product,
$\u27e8,{\u27e9}_{t}$, which is given by
where the superscripts
$a$ and
$b$ label two homogeneous polynomials of order
m and
n, respectively, and we have switched to spherical coordinates. (Also note, here we are using real polynomials, but either one of
${p}^{a\left(n\right)}$, or
${p}^{b\left(n\right)}$ in the integral would need to be replaced by its complex conjugate in the full complex case, in order that
$\u27e8,{\u27e9}_{s}$ be a true inner product.)
We now state a theorem which will allow us to convert between the tensor inner product of traceless tensors and the spherical inner product of harmonic polynomials.
A Theorem on the Equivalence of Two Inner Products
Suppose that
${h}^{\left(n\right)}\left(\mathit{r}\right)$ is a degree n harmonic polynomial, and
${\mathit{H}}^{\left(n\right)}=\left(1/n!\right){\nabla}^{\left(n\right)}{h}^{\left(n\right)}\left(\mathit{r}\right)$ is its traceless tensor equivalent, then the spherical inner product,
$\u27e8{h}^{a\left(n\right)},{h}^{b\left(n\right)}{\u27e9}_{s}$, and the tensor inner product,
$\u27e8{\mathit{H}}^{a\left(n\right)},{\mathit{H}}^{b\left(n\right)}{\u27e9}_{t}$, are in a constant ratio for each rank, according to
The above theorem may seem like it would be easy to derive through standard algebraic methods, but in fact, it is surprisingly hard to obtain and our derivation ended up being quite technical. Thus, we will leave the mathematical details to
Section 4.4 through
Section 4.6;
Section 4.5, we show that the two inner products are proportional, whilst in
Section 4.6 we derive the proportionality constant. We should also note that similar expressions to Equation (54) have been developed by Ehrentraut and Muschik, [
20] (especially Section 4 in this reference) although using a quite different approach to the one taken here.
From the theory of spherical harmonics, a complete orthogonal basis (with respect to the spherical inner product) for degree harmonic polynomials is provided by the
spherical harmonics (technically, the regular solid harmonics), which comprise a set of 2n + 1 real harmonic polynomials orthogonal over the unit sphere, such that, writing the spherical harmonic polynomials as
${q}^{i\left(n\right)}\left(\mathit{r}\right)$, we have that
where
$\parallel {q}^{i\left(m\right)}{\parallel}_{s}=\sqrt{\u27e8{q}^{i\left(m\right)},{q}^{i\left(m\right)}{\u27e9}_{s}}$ is the norm of
${q}^{i\left(m\right)}\left(\mathit{r}\right)$, and similarly for
${q}^{j\left(n\right)}\left(\mathit{r}\right)$.
A note on the nomenclature. Strictly speaking, a spherical harmonic can be used to describe any harmonic polynomial confined to the unit sphere. However, here we will use the term spherical harmonic polynomial to refer specifically to the set of ${q}^{i\left(n\right)}\left(\mathit{r}\right)$ polynomials, which are orthogonal over the unit sphere.
By application of Equation (50), we can also define the traceless tensor form of the spherical harmonics, which we will call
${\mathit{Q}}^{i\left(n\right)}$, from
which, using Equation (51), has an inverse given by
The spherical harmonic polynomials up to rank 3 are given in
Table 1, adapted from Stone [
13].
Given the orthogonality of the spherical harmonic polynomials from Equation (55) and given the inner product equivalence from Equation (54), we also have that, under suitable normalisation,
A note on the normalisation: Here, we are choosing the normalisation such that Equation (58) above holds, i.e., $\parallel {\mathit{Q}}^{i\left(n\right)}{\parallel}^{2}=1$. So that the ${q}^{i\left(n\right)}\left(\mathit{r}\right)$ are consistent with the ${\mathit{Q}}^{i\left(n\right)}$ according to Equation (56), we have, by way of Equation (54), that the ${q}^{i\left(n\right)}\left(\mathit{r}\right)$ polynomials should be normalised according to $\parallel {q}^{i\left(n\right)}{\parallel}^{2}=4\pi n!/\left(2n+1\right)!!$.
Thus the
${\mathit{Q}}^{i\left(n\right)}$ provide an orthogonal basis for the traceless symmetric rank n tensors, and as such, we can express any symmetric traceless rank n tensor as a linear sum in
${\mathit{Q}}^{i\left(n\right)}$, according to
and taking the inner product of both sides of Equation (59) above with respect to
${\mathit{Q}}^{i\left(n\right)}$ shows that
${A}_{i}^{\left(n\right)}$, the components of
${\mathit{A}}^{\left(n\right)}$ in the spherical harmonic basis are given by
where we use the convention that spherical harmonic components are to be indexed by modern roman lower case letters, as opposed to Greek for the Cartesian indices.
Conversion of traceless tensors from Cartesians to spherical harmonics according to Equation (60) is perhaps most easily done through consulting tables of spherical harmonics polynomials, such as those given by Stone [
13]. Furthermore, for convenience, the spherical harmonic polynomials up to rank 3 are listed in
Table 1, which is adapted from Stone.
As an example, from
Table 1 (
Section 4.7), we have that the spherical harmonic
${q}^{4\left(3\right)}=\sqrt{3/2}z\left({x}^{2}-{y}^{2}\right)$, which gives,
${A}_{4}^{\left(3\right)}=\sqrt{3/2}\left({A}_{zxx}-{A}_{zyy}\right)$, and given that
${q}^{1\left(2\right)}=\left(\sqrt{6}/6\right)\left(3{z}^{2}-{r}^{2}\right)$, we have that
${A}_{1}^{\left(2\right)}=\left(\sqrt{6}/6\right)\left(3{A}_{zz}-\left({A}_{xx}+{A}_{yy}+{A}_{zz}\right)\right)$, and so on.
The spherical harmonic representation comes in particularly useful for calculating inner products; for, we can use the orthogonality of spherical harmonics to write
Thus, it can be seen that calculating an inner product in spherical harmonics requires the minimal 2
n + 1 operations for that rank, which is an enormous saving over the
${3}^{n}$ multiplications required for naively multiplying all of the
${A}_{\alpha \beta \gamma ..}^{\left(n\right)}{B}_{\alpha \beta \gamma ..}^{\left(n\right)}$ matrix Cartesian components together [
21,
22].
We have seen how the ${Q}_{n}^{i\left(n\right)}$ coefficients allow for transformation of Cartesians into spherical harmonics. There is also an inverse transformation, given by ${R}_{k}^{\alpha \beta \gamma ..\left(n\right)}$, which are the components of ${R}_{\alpha \beta \gamma ..}^{\left(n\right)}$ projected onto the spherical harmonic basis, and which can be used to transform the components in the spherical harmonic basis back to Cartesians.
Furthermore, once again, Stone has provided tables which give a very convenient way for carrying out these transformations;
Table 2, which is also adapted from Stone, provides the relevant information.
As an example, from
Table 2, we have that
$x{y}^{2}=-\left(1/2\right){q}^{6\left(3\right)}-\left(\sqrt{15}/30\right){q}^{2\left(3\right)}$, from which it follows that, if we have spherical harmonic components
${A}_{k}^{\left(3\right)}$, then
${A}_{xyy}^{\left(3\right)}=-\left(1/2\right){A}_{6}^{\left(3\right)}-\left(\sqrt{15}/30\right){A}_{2}^{\left(3\right)}$.
We conclude this section with a discussion of what Applequist [
15] has called the detracing operator.
Suppose that
${\mathit{T}}^{\left(n\right)}$ is an in-general non-traceless symmetric tensor. Given that the spherical harmonics form a complete orthonormal basis for the subspace of traceless rank n tensors, the detracing operator
${\hat{D}}_{n}$ can be written in spherical harmonics as
One application of
$\hat{D}$ is in particular worth noting. Applying the above expression for
${\mathit{R}}^{\left(n\right)}$, and using Equation (57) to make the substitution
${\u27e8{\mathit{R}}^{\left(n\right)},{\mathit{Q}}^{i\left(n\right)}\u27e9}_{t}={q}^{i\left(n\right)}\left(\mathit{r}\right)$ gives via Equation (60) that
where
${\mathit{R}}^{t}\left(n\right)=\hat{D}\mathit{R}$ are the traceless projections of the
${\mathit{R}}^{\left(n\right)}$ tensors (see Equation (11)), and where we have used
${\u27e8{\mathit{R}}^{\left(n\right)},{\mathit{Q}}^{i\left(n\right)}\u27e9}_{t}={\u27e8\hat{D}{\mathit{R}}^{\left(n\right)},{\mathit{Q}}^{i\left(n\right)}\u27e9}_{t}$, which holds because only the traceless component of
${\mathit{R}}^{\left(n\right)}$ can contribute to the inner product with the traceless
${\mathit{Q}}^{i\left(n\right)}$.
Thus, the spherical harmonic components of the Maxwell Cartesian spherical harmonics are just the spherical harmonic polynomials themselves.
3. Discussion
3.1. The Multipole Interaction in Spherical Harmonics
In this section, we will aim to convert the various expressions so far developed in Cartesians into their spherical harmonic equivalents.
In the last section, we showed how to convert inner products into spherical harmonics. Furthermore, in this regard, it is unfortunate the multipole interaction generating formula of Equation (31) cannot be expressed entirely in terms of such products, involving as it does problematic contractions of the form ${\mathit{C}}^{\left({d}_{c}\right)}={\mathit{M}}^{\left({d}_{i}+{d}_{c}\right)}.{d}_{i}.{\mathit{R}}^{\left({d}_{i}\right)}$.
However, even when dealing with such contractions, there is a way of still using the inner product method, which we shall now describe. We introduce what we will refer to as the
split-component representation of a symmetric tensor, using the notation
${\mathit{T}}^{\left({n}_{a},{n}_{b}\right)}$, where
$n={n}_{a}+{n}_{b}$ is the full rank of the tensor,
${\mathit{T}}^{\left({n}_{a}+{n}_{b}\right)}$, of which
${\mathit{T}}^{\left({n}_{a},{n}_{b}\right)}$ is but one representation. Taking
${n}_{a}=3$,
${n}_{b}=2$ as an illustrative example, we write the symmetric traceless multipole tensor,
${\mathit{M}}^{\left(3,2\right)}$, which has Cartesian components
where
${\mathit{M}}^{\left(3,2\right)}$ transforms as a symmetric traceless Cartesian tensor with respect to (i) its before-comma components, (ii) its after-comma components, and (iii) in all its components as a whole. In this representation, an example contraction can now be written as
which behaves like an inner product with respect to the after-comma components, and, as such, can be readily evaluated in spherical harmonics.
We proceed by separately transforming the before-comma and after-comma components of
${M}_{\alpha \beta \gamma ,\delta \u03f5}^{\left(3,2\right)}$ into spherical harmonics, and, using transformations of the sort described by Equation (60), we have that
where
${Q}_{\delta \u03f5}^{j\left(2\right)}$ is used to transform
${M}_{\alpha \beta \gamma ,\delta \u03f5}^{\left(3,2\right)}\to {M}_{\alpha \beta \gamma ,k}^{\left(3,2\right)}$, and
${Q}_{\alpha \beta \gamma}^{i\left(3\right)}$ is used to transform
${M}_{\alpha \beta \gamma ,k}^{\left(3,2\right)}\to {M}_{j,k}^{\left(3,2\right)}$. (As discussed in the last section, these transformations are easiest done by way of tables of spherical harmonics, suitably implemented into code.)
Also, and referring to the discussion of the detracing operator of Equation (62), the
${\mathit{R}}^{\left(n\right)}$ tensor components are transformed as
${R}_{\alpha \beta}^{\left(n\right)}\to {R}_{i}^{t\left(n\right)}={q}^{i\left(n\right)}\left(\mathit{r}\right)$, and so the desired contraction can now be written in spherical harmonics as
Notes:
- (i)
The concept of a split component representation can be made quite general. If we wished to, we could use a mixed Cartesian-spherical harmonic representation, such as ${M}_{\alpha \beta \gamma ,k}^{\left(3,2\right)}$, or we could choose to use more than one; for instance, ${M}_{a,b,c}^{\left(3,2,1\right)}$ is a valid split of ${\mathit{M}}^{\left(6\right)}$. However, no matter how we choose the split, or the base, it is still referring to the same underlying tensor, and, if necessary, one can always recover all the original components from the by taking the appropriate inverse transformations.
- (ii)
The rank-1 spherical harmonics are just x, y and z (see
Table 1, in
Section 4.7), from which it follows that a rank 1 tensor has spherical harmonic components
${T}_{a}^{\left(1\right)}={T}_{x}^{\left(1\right)},{T}_{y}^{\left(1\right)},{T}_{z}^{\left(1\right)}$, the same as in Cartesians. Furthermore, in general, we have that
${T}_{\alpha ,\beta ,\gamma ,.}^{\left(1,1,1,..\right)}={T}_{\alpha \beta \gamma ..}^{\left(n\right)}$.
- (iii)
The split component representation is symmetric with regards to any permutation of its components, e.g., ${T}_{a,b}^{\left(m,n\right)}={T}_{b,a}^{\left(n,m\right)}$, and ${T}_{a,b,c}^{\left(l,m,n\right)}={T}_{c,a,b}^{\left(n,l,m\right)}={T}_{c,b,a}^{\left(n,m,l\right)}$, and so on.
- (iv)
The transformations can all be done by way of the table method explained in the last section. That is, we do not have to carry out tedious matrix multiplications, but can instead just use
Table 1 suitably implemented into code to convert the Cartesians into spherical harmonic components.
At this stage it will prove useful to return to the diagrammatic representation.
Figure 5’s top illustrates the equivalence between different representations of the tensors in spherical harmonics and Cartesian coordinates. The example given is of a traceless symmetric rank 4 tensor, which can be represented as either
${\mathit{T}}^{\left(1,1,1,1\right)}$, or
${\mathit{T}}^{\left(2,1,1\right)}$, or
${\mathit{T}}^{\left(2,2\right)}$, or
${\mathit{T}}^{\left(3,1\right)}$, or
${\mathit{T}}^{\left(4\right)}$, where the Cartesian coordinates are, as usual, represented by spokes, and where the transformation to spherical harmonics is depicted by
braiding any number of spokes together. Of course, this is just a visual metaphor, but it is intended to convey how the transformation into spherical harmonics intertwines (through taking linear combinations of) multiple Cartesian indices into one spherical harmonic index.
The next line, in
Figure 5’s middle, shows how energy term,
$({\mathit{M}}^{i\left(5\right)}\vdots {\mathit{R}}^{ji\left(3\right)}):({\mathit{M}}^{j\left(3\right)}.{\mathit{R}}^{ji\left(1\right)})$, depicted in
Figure 1, can be converted to spherical harmonics, through performing the braidings
${M}_{\alpha \beta \gamma \delta \u03f5}^{i\left(5\right)}\to {M}_{a,b}^{i\left(3,2\right)}$,
${M}_{\alpha \beta \gamma}^{j\left(3\right)}\to {M}_{a,b}^{j\left(2,1\right)}$,
${R}_{\alpha \beta}^{ji\left(1\right)}\to {q}^{a\left(1\right)}\left({\mathit{r}}^{ji}\right)$ and
${R}_{\alpha \beta \gamma}^{ji\left(3\right)}\to {q}^{a\left(3\right)}\left({\mathit{r}}^{ji}\right)$, and then calculating the contractions according to
Furthermore, the final diagrammatic equation, in
Figure 5’s bottom, shows how to convert the gradient of the above term into spherical harmonics, which requires the additional braiding
${M}_{\alpha \beta \gamma \delta \u03f5}^{i\left(5\right)}\to {M}_{a,\nu ,c}^{i\left(2,1,2\right)}$.
The methods developed here can be used to transform any contraction, and thus, we are now in a position to transform the entire multipole interaction, energies and forces and fields, into spherical harmonics. We begin with the multipole interaction generating formula of Equation (31), which in spherical harmonics is given by
The spherical harmonic analogue to Equation (34), which gives the gradient terms necessary for the force calculations (per Equation (31)) is given by
and the spherical harmonic analogue to Equation (38), which gives the derivatives necessary for the multipole fields (per Equation (37)) is given by
This last expression needs some explanation. Each term in the sum has a tensor representation of type ${T}_{a,c}^{\left({d}_{c},{d}_{j}\right)}$, but given that the sum is over ${d}_{c},{d}_{j}$, the quantity in square brackets will result in a sum in different split component representations, e.g., for rank 4, the sum will have the form ${\mathit{S}}^{\left(4\right)}={c}_{4}{\mathit{T}}^{\left(4\right)}+{c}_{3,1}{\mathit{T}}^{\left(3,1\right)}+{c}_{2,2}{\mathit{T}}^{\left(2,2\right)}$, where the representations do not in general refer to symmetric tensors. However, once the full sum has been evaluated, it can be symmetrised by converting the result back into Cartesians, before averaging over all permutations of the Cartesian indices.
The electrostatic potential at
$\mathit{r}$, from a multipole expansion at the origin can be found from the above by taking the rank 0 multipole derivative and then using Equation (37) to obtain
where the last term can be derived from the first, or obtained from taking the rank 0 multipole derivative of Equation (38), and where we note that it reduces to Equation (22) for the kernel
${B}_{0}\left(r\right)=1/r.$Finally, the spherical harmonic analogue for Equation (42), giving the total torque on a multipole site is given by
Equations (69)–(73) then comprise our final expressions for the multipole interaction in spherical harmonics, in a form suitable for implementation into the Ewald sum. Here, we should admit that we have not given spherical harmonic equivalents for rotation of the multipoles, or for calculation of the angular derivatives (Equation (39)). We currently prefer to keep these in Cartesians for simplicity, but given that both these calculations can be performed outside the main ij particle loop, there is no significant computational cost to their calculation.
3.2. Implementation
It may be a cause for worry that implementing the expressions in the last section is technically very difficult, or computationally costly; however, neither of these things is true.
As far as the implementation goes, it is true that, due to the amount of ‘book-keeping’ required, implementing the multipole interaction in either Cartesians or spherical harmonics is a moderately difficult coding task, but implementing Equations (69)–(73), does not need to be any more difficult than implementing the same sums in Cartesians.
As for computational cost, note that, contrary to a perennial myth, implementation of the multipole interaction in spherical harmonics does not require the calculation of any expensive trigonometric functions. All of the necessary coordinate transformations can be done using
Table 1, suitably implemented into code, obviating the need for any explicit matrix multiplication or calculation of trigonometric functions (see
Section 4.7). It is true that the spherical harmonic transformations may need to be evaluated afresh each step of a simulation, but the required transformations can be done exclusively outside the main
ij particle loop, which almost always takes up the vast majority of time in a calculation. These transformations can be done in a time that’s linear with the number of particles, and which, in any case require no operations more complex than the multiplication of real numbers. Furthermore, the final expressions in spherical harmonics require fewer operations than their Cartesian equivalents to calculate, thus providing an overall saving in computational time.
Implementation is greatly aided by the use of tests to verify the results at each stage.
We begin with the energies, where numerical differentiation can be used to check that the multipole interactions are giving the right energies for each rank.
Suppose we are confident that the multipole interactions are accurate to rank n. Then, it can be checked that the rank n + 1 multipoles are also giving the right results by comparing the analytic energies for rank n + 1 multipoles against numerical results found from numerical differentiation of the rank n multipoles.
One way of doing this is, given rank-n multipole ${\mathit{M}}^{\left(n\right)}$, we construct a fictitious diatom of bond-length $\Delta \mathit{r}$, in which the first site holds a multipole ${\mathit{M}}^{\left(n\right)}/\Delta r$, and the second site holds a multipole $-{\mathit{M}}^{\left(n\right)}/\Delta r$. Then, as $\Delta \mathit{r}\to \mathit{0}$, a multipole expansion of the diatom as a whole will give a pure rank n + 1 multipole moment. We can now check to see how the energy of this diatom in the field of other rank $\le $ n multipoles compares to the energy of the system when replacing it by an analytic multipole of rank n + 1, where the analytic multipole is assigned the rank n + 1 multipole moments of the diatom.
If $\Delta \mathit{r}$ is made small enough, it should be possible to obtain exact agreement up to numerical precision. Furthermore, in this way, it is possible to boot-strap our way to checking multipoles of arbitrary rank. We begin with charges, which can be added together to make numerical dipoles. Furthermore, once the analytic dipoles are confirmed, pairs of analytic dipoles can be added to together to make numerical quadrupoles, and so on.
We have implemented our expressions for the energies, fields, forces and torques in spherical harmonics into an Ewald sum code, going up to rank 3 in the multipole expansion. The analytic forces’ torques and angular derivatives were checked by comparison to numerical derivatives.
The reciprocal space and self-interaction parts of the Ewald sum are given in
Section 4.7, and given that both these terms involve simple to convert inner products over the multipoles, it is absolutely straightforward to convert the Cartesian multipole form of these expressions as given by Smith [
1] into their spherical harmonic equivalents.
We tested our code on a system of 32 molecules each containing 26 nuclei and interacting under periodic boundary conditions, and found about a 22% speed-up upon converting the full Ewald sum to spherical harmonics for multipoles up to rank 3, where the Cartesian form had already been heavily optimised to remove all obviously redundant operations. This is a not insignificant saving, and the difference would only be expected to grow with increasing rank.
We have made a copy of our code available on the Internet [
21]. It includes all the gradient tests mentioned in this section, and also includes the aforementioned multipole consistency test, in which numerical rank
n + 1 multipoles are created from displacing rank n multipoles.
3.3. Scaling
The implementation of site multipole expansions does not alter the fundamental scaling with respect to the number of particles over that of a calculation involving just point charges, but there is a scaling with respect to the maximum multipole rank used in the expansion, and it is to this we now turn.
To obtain an estimate of this scaling, we will try enumerating the number of multiplications involved in calculating the ${G}_{ji}^{l}\left(\mathit{r}\right)$ functions from Equation (31) up to a given rank.
Recall that the inner product of two rank-n tensors requires a minimum of
$\left(n+1\right)\left(n+2\right)/2$ multiplications in Cartesians (accounting for permutation symmetry), and 2
n + 1 multiplications in spherical harmonics. Then, the inner product of Equation (31), which has
${d}_{i}$ left-bracket contractions,
${d}_{j}$ right-bracket contractions, and
${d}_{c}$ between-bracket contractions requires
multiplications in Cartesians and
multiplications in spherical harmonics, where the
$-{\delta}_{{d}_{i},0}-{\delta}_{{d}_{j},0}$ terms in the above two expressions arise from the fact that if
${d}_{i}=0$, then the corresponding
${\mathit{R}}^{ji}$ tensor is equal to unity and no multiplication is required (and similarly for
${d}_{j}=0$).
To obtain the total number of multiplications involved in calculating the
${G}_{ji}^{l}\left(\mathit{r}\right)$ functions for multipoles up to a given rank, we wrote a simple code to sum the values
where
$X\left(n\right)$ is the total number of multiplications, and where we are only summing over terms with multipole ranks
${d}_{i}+{d}_{c},{d}_{j}+{d}_{c}\le n$.
Finally, to obtain the scaling, we fit the curves $\Delta X\left(n\right)=X\left(n\right)-X\left(0\right)$ up to a maximum rank of $n=8$, with the form $\Delta X=a{n}^{s}$, where $a$,$s$ are fitting coefficients, with $s$ being the scaling power.
The result of this exercise was that we found $\Delta {X}^{cart}\left(n\right)=3.5{n}^{3.8}$ for Cartesians and $\Delta {X}^{SH}\left(n\right)=5.8{n}^{3.3}$ in spherical harmonics. Thus, the spherical harmonics are expected to have better scaling than the Cartesian case ($s=$ 3.3 for spherical harmonics vs. 3.8 for Cartesians.)
It is hard to imagine many users would want to go beyond n = 8, but repeating the above in the range n = 1…16 gives a scaling of $s=$ 3.6 for spherical harmonics vs. 4.3 for Cartesians.
This analysis is admittedly quite crude. It does not consider the cost of calculating the forces, the fields, or the cost of array look-ups. It also ignores the fact that at least some of the inner products occur more than once in the calculation of the ${G}_{ji}^{l}\left(\mathit{r}\right)$ functions, and so only need to be evaluated once and stored for later use. In light of this, it is worth discussing how our actual implementations perform.
To this end, we ran a 32-molecule test case with each molecule having 26 atoms, for both periodic and non-periodic boundary conditions, and for both spherical harmonics and Cartesian implementations of the multipole interactions, where the periodic simulations employed a full Ewald sum.
We recognise that some groups will be interested in calculations using much larger system sizes, but our code is optimised for crystal-structure prediction using relatively small numbers of molecules, as it often makes sense to look for crystal structures with relatively small simulation cells. Here, we also mention that, unlike many codes, our code allows for arbitrary cut-off radii, [
22] where the cut-off sphere is allowed to be larger than can fit in the simulation cell, which means that we can converge energies for even small unit cells, and in this case we employed a real space cut-off of 16
$\mathsf{\AA}$.
For the periodic calculation we obtained scalings of $s$ = 3.7 for spherical harmonics, and $s$ = 4.5 for Cartesians, and for the non-periodic calculation we obtained s = 2.7 for spherical harmonics and $s$ = 3.3 for Cartesians; results that are not too dissimilar from our relatively crude theoretical predictions.
4. Materials and Methods
We discuss in this section, by way of technical sub-sections, the underlying intellectual and mathematical infrastructure underpinning the above-discussed novel contributions of the present work outlined in
Section 2 and discussed further in
Section 3, before summarising and concluding the article in
Section 5 below.
4.1. Some Mathematical Properties of Cartesian Tensors and Notation
Let $v$ be a Cartesian vector in R^{3}, which has Cartesian components $v.\hat{\alpha}={v}_{\alpha}$, where $\alpha $ is one of $x$, $y$ or $z$, and $\hat{\alpha}$ is a unit vector in the $\alpha $ direction. If $x$, $y$, $z$ is an orthogonal axis set, then $\hat{\alpha}.\hat{\beta}={\delta}_{\alpha ,\beta}$, where ${\delta}_{\alpha ,\beta}=1$ if $\alpha =\beta $, and ${\delta}_{\alpha ,\beta}=0$ otherwise. Thus, we can write $\mathit{v}={\displaystyle \sum _{\alpha}}{v}_{\alpha}\hat{\alpha}$, where the sum is over the three directional indices, $x$, $y$ and $z$.
Now suppose we also have a reference set of axes, given by
${\hat{x}}^{ref}$,
${\hat{y}}^{ref}$ and
${\hat{z}}^{ref}$, where
${\hat{\alpha}}^{ref}.{\hat{\beta}}^{ref}={\delta}_{\alpha ,\beta}$, then
${O}_{\alpha \beta}=\hat{\alpha}.{\hat{\beta}}^{ref}$ is the orthogonal 3 × 3 rotation matrix, which takes the reference-frame components to the components in the laboratory frame via
Now, let
${\mathit{T}}^{\left(n\right)}={\sum}^{}_{\alpha \beta ..}{T}_{\alpha \beta \gamma ..}^{\left(n\right)}\hat{\alpha}\hat{\beta}\hat{\gamma}..$ be a
tensor of rank n, which has Cartesian components
${T}_{\alpha \beta \gamma ..}^{\left(n\right)}$, where the number of indices is equal to its rank, and where the tensor components transform according to
where there are
$n$ occurrences of the rotation matrix in the above, and where we have defined the orthogonal tensor rotation
matrix ${\mathit{O}}^{\left(n\right)}$, with components
We will generally use the superscript notation (n) to indicate the rank of each tensor, except in a small number of cases where the rank can be inferred from counting its indices. To distinguish different tensors of the same kind and rank, we will often also use the superscript to give labels to the tensors, e.g., if we have two rank-n tensors we wish to label a and b, then we will use the notation ${\mathit{T}}^{a\left(n\right)}$ and ${\mathit{T}}^{b\left(n\right)}$, which have components ${T}_{\alpha \beta \gamma ..}^{a\left(n\right)}$ and ${T}_{\alpha \beta \gamma ..}^{b\left(n\right)}$.
Almost all of the tensors used in this work are symmetric with respect to the permutation of their indices, e.g., ${T}_{\alpha \beta \gamma}={T}_{\alpha \gamma \beta}={T}_{\beta \gamma \alpha}={T}_{\beta \alpha \gamma}={T}_{\gamma \alpha \beta}={T}_{\gamma \beta \alpha}$, and we will refer to such tensors as symmetric.
We now turn to a discussion of tensor contractions. We will only give a brief overview, but we note that Applequist has written extensively on this topic, and the interested reader should consult his work [
15].
Suppose that
${\mathit{A}}^{\left(i\right)}$ and
${\mathit{B}}^{\left(j\right)}$ are two such symmetric tensors. We introduce the notation
$\mathit{A}:n:\mathit{B}$ to indicate a
contraction over
$n$ indices of two such tensors, e.g.,
is a contraction over 2 indices, and
is a contraction over 3 indices.
Furthermore, in general, the contraction ${\mathit{A}}^{\left(i\right)}.n.{\mathit{B}}^{\left(j\right)}$ results in a symmetric tensor of rank i + j − 2n.
For small numbers of contractions, we can use the alternative notation that a contraction is indicated by vertical dots, where the number of dots is equal to the degree of the contraction, e.g., ${\mathit{A}}^{\left(i\right)}.{\mathit{B}}^{\left(j\right)}={\mathit{A}}^{\left(i\right)}.1.{\mathit{B}}^{\left(j\right)}$, ${\mathit{A}}^{\left(i\right)}:{\mathit{B}}^{\left(j\right)}={\mathit{A}}^{\left(i\right)}.2.{\mathit{B}}^{\left(j\right)}$, and ${\mathit{A}}^{\left(i\right)}\vdots {\mathit{B}}^{\left(j\right)}={\mathit{A}}^{\left(i\right)}.3.{\mathit{B}}^{\left(j\right)}$.
Now consider the contraction
$s={\mathit{A}}^{\left(n\right)}.n.{\mathit{B}}^{\left(n\right)}$, where
${\mathit{A}}^{\left(n\right)}$, and
${\mathit{B}}^{\left(n\right)}$ are traceless rank (
n) tensors, and the result,
$s$, is a scalar. It is easy to show that this contraction forms an inner product on the vector space of all rank-d tensors. Thus, we will use the notation
To prove that the above is a genuine inner product, we first use the fact that symmetric rank-n tensors form a vector space, that is, the addition of any two rank-n symmetric tensors results in another rank-n symmetric tensor, and the multiplication of any symmetric rank-n tensor by a scalar also results in another rank-n symmetric tensor. Then, we can show that on this vector space, (i) ${\u27e8{\mathit{A}}^{\left(n\right)},{\mathit{B}}^{\left(n\right)}\u27e9}_{t}$ is always a scalar. (ii) for any scalar, $s$: $s{\u27e8{\mathit{A}}^{\left(n\right)},{\mathit{B}}^{\left(n\right)}\u27e9}_{t}={\u27e8s{\mathit{A}}^{\left(n\right)},{\mathit{B}}^{\left(n\right)}\u27e9}_{t}$ (linearity), (iii) ${\u27e8{\mathit{A}}^{\left(n\right)},{\mathit{B}}^{\left(n\right)}\u27e9}_{t}={\u27e8{\mathit{B}}^{\left(n\right)},{\mathit{A}}^{\left(n\right)}\u27e9}_{t}$ (symmetry), and (iv) ${\u27e8{\mathit{A}}^{\left(n\right)},{\mathit{A}}^{\left(n\right)}\u27e9}_{t}\ge 0$. (positive defiteness). Which are the four conditions required for ${\u27e8{\mathit{A}}^{\left(n\right)},{\mathit{B}}^{\left(n\right)}\u27e9}_{t}$ to be an inner product.
It remains to be shown that the inner product is the same in any axis frame, such that it operates on the tensors, and not just their components.
Firstly, working in the lab-axis frame, if $\mathit{a},\mathit{b},\mathit{c},\mathit{d}$ are any rank 1 Cartesian vectors, then $\mathit{a}\mathit{b}:\mathit{c}\mathit{d}=\left(\mathit{a}.\mathit{c}\right)\left(\mathit{b}.\mathit{d}\right)$. From which it follows that if we have axes defined by orthogonal unit vectors $\hat{\alpha}$ and ${\hat{\alpha}}^{\prime}$, where one is rotated with respected to the other, then $\hat{\alpha}\hat{\beta}:{\hat{\gamma}}^{\prime}{\hat{\delta}}^{\prime}={A}_{\alpha \gamma}{A}_{\beta \delta}$, where ${A}_{\alpha \gamma}=\alpha .{\hat{\gamma}}^{\prime}$ are the components of the rotation matrix which transforms between the two frames.
Now, suppose we expand out the inner product of two rank 2 tensors
${\mathit{C}}^{\left(2\right)}$ and
${\mathit{D}}^{\left(2\right)}$, where the latter’s components are calculated in the
${\hat{\alpha}}^{\prime}$ frame, then, writing
${D}_{\alpha \beta}^{\prime}$ for the components in this frame we have
which is the same as if both tensors components were calculated in the same frame. Thus,
${\u27e8{\mathit{C}}^{\left(n\right)},{\mathit{D}}^{\left(n\right)}\u27e9}_{t}$ is the same no matter the frame each tensor’s components are calculated in, and the same goes for all tensor inner products in general.
Finally, it is useful to define the tensors
with components
${R}_{\alpha \beta \gamma ..}^{\left(n\right)}={r}_{\alpha}{r}_{\beta}{r}_{\gamma}{r}_{\delta}..$, where the number of
$\mathit{r}$s is equal to its rank. It can be readily seen that
${\mathit{R}}^{\left(m\right)}{\mathit{R}}^{\left(n\right)}={\mathit{R}}^{\left(m+n\right)}$, which implies that
${\mathit{R}}^{\left(0\right)}=1$.
Similarly, we define the tensors
where
${\nabla}_{\alpha}=\partial /\partial {r}_{\alpha}$, and
${\nabla}_{\alpha \beta \gamma ..}^{\left(n\right)}={\partial}^{n}/\partial {r}_{\alpha}\partial {r}_{\beta}\partial {r}_{\gamma}..Furthermore,$ again, we have that
${\nabla}^{\left(m\right)}{\nabla}^{\left(n\right)}={\nabla}^{\left(m+n\right)}$, which implies that
${\nabla}^{\left(0\right)}=1$.
4.2. The ${\mathit{G}}_{\mathit{j}\mathit{i}}^{\mathit{l}}\mathbf{\left(}\mathit{r}\mathbf{\right)}$ Functions up to Rank 3
The following lists the
${G}_{ji}^{l}\left(\mathit{r}\right)$ functions defined from Equation (31) (and surrounding text) up to multipoles of rank 3. These formulae agree with those derived by Smith, [
1] who calculated terms up to rank 3, except that here we list only those terms which occur for traceless multipole tensors.
4.3. The $\mathit{\partial}{\mathit{G}}_{\mathit{j}\mathit{i}}^{\mathit{l}}\mathbf{\left(}{\mathit{r}}^{\mathit{j}\mathit{i}}\mathbf{\right)}\mathbf{/}\mathit{\partial}{\mathit{r}}^{\mathit{j}}$ Functions up to Rank 3
The following lists the
$\partial {G}_{ji}^{l}\left({\mathit{r}}^{ji}\right)/\partial {\mathit{r}}^{j}$ functions given by Equation (34), up to rank 3 in the multipoles. These formulae agree with those derived by Smith, [
1] who calculated terms up to rank 3, except that here we list only those terms which occur for traceless multipole tensors.
4.4. The General Decomposition for Homogeneous Polynomials
This sub-section introduces a standard decomposition for polynomials into mutually orthogonal rotationally invariant subspaces. The results will be used below in
Section 4.5, in which we explain how to convert between the tensor and spherical inner products.
A result from the theory of spherical harmonics (e.g., see chapter 2 of ref. [
23]): Any degree n homogeneous polynomial,
${p}^{\left(n\right)}\left(\mathit{r}\right)$, has a unique decomposition given by
where
${h}^{m\left(n-2m\right)}\left(\mathit{r}\right)$ is the
mth harmonic polynomial, and where, as usual, the degree of the polynomial is placed in brackets, so that
${h}^{m\left(n-2m\right)}\left(\mathit{r}\right)$ describes a polynomial of degree
n−2
m. Furthermore, where, the
$\lfloor n/2\rfloor $ in Equation (77) is the smallest integer less than or equal to
$n/2$, e.g.,
$\lfloor 8/2\rfloor =4$ and
$\lfloor 11/2\rfloor =5$.
The decomposition of Equation (100) is unique, because each term in the sum resides in a subspace which is mutually orthogonal under the spherical inner product.
To show this, use the result from the theory of spherical harmonics (e.g., chapter 2 of ref. [
23]) that for any two harmonic polynomials of degree
$m$ and
$n$, the spherical inner product
${\u27e8{h}^{a\left(m\right)},{h}^{b\left(n\right)}\u27e9}_{s}$ is zero, unless
$m=n$. Thus, if
${p}^{m\left(n\right)}\left(\mathit{r}\right)={r}^{2m}{h}^{m\left(n-2m\right)}\left(\mathit{r}\right)$, and
${p}^{{m}^{\prime}\left(n\right)}\left(\mathit{r}\right)$ $={r}^{2{m}^{\prime}}{h}^{{m}^{\prime}\left(n-2{m}^{\prime}\right)}\left(\mathit{r}\right)$, then the spherical inner product,
$\u27e8{p}^{m\left(n\right)},{p}^{{m}^{\prime}\left(n\right)}{\u27e9}_{s}={\u27e8{h}^{m\left(n-2m\right)},{h}^{{m}^{\prime}\left(n-2{m}^{\prime}\right)}\u27e9}_{s}$ is zero unless
$m={m}^{\prime}$.
The decomposition of Equation (100) is also complete, because the dimension of the $m$th subspace is $2\left(n-2m\right)+1$, and summing over all subspaces gives a dimension of $\left(n+1\right)\left(n+2\right)/2$ (which can be proved by induction), which is the full dimension of the vector space of degree n homogeneous polynomials. (For example, if $n=7$, then $2n+1=15$, and the total dimension is $15+11+7+3=36=\left(7+1\right)\left(7+2\right)/2$.)
Now consider a polynomial in the mth subspace, and so of the form ${p}^{\left(n\right)}\left(\mathit{r}\right)={r}^{2m}{h}^{\left(n-2m\right)}\left(\mathit{r}\right)$, and what happens when that polynomial is rotated.
We can rotate the polynomial through calculating ${p}^{\left(n\right)}\left(\mathit{O}\mathit{r}\right)$, where $\mathit{O}$ is an orthogonal rotation matrix, and where the result of which is ${p}^{\left(n\right)}\left(\mathit{O}\mathit{r}\right)={r}^{2m}{h}^{\left(n-2m\right)}\left(\mathit{O}\mathit{r}\right)$. However, given that the rotation of any harmonic polynomial is still a harmonic polynomial of the same degree, we have that ${p}^{\left(n\right)}\left(\mathit{O}\mathit{r}\right)$ must still belong to the same subspace as ${p}^{\left(n\right)}\left(\mathit{r}\right)$. Thus, any polynomial which belongs to the mth subspace is guaranteed to remain in that subspace after any rotation.
Furthermore, it can be shown (e.g., chapter 2 of ref. [
23]) that each subspace is
irreducible, in the sense that it cannot be further divided into rotationally invariant orthogonal subspaces. (Spherical harmonic polynomials of degree
n form a 2
n + 1 dimensional basis for the irreducible representation of SO(3), the group of all rotations in three dimensions. See also, for example, chapter 8 in ref. [
24]).
We can also define associated projection operators,
${\hat{d}}_{m}$, such that the
mth projection operator projects a homogeneous polynomial into the
mth subspace, according to
^{.}It was shown in Equation (44) that a degree-n polynomial can be written in the form
which allows us to define the associated tensor operators,
${\hat{D}}_{m}$, according to
where Equation (103) must define uniquely the
${\hat{D}}_{m}$, because the polynomial
${\hat{d}}_{m}{p}^{\left(n\right)}\left(\mathit{r}\right)$ is uniquely defined, and there is a one to one correspondence between polynomials and their equivalent tensors.
One iteration of Equation (103) gives
Consider the case ${m}^{\prime}=m$, for which we have ${\hat{d}}_{m}={\hat{d}}_{m}{\hat{d}}_{m}$, given that ${\hat{d}}_{m}$ are projection operators. In this case, equating Equations (103) and (104) then implies that ${\hat{D}}_{m}{\hat{D}}_{m}={\hat{D}}_{m}$, and so ${\hat{D}}_{m}$ is also a projection operator.
Conversely, for ${m}^{\prime}\ne m$, we have that ${\hat{d}}_{{m}^{\prime}}{\hat{d}}_{m}=0$, because ${\hat{d}}_{m}$ are orthogonal projectional operators, in which case, Equation (104) must equal zero, which can only be possible for all ${\mathit{P}}^{\left(n\right)}$, ${\mathit{R}}^{\left(n\right)}$ if ${\hat{D}}_{{m}^{\prime}}{\hat{D}}_{m}=0$, again for ${m}^{\prime}\ne m$; from which we conclude that the ${\hat{D}}_{m}$ are orthogonal projection operators.
It can be seen from Equation (101) that the ${\hat{d}}_{0}$ operator projects polynomials into the space of harmonic polynomials of the same degree, with ${\hat{d}}_{0}{h}^{\left(n\right)}\left(\mathit{r}\right)={h}^{\left(n\right)}\left(\mathit{r}\right)$ for any ${h}^{\left(n\right)}\left(r\right)$. Given that the tensor equivalent to a harmonic polynomial is traceless, the corresponding ${\hat{D}}_{0}$ operator must project tensors into the space of traceless tensors, i.e., we have that ${\hat{D}}_{0}=\hat{D}$, the detracing operator.
4.5. Conversion between the Tensor and Spherical Inner Products
This sub-section gives the conversion between the spherical inner product ${\u27e8{p}^{a\left(n\right)},{p}^{b\left(n\right)}\u27e9}_{s}$, defined in Equation (53), and the tensor inner product, ${\u27e8{\mathit{P}}^{a\left(n\right)},{\mathit{P}}^{b\left(n\right)}\u27e9}_{t}$, where ${\mathit{P}}^{a\left(n\right)}$ is the tensor equivalent to ${p}^{a\left(n\right)}\left(\mathit{r}\right)$, and similarly for ${\mathit{P}}^{b\left(n\right)}$ and ${p}^{b\left(n\right)}\left(\mathit{r}\right)$.
We begin by recalling from Equation (44) that any degree
$n$ homogeneous polynomial can be written as
Substituting the above into the
$\u27e8{p}^{a\left(n\right)},{p}^{b\left(n\right)}{\u27e9}_{s}$ spherical inner product gives
which can be rewritten as
where the self-adjoint tensor
operator $\hat{K}$ is defined from its action on a degree
n Cartesian tensor,
${\mathit{P}}^{\left(n\right)}$, according to
where
$\hat{K}$ has matrix elements
The components of
$\hat{K}$ are easiest expressed in compressed notation, in which they are given by
where
$={m}_{x}+{m}_{y}+{m}_{z}$ $={n}_{x}+{n}_{y}+{n}_{z}$, and
${e}_{m+n}=1$ if
${m}_{x}+{n}_{x}$,
${m}_{y}+{n}_{y}$ and
${m}_{z}+{n}_{z}$ are all even, and
${e}_{m+n}=0$ otherwise, and where the integral above was solved using the methods for integrating polynomials over the unit sphere in ref. [
25].
Although Equation (110) defines completely the matrix elements of
$\hat{K}$, it is not in a very useful form. In the remainder of this section we will show how
$\hat{K}$ can be put in a more useful form by writing it as a spectral sum in the projection operators,
${\hat{D}}_{m}$, defined in
Section 4.5.
Let
$\hat{R}=\hat{R}\left(\varphi ,\theta ,\psi \right)$ be a rotation operator, parameterised in terms of Euler angles
$\varphi ,\theta ,\psi $ say, which acts to rotate the system, where
$\hat{R}$ acting on a polynomial rotates the polynomial by Euler angles
$\varphi ,\theta ,\psi $, i.e.,
$\hat{R}p\left(\mathit{r}\right)=p\left({\hat{R}}^{-1}\mathit{r}\right)$, where rotation of
$\mathit{r}$ can be achieved using orthogonal rotation matrices as described in
Section 4.1. We also define the action of
$\hat{R}$ on tensors, such that if
${\mathit{P}}^{\left(n\right)}$ is the tensor equivalent of the polynomial
${p}^{\left(n\right)}\left(\mathit{r}\right)$, then
$\hat{R}{\mathit{P}}^{\left(n\right)}$ is the tensor equivalent of the polynomial
$\hat{R}{p}^{\left(n\right)}\left(\mathit{r}\right)$.
We will now establish that both ${\hat{D}}_{m}$ and $\hat{R}$, and $\hat{K}$ and $\hat{R}$ always commute, i.e., $\left[{\hat{D}}_{m},\hat{R}\right]=0$ and $\left[\hat{K},\hat{R}\right]=0$. Furthermore, this will allow us to show that $\hat{K}$ must be a linear combination of ${\hat{D}}_{m}$.
We first show that
$\left[{\hat{D}}_{m},\hat{R}\right]=0$. Recall from
Section 4.4 that no rotation can move a polynomial,
${\hat{d}}_{m}{p}^{\left(n\right)}\left(\mathit{r}\right)$, in the
mth subspace out of its subspace, which implies that
${\hat{d}}_{m}\hat{R}=\hat{R}{\hat{d}}_{m}$, or
$\left[{\hat{d}}_{m},\hat{R}\right]=0$.
Now, given that any polynomial in the mth subspace (with associated projection operator ${\hat{d}}_{m}$) is equivalent to a tensor in its mth subspace (with associated projection operator ${\hat{D}}_{m}$), it also follows that no rotation of a tensor in its mth subspace can move that tensor out of the mth subspace. (If it could, we could transform a polynomial in the mth subspace into its corresponding tensor, rotate that tensor out of the mth subspace, and then transform back, which would mean that the mth subspace is not rotationally invariant.)
Thus, we also have that ${\hat{D}}_{m}\hat{R}=\hat{R}{\hat{D}}_{m}$, or $\left[{\hat{D}}_{m},\hat{R}\right]=0$, which was to be shown.
We will now show that
$\left[\hat{K},\hat{R}\right]=0$. First note that because the spherical inner product is the integral of the product of two polynomials over the unit sphere, we have that rotating both polynomials by the same Euler angles must leave the inner product unchanged. We thus have that
From Equation (107), the LHS above can be rewritten as
$\u27e8{\mathit{P}}^{a\left(n\right)},\hat{K}{\mathit{P}}^{b\left(n\right)}{\u27e9}_{t}$. Further, we also have that the RHS above can be rewritten as
where the last part of Equation (112) above comes from applying the inverse rotation,
${\hat{R}}^{-1}$, to both sides of the inner product. We thus have that
which implies that
$\hat{K}={\hat{R}}^{-1}\hat{K}\hat{R}$, or
$\left[\hat{K},\hat{R}\right]=0$, which was to be shown.
We will now show that since $\left[\hat{K},\hat{R}\right]=0$ and $\left[{\hat{D}}_{m},\hat{R}\right]=0$, it follows that $\hat{K}$ must be a linear combination of ${\hat{D}}_{m}$.
Suppose that
$\mathit{v}$ is an eigenvector of
$\hat{K}$ with eigenvalue
$\alpha $. Given that
$\left[\hat{K},\hat{R}\right]=0$, we have that
It follows that $\hat{R}\mathit{v}$ is also an eigenvector of $\hat{K}$, also with eigenvalue $\alpha $. Thus, $\mathit{v}$ must belong to a degenerate subspace of $\hat{K}$ spanned by all eigenvectors of $\hat{K}$ with eigenvalue $\alpha $. Furthermore, this subspace is rotationally invariant, i.e., for any vector $v$ in that subspace $\hat{R}v$ also belongs to that subspace. However, we have seen that the orthogonal irreducible rotationally invariant subspaces on the vector space of tensors are described by projection operators ${\hat{D}}_{m}$, so each one of the mutually orthogonal degenerate eigenspaces of $\hat{K}$ is one of the m spaces associated with the ${\hat{D}}_{m}$ projection operators.
Now, given that any symmetric matrix can be expressed as a linear sum:
$\mathit{A}={\displaystyle \sum _{i}}{\lambda}_{i}{\hat{P}}_{i}$, where
${\hat{P}}_{i}$ is the projection into the
ith degenerate eigenspace, with associated eigenvalue
${\lambda}_{i}$, we have that
$\hat{K}$ can be written in the form
with the
${\alpha}_{m}$ to be determined.
As a check, we can take the commutator of both sides of Equation (115) above with respect to
$\hat{R}$, which gives
as expected.
Finally, substitution of Equation (116) above into Equation (107) gives
where we have now included the index
$n$ in
${\alpha}_{n,m}$ given that it is possible these scalars also have a dependence on
n, and where we have used
However, Equation (117) must hold separately for each value of
m, for the
mth term on the LHS can only depend on the
mth term on the RHS. So we have that
which, apart from the
${\alpha}_{n,m}$ to be determined in the next sub-section (i.e., 4.6) for the
m = 0 case, is our final form for the conversion between inner products.
4.6. Proportionality Constants for Conversion between the Tensor and Spherical Inner Products for the Case of Traceless Tensors and Harmonic Functions
In
Section 4.5, we developed a general expression (Equation (119)) for the conversion between the spherical and tensor inner products. We will only need the
m = 0 case for this present work, for which the
${\hat{D}}_{0}$ projection operator is the detracing operator,
$\hat{D}={\hat{D}}_{0}$, and the associated
${\hat{d}}_{0}$ operator describes a projection onto the space of harmonic polynomials. Thus, for m = 0, Equation (119) can be written as
where
${\mathit{H}}^{\left(n\right)}$ are traceless tensors, and
${h}^{\left(n\right)}\left(\mathit{r}\right)$ are their associated harmonic polynomials.
In the present sub-section, it will be shown that the rank-dependent proportionality constants
${\alpha}_{n,0}$ are given by
Note that as the multipole interaction formulae are written in terms of traceless tensors, only the m = 0 case is required for this present work. However, the general approach described here should be extensible to results for higher orders.
Our ‘plan of attack’ is to fix its value by choosing the simplest rank n harmonic polynomials we can think of, which are arguably provided by the so-called zonal spherical harmonics, ${r}^{n}{l}_{n}\left(z/r\right)$, where ${l}_{n}\left(z\right)$ is the nth Legendre polynomial, and where $\Delta \left\{{r}^{n}{l}_{n}\left(z/r\right)\right\}=0$. We also have that the zonal spherical harmonics are symmetric about the z axis, which should make them particularly simple to work with.
Again, aiming for simplicity, we will set both
${h}^{a\left(n\right)}\left(\mathit{r}\right)$ and
${h}^{b\left(n\right)}\left(\mathit{r}\right)$ to the same zonal spherical harmonic.
Now, using the properties of Legendre polynomials, we have that the LHS of Equation (120) is given by
where we have used the standard formula for Legendre polynomials that the integral of
${l}_{m}\left(x\right){l}_{n}\left(x\right)$ over
$z=-1$ to
$1$ is given by
$2/\left(2n+1\right){\delta}_{m,n}$.
We next turn to calculating the equivalent inner product on the RHS of Equation (120). Let ${\mathit{L}}^{\left(n\right)}$ be the zonal spherical harmonics in Cartesian tensor form, which using Equation (50) are given by ${\mathit{L}}^{\left(n\right)}=\left(1/n!\right){\nabla}^{\left(n\right)}\left\{{r}^{n}{l}_{n}\left(z/r\right)\right\}$. Given the z symmetry of the zonal harmonics, we have that the ${\mathit{L}}^{\left(n\right)}$ must be proportional to $\hat{D}{\mathit{Z}}^{\left(n\right)}$, where ${\mathit{Z}}^{\left(n\right)}=\hat{\mathit{z}}\hat{\mathit{z}}\hat{\mathit{z}}..$. (No other choice would have the right symmetry.)
Taking the inner product of a rank n tensor, e.g.,
${\mathit{A}}^{\left(n\right)}$, with
${\mathit{Z}}^{\left(n\right)}$ returns the
${A}_{zzz..}^{\left(n\right)}$ component, i.e.,
$\u27e8{\mathit{A}}^{\left(n\right)},{\mathit{Z}}^{\left(n\right)}{\u27e9}_{t}={A}_{zzz..}^{\left(n\right)}$, and we will now state two such inner products (to be proved at the end of this section), which will come in useful for the following.
Using the last two identities above fixes the proportionality constant between
$\hat{D}{\mathit{Z}}^{\left(n\right)}$ and
${\mathit{L}}^{\left(n\right)}$ to give
We can then write
where we have (i) used Equation (126) to exchange
${\mathit{L}}^{\left(n\right)}$ for
$\hat{D}{\mathit{Z}}^{\left(n\right)}$, (ii) used the fact that
${\mathit{L}}^{\left(n\right)}$ is inside the traceless subspace to allow us to exchange
$\hat{D}{\mathit{Z}}^{\left(n\right)}$ for
${\mathit{Z}}^{\left(n\right)}$, and (iii) used Equation (124) above.
Finally, substituting Equations (127) and (123) into Equation (120) fixes ${\alpha}_{n,0}=4\pi n!/\left(2n+1\right)!!$, which is Equation (121).
As promised, we will now provide the proofs of Equations (124) and (125).
To prove Equation (124), we use the identity
which can be proved by algebraic differentiation using the standard identity for the derivative of a Legendre polynomial:
${l}_{n}^{\prime}\left(x\right)=\left(n/\left({x}^{2}-1\right)\right)\left(x{l}_{n}\left(x\right)-{l}_{n-1}\left(x\right)\right)$.
To prove Equation (125), we use the standard generating function for Legendre polynomials
Substituting
$x=cos\left(\theta \right)$ and
$t=r$, where
$cos\left(\theta \right)=z/r$, and recognising that, under these substitutions, the LHS above is now equivalent to
$|\mathit{r}-\hat{\mathit{z}}{|}^{-1}$, we have
However, the LHS above can also be Taylor-expanded as
where we have made use of the expression for the Maxwell Cartesian spherical harmonics from Equation (11).
Comparing Equations (131) and (132), we deduce that
Now, setting $r=1$ and $\theta =0$ means the LHS becomes equal to $\u27e8{\mathit{Z}}^{\left(n\right)},\hat{D}{\mathit{Z}}^{\left(n\right)}{\u27e9}_{t}$, and RHS is given by $n!/\left(2n-1\right)!!$, which gives the required identity.
4.7. Ewald-Sum Terms
For completeness, the full Ewald sum is given by (following Smith [
1]).
where the
${G}_{ji}^{l}\left({\mathit{r}}^{ji}\right)$ terms are given by Equation (31) for Cartesians, and by Equation (69) for spherical harmonics, and where the kernel
${B}_{0}\left(r\right)=erfc\left(\u03f5r\right)/r$ is used, where
$\u03f5$ is a width parameter, which controls the Gaussian width of the Ewald screening charges, and where the higher order
$B$ functions generated by Equation (25).
The
$\mathit{k}$ in Equation (134) are reciprocal-lattice vectors:
$\mathit{k}=2\pi \left({n}_{a}/{L}_{a},{n}_{b}/{L}_{b},{n}_{c}/{L}_{c}\right)$ for an orthogonal unit cell of dimensions
${L}_{a},{L}_{b},{L}_{c}$.
where
${\mathit{K}}^{\left(l\right)}$ is the tensor product
${\mathit{K}}^{\left(l\right)}=\mathit{k}\mathit{k}\mathit{k}..$The
${\varphi}_{i}^{S}$ in Equation (134) are so-called self terms, given by
Note: A recent paper by Stamm et al. [
26] provides a mathematically rigorous derivation of the self terms as given by Smith.
Both Equations (135) and (136) are in the form of inner products, and so can be readily expressed in either Cartesians or in spherical harmonics using Equation (61).
5. Conclusions
We have presented a non-technical—indeed, almost trivial—derivation of the multipole interaction in spherical harmonics, in a form suitable for use with Ewald-sum methods.
We began by summarising one derivation of the multipole interaction in Cartesians, most of which is not new to us, but goes back to the work of Smith and earlier, and continues up to Lin’s derivation of what we refer to as the multipole interaction generating formula.
We then introduced a diagrammatic method for visualising the multipole interactions, in where it was shown that the entire multipole interaction can be represented as a ‘sum over diagrams’, which arguably makes for a much more appealing representation of the interaction than thinking in terms of mathematical formulae alone.
Here, we admit that the results in this work could have been obtained without the aid of diagrams, but given that the diagrams show so clearly the structure of the interaction, we think they provide a valuable insight into what our various algebraic manipulations are actually doing.
Using what we believe to be a novel approach, the remainder of this work showed how the multipole interaction can be converted from Cartesians into spherical harmonics. This involved what was perhaps the key part of this paper, which was (i) the recognition that the multipole interaction in Cartesians involve expressions that can be written as inner products over tensors, (ii) showing that these tensor inner products are proportional to an inner product over the vector space of harmonic polynomials on the unit sphere, (iii) using this relation to convert the Cartesian tensor inner products to spherical inner products involving spherical harmonic polynomials.
The key expression here is Equation (54), which gives the proportionality relation between tensor and spherical inner products. Deriving this relation turns out to be a surprisingly non-trivial problem, but once it has been shown, it becomes a relatively straightforward task to convert the entire multipole interaction from Cartesians into spherical harmonics.
Our method relies on the use of existing tables of spherical harmonics to Cartesian transformations, which have already been given by Stone, and whose properties have been analysed extensively by Stone and others.
Another novelty of our method is that our ‘end-result’ formulae are not pure functions of the spherical harmonic multipoles, like seen in other approaches. Instead, our formulae are functions of split tensor components, which may have several indices, all of which are separately represented in spherical harmonics. However, what really distinguishes our approach from others is that our approach involves fully transforming the between-tensor contractions (bonds in the diagrams) into their spherical harmonic form, while respecting the underlying structure of the contractions; this structure which is particularly evident from the diagrammatic form.
It is surprising to us that such a straightforward conversion method appears to have been overlooked in the literature. It is not clear to us if this method will prove to give superior efficiency when compared to standard approaches, but it is faster than the Cartesian scheme, and it appears to us that our method is comparatively more straightforward and easier to implement than previous approaches. Furthermore, it is our hope that this work will prove a ‘less-painful’ route for the implementation of spherical-harmonic multipole expansions in computational-chemistry codes.