1. Introduction
From metabolism to drug design, the free energy of a system controls its biological processes. The free-energy minimum determines the equilibrium state. The free energy gradients drive the kinetics of the biochemical reactions.
The free energy contains both energy and entropy terms. The energy of complex biomolecules can now be calculated quite accurately, especially using the quantum-chemical methods introduced by Hohenberg and Kohn [1] and extensively developed by many authors in subsequent work. The entropy, on the other hand, has been studied less extensively, even though the free energy involves the difference between the energy and entropy terms. The entropy term is known to play an important role in at least some biochemical processes. An example of an entropy-driven biochemical reaction can be found in sickle-cell disease, where carbon dioxide causes hemoglobin polymerization and oxygen from respiration reverses the process [2,3]. In addition, strategies for controlling the entropy are being used in self-assembling systems to generate novel materials in fields like colloids, macromolecular systems, and nonequilibrium assembly [4].
Biomolecules are typically long-chain molecules. Arrays of one-dimensional chains on a surface can be created by numerical methods like self-avoiding walks, as can two-dimensional self-avoiding membranes (see Chapter 10 of Plischke and Bergersen [5]). Analytic results, however, are few. The simplest “long-chain” molecule is the dimer, and the first analytic derivations of the partition function of such a system were by Temperley, Fisher, and Kasteleyn [6,7,8] for the classical dimer model. The simplest closed-form expression for the partition function of a one-dimensional chain appears in Fisher’s 1961 paper [7], and we present a simplified version of his derivation. In the dimer model, when a dimer attaches to a lattice, it blocks two adjacent sites from other dimers attaching. It is much more difficult to determine the number of possible arrangements of dimers on a lattice than of monomers alone, because a monomer only blocks a single site.
Fisher’s solution method for the dimer model builds monomers as well as dimers into the initial formal construction of the partition function. The monomer term complicates the derivation. Although one set of anticommuting matrices, similar to the Dirac matrices of relativistic quantum mechanics, is used to enforce the dimer constraint, the presence of the monomer term requires an additional set of anticommuting matrices in a product form, reminiscent of the chirality operator $\gamma^5$, to create homogeneity. Even with this construction, Fisher was unable to find an analytic solution to complete the evaluation of the partition function on a two-dimensional lattice with both monomers and dimers included. He therefore dropped the monomer term to reach his final analytic expression for the partition function Z in two dimensions.
Because of this, we present, beginning in Section 4, a simpler method of solving Fisher’s dimer model by neglecting the monomer term from the beginning, and hence restricting the simple square lattice to be completely filled with dimers, with no empty sites, which we refer to as a filling fraction of one. We then need only a single set of anticommuting operators that live on the links of the lattice and that we refer to as link fields. This method also gives Fisher’s final analytic expression for the partition function Z in two dimensions. This result Z is for a finite lattice with the sites arranged in a rectangular array of rows and columns.
Beginning in Section 8, we specifically use this result for the special case of a two-leg-ladder lattice to construct, in the end, the partition function for an infinitely long one-dimensional chain of sites containing not just dimers but also monomers of two colors. We use this to produce results using the dimer model that parallel the results found in previous papers for the DNA double helix [9,10], which use a lattice gas model. We find results from the dimer model that differ from those of the simpler lattice gas models, in which all the sites are independent. Perhaps the most important of these is that the dimer geometry gives a persistently nonzero entropy as the force binding the particles to the lattice increases relative to the thermal energy.
2. Partition Function, Entropy, and Occupancy
Although the classical dimer problem only requires classical physics, classical statistical mechanics is plagued with oddities like Gibbs’s paradox. This means that it is more reliable to start from the quantum-mechanical expression for the partition function of the grand canonical ensemble,
$$Z = e^{-\beta\Omega} = \mathrm{Tr}\, e^{-\beta\left(H - \sum_j \mu_j N_j\right)},$$
where $\Omega$ is the free energy of this ensemble (often called the grand canonical potential), $H$ is the quantum Hamiltonian of the system, $N_j$ is the number of particles of type $j$ in the system, and $\mu_j$ is the chemical potential of the particles of type $j$. The thermal parameter $\beta = 1/(k_B T)$ specifies the temperature $T$ of the system, where $k_B$ is Boltzmann’s constant. The trace here is over quantum states of the system. Taking the trace is equivalent to summing over all the possible configurations of the system.
The entropy is then given by $S = -\left(\partial\Omega/\partial T\right)_{V,\mu}$, and the mean number of particles of type $j$ is given by $\langle N_j\rangle = -\left(\partial\Omega/\partial\mu_j\right)_{T,V}$, where the subscripts indicate the quantities that are held constant in taking the partial derivatives.
Fisher’s dimer model describes dimers adsorbed on a two-dimensional lattice. Eventually, we will take the limit in which the lattice becomes large, and for that reason, it is more useful to work with quantities per lattice site: the entropy per site and the number of particles of a given type per site, which is the occupancy per site. It is then helpful to define the logarithm of the partition function per site as
$$\ln z = \frac{1}{N}\ln Z,$$
where $N$ is the number of lattice sites, which means that the entropy per site can be written as
$$s = \frac{S}{N} = k_B \frac{\partial\,(T \ln z)}{\partial T}.$$
The derivative with respect to $T$ gives
$$s = k_B\left[\ln z + T\frac{\partial \ln z}{\partial T}\right],$$
and then, since $T\,\partial/\partial T = -\beta\,\partial/\partial\beta$, the entropy becomes
$$s = k_B\left[\ln z - \beta\frac{\partial \ln z}{\partial \beta}\right].$$
A simple example of a partition function is that of a lattice gas. Suppose that there are two kinds of particles, like red balls and blue balls. There are $N!$ ways to fill the $N = N_r + N_b$ sites of the lattice, where $N_r$ and $N_b$ are the numbers of red and blue balls. This $N!$ is divided by $N_r!$, the number of ways to swap red balls among themselves and still have the same configuration, and then by $N_b!$, the number of ways to swap blue balls without changing the configuration. This gives the number of configurations of $N_r$ red balls and $N_b$ blue balls,
$$C(N_r, N_b) = \frac{N!}{N_r!\,N_b!}.$$
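There are no forces between the balls, merely a force holding them on the lattice sites, and so the Hamiltonian is
$$H = -\epsilon_r N_r - \epsilon_b N_b,$$
where $\epsilon_r$ and $\epsilon_b$ are the binding energies of a red ball and a blue ball to a lattice site, while the chemical potential term becomes $\mu_r N_r + \mu_b N_b$. The partition function is then
$$Z = \mathrm{Tr}\, e^{\beta\left[(\mu_r + \epsilon_r) N_r + (\mu_b + \epsilon_b) N_b\right]}.$$
The activity of the red balls is $r = e^{\beta(\mu_r + \epsilon_r)}$ and of the blue balls is $b = e^{\beta(\mu_b + \epsilon_b)}$. If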
$C(N_r, N_b)$ is the number of states in the trace that contain $N_r$ red balls and $N_b$ blue balls, then the partition function for this system becomes
$$Z = \sum_{N_r = 0}^{N} C(N_r, N_b)\, r^{N_r} b^{N_b},$$
where $N_b = N - N_r$. With the binomial theorem, this can be written as
$$Z = (r + b)^N,$$
which is the partition function for red and blue balls on a lattice. The sites are independent, so that the factor $(r + b)$ is the partition function contribution from each site. For an infinite lattice, the logarithm of the partition function per site is then
$$\ln z = \lim_{N\to\infty}\frac{1}{N}\ln Z = \ln(r + b).$$
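The entropy per site is then given by
$$s = k_B\left[\ln z - \beta\frac{\partial \ln z}{\partial\beta}\right].$$
Since $\beta$ is contained in the activities $r$ and $b$, $\beta\,\partial r/\partial\beta = r \ln r$, and similarly for $b$. Inserting the explicit form of the partition function yields the entropy per site of the two-color lattice gas,
$$s = k_B\left[\ln(r + b) - \frac{r\ln r + b\ln b}{r + b}\right].$$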
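We can similarly find the mean particle numbers by differentiating with respect to the chemical potentials. For the red balls, with activity $r$,
$$\langle N_r\rangle = \frac{1}{\beta}\frac{\partial \ln Z}{\partial\mu_r}.$$
The number of red balls per site $n_r$ is given by
$$n_r = \frac{\langle N_r\rangle}{N} = \frac{1}{\beta}\frac{\partial \ln z}{\partial\mu_r}.$$
Substituting the partition function per site, we have
$$n_r = \frac{1}{\beta}\,\frac{1}{r + b}\,\frac{\partial r}{\partial\mu_r}.$$
The derivative of $r$ with respect to $\mu_r$ is given by $\partial r/\partial\mu_r = \beta r$, and so the average occupation of the red balls is
$$n_r = \frac{r}{r + b}.$$
Similarly, for blue balls, we have
$$n_b = \frac{b}{r + b},$$
and these two expressions add to one, as they should.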
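Generalizing this argument to include red, blue, and green balls, with $g$ the activity of the green balls, we have
$$Z = (r + b + g)^N,$$
with the partition function per site in the infinite lattice limit represented as
$$\ln z = \ln(r + b + g),$$
the entropy per site as
$$s = k_B\left[\ln(r + b + g) - \frac{r\ln r + b\ln b + g\ln g}{r + b + g}\right],$$
and the occupancies as
$$n_r = \frac{r}{r + b + g},\qquad n_b = \frac{b}{r + b + g},\qquad n_g = \frac{g}{r + b + g}.$$
If one simply wants vacant sites, or holes in the lattice, rather than the green balls, the activity $g$ is replaced by 1, because holes have neither an energy nor a chemical potential.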
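The lattice-gas results of this section can be checked numerically. The following is a minimal sketch, assuming that each activity has the form of an exponential whose exponent is proportional to the thermal parameter, so that the logarithmic derivative of an activity a with respect to that parameter is a ln a. The function names are illustrative and do not come from the text. The sketch computes the entropy per site in units of Boltzmann's constant and confirms that it coincides with the familiar mixing entropy of the occupancies.

```python
import math

def lattice_gas_per_site(activities):
    """Return (ln z, entropy per site in units of k_B, occupancies) for independent sites."""
    total = sum(activities)
    ln_z = math.log(total)
    occupancies = [a / total for a in activities]
    # beta * d(ln z)/d(beta), assuming beta * da/d(beta) = a ln a for each activity
    beta_dlnz_dbeta = sum(a * math.log(a) for a in activities) / total
    entropy = ln_z - beta_dlnz_dbeta
    return ln_z, entropy, occupancies

def mixing_entropy(occupancies):
    """Entropy of mixing per site, -sum n ln n, in units of k_B."""
    return -sum(n * math.log(n) for n in occupancies if n > 0)

# Three colors with arbitrary illustrative activities r, b, g.
ln_z, s, occ = lattice_gas_per_site([2.0, 3.0, 5.0])
print("occupancies:", occ)
print("entropy per site:", s, "mixing entropy:", mixing_entropy(occ))
```

Setting one of the activities to 1 in the list treats that species as vacancies, as described above.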
3. The Lattice and Its Dual
Dimer models describe the behavior of rods with ends that occupy sites on a lattice. However, two dimers cannot have their ends on the same site, and this is the fundamental constraint that makes determination of the allowed dimer configurations difficult. A physical example of a dimer might be a hydrogen molecule, with each of the two atoms held to a separate lattice site through electrostatic attraction. The lattice may be completely covered with dimers, or the coverage may be less than complete. In the latter case, the empty sites may be called vacancies, or alternatively regarded as occupied by monomers. The lattice can be of any dimension, although pictures of one- and two-dimensional lattices are easiest to draw.
Dimer models can be either classical or quantum. The difference is, basically, that in the quantum dimer model, the dimers can tunnel between sites, while classically, they can only move by thermally activated diffusive processes. The quantum dimer model is the basis for Anderson’s resonating-valence-bond theory of antiferromagnetism.
Fisher [7] considered a simple square lattice in two dimensions, with dimers placed randomly on this lattice. He then attempted to determine the number of different dimer arrangements. This result could, for example, be used to determine the entropy, because entropy is the logarithm of the number of possible states of a system. Fisher discovered that he could solve this problem when the lattice was fully occupied and there were no vacancies. The key, he found, was to randomly distribute “half-dimers” on all sites of the lattice, as in the lattice gas problem. He then threw away all arrangements in which a half-dimer was not connected to a neighbor because the orientations did not match. The remaining configurations were those in which dimers completely covered the lattice.
The trick he used to accomplish this feat was to tag each dimer with a member of a sufficiently large set of anticommuting objects $A^\alpha$ such that $A^\alpha A^\beta + A^\beta A^\alpha = 2\delta^{\alpha\beta} I$, so that $(A^\alpha)^2 = I$, with $I$ the identity, where the $A$s were represented by matrices. He then used the trace theorems that are employed in particle physics to simplify calculations involving products of Dirac matrices [11,12,13]. These caused the unwanted configurations to vanish, and he identified those that remained with a Pfaffian. Its evaluation gives the number of dimer configurations.
When the lattice is completely covered with dimers, the calculation simplifies because there are no monomers. The fraction of sites covered by dimers is the “filling fraction”, and complete dimer coverage corresponds to a filling fraction of one.
The trace theorems evaluate products of the $A$s. We will need four different $A$s for each site in the lattice, and for all of them to be independent matrices, their dimension $d$ will be quite large. Fortunately, all we need is the anticommutation law, because it defines their algebra. We will never have to actually write down any of these matrices.
We consider dimers adsorbed on a simple square lattice of sites, indicated by the red points in Figure 1. The links are the lines joining each pair of neighboring sites (red, dashed). The dual lattice, shown in Figure 2, is the lattice of blue points located at the midpoints of the links of the original red lattice.
Let us denote the lattice sites by Roman subscripts such as $j$, $k$, and $\ell$. The anticommuting objects $A$ actually live on the links of the lattice, or equivalently on the sites of the dual lattice. Because they live on the links, one can think of them as “link fields” that anticommute. Let us use Greek superscripts to indicate the links or sites of the dual lattice, so that these anticommuting link fields can be written as $A^\alpha$. These link fields are regarded here as operators that act on some set of vectors in a vector space. The introduction of a representation by using a set of basis functions allows these fields to be represented by matrices of some suitable dimension $d$. The trace is then the sum of the $d$ diagonal elements of the matrix, which is invariant under a unitary transformation.
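The next step is to develop the trace theorems involving the fields $A^\alpha$ that are needed here. Because Kronecker deltas and identity matrices are so easy to sum out, it is easy to lose track of them. It is clearer to write the anticommutation law as
$$\{A^\alpha, A^\beta\} = 2\,\delta^{\alpha\beta},$$
where $\delta^{\alpha\beta}$ is the Euclidean-space metric. When matrices of dimension $d$ are used to represent the $A^\alpha$, this becomes
$$\{A^\alpha, A^\beta\} = 2\,\delta^{\alpha\beta} I_d,$$
with $I_d$ the $d$-dimensional identity matrix. We will need to evaluate expressions involving $\mathrm{Tr}\left(A^{\alpha_1} A^{\alpha_2} \cdots A^{\alpha_n}\right)$, where $\mathrm{Tr}$ means the trace over a product of matrices representing the $A^\alpha$.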
A proof of the trace theorems of these matrices is given in Appendix A. A simpler way to look at the procedure is as follows. In the general product $A^{\alpha_1} A^{\alpha_2} \cdots A^{\alpha_n}$, use anticommutation to move $A$s with the same index next to one another as pairs. This introduces a parity factor $(-1)^P$. Then, replace each matched pair with $I_d$, since the anticommutator $\{A^\alpha, A^\alpha\} = 2 (A^\alpha)^2$ gives $2 I_d$. You will obtain $(-1)^P I_d$ if the product consists only of matched pairs. Otherwise, there will be at least one $A$ factor that has unmatched indices, and the trace of the entire product will vanish. The trace theorems for the $A^\alpha$ are as follows:
- 3.1 If the general product can be rearranged so that adjacent pairs of indices are the same, then $\mathrm{Tr}\left(A^{\alpha_1} \cdots A^{\alpha_n}\right) = (-1)^P d$, where $P$ is the appropriate parity index for the rearrangement.
- 3.2 If the general product cannot be rearranged so that adjacent pairs of indices are the same, then $\mathrm{Tr}\left(A^{\alpha_1} \cdots A^{\alpha_n}\right) = 0$.
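Although the matrices never need to be written down, a small explicit representation makes the two trace theorems easy to test. The sketch below is an illustration under stated assumptions: it builds mutually anticommuting matrices from Kronecker products of Pauli matrices (a standard construction chosen here for convenience, not necessarily the representation Fisher had in mind), and it takes the trace as the plain sum of diagonal elements. All function names are hypothetical.

```python
from functools import reduce

I2 = [[1, 0], [0, 1]]
SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]
SZ = [[1, 0], [0, -1]]

def kron(a, b):
    """Kronecker product of two square matrices given as nested lists."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]

def matmul(a, b):
    """Ordinary matrix product of nested-list matrices."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def link_fields(n_pairs):
    """Return 2*n_pairs mutually anticommuting matrices, each squaring to the identity."""
    fields = []
    for k in range(n_pairs):
        for s in (SX, SY):
            fields.append(reduce(kron, [SZ] * k + [s] + [I2] * (n_pairs - k - 1)))
    return fields

def trace_of_product(fields, indices):
    """Trace of the ordered product of the fields with the given indices."""
    return trace(reduce(matmul, [fields[i] for i in indices]))

A = link_fields(2)   # four fields of dimension d = 4
d = len(A[0])
print(trace_of_product(A, [0, 0, 1, 1]))  # matched pairs, no interchange needed
print(trace_of_product(A, [0, 1, 0, 1]))  # matched pairs after one interchange
print(trace_of_product(A, [0, 1, 2, 3]))  # no matched pairs
```

The first two products illustrate theorem 3.1 with even and odd parity, and the third illustrates theorem 3.2.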
4. The Dimer Model
One of the simplest models of a system containing diatomic molecules is that of a lattice gas of rigid dimers, each of which fills two nearest-neighbor sites of a space lattice. Fisher was only able to completely evaluate his result for a lattice completely filled with dimers, leaving no vacancies or monomers, as shown in Figure 3. Furthermore, because we will eventually represent the lattice with its dimer arrangements by matrices, the y-axis is inverted so that the first site is in the upper left corner, like the initial element of a matrix.
We suppose that the dimers do not interact with one another apart from the geometric constraint that only one dimer can be attached to a given site. The dimers are bound to the lattice sites, and we let the total binding energy of a dimer be
, which is twice the binding energy to each site because a dimer has two ends. We follow Fisher [
7] in allowing the binding energy of dimers aligned in the two orthogonal directions to be different, and call them
and
. Then, the partition function becomes
Here, $x$ is the activity of an $x$-oriented dimer, and we also let $y$ be the activity of a $y$-oriented dimer, following Fisher, who allowed the dimer activities in the two directions to differ. The trace over quantum states adds together the contributions of states with the same numbers $N_x$ and $N_y$ of dimers oriented in the two directions. If we define the number of such states to be $C(N_x, N_y)$, an equivalent expression for the partition function is
$$Z = \sum_{N_x, N_y} C(N_x, N_y)\, x^{N_x} y^{N_y},$$
which is actually valid for any filling fraction.
We now invent, following Fisher, a snake-like path through the lattice that allows us to use a one-dimensional numbering scheme to label the lattice sites. The numbering begins at the upper left corner and weaves back and forth along the x-direction, as shown in Figure 4. We note that this one-dimensional numbering scheme, serpentine numbering, also works for a three-dimensional lattice if one draws it on a long sheet of paper and then folds the paper with accordion-like pleats. For a lattice with rows of sites (red) arranged one below another in the vertical y-direction and columns parallel to one another in the horizontal x-direction, the number of sites is the number of rows times the number of columns. The virtue of this numbering is that it is easy to determine the signs resulting from the interchanges required to move identical $A$s adjacent to one another. For the x-links it is obvious, because the two ends of a dimer are on adjacent sites, so no interchange is needed, and the sign is plus. For a y-link, consider the following example: the link between sites 27 and 40 in Figure 4. The y-links (vertical red dashed lines) look like the ties between two rails on a railroad track, and these ties have two ends. Thus, to move the link field from position 27 to 39 (adjacent to 40) requires moving to the right by six interchanges (27 to 33) along row three, and then moving back left along row four by another six interchanges (33 to 39). Because a tie has two ends, you will always obtain an even number of interchanges, and this will crucially make all terms positive in our construction of the partition function below.
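The serpentine bookkeeping can be sketched in a few lines. The example below assumes a lattice with eleven columns, which is what the quoted site numbers 27 and 40 imply for a vertical link between rows three and four. The function names are illustrative only.

```python
def serpentine_index(row, col, n_cols):
    """1-based serpentine site number for 1-based (row, col); odd rows run left to right."""
    if row % 2 == 1:
        return (row - 1) * n_cols + col
    return row * n_cols - col + 1

def interchanges_for_y_link(row, col, n_cols):
    """Interchanges needed to bring together the two copies of the field on the link below (row, col)."""
    j = serpentine_index(row, col, n_cols)
    k = serpentine_index(row + 1, col, n_cols)
    return k - j - 1

# The link discussed in the text: row three, fifth column, eleven columns in total.
print(serpentine_index(3, 5, 11), serpentine_index(4, 5, 11))
print(interchanges_for_y_link(3, 5, 11))
```

Looping over all rows and columns confirms that the interchange count is even for every vertical link, which is the property the construction relies on.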
The dual lattice has twice as many sites as the original lattice because there are two links per site in the square lattice. Consequently, if the original lattice has $N$ sites, the dual lattice has $2N$ links or points. However, only $N/2$ distinct adjacent pairs of points in the original lattice can be occupied at once, and a filling fraction of one means that all of these $N/2$ pairs are occupied by dimers. This makes it generally useful to consider $N$ as even, because if it is odd, there is at least one vacancy.
An alternative way of labeling the links is to make the superscript $\alpha$ of the link field $A^\alpha$ a pair of site indices $(j,k)$, where $j$ and $k$ label the two ends of the link and $j < k$. This leads to the link-field labels of Figure 5.
The link fields can be arranged as the upper right triangle of a square matrix, with one row and one column per lattice site and zeros down the diagonal, suggestive of an antisymmetric matrix. For a smaller example with three rows and four columns of sites, but otherwise similar to the example above, the upper right triangle of entries $A^{(j,k)}$ is shown in Equation (29) and shown pictorially in Figure 6.
The link fields $A^{(1,2)}$, $A^{(2,3)}$, and $A^{(3,4)}$ refer to $x$-directed links, and the link fields $A^{(1,8)}$, $A^{(2,7)}$, $A^{(3,6)}$, and $A^{(4,5)}$ refer to $y$-directed links. There are no near-neighbor links corresponding to the elements for which zero is entered. In other words, the entries sloping diagonally upward from the lower left corner to the upper right corner are $y$-directed links. The remaining ones, sloping downward from the upper left to the lower right corner, are $x$-directed links, in the pattern shown in the matrix $M$ in Equation (30).
5. Inclusion of Constraints on the Partition Function by Use of a Child’s Toy
Suppose that the lattice has square holes with sides parallel to x and y at each lattice site. Into each of these holes, we insert a toy, like a child’s top with a square shaft that just fits the hole, as shown in Figure 7. The disk that provides most of the top’s angular momentum has a green line painted on it normal to one of the faces of the shaft. This means that the green line can be oriented four ways: north, south, east, and west.
The orientations of the disk are random, so they are distributed like a lattice gas of four colors, as shown in Figure 8. Sometimes, the green lines of neighboring disks point toward one another, and for any adjacent pair, the probability of this occurring is 1/16, since each of the two disks independently takes the required orientation with probability 1/4.
Now, suppose that the green lines represent “half-dimers”, and when green lines point toward one another on adjacent sites, those two half-dimers join up to form a complete dimer between those two sites. In Figure 9, the figure on the left shows the half-dimers pointing toward one another on adjacent sites, and the figure on the right shows the corresponding dimer arrangement.
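The selection rule described here can be made concrete by brute force on very small lattices. The sketch below enumerates every orientation of the green lines and keeps only those configurations in which each half-dimer points at a neighbor that points back. It is an illustration with hypothetical names, practical only for a handful of sites because the number of orientations grows as four to the power of the number of sites.

```python
from itertools import product

# Row index increases downward, matching the inverted y-axis used for the lattice.
STEP = {"N": (-1, 0), "S": (1, 0), "E": (0, 1), "W": (0, -1)}
OPPOSITE = {"N": "S", "S": "N", "E": "W", "W": "E"}

def count_matched_orientations(n_rows, n_cols):
    """Count orientation patterns in which every half-dimer is paired with a neighbor."""
    sites = [(r, c) for r in range(n_rows) for c in range(n_cols)]
    count = 0
    for pattern in product("NSEW", repeat=len(sites)):
        grid = dict(zip(sites, pattern))
        matched = True
        for (r, c), direction in grid.items():
            dr, dc = STEP[direction]
            # The neighbor must exist and must point back toward this site.
            if grid.get((r + dr, c + dc)) != OPPOSITE[direction]:
                matched = False
                break
        if matched:
            count += 1
    return count

print(count_matched_orientations(2, 2), "of", 4 ** 4, "orientations survive on a 2 x 2 lattice")
print(count_matched_orientations(2, 3), "of", 4 ** 6, "orientations survive on a 2 x 3 lattice")
```

The surviving patterns are in one-to-one correspondence with complete dimer coverings of the lattice.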
We construct the partition function in a way that allows the constraints of no double occupancy and a completely filled dimer arrangement to be imposed. Associate with each lattice site $j$ a function
$$\eta_j = \sum_{\ell \in \mathrm{nn}(j)} \sqrt{z_{j\ell}}\; A^{(j,\ell)},$$
where the sum is over the nearest neighbors of site $j$, as emphasized by the subscript $\mathrm{nn}(j)$ on the sum. The quantity $z_{j\ell}$ is the activity of a dimer on the link $(j,\ell)$, and is either $x$ if the link is in the $x$-direction, or $y$ if the link is in the $y$-direction. The factor $A^{(j,\ell)}$ is the link field associated with the link $(j,\ell)$. The square root of the activity is taken because it is the activity of a half-dimer, the object represented by the green line on the disk of the toy. It takes two of these factors, $\sqrt{z_{j\ell}}\,\sqrt{z_{j\ell}} = z_{j\ell}$, to give the activity of the dimer on the link. One factor can be thought of as emanating from the site $j$, and the other from the site $\ell$.
The partition function is constructed as a product of these site factors, one rooted on each lattice site $j$. To see how this will work, suppose that we have a product of only two lattice sites called $j$ and $k$. Each factor is a sum of up to four terms, one for each nearest-neighbor link, so the product is a sum of up to sixteen terms, each containing two link fields. If sites $j$ and $k$ are not nearest neighbors, then none of the link fields are common to both factors, and taking the trace simply gives zero and the term vanishes. None of the sixteen terms will then contribute to the partition function.
The situation is different when $j$ and $k$ are nearest neighbors. Then, there will be one link field in common. Suppose, for example, that it is on an $x$-directed link, because $j$ and $k$ are nearest neighbors in the $x$-direction with $k$ to the right of $j$. Since the order of subscripts on a link does not matter, the link field reached from $j$ toward $k$ is the same as the one reached from $k$ toward $j$, and its square is the identity. Thus, this term survives when the trace is taken.
In this way, the partition function, written as the trace of the ordered product of the site factors over all lattice sites, is an expression in which the only surviving terms are those which give complete dimer arrangements, as shown, for example, in Figure 9. The numbering of the sites in the product is one-dimensional because we use one-dimensional serpentine ordering.
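The construction of this section can be tried out directly on a lattice small enough for explicit matrices. The following is a sketch under several assumptions that are not taken from the text: the link fields are represented by Kronecker products of Pauli matrices, the trace is divided by the matrix dimension so that a fully paired product contributes its weight exactly once, and the site factors are multiplied in serpentine order. Names such as `partition_function_by_trace` are hypothetical.

```python
import math
from functools import reduce

I2 = [[1, 0], [0, 1]]
SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]
SZ = [[1, 0], [0, -1]]

def kron(a, b):
    return [[x * y for x in ra for y in rb] for ra in a for rb in b]

def matmul(a, b):
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

def link_fields(count):
    """At least `count` mutually anticommuting matrices that square to the identity."""
    n_pairs = (count + 1) // 2
    out = []
    for k in range(n_pairs):
        for s in (SX, SY):
            out.append(reduce(kron, [SZ] * k + [s] + [I2] * (n_pairs - k - 1)))
    return out

def serpentine(row, col, n_cols):
    """0-based serpentine site index for 0-based (row, col)."""
    return row * n_cols + (col if row % 2 == 0 else n_cols - 1 - col)

def partition_function_by_trace(n_rows, n_cols, x, y):
    # Collect every nearest-neighbor link as (site, site, half-dimer activity).
    links = []
    for r in range(n_rows):
        for c in range(n_cols):
            j = serpentine(r, c, n_cols)
            if c + 1 < n_cols:
                links.append((j, serpentine(r, c + 1, n_cols), math.sqrt(x)))
            if r + 1 < n_rows:
                links.append((j, serpentine(r + 1, c, n_cols), math.sqrt(y)))
    fields = link_fields(len(links))
    d = len(fields[0])
    # Site factor: sum over links touching the site of sqrt(activity) times the link field.
    factors = []
    for site in range(n_rows * n_cols):
        total = [[0] * d for _ in range(d)]
        for (j, k, w), field in zip(links, fields):
            if site in (j, k):
                total = [[t + w * f for t, f in zip(tr, fr)] for tr, fr in zip(total, field)]
        factors.append(total)
    prod = reduce(matmul, factors)
    return (sum(prod[i][i] for i in range(d)) / d).real

# With x = 4 and y = 9 the 2 x 2 lattice should give x**2 + y**2.
print(partition_function_by_trace(2, 2, 4.0, 9.0))
```

For the 2 x 3 lattice the same routine can be compared with the three coverings counted by hand, which contribute twice x squared times y, plus y cubed.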
6. Introduction of the Antisymmetric Matrix
A more concise notation is useful. Furthermore, we want the contributions of the even-numbered and the odd-numbered sites to the partition function to be written slightly differently. We begin by writing
so that
, which is, of course, either
or
, and where the subscript “
" on the sum is omitted, although the restriction of the sum in
to nearest neighbors of
j will remain understood.
Now, consider the product,
which appears in the partition function. The number of sites is chosen to be even, and we insert
between each odd–even pair, giving
Then, for odd-numbered sites, we have
and for even-numbered sites, we have
These relations can be combined as
and the original product becomes
Finally, let us set
The quantities
are operators formed as linear combinations of the link fields. The link fields satisfy the simple anticommutation law
or, for
d-dimensional matrix representations thereof,
We now show that the
operators also anticommute, and calculate the values of their anticommutators. We have
Writing this in terms of the anticommutators of the link fields gives
where
gives unity if
and
denote the same link field and zero otherwise. They only denote the same link field
if the lattice sites
j and
k are nearest neighbors, and then only if
and
, that is,
. The anticommutator becomes
which reduces to
Finally, we write this in terms of the notation
of
Section 5. This gives
if
j and
k are nearest neighbors, and zero otherwise.
Furthermore,
, the square root of the activity of a dimer on the link
, which does not depend on the order of the indices. Thus, the anticommutator of the operators
and
can be written as
the activity
x or
y of the dimer occupying that link.
The expression giving the partition function is
In order to have the lattice completely filled with dimers (a filling fraction of one), the number of lattice sites must be even, since an odd number of sites would necessarily leave at least one unwanted vacant site. As a reminder of this, let us temporarily set
, and write the partition function as
To make further progress, we need to find a way to evaluate the product
that appears in the partition function
To do this, we successively use the anticommutator of the operators
from Equation (
50).
Let us write the anticommutator as
and use this expression to rearrange the product as
The second term on the right has
displaced one step to the right, and can be written with the aid of the anticommutation rule as
We continue to move
all the way to the right, requiring
interchanges. We then take the trace of both sides, and use the cyclic property of the trace to move
back to the beginning, thereby reproducing the trace of the left side accompanied by
. In the course of performing this, we “contract”
and
for
, producing
terms containing a factor of
, each accompanied by
. This result is
The factors of two cancel. Furthermore,
for
is the top row of a Pfaffian of order
.
Appendix B contains a brief description of Pfaffians and an example of how a simple Pfaffian is evaluated. On iterating, each step produces a single additional factor of
in every one of the terms resulting from the trace on the right. The end result is that the trace of Equation (
57) produces the Pfaffian
If
j and
k are extended to cover the entire lattice, the complete entries in the Pfaffian and the upper right triangle of the corresponding antisymmetric matrix will be filled in. In the present context, the only nonvanishing activities
are on links with
j and
k nearest-neighbor sites, and all the remaining entries are zero. This result is found in Fisher’s paper [14] and references therein, and is written out in detail in J. C. Baker’s MS thesis [15].
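As a concrete check of the Pfaffian result, the sketch below builds the antisymmetric matrix for a small lattice in serpentine order, with the dimer activity entered with a plus sign above the diagonal, and evaluates its Pfaffian by expansion along the first row. This is an illustrative implementation with hypothetical names, not the evaluation method used later in the text, and the recursive expansion is practical only for small lattices. With both activities set to one, the Pfaffian counts the dimer coverings.

```python
def serpentine(row, col, n_cols):
    """0-based serpentine index of the site at 0-based (row, col)."""
    return row * n_cols + (col if row % 2 == 0 else n_cols - 1 - col)

def dimer_matrix(n_rows, n_cols, x, y):
    """Antisymmetric matrix with +activity above the diagonal for every nearest-neighbor link."""
    n = n_rows * n_cols
    m = [[0] * n for _ in range(n)]
    for r in range(n_rows):
        for c in range(n_cols):
            j = serpentine(r, c, n_cols)
            neighbors = []
            if c + 1 < n_cols:
                neighbors.append((serpentine(r, c + 1, n_cols), x))
            if r + 1 < n_rows:
                neighbors.append((serpentine(r + 1, c, n_cols), y))
            for k, activity in neighbors:
                lo, hi = min(j, k), max(j, k)
                m[lo][hi] = activity
                m[hi][lo] = -activity
    return m

def pfaffian(a):
    """Pfaffian of an antisymmetric matrix by expansion along the first row."""
    n = len(a)
    if n == 0:
        return 1
    if n % 2 == 1:
        return 0
    total = 0
    for j in range(1, n):
        if a[0][j] == 0:
            continue
        keep = [k for k in range(1, n) if k != j]
        minor = [[a[r][c] for c in keep] for r in keep]
        sign = 1 if j % 2 == 1 else -1
        total += sign * a[0][j] * pfaffian(minor)
    return total

print(pfaffian(dimer_matrix(2, 2, 1, 1)))   # coverings of the 2 x 2 lattice
print(pfaffian(dimer_matrix(3, 4, 1, 1)))   # coverings of the 3 x 4 example lattice
```

Passing other activities, for example `dimer_matrix(2, 2, 2, 3)`, returns the weighted sum over coverings rather than a bare count.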
Earlier, an example with three rows and four columns of sites was discussed, and is shown in Figure 6. The $12\times12$ antisymmetric matrix associated with the Pfaffian of this $3\times4$ lattice example, written in block form, is
$$M = \begin{pmatrix} X & Y & 0 \\ -Y & X & Y \\ 0 & -Y & X \end{pmatrix},$$
and that block form consists of nine $4\times4$ blocks, as shown in Figure 10.
This is also a tridiagonal matrix, where the blocks $X$, $Y$, and 0 are $4\times4$ blocks. These blocks are
$$X = \begin{pmatrix} 0 & x & 0 & 0 \\ -x & 0 & x & 0 \\ 0 & -x & 0 & x \\ 0 & 0 & -x & 0 \end{pmatrix}$$
and
$$Y = \begin{pmatrix} 0 & 0 & 0 & y \\ 0 & 0 & y & 0 \\ 0 & y & 0 & 0 \\ y & 0 & 0 & 0 \end{pmatrix},$$
while “0” stands for a $4\times4$ block of sixteen zeros. In what follows, we will be using the fact that matrices can be multiplied block by block, and this is shown in Appendix C.
Consider the upper right triangle that forms the corresponding Pfaffian. The first row of the lattice has x-oriented links between the sites, and so the block X has entries x in the direction of the diagonal. For the example, there are four columns, so there are three entries x in the block X. They are offset from the diagonal by one, so the dimensions of the block X shown are $4\times4$. In general, the blocks X are square, with dimension equal to the number of columns of sites.
Between each row and the one below it, there are vertical y links, and so there are entries of y up the diagonal line from the lower left to upper right corner. In the example, there are four columns and therefore four of these links, so the block Y is $4\times4$. In general, the blocks Y have the same dimension as the blocks X, as they must to fill the original matrix of dimension $12\times12$, because $3\times4 = 12$. There are nine blocks with sixteen entries each to account for the 144 entries in the antisymmetric matrix M.
In general, then, the dimension of each individual block is the number of columns in the lattice of sites, while the matrix of blocks has a dimension equal to the number of rows. The total number of matrix elements is then the square of the product of the two, which is the square of the number of sites.
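The block structure can be verified by assembling the matrix in two independent ways. The sketch below is a hypothetical illustration: it assumes that the block below the diagonal is the negative of the block above it, as antisymmetry requires when the inter-row block is symmetric, and it compares the block assembly with a matrix built link by link from the serpentine numbering.

```python
def x_block(n_cols, x):
    """Intra-row block: x just above the diagonal, -x just below."""
    b = [[0] * n_cols for _ in range(n_cols)]
    for p in range(n_cols - 1):
        b[p][p + 1] = x
        b[p + 1][p] = -x
    return b

def y_block(n_cols, y):
    """Inter-row block: y along the anti-diagonal (lower left to upper right)."""
    return [[y if p + q == n_cols - 1 else 0 for q in range(n_cols)] for p in range(n_cols)]

def assemble_from_blocks(n_rows, n_cols, x, y):
    """Block-tridiagonal matrix with X on the diagonal, Y above it, and -Y below it."""
    X, Y = x_block(n_cols, x), y_block(n_cols, y)
    n = n_rows * n_cols
    m = [[0] * n for _ in range(n)]
    for R in range(n_rows):
        for C in (R - 1, R, R + 1):
            if not 0 <= C < n_rows:
                continue
            block, sign = (X, 1) if C == R else (Y, 1 if C > R else -1)
            for p in range(n_cols):
                for q in range(n_cols):
                    m[R * n_cols + p][C * n_cols + q] = sign * block[p][q]
    return m

def serpentine(row, col, n_cols):
    return row * n_cols + (col if row % 2 == 0 else n_cols - 1 - col)

def assemble_from_links(n_rows, n_cols, x, y):
    """Same matrix built link by link from the serpentine site numbers."""
    n = n_rows * n_cols
    m = [[0] * n for _ in range(n)]
    for r in range(n_rows):
        for c in range(n_cols):
            j = serpentine(r, c, n_cols)
            pairs = []
            if c + 1 < n_cols:
                pairs.append((serpentine(r, c + 1, n_cols), x))
            if r + 1 < n_rows:
                pairs.append((serpentine(r + 1, c, n_cols), y))
            for k, w in pairs:
                lo, hi = min(j, k), max(j, k)
                m[lo][hi], m[hi][lo] = w, -w
    return m

print(assemble_from_blocks(3, 4, 2, 3) == assemble_from_links(3, 4, 2, 3))
```

The numerical values 2 and 3 stand in for the activities x and y so that the two kinds of entries are easy to tell apart when the matrix is printed.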
7. Eigenvalues and Eigenvectors of the Antisymmetric Matrix
The scheme that makes the calculation simplest is to diagonalize the
X block and use those eigenvectors as the basis to transform the
Y block. This will lead to a cruciform version of the big matrix, which will be convenient for the calculations. Therefore, first we investigate the eigenvalues and eigenvectors of the block
X. The eigenvalue equation for
X is given by
where
is an eigenvalue and
v a column vector with entries
. The set of linear equations represented by this equation have the form
We can make all the linear equations look the same if we imagine padding the list with
and
and then apply the boundary condition
. Then, we have
The generic equation is the difference equation
which can be rewritten as
Regarding
a as a lattice spacing, we recognize this equation as the discretized version of the first-order differential equation
with exponential solutions
that are growing or decaying if the eigenvalue
is real. If
is imaginary, the solutions oscillate.
A solution that satisfies the boundary conditions is given by
This gives
for
k even but
for
k odd.
The boundary condition at
is
which is correct. At
we have
We also require this to vanish, which occurs if the quantity in brackets vanishes, that is, if
Taking logarithms of both sides gives
where
ω is a winding number that gives the number of times that the point
circles the singularity that terminates the branch cut of the logarithm of
. As a result, the angle
is quantized, that is, it acquires only discrete values given by
so that
only takes the values
that is, the discrete values
. The values
and
are the boundary points where the solution vanishes that were added by padding the ends of the list. The components of the eigenvector
are
with
A the normalization constant determined from
.
Figure 11 shows the discrete values of the angle
for the example with eleven columns.
The previous derivation of the eigenvalue
is unchanged. Insert one of the terms of the generic solution
—either one will do—into the difference equation
Substituting
yields
and as long as the eigenvalue
satisfies this equation,
is a solution to the difference equation. When we include the boundary-condition requirements, the angle
is restricted to the discrete values
, and the eigenvalues of
X are
Figure 12 shows the values of the eigenvalue
for the example with eleven columns.
The expression for the eigenfunction,
can be rewritten by inserting the explicit form of the angle
. This gives
giving
Let us now find the normalization constant using
. The components of the eigenvector are
with
. Substituting the expression for
gives
We then have
Now,
because
is always even, giving
where
. The normalization constant is then given by the finite sum
which gives
Then, since a shift in phase of
gives another factor of
,
The value of the sum over cosines is
, and the normalization factor is
Taking A to be real and positive, the normalized eigenvectors are
with
and eigenvalue
. The subscript
k distinguishes the various components of a given eigenvector
. The values
and
are the boundary points where
vanishes. Figure 13 shows the first four eigenvectors for various winding numbers ω. The points show the values, while the lines are merely to guide the eye.
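The eigenvalue problem for the intra-row block can be checked numerically without any linear-algebra library. The sketch below rests on assumptions of its own: the block has x just above the diagonal and -x just below it, and the angle is parametrized as pi times the winding number divided by one more than the number of columns. With that choice the eigenvalue is 2ix times the cosine of the angle and the components are proportional to i to the power k times the sine of k times the angle. This parametrization may differ from the one in the text by a constant shift of the angle, so it should be read as an equivalent form rather than a transcription.

```python
import math

def x_block(n, x):
    """Block with x just above the diagonal and -x just below."""
    b = [[0] * n for _ in range(n)]
    for p in range(n - 1):
        b[p][p + 1] = x
        b[p + 1][p] = -x
    return b

def eigenpair(n, x, omega):
    """Candidate eigenvalue and normalized eigenvector for winding number omega = 1..n."""
    theta = math.pi * omega / (n + 1)
    eigenvalue = 2j * x * math.cos(theta)
    norm = math.sqrt(2.0 / (n + 1))
    vector = [norm * (1j ** k) * math.sin(k * theta) for k in range(1, n + 1)]
    return eigenvalue, vector

def residual(n, x, omega):
    """Largest component of X v - lambda v, which should vanish up to rounding."""
    X = x_block(n, x)
    lam, v = eigenpair(n, x, omega)
    Xv = [sum(X[p][q] * v[q] for q in range(n)) for p in range(n)]
    return max(abs(a - lam * b) for a, b in zip(Xv, v))

n = 11   # eleven columns, as in the example of the text
for omega in range(1, n + 1):
    lam, v = eigenpair(n, 1.0, omega)
    print(omega, round(lam.imag, 6), residual(n, 1.0, omega))
```

The eigenvalues are purely imaginary and occur in pairs of opposite sign, as expected for a real antisymmetric matrix.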
The previous example of a lattice of sites that has three rows and four columns was shown in Figure 6. This lattice has $3\times4 = 12$ sites, so the dimension of the antisymmetric matrix $M$ that gives the square of the partition function through $Z^2 = \det M$ is $12\times12$. This matrix provides an example of the block-matrix form of the antisymmetric matrix having the Pfaffian array as its upper right triangle, and is given in terms of the dimer activities $x$ and $y$ by the matrix shown in Figure 10.
This particular matrix has 12 rows and 12 columns, and consists of $4\times4$ blocks called $X$, $Y$, and zero. The matrix is divided into blocks by the red lines in Figure 10, and, in general, such a matrix has the form
For clarity, the zeros are not shown. This is a tridiagonal block matrix, and this form always arises when using serpentine numbering of the lattice sites. The size of each block is the number of sites along a single row of the lattice, as is most easily seen by examining the blocks
Y. The number of blocks is the square of the number of rows in the lattice of sites.
Earlier, we described the diagonalization procedure for a matrix with the form of the block
X. That same procedure can be used to find eigenvectors and eigenvalues of the block form of
M. One can manipulate the blocks just like matrix elements, because block matrices can be multiplied block by block. The notation, though, can be confusing if the matrix, which is
M, is denoted by
M both when it is written as matrix elements and when it is written in blocks. Consequently, let us call the block form
B, writing
when it is in block form, and
when it is written in terms of its matrix elements
.
We will need here our earlier observation that, in general, the dimension of each individual block is the number of columns in the lattice of sites, while the matrix of blocks has a dimension equal to the number of rows. The total number of matrix elements is then the square of the number of sites.
The eigenvectors of the block matrix are the vectors
u with components
that satisfy
The components
will actually be vectors, so that a given
may have components
as in the eigenvectors of the block
X given in Equation (
90). That, however, does not concern us here. Eventually, though, the transformed matrix
M will be cruciform.
Writing the eigenvector in terms of its components, the eigenvalue equation is
The eigenvectors can be found from the rows, which are given by
with boundary conditions
just as in the example of the diagonalization of the
X matrix earlier in this section. Since the block matrix
B is an
array of blocks, the eigenvector has
components.
This difference equation has a solution, once again, of the form
or, in order to keep the angle
positive, a linear combination consisting of
and
. We write these eigenvectors as
and apply the
boundary condition to obtain
and so
, and then
Either of the two terms appearing here could be used to find the eigenvalue
by inserting it into the difference equation. We choose
, and have
which, after canceling
, gives
so that
is an
matrix that represents the “eigenvalue” of a block, irrespective of whether or not
X and
Y are diagonal or even simultaneously diagonalizable.
The second boundary condition
quantizes the angle
, just as it quantized the angle
for the matrix
X. It gives
so that
Taking the logarithm of both sides shows that
where
W is another winding number arising from crossing the branch cut in the logarithm. This is what provides the complete set of eigenvectors, characterized by the various values that the angle
can take on as
W assumes the values
. Then,
are the various values of
. The values
and
are the boundary points where the solution vanishes that were added by padding the ends of the list. The components of the eigenvector
are
with
A the normalization constant determined from
.
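The same quantization can be seen in the scalar analogue of this difference equation. The sketch below assumes an antisymmetric tridiagonal matrix with +x above the diagonal and −x below it, and a phase convention of i^j for the components; both are assumptions made for illustration and may differ from the conventions of Equation (90). It checks that vectors with sine-shaped components and angles equal to integer multiples of π divided by one more than the matrix size are normalized eigenvectors.

```python
import math

def x_block(n, x):
    """Antisymmetric tridiagonal matrix: +x above the diagonal, -x below (assumed form)."""
    X = [[0j] * n for _ in range(n)]
    for j in range(n - 1):
        X[j][j + 1] = x
        X[j + 1][j] = -x
    return X

def eigenpair(n, x, k):
    """Eigenvalue and normalized eigenvector for winding number k = 1..n."""
    phi = k * math.pi / (n + 1)            # quantized angle; vanishes at sites 0 and n+1
    norm = math.sqrt(2.0 / (n + 1))        # normalization constant A
    vec = [norm * (1j ** j) * math.sin(j * phi) for j in range(1, n + 1)]
    return 2j * x * math.cos(phi), vec

def residual(n, x, k):
    """Largest component of X u - lambda u; zero up to rounding for a true eigenpair."""
    X = x_block(n, x)
    lam, u = eigenpair(n, x, k)
    Xu = [sum(X[i][j] * u[j] for j in range(n)) for i in range(n)]
    return max(abs(Xu[i] - lam * u[i]) for i in range(n))

print(max(residual(4, 1.5, k) for k in range(1, 5)))
```

The padded boundary points correspond to j = 0 and j = n + 1, where the sine vanishes identically.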
The expression for the eigenfunction,
can be rewritten by inserting the explicit form of the angle
. This produces
giving
The normalization constant
A was found earlier in this section, in the diagonalization of the
X matrix by requiring that
. This gives
and so the components of the normalized eigenvector are
and the associated eigenvalue is
These are the block eigenvalues
of the block matrix
B. Writing these down the diagonal gives
B in diagonal form,
Each of these eigenvalues is actually an
matrix forming a block of the now-diagonal block matrix
B, and if written out in full represents the original
matrix
M, although that matrix is not yet cruciform. We now set out to make it so.
The block
X is not yet in diagonal form, although we know its eigenvalues, which are
with
. Thus, when diagonalized, we have
The question is, what happens to
Y when we transform both
X and
Y using the unitary transformation
U that diagonalizes
X? The short answer is that the vertical dimer activity
y shows up along the antidiagonal, which is the “diagonal” that runs from the lower left to upper right corners of a matrix. This produces the cruciform matrix
where
. We then have a block matrix consisting of these cruciform blocks, and its determinant gives the square of the partition function.
This cruciform shape results from the transformation
U that diagonalizes
X, which is constructed from the normalized eigenvectors
of
X with the components of each vector
forming the columns of
U, giving
The block
Y, written explicitly in terms of the activity, has entries
y along the antidiagonal and can be written as
y times the matrix with entries of 1 on the antidiagonal. To see what effect such a matrix has on another matrix, a two-dimensional example should be sufficient. We have
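The effect of a matrix with unity along the antidiagonal (often called the exchange matrix) is to reverse the order of the rows of a matrix that it multiplies from the left, and the order of the columns of a matrix that it multiplies from the right. This result implies that the matrix element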
of
is
The general matrix element then consists of the product of the component
and
with
, or in terms of the components of the eigenvector,
and
. The matrix elements of
are
which is derived in detail in
Appendix D.
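The reversal property of the matrix with unity along the antidiagonal is easy to confirm for a size larger than two. The following sketch uses plain nested lists; the helper names are hypothetical.

```python
def exchange_matrix(n):
    """n-by-n matrix with 1 on the antidiagonal and 0 elsewhere."""
    return [[1 if i + j == n - 1 else 0 for j in range(n)] for i in range(n)]

def matmul(A, B):
    """Plain matrix product of two nested-list matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

A = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]
J = exchange_matrix(3)

print(matmul(J, A))   # [[7, 8, 9], [4, 5, 6], [1, 2, 3]]: rows reversed
print(matmul(A, J))   # [[3, 2, 1], [6, 5, 4], [9, 8, 7]]: columns reversed
```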
These are the terms that appear on the antidiagonal of
Y after transforming to the representation in which
X is diagonal. All matrix elements off the antidiagonal of
Y are zero. This
matrix
Y can finally be written as
The transformed matrix
becomes
This matrix
is a block of the matrix
B, which, because it has the form of a cross, is called a cruciform matrix.
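The appearance of the cross shape can be verified numerically. The sketch below assumes the same illustrative conventions for X (antisymmetric tridiagonal with +x above the diagonal) and for its eigenvectors (components proportional to i^j times a sine) and takes Y to be y times the matrix with unity on the antidiagonal. Under these assumptions the transformation leaves X diagonal and Y confined to the antidiagonal, with entries whose magnitude is y; the phases of those entries depend on the convention and are not asserted here.

```python
import math

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def dagger(A):
    """Conjugate transpose."""
    return [[complex(A[j][i]).conjugate() for j in range(len(A))]
            for i in range(len(A[0]))]

def eigenvector_matrix(n):
    """Columns are the normalized eigenvectors of X (assumed phase convention i**j)."""
    norm = math.sqrt(2.0 / (n + 1))
    return [[norm * (1j ** j) * math.sin(j * k * math.pi / (n + 1))
             for k in range(1, n + 1)] for j in range(1, n + 1)]

def transform_blocks(n, x, y):
    """Return (U^dagger X U, U^dagger Y U) for the assumed forms of X and Y."""
    X = [[0] * n for _ in range(n)]
    for j in range(n - 1):
        X[j][j + 1], X[j + 1][j] = x, -x
    Y = [[y if i + j == n - 1 else 0 for j in range(n)] for i in range(n)]
    U = eigenvector_matrix(n)
    Ud = dagger(U)
    return matmul(Ud, matmul(X, U)), matmul(Ud, matmul(Y, U))

Xd, Yd = transform_blocks(4, 1.0, 0.5)
for row_x, row_y in zip(Xd, Yd):
    # Combined magnitudes show the cross: diagonal from X, antidiagonal from Y.
    print([round(abs(a) + abs(b), 3) for a, b in zip(row_x, row_y)])
```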
The partition function is given by the Pfaffian of this matrix
B, and since the square of a Pfaffian is a determinant, it is more convenient to evaluate the square of the partition function, which is the determinant of
B. The block eigenvalues
form the diagonal elements of
B, while all other blocks are zero. The determinant of a diagonalized block matrix like this is the product of the determinants of the individual blocks, and this product is the square of the partition function.
The determinant of a cruciform matrix is easily determined by expanding by the first row of the matrix and then in each minor expanding by the last row. Then each new minor is expanded by the first row and each of the resulting minors by their bottom rows. This repetition creates
factors of the type
. The brackets
indicate the floor function, which gives the greatest integer less than or equal to
x. The square of the partition function is evaluated in
Appendix E, with the result
The square root of this, the partition function itself, is also evaluated in
Appendix E and is given by
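The pairing that underlies the repeated first-row and last-row expansion can be illustrated with a generic cruciform matrix. In the sketch below the diagonal and antidiagonal entries are arbitrary integers rather than the eigenvalues of the dimer problem, and the function names are hypothetical. Each pair of rows k and n − 1 − k contributes one two-by-two determinant, so there are floor(n/2) such factors, with one extra central entry when n is odd.

```python
from fractions import Fraction

def cruciform(diag, anti):
    """Matrix with diag[k] on the diagonal and anti[k] at (k, n-1-k); they add at the centre."""
    n = len(diag)
    C = [[0] * n for _ in range(n)]
    for k in range(n):
        C[k][k] += diag[k]
        C[k][n - 1 - k] += anti[k]
    return C

def paired_determinant(diag, anti):
    """Product of floor(n/2) two-by-two determinants, times the centre entry for odd n."""
    n = len(diag)
    det = 1
    for k in range(n // 2):                 # n // 2 is the floor of n/2
        j = n - 1 - k
        det *= diag[k] * diag[j] - anti[k] * anti[j]
    if n % 2:
        m = n // 2
        det *= diag[m] + anti[m]
    return det

def determinant(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    a = [[Fraction(v) for v in row] for row in matrix]
    n, det = len(a), Fraction(1)
    for k in range(n):
        pivot = next((i for i in range(k, n) if a[i][k] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != k:
            a[k], a[pivot] = a[pivot], a[k]
            det = -det
        det *= a[k][k]
        for i in range(k + 1, n):
            f = a[i][k] / a[k][k]
            for j in range(k, n):
                a[i][j] -= f * a[k][j]
    return det

d, a = [2, 3, 5, 7], [1, 4, 6, 8]
print(determinant(cruciform(d, a)), paired_determinant(d, a))   # both equal -54
```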
9. Colored y Dimers and the Partition Function for Dimers and Monomers
The partition function in Equation (
127) is applicable to any lattice formed from
rows and
columns of lattice points. Suppose that the lattice has the form of a two-leg ladder, with
rows and
columns and that it is completely filled with dimers,
. A dimer arrangement on this lattice is shown in the top of
Figure 14. The
y-direction dimers are colored orange to distinguish them from the
x dimers, which are colored green. Looking down on this ladder so that it is viewed edge-on, as in the bottom picture, it looks like a linear lattice filled with dimers and orange monomers.
What is the effect of randomly coloring the vertical dimers red and blue and assigning them corresponding activities and , just like the red and blue balls discussed earlier? If we could find the partition function for this system, we could have a linear system of dimers, monomers, and vacant sites, even if this problem has not been solved analytically in two dimensions in Fisher’s work.
We can see how to do this by starting with the expression
because
is the number of states accessible to the system when there are precisely
horizontal dimers and
vertical dimers. If the vertical y-dimers come in two colors,
and
, the number of distinct states increases, and does so by a factor of
, where
is the number of red
y-dimers and
is the number of blue
y-dimers, and
. The sum over
gains the additional factor
and we have
The link fields
are colorless and do not distinguish between
and
, even though they throw away a large fraction of the random arrangements of the child’s toys. Thus, the rest of the calculation proceeds as before, the only change being the replacement
, giving
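The replacement of the single vertical activity by a sum of two colored activities is the binomial theorem in disguise. The following sketch, with hypothetical names for the red and blue activities, sums over all ways of coloring a fixed number of vertical dimers and compares the result with the power of the summed activity.

```python
from math import comb

def colored_weight(n_y, y_red, y_blue):
    """Sum over colorings of n_y vertical dimers: choose which r of them are red."""
    return sum(comb(n_y, r) * y_red ** r * y_blue ** (n_y - r)
               for r in range(n_y + 1))

n_y, y_red, y_blue = 5, 2, 3
print(colored_weight(n_y, y_red, y_blue), (y_red + y_blue) ** n_y)   # 3125 3125
```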
We now return to the two-leg ladder. Here, we want a very long ladder,
, while for two legs,
. Evaluating the partition function for
gives
, and we have the logarithm of the partition function given by
The angle
is
, and so the cosine is
, giving
Combining the first term with the sum brings a factor of two inside the log in the sum, giving
This is the correct result for the two-leg ladder. It is not, however, correct for the linear chain. To see why, imagine squeezing the rungs of the ladder together, producing the orange disks from the original
y-oriented dimers, as shown in
Figure 15. This results in a double dimer, one from each leg of the ladder, on pairs of sites that should hold only a single dimer. In the logarithm of the partition function, the origin of the double dimers is the
term, which because we have logarithms, gives
. The correct result for the linear chain,
, has
replaced by
x,
Fisher implies that he confirmed this result by including vacancies in his original formulation and evaluating the Pfaffian, and states that this result is consistent with other ways of determining configurations of a monomer–dimer linear chain.
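One independent check of the replacement of the squared horizontal activity by the activity itself is a direct count on short systems. The sketch below assumes a product form with factors y² + 4w cos²(kπ/(N+1)), multiplied by an extra factor y when the length N is odd, where w stands for x² on the ladder and for x on the chain; this form is inferred from the description above and is labeled as an assumption. The ladder is counted by brute force and the chain by the recurrence obtained from placing either a monomer or a dimer on the last site.

```python
import math

def ladder_Z(n, x, y):
    """Brute-force sum over full dimer coverings of a 2-by-n ladder."""
    cols = n
    def fill(occupied):
        if False not in occupied:
            return 1
        s = occupied.index(False)
        r, c = divmod(s, cols)
        total = 0
        occupied[s] = True
        if c + 1 < cols and not occupied[s + 1]:     # dimer along a leg
            occupied[s + 1] = True
            total += x * fill(occupied)
            occupied[s + 1] = False
        if r == 0 and not occupied[s + cols]:        # dimer on a rung
            occupied[s + cols] = True
            total += y * fill(occupied)
            occupied[s + cols] = False
        occupied[s] = False
        return total
    return fill([False] * (2 * cols))

def chain_Z(n, x, y):
    """Monomer-dimer chain of n sites: Z_n = y * Z_(n-1) + x * Z_(n-2)."""
    prev, cur = 1, y                                 # Z_0 and Z_1
    for _ in range(n - 1):
        prev, cur = cur, y * cur + x * prev
    return cur

def product_form(n, w, y):
    """Assumed closed form: one factor per quantized angle, times y for odd n."""
    prod = float(y) if n % 2 else 1.0
    for k in range(1, n // 2 + 1):
        prod *= y * y + 4 * w * math.cos(k * math.pi / (n + 1)) ** 2
    return prod

x, y = 2, 3
for n in range(1, 7):
    print(n, ladder_Z(n, x, y), round(product_form(n, x * x, y)),
          chain_Z(n, x, y), round(product_form(n, x, y)))
```

In each printed row the second and third numbers agree, as do the fourth and fifth, which is the sense in which the ladder carries x² where the chain carries x.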
10. The Linear Chain
The corrected linear-chain expression gives the partition function for an infinitely long linear chain by letting
and using the integral representation introduced above. This gives
Evaluating the integral gives
where
and
. In the infinite limit of a very long chain,
Under the square root, we have
, giving
Then the log of the partition function per site is
for a long monomer–dimer chain. This result allows the modeling of systems that involve the attachment of monomers and dimers to a long polymeric chain, such as charged dimers and point-like ions binding electrostatically to a DNA strand.
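The long-chain limit can be compared with a direct evaluation on a finite chain. The sketch below assumes that the per-site result has the standard form ln[(y + √(y² + 4x))/2], which is the logarithm of the larger root of λ² = yλ + x; this is an assumption about the form of the expression, chosen because it follows from the monomer-dimer recurrence. The finite-chain value is computed with rescaling at each step to avoid overflow.

```python
import math

def log_Z_per_site(n, x, y):
    """(ln Z_n)/n from Z_n = y * Z_(n-1) + x * Z_(n-2), rescaled at every step."""
    log_scale = 0.0
    prev, cur = 1.0, float(y)              # Z_0 and Z_1
    for _ in range(n - 1):
        prev, cur = cur, y * cur + x * prev
        log_scale += math.log(cur)         # accumulate the logarithm of the growth factor
        prev, cur = prev / cur, 1.0
    return (log_scale + math.log(cur)) / n

def closed_form(x, y):
    """Assumed infinite-chain limit of the log of the partition function per site."""
    return math.log((y + math.sqrt(y * y + 4 * x)) / 2)

x, y = 0.7, 1.3
for n in (10, 100, 10000):
    print(n, log_Z_per_site(n, x, y), closed_form(x, y))
```

The finite-chain values approach the closed form with a correction that falls off as the inverse of the chain length.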
For colored vertical dimers, we replace the activity y by
, with
the activity of red
y-dimers and
y the activity of blue y-dimers. Making this change in the partition function of the one-dimensional very long chain, we have
We can now identify the red
y-dimers with monomers with activity
and the blue
y-dimers with vacancies with activity
, because vacancies have neither energy nor chemical potential, and so
. We will call the activity of the
x dimers
. Then the log of the partition function per site that includes
y dimers and vacancies is
In general, the entropy per site is given by the expression
Therefore, the entropy for the dimer model becomes
where twice the derivative of the partition function is
The derivative of the activity
with respect to
is
Therefore, the derivative of the partition function per site is
This means that the entropy is given by
Simplifying, we have
Similarly, the mean site occupation by the perpendicular dimers
is
Therefore, taking the derivative, we have
The derivative of the activity
with respect to
is
and this derivative is zero when applied to the activity of a different type of particle. Then the mean site occupation by the perpendicular dimers
is
Simplifying, we have
Factoring out
from the parentheses, we have
Now we see that the quantity inside the large parentheses is
, and so the mean site occupancy reduces to
The same steps lead to the mean site occupancy for the vacancies
v, which is
Since the parallel dimers
each occupy two sites, the mean occupancy per site must be doubled, and we have the mean site occupancy for the parallel dimers as
Taking the derivative, we have
Simplifying yields
The sum of the occupancies for the three species should be 1, and so let us check that. The sum is
Substituting
in the numerator, we have
Factoring
from the numerator and expanding, we have
Factoring
from the numerator produces
The last factor on the right is exactly
, and so the right-hand side reduces to one, and we have
Therefore, the total occupancy is, correctly, unity.
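The sum rule can also be confirmed numerically without any algebra. The sketch below assumes that the log of the partition function per site has the form ln[(s + √(s² + 4p))/2], with s the sum of the perpendicular-dimer and vacancy activities and p the parallel-dimer activity; the variable names m, v, and p are hypothetical labels, not the notation of the text. Each occupancy is obtained as a central-difference derivative with respect to the logarithm of the corresponding activity, with the parallel-dimer term doubled because each such dimer covers two sites.

```python
import math

def log_z(m, v, p):
    """Assumed per-site log partition function: m, v, p are the three activities."""
    s = m + v
    return math.log((s + math.sqrt(s * s + 4 * p)) / 2)

def occupancies(m, v, p, h=1e-5):
    """Site occupancies as derivatives of log_z with respect to log-activity."""
    up, dn = math.exp(h), math.exp(-h)
    n_perp = (log_z(m * up, v, p) - log_z(m * dn, v, p)) / (2 * h)
    n_vac = (log_z(m, v * up, p) - log_z(m, v * dn, p)) / (2 * h)
    n_par = 2 * (log_z(m, v, p * up) - log_z(m, v, p * dn)) / (2 * h)
    return n_perp, n_vac, n_par

n_perp, n_vac, n_par = occupancies(0.8, 1.0, 2.5)
print(n_perp, n_vac, n_par, n_perp + n_vac + n_par)
```

The sum equals one for any positive activities because scaling the single-site activities by a factor and the two-site activity by its square scales the per-site partition function by that same factor.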
11. Results for the Entropy, Average Occupation, and Total Charge
Under certain conditions, DNA molecules condense. In this process, the DNA chain rolls into a tight toroid. If a DNA strand were simply placed in aqueous solution, this would not happen, because the phosphate groups that form the DNA backbone are negatively charged, so that one length of the chain repels another. However, in a solution containing cations, the positive cations are electrostatically attracted to the negative backbone, neutralizing it or even inverting the charge. When the charge is inverted, the DNA chain dressed by the cations becomes positive, allowing it to coil up compactly.
Understanding the statistical mechanics of such processes gives insight into the evolution of complex life and the dynamics that maintain it. This led two of the authors to apply methods borrowed from the physics of interacting particles and quantum field theory to the electrostatics of biomolecules [
9,
10] using lattice gas models. Such models regard all ions, monomers, dimers, or larger polyions as pointlike particles, completely abandoning the geometric constraints that come from the extended nature of the adsorbing species. All the geometry inherent in attaching a polyion flat against a surface so that it covers several lattice sites is lost.
In the plots of this section, any numerical parameters used were taken from our earlier work [
9,
10]. In these plots, the horizontal axis is the dimensionless product
. Because parallel dimers occupy two sites, while the perpendicular dimers only occupy one, we set
. Strong binding to the lattice occurs when
is large and negative, while the weak binding region is where
is positive. Assuming that the DNA double helix is in equilibrium with a solution of dimers, all the chemical potentials must be the same and be equal to the chemical potential of dimers in solution given by
. The plots are all for the single physiological temperature of
K, with
.
The Fisher dimer model, as formulated in the previous sections, describes parallel dimers, with activity , perpendicular dimers, with activity , and vacancies with activity v. These vacancies are on the negative backbone, and each of these vacant sites carries a charge of −1, in units of the magnitude of the electron charge. The dimers have charge +2, with one positive charge on each end. Then, electrostatic attraction causes the dimers to bind to the vacant sites. They can do so in two ways, first by lying flat, parallel to the DNA chain, and binding to two vacant sites, so that they have an activity of . Alternatively, they can protrude at right angles to the DNA chain, with only one end bound to the chain and activity . These two possibilities are called parallel and perpendicular dimers, with the perpendicular dimers acting as the monomers in the dimer, monomer, and vacancy model.
Figure 16 shows the entropies given by the Fisher dimer model and by the non-interacting lattice gas model. The striking feature is the way that entropy appears to plateau in the strong binding region. The lattice gas model does not show this because all the dimers lie flat on the lattice by
. The nonzero value of the entropy in the Fisher dimer model suggests that disorder persists in this region even as the binding force becomes quite strong.
To understand this disorder in more detail,
Figure 17 shows the site occupancies for the Fisher dimer model. In the region of strong binding, the site occupancies for both the parallel and the perpendicular dimers also plateau, even though the number of vacancies drops to near zero. This indicates that the disorder that leads to the nonvanishing entropy consists of a mixture of parallel and perpendicular dimers, which occupy practically all the sites and account for the nonzero value of the entropy as the binding force becomes strong.
For comparison,
Figure 18 shows the corresponding site occupancies for the lattice gas model. There, the perpendicular dimer occupancy drops to zero in the region where the binding force is strong, as does the vacancy density, and the lattice is completely occupied by parallel dimers. An interpretation of the contrasting behavior between the two models is that the actual shape of a dimer spanning two sites when it lies parallel is responsible for the disorder, while in the lattice gas model, the parallel dimers are treated as though they are monomers, but with twice the binding energy.
The phenomenon of the DNA strand rolling up compactly requires excess charge, and so
Figure 19 shows the total charge
. This charge is positive when the binding force is strong because in that region there are few negative vacancies, but still a large number of perpendicular dimers, while each parallel dimer neutralizes two sites. That persistent positive charge in the Fisher model does not appear in the lattice gas model, where the charge drops to near zero in that region. In the weak binding region, the charge becomes small in the same way in both models, and the two curves lie on top of one another.
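The origin of the persistent positive charge can be sketched with a few lines of arithmetic. The example below takes each backbone site to carry one negative charge and each dimer two positive charges, so that a vacant site contributes −1, a site holding a perpendicular dimer contributes +1, and sites covered by a parallel dimer contribute 0. The occupancies follow from the assumed closed form of the per-site partition function, and the choice p = m² with v = 1 is a hypothetical parametrization used only for illustration; it is not the parametrization behind the figures.

```python
import math

def occupancies_closed(m, v, p):
    """Occupancies from the assumed per-site partition function (hypothetical names)."""
    s = m + v
    R = math.sqrt(s * s + 4 * p)
    n_perp = m / R
    n_vac = v / R
    n_par = 4 * p / (R * (s + R))
    return n_perp, n_vac, n_par

def charge_per_site(m, v, p):
    """Net charge per site: +1 per perpendicular dimer site, -1 per vacancy, 0 otherwise."""
    n_perp, n_vac, n_par = occupancies_closed(m, v, p)
    return n_perp - n_vac + 0.0 * n_par

for m in (1.0, 10.0, 1.0e3, 1.0e6):
    # Illustrative assumption only: parallel activity equal to the square of m.
    print(m, charge_per_site(m, 1.0, m * m))
```

Under this illustrative parametrization the charge per site does not fall back to zero as the activities grow, but tends to the positive constant 1/√5.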
In the context of DNA compaction, this excess charge is the most physically important feature of the Fisher model, because it leads to larger and more persistent charge inversion than is seen in the lattice gas model. It does so because once a sequence of parallel dimer–vacancy–parallel dimer forms, the only way the vacancy can be filled is with a perpendicular dimer. This must be the origin of the persistent disorder that occurs in the region of strong binding force.