Using Symmetries (beyond Geometric Symmetries) in Chemical Computations: Computing Parameters of Multiple Binding Sites

We show how transformation group ideas can be naturally used to generate efficient algorithms for scientific computations. The general approach is illustrated on the example of determining, from the experimental data, the dissociation constants related to multiple binding sites. We also explain how the general transformation group approach is related to the standard (backpropagation) neural networks; this relation justifies the potential universal applicability of the group-related approach.


Why Use Group Theory in General Scientific Computations?
Use of symmetries in chemistry: a brief reminder. In many practical situations, physical systems have symmetries, i.e., transformations that preserve certain properties of the corresponding physical system. For example, a benzene molecule C$_6$H$_6$ does not change if we rotate it by $60^\circ$: this rotation simply replaces one carbon atom by another one. The knowledge of such geometric symmetries helps in chemical computations; see, e.g., [2,3,5].
Group theory: a mathematical tool for studying symmetries. Since symmetries are useful, once we know one symmetry, it is desirable to know all the symmetries of a given physical system. In other words, once we list the properties which are preserved under the original symmetry transformation, it is desirable to find all the transformations that preserve these properties.

If a transformation $f$ preserves the given properties, and the transformation $g$ preserves these properties, then their composition $h(x) = f(g(x))$ also preserves these properties. For example, if the lowest energy level of the molecule does not change when we rotate it 60 degrees, and does not change when we rotate it 120 degrees around the same axis, then it also will not change if we first rotate it 60 degrees and then 120 degrees, to the total of 180 degrees.

Similarly, if a transformation $f$ does not change the given properties, then the inverse transformation $f^{-1}$ also does not change these properties. So, the set of all transformations that preserve given properties is closed under composition and inverse; such a set is called a transformation group or symmetry group.
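The closure properties described above can be checked mechanically on the benzene example. The following is a minimal sketch, assuming that we label the six carbon atoms 0 to 5 around the ring and represent each rotation by the permutation of labels that it induces; the names `rotation`, `compose`, `inverse`, and `is_group` are hypothetical helpers introduced only for this illustration.

```python
from itertools import product

N_ATOMS = 6  # assumption: carbon atoms of benzene, labelled 0..5 around the ring


def rotation(steps):
    """Permutation of atom labels induced by a rotation of steps * 60 degrees."""
    return tuple((i + steps) % N_ATOMS for i in range(N_ATOMS))


def compose(f, g):
    """Composition h(x) = f(g(x)) of two permutations given as tuples."""
    return tuple(f[g[i]] for i in range(len(g)))


def inverse(f):
    """Inverse permutation, i.e., inverse(f)[f[i]] == i for every i."""
    inv = [0] * len(f)
    for i, image in enumerate(f):
        inv[image] = i
    return tuple(inv)


def is_group(transformations):
    """Check closure under composition and under taking the inverse."""
    closed = all(compose(f, g) in transformations
                 for f, g in product(transformations, repeat=2))
    has_inverses = all(inverse(f) in transformations for f in transformations)
    return closed and has_inverses


# All six rotations by multiples of 60 degrees.
ROTATIONS = {rotation(k) for k in range(N_ATOMS)}
```

In this sketch, `compose(rotation(1), rotation(2))` coincides with `rotation(3)`, which mirrors the 60 + 120 = 180 degrees example, and `is_group(ROTATIONS)` holds, while a set containing only the identity and the 60-degree rotation is not closed under composition.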

Mathematical analysis of such transformations is an important part of group theory.

Problems of scientific computations: a brief reminder. In this paper, we argue that group theory can be used in scientific computations beyond geometric symmetries. To explain our idea, let us briefly recall the need for scientific computations.

One of the main objectives of science is to be able to predict future behavior of physical systems. To be able to make these predictions, we must find all possible dependencies $y = F(x_1, \ldots, x_n)$ between different physical quantities. Often, we only know the general form of the dependence, i.e., we know [...] celestial body depends on its spatial location, but this description contains masses $c_i$ of celestial bodies; these masses must be determined based on the astronomical observations.

In general, to be able to predict the value of a desired quantity $y$ for which we know the form of the dependence $y = G(x_1, \ldots, x_n, c_1, \ldots, c_m)$, we must do the following:

• first, we use the known observation x [...]

In scientific computation, the first problem is known as the inverse problem and the second problem as the forward problem. Usually:

• the forward problem is reasonably straightforward: it consists of applying a previously known algorithm, while
• an inverse problem is much more complex since it requires that we solve a system of equations, and for this solution, no specific algorithm is given.
Inverse problem as the problem of finding the inverse transformation. In the idealized case, when we can ignore the measurement uncertainty, the generic inverse problem can be reformulated as follows: [...] solving a complex inverse problem, we represent it as a sequence of easier-to-solve problems.

For example, everyone knows how to solve a quadratic equation $a \cdot x^2 + b \cdot x + c = 0$. This knowledge can be effectively used if we need to solve a more complex equation $a \cdot x^4 + b \cdot x^2 + c = 0$. Then:

• first, we solve the equation $a \cdot y^2 + b \cdot y + c = 0$ and find $y$;

• next, we solve an equation $x^2 = y$ with this $y$ and find the desired value $x$.
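The two steps above can be sketched as follows. This is only an illustration, assuming that the more complex equation is the biquadratic $a \cdot x^4 + b \cdot x^2 + c = 0$ (which is what the substitution $y = x^2$ implies) and that $a \neq 0$; the function names are hypothetical, and complex arithmetic is used so that negative discriminants do not need a separate case.

```python
import cmath


def solve_quadratic(a, b, c):
    """Both (possibly complex) roots of a*y**2 + b*y + c = 0, assuming a != 0."""
    d = cmath.sqrt(b * b - 4 * a * c)
    return [(-b + d) / (2 * a), (-b - d) / (2 * a)]


def solve_biquadratic(a, b, c):
    """Roots of a*x**4 + b*x**2 + c = 0 as a composition of two easier problems."""
    roots = []
    # Step 1: solve the quadratic equation for y = x**2.
    for y in solve_quadratic(a, b, c):
        # Step 2: solve x**2 = y for this y; both signs of the square root qualify.
        r = cmath.sqrt(y)
        roots.extend([r, -r])
    return roots
```

For instance, for $x^4 - 5 \cdot x^2 + 4 = 0$, the first step gives $y = 4$ and $y = 1$, and the second step gives $x = \pm 2$ and $x = \pm 1$.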

In general, if we represent a transformation $f$ as a composition [...]

The simplest case is when we have a system of linear equations. In this case, there are well-known feasible algorithms for solving this system (i.e., for inverting the corresponding linear transformation).

It would be nice if we could always only use linear transformations, but alas, a composition of linear transformations is always linear. So, to represent general non-linear transformations, we need to also consider some systems of non-linear equations.
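The fact that a composition of linear transformations is again linear can be verified in one line. In the short derivation below, the matrices $A$, $B$ and the vectors $a$, $b$ are notation introduced only for this illustration.

```latex
% Two linear (affine) transformations, with illustrative coefficients A, a and B, b:
%   f(x) = A x + a,   g(x) = B x + b.
\[
  f(g(x)) = A\,(B\,x + b) + a = (A\,B)\,x + (A\,b + a).
\]
% The result has the same form C x + c, with C = A B and c = A b + a,
% so composing such transformations never leaves the linear class.
```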

For nonlinear systems, in general, the fewer unknowns we have, the easier it is to solve the system. [...] approximators; see, e.g., [1,4]. Specifically, in a 3-layer neural network with $K$ hidden neurons:

• we first compute $K$ linear combinations $y_k$ of the inputs;
• then, we apply, to each value $y_k$, a function $s_0(y)$ of one variable, resulting in $z_k = s_0(y_k)$; usually, a sigmoid function $s_0(y) = \dfrac{1}{1 + \exp(-y)}$ is used;

• finally, we compute a linear combination $y = \sum_{k=1}^{K} W_k \cdot z_k - W_0$.
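The three steps above can be sketched as a forward pass. This is a minimal illustration, not a definitive implementation: the function and parameter names are hypothetical, and we assume that the hidden-layer linear combinations use weights `w[k][i]` and subtract a threshold `w0[k]`, by analogy with the output-layer formula $y = \sum_k W_k \cdot z_k - W_0$.

```python
import math


def sigmoid(y):
    """The sigmoid function s0(y) = 1 / (1 + exp(-y))."""
    return 1.0 / (1.0 + math.exp(-y))


def forward(x, w, w0, W, W0):
    """Output of a 3-layer network with K = len(W) hidden neurons.

    x  : list of n inputs
    w  : K lists of n hidden-layer weights (assumed layout)
    w0 : K hidden-layer thresholds, subtracted (assumed convention)
    W  : K output weights; W0 : output threshold
    """
    # Step 1: K linear combinations of the inputs.
    y_hidden = [sum(w_ki * x_i for w_ki, x_i in zip(w_k, x)) - w0_k
                for w_k, w0_k in zip(w, w0)]
    # Step 2: a function of one variable applied to each value y_k.
    z = [sigmoid(y_k) for y_k in y_hidden]
    # Step 3: a linear combination of the values z_k.
    return sum(W_k * z_k for W_k, z_k in zip(W, z)) - W0
```

Each step is either a linear transformation or a function of one variable, which is exactly the kind of composition discussed in the text.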
For the case of several ($S$) binding sites, $B$ is a linear combination of terms corresponding to different binding sites, i.e.,

$$B = \sum_{s=1}^{S} \frac{R_s \cdot [L]}{k_{ds} + [L]} \quad (1)$$

for appropriate values $R_s$ and $k_{ds}$.
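The corresponding forward problem is easy to compute. The sketch below assumes that each binding site contributes a term of the form $R_s \cdot x / (k_{ds} + x)$, where $x = [L]$ is the ligand concentration (these are the fractions that are added up later in the text); the function name `bound_proportion` is hypothetical.

```python
def bound_proportion(x, R, kd):
    """Bound proportion B for ligand concentration x.

    R  : list of the values R_s, one per binding site
    kd : list of the dissociation constants k_ds, in the same order
    Assumes each site contributes R_s * x / (k_ds + x).
    """
    return sum(R_s * x / (kd_s + x) for R_s, kd_s in zip(R, kd))
```

For example, with two sites, `bound_proportion(1.0, [1.0, 2.0], [1.0, 3.0])` adds the terms 1/2 and 2/4.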

Inverse problem corresponding to the case study. The problem is to find the values $R_s$ and $k_{ds}$ from the observations. In other words, we observe the bound proportions $y^{(k)}$ for different ligand concentrations $[L] = x^{(k)}$, and we want to find the values $R_s$ and $k_{ds}$ for which

$$y^{(k)} = \sum_{s=1}^{S} \frac{R_s \cdot x^{(k)}}{k_{ds} + x^{(k)}}. \quad (2)$$

How to use group-theoretic ideas to simplify the corresponding computations: analysis of the problem. The system (2) is a difficult-to-solve system of nonlinear equations with $2S$ unknowns.

To simplify the solution of this system, let us represent its solution as a composition of linear transformations and functions of one variable.

By adding all $S$ fractions $\dfrac{R_s \cdot x}{k_{ds} + x}$, we get a ratio of two polynomials $\dfrac{P(x)}{Q(x)}$. Here, $Q(x)$ is the product of all $S$ denominators $x + k_{ds}$, and is, thus, an $S$-th order polynomial with the leading term $x^S$:

$$Q(x) = x^S + q_{S-1} \cdot x^{S-1} + \ldots + q_0.$$

Similarly, since $P(x)$ is divisible by $x$, we get $P(x) = p_S \cdot x^S + p_{S-1} \cdot x^{S-1} + \ldots + p_1 \cdot x$.

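The equations $y^{(k)} = \dfrac{P(x^{(k)})}{Q(x^{(k)})}$ can be equivalently represented as $y^{(k)} \cdot Q(x^{(k)}) = P(x^{(k)})$, i.e., as

$$y^{(k)} \cdot \left(x^{(k)}\right)^S + q_{S-1} \cdot y^{(k)} \cdot \left(x^{(k)}\right)^{S-1} + \ldots + q_0 \cdot y^{(k)} = p_S \cdot \left(x^{(k)}\right)^S + p_{S-1} \cdot \left(x^{(k)}\right)^{S-1} + \ldots + p_1 \cdot x^{(k)}. \quad (4)$$

This is a system of linear equations with $2S$ unknowns $p_i$ and $q_i$. Solving this system of linear equations is relatively easy.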
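To illustrate that this step is indeed a routine linear-algebra computation, here is a minimal sketch. It assumes exactly $2S$ noise-free observations with distinct nonzero concentrations, orders the unknowns as $p_1, \ldots, p_S, q_0, \ldots, q_{S-1}$, and uses exact rational arithmetic so that the result can be checked by hand; the names `solve_linear` and `fit_polynomials` are hypothetical.

```python
from fractions import Fraction


def solve_linear(A, b):
    """Gauss-Jordan elimination for a square system A z = b.

    Exact when the entries are Fractions; raises StopIteration if A is singular.
    """
    n = len(A)
    M = [list(row) + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        # Find a row with a nonzero entry in this column and move it up.
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        # Normalize the pivot row, then eliminate the column from all other rows.
        M[col] = [v / M[col][col] for v in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [v - factor * pv for v, pv in zip(M[r], M[col])]
    return [row[n] for row in M]


def fit_polynomials(xs, ys, S):
    """Return (p, q) with p = [p_1, ..., p_S] and q = [q_0, ..., q_{S-1}].

    Each observation (x, y) gives one linear equation, obtained from
    y * Q(x) = P(x) by moving all the unknowns to the left-hand side:
        sum_j p_j * x**j - sum_i q_i * y * x**i = y * x**S.
    """
    A = [[x ** j for j in range(1, S + 1)] + [-y * x ** i for i in range(S)]
         for x, y in zip(xs, ys)]
    b = [y * x ** S for x, y in zip(xs, ys)]
    solution = solve_linear(A, b)
    return solution[:S], solution[S:]
```

As a check, for two sites with $R = (1, 2)$ and $k_d = (1, 3)$, the sum of the two fractions is $(3 \cdot x^2 + 5 \cdot x) / (x^2 + 4 \cdot x + 3)$, so four exact observations should lead to $p_1 = 5$, $p_2 = 3$, $q_0 = 3$, and $q_1 = 4$. With noisy measurements or more than $2S$ observations, a least-squares solver would be a natural replacement for the exact elimination used in this sketch.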

Once we solve this linear system and find the values $q_i$, we can find the parameters $k_{ds}$ from the condition that for $x = -k_{ds}$, we have $x + k_{ds} = 0$ and thus, the product $Q(x)$ of all such terms is equal to 0. [...]

Inverse problem corresponding to the case study: resulting algorithm. We start with the values $y^{(k)}$ of the bound proportion corresponding to different ligand concentrations $x^{(k)}$. Our objective is to find the parameters $R_s$ and $k_{ds}$ of different binding sites $s = 1, \ldots, S$. To compute these parameters, we do the following:

• first, we solve the linear system (4) with $2S$ unknowns $p_i$ and $q_i$;