On Exact and Approximate Approaches for Stochastic Receptor-Ligand Competition Dynamics—An Ecological Perspective

Cellular receptors on the cell membrane can bind ligand molecules in the extra-cellular medium to form ligand-bound monomers. These interactions ultimately determine the fate of a cell through the resulting intra-cellular signalling cascades. Often, several receptor types can bind a shared ligand leading to the formation of different monomeric complexes, and in turn to competition for the common ligand. Here, we describe competition between two receptors which bind a common ligand in terms of a bi-variate stochastic process. The stochastic description is important to account for fluctuations in the number of molecules. Our interest is in computing two summary statistics—the steady-state distribution of the number of bound monomers and the time to reach a threshold number of monomers of a given kind. The matrix-analytic approach developed in this manuscript is exact, but becomes impractical as the number of molecules in the system increases. Thus, we present novel approximations which can work under low-to-moderate competition scenarios. Our results apply to systems with a larger number of population species (i.e., receptors) competing for a common resource (i.e., ligands), and to competition systems outside the area of molecular dynamics, such as Mathematical Ecology.


Introduction
Receptors play a significant role in determining the fate of cells through the initiation of intra-cellular phosphorylation (signalling) cascades [1][2][3][4][5]. For receptors on the cellular membrane, these cascades can be initiated by the binding of a ligand molecule, in the extra-cellular medium, to the extra-cellular domain of the receptor, forming a monomer. Tyrosine residues on the intra-cellular tail of the bound receptor can become phosphorylated, in turn, allowing the receptor to interact with downstream signalling proteins in the cellular cytoplasm. The fate of the cell, for example division, death or migration, will be determined by the specific receptor-ligand interaction and by the number of phosphorylated monomers. Often, ligand molecules are capable of binding to more than one kind of receptor [4,6]. These receptors may be similar in structure, yet different receptor-ligand interactions might lead to strikingly diverse cellular outcomes [7]. The strength of the signalling determines cellular fate, and in some systems this strength can be assumed to be proportional to the number of monomers formed [8,9]. It is therefore important to quantify the number, and timescales of formation, of the different ligand-bound monomers on the cell membrane.
analytical approximations for the summary statistics of interest. Our approximations, for the two-dimensional system of interest (i.e., two kinds of receptor competing for a common ligand), are motivated by the fact that, under excess of ligand, the processes representing ligand-bound monomer formation for the two different receptors are approximately independent. The structure of this paper is as follows. In Section 2 the bi-variate stochastic model for two kinds of receptor, R 1 and R 2 , with a shared ligand, L, is introduced. In Section 3 we describe how to analyse two summary statistics: (i) the steady-state distribution of the number of monomers and (ii) the time to reach N monomers for receptor type 2. We carry out this analysis by means of both (exact) matrix-oriented approaches and novel approximations. Having proposed these approximations, we then test their accuracy for a wide range of biologically relevant parameter values and numbers of molecules in Section 4. We also consider a different summary statistic: the time to reach a threshold number N of productive monomers (i.e., monomers that remain bound for longer than a given dwell time τ). Finally, Section 5 provides a discussion of our results. Details of the matrix-oriented algorithms used in this paper are given in Appendices A and B. Appendix C has some additional numerical results, while Appendix D provides a comparison of the efficacy of the different computational methods.

A Stochastic Competition Model for Receptor-Ligand Interactions
We consider two receptor types, R 1 and R 2 , on the cell membrane, and a shared ligand, L, in the extra-cellular medium. Both receptors can associate with the ligand to form a ligand-bound monomer, which we denote by M 1 or M 2 , depending on the receptor type involved. We assume a spatially homogeneous distribution of molecules on the cell membrane and mass-action kinetics [33]. We also assume that these reactions occur with forward association rates k f ,1 and k f ,2 , respectively. We allow for the monomers, M 1 and M 2 , to dissociate into their constituent receptor and ligand, with rates k r,1 and k r,2 , respectively (see Figure 1a).
If we denote by n R,1 and n R,2 the total number of receptors available in the system, and by n L the total number of ligands, we can write n R,1 = R 1 (t) + M 1 (t) and n R,2 = R 2 (t) + M 2 (t), for all t ≥ 0, where R 1 (t) and R 2 (t) represent the number of free receptors of type 1 and 2, respectively, at time t, and M 1 (t) and M 2 (t) represent the number of monomers of type 1 and 2, respectively, at time t. Then, the process can be described as a bi-variate continuous-time Markov chain (CTMC) X = {X(t) = (M 1 (t), M 2 (t)) : t ≥ 0}, with M 1 (t), M 2 (t) ≥ 0 for all t ≥ 0, and the process X evolves over the state space S X = {(m 1 , m 2 ) ∈ (N ∪ {0}) 2 : m 1 ≤ n R,1 , m 2 ≤ n R,2 , m 1 + m 2 ≤ n L }, where m 1 and m 2 are the number of type 1 and type 2 monomers, respectively, at any given time.

Figure 2.
A diagram of the two-dimensional Markov process X . Each state in the process represents a number of monomers of each type (m 1 , m 2 ), and the transitions between states represent formation and dissociation of type 1 and type 2 monomers over time. The maximal values of m 1 and m 2 will depend on the parameters n R,1 , n R,2 and n L , and will determine the overall shape of S X .
The process shown in Figure 2 is an example of a two-dimensional competition process, as introduced by Reuter [26] in 1961. In this type of process, jumps can be made from a given state to their nearest neighbours, as seen by the connecting arrows between states in Figure 2. Here we have two axes, each of which corresponds to a different receptor type, that is, the x-axis represents the number of type 1 monomers formed and the y-axis represents the number of type 2 monomers formed. In principle, as introduced by Iglehart [27] in his generalisation of Reuter's work [26], one could add additional axes to represent further receptor variants competing for the same ligand; thus, expanding our model in Figure 1 to a multi-variate competition process (see the discussion in Section 5).
To illustrate the competition dynamics generated by the reactions in Figure 1a, we simulate in Figure 3 the stochastic model by means of the Gillespie algorithm [34]. We simulate the process for different numbers of molecules and K d values of the reactions, where K d,i = k r,i k f ,i for i ∈ {1, 2}. We vary these parameter values, as well as the number of available receptors of type 2, to see the effect on the competition dynamics. In all plots in Figure 3, n L = 10 2 , k r,1 = k r,2 = 10 −3 s -1 and n R,1 = 10 2 . In the first row, for equal K d values we vary the number, n R,2 , of type 2 receptors. In the second row, for equal number of receptors we vary K d,2 .
In the first plots of each row in Figure 3, we see that the process favours the formation of M 1 over M 2 . In this case, receptor type 1 out-competes receptor type 2 for ligand binding. In the second plot of each row, the number of molecules and rate constants governing the formation of M 1 and M 2 are identical and hence we see that the time courses for the two monomers are at a similar level, and are sometimes overlapping. Finally in the third plots of each row, we see that the receptors of type 2 are out-competing the receptors of type 1, and hence the time courses for M 2 are above those of M 1 . We also notice from Figure 3 that in some of the plots the process appears to reach a steady state by the end of the time course. One of the aims of this paper therefore, is to examine the effect of the rate constants and the number of molecules on the steady-state distribution of the process.
In Section 3 we compute two summary statistics of interest, namely the steady-state distribution of monomers on the cell surface [29,35] and the mean time to reach N monomers of one receptor type [29,36]. For each summary statistic, we present a computationally efficient matrix-analytic approach for its exact analysis, which can be implemented for low receptor and ligand numbers. Throughout this paper we will refer to this method of analysis as the exact matrix-analytic approach (EMA). Along with the exact analysis, and for scenarios where the large number of molecules leads to a huge number of states in S X , in turn making the EMA computationally intractable, we propose novel approximations to compute the proposed summary statistics. Finally, in Section 4 we compare the exact methods with the approximations, in order to quantify the goodness of the new approximations under different parameter regimes (i.e., in terms of the parameter values (k f ,1 , k r,1 , k f ,2 , k r,2 ) of the model and the number of molecules).

Steady-State Distribution
In this section we analyse the steady-state distribution for the mathematical model defined in Section 2. We show how this distribution can be analytically computed by solving matrix equations in a computationally efficient way. Since this methodology is computationally limited by the number of molecules in the system, we propose as well approximations, which are based on the analysis of the process X when n L → +∞ or, equivalently, n L n R,1 , n R,2 . That is, instances when the competition for ligand is low. This analysis, when adjusting the number of available ligands to an effective number of ligands, leads to an approximation that can be used in low-to-moderate competition scenarios.
The steady-state distribution for the model in Figure 1 is described in terms of the following probabilities which can be stored in a row vector π = (π (m 1 ,m 2 ) , (m 1 , m 2 ) ∈ S X ) for any given order of states in S X . These probabilities correspond to the number of monomers, of type 1 and type 2, respectively, found on the cellular surface at late times. They are known to satisfy [28] the system of equations where Q is the infinitesimal generator of the CTMC X , #S X is the number of states in the state space, 0 a is a row vector of zeros with length a and 1 a is a column vector of ones with length a. The infinitesimal generator Q encodes the infinitesimal transition rates. Since the transitions described in Figure 2 only occur between adjacent states (see also Figure 1b), it is possible to order the states in S X so that Q is tri-diagonal by blocks, as shown in Appendix A. This means that the process X can be described as a level-dependent quasi-birth-and-death (LD-QBD) process [28]. For this type of process, efficient computational methods exist to solve Equation (2). In particular we use Algorithm A1, as presented in Appendix A, to compute the probabilities π EMA (m 1 ,m 2 ) for all (m 1 , m 2 ) ∈ S X . Once these probabilities are in hand, one can compute the mean number of monomers at steady-state as with N 1 = min(n R,1 , n L ) and N 2 = min(n R,2 , n L ). However, as the number of molecules increases (see Appendix A), Algorithm A1 becomes computationally expensive, and an alternative approach needs to be considered.

No-Competition Approximation (NCA) under Excess of Ligand
We seek to find an approximation to the steady-state distribution which can be computationally tractable even for a large state space. To this aim, we note that as n L → +∞ (or, equivalently, n L n R,1 , n R,2 ), the two populations of monomers behave independently of one another given the excess of ligand. We therefore hypothesised that we could approximate the steady-state distribution in the large n L limit by considering two independent Markov chains for the variables M 1 (t) and M 2 (t). In the limit n L → +∞, given that the monomers are linked only by the number of ligands present, the steady-state probabilities for the two-dimensional process will be precisely equal to the product of the steady-state probabilities for the one-dimensional processes modelling M 1 (t) and M 2 (t). This product, in the limit n L → +∞, comprises what we denote by the no-competition approximation (NCA).
In particular, we define two independent Markov chains for the monomers M 1 and M 2 , respectively, with state spaces We note that, since the number of monomers in each CTMC can only increase or decrease by one unit in every transition (by means of monomer formation or dissociation), each CTMC X j , for j ∈ {1, 2}, is a birth-and-death process as depicted in Figure 4, with λ n = k f ,j (n R,j − n)(n L − n) and µ n = k r,j n. Figure 4. Diagram for the birth-and-death process X j , j ∈ {1, 2}.
In the limit n L → +∞, one can consider a constant number of ligands (not depleted by monomer formation); the binding rates then become λ n = k f ,j n L (n R,j − n). For these rates, Equation (2) can be easily solved [25], as follows (5) which can be identified as binomial distributions. For these one-dimensional birth-and-death processes, X 1 and X 2 , one can determine the expected values of monomers in steady-state as In the limit n L → +∞, the steady-state distribution of the process X is given by We note that, when n L → +∞, one gets N 1 = n R,1 , N 2 = n R,2 and S X = {(m 1 , m 2 ) ∈ N 2 0 : 0 ≤ m 1 ≤ N 1 , 0 ≤ m 2 ≤ N 2 } = S X 1 × S X 2 , so that Equations (4)-(7) are consistent.
It is clear that Equation (7) will also provide a reasonable approximation to the steady-state distribution when the number of ligands is much greater than the number of receptors, that is, n L n R,1 , n R,2 , since we are then in a situation in which there is very little competition between the receptors for available ligand, and hence, ligand depletion can be neglected, leading to two independent birth-and-death processes for the two different monomer populations. On the other hand, when n L is comparable in magnitude to n R,1 + n R,2 , Equation (7) overestimates the probabilities to reach higher numbers of monomers of each type, since it does not take into account the fact that ligands are being depleted by the formation of monomers in the competing Markov chain. To effectively approximate the steady-state distribution in scenarios where n L ≈ n R,1 + n R,2 , we propose to employ an effective number of ligands available to each birth-and-death process, n * L , which is lower than the total number of ligands in the two-dimensional process, n L . We propose that this effective number of ligands, and therefore an approximation to the steady-state distribution, can be reached iteratively using an algorithmic approach.

Moderate-Competition Approximation (MCA)
Here we propose Algorithm 1 to compute an approximate steady-state distribution, π MCA (m 1 ,m 2 ) , which can be used when n L ≈ n R,1 + n R,2 . In Algorithm 1, we iteratively reduce the total number of ligands available in each independent birth-and-death process X 1 and X 2 , by subtracting from n L the expected number of each monomer type in steady-state computed from the previous iteration. In each iteration i, the mean number of monomers in steady-state under the NCA approximation, , are computed, and they are used to compute an effective number of free ligands available N (i+1) L , at iteration i + 1. The iterative scheme stops once these mean values are close enough for two consecutive iterations, as determined by a threshold value . We note that the final number of ligands considered, n * L = N (i) L in the last iteration, can be seen as an effective number of ligands that reflects the competition for shared resources.
L as the number of ligands in the system, instead of n L ; , (m 1 , m 2 ) ∈ S X ; MCA steady-state distribution 11: π MCA (m 1 ,m 2 ) ; Mean number of type 1 monomers The parameter α (α ∈ [0, 1]) is a tuning parameter needed in situations where n L < n R,1 + n R,2 . It modulates the convergence speed of the algorithm. We set α = 1 when n L ≥ n R,1 + n R,2 , since in this case it is impossible for ] to be negative at any iteration. Otherwise, when n L < n R,1 + n R,2 , the parameter α is chosen between 0 and 1 such that the effective number of ligands at every iteration in the algorithm is positive. Our numerical exploration indicates that the the choice of α does not affect the resulting steady-state probabilities π MCA (m 1 ,m 2 ) . When α = 1, the expected number of monomers and ligands in steady-state exhibit damped oscillations as they converge. An example of this can be found on the left plot of Figure 5. When α < 1, for values of α for which the algorithm converges, the expected number of monomers and ligands in steady-state either exhibit damped oscillations as they converge (for larger values of α), or they decrease monotonically in convergence (for smaller values of α). An example of monotonic convergence can be seen in the right plot of Figure 5. We found that for values of α for which there is monotonic convergence, the smaller the value of α, the more iterations the algorithm requires to converge. We note that in Figure 5, the mean number of free ligands in steady-state, as predicted by the EMA, is given by . Interestingly, we note how the values of N (i) L tend, in the limit i → +∞, to this mean number of free ligands in steady-state. . In both plots, = 10 −5 , n R,1 = 20, n R,2 = 30, K d,1 = 1, K d,2 = 1, and α = 0.2 for the right plot.
In Section 4.1, we conduct a comprehensive numerical exploration of the accuracy of the MCA calculated with Algorithm 1 when compared to the exact steady-state distribution (EMA).

Timescales of Monomer Formation
In this section we study a second summary statistic of interest; namely the timescales of monomer formation. For the bi-variate process X , we define to be the time to reach N monomers of type 2 (without loss of generality, since our arguments would similarly apply to type 1 monomers) given the initial state (m 1 , m 2 ) ∈ S X . N can be any arbitrary value Following the same strategy as that of Section 3.1, we first analyse T (m 1 ,m 2 ) (N) by means of an exact matrix-analytic approach. This implies solving a matrix equation for the two-dimensional process X described in Figure 2. This EMA is once again computationally limited by the number of molecules in the system. Thus, we analyse T (m 1 ,m 2 ) (N) in the limit as n L → +∞ (i.e., under effectively no competition for the shared ligand). We call this approximation the NCA. Based on this approximation, we propose the MCA, where an effective number of ligands is considered when computing this summary statistic.

Exact Matrix-Analytic Approach (EMA)
In this case, we can efficiently analyse T (m 1 ,m 2 ) (N), for (m 1 , m 2 ) ∈ S X with m 2 ≤ N, by means of first-step arguments. We omit the index N from now on to simplify our notation. One can, according to arguments described in Appendix B, compute the l-th order moment of T (m 1 ,m 2 ) , E T l (m 1 ,m 2 ) , by solving the system of linear equations: with This system of equations can be efficiently solved by arranging the states of the bi-variate Markov process X in levels L(k), 0 ≤ k ≤ N 2 , as done for the steady-state distribution and detailed in Appendices A and B. We note that the system of equations given by Equation (8)

No-Competition Approximation (NCA) under Excess of Ligand
As an approximation under no competition (n L → +∞), we can compute the mean time to reach N type 2 monomers by considering the independent one-dimensional birth-and-death process X 2 . In particular, we can analyse T m 2 Let us focus on the mean value E[T m 2 (N)] for m 2 = 0 (that is, no type 2 monomers initially in the system), although we note that our arguments can be easily modified for different initial states 0 < m 2 < N, or for higher order moments of T m 2 (N). It is clear that, for N = 1, E[T 0 (1)] = 1 λ 0 . It is also clear from a first-step argument [25] that This allows one to recursively obtain which is similar to the well-known extinction time expressions for a one-dimensional birth-and-death process [25]. Thus, we propose the following approximation when n L n R,1 + n R,2 . In the case when n L is comparable to n R,1 + n R,2 , it is clear that E NCA [T (0,0) (N)] will underestimate the exact mean time to reach N type 2 monomers, E EMA [T (0,0) (N)], since it does not take into account the ligands being depleted by the formation of monomers in the competing Markov chain. Thus, we propose next, an approximate method (MCA) to calculate E[T (0,0) (N)], which can be used under low-to-moderate competition scenarios (i.e., when n L ≈ n R,1 + n R,2 ).

Moderate-Competition Approximation (MCA)
Since the NCA underestimates the time to reach N type 2 monomers, we now define the MCA, in order to approximately account for the depletion of ligand. In particular, we propose to compute E MCA [T (0,0) (N)] by implementing Equation (9), but with an effective number of free ligands We expect that the EMA will be approximated best by the NCA or the MCA depending on the number of molecules n R,1 , n R,2 and n L and the parameter values k f ,1 , k f ,2 , k r,1 and k r,2 . We explore the relationship between the EMA, NCA and MCA in terms of the parameters values and numbers of molecules in Section 4.2.

Results
In this section we analyse the behaviour of the different approximations described in Section 3 and compare them with the exact matrix-analytic approach. We do so numerically and for a wide range of parameter values and numbers of molecules, so as to identify regions of the parameter space for which the approximations are most accurate. In order to reduce the complexity of the numerical exploration we consider only the K d value for each reaction in Figure 1, where K d,i = k r,i /k f ,i for i ∈ {1, 2}, and not the individual forward and backward rate constants, by fixing k r,1 = k r,2 = 10 −3 s −1 [6,37]. The value chosen for these rates is the common rate at which VEGFR1 and VEGFR2 dissociate their shared ligand, VEGF-A. The K d values and numbers of each receptor type are varied within the ranges given in Table 1, inspired from Reference [24] (Table 3), and which correspond to VEGFR1, VEGFR2 and VEGF-A. Finally, the number of ligands, n L , is varied to consider different competition regimes between the receptors, so that the comparisons between the EMA, NCA and MCA can be studied. Table 1. Ranges for the parameter values and numbers of molecules in the model.

Parameter
Range Unit

Steady-State Distribution
In this section we wish to compare the steady-state distributions computed via the EMA and the MCA, to determine how well the approximation works under different combinations of parameter values and number of molecules. We first present (see the first three columns of Figure 6) a comparison of the steady-state distributions computed via the EMA, NCA and MCA, for different numbers of ligands, n L , and where the number of receptors and other parameter values remain constant (see Table 1). In these heatmaps, the colour of each state indicates its steady-state probability.
In the fourth and fifth columns, the colour represents the absolute difference |π EMA (m 1 ,m 2 ) − π NCA (m 1 ,m 2 ) | or |π EMA (m 1 ,m 2 ) − π MCA (m 1 ,m 2 ) |, respectively, for each state (m 1 , m 2 ) ∈ S X . As the number of ligands, n L , decreases and becomes comparable to n R,1 + n R,2 , the NCA approximation worsens since the number of monomers of each type is overestimated. In contrast, we see that the MCA approximation remains similar to the EMA distribution, even for the lowest value of n L . It also better captures the mean number of monomers in steady-state than the NCA. Having seen in Figure 6 that the MCA is able to accurately capture the steady-state distribution computed with the EMA for the specific number of molecules and parameter values chosen, we now seek to quantify the accuracy of the MCA for a wide range of parameter values and numbers of molecules. The Hellinger distance is a way to measure the similarity between two discrete probability distributions. The Hellinger distance (HD) can take values between 0 and 1, where lower values indicate that the probability distributions are more similar. Analysis of the Hellinger distances between the steady-state distributions can be found in Appendix C. Here, we focus on competing strength parameters. In particular, in Figure 7 we display scatter plots of the Hellinger distance as a function of the parameters n R,j /K d,j , j ∈ {1, 2}. We note that the summary parameter n R,j /K d,j is an indicator of the competing strength of receptor type j, which will be large whenever there are many of these receptors in the system and/or they have high affinity for the common ligand. In Figure 7 we sample 10 3 parameter values (K d,1 , K d,2 , n R,1 , n R,2 ) within the ranges of Table 1, from uniform distributions n R,j ∼ U(20, 200) and K d,j = 10 x with x ∼ U(1, 4), for j ∈ {1, 2}. We observe that the HD decreases with increasing n L , so that in the third plot where n L = 500, the HD is relatively small, even for the largest sampled values of n R,1 /K d,1 and n R,2 /K d,2 . From the first plot in which n L = 10 2 , representing a high competition scenario, we observe larger values of the HD, especially when n R,1 /K d,1 and n R,2 /K d,2 are large. On the other hand, in the n L = 250 plot, smaller values of the competition strength parameters still lead to relatively small distances, indicating that even when n L is small comparable to n R,1 + n R,2 the approximation can still be reasonable if n R,1 /K d,1 and n R,2 /K d,2 are relatively small. Although it is clear that there is some disparity between the probability distributions, especially when the competition is the highest, we note that the expected number of monomers in steady-state (not reported here) are almost identical between the two methods of computation. In particular, for all the sets (K d,1 , K d,2 , n R,1 , n R,2 , n L ) of parameter values considered in Figure 7

Timescales of Monomer Formation
In this section we carry out a similar numerical exploration to that of Section 4.1, of the accuracy of the NCA and MCA for the expected time to reach N monomers of type 2. We note here that, while the NCA for the steady-state distribution can be seen as a necessary step to then propose the MCA in Algorithm 1, but where the MCA is always expected to behave better than the NCA, the NCA and the MCA can both lead to reasonable and complementary approximations to the EMA when studying the time to reach N type 2 monomers. Thus, our objective in this section is to compare the expected value of T (0,0) (N) derived via the EMA and the NCA/MCA. We plot in Figure 8 the mean time to reach N monomers, where N is represented as a percentage P of the steady-state number of monomers, , by means of the EMA, NCA and MCA. In low competition scenarios (e.g., large n L or when K d, 1 K d,2 so that the dynamics of formation of type 1 monomers does not significantly affect the timescales of formation of type 2 monomers), the three approaches lead to almost indistinguishable results. For low-to-moderate competition settings, the MCA leads to a reasonable approximation. It is worth noting that the MCA and NCA seem to lead to lower and upper bounds for the mean time under analysis: the NCA always underestimates the timescales of type 2 monomer formation, by considering that there are n L ligands available, neglecting ligand depletion due to type 1 monomer formation competition; while the MCA tends to overestimate the timescales of type 2 monomer formation because one considers only n L − E[M 1 ] ligands available (that is, we remove from the beginning all the ligands occupied by steady-state type 1 monomers, even if these monomers may take some time to form). Since this steady-state amount of type 1 monomers takes some time to form, some of these ligands might still be able to contribute to the formation of type 2 monomers during early times, leading to the observed overestimation. Still, it is striking how well both approximations work when K d, 1 K d,2 , even when the number of ligands is even less than the total number of receptors available in the system (n L = 100 < 200 = n R,1 + n R,2 ).

Figure 8.
Comparison between the expected times to reach N monomers of type 2, computed using the Exact Matrix-Analytic Approach (EMA), No-Competition Approximation (NCA) and the MCA, for K d,2 = 10 3 and different values of n L ∈ {100, 500, 2500} and K d,1 ∈ {10 2 , 10 3 , 10 4 } (i.e., K d,1 K d,2 , K d,1 = K d,2 or K d, 1 K d,2 ). For all plots the number of receptors are n R,1 = n R,2 = 10 2 , and initial state (m 1 , m 2 ) = (0, 0). P represents a percentage of the mean steady-state value, so that Analysis of the two approximations over a range of parameter values and number of molecules can be found in Appendix C. Here, and in order to better illustrate the effect of competition on the goodness of the NCA and MCA, we plot in Figures 9 and 10 similar scatter plots to those of Figure 7, where we plot the relative differences between the mean times instead of the HD. From the first plot in Figure 9 we see that the relative difference increases with the competing strength parameter However, similarly to the case of the steady-state distributions, as we increase n L , the competition from R 1 decreases and the NCA improves. For the sampled parameter sets, in general it seems that the MCA outperforms once again the NCA. Similarly to the NCA, the performance of the MCA tends to improve with large enough values of n L , where in the third plot of Figure 10 in which n L = 500, more than half of the sampled parameter sets lead to an approximation with a relative error smaller than 5%. It is clear that the MCA will behave rather well in situations where n L is large enough and, in particular, of a different order of magnitude to n R,1 + n R,2 ; for example, for n L = 2000 (data not shown here; note that n R,1 + n R,2 ranges from 40 to 400), an overwhelming majority of scenarios (more than 90%) in Figure 10 lead to MCA predictions with relative errors smaller than 5%. Encouragingly also, we see that in Figure 9, all of the sampled points have relative differences greater than 0, which implies that the NCA always underestimates the EMA. This is expected given that the formation of monomers of type 2 will happen faster if there is no competition from another receptor, as assumed in the NCA. A similar pattern can be seen in Figure 10, in which the MCA tends to overestimate the EMA, leading to negative values of the relative difference. This is true for the majority of the points sampled. However, it is not always true as 6 of the 10 3 points sampled lead to very slightly positive relative differences (for n L = 100 and n L = 250 only).  We finally note that the NCA may behave better than the MCA in situations where the mean steady-state number of type 1 monomers is perhaps non-negligible, but the timescales of type 1 monomer formation are significantly slower than the timescales of type 2 monomer formation. In these situations, considering an effective number of ligands n L − E[M 1 ] in Equation (10) might lead to worse predictions, since this ligand depletion might take very long to occur, and should, thus, be neglected, so that the NCA would prevail. An example of this behaviour can be seen in the left plot of Figure A3 in Appendix C.

Time to Signal Initiation: Reaching a Threshold Number of Productive Monomers
We illustrate an extension of our methodology by considering a stochastic descriptor that is linked to signal initiation for some receptor tyrosine kinases (RTKs)-the time to reach a threshold number N of productive monomers on the cell surface. In References [19,20], the authors hypothesise that signal initiation of T cells through the T cell receptor (TCR) is determined by the time a threshold number, N, of productive monomers is reached. A productive monomer is considered to be a receptor that remains bound to the ligand, for at least a dwell time τ. In References [19,20] the authors compute the mean time, T(N, τ) to reach N productive monomers. Note that T (0,0) (N) represents the time to reach a threshold number N of simultaneously bound monomers in the system, while T(N, τ) does not require these events to be simultaneous, but instead requires the corresponding monomers to be productive, based on the hypothesis that "counting devices are at work to allow signal accumulation, decoding and translation into biological responses", as expressed in Reference [19].
The model in Figure 1a is proposed in Reference [19] when analysing T(N, τ) with a single receptor type (the TCR). Under ligand excess, authors in Reference [19] propose the approximation: where k f is the forward binding rate of the receptor, n R is the number of receptors available in the system, and N = exp(k r τ)N represents the average number of binding events required for N productive ones to be reached. Recall that dissociation of a monomer occurs after an exponentially-distributed random time Exp(k r ), where k r is the dissociation rate of the receptor.
The authors in Reference [19] make use of experimental data to test two different hypotheses: that (a) the timescale of a T cell response correlates with the time it takes to have had N receptor-ligand complexes bound for at least a threshold dwell time, τ, each; or that (b) the timescale of a T cell response correlates with the time a threshold number, N, of TCRs must be occupied at equilibrium. Their conclusion is that experimental data supports hypotheses (a), but not (b). The descriptor T(N, τ) has been proposed in References [19,20] for the T cell receptor, and a similar hypothesis has been considered for other RTKs [18,29]. We make use of the methods introduced in previous sections, to compute T(N, τ) for a model as described in Figure 1a, where the receptor of interest for signal initiation (receptor 1) has a competitor (receptor 2) for binding the ligand.
We note that, by following similar arguments to those above, the NCA approximation for T(N, τ) consists of applying Equation (12) with the total number of ligands in the system n L , thus neglecting competition from receptor 2. On the other hand, as described in the MCA approximation, we may use Equation (12) with n L replaced by the effective number of ligands n * L = n L − E[M 2 ]. Figure 11 shows a comparison between the mean time T (N, τ), computed making use of the NCA and MCA approximations, and numerical simulations (T SI M (N, τ)). The matrix-analytic approach for this descriptor is not applicable since τ is fixed. We note that the time T(N, τ) to reach N productive type 1 monomers is not well estimated by the NCA (using Equation (12) with n L ligands) in situations where n L is not large enough (e.g., n L = 100 in Figure 11). The worst estimates are obtained in scenarios where the competing strength of receptor 1 is low, and that of receptor 2 is high, as one would expect. On the other hand, the MCA approach, using an effective number of ligands n * L , is better, even for a small number of ligands. In general, we observe that the MCA approximation gives good estimates of T(N, τ) over a wide range of parameter values: N and τ.  (N,τ) , j ∈ {NCA, MCA}, between the mean time T(N, τ) computed through the NCA and MCA approaches, and the time computed through stochastic simulations, for n L ∈ {10 2 , 10 3 }. In these scenarios, 10 3 parameter sets have been sampled by varying the number of receptors n R1 and n R2 between 100 and 400, K d,1 and K d,2 rates vary between 10 1 and 10 4 , and setting k r,1 = k r,2 = 10 −3 s −1 . In these examples, N = 10 and τ = k −1 r,1 , so that a monomer is considered to be productive if it lasts for longer than its average lifetime.

Discussion
In this paper, our focus has been on the modelling of receptor competition for a shared ligand. We have expressed this process as a multi-variate (bi-variate, in this case) competition process, as defined by Reuter [26] and Iglehart [27] in the area of Mathematical Ecology. We have shown how this leads to the study of a bi-dimensional structured continuous-time Markov chain; in particular, a level-dependent quasi-birth-and-death-process. For this process, our interest is in the analysis of two summary statistics related to late time (the steady-state distribution of monomers) and transient (the timescales of type 2 monomer formation, or of productive type 1 monomer formation) dynamics. For these summary statistics, it is possible to implement first-step arguments which lead to the study of systems of linear equations, where the number of equations is equal to (for the steady-state distribution), or potentially less than (for the timescales of monomer formation), the number of states in the CTMC, #S X . Since we are dealing with a LD-QBD process, there are matrix-oriented algorithms available in the literature [28], which we have exploited, that allow one to efficiently solve these systems.
The main limitation of this matrix-oriented approach is that the number of equations increases with the number of states, which turns out to increase with the number of molecules in the system, typically large in in vivo and in vitro settings [24]. Thus, we have proposed several approximations based on the analysis of the process in low-to-moderate competition scenarios (e.g., when the number of competing receptors is small, there is excess of ligand, or the competing receptor has a relatively low affinity compared to the other receptor). These approximations lead to the analysis of independent one-dimensional birth-and-death processes, for which analogous computations can be carried out much more efficiently. Our numerical results suggest that these approximations could be exploited in a wide range of parameter regimes, which have been explored inspired from values corresponding to the VEGF1 and VEGF2 receptors, which can bind the shared ligand, VEGF-A.
A striking advantage of using our approximate methods to compute the stochastic descriptors presented here is that the computational cost is much less than the cost of computing the same descriptors in an exact fashion (EMA). In Appendix D, Figures A4 and A5, we present heatmaps of the central processing unit (CPU) times to run each method for both descriptors and a range of the numbers of molecules, n R,1 and n R,2 . From this analysis we find that, for the maximal values of n R,1 and n R,2 considered, the EMA takes over 40 min to compute the steady-state distribution, whereas the NCA and MCA can compute the distribution in a matter of seconds. Similarly, the EMA takes over 25 min to compute the mean time to reach N = n R,2 2 monomers of type 2 for the maximal values of n R,1 and n R,2 considered, whereas the NCA and MCA again only take a few seconds. We note that as well as the CPU times, the EMA for both descriptors requires a large amount of memory for large values of n R,1 and n R,2 , whereas for the NCA and MCA the memory required is minimal. Results in Figures A4 and A5 (see Appendix D) suggest that, due to both computational memory and time constrains, the MCA and NCA would allow us to analyse the summary statistics of interest in this system when dealing with numbers of molecules significantly beyond those that can be considered when implementing matrix-analytic approaches.
In this paper we have considered a system of two receptors which share a common ligand. We note, however, that the modelling framework, and the analytic and approximate methods of analysis of the summary statistics, could be extended and applied to any other system in which two species compete for a shared resource. For example, our model here could be reversed to represent two ligand molecules competing for a shared receptor, such as the competition by several ligands for the epidermal growth factor receptor (EGFR) (see Roepstorff et al. [38]). For example, two major ligands for the EGFR are the epidermal growth factor (EGF) and transforming growth factor α (TGF-α). There are many other examples of multiple ligands stimulating the same receptor system to which our methodology could be applied [4,39,40]. Moreover, since our methods have been framed within processes originally devised in Mathematical Ecology, it is to be expected that they could be extended to more general competition processes in this area [41][42][43], not restricted to molecular dynamics.
Similarly, although our methods have been presented for bi-variate processes, they could in principle be extended to higher dimensions, where the competition of M species for a shared resource (a structured M-dimensional continuous-time Markov chain) could be approximated as M independent birth-and-death processes, or as a different combination of lower-dimensional LD-QBDs if some competing sub-groups of species need to be analysed jointly. This can be especially useful since, while the systems of equations we are dealing with in the original bi-variate process are, when n R,1 + n R,2 ≤ n L , of size (n R,1 + 1) · (n R,2 + 1) (see Appendix A), they would be of size (n R,1 + 1) · (n R,2 + 1) . . . (n R,M + 1) for a system with M receptors, making the matrix-oriented approach not feasible in practice. On the other hand, our approximations would be dealing instead with M independent birth-and-death processes, each of them of size n R,j + 1, j ∈ {1, . . . , M}. To illustrate the applicability of our approach in these higher-dimensional processes, we consider a situation with four different receptors competing for a common ligand. We plot in Figure 12 the absolute difference between the mean number of monomers in steady-state computed via the MCA and the mean number computed via stochastic simulations, for a wide range of parameter values, thus characterising different competition strengths for each receptor. We note that for the parameter regimes considered here (e.g., number of receptors ranging up to 2 × 10 2 for each species), the matrix-analytic approach is clearly unfeasible since it would require the analysis of CTMCs with more than 10 9 states. Thus, our NCA and MCA approximations allow one to analyse several descriptors of interest without having to implement these, numerically expensive, analytical methods. In particular, Figure 12 shows that the MCA approximation is able to capture, with high accuracy, these mean values of interest even for a small number of ligands (i.e., potentially high competition scenarios), unless the competing strengths of several receptors are too large. Even in these large competition scenarios, the MCA approach is able to estimate well these mean quantities for moderate ligand numbers (e.g., n L = 10 3 in Figure 12).
Finally, a clear limitation of our approach is that it has focused on receptor-ligand binding dynamics without taking into account additional reactions such as receptor synthesis, degradation or internalisation. Our analysis can be generalised to include those scenarios; for example, when modelling the surface and intra-cellular dynamics of two kinds of receptor competing (on the surface) for a common ligand, one could use our techniques to disentangle this competition, by the consideration of an effective number of ligands, so that the two intra-cellular processes can be separately studied. In the same way, our methodology has the potential to be of use in other settings. For example, since links between Queuing Theory and receptor-ligand interactions have been already established [44,45] (e.g., the two receptors in our system can be considered as two types of customers, waiting to be served by the common ligand (i.e., server)), one could explore the applicability of our NCA and MCA approximations to Queueing Systems (e.g., by modelling a multidimensional M/M/c system with k different types of arrivals being served by c common servers, as k independent M/M/c systems with an effective number of servers c * ). This is out of the scope of this manuscript and remains an area for future work. Funding: PJ is supported by the EPSRC, AstraZeneca and Smith Institute (Smith Institute CASE studentship, award reference 1969354). MC is partially supported by grants FIS2016-78883-C2-2-P and PID2019-106339GB-I00 (Ministerio de Economía, Industria y Competitividad -Agencia Estatal de Investigación).
Data Availability: Python (version 3.7) codes for computing the steady-state distribution and expected times to reach N monomers of type 2, via the EMA, NCA and MCA can be found in the public repository, "https: //github.com/PollyJeffrey/Competition-modelling-paper".

Abbreviations
The following abbreviations are used in this manuscript: EMA Exact matrix-analytic approach NCA No-competition approximation MCA Moderate-competition approximation

Appendix A
In this appendix we explain how the tri-diagonal by blocks structure of the infinitesimal generator matrix Q in Section 3 results from a particular ordering of the states in S X . We note first that the number of states in S X is where N 1 = min(n R,1 , n L ) and N 2 = min(n R,2 , n L ). In fact, it is possible to obtain explicit formulae for the number of states in S X depending on the values of n R,1 , n R,2 and n L , as , if n R,1 ≤ n R,2 ≤ n L and n R,1 + n R,2 > n L , (n L − n R,2 + 1)(n R,2 + 1) + (n R,1 +n R,2 −n L )(n R,2 +n L −n R,1 +1) 2 , if n R,2 ≤ n R,1 ≤ n L and n R,1 + n R,2 > n L , if n R,1 > n L and n R,2 < n L , if n R,1 < n L and n R,2 > n L , if n R,1 ≥ n L and n R,2 ≥ n L .
This state space can be organised into levels, as follows, Thus, level L(k) = {(0, k), (1, k), ..., (min(N 1 , n L − k), k)} contains those states representing k type 2 monomers present on the cell surface at any given time. If, when constructing the matrix Q, one orders the states by levels with L(0) ≺ L(1) ≺ ... ≺ L(N 2 ), and states within each level L(k) as (0, k) ≺ (1, k) ≺ · · · ≺ (min(N 1 , n L − k), k), this leads to a tri-diagonal by blocks infinitesimal generator. In particular, from the transition diagram in Figure 2, it is clear that only transitions between adjacent states are allowed. This means that from any state (m 1 , m 2 ) ∈ L(m 2 ), the next event in the Markov chain can take the process to either another state in level L(m 2 ), a state in level L(m 2 + 1), or a state in level L(m 2 − 1), by means of an association or dissociation reaction involving M 1 , an association to form M 2 or a dissociation of M 2 , respectively. Thus, since the process only moves up or down by a maximum of one level after each transition, the CTMC X is a level-dependent quasi-birth-and-death process (LD-QBD) [46], and the infinitesimal generator matrix Q is tri-diagonal by blocks. Q is given by We note that each level L(k) contains J(k) = #L(k) = min(N 1 , n L − k) + 1 states, so that each sub-matrix Q k,k has dimensions J(k) × J(k ). Sub-matrices Q k,k are given as follows: for 0 ≤ i ≤ J(k) and 0 ≤ j ≤ J(k + 1).
Equation (A2) can then be solved efficiently using Algorithm A2 to obtain the l th order moments of the random variable T (m 1 ,m 2 ) . In this algorithm, I a denotes the a × a identity matrix. Baseline parameter values (that is, the value chosen in each plot for any parameter that has been fixed) are n R,1 = 10 2 , n R,2 = 10 2 , K d,1 = 10 3 , K d,2 = 10 3 and n L = 250. We set the threshold parameter = 10 −5 for the MCA.
When focusing on the time to reach N monomers of type 2, and similarly to the comprehensive sensitivity analysis carried out in Figure A1 for the steady-state distributions, we vary in Figure A2 the parameters (K d,1 , K d,2 , n L , n R,1 , n R,2 ) in pairs, and plot heatmaps of the relative differences, , so that T (0,0) (N) encodes the timescales for type 2 monomers to reach steady-state. We note that the MCA seems to better approximate the exact mean time in almost all parameter regimes explored in Figure A2. We expect however that the NCA might perform better in some scenarios (some of them can be identified in Figure A2), for example, when N is small (see Figure 8). Still, we observe in Figure A2 (left) that the NCA performs relatively well in parameter regimes representing low natural competition from R 1 , for example when K d,1 is relatively large (indicative of low affinity of R 1 for L) and/or n R,1 is small. In Figure A2 (right), this effect of competition from receptors of type 1 is reduced, and the MCA in general performs much better thanks to the effective ligand numbers considered in Equation (10), which takes into account ligand depletion due to receptor competition. We note that the MCA and the NCA lead to very similar, and almost exact, results under very low competition scenarios, since when E[M 1 ] ≈ 0, n * L ≈ n L in Equation (10) and thus, the two approximations are effectively the same. Figure A2. Relative difference between the EMA and NCA (left) and MCA (right) expected times to reach N monomers of type 2. Baseline parameter values are n R,1 = 10 2 , n R,2 = 10 2 , K d,1 = 10 3 , K d,2 = 10 3 and n L = 250.
As mentioned in Section 4.2, there are some situations in which the NCA behaves better than the MCA. For example, the NCA seems to lead to better predictions than the MCA in Figure A3 (left), where the formation of type 2 monomers occurs more rapidly than that of type 1 monomers. In this figure, by carrying out a single stochastic (Gillespie) simulation of the process, we get a particular realisation of the time T (0,0) (N) to reach N = P 100 E EMA [M 2 ] type 2 monomers, for different values of P; in particular, for P = 20, 40 , 60, 80 and 100% of the average number of monomers in steady-state, we get the dots plotted in Figure A3 (top). On the other hand, in situations where the type 1 receptor is a strong competitor, such as in Figure A3 (right), the MCA outperforms the NCA. and different values of P. Left: n R,1 = n R,2 = 10 2 , K d,1 = 10 2 and K d,2 = 10. Right: n R,1 = 50, n R,2 = 10 2 , K d,1 = 10 and K d,2 = 10 3 . Number of ligands n L = 250.

Appendix D
In this appendix we illustrate the computational advantage of using our approximations. In Figures A4 and A5, a simple computational comparison between the NCA/MCA and the EMA is presented. In particular, we plot heatmaps of the central processing unit (CPU) times to run each method, for a range of numbers of the molecules n R,1 and n R,2 . In both figures, we set n L = 10,000 and K d,1 = K d,2 = 1000. We note that, for a large number of ligands such as n L = 10,000, the computational cost of the EMA just depends on the values of n R,1 and n R,2 in Figures A4 and A5 (see Appendix A, Algorithms A1 and A2), and is independent of K d,1 and K d,2 . The colour bar on each figure represents the CPU time for the computation of the summary statistic, with units minutes. We notice that, as expected, the EMA requires the largest amount of time to compute both descriptors since, especially when n R,1 and n R,2 are large, the method involves the computation of relatively large matrix inverses. There is a similar trend for both the NCA and MCA approaches to compute the steady-state distribution, that is, the larger n R,1 and n R,2 , the greater the CPU time. We see that, whereas the CPU time under the EMA takes up to a maximum of approximately 45 min, the NCA and MCA take less than 1 min for all numbers of molecules considered. For the second descriptor, we observe from Figure A5 that the EMA takes roughly half the CPU time as the EMA for the steady state distribution in Figure A4. This is since we are considering a value of N = n R,2 2 , for the expected times and hence we are only inverting half the matrices as required in Figure A4. Again we see for the expected times in Figure A5, that the NCA and MCA are strikingly faster to compute than the EMA. As expected, the NCA in this case does not depend on the value of n R,1 since this quantity does not feature in the calculation. For the MCA however, we must first compute E MCA [M 1 ] using the MCA for the steady-state distribution and hence the CPU times here do depend on n R,1 .
In Figure A6, we explore in more detail specific pixels from Figures A4 and A5. In particular we choose 3 values of n R,2 , corresponding to the 3 columns in Figure A6 and vary on the x-axis of each plot, the value of n R,1 . We are thus considering in each plot of Figure A6, a cross-section of the heatmaps in Figures A4 and A5. The top row of Figure A6 relates to the steady-state distribution descriptor and hence the y-axis of the plots in this row is the Hellinger distance. The second row in the figure relates to the expected times to reach N monomers of type 2 descriptor and hence the y-axis of the plots in this row is the relative difference as plotted in Figures 9 and 10. Finally the colour bar represents the CPU time saved with units minutes, by computing the descriptors using the approximations (NCA and MCA) instead of the analytic (EMA). For the steady state descriptor in the first row we see that as expected the Hellinger distance between the EMA and the NCA increases as n R,1 increases, whereas the Hellinger distance between the EMA and the MCA remains almost constant, indicating that this approximation works very well for these numbers of molecules. As we increase n R,1 and n R,2 we find that the time saved by using the NCA or MCA increases. In the second row we note that both the NCA and MCA worsen as approximations as we increase n R,1 , but this worsening is only very slight (the relative differences are always 1). Again we see that as we increase n R,1 and n R,2 the time saved by using the NCA or MCA increases. From this figure we see that there is a "playoff" between the Hellinger distance or relative difference and the CPU time saved, but for the numbers of molecules and parameter values considered here, the time saved clearly outweighs the slight deviation from the EMA result caused by using the MCA for the steady state distribution and both the NCA and MCA for the expected times descriptor.   Playoff between the Hellinger distance, the value of n R,1 and the CPU time saved by considering either the NCA or MCA instead of the EMA when computing the steady state distribution. Bottom: Playoff between the relative difference, the value of n R,1 and the CPU time saved by considering either the NCA or MCA instead of the EMA when computing the expected time to reach N monomers of type 2. In all plots, n L = 10,000 and K d,1 = K d,2 = 1000.