Abstract
Simulating the probability of a type I error is a powerful statistical tool that makes it possible to confirm whether a statistical test achieves the established nominal level. However, its computational implementation has the drawback of significantly long execution times. Therefore, this article analyzes the performance of two parallel implementations (parRapply and boot) which significantly reduce the execution time of simulations of the type I error probability for a goodness-of-fit test for the bivariate Poisson distribution. The results obtained demonstrate how the parallelization strategies accelerate the simulations, reducing the time by 50% to 90% when using 2 to 12 processors running in parallel. This reduction is graphically evidenced, as the execution time of the analyzed parallel versions fits almost perfectly ($R^2 \approx 0.999$) the power model $t(p) = \alpha p^{\beta}$, where $p$ is the number of processors used and $\alpha$ and $\beta$ are the constants of the model. Furthermore, it is shown that the parallelization strategies used scale with an increasing number of processors. All algorithms were implemented in the R programming language, and their code is included at the end of this article.
    MSC:
                62-08
            1. Introduction
In today’s world, it is essential to anticipate events that may occur, and to do so, having models that predict such events with the highest possible accuracy is vital. Statistics provides tools for making predictions or decisions based on the information contained in the available data sample. Typically, these tools consist of models based on probability distributions. However, before using such models, it is crucial to determine whether the sample data behave according to the probability distribution on which the probabilistic model is based. This is achieved by applying a goodness-of-fit test.
Regarding the latter, it should be noted that an essential part of developing a goodness-of-fit test is the simulation under the null hypothesis, which confirms whether the test reaches the established nominal level. In [], two goodness-of-fit tests were proposed, and the parametric bootstrap method was used to simulate the type I error probability. However, the excessive execution time of each simulation led the authors to consider only a few variations of both the parameter vector θ and the sample sizes n.
In this regard, reducing the computational execution time of the simulation processes involved is essential for exploring new and more varied scenarios. Many areas pursue this goal and employ parallel programming to achieve it, for instance in [,,,,,].
Therefore, the interest of this research is to use parallelization strategies to accelerate the simulation processes employed in the first of the statistical tests developed in []. This will allow new parameter vectors θ to be examined and larger sample sizes n to be tested, thereby reducing computation time and confirming that, as sample sizes grow, the parameter estimates converge to their nominal values.
This article is organized as follows: Section 2 presents the theoretical aspects and details the first test proposed in []. Section 3 shows the fundamental aspects to take into account when implementing a parallel algorithm and discusses the facilities offered by the R language for implementations of this type. Section 4 presents the experimental results. In Section 5, the obtained results are discussed, and Section 6 exhibits the conclusions.
2. Background
From here on, the following notation is considered:
- .
 - .
 
2.1. Bivariate Poisson Distribution
There have been several definitions of the bivariate Poisson distribution. This article uses the definition given in [], which is the one that has received the most attention in the statistical literature.
Definition 1. 
  Let
$$X_1 = Y_1 + Y_3, \qquad X_2 = Y_2 + Y_3,$$
where $Y_1$, $Y_2$ and $Y_3$ are mutually independent Poisson random variables with means given by $\theta_1-\theta_3$, $\theta_2-\theta_3$ and $\theta_3$, respectively.
The joint distribution of the vector $(X_1,X_2)$ is called the bivariate Poisson distribution with parameter $\theta=(\theta_1,\theta_2,\theta_3)$, which will be denoted by $BP(\theta)$.
The joint probability mass function of $X_1$ and $X_2$ is given by
$$P(X_1=x_1,\,X_2=x_2)=e^{-(\theta_1+\theta_2-\theta_3)}\sum_{i=0}^{\min(x_1,x_2)}\frac{(\theta_1-\theta_3)^{x_1-i}\,(\theta_2-\theta_3)^{x_2-i}\,\theta_3^{\,i}}{(x_1-i)!\,(x_2-i)!\,i!},$$
where $x_1,x_2\in\{0,1,2,\dots\}$.
Also, the joint probability generating function (pgf) of $X_1$ and $X_2$ is
$$g(u_1,u_2;\theta)=\exp\{\theta_1(u_1-1)+\theta_2(u_2-1)+\theta_3(u_1-1)(u_2-1)\},$$
$(u_1,u_2)\in[0,1]^2$, where $\theta_1>\theta_3$, $\theta_2>\theta_3$ and $\theta_3>0$.
2.2. Goodness-of-Fit Test of Novoa-Muñoz and Jiménez-Gamero (2014)
This subsection summarizes the first test proposed by these authors, hereinafter referred to as test Q.
To do this, let $(X_{1i},X_{2i})$, $i=1,2,\dots,n$, be independent and identically distributed random vectors distributed as $(X_1,X_2)$.
The hypotheses to be contrasted are
$$H_0:\ (X_1,X_2)\sim BP(\theta)\ \text{for some}\ \theta, \qquad \text{versus} \qquad H_1:\ (X_1,X_2)\nsim BP(\theta)\ \text{for any}\ \theta.$$
According to the definition established in [], if $H_0$ is true but the hypothesis test incorrectly decides to reject $H_0$, then the test has made a type I error.
The test statistic is of the Cramér–von Mises type and is given by
$$Q_{n,w}=n\int_{[0,1]^2}\left\{g_n(u_1,u_2)-g(u_1,u_2;\hat{\theta}_n)\right\}^2 w(u_1,u_2)\,du_1\,du_2,$$
where $\hat{\theta}_n$ is a consistent estimator of $\theta$, $g$ is the pgf and $g_n$ its empirical version, $w$ is the weight function $w(u_1,u_2)=u_1^{a_1}u_2^{a_2}$, and $a=(a_1,a_2)$, called the weight vector, is such that $a_1,a_2\geq 0$.
The null hypothesis will be rejected for “large” values of $Q_{n,w}$.
Since the null distribution of the statistic turned out to be unknown, the authors first estimated it by the asymptotic null distribution; however, because that distribution depends on the true value of the parameter $\theta$, which is unknown, it did not provide them with a useful solution, and they decided to estimate the null distribution of the statistic using the parametric bootstrap method, which is analyzed below.
2.3. Parametric Bootstrap
In the case under study, the distribution of the statistic depends on the parameter; such a model is called parametric, and therefore, the statistical methods based on this model are parametric methods [].
In practice, a sample is drawn from a given population and used to estimate the population parameter, which is expected to be an interior point of the parameter space. The parametric bootstrap method is a resampling technique that emulates this situation: with the parameter estimate, a bivariate Poisson distribution is generated, from which a sample, called the bootstrap sample, is drawn; with this sample, the parameter is estimated again, yielding the bootstrap estimate. With the bootstrap sample and the bootstrap estimate, the statistic $Q^*$ is computed, which is the bootstrap version of the statistic Q.
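As an illustration, the following sketch approximates the bootstrap p-value of the statistic Q; it is a simplified serial version of the full_function listed in Appendix A and assumes that the helper functions gmpb, EstimatorML and R defined there are available, together with the global objects n, a1 and a2 that R() reads from the workspace.
bootstrap_pvalue = function(X, B = 500){
 theta_hat = EstimatorML(X)                               # estimate of theta from the sample
 q_obs = R(X, theta_hat[1], theta_hat[2], theta_hat[3])   # observed statistic
 q_boot = rep(0, B)
 for (b in 1:B){
  X_star = gmpb(ncol(X), theta_hat[1], theta_hat[2], theta_hat[3])  # bootstrap sample
  theta_star = EstimatorML(X_star)                        # bootstrap estimate of theta
  q_boot[b] = R(X_star, theta_star[1], theta_star[2], theta_star[3])
 }
 return(mean(q_boot >= q_obs))                            # bootstrap p-value
}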
Although the bootstrap method has been highly useful in statistical inference, [] shows that there are situations in which it is not consistent. In that article, examples are given in which the parametric bootstrap does not yield good results; in particular, it is demonstrated that the method is inconsistent when the true parameter value lies on the boundary of the parameter space or very close to it.
3. A Parallel Implementation
Two parameters are used to analyze the efficiency of an algorithm that solves a given problem: the memory space required to store the input data and results, and the execution time required to perform the task. Several technological advances allow the use of many computers that collaborate with each other, either to increase the size of the problems to be solved through data distribution or through task distribution []. In the latter case, parallel programs are used. Bootstrap resampling involves task distribution, since the iterations are independent of each other.
3.1. Parallelization in R Language
The R programming language offers the possibility of executing tasks in parallel, when they allow it, through a series of packages that provide commands for this type of processing. Among the packages that help accomplish this task, snow, multicore, and parallel can be mentioned.
Currently, one of the packages that has gained great prominence in R programming is parallel [], which takes the best of its two predecessors, multicore and snow, and is complemented by parallel random number generation [,].
The parApply family of functions provides parallel versions of the apply family, allowing data stored in vectors, matrices, lists, or data frames to be manipulated directly without explicit loops. The parallel package also allows finding out how many processors are available (detectCores).
Thus, the parallel package allows the data to be distributed among the available cores, where they are manipulated simultaneously according to the indicated function; the results are then gathered and returned as a list [].
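A minimal sketch of this workflow (the two-worker cluster and the small matrix below are illustrative only) is:
library(parallel)
cl = makeCluster(2)               # cluster with two workers
A = matrix(1:12, nrow = 3)        # small example matrix
row_sums = parRapply(cl, A, sum)  # apply sum to each row in parallel
stopCluster(cl)                   # always release the workers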
In general terms, two ways of implementing parallelism using this package can be distinguished: through forking or sockets.
The forking method is based on the full duplication of the master process, with a shared environment, toward each of the parallel workers, including objects or variables defined before the start of the parallel threads. This method has the advantage of being very fast, but it cannot run under the Windows operating system.
On the other hand, in the sockets method each worker runs as a separate process that does not share objects or variables; these can only be passed explicitly from the master process. As a result, it runs more slowly due to communication overhead, but it has the advantage of working on any operating system [].
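In the parallel package, the two approaches correspond to different cluster types; a brief sketch (the number of workers and the seed are illustrative) is:
library(parallel)
# Sockets: portable, but objects must be exported to the workers explicitly
cl_sock = makeCluster(4, type = "PSOCK")
clusterSetRNGStream(cl_sock, iseed = 123)  # reproducible parallel random numbers
stopCluster(cl_sock)
# Forking: shares the master environment and is faster, but not available on Windows
cl_fork = makeCluster(4, type = "FORK")
stopCluster(cl_fork)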
Just as the parallel package offers tools that enable parallelization in R, the boot package [] provides functions that facilitate the application of the bootstrap method. This package also offers the option of performing its operations in parallel, using either forking or sockets.
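A sketch of how the parametric bootstrap of Section 2.3 could be delegated to this package is shown below; it assumes the functions EstimatorML, gmpb and R from Appendix A (and the globals n, a1, a2 used by R), an observed 2 × n sample X, and illustrative values for the number of replicates and processors.
library(boot)
stat_fun = function(d){                   # statistic Q evaluated on a (re)generated sample
 th = EstimatorML(d)
 R(d, th[1], th[2], th[3])
}
ran_gen = function(d, mle) gmpb(ncol(d), mle[1], mle[2], mle[3])  # parametric resampling
theta_hat = EstimatorML(X)
out = boot(data = X, statistic = stat_fun, R = 500, sim = "parametric",
           ran.gen = ran_gen, mle = theta_hat,
           parallel = "multicore", ncpus = 4)  # use parallel = "snow" on Windows
p_value = mean(out$t >= out$t0)           # bootstrap p-value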
When programming in parallel in the R language [] with the aim of reducing computation times, it is important to take into account some experts’ recommendations. In many cases, simply “quite fast” is good enough; the additional speed that could be achieved by moving from R to C/C++ does not justify the significant amount of time required to write, debug, and maintain code at that level [].
3.2. Performance Evaluation of a Parallel Program
According to [], one way to measure the performance of a parallel program is to calculate its speedup $S$, which is the ratio between the execution time of the sequential version ($T_s$) and the execution time of the parallel version ($T_p$) with $p$ processors; that is, $S = T_s / T_p$.
Furthermore, as the number of processors increases, there is more additional work (and time) than expected; that is, the ratio between $S$ and $p$ decreases. Thus, the efficiency of a parallel program is defined as $E = S / p$.
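For instance, both measures can be computed from measured times as follows (the commented call uses the parRapply times of Table A1 for n = 30 and the first weight vector; the sequential time t_seq must be taken from Table 1):
# Speedup and efficiency from measured execution times
speedup_efficiency = function(t_seq, t_par, p){
 S = t_seq / t_par   # speedup
 E = S / p           # efficiency
 data.frame(p = p, speedup = S, efficiency = E)
}
# speedup_efficiency(t_seq, c(4049,2764,2118,1726,1445,1242,1102,1007,919,841,764), 2:12)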
3.3. Methodology
To showcase the results of this work, three aspects are considered:
- (a)
- Preparation of the programs. Once it was verified that the process was parallelizable, the parts of the program that could be modified to reduce computation times were identified; then, the instructions of the R language that allow the task to be parallelized were analyzed, finding parRapply from the parallel package and boot from the boot package. The new codes that parallelize the simulation process were then developed.
 - (b)
- Simulation with the same components used in []. The simulations carried out by these authors were replicated, running both the sequential code and the parallel code. For the parallel program, the number of processors p was varied from 2 to 12.
 - (c)
 - Simulations with new components.
- -
 - New weight vectors , with .
 - -
 - Sample sizes with the same original population parameters and .
 - -
 - Population parameters , with , and chosen so that the correlation coefficient is less than 0.25 or greater than 0.75, and .
 
 
The simulations described in items (b) and (c) were carried out on a computer cluster with 12 processors running at 2.00 GHz, with 32 GB of RAM and the CentOS 7 (Linux) operating system.
4. Experimental Results
This section presents the results of the simulations of type I error considering the three parts detailed in  Section 3.3.
4.1. Simulations of the Probability of a Type I Error with the Components Used in the First Test of Novoa-Muñoz and Jiménez-Gamero (2014)
Firstly, it was verified that when using parallel programming, there was no significant difference between the simulated type I error probability and its respective nominal value. Due to space, the results are not presented in this paper and can be obtained from the author upon request.
The next step was to analyze the execution times and the efficiency of the two parallel versions (the parRapply and boot commands). For this, it is essential to know the sequential execution time, which is presented in Table 1 and Table 2. Table 1 presents the results obtained with the algorithm implemented in this research (described in Appendix A), and Table 2 records the results of running the boot command (R version 4.0.0). These times serve as the basis for calculating the performance of the parallel programs.
       
    
    Table 1.
Average execution time (in seconds) using the sequential version, weight vectors a, sample sizes n and parameters θ.
  
       
    
    Table 2.
Average execution time (in seconds) using the boot sequential version, weight vectors a, sample sizes n and parameters θ.
  
The execution times and efficiency of the parallelized programs using the parRapply and boot commands, which were the two parallel versions analyzed in this study, are shown graphically below.
Figure 1, Figure 2, Figure 3, Figure 4, Figure 5 and Figure 6 present the average execution times versus the number of processors used. To obtain these results, and for the sake of comparison, the same parameter vectors θ used in [] were employed. Table A1–Table A12, displayed in Appendix B, contain the data with which these graphs were constructed.
      
    
    Figure 1.
Average execution time (in seconds) using parRapply and boot commands versus number of processors p and different sample sizes n for θ = (1, 1, 0.75).
  
      
    
    Figure 2.
Average execution time (in seconds) using parRapply and boot commands versus number of processors p and different sample sizes n for θ = (1, 1, 0.50).
  
      
    
    Figure 3.
Average execution time (in seconds) using parRapply and boot commands versus number of processors p and different sample sizes n for θ = (1, 1, 0.25).
  
      
    
    Figure 4.
Average execution time (in seconds) using parRapply and boot commands versus number of processors p and different sample sizes n for θ = (1.5, 1, 0.62).
  
      
    
    Figure 5.
Average execution time (in seconds) using parRapply and boot commands versus number of processors p and different sample sizes n for θ = (1.5, 1, 0.92).
  
      
    
    Figure 6.
Average execution time (in seconds) using parRapply and boot commands versus number of processors p and different sample sizes n for θ = (1.5, 1, 0.31).
  
Figure 1, Figure 2 and Figure 3 show the execution times when $\theta_1 = \theta_2 = 1$, and Figure 4, Figure 5 and Figure 6 show the times when $\theta_1 = 1.5$ and $\theta_2 = 1$.
Since two parallel implementations are analyzed in this research, the interest lies in knowing which of them is more efficient; therefore, the efficiencies of both parallel implementations were calculated. Figure 7, Figure 8, Figure 9, Figure 10, Figure 11 and Figure 12 display the efficiencies E versus the number of processors used. Each figure shows the efficiency of both commands (parRapply and boot) for the same parameter vector θ. These parameters are the same ones used in [].
      
    
    Figure 7.
Efficiency of running the parRapply and boot commands versus number of processors p and different sample sizes n for θ = (1, 1, 0.75).
  
      
    
    Figure 8.
Efficiency of running the parRapply and boot commands versus number of processors p and different sample sizes n for θ = (1, 1, 0.50).
  
      
    
    Figure 9.
Efficiency of running the parRapply and boot commands versus number of processors p and different sample sizes n for θ = (1, 1, 0.25).
  
      
    
    Figure 10.
Efficiency of running the parRapply and boot commands versus number of processors p and different sample sizes n for θ = (1.5, 1, 0.62).
  
      
    
    Figure 11.
Efficiency of running the parRapply and boot commands versus number of processors p and different sample sizes n for θ = (1.5, 1, 0.92).
  
      
    
    Figure 12.
Efficiency of running the parRapply and boot commands versus number of processors p and different sample sizes n for θ = (1.5, 1, 0.31).
  
Tables with these data are not shown, as they can be obtained directly using Table 1 and Table 2 (sequential time) together with Table A1–Table A12 in Appendix B (parallel time).
4.2. Simulations of Type I Error Probability with Sample Sizes Larger than Those Used in the First Test of Novoa-Muñoz and Jiménez-Gamero (2014)
For simulations with sample sizes greater than 70, only the parRapply command was used because, as seen in the previous subsection, it is more efficient than the boot command. Furthermore, only ten processors were used.
The estimates of the type I error probability for the nominal levels 0.01, 0.05, and 0.10 are presented in Table 3, Table 4, Table 5, Table 6, Table 7 and Table 8 under the headings 1%, 5% and 10%, respectively.
       
    
    Table 3.
    Results of simulating probability of type I error. Instance .
  
       
    
    Table 4.
    Results of simulating probability of type I error. Instance .
  
       
    
    Table 5.
Results of simulating probability of type I error. Instance .
  
       
    
    Table 6.
    Results of simulating probability of type I error. Instance .
  
       
    
    Table 7.
    Results of simulating probability of type I error. Instance .
  
       
    
    Table 8.
    Results of simulating probability of type I error. Instance .
  
Figure 13 and Figure 14 present the average execution times versus different sample sizes (n) when the simulations displayed in Table 3, Table 4, Table 5, Table 6, Table 7 and Table 8 are executed.
      
    
    Figure 13.
      Average execution time (in seconds) versus sample sizes n for parameters  when .
  
      
    
    Figure 14.
      Average execution time (in seconds) versus sample sizes n for parameters  when .
  
4.3. Simulations of the Probability of a Type I Error with Parameter Vectors Different from Those Used in the First Test of Novoa-Muñoz and Jiménez-Gamero (2014)
The simulation of type I errors implemented in parallel makes it possible to evaluate the performance of the test Q when working with parameter vectors different from those already studied. In particular, it allows simulating the type I error probability in the cases in which the correlation coefficient takes very small (≤0.1) or very large (≥0.8) values. The first of these cases is of special relevance, since it involves parameters that are very close to the boundary of the parameter space, which, according to [], could lead to inconsistencies while increasing computation times.
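Under the parameterization of Definition 1, $Cov(X_1,X_2)=\theta_3$ and the marginal variances are $\theta_1$ and $\theta_2$, so the correlation coefficient equals $\theta_3/\sqrt{\theta_1\theta_2}$. A short sketch for checking a candidate parameter vector (the second vector below is illustrative only) is:
rho_bp = function(theta) theta[3] / sqrt(theta[1] * theta[2])  # correlation of BP(theta)
rho_bp(c(1.5, 1, 0.31))  # approximately 0.25, one of the vectors used above
rho_bp(c(1, 1, 0.05))    # an illustrative small-correlation case, rho = 0.05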
For the same reasons given in Section 4.2, only the command parRapply and ten processors were used.
Table 9, Table 10, Table 11, Table 12, Table 13, Table 14 and Table 15 present the results of the type I error simulation for the parameter vectors that meet the characteristics already mentioned, considering the cases  and  with the following sample sizes , and 200.
       
    
    Table 9.
    Results of simulating probability of type I error. Instance .
  
       
    
    Table 10.
    Results of simulating probability of type I error. Instance .
  
       
    
    Table 11.
    Results of simulating probability of type I error. Instance .
  
       
    
    Table 12.
    Results of simulating probability of type I error. Instance .
  
       
    
    Table 13.
    Results of simulating probability of type I error. Instance .
  
       
    
    Table 14.
    Results of simulating probability of type I error. Instance .
  
       
    
    Table 15.
    Results of simulating probability of type I error. Instance .
  
Figure 15 and Figure 16 present the average execution times versus different sample sizes (n) when the simulations displayed in Table 9, Table 10, Table 11, Table 12, Table 13, Table 14 and Table 15 are executed.
      
    
    Figure 15.
      Average execution time (in seconds) versus sample sizes n for parameters  when .
  
      
    
    Figure 16.
      Average execution time (in seconds) versus sample sizes n for parameters  when .
  
5. Discussion
From Table 1 and Table 2, it is evident that the sequential version implemented in this research is faster (by between 1% and 2%) than the boot version of the R package for almost all parameter vectors θ, except for θ = (1.5, 1, 0.92). This could be attributed to the sensitivity of the maximum likelihood method, as θ3 is very close to θ2 in that case.
On the other hand, regardless of the parallel version (parRapply or boot) used, Figure 1, Figure 2, Figure 3, Figure 4, Figure 5 and Figure 6 highlight the rapid decrease in program execution time as the number of processors increases. Additionally, it is observed that the parRapply command delivers results faster than the boot command. Furthermore, when $\theta_1 = \theta_2 = 1$, the execution time is lower than in the cases where $\theta_1 = 1.5$. It is also observed that the execution time increases as the sample sizes grow.
It is also important to emphasize that the power model (faster than the exponential model) fits the execution time versus the number of processors almost perfectly, since approximately 99.9% ($R^2 \approx 0.999$) of the variation in execution time is explained by the number of processors used.
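Such a fit can be obtained, for example, by ordinary linear regression on the log–log scale; the sketch below uses the parRapply times of Table A1 (n = 30, first weight vector) and recovers the model constants from the regression coefficients.
p = 2:12
t_par = c(4049, 2764, 2118, 1726, 1445, 1242, 1102, 1007, 919, 841, 764)  # Table A1, n = 30
fit = lm(log(t_par) ~ log(p))       # log-linear form of t = alpha * p^beta
alpha = exp(coef(fit)[1]); beta = coef(fit)[2]
summary(fit)$r.squared              # close to 0.999 in the cases analyzed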
In addition, Figure 7, Figure 8, Figure 9, Figure 10, Figure 11 and Figure 12 show that the two parallel implementations addressed in this research exhibit linear efficiency. In all the cases analyzed, the parRapply command is more efficient than the boot command, with less dispersion among the sample sizes used. Furthermore, it is observed that when two processors are employed, the efficiency of the parRapply command is greater than 1.
In this line of discussion, and thanks to this research (using parallel programming), it was possible to obtain results for sample sizes greater than 70, which are displayed in Table 3, Table 4, Table 5, Table 6, Table 7 and Table 8. From these tables, it can be observed that the estimates are close to the nominal values; for example, the estimates at the 10% level are very close to 0.1. This becomes clearer as the sample size increases, implying that the estimates converge to the nominal values.
In line with the previous paragraph, Figure 13 and Figure 14 graphically illustrate the positive correlation between the execution time of the programs and the sample size. Clearly, the execution time fits a linear model in the sample size, and this fit is almost perfect. It is also evident that the execution time of the programs is lower when $\theta_1 = \theta_2 = 1$.
Finally, from Table 9, Table 10, Table 11, Table 12, Table 13, Table 14 and Table 15, it can be seen that the estimates of the type I error probability are very close to the nominal values in all cases where the correlation coefficient is large (≥0.8). However, as could be expected according to what [] suggested, in the cases in which the correlation coefficient is very small (≤0.1), the simulated probabilities are very far from the nominal values and, strangely, the situation worsens as the sample size grows.
In relation to the last paragraph, Figure 15 and Figure 16 show that the execution time of the programs fits a linear model in the sample size, with a positive slope. It is also observed that these times decrease as the correlation coefficient increases and, along this path, the dependence becomes more linear ($R^2$ increases). Additionally, these graphs show a high dispersion between the lines, which is due to the large difference between the correlation coefficients considered.
6. Conclusions
In this research, two parallel algorithmic implementations in the R language have been analyzed, which allow simulating the probability of a type I error. The results obtained show that the R language is a good alternative for reducing the simulation time of the studied test. On the other hand, the estimates of the type I error probability of the studied test are close to the nominal value for both small and large sample sizes, as long as the correlation coefficient involved is greater than 0.1.
Moreover, the execution time when simulating the type I error probability of the studied test is significantly higher in the cases where the correlation coefficient is low than in those where it is high.
The major mathematical finding obtained in this research is that the execution time of the analyzed parallel versions (parRapply and boot) fits almost perfectly ($R^2 \approx 0.999$) the power-law model $t(p) = \alpha p^{\beta}$, where $p$ is the number of processors used, and $\alpha$ and $\beta$ are the constants of the model.
The other mathematical result found is that the execution time of the analyzed parallel versions (parRapply and boot) fits the linear model $t(n) = c_0 + c_1 n$, where $n$ is the sample size and $c_1$ is the slope of the line.
As future work, there are plans to deepen the research on the efficiency of the studied test in cases where the correlation coefficient is less than 0.2. Additionally, implementing the algorithm developed in this research in the C programming language is planned. Furthermore, developing a package in the R language that incorporates the goodness-of-fit test studied and others proposed in [,,,] is also planned.
Funding
This paper is funded in part by the Universidad del Bío-Bío Project DIUBB 2220529 IF/R, Dirección de Investigación y Creación Artística (Fondo de Apoyo a la Participación a Eventos Internacionales) and Vicerrectoría Académica.
Data Availability Statement
Data are contained within the article.
Acknowledgments
The author thanks the editor of this journal and the anonymous reviewers for their valuable time and their careful comments and suggestions, which have contributed to improving the quality of this work.
Conflicts of Interest
The author declares no conflicts of interest.
Appendix A
This section presents the code generated in this research.
#Estimation with Maximum Likelihood Method (MLM)
#EstimatorML: estimates the parameters of the bivariate Poisson distribution (BPD) by the MLM
#X=((X_{1i},X_{2i})), i=1,2,…,n
EstimatorML = function(X){
 n = dim(X)[2]
 t1 = mean(X[1,])
 t2 = mean(X[2,])
 t3 = min(t1,t2)/2  # initial value of parameter t3
 x_min = min(X[1,])
 x_max = max(X[1,])
 y_min = min(X[2,])
 y_max = max(X[2,])
 frec = matrix(0,x_max-x_min+1,y_max-y_min+1)
 for(i in x_min:x_max){
  p = i - x_min + 1
  for(j in y_min:y_max){
   q = j - y_min + 1
   for (k in 1:n){
    if(X[1,k]==i && X[2,k]==j) frec[p,q] = frec[p,q]+1
   }
  }
 }
 # Newton-Raphson Method
 # prev_t3 saves the previous value of t3
 # RMyDerRM_t3 calculates RM and the derivative of RM with respect to t3
 num = 1
 difm = 0.001
 diff = 100
 dif_min = 100
 t3_min <- t3
 while (diff>=difm && num<=200){
  prev_t3 = t3
  RM_Der = RMyDerRM_t3(n,x_min,x_max,y_min,y_max,frec,t1,t2,t3)
  t3 = t3 - (RM_Der[1] - 1)/RM_Der[2]
  diff = abs(prev_t3 - t3)
  if(is.na(diff)){
    diff = difm/10
    num = 250
    t3 = t3_min
  } else if(diff<dif_min){
       dif_min <- diff
       t3_min <- t3
  }
  num = num + 1
 }
 return(c(t1,t2,t3))
}
# probPB calculates the probability function of a PB(t1,t2,t3)
probPB = function(x,y,t1,t2,t3){
 s = 0
 if(x>=0 && y>=0){
  m = min(x,y)
  tt1 = t1 - t3
  tt2 = t2 - t3
  for(i in 0:m){
   s = s + tt1^(x-i)*tt2^(y-i)*t3^i/(gamma(x-i+1)*gamma(y-i+1)*gamma(i+1))
  }
  s = s*exp(-tt1 - tt2 - t3)
 }
 return(s)
}
RMyDerRM_t3 = function(n,x_min,x_max,y_min,y_max,frec,t1,t2,t3){
 der = rm = 0
 for(i in x_min:x_max){
  p = i - x_min + 1
  for(j in y_min:y_max){
   q = j - y_min + 1
   if(frec[p,q]>0){
    prob_ij = probPB(i,j,t1,t2,t3)
    prob_im1jm1 = probPB(i-1,j-1,t1,t2,t3)
    prob_im2jm2 = probPB(i-2,j-2,t1,t2,t3)
    prob_im2jm1 = probPB(i-2,j-1,t1,t2,t3)
    prob_im1jm2 = probPB(i-1,j-2,t1,t2,t3)
    prob_im1j = probPB(i-1,j,t1,t2,t3)
    prob_ijm1 = probPB(i,j-1,t1,t2,t3)
    rm = rm + frec[p,q]*prob_im1jm1/prob_ij
    sum1 = (prob_im2jm2 - prob_im2jm1 - prob_im1jm2)/prob_ij
    sum2 = prob_im1jm1*(prob_im1j + prob_ijm1 - prob_im1jm1)/(prob_ij^2)
    der = der + frec[p,q]*(sum1 + sum2)
   }
  }
 }
 return(c(rm/n,der/n))
}
f_n = function(i,j,n,X){
 ind1 = ind2 = rep(0,n)
 ind1[X[1,]==i] = ind2[X[2,]==j] = 1
 ss = sum(ind1*ind2)
 return(ss/n)
}
# Bivariate Poisson Sample Generation
gmpb = function(n,t1,t2,t3){
 if (t1>t3 && t2>t3 && t3>0){
  a = rpois(n,t1-t3)
  b = rpois(n,t2-t3)
  c = rpois(n,t3)
  return(rbind(a+c,b+c))
 } else { stop("The parameters do not meet the requirements") }
}
# Calculation of the statistic R_n,a
# n is the sample size (taken from the global environment)
R = function(X,t1,t2,t3){
 m = 10
 inf = max(max(X[1,]),max(X[2,])) + m
 fp = Prob <- matrix(0,inf+1,inf+1)
 for(i in 0:inf){
  for(j in 0:inf){
   Prob[i+1,j+1] = probPB(i,j,t1,t2,t3)
   fp[i+1,j+1] = f_n(i,j,n,X) - probPB(i,j,t1,t2,t3)
  }
 }
 k = l = 0:inf
 f = function(k,l){1/((ii + k + a1 + 1)*(jj + l + a2 + 1))}
 s = 0
 for(ii in 0:inf){
  for(jj in 0:inf){
   s = s + fp[ii+1,jj+1]*sum(fp*outer(k,l,FUN=f))
  }
 }
 return(n*s)
}
# A function is defined to be introduced as an argument in parRapply
# Must be a function of a column
full_function = function(X){
 X = data.frame(matrix(unlist(X), ncol = 2, byrow = F))
 X = t(X)
 th_estim = EstimatorML(X)
 t1 = th_estim[1]
 t2 = th_estim[2]
 t3 = th_estim[3]
 r_obs = R(X,t1,t2,t3)
 r_boot = rep(0,B)
 for (b in 1:B){
   valid = FALSE  # 'next' is a reserved word in R, so a different variable name is used
   while(!valid){ # B bootstrap samples are generated
    X_PBboot = gmpb(n,th_estim[1],th_estim[2],th_estim[3])
    th_est_boot = EstimatorML(X_PBboot) # bootstrap theta estimator
    th_dif1 = th_est_boot[1] - th_est_boot[3]
    th_dif2 = th_est_boot[2] - th_est_boot[3]
    if(th_dif1 > 0 && th_dif2 > 0 && th_est_boot[3] > 0) valid = TRUE
  }
  # Statistics are evaluated in each bootstrap sample
  r_boot[b] = R(X_PBboot,th_est_boot[1],th_est_boot[2],th_est_boot[3])
 }
 # An approximation of the p-value is accumulated for each statistic
 ind_r = rep(0,B)
 ind_r[r_boot >= r_obs] = 1
 vp_b = sum(ind_r)/B
 return(vp_b)
}
M = 1000  # Monte Carlo iterations
B = 500   # Bootstrap iterations
va1 = c(0,0,1,1); va2 = c(0,1,0,1) # vector (a1,a2)
library(parallel)
nc = detectCores()
tm = c(30,50,70,100,150,200,250,300,500)  # Sample sizes
Theta = matrix(0,6,3)
Theta[1,] = c(1.5,1,0.31); Theta[2,] = c(1.5,1,0.62)
Theta[3,] = c(1.5,1,0.92); Theta[4,] = c(1,1,0.5)
Theta[5,] = c(1,1,0.25); Theta[6,] = c(1,1,0.75)
for(i in 1:6){
 th = Theta[i,]
 odd = seq(1, 2*M, by = 2)   # rows of XY holding the X components
 even = seq(2, 2*M, by = 2)  # rows of XY holding the Y components
 for(t in 1:length(tm)){
  n = tm[t]
  X = Y = matrix(0,M,n)
  carpet = "folder where the samples are saved"
  file_L = paste0(carpet,"/Theta_",th[1],"_",th[2],"_",th[3],"_n",n,".txt")
  XY = matrix(scan(file_L,sep=","),nrow=2*M,byrow=TRUE)
  X = XY[odd,]
  Y = XY[even,]
  XY_col = matrix(0,2*n,M)
  for (m in 1:M){
   XY_col[,m] = c(X[m,],Y[m,])
  }
  # parallelize with parRapply with nc processors
  carp_file = "folder where the outputs will be saved"
  file = paste0(carp_file,th[1],"_",th[2],"_",th[3],"_n",n,".txt")
  for (a_12 in 1:4){
   a1 = va1[a_12]
   a2 = va2[a_12]
   for(nn in 1:nc){
    cl = parallel::makeCluster(nn)
    clusterExport(cl, "EstimatorML")
    clusterExport(cl, "RMyDerRM_t3")
    clusterExport(cl, "probPB")
    clusterExport(cl, "f_n")
    clusterExport(cl, "n")
    clusterExport(cl, "a1")
    clusterExport(cl, "a2")
    clusterExport(cl, "B")
    clusterExport(cl, "M")
    clusterExport(cl, "R")
    clusterExport(cl, "gmpb")
     SS = paste0("Outputs with ", nn, " processors, a1=", a1, ", a2=", a2)
    write(SS,file=file,append=TRUE)
    a = proc.time()
    res = parRapply(cl, t(XY_col), full_function)
    parallel::stopCluster(cl)
    execution_time = proc.time()-a
    write("",file=file,append=TRUE)
    write(c("vp : ",res),ncolumns=length(res)+1,file=file,append=TRUE)
    ind1_r = ind2_r = ind3_r = rep(0,M)
    ind1_r[res <= .01] = 1
    error1_r = sum(ind1_r)/M
    ind2_r[res <= .05] = 1
    error5_r = sum(ind2_r)/M
    ind3_r[res <= .1] = 1
    error10_r = sum(ind3_r)/M
    write("",file=file,append=TRUE)
    write(c(error1_r,error5_r,error10_r),file=file,ncolumns=4,append=TRUE)
    write("",file=file,append=TRUE)
     write("Execution time",file=file,append=TRUE)
    write(execution_time,file=file,append=TRUE)
   }
  }
 }
}
Appendix B
In this section, the average execution times are presented as a function of the number of processors used when running the two parallel implementations (parRapply and boot) analyzed in this research. In each table, the four columns under each sample size n correspond to the four weight vectors a considered.
       
    
Table A1.
Average execution time (in seconds) using parRapply, θ = (1.5, 1, 0.31), weight vectors a, sample size n, for different numbers of processors p.
| p | n = 30 | | | | n = 50 | | | | n = 70 | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2 | 4049 | 4058 | 4063 | 4088 | 4402 | 4428 | 4413 | 4440 | 4657 | 4662 | 4663 | 4,674 | |
| 3 | 2764 | 2762 | 2801 | 2793 | 2959 | 2979 | 2970 | 3006 | 3164 | 3142 | 3164 | 3162 | |
| 4 | 2118 | 2100 | 2092 | 2112 | 2261 | 2259 | 2255 | 2248 | 2406 | 2399 | 2389 | 2404 | |
| 5 | 1726 | 1761 | 1713 | 1724 | 1868 | 1888 | 1874 | 1870 | 2004 | 1970 | 1988 | 1,969 | |
| 6 | 1445 | 1457 | 1444 | 1461 | 1575 | 1551 | 1601 | 1567 | 1679 | 1658 | 1640 | 1655 | |
| 7 | 1242 | 1246 | 1257 | 1263 | 1365 | 1350 | 1361 | 1356 | 1432 | 1457 | 1420 | 1447 | |
| 8 | 1102 | 1125 | 1123 | 1090 | 1203 | 1193 | 1191 | 1196 | 1288 | 1261 | 1274 | 1264 | |
| 9 | 1007 | 1007 | 1012 | 1012 | 1105 | 1099 | 1112 | 1097 | 1158 | 1162 | 1160 | 1171 | |
| 10 | 919 | 916 | 914 | 918 | 988 | 981 | 976 | 986 | 1044 | 1039 | 1055 | 1037 | |
| 11 | 841 | 840 | 835 | 845 | 905 | 931 | 913 | 933 | 963 | 975 | 960 | 981 | |
| 12 | 764 | 765 | 764 | 776 | 823 | 820 | 820 | 823 | 869 | 875 | 882 | 877 | |
       
    
Table A2.
Average execution time (in seconds) using boot, θ = (1.5, 1, 0.31), weight vectors a, sample size n, for different numbers of processors p.
| p | n = 30 | | | | n = 50 | | | | n = 70 | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2 | 4359 | 4377 | 4338 | 4351 | 4645 | 4660 | 4672 | 4650 | 4936 | 4943 | 4941 | 4,948 | |
| 3 | 3089 | 3078 | 3078 | 3079 | 3271 | 3280 | 3269 | 3258 | 3453 | 3454 | 3465 | 3,443 | |
| 4 | 2368 | 2368 | 2367 | 2380 | 2496 | 2489 | 2489 | 2478 | 2631 | 2626 | 2626 | 2605 | |
| 5 | 2003 | 2010 | 1998 | 2010 | 2110 | 2114 | 2098 | 2107 | 2229 | 2225 | 2219 | 2,209 | |
| 6 | 1701 | 1707 | 1710 | 1706 | 1775 | 1772 | 1783 | 1771 | 1875 | 1852 | 1861 | 1,863 | |
| 7 | 1543 | 1533 | 1532 | 1530 | 1598 | 1602 | 1600 | 1598 | 1670 | 1670 | 1666 | 1666 | |
| 8 | 1366 | 1360 | 1362 | 1362 | 1395 | 1400 | 1403 | 1393 | 1465 | 1471 | 1465 | 1476 | |
| 9 | 1280 | 1270 | 1281 | 1280 | 1322 | 1322 | 1328 | 1328 | 1376 | 1379 | 1377 | 1382 | |
| 10 | 1161 | 1177 | 1162 | 1169 | 1194 | 1193 | 1203 | 1196 | 1260 | 1255 | 1261 | 1254 | |
| 11 | 1098 | 1088 | 1095 | 1099 | 1126 | 1125 | 1119 | 1126 | 1173 | 1175 | 1170 | 1178 | |
| 12 | 1025 | 1020 | 1022 | 1027 | 1027 | 1029 | 1040 | 1034 | 1074 | 1072 | 1058 | 1071 | |
       
    
Table A3.
Average execution time (in seconds) using parRapply, θ = (1.5, 1, 0.62), weight vectors a, sample size n, for different numbers of processors p.
| p | n = 30 | | | | n = 50 | | | | n = 70 | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2 | 3897 | 3879 | 3969 | 3890 | 4257 | 4316 | 4291 | 4261 | 4529 | 4544 | 4530 | 4,564 | |
| 3 | 2658 | 2632 | 2626 | 2646 | 2889 | 2886 | 2915 | 2909 | 3107 | 3109 | 3091 | 3,093 | |
| 4 | 2037 | 2011 | 2015 | 2005 | 2178 | 2208 | 2195 | 2186 | 2330 | 2322 | 2343 | 2,344 | |
| 5 | 1669 | 1638 | 1672 | 1673 | 1797 | 1819 | 1823 | 1809 | 1925 | 1934 | 1949 | 1,927 | |
| 6 | 1391 | 1392 | 1375 | 1392 | 1495 | 1514 | 1524 | 1516 | 1609 | 1610 | 1610 | 1,615 | |
| 7 | 1200 | 1195 | 1205 | 1201 | 1311 | 1307 | 1317 | 1326 | 1388 | 1396 | 1380 | 1399 | |
| 8 | 1064 | 1046 | 1053 | 1001 | 1149 | 1156 | 1149 | 1151 | 1218 | 1236 | 1212 | 1224 | |
| 9 | 974 | 976 | 970 | 970 | 1058 | 1066 | 1059 | 1071 | 1117 | 1135 | 1124 | 1131 | |
| 10 | 867 | 873 | 877 | 874 | 973 | 948 | 972 | 950 | 1021 | 1004 | 1016 | 1023 | |
| 11 | 819 | 806 | 807 | 815 | 873 | 886 | 876 | 886 | 929 | 959 | 959 | 948 | |
| 12 | 728 | 737 | 730 | 738 | 820 | 807 | 806 | 810 | 859 | 861 | 853 | 858 | |
       
    
Table A4.
Average execution time (in seconds) using boot, θ = (1.5, 1, 0.62), weight vectors a, sample size n, for different numbers of processors p.
| p | n = 30 | | | | n = 50 | | | | n = 70 | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2 | 4133 | 4111 | 4110 | 4110 | 4542 | 4552 | 4529 | 4530 | 4840 | 4836 | 4844 | 4833 | |
| 3 | 2883 | 2874 | 2891 | 2875 | 3169 | 3175 | 3162 | 3160 | 3381 | 3372 | 3395 | 3373 | |
| 4 | 2184 | 2184 | 2188 | 2181 | 2394 | 2396 | 2388 | 2394 | 2549 | 2547 | 2564 | 2550 | |
| 5 | 1857 | 1854 | 1853 | 1854 | 2034 | 2031 | 2027 | 2033 | 2163 | 2155 | 2174 | 2158 | |
| 6 | 1555 | 1559 | 1554 | 1553 | 1697 | 1696 | 1697 | 1696 | 1810 | 1798 | 1810 | 1801 | |
| 7 | 1403 | 1397 | 1401 | 1399 | 1528 | 1526 | 1524 | 1525 | 1621 | 1619 | 1622 | 1622 | |
| 8 | 1223 | 1226 | 1223 | 1225 | 1335 | 1333 | 1335 | 1333 | 1420 | 1411 | 1419 | 1410 | |
| 9 | 1147 | 1144 | 1145 | 1145 | 1259 | 1258 | 1256 | 1256 | 1340 | 1338 | 1341 | 1,339 | |
| 10 | 1037 | 1036 | 1036 | 1037 | 1132 | 1131 | 1132 | 1132 | 1211 | 1208 | 1212 | 1209 | |
| 11 | 972 | 973 | 968 | 972 | 1058 | 1056 | 1055 | 1056 | 1133 | 1132 | 1133 | 1129 | |
| 12 | 892 | 893 | 895 | 898 | 973 | 971 | 962 | 973 | 1035 | 1024 | 1032 | 1024 | |
       
    
Table A5.
Average execution time (in seconds) using parRapply, θ = (1.5, 1, 0.92), weight vectors a, sample size n, for different numbers of processors p.
| p | n = 30 | | | | n = 50 | | | | n = 70 | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2 | 3970 | 3971 | 4000 | 4004 | 4351 | 4331 | 4355 | 4326 | 4639 | 4619 | 4654 | 4599 | |
| 3 | 2740 | 2713 | 2701 | 2708 | 2956 | 2963 | 2969 | 2947 | 3130 | 3178 | 3106 | 3130 | |
| 4 | 2064 | 2049 | 2028 | 2042 | 2250 | 2251 | 2253 | 2251 | 2369 | 2391 | 2349 | 2354 | |
| 5 | 1696 | 1677 | 1668 | 1686 | 1841 | 1865 | 1855 | 1845 | 1978 | 1962 | 1959 | 1973 | |
| 6 | 1403 | 1411 | 1399 | 1438 | 1551 | 1549 | 1538 | 1537 | 1645 | 1630 | 1620 | 1617 | |
| 7 | 1239 | 1221 | 1235 | 1224 | 1344 | 1339 | 1329 | 1328 | 1436 | 1414 | 1419 | 1409 | |
| 8 | 1087 | 1059 | 1079 | 1076 | 1191 | 1172 | 1174 | 1192 | 1243 | 1238 | 1234 | 1234 | |
| 9 | 981 | 986 | 998 | 989 | 1086 | 1080 | 1083 | 1086 | 1162 | 1144 | 1139 | 1138 | |
| 10 | 896 | 881 | 886 | 882 | 983 | 975 | 972 | 989 | 1034 | 1037 | 1027 | 1030 | |
| 11 | 832 | 819 | 841 | 820 | 906 | 893 | 884 | 912 | 948 | 943 | 944 | 948 | |
| 12 | 751 | 742 | 753 | 751 | 820 | 823 | 825 | 823 | 866 | 868 | 873 | 871 | |
       
    
Table A6.
Average execution time (in seconds) using boot, θ = (1.5, 1, 0.92), weight vectors a, sample size n, for different numbers of processors p.
| p | n = 30 | | | | n = 50 | | | | n = 70 | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2 | 4085 | 4083 | 4083 | 4082 | 4728 | 4551 | 4539 | 4535 | 4836 | 4839 | 4839 | 4830 | |
| 3 | 2860 | 2850 | 2847 | 2856 | 3185 | 3175 | 3165 | 3166 | 3374 | 3372 | 3396 | 3376 | |
| 4 | 2155 | 2159 | 2169 | 2155 | 2407 | 2396 | 2391 | 2401 | 2556 | 2543 | 2547 | 2547 | |
| 5 | 1836 | 1838 | 1837 | 1835 | 2044 | 2034 | 2032 | 2035 | 2166 | 2156 | 2164 | 2164 | |
| 6 | 1534 | 1530 | 1536 | 1527 | 1704 | 1700 | 1698 | 1698 | 1804 | 1806 | 1808 | 1,811 | |
| 7 | 1381 | 1382 | 1380 | 1381 | 1533 | 1531 | 1526 | 1528 | 1622 | 1622 | 1625 | 1,626 | |
| 8 | 1202 | 1203 | 1203 | 1206 | 1339 | 1337 | 1333 | 1333 | 1414 | 1417 | 1418 | 1413 | |
| 9 | 1121 | 1125 | 1122 | 1126 | 1264 | 1264 | 1262 | 1260 | 1341 | 1341 | 1342 | 1339 | |
| 10 | 1016 | 1018 | 1018 | 1018 | 1133 | 1132 | 1133 | 1135 | 1210 | 1210 | 1214 | 1208 | |
| 11 | 956 | 955 | 955 | 954 | 1060 | 1059 | 1058 | 1058 | 1136 | 1134 | 1135 | 1134 | |
| 12 | 876 | 881 | 886 | 877 | 975 | 976 | 965 | 966 | 1028 | 1028 | 1027 | 1027 | |
       
    
Table A7.
Average execution time (in seconds) using parRapply, θ = (1, 1, 0.5), weight vectors a, sample size n, for different numbers of processors p.
| p | n = 30 | | | | n = 50 | | | | n = 70 | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2 | 3457 | 3439 | 3509 | 3475 | 3727 | 3759 | 3767 | 3771 | 3960 | 3978 | 3952 | 3979 | |
| 3 | 2341 | 2330 | 2326 | 2350 | 2498 | 2522 | 2539 | 2517 | 2679 | 2721 | 2676 | 2704 | |
| 4 | 1782 | 1769 | 1769 | 1758 | 1929 | 1931 | 1914 | 1930 | 2053 | 2037 | 2031 | 2034 | |
| 5 | 1466 | 1479 | 1468 | 1491 | 1591 | 1595 | 1602 | 1590 | 1679 | 1673 | 1690 | 1680 | |
| 6 | 1222 | 1228 | 1222 | 1232 | 1326 | 1335 | 1325 | 1329 | 1415 | 1399 | 1413 | 1409 | |
| 7 | 1070 | 1075 | 1063 | 1061 | 1150 | 1156 | 1138 | 1146 | 1229 | 1211 | 1210 | 1213 | |
| 8 | 941 | 938 | 942 | 931 | 1003 | 1004 | 1015 | 1013 | 1075 | 1070 | 1072 | 1074 | |
| 9 | 848 | 849 | 849 | 858 | 936 | 924 | 923 | 930 | 976 | 1011 | 987 | 1016 | |
| 10 | 777 | 808 | 781 | 774 | 839 | 834 | 835 | 839 | 888 | 877 | 877 | 899 | |
| 11 | 715 | 703 | 725 | 723 | 773 | 768 | 773 | 774 | 813 | 821 | 823 | 823 | |
| 12 | 656 | 654 | 655 | 660 | 709 | 706 | 710 | 712 | 755 | 761 | 743 | 749 | |
       
    
Table A8.
Average execution time (in seconds) using boot, θ = (1, 1, 0.5), weight vectors a, sample size n, for different numbers of processors p.
| p | n = 30 | | | | n = 50 | | | | n = 70 | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2 | 3665 | 3673 | 3670 | 3682 | 4000 | 3983 | 3988 | 3992 | 4249 | 4255 | 4238 | 4241 | |
| 3 | 2577 | 2572 | 2589 | 2586 | 2792 | 2786 | 2799 | 2806 | 2971 | 2984 | 2966 | 2967 | |
| 4 | 1955 | 1947 | 1959 | 1961 | 2109 | 2103 | 2113 | 2110 | 2233 | 2247 | 2229 | 2235 | |
| 5 | 1668 | 1663 | 1662 | 1667 | 1794 | 1790 | 1797 | 1795 | 1901 | 1908 | 1898 | 1903 | |
| 6 | 1399 | 1400 | 1394 | 1399 | 1497 | 1498 | 1497 | 1498 | 1584 | 1591 | 1590 | 1594 | |
| 7 | 1263 | 1260 | 1262 | 1256 | 1351 | 1353 | 1355 | 1352 | 1431 | 1432 | 1431 | 1433 | |
| 8 | 1091 | 1092 | 1093 | 1092 | 1175 | 1170 | 1172 | 1174 | 1254 | 1250 | 1250 | 1252 | |
| 9 | 1014 | 1014 | 1019 | 1015 | 1093 | 1091 | 1095 | 1095 | 1172 | 1172 | 1175 | 1169 | |
| 10 | 938 | 935 | 933 | 932 | 989 | 989 | 988 | 989 | 1049 | 1048 | 1052 | 1050 | |
| 11 | 875 | 878 | 880 | 877 | 931 | 929 | 929 | 930 | 985 | 986 | 988 | 986 | |
| 12 | 809 | 810 | 809 | 810 | 859 | 862 | 861 | 860 | 904 | 904 | 913 | 911 | |
       
    
Table A9.
Average execution time (in seconds) using parRapply, θ = (1, 1, 0.25), weight vectors a, sample size n, for different numbers of processors p.
| p | n = 30 | | | | n = 50 | | | | n = 70 | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2 | 3589 | 3605 | 3583 | 3587 | 3925 | 3975 | 3867 | 3904 | 4113 | 4063 | 4154 | 4105 | |
| 3 | 2436 | 2452 | 2442 | 2458 | 2660 | 2628 | 2630 | 2667 | 2792 | 2837 | 2796 | 2803 | |
| 4 | 1844 | 1851 | 1859 | 1879 | 1998 | 1991 | 2018 | 1993 | 2105 | 2103 | 2139 | 2106 | |
| 5 | 1528 | 1563 | 1545 | 1537 | 1667 | 1669 | 1652 | 1648 | 1760 | 1762 | 1748 | 1745 | |
| 6 | 1280 | 1296 | 1277 | 1289 | 1375 | 1384 | 1382 | 1379 | 1445 | 1468 | 1449 | 1465 | |
| 7 | 1125 | 1114 | 1111 | 1136 | 1200 | 1194 | 1203 | 1223 | 1260 | 1267 | 1253 | 1260 | |
| 8 | 988 | 972 | 980 | 974 | 1062 | 1043 | 1049 | 1059 | 1108 | 1100 | 1109 | 1111 | |
| 9 | 890 | 892 | 897 | 892 | 960 | 957 | 968 | 958 | 1054 | 1014 | 1038 | 1017 | |
| 10 | 811 | 812 | 810 | 799 | 874 | 880 | 870 | 872 | 914 | 925 | 910 | 921 | |
| 11 | 748 | 746 | 751 | 742 | 797 | 821 | 799 | 794 | 846 | 850 | 855 | 846 | |
| 12 | 689 | 678 | 687 | 685 | 745 | 736 | 738 | 739 | 771 | 786 | 773 | 779 | |
       
    
Table A10.
Average execution time (in seconds) using boot, θ = (1, 1, 0.25), weight vectors a, sample size n, for different numbers of processors p.
| p | n = 30 | | | | n = 50 | | | | n = 70 | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2 | 3968 | 3950 | 3972 | 3954 | 4136 | 4150 | 4142 | 4141 | 4368 | 4354 | 4384 | 4386 | |
| 3 | 2830 | 2809 | 2812 | 2813 | 2923 | 2913 | 2922 | 2907 | 3069 | 3086 | 3058 | 3065 | |
| 4 | 2175 | 2166 | 2169 | 2164 | 2215 | 2222 | 2229 | 2222 | 2319 | 2328 | 2333 | 2323 | |
| 5 | 1848 | 1845 | 1853 | 1856 | 1892 | 1892 | 1893 | 1888 | 1972 | 1980 | 1972 | 1972 | |
| 6 | 1586 | 1570 | 1570 | 1580 | 1590 | 1590 | 1595 | 1586 | 1663 | 1659 | 1651 | 1664 | |
| 7 | 1416 | 1421 | 1419 | 1424 | 1430 | 1438 | 1443 | 1434 | 1487 | 1484 | 1480 | 1483 | |
| 8 | 1255 | 1251 | 1252 | 1256 | 1255 | 1261 | 1260 | 1255 | 1303 | 1299 | 1302 | 1310 | |
| 9 | 1171 | 1180 | 1181 | 1178 | 1169 | 1174 | 1176 | 1176 | 1222 | 1224 | 1226 | 1219 | |
| 10 | 1082 | 1086 | 1087 | 1084 | 1069 | 1070 | 1068 | 1074 | 1100 | 1096 | 1099 | 1102 | |
| 11 | 1028 | 1020 | 1019 | 1025 | 1003 | 1008 | 1007 | 1002 | 1028 | 1031 | 1032 | 1027 | |
| 12 | 954 | 952 | 959 | 952 | 931 | 935 | 925 | 927 | 950 | 952 | 946 | 953 | |
       
    
Table A11.
Average execution time (in seconds) using parRapply, θ = (1, 1, 0.75), weight vectors a, sample size n, for different numbers of processors p.
| p | n = 30 | | | | n = 50 | | | | n = 70 | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2 | 3383 | 3426 | 3400 | 3418 | 3731 | 3716 | 3725 | 3733 | 3946 | 3970 | 3981 | 3963 | |
| 3 | 2321 | 2293 | 2321 | 2296 | 2511 | 2543 | 2525 | 2566 | 2697 | 2690 | 2705 | 2683 | |
| 4 | 1742 | 1755 | 1752 | 1747 | 1899 | 1910 | 1931 | 1918 | 2064 | 2021 | 2031 | 2063 | |
| 5 | 1478 | 1451 | 1462 | 1444 | 1610 | 1585 | 1599 | 1603 | 1701 | 1693 | 1702 | 1700 | |
| 6 | 1204 | 1205 | 1214 | 1217 | 1320 | 1315 | 1338 | 1322 | 1410 | 1405 | 1416 | 1408 | |
| 7 | 1053 | 1040 | 1036 | 1047 | 1150 | 1146 | 1145 | 1147 | 1238 | 1234 | 1225 | 1227 | |
| 8 | 907 | 927 | 916 | 927 | 1002 | 1009 | 1009 | 1004 | 1077 | 1083 | 1090 | 1062 | |
| 9 | 850 | 844 | 847 | 852 | 931 | 921 | 920 | 920 | 974 | 988 | 978 | 992 | |
| 10 | 755 | 767 | 762 | 767 | 831 | 831 | 843 | 842 | 892 | 897 | 894 | 885 | |
| 11 | 701 | 708 | 713 | 699 | 775 | 762 | 770 | 772 | 810 | 801 | 805 | 812 | |
| 12 | 643 | 647 | 648 | 645 | 711 | 713 | 700 | 707 | 759 | 750 | 759 | 753 | |
       
    
Table A12.
Average execution time (in seconds) using boot, θ = (1, 1, 0.75), weight vectors a, sample size n, for different numbers of processors p.
| p | n = 30 | | | | n = 50 | | | | n = 70 | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2 | 3582 | 3601 | 3601 | 3596 | 3957 | 3978 | 3983 | 3966 | 4226 | 4631 | 4244 | 4254 | |
| 3 | 2533 | 2524 | 2539 | 2532 | 2789 | 2790 | 2795 | 2783 | 2963 | 2957 | 2966 | 2972 | |
| 4 | 1904 | 1900 | 1905 | 1900 | 2105 | 2099 | 2095 | 2088 | 2228 | 2335 | 2229 | 2241 | |
| 5 | 1625 | 1623 | 1621 | 1622 | 1791 | 1791 | 1789 | 1785 | 1901 | 1904 | 1898 | 1897 | |
| 6 | 1358 | 1356 | 1353 | 1351 | 1486 | 1492 | 1491 | 1484 | 1588 | 1589 | 1585 | 1585 | |
| 7 | 1220 | 1219 | 1218 | 1218 | 1346 | 1347 | 1345 | 1347 | 1428 | 1432 | 1427 | 1427 | |
| 8 | 1053 | 1050 | 1049 | 1052 | 1169 | 1166 | 1166 | 1161 | 1245 | 1249 | 1249 | 1247 | |
| 9 | 976 | 978 | 978 | 979 | 1086 | 1087 | 1087 | 1084 | 1169 | 1170 | 1170 | 1169 | |
| 10 | 895 | 895 | 894 | 896 | 983 | 984 | 985 | 982 | 1048 | 1050 | 1049 | 1051 | |
| 11 | 843 | 844 | 843 | 846 | 927 | 925 | 925 | 924 | 984 | 986 | 986 | 985 | |
| 12 | 782 | 777 | 775 | 784 | 856 | 849 | 860 | 857 | 903 | 911 | 914 | 906 | |
References
- Novoa-Muñoz, F.; Jiménez-Gamero, M.D. Testing for the bivariate Poisson distribution. Metrika 2014, 77, 771–793.
 - Bolis, A.; Cantwell, C.D.; Moxey, D.; Serson, D.; Sherwin, S.J. An adaptable parallel algorithm for the direct numerical simulation of incompressible turbulent flows using a Fourier spectral/hp element method and MPI virtual topologies. Comput. Phys. Commun. 2016, 206, 17–25.
 - Macías-Díaz, J.E. An easy-to-implement parallel algorithm to simulate complex instabilities in three-dimensional (fractional) hyperbolic systems. Comput. Phys. Commun. 2020, 254, 51059.
 - Laman, D.S.; Wieringen, W.N. A parallel algorithm for ridge-penalized estimation of the multivariate exponential family from data of mixed types. Stat. Comput. 2021, 31, 41.
 - Yang, D.; Li, M.; Liu, H. A Parallel Computing Algorithm for the Emergency-Oriented Atmospheric Dispersion Model CALPUFF. Atmosphere 2022, 13, 2129.
 - Trabes, G.; Wainer, G.; Gil-Costa, V. A Parallel Algorithm to Accelerate DEVS Simulations in Shared Memory Architectures. IEEE Trans. Parallel Distrib. Syst. 2023, 34, 1609–1620.
 - Wu, X.; Kolar, A.; Chung, J.; Jin, D.; Suchara, M.; Kettimuthu, R. Parallel Simulation of Quantum Networks with Distributed Quantum State Management. ACM Trans. Model. Comput. 2024, 34, 1–28.
 - Kocherlakota, S.; Kocherlakota, K. Bivariate Discrete Distributions; CRC Press: Boca Raton, FL, USA, 2017.
 - Rohatgi, V.K.; Saleh, A.K.M.E. An Introduction to Probability Theory and Mathematical Statistics, 3rd ed.; Wiley Series in Probability and Statistics; John Wiley & Sons: Hoboken, NJ, USA, 2015.
 - Shao, J. Mathematical Statistics; Springer Science & Business Media: New York, NY, USA, 2008.
 - Andrews, D.W.K. Inconsistency of the bootstrap when a parameter is on the boundary of the parameter space. Econometrica 2000, 68, 399–400.
 - Pacheco, P. An Introduction to Parallel Programming; Morgan Kaufmann Publishers: Burlington, MA, USA, 2011.
 - R Core Team. R: A Language and Environment for Statistical Computing; R Foundation for Statistical Computing: Vienna, Austria, 2020. Available online: https://www.R-project.org/ (accessed on 5 June 2021).
 - Urbanek, S. multicore: Parallel Processing of R Code on Machines with Multiple Cores or CPUs; 2014. Available online: https://cran.r-project.org/package=multicore (accessed on 25 July 2021).
 - Canty, A.; Ripley, B.D. boot: Bootstrap R (S-Plus) Functions. R Package Version 1.3-30; 2024. Available online: https://cran.r-project.org/web/packages/boot/boot.pdf (accessed on 12 May 2024).
 - Chapple, S.; Troup, E.; Forster, T.; Sloan, T. Mastering Parallel Programming with R; Packt Publishing Ltd.: Birmingham, UK, 2016.
 - Matloff, N. Parallel Computing for Data Science; Chapman and Hall/CRC: New York, NY, USA, 2016.
 - Rauber, T.; Rünger, G. Parallel Programming, 3rd ed.; Springer: Cham, Switzerland, 2023.
 - Novoa-Muñoz, F.; Jiménez-Gamero, M.D. A goodness-of-fit test for the multivariate Poisson distribution. SORT 2016, 40, 113–138.
 - Novoa-Muñoz, F. Goodness-of-fit tests for the bivariate Poisson distribution. Commun. Stat. Simul. Comput. 2021, 50, 1998–2014.
 - González-Albornoz, P.; Novoa-Muñoz, F. Goodness-of-Fit Test for the Bivariate Hermite Distribution. Axioms 2023, 12, 7.
 
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).