Using Copula to Model Dependence When Testing Multiple Hypotheses in DNA Microarray Experiments: A Bayesian Approximation

Many experiments require simultaneously testing many hypotheses. This is particularly relevant in the context of DNA microarray experiments, where it is common to analyze many genes to determine which of them are differentially expressed under two conditions. Another important problem in this context is how to model the dependence at the level of gene expression. In this paper, we propose a Bayesian procedure for simultaneously testing multiple hypotheses, modeling the dependence through copula functions, where all available information, both objective and subjective, can be used. The approach has the advantage that it can be used with different dependency structures. A simulated data analysis was performed to examine the performance of the proposed approach. The results show that our procedure captures the dependence appropriately, correctly classifying a high percentage of true and false null hypotheses when a beta prior distribution skewed to the right is chosen for the initial probability of each null hypothesis, resulting in a very powerful procedure. The procedure is also illustrated with real data.


Introduction
There are many experiments that require simultaneously testing many hypotheses. In that context, if each hypothesis is tested individually at a given significance level, α, the probability of erroneously rejecting at least one hypothesis increases rapidly with the number of hypotheses; i.e., a problem arises if the multiplicity of the problem is not taken into account when simultaneously evaluating all of the hypotheses. DNA microarray experiments exhibit this problem, as data analysis often requires simultaneously testing many hypotheses, one for each gene. The first to warn of this problem was [1]. The literature regarding this subject is extensive, especially under the assumption of independence.
From a frequentist point of view, procedures for testing multiple hypotheses are based on controlling a measure related to Type I errors, such as the familywise error rate (FWER). However, this usually leads to especially conservative procedures, in the sense that few false null hypotheses are rejected, thus reducing the power of the test.
When testing multiple hypotheses, the false discovery rate (FDR) was proposed by [2] as a measure of error that results in less conservative procedures than those controlling the FWER. The authors argue that in some situations, it may be acceptable to tolerate some false positives, provided that there are few in relation to the number of rejected null hypotheses. A review of multiple hypothesis tests is presented by [3]. Additionally, these authors proposed methods based on ordered p-values for multiple comparisons between means of normal populations.
Continuing with frequentist approaches, the different error rates were analyzed by [4], who also compared the different procedures in the context of DNA microarrays, and a general statistical framework was proposed by [5] for multiple tests of association between known fixed features of a genome and unknown parameters of the distribution of variable features of this genome in a population of interest.
In the context of DNA microarray experiments, the FDR is the error measure most commonly used in the frequentist approach. This is because multiple hypothesis testing is used, in many situations, as a first exploratory step to identify groups of differentially expressed genes on which to perform further research. Thus, it may be acceptable to tolerate a higher number of false positives. Furthermore, the authors of [2] derived a procedure for controlling the FDR at a certain level, α, for independent test statistics. There are many published studies in this field. For a detailed review of the multiple testing problem, see [4,5].
In some cases, however, test statistics are dependent. For instance, in the context of DNA microarray experiments, genes usually present high correlation and the number of hypotheses is very high, which presents a challenge and an important problem in statistics. The following research considers multiple hypothesis testing through a frequentist approach under dependence. The case in which test statistics have positive regression dependency on each of the test statistics corresponding to the true null hypotheses is analyzed in [6]. A step-down procedure that controls the FDR under independence of the test statistics while also controlling the FDR under positive dependence can be seen in [7]. An application of Archimedean copulas to resampled p-values generated by permutations in the context of multiple testing is shown in [8]. A detailed review of the multiple testing problem, addressed to control the FDR under Archimedean copulas, can be found in [9] and the references therein.
From a Bayesian perspective, the posterior probability of each null hypothesis must be obtained to make decisions. Several publications offer essential insights into this approach under the assumption of independence. For example, a mixture of a discrete and a continuous component is proposed as the distribution for the observations in a Bayesian approach by [10]. Hierarchical Bayesian models can be seen in [11,12]: the former is robust with respect to extreme values and powerful even with a small number of observations, while the latter is based on a mixture of two distributions along with an empirical Bayes approach.
Moreover, the sensitivity regarding the choice of the prior distribution on the probability of each null hypothesis was analyzed in [13]. On the other hand, the multiple hypothesis testing problem was dealt with from the perspective of Bayesian decision theory in [14], where a decision criterion based on an estimate of the number of false null hypotheses is proposed. Finally, procedures that control the Bayes FDR and the Bayes FNR, which are applicable in any situation, are suggested by [15].
Under the assumption of dependence in the field of genomics, an empirical Bayesian method is proposed in [16], which essentially combines multiple testing with a clustering technique. Similarly, a procedure that combines a clustering technique with multiple testing from a Bayesian perspective to deal with the correlation effect in data analysis is presented by [17].
Other recent approaches have been developed to address dependence in testing multiple hypotheses, such as graphical models. The hidden Markov model (HMM), in the context of graphical models, has emerged as a tool to model the dependence structure of the data when testing multiple hypotheses, and it has had considerable impact on the field of genomics. The potential offered by the HMM to model the dependency structure has been explored by [18][19][20][21], among others. The dependency structure was explored through Markov chain models by [18], demonstrating the optimality of the FDR procedure under certain conditions at an appropriate alpha level, along with the empirical realization of these models. This procedure was extended in [21][22][23] by developing a graphical model based on the multiple test procedure and a Markov-random-field-coupled mixture model. The extended procedure allows for an arbitrary dependency structure (N ≥ 2) and heterogeneously dependent parameters. The effect of the dependency structure of the finite states of the HMM on the likelihood ratio for optimal multiple tests in hidden states is analyzed in [19].
The main aim of this paper is to provide a Bayesian procedure for testing multiple hypotheses in cases with many hypotheses, under the assumption of dependency, and modeling this dependence through copula functions. Copulas are attractive because they can be used to model a wide range of dependency structures. The remainder of this paper is organized as follows. In Section 2, we propose a full Bayesian approach for the problem of testing multiple hypotheses, and we describe the theory regarding copula functions. In Section 3, we present the full Bayesian approach for modeling dependence through an N-dimensional Gaussian copula with normal marginal densities, together with the prior and conditional posterior distributions necessary to apply a Markov chain Monte Carlo (MCMC) algorithm (the Metropolis-Hastings-within-Gibbs algorithm), along with a summary of this algorithm (the details for which are given in the Appendix A) and a simulation study that evaluates the performance of our approach. Section 4 shows the Bayesian approach for modeling the dependence through an N-dimensional Clayton copula with normal marginal densities, together with a simulation study and a comparison for model selection based on the Deviance Information Criterion (DIC). Section 5 applies the proposed methodology to a real data set from experiments with DNA microarrays. Finally, Section 6 presents the main conclusions.

A Bayesian Approach: Model Specification
Suppose N dependent random variables are measured under two different independent treatment conditions. In particular, suppose X = (X_1, ..., X_N) is an N-dimensional random vector of dependent variables measured under one condition, and that Y = (Y_1, Y_2, ..., Y_N) is an N-dimensional random vector of dependent variables measured under the other condition, where X and Y arise independently from distributions F_X(X|Θ_X, λ_X) and F_Y(Y|Θ_Y, λ_Y), respectively, and where Θ_X = (θ_{X_1}, ..., θ_{X_N}) and Θ_Y = (θ_{Y_1}, ..., θ_{Y_N}) are the parameter vectors of interest and Λ = (λ_X, λ_Y) is the other group of parameter vectors for the model. We consider the problem of simultaneous testing as follows:

H_{0i}: θ_{X_i} = θ_{Y_i} versus H_{1i}: θ_{X_i} ≠ θ_{Y_i}, for i = 1, ..., N. (1)

We decide which null hypotheses to accept through the posterior probability of each null hypothesis. Thus, we build a probability distribution model for X and Y, and then infer the parameters once the variables X_i and Y_i have been observed.
The joint probability density of X and Y is defined as the product of the joint probability densities f(X) and f(Y), because of the independence between these two conditions assumed above. Thus,

f(X, Y|Θ) = f(X|Θ_X, λ_X) f(Y|Θ_Y, λ_Y), (2)

where Θ = (Θ_X, λ_X, Θ_Y, λ_Y) denotes the model parameters.

Copula Function
We build a multivariate distribution for each treatment condition using a copula function. According to [24,25], a copula is a joint distribution function defined on the unit cube [0, 1]^n with standard uniform univariate margins. This concept was introduced by [26], and other recent works have been proposed, such as [27][28][29], among others. Copulas are especially useful because they can be used to fully model the dependence in the data.
According to Sklar's theorem (1959), given a joint cumulative distribution function F(x_1, ..., x_N) for random variables X_1, X_2, ..., X_N with marginal cumulative distribution functions (CDFs) F_1(x_1), F_2(x_2), ..., F_N(x_N), F can be written as a function of its marginals:

F(x_1, ..., x_N) = C(F_1(x_1), ..., F_N(x_N)),

where C(u_1, u_2, ..., u_N) is a joint distribution function with uniform marginals, u_i = F_i(x_i), for i = 1, ..., N, and C is called a copula. If F_1, ..., F_N are continuous, then the copula C is unique; if some F_i are discrete, then C is unique only on the product of the ranges of the marginals. For a rigorous analysis of copulas, see [29]. As a consequence of Sklar's theorem (without loss of generality, we treat the absolutely continuous case), the joint probability density can be written as the product of the marginal densities and the copula density. Thus, an N-dimensional joint density function is defined as follows:

f(x_1, ..., x_N) = c(F_1(x_1), ..., F_N(x_N)) ∏_{i=1}^{N} f_i(x_i), (3)

where

c(u_1, u_2, ..., u_N) = ∂^N C(u_1, u_2, ..., u_N) / (∂u_1 ∂u_2 ⋯ ∂u_N). (4)

The dependence function c(u_1, u_2, ..., u_N) is called the copula density, and it encodes the dependence among the variables (x_1, x_2, ..., x_N). For instance, if the random variables x_1, x_2, ..., x_N are independent, then c(u_1, u_2, ..., u_N) = 1 and thus f(x_1, ..., x_N) = ∏_{i=1}^{N} f_i(x_i).
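As an illustration of the factorization in (3) and (4), the following Python sketch (added here for clarity; it is not code from the paper and assumes NumPy and SciPy) verifies numerically that a bivariate normal density equals the Gaussian copula density times its normal marginal densities, and that the copula density equals 1 under independence:

```python
import numpy as np
from scipy import stats

def gaussian_copula_density(u, corr):
    """Density c(u_1, ..., u_N) of the Gaussian copula with correlation matrix corr."""
    z = stats.norm.ppf(u)                       # normal scores Phi^{-1}(u_i)
    inv = np.linalg.inv(corr)
    det = np.linalg.det(corr)
    quad = z @ (inv - np.eye(len(u))) @ z
    return det ** -0.5 * np.exp(-0.5 * quad)

# Sklar's theorem in action: for a bivariate normal with correlation rho,
# f(x1, x2) = c(F1(x1), F2(x2)) * f1(x1) * f2(x2).
rho = 0.6
corr = np.array([[1.0, rho], [rho, 1.0]])
x = np.array([0.3, -1.1])
u = stats.norm.cdf(x)
lhs = stats.multivariate_normal(mean=[0, 0], cov=corr).pdf(x)
rhs = gaussian_copula_density(u, corr) * stats.norm.pdf(x).prod()
assert np.isclose(lhs, rhs)

# Independence: with an identity correlation matrix, the copula density is 1.
assert np.isclose(gaussian_copula_density(u, np.eye(2)), 1.0)
```

The same factorization holds for any continuous marginals, which is what makes the copula a convenient carrier of the dependence structure.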

Modeling Dependence with N-Dimensional Copulas
From (3) and (4), the N-dimensional joint density function (2) for X and Y can be expressed as follows:

f(X, Y|Θ) = c_X(u_{X_1}, u_{X_2}, ..., u_{X_N}; ω_X) ∏_{i=1}^{N} f_{X_i}(x_i) · c_Y(u_{Y_1}, u_{Y_2}, ..., u_{Y_N}; ω_Y) ∏_{i=1}^{N} f_{Y_i}(y_i),

where ω_X and ω_Y denote the vector parameters of the copula density for conditions X and Y, respectively. Accordingly, we update the parameter vector for the model to Θ = (Θ_X, λ_X, ω_X, Θ_Y, λ_Y, ω_Y). To simplify the notation, throughout the following, we use c_X(u_X; ω_X) and c_Y(u_Y; ω_Y), rather than c_X(u_{X_1}, u_{X_2}, ..., u_{X_N}; ω_X) and c_Y(u_{Y_1}, u_{Y_2}, ..., u_{Y_N}; ω_Y), respectively.
In the following sections, we consider normal marginal densities for the model, as is usually done when modeling gene expression data, and the means are the parameters of interest. In this context, assuming that the joint distribution is normal may seem reasonable. Then, in Section 3, the dependency between the variables in each treatment condition is modeled using an N-dimensional Gaussian copula. However, the Gaussian copula is not always the most appropriate for modeling dependency, even if the marginals are normal, because the assumption that the marginal distributions are normal does not imply that the joint distribution is normal, as can be seen, for example, in [31][32][33]. For this reason, in Section 4, the dependency is modeled using an N-dimensional Clayton copula.

Modeling Dependence Through N-Dimensional Gaussian Copulas with Normal Marginal Densities
The typical objective when analyzing data arising from microarray experiments is to identify genes that are differentially expressed. Normal marginal distributions have been widely used in the field of genomics to model gene expression data [13,14,34,35], among others.
Thus, we may assume a normal distribution for the variables X_i and Y_i, i = 1, 2, ..., N. We introduce latent variables τ_i, defined by

τ_i = 0 under H_{0i} and τ_i = 1 under H_{1i}, for i = 1, 2, ..., N. (7)

We consider that the vector of observations (x_i., y_i.) proceeds from a distribution under H_{0i} when τ_i = 0, and the marginal density of each observation of this vector is defined by the same law, N(µ_{X_i}, σ_i²), for both treatment conditions. Likewise, we consider that the vector of observations (x_i., y_i.) proceeds from a distribution under H_{1i} when the latent variable τ_i = 1, for each i = 1, 2, ..., N; in this case, the marginal densities of the random variables X_i and Y_i are N(µ_{X_i}, σ_i²) and N(µ_{Y_i}, σ_i²), respectively. Please note that we consider the variance of the population to be the same, σ_i², for the two treatment conditions, i = 1, ..., N, although it may differ across hypotheses. (For simplicity's sake, we have considered a common variance σ_i²; however, the procedure is also applicable when the variances differ between conditions.) The main objective of this paper is to identify differential expression under two experimental conditions. Therefore, we developed multiple hypothesis tests to decide between treatments, which is equivalent to testing the following hypotheses:

H_{0i}: µ_{X_i} = µ_{Y_i} versus H_{1i}: µ_{X_i} ≠ µ_{Y_i}, for i = 1, 2, ..., N, (9)

where µ_{X_i} and µ_{Y_i} are the means of X_i and Y_i, respectively. To model the dependence between the variables, we assume the density distribution for each condition defined by the N-dimensional Gaussian copula, since it uses only the pairwise correlations among variables. This is done in precisely the same way that a multivariate normal distribution encodes the dependence between variables, and it allows for any marginal distribution and any positive-definite correlation matrix [36]. The Gaussian copula is defined as follows:

C_X(u_{X_1}, ..., u_{X_N}; Σ_X) = Φ_{Σ_X}(Φ^{-1}(u_{X_1}), ..., Φ^{-1}(u_{X_N})), and analogously C_Y(u_{Y_1}, ..., u_{Y_N}; Σ_Y),

where Σ_X and Σ_Y are the correlation matrices of the copula for conditions X and Y, respectively, Φ is the standard normal CDF, and Φ_Σ is the N-dimensional normal CDF with correlation matrix Σ. For the sake of simplicity, to build the model we considered the same dependency structure for the two treatment conditions.
Consequently, the correlation matrix is denoted by Σ_X = Σ_Y = Σ. The normal scores ξ_{X_i} = Φ^{-1}(u_{X_i}) and ψ_{Y_i} = Φ^{-1}(u_{Y_i}) are the quantiles of order u_{X_i} and u_{Y_i}, respectively, from the standard normal distribution N(0, 1), i = 1, 2, ..., N.
To obtain the likelihood, we consider the latent variables defined in (7), with µ_{X_i} and µ_{Y_i}, i = 1, ..., N, the parameters of interest.
In accordance with the parametric model defined above (in Section 2), the likelihood for the parameters Θ = (µ_X, µ_Y, σ², Σ) is defined as follows:

L(Θ; x, y) = ∏_{j=1}^{n_x} c_X(u_{X.j}; Σ) ∏_{i=1}^{N} f(x_{ij}|µ_{X_i}, σ_i², τ_i) × ∏_{k=1}^{n_y} c_Y(u_{Y.k}; Σ) ∏_{i=1}^{N} f(y_{ik}|µ_{Y_i}, σ_i², τ_i), (10)

where n_x and n_y are the numbers of observations per variable under conditions X and Y, respectively. Due to the multiplicity of the problem and the need to estimate too many parameters, we used the uniform correlation structure matrix, in accordance with [36], to build a multivariate Gaussian copula that depends on a single correlation parameter ρ, defined as follows:

Σ = (1 − ρ) I_N + ρ J_N, with ρ ∈ (−1/(N − 1), 1), (11)

where I_N is the N × N identity matrix, J_N is the N × N matrix of ones, and the constraint on ρ guarantees that Σ is positive definite. We use the copula function as a tool to describe the dependency relation between variables. There can be many ways of quantifying this relation. The most common is the Pearson correlation, although this measure is the most limited, as it reflects only linear dependence. As alternatives, the Kendall rank correlation or the Spearman correlation can be used, as they are invariant under monotonic transformations.
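The uniform correlation structure and its positive-definiteness constraint can be checked numerically. The following sketch (an illustration added here, not code from the paper) uses the fact that the eigenvalues of this matrix are 1 + (N − 1)ρ, with multiplicity one, and 1 − ρ, with multiplicity N − 1:

```python
import numpy as np

def uniform_corr(N, rho):
    """Exchangeable (uniform) correlation matrix: 1 on the diagonal, rho elsewhere."""
    return (1 - rho) * np.eye(N) + rho * np.ones((N, N))

# Positive definite only for rho in (-1/(N-1), 1): the eigenvalues are
# 1 + (N-1)*rho (once) and 1 - rho (N-1 times).
N = 50
ok = uniform_corr(N, 0.8)
bad = uniform_corr(N, -0.5)   # -0.5 < -1/49, so this one fails
assert np.all(np.linalg.eigvalsh(ok) > 0)
assert np.linalg.eigvalsh(bad).min() < 0
```

This single-parameter structure is what reduces the copula's parameter count from N(N − 1)/2 correlations to one.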
Thus, the posterior distribution of the parameters satisfies

π(Θ, τ | x, y) ∝ L(Θ; x, y) π(Θ, τ).

To develop the Bayesian approach, we need to specify the prior distributions for the model parameters.

Prior and Posterior Distributions
To present a model that is as simple as possible, complex as it is, in this paper we assume independence for all prior distributions of the Θ parameters. However, it is possible to consider prior distributions that reflect some kind of dependency between parameters, as, for example, in [12,13]. Then, the joint prior distribution for the model parameters (Θ, τ) is

π(Θ, τ) = π(ρ) ∏_{i=1}^{N} π(µ_{X_i}) π(µ_{Y_i}) π(σ_i²) π(τ_i | p_i) π(p_i), (12)

where the prior distributions for the means under both treatment conditions, µ_{X_i} and µ_{Y_i}, follow uniform distributions with ranges (a_{X_i}, b_{X_i}) and (a_{Y_i}, b_{Y_i}), respectively, as indicated by [37]. The prior distribution for the variance σ_i² (i = 1, 2, ..., N) is defined through Jeffreys' prior density function, π(σ_i²) ∝ 1/σ_i². The parameter τ_i is then introduced to the model, assuming that each latent variable follows a Bernoulli distribution, τ_i | p_i ∼ Bernoulli(p_i). We assume the prior distribution p_i ∼ Beta(α_i, β_i), following [12,35], among others. As explained in the previous section, we use an N-dimensional Gaussian copula function as the tool for modeling the dependence between variables through the Pearson correlation. The same dependency structure is used for the two treatment conditions, Σ_X = Σ_Y = Σ, and, likewise, the same correlation coefficient, ρ_X = ρ_Y = ρ, which follows a uniform distribution with range [a, b].
Thus, the joint prior distribution for the model parameters (12) is defined as follows:

π(Θ, τ) ∝ (1/(b − a)) ∏_{i=1}^{N} (1/(b_{X_i} − a_{X_i})) (1/(b_{Y_i} − a_{Y_i})) (1/σ_i²) p_i^{τ_i} (1 − p_i)^{1−τ_i} Beta(p_i; α_i, β_i). (13)

Then, given the data, the joint prior densities (13), and the likelihood (10), the posterior distribution of the parameters (Θ, τ) = (µ_X, µ_Y, σ, p, ρ, τ) is defined as follows:

π(µ_X, µ_Y, σ, p, ρ, τ | x, y) ∝ L(Θ; x, y) π(Θ, τ). (14)

As we can see, this joint posterior distribution is complex and has no known analytical expression. However, Bayesian inference may be performed using the MCMC approach. The MCMC approach can produce a Markov chain {(Θ^(l), τ^(l)) : l = 1, ..., M} that converges to the joint posterior distribution. Consequently, we can estimate the parameters from the generated sample, for example, from the marginal means of these samples. In particular, we used the Metropolis-Hastings-within-Gibbs algorithm. For details regarding this, see [38,39] and their references.

Conditional Posterior Distributions
In this subsection, we derive the conditional posterior distributions of the model parameters to construct the MCMC chain.
For each i = 1, 2, ..., N, given the data and the remaining parameters, we derive the conditional posterior distributions of τ_i (for τ_i = 0 and for τ_i = 1), of p_i, of µ_{X_i} (when τ_i = 0 and when τ_i = 1, respectively), of µ_{Y_i} (when τ_i = 1), and of σ_i² (when τ_i = 0 and when τ_i = 1, respectively). Finally, the conditional posterior distribution of ρ, given the data and the remaining parameters, is derived. Please note that the conditional posterior distributions in Equations (17), (19), (21), (23) and (26) have no analytic form.
For the computational implementation of the algorithm, we used the R language, as it is free statistical software and provides a flexible environment for handling complex models.

MCMC Algorithm (Metropolis-Hastings-within-Gibbs Algorithm)
We make use of the Metropolis-Hastings-within-Gibbs algorithm, based on MCMC sampling strategies, to obtain a sample from the joint posterior distribution (14). The structure of the proposed MCMC method is as follows; the detailed algorithm is described in Appendix A. At each iteration l, each block of parameters is updated by sampling from its conditional posterior distribution given the current values of the remaining parameters; in particular, µ_{X_i}^(l), for i = 1, ..., N, is updated by sampling from (17) and (19).
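The building block of such a sweep is a single Metropolis-Hastings step applied to each parameter whose conditional has no closed form. The following Python sketch is an illustration added here (the Beta(5, 2) target and Uniform(0, 1) candidate-generating density are toy stand-ins, not the paper's actual conditionals):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

def mh_step(current, log_target, propose, log_q):
    """One Metropolis-Hastings step with an independence proposal, as used inside
    a Gibbs sweep for parameters whose conditional posterior has no closed form."""
    cand = propose()
    log_alpha = log_target(cand) - log_target(current) + log_q(current) - log_q(cand)
    return cand if np.log(rng.uniform()) < log_alpha else current

# Toy stand-in for the rho-update: target a Beta(5, 2) "conditional" on (0, 1),
# with a Uniform(0, 1) candidate-generating density (its log-density is 0).
log_target = stats.beta(5, 2).logpdf
rho, draws = 0.5, []
for _ in range(20000):
    rho = mh_step(rho, log_target, rng.uniform, lambda r: 0.0)
    draws.append(rho)
# After burn-in, the mean of the draws approaches E[Beta(5, 2)] = 5/7.
```

In the full sampler, one such step is performed for each parameter block in turn within every Gibbs iteration, with the other blocks held at their current values.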
We can also approximate the posterior probability of each alternative hypothesis by

P(τ_i = 1 | x, y) ≈ (1/M) ∑_{l=1}^{M} τ_i^(l), for i = 1, 2, ..., N,

where M is the number of MCMC draws after burn-in.
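This approximation is simply the fraction of MCMC draws in which each latent indicator equals one. The sketch below illustrates it in Python (the simulated tau matrix is a hypothetical stand-in for real MCMC output, and the 1/2 threshold is one possible decision rule, not the paper's prescription):

```python
import numpy as np

rng = np.random.default_rng(0)
N, M = 5, 4000
# Hypothetical MCMC output: tau[l, i] is the draw of the latent indicator tau_i
# at iteration l (tau_i = 1 means the alternative hypothesis H_1i holds).
true_probs = np.array([0.05, 0.50, 0.90, 0.20, 0.99])
tau = rng.binomial(1, true_probs, size=(M, N))

# Monte Carlo approximation of P(tau_i = 1 | data): the fraction of draws with tau_i = 1.
post_prob_alt = tau.mean(axis=0)

# One possible decision rule: reject H_0i when this posterior probability exceeds 1/2.
reject = post_prob_alt > 0.5
```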
Please note that we can use these posterior probabilities to solve the multiple hypothesis testing problem.

Simulation Study
We performed a simulation with N = 50 simultaneous hypotheses, with n = 20 observations per hypothesis/gene, where n_x = 7 and n_y = 13 observations correspond to Treatment Conditions 1 and 2, respectively, since microarray data typically have a smaller number of samples than the number of genes/hypotheses.
In this context, the datasets for both experimental conditions were generated following multivariate Gaussian distributions N_50(µ_X, Σ) and N_50(µ_Y, Σ), with the same variance-covariance matrix Σ and correlation coefficient ρ = 0.8. The vector of means µ_X for Condition 1 was defined in the range (1150, 1160), and the vector of means µ_Y for Condition 2 was defined in the range (1165, 1180). The vector of standard deviations was defined in the range (8, 16).
We considered simultaneous updates of the vector (Θ, τ) = (µ_X, µ_Y, σ², p, τ, ρ) of model parameters. The proposed algorithm was run for 15,000 draws (i.e., iterations) of the Metropolis-Hastings-within-Gibbs sampler sequence, discarding the first 7500 burn-in iterations of the MCMC outputs.
The simulation compared the performance of our approach when applied to three simulated datasets. The simulated datasets assumed 80%, 50%, and 20% of true null hypotheses, and all sets were generated with N = 50. For the prior distribution p_i ∼ Beta(α_i, β_i), we assumed the same α and β parameters for i = 1, ..., N for the sake of simplicity, i.e., p_i ∼ Beta(α, β) for the three simulated datasets. Furthermore, to analyze the sensitivity with respect to the choice of the beta distribution parameters, we selected the following values for (α, β): (0.5, 1), (1, 1), (1, 0.5), and (2, 0.5). Indeed, with these parameters, very different distributions are obtained. As in [14,40,41], we obtain the FDR from the expected false discovery rate introduced by [42,43]. The FDR was estimated using an MCMC sample obtained from the Metropolis-Hastings-within-Gibbs algorithm. Tables 1-3 present the simulation results. As shown in Tables 1 and 2, for both data structures there was considerable sensitivity to the choice of the parameters α and β of the prior distribution on p_i. These parameters have a considerable influence on the results, since we can observe significant differences in the estimations. The results are better when the prior distribution for p_i is skewed to the right, for instance, when we assume the prior distributions Beta(1, 0.5) and Beta(2, 0.5). To confirm that this also holds for a low percentage of true null hypotheses, we simulated a dataset assuming 20% of true null hypotheses. The simulation results are presented in Table 3. We can observe that these results are similar to those obtained in Tables 1 and 2, i.e., the estimations are closer to reality when the prior distribution is skewed to the right, even with a low percentage of true null hypotheses.
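An FDR estimate in the spirit of the expected false discovery rate of [42,43] can be computed from the posterior probabilities of the alternatives. The following is a minimal sketch (the function name, the 1/2 threshold, and the example probabilities are illustrative assumptions, not the paper's exact estimator):

```python
import numpy as np

def estimated_fdr(post_prob_alt, threshold=0.5):
    """Estimated FDR of the rule 'reject H_0i when P(H_1i | data) > threshold':
    the average posterior probability of the null over the rejected hypotheses."""
    rejected = post_prob_alt > threshold
    if not rejected.any():
        return 0.0
    return float(np.mean(1.0 - post_prob_alt[rejected]))

probs = np.array([0.99, 0.95, 0.80, 0.40, 0.10])
print(estimated_fdr(probs))  # mean of (0.01, 0.05, 0.20) = 0.26/3
```

Averaging the posterior null probabilities over the rejection set estimates the expected proportion of false discoveries among the rejections.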
Therefore, we conclude that with our model, good results will be obtained whenever a prior distribution for p i is skewed to the right, regardless of the number of true null hypotheses.

Modeling Dependence Through N-Dimensional Clayton Copulas with Normal Marginal Densities
The Archimedean family includes a large number of copulas with different peculiarities, characterized by allowing the modeling of multivariate distributions using a single univariate function, thus simplifying the calculations.
As in the previous section, we consider normal marginal densities, but the dependency is modeled using an N-dimensional Clayton copula of the Archimedean family. This copula has already been used to model dependency in gene expression analysis, as can be seen in [8].
In this case, the likelihood is expressed as in (10), but now c_X(u_{X.j}; θ_c) and c_Y(u_{Y.k}; θ_c) are Clayton copula densities, given by:

c(u_1, ..., u_N; θ_c) = [∏_{k=1}^{N−1} (1 + k θ_c)] (∑_{i=1}^{N} u_i^{−θ_c} − N + 1)^{−(N + 1/θ_c)} ∏_{i=1}^{N} u_i^{−(θ_c + 1)}, (32)

where θ_c > 0 is the dependency parameter of the Clayton copula, u_{X.j} = (u_{X_1j}, ..., u_{X_Nj}) and u_{Y.k} = (u_{Y_1k}, ..., u_{Y_Nk}), with u_{X_ij} = F(x_ij) and u_{Y_ik} = F(y_ik), for i = 1, ..., N, j = 1, 2, ..., n_x, and k = 1, 2, ..., n_y. Then, the parameter vector for the model is Θ = (µ_X, µ_Y, σ², p, θ_c, τ), with σ² = (σ_1², ..., σ_N²) and p = (p_1, ..., p_N), p_i being the initial probability of each null hypothesis.
Thus, given the data, the joint prior densities as in (13), with a uniform distribution for θ_c over [a, b], and the likelihood, the posterior distribution takes the same form as in (14), with c_X(u_{X.j}; θ_c) and c_Y(u_{Y.k}; θ_c) the Clayton copulas. As in Section 3.1, this posterior distribution has no analytic form, but Bayesian inference may be performed using MCMC methods. The conditional posterior distributions of the parameters, which are necessary to build the algorithm, are the same as those described in Section 3.2, replacing the Gaussian copulas with the Clayton copulas. Therefore, the same Metropolis-Hastings-within-Gibbs algorithm is used, substituting the Gaussian copulas with the Clayton copulas defined in (32).
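The Clayton substitution can be checked numerically. The sketch below (added for illustration, not code from the paper) implements the N-dimensional Clayton CDF and density for θ_c > 0 and verifies, in two dimensions, that the density is the mixed partial derivative of the CDF:

```python
import numpy as np

def clayton_cdf(u, theta):
    """CDF of the N-dimensional Clayton copula, theta > 0."""
    u = np.asarray(u, dtype=float)
    return (np.sum(u ** -theta) - len(u) + 1) ** (-1 / theta)

def clayton_density(u, theta):
    """Density of the N-dimensional Clayton copula, theta > 0."""
    u = np.asarray(u, dtype=float)
    N = len(u)
    const = np.prod(1 + theta * np.arange(1, N))    # prod_{k=1}^{N-1} (1 + k*theta)
    return (const * np.prod(u ** -(theta + 1))
            * (np.sum(u ** -theta) - N + 1) ** -(N + 1 / theta))

# Sanity check in two dimensions: the density equals the mixed partial derivative
# of the CDF, approximated here by central finite differences.
theta, u, h = 2.0, np.array([0.4, 0.7]), 1e-5
num = (clayton_cdf(u + [h, h], theta) - clayton_cdf(u + [h, -h], theta)
       - clayton_cdf(u + [-h, h], theta) + clayton_cdf(u + [-h, -h], theta)) / (4 * h * h)
assert np.isclose(num, clayton_density(u, theta), rtol=1e-4)
```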

Simulation Study
In this section, the same data sets from Section 3.4 were used. When working with non-elliptical distributions, Kendall's τ coefficient should be used instead of the Pearson correlation coefficient [36]. Under normality, both coefficients are related as follows:

τ = (2/π) arcsin(ρ), (33)

while Kendall's coefficient and the Clayton copula parameter, θ_c, are also related:

τ = θ_c / (θ_c + 2). (34)

Therefore, we select the uniform prior distribution over the interval (1.3, 4) for θ_c as the candidate-generating density for the MCMC algorithm, since, from (33) and (34), this interval corresponds to the interval (0.58, 0.87) for the Pearson coefficient.
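The correspondence between the θ_c interval and the Pearson interval can be reproduced directly from (33) and (34); the following small check is added for illustration (the helper names are ours):

```python
import numpy as np

def kendall_from_pearson(rho):
    """Under normality: tau = (2/pi) * arcsin(rho)."""
    return (2 / np.pi) * np.arcsin(rho)

def kendall_from_clayton(theta):
    """Kendall's tau for a Clayton copula: tau = theta / (theta + 2)."""
    return theta / (theta + 2)

def pearson_from_clayton(theta):
    """Map theta_c -> tau -> rho by inverting tau = (2/pi) * arcsin(rho)."""
    return np.sin(np.pi * kendall_from_clayton(theta) / 2)

# The prior interval (1.3, 4) for theta_c corresponds to about (0.58, 0.87) for rho:
print(round(float(pearson_from_clayton(1.3)), 2),
      round(float(pearson_from_clayton(4.0)), 2))  # 0.58 0.87
```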
The Metropolis-Hastings-within-Gibbs algorithm, described in Section 3.3, was used, replacing the Gaussian copulas with the Clayton copulas. The algorithm was run for 15,000 iterations, discarding the first 7500 burn-in iterations of the MCMC outputs. The FDR was also obtained as in Section 3.4.
As shown in Tables 4-6, there was high sensitivity regarding the choice of the parameters α and β for the prior distribution on p_i, as happened for the model with Gaussian copulas. Likewise, the procedure is more accurate when the prior distribution of p_i is skewed to the right, for instance, when the prior distributions Beta(1, 0.5) and Beta(2, 0.5) are assumed. However, the FDR obtained is higher than in the Gaussian copula model. Table 6. Results for the model with Clayton copulas and for data corresponding to 20% of true null hypotheses for distinct values of the prior distributions of p_i.

Model Selection
To compare the normal marginal distribution models using Gaussian and Clayton copulas, we use the Deviance Information Criterion (DIC). A model with a smaller DIC should be preferred to models with a larger DIC [44]. The DIC value is given by:

DIC = 2 E_{Θ|x,y}[D(Θ)] − D(Θ̄), with deviance D(Θ) = −2 log L(Θ; x, y), (35)

where Θ̄ denotes the posterior mean of Θ. Given an MCMC sample of size M from the posterior distribution, the DIC value (35) can be approximated by

DIC ≈ (2/M) ∑_{l=1}^{M} D(Θ^(l)) − D(Θ̂), (36)

where Θ̂ is the estimate of Θ obtained from the MCMC sample. The DIC value was obtained for the model with Gaussian copulas and the model with Clayton copulas, in both cases using prior distributions for p_i skewed to the right, Beta(1, 0.5) and Beta(2, 0.5). The results are shown in Table 7 for data sets with 80%, 50%, and 20% true null hypotheses, respectively. Table 7. DIC values for the different percentages of true null hypotheses and for the prior distributions of p_i skewed to the right.

As can be seen in Table 7, the smallest DIC values obtained correspond to the model with Gaussian copulas, as expected from the way the data had been generated. On the other hand, there were no important differences between the DIC values corresponding to the parameters (1, 0.5) and (2, 0.5) of the beta prior distribution. Furthermore, the FDR values are lower using Gaussian copulas than Clayton copulas. Thus, the model with Gaussian copulas is more suitable for our simulated data.
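The MCMC-based DIC computation can be sketched as follows in Python (a toy illustration: the simple normal likelihood and simulated "posterior draws" stand in for the paper's model and sampler output):

```python
import numpy as np
from scipy import stats

def dic_from_draws(loglik, draws, point_estimate):
    """DIC ~= (2/M) * sum_l D(Theta^(l)) - D(Theta_hat), with deviance
    D(Theta) = -2 * log-likelihood, approximated from an MCMC sample."""
    deviances = np.array([-2.0 * loglik(th) for th in draws])
    return 2.0 * deviances.mean() + 2.0 * loglik(point_estimate)

# Toy illustration: hypothetical data and a normal likelihood in the mean;
# the "posterior draws" below stand in for real MCMC output.
rng = np.random.default_rng(0)
data = rng.normal(1.0, 1.0, size=100)
loglik = lambda mu: stats.norm(mu, 1.0).logpdf(data).sum()
posterior_draws = rng.normal(data.mean(), 0.1, size=1000)
dic = dic_from_draws(loglik, posterior_draws, posterior_draws.mean())
print(round(dic, 1))
```

Smaller values indicate a better trade-off between fit and effective model complexity, which is how the copula models are ranked above.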

Application to a Real Data Set: DNA Microarrays
The procedure described in the previous sections is applied to a DNA microarray data set. This data set consists of 38 genes obtained from duodenal biopsy tissues from 13 children with celiac disease, of mean age 5.6 (±0.6) years, and 7 control children, of mean age 8.1 (±2.2) years, belonging to part of the study carried out in [45]. The data are available in the NCBI-GEO datasets (The National Center for Biotechnology Information-Gene Expression Omnibus) via the GSE76168 accession number.
The aim is to identify differentially expressed genes. Thus, we simultaneously tested whether there are significant differences between celiac patients and controls in the mean expression level of the 38 genes, i.e., we consider the multiple hypothesis tests given in (9), and the main objective is to obtain the posterior probability of each null hypothesis.
The Bayesian procedure described by [12,14] was applied by [45], considering independence between the gene expression levels. However, there is correlation between the expression levels of these genes. To model the data, we consider normal marginal densities, as in [45], and the dependency is modeled first through Gaussian copulas and then through Clayton copulas.

Modeling Dependency through N-Dimensional Gaussian Copulas
To apply the Metropolis-Hastings-within-Gibbs algorithm described in Section 3.3, we need to consider a candidate-generating density for the dependency parameter ρ. Since there are positive and negative correlations, the most suitable choice would be the uniform prior distribution on the range (−1, 1). However, we consider as the candidate-generating distribution for ρ the uniform prior distribution on the range (−0.027, 1), due to the constraint (11) (for N = 38, ρ > −1/(N − 1) ≈ −0.027).
For the prior distribution on p i , we consider the Beta(2, 0.5) distribution, because, according to the results obtained in Section 3.4, this is the prior distribution that produces the most accurate results when using Gaussian copulas.
The algorithm was run for 40,000 iterations, discarding the first 20,000 burn-in iterations. If we compare our results, which use a data dependency structure, with those of [45], which assume independence, different results are obtained: we identified 16 differentially expressed genes while they found 15, and, of the 16 genes we identified, 4 do not coincide with those identified by their procedure.

Modeling Dependence Through N-Dimensional Clayton Copulas
In this subsection, the same Metropolis-Hastings-within-Gibbs algorithm, replacing the Gaussian copulas with the Clayton copulas, is applied to the data of [45].
For θ_c, the uniform distribution over (0.01, 5) was considered as the candidate-generating density. For the parameters p_i, we chose a Beta(1, 0.5) prior distribution because, according to the results obtained in Section 4.1, this is the prior distribution that produces the most accurate results when using Clayton copulas. Likewise, the algorithm was run for 40,000 iterations, discarding the first 20,000 burn-in iterations.
In this case, we found 22 differentially expressed genes, 6 more than when using Gaussian copulas. Compared with the results of [45], we identified the 15 genes that they had already found plus 7 additional ones.
Finally, DIC = 1487.213 was obtained for the model with Gaussian copulas and DIC = 1504.445 for the model with Clayton copulas. The model with Gaussian copulas and normal marginal densities is therefore the most suitable for the celiac disease data of [45], since it yields the smallest DIC value.
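For context, a DIC value of this kind can be computed from MCMC output along the following lines. This is a schematic sketch of the standard formula DIC = D(θ̄) + 2p_D, with hypothetical function and argument names.

```python
import numpy as np

def dic(log_lik_draws, log_lik_at_mean):
    """Deviance Information Criterion from MCMC output (sketch).
    log_lik_draws: log-likelihood evaluated at each posterior draw;
    log_lik_at_mean: log-likelihood at the posterior mean of the parameters."""
    d_bar = -2.0 * np.mean(log_lik_draws)  # posterior mean deviance
    d_hat = -2.0 * log_lik_at_mean         # deviance at posterior mean
    p_d = d_bar - d_hat                    # effective number of parameters
    return d_hat + 2.0 * p_d               # equivalently 2*d_bar - d_hat
```

Smaller DIC values indicate a better trade-off between fit and effective model complexity, which is the basis for preferring the Gaussian copula model here.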

Conclusions
The proposed approach is very useful when many hypotheses are tested simultaneously under the assumption of dependence. In the proposed procedure for testing multiple hypotheses, the full data are used directly, rather than test statistics, especially for modeling the dependency structure. The modeling process is therefore more complex than when using test statistics, which creates computational problems when thousands of hypotheses are tested simultaneously with a large sample size. In any case, all available information, both objective and subjective, can be used.
In the field of genomics, the normal distribution is widely used to model gene expression data. We therefore adopted normal marginal distributions and modeled the dependency structure through the Gaussian copula, which shares the properties of a multivariate normal distribution. We opted for a uniform correlation matrix to reduce the dimensionality of the parameters. To model the dependency structure, we also considered the Clayton copula of the Archimedean family, which enables modeling of multivariate distributions using a single univariate function, thus simplifying calculations. The proposed approach is flexible in that it can be used with other, more realistic correlation matrices, with other copula functions to model the dependence, and with other marginal distributions.
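As a sketch of the Gaussian copula with the uniform (equicorrelation) matrix just described, the log-density can be evaluated without explicit matrix inversion, using the closed forms for the determinant and inverse of R = (1 − ρ)I + ρJ. The function below is illustrative, with names of our choosing, and assumes −1/(n − 1) < ρ < 1 so that R is positive definite (consistent with the lower bound on ρ used earlier).

```python
import numpy as np
from statistics import NormalDist

def gaussian_copula_logdens_equicorr(u, rho):
    """Log-density of the n-dimensional Gaussian copula with the
    equicorrelation matrix R = (1 - rho)*I + rho*J (illustrative sketch)."""
    z = np.array([NormalDist().inv_cdf(ui) for ui in u])  # normal scores
    n = z.size
    # det(R) = (1 - rho)^(n-1) * (1 + (n-1)*rho), in log form.
    logdet = (n - 1) * np.log(1.0 - rho) + np.log(1.0 + (n - 1) * rho)
    # R^{-1} = (I - rho/(1 + (n-1)*rho) * J) / (1 - rho), so the quadratic
    # form z' R^{-1} z has a simple closed expression.
    a = 1.0 / (1.0 - rho)
    b = rho / ((1.0 - rho) * (1.0 + (n - 1) * rho))
    quad = a * (z @ z) - b * z.sum() ** 2
    # log c(u) = -0.5*log|R| - 0.5*(z' R^{-1} z - z' z)
    return -0.5 * logdet - 0.5 * (quad - z @ z)
```

For ρ = 0 the copula reduces to the independence copula (log-density 0), and at u = (0.5, 0.5) the normal scores vanish, leaving only the determinant term.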
For the model with Gaussian copulas, which had the lower DIC value and was therefore the most appropriate for our simulated data, the results demonstrated that the procedure fits the dependence well: the estimated correlation coefficient was close to the true value with which the data were generated. However, the procedure is not robust with respect to the choice of prior distribution for the initial probability of each null hypothesis. Nevertheless, in all simulated examples, the procedure rejected almost all false null hypotheses when we used a beta prior distribution skewed to the right. Our proposal therefore turned out to be a very powerful procedure for testing multiple hypotheses.
In the cases analyzed using the Gaussian copula model, the highest FDR value obtained was 0.079 when using a right-skewed beta prior distribution. However, this need not be a drawback in the context of DNA microarray experiments because, as explained above, the main objective in many of these studies is to obtain the greatest possible number of potentially expressed genes, on which more detailed studies can subsequently be carried out. At this phase of the analysis, we can therefore tolerate more false positives in order to obtain the greatest possible number of interesting genes.

Conflicts of Interest:
The authors declare no conflicts of interest of any kind, financial or otherwise.

Appendix A. MCMC Algorithm
In this appendix, we describe the proposed MCMC algorithm in detail. We constructed a Metropolis-Hastings-within-Gibbs sampling scheme that combines the Gibbs and Metropolis-Hastings sampling strategies. The scheme alternates between draws from the Gibbs algorithm, for conditional posterior distributions with a standard form, and single iterations of the Metropolis-Hastings algorithm, for non-standard conditional posterior distributions.
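The alternating structure just described can be sketched as follows. This is a generic skeleton, not the authors' code: full conditionals with a standard form are sampled exactly, while non-standard ones are advanced by a single Metropolis-Hastings iteration per sweep.

```python
def mh_within_gibbs(n_iter, init, gibbs_draws, mh_steps):
    """Skeleton of a Metropolis-Hastings-within-Gibbs sampler (sketch).
    init:        dict of initial parameter values;
    gibbs_draws: dict name -> function(state) drawing exactly from the
                 full conditional (standard-form conditionals);
    mh_steps:    dict name -> function(state) performing one MH update
                 (non-standard conditionals)."""
    state = dict(init)
    chain = []
    for _ in range(n_iter):
        for name, draw in gibbs_draws.items():
            state[name] = draw(state)   # exact conditional draw
        for name, step in mh_steps.items():
            state[name] = step(state)   # single Metropolis-Hastings iteration
        chain.append(dict(state))       # record the current state
    return chain
```

In the setting of this paper, the updates for the copula dependency parameter (ρ or θ_c) would sit in `mh_steps`, driven by the candidate-generating densities described in the main text.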