Article

A Copula-Based Framework for Multivariate Count Time Series with Mixed Marginal Distributions

1 Department of Statistics, Grand Valley State University, Allendale, MI 49401, USA
2 Fowler School of Engineering, Chapman University, Orange, CA 92866, USA
* Author to whom correspondence should be addressed.
Stats 2026, 9(3), 57; https://doi.org/10.3390/stats9030057
Submission received: 25 March 2026 / Revised: 27 May 2026 / Accepted: 29 May 2026 / Published: 2 June 2026

Abstract

We developed a class of multivariate integer-valued time series models using copula theory. Each count time series is modeled as a Markov chain, with serial dependence characterized through copula-based transition probabilities for Poisson and negative binomial marginals. Cross-sectional dependence is modeled via a trivariate Gaussian or a “t-copula”, allowing for both positive and negative correlations and providing a flexible dependence structure. Model parameters are estimated using likelihood-based inference, where the trivariate Gaussian or t-copula integrals are evaluated through standard randomized Monte Carlo methods. Simulation results, along with an analysis of annual counts of major hurricanes (Category 3+) across the North Atlantic, Eastern North Pacific, and Western North Pacific basins, demonstrate the effectiveness of the proposed model.

1. Introduction

Multivariate count time series are increasingly prevalent in modern statistical analysis and rarely exhibit independence between series; see Davis et al. [1] for a comprehensive review of univariate count time series. The classical marginal distribution for count data is the Poisson distribution; however, any multivariate Poisson formulation must account for both serial dependence within each time series and cross-correlation across the series. The definitions of multivariate Poisson distributions are not universally agreed on (see Teicher [2] and Inouye et al. [3]). In a time series context, properly modeling temporal dependence in the univariate components together with cross-sectional dependence between series is essential for accurate inference. These challenges are further compounded by the presence of nonlinear dependence structures and potential tail dependence.
Building on the development of univariate integer-valued autoregressive moving-average (INARMA) models, Quoreshi [4] introduced bivariate integer-valued moving-average (BINMA) models that accommodate both positive and negative cross-correlations and further extended the framework to the multivariate setting. Wang et al. [5] employed a bivariate zero-inflated Poisson model to analyze occupational injury data. Subsequently, Heinen and Rengifo [6] introduced a multivariate autoregressive conditional double-Poisson model capable of accommodating overdispersion, serial dependence, and nonzero cross-correlation. In their framework, cross-sectional dependence between component series is modeled using a Gaussian copula, and parameter estimation is carried out using a two-stage estimation procedure. Karlis and Pedeli [7] developed a first-order bivariate integer-valued autoregressive process (BINAR(1)) in which cross-correlation is modeled via a copula, allowing for both positive and negative dependence. In their framework, Frank and Gaussian copulas are employed to capture dependence, while the marginal dynamics are modeled using Poisson and negative binomial INAR(1) processes. Furthermore, Sefidi et al. [8] proposed a model for longitudinal count data with zero-inflated marginal distributions, where cross-sectional dependence is captured using a D-vine copula. Zhao et al. [9] used a copula to capture both temporal and cross-sectional dependence in multivariate time series, generalizing a copula based first-order Markov model to a semi-parametric univariate D-vine model; multivariate series were modeled by combining the univariate D-vines using copula theory. Yu et al. [10] employed vine copula models to identify risk factors associated with water pollution in the IRSN and compared the performance of C-vines, D-vines, and R-vines in risk factor identification. 
Deng and Chaganty [11] proposed a C-vine copula framework for modeling the joint distribution of family data and illustrated the modeling process and likelihood construction for responses of various types, including continuous, discrete, binary, and ordinal outcomes. Bradshaw and Blei [12] developed a generative model for underreported campus sexual assault data that enables estimation of both true incidence and reporting rates. They employed Hamiltonian Monte Carlo (HMC) sampling for posterior inference on school-level reporting rates and assault incidence, and applied the proposed approach to campus sexual assault data.
Cui and Zhu [13] proposed a bivariate Poisson INGARCH model that accommodates both positive and negative cross-correlations between time series. Ahmad et al. [14] introduced a bivariate count regression framework to capture dependence between multiple crash outcomes that traditional independent count models fail to represent. Their approach specifies suitable marginal distributions (e.g., Poisson or negative binomial) for each outcome and employs a copula function to model the joint dependence structure.
Jeng et al. [15] proposed a copula-based time series model for forecasting COVID-19 cases using wastewater SARS-CoV-2 viral load and clinical variables. The model was developed in two stages: first, time series techniques were used to characterize the marginal distributions of the response and covariates; second, a copula-based marginal regression framework was applied to model dependence and predict COVID-19 case trends. Fokianos et al. [16] developed a general framework for multivariate count autoregressive models, focusing on dependence structures and theoretical properties for multivariate count time series. More recently, Debaly and Truquet [17] investigated multivariate time series models for mixed data, including binary, count and continuous variables, providing additional theoretical developments for dependent multivariate processes. Motivated by these developments, there has been increasing interest in flexible dependence modeling for multivariate count time series data.
In previous works, Alqawba et al. [18] and Fernando and Jayanetti [19] developed a class of copula-based models to analyze bivariate count data under different marginal distributions. This approach was extended to model trivariate count time series data by incorporating a Markovian structure to capture within-series dependence and a copula family to model the cross-sectional dependence between each pair of time series. For example, annual severe hurricane counts observed across different ocean basins may exhibit substantial temporal dependence within each basin as well as cross-dependence between basins due to shared environmental and atmospheric patterns. In such settings, the count series may follow different marginal distributions and may exhibit both positive and negative correlations. Therefore, flexible multivariate count time series models are needed to appropriately capture these complex dependence structures. The copula-based approach allows each marginal distribution to be modeled appropriately while flexibly capturing the dependence between the series through the selected copula family. This flexibility enables the model to accommodate both positive and negative dependence, providing a more general and adaptable framework for analyzing multivariate count time series data. In addition, the proposed framework allows for the use of mixed marginal distributions across the individual time series.
The remainder of this paper is organized as follows. Section 2 introduces the Poisson and negative binomial distributions within a copula framework and presents the proposed copula-based trivariate model. Section 3 contains simulation studies and a real-life application, and Section 4 concludes the paper.

2. Materials and Methods

2.1. The Poisson Distribution

The Poisson distribution is one of the most widely used distributions for modeling count data; see, for example, Johnson et al. [20]. In this work, we employ the Poisson distribution as one of the marginal distributions within our proposed copula-based model. Suppose $y_t$ denotes the observed count at time $t$. The probability mass function (pmf) of the Poisson distribution is defined as:
$$f(y_t) = \frac{e^{-\lambda} \lambda^{y_t}}{y_t!},$$
where $\lambda > 0$ is the intensity parameter with $E(y_t) = \lambda$ and $V(y_t) = \lambda$. This property, known as equidispersion, implies that the variance equals the mean.
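As a quick numerical check of equidispersion, the following minimal sketch evaluates the Poisson pmf on a truncated support and compares the resulting mean and variance. The function names `poisson_pmf` and `poisson_cdf`, the value $\lambda = 3$, and the truncation point are illustrative assumptions, not part of the original analysis.

```python
import math

def poisson_pmf(y: int, lam: float) -> float:
    """Poisson pmf exp(-lam) * lam**y / y! for y = 0, 1, 2, ...; zero otherwise."""
    if y < 0:
        return 0.0
    # Evaluate on the log scale so lam**y and y! cannot overflow for large y.
    return math.exp(-lam + y * math.log(lam) - math.lgamma(y + 1))

def poisson_cdf(y: int, lam: float) -> float:
    """F(y) = P(Y <= y); equals zero for y < 0 (empty sum)."""
    return math.fsum(poisson_pmf(k, lam) for k in range(y + 1))

lam = 3.0
support = range(200)  # truncation chosen so the omitted tail mass is negligible
mean = math.fsum(k * poisson_pmf(k, lam) for k in support)
var = math.fsum((k - mean) ** 2 * poisson_pmf(k, lam) for k in support)
print(f"mean = {mean:.6f}, variance = {var:.6f}")
```

Both moments should agree with $\lambda$ up to truncation and rounding error.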

2.2. The Negative Binomial Distribution

The Poisson distribution can be restrictive when the observed counts exhibit overdispersion, where the variance exceeds the mean. To accommodate overdispersed count data, the negative binomial distribution is a natural choice; see, for example, Hilbe [21]. By introducing an additional parameter ($\kappa$), the negative binomial distribution provides greater flexibility than the Poisson distribution by explicitly accounting for overdispersion in count data. The pmf of the negative binomial distribution is defined as:
$$f(y_t) = \frac{\Gamma(\kappa + y_t)}{\Gamma(\kappa)\, y_t!} \left( \frac{\kappa}{\kappa + \lambda} \right)^{\kappa} \left( \frac{\lambda}{\kappa + \lambda} \right)^{y_t} \quad \text{for } y_t = 0, 1, 2, \ldots,$$
where $\lambda$ and $\kappa$ are the intensity and dispersion parameters, with $E(y_t) = \lambda$ and $V(y_t) = \lambda + \lambda^2 / \kappa$.
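The mean and variance stated above can be verified numerically. The sketch below is only an illustration under assumed values ($\lambda = 5$, $\kappa = 2$, so the variance should be $5 + 25/2 = 17.5$); the function name `nb_pmf` and the truncation point are hypothetical choices.

```python
import math

def nb_pmf(y: int, lam: float, kappa: float) -> float:
    """Negative binomial pmf in the (lam, kappa) mean-dispersion parameterization."""
    if y < 0:
        return 0.0
    # Log-scale evaluation of Gamma(kappa + y) / (Gamma(kappa) y!) times the two powers.
    log_p = (
        math.lgamma(kappa + y) - math.lgamma(kappa) - math.lgamma(y + 1)
        + kappa * math.log(kappa / (kappa + lam))
        + y * math.log(lam / (kappa + lam))
    )
    return math.exp(log_p)

lam, kappa = 5.0, 2.0
support = range(2000)  # the geometric tail beyond this point is negligible
total = math.fsum(nb_pmf(k, lam, kappa) for k in support)
mean = math.fsum(k * nb_pmf(k, lam, kappa) for k in support)
var = math.fsum((k - mean) ** 2 * nb_pmf(k, lam, kappa) for k in support)
print(f"total = {total:.6f}, mean = {mean:.6f}, variance = {var:.6f}")
```

As $\kappa$ grows, the extra term $\lambda^2 / \kappa$ vanishes and the Poisson case is recovered.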

2.3. Copulas

To capture dependence across multiple count series while maintaining the marginal distributions, we employ a copula-based approach. A copula is a multivariate cumulative distribution function (CDF) with uniform $U(0, 1)$ margins that captures the dependence structure between variables. Following Nelson [22], an $n$-dimensional copula is a function $C : [0, 1]^n \to [0, 1]$ with the following properties:
  • $C(1, \ldots, 1, u_t, 1, \ldots, 1) = u_t$, for $t = 1, 2, \ldots, n$ and $u_t \in [0, 1]$.
  • $C(u_1, u_2, \ldots, u_n) = 0$ if at least one $u_t = 0$ for $t = 1, 2, \ldots, n$.
  • For any $u_{t1}, u_{t2} \in [0, 1]$ with $u_{t1} \le u_{t2}$, for $t = 1, 2, \ldots, n$,
    $$\sum_{j_1 = 1}^{2} \sum_{j_2 = 1}^{2} \cdots \sum_{j_n = 1}^{2} (-1)^{j_1 + j_2 + \cdots + j_n}\, C(u_{1 j_1}, u_{2 j_2}, \ldots, u_{n j_n}) \ge 0.$$
Let $Y_1, \ldots, Y_n$ be random variables with marginal CDFs $F_1, \ldots, F_n$ and joint CDF $F$. Then
  • There exists an $n$-dimensional copula $C$ such that for all $y_1, \ldots, y_n \in \mathbb{R}$
    $$F(y_1, y_2, \ldots, y_n) = C(F_1(y_1), F_2(y_2), \ldots, F_n(y_n)).$$
  • If $Y_1, \ldots, Y_n$ are continuous, then the copula $C$ is unique. Otherwise, $C$ is uniquely determined only on the $n$-dimensional rectangle $\mathrm{Range}(F_1) \times \mathrm{Range}(F_2) \times \cdots \times \mathrm{Range}(F_n)$.
When all the marginals are integer-valued, the multivariate probability mass function can be obtained as
$$f(y_1, y_2, \ldots, y_n) = P(Y_1 = y_1, Y_2 = y_2, \ldots, Y_n = y_n) = \sum_{j_1 = 1}^{2} \sum_{j_2 = 1}^{2} \cdots \sum_{j_n = 1}^{2} (-1)^{j_1 + j_2 + \cdots + j_n}\, C(u_{1 j_1}, u_{2 j_2}, \ldots, u_{n j_n}),$$
where $u_{t1} = F_t(y_t^-)$ and $u_{t2} = F_t(y_t)$. Here $F_t(y_t^-)$ is the left-hand limit of $F_t$ at $y_t$, which is equal to $F_t(y_t - 1)$. In the trivariate case,
$$\begin{aligned}
\Pr(Y_1 = y_1, Y_2 = y_2, Y_3 = y_3) ={}& C(F_1(y_1), F_2(y_2), F_3(y_3); \theta) \\
&- C(F_1(y_1 - 1), F_2(y_2), F_3(y_3); \theta) - C(F_1(y_1), F_2(y_2 - 1), F_3(y_3); \theta) - C(F_1(y_1), F_2(y_2), F_3(y_3 - 1); \theta) \\
&+ C(F_1(y_1 - 1), F_2(y_2 - 1), F_3(y_3); \theta) + C(F_1(y_1 - 1), F_2(y_2), F_3(y_3 - 1); \theta) + C(F_1(y_1), F_2(y_2 - 1), F_3(y_3 - 1); \theta) \\
&- C(F_1(y_1 - 1), F_2(y_2 - 1), F_3(y_3 - 1); \theta).
\end{aligned}$$
Here, θ denotes the dependence parameter of the copula function, and a variety of copula families, denoted by C, are available for selection. Table 1 lists several commonly used copula families, with further details provided in Joe [23]. Bivariate copulas such as the Gaussian, Frank, and t copulas are capable of modeling both positive and negative dependence, while the Gumbel, Clayton, and Plackett copulas are restricted to capturing positive dependence only. In this study, we primarily focus on the Gaussian copula function to model cross-correlation, as it can accommodate both positive and negative dependence. Additionally, the Gaussian copula was selected due to its ability to model distinct pairwise dependencies through a parsimonious correlation structure while maintaining computational tractability. Nevertheless, alternative copula families may be employed depending on the context and underlying characteristics of the data.
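The signed eight-corner sum for the trivariate pmf can be sketched in a few lines. To keep the example free of numerical integration, a trivariate Clayton copula (closed form, positive dependence only) is swapped in for the Gaussian copula used in this study; the Poisson means, the value of the dependence parameter, and all function names are illustrative assumptions.

```python
import itertools
import math
from functools import lru_cache, partial

@lru_cache(maxsize=None)
def poisson_cdf(y: int, lam: float) -> float:
    """Poisson cdf P(Y <= y); zero for y < 0 (empty sum)."""
    return math.fsum(
        math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1)) for k in range(y + 1)
    )

def clayton3(u, theta: float) -> float:
    """Trivariate Clayton copula (illustrative stand-in; requires theta > 0)."""
    if min(u) <= 0.0:
        return 0.0  # a copula is zero whenever one argument is zero
    return (sum(x ** (-theta) for x in u) - 2.0) ** (-1.0 / theta)

def trivariate_pmf(y, cdfs, copula) -> float:
    """P(Y1=y1, Y2=y2, Y3=y3) as a signed sum of the copula over 8 corners."""
    total = 0.0
    for shifts in itertools.product((0, 1), repeat=3):
        # A shift of 1 in a coordinate means the left limit F(y - 1) is used there.
        u = [cdf(yi - s) for cdf, yi, s in zip(cdfs, y, shifts)]
        # Sign is +1 for an even number of shifted coordinates, -1 for an odd number.
        total += (-1) ** sum(shifts) * copula(u)
    return total

cdfs = [partial(poisson_cdf, lam=lam) for lam in (1.0, 1.5, 2.0)]
copula = partial(clayton3, theta=0.8)
print(trivariate_pmf((1, 2, 3), cdfs, copula))
```

Because the sum telescopes, adding the pmf over a large grid should return a value close to one, which is a convenient sanity check for any copula plugged into `trivariate_pmf`.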

2.4. Copula-Based Trivariate Model

Suppose that we observe a series of 3-dimensional vectors, $\{\mathbf{Y}_t\}_{t=1}^{n}$, where $\mathbf{Y}_t = (Y_{1t}, Y_{2t}, Y_{3t})^{\top}$ for $t = 1, 2, \ldots, n$. Assume that each series $\{Y_{1t}\}_{t=1}^{n}$, $\{Y_{2t}\}_{t=1}^{n}$ and $\{Y_{3t}\}_{t=1}^{n}$ follows a first-order Markov process based on a copula (see Alqawba and Diawara [24] for an example). Then, the mean vector $\mu_t$ and the covariance matrix, say $\Gamma(t, t-1)$, are defined as follows:
$$\mu_t = E(\mathbf{Y}_t) = \begin{pmatrix} E(Y_{1t}) \\ E(Y_{2t}) \\ E(Y_{3t}) \end{pmatrix},$$
and
$$\Gamma(t, t-1) = \mathrm{COV}(\mathbf{Y}_t, \mathbf{Y}_{t-1}) = \begin{pmatrix} \mathrm{COV}(Y_{1t}, Y_{1,t-1}) & \mathrm{COV}(Y_{1t}, Y_{2,t-1}) & \mathrm{COV}(Y_{1t}, Y_{3,t-1}) \\ \mathrm{COV}(Y_{2t}, Y_{1,t-1}) & \mathrm{COV}(Y_{2t}, Y_{2,t-1}) & \mathrm{COV}(Y_{2t}, Y_{3,t-1}) \\ \mathrm{COV}(Y_{3t}, Y_{1,t-1}) & \mathrm{COV}(Y_{3t}, Y_{2,t-1}) & \mathrm{COV}(Y_{3t}, Y_{3,t-1}) \end{pmatrix}.$$
The first-order Markov assumption is used to model the temporal dependence in each count time series while keeping the model formulation relatively simple. Since the data are counts, dependence at higher-order lags is often difficult to identify clearly in practice. The diagonal elements of the covariance matrix correspond to the autocovariance within each time series, while the off-diagonal elements capture the cross-covariance between the corresponding pairs of series. Given the presence of both serial dependence and cross-correlation across the components, the joint probability distribution of $Y_{1t}$, $Y_{2t}$ and $Y_{3t}$ conditional on $Y_{1,t-1}$, $Y_{2,t-1}$ and $Y_{3,t-1}$, for $t = 1, \ldots, n$, is expressed as:
$$f(y_{1t}, y_{2t}, y_{3t} \mid y_{1,t-1}, y_{2,t-1}, y_{3,t-1}) = \int_{V^{-1}(F_{1,t}^{-})}^{V^{-1}(F_{1,t}^{+})} \int_{V^{-1}(F_{2,t}^{-})}^{V^{-1}(F_{2,t}^{+})} \int_{V^{-1}(F_{3,t}^{-})}^{V^{-1}(F_{3,t}^{+})} V_3(z_1, z_2, z_3, R)\, dz_3\, dz_2\, dz_1, \tag{2}$$
where $V^{-1}$ denotes the inverse cdf of the standard normal or the t-distribution, and $V_3(\cdot, R)$ is the pdf of the trivariate normal or t-distribution. The matrix $R$ is the correlation matrix of the joint distribution, which captures the cross-sectional dependence, and is defined as:
$$R = \begin{pmatrix} 1 & \rho_{12} & \rho_{13} \\ \rho_{12} & 1 & \rho_{23} \\ \rho_{13} & \rho_{23} & 1 \end{pmatrix},$$
where $\rho_{12}$, $\rho_{13}$, and $\rho_{23}$ are the dependence parameters of either the Gaussian or the t-copula function that describe the cross-sectional dependence between series 1 and 2, series 1 and 3, and series 2 and 3, respectively.
Also, $F_{i,t}^{+} = F(y_{it} \mid y_{i,t-1})$ and $F_{i,t}^{-} = F(y_{it} - 1 \mid y_{i,t-1})$, for $i = 1, 2, 3$, where:
$$F(y_{it} \mid y_{i,t-1}) = \frac{F_{12}(y_{it}, y_{i,t-1}) - F_{12}(y_{it}, y_{i,t-1} - 1)}{f_{t-1}(y_{i,t-1}; \theta)},$$
is the conditional cdf of $Y_{it}$ given $Y_{i,t-1}$, for $i = 1, 2, 3$, and
$$F_{12}(y_{it}, y_{i,t-1}) = C(F_t(y_{it}), F_{t-1}(y_{i,t-1}); \delta).$$
Here, $C(\cdot\,; \delta)$ denotes a bivariate copula function with dependence parameter $\delta$, which characterizes the serial dependence within a single time series. The vector $\theta$ represents the marginal parameters and reduces to a scalar in the Poisson case, i.e., $\theta = \lambda$. The proposed model is applicable to the analysis of trivariate count time series data with marginals following any discrete distribution, thereby offering flexibility beyond traditional parametric assumptions.
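The conditional cdf above can be illustrated with a short sketch for a single stationary series. A Frank copula is assumed here for the serial dependence because it has a closed form and admits both signs of dependence; the Poisson mean, the value of $\delta$, and all function names are hypothetical choices made for illustration only.

```python
import math

def poisson_pmf(y: int, lam: float) -> float:
    """Poisson pmf on the log scale; zero for y < 0."""
    if y < 0:
        return 0.0
    return math.exp(-lam + y * math.log(lam) - math.lgamma(y + 1))

def poisson_cdf(y: int, lam: float) -> float:
    """Poisson cdf P(Y <= y); zero for y < 0 (empty sum)."""
    return math.fsum(poisson_pmf(k, lam) for k in range(y + 1))

def frank_copula(u: float, v: float, delta: float) -> float:
    """Bivariate Frank copula, delta != 0; the sign of delta sets the sign of dependence."""
    num = math.expm1(-delta * u) * math.expm1(-delta * v)
    return -math.log1p(num / math.expm1(-delta)) / delta

def transition_cdf(y: int, y_prev: int, lam: float, delta: float) -> float:
    """F(y_t | y_{t-1}): difference of the joint cdf in y_{t-1}, divided by its pmf."""
    u = poisson_cdf(y, lam)
    upper = frank_copula(u, poisson_cdf(y_prev, lam), delta)
    lower = frank_copula(u, poisson_cdf(y_prev - 1, lam), delta)
    return (upper - lower) / poisson_pmf(y_prev, lam)

def transition_pmf(y: int, y_prev: int, lam: float, delta: float) -> float:
    """P(Y_t = y | Y_{t-1} = y_prev) as a first difference of the conditional cdf."""
    return transition_cdf(y, y_prev, lam, delta) - transition_cdf(y - 1, y_prev, lam, delta)

lam, delta = 3.0, 2.0
row = [transition_pmf(y, 2, lam, delta) for y in range(8)]
print([round(p, 4) for p in row])
```

The quantities `transition_cdf(y, y_prev, ...)` and `transition_cdf(y - 1, y_prev, ...)` play the roles of $F_{i,t}^{+}$ and $F_{i,t}^{-}$. Averaging the transition pmf over the stationary distribution of $Y_{t-1}$ recovers the Poisson marginal, which confirms that the copula construction leaves the margins unchanged.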

2.5. Inference

Parameter estimation is carried out by maximizing the likelihood function, with the log-likelihood constructed using copula theory. Since this function has no closed-form expression, its maximization cannot rely on standard analytical methods (see Panagiotelis et al. [25]). The maximization procedure employed is described next. Using the conditional density function in Equation (2) for $t = 1$, the joint distribution of $Y_{11}$, $Y_{21}$ and $Y_{31}$ is given by
$$f(y_{11}, y_{21}, y_{31}) = \int_{V^{-1}(F_{1,1}^{-})}^{V^{-1}(F_{1,1}^{+})} \int_{V^{-1}(F_{2,1}^{-})}^{V^{-1}(F_{2,1}^{+})} \int_{V^{-1}(F_{3,1}^{-})}^{V^{-1}(F_{3,1}^{+})} V_3(z_1, z_2, z_3, R)\, dz_3\, dz_2\, dz_1, \tag{3}$$
and for $t = 2, \ldots, n$, the conditional joint distribution of $Y_{1t} = y_{1t}$, $Y_{2t} = y_{2t}$ and $Y_{3t} = y_{3t}$ given $Y_{1,t-1} = y_{1,t-1}$, $Y_{2,t-1} = y_{2,t-1}$ and $Y_{3,t-1} = y_{3,t-1}$ is given by
$$f(y_{1t}, y_{2t}, y_{3t} \mid y_{1,t-1}, y_{2,t-1}, y_{3,t-1}) = \int_{V^{-1}(F_{1,t}^{-})}^{V^{-1}(F_{1,t}^{+})} \int_{V^{-1}(F_{2,t}^{-})}^{V^{-1}(F_{2,t}^{+})} \int_{V^{-1}(F_{3,t}^{-})}^{V^{-1}(F_{3,t}^{+})} V_3(z_1, z_2, z_3, R)\, dz_3\, dz_2\, dz_1. \tag{4}$$
Hence, combining Equations (3) and (4), the likelihood function is given by
$$L(\vartheta; y) = f(y_{11}, y_{21}, y_{31}) \cdot \prod_{t=2}^{n} f(y_{1t}, y_{2t}, y_{3t} \mid y_{1,t-1}, y_{2,t-1}, y_{3,t-1}), \tag{5}$$
where $\vartheta = (\theta, \delta_1, \delta_2, \delta_3, \rho_{12}, \rho_{13}, \rho_{23})$. Here $\theta$ is the vector of marginal parameters, and $\delta_1$, $\delta_2$ and $\delta_3$ are the serial dependence parameters of the first, second and third count series, respectively. The parameters $\rho_{12}$, $\rho_{13}$, and $\rho_{23}$ are the dependence parameters of either the Gaussian or the t-copula function, describing the cross-sectional dependence between series 1 and 2, series 1 and 3, and series 2 and 3, respectively. Therefore, by taking the logarithm of the function in (5), the log-likelihood function can be constructed as follows:
$$\log L(\vartheta; y) = \ell(\vartheta; y) = \log f(y_{11}, y_{21}, y_{31}) + \sum_{t=2}^{n} \log f(y_{1t}, y_{2t}, y_{3t} \mid y_{1,t-1}, y_{2,t-1}, y_{3,t-1}). \tag{6}$$
Maximizing the log-likelihood function in (6) yields the maximum likelihood estimates for the proposed model class. However, the log-likelihood involves a trivariate normal or t integral, as shown in (2), which does not admit a closed-form solution. To evaluate this integral, we employ the randomized importance sampling method of Genz and Bretz [26], which is well suited for low-dimensional settings, particularly for dimensions below ten. The method approximates the required multivariate probabilities by generating randomized samples from suitable proposal distributions, evaluating the integrand at the sampled points, and averaging the results to obtain a numerical approximation of the integral. This procedure was implemented by Hothorn et al. [27] in the mvtnorm package available on CRAN. Specifically, the package provides the functions pmvnorm and pmvt for computing multivariate normal and multivariate t probabilities, respectively. Then, the parameter estimates, i.e., $\hat{\vartheta}$, can be obtained using
$$\hat{\vartheta} = \arg\max_{\vartheta}\, \ell(\vartheta; y).$$
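To make the integral in (2) concrete, the sketch below estimates a trivariate normal box probability by plain Monte Carlo. This is a deliberately crude stand-in and not the Genz and Bretz algorithm or the mvtnorm implementation; it only illustrates what quantity is being approximated. The conditional cdf values `f_minus` and `f_plus`, the number of draws, and the function names are hypothetical, while the correlation values are borrowed from the simulation design purely for illustration.

```python
import math
import random
from statistics import NormalDist

def to_z(p: float) -> float:
    """Standard normal quantile, extended to p = 0 and p = 1."""
    if p <= 0.0:
        return -math.inf
    if p >= 1.0:
        return math.inf
    return NormalDist().inv_cdf(p)

def cholesky3(R):
    """Lower-triangular L with L L^T = R for a 3x3 positive-definite matrix."""
    L = [[0.0] * 3 for _ in range(3)]
    for i in range(3):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(R[i][i] - s)
            else:
                L[i][j] = (R[i][j] - s) / L[j][j]
    return L

def mc_box_prob(lower, upper, R, n_draws=50_000, seed=1):
    """Estimate P(lower < Z <= upper) for Z ~ N(0, R) by plain Monte Carlo."""
    rng = random.Random(seed)
    L = cholesky3(R)
    hits = 0
    for _ in range(n_draws):
        e = [rng.gauss(0.0, 1.0) for _ in range(3)]
        # Correlate the independent draws: z = L e has covariance L L^T = R.
        z = [sum(L[i][k] * e[k] for k in range(i + 1)) for i in range(3)]
        hits += all(lo < zi <= hi for lo, zi, hi in zip(lower, z, upper))
    return hits / n_draws

# Hypothetical conditional cdf values (F^-, F^+) for the three series at one time point.
f_minus = [0.20, 0.35, 0.10]
f_plus = [0.45, 0.60, 0.40]
R = [[1.0, 0.5, 0.4], [0.5, 1.0, 0.6], [0.4, 0.6, 1.0]]
cell = mc_box_prob([to_z(p) for p in f_minus], [to_z(p) for p in f_plus], R)
print(cell)
```

Plain Monte Carlo of this kind becomes inefficient when the box probability is small, which is one reason specialized integration routines are preferred inside a likelihood optimizer.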
Due to numerical instability and non-convergence issues associated with the Hessian matrix in the application study, Hessian-based standard errors were not used. Similar numerical instability issues associated with the calculation of the Hessian matrix in likelihood-based copula models have been discussed in the literature; see, Schepsmeier [28]. Nikoloulopoulos and Karlis [29] discussed the use of parametric bootstrap procedures for inference in copula-based models, and such approaches have been widely used in copula and multivariate time series settings. Therefore, instead of relying on Hessian-based standard errors, standard errors of the maximum likelihood estimates of ϑ were obtained using a parametric bootstrap procedure with 2000 bootstrap replications. Repeated datasets were generated from the fitted model, the model was refitted to each bootstrap sample, and the standard errors were computed as the empirical standard deviations of the resulting bootstrap estimates. In the next section, we assess the performance of the proposed class of models through a comprehensive simulation study and an application to a real-world dataset.
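The parametric bootstrap steps described above (simulate from the fitted model, refit, take the standard deviation of the estimates) can be sketched on a deliberately simplified toy model. Here an i.i.d. Poisson sample replaces the copula-based trivariate model, so that refitting reduces to a sample mean; the data values, the helper names `rpois`, `fit`, `simulate` and `bootstrap_se`, and the seed are all hypothetical.

```python
import math
import random
import statistics

def rpois(lam: float, rng: random.Random) -> int:
    """Draw one Poisson(lam) variate with Knuth's multiplication method."""
    limit = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def fit(sample) -> float:
    """Toy 'model fit': the Poisson MLE is the sample mean."""
    return statistics.fmean(sample)

def simulate(lam_hat: float, n: int, rng: random.Random):
    """Generate one bootstrap dataset of size n from the fitted toy model."""
    return [rpois(lam_hat, rng) for _ in range(n)]

def bootstrap_se(sample, n_boot: int = 2000, seed: int = 1) -> float:
    """Parametric bootstrap standard error of the fitted parameter."""
    rng = random.Random(seed)
    lam_hat = fit(sample)
    estimates = [fit(simulate(lam_hat, len(sample), rng)) for _ in range(n_boot)]
    return statistics.stdev(estimates)

counts = [3, 5, 2, 4, 6, 1, 3, 4, 2, 5, 3, 4, 7, 2, 3, 4]  # made-up illustrative counts
print(bootstrap_se(counts))
```

In the full model, `fit` would be the likelihood maximization over $\vartheta$ and `simulate` would generate a trivariate copula-based Markov series, but the surrounding loop has the same shape.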

3. Results

A comprehensive simulation study is conducted to evaluate the proposed method and to assess the asymptotic properties of the parameter estimates. The marginal distributions are taken to be Poisson. Here $\lambda_1, \lambda_2, \lambda_3$ denote the marginal means, $\delta_1, \delta_2, \delta_3$ the serial dependence parameters, and $\rho_{12}, \rho_{13}, \rho_{23}$ the cross-correlation parameters between each pair of series. The Gaussian copula was selected as the candidate copula family, with true parameters ($\lambda_1 = 3$, $\lambda_2 = 5$, $\lambda_3 = 7$, $\delta_1 = 0.4$, $\delta_2 = 0.4$, $\delta_3 = 0.3$, $\rho_{12} = 0.5$, $\rho_{13} = 0.4$, $\rho_{23} = 0.6$). Since we assume the process is stationary, we set the marginal distributions' parameters, $\theta$, to be constant across time. The simulation study used sample sizes of 50, 100, 300 and 1000, each with 500 replications. For the nine parameters, the standard error (SE), mean square error (MSE), and mean absolute error (MAE) were calculated empirically from the simulated datasets, and the results are summarized in Table 2. The MSE and MAE are defined as follows:
$$\mathrm{MSE} = \frac{1}{m} \sum_{i=1}^{m} (\theta_i - \hat{\theta}_i)^2, \qquad \mathrm{MAE} = \frac{1}{m} \sum_{i=1}^{m} \left| \theta_i - \hat{\theta}_i \right|,$$
where $\hat{\theta}_i$ is the estimated value of the parameter in the $i$-th replication, $\theta_i$ is the corresponding true value, and $m$ is the number of replications.
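The three performance measures can be computed from a vector of replicated estimates as in the following minimal sketch. The function names and the three example estimates are hypothetical and serve only to show the arithmetic; the empirical SE is taken here to be the sample standard deviation of the estimates.

```python
import statistics

def mse(true_value: float, estimates) -> float:
    """Mean square error of the estimates around the true parameter value."""
    return sum((true_value - e) ** 2 for e in estimates) / len(estimates)

def mae(true_value: float, estimates) -> float:
    """Mean absolute error of the estimates around the true parameter value."""
    return sum(abs(true_value - e) for e in estimates) / len(estimates)

def empirical_se(estimates) -> float:
    """Empirical standard error: sample standard deviation across replications."""
    return statistics.stdev(estimates)

estimates = [2.8, 3.1, 3.3]  # made-up estimates of a parameter whose true value is 3
print(mse(3.0, estimates), mae(3.0, estimates), empirical_se(estimates))
```

For these values the squared errors are 0.04, 0.01 and 0.09, so the MSE is $0.14 / 3 \approx 0.0467$ and the MAE is $0.6 / 3 = 0.2$.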
Table 2 and Table 3 show that the parameter estimates converge to their true values, with standard errors, MSEs, and MAEs decreasing as the sample size increases. These results indicate that the estimates become increasingly reliable as the sample size grows.
Figure 1 presents the Q–Q plots for the parameter estimates obtained using Poisson marginals. The quantile plots show that the empirical distribution of the parameter estimates aligns closely with the 45° reference line, indicating that the sampling distribution of the estimates is approximately normal. This supports the asymptotic normality of the maximum likelihood estimates and suggests stable inference as the sample size increases.
Figure 2 presents the Q–Q plots for the parameter estimates obtained using Poisson marginals under negative cross-correlation. The quantile plots show that the empirical distribution of the parameter estimates aligns closely with the 45° reference line for most of the parameters, indicating that the sampling distribution of the estimates is approximately normal.
With the introduction of an additional parameter ( κ ), the negative binomial (NB) distribution can accommodate overdispersion, unlike the Poisson distribution. We conducted simulation studies using a Gaussian copula for both the univariate and joint dependence structures, with one marginal specified as NB. The results in Table 4 demonstrate that the stability and precision of the estimation increase as the sample size increases, although some finite-sample bias remains for certain parameters of dependence. Furthermore, the copula-based framework allows for the use of mixed marginal distributions, which provides an additional advantage in modeling flexibility.

Applications

To further demonstrate the effectiveness of the proposed model, we consider the annual number of major hurricanes (Saffir-Simpson Category 3 and above) recorded in the North Atlantic, Eastern North Pacific and Western North Pacific basins from 1980–2025. The data used in this study are publicly available from the International Best Track Archive for Climate Stewardship (IBTrACS) project. Statistical modeling of hurricane count data has been considered in several previous studies; see, for example, Villarini et al. [30] and Elsner and Jagger [31]. Hurricanes participate in equalizing global heat imbalances. The Atlantic and Pacific hurricane basins are strongly influenced by large-scale climate patterns, particularly sea surface temperatures and vertical wind shear, which often shift in opposite ways between the two basins. Three hurricane count time series were considered for modeling under the proposed trivariate count time series framework, with Poisson or negative binomial marginal distributions chosen to adequately capture the dispersion characteristics of each series. The empirical means for the three count time series are 3.064 (North Atlantic), 5.255 (Eastern Pacific) and 6.965 (Western Pacific), respectively. The variance of the Eastern Pacific series exceeds its mean, indicating the presence of overdispersion. The annual hurricane counts for the three series are shown in Figure 3. Compared to the North Atlantic count time series, the Eastern Pacific and Western Pacific series exhibit higher counts and greater variability over time. Figure 4 presents bar plots of the count distributions along with the sample autocorrelation functions (ACFs) for the three count series. The ACFs reveal clear serial dependence in each series. This observation motivates a detailed examination of both serial and cross-series dependencies using the proposed copula-based trivariate model.
Table 5 presents the AIC values for different copula-based models with Poisson and mixed marginal distributions. In the mixed marginal specification, the first and third marginal distributions are Poisson, whereas the second marginal distribution follows a negative binomial distribution. The copula family listed under Univariate denotes the copula specification used for the marginal distributions, whereas the copula family listed under Joint represents the copula used to model the multivariate dependence structure. Thus, each row corresponds to a distinct combination of marginal and joint copula specifications. Several candidate copula families are considered to model the serial dependence within each time series, while the Gaussian or t copula functions are used to model the cross-sectional dependence between each pair of count series. The Gaussian copula was selected due to its ability to model distinct pairwise dependencies through a parsimonious correlation structure while maintaining computational tractability. It is also well suited for modeling cross-correlation, as it directly captures dependence through the correlation matrix. Given the moderate sample size and the absence of strong evidence of tail dependence in the hurricane count data, the Gaussian copula provides an appropriate balance between flexibility and interpretability. Additionally, the Gaussian copula can accommodate both positive and negative dependence, which is particularly important in this application, as some of the series exhibit negative cross-dependence. Among the fitted models, those employing Gaussian copulas for both the univariate and joint distributions yield the minimum AIC. Specifically, the mixed marginal distribution model using a Gaussian copula for both the univariate and the joint distributions yields the minimum AIC value of 626.88. 
This model effectively captures the negative cross-correlation observed between the Atlantic and Pacific basins, which is a key advantage of the copula-based approach. Furthermore, the small standard errors associated with the parameter estimates demonstrate the stability and usefulness of the proposed class of models.
Table 6 presents the parameter estimates for the copula-based model, in which a Gaussian copula is used to construct the joint distribution with Poisson and mixed marginal distributions. The model parameters were estimated using the maximum likelihood estimation (MLE) approach. During the analysis, numerical instability and convergence issues were observed in the Hessian matrix associated with the likelihood. Therefore, instead of relying on Hessian-based standard errors, parametric bootstrap standard errors were computed using the final MLE estimates. Specifically, 2000 bootstrap samples were generated from the fitted model, the model was refitted to each bootstrap sample, and the bootstrap standard errors were obtained as the empirical standard deviations of the resulting parameter estimates. In this setting, the parameters with subscript “1” correspond to the hurricane counts recorded in the North Atlantic basin, the parameters with subscript “2” correspond to the hurricane counts recorded in the Eastern Pacific basin, and the parameters with subscript “3” correspond to the counts in Western Pacific basin. The mixed marginal model indicates that hurricane activity varies substantially across ocean basins, with the Western Pacific exhibiting the highest average counts and the Eastern Pacific displaying significant overdispersion. Moderate serial dependence suggests persistence in annual hurricane activity, while cross-basin dependence reveals a negative association between the Atlantic and Pacific basins and a positive association within the Pacific. These patterns are consistent with large-scale climate drivers such as the El Niño–Southern Oscillation (ENSO).

4. Discussion

Time series count data often arise as multivariate vectors that exhibit not only serial dependence within each individual series but also cross-correlation across series. In this paper, we propose a copula-based modeling framework to capture the dependence structure in trivariate count time series. A copula is used to capture both serial dependence and cross-sectional dependence among multiple time series. Moreover, the proposed class of models allows for a general Markov structure, which enhances flexibility in modeling complex dependence patterns in count time series data. Likelihood computation is nontrivial, as it involves evaluating trivariate Gaussian or a t copula integrals; these are approximated using standard randomized Monte Carlo methods. To demonstrate the effectiveness and superiority of the proposed model, both simulation studies and real-data applications are presented. One limitation of the proposed approach is that Hessian-based standard errors may be numerically unstable in some settings, particularly when the likelihood surface is relatively flat or the observed information matrix is close to singular. In such cases, bootstrap-based standard errors provide a practical alternative, as considered in this study. As a future extension, Bayesian estimation procedures could be explored to provide an alternative framework for parameter estimation. In addition, vine copula methods may be considered to construct more flexible joint dependence structures in higher-dimensional settings.

Author Contributions

Methodology, D.F., Y.W. and W.J.; Software, D.F., Y.W. and W.J.; Formal analysis, D.F., Y.W. and W.J.; Investigation, D.F.; Writing—original draft, D.F. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The data that support the findings of this study are openly available in International Best Track Archive for Climate Stewardship at https://www.ncei.noaa.gov/products/international-best-track-archive (accessed on 21 December 2025).

Acknowledgments

The authors thank the Editor and Reviewers whose comments have significantly improved the manuscript.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Davis, R.A.; Fokianos, K.; Holan, S.H.; Joe, H.; Livsey, J.; Lund, R.; Pipiras, V.; Ravishanker, N. Count time series: A methodological review. J. Am. Stat. Assoc. 2021, 116, 1533–1547. [Google Scholar] [CrossRef]
  2. Teicher, H. On the multivariate Poisson distribution. Scand. Actuar. J. 1954, 1954, 1–9. [Google Scholar] [CrossRef]
  3. Inouye, D.I.; Yang, E.; Allen, G.I.; Ravikumar, P. A review of multivariate distributions for count data derived from the Poisson distribution. Wiley Interdiscip. Rev. Comput. Stat. 2017, 9, e1398. [Google Scholar] [CrossRef]
  4. Quoreshi, A.S. Bivariate time series modeling of financial count data. Commun. Stat.-Theory Methods 2006, 35, 1343–1358. [Google Scholar] [CrossRef]
  5. Wang, K.; Lee, A.H.; Yau, K.K.; Carrivick, P.J. A bivariate zero-inflated Poisson regression model to analyze occupational injuries. Accid. Anal. Prev. 2003, 35, 625–629. [Google Scholar] [CrossRef]
  6. Heinen, A.; Rengifo, E. Multivariate autoregressive modeling of time series count data using copulas. J. Empir. Financ. 2007, 14, 564–583. [Google Scholar] [CrossRef]
  7. Karlis, D.; Pedeli, X. Flexible bivariate INAR (1) processes using copulas. Commun. Stat.-Theory Methods 2013, 42, 723–740. [Google Scholar] [CrossRef]
  8. Sefidi, S.; Ganjali, M.; Baghfalaki, T. Pair copula construction for longitudinal data with zero-inflated power series marginal distributions. J. Biopharm. Stat. 2021, 31, 233–249. [Google Scholar] [CrossRef]
  9. Zhao, Z.; Shi, P.; Zhang, Z. Modeling multivariate time series with copula-linked univariate D-vines. J. Bus. Econ. Stat. 2021, 40, 690–704. [Google Scholar] [CrossRef]
  10. Yu, R.; Yang, R.; Zhang, C.; Špoljar, M.; Kuczyńska-Kippen, N.; Sang, G. A vine copula-based modeling for identification of multivariate water pollution risk in an interconnected river system network. Water 2020, 12, 2741. [Google Scholar] [CrossRef]
  11. Deng, Y.; Chaganty, N.R. Pair-copula models for analyzing family data. J. Stat. Theory Pract. 2021, 15, 13. [Google Scholar] [CrossRef]
  12. Bradshaw, C.; Blei, D.M. A Bayesian model of underreporting for sexual assault on college campuses. Ann. Appl. Stat. 2024, 18, 3146–3164. [Google Scholar] [CrossRef]
  13. Cui, Y.; Zhu, F. A new bivariate integer-valued GARCH model allowing for negative cross-correlation. Test 2018, 27, 428–452. [Google Scholar] [CrossRef]
  14. Ahmad, N.; Gayah, V.V.; Donnell, E.T. Copula-based bivariate count data regression models for simultaneous estimation of crash counts based on severity and number of vehicles. Accid. Anal. Prev. 2023, 181, 106928. [Google Scholar] [CrossRef]
  15. Jeng, H.A.; Singh, R.; Diawara, N.; Curtis, K.; Gonzalez, R.; Welch, N.; Jackson, C.; Jurgens, D.; Adikari, S. Application of wastewater-based surveillance and copula time-series model for COVID-19 forecasts. Sci. Total Environ. 2023, 885, 163655. [Google Scholar] [CrossRef]
  16. Fokianos, K.; Støve, B.; Tjøstheim, D.; Doukhan, P. Multivariate Count Autoregression. Bernoulli 2020, 26, 471–499. [Google Scholar] [CrossRef]
  17. Debaly, Z.-M.; Truquet, L. Multivariate Time Series Models for Mixed Data. Bernoulli 2023, 29, 669–695. [Google Scholar] [CrossRef]
  18. Alqawba, M.; Fernando, D.; Diawara, N. A class of copula-based bivariate Poisson time series models with applications. Computation 2021, 9, 108. [Google Scholar] [CrossRef]
  19. Fernando, D.; Jayanetti, W. A copula-based model for analyzing bivariate offense data. Stats 2025, 8, 111. [Google Scholar] [CrossRef]
  20. Johnson, N.L.; Kemp, A.W.; Kotz, S. Univariate Discrete Distributions. In Wiley Series in Probability and Statistics, 3rd ed.; Wiley: Hoboken, NJ, USA, 2005. [Google Scholar]
  21. Hilbe, J.M. Negative Binomial Regression, 2nd ed.; Cambridge University Press: Cambridge, UK, 2011. [Google Scholar]
  22. Nelsen, R.B. An Introduction to Copulas; Springer: Cham, Switzerland, 2007. [Google Scholar]
  23. Joe, H. Dependence Modeling with Copulas; Chapman and Hall/CRC: Boca Raton, FL, USA, 2014. [Google Scholar]
  24. Alqawba, M.; Diawara, N. Copula-based Markov zero-inflated count time series models with application. J. Appl. Stat. 2021, 48, 786–803. [Google Scholar] [CrossRef]
  25. Panagiotelis, A.; Czado, C.; Joe, H. Pair copula constructions for multivariate discrete data. J. Am. Stat. Assoc. 2012, 107, 1063–1072. [Google Scholar] [CrossRef]
  26. Genz, A.; Bretz, F. Computation of Multivariate Normal and t Probabilities; Springer Science & Business Media: Cham, Switzerland, 2009; Volume 195. [Google Scholar]
  27. Hothorn, T.; Bretz, F.; Genz, A. On multivariate t and Gauss probabilities in R. Sigma 2001, 1000, 3. [Google Scholar]
  28. Schepsmeier, U. A goodness-of-fit test for regular vine copula models. Econom. Rev. 2019, 38, 25–46. [Google Scholar] [CrossRef]
  29. Nikoloulopoulos, A.K.; Karlis, D. Copula model evaluation based on parametric bootstrap. Comput. Stat. Data Anal. 2008, 52, 3342–3353. [Google Scholar] [CrossRef]
  30. Villarini, G.; Vecchi, G.A.; Smith, J.A. Modeling the dependence of tropical storm counts in the North Atlantic basin on climate indices. Mon. Weather Rev. 2010, 138, 2681–2705. [Google Scholar] [CrossRef]
  31. Elsner, J.B.; Jagger, T.H. Hurricane Climatology: A Modern Statistical Guide Using R; Oxford University Press: Oxford, UK, 2013. [Google Scholar]
Figure 1. Q-Q plots of ML estimates for n = 1000 under positive cross-correlation with Poisson marginals.
Figure 2. Q-Q plots of ML estimates for n = 1000 under negative cross-correlation with Poisson marginals.
Figure 3. Hurricane counts for the North Atlantic, Eastern Pacific, and Western Pacific basins.
Figure 4. Bar plots and ACFs of hurricane counts for the North Atlantic (top), Eastern Pacific (middle), and Western Pacific (bottom) basins.
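Table 1. Bivariate copula functions.

| Copula | Copula Function | Parameter Range |
|---|---|---|
| Gaussian | $C(u_1, u_2; \delta) = \Phi_{\delta}\left(\Phi^{-1}(u_1), \Phi^{-1}(u_2)\right)$ | $\delta \in [-1, 1]$ |
| Frank | $C(u_1, u_2; \delta) = -\frac{1}{\delta} \log\left(1 + \frac{(e^{-\delta u_1} - 1)(e^{-\delta u_2} - 1)}{e^{-\delta} - 1}\right)$ | $\delta \in \mathbb{R} \setminus \{0\}$ |
| Gumbel | $C(u_1, u_2; \delta) = \exp\left(-\left[(-\log u_1)^{\delta} + (-\log u_2)^{\delta}\right]^{1/\delta}\right)$ | $\delta \geq 1$ |
| Clayton | $C(u_1, u_2; \delta) = \left(u_1^{-\delta} + u_2^{-\delta} - 1\right)^{-1/\delta}$ | $\delta > 0$ |
| Plackett | $C(u_1, u_2; \delta) = \frac{[1 + (\delta - 1)(u_1 + u_2)] - \sqrt{[1 + (\delta - 1)(u_1 + u_2)]^2 - 4 u_1 u_2 \delta (\delta - 1)}}{2(\delta - 1)}$ | $\delta \geq 0$ |
| BVT | $C(u_1, u_2; \delta, \nu) = \tau_{\nu, \delta}\left(\tau_{\nu}^{-1}(u_1), \tau_{\nu}^{-1}(u_2)\right)$ | $\delta \in [-1, 1]$ |
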
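The closed-form families in Table 1 are straightforward to evaluate directly. The sketch below implements the Clayton, Frank, Gumbel, and Plackett copulas using only the Python standard library; the Gaussian and bivariate t copulas are left out because they require a bivariate distribution function rather than a closed-form expression. The function names and the test point are illustrative assumptions, and arguments are assumed to lie in (0, 1].

```python
import math


def clayton(u1, u2, delta):
    """Clayton copula, delta > 0."""
    return (u1 ** -delta + u2 ** -delta - 1.0) ** (-1.0 / delta)


def frank(u1, u2, delta):
    """Frank copula, delta != 0 (expm1/log1p used for numerical stability)."""
    num = math.expm1(-delta * u1) * math.expm1(-delta * u2)
    return -math.log1p(num / math.expm1(-delta)) / delta


def gumbel(u1, u2, delta):
    """Gumbel copula, delta >= 1."""
    s = (-math.log(u1)) ** delta + (-math.log(u2)) ** delta
    return math.exp(-(s ** (1.0 / delta)))


def plackett(u1, u2, delta):
    """Plackett copula as written in Table 1; undefined at delta = 1."""
    s = 1.0 + (delta - 1.0) * (u1 + u2)
    root = math.sqrt(s * s - 4.0 * u1 * u2 * delta * (delta - 1.0))
    return (s - root) / (2.0 * (delta - 1.0))


# Evaluate each family at one point with a parameter implying positive dependence.
u, v = 0.3, 0.7
values = {
    "clayton": clayton(u, v, 2.0),
    "frank": frank(u, v, 5.0),
    "gumbel": gumbel(u, v, 2.0),
    "plackett": plackett(u, v, 3.0),
}
```

Two properties are useful for checking an implementation: every copula satisfies C(u, 1) = u, and every value lies between the Fréchet bounds max(u + v - 1, 0) and min(u, v).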
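Table 2. Parameter estimates for univariate and joint distributions using a Gaussian copula with Poisson marginals.

| Sample Size | Parameter (True Value) | Estimate | SE | MSE | MAE |
|---|---|---|---|---|---|
| 50 | $\lambda_1$ (3) | 3.103 | 0.366 | 0.134 | 0.291 |
| | $\lambda_2$ (5) | 4.993 | 0.491 | 0.240 | 0.387 |
| | $\lambda_3$ (7) | 6.997 | 0.496 | 0.246 | 0.395 |
| | $\delta_1$ (0.4) | 0.341 | 0.109 | 0.015 | 0.099 |
| | $\delta_2$ (0.4) | 0.339 | 0.106 | 0.015 | 0.098 |
| | $\delta_3$ (0.3) | 0.248 | 0.130 | 0.019 | 0.109 |
| | $\rho_{12}$ (0.5) | 0.449 | 0.103 | 0.013 | 0.090 |
| | $\rho_{13}$ (0.4) | 0.353 | 0.128 | 0.018 | 0.104 |
| | $\rho_{23}$ (0.6) | 0.569 | 0.091 | 0.009 | 0.075 |
| 100 | $\lambda_1$ (3) | 3.006 | 0.268 | 0.072 | 0.212 |
| | $\lambda_2$ (5) | 5.004 | 0.351 | 0.123 | 0.278 |
| | $\lambda_3$ (7) | 6.995 | 0.359 | 0.129 | 0.279 |
| | $\delta_1$ (0.4) | 0.348 | 0.074 | 0.008 | 0.073 |
| | $\delta_2$ (0.4) | 0.355 | 0.068 | 0.007 | 0.065 |
| | $\delta_3$ (0.3) | 0.267 | 0.079 | 0.007 | 0.069 |
| | $\rho_{12}$ (0.5) | 0.452 | 0.078 | 0.008 | 0.073 |
| | $\rho_{13}$ (0.4) | 0.353 | 0.088 | 0.010 | 0.078 |
| | $\rho_{23}$ (0.6) | 0.564 | 0.064 | 0.005 | 0.057 |
| 300 | $\lambda_1$ (3) | 3.009 | 0.161 | 0.026 | 0.125 |
| | $\lambda_2$ (5) | 5.005 | 0.203 | 0.041 | 0.160 |
| | $\lambda_3$ (7) | 6.984 | 0.224 | 0.051 | 0.176 |
| | $\delta_1$ (0.4) | 0.357 | 0.045 | 0.004 | 0.051 |
| | $\delta_2$ (0.4) | 0.356 | 0.041 | 0.003 | 0.049 |
| | $\delta_3$ (0.3) | 0.271 | 0.043 | 0.003 | 0.041 |
| | $\rho_{12}$ (0.5) | 0.446 | 0.044 | 0.005 | 0.058 |
| | $\rho_{13}$ (0.4) | 0.343 | 0.049 | 0.006 | 0.063 |
| | $\rho_{23}$ (0.6) | 0.563 | 0.037 | 0.003 | 0.042 |
| 1000 | $\lambda_1$ (3) | 2.991 | 0.091 | 0.008 | 0.073 |
| | $\lambda_2$ (5) | 4.998 | 0.132 | 0.017 | 0.107 |
| | $\lambda_3$ (7) | 6.985 | 0.126 | 0.016 | 0.105 |
| | $\delta_1$ (0.4) | 0.358 | 0.026 | 0.002 | 0.042 |
| | $\delta_2$ (0.4) | 0.358 | 0.024 | 0.002 | 0.042 |
| | $\delta_3$ (0.3) | 0.272 | 0.027 | 0.001 | 0.033 |
| | $\rho_{12}$ (0.5) | 0.445 | 0.029 | 0.004 | 0.055 |
| | $\rho_{13}$ (0.4) | 0.344 | 0.033 | 0.004 | 0.059 |
| | $\rho_{23}$ (0.6) | 0.563 | 0.027 | 0.002 | 0.038 |
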
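The Estimate, SE, MSE, and MAE columns summarize estimates across simulation replicates. The sketch below shows one common way to compute such summaries, assuming that Estimate is the mean of the replicate estimates, SE is their sample standard deviation, and MSE and MAE are the mean squared and mean absolute deviations from the true value; the paper may define SE differently. For simplicity the replicates here come from independent Poisson samples rather than from the copula-based model, and the helper name `summarize_replicates`, the seed, and the replicate count are hypothetical.

```python
import numpy as np


def summarize_replicates(estimates, true_value):
    """Summarize replicate estimates of one parameter (assumed definitions)."""
    est = np.asarray(estimates, dtype=float)
    return {
        "estimate": est.mean(),                      # mean of replicate estimates
        "se": est.std(ddof=1),                       # spread across replicates
        "mse": np.mean((est - true_value) ** 2),     # mean squared error
        "mae": np.mean(np.abs(est - true_value)),    # mean absolute error
    }


# Toy replicates: sample means of i.i.d. Poisson(3) data with n = 50.
# This is a stand-in for refitting the full model on each simulated data set.
rng = np.random.default_rng(2026)
true_lambda = 3.0
n, n_rep = 50, 500
estimates = rng.poisson(true_lambda, size=(n_rep, n)).mean(axis=1)
summary = summarize_replicates(estimates, true_lambda)
```

Under these definitions the MSE decomposes into the variance of the replicate estimates plus the squared bias, so a small SE combined with a persistent bias can still yield a noticeable MSE.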
Table 3. Parameter estimates for univariate and joint distributions using a Gaussian copula with Poisson marginals and negative cross-correlation.

| Sample Size | Parameter (True Value) | Estimate | SE | MSE | MAE |
|---|---|---|---|---|---|
| 50 | $\lambda_1$ (3) | 3.026 | 0.420 | 0.176 | 0.326 |
| | $\lambda_2$ (5) | 5.120 | 0.583 | 0.354 | 0.465 |
| | $\lambda_3$ (7) | 7.099 | 0.587 | 0.354 | 0.462 |
| | $\delta_1$ (0.4) | 0.363 | 0.112 | 0.014 | 0.089 |
| | $\delta_2$ (0.4) | 0.361 | 0.111 | 0.014 | 0.092 |
| | $\delta_3$ (0.3) | 0.281 | 0.112 | 0.013 | 0.089 |
| | $\rho_{12}$ (−0.5) | −0.426 | 0.137 | 0.024 | 0.119 |
| | $\rho_{13}$ (−0.4) | −0.318 | 0.160 | 0.032 | 0.135 |
| | $\rho_{23}$ (0.6) | 0.566 | 0.098 | 0.011 | 0.080 |
| 100 | $\lambda_1$ (3) | 3.067 | 0.279 | 0.082 | 0.230 |
| | $\lambda_2$ (5) | 5.089 | 0.410 | 0.175 | 0.329 |
| | $\lambda_3$ (7) | 7.067 | 0.433 | 0.192 | 0.347 |
| | $\delta_1$ (0.4) | 0.372 | 0.086 | 0.008 | 0.070 |
| | $\delta_2$ (0.4) | 0.370 | 0.076 | 0.006 | 0.065 |
| | $\delta_3$ (0.3) | 0.285 | 0.091 | 0.008 | 0.071 |
| | $\rho_{12}$ (−0.5) | −0.422 | 0.109 | 0.179 | 0.098 |
| | $\rho_{13}$ (−0.4) | −0.323 | 0.118 | 0.019 | 0.105 |
| | $\rho_{23}$ (0.6) | 0.554 | 0.069 | 0.007 | 0.067 |
| 300 | $\lambda_1$ (3) | 3.056 | 0.199 | 0.042 | 0.167 |
| | $\lambda_2$ (5) | 5.075 | 0.276 | 0.082 | 0.225 |
| | $\lambda_3$ (7) | 7.061 | 0.298 | 0.092 | 0.237 |
| | $\delta_1$ (0.4) | 0.373 | 0.059 | 0.004 | 0.051 |
| | $\delta_2$ (0.4) | 0.374 | 0.053 | 0.003 | 0.047 |
| | $\delta_3$ (0.3) | 0.296 | 0.062 | 0.004 | 0.051 |
| | $\rho_{12}$ (−0.5) | −0.417 | 0.085 | 0.014 | 0.089 |
| | $\rho_{13}$ (−0.4) | −0.311 | 0.108 | 0.019 | 0.098 |
| | $\rho_{23}$ (0.6) | 0.548 | 0.055 | 0.006 | 0.058 |
| 1000 | $\lambda_1$ (3) | 3.057 | 0.158 | 0.028 | 0.130 |
| | $\lambda_2$ (5) | 5.095 | 0.214 | 0.055 | 0.172 |
| | $\lambda_3$ (7) | 7.094 | 0.233 | 0.063 | 0.179 |
| | $\delta_1$ (0.4) | 0.384 | 0.051 | 0.003 | 0.042 |
| | $\delta_2$ (0.4) | 0.381 | 0.044 | 0.002 | 0.037 |
| | $\delta_3$ (0.3) | 0.301 | 0.053 | 0.003 | 0.041 |
| | $\rho_{12}$ (−0.5) | −0.418 | 0.008 | 0.012 | 0.083 |
| | $\rho_{13}$ (−0.4) | −0.315 | 0.089 | 0.015 | 0.088 |
| | $\rho_{23}$ (0.6) | 0.551 | 0.042 | 0.004 | 0.052 |

Table 4. Parameter estimates for univariate and joint distributions using a Gaussian copula with Poisson and negative binomial marginals.

| Sample Size | Parameter (True Value) | Estimate | SE | MSE | MAE |
|---|---|---|---|---|---|
| 50 | $\lambda_1$ (3) | 2.992 | 0.381 | 0.145 | 0.310 |
| | $\lambda_2$ (5) | 5.037 | 0.809 | 0.656 | 0.645 |
| | $\kappa_2$ (2.5) | 2.374 | 0.701 | 0.506 | 0.552 |
| | $\lambda_3$ (7) | 6.963 | 0.511 | 0.262 | 0.408 |
| | $\delta_1$ (0.4) | 0.330 | 0.110 | 0.017 | 0.102 |
| | $\delta_2$ (0.4) | 0.348 | 0.112 | 0.015 | 0.098 |
| | $\delta_3$ (0.3) | 0.252 | 0.115 | 0.015 | 0.098 |
| | $\rho_{12}$ (0.5) | 0.466 | 0.112 | 0.014 | 0.092 |
| | $\rho_{13}$ (0.4) | 0.371 | 0.125 | 0.016 | 0.099 |
| | $\rho_{23}$ (0.6) | 0.580 | 0.091 | 0.009 | 0.073 |
| 100 | $\lambda_1$ (3) | 3.020 | 0.265 | 0.070 | 0.205 |
| | $\lambda_2$ (5) | 5.011 | 0.579 | 0.335 | 0.453 |
| | $\kappa_2$ (2.5) | 2.306 | 0.504 | 0.291 | 0.430 |
| | $\lambda_3$ (7) | 6.981 | 0.352 | 0.124 | 0.278 |
| | $\delta_1$ (0.4) | 0.351 | 0.078 | 0.008 | 0.073 |
| | $\delta_2$ (0.4) | 0.360 | 0.078 | 0.008 | 0.071 |
| | $\delta_3$ (0.3) | 0.265 | 0.081 | 0.007 | 0.070 |
| | $\rho_{12}$ (0.5) | 0.450 | 0.071 | 0.007 | 0.071 |
| | $\rho_{13}$ (0.4) | 0.349 | 0.079 | 0.009 | 0.074 |
| | $\rho_{23}$ (0.6) | 0.571 | 0.065 | 0.005 | 0.056 |
| 300 | $\lambda_1$ (3) | 2.978 | 0.157 | 0.025 | 0.127 |
| | $\lambda_2$ (5) | 4.991 | 0.366 | 0.133 | 0.292 |
| | $\kappa_2$ (2.5) | 2.233 | 0.291 | 0.162 | 0.341 |
| | $\lambda_3$ (7) | 6.991 | 0.211 | 0.045 | 0.169 |
| | $\delta_1$ (0.4) | 0.355 | 0.046 | 0.004 | 0.053 |
| | $\delta_2$ (0.4) | 0.370 | 0.043 | 0.003 | 0.042 |
| | $\delta_3$ (0.3) | 0.270 | 0.048 | 0.003 | 0.045 |
| | $\rho_{12}$ (0.5) | 0.449 | 0.043 | 0.004 | 0.055 |
| | $\rho_{13}$ (0.4) | 0.344 | 0.048 | 0.005 | 0.062 |
| | $\rho_{23}$ (0.6) | 0.565 | 0.039 | 0.003 | 0.043 |
| 1000 | $\lambda_1$ (3) | 2.979 | 0.104 | 0.011 | 0.083 |
| | $\lambda_2$ (5) | 4.982 | 0.247 | 0.061 | 0.196 |
| | $\kappa_2$ (2.5) | 2.213 | 0.178 | 0.127 | 0.318 |
| | $\lambda_3$ (7) | 6.982 | 0.128 | 0.018 | 0.108 |
| | $\delta_1$ (0.4) | 0.355 | 0.026 | 0.003 | 0.045 |
| | $\delta_2$ (0.4) | 0.373 | 0.029 | 0.001 | 0.031 |
| | $\delta_3$ (0.3) | 0.270 | 0.029 | 0.002 | 0.034 |
| | $\rho_{12}$ (0.5) | 0.446 | 0.031 | 0.003 | 0.055 |
| | $\rho_{13}$ (0.4) | 0.344 | 0.035 | 0.003 | 0.058 |
| | $\rho_{23}$ (0.6) | 0.565 | 0.025 | 0.002 | 0.036 |

Table 5. AIC values for models with different copula families.

| Univariate Copula | Joint Copula | AIC (Poisson) | AIC (Mixed) |
|---|---|---|---|
| Gaussian | Gaussian | 629.52 | 626.88 |
| Gaussian | t | 649.23 | 648.89 |
| Frank | t | 645.66 | 651.35 |
| Clayton | t | 649.36 | 651.57 |

Table 6. Parameter estimates for the trivariate Poisson and mixed marginal models.

| Parameter | Poisson Estimate | Poisson SE | Mixed Estimate | Mixed SE |
|---|---|---|---|---|
| $\lambda_1$ | 2.958 | 0.113 | 2.925 | 0.102 |
| $\lambda_2$ | 4.657 | 0.144 | 4.546 | 0.219 |
| $\kappa_2$ | | | 3.459 | 0.639 |
| $\lambda_3$ | 8.822 | 0.178 | 8.743 | 0.176 |
| $\delta_1$ | 0.252 | 0.264 | 0.181 | 0.182 |
| $\delta_2$ | 0.173 | 0.224 | 0.171 | 0.165 |
| $\delta_3$ | 0.328 | 0.311 | 0.194 | 0.162 |
| $\rho_{12}$ | −0.217 | 0.083 | −0.432 | 0.068 |
| $\rho_{13}$ | −0.233 | 0.079 | −0.435 | 0.063 |
| $\rho_{23}$ | 0.182 | 0.083 | 0.285 | 0.078 |
| AIC | 629.52 | | 626.88 | |

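The AIC comparison in Table 6 can be unpacked with a short calculation. The sketch below assumes the standard definition AIC = 2k - 2 log L and assumes that the number of free parameters k equals the number of parameter rows in Table 6 (9 for the Poisson model, 10 for the mixed model, which adds the dispersion parameter). Both assumptions, along with the helper names, are illustrative; the paper may count parameters differently.

```python
def aic(log_lik, n_params):
    """Standard Akaike information criterion."""
    return 2 * n_params - 2 * log_lik


def log_lik_from_aic(aic_value, n_params):
    """Invert the AIC definition to recover the maximized log-likelihood."""
    return (2 * n_params - aic_value) / 2


# Reported AIC values from Table 6; parameter counts are assumed (see lead-in).
aic_poisson, k_poisson = 629.52, 9
aic_mixed, k_mixed = 626.88, 10

ll_poisson = log_lik_from_aic(aic_poisson, k_poisson)  # implied log-likelihood
ll_mixed = log_lik_from_aic(aic_mixed, k_mixed)        # implied log-likelihood
delta_aic = aic_poisson - aic_mixed                    # positive favors the mixed model
```

Under these assumptions the mixed model improves the maximized log-likelihood by enough to offset the penalty of 2 for its additional parameter, which is why its AIC is lower.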

Share and Cite

MDPI and ACS Style

Fernando, D.; Wen, Y.; Jayanetti, W. A Copula-Based Framework for Multivariate Count Time Series with Mixed Marginal Distributions. Stats 2026, 9, 57. https://doi.org/10.3390/stats9030057

