1. Introduction
Point processes are mathematical models which represent random events occurring in time or space. These processes are commonly used in various fields such as telecommunications, ecology, seismology, and finance, where events are randomly distributed and their temporal or spatial patterns are of particular interest. The most extensively studied class of point processes is the temporal Poisson process, known for its simplicity and utility in modeling events that occur independently over time. The homogeneous Poisson process assumes a constant event rate (intensity), while the inhomogeneous Poisson process accounts for a time-varying intensity function. By contrast, both methodological developments and inferential tools for more general temporal point processes are relatively scarce, and formal hypothesis testing procedures for comparing point processes are particularly limited.
Poisson processes have a rich history in statistical modeling, and traditional methods focus heavily on mathematical frameworks, likelihood representations, model estimation, and simulations [1]. One key feature of Poisson processes is their independence from history, which means they can be fully characterized by a deterministic intensity function $\lambda(t)$. This intensity function dictates the rate of event occurrence over time. If $\lambda(t)$ is constant, the process is homogeneous; otherwise, it is inhomogeneous. While the study of homogeneous Poisson processes is well established, comparing the intensities of inhomogeneous Poisson processes remains an area with many unexplored challenges.
Most classical studies focused on comparing homogeneous Poisson processes [2,3]. These approaches are equivalent to comparing the means of two Poisson distributions. More recent advancements, including the E-test [4] and parametric bootstrap tests [5], have extended these methods. However, much of the literature still concentrates on the homogeneous case, with fewer studies addressing the more complex inhomogeneous Poisson processes. Some notable exceptions include investigating the ratio of intensities [6] and normalized intensities [7]. In this paper, we propose a three-step procedure to compare two inhomogeneous Poisson processes. The first test compares the magnitude of the two processes (their total event counts), and the second compares the distribution of event times. The results from both tests are then combined using Fisher’s combined probability test [8], providing an overall p-value to assess whether the two intensity functions are equal.
Beyond Poisson processes, more complex temporal point processes, such as the Hawkes process, introduce additional challenges. The Hawkes process [9] is a “self-exciting” process in which each event increases the probability of future events, making it highly suitable for modeling clustered or cascading events, such as earthquakes, trade orders, or bank defaults. The intensity function of a Hawkes process is not only time-dependent but is also influenced by past events, allowing the process to capture dependencies between events. This flexibility makes the Hawkes process a valuable tool in modeling phenomena where events are interdependent.
Parametric tests are suitable for processes with a specific form of the conditional intensity functions, such as the Poisson and Hawkes processes, since the parameters can be estimated using conventional log-likelihood. We introduce parametric tests based on maximum likelihood estimation for such processes, taking advantage of their well-defined likelihood structures to conduct parameter estimation and hypothesis testing.
In cases where conditional intensity assumptions are inappropriate or unknown, nonparametric methods provide an alternative. A prominent nonparametric approach is the Isometric Log-Ratio (ILR) transformation [10], which maps compositional data, such as inter-event times in point processes, from a constrained simplex space to an unconstrained Euclidean space. This transformation enables the application of multivariate statistical techniques to compare the transformed data, overcoming the limitations imposed by the compositional nature of the original data.
To assess the similarity of point processes based on their transformed data, we employ depth-based hypothesis testing. Several depth-based tests have been proposed for multivariate distributions, including depth versus depth plots [11] and multivariate spacings [12]. In this paper, we implement a quality index Q based on data depth [13] to compare the distributions of two groups of point processes. The quality index is defined as the probability that a random sample from one distribution has a smaller depth with respect to the other distribution. A Q-value of 0.5 is expected when the two distributions are identical, while values deviating from 0.5 suggest differences in location or scale.
The structure of this paper is as follows. Section 2 introduces the hypothesis tests for comparing Poisson processes, including the proposed three-step procedure for inhomogeneous processes. Section 3 discusses parametric tests based on maximum likelihood estimation, with a focus on point processes such as the Hawkes process. Section 4 presents nonparametric tests using the ILR transformation and depth-based methods for comparing general point processes. In Section 5, we provide simulation studies to demonstrate the effectiveness of the proposed approaches under various scenarios, and we apply the methods to real-world data to showcase their practical relevance. Finally, Section 6 concludes the paper with a summary of the key findings and outlines potential directions for future research. Mathematical details are given in the Appendices.
2. A Three-Step Procedure for Comparing Poisson Processes
Poisson processes are classical stochastic processes, and their intensity functions play a critical role in determining the behavior of the process. In this section, we propose a three-step hypothesis test to compare the intensity functions of two inhomogeneous Poisson processes. This method extends the conventional comparison framework by considering both the total intensities and the normalized intensities of the processes. The combined test allows for a more detailed comparison of Poisson processes in various practical scenarios.
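In practice, a process is often observed on a finite time domain, so we restrict our study to a finite interval $[0, T]$. Assume the intensities of two Poisson processes are $\lambda_1(t)$ and $\lambda_2(t)$, respectively. We are interested in testing
$$H_0: \lambda_1(t) = \lambda_2(t) \ \text{for all } t \in [0, T] \quad \text{versus} \quad H_1: \lambda_1(t) \neq \lambda_2(t) \ \text{for some } t \in [0, T].$$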
The first step of the test examines whether the total intensities, denoted by $\Lambda_1 = \int_0^T \lambda_1(t)\,dt$ and $\Lambda_2 = \int_0^T \lambda_2(t)\,dt$, of the two processes are equal. A z-test is used to compare the two intensities based on large sample properties. If the total intensities are found to be equal, the second step compares the normalized intensities, or densities, $\lambda_1(t)/\Lambda_1$ and $\lambda_2(t)/\Lambda_2$, using the Kolmogorov–Smirnov test to assess whether the distributions of the event times are the same. Finally, the results from both tests are combined using Fisher’s combined probability test [8] to form an overall p-value, which evaluates the hypothesis that the two Poisson processes have the same intensity function.
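Suppose $N(t)$ is an inhomogeneous Poisson process with an intensity function $\lambda(t)$. Given that $N(T) = K$, the unordered K arrival times $t_1, \dots, t_K$ are independent and share a common density function
$$f(t) = \frac{\lambda(t)}{\Lambda}, \quad t \in [0, T], \tag{1}$$
where $\Lambda = \int_0^T \lambda(t)\,dt$ is the total intensity.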
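The conditional property above suggests a simple way to simulate a realization: draw the count from a Poisson distribution with mean equal to the total intensity, then draw that many independent times from the normalized intensity. The sketch below is illustrative only; the function name, the rejection sampler, and the example intensity $2 + 3t$ on $[0, 2]$ are assumptions, not part of the paper.

```python
import random

def simulate_inhomogeneous_poisson(intensity, lam_max, total, T, rng):
    """Draw one realization on [0, T] using the conditional property.

    intensity: function lambda(t), assumed bounded above by lam_max on [0, T]
    total:     total intensity, i.e. the integral of lambda over [0, T]
               (supplied by the caller in this sketch)
    """
    # Step 1: K ~ Poisson(total), obtained by counting the arrivals of a
    # unit-rate Poisson process that fall before time `total`.
    k, acc = 0, rng.expovariate(1.0)
    while acc < total:
        k += 1
        acc += rng.expovariate(1.0)
    # Step 2: K i.i.d. times with density lambda(t) / total, by rejection sampling.
    times = []
    while len(times) < k:
        t = rng.uniform(0.0, T)
        if rng.uniform(0.0, lam_max) <= intensity(t):
            times.append(t)
    return sorted(times)

rng = random.Random(0)
# Hypothetical example: lambda(t) = 2 + 3t on [0, 2], so the total intensity is 10.
realizations = [
    simulate_inhomogeneous_poisson(lambda t: 2 + 3 * t, 8.0, 10.0, 2.0, rng)
    for _ in range(2000)
]
mean_count = sum(len(r) for r in realizations) / len(realizations)
```

With many realizations, the average count should be close to the total intensity, which gives a quick sanity check on the simulator.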
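Consider two inhomogeneous Poisson processes, $N_1(t)$ and $N_2(t)$, with intensity functions $\lambda_1(t)$ and $\lambda_2(t)$, respectively. The number of events in the interval $[0, T]$ for each process, denoted as $N_i(T)$, follows a Poisson distribution with mean $\Lambda_i = \int_0^T \lambda_i(t)\,dt$, $i = 1, 2$. Specifically, for any integer $k \geq 0$, the probability mass function is
$$P\big(N_i(T) = k\big) = \frac{\Lambda_i^{k}\, e^{-\Lambda_i}}{k!}.$$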
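Let $S_1 = \{s_{1,1}, \dots, s_{1,n_1}\}$ and $S_2 = \{s_{2,1}, \dots, s_{2,n_2}\}$ be two sets of independent realizations from the first and second processes, respectively, where $n_1$ and $n_2$ denote the number of realizations. The numbers of events in these realizations are denoted as $K_{1,1}, \dots, K_{1,n_1}$ and $K_{2,1}, \dots, K_{2,n_2}$. Each $K_{1,j} \sim \text{Poisson}(\Lambda_1)$ and $K_{2,j} \sim \text{Poisson}(\Lambda_2)$ for $j = 1, \dots, n_1$ and $j = 1, \dots, n_2$, respectively. Furthermore, each realization consists of event times in strictly increasing order, represented as:
$$s_{i,j} = \big(t_{i,j,1}, \dots, t_{i,j,K_{i,j}}\big), \quad 0 \leq t_{i,j,1} < \dots < t_{i,j,K_{i,j}} \leq T,$$
where $t_{i,j,k}$ denotes the k-th event time in the j-th realization of the i-th process, with $i = 1, 2$, $j = 1, \dots, n_i$, and $k = 1, \dots, K_{i,j}$.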
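To test whether the intensity functions are equal, i.e.,
$$H_0: \lambda_1(t) = \lambda_2(t) \ \text{for all } t \in [0, T],$$
we conduct two independent subtests. The first subtest compares the total intensities of the two processes:
$$H_0^{(1)}: \Lambda_1 = \Lambda_2.$$
We apply a z-test with the following test statistic
$$Z = \frac{\bar{K}_1 - \bar{K}_2}{\sqrt{\bar{K}_1/n_1 + \bar{K}_2/n_2}},$$
where $\bar{K}_i = \frac{1}{n_i}\sum_{j=1}^{n_i} K_{i,j}$ for $i = 1, 2$. Under the null hypothesis, Z asymptotically follows a standard normal distribution $N(0, 1)$. The p-value for this subtest is given by
$$p_1 = 2\big(1 - \Phi(|Z|)\big),$$
where $\Phi$ is the cumulative distribution function (c.d.f.) of $N(0, 1)$.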
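The second subtest compares the distributions of the arrival times by testing the normalized intensities:
$$H_0^{(2)}: f_1(t) = f_2(t) \ \text{for all } t \in [0, T], \quad \text{where } f_i(t) = \lambda_i(t)/\Lambda_i.$$
Based on the conditional property described in Equation (1), for any realization $s_{i,j}$, the unsorted event times $t_{i,j,1}, \dots, t_{i,j,K_{i,j}}$ are independently distributed with density function $f_i$. Let
$$\mathcal{T}_i = \bigcup_{j=1}^{n_i} \big\{t_{i,j,1}, \dots, t_{i,j,K_{i,j}}\big\}, \quad i = 1, 2,$$
where each set $\mathcal{T}_i$ contains $M_i = \sum_{j=1}^{n_i} K_{i,j}$ independent event times with density function $f_i$. As the sample sizes $n_1$ and $n_2$ increase, $M_i \to \infty$ almost surely. We use the two-sample Kolmogorov–Smirnov (K-S) test to compare the empirical distribution functions of $\mathcal{T}_1$ and $\mathcal{T}_2$, i.e., to compare $f_1$ and $f_2$.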
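The test statistic for the K-S test is
$$D = \sup_{t \in [0, T]} \big|\hat{F}_1(t) - \hat{F}_2(t)\big|,$$
where $\hat{F}_1$ and $\hat{F}_2$ are the empirical c.d.f.s of $\mathcal{T}_1$ and $\mathcal{T}_2$, respectively. The p-values for the K-S test can be derived from tables. If $M_1$ and $M_2$ are sufficiently large, $\sqrt{M_1 M_2/(M_1 + M_2)}\, D$ approximately follows the Kolmogorov distribution.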
After obtaining the p-values from both subtests, we calculate an overall p-value to compare the two intensity functions. Note that both subtests partially address the same hypothesis, namely whether $\lambda_1(t) = \lambda_2(t)$. Indeed, the first subtest compares whether the numbers of events in the two processes follow the same distribution, with null hypothesis $\Lambda_1 = \Lambda_2$. The second subtest compares whether the distributions of event times in the two processes are the same, with null hypothesis $f_1 = f_2$. If neither subtest rejects its null hypothesis, we have no evidence against the equality of the two intensity functions.
Since the p-values $p_1$ and $p_2$ are asymptotically independent when $n_1$ and $n_2$ are large, we apply Fisher’s method to combine them:
$$X = -2\big(\ln p_1 + \ln p_2\big),$$
where X has a chi-squared distribution with four degrees of freedom under the null hypothesis. The combined p-value, which is computed using the test statistic X, is the overall p-value for the test of $H_0: \lambda_1(t) = \lambda_2(t)$.
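As a worked illustration of the combination step, write $p_1$ and $p_2$ for the p-values of the two subtests. Because the chi-squared distribution with four degrees of freedom has the closed-form tail $P(\chi^2_4 > x) = e^{-x/2}(1 + x/2)$, the combined p-value can be written without tables. The numerical values below are hypothetical and chosen only to show the arithmetic.

$$
\begin{aligned}
X &= -2\ln(p_1 p_2), \\
p &= P(\chi^2_4 > X) = e^{-X/2}\Big(1 + \frac{X}{2}\Big) = p_1 p_2\,\big(1 - \ln(p_1 p_2)\big), \\
p_1 = 0.04,\ p_2 = 0.20 \ &\Rightarrow\ p_1 p_2 = 0.008,\quad p \approx 0.008 \times (1 + 4.828) \approx 0.047.
\end{aligned}
$$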
The overall test procedure for the comparison of the two inhomogeneous Poisson processes is given in the following Algorithm 1.
Algorithm 1. (Given two groups of Poisson realizations $S_1$ and $S_2$)
1. Count the number of events of every realization: $K_{1,1}, \dots, K_{1,n_1}$ and $K_{2,1}, \dots, K_{2,n_2}$;
2. Perform a z-test on the total intensities and compute the p-value $p_1$;
3. Merge all samples in each group: $\mathcal{T}_i = \bigcup_{j=1}^{n_i} s_{i,j}$, $i = 1, 2$;
4. Apply the two-sample Kolmogorov–Smirnov test to compare the distribution functions of $\mathcal{T}_1$ and $\mathcal{T}_2$ and compute the p-value $p_2$;
5. Use Fisher’s method to combine $p_1$ and $p_2$, and compute the overall p-value using the test statistic $X = -2(\ln p_1 + \ln p_2)$, where $X \sim \chi^2_4$.
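The steps of Algorithm 1 can be sketched in a few lines of standard-library Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the z-statistic uses an unpooled variance estimate, the K-S p-value uses the asymptotic Kolmogorov series rather than tables, and all function names are hypothetical.

```python
import math

def kolmogorov_sf(x, terms=100):
    """Asymptotic tail probability of the Kolmogorov distribution."""
    if x < 0.2:          # the tail probability is numerically 1 in this range
        return 1.0
    s = sum((-1) ** (j - 1) * math.exp(-2.0 * j * j * x * x)
            for j in range(1, terms + 1))
    return min(1.0, max(0.0, 2.0 * s))

def ks_statistic(a, b):
    """Two-sample K-S statistic: largest gap between the empirical c.d.f.s."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d

def three_step_test(group1, group2):
    """Return (p1, p2, overall p) for two lists of realizations (lists of times)."""
    n1, n2 = len(group1), len(group2)
    # Steps 1-2: z-test on the mean event counts (assumes at least one event overall).
    m1 = sum(len(r) for r in group1) / n1
    m2 = sum(len(r) for r in group2) / n2
    z = (m1 - m2) / math.sqrt(m1 / n1 + m2 / n2)
    p1 = math.erfc(abs(z) / math.sqrt(2.0))      # two-sided normal p-value
    # Steps 3-4: pool the event times and apply the two-sample K-S test.
    t1 = [t for r in group1 for t in r]
    t2 = [t for r in group2 for t in r]
    scale = math.sqrt(len(t1) * len(t2) / (len(t1) + len(t2)))
    p2 = kolmogorov_sf(scale * ks_statistic(t1, t2))
    # Step 5: Fisher's method; the chi-squared(4) tail has a closed form.
    x = -2.0 * (math.log(max(p1, 1e-300)) + math.log(max(p2, 1e-300)))
    return p1, p2, math.exp(-x / 2.0) * (1.0 + x / 2.0)

early = [[0.1, 0.2]] * 30                        # hypothetical toy data
late = [[0.7, 0.8, 0.9, 0.95, 0.99]] * 30
p1, p2, p_overall = three_step_test(early, late)
```

In the toy data the two groups differ in both count and timing, so all three p-values should be very small; comparing a group with itself should give p-values of 1.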
Remark 1.
The proposed three-step procedure can be naturally extended to compare multiple groups of inhomogeneous Poisson processes. Suppose we have K groups of independent Poisson realizations, each with an intensity function $\lambda_k(t)$, $k = 1, \dots, K$. The first subtest compares the total intensities of the K processes:
$$H_0^{(1)}: \Lambda_1 = \Lambda_2 = \dots = \Lambda_K.$$
For this step, we can use statistical tests such as Pearson’s chi-square test or the likelihood ratio test [14]. The second subtest compares the distributions of event times among the K Poisson processes:
$$H_0^{(2)}: f_1 = f_2 = \dots = f_K, \quad \text{where } f_k(t) = \lambda_k(t)/\Lambda_k.$$
The K-sample Anderson–Darling test [15] is appropriate for this step.
Finally, we still apply Fisher’s method to combine the p-values from the previous two steps to generate an overall p-value for the hypothesis:
$$H_0: \lambda_1(t) = \lambda_2(t) = \dots = \lambda_K(t) \ \text{for all } t \in [0, T].$$
Building upon this method for Poisson processes, we can also explore parametric approaches for more general point processes. The following section discusses the use of maximum likelihood estimation as a parametric method for comparing point processes.
3. Parametric Tests Based on the Maximum Likelihood Estimation
In this section, we explore hypothesis testing for realizations from general point processes. The maximum likelihood estimation (MLE) approach offers a robust parametric framework for comparing point processes. By estimating the unknown parameters of intensity functions, we can construct test statistics to evaluate whether two processes share the same underlying intensity.
Suppose two point processes have conditional intensity functions $\lambda(t \mid \mathcal{H}_t; \theta_1)$ and $\lambda(t \mid \mathcal{H}_t; \theta_2)$ in closed form, respectively, where $\mathcal{H}_t$ denotes the history up to time $t$, and $\theta_1$ and $\theta_2$ are unknown parameters. The null hypothesis $H_0: \theta_1 = \theta_2$ can be tested by first estimating the parameters $\theta_1$ and $\theta_2$ using MLE. Once the parameters are estimated, the Wald test is applied to test whether the two intensities are equal. The MLE approach enables the use of optimization methods such as the Newton–Raphson method to maximize the log-likelihood functions and obtain parameter estimates.
We also present two examples: the Hawkes process [9] and the Poisson process software reliability model [16]. While these examples demonstrate the feasibility of our method, the approach is applicable to any point process as long as MLE can be performed.
3.1. Tests Using Maximum Likelihood Estimation for Point Processes
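Given a point process realization with conditional intensity $\lambda(t \mid \mathcal{H}_t)$, where events $t_1 < t_2 < \dots < t_K$ are on a finite interval $[0, T]$, the likelihood L of this point process is [1]
$$L = \left[\prod_{k=1}^{K} \lambda(t_k \mid \mathcal{H}_{t_k})\right] \exp\left(-\int_0^T \lambda(t \mid \mathcal{H}_t)\,dt\right).$$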
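Consider two independent point processes with conditional intensity functions $\lambda(t \mid \mathcal{H}_t; \theta_1)$ and $\lambda(t \mid \mathcal{H}_t; \theta_2)$ on $[0, T]$, where $\theta_1$ and $\theta_2$ are p-dimensional parameter vectors. Assume that the realizations $S_1 = \{s_{1,1}, \dots, s_{1,n_1}\}$ have intensity $\lambda(t \mid \mathcal{H}_t; \theta_1)$, and $S_2 = \{s_{2,1}, \dots, s_{2,n_2}\}$ have intensity $\lambda(t \mid \mathcal{H}_t; \theta_2)$. We test the null hypothesis:
$$H_0: \theta_1 = \theta_2.$$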
For set $S_1$, suppose the likelihood of each independent realization $s_{1,j}$ is $L_{1,j}(\theta_1)$, $j = 1, \dots, n_1$. Then, the likelihood of the entire set is $L_1(\theta_1) = \prod_{j=1}^{n_1} L_{1,j}(\theta_1)$. Similarly, the likelihood of the set $S_2$ is denoted as $L_2(\theta_2)$. To maximize the log-likelihood, we calculate the gradients and Hessian matrices of $\log L_1$ and $\log L_2$. When the Hessian is well-defined, has a closed form, and is negative semi-definite, we can apply the Newton–Raphson method or other appropriate optimization techniques to obtain the MLE.
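Assuming MLEs $\hat{\theta}_1$ and $\hat{\theta}_2$ exist, their asymptotic distributions are
$$\hat{\theta}_i \overset{a}{\sim} N\big(\theta_i,\ I_i(\theta_i)^{-1}\big), \quad i = 1, 2,$$
as $n_i \to \infty$. The proof of asymptotic properties follows standard techniques and is provided in Appendix A.1.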
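The Wald test statistic for $H_0: \theta_1 = \theta_2$ is
$$W = (\hat{\theta}_1 - \hat{\theta}_2)^\top \Big[I_1(\hat{\theta}_1)^{-1} + I_2(\hat{\theta}_2)^{-1}\Big]^{-1} (\hat{\theta}_1 - \hat{\theta}_2),$$
where $I_i(\hat{\theta}_i)$ is the Fisher information matrix estimated by the negative Hessian
$$I_i(\hat{\theta}_i) = -\nabla^2_{\theta} \log L_i(\theta)\Big|_{\theta = \hat{\theta}_i}.$$
Under the null hypothesis, W follows an asymptotic $\chi^2_r$ distribution, where r is the dimension of $\theta_1$, that is, the number of parameters being tested.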
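To make the Wald construction concrete, the sketch below computes the statistic from two parameter estimates and their estimated information matrices (negative Hessians), assuming the usual form in which the covariance of the difference is the sum of the two inverse information matrices. The helper names are hypothetical, and the chi-squared tail is computed from the incomplete-gamma recurrence for integer degrees of freedom so that the example needs only NumPy. The toy example uses a homogeneous Poisson process with rate $\lambda$, for which the MLE is the total count divided by $nT$ and the negative Hessian at the MLE is $nT/\hat{\lambda}$.

```python
import math
import numpy as np

def chi2_sf(x, df):
    """Upper tail of the chi-squared distribution for integer df >= 1."""
    h = x / 2.0
    if df % 2 == 0:
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= h / i
            total += term
        return math.exp(-h) * total
    total = math.erfc(math.sqrt(h))
    for i in range(1, (df - 1) // 2 + 1):
        total += math.exp(-h) * h ** (i - 0.5) / math.gamma(i + 0.5)
    return total

def wald_test(theta1, theta2, info1, info2):
    """Wald statistic and p-value for H0: theta1 = theta2.

    info1, info2: estimated Fisher information matrices (negative Hessians
    of the two log-likelihoods evaluated at the MLEs).
    """
    d = np.atleast_1d(np.asarray(theta1, float) - np.asarray(theta2, float))
    cov = np.linalg.inv(np.atleast_2d(info1)) + np.linalg.inv(np.atleast_2d(info2))
    w = float(d @ np.linalg.solve(cov, d))
    return w, chi2_sf(w, d.size)

# Hypothetical data: 20 realizations per group on [0, 1],
# with 200 events in total in group 1 and 300 in group 2.
n, T = 20, 1.0
lam1, lam2 = 200 / (n * T), 300 / (n * T)
w, p_value = wald_test(lam1, lam2, n * T / lam1, n * T / lam2)
```

For vector-valued parameters, such as those of a Hawkes process, the same function applies with the numerically evaluated Hessians in place of the scalar information values.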
In the remainder of this section, we present two examples of point processes for which MLE can be feasibly applied.
3.2. Example: Hawkes Process
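The Hawkes process models events where each occurrence increases the likelihood of subsequent events, such as earthquakes or trade orders. The conditional intensity function of a Hawkes process is [1]:
$$\lambda(t \mid \mathcal{H}_t) = \mu + \int_0^t g(t - s)\,dN(s),$$
where $\mu$ is the background intensity, and $g(\cdot)$ is the excitation function.
A widely used choice for the excitation function is an exponentially decaying function. In this case, $g(t) = \alpha e^{-\beta t}$ for $t > 0$, and the intensity function is given by
$$\lambda(t \mid \mathcal{H}_t) = \mu + \int_0^t \alpha e^{-\beta (t - s)}\,dN(s).$$
Given a realization of such a Hawkes process, $t_1 < t_2 < \dots < t_K$, on the interval $[0, T]$, the intensity function becomes
$$\lambda(t \mid \mathcal{H}_t) = \mu + \alpha \sum_{t_k < t} e^{-\beta (t - t_k)}.$$
The log-likelihood is given as [17]
$$\ell(\mu, \alpha, \beta) = \sum_{k=1}^{K} \log\Big(\mu + \alpha \sum_{j < k} e^{-\beta (t_k - t_j)}\Big) - \mu T - \frac{\alpha}{\beta} \sum_{k=1}^{K} \Big(1 - e^{-\beta (T - t_k)}\Big).$$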
We compute the partial derivatives and the Hessian matrix of the log-likelihood function to obtain the MLE of $\mu$, $\alpha$ and $\beta$. The gradients and Hessians are provided in Appendix A.2. Optimization techniques, such as the Newton–Raphson method and the Berndt–Hall–Hall–Hausman (BHHH) method, are employed to maximize the log-likelihood.
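As an illustration of the quantity being maximized, the following sketch evaluates the log-likelihood of a Hawkes process with exponential excitation, assuming a constant background rate `mu` and excitation `alpha * exp(-beta * t)`. It uses the standard recursion for the inner sum so that the cost is linear in the number of events. The function and parameter names are assumptions made for this example; the optimizer itself is not shown.

```python
import math

def hawkes_exp_loglik(times, T, mu, alpha, beta):
    """Log-likelihood of an exponential-kernel Hawkes process on [0, T].

    times: strictly increasing event times in [0, T]
    Intensity assumed: mu + alpha * sum_{t_j < t} exp(-beta * (t - t_j)).
    """
    loglik = -mu * T
    a = 0.0       # a_k = sum_{j < k} exp(-beta * (t_k - t_j)), updated recursively
    prev = None
    for t in times:
        if prev is not None:
            a = math.exp(-beta * (t - prev)) * (1.0 + a)
        loglik += math.log(mu + alpha * a)
        # Compensator contribution of the event at time t.
        loglik -= (alpha / beta) * (1.0 - math.exp(-beta * (T - t)))
        prev = t
    return loglik

# Hypothetical realization and parameter values.
example_times = [0.5, 1.0, 2.0]
example_value = hawkes_exp_loglik(example_times, 3.0, 2.0, 0.5, 1.2)
```

Setting `alpha` to zero reduces the expression to the homogeneous Poisson log-likelihood, which is a convenient correctness check before passing the function to a numerical optimizer.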
3.3. Example: Poisson Process Software Reliability Model
The maximum likelihood estimators for a general nonhomogeneous Poisson process do not always exist. Therefore, in our second example we focus on a software reliability model, a nonhomogeneous Poisson process on
with intensity function
, where the unknown parameters are
and
. The maximum likelihood estimations of
and
exist as
with probability 1 [
16]. For
, the log-likelihood becomes
Gradients and Hessians are computed to obtain the maximum likelihood estimator using numerical optimization methods. The gradients and Hessians are given as follows:
The Hessian matrix is provided by
4. Nonparametric Tests for General Point Processes
In many cases, parametric models require a known functional form for the intensity function. However, identifying an appropriate parametric form in practice can be challenging, especially when there is limited prior knowledge about the process. Even when a reasonable parametric form is available, the estimation process may encounter difficulties as an MLE for point processes may not always exist. For example, consider a Poisson process with intensity on , where is the parameter. The likelihood of a realization is , the likelihood as . Hence, alternative testing methods are needed in such cases.
In general, nonparametric tests for point processes are highly desirable in cases where no appropriate assumptions can be made about the intensity functions. To address this issue, we propose to apply the well-known Isometric Log-Ratio (ILR) transformation [10] to inter-event times of the processes, mapping them into a Euclidean space. This transformation allows us to use multivariate statistical tools to compare the distributions of these transformed points. The ILR-based method is flexible and can be applied to a wide range of point processes, making it an ideal choice when parametric assumptions are not suitable. Specifically, the test compares whether the multivariate distributions of transformed inter-event times from two processes are statistically different. This method is advantageous when the intensity functions of the processes are unknown or too complicated to estimate parametrically.
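We now consider tests for point processes without making assumptions about the intensity function. For a fixed time interval $[0, T]$, two groups of point processes are observed, denoted as $S_1$ and $S_2$. Suppose each process in $S_i$ has a conditional intensity function $\lambda_i(t \mid \mathcal{H}_t)$ for $i = 1, 2$. We aim to test the null hypothesis:
$$H_0: \lambda_1(t \mid \mathcal{H}_t) = \lambda_2(t \mid \mathcal{H}_t) \ \text{for all } t \in [0, T],$$
or equivalently, that the point processes in $S_1$ and $S_2$ follow the same probability distribution.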
However, directly conducting nonparametric tests to compare conditional intensity functions is challenging. To address this, we introduce the ILR transformation on inter-event times. This transformation maps data from a simplex space to Euclidean space, allowing us to transform point processes into points in Euclidean space and then compare the multivariate distributions of these points.
Given a cardinality $K$ in the time domain $[0, T]$, a point process can fully be characterized by its joint density function. Suppose the point processes in $S_i$ with cardinality $K$ have a joint density function $f_i(t_1, \dots, t_K \mid K)$ for $i = 1, 2$, and let $S_i^{(K)}$ denote the set of all point processes having cardinality K in $S_i$. Then, $S_i = \bigcup_{K \geq 0} S_i^{(K)}$ for $i = 1, 2$. Let $N_1$ and $N_2$ represent the cardinalities of the point processes in $S_1$ and $S_2$, respectively. The hypothesis in Equation (5) can be reformulated as:
- Condition 1. The cardinalities of point processes in both groups follow the same distribution: $P(N_1 = K) = P(N_2 = K)$ for all K.
- Condition 2. For any fixed cardinality K, the point processes in $S_1$ and $S_2$ have the same joint density function: $f_1(\cdot \mid K) = f_2(\cdot \mid K)$.
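Based on these conditions, we propose a series of subtests to compare $S_1$ and $S_2$:
$$H_0^{(0)}: P(N_1 = K) = P(N_2 = K) \ \text{for all } K, \qquad H_0^{(l)}: f_1(\cdot \mid k_l) = f_2(\cdot \mid k_l), \quad l = 1, \dots, m. \tag{6}$$
We choose cardinalities $k_1, \dots, k_m$ such that all corresponding subsets $S_1^{(k_l)}$ and $S_2^{(k_l)}$ have sufficiently large sample sizes.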
For the first subtest $H_0^{(0)}$, we compare the distributions of cardinalities using Pearson’s chi-square test for homogeneity. Let the cardinalities of the realizations in the two groups be $K_{1,1}, \dots, K_{1,n_1}$ and $K_{2,1}, \dots, K_{2,n_2}$. The detailed computation of this test is presented in [18].
For the subsequent subtests $H_0^{(l)}$, $l = 1, \dots, m$, we use a quality index based on data depth as proposed by [13]. However, since the point processes with a fixed cardinality K are in irregular spaces, we need to map them into regular Euclidean spaces to apply depth-based methods effectively. Let $\Omega$ denote the set of all point processes on $[0, T]$ and $\Omega_K$ denote the set of all point processes with cardinality K. Specifically, $\Omega_K = \{(t_1, \dots, t_K): 0 < t_1 < \dots < t_K < T\}$ for some non-negative integer K and hence $\Omega = \bigcup_{K=0}^{\infty} \Omega_K$. Since depth functions are not well-defined in irregular spaces like $\Omega_K$, we apply the ILR transformation to map the point processes into regular Euclidean spaces, enabling the use of depth-based methods for hypothesis testing.
4.1. Isometric Log-Ratio Transformation
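We use the notion of inter-event time (IET) representation for the ILR transformation [19]. Suppose $s = (t_1, \dots, t_K)$ is a point process with given cardinality K over the time domain $[0, T]$. This process can be equivalently represented using inter-event times. Let
$$u_k = t_k - t_{k-1}, \quad k = 1, \dots, K+1, \quad \text{with } t_0 = 0 \text{ and } t_{K+1} = T.$$
Since $\sum_{k=1}^{K+1} u_k = T$ and $u_k > 0$ almost surely for all $k$, any point process with cardinality K can be viewed as a vector in the simplex
$$\Delta^{K} = \Big\{(u_1, \dots, u_{K+1}): u_k > 0,\ \sum_{k=1}^{K+1} u_k = T\Big\}.$$
To perform statistical analysis, we use the ILR transformation to map the IET-represented point processes from the simplex $\Delta^{K}$ to unconstrained Euclidean space $\mathbb{R}^{K}$, allowing us to apply depth-based methods.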
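The ILR transformation is an isometry between $\Delta^{K}$ and $\mathbb{R}^{K}$ [10]. For a vector $u = (u_1, \dots, u_{K+1}) \in \Delta^{K}$, the ILR transformation is defined as [20]
$$\mathrm{ilr}(u) = \Psi \log\frac{u}{g(u)},$$
where $g(u) = \big(\prod_{k=1}^{K+1} u_k\big)^{1/(K+1)}$ is the geometric mean of the components of $u$, the logarithm is applied component-wise, and $\Psi$ is a matrix in $\mathbb{R}^{K \times (K+1)}$ satisfying $\Psi \Psi^\top = I_K$ and $\Psi^\top \Psi = I_{K+1} - \frac{1}{K+1}\mathbf{1}\mathbf{1}^\top$, where $I_K$ and $I_{K+1}$ are identity matrices, and $\mathbf{1}$ is a ($K+1$)-dimensional vector of ones. The inverse of the ILR transformation, given as $\mathrm{ilr}^{-1}: \mathbb{R}^{K} \to \Delta^{K}$, is
$$\mathrm{ilr}^{-1}(x) = T\,\frac{\exp(\Psi^\top x)}{\mathbf{1}^\top \exp(\Psi^\top x)}.$$
The matrix $\Psi$ is not unique. A common form, obtained via the Gram–Schmidt process, is given by [20]
$$\Psi_{i,\cdot} = \sqrt{\frac{i}{i+1}}\,\Big(\underbrace{\tfrac{1}{i}, \dots, \tfrac{1}{i}}_{i \text{ entries}},\ -1,\ 0, \dots, 0\Big), \quad i = 1, \dots, K.$$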
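Using the IET representation and ILR transformation, we establish bijective relationships between point process space $\Omega_K$, the simplex $\Delta^{K}$ and Euclidean space $\mathbb{R}^{K}$. We can now map two groups of point process realizations to vectors in $\mathbb{R}^{K}$, denoted as
$$X_i^{(K)} = \big\{\mathrm{ilr}(u(s)): s \in S_i^{(K)}\big\}, \quad i = 1, 2,$$
where $u(s)$ is the IET representation of $s$. Our goal is to compare the multivariate distributions of $X_1^{(K)}$ and $X_2^{(K)}$. Given a cardinality K, assume that the random vectors in $X_1^{(K)}$ have density function $h_1(\cdot \mid K)$, and those in $X_2^{(K)}$ have $h_2(\cdot \mid K)$. Then, the hypothesis in Equation (6) can be reformulated as
$$H_0^{(0)}: P(N_1 = K) = P(N_2 = K) \ \text{for all } K, \qquad H_0^{(l)}: h_1(x \mid k_l) = h_2(x \mid k_l), \tag{9}$$
where $l = 1, \dots, m$ and $x \in \mathbb{R}^{k_l}$. Next, we compare the multivariate distributions in $\mathbb{R}^{K}$ applying depth-based methods.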
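The IET representation and the ILR map can be sketched with NumPy as follows. This is an illustrative sketch, assuming the Helmert-type basis commonly obtained from the Gram–Schmidt construction and unnormalized inter-event times that sum to the interval length; the function names are hypothetical.

```python
import numpy as np

def helmert_basis(K):
    """K x (K+1) matrix with orthonormal rows, each orthogonal to the ones vector."""
    psi = np.zeros((K, K + 1))
    for i in range(1, K + 1):
        psi[i - 1, :i] = 1.0 / i
        psi[i - 1, i] = -1.0
        psi[i - 1] *= np.sqrt(i / (i + 1.0))
    return psi

def iet(times, T):
    """Inter-event times on [0, T], including the gaps to both end points."""
    return np.diff(np.concatenate(([0.0], np.asarray(times, float), [T])))

def ilr(u, psi):
    """ILR transform: psi @ log(u / g(u)), with g the geometric mean."""
    logu = np.log(u)
    return psi @ (logu - logu.mean())

def ilr_inverse(x, psi, T):
    """Map a Euclidean vector back to inter-event times summing to T."""
    w = np.exp(psi.T @ x)
    return T * w / w.sum()

# Hypothetical realization with K = 3 events on [0, 4].
u = iet([0.5, 1.2, 3.0], 4.0)
psi = helmert_basis(3)
x = ilr(u, psi)                  # a point in R^3
u_back = ilr_inverse(x, psi, 4.0)
```

The round trip recovers the original inter-event times, and the basis satisfies the two matrix identities stated for the ILR transformation. Event times are assumed to be distinct and strictly inside the interval so that every inter-event time is positive.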
4.2. Multivariate Distribution Test Based on Data Depth
Several depth-based tests have been proposed for multivariate distributions, including depth versus depth plots [11] and multivariate spacings [12]. In our approach, we implement a quality index Q, defined in [13], as follows:
$$Q(F, G) = P\big\{D_F(X) \leq D_F(Y) \mid X \sim F,\ Y \sim G\big\},$$
where F and G are two distribution functions on $\mathbb{R}^{d}$ for some $d \geq 1$, and $D_F(\cdot)$ is a data depth function with respect to distribution F. The value of Q lies in the interval $[0, 1]$, with $Q = 1/2$ indicating that $F = G$, and $Q \neq 1/2$ suggesting a location shift or scale difference. We use this index to test the hypothesis
$$H_0: F = G \quad \text{versus} \quad H_1: F \neq G$$
to determine whether F and G are statistically identical.
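Given two samples $\{X_1, \dots, X_{n_1}\}$ from F and $\{Y_1, \dots, Y_{n_2}\}$ from G, the empirical estimate of Q is
$$\hat{Q} = \frac{1}{n_2} \sum_{j=1}^{n_2} R(\hat{F}, Y_j),$$
where $R(\hat{F}, Y_j)$ is the proportion of $X_i$s that satisfy $D_{\hat{F}}(X_i) \leq D_{\hat{F}}(Y_j)$, and $D_{\hat{F}}$ is the empirical depth function with respect to the empirical distribution $\hat{F}$. When $D_{\hat{F}}$ is calculated using the Mahalanobis depth, and under certain conditions, the asymptotic behavior of $\hat{Q}$ under the null hypothesis is given by [13]
$$\left[\frac{1}{12}\Big(\frac{1}{n_1} + \frac{1}{n_2}\Big)\right]^{-1/2} \Big(\hat{Q} - \frac{1}{2}\Big) \xrightarrow{d} N(0, 1) \tag{10}$$
as $\min(n_1, n_2) \to \infty$.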
Now, consider the hypothesis in Equation (9). Given two point process samples $S_1$ and $S_2$ and a cardinality K, we extract two subsets $S_1^{(K)}$ and $S_2^{(K)}$. Applying the ILR transformation to these subsets yields two groups of vectors in $\mathbb{R}^{K}$, denoted as $X_1^{(K)}$ and $X_2^{(K)}$. Rather than directly comparing the point processes in $S_1^{(K)}$ and $S_2^{(K)}$, we compare the K-dimensional random vectors in $X_1^{(K)}$ and $X_2^{(K)}$ by calculating $\hat{Q}$.
Using the asymptotic distribution of $\hat{Q}$ from Equation (10), we can perform a large-sample z-test to determine whether $Q = 1/2$
. In practice, we recommend selecting
, as larger cardinalities may result in inaccurate estimates of Mahalanobis depth due to covariance matrix instability. Additionally, testing in high-dimensional settings requires a large number of samples to ensure reliable statistical inference.
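The depth-based comparison can be sketched as follows, assuming the Mahalanobis depth $1/(1 + d^2)$ with $d^2$ the squared Mahalanobis distance to the first sample, and a two-sided z-test based on the asymptotic variance $\frac{1}{12}(1/n_1 + 1/n_2)$. The function names, the two-sided choice, and the simulated Gaussian inputs are assumptions made for illustration.

```python
import math
import numpy as np

def mahalanobis_depth(points, reference):
    """Depth of each row of `points` with respect to the sample `reference`."""
    mu = reference.mean(axis=0)
    cov = np.atleast_2d(np.cov(reference, rowvar=False))
    diff = points - mu
    d2 = np.einsum("ij,ij->i", diff @ np.linalg.inv(cov), diff)
    return 1.0 / (1.0 + d2)

def quality_index_test(x, y):
    """Empirical quality index, z-statistic, and two-sided p-value.

    x, y: arrays of shape (n1, d) and (n2, d); depths are taken w.r.t. x.
    """
    x, y = np.asarray(x, float), np.asarray(y, float)
    dx = mahalanobis_depth(x, x)
    dy = mahalanobis_depth(y, x)
    # For each y_j: proportion of x_i whose depth does not exceed that of y_j.
    r = (dx[None, :] <= dy[:, None]).mean(axis=1)
    q = float(r.mean())
    z = (q - 0.5) / math.sqrt((1.0 / len(x) + 1.0 / len(y)) / 12.0)
    return q, z, math.erfc(abs(z) / math.sqrt(2.0))

rng = np.random.default_rng(0)
sample_f = rng.normal(size=(200, 2))              # hypothetical ILR vectors, group 1
sample_g = rng.normal(loc=3.0, size=(200, 2))     # shifted group 2
q_shift, z_shift, p_shift = quality_index_test(sample_f, sample_g)
```

In this toy setting the second sample lies far from the center of the first, so the index should be close to 0 and the p-value very small. The covariance inversion is also where large cardinalities can become unstable when the sample size is small relative to the dimension.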
4.3. Test Procedure
After obtaining the p-values of all $m + 1$ subtests in Equation (9), the final step is to combine these multiple comparisons into a single p-value. Fisher’s method [21] is a well-known approach for combining p-values, under the assumption that the individual tests are independent. However, in our context, the subtests $H_0^{(0)}, H_0^{(1)}, \dots, H_0^{(m)}$ are not independent due to shared data structures across the transformed point processes.
To address this dependency, we consider Bonferroni correction [22], which is effective under arbitrary dependence structures. However, Bonferroni correction can be overly conservative, particularly in cases where the number of comparisons is large. To overcome this, we use Holm’s modification [23], which provides a stepwise procedure to control the family-wise error rate while being less conservative than the traditional Bonferroni correction.
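Assume $p_1, \dots, p_m$ are p-values for m hypotheses $H_1, \dots, H_m$, and $\alpha$ is the significance level. Holm’s method involves sorting the p-values in ascending order, denoted by $p_{(1)} \leq p_{(2)} \leq \dots \leq p_{(m)}$, with corresponding hypotheses $H_{(1)}, H_{(2)}, \dots, H_{(m)}$. The stepwise procedure is detailed in Algorithm 2.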
Algorithm 2. Holm’s Method for Combining Hypotheses
1. If $p_{(1)} < \alpha/m$, reject $H_{(1)}$ and proceed to Step 2; otherwise, accept all $H_{(1)}, \dots, H_{(m)}$ and stop;
2. If $p_{(2)} < \alpha/(m-1)$, reject $H_{(2)}$ and proceed to Step 3; otherwise, accept $H_{(2)}, \dots, H_{(m)}$ and stop;
3. If $p_{(3)} < \alpha/(m-2)$, reject $H_{(3)}$ and proceed to Step 4; otherwise, accept $H_{(3)}, \dots, H_{(m)}$ and stop;
4. Continue this process, stop if a hypothesis is accepted;
5. If $p_{(m)} < \alpha$, reject $H_{(m)}$; otherwise, accept it and stop.
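Holm’s method adjusts the original p-values to corrected values:
$$\tilde{p}_{(i)} = \max_{j \leq i} \min\big\{(m - j + 1)\, p_{(j)},\ 1\big\}, \quad i = 1, \dots, m.$$
Since the global null hypothesis $H_0$ holds if all sub-hypotheses in Equation (9) are true, the overall p-value is
$$p = \min_{1 \leq i \leq m} \tilde{p}_{(i)} = \min\big\{m\, p_{(1)},\ 1\big\}.$$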
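Holm’s stepwise adjustment is short enough to sketch directly. The function names below are hypothetical; the adjusted values follow the standard step-down formula, and the overall p-value is taken as the smallest adjusted value, which is the level at which the first step of the procedure would begin to reject.

```python
def holm_adjust(pvalues):
    """Holm-adjusted p-values, returned in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running = 0.0
    for rank, idx in enumerate(order):
        # The (rank+1)-th smallest p-value is multiplied by m - rank, capped
        # at 1, and forced to be non-decreasing along the sorted order.
        running = max(running, min(1.0, (m - rank) * pvalues[idx]))
        adjusted[idx] = running
    return adjusted

def holm_overall_pvalue(pvalues):
    """Overall p-value for the global null: the smallest adjusted p-value."""
    return min(holm_adjust(pvalues))

# Hypothetical subtest p-values.
adjusted = holm_adjust([0.01, 0.04, 0.03])
overall = holm_overall_pvalue([0.01, 0.04, 0.03])
```

For these three values the adjusted p-values are approximately 0.03, 0.06, and 0.06, so the global null would be rejected at the 5% level because the smallest adjusted value is below 0.05.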
Our test procedure is summarized in Algorithm 3.
Algorithm 3.
Input: two groups of realizations $S_1$ and $S_2$
1. Count the cardinality of each realization, denoted as $K_{1,1}, \dots, K_{1,n_1}$ and $K_{2,1}, \dots, K_{2,n_2}$;
2. Use the chi-square test to compare the distributions of $\{K_{1,j}\}$ and $\{K_{2,j}\}$, and obtain the p-value $p_0$;
3. Let $S_i^{(K)}$ denote the set of all point processes with cardinality K in $S_i$, and choose cardinalities $k_1, \dots, k_m$ such that the corresponding subsets $S_1^{(k_l)}$ and $S_2^{(k_l)}$ have sufficiently large sample sizes;
4. For each $k_l$, apply the IET representation and ILR transformation to map the realizations in $S_1^{(k_l)}$ and $S_2^{(k_l)}$ into Euclidean space $\mathbb{R}^{k_l}$, yielding two groups of $k_l$-dimensional vectors, denoted as $X_1^{(k_l)}$ and $X_2^{(k_l)}$;
5. For each $k_l$, compute the quality index $\hat{Q}$, test the hypothesis $H_0^{(l)}$, and obtain the p-value $p_l$ for each comparison;
6. Collect all p-values $p_0, p_1, \dots, p_m$, and arrange them in ascending order as $p_{(0)} \leq p_{(1)} \leq \dots \leq p_{(m)}$. Using Holm’s correction, compute the overall p-value as $p = \min\{(m + 1)\, p_{(0)},\ 1\}$.
6. Summary and Future Work
In this paper, we proposed and examined three statistical hypothesis testing approaches for comparing point processes, particularly focusing on Poisson processes. The three methods include: (1) a three-step test designed specifically for inhomogeneous Poisson processes, (2) a parametric test based on maximum likelihood estimation for structured point processes such as Hawkes processes, and (3) a nonparametric approach utilizing the Isometric Log-Ratio transformation for general point processes.
The three-step test for inhomogeneous Poisson processes evaluates differences in total intensity and normalized intensity separately, combining results using Fisher’s method to obtain an overall p-value. Through extensive simulations, we demonstrated that this method effectively identifies both magnitude and distributional differences in event occurrences.
The MLE-based parametric test was applied to both Hawkes processes and a Poisson process model used in software reliability. By estimating intensity function parameters through numerical optimization, the test successfully detected differences between two processes in simulated experiments. The results confirmed that as the differences between intensity functions grow, the test reliably identifies deviations.
For more general point processes where a parametric form may not be suitable, we introduced a nonparametric ILR transformation-based test. By mapping inter-event times into a Euclidean space, we leveraged depth-based statistical methods to compare point process distributions. Simulation studies on both Hawkes and Poisson processes demonstrated that the ILR-based test effectively captures differences in the underlying event structure.
Additionally, we applied the ILR-based test to real spike-train recordings. We analyzed spike-time distributions across multiple experimental conditions, focusing on groups with similar or distinct pharmacological manipulations. The results showed that comparisons between conditions with similar neural activity patterns exhibited no significant differences in their spike timing, whereas conditions involving altered excitability displayed clear and statistically significant deviations.
While the proposed methods provide effective tools for comparing point processes, there are several directions for future research:
Extending the three-step procedure to multi-group comparisons: Our three-step test was primarily designed for comparing two Poisson processes. Extending this framework to compare multiple inhomogeneous Poisson processes simultaneously would be an interesting avenue for further study.
Robustness of MLE-based methods in more complex point process models: The MLE-based test performed well in Poisson and Hawkes processes, but further research could explore its effectiveness in more complex models, such as renewal processes or self-correcting point processes.
Optimization of ILR-based methods for small sample sizes: The ILR transformation provides a flexible framework for analyzing point processes, but its performance can be limited when sample sizes are small. Future research can explore ways to refine the transformation or incorporate additional statistical techniques to improve sensitivity and power in low-sample scenarios.
Application to other real-world datasets: While we applied our methods to neuronal spike-train data, similar techniques could be used to analyze other event-based data, such as financial transactions or earthquake occurrences.
Overall, our study provides a basic framework for statistical tests on point processes, and the proposed methods have demonstrated their effectiveness in both simulated and real-world settings. Future work will aim to refine and extend these methodologies to further improve their applicability and robustness in diverse domains.