To illustrate the use of the LIPD, we analyze three real datasets in this section. The first is the well-known intervened dataset (student enrollment) used in [8] to demonstrate the performance of the IGPD. The second is a COVID-19 dataset, and the third is the NHIS dataset. The shape of the HRF for each dataset was determined using a graphical method based on the total time on test (TTT). The associated HRF is decreasing, increasing, bathtub-shaped, or upside-down bathtub-shaped if the empirical TTT plot is convex, concave, convex then concave, or concave then convex, respectively (see [28]). We used the RStudio software for the numerical evaluations of these datasets.
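The empirical scaled TTT transform behind this graphical check can be sketched as follows. This is a minimal Python sketch under stated assumptions, not the RStudio code used for the analysis: the function names `scaled_ttt` and `side_of_diagonal` are hypothetical, the toy sample is made up, and the above/below-diagonal check is only a crude numerical proxy for reading concavity or convexity off the plot.

```python
def scaled_ttt(sample):
    """Return the points (r/n, T(r/n)) of the empirical scaled TTT transform.

    T(r/n) = (sum of the r smallest values + (n - r) * x_(r)) / (sum of all values),
    where x_(r) is the r-th order statistic.
    """
    xs = sorted(sample)
    n = len(xs)
    total = sum(xs)
    points = [(0.0, 0.0)]
    running = 0.0
    for r, x in enumerate(xs, start=1):
        running += x
        points.append((r / n, (running + (n - r) * x) / total))
    return points


def side_of_diagonal(points, tol=1e-12):
    """Crude proxy (an assumption, not a formal test): 'above' hints at a concave
    plot (increasing HRF), 'below' at a convex plot (decreasing HRF), and
    'mixed' at a bathtub-type shape."""
    interior = points[1:-1]
    if all(t >= u - tol for u, t in interior):
        return "above"
    if all(t <= u + tol for u, t in interior):
        return "below"
    return "mixed"


# Hypothetical toy sample, used only to show the call pattern.
toy = [1, 2, 3]
pts = scaled_ttt(toy)
print(pts[-1])                # the curve ends at (1.0, 1.0)
print(side_of_diagonal(pts))
```

Plotting the returned points against the diagonal of the unit square gives the TTT plot; the shape rule quoted above is then applied visually.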
9.1. Student Enrollment Data
The following data were collected over a five-year period regarding student enrollment, in particular senior Mathematics and Statistics courses at the University of Calgary: 1, 2, 3, 3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5, 6, 6, 6, 6, 7, 7, 7, 7, 8, 8, 9, 9, 9, 9, 9, 13, 13, 14, 16, 16, 17, 17, 17, 18, 19, 20, 24, 24, 24, 24, 27, 31, 33, 35, 37.
These data are available in [4], and the author of [8] examined them as well. Students enrolled in these courses either during an advanced registration period, which was restricted, or during a subsequent open registration period, which was unrestricted. A course was offered only if at least one student enrolled in it during the advanced registration period, so zero counts cannot occur. Therefore, it is appropriate to consider the LIPD model.
Table 3 shows the descriptive measures of the data, which include the sample size n, minimum (min), first quartile (Q1), median (Md), third quartile (Q3), maximum (max), and interquartile range (IQR). The empirical InD of the data is equal to
. The data are therefore overdispersed, and the LIPD used to describe them is capable of accommodating this overdispersion.
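The index of dispersion (variance-to-mean ratio) can be recomputed directly from the enrollment counts listed above. The following is a minimal Python sketch; the function name `index_of_dispersion` is hypothetical, and the use of the unbiased (n - 1) sample variance is an assumption, since the text does not state which variance estimator was used.

```python
from statistics import mean, variance

# Student enrollment counts as listed in the text.
enrollment = [
    1, 2, 3, 3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5,
    6, 6, 6, 6, 7, 7, 7, 7, 8, 8, 9, 9, 9, 9, 9, 13, 13, 14, 16, 16,
    17, 17, 17, 18, 19, 20, 24, 24, 24, 24, 27, 31, 33, 35, 37,
]


def index_of_dispersion(sample):
    """Variance-to-mean ratio, using the (n - 1) sample variance (an assumption)."""
    return variance(sample) / mean(sample)


ind = index_of_dispersion(enrollment)
print(f"n = {len(enrollment)}, InD = {ind:.4f}")
# A ratio above 1 indicates overdispersion relative to the Poisson distribution.
print("overdispersed" if ind > 1 else "not overdispersed")
```

The same function can be applied unchanged to the other count datasets in this section.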
In addition, Figure 5 shows an empirical TTT plot of the data, which reveals an increasing HRF.
To demonstrate the LIPD’s potential benefit, the distributions given in Table 4 were considered for comparison.
We compare the competing distributions with the proposed distribution using the negative log-likelihood (−logL), Akaike information criterion (AIC), Bayesian information criterion (BIC), and
statistic. The corresponding MLEs and goodness-of-fit (GOF) results are shown in Table 5. According to Table 5, the LIPD’s GOF statistic values are lower than those of the other distributions under consideration. Therefore, among the models compared, the proposed model is the best choice for analyzing this intervention dataset.
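The information criteria used in this comparison follow directly from the minimized negative log-likelihood: AIC = 2k + 2(−logL) and BIC = k ln(n) + 2(−logL), where k is the number of parameters and n the sample size. The Python sketch below illustrates the computation. Because the LIPD pmf is not reproduced in this section, a zero-truncated Poisson model is swapped in purely as a stand-in; the function names and the toy sample are hypothetical.

```python
import math


def aic(neg_loglik, k):
    """Akaike information criterion from the minimized negative log-likelihood."""
    return 2 * k + 2 * neg_loglik


def bic(neg_loglik, k, n):
    """Bayesian information criterion from the minimized negative log-likelihood."""
    return k * math.log(n) + 2 * neg_loglik


def ztp_mle(sample):
    """MLE of lambda for a zero-truncated Poisson (stand-in model, not the LIPD).

    Solves lambda / (1 - exp(-lambda)) = sample mean by bisection;
    requires a sample mean greater than 1.
    """
    m = sum(sample) / len(sample)
    lo, hi = 1e-9, m  # the root lies in (0, m) because lambda / (1 - e^-lambda) > lambda
    for _ in range(200):
        mid = (lo + hi) / 2
        if mid / (1 - math.exp(-mid)) < m:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def ztp_neg_loglik(lam, sample):
    """Negative log-likelihood of the zero-truncated Poisson."""
    return -sum(
        x * math.log(lam) - lam - math.lgamma(x + 1) - math.log1p(-math.exp(-lam))
        for x in sample
    )


# Hypothetical toy counts (all >= 1), used only to show the call pattern.
toy = [1, 2, 3]
lam_hat = ztp_mle(toy)
nll = ztp_neg_loglik(lam_hat, toy)
print(lam_hat, nll, aic(nll, k=1), bic(nll, k=1, n=len(toy)))
```

Lower AIC and BIC values indicate a better trade-off between fit and model complexity, which is the sense in which the tabulated values are compared.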
In the case of the GLRT, the calculated value based on the test statistic in (19) is
(
p-value
). As a result, at any level
, the null hypothesis is rejected in favor of the alternative hypothesis. Hence, we conclude that the additional parameter
in the LIPD is significant in light of the test procedure outlined in Section 6.
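The mechanics of such a likelihood ratio test can be sketched as follows. The statistic in (19) is not reproduced in this section, so the sketch assumes the standard form, twice the difference between the negative log-likelihoods of the null and alternative models, with a chi-square reference distribution on one degree of freedom for the single additional parameter. The function name `lrt_one_df` and the input values are hypothetical.

```python
import math


def lrt_one_df(neg_loglik_null, neg_loglik_alt):
    """Likelihood ratio statistic and chi-square(1) p-value (assumed standard form).

    For one degree of freedom, P(X > s) = erfc(sqrt(s / 2)).
    """
    stat = 2.0 * (neg_loglik_null - neg_loglik_alt)
    stat = max(stat, 0.0)  # guard against tiny negative values from numerical error
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical negative log-likelihoods, chosen so the statistic sits at the 5% critical value.
stat, p_value = lrt_one_df(100.0, 100.0 - 3.841459 / 2)
print(stat, p_value)
```

The null hypothesis is rejected at level α when the p-value is below α.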
9.2. COVID-19 Dataset
In 2019, a novel coronavirus (COVID-19) was identified in Wuhan, China, and it subsequently spread rapidly. To stop the virus from spreading further, health service agencies took various preventive actions (treatment interventions), which helped to control the spread to some extent. Here, we consider a dataset of daily newly reported COVID-19 cases from Rwanda in East Africa, recorded between 11 October 2021 and 15 December 2021. Since the data were collected at the end of 2021, we may assume that several treatment interventions had already been applied; hence, it is reasonable to assume the LIPD for the dataset. The data are as follows: 98, 46, 95, 86, 61, 80, 61, 17, 36, 32, 39, 36, 37, 29, 20, 57, 43, 42, 51, 63, 53, 17, 29, 38, 55, 34, 44, 33, 16, 31, 22, 19, 28, 35, 43, 11, 12, 16, 19, 7, 6, 11, 10, 15, 23, 22, 26, 8, 14, 5, 14, 5, 13, 19, 10, 13, 10, 15, 20, 15, 53, 39, 28, 50, 79, 50.
These data are accessible at http://covid19.who.int/data (accessed on 24 August 2022).
Table 6 shows the descriptive measures of the data. The empirical InD of the data is equal to
. The data are therefore overdispersed, and the LIPD used to describe them is capable of accommodating this overdispersion.
In addition, Figure 6 shows an empirical TTT plot of the data, which reveals an increasing HRF.
We compare the competing distributions with the proposed distribution using the −logL, AIC, BIC, and
statistic. Table 7 displays the corresponding MLEs and GOF results. The LIPD’s GOF statistic values are lower than those of the other distributions examined. As a result, among the models compared, the proposed model is the most appropriate for modeling this intervention dataset. Furthermore, the LIPD provides information on how effective the various preventive actions taken by health service agencies were.
In the case of the GLRT, the calculated value based on the test statistic in (19) is
(
p-value
). As a result, at any level
, the null hypothesis is rejected in favor of the alternative hypothesis. Hence, we conclude that the additional parameter
in the LIPD is significant in light of the test procedure outlined in Section 6.
9.3. National Health Insurance Scheme
Data from the NHIS, containing no zero counts, were collected from a health facility in Ota, Ogun State, Nigeria, for this study. A sample of 1647 patients under the NHIS was obtained from July 2016 to July 2017. The response variable is the number of encounters (visits to the doctor), denoted Nencounter. The variable Eclass indicates whether a patient was ever admitted during the period (in-patient = 1, out-patient = 0). The predictor follow-up indicates whether a patient received routine checkups (follow-up = 1, no follow-up = 0). The variable sex records the gender of the patient (male = 1, female = 0). Another predictor is Ndiagnosis, which represents the number of diagnoses a patient had during the period of the encounter. The last predictor is the biological age of the patient. The data can be found on the Mendeley Elsevier website, https://data.mendeley.com/datasets/z7wznk53cf/8 (accessed on 15 November 2022). Furthermore, the dataset was used in [29], where the authors found, following a dispersion test, that the data were underdispersed with some dispersion parameter,
. For
, the fitted non-linear regression model is given by
The following regression models were used for comparison with the LIPRM:
The ZTNBRM created in [31].
The IPRM elaborated in [20].
Table 8 compares the performance of the LIPRM with that of the ZTPRM, ZTGPRM, and IPRM and reports summaries for the real data, such as the standard errors (SEs), p-values, negative log-likelihood (−logL), AIC, and BIC values. According to this table, the LIPRM has the lowest values across all model selection criteria, indicating that it is the best count regression model when compared with the ZTPRM, ZTGPRM, and IPRM. We also find that our model, applied here to explain the current dataset, is well suited to handling this underdispersion.