Forecasting the Endemic/Epidemic Transition in COVID-19 in Some Countries: Influence of the Vaccination

Objective: The objective of this article is to develop a robust method for forecasting the transition from endemic to epidemic phases in contagious diseases using COVID-19 as a case study. Methods: Seven indicators are proposed for detecting the endemic/epidemic transition: variation coefficient, entropy, dominant/subdominant spectral ratio, skewness, kurtosis, dispersion index and normality index. Principal component analysis (PCA) then provides a score, the first PCA component, built from the seven proposed indicators, and its forecasting performance is estimated from its ability to predict the entry into the epidemic exponential growth phase. Results: This score is applied to the retro-prediction of endemic/epidemic transitions of the COVID-19 outbreak in seven countries, for which the first PCA component has good predictive power. Conclusion: This research offers a valuable tool for early epidemic detection, aiding in effective public health responses.


Introduction
Problem Statement
This study aims to develop a novel method for predicting transitions between endemic and epidemic phases in contagious diseases, with a specific focus on COVID-19 dynamics. To predict qualitative changes in the dynamics of a contagious disease, it is not enough to have a good mathematical model of the mechanisms of contagion. It is also necessary to be able to predict, from observed data such as daily new cases, the occurrence of a new epidemic wave from a stationary endemic situation as defined by D. Bernoulli in 1766 [1,2].
Such an objective requires specific predictive statistical tools. Finding a reliable method for predicting the frontiers between stationary and nonstationary phases of a time series is a challenging problem. This objective is close to that of the stationarity rupture tests studied by statisticians for about forty years. Indeed, since the seminal work by J. Deshayes and D. Picard on stationarity rupture in time series [3,4], many works have dealt with stationarity breaking [5][6][7][8][9][10], the most recent using the concept of functional statistics, which considers observed curves of incidence or mortality as functions to be estimated in parametrized sets of functions [11][12][13][14][15][16][17].

Significance of the Research
Very few articles have considered forecasting the endemic/epidemic transition, and our approach differs from those in the literature. We intend to fill this gap by presenting a new method able to forecast the endemic/epidemic transition, taking the COVID-19 outbreak as an example. In [18], the authors exploited knowledge of past epidemics, namely at the level of the endemic/epidemic transitions (see Figure 1), to make predictions on their occurrence during the COVID-19 pandemic. In [19], the authors remarked that during the transition to the endemic phase, vaccination rates have lagged and that developed countries needed to boost vaccination rates globally.
The constraint of stationarity is crucial, as many time series forecasting models rely on stationarity to make modeling easy and results reliable. The main characteristic parameters of the empirical distribution of the random variables of a stationary time series (moments, coefficient of variation, entropy, etc.) remain constant, the randomness often coming from an additive Gaussian noise. In the event of a break in stationarity, there may be a sudden transition with a sudden change in the values of these parameters and the appearance of a non-constant trend. The problem of the existence of this transition arises acutely in the case of contagious diseases, which alternate stationary endemic periods and epidemic peaks with an exponential initial trend; these peaks must be predicted to prevent the spread of the disease from giving rise to a pandemic.

Prediction Approaches in the Literature
The prediction of epidemics is one of the major objectives of the mathematical modeling of the spread of infectious diseases. It can be achieved by the spatiotemporal continuation of the solutions of the partial differential equations of the chosen continuous model, realized through the extrapolation of a discrete statistical description of the evolution of the observed variables. The difficulty of predicting the evolution of a pandemic lies in the adaptive capacities of the infectious agent and of the infected and transmitting host. On the one hand, genetic mutations of the infectious agent, affecting its contagiousness and pathogenicity, give rise to highly infectious and weakly pathogenic variants, often signaling the natural end of a pandemic. On the other hand, the permanent adaptation strategy of the individual and collective host defenses makes it possible to anticipate the effects of changes in the agent's infectious strategy. In both cases, modeling the dynamics of mutation and prevention is essential to predict and act in near real-time on the evolution of a pandemic. We refer to [18][19][20][21][22][23][24][25][26][27][28][29][30][31][32] for more results and references on the topic of forecasting contagious diseases.

Methodology and Approach
In this article, we offer a method to estimate the breakdown of endemic stationarity based on seven parameters whose isolated or joint predictive power is analyzed. These parameters are the coefficient of variation and the entropy of the empirical measure calculated in a moving window, the ratio between the moduli of the dominant and subdominant eigenvalues of the nonstationary transition matrix, the third and fourth standardized moments of the empirical distribution (called skewness and kurtosis, respectively), and finally diversity and normality indices, quantifying the distance of the empirical distribution to the uniform and the normal distribution, respectively.
An epidemic corresponds to an unexpected increase in the number of disease cases in a specific location. Yellow fever, smallpox, measles, and polio are prime examples of epidemics. An epidemic disease does not necessarily have to be contagious. The rapid increase in obesity is also considered an epidemic: worldwide obesity nearly tripled between 1975 and 2016 and has been considered by WHO as a pandemic since 1997 [33].
A pandemic is characterized by exponential growth of the disease, when it concerns a continent or the entire world. This means the growth rate skyrockets, and each day cases grow in number more rapidly than the day prior. The declaration of a pandemic has nothing to do with virology, population immunity, or disease severity. It means that a virus covers a wide area, affecting several countries and populations [34]. A common example we experienced recently is the worldwide COVID-19 pandemic, caused by a highly contagious virus.
An endemic disease designates a disease constantly present within a population, at a usual level of prevalence and in a stable state. An epidemic can become endemic in one or both of the following cases: (i) loss of virulence of the pathogen; (ii) gradual elevation of specific antibodies in the affected population through repeated infections (which confers natural immunity) or regular vaccinations decided by public health authorities as a means of mitigation (artificial antibody induction). The second case decreases the population's susceptibility to infection and the severity of infection in individuals. The first refers to a decrease in the pathogenicity of the infectious agent, which could make it either less infectious, or less lethal, or both [19], making the infection clinically stable and less apparent. Over time, the infectious agent usually mutates and circulates at lower, more manageable levels due to the occurrence of a variant that is more contagious but less pathogenic. Then, a pandemic evolves into an endemic disease, the common example being influenza.

Overview of Epidemic/Endemic Transition: Example of Influenza
RNA viruses of influenza belong to the family Orthomyxoviridae, and the first serious influenza pandemic occurred in 1918, infecting about one third of the entire world's population until 1921 and killing between 24.7 and 39.3 million people [37]. Its dynamics have been well documented in some precise areas like New York City [38]. After its disappearance during the year 1921 in New Caledonia, the influenza virus mutated, resulting in descendant strains that still circulate, infecting millions of patients and causing globally between 294,000 and 518,000 deaths every winter [39,40]. Influenza has become a sporadic disease, that is, a disease with epidemics occurring when a new virus strain appears in the population, causing an antigenic drift [41]. Between these epidemics, the virus continues to circulate between individuals in an endemic fashion, causing an infection that becomes clinically less apparent, which makes influenza a classic example of an endemic disease.

Organization of the Article
In the following, we introduce in Section 2 the criteria used to define the breakdown of stationarity of the random variable equal to the daily new cases of a contagious disease. Section 3 presents the results of an application concerning the COVID-19 outbreak. These results are discussed in Section 4, followed by some perspectives in Section 5, devoted to the conclusion.

Data Description
We considered daily empirical COVID-19 case data in Japan, Nigeria, Cameroon, France, USA, and India. We chose countries in which either the level of economy (more or less developed), the quality of detection (by more or less systematic PCR), the vaccination policy (more or less generalized), or the dynamics of appearance of variants (more or less rapid) differed, in order to obtain a representative sample of the different possible histories of the disease. Three countries are developed countries while the others are developing countries. Three countries are among the seven most populated countries in the world, while the others have between 30 and 70 million inhabitants. Two variants of SARS-CoV-2 originated in India (Delta) and the USA (Epsilon), which makes the data sets interesting for gaining insight into the dynamics of the COVID-19 outbreak.
We used the daily case count to analyze the differences in disease spread and peaks among these countries. For all the countries considered, daily numbers of confirmed cases, deaths, and fully vaccinated people were extracted from the public databases Worldometer [35] and Our World in Data [36] from January 2020 to July 2022.
Figure 2 shows the time course of daily new and cumulative cases of COVID-19 for Japan, showing that since the initial stage of the epidemic, there were obvious differences between epidemic peaks. In this regard, we provide some explanations and insight to describe the observed phenomena in our analysis.

Stationarity Breakdown Criteria
The transition between the stationary endemic state of a contagious disease and an epidemic wave is studied by calculating three parameters in a moving window around the frontier on which we suspect that this transition occurred. These three parameters are the coefficient of variation, the entropy of the empirical distribution of the daily observed new cases of the disease, considered as a random variable N, and the ratio between the absolute values of the dominant and subdominant eigenvalues of the transition matrix ruling the growth of N.

Coefficient of Variation (CV)
The coefficient of variation of a random integer variable N valued in $\{n_1, \dots, n_d\}$ is defined as the ratio of the standard deviation σ(N) to the mean E(N) of the empirical distribution of N, i.e., the set of weights $p_i = \mathrm{Proba}(\{N = n_i\})$ of the histogram. Then, the classical formulas for the first moments (expectation E(N) and standard deviation σ(N)) and the coefficient of variation CV of the empirical distribution $\{p_i\}_{i=1,\dots,d}$ are

$$E(N) = \sum_{i=1}^{d} p_i n_i, \qquad \sigma(N) = \sqrt{\sum_{i=1}^{d} p_i \left(n_i - E(N)\right)^2}, \qquad CV = \frac{\sigma(N)}{E(N)}.$$
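As an illustration of the definition of CV, the following minimal sketch builds the histogram weights of a window of daily counts and derives the mean, the standard deviation and CV from them. The function names and the sample window are hypothetical and are not taken from the article.

```python
from collections import Counter
from math import sqrt


def empirical_distribution(counts):
    """Return the histogram weights {n_i: p_i} of a list of daily counts."""
    total = len(counts)
    return {value: freq / total for value, freq in Counter(counts).items()}


def coefficient_of_variation(counts):
    """CV = sigma(N) / E(N), computed from the histogram weights p_i."""
    dist = empirical_distribution(counts)
    mean = sum(p * n for n, p in dist.items())
    variance = sum(p * (n - mean) ** 2 for n, p in dist.items())
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return sqrt(variance) / mean


# Hypothetical 14-day window of daily new cases (illustrative values only).
window = [12, 15, 11, 14, 13, 12, 16, 15, 13, 12, 14, 13, 15, 14]
cv = coefficient_of_variation(window)
```

Working from the weights rather than from the raw sample mirrors the notation of the text; both routes give the same value.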

Empirical Entropy
The entropy E of the empirical distribution $\{p_i\}_{i=1,\dots,d}$ is defined as follows:

$$E = -\sum_{i=1}^{d} p_i \log p_i.$$

The entropy E is maximal, equal to log(d), when the empirical distribution is uniform, i.e., when all $p_i$ equal 1/d, and E is minimal, equal to 0, when only one $p_i$ equals 1.
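The two extreme cases stated for the entropy can be checked numerically. The sketch below assumes the natural logarithm and uses a hypothetical function name; it is only an illustration of the definition.

```python
from collections import Counter
from math import log


def empirical_entropy(counts):
    """Entropy -sum(p_i * log(p_i)) of the histogram of a list of counts."""
    total = len(counts)
    weights = [freq / total for freq in Counter(counts).values()]
    return -sum(p * log(p) for p in weights)


# Uniform histogram over d = 4 distinct values: entropy should equal log(4).
uniform_entropy = empirical_entropy([3, 5, 8, 9])

# A single observed value (one p_i equal to 1): entropy should equal 0.
constant_entropy = empirical_entropy([7, 7, 7, 7])
```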

Spectral Subdominant/Dominant Ratio
The Demongeot-Magal discrete equation of infectious dynamics is defined in [42], where S(t) and N(t) are the numbers of susceptible and infectious cases at day t. The transition matrix satisfies the Perron-Frobenius theory; hence, its spectrum contains a real positive dominant eigenvalue $\lambda_1$ and two complex conjugate subdominant eigenvalues of absolute value $|\lambda_2|$. The spectral subdominant/dominant ratio R is then defined from these two quantities by Equation (5).

Skewness
The skewness (Skew) of the empirical distribution $\{p_i\}_{i=1,\dots,d}$ of the random variable N is defined as its third standardized moment:

$$\mathrm{Skew} = \frac{\sum_{i=1}^{d} p_i \left(n_i - E(N)\right)^3}{\sigma(N)^3}.$$

Kurtosis
The kurtosis (Kurt) of the empirical distribution $\{p_i\}_{i=1,\dots,d}$ of the random variable N is defined as its fourth standardized moment:

$$\mathrm{Kurt} = \frac{\sum_{i=1}^{d} p_i \left(n_i - E(N)\right)^4}{\sigma(N)^4}.$$
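Skewness and kurtosis are the standardized moments of order 3 and 4, so a single helper can cover both. The sketch below averages over the raw sample, which is equivalent to weighting the distinct values by their histogram weights $p_i$; the function name is hypothetical, and the kurtosis is the plain fourth standardized moment (no subtraction of 3).

```python
from math import sqrt


def standardized_moment(counts, order):
    """Return E[(N - E(N))**order] / sigma(N)**order for a list of counts."""
    n = len(counts)
    mean = sum(counts) / n
    sigma = sqrt(sum((x - mean) ** 2 for x in counts) / n)
    if sigma == 0:
        raise ValueError("standardized moments are undefined for a constant sample")
    return sum((x - mean) ** order for x in counts) / n / sigma ** order


sample = [1, 2, 3, 4, 5]                  # symmetric, illustrative values
skewness = standardized_moment(sample, 3)  # third standardized moment
kurtosis = standardized_moment(sample, 4)  # fourth standardized moment
```

For this symmetric sample the skewness vanishes, while a window containing the start of an exponential rise would typically show a positive skewness.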

Index of Dispersion
The index of dispersion (ID) is defined by the following formula:

$$ID = \frac{\sigma^2(N)}{E(N)}.$$

ID equals 0 for a constant random variable N and 1 for a Poisson variable.
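The sketch below assumes that ID is the variance-to-mean ratio, which is the standard definition and the one consistent with the values 0 (constant variable) and 1 (Poisson variable) stated above. The function name is hypothetical; since the text later describes ID through a logarithm of this ratio, a log-scaled variant would only require wrapping the returned value.

```python
def index_of_dispersion(counts):
    """Variance-to-mean ratio of a list of daily counts (assumed definition of ID)."""
    n = len(counts)
    mean = sum(counts) / n
    if mean == 0:
        raise ValueError("ID is undefined when the mean is zero")
    variance = sum((x - mean) ** 2 for x in counts) / n
    return variance / mean


# A constant window has no dispersion at all.
id_constant = index_of_dispersion([10] * 14)

# Illustrative window with mean 5 and variance 4.
id_example = index_of_dispersion([2, 4, 4, 4, 5, 5, 7, 9])
```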

Normality Index
The normality index KStest is defined as the fitting criterion of the Kolmogorov-Smirnov goodness-of-fit test to the normal distribution, with E(N) and σ(N) as, respectively, the expectation and standard deviation of the empirical distribution of N.
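One way to read this fitting criterion is as the Kolmogorov-Smirnov distance between the empirical cumulative distribution and the normal distribution with the same mean and standard deviation. The sketch below computes that distance with the standard library only; the function names are hypothetical, and the article may use a different normalization or a p-value instead of the raw distance.

```python
from math import erf, sqrt


def normal_cdf(x, mu, sigma):
    """Cumulative distribution function of the normal law N(mu, sigma^2)."""
    return 0.5 * (1.0 + erf((x - mu) / (sigma * sqrt(2.0))))


def ks_normality_statistic(counts):
    """Largest gap between the empirical CDF and the fitted normal CDF."""
    n = len(counts)
    mu = sum(counts) / n
    sigma = sqrt(sum((x - mu) ** 2 for x in counts) / n)
    if sigma == 0:
        raise ValueError("the statistic is undefined for a constant sample")
    distance = 0.0
    for i, x in enumerate(sorted(counts), start=1):
        f = normal_cdf(x, mu, sigma)
        # The empirical CDF jumps from (i - 1)/n to i/n at the i-th sorted value.
        distance = max(distance, i / n - f, f - (i - 1) / n)
    return distance


ks_two_points = ks_normality_statistic([0, 2])
```

Because the mean and standard deviation are estimated from the same sample, the usual Kolmogorov-Smirnov critical values do not apply directly; the distance is used here only as a descriptive index.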

Principal Component Analysis
Principal component analysis (PCA) is an exploratory data analysis technique applied to real data, for example q variables observed for each individual of a population of size n (e.g., the observed COVID-19 new cases and deaths in the French population) [42][43][44]. Let us consider the q n-dimensional vectors $y_j$ made from these observations, which constitute the columns of a matrix denoted as Y, and calculate the linear combinations of the $y_j$'s that are orthogonal and have a variance decreasing with i:

$$\forall i = 1, \dots, q, \quad \sum_{j=1}^{q} a_{ji}\, y_j = Y a_i \qquad (9)$$

and

$$\operatorname{var}(Y a_i) = a_i^{T} Z a_i,$$

where $a_i$ is a vector commonly called the i-th eigenvector and Z is the covariance matrix associated with the real data. The q linear combinations $Y a_i$ are called principal components (PCs); the elements of the eigenvectors $a_i$ are called PC loadings, and the elements of $Y a_i$ are called PC scores, which are the values that each of the n individuals scores on the PCs [43]. The first principal component $Y a_1$ carries the most information in the principal component analysis.
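The first principal component can be obtained without any external library by power iteration on the covariance matrix Z. The following is a minimal sketch under stated assumptions: the data are centered but not rescaled, the function names are hypothetical, and the starting vector is assumed not to be orthogonal to the dominant eigenvector. It is not the implementation used in the article.

```python
from math import sqrt


def covariance_matrix(rows):
    """Return (Z, centered_rows) for n observations of q variables."""
    n, q = len(rows), len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(q)]
    centered = [[row[j] - means[j] for j in range(q)] for row in rows]
    cov = [[sum(c[j] * c[k] for c in centered) / (n - 1) for k in range(q)]
           for j in range(q)]
    return cov, centered


def first_principal_component(rows, iterations=500):
    """Return the dominant eigenvector a_1 of Z and the scores Y a_1."""
    cov, centered = covariance_matrix(rows)
    q = len(cov)
    a = [1.0 / sqrt(q)] * q  # starting vector (assumed not orthogonal to a_1)
    for _ in range(iterations):
        b = [sum(cov[j][k] * a[k] for k in range(q)) for j in range(q)]
        norm = sqrt(sum(x * x for x in b))
        if norm == 0:
            break
        a = [x / norm for x in b]
    scores = [sum(c[j] * a[j] for j in range(q)) for c in centered]
    return a, scores


# Illustrative data: the second variable is exactly twice the first.
rows = [(1, 2), (2, 4), (3, 6), (4, 8)]
loadings, scores = first_principal_component(rows)
```

In this toy example all the variance lies along the direction (1, 2), so the loadings are proportional to that vector and the scores are ordered like the observations.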

Construction of a Score
In practice, the prediction power of each of the breakdown parameters is different from the others and can be measured in a retro-analysis by calculating the regression coefficients between the daily new cases N(j) observed at day j and the parameters calculated on a temporal moving window of two weeks ending on day j.We can then either retain the parameter with the greatest predictive power or define a breakdown score equal, in a multiple polynomial regression of the daily number of cases observed on the break parameters, to the combination of parameters producing the minimum error.A way to obtain this score is also to use the first principal component of principal component analysis (PCA), which explains, in general, a sufficient percentage of the variance of the new case empirical distribution.

Choice of the Countries
The choice of the studied countries was guided by the search, on four continents (Africa, Asia, Europe and North America), for countries presenting complementary profiles to be compared in terms of mean Temperature (T), Elevation (E), Density (D), Median Age (M), R0, start date and exponential slope of the first and second waves of new cases of COVID-19, and percentage of the GDP dedicated to health expenditure. The selected countries are as follows: for Africa, Cameroon and Nigeria; for Asia, Japan and India; for Europe, France and UK; and for North America, USA. The values of these variables are given in Table A1 in Appendix A.

Coefficient of Variation (CV) during COVID-19 Outbreak
CV alone is not a reliable predictor of epidemic waves, because its trend varies among countries and waves. Figure 3 shows the variation of the coefficient of variation at the frontier between endemic and epidemic stages, but the direction of this variation differs largely between waves in the same country and between countries. For example, CV decreases during the first endemic/epidemic transition in the USA and India; in France, it also decreases before the third wave, but it increases during the fourth one (Figure 3 and Table 1).
Table 1. CV values for USA (1st and 4th waves) and India (1st, 2nd, 3rd, 4th waves) during epidemic waves of COVID-19 (after [23]).

Empirical Entropy in COVID-19 Outbreak
Entropy is calculated for the empirical distribution of daily new cases (Figure 4), and Figure 5 shows that entropy alone is not a good predictor of new case waves for France. At the start of the first wave in the USA (Figure 5), the entropy E is equal to

$$E = -\sum_{i=1}^{6} p_i \log p_i = -\left(p_1 \log p_1 + p_2 \log p_2 + p_3 \log p_3 + p_4 \log p_4 + p_5 \log p_5 + p_6 \log p_6\right) = 0.686.$$

Spectral Dominant/Subdominant Ratio in COVID-19 Outbreak
R alone cannot represent a reliable endemic-epidemic transition predictor. In Table 2, we see that the values of the spectral ratio R increase during epidemic phases in France and Japan, but the differences are very small and not significant.
Because the first three possible indicators of the endemic-epidemic transition have no predictive power on their own, we keep the breakdown parameters calculated from the empirical distribution of the daily new cases, namely the coefficient of variation, the entropy, the third and fourth standardized moments (skewness and kurtosis), the index of dispersion and the normality index, all calculated on the same moving window respecting the following rules:

• Choice of the same length of moving window as for the CV calculation (14 days);
• Use of the same time step for moving the window (1 day);
• Movement of the window from the start to the end of the COVID-19 outbreak observed between January 2020 and July 2022.
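The three rules above (14-day window, 1-day step, sweep over the whole series) can be sketched as a small generator to which any breakdown parameter is applied. The function names and the daily case values are hypothetical; CV is used only as an example of indicator.

```python
from statistics import mean, pstdev


def moving_windows(series, length=14, step=1):
    """Yield (index of last day, window) for each window of the series."""
    for end in range(length, len(series) + 1, step):
        yield end - 1, series[end - length:end]


def rolling_indicator(series, indicator, length=14, step=1):
    """Apply an indicator to every moving window, keyed by the last day."""
    return [(day, indicator(window))
            for day, window in moving_windows(series, length, step)]


def cv(window):
    """Coefficient of variation of one window (population standard deviation)."""
    return pstdev(window) / mean(window)


# Hypothetical daily new cases: a flat phase followed by a rapid rise.
daily_cases = [5, 6, 5, 7, 6, 5, 6, 7, 6, 5, 6, 7, 8, 10,
               13, 17, 22, 29, 38, 50]
rolling_cv = rolling_indicator(daily_cases, cv)
```

Keying each value by the last day of its window keeps the indicator causal: the value attached to day j uses only the observations up to day j.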
In Figure 6, we can observe the evolution of all six breakdown parameters in Japan, and in Figure 7A that of only the first component of the principal component analysis (PCA) performed with these parameters, which summarizes their predictive power globally. We can conclude that among the breakdown parameters, the only good predictor of epidemic waves is the first PCA component, because its variations anticipate epidemic peaks.
In Figure 7, we can observe the evolution of only the first PCA component in seven different countries: Japan, Nigeria, Cameroon, France, UK, USA and India. We eliminated entropy and empirical moments because they have a restricted predictive power, and the ID index because its predictive power is about the same as that of PCA. We can observe that the minima of the PCA curves (blue) approximately correspond to the peaks of the new case curves (green) for the first four countries, but when endemic periods are long, PCA peaks are better predictors. This is the case for the UK, the USA and India, which contrasts with expectations.

The ID Index as Predictor
The minima of the first PCA component curves correspond to the maxima of the ID index curves. Hence, the ID index can also be a good predictor of COVID-19 epidemic peaks (Figure 8).
In the case of Japan, the forecasting precision of both the first PCA principal component PCA1 and the ID index can be easily explained by the fact that the ID index often has the main weight in the linear combination expressing PCA1 in terms of the breakdown coefficients, as calculated for the first two-week moving windows in Japan during early January 2020 (Table 3):

$$\mathrm{PCA1} = 8.86760799 \times 10^{-2}\,\mathrm{Kurt} + 1.73156383 \times 10^{-2}\,E + 1.25157924 \times 10^{-2}\,\mathrm{Skew} + 2.49657969 \times 10^{-2}\,CV + 9.95518350 \times 10^{-1}\,ID + 1.05368220 \times 10^{-5}\,KS.$$

Table 3. Values of the breakdown coefficients during the first two-week moving windows W(i) (i = 0 to 4) for Japan during early January 2020.
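The dominance of ID in PCA1 can be checked directly from the six weights reported for Japan. The sketch below verifies that these weights form a (nearly) unit vector, as expected for a PCA eigenvector, and computes the share of the squared weights carried by ID. The helper `pca1` and its input values are hypothetical, and whether the breakdown coefficients should be centered before being combined is an assumption left open here.

```python
from math import sqrt

# Weights of PCA1 on the breakdown coefficients, as reported for Japan.
weights = {
    "Kurt": 8.86760799e-2,
    "E": 1.73156383e-2,
    "Skew": 1.25157924e-2,
    "CV": 2.49657969e-2,
    "ID": 9.95518350e-1,
    "KS": 1.05368220e-5,
}

# Euclidean norm of the weight vector and share of ID in the squared weights.
norm = sqrt(sum(w * w for w in weights.values()))
id_share = weights["ID"] ** 2 / norm ** 2


def pca1(values):
    """Linear combination of the breakdown coefficients with the weights above."""
    return sum(weights[name] * values[name] for name in weights)


# Hypothetical coefficient values for one window (not taken from Table 3).
example = {"Kurt": 2.0, "E": 1.5, "Skew": 0.3, "CV": 0.4, "ID": 1.2, "KS": 0.2}
score = pca1(example)
```

With these weights, ID accounts for about 99% of the squared norm, which is in line with the observation that PCA1 and ID carry nearly the same information.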
The values of the breakdown variable ID remain small during the COVID-19 evolution, but their relative variations ∆ID(i) = [ID(i + 1) − ID(i)]/ID(i) are important (Table 3), which explains the relatively important weight of ID in PCA1. The minima of PCA1 and maxima of ID systematically precede the epidemic peaks (except for India), and the change in nature of the empirical distribution (the loss of stationarity) of the new cases is easily understandable. The index of dispersion ID is indeed the logarithm of the ratio between the second and first moments of the empirical distribution of new cases, and its variations reflect the loss of stationarity before an exponential growth of the new cases, which is the main characteristic of the early dynamics of an epidemic peak. We observe the same predictive behavior for the breakdown parameters and PCA1 calculated from death data.

The Influence of Vaccination on the Daily New Cases and Deaths Curves
We see in Figures 9-12 the following three main features:
- The first PCA component (PCA1) systematically anticipates the new case and death waves, the latter occurring some weeks (between two and four) after the new case waves;
- ID waves occur in phase opposition with PCA1, but also predict the new case and death waves well;
- This anticipation remains true after vaccination, except at the end of the vaccination campaign, which shows the beginning of a decorrelation between PCA1 and the last new case waves.
These features have to be confirmed in future works. Suggested directions could be:

Conclusions
We examined the predictive power of seven parameters related to the empirical distribution of new COVID-19 cases in six countries (Japan, Nigeria, Cameroon, France, USA, and India), which constitutes an improvement on our previous work on the COVID-19 outbreak in [44][45][46][47] using different approaches. Only six parameters showed an ability to predict epidemic peaks, all being related to the empirical distribution of new cases: kurtosis, entropy, skewness, coefficient of variation, index of dispersion and the fitting criterion of the Kolmogorov-Smirnov normality test. The calculation of the first component of the principal component analysis (PCA) based on these six parameters showed that this component, PCA1, has a good forecasting power in all the above-mentioned countries, except the USA and India, whose endemic phases showed only weak variations of the moments of the empirical distribution of the daily new cases. Hence, for the USA and India, a minimum of the ID variable was impossible to individualize inside the endemic background noise. Future efforts in this direction are vital for preparing for a future pandemic or emerging infectious disease, because we believe that the research presented in this article could be relevant for forecasting new infectious cases in order to deploy proper interventions and resources (such as vaccination policies [48]) to fight the epidemic spread and, in a way, guide policy-making for public health.
countries. Three countries are among the seven most populated countries in the world, while the others have between 30 and 70 million inhabitants. Two variants of SARS-CoV-2 originated in India (Delta) and the USA (Epsilon), which makes these data sets interesting for gaining insight into the dynamics of the COVID-19 outbreak.

Figure 2. COVID-19 outbreak in Japan with Cumulative (resp. Daily new) cases in grey with a 7-day moving average in orange (resp. in grey with a 7-day moving average in light blue) (after [35]).

Figure 3. During France third (left) and USA first (right) endemic/epidemic transitions, the co-evolution of the Coefficient of Variation CV and Daily new cases. The x-axis represents time in days and the y-axes the Coefficient of Variation (in blue) and the Daily New Cases (in green).

Figure 4. The empirical distributions of daily new cases for France 3rd wave and USA 1st wave.

Figure 5. Co-evolution of CV and Entropy during France third (left) and USA first (right) waves.

At the start of the first wave in the USA (Figure 5), we calculate the entropy E the value of which is equal to

Figure 6. Breakdown Parameters and New Cases (in grey) in Japan during COVID-19 Outbreak.

Diseases 2023

Figure 7. First Principal Component (blue) as predictor of COVID-19 Daily new case waves (green) in various countries: Japan (A), Nigeria (B), Cameroon (C), France (D), UK (E), USA (F) and India (G). The x-axis represents the time in days and the y-axis the PCA principal component. The red arrows correspond to local maxima of the first principal component.

Figure 8. ID index (in blue) as predictor of the epidemic waves for Japan COVID-19 outbreak, with Daily new cases superimposed (in green). The x-axis represents the time in days. The red arrows correspond to local maxima of the first principal component.

4.2. Figures 9-11 show the influence of vaccinations on new cases and deaths curves.

Figure 9. (A) Breakdown parameters of new cases before (left) and after (right) vaccination during Japan COVID-19 outbreak; (B) Influence of vaccination on waves, with PCA1 (in blue) and new cases (in green) before (left) and after (right) vaccination with percentage of fully vaccinated superimposed (in light blue); (C) PCA1 and ID for new cases before and after vaccination (fully vaccinated superimposed); (D) PCA1 for deaths before (left) and after (right) vaccination (fully vaccinated superimposed); (E) PCA1 and ID for deaths before (left) and after (right) vaccination (fully vaccinated superimposed). The x-axis represents the time (in months). The red arrows correspond to local maxima of the first principal component.

Figure 10. (A) Influence of vaccination on waves of Nigeria COVID-19 outbreak, with PCA1 (in blue) and daily new cases (in green) before (left) and after (right) vaccination with percentage of fully vaccinated people superimposed (in black); (B) PCA1 and ID for new cases and deaths before and after vaccination (percentage of fully vaccinated superimposed); (C) PCA1 for deaths before (left) and after (right) vaccination (fully vaccinated superimposed); (D) same as (A) and (C) for Cameroon with new cases (left) and deaths (right) superimposed (in green) with fully vaccinated superimposed (in black). The x-axis represents the time (in months). The red arrows correspond to local maxima of the first principal component.

Figure 11. (A) Influence of vaccination on waves of France COVID-19 outbreak, with daily new cases superimposed (in green) before (left) and after (right) vaccination with percentage of fully vaccinated people superimposed (in red); (B) same for deaths before and after vaccination; (C) same as (A) for the United Kingdom; (D) same as (C) for the US. The x-axis represents the time (in months). The red arrows correspond to local maxima of the first principal component.

Figure 12. (A) Influence of vaccination on waves of the USA COVID-19 outbreak, with daily new cases superimposed (in green) before (left) and after (right) vaccination with percentage of fully vaccinated people superimposed (in red); (B) same for deaths before and after vaccination; (C) same as (A) for India; (D) same as (C) for India. The x-axis represents the time (in months). The red arrows correspond to local maxima of the first principal component.

Table 2. Absolute values of the dominant and first subdominant eigenvalues, and spectral ratio R (dominant/subdominant).

Table 3. Values of the breakdown coefficients during the first two-week moving windows W(i) (i = 0 to 4) for Japan during early January 2020.