Spectral Method in Epidemic Time Series: Application to COVID-19 Pandemic

Simple Summary This article aims to study the times series provided by data of the daily number of reported cases of COVID-19. During the COVID-19 pandemic, most people viewed the oscillations around the exponential growth at the beginning of an epidemic wave as the default in reporting the data. The residual is probably partly due to the reporting data process (random noise). Nevertheless, a significant remaining part of such oscillations could be connected to the infection dynamic at the level of a single average patient. Eventually, the central question we try to address here is: Is there some hidden information in the signal around the exponential tendency for COVID-19 data? Abstract Background: The age of infection plays an important role in assessing an individual’s daily level of contagiousness, quantified by the daily reproduction number. Then, we derive an autoregressive moving average model from a daily discrete-time epidemic model based on a difference equation involving the age of infection. Novelty: The article’s main idea is to use a part of the spectrum associated with this difference equation to describe the data and the model. Results: We present some results of the parameters’ identification of the model when all the eigenvalues are known. This method was applied to Japan’s third epidemic wave of COVID-19 fails to preserve the positivity of daily reproduction. This problem forced us to develop an original truncated spectral method applied to Japanese data. We start by considering ten days and extend our analysis to one month. Conclusion: We can identify the shape for a daily reproduction numbers curve throughout the contagion period using only a few eigenvalues to fit the data.


Introduction
Modeling an epidemic peak requires precise knowledge of the daily data corresponding to new cases. One of the aims of the paper is to extract the value of the average daily reproduction numbers. The daily reproduction numbers vary from individual to individual and from day to day during the period of contagiousness of an individual. These numbers depend on the age of infection, i.e., the number of days since the individual contracted the infectious disease. From a discrete model of the evolution of new daily cases, we propose to evaluate the average number R 0 (d) of secondary infected individuals produced by a single infected individual on each day d since his infection. For this purpose, on the top of the dominant eigenvalue, we will estimate from the data other significant subdominant eigenvalues (complex), which explain the modulation of the growth and allow better adequacy of the model to the data.
For that purpose, we reconsider the discrete-time epidemic model with the age of infection presented in Demongeot et al. [1]. This model is a discrete-time version of the Volterra integral formulation of the Kermack-McKendrick model with age of infection [2]. The variation of the number of susceptible individuals S(t) is given each day t = t 0 , t 0 + 1, . . ., by where S(t) is the number of susceptible individuals at time t, and N(t) is the daily number of new infected at time t. Throughout the paper, we use the following convention for the sum m ∑ d=k = 0, whenever m < k.
As a consequence, when t = t 0 , (1) gives We assume for simplicity that the epidemic starts from a single cohort of infected at time t 0 , then the number of infectious individuals is given by where I 0 is the number of infected individuals at time t 0 , and Γ(d) is the probability for an infected to be infectious after d day of infection. In particular , we have Γ(0) = 0. We assume that the number N(t) of new infected at time t is the product of the transmission rate τ(t) with the number S(t) of susceptible individuals and the number I(t) of infectious at time t. That is, N(t) = τ(t) S(t) I(t). ( By replacing I(t) by the right hand side of (2) in (3), we obtain Now assuming that τ(t) = τ 0 and S(t) = S 0 are constant (over a short period of time), then we define the daily reproduction numbers as R 0 (d) = τ 0 S 0 Γ(d), ∀d ≥ 0.
The quantity R 0 (d) is the average number of secondary infected produced by a single infected on the day d since infection (see [1] for more details). Therefore, the basic reproduction number is the following quantity where n is the maximal duration (in days) of the infection. Moreover, when τ(t) = τ 0 and S(t) = S 0 are constant, Equation (4) becomes a linear discrete time Volterra integral equation where (I) is the number of infected produced directly by the I 0 infected individuals already present on day t 0 , and (II) is the number of new infected individuals at time t produced by the new infected individuals since day t 0 .
In practice, we can assume that R 0 (0) = 0 since infected individuals are not infectious immediately after being infected. Under this additional assumption, we obtain the system N(t 0 + 2) = R 0 (2)I 0 + R 0 (1)N(t 0 + 1), . . . Therefore, (6) can be rewritten as a scalar delay difference equation Assume that the infectious period is n days. That is R 0 (a) = 0, ∀a ≥ n + 1.
The goal of this article is to understand how to identify the daily reproduction numbers d ∈ {1, . . . , n} → R 0 (d) in (8) knowing t ∈ [t 1 , t 2 ] → N(t) on some finite time interval. This problem is particularly important to derive the average dynamic of infection at the level of a single patient.
One of the aims of this paper is to investigate the variations of the daily reproduction number d ∈ {1, . . . , n} → R 0 (d) during the period of contagiousness of infectious individuals. This is not the case in influenza, as shown in simulated data [3] and in real infected animals, where we observe a U-shaped evolution of their viral load and symptoms as their body temperature during their contagiousness period. From there, it is possible to suspect a U-shaped variation in their ability to emit (aerosol transmission) the virus and, therefore, to contaminate it [4].
After the first asymptomatic period (without contagiousness), the daily reproduction number increases. After one to three days, this number decreases due to the action of the first defense of the innate immune system. But, the virus passes over this first immune defense, and the daily reproduction number increases again before the action of the second adaptive immune system. Then, after two to four days, the second adaptive immune response becomes fully effective. The combination of these biological mechanisms causes the daily reproduction numbers' U-or M-shaped curve.
The literature about parameters identification for epidemic models with age of infection can be divided into two groups of articles depending on the assumptions made. The first group assumes that d → Γ(d) is a given function and estimates the time dependent transmission rate t → τ(t). As a consequence, they obtain the instantaneous (daily or effective) reproduction number, which is We refer to [5][6][7][8][9][10][11][12] (and references therein) for more results about this subject. The second group corresponds to the assumptions considered here. That is, we assume that t → τ(t) = τ 0 and t → S(t) = S 0 are constant functions (over a short period of time) and estimate the daily reproduction number. That is the case for the discrete time model in [13] and more recently for the continuous time model in [1]. The major default in [13] is that the estimated d → R 0 (d) does not remain positive. we will have the same problem is Section 3.1 when we will use the full spectrum. In Section 3.2, to solve this problem, we introduce a method using the dominant and secondary eigenvalue only.
This article aims to investigate the shape of the distribution d → R 0 (d) from the data of COVID-19. In Figure 1, we illustrate the notion of U-or M-shaped distribution. In this figure, we illustrate the notion of U shape distribution in (a) and M shape distribution in (b). Recall that R 0 (d) represents the ability of patients to transmit the pathogen after d days since they were infected. The U shape or M shape distribution means that patients can transmit the pathogen since the beginning of their infection. Then they become less infectious in the middle of the infected period. Finally, they become infectious again at the end of the infected period. The only difference between U and M shape distribution is to include days 0 and 8 and R 0 (0) = R 0 (8) = 0 in the plot.
The U and M shape distribution are well known in the context of influenza [3,4]. In Figure 2, we present some figures reflecting patients' viral load for COVID-19.
Such U shape has not yet been systematically studied in COVID-19 data, but observations of the evolution of the viral load have been done in some patients and show this U shape. Figure 2 shows such a U-shaped evolution for the viral load in real patients [14].
The present work is directly connected to the original work of Peter Whittle in 1951 [15,16], who introduced the Auto Regressive Moving Average (ARMA) model, after the seminal paper on time series by N. Wiener [17], Moving average part , (10) where N(t) is the size at time t of the population whose growth is forecasted, the kernel d → K(d) has real values, n is the regression order, and here w(t) stands for a noise. Equation (10) has been extensively studied under the denomination of ARMA models by many authors [18][19][20][21][22][23][24].
(a) (b) Figure 2. Viral load in COVID-19 real patients [14]. In (a), the red curve corresponds to the throat swab and the blue curve corresponds to the sputum. In (b), the curves correspond to several patients (A), (B), and (C).
Here, we propose a new approach based on the spectral properties of the population growth equation to capture information from data. Our goal is to estimate the shape of the daily reproduction numbers d → R 0 (d). Spectral methods are not new (see Priestley [20,25]), but it usually refers to Fourier transform with frequencies associated to various periods, corresponding to a fundamental period and its sub-multiples (harmonics). If we consider the auto regressive part only, the spectrum of the delay difference equation is determined by its characteristic equation The main idea in this article is to use these eigenvalues λ 1 , λ 2 , . . . , λ n ∈ C (i.e., the solution of the characteristic equation) to identify the parameters K(1), K(2), . . . , K(n). The eigenvalues λ 1 , λ 2 , . . . , λ n ∈ C are estimated by some separated method. In Section 2, we will see that when all the eigenvalues are non null and separated two by two, then we can compute the parameters K(1), K(2), . . . , K(n) by using the eigenvalues only.
The idea of using eigenvalues in population dynamics goes back to Malthus [26], who, in 1798, first identified in a mixture of populations the one that would impose itself on the others, determined through the exponential growth of the largest exponent-this leading exponent having been called Malthusian parameter by Fisher [27]. The Malthusian growth seeming unrealistic, the saturation logistic term was introduced further by Lambert [28], and then extending the initial work by Euler [29], Lotka [30], Leslie [31], and Hahn [32] gave the current matrix form of the discrete population growth equations.
However, as far as we know, estimating the subdominant eigenvalues to characterize the system is new. So the key idea of this work is to use the dominant eigenvalue λ 1 and also the following pair of complex conjugated eigenvalues λ 2 , λ 2 as an estimator to reconstruct the kernel of the auto regressive part. This work is motivated by the times series provided by data of the daily numbers of reported cases of COVID-19. During the COVID-19 pandemic, most people viewed the oscillations around the exponential growth at the beginning of an epidemic wave as the default in reporting the data. The residual is probably partly due to the reporting data process (random noise). Nevertheless, a significant remaining part of such oscillations could be connected to the infection dynamic at the level of a single average patient. Eventually, the central question we try to address here is: Is there some hidden information in the signal around the exponential tendency for COVID-19 data? So we consider the early stage of an epidemic phase, and we try to exploit the oscillations around the tendency in order to reconstruct the infection dynamic at the level of a single average patient.
We start by investigating the connection between a signal decomposed into a sum of damped or amplified oscillations and a renewal equation. The prototype example we have in mind is the following: In Figure 3, we illustrate a growing function with damped oscillations (i.e., α 2 < 0) and amplified oscillations (i.e., α 2 > 0). It is clear from Figure 3 that a periodic function can not represent such a signal, and extending such a signal by periodicity would be artificial. Indeed, the Fourier decomposition would only provide purely imaginary eigenvalues that would exclude a continuation of the exponential growth (i.e., eigenvalues with non-zero real parts). To apply wavelets theory (see, for example, in [33]), we need to extend the data for negative times by symmetry with respect to the initial time t = 0, and we need a decreasing function (α 1 < 0 and α 2 < 0). Here, we are more interested in the model resulting from the data (i.e., R 0 (d) ≥ 0, ∀d = 1, . . . , n) than in the fit to the data. The major problem with the Fourier method is that this method provides only eigenvalues with zero real parts (that is due to the periodicity required for this method). Such eigenvalues are well adapted to a periodic signal, but this is not suitable to describe, for example, an ever-growing function (as in Figure 3). Consequently, the Fourier method is not well adapted to derive non-negative daily reproduction numbers (i.e., R 0 (d) ≥ 0, ∀d = 1, . . . , n).
Previous analogous approaches can be found in the seismic data modeling and statistical literature, like the Wiener-Levinson predictive deconvolution (Robinson [34], Peacock and Treitel [35], Robinson and Treitel [36]), which intends to estimate the minimum phase wavelet in the data, in particular in the case where the relatively weak sampling does not make it possible to affirm the Gaussian character of the errors (Walden and Hosken [37]). If the Gaussian character of the errors can be proven, another similar approach is that of the Geometric Brownian Motion (GBM) processes (Vinod et al. [38]) used, for example, in the analysis of financial data (Ritschel et al. [39]), which are based on the model of the solution of a stochastic differential equation, multiplied by a periodic component with a Gaussian noise.
The structure of this paper is as follows: Section 2 is devoted to the materials and methods. we recall some notions of matrices and spectra. we also present some phenomenological models that will be compared to the data. Section 3 contains the results. we fit the phenomenological models to the cumulative numbers of reported cases in Japan over 10 days and 30 days. we use the eigenvalues derived from the phenomenological model, and we identify the daily reproduction numbers by using: (1) all the spectrum (see Appendix B) and (2) part of the spectrum. The last section of the paper is devoted to the discussion and the conclusion. We present in the Appendices all the mathematical aspects of the paper (see Appendices A-D).

Identification of the Model
The Leslie matrix associated to the difference Equation (8) is The characteristic equation of (11) is for λ ∈ C, which is equivalent to (whenever λ = 0) The complex numbers satisfying the characteristic equation are called the eigenvalues of L.
In Appendices A and B, we discuss the identification problem of the daily reproduction numbers R 0 (1), . . . , R 0 (n) by using the eigenvalues of L. The main identification result of Appendix B corresponds to the formula (A3). Definition 1. We will say that L is a Markovian Leslie matrix if all the values d ∈ [1, n] → R 0 (d) are non negative, and n ∑ d=1 R 0 (d) = 1.

Phenomenological Model to Fit the Cumulative and the Daily Numbers of Reported Case Data
Due to Lemma A1 below, we propose the following phenomenological model to represent the data where CR 1 , . . . , CR m ∈ C are non null, λ 1 = α 1 + iω 1 , . . . , λ m = α m + iω m ∈ C are pairwise distinct, and m ≤ n.

Remark 1.
In the above formula, we allow the constant terms whenever λ n = 0.
Assuming that the unit of time is one day, we have the following relationship between the cumulative number of cases CR(t) and the daily number of cases N(t) We deduce that the daily number of reported cases has the following form where N 1 , . . . , N m ∈ C are non null, and λ 1 , . . . , λ m ∈ C are the same as in (14), and m ≤ n.
Since N(t) is obtained from CR(t) by computing the first derivative, we have the following relationship

Remark 2.
For the daily number of cases data t → N(t) only a few eigenvalues will be tractable. For example, in Section 3.3, we will consider the following extension where w(t) will contain N 5 e λ 5 t + . . . + N m e λ m t merged together with some random term.

Remark 3.
The identification of the eigenvalues λ 1 , . . . , λ m as parameters of the phenomenological model is discussed in Section 3.3. So far, this problem for a finite time interval seems to be open.
We will first approach the data with the following phenomenological model.

Phenomenological model for the cumulative numbers of reported cases with λ > 0
We start with a first eigenvalue λ = e α > 0, for some α ∈ R. The phenomenological model used to fit the cumulative numbers of reported cases has the following form where A ∈ R, α ∈ R, and C ∈ R are real numbers. For discrete times, it is equivalent to say that CR(n) = Aλ n + C, for n = 0, 1, 2, . . . .
By computing the first derivative of t → CR(t), we obtain a model for the daily number of cases of the following form Once the best fit of the above phenomenological model to the data is obtained, we can subtract this model to the data t → CR Data (t) , then we obtain a first residual Next we will approach the residual with the following phenomenological model.

Phenomenological model for the cumulative numbers of reported cases with λ ∈ C
Assume that the eigenvalues are two conjugated complex numbers λ = e α±iω ∈ C, for some α ∈ R and ω ≥ 0. The phenomenological model used to fit the cumulative numbers of reported cases has the following form where α ∈ R, A ∈ R, B ∈ R, C ∈ R, and ω ≥ 0 are four real numbers. For discrete times, it is equivalent to say that By computing the first derivative of t → CR(t), we obtain a model for the daily number of cases of the following form where Remark 4. When ω = 0 in (18), we obtain the previous model (15).

Cumulative and Daily Number of Reported Cases for COVID-19 in Japan
Here we use cumulative numbers of reported cases for COVID-19 in Japan taken from the WHO [40]. The data show a succession of epidemic waves (blue background color regions) followed by endemic periods (yellow background color regions). In Figure 4, black dots represent the data. The blue background color regions correspond to epidemic phases, and the yellow background color region to endemic phases. The region of interest to apply the method is between 19 October and 29 October 2020. This region is marked with light green vertical lines in the figure.

Methods Applied to Ten Days Data
In this section, we will fit the phenomenological model (15) or (18) to the cumulative numbers of reported cases presented in the previous subsection. we consider a period of 10 days since the beginning of the third epidemic wave of COVID-19 in Japan. The period goes from 19 to 29 October 2020.
Step 1: In Figure 5, we fit an exponential function (15)  October and 29 October 2020 (black dots). The red curve corresponds to the best fit of model (15) to the cumulative numbers of reported cases.
In Figure 5, the best fit of model (15) is obtained for A 1 = 2.8810 4 , C 1 = 6.4210 4 , and α 1 = 0.02. Hence, Step 2: Next, we consider the residual left after the previous fit, In Figure 6, we fit the model (18) to the first residual function t → Residual 1 (t). October and 29 October 2020 (black dots). The red curve corresponds to the best fit of model (18) to Residual 1 (t).
In Figure 6, the best fit of model (18) (i.e., minimizing the sum-of-squares error) is obtained for The period associated to ω 2 is equal to P 2 = 2π ω 2 = 6.609 days. This periodic phenomenon was observed in many countries (see for example [41]). Here, By using Since det(M) = 1.78 i, therefore, the components of M −1 are not too large, and the above result should not be too sensitive to the stochastic errors. The main problem in (22) is the second component −1.9625, which is not making sense in this context.

Spectral Truncation Method Applied to Ten Days Data
In the previous subsection, the first two fits make perfect sense. However, adding more fits would be questionable because they become more and more random after a few steps. we could alternatively continue to fit the rest by using our phenomenological model, which would provide new eigenvalues.
The major problem in the previous section is that when we apply formula (A3) with all the eigenvalues, we obtain some R 0 (1), . . . , R 0 (n) with negative values. Instead here, we increase the dimension n of L, and we use only the eigenvalues λ 1 , λ 2 , λ 3 .

Re-Normalizing Procedure
Assume that λ 1 = 1 then by where t → N(t) is a solution of (8), we obtain the following normalized equation and by dividing the above equation by λ t 1 we obtain where By using the procedure, we can always fix the dominant eigenvalue of L to 1 by imposing that L is Markovian (see Definition 1). Then we use the following re-normalizing procedure for the eigenvalues λ 1 = λ 1 /λ 1 = 1, λ 2 = λ 2 /λ 1 = 0.53 + 0.74 i, and λ 3 = 0.53 − 0.74 i.
In Figure 7, we fit these eigenvalues λ 2 and λ 3 with the spectrum of Markovian Leslie matrices L on a mesh. we observe that the fit improves when the dimension of L increases.  7. We plot the spectrum of the Markovian Leslie matrices L (red dots) when n = 3, 5, 6, 7, (respectively in (a-d)) giving the best match to the secondary eigenvalues λ 2 and λ 3 (green dots). We observe that the best fit of the two secondary eigenvalues remain far away from λ 2 and λ 3 for n = 3, then get closer for n = 5, and are very close for n = 6 and n = 7.

Daily Basic Reproduction Numbers
In Figure 9, we plot the average distribution d → R 0 (d), standard deviation (blue region), and 95% confidence interval.  Figure 9. In this figure, we use the distributions d → R 0 (d) minimizing the distance |λ 2 − λ 2 | and |λ 3 − λ 3 | whenever n = 7. In (a), we plot the average distribution d → R 0 (d) (red curve), standard deviation (blue region), and 95% confidence interval (light blue region). In (b), we plot the 24 distributions d → R 0 (d). In (c), we give a histogram with the multiple values of R 0 = ∑ n d=1 R 0 (d). we observe that some of the d → R 0 (d) are similar to the case n = 6, with a maximum on day d = 6, but on average the maximum value is on day 7.
In Figure 10, we plot the daily basic reproduction numbers R 0 (d). The distribution for n = 7 corresponds to the red curve in Figure 9.
We can notice that following [42], the effective R 0 is between 1.06 and 1.14 on 19 October 2020, in Japan.

Applying the Model to Daily Number of Reported Cases
The model used to run the simulations is the following and according to the formula (17) and (20), with the initial condition with In (24)-(26) we use the parameter values estimated in Section 3.1.
In Figure 11, we plot the daily number of reported cases data from October 19 to November 19, 2020 (black dots) and from model (24) and (25) with the values of R 0 (d) obtained in Figure 10c (red dots).

Oct 19
Nov 02 Nov 16 2020 500 1000 1500 2000 Data Model Figure 11. In this figure, we plot the daily number of reported cases data from October 19 and November 19, 2020 (black dots) and from model (24) and (25) with the values of R 0 (d) obtained in Figure 10c (red dots).

Extension of the Spectral Truncation Method over One Month
In Figure 12, we apply respectively the AutoCorrelation Function (ACF) and Partial AutoCorrelation Function (PACF) to the daily number of cases for Japan from 19 October and 19 November 2020. It does not look like any standard cases. In the ACF, we observe the correlation is significant until 7 days, while in the PACF it is until 16 days. Step 1: In Figure 13, we fit the model with the cumulative number of reported cases data between 19 October and 19 November 2020.  Figure 13. In this figure, we plot the cumulative number of reported cases data between 19 October and 19 November 2020 (black dots). we plot the best fit of the model (27) to the cumulative data (red curve).
Step 2: Next we define as before the first residual and we fit the Residual 1 (t) with the model In Figure 14, we plot the cumulative number of reported cases data between 19 October and 19 November 2020.

t (A 2 cos( 2 t)+ B 2 sin( 2 t))+e 3 t (A 3 cos( 3 t)+ B 3 sin( 3 t))+C 2
Residual 1 (t) Figure 14. In this figure, we plot the cumulative number of reported cases data between 19 October and 19 November 2020 (black dots). we plot the best fit of the model (30) to the cumulative data (red curve).
The parameters of the phenomenological model φ 2 (t) obtained for the best fit are the following and The periods associated to ω 2 and ω 3 are, respectively, P 2 = 2π ω 2 = 6.92 days, and P 3 = 2π ω 3 = 21.24 days.
These periods are close multiples of 7 days.

Remark 5.
It is important to note that the period P 3 of 21 days is difficult to explain mechanically, but this value is the smallest value giving the best fit to the data. we tried to impose some upper bounds smaller than 21 days. In such a case, P 3 is always replaced by the upper bound. This is true for all constraints less that 21 days, and for each constraint larger than 22 days, we obtain P 3 = 21.24 days.
This condition is coming from the fact that λ 1 must be the spectral radius of L and λ 2 , λ 3 belong to the circle centered at 0 and with the radius equal to the spectral radius of L (i.e., with a modulus less or equal to λ 1 ).
Eigenvalues associated to the model φ 1 (t) and φ 2 (t): The first eigenvalue is The fourth eigenvalue is and the fifth eigenvalue is its conjugate and the modulus of λ 4 is Using λ 2 and λ 4 as an estimator: Next we consider all the matrices L in which the component R 0 (d) is replaced by R 0 (d), and we assume that The dominant eigenvalue of L is 1, and we look for matrices such that the second eigenvalue of L is close to λ 2 = λ 2 /λ 1 , and the fourth eigenvalue of L is close to For realizing this approach, we minimize the where σ(L) is the set of all eigenvalues of L.
In Figure 15, we consider the d → R 0 (d) such that the corresponding maximum satisfies We define In Figure 16, we obtain a good description of the dynamic of infection at the individual level that confirms the one obtained over shorter periods. As expected, the average patient first loses its ability to transmit the pathogen, and after decreasing by day 1 to day 4, R 0 (d) increases between day 4 and day 7. Day 7 is a maximum. After day 7, R 0 (d) decays until day 9. Then a second peak arises, with a maximum on the day 14. we could explain this second peak by supposing that an important transmission of pathogen still exists from day 12 to day 16. We also obtain a third from day 19 to 23 with a maximum value on day 21.   Figure 15.
In Figure 17, we plot the spectrum of the Leslie matrix L when d → R 0 (d) corresponds to the average distribution (i.e., the red curve in Figure 15).
Recalling that, by definition, the basic reproduction number is we obtain the sum of the daily reproduction numbers (red curve in Figure 16) In Figure 18, we plot a histogram for the values of the basic reproduction number obtained by summing the distributions d → R 0 (d) from Figure 16.  Figure 17. In this figure, we consider the case n = 25. We plot the spectrum of the Leslie matrix L (red dots) when d → R 0 (d) corresponds to the average distribution (i.e., the red curve in Figure 15). Next, we consider and accordingly to the formula (17) and (20), with the initial condition for t = t 0 , t 0 + 1, . . . , t 0 + 25, we have with In (24)- (26) we use the parameter values estimated in Section 3.1.
In Figure 19, we see the mean distribution d → R 0 (d) permits to produce oscillations around the tendency for the daily number of cases. It is important to note that without the third peak in Figure 16 we do not obtain such a good correspondence between the model and the data.  Figure 19. In this figure, we plot the daily number of reported cases data between 19 October and 19 November 2020 (black dots). The red curve corresponds to φ 1 + φ 2 , and the green dots correspond (34) and (35) whenever R 0 (d) comes from the average distribution (i.e., the red curve in Figure 15). We observe a very good match between the green dots and the red curve (the phenomenological model).

Discussion
In this article, we start by investigating the connection between a signal decomposed into a sum of damped or amplified oscillations and a renewal equation. Namely, we connect the daily number of reported cases written as N(t) = N 1 e α 1 t [cos(ω 1 t) + i sin(ω 1 t)] + . . . + N n e α n t [cos(ω n t) + i sin(ω n t)], ∀t ≥ t 1 − n, with the renewal equation In the context of epidemic time series, a spectral method usually refers to the Fourier decomposition of a periodic signal. In the present paper, the data are not periodic and are composed of an exponential function (Malthusian growth) perturbed with some damped oscillating functions. So we use complex numbers with non-null real parts. we refer to Cazelles et al. [33] for more results about time series.

Data over Ten Days
We can notice in Figures 9 and 10 and Table 1 that the daily reproduction number as well as the instantaneous reproduction number are estimated. Concerning the instantaneous (or effective) reproduction number R e (t) [43,44] estimated by [42], which equals 1.1 at the 19th of October 2020, the best fit corresponds to n = 7 days (see (c) in Figure 9). This value of the duration of the contagiousness period is close to the values 6 or 7 days and are close to the values estimated from the virulence measured in [14,45,46]. In Figure 10, we always obtain a U-shaped distribution for the curve of daily reproduction numbers. This corresponds to the biphasic form of the virulence already observed in respiratory viruses, such as influenza, as recalled in the Introduction. This temporal behavior of the contagiousness can correspond to the evolution of contagious symptoms like cough or spitting, which diminish during the innate immune response, followed by a comeback of the symptoms before the adaptive immune response (whenever the innate defense has been overcome by the virus). If the innate cellular immunity has been not sufficient for eliminating the virus, the viral load again increases, causing a reappearance of the symptoms before the adaptive immunity (cellular and humoral) occurs, which results in a transient decrease in contagiousness between the two immunologic phases. The medical recommendations are, in the case of U-shaped contagiousness, never to take a transient improvement for a permanent disappearance of the symptoms and to stay at home to avoid a bacterial secondary infection that is possibly fatal.
The estimation of the daily reproduction numbers in the COVID-19 outbreak constitutes an important issue. At the public health level, to publish only the sum of the daily reproduction numbers, that is, to say the basic reproduction number R 0 or the effective reproduction number Re, could suffice for controlling and managing the behavior of a whole population with mitigation or vaccination measures. At the individual level, it is important to know the existence of a minimum of the daily reproduction numbers, which generally corresponds to a temporary clinical improvement, after a partial success of the innate immune defense. This makes it possible to advise the patient to continue to respect his own isolation, prevention, and therapy choices (depending on his vaccination state) even if this transient clinical improvement has occurred. The present methodology allows also to estimate both the individual contagiousness duration in a dedicated age class and also its seasonal variations, which is crucial for optimizing the benefit-risk decisions of the public and individual health policies.

Data over One Month
Over one month, we obtain a daily reproduction number with three peaks. Each peak is centered respectively on 7 days, 14 days, and 21 days. These quantities coincide with the period of 7 days and 21 days obtained in Figure 14 in fitting the first residual when we subtract the exponential growth first fit to the cumulative data. As far as we understand the problem, that is the period of 21 days in the data, which induces the third peak. This third peak is very suspicious. Nevertheless, the data lead us to such a shape for the daily reproductive number. we also tried to run Figure 19 without the third peak, and we obtained a bad fit to the data, while with this third peak, the fit is good. One may also note that the 21-day period is insignificant for the ACF and the PACF in Figure 12.
Several possibilities exist to explain this strange shape for the daily reproduction number using the data over one month. One possible explanation is that the Japanese population should be subdivided into several groups having very different infection dynamics (at the level of a single patient). Here we have in mind the patient with a short infection period but high transmissibility (super spreaders) versus the patient with a long infection period with mild symptoms.
We suspect that such a shape for the daily reproduction number could be attributed to the time since infection to report a case. The daily number of reported cases would be obtained from N(t), and the daily number of new infected cases by using the following model where the integer q ≥ 1 is the maximum number of days needed to report a case, f ∈ [0, 1] is the fraction reported, and K(d) ≥ 0 is the probability to report a case after d days. Therefore, we must have

Perspectives and Conclusions
In the present paper, we only consider the Japanese data in the exponential phase of the third epidemic wave.
The case of Japan seems emblematic to us, as it corresponds to a wave of well-identified new cases following a clearly characterized endemic phase. The exponential growth phenomenon being transitory, this explains the relatively limited duration of the sampling, which corresponds to a period in days during which the epidemiological parameters (such as the transmission rate) can be considered as constant. It is in such circumstances where the Gaussian nature of the errors is difficult to prove, due to the small sampling, such that similar methods based on wavelets have been proposed (Walden and Hosken [37]).
The method of the present paper should be applied to several countries for each epidemic wave to obtain a more systematic study. For the moment, over one month, we obtained a shape for the daily reproduction number that follows the data very well. However, we are suspicious about the third peak. we suspect that the default of our analysis is coming from the model itself. Such a question has been recently studied by Ioannidis and his collaborators in [47], and we believe that we are facing such modeling difficulties.

Informed Consent Statement:
There is no subject involved in the present study.

Data Availability Statement:
No data were produced for this study.

Conflicts of Interest:
The authors declare no conflict of interest.

Appendix A. Non Identifiability Result
From Formula (13), we deduce that the characteristic (12) has exactly one positive solution. By the Perron-Frobenius theorem applied to the Leslie matrix L defined by (11), we know that (by considering the norm of linear operator) the spectral radius of L r(L) := lim is the unique positive solution of (12). Moreover, all the remaining eigenvalues have a modulus smaller or equal to r(L). we refer to ( [48], Chapter 4), for more results about this subject. Non identifiability result: Let λ > 0 and N = 0. Then is a known solution of (8) if and only if λ is a solution of the characteristic equation.

Appendix B. Identifiability Result
Assumption A1. Assume that λ 1 , . . . , λ n ∈ C are nonzero complex numbers, and are separated two by two. That is, λ i = 0, ∀i = 1, . . . , n. and Remark A1. Since the coefficients of the characteristic Equation (12) are all real, we could also impose that the conjugate of each eigenvalue belongs to the spectrum. That is However, that is not necessary in this subsection.
Lemma A1. Let Assumption A1 be satisfied. Assume that each λ i ∈ C satisfies the characteristic Equation (12). Then the Leslie matrix L defined by (11) is diagonalizable (and invertible); moreover, for each U 1 , U 2 , . . . , U n ∈ C, is a solution of (8). That is to say,

Identification of the components U i from the values of t → N(t):
Assume that the values of N(t) are given for t = t 1 , . . . , t 1 + n − 1. we claim that we can compute U 1 , U 2 , U 3 , . . . , U n ∈ C. Indeed, . . .
can be rewritten as the system (A1) The determinant of the above Vandermonde-like matrix Therefore, under Assumption A1, this determinant is non null, and we obtain the following result.
Proposition A1. Let Assumption A1 be satisfied. Then we can compute the components U 1 , . . . , U n in function of the given elements of the trajectory N(t 1 ), . . . , N(t 1 + n − 1) by solving the linear system (A1), and Identification of the component R 0 (d) from the λ i : By assuming that each λ i is a solution of the characteristic Equation (12), we obtain which rewrites in the matrix form as Under Assumption A1 the Vandermonde-like matrix Therefore, we can compute the component of the map d ∈ [1, n] → R 0 (d) by solving a linear system involving the eigenvalues of the characteristic equation. (A3) In Figure A1, we plot all the spectrum's location for Markovian Leslie matrices on a mesh. we can observe the changes of location of the spectrum depending of the dimension n. It seems that the spectrum is fielding more and more the unit circle in C when the dimension increases. We refer to Kirkland [49] for more results going in that direction. ⊂ Ω, and a point Λ = λ 1 , . . . , λ n ∈ Ω (i.e., all satisfying Assumption A1). Assume that

Now since
hence, for all m ≥ 1 large enough (i.e., satisfying L m L(C n ) < 1) and the proof is completed.

Appendix C. Identification of the Phenomenological Model
Here we assume that the daily number of reported cases has the following form N(t) = N 1 e λ 1 t + N 2 e λ 2 t + N 3 e λ 3 t + . . . + N m e λ m t , where N 1 , . . . , N n ∈ C are non null, and λ 1 , . . . , λ n ∈ C are pairwise distinct. If we assume to know t → N(t) for all positive integer values t = 0, 1, 2, . . . , then we can compute the discrete Laplace transform L(N)(λ) = ∞ ∑ t=0 e −λt N(t), which is well defined for all λ ∈ C such that Re(λ) > max i=1,...,n Re(λ i ). The Laplace transform could be used to identify the unknown parameters λ 1 , . . . , λ m in (A5). Then by combining this idea with linear regression of t → e λ k t , we could identify the parameters N k , then step by step compute all the parameters of N(t) in (A5).
In practice, we only know t → N(t) on a finite time interval t = 0, 1, 2, . . . , L. In that case, we can define the truncated Laplace transform as . The Laplace transform does not permit to detect the eigenvalues λ k (we tested without success some examples with values of complex numbers coming from the present article). Identification of the eigenvalues λ k , whenever t → N(t) is known only on a finite time interval, seems to be an open intriguing question.

Appendix D. About Residual 2 (t) in Section 3.3
In Figure A2, we observe that average of Residual 2 (t) = Residual 1 (t) − φ 2 (t) is close to 0, but its histogram does not have the shape of a normal distribution. So, there might be some residual information in Residual 2 (t).  Figure A2. In this figure, we plot Residual 2 (t).