An Autoregressive Disease Mapping Model for Spatio-Temporal Forecasting

One of the more evident uses of spatio-temporal disease mapping is forecasting the spatial distribution of diseases for the next few years following the end of the period of study. Spatio-temporal models rely on very different modeling tools (polynomial fit, splines, time series, etc.), which could show very different forecasting properties. In this paper, we introduce an enhancement of a previous autoregressive spatio-temporal model with particularly interesting forecasting properties, given its reliance on time series modeling. We include a common spatial component in that model and show how that component improves the previous model in several ways, its predictive capabilities being one of them. In this paper, we introduce and explore the theoretical properties of this model and compare them with those of the original autoregressive model. Moreover, we illustrate the benefits of this new model with the aid of a comprehensive study on 46 different mortality data sets in the Valencian Region (Spain) where the benefits of the new proposed model become evident.


Introduction
Statistical techniques for disease mapping have received substantial attention during the last two decades [1][2][3]. These methods seek the smoothing of health indicators by taking into account the geographic structure of the units of study in order to obtain reliable risk estimates, regardless of the statistical size of these units. These methods consider dependence between nearby geographic units, according to their geographical proximity. In principle, that dependence improves the risk estimates of the units of study. Among all the approaches for smoothing risk estimates proposed in the literature, that by Besag et al. (BYM) [4] is particularly popular, and many applications of this model can be easily found.
Although BYM is a benchmark for disease mapping studies, this proposal ignores the temporal evolution of the risks for the units of study, assuming that these are static quantities over time. This hypothesis is not necessarily realistic, especially in those studies where large periods of analysis are considered, since therefore, temporal changes in risks are likely to have occurred somewhere. In addition, if large periods of study are aggregated, we would not retrieve an updated view of the risks at the end of the period, which would be surely our main goal, but its average during all that time period. Thus, any risk excess that could have been estimated for some geographical unit could be just the consequence of a past risk excess that does not exist anymore and on which we should not focus our attention. Additionally, Ocaña and Riola (2007) [5] showed also that the aggregation of data over long time periods in disease mapping studies produces biased risk estimates, and therefore, it would be convenient to consider the temporal variation of risks in those models for correcting that bias. For all these reasons, and evidently for forecasting, spatio-temporal models should be borne in mind in very general contexts, mainly if the available period of study is particularly long. Regretfully, despite the benefits of spatio-temporal analyses, the number of studies of this kind in the applied literature in comparison to regular spatial studies is low. Surely, the reason for this situation is the additional complexity of spatiotemporal models, which have to integrate two sources of dependence, spatial and temporal dependence, into a single proposal. Moreover, spatio-temporal disease mapping models have been mainly used for visualizing the spatio-temporal evolution of risks, and despite the evident interest, forecasting has not focused particular attention on this area.
Spatio-temporal models are useful tools to study the evolution of the geographical distribution of diseases and to predict the risks in future periods of time. Forecasting the risks of diseases could have innumerable applications of practical interest such as planning and locating health resources, the establishment of priorities, and guidance in the development and implementation of public health and social programs to alleviate the geographic inequalities found.
Many proposals can be found in the Bayesian literature for the spatio-temporal modeling of risks. These proposals have followed very different approaches, ranging from spatially dependent temporal parametric models [6][7][8] to more flexible and parameterized spatio-temporal spline models [9][10][11][12][13]. However, the ability of disease mapping models to predict future mortality or disease incidence has been hardly explored [14][15][16]. An interesting proposal among this class of models, of particular importance to this work, was introduced by Martinez-Beneito et al. (2008) [17], and we will refer to this proposal simply as the autoregressive (AR) model. In that paper, the authors proposed a spatio-temporal model for disease mapping that concatenates several BYM spatial models in time by means of first order autoregressive time series. As a consequence, a spatio-temporal structure is defined in that model where the risks for each location and moment are spatially and temporally dependent at the same time. This allows nearby places to have, in general, similar risk estimates and time trends. In addition, Martinez-Beneito et al.'s proposal does not assume any parametric family for the temporal evolution of risks that could restrict them, which seems in principle an advantage of this model. Another interesting advantage of this proposal is that it can be easily implemented in practice in regular Bayesian inference packages such as WinBUGS or INLA, given its spatio-temporal separability [17,18]. Finally, the use of time series to induce temporal dependence in this model could be an important advantage in predictive terms, given the good forecasting capabilities of these tools over other modeling alternatives such as splines, with a much more modest predictive performance.
In this work, we start by introducing Martinez-Beneito et al.'s AR model (Section 2), and afterwards, we describe a motivating application of that model to a large set of mortality causes in the Valencian Region (Section 3), one of the seventeen autonomous regions of Spain. Some limitations of the AR model that could affect its forecasting properties will be evidenced according to the results of the motivating example. For this reason, we will propose (Section 4) an enhancement of the AR model that allows solving the issues found and, therefore, improving the corresponding risk estimates. We also show and compare (Section 5) the results of this new model in the Valencia mortality data set used in Section 3. In addition, we also compare these proposals with two simpler spatio-temporal models proposed in the literature, which consider spatially varying quadratic and linear time trends [6,7], in order to check if less parameterized models could show greater ability for forecasting future out-of-sample risks. Finally, we give some conclusions (Section 6) about the results and models described in the previous sections.

An Autoregressive Spatio-Temporal Model for Disease Mapping
In this section, we introduce the autoregressive spatio-temporal disease mapping proposal of [17]. Let O ij and E ij denote, respectively, the number of observed and expected cases for the i-th geographical unit and j-th time interval of the period of study. The data likelihood for this model assumes: where exp(R ij ) are the corresponding relative risks to be estimated. For the first time interval, the log-relative risks are modeled as in a BYM spatial model, that is as the sum of an intercept and two random effects, one of them inducing spatial dependence and the second inducing heterogeneous variability between spatial units. Specifically, for the first period of study (j = 1) and i = 1, . . . , I, we consider: The intercept in the linear predictor above (µ + α 1 ) is the sum of two terms, µ the overall intercept for all geographical units and time intervals and α 1 the average deviation from that overall intercept for the first time interval. The second term in the linear predictor above depends on ρ, the temporal correlation parameter, which will weight the temporal dependence between time intervals. The contribution of this parameter will be clearer when we introduce the modeling of the subsequent time intervals. Nevertheless, the term 1 − ρ 2 −1/2 in the expression above basically guarantees that the variance of the log risks in the first period is equal to that of the subsequent periods (stationary variance of the autoregressive time series). This term depends on two additional vectors, θ 1 = (θ 11 , · · · , θ I1 ) and φ 1 = (φ 11 , . . . , φ I1 ), the heterogeneous and spatial random effects, respectively, for the first period. As can be observed, the spatial random effect follows an intrinsic Gaussian conditional autoregressive distribution (ICAR) and the heterogeneous random effect a conditionally independent normal distribution of mean zero and common variance. The combination of the spatial and heterogeneous random effects corresponds to the spatial modeling proposal of BYM. Nevertheless, the autoregressive proposal that we are introducing could be easily modified to include other alternative spatial processes such as Leroux et al.'s distribution [19] or just a proper CAR distribution.
For the subsequent time periods the autoregressive model assumes, for i = 1, . . . , I and j = 2, . . . , J: Note that the spatial and heterogeneous random effects are temporally and mutually independent for each period in both Expressions (1) and (2). However, the linear predictor in Expression (2) induces temporal dependence for each areal unit, specifically as a first order autoregressive time series. Moreover, the jumps of the time series corresponding to the different areal units are spatially correlated because of the spatial dependence of φ 2 , . . . ,φ J . This makes the time series corresponding to the risks of neighboring units follow similar trends. In this way, the relative risk in each geographical unit and time interval depends on the neighboring units, but also on the neighboring time intervals, making information be shared in both time and space.
As for any Bayesian hierarchical model, the autoregressive model needs prior distributions for the rest of the parameters in the model: α, µ, ρ, σ φ , σ θ , σ α . In the original proposal, an ICAR distribution in time (this is equivalent to a random walk of first order) is also proposed for modeling α, considering as neighbors all the consecutive components of this vector. This induces temporal dependence on the intercepts of all time intervals in the period of study and makes the overall time trend follow a smooth trajectory. The temporal dependence parameter ρ is assumed to follow a uniform prior distribution between -1 and 1, which implicitly entails that the time series for R i· for each i are stationary. Finally, vague prior distributions are assumed for µ, on the whole real line, and the rest of the standard deviations present in the model, on the positive real line.
The AR model just introduced has interesting links with the spatio-temporal modeling framework proposed by Knorr-Held (2000) [20], in particular the four spatio-temporal interaction types defined there. Specifically, the spatio-temporal interaction types (I and III) showing temporal independence can be reproduced by just setting ρ = 0 in the AR model. Similarly, spatial dependence could be removed from the AR model (interaction Types I and II) by making σ φ = 0. The Type IV Knorr-Held's spatio-temporal interaction, showing both spatial and temporal dependence, can be seen as a limit case of the AR model without a heterogeneous term (σ θ = 0) and with ρ fixed to one, as this interaction was originally formulated as a random walk instead of an autoregressive process. However, in addition to all four Knorr-Held's interaction types, which correspond to some particular values of σ φ , σ θ , and ρ in the AR model, this model can reproduce far more spatio-temporal interactions by setting these values to take some more additional values. In this sense, the AR model can be seen as a comprehensive proposal able to reproduce, within a single model, all four Knorr-Held's interaction types and much more interaction types in between.
The AR model is able to reproduce all four Knorr-Held's interaction types; this modeling framework cannot be strictly considered a particular case of the AR model. Namely, Knorr-Held's modeling proposal splits the spatio-temporal variability into three terms: spatial, temporal, and (pure) spatio-temporal terms. Thus, this model has a pure spatial term, common for the whole study period, that the AR model does not have. In part, the AR model ignores this term due to the computational and confounding problems between terms posed by the joint inclusion of those three components. These confounding problems make some of the proposed implementations of this model include substantial constraints in order to avoid them [21,22]. These constraints increase in a significant manner the computational burden and computing time of the model.

A Motivating Analysis: A Comprehensive Spatio-Temporal Mortality Study in the Valencian Region
As a motivating example for the rest of the paper, we performed the following analysis. The AR model was originally used as the spatio-temporal smoothing tool for the Spatiotemporal Mortality Atlas of the Valencian Region (1987)(1988)(1989)(1990)(1991)(1992)(1993)(1994)(1995)(1996)(1997)(1998)(1999)(2000)(2001)(2002)(2003)(2004)(2005)(2006) [23]. This atlas studies the spatio-temporal evolution for 46 combinations of the cause of death and sex corresponding to 27 causes of mortality, where some particular combinations without enough deaths, such as breast cancer in men, or without biological sense, such as prostate cancer in women, have been removed. Mortality is disaggregated for this study at the municipal level for a total of 540 municipalities and 20 annual periods from 1987 to 2006.
Regarding the results derived from this atlas, those shown in Table 1 are of particular interest for this work. This table shows the temporal correlation coefficient (parameter ρ in the AR model) estimated for each mortality cause and sex.
As can be observed, the temporal correlation parameter estimated is very high for most of the mortality causes considered, frequently greater than 0.9. This agreement is particularly striking, and these common high values for ρ suggest the possible existence of a common geographical pattern for each cause of death that is repeated, for all the time intervals considered, perhaps with slight temporal variations. However, the AR model does not explicitly include a pure spatial component, but just a pure (common) time trend for the whole Valencian Region and the spatio-temporal component already described.
The high values observed for the temporal correlation parameter could be just fixing the missing spatial component of the AR model. Furthermore, this model assumes that the autoregressive time series for every municipality has a mean of µ. This implicit assumption could heavily influence the forecasts of this model since, in the long rung, all the predictions will tend to this common value.

An Autoregressive Spatio-Temporal Model with a Specific Spatial Component
These results could be suggesting the convenience of including a new pure spatial term in the AR model that could reproduce the common spatial pattern that all time intervals could show. The presence of this new term could make the autoregressive term of the AR model be in charge of reproducing just the spatio-temporal interaction of the matrix O of observed cases instead of the spatial and spatio-temporal variability, as for the results above. In that case, the term in charge of weighting the temporal dependence (ρ) for each spatial unit could take values not so close to one if the data really required it. For this new model, the common spatial pattern would have been explicitly modeled, and therefore, it would not have to be reproduced by fitting a high value of ρ in the autoregressive term.

Modeling Proposal
In this section, we propose a modification of the spatio-temporal autoregressive model. This modification is motivated by the high temporal correlation parameter estimated with this model for all the mortality causes studied in the Spatio-temporal Mortality Atlas of the Valencian Region. As already mentioned, this fact suggests the existence of a common spatial pattern that is repeated during the whole period of study for each cause of death. Therefore, we propose to include a new random effect in the model, which we will denote by δ i , i = 1, · · · , I, independent of the others effects and common for all time periods. The goal of this new effect will be modeling the hypothetical common pattern that the risk maps could show for all time periods, that is the average risk level of each geographical unit. Therefore, from a forecasting point of view, risk predictions would eventually tend to the corresponding local value instead of to a common average value.
The new spatial random effect will be modeled once again as a BYM process, that is as a sum of a spatial (φ δ i ) ICAR distributed vector and a heterogeneous normal random effect (θ δ i ) of mean zero and common variance. More in detail, for the first time period (j = 1) and for i = 1, . . . , I, we define the log relative risks as: and, for the following time periods (j = 2, . . . , J): In order to complete the model, we set prior distributions for the rest of parameters as follows: where L is a large enough value intended to make vague the corresponding uniform prior distributions. These distributions follow the same reasoning as those used for the AR model. In particular, vague prior distributions are considered for the mean (log) risk for all geographic units and time periods (µ) and also for the standard deviations of the random effects. The uniform prior distribution for ρ guarantees the stationarity of the log risk time series.

Dependence Structure
We explore now some theoretical properties of the model just introduced and compare them with the corresponding properties of the AR model. These results will provide useful insight about the dependence structures, and the main differences, of both models.
First, we deduce the joint distribution of the log risks conditioned to the rest of parameters of the model except the spatial and spatio-temporal random effects. This distribution will be multivariate normal since each risk R ij is a linear combination of normal random variables. In order to determine the joint distribution of all the risks, we need to calculate the corresponding variance-covariance matrix. This is the goal of the following lines.
The variance-covariance matrix for the first time period is: and for the second time period: Iteratively, it can be easily shown that the variance-covariance matrix for the subsequent time periods (var(R j | . . .), j = 3, . . . , J) has also as the expression: Martinez-Beneito et al. showed that the corresponding variance-covariance matrices for R j , j = 1, . . . , J conditioned to the rest of model parameters, for the AR model, have as the expression: (1 − ρ 2 ) −1 Σ θ+φ . Therefore, including the spatial term δ in the new model induces an additive term, the spatial variance-covariance matrix Σ δ , on the variancecovariance matrix of the (log) risk vector for each time period.
Regarding the covariances of the (log) risk vectors for the different time periods, for any k > 0, we have: As a consequence, in a more compact form, the variance-covariance matrix of (R 1 , . . . , R J ) can be expressed as: where ⊗ denotes the Kronecker product of two matrices, and is the correlation matrix of a first order autoregressive process with correlation coefficient ρ. Note that cov(R j , R j+k | . . .) for the original AR model is just [17]: thus, the spatial variance-covariance matrix Σ δ has the same additive effect on the covariances between different time periods as for the variances for the (log) risk vectors, in comparison to the original AR model.
In addition, the components of any R j for j = 1, . . . , J (given the hyperparameters of the model) follow a normal distribution (of mean µ + α j ), since they are linear combinations of normal variables. These results yield the following joint distribution: As shown in Martinez-Beneito et al. (2008) [17], the covariance of the log risks at different time periods in that model has the form: for any k > 0. Therefore, for the AR model, the joint distribution of the (log) risks is: These results allow us to compare the spatial and temporal dependence structure of both models and how information is shared between the different time periods. Note that the variance-covariance matrix of the new proposal is the sum of the variance-covariance matrix of the AR model and a block matrix with all its elements equal to Σ δ . This has important consequences on the correlation between log risks of different time periods. Thus, let us consider k > 0 and 1 ≤ i ≤ I; therefore: Consequently, the correlation between spatial patterns does not vanish to zero for distant time periods, unlike the AR model where (Σ δ ) ii = 0. This correlation will be larger when the variance of the spatial term is also larger, in comparison to the variance of the spatio-temporal term. Note also that, for the AR model, the convergence rate of the correlations above to zero depends just on ρ, and this will be faster for ρ ≈ 0 and slower for |ρ| = 1. Therefore, if there was an underlying persistent spatial pattern for all time periods, which seems reasonable from an epidemiological perspective, the AR model would fit a high value of ρ in order to reproduce that setting. This fact could easily explain the results obtained in the Spatio-temporal Mortality Atlas of the Valencian Region (1987)(1988)(1989)(1990)(1991)(1992)(1993)(1994)(1995)(1996)(1997)(1998)(1999)(2000)(2001)(2002)(2003)(2004)(2005)(2006) described in the previous section.

A Re-Analysis of the Mortality Study in the Valencian Region
In this section, we return to the analysis of the spatio-temporal trends of the 27 mortality causes considered in the Spatio-temporal Mortality Atlas of the Valencian Region (1987)(1988)(1989)(1990)(1991)(1992)(1993)(1994)(1995)(1996)(1997)(1998)(1999)(2000)(2001)(2002)(2003)(2004)(2005)(2006). We now implement the new modeling proposal described in the previous section and compare the results with those of the AR model. Specifically, we compare the temporal correlation coefficients, deviance information criteria (DICs) [24], and log predictive scores [25] on the models considered. The log predictive score allows us to compare the models according to their predictive performance. Furthermore, as suggested by a reviewer, we also considered the spatio-temporal model of Bernardinelli et al. (1995) [6], which considers a linear time trend for each municipality, and an extension of this model to spatially varying quadratic time trends [7]. We will compare these models, less parameterized than those discussed in this paper, in predictive terms for out-of-sample data for the period 2007-2011.
All four models, for each data set, were run in WinBUGS. The syntax of each model and the R code used to fit them can be found, as Annex Material, from the RCode.pdf supplementary document. Three chains were run for each model with 7000 iterations, whose first 2000 iterations were discarded as the burn-in period. Of these, one of every 15 iteration was saved, yielding a final sample size of 1000 iterations. Convergence was assessed by means of the Brooks-Gelman-Rubin statistic [26]; in particular, we required this parameter to be lower than 1.1 for each variable in the models, and the effective sample size [27] was required to be at least 100 for every sampled variable in the models. Convergence was assessed with the aid of the R2WinBUGS package of R.

Comparison of the Temporal Correlation Coefficients of Both Models
We begin by comparing the temporal correlation coefficients estimated with the AR model and the new model proposed in this paper for each mortality cause and sex. Table 2 shows the posterior mean and 95% credible interval for the temporal correlation coefficient (parameter ρ) estimated for each model.  As can be seen, the posterior mean for the temporal correlation coefficients (ρ) for the new model proposed in this work (Model 2) is always lower than that estimated for the AR model (Model 1). In addition, it can be observed how the decrease in the temporal correlation coefficient is much more remarkable for the tumoral mortality causes. On the contrary, for the rest of the diseases, this difference is not so important. The results of Table 2 seem to confirm the starting hypothesis: the AR model fits high values of the temporal correlation coefficient to compensate the lack of a common spatial pattern for all time periods. When such a pattern is explicitly included in the model, the estimate of the temporal correlation coefficient changes evidently. Therefore, when the common spatial pattern is included in the model, the temporal correlation coefficient is used to fit the dependence strength of the time series, which can be very different for the different data sets, instead of being used to reproduce the missing common spatial pattern of the whole spatio-temporal system as for the original AR model.

Models' Fit Comparison According to the DICs
In this subsection, we compare the fit of the AR and the new model in terms of the DICs, for each mortality cause and sex. Table 3 shows the corresponding DICs for each model and data set.
For men, the new model with a common spatial component (Model 2) has a lower DIC in 13 out of the 23 mortality causes considered, i.e., for 57% of the data sets. For women, this model has a lower DIC in 16 out of the 23 mortality causes, i.e., for 70% of the data sets. Therefore, the DIC seems to show a slight superiority of the new proposal over the original AR model. Interestingly, this superiority is more evident for the tumoral causes of death considered, Causes 2-18 of Table 3. Thus, for these causes, the DIC was lower for the new model in eight out of 13 data sets for men and 11 out of 13 data sets in women. We remind that, precisely, the tumoral causes of death were those where ρ showed the sharpest decrease, so DICs are more similar for the causes of death where this parameter is, in general, similar for both models.

Prediction for the Next Five Years
Finally, in this subsection, we compare the predictive performance of: • The original AR model of Martinez-Beneito et al. • The autoregressive model proposed in this paper. • The spatio-temporal model of Bernardinelli et al. (1995) [6], which would assume spatially correlated linear time trends for the municipalities of the Valencian Region. • The spatio-temporal model of Assunção et al. (2001) [7], which would assume spatially correlated quadratic time trends for the municipalities of the Valencian Region.
We will assess the predictive performance of these models for the five years of the period 2007-2011 (out-of-sample data not used in model fitting) and each of the data sets studied in the Spatio-temporal Mortality Atlas of the Valencian Region. We will compare the predictive performance of these four models by means of log predictive scores [25]. These are the sum of the log predictive scores for the number of observed deaths, for each municipality, for the next five upcoming years after the end of the period of study (2007)(2008)(2009)(2010)(2011). Table 4 shows the log predictive score for the first 15 causes of death of those considered. The full table with the log predictive scores for all the causes of death is annexed as Supplementary Material to this paper due to space limitations (Table S1). Log predictive scores are reported for each data set of the study and, separately, for each of the years of the period 2007-2011, the five subsequent years to the end of the period of study used originally for the Atlas. In principle, the model with a greater log predictive score is supposed to have a better performance in predictive terms. Moreover, considering separately each of the years of the period 2007-2011 allows us to study how the predictive performance of these models evolves as the distance with the end of the study increases. Unlike other alternative model fit assessment criteria, the log predictive scores used do not reward the overfitting of any model as, in this case, the fit is measured on new data, in contrast to those criteria that use the data used to fit those models. As can be observed in Table 4, and more in depth in the full table of the Supplementary Material (Table S1), the new model introduced in this paper shows in general larger log predictive scores than the original AR model. Specifically, the log predictive score of the new model for the full period of study is greater that of the AR model for 18 out of the 23 (78%) mortality causes studied in men and for 21 out of the 23 (91%) mortality causes studied in women. This shows the enhanced performance, in predictive terms, of the new proposal. In addition, the predictions of the new model with a common spatial component are even better for the latest years considered, that is, as shown in Table 4, differences for the log predictive scores of Model 1 and 2 are, in general, mild for both models for 2007, but these differences become more evident for the latest predicted years (2010 or 2011). This suggests that the new proposal captures and reproduces the mid-/long-term time trends better than the original AR model. That is, the new model with a spatial term sets a mean value for each municipality where time trends should return in the long run, and this seems to be beneficial in predictive terms. Regarding the full comparison, which includes the rest of the (parametric) models considered, the new model introduced in this paper also shows, in general, higher log predictive scores than Bernardinelli et al.'s and Assunção et al.'s proposals. Thus, we find that the log predictive scores of the new model are the highest for 15 out of the 23 (65%) mortality causes studied in men and for 14 out of the 23 (61%) mortality causes studied in women.

Discussion
As described in this paper, the autoregressive spatio-temporal model proposed by Martinez-Beneito et al. (2008) [17] tends to estimate high values of the temporal correlation coefficient (ρ) for all the data sets studied in the Spatio-temporal Mortality Atlas of the Valencian Region (1987)(1988)(1989)(1990)(1991)(1992)(1993)(1994)(1995)(1996)(1997)(1998)(1999)(2000)(2001)(2002)(2003)(2004)(2005)(2006). As shown, this is an artifact of this model, which is due to the lack of an explicit spatial pattern for the whole period of study. As a consequence, the term in charge of modeling temporal dependence (ρ) is forced to take high values in order to reproduce that missing common pattern. One of the main contributions of this work is to disclose this limitation of the autoregressive spatio-temporal model, which questions the time trend fit of this model.
In this paper, we propose a modification of the spatio-temporal original AR model that includes a new random effect, which models the common spatial pattern for the whole period of study. Under a theoretical perspective, this model generalizes Knorr-Held's spatio-temporal modeling framework [20], allowing reproducing all four models proposed there (and many more models) in a single proposal. In addition, the new proposal introduced makes the temporal correlation coefficients ρ considerably lower for many of the considered data sets. As a consequence, this parameter for the new model is used to tune the strength of the data temporal dependence instead of reproducing the common spatial pattern, explicitly introduced in this model. In addition, we observed that the decrease in ρ is much more remarkable in the tumoral mortality causes; therefore, these causes seem to be benefit more from the inclusion of the common spatial component. This result is of important epidemiological interest, as it points out a significant difference between the temporal correlation of the tumor mortality causes and the other types of diseases, which would be convenient to study in depth in future studies. The study of the differential behavior of the tumoral mortality causes, with respect to the other diseases, is an interesting line of research, which has been already pointed out [28]. The results of this paper seem to confirm that particular performance of tumoral mortality causes. Research on the causes of that particular performance can make an interesting contribution to the spatial epidemiology of tumoral mortality causes.
Regarding the fit of the original AR model and the new proposal introduced in this paper, we conclude that the fit differences during the period of study are mild. This seems to be confirmed by the DICs in Table 3, which point out a better performance of the new model, mainly when the spatial term takes a more evident effect. Nevertheless, these DIC differences are generally mild, which seems to be confirmed by the similar risk estimates of both models. As a consequence, we could conclude that both models are in general flexible enough for modeling the risk time trends for the data sets considered. However, in predictive terms, things are different. As shown, the log predictive score of the new modeling proposal is higher than that of the original AR model for almost all the data sets considered, and therefore, this model has a better behavior when used for forecasting. Moreover, this new model seems to be even better for medium-/long-term forecasting, which makes it particularly interesting in practice. In addition, this new model also shows in general better predictive capabilities than Bernardinelli et al.'s and Assunção et al.'s parametric spatio-temporal models, which suggests that parametric models may be somewhat restrictive for this particular purpose.
As a summary, this new proposal makes a contribution to spatio-temporal disease mapping, of particular use/interest when forecasting is the main goal of the corresponding study, something very common in spatio-temporal modeling. The proposed model would be mainly useful for the analysis of diseases with stable spatial patterns over time, such as those corresponding to chronic diseases, in contrast to more chaotic spatio-temporal mortality patterns, such as COVID-19 mortality. The interest of this approach is its enhancement of the AR model, which already uses the time series methodology, particularly suited for predictive purposes. As shown, this makes this new proposal more suited for predictive purposes than the AR model and much more suited than other spatio-temporal approaches relying on modeling tools, such as polynomial or splines fit, with much more limited predictive skills.
Supplementary Materials: The following are available online at https://www.mdpi.com/2227-739 0/9/4/384/s1, Table S1: Log predictive scores for each data set (full table)  Data Availability Statement: The data are not publicly available due to confidentiality issues.

Conflicts of Interest:
The authors declare no conflict of interest.

Abbreviations
The following abbreviations are used in this manuscript:

BYM
Spatial model proposed by [4] ICAR Intrinsic Gaussian conditional autoregressive distribution AR model Autoregressive spatio-temporal model proposed by [17] DIC Deviance information criterion