Probability of Deriving a Yearly Transition Probability Matrix for Land-Use Dynamics

Takada’s group developed a method for estimating the yearly transition matrix by calculating the mth power roots of a transition matrix with an interval of m years. However, the probability of obtaining a yearly transition matrix with real and positive elements is unknown. In this study, empirical verification based on transition matrices from previous land-use studies and Monte-Carlo simulations were conducted to estimate the probability of obtaining an appropriate yearly transition probability matrix. In 62 transition probability matrices of previous land-use studies, 54 (87%) could provide a positive or small-negative solution. For randomly generated matrices with differing sizes or power roots, the probability of obtaining a positive or small-negative solution is low. However, the probability is relatively large for matrices with large diagonal elements, exceeding 90% in most cases. These results indicate that Takada et al.’s method is a powerful tool for analyzing land-use dynamics.


Introduction
Land-use and land-cover change with natural processes and human activities, which further depends on ecological, economic, political institutional, and social constraints [1]. Thus, studying land-use/cover change (LUCC) may contribute to better understanding of the interaction between environmental and human-driven processes and finding key processes within the local human-environment system [2]. Several approaches were developed to understand, analyze and evaluate LUCC [3][4][5][6]. Among them, the probability-based transition matrix approach has been used to analyze, compare, and predict LUCC over specific periods with a stationary Markov model [7][8][9][10]. In this approach, two maps of a single site for two points in time are classified into the same set of land-use/cover categories and the transition probabilities between the categories are estimated by comparing these two maps [11]. The transition probability matrix, T, whose interval is m years, is used to calculate the projection of the area of LUCC, x t+m as where x t is a row vector representing the proportion of each category in time t.
The transition probability matrix is useful not only for extracting factors that lead to differences in one time period at a site, but also for comparing the difference of land-use change among several time periods [8]. Transition probability matrices are sometimes obtained by comparing aerophotographs or satellite images of the target location, but in some cases, the intervals of the aerophotographs or satellite images differ. For example, consider three aerophotographs of the same place taken in 2000, 2007 and 2012. The LUCC transition probability matrix between 2000 and 2007 can be calculated by comparing 2000 and 2007 aerophotographs. In addition, the transition probability matrix between 2007 and 2012 can be calculated. These two transition probability matrices cannot be compared, because the former matrix reflects LUCC in 7 years, whereas the latter matrix reflects LUCC in 5 years. Such mismatch of aerophotograph or satellite image shooting interval might arise from unsystematic planning of shooting interval. The same situation sometimes occurs for comparing LUCC in several sites, because aerophotographs with the same interval are not always available.
To resolve this problem, the matrices should be adjusted such that they have the same interval. Takada et al. [12] developed a method for estimating the yearly transition matrix by calculating the power root of a transition probability matrix with any interval of the research period. Hereafter, Takada et al. [12]'s annualizing method is referred as TAM for short. TAM has been used by many LUCC researchers [13][14][15][16][17], etc. These studies dealt with various topics, such as agricultural land use, forest management, climate change, deforestation, and urbanization, suggesting that TAM is useful for obtaining the yearly transition matrix in LUCC analysis.
Theoretically, the number of solutions induced from mth power roots of an n × n matrix T is m n , because they are calculated as where λ i is the i-th eigenvalue of matrix T and u i is its corresponding eigenvector. The number of mth power root of λ i is generally m for each i and the total number of the combinations is m n [12]. The m n solutions include matrices with complex numbers or negative numbers. They are unsuitable for land-use dynamics analyses, such as scenario-based simulation and future prediction. This problem has been discussed in several studies [6,[18][19][20][21]. However, the possibility of obtaining suitable yearly transition probability matrices with TAM remains unclear. The aims of this study are to (1) clarify the possibility of obtaining yearly transition probability matrices with real field data set, (2) clarify the theoretical possibility of obtaining yearly transition probability matrices, and (3) explain the difference between real and theoretical results and examine the validity of TAM. In this study, we estimated the probability of acquiring a positive or small-negative solution via TAM. Empirical verification was conducted with 62 transition matrices obtained from previous land-use change studies. Monte-Carlo simulations were conducted with randomly generated matrices and biased matrices to estimate the probability of acquiring suitable solutions. Furthermore, we discuss the effectiveness of TAM.

Empirical Verification
In this study, the possibility of obtaining yearly transition probability matrices with actual transition probability matrices was examined. From 34 previous studies on land-use change (Table 1), 62 transition probability matrices were obtained. These matrices were annualized with the software developed by Takada et al. [12] whose name was "annualmatrix.exe" in the following URL, https://taktakada.github.io/esoftdownload.html, to determine the number of positive and small-negative solutions. Among 62 transition probability matrices used in the empirical verification, 48 matrices were supplied together with the initial and final area size (or proportion) of each category (the row of "Area data" in Table 2). For these matrices, the row vector of the final area size, v f in can be calculated as where T is the transition probability matrix and v init is the row vector of the initial area size. Assume that T is a transition probability matrix in m years and A is a m th power root of the matrix calculated by TAM. The row vector of the estimated final area size, v est would be calculated as Errors in the estimation of the annualization are calculated as the sum of differences in each category between the real and estimated area sizes, using the following formula, The Pearson's correlation coefficient between the estimated error and study area was calculated using statistical software R version 2.11.1 [52]. The Kendall rank correlation coefficient between the estimated error and number of classes, and between the estimated error and study period was also calculated using R version 2.11.1 as well as Kendall library [53].

Monte-Carlo Simulation
To estimate the probability of obtaining a positive solution of the power root matrix, we conducted Monte-Carlo simulations using a randomly generated matrix. LUCC transition probability matrices have the characteristic of all the row sums being always 1.0. Assume that P is a n × n transition probability matrix and p ij is an element of the P. The row sum for all j. In most LUCC studies, the total area or number of grids of the target area neither increases nor decreases during the study period. Thus, randomly generated matrices should meet this requirement. In this study, an n × n random matrix was generated based on the "broken stick" method proposed by Takada et al. [54]. First, n − 1 random numbers were generated from the uniform distribution ranging from 0 to 1 using R version 2.11.1 [52]. and then sorted in the ascending order. A line (stick) with length 1 is broken into n pieces using the random numbers as breaking points. The lengths of broken lines are used as n random numbers, whose sum is equal to 1. These random numbers are combined to generate a row vector of n size, and its sum is equal to 1. For example, three random vectors whose sizes were 3 and sum of the elements was 1.0, were generated as This procedure was repeated n times and n row vectors were obtained. They were concatenated to form an n × n matrix T as The obtained random matrices were annualized with the software developed by Takada et al. [12] to determine the number of positive and small-negative solutions. Monte-Carlo simulations were conducted for matrix sizes ranging from 2 to 9 and power roots of 3, 4, 5, 7, 10, 13, 20, or 30, except for a matrix size of 8 and power root of 20 or 30 and matrix size of 9 and power root of 20 or 30. The simulation was repeated 1000 times in most cases. The procedure was repeated 100 times for matrices with large sizes and power roots because the simulations were time-consuming. The probability of obtaining a positive or small-negative solution was estimated by dividing the number of trials that yielded a positive or small-negative solution by the total number of trials.

Biased Monte-Carlo Simulation
A fully random matrix was generated using the "broken stick" method. However, the transition probability matrix analyzed for land-use change tends to differ from a random matrix. In many cases, the diagonal elements of a transition matrix are relatively larger than non-diagonal elements (e.g., [8,28,30,40,43,45,[55][56][57][58]). This may be attributed to the generally constant land-use patterns during the study period or the tendency of self-replacement probability to be high. To simulate a transition probability matrix in land-use dynamics, we generated a series of biased random matrices (Hereafter, this type of matrix is referred to as a "modified random matrix"). Therefore, the "broken stick" method was not adopted to generate modified random matrices, The procedure is as follows. First, n random numbers were generated from the F distribution with 1.0 and 0.0 degrees of freedom using R version 2.11.1 [52]. F distribution was used to simulate a skewed distribution, in which the majorities of random numbers were relatively small, while the minority of random numbers were relatively large. Then, the random numbers were combined to produce a row vector v, which was corrected such that the sum of the row vector was one as where v is a corrected vector. This procedure was repeated n times. For each row vector that constituted a random matrix, the largest element in the row vector was swapped with the element in the diagonal position of the matrix. For example, three random vectors with a size of 3 and sum of elements of 1.0 were generated as These three vectors, v 1 , v 2 , v 3 were combined as an 3 × 3 matrix T by swapping the first and third elements of v 1 , and by swapping the first and the second elements of v 2 . The same procedure was applied for the modified random matrices to obtain the probability of obtaining a positive or small-negative solution by TAM.

Results
The size (number of categories) of the transition probability matrices ranged from 4 to 10 and the study period ranged from 3 to 52 years among 62 transition probability matrices used in empirical verification. A positive or small-negative solution was obtained from 54 of 62 matrices(87%) ( Table 2).
Error estimation was conducted for 42 transition probability matrices, which were supplied together with the initial and final area size of each category and an annual transition probability matrix could be obtained. Estimated errors in transition probability matrices were smaller than 0.05, except for data from Lopéz et al. [59] (No. 25 in Table 2) with 0.081 ( Table 2). The Pearson's correlation coefficient between the estimated error and the study area was not significant (r = −0.16, p = 0.30). In addition, the Kendall's τ between the estimated error and the number of classes was not significant (τ = 0.213, p = 0.07). Nevertheless, the estimated error and study period were significantly correlated (τ = −0.31, p = 0.005).
The probability of obtaining a yearly transition matrix was estimated through Monte-Carlo simulations using random matrices ( Table 3).
The probability of obtaining a positive solution was very low for different matrix sizes and power roots (Table 3), and it was almost zero for a matrix size greater than 4. The probability of obtaining a positive solution did not increase linearly with the power root. Although the probability of a small-negative solution with a random matrix was higher than that for a positive solution, it was still less than 30% ( Table 3). The probability of obtaining a small-negative solution increased with the power root for matrix sizes larger than 4. However, the relationship between a small-negative solution and the power root was not linear for matrix sizes of 2 or 3.   In contrast, the probability of obtaining a positive solution with a modified random matrix was generally higher than that with a random matrix (Table 4).   For a matrix size of 2, the probability was 100%. Similarly, the probability of obtaining a small-negative solution with a modified random matrix was higher than that with a random matrix, and exceeded 90% for matrix sizes greater than 4 (Table 4). However, the probability was zero for a matrix size of 2 and the power root was odd.

Discussion
The possibility of obtaining yearly transition probability matrices with real field data set, random matrices, and modified random matrices using TAM, was high (54 in 62 matrices; 87%, Table 2), low (Table 3), and relatively high (Table 4), respectively. These results suggest that TAM may provide suitable solutions using transition probability matrices with relatively large diagonal elements, which is common in LUCC studies.
The low probability of obtaining a positive or small-negative solution from a random matrix, especially for a matrix greater than 5 × 5 ( Table 3), suggests that a yearly transition matrix with real and positive elements cannot always be derived from a transition matrix. However, the probability of obtaining a positive or small-negative solution from a modified random matrix (relatively large diagonal elements) exceeded 90% in most cases ( Table 4). The diagonal elements of the transition probability matrix tend to be relatively large in land-use dynamics analysis because land-use patterns are fairly constant over a short period. Consequently, TAM should derive yearly transition matrices from the transition probability matrices for land-use analysis.
In cases where the target location experiences drastic changes and the self-replacement rate is low, the diagonal elements of the transition probability matrices tend to be not relatively large. In the empirical verification, annual transition probability matrices were not obtained from No. 9, 16,19,29,43,45,50, and 62 matrices in Table 2 [47]). For these types of studies, it is possible that TAM cannot obtain an applicable solution of yearly transition matrix.
The estimated error with area projection between real and annualized transition probability matrices was lower than 0.05 in most cases ( Table 2). The estimated error did not correlate with the study area size and the number of classes, but it was negatively correlated with the study period. TAM includes errors attributable to calibration, in which negative elements close to zero are treated as zero [12]. Nevertheless, these results indicated that the calibration error is small and independent of study area size and number of classes.
We could not calculate the annual transition probability matrix from several large (containing many classes) and long-term matrices, such as [55,56,58,62], because their estimated time of calculation exceeded a month. For example, even with the newest PC (AMD Ryzen5 3600x, 3.8GHz) configuration, the calculation of an annual transition probability matrix from the transition probability matrix from Ojeda-Revah et al. [62], whose matrix size (number of categories) was 10 and study period was 24 years, was estimated to consume more than 1000 days. Please note that TAM will check all the possible m n solutions for n × n transition probability matrix whose duration is m years. Therefore, other algorithms will be needed to speed up the calculation for large and long-term transition probability matrices.

Conclusions
In this study, empirical verification based on transition matrices from previous land-use studies and Monte-Carlo simulations were conducted to estimate the probability of obtaining an appropriate yearly transition probability matrix with TAM. This study has revealed that (1) the possibility of obtaining yearly transition probability matrices with real field data set is high, (2) the theoretical possibility of obtaining yearly transition probability matrices is low, as shown in Monte-Carlo simulation with random matrices, and (3) the difference between real and theoretical results may be explained by high possibility of obtaining yearly transition probability matrices in Biased Monte-Carlo simulation, suggesting that the possibility of obtaining yearly transition probability matrices is high when the diagonal elements of the transition probability matrix were relatively larger than non-diagonal elements. The diagonal elements of the transition probability matrix tend to be relatively large in most cases in LUCC studies. However, the diagonal elements of the transition probability matrices tend to be not relatively large in cases where the target location experiences drastic changes such as urbanization, rapid agricultural or industrial land-use change and long-term research. For these types of studies, TAM may not be able to obtain an applicable solution of yearly transition matrix.
This study suggests that TAM is applicable for many transition probability matrices and may contribute to land-use dynamics analysis as a powerful tool. Acknowledgments: We express our sincerest gratitude to Jan Bogaert, Toshihiko Hara, Norio Yamamura and Takashi S. Kohyama for their helpful suggestions.

Conflicts of Interest:
The authors declare no conflict of interest.

Abbreviations
The following abbreviations are used in this manuscript: