An Efficient Graph-based Method for Long-term Land-use Change Statistics

Statistical analysis of land-use change plays an important role in sustainable land management and has received increasing attention from scholars and administrative departments. However, the statistical process involving spatial overlay analysis remains difficult and needs improvement to deal with massive land-use data. In this paper, we introduce a spatio-temporal flow network model to reveal the hidden relational information among spatio-temporal entities. Based on graph theory, the constant condition of saturated multi-commodity flow is derived. A new method based on a network partition technique for spatio-temporal flow networks is proposed to optimize the transition statistical process. The effectiveness and efficiency of the proposed method are verified through experiments using land-use data in Hunan from 2009 to 2014. In a comparison among three different land-use change statistical methods, the proposed method exhibits remarkable superiority in efficiency.


Introduction
Land use change is a key component of global change; it is found on all continents and affects the Earth's resources and the services used by humankind [1][2][3][4][5]. Land use changes disturb soil, vegetation, and water resources and services, and at the same time determine and modify the fate of the Earth's life system and its chemical and hydrological cycles [6][7][8]. Certain analytic indices [9,10] (e.g., relative change rate, dynamic degree, location quotient, etc.) and dynamic models (e.g., Markov forecasting [11,12], spatial autocorrelation [13], etc.) have been applied widely to studies on land-use change patterns.
Type transition statistics are an important kind of spatio-temporal statistics that provide the data for these analytic indices and dynamic models. They concern the measurable areal relationships characterizing changes between features and answer queries such as "how much farmland has changed to residential land over a period". The transition statistical process (TSP), which involves complicated relational and topological operations, remains a challenge in land data statistical applications. Although transition statistics can be obtained through overlay analysis (Figure 1), the process requires substantial time and resources when handling massive practical land-use data [14]. Query optimizations for spatio-temporal databases (STDB) can reduce the computation of TSP to a certain extent, but they are still unable to satisfy practical needs. The lack of an accurate and efficient TSP method for mass data is the bottleneck of land-use change analysis and data mining.
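To make the output of TSP concrete, the following minimal sketch aggregates overlay pieces into a type transition table. It assumes the overlay step has already produced one record per intersection piece; the function name `transition_matrix` and the sample records are hypothetical and serve only as illustration.

```python
from collections import defaultdict

def transition_matrix(pieces):
    """Sum the areas of overlay pieces by (type at begin time, type at end time)."""
    totals = defaultdict(float)
    for from_type, to_type, area in pieces:
        totals[(from_type, to_type)] += area
    return dict(totals)

# Hypothetical overlay output: (type at Tb, type at Te, area in hectares).
pieces = [
    ("farmland", "residential", 2.5),
    ("farmland", "farmland", 10.0),
    ("forest", "farmland", 1.0),
    ("farmland", "residential", 0.5),
]
matrix = transition_matrix(pieces)
print(matrix[("farmland", "residential")])
```

The aggregation itself is cheap; the costly part in practice is producing the pieces by polygon overlay, which is the step the rest of the paper aims to avoid where possible.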
Sustainability 2015, 72015, 7, page-page 2
Graph representation can be used to facilitate change analysis in vector spatial data through the discovery of long temporal patterns from short-term relations. In this paper, we first analyze land use change with a graph approach and establish a strict mathematical relationship between the graph and spatio-temporal change. Then, we propose and illustrate a new graph-based TSP method built on this graph analysis. Finally, we compare the new TSP method with the basic overlay analysis method and a query-optimized method using practical land-use data.

STDB for Land Management
Spatio-temporal databases have been applied to land use data for effective storage and management of historical data. They can be used to reconstruct past states, trace changes, and analyze the statistics or trends of the land use state. The basic spatio-temporal models that are suitable for effective land-use data management include the base state with amendments model, the object-oriented model, and the event-based model [15][16][17]. One typical spatio-temporal model that combines the characteristics of these basic model types was proposed by Teng et al. [18] (Figure 2). The model stores geographical entities as data objects in a present database (Present DB) and a history database (History DB). Every object has semantic attributes, a geometric shape (Shape), and a lifespan (BeginTime, EndTime). When geographical entities change, the new objects are stored in the Present DB, and the old objects are stored in the History DB. The model also records the inheritance relationship between old objects and new objects in the Event Table, by which local spatio-temporal changes can be traced and the evolution of inheritances can be understood.
Figure 2. Spatio-temporal data model (Teng, 2005 [18]).
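The storage model described above can be pictured with a small in-memory sketch. This is an illustration only, not Teng's schema: the class and field names are hypothetical, the Shape attribute is omitted, Present DB and History DB are merged into one list in which current objects have no end time, and lifespans are assumed to be half-open intervals [BeginTime, EndTime).

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class LandObject:
    oid: str
    land_type: str
    begin_time: int
    end_time: Optional[int] = None  # None: still current (would live in the Present DB)

@dataclass(frozen=True)
class Event:
    old_oid: str  # object that ended
    new_oid: str  # object that inherits from it

def snapshot(objects, t):
    """Objects alive at time t, assuming half-open lifespans [begin_time, end_time)."""
    return [o for o in objects
            if o.begin_time <= t and (o.end_time is None or t < o.end_time)]

# Hypothetical history: parcel A (farmland) is split in 2011 into B and C.
objects = [
    LandObject("A", "farmland", 2009, 2011),
    LandObject("B", "residential", 2011),
    LandObject("C", "farmland", 2011),
]
event_table = [Event("A", "B"), Event("A", "C")]

print([o.oid for o in snapshot(objects, 2010)])
print([o.oid for o in snapshot(objects, 2012)])
```

The event table is what later becomes the edge set of the spatio-temporal graph: each row links an old object to one of its successors.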
Topological information plays an important role in land use databases. Point-set methods are generally used in vector GIS [19]. Topology is generally used in maintaining data quality. An important function of topological information is storing a spatial partition since, without topological information, it is difficult to control whether a given set of regions forms a partition or not [20]. Spatio-temporal topology represents the inheritance relations in land use changes, which are involved in some queries. Spatio-temporal topology is used for land use database updating in China [21].


Statistical Process in STDB
An STDB organized with Teng's model, or its analogues, stores the spatio-temporal topology of changes and enables various forms of query when spatio-temporal query languages [22] are integrated. The STDB can be used in GIS to retrieve entities in snapshots and to optimize static statistical processes. However, GIS cannot facilitate query or analysis computations for entities that are beyond the representation capabilities of its data models [23], and the optimization of transition statistical processes is strictly limited unless the data model reveals further connections between simple topological relations.
Ruan et al. introduced a land-use temporal statistics model based on transaction mechanics in 2011 [24]. Under the concept of a "transaction", complex changes are divided into "change basic events", and a "change index table" is then built to speed up the statistical query process. However, Ruan's statistical method, which is specific to land transitions, cannot be applied to long-term changes where multiple complex changes occur at the same place during a period.

Analysis of Land Use Change in Graph Theory Approach
Spatio-temporal graphs (STG) show how entities are related in space and time. This section focuses on the relationship between long-interval land use transitions and the spatio-temporal topology of short-term local events in an STG. We extend the STG into a flow network in which the changed area of local events is mapped to edge capacity and the long-term type transition is modeled as a special kind of network flow mapped from spatio-temporal transitions. The properties of this network flow ensure that it remains constant in a reducible network, which is defined as the basis of a fast TSP method.

Characterization of Spatio-Temporal Flow Network
Wilcox classified spatio-temporal models into four types based on continuity in space and time and discussed methods of modeling space-discrete, time-continuous spatio-temporal data with a graph-based approach [25]. Based on his research, Yin et al. presented a space-discrete, time-discrete graph model for cadastral changes [26]. Spéry et al. first investigated the application of directed acyclic graphs (DAG) to distributed cadastral management [27]. Recently, Rodriguez et al. studied formalizations that extend the expressive power of spatio-temporal graphs and the mapping from spatio-temporal graphs to STDB [28].
Land-use change can be described by Yin's STG model, since the change is space-discrete and time-discrete (see Figure 3). Vertices represent spatio-temporal entities whose temporal relationship forms a partial order. A directed edge connects a pair of spatio-temporal entities that intersect in space and meet in time, and the intersection area of the two entities is recorded for the edge. An STG is acyclic because time is irreversible.
If we specify two timestamps, namely Tb and Te, and a geographic region R, the 5-tuple (V, E, S, T, c) can be used to represent a spatio-temporal flow network (STFN). V and E, respectively, denote the vertex set and edge set in the STG that is associated with the change between Tb and Te within R. Source set S and sink set T respectively denote the spatio-temporal entities existing at Tb and Te within R. We define the capacity function c as the mapping from edge (u, v) to the intersection area between entities u and v. In Figure 4, the land-use change process described in Figure 3 is represented by an STFN.
Let the function g return the geometric shape of a spatio-temporal entity, and let the function Area calculate the area of a geometric shape. Then c can be formulated as:

$$c(u, v) = \mathrm{Area}\big(g(u) \cap g(v)\big), \quad \forall (u, v) \in E \tag{1}$$

The birth and death of every entity are expected to be recorded in the edge set E. Therefore, it can be deduced that:

$$g(w) = \bigcup_{u \in V,\ (u, w) \in E} \big(g(u) \cap g(w)\big), \quad \forall w \in V \tag{2}$$

$$g(w) = \bigcup_{v \in V,\ (w, v) \in E} \big(g(v) \cap g(w)\big), \quad \forall w \in V \tag{3}$$

If we define:

$$c^{-}(w) = \sum_{u \in V,\ (u, w) \in E} c(u, w), \quad \forall w \in V \tag{4}$$

$$c^{+}(w) = \sum_{v \in V,\ (w, v) \in E} c(w, v), \quad \forall w \in V \tag{5}$$

then for every internal vertex w (a vertex that is neither a source nor a sink), Equations (2) and (3) imply that:

$$c^{-}(w) = c^{+}(w) \tag{6}$$

Because the predecessors of w are mutually exclusive in space, as are its successors, both sums equal Area(g(w)). Equation (6) means that the capacities of the inflow edges and outflow edges of any internal vertex are balanced.
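The balance property can be checked mechanically once edge capacities are known. The sketch below is a minimal illustration under stated assumptions: the network is given as a dictionary from edges to intersection areas, and the function names and the toy parcels are hypothetical.

```python
from collections import defaultdict

def capacity_sums(capacity):
    """Return (inflow, outflow): total capacity entering and leaving each vertex."""
    inflow, outflow = defaultdict(float), defaultdict(float)
    for (u, v), c in capacity.items():
        outflow[u] += c
        inflow[v] += c
    return inflow, outflow

def is_capacity_balanced(capacity, sources, sinks, tol=1e-9):
    """True if inflow equals outflow at every internal vertex (up to tol)."""
    inflow, outflow = capacity_sums(capacity)
    internal = (set(inflow) | set(outflow)) - set(sources) - set(sinks)
    return all(abs(inflow[w] - outflow[w]) <= tol for w in internal)

# Hypothetical history: parcel a (30 ha) splits into b (10 ha) and c (20 ha),
# which later merge into parcel d (30 ha).
capacity = {("a", "b"): 10.0, ("a", "c"): 20.0, ("b", "d"): 10.0, ("c", "d"): 20.0}
print(is_capacity_balanced(capacity, {"a"}, {"d"}))

# An inconsistent event record breaks the balance at vertex c.
broken = dict(capacity)
broken[("c", "d")] = 15.0
print(is_capacity_balanced(broken, {"a"}, {"d"}))
```

A failed check on real data would point to a missing or inconsistent event record rather than to a property of the model.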

Modelling Long-Term Transition as Multi-Commodity Flow
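A multi-commodity flow problem is a kind of network flow problem where multiple commodities are routed between different sources and sinks [29]. In an STFN, we use different commodities to distinguish the topological relations of different connected S-T pairs (for simplicity, the term "A connects to B" is used in a directed graph to mean that A can reach B or B can reach A). Let $f_{ij}(u, v)$ be the intersection area of entities $s_i$, $t_j$, u, and v; specifically:

$$f_{ij}(u, v) = \mathrm{Area}\big(g(s_i) \cap g(t_j) \cap g(u) \cap g(v)\big), \quad \forall (u, v) \in E \tag{7}$$

Since the geometric shapes at any given timestamp are mutually exclusive and collectively exhaustive, for any source-sink pair $(s_i, t_j)$ we can deduce that the flow is conserved at every internal vertex w:

$$\sum_{u \in V,\ (u, w) \in E} f_{ij}(u, w) = \sum_{v \in V,\ (w, v) \in E} f_{ij}(w, v) \tag{8}$$

and that, taken over all pairs, the commodities saturate every edge:

$$\sum_{i, j} f_{ij}(u, v) = c(u, v), \quad \forall (u, v) \in E \tag{9}$$

Let $F_{ij}$ be the transition area from entity $s_i$ to entity $t_j$. It follows that:

$$F_{ij} = \sum_{v \in V,\ (s_i, v) \in E} f_{ij}(s_i, v) = \mathrm{Area}\big(g(s_i) \cap g(t_j)\big) \tag{10}$$

Hence, determining multi-commodity flow values in an STFN is equivalent to calculating transition areas.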
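A one-dimensional toy case may help to see why the commodity flows add up to transition areas. In the sketch below, parcels are replaced by intervals on a line and "area" by interval length; the three snapshots, all names, and the helper `overlap` are assumptions made for illustration. The code checks that the commodity flows defined by four-way intersections saturate every edge and that the flow leaving each source for each sink equals the direct source-sink intersection.

```python
def overlap(*intervals):
    """Length of the common part of half-open intervals (lo, hi)."""
    lo = max(a for a, _ in intervals)
    hi = min(b for _, b in intervals)
    return max(0, hi - lo)

# Hypothetical snapshots of the region [0, 10) at three times.
g = {"s1": (0, 4), "s2": (4, 10),    # sources (time Tb)
     "w1": (0, 6), "w2": (6, 10),    # intermediate entities
     "t1": (0, 5), "t2": (5, 10)}    # sinks (time Te)
layers = [["s1", "s2"], ["w1", "w2"], ["t1", "t2"]]
sources, sinks = layers[0], layers[-1]

# Edges join entities of consecutive snapshots that intersect; capacity = overlap.
capacity = {(u, v): overlap(g[u], g[v])
            for a, b in zip(layers, layers[1:])
            for u in a for v in b if overlap(g[u], g[v]) > 0}

def f(i, j, u, v):
    """Commodity flow of pair (i, j) on edge (u, v): the four-way intersection."""
    return overlap(g[i], g[j], g[u], g[v])

# The commodities together saturate every edge.
saturated = all(sum(f(i, j, u, v) for i in sources for j in sinks) == c
                for (u, v), c in capacity.items())

# Flow value of each S-T pair, summed over the edges leaving the source.
F = {(i, j): sum(f(i, j, u, v) for (u, v) in capacity if u == i)
     for i in sources for j in sinks}
print(saturated, F)
```

Here w1 is reached by both sources and reaches both sinks, so the flow values cannot be read off the capacities alone; they are obtained only because the geometry is available.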

Constant Multi-Commodity Flow Condition
Based on graph theory [30], we prove that the multi-commodity flow is constant under certain conditions.
Preliminary. A transportation network with multiple sources and sinks is a 5-tuple (V, E, S, T, c), where the m sources $s_1, s_2, \ldots, s_m \in S$ and the n sinks $t_1, t_2, \ldots, t_n \in T$ belong to the vertex set V, and $c: E \to \mathbb{R}^{+}$ is the capacity function for edges in the directed edge set E. An S-T path P is a directed path from one source to one sink. Denote the complete set of S-T paths by $\mathcal{P}$, the set of S-T paths from $s_i$ to $t_j$ by $\mathcal{P}_{ij}$, and the set of S-T paths containing edge e by $\mathcal{P}_e$; i.e., $\mathcal{P}_{ij} = \{P \in \mathcal{P} \mid s_i \in P,\ t_j \in P\}$ and $\mathcal{P}_e = \{P \in \mathcal{P} \mid e \in P\}$. A (path-form) multiflow $f \in \mathbb{R}_{+}^{\mathcal{P}}$ is an assignment of flows along S-T paths. A feasible multiflow is nonnegative and subject to the capacity constraint:

$$\sum_{P \in \mathcal{P}_e} f_P \le c(e), \quad \forall e \in E$$

We suppose that the commodities routed between different connected S-T pairs are different, and use F to denote the multi-commodity flow with respect to the S-T pairs, where

$$F_{ij} = \sum_{P \in \mathcal{P}_{ij}} f_P$$

is the flow value for the S-T pair $(s_i, t_j)$.
Unlike the residual network for a single-commodity flow, the residual network for a multiflow f does not contain directed edges opposite to the edges in the original network. It contains each edge e with capacity $c_f(e) = c(e) - \sum_{P \in \mathcal{P}_e} f_P$ if $c_f(e) > 0$.
Let $R(v)$ be the set of vertices that can reach v or that v can reach (including v itself); i.e., $R(v) = \{u \in V \mid (\exists P)\ u \in P,\ v \in P\}$. Obviously, $u \in R(v) \Leftrightarrow v \in R(u)$. The network presented in this paper is assumed to have no directed paths connecting two different sources or two different sinks; that is, $|R(s) \cap S| = 1$ for each $s \in S$ and $|R(t) \cap T| = 1$ for each $t \in T$. We also assume that the network is S-T connected, which implies $|R(v) \cap S| > 0$ and $|R(v) \cap T| > 0$ for each $v \in V$.
Definition 1. A vertex v is called a mixed vertex if more than one source can reach v and v can reach more than one sink, i.e.:

$$|R(v) \cap S| > 1 \ \text{and} \ |R(v) \cap T| > 1$$

Definition 2. A one-way cut is a cut $[X, \bar{X}]$ separating S from T where no directed edges have a tail in $\bar{X}$ and a head in X.
Definition 3. An S-T exclusive cut $\delta(X)$ is a one-way cut that only contains edges which can be reached from exactly one source and can reach exactly one sink, i.e.:

$$|R(u) \cap S| = |R(v) \cap T| = 1, \quad \forall (u, v) \in \delta(X)$$

Lemma 1. A directed acyclic network without mixed vertices has an S-T exclusive cut.
Proof. Let $X_i$ be the set of $s_i$ and all non-terminal vertices that only $s_i$ among S can reach; i.e., $X_i = \{u \mid R(u) \cap S = \{s_i\}\}$. For every edge (u, v) in the cut $[X_i, \bar{X_i}]$, there are only two possible circumstances for its head v:
• $v \in T$; then $|R(v) \cap T| = 1$ by the assumption on the network.
• $v \in V - T$; then v must be reachable from sources other than $s_i$ (otherwise it would be included in $X_i$), so it must connect to only one sink (or it would be a mixed vertex), i.e., $|R(v) \cap T| = 1$.
Since the sets $X_i$ for different sources do not intersect, the number of reachable terminals of each vertex in the union $X = \bigcup_{i=1}^{m} X_i$ is unchanged. Hence, $|R(u) \cap S| = |R(v) \cap T| = 1$ for all $(u, v) \in [X, \bar{X}]$. Obviously, $[X, \bar{X}]$ is a one-way cut, so $[X, \bar{X}]$ is an S-T exclusive cut. Q.E.D.
Remark 1. The above proof also implies a method to find an S-T exclusive cut, which is used in the optimization algorithm.
Definition 4. A multiflow f on a network is saturated if the total flow on each edge is maximized, i.e.:

$$\sum_{P \in \mathcal{P}_e} f_P = c(e), \quad \forall e \in E$$

Definition 5. A network is capacity-balanced if, for all internal vertices, the capacity sum of the inflow edges equals the capacity sum of the outflow edges. Let $c^{-}(v) = \sum_{u \in V} c(u, v)$ and $c^{+}(v) = \sum_{u \in V} c(v, u)$. The capacity-balanced constraint is then represented as:

$$c^{-}(v) = c^{+}(v), \quad \forall v \in V - S - T$$

Remark 2. This section intends to determine under what circumstances the saturated multiflow on a directed acyclic capacity-balanced network (DACB network) has a constant multi-commodity flow with respect to S-T pairs. Although the result could be obtained by solving a linear program directly [31], its variable size is $O(|E| \times |S| \times |T|)$, and only small instances of the multiflow problem can be solved as linear programs in practice. Hence, we need to find a simpler approach by studying the properties of DACB networks.
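Remark 1 notes that the proof of Lemma 1 doubles as a procedure for finding an S-T exclusive cut. A minimal sketch of that procedure follows; it is not the authors' implementation. The function names are hypothetical, reachability is computed by one depth-first search per terminal for brevity, and the edge list is an assumed reading of the Figure 5 example (the edges v5→v8, v6→v9 and the four edges around v7 are inferred from the text, not taken from the figure).

```python
from collections import defaultdict

def reach(starts, nbrs):
    """Map every vertex to the set of start vertices that can reach it via nbrs."""
    reached = defaultdict(set)
    for s in starts:
        reached[s].add(s)
        stack = [s]
        while stack:
            x = stack.pop()
            for y in nbrs.get(x, ()):
                if s not in reached[y]:
                    reached[y].add(s)
                    stack.append(y)
    return reached

def analyse(edges, sources, sinks):
    """Return (mixed vertices, S-T exclusive cut); the cut is None if a mixed vertex exists."""
    succ, pred = defaultdict(list), defaultdict(list)
    for u, v in edges:
        succ[u].append(v)
        pred[v].append(u)
    src_of = reach(sources, succ)  # sources that can reach each vertex
    snk_of = reach(sinks, pred)    # sinks that each vertex can reach
    vertices = set(succ) | set(pred)
    mixed = {v for v in vertices if len(src_of[v]) > 1 and len(snk_of[v]) > 1}
    if mixed:
        return mixed, None
    # X is the union of the X_i in the proof: non-sink vertices reached by one source.
    x = {v for v in vertices if v not in sinks and len(src_of[v]) == 1}
    cut = sorted((u, v) for u, v in edges if u in x and v not in x)
    return mixed, cut

upper = [("v1", "v8"), ("v2", "v5"), ("v2", "v6"), ("v5", "v8"), ("v6", "v9")]
lower = [("v3", "v7"), ("v4", "v7"), ("v7", "v9"), ("v7", "v10")]

print(analyse(upper, ["v1", "v2"], ["v8", "v9"]))
print(analyse(upper + lower, ["v1", "v2", "v3", "v4"], ["v8", "v9", "v10"]))
```

On the upper sub-network this construction yields the cut {(v1,v8), (v5,v8), (v6,v9)}, which sits closer to the sinks than a cut taken directly at the sources; an S-T exclusive cut need not be unique.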
Fact 1. In a DACB network, the residual network for any given feasible flow is still a DACB network.
Fact 2. In a DACB network, if P is the only S-T path that includes edge (u, v), then (u, v) has the smallest capacity among the edges along P.
Proposition 1. There always exists a saturated multiflow in a DACB network.
Proof. We construct a multiflow by sequentially adding a maximum feasible flow on each S-T path. When the first k S-T paths have been filled with maximal flows, denote the residual network by $N_k$. Fact 1 implies that $N_k$ is still a DACB network for $k \in [1, |\mathcal{P}| - 1]$. In the construction process, if the mth S-T path $P_m$ is the last routable path for an edge $e_n$, then $e_n$ must be one of the edges with the smallest capacity along $P_m$ in $N_{m-1}$ because of Fact 2, and $e_n$ is saturated when the flow along $P_m$ is maximized. This ensures that each edge $e_n$ is saturated after the flows on all routable paths P ($e_n \in P$) are added. Thus, the multiflow constructed in this way is saturated.
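Lemma 2. In a DACB network, if an S-T exclusive cut $\delta(X)$ exists, then $F^{*}$, the saturated multi-commodity flow with respect to S-T pairs, is constant.
Proof. Since $\delta(X)$ is a one-way cut, $\pi_{\mathcal{P}} = \{\mathcal{P}_e \mid e \in \delta(X)\}$ is a partition of $\mathcal{P}$. Therefore:

$$F_{ij} = \mathrm{sum}(f, \mathcal{P}_{ij}) = \sum_{e \in \delta(X)} \mathrm{sum}(f, \mathcal{P}_e \cap \mathcal{P}_{ij})$$

where $\mathrm{sum}(f, Q) = \sum_{P \in Q} f_P$. Note that for any $e = (u, v)$ in the S-T exclusive cut $\delta(X)$:

$$\mathcal{P}_e \cap \mathcal{P}_{ij} = \begin{cases} \mathcal{P}_e, & R(u) \cap S = \{s_i\} \ \text{and} \ R(v) \cap T = \{t_j\} \\ \emptyset, & \text{otherwise} \end{cases}$$

and for a saturated multiflow $f^{*}$, $\sum_{P \in \mathcal{P}_e} f^{*}_P = c(e)$. Then every term $\mathrm{sum}(f^{*}, \mathcal{P}_e \cap \mathcal{P}_{ij})$ in the sum above is either $c(e)$ or 0 and hence a constant, so $F_{ij}$ is also constant regardless of $f^{*}$. Q.E.D.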
Lemma 3. In a DACB network, if a mixed vertex exists, then $F^{*}$, the saturated multi-commodity flow with respect to S-T pairs, is not constant.
Proof. The S-T path matrix [32] is denoted by $M_p = [p_{ij}]_{|E| \times |\mathcal{P}|}$, where:

$$p_{ij} = \begin{cases} 1, & \text{if the } i\text{th edge lies in the } j\text{th S-T path} \\ 0, & \text{otherwise} \end{cases}$$

Then we can formulate the saturated multiflow problem as a standard linear program:

$$M_p f = c \tag{18}$$

$$f \ge 0 \tag{19}$$

Here, f is the column vector of $f_P$ whose entries are in the same order as the S-T paths, and c is the column vector of $c(e)$ whose entries are in the same order as the edges. We need to determine whether each S-T pair commodity flow $F_{ij}$ is constant over all feasible solutions of this linear program.
Suppose that $v_m$ is a mixed vertex. Then it is easy to find two S-T paths, $P_{ac}$ and $P_{bd}$, which both contain $v_m$ and route between different sources $s_a$, $s_b$ and different sinks $t_c$, $t_d$, respectively. We can assign any feasible positive concurrent flows $f_{ac}$ and $f_{bd}$ to these two paths. According to Fact 1 and Proposition 1, the residual network for $(f_{ac} + f_{bd})$ has a saturated multiflow $f_r$ and, hence, $f^{*} = f_{ac} + f_{bd} + f_r$ is a particular solution of the linear equations (18). Since $P_{ac}$ and $P_{bd}$ meet at $v_m$, we can find another S-T path $P_{ad}$, which shares the same path with $P_{ac}$ between $s_a$ and $v_m$ and shares the same path with $P_{bd}$ between $v_m$ and $t_d$. We can find the S-T path $P_{bc}$ similarly. Construct a vector $f^{0} = (f^{0}_1, \ldots, f^{0}_k, \ldots, f^{0}_{|\mathcal{P}|})^{T}$, where the real number $\lambda \ne 0$ and:

$$f^{0}_k = \begin{cases} -\lambda, & \text{the } k\text{th path is } P_{ac} \text{ or } P_{bd} \\ \lambda, & \text{the } k\text{th path is } P_{ad} \text{ or } P_{bc} \\ 0, & \text{otherwise} \end{cases}$$

Then $M_p f^{0} = 0$, which means that $f^{0}$ is a nontrivial solution of the homogeneous equations corresponding to (18). So $f = f^{*} + f^{0}$ is also a solution of (18). Given that the components for $P_{ac}$ and $P_{bd}$ in $f^{*}$ are positive, there exist solutions that also satisfy the non-negativity constraint (19) when $\lambda > 0$ is sufficiently small. As $f_{ac}$, $f_{bd}$, $f_{ad}$, and $f_{bc}$ contribute to different $F_{ij}$, F cannot remain constant.
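The argument can be reproduced numerically on the smallest network that contains a mixed vertex. The sketch below uses a hypothetical four-edge network with unit capacities (two sources, one mixed vertex, two sinks) and hypothetical names; it shifts a saturated multiflow by the vector f⁰ from the proof and confirms that the result is still saturated while the per-pair flow values change.

```python
# Hypothetical network: sa, sb -> vm -> tc, td, every edge with capacity 1.
capacity = {("sa", "vm"): 1.0, ("sb", "vm"): 1.0,
            ("vm", "tc"): 1.0, ("vm", "td"): 1.0}

# The four S-T paths, keyed by their (source, sink) pair.
paths = {
    ("sa", "tc"): [("sa", "vm"), ("vm", "tc")],
    ("sa", "td"): [("sa", "vm"), ("vm", "td")],
    ("sb", "tc"): [("sb", "vm"), ("vm", "tc")],
    ("sb", "td"): [("sb", "vm"), ("vm", "td")],
}

def is_saturated(flow):
    """True if flow is nonnegative and the total flow on every edge equals its capacity."""
    totals = dict.fromkeys(capacity, 0.0)
    for pair, edges in paths.items():
        for e in edges:
            totals[e] += flow[pair]
    return min(flow.values()) >= 0 and totals == capacity

def shift(flow, lam):
    """Add the homogeneous solution f0: -lam on P_ac and P_bd, +lam on P_ad and P_bc."""
    f0 = {("sa", "tc"): -lam, ("sb", "td"): -lam,
          ("sa", "td"): lam, ("sb", "tc"): lam}
    return {pair: flow[pair] + f0[pair] for pair in flow}

f_star = {("sa", "tc"): 1.0, ("sa", "td"): 0.0, ("sb", "tc"): 0.0, ("sb", "td"): 1.0}
f_alt = shift(f_star, 0.25)

# Each S-T pair has exactly one path here, so the path flows are the values F_ij.
print(is_saturated(f_star), is_saturated(f_alt))
print(f_star[("sa", "td")], f_alt[("sa", "td")])
```

Both multiflows saturate every edge, yet the flow from sa to td is 0 in one and 0.25 in the other, so capacities alone cannot determine the transition between these parcels. A shift with λ larger than the flow on P_ac or P_bd would violate non-negativity, which is why λ has to be small.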

Theorem 1. (Constant Multi-commodity Flow Condition)
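In a DACB network, the following statements are equivalent:
(1) The network contains no mixed vertex.
(2) The network has an S-T exclusive cut.
(3) The saturated multi-commodity flow $F^{*}$ with respect to S-T pairs is constant.
Proof. This theorem is directly derived from Lemmas 1-3.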

Reducible or Unreducible
As shown in Section 3.1, an STFN is directed, acyclic, and capacity-balanced. Equations (9) and (10) show that the flow value for every S-T pair in the saturated multi-commodity flow of an STFN is the intersection area of the corresponding parcels, which is the quantity used in TSP (see Section 1, Figure 1). An STFN is considered reducible if the saturated multi-commodity flow with respect to S-T pairs is uniquely determined, so that TSP can be reduced to simple operations of summing numbers.
Theorem 1 implies that whether a network is reducible depends on the existence of mixed vertices. Let Rs and Rt respectively be the sets of sources and sinks connecting to mixed vertices. Clearly, a sub-network without any connection between Rs and Rt contains no mixed vertices; hence, such a sub-network is reducible, and an S-T exclusive cut can be found in it and used to calculate the flow values $F_{ij}$ (see Lemma 2). In contrast, the commodity flows $F_{ij}$ that route between Rs and Rt cannot be uniquely determined by finding an S-T exclusive cut (a simple corollary of Theorem 1). Finding a network partition with the maximum reducible network is the key to simplifying TSP.

Description of the Graph-Based TSP Method
Based on the analysis in Section 3, we introduce the graph-based method for TSP in this section. We assume that the land use data are already stored in an STDB and that queries of land use type transitions between any two timestamps are required. We first construct an STFN and then use it to optimize the calculation of TSP. Network partitioning (PartitionSTFN) is the key algorithm of this method.
In the graph construction algorithm (BuildSTFN), the STFN is created as a 5-tuple (V, E, S, T, c) using a typical STDB (introduced in Section 2.2) with the materialization of change events. We retrieve the source set S, the sink set T, and the related event list by spatio-temporal queries, extend the vertex set V by simply iterating over the event list (edge set E), and connect the vertices and edges to form an STG. No spatio-temporal entity information other than identities is read in this graph-construction process. Retrieving the edges and attaching them to the STG forms an STFN. Hash techniques may be used to accelerate this process.
Section 3.4 shows that the spatio-temporal network can be divided into two sub-networks, namely, the reducible one and the unreducible one. In the reducible sub-network, there exists an S-T exclusive cut, so the transition flow can be calculated directly. In the unreducible sub-network, the transition flow is still calculated by the overlay between the spatio-temporal entities corresponding to the sources and sinks of the network. Given that only a few mixed vertices are found in an STFN of practical data, in the function PartitionSTFN we divide the global STFN into maximal reducible and minimal irreducible networks in order to calculate the transition statistics with maximum efficiency.
The pseudo code of the partitioning algorithm is presented below. We use a list Fr of triples (s, t, c) to store the flows for S-T pairs in the reducible network. The queue Q is used for breadth-first search (BFS), the basic search strategy in this algorithm. The algorithm turns a network into a triple (Rs, Rt, Fr). For each vertex v, we use v.Rs and v.Rt to denote the reachable terminals, and use v.color to represent the vertex set X.

    ...
        end for
    end while
    /* exclude flows from irreducible networks */
    Delete (s, t, c) rows in Fr where s ∈ Sm or t ∈ Tm;
    return Rs, Rt, Fr;

For example, in Figure 5, v1, v2, v3, and v4 are the sources and v8, v9, and v10 are the sinks. We observe that v7 simultaneously connects sources v3, v4 and sinks v9, v10, so it is a mixed vertex, which makes the lower sub-network irreducible. In the upper sub-network, edge (v1,v8) only connects source v1 and sink v8, edge (v2,v5) only connects source v2 and sink v8, and edge (v2,v6) only connects source v2 and sink v9; therefore, {(v1,v8), (v2,v5), (v2,v6)} is an S-T exclusive cut, and the flows can be calculated: Fr = [(v1, v8, 10), (v2, v8, 10), (v2, v9, 10)].
The complete procedure of the proposed graph-based TSP is presented in Figure 6. We first construct a dynamic STFN N according to the user's spatio-temporal query. Then, we partition the entire STFN and get the reducible flow list Fr, the irreducible sources Sm, and the irreducible sinks Tm. The transition flows on the reducible sub-network are obtained by simple grouping operations on Fr. At the same time, the transition flows on the irreducible sub-network (between Sm and Tm) are calculated through spatial overlay analysis. The final transition statistics are obtained by summing up the statistics from both sub-networks.
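The overall procedure can be sketched end to end as follows. This is not the authors' PartitionSTFN, whose pseudo code is only partly reproduced above: it uses one depth-first search per terminal instead of a single BFS, and it drops a reducible row only when both its source and its sink are irreducible, an assumption chosen so that the row (v2, v9, 10) of the Figure 5 example stays in Fr. All function names, the capacities around v7, the overlay areas, and the land types are hypothetical.

```python
from collections import defaultdict

def reach(starts, nbrs):
    """Map every vertex to the set of start vertices that can reach it via nbrs."""
    reached = defaultdict(set)
    for s in starts:
        reached[s].add(s)
        stack = [s]
        while stack:
            x = stack.pop()
            for y in nbrs.get(x, ()):
                if s not in reached[y]:
                    reached[y].add(s)
                    stack.append(y)
    return reached

def partition_stfn(capacity, sources, sinks):
    """Return (Fr, Sm, Tm): reducible flows, irreducible sources, irreducible sinks."""
    succ, pred = defaultdict(list), defaultdict(list)
    for u, v in capacity:
        succ[u].append(v)
        pred[v].append(u)
    src_of = reach(sources, succ)  # sources that can reach each vertex
    snk_of = reach(sinks, pred)    # sinks that each vertex can reach
    vertices = set(succ) | set(pred)
    mixed = [v for v in vertices if len(src_of[v]) > 1 and len(snk_of[v]) > 1]
    sm = set().union(*(src_of[v] for v in mixed))
    tm = set().union(*(snk_of[v] for v in mixed))
    # Cut from the proof of Lemma 1: tails reached by exactly one source.
    x = {v for v in vertices if v not in sinks and len(src_of[v]) == 1}
    fr = defaultdict(float)
    for (u, v), c in capacity.items():
        if u in x and v not in x and len(snk_of[v]) == 1:
            (s,), (t,) = src_of[u], snk_of[v]
            if not (s in sm and t in tm):  # pairs in Sm x Tm are left to overlay
                fr[(s, t)] += c
    return dict(fr), sm, tm

def transition_flows(capacity, sources, sinks, overlay_area):
    """Flows per S-T pair: graph sums where reducible, overlay for pairs in Sm x Tm."""
    fr, sm, tm = partition_stfn(capacity, sources, sinks)
    flows = dict(fr)
    for s in sm:
        for t in tm:
            area = overlay_area(s, t)
            if area > 0:
                flows[(s, t)] = area
    return flows

def type_transitions(flows, land_type):
    """Group parcel-level flows by (land type at Tb, land type at Te)."""
    totals = defaultdict(float)
    for (s, t), area in flows.items():
        totals[(land_type[s], land_type[t])] += area
    return dict(totals)

# Figure 5 example; capacities on the lower sub-network are hypothetical.
capacity = {("v1", "v8"): 10.0, ("v2", "v5"): 10.0, ("v2", "v6"): 10.0,
            ("v5", "v8"): 10.0, ("v6", "v9"): 10.0,
            ("v3", "v7"): 5.0, ("v4", "v7"): 5.0,
            ("v7", "v9"): 5.0, ("v7", "v10"): 5.0}
sources, sinks = ["v1", "v2", "v3", "v4"], ["v8", "v9", "v10"]
# Stand-in for the spatial overlay of irreducible parcels (hypothetical areas).
overlay = {("v3", "v9"): 2.0, ("v3", "v10"): 3.0, ("v4", "v9"): 3.0, ("v4", "v10"): 2.0}
land_type = {"v1": "farmland", "v2": "farmland", "v3": "forest", "v4": "farmland",
             "v8": "residential", "v9": "farmland", "v10": "residential"}

fr, sm, tm = partition_stfn(capacity, sources, sinks)
flows = transition_flows(capacity, sources, sinks, lambda s, t: overlay.get((s, t), 0.0))
print(fr, sm, tm)
print(type_transitions(flows, land_type))
```

Only the four parcel pairs between {v3, v4} and {v9, v10} require geometry in this example; the other three flows come from summing capacities, which is where the expected saving over a full overlay arises.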

Sample Data
China has experienced noticeable land-use changes for years as a result of rapid economic development, accelerated urbanization, and the implementation of an ecological protection strategy [33]. The contradiction between economic development, farmland protection, and ecological sustainability is a problem that draws increasing attention from land management and planning departments. To effectively manage and analyze the land use status in China, the Second National Land Survey was launched (2007-2009). Since then, map accuracy, data storage methods, and the land classification system have been significantly improved. The land use databases have been continuously updated through the annual land change survey mechanism.
Our sample data (Figure 7) was actually measured from two towns during the Second National Land Survey and annual land change survey up to 2014.In this sample region with a land area of 15,373 hectares, the amount of land parcels ranges from 18,502 to 18,660, and the approximate average number of polygon's vertices is 79 (there are also line objects in snapshots, of an amount that ranges from 13,615 to 18,502, with average number of vertices 4).In 2009, agricultural land (farmland, garden, and forest) accounted for 78.56%, construction land accounted for 18.40%, and unused land accounted for 2.90% in this area.We observed that the land use changes of in the sample region are diversified: they were resulting from construction, reclamation, reforestation, and many other human activities; the transition from farmland to construction land is the highest (345 hectares, up to 2014), and changes between other land use types also occurred; as to the pattern of changing parcels, merging, splitting, and partial geometric changes are all discovered.10 development, accelerated urbanization, and the implementation of ecological protection strategy [33].The contradiction between economic development, farmland protection, and ecologic sustainability is a problem that draws increasing attentions of the land management and planning departments.To effectively manage and analyze the land use status in China, the Second National Land Survey was launched (2007)(2008)(2009).From then on, the map accuracy, data storage methods, and land classification system have been significantly improved.The land use databases have been continuous updated due to annual land change survey mechanism.
Our sample data (Figure 7) were measured in two towns during the Second National Land Survey and the annual land change surveys up to 2014. In this sample region, with a land area of 15,373 hectares, the number of land parcels ranges from 18,502 to 18,660, and the average number of vertices per polygon is approximately 79 (there are also line objects in the snapshots, whose number ranges from 13,615 to 18,502, with an average of 4 vertices). In 2009, agricultural land (farmland, garden, and forest) accounted for 78.56%, construction land for 18.40%, and unused land for 2.90% of this area. We observed that the land-use changes in the sample region are diverse: they resulted from construction, reclamation, reforestation, and many other human activities; the transition from farmland to construction land is the largest (345 hectares, up to 2014), and changes between other land-use types also occurred; as to the patterns of changing parcels, merging, splitting, and partial geometric changes were all found.

Method for Evaluation
We consider three statistical methods for comparison: the basic TSP, the query-optimized TSP, and the graph-based TSP. The basic TSP uses a spatial overlay of all parcels to calculate transition areas and provides a lower bound of performance; the query-optimized TSP retrieves only the changed parcels from the STDB for the spatial overlay and reflects the current level of performance; the graph-based TSP is the method proposed in this paper, which performs spatial overlay only for the parcels belonging to the irreducible network.
The time complexity of TSP is mainly determined by the number of polygons involved in the spatial overlay. A polygon intersection operation can achieve a time complexity of $O(p \times \log_2 p)$ [14,34], where p is the number of vertices of each polygon. When we overlay two polygon layers containing n polygons each, $n^2$ intersection operations have to be performed. The polygon overlay therefore has a time complexity of $O(n^2 \times p \log_2 p)$, which determines the efficiency of the basic TSP. In the query-optimized TSP, a spatio-temporal query is performed before the spatial overlay, but the query takes no more than $O(m)$ time, where m is the number of entities in the region. When calculating the graph-based TSP with an STG of |V| vertices and |E| edges, the function BuildSTFN can achieve a time complexity of $O(|E|)$, and the function DecomposeSTFN is based on BFS, which is known to have $O(|V| + |E|)$ complexity. All three TSP methods have to perform a spatial overlay, whose complexity is quadratic in the number of polygons and superlinear in the number of polygon vertices; when n and p are large enough, its time cost is far greater than that of all other procedures in any of the TSP methods. In addition, the polygons processed by the spatial overlay must be read into memory, which determines the space cost of the methods.
Notice that n varies with the method. In the basic TSP, all parcels at both timestamps Tb and Te are processed. In the query-optimized TSP, only the parcels that changed between Tb and Te are processed. In the graph-based TSP, the number of parcels processed by spatial overlay operations is given by the sources and sinks in the irreducible network. Since the spatial overlay dominates the performance of TSP, we can evaluate the three methods by counting n.
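To make the role of n concrete, the following Python sketch evaluates the overlay cost model $n^2 \times p \log_2 p$ stated above for different polygon counts. The value p = 79 is the average vertex number reported for the sample data; the polygon counts per method and the function names are hypothetical placeholders, not the values of Table 1. The model ignores constant factors and all non-overlay steps, so it indicates relative growth only.

```python
import math

def overlay_cost(n, p):
    """Relative overlay cost under the model n^2 * p * log2(p)."""
    return n * n * p * math.log2(p)

def relative_cost(n_method, n_basic, p):
    """Cost of a method relative to the basic TSP for the same p."""
    return overlay_cost(n_method, p) / overlay_cost(n_basic, p)

P_AVERAGE = 79  # average polygon vertex number of the sample data

# Hypothetical polygon counts per method (illustrative only).
counts = {"basic": 18000, "query-optimized": 1800, "graph-based": 12}

for name, n in counts.items():
    ratio = relative_cost(n, counts["basic"], P_AVERAGE)
    print(f"{name}: n = {n}, relative overlay cost = {ratio:.3e}")
```

Because p cancels when the same average vertex number is assumed for every method, the relative cost in this model reduces to the squared ratio of the polygon counts, which is why counting n is sufficient for the comparison.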

Discussion of the Results
We tested the TSP methods by calculating the land-use type transition matrix for the sample data. Table 1 shows the number and ratio of the polygons processed by spatial overlay in each TSP method. The results in Table 1 imply that the graph-based TSP is exceptionally efficient: for five-year statistics, it performs a spatial overlay on a data size that is only about 1/1558 of that in the basic TSP and about 1/157 of that in the query-optimized TSP. The results correspond to the actual performance of the program using ArcGIS Engine: the graph-based TSP completes the calculations almost instantly, whereas the other two methods take several minutes to obtain the same results.
In general, land-use change is sporadic in space and time [35] and mostly directional in terms of feature types [36]. These characteristics contribute to the high reducibility of the STFN in this graph model. The efficiency of the method is verified on the sample data, which contain spatio-temporal changes that are diverse in causes, transition types, and forms of geometric change. For these reasons, the graph-based method is likely to retain high efficiency for land-use data elsewhere.

Conclusions
The spatio-temporal topological information in land-use databases can be utilized to improve statistical processes. Land-use data organized with typical event-based spatio-temporal models can be described as a spatio-temporal graph, and further modeled as a transportation network with several interesting properties. Based on the proof that transition flows on such a transportation network can be determined under certain conditions, a graph-based method is proposed to reduce the number of polygons involved in the spatial overlay procedure, which determines the efficiency of the transition statistical process. As demonstrated in experiments with practical data from Changsha, the proposed method performs far better than the other methods considered, and it is expected to retain high efficiency for land-use data elsewhere. The proposed method can be used to support statistical analysis or administrative applications where efficient and accurate statistical processes are required. The approach can also be used for change statistics on properties other than land-use type, and can possibly be applied to transition calculations for STDBs in land cover, cadastral, and other fields.

Figure 1. Transition flow calculation by overlay analysis.

Figure 4. Spatio-temporal flow network (note: the capacities are labeled on each edge).

Figure 7. Experimental land-use data in Hunan. The first picture displays the base state in 2009, and the following five pictures display incremental data indicating the location and time of change events (2009-2014).

Table 1. Number of polygons processed by spatial overlay in the experiment.