A Management Method of Multi-Granularity Dimensions for Spatiotemporal Data

: To understand the complex phenomena in social space and monitor the dynamic changes in people’s tracks, we need more cross-scale data. However, when we retrieve data, we often ignore the impact of multi-scale, resulting in incomplete results. To solve this problem, we proposed a management method of multi-granularity dimensions for spatiotemporal data. This method systematically described dimension granularity and the fuzzy caused by dimension granularity, and used multi-scale integer coding technology to organize and manage multi-granularity dimensions, and realized the integrity of the data query results according to the correlation between the different scale codes. We simulated the time and band data for the experiment. The experimental results showed that: (1) this method effectively solves the problem of incomplete query results of the inter-section query method. (2) Compared with traditional string encoding, the query efficiency of multiscale integer encoding is twice as high. (3) The proportion of different dimension granularity has an impact on the query effect of multi-scale integer coding. When the proportion of fine-grained data is high, the advantage of multi-scale integer coding is greater.


Introduction
The spatiotemporal data reflect the quantitative and qualitative characteristics, spatial structure, spatial relations, and their changes with time of various elements or phenomena in the geographical world, which is the basis for human cognition of the geographical world. Most of the problems we face must be addressed based on data-driven approaches for understanding better and achieving more efficient and optimal decisions [1,2]. In recent years, with the development of Internet technology and sensor equipment, the production mode of spatiotemporal data has changed from passive production and active production to automatic production, which makes the spatiotemporal data resources we obtain more abundant [3][4][5]. We can use the monitoring video to analyze the vehicle operation and the movement rule of people on the road, realize the tracking and real-time prediction of traffic conditions, and avoid traffic and congestion. It can also improve the accuracy of weather forecasting by establishing models for continuous years of observation data. However, many social phenomena are complex. In order to reveal their essence, we need more cross-scale data [6,7]. For example, in recent years, the outbreak of COVID-19 has seriously affected the development of society. Many scholars have contributed to the prevention of the outbreak of COVID-19 by analyzing the correlation between relevant indicators at different scales and confirmed cases of COVID-19 [8,9]. Therefore, we can see the importance of different scale data for data mining. However, when we query data, we often ignore the multi-scale impact of data, resulting in incomplete data acquisition. Therefore, there is an urgent need for a multi-granularity dimension management method of spatiotemporal data. The rest of this paper was organized as follows: "Related work" introduced the content related to this study. "Method" introduced the DGFQM and the multi-scale dimension integer coding method. "Results" verified the effectiveness of the method and analyzed its results. "Discussion" was the content related to the results, including shortcomings and prospects.
Multi-granularity dimensions of spatiotemporal data mainly included time and space. First, we understand how to manage time and space in spatiotemporal data management. A popular direction of spatial information management was the method based on the grid model [10,11], which used a spatial filling curve to build grid code, and improved the indexing and query efficiency of multi-scale spatial data. For example, Guo [12] and others proposed an adaptive Hilbert-Geohash geogrid and coding method. Cao et al. [13] and others used a Hilbert curve to store and retrieve spatiotemporal data. However, geohash encoding lacked cross-scale spatial relations, resulting in low indexing efficiency. Zhai et al. [14] and others proposed a level-by-level space-filling curve, which improves the correlation between multiple levels by connecting adjacent levels. The clustering between levels of this method was poor, and the existing spatial retrieval strategy only considered the intersection and the included data that was not considered [15]. To solve the above problems, Lei et al. [16] proposed a global multi-scale spatial grid coding model, and designed a strategy to ensure the integrity of spatial queries based on this model. Multi-scale time was more used for auxiliary processing in simple ways, such as timestamps [17] and strings [18], which were scattered in file systems [19][20][21], databases [22][23][24], and programming languages. This time management method did not retain the multi-scale information of time, and it was difficult to manage the multi-scale time uniformly. To solve this problem, Tong et al. [25] proposed a multi-scale time segment integer coding method, which uses integer representation of time scale information and location information. However, the management of multi-granularity dimensions had the following challenges: First, the intersection query results based on time and other one-dimensional dimensions are incomplete, and the impact of time multi-scale is not considered. The current fuzzy query method used fuzzy set theory [26] to solve the problem of fuzzy words such as "left and right" and "probably" in time description, and could solve the new fuzzy problem caused by multi-granularity [27,28].
Secondly, the current research on multi-granularity dimensions was mainly about time and space, and other multi-scale dimensions were not discussed. As a kind of spatiotemporal data, remote sensing data have the ability to cover a wide area of spectrum. The spectral band spanned from visible light, thermal infrared to microwave, and the resolution changed from multispectral to hyperspectral, which was an important indicator for distinguishing physical properties of ground objects in remote sensing data [29,30]. Recently, the application of radio wave imaging technology in daily life had made the band information span larger [31][32][33]. Zhang et al. [34] designed five spatiotemporal spectral integrated storage formats for long-term remote sensing data with time, space, and spectral information. However, there were few studies on multiscale bands. At present, band information was represented by unique identifiers in the database system. This method is not conducive to the unified storage of multi-source information.
Based on the above analysis, we propose a multi-granularity dimensions management method for spatiotemporal data, which is mainly divided into the DGFQM and multi-scale dimension integer coding. The DGFQM divided the query results into fuzzy data and fine data according to the dimension granularities, and obtained complete query results according to the correlation between different scale codes. Multi-scale dimension integer coding mainly applied the multi-scale integer coding method to the band. We designed an association method between arbitrary scale band and multi-scale integer coding to improve the efficiency of data retrieval.

Dimension
Dimension refers to the inherent and measurable physical properties of physical quantities. Internationally, the seven basic dimensions, such as time and length, are often used to represent other physical quantities. Physical quantity under the same dimension has the function of ordering and is often used to retrieve conditions. However, the dimensions have multi-granularity characteristics. The multi-granularity dimensions easily lead to data loss, so we defined the concept of dimension granularity fuzziness, and described the fuzziness problem.

Dimension Granularity
Granularity refers to the size of the particles. Granularity is measured by particle diameter (usually long or medium diameter). We expressed the measure of physical quantity with dimension granularities. Database systems usually use existing units to represent dimension granularities, such as standard time units (year, month, day, etc.), and length units (meter, decimeter, millimeter, etc.). The premise of realizing this goal requires a simple, effective, and easy-to-use multi-granularity dimension system. Therefore, we defined the relevant concepts as follows: There are specific conversion rules between these units such as 60 min for an hour. However, there is not only fixed granularity information but also other granularity information. Therefore, it is urgent to implement the limited granularities to represent other granularities. The dimensions have two different representations: point and segment type. The point type represents a position on a dimension domain, represented by a value of a certain granularity. The segment type represents the interval on the dimension domain, which is represented by two points. This representation method realizes the representation of various granularities by existing units.

Dimension Granularity Fuzziness
At present, the fuzzy problem adopts the fuzzy set theory. The method calculates the probability of fuzzy data occurrence through the membership function. In this way, the fuzzy point was represented as a two-tuple (d1, δ1), where d1 represented the point, and δ1 represented the membership degree. The fuzzy segment was converted into a quad-tuple (d1, δ1, d2, δ2), where d1 and d2 were the start point and endpoint, and δ1 and δ2 were the membership degrees of the start point and endpoint, respectively. The premise of using the fuzzy set theory was to obtain the fuzzy data set. However, the fuzzy data sets were obtained through semantic computing or empirical knowledge. The above methods cannot solve the fuzziness caused by multi-granularity dimensions. Therefore, we described point fuzziness and segment fuzziness separately.
Compared with fine-grained data, coarse-grained data with multi-granularity dimensions has uncertainty. Therefore, different granularity choices for the same event produce different results. We defined the fuzziness induced by multi-granularity as the dimension granularity fuzziness, describing the fuzziness problem of point type and segment type, respectively.

Point
At present, most database systems use a point of a certain granularity to represent the state of an object, which is usually an index value. There are different granularities in practical applications, so granularity conversion is needed. We defined the granularity transformation function T: where dR is a point at the granularity of R, H is a granularity of transformation, and h is a point at the granularity of H. Assume that d1 is a point at the granularity of R1, and R2 is a different granularity from R1. The conversion of d1 from the granularity of R1 to that of R2 involves the following two cases: R1 ≮ R2, there is a unique dimension point d2 at the granularity of R2, i.e., d2 = h; R1 ≯ R2, {d2|l < d2 < u} = h, where l~u is a point set of R2. The constant is generally used as the retrieval condition, so we divided the transfer function T into Ts and Tl.
where Ts is the minimum value converted to H granularity, Tl is the maximum value converted to H granularity. Because of the multi-granularity characteristic of dimension, different granularity description of the same event produces different results. When describing the same event, coarse-grained points are fuzzier than fine-grained points. For example, the Wenchuan earthquake occurred on 12 May 2018 (China Standard Time), and the time under the annual granularity is 2018. The time information at annual granularity is fuzzier than that at daily granularity. We may miss this fuzzy information when retrieving data.

Segment
The segment represents a binary group [d1, d2], which is all points between d1 and d2. The granularity of d1 and d2 are R1 and R2. In the ideal case, the segment can represent by one index value. However, the length of the segment does not correspond to the existing granularity. Currently, dimension segments represent by two fields, which is inefficient when querying. With the introduction of multi-scale integer coding, we designed the following rules to attain a reasonable and smaller number of index values to represent segments. According to the granularity relationship between d1 and d2, there are two kinds of cases.
Case 1: The granularity of d1 is equal to d2, i.e., R1 = R2. Assuming the interval length is L. If Rx = L, select the value at the granularity of Rx to represent this interval, as shown in Figure 1a. If Rx ≠ L, there are two filling methods. One is to fill from coarse-grained to fine-grained. The following three situations may exist depending on the coverage position of the index value. (1) As shown in Figure 1b, the index value covers the middle position of the interval. (2) As shown in Figure 1c, the coverage of the index value starts at the starting position d1. (3) As shown in Figure 1d, the coverage of the index value ends at the endpoint d2. Recalculate the length of the remaining sections and repeat the above steps until all sections [d1, d2] are covered. The other is to fill from fine-grained to coarse-grained. We can choose to start filling from the start point d1 or the endpoint d2. This method only needs to determine the granularity of the starting index value and does not need to perform multiple calculations. Therefore, we choose this method to study the band. Case 2: The granularity of d1 is not equal to d2, R1 ≠ R2. First, we needed to convert the coarse-grained point to a fine-grained point. If R1 ≯ R2, we reached the point of d1 at R2 granularity by the transformation function Ts. If R1 ≮ R2, we converted d2 to the point at R1 granularity through the transformation function Tl. According to the transformation function, the starting and ending points of the segment have the same granularity. Secondly, design the index values according to case 1. The fuzzy problem of segment type is similar to that of point type. Let the segment D consists of several segments, D= {D1, D2, …, Dn}. T is fuzzy relative to Di.

DGFQM
There are two main ways to retrieve data through dimensions. One is to query through a point, and the other is to query through the start point and end point, also known as the intersection query. Due to dimension granularity fuzziness, data are easily lost when querying, such as in the following example: The above examples show that the same data was described differently due to the multi-granularity characteristics of temporal, spatial, and spectral attributes. Data record 1 was more accurate than data record 2, and data record 3 was fuzzier than data record 2. Important data may be missing from query results.
There are two kinds of missing data caused by dimension granularity fuzziness: coarse-grained missing and fine-grained missing data. Therefore, we divided the query results into fuzzy and exact data according to scales. Assume O (p1, p2, …, pi) is an object with multiple attributes, where pi represents the i-th attribute. Take the intersection query as an example. Let the query interval be 1 2 [ , ] i i p p , the corresponding scales are N1 and N2, respectively. We divided the query result S into S1, S2, and S3, S1, S2, and S3, i.e., where S1 is the set of objects whose scales are larger than 1 i p and 2 i p ; S2 is the set of objects whose scales are between N1 and N2; and S3 is the set of objects whose scales are smaller than 1 i p and 2 i p . When N1 = N2, S1 is the fuzzy data set and S2 and S3 are the exact data set. When N1 ≠ N2, S1 and S2 are the fuzzy data set and S3 is the exact data set. The DGFQM is to obtain the missing accurate data and fuzzy data. This method obtains missing data by analyzing the relationship between different dimension granularity. Since the specific steps of this method are related to the dimension coding method, we will introduce them in Section 3.
In practical application, the dimension granularity fuzzy query method must satisfy the following conditions: Condition 1: The dimension has a multi-scale characteristic in the concrete application.
Condition 2: Inclusion relationships exist between adjacent levels. Condition 1 means that a dimension domain can be represented by sets of points with different dimension granularities, or a point can be represented by multiple granularities. Condition 2 means that there is an inclusion relationship between adjacent levels, and a point on a certain scale includes all points on the next fine scale. As shown in Figure 2a, d1 is expressed as one index value at R1 granularity, two index values at R2 granularity, and three index values at R3 granularity, respectively. However, R2 and R3 do not satisfy condition 2. As shown in Figure 2b, d1 can be represented as one index value at R1 granularity, two index values at R2 granularity, and four index values at R3 granularity. Therefore, there is an inclusion relationship between adjacent granularities, which satisfies condition 2.

Dimension Coding Method
At present, dimensions are expressed in two ways: single-scale dimension coding and multi-scale dimension coding. Single-scale dimension coding is the representation of multi-granularity dimensions on a fixed scale. Multi-scale dimension coding represents multi-granularity dimension by coding at different scales. The existing coding methods are string coding and multi-scale integer coding. The multi-scale integer coding had been used in the time segment (multi-scale time segment integer encoding, MTSIC). For a time, MTSIC has had certain advantages compared to string coding. We extended it to multigranularity dimensions, and the implementation method was as follows: Assuming the dimension is where i  is the number of dimension components and n is the number of dimension components. Figure 3 shows the principle of multi-scale dimension integer coding. Firstly, the components of the dimension are expressed in binary, and the single-scale dimension integer coding is formed by bit operation. Then, the multi-scale dimension integer coding is obtained based on the level information N. Since the bands usually exist in the form of dimension segments, we used multi-scale integer coding to manage the bands and designed the association method between multi-scale integer coding and band.

Multi-Scale Band Integer Coding
The band is encoded with an integer for single-scale band integer coding (SBIC) and multi-scale band integer coding (MBIC). The main idea of MBIC is to transform the band information into an SBIC, and then transform the SBIC into MBIC by level information.
2. Multi-scale band integer coding calculation: according to Formulas (8)-(10), the multi-scale band integer coding mc is obtained by using the level N; where Deta0 is the smallest number in the Nth level.

MBIC Related Operations
Since MBIC represents band data by integers, the related operations in MBIC mainly involve the addition and subtraction of integers and bit operations. This section introduces the level calculation and relationship calculation method of MBIC in detail.

Level Calculation
The multi-scale band integer code is a 64-bit integer, so the level information cannot be intuitively obtained by giving the integer. It is necessary to calculate its level. According to the parity of MC, the specific methods are as follows: 1. If MC is an even number, its level N is 63; 2. If MC is an odd number, first, calculate how much the high-order bits in front of the binary of MC-1 and MC + 1 are the same, i.e., Mid = (MC-1) ^ (MC + 1). Secondly, the level is calculated by calculating how many consecutive zeros are on the left side of the binary of Mid. MBIC is represented by a 64-bit integer and can use the bifurcation method to efficiently obtain level information. The branch method judges how many 0 are on the left of the 64-bit integer according to the method of dichotomy.

Level Relationship Calculation
The multi-scale band integer encoding has a containment relationship and a contained relationship. The child coding set can be obtained by using the containment relationship, and the parent coding set can be obtained by the contained relationship.  (11) and (12): Formulas (13) and (14), the parent coding set of MC is obtained from N-1 to 0 through loop variable N':

The Association Method between MBIC and Band
The bands often exist in the form of an interval, and establishing the association between band intervals and MBIC is crucial for data retrieval. Since MBIC is designed according to the binary tree rules based on common granularity units, the following rules are designed to establish the association between band and MBIC:  Rule 1: The maximum level Nmax of MBIC is not larger than the maximum level Nmax' of the start and end point of the band.  Rule 2: First, the bands are padded with fine-grained to coarse-grained integer encoding, then the bands are padded with coarse-grained to fine-grained integer encoding until the band interval is filled. The specific filling method is shown in Figure  6, where L represents the band, and A, B, C, and D represent multi-scale integer coding at different levels. The steps to associate the band with the multi-scale band integer coding are as follows: 1. Convert the start and end point of the bands to the same granularity.
Analyze the levels of the start(b1(li)) and end (b2(lj)) points of the bands. If i j  , use the conversion function to convert coarse-grained to fine-grained. When the granularity of b1 is coarser than that of b2, the Ts conversion function is used, and when the granularity of b1 is finer than that of b2, the T1 conversion function is used; 2. Gradually divide and determine its level scope.
Assuming that both b1(li) and b2(lj) are data at the micrometer scale, i.e., i = j = 6, according to each component, its grade is divided into 6 grades (33~43, 29~33, 25~29, 21~25, 11~21, 0~11). The minimum level Nmin and the maximum level Nmax of the MBIC is determined grade by grade. The maximum level is the maximum level at this grade, i.e., Nmax = 43 at the (33~43) grade. The minimum level calculation is divided into two cases: 3. Accurate filling step by step.
According to Table 1 As shown in Figure 7, the relationship between MBIC and band is many-to-many.

Results
To verify the effectiveness of the design method in this paper, we conducted related experiments on multi-granularity dimensions (time, band) that satisfy the fuzziness of dimension granularity. The verification content mainly includes the following three points: the effectiveness of the DGFQM, the relevant factors that affect the query efficiency of MTSIC and string coding, and the influence of the association method between MBIC and band on data retrieval. In response to the above contents, we designed the experiments as follows: Experiment 1: To verify the effectiveness of the DGFQM, we simulated time data, and then compared the query results of the DGFQM and the intersection query method. Experiment 2: We designed time data sets with different proportions using string encoding and MTSIC methods and compared the retrieval efficiency of the two ways. Experiment 3: We used the string coding method and the association method between MBIC and band to build an index table for the simulated band data, respectively, and then compared the query efficiency of the two ways.

DGFQM
At present, we mostly use the intersection query method for data queries. We used string coding and MTSIC to store time data, respectively, and then compared the results of the DGFQM and the intersection query method. First, randomly generate n different time scales (year, month, day, hour, minute, second, millisecond, microsecond), then perform string coding and multi-scale integer coding. Finally, build a B-tree for the intersection query method and the DGFQM.

The DGFQM Based on String Coding
Dimension granularity fuzziness query steps based on string coding: 1. Perform string encoding on the query interval [t1, t2] to obtain the string interval [s1, s2]; 2. Decode the strings s1 and s2 to attain levels N1, N2; 3. Parse the string s1, and then obtain the parent data set Cf1 of s1 by coding; 4. Parse s2, and then obtain the child set Cs2 of s2 through string coding; 5. Obtain query results through set operations and query statements; Set n to be 10,000, 100,000, 500,000, 1,000,000, 5,000,000, 10,000,000, and select various query intervals to perform the intersection query and the DGFQM, respectively. The query intervals are the annual scale, the daily scale, and the second scale. The query results are shown in Figure 8. The intersection query method does not take into account the dimension granularity fuzziness, but only relies on the size sorting function of the code to obtain the data. Therefore, the number of results obtained by DGFQM is higher than that of the intersecting query method. From Figure 8, it can be seen that the amount of missing data in the intersection query is affected by the amount of data and the query interval. The amount of missing data is proportional to the query interval and the total amount of data. To verify the correctness of the data in the query results of the dimension granularity fuzziness, we took the query interval (15 November 2014, 15 February 2015 as an example to compare query results for both methods under the 1 million data set. The number of query results for the DGFM is 5727, and the number of unequal results is 5564. The query results of the intersection query method are 5564, of which 5408 are unique. As shown in Table 2, the query results of the DGFM are more complete than the intersection query.  1. According to the multi-scale time segment integer encoding method, the integer coding MTC1 and MTC2 of t1 and t2 were obtained, so the integer coding interval was Cb = [MTC1, MTC2]; 2. Calculate the level of MTC1 and MTC2, and obtain the corresponding levels N1 and N2 through level operations; 3. The parent data sets Cf1 and Cf2 are obtained through the contained relationship operation, and the missing fuzzy data set C1 is obtained according to Formula (15); 4. The child data sets Cs1 and Cs2 of MTC1 and MTC2 were obtained by using the containment relationship operation, and then the missing precise data set C2 was obtained according to the following Formula (16); 5. Obtain query results through set operations and query statements; Set n to be 10,000, 100,000, 500,000, 1,000,000, 5,000,000, 10,000,000, and select various query intervals to perform the intersection query and the DGFQM respectively. The query intervals are the annual scale, the daily scale, and the second scale. The query results were consistent with the query result based on string coding, as shown in Figure 8.

The Influence of the Proportion of Different Time Scales on Retrieval Efficiency
MTSIC uses an integer type to store time data, which occupies less memory and is more computationally efficient than a string type. Therefore, the proportion of different scales in the time data may have an impact on the query efficiency. We designed different temporal data sets to compare the query efficiency of temporal string encoding and MTSIC using DGFQM. The experimental design process was as follows: 1. Randomly generate n time data (year, month, day, hour, minute, second, millisecond, microsecond) according to equal and unequal proportions. The non-proportional data is generated in the way of 1: 2: 4: 8: 16: 32: 64: 128, which will generate a combination of factorials of 8, so we divided the scales into fine scales (hour, minute, second, millisecond, microsecond) and coarse scales (year, month, day). The specific design is shown in Table 3. 2. Establish a B-tree index. Perform string encoding and MTSIC on time data, and then build B-trees, respectively. 3. Dimension granularity fuzzy query. According to Section 3.1, we performed the DGFQM on string coding and MTSIC, respectively, and counted the results. Set n to 10,000, 10,000, 100,000, 1,000,000, 5,000,000, 10,000,000, and select the query range: "2014 to 2015", "15 November 2014 to 15 February 2015" for querying. Each query result was taken ten times, and the query efficiency was counted. The result was shown in Figure 9.
The red marks in Figure 9a-e are all lower than the blue marks, so the query time of MTSIC is less than that of string coding. Under the data volume of 10,000, the time-consuming of the string coding was 1.2 (dbl1) times, 1.5 times (bdbl1), 1.1 times (dbl2), and 1.2 times (bdbl2) of MTSIC, respectively. Under the 10,000 data volume of fdbl, the time-consuming of string coding was roughly equal to that of MTSIC. The time-consuming of string coding under 10 million data volume is 1.2 times (fdbl1), 1.7 times (dbl1), 2.1 times (bdbl1), 1.1 times (fdbl2), 1.5 times (dbl2), 2.1 times (bdbl2) of MTSIC, respectively. Therefore, we can draw the following conclusions: Under the same proportion, with the increase in the total amount of data or the expansion of the query scope, the query effect of MTSIC was better and better compared to string coding. In the case of the same amount of data, with the increase in the fine-scale ratio, the query effect of MTSIC was better and better.

Comparing the Retrieval Efficiency of MBIC and String Encoding
Randomly generate n bands and manage them in two ways. One was the string coding method, which was stored and indexed through two fields of string type. The other was to use the association method between MBIC and band to store and index. Table 4 is a comparison of the expressions of the two codes. Let the band be [b1, b2], and retrieve data according to the DGFQM. The steps for the DGFQM of bands were as follows:

MBIC Store bands with a column of integer
The multi-scale integer encoding of "6-626-4-5-1"-" 4. Obtain query results through set sum operation; The steps of the DGFQM based on MBIC: 1. According to the association method between MBIC and band, the corresponding MBIC set B= {MC1, MC2, ..., MCn} is obtained; 2. Attain the exact data set Cx in the query interval. Obtain the child interval xi of the ith code in B by including relational operation, i.e., B(i) and repeat the operation until all codes in B are traversed. The specific process was shown in Figure 10a: 3. Attain fuzzy data set Cm of query interval. Obtain the parent interval mi of the i-th code in B by including relational operation, i.e., B(i) and repeat the operation until all codes in B are traversed. The specific process was shown in Figure 10b; 4. Obtain query results through set operations; Set n to 500,000, 1,000,000, 5,000,000, and 10,000,000, and make multiple queries. We considered four query intervals as an example, which contained four different scale intervals. The query intervals were represented by the string coding method and the multiscale integer coding method, and the specific design is shown in Table 5. Then query according to DGFQM under different codes. Finally, take ten times for each query and count the query efficiency.  The statistical results are shown in Figure 11. The association method between MBIC and band proposed in this paper has a better effect than the traditional string representation. The query time for both methods increase with the amount of data. Under the same amount of data, when using the method proposed in this paper, the query time gradually increased with the expansion of the band range. It can be seen from Figure 11 that the time-consuming of queries 1-3 was about zero. However, when using the string coding method to retrieve the band range, it is necessary to traverse all the data, which took a long time. The results show that the query band range has little effect on it.

Discussion
Aiming at the problem of the multi-granularity dimension in spatiotemporal data, we proposed a management method of multi-granularity dimensions for spatiotemporal data. Mainly study the fuzziness and organization methods of multi-granularity dimensions. First, according to the inclusion relationship between granularities, we proposed DGFQM, which solved the problem of data loss caused by the multi-granularity characteristic of dimensions. Second, we discussed the encoding method of bands and designed the association method of multi-scale integer coding and bands. The correlation experiments were carried out by simulating time and band data. Correlation experiments are carried out by simulating time and band data. The experimental results are as follows: (1) Whether the string coding method or MTSIC, the DGFQM can obtain more complete data than the intersection query method; (2) Although the query efficiency of MTSIC is higher than that of the string coding method, its effect is affected by the proportion of different scales in the data. With the increase in the amount of fine-scale data, the query effect of multi-scale time integer coding is better; (3) Compared with the string coding method, the association method between MBIC and band designed in this paper effectively improves the data retrieval efficiency. The retrieval efficiency of this method is related to the range of the query band, and the query effect is better as the range of the band decreases. Especially when the band range is small, the query time is about 0.

DGFQM
Few studies have discussed the fuzziness caused by the multi-granularity of dimensions. Although a cross-scale spatial filling curve was proposed in reference [16] to provide a query method for multi-scale spatial data, the relevant theories and methods of dimension granularity fuzzy such as time were not proposed. In this paper, we discuss the fuzziness of multi-granularity dimensions from point and segment, and proposed the DGFQM. To verify the effectiveness of the DGFQM, we simulated temporal data and compared the query results of the intersection query method [25] and DGFQM.

Multi-Scale Integer Coding
At present, multi-scale integer coding has achieved good results in time and space. However, there were few studies on other multi-granularity dimensions. The concept of time-spectrum was proposed in reference [34], which put our focus on spectral information. We extended multi-scale integer coding to multi-scale dimension and took the band as an example to describe the application of multi-scale integer coding in a band in detail. We used the scale information contained in multiscale integer coding to design the correlation method between multiscale integer coding and band. The band was converted into a one-dimensional array by filling. The experiment showed that the association method proposed in this paper improved the efficiency of data retrieval compared with the traditional binary form.
In the above research, we studied the multi-granularity metric in spatiotemporal data from the above two aspects. The results were generally good, but there were still some limitations, and there are still some problems to be discussed.
(1) This method was to solve the problem of incomplete query results based on time and other multi-scale dimensions. This requires that the query data cover as many areas as possible. Secondly, the method uses multi-scale integers to fill multi-scale dimensions. When the scale is one year, three months, one day, and five hours, this complex situation needs to be filled with many multi-scale integer codes, which would affect the efficiency of data retrieval.
(2) We analyzed the fuzziness of spatiotemporal data from the multi-scale dimension level, and provided a new perspective for the study of spatiotemporal data fuzziness. We obtained fuzzy data with hidden values from the data through the DGFQM, so as to better understand and analyze the change trend in various fields such as economy and culture.
Next, we will further study the query results, analyze the potential information in the fuzzy data, and build the corresponding knowledge map.
(3) We applied multi-scale integer coding to the band, and discussed the applicability of multi-scale integer coding. It can be seen that multi-scale integer coding has certain advantages in terms of memory occupation and query efficiency. At present, multi-scale integer coding was applied to time, space, and band, respectively. Next, we will consider building the coding of a space-time, spatiotemporal spectrum based on multi-scale integer coding.