Spatial Negative Co-Location Pattern Directional Mining Algorithm with Join-Based Prevalence

Abstract: Prevalent negative co-location patterns are usually difficult to mine and calculate. This paper proposes a join-based prevalent negative co-location mining algorithm, which can quickly and effectively mine all the prevalent negative co-location patterns in spatial data. Firstly, this paper verifies that the participation index (PI) value of a negative co-location pattern is monotonically nondecreasing as the size increases. Secondly, using this property, it is deduced that any prevalent negative co-location pattern of size n can be generated by connecting a prevalent co-location of size 2 with a candidate negative co-location pattern of size n − 1 or a prevalent positive co-location pattern of size n − 1. Finally, the experimental results demonstrate that, with other conditions fixed, the proposed algorithm is highly efficient: it eliminates up to 90% of the useless negative co-location patterns, and about 40% on average.


Introduction
After Yoo et al. proposed two co-location algorithms based on join and join-less methods [1][2][3][4][5], the method for co-location mining received increasing attention. Subsequently, in order to mine co-location patterns quickly, Wang et al. proposed a new co-location mining algorithm, CPI-Tree [6,7]. In recent years, many scholars were inspired by the transaction-based approach [8][9][10][11][12] and the transaction-free approach [13][14][15][16][17]. An increasing number of co-location pattern algorithms were also developed, including "fuzzy co-location pattern mining", "parallel co-location pattern mining", "the adaptive maximal co-location (AMCM) algorithm", "efficient co-location pattern mining" and "co-location pattern mining with rare features" [18][19][20][21][22][23][24][25][26]. With the popularity of the join-less co-location mining algorithms, Zhou et al. applied co-location patterns to decision trees and proposed a co-location-based decision tree (CL-DT, a method of decision tree) [27] and the CL-DT method of maximum variance expansion [28]. Subsequently, to further study the join-less co-location algorithm, the maximal instance algorithm for the fast mining of spatial colocation patterns [29] and a book about data mining for co-location patterns [30] proposed by Zhou et al. have made great contributions to spatial mining [31][32][33][34][35][36]. To mine negative relationships in the dataset, Zheng et al. proposed some constraint conditions and mining algorithms for negative sequence PT [37]. Cao et al. proposed the E-NSP algorithm, which can effectively identify negative sequence PT [38][39][40][41][42][43], and Dong et al. proposed the F-NSP algorithm [44][45][46][47][48]. Although there are an increasing number of algorithms for spatial data mining at present, there are still some potentially useful patterns that are not fully developed, including the negative co-location pattern. 
In the study of negative co-location rules, many scholars proposed mining algorithms for association rules.

Related Work
In 2004, in order to mine the co-location patterns in spatial data more effectively, Huang et al. [4] first proposed a join-based algorithm on the basis of the Apriori algorithm. This algorithm can efficiently handle continuous spatial data and keep track of spatial information that is not modeled by transactions. However, it was not extended to spatial data sets with nonlinear distributions. In 2016, Zhou et al. [28] proposed an MVU (maximum variance unfolding)-based CL-DT algorithm, which extends the co-location mining method to nonlinearly distributed spatial data sets. This algorithm overcomes a deficiency of the traditional CL-DT method, in which the Euclidean distance between instances that are nonlinearly distributed in high-dimensional space cannot accurately reflect the co-location relationship between them. To further reduce the amount and time of calculation, in 2021, Zhou et al. [29] first defined maximal instances and proposed a maximal instance algorithm. This algorithm constructs the RI-tree to find maximal instances in a spatial data set and prunes it to obtain prevalent co-location patterns. Although both algorithms can find co-locations very efficiently, they do not extend to negative co-location.
A mutually exclusive relationship is extremely important in areas such as investment and construction planning. To address the inability of traditional co-location mining to capture mutually exclusive relationships, in 2004, Wu et al. [37] first proposed the negative co-location pattern and defined the prevalent negative co-location pattern in order to study negative relationship patterns in space. They called all unexpected patterns other than positive co-location "negative co-location", which established the basic concept of negative co-location. They explained the practical significance of negative co-location with the example of the mutually exclusive relationship between coffee and tea in the shopping baskets of supermarket customers. However, their definition of negative co-location was neither detailed nor clearly specified, and it was inefficient, so a more precise definition was needed. In 2010, Jiang et al. [49] proposed a precise concept of negative co-location patterns and a method for calculating the PI value of a negative co-location in order to define the pattern more accurately and speed up mining. They also proposed a "mining algorithm for spatial positive and negative patterns", which is called the traditional algorithm in this paper. This algorithm can effectively mine negative co-location patterns and determine whether a negative co-location is prevalent. However, it cannot identify and eliminate useless negative co-locations, so a large number of useless negative co-location patterns must be calculated. In 2021, Wang et al. [50] proposed an upward inclusive negative co-location pattern algorithm and the concept of minimal negative co-location in order to find co-location patterns more effectively. Their algorithm uses the upward inclusiveness of negative co-location patterns to mine all negative co-location patterns by adding features to minimal negative co-locations. However, this algorithm does not save much calculation time because it must first search for the minimal negative co-location patterns, whose calculation is complicated; therefore, upward inclusion cannot effectively and completely remove the useless co-location patterns.
Spatial data mining with negative co-location patterns is significant because it can find features with strong negative correlations and determine mutually exclusive relationships between spatial features, which can play a vital role in many applications. For example, Cao et al. [39] proposed that, with negative mining applied to the detection of gene sequences, the birth of disabled babies can be effectively avoided by mining the negative co-location relationship between certain diseases and specific gene sequences. It is also possible to prevent the occurrence of diseases in advance by mining gene sequences. Zheng et al. [38] proposed an Apriori-like negative mining algorithm to mine supermarket shopping basket data. They defined negative correlation as (1) A ∩ B = ∅; (2) supp(A) ≥ ms and supp(B) ≥ ms; (3) supp(A ∪ ¬B) ≥ ms (or supp(¬A ∪ B) ≥ ms, or supp(¬A ∪ ¬B) ≥ ms). However, the algorithm was not extended to the mining of spatial negative co-location. Moreover, because a large number of useless negative patterns need to be calculated, the pruning of the algorithm needs to be further improved. The two negative mining examples above show that a large number of useless negative co-location patterns must be calculated to obtain the prevalent ones. For this reason, this paper proposes a join-based algorithm, which avoids calculating a large number of useless co-location patterns and effectively mines prevalent negative co-location patterns. This paper also provides a directional mining algorithm: given a specified spatial feature set -Y and a size, the prevalent candidate negative co-location patterns can be quickly determined.
In summary, the main innovations of this paper can be condensed as follows:
• A candidate negative co-location pattern is proposed based on the definition [49] of prevalent negative co-location patterns. Additionally, we prove that any prevalent negative co-location pattern of size n can be generated by connecting a prevalent co-location of size 2 with a candidate negative co-location pattern of size n − 1 or a prevalent positive co-location pattern of size n − 1.
• For the specified spatial feature set -Y, the negative co-location pattern T = X ∪ -Y of the specified size can be calculated directly through the join-based algorithm.
• According to the definition of a negative co-location pattern, the monotonic nondecreasing property of the PI value of a negative co-location pattern is strictly proven, and a quick pruning method based on this property is proposed.
• By combining the negative co-location patterns from small to large size, two patterns in extreme cases and their meanings are proposed: a "single positive of negative co-location pattern" and a "single negative of negative co-location pattern". Additionally, an algorithm for solving these patterns is given.

Preliminary Definitions
In this section, the abbreviations are explained as follows (Table 1):
Table 1. Abbreviation explanation.

Terms | Abbreviation | Definition
Co-location | C-L | Co-location is two spatial feature instances that satisfy R (e.g., Euclidean distance metric). [2]
Co-location pattern | C-LP | The co-location pattern is the co-location combination of spatial instances S = {s_1, s_2, s_3, ..., s_n} satisfying R in a given spatial feature set {f_1, f_2, ..., f_n}. [2]
The PI value of the C-LP | C-LPI | In this paper, the C-LPI is the value of the participation index for the co-location pattern.
Pattern | PT | In this paper, the pattern represents a specific spatial instance co-location relationship.
Participation Index | PI | The participation index (PI) of a co-location C = {f_1, f_2, ..., f_n}. [49]
The value of the participation index | TVPI | The value of the participation index is the minimum of all PR(C, f_k) of co-location C. [49]
Size | SZ | In this paper, size is the number of spatial features in the set {f_1, f_2, ..., f_n}.
Co-location of Size | C-LSZ | Co-location of size is the number of spatial features in the set {f_1, f_2, ..., f_n}. [2]
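The PI entry in Table 1 can be illustrated with a small sketch. It assumes the standard participation ratio, PR(C, f_k) = (number of distinct instances of f_k appearing in any row instance of C) / (total number of instances of f_k), and takes PI(C) as the minimum over all features in C. The function name `participation_index`, its parameters and the toy data are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def participation_index(features, row_instances, instance_counts):
    """Return PI(C) = min_k PR(C, f_k) for a positive co-location C.

    features        -- tuple of feature names, e.g. ("A", "B")
    row_instances   -- iterable of tuples of instance ids, aligned with features
    instance_counts -- dict: feature name -> total number of its instances
    """
    participating = defaultdict(set)
    for row in row_instances:
        for feature, instance_id in zip(features, row):
            participating[feature].add(instance_id)
    # PR(C, f) = participating instances of f / all instances of f
    ratios = [len(participating[f]) / instance_counts[f] for f in features]
    return min(ratios)


# Toy data (hypothetical): three row instances of the pattern {A, B}.
rows = [(1, 1), (1, 2), (3, 2)]
pi = participation_index(("A", "B"), rows, {"A": 4, "B": 2})
print(pi)  # PR(A) = 2/4, PR(B) = 2/2, so PI = 0.5
```

This sketch covers positive co-location patterns only; the PI of a negative co-location pattern follows the calculation method of Jiang et al. [49], which is not reproduced here.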

Basic Definition of Negative Co-Location
Given a spatial feature set F = {f_1, f_2, f_3, ..., f_n}, the corresponding spatial feature instance set is S = {s_1, s_2, s_3, ..., s_n}. Let R be a relation over spatial feature instances. Here, R is assumed to be the Euclidean distance with threshold d: for A.1, B.1 ∈ S, R(A.1, B.1) ⇔ distance(A.1, B.1) ≤ d. In general, R can represent topological relationships (e.g., linked, intersection), distance relationships (e.g., Euclidean distance metric) and mixed relationships (e.g., the shortest distance between two points on a map) [49].
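As a minimal sketch of the relation R under the Euclidean distance assumption, the following brute-force function lists the instance pairs whose distance is at most d. The instance identifiers, the coordinates and the restriction to pairs of different features are assumptions made for illustration; a real data set would normally use a spatial index instead of comparing all pairs.

```python
import math
from itertools import combinations


def neighbor_pairs(instances, d):
    """Return pairs of instance ids that satisfy R, i.e. distance <= d.

    instances -- dict: instance id -> (feature, x, y)
    Only pairs of instances with different features are kept (an assumption
    of this sketch).
    """
    pairs = []
    for (id_a, (fa, xa, ya)), (id_b, (fb, xb, yb)) in combinations(
        sorted(instances.items()), 2
    ):
        if fa != fb and math.hypot(xa - xb, ya - yb) <= d:
            pairs.append((id_a, id_b))
    return pairs


# Hypothetical toy instances.
instances = {
    "A.1": ("A", 0.0, 0.0),
    "B.1": ("B", 3.0, 4.0),
    "B.2": ("B", 10.0, 0.0),
    "C.1": ("C", 0.0, 1.0),
}
print(neighbor_pairs(instances, 5.0))
# [('A.1', 'B.1'), ('A.1', 'C.1'), ('B.1', 'C.1')]
```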
For a set of spatial instances, as shown in Figure 1, if two instances satisfy the relation R, they are connected by solid lines. If there is a candidate negative C-L relationship, it is connected with a dotted line.
Definition 1 (prevalent negative co-location pattern) [49]. "Given a minimum prevalent threshold (min_prev), a negative C-LP T = X ∪ -Y is a prevalent negative C-LP if T meets the following conditions."
"(1) PI(X) ≥ min_prev, PI(Y) ≥ min_prev and PI(X ∪ Y) < min_prev
(2) PI(T) ≥ min_prev"
Example 1 shows that it is currently impossible to calculate the PI of a negative C-LP directly. Only by calculating all the negative C-LP and the corresponding PI of each C-LP can we judge whether it is a prevalent negative C-LP. However, the number of combinations of all negative C-LP is very large, and so is the amount of computation. Most combinations are useless C-LP. The number of candidate negative C-LP that can be generated by an infrequent C-LP of size n is C(n, 1) + C(n, 2) + C(n, 3) + ... + C(n, n) = 2^n − 1. Therefore, the number grows exponentially. Once there are too many spatial features, the amount of calculation will be very large. To solve this problem, this paper proposes a join-based negative C-LP algorithm, which can quickly determine prevalent negative C-LP and greatly reduce the computation amount of useless negative C-LP.
If in a negative C-LP, there is only one negative spatial feature and the others are all positive, then M has a strong negative correlation with spatial feature objects such as F. For example, fungicides, herbicides and mosquito-repellent incense can kill or strongly repel a certain type of spatial characteristic object.
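The two conditions of Definition 1 and the 2^n − 1 count can be written down as a short sketch. The PI values are passed in as plain numbers because their calculation is not part of this illustration; the function names are hypothetical.

```python
from math import comb


def meets_condition_1(pi_x, pi_y, pi_xy, min_prev):
    """Condition (1): X and Y are prevalent, but X ∪ Y is not."""
    return pi_x >= min_prev and pi_y >= min_prev and pi_xy < min_prev


def is_prevalent_negative(pi_x, pi_y, pi_xy, pi_t, min_prev):
    """Definition 1: condition (1) holds and PI(T) >= min_prev."""
    return meets_condition_1(pi_x, pi_y, pi_xy, min_prev) and pi_t >= min_prev


def count_negative_patterns(n):
    """C(n, 1) + C(n, 2) + ... + C(n, n), which equals 2**n - 1."""
    return sum(comb(n, k) for k in range(1, n + 1))


print(is_prevalent_negative(0.8, 0.7, 0.3, 0.6, min_prev=0.5))  # True
print(count_negative_patterns(4))                               # 15
print(count_negative_patterns(10))                              # 1023
```

Even for 10 features the exhaustive count already exceeds a thousand patterns, which is the growth the join-based approach is meant to avoid.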

Definition 3 (Single negative of negative co-location pattern).
If a negative C-LP T = F ∪ -M and | -F| = 1, this negative C-LP is called a single positive of negative C-LP. The meaning is similar to the previous one.
Both of these PTs can be directly solved using the algorithm proposed in this paper, and the candidate negative C-LP with a specified spatial feature set and arbitrary size can be obtained.

Lemma 1.
For any negative C-LP T = F ∪ -M, increase its spatial feature object; that is, with the increase in the size of the negative C-LP, the participation rate and participation degree are monotonically nondecreasing.
F and F in the same space features, in the F instance, will be in the F line of instance. However, due to the instance of rows that appear in F , they do not have to be in F. Thus, PR(T, The equal sign holds if and only if MaxPR(T , F) = MaxPR(N , F).

Lemma 2. For a prevalent negative C-LP
As the size of the C-LP increases, its PI is monotonically nonincreasing. Then, F ⊆ F, PI(F ∪ M) ≤ min_prev. Because F is a prevalent C-LP, PI(F) ≥ min_prev, namely, PI(M) ≥ min_prev, PI(F) ≥ min_prev. As the size of the C-LP increases, its PI is monotonically nonincreasing. In addition, M ⊆ M; therefore, PI(F ∪ M) ≤ min_prev. Because M is a prevalent C-LP, PI(M) ≥ min_prev. Therefore, PI(M) ≥ min_prev. It can be obtained from Lemma 1 that

Definition 4 (Candidate negative co-location).
According to the definition of prevalent negative C-LP in Jiang et al. [49], to better calculate the prevalent negative C-LP, the negative C-LP that meets the following conditions is called the candidate negative C-LP: "PI(X) ≥ min_prev, PI(Y) ≥ min_prev and PI(X ∪ Y) < min_prev".
Lemma 4. Any SZ n candidate negative C-LP must be composed of an SZ n − 1 candidate negative C-LP or prevalent C-LP connected to an SZ 2 prevalent C-LP.
Proof. Assume an SZ n candidate negative C-LP,
(1) For any spatial feature in F = {F_1, F_2, F_3, ..., F_n}, if one of them is removed, PI(F) ≥ min_prev will still be true. The C-LP composed of any two spatial features
(2) This is the same as M = {M_1, M_2, M_3, ..., M_n}.
It is thus proved that any SZ n candidate negative C-LP must be composed of an SZ n − 1 candidate negative C-LP or prevalent C-LP connected to an SZ 2 prevalent C-LP.
Proof. Because T is a prevalent C-LP, F, M must be a prevalent C-LP. In addition, T is size n, so F is SZ n − m, and F ∩ M = ∅.

An Illustrative Example of Join-Based Co-Location
For example, in Figure 1, given min_prev = 0.5, the steps to find all of its candidate negative C-LP are shown in Figures 2 and 3. After the candidate patterns are determined, all prevalent C-LP can be quickly determined by the pruning method and PI value comparison.

Join-Based Negative Co-Location Algorithm
In this section, the join-based (J-B) prevalent negative C-LP algorithm and the J-B directional prevalent negative C-LP mining algorithm are proposed.

Join-Based Prevalent Negative Co-Location Pattern Algorithm
The join-based prevalent negative co-location pattern algorithm includes the following steps:
(1) Mine the prevalent positive C-LP of all instances using any existing algorithm for mining prevalent positive C-LP. Store all prevalent C-LP of SZ 2 and above, and store the PI values for all C-LP of SZ 2.
(2) Compare the PI value of each SZ 2 C-LP with the threshold min_prev. Find and store all the SZ 2 candidate negative C-LP. The prevalent negative C-LP of SZ 2 are calculated to facilitate pruning.
(3) Starting from SZ 2, an SZ 2 prevalent C-LP or candidate negative C-LP is connected to an SZ 2 prevalent C-LP to generate an SZ 3 candidate negative C-LP. Then, an SZ 3 prevalent C-LP or candidate negative C-LP is connected to an SZ 2 prevalent C-LP to generate an SZ 4 candidate negative C-LP, and so on.
(4) Prune the candidate negative C-LP obtained. According to Lemmas 2 and 3, if the SZ 2 candidate negative C-LP that was connected is a prevalent negative C-LP, then the generated pattern is also a prevalent negative C-LP. The remaining unpruned candidate negative C-LP are judged by comparing their PI values with the threshold min_prev to obtain the prevalent negative C-LP.
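Steps (2) and (3) can be sketched as a level-wise growth of candidates. This is a simplified reading of the join, not the authors' implementation: a candidate is stored as an ordered pair (X, Y) standing for T = X ∪ -Y, the set of prevalent positive patterns is assumed to be given and downward closed, single features are treated as prevalent, and the explicit join with SZ 2 prevalent C-LP is replaced by a membership test on that set. All names and the toy data are hypothetical, and the pruning of step (4) is omitted because it needs the PI values of the negative patterns.

```python
from itertools import permutations


def grow_candidates(features, prevalent, max_size):
    """Return dict: size -> set of (X, Y) candidate negative patterns.

    features  -- iterable of feature names
    prevalent -- set of frozensets: prevalent positive patterns of size >= 2
    """
    features = sorted(features)

    def is_prevalent(pattern):
        return len(pattern) == 1 or pattern in prevalent

    # Size 2: ordered feature pairs whose union is not prevalent.
    levels = {
        2: {
            (frozenset([a]), frozenset([b]))
            for a, b in permutations(features, 2)
            if frozenset([a, b]) not in prevalent
        }
    }
    for n in range(3, max_size + 1):
        grown = set()
        # (a) Extend a size n-1 candidate on its positive or negative side.
        for x, y in levels[n - 1]:
            for f in features:
                if f in x or f in y:
                    continue
                if is_prevalent(x | {f}):
                    grown.add((x | {f}, y))
                if is_prevalent(y | {f}):
                    grown.add((x, y | {f}))
        # (b) Seed from a size n-1 prevalent positive pattern.
        for p in prevalent:
            if len(p) != n - 1:
                continue
            for f in features:
                if f not in p and (p | {f}) not in prevalent:
                    grown.add((p, frozenset([f])))
                    grown.add((frozenset([f]), p))
        levels[n] = grown
    return levels


# Hypothetical prevalent positive patterns over features A, B, C, D.
prevalent = {frozenset(s) for s in ("AB", "AC", "BC", "ABC", "CD")}
levels = grow_candidates("ABCD", prevalent, max_size=3)
print(len(levels[2]), len(levels[3]))  # 4 10
```

Every generated pair keeps X and Y prevalent, and X ∪ Y stays non-prevalent because a superset of a non-prevalent positive pattern cannot be prevalent.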

Join-Based Prevalence Negative Co-Location Pattern Directional Mining Algorithm
This section introduces a J-B mining algorithm for directional prevalent negative C-LP. The algorithm proposed in this paper can quickly find the prevalent negative C-LP T = X ∪ -Y for a specified -Y. Based on Lemma 5, once -Y and the size of the final prevalent negative C-LP T are determined, the SZ of X is also determined. Then, the prevalent C-LP X of that size and -Y are selected from the entire data set and connected to become the candidate T. By contrast, the traditional algorithm cannot respond to a specified -Y and can only calculate all the negative C-LP, which requires counting 2^n − 1 negative C-LP. The specific steps are as follows:
(1) Calculate the C-L relationships for all instances. Mine the prevalent positive C-LP using any existing algorithm. Store all prevalent C-LP of SZ 2 and above.
(2) For the specified mining SZ k -Y.
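The directional idea described at the start of this section (fix -Y and the target size, which fixes the size of X) can be sketched as a lookup over the stored prevalent positive patterns. This is an illustrative reading under stated assumptions, not the paper's Algorithm 2: the prevalent set is assumed to be given, single features are treated as prevalent, and the final PI check of the resulting candidates is left out. The function name and the toy data are hypothetical.

```python
def directional_candidates(neg_y, size, prevalent, features):
    """Return candidate X sets such that T = X ∪ -Y has the requested size.

    neg_y     -- iterable of features forming the specified negative part Y
    size      -- requested size of T
    prevalent -- set of frozensets: prevalent positive patterns of size >= 2
    features  -- iterable of all feature names
    """
    y = frozenset(neg_y)
    x_size = size - len(y)
    # Y itself must be prevalent, and X needs at least one feature.
    if x_size < 1 or (len(y) > 1 and y not in prevalent):
        return []
    if x_size == 1:
        pool = [frozenset([f]) for f in features]
    else:
        pool = [p for p in prevalent if len(p) == x_size]
    # Keep X disjoint from Y with a non-prevalent union X ∪ Y.
    chosen = [x for x in pool if not (x & y) and (x | y) not in prevalent]
    return sorted(chosen, key=sorted)


# Hypothetical prevalent positive patterns over features A, B, C, D.
prevalent = {frozenset(s) for s in ("AB", "AC", "BC", "ABC", "CD")}
result = directional_candidates("D", 3, prevalent, "ABCD")
print([sorted(x) for x in result])  # [['A', 'B'], ['A', 'C'], ['B', 'C']]
```

Only patterns whose negative part is the requested -Y are produced, so the exhaustive enumeration of all negative patterns is not needed for this query.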

Experiment and Analysis
To date, there are only a few studies on the mining of negative C-LP. To evaluate the filtering rate and effectiveness of the J-B prevalent negative C-LP algorithm proposed in this paper, Algorithms 1 and 2 proposed in this paper are compared with the algorithm in Jiang et al. [49] (referred to as the traditional algorithm) on real and synthetic datasets. All algorithms are written in Python, and the experimental environment is PyCharm running on Windows 10.
Notation used in Algorithms 1 and 2: min_prev is the minimum PI threshold; nPPC is the SZ n prevalent positive C-L collection; nCNC is the SZ n candidate negative C-L collection.

Experiment and Analysis of Real Data Sets
The real data set selected in the experiment is the distribution data from Shopping, Traffic, Dining and Companies in Jinan, Shandong, with a total of 11,189 data points. Among these four features, there is a negative C-L relationship between each of them. For example, most companies have their own staff canteens, and they are all free. Therefore, around the company, the number of other dining rooms may decrease. As another example, people do not choose to be next to their workplace most of the time when shopping, and they are more likely to choose entertainment complexes. In addition, most companies have commuter buses for their own employee routes, so other traffic may also be reduced. In this section of the experiment, different strong and weak negative C-LPs are mined by controlling the size of the min_prev value.
The latitude and longitude of the data in Table 2 were converted into the corresponding XY coordinates, as shown in Figure 4. In the experiment in this section, the distance threshold was fixed at d = 1000 m. The participation threshold was varied from 0.1 to 0.7, and the relevant tests were performed.
According to the obtained data, a positive C-LSZ 1-4 was obtained, as shown in Figure 5. The line charts represent the sum of the total number of positive C-LP for each SZ, and the bar charts represent the number of detailed positive C-LP for each SZ.

Experiment-1 with Join-Based Prevalent Negative Co-Location Pattern Algorithm
In this experiment, the traditional mining algorithm and J-B algorithm were used to mine all negative C-LP under different thresholds.
As seen in Figures 6 and 7, the effect of the J-B algorithm changed with min_prev. In this experiment, the set of prevalent negative C-LP under each threshold was fixed. To judge the quality of the algorithms, we observed, for that fixed set of prevalent negative C-LP, the calculations each algorithm had to carry out and the number of non-prevalent negative C-LP it produced. When the number of prevalent C-LSZ 2 was equal to the number of candidate negative C-LSZ 2, the J-B algorithm was at its most complex and its effect was the worst, yet its calculation amount was still approximately 0.6 of that of the traditional algorithm. The traditional algorithm enumerates all negative C-LP, independent of the value of min_prev, without fluctuations.
In the experiment of this section, we compared the calculation amount required for a certain number of prevalent negative C-LP under the same features. We also compared the calculation time of the algorithm proposed in Section 3 with that of the traditional algorithm. The traditional algorithm cannot respond to a change of -Y; it can only calculate all the negative C-LP to obtain the prevalent negative C-LP, so its time and computational complexity were not affected by -Y. In other words, the traditional algorithm is not directional: it can only calculate each negative C-LP and then compare its PI with min_prev. By contrast, the J-B algorithm can connect patterns directly and save considerable time.

Experiments with Real Data-2
As shown in Table 3, to further validate the performance of the algorithm proposed in this paper, the real data, kindergartens, automobile services, restaurants, shops and Guilin per capita income, which include 1466 kindergartens, 1193 automobile services, 4202 restaurants and 473 shops, were used. According to the nature of the schools, kindergartens were classified into four types: public, private, public institutions and local businesses. The number of teachers and the size of the kindergarten were judged according to the number of classes and degrees in the kindergarten. In this paper, the per capita income of Guilin was divided as follows according to the "Guilin per capita disposable income from January to September 2021" issued by the Guilin Statistics Bureau of Guangxi Province. For urban personnel, a monthly income of more than 3500 RMB is rich, a monthly income of 3300~3500 RMB is medium and a monthly income of less than 3300 RMB is poor. For rural people, a monthly income of more than 1400 RMB is rich, a monthly income of 1100~1400 RMB is medium and an income of less than 1100 RMB is poor. Because the number of some kindergarten types was small, the co-location PI and negative co-location PI for kindergartens in special data analysis were the PI values of the spatial features for kindergartens and were not the minimum values in the co-location pattern. The experiments were conducted to compare the performance of algorithms with spatial analysis by setting different thresholds of min_prev.

• Algorithm performance analysis
This section analyzes the performance of the algorithm. Figure 10 shows the comparison between our algorithm and the traditional algorithm, in which the broken line (Rate) represents the rate at which our algorithm eliminates the useless co-locations computed by the traditional algorithm, and the histogram is the calculation amount of each algorithm. Figure 10 shows that, in this experiment, the elimination rate of the proposed algorithm first increased and then decreased as min_prev increased. The elimination rate was largest, 0.7, when min_prev was 0.25. Figure 11 shows that, for this group of experimental data, the running time first decreased and then increased as min_prev increased. The running time was shortest, 1437 s, when min_prev was 0.25. The experiments in this section again show that our algorithm is faster than the traditional algorithm [49] and reduces the computation of useless negative co-locations.
• Spatial co-location analysis
This section conducts a spatial analysis of co-location PI values and negative co-location PI values for different types of kindergartens and the surrounding spatial features in Guilin. The analysis revealed the relationship between the PI value and the size of kindergartens, as well as the economic situation of kindergarten families. The result is shown in Figure 12.
As observed in Figure 12, the spatial analysis for private and public kindergartens is as follows:
(1) The PI value for the total co-location pattern F_0 = {private and public kindergarten, restaurant, shop, automobile service} is 0.11, and the PI value for the total negative co-location pattern -F_0 = {private and public kindergarten, … The negative co-location PI value is large, indicating that its mutual exclusion is greater than its correlation and showing that private and public kindergartens are widely distributed. In order to meet the schooling needs of most families, whether the surrounding environment of the kindergarten is prosperous is not the focus of the construction of private and public kindergartens.
(2) The size 2 co-location PI values for F 1 = { private and public kindergarten, restaurant}, The total negative co-location PI value may be too large because one of the features is mutually exclusive. By analyzing the size 2 negative co-location PI value, it can be seen that private and public kindergartens have no strong correlation with the surrounding features, indicating that their geographical location is indeed not very good, which is consistent with the conclusion of the total negative co-location.
(3) The co-location PI value and negative co-location PI value for the kindergarten and the surrounding features also discover the relationship between the size of the kindergarten and the economic status of students' families. For example, the average number of private and public kindergarten classes is 5.38, and the average number of degrees is 139.13. This shows that there are fewer teachers and students in the kindergarten, which is consistent with the high negative co-location value (negative C-LPI: 0.89). Similarly, it can be seen from Figure 13 that in the family income of private and public kindergartens, poor families account for the highest proportion of 60%. The PI value of negative co-location is high, and the schools' geographical location is not good; therefore, the families of the nearby schools are not wealthy, and teachers and students are scarce. As observed in Figure 12, the spatial analysis for local business kindergartens is as follows: (1) The PI value for the total co-location pattern F 0 = { local business kindergarten, restaurant, shop, automobile service} is 0.66, and the PI value for the total negative co- automobile service } is 0.34. The negative co-location PI value for local business kindergartens is significantly lower than private and public kindergartens (negative C-LPI: 0.89), indicating that it has a good correlation with the surrounding spatial features. It shows that local businesses often build their own kindergartens next to business districts with better environments.
(2) The size 2 co-location PI value for local business kindergartens 18. From the size 2 co-location and negative co-location, it can be seen that the local business school has a good correlation with the surrounding spatial features, which is consistent with the total negative co-location.
(3) The average number of local business kindergarten classes is 11.17, and the average number of degrees is 316. Both of them are higher than public and private kindergartens (classes: 5.38, degrees: 139.13), indicating that there are many teachers and students in local business kindergartens, and the size of the kindergarten is large. Additionally, its negative co-location PI value of 0.34 is also smaller than public and private kindergartens (negative C-LPI: 0.89). This reveals that the smaller the negative co-location PI value, the larger the school size and the larger number of teachers and students. It can be seen from Figure 13 that 90% of the household economic income for local business kindergartens are high-income groups, and 10% are middle-income groups. This indicates that the smaller the negative co-location PI, the better the location of the kindergarten and the more affluent families that attend kindergartens nearby.
As observed in Figure 12, the spatial analysis for public institution kindergartens is as follows: (1) The PI value for the total co-location pattern F 0 = { public institution kindergarten, restaurant, shop, automobile service} is 0.58, and the PI value for the total negative co-location The negative co-location PI value for public institution kindergarten is less than 0.5, indicating that its correlation with surrounding spatial features is greater than mutual exclusion. The kindergartens of public institutions are basically affiliated schools of universities, which is in good correlation with spatial features such as shops and restaurants.
(2) The size 2 co-location PI values for public institution kindergartens The size 2 negative co-location PI value for public institution kindergartens is higher than local business kindergartens (negative C-LPI: 0.34) and lower than public and private kindergartens (negative C-LPI: 0.89). This is because although there are many restaurants and shops near the kindergarten affiliated with the university, it is not as numerous and dense as the kindergartens around the business district.
(3) The average number of classes in public institution kindergartens is 8.33, and the average number of degrees is 244.83. Both are higher than those of public and private kindergartens (classes: 5.38, degrees: 139.13) and lower than those of local business kindergartens (classes: 11.17, degrees: 316). Therefore, the numbers of kindergarten students are ordered as "local business > public institution > private and public", which is the inverse of the order of the negative co-location PI values (negative C-LPI: 0.34, 0.42 and 0.89, respectively). However, Figure 13 shows that high-income families account for 100% of the households of public institution kindergartens, which is higher than for local business kindergartens (high-income families: 90%). According to the negative co-location PI values (local business: 0.34, public institution: 0.42), the family income of local business kindergartens would be expected to be higher than that of public institution kindergartens. The difference arises because the children of some low-level employees of the local business also attend its kindergarten, whereas the staff and teachers of the university are basically senior intellectuals with higher incomes.
Based on the above analysis, it can be seen that the lower the negative co-location PI value of the kindergarten and its surrounding features, the more prosperous the kindergarten is located, the more teachers and students the kindergarten has and the better the economic conditions of the family.

Experiments and Analysis with Synthetic Data Sets
As shown in Figure 14, the synthetic data set consists of seven sets generated by random numbers, and the total number of instances is 1000. In this experiment, the distance threshold is fixed at d = 150, and the participation threshold min_prev varies from 0.55 to 0.7. Since the experimental data are randomly generated, the data points are too random to generate a high-order negative C-LP. Therefore, this experiment is conducted on the SZ 3 negative C-LP. … become the prevalent negative C-LP, so it can simply be discarded. However, traditional algorithms cannot deal with that, so they do a lot of useless calculations. With the change of the participation threshold, the number of candidate negative C-LP mined by Algorithm 1 and the number of candidate negative C-LP mined by the traditional algorithm are shown in Figure 15. When the threshold changes, the traditional method does not respond to it, whereas Algorithm 1 starts filtering out the useless negative C-LP. When the participation threshold reaches 0.7, it is impossible to generate prevalent negative C-LP, but the traditional algorithm still needs to calculate all the PI values of the 198 PT to reach this conclusion. The curve above represents the actual number of prevalent negative C-LP. The filtering rate of Algorithm 1 for the prevalent negative C-LP in the traditional algorithm is shown in Figures 15 and 16. In Figure 15, the blue bar chart shows the number of candidate and useless negative C-LP that must be calculated to obtain all the negative C-LP under the set threshold. The orange bar chart shows the number of excluded PT compared to the amount of computation required by the traditional algorithm. The PI values of the data in this randomized experiment are too dense, so the filtering rate can reach 100%, but there may be some differences in actual situations.

Conclusions
To obtain the prevalent negative co-location patterns quickly and effectively, this paper proposes a join-based algorithm to mine them. Firstly, the algorithm mines the candidate negative co-location patterns at each size and then combines them with size 2 prevalent co-location patterns. This method can generate size n candidate negative co-location patterns by combining size n − 1 candidate negative co-location patterns with size 2 prevalent co-location patterns. By generating candidates through these joins, the method avoids many calculations for useless negative co-location patterns. Finally, the prevalent negative co-location patterns can be obtained by eliminating a small number of useless candidate negative co-location patterns.
From the experimental results, the following conclusions can be drawn. The proposed algorithm, which calculates prevalent negative co-location patterns, is 30% faster than the traditional algorithm. Additionally, the algorithm reduces the calculations for useless negative co-location patterns by an average of about 40% relative to the traditional algorithm.
Although our method can mine prevalent negative co-location patterns effectively, it cannot draw analogies between similar negative co-location patterns that involve different spatial features. We hope to solve this problem in the future by adding convolution to the algorithm.