GIScience Theory Based Assessment of Spatial Disparity of Geodetic Control Points Location

Geodetic networks provide a spatial reference framework for the positioning of any geographical feature in a common and consistent way. An even spatial distribution of geodetic control points assures good quality for subordinate surveys in mapping, cadaster, engineering activities, and many other land administration-oriented applications. We investigate the spatial pattern of geodetic control points based on GIScience theory, especially Tobler’s Laws in Geography. The study makes contributions in both the research and application fields. By utilizing Average Nearest Neighbor, multi-distance spatial cluster analysis, and cluster and outlier analysis, it introduces the comprehensive methodology for ex post analysis of geodetic control points’ spatial patterns as well as the quantification of geodetic networks’ uniformity to regularly dense and regularly thinned. Moreover, it serves as a methodological resource and reference for the Head Office of Geodesy and Cartography, not only the maintenance, but also the further densification or modernization the geodetic network in Poland. Furthermore, the results give surveyors the ability to quickly assess the availability of geodetic points, as well as identify environmental obstacles that may hamper measurements. The results show that the base geodetic control points are evenly dispersed (one point over 50 sq. km), however they tend to cluster slightly in urbanized areas and forests (1.3 and 1.4 points per sq. km, respectively).


Introduction
Geodetic control networks are widely used for investigating and monitoring any geographical features and phenomena. They are of great priority to many human actions, like geodetic and geophysical measurements, surveying, civil and environmental engineering, GIS development, and gathering spatial data. These networks provide a series of accurate positional measures that cover the analyzed area, thus delivering a time-continuous infrastructure enabling studying and tracking of crustal deformation [1], as well as seismic and geomorphologic landform activities (e.g., [2][3][4]). The ground stability of geodetic control points provides quality to the results of geometric measures, enables an accurate and detailed description of land topography, and changes relative to Earth processes, natural hazards, and climate change [5]. Control surveys serve as the basis for initiating or reviewing subordinate surveys for property boundary delineation [6], route and construction planning [7,8], engineering object displacements [9], cadastral and topographic mapping [10], as well as many environmental applications and cultural heritage preservation and monitoring [11,12]. They are indispensable as a reference framework for land administration systems, especially for georeferencing spatial objects [13][14][15]. Many scholars point out that the production and quality control of any geographical data and products requires references to the base geodetic network that realizes the CRS (Coordinate Reference System) and provides the consistent frame for GNSS users [5,16,17].  Coordinates of the 6723 BGCPs stations were obtained from the Central Geodetic and Cartographic Documentation Centre, in Warsaw. The points' coordinates are given both in ETRS89 and Cartesian coordinate system "PL1992" (the EPSG code 2180). The geographic locations of points are shown in Figure 1a.
Coordinates of the 6723 BGCPs stations were obtained from the Central Geodetic and Cartographic Documentation Centre, in Warsaw. The points' coordinates are given both in ETRS89 and Cartesian coordinate system "PL1992" (the EPSG code 2180). The geographic locations of points are shown in Figure 1a. The CORINE Land Cover (CLC) delivers information on the biophysical characteristics of the Earth's surface and its changes derived on visual interpretation of satellite images. The land cover nomenclature is hierarchical, including three levels of thematic details grouped in five major land cover types: Artificial, agricultural, forests and semi-natural areas, wetlands, water bodies, with 44 classes at the third, most detailed level. The CLC data is available from COPERNICUS land monitoring services [46] free of charge. The CORINE Land Cover ( Figure 1b) data was used to investigate a relationship between BGCPs' density and land cover.
VmapL2 delivers vector-based geospatial data at medium resolution (1:50,000). Data are structured in nine thematic layers, i.e., boundaries, elevation, hydrography, industry, physiography, population, transportation, utilities, vegetation [47]. The study utilized data on road and rail networks, as subsets of the transportation layers.

Research Hypotesis and Methods
Based on an in-depth literature and national regulation review, we hypothesized that the spatial pattern of the base geodetic control points in Poland depends on the scale of investigation. Countrywide it could be perceived as regularly dispersed, although, locally it is regularly densified or regularly thinned. Moreover, based on CLC and VmapL2 data, we discover the BGCPs that are located in the signal interference zone due to environmental factors, such as water reservoirs, heavy traffic paved roads, electricity lines, and other geographical features. This is of particular importance for both surveyors and mapping authorities, because some advanced measurement and calculation The CORINE Land Cover (CLC) delivers information on the biophysical characteristics of the Earth's surface and its changes derived on visual interpretation of satellite images. The land cover nomenclature is hierarchical, including three levels of thematic details grouped in five major land cover types: Artificial, agricultural, forests and semi-natural areas, wetlands, water bodies, with 44 classes at the third, most detailed level. The CLC data is available from COPERNICUS land monitoring services [46] free of charge. The CORINE Land Cover ( Figure 1b) data was used to investigate a relationship between BGCPs' density and land cover.
VmapL2 delivers vector-based geospatial data at medium resolution (1:50,000). Data are structured in nine thematic layers, i.e., boundaries, elevation, hydrography, industry, physiography, population, transportation, utilities, vegetation [47]. The study utilized data on road and rail networks, as subsets of the transportation layers.

Research Hypotesis and Methods
Based on an in-depth literature and national regulation review, we hypothesized that the spatial pattern of the base geodetic control points in Poland depends on the scale of investigation. Country-wide it could be perceived as regularly dispersed, although, locally it is regularly densified or regularly thinned. Moreover, based on CLC and VmapL2 data, we discover the BGCPs that are located in the signal interference zone due to environmental factors, such as water reservoirs, heavy traffic paved roads, electricity lines, and other geographical features. This is of particular importance for both surveyors and mapping authorities, because some advanced measurement and calculation techniques for measuring BGCP position should be recommended. CLC data allows also to investigate dependence between the Thiessen polygons' area and shape and the land cover structure.
The hypothesis was proven using GIS tools based on Tobler's laws of geography, in particular, the first law that states 'everything is related to everything else, but near things are more related than distant things'. This law serves as the basic concept for spatial statistical methods that are vastly relevant for spatial analyses and modeling, e.g., global and local indicators of spatial autocorrelation (Moran's I, Anselin Local Moran's I), and spatial interpolation (kernel density estimation). The second law ('the phenomenon external to an area of interest affects what goes on inside') emphasizes spatial heterogeneity, particularly scale-dependent concentration and decentralization [48].
The BGCPs spatial pattern was investigated by testing complete spatial randomness using nearest-neighbor analysis and Ripley's K-function, and global Moran's I statistics. The regularity of geodetic point locations analysis was based on morphological parameters (area, shape and nearest distance between two adjacent polygons) of Thiessen polygons generated from the BGCPs. As a foundation for the quantification of the geodetic network evenness into regularly dense and regularly thinned cluster, the outlier analysis (Anselin Local Morans I) was implemented. The adopted methodological framework is briefly shown in Figure 2. techniques for measuring BGCP position should be recommended. CLC data allows also to investigate dependence between the Thiessen polygons' area and shape and the land cover structure. The hypothesis was proven using GIS tools based on Tobler's laws of geography, in particular, the first law that states 'everything is related to everything else, but near things are more related than distant things'. This law serves as the basic concept for spatial statistical methods that are vastly relevant for spatial analyses and modeling, e.g., global and local indicators of spatial autocorrelation (Moran's I, Anselin Local Moran's I), and spatial interpolation (kernel density estimation). The second law ('the phenomenon external to an area of interest affects what goes on inside') emphasizes spatial heterogeneity, particularly scale-dependent concentration and decentralization [48].
The BGCPs spatial pattern was investigated by testing complete spatial randomness using nearest-neighbor analysis and Ripley's K-function, and global Moran's I statistics. The regularity of geodetic point locations analysis was based on morphological parameters (area, shape and nearest distance between two adjacent polygons) of Thiessen polygons generated from the BGCPs. As a foundation for the quantification of the geodetic network evenness into regularly dense and regularly thinned cluster, the outlier analysis (Anselin Local Morans I) was implemented. The adopted methodological framework is briefly shown in Figure 2.

Point Pattern Analysis
Point pattern analysis (PPA) is a set of methods for the investigation of spatial arrangements of points in space, usually two-dimensional, and could be expressed as a set X = {x ∈ D} where D, the study region, is a subset of R 2 , a two-dimensional Euclidean space [49]. Diggle [50] defined a spatial point pattern as a set of locations, irregularly distributed within a study region, which have been generated by random mechanisms. Moreover, he formulated the three basic simplified spatial point patterns: Complete spatial randomness (CSR), regularity, and clustering that are very useful at an early stage of spatial analysis. Literature presents many methods of spatial point pattern analysis, classified by Cressie [51] in five main categories: Quadrat methods, distance methods, kernel estimators of intensity, nearest neighbor distribution functions, and K-function analyses. As noticed by [52], each method has its pros and cons, and works well for certain datasets or regions.
Three methods, namely: Kernel density, nearest neighbor analysis, and Ripley's K-function, were used to examine the spatial geodetic control point pattern in this study. They are briefly described below. Kernel density, a nonparametric estimation, belongs to the local density techniques based on a first-order property of the pattern, which concerns itself with the variation of the observations' density across a given area [53]. The kernel method computes a localized density for subsets of the study area; the sub-regions overlap one another, providing a moving sub-region

Point Pattern Analysis
Point pattern analysis (PPA) is a set of methods for the investigation of spatial arrangements of points in space, usually two-dimensional, and could be expressed as a set X = {x ∈ D} where D, the study region, is a subset of R 2 , a two-dimensional Euclidean space [49]. Diggle [50] defined a spatial point pattern as a set of locations, irregularly distributed within a study region, which have been generated by random mechanisms. Moreover, he formulated the three basic simplified spatial point patterns: Complete spatial randomness (CSR), regularity, and clustering that are very useful at an early stage of spatial analysis. Literature presents many methods of spatial point pattern analysis, classified by Cressie [51] in five main categories: Quadrat methods, distance methods, kernel estimators of intensity, nearest neighbor distribution functions, and K-function analyses. As noticed by [52], each method has its pros and cons, and works well for certain datasets or regions.
Three methods, namely: Kernel density, nearest neighbor analysis, and Ripley's K-function, were used to examine the spatial geodetic control point pattern in this study. They are briefly described below. Kernel density, a nonparametric estimation, belongs to the local density techniques based on a first-order property of the pattern, which concerns itself with the variation of the observations' density across a given area [53]. The kernel method computes a localized density for subsets of the study area; the sub-regions overlap one another, providing a moving sub-region window, defined as a kernel. The density, λ(s), at s (a vector location anywhere in R and s 1 , . . . , s n are the vector locations of the n observed events) is estimated as [54]: where, k() represents the kernel weighting function; s, the center of moving window; and τ, which is greater than 0, the bandwidth. Deng et al. [55] demonstrated that the choice of the bandwidth τ has a critical impact on the results of density analysis. As noticed by Gatrell et al. [54], increasing the bandwidth results in smoothening of the spatial variation in intensity, while its reduction results in an increasingly 'spiky' estimate.
The nearest neighbor analysis and Ripley's K-function belong to the distance methods. They explore how the points are distributed relative to one another (a second-order property of the point pattern) as opposed to how the points are distributed relative to the study extent [56][57][58]. The PPA is tested against the hypothesis of complete spatial randomness, under which the spatial pattern is a realization of a Poisson point process. An average nearest neighbor (ANN) analysis measures the average distance from each point in a given area to its nearest point which compares to the distances between nearest points and distances that would be expected on the basis of chance. If the average distance is less than the average for a hypothetical random distribution, the distribution of the analyzed features is considered clustered. If the average distance is greater than a hypothetical random distribution, the features are considered dispersed [59]. The nearest neighbor analysis, developed in 1950 [54], is one of the oldest distance statistics. The null hypothesis, tested by the two-tail test of CRS, is rejected if and only if: where, z ANN -observed distribution of analyzed sample of points, z p -standard normal distribution, and p-the significance level. A z ANN value lower than z p shows that the analyzed point pattern is more clustered, while a value larger than z p/2 indicates a pattern more dispersed than random. The K-function, known as Ripley's K-function, examines all inter-point distances instead of computing separate nearest neighbors. It illustrates how the spatial clustering or dispersion of point change when the neighborhood size changes. For data analysis, the variance stabilized Ripley K-function, called the L function, is generally used (Equation (3)): where, d-the distance, N-total number of points (features), A-the total area, and k i,j -a weight connected with edge correction function. Particularly, in this study, Ripley's K function counts the number of neighboring BGCPs found within a given distance of each individual point. The number of observed neighboring BGCPs is then compared to the number of points that are expected based on a completely spatially randomness. The advantage of the function is that it allows for the investigation of how BGCPs point pattern distribution is changing with scale. However, it should be remembered that patterns are suspect at larger distances due to edge effects.

Thiessen Polygons
Measuring the evenness of the BGCPs' spatial location is based on the analysis of the area and shape of Thiessen polygons formed over the geodetic points' site. Thiessen polygons, also called Voronoi diagrams, define individual areas of influence around each of a set of points, hence any location within a Thiessen polygon is closer to its linked point than to any other [60]. This property has caused that Thiessen polygons are perceived as a proper construct to test the CSR hypothesis as well as the regularity of spatial distribution [61,62]. In considering the equal division characteristics of the Thiessen polygon in spatial tessellation, the method can be used to investigate such issues as nearest points, proximity, adjacency, and accessibility analyses [63].
The shape indices (SI) of Thiessen polygons were calculated according to the equation, firstly introduced by [64]: where, P-denotes the perimeter of the Thiessen polygon, and A-the area of the polygon. The SI value 1.128 refers to the square-shaped polygon, while the value 1 denotes a circle-shaped. The diversities in the shape, area, and land cover structure of Thiessen polygons were measured by the coefficient of variation, CV (coefficient of disparity), a dimensionless number that quantifies the degree of variability relative to the mean [65]: where, CV is the coefficient of variation, σ-the standard deviation, and µ-the mean. Clusters of small and big Thiessen polygons were found by analyzing the statistical distribution of the area and shape of the polygons. The diversity of Thiessen polygons land cover structure, expressed as percentage of artificial areas, agriculture, forest and bushes, water bodies, and marshes, as well as land cover diversity (i.e., number of classes-NC) and fragmentation (i.e., number of patches-NP), was assessed by the statistical measures of central tendency, position, and dispersion (mean, median) and the coefficient of disparity (CV).

Spatial Autocorelation, Cluster and Outlier Analysis
Spatial autocorrelation on the entire study area was investigated by the Global Moran's I index according to Equation (6) [66].
where, n is the number of attributes; y i is the attribute value in region i; y j is the attribute value in region j; y is the mean value of the attribute in the study site; and w ij are weights according to the neighborhood of the attributes. Moran's I values closer to −1 indicate negative or inverse spatial autocorrelation between the Thiessen polygons' morphological parameters, i.e., adjacent Thiessen polygons are characterized by dissimilar parameters values. The values closer to +1 indicate positive or direct spatial autocorrelation and highlight the regions with similarly high or low attribute values. Moran's I value 0 shows a global spatial randomness pattern of distribution.
The local indicator of spatial autocorrelation of Thiessen polygons was explored by the means of Anselin Local Moran's I (LISA) statistics in line with Equation (7) [67]: where, n is the number of attributes; y i is the attribute value in region i; y j is the attribute value in region j; y is the mean value of the attribute in the study site. LISA is generally used for the decomposition of Moran's statistics, the global measures of autocorrelation. The results, presented as a choropleth map, depict (by the COType attribute) statistically significant (95 percent confidence level) cluster of high values (HH) and low values (LL), as well as outlier where a high/low value are surrounded by low/high values (HL/LH). Based on the COTtype attributes' values, the regularly dense and regularly thinned BGCPs' patterns were defined.
The regularly dense cluster (RD) is the sum of the Thiessen polygons that belong to the subset formed by the COType LL of the Thiessens' area (LL AREA ) and nearest neighbor distance between geodetic points (LL NND ) minus the Thiessen polygons of shape index (SI) value lower than SI < SI-σ 2 , i.e., SI outliers (SI out ), as seen in the Equation (8). The elimination of very irregular polygons makes it possible to exclude the polygons that are adjacent to the country border, thus eliminating the edge effect [54]. The regularly thinned cluster (RT) is composed of the Thiessen polygons which COType attribute of an area and a nearest neighbor fields take value HH (Equation (9)).
The RD regions with a closer location between BGCP (LL) and RT-more distant (HH)-are shown on a map. This map, available as a web application, could help surveyors to assess the cost and duration of measurement work.
ArcGIS 10.5 and Statistica 13.1 were used for conducting the spatial and statistical analysis.

General Characteristic of BGCPs' Location
The Base Geodetic Control Network in Poland comprises 6723 points, including 103 ASG-EUPOS stations. The geographical points location highly (with the coefficient of determination equals to 0.92) corresponds to the main land cover type, namely: Artificial areas, agriculture, forest, and bushes (see Table 2 and Figure 3a). As many as 57.37% of BGCPs are situated on agricultural land, 30.78% in forest, and 5.65% in urban areas. There are relatively few points on pastures and meadows. This could be explained by the difficulty of maintaining a point's monumentation stability in lands with high flood risk and variable groundwater levels, which, according to [68], is typical for Polish meadows and pastures, located mainly in river valleys. The average density of the BGCPs in Poland amounts to 1.08 point per 50 km 2 and is in line with the national regulation [32].
A total of 6256 points are situated no further than 200 m from the transportation network (Table 3, Figure 3b), of which as many as 343 (5.1%) are within a 20 m buffer around paved roads, and two points are closer than 20 m from the railway.  The relatively close (up to 200 m) distance of as much as 93.05% of geodetic control points to paved roads makes them easily accessible to surveyors, which is undoubtedly a great advantage of the Base Geodetic Control Network in terms of its use and maintenance.
Nevertheless, there is a set of 383 BGCPs located no further than 20 m from environmental objects that could provide significant disruption of an electromagnetic signal, and finally, decrease the accuracy of positioning (see Table 3 and Figure 4).
Local density of BGCPs varies from 0.2 to 3.65 points over 50 km 2 and shows places of higher and lower point density, which are underlined in red and orange in the kernel density map ( Figure  5a). A significant increase in the density of geodetic points is observed in major metropolitan agglomerations and heavily industrialized regions, such as mining areas (Figure 5b).
The highest density is observed in the Upper Silesian Polycentric Metropolitan Area (from 3 to 3.5 points per 50 km 2 ) followed by Warsaw, Lodz, Wroclaw, Szczecin, Bydgoszcz-Torun-Grudziadz, tri-cities metropolitan areas, where the density of BGCPs varies from 1.2 to 2.85 points per 50 km 2 . The lowest values (less than 0.5 over 50 km 2 ) are mainly found in the Southern Pomerania Lakelands, Sandomierz and Nida Basins, as well as the North Polesie Plain, which are mostly agricultural and forested regions with a high share of natural vegetation and relatively sparsely populated areas, and the percentage of urbanized areas lower than 1.17 (Figure 5b).  The relatively close (up to 200 m) distance of as much as 93.05% of geodetic control points to paved roads makes them easily accessible to surveyors, which is undoubtedly a great advantage of the Base Geodetic Control Network in terms of its use and maintenance.
Nevertheless, there is a set of 383 BGCPs located no further than 20 m from environmental objects that could provide significant disruption of an electromagnetic signal, and finally, decrease the accuracy of positioning (see Table 3 and Figure 4).
Local density of BGCPs varies from 0.2 to 3.65 points over 50 km 2 and shows places of higher and lower point density, which are underlined in red and orange in the kernel density map (Figure 5a). A significant increase in the density of geodetic points is observed in major metropolitan agglomerations and heavily industrialized regions, such as mining areas (Figure 5b).
The highest density is observed in the Upper Silesian Polycentric Metropolitan Area (from 3 to 3.5 points per 50 km 2 ) followed by Warsaw, Lodz, Wroclaw, Szczecin, Bydgoszcz-Torun-Grudziadz, tri-cities metropolitan areas, where the density of BGCPs varies from 1.

Point Pattern Analysis
The null hypothesis presuming Complete Spatial Randomness (CSR) for BGCPs' pattern analysis was rejected, using Average Nearest Neighbor tools. The score amounts to 1.39, with a 99% confidence level, which indicates that there is less than 1% likelihood that the dispersed pattern

Point Pattern Analysis
The null hypothesis presuming Complete Spatial Randomness (CSR) for BGCPs' pattern analysis was rejected, using Average Nearest Neighbor tools. The score amounts to 1.39, with a 99% confidence level, which indicates that there is less than 1% likelihood that the dispersed pattern

Point Pattern Analysis
The null hypothesis presuming Complete Spatial Randomness (CSR) for BGCPs' pattern analysis was rejected, using Average Nearest Neighbor tools. The z ANN score amounts to 1.39, with a 99% confidence level, which indicates that there is less than 1% likelihood that the dispersed pattern of the BGCPs could be the result of random chance and that the points are dispersed over the territory of Poland. The observed mean nearest inter-point distance equals 5.583 km, while the expected amounts to 4.013 km. Moreover, the coefficient of variation CV amounts to 21.66%, which means that the nearest distances between BGCPs do not vary significantly (see Figure 6). ISPRS Int. J. Geo-Inf. 2020, 9, x FOR PEER REVIEW 11 of 20 of the BGCPs could be the result of random chance and that the points are dispersed over the territory of Poland. The observed mean nearest inter-point distance equals 5.583 km, while the expected amounts to 4.013 km. Moreover, the coefficient of variation CV amounts to 21.66%, which means that the nearest distances between BGCPs do not vary significantly (see Figure 6).  At the distances from 2 up to 7 km, typical for base geodetic control networks, L(d) lies below the expected distance value (higher than expected approximately 2000 m) and indicates statistically significant (with the level of confidence of 99%) evidence of regular dispersion of geodetic points. At longer distances (greater than 7 km) L(d) lies above the expected distance, indicating spatial clustering. In general, at near distances, the BGCPs are expected to be dispersed, as the number of points within a given distance of each individual point is small, as the distance increases, each BGCP would typically have more neighbors, hence they are clustered.

Thiessen Polygons Morphology and Land Cover Structure
ExpectedK ObservedK Confidence Env. of the BGCPs could be the result of random chance and that the points are dispersed over the territory of Poland. The observed mean nearest inter-point distance equals 5.583 km, while the expected amounts to 4.013 km. Moreover, the coefficient of variation CV amounts to 21.66%, which means that the nearest distances between BGCPs do not vary significantly (see Figure 6).  At the distances from 2 up to 7 km, typical for base geodetic control networks, L(d) lies below the expected distance value (higher than expected approximately 2000 m) and indicates statistically significant (with the level of confidence of 99%) evidence of regular dispersion of geodetic points. At longer distances (greater than 7 km) L(d) lies above the expected distance, indicating spatial clustering. In general, at near distances, the BGCPs are expected to be dispersed, as the number of points within a given distance of each individual point is small, as the distance increases, each BGCP ExpectedK ObservedK Confidence Env. At the distances from 2 up to 7 km, typical for base geodetic control networks, L(d) lies below the expected distance value (higher than expected approximately 2000 m) and indicates statistically significant (with the level of confidence of 99%) evidence of regular dispersion of geodetic points. At longer distances (greater than 7 km) L(d) lies above the expected distance, indicating spatial clustering. In general, at near distances, the BGCPs are expected to be dispersed, as the number of points within a given distance of each individual point is small, as the distance increases, each BGCP would typically have more neighbors, hence they are clustered.

Thiessen Polygons Morphology and Land Cover Structure
The analysis of the Thiessen polygons area and shape also emphasizes that the distribution of BGCPs in Poland is generally dispersed. The shape of polygons, expressed as the shape index (Eq.4), is almost equal. The SI values vary from 0.095 to 0.935 (range 0.840), with the mean of 0.826, and median of 0.840 (see Table 4, and Figure 8a). Moreover, 50% of polygons are characterized by the SI index that fluctuates from 0.812 (lower quartile) to 0.863 (upper quartile). H-spread (IQR) amounts to 0.051, showing very low dispersion, which is also visible in the CV index value, equal to 8.45. The exclusion of polygons neighboring the country border entails even smaller diversification of the SI, which underlines the almost double reduction of the CV index from 8.45% to 4.48%. The analysis of the Thiessen polygons area and shape also emphasizes that the distribution of BGCPs in Poland is generally dispersed. The shape of polygons, expressed as the shape index (Eq.4), is almost equal. The SI values vary from 0.095 to 0.935 (range 0.840), with the mean of 0.826, and median of 0.840 (see Table 4, and Figure 8a). Moreover, 50% of polygons are characterized by the SI index that fluctuates from 0.812 (lower quartile) to 0.863 (upper quartile). H-spread (IQR) amounts to 0.051, showing very low dispersion, which is also visible in the CV index value, equal to 8.45. The exclusion of polygons neighboring the country border entails even smaller diversification of the SI, which underlines the almost double reduction of the CV index from 8.45% to 4.48%. The analysis of the Thiessen polygons' size distribution indicates little deviation from the normal distribution. The mean equals 46.48 km 2 , and the median is 46.64km 2 . Skewness (0.051) shows the fairly symmetric character, with a little longer right tail, while kurtosis (0.896) indicates the fewer and less extreme outliers than in the normal distribution. The area of 50% polygons ranges from 3.685 to 46.636 km 2 , and 90% does not exceed 64.4 km 2 (Figure 8b). The coefficient of disparity (CV) of the polygons' area amounts to 32.4% and emphasizes their medium differentiation.
The analysis of land cover structure in the Thiessen polygons reveals high diversity, emphasized by the CV coefficient, which varied from 285.18% for water bodies to 47.76% for agriculture areas. The land cover dispersion in dense urban areas (CV = 166.16) is higher than in sparsely urbanized ones (CV=123.97) and was followed by forested lands (CV = 76.35). On average, the number of CLC classes (NC=9) is nearly four times lower than the number of patches (NP = 34), which suggests that the landscape heterogeneity is lower than its fragmentation. Moreover, there were some polygons covered entirely by one land cover patch (NP amounts to one). On average, 53.4% of the area of Thiessen polygons is agricultural, followed by forests (32.4%). Urban areas occupy about 6.6%, and dispersed settlement 2.8%. Water bodies (lakes and rivers) cover only 2.1%.  The analysis of the Thiessen polygons' size distribution indicates little deviation from the normal distribution. The mean equals 46.48 km 2 , and the median is 46.64km 2 . Skewness (0.051) shows the fairly symmetric character, with a little longer right tail, while kurtosis (0.896) indicates the fewer and less extreme outliers than in the normal distribution. The area of 50% polygons ranges from 3.685 to 46.636 km 2 , and 90% does not exceed 64.4 km 2 (Figure 8b). The coefficient of disparity (CV) of the polygons' area amounts to 32.4% and emphasizes their medium differentiation.
The analysis of land cover structure in the Thiessen polygons reveals high diversity, emphasized by the CV coefficient, which varied from 285.18% for water bodies to 47.76% for agriculture areas. The land cover dispersion in dense urban areas (CV = 166.16) is higher than in sparsely urbanized ones (CV=123.97) and was followed by forested lands (CV = 76.35). On average, the number of CLC classes (NC=9) is nearly four times lower than the number of patches (NP = 34), which suggests that the landscape heterogeneity is lower than its fragmentation. Moreover, there were some polygons covered entirely by one land cover patch (NP amounts to one). On average, 53.4% of the area of Thiessen polygons is agricultural, followed by forests (32.4%). Urban areas occupy about 6.6%, and dispersed settlement 2.8%. Water bodies (lakes and rivers) cover only 2.1%. Artificial land (heavily and sparsely urbanized), on average, occupies 9.44% of the Thiessen polygons. However, the smallest polygons, with an area of less than 16.5 km 2 (mean − 2.0 std. dev.), are covered by artificial surfaces in 32.06%, while for very large polygons (bigger than 77 km 2 , mean − 2.0 std. dev.) artificial land covers only 4.64% (see Figure 9). The Pearson's correlation coefficient (PCC), equal to −0.28 (significant at p < 0.05000), indicates the weak downhill linear correlation between the size of the polygons and acreage of urbanized areas. The level of urbanization, however, is moderately correlated with road density (PCC equals 0.50). Nevertheless, in small polygons, the average share of urbanized and agricultural areas amounted to 30% and the average share of forests amounted to 20%, which shows a fairly balanced land cover structure. Artificial land (heavily and sparsely urbanized), on average, occupies 9.44% of the Thiessen polygons. However, the smallest polygons, with an area of less than 16.5 km 2 (mean − 2.0 std. dev.), are covered by artificial surfaces in 32.06%, while for very large polygons (bigger than 77 km 2 , mean − 2.0 std. dev.) artificial land covers only 4.64% (see Figure 9). The Pearson's correlation coefficient (PCC), equal to −0.28 (significant at p < 0.05000), indicates the weak downhill linear correlation between the size of the polygons and acreage of urbanized areas. The level of urbanization, however, is moderately correlated with road density (PCC equals 0.50). Nevertheless, in small polygons, the average share of urbanized and agricultural areas amounted to 30% and the average share of forests amounted to 20%, which shows a fairly balanced land cover structure. A moderate positive linear correlation (with the significant at p < 0.05) was observed between a polygon's size and the number of patches (PCC = 0.56), while, the relationship between a polygon's size and the area of agricultural land was very weak (PCC = 0.14). Likewise, the linear relationship between the shape index the land cover structure was similarly very weak (PCC lower than 0.15).

Quantification of Geodetic Network Uniformity
The Thiessen polygons' variability in size and shape analysis supported by cluster and outlier analysis (Anselin Local Moran's I), and preceded by a global autocorrelation assessment, made it possible to delineate the places of more or less densified geodetic networks. Global Moran's I statistics, based on Thiessen polygons' locations and attributes values, such as area, shape, and nearest distance to neighbor polygons' centroids, returned positive I and z-score values (I = 1.06/z = 204.43, I = 0.24/z − 45.18, and I = 1.08/z = 207.78, correspondingly) with a 99% level of significance. This empowers rejection (with a probability greater than 1%) of the null hypothesis, assuming that the Thiessen polygons characterised by the attribute being analyzed are randomly distributed in a study area that covered all of Poland.. Therefore, the spatial distributions of high and low area, shape, and NN distance (NNd) values are clustered. Such characteristics of the spatial distribution of A moderate positive linear correlation (with the significant at p < 0.05) was observed between a polygon's size and the number of patches (PCC = 0.56), while, the relationship between a polygon's size and the area of agricultural land was very weak (PCC = 0.14). Likewise, the linear relationship between the shape index the land cover structure was similarly very weak (PCC lower than 0.15).

Quantification of Geodetic Network Uniformity
The Thiessen polygons' variability in size and shape analysis supported by cluster and outlier analysis (Anselin Local Moran's I), and preceded by a global autocorrelation assessment, made it possible to delineate the places of more or less densified geodetic networks. Global Moran's I statistics, based on Thiessen polygons' locations and attributes values, such as area, shape, and nearest distance to neighbor polygons' centroids, returned positive I and z-score values (I = 1.06/z = 204.43, I = 0.24/z − 45.18, and I = 1.08/z = 207.78, correspondingly) with a 99% level of significance. This empowers rejection (with a probability greater than 1%) of the null hypothesis, assuming that the Thiessen polygons characterised by the attribute being analyzed are randomly distributed in a study area that covered all of Poland.. Therefore, the spatial distributions of high and low area, shape, and NN distance (NNd) values are clustered. Such characteristics of the spatial distribution of polygons was the basis for further analysis, namely determining the type of the polygon clusters, due to the high/low values of the attribute of surrounding polygons (see Figure 10a-d).
Out of 6723 polygons, 3914 (58.2%) created statistically significant clusters of autocorrelated polygons, characterized by the HH/LL or HL/LH area values. As much as 1234 Thiessen polygons (18.3%) create HH clusters (light red), and 1130 create LL clusters (light blue) (Figure 10a). The number of autocorrelated Thiessen polygons is less if the nearest distance between neighboring polygons is analyzed; a total of 2054, which is 30.6% of all polygons. As much 46.0% (945 points) form LL clusters and 870 (42.4%) form HH clusters. These Thiessen polygons comprise 14.1% and 12.9% of all polygons, respectively. If the shape (i.e., the SI) of polygons is considered (Figure 10c), polygons of LL and HL are mainly located along the country border. They constitute 43.4% of autocorrelated polygons, and just 10% of all of them. A further 746 polygons form HH clusters. However, LL and HL clusters are rather small, with an average of 11 polygons, so they were eventually excluded from further analysis aimed at finding local BGCP Thiessen polygon clusters. The RT regions (shown in green Figure 10e) are mainly covered by forest, agriculture, or agricultural-forest lands, and occupy 92,644 km 2 (29.7% of Poland). The regions of regularly densified geodetic control points (RD, presented in red, Figure 10e) are created by small, regularly shaped The RT regions (shown in green Figure 10e) are mainly covered by forest, agriculture, or agricultural-forest lands, and occupy 92,644 km 2 (29.7% of Poland). The regions of regularly densified geodetic control points (RD, presented in red, Figure 10e) are created by small, regularly shaped Thiessen polygons. They are composed of a smaller number of polygons than the RT regions (Table 5), and are more dispersed throughout the country territory, covering mainly urbanized areas, i.e., large agglomerations and accompanying industrial areas. Tiny RD clusters are located along the country's border, mainly West (Odra river bands), North (along Baltic coastal zone), and South (the Sudetes, the Tatra Mountains and the Western and Eastern Carpathians). The two largest clusters are located in Silesia Upland and Oświęcim Basin as well as in the central part of the Masovia and Podlasie Lowlands. RG regions cover just 9.9% of the country and comprise 16.8% of all Thiessen polygons. When it comes to the polygons' size, RD regions are more homogeneous than RT clusters, because Thiessen polygons' area ranges from 3.68 to 46.44 km 2 with an average value of 27.35 km 2 , while, in RT, is takes values from 9.29 to 120.80 km 2 (see Table 5).  Table 5). Base Geodetic Network densification and thinning also emphasize the shortest distances between adjacent (near neighbor) points, which ranges from 5840 to 20,580 m in RT clusters, and from 145 to 5560 in RD clusters. Distances below 500 m occur in Upper Silesia, in areas threatened by land surface deformation related to hard coal mining.

Discussion
New technological advances in GIScience have made it easier to justify Tobler's laws of geography. Now, we are able to measure almost everything, but the choice of technology and methods has a significant impact on the result of the analysis and final conclusions. Tobler's laws give the background for GIScience theory, while GIS software provides tools for spatial analysis. In our study, Tobler's laws of geography are read as a statement about the form of geodetic control points' spatial arrangement, particularly in the sense of similarity. Regardless of the nature and type of geodetic control networks, their task is to create a basis for determining the position of various objects on the Earth's surface. Preferably, the entire area concerned should be covered at a uniform density with a simultaneously adjusted network of control survey stations. Both ANN and Ripley's K-function are best suited to analyzing the type of spatial distributions. They help avoid the modifiable areal unit problem [69] as they treat space as continuous, without constraining data within any regions or units, and, contrary to the quadrat method, they are free from statistical bias caused by data aggregation [70]. Nevertheless, Ripley's K and NN functions describe different aspects of a point process. In particular, processes with the same K(t) function may have different nearestneighbor distribution functions, and vice versa.
As stated in many national regulations and standards, the spacing of the higher-order stations can vary from 5 to 16 km [23,33,71]. The conducted research proved that, in Poland, such  Table 5). Base Geodetic Network densification and thinning also emphasize the shortest distances between adjacent (near neighbor) points, which ranges from 5840 to 20,580 m in RT clusters, and from 145 to 5560 in RD clusters. Distances below 500 m occur in Upper Silesia, in areas threatened by land surface deformation related to hard coal mining.

Discussion
New technological advances in GIScience have made it easier to justify Tobler's laws of geography. Now, we are able to measure almost everything, but the choice of technology and methods has a significant impact on the result of the analysis and final conclusions. Tobler's laws give the background for GIScience theory, while GIS software provides tools for spatial analysis. In our study, Tobler's laws of geography are read as a statement about the form of geodetic control points' spatial arrangement, particularly in the sense of similarity. Regardless of the nature and type of geodetic control networks, their task is to create a basis for determining the position of various objects on the Earth's surface. Preferably, the entire area concerned should be covered at a uniform density with a simultaneously adjusted network of control survey stations. Both ANN and Ripley's K-function are best suited to analyzing the type of spatial distributions. They help avoid the modifiable areal unit problem [69] as they treat space as continuous, without constraining data within any regions or units, and, contrary to the quadrat method, they are free from statistical bias caused by data aggregation [70]. Nevertheless, Ripley's K and NN functions describe different aspects of a point process. In particular, processes with the same K(t) function may have different nearest- Base Geodetic Network densification and thinning also emphasize the shortest distances between adjacent (near neighbor) points, which ranges from 5840 to 20,580 m in RT clusters, and from 145 to 5560 in RD clusters. Distances below 500 m occur in Upper Silesia, in areas threatened by land surface deformation related to hard coal mining.

Discussion
New technological advances in GIScience have made it easier to justify Tobler's laws of geography. Now, we are able to measure almost everything, but the choice of technology and methods has a significant impact on the result of the analysis and final conclusions. Tobler's laws give the background for GIScience theory, while GIS software provides tools for spatial analysis. In our study, Tobler's laws of geography are read as a statement about the form of geodetic control points' spatial arrangement, particularly in the sense of similarity. Regardless of the nature and type of geodetic control networks, their task is to create a basis for determining the position of various objects on the Earth's surface. Preferably, the entire area concerned should be covered at a uniform density with a simultaneously adjusted network of control survey stations. Both ANN and Ripley's K-function are best suited to analyzing the type of spatial distributions. They help avoid the modifiable areal unit problem [69] as they treat space as continuous, without constraining data within any regions or units, and, contrary to the quadrat method, they are free from statistical bias caused by data aggregation [70]. Nevertheless, Ripley's K and NN functions describe different aspects of a point process. In particular, processes with the same K(t) function may have different nearest-neighbor distribution functions, and vice versa.
As stated in many national regulations and standards, the spacing of the higher-order stations can vary from 5 to 16 km [23,33,71]. The conducted research proved that, in Poland, such requirements are fulfilled, with the mean observed interstation distance equal to 5.6 km. Moreover, each geodetic control point should be monumented with substantial stable survey monuments, and situated far from sources of electromagnetic interference and ground vibration [5,25,32]. The analysis revealed that these requirements are fulfilled, although, 5.7% of the base geodetic control points, which are located in the near vicinity of geographical features that could provide significant disruption of electromagnetic signal, have to be monitored on a regular, short-term basis in order to ensure a high accuracy of their position.
The review of world-wide scientific literature concerning geodetic points' location, as well as international and national regulation on network design and monumentation, revealed that intertwining between global, high-, and low-order networks is clearly visible. This is due to the growing demand for high-precision location measurements of objects, especially engineering, which is ensured by GNSS-based measurement techniques and methods [71]. Therefore, there is a clear tendency to place detailed geodetic control points (low-order) in a position that maximizes the use of various measurement techniques, especially those that are intended for use in GNSS observations [21,28,72]. High-order geodetic stations are dispersed regularly [25,39,73,74], and this evenness is only slightly disturbed in densely urbanized areas [23] or those with challenging environmental conditions [75]. At the local level, based on Poland as a case study, the clustered distribution and its relation to land cover/land use is evident and well documented [22,29,30], and the analysis revealed high clustering along roads, water courses, and in dense urban areas. High-order geodetic control networks, with regularly dispersed reference stations, define the geodetic reference system in Poland [16]. To the contrary, local, detailed networks, that are a part of land administration systems, are tailor-made to the covered area, and are suitable for performing local measurements, including surveying deformations and the necessity to monitor engineering structure [28,72]. They are inevitably designed to match local purposes and scale.
Scale is, de facto, a fundamental concept in any analysis concerned with spatial pattern and processes [48,66] as the patterns and processes of spatial objects have an implicit scale of variation. As noticed by Goodchild [76,77], scale is often perceived as an alternative dimension to interpret the patterns and processes. Scale also implicite relates to land policy, especially in land administration domain systems and models [15,78]. Scale dependent variation of BGCPs' spatial pattern is shown by Ripley's K-function and spatial autocorrelation. Furthermore, the spatial pattern of geodetic control networks is locally hampered by environmental features or processes (e.g., surface deformation as a result of mining, vicinity of water reservoirs, sky obstruction by a forest canopy) and changes from regular to densified or thinned.

Conclusions
The novelty of this study relied on quantifying the observed dispersed spatial pattern of the base geodetic control points (BGCPs) into regularly dense and regularly thinned, depending on a cluster and outlier analysis. The rules and principles of designing geodetic control networks are well recognized and described in the literature, but mainly as theoretical, numerical examples. Our research examines the spatial distribution of geodetic control points monumented within the frame of the base geodetic network in Poland. Hence, it has a broader context and provides some results concerning the spatial pattern of the BGCPs in Poland, which slightly changes within scales. The spatial dispersion of the BGCPs was documented by an ANN analysis, one of the oldest spatial statistics. The z-score of 1.39 for the ANN analysis, placed in the tail of normal distribution, provides clear evidence for dispersion. This dispersion was confirmed by the second-order spatial statistic, namely the Ripley's K-function, which shows that at the near distances from 2 to 7 km, which are typical lengths in base geodetic networks. This leads to the conclusion that the assumed uniform distribution of BGCPs over the country's territory has been achieved, however, in regions where the inter-point distance equals 2 km on average, a considerable densification of BGCPs is observed.
Only 6% of the BGCPs could be influenced by environmental obstacles, such as close vicinity to express roads, water reservoirs, or electricity lines. Moreover 452 points are located in artificial areas, where the satellite signals may be disturbed in multiple ways. All these points should be monitored regularly to ensure that their coordinates are determined accurately.
The presented research faces challenges in examining the configuration of Polish base geodetic control points by means of advanced spatial pattern methods, especially in analyzing the distribution of basic control network points. The achieved results could serve the scientific community and also land surveyors. They could be helpful at the survey planning stage because they show the accessibility of points in terms of inter-point distance, geometric distribution, and environmental conditions in the vicinity of stations.
The originality of the research lies in its complex approach to analyzing the density of geodetic control points and the evenness of their spatial distribution. Although the methods used are well known and widely described in the literature, their combination is an added value, especially for the National Mapping Agency that is responsible for geodetic control networks maintenance as well as for surveyors. Ultimately, the goal of the present research is not only to quantify the uniformity of spatial distribution of the base geodetic control points in Poland, but also to find environmental obstacles that hamper the assumed regular dispersed distribution and, finally, to provide methodological fundamentals enabling to conduct the analytical procedure for geodetic control points spatial pattern assessment.