2.1. Characteristics of the Sicilian GNSS CORS Networks
With an area of 25,711 km², Sicily (Italy, Europe) is the largest island in the Mediterranean Sea, strategically positioned at the crossroads of civilizations. Over the centuries, it has been a hub for cultural exchange, scientific advancements, and technological innovation. This legacy continues today, particularly in the field of geospatial sciences, where GNSS CORS networks play a crucial role in high-precision positioning and geodynamic studies.
Over time, GNSS CORS networks were implemented in Sicily and provided essential data for applications ranging from seismic monitoring to cadastral mapping, ensuring accurate geodetic measurements across the region.
Until the last decade, GNSS CORS in Sicily were limited exclusively to stations intended for monitoring and strain control. In addition, two networks had been set up, managed, respectively, by the Italian National Institute of Geophysics and Volcanology (
Istituto Nazionale di Geofisica e Vulcanologia, INGV) and the Italian Institute for Environmental Protection and Research (
Istituto Superiore per la Protezione e la Ricerca Ambientale, ISPRA). The INGV has built two different GNSS networks. The first is an integrated network referred to as the
Rete Integrata Nazionale GPS (National Integrated GPS Network,
RING), located in the most seismogenically relevant regional areas (see the papers of Devoti et al., Avallone et al., and Barreca et al. [
29,
30,
31]). The
RING GNSS Network consists of about 200 stations that are deployed all over Italy; the development and realization of a stable GPS monumentation, and its integration with seismological instruments, make this network one of the most innovative and reliable CGPS networks in the world. Moreover, since 1995, the INGV—Etnean Observatory (INGV—
Osservatorio Etneo, INGV—OE) has deployed a network devoted to the monitoring of the Sicilian volcanic areas. In particular, a network of 34 GNSS CORS stations is located on the flanks of Mt. Etna volcano [
32], a network of 4 stations is active on Stromboli Island, 7 stations have been installed in the Vulcano–Lipari area, and, finally, 5 stations are currently operating on Pantelleria Island. Since the early 2000s, there has been considerable interest in public and private GNSS CORS networks in Sicily aimed at delivering real-time differential corrections to geodetic professionals and other users. Three GNSS CORS networks with different locations, management software, and geodetic instruments were thus created in Sicily. The first, established in 2007, was a private network managed by
Computer Graphics Technologies (CGT) of Palermo, called
VRS Sicilia; it used Trimble instrumentation, consisted of 18 SPs framed in the ETRF89 datum, and was managed by GPSNET software version 2.50, which offered VRS and Nearest as available streams. The second, the GNSS CORS UNIPA public network, was designed and built in 2008 by researchers of the University of Palermo with Topcon instrumentation, in the framework of the national research project PRIN2005 (networks of GPS permanent stations for real-time surveying in control and emergency deployments); it consisted of 8 SPs (
Figure 1c) framed in the RDN (
Rete Dinamica Nazionale, managed by IGM) ETRF2000 datum and was managed by Geo++ software version 2005, with VRS, FKP, and Nearest as available streams [
33]. Following its collaboration with the University and as part of its commercial activity, Topcon Positioning Italy expanded the network by installing further stations in Eastern Sicily; to date, the network has 17 SPs framed in the IGS05 datum and is managed by Geo++ software, with VRS and Nea as available streams [
34]. The third, developed by the INGV since 2009, is a network of permanent stations for real-time publications called
Sicili@net, which relies on some of the 85 CORS managed by the Osservatorio Etneo INGV (INGV OE). This network, consisting of 16 SPs (
Figure 1d) with Leica and Trimble instrumentation, is framed in the RDN datum (ETRF2000), managed by the Leica GNSS Spider Net software version 1.0 [
30] and offers MAX, iMAX, VRS, FKP, and Nearest as available streams. GNSS data are processed with a satellite elevation mask of 10°. Precise IGS (International GNSS Service) ultra-rapid orbits, which are downloaded automatically, are used in the real-time processing; if no precise ephemerides are available, broadcast ephemerides are used. The ionosphere model is set to ‘Automatic’, in which GNSS Spider automatically chooses a suitable ionospheric model based on the available data. The highest priority is given to a model based on predicted IONEX data downloaded from the IGS. This model is predicted from observations at a global set of GNSS reference stations and provides the most accurate a priori estimate of the ionosphere’s condition. If these data are not available, the Klobuchar model is used [
35]. The Klobuchar model corrects global ionospheric disturbances; in general, it models about 50% of the ionospheric effect. The parameters needed by the Klobuchar model are provided in the navigation message sent by each satellite. The model is applied to the raw observations before the ambiguities are estimated. Furthermore, a standard tropospheric model is applied in data processing, and any tropospheric errors not accounted for by this standard model are estimated by the network processing. This allows the network processing to account for unexpected deviations in the tropospheric delay.
Finally, a fourth network was also present in the Sicilian region:
SmartNet ItalPoS, managed directly by Leica Geosystems and consisting of 14 SPs, some of which belong to the same INGV network [
36].
Dardanelli et al. [
33] report all the details concerning the design, data availability, preliminary studies, and analysis of the
GNSS CORS UNIPA network, including the geodetic framework used, the time series of coordinates and displacements, and a statistical analysis based on the cumulative distribution function (CDF).
See the work of Castagnetti et al. [
37] for further details of
SmartNet ItalPoS. It is worth pointing out that the
GNSS CORS UNIPA and
SmartNet ItalPoS networks are now fully operational and form part of the
Topcon GNSS [
34] and
Hexagon HxGN SmartNet [
36] networks, respectively.
Unlike the networks operated with Topcon and Leica instruments, the
VRS Sicilia network was active on the island only from 2007 to 2020 (
Figure 1b): it consisted of 18 CORS equipped with Zephyr Geodetic II GPS antennas and Trimble® NetR5 receivers capable of tracking L1, L2, L2C, and L5 GPS signals, as well as L1 and L2 GLONASS signals.
The network was managed with the Trimble GPSNet CORS management software, and the network service provided VRS and Nearest corrections.
To determine the geodetic framework of
VRS Sicilia, the
NDA Professional software (NDA version 2012) was used. This software had already been used both for the geodetic framework of the GNSS CORS UNIPA network [
33] and for other scientific applications. Several comparisons have been carried out over the past years to validate the results of NDA: for example, Panza et al. compared the results obtained by NDA with those of GAMIT/GLOBK to obtain a combined solution in the SISMA project (Seismic Information System for Monitoring and Alert), while Dardanelli et al. obtained results from NDA that were comparable with those of Bernese 5.0 [
38,
39]. Lastly, Maltese et al. (2021) obtained significant results from this software for verification against optical-based classification, thermal diachronic analysis, and a quasi-PS (Persistent Scatterer) interferometric SAR technique on an earthen dam [
40]. Therefore, in keeping with what was established in the previously published work, the following models were used for the processing of the
VRS Sicilia network:
the Saastamoinen tropospheric correction [
41,
42], the
Klobuchar ionospheric correction [
35], and the
Schwiderski ocean loading correction [
43].
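As an illustration of the tropospheric term, the following minimal sketch evaluates the classic Saastamoinen expression at the zenith (zenith angle of 0, so the tangent term vanishes). It is not the NDA implementation; the function name and the standard-atmosphere input values are assumptions chosen only for the example.

```python
def saastamoinen_zenith_delay(pressure_hpa, temperature_k, vapour_pressure_hpa):
    """Return (hydrostatic, wet) zenith delays in metres.

    Classic Saastamoinen form evaluated at zenith:
    delay = 0.002277 * [P + (1255 / T + 0.05) * e]
    with P and e in hPa and T in kelvin.
    """
    hydrostatic = 0.002277 * pressure_hpa
    wet = 0.002277 * (1255.0 / temperature_k + 0.05) * vapour_pressure_hpa
    return hydrostatic, wet


# Illustrative (assumed) sea-level conditions, not values from the paper.
dry_m, wet_m = saastamoinen_zenith_delay(1013.25, 288.15, 11.7)
ztd_m = dry_m + wet_m
print(f"hydrostatic = {dry_m:.3f} m, wet = {wet_m:.3f} m, total = {ztd_m:.3f} m")
```

The hydrostatic part dominates the total delay, which is why the residual (mostly wet) component is often estimated separately in baseline processing.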
In detail, a single-baseline approach was used to connect
VRS Sicilia to the nearest
International GNSS Service (IGS) CORS, namely CAGL (Cagliari), MATE (Matera), and NOT1 (Noto), shown in
Figure 1a, in addition to the other IGS stations already used for the geodetic framework of the
Sicili@net network. For baseline processing, zenith troposphere estimation (which affects the estimated baseline coordinates) was enabled at both stations, as recommended for baselines over 15 km. Following the multi-frequency strategy, double-differenced observations from the L5 (wide-lane) and L3 (ionosphere-free) frequency combinations were used; the software recommends this option for baselines longer than 10 km. The
Least-Squares Ambiguity Decorrelation Adjustment (LAMBDA) method, described by Teunissen [
44], was used to fix the phase ambiguities. The final solution was estimated in two steps: first, the wide-lane observations were used to estimate the wide-lane ambiguities; then, the ionosphere-free observations were used to estimate the remaining narrow-lane ambiguities. Finally, the sampling interval and the cut-off angle were set to 30 s and 10 degrees, respectively.
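The wide-lane and ionosphere-free combinations mentioned above can be sketched for the GPS L1 and L2 carriers as follows. This is only an illustrative sketch using the nominal GPS frequencies; the function names and the numerical range and delay values are assumptions, and the code does not reproduce the processing software's internals.

```python
C = 299_792_458.0   # speed of light, m/s
F1 = 1575.42e6      # GPS L1 carrier frequency, Hz
F2 = 1227.60e6      # GPS L2 carrier frequency, Hz


def wide_lane_wavelength():
    """Wavelength of the wide-lane combination (difference frequency f1 - f2)."""
    return C / (F1 - F2)


def narrow_lane_wavelength():
    """Wavelength of the narrow-lane combination (sum frequency f1 + f2)."""
    return C / (F1 + F2)


def ionosphere_free(phase1_m, phase2_m):
    """Ionosphere-free combination of two carrier phases expressed in metres."""
    return (F1**2 * phase1_m - F2**2 * phase2_m) / (F1**2 - F2**2)


# Assumed example: a 20,000 km geometric range with a 5 m first-order
# ionospheric delay on L1 (scaled by (f1/f2)^2 on L2, phase advance).
geometric_range = 20_000_000.0
iono_l1 = 5.0
iono_l2 = iono_l1 * (F1 / F2) ** 2
l3 = ionosphere_free(geometric_range - iono_l1, geometric_range - iono_l2)

print(f"wide-lane wavelength   = {wide_lane_wavelength():.4f} m")
print(f"narrow-lane wavelength = {narrow_lane_wavelength():.4f} m")
print(f"ionosphere-free range error = {l3 - geometric_range:.2e} m")
```

The long wide-lane wavelength (about 0.86 m) makes its ambiguity easier to fix first, after which the ionosphere-free observations constrain the shorter narrow-lane ambiguity, which matches the two-step order described in the text.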
Once the coordinates were calculated in the ITRF2005 system, they were transformed into the ETRF2000 system, using parameters taken from official EUREF documentation (
Boucher–Altamimi transformation equations, applied through the public website [
45]).
Similar to previous studies evaluating permanently materialized CORS, easily accessible and detectable markers were utilized without requiring special arrangements [
25]. These GNSS points were selected within the national and local static GNSS networks in Sicily. Specifically, some of the selected points belong to the national network developed during the IGM95 project, and the others to the local GNSS network developed by the Sicilian Region. The points of the IGM95 network, developed by the Italian Military Geographical Institute (IGM) in 1990, were calculated in the European datum ETRS89; they have a spacing of approximately 20 km and a Root Mean Square Error (RMSE) of ±0.05 m. The points belonging to the regional static network were mainly developed for technical applications, with a spacing of 7–9 km and an RMSE of ±0.075 m.
When the surveys were carried out, different network software packages were available for postprocessing analyses (Geo++, GNSMART, GPSNet, SpiderNet). Each package implements different algorithms for three different network products for real-time surveying: data from a virtual station close to the user’s receiver (the so-called VRS approach, described by Vollath et al. [
46]), data from an SP together with area parameters for spatially correlated disturbances (the so-called FKP approach, described by Wübbena et al. [
47,
48]), and corrections for all SPs in the cell surrounding the user (the so-called MAC or MAX approach, Chen et al. [
49]).
2.2. Characteristics of the Differential Correction Streams
The acronyms VRS, FKP, and MAX stand for different ways of processing and transmitting network information. Originally, these modes were proposed by individual companies, but later they became standard procedures, partly accepted in the RTCM definitions. As is known, the virtual reference station (VRS) mode requires two-way communication between the user’s receiver and the network center, because the latter has to make specific position corrections for the user’s receiver. The process of combining the data from different stations is performed in the network control center to produce the virtual station data to be sent to the user’s receiver.
The drawbacks of this technique are the two-way communication required, the high computational effort demanded of the control center to perform specific calculations for each user, and the need to optimize the position of the virtual reference station. The latter is necessary to prevent some rovers from using only L1 data, as they would when receiving information from a very close base station. This behavior would be appropriate with a real nearby base station, but not with a virtual one.
The Flächenkorrekturparameter (FKP, area correction parameter) mode is based on the use of parametric models. The computation is shared between the control center (parameter estimation) and the user receiver (position calculation). Data from different stations are first used by the control center to estimate the parameters, which are then transmitted to the user receiver; the user receiver then computes the optimized position. Compared to the previous method, the FKP mode avoids performing calculations in the control center that are specific to the user’s receiver; in addition, the transmission can be unidirectional.
The FKP data format is proprietary and uses message type 59 of RTCM-2, which the standard leaves free for proprietary use. In the FKP method, the dispersive effect, i.e., the ionospheric delay, and the frequency-independent effects, in which the tropospheric delay and orbit errors prevail, are treated separately.
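The idea of an area correction parameter can be illustrated with a deliberately simplified sketch: the control center broadcasts north and east gradients (in ppm) for the dispersive and non-dispersive parts, and the rover evaluates a linear plane at its own position relative to the reference station. The function name, the gradient values, and the coordinates are hypothetical, and the elevation-dependent scaling used in real FKP messages is omitted, so this is not the RTCM or Geo++ formulation.

```python
import math

EARTH_RADIUS_M = 6.37e6  # mean Earth radius, metres (approximation)


def plane_correction(grad_north_ppm, grad_east_ppm, ref_lat, ref_lon, lat, lon):
    """Evaluate a linear correction plane at the rover position.

    Latitudes and longitudes are in radians; gradients are in ppm
    (millimetres of correction per kilometre of offset). Returns metres.
    """
    d_north_m = (lat - ref_lat) * EARTH_RADIUS_M
    d_east_m = (lon - ref_lon) * math.cos(ref_lat) * EARTH_RADIUS_M
    return 1e-6 * (grad_north_ppm * d_north_m + grad_east_ppm * d_east_m)


# Hypothetical reference station and a rover 10 km to the north of it.
ref_lat, ref_lon = math.radians(38.0), math.radians(13.5)
rover_lat = ref_lat + 10_000.0 / EARTH_RADIUS_M
rover_lon = ref_lon

# Hypothetical gradients, kept separate as in the FKP approach.
non_dispersive_m = plane_correction(1.0, 0.5, ref_lat, ref_lon, rover_lat, rover_lon)
dispersive_m = plane_correction(3.0, -1.0, ref_lat, ref_lon, rover_lat, rover_lon)
print(f"non-dispersive = {non_dispersive_m:.4f} m, dispersive = {dispersive_m:.4f} m")
```

Because the rover only needs the broadcast gradients and its own approximate position, nothing user-specific has to be computed at the control center, which is what allows one-way transmission.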
The MAX method was proposed by Leica and Geo++ [
48], and the iMAX variant (Individualized MAX) was developed concurrently to accommodate older receivers incapable of processing MAX corrections. The MAX format is now part of the RTCM-3 standard. It is based on transmitting data from different reference stations to the user’s receiver without any spatial modeling; the data combination is left to a calculation model that is not strictly defined. The name ‘Master-Auxiliary’ reflects network processing based on the differences between the data of a master station (Master) and those of the remaining stations (Auxiliary). Network processing is simple and does not require a particular model. However, the MAX system requires the RTCM-3 protocol, which prevents the use of obsolete receivers.
The iMAX variant provides the information in a format similar to that of the VRS and still uses the RTCM-2 format. Unlike the MAX method, the iMAX mode requires two-way communication. In general, methods that do not require bi-directional transmission would allow very cost-effective dissemination of information through broadcast-type transmissions based on digital radio technology or similar. Such systems, however, are only experimental. Nowadays, most systems are based on a connection between the user’s receiver and the network (direct or via the Internet). Under these conditions, the advantage of a one-way system no longer exists [
50].
In addition to the differential correction streams mentioned, the Nearest correction was also included in the analysis. This solution is used when the distance between the measured points (VTRs) and the nearest reference station is shorter than approximately 20 km.
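The Nearest criterion can be sketched as a simple distance check. The snippet below is a minimal illustration that assumes a spherical Earth and uses hypothetical station names and coordinates; the 20 km threshold is the approximate value quoted above, and the code does not reproduce how any network software selects the station.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = math.sin(d_phi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))


def nearest_station(point, stations, max_km=20.0):
    """Return (name, distance_km); name is None if no station lies within max_km."""
    name, dist = min(
        ((n, haversine_km(point[0], point[1], pos[0], pos[1])) for n, pos in stations.items()),
        key=lambda item: item[1],
    )
    return (name if dist <= max_km else None), dist


# Hypothetical station coordinates (degrees), for illustration only.
stations = {"STA_A": (38.10, 13.35), "STA_B": (37.50, 15.08)}

near_name, near_dist = nearest_station((38.00, 13.40), stations)
far_name, far_dist = nearest_station((37.00, 14.00), stations)
print(near_name, round(near_dist, 1), far_name, round(far_dist, 1))
```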
2.3. Data Analysis
In this work, 455 NRTK measurements involving 37 reference points (VTRs) were carried out, employing different CORS networks and streams. The number of measurements, obtained by following specific guidelines [
50], is shown separately in
Figure 2, according to the CORS network employed (a), the differential correction stream (b), and regardless of the network and stream involved (c). Considering the number of measurements for the different CORS networks (panel
a), 250 points were measured using the Sicili@net network, while the measurements decreased to ~100 and 90 using the UNIPA CORS network and VRS Sicilia, respectively. As for the differential streams, a solution could not be computed with every stream at every measured point. The largest number of measurements was obtained with punctual corrections (panel
b): ≈180 measurements using the VRS stream, followed by ≈110 with the Nearest stream. The areal stream corrections yielded ≈50 measurements each for MAX, iMAX, and FKP. Finally, panel
c presents the number of measurements recorded for each VTR point, reported on the vertical axis, obtained using all available CORS networks and data streams. The first quartile, represented in
Figure 2c with a bold black circle, corresponds to four measurements, while the third quartile, represented by the dashed circle, corresponds to twenty measurements retrieved from the VTRs and the IGM point. The names of the two IGM points are shown in the gray sector; the other numbers represent the IDs of the VTR points.
This work evaluates the deviations of the measured points. As a preliminary step, a grid was selected in a GIS environment to represent the deviations, and the number of data points falling within each interval was analyzed; the results were then separated according to the type of stream into punctual solutions (Nearest and VRS) and areal solutions (FKP, MAX, and iMAX).
As is known, the VRS should be considered an areal correction technique. Indeed, it requires multiple reference stations connected to the control center to provide the positioning correction [
46]. Since these corrections generate a virtual reference station near the receiver, and the rover processes these data as if they originated from a single reference station, we treat the VRS in this study as comparable to a punctual correction, according to [
49]. This provides a basis for comparing it with the Nearest stream, which delivers corrections from the reference station closest to the receiver.
To analyze the empirical distribution of the GNSS errors, several of the most representative statistical tests were selected and applied first to the three components (∆E, ∆N, ∆Q) separately and then to the planimetric and plano-altimetric components (∆EN, ∆ENQ): Shapiro–Wilk, Cramér–von Mises, Lilliefors (Kolmogorov–Smirnov), Shapiro–Francia, and Anderson–Darling [
51].
Razali and Wah conducted a comparative evaluation of the power of various normality tests by analyzing their test statistics relative to critical values. Their findings indicate that the statistical power of these tests is significantly influenced by factors such as the significance level, sample size, and the nature of the alternative distributions. Notably, they observed a non-continuous variation in power, with a critical sample size threshold beyond which test performance changes markedly. Normality tests tend to exhibit reduced power with small sample sizes—typically those of 30 observations or fewer—raising concerns about their reliability in such contexts [
52].
Similarly, Mendes and Pala reported that for small sample sizes, the most powerful test may vary depending on the specific sample size, further complicating the selection of an optimal test [
53]. In light of these considerations, this study employs a suite of five normality tests to mitigate the limitations of any single method. While individual tests may yield false positives or fail to detect deviations from normality, the combined application of multiple tests is expected to provide a more accurate and comprehensive evaluation of the distributional properties of the GNSS error datasets. Based on the preliminary results, further statistical tests were performed. Since the normal distribution was excluded for some streams, specifically the punctual ones, the data were also fitted with other unimodal distributions, namely Lognormal [
54], Weibull [
55], and Logistic [
56]. The aim was to determine whether a unimodal distribution would still fit the patterns obtained. The goodness of fit was assessed using the Anderson–Darling (AD) statistic [
57] for the planimetric and plano-altimetric components ∆EN and ∆ENQ, obtained by composing the planimetric (∆E, ∆N) and plano-altimetric (∆E, ∆N, ∆Q) components, respectively. The lower the AD value, the better the tested unimodal distribution fits the data.
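The behavior of the AD statistic can be illustrated with a minimal standard-library sketch. It computes the statistic against any fitted cumulative distribution function; here a normal CDF with sample mean and standard deviation is used purely as an example, and the synthetic samples, seed, and function names are assumptions rather than the data or software used in this study.

```python
import math
import random
from statistics import mean, stdev


def anderson_darling(sample, cdf):
    """Anderson-Darling statistic A^2 of a sample against a fitted CDF."""
    x = sorted(sample)
    n = len(x)
    f = [cdf(v) for v in x]
    total = sum(
        (2 * i + 1) * (math.log(f[i]) + math.log(1.0 - f[n - 1 - i]))
        for i in range(n)
    )
    return -n - total / n


def fitted_normal_cdf(sample):
    """Return a normal CDF with mean and standard deviation estimated from the sample."""
    m, s = mean(sample), stdev(sample)
    return lambda v: 0.5 * (1.0 + math.erf((v - m) / (s * math.sqrt(2.0))))


rng = random.Random(42)  # fixed seed so the synthetic example is repeatable
unimodal = [rng.gauss(0.0, 1.0) for _ in range(200)]
bimodal = [rng.gauss(-3.0, 0.5) for _ in range(100)] + [rng.gauss(3.0, 0.5) for _ in range(100)]

a2_unimodal = anderson_darling(unimodal, fitted_normal_cdf(unimodal))
a2_bimodal = anderson_darling(bimodal, fitted_normal_cdf(bimodal))
print(f"A2 unimodal = {a2_unimodal:.2f}, A2 bimodal = {a2_bimodal:.2f}")
```

Swapping in a Lognormal, Weibull, or Logistic CDF only requires passing a different callable; a two-peaked sample yields a markedly larger statistic than a single-peaked one under any unimodal fit.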
In the literature, no similar results showing multimodality of GPS processes were found. Some references to multimodality behavior have been retrieved for day–night radiosonde measurement bias data by comparing with zenith tropospheric delay (ZTD) data, as in the papers of Haase et al. [
58] and Guerova et al. [
59]. Recently, Raghuvanshi and Bisnath verified instances of bimodality for smartphone data, including from Geo++ applications [
60], while Xue et al. found instances of bimodality in the spatial signal ranging errors of the BeiDou GNSS constellation [
61]. To confirm the unusual pattern of the deviations, a Density-Based Spatial Clustering of Applications with Noise (DBSCAN) classification was performed on the point positions in the ∆E–∆N plane, separately for each stream [
62]. DBSCAN is a non-parametric, density-based clustering method that identifies clusters as contiguous regions of high point density.
Comprehensive descriptions of the algorithm’s theoretical foundations, strengths, limitations, and notable variants are available in the literature [
63,
64,
65]. We selected DBSCAN for this analysis due to its key advantage: it does not require prior specification of the number of clusters. This feature makes it particularly suitable for exploratory spatial data analysis, especially when the underlying distribution is unknown.
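A compact, standard-library sketch of the DBSCAN idea is given below. It is a generic textbook-style implementation, not the QGIS tool; the `eps` and `min_pts` values and the synthetic ∆E–∆N points are assumptions chosen for the example and are not the QGIS defaults.

```python
import math

NOISE = -1


def region_query(points, idx, eps):
    """Indices of all points within eps of points[idx] (including itself)."""
    px, py = points[idx]
    return [j for j, (qx, qy) in enumerate(points) if math.hypot(px - qx, py - qy) <= eps]


def dbscan(points, eps, min_pts):
    """Return one label per point: 0, 1, ... for clusters, -1 for noise."""
    labels = [None] * len(points)  # None means not yet visited
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbours = region_query(points, i, eps)
        if len(neighbours) < min_pts:
            labels[i] = NOISE
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbours if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbours = region_query(points, j, eps)
            if len(j_neighbours) >= min_pts:  # j is a core point: keep expanding
                queue.extend(j_neighbours)
    return labels


# Synthetic deviations (metres): two dense groups and one isolated point.
group = [(0.0, 0.0), (0.01, 0.0), (0.0, 0.01), (0.01, 0.01), (0.005, 0.005)]
points = group + [(x + 0.10, y + 0.10) for x, y in group] + [(1.0, 1.0)]

labels = dbscan(points, eps=0.02, min_pts=3)
n_clusters = len({label for label in labels if label != NOISE})
print(labels, n_clusters)
```

Note that the number of clusters is an output of the procedure, not an input, which is the property that motivated the choice of DBSCAN here.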
In our case study, DBSCAN—executed using the default parameters available in QGIS—identified two clusters in the Nearest, VRS, FKP, and MAX data streams, thereby supporting the hypothesis of bimodality. In contrast, the iMAX stream was predominantly assigned to a single cluster, except for a second cluster consisting of a single outlier point. Additionally, various statistical indicators were extracted and analyzed to quantify the reliability of the results.
The indices involved were the within-cluster sum of squared deviations (SSW), the between-cluster sum of squared deviations (SSB), and the total sum of squared deviations (SST). To account for the heterogeneous sample sizes across streams (reflected in the degrees of freedom within groups, dfw, which were 181, 110, 49, 55, and 55 for the VRS, Nearest, MAX, iMAX, and FKP configurations, respectively) and the number of classes (two per stream, yielding a between-groups degree of freedom, dfb, equal to 1), additional statistical indices were computed. These include the Mean Square Within (MSW) and Mean Square Between (MSB), which were derived accordingly (
Table 1) and subsequently employed in the calculation of Fisher’s F-statistic.
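The relationship between these indices can be sketched with a short standard-library example for a one-way layout. The two toy groups stand in for the two clusters of one stream; their values are invented for illustration and are not taken from Table 1.

```python
from statistics import mean


def one_way_anova(groups):
    """Sums of squares, degrees of freedom, mean squares, and F for a one-way layout."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    group_means = [mean(g) for g in groups]

    ssw = sum((v - m) ** 2 for g, m in zip(groups, group_means) for v in g)
    ssb = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    sst = sum((v - grand_mean) ** 2 for v in all_values)

    dfb = len(groups) - 1                # between-groups degrees of freedom
    dfw = len(all_values) - len(groups)  # within-groups degrees of freedom
    msb = ssb / dfb
    msw = ssw / dfw
    return {"SSW": ssw, "SSB": ssb, "SST": sst, "dfb": dfb, "dfw": dfw,
            "MSB": msb, "MSW": msw, "F": msb / msw}


# Toy data with unequal group sizes (illustrative values only).
result = one_way_anova([[1.0, 2.0, 3.0], [7.0, 8.0, 9.0, 10.0]])
print({k: round(v, 4) for k, v in result.items()})
```

With two classes per stream, dfb is 1 and dfw equals the number of points minus 2; the same formulas apply whether or not the groups have equal sizes.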
Two additional statistical analyses were performed to finalize the study: ANOVA and Ashman’s D coefficient. The ANOVA test was evaluated by comparing the calculated F-value with the critical F-value, which depends on the number of classes and the sample size used, and by determining the associated probability. Since the analyses involve different sample sizes for each stream, an unbalanced F-value was also considered, as required in these cases. The probabilities obtained for each stream were compared with the minimum likelihood obtained from the stream’s solutions. Ashman’s D coefficient was calculated separately for the variables ∆E and ∆N using Equation (1):
$$D = \sqrt{2}\,\frac{\left|\mu_1 - \mu_2\right|}{\sqrt{\sigma_1^2 + \sigma_2^2}} \qquad (1)$$
where μ1 and μ2 are the means and σ1 and σ2 the standard deviations of the two clusters.
The test based on Ashman’s D coefficient checks whether the distribution is bimodal and whether the two peaks are distinct from each other; for this to hold, D must be greater than 2.
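Ashman’s D is commonly defined as √2·|μ1 − μ2| / √(σ1² + σ2²), where μ and σ are the means and standard deviations of the two groups. The sketch below evaluates this definition with the standard library; the use of the sample standard deviation and the toy ∆E values are assumptions made for the example.

```python
import math
from statistics import mean, stdev


def ashman_d(cluster_a, cluster_b):
    """Ashman's D: separation of two groups relative to their combined spread."""
    separation = abs(mean(cluster_a) - mean(cluster_b))
    spread = math.sqrt(stdev(cluster_a) ** 2 + stdev(cluster_b) ** 2)
    return math.sqrt(2.0) * separation / spread


# Illustrative deviations in metres for two clusters of one component.
cluster_1 = [-0.02, -0.01, 0.00, 0.01, 0.02]
cluster_2 = [0.08, 0.09, 0.10, 0.11, 0.12]

d_value = ashman_d(cluster_1, cluster_2)
print(f"D = {d_value:.3f}, clearly separated: {d_value > 2}")
```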
As a final investigation, a multiple linear regression analysis was performed to quantify the influence of several parameters on the distribution response.
The parameters included in the correlation analysis were the distance between the VTRs and the nearest CORS reference station (d); the distance between the VTRs and the coastline and the height of the measured VTRs (d_s and h, respectively), to account for a residual tropospheric effect; and finally the horizontal and vertical root mean squares (σ_h and σ_v, respectively), to account for the satellite geometric configuration at the time of measurement acquisition.
The contribution of the distance between the VTRs and the nearest CORS reference station providing the correction was evaluated only for the Nearest stream.
The same approach was not applied to the VRS stream,
because the location of the virtual station is unknown, although it is generated within a few km of the measured VTRs, according to [
46]. The value of the Pearson coefficient r [
66] indicates the strength of a linear relationship between the empirical and the corresponding values. According to the r values, the correlation varies from very weak (r < 0.2) to very strong (r > 0.8). While equal sample sizes are ideal for maximizing statistical power and robustness, they are not a strict requirement for the statistical tests employed in this study.
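The Pearson coefficient referred to above can be computed as in the following minimal sketch. Only the two thresholds quoted in the text (0.2 and 0.8) are encoded; applying them to the absolute value of r, the label for intermediate values, and the toy data are assumptions made for the example.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def strength_label(r):
    """Label only the two extremes quoted in the text; the rest is left generic."""
    if abs(r) < 0.2:
        return "very weak"
    if abs(r) > 0.8:
        return "very strong"
    return "intermediate"


# Toy data: a hypothetical predictor against a hypothetical deviation.
distance_km = [1.0, 2.0, 3.0, 4.0, 5.0]
deviation_cm = [2.0, 4.0, 6.0, 8.0, 10.0]

r_value = pearson_r(distance_km, deviation_cm)
print(round(r_value, 3), strength_label(r_value))
```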
Specifically, ANOVA can accommodate unequal sample sizes, although this may affect the test’s sensitivity to violations of homogeneity of variances. In our analysis, these assumptions were verified [
67]. Balanced ANOVA assumes equal sample sizes across groups and provides a straightforward interpretation of main effects and interactions; however, when sample sizes differ, as in our case, where the number of observations ranged from approximately 50 to 180 per stream, unbalanced ANOVA can strengthen the results by adjusting for unequal group sizes [
68,
69].
Ashman’s D coefficient, used to assess bimodality, is based on the separation between two distributions and remains valid as long as group statistics (means and standard deviations) are reliably estimated [
70].
The normality tests applied—Shapiro–Wilk, Cramér–von Mises, Lilliefors (Kolmogorov–Smirnov), Shapiro–Francia, and Anderson–Darling—are all applicable to samples of varying sizes. While their power may vary with sample size, they remain statistically valid across the range used in this study [
51,
52].
Similarly, goodness-of-fit tests for unimodal distributions (Lognormal, Weibull, Logistic) are robust to unequal sample sizes, provided that expected frequencies are adequate [
71].
In our dataset, the number of measurements per stream ranged from approximately 50 to 180. These sample sizes reflect the realistic operational conditions; therefore, we retained the natural sample distribution to preserve the integrity of the observational data and limit artificial balancing that could introduce bias.