Impairing Land Registry: Social, Demographic, and Economic Determinants of Forest Classification Errors

Abstract: This paper investigates the social, demographic, and economic factors that determine differences between forest identification based on remote sensing techniques and the land registry. The Database of Topographic Objects and Sentinel-2 satellite imagery from 2018 were used to train a supervised machine learning model for forest detection. Results aggregated to communes (NUTS-5 units) were compared to land registry data delivered in the Local Data Bank by Statistics Poland. The differences identified between these sources were defined as errors of the land registry. Then, geographically weighted regression was applied to explain the spatially varying impact of the determinants of the investigated errors: urbanization processes, civic society development, education, land ownership, and culture and quality of spatial planning. The research area covers the entirety of Poland. It was confirmed that in less developed areas, local development policy stimulating urbanization processes does not respect land use planning principles, including the accuracy of the land registry. A high education level of the society leads to measures that protect against a further increase in the overestimation of forest cover in the land registry in substantially urbanized areas. Finally, higher coverage by valid local spatial development plans stimulates protection against forest classification errors in the land registry.

The paper is organized as follows. The section "Literature Review and Hypotheses" briefly reviews previous works on the causes of errors in land registration. The subsequent section, "Methods and Data Collection," contains a description of the research design, applied methods, data collection process, and research area. The "Results and Discussion" section presents and interprets the findings of our research. The paper is closed by "Conclusions," summarizing the main findings regarding relations between the scale of the errors in the land registry and the impact of the considered factors: urbanization processes, civic society development, education, land ownership, and culture and quality of spatial planning.

Literature Review and Hypotheses
In the literature on assessing the quality of data contained in land registers and the factors behind the errors that appear in them, much attention has been paid to formal, legal, organizational, and administrative factors, as well as to technical issues. However, socioeconomic factors have rarely been discussed in this context, especially regarding demography and urbanization, the condition of civic society, education, land ownership, and the quality of spatial planning.

Urbanization Processes Influencing Classification Errors in Land Registers
Demographics and unplanned, unsystematic, and rapid urbanization also have important implications for the transformation of the physical landscape, land cover change, and forest loss and fragmentation [33][34][35]. Changing patterns of land use and the impact of socioeconomic factors have brought about major changes in ecosystems and forest land use [36]. An understanding of the dynamics of urbanization-induced land cover change is therefore necessary to account for differences between the land register and the actual land functions.
Forests near large cities are subject to strong anthropogenic pressure related to suburbanization. Migration and population growth in urban agglomerations stimulate the demand for land, including areas of high natural quality [37]. This contributes to the progressing fragmentation of forests [38] and the successive reduction of forest areas and their transformation into residential areas [39]. The local and regional sequences of natural connections are being cut. Therefore, it seems logical that the degree of forest fragmentation affects the number of errors [40] occurring both on maps [41] and in land registers. Research conducted for the Warsaw agglomeration confirms that the spontaneous development of buildings often occurs in suburban forests [42], which are bought for single-family housing. This is confirmed by the high dynamics of applications for changes in the use of these lands for construction purposes. In addition, buildings on forest plots often exceed acceptable standards or norms [43]. Owners of newly built residential buildings often fail to comply with the provisions of local spatial development plans regarding the minimum share of forested area on a plot. Situations then arise in which the area occupied by development significantly exceeds the permissible size, which means that the forested area of such a plot remains fictional [44]. On the other hand, abandoned farmland, particularly in central and northern Europe, is undergoing a process of colonization by trees [45]. All this translates into discrepancies between the actual state and the records in land registries.

Impact of the Condition of Civic Society on Data Quality in Land Registers
Knowledge is scarce about how the public sphere, and civic society in particular, engages with building data infrastructure, including land registry data. Gray & Lammerhirt [46] partly tried to fill the research gap in this respect, but mainly in the context of urban areas. Meanwhile, in the literature, the social factor (people) has often been indicated as one of the most significant components of Spatial Data Infrastructures (SDI), next to data, legal, access and technology, standards, policy, and institutional arrangement [47,48].
Civic society in Poland is still in the phase of construction and maturation [49][50][51]. Among Polish citizens, there is a common attitude of unwillingness to cooperate with the authorities, which partly stems from experience of the previous political and economic system; as a result, residents do not trust public authorities and show a low degree of involvement in sociopolitical issues [52,53]. This translates into a high degree of ignorance of formal and legal issues because of, e.g., the difficult and unclear language of official documents and legal acts, the complexity of procedures, inconsistent judicial and administrative rules, a low level of spatial planning awareness, and a lack of interest in controlling local authorities and institutions [54]. Insufficient legal awareness in society regarding the obligations and rights of owners and the state's obligations to protect property rights to land means that residents fail to fulfill their obligations (e.g., not updating land registration records for financial and tax reasons) and that public institutions fail to provide proper supervision [55,56]. Meanwhile, "good practices" in the quality of land registry data depend very strongly on the local social and cultural context of the particular land administration system [57].
Civic society, created thanks to voluntary organizations, associations, and contacts, is one of the foundations of the democratic system [51,58], shaping proper social and political relations [59,60] and achieving public goals [61]. The active participation of residents in spatial management (e.g., spatial planning, land records updating, etc.) is a characteristic of a developed democracy and an already formed civic society [62][63][64]. In Poland, the development of civic activity and the increase of social expectations are progressing slowly but steadily. A participatory approach in consulting planning arrangements is becoming more and more common [65][66][67][68][69]. Citizens' initiatives and associations are in the interests of various social groups participating in democracy procedures and common actions. Civic participation can be measured, e.g., by the level of election attendance [70][71][72][73] or participation in various types of nongovernmental organizations (NGOs), associations, or clubs [74][75][76][77]. It can be expected that the more developed the civic society institutions are, the better the control of the land registry should be, and as a consequence, the differences between the land registry and the real land functions should be smaller.

Education Level and Its Influence on Errors in Land Registers
The level of education is also important, because the higher the share of people with higher education in society is, the higher the expected level of activity and interest in sociopolitical matters. People with higher education, compared to people without, work and sit on the boards of NGOs more often and actively participate in volunteering [78]. It can be assumed that the better educated the society, the better the control over matters related to records of the land registry is. In addition, significant differences in social activity are observed in the urban-rural relationship, which almost entirely comes from twice the percentage of people with secondary and higher education in the urban population [79][80][81]. Hence, in more urbanized areas, residents show greater interest in local matters, and there, greater compliance of the formal status (land registry) with the actual status can be expected.

Land Ownership and Land Registry Classification Errors
The percentage of ownership of the forest by the State Forests underwent major changes over the years, especially in the period immediately after World War II, when there was a significant increase in forest areas managed by the State Forests. In the 1990s, the boundaries of real estate began to be disclosed in land and mortgage registers, when they were established for the State Forests' land [82]. Currently, they are established for the majority of the State Forests' land, and care for the state of ownership has become a priority for the State Forests [82][83][84].

Relation Between Culture of Spatial Planning and Data Quality in Land Registers
Land management and economic development activities stimulate demand for comprehensive information about socioeconomic and governance conditions in combination with other land-related data [85]. They all contribute to the establishment of multifunctional information systems, incorporating, among others, the structure and forms of land use, land tenure and land value, land development, and other useful data. The institutional evolution in the field of cadaster differs from one European country to another due to the influences of culture, history, economic environment, and social level [86].
Social expectations, associated with the institution of the cadaster, do not mean that this register actually fulfills the functions which are attributed to it. The quality of the data collected in the cadaster, resulting both from the adopted formal and legal, and technological solutions, negatively verifies this assumption and, largely in its current shape, does not meet the expectations of society [87]. We often face the situation of inability to unambiguously identify the extent of the rights to cadastral parcels and the actual form of land use, which remains incomprehensible for the public and makes the institution of real estate cadaster lose its significance.
In Poland, there is currently no significant political, social, or economic pressure to tidy up entries in the land registry system. The public good nature of a national land administration infrastructure is not fully understood by citizens. Meanwhile, a complete and up-to-date infrastructure for land administration would radically improve social inclusion by providing better awareness and service delivery for citizens [88]. The data contained in Polish land records do not fully reflect the actual state, because the records are in most cases maintained in a passive way. This causes the registration data to become irreversibly outdated, leading to a lack of credibility [56]. Recently, attempts have been made to develop indicators for assessing the quality of cadastral data at the local [89,90] and regional [91] levels.
The Polish land and building register is identified with a real estate cadaster, as specified by the Geodetic and Cartographic Law [92]. The method of establishing and keeping the land and building register is specified in the Regulation of the Minister of Regional Development and Construction [93]. The land and building register for Poland is an information system that provides for the collection, updating, and sharing, in a uniform manner, of information on land, buildings, and premises, their owners, and other entities that own or manage this land, these buildings, or these premises [94].

Hypotheses
Summing up the conclusions of previous studies, the following hypotheses about the impact of various factors on differences between actual and evidenced forest cover can be stated:

Research Design
The focus of this research was on identifying spatially varying social, demographic, and economic determinants of differences between forest identification based on remote sensing techniques and the land registry (see Figure 1). The research area covers the entirety of Poland divided into 2,478 communes (NUTS-5 units). First, forest cover was detected by supervised machine learning classification of Sentinel-2 data from the year 2018. Then, the results were compared to the forest cover reported in the land registry, delivered in the Local Data Bank by Statistics Poland. As a consequence, forest classification errors in the land registry were identified and analyzed. Spatial analysis of forest classification errors included the local Moran's I statistic. Application of this statistic made it possible to identify spatial clusters where forest cover in the land registry is underestimated or overestimated. Finally, geographically weighted regression (GWR) was applied to estimate spatial patterns of the impact of various social, demographic, and economic factors on the occurrence of forest classification errors in the land registry. The following factors, described by aggregate indexes, were considered: urbanization processes, civic society development, education, land ownership, and culture and quality of spatial planning.

Data Sources
Three different data sources were used to identify forest detection errors of land registry. On the one hand, data from land registry delivered in Local Data Bank by Statistics Poland and Head Office of Geodesy and Cartography were applied. On the other hand, the Database of Topographic Objects was used to prepare a forest detection machine learning classification model of Sentinel-2 data in the year 2018.
The data concerning forests, defined as forest areas and land related to forest management of all ownership forms in 2018 for communes (NUTS-5 units), were obtained from the Local Data Bank provided by Statistics Poland. The Sentinel-2 images were downloaded from the Copernicus Open Access Hub. The Sentinel-2 images used in the study were acquired from 20 April to 5 December 2018. The extended timespan was applied to acquire nonclouded satellite images, although materials with a maximum cloud coverage of 1% of images were also accepted. Sentinel-2 data bands 4 (red), 3 (green), 2 (blue), and 8 (near infrared), all with a pixel resolution of 10 m, were combined into four-band rasters. The main source of forest spatial data was the Database of Topographic Objects valid for 2016, obtained from the Head Office of Geodesy and Cartography of Poland. These data covered the entire area of Poland, and their accuracy corresponded to a map on a scale of 1:10,000 with a minimum area of land coverage patch of 0.1 ha. The information scope of the Database of Topographic Objects is based on three levels of detail. The database contains 9 object classes divided into 57 object categories, which contain 244 types of topographic objects. Two types of objects were used in the study: Forests and Coppices, belonging to the category of Forest and Wooded Areas, which is included in the Land Cover class of objects.
A separate data collection process was conducted for the variables describing social, demographic, and economic determinants of land registry errors: urbanization processes, civic society development, education, land ownership, and culture and quality of spatial planning. The following three data sources were considered: the Local Data Bank provided by Statistics Poland, the Database of Topographic Objects, and the State Electoral Commission in Poland (see Table 1).

Methods of Analysis
There is strong evidence that automatic land cover classification is viable in terms of accuracy, cost, and time-efficiency. Several studies have confirmed that remote sensing can greatly benefit from utilizing modern methods offered by machine learning (ML), especially deep learning (DL), to tackle a wide range of problems related to processing satellite imagery. Among numerous examples, we can find successful attempts of crop types classification [95], utilizing DL in agriculture [96], urban land cover classification [97], and identifying land abandonment [98]. Thus, the use of machine learning in the context of forest areas detection in Poland is justified.
Semantic segmentation is a machine learning task of detecting a specific region of an image and assigning it a label to make this region distinguishable from different discovered regions and thus facilitating the process of image content interpretation. Segmentation, in terms of remote sensing and satellite imagery handling, is a process of classifying pixels, originating from satellite images, into categories representing, e.g., different land cover types. Due to high complexity of land cover features, size, and variety of information offered by satellite imagery, the task of training a model capable of determining belonging to a specific category should be considered nontrivial. Traditionally, such classification could be approached using classifiers such as Support Vector Machine (SVM) or Maximum Likelihood Classification (MLC). However, the above-mentioned classifiers lack accuracy and, what is equally important, the time needed to train a model based on them is far from desirable. On the contrary, classifiers based on convolutional neural networks (CNN) exhibit better performance than traditionally used SVM or MLC [99].
There are multiple semantic segmentation model architectures available. In the preliminary study, covering 20% of the original data, the researchers tested four architectures: U-Net [100], FPN [101], Linknet [102], and PSPNet [103]. Among those, U-Net and PSPNet were selected as the architectures capable of reaching the highest intersection over union (IoU) scores (above 70%). After hyperparameter tuning, comparison of the scores, and manual checking of visualized inference results, the PSPNet model was abandoned. This was mainly due to issues with properly inferring the forest area boundaries. Therefore, the authors built a neural network model based on the U-Net architecture (see Figure 2). U-Net is especially well suited for semantic segmentation. It consists of two paths: contracting and expansive. The contracting path follows the typical architecture of a convolutional network, that is, a repeated application of two convolutions. Every step in the expansive path consists of an upsampling of the feature map [100]. The basic building block of the U-Net architecture, a two-dimensional convolutional neural network layer, has been frequently used in remote sensing activities. The main purpose of a CNN is to extract common spatial features from processed images [104]. Those features can later be used in the inference phase to classify new images, i.e., to describe them by a relevant label or labels. The basic U-Net architecture has been adjusted to enable the processing of multiband satellite images. The adjustments consisted of increasing the input size to handle images in the RGB and near infrared channels, changing the kernel size from 3 × 3 to 5 × 5, increasing the depth of the contraction phase, and tuning relevant hyperparameters. The loss function was the sum of binary cross entropy loss and Jaccard index loss. The accuracy of the model was monitored using the IoU metric. The model has been implemented in Python using the Tensorflow and Keras frameworks.
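As an illustration of the loss described above, the following is a minimal NumPy sketch of a sum of binary cross entropy and Jaccard index loss. It is not the authors' Tensorflow/Keras implementation: the function name `bce_jaccard_loss`, the smoothing constant `eps`, and the soft (probability-based) form of the Jaccard term are assumptions made for this example.

```python
import numpy as np


def bce_jaccard_loss(y_true, y_pred, eps=1e-7):
    """Sum of binary cross entropy and soft Jaccard loss (hypothetical helper).

    y_true: array of 0/1 mask labels (1 = forest, 0 = otherwise).
    y_pred: array of predicted forest probabilities in [0, 1].
    """
    y_true = np.asarray(y_true, dtype=float)
    # Clip probabilities so that log() never receives 0.
    y_pred = np.clip(np.asarray(y_pred, dtype=float), eps, 1.0 - eps)

    # Pixel-wise binary cross entropy, averaged over all pixels.
    bce = -np.mean(
        y_true * np.log(y_pred) + (1.0 - y_true) * np.log(1.0 - y_pred)
    )

    # Soft Jaccard index: intersection over union computed on probabilities.
    intersection = np.sum(y_true * y_pred)
    union = np.sum(y_true) + np.sum(y_pred) - intersection
    jaccard_loss = 1.0 - (intersection + eps) / (union + eps)

    return bce + jaccard_loss
```

The cross entropy term penalizes every misclassified pixel, while the Jaccard term directly rewards overlap between the predicted and reference forest regions, which is the quantity the IoU metric monitors.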
Overall, the model reached a 0.765 IoU score, a 0.867 F1-score, and 0.922 binary accuracy (pixel-wise) on the test set (5% of the dataset). All of the mentioned metrics range from 0 to 1: 0 means that the model is a complete failure, while 1 indicates a perfect model. The U-Net training phase requires a dataset composed of images and their segmentation maps (see Figure 3). A segmentation map holds information about the regions of interest in the described image. There are multiple techniques for marking such regions, e.g., bounding boxes or polygons. The authors opted for masks. A mask is a grayscale image composed of pixels that correspond to a single input image. Each mask pixel value represents a different label. In this research, "1" was used to indicate that the pixel represents a forest and "0" otherwise. The results of supervised machine learning classification were aggregated to communes (NUTS-5 units), and then compared to data from the land registry delivered in the Local Data Bank by Statistics Poland. Forest cover (forest cover indicator), defined as the percentage ratio of the forest area to the total geodesic area of the commune, was calculated, and the forest areas in communes were analyzed for both types of data. The difference between the forest cover indicator calculated from remote sensing and the one reported in the land registry was increased by 100 percentage points and defined as the forest classification error of the land registry (FORESTCE). The dependent variable FORESTCE equals 100% when the results of remote sensing and the land registry data are equal, exceeds 100% when the forest cover identified by remote sensing is larger than the value recorded in the land registry, and is lower than 100% when higher forest cover was reported in the land registry than was identified through remote sensing.
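The IoU metric and the construction of FORESTCE described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function names `iou_score`, `forest_cover`, and `forestce` are hypothetical, and FORESTCE is read from the text as the remote sensing indicator minus the land registry indicator, plus 100.

```python
import numpy as np


def iou_score(mask_true, mask_pred):
    """Intersection over union of two binary masks (1 = forest)."""
    mask_true = np.asarray(mask_true, dtype=bool)
    mask_pred = np.asarray(mask_pred, dtype=bool)
    union = np.logical_or(mask_true, mask_pred).sum()
    if union == 0:
        # Neither mask contains forest: treat as perfect agreement.
        return 1.0
    return np.logical_and(mask_true, mask_pred).sum() / union


def forest_cover(forest_area, total_area):
    """Forest cover indicator: forest area as a percentage of commune area."""
    return 100.0 * forest_area / total_area


def forestce(cover_remote, cover_registry):
    """Forest classification error of the land registry, in percent.

    Equals 100 when both sources agree, exceeds 100 when remote sensing
    finds more forest than the registry, and falls below 100 otherwise.
    """
    return cover_remote - cover_registry + 100.0
```

For example, a commune with 35% forest cover detected from imagery and 30% in the land registry would receive `forestce(35.0, 30.0) == 105.0`, i.e., an underestimation of forest cover in the land registry.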
In the land registry, forest areas were defined as a homogeneous surface of at least 0.10 ha covered with forest vegetation like trees, bushes, and undergrowth (forested area), or temporarily devoid of it (nonforested area), but intended for forest production or being a nature reserve, a part of national park, or registered as nature monuments. The forest areas also included land connected with forest management, occupied by buildings and structures, water drainage facilities, forest spatial division lines, forest roads, areas under power lines, forest nurseries, wood storage facilities, and also forest car parks and tourist facilities [105,106]. The mask in supervised machine learning classification was representing forest defined as areas of at least 0.10 ha with a dense tree cover. These included forests, as well as other wooded areas, e.g., land adjacent to surface waters or recreational areas [107].
The method of land cover error detection has some limitations. The first refers to the remote sensing technique applied: cloud cover on satellite images disturbs or prevents the recognition of objects and data generation. This made it necessary to use nonclouded satellite image coverage for the whole study area. The second limitation is the inconsistency in the definitions of forest area, which may affect the occurrence of land registry errors. A small systematic overestimation of forest cover in the land registry might be identified. However, the incompatibility concerns areas associated with forest production and management that are mostly spatially related to forest areas, and it did not affect the results of the analysis at the adopted scale of the study.
The spatial volatility of land registry errors was analyzed in detail. Spatial clusters of the investigated phenomenon were identified by calculating the local Moran's I statistic, a popular indicator of local spatial autocorrelation. This technique of exploratory spatial data analysis is based on analyzing how the investigated phenomenon differs over space in every considered unit as well as in its surroundings [108]. The local Moran's I statistic for every considered spatial unit j is calculated following [16,109], where FORESTCEj represents the value of the forest classification error in the land registry identified in commune j, FORESTCEi is the value of the same variable in each of the k surrounding units i, and wji is the element of the neighborhood weight matrix for communes j and i. Neighborhood is a dummy variable which takes "1" when communes j and i are identified as neighbors, and "0" otherwise [16,109]. When the local Moran's I statistic ranges between 0 and +1, high values (or low ones) are spatially clustered around a commune characterized by a similar value. A hot spot is identified when a high value is surrounded by high ones, and a cold spot when a low value is surrounded by low ones. On the other hand, when the statistic ranges between −1 and 0, the neighboring values are dissimilar to the value at the particular spatial unit. A spatial outlier is identified when a high value is surrounded by low ones, or a low value is surrounded by high ones [108].

Aggregate indexes related to social, demographic, and economic determinants of land cover errors (urbanization processes, civic society development, education, land ownership, and culture and quality of spatial planning) were considered as independent variables. Each aggregate index equals the average of the normalized values of the measures describing each investigated determinant of land cover errors.
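For orientation, the local Moran's I statistic discussed in this section can be sketched in NumPy as follows. The sketch assumes the commonly used Anselin form with a row-standardized binary neighborhood matrix and normalization by the mean squared deviation; this is an assumption and may differ in detail from the form used in [16,109]. The function name `local_morans_i` is hypothetical.

```python
import numpy as np


def local_morans_i(values, neighbors):
    """Local Moran's I for every spatial unit (assumed Anselin form).

    values:    1-D array, e.g., FORESTCE for each commune.
    neighbors: square 0/1 matrix, neighbors[j, i] = 1 when communes
               j and i are neighbors, 0 otherwise (zero diagonal).
    """
    x = np.asarray(values, dtype=float)
    w = np.asarray(neighbors, dtype=float)

    # Row-standardize weights so each unit's neighbor weights sum to 1
    # (units without neighbors keep an all-zero row).
    row_sums = w.sum(axis=1, keepdims=True)
    w = np.divide(w, row_sums, out=np.zeros_like(w), where=row_sums > 0)

    z = x - x.mean()          # deviations from the global mean
    m2 = np.mean(z ** 2)      # mean squared deviation
    spatial_lag = w @ z       # weighted average deviation of neighbors

    # Positive: unit resembles its neighbors (hot or cold spot);
    # negative: unit differs from its neighbors (spatial outlier).
    return z / m2 * spatial_lag
```

With this form, a high FORESTCE value surrounded by high values and a low value surrounded by low values both yield positive statistics, while a high value surrounded by low ones yields a negative statistic.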
It must be underlined that the calculation of normalized values z from real values x depended on the contribution of the particular measures to the determinants, and was different for stimulants (Equation (2), in which the normalized value is scaled by 100%) than for destimulants. However, in this research, all considered measures had stimulus contributions to the aggregate indexes, further investigated as independent variables. The aggregate index URBAN, describing urbanization processes, covers the stimulus contributions of both population density and the number of buildings per 1 km². The aggregate index CIVIC, referring to civic society development, consists of the following stimulus measures: the number of NGOs per 10,000 population, the number of members of clubs and artistic groups per 10,000 population, and turnout in local government elections in 2018. Then, the education (EDU) aggregate index includes stimulus measures of municipal expenditures on healthcare per capita and the percentage of councilors with higher education in a commune. The index LDOWN refers to land ownership and is equal to the normalized value of the share of public forests managed by the State Forests. Finally, the culture and quality of spatial planning (SPPLAN) is calculated as the normalized share of area covered by valid local spatial development plans. The list of all measures contributing to the considered independent variables can be found in Table 1. However, the limitations of using aggregate indexes should be emphasized. Most of all, the selection of measures contributing to the indexes could be contested. Moreover, aggregate indexes consider only quantitative information and might invite a simplistic image of the investigated phenomena [110].
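The construction of an aggregate index as the average of normalized measures can be sketched as follows. The exact normalization formulas are not recoverable from the text; the sketch assumes min-max normalization scaled to 0-100%, with the direction reversed for destimulants, which is a common choice but an assumption here. The names `normalize` and `aggregate_index` are hypothetical.

```python
import numpy as np


def normalize(x, stimulant=True):
    """Min-max normalization to 0-100% (assumed form, not from the source).

    For a stimulant, larger real values give larger normalized values;
    for a destimulant, the direction is reversed.
    """
    x = np.asarray(x, dtype=float)
    lo, hi = x.min(), x.max()
    if hi == lo:
        # Constant measure: no variation to normalize.
        return np.zeros_like(x)
    z = (x - lo) / (hi - lo) if stimulant else (hi - x) / (hi - lo)
    return 100.0 * z


def aggregate_index(measures, stimulant_flags=None):
    """Average of normalized measures, one value per spatial unit.

    measures: sequence of 1-D arrays, one array per measure.
    stimulant_flags: one boolean per measure; defaults to all stimulants,
    as was the case for every measure in this research.
    """
    if stimulant_flags is None:
        stimulant_flags = [True] * len(measures)
    normalized = [normalize(m, s) for m, s in zip(measures, stimulant_flags)]
    return np.mean(normalized, axis=0)
```

For instance, an index in the spirit of URBAN could be obtained as `aggregate_index([population_density, buildings_per_km2])`, where both arrays hold one value per commune.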
GWR was applied to explain the spatially varying impact of various social, demographic, and economic determinants on the investigated forest classification errors. However, GWR application should be preceded by Ordinary Least Squares (OLS) modelling to identify the global, spatially constant impact of the investigated social, demographic, and economic factors on land registry errors [111]. The linear, global impact of the considered determinants on the investigated forest classification errors is expressed as follows:

$$\mathrm{FORESTCE} = \beta_0 + \beta_1 \mathrm{URBAN} + \beta_2 \mathrm{CIVIC} + \beta_3 \mathrm{EDU} + \beta_4 \mathrm{LDOWN} + \beta_5 \mathrm{SPPLAN} + \varepsilon$$

where FORESTCE refers to forest classification errors identified in the land registry. URBAN, CIVIC, EDU, LDOWN, and SPPLAN are aggregate indexes characterizing various social, demographic, and economic factors influencing the investigated land registry errors. In particular, URBAN refers to urbanization processes, CIVIC to civic society development, EDU to education, LDOWN to land ownership, and SPPLAN to culture and quality of spatial planning. Detailed characteristics of the independent variables are presented in Table 1. Finally, β0 refers to the intercept, β1 to β5 refer to coefficients, and ε refers to the error term.

GWR has been successfully applied in various studies concerning land cover or land use, as well as its changes, impacts, and results. Brown et al. [111] tested the potential of the method in examining local variations of the relationship between land cover, rainfall, and surface water habitat in southeast Australia. The spatial volatility of landscape fragmentation and its many different anthropogenic influences were investigated using GWR for the case of Shenzhen City in China [112]. By applying GWR, Leśniewska-Napierała et al. [16] described local spatial patterns of the impacts of European Union funds on land cover changes in Poland. The discussed method also enabled the identification of driving forces of deforestation in the state of Mexico [2]. Shariff et al. [113] applied GWR to model urban land use changes in Penang Island in Malaysia.
The discussed method allowed for the investigation of the spatially varying relation between groundwater quantity changes, and land use and land cover changes in the Khanmirza Plain of southwestern Iran [114]. GWR allowed Huang et al. [115] to identify the role of forest areas in decreasing water pollution in the urban subwatersheds, which was more significant compared to rural ones.
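The global OLS model that precedes GWR, relating FORESTCE to the five aggregate indexes with an intercept β0 and coefficients β1 to β5, can be estimated with a short NumPy sketch. The function name `fit_ols` and the column order of the predictors are assumptions for illustration; the paper does not state which software was used for this step.

```python
import numpy as np


def fit_ols(X, y):
    """Ordinary least squares with an intercept.

    X: array of shape (n_units, n_predictors), e.g., columns
       URBAN, CIVIC, EDU, LDOWN, SPPLAN (assumed order).
    y: array of shape (n_units,), e.g., FORESTCE.
    Returns [beta_0, beta_1, ..., beta_p].
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Prepend a column of ones so that beta_0 is the intercept.
    design = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta
```

The resulting coefficients are spatially constant: a single value of each β describes all communes, which is the baseline against which the local GWR estimates are compared.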
In this research, GWR was designed to estimate local spatial patterns of various social, demographic, and economic impacts on errors of forest identification in the land registry. The main goal of GWR application was to explore spatial variations in relationships between the investigated variables [115]. Local models for every Polish commune j, described by the geographical coordinates Long and Lat, were estimated and analyzed. As a result, it was possible to identify the areas where forest classification errors were significantly influenced by each considered aggregate index referring to the investigated social, demographic, and economic factors. The mentioned spatially varying impact might be expressed as follows:

$$\mathrm{FORESTCE}_j = \beta_0(\mathrm{Long}_j, \mathrm{Lat}_j) + \beta_1(\mathrm{Long}_j, \mathrm{Lat}_j)\,\mathrm{URBAN}_j + \beta_2(\mathrm{Long}_j, \mathrm{Lat}_j)\,\mathrm{CIVIC}_j + \beta_3(\mathrm{Long}_j, \mathrm{Lat}_j)\,\mathrm{EDU}_j + \beta_4(\mathrm{Long}_j, \mathrm{Lat}_j)\,\mathrm{LDOWN}_j + \beta_5(\mathrm{Long}_j, \mathrm{Lat}_j)\,\mathrm{SPPLAN}_j + \varepsilon_j$$

To select the neighboring spatial units included in the estimation of the local model for every commune j, a weighted distance decay function (or kernel function), w, with bandwidth h is applied to every commune k spatially distributed around spatial unit j [111,116]. The corrected Akaike information criterion was utilized to delimit the adaptive bandwidth (smaller for densely located spatial units, and larger for sparse ones), offering a more desirable assessment than the fixed one [111,114]. The weight applied in each local model depends on the Euclidean distance d between the polygon centroids of the considered commune j and every surrounding commune k [112].
The weight of units located beyond the bandwidth from the considered commune is equal to zero [116,117].

The results of GWR implementation are presented on the following maps: (1) a map of local coefficients of determination, to identify the goodness of fit of the local models to empirical observations; (2) maps of the values of the estimated local parameters of the independent variables, to identify spatial patterns of the investigated social, demographic, and economic impacts on forest classification errors in the land registry; and (3) maps of the values of Student's t-tests for the mentioned parameters, to detect the statistical significance of the identified impacts [112,118,119].
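The local estimation step of GWR described above can be sketched as a weighted least squares fit at every commune centroid. This is an illustration under stated assumptions, not the authors' implementation: the kernel is assumed to be bi-square (the text only states that weights are zero beyond the bandwidth), the adaptive bandwidth is fixed by a hypothetical `n_neighbors` parameter rather than selected by the corrected Akaike information criterion, and the function names are invented for this example.

```python
import numpy as np


def bisquare_weights(distances, bandwidth):
    """Bi-square distance decay kernel (assumed kernel type).

    Weights fall from 1 at distance 0 to 0 at the bandwidth,
    and are exactly 0 beyond it.
    """
    d = np.asarray(distances, dtype=float)
    w = (1.0 - (d / bandwidth) ** 2) ** 2
    return np.where(d < bandwidth, w, 0.0)


def gwr_local_coefficients(coords, X, y, n_neighbors):
    """Local coefficients for every spatial unit.

    coords: (n, 2) centroid coordinates (e.g., Long, Lat in a planar CRS).
    X:      (n, p) independent variables; y: (n,) dependent variable.
    n_neighbors: the adaptive bandwidth at unit j is the Euclidean
        distance to its n_neighbors-th nearest unit (must be < n).
    Returns an (n, p + 1) array; row j holds [intercept, slopes] at unit j.
    """
    coords = np.asarray(coords, dtype=float)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y)
    design = np.column_stack([np.ones(n), X])
    betas = np.empty((n, design.shape[1]))

    for j in range(n):
        # Euclidean distances from centroid j to all centroids.
        d = np.linalg.norm(coords - coords[j], axis=1)
        # Sorted distances start with 0 (unit j itself), so index
        # n_neighbors is the distance to the n_neighbors-th neighbor.
        h = np.sort(d)[n_neighbors]
        w = bisquare_weights(d, h)
        # Weighted least squares via rescaling rows by sqrt(weight).
        sw = np.sqrt(w)
        beta, *_ = np.linalg.lstsq(design * sw[:, None], y * sw, rcond=None)
        betas[j] = beta

    return betas
```

Because the bandwidth is tied to a number of neighbors rather than a fixed distance, it shrinks where communes are densely located and grows where they are sparse, which mirrors the adaptive behavior described in the text.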

Forest Cover
Based on the supervised machine learning classification of Sentinel-2 satellite images, the forest cover of Poland was 32.8% (see Figure 4). The areas located in the center of Poland, including the southern part of the Kuyavia-Pomerania province, the northern part of the Łódź province, Masovia, Greater Poland, and the Lublin province in the east, were characterized by lower forest cover. Greater forest cover occurred in the northern and western regions of Poland, especially in the Lubusz province, where forests covered over 50% of the area.
The forest cover analysis carried out for communes showed that average forest cover was characterized by high variability, reaching 63.7%. A complete lack of forests (0.0% forest cover) was found in only 4 out of 2478 communes. Communes with forest cover of 0.1-20.0% accounted for 36.0% of all communes. They were located mainly in the central part of Poland, but their clusters are also visible in the south and southeast of the country. A significant share of communes (37.5%) had forest cover in the range of 20.1-40.0%, whereas communes with forest cover of 40.1-60.0% constituted 19.4% of all analyzed units and were situated mainly in the western and northern parts of Poland. Communes with significant forest cover of 60.1-80.0% accounted for 6.3%. They formed a strip stretching from the south to the west through the Lubusz province and on to the north through the West Pomerania and Pomerania provinces. Their cluster was also visible on the southeastern edges of the Subcarpathia province. Only 0.7% of communes had forest cover of 80.1% or above. The largest forest cover, 95.1%, was recorded for the Krupski Młyn commune located in the southern part of Poland (Silesia province).

Errors of Forest Identification in the Land Registry
The variable FORESTCE, depicting forest classification errors in the land registry, was introduced as a relative comparison of forest identification resulting from remote sensing techniques and from the land registry. It should be recalled that the FORESTCE index equals 100% when the remote sensing results and the land registry data are equal, exceeds 100% when the forest cover identified by remote sensing is larger than the value recorded in the land registry, and is lower than 100% when forest cover is overestimated in the land registry.
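The behavior of the index can be illustrated with a small sketch. The exact formula is defined earlier in the paper and is not reproduced in this section; the additive form used below (100 plus the difference in percentage points) is an assumption, chosen because it is consistent with the Białowieża figures reported in the next paragraph (39.9% from remote sensing, 87.4% in the land registry, index 52.5%). The function names and class labels are hypothetical, and the 5-point tolerance mirrors the threshold used in the text.

```python
def forestce(remote_sensing_cover, registry_cover):
    """Assumed additive form of the index, in percent:
    100 plus (remote sensing cover minus land registry cover)."""
    return 100.0 + (remote_sensing_cover - registry_cover)

def classify(index, tolerance=5.0):
    """Label a commune by the direction of the discrepancy."""
    if abs(index - 100.0) <= tolerance:
        return "consistent"
    if index > 100.0:
        return "registry underestimates forest"
    return "registry overestimates forest"

# Białowieża rural commune: 39.9% (remote sensing) vs. 87.4% (land registry)
bialowieza = forestce(39.9, 87.4)
print(round(bialowieza, 1), classify(bialowieza))
```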
The values of the FORESTCE index for every commune in Poland are presented in Figure 5. It must be emphasized that in 70.7% of the investigated communes, the values of forest cover identified by remote sensing and reported in the land registry were similar, and the difference did not exceed 5%. Interestingly, the lowest value of the FORESTCE index (52.5%) was found in the Białowieża rural commune in the Podlasie province. In this commune, the forest area identified by remote sensing covered only 39.9% of the unit, while the forest cover reported in the land registry was 87.4%. This area lies on the edge of the Białowieża Forest (UNESCO World Heritage Site), where controversial large-scale cutting of the stand was reported in 2017. In contrast, the largest value of the FORESTCE index was found in the city of Łęknica in the Lubusz province, on the Polish-German border. It must be emphasized that Muskau Park (UNESCO World Heritage Site), the largest English-style landscape garden in Central Europe, is located in this city. In summary, on the one hand, updating of the land registry cannot keep up with the effects of forestry. On the other hand, land cover that merely resembles forest, such as parkland, may affect the results of forest identification based on remote sensing techniques.
LL denotes spatial clusters of the FORESTCE variable where the forest cover recorded in the land registry is larger than that identified by remote sensing.
The areas of spatial concentration of overestimated forest cover in the land registry are as follows: (1) areas located along the path of the violent gale that passed over the Greater Poland, Kuyavia-Pomerania, and Pomerania provinces in August 2017; (2) a wide strip stretching along the A1 and A2 highways, starting from the western edge of the Warsaw metropolitan area, running latitudinally toward Łódź and then longitudinally north, covering the eastern parts of the Kuyavia-Pomerania and Pomerania provinces; (3) areas located on the western edge of the Białowieża Forest in the Podlasie province; and (4) the greater part of the Lublin province.
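The HH and LL cluster labels used in this section come from local Moran's I statistics. As a minimal sketch of how such labels arise, the code below standardizes the values, computes each unit's spatial lag as the mean of its neighbors' standardized values, and assigns a quadrant label. The function name, the toy values, and the neighbor lists are hypothetical; row-standardized contiguity weights are assumed; and the permutation-based significance test that a real analysis requires is omitted.

```python
from statistics import mean, pstdev

def local_moran_quadrants(values, neighbors):
    """Return (label, local_I) per unit.

    values: one index value per spatial unit.
    neighbors: for each unit, the list of indices of its neighbors
               (hypothetical contiguity structure, equal weights assumed).
    """
    mu, sigma = mean(values), pstdev(values)
    z = [(v - mu) / sigma for v in values]
    results = []
    for i, nbrs in enumerate(neighbors):
        lag = mean(z[k] for k in nbrs)  # spatial lag of unit i
        label = ("H" if z[i] > 0 else "L") + ("H" if lag > 0 else "L")
        results.append((label, z[i] * lag))  # local Moran's I = z_i * lag_i
    return results

# Toy chain of six units: low index values on the left, high on the right
values = [90, 92, 95, 105, 110, 108]
neighbors = [[1], [0, 2], [1, 3], [2, 4], [3, 5], [4]]
for label, local_i in local_moran_quadrants(values, neighbors):
    print(label, round(local_i, 3))
```

Under the index convention of this section, LL units in such a sketch would correspond to neighboring communes that all have low FORESTCE values, that is, forest cover overestimated in the land registry.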

Determinants of Land Registry Errors
Urban performance, civic society development, education, and land ownership, as well as culture and quality of spatial planning, were introduced as aggregate indexes describing the investigated social, demographic, and economic factors determining forest classification errors in the land registry of Poland. The descriptive statistics and global impact of these independent variables are presented in Table 2. It must be underlined that the global model estimated by OLS regression does not explain the variability of forest classification errors: the coefficient of determination was equal to 2.4%. However, all predictors except urbanization processes were recognized as statistically significant. Regarding the goal of this research, the spatially varying value and direction of the impact of the considered determinants of forest classification errors confirmed that OLS is a completely ineffective method here. When analyzing models estimating the impact of particular factors (urbanization processes, civic society development, education, land ownership, and culture and quality of spatial planning) on the variable referring to forest classification errors, the main focus needs to be on the areas where the discrepancies between the land registry records and the actual state identified by remote sensing are the largest. This means that both LL and HH clusters of the FORESTCE variable need to be investigated with particular attention. Additionally, the greatest emphasis should be placed on areas where the impact of particular factors is recognized as most significant. This means that the Student's t values calculated for the local coefficients describing the impact of each separately discussed factor are expected to be equal to or higher than 1.96 in absolute value.
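The screening rule described above can be sketched as follows. The function name and the sample numbers are hypothetical, and applying the 1.96 cut-off to the absolute value of the t-statistic (a two-sided test at the 5% level) is an assumption made so that significant negative impacts are also flagged.

```python
def screen_local_coefficients(coefficients, standard_errors, threshold=1.96):
    """For each spatial unit, return (t_value, direction, is_significant).

    coefficients, standard_errors: local GWR estimates for one predictor.
    """
    results = []
    for beta, se in zip(coefficients, standard_errors):
        t_value = beta / se
        direction = "positive" if beta > 0 else "negative"
        results.append((t_value, direction, abs(t_value) >= threshold))
    return results

# Hypothetical local estimates for three communes
for row in screen_local_coefficients([0.5, -0.9, 0.1], [0.2, 0.3, 0.2]):
    print(row)
```

Only units flagged as significant would then be compared against the LL and HH clusters when interpreting each hypothesis.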
A significant relation between urbanization processes and the occurrence of forest classification errors was diagnosed in a few areas in Poland (see Figure 7). However, only in the Lublin province does the negative impact of urbanization correspond to the spatial concentration of areas characterized by forest cover overestimated in the land registry. This is the only region where hypothesis H1 was confirmed. For this peripheral, relatively less developed region of Poland, it must be stated that the more intensive the urbanization processes are, the bigger the difference between the actual and the reported forest cover. It should be emphasized that in less developed areas, local development policy stimulating urbanization processes does not respect land use planning. Actual deforestation carried out for the purposes of, e.g., developmental investment projects and initiatives is not reflected in the land registry.
The negative and significant impact of civic society development on the variable describing the discrepancy between forest classification based on remote sensing techniques and forest cover reported in the land registry is evident in large areas in the southeastern and northeastern parts of Poland (see Figure 8). Moreover, in many border communes concentrated in these areas, the actual forest cover significantly exceeds the value reported in the land registry. This is contrary to hypothesis H2 and means that local civic societies are organized in opposition to state principles and institutions, including the land registry of Poland. Thus, the problem of inconsistencies in the development of borderland communities should be emphasized. It must be underlined that the land and building registry system should be re-engineered to better serve the needs of users, including citizens. As Williamson et al. [85] suggested, engagement of the society and good governance in decision-making and implementation are crucial. This requires building the necessary capacity in individuals, organizations, institutions, and wider society so that they can perform their functions effectively, efficiently, and sustainably. To provide spatial integrity of the cadaster and identification of every land parcel, it should be updated on a regular basis. In addition, a cadaster should ideally include all land in a jurisdiction: public, private, communal, and open space.
The significant relation between education and land registry errors is evident in a strip covering the provinces of Lesser Poland, Silesia, and Łódź (see Figure 9). However, hypothesis H3, related to the impact of the education level of the society on forest classification errors in the land registry of Poland, was confirmed only in the northeastern part of the metropolitan area of Kraków in the Lesser Poland province. In this area, forest cover is significantly overestimated in the land registry. However, an increase in the education level of the society enables better understanding of the importance of the land registry. In consequence, it reduces the difference between the actual and the recorded forest cover. It can be concluded that the northeastern part of the metropolitan area of Kraków is the only region in Poland where an increasing education level of the society protects urbanized areas from overestimation of forest cover in the land registry. On the other hand, the results for the Silesia province and the southern part of the Łódź province substantially contradict hypothesis H3. In these areas, the higher the education level of the society is, the greater the overestimation of forest cover in the land registry. Some indirect influence of education on the difference between the actual and the recorded forest cover was expected. A moderating role of urbanization processes might be the explanation. However, the GWR application does not enable analysis of indirect effects of the investigated factors. Thus, an in-depth analysis of the impact of education on forest classification errors in the Silesia province and the southern part of the Łódź province is needed.
The influence of the share of public forests managed by the State Forests on forest classification errors in the land registry of Poland is presented in Figure 10. It needs to be emphasized that hypothesis H4 was confirmed for areas characterized by both significantly overestimated and significantly underestimated forest cover in the land registry. Within the areas where the actual forest cover is less than the recorded one, the State Forests' policy and forest management were confirmed as a factor stimulating land registry updates. This situation is evident in the Kuyavia-Pomerania and Pomerania provinces, as well as in small clusters of overestimated forest cover in the land registry detected in the West Pomerania province. Moreover, in the Silesia province, where the actual forest cover is larger than that reported in the land registry, the investigated land ownership factor was also confirmed as decreasing the difference between the identified and the recorded forest cover. In contrast, the higher the percentage of forests managed by the State Forests in communes located on the western edge of the Białowieża Forest is, the bigger the differences between the land registry records and the actual state are. This confirms the serious problems diagnosed in the eastern part of Poland, which the state-owned company managing forests has to face.
The culture and quality of spatial planning was confirmed as a factor stimulating correction of the land registry data related to forest cover (see Figure 11). In particular, this is evident in the provinces of Kuyavia-Pomerania and Pomerania, where communes characterized by significantly overestimated forest cover in the land registry are clustered. These provinces were identified as the areas where hypothesis H5 was confirmed. On the other hand, higher coverage by valid local spatial development plans does not protect all Polish communes against forest classification errors in the land registry. The problem is evident both in areas characterized by significantly underestimated forest cover in the land registry (the metropolitan area of Warsaw and the mountain area on the border of the Silesia and Lesser Poland provinces) and in areas characterized by overestimated forest cover (communes on the Polish-Ukrainian border). This also means that using coverage by valid local spatial development plans as a measure of the culture and quality of spatial planning has serious limitations.
Figure 11. Value (a) and significance (b) of impact of quality and culture of spatial planning on difference between actual and evidenced forest cover in Poland, in 2018. Source: Own elaboration.

Conclusions
The goal of this paper was to investigate the social, demographic, and economic factors determining differences between forest identification based on remote sensing techniques and the land registry. The combined application of the GWR method and local Moran's I statistics allowed the identification of regions where urbanization processes, civic society development, education level of the society, land ownership, and culture and quality of spatial planning affect the differences between the actual forest cover and that recorded in the land registry. Spatial patterns of these relations were diagnosed in areas where forest cover is over- or underestimated in the land registry.
It was confirmed that in less developed areas, local development policy stimulating urbanization processes does not respect land use planning principles, including the accuracy of the land registry. The problem of inconsistencies in the development of borderland communities was confirmed, as local civic societies are frequently organized in opposition to state principles and institutions, including the land registry. The metropolitan area of Kraków was identified as the territory where the education level of the society protects against a substantial overestimation of forest cover in the land registry. Issues related to land registry quality diagnosed in the eastern part of Poland, mainly on the western edge of the Białowieża Forest, need to be solved mainly by the state-owned company managing forests. Finally, higher coverage by valid local spatial development plans stimulates protection against forest classification errors in the land registry for a limited number of Polish communes.