An Algorithm for Delimiting Rural Areas According to Soil Classes

Abstract: For many years and all over the world, agricultural production has been observed to slow down on low-quality soils in territories featuring difficult topography and poor spatial structure (land fragmentation, excessive elongation of plots, plots without direct access to public roads, and land scattering). This paper proposes a new, self-designed algorithm for delimiting rural areas that allows the clustering of villages featuring low soil productivity, based on three factors used for determining the overall value of the area of land (Wcag), i.e., the overall area of the village (ha), share of specific type of land in the overall area of the village (%), and mean score for specific soil type (pts.), which allows the villages to be grouped according to classes of land occurring in the examined district. The results of the surveys provide a basis for further detailed studies into efficient management of areas featuring low soil classes during land consolidation works. Further surveys will involve a detailed analysis of the identified clusters of villages to ensure that their potential is used to the optimum extent. As a consequence, these areas will potentially become more competitive and operations conducted there will be beneficial to the local inhabitants and contribute to improving their living standard.


Introduction
For decades, many countries in the European Union and Asia have faced the problem of abandoned agricultural land. The reasons for discontinuing agricultural production have been investigated by many researchers [1][2][3][4]. Abandonment is due to multiple factors, including land fragmentation, excessive elongation of plots, plots without direct access to public roads, and land scattering. Agricultural land is a necessary factor in food production, so preventing its abandonment is an important element of food security [5,6]. Agricultural production is carried out mostly in areas that generate a high yield at a low workload with specialist agricultural equipment, including land with good-quality soils that is intensively farmed. Areas that require a high workload are exposed to a shrinking crop area or to changes in agricultural production trends and are, in addition, adversely affected by the accelerating rate of climate change, by demographic processes [7][8][9], and by social and economic processes [10]. This is corroborated by studies [11] revealing a clear demographic trend: the younger population migrates to cities and is unwilling to take over relatively small farms generating low income. Discontinuation of agricultural production affects mainly areas with difficult topography [12,13] that also feature low-quality soils [14] and very unfavourable land fragmentation [15]. Climate change contributes significantly to the abandonment of agricultural production. Scarce water resources and uneven distribution of atmospheric precipitation, together with low water-retention capacity, are obstacles to agricultural production. Climate change has a significant impact on natural conditions, in particular on access to water in many areas where traditional agriculture is practised [16].
Central and Eastern Europe is a region [...]

Studies carried out in this area [31,32] allowed the authors to collect study material in the form of data from a real property register on the types and classes of the three major land uses in terms of their share, i.e., arable land, meadows, and pastures. The overall value of the area of land (Wcag) was determined using three factors: the total area of the village (ha), the share of the respective type of land (separately for arable land, meadows, and pastures) in the overall area of the village (%), and the mean score for the respective type of land (pts.). The value of the last factor was calculated from the scores for soil-quality classes of arable land and grassland adopted after [33], whose studies referred to the yield of four cereals and potatoes. Those studies determined the production value of arable land and grassland according to soil-quality classes on a 100-point scale (Table 1).

Table 1. Scores assigned to soil classes for arable land and grassland.

Soil Class   Arable Land   Grassland
I            100           90
II           92            80
IIIa         83            65
IIIb         70            -
IVa          57            45
IVb          40            -
V            30            38
VI           18            15

The research method comprised calculations for three factors, i.e., the overall area of the village in hectares; the percentage share (in relation to the overall area of the village) of arable land, meadows, and pastures; and the mean score for the respective types of land; followed by a comparison of the results and their presentation as a dendrogram illustrating the process of merging areas with similar features. The whole computational algorithm provided for clustering separately for arable land, meadows, and pastures, and collectively for all agricultural land (arable land, meadows, pastures).

Village clustering used the previously tested Ward's method [31,32], with squared Euclidean distances as the distance matrix. Ward's method provided very good clustering results, which in previous studies [31] were checked against a different clustering method (the complete-linkage method), leading to 80% concurrence of the results. The studies were carried out using the STATISTICA PLUS programme from StatSoft Polska.

Since the diagnostic features used for clustering are expressed in different units and, above all, differ in order of magnitude, their values must be standardised. The selected procedure was classical standardisation according to the formula:

$$x_i^{*} = \frac{x_i - \bar{x}}{s}$$

where $i$ is the object number, $\bar{x}$ is the mean value, and $s$ is the standard deviation.

Values converted in this way have a mean of 0 and a standard deviation of 1.

Clustering is performed separately for arable land, meadows, pastures, and all agricultural land. It uses Ward's method with a distance matrix of squared Euclidean distances.
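The standardisation and clustering steps described above can be illustrated with a minimal Python sketch. It is not the STATISTICA implementation used in the study. The village names and feature values are hypothetical, the function names `standardise` and `ward_clusters` are introduced here for illustration, and taking s as the sample standard deviation is an assumption. Ward's criterion is applied to squared Euclidean distances through the Lance-Williams update.

```python
from statistics import mean, stdev


def standardise(column):
    """Classical standardisation (x_i - mean) / s; s is the sample standard deviation (assumption)."""
    m, s = mean(column), stdev(column)
    return [(x - m) / s for x in column]


def ward_clusters(rows, n_clusters):
    """Merge rows with Ward's criterion until n_clusters remain.

    Distances are squared Euclidean distances, updated after each merge
    with the Lance-Williams recurrence for Ward's method.
    """
    clusters = {i: [i] for i in range(len(rows))}
    dist = {}
    for i in range(len(rows)):
        for j in range(i + 1, len(rows)):
            dist[(i, j)] = sum((a - b) ** 2 for a, b in zip(rows[i], rows[j]))
    next_id = len(rows)
    while len(clusters) > n_clusters:
        # Pick the pair of clusters with the smallest Ward distance.
        (i, j), d_ij = min(dist.items(), key=lambda kv: kv[1])
        ni, nj = len(clusters[i]), len(clusters[j])
        merged = clusters.pop(i) + clusters.pop(j)
        new_dist = {}
        for k, members in clusters.items():
            nk = len(members)
            d_ki = dist[(min(k, i), max(k, i))]
            d_kj = dist[(min(k, j), max(k, j))]
            new_dist[(k, next_id)] = (
                (ni + nk) * d_ki + (nj + nk) * d_kj - nk * d_ij
            ) / (ni + nj + nk)
        # Drop distances that involve the two merged clusters.
        dist = {p: d for p, d in dist.items() if i not in p and j not in p}
        dist.update(new_dist)
        clusters[next_id] = merged
        next_id += 1
    return sorted(sorted(c) for c in clusters.values())


# Hypothetical villages: (area in ha, share of arable land in %, mean score in pts.)
villages = {
    "V1": (300.0, 20.0, 25.0),
    "V2": (320.0, 22.0, 27.0),
    "V3": (1500.0, 60.0, 55.0),
    "V4": (1550.0, 62.0, 57.0),
    "V5": (900.0, 40.0, 80.0),
}
names = list(villages)
columns = [standardise(col) for col in zip(*villages.values())]
rows = list(zip(*columns))
groups = [[names[i] for i in g] for g in ward_clusters(rows, 3)]
print(groups)
```

Standardising first keeps the area in hectares, which is larger by orders of magnitude, from dominating the distances. The same routine could be run separately on the features for arable land, meadows, pastures, and all agricultural land.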

Clustering of Rural Areas-Arable Land
The results of clustering derived from the surveys and compiled in Table 2 (mean values for clusters), Table 3 (indicators of mean values for clusters, which allow the level of the specific cluster means to be compared with the overall mean value), and Table 4 (general characteristics of the distribution of quality measures) made it possible to provide a precise description of six clearly identified clusters (Figure 2).

In terms of delimiting areas with the worst soil classes, the arrangement of these characteristics was clearly the worst in Group D, which featured the smallest area of land, the lowest share of arable land, and by far the poorest soil quality. These villages had a very diverse terrain relief as well as a high (Obarzym 53.3%, Wola Jasienicka 52.0%) or very high (Hroszówka 97.0%) share of forestland. Group E corresponded to areas with poor soil quality but a high share of arable land in the overall area of the village. In turn, villages from Group A featured the highest quality of arable land, but its share in the overall area was low. The spatial distribution of clusters is illustrated in Figure 3, and the number of villages in the respective clusters is presented in Table 5.

[Table 5 fragment, cluster label not recoverable: Przysietnica, Domaradz, Golcowa, Haczów, Blizne, Izdebki, Wesoła]
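The text does not state how the indicators of mean values for clusters (Table 3) were calculated. One plausible reading, shown here only as a hedged sketch, is the ratio of each cluster mean to the overall mean of the feature. The function name `cluster_indicators` and the input values are hypothetical.

```python
from statistics import mean


def cluster_indicators(values, labels):
    """Ratio of each cluster mean to the overall mean (assumed definition of the indicator)."""
    overall = mean(values)
    by_cluster = {}
    for value, label in zip(values, labels):
        by_cluster.setdefault(label, []).append(value)
    return {label: mean(vs) / overall for label, vs in sorted(by_cluster.items())}


# Hypothetical mean scores (pts.) of four villages and their cluster labels.
scores = [20.0, 30.0, 50.0, 60.0]
labels = ["D", "D", "A", "A"]
indicators = cluster_indicators(scores, labels)
print(indicators)
```

Under this reading, an indicator below 1 marks a cluster that falls short of the overall mean for the given feature, and an indicator above 1 marks a cluster that exceeds it.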

Clustering of Rural Areas-Meadows
A fairly clear division into six groups was proposed for clustering rural areas according to the characteristics, occurrence, and quality of meadows (Figure 4). In view of the purpose of the study, the 10 villages in Group E featured the worst characteristics of meadows and had a small overall area. Group D did not perform well either, and Group F had only one major advantage: the biggest overall area.

The results of clustering derived from the surveys and compiled in Tables 6-8 made it possible to provide a precise description of six clearly identified clusters.

Table 9 presents the elements and size of the respective groups. Group D was dominant, as it comprised 18 villages, whereas Group C had only two, and these were villages with a small registered area and very good-quality meadows. Figure 5 illustrates the spatial distribution. It is worth noting that the study area featured quite a high spatial coherence between the clusters, which is primarily due to the natural conditions.

[Table 9 fragment, cluster label not recoverable: Przysietnica, Domaradz, Golcowa, Izdebki, Wesoła]

Clustering of Rural Areas-Pastures
The third analysed land-use type was pastures. Figure 6 shows a clear division into eight groups, which were subsequently agglomerated at similar distances, and only then could a clearer division into three or two clusters be made. Despite these reservations, the results are shown for a division into six groups, in line with the classification for arable land and meadows.

The data (Tables 10-13) show that the most numerous group, Group F, had the least favourable characteristics: a small area, and a low share and poor quality of pastures. Among the other groups, Groups E and B showed the worst results, although the latter had pastures of the highest quality, even though its villages had a small registered area and a small share of pastures in the overall area.

Clustering according to the characteristics of pastures results in clusters of very different sizes (B, C, and D had no more than four elements each), as illustrated in Table 13. Their spatial distribution is presented in Figure 7.

Clustering of Rural Areas-All Agricultural Land
The studies concerning the clustering of rural areas in 44 villages of the district of Brzozów (Figure 8) identified four types of villages. The results of clustering according to surface area, the total share of arable land, meadows, and pastures, and their mean score were naturally expected to be close to the results of the analysis for arable land (arable land occupies the largest area, so it has the largest impact on the mean quality of land).

In terms of land least suitable for agricultural use, the cluster of villages marked as C clearly showed the worst characteristics. Cluster D also featured low values of the analysed characteristics: the highest share of agricultural land in the overall area only partially mitigated the problem of a smaller area and worse quality of land in relation to Cluster A. The best results were observed for Cluster B, mostly due to its area, which, on average, was two times bigger than in the other groups. Tables 14-16 show detailed results of clustering. The villages comprising the respective clusters are presented in Table 17, and Figure 9 shows their spatial distribution.
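The remark that arable land has the largest impact on the mean quality of land can be illustrated with a short Python sketch. It assumes, without confirmation from the text, that the mean score of a village is an area-weighted average of the Table 1 scores. The village areas are hypothetical, the function name `mean_score` is introduced here, and assigning the grassland scores 65 and 45 to classes III and IV is an interpretation of the flattened table.

```python
# Scores from Table 1 (pts.); grassland classes III and IV are an assumed reading.
ARABLE_SCORES = {"I": 100, "II": 92, "IIIa": 83, "IIIb": 70,
                 "IVa": 57, "IVb": 40, "V": 30, "VI": 18}
GRASSLAND_SCORES = {"I": 90, "II": 80, "III": 65, "IV": 45, "V": 38, "VI": 15}


def mean_score(areas_by_class, scores):
    """Area-weighted mean score for one land-use type (assumed weighting)."""
    total_area = sum(areas_by_class.values())
    weighted = sum(scores[cls] * area for cls, area in areas_by_class.items())
    return weighted / total_area


# Hypothetical village: areas in ha by soil class.
arable = {"IVa": 100.0, "V": 50.0}
meadows = {"IV": 30.0, "VI": 20.0}

arable_score = mean_score(arable, ARABLE_SCORES)
meadow_score = mean_score(meadows, GRASSLAND_SCORES)

# Combined score for both land-use types, again weighted by area.
arable_area = sum(arable.values())
meadow_area = sum(meadows.values())
combined_score = (
    arable_score * arable_area + meadow_score * meadow_area
) / (arable_area + meadow_area)
print(arable_score, meadow_score, combined_score)
```

Because arable land covers three quarters of the hypothetical agricultural area, the combined score lies closer to the arable score than to the meadow score.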

Discussion
The international literature contains varied information on marginal land. This term denotes land that has never or hardly ever been used for agricultural purposes, is not entered into the register of agricultural land, and is too barren to be used as agricultural land. At the Rio de Janeiro Earth Summit in 1992, it was suggested that marginal land should not be allocated for agricultural use since this would hardly improve the food balance. In addition, allocating it for agricultural use usually reduces the forest cover of the continents, which increases the risk to the environment [34].
The problem of marginal land was investigated in detail in Asian countries. Shi et al. [35] employed GIS-processed data and multiple regression analysis for factor analysis of such land in the mountainous regions of China, but they did not consider a sufficient number of socio-economic factors and related policies in their analysis.
By contrast, in Japan, most surveys regarding marginal land make use of the agricultural census data. In 2011, Takayama and Nakatani [36] carried out a survey using a set of data covering six Japanese prefectures. Previous studies in 1998 investigating marginal land in Japan, for instance, Senda [37], employed data of respective farmers also derived from an agricultural census. Moreover, those surveys did not take into account variables related to regional agrarian structure; hence, there were no implications for regional policies. In 2018, Su [38] analysed the determinants of marginal land based on GIS data, and in 2014, Matsui [39] developed a machine learning estimation model for these areas (generalised linear models, random forest, and multivariate adaptive regression splines). However, data input for those surveys was also based on a population census. At present, such analyses employ objective data and involve GIS data processing (ArcGIS 10.8 software) to estimate the marginal land rate model accurately.
By contrast, in Poland, a definition of marginal land was formulated in 1990, after the commercialisation of agriculture. The costs of labour and materials considerably exceeded the value of the crop yield. Therefore, the Agricultural Property Agency of the State Treasury, which acquired the lands formerly owned by the State Agricultural Farms, delineated 57,400 ha of marginal lands with no agricultural value.
Institutions responsible for agricultural and non-agricultural management of marginal lands are the Ministry of Agriculture and Food Economy and the Institute of Soil Science and Plant Cultivation in Puławy. In 1992, the Ministry's Department of Land and Rural Management specified the notion of marginal land, defining it as land remaining under agricultural use or entered in the register of agricultural land that, due to adverse natural, anthropogenic, and economic conditions, has a relatively low productivity or is not suitable for producing healthy food [40]. This definition provided a basis for the "Rationalisation of Marginal Lands" grant awarded by the Ministry of Agriculture and Food Economy.
The project was commissioned by three agricultural institutions: the Institute of Soil Science and Plant Cultivation in Puławy, the Institute for Land Reclamation and Grassland Farming in Falenty, and the Institute of Agricultural and Food Economics in Warsaw.
In 1996, the Institute of Soil Science and Plant Cultivation in Puławy prepared detailed guidelines regarding the delineation of marginal lands from the utilised agricultural area [40]. According to the adopted criteria, land can be classified into four groups [40,41]:

1. Infertile agricultural land where production is not profitable due to unfavourable natural conditions and erosion;
2. Land representing different soil classes and featuring chemical contamination as a result of human activity;
3. Degraded or mechanically transformed soils devoid of humus;
4. Land with unfavourable natural and territorial conditions, i.e., hardly accessible agricultural land or obstacles to tillage.
The quality and suitability of land is determined based on soil-quality classes [34]. The uniform classification of land throughout Poland takes into account the physical and morphological features of land, constituting cartographic materials in the form of classification cadastral-scale maps [42,43].
Therefore, there is still a need to survey and analyse marginal lands in Poland. This paper presents the results of surveys using the algorithm of clustering villages according to their surface area, the total share of arable land, meadows, and pastures, and their mean score, which allowed the villages of the analysed district to be grouped according to soil quality. The spatial distribution of villages in the groups is determined by terrain relief and natural conditions (Figure 10). It has a decisive impact on future proposals for managing the analysed area and is taken into account when developing the design documentation, that is, the assumptions for the land consolidation project, which at a later stage of the works also covers the management of marginal lands.
The villages featuring the worst quality of land (Groups C, D) are mostly situated in the eastern part of the district (Dydnia, Hroszówka, Jabłonica Ruska, Jabłonka, Końskie, Krzemienna, Krzywe, Niewistka, Obarzym, Temeszów, Witryłów, Malinówka, Hłudno, Huta Poręby, Siedliska, Wołodź, Barycz, Grabówka, and Wydrna), and in its western (Jasienica Rosielna, Wola Jasienicka, Malinówka, and Zmiennica) and northern parts (Barycz). These are mountainous villages (Barycz, Dydnia, Jabłonica Ruska, Jabłonka, Końskie, Krzemienna, Krzywe, Niewistka, Obarzym, Temeszów, Witryłów, Grabówka, and Wydrna) and sub-mountainous villages (Hroszówka, Jasienica Rosielna, Wola Jasienicka, Zmiennica, Malinówka, Hłudno, Huta Poręby, Siedliska, and Wołodź). On the other hand, villages from Groups A and B, featuring good-quality land, are situated on plains or hills.

Detailed analysis of the results showed that five villages (Hroszówka, Obarzym, Barycz, Grabówka, and Wydrna) were in the weakest of the identified village groups at each level of the survey. In turn, Końskie, Niewistka, Temeszów, Malinówka, and Wola Jasienicka were classified in three out of four of the weakest groups. By contrast, Huta Poręby, Siedliska, Wołodź, Zmiennica, Dydnia, and Jabłonka, despite their absence at respective stages of the calculation (for arable land, meadows, and pastures), were classified together in the weakest group in terms of arable land.

To sum up, the new algorithm for identifying land useless for agriculture showed very good results in terms of evaluating the actual level of defectiveness of the land in the study area. A comparison of results obtained using the previous and the modified algorithm showed 69.5% compatibility of the analysed villages, which is a satisfactory result. A high level of compatibility with the results obtained using the previous method is a guarantee of good decision-making, which can be an element of broad development strategies in specific areas.
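The two comparisons discussed above, the number of analyses in which a village falls into the weakest group and the compatibility of two classifications, can be sketched in Python. The text does not state how the compatibility figure was obtained, so the percentage of villages receiving the same label is only an assumed definition. All village identifiers, labels, and function names below are hypothetical.

```python
from collections import Counter


def weakest_counts(weakest_by_analysis):
    """Count in how many analyses each village belongs to the weakest group."""
    counts = Counter()
    for villages in weakest_by_analysis.values():
        counts.update(set(villages))
    return counts


def agreement_percent(old_labels, new_labels):
    """Percentage of villages with the same label in both classifications (assumed measure)."""
    common = old_labels.keys() & new_labels.keys()
    same = sum(old_labels[v] == new_labels[v] for v in common)
    return 100.0 * same / len(common)


# Hypothetical weakest groups from the four analyses.
weakest = {
    "arable land": ["V1", "V2", "V3"],
    "meadows": ["V1", "V2"],
    "pastures": ["V1", "V3"],
    "all agricultural land": ["V1", "V2"],
}
counts = weakest_counts(weakest)

# Hypothetical labels assigned by the previous and the modified algorithm.
previous = {"V1": "weak", "V2": "weak", "V3": "good", "V4": "good"}
modified = {"V1": "weak", "V2": "good", "V3": "good", "V4": "good"}
agreement = agreement_percent(previous, modified)
print(counts, agreement)
```

In this toy example, V1 is in the weakest group in all four analyses and V2 in three of them, which mirrors the kind of statement made about the villages of the study area.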

Conclusions
The spatial structure of rural areas has been transformed dynamically, which is mostly due to changes in the lifestyle of the inhabitants. The increasingly better accessibility of cities leads to suburbanisation. The intensity and forms of space management affect its shape and contribute to preserving its natural and cultural values. The forms of space management are affected not only by the inhabitants and the increasing building development in villages, but also by new forms of using agricultural space. The area structure of farms can be improved through consolidation and exchange of land, which is one of the most efficient rural management procedures. Rational shaping of land contributes to improving the working and living conditions of its inhabitants. Such works facilitate reasonable management of areas of land featuring soils of the poorest grade, since these are the areas most at risk of environmental degradation processes. Every rural area is unique; therefore, it is particularly important to find solutions matching the natural and landscape status of each area on a case-by-case basis. In Poland, there are many areas where agricultural production on private, individually owned farms is on the verge of or falls below the limit of profitability. The main factors contributing to the formation of problem areas include unreasonable utilisation of natural resources, which intensifies erosive degradation and soil acidity.
The self-designed algorithm accurately identifies locations featuring the poorest-quality soil, which is particularly significant in developing strategies for larger areas such as communes, districts, and voivodeships. The new, modified algorithm allows areas with