Clustering of EU Countries by the Level of Circular Economy: An Object-Oriented Approach

Abstract: To regulate the circular economy (CE) effectively at the national and international levels, it is essential to have a unified and informative system of indicators for monitoring progress in the CE. The lack of standard indicators for measuring the progress of cyclicality leads to contradictions and misunderstandings, which is a problem for the implementation of CE strategies. This paper aims to adapt dynamic clustering approaches to solving strategic management problems of circular production and consumption processes. To achieve this goal, the authors performed the following tasks: (1) tested clustering algorithms by ranking EU countries by the level of development of the circular economy; (2) identified the approach that allows the best classification of EU countries, considering changes in the indicators of the level of CE development in 2000–2019 (dynamic classification); (3) developed a software module using Python libraries to classify and visualize the results. The results illustrate that the k-means algorithm has good discriminatory ability in dividing all countries of the training sample (EU countries) into several clusters with different dynamics in the development of the CE. The best quality of classification is obtained by the indicator “Generation of municipal waste per capita”; satisfactory quality of classification is obtained by the indicator “Generation of waste excluding major mineral wastes per GDP unit”. The study results demonstrate the fundamental applicability of the object-oriented and classical statistical approaches to solving strategic management problems of the CE and their potential effectiveness in terms of the clarity and information content with which they reflect cyclical processes.


Introduction
In recent years, environmental protection issues have been increasingly considered by society, investors, and the government. The continuous, irrational use and exploitation of natural resources by humans is driving the global ecosystem to the brink of destruction. Recent decades have shown the importance of decoupling economic growth and social development from resource exploitation and waste generation. According to World Bank forecasts, the world's population will reach 10 billion by 2050. Therefore, one of the main challenges of the 21st century is to combine the economic development of competing countries and the continuous increase in the living standards of the population with limited natural resources, while not endangering the stability of the global ecosystem.
Experts worldwide consider the circular economy a way to move from the current linear model of production and consumption to a new, more efficient economy based on the renewal of resources, recycling, and a transition from fossil fuel resources to renewable energy sources. According to the experts of the Ellen MacArthur Foundation [1], by 2025 the circular economy can annually provide an increase in the income of the world economy of over USD 1 trillion, and due to industrial innovations, it can provide a 3% increase in productivity and, as a result, a 7% increase in world GDP. According to McKinsey's [2] estimates, the transition to a circular economy will bring the economy of the European Union USD 1.8 trillion. It is also expected to lead to a 53% reduction in primary resource consumption by 2030 and an 83% reduction in carbon dioxide emissions by 2050.
Some countries have adopted appropriate strategies to promote the circular economy and are improving legislation for this purpose (Germany, Finland, Switzerland, Japan, South Korea, etc.). Other countries actively use various instruments and mechanisms of state policy: the introduction of technologies, financing, and forms of doing business, the formation of society's readiness to change its habits, and the creation of new interaction schemes. With a robust industrial economy, Germany has formed the backbone of a circular economy through material flows and the availability of materials. The Netherlands has built on innovation in materials and business models. Finland is the first country to develop a national roadmap for the transition to a regenerative economy. Scotland became the first member of the Circular Economy 100 club and actively adopted circular economy technologies. Japan moved to a highly efficient circular economy, primarily due to the law to promote efficient use of resources. In China, the circular economy began to develop as part of an industrial ecology program that looks at how waste from one company can become a resource for another. What unites these countries is new concepts of national development that provide for a radical change in waste management systems, with a focus on maximizing the extraction of secondary resources from waste and their use in industrial production instead of natural mineral raw materials. Large producers of greenhouse gas emissions such as Brazil and Russia lack research in this area, and there is a gap between important industry initiatives and scientific research.
The circular economy is represented by processes that require minimal extraction of natural resources and, owing to the reuse of materials, do not have a depressing effect on the environment. The useful life of materials is extended through the reuse of products in production, modern technological developments focused on resource durability and waste minimization, and the sharing economy. At the same time, the circular economy model suggests that waste is not only minimized but also returned to production processes. Studies assess the success of circular economy policies primarily by the level of maturity of recycling and waste management, including process-based approaches to waste disposal.
There are examples in the literature of meta-level reviews or comparisons of CE policies; most research focuses on quality standards, government procurement, market mechanisms, education and development, infrastructure, financial incentives, and labeling related to the quality of reused and recycled products [3,4].
The basic principles of the circular economy complement the bioeconomy and should facilitate the recycling and reuse of materials. The circular economy is by definition "regenerative", demonstrating the qualitative principles of using resources and components, increasing their functionality and value, and reducing waste and cyclical production processes. Technological, socio-political, and economic restructuring is fundamental to incorporate new technologies and approaches to foster a circulating economy and a continuous economic cycle [5]. Thus, there are nine basic principles of the circular economy (Table 1).
Circular business models are a general term for a wide variety of business models that use fewer materials and resources to produce products and/or services. In addition, these models serve to extend the life of existing products and/or services through repair and refurbishment. Finally, they help complete the product life cycle by recycling the residual value of products and materials. One of the mechanisms for implementing such business models is life cycle assessment and lean manufacturing [6].
To effectively regulate the circular economy at the national and international levels, it is essential to have a unified and informative system of indicators (ISI). An ISI allows one to assess how different production systems and production processes comply with the 9R principles and to measure progress in this direction. The lack of standard indicators for measuring the progress of cyclicality leads to contradictions and misunderstandings, which is a problem for the implementation of CE strategies. Therefore, many scholars are currently working on the development of indicators adapted from existing statistical information, which guarantees simplicity and informativeness in appraising progress in the CE [7].
Table 1. Basic principles of the circular economy.
(a) Use and build the product more intelligently
R0 Refuse: Make a product redundant by giving up its function or by providing the same function with a radically different product.
R1 Rethink: Enhance product use (for example, through product exchange or multi-functional products).
R2 Reduce: Make the manufacture or use of a product more efficient by consuming fewer raw materials.

(b) Extending the life of the product and parts
R3 Reuse: Reuse of a still-good product for the same function by another user.
R4 Repair: Repair and maintenance of a broken product for use in its old function.
R5 Refurbish: Update or upgrade an old product.
R6 Remanufacture: Use parts of a used product in a new product with the same function.
R7 Repurpose: Use parts of a used product in a new product with a different function.

(c) Useful application of materials
R8 Recycle: Process materials to obtain the same (high) or lower (low) quality.
R9 Recover: Burning materials for energy recovery.

(d) Substitution
Substitution: Replacement of non-renewable materials.
In our opinion, an essential step toward increasing the information content of the current systems for monitoring progress in the CE is the development of approaches that make it possible to distinguish economic systems with similar dynamics of the development of circular processes. The similarity of the dynamics of different indicators of the CE development level in different countries allows us to speak about the same (or similar) effectiveness (or inefficiency) of these countries' policies and institutional structures, which stimulate the formation of circular processes and business models [9]. Further, we can use this kind of information to rank incentive policies in terms of effectiveness, determine the degree of their compliance with the institutional structure, and develop the most effective management decisions.
This paper aims to adapt dynamic clustering approaches to solving strategic management problems of circular production and consumption processes. To achieve this goal, we performed the following tasks: (1) testing clustering algorithms on the problem of ranking EU countries by the level of development of the circular economy; (2) determining the approach that allows the best classification of countries, taking into account changes in the indicators of the level of CE development in the time interval 2000–2019 (dynamic classification); (3) developing a software module using built-in Python libraries to classify and visualize the results.
As a training sample, we used data on the European Union countries as a region with the most developed system for monitoring circular processes in the sphere of production and consumption.
The paper is structured as follows. Section 2 presents the results of the literature review and systemizes the theoretical framework of the study. Then, Section 3 describes an object-oriented approach to the problem of monitoring progress in the CE. Section 4 presents the results of the clustering analysis, performed by different methods, and compares their robustness and statistical quality. Section 5 discusses the clustering results from an economic point of view and provides some policy applications. Finally, Section 6 concludes the study and discusses its contribution to the literature as well as some limitations and perspectives for future research.

Literature Review
Accounting and analysis of material flow across economic systems of different levels are considered a reasonably convenient tool for analyzing the circular economy since waste recycling is regarded as one of the most critical factors that impact the environment.
Haas et al. [10] analyzed the circularity of the world economy through an assessment of material flows, waste production, and recycling in the EU and the world. The researchers estimated the circulation of materials worldwide by measuring the ratio between waste extraction and material input for domestic use (the latter defined as the sum of the country's extraction and import of materials). Their results show a low degree of material isolation throughout the world. A total of 44% of the recycled materials are reused for power generation, and socio-economic reserves are also growing by 17 Gt per year. In the EU, the level of circularity is low, but the level of waste disposal is relatively high. The results indicate that strategies targeting the output side (end of the pipe) are limited given the present proportions of flows. In contrast, a shift to renewable energy, a significant reduction of societal stock growth, and decisive eco-design are required to advance toward a CE.
Sverko Grdic et al. [11], based on an econometric model, showed that applying the concept of a circular economy can provide economic growth and GDP growth while reducing the use of natural resources and ensuring greater environmental protection.
Cristian Busu and Mihail Busu [12] proposed a methodology to study circular economy processes based on mathematical modeling. The modeling process consists of constructing a composite indicator as a weighted sum of all indicators, with the weights produced by an algorithm based on Shannon's entropy. Their results are similar to the international rankings, which supports the accuracy and reliability of this approach.
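The idea of an entropy-weighted composite indicator can be illustrated with a minimal sketch. It assumes the common "entropy weight method", in which an indicator whose values differ more across countries receives a larger weight; whether [12] used exactly this rule is an assumption, and the data matrix below is hypothetical.

```python
import math

def entropy_weights(matrix):
    """Shannon-entropy weights for the columns (indicators) of a matrix
    whose rows are countries. Values are assumed positive and already
    normalised to a comparable scale."""
    n_rows = len(matrix)
    n_cols = len(matrix[0])
    diversities = []
    for j in range(n_cols):
        column = [row[j] for row in matrix]
        total = sum(column)
        shares = [v / total for v in column]
        # Entropy normalised to [0, 1]; terms with p == 0 contribute 0.
        entropy = -sum(p * math.log(p) for p in shares if p > 0) / math.log(n_rows)
        # An indicator that is identical for all countries carries no information.
        diversities.append(1.0 - entropy)
    norm = sum(diversities)
    return [d / norm for d in diversities]

def composite_scores(matrix, weights):
    """Composite indicator: weighted sum of the indicators of each country."""
    return [sum(w * v for w, v in zip(weights, row)) for row in matrix]

# Hypothetical data: 4 countries x 3 indicators (not taken from the paper).
data = [
    [0.9, 0.5, 0.2],
    [0.1, 0.5, 0.8],
    [0.5, 0.5, 0.4],
    [0.3, 0.5, 0.6],
]
weights = entropy_weights(data)
scores = composite_scores(data, weights)
print(weights)
print(scores)
```

In this toy matrix the second indicator is the same for every country, so its weight is (numerically) zero, while the first indicator, which varies most, receives the largest weight.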
The article by Sterev, N. and Ivanova, V. [13] proved that some EU economies are entirely linear and delay the actual transition between linear and closed business models. The main takeaway is establishing new institutional arrangements for those EU economies that have had to move from a linear business model to a circular one.
Ferranato et al. (2018) [14] described a theoretical circular economy model for developing big cities in low- and middle-income countries, as part of a study comparing the opportunities that the exploitation of municipal solid waste could open up for these countries.
Sánchez-Ortiz J. et al. [15] analyzed the proposals made by various researchers on indicators to measure the efficiency of the application of CE principles. They highlighted three issues: problems in establishing indicators, difficulty in defining the indicator, and the impossibility of obtaining the data.
Kovanda et al. [16], based on economy-wide material flow accounting and analysis (EW-MFA), analyzed waste recycling in the Czech Republic. They calculated this indicator for the Czech Republic for 2002-2011. They proved that it could also be calculated for other countries, even though some unclear methodological issues related to specific features of the Czech waste management system are encountered. The highest recycling rate was recorded for biomass, followed by metals, non-metallic minerals, and fossil fuels.
Research by Liu M. et al. [17] showed the economic benefit of China's waste paper recycling in 2017 was approximately 458.3 CNY/t and that the GHG emissions were 901.1 kgCO2eq. The standard recovery rate and nonstandard recovery acceptance rate will significantly impact the system's economic benefits and improve the structure of the GHG emission. In the context of integrating nonstandard recycling enterprises and individual recycling vendors, the economic benefits will rise to 3312.5 CNY/t by 2030, while GHG emissions will increase to 942.9 kgCO2eq.
Shpak et al. [18] proved that the level of waste recycling has a significant impact on the trade of recyclable raw materials in the EU. The article by Fura et al. [19] used a synthetic criterion for the study of CE promotion in the EU in each selected CE area of Eurostat, i.e., production and consumption, waste management, secondary raw materials, and competitiveness and innovation. To assess the correlation between the estimates obtained at the two most extreme points of the analysis period, the authors applied a similarity measure. The results confirm that the highly developed Benelux countries (Luxembourg, the Netherlands, and Belgium) have the highest CE development. Malta, Cyprus, Estonia, and Greece are the least advanced in CE practice. In addition, on average, there is some progress in the implementation of the CE, but there are significant imbalances between the EU countries, especially among the "new" member states.
Ivanova et al. [20] used cluster analysis research methods to analyze the state and progress of countries in each cluster. They selected 8 out of 10 indicators and added Resource Productivity and Domestic Material Consumption and Responsible Production and Consumption from the SDGs. They provided insight into resource efficiency, which is an urgent challenge in a circular economy. Since waste management and recovery are critical, four of the nine selected indicators relate to the total share of waste and the recycled portion. The authors identified the differences and similarities between the EU countries concerning the transition to a closed economy model and assessed the progress on this basis.
An interesting attempt to develop a comprehensive indicator of the level of development of a circular economy was made in the study of Mazur-Wierzbicka [7]. The advantage of the approach proposed in this study is the ability to track changes in the level of development of the circular economy in dynamics and to identify similar patterns in some countries.
European authors used a comprehensive analysis and quantification of national plastic flows to measure circularity [21]. Material flow analysis was used to set up a model quantitatively describing the Austrian plastics budget, and the quality of the data sources was assessed using uncertainty characterization. The results show that about 1.1 million tonnes (132 kg/cap·a ± 2%) of primary plastics were produced in Austria, whereas approximately 1.3 million tonnes (156 kg/cap·a ± 5%) of plastics products were consumed. Another study [22] showed that Austria exhibits an 8.5% share of secondary raw materials in processed materials. The percentage of recycled materials in interim outputs is at 16.8%.
Consistent material flow assessment combined with dynamic modeling allowed the authors of [23] to conduct analyses over 100 years. Over the whole period, they observed growth in global material extraction by a factor of 12 to 89 Gt/yr. A shift from materials for dissipative use to stock building materials resulted in a massive increase of in-use stocks of materials to 961 Gt in 2015. Since materials increasingly accumulate in stocks, outflows of wastes are growing at a slower pace than inputs. In 2015, outflows amounted to 58 Gt/yr, of which 35% were solid wastes and 25% emissions, the remainder being excrements, dissipative use, and water vapor. The results indicate a significant acceleration of global material flows since the beginning of the 21st century. The scenario until 2050 shows average international metabolic rates doubling to 22 t/cap/yr and material extraction increasing to around 218 Gt/yr. Overall, the analysis indicates a grand challenge calling for urgent action, fostering a continuous and considerable reduction of material flows to acceptable levels.
Rincon-Moreno et al. [24] analyzed the existing circular economy indicators defined by the European Union: Production and Consumption, Waste Management, Secondary Raw Materials, and Competitiveness and Innovation. The objective of this study was to develop and analyze the applicability of indicators aimed at assessing CE actions at the micro-level based on existing metrics that guarantee simplicity and effectiveness. Overall, the indicators proposed in this study were applicable at the micro-level based on the companies' responses. This showed that the indicators can be applied to companies regardless of the economic activity in which they operate. Although the conceptualization of circularity varies widely between companies, these cross-sector metrics based on a common conceptual framework will enable companies to speak the same language [25].
Cayser et al. [26] explored product performance through a circular economy concept. Based on the circular economy principles, the authors tried to determine suitable characteristics of indicators for measuring the performance of products. A multi-measure approach was taken, with a single aggregated metric for each life-cycle stage. This approach has several advantages: speed, simplicity, ease of diffusion, and comprehensible metaphor.
Sustainability 2021, 13, 7158
Linder et al. [27] analyzed the advantages and disadvantages of the existing circular economy metrics. The approach of the Ellen MacArthur Foundation was considered: its Material Circularity Indicator (MCI) is perhaps the most ambitious attempt yet to create a product-level circularity metric. The MCI consists of two factors: the linear flow index and the utility factor. The linear flow index factor can be viewed as a particular variant of MFA. The Cradle to Cradle Products Innovation Institute has developed a C2C certification framework that has been used to evaluate 159 companies and around 2500 products. The tool entitled REPRO (Remanufacturing Product Profiles) performs statistical analyses of different end-of-life (EoL) product scenarios based on a set of 82 criteria. REPRO allows designers to compare their products with others that have been successfully remanufactured in order to improve the remanufacturing rate. The authors also considered the eco-efficient value ratio model, which assesses sustainability through three dimensions: costs, market value, and "eco-costs" (i.e., externalities). A product or service is considered "clean" when eco-costs are below a certain threshold [28]. The circular economy index measures circularity in terms of the ratio of recycled material value from EoL products compared to total material value in recycling processes needed to produce new versions of the same product [28]. Based on different approaches, Linder et al. [29] offered a unique, product-level circularity metric, where circularity is defined as the fraction of a product that comes from used products. Reasoning from this, the authors outline a metric based on the ratio between recirculated and total economic product value. The metric has a high degree of construct validity (i.e., it focuses solely on product-level circularity).
Haupt and Hellweg [30] tried to measure the environmental sustainability of a circular economy. The proposed complementary environmental-impact-based indicator measures the environmental value retained through reuse, remanufacturing, repairing, or recycling. The indicator extends the focus from the end of life to the entire life cycle and includes the substitution of primary materials. Furthermore, it allows for monitoring the transition toward a circular economy from an environmental and possibly economic and social perspective [31].
Lacko et al. [32] applied a well-developed data envelopment analysis methodology for monitoring the progress in the CE in Central European countries. A well-known advantage of this methodology is the ability to expand the concept of efficiency and use a large number of different indicators to aggregate them into a complex efficiency score. In addition to calculating the efficiency score of the development of a circular economy in the countries under study, the authors of this work, using linear regression, show that a high level of economic development of a country does not always contribute to an equally high development of a circular economy.
In recent years, an object-oriented approach to the analysis of indicators reflecting the state of the environment, climatic changes, and indicators related to the development of the CE has been gaining popularity in the literature. The appeal of this approach lies in the possibility of using powerful mathematical methods for analysis provided in the extensive Python libraries.
Chacon-Hurtado and Scholten [33] developed open-source software for decision analysis. They start from the observation that combinations of multi-criteria decision analysis (MCDA) and environmental models are promising yet limited by the available MCDA software. They present Decisi-o-rama, an open-source Python MCDA library for single alternatives and sets (portfolios) of alternatives in the context of multi-attribute value/utility theory. Its development is driven by four aspirations that are crucial for usability in the context of environmental decision making: (1) interoperability, (2) uncertainty awareness, (3) computational efficiency, and (4) integration with portfolio decisions. The results indicate that these aspirations are met, thus facilitating the use of MCDA methods by environmental researchers and practitioners. Decisi-o-rama offers a flexible implementation that supports user-defined preference models. In this respect, it is possible to define non-customary marginal value or utility functions and aggregation functions. In addition, the users may specify uncertain parameters of the preference model through their user-defined distributions. These features ensure straightforward extensibility for different types of models and users.
Donati et al. [34] modeled the circular economy in environmentally extended input-output tables, based on the previously created open-source Python software pycirk [35]. The authors described a Python package (pycirk) for modeling circular economy scenarios in the context of the Environmentally Extended Multi-Regional Input-Output database EXIOBASE V3.3, for the year 2011. They exemplified the methods and software through a what-if zero-cost case study on two circular economy strategies (Resource Efficiency and Product Lifetime Extension), four environmental pressures, and two socio-economic factors. The results from the case study show that environmental benefits can be obtained by pursuing CE strategies. In particular, the combined global effects could amount to a worldwide relative change of −10.1%, −12.5% raw material extraction used, −4.2% land use, and −14.6% blue water withdrawal. The analysis of the socio-economic indicators showed global reductions of 6.3% in value-added and 5.3% in employment.
Stadler [36] developed Pymrio, an open-source tool for Environmentally Extended Multi-Regional Input-Output analysis. Pymrio contains functionality aimed at professional Multi-Regional Input-Output analysts and sustainability scientists but might be helpful to anyone conducting environmental and/or economic research.

Research Methodology and Research Instruments
In this article, we cluster the EU countries through an object-oriented approach. In the economic literature, we found several definitions of cluster analysis. All definitions are based on mathematical methods designed to form groups of "close" objects that are relatively "distant" from each other. Groups are formed in accordance with information about distances or connections (measures of proximity) between the objects for all the most important characteristics. Cluster analysis is used to solve a wide range of problems, but most often it is applied to the segmentation problem. All studies, regardless of which method is used, are aimed at identifying stable groups, each of which combines objects with similar characteristics.
Silhouette refers to a technique for interpreting and validating consistency within clusters of data. This method provides a quick graphical representation of how well each object has been classified. The silhouette value is a measure of how similar an object is to its own cluster (cohesion) compared to other clusters (separation). The silhouette ranges from −1 to +1, where a high value indicates that the object is well matched to its own cluster and poorly matched to neighboring clusters. If most objects have a high value, the clustering configuration is appropriate. If many points have a low or negative value, the clustering configuration has too many or too few clusters. The silhouette coefficient is calculated for each sample using the average distance within the cluster (a) and the average distance to the closest cluster (b). The silhouette coefficient of object i is defined as:
$$ s(i) = \frac{b(i) - a(i)}{\max\{a(i),\, b(i)\}} $$
where $a(i)$ is the average dissimilarity between object i and the other objects of the same cluster, and $b(i)$ is the average dissimilarity between object i and the objects located in the nearest cluster.
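The silhouette definition can be made concrete with a short, dependency-free sketch. The points and labels are hypothetical, Euclidean distance is assumed as the dissimilarity, and the convention of assigning 0 to singleton clusters is an assumption; libraries such as scikit-learn offer an equivalent `silhouette_score` function, but the paper does not state which call was used.

```python
import math

def silhouette_values(points, labels):
    """Return s(i) = (b(i) - a(i)) / max(a(i), b(i)) for every point."""
    clusters = sorted(set(labels))
    values = []
    for i, (p, own) in enumerate(zip(points, labels)):
        mean_dist = {}
        for c in clusters:
            # Distances from point i to every *other* point of cluster c.
            dists = [math.dist(p, q)
                     for j, (q, lab) in enumerate(zip(points, labels))
                     if lab == c and j != i]
            mean_dist[c] = sum(dists) / len(dists) if dists else None
        a = mean_dist[own]
        if a is None:
            # Singleton cluster: s(i) is set to 0 by convention (assumption).
            values.append(0.0)
            continue
        b = min(d for c, d in mean_dist.items() if c != own)
        values.append((b - a) / max(a, b))
    return values

# Hypothetical data: two well-separated pairs of points.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good = silhouette_values(points, [0, 0, 1, 1])   # natural grouping
bad = silhouette_values(points, [0, 1, 0, 1])    # grouping across the gap
good_mean = sum(good) / len(good)
bad_mean = sum(bad) / len(bad)
print(good_mean, bad_mean)
```

The natural grouping gives values close to +1 for every point, while the grouping that mixes the two pairs gives negative values, which is the behaviour described above.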
The Silhouette Coefficient for a set of samples is given as the mean of the Silhouette Coefficients of the individual samples.
The Calinski-Harabasz coefficient is a variance ratio criterion [37,38] that is evaluated on the k-means clustering results obtained for different values of k. These k-means runs are randomly initialized and therefore have to be run several times to ensure an optimal clustering. When the data have been clustered with several different models, the best model is selected with the following rule:
$$ CH(k) = \frac{BGSS / (k - 1)}{WGSS / (n - k)} $$
where BGSS is the between-cluster sum of squares, WGSS the within-cluster sum of squares, k the number of clusters, and n the number of samples. Evaluating the ratio for the different models with increasing k, the optimal clustering should be given by the first local maximum of the ratio [37]. An essential difference between this algorithm and the other two available in the software is that this algorithm does not explicitly search for normally distributed clusters.
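As an illustration of the variance ratio criterion, the sketch below computes BGSS, WGSS, and their ratio directly from the definitions. The data and labels are hypothetical, and the function name is illustrative; scikit-learn provides `calinski_harabasz_score` for the same quantity.

```python
def calinski_harabasz(points, labels):
    """Variance ratio criterion: (BGSS / (k - 1)) / (WGSS / (n - k))."""
    n = len(points)
    dim = len(points[0])

    def mean(ps):
        return tuple(sum(p[d] for p in ps) / len(ps) for d in range(dim))

    def sqdist(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))

    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    k = len(clusters)

    overall = mean(points)
    # Between-cluster sum of squares, weighted by cluster size.
    bgss = sum(len(ps) * sqdist(mean(ps), overall) for ps in clusters.values())
    # Within-cluster sum of squares around each cluster mean.
    wgss = sum(sqdist(p, mean(ps)) for ps in clusters.values() for p in ps)
    return (bgss / (k - 1)) / (wgss / (n - k))

# Hypothetical data: two well-separated pairs of points.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
ch_good = calinski_harabasz(points, [0, 0, 1, 1])
ch_bad = calinski_harabasz(points, [0, 1, 0, 1])
print(ch_good, ch_bad)
```

For the natural grouping BGSS = 100 and WGSS = 1, so the ratio is 200; for the mixed grouping the roles are reversed and the ratio drops to 0.02.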
The Davies-Bouldin index (DBI) [38], a metric for evaluating clustering algorithms, is an internal evaluation scheme: it tests how well the clustering has performed using quantities and characteristics inherent in the dataset. The lower the DBI value, the better the clustering. The index also has a disadvantage: a good value reported by this method does not necessarily imply the best information retrieval.
The index is defined as the average similarity between each cluster $C_i$, for $i = 1, \dots, k$, and its most similar cluster $C_j$. The similarity measure $R_{ij}$ is built from two quantities: $s_i$, the average distance between each point of cluster i and the centroid (center of gravity) of that cluster, which can be thought of as the diameter of the cluster; and $d_{ij}$, the distance between the centroids of clusters i and j. A simple choice that makes $R_{ij}$ non-negative and symmetric is:
$$ R_{ij} = \frac{s_i + s_j}{d_{ij}} $$
Then the Davies-Bouldin index is defined as [39]:
$$ DB = \frac{1}{k} \sum_{i=1}^{k} \max_{j \neq i} R_{ij} $$
Homogeneity is an indicator of clustering quality defined with respect to known classes. The clustering result satisfies homogeneity if all its clusters contain only data points belonging to a single class. This metric is independent of the absolute values of the labels: permuting the values of the class or cluster labels does not change the resulting score.
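Returning to the Davies-Bouldin index, a minimal sketch of its computation is given here, with $s_i$ taken as the mean distance to the centroid and $R_{ij} = (s_i + s_j)/d_{ij}$. The data are hypothetical and the function name is illustrative; scikit-learn exposes the same measure as `davies_bouldin_score`.

```python
import math

def davies_bouldin(points, labels):
    """Average, over clusters, of the largest similarity R_ij to another cluster."""
    dim = len(points[0])
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)

    centroids = {
        lab: tuple(sum(p[d] for p in ps) / len(ps) for d in range(dim))
        for lab, ps in clusters.items()
    }
    # s_i: mean distance of the members of cluster i to its centroid.
    spread = {
        lab: sum(math.dist(p, centroids[lab]) for p in ps) / len(ps)
        for lab, ps in clusters.items()
    }

    total = 0.0
    for i in clusters:
        total += max(
            (spread[i] + spread[j]) / math.dist(centroids[i], centroids[j])
            for j in clusters if j != i
        )
    return total / len(clusters)

# Hypothetical data: two well-separated pairs of points.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
db_good = davies_bouldin(points, [0, 0, 1, 1])
db_bad = davies_bouldin(points, [0, 1, 0, 1])
print(db_good, db_bad)
```

Lower is better: the natural grouping yields 0.1 (compact clusters, distant centroids), whereas the mixed grouping yields 10.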
The Rand index estimates how many pairs of elements that were in the same class, and how many pairs that were in different classes, retained this state after the clustering algorithm was applied.
Each pair of elements falls into one of four groups: the elements belong to the same cluster and the same class (TP); to the same cluster but different classes (FP); to different clusters but the same class (FN); or to different clusters and different classes (TN). The index is then
$$ RI = \frac{TP + TN}{TP + FP + FN + TN} $$
It takes values from 0 to 1, where 1 is complete coincidence of the clusters with the specified classes, and 0 means no matches.
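The pair counting behind the Rand index can be sketched as follows. The label vectors are hypothetical and the function name is illustrative; scikit-learn offers `rand_score` for the same measure.

```python
from itertools import combinations

def rand_index(classes, clusters):
    """(TP + TN) / (TP + FP + FN + TN), counted over all pairs of elements."""
    tp = fp = fn = tn = 0
    for i, j in combinations(range(len(classes)), 2):
        same_class = classes[i] == classes[j]
        same_cluster = clusters[i] == clusters[j]
        if same_cluster and same_class:
            tp += 1          # same cluster, same class
        elif same_cluster:
            fp += 1          # same cluster, different classes
        elif same_class:
            fn += 1          # different clusters, same class
        else:
            tn += 1          # different clusters, different classes
    return (tp + tn) / (tp + fp + fn + tn)

# Hypothetical reference classes and two candidate clusterings.
classes = [0, 0, 1, 1]
ri_perfect = rand_index(classes, [1, 1, 0, 0])   # same partition, renamed labels
ri_mixed = rand_index(classes, [0, 1, 0, 1])     # every cluster mixes both classes
print(ri_perfect, ri_mixed)
```

Renaming the cluster labels does not affect the result, because only the agreement of pairs is counted.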
The clustering result satisfies completeness if all data points that are members of a given class are members of the same cluster. As with homogeneity, permuting the values of the class or cluster labels does not change the value of the score.
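Homogeneity and completeness are usually quantified with conditional entropies: homogeneity as $1 - H(C \mid K)/H(C)$ and completeness as $1 - H(K \mid C)/H(K)$, where C denotes the classes and K the clusters. The paper does not print these formulas, so the sketch below follows this standard entropy-based definition as an assumption; the labels are hypothetical.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a label vector."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())

def conditional_entropy(target, given):
    """H(target | given), estimated from label co-occurrence counts."""
    n = len(target)
    joint = Counter(zip(given, target))
    marginal = Counter(given)
    return -sum((c / n) * math.log(c / marginal[g]) for (g, _), c in joint.items())

def homogeneity(classes, clusters):
    """1 when every cluster contains members of a single class only."""
    h_classes = entropy(classes)
    if h_classes == 0:
        return 1.0
    return 1.0 - conditional_entropy(classes, clusters) / h_classes

def completeness(classes, clusters):
    """1 when all members of a class fall into the same cluster."""
    # Completeness is homogeneity with the roles of classes and clusters swapped.
    return homogeneity(clusters, classes)

# Hypothetical labels: every point is placed in its own cluster.
classes = [0, 0, 1, 1]
clusters = [0, 1, 2, 3]
hom = homogeneity(classes, clusters)
com = completeness(classes, clusters)
print(hom, com)
```

Putting every point in its own cluster is perfectly homogeneous (1.0) but only half complete (0.5), which shows why the two indicators are reported together.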
MeanShift clustering aims to discover blobs in a smooth density of samples. It is a centroid-based algorithm that works by updating candidate centroids to be the mean of the points within a given region. The candidates are then filtered in a post-processing stage to eliminate near-duplicates and form the final set of centroids. The candidate centroid $x_i$ at iteration $t$ is updated according to the following equation [40]:

$$x_i^{t+1} = m(x_i^t)$$

Here $N(x_i)$ is the neighborhood of samples within a given distance around $x_i$, and $m$ is the mean shift vector, which is computed for each centroid and points toward the region of maximum increase in the density of points. It is calculated using the following equation, which effectively updates the centroid to the mean of the samples in its neighborhood [38]:

$$m(x_i) = \frac{1}{|N(x_i)|} \sum_{x_j \in N(x_i)} x_j$$

The number of clusters is set automatically by the algorithm.
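The update and filtering steps described above can be sketched for one-dimensional data. This is a simplified illustration that assumes a flat kernel (every sample inside the bandwidth counts equally); the function name, the merge threshold of half a bandwidth, and the sample values are hypothetical choices, not the authors' settings.

```python
def mean_shift_1d(samples, bandwidth, n_iter=50, tol=1e-6):
    """Flat-kernel mean shift in one dimension; returns the merged centroids."""
    centroids = list(samples)  # every sample starts as a candidate centroid
    for _ in range(n_iter):
        moved = 0.0
        updated = []
        for c in centroids:
            # Neighborhood N(c): samples within the bandwidth of the candidate
            neighbours = [x for x in samples if abs(x - c) <= bandwidth]
            new_c = sum(neighbours) / len(neighbours)  # mean of the neighborhood
            moved = max(moved, abs(new_c - c))
            updated.append(new_c)
        centroids = updated
        if moved < tol:
            break
    # Post-processing: drop near-duplicate candidates
    final = []
    for c in sorted(centroids):
        if not final or abs(c - final[-1]) > bandwidth / 2:
            final.append(c)
    return final

# Toy samples with two dense regions (illustrative only)
samples = [1.0, 1.2, 0.8, 5.0, 5.1, 4.9]
modes = mean_shift_1d(samples, bandwidth=1.0)
```

Note that the number of clusters is not passed in: it follows from the bandwidth and the data.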
The k-means algorithm clusters data by separating the samples into n groups of equal variance, minimizing a criterion known as the inertia, or within-cluster sum of squares. The number of clusters must be specified in advance. The algorithm scales well to large numbers of samples and is used in a wide range of applications.
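A minimal sketch of the k-means procedure may help here. It uses Lloyd's iteration (alternating assignment and centroid update), which is one common way to minimize the inertia; the starting centroids, function name, and points are hypothetical, and the authors' module may rely on a library implementation instead.

```python
import math

def kmeans(points, centroids, n_iter=100):
    """Lloyd's algorithm; returns (centroids, labels, inertia)."""
    for _ in range(n_iter):
        # Assignment step: attach each point to its nearest centroid
        labels = [min(range(len(centroids)),
                      key=lambda j: math.dist(p, centroids[j]))
                  for p in points]
        # Update step: move each centroid to the mean of its members
        new_centroids = []
        for j, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append(
                    tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centroids.append(old)  # keep the centroid of an empty cluster
        if new_centroids == centroids:
            break
        centroids = new_centroids
    # Inertia: sum of squared distances of points to their assigned centroid
    inertia = sum(math.dist(p, centroids[lab]) ** 2
                  for p, lab in zip(points, labels))
    return centroids, labels, inertia

# Toy data and starting centroids (illustrative only)
points = [(1.0, 1.0), (1.0, 3.0), (9.0, 1.0), (9.0, 3.0)]
centers, labels, inertia = kmeans(points, centroids=[(0.0, 0.0), (10.0, 0.0)])
```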
More formally, the k-means algorithm divides a set of N samples X into K disjoint clusters C, each described by the mean $\mu_j$ of the samples in the cluster. The means are commonly referred to as the cluster "centroids"; they are generally not points from X, although they are located in the same space. The algorithm aims to choose centroids that minimize the inertia, or within-cluster sum of squares [41]:

$$\sum_{i=1}^{N} \min_{\mu_j \in C} \lVert x_i - \mu_j \rVert^2$$

Inertia can be regarded as a measure of the internal coherence of the clusters.

Affinity propagation forms clusters by sending messages between pairs of samples until convergence. The dataset is then described using a small number of exemplars, that is, the samples that are most representative of the other samples. The messages sent between pairs represent the suitability of one sample to be the exemplar of the other, and they are updated in response to the values from other pairs. This update occurs iteratively until convergence, at which point the final exemplars are selected and the final clustering is formed.
Affinity propagation is interesting because it chooses the number of clusters based on the available data. Two parameters are critical for this purpose. The first is the preference, which controls how many exemplars are used. The second is the damping factor, which damps the responsibility and availability messages to avoid numerical oscillations when these messages are updated.
Agglomerative clustering is one of the most common hierarchical clustering methods. It uses a bottom-up approach: at the first stage, each observation starts in its own cluster, and the clusters are then successively merged, step by step. The linkage criterion determines the metric used for the merge strategy:

• Ward linkage minimizes the sum of squared differences within all clusters. This approach minimizes variance and in this sense is similar to the objective function of k-means, but it is solved with an agglomerative hierarchical approach.
• Maximum, or complete, linkage minimizes the maximum distance between observations of pairs of clusters.
• Average linkage minimizes the average distance between all observations of pairs of clusters.
• Single linkage minimizes the distance between the closest observations of pairs of clusters.
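The four linkage criteria listed above differ only in how they score a candidate merge of two clusters. The sketch below computes all four for one pair of clusters; the Ward score is expressed as the increase in the within-cluster sum of squares caused by the merge, and the helper names and points are hypothetical.

```python
import math
from itertools import product

def mean_point(points):
    """Component-wise mean of a list of points."""
    return tuple(sum(c) / len(points) for c in zip(*points))

def sse(points):
    """Sum of squared distances from the points to their mean."""
    m = mean_point(points)
    return sum(math.dist(p, m) ** 2 for p in points)

def linkage_distances(a, b):
    """Dissimilarity of clusters a and b under the four linkage criteria."""
    pairwise = [math.dist(p, q) for p, q in product(a, b)]
    return {
        "single": min(pairwise),                    # closest pair
        "complete": max(pairwise),                  # farthest pair
        "average": sum(pairwise) / len(pairwise),   # mean over all pairs
        # Ward: growth of the within-cluster sum of squares after merging
        "ward": sse(a + b) - sse(a) - sse(b),
    }

# Toy clusters (illustrative only)
a = [(0.0, 0.0), (0.0, 2.0)]
b = [(4.0, 0.0), (4.0, 2.0)]
d = linkage_distances(a, b)
```

At each step an agglomerative algorithm would merge the pair of clusters with the smallest score under the chosen criterion.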
Balanced iterative reducing and clustering using hierarchies (BIRCH) is a clustering algorithm that can handle large datasets by first generating a small, compact summary of the dataset that retains as much information as possible. This smaller summary is then clustered instead of the full dataset.
BIRCH is often used to complement other clustering algorithms by creating a summary of the dataset that the other algorithm can then process. However, BIRCH has one major drawback: it can only process metric attributes. A metric attribute is an attribute whose values can be represented in Euclidean space, i.e., no categorical attributes should be present.
In statistics, exploratory data analysis (EDA) is an approach to analyzing datasets in order to summarize their main characteristics, often using statistical graphics and other data visualization methods. A statistical model may or may not be used, but EDA is primarily for seeing what the data can tell us beyond the formal modeling or hypothesis testing task.

Spectral clustering is an EDA technique that reduces complex multidimensional datasets to clusters of similar data in fewer dimensions. The main idea is to cluster the whole spectrum of unorganized data points into multiple groups based on their distinguishing features. "Spectral clustering is one of the most popular forms of multivariate statistical analysis" that "uses the connectivity approach to clustering", wherein communities of nodes (i.e., data points) that are connected or immediately next to each other are identified in a graph. The nodes are then mapped to a low-dimensional space that can be easily segregated to form clusters. Spectral clustering uses information from the eigenvalues (spectrum) of special matrices (i.e., the affinity matrix, degree matrix, and Laplacian matrix) derived from the graph or the dataset. Spectral clustering methods are attractive because they are easy to implement and reasonably fast, especially for sparse datasets of up to several thousand points. Spectral clustering treats data clustering as a graph partitioning problem without making any assumptions about the form of the data clusters.
For the dynamic clustering, we used data from the European CE monitoring system, developed within the "Action Plan for the Circular Economy". At the moment, it includes several statistical indicators in the areas of Production and Consumption, Waste Management, Recycled Materials, and Competitiveness and Innovation. In the area of Production and Consumption, the following statistical indicators are collected: (1) the self-sufficiency of the European Union in raw materials (in % of the consumption volume) for 24 metals and rare earth elements; (2) generation of municipal waste (kg/person); (3) generation of industrial waste except for mineral raw materials (kg/unit of GDP); (4) the ratio of the generation of industrial waste, except for mineral raw materials, to the volume of consumption of raw materials (%). The following data are collected in the area of Waste Management: (1) the share of municipal waste recycling (in % of the total volume of municipal waste generation); (2) the share of recycling of all production and consumption waste, excluding mineral raw materials (in %); (3) the share of packaging recycling (by types of packaging, in %); (4) the share of electronic waste recycling (in %); (5) the share of biological waste recycling (in %); (6) the share of recycling of construction waste (in %). In the area of Recycled Materials, monitoring is carried out according to the following indicators: (1) the contribution of recycled materials to the final demand (in %, for 24 types of materials); (2) the share of recycled materials in the final demand on average (in %); (3) trade in recycled materials between the countries of the European Union (tonnes).
In the area of Competitiveness and Innovation, monitoring is carried out on the following indicators: (1) private investment in the sectors of the circular economy (million euros and as a share of GDP); (2) jobs in the circular economy (in physical units and as a share of total employment); (3) gross value added in the sectors of the circular economy (million euros); (4) patents granted in the field of the circular economy.
Monitoring the production and consumption phase is essential for understanding progress toward the circular economy [42]. Households and economic sectors should decrease the amount of waste they generate. In the longer term, this behavior may increase the EU's self-sufficiency in selected raw materials for production. Therefore, the authors chose the indicators "Generation of municipal waste per capita" (cei_pc031) and "Generation of waste excluding major mineral wastes per GDP unit" (cei_pc032) for clustering with a software module that uses various Python frameworks and libraries and operates in real time.
The first stage of the analysis is to prepare the data for processing. For this, we connect the Eurostat statistics to the Python environment. The pandas_profiling library provides many valuable tools for performing exploratory data analysis (EDA). Note that the statistical data are incomplete: there are outliers and many gaps, as the nullity matrix in Figure 1 clearly shows. A nullity matrix is a data-dense display that lets one quickly pick out patterns in data completion visually.
For further analysis, it is necessary to eliminate outliers and replace the gaps with the mean, which is convenient to do with the PyCaret library. PyCaret is an open-source Python machine learning library for training and deploying supervised and unsupervised models in a low-code environment. The goal of the caret package for R is to automate the major steps for evaluating and comparing machine learning algorithms for classification and regression. The main benefit of the library is that a lot can be achieved with very few lines of code and little manual configuration. The PyCaret library brings these capabilities to Python. Compared to other open-source machine learning libraries, PyCaret is a low-code alternative that can replace hundreds of lines of code with just a few, which makes experiments much faster and more efficient.
PyCaret is essentially a Python wrapper over several machine learning libraries such as scikit-learn, XGBoost, Microsoft LightGBM, spaCy, and many more.
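The cleaning step described above (removing outliers and replacing gaps with the mean) can be illustrated without any library-specific API. The sketch below is not PyCaret code: it uses only the Python standard library, it assumes an interquartile-range rule for detecting outliers (the text does not say which rule was used), and the series values are invented for illustration.

```python
from statistics import mean, quantiles

def clean_series(values, k=1.5):
    """Replace gaps (None) and IQR outliers with the mean of the kept values."""
    observed = [v for v in values if v is not None]
    # Quartiles of the observed values (linear interpolation between data points)
    q1, _, q3 = quantiles(observed, n=4, method="inclusive")
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    kept = [v for v in observed if low <= v <= high]
    fill = mean(kept)
    return [v if v is not None and low <= v <= high else fill for v in values]

# Hypothetical yearly values for one country, with two gaps and one outlier
series = [400.0, None, 420.0, 410.0, None, 430.0, 2000.0]
cleaned = clean_series(series)
```

Treating outliers as gaps before imputing keeps the extreme value from distorting the mean that is used as the fill value.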

Results
The assessment of the quality of clustering by the available models is presented in Table 2. Based on the statistical evaluation, it can be concluded that the statistically significant clustering models are mean-shift clustering, k-means clustering, affinity propagation, agglomerative clustering, BIRCH clustering, and spectral clustering. According to the statistical estimates, the most preferred method for clustering countries by the selected indicators is k-means.

The elbow method was used to determine the optimal number of clusters. The basic idea behind partitioning methods such as k-means clustering is to define clusters so that the total intra-cluster variation, or total within-cluster sum of squares (WSS), is minimized. The total WSS measures the compactness of the clustering, and we want it to be as small as possible. The elbow method looks at the total WSS as a function of the number of clusters: one should choose the number of clusters so that adding another cluster does not substantially improve the total WSS.
The elbow method allows one to determine the optimal number of clusters for a dataset graphically. Based on this method (Figure 2), the optimal number of clusters differs for the two groups of data: four clusters for "Generation of municipal waste per capita" and five clusters for "Generation of waste excluding major mineral wastes per GDP unit". Reducing the number of clusters for the second group of data does not significantly worsen the statistical estimates of clustering (by about 5%). Therefore, we set the number of clusters to four.
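The elbow criterion can also be applied numerically rather than by eye. The sketch below is a simplified illustration: it runs a one-dimensional k-means with a deterministic start for several values of k, records the WSS, and picks the smallest k after which an extra cluster improves the WSS by less than a chosen share of the initial WSS. The 5% threshold, the function names, and the data are assumptions made for this example only.

```python
def kmeans_1d_wss(values, k, n_iter=100):
    """Within-cluster sum of squares after 1-D k-means with k clusters."""
    data = sorted(values)
    # Deterministic start: k centroids spread evenly over the sorted data
    centers = [data[(2 * i + 1) * len(data) // (2 * k)] for i in range(k)]
    for _ in range(n_iter):
        groups = [[] for _ in range(k)]
        for x in data:
            nearest = min(range(k), key=lambda j: abs(x - centers[j]))
            groups[nearest].append(x)
        new_centers = [sum(g) / len(g) if g else c
                       for g, c in zip(groups, centers)]
        if new_centers == centers:
            break
        centers = new_centers
    return sum((x - centers[j]) ** 2 for j, g in enumerate(groups) for x in g)

def elbow(wss, min_gain=0.05):
    """Smallest k whose successor improves WSS by less than min_gain * wss[0]."""
    for k in range(1, len(wss)):
        if wss[k - 1] - wss[k] < min_gain * wss[0]:
            return k
    return len(wss)

# Toy data with three obvious groups (illustrative only)
values = [1.0, 2.0, 3.0, 11.0, 12.0, 13.0, 21.0, 22.0, 23.0]
wss = [kmeans_1d_wss(values, k) for k in range(1, 5)]  # k = 1..4
best_k = elbow(wss)
```

A threshold rule of this kind is only one way to formalize the "elbow"; in practice the WSS curve is usually inspected visually, as in Figure 2.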
Silhouette analysis shows the separation distance between the resulting clusters. The silhouette plot (Figure 3) displays a measure of how close each point in one cluster is to points in the neighboring clusters and thus provides a way to assess parameters such as the number of clusters visually.

We see that for the first indicator, "Generation of municipal waste per capita", the clusters are characterized by greater uniformity and a more even distribution within each cluster. This, in turn, is confirmed by the intercluster distance (Figure 4) and the quantitative distribution of countries within each cluster (Figure 5). Intercluster distance maps display an embedding of the cluster centers in two dimensions with the distance to other centers preserved; that is, the closer the centers are in the visualization, the closer they are in the original feature space. The clusters are sized according to a scoring metric. By default, they are sized by membership, i.e., the number of instances that belong to each center. This provides a sense of the relative importance of the clusters.
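The silhouette measure mentioned above can be computed per point with a few lines of code. The sketch assumes the usual definition s = (b - a) / max(a, b), where a is the mean distance from a point to the other members of its cluster and b is the smallest mean distance to the members of any other cluster; it also assumes at least two clusters. Names and data are hypothetical.

```python
import math

def silhouette_values(points, labels):
    """Silhouette coefficient of every point (needs at least two clusters)."""
    scores = []
    for i, (p, lab) in enumerate(zip(points, labels)):
        # Distances from p to all other points, grouped by cluster label
        dists = {}
        for j, (q, other) in enumerate(zip(points, labels)):
            if i != j:
                dists.setdefault(other, []).append(math.dist(p, q))
        if lab not in dists:
            scores.append(0.0)  # singleton cluster: score 0 by convention
            continue
        a = sum(dists[lab]) / len(dists[lab])
        b = min(sum(d) / len(d) for other, d in dists.items() if other != lab)
        scores.append((b - a) / max(a, b))
    return scores

# Toy data (illustrative only): one sensible and one poor labeling
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 0.0), (5.0, 1.0)]
good = silhouette_values(points, [0, 0, 1, 1])
bad = silhouette_values(points, [0, 1, 0, 1])
```

Values near 1 indicate points that sit well inside their cluster, while negative values suggest points that are closer to a neighboring cluster than to their own.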
To visualize our clustering, we use principal component analysis (PCA). PCA is a technique for reducing the dimensionality of large datasets by transforming a large set of variables into a smaller one that still contains most of the information in the original set. Reducing the number of variables of a dataset naturally comes at the expense of accuracy. Still, the trick in dimensionality reduction is to trade a little accuracy for simplicity [41], because smaller datasets are easier to explore and visualize, and they make analyzing data much easier and faster for object-oriented algorithms (Figure 6).
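The projection step behind such a visualization can be sketched in plain Python. The code below is an illustration only: it centers the data, builds the covariance matrix, and extracts the leading components by power iteration with deflation, which is one simple way to obtain them (library implementations typically use a singular value decomposition). The function name and data are hypothetical.

```python
import math

def pca_project(rows, n_components=2, n_iter=500):
    """Project rows onto their first principal components (power iteration)."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    x = [[r[j] - means[j] for j in range(d)] for r in rows]  # centered data
    cov = [[sum(r[a] * r[b] for r in x) / (n - 1) for b in range(d)]
           for a in range(d)]
    components = []
    for _ in range(n_components):
        v = [1.0] * d  # start vector; assumed not orthogonal to the component
        for _ in range(n_iter):
            w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
            norm = math.sqrt(sum(c * c for c in w))
            if norm == 0:
                break
            v = [c / norm for c in w]
        # Rayleigh quotient gives the eigenvalue (variance along v)
        eigval = sum(v[a] * sum(cov[a][b] * v[b] for b in range(d))
                     for a in range(d))
        components.append(v)
        # Deflation: remove the variance already explained by v
        cov = [[cov[a][b] - eigval * v[a] * v[b] for b in range(d)]
               for a in range(d)]
    scores = [[sum(r[j] * c[j] for j in range(d)) for c in components]
              for r in x]
    return scores, components

# Toy data with three variables, the third carrying no information
rows = [(2.0, 0.0, 0.0), (-2.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, -1.0, 0.0)]
scores, components = pca_project(rows)
```

The two-column `scores` list is what would be drawn on a scatter plot, with points colored by cluster label.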

Discussion
In the discussion part, we try to characterize the countries in each cluster and for each data type. The boundaries of each cluster are presented in Table 3.

Table 3. Cluster boundaries.

Based on the results of clustering by the indicator of municipal waste generation, we obtained more homogeneous groups: 12 countries in cluster 0, 7 countries in cluster 1, 14 countries in cluster 2, and 5 countries in cluster 3. The countries within each cluster demonstrate similar dynamics of change in this indicator over the entire study period (Table 4). Clustering by the indicator "Generation of waste excluding major mineral wastes per GDP unit" produced groups of countries that differ significantly in size: cluster 0 included 26 countries, cluster 1 only 1 country (Macedonia), cluster 2 included 2 countries (Bulgaria and Estonia), and cluster 3 included 7 countries. The dynamics of the waste generation indicator in Macedonia's industrial sector are unique and are characterized by a large spread of the indicator values during the study period. Bulgaria and Estonia are also characterized by a relatively large spread in the values of the industrial waste generation indicator over the period under study. In addition, they demonstrate the highest level of waste generation in the industrial sector among all European countries (Figure 7).
Cluster 0 for both the first and the second indicator includes the countries with the least amount of waste over the entire study period, in the consumption sector (municipal waste) and in the production sector, respectively. The lowest municipal waste generation per capita is observed in Albania, Bosnia and Herzegovina, Belgium, Estonia, Croatia, Hungary, Lithuania, Macedonia, Serbia, Sweden, Turkey, and Kosovo. The lowest industrial waste generation is observed in almost all "old" developed countries of the European Union. The intersection of these two clusters is BE, HR, HU, SE, and TR (Belgium, Croatia, Hungary, Sweden, and Turkey). In these countries, both the municipal and the industrial waste indicators are the lowest of all European countries throughout the study period. The institutional structure, consumer culture, and practical measures these countries have taken to develop a circular economy are of the greatest interest for study and replication.
Next in ascending order of "Generation of municipal waste per capita" comes cluster 2, which includes 14 countries (Austria, Bulgaria, Greece, Spain, Finland, France, Iceland, Italy, Montenegro, Netherlands, Norway, Portugal, Slovenia, and United Kingdom). Cluster 3 of the industrial waste indicator includes seven countries (Bosnia and Herzegovina, Lithuania, Montenegro, Poland, Romania, Serbia, and Kosovo). The intersection of these two clusters is only ME (Montenegro). In terms of the minimum, maximum, median, and average values of the municipal waste generation indicator, cluster 3 (Czech Republic, Latvia, Poland, Romania, and Slovakia) does not differ much from cluster 2. Still, its separation into a distinct cluster indicates different dynamics over the period.
Continuing the ranking of groups of countries according to the dynamics of waste generation indicators, we further highlight cluster 1 in the consumer sector and clusters 1 and 2 already described above in the industrial sector. Switzerland, Cyprus, Germany, Denmark, Ireland, Luxembourg, and Malta demonstrate the highest municipal waste generation per capita over the entire study period. Macedonia, Bulgaria, and Estonia show a high level of industrial waste generation. There are no intersections of these clusters, i.e., it is impossible to single out a group of countries that would simultaneously lag in CE development both in the consumer and manufacturing sectors.
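The search for intersections between the two classifications amounts to set operations on cluster membership. The sketch below uses the country lists given in the text; the dictionary layout is hypothetical, the assignment of Macedonia to industrial cluster 1 and of the seven-country cluster 3 to the industrial indicator is our reading of the text, and industrial cluster 0 is omitted because its 26 members are not listed.

```python
# Clusters for "Generation of municipal waste per capita"
municipal = {
    0: {"Albania", "Bosnia and Herzegovina", "Belgium", "Estonia", "Croatia",
        "Hungary", "Lithuania", "Macedonia", "Serbia", "Sweden", "Turkey",
        "Kosovo"},
    1: {"Switzerland", "Cyprus", "Germany", "Denmark", "Ireland",
        "Luxembourg", "Malta"},
    2: {"Austria", "Bulgaria", "Greece", "Spain", "Finland", "France",
        "Iceland", "Italy", "Montenegro", "Netherlands", "Norway",
        "Portugal", "Slovenia", "United Kingdom"},
    3: {"Czech Republic", "Latvia", "Poland", "Romania", "Slovakia"},
}

# Clusters for "Generation of waste excluding major mineral wastes per GDP unit"
# (cluster 0 is not listed in the text and is therefore left out)
industrial = {
    1: {"Macedonia"},
    2: {"Bulgaria", "Estonia"},
    3: {"Bosnia and Herzegovina", "Lithuania", "Montenegro", "Poland",
        "Romania", "Serbia", "Kosovo"},
}

# Countries in municipal cluster 2 and industrial cluster 3
mid_overlap = municipal[2] & industrial[3]

# Countries with the highest waste generation under both indicators
high_overlap = municipal[1] & (industrial[1] | industrial[2])
```

With more indicators, the same pattern extends to intersecting any number of cluster sets.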
From a practical point of view, the clustering can be considered a preliminary stage in studying the institutional and technological structure and the management strategies in the field of the CE. Using the data obtained, stakeholders can focus further analysis of circular processes on only those groups of countries that are of greatest interest. These can be countries with low waste generation rates throughout the study period, as role models, or, conversely, countries with high rates, as cases for analyzing managerial mistakes or identifying institutional failures. Of particular interest for study are clusters with approximately the same values of the waste generation indicators but with differently directed dynamics.
We clustered on only two indicators in order to demonstrate the proposed dynamic clustering approach and the capabilities of Python. The set of indicators for clustering can be quickly expanded by loading an array of statistical data from open sources (for example, the European system for monitoring the development of the CE) and repeating the steps described in the methodology section according to the proposed algorithm.
The use of built-in Python libraries and visualization tools makes it possible to process large amounts of data in a semi-automatic mode without high labor costs. It does not require specialized skills and competencies in data mining or programming. The software can easily be configured to solve a specific class of problems in the preliminary analysis of large datasets, which increases the likelihood that this toolkit will be used in the daily practice of managing circular processes.
Although our study of European countries was conducted more to demonstrate the capabilities of the proposed approach than to compare the level of CE development accurately, our results are in good agreement with those known in the literature. For example, our study fully confirms the conclusion of Lacko et al. that a country's high GDP is not a guarantee of a high level of development of a circular economy [32]. The conclusion that the highest level of the CE is observed in some "old" EU countries, such as Belgium and Sweden, is in good agreement with the results of Reference [7], although we used a much smaller set of indicators for the analysis. At the same time, countries such as Spain and France demonstrate a decrease in the efficiency of the circular economy, which is consistent with the conclusions of Reference [43], obtained using the DEA methodology. Our results for Malta, Cyprus, and Estonia (the least advanced in the CE) also agree well with the findings of Reference [19], which were obtained for all circular economy indicators used in European statistics.
As the main policy application, we suggest that CE practices in Belgium, Croatia, Hungary, and Turkey are the most efficient and should be studied in more detail with a view to adapting them in other countries.

Conclusions
The study showed that the k-means algorithm allows the division of all countries of the training sample (EU countries) into several clusters that differ in the development of the circular economy in various spheres of production and consumption. In addition, the object-oriented approach allows one to quickly process large amounts of data, including conducting statistical evaluations and building multiple models. The best quality of classification is obtained with the indicator "Generation of municipal waste per capita" (cei_pc031); the quality of classification with the indicator "Generation of waste excluding major mineral wastes per GDP unit" (cei_pc032) is only satisfactory because the data are of poor quality and incomplete throughout the period under consideration. An essential feature of the obtained classifications is that they take into account the development of circular processes over time. Combining several classifications according to different indicators by searching for intersections of classification groups allows one to reflect the development of circular processes more fully, identify the countries that drive the development of the circular economy, and track their impact on the spread of circular processes [44,45].
The scientific novelty of the research lies in testing the processing of circular economy data using classical statistical methods in an object-oriented environment. The study results demonstrate the fundamental applicability of classical statistical approaches to solving the problems of strategic management in the development of a circular economy and their potential effectiveness in terms of the clarity and information content with which they reflect the development of circular processes. The next stage of the study will be to conduct a spatial analysis of the distribution of circular methods of production and consumption using tools such as, for example, ArcGIS-ArcMap.
For a detailed analysis of the temporal characteristics of the development of a circular economy, it is preferable to have statistics for individual regions rather than for countries. This is especially true in the analysis of countries with a large territory and high socio-economic and/or natural-climatic heterogeneity. The lack of such detail is a specific limitation of our study, since it does not allow us to single out specific growth centers of the circular economy: regional clusters, eco-industrial parks, individual cities, and municipalities. However, this is a limitation of the available data and not of the classification method or the proposed approach.

Conflicts of Interest:
The authors declare no conflict of interest.