Article

Robust Statistical Approaches for Stratified Data of Municipal Solid Waste Composition: A Case Study of the Czech Republic

by Radovan Šomplák *, Veronika Smejkalová, Vlastimír Nevrlý and Jaroslav Pluskal
Institute of Process Engineering, Faculty of Mechanical Engineering, Brno University of Technology, Technická 2896/2, 616 69 Brno, Czech Republic
* Author to whom correspondence should be addressed.
Recycling 2025, 10(4), 162; https://doi.org/10.3390/recycling10040162
Submission received: 17 June 2025 / Revised: 23 July 2025 / Accepted: 3 August 2025 / Published: 12 August 2025

Abstract

Accurate information on waste composition is essential for strategic planning in waste management and developing environmental technologies. However, detailed analyses of individual waste containers are both time- and cost-intensive, resulting in a limited number of available samples. Therefore, it is crucial to apply statistical methods that enable reliable estimation of average waste composition and its variability, while accounting for territorial differences. This study presents a statistical approach based on territorial stratification, aggregating data from individual waste container analyses to higher geographic units. The methodology was applied in a case study conducted in the Czech Republic, where 19.4 tons of mixed municipal waste (MMW) were manually analyzed in selected representative municipalities. The method considers regional heterogeneity, monitors the precision of partial estimates, and supports reliable aggregation across stratified regions. Three alternative approaches for constructing interval estimates of individual waste components are presented. Each interval estimate addresses variability from the random selection of waste containers and the selection of strata representatives at multiple levels. The proposed statistical framework is particularly suited to situations where the number of samples is small, a common scenario in waste composition analysis. The approach provides a practical tool for generating statistically sound insights under limited data conditions. The main fractions of MMW identified in the Czech Republic were as follows: paper 6.7%, plastic 7.3%, glass 3.6%, bio-waste 28.4%, metal 2.1%, and textile 3.0%. The methodology is transferable to other regions with similar waste management systems.

1. Introduction

Changing current waste management (WM) systems towards increased recycling and resource recovery is complex and financially demanding [1]. However, this transition is necessary to reduce environmental impacts, particularly emissions associated with waste disposal [2]. To plan and implement such system changes effectively, it is essential to use modeling approaches that can simulate different management scenarios [3]. These models depend critically on accurate estimates of waste composition, which serve as input data for predicting outcomes and planning interventions [4].
Mixed municipal waste (MMW), representing the main unsorted waste stream, provides essential information for evaluating separation potential and improving recycling processes [5]. Accurate knowledge of its composition allows targeted WM development, supporting decision-making related to sorting efficiency [6], recyclability of plastic waste [7], potential for pyrolysis [8], and processing of food waste through fermentation [9]. Moreover, the composition of solid waste strongly affects the applicability of existing treatment technologies, including energy recovery [10]. Waste composition data are also increasingly integrated into sustainable management strategies [11].
Obtaining reliable estimates of MMW composition requires both physical waste analyses and robust statistical evaluation methods. Data processing methodology plays a key role in ensuring the applicability and credibility of the results. A novel approach based on zero-intercept regression, proposed by Ezennia et al. [12], has demonstrated the potential for improving the accuracy of composition estimates through statistical modeling. Recent studies further highlight that significant spatial variability in waste composition can be observed even within homogeneous administrative units. For example, Tonjes et al. [13] analyzed waste composition in New York State and reported notable differences between and within WM systems. These findings confirm the importance of using stratified sampling and statistical approaches to obtain representative estimates.
This study aims to analyze current methodologies for statistical evaluation of waste composition, identify their limitations, and propose improvements towards a comprehensive and practical tool for waste composition assessment. Procedures for solid waste analysis are already described in the standardized SWA-Tool (solid waste analysis) methodology and its recommendations [14]. Waste composition evaluation rests on stratifying the investigated territory and statistically evaluating the collected data. Stratification is necessary to select the most representative samples across the territory, as regions can differ in character and WM system, which may considerably affect the MMW composition [15].
The generally recommended methodology is divided into three levels described in Figure 1. The individual steps are strongly interconnected, and it is necessary to consider this aspect when designing the solution. This paper focuses on the statistical evaluation of the obtained data (using point and interval estimates of waste composition) and aggregating the results from collection containers or municipalities into a larger territorial unit (e.g., region or state). For relevant results, it is necessary to achieve a certain precision according to statistical assumptions at each level of stratification. It is essential to estimate the required number of samples to achieve the desired precision, which further allows for effective investigation planning from the point of view of costs and staffing.
However, the results processed in the third level (Figure 1) are significantly influenced by the choice of location and container, and subsequently by the analysis method. Before starting the analyses, the examined territory must be appropriately divided into clusters. Within the clusters, waste sample analyses are performed in only a few selected localities (cluster representatives) that are expected to best describe the whole territory (e.g., the state) [16]. The principles of stratification serve this purpose and are commonly used in fields such as sociology [17] and climatology [18]. Based on the stratification of the territory, the best representatives are determined, and the actual waste composition analyses are carried out in these selected parts of the territory (e.g., municipalities). Stratification makes waste composition analyses more efficient, i.e., it reduces their number while maintaining the quality of the results. It is also necessary to select the waste fractions that will be monitored, and the sample’s content should be investigated using a predetermined methodology. In some cases, it is also appropriate to stratify at the level of collection containers to describe variability within the municipality. A key factor is the coherence of stratification, sampling methodology, and suitable statistical methods. The statistical methods are then used to evaluate the collected data and to aggregate the results so that they can be applied meaningfully.
This work focuses on the third part of the approach described above, i.e., on the statistical evaluation of the data obtained and the aggregation of the evaluated results. As Figure 1 shows, the third level is directly influenced by the approach taken at the preceding levels of waste analysis. Because the methodology is complex, includes feedback, and refines the results iteratively, the whole process must be described. Follow-up research should therefore focus on the approaches to stratification and to performing waste analyses. The quality of waste composition estimation depends on the interplay of all three levels.
The main objective of this study is to develop a general statistical approach for processing waste composition data in the context of territorial stratification, using a real case study based on MMW in the Czech Republic. Although designed primarily for MMW, the presented methodology applies to any waste type requiring stratified sampling and analysis. The key contribution of this work lies in enabling the aggregation of results from stratified territories to provide composition estimates for larger units, such as regions or entire countries. This approach improves existing methodologies by systematically considering sample-level variability and the variability from selecting representative locations. Additionally, the results from the Czech case study, covering approximately 19.4 tons of MMW analyzed across 10 clusters, provide valuable reference data for comparable regions, especially in Central and Eastern Europe. The methodology also supports practical applications in waste management planning, including economic evaluation of analysis campaigns and setting of data collection priorities based on the required precision.

1.1. Research Gap and Challenges

Current research into the production and composition of MMW is limited by economic and time constraints for carrying out waste composition analyses to the necessary extent. While numerous studies address waste composition, most are constrained by simple sampling strategies and basic statistical methods that fail to capture waste streams’ spatial and socio-economic variability fully. A critical review of available methodologies revealed that many studies apply random sampling or single-level stratification without properly treating variability between regions or sampling units. Furthermore, data aggregation into larger territorial units is often performed without rigorous statistical justification, potentially leading to biased or misleading results. Part of this work was a search for statistical methods used in the research on the production and composition of MMW. The articles to be studied were selected from the Scopus database based on the following keywords: stratification, waste composition, waste sampling, stratum characterization, waste components, waste classification, and SWA-Tool. Monitored parameters are as follows:
  • Assumptions check: many statistical methods rest on defined assumptions (e.g., data normality), which should be verified.
  • Point/interval estimate: a point estimate is simple to apply. However, the probability that it equals the true value exactly is zero, and it gives no information about variability.
  • Relative/absolute estimate: relative and absolute estimates complement each other. In the context of waste composition, the relative estimate is considered more useful and gives clear insight into the obtained results.
  • Necessary number of samples: the precision of a statistical evaluation depends strongly on dataset size. The required precision should be set with regard to the application and checked against the results.
  • Testing of statistical hypotheses: analysis of the obtained estimates to detect statistically significant patterns in the evaluated data.
  • Aggregation of results: in the case of stratification, a statistical method is needed for aggregating the obtained results to a higher territorial level.
The results of the search are summarized in Table 1. Almost all the studies below focused on research into the production and composition of MMW, except for the study of specific electrical waste [19] and anaerobic swine waste [20]. It was also noted that methodologies differ significantly in how waste types are described and results reported, making comparisons between studies difficult.
The most critical gap identified relates to insufficient treatment of spatial variability and the lack of robust aggregation methods linked to stratification. Most existing publications rely on random sampling without systematic territory division. Variability due to regional differences or representative selection is rarely quantified. Results are often generalized to higher territorial units without proper statistical foundations, which increases the risk of biased conclusions and misapplication in waste management planning. Only a small number of publications focusing on waste composition estimation describe in detail statistical methods that would extend the commonly used approaches.
In general, all the above articles use the same method to describe the main characteristics of the sought unknown quantity—the ratio of selected waste fractions in the total weight. However, the calculations assume a normal distribution of data, which is usually not verified by the appropriate statistical method. For fractions that are abundant in waste (e.g., paper, plastic), this strict assumption is not a considerable problem. For fractions with a small proportion (e.g., electrical waste, metal), the confidence interval also contains negative values that do not reflect reality. Another neglected factor is the achieved precision of interval estimates and feedback for the required number of samples. Most approaches focus solely on relative precision, which becomes impractical for fractions with small shares, where achieving the required precision would demand an excessive number of samples. This issue can be addressed using absolute precision, but choosing the appropriate form of precision requires a methodological framework not discussed in the current literature.
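The trade-off between relative and absolute precision can be illustrated with a minimal sketch. It assumes the textbook normal-approximation formula n = (z·s/d)², where d is the target half-width of the confidence interval; this is not necessarily the formula used in Appendix A. The function name `required_samples` and all input values are hypothetical.

```python
import math
from statistics import NormalDist


def required_samples(mean, sd, confidence=0.95, relative=None, absolute=None):
    """Samples needed so that the CI half-width does not exceed the target.

    Normal approximation only (an assumption of this sketch).
    `relative` is a fraction of the mean; `absolute` is in the units of `mean`.
    """
    if (relative is None) == (absolute is None):
        raise ValueError("give exactly one of `relative` or `absolute`")
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    half_width = relative * mean if relative is not None else absolute
    return math.ceil((z * sd / half_width) ** 2)


# Hypothetical shares in % of sample weight (illustrative, not study data).
bio_rel = required_samples(mean=28.0, sd=12.0, relative=0.10)   # abundant fraction
metal_rel = required_samples(mean=2.0, sd=2.5, relative=0.10)   # minor fraction
metal_abs = required_samples(mean=2.0, sd=2.5, absolute=0.5)    # +/- 0.5 percentage points
print(bio_rel, metal_rel, metal_abs)
```

For the minor fraction, the relative criterion demands several times more samples than the absolute one, which is the impracticality described above.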
Currently, only a few specialized approaches seek to address the issue comprehensively. Procedures for statistical evaluation of a stratified territory are part of the methodological recommendations of the SWA-Tool [14]. Similar statistical procedures are given in the ASTM methodology [39], as confirmed by [40], which uses the ASTM procedure to process the results of 37 studies and reports the standard deviation and coefficient of variation. Another guideline, published in Scotland, provides the basis for determining the minimum number of samples depending on the variation and confidence interval [39].
However, these existing guidelines rely on standard formulas and do not sufficiently address data aggregation from stratified territories, nor account for variability arising from the selection of representative locations. A structured approach combining stratification, variability control, and iterative precision evaluation is lacking. The limitations of existing methods—namely insufficient stratification, neglect of representative selection variability, reliance on symmetric intervals, and missing iterative precision control—highlight the need for a more comprehensive framework. This study addresses these gaps by introducing a statistical framework based on multi-stage stratification, combined with advanced aggregation methods and interval estimation approaches that account for both container-level and territorial-level variability. The methodology also supports optimization of sampling based on defined precision criteria, providing a structured and reproducible process for waste composition estimation.
A recent study [28] focused specifically on waste analysis in the Czech Republic, an EU member state and the source of the case study presented in this article. However, that research did not apply territorial stratification; municipalities were only divided into two groups based on size. The specific collection containers for waste analysis were then selected randomly within 35 municipalities and evaluated without distinction or further aggregation. Additionally, the data were used solely for point estimation, which significantly reduced the explanatory value and the efficiency of the resources invested in waste analysis. Most similar studies follow this approach, making a comprehensive statistical evaluation methodology a necessary step towards sustainability and more efficient use of available resources.

1.2. Novelty and Main Contribution

This work proposes a theoretical approach to the statistical processing of waste analysis data under general stratification and applies it to a case study based on a real dataset. The methodology is defined generally for composition analysis of any waste type, but it is primarily designed for MMW. Its main benefit is that results from different areas can be aggregated into an overall estimate for a higher territorial unit. Based on partial waste composition analyses in predetermined localities, waste composition can then be evaluated at the level of micro-regions, regions, or the entire state. A further contribution is the set of MMW composition results for locations with specific characteristics, which can be used in case studies and WM applications in various locations.
At the same time, data from an extensive case study are provided, in which about 19.4 t of MMW were analyzed in 10 clusters on the territory of the Czech Republic. The composition results, including interval estimates, are presented in Section 3. Each cluster represents a particular group of municipalities with similar characteristics. Another significant output is the discussion of the results for individual clusters, which are broadly transferable, especially to Central and Eastern Europe, but also to other parts of the world.
Legislative and industrial use of the results is the main benefit. An example is the TIRSMZP719 2019–2021 project [41], which addresses the average composition of the MMW for the Ministry of the Environment of the Czech Republic and follows this methodology. Estimating the necessary number of samples is a direct prerequisite for planning the economic costs of analysis. Cost planning is possible not only from the municipality’s point of view but also for regions and states. Using the presented approach, it is possible to plan the costs of local surveys in this area from a global perspective.
The authors of this article set the following requirements for the statistical processing of data from waste sample analyses. All these points are addressed within the presented approach in Section 2 and Appendix A.
  • Point and interval estimates—it is advisable to design asymmetric interval estimates that better describe the properties of the data. The interval should be limited to non-negative values.
  • The results must consider the different waste weights of the individual MMW composition analyses.
  • It must be possible to aggregate partial results into higher territorial units. The stratification results and the variability in individual waste composition analyses must be considered.
  • It is advisable to choose a precision criterion for interval estimates. A tool should be provided to decide on relative or absolute precision.
  • It is advisable to estimate the required number of samples based on partial waste composition analyses, allowing for initial data variability estimates.
  • It is appropriate to have an approach for estimating the waste composition of all territorial units based on the characteristics of territorial stratification.
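The requirement that results reflect the different weights of individual analyses can be illustrated as follows. The sketch assumes a simple ratio estimator (total fraction mass divided by total analyzed mass) and contrasts it with an unweighted mean of per-sample shares; the exact estimator in Appendix A may differ. Function names and data are hypothetical.

```python
def weighted_share(fraction_kg, sample_kg):
    """Share of one fraction in the total analyzed weight (ratio estimator)."""
    if len(fraction_kg) != len(sample_kg) or not sample_kg:
        raise ValueError("need two non-empty lists of equal length")
    return sum(fraction_kg) / sum(sample_kg)


def unweighted_share(fraction_kg, sample_kg):
    """Plain mean of per-sample shares; ignores how heavy each sample was."""
    if len(fraction_kg) != len(sample_kg) or not sample_kg:
        raise ValueError("need two non-empty lists of equal length")
    return sum(f / s for f, s in zip(fraction_kg, sample_kg)) / len(sample_kg)


# Hypothetical data: one large container and two small bins (kg).
sample_kg = [120.0, 20.0, 10.0]
paper_kg = [6.0, 3.0, 2.0]

w = weighted_share(paper_kg, sample_kg)      # dominated by the large container
u = unweighted_share(paper_kg, sample_kg)    # every sample counts equally
print(f"weighted: {w:.3f}, unweighted: {u:.3f}")
```

When sample weights vary as strongly as between large containers and small bins, the two estimates can differ substantially, which is why the weighting has to be explicit.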
Unlike existing approaches to the statistical evaluation of waste analyses, the newly proposed method utilizes results more comprehensively, especially regarding result aggregations. It not only considers variability caused by the selection of collection containers but also accounts for variability arising from the selection of representatives of individual strata at different levels. At the same time, the formulas offer a solution for cases where the number of representatives and individual waste samples is small. This is a typical scenario in waste analysis evaluations, as these are both time-consuming and financially demanding. The approach thus presents a comprehensive methodology with recommendations and possible alternatives for specific scenarios in waste composition analysis.

2. Methodology

This section briefly outlines the methodology for estimating the composition of the MMW fraction. A detailed description is available in Appendix A. It is possible to use several approaches to estimate waste composition in the investigated territory, as is shown in Figure 2. The most straightforward approach is a statistical evaluation of samples selected without any stratification or further knowledge. Another approach is one-level stratification, which involves analyzing waste in each stratum and performing aggregation to estimate the investigated territory. The most suitable way is a comprehensive methodology consisting of multi-stage stratification, complex statistical methods, and iterative assessment of estimate precision.
An important distinction must be made between the number of representative locations and the number of samples. Representative locations (municipalities) are selected to reflect the entire territory’s spatial, socio-economic, and waste-production diversity. Multiple samples (individual container analyses) are collected within each representative location to capture local variability and ensure statistical reliability. While the number of samples directly determines the statistical precision of the estimates, the number of representatives defines the territorial and population coverage of the analysis. Both quantities are essential for proper waste composition assessment and cost planning.
The proposed data processing procedure consists of seven consecutive steps shown in Figure 2. For clarity, the evaluated territory can be thought of as a state and the lowest territorial units as municipalities.
  1. The evaluated territory is first subjected to a cluster analysis. Based on the selected parameters, the territory is divided into several clusters (A to H in Figure 2).
  2. From each cluster, the one representative that best represents the cluster according to the given criteria is selected.
  3. The collection containers are selected. If the representative is a larger city, it is appropriate to stratify it based on selected characteristics (e.g., describing individual built-up areas) so that relevant results are obtained efficiently and economically.
  4. Waste composition analyses are performed in the selected location (representative) using a predetermined methodology.
  5. The empirical data are then statistically evaluated; point and interval estimates are determined for each monitored fraction. The individual waste composition analyses are evaluated iteratively until the precision criterion is met in the individual strata.
  6. Subsequently, these strata estimates are aggregated for the representative, which describes the whole cluster.
  7. In the last step, all clusters are aggregated to estimate the overall composition in the evaluated territory.
As already mentioned, this article’s research subject is the statistical evaluation of the obtained data and determining the waste composition in the selected territory (i.e., points 5, 6, and 7). However, it is necessary to describe the entire complex methodology due to feedback in the process and an iterative approach involved in refining the results.
Statistical processing of waste composition analyses is described in detail in Appendix A. The mathematical apparatus evaluates the percentage composition of waste by predetermined fractions. The absolute amount can be calculated from the total production. The four approaches to statistical processing of waste composition analyses are briefly described below:
  • Estimation without stratification: This is a situation where the waste composition analysis was performed without previous stratification. It is helpful for an application on small territorial units (e.g., village, municipality).
  • Estimation for stratified locality/municipality: Stratification is beneficial if heterogeneity of the analyzed area is expected. The samples are representative of a particular locality.
  • Estimation for stratum consisting of many localities: The two approaches above consider waste composition estimates only for individual municipalities. However, the goal of waste composition analysis is often an estimate for a region or even for the whole state.
  • Estimation at a national level: The last approach to statistical processing of analyses is based on aggregating results for several strata. In this way, estimating the waste composition for a higher territorial unit is possible.
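Aggregation over several strata can be sketched as a weighted combination of per-stratum estimates. The example below is a simplification under stated assumptions: strata are treated as independent, weights are normalized before use, and only the within-stratum standard errors are propagated. The additional variability from selecting representatives, which the proposed methodology accounts for, is deliberately omitted. All names and numbers are hypothetical.

```python
import math


def aggregate_strata(estimates, std_errors, weights):
    """Combine per-stratum shares into one estimate with normalized weights.

    Assumes independent strata; returns (point estimate, standard error).
    """
    if not (len(estimates) == len(std_errors) == len(weights)) or not weights:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(weights)
    norm = [w / total for w in weights]          # weights now sum to 1
    point = sum(w * e for w, e in zip(norm, estimates))
    variance = sum((w * s) ** 2 for w, s in zip(norm, std_errors))
    return point, math.sqrt(variance)


# Hypothetical bio-waste shares in three strata, weighted by waste production.
point, se = aggregate_strata(
    estimates=[0.30, 0.25, 0.20],
    std_errors=[0.02, 0.03, 0.04],
    weights=[500.0, 300.0, 200.0],   # e.g., kt of MMW produced per stratum
)
print(f"aggregated share: {point:.3f} (SE {se:.4f})")
```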

3. Results for the Case Study of the Czech Republic

The mathematical method described in Section 2 and in more detail in Appendix A is applied to data from the Czech Republic. The presented methodology is based on theoretical statistics and is generally applicable. The example of the Czech Republic provides the MMW composition for different clusters (C1–C10).

3.1. Waste Analysis Information

The presented methodology was demonstrated on the Czech Republic. Its territory, with an area of 78,870 km² and a population of over 10.5 million, is administratively divided into 14 regions. These are divided into 206 micro-regions and further into 6258 municipalities. Based on stratification, 19.4 t of MMW were selected and analyzed for the territory of the Czech Republic.
According to the proposed methodology, the area was stratified using a multi-level approach. The method combines socio-economic, demographic, and waste-related parameters (such as waste production and heating type) to divide municipalities into homogeneous groups and select representative locations for analysis. The detailed description of the stratification process, including the criteria and clustering steps, is presented in the study by Šomplák et al. [42]. In brief, hierarchical clustering based on Euclidean distance was applied to population density, waste generation per capita, settlement type, and socio-economic indicators. Representative municipalities were selected as those closest to the cluster centroid [42]. A similar procedure was used for the container selection within each investigated municipality. Based on this approach, the area of the Czech Republic was stratified into 10 clusters (C1–C10), as seen in Figure 3. Representatives were selected to reflect the diversity of settlement types and waste production patterns (see points 1 and 2 in Figure 2). Seasonal variability, which is known to affect household waste composition [43], was considered during sampling design according to the methodology described by Šomplák et al. [42], recommending the distribution of analyses between heating and non-heating seasons. However, due to dataset limitations, seasonal effects could not be quantitatively evaluated in this study. Waste analyses were subsequently performed in the selected representative locations.
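The selection of a representative as the municipality closest to the cluster centroid can be sketched as follows. The example assumes that cluster membership is already known and that the features have been standardized beforehand so that the Euclidean distance is meaningful; the clustering step itself is not reproduced. Municipality names and feature values are hypothetical.

```python
import math


def closest_to_centroid(members):
    """Return the member whose feature vector is nearest to the cluster centroid.

    `members` maps a municipality name to a tuple of (standardized) features.
    """
    if not members:
        raise ValueError("cluster must contain at least one member")
    names = list(members)
    dims = len(members[names[0]])
    centroid = [
        sum(members[name][d] for name in names) / len(names) for d in range(dims)
    ]
    return min(names, key=lambda name: math.dist(members[name], centroid))


# Hypothetical cluster: (population density, waste per capita), standardized.
cluster = {
    "Municipality A": (0.0, 0.0),
    "Municipality B": (1.0, 1.0),
    "Municipality C": (4.0, 2.0),
}
representative = closest_to_centroid(cluster)
print(representative)
```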
Analyses of the MMW were performed for representatives of each cluster (C1–C10), as seen in point 4 in Figure 2. The aim was to obtain knowledge and key information for identifying individual fractions of the MMW based on certified methodology. Namely, paper, plastic, glass, biowaste, metals, textiles, composite packaging, electrical equipment, others, and the fraction of waste smaller than 40 mm were sorted out. The waste analysis aimed to identify the composition of the MMW, i.e., weighing and recording the weight and percentage of individual monitored fractions.
Manual analyses of MMW from randomly selected collection containers from the given housing types were performed. First, a sieve (with a mesh size of 40 mm) was used for sorting, yielding an under-sieve fraction and an over-sieve fraction. The over-sieve fraction was then manually sorted and weighed. Furthermore, the total weight of the analyzed waste and the average weight in one sample are given. The average weights differ between localities depending on the predominant type of housing. For the housing estate type, the analysis involved large collection containers (with a volume of 1100 L), where, due to the large volume of waste, only part of the waste was used for the analysis. For the rural type, the samples were relatively small point samples of waste from small bins (usually with a volume of 240 L). The single-family detached house type combined the two previous variants.

3.2. Statistical Evaluation

The following part of the case study focuses on the statistical evaluation of available data from the waste analysis using the statistical methodology described in Section 2 and Appendix A. First, appropriate stratum weights, the confidence level, and the desired precision must be selected. The weights were determined according to the quantity of waste produced; these are approximately the same values for any cluster. Weights must be normalized in all aggregations. The confidence level is set at the usual value of 95%. A combination of absolute (0.5%) and relative precision (30%) is used to assess the precision of the waste analyses. An estimate is considered precise if at least one of these precisions is met. Bootstrap construction uses 5000 generated samples from the available dataset.
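The combined precision criterion can be written as a short check. This sketch assumes that precision is measured as the half-width of the interval estimate, in percentage points for the absolute criterion and relative to the point estimate for the relative criterion; the exact definition is given in Appendix A and may differ. The interval bounds in the example are hypothetical.

```python
def is_precise(estimate, lower, upper, abs_tol=0.5, rel_tol=0.30):
    """True if the interval meets the absolute OR the relative precision target.

    All inputs are in % of waste weight. Half-width as the precision measure
    is an assumption of this sketch.
    """
    half_width = (upper - lower) / 2
    meets_absolute = half_width <= abs_tol
    meets_relative = estimate > 0 and half_width / estimate <= rel_tol
    return meets_absolute or meets_relative


# Hypothetical intervals around illustrative point estimates (% of weight).
print(is_precise(28.4, 24.0, 33.0))  # wide in absolute terms, fine relatively
print(is_precise(0.5, 0.2, 0.9))     # poor relatively, fine in absolute terms
print(is_precise(3.6, 1.5, 6.1))     # fails both criteria
```

Accepting either criterion keeps minor fractions from dictating an unrealistic number of samples while still constraining abundant fractions in relative terms.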
The C1 cluster is selected to describe the results and differences between approaches for confidence intervals. First, the individual strata of the representative must be evaluated, and the precision of the obtained estimates must be reviewed. According to the diagram in Figure 2, this step is point 5. The formulas (described in Appendix A) are used for the calculation, which do not assume further stratification of the examined area. Figure 4 shows important statistical outputs describing the composition of MMW in the housing estate of the cluster C1. A total of 55 samples with a total weight of 6747 kg were analyzed for C1.
Interval estimates allow the suitability of the individual approaches to be assessed. The common construction yields a symmetric interval. Its disadvantage is that, for a waste fraction with a small share and a large variance, it can include negative values, which do not correspond to reality. For this reason, an alternative construction is proposed: the logit transformation provides an asymmetric interval and guarantees non-negativity. Figure 4 shows that the alternative interval is shifted towards higher values even when the symmetric interval is non-negative. However, this shift is negligible for a larger number of samples. The bootstrap percentile method is an easy construction for non-negative interval estimation, but for a small number of samples it gives a confidence interval that is too narrow. This is mainly because the range of the confidence interval is limited by the range of the observed individual ratios, which leads to overly optimistic results when little waste has been analyzed. With the presented methodology, the necessary number of samples can be updated efficiently while the waste analyses are in progress. This is crucial for keeping economic costs to a minimum. The MMW composition for each cluster is evaluated in the same way.
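The three interval constructions can be compared on a small example. The sketch below makes several simplifying assumptions that may differ from Appendix A: it uses the unweighted mean of per-sample shares, a normal quantile instead of a t quantile, and the delta method for the standard error on the logit scale. The data are hypothetical shares of a minor fraction.

```python
import math
import random
from statistics import NormalDist, mean, stdev

Z = NormalDist().inv_cdf(0.975)  # two-sided 95% normal quantile


def symmetric_ci(shares):
    """Classical mean +/- z * SE interval; may fall below zero."""
    m, se = mean(shares), stdev(shares) / math.sqrt(len(shares))
    return m - Z * se, m + Z * se


def logit_ci(shares):
    """Interval built on the logit scale and mapped back into (0, 1).

    Requires 0 < mean < 1; SE on the logit scale via the delta method.
    """
    m, se = mean(shares), stdev(shares) / math.sqrt(len(shares))
    centre = math.log(m / (1 - m))
    half = Z * se / (m * (1 - m))
    return 1 / (1 + math.exp(-(centre - half))), 1 / (1 + math.exp(-(centre + half)))


def bootstrap_percentile_ci(shares, n_boot=5000, seed=0):
    """Percentile interval from resampled means (resampling with replacement)."""
    rng = random.Random(seed)
    means = sorted(mean(rng.choices(shares, k=len(shares))) for _ in range(n_boot))
    return means[int(0.025 * n_boot)], means[int(0.975 * n_boot) - 1]


# Hypothetical shares of a minor fraction in eight samples.
shares = [0.0, 0.0, 0.002, 0.004, 0.005, 0.01, 0.03, 0.09]
sym, lgt, bst = symmetric_ci(shares), logit_ci(shares), bootstrap_percentile_ci(shares)
for name, (lo, hi) in [("symmetric", sym), ("logit", lgt), ("bootstrap", bst)]:
    print(f"{name:>9}: [{lo:.4f}, {hi:.4f}]")
```

On data like these, the symmetric interval reaches below zero, the logit interval stays inside (0, 1), and the bootstrap bounds cannot leave the range of the observed shares.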
According to Figure 5, the aggregation of results is performed at the end to evaluate the entire examined territory, corresponding to point 7. The formulas related to Section 2 (described in Appendix A) are used for the calculation. The statistical evaluation of the Czech Republic also includes methodology from Section 2, which characterizes the composition of waste in strata and represents input data for aggregation at the level of the whole territory. During the investigation of the composition of MMW in the Czech Republic, 634 samples were taken with a total amount of waste of 19,401 kg.
The summary estimates are shown in Figure 5. The results of the bootstrap percentile method are not included, as it is unclear how to incorporate the additional variability arising from the aggregation of results. In contrast to previous calculations, the number of required samples is not estimated. Instead, the total number of investigated localities (representatives) in which waste analyses should be carried out to meet the required precision is given. The estimate for the necessary number of representatives is 17. This value is driven by the significant variance of the glass fraction, and it can be expected that including other localities will reduce the variance. The expected additional cost required to achieve the desired precision by conducting further analysis in 7 localities is discussed in Section 4.
In summary, the statistical evaluation shows that the most represented fractions are bio-waste (with a dominant share of food waste), the under-sieve fraction, and other waste. Information related to food waste can be utilized for measures aiming at higher separation, and for planning treatment infrastructure of new, effective technologies [44]. Common commodities such as plastic, paper, and glass each account for up to 10% of MMW. However, significant differences are evident for individual clusters. The other fractions represent a minor share of MMW. The results were aggregated into three strata, as indicated in Table 2, to describe the outputs. These are S1 (C1, C2), S2 (C3–C5), and S3 (C6–C9). Cluster C10 is not listed here because it includes only one location (Capital City Prague), where the analyses have not yet been carried out according to the valid methodology, and therefore, no data is available.
Put simply, S1 is an aggregation for cities, S2 for transitional areas, and S3 for rural settlements. In this representation, trends are already visible for some waste fractions. Commonly separated fractions, such as paper, plastic, and glass, are most abundant in MMW in cities (S1), and their share gradually decreases toward rural areas. In the rural localities, the shares of these fractions in MMW are significantly lower. The opposite trend is visible for bio-waste; its share in MMW is considerably lower in cities than in rural areas. The remaining fractions either have too little representation in MMW or do not reflect the character of the territory. Therefore, their occurrence in MMW is not linked to the type of housing.
The following Table 3 provides a comparison of the presented approach with the generally known statistical evaluation of the analyzed samples without considering stratification, and therefore, without considering gradual aggregation. The results are related to the highest level of stratification, i.e., for the entire Czech Republic. The results show that the point estimates of waste composition do not differ significantly. The simple evaluation works only with the weighted average according to the size of the individual samples. The presented approach additionally considers information from stratification, where individual clusters are weighted according to the amount of production they represent. However, stratification aims to design clusters of approximately equal size, so this aspect does not significantly impact the results.
However, a significant difference can be observed in the case of variance. Simple statistics do not consider that individual samples are taken from different locations, and the data is thus evaluated as a single set. The presented approach also considers the variability arising from the selection of representatives, i.e., that a representative might not have an average cluster composition. To reduce the impact of this contribution to variability, it is advisable to evaluate multiple representatives to minimize the uncertainty arising from this fact. This aspect, however, conflicts with the financial demands of waste analysis. Nevertheless, it is essential to consider this variability in subsequent studies and applications.

4. Discussion

The statistical processing of waste analyses significantly impacts the applicability of the results in practice. In line with the stated research gap and the given requirements for the statistical apparatus, three approaches for constructing interval estimates were introduced. Their advantages and disadvantages were discussed, together with appropriate recommendations. Given the economic demands of waste analyses, a way to monitor the precision of the results was also required.
The presented approach provides the number of samples and representatives needed to reach the desired precision and allows the whole project to be planned with properly assigned financial resources. It is closely connected with the choice between relative and absolute precision, and a combination of both is recommended due to their properties. The main benefit of the proposed procedure is that results from different areas can be aggregated into a comprehensive view of the higher territorial unit. Based on partial waste composition analyses in predetermined localities, it is then possible to evaluate waste composition at the levels of micro-regions, regions, or the entire state. The limitation of the approach clearly lies in the small database; from a statistical point of view, precision can be improved only by using a larger number of samples. The results are further influenced by the quality of the stratification and of the waste analyses in the field, which are beyond the scope of this work.
Although the stratification framework aims to mitigate sampling bias, the total analyzed waste mass of 19.4 t remains relatively small considering the national scope of the study. This limited database increases sensitivity to geographic and socio-economic heterogeneity, which may affect the representativeness of the composition estimates. In particular, regions underrepresented within the stratification process, such as certain rural or industrial areas, may exhibit different waste composition patterns. Therefore, while the methodology itself is designed to be broadly applicable, the specific composition results presented in this study should primarily be interpreted as representative of municipalities with characteristics similar to the selected clusters. A larger, more geographically balanced dataset would allow for more robust generalization to the entire national territory. However, in relative terms (population and waste production), the waste fraction analyzed in our study aligns with or exceeds common practice. The analyzed amounts are often limited to just a few tons, and sample sizes in the hundreds of tons are exceptional.
From a practical perspective, the sampling campaign conducted in the Czech Republic required approximately 79 working days, with a field team consisting of four sampling technicians and one supervisor. Considering the supervisor’s costs were approximately double those of a technician, the total personnel effort was equivalent to six full-time workers. This corresponds to approximately 474 person-days, or about 3800 working hours, excluding material, equipment, and transportation costs. This metric is used because each country has different wage levels based on its GDP. In our case study, an average of 8 working days is needed per locality. For seven more localities resulting from national evaluation in Figure 5, this would amount to up to 336 person-days. In addition to labor costs, protective gear, essential equipment, and transportation expenses must also be considered. Although a detailed cost analysis was beyond the scope of this study, these figures highlight the labor-intensive nature of waste composition analysis.
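The labor figures above can be reproduced with a few lines of arithmetic. The following is a minimal sketch; the function name `person_days` is hypothetical, and the 8-hour working day is an assumption that is consistent with the reported figure of about 3800 working hours.

```python
def person_days(working_days, technicians=4, supervisors=1, supervisor_weight=2.0):
    """Person-days of effort, counting a supervisor as two technicians."""
    equivalent_workers = technicians + supervisor_weight * supervisors
    return working_days * equivalent_workers


# Completed campaign: 79 working days with six full-time-equivalent workers.
campaign_days = person_days(79)

# Assumption: one person-day corresponds to 8 working hours.
campaign_hours = campaign_days * 8

# Seven additional localities at an average of 8 working days per locality.
extra_days = person_days(7 * 8)

print(campaign_days, campaign_hours, extra_days)
```

Expressing effort in person-days rather than in currency keeps the estimate comparable across countries with different wage levels.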
Based on the presented study and our experience in developing the improved approach, the following recommendations could be given for statistical evaluation of waste composition.
  • Investigated territory should be stratified in advance; multi-stage stratification based on multiple parameters is recommended.
  • To obtain relevant results, it is appropriate to stratify even the representatives according to significant parameters influencing the waste composition. A random sampling procedure should then be performed in the individual strata.
  • Confidence intervals (for instance, at the standard 95% confidence level) provide a valuable tool to assess the quality (variability) of estimates of waste composition.
  • From the presented methods of constructing confidence intervals, the logit transformation method performs best in the context of waste composition estimation.
  • The required precision should be defined at the beginning of the process concerning economic possibilities and the application of results.
  • It is suitable to evaluate absolute and relative precision at the same time. The results should achieve at least one of them.
  • To achieve defined precision in the required detail, having the same precision in all lower strata is recommended.
  • The sample size should be at least 10. It should be proportional to the required precision and iteratively evaluated with new data.
  • When using an absolute precision of 0.5% and a relative precision of 30%, at least 10 representatives can be required, and at least 30 samples should be analyzed in each sub-area. However, these numbers depend on the variability of data from waste analyses.
A comparative summary of the three interval estimation methods, including their practical advantages and limitations, is presented in Table 4. The results indicate that, while the symmetric interval is straightforward to compute, it may yield negative values for small waste fractions with high variance. The logit-transformed interval addresses this issue by ensuring non-negativity, although it can be slightly biased upwards. The bootstrap percentile method is easy to implement but tends to underestimate uncertainty with small sample sizes and cannot account for additional variability when aggregating estimates. These findings underscore the importance of choosing an appropriate method based on the characteristics of the data and the number of available samples.

5. Conclusions

A general methodology for statistical processing of analysis data was created. The results will reduce the analysis cost by minimizing the number of collection containers analyzed in each location. The developed statistical apparatus allows the estimation of the minimum required number of representative localities to achieve the desired precision of the aggregated result. This approach provides for gradually optimizing localities and samples based on currently available data. Accurate estimates of the MMW composition will make it possible to plan WM and the necessary treatment infrastructure. The results will form the basis for setting suitable WM goals and identifying locations where waste separation can be improved. High-quality data evaluation will help identify the potential for using environmentally friendly technology and effectively plan the development of WM and processing infrastructure. In this way, it is possible to reduce the environmental burden due to the production of MSW.
The presented framework was applied to a case study in the Czech Republic, where 19.4 tons of MMW were manually analyzed. The dominant component was bio-waste, especially food waste, which represents a significant potential for further utilization. The share of paper and plastics was also identified, which could be materially recovered with higher separation rates. The evaluation results were compared with the standard sample assessment, revealing that point estimates showed only minor differences. However, significant discrepancies were observed in the variability of estimates, as the proposed method also accounts for the influence of representative selection within stratification. It can be seen that the different characteristics of municipalities affect the composition of MMW. The approach further provides recommendations, alternative evaluations, and modifications to calculations in cases where the number of samples and representatives is limited. These insights are crucial for waste analyses, as the high demands of their implementation allow only a limited number of studies to be conducted.
The main fractions of MMW in the Czech Republic are estimated as follows: paper 6.7%, plastic 7.3%, glass 3.6%, bio-waste 28.4%, metal 2.1%, and textile 3.0%. While comparable statistics for all fractions across the EU are inaccessible, existing data suggest that organic waste often accounts for 30–50% of MMW in Western Europe (approximately 34% as reported by the European Compost Network [45]). Differences in composition between countries may be affected by the sampling process and data variability—and therefore sample size requirements to achieve desired precision—but do not limit the applicability of the proposed methodology, which remains valid across distinct waste composition profiles. In general, the presented methodology supports optimization of sampling design during the investigation, which can significantly reduce resource demands by minimizing the number of necessary samples and representative locations without compromising the quality of the results.
In future research, extensive datasets such as the Waste Data and Analysis Center repository (wastedata.info [46]), which aggregates over 300 waste composition reports from the United States, could be used to validate the presented stratification approach in different regional and legislative contexts.

Author Contributions

Conceptualization, R.Š.; methodology, J.P. and R.Š.; software, V.N.; formal analysis, V.N.; investigation, J.P. and V.S.; writing—original draft preparation, J.P. and V.S.; writing—review and editing, R.Š. and V.N.; visualization, J.P. and V.S.; supervision, R.Š. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Technology Agency of the Czech Republic, grant number SS02030008.

Data Availability Statement

The raw data supporting the conclusions of this article will be made available by the authors on a reasonable request related to research purposes.

Acknowledgments

The authors gratefully acknowledge the financial support provided by the Technology Agency of the Czech Republic within grant No. SS02030008 “Centre of Environmental Research: Waste management, circular economy and environmental security”.

Conflicts of Interest

The authors declare no conflicts of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

Abbreviations

The following abbreviations are used in this manuscript:
MMW: Mixed Municipal Waste
WM: Waste Management

Appendix A

Table A1 provides a list of symbols used in Appendix A. Symbols are listed in the order of their appearance in the text.
Table A1. Definitions of the variables and symbols used in Appendix A.
| Symbol | Description | Unit | Physical Meaning |
|---|---|---|---|
| $b$ | Bootstrap sample index | | Identifies the $b$-th bootstrap resample |
| $d_{abs}(\tilde{n})$ | Estimated length of confidence interval as a function of $\tilde{n}$ | | Absolute length of confidence interval calculated using transformed estimate |
| $df$ | Degrees of freedom | | Degrees of freedom used for testing hypothesis and interval construction |
| $\widetilde{df}$ | Adjusted degrees of freedom | | Modified degrees of freedom used in small-sample confidence interval estimation |
| $h_i$ | Sampling weight for $i$-th container | | Relative sampling influence of container $i$ |
| $h_i^{(j)}$ | Sampling weight for $i$-th container in $j$-th stratum | | Relative sampling influence of container $i$ within stratum $j$ |
| $j$ | Stratum index | | Identifies the $j$-th stratum within the locality |
| $k$ | Municipality index within stratum $l$ | | Index identifying $k$-th municipality in stratum $l$ |
| $K_l$ | Number of sampled municipalities in stratum $l$ | | Number of municipalities sampled in stratum $l$ |
| $\tilde{K}_l$ | Estimated required number of municipalities in stratum $l$ | | Minimum required number of municipalities in stratum $l$ to achieve desired precision |
| $K_{+l}$ | Total number of estimates used for variance in stratum $l$ | | Total number of municipality-level estimates used to calculate $\hat{\sigma}_l^2$ |
| $l$ | Stratum index at the national level | | Index identifying $l$-th national-level stratum |
| $n$ | Actual sample size | | Number of analyzed collection containers (number of samples) |
| $\tilde{n}$ | Estimated minimum required sample size | | Number of samples required to achieve the desired precision, based on variance and confidence level |
| $n_j$ | Actual sample size in stratum $j$ | | Number of analyzed samples in stratum $j$ |
| $\tilde{n}_j$ | Estimated required sample size in stratum $j$ | | Estimated number of samples needed in stratum $j$ to achieve the desired precision |
| $\tilde{N}$ | Estimated total required number of municipalities | | Estimated number of samples needed to achieve the desired precision at national level |
| $R$ | True ratio of waste fraction to total waste | | The real share (unknown) of a given waste type in the container, which is estimated based on measurements |
| $\hat{R}_n$ | Estimated ratio of waste fraction based on a sample of size $n$ | | Point estimate of $R$ calculated from $n$ samples |
| $\hat{R}_{j,n_j}$ | Estimated ratio of waste fraction in $j$-th stratum | | Estimated share of specific waste fraction in stratum $j$ |
| $\hat{R}^{*}_{n,b}$ | Bootstrap estimate of waste fraction ratio for whole locality in sample $b$ | | Estimated waste fraction ratio for locality from $b$-th bootstrap resample |
| $\hat{R}_l$ | Estimated ratio for stratum $l$ | | Weighted estimate of waste fraction ratio in stratum $l$ |
| $\hat{R}_{k,l}$ | Estimated ratio in municipality $k$ of stratum $l$ | | Estimate of waste fraction ratio for municipality $k$ in stratum $l$ |
| $\hat{R}_N$ | Estimated national waste fraction ratio | | Aggregated estimator across all strata (national estimate) |
| $t_{df}(1-\alpha/2)$ | Quantile of Student's $t$-distribution | | $(1-\alpha/2)$-quantile of the $t$-distribution with $df$ degrees of freedom, used in a confidence interval |
| $T_i$ | Total weight of waste in the $i$-th container | kg | Total weight of all waste fractions in the single analyzed sample |
| $T_{ji}$ | Total waste weight in $i$-th sample in $j$-th stratum | kg | Total waste weight in container $i$ within stratum $j$ |
| $\bar{T}_n$ | Mean total waste weight per sample | kg | Average total waste weight per container |
| $\bar{T}_{j,n_j}$ | Mean total waste weight per sample within stratum $j$ | kg | Average total waste weight per container |
| $u_{1-\alpha/2}$ | Quantile of the standard normal distribution | | $(1-\alpha/2)$-quantile of the standard normal distribution |
| $v_{\hat{R}_n}$ | Variance estimator of $\hat{R}_n$ | | Estimated variance of the ratio estimator $\hat{R}_n$ |
| $v_{\hat{\theta}}^2$ | Asymptotic variance of $\hat{\theta}_n$ | | Estimated variance of $\hat{\theta}_n$ (variance of log-transformed point estimate) |
| $v_{\hat{\theta}}^2(\tilde{n})$ | Asymptotic variance of $\hat{\theta}_n$ when number of samples is $\tilde{n}$ | | Estimated variance of $\hat{\theta}_n$ (variance of log-transformed point estimate) when number of samples is enough to ensure desired precision |
| $v_{j,\hat{R}_{n_j}}^2$ | Variance estimator of $\hat{R}_{j,n_j}$ | | Estimated variance of waste fraction ratio in stratum $j$ |
| $v_{1,\hat{R}_l}^2$ | First component of variance of $\hat{R}_l$ | | Variability resulting from within-municipality estimates |
| $v_{2,\hat{R}_l}^2$ | Second component of variance of $\hat{R}_l$ | | Variability resulting from location selection |
| $v_{\hat{R}_l}^2$ | Variance of $\hat{R}_l$ | | Variability of waste ratio estimate for stratum $l$ |
| $v_{\hat{R}_{k,l}}^2$ | Variance of $\hat{R}_{k,l}$ | | Variance of waste ratio estimate for municipality $k$ in stratum $l$ |
| $v_{\hat{R}_N}^2$ | Estimated variance of national estimate | | Estimated variance of $\hat{R}_N$, used in confidence interval calculation |
| $w_j$ | Weight of $j$-th stratum | | Proportional weight of stratum $j$ (based on waste production or population share) |
| $w_{k,l}$ | Weight of municipality $k$ within stratum $l$ | | Proportion of waste production from municipality $k$ within stratum $l$ |
| $w_{k,l}^2$ | Square of municipality weight | | Squared weight of municipality $k$ within stratum $l$ |
| $W_l$ | Weight of stratum $l$ | | Relative contribution of stratum $l$ to national waste production |
| $Y_i$ | Weight of the particular waste fraction in the $i$-th container | kg | Weight of selected waste fraction (e.g., paper, plastic) in the single analyzed sample |
| $Y_{ji}$ | Weight of waste fraction in $i$-th sample in $j$-th stratum | kg | Weight of selected waste fraction in container $i$ within stratum $j$ |
| $\alpha$ | Significance level | | Probability of rejecting a true parameter |
| $\delta$ | Desired precision | | Maximum acceptable half-length of the confidence interval |
| $\hat{\theta}_n$ | Logit-transformed estimate of $\hat{R}_n$ | | Transformed estimate to ensure non-negativity |
| $\kappa_{j,n_j}$ | Correction factor in $j$-th stratum | | Adjusts variance estimator for small samples in stratum $j$ |
| $\kappa_n$ | Correction factor for variance estimator | | Adjusts variance to account for small sample effects and different container weights (recommendation) |
| $\hat{\sigma}_n$ | Estimate of standard deviation of waste fraction ratio | | Sample-based estimate of standard deviation of observed ratios |
| $\hat{\sigma}_{j,n_j}$ | Estimate of standard deviation of waste fraction ratio in stratum $j$ | | Sample-based estimate of standard deviation of observed ratios in stratum $j$ |
| $\hat{\sigma}_l$ | Estimated variance of ratios in stratum $l$ | | Variance of municipal waste ratios within stratum $l$ |

A.1. Mathematical Model—Estimation Without Stratification (or Any Other Additional Structure)

First, the situation where no stratification is considered at the estimation level is described. This approach is useful for estimating the waste composition of a particular small territorial unit (e.g., village, municipality). To proceed, some notation is needed. Suppose $n$ collection containers were inspected, resulting in a random sample $(Y_1, T_1), \dots, (Y_n, T_n)$, where $Y_i$ stands for the weight (in kg) of the particular fraction of waste (e.g., paper, plastic, biowaste) and $T_i$ stands for the total weight of the waste in the $i$-th collection container. Then, the ratio $R$ of the particular waste fraction to the total waste is estimated as
$$\hat{R}_n = \frac{\sum_{i=1}^{n} Y_i}{\sum_{i=1}^{n} T_i}. \tag{A1}$$
This formula calculates the estimated proportion of a specific waste fraction (e.g., bio-waste) in the total waste stream based on the analyzed containers. Of course, in practice, it is of interest how well $\hat{R}_n$ estimates the unknown ratio $R$. To assess this, confidence intervals are calculated. In what follows, several methods of calculating confidence intervals are described, and the advantages and disadvantages of these methods are discussed. For each of the methods, a way to use preliminary results from the waste sampling to estimate the sample size needed to reach the desired precision of the estimates is presented.
  • Standard asymptotic confidence interval
It is well known that the asymptotic variance of the estimator $\hat{R}_n$ can be estimated as $\frac{1}{\left(\sum_{i=1}^{n} T_i\right)^2} \sum_{i=1}^{n} (Y_i - \hat{R}_n T_i)^2$; nevertheless, for small sample sizes ($n$ is small) and different sizes of samples, it is recommended to use a slightly modified formula
$$v_{\hat{R}_n}^2 = \frac{\kappa_n}{\left(\sum_{i=1}^{n} T_i\right)^2} \sum_{i=1}^{n} (Y_i - \hat{R}_n T_i)^2, \tag{A2}$$
where
$$\kappa_n = \frac{\sum_{i=1}^{n} T_i}{\sum_{i=1}^{n} T_i (1 - h_i)} \quad \text{and} \quad h_i = \frac{T_i^2}{\sum_{j=1}^{n} T_j^2}. \tag{A3}$$
Note that $\kappa_n$ is a correction that inflates the variance estimator to accommodate for small sample effects and different values of $T_1, \dots, T_n$. In the special case that $T_1 = \dots = T_n$, the correction is equal to $\kappa_n = \frac{n}{n-1}$ and the variance estimator is given by $v_{\hat{R}_n}^2 = \frac{1}{n(n-1)} \sum_{i=1}^{n} \left(\frac{Y_i}{T_i} - \hat{R}_n\right)^2$. This formula calculates how much the estimated ratio $\hat{R}_n$ is expected to vary between samples due to random variability in the waste composition. For example, when analyzing a small number of containers with different total weights ($T_i$), the correction factor $\kappa_n$ ensures that the variability is not underestimated. A higher variance indicates more uncertainty in the estimated proportion. Now the standard confidence interval for $R$ (with the asymptotic coverage probability $1 - \alpha$) is given by
$$\left(\hat{R}_n - t_{df}(1-\alpha/2)\, v_{\hat{R}_n},\ \hat{R}_n + t_{df}(1-\alpha/2)\, v_{\hat{R}_n}\right), \tag{A4}$$
where $t_{df}(1-\alpha/2)$ is the $(1-\alpha/2)$-quantile of the Student $t$-distribution with the degrees of freedom $df$ given by
n ~ u 1 α / 2   σ ^ n δ   T ¯ n 2 .
Here, it is recommended to use Student $t$-quantiles $t_{df}(1-\alpha/2)$ instead of the standard normal quantiles $u_{1-\alpha/2}$ in order to accommodate for small samples and different values of $T_1, \dots, T_n$. In the special case that $T_1 = \dots = T_n$, the degrees of freedom are equal to $df = n - 1$, which is the commonly used degrees of freedom in similar small sample problems. This confidence interval provides a range in which the true proportion of the waste fraction is expected to fall, based on the sample data. For example, if $\hat{R}_n = 0.30$ (30%) and $v_{\hat{R}_n} = 0.05$, then the 95% confidence interval would be approximately 30% ± (t-value · 5%). This interval expresses the uncertainty of the estimate and helps in assessing whether the data is sufficiently reliable for decision-making.
The advantage of the confidence interval in Equation (A4) is its simplicity. The disadvantage is that it might contain values outside of the interval $[0, 1]$. Although the intersection of the confidence interval with $[0, 1]$ can simply be taken, this would be only a ‘cosmetic’ solution. The real problem is that the distribution of the random variable $\frac{\hat{R}_n - R}{v_{\hat{R}_n}}$ might be rather different from a standard normal distribution, in particular for small samples. In the following sections, an alternative method that usually behaves correctly in these small sample problems is described.
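The ratio estimator, the corrected variance estimator, and the symmetric interval of Equation (A4) can be sketched as follows. This is an illustrative sketch rather than the authors' implementation: the function names and the input data are hypothetical. The quantile is passed in by the caller; when it is omitted, the standard normal quantile is used as a stand-in, which is an assumption that makes the interval narrower than with the recommended $t$-quantile.

```python
from statistics import NormalDist


def ratio_estimate(Y, T):
    """Return (R_hat, v): the ratio estimate and its standard error.

    Y[i] is the weight of the waste fraction and T[i] the total weight
    (both in kg) of container i.
    """
    if len(Y) != len(T) or len(Y) < 2:
        raise ValueError("need at least two (Y_i, T_i) pairs of equal length")
    sum_T = sum(T)
    R_hat = sum(Y) / sum_T
    # Weights h_i and the small-sample correction kappa_n.
    sum_T_sq = sum(t * t for t in T)
    h = [t * t / sum_T_sq for t in T]
    kappa = sum_T / sum(t * (1.0 - h_i) for t, h_i in zip(T, h))
    # Corrected variance estimator of R_hat.
    resid_sq = sum((y - R_hat * t) ** 2 for y, t in zip(Y, T))
    variance = kappa * resid_sq / sum_T ** 2
    return R_hat, variance ** 0.5


def symmetric_ci(Y, T, alpha=0.05, quantile=None):
    """Symmetric interval R_hat -/+ quantile * v."""
    R_hat, v = ratio_estimate(Y, T)
    if quantile is None:
        # Assumption: normal quantile as a stand-in for the t-quantile.
        quantile = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return R_hat - quantile * v, R_hat + quantile * v


# Hypothetical data: only one of four 10 kg containers holds the fraction.
Y = [0.0, 0.0, 3.0, 0.0]
T = [10.0, 10.0, 10.0, 10.0]
R_hat, v = ratio_estimate(Y, T)
lower, upper = symmetric_ci(Y, T)
print(R_hat, v, lower, upper)
```

With a small share and a large spread between containers, as in this made-up sample, the lower bound of the symmetric interval falls below zero, which is the behavior that motivates the alternative constructions.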
  • Estimation of the sample size to obtain the desired precision
In this section, the problem of estimating the sample size that is needed to obtain the desired precision of the estimator $\hat{R}_n$ is described. It is necessary to distinguish the estimated sample size (denoted as $\tilde{n}$ in the sequel) from the actual sample size (denoted as $n$). In order to find $\tilde{n}$, it is useful to rewrite the confidence interval as
$$\left(\hat{R}_n - \frac{t_{df}(1-\alpha/2)\, \hat{\sigma}_n}{\bar{T}_n \sqrt{n}},\ \hat{R}_n + \frac{t_{df}(1-\alpha/2)\, \hat{\sigma}_n}{\bar{T}_n \sqrt{n}}\right), \tag{A6}$$
where
$$\bar{T}_n = \frac{1}{n} \sum_{i=1}^{n} T_i \quad \text{and} \quad \hat{\sigma}_n^2 = \frac{\kappa_n}{n} \sum_{i=1}^{n} (Y_i - \hat{R}_n T_i)^2. \tag{A7}$$
By the precision $\delta$ it is understood that the length of the confidence interval is at most $2\delta$. With the help of Equation (A6), it can be concluded that $\tilde{n}$ should satisfy
$$\frac{t_{\widetilde{df}}(1-\alpha/2)\, \hat{\sigma}_n}{\bar{T}_n \sqrt{\tilde{n}}} \le \delta, \quad \text{where} \quad \widetilde{df} = \max\left\{1, \frac{\tilde{n}}{n}\, df\right\}. \tag{A8}$$
The estimated sample size $\tilde{n}$ is often already so large that the difference between the $t$-quantiles $t_{\widetilde{df}}(1-\alpha/2)$ and the standard normal quantiles $u_{1-\alpha/2}$ may be neglected. In this situation, the following explicit solution of Equation (A8) can be obtained:
$$\tilde{n} \ge \left(\frac{u_{1-\alpha/2}\, \hat{\sigma}_n}{\delta\, \bar{T}_n}\right)^2. \tag{A9}$$
This formula helps to determine the minimum required sample size to achieve the desired precision. For example, if the estimated standard deviation $\hat{\sigma}_n$ is 0.05, the desired absolute precision $\delta$ is 0.02 (2%), and the average total waste weight per sample $\bar{T}_n$ is 50 kg, then inserting these values into Equation (A9) gives the number of samples that must be collected to ensure that the width of the confidence interval does not exceed ±2%. This allows practical planning of the sampling campaign based on preliminary variability observed in the data.
Absolute precision considers $\delta$ as a constant (e.g., $\delta = 0.02$). On the other hand, when $\delta$ is proportional to $\hat{R}_n$ (e.g., $\delta = 0.2\, \hat{R}_n$), relative precision is obtained. Note that relative precision allows a larger $\delta$ when $\hat{R}_n$ is close to $\frac{1}{2}$, but a smaller $\delta$ when $\hat{R}_n$ is small. This seems to be very reasonable. Nevertheless, the relative precision proves to be too stringent when $\hat{R}_n$ is close to zero (i.e., for waste fractions that have a small proportion). In practice, absolute precision is recommended for small fractions, where requiring strict relative precision would result in unreasonably large required sample sizes. Conversely, relative precision is more appropriate for larger fractions, where maintaining a fixed absolute error margin may be unnecessarily restrictive. As a compromise, in this case study the required precision is considered to be satisfied when at least one of the absolute or relative precision requirements is met.
In summary, Equations (A6)–(A9) provide a practical tool for determining the number of samples needed to obtain composition estimates with acceptable precision, reflecting both data variability and the desired level of certainty.
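The explicit solution in Equation (A9), combined with the rule that at least one of the absolute and relative precision requirements has to be met, can be sketched as below. The function names and all numerical inputs are hypothetical illustrations, not values from the case study. Note that $\hat{\sigma}_n$ as defined in Equation (A7) carries the unit of weight, so it is given in kg here.

```python
from math import ceil
from statistics import NormalDist


def required_sample_size(sigma_hat, T_bar, delta, alpha=0.05):
    """Smallest integer n satisfying the normal-quantile bound (A9)."""
    u = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return max(1, ceil((u * sigma_hat / (delta * T_bar)) ** 2))


def required_sample_size_combined(sigma_hat, T_bar, R_hat,
                                  abs_delta, rel_delta, alpha=0.05):
    """Sample size when meeting either precision requirement is enough.

    Only one of the two criteria has to hold, so the larger of the two
    tolerated half-lengths determines the required sample size.
    """
    delta = max(abs_delta, rel_delta * R_hat)
    return required_sample_size(sigma_hat, T_bar, delta, alpha)


# Hypothetical preliminary results: sigma_hat = 5 kg, mean sample weight 50 kg.
n_abs = required_sample_size(5.0, 50.0, delta=0.02)

# Absolute precision 0.5% or relative precision 30% for a 28.4% fraction.
n_combined = required_sample_size_combined(5.0, 50.0, 0.284, 0.005, 0.30)

print(n_abs, n_combined)
```

Because the estimate relies on the normal quantile, it is a lower bound for small $\tilde{n}$; in that case the condition in Equation (A8) with $t$-quantiles should be checked instead.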
  • Alternative confidence intervals
As mentioned, the standard confidence interval in Equation (A4) does not work well when the distribution of the quantity $\frac{\hat{R}_n - R}{v_{\hat{R}_n}}$ is far from being normal (typically due to asymmetry). To circumvent this problem, a transformation of $\hat{R}_n$ or the bootstrap can be used. Based on empirical experience, it is recommended to consider the logit transformation of $\hat{R}_n$, i.e.,
$$\hat{\theta}_n = \log \frac{\hat{R}_n}{1 - \hat{R}_n}, \tag{A10}$$
with the asymptotic variance of $\hat{\theta}_n$, which can be (with the help of the $\Delta$-method) estimated as
$$v_{\hat{\theta}}^2 = \frac{v_{\hat{R}_n}^2}{\hat{R}_n^2 (1 - \hat{R}_n)^2}. \tag{A11}$$
Using the logit transformation improves the behavior of the confidence interval, especially for proportions close to 0 or 1. This is because the transformation stretches extreme values and stabilizes variance, leading to more reliable estimation of uncertainty. The advantage of using $\hat{\theta}_n$ is that the distribution of $T_n = \frac{\hat{\theta}_n - \log \frac{R}{1-R}}{v_{\hat{\theta}}}$ is usually much closer to a standard normal distribution than the distribution of $\frac{\hat{R}_n - R}{v_{\hat{R}_n}}$. Using the asymptotic normality of $T_n$, the confidence interval for the logit of $R$ is derived first. Applying the expit transformation to this interval then yields the following confidence interval for $R$:
$$\left(\frac{\exp\{\hat{\theta}_n - t_{df}(1-\alpha/2)\, v_{\hat{\theta}}\}}{1 + \exp\{\hat{\theta}_n - t_{df}(1-\alpha/2)\, v_{\hat{\theta}}\}},\ \frac{\exp\{\hat{\theta}_n + t_{df}(1-\alpha/2)\, v_{\hat{\theta}}\}}{1 + \exp\{\hat{\theta}_n + t_{df}(1-\alpha/2)\, v_{\hat{\theta}}\}}\right), \tag{A12}$$
where the degrees of freedom $df$ were introduced in Equation (A5). This confidence interval usually has the actual coverage (for small sample sizes) much closer to $1 - \alpha$ than the standard confidence interval in Equation (A6). For example, if the estimated share of a waste fraction is 90% or 5%, applying the logit transformation reduces the risk that the confidence interval will include impossible values (below 0% or above 100%).
To estimate the sample size needed to obtain the desired precision $\delta$, denote first the length of the confidence interval (Equation (A12)) as
$$d_{abs}(\tilde{n}) = \frac{\exp\{\hat{\theta}_n + t_{\widetilde{df}}(1-\alpha/2)\, v_{\hat{\theta}}(\tilde{n})\}}{1 + \exp\{\hat{\theta}_n + t_{\widetilde{df}}(1-\alpha/2)\, v_{\hat{\theta}}(\tilde{n})\}} - \frac{\exp\{\hat{\theta}_n - t_{\widetilde{df}}(1-\alpha/2)\, v_{\hat{\theta}}(\tilde{n})\}}{1 + \exp\{\hat{\theta}_n - t_{\widetilde{df}}(1-\alpha/2)\, v_{\hat{\theta}}(\tilde{n})\}}, \tag{A13}$$
where
$$v_{\hat{\theta}}^2(\tilde{n}) = \frac{\hat{\sigma}_n^2}{\tilde{n}\, \bar{T}_n^2\, \hat{R}_n^2 (1 - \hat{R}_n)^2} \tag{A14}$$
and $\widetilde{df}$ was introduced in Equation (A8). Next, search for the smallest integer $\tilde{n}$ such that $d_{abs}(\tilde{n}) \le 2\delta$. Nevertheless, the resulting $\tilde{n}$ is usually very similar to the one obtained with the help of Equation (A8). In practice, using the logit transformation leads to more realistic and statistically accurate confidence intervals, especially for fractions with low or high proportions.
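A minimal sketch of the logit-based interval in Equation (A12) and of the search for the smallest $\tilde{n}$ based on Equations (A13) and (A14) is given below. The function names and input values are hypothetical. As an assumption, the standard normal quantile replaces the $t$-quantile throughout unless the caller supplies another quantile.

```python
from math import exp, log
from statistics import NormalDist


def expit(x):
    """Inverse of the logit transformation: exp(x) / (1 + exp(x))."""
    return 1.0 / (1.0 + exp(-x))


def logit_ci(R_hat, v_R, alpha=0.05, quantile=None):
    """Interval for R built on the logit scale and mapped back to (0, 1)."""
    if not 0.0 < R_hat < 1.0:
        raise ValueError("R_hat must lie strictly between 0 and 1")
    if quantile is None:
        # Assumption: normal quantile as a stand-in for the t-quantile.
        quantile = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    theta = log(R_hat / (1.0 - R_hat))
    # Delta-method standard error on the logit scale.
    v_theta = v_R / (R_hat * (1.0 - R_hat))
    return expit(theta - quantile * v_theta), expit(theta + quantile * v_theta)


def required_sample_size_logit(R_hat, sigma_hat, T_bar, delta,
                               alpha=0.05, n_max=1_000_000):
    """Smallest n for which the back-transformed interval length is <= 2*delta."""
    u = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    theta = log(R_hat / (1.0 - R_hat))
    for n in range(1, n_max + 1):
        v_theta = sigma_hat / (n ** 0.5 * T_bar * R_hat * (1.0 - R_hat))
        length = expit(theta + u * v_theta) - expit(theta - u * v_theta)
        if length <= 2.0 * delta:
            return n
    raise ValueError("required precision not reached within n_max samples")


# Hypothetical small fraction: share 7.5% with standard error 0.075.
lower, upper = logit_ci(0.075, 0.075)

# Hypothetical preliminary results: sigma_hat = 5 kg, mean sample weight 50 kg.
n_logit = required_sample_size_logit(0.3, 5.0, 50.0, delta=0.02)

print(lower, upper, n_logit)
```

Both bounds stay inside $(0, 1)$ by construction, and the interval extends further above the point estimate than below it, which is the asymmetry discussed in the main text.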
Alternatively, one can also use the bootstrap to calculate a confidence interval. There are plenty of bootstrap methods available. As there is no obvious parametric model for this type of data, the nonparametric bootstrap seems to be a natural choice. The percentile method of constructing the confidence interval was examined, as it naturally gives a confidence interval that is a subset of $[0, 1]$; however, it is problematic for small sample sizes. Furthermore, based on our experience, this (as well as other) bootstrap methods work less satisfactorily than the logit transformation method even for moderately large sample sizes. The bootstrap is thus useful primarily as an alternative tool to estimate variance, but not as a preferred method for constructing confidence intervals. In this study, the logit transformation was found to be more robust and practical.
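The nonparametric percentile bootstrap mentioned above can be sketched as follows. This is an illustrative sketch under stated assumptions: the function name, the number of resamples, the fixed seed, and the simple order-statistic rule for the percentiles are all choices made here, not details taken from the study.

```python
import random


def bootstrap_percentile_ci(Y, T, alpha=0.05, B=2000, seed=0):
    """Percentile bootstrap interval for the ratio sum(Y) / sum(T).

    Containers (Y_i, T_i) are resampled with replacement as pairs.
    """
    if len(Y) != len(T) or not Y:
        raise ValueError("Y and T must be non-empty and of equal length")
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(Y)
    estimates = []
    for _ in range(B):
        idx = [rng.randrange(n) for _ in range(n)]
        estimates.append(sum(Y[i] for i in idx) / sum(T[i] for i in idx))
    estimates.sort()
    # Simple order-statistic rule for the alpha/2 and 1 - alpha/2 percentiles.
    lower = estimates[int(alpha / 2.0 * B)]
    upper = estimates[min(B - 1, int((1.0 - alpha / 2.0) * B))]
    return lower, upper


# Hypothetical data: only one of four 10 kg containers holds the fraction.
Y = [0.0, 0.0, 3.0, 0.0]
T = [10.0, 10.0, 10.0, 10.0]
lower, upper = bootstrap_percentile_ci(Y, T)
print(lower, upper)
```

Each resampled ratio is a weighted mean of the individual ratios $Y_i / T_i$, so the interval can never extend beyond the smallest and largest observed ratio. This is why the method gives overly narrow intervals when only a few containers have been analyzed.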

A.2. Mathematical Model—Estimation for a Stratified Locality/Municipality

Stratification is useful when the locality can be divided into several homogeneous strata that are expected to behave differently with respect to a variable of interest (i.e., the fraction of a particular type of waste). At the sampling level, stratification can help to obtain a more representative sample. When the sample sizes in all strata are not small, stratification can also be used at the estimation level in order to increase the precision of the estimate. The procedure is as follows.
Suppose that the given locality consists of $J$ disjoint strata. For $j \in \{1, \dots, J\}$, denote by $n_j$ the sample size in the $j$th stratum. In what follows, let $Y_{ji}$ denote the weight of the particular type of waste in the $i$th sampled collection container in the $j$th stratum. Analogously, let $T_{ji}$ be the total weight of waste in the corresponding collection container. Then, in the same way as in Section A.1 of this Appendix, the estimate $\hat{R}_{j,n_j}$ (the estimate of the ratio of the particular type of waste) and $v^2_{j,\hat{R}_{n_j}}$ (the estimate of the variance of $\hat{R}_{j,n_j}$) can be calculated as follows:
$$\hat{R}_{j,n_j} = \frac{\sum_{i=1}^{n_j} Y_{ji}}{\sum_{i=1}^{n_j} T_{ji}}, \qquad v^2_{j,\hat{R}_{n_j}} = \frac{\kappa_{j,n_j}}{\left(\sum_{i=1}^{n_j} T_{ji}\right)^2} \sum_{i=1}^{n_j} \left(Y_{ji} - \hat{R}_{j,n_j} T_{ji}\right)^2,$$
where
$$\kappa_{j,n_j} = \frac{\sum_{i=1}^{n_j} T_{ji}}{\sum_{i=1}^{n_j} T_{ji}\left(1 - h_i^{(j)}\right)}, \qquad h_i^{(j)} = \frac{T_{ji}^2}{\sum_{l=1}^{n_j} T_{jl}^2}.$$
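As a minimal sketch of how the within-stratum quantities could be computed, the following Python function evaluates the ratio estimate, the correction factor $\kappa_{j,n_j}$ and the variance estimate for one stratum. The function name `ratio_estimate` and the container weights in the usage lines are hypothetical, and the formula for $\kappa_{j,n_j}$ follows the reading of the equations given here, which should be checked against the published appendix before use.

```python
def ratio_estimate(y, t):
    """Ratio estimate, variance estimate and kappa for one stratum.

    y: weights of the waste fraction per sampled container (kg)
    t: total weights of the same containers (kg)
    """
    if len(y) != len(t) or len(t) < 2:
        raise ValueError("need at least two containers with paired weights")
    sum_t = sum(t)
    sum_t2 = sum(ti * ti for ti in t)
    r_hat = sum(y) / sum_t
    # Leverage-type terms h_i = T_i^2 / sum_l T_l^2
    h = [ti * ti / sum_t2 for ti in t]
    kappa = sum_t / sum(ti * (1.0 - hi) for ti, hi in zip(t, h))
    resid2 = sum((yi - r_hat * ti) ** 2 for yi, ti in zip(y, t))
    v2 = kappa * resid2 / sum_t ** 2
    return r_hat, v2, kappa


# Hypothetical data: three containers of 10, 10 and 20 kg.
r_hat, v2, kappa = ratio_estimate([2.0, 3.0, 5.0], [10.0, 10.0, 20.0])
```

Note that $\kappa_{j,n_j} \ge 1$, so the correction inflates the naive variance estimate, most strongly when a few heavy containers dominate the sample.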
Now let $w_j$ represent the size of the $j$th stratum in terms of total waste production. Ideally, $w_j$ should be the ratio of the total weight of waste produced in the $j$th stratum to the total weight of waste produced in all strata. As this information might not be available, the percentage of inhabitants living in the $j$th stratum can be taken as a proxy. It is natural to assume that the weights $w_j$ are normalized so that $\sum_{j=1}^{J} w_j = 1$. This weighting ensures that the contribution of each stratum to the total estimate reflects its relative importance (e.g., waste production or population size). Larger strata will have a proportionally greater influence on the aggregated result. Then, the ratio estimator for the whole locality is given by
$$\hat{R}_n = \sum_{j=1}^{J} w_j \hat{R}_{j,n_j}.$$
For example, if three strata contribute 50%, 30%, and 20% of the total waste, and their estimated bio-waste shares are 20%, 30%, and 40%, respectively, the overall bio-waste share is calculated as $(0.5 \cdot 0.20) + (0.3 \cdot 0.30) + (0.2 \cdot 0.40) = 0.27$, or 27%. The corresponding estimator of the variance of $\hat{R}_n$ is
$$v^2_{\hat{R}_n} = \sum_{j=1}^{J} w_j^2\, v^2_{j,\hat{R}_{n_j}}.$$
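The weighted aggregation of the point estimate and its variance can be illustrated with a short Python sketch. The function name `aggregate_strata` is hypothetical, and the within-stratum variances in the usage lines are invented for illustration; only the weights and shares come from the three-strata example in the text.

```python
def aggregate_strata(weights, ratios, variances, tol=1e-9):
    """Weighted ratio estimate and its variance across strata.

    weights:   normalized stratum weights w_j (must sum to 1)
    ratios:    within-stratum ratio estimates
    variances: within-stratum variance estimates
    """
    if abs(sum(weights) - 1.0) > tol:
        raise ValueError("stratum weights must sum to 1")
    r_hat = sum(w * r for w, r in zip(weights, ratios))
    v2 = sum(w * w * v for w, v in zip(weights, variances))
    return r_hat, v2


# Weights and bio-waste shares from the example; variances are hypothetical.
r_total, v2_total = aggregate_strata(
    weights=[0.5, 0.3, 0.2],
    ratios=[0.20, 0.30, 0.40],
    variances=[1e-4, 2e-4, 4e-4],
)
```

Because the weights enter the variance squared, a small stratum with a noisy estimate contributes far less to the overall uncertainty than a large stratum with the same noise level.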
The confidence interval can be calculated using the recommended formula in Equation (A12), where the degrees of freedom $df$ of the quantile of Student's $t$-distribution can be calculated as
$$df = \frac{\left(\sum_{j=1}^{J} \dfrac{w_j^2\, \hat{R}_{j,n_j}(1-\hat{R}_{j,n_j})}{\sum_{i=1}^{n_j} T_{ji}}\right)^2}{\sum_{j=1}^{J} \dfrac{w_j^4\, \kappa_{j,n_j} \left(\sum_{i=1}^{n_j} T_{ji}^2\right) \hat{R}_{j,n_j}^2 (1-\hat{R}_{j,n_j})^2}{\left(\sum_{i=1}^{n_j} T_{ji}\right)^4}}.$$
Nevertheless, when the total sample size is sufficiently large, Equation (A4) with the standard normal quantile $u_{1-\alpha/2}$ can simply be used instead of $t_{df}(1-\alpha/2)$.
Alternatively, one can use a nonparametric bootstrap. Suppose that $B$ bootstrap samples are generated. Then, for a given $b \in \{1, \dots, B\}$, a random sample with replacement is drawn in each of the strata. This results in within-strata estimates $\hat{R}_{1,n_1}^{b*}, \dots, \hat{R}_{J,n_J}^{b*}$, from which one can construct the (bootstrap) estimate $\hat{R}_n^{b*} = \sum_{j=1}^{J} w_j \hat{R}_{j,n_j}^{b*}$ for the whole locality. In total, one thus has $\hat{R}_n^{1*}, \dots, \hat{R}_n^{B*}$, which can be used to calculate a bootstrap confidence interval. In practice, bootstrap serves as an additional method for variance estimation and confidence interval construction, though in this study it was found less effective than the stratification-based approach.
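The stratified resampling scheme can be sketched as follows. This is an illustrative Python example under stated assumptions: the function name `stratified_bootstrap_ci` is hypothetical, the container data are invented, and the percentile method with simple empirical quantiles is used as one possible way of turning the bootstrap replicates into an interval.

```python
import random


def stratified_bootstrap_ci(strata, weights, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the weighted ratio estimate.

    strata:  list of strata, each a list of (fraction_kg, total_kg) pairs
    weights: normalized stratum weights w_j
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    replicates = []
    for _ in range(n_boot):
        estimate = 0.0
        for w, containers in zip(weights, strata):
            # Resample containers with replacement within the stratum.
            resample = rng.choices(containers, k=len(containers))
            sum_y = sum(y for y, _ in resample)
            sum_t = sum(t for _, t in resample)
            estimate += w * sum_y / sum_t
        replicates.append(estimate)
    replicates.sort()
    lower = replicates[int(n_boot * alpha / 2.0)]
    upper = replicates[min(n_boot - 1, int(n_boot * (1.0 - alpha / 2.0)))]
    return lower, upper


# Hypothetical data: two strata with four and three sampled containers.
example_strata = [
    [(2.0, 10.0), (3.0, 12.0), (1.5, 9.0), (2.5, 11.0)],
    [(4.0, 10.0), (5.0, 14.0), (3.5, 12.0)],
]
ci_lower, ci_upper = stratified_bootstrap_ci(example_strata, [0.6, 0.4])
```

With only three or four containers per stratum, the number of distinct resamples is small, which is one reason why percentile intervals can be unreliable for small samples.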
  • Standard asymptotic confidence interval
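Recall that $n_j$ is the actual sample size in the $j$th stratum and denote by $\tilde{n}_j$ the estimated sample size in the $j$th stratum. Furthermore, let $\tilde{n} = \sum_{j=1}^{J} \tilde{n}_j$ be the total estimated sample size. Let $\delta$ be the desired precision (note that $\delta$ can be a function of $\hat{R}_n$, see Equation (A10)). Furthermore, for $j \in \{1, \dots, J\}$ denote
$$\bar{T}_{j,n_j} = \frac{1}{n_j} \sum_{i=1}^{n_j} T_{ji}, \qquad \hat{\sigma}_{j,n_j}^2 = \frac{\kappa_{j,n_j}}{n_j} \sum_{i=1}^{n_j} \left(Y_{ji} - \hat{R}_{j,n_j} T_{ji}\right)^2.$$
One possible approach is to search for $\tilde{n}_1, \dots, \tilde{n}_J$ such that $\tilde{n}$ is minimized subject to
$$t_{\widetilde{df}}(1-\alpha/2) \sqrt{\sum_{j=1}^{J} \frac{w_j^2\, \hat{\sigma}_{j,n_j}^2}{\tilde{n}_j\, \bar{T}_{j,n_j}^2}} \le \delta, \qquad \widetilde{df} = \max\left\{1, \frac{\tilde{n}}{n}\right\} df, \qquad \tilde{n}_j \ge n_j, \quad j \in \{1, \dots, J\}.$$
An approximate solution that ignores the difference between $t_{\widetilde{df}}(1-\alpha/2)$ and $u_{1-\alpha/2}$ (and also the fact that $\tilde{n}_1, \dots, \tilde{n}_J$ are in fact integers) would be
$$\tilde{n} \approx \frac{u_{1-\alpha/2}^2}{\delta^2} \left(\sum_{j=1}^{J} \frac{w_j\, \hat{\sigma}_{j,n_j}}{\bar{T}_{j,n_j}}\right)^2, \qquad \tilde{n}_j \approx \max\left\{\frac{u_{1-\alpha/2}^2}{\delta^2}\, \frac{w_j\, \hat{\sigma}_{j,n_j}}{\bar{T}_{j,n_j}} \sum_{j'=1}^{J} \frac{w_{j'}\, \hat{\sigma}_{j',n_{j'}}}{\bar{T}_{j',n_{j'}}},\; n_j\right\}.$$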
The formula in Equation (A22) allows estimating how the total required number of samples should be distributed between individual strata. Strata with higher variability or greater weight $w_j$ will receive more samples, as their contribution to the total uncertainty is more significant. This ensures efficient allocation of sampling effort to achieve the desired overall precision $\delta$. Nevertheless, the above optimal solution assumes that the fractions $\hat{\sigma}_{j,n_j} / \bar{T}_{j,n_j}$ are stable with increasing sample sizes. For small sample sizes, it may be safer to introduce, for instance, the condition that
$$w_j \approx \frac{\tilde{n}_j}{\tilde{n}}, \qquad j \in \{1, \dots, J\},$$
which would result in the estimated sample sizes
$$\tilde{n} \approx \frac{u_{1-\alpha/2}^2}{\delta^2} \sum_{j=1}^{J} \frac{w_j\, \hat{\sigma}_{j,n_j}^2}{\bar{T}_{j,n_j}^2}, \qquad \tilde{n}_j \approx w_j\, \tilde{n}.$$
The introduced simplified allocation strategy ensures that sample distribution directly follows the assumed importance of each stratum, regardless of initial variance estimates. This conservative approach is particularly useful in early planning stages or when reliable variance estimates are unavailable, as it prevents under-sampling in critical strata.
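The two allocation rules can be compared with a small Python sketch. It is an illustration only: the function name `allocate_samples` and all input values are hypothetical, the normal quantile is used as in the approximate solution, and rounding up to whole containers is an added convention, since the approximate formulas ignore integrality.

```python
import math
from statistics import NormalDist


def allocate_samples(weights, sigmas, t_bars, n_current, delta, alpha=0.05):
    """Approximate per-stratum sample sizes under two allocation rules.

    Returns (variance_based, proportional); both lists are rounded up.
    """
    u = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    scale = (u / delta) ** 2
    # c_j = w_j * sigma_j / T_bar_j drives the variance-based allocation
    c = [w * s / tb for w, s, tb in zip(weights, sigmas, t_bars)]
    total_c = sum(c)
    variance_based = [
        max(math.ceil(scale * cj * total_c), nj)
        for cj, nj in zip(c, n_current)
    ]
    # Proportional rule: n_j is tied to the stratum weight w_j
    n_total = scale * sum(
        w * (s / tb) ** 2 for w, s, tb in zip(weights, sigmas, t_bars)
    )
    proportional = [math.ceil(w * n_total) for w in weights]
    return variance_based, proportional


# Hypothetical strata with identical relative variability sigma / T_bar.
by_variance, by_weight = allocate_samples(
    weights=[0.5, 0.3, 0.2],
    sigmas=[2.0, 2.0, 2.0],
    t_bars=[50.0, 50.0, 50.0],
    n_current=[5, 5, 5],
    delta=0.01,
)
```

When the ratio $\hat{\sigma}_{j,n_j} / \bar{T}_{j,n_j}$ is the same in every stratum, both rules give the same allocation; they diverge only when some strata are noticeably more variable than others.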

A.3. Mathematical Model—Estimation of Strata

So far, only a single municipality has been considered. Nevertheless, the aim is to obtain results at a national level, i.e., an estimate representing many (for instance, thousands of) municipalities, of which only a few (for instance, 20) are sampled. The assumption is that the municipalities are stratified, i.e., each municipality belongs to exactly one stratum. Let $L$ be the number of strata. This section focuses only on a given $l$th stratum, where $l \in \{1, \dots, L\}$. These results will be aggregated to a national level in the next section.
Let $K_l$ be the number of municipalities sampled from the $l$th stratum. Furthermore, for $k \in \{1, \dots, K_l\}$ let $w_{k,l}$ be the weight of the $k$th sampled municipality in the sample, so that $\sum_{k=1}^{K_l} w_{k,l} = 1$. This weight should correspond to the relative amount of waste produced by this municipality. Now, the estimated ratio for a particular type of waste in the $l$th stratum is
$$\hat{R}_l = \sum_{k=1}^{K_l} w_{k,l}\, \hat{R}_{k,l},$$
where $\hat{R}_{k,l}$ is an estimate of the ratio in the $k$th sampled municipality from this stratum. In practice, this model allows the estimation of waste composition at the national level, based on results from a limited number of sampled municipalities. The stratification into homogeneous groups reduces variability within each stratum, while weighting ensures that each region contributes proportionally to the national estimate. For example, if waste data from only 20 municipalities are available to represent thousands of municipalities, this method enables their combination into a statistically justified national estimate by accounting for the relative size and variability of each region.
Two sources of variability are systematically captured:
First, variability within each municipality, due to limited sampling of containers.
Second, variability between municipalities, since only a small number of representatives are analyzed in each stratum. Together, this framework ensures that uncertainty from both levels is reflected in the final national estimate. Using weighted aggregation reflects the actual waste production share of each stratum or municipality, preventing overrepresentation of smaller regions or underrepresentation of large waste producers.
Finally, the framework allows estimation of how many additional municipalities need to be sampled to achieve a desired national-level precision, supporting efficient planning of further sampling campaigns.
  • Variance estimation
For the variance of the estimate $\hat{R}_l$, two sources of randomness/variability must be taken into account. The first source arises because the $\hat{R}_{k,l}$ are themselves estimates, as only a sample of collection containers is inspected in each of the chosen municipalities. This variability can be estimated as
$$v^2_{1,\hat{R}_l} = \sum_{k=1}^{K_l} w_{k,l}^2\, v^2_{\hat{R}_{k,l}},$$
where $v^2_{\hat{R}_{k,l}}$ is given by Equation (A2) or Equation (A18). The second source arises because only a small number ($K_l$) of the many municipalities in the given stratum are inspected. This variability can be estimated as
$$v^2_{2,\hat{R}_l} = \hat{\sigma}_l^2 \sum_{k=1}^{K_l} w_{k,l}^2,$$
where $\hat{\sigma}_l^2$ is the estimate of the variance of the ratios $R_{k,l}$ in the given stratum. If $K_l$ is not small, the sample variance of $\hat{R}_{1,l}, \dots, \hat{R}_{K_l,l}$ can simply be used as $\hat{\sigma}_l^2$. If $K_l$ is small, it is recommended to also include the estimates of ratios $\hat{R}_{k,l}$ from neighbouring strata so that the sample variance is a more stable estimator of the variance of the ratios $R_{k,l}$. The resulting variance $v^2_{\hat{R}_l}$ is equal to the sum of both sources, $v^2_{\hat{R}_l} = v^2_{1,\hat{R}_l} + v^2_{2,\hat{R}_l}$.
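How the two variance components combine can be shown with a brief Python sketch. The function name `stratum_estimate`, the optional `pooled_ratios` argument (standing in for ratio estimates borrowed from neighbouring strata) and the numbers in the usage lines are all assumptions made for illustration.

```python
from statistics import variance


def stratum_estimate(weights, ratios, within_variances, pooled_ratios=None):
    """Stratum-level ratio estimate with both variance components.

    weights:          normalized municipality weights w_{k,l}
    ratios:           ratio estimates of the sampled municipalities
    within_variances: variance estimates of those ratio estimates
    pooled_ratios:    optional larger set of ratio estimates (e.g. including
                      neighbouring strata) used only for sigma_l^2
    """
    r_l = sum(w * r for w, r in zip(weights, ratios))
    sum_w2 = sum(w * w for w in weights)
    # Component 1: sampling of containers within each municipality
    v1 = sum(w * w * v for w, v in zip(weights, within_variances))
    # Component 2: sampling of municipalities within the stratum;
    # statistics.variance is the sample variance (n - 1 denominator)
    sigma2_l = variance(pooled_ratios if pooled_ratios is not None else ratios)
    v2 = sigma2_l * sum_w2
    return r_l, v1, v2, v1 + v2


# Hypothetical stratum with two equally weighted municipalities.
r_l, v1, v2, v_total = stratum_estimate(
    weights=[0.5, 0.5],
    ratios=[0.20, 0.30],
    within_variances=[1e-4, 1e-4],
)
```

In this invented example the between-municipality component is roughly fifty times larger than the within-municipality component, which illustrates why sampling more municipalities can matter more than sampling more containers.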
  • Confidence interval and estimating the number of municipalities needed to obtain a given precision
Analogously to Equation (A4), the confidence interval for the ratio in the $l$th stratum can be simply calculated as
$$\left(\hat{R}_l - t_{df}(1-\alpha/2)\, v_{\hat{R}_l},\; \hat{R}_l + t_{df}(1-\alpha/2)\, v_{\hat{R}_l}\right),$$
where $df = K_+^{(l)} - 1$ and $K_+^{(l)}$ is the total number of estimates $\hat{R}_{k,l}$ used in the calculation of $\hat{\sigma}_l^2$ as described above. As in the previous sections, however, it is recommended to use Equation (A12), where $\hat{R}_n$ and $v^2_{\hat{R}_n}$ are replaced with $\hat{R}_l$ and $v^2_{\hat{R}_l}$, respectively. Denote by $\tilde{K}_l$ the estimated number of municipalities needed to obtain the given precision $\delta$. Then, $\tilde{K}_l$ can be found as the smallest integer (larger than or equal to $K_l$) such that
$$t_{\widetilde{df}}(1-\alpha/2)\, v_{\hat{R}_l} \sqrt{\frac{K_l}{\tilde{K}_l}} \le \delta, \qquad \text{where } \widetilde{df} = \max\left\{\frac{\tilde{K}_l}{K_l}, 1\right\} df.$$
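The search for $\tilde{K}_l$ could look as follows in Python. This is a sketch under stated assumptions: the function name `municipalities_needed` and the example inputs are hypothetical, and the `quantile` argument is an illustrative hook for a Student $t$ quantile function taking the degrees of freedom. When it is omitted, the normal quantile is used instead, which tends to understate the required number when the degrees of freedom are small.

```python
import math
from statistics import NormalDist


def municipalities_needed(v_l, k_l, df, delta, alpha=0.05,
                          quantile=None, k_max=10_000):
    """Smallest number of municipalities K >= k_l meeting precision delta.

    v_l: current standard error of the stratum estimate (square root of
         the estimated variance), based on k_l sampled municipalities
    quantile: optional callable df -> t quantile; defaults to the normal
              quantile regardless of df (assumption for this sketch)
    """
    if quantile is None:
        u = NormalDist().inv_cdf(1.0 - alpha / 2.0)

        def quantile(_df):
            return u

    for k_new in range(k_l, k_max + 1):
        df_new = max(k_new / k_l, 1.0) * df
        if quantile(df_new) * v_l * math.sqrt(k_l / k_new) <= delta:
            return k_new
    raise ValueError("precision not reachable within k_max municipalities")


# Hypothetical stratum: 3 municipalities sampled, standard error 0.02.
k_needed = municipalities_needed(v_l=0.02, k_l=3, df=2, delta=0.02)
```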

A.4. Mathematical Model—National Estimation

The assumption is that the municipalities are stratified, i.e., each municipality belongs to exactly one of the $L$ strata. This section describes how to aggregate the results for several strata. Let $\hat{R}_l$ be the estimate in the $l$th stratum as described in Section A.3. Furthermore, let $W_l$ be the weight of the $l$th stratum that reflects its relative waste production. Finally, assume that the weights are normalized such that $\sum_{l=1}^{L} W_l = 1$. Then, the aggregated estimator is simply given by
$$\hat{R}_N = \sum_{l=1}^{L} W_l\, \hat{R}_l,$$
and the variance of this estimate can be estimated as
$$v^2_{\hat{R}_N} = \sum_{l=1}^{L} W_l^2\, v^2_{\hat{R}_l},$$
where $v^2_{\hat{R}_l}$ is the estimate of the variance of $\hat{R}_l$ introduced in Section A.3. Analogously to Equation (A4), the standard asymptotic confidence interval can be calculated as
$$\left(\hat{R}_N - t_{df}(1-\alpha/2)\, v_{\hat{R}_N},\; \hat{R}_N + t_{df}(1-\alpha/2)\, v_{\hat{R}_N}\right),$$
where $df = N - 1$ with $N = \sum_{l=1}^{L} K_l$ being the number of municipalities in the sample. Alternatively (and preferably), Equation (A12) can be used, where $\hat{R}_n$ and $v^2_{\hat{R}_n}$ are replaced with $\hat{R}_N$ and $v^2_{\hat{R}_N}$, respectively. Finally, let $\tilde{N} = \sum_{l=1}^{L} \tilde{K}_l$ be the estimate of the number of municipalities needed to obtain the desired precision $\delta$ of the estimate. Similarly to Section A.2, the integers $\tilde{K}_1, \dots, \tilde{K}_L$ can be searched so that $\tilde{N}$ is minimized subject to
$$t_{\widetilde{df}}(1-\alpha/2) \sqrt{\sum_{l=1}^{L} \frac{K_l\, W_l^2\, v^2_{\hat{R}_l}}{\tilde{K}_l}} \le \delta, \qquad \tilde{K}_l \ge K_l, \quad l \in \{1, \dots, L\},$$
where $\widetilde{df} = \max\left\{1, \frac{\tilde{N}}{N}\right\} df$. An approximate solution that ignores the difference between $t_{\widetilde{df}}(1-\alpha/2)$ and $u_{1-\alpha/2}$ (and also the fact that $\tilde{K}_1, \dots, \tilde{K}_L$ are in fact integers) would be
$$\tilde{N} \approx \frac{u_{1-\alpha/2}^2}{\delta^2} \left(\sum_{l=1}^{L} \sqrt{K_l}\, W_l\, \hat{v}_{\hat{R}_l}\right)^2, \qquad \tilde{K}_l \approx \max\left\{\frac{u_{1-\alpha/2}^2}{\delta^2}\, \sqrt{K_l}\, W_l\, \hat{v}_{\hat{R}_l} \sum_{l'=1}^{L} \sqrt{K_{l'}}\, W_{l'}\, \hat{v}_{\hat{R}_{l'}},\; K_l\right\}.$$
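The national aggregation and the approximate allocation of municipalities can be sketched together in Python. The function names `national_estimate` and `allocate_municipalities` and all numerical inputs are hypothetical; the allocation uses the normal quantile as in the approximate solution, and rounding up to whole municipalities is an added convention.

```python
import math
from statistics import NormalDist


def national_estimate(stratum_weights, ratios, variances):
    """Aggregated ratio estimate and its variance over all strata."""
    r_n = sum(w * r for w, r in zip(stratum_weights, ratios))
    v2_n = sum(w * w * v for w, v in zip(stratum_weights, variances))
    return r_n, v2_n


def allocate_municipalities(stratum_weights, std_errors, k_current, delta,
                            alpha=0.05):
    """Approximate number of municipalities per stratum for precision delta.

    std_errors: current standard errors of the stratum estimates
    k_current:  numbers of municipalities already sampled per stratum
    """
    u = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    scale = (u / delta) ** 2
    # c_l = sqrt(K_l) * W_l * v_l
    c = [math.sqrt(k) * w * v
         for k, w, v in zip(k_current, stratum_weights, std_errors)]
    total_c = sum(c)
    return [max(math.ceil(scale * cl * total_c), k)
            for cl, k in zip(c, k_current)]


# Hypothetical case: two equally weighted strata, four municipalities each.
r_nat, v2_nat = national_estimate([0.5, 0.5], [0.20, 0.30], [1e-4, 1e-4])
k_plan = allocate_municipalities([0.5, 0.5], [0.01, 0.01], [4, 4], delta=0.01)
```

A quick plausibility check of such a plan is to insert the resulting numbers back into the constraint and confirm that the half-width of the interval does not exceed $\delta$.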

References

  1. Bohm, R.A.; Folz, D.H.; Kinnaman, T.C.; Podolsky, M.J. The Costs of Municipal Waste and Recycling Programs. Resour. Conserv. Recycl. 2010, 54, 864–871.
  2. Vigiak, O.; Grizzetti, B.; Zanni, M.; Aloe, A.; Dorati, C.; Bouraoui, F.; Pistocchi, A. Domestic Waste Emissions to European Waters in the 2010s. Sci. Data 2020, 7, 33.
  3. Magnanelli, E.; Birgen, C.; Becidan, M. Current Municipal Solid Waste Management in a Large City and Evaluation of Alternative Management Scenarios. Chem. Eng. Trans. 2023, 98, 225–230.
  4. Edjabou, M.E.; Martín-Fernández, J.A.; Scheutz, C.; Astrup, T.F. Statistical Analysis of Solid Waste Composition Data: Arithmetic Mean, Standard Deviation and Correlation Coefficients. Waste Manag. 2017, 69, 13–23.
  5. Emmanouil, C.; Chachami-Chalioti, S.E.; Kyzas, G.Z.; Kungolos, A. Application of the Theory of Planned Behavior to Predict Waste Source Separation. Sci. Total Environ. 2024, 956, 177356.
  6. Gadaleta, G.; De Gisi, S.; Todaro, F.; D’Alessandro, G.; Binetti, S.; Notarnicola, M. Assessing the Sorting Efficiency of Plastic Packaging Waste in an Italian Material Recovery Facility: Current and Upgraded Configuration. Recycling 2023, 8, 25.
  7. Chin, H.H.; Varbanov, P.S.; Fózer, D.; Mizsey, P.; Klemeš, J.J.; Jia, X. Data-Driven Recyclability Classification of Plastic Waste. Chem. Eng. Trans. 2021, 88, 679–684.
  8. Jangid, C.J.; Miller, K.M.; Seay, J.R. Analysis of Plastic-Derived Fuel Oil Produced from High- and Low-Density Polyethylene. Recycling 2022, 7, 29.
  9. Vidal-Antich, C.; Peces, M.; Perez-Esteban, N.; Mata-Alvarez, J.; Dosta, J.; Astals, S. Impact of Food Waste Composition on Acidogenic Co-Fermentation with Waste Activated Sludge. Sci. Total Environ. 2022, 849, 157920.
  10. Zhang, Z.; Chen, Z.; Zhang, J.; Liu, Y.; Chen, L.; Yang, M.; Osman, A.I.; Farghali, M.; Liu, E.; Hassan, D.; et al. Municipal Solid Waste Management Challenges in Developing Regions: A Comprehensive Review and Future Perspectives for Asia and Africa. Sci. Total Environ. 2024, 930, 172794.
  11. Herrera-Franco, G.; Merchán-Sanmartín, B.; Caicedo-Potosí, J.; Bitar, J.B.; Berrezueta, E.; Carrión-Mero, P. A Systematic Review of Coastal Zone Integrated Waste Management for Sustainability Strategies. Environ. Res. 2024, 245, 117968.
  12. Ezeudu, O.B.; Ozoegwu, C.G.; Madu, C.N. A Statistical Regression Method for Characterization of Household Solid Waste: A Case Study of Awka Municipality in Nigeria. Recycling 2019, 4, 1.
  13. Tonjes, D.J.; Manzur, S.; Wang, Y.; Firmansyah, F.; Rahman, M.; Walker, G.; Lee, S.; Thomas, T.; Johnston, M.; Ly, M.; et al. Composition of New York State (United States) Disposed Waste and Recyclables in 2021: An Advanced Analysis of Waste Sort Data. Recycling 2024, 9, 87.
  14. SWA-Tool Methodology for the Analysis of Solid Waste. Available online: https://www.wien.gv.at/meu/fdb/pdf/swa-tool-759-ma48.pdf (accessed on 11 June 2025).
  15. Struk, M. Distance and Incentives Matter: The Separation of Recyclable Municipal Waste. Resour. Conserv. Recycl. 2017, 122, 155–162.
  16. Edjabou, M.E.; Jensen, M.B.; Götze, R.; Pivnenko, K.; Petersen, C.; Scheutz, C.; Astrup, T.F. Municipal Solid Waste Composition: Sampling Methodology, Statistical Analyses, and Case Study Evaluation. Waste Manag. 2015, 36, 12–23.
  17. Roy, D.; Palavalli, B.; Menon, N.; King, R.; Pfeffer, K.; Lees, M.; Sloot, P.M.A. Survey-Based Socio-Economic Data from Slums in Bangalore, India. Sci. Data 2018, 5, 170200.
  18. Winslow, L.A.; Hansen, G.J.A.; Read, J.S.; Notaro, M. Large-Scale Modeled Contemporary and Future Water Temperature Estimates for 10774 Midwestern U.S. Lakes. Sci. Data 2017, 4, 170053.
  19. Gente, V.; Marca, F.L.; Massacci, P.; Serranti, S. Waste Characterization by Scanning Electron Microscopy for Material Recovery. Part. Sci. Technol. 2007, 25, 481–494.
  20. Lovanh, N.; Loughrin, J.H.; Cook, K.; Rothrock, M.; Sistani, K. The Effect of Stratification and Seasonal Variability on the Profile of an Anaerobic Swine Waste Treatment Lagoon. Bioresour. Technol. 2009, 100, 3706–3712.
  21. Zeng, Y.; Trauth, K.M.; Peyton, R.L.; Banerji, S.K. Characterization of Solid Waste Disposed at Columbia Sanitary Landfill in Missouri. Waste Manag. Res. J. A Sustain. Circ. Econ. 2005, 23, 62–71.
  22. Cheniti, H.; Serradj, T.; Brahamia, K.; Makhlouf, A.; Guerraiche, S. Physical Knowledge of Household Waste in Algeria: Generation and Composition in the Town of Annaba. Waste Manag. Res. J. A Sustain. Circ. Econ. 2013, 31, 1180–1186.
  23. Abylkhani, B.; Guney, M.; Aiymbetov, B.; Yagofarova, A.; Sarbassov, Y.; Zorpas, A.A.; Venetis, C.; Inglezakis, V. Detailed Municipal Solid Waste Composition Analysis for Nur-Sultan City, Kazakhstan with Implications for Sustainable Waste Management in Central Asia. Environ. Sci. Pollut. Res. 2021, 28, 24406–24418.
  24. Petersen, C.M.; Berg, P.E.O.; Rönnegård, L. Quality Control of Waste to Incineration—Waste Composition Analysis in Lidköping, Sweden. Waste Manag. Res. J. A Sustain. Circ. Econ. 2005, 23, 527–533.
  25. Dangi, M.B.; Urynowicz, M.A.; Belbase, S. Characterization, Generation, and Management of Household Solid Waste in Tulsipur, Nepal. Habitat Int. 2013, 40, 65–72.
  26. Bolaane, B.; Ali, M. Sampling Household Waste at Source: Lessons Learnt in Gaborone. Waste Manag. Res. J. A Sustain. Circ. Econ. 2004, 22, 142–148.
  27. Thitame, S.N.; Pondhe, G.M.; Meshram, D.C. Characterisation and Composition of Municipal Solid Waste (MSW) Generated in Sangamner City, District Ahmednagar, Maharashtra, India. Environ. Monit. Assess. 2010, 170, 1–5.
  28. Zhao, S.; Altmann, V.; Richterova, L.; Vitkova, V. Comparison of Physical Composition of Municipal Solid Waste in Czech Municipalities and Their Potential in Separation. Agron. Res. 2021, 19, 961–974.
  29. Qdais, H. Analysis of Residential Solid Waste at Generation Sites. Waste Manag. Res. 1997, 15, 395–406.
  30. Roberts, C.; Watkin, G.; Ezeah, C.; Phillips, P.; Odunfa, A. Seasonal Variation and Municipal Solid Waste Composition—Issues for Development of New Waste Management Strategies in Abuja, Nigeria. J. Solid Waste Technol. Manag. 2010, 36, 210–219.
  31. Chung, S.-S.; Poon, C.-S. Characterisation of Municipal Solid Waste and Its Recyclable Contents of Guangzhou. Waste Manag. Res. J. A Sustain. Circ. Econ. 2001, 19, 473–485.
  32. Gidarakos, E.; Havas, G.; Ntzamilis, P. Municipal Solid Waste Composition Determination Supporting the Integrated Solid Waste Management System in the Island of Crete. Waste Manag. 2006, 26, 668–679.
  33. Liikanen, M.; Sahimaa, O.; Hupponen, M.; Havukainen, J.; Sorvari, J.; Horttanainen, M. Updating and Testing of a Finnish Method for Mixed Municipal Solid Waste Composition Studies. Waste Manag. 2016, 52, 25–33.
  34. Lebersorger, S.; Schneider, F. Discussion on the Methodology for Determining Food Waste in Household Waste Composition Studies. Waste Manag. 2011, 31, 1924–1933.
  35. Ibikunle, R.A.; Titiladunayo, I.F.; Lukman, A.F.; Dahunsi, S.O.; Akeju, E.A. Municipal Solid Waste Sampling, Quantification and Seasonal Characterization for Power Evaluation: Energy Potential and Statistical Modelling. Fuel 2020, 277, 118122.
  36. Guérin, J.É.; Paré, M.C.; Lavoie, S.; Bourgeois, N. The Importance of Characterizing Residual Household Waste at the Local Level: A Case Study of Saguenay, Quebec (Canada). Waste Manag. 2018, 77, 341–349.
  37. Miezah, K.; Obiri-Danso, K.; Kádár, Z.; Fei-Baffoe, B.; Mensah, M.Y. Municipal Solid Waste Characterization and Quantification as a Measure towards Effective Waste Management in Ghana. Waste Manag. 2015, 46, 15–27.
  38. Sharma, M.; McBean, E. A Methodology for Solid Waste Characterization Based on Diminishing Marginal Returns. Waste Manag. 2007, 27, 337–344.
  39. ASTM D5231-92; Standard Test Method for Determination of the Composition of Unprocessed Municipal Solid Waste. ASTM International: West Conshohocken, PA, USA, 2016. Available online: https://www.astm.org (accessed on 11 June 2025).
  40. Sfeir, H.; Reinhart, D.R.; McCauley-Bell, P.R. An Evaluation of Municipal Solid Waste Composition Bias Sources. J. Air Waste Manag. Assoc. 1999, 49, 1096–1102.
  41. TIRSMZP719; Prognosis of Waste Production and Determination of the Composition of Municipal Waste. Technology Agency of the Czech Republic: Prague, Czech Republic, 2022.
  42. Šomplák, R.; Kopa, M.; Omelka, M.; Nevrlý, V.; Pavlas, M. Multi-Level Stratification of Territories for Waste Composition Analysis. J. Environ. Manag. 2022, 318, 115534.
  43. Denafas, G.; Ruzgas, T.; Martuzevičius, D.; Shmarin, S.; Hoffmann, M.; Mykhaylenko, V.; Ogorodnik, S.; Romanov, M.; Neguliaeva, E.; Chusov, A.; et al. Seasonal Variation of Municipal Solid Waste Generation and Composition in Four East European Cities. Resour. Conserv. Recycl. 2014, 89, 22–30.
  44. Zhang, J.-P.; Hou, J.-Q.; Li, M.-X.; Yang, T.-X.; Xi, B.-D. A Novel Process for Food Waste Recycling: A Hydrophobic Liquid Mulching Film Preparation. Environ. Res. 2022, 212, 113332.
  45. ECN. Bio-Waste in Europe. 2022. Available online: https://www.compostnetwork.info/policy/biowaste-in-europe/?utm_source=chatgpt.com (accessed on 16 July 2025).
  46. Waste Data & Analysis Center. 2025. Available online: https://www.wastedata.info (accessed on 15 July 2025).
Figure 1. Scheme of waste composition estimation methodology (part discussed in this paper is marked in red).
Figure 2. Scheme of estimation methodology of waste composition, represented by seven consecutive steps. The letters denote clusters and their representatives, while the numbers indicate an additional level of stratification within the municipality for selecting a representative container.
Figure 3. The map of clusters and representatives for the Czech Republic, source: [42].
Figure 4. Statistical evaluation of cluster C1.
Figure 5. Statistical evaluation of the Czech Republic.
Table 1. An overview of reviewed publications focused on the statistical evaluation of the composition of municipal solid waste.

| Paper | Assumptions Check | Point/Interval Estimate | Absolute/Relative Estimate | Number of Samples | Statistical Hypotheses | Aggregation of Results |
|---|---|---|---|---|---|---|
| [4] | no | point, interval, log interval, correlation | relative | no | yes | no |
| [16] | no | point, interval | relative | no | yes | only results |
| [19] | no | no | no | no | no | no |
| [20] | no | point | absolute | no | yes | no |
| [21] | no | point | relative | no | yes | no |
| [22] | no | point | relative | no | yes | no |
| [23] | no | point | relative | no | no | no |
| [24] | yes | point, interval | both | no | no | no |
| [25] | no | point | relative | no | no | no |
| [26] | no | interval | both | yes | no | no |
| [27] | no | point | relative | no | no | only results |
| [28] | no | point | relative | no | no | no |
| [29] | no | point ¹ | relative | yes | no | no |
| [30] | no | point | relative | no | yes | only results |
| [31] | no | point | relative | no | no | brief description |
| [32] | no | point ¹ | relative | yes | no | only results |
| [33] | yes | point, interval | relative | yes | yes | only results |
| [34] | no | point, interval | relative | yes | yes | brief description |
| [35] | no | point ¹ | relative | yes | yes | no |
| [36] | yes | point | relative | yes ² | yes | only results |
| [37] | no | point ¹ | relative | yes | yes | brief description |
| [38] | yes | point, interval | relative | yes | no | no |

¹ Interval set in an equation for n by a given level of precision. ² Expert estimation of the necessary number of samples.
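Table 2. Point and interval estimates resulting from analyses of the composition of MMW in the Czech Republic, for representatives C1–C9 using logit transformation.

| Fraction | Estimate | C1 (S1) | C2 (S1) | C3 (S2) | C4 (S2) | C5 (S2) | C6 (S3) | C7 (S3) | C8 (S3) | C9 (S3) |
|---|---|---|---|---|---|---|---|---|---|---|
| Paper | Point | 7.70% | 9.50% | 5.10% | 9.90% | 4.10% | 5.20% | 3.60% | 3.00% | 4.80% |
| Paper | Interval, lower | 5.70% | 8.23% | 3.18% | 7.31% | 3.14% | 3.45% | 2.72% | 1.90% | 3.46% |
| Paper | Interval, upper | 10.26% | 10.89% | 8.12% | 13.18% | 5.38% | 7.64% | 4.64% | 4.65% | 6.69% |
| Plastic | Point | 8.50% | 9.00% | 4.20% | 10.90% | 6.70% | 5.90% | 4.40% | 5.20% | 6.00% |
| Plastic | Interval, lower | 6.20% | 8.05% | 3.56% | 9.21% | 5.25% | 4.26% | 3.33% | 3.78% | 4.79% |
| Plastic | Interval, upper | 11.54% | 10.13% | 4.93% | 12.75% | 8.59% | 8.03% | 5.79% | 7.23% | 7.44% |
| Glass | Point | 3.70% | 4.00% | 2.30% | 6.50% | 2.80% | 2.20% | 2.20% | 5.40% | 2.10% |
| Glass | Interval, lower | 2.66% | 3.43% | 1.61% | 4.90% | 1.59% | 1.00% | 1.19% | 2.95% | 1.38% |
| Glass | Interval, upper | 5.18% | 4.75% | 3.39% | 8.55% | 4.85% | 4.61% | 4.00% | 9.51% | 3.21% |
| Bio waste | Point | 25.10% | 25.10% | 27.50% | 27.70% | 28.10% | 43.00% | 26.30% | 29.60% | 33.80% |
| Bio waste | Interval, lower | 21.89% | 21.98% | 21.66% | 24.53% | 23.77% | 34.51% | 20.81% | 22.57% | 27.07% |
| Bio waste | Interval, upper | 28.52% | 28.45% | 34.21% | 31.14% | 32.78% | 52.01% | 32.63% | 37.71% | 41.25% |
| Metal | Point | 1.90% | 2.00% | 2.50% | 2.10% | 2.70% | 1.60% | 1.70% | 1.70% | 2.50% |
| Metal | Interval, lower | 1.57% | 1.72% | 1.72% | 1.50% | 2.18% | 0.88% | 1.25% | 1.05% | 1.06% |
| Metal | Interval, upper | 2.20% | 2.23% | 3.54% | 3.04% | 3.38% | 3.05% | 2.35% | 2.64% | 5.95% |
| Textile | Point | 4.00% | 2.90% | 4.90% | 2.40% | 2.80% | 1.90% | 2.20% | 2.30% | 2.20% |
| Textile | Interval, lower | 2.44% | 2.23% | 3.28% | 1.25% | 1.81% | 1.14% | 1.33% | 1.22% | 1.35% |
| Textile | Interval, upper | 6.59% | 3.66% | 7.25% | 4.70% | 4.41% | 3.10% | 3.55% | 4.13% | 3.71% |
| Composite | Point | 2.00% | 2.00% | 1.60% | 2.40% | 2.30% | 1.70% | 2.00% | 1.60% | 1.80% |
| Composite | Interval, lower | 1.50% | 1.71% | 1.18% | 2.01% | 1.77% | 1.18% | 1.50% | 1.21% | 1.39% |
| Composite | Interval, upper | 2.61% | 2.25% | 2.05% | 2.84% | 2.86% | 2.57% | 2.55% | 2.17% | 2.36% |
| Electrical | Point | 0.40% | 0.50% | 0.70% | 1.00% | 0.20% | 0.30% | 0.40% | 1.30% | 0.10% |
| Electrical | Interval, lower | 0.19% | 0.31% | 0.30% | 0.31% | 0.07% | 0.11% | 0.12% | 0.44% | 0.04% |
| Electrical | Interval, upper | 0.75% | 0.83% | 1.80% | 3.10% | 0.53% | 0.65% | 1.25% | 3.93% | 0.20% |
| Others | Point | 21.60% | 23.10% | 22.20% | 24.50% | 26.20% | 14.00% | 20.00% | 20.30% | 19.90% |
| Others | Interval, lower | 17.42% | 19.28% | 16.97% | 18.67% | 21.46% | 10.30% | 15.20% | 15.20% | 15.85% |
| Others | Interval, upper | 26.84% | 27.75% | 30.73% | 32.49% | 32.41% | 18.95% | 25.89% | 26.55% | 25.00% |
| <40 mm | Point | 25.20% | 22.00% | 29.00% | 12.60% | 24.10% | 24.20% | 37.30% | 29.70% | 26.70% |
| <40 mm | Interval, lower | 19.90% | 18.81% | 18.29% | 10.48% | 20.73% | 17.15% | 28.91% | 19.39% | 19.14% |
| <40 mm | Interval, upper | 31.32% | 25.46% | 42.72% | 15.06% | 27.78% | 33.07% | 46.63% | 42.58% | 35.83% |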
Table 3. Results of analyses of the composition of MMW in the Czech Republic, for representatives C1–C9 using logit transformation.

| Fraction | Presented Approach: Point Estimate | Presented Approach: Variance (10⁻⁷) | Simple Evaluation: Point Estimate | Simple Evaluation: Variance (10⁻⁷) |
|---|---|---|---|---|
| Paper | 6.75% | 1020.4 | 6.64% | 102.1 |
| Plastic | 7.32% | 851.7 | 7.06% | 71.8 |
| Glass | 3.59% | 392.9 | 3.35% | 30.5 |
| Bio waste | 28.37% | 2289.4 | 28.74% | 885.1 |
| Metal | 2.06% | 36.7 | 2.06% | 15.7 |
| Textile | 2.96% | 204.3 | 3.11% | 55.5 |
| Composite | 1.93% | 96.4 | 1.90% | 5.1 |
| Electrical | 0.53% | 26.2 | 0.51% | 5.4 |
| Others | 21.86% | 1777.9 | 21.92% | 519.7 |
| <40 mm | 24.63% | 8301.9 | 24.71% | 997.1 |
Table 4. Comparison of interval estimation methods.

| Method | Advantages | Limitations |
|---|---|---|
| Standard asymptotic | Simple and fast to compute; suitable for large samples | Unreliable for small samples; performs poorly for extreme proportions; confidence intervals may exceed [0, 1] |
| Logit-transformed | Reliable for small samples; handles extreme proportions well; keeps intervals within [0, 1] | Requires logit transformations; slightly more complex |
| Bootstrap | Non-parametric (no distribution assumptions); intuitive and flexible | Less stable with small samples or stratified data; computationally intensive; less reliable than the logit method in this study |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Šomplák, R.; Smejkalová, V.; Nevrlý, V.; Pluskal, J. Robust Statistical Approaches for Stratified Data of Municipal Solid Waste Composition: A Case Study of the Czech Republic. Recycling 2025, 10, 162. https://doi.org/10.3390/recycling10040162
