The Metrological Traceability, Performance and Precision of European Radon Calibration Facilities

An interlaboratory comparison for European radon calibration facilities was conducted to evaluate the establishment of a harmonized quality level for the activity concentration of radon in air and to demonstrate the performance of the facilities when calibrating measurement instruments for radon. Fifteen calibration facilities from 13 different European countries participated. They represented different levels in the metrological hierarchy: national metrology institutes and designated institutes, national authorities for radiation protection and participants from universities. The interlaboratory comparison was conducted by the German Federal Office for Radiation Protection (BfS) and took place from 2018 to 2020. Participants were requested to measure radon in atmospheres of their own facilities according to their own procedures and requirements for metrological traceability. A measurement device with suitable properties was used to determine the comparison values. The results of the comparison showed that the radon activity concentrations that were determined by European calibration facilities complying with metrological traceability requirements were consistent with each other and had common mean values. The deviations from these values were normally distributed. The range of variation of the common mean value was a measure of the degree of agreement between the participants. For exposures above 1000 Bq/m3, the variation was about 4% for a level of confidence of approximately 95% (k=2). For lower exposure levels, the variation increased to about 6%.


Introduction
The European Council Directive 2013/59/EURATOM requires that the EU member states introduce regulations for protection against radon exposure in homes and at workplaces [1]. In this context, the European metrology institutes are required to create a harmonized quality level for radon activity concentrations. The realization of the measurement quantity with a high degree of agreement between calibration bodies ensures that measurements are comparable and results are mutually recognized in the EU member states. The term radon used in the EU legal act, as well as in this work, refers to the radionuclide 222 Rn.
In the framework of the European Metrology Programme for Innovation and Research (EMPIR), the project Metrology for Radon Monitoring (MetroRADON) was initiated, which included an interlaboratory comparison to evaluate the metrological traceability of European radon calibration facilities and to demonstrate their performance and precision when calibrating measurement instruments for radon in the range from 300 to 10,000 Bq/m 3 .
Calibration services from the different EU member states, which preferably represent the respective national reference for the quantity of radon activity concentration in air, were encouraged to participate in the comparison.
The interlaboratory comparison was conducted by the German Federal Office for Radiation Protection (BfS, coordinator) and took place from 2018 to 2020. The participants were requested to measure the atmospheres of their own facilities according to their own procedures and requirements for metrological traceability. The comparison value was determined with a measurement device that had appropriate metrological characteristics and was made available to the participants in turn. Differences in the comparison values demonstrated the differences of the participants in the measurement of the radon activity concentration. The measure of precision was considered to be the degree of agreement between participants in the determination of the quantity. The interlaboratory comparison also showed the uncertainties when passing on the quantity to third parties through the calibration of devices.
In total, 15 calibration facilities from 12 different countries of the European Union and one from Montenegro participated in the interlaboratory comparison. Table 1 presents the calibration facilities that were involved in the comparison.
The considerable number of participants from various European countries with different positions in the metrological hierarchy and thus different positions in the metrological traceability chain allowed for a representative evaluation of the performance and precision in the calibration of measurement instruments for radon. The basic design of the interlaboratory comparison was developed in consultation with the members of the advisory group, which consisted of the EMPIR project collaborators. An agreed protocol for the comparison was handed out to each participant in advance. It informed about the procedure of the comparison, as well as the handover and handling of the comparison device.
The comparison device was sent to each participant in turn. It was made available to the participant for a predefined duration in order to perform the exposure measurements. After completing the exposure measurements, the device had to be returned to the coordinator. In addition to the exposure data, participants were asked to report data on the temperature, air humidity and air pressures that prevailed during the exposure measurements.

Comparison Device
The comparison device was used to transfer the comparison value for the measurements at different locations and levels. The device did not embody the comparison reference value.
The coordinator selected an electronic instrument of type AlphaGUARD PQ 2000 PRO TTL. This type of device is a standard instrument for the measurement of radon activity concentrations. The instrument is robust and reliable under various environmental conditions and is easy to use. Measurement results are stored safe from manipulation in its internal memory with sufficient capacity for the comparison exercises. The instrument was operated in the diffusion mode with an integration time of 10 min.
The comparison device was calibrated in the facilities of the coordinator at different radon activity concentrations in the range between 300 and 12,000 Bq/m 3 . Calibrations were performed before, during and after the interlaboratory comparison [2]. Taking the uncertainty into account, a constant calibration factor was obtained over the whole investigated range, which pointed to the linear relationship between the indicated value and the radon activity concentration in air. Similarly, no change in the calibration factor was observed over the comparison period, implying that the measurement characteristics of the instrument were constant, allowing for equal conditions for each participant.
The calibrations were flanked by regular background measurements. For this purpose, the device was enclosed in a volume that was flushed with low-radon air. Low-radon air was obtained from pressurized cylinders in which the air had previously been stored for a longer period. The resulting radon concentration in the volume was considered to be negligible (zero) and the device indicated the datum error for zero value of radon activity concentration. The background of the comparison device was measured before each run. It was constant throughout the comparison period and was determined to be 4 ± 5 Bq/m 3 , which had a negligible effect on the measurement results. The attributed uncertainty was the standard uncertainty. The AlphaGUARD operates with automatic background correction. Due to stochastic measurement effects that are not taken into account in the automatic background correction, the device provides measured values for the background that can also be negative.
Visual inspections of the comparison device for damage, including the diffusion filter, verification of functionality and checking of the set measurement parameters (e.g., calibration factor) supplemented the regular checks before the instrument was used for the next run.

Exposure Levels
Within the specified study range between 300 and 10,000 Bq/m 3 , 3 different exposure levels with low, medium and high radon activity concentrations were defined for the comparison. The nominal values of the radon activity concentrations are given in Table 2. The value of 1000 Bq/m 3 was already included in a previous comparison of calibration facilities for radon activity concentrations, which was carried out within the framework of the Euromet Project 657 [3].
In practice, the participants could not exactly adhere to the specified nominal values. Therefore, deviations from the nominal values were accepted within which the respective activity concentrations were expected.
With the exception of a few participants, most of the participants were able to meet these requirements. The main reasons for not achieving the nominal radon activity concentrations within their accepted deviations were generally: 1.
The participants were not able to keep the activity concentration constant over the duration of exposure, as the activity concentration decreased over time, mainly due to radioactive decay; 2.
The radon sources that were available in the participant's laboratories and/or the methods used to create the radon atmosphere were not suitable for reaching the predetermined concentrations.
The assessment of the degree of agreement between participants was carried out only for the results that were obtained at exposures that were within the accepted deviations. However, the results of measurements outside the accepted deviations from the nominal values were not excluded from consideration and are referred to in the following as singular exposures. They complemented the conclusions of this study by supporting its extension to the entire range of radon concentrations from low to high levels.

Methods for Processing the Results
The investigated quantity, which made the participant's performance comparable, was the ratio R of the radon activity concentration C RefLab , which was reported by the participant as the mean value for the relevant exposure period and the mean radon activity concentration C CD , which was measured during the same period with the comparison device: The standard uncertainty ∆R = u(R) was calculated from the propagation of uncertainty from Equation (1). The relative uncertainty is given by ∆C RefLab = u(C RefLab ) represents the standard uncertainty as reported by the participant and was determined according to its own procedure. The reported uncertainty included the statistical variation from repeated observations (type A evaluation of standard uncertainty) and contributions from other sources, in particular from data provided in the calibration and other certificates (type B evaluation of standard uncertainty) [4]. However, the procedure that was used by participants to calculate the measurement uncertainties was not evaluated as part of this interlaboratory comparison. ∆C CD is the uncertainty of the mean radon activity concentration, which was determined by the comparison device. Since the comparison device did not embody the comparison reference value, only the type A uncertainty given by the standard deviation of the mean, namely, s(C CD ), was considered: C CD,j is the jth of n measurements that were taken with the comparison device. Other contributions to the uncertainty, particularly from calibration factors, were not included. This was due to the essential requirement for the comparison device to provide an indication that depended linearly on the value of the radon activity concentration. The initial investigations of the comparison device showed that the linearity could be assumed over the entire range up to a radon activity concentration of 10,000 Bq/m 3 .
It should be noted that the simple averaging of the measurements performed with the comparison device and the use of Equation (3) was valid if the activity concentration was kept constant during the relevant exposure period. If this could not be ensured by the participant, the change in activity concentration over time must be well known. In such cases, the participant had to provide information on how to determine the mean radon activity concentration C CD from the readings of the comparison device. In general, the radon activity concentration that was established in a confined atmosphere decreased due to radioactive decay and, thus, the value for C CD at the reference time t ref was obtained using Equation (4) represents the average of the measured values corrected for the exposure time. The parameter t j represents the measurement time of C CD,j and λ represents the decay constant of radon. Equation (4) must be modified if the rate of decrease differs from that of radioactive decay.

Calculation of the Uncertainty-Weighted Average Ratio
When R i denotes the ratio R calculated for the ith of n participants and u i is the standard uncertainty attributed to R i , the uncertainty-weighted average ratio R w is determined using The parameter w i represents the weight for ratio R i : The weights are calculated from the reciprocal squared standard uncertainties of Equation (2). It follows that results with lower uncertainties are weighted higher than results with high uncertainties when determining the average ratio. The variance of R w is calculated as follows [5,6]:

Laboratory Reference Devices and the Compilation of the Results
Most participants used an AlphaGUARD-type device as the laboratory reference instrument for the radon activity concentrations. Two of these participants additionally performed measurements with scintillation chambers. The ATMOS 12DPX was utilized by two participants and a Radon Scout by one participant. AlphaGUARD and ATMOS use ionization chambers (single-or multi-wire) for radiation detection. The Radon Scout deploys high-voltage enhancement and alpha pulse counting using a semiconductor detector. Unlike AlphaGUARD and Radon Scout, which operate in diffusion mode, the ATMOS-type device operates in flow-through mode.
The vast majority of the participants were able to show the metrological traceability of the quantity through an unbroken chain of calibrations at recognized bodies. Two participants traced their measurements back through factory calibration.
Factory calibration is a service from the manufacturer that provides the instrument with an initial calibration before delivery. Although the manufacturers also trace their measurements back to recognized bodies, compliance with quality management standards and metrological requirements need not be demonstrated. The reported results revealed that, in particular, for participants who used factory calibration, the attributed measurement uncertainties were not consistent with the deviation from the collective average. It is shown below that this has consequences for the use of these results in the assessment of the interlaboratory comparison. Figure 1 shows the ratios R i representing the radon activity concentrations as measured by the participants in relation to the corresponding readings of the comparison device according to Equation (1). The error bars represent the standard uncertainties according to Equation (2). The results from participants who traced back their measurements using factory calibration are not included in Figure 1. revealed that, in particular, for participants who used factory calibration, the attributed measurement uncertainties were not consistent with the deviation from the collective average. It is shown below that this has consequences for the use of these results in the assessment of the interlaboratory comparison. Figure 1 shows the ratios representing the radon activity concentrations as measured by the participants in relation to the corresponding readings of the comparison device according to Equation (1). The error bars represent the standard uncertainties according to Equation (2). The results from participants who traced back their measurements using factory calibration are not included in Figure 1.

Consistency Check
A check of mutual consistency is required for interlaboratory comparisons using the BIPM consultative committee CCQM [5] to test the hypothesis that the participants have a collective mean value and that the deviations from this value are normally distributed.
The consistency check is performed using a chi-squared test over the number of n measurements (or participants). The observed test parameter χ 2 obs is calculated using According to CCQM [5], the test parameter is compared with the quantile of the chi-squared distribution for the significance level 1 − α with α = 0.05. The following decisions have to be made:

1.
If χ 2 obs < n − 1, the results are mutually consistent and the uncertainties account fully for the observed dispersion of the values; 2.
If n − 1 ≤ χ 2 obs < χ 2 0.05;n−1 , the data provide no strong evidence that the reported uncertainties are inappropriate, but there remains a risk that additional factors are contributing to the dispersion; 3.
If χ 2 obs > χ 2 0.05;n−1 , the data should be considered as mutually inconsistent. The results of the consistency checks are summarized in Table 3. The tests were performed for each exposure level and for the complete data set of all levels including singular exposures. The two participants who traced their measurements back through factory calibration are not included in the results of the consistency check.  Table 3 shows that for each exposure level, the observed test parameter was below the tabulated value χ 2 0.05;n−1 . Therefore, the conclusion can be drawn that the results were mutually consistent. There was no evidence of significant inconsistencies for each of the individual radon levels and the overall exposure range. The uncertainties fully accounted for the observed dispersion of the values.
For the radon level of 400 Bq/m 3 , the test parameter was greater than n − 1 at the stated significance level. The higher value of the observed test parameter was caused by results that showed increased deviations from the average ratio without a corresponding uncertainty being assigned to them. In these cases, the uncertainties that were attributed by some participants might have been too small for the observed deviation from the mean value.
The presented consistency of the data set failed when the results of the two participants who traced their measurements back through factory calibration were included in the data set. To ensure the consistency of the data set and to maintain the degree of representativeness of the intercomparison, the data from the two participants were not included in the derivation of the average ratio and, thus, the comparison reference value. The coordinator (from BfS) was also not considered further due to his special position as part of the supervising laboratory. Table 4 shows the uncertainty-weighted average ratio R w for the different exposure levels. R w is calculated according to Equation (5). The square root of the variance from Equation (7) is the standard uncertainty u(R w ). The values of the average ratio obtained for the various exposure levels agreed very well, taking into account the standard uncertainties.  Figure 1.

The Uncertainty-Weighted Average Ratio
Assuming that the comparison device represents the weighted collective average radon activity concentration for each exposure level, the average ratio would be compensated, resulting in R w = 1. However, the calculated values for R w showed a bias of about 1.5% above the expected compensation value. The bias was caused by the comparison device due to the calibration of the device at the coordinator's facility and indicated the coordinator's deviation in the measurement of the quantity from the collective mean. The measurements of the comparison device were, on average, 1.5% lower than the weighted average radon activity concentration that was measured by the participants for the respective exposure level. It was observed that the average ratio R w varied only slightly for the different exposure levels, confirming the performance and stability of the comparative measurements.

The Key Comparison Reference Value and the Dispersion of Measurement Values
The key comparison reference value (KCRV) is the value of the quantity representing the specific property of the material under consideration [5]. The specific property that was under consideration in this interlaboratory comparison was the activity concentration of radon in air. However, the single radon activity concentrations in the atmospheres that were measured at participants' facilities differed between participants. Moreover, three different main levels of radon activity concentration were measured by each participant, covering a large range over more than one magnitude.
The comparison device that was provided by the coordinator was used as a comparator to normalize the different radon activity concentrations that were established by the participants and thus allowed for comparability of the respective measurements of the quantities. As the comparison device is characterized by an indication, which is verifiably linear over the entire range, the comparison of the different radon activity concentrations found in the participant's facilities was made possible by their ratio to the indication of the comparison device, as is given by Equation (1). The average ratio R w , which can be deemed to be the KCRV, was calculated from the single ratios according to Equation (5). The consequences were as follows: 1.
The observed R w had a bias of about 1.5% (Table 4) compared to the expected value of R w = 1, which would result if the comparison device were to represent the uncertainty-weighted collective mean radon activity concentration and, thus, all individual deviations were compensated.

2.
The variance of R w that was calculated using Equation (7) led to the small values for u(R w ) given in Table 4. As was also shown in other studies [6][7][8], the reciprocal square root of the sum of the weights becomes too small with an increasing number of participants such that many laboratories fall outside the uncertainty interval. It is therefore assumed that this parameter is not an appropriate measure of the degree of agreement between the participants.
To overcome the disadvantages in terms of R w and also to eliminate the impact of the comparison device, a modified ratio, namely, R * i , for the ith participant is formed by The rationale behind this modified ratio, now considered as a new comparison value, is that the expectation value E R * i that is obtained from the weighted sum over each participant is equal to 1: The weights w i are given by Equation (6). Equation (10) implies that the values R * i are distributed around the common mean. Of particular importance for the results of the comparison is the mean square deviation of the participants: After replacing the weights by Equation (6), it follows that The square root of Equation (12) is considered to be the standard variation interval within which a certain radon activity concentration is measured in the atmospheres of European radon calibration facilities, and is thus a measure of the degree of agreement between the participants. Figure 2 shows the distribution of the single values of R * i around the common mean of 1 and the standard variation interval for each exposure level. The values for the standard and expanded intervals are provided in Table 5. The expanded variation interval is calculated using a coverage factor k = 2, which gives a level of confidence of approximately 95%. Although individual participants fell outside the coverage interval, they could not be considered outliers because their deviation was not significant when uncertainties were taken into account. Radon activity concentration as indicated by the comparison device (Bq/m 3 ) Figure 2. Comparison values given as modified ratios ( * ) and attributed uncertainties; blue straight lines represent a common mean of 1, dashed blue lines cover the standard variation interval (coverage factor = 1) according to Table 5, error bars indicate the standard uncertainties of the reported values, the results of the same participant are indicated by the same color and the color assigned to the respective participant is the same as in Figure 1. Already in 2005, a comparison of calibration facilities for radon activity concentration was carried out within the framework of the Euromet Project 657 [3]. The comparison of the expanded variation intervals obtained in this and the previous study are shown in Figure 2. Comparison values given as modified ratios (R * i ) and attributed uncertainties; blue straight lines represent a common mean of 1, dashed blue lines cover the standard variation interval (coverage factor k = 1) according to Table 5, error bars indicate the standard uncertainties of the reported values, the results of the same participant are indicated by the same color and the color assigned to the respective participant is the same as in Figure 1. Already in 2005, a comparison of calibration facilities for radon activity concentration was carried out within the framework of the Euromet Project 657 [3]. The comparison of the expanded variation intervals obtained in this and the previous study are shown in Table 6. Regardless of the different exposure levels, as well as the calculation of the variation intervals, a slight improvement in agreement on the measurements can be assumed. This was particularly evident at higher exposures.

Alternative Determination of Mean Values and Associated Uncertainties
The power-moderated mean method was proposed as an alternative method for calculating the KCRV and the associated standard uncertainty [8]. The method yields results that are generally intermediate between the arithmetic mean and the weighted mean. The power-moderated mean is an efficient and robust estimator of the reference value of a data set and its uncertainty. In particular, the method can be applied to discrepant data sets where the reported uncertainties do not cover the observed dispersion of the data and the condition χ 2 obs < n − 1 is not satisfied. This must be assumed for the data that were determined for the exposure level of 400 Bq/m 3 .
The calculations of the power-moderated mean were performed using the Excel spreadsheet MET511639suppdata.xlm, which is available for download from the Internet [9]. The automatic algorithm for moderating the relative weighting is used. Table 7 shows the values of the power-moderated mean and the corresponding standard uncertainties calculated with the Excel spreadsheet. The comparison with the complementary data of the average ratios in Table 4 does not show relevant differences. Only for the exposure level of 400 Bq/m 3 were slightly larger changes produced for the power-moderated mean method.
The approach provided by the Excel spreadsheet MET511639suppdata.xlm was also used for the identification of extreme values. Extreme values are indicated when the difference between the measured ratio and the power-moderated mean, namely, d i = R i − R w,pm , exceeds the constraint specified by [8]: For the coverage factor k = 2, no extreme values were found in the underlying data set. However, if the two participants who traced their measurements back through the factory calibration are included in the data set, some of their results would be classified as extreme values (outliers).

Influence of Climatic Conditions during the Calibrations
Most participants reported their results in conjunction with data on the climatic conditions (temperature, relative humidity and barometric pressure) in the laboratory at the time of exposure. Corrections for standard room conditions (temperature of 20 • C, relative humidity of 50% and air pressure of 1013 hPa) were not required. Only one participant reported his results for standard room conditions.
Exposures at the facilities were conducted in a wide range of climatic conditions that included temperatures (T) from 18 to 28 • C, atmospheric pressures (p) from 950 to 1024 hPa and relative humidities (rH) from below 10% to 63%. Detailed information with data on the climatic conditions is available on the Internet [2]. Figure 3 presents a three-dimensional plot of the climatic conditions that were present for nine of the participants during the exposures. These nine participants provided results for all three exposure levels without correcting them for standard conditions. The different climatic conditions raise the question of what influence they had on the results of this study. relative humidity of 50% and air pressure of 1013 hPa) were not required. Only one participant reported his results for standard room conditions. Exposures at the facilities were conducted in a wide range of climatic conditions that included temperatures ( ) from 18 to 28 °C, atmospheric pressures ( ) from 950 to 1024 hPa and relative humidities ( ) from below 10% to 63%. Detailed information with data on the climatic conditions is available on the Internet [2]. Figure 3 presents a threedimensional plot of the climatic conditions that were present for nine of the participants during the exposures. These nine participants provided results for all three exposure levels without correcting them for standard conditions. The different climatic conditions raise the question of what influence they had on the results of this study. The power of the association between a specified random variable ( * ) and a group of independent random variables ( , , ) is determined using the multiple correlation method. Details of the procedure and computational results are presented elsewhere [2].
The correlation studies did not reveal a statistically significant indication of a dependence of the comparison value * on the climatic conditions during the calibration exercise. However, the pairwise correlation showed a dominant dependence of the ratio * on the atmospheric pressure at the exposure level of 6000 Bq/m 3 . This correlation was not expected and is surprising. Since, in most cases, the type of device used by the participants is the same as that of the comparison device, it can be assumed that the climatic conditions affected the devices in the same way and that their effects canceled each other out when calculating the ratios. Therefore, it cannot be excluded that a random correlation was observed. Nevertheless, this finding should be clarified and can only be accepted if it is reproduced by further investigations.

Metrological Traceability and Correlations between the Participants
The participants were requested to provide information on how the metrological traceability of the radon activity concentration was realized. From this information, the chart in Figure 4 was developed, which shows the status of the metrological traceability during the period of the interlaboratory comparison. The power of the association between a specified random variable (R * i ) and a group of independent random variables (T, p, rH) is determined using the multiple correlation method. Details of the procedure and computational results are presented elsewhere [2].
The correlation studies did not reveal a statistically significant indication of a dependence of the comparison value R * i on the climatic conditions during the calibration exercise. However, the pairwise correlation showed a dominant dependence of the ratio R * i on the atmospheric pressure at the exposure level of 6000 Bq/m 3 . This correlation was not expected and is surprising. Since, in most cases, the type of device used by the participants is the same as that of the comparison device, it can be assumed that the climatic conditions affected the devices in the same way and that their effects canceled each other out when calculating the ratios. Therefore, it cannot be excluded that a random correlation was observed. Nevertheless, this finding should be clarified and can only be accepted if it is reproduced by further investigations.

Metrological Traceability and Correlations between the Participants
The participants were requested to provide information on how the metrological traceability of the radon activity concentration was realized. From this information, the chart in Figure 4 was developed, which shows the status of the metrological traceability during the period of the interlaboratory comparison.  The radon activity concentration is a derived quantity that is composed of the base quantities of activity of the gaseous radon and the volume. The volume is the capacity of the enclosed space containing the atmosphere into which radon is released. A secondary reference facility traces both the radon activity and volume back to their respective primary standards and combines the two quantities to form the radon activity concentration.
There were three main branches through which the base quantity of activity was traced back by the participants. The roots of the branches are the national metrology institutes PTB (Brunswick, Germany), LNHB (Paris-Saclay, France) and NIST (Gaithersburg, Maryland, USA), which hold the primary quantities. PTB (Brunswick, Germany), BfS (Berlin, Germany), IRSN (Paris-Saclay, France) and ENEA (Rome, Italy) act as secondary reference facilities. It should be noted that PTB abandoned its reference chamber in 2016. Facilities, that had used the PTB reference chamber to ensure the metrological traceability will have to undertake a rearrangement after its validity has expired. The BfS reference chamber switched the metrological traceability of the radon activity as of 2020 to LNHB by means of a gas standard. In order to adjust the radon activity concentrations to predetermined values that remain constant over time, emanation sources will become increasingly important in the future. Figure 4 shows that the secondary reference facilities measuring the derived quantity are not exclusively operated by metrology institutes and thus are not integrated into qualified metrological surveillance. Regular comparisons between the secondary reference facilities should be initiated to ensure the quality of the measurement of the quantity.

Conclusions
From March 2018 to February 2020, an interlaboratory comparison was conducted in the framework of the EMPIR Project Metrology for radon monitoring. In total, 15 calibration facilities from 13 different European countries participated in the interlaboratory comparison. Among those were national metrology institutes and designated institutes, national authorities for radiation protection and participants from universities. The radon activity concentration is a derived quantity that is composed of the base quantities of activity of the gaseous radon and the volume. The volume is the capacity of the enclosed space containing the atmosphere into which radon is released. A secondary reference facility traces both the radon activity and volume back to their respective primary standards and combines the two quantities to form the radon activity concentration.
There were three main branches through which the base quantity of activity was traced back by the participants. The roots of the branches are the national metrology institutes PTB (Brunswick, Germany), LNHB (Paris-Saclay, France) and NIST (Gaithersburg, MA, USA), which hold the primary quantities. PTB (Brunswick, Germany), BfS (Berlin, Germany), IRSN (Paris-Saclay, France) and ENEA (Rome, Italy) act as secondary reference facilities. It should be noted that PTB abandoned its reference chamber in 2016. Facilities, that had used the PTB reference chamber to ensure the metrological traceability will have to undertake a rearrangement after its validity has expired. The BfS reference chamber switched the metrological traceability of the radon activity as of 2020 to LNHB by means of a gas standard. In order to adjust the radon activity concentrations to predetermined values that remain constant over time, emanation sources will become increasingly important in the future. Figure 4 shows that the secondary reference facilities measuring the derived quantity are not exclusively operated by metrology institutes and thus are not integrated into qualified metrological surveillance. Regular comparisons between the secondary reference facilities should be initiated to ensure the quality of the measurement of the quantity.

Conclusions
From March 2018 to February 2020, an interlaboratory comparison was conducted in the framework of the EMPIR Project Metrology for radon monitoring. In total, 15 calibration facilities from 13 different European countries participated in the interlaboratory compari-son. Among those were national metrology institutes and designated institutes, national authorities for radiation protection and participants from universities.
The comparison was conducted by the German Federal Office for Radiation Protection (BfS). An electronic instrument of the type AlphaGUARD was selected as the comparison device, which was provided to the participants in turn. Participants were requested to expose the device to three different levels of radon activity concentration: 400, 1000 and 6000 Bq/m 3 . In certain cases, other exposures were also accepted. The ratio of the value of radon activity concentration established and measured at the participant's facility to the value that was simultaneously determined with the comparison device in the same atmosphere was used for comparison.
The results showed that, taking the statistical uncertainties into account, the ratios of the radon activity concentrations were identical for all exposure levels and for the summary of all levels including singular exposures. The radon activity concentrations that were determined by European calibration facilities that complied with metrological traceability requirements were consistent with each other and had common mean values. The deviations from these values were normally distributed. It can be assumed that the radon activity concentration that was measured by the European calibration facilities fluctuated around a collective mean value. Its interval of variation was a measure of the degree of agreement between the participants. For exposures above 1000 Bq/m 3 , the variation was about 4% for a level of confidence of approximately 95% (k = 2). For lower exposure levels, the variation increased to about 6% at 400 Bq/m 3 .
The participants performed their measurements under different climatic conditions. Correlation studies revealed no statistically significant indication of a dependence of the comparison value on the climatic conditions. However, a correlation was found between the comparison value and the atmospheric pressure for the exposure level of 6000 Bq/m 3 . This finding could not be clarified in the present study and requires further investigation.
The European radon calibration facilities traced their measurements back to primary quantities that were maintained at the national metrology institutes PTB (Brunswick, Germany), LNHB (Paris-Saclay, France) and NIST (Gaithersburg, MD, USA). The results of the interlaboratory comparison show that metrological traceability and calibration of instruments must be carried out according to uniform and generally recognized standards. Where these standards are not met, sufficient confidence in the measurement results cannot be established.
The interlaboratory comparison of European radon calibration facilities is a powerful tool to detect discrepancies in metrological traceability and to ensure the quality of radon measurements. It is strongly recommended to carry out interlaboratory comparisons regularly.