Application of Bayesian Methods to Exposure Assessment of Area Concentrations at a Rubber Factory

The present study estimated area concentrations of airborne benzene in several workshops using Bayesian methods based on available historical measurements. A rubber products factory utilizing benzene was investigated. Historical measurements of benzene concentrations, expert experiences, and deterministic modeling were utilized in a Bayesian Method to estimate area concentrations. Historical concentrations (n=124) were available with the geometric mean of 15.3 mg/m3. The geometric mean of the current field measurements on the workstations ranged from 0.7 to 89.0 mg/m3. One of the seven historical geometric means by work locations significantly differed from the field measurements for equivalent locations, but none of the geometric means of Bayesian estimates were significantly different from the field measurement results. The Bayesian methods based on the historical measurements appeared to be a useful tool for more closely estimating area concentrations shown by field data than that predicted only using historical measurements.


Introduction
Exposure assessment (EA) is often necessary for epidemiologic research [1]. EA methods are the foundation for establishing reasonable data for human dose-response (effect) relationships. Reliable EA is also needed to guide exposure control measures for workers exposed to significant health risks. Multiple EA approaches have been set up [2]. Furthermore, subjective assessments for reconstructing exposure have also been used [3], but most options are prone to a variety errors and biases [1,4]. Bayesian methods have been introduced for occupational EA in combination with mathematical modeling [4][5][6][7][8][9].
A Bayesian approach may utilize exposure measurements to update a "prior" constructed by a deterministic model, where the parameters were determined by expert opinion informed with available information and measurements. If e represents the physical parameter of interest (that is in our case airborne benzene area concentration), and the measurement process furnishes a number represented by M, then the Bayesian expression can be described by the following equation: where, P 0(e) is the probability distribution of e, P L (M/e) is the likelihood function that given the true value e, the measurement M is observed, P(M) is the probability that the measurement M is observed, and P post (e/M) is the updated probability (or the posterior) that the exposure is e given that the measurement M is observed.
Maximum Allowable Concentration (MAC, the concentration that should not be exceeded at anytime) [10] has been used in China as the occupational health standard from the 1950s to 2002. The present study takes benzene as an example to assess exposure levels with Bayesian methods based on historical area concentration measurements taken to evaluate MAC compliance.

Study Factory
A state-owned large rubber products factory in Shanghai, China, was recruited for the study. The factory was founded in 1954, and now includes 11 workshops for producing a variety of rubber hoses. The automation levels varied in the workshops, but the main processes were as the following: 1) rubber was extruded from a machine in the form of an inner layer of a tube assembly; 2) cotton threads or steel filaments were woven tightly on the surface of the rubber inner layer to increase the strength of the hose; 3) benzene was applied on the hose surface to make it tacky so the outer rubber layer adhered on it; 4) the outer rubber layer was extruded and tightly covered the fabric and the inner layers; 5) hoses were heated for vulcanizing. Benzene was mainly used as the bonding solvent between layers in eight of the workshops. Natural ventilation via doors and windows was present and fans were installed in the wall and/or ceiling of the building, but no local mechanical exhaust ventilation system was available at any worksite.

Bayesian Analysis Plan
In brief, the basis of the present methodology is a hybrid Bayes statistical approach [6,8,9], with the prior using exposure distributions obtained from a probabilistic mode and with available historical measurements. The Bayesian calculation plan was as follows.
1. For mathematical modeling, parameters of evaporation surface area for pollutant production, air flow rate, work environment distance from source to workers, products and production made, working hours every year, and process technology were provided to a professional expert team. The experts were asked to provide their opinion on the distribution and errors of the parameters for the deterministic model. The resulting parameters from the expert review were used to predict air concentrations.
2. Using parameters based on the expertise and historical working conditions, Monte Carlo simulation methods and the mathematical model were used to create the joint probability distributions and run for each sampled input parameters. These were the "prior" distributions, which characterized the uncertainty in the model output.
3. The estimated variance of the available historical air concentration data (the "real world" observations) were applied to estimate the parameters for the Bayesian likelihood function, P L (M/e).
4. Using WinBugs software [9] and Bayes rules, the prior was updated with the historical data to generate the posterior probability distribution of air concentrations P post (e/M).
A Bayesian process could be described as the following chart ( Figure 1).

Data Collection and Worksite Investigation
The Shanghai Institute of Public Health and Supervision was visited and historical data of area benzene concentrations were collected. A spreadsheet was set up for coding the workstations at the factory and for abstracting information including work scheduling, raw materials used, work processes, job/task titles, and preventive measures. In the present study, most workshops changed production by changing the number of assistant workers and running hours, but the process and the product line number remained unchanged, consequently the area concentrations changed little. According to the field investigation and the hygienist experiences, we found that along with the traditional processes such as Slurry-making and Iron core tube assembly, several new and modern production lines were developed such as Line 1 to Line 3 where rubber hoses for automobile use were produced. Otherwise, there were no significant changes in work shift, raw materials used, work processes, or preventive measures prior to the current investigation. The annual productions of the workshops were provided by the factory, while the working hours were based on the recall of the workers who had been working there. A total of 15 similar exposure groups (SEGs) were created based on the processes, tasks and job activities, equipment and environment as well as area concentration data [11][12][13].

Data Selection for Deterministic Modeling
We employed three local experts in occupational health and industrial hygiene engineering to develop parameters for our EA methods, two of them were professors of occupational health and one was a professor of air conditioning and ventilation. They were provided with references about the goals of the study, literature about occupational EA, the processes of the factory, pictures of the workstation and job activities. Judging on feasibility and reliability, experts were invited to select the most appropriate deterministic model among Well-Mixed Box Model, Two-Zone Steady State Model and Eddy Diffusion Model to evaluate the workplace exposure. As a result, the Two-Zone Steady State Model [14] was selected based on the small working spaces, repetitive operation and poor ventilation. The model has relative simplicity while still accounting for variability in concentration with distance from the source. The choice was justifiable if several factors were met: 1) air in both the near-and farfield was adequately mixed; 2) a little air exchange was allowed; 3) and the pollutant production rate was stable. The near zone always means the hemisphere with the radius of the distance between the operator and the pollutant source. The model can be described by the following equation: (2) where C N，SS : steady state concentrations of pollutant in the near zone (NZ), mg/m 3 ; G: pollutant production rate, mg/min; Q: air flow rate supplied to the workshop, m 3 /min; S A : NZ free surface calculated using the distance from source to operator assuming hemisphere, m 2 ; s: average of the random direction wind velocity across the near zonefar zone interface, m/min.

Current Worksite Benzene Air Sampling and Analysis
Benzene area samples were taken using sampling pumps (calibrated before and after sampling) and single section charcoal tubes of Chinese design (2nd Jianhu Electronic Instrument Factory of Jiangsu Province) at seven worksites where people were working, according to the current national standard (Specifications of air sampling for hazardous substances monitoring in the workplace, GBZ159-2004). The charcoal tubes have been widely used to collect airborne benzene in China for analysis with the NIOSH methods, and the results were reported to be comparable to those from NIOSH standard tubes [15]. Air samples were sealed and transported to the laboratory for analysis by gas chromatography according to NIOSH analytical method 1501. The air samples were stored at -20 °C before analysis.

Estimation of Area Concentration with Bayesian Model
All input parameters including pollutant production rate, general indoor and NZ interface flow rate, for the selected model of the Slurry-making workshop and Cloth-lining workshop were assayed in the field. When the pollutant production was assayed, doors and windows were closed and wall ventilation fans were shut off. The sizes of the zones nearby the workstations varied, so it was not practicable to develop more detailed estimates of the pollutant productions. The workshop was divided into a sampling grid of 12 virtual spaces of equal area, with 12 air samplers distributed evenly in the grid, with samples taken for 10 min 6 times a day for 2 days. The virtual spaces in the Slurry applying workstation and Cloth-lining assembly workstation were 15 and 13 m 2 , respectively. The pollutant production rates were computed at the following formula: where G: the pollutant production rate, mg/min; C n : average concentration of the 12 area samplers at one round of sampling, mg/m 3 ; C n-1 : average concentration of the 12 area samplers at one round of sampling prior to C n , mg/m 3 ; V: volume of the workshop, m 3 ; T: sample timing between C n and C n-1 , min (=10).
The pollutant production rate data in the two workshops were normally distributed (verified with W-test). The arithmetic averages (AMs) of Slurry-applying workstation and Cloth-lining assembly workstation were 520 mg/min and 1,302 mg/min, respectively (Table 1).
Air velocities at the NZ interface and in the general workshop were assayed at 9:30 am, 12:30 pm, and 3:30 pm for 2 min on 2 days. The air velocity measurements were log-transformed and W-test was used to check the distribution of the log-transformed measurements. We found that the air velocities of both the NZ interface and the general workshop were log-normally distributed. Ventilation calculations were discussed below. The ventilation rates of the general room and the near field volume in the Slurry-applying workstation were 629 m 3 /min and 25 m 3 /min, respectively. They were 3,501 m 3 /min and 71 m 3 /min in Cloth-lining workstation, respectively (Table 1). It was obviously that ventilation varied between workstations and workshops because of the different free surfaces.
Most of the blueprints and records of the factory were not available, so layouts of the workshops were measured by the hygienist at the factory, and the data of the ventilation open surface, pollutant source, and the distance of pollutant to operator were measured in the field. As the Large water hose workshop did not exist any longer, the ventilation open surface, pollutant source, and the distance between the pollutant source and the operators in this workshop were estimated based on descriptions provided by the workers. Information on pollution sources, and surfaces emitting volatile organic solvents, work space volumes, and natural ventilation opening areas for all workstations were also collected and provided to the experts, and then the experts were asked to provide subjective judgment for each input parameter as a probability distribution for use with the mathematical model to construct the priors [6,16]. The historical process information for experts' use in forming priors is listed in Table  2. The experts agreed that the measurements of exhaust ventilation velocity data and the pollutant production rates of the two workstations should be considered as the anchoring information for the estimation. The pollutant production rate was normally distributed and proportional to the evaporating sources, as the ingredient of the solvents were the same. The experts also agreed that the area concentration, ventilation velocity (both workroom and the NZ interface) appeared to be log-normally distributed. The GM of air velocity of exhaust ventilation points obtained in the field was identified as air velocity for the calculation of ventilation volumes of all workshops. According to the field measurements, the GM of air velocity of general exhaust ventilation and the near field-far field zone interfaces were 0.61 m/s and 0.23 m/s, separately. The 90% confidence intervals (CI 90%) of the air velocity were between 0.2-5.0 times of the GM. The general ventilation volume for the workshop equaled one-half of the product of vented surface and the GM of the air velocity at the vent location. The NZ interface surface was defined as a hemisphere, with the distance between the operators to the pollutant source as the radius. The pollutant production was proportional to the evaporating surface of pollutant source. According to the measurements of Slurry-applying and Cloth-lining workstation, the average pollutant production rate was 0.23 mg/cm 2 /min. Their CI 90% was 0.5-1.5 times of their averages. Priors of parameters provided by the experts were listed in Table 3. The Two-Zone Steady State Model was programmed with WinBugs. In the WinBugs code, the probability functions of pollutant production, air velocity and ventilation surface were coded as parameters for computation, and the historic measurements of the workstation were the observations. The model ran for 4000 iterations of input values by Monte Carlo sampling to obtain a simulated probability distribution of values for area concentrations of every workstation after burn-in of 1,000 iterations.

Quality Control
We had 20 duplicate samples in the Slurry-applying and Line 1-2 workstations, respectively, as representative workstations for low and high exposure levels. One set of the duplicates was sent to a U.S. laboratory for cross-checking. The results of a t-test on the duplicate samples showed they were statistically equivalent (data not listed). Data for the study were doubly entered into the computer system and automatically error-checked, with resolution of conflicting entries.

Statistical Analyses
The normal distribution of data was directly tested with W-test on the original data; the log-normal distribution was indirectly tested with W-test on the log-transformed data. The t-test was used to compare the means of historical measurements or Bayesian estimates with the current (collected as part of this study) worksite measurements. The p value was set at 0.05, double sided. All concentrations below the limit of detection (LOD) were replaced with LOD/[square root of 2] [17] for the computation of GM. All the statistical analyses were performed with SPSS 11.0.

Benzene Levels by Current Field Survey
Area concentrations from the current field sampling (FM) were listed in Table 4. The GMs of the seven workstations' data ranged from 0.7-89.5 mg/m 3 . Slurry-applying and Cloth-lining assembly had the highest benzene concentrations. Apart from the Steel weaving assistant workstation, the measurements of all workstations were log-normally distributed. The rate of samples below LOD was 17.4%.
During the period of the 1960s to 1984, samples were taken with a bubbler and analyzed by the digestive colorimetric method, with a LOD of 6 mg/m 3 [18]. From 1985 to 2003, samples were mainly taken by 1 min grab sampling with glass syringe and analyzed with gas chromatography, with a LOD of 0.6 mg/m 3 . Charcoal tube collection and gas chromatography analysis came into use in 2002, with a LOD 0.2 mg/m 3 [19].
We had 124 historical measurements (HM) of area concentration of benzene during the period of 1964 to 2003. The rate of samples below LOD was 13.7%, and the GM was 15.2 mg/m 3 . The historical measurements showed a big variation, e.g., 70% of the log-transformed standard deviations for measurements sampled within 17 years were greater than 3.
The GMs of the historical monitoring data from 15 workstations were 0.9-409.4 mg/m 3 (Table 4). Six out of the 15 workstation HM were higher than that of the correspondent FM, but they were within their corresponding CI 95% of FM. There was no significant difference between the GMs of HM and FM except for 1 workstation. Two GMs of the HM were outside of the CI 90% of the FM. The ratios of the GMs of HM to that of the FM varied from 0.9 to 4.6, with an average of 2.7 ±1.4.

Benzene Estimates by the Bayesian Model
Based on Bayesian Model (BM), the GMs of 15 workstations ranged 0.5-125.7 mg/m 3 (Table 4). Five GMs of the BM estimates from seven workstations were higher than that of the correspondent FM, but they were within their corresponding CI 95% of FM, and there was no significant difference. The average ratio of GMs of BM to that of FM was 1.47±0.76, which is much lower than that of the ratio of the HM to the FM. Furthermore the standard deviation of the average ratio of GMs of BM to the FM was smaller than that of HM to FM.

Discussion
The challenge facing exposure assessors is how to combine and interpret the diverse information, which may be incomplete or sometimes conflicting, so a structured synthesis of the occupational EA information is often needed. On one hand, the quality control of the historical measurements such as MAC frequently remains a problem [20][21][22]. As the majority of the historical measurements were based on MAC concept and short-term sampling, a hard effort needs to be devoted for a reasonable data interpretation. On the other hand, expert judgment base on the historical working condition is always prone to subjective opinion and difficult to validate. Consequently a reasonable approach is to use the measurements, expertise, and mathematical models together to estimate the exposure levels. Then, Bayesian statistics are recommended because of their ability to synthesize all the information and produce output as the posterior through Monte Carlo simulations [1,[6][7][8]23]. After the operation of the Bayesian methods, the estimates were closer to the field measurements than the historical data were, as the average ratio of GMs of BM to that of the FM was 1.47±0.76, while the ratio of HM to the FM was 2.7 ±1.4. Furthermore, as shown in geometric standard deviation of the estimates by Bayesian Methods was about half of that using HM. Even for the workstation that had not longer existed while investigating (e.g. the Large water hose workshop), the exposure determinants could be probed and adjusted, and the exposure levels could be "predicted", retrospectively. The closer estimates and the smaller deviations suggested that the BM would have utility in refining the data when historical MAC measurements are used for the exposure assessment.
Field investigation of the work conditions, production rates, and technology process and health protection measures used guaranteed that the field measurements were consistent with the airborne benzene MAC concentrations to a certain extend over the history, so the current field sampling in this study was considered as the "gold standard" because the data were obtained successively over a 10-day period with representative operations selected. As reported by Collins et al., a series of historical measurements of 4,213 personal benzene exposure samples from 1980-1993 were collected and the subsequent correlation analysis showed no significant different trends on the data in terms of their periods and job titles [13], implying that the historical data could be adjusted by the current field measurements and used for retrospectively predicting the exposure levels at the similar workstations and job titles. It was similar to the present study, showing only one workstation out of seven with the GM significant different with the field measurements.
There are two obvious limitations in the present study. First, the Two-Zone Steady State Model is a simple model and not able to adequately mirror the time and space-varying complexities of the actual work places. This may introduce bias. However, more complex models require more parameters, and that aspect may introduce even more limitations on available data and possibly other biases. Additional research and comparisons of alternative models would be needed to further resolve this potential dilemma. Secondly, the Bayesian estimates are subject to the influence of the historical measurements. Consequently it is necessary to carefully consider the quality of the historic data and exercise caution where the quality is uncertain. [6,8,9] The findings from the present study suggest that the Bayesian methods using the historical measurements are a useful tool for more estimating area concentrations of benzene in the workplace.