Air Pollution and Mortality: Timing is Everything

Abstract: This paper considers timing issues in health-effect exposure and response studies. Short-term studies must consider delayed and cumulative responses; prior exposures, disease latency, and cumulative impacts are required for long-term studies. Lacking individual data, long-term air quality describes locations, as do greenspaces and traffic density, rather than exposures of residents. Indoor air pollution can bias long-term exposures and effect estimates but short-term effects also respond to infiltrated outdoor air. Daily air quality fluctuations may affect the frail elderly and are necessarily included in long-term averages; any true long-term effects must be given by differences between annual and daily effects. I found such differences to be negligible after adjusting for insufficient lag effects in time-series studies and neglect of prior exposures in long-term studies. Aging of subjects under study implies cumulative exposures, but based on age-specific mortality, I found relative risks decreasing with age, precluding cumulative effects. A new type of time-series study found daily mortality of previously frail subjects to be associated with various pollutants without exposure thresholds, but the role of air pollution in the onset of frailty remains an unexplored issue. The importance of short-term fluctuations has been underestimated and putative effects of long-term exposures have been overestimated.


Introduction
Health effects are necessarily about the timing of changes: exposure to a virus or to extreme heat or cold, stopping or starting smoking, losing or gaining weight. The deadly London fog of 1952 is perhaps the best-known example of demonstrated air pollution health effects, albeit acute and transitory [1]. Many other studies of short-term effects ensued and have been used to establish regulatory limits. Other time-dependent effects include exposure duration and cumulative effects such as those from smoking, mining, truck driving, and indoor cooking with solid fuels. Such health effects are best characterized in terms of the total inhaled burden, i.e., concentration × time: pack-years, duration of employment, and time spent in adverse environments.
By contrast, the most widely used measures of long-term (>6 months) air pollution health effects have been based solely on differences among locations, as if moving from the vicinity of a toxic waste dump to a mountain top could promptly unclog arteries or shrink tumors. Part and parcel of this paradigm are the hypotheses that aging can represent the accumulation of environmental insults and that today's conditions can also represent those of past decades, neglecting disease latency and induction periods. Previous reviews have found that short- and long-term studies may be mutually supportive but have not considered the implications of their differences. Doing so poses differing exposure requirements that are the subject of this assessment.
This paper begins with a brief discussion of the development of various types of air pollution epidemiology studies and their exposure data requirements. Model structure is important to the goal of establishing valid relationships between exposures and health. Exposure issues include recognizing implicit uncertainties, especially the role of indoor air quality given that we typically spend up to 90% of our time indoors. I review time-series studies and the importance of considering lingering (lagged) effects, and long-term studies and the potential importance of historic exposures. Another important gap with regard to timing involves recognizing that cohort ageing and presumed accumulation of inhaled pollutants may be accompanied by other temporal trends, especially improvement of ambient air quality. Using age-specific cohort analyses, I show that ageing of cohort members is an important factor.
Much of the extant literature reports health effects in terms of fixed increments such as 10 μg/m³ or 10 ppb, which is inappropriate for comparing pollutants. Mean PM2.5 levels can be less than 10 μg/m³, mean PM10 levels are considerably higher, and few O3 levels are as low as 10 ppb. Here I base all effect estimates on the mean concentration of the pollutant involved, denoted "mean effect".
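The rescaling from a fixed increment to the mean concentration can be sketched as follows. This is a minimal illustration that assumes a log-linear dose-response function, which is an assumption of the sketch rather than a form stated here; the function name and the input values are hypothetical.

```python
import math

def mean_effect_rr(rr_per_increment: float, increment: float, mean_conc: float) -> float:
    """Rescale a relative risk reported per fixed increment to the mean concentration.

    Assumes a log-linear DRF, RR(c) = exp(beta * c). That functional form is an
    assumption of this sketch, not something taken from the studies reviewed.
    """
    beta = math.log(rr_per_increment) / increment  # log-risk per unit concentration
    return math.exp(beta * mean_conc)

# Hypothetical inputs: RR = 1.013 per 10 ug/m3 and a pollutant mean of 20 ug/m3.
rr_mean = mean_effect_rr(1.013, 10.0, 20.0)
print(f"RR at the mean concentration: {rr_mean:.3f}")
```

When the mean concentration equals the reporting increment the function returns the input unchanged, which is a quick sanity check on any conversion of this kind.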

1.1. Landmark Studies of Air Pollution and Mortality
Epidemiological studies of air pollution health effects began with the 1952 Great Fog in London, during which approximately 4000 people died over 4 days during intense fog and levels of black smoke and SO2 that are unimaginable today [1]. Notwithstanding these tragic consequences, this landmark study had everything an epidemiologist might need:
• Well-defined end point.
• Identified pollutants and sources.
• Exposures widely distributed across the city including indoors.
• Reliable air quality monitoring.
Fortunately for public health, this combination of conditions has not since reoccurred, although there have been several near misses.
Studies of air pollution and mortality took a sharp turn in 1970 with Lave and Seskin's cross-sectional analysis of U.S. metropolitan areas and long-term mortality [2]. That study was essentially an ecological shopping trip in search of confirming data and an appropriate specification of pollution exposures. In addition to the inability to define individual victims, several key elements were lacking, notably data on smoking habits, climate, and exposure to additional pollutants, all of which were subsequently shown to be important [3]. The appellation "ecological" refers to the assumption that the properties of a group such as a city also apply to each member and is especially important for air pollution exposure. Nevertheless, Lave and Seskin became a landmark in studies of long-term associations between ambient air quality and mortality.
Many such studies followed, most of which also suffered from these inadequacies. Another landmark was the first study of a defined cohort [4] that included data on individual smoking habits, education, and body mass index (but not income) for selected residents of six U.S. cities. Air quality monitoring had begun a decade previously in these cities in conjunction with studies of respiratory health, but it was not possible to distinguish significant differences among pollutants. The Six Cities Study was also notable for caveats that have largely since been ignored: indoor exposures may be important, long-term effects can reflect the higher exposures of previous decades, and long-term risks may also reflect short-term risks.
After Six Cities, the cohort study became the "gold standard" for studying long-term mortality effects. None of these studies had individual or indoor exposure data and hence all are "ecological", thus opening the door to other ecological variables such as climate or neighborhood socioeconomic status (SES). The Veterans Cohort Study [5] involved a unique cohort of approximately 70,000 male mild hypertensives of which 38% were African-American. Exposure data comprised the complete suite of regulated air pollutants from 1960 onward, and the study design included long-term lags. Subsequent papers considered traffic density [6], pollutant modeling [7], and separate analyses by race [8] and by age [9]. These analyses included individual data on blood pressure and ecological data on climate and census tract characteristics.
Another turning point may have been the inappropriate designation of some long-term studies of populations as "cohort" [10], in which population mortality rates were used in conjunction with census data on potential confounders. Smoking rates were inferred from lung cancer mortality rates [11] without considering latency. Because of inadequate ambient monitoring, pollutant exposures were limited to species that could either be modeled or based on atmospheric properties.
Cohort studies vary considerably in size, from ~8000 subjects for the Harvard Six Cities Study [4] to ~700,000 for the American Cancer Society Cohort [12]; population-based studies can be even larger [10]. However, it appears that the larger the study, the less we can know about each member.

Structures of Air Pollution Health Studies
There are three major types of health effect studies: toxicology, clinical, and epidemiological. Toxicology studies typically do not involve humans and require non-ambient exposures high enough to elicit rapid responses from a few subjects. For ethical reasons, clinical studies are limited to nonlethal exposures to healthy subjects. Each of these modes involves studying changes: exposure, response, and back to normal status, i.e., intervention. It follows that epidemiology should do likewise. The goal of an epidemiological study is to develop a valid dose-response function (DRF): change in health status per unit change in exposure.
An epidemiology study comprises three elements: a well-defined end point including temporal considerations, data on exposures to plausible causal agents, and consideration of potential confounders. Physiology must be considered when selecting pollutants to be studied. For example, with cardiovascular mortality, a potential agent must have a plausible pathway for entry to the cardiovascular system. This rules out large particles that tend to deposit in the upper airways and acids that are likely to be neutralized by endogenous ammonia. Ambient carbon monoxide levels tend to be well below the threshold for CO poisoning. This leaves particulates, ozone, and oxides of nitrogen as potential candidates, all of which should be considered in parallel. Median diameters, chemistry, and toxicity of particles are relevant.
The major epidemiological modes for studying air pollution and health comprise changes over time and differences among locations (cross-sectional), each of which has its own exposure issues. Temporal studies include accidents or episodes in which relatively few people endure brief but intense exposure, and lengthier periods involving "natural" variability and large populations. With accidents or volcanic eruptions, the exposure issues are determining the pollutants and concentrations that may have been involved. In any event, the health effect "yield" of such an event may persist over days or weeks; Hansell et al. [13] posit that such effects may persist for decades. In either case the response period exceeds the exposure period and historical estimates could be required.
These two situations pose different biases to a DRF. Short-term time-series studies link daily responses with prior daily exposures. A valid DRF should be based on the sum of those responses, not just on the day that happens to be statistically significant as has often been the case with the extant studies. Considering subsequent responses strengthens the DRF (additional responses from the same exposure). By contrast, long-term effects are due in part to prior exposures commensurate with the time required for chronic diseases to develop. Considering prior exposures weakens a long-term DRF because observed responses are then linked with additional exposures (same response to an additional exposure). The net result from these considerations is attenuation of differences between these two modes of DRF development. Considering multiple pollutants complicates the situation even further as discussed in more detail below.
Cohort studies are basically cross-sectional and designed to link development of differences in individual health status with conditions at enrollment. Individual characteristics such as smoking or body mass index are typically assumed to persist during follow up. Air pollution exposures are defined geographically but seldom followed in time. Lacking such dynamics implies that a change in health status associated with air quality requires a change in location (moving to a cleaner environment) without considering the time required to adapt to new circumstances. Simply put, a cross-sectional study compares conditions at various locations at a given time (including residents' exposures) but provides no clues as to how they got that way.

Multiple Pollutant Modeling
Many air pollution epidemiology studies have been designed to test the hypothesis that a designated pollutant is associated with one or more health conditions. The Veterans Cohort Study considered 36 air pollutants including PM2.5 constituents as alternative predictors of a single end point (all-cause mortality) [5][6][7][8][9]. Five pollutants (vanadium, nickel, elemental carbon, nitrate ion, and traffic density) were highly statistically significant (0.005 < p < 0.025), with mean effects from 0.04 to 0.14, but a public health official would need to know which of them might be the most important. Although all five are associated with vehicular traffic, they were only moderately correlated (R < 0.6). I used 2- and 3-pollutant regression models to address this question as shown in Figure 1, which shows reductions in mean effects estimated from joint multiple-pollutant regressions relative to the sums of single-pollutant estimates, comprising 22% with 2-pollutant models and 36% with 3-pollutant models. The mean effect of traffic density was 0.20 as a single pollutant, but dropped to 0.15 when regressed jointly with elemental carbon (EC), which added 0.094. When nickel was added in a 3-pollutant model, the traffic density effect dropped to 0.14 and the EC effect to 0.081, while including nickel added 0.02 to the combined effect. The overarching message from this example is that public health authorities should consider abating specific traffic-related pollutants in addition to vehicular traffic per se.
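The arithmetic behind the traffic density example can be laid out in a few lines. This is only a bookkeeping sketch using the mean effects quoted above; the helper names are hypothetical, and the single-pollutant effects of elemental carbon and nickel are not given here, so the overall 22% and 36% reductions are not recomputed.

```python
# Mean effects quoted in the text for the traffic density example.
single_traffic = 0.20
joint_2 = {"traffic density": 0.15, "elemental carbon": 0.094}
joint_3 = {"traffic density": 0.14, "elemental carbon": 0.081, "nickel": 0.02}

def combined_effect(model: dict) -> float:
    """Sum of the jointly estimated mean effects in one multi-pollutant model."""
    return sum(model.values())

def percent_drop(single: float, joint: float) -> float:
    """Percent reduction of a pollutant's effect when estimated jointly."""
    return 100.0 * (single - joint) / single

combined_2 = combined_effect(joint_2)
combined_3 = combined_effect(joint_3)
drop_2 = percent_drop(single_traffic, joint_2["traffic density"])
drop_3 = percent_drop(single_traffic, joint_3["traffic density"])

print(f"2-pollutant combined effect: {combined_2:.3f}, traffic effect down {drop_2:.0f}%")
print(f"3-pollutant combined effect: {combined_3:.3f}, traffic effect down {drop_3:.0f}%")
```

The combined effects of the 2- and 3-pollutant models are nearly equal, which is consistent with the pollutants sharing a common traffic-related source.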

Dose-Response Functions
The ultimate goal of an air pollution epidemiology study is to develop a dose-response function (DRF) from which a change in health status may be deduced from a change in inhaled pollutants. There are major problems with this application of an ancient concept ("The dose makes the poison", as attributed to Paracelsus in the 16th century): ambient air quality does not constitute a "dose" to the lung, nor does "personal" exposure, largely because of contributions from indoor air quality.
Initial concerns about DRFs focused on determining thresholds of no effect to provide guidance for establishing the National Ambient Air Quality Standards (NAAQS) required under the U.S. Clean Air Act of 1970. This objective requires a concave upward DRF, sometimes likened to a "hockey stick". The Six Cities Study shows some examples; see their Figure 3 from which thresholds can be estimated as shown in Table 1. Thresholds are defined here as the concentration at which the relative risk (RR) is unity. Reasonable estimates are seen for the particulate species and NO2 but not SO2. The authors chose to focus on the PM2.5 slope but not on the apparent threshold. The result for SO2 indicates no safe level, but this is not supported by clinical or toxicology findings. Pope et al. also found SO2 to be a highly significant predictor of mortality in the American Cancer Society cohort [12] but chose not to emphasize that finding, although Krewski et al. [14] suggested its importance, even in conjunction with PM2.5.
Table 1. Estimated all-cause mortality thresholds from the Six Cities Study [4] by pollutant.
The problem of measurement error is key here, in terms of both random distributions and bias. As is well known [15], imprecise measurements can bias a DRF towards the null and obscure a bona fide threshold. This may have been the case with SO2 which tends to have localized spatial distributions. By contrast, PM2.5 is more uniform spatially but has important contributions from indoor sources that have not been considered in the extant air pollution epidemiology.
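A small simulation may help show how random exposure error can both flatten a DRF and obscure a threshold defined as the concentration at which RR is unity. This is a sketch under stated assumptions: the data are synthetic, the true DRF is taken to be linear with a threshold at 12 units, the error is classical (added independently to the true concentration), and all names are hypothetical.

```python
import random
import statistics

def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def threshold_at_unity(intercept, slope):
    """Concentration at which the fitted relative risk equals 1."""
    return (1.0 - intercept) / slope

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Synthetic "true" concentrations and a noiseless linear DRF with RR = 1 at 12 units.
true_conc = [rng.uniform(10.0, 30.0) for _ in range(2000)]
rr = [1.0 + 0.01 * (c - 12.0) for c in true_conc]

# Imprecisely measured concentrations: classical error with a standard deviation of 6 units.
measured_conc = [c + rng.gauss(0.0, 6.0) for c in true_conc]

a_true, b_true = fit_line(true_conc, rr)
a_meas, b_meas = fit_line(measured_conc, rr)
threshold_true = threshold_at_unity(a_true, b_true)
threshold_meas = threshold_at_unity(a_meas, b_meas)

print(f"slope with true exposure:     {b_true:.4f}, threshold {threshold_true:.1f}")
print(f"slope with measured exposure: {b_meas:.4f}, threshold {threshold_meas:.1f}")
```

In this setup the fitted slope shrinks toward zero and the apparent threshold moves toward lower concentrations, so an imprecisely measured pollutant can appear to have no safe level even when a threshold exists.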
There are questions about DRFs for PM2.5 that do not consider particle composition. PM2.5 comprises particles of varying toxicity, from basic ammonium salts to toxic metals to various carbonaceous compounds. Their relative concentrations vary by location, so the shape of a DRF may depend on their relative proportions; this may account for supralinear effects at low concentrations [16]. Perhaps the most problematic example of this issue is the DRF used for the Global Burden of Disease (GBD) Project that concatenated ambient air pollutants (AAP) with second-hand smoke (SHS) and household air pollution (HAP) from burning solid fuels [17]. This DRF combines PM2.5 levels from 5 to 500 μg/m³ with RRs up to 2.6, which is the level for active smoking for which the estimated PM2.5 intake is approximately 10,000 μg/m³. This DRF could only accommodate such a large range by assuming a logarithmic function of PM2.5 that posits lower relative toxicity at increased concentrations. This proposition is counterintuitive; the chemical compositions of AAP, SHS, and HAP are clearly disparate with AAP being the least toxic and with perhaps the least well-defined exposures. Even the DRF for active smoking is linear, with similar RR levels. I conclude that the GBD DRF is inappropriate for assessing health effects.
In summary, explicit DRFs for ambient air pollutants are poorly defined and likely to remain so.

Exposure Issues
Air pollution epidemiology studies have been based on grouped air quality data obtained at fixed locations rather than on individual exposures. In the United States, ambient air quality monitoring is the purview of regulatory rather than scientific agencies. Many air pollutants are strongly correlated but not all are routinely monitored. The agency decides which pollutants are important, for example based on toxicology, develops measuring techniques, and deploys ambient monitoring networks. Health effect studies are commissioned to use these ambient data, which may or may not support the agency's initial decision; this in effect precludes testing other hypotheses because the required ambient data are absent, and leads to circularity. The lack of monitoring data could imply lack of regulatory concern but not the absence of health effects. I know of no instance where the failure of epidemiology to find significant health effects associated with a regulated air pollutant led to discontinuation of its ambient monitoring.

Accuracy and Precision of Measurements.
The U.S. Environmental Protection Agency (EPA) specifies "reference" methods that define each pollutant. Thus, a pollutant concentration is defined by the output of the reference measurement method for which CO, SO2, or NO2 pose no difficulties. These pollutants tend to be localized and readings from a fixed monitoring station are unlikely to represent personal exposures, especially with respect to indoor air quality (see below). Particulate measurements are more difficult.
Beginning with "dustfall" measurements in the early 20th century, particulate matter has posed its own measurement problems that are now focused on particle size and composition. In keeping with concerns about soot and products of incomplete combustion of dirty fuels ca. 1980 and earlier, "smoke" concentrations were based on filter color, converted to mass units. These quantities have been referred to as "smoke", "British smoke", "black smoke", "smokeshade", or "coefficient of haze (COH)" and have produced useful epidemiology. In the 1950s the United States deployed a network of gravitationally based filter samplers, the National Air Sampling Network (NASN), which operated from 1953 to 1957 [18]. Those samplers collected particles < ~75 μm that were analyzed for elemental composition and radioactivity. In part because of this precedent, U.S. particulate measurements for regulatory purposes then became based on the total mass collected on filters of various types. Accuracy refers to bias; precision refers to random error. Exposure errors may be classified as follows [19].
• Instrument errors include bias from instrument design and variability from external influences that may include the height of the sampler above the locations of personal exposure such as rooftops.
• Spatial bias may result from the distributions of emission sources surrounding the monitoring station. New York City deployed a dense network in the 1970s that reported the following ranges of variability within approximately 800 km²: SO2, 15-49 ppb; TSP, 61-106 μg/m³; SO4²⁻, 13-20 μg/m³ [20]. These ranges were well within the variability of annual averages among cities at that time, such that a single reading should not be used in a long-term study. However, it would not be unreasonable to assume that they would all vary similarly in response to weather fluctuations.
• Temporal variability includes averaging time, diurnal and seasonal variability, and historical trends. Concentrations of all air pollutants are likely to fluctuate in concert from local weather fluctuations.
• Indoor air quality factors are perhaps the most important, as discussed below.
• Physical and temporal modeling may have the advantage of precision and widespread coverage at the expense of unknown biases.
• Surrogate measures include fractions of urban green spaces [21] and traffic density [6]. These measures describe places per se rather than the exposures of inhabitants, which is also true for outdoor ambient air quality data from fixed monitors.
Remote atmospheric sampling has been used for PM2.5 and O3 and entails its own set of problems including atmospheric variability and instrument design. Jerrett et al. [22] compared effects on circulatory disease mortality in approximately 700,000 participants in the American Cancer Society cohort study, as estimated from 10 different methods of estimating ambient PM2.5 concentrations. Mortality rates were based on the period 1982-2004 and the remote sampling data were extrapolated back in time; historical data were not mentioned. Correlations among the 10 estimates were as low as 0.54. After adjustment for ecologic confounders the largest risk estimate (1.12 [1.09-1.15]) was based on interpolated data from 1318 ground-based monitors and the lowest (1.02 [1.0-1.04]) from remote sampling without ground information. The range among methods was much larger than the confidence limits for any one of them, leading to the conclusion that accuracy may be more important than precision. In multiple-pollutant models, the pollutant with the least measurement error may appear to be the most important, regardless of inherent toxicity.

Indoor Air Quality.
Adults typically spend ~85% of their time indoors, where air quality can differ substantially from the measured or estimated outdoor (i.e., ambient) values used in health studies. There are both practical (too many subjects to monitor over too long a time) and administrative (not covered by the Clean Air Act) aspects involved. Some authors have dismissed the issue on grounds of indoor-outdoor correlations, but this does not account for differences in pollutants or house-to-house variations within an area represented by outdoor monitoring or estimation. How then can the reported significant relationships between outdoor air quality and long-term health be rationalized?
Approximately 50% of outdoor air pollution infiltrates indoors [4,23] depending on building air exchange rates and use of air conditioning. In previous decades, outdoor air was substantially more polluted than indoors so that infiltrated outdoor air would have dominated health effects analyses. Currently, indoor PM2.5 can exceed outdoor, especially where smoking is involved, and other air pollutants may also be involved. Figure 2 shows the variability in long-term indoor PM2.5 levels for a given outdoor level in a particular city (each data point). It confirms the estimate of 50% infiltration and the fact that outdoor air quality does not represent individual exposures. In Avery et al. [24] and Jenkins et al. [25], indoor and thus personal PM2.5 ranged up to twice the outdoor level that would have been used in epidemiology, but randomly among cities and never less than the infiltrated outdoor air. Thus, in the long term there was no correlation between personal and outdoor exposures across the cities used in health effects analysis. By contrast, infiltrated outdoor air carries the same (outdoor) frequency distributions, such that time-series analysis in a given city would be less affected. The major effect on time-series analysis would be to approximately double the risk coefficient since any observed health effect would have been associated with only approximately half of the air pollution.
Figure 2. Indoor vs. outdoor PM2.5 with allowance for outdoor air infiltration as measured in selected U.S. cities [23].
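The doubling of the time-series risk coefficient can be illustrated with a short calculation. This is a sketch, not a published adjustment: the function names and the outdoor coefficient are hypothetical, and the time-weighted variant (which uses the ~85% indoor time cited earlier) is an added assumption that the argument above does not make.

```python
def ambient_share_of_exposure(infiltration: float, time_indoors: float) -> float:
    """Time-weighted fraction of the outdoor concentration that a person experiences."""
    return time_indoors * infiltration + (1.0 - time_indoors)

def exposure_adjusted_coefficient(beta_outdoor: float, share: float) -> float:
    """Risk coefficient per unit of pollution actually experienced."""
    return beta_outdoor / share

beta_outdoor = 0.01  # hypothetical coefficient per unit of monitored outdoor concentration

# All time indoors with 50% infiltration: only half the outdoor level is experienced.
fully_indoors = exposure_adjusted_coefficient(
    beta_outdoor, ambient_share_of_exposure(0.5, 1.0))

# 85% of time indoors, remainder outdoors at the full outdoor level (added assumption).
mostly_indoors = exposure_adjusted_coefficient(
    beta_outdoor, ambient_share_of_exposure(0.5, 0.85))

print(f"coefficient, all time indoors: {fully_indoors:.4f}")
print(f"coefficient, 85% time indoors: {mostly_indoors:.4f}")
```

With all time spent indoors the coefficient doubles exactly; allowing for some time outdoors gives a somewhat smaller factor, so "approximately double" is the upper end of this simple adjustment.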
It thus follows that long-term outdoor ambient air quality is a characteristic of a city and its location, as are population density, climate, elevation above sea level, and factors that control indoor air quality. Personal exposures are randomly distributed among study subjects and cannot be assigned to individual affected residents. The correct interpretation of a long-term study is thus that any observed health effect is a property of the city per se and thus may be associated with any of those properties that should be included in statistical analyses. All long-term studies, cohort or population, must thus be classified as ecological.

Exposure Issues in Time-Series Modeling
Short-term (acute) effects of air pollution are modeled by comparing perturbations in health outcomes with previous environmental excursions. Weather effects, weekday vs. weekend differences, and influenza epidemics are potential confounders. Various econometric methodologies have been used in such analyses. These effects are typically small (<10%) so that multi-year periods of study may be required to achieve statistical significance, which is the main objective of many time-series studies. The reality of such effects was indisputably shown by the 1952 London fog. Absent industrial explosions, short-term air pollution exposures are primarily driven by weather fluctuations and are thus relatively uniform throughout a metropolitan area. This also applies indoors with respect to infiltrated outdoor air and was demonstrated dramatically in 1952 London when cinema projections were unable to penetrate the fog and reach the screen.
The primary requirement for a time-series study is that of daily air quality data, which has been problematic for particulates in the U.S. due to typically monitoring only every 3rd or 6th day. Many monitoring sites now report daily data for PM2.5 mass but not for PM2.5 constituents. Ozone data are reported for different time intervals (1 hr, 8 hr, annual) that have different spatial relationships. Short-term concentrations tend to peak in cities near precursor emission sites and long-term values tend to peak downwind, often in rural areas. My personal choice for ozone is peak short-term levels.
Selection of an appropriate number of lag days between exposure and response is an important consideration in time-series analysis that depends on its objective. Early on, researchers were mainly concerned with establishing statistical significance, i.e., the existence of the effect, notwithstanding the strong evidence from London in 1952. This was done by postulating a range of plausible lag days and focusing on the one having the highest statistical significance (i.e., t-value). However, with respect to assessing health effects, precedence should be given to the magnitude of the effect given by the sum of responses out to, say, a week after exposure. Few of the extant time-series analyses have done so.
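The difference between reporting the most significant lag and reporting the summed response can be sketched as follows. The lag coefficients and standard errors are hypothetical and are assumed to come from a single jointly estimated distributed-lag model, so that their sum is meaningful; the log-linear form is likewise an assumption of the sketch.

```python
import math

# Hypothetical log-risk coefficients per unit concentration for lags 0 to 6 days,
# assumed to be estimated jointly in one distributed-lag model.
lag_betas = [0.0004, 0.0009, 0.0007, 0.0005, 0.0003, 0.0002, 0.0001]
lag_std_errors = [0.0004, 0.0003, 0.0004, 0.0004, 0.0004, 0.0004, 0.0004]
mean_conc = 20.0  # hypothetical mean concentration

def rr_at(beta: float, conc: float) -> float:
    """Relative risk at a given concentration under a log-linear model."""
    return math.exp(beta * conc)

# Approach 1: pick the single lag with the largest t-value.
t_values = [b / s for b, s in zip(lag_betas, lag_std_errors)]
best_lag = max(range(len(t_values)), key=t_values.__getitem__)
rr_best_lag = rr_at(lag_betas[best_lag], mean_conc)

# Approach 2: sum the responses over the whole week after exposure.
rr_cumulative = rr_at(sum(lag_betas), mean_conc)

print(f"most significant lag: day {best_lag}, RR = {rr_best_lag:.3f}")
print(f"cumulative over 7 days: RR = {rr_cumulative:.3f}")
```

In this illustration the single-lag estimate captures less than a third of the cumulative excess risk, which is the kind of shortfall the lag adjustments in this section are meant to address.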
Elderly subjects have been shown to be more likely to respond to acute pollution excursions. However, within this group, the most compromised among them are the most likely to succumb. Murray and Nelson [26] developed a time-series model that estimates the size of a frail population at risk and its sensitivity to daily perturbations in temperature and ambient air quality. Thus, daily mortality responds to population characteristics (elderly frailty) and the environment (temperature and air quality) which vary on a daily basis. Frail subpopulations are depleted by death and augmented by the newly frail but environmental effects may lag by a few days or more. This model has been evaluated for Philadelphia, Atlanta, and Chicago with various pollutants.
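The depletion and replenishment of a frail subpopulation can be illustrated with a deterministic toy model. This is not the estimation method of Murray and Nelson [26]; it is a minimal sketch with hypothetical parameter values, in which a baseline daily hazard of 0.1 corresponds to a life expectancy in the frail state of roughly 10 days.

```python
def simulate_frail_pool(pollution, entries_per_day, base_hazard, pollution_effect):
    """Return daily deaths from a frail pool that is depleted by death and refilled daily.

    pollution: sequence of daily pollution excursions (0 means a baseline day).
    The daily hazard is base_hazard * (1 + pollution_effect * excursion), capped at 1.
    """
    pool = entries_per_day / base_hazard  # steady-state pool size on baseline days
    deaths = []
    for level in pollution:
        hazard = min(1.0, base_hazard * (1.0 + pollution_effect * level))
        died = hazard * pool
        deaths.append(died)
        pool = pool - died + entries_per_day  # deplete by deaths, add the newly frail
    return deaths

# Five baseline days, one pollution excursion, then five more baseline days.
pollution = [0.0] * 5 + [1.0] + [0.0] * 5
deaths = simulate_frail_pool(pollution, entries_per_day=10.0,
                             base_hazard=0.1, pollution_effect=0.5)

for day, d in enumerate(deaths):
    print(f"day {day:2d}: {d:.2f} deaths")
```

Deaths rise on the day of the excursion and then fall slightly below baseline while the pool refills, which shows why the size of the frail pool and the daily environment have to be modeled together.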
A more complicated time-series model includes separate environmental effects on the frailty of new entries [27], realizing that different pollutants might be involved between enhancing and responding to frailty. Frailty has been clinically defined as a state of increased vulnerability resulting from loss of reserve in physiological systems.
The Chicago analysis considered PM10, O3, NO2, SO2, and CO from 1987 to 2000, with moving averages up to 15 days [28]. The frail subpopulation comprised approximately 0.3% of those aged 65 and over and had a life expectancy of approximately 10 days that was reduced to ~2 days by air pollution. Lag periods of various lengths were evaluated for temperature and each pollutant, resulting in an empirical model that explained 80% of the variance. Based on mean concentrations, the sum of relative risks associated with PM10, SO2, CO, O3 and NO2 in a multipollutant model was 1.024 (0.992-1.056) after accounting for lag effects.
Di et al. [29] studied daily mortality in the Medicare cohort from 2000 to 2012 with estimated PM2.5 and O3 on a 1 km grid using air quality modeling; they reported combined RRs of 1.015 (1.014-1.018) based on the same and previous days and increments of 10 units. Accounting for longer lags and converting to mean concentrations would increase this estimate to 1.06 (1.04-1.07).
Atkinson et al. [30] reviewed PM2.5 time-series analyses and presented results for 22 all-cause mortality estimates in terms of percent excess mortality per 10 μg/m³ increment. Based on a mean value of 20 μg/m³, the mean of 14 single-day estimates was RR = 1.026 (1.01-1.04). Their data on multiday lags yielded a lag factor of 1.82×ln(days), which corresponds to RR = 1.06 (1.01-1.10) for a 30 d lag period.
Zanobetti et al. [31] studied PM10 lag effects in European cities and reported that increasing the lag period to 40 days increased the risk estimate by a factor of 2.3. Their 40 d risk estimate was 1.048 (1.03-1.06), based on a mean value of 30 μg/m³.
Lipfert and Wyzga assembled a dataset of relationships between cardiovascular mortality and various PM metrics in the Philadelphia metropolitan area from 1992 to 1995, for the same and previous days [32]. The metrics ranged from coefficient of haze, constituents of PM2.5, to TSP, with a median size range of 1 to 50 μm and mean concentrations from 0.3 to 100 μg/m³. Among the 45 estimates there was a non-significant positive relationship with particle size (R = 0.20), with an overall mean effect of 1.058 (1.05-1.07) after adjusting for lag effects. Many air pollution studies compare associations among cities for 1 or 2 pollutants, mainly PM2.5; we examined associations with a wide range of pollutants for a single metropolitan area. We also ran a suite of 2-pollutant models including peak values of O3, CO, SO2, and NO2, for which the combined (significant) mean risks ranged from 1.03 to 1.045. Table 2 summarizes findings for lagged daily risks from various studies based on estimated mean concentrations.

Exposure Issues with Long-Term Studies
"Long-term" is not well-defined in the air pollution epidemiology literature; in finance, it can mean holding an asset for 7-10 years, but medically it is a condition lasting longer than 6 months [33]. More generally, it can also mean "lasting" or "enduring" and another definition refers to the time "after the beginning of something". Operationally, long term refers to exposures measured as annual rather than as daily averages.
In the medical literature, "acute" refers to severe and sudden in onset. "Chronic" refers to a condition that develops slowly and lasts more than six months; presence of a chronic condition can enable acute responses. These definitions do not comport well with typical interpretations of cohort studies that declare causality without considering the processing time leading to the development of a chronic condition. However, for exposure to air pollution to be deemed "causal", it must have participated in disease development processes. This requirement has seldom been recognized in the extant cohort studies and requires exposure data from previous decades, often historical. For example, exposure to (active) tobacco smoke is measured in "pack-years" and effects of occupational exposures are based on the duration of employment. As an example for ambient air pollution, consider a 5% annual rate of PM2.5 abatement; after a latency period of 10 y, the appropriate exposure would have been approximately 60% higher than at the time of death and the risk coefficient per unit exposure correspondingly lower (by roughly 40%). These considerations of bias far exceed the random errors captured by statistical confidence limits.
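The abatement example can be checked with a few lines of arithmetic. This is a sketch under two alternative compounding conventions, both assumptions since the convention is not stated; it also assumes the observed response is unchanged, so the coefficient per unit exposure scales by the reciprocal of the exposure ratio. The names are hypothetical.

```python
annual_abatement = 0.05  # 5% per year
latency_years = 10

# Convention A: exposure grows by 5% for each year one looks back in time.
ratio_growth = (1.0 + annual_abatement) ** latency_years

# Convention B: exposure declined by 5% per year up to the time of death.
ratio_decay = (1.0 - annual_abatement) ** -latency_years

def rescaled_coefficient(beta_current: float, exposure_ratio: float) -> float:
    """Coefficient per unit exposure when the same response is attributed to the
    higher historical exposure instead of the exposure at the time of death."""
    return beta_current / exposure_ratio

for label, ratio in (("A", ratio_growth), ("B", ratio_decay)):
    factor = rescaled_coefficient(1.0, ratio)
    print(f"convention {label}: past/current exposure = {ratio:.2f}, "
          f"coefficient multiplied by {factor:.2f}")
```

Both conventions give a historical exposure roughly 60 to 70% above the level at the time of death, so the choice of convention matters much less than whether prior exposure is considered at all.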
The Veterans Cohort Study [5] considered ranges of exposure and all-cause mortality subperiods for several pollutants, for which results are displayed in matrix format in Table 3. Here the rows compare subsequent mortality risks associated with previous exposures. Columns represent mortality during a given period associated with a given previous exposure. The right-most column represents mortality for the entire follow-up period. Matrix diagonals represent coincident exposure and mortality that could also represent acute events. Over time, air quality improved and the cohort aged. TSP and PM10 show constant risks by mortality period, increasing risks by exposure period notwithstanding improved air quality, and increasing risks along the diagonal, all of which seem counterintuitive. NO2 had constant mean values and shows essentially constant risks throughout the table. Peak ozone shows increasing risks with more recent exposures and 1997 traffic density poses the same risk for all mortality periods.
Table 3. Relative risks by mortality and exposure period for counties with ambient air quality monitors.
PM2.5 risks in this dataset are sensitive to race [8] and thus were not included here. The common characteristic in Table 3 is that of uniformity in the estimated risks among pollutants and over both exposure and mortality timing; the overall mean risk was 1.047 [1.028-1.067]. This leads to the conclusion that these risks are associated with the veterans' places of residence per se, which we assume to have remained constant during follow-up, rather than the corresponding air quality, which improved. This is precisely the case with traffic density, which was the most important mortality predictor throughout the study [6]. Note that traffic density describes a location; its effects on personal exposures can only be inferred.

Comparisons between Short-and Long-Term Effects
Regression coefficients in excess of those derived from daily studies have been assumed to reflect "long term" effects. The Six Cities Study provides an opportunity to directly compare long- and short-term air pollution effects. This cohort study provided DRF information from 8096 cohort members. Air pollution trends were provided and it appears that average previous concentrations may have been approximately 50% higher than coincident, thus reducing the relative mortality risk per unit exposure to 1.16 (1.05-1.29). A time-series analysis was also done for each of these six locations, based on their metropolitan areas [34]. However, daily PM2.5 data were not available and lag effects were not considered. Combining these six estimates, weighting by the numbers of deaths, and doubling the risk to account for cumulative lag effects yields an overall acute relative risk of 1.056 (1.041-1.072). The long- and short-term confidence limits overlap, such that there is no significant difference between long- and short-term risks. The short-term estimate is more precise because of the much larger numbers of subjects in the metropolitan areas. The mean short-term RR in Table 2 is 1.057 (1.033-1.085); the mean effect of PM2.5 for the American Cancer Society cohort is 1.08 (1.03-1.15) [22], although some European reviews reported slightly higher values [35,36].
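The adjustment for prior exposure described above can be reproduced approximately with a short calculation. This is a sketch under stated assumptions rather than the author's procedure: it takes the commonly cited Six Cities result of 1.26 (1.08-1.47) for the most versus least polluted city as the starting point (that figure is not given in this section), assumes a log-linear dose-response function, and assumes the 50% exposure increase applies uniformly across cities. The function name is hypothetical.

```python
import math


def rescale_relative_risk(rr, lower, upper, exposure_ratio):
    """Rescale a relative risk and its confidence limits when the exposure
    contrast behind them is multiplied by `exposure_ratio`.
    Works on the log scale, i.e. assumes ln(RR) is linear in exposure."""
    return tuple(math.exp(math.log(v) / exposure_ratio) for v in (rr, lower, upper))


# Assumed input: Six Cities estimate based on coincident exposures.
six_cities_reported = (1.26, 1.08, 1.47)

# Prior concentrations taken as 50% higher than coincident ones.
adjusted_rr, adjusted_lower, adjusted_upper = rescale_relative_risk(
    *six_cities_reported, exposure_ratio=1.5
)
print(f"{adjusted_rr:.3f} ({adjusted_lower:.3f}-{adjusted_upper:.3f})")
```

Under these assumptions the rescaled estimate lands close to the 1.16 (1.05-1.29) quoted in the text; small differences can arise from rounding or from a slightly different exposure ratio.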
Dockery et al. [4] may have been the only cohort study to consider these issues directly. Here are some relevant excerpts (p. 1758). "Because the daily time-series studies evaluated only the effect of short-term changes in pollution levels, whereas our study evaluated associations with long-term exposure (including recurring episodes of relatively high pollution) quantitative comparisons with these investigations are difficult to make." "The pollution concentrations used in our analysis represent only exposures monitored during the study period. Increased mortality, however, may reflect the cumulative burden over a lifetime of exposure." In discussing this project, Greven et al. [37] pointed out that the Six City risk coefficients rely on "both differences in exposure between cities and within a city over time".
A more comprehensive comparison is afforded between the American Cancer Society cohort of approximately 400,000 participants [33] and a meta-analysis of 22 time-series studies [30]. Based on a PM2.5 increment of 10 μg/m³, the long-term risk estimate is 1.04 (1.01-1.08), and the mean short-term estimate is 1.02 (1.007-1.033) accumulated over a 9-d lag period. Again, the confidence intervals overlap and there is no significant difference between long- and short-term risks in this much larger sample.
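Overlap of confidence intervals is a conservative check; a more direct one is a z-test on the difference of the log relative risks. The sketch below applies it to the two estimates quoted above. It assumes that the intervals are 95% limits, symmetric on the log scale, and that the two estimates are statistically independent; none of these is stated in the source, and the function names are hypothetical.

```python
import math


def log_rr_and_se(rr, lower, upper, z_crit=1.96):
    """Recover ln(RR) and its standard error from a relative risk and
    its confidence limits (assumed 95% and symmetric on the log scale)."""
    se = (math.log(upper) - math.log(lower)) / (2.0 * z_crit)
    return math.log(rr), se


def difference_z(estimate_a, estimate_b):
    """z statistic for the difference between two independent
    log relative risks, each given as (rr, lower, upper)."""
    log_a, se_a = log_rr_and_se(*estimate_a)
    log_b, se_b = log_rr_and_se(*estimate_b)
    return (log_a - log_b) / math.sqrt(se_a**2 + se_b**2)


long_term = (1.04, 1.01, 1.08)      # cohort estimate per 10 ug/m3
short_term = (1.02, 1.007, 1.033)   # time-series estimate, 9-d cumulative lag

z = difference_z(long_term, short_term)
print(f"z = {z:.2f}; significant at 5%: {abs(z) > 1.96}")
```

For these inputs the statistic is well below 1.96, consistent with the statement that the long- and short-term estimates do not differ significantly.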
Atkinson and colleagues reviewed both short- [30] and long-term [38] mortality associations and found all-cause mean relative risks of 1.029 (1.011-1.040) and 1.08 (1.04-1.12), respectively. In perhaps the earliest such comparison, Lave and Seskin [2] compared time-series mortality estimates in Chicago, summed over 5 lag days, with national cross-sectional estimates and found reasonable agreement. They concluded that their temporal analyses supported their national spatial analyses. Lipfert and Wyzga [15] compared mean effects of TSP, PM10, PM2.5, SO4²⁻, NO2, and SO2 in time-series, long-term population, and cohort studies and found overall mean risks across the pollutants of 1.041 (1.023-1.059), 1.025 (1.012-1.037), and 1.089 (1.019-1.16), respectively. Since these confidence intervals overlap, we found no basis to prefer one type of study.
We also assembled a dataset of relationships between cardiovascular mortality and various PM metrics in the Philadelphia metropolitan area from 1992 to 1995, for the same and previous days [32]. The metrics ranged from coefficient of haze and constituents of PM2.5 to TSP, with a median size range of 1 to 50 μm and mean concentrations from 0.3 to 100 μg/m³. Across the 45 estimates there was a non-significant positive relationship with particle size (R = 0.20) and an overall mean effect of 1.058 (1.05-1.07) after adjusting for lag effects.
In summary, I find substantial overlap between short-term and long-term risk estimates, thus questioning the existence of bona fide long-term effects that might trigger new cases of chronic disease.

Aging and Progression of Chronic Diseases
Aging and timing are concordant in epidemiology. Subject age and community circumstances may change as a study progresses. In particular, because ambient air quality is likely to have improved and long-term effects respond to cumulative exposures, these two trends may conflict. The literature provides no estimates of air-pollution mortality risks for specific age groups over time that are needed to separate aging effects from cumulative exposures.

Cohort Subsets
Most of the long-term mortality studies have interpreted positive responses as indicating that presence of outdoor ambient pollution caused new cases of chronic disease. I showed that this was not the case for lung cancer [39], because of neglecting timing effects and underestimating smoking effects. There have been few studies of long-term disease progression or morbidity [40]. In detailed re-analysis of the Six Cities and American Cancer Society cohorts, Krewski et al. [14] reported no differences in long-term mortality estimates according to prior disease, lung function, or smoking status, suggesting that these deaths were not sequelae of prior long-term exposures. All of these comparisons indicate that longevity differences associated with air pollution among cities are more likely due to the time-dependent demise of frail individuals than development of chronic disease in the previously healthy.
In a little-known paper, Jerrett et al. [22] further extended this analysis to 51 cities from the American Cancer Society cohort for various periods of exposure from 1980 for SO4²⁻ and the period 1982-2000 for PM2.5, with regard to cardiopulmonary mortality from 1982 to 2000, in five subperiods. They included age and education levels as potential modifying factors and concluded by noting "the uncertainty in ascribing health benefits to air quality improvements". (Note that PM2.5 levels were nearly halved during the study period.) Altogether, Jerrett et al. presented 83 risk estimates in terms of the period of exposure, period of mortality, age group, and educational attainment. I concatenated these estimates and used multiple linear regression to estimate the importance of each factor to ln(relative risk). This produced the following empirical relationship:
$$\ln(\mathrm{RR}) = -3.67 + 0.002\,(0.005)\,\text{exposure year} + 6\times10^{-5}\,(0.0037)\,\text{mortality year} + 0.0015\,(0.0012)\,\text{age} - 0.15\,(0.006)\,\text{education level} - 0.023\,(0.03)\,\text{SO}_4\ \text{indicator} \tag{1}$$
The SO4 indicator term measures the difference between effect estimates based on SO4²⁻ and those based on PM2.5.
The conclusion is that only education matters; risks were reduced by approximately 25% for those with a high school education. Notwithstanding cohort aging, mortality year had no effect and risk appeared to increase with the passage of time despite improving air quality. Note that education level at entry is time invariant during a study, precluding time-dependent relationships.
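The statement that only education matters can be checked directly from Equation (1) by dividing each coefficient by the value in parentheses. The sketch below assumes that those parenthetical values are standard errors and uses a two-sided 5% criterion (|z| > 1.96); both are assumptions, as the source does not say so explicitly. The variable names are hypothetical.

```python
import math

# Coefficient and assumed standard error for each term in Equation (1).
terms = {
    "exposure year": (0.002, 0.005),
    "mortality year": (6e-5, 0.0037),
    "age": (0.0015, 0.0012),
    "education level": (-0.15, 0.006),
    "SO4 indicator": (-0.023, 0.03),
}

# Wald z statistic: coefficient divided by its standard error.
z_scores = {name: coef / se for name, (coef, se) in terms.items()}
significant = [name for name, z in z_scores.items() if abs(z) > 1.96]

for name, z in z_scores.items():
    print(f"{name:16s} z = {z:7.2f}")
print("significant terms:", significant)

# Multiplicative change in RR per unit of the education variable.
rr_per_education_unit = math.exp(terms["education level"][0])
print(f"RR factor per education unit = {rr_per_education_unit:.3f}")
```

On these assumptions education is the only term whose z statistic exceeds 1.96 in magnitude. The factor per unit of the education variable is about 0.86; how that maps onto the approximately 25% reduction quoted in the text depends on how education levels were coded, which is not given here.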

Age-Specific Mortality Analyses
Age-specific mortality risks were obtained from the Veterans Cohort Study by subdividing it into deciles by age at entry and analyzing them separately [9]. Figure 3 shows that age-specific mortality risks associated with NOx decrease significantly with age, irrespective of mortality period. This negative trend may be due in part to selective mortality within the cohort. By contrast, Figure 4 shows how climate-related age-specific mortality rates changed over time, ostensibly due to increased use of residential air conditioning. Excess risks were positive (higher in cold regions) for both young and old during the early periods but nil for all age groups by the 1990s.
With respect to population-based mortality data, Lipfert and Morris [41] examined associations between county-level air quality and demographic variables from 1960 to 1997 for five age groups (15-44, 45-64, 65-74, 75-84, and 85+) and five periods (1960-1964, 1970-1974, 1979-1981, 1989-1991, and 1995-1997) for both mortality and air quality, but not all pollutants had data for each of these periods. Since the NO2 record was complete, I used linear regression analyses to examine those relationships with age, timing, and lag between exposure and response, as a model for other pollutants. Age was the only statistically significant factor, with a slope of −0.002 (0.00027) per year, which is very similar to Figure 3. I found no evidence for latency or risk accumulating with age. However, because of the exponential increase in mortality rate with age, approximately 60% of the NO2-associated deaths occurred at age 75 or greater, whereas risks to the full cohort (dashed line) were similar to the age-specific estimates.
There are also age-specific exposure issues. Elderly subjects may be more likely to remain indoors at home. Younger subjects may be exposed to a wider range of environmental conditions including occupational exposures. However, everyone experiences the conditions associated with the place, including traffic density, climate, elevation above sea level, and greenspace.

Concluding Discussion
These findings have important implications, heretofore unrecognized. The sum total of air pollution effects in a given city includes the sum of responses to daily perturbations as estimated by time-series analysis. Those risks are proportional to ambient air quality levels and also to the underlying frailty of the population and may be used to characterize a city along with contributing factors such as smoking, education, income, etc. A long-term analysis of a group of cities evaluates the relative contributions of those factors to annual mortality rates along with the associated air quality levels. The goal of air pollution epidemiology is to estimate the number of deaths that might be avoided by improved air quality in each location, regardless of timing. A time-series analysis in each city assigns some of those deaths to short-term air quality variations, excluding deaths that occurred on days without environmental excursions, presumably from cumulative effects of previous debilitating factors. Any true long-term effects of air pollution must thus be given by the difference between cross-sectional and time-series analyses, which was nil in the examples above. Table 4 compares attributes of the two types of studies; long-term studies are found wanting for most of them. Each type of epidemiology study has its own basic limitations: long-term by the need for data on important confounders with follow-up and historical exposure data and short-term by the need for daily air quality and weather data. Long-term studies must look back for exposure data and short-term studies must look ahead for delayed responses. Long-term studies have been misinterpreted by the failure to recognize that annual effect estimates include daily effects and that any true long-term effects must be given by the difference between the two effect estimates. The examples shown above indicate that most of the putative long-term effects are actually accumulated short-term effects.
Short-term studies require daily air quality data, which in turn requires greatly expanded monitoring of particulate matter. Time-series studies of air pollution and frailty indicate the need to study the contributions of air pollution to the onset of frailty, realizing that different pollutants might be involved. The adequacy of exposure data, especially short-term data, remains an ongoing issue in large part because of the ongoing regulatory emphasis on long-term studies.