Profiling Risk Factors for Household and Community Spatiotemporal Clusters of Q Fever Notifications in Queensland between 2002 and 2017

Q fever, caused by the bacterium Coxiella burnetii, is an important zoonotic disease worldwide. Australia has one of the highest reported incidences and seroprevalence of Q fever, and communities in the state of Queensland are at highest risk of exposure. Despite Australia’s Q fever vaccination programs, the number of reported Q fever cases has remained stable for the last few years. The extent to which Q fever notifications cluster in circumscribed communities is not well understood. This study aimed to retrospectively explore and identify the spatiotemporal variation in Q fever household and community clusters in Queensland reported during 2002 to 2017, and quantify potential within-cluster drivers. We used Q fever notification data held in the Queensland Notifiable Conditions System to explore the geographical clustering patterns of Q fever incidence, and identified and estimated community Q fever spatiotemporal clusters using SaTScan (Boston, MA, USA). The association between Q fever household and community clusters, and demographic and socioeconomic characteristics was explored using the chi-squared statistical test and logistic regression analysis. From the total 2175 Q fever notifications included in our analysis, we found 356 Q fever hotspots at a mesh-block level. We identified that 8.2% of Q fever notifications belonged to a spatiotemporal cluster. Of the spatiotemporal Q fever clusters, 44 (61%) were household clusters and 20 (27.8%) were statistically significant, with an average cluster size of 3 km radius. Our multivariable model shows statistical differences between cases belonging to clusters in comparison with cases outside clusters based on the type of reported exposure. In conclusion, our results demonstrate that clusters of Q fever notifications are temporally stable and geographically circumscribed, indicating a persistent common exposure. Furthermore, abattoir exposure (a traditional occupational exposure) was rarely reported by individuals in household and community clusters.


Introduction
Q fever, caused by the bacterium Coxiella burnetii, is an important zoonotic disease worldwide. In humans, the bacterium can cause a range of disease patterns, including asymptomatic infection, mild influenza-like symptoms, through to chronic manifestations.
Approximately 10-15% of acute cases progress to a chronic fatigue-like state labelled post-Q fever fatigue syndrome [1]. In ruminants, the bacterium causes coxiellosis, which affects reproductive performance, particularly in small ruminant species, presenting as disorders such as abortion and infertility [2].
Human exposure to C. burnetii in Australia is widespread, with one study suggesting that 1 in 20 Australians have evidence of neutralising antibodies [3]. Seroprevalence in adolescents shows that Q fever is an ongoing public health issue [3]. Queensland, the north-easternmost state in Australia, is home to 19.6% of the national population but reports 43.1% of national notifications [4] and has the highest average annual Q fever notification rate at 6.3 per 100,000 population per annum [5]. These figures are likely to be moderate underestimates, due to the failure to detect asymptomatic infections. A recent study identified that 89% of blood donors who showed previous exposure to C. burnetii had never had a Q fever diagnosis [6].
The cornerstone of Australia's Q fever control includes vaccination and education programs focused on people identified as being at higher risk, such as workers in close contact with animals, particularly those working in abattoirs, farms, and veterinary clinics [7]. However, as the National Q Fever Management Program has substantially reduced the burden of Q fever in these sectors, there is increasing evidence of infection in other groups, including people residing in urban and suburban areas [8]. Moreover, given continued urbanisation in traditional farming areas, there is rising concern over the potential for airborne spread of Q fever to communities neighbouring animal industries and processing facilities [9].
In our previous work, we detailed that reported animal exposure patterns in Queensland differ markedly depending on where cases live in the state [10], prompting the need for a deeper investigation into whether cases exhibit spatiotemporal clustering and how demographic and contextual profiles of the cases vary across the state. There is a significant gap in our understanding of exposure pathways to C. burnetii within high-risk communities, and of the complexities of Q fever epidemiology, that limits the design of measures aimed at preventing C. burnetii exposure [10].
Q fever represents a diagnostic challenge, particularly in those without a history of occupational exposure, and is therefore considered an underdiagnosed disease, with the true infection rate within the community likely higher than the notification rate [5,11]. Household and community clusters, which to date have not been adequately studied in Australia, represent an opportunity to better understand the complex epidemiology of Q fever transmission locally by examining differences between and within household and community clusters. However, to determine the approach to investigate household and community clusters, it is essential to understand how often these clusters occur, their location relative to known geographical areas of Q fever notifications, and the differences in reported exposures between individuals in household and community clusters and other Q fever cases.
This study aimed to retrospectively explore the spatiotemporal clustering patterns of Q fever notifications in Queensland between 2002 and 2017, identify household and community clusters and compare epidemiological features of cases within community and household clusters to cases from those outside of clusters.

Geographical Clustering of Q Fever Incidence at the Mesh-Block Level
A total of 2175 out of 3233 records had a valid home address within Queensland borders during the period between 2002 and 2017. Of the 1058 excluded records, 78% (n = 827) had no address; these were concentrated in 2002 (n = 164) and 2003 (n = 90), and the proportion of notifications with a missing address decreased across the years. For the remaining 231 records, the address was not recognised within OpenStreetMap.
The spatial analysis of all Q fever notified cases in Queensland indicated significant clustering in that the overall Moran's I estimate was 0.033 (Z-value: 15 A total of 356 Q fever incidence hotspots (i.e., mesh blocks classified as high-high by LISA analysis) were primarily distributed in South East Queensland, close to the border with New South Wales, and 10 mesh blocks were classified as high-high for more than one year across the study period with 4 mesh blocks classified as high-high across five years (Figure 1). Our annual LISA analysis indicated that in 2003 we found a high number of mesh blocks classified as high-high (n = 18), which decreased during subsequent years, to increase again in 2013 (n = 13) and 2014 (n = 26) (Table 1).

Spatiotemporal Variation in Q Fever Household and Community Clusters
The location of household and community clusters identified by space-time scan statistics for the whole period shows clusters primarily in southeast Queensland and on the coast of Townsville, in the state's northeast. The annual number of cases per 100,000 people was 2.9. Of the total Q fever cases reported, 8.2% (n = 179) belonged to a spatiotemporal cluster. We identified 72 spatiotemporal clusters across the study period between 2002 and 2017, using spatiotemporal scan statistics. Of the 72 spatiotemporal clusters identified, 28 were community clusters and 44 were household clusters (Table S1). The average community cluster size was 3 km radius. The model revealed 20 significant clusters (Table 2), with the largest number of cases in the Townsville cluster, with eight observed cases and a radius of 9.66 km. The time frame for Townsville's cluster was from February to March 2012 and carried a relative risk (RR) of 184.39 and a log-likelihood ratio (LLR) of 33.77. In addition, the 19 remaining clusters identified have a relative risk from 868 to 499.74.

Profile of Exposures of Q Fever Cases within Household and Community Clusters
Our analysis of the exposure responses of Q fever cases between household and community clusters detected by space-time analysis is summarized in Figure 2, and represents all different types of exposure reported by 179 cluster-associated cases. A total of 50% of recorded cases answered positively to living or working within 300 m of bush, followed by exposure to paddock dust (46%), and being exposed to livestock transport, and assisting/observing animal birth (33%). On the other hand, only 3% of recorded cases reported abattoir exposure and 1% reported working in the grounds of an abattoir.

Factors Associated with the Probability of Belonging to a Household or Community Q Fever Cluster
We analysed the reported exposure profile for each cluster type (community, household, or the combination of both) and cases reported outside a cluster. Our results indicate that the reported exposure profiles of Q fever notified cases within a cluster differed significantly from those of Q fever notified cases outside clusters. Factors independently associated with belonging to a Q fever household or community cluster included having contact with an infected person (p ≤ 0.001), which was statistically significant for all groups (household clusters only, community clusters only, and the combination of both cluster types). Assisting/observing animal birth (p ≤ 0.001) was statistically significant for community and household clusters, as were laundering clothes of an animal worker (p ≤ 0.001) and living on a farm (p ≤ 0.001) (Table 3).
In the Generalised Additive Model (GAM), cases belonging to a community and household cluster were more likely to report being in contact with an infected person in the one month prior to disease notification (p ≤ 0.001). Cases belonging to a household and community cluster were also more likely to have reported assisting with or observing an animal birth (p = 0.036) than cases reported outside a cluster (Table 4).
Table 3. Differences between Q fever cases within household and community clusters, and those outside clusters, in the proportions of reported exposures 1 month prior to disease onset. All reported exposures were analysed based on yes vs. no; community and household clusters (n = 221); household clusters (n = 146); community clusters (n = 75); total reported cases included in the analysis = 2175.

Discussion
In this study, we have identified significant overall geographical clustering in Q fever notifications in Queensland for the period of 2002 to 2017, suggesting common pathways of exposure to C. burnetii in vulnerable communities. Our results found clustering for 11 out of the 16 years analysed, and non-significant clustering coincided with periods when Q fever notification incidence was relatively low. Results from a previous study indicated that during 2007, 2008, and 2009 there was a sharp decrease in the Q fever notification rate in Queensland, followed by an increase in 2010 [4], which correlates with our clustering results for the 2007-2009 period. We found the highest Moran's I value (0.02) in 2015, which corresponded with the second-largest peak of Q fever notifications in Queensland in the past 20 years [4]. Our study extends previous research in that we were able to identify Q fever incidence hotspots in communities in the southeast interior of the state as well as the northern tropical region. In previous work [4], we described higher notification rates (per 100,000 population) in the Mareeba district, located in Far North Queensland; while we did not identify statistically significant clusters in that area, we identified a significantly higher rate of Q fever notifications around the Townsville region. Moreover, our results demonstrate that clusters of Q fever notifications are temporally stable and geographically circumscribed, which may be an indicator of the existence of a persistent common exposure. Furthermore, individuals in household and community clusters do not seem to report abattoir exposure as the main exposure pathway, even though abattoir workers are a traditional occupational risk group currently targeted for Q fever vaccination.
Our spatiotemporal analyses identified a total of 72 spatiotemporal Q fever clusters in Queensland between 2002 and 2017, 20 of which were statistically significant. Our results indicate that Q fever clusters are an important component of Queensland Q fever notifications, as 8.2% of cases are generally associated with a spatiotemporal cluster. The average Q fever community spatiotemporal cluster was estimated to have a 3 km radius, which is in line with existing evidence indicating that the risk of C. burnetii infection is higher within 5 km of a contaminated source in rural areas [9]. Studies conducted with data collected during Q fever outbreaks indicate that the risk of infection is high in the direct vicinity of a source, decaying very rapidly after that. For example, in the 2005 outbreak in Germany, the risk of Q fever infection was associated with living close to a meadow where C. burnetii-infected sheep were grazing and lambing. The attack rate during this outbreak dropped from 11.8% within 50 m to 1.3% at 350-400 m [12]. Our results also demonstrated that community clusters were located across the whole state, with the majority located in southeast Queensland, western clusters across the Murweh, Blackall-Tambo, and Barcaldine regions, and a northern cluster located in the Townsville region. The Townsville cluster was the biggest cluster identified in our analysis, with a radius of 9.66 km. The large size of the Townsville cluster is consistent with evidence that C. burnetii can travel long distances, up to 18 km, on strong winds [13]. Aerosol dispersal of C. burnetii via wind has been associated with outbreaks in France, Germany, the Netherlands, and the UK [9,12-16], and in outbreak conditions it has been reported that Q fever cases can cover areas of approximately 10 km² [17].
However, other factors such as the average size of the mesh block in the Townsville area (larger than in the southeast region) could have an effect on the size of the clusters. While community clusters were located across the whole state, household clusters analysed in our study were mainly concentrated in the southeast region. Household clusters, in which members from the same house were exposed to the bacteria without necessarily having an 'at risk' occupation, were identified in our spatiotemporal analysis. For example, only 3% of people identified as part of a household cluster reported abattoir exposure. This result is supported by an increasing number of Q fever reports that are not related to direct contact with animals [18]. Our results from the household cluster profile indicate there may be a role for expanding Q fever control measures to people and communities that do not necessarily fit the current 'at risk' list of occupations.
Despite the endemicity of Q fever in Australia, epidemiological studies on Q fever are generally missing information about the infection risk profile of communities with recurrent Q fever risk that could inform the evidence base for the existence of a putative source of infection [9]. Our findings indicate that Q fever notified cases belonging to Q fever spatiotemporal clusters (community, household, or the combination of both) are associated with particular modifiable exposures, compared to Q fever notified cases outside identified clusters. This result suggests that the sociodemographic context within identified Q fever spatiotemporal clusters needs to be taken into consideration when designing health promotion and education strategies to reduce potential sources of C. burnetii exposure. Interestingly, we did not find differences in abattoir exposure between Q fever cases belonging to a cluster and those not belonging to a cluster. Exposures other than abattoir-related exposure are likely to distinguish Q fever cases in household and/or community clusters from other cases. Indeed, our results indicate that Q fever cases are more likely to belong to a household and community cluster if they assist with or observe an animal birth [19-21] or have contact with an infected person. The univariable model also shows that those cases reporting contact with clothes worn by someone who worked with animals were more likely to belong to a cluster. This type of exposure has been previously reported in a small outbreak, with three laundry workers infected with Q fever [22]. Our results suggest that laundered clothes from animal workers are a potential risk source for Q fever clusters. Similarly, we identified that notifications that reported exposure to paddock dust were more likely to belong to a community or household cluster.
This result is consistent with the importance of aerosol transmission in Q fever infections [9] due to the capacity of the bacteria to survive in the environment, with viable bacteria being recovered from soil up to 20 days after inoculation [23]. Cases reporting living or working within 300 m of bushland were also more likely to belong to a cluster. This may be an indicator that the environment is playing an important role in the maintenance of the bacteria that could drive the Q fever clusters. The exposure reported of being in contact with an infected person in the month prior to the disease onset is an expected outcome, as it aligns with the cluster definition used in this study, defined by a minimum of two cases in two months. In addition, this result demonstrates that Q fever reported cases within these clusters are familiar with Q fever, since they are likely to know someone who has had Q fever.
As with all observational studies, there are limitations in our work. First, the records of Q fever cases are not always complete. For this study, 827 cases had an incomplete address and were therefore removed from the analysis, and more than half of the cases that belonged to a household cluster had no information about their place of work. Second, the SaTScan analysis is limited by the lack of an autoregressive process to capture temporal dependencies.

Data Sources and Management
Q fever is a notifiable condition in Queensland under the Public Health Act 2005 and its subordinate legislation [24,25]. Q fever notification records from 2002 to 2017 were obtained from the Notifiable Conditions System (NoCS) managed by the Communicable Disease Branch of Queensland Health. The Notifiable Conditions System compiles data from clinical information, with follow-up from select individual public health units (PHUs) via case reporting forms. From 2012 onwards, Q fever notified cases have been contacted by staff of the associated PHU and asked to respond to additional follow-up questions using a Q fever case report form. The information included for this analysis is based on reported exposures in the month prior to illness onset (Queensland Health). Records between 1 January 2002 and 31 December 2017 with complete home addresses were included in the analysis. All cases were geocoded at the street address level using the package tmaptools [26] in R [27] and data from ©OpenStreetMap contributors; records outside Queensland borders were removed from the analysis.
We used human population counts and demographic data in Queensland at the mesh-block statistical area, obtained from the '2074.0-Census of population and housing: Australia, 2016' [28]. We used mesh-block divisions obtained from ASGS Ed 2016 digital boundaries in ESRI Shapefile format [28,29]. Isolated polygons such as islands were removed prior to the analysis.
To perform spatial analysis of Q fever incidence across Queensland, the mesh block was considered the spatial unit of analysis. Using the spatial join tool in ArcGIS Pro (version 2.7.0), we counted the number of Q fever notifications for each mesh block, and incidence was calculated by dividing the Q fever count per mesh block by the total population of the mesh block. When the population in a polygon was equal to 0 and Q fever records ≥1, we used the population from the nearest neighbour that had a recorded population.
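The incidence calculation and zero-population fallback described above can be sketched as follows. This is a minimal illustration, not the ArcGIS Pro workflow used in the study: the function name `mesh_block_incidence`, the input layout, and the use of straight-line distance between centroids to define the "nearest neighbour" are all assumptions made for the example, and the mesh-block values are invented.

```python
from math import hypot

def mesh_block_incidence(blocks):
    """Incidence per 100,000 for each mesh block.

    `blocks` maps a mesh-block id to a dict with keys
    'cases' (int), 'pop' (int) and 'xy' (centroid coordinates).
    This layout is hypothetical, chosen only for the sketch.
    """
    # Candidate donors for the fallback: mesh blocks with a recorded population.
    populated = [(b["xy"], b["pop"]) for b in blocks.values() if b["pop"] > 0]
    rates = {}
    for block_id, b in blocks.items():
        pop = b["pop"]
        if pop == 0 and b["cases"] >= 1 and populated:
            # Assumption: "nearest neighbour" means nearest centroid.
            x, y = b["xy"]
            _, pop = min(populated, key=lambda p: hypot(p[0][0] - x, p[0][1] - y))
        rates[block_id] = 100_000 * b["cases"] / pop if pop > 0 else 0.0
    return rates

# Invented example: block B has one case but no recorded population.
example = {
    "A": {"cases": 2, "pop": 400, "xy": (0.0, 0.0)},
    "B": {"cases": 1, "pop": 0, "xy": (1.0, 0.0)},
    "C": {"cases": 0, "pop": 1000, "xy": (10.0, 0.0)},
}
rates = mesh_block_incidence(example)
print(rates)
```

In this invented example, block B borrows the population of block A, its nearest populated neighbour.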
The identification of space-time community clusters was performed by aggregating the geographical location of each case to the centroid of the mesh block, and the population was also included at the mesh-block level. Data management was conducted in ArcGIS Pro and the software R [27].

Exploration of Q Fever Notification Clustering Patterns in Queensland
We used the Moran's I statistic to assess the extent of spatial clustering of annual Q fever incidence (i.e., observed cases per 100,000 population) at the mesh-block level for the period of 2002 to 2017. To explore the location of significant high-risk mesh blocks for Q fever incidence, we applied the Local Moran's statistic, which is a Local Indicator of Spatial Association (LISA), to determine the spatial locations of the Q fever clusters in Queensland during each year. Using estimates of observed vs. expected incidence from the LISA analysis, each mesh block was categorised as a hotspot (high-high), coldspot (low-low) or as an outlier (high-low and low-high) [30]. A Z-score is generated by the Local Moran's I statistic to determine the significance level of clusters. Surroundings with spatial clusters are indicated by a high positive Z-score, and the presence of spatial outliers is represented by a low negative Z-score. A pseudo p-value was calculated using 499 permutations; this value corresponds to a summary of the results from the null reference distribution that assumed notifications were randomly distributed across the study area [31]. We investigated the stability of hotspot mesh blocks across the study period by spatially overlaying the high-high mesh blocks for each year and selecting the mesh blocks that were categorised as hotspots in multiple years. Analyses were performed using GeoDa software [32].
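To make the permutation logic concrete, the sketch below computes a global Moran's I and a pseudo p-value from 499 random permutations. It is an illustration under stated assumptions rather than a reproduction of the GeoDa analysis: the function names, the binary contiguity weights, and the toy incidence values are hypothetical, and the conditional permutation scheme GeoDa applies to local statistics is not shown.

```python
import random

def morans_i(values, weights):
    """Global Moran's I; weights[i][j] is the spatial weight between units i and j."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)  # sum of all weights
    num = sum(weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n))
    den = sum(d * d for d in dev)
    return (n / s0) * (num / den)

def pseudo_p_value(values, weights, permutations=499, seed=0):
    """One-sided pseudo p-value: share of random relabellings with I >= observed."""
    rng = random.Random(seed)
    observed = morans_i(values, weights)
    shuffled = list(values)
    extreme = 0
    for _ in range(permutations):
        rng.shuffle(shuffled)  # null hypothesis: values are randomly located
        if morans_i(shuffled, weights) >= observed:
            extreme += 1
    return observed, (extreme + 1) / (permutations + 1)

# Toy example: four areas in a row, neighbours share an edge.
w = [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]]
observed, p = pseudo_p_value([1.0, 2.0, 3.0, 4.0], w)
print(observed, p)
```

With 499 permutations, the smallest attainable pseudo p-value is 1/500 = 0.002, which is why the number of permutations bounds the reported significance level.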

Identification of Q Fever Household and Community Clusters
We categorised clusters into three categorical levels: household clusters, community clusters, or a combination of both. When two or more cases were recorded from the same home address within a period of six months, we considered this as a household cluster. Household clusters were identified based on the data available on the notification report form, including records for which a georeference was not available, but a home address or a name of a property was provided.
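The household cluster rule above can be sketched in a few lines. This is an illustrative reading of the definition, not the procedure applied to the notification records: the function name, the tuple layout, the approximation of six months as 183 days, the simple address normalisation, and the choice to link consecutive cases at the same address are all assumptions, and the records are invented.

```python
from collections import defaultdict
from datetime import date

def household_clusters(notifications, window_days=183):
    """Group cases at the same address notified within `window_days` of each other.

    `notifications` is an iterable of (case_id, address, notification_date).
    Six months is approximated as 183 days (an assumption of this sketch).
    """
    by_address = defaultdict(list)
    for case_id, address, when in notifications:
        # Crude normalisation so trivially different spellings match.
        by_address[address.strip().lower()].append((when, case_id))

    clusters = []
    for address, cases in by_address.items():
        cases.sort()
        members = set()
        # Link each case to the next one in time at the same address.
        for (d1, id1), (d2, id2) in zip(cases, cases[1:]):
            if (d2 - d1).days <= window_days:
                members.update((id1, id2))
        if members:
            clusters.append((address, sorted(members)))
    return clusters

# Invented records for illustration only.
records = [
    ("c1", "12 Farm Rd", date(2012, 2, 1)),
    ("c2", "12 farm rd ", date(2012, 3, 15)),
    ("c3", "5 Bush St", date(2010, 1, 1)),
    ("c4", "5 Bush St", date(2011, 6, 1)),
]
found = household_clusters(records)
print(found)
```

Here the two cases at the first address form a household cluster, while the two cases at the second address are more than six months apart and do not.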
To identify the presence of community clusters and respective cluster sizes, we explored the spatiotemporal pattern of Q fever notification clusters by performing a spatial scan (SaTScan software, version 9.7). In this study, we defined a community cluster as two or more cases associated within a 10 km radius, as it has been previously described that infection risk is generally higher within 5 to 10 km of an infected source [9]. The time aggregation for this study was two months, based on the maximum reported incubation period of 60 days, with a median incubation period of 18 days [33]. The geographical unit of analysis was the geographical centre of the mesh block; a mesh block corresponds to the smallest geographic region in the Australian Statistical Geography Standard. Therefore, the input data for this analysis consisted of (i) a Q fever case notification file, where all Q fever notified cases during the 2002-2017 period were summarised for each mesh block per month; (ii) a population file based on the Australian census 2016 by mesh block; and (iii) a geographic file, consisting of the centroid of each mesh block in Queensland.
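As an illustration of input (i), the sketch below summarises individual notifications into counts per mesh block per month. The function name, the row layout, and the `YYYY/MM` month label are assumptions made for the example; the exact column order and date format expected by SaTScan should be taken from its user guide rather than from this sketch.

```python
from collections import Counter
from datetime import date

def monthly_case_counts(cases):
    """Summarise (mesh_block_id, notification_date) pairs into monthly counts.

    Returns sorted rows of (mesh_block_id, 'YYYY/MM', count).
    """
    counts = Counter((mb, f"{d.year}/{d.month:02d}") for mb, d in cases)
    return sorted((mb, month, n) for (mb, month), n in counts.items())

# Invented notifications for illustration only.
rows = monthly_case_counts([
    ("MB1", date(2012, 2, 3)),
    ("MB1", date(2012, 2, 20)),
    ("MB2", date(2012, 3, 1)),
])
for row in rows:
    print(row)
```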
We used a space-time scan analysis, which is defined by a cylindrical 'window', in which a circle represents the geographic base, and the time is represented by the height of the cylinder. Then, the cylinder is moved in space and time, creating overlapping cylinders, where each cylinder represents a possible cluster [34]. A retrospective space-time analysis and a discrete Poisson probability model were used to estimate relative risk. We scanned for areas with high and low rates. A likelihood ratio test was used to test the alternative hypothesis that risk is higher within the window than outside it, providing relative risk and p-values for each cluster [35]. The model was run using a standard Monte Carlo test with 9999 replications to generate a p-value. We compared the results from the space-time analysis with the household clusters identified previously, and we categorised each space-time cluster into household or community clusters. A community Q fever cluster corresponded to a space-time cluster that did not overlap with the previously identified household clusters based on home address.
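For a single candidate cylinder, the relative risk and log-likelihood ratio under a discrete Poisson model can be sketched as below, using the standard scan-statistic formulas. This is a simplified illustration, not the SaTScan implementation: the function names and input numbers are hypothetical, only high-rate windows are scored (the study also scanned for low rates), and the simulated values passed to the Monte Carlo helper stand in for the maximum log-likelihood ratios that would be obtained from each random replication.

```python
from math import log

def poisson_window_stats(c, e, total):
    """Relative risk and log-likelihood ratio for one space-time cylinder.

    c     : observed cases inside the cylinder
    e     : expected cases inside the cylinder under the null hypothesis
    total : all cases in the study region and period
    """
    outside_ratio = (total - c) / (total - e)
    rr = (c / e) / outside_ratio if outside_ratio > 0 else float("inf")
    if c <= e:
        # Not a high-rate window, so it carries no evidence of excess risk.
        return rr, 0.0
    llr = c * log(c / e)
    if total > c:
        llr += (total - c) * log((total - c) / (total - e))
    return rr, llr

def monte_carlo_p(observed_llr, simulated_max_llrs):
    """Rank of the observed statistic among replications simulated under the null."""
    rank = 1 + sum(1 for s in simulated_max_llrs if s >= observed_llr)
    return rank / (len(simulated_max_llrs) + 1)

# Hypothetical window: 8 observed cases where 0.5 were expected.
rr, llr = poisson_window_stats(8, 0.5, 2175)
print(rr, llr)
```

With 9999 replications, the smallest p-value this ranking can return is 1/10000, which is obtained when no simulated maximum reaches the observed statistic.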

Associations between Q Fever Clusters and Demographic and Socio-Economic Characteristics
For the purpose of this analysis, the household and community clusters included those identified by home address and those based on SaTScan analysis, respectively. We extracted only the Q fever notifications (n = 179) that were part of a cluster identified by spatial scan. We investigated the reported at-risk exposure within one month prior to the notification date of Q fever. We explored (i) differences between clusters, for example, whether cases from the same cluster reported the same 'at risk' exposure, and (ii) the type of exposure reported within each cluster. We used a Pearson's chi-squared test to investigate differences between the type of exposure individuals reported within household, community, or household and community clusters, and individuals who did not belong to these clusters. We excluded from the analysis patient responses that were recorded as 'unknown', or that contained missing data. We used a penalised Generalised Additive Model (GAM) to investigate whether belonging to a cluster was associated with exposure type. We excluded variables that were correlated and provided similar information, using a correlation coefficient threshold of > 0.5. For instance, we included abattoir exposure, while work inside abattoirs was excluded from the model. Similarly, for variables related to work with wool, we excluded work in a shearing shed and work in wool processing. A total of 17 variables were included in the GAM, which was fitted with a smoothing penalty using the mgcv package [36]. We performed automatic variable selection using a random effect basis with a double penalty approach to regularise coefficients toward zero. All statistical analyses were conducted in R [27].
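The yes/no exposure comparison between cluster and non-cluster cases amounts to a Pearson's chi-squared test on a 2 x 2 table, which can be sketched as follows. This is an illustration only: the function name and the counts are invented, and the statistic is computed without a continuity correction, which is an assumption because the text does not state whether one was applied.

```python
from math import erfc, sqrt

def chi_squared_2x2(a, b, c, d):
    """Pearson's chi-squared test (1 degree of freedom, no continuity correction).

    a, b : cluster cases reporting / not reporting the exposure
    c, d : non-cluster cases reporting / not reporting the exposure
    """
    n = a + b + c + d
    row1, row2, col1, col2 = a + b, c + d, a + c, b + d
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # Upper tail of the chi-squared distribution with 1 degree of freedom.
    p = erfc(sqrt(chi2 / 2))
    return chi2, p

# Invented counts for illustration only.
chi2, p = chi_squared_2x2(20, 10, 10, 20)
print(chi2, p)
```

Each exposure variable would be tested in this way against cluster membership, after dropping responses recorded as 'unknown' or missing.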

Conclusions
This study provides a detailed spatiotemporal analysis of Q fever clusters in Queensland as well as insight into the different 'at risk' exposures described between cases belonging to clusters and cases outside clusters. We conclude that Q fever cluster communities identified in this study require an in-depth environmental risk assessment to help inform public health strategies to decrease their endemicity. Further analysis is needed to understand the epidemiology of C. burnetii within clusters, and to determine the main source of infections in these clusters.

Informed Consent Statement:
Patient consent was waived because involvement in the research carries no more than negligible risk to participants; there is no, or minimal risk, of harm associated with not seeking consent from individuals; and the benefits of this research greatly outweigh any potential risk. The use of personal information in this study poses a negligible risk, and the results may benefit future Q fever vaccination programs for Queensland. No individual cases will be identifiable from material presented in public or publications arising from this project.

Data Availability Statement:
The data supporting the findings of the article are not available publicly due to ethical reasons.

Conflicts of Interest:
The authors declare no conflict of interest.