Pitch and Flat Roof Factors' Association with Spatiotemporal Patterns of Dengue Disease Analysed Using Pan-Sharpened Worldview 2 Imagery

Dengue disease incidence is related to house roof construction, because roofs can provide Aedes mosquito habitat. This study classified pitch roof (PR) and flat roof (FR) surfaces from pan-sharpened Worldview 2 imagery, identified dengue disease patterns (DDPs), and assessed the association of the two roof types with DDPs. A Supervised Minimum Distance classifier was applied to 653 training data from image object segmentations: PR (81 polygons), FR (50), and non-roof (NR) class (522). Ground validation of 272 pixels (52 for PR, 51 for FR, and 169 for NR) was done using a global positioning system (GPS) tool. Getis-Ord score pattern analysis was applied to 1154 dengue disease incidence records with address-approach-based data, using a weighted temporal value of 28 days within a 1194 m spatial radius. We used ordinary least squares (OLS) and geographically weighted regression (GWR) to assess spatial association. Our findings showed 70.59% overall accuracy with a 0.51 Kappa coefficient for the roof classification images. DDPs were found in hotspot, random, and dispersed patterns. Smaller PR size and larger FR size showed some association with increasing DDP into more clusters (OLS: PR value = −0.27; FR = 0.04; R² = 0.076; GWR: R² = 0.76). The associations in hotspot patterns are stronger than in other patterns (GWR: R² in hotspot = 0.39, random = 0.37, dispersed = 0.23).

Open Access. ISPRS Int. J. Geo-Inf. 2015, 4.


Introduction
Dengue is a disease caused by the dengue virus (DENV), which is transmitted human-to-human by a female Aedes species (sp.) mosquito, an anthropophilic mosquito that breeds around humans [1]. When a human is infected with DENV, fever, headache, muscle and joint pain, and nausea appear in the first few days [2,3]. Conditions might worsen with subcutaneous or nasal and oral spontaneous bleeding as well as life-threatening severe internal bleeding and shock [3–5]. The mosquitoes bite humans mainly during the daytime but also at night [6]. Their life cycle begins when female Aedes sp. mosquitoes lay their eggs (oviposition) in watery habitats. In 1-2 weeks, eggs develop into instar larvae, pupae, and finally adult mosquitoes. Adult mosquitoes and their eggs can also survive drought; once the environment becomes moist, the eggs can hatch [7–9].
An excellent means of controlling dengue disease pandemics is to monitor and intervene in environmental conditions [10]. Environmental factors that have been mapped for dengue vector breeding habitats include vegetation, water bodies, and land cover [11]. Built-up surfaces of urban structures constitute a major factor because of their higher probability of becoming a breeding habitat [12]. Earlier studies have pointed out that, among urban structures, houses play a major role as a habitat for dengue vectors [13,14]. Roof construction presents a potential risk for Aedes mosquito breeding sites [15]. A flat roof (FR) made of waterproof material [16] is more likely than a pitch roof (PR) to carry water at a low flow velocity [17]. The former harbors stagnant water when both gutters are blocked [8,18]. All outdoor oviposition showed a relation with indoor abundance of adult female mosquitoes and can engender higher rates of oviposition inside homes, where water containers are most frequently found [15,19]. In Indonesia, government regulations related to cleaning roof areas are still uncommon. Accordingly, health promotion campaigns related to such issues are nonexistent. In a neighboring country (Australia), cleaning roof gutters is also uncommon, but one study has found them to be a productive habitat for Aedes mosquitoes [8]. We inferred that identifying pitched and flat roofs is a good approach to identifying roof gutters, although not every house roof has a roof gutter.
Many studies have used spatial analysis of dengue case patterns, but fewer have used temporal indices in their modelling approaches. Such studies primarily worked at a coarser scale using Landsat, Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER), and Advanced Very High Resolution Radiometer (AVHRR) imagery [9]. High spatial resolution (HSR) imagery such as pan-sharpened Worldview 2 has been applied for urban mapping because of its superiority in object-based analysis [20–22]. However, a public health approach that integrates spatial and temporal data to generate dengue risk maps from HSR imagery has been lacking [11]. The use of HSR imagery for mosquito-borne diseases has become increasingly popular [10,11], with Satellite Pour l'Observation de la Terre (SPOT) 5 [23], IKONOS [24], and QuickBird [25] used for malaria studies. For dengue disease studies, QuickBird [12,26], Geoeye-1 [27], and aerial images [28] have been used, although there is no mention of Worldview 2 in that body of literature [10]. The objective of the present study was to identify the association of roof construction (PR or FR) with dengue disease patterns (DDPs) by demonstrating the spatiotemporal integration of HSR images with DDPs. We hypothesized that the association would be lower than in other studies because we assessed only roof construction, apart from other factors such as vegetation and shadowed areas [29]. Additionally, PR and FR classification quality assessment from pan-sharpened Worldview 2 imagery and DDP identification were demonstrated as steps that underpin the quality of the association.

Study Area and Data
The study area covers 48.66 km² and encompasses 40 villages in northern Bandung city, located in West Java. Figure 1 presents the study area as a red, green, and blue (RGB) color composite. Its urban landscape is heterogeneous, with elevations from 1077 m above sea level at the highest point to 691 m at the lowest, and a variety of pitched and flat roofed houses. City census data show 748,561 inhabitants in this area in 2012, giving a population density of 15,383 per km². Dengue cases were endemic in Bandung city, Indonesia, during the prior decade. The population density of the city is high (about 2.5 million in 167 km²), presenting a high risk of endemicity. To control the endemicity, chemical agents for controlling mosquitoes have been applied frequently over many years [1,30], including in Bandung city, in the form of spraying households and placing larvicides into water chambers. Nonetheless, roof control for mosquito breeding sites in the city remains insufficient.
Figure 2 depicts data preparation and analytical procedures. An ortho-ready standard level-2A (ORS2A) archive pan-sharpened image of Worldview 2 (WV2) with 0.5 m ground sample distance (GSD) and red, green, and blue (RGB) color composite was acquired on 2 July 2011 as three images covering the study area. The delivered product license was purchased by the district development plan government agency (BAPPEDA) in Bandung city, Indonesia. The government provided this image only as an RGB color composite, not with the full eight bands. Orthorectification was performed in ENVI 4.8 by applying the rational function model that was included in the product folder. The images were then mosaicked in ArcGIS to obtain the full imagery, and a subset was extracted in ENVI to retain only the image covering the study area. This imagery (in ENVI image format) was loaded in the eCognition 64 software package (Trimble Geospatial Imaging) for the segmentation process. Multiresolution segmentation, an algorithm used for segmenting HSR imagery, has been used widely and has achieved good accuracy [20]. It is a bottom-up region-merging technique from pixel level into image object (IO) polygon level using three parameters: scale, shape, and compactness [20,31,32]. We set the algorithm parameters to obtain the best polygons for each object in the imagery, starting from parameters previously used for true-color RGB composite orthophotos on the pixel level: scale 10, shape 0.5, and compactness 0.9. This setting still produced pixel-level polygons, which did not convey meaningful IOs. For the IO level, trial-and-error segmentation is necessary to ensure visually pure objects belonging to only one class [20,31,33]. We therefore tried several scale parameters from 20 to 100 and selected a scale of 20 because it showed pure objects according to our visual interpretation. Figure 3 presents an RGB color composite of a pan-sharpened image and the segmentation result. From eCognition, this result was saved as a shapefile.
We used this segmentation to produce training data by choosing polygons from the segmentation and matching them with the imagery to decide which were PR, FR, and non-roof (NR). We collected 653 training data from segmentation IO polygons of the WV2 imagery. From those, we made three sets of region of interest (ROI) polygons: 81 polygons belonging to the PR class, 50 to the FR class, and the rest (522) to the NR class. Regarding the number of training data, previous reports described that the number of IO polygons used for training varied from 18 to 63 polygons per class [20].
For use as reference data, we collected ground truth points in Bandung City using a Global Positioning System (GPS) 60 CSx, together with photographs, after informed consent was received from the home owners. House roofs were tagged by GPS in front of the house when ground truth could not be assessed on site, either because of difficult access or for privacy inside the house. An ethical clearance letter to conduct this study was issued by the health research ethics committee of the Faculty of Medicine, Padjadjaran University. A letter from the municipality giving information about this study was also issued to support the informed consent process along with the clearance letter. In all, 272 GPS points were collected: 52 PR, 51 FR, and 169 NR points. We checked each GPS point photo by visual inspection and carefully matched each photo with the IO of the imagery. Later, each point was moved to one of the pixels in the IO where it belonged. These steps were modified from the process used by Aguilar et al., who produced reference data from polygons of a segmentation, not from pixels [20]. Earlier reports described that pixel-based analysis was preferred in HSR because of its higher accuracy than IO-based analysis [31].

Figure 2. Data preparation and analysis procedures.
Dengue case data from 1 January to 31 December 2012 were obtained from the Bandung city health service. It is a common problem that dengue surveillance data are reported in areal units [34]. However, we used dengue disease patient data in point units, which comprised addresses of patients, diagnoses, and dates of symptoms before hospital admission. As preparation before analysis, the data quality was checked to improve the location information. To increase the quality, sub-district and village information were matched and corrected by the health service data manager. Later, at least one map of each sub-district, derived from the WV2 imagery, was printed in high-definition quality. The maps were then given to the dengue case manager of each primary health care center (Puskesmas) in the study area, who manually located each patient address (the address approach) and marked it as a point on the map. Each patient address location was confirmed according to the case managers' knowledge of their working area coverage. When the case managers did not know exactly where to point, they approximated the patient address based on household blocks. These data were labeled as corrected data, whereas unknown addresses were excluded and labeled as uncorrected. Each map was georeferenced using the WV2 imagery as the reference. Subsequently, all dengue patient points were digitized in ArcGIS 10.1. Data quality was calculated by dividing corrected data (4172 addresses) by total data (5096), yielding 0.82 for the whole city of Bandung. Of these corrected data, only the northern part of Bandung city was extracted (1058 cases) for analyses. Census data for each village for 2012 were obtained from the city population and civil registration service.

Analysis
A Supervised Minimum Distance classifier was used to produce a classification image based on the ROI polygons. This method is based on the mean vector for each class of training data. A pixel of unknown identity is classified by computing the distance between its value and each of the training data means, and is then assigned to the closest class [31,35]. This method is extremely effective when applied to HSR imagery, perhaps because the Minimum Distance classifier is a non-parametric approach, which requires no assumption of normality. It suits HSR images because the complexity of urban features leads to non-normal data distributions [31]. To differentiate PR and FR, the mean values of the R, G, B color composite digital numbers (DN) of the training data were used, as shown in Table 1. The analysis was done in ENVI 4.8, which produced a classification image. Later, the agreement of the roof classification image with the reference data was calculated as an accuracy assessment.
Explanatory variables are the variables whose association with the dependent variable was assessed, with model fit measured using R-squared (R²) and adjusted R² (0-1) [36]. The classification image showing PR, FR, and NR classes was converted from raster to vector data in ArcGIS, and the PR and FR classes (but not the NR class) were then included in the analysis as explanatory variables. Dengue disease patterns (DDPs) refer to a variable that we derived from the incidence of dengue disease and then analyzed in ArcGIS based on address-based disease locations and dates of disease symptoms. The incidence of dengue disease is a rate: the number of cases per number of population at risk [37,38]. The population at risk was approximated by dividing the census data of each village by the number of 75 m × 75 m grid cells on inhabited areas. The population in each grid cell was assumed to be the population at risk because the size of the grid remained within the Aedes mosquito flight range [39]. The grid size was based on the approximate size of one rukun tetangga (RT), or block, in Bandung city, which consists of about 30-75 houses (each house estimated as about 72 m²) according to the city regulations. This method was a modification of that used by Kumar et al., who applied a 100 m × 100 m grid for built-up land density [34,40].
The incidence of the disease was then used for analyzing DDP with the Hotspot analysis (Getis-Ord Gi*, hereafter GiZ) tool in the ArcGIS 10.1 toolbox, which outputs continuous numeric data and can identify statistically significant hotspot, random, and dispersed patterns [1,41]. Based on a previous study, we classified the GiZ score as a hotspot or clustered pattern (GiZ score > 1.65), a random pattern (−1.65 ≤ GiZ score ≤ 1.65), or a dispersed pattern (GiZ score < −1.65) [1,41]. These DDPs were used as dependent variables for regression analysis. Hotspot analysis is the most common approach, and hotspots are the patterns most likely to require public health intervention [11]. The spatial autocorrelation coefficient Moran's I (+1 to −1) was computed to evaluate the independence of residuals. A positive value means that adjacent values tend to be similar, whereas a negative value implies dissimilarity [36].
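The Minimum Distance rule described at the start of this section can be illustrated with a minimal sketch. It assumes Euclidean distance in RGB space; the training digital numbers below are hypothetical and are not the values of Table 1, and the function names are illustrative rather than part of ENVI.

```python
import numpy as np

def class_means(samples, labels):
    """Return the sorted class names and the mean RGB vector of each class."""
    classes = sorted(set(labels))
    label_arr = np.asarray(labels)
    means = np.array([samples[label_arr == c].mean(axis=0) for c in classes])
    return classes, means

def minimum_distance_classify(pixels, classes, means):
    """Assign each pixel to the class whose mean vector is closest (Euclidean)."""
    # Distance matrix of shape (n_pixels, n_classes)
    dist = np.linalg.norm(pixels[:, None, :] - means[None, :, :], axis=2)
    return [classes[i] for i in dist.argmin(axis=1)]

# Hypothetical training digital numbers (R, G, B), for illustration only
train = np.array([
    [150, 80, 70], [160, 90, 80],        # pitch roof (PR)
    [200, 200, 205], [190, 195, 200],    # flat roof (FR)
    [60, 110, 50], [70, 120, 60],        # non-roof (NR)
], dtype=float)
train_labels = ["PR", "PR", "FR", "FR", "NR", "NR"]

classes, means = class_means(train, train_labels)
unknown = np.array([[155, 85, 72], [195, 198, 201], [66, 112, 58]], dtype=float)
predicted = minimum_distance_classify(unknown, classes, means)
print(predicted)
```

Because the rule uses only class means, it needs no covariance estimate, which is consistent with the non-parametric character noted above.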
For the analysis, we applied the date of disease symptoms as a temporal factor with a 28-day duration, modified from the reported life cycle from larva to adult mosquito in urban and sub-urban areas (24 and 33 days, respectively) [18], and 30 days [7]. For spatial factors, several distance inputs were tested to cover the number of patients. We performed this analysis in the ArcGIS toolbox using the Spatial Weight Matrix (SWM), by which we included 1154 patients and excluded four patients. We found 1194 m to be the minimum distance threshold that covered 1154 patients. Subsequently, a buffer radius (m) was created by dividing 1194 m by 28 days, taken as the assumed flight distance of a mosquito per day. This yielded a 42.64 m buffer distance. The higher the density becomes, the shorter the flight range of the mosquito, and therefore the higher the endemicity becomes [42]. This buffer was in line with that reported by Muir and Kay, who measured a female flight distance of 5-69 m/day (maximum 160 m/week) [39]. The buffer was made with the dengue disease patient location as its centre point. The explanatory variables presented above were then overlaid with the buffer in ArcGIS. Duplication analysis was done to ensure the data quality. In each buffer, the mean size of the explanatory variable was measured. Previous studies measured the proportion of land cover in a buffer [29,43]. However, we used the mean size of PR and FR in one buffer based on our assumption that a smaller roof is more difficult to reach, making it difficult for people to clean the roof gutter.
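The buffer arithmetic above can be checked with a short sketch. The neighbour test is an assumed reading of the space-time window (two cases are neighbours when they fall within both the distance threshold and the time window); it is not the ArcGIS implementation, and the function names are hypothetical.

```python
import math
from datetime import date

def daily_buffer_m(threshold_m, window_days):
    """Distance threshold divided by the temporal window, in metres per day."""
    return threshold_m / window_days

def is_space_time_neighbour(xy_a, date_a, xy_b, date_b,
                            threshold_m=1194.0, window_days=28):
    """Assumed rule: neighbours lie within both the distance and time windows."""
    distance_m = math.dist(xy_a, xy_b)           # planar coordinates in metres
    gap_days = abs((date_a - date_b).days)
    return distance_m <= threshold_m and gap_days <= window_days

buffer_m = daily_buffer_m(1194.0, 28)
print(round(buffer_m, 2))

# The buffer falls inside the 5-69 m/day range reported by Muir and Kay [39]
within_reported_range = 5.0 <= buffer_m <= 69.0
```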
For regression analysis, we used DDP as the dependent variable in ordinary least squares (OLS) and geographically weighted regression (GWR). We first applied OLS, a global regression model, to determine model performance, redundancy, stationarity, and normality of residuals. Model performance was measured using R², adjusted R², and the corrected Akaike information criterion (AICc). Redundancy was assessed through the variance inflation factor (VIF). A VIF of more than 10 indicates redundancy (multicollinearity) [36]. Koenker's studentized Breusch-Pagan (Koenker BP) statistic was used to assess stationarity. If p < 0.05, then we infer non-stationarity. The Jarque-Bera statistic was used to test the normality of the residuals, where p < 0.05 implies residuals that are not normally distributed, suggesting a biased model. The global regression between DDP as the dependent variable and the explanatory variables was set as DDP = β0 + β1 PR + β2 FR, where β0 is the intercept and β1 and β2 are the estimated coefficients of PR and FR, respectively. However, the relation is not always stationary. Even when residuals are normally distributed, a relation can be influenced by location and might therefore differ at each location (non-stationarity). The GWR model is suggested for analysis if this phenomenon is found. GWR extends the global model by allowing local variations (non-stationarity) and produces R², adjusted R², AICc, and a standardized residual map to measure the model results [36,44].
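As a minimal sketch of the global model DDP = β0 + β1 PR + β2 FR, the following fits OLS with NumPy and reports R², adjusted R², and the VIF. The data are synthetic and noise-free, generated only to show that the fit recovers known coefficients; they are not the study data, and the coefficient values are reused from the text purely for illustration. The VIF shortcut 1 / (1 − r²) holds only for the two-predictor case.

```python
import numpy as np

def fit_ols(pr, fr, ddp):
    """Fit DDP = b0 + b1*PR + b2*FR and return coefficients, R2, adjusted R2."""
    X = np.column_stack([np.ones_like(pr), pr, fr])
    beta, *_ = np.linalg.lstsq(X, ddp, rcond=None)
    fitted = X @ beta
    ss_res = np.sum((ddp - fitted) ** 2)
    ss_tot = np.sum((ddp - ddp.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot
    n, k = len(ddp), 2                      # k = number of explanatory variables
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)
    return beta, r2, adj_r2

def vif_two_predictors(x1, x2):
    """VIF shared by both predictors when the model has exactly two."""
    r = np.corrcoef(x1, x2)[0, 1]
    return 1.0 / (1.0 - r ** 2)

# Synthetic mean roof sizes per buffer (hypothetical, for illustration only)
rng = np.random.default_rng(0)
pr = rng.uniform(20.0, 120.0, size=50)
fr = rng.uniform(10.0, 80.0, size=50)
ddp = 1.0 - 0.27 * pr + 0.04 * fr           # noise-free GiZ-like response

beta, r2, adj_r2 = fit_ols(pr, fr, ddp)
vif = vif_two_predictors(pr, fr)
print(beta, r2, adj_r2, vif)
```

A VIF below 10 for this pair would, under the threshold cited above, indicate no redundancy between PR and FR.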

Results and Discussion
The findings of this study include the classification accuracy assessment of the WV2 imagery, the classification image, the DDPs, and the association between the roofs and the DDPs.

Accuracy Assessment of Roof Classification
The classification accuracy was evaluated using a confusion matrix based on the classification result for the ROIs, after the Supervised Minimum Distance classifier was applied to the pan-sharpened WV2 imagery. Results in Table 2 suggest moderate agreement (0.4-0.8) on the Kappa coefficient (KC) [45] with 70.59% overall accuracy (OA). This is lower than in a previous study, which also produced a confusion matrix for WV2 images and found 87.87% user accuracy and 77.91% producer accuracy for the roof class [46]. However, the 81.47% OA with 0.75 KC of that study is also categorized as moderate agreement, which is in line with the results of the present study.
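For readers unfamiliar with these measures, the following sketch computes overall accuracy and Cohen's Kappa coefficient from a confusion matrix. The matrix is a small hypothetical example, not the matrix of Table 2, and the function name is illustrative.

```python
import numpy as np

def overall_accuracy_and_kappa(confusion):
    """Overall accuracy and Cohen's kappa from a square confusion matrix."""
    cm = np.asarray(confusion, dtype=float)
    n = cm.sum()
    observed = np.trace(cm) / n                                   # OA
    expected = (cm.sum(axis=0) * cm.sum(axis=1)).sum() / n ** 2   # chance agreement
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa

# Hypothetical 3-class matrix (rows: classified PR, FR, NR; columns: reference)
example = [
    [8, 1, 1],
    [2, 6, 2],
    [0, 3, 7],
]
oa, kappa = overall_accuracy_and_kappa(example)
print(round(oa, 4), round(kappa, 4))
```

In this example the agreement expected by chance is one third, so an overall accuracy of 0.70 corresponds to a Kappa of 0.55, which falls in the moderate range quoted above.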
Figure 4 was produced to show the RGB color composite of the pan-sharpened image and the classification image.
This classification image resulted only from the R, G, B color composite, whereas another study that used the WV2 R, G, B, near-infrared (NIR), and panchromatic (PAN) bands found 71.3% OA and 0.59 KC, which is similar to this study (70.59% OA and 0.51 KC) [20]. We did not use Google Earth imagery because it provides only a visual representation of possible dengue breeding sites and lacks automated feature extraction as well as land cover analysis [43].

Dengue Disease Patterns
A description of variables of DDP is shown below in Table 3, which presents differences in the sizes and GiZ score of hotspot, random and dispersed patterns.The GiZ score of each point was found using ArcGIS 10.1.
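The three pattern labels follow directly from the GiZ thresholds given in the Analysis section. A minimal sketch, with a hypothetical function name:

```python
def label_pattern(giz_score, z_crit=1.65):
    """Map a GiZ score to hotspot, random, or dispersed using the 1.65 cut-offs."""
    if giz_score > z_crit:
        return "hotspot"
    if giz_score < -z_crit:
        return "dispersed"
    return "random"      # covers -1.65 <= GiZ <= 1.65, bounds included

# Hypothetical scores, for illustration only
labels = [label_pattern(z) for z in (2.10, 0.00, -1.65, -3.00)]
print(labels)
```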
The resulting DDP image is shown in Figure 5. Dense dengue disease hotspot patterns were found in the northern part of the study area, which is covered with vegetation. This result corresponds with those reported in earlier studies: dense DDPs are associated with houses surrounded by dense vegetation. In addition, shadows under such vegetation are favorable for breeding mosquitoes [43,47,48].

OLS Regression and GWR Model
The scatter plots of PR and FR against GiZ scores are depicted in Figure 6. Some FR points are visible as outliers. We identified suspected outliers by inspecting the scatter plots for points separated from the rest of the data. We then tested them by comparing R², adjusted R², and AICc for all data against the same measures for the data without the suspected outliers. The criteria were that if R², adjusted R², and AICc of all data were higher than without the suspected outliers, then the points were not outliers; if they were lower, we designated the points as outliers. From OLS regression analysis, we found positive autocorrelation (Moran's Index = 0.67, p = 0.000). Table 4a shows that the data are not mutually redundant (VIF < 10) and are non-stationary at all points (Koenker test = 0.00). From all DDP data, we found that smaller PR size and larger FR size were associated with increasing DDP into a more clustered trend, that is, toward positive GiZ scores (OLS: PR value = −0.27; FR = 0.04; R² = 0.076; GWR: R² = 0.76). We then divided all data into three groups: hotspot, random, and dispersed patterns, as shown respectively in Table 4. In the hotspot pattern, we excluded the outlier because the exclusion resulted in higher R² and adjusted R² and a lower AICc. However, we included the outliers of random patterns because, as presented in Table 4b, the OLS R² and adjusted R² were higher with outliers than without, even though AICc was lower without outliers. In each pattern, we analyzed the association, which revealed that PR had more negative values and FR had more positive values in hotspots than in the other patterns (OLS: PR value = −0.17, FR = 0.14, R² = 0.06 in hotspot pattern; PR value = −0.01, FR = 0.03, R² = 0.01 in random pattern; and PR value = −0.04, FR = 0.02, R² = 0.04 in dispersed pattern). The OLS estimated values for PR were all negative. This corresponds with the observation that the smaller the PR, the higher the GiZ score, that is, the stronger the trend toward a clustered pattern. In contrast, the estimated values for FR were positive, implying that larger FR showed a stronger trend toward a clustered pattern. Assuming, based on visual interpretation, that most houses with PR also had FR, this condition might result from lower water flow velocity on FR [17].
Moreover, stagnant water caused by blocked roof gutters makes a productive habitat for Aedes mosquito breeding [8,18]. Regarding the smaller size of PR, it might be associated with densely populated areas where most dengue disease incidence was found. Although many studies have done detailed mapping of homes, results are very rarely related to dengue incidence [43]. Another potential explanation results from more difficult access to cleaning the gutters of smaller PR. Unfortunately, we were unable to find a report to refute or corroborate this explanation. An earlier study found that it was uncommon behavior to clean a roof gutter, but no mention of the difficulty of accessing smaller PR is forthcoming from the literature [8]. From a behavior perspective, untidy and poorly maintained houses, yard, and larger shade conditions present a high risk of dengue disease because female Aedes mosquitoes are more attracted to such houses where more breeding sites are often available [47,48].
OLS assumes that the relationship is stationary, whereas in this study it was not, because the relation was also influenced by location and might differ at each location [36,44]. In such cases a GWR model is suggested to measure the association; here the OLS results also indicated a non-normal residual distribution (p value of Jarque-Bera < 0.01). The summary results of GWR are presented in Table 4. From all DDP data, we found that the relationship was stronger when using the GWR model (R² = 0.76). However, we focused more on the local variation within each dengue pattern, as past studies have distinguished relationships by local variation [36,44]. The association was higher in hotspot patterns than in random and dispersed patterns (GWR: R² in hotspot = 0.39, random = 0.37, dispersed = 0.23), which according to Kinnear and Gray is a large effect (<0.01, small effect; 0.01 to 0.1, medium; and >0.1, large effect), compared with the medium effect of the OLS results [49]. Vanwambeke et al., in their previous research about the presence of Aedes sp. larvae, found a higher association (R² = 0.52) for peri-urban housing and orchard factors [29]. Our limitation is that we addressed only house roof variables, which gave weaker results than those found by Vanwambeke et al. Higher results might be obtained by adding vegetation and shadow as variables related to dengue disease [29,43,47,48]. We did not include them as dengue risk factors: when we added such variables alongside the PR and FR variables in earlier experiments, our HSR imagery unfortunately showed low agreement between the classification image and ground data. This might be because we did not have the full bundle of imagery, consisting of eight bands plus the panchromatic band, with which agreement can be increased using methods of past studies [46]. We also did not have detailed texture feature data of the city, such as building and skyscraper textures. These data are crucially important when differentiating objects, including shadows, in HSR images [20]. We also did not include precipitation or temperature data as dengue disease risk variables in the analyses because the city lacks a climate station measuring such data, although many researchers used such data in past studies [11]. Despite these limitations, this study specifically demonstrated the use of available HSR images from the Indonesian government, analyzed in an automatic manner, with an attempt to relate the data to DDP. Different from previous studies, we analyzed DDP spatiotemporally based on mosquito characteristics: flight range and life cycle [10,11]. In future studies, manual digitization of the HSR imagery to correct the classification can be conducted to increase the agreement.
The standardized residual values of the GWR result are mapped in Figure 7. This figure is presented to help set priorities for controlling roofs by cleaning them to prevent future dengue disease cases. Roofs with values closer to 0 (zero) have higher priority. The greater the difference of the value from 0, the lower the priority. If the value was more than 3 or less than −3, it was regarded as an outlier and was ignored [49]. This point is crucially important for setting government priorities, because efforts can be made more efficient and effective despite budget limitations.
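The prioritization rule just described can be sketched as a simple ranking. This is an illustrative reading of the rule with hypothetical residual values and a hypothetical function name; it is not the procedure used to draw Figure 7.

```python
def rank_by_residual(std_residuals, outlier_limit=3.0):
    """Return indices ordered from highest to lowest priority.

    Values beyond +/- outlier_limit are treated as outliers and ignored;
    the remaining values are ranked by closeness to zero.
    """
    kept = [(i, r) for i, r in enumerate(std_residuals) if abs(r) <= outlier_limit]
    return [i for i, _ in sorted(kept, key=lambda pair: abs(pair[1]))]

# Hypothetical standardized residuals for five roofs
priority_order = rank_by_residual([0.5, -3.4, -0.1, 2.0, 3.2])
print(priority_order)
```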

Conclusions
This study finds a low association of pitched and flat roof types with each dengue disease pattern but a higher association across all patterns combined. Nonetheless, results show that the dengue hotspot pattern has a higher association than random and dispersed patterns. Results showed moderate agreement for the pitched and flat roof classifications. Under this moderate classification agreement, results also show that smaller pitched roofs present a higher probability of dengue disease in a hotspot pattern, and a larger flat roof might be slightly associated with a higher probability of the hotspot pattern. In preparing high-resolution imagery, automatic analysis might not be the best step for obtaining good agreement from RGB color composite pan-sharpened Worldview 2 imagery; manual analysis or the use of full bundle imagery is a potential alternative that demands further research. However, combining the mosquito life cycle and flight range may be a promising approach to analyzing dengue disease patterns in other regions, as long as dengue patient addresses and dates of symptoms are reported to the city health service of a country. Exploring roof type associations with dengue disease patterns can provide more information with which governments and communities can map, prioritize, and target environmental interventions against dengue disease. Additional research should address vegetation, shadows, temperature, and city texture in addition to the association.

Figure 1. (a) Map of Indonesia showing the approximate location of northern Bandung city; (b) Bandung city and the white border of study area; (c) Pan-sharpened Worldview 2 satellite image, red, green, blue (RGB) color composite showing the area depicted in high resolution satellite imagery.

Figure 4. (a) RGB color composite of pan-sharpened Worldview 2 image and (b) PR, FR, and NR class image.

Figure 6. (a) Scatter plot of PR with GiZ score and (b) scatter plot of FR with GiZ score.

Figure 7. (a) Standardized residuals in DDP; (b) standardized residuals are shown within hotspot, random, and dispersed patterns; and (c) closer examination of the residuals on roofs.

Table 1. Digital number (DN) description of pitched and flat roofs.

Table 2. Confusion matrix for classification accuracy assessment.

Table 3. Description of the variables of dengue disease patterns.