#### 2.1. Rainfall Data

In this study, daily rainfall data from 27 rain gauges maintained by the Brasilia water utility, Companhia de Águas e Esgotos de Brasília (CAESB), were used (see Figure 2A). For each of these gauges, the total monthly accumulated rainfall was calculated for each of the twelve months of every year from 1971 to 2010, the period for which daily rainfall data are available. Only the data for January were considered because, statistically, this is the rainiest month of the rainy season in the Federal District.
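The aggregation step described above can be sketched as follows. This is a minimal illustration only: the record layout (pairs of date and rainfall in mm), the function names `monthly_totals` and `january_series`, and the sample values are assumptions made for the example, not the CAESB data format or the authors' actual procedure.

```python
from collections import defaultdict
from datetime import date


def monthly_totals(daily_records):
    """Sum daily rainfall (mm) into a {(year, month): total} mapping."""
    totals = defaultdict(float)
    for day, rain_mm in daily_records:
        totals[(day.year, day.month)] += rain_mm
    return dict(totals)


def january_series(daily_records):
    """Keep only the January totals, keyed by year."""
    return {
        year: total
        for (year, month), total in monthly_totals(daily_records).items()
        if month == 1
    }


# Hypothetical daily records for a single gauge (values invented for illustration).
sample = [
    (date(1981, 1, 1), 12.5),
    (date(1981, 1, 2), 0.0),
    (date(1981, 1, 3), 30.0),
    (date(1981, 2, 1), 8.0),
    (date(1982, 1, 15), 20.0),
]

print(january_series(sample))
```

Applied to one gauge over 1981 to 2010, a series of this kind would hold 30 January totals.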

However, not all of the 27 gauges had rainfall data for every month in this four-decade period. Some had only recent data (e.g., from 2007 to 2009), others had no data for most of the 1970s (their records begin only in 1978), and still others had random gaps of days or even months. In view of these limitations, it was decided to draw only on data from 1981 to 2010, for which most of the gauges had data, albeit with some gaps. Nineteen of the 27 gauges (see Figure 2A) had complete or almost complete data and were therefore selected for this study. Their geographic location is shown in Figure 2B.

As already mentioned, the 19 gauges in Figure 2B did have some gaps in their daily records, so an interpolation procedure was carried out based on the regional weighting method [9]. Only a few months were adjusted: of the 570 January records considered (30 for each of the 19 gauges), only 12 required interpolation. The other 558 used the original CAESB data without any adjustment.
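For readers unfamiliar with regional weighting, the sketch below shows one common formulation: the missing value at a gauge is estimated as its own long-term mean multiplied by the average ratio of observed value to long-term mean at neighboring gauges. The source does not state which variant from [9] was used, nor whether it was applied to daily or monthly values, so the formulation, the function name `regional_weighting`, and all numbers here are assumptions for illustration.

```python
def regional_weighting(target_mean, neighbors):
    """Estimate a missing rainfall total at a target gauge.

    target_mean: long-term mean at the target gauge for the period (mm).
    neighbors:   list of (observed_value, long_term_mean) pairs, one per
                 neighboring gauge, for the same period.
    """
    if not neighbors:
        raise ValueError("at least one neighboring gauge is required")
    ratios = [observed / mean for observed, mean in neighbors]
    return target_mean * sum(ratios) / len(ratios)


# Hypothetical January values (mm) for three neighboring gauges.
estimate = regional_weighting(240.0, [(200.0, 250.0), (300.0, 300.0), (180.0, 200.0)])
print(f"estimated January total: {estimate:.1f} mm")

# Share of January records that needed adjustment, from the counts in the text.
total_records = 19 * 30
adjusted_records = 12
print(f"adjusted: {adjusted_records} of {total_records} "
      f"({100 * adjusted_records / total_records:.1f}%)")
```

With these invented inputs the neighbors ran at 80%, 100% and 90% of their means, so the target gauge is estimated at 90% of its own mean.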

The Federal District is located in Central Brazil, and its relief belongs to the morphosculptural unit known as the Central Plateau, a region characterized by old planation surfaces. The main geomorphological feature of the Federal District is the connection of two of Brazil's large hydrographic basins through the so-called amended waters (Águas Emendadas), a path that drains both north and south, forming headwaters of the Tocantins-Araguaia Hydrographic Basin to the north and of the Prata River Hydrographic Basin to the south. Altitudes in the Federal District range from 720 to 1340 m, and the highest areas are located in the west, as shown in Figure 3.

#### 2.2. Spatial Representativeness of the 19 Rain Gauges

After selecting the 19 gauges and executing the interpolation procedure, an analysis was conducted to ascertain whether the spatial position of the stations was in fact representative of the geographic space of the Federal District. First, QGIS 2.0 (Quantum GIS) software was used to calculate the approximate area where each of the 19 gauges is situated (Figure 4).

As shown in Figure 4A, the approximate area of the Federal District as calculated using QGIS 2.0 was 5703 km^{2}, an area very similar to its official size (5799.999 km^{2}).

Figure 4B shows that the hatched area, representing the space covered by the 19 gauges, occupies approximately 3554 km^{2}, corresponding to around 61% of the geographic area of the Federal District, which in principle indicates that the 19 gauges combined provide reasonable coverage of its territory, although there is a greater concentration of these in the western part of the territory.
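The coverage figure can be checked with simple arithmetic. The short sketch below uses only the three areas quoted in the text; the variable names are illustrative. It suggests that the stated share of around 61% corresponds to the official area, while the QGIS-derived area gives a slightly higher share.

```python
# Areas quoted in the text (km^2).
qgis_area = 5703.0        # Federal District area computed in QGIS 2.0
official_area = 5799.999  # official area of the Federal District
covered_area = 3554.0     # hatched area covered by the 19 gauges

share_official = 100 * covered_area / official_area
share_qgis = 100 * covered_area / qgis_area
area_gap = 100 * (official_area - qgis_area) / official_area

print(f"coverage vs official area: {share_official:.1f}%")
print(f"coverage vs QGIS area:     {share_qgis:.1f}%")
print(f"QGIS area is {area_gap:.1f}% smaller than the official area")
```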

In studies analyzing the spatial patterns of individual events, the idea is essentially to identify whether the events of interest occur in a clustered, dispersed, or random manner in the study area. An assessment of the distribution of the CAESB gauges in the Federal District was therefore carried out. It was hoped that their distribution pattern would be predominantly dispersed, because this would indicate that the records collected at these points would be more representative of the Federal District as a whole. Taylor (1977) presents the **R** scale this way:

“Let’s assume that we have measured the distance **r** between every point in a given pattern and its nearest neighbor. If we take the average of all these distances, we produce a value **r**_{a}.(…) We know from our previous discussion that a random process is associated with the Poisson probability functions. We can use this distribution to derive expected average nearest neighbor distances for a randomly generated pattern. It is known that this expected average distance for a randomness assumption is given by **r**_{e} = 1/(2√(**n**/**A**)) where **n** is the number of points and **A** is the area of the study region.” (TAYLOR, 1977, p. 136, 137)

Essentially, the **R** scale expresses the “divergence” of the actual point pattern (measured by **r**_{a}) from randomness (measured by **r**_{e}) as the ratio **R** = **r**_{a}/**r**_{e}.
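As an illustration of these definitions, the following sketch computes **r**_{a}, **r**_{e} and **R** for a set of points. It assumes planar coordinates in the same length unit as the square root of the area (for example km and km^{2}), uses a brute-force nearest neighbor search, and applies no edge correction. The function name `nearest_neighbor_R` and the test patterns are hypothetical and are not the gauge coordinates used in the study.

```python
import math


def nearest_neighbor_R(points, area):
    """Return (r_a, r_e, R) for planar points within a region of given area."""
    n = len(points)
    if n < 2:
        raise ValueError("at least two points are required")
    nearest = []
    for i, (xi, yi) in enumerate(points):
        # Distance from point i to its closest other point.
        nearest.append(min(
            math.hypot(xi - xj, yi - yj)
            for j, (xj, yj) in enumerate(points) if j != i
        ))
    r_a = sum(nearest) / n                     # observed mean distance
    r_e = 1.0 / (2.0 * math.sqrt(n / area))    # expected under randomness
    return r_a, r_e, r_a / r_e


# Four points on a regular grid inside a 2 x 2 region: a dispersed pattern.
grid = [(0.5, 0.5), (1.5, 0.5), (0.5, 1.5), (1.5, 1.5)]
print(nearest_neighbor_R(grid, area=4.0))

# All points at the same location: maximum clustering.
stacked = [(1.0, 1.0)] * 4
print(nearest_neighbor_R(stacked, area=4.0))
```

In the grid example every point lies one unit from its nearest neighbor while the random expectation is half a unit, so **R** exceeds 1; in the stacked example all nearest neighbor distances are zero, so **R** is 0.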

A nearest neighbor analysis was conducted for this purpose, based on the **R** scale. According to Taylor (1977), 0 ≤ **R** ≤ 2.149. **R**-values near 1 indicate a random distribution pattern. **R**-values below 1 represent an increasingly clustered pattern of distribution, such that **R** = 0 indicates maximum clustering, when all the events take place at the same point in the space under study. Finally, **R**-values over 1 indicate an increasingly dispersed pattern.

When the **R** value for the distribution pattern of the points representing the 19 rain gauges was calculated, it was found that **R** = 1.246. Although the number of elements in the sample is small (just 19 observations), a statistical hypothesis test was conducted, with H_{0} = “the spatial distribution pattern of the 19 observations is random (i.e., **R** = 1)” and H_{1} = “the spatial distribution pattern of the 19 observations is not random (i.e., either it is clustered or it is dispersed: **R** ≠ 1)”. The test yielded a z-score of 2.047738 and a p-value of 0.040578.
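The source does not state how the z-score was obtained. A minimal sketch, assuming the standard Clark and Evans test (standard error of the mean nearest neighbor distance equal to 0.26136/√(**n**^{2}/**A**), with a two-sided normal p-value), is given below. Under that assumption the area cancels and z depends only on **R** and **n**. The function name `clark_evans_test` is illustrative.

```python
import math


def clark_evans_test(R, n):
    """Two-sided z-test of H0: R = 1, assuming the Clark and Evans standard error.

    z = (r_a - r_e) / SE with r_e = 1 / (2 * sqrt(n / A)) and
    SE = 0.26136 / sqrt(n**2 / A), which simplifies to the line below.
    """
    z = (R - 1.0) * math.sqrt(n) / (2.0 * 0.26136)
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p-value
    return z, p


z, p = clark_evans_test(R=1.246, n=19)
print(f"z = {z:.3f}, p = {p:.4f}")
```

With the rounded value **R** = 1.246 this gives z of roughly 2.05 and p of roughly 0.040, close to the reported figures. The small difference is consistent with **R** having been rounded to three decimals, although this cannot be confirmed from the text.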

For the purposes of a visual comparison, Figure 5 shows three spatial arrangements of the 19 rain gauges in the Federal District. The first one represents a hypothetical situation whereby the 19 gauges are very close to one another, which would be a clustered pattern. The second corresponds to the real location of the 19 gauges, as presented in Figure 2B. The third is another hypothetical situation, this time where the 19 rain gauges would be strongly dispersed.

The point pattern shown in Figure 5C would be the ideal situation of representativeness for the scope of this study. Because it is close to 1, the **R** value of 1.246 for the real distribution of the 19 rainfall stations in the Federal District, as shown in Figure 5B, could be read as indicating a tendency towards a random distribution. However, because it is greater than 1, it also indicates a slight tendency towards dispersion, that is, towards repulsion between the points. According to the results of the hypothesis test, this departure from randomness is significant at any level α ≥ 4.0578% (p-value = 0.040578), which means the null hypothesis (H_{0}) is rejected in favor of H_{1}.

It would certainly not be true to say that the distribution pattern is clustered. Because the 19 gauges are relatively well dispersed in space, the hypothesis that they fairly represent the volume of rainfall across the whole geographic space of the Federal District is plausible. In other words, according to this analysis, it is reasonable to take the rainfall measured at the 19 rain gauges as representative of the rainfall in the whole of the Federal District.