2. Data Description
The dataset comprises long-term observations from the combined sewer catchment of the CSO structure R05 in western Graz, Austria, covering the period from 2008 to 2011. All time series are provided as CSV files with ISO-8601 timestamps including UTC offsets to ensure unambiguous temporal interpretation.
Figure 1 gives an overview of the repository structure.
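As a minimal sketch of how the timestamp convention described above might be handled, the following reads a CSV stream and converts every offset-aware ISO-8601 timestamp to UTC. The column names `timestamp` and `value` and the two sample rows are assumptions made for illustration; the actual headers should be taken from the files and the metadata.

```python
import csv
import io
from datetime import datetime, timezone


def read_series(handle, time_col="timestamp", value_col="value"):
    """Return a list of (UTC datetime, float or None) tuples from a CSV handle.

    The column names are hypothetical defaults, not the dataset's headers.
    """
    rows = []
    for record in csv.DictReader(handle):
        # fromisoformat accepts offsets such as +01:00; astimezone converts
        # the offset-aware value to UTC so that all files share one time base.
        stamp = datetime.fromisoformat(record[time_col]).astimezone(timezone.utc)
        raw = record[value_col].strip()
        rows.append((stamp, float(raw) if raw else None))
    return rows


# Invented two-row excerpt, for illustration only.
sample = io.StringIO(
    "timestamp,value\n"
    "2009-06-01T12:00:00+01:00,41.5\n"
    "2009-06-01T12:03:00+01:00,\n"
)
series = read_series(sample)
```

Converting to UTC on read avoids ambiguity when series with different offsets are merged; empty cells are kept as `None` rather than dropped so that gaps remain visible.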
2.1. Hydraulic Observations
Hydraulic measurements are stored in the folder “hydraulic data” and originate from devices operating at 3 min intervals during dry-weather conditions and at 1 min intervals during wet-weather conditions. In the main sewer conduit at the inflow to the CSO a single multi-parameter instrument recorded flow rate, flow velocity and water level. The data exist in two parallel streams: (i) locally logged values, which are retrieved irregularly due to storage constraints (inflow-digital.csv), and (ii) an analogue transmission to an online database (inflow-analogue.csv). Both files document the same variables but differ in transmission pathway and completeness. Water level in the CSO chamber, immediately upstream of the overflow weir, is provided in “chamber.csv” and enables identification of overflow onset. Measurements within the overflow conduit towards the River Mur (overflow.csv) include water level and discharge based on ultrasonic correlation sensing. All hydraulic files follow the metadata specifications on sensor technology, installation, units, and operational limits.
2.2. Pollutant Observations
The “pollutant data” folder contains continuous water quality observations within the CSO chamber. A UV/VIS spectrometer, which is installed on a floating pontoon, recorded surrogate measurements for CODeq, TSSeq and water temperature (spectrometer.csv). A locally calibrated CODeq time series derived from laboratory analyses of discrete samples is provided in “calibrated.csv”. The metadata specify calibration states, wavelength range, and sensor placement.
2.3. Laboratory Analyses
Discrete grab samples collected at short intervals by an automatic sampler are compiled in “lab data/samples.csv”. The dataset includes COD and TSS determined by cuvette tests and filtration-drying methods, respectively. Accompanying metadata describe sampling principles, device specifications, volumes, and analytical standards.
2.4. Precipitation Observations
Three precipitation time series from nearby monitoring stations are stored in “precipitation data” (files “0033900601.csv”, “0033901401.csv”, “0033901601.csv”). Each series originates from a tipping-bucket gauge and reports rainfall as time-stamped tip increments, enabling reconstruction of high-resolution event dynamics and spatial rainfall variability. Metadata include coordinates (reference system: EPSG:4326), gauge characteristics, and known data gaps.
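The reconstruction of event dynamics from tip increments could look roughly like the sketch below, which sums increments into fixed time bins. It assumes that each record can be read as a pair of an offset-aware timestamp and a depth increment in millimetres; the function name, the 5 min default and the sample values are illustrative choices, not part of the dataset.

```python
from collections import defaultdict
from datetime import datetime, timezone


def rainfall_per_interval(tips, minutes=5):
    """Sum tip increments into fixed bins.

    tips: iterable of (offset-aware datetime, depth in mm) pairs.
    Returns {bin start (UTC): rainfall depth in mm}, sorted by time.
    """
    width = minutes * 60
    totals = defaultdict(float)
    for stamp, depth_mm in tips:
        epoch = int(stamp.timestamp())
        # Floor the timestamp to the start of its bin.
        start = datetime.fromtimestamp(epoch - epoch % width, tz=timezone.utc)
        totals[start] += depth_mm
    return dict(sorted(totals.items()))


# Invented tips for illustration only.
utc = timezone.utc
tips = [
    (datetime(2009, 6, 1, 12, 0, 10, tzinfo=utc), 0.1),
    (datetime(2009, 6, 1, 12, 1, 0, tzinfo=utc), 0.1),
    (datetime(2009, 6, 1, 12, 4, 59, tzinfo=utc), 0.1),
    (datetime(2009, 6, 1, 12, 5, 0, tzinfo=utc), 0.1),
]
binned = rainfall_per_interval(tips)
```

Dividing each binned depth by the bin width gives a mean intensity for that interval. Bins without tips do not appear in the result, so they need to be filled with zeros, or marked as missing where the metadata report a gap.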
2.5. Hydrodynamic Sewer Model
A detailed SWMM model of the catchment is provided in “sewer model/model.inp”. The model reflects the state of 2009 and includes the sewer system, sewer sheds, and rain gauges, all of which are geo-referenced using the coordinate reference system EPSG:31256. It was calibrated under both dry- and wet-weather conditions and is accompanied by a map illustrating the system layout (see Figure 2).
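For a quick plausibility check of the model file, the number of objects per section can be counted directly from the text-based input format. The sketch below relies only on the general SWMM 5 conventions that section names appear in square brackets and that comment lines start with a semicolon; the embedded excerpt is invented and heavily simplified.

```python
from collections import Counter


def count_inp_objects(lines):
    """Count non-comment data lines per [SECTION] of a SWMM input file."""
    counts = Counter()
    section = None
    for line in lines:
        text = line.strip()
        if not text or text.startswith(";"):
            continue  # skip blank lines and comments
        if text.startswith("[") and text.endswith("]"):
            section = text[1:-1].upper()
            continue
        if section is not None:
            counts[section] += 1
    return counts


# Invented, simplified excerpt for illustration only.
example = """[JUNCTIONS]
;;Name  Elevation
J1      350.0
J2      349.5

[CONDUITS]
;;Name  From  To  Length
C1      J1    J2  50
""".splitlines()
counts = count_inp_objects(example)

# For the real file (path as given in the repository description):
# with open("sewer model/model.inp") as handle:
#     counts = count_inp_objects(handle)
```

The line count equals the object count only for sections that use one line per object, such as subcatchments, junctions and conduits. Node totals would have to be summed over all node-type sections present in the file.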
2.6. Metadata
The accompanying “meta_data.json” file documents sensor technologies, physical placement, operational ranges, temporal coverage, and unit specifications for all variables. It also summarises known gaps and provides the contextual information necessary to correctly interpret and harmonise the files within the monitoring framework. The precipitation gauge locations and the site location are included in the metadata using the EPSG:4326 reference system.
Table 1 summarises the monitoring equipment and the measurement accuracies reported by the respective manufacturers. As no independent comparative measurements were conducted, the uncertainty of the recorded data cannot be quantified directly. For further information on expected measurement performance and methodological limitations, reference is made to previous studies [1,2,3,4,5,6].
2.7. Evaluation Products
A set of evaluation outputs is included to facilitate rapid orientation within the dataset. Data gaps for each time series are summarised as CSV files in “evaluations/data_gaps”. An availability graphic (availability.png) provides an overview of temporal coverage. A compiled table of wet-weather events lists start and end times, rainfall totals, return periods and relevant duration metrics for each station. Event-specific visualisations (event_plots.pdf) show rainfall, CSO inflow and overflow, and CODeq concentration. Sampling activity and corresponding continuous measurements are shown in “samples_plots.pdf”. Correlations between laboratory and continuous pollutant measurements are provided in “samples_corr_COD.png” and “samples_corr_TSS.png”.
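Users who wish to re-derive gap summaries for a particular series could start from a sketch like the following. It flags every step between consecutive timestamps that exceeds the longest nominal logging interval (3 min) plus a small tolerance. The tolerance and the gap definition are assumptions and need not match the criteria behind the files in “evaluations/data_gaps”.

```python
from datetime import datetime, timedelta, timezone


def find_gaps(timestamps, max_step=timedelta(minutes=3),
              slack=timedelta(seconds=30)):
    """Return (last record before gap, first record after gap, duration)."""
    ordered = sorted(timestamps)
    gaps = []
    for earlier, later in zip(ordered, ordered[1:]):
        step = later - earlier
        if step > max_step + slack:
            gaps.append((earlier, later, step))
    return gaps


# Invented timestamps for illustration only.
utc = timezone.utc
stamps = [
    datetime(2009, 6, 1, 12, 0, tzinfo=utc),
    datetime(2009, 6, 1, 12, 3, tzinfo=utc),
    datetime(2009, 6, 1, 12, 6, tzinfo=utc),
    datetime(2009, 6, 1, 13, 0, tzinfo=utc),
    datetime(2009, 6, 1, 13, 1, tzinfo=utc),
]
gaps = find_gaps(stamps)
```

The same function does not apply directly to the precipitation files, because tip-based records are naturally sparse during dry periods.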
3. Methods
3.1. Sewer Catchment
The study area lies in the western part of Graz, Austria’s second-largest city, and is internally designated as “Graz West R05.” Graz is situated at approximately 353 m above sea level in the south-eastern foothills of the Alps and is characterised by a temperate continental climate with an annual precipitation of about 800 mm. Summers are typically wetter and often influenced by convective thunderstorms. The urban area is divided by the river Mur, which has a mean discharge of roughly 110 m³/s [7].
The R05 catchment covers approximately 4.6 km², of which about 1.3 km² are impervious. Surface slopes generally range between 0.5% and 4%, increasing to up to 10% in the westernmost area. The sewer system is predominantly a combined system that drains by gravity along the natural terrain slope towards the River Mur. A main collector running parallel to the river conveys the wastewater further south to the municipal wastewater treatment plant. All upstream subcatchments converge at a single outflow point where a CSO structure is located.
The catchment hosts a mixed urban structure. The eastern part is densely built-up, whereas the western part features more open residential development. Several smaller indirect dischargers are present, alongside two major ones—a brewery and a pharmaceutical company. The total population served is about 19,500 inhabitants, corresponding to a density of roughly 43 inhabitants per hectare. The average dry-weather flow in 2009 was approximately 40 L/s.
The sewer network consists of various pipe geometries, including circular profiles from 150 mm diameter upwards, oval sections up to 1300/1950 mm, and several special cross-sections. The total network length is 46.5 km. A single in-sewer storage structure provides an activatable volume of 2300 m³ and is operated with a constant throttled outflow of about 160 L/s to the downstream conduit.
3.2. Combined Sewer Overflow Structure
The CSO structure (see Figure 3 and Figure 4) is located at the outlet of the R05 catchment, where all upstream sewer branches converge. When the inflow exceeds approximately 500 L/s, the system begins to spill excess water directly into the river Mur. The overflow is conveyed through a dedicated conduit about 90 m in length, leading from the CSO chamber to the river.
Dry-weather flow is routed beneath the overflow weir through a throttle pipe with a diameter of 0.6 m and a length of roughly 3 m, which discharges into the main collector sewer. This collector conveys the wastewater further downstream towards the municipal wastewater treatment plant.
The overflow weir itself is approximately 8.9 m long, with a crest about 0.9 m above the invert of the dry-weather flow channel. Between the weir and the overflow conduit, the main collector sewer passes laterally through the structure. In this section, the available flow depth is reduced to about 70 cm, which may cause backwater effects during intense overflow events.
3.3. Sewer Measurements
The primary measurement station is located on the orographic right bank of the river Mur within the CSO structure. It provides continuous monitoring of both hydraulic and water quality parameters. An overview of the measured variables and probe locations is shown in Figure 4.
Under dry-weather conditions, all sensors operate at a standard logging interval of 3 min. During storm events, the interval automatically switches to 1 min when the water level in the CSO chamber exceeds 40 cm, ensuring higher temporal resolution during dynamic flow conditions.
All probes are connected to an industrial-grade PC designed for outdoor environments. This PC controls the monitoring station, handles data acquisition and manages intermediate storage. Sensors interface with the system either through digital bus connections or via analogue inputs (4–20 mA). For analogue channels, predefined value ranges are assigned to each sensor to reflect realistic operating limits, acknowledging that wider span settings reduce measurement precision.
Sensors could not be installed inside or in the vicinity of the throttle pipe. Installation within the pipe was not permitted because it would have reduced the effective flow cross-section and increased the risk of blockage. Maintaining the full cross-section would have required extensive structural modifications, which were not feasible given that the pipe is located approximately 5 m below ground and conveys continuous dry-weather flow. Measurements in the main sewer conduit were not part of the original monitoring concept, as the study primarily aimed to analyse in-sewer pollutant dynamics and the occurrence of overflow events.
Further technical details on the measurement station and its development can be found in [8,9].
3.4. Hydraulic Observations
Hydraulic conditions in the CSO structure are monitored by flow meters installed in both the inflow conduit and the overflow conduit, complemented by water-level measurements in the CSO chamber.
3.4.1. Inflow Conduit
In the inflow to the CSO chamber, flow velocity and water level are measured using a contactless flow meter (FloDar, Marsh McBirney, Ellicott City, MD, USA), which measures the flow velocity using a radar sensor based on the radar Doppler principle and the water level using an ultrasonic sensor. The analogue output of the instrument is scaled between 4 and 20 mA, corresponding to a discharge range from 0 to 2500 L/s. This upper limit represents a technical cut-off: inflows exceeding 2500 L/s cannot be captured, resulting in truncated peak values during large storm events. It reflects a deliberate trade-off between achieving accurate measurements and maintaining a sufficiently wide monitoring range for wet-weather conditions.
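The effect of the analogue scaling can be illustrated with a short sketch. It assumes a linear mapping between loop current and discharge, which is the usual convention for 4–20 mA outputs but is not stated explicitly for this instrument; the function names and the saturation tolerance are hypothetical.

```python
def current_to_discharge(current_ma, q_min=0.0, q_max=2500.0):
    """Map a 4-20 mA loop current linearly to discharge in L/s.

    A linear characteristic is assumed here.
    """
    fraction = (current_ma - 4.0) / 16.0
    return q_min + fraction * (q_max - q_min)


def is_saturated(discharge_ls, q_max=2500.0, tolerance=1.0):
    """Flag values at the upper end of the analogue range.

    The tolerance in L/s is an illustrative choice.
    """
    return discharge_ls >= q_max - tolerance
```

Under this assumption, `current_to_discharge(12.0)` gives 1250 L/s, the midpoint of the range. Values flagged by `is_saturated` mark periods in which the true inflow may have been higher than recorded.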
Water-level measurements in the inflow conduit are further constrained by the minimum required distance between the water surface and the ultrasonic sensor, which imposes a physical upper limit on measurable depths. The dry-weather calibration of the radar flow meter was carried out by [10], whereas no calibration under wet-weather flow conditions was performed.
3.4.2. Overflow Conduit
In the overflow conduit, discharge is recorded using an ultrasonic flow meter (NIVUS Type OCM Pro) installed at the pipe invert. Water level in this conduit is measured by an external air-ultrasonic sensor positioned at the crown of the pipe.
3.4.3. CSO Chamber
Within the CSO chamber itself, an ultrasonic water-level probe is mounted above the dry-weather flow channel. Its measurements serve both as an indicator of hydraulic system state and as the trigger for switching between standard (3 min) and high-resolution (1 min) recording intervals.
3.5. UV/VIS Spectrometer Probe
Water quality in the CSO chamber is monitored using a UV/VIS spectrometer probe. The spectrometer is a submersible spectro::lyser (5 mm optical path length) manufactured by s::can. The instrument is approximately 65 cm long and 44 mm in diameter. It operates quasi-continuously, recording one spectrum per minute. A xenon flashlamp emits light across 200–750 nm in 2.5 nm increments, and the probe measures light attenuation (absorption and scattering) along its optical path. From these spectra, equivalent concentrations for chemical oxygen demand (CODeq) and total suspended solids (TSSeq) are derived. Because the method is reagent-free and requires no sample preparation, the probe enables continuous, in-situ monitoring with minimal manual intervention. The instrument is fitted with an automated air-blast cleaning system, activated before every 5th measurement.
The manufacturer provides a generic global calibration for typical municipal wastewater with the identifier “INFLU004V15T”; however, the optical signal is sensitive to the specific wastewater matrix at a site. Reliable estimates therefore require local calibration based on laboratory analyses across a representative concentration range. Several sampling campaigns were carried out for this purpose, and exponential regression yielded the best performance for CODeq concentrations. No further improvements were found for TSSeq concentrations. The root mean square error (RMSE) for COD was 184 mg/L using global calibration, compared to 109 mg/L with local calibration. For TSS, the RMSE under global calibration was 88 mg/L. Additional details on calibration procedures and regression methods are reported in [6,11]. During 2009, a major refurbishment of the probe was required following corrosion damage and a cable short-circuit, resulting in a data loss of more than one month.
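To make the comparison of calibrations concrete, the sketch below fits an exponential relation between probe readings and laboratory values and computes the RMSE. The functional form y = a·exp(b·x), the fit via least squares on the logarithm of the laboratory values, and all sample numbers are assumptions for illustration; the regression actually used is documented in the cited calibration studies.

```python
import math


def fit_exponential(x, y):
    """Fit y = a * exp(b * x) by least squares on ln(y); requires y > 0."""
    n = len(x)
    ln_y = [math.log(v) for v in y]
    mean_x = sum(x) / n
    mean_ln = sum(ln_y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (li - mean_ln) for xi, li in zip(x, ln_y))
    b = sxy / sxx
    a = math.exp(mean_ln - b * mean_x)
    return a, b


def rmse(observed, predicted):
    """Root mean square error between two equally long sequences."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Synthetic values for illustration only, not taken from the dataset.
probe = [100.0, 200.0, 400.0, 800.0]
lab = [150.0 * math.exp(0.002 * v) for v in probe]
a, b = fit_exponential(probe, lab)
local = [a * math.exp(b * v) for v in probe]
error = rmse(lab, local)
```

A fit in log space minimises relative rather than absolute deviations, so it is not identical to a nonlinear least-squares fit on the original scale. For strongly scattered data the two can differ noticeably.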
3.5.1. Installation on a Floating Pontoon
The spectrometer is mounted on the bottom of a floating pontoon positioned directly in the sewer flow. This configuration enables continuous measurements in the uppermost water layer, which is the first to reach the overflow and is therefore appropriate for characterising overflow concentrations. Although concentrations across the full cross-section cannot be inferred without assuming complete mixing, the installation provides robust, event-responsive data at a hydraulically relevant location.
Deploying the probe directly in the sewer requires a mechanically resilient setup capable of withstanding dynamic hydraulic loads, which may reach up to 10 m³/s at the measurement site, as well as an aggressive environment (e.g., corrosion, grease, and clogging). To mitigate clogging, two measures were implemented: (i) a steel baffle in the dry-weather channel to maintain a minimum water level, and (ii) a piston-driven cable winch enabling remote lifting of the pontoon for removing clogging. These measures substantially reduced the need for on-site maintenance. A schematic of the installation is shown in Figure 5. Further descriptions of pontoon installations and comparisons with bypass configurations are provided by [8,9].
3.5.2. Calibration and Maintenance
Calibration datasets were collected under both dry- and wet-weather conditions, although uncertainties during storm events remain higher due to rapid changes in the wastewater matrix.
Routine maintenance involves manual cleaning of the spectrometer and surrounding sewer area with high-pressure equipment approximately every two weeks. After each rainfall event, the condition of the pontoon is visually assessed using a remotely operated camera, which allows emerging problems to be detected early and reduces potential downtime.
3.6. Sampling
An automatic sampler was installed to obtain reference water quality samples for calibrating the UV/VIS spectrometer. The sampling hose was mounted directly on the floating pontoon near the spectrometer, ensuring that collected samples represented the same wastewater matrix measured optically. Sampling was manually triggered during wet-weather events to capture the full range of concentrations relevant for calibration.
Grab samples of approximately 6 litres were taken over 3–5 min using an automatic peristaltic sampler (American Sigma, now Hach). Samples were cooled immediately after collection and stored under refrigerated conditions until transported to the laboratory, located about 1.5 km from the site. To minimise alterations in the wastewater matrix, laboratory analyses were performed as soon as possible.
In the laboratory, samples were homogenised using an Ultraturrax device. COD concentrations were determined using Hach (Lange) cuvette tests, while TSS concentrations were measured by filtering 50 mL aliquots at 8 bar, followed by drying at 105 °C and weighing. All analyses were performed in triplicate to quantify analytical uncertainties.
3.7. Precipitation Observations
High-resolution precipitation data are available from three tipping-bucket rain gauges installed within and near the R05 catchment. These digital gauges transmit data continuously to a central server via remote communication. Each recorded tip is transferred as a discrete value with an associated timestamp, corresponding to a rainfall depth of 0.1 mm, based on a bucket volume of 5 cm³. The instruments were not subject to individual field calibration; however, comparison with nearby high-quality weighing gauges operated by GeoSphere Austria indicates good agreement, with the expected tendency towards slightly lower intensities [12].
The exact locations of all three gauges are provided, and the installations were selected to avoid shadowing by nearby buildings, ensuring minimal disturbance to the rainfall measurements.
Additional high-quality 10 min meteorological data for the study period (including precipitation, air temperature, wind, and solar radiation) are available from the Datahub of GeoSphere Austria (GSA) [13] under a CC BY 4.0 licence. The closest and most relevant station is “Graz Straßgang” (ID 16413).
Figure 6 compares the cumulative precipitation of the three provided time series with the high-quality data from a GSA weighing gauge. The stations recorded comparable patterns and rainfall totals with minor differences, and the tipping-bucket gauges generally recorded smaller totals. The plot also shows the gaps in the station records.
3.8. Hydrodynamic Sewer Model
A hydrodynamic model of the Graz West R05 catchment was developed using the EPA-SWMM simulation environment (Version 5.2.4). Model setup and parameterisation were based on multiple geospatial and infrastructure data sources, including infrared imagery, cadastral maps, the municipal digital sewer map, aerial photographs, and land-use information provided by the City of Graz.
The resulting network representation (see Figure 2) comprises 1163 subcatchments, 1364 nodes, and 1369 conduits, capturing the structural and hydraulic characteristics of the combined sewer system in detail. Dry-weather calibration yielded a root-mean-square error of approximately 8 L/s for typical dry-weather flows between 10 and 50 L/s, indicating good agreement between measured and simulated conditions.
Although wet-weather calibration was performed manually rather than using advanced optimisation techniques, the model shows adequate performance during rainfall events. Across 176 monitored events in the measurement period, the Nash–Sutcliffe Efficiency (NSE) for inflow to the CSO chamber exceeded 0.43 for 100 events, and 0.75 for 37 events, demonstrating that the model can reproduce key aspects of the system’s dynamic behaviour during storm conditions.
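For reference, the Nash–Sutcliffe Efficiency compares the squared model error with the variance of the observations around their mean. A minimal sketch of the standard definition for one event is given below; the function name and the sample values are illustrative.

```python
def nash_sutcliffe(observed, simulated):
    """NSE = 1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2)."""
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / variance


# Invented values for illustration only.
obs = [1.0, 2.0, 3.0]
sim = [1.0, 2.0, 4.0]
nse = nash_sutcliffe(obs, sim)
```

A value of 1 indicates a perfect match, and 0 means the model performs no better than the mean of the observations. Negative values indicate that the mean would be the better predictor. The function is undefined for a constant observed series, because the denominator is then zero.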
3.9. Past Publications
The sewer monitoring station has been in continuous operation since 2002 and has supported a wide range of scientific studies. Early work presented the installation and first findings of the monitoring system [8], followed by an assessment of CSO volumes and pollutant loads based on 39 overflow events recorded between 2002 and 2004 [9]. Calibration strategies for the UV/VIS spectrometer using data from 2001 to 2003 were investigated in [1,11]. A six-month dataset from 2003 formed the basis for hydraulic and water quality model calibration in [14], which was later adapted in [15]. Gamerith et al. [16] evaluated and validated the monitoring data from 2008 to 2010 and contributed to improving the modelling workflow. The monitoring setup, model configuration, and findings for 2009–2011 were described in [17]. Model sensitivity for events in 2009 was examined in [18]. Long-term spectrometer calibration using data from 2003 to 2009 was presented by [6]. High-frequency spectrometer data from July 2011 were analysed in [19] for machine-learning-based pollutant prediction. Subsequent studies focused on SWMM calibration using CSO duration data [20] and on overflow detection with temperature probes [21]. Most recently, ref. [22] used the monitoring period 2009–2011 together with the hydrodynamic model to investigate particulate pollutant processes.
3.10. Statistical Summary
Over the 3-year and 2-month observation period, a total rainfall depth of approximately 3000 mm was recorded. In total, 220 rainfall events with at least 1 mm of precipitation were identified. Depending on the station, between 11 and 24 events exceeded a return period of one year, with maximum return periods ranging from 7.5 to 33 years. This variation reflects the spatial heterogeneity of convective storm events. Peak rainfall intensities ranged between 2.6 and 3.7 mm per 5 min. For 161 rainfall events, complete sewer measurements were available. For 63 events, no sewer data were recorded, while for 4 events only pollutant concentration data and for 2 events only hydraulic data were available. A statistical summary of the in-sewer measurements is provided in Table 2.
4. User Notes
Several limitations should be acknowledged when interpreting the dataset and modelling results. First, the extended observation period includes some gaps in the recorded data (see Figure 7). These interruptions may affect the continuity and reliability of certain analyses.
Several gaps occurred in the precipitation records, for which no documented cause is available. At station 0033900601, data are missing from 30 June 2009 to 29 August 2009 (approximately two months) and from 8 December 2009 to 26 September 2010 (approximately ten and a half months). For station 0033901401, observations only began on 28 November 2008. At station 0033901601, a gap is present from 9 October 2009 to 5 November 2009 (approximately one month).
The digital inflow data were retrieved manually, which explains the presence of occasional gaps in this record. Periods with gaps in the analogue inflow data but not in the digital data indicate transmission problems in the analogue signal. The main monitoring campaign started in August 2008, while part of the digital inflow data originates from a preliminary phase preceding the main project.
Two major data gaps in the sewer dataset were caused by failures of the central data logger, resulting in the loss of all measurements connected to the sewer monitoring system. No sewer data are available from 10 August 2009 to 22 October 2009 (73 days), and for the entire month of April 2010. In addition, two medium-length interruptions occurred from 5 November 2010 to 23 November 2010 (17 days) and from 7 January 2011 to 17 January 2011 (10 days).
Additionally, the precipitation measurements were collected separately from the sewer system data. Although the misalignment between these datasets is minor, it should be considered when assessing simulation performance and interpreting time-sensitive results.
Flow measurements from the analogue inflow sensor exhibit truncated peak values due to configuration settings on the analogue transmission. While these missing peaks can potentially be reconstructed using digitally recorded inflow data, the digital records show a noticeable time drift and are not synchronised with the rest of the dataset.
We suggest addressing the range limitation of the inflow sensor by combining both analogue and digital water level signals, as neither dataset is fully truncated across the observation period. A cross-correlation analysis can be performed stepwise (e.g., on a daily basis) over a range of possible time lags to capture potential drift in the temporal offset, after which the digital signal is shifted accordingly. Subsequently, a linear regression can be applied to align the analogue measurements with the digital data, accounting for slight transformations in the analogue signal. For time periods in which the analogue signal is saturated, and peak flows are truncated, the corresponding digital measurements can be used to reconstruct the flow values. The published dataset is retained in its raw form; Figure 8 presents an estimate of the time lag, which is up to 27 min and reaches approximately 60 min for a short period.
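The lag estimation and regression steps suggested above could be sketched as follows. The code assumes that both signals have already been resampled to a common regular time step and cut to equal length (for example, one day per call); the function names, the lag search range and the sample series are illustrative and do not reproduce the procedure behind Figure 8.

```python
import math


def pearson(a, b):
    """Pearson correlation; returns NaN if either input is constant."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return float("nan")
    return cov / math.sqrt(var_a * var_b)


def best_lag(reference, delayed, max_lag):
    """Return (lag, r); lag > 0 means `delayed` trails `reference` by lag samples."""
    n = len(reference)
    best, best_r = 0, -math.inf
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            ref, dly = reference[: n - lag], delayed[lag:]
        else:
            ref, dly = reference[-lag:], delayed[: n + lag]
        r = pearson(ref, dly)
        if r > best_r:  # comparisons with NaN are False, so NaN is skipped
            best, best_r = lag, r
    return best, best_r


def linear_fit(x, y):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Invented signals: the second series trails the first by three samples.
analogue = [0, 0, 0, 1, 4, 9, 4, 1, 0, 0, 0, 0, 0, 0, 0]
digital = [0, 0, 0] + analogue[:-3]
lag, r = best_lag(analogue, digital, max_lag=5)
```

Once the lag for a window is known, the digital series can be shifted by that number of samples. `linear_fit` can then be applied to unsaturated pairs to relate the two signals before digital values are used in place of truncated analogue peaks. Windows with little flow variation give unreliable correlations and are better skipped or interpolated from neighbouring windows.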
Sensor maintenance represents another constraint. Despite regular cleaning, earlier operational periods demonstrated that extended cleaning intervals can cause gradual drift in quality measurements. Minor drift in the current dataset therefore cannot be entirely ruled out.
Interpreting low flow rates in the overflow conduit also requires caution. Although no event with backflow over the weir crest was documented during the published observation period, in-sewer video recordings from later years confirm that such events have occurred. Within the dataset period, backflow from the river into the overflow conduit can nevertheless be identified. These situations can be detected by selecting time intervals in which the chamber water level remains below the weir crest while the water level or flow rate in the overflow conduit exceeds zero. However, measurements at very low flow rates are subject to increased uncertainty and should be interpreted with caution.
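The selection rule for river backflow described above can be written as a simple filter. The sketch assumes time-aligned records of chamber level, conduit level and conduit discharge, and a crest level expressed in the same vertical reference as the chamber level; the function name, the optional minimum thresholds and the sample values are hypothetical.

```python
def flag_river_backflow(records, crest_level_m, min_level_m=0.0, min_flow_ls=0.0):
    """Flag records where the chamber is below the crest but the conduit is wet.

    records: iterable of (chamber_level_m, conduit_level_m, conduit_flow_ls).
    crest_level_m must use the same datum as chamber_level_m (assumption).
    """
    flags = []
    for chamber, level, flow in records:
        below_crest = chamber < crest_level_m
        conduit_wet = level > min_level_m or flow > min_flow_ls
        flags.append(below_crest and conduit_wet)
    return flags


# Invented records for illustration only.
records = [
    (0.3, 0.00, 0.0),    # dry weather, conduit dry
    (0.3, 0.05, 0.0),    # chamber low, water standing in the conduit
    (1.0, 0.20, 300.0),  # overflow event
    (0.5, 0.00, 2.0),    # chamber low, small flow signal in the conduit
]
flags = flag_river_backflow(records, crest_level_m=0.9)
```

Raising `min_level_m` and `min_flow_ls` above zero is one way to account for the higher uncertainty of readings at very low flow.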
Hydraulic interactions within the system introduce additional uncertainty. The main sewer conduit affects water levels in the CSO chamber, influencing calculated overflow volumes. Accurate modelling therefore requires calibration to account for this interaction. In addition, at low-flow conditions, the steel baffle installed within the CSO chamber can alter water levels in ways not fully captured by standard assumptions.
The SWMM model provided in the repository currently relies on manual calibration only. Further refinement and systematic calibration using this published dataset would enhance the precision and credibility of the simulation outputs.
The dataset is provided as raw data. Only clearly unrealistic values were removed, and no additional validation or automated filtering was applied. Measurement limits and operational boundaries of the sensors are documented in the accompanying metadata. As further validation steps, such as threshold-based filtering, may remove information that could be relevant for specific research questions, the data are intentionally left in their original form. Users are encouraged to apply case-specific filtering and quality-control procedures according to their analytical objectives.
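As one example of case-specific quality control, a threshold-based filter can mask values outside a plausible range while keeping the record length unchanged. The limits in the example are placeholders; suitable values should be taken from the operational ranges in the metadata and adapted to the research question.

```python
def mask_out_of_range(values, lower, upper):
    """Replace values outside [lower, upper] with None; keep existing gaps as None."""
    return [
        v if (v is not None and lower <= v <= upper) else None
        for v in values
    ]


# Invented values for illustration only.
raw = [10.0, -5.0, 2600.0, None, 2500.0]
filtered = mask_out_of_range(raw, lower=0.0, upper=2500.0)
```

Masking rather than deleting keeps the timestamps aligned across series, so filtered and raw data can still be compared point by point.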
The monitoring campaign and associated dataset are complete. No additional data collection or updates to the provided SWMM model are planned, even if future modifications to the drainage system take place.
Improvements
Future monitoring campaigns could be improved by strengthening operational reporting and maintenance documentation. For example, regular weekly reports would support systematic oversight and enable rapid detection of temporary failures of individual system components. To respond effectively to such failures, spares of the most critical components would also need to be kept on site, allowing immediate replacement in case of malfunction. However, maintaining a fully redundant inventory of all required system components is typically not financially feasible within research projects. As a result, occasional downtime over extended monitoring periods is difficult to avoid. In addition, strict time synchronisation across all recorded measurement signals is essential to ensure data consistency and comparability.