FIKWater: A Water Consumption Dataset from Three Restaurant Kitchens in Portugal

With the advent of the Internet of Things (IoT) and low-cost sensing technologies, the availability of data has reached levels never imagined before by the research community. However, independently of their size, data are only as valuable as the ability to have access to them. This paper presents the FIKWater dataset, which contains time series data for hot and cold water demand collected from three restaurant kitchens in Portugal for consecutive periods between two and four weeks. The measurements were taken using ultrasonic flow meters, at a sampling frequency of 0.2 Hz. Additionally, some details of the monitored spaces are also provided.

The Future Industrial Kitchen (FIK) project (see https://futurekitchen.m-iti.org/ (accessed on 1 March 2021)) targeted the Portuguese luxury hospitality sector. It aimed at developing a next-generation concept for industrial kitchens (IKs) by combining, among others, Internet of Things (IoT) enabled interactive technologies and optimized appliance arrangements to maximize efficiency and make the work more pleasurable for the operating staff. One of the main goals of the FIK project was to understand how electricity and water are consumed in high-end IKs. To this end, noninvasive electricity and water monitoring technology was deployed in the kitchens of three restaurants for consecutive periods of four weeks [13,14]. The monitored electricity and water consumption data were later used as inputs to a digital dashboard designed to provide real-time and historical feedback on resource consumption. This data descriptor presents the aggregated hot and cold water demand data collected during the three real-world deployments performed in the scope of this project. Furthermore, for one of the IKs, details about wet appliances (dishwasher and glasswasher) are also available. These details include the periods when the appliances were ON or OFF, derived from an analysis of the electricity consumption of the two appliances.

Relation to Prior Datasets
Water consumption has received considerable attention from the research community, particularly in recent years, and there has also been an effort to release water consumption datasets. Di Mauro et al. [15] review 92 urban water demand datasets. The reviewed datasets are distributed across three spatial scales: district (20 datasets), household (31), and end-use (41). Unfortunately, many of these datasets are not available under an open-source license. Instead, some have restricted access (e.g., requiring a commercial license or special requests to the data owners), whereas for others (primarily in papers published in the 1970s/80s/90s), there is no information on how to access them.
Interestingly, there are no references to water demand in restaurants. Furthermore, a search on data.world (see https://data.world/search (accessed on 1 March 2021)) and openei.org (see https://openei.org/ (accessed on 1 March 2021)) for the keywords "kitchen water" and "restaurant water" did not reveal any relevant results that suggested the existence of such datasets.
To the best of our knowledge, FIKWater is one of the few publicly available datasets with water demand from restaurant kitchens, which makes it a valuable and unique contribution to the water monitoring and management research fields. More particularly, this dataset can be used to develop methodologies to explore water consumption in restaurants [7]. Likewise, FIKWater can also be used to develop and evaluate water consumption benchmarks across kitchens, a topic that has received some attention in the context of electricity consumption (e.g., [16,17]) but has not been covered when it comes to water demand.
From a more technical perspective, FIKWater can also be used in the context of machine-learning research. One example is non-intrusive water disaggregation [3,5], which identifies the water consumption of individual wet appliances using only aggregated water demand measurements. In this context, the fact that FIKWater contains information about individual wet appliances in one of the kitchens makes it particularly relevant. Finally, this dataset can also serve to further the research on the simulation of water demand profiles in industrial contexts (e.g., [20]), which, unlike the domestic sector (e.g., [18,19]), is still relatively under-explored.

Methods
This section provides an overview of the data collection process that led to the creation of the FIKWater dataset.

Data Collection Hardware
The water consumption was measured using an ultrasonic flow meter, specifically the TUF2000M (see http://www.t3-1.com/english/index.php (accessed on 1 March 2021)) installed on the main water entrance pipe, hence monitoring the total water demand. The TUF2000M measures the following parameters: (1) instantaneous flow rate, (2) liquid velocity, (3) speed of sound, (4) positive and negative accumulators, and (5) totals (day, month and year).
The main reason for selecting an ultrasonic flow meter was that it enables monitoring the water flow from outside the pipes with clamp-on sensors, hence avoiding invasive changes to the existing infrastructure. Figure 1 shows the sensor used and an illustration of the installation procedure. Table 1 lists the most relevant features of the TUF2000M ultrasonic flow meter.
After installing the sensors, it was necessary to set the following parameters in the meter: (1) type of liquid to monitor, (2) internal and external diameter of the pipe, (3) pipe thickness, (4) pipe material, (5) connection type, and (6) distance between the ultrasonic transducers.

Monitoring Platform
To carry out the data collection, a bespoke monitoring platform was developed. Figure 2 illustrates the main components of the platform. The clamp-on sensors are placed on the water pipe, measuring the water flow. The monitored data were then sent to the local gateway using the Modbus protocol (see https://modbus.org/ (accessed on 1 March 2021)). The data were stored locally before being uploaded to the Internet using the standard HTTPS protocol. Figure 3 shows the block diagram with the gateway's different components. In simple terms, the system worked as follows. The data acquisition software running on the Raspberry Pi (see https://www.raspberrypi.org/ (accessed on 25 January 2021)) took measurements from the ultrasonic flow meter at predefined intervals of S seconds. By default, the value of S was set to five seconds, but it could also be given as an input to the data acquisition algorithm.
The collected measurements were stored on a local database. Every minute, the most up-to-date measurements were uploaded to an online database server for providing near real-time data access to interested third-parties (e.g., an application developer). Furthermore, every day at 12:00 a.m., a Comma Separated Values (CSV) file with the daily readings was uploaded to a shared folder. Upon successful upload, existing records were deleted from the local database to keep its footprint as light as possible at all times.
A Real Time Clock (RTC) was used to keep track of the time in the gateway and provide timestamps to collected measurements. Finally, a 3S lithium battery was used to allow deployments in places without a power connection and to avoid data losses in case of a power outage since the used sensing device did not have internal memory to store instantaneous measurements.
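The acquisition and upload cycle described above can be illustrated with a minimal sketch. It is not the gateway software used in the project: the function names (`acquire`, `export_and_purge`), the SQLite table layout, and the two-column CSV are assumptions made for illustration, and the Modbus read and the upload are passed in as plain callables rather than implemented.

```python
import csv
import io
import sqlite3
import time
from datetime import datetime, timezone
from typing import Callable


def acquire(read_flow: Callable[[], float], db: sqlite3.Connection,
            n_samples: int, interval_s: float = 5.0,
            sleep: Callable[[float], None] = time.sleep) -> None:
    """Poll the meter every `interval_s` seconds and store each reading locally.

    `read_flow` stands in for the Modbus read of the flow meter (hypothetical).
    """
    db.execute("CREATE TABLE IF NOT EXISTS readings (timestamp TEXT, flow_rate REAL)")
    for _ in range(n_samples):
        stamp = datetime.now(timezone.utc).isoformat()
        db.execute("INSERT INTO readings VALUES (?, ?)", (stamp, read_flow()))
        db.commit()
        sleep(interval_s)


def export_and_purge(db: sqlite3.Connection, upload: Callable[[str], bool]) -> bool:
    """Build a CSV of the stored readings and delete them only if the upload succeeds."""
    rows = db.execute("SELECT timestamp, flow_rate FROM readings").fetchall()
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["timestamp", "flow_rate"])
    writer.writerows(rows)
    if not upload(buffer.getvalue()):
        return False  # keep the local records so that nothing is lost
    db.execute("DELETE FROM readings")
    db.commit()
    return True
```

In a deployment along these lines, `export_and_purge` would be scheduled once per day, and a failed upload would simply leave the records in the local database for the next attempt.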

Deployments
The monitoring platform was deployed in three restaurant kitchens for consecutive periods between two (kitchen 1) and four weeks (kitchens 2 and 3). Table 2 summarizes the details of each kitchen. In order to monitor both hot and cold water consumption, two monitoring systems were deployed in each kitchen. Figure 5 depicts two of the three deployments of the platforms. Unfortunately, due to physical constraints with the hot water installation of kitchen number 3, the recorded hot water data were not accurate and were therefore not included in the FIKWater dataset.

Data Labeling
To enrich the dataset's potential applications, FIKWater also contained annotations of the periods when wet appliances (dishwasher and glasswasher) were turned ON or OFF. The annotations were obtained by manually inspecting those appliances' electricity consumption profiles when such data were available. More precisely, a wet appliance was considered ON when consuming energy for more than 15 consecutive minutes. Conversely, it was considered OFF when there was no consumption, or the observed consumption happened for less than 15 min. The threshold of 15 min was set empirically after observing the electricity consumption of the wet-appliances for the dataset's duration.
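The 15-minute rule can be expressed programmatically. The annotations in the dataset were produced by manual inspection, so the sketch below is only an automated approximation of that rule; the function name `on_periods`, the `power_threshold_w` parameter, and the input format (time-ordered pairs of timestamp and power) are assumptions.

```python
from datetime import datetime, timedelta
from typing import List, Sequence, Tuple


def on_periods(samples: Sequence[Tuple[datetime, float]],
               power_threshold_w: float = 0.0,
               min_duration: timedelta = timedelta(minutes=15)
               ) -> List[Tuple[datetime, datetime]]:
    """Return (start, end) periods in which the appliance consumed energy
    for more than `min_duration` without interruption."""
    periods = []
    run_start = None    # first sample of the current run of consumption
    last_active = None  # most recent sample with consumption in that run
    for stamp, power in samples:
        if power > power_threshold_w:
            if run_start is None:
                run_start = stamp
            last_active = stamp
        elif run_start is not None:
            # The run has ended: keep it only if it lasted long enough.
            if last_active - run_start > min_duration:
                periods.append((run_start, last_active))
            run_start = None
    # Handle a run that is still open when the data end.
    if run_start is not None and last_active - run_start > min_duration:
        periods.append((run_start, last_active))
    return periods
```

Runs of consumption shorter than or equal to the threshold are discarded, which mirrors the rule that such observations are treated as OFF.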

Data Description
The FIKWater dataset was made available individually for each monitored kitchen, and all the data files were in CSV format. Figure 6 shows an overview of the underlying organization of the FIKWater dataset. The following subsections describe the contents of the different files.

Demand Data
The water demand files (<?>_demand.csv) contained the measurements taken from the water flow sensors. These measurements were provided in raw form, i.e., as measured by the sensors. The underlying fields of the measurements files are described in Table 3.

Labels Data
The label files (<?>_labels.csv) identified the periods when wet appliances were turned ON or OFF. The underlying fields are described in Table 4. The technical details of the wet appliances are provided in the wet_appliances.txt file.

Table 4. Column descriptions for the label files (<?>_labels.csv).

Column | Description | Units
timestamp | The timestamp when the label was recorded |
mode | If the appliance is ON (0) or OFF (1) | binary
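As an illustration of how the label files might be consumed, the sketch below pairs each ON row with the following OFF row to obtain ON intervals. It assumes that the CSV header uses the column names `timestamp` and `mode`, that timestamps are in an ISO 8601 format, and that a file holds the transitions of a single appliance; none of these details is confirmed here. The function name `on_intervals` is hypothetical, and the value that encodes ON is passed explicitly so that it can be checked against the files.

```python
import csv
import io
from datetime import datetime
from typing import List, Tuple


def on_intervals(label_csv: str, on_value: str) -> List[Tuple[datetime, datetime]]:
    """Pair each ON transition with the next OFF transition.

    `label_csv` is the text of a label file; `on_value` is the string that
    encodes ON in the `mode` column (to be verified against the data).
    """
    intervals = []
    started = None
    for row in csv.DictReader(io.StringIO(label_csv)):
        stamp = datetime.fromisoformat(row["timestamp"])  # assumed ISO 8601
        if row["mode"].strip() == on_value:
            if started is None:
                started = stamp
        elif started is not None:
            intervals.append((started, stamp))
            started = None
    return intervals
```

The resulting intervals can then be used to select the water demand samples recorded while a wet appliance was running.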

Deployments
The deployments file (deployments.csv) contained additional details of each deployment. The underlying fields are described in Table 5. Note that the start and end dates refer to the date of the first and last water consumption measurements in each kitchen, respectively. These dates did not necessarily correspond to the start and end dates in Table 2 since those corresponded to the start and end of the FIK monitoring campaigns.

Table 5. Column descriptions for the deployments file (deployments.csv).

Column | Description | Units
 | Kitchen identifier | number
service | Type of service provided (Breakfast, Lunch, Dinner) | text
area | Area of the kitchen floor | m²
capacity | Maximum number of simultaneous customers | number
has_hot_water | If hot water data are available or not | binary
has_cold_water | If cold water data are available or not | binary
has_labels | If the data contain wet appliance labels or not | binary
start | Date of the first water consumption measurement | datetime
end | Date of the last water consumption measurement | datetime

Data Exploration and Conclusions
The number of monitoring days and records collected in the three kitchens are presented in Table 6. Coverage indicates the ratio between the monitored and the expected number of samples at the rate of one sample every five seconds (1/5 Hz), which, as can be observed, was very high across the three kitchens. Figure 7 shows the distribution of the daily water flows (flow_today) in each of the monitored kitchens. As can be observed from the water flows in kitchens 1 and 2, cold water consumption was much higher than that of hot water. This happened, in part, because most of the activities, like cooking and cleaning (normally happening at the end of the day), used cold water. Hot water, on the other hand, was used mostly for dishwashing.
Furthermore, it was possible to see that the consumption of cold water in kitchen 3 was much higher than that of the other kitchens (nine times higher than kitchen 1 and 19 times higher than kitchen 2). After consultation with the kitchen maintenance services, it was found that the monitored pipes were not fully dedicated to the kitchen operations. As such, a large part of these measurements referred to the consumption of other hotel divisions. This effect was also observable in Figure 8 (bottom), which plots flow_rate at the original sampling rate of 1/5 Hz and shows that water was used continuously over the 24 h.
This clearly contrasts with the measurements from kitchens 1 and 2 (see Figure 9), where it is evident that water was mostly used during specific periods of the day. For example, in kitchen 1, both hot and cold water were used more intensively after 10:00 p.m., which corresponded to the end of the dinner service. Finally, Figure 10 illustrates the total (hot + cold water) flow rate measurements supplemented with the labeled wet appliance transitions. The green dashed lines represent the ON transitions, whereas the red dotted lines represent the OFF transitions. As can be observed, there was also water consumption outside the periods where these appliances were ON, representing water uses for other purposes (e.g., cooking or cleaning) or wet appliances that were not monitored.
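The coverage figure mentioned at the start of this section can be reproduced with a short calculation. The sketch below assumes that the expected number of samples is the monitored duration divided by the five-second sampling interval; the exact procedure behind Table 6 is not spelled out here, and the function name `coverage` and the numbers in the usage example are illustrative only.

```python
from datetime import datetime


def coverage(n_records: int, start: datetime, end: datetime,
             interval_s: float = 5.0) -> float:
    """Ratio between the recorded and the expected number of samples.

    Assumes one expected sample every `interval_s` seconds between
    `start` and `end` (the first and last measurement of a deployment).
    """
    expected = (end - start).total_seconds() / interval_s
    return n_records / expected


# Illustrative values only: one full day at 1/5 Hz yields 17,280 expected samples.
day_start = datetime(2021, 1, 1)
day_end = datetime(2021, 1, 2)
print(f"{coverage(16416, day_start, day_end):.2%}")
```

A value close to 1 indicates that few samples were lost during the deployment.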