Binary Linear Programming as a Tool of Cost Optimization for a Water Supply Operator

: The problem of scheduling pumps is widely discussed in the literature in the context of improving energy efﬁciency, production costs, emissions, and reliability. In some studies, the authors analyze the available case studies and compare the results; others present their own computational methods. In the paper, a problem of pump scheduling in regular everyday operations of a water supply operator is considered. The issues of water production optimization and energy savings are part of the topic of sustainable development. The objective of the article is the minimization of the cost of electric power used by the pumps supplying water. It is achieved thanks to the variability of both the demand for water and the price of electric power during the day combined with the possibility of storing water. The formulation of an existing electric power cost optimization problem as a binary linear programming problem was improved. An essential extension of the above mathematical model, which enables more ﬂexible management of the pump system, was also proposed. An example containing real-world input data was successfully solved using Microsoft Excel with a free OpenSolver add-in.


Introduction
Supplying high-quality water at a competitive cost is a major challenge for water utilities worldwide, especially with ever-increasing water quality standards and energy prices [1,2]. Water supply systems are among the most important parts of infrastructure necessary to provide suitable quality of life for human beings. Those systems must comply with many requirements of various kinds. They must meet the demand of customers, which varies over time, be prepared for emergency increases of the demand, and provide the reliability of their (performed non-stop) operations. Moreover, operators of such systems should care about the financial aspects of their activities even if they are public-owned companies or they are subsidized because of their importance to society. In addition, minimizing costs is a part of positive action for sustainable development. For reasons of environmental responsibility, enterprises try to reuse, refurbish, and recycle used products to reduce the negative impact on the environment [3][4][5][6]. There are many kinds of costs connected with water supplies. Among them, costs of electric power make an important share of operating costs because of using many electric-powered pumps necessary to bring water from its source or sources to customers [7][8][9]. In order to optimize operational costs, smart control systems, which govern the operations of all elements in water supply systems, were developed [10][11][12][13]. In order to optimize the operation of a water pumping station itself, it is necessary to create a pump schedule for some period of time (e.g., 24 h) that states when a given pump should be turned on and, if it is controllable, with what efficiency it should work (full capacity or below). Correct optimization requires defining many necessary constraints, including taking into account the predicted demand for water, which varies throughout the day, capacities of reservoir tanks and minimal volumes of

Materials and Methods
The water supply system under consideration is that of a water supply operator based in a town with a population of about 25,000 inhabitants, located in Eastern Poland. The main parts of the system are wells, pumps, a reservoir tank, and the distribution pipeline network. Below there is a brief description of the most essential rules of functioning and parameters of the system. Along with proceeding with the description, a symbolic notation for parameters will be introduced.
Supplied water is groundwater pumped from 7 wells. The capacities and values of the electric power of the pumps are presented in Table 1. Water is pumped from the wells to a single reservoir tank with the capacity of V max = 1500 m 3 (the maximal volume of stored water). The demand for water varies over time. The outflow of water from the reservoir tank via the distribution network to customers is a continuous process. For practical reasons, predictions of demand are made for 24 one-hour timeslots. The accuracy of those predictions obviously requires using statistical methods. However, predicting the demand for water is not considered in this paper, and predicted values are just fixed-valued parameters of an optimization model without taking into account uncertainty.
Controlling the pumps must obey the following requirements. The pumps can operate with their nominal capacities only, and the amount of water pumped by any pump depends on the time of operation only. Each pump must operate for at least one hour per day. Additionally, at least one well and the pump integrated with it must be kept as a reserve at any moment of the day. The water inside the tank should be replaced at least once per day. During standard operational conditions, the volume of water in the reservoir tank cannot be less than V min = 523.5 m 3 . It is the firefighting reserve, which is kept in order to satisfy an extra demand when a fire is extinguished by using water supplied from hydrants.
In addition to the abovementioned technical and safety conditions, there is also an economical aspect of controlling the water intake system. Namely, the supplier of electric power does not use the same rate per MWh in its pricing policy all day long. Instead, it uses three tariff levels (see Table 2). Because prices of electric power may change relatively often to compare with technical parameters of the water supply system, which are constant for years, data used in the example calculations are the following (as of August 2015, PLN stands for Polish zlotys): Various prices of electric power make efficient controlling of the pumps much more difficult because the requirements resulting from the demand levels as well as technical and safety conditions should be satisfied at the lowest cost.
As it has already been mentioned above, a basically continuous process of water distribution is approximately described in a discrete form, namely by specifying 24 predicted values of the demand for 24 one-hour times slots. Demand predictions vary depending on the day of the week, holidays, and month/season of the year. So, the example numerical data of the demand values for 24 timeslots of just one specific day are presented in Table 3 (along with corresponding electric power prices). Analysis of the data presented so far shows that in each timeslot, each pump can either be used or not. Obviously, the time of operating each pump (expressed in hours) in a one-hour timeslot can be any non-negative value not larger than 1. However, for simplicity of the model, we assume that each pump either operates one hour during each timeslot or does not operate at all. The above approach suggests that the decision whether to use the pump j in the timeslot i can be described as a logical value (TRUE or FALSE) or numerically as a binary variable x ij . The binary variable is equal to 1 if the pump j is used in the timeslot i and is equal to 0 if it is not used. Because there are 24 timeslots and 7 pumps, then there are 24 × 7 = 168 different pairs of timeslot-pumps, and this is also the number of variables. To each of those pairs, a cost c ij of operating the pump j in the timeslot i is also assigned. Such cost parameters will be used in the final version of the cost optimization model. They are calculated from parameters already described in the following way where 1000 (kW/MW) is a coefficient used to change power units from kilowatts to megawatts and 1 stands for one hour (the duration of operation of the pump, which is equal to the duration of the timeslot; the coefficient is obviously not necessary from the mathematical point of view but it is added for compliance of units). For example, which means that using pump 2 during time slot 1 costs 6.253 PLN.
In order to create a cost optimization model, one more parameter is required. It is an initial volume of water in the reservoir tank at midnight, which is denoted by V init . In the example, it is equal to 550 m 3 .
In case of restrictions in power supplies, limits on maximal power usage below the contracted values may be applied to some users. Then, optional parameters L i (maximal power of all the pumps operating during the timeslot i) are introduced.
Finally, the following simple mathematical model-a binary linear programming problem-can be formulated. Its solution is an optimal schedule of the pump operation for 24 h. Variables • x ij binary-if equal to 1, the pump j is operating during the timeslot i; if equal to 0, the pump j is idle during the timeslot i (i = 1, 2, . . . , 24; j = 1, 2, . . . , 7) Auxiliary variables Optimization problem The total cost of electric power used by all the pumps operating during all the timeslots all the day is minimized.
Subject to the constraints x ij binary, i = 1, 2, . . . , 24; j = 1, 2, . . . , 7 The group of constraints (1) means that at most, six pumps can operate simultaneously during the same timeslot (at least one pump must be idle in the timeslot as a backup for failure incidents). The group of constraints (2) stands for the requirement that each pump must be used at least one hour per day, which is equivalent to operating at least one timeslot per day.
The group of constraints (3) is a recursive formulation of the volume of water in the reservoir tank at the end of the timeslot i. In detail, it is the sum of the volume of water pumped by all the operating pumps in the timeslot i and the volume of water left after the end of the previous timeslot (timeslot i − 1) minus the predicted demand for the timeslot i. For i = 1, the initial volume V init (a constant parameter) is used as the volume of water at the end instead of the analogical value for the previous timeslot.
The group of constraints (4) states the condition that the volume of water cannot be smaller than the minimal value resulting from the level of firefighting reserve and larger than the capacity of the reservoir tank.
The group of constraints (5) is a formal statement of the fact that x ij are binary variables. Those constraints are mandatory and cannot be omitted. Whereas there are linear programming problems (usually network optimization and assignment problems) in which forcing variables to be integers or, in special cases, binary is not necessary since the existence of integer/binary problems is guaranteed by the mathematical structure of the problem (see [42]), unfortunately, the considered problem does not belong to that class.
The group of constraints (6) is the only optional group in the model. It describes handling the situation of imposing power usage limits in case of limited power supplies.
To summarize, the above model of pump scheduling contains 168 variables, a linear objective function, 168 binary constraints imposed on those variables, 79 mandatory inequality linear constraints resulting from (1), (2), and (4), as well as 0 to 24 optional inequality linear constraint resulting from (6). The auxiliary variables v i are not actual variables since they are fully dependent on true variables x ij via the group of constraints (3). The reason for introducing both v i and the group of constraints (3) is the simplicity of notation that can be easily implemented in a spreadsheet.
The next step is to extend the considered model by removing a restriction of the 0-1 pump operation scheme (any pump operates during the full timeslot or it does not operate at all).
The extended model is described below. All the parameters defined earlier for the simple model remain valid, and no new parameters are required. A new group of variables is introduced. Variables • x ij binary-if equal to 1, the pump j is operating during the timeslot i (during the entire timeslot, i.e., one hour or shorter); if equal to 0, the pump j is idle during the timeslot i (i = 1, 2, . . . , 24; j = 1, 2, . . . , 7) • y ij -a number from the interval [0, 1] denoting a fraction of one hour, i.e., the time during the timeslot i when the pump j operating; if equal to 0, the pump j is idle during the timeslot i (i = 1, 2, . . . , 24; j = 1, 2, . . . , 7) The entire model now changes to the following The total cost of electric power used by all the pumps operating during all the timeslots all the day (the entire timeslots-y ij = 1 or parts of the timeslots-y ij < 1) is minimized.
The group of constraints (8) stands like (2) for the requirement that each pump must be used at least one hour per day. However, these constraints are no longer equivalent to operate during at least one timeslot per day but possibly to operate in shorter periods whose total duration is at least one hour.
The group of constraints (9) is the group of constraints (3) adjusted to calculate correctly the amount of water that is pumped if any pump is not operating during all the timeslot but shorter (less than one hour).
The group of constraints (13) and (14) force that each y ij is the actual time of operating the pump j during the timeslot i can be any value from 0 to 1 if the x ij is equal to 1 and is forced to be 0 otherwise.
To summarize, the extended model of pump scheduling contains 336 variables in two groups, a linear objective function of 168 variables from the second group, 168 binary constraints imposed on the variables from the first group, 415 mandatory inequality linear constraints resulting from (7), (8), (10), (13) and (14) as well as 0 to 24 optional inequality linear constraint resulting from (12). The auxiliary variables v i are used for the same reason as in the simple model, but they are fully dependent on true variables y ij , not x ij via the group of constraints (9). The first simple model is based on the model presented in [41], but it uses clearer notation, especially in the case of the group of constraints (3). It was improved by adding optional constraints for maximal power values used by all the pumps simultaneously. The model in [41] was implemented in MATLAB for purposes of numerical solving. In this paper, optimization features of Microsoft Excel were used instead, namely a free optimization add-in OpenSolver (built-in solver turned out to fail). The extended model is implemented in Excel, too. The choice of Excel to perform the implementation of the model is motivated by the convenience of potential users of that optimization tool: employees of the water supply operator. They are more likely to be familiar with Excel than with MATLAB, but there is one more reason for the above choice, connected with the cost of software licenses. Whereas MATLAB is advanced but very expensive software for scientific and engineering calculations, Microsoft Excel is cheaper and widely available in many companies as a part of the Microsoft Office package. This means that if a water supply operator has at least one Microsoft Office license, it can use the presented implementation at no extra software cost, whereas the time of calculating an optimal solution is acceptably short. It is worth mentioning that the implementation of both models, as shown in Appendices A and B, respectively, can be considered ready-to-run. Once an Excel file with any of the two models is created, it can easily be reused by entering 25 new values into the 25 cells (24 predictions of water consumption for the 24 timeslots and V init -the volume of water in the reservoir tank at midnight) and calculating a new pump schedule for the next day in a few seconds.

Results
The calculations were performed in Microsoft Excel 2019 with OpenSolver 2.9.0 (the latest stable version as of December 2020) add-on on an HP 250 G6 laptop with Intel Core i3-6006U 2.0 GHz processor, 8 GB RAM running on Windows 10 64 bit. The default solver engine used by OpenSolver was COIN OR CBC (linear solver).
First, the simple model was tested. Because the input data for the numerical example were identical with those in [41], the minimal cost of the electrical power used calculated for the first model was also identical and equal to 81.97 PLN (it was 81.865 rounded to 2 decimal places because 0.01 PLN is the minimal monetary unit in circulation).
The time of calculation was about 6 s. The details of the Excel implementation of the first simple model are presented in Appendix A.
The solution for the simple model obtained by Excel with OpenSolver is shown in Table 4.  (1) Usage of pumps is not scheduled at all during peak timeslots (with medium and high energy prices). (2) Pump 7 is scheduled to be used during all the timeslots with non-zero usage of pumps.
It results from the fact that it has the largest capacity-to-power ratio, i.e., its capacity in m 3 /h divided by power in kW is the largest (5.4545 m 3 /h per kW). Because the cost of the power supply is the same for all the pumps, pump 7 is the most cost-efficient. (3) The largest usage of pumps (the number of pumps used) is scheduled right after the morning peak (peak refers to the level of power prices), i.e., between 13:00 and 14:00 (timeslot 14), when four pumps are scheduled to work simultaneously. It is caused by the coincidence between peaks in water usage and electric power usage (the latter peak is reflected by the price level 2). Next, the numerical example was applied to the extended model. The minimal cost of the electrical power used was slightly lower, at -81.75 PLN (it was 81.74582107 before rounding).
The time of calculation was between 1 and 2 s. The details of the Excel implementation of the problem are presented in Appendix B.
The solution of the extended problem obtained by Excel with OpenSolver is shown in Table 5.  Conclusions resulting from the optimal solution are the same as in the simple model case, but also some new ones can be added: (4) The total cost turned out to be slightly lower (0.27%) than in the case of the simple model. It cannot be larger for any valid input data because the simple model is the worst case of the extended model. As to the technical aspect of calculations, it is obvious that a built-in solver cannot be considered as an appropriate tool for optimization of the pump schedule. Tests showed that the time of calculation of several minutes for the simple model is basically acceptable, but since an integer optimization is involved, even a slight change to some parameters may make it longer. Moreover, a built-in solver works near its computational limits; it can handle at most 200 variables and at most 100 formula-based constraints, i.e., different from the lower limit/upper limit/integer constraint imposed on a variable (see [43]). The abovementioned limits also make a built-in solver useless in the optimization of the extended model. For the above reasons, OpenSolver should be used instead. It can be used with Excel 2007 or later. It has no limitations of the numbers of variables and constraints imposed in Excel Solver and seems to be much faster in solving integer programming problems. What is also important, it is free (open source) software (developed in the Department of Engineering Science at the University of Auckland, New Zealand [44]). This is why using OpenSolver instead of Excel Solver will boost computational efficiency without extra cost.

Discussion
The safe and efficient operation of water supply systems is crucial, where digital tools, such as monitoring, hydro-informatics, and optimization algorithms, are key approaches that can play an important role in support decisions [45]. Turci, Wang, and Brahmia, based on traditional multi-population-based nature-inspired optimization algorithms, such as genetic algorithm (GA), ant colony optimization (ACO), and particle swarm optimization (PSO), adapted and improved the models to fit the complex constraints and characteristics of the system [21]. Cimorelli and others also investigated genetic algorithms to approximate solutions to optimal pump scheduling problems (OPS) [22]. Hyung, Seo Kim, and Koo used a genetic algorithm, a representative optimization technique, to develop an optimal pump operation method that enables the supply of stable water quantity and safe water quality. To propose the optimal pump operation method, the objective of minimizing the costs incurred from water intake to supply was considered, and the pump switching, the water level range of distribution reservoirs and clearwell, which should be considered in actual pump operation, were set as constraints [46]. Oh, Eom and Kim focused on pump scheduling optimization using sensors of water temperature and real-time pump efficiency monitoring. The hybrid hydraulic and thermodynamic calculation approach could provide flow rate and pump efficiency for individual pumps [25]. Muhammed and Farmani presented a fully dynamic tool that can handle four different types of fixed-speed pump schedule representations (on and off, time control, time-length control, and simple control (water levels in tanks)). The PST (Pump Scheduling Tool) has been developed using Visual Basic programming language and has a linkage between the EPANET hydraulic solver with the GANetXL optimization algorithm [26].
A successful formulation of the pump scheduling problem and its implementation in easily available optimization software obviously does not exclude further improvements and extensions. It might be, for example, modeling systems in which pumps have adjustable capacities (they can work below their nominal power) or taking into account operating below the target capacity when the pump starts and stops.

Conclusions
Water supply systems have a significant environmental and energetic impact due to the large amount of energy consumed in water pumping and water losses. Pump scheduling in water distribution networks is important, especially with current high energy prices. Water distribution management system is a costly practice, and with the growth of the population, the needs for creating more cost-effective solutions are vital. The presented problem shows that applying mathematical optimization methods in real-world management confronts one very important issue: those methods are usually useless for practitioners without the availability of appropriate IT support. IT support is understood here as a combination of software and hardware that allows for performing necessary calculations. Obviously, using any IT solutions cannot be conducted at no cost. Future benefits resulting from IT-supported mathematical optimization must take into account IT costs. Before proper optimization occurs, IT-related cost optimization must be conducted first. That is why implementation of a mathematical problem in the world of IT may be almost as important as a formulation of a real-world decision problem in the language of mathematics. If the optimization problem is solvable just by using hardware and software (possibly including using some new but free software) already possessed by a decisionmaker, it is more advantageous than if some extra IT purchase is needed. In this paper, the goal of optimization without unnecessary IT costs in case of a pump scheduling problem of a water supply operator was achieved.
Author Contributions: Conceptualization, P.K. and M.R.; methodology, P.K.; software, P.K.; validation, P.K. and M.R.; formal analysis, P.K. and M.R.; investigation, P.K.; resources, P.K.; data curation, P.K. and M.R.; writing-original draft preparation, P.K.; writing-review and editing, P.K. and M.R.; visualization, P.K.; supervision, P.K. and M.R.; project administration, P.K.; funding acquisition, P.K. and M.R. All authors have read and agreed to the published version of the manuscript. Acknowledgments: Edward Kozłowski of the Lublin University of Technology, Lublin, Poland as a co-author of the cost optimization model that was an inspiration to create this paper, provided many valuable remarks about the technical aspects of the modeled decision problem.

Conflicts of Interest:
The authors declare no conflict of interest.

Appendix B
The relationship between the mathematical model and its Excel implementation is described below.
Variables  -B5:J28 (initially they can be left empty)  -P5:V28 (initially they can be left empty) Auxiliary variables (as mentioned in Section 2, they are not actual variables but rather symbols to simplify the notation of the problem, this is why they are implemented as formulas)  -M5: =SUMPRODUCT(P5:V5,D$2:J$2)+N1-B5 Figure A1. The problem definition (the simple model) in the OpenSolver main window (OpenSolver-Model).

Appendix B
The relationship between the mathematical model and its Excel implementation is described below. Variables • x ij -B5:J28 (initially they can be left empty) • y ij -P5:V28 (initially they can be left empty) Auxiliary variables (as mentioned in Section 2, they are not actual variables but rather symbols to simplify the notation of the problem, this is why they are implemented as formulas) • ∑ 24 i=1 y ij ≥ 1; j = 2, 3, . . . , 7-obtained by copying P30 to Q30:V30 Parameters and the formula for the objective function are located exactly in the same cells as in Appendix A.