Analysis of Water Consumption in Fruit and Vegetable Processing Plants with the Use of Artificial Intelligence

Abstract: Fruit and vegetable processing has a significant impact on the environment because it consumes a large amount of water. Water consumption depends mainly on the type of production and the technology used. Water in fruit and vegetable processing plants is used as a raw material, as an energy carrier, and in hydro-transport, as well as for washing raw materials and maintaining production hygiene. The variety of technological operations carried out in the production process and the seasonality of production make it difficult to objectively assess the use of water in fruit and vegetable processing plants. The few available publications in this field provide numerical values of unit water consumption indices, and none examines the cause-and-effect relationships of water use in plants in this industry. The aim of this study was to analyze the research to date and to verify the following research hypothesis: the structure of processing and the relationship between the weights of individual products have an impact on water consumption in fruit and vegetable processing plants. For this purpose, neural models of water consumption were developed for the largest agri-food processing plants in Poland that use similar technology. Water consumption was then optimized with respect to the processing structure using genetic algorithms. The results confirmed the hypothesis that production structure has a significant impact on the rationalization of water consumption. The optimization results show that the production of concentrates, juices, and drinks has the greatest impact on water consumption. The lowest water consumption will be achieved when the production of concentrates is at a 2 to 1 ratio to the production of juices and drinks.


Introduction
Most of the Polish businesses operating in the fruit and vegetable industry are small and medium enterprises with fewer than 250 employees. Large enterprises account for only about 15% of businesses, but they hold a major share of the quantity of fruit and vegetables processed in Poland. Large enterprises produce several types of products, while small enterprises are predominantly oriented towards one product. Typical industry products include fruit juices and drinks, fruit and vegetable concentrates, and frozen fruit and vegetables. In fruit and vegetable industry plants, water is both a raw material and an energy carrier; it is used in hydro-transport, for raw material washing, and for the maintenance of production hygiene [1]. The diversity of the technological operations performed in the production process, as well as the seasonality of production, makes the objective assessment of water consumption in fruit and vegetable processing plants difficult [2]. The few available publications on this subject specify the numerical values of unit consumption indices without any in-depth analysis of the cause-and-effect relationships in the use of water in the facilities of this industry.
Artificial neural networks are used in the study of biosystems in which it is difficult to formally describe the analyzed parameters. The literature contains many examples of this use of artificial neural networks, for example, to forecast corn and soybean yields [3], the stage of development of sweet pepper fruits [4], fuel consumption in vehicles [5], or river flow [6]. For predicting corn and soybean yields, the authors of [3] proposed a new convolutional neural network model called YieldNet, a deep learning framework that applies transfer learning between corn and soybean yield predictions by sharing the weights of the backbone feature extractor.
Similarly, in [4], an ensemble model of convolutional neural network (CNN) and multilayer perceptron (MLP) models was developed to detect sweet pepper fruits in images and predict their development stages. In the forecasting of fuel consumption in vehicles [5], the adopted research methodology was based on the use of MLP artificial neural networks and sensitivity analysis. River flow is very difficult to predict using physically based models and conventional regression-based methods because of the nonlinear and fuzzy nature of hydrological activity and the scarcity of relevant data. These traditional methods cannot handle the complex nonlinear and non-stationary process of water flow. Thus, the aim of [6] was to develop a hybrid artificial intelligence model, namely, a genetic-algorithm-based artificial neural network (GA-ANN), for monthly water flow prediction in the Mahanadi river system. All parameters associated with the artificial neural network (ANN) model were optimized simultaneously and automatically using the genetic algorithm (GA). In this study, it was also decided to use artificial neural networks and genetic algorithms to analyze water consumption in fruit and vegetable processing plants.
The objective of this paper was to elaborate on existing research and verify the following research hypothesis: processing structure and the relationship between the weights of specific products have an impact on water consumption in fruit and vegetable processing plants. For this purpose, neural water consumption models were developed from the largest agricultural and food processing plants in Poland that employ similar technology. Subsequently, water consumption was optimized, using genetic algorithms, in relation to processing structure. The analysis of the existing research on water consumption in fruit and vegetable companies and the results of optimization may constitute guidelines for rationalization in water management.

The Literature Review
The results of the tests of selected indicators of specific water consumption W_w are given in Table 1. The presented indices vary widely, which results from both the differences in the technical equipment of the plants and the diversity of the methods used to determine the indices [20]. The indices may serve as data for the design of new plants or for estimating the future demand for water. The plant unit water consumption indices W_w contribute the most information. Admittedly, the literature partially specifies the variability ranges of plant unit water consumption indices but does not list many factors that could affect their numerical values. The methodological basis for this area can be found in publications [21,22]. Publication [22] presents the general characteristics of sixteen production plants. Daily water consumption in the summer-autumn period ranged from 55.2 to 5431.8 m³. Unit water consumption in the analyzed plants differed more than sevenfold.
The following questions can be posed in this context:
• How can the variability of water consumption by plants in this industry be explained?
• How can the optimum water consumption in a given plant be determined?
In order to respond to the first question, the determining factors and the strength of their impact on water consumption A_w and, subsequently, on the values of unit consumption indices W_w, must be determined. The authors of publication [22] presented exemplary groups of factors (independent variables) that affect the consumption of water, covering the scope of the plant index (Table 2). The following formula was adopted to explain the dependence of y on independent variables (being the actual parameters observed in practice or their functions): where y is the dependent variable (A_w or W_w) and x represents the independent variables (presented in Table 3).
Table 3. Factors affecting water consumption variability in the fruit and vegetable industry plants.

Table 3 columns: Multiple Regression Equations; R^2; Independent Variables (Determination, Dimension, Numerical Range). Surviving entry: 10,008-572,645.
In Table 3, A_w is the daily water consumption (m³); K_2 = V_2 Z^-1 is the total cubic volume of the plant per 1000 kg (1 Mg) of the raw material processed in a day (m³/Mg); P_1 is the power of electrical equipment installed in the plant boiler room, pump room, and water treatment station (kW); R^2 is the coefficient of determination; V_1 is the cubic volume of the production rooms of the plant (m³); V_2 is the total cubic volume of the plant (m³); W_w = A_w Z^-1 is the plant unit water consumption index (m³/Mg of raw material); Z is the daily raw material processing throughput (Mg); Z_1 is the daily production of frozen vegetables (Mg); Z_2 is the daily production of fruit concentrates (Mg); and Z_3 is the daily production of drinks (Mg).
Factor groups I-III each explained between 39% and 54% of the variability in daily water consumption A_w. Admittedly, daily water consumption A_w varied widely among the analyzed plants, but 54% of its variability was attributable to the power of the electrical devices installed in the plant boiler room, pump room, and water treatment station (P_1). The production structure expressed as Z_1 (production of frozen vegetables), Z_2 (production of fruit and vegetable concentrates), and Z_3 (production of drinks), covered by group II, accounted for about 45% of the variability in daily water consumption A_w. The application of factor group IV is a source of information on the total impact of technical and technological factors, the degree of automation of the production operations, and organizational-production factors on water consumption. Of the variability of the unit water consumption W_w, 84.3% was attributable to factor K_2 (total cubic volume per 1 Mg of raw material processed in a day). The obtained equations apply only within the ranges of the specific independent variables, the numerical values of which are presented in Table 3.
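The quantities discussed above can be illustrated with a short sketch. It computes the unit index W_w = A_w / Z, the factor K_2 = V_2 / Z, and the coefficient of determination of a one-variable least-squares line. The function names and all numbers in the usage lines are hypothetical and are not taken from the studied plants; the regression equations of Table 3 are not reproduced here.

```python
from statistics import mean


def unit_water_index(a_w, z):
    """W_w = A_w / Z: water used (m3) per Mg of raw material processed."""
    if z <= 0:
        raise ValueError("daily throughput Z must be positive")
    return a_w / z


def volume_per_throughput(v2, z):
    """K_2 = V_2 / Z: total plant cubic volume (m3) per Mg processed in a day."""
    if z <= 0:
        raise ValueError("daily throughput Z must be positive")
    return v2 / z


def r_squared(x, y):
    """Coefficient of determination of the least-squares line y = b0 + b1*x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = my - b1 * mx
    ss_res = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Hypothetical plant: 500 m3 of water and 100 Mg of raw material per day.
w_w = unit_water_index(500.0, 100.0)
# Hypothetical (power, water) pairs that lie exactly on a line.
r2 = r_squared([100.0, 200.0, 300.0, 400.0], [60.0, 110.0, 160.0, 210.0])
```

An R^2 of 0.54 for a factor such as P_1 would mean that a fitted line of this kind leaves 46% of the variance in A_w unexplained.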
The reasons for excess water consumption usually include:
• Lack of fully closed circuits of process water (e.g., from washing) and (mainly) cooling water [23];
• Ineffective recovery of the condensate;
• Lack of water consumption optimization in the washing processes (both automatic and manual);
• Leakage of pipelines, valves, and machinery;
• Lack of full supervision over water consumption in specific technological processes.
In the ongoing operation of the production plants, the factor groups I, II, and IV specified in Table 1 have a minor impact on water consumption rationalization. The main impact on rationalization of water consumption is exerted by factor group III. Reduction of unit water consumption must be achieved by applying the best available techniques in water management [24]. Rationalization of water consumption in fruit and vegetable industry plants is possible mostly for entities that monitor their water use. The research results presented in Table 3 only partially explain the impact of the processing structure (factor group III) on water consumption. The regression coefficients only determine the degree to which a unit change in the specific parameters Z_1, Z_2, and Z_3 affects daily water consumption A_w.
It may be added that 270 m³ of wastewater with a highly variable composition is drained daily from an exemplary refrigeration station [19]. On the other hand, a fruit and vegetable processing plant operating year-round produces from 3630 to 4540 m³ of wastewater daily. This information is significant in determining the pollution load and the impact of the fruit and vegetable plants on the environment, taking into account the value of five-day biochemical oxygen demand (BOD_5). Observations in the analyzed plants showed that increasing the use of condensate (water from the concentration of fruit juices) offers substantial water-saving potential. Lipowski [25] presented water consumption reduction options in the production of meat and vegetable preserves from 1.5 to 7.5 m³/Mg. A review of water consumption reduction options in vegetable blanching is presented in [26].

Neural Model of Water Consumption
Water consumption tests were carried out in large processing plants in the fruit and vegetable industry that used similar technologies and diversified processing products. For the construction of neural models, 634 data sets of water consumption in fruit and vegetable processing plants were used; all were normalized to the range [0,1] by dividing them by their maximum values. The purpose of modeling was to prepare the objective function for optimization.
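The normalization step can be sketched as follows: each variable (column) is divided by its own maximum, so non-negative data end up in [0,1]. The helper names and the two-row sample are hypothetical; the original 634 records are not reproduced.

```python
def normalize_by_max(rows):
    """Divide every column by its maximum; return scaled rows and the maxima."""
    maxima = [max(col) for col in zip(*rows)]
    if any(m <= 0 for m in maxima):
        raise ValueError("each column needs a positive maximum")
    scaled = [[value / m for value, m in zip(row, maxima)] for row in rows]
    return scaled, maxima


def denormalize(value, maximum):
    """Map a normalized value back to physical units."""
    return value * maximum


# Hypothetical sample: two records of (concentrates in Mg, installed power in kW).
scaled, maxima = normalize_by_max([[2.0, 10.0], [4.0, 5.0]])
```

Keeping the maxima alongside the scaled data matters because the model output must later be multiplied by the output maximum to recover water consumption in m³.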
In order to select the optimal structure of the neural network, 35 numerical simulations (Table 4) were carried out in the MATLAB program [27]. The parameters varied were the division of the data into training, validation, and test sets (column 2); the activation functions in the hidden and output layers (columns 3 and 5); and the number of neurons in the hidden layer (column 4). The assessment criteria for the conducted numerical simulations were the smallest mean squared error (MSE), the correlation coefficient R for the validation set, and the R-adjusted value (columns 6, 7, and 8). All tested models were implemented in the MATLAB neural network toolbox and trained with the Levenberg-Marquardt learning algorithm. The conducted simulations show that neural network no. 24 demonstrated the smallest mean squared error (MSE = 0.00411) and the highest correlation coefficient (R = 0.95967) (see Table 4). In this case, the samples were randomly divided into the following sets: 80% of samples for training, 10% for validation, and 10% for testing. Figure 1 shows ANN regression plots between outputs and target samples. The R values in each case are greater than 91%. Therefore, the fit is reasonably good for all data sets. Additionally, R for the water consumption was estimated as 0.90774 for the training, 0.88705 for the testing, and 0.95967 for the validation data sets. Figure 2 presents the optimal structure of the neural network. The neural network consisted of three layers (input, hidden, output). The input layer contained six decision variables: frozen products (x_1), concentrates (x_2), juices and drinks (x_3), other products (x_4), total power (x_5), and total production (x_6). Water consumption (y) was in the output layer. As shown in Figure 2, six neurons were located in the hidden layer. The activation function in both the hidden and output layers was the "log-sigmoid" function.
Therefore, the topology with six inputs, one hidden layer of six neurons, and one output (6-6-1) was applied for predicting water consumption. For the created model, a sensitivity analysis was performed to obtain the importance of the individual input variables. A given input variable is assumed to be more important the greater the increase in the error value caused by its removal (Table 5).
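A minimal sketch of a 6-6-1 log-sigmoid network and of a removal-based sensitivity analysis is given below. It rests on two assumptions that are not taken from the paper: the weights are random placeholders rather than the trained values, and "removing" an input is emulated by replacing it with its mean over the data set, which is one common convention.

```python
import math
import random


def logsig(z):
    """Log-sigmoid activation, mapping any real z into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, params):
    """Forward pass of a 6-6-1 network with log-sigmoid in both layers."""
    w_hidden, b_hidden, w_out, b_out = params
    hidden = [logsig(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w_hidden, b_hidden)]
    return logsig(sum(w * h for w, h in zip(w_out, hidden)) + b_out)


def mse(inputs, targets, params):
    """Mean squared error of the network over a data set."""
    return sum((forward(x, params) - t) ** 2
               for x, t in zip(inputs, targets)) / len(inputs)


def error_increase_per_input(inputs, targets, params):
    """Increase in MSE when each input in turn is replaced by its mean."""
    base = mse(inputs, targets, params)
    increases = []
    for j in range(len(inputs[0])):
        col_mean = sum(row[j] for row in inputs) / len(inputs)
        neutralised = [row[:j] + [col_mean] + row[j + 1:] for row in inputs]
        increases.append(mse(neutralised, targets, params) - base)
    return increases


# Placeholder weights and synthetic normalized data (assumptions, not paper values).
rng = random.Random(42)
params = (
    [[rng.uniform(-2, 2) for _ in range(6)] for _ in range(6)],  # hidden weights
    [rng.uniform(-1, 1) for _ in range(6)],                      # hidden biases
    [rng.uniform(-2, 2) for _ in range(6)],                      # output weights
    rng.uniform(-1, 1),                                          # output bias
)
inputs = [[rng.random() for _ in range(6)] for _ in range(50)]
targets = [forward(x, params) for x in inputs]  # targets the model fits exactly
increases = error_increase_per_input(inputs, targets, params)
```

Ranking the inputs by the size of the error increase gives an importance ordering of the kind a sensitivity table reports.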

Optimizing Water Consumption
Based on the neural modeling results, the objective function was formulated as Dependences (2)-(8):

$$\min y = \frac{1}{1 + e^{-(5.9157 F_1 - 6.3083 F_2 + 4.5594 F_3 + 4.0588 F_4 - 4.2846 F_5 - 6.8515 F_6 - 3.2504)}} \qquad (2)$$

where the numeric constants (weights and biases) from Equations (1)-(6) were derived from the neural network structure. The input data x_1, x_2, x_3, x_4, x_5, and x_6 were normalized by dividing them by 282, 773, 312, 304, 14054, and 778, respectively. The output data y were multiplied by 7316. The purpose of optimization was to minimize water consumption y, Equations (2)-(8), with respect to the decision variables x_1-x_6 within their lower and upper limits. The limitations were Dependences (9)-(15):

0.585 < x_2 < 773, 0.675 < x_3 < 312, 0.44 < x_4 < 304, 946 < x_5 < 14054, 1.63 < x_6 < 778.

Genetic algorithms are an optimization method modeled on the natural processes of evolution. The main genetic operators are crossover and mutation; together with selection, they determine which individuals pass to the next generation (iteration). The main steps of the genetic algorithm are presented in Figure 3. These are selection, calculation of the objective function, and application of genetic operators (crossover and mutation). After each iteration, all steps are repeated until the best solution (chromosome) is obtained.
Optimization was carried out using the genetic algorithm in the "Optimization" tool provided by the MATLAB software. Table 6 presents the genetic algorithm parameters adopted for optimization. The population size, crossover probability, mutation probability, and number of generations were determined by trial and error and on the basis of literature data [28]. The objective function assumes the lowest value for the decision variables presented in Table 7.
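The optimization loop can be sketched in a few lines. The output-layer weights, the scaling constants, and the limits for x_2-x_6 below are those quoted in the text. Everything else is an assumption made for illustration: the hidden-layer weights behind F_1-F_6 are random placeholders, the lower limit for x_1 is set to 0, and the population size, probabilities, tournament selection, uniform crossover, and elitism are hypothetical choices rather than the MATLAB settings of Table 6.

```python
import math
import random

SCALES = [282.0, 773.0, 312.0, 304.0, 14054.0, 778.0]   # input divisors from the text
LOWER = [0.0, 0.585, 0.675, 0.44, 946.0, 1.63]          # x_1 lower limit is assumed
UPPER = SCALES[:]                                        # upper limits equal the divisors
OUT_W = [5.9157, -6.3083, 4.5594, 4.0588, -4.2846, -6.8515]  # Equation (2)
OUT_B = -3.2504
Y_SCALE = 7316.0

_rng = random.Random(1)  # placeholder hidden layer, NOT the trained weights
HID_W = [[_rng.uniform(-1, 1) for _ in range(6)] for _ in range(6)]
HID_B = [_rng.uniform(-1, 1) for _ in range(6)]


def logsig(z):
    return 1.0 / (1.0 + math.exp(-z))


def water_consumption(x):
    """Objective: network output rescaled to physical units."""
    xn = [xi / s for xi, s in zip(x, SCALES)]
    f = [logsig(sum(w * v for w, v in zip(row, xn)) + b)
         for row, b in zip(HID_W, HID_B)]
    return Y_SCALE * logsig(sum(w * fi for w, fi in zip(OUT_W, f)) + OUT_B)


def genetic_minimize(objective, lower, upper, pop_size=40, generations=60,
                     p_cross=0.8, p_mut=0.1, seed=0):
    """Minimize objective within box limits; returns best x, value, history."""
    rng = random.Random(seed)
    pop = [[rng.uniform(lo, hi) for lo, hi in zip(lower, upper)]
           for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        scores = [objective(ind) for ind in pop]
        best_i = min(range(pop_size), key=scores.__getitem__)
        history.append(scores[best_i])
        new_pop = [pop[best_i][:]]  # elitism: the best individual survives

        def pick():
            # Binary tournament selection.
            a, b = rng.randrange(pop_size), rng.randrange(pop_size)
            return pop[a] if scores[a] <= scores[b] else pop[b]

        while len(new_pop) < pop_size:
            p1, p2 = pick(), pick()
            if rng.random() < p_cross:  # uniform crossover
                child = [a if rng.random() < 0.5 else b for a, b in zip(p1, p2)]
            else:
                child = p1[:]
            for j, (lo, hi) in enumerate(zip(lower, upper)):
                if rng.random() < p_mut:  # mutation: redraw the gene in range
                    child[j] = rng.uniform(lo, hi)
            new_pop.append(child)
        pop = new_pop
    scores = [objective(ind) for ind in pop]
    best_i = min(range(pop_size), key=scores.__getitem__)
    history.append(scores[best_i])
    return pop[best_i], scores[best_i], history


best_x, best_y, history = genetic_minimize(water_consumption, LOWER, UPPER)
```

Because the hidden layer is a placeholder, the minimum found by this sketch has no physical meaning; only the structure of the objective and of the search loop is illustrated.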
The lowest water consumption of 3.86 m³ a day was obtained at the lower limit of the installed power, 946 kW, for which the production of frozen products, concentrates, juices and drinks, and other products was 35, 505, 238, and 0.44 Mg/day, respectively. Given that the production of frozen products x_1 and other products x_4 is significantly lower, the lowest water consumption can be achieved with a 2 to 1 ratio of concentrate production to juice and drink production.
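As a quick check of the stated ratio, the optimal values quoted above can be compared directly; the shares below are computed only from those four production figures.

$$\frac{x_2}{x_3} = \frac{505}{238} \approx 2.12, \qquad x_1 + x_2 + x_3 + x_4 = 35 + 505 + 238 + 0.44 = 778.44\ \text{Mg/day}$$

$$\frac{x_2}{778.44} \approx 64.9\%, \qquad \frac{x_3}{778.44} \approx 30.6\%, \qquad \frac{x_1}{778.44} \approx 4.5\%$$

Concentrates and juices and drinks together make up about 95% of the optimal output, which is why the result is summarized as an approximately 2 to 1 ratio between these two product groups.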

Conclusions
As demonstrated in the analysis of the current state of knowledge, fruit and vegetable processing has a significant impact on the environment due to its consumption of a substantial quantity of water, which must be reduced through technological improvements aimed at rationalizing water consumption [29,30]. Water consumption reduction methods include obliging suppliers to deliver the raw material in an already treated form, as well as water recovery options, e.g., returning transport water back to the circuit. The presented formulas also enable the analysis of the variability of water consumption, taking into consideration significant technical, technological, and other factors, and enable the analysis of the impact of fruit and vegetable processing plants on the natural environment [31]. The presented results may also be useful in environmental reviews. The literature data [32-36] show that comprehensive actions to rationalize water consumption may reduce it by 50-70%. Rationalization of water consumption in fruit and vegetable processing plants is possible only for entities that monitor their water use.
Studies with the use of artificial intelligence, carried out on 634 real data sets, confirmed the hypothesis that the structure of processing is an important factor in rationalizing water consumption. The proposed method enables optimization of water consumption with regard to the type and structure of fruit and vegetable processing. As indicated by the optimization results, for the analyzed fruit and vegetable processing technology, the lowest water consumption will be achieved when the production of concentrates is at a ratio of 2 to 1 to the production of juices and beverages.