Forecast of Operational Downtime of the Generating Units for Sediment Cleaning in the Water Intakes: A Case of the Jirau Hydropower Plant

Abstract: Hydropower plants (HPPs) in the Amazon basin suffer from issues caused by trees and sediments carried by the river. The Jirau HPP, located in the western Amazon basin, is directly affected by high sediment transportation. These materials accumulate in the water intakes and obstruct the trash racks installed in the intake system to prevent the entry of materials. As a result, head losses negatively impact the efficiency of the generating units (GUs) and the power production capacity. The HPP operation team must monitor these losses and take timely action to clear the intakes. One possible action is to stop the GU to let the sediment settle. Therefore, intelligent methods are required to predict the downtime needed for sediment settling and for restoring operational functionality. Thus, this work proposes a technique that utilizes hidden Markov models and Bayesian networks to predict the downtime of Jirau's fifty generating units, thereby reducing their inactive time and providing methodologies for establishing operating rules. The model is based on accurate operational data extracted from the hydropower plant, which ensures greater fidelity to the daily operational reality of the plant. The results demonstrate the model's effectiveness and indicate the extent of the impact on downtime under varying sediment levels and when neighboring units are generating or inactive.


Introduction
Hydropower plants (HPPs) offer a convenient solution for meeting energy demands, taking advantage of renewable water resources [1]. Moreover, the operation of a hydropower plant is closely tied to the efficient use of the available water resources [2].
HPP operation can be categorized as run-of-river or reservoir-based, with reservoir systems using single or multiple reservoirs operating independently or in cascade [3]. Run-of-river HPPs have minimal water storage volumes; consequently, unlike reservoir HPPs, they cannot store water, which must flow along the course of the river.
Accurately determining the availability of generation units (GUs) is essential for optimizing energy generation using water resources [4,5]. Furthermore, establishing operating rules based on the GUs' operational status is crucial for effectively allocating generation resources [6].
Several factors can impact the availability of GUs, including maintenance operations aimed at preventing issues and correcting failures that may reduce the productive capacity of the GUs or render them unusable [7].
In specific basins, like the Madeira River basin, a significant amount of sediment is transported downstream, including elements such as trees, branches, algae and debris. These substances can accumulate on trash racks, potentially leading to adverse effects on power generation efficiency [8,9].
The Jirau HPP installed in this basin is directly affected by the sediment accumulation problem. During the flood season, the volume of material transported by the river is extremely high. Over time, the sediment accumulates in the water intakes and gradually reduces the GUs' efficiency, eventually making the unit unavailable for power generation.
Efforts are made to address or mitigate the impacts of such accumulation. One of these approaches is to clear the trash racks using cleaning claws to remove a substantial portion of the accumulated sediment. Another method is stopping the GU for a certain period, allowing the sediment to settle and reducing the obstruction of the trash racks.
However, cleaning using the claws is only performed occasionally due to the operational cost and effort involved. Therefore, plant operators more frequently opt to stop the GUs for decanting. The challenge lies in determining the optimal downtime required for the decanting process to be effective. The stoppage time is also influenced by neighboring GUs, directly affecting the duration needed to reduce the sediment volume.
This work presents an innovative technique using hidden Markov models and Bayesian networks to estimate the ideal stopping time of the GUs, ensuring that the decantation process achieves its objective within the shortest possible downtime. Considering the vast amount of information associated with each GU, correlation techniques were employed to select the most relevant parameters for analysis. The study also explored the impact of neighboring GUs' operational states, whether active or inactive, on the downtime of the GU under investigation.
Using hidden Markov models enables the prediction of expected obstruction levels after a GU is stopped for cleaning. Bayesian networks contribute to achieving more accurate results by considering the influence of neighboring GUs. Moreover, the use of historical data facilitates the development of models that align more closely with the operational realities of the HPP.
With the resulting models, it is possible to estimate the required time for sediment decantation more accurately, taking into account the specific conditions of the plant. Furthermore, the proposed technique can be employed to develop improved operational rules, thereby enhancing the pre-operation process through a systematized process and leading to operational benefits.
The main contributions of this study are as follows:
• Development of a hidden Markov model to estimate the required downtime for GUs;
• Modeling of a Bayesian network to calculate conditional probabilities for estimating the necessary decanting downtime under various scenarios;
• Use of correlation techniques to reduce the number of analyzed variables while maintaining the quality and relevance of the information;
• Investigation of the influence of neighboring GUs on sediment movement when they are stopped;
• Improvements in pre-operation processes enhancing GUs performance.
The work is structured as follows: Section 2 presents the problem definition. Section 3 presents related works. Section 4 presents the proposed solution. Section 5 shows the discussions and results obtained, and Section 6 concludes the work.

Problem Definition
Hydropower plants installed in the Amazon basin suffer from problems caused by trees and sediments transported by the river [10]. These sediments reach the HPP operation area and, over time, accumulate in the trash racks installed at the GUs' water intakes to prevent the entry of materials. The sediment accumulation causes a pressure loss at the turbine inlet, reducing the available net head. Another issue stemming from this accumulation of materials is the proportional increase in force exerted on the trash racks.
Usually, there is a safety limit for the force exerted on the trash racks, which HPP operation teams express as a head loss in meters. Exceeding this limit poses safety risks, as the trash racks may break, adversely affecting the plant.
The Jirau HPP is directly affected by sediment-related problems, and the case study presented in this work was carried out in this plant. It is installed in the Madeira River basin, one of the main sub-basins of the Amazon basin, covering an area of over 1.3 million km². Figure 1 presents the hydrography of the Madeira River basin, where it is possible to observe the main tributaries of this basin: the Guaporé, Mamoré, Beni, Abunã and Madre de Dios rivers, with the headwaters located in Brazil, Bolivia and Peru [11,12].
The Amazon basin is the largest in the world, with around 7 million km², covering seven countries in South America. The legal territory of the Amazon is divided into the western and eastern Amazon basins [13].
The western Amazon basin covers approximately 2,400,000 km². Its most important rivers are the Solimões and Madeira Rivers, which share essential characteristics such as large dimensions, large flows, low slopes and significant variations in level and flow between drought and flood periods [14]. Following the Brazilian government's determination, the HPP dam on the Madeira River has a small pass-through reservoir with low regulation capacity. However, the flow through this reservoir is fast enough to carry a large amount of material, especially trees and sediments [15,16]. The precipitation received by the Madeira River basin varies between 500 and 5000 mm per year. Figure 2 presents the average monthly flows of the Madeira River [10]. The government also regulates the dam operating levels for the different flow seasons of the year.
The Jirau HPP is approximately 120 km from Porto Velho, the capital of Rondônia, Brazil. Figure 3 shows the annual temperature history of Porto Velho, which varies between 21 and 34 degrees Celsius and is rarely lower than 18 or higher than 36 degrees.
The plant consists of 50 bulb-type turbines with an installed capacity of 75 MW each, totaling an installed capacity of 3750 MW. The plant has a dam that stretches over 7875 m. An important aspect of the Jirau HPP is its run-of-river reservoir type without storage capacity, which means all inflow must be either used for generation or released downstream [17].
Jirau is the fourth-largest hydropower plant in Brazil in terms of installed capacity and the largest in the world in terms of the number of GUs [18]. However, the immense structure of the HPP presents significant operational challenges. Furthermore, the construction of the plant was carried out with precautions to preserve the biodiversity and natural characteristics of the river by allowing the passage of transported trees. A complex set of equipment, referred to as auxiliary services, is essential for the proper functioning, operation and power generation of the HPP GUs. These auxiliary services encompass refrigeration systems, pressure control mechanisms, protection and control systems, temperature sensors and more. Figure 4 provides an aerial view, illustrating the physical dimensions of the HPP.
The HPP infrastructure of sensors, relays, actuators and systems totals more than 100,000 information points. All data collected from these components within the plant are directed to a SCADA system and stored in a relational database. The AVEVA OSIsoft PI System software was acquired a few years ago to enable fast historical data access and reliable information retrieval. It also provides a solution for managing the large volume of data collected from GUs and their auxiliary services.
The PI System is a comprehensive software portfolio designed for collecting, storing, visualizing, analyzing and sharing operational data internally and externally. It includes a temporal database for information storage using tags with time stamps. Additionally, it offers a tool for effectively organizing the existing assets, displaying them in a tree format that aligns with the company's organizational structure.
Due to the characteristics of the Madeira River, whose bed carries a great deal of material that usually leads to head losses in the water intakes of the GUs, stops are performed to allow the accumulated sediment to settle. During GU operation, level measurements are taken before and after the trash racks, and the difference between these measurements represents the head loss in meters.
Regression analysis using the power information and the observed head of the GU is carried out to determine the flow rate of each unit. This regression is based on the GU hill curve. The obtained regression flow value and the previously recorded head loss in meters are used to calculate K, the trash racks' obstruction factor. An increase in the K factor indicates a greater accumulation of sediments deposited on the trash racks. The formula for K is expressed in (1), where:
∆h_gra = head loss on the rack [m];
K_gra = dimensionless coefficient relative to the rack;
A_gra = trash rack cross-sectional area [m²];
g = gravitational acceleration [m/s²].
According to studies conducted at the HPP, the trash racks installed at the water intake can support a maximum head loss of up to 1.5 m. As a result, operators typically continue operating the GU until the obstruction level reaches this maximum safety limit. Any further accumulation could break the trash racks, causing damage to the GUs and their associated systems and resulting in significant financial losses.
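For reference, a common form of the trash-rack head-loss relation that is consistent with the symbols listed above can be sketched as follows. The flow symbol Q (the regression flow of the unit, in m³/s) is our notation and an assumption, and the exact expression used at the plant may differ from this classical form.

```latex
\Delta h_{gra} = K_{gra}\,\frac{Q^{2}}{2\,g\,A_{gra}^{2}}
\qquad\Longrightarrow\qquad
K_{gra} = \frac{2\,g\,A_{gra}^{2}\,\Delta h_{gra}}{Q^{2}}
```

Under this assumed form, for a fixed obstruction factor the head loss grows with the square of the flow, so any reduction in flow lowers the head loss across the rack.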
An operation rule was devised for the HPP to deal with the obstruction level issue, where the electrical power dispatch is reduced whenever the head loss reaches the safety limit. This reduction consequently reduces the flow, which decreases the head loss in meters, since the head loss increases with the flow. This process is repeated each time the limit is reached, until the dispatch reaches the minimum operating limit of the GU. When the maximum obstruction level is reached, cleaning the trash racks or stopping the GU for sediment settling is necessary.
Therefore, accurately estimating the downtime is extremely important for the plant, as it enables proper planning and utilization of the GU. Given the large amount of existing data related to the operation of the plant, the status of the GUs and their auxiliary systems, an automated and innovative method has been developed to predict the GU downtime and thus systematize a process that was previously carried out ad hoc, based on operator knowledge and experience.
The technique allows the establishment of more consistent operating rules that align with the plant's actual conditions. It provides reliable data to the pre-operation team, enabling better allocation of resources for efficient power generation.
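The dispatch-reduction rule described above can be illustrated with a minimal sketch. The step size, the minimum operating power and the quadratic head-loss curve are hypothetical values chosen only for illustration; they are not taken from the plant. Only the 1.5 m safety limit comes from the text.

```python
def derate_until_safe(power_mw, head_loss_at, limit_m=1.5,
                      min_power_mw=20.0, step_mw=5.0):
    """Reduce dispatch stepwise while the head loss is at or above the limit.

    head_loss_at: callable mapping dispatch power [MW] to head loss [m]
                  (hypothetical; in practice it depends on flow and clogging).
    Returns (final_power_mw, must_stop). must_stop is True when the limit
    is still violated and no further reduction is allowed, i.e. the unit
    needs rack cleaning or a decanting stop.
    """
    while head_loss_at(power_mw) >= limit_m:
        if power_mw - step_mw < min_power_mw:
            return power_mw, True       # minimum operating limit reached
        power_mw -= step_mw             # lower dispatch, hence lower flow
    return power_mw, False


def example_head_loss(power_mw, coeff=3.0e-4):
    """Hypothetical head-loss curve, quadratic in the dispatched power."""
    return coeff * power_mw ** 2


if __name__ == "__main__":
    print(derate_until_safe(75.0, example_head_loss))
    # A much more clogged rack (larger hypothetical coefficient)
    print(derate_until_safe(75.0, lambda p: 5.0e-3 * p ** 2))
```

Passing the head-loss curve as a callable keeps the rule itself independent of how head loss is actually measured or modeled.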
In addition to the downtime forecasting, the work's main contribution lies in the fact that the proposed analysis also considers the impact of neighboring GUs activity in addition to time-series data, instead of only considering the time-series as in other methods, such as ARIMA, dynamic regressions, state space models, etc.
The HMM usage in this work is intended to infer the future obstruction level of the trash racks after some elapsed time, given that the GU's previous obstruction level is known. As HMMs do not capture the factors that influence the future clogging level, BN models are used to map and capture the relationship with neighboring GUs in the decantation process. The integrated approach outperforms the separate use of each technique: the HMM considers the temporal sequences of obstruction levels, while the BNs incorporate the modeling of uncertainties. The result is a richer forecasting model that integrates the two techniques.

Related Work
Although some works deal with sediments, to the best of our knowledge, these studies do not address the impacts caused by sediments on the GUs' operational performance. This fact makes it difficult to compare the proposed work with the literature.
As examples of work dealing with sediment-related problems, we cite [9], which explores the cost-effective aspects of the sediment abrasion effect, the possible disturbances, ecological relevance and the sediment bypass systems. The work of [19] analyzes the reduction in downstream sediment caused by installing hydropower dams, impacting one of the world's largest freshwater fisheries, which supports 17 million livelihoods. Furthermore, [20] studies the decrease in reservoirs' water storage capacity resulting from human activities and climatic changes that accelerate soil erosion and increase reservoir sedimentation.

Hidden Markov Models
The hidden Markov model (HMM) is a stochastic process in which states are hidden or not directly observable. They can only be inferred through the sequence of symbols produced by an underlying stochastic process. This probabilistic modeling technique is commonly used to handle uncertainty by representing a system as a Markov process with hidden states that are not explicitly visible [21,22].
An HMM is typically represented by a triple (π, A, B), consisting of an initial probability vector over states π, a transition probability matrix A that defines the probabilities of moving between the possible states, and an emission probability matrix B that represents the observation probability distribution for each hidden state [21,22].
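In the usual textbook notation, the triple can be written as below. The symbols q_t (hidden state at step t), o_t (observation at step t), s_i (the i-th of N states) and v_k (the k-th of M observable symbols) are standard notation introduced here for illustration and do not appear in the original text.

```latex
\lambda = (\pi, A, B), \qquad
\pi_i = P(q_1 = s_i), \qquad
a_{ij} = P(q_{t+1} = s_j \mid q_t = s_i), \qquad
b_j(k) = P(o_t = v_k \mid q_t = s_j),
\\[4pt]
\sum_{i=1}^{N} \pi_i = 1, \qquad
\sum_{j=1}^{N} a_{ij} = 1 \;\; \forall i, \qquad
\sum_{k=1}^{M} b_j(k) = 1 \;\; \forall j .
```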
This section reviews various applications of HMMs in the context of energy production and associated problems.
One such application, presented in [23], proposes an approach for situation analysis and anomaly detection using a hierarchy of hidden semi-Markov models. The methodology models the expected behavior of a system to detect contextual anomalies in SCADA systems, aiming to predict and prevent potential attacks that could disrupt or damage water supply or power grid structures.
An approach for modeling and forecasting electricity prices is proposed in [24]. The technique is based on input-output HMMs. It considers the uncertainty of some involved variables, such as competitor behavior in the energy market, power source availability, water inflows, system energy demand and related costs. The market states are modeled as hidden states, and a conditional probability transition matrix is used to estimate probabilities when a new market session is opened. Finally, the paper reviews other electricity price models and presents related works utilizing each type of model, highlighting the strengths and weaknesses of each approach.
Integrating intermittent renewable energy sources into the power grid presents new challenges. To tackle these challenges, [25] conducted a study that focuses on modeling the power output of a wind farm. The author used discrete HMMs and inferred the model parameters from available data. By incorporating measurement data from multiple turbines and capturing the interdependencies between their outputs, the developed models successfully replicated crucial features of wind farm power output with high accuracy.
Non-intrusive load monitoring (NILM) is a technique used to identify appliance consumption at a disaggregated level. In [26], a hierarchical HMM framework is proposed to model home appliances and anticipate load characteristics at low voltage levels and distinct power consumption profiles in devices with multiple built-in operational modes. In addition, models were also built using a dynamic Bayesian network representation. An expectation-maximization approach using the forward-backward algorithm was applied in the HMM fitting process. Tests related to the estimation of energy disaggregation showed that the proposed solution using HMM and a dynamic Bayesian network could effectively handle the modeling of appliances with multiple functional modes.
Islanding, a problem faced by power system engineers in smart grids, is addressed in [27] using an HMM-based algorithm to predict the probability of islanding events. The underlying process maps standard or faulty cases as a sequence of states. The HMM can help detect these states, which are not directly observable but follow a pattern. Phasor measurements from the smart grid are used, statistical analysis is conducted to determine the HMM parameters, and tests are conducted on an IEEE nine-bus system. A trained artificial neural network provides the HMM emission probabilities, enabling the prediction of islanding events based on posterior probabilities.
Reliability analysis of phasor measurement units is presented in [28] as another application of HMMs. The proposed methodology computes the transient probability, allowing for better monitoring of systems during transient states. This ability enables faster and more effective restorative initiatives, providing a reliable method for operating, monitoring and controlling wide-area measurement systems.
Dissolved gas analysis is a well-established and commonly used technique for fault diagnosis in oil-immersed power transformers. In the work of [29], this technique is combined with HMMs to estimate the health state of power transformers and infer operation failures. In addition, a dynamic fault prediction technique is proposed in which a Gaussian mixture model is used as a clustering method to extract health state features from datasets with 1600 days of operation. The HMM transition probability was calculated and analyzed to relate different health states. The results showed the proposed solution's effectiveness in predicting faults in a condition-based operation.
An alternative method for fault classification is proposed in [30], utilizing an HMM algorithm to process electrical signals in multivariate time series. A comparative analysis between the proposed technique and artificial neural network, support vector machine, K-nearest neighbor and random forest classifiers is presented. At a significance level of α = 5%, the results indicated that only the artificial neural network (ANN) and random forest (RF) classifiers achieved results comparable to the HMM algorithm. Compared to the other classifiers, the presented algorithm significantly reduced computational costs, with processing time reduced by over 90%.
The massive integration of plug-in electric vehicles (PEVs) into power distribution networks has directly affected the planning, control and operation processes. To contribute to understanding the power needs of this kind of vehicle, the work of [31] presents an analytical approach for modeling PEV travel behaviors and charging demand. Monte Carlo simulation was employed considering the temporal travel purposes and state of charge of vehicles. A Markov model and an HMM were used to formulate the probabilistic correlation between multiple PEV states and state-of-charge ranges. The technique was tested using an IEEE 53-bus test network with field data, with results demonstrating the benefits of the proposed modeling.

Bayesian Networks
Bayesian networks (BNs) have emerged as a powerful technique to address uncertainty problems in scenarios characterized by randomness, indeterminism or lack of predictability [32].
Specifically, within the context of energy generation and its associated tasks, several studies have presented approaches using BNs to address maintenance-related issues [33], stakeholder decision support [34], watershed management [35] and solar plant failure detection [36], among others.
This section provides an overview of BNs and their applications related to power production and associated problems.
The work of [37] employed statistical approaches to analyze runoff and sediment characteristics in China's Three Gorges Reservoir (TGR). The study utilizes cumulative anomaly analysis, Fisher-ordered clustering and maximum entropy spectral analysis to study variations and forecast flow and sediment load using hydrological series spanning several decades. The ARIMA model is used to build the prediction model over the monthly average runoff and sediment inflow. The findings indicate a decreasing trend in runoff and sediment, with notable changes observed in 1991 and 2001.
Another case study combining BNs, neural networks and a multiagent system is presented in [38] to support and improve the automatic control of solar power plants.
The BN model provides probabilistic values to aid operators in making informed decisions regarding remote control of the solar power plant, offering an optimized solution through distributed artificial intelligence technologies in industrial control systems for facilities based on solar photovoltaic energy sources.
In [39], a dynamic BN model is proposed for predicting the generation reserve size in renewable energy environments. This technique considers factors such as the availability of conventional generator capacity, weather conditions and market prices. Additionally, a new dynamic metric for calculating the reliability level of the power grid is introduced, serving as a real-time stochastic decision support tool. The approach is validated using seven years of historical data from the Australian Energy Market Operator, demonstrating improved accuracy in forecasting the risk of involuntary load shedding.
An agent-based model utilizing BNs is proposed in [40] to address the problem of short-term strategic bidding in a generation company's power pool. The agents employ probabilistic models based on dynamic BNs and online learning algorithms to train the model and estimate optimal bidding strategies, leveraging incomplete public information to infer the future state of the market correctly. The model was tested on two different time scales: hour ahead and day ahead. According to the results, the agents predicted the market equilibrium in advance with acceptable errors using incomplete information data.
Furthermore, a BN-based approach is applied in [41] to predict wind power ramp events, employing an imprecise conditional probability estimation method. The proposed solution utilizes the maximum weight spanning tree, a greedy search method to fit the observed data with the highest degree, and a modified version of the Dirichlet model to estimate the network parameters. Given the meteorological conditions, the proposed solution is meant to detect the possibility of a random ramp event, quantifying the uncertainty of the event. The method's effectiveness is demonstrated through tests using three-year operational data from a real wind farm.
Additionally, a BN is employed in the work of [42] for fault detection in power transformers. The model analyzes dissolved gases in oil, using concentration ratios of specific gases to identify normal deterioration and electrical and thermal failures. The solution used a historical database in the learning process, and, compared to data in the literature, the BN presented a high degree of reliability.
Lastly, the study in [43] discusses a BN-based approach for estimating faulty sections in transmission power systems within blackout areas. Three BN models are proposed, capable of testing the faultiness of components using uncertain or incomplete data and knowledge about power system diagnosis. The model uses an error backpropagation algorithm similar to that employed in artificial neural networks, with priors requiring domain experts' knowledge and network structure modeling.
In conclusion, the applications and case studies presented highlight the versatility and effectiveness of BNs in addressing uncertainty and decision-making challenges in various power-related domains. Using BNs allows for improved analysis, prediction and control of complex systems.

Analysis Summary
After a comprehensive review of the existing literature across three key subjects (HMMs, BNs and downstream sediment transportation), we found, to the best of our knowledge, no prior work that integrates these distinct topics to analyze the impacts of riverbed material transportation on the operational efficiency of HPPs.
Notably, the critical challenge of sediment transportation prevails in the Amazon basin, where two of Brazil's largest HPPs are installed: Jirau and Santo Antonio. These plants rank fourth and fifth, respectively, among the country's HPPs in terms of installed capacity. Although our contribution does not introduce new techniques, the innovation lies in the combination of BNs and HMMs to address a pressing problem within the Brazilian power sector. Such a fusion of methodologies offers a solution to an imperative issue, underscoring the novelty and significance of our work.

Data Selection
Due to the large amount of equipment-related data, using all available information in any forecasting technique is practically infeasible. This limitation arises from the time required for data processing and the computational resources consumed in performing such tasks. Pearson's correlation technique is used to identify the degree of dependence between the analyzed variables. This approach aims to reduce the data required while still representing the relevant attributes of interest.
The correlation coefficient quantifies the relationship between variables, with values ranging from −1, indicating a strong negative relationship, to 1, indicating a strong positive relationship, with zero indicating no linear relationship. Pearson's correlation coefficient formula is expressed in (2).
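As an illustration of how the coefficient is computed, the following minimal sketch implements the standard sample Pearson formula in plain Python. The function name and the sample values are hypothetical and are not taken from the plant data.

```python
import math


def pearson(x, y):
    """Sample Pearson correlation coefficient between two equal-length series.

    r = sum((x_i - mean_x) * (y_i - mean_y))
        / sqrt(sum((x_i - mean_x)^2) * sum((y_i - mean_y)^2))
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("series must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Invented values standing in for calculated flow [m3/s] and rack loss [m]
    flow = [420.0, 450.0, 480.0, 510.0, 540.0]
    rack_loss = [0.62, 0.70, 0.83, 0.91, 1.05]
    print(round(pearson(flow, rack_loss), 3))
```

Applying such a function to every pair of attributes yields the kind of correlation matrix that is commonly displayed as a heat map.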
This work uses data collected from the 50 GUs at Jirau HPP. The database encompasses three months, from November 2021 to January 2022, with a sampling interval of 10 min. Figure 5 illustrates the dispatch power and efficiency attributes derived from the data set.
In the first analysis attempt, numerous attributes of the GUs were used. However, the strong correlation between the attributes reduced the set to a few elements, which can still represent the necessary information. The resulting attributes from the analysis were net head, dispatch power, efficiency, calculated flow and racks loss (head loss on the trash racks), in which the last attribute represents the generation power loss related to sediment in the water intakes. Applying the Pearson correlation formula to the data set, the correlation heat map shown in Figure 6 is obtained. It is possible to visualize a strong correlation between net head and current power and between calculated flow and current power, and to notice that the relationship between net head and calculated flow is weak. However, the calculated flow strongly correlates with the rack loss.

HMM-Hidden Markov Models
The HMM is applied to predict a GU's obstruction factor after it has been stopped for a specific time, given the head loss observed when the GU was stopped. As a result, it is possible to detect a dependency between the GU obstruction level and the required decanting time. The application of HMM modeling allows the extraction of this relationship.
Once the relevant attributes are identified, the HMM is developed by creating the initial probability vector, the transition probability matrix and the emission probability matrix. The HMM states are mapped using the K factor, and four value ranges are defined for use in the model: the cleanest range S1, where K ranges from 0 to 1 × 10⁻⁶; the S2 range, from 1 × 10⁻⁶ to 4 × 10⁻⁶; the S3 range, from 4 × 10⁻⁶ to 5 × 10⁻⁶; and the most obstructed range S4, for values above 5 × 10⁻⁶.
During the operating period of the GU, the K factor and the head loss in meters can be calculated.Once the obstruction level on the trash racks reaches the maximum supported value, the operator stops the unit for decanting.
Four intervals are created to map the necessary GU downtime: interval H1 from 1 to 4 downtime hours, interval H2 from 4 to 8, interval H3 from 8 to 12 and interval H4 above 12 h.
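The two discretizations above (obstruction states S1 to S4 and downtime intervals H1 to H4) can be sketched as simple lookups. The treatment of values falling exactly on a boundary, and the rejection of stops shorter than 1 h, are assumptions, since the text does not specify them.

```python
from bisect import bisect_right

# Upper edges of the K ranges and of the downtime intervals, from the text
K_EDGES = (1e-6, 4e-6, 5e-6)
K_STATES = ("S1", "S2", "S3", "S4")
H_EDGES = (4.0, 8.0, 12.0)
H_INTERVALS = ("H1", "H2", "H3", "H4")


def k_state(k):
    """Map an obstruction factor K to its state label.

    Assumption: a value equal to an edge belongs to the upper range.
    """
    if k < 0:
        raise ValueError("K must be non-negative")
    return K_STATES[bisect_right(K_EDGES, k)]


def downtime_interval(hours):
    """Map elapsed downtime in hours to its interval label.

    Assumption: stops shorter than 1 h are outside the mapped intervals,
    and a value equal to an edge belongs to the upper interval.
    """
    if hours < 1.0:
        raise ValueError("downtime below 1 h is not mapped")
    return H_INTERVALS[bisect_right(H_EDGES, hours)]


if __name__ == "__main__":
    print(k_state(4.5e-6), downtime_interval(10.0))
```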
Given the obstruction level at the stop, the HMM presents the GU probabilities of being in each mapped obstruction level interval over time.
It is important to emphasize that the probability vector, the transition and emission matrices were derived from HPP historical data so that the results from the model reflect the reality of the GU downtime for decanting.
The historical data are divided into two sets, one representing the training set and the other the test set. This separation is necessary to evaluate the model on data different from those used in training, so that overfitting can be detected.
In the HPP, it is impossible to determine the current level of GU obstruction after decanting for a few hours.Therefore, the HMM is used in this scenario to estimate the GU obstruction level through probabilities to determine whether it is possible to restart the GU operation.
For the model creation, the following information is inferred from historical data: the prior or initial probability vector, the transition probability and emission probability matrices.
The initial probability vector denotes the probability of the GU being in a specific initial obstruction state, serving to determine the most probable initial state for the GU. The initial probabilities are defined based on the ratio between the number of decanting stops and the obstruction level when the GU was stopped. Consequently, the obtained initial state distribution vector, presented in Table 1, confirms the expected observation that decanting stops for the GU are more frequent when the obstruction level is higher.
The transition probability matrix represents the likelihood of the GU transitioning from one state to another. As the GU downtime increases, there is a higher probability of transitioning from a more obstructed state to a less obstructed one. Since the current obstruction level is not directly observable, it is considered a hidden state. The resulting transition probability matrix can be seen in Table 2.
On the other hand, the emission probability matrix corresponds to the observed information related to the GU's current obstruction level. This information is the elapsed decanting time since the GU was stopped. As time progresses, the GU's trash racks become cleaner, which aligns with expectations. Thus, the HMM utilizes the probability vector and matrices to estimate the GU's obstruction level after an arbitrary elapsed time. Table 3 presents the resulting emission probability matrix.
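One simple way to obtain such a vector and matrices from history is to normalize counts, as sketched below. This assumes that labeled sequences of (obstruction state, downtime interval) pairs can be reconstructed for each decanting stop, which is a hypothetical data layout; the example sequences are invented, and the uniform fallback for states never seen in the data is also an assumption.

```python
from collections import Counter

STATES = ("S1", "S2", "S3", "S4")
SYMBOLS = ("H1", "H2", "H3", "H4")


def _normalize(counts, keys):
    """Turn counts into probabilities; fall back to uniform if no data."""
    total = sum(counts[k] for k in keys)
    if total == 0:
        return {k: 1.0 / len(keys) for k in keys}
    return {k: counts[k] / total for k in keys}


def estimate_hmm(sequences):
    """Estimate (pi, A, B) by relative frequencies from labeled sequences."""
    init = Counter()
    trans = {s: Counter() for s in STATES}
    emit = {s: Counter() for s in STATES}
    for seq in sequences:
        if not seq:
            continue
        init[seq[0][0]] += 1                      # state at the stop
        for state, symbol in seq:
            emit[state][symbol] += 1              # state -> elapsed interval
        for (prev, _), (nxt, _) in zip(seq, seq[1:]):
            trans[prev][nxt] += 1                 # state -> next state
    pi = _normalize(init, STATES)
    A = {s: _normalize(trans[s], STATES) for s in STATES}
    B = {s: _normalize(emit[s], SYMBOLS) for s in STATES}
    return pi, A, B


# Invented example: three decanting stops
example_stops = [
    [("S4", "H1"), ("S3", "H2"), ("S2", "H3")],
    [("S4", "H1"), ("S4", "H2"), ("S2", "H3")],
    [("S3", "H1"), ("S2", "H2")],
]
pi, A, B = estimate_hmm(example_stops)

if __name__ == "__main__":
    print(pi)
```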
The HMM is implemented using Pomegranate, a Python package for probabilistic models [44]. The model construction involves three entities: the initial probability vector π, the transition matrix A and the emission matrix B. Once the model is trained, given a sequence of observations O, the model computes a score for the observed sequence using the forward algorithm (α-pass), which is based on dynamic programming. After the score is obtained, the next step is to reveal the most probable sequence of states given the observations. Given the elapsed time since the GU stop, the Viterbi algorithm is used to expose the hidden states, which represent the actual obstruction level (K factor). The Viterbi algorithm generates the most likely sequence of hidden states for a given list of observations, building the output sequence recursively through dynamic programming.
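The two dynamic programming steps can be sketched in plain Python. This is an illustrative sketch, not the Pomegranate implementation used in this work: the values of π, A and B are placeholders rather than those of Tables 1 to 3, and treating the four elapsed-time intervals as the observation alphabet is an assumption.

```python
STATES = ["S1", "S2", "S3", "S4"]    # hidden obstruction levels
SYMBOLS = ["H1", "H2", "H3", "H4"]   # observed elapsed-time intervals

# Placeholder parameters, NOT the values of Tables 1 to 3.
pi = [0.10, 0.20, 0.30, 0.40]
A = [
    [0.90, 0.10, 0.00, 0.00],
    [0.30, 0.60, 0.10, 0.00],
    [0.10, 0.30, 0.50, 0.10],
    [0.05, 0.15, 0.30, 0.50],
]
B = [
    [0.10, 0.20, 0.30, 0.40],
    [0.20, 0.30, 0.30, 0.20],
    [0.30, 0.30, 0.20, 0.20],
    [0.40, 0.30, 0.20, 0.10],
]

def forward(obs):
    """Return P(obs | model) using the forward algorithm (alpha-pass)."""
    n = len(STATES)
    alpha = [pi[i] * B[i][obs[0]] for i in range(n)]
    for o in obs[1:]:
        alpha = [sum(alpha[i] * A[i][j] for i in range(n)) * B[j][o]
                 for j in range(n)]
    return sum(alpha)

def viterbi(obs):
    """Return the most likely sequence of hidden-state indices for obs."""
    n = len(STATES)
    delta = [pi[i] * B[i][obs[0]] for i in range(n)]
    backpointers = []
    for o in obs[1:]:
        step, new_delta = [], []
        for j in range(n):
            # Best predecessor of state j at this step.
            best = max(range(n), key=lambda i: delta[i] * A[i][j])
            step.append(best)
            new_delta.append(delta[best] * A[best][j] * B[j][o])
        backpointers.append(step)
        delta = new_delta
    # Backtrack from the best final state.
    state = max(range(n), key=lambda i: delta[i])
    path = [state]
    for step in reversed(backpointers):
        state = step[state]
        path.append(state)
    return path[::-1]

obs = [SYMBOLS.index(h) for h in ["H1", "H2", "H3", "H4"]]
score = forward(obs)
path = [STATES[i] for i in viterbi(obs)]
```

Here `score` is the likelihood of the observation sequence under the placeholder model and `path` is the decoded sequence of obstruction levels, one per observation.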

Bayesian Networks
Bayesian networks (BNs) represent a problem domain through a graphical structure whose nodes are the random variables of the domain. The arcs connect pairs of nodes and indicate direct dependence between the corresponding variables. The conditional probability distribution associated with each node quantifies the strength of these relationships.
Using BNs in this work makes it possible to represent the variables directly affecting the required downtime to decrease the GU obstruction level.
The following variables are considered in the BN modeling: the K factor before the unit stops; an indication of whether the left, the right or both neighboring GUs are in operation while the analyzed GU is stopped; and, where applicable, the power at which the neighboring units were operating.
The resulting BN diagram, designed to reflect information about the GUs, is shown in Figure 7. The K factor is discretized using the same four obstruction level intervals defined in Section 4.2. The BN model can be queried on one or more variables, returning the conditional probability given the provided inputs.
To account for the time the neighbors operate while the analyzed GU is stopped, all operating intervals of each neighbor are added up and divided by the stopped time of the analyzed GU.
For example, if the analyzed GU is stopped for 12 h and the right neighbor operates for 3 h, then the neighbor operates for 25% of the analyzed GU stopped time. The operating time of the neighboring GUs is then divided into four ranges: T1, up to 25% of the time; T2, from 25% to 50%; T3, from 50% to 75%; and T4, from 75% to 100%.
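The discretization of the neighbor operating time can be sketched as below. The function names are hypothetical, and treating the upper bound of each range as inclusive is an assumption, since the text does not state to which range a boundary value such as exactly 25% belongs.

```python
def operating_fraction(neighbor_hours, stopped_hours):
    """Share of the stop during which the neighbor was generating (0 to 1)."""
    if stopped_hours <= 0:
        raise ValueError("stopped_hours must be positive")
    return min(sum(neighbor_hours) / stopped_hours, 1.0)

def time_range(fraction):
    """Map a fraction to T1..T4; upper bounds are inclusive (assumption)."""
    if fraction <= 0.25:
        return "T1"
    if fraction <= 0.50:
        return "T2"
    if fraction <= 0.75:
        return "T3"
    return "T4"

# Two operating intervals totalling 3 h during a 12 h stop.
share = operating_fraction([2.0, 1.0], 12.0)
label = time_range(share)
```

Under this boundary convention, the 3 h out of 12 h example falls into range T1.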
The neighboring operating power is computed as the average over the hours in which the left or right GU was in use. If both neighbors are operational, the average hourly power generated by each neighbor is used. Finally, the same four intervals shown in Section 4.2 are used to infer the necessary GU downtime information.
The conditional probability distributions (CPDs) of each model variable are obtained through parameter learning from the provided data and model structure. Maximum likelihood estimation (MLE) is used in this work to extract the CPDs from the data set [45,46].
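For discrete variables with fully observed data, maximum likelihood estimation of a CPD reduces to relative frequencies within each parent configuration. The sketch below illustrates this idea only; the variable names, the parent set and the records are hypothetical and do not reflect the structure of Figure 7 or the plant data.

```python
from collections import Counter, defaultdict

def mle_cpd(records, child, parents):
    """Estimate P(child | parents) as relative frequencies (maximum likelihood)."""
    joint = Counter()
    marginal = Counter()
    for rec in records:
        key = tuple(rec[p] for p in parents)
        joint[(key, rec[child])] += 1
        marginal[key] += 1
    cpd = defaultdict(dict)
    for (key, value), n in joint.items():
        cpd[key][value] = n / marginal[key]
    return dict(cpd)

# Made-up records: column names and values are illustrative only.
records = [
    {"k_before": "S4", "right": "T3", "downtime": "H3"},
    {"k_before": "S4", "right": "T3", "downtime": "H4"},
    {"k_before": "S4", "right": "T3", "downtime": "H4"},
    {"k_before": "S4", "right": "T1", "downtime": "H2"},
    {"k_before": "S2", "right": "T1", "downtime": "H1"},
]
cpd = mle_cpd(records, child="downtime", parents=["k_before", "right"])
```

Parent configurations that never occur in the data receive no entry, which is one reason why sparse combinations of variables need care when the resulting CPDs are queried.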
The Bayesian model and its CPDs are then used to make inferences under several scenarios, to validate the proposed technique and to compare the model results with the plant operational data.

Results
After the proposed models are built using the techniques presented, study cases are performed to verify their performance. The obtained results are provided in this section.
The results obtained through the HMM application for GU 2 are outlined in Table 4 and represent the probabilities associated with specific hidden states. These hidden states correspond to varying levels of obstruction caused by the accumulation of sediments. The likelihood of the GU being in each hidden state can be ascertained by analyzing the data within each observed hourly interval. Whenever the GU is in state S4, the highest level of obstruction, the transition towards a less obstructed state becomes evident only after the time interval H3. This outcome aligns with the plant's operational practice of keeping the unit offline for extended periods when the obstruction level is high.
Alternatively, if the same GU currently has the obstruction level S2 and remains inactive for the time interval H3, there is a significantly higher probability (18.7%) of it remaining in that state. The GU transition to a state cleaner than S2 demands a more extended downtime due to the characteristics of the sedimentation process, in which denser materials take more time to settle.
At the S1 obstruction level, the unit can remain in the same state or, sometimes, evolve to a higher obstruction level, S2. This event may occur due to the operation of neighboring GUs, which moves sediments and carries material to the stopped GU.
In an ideal scenario for the HPP operation, a GU at the highest obstruction level should remain stopped until it reaches the lowest dirt level, when it can return to activity. A study case was performed to obtain the probability of the GU migrating from the initial state S4 to S1 as its final state. The result is presented in Table 5. This scenario becomes likely only after H3, with a probability of 24.99%, and is most probable at H4, with 30.20%. As expected, it is uncommon to reach S1 starting from S4 after the H1 or H2 periods, which correspond to the 1 to 4 h and 4 to 8 h intervals, respectively.
To analyze the differences in downtime between GUs, Table 6 presents the obstruction level and downtime information for GUs 1 and 3. The following considerations result from analyzing these data with the same study case performed on GU 2, that is, the probability of the GU migrating from the initial state S4 to S1 as its final state. The probability of remaining in state S4 shows a balanced dispersion over time for GU 1. This pattern indicates situations where the GU transitions to a cleaner state even within the H1 interval. Conversely, there are cases where a significant time lapse, such as H4, is required for the transition. This variance can be attributed to GU 1's proximity to the dam's ravine, potentially contributing to sediment accumulation in specific circumstances.
Concerning GU 3, the likelihood of persisting in state S4 during the H1 and H2 intervals is higher, registering values of 31.1% and 28.1%, respectively. The prevailing trend is for GU 3 to transition to a cleaner state only after the H3 interval.
The dissimilar behaviors observed between GUs 2 and 3 can be attributed to the following factors: GU 1 absorbs sediment from the riverbank and can consequently transfer sediment to GU 2, explaining why GU 2 shifts to a cleaner state only after a more extended downtime. Conversely, GU 3 is unaffected by the same issue due to its greater distance from GU 1, illustrating the impact of neighboring GUs on the sediment decantation process.
Probability outcomes for GUs 31 and 32 are shown in Table 7. The behaviors of GUs 31 and 32 differ from those presented for GUs 1 to 3. This difference can be attributed to these GUs being on different margins, separated by kilometers, and to the curvature of the river at the left margin, where these GUs are installed. A certain similarity can be observed between the probabilities for GUs 31 and 32, with slight variations in the time required to change between states. Generally, there is migration between states only after the H3 interval, which can be associated with the type of material accumulated in the trash racks.
When the generation units (GUs) exhibit a notably high degree of obstruction, a prevalent trend emerges: substantial clearance of the trash racks occurs only after prolonged GU downtime. Specifically, if the GU is stopped only briefly, a considerable amount of material is expected to still obstruct the trash racks when it resumes operation.
Observations reveal that units positioned near the riverbank experience a notably higher sediment accumulation, leading to a more pronounced obstruction of the trash racks. Adjacent GUs also experience a residual effect from this sediment accumulation, albeit with a lesser impact.
It is essential to highlight that the HMM technique cannot consider whether neighboring GUs are in operation, nor does it account for their operating power or the time during which they were generating.
For this reason, BNs are used to consider the factors that directly affect the operation and consequently alter the sediment flow during the GU stop time. Separate BNs are created for each GU to reflect the specificities of each one.
The results obtained for different types of BN queries are presented next. For example, Table 8 shows the CPDs for GU 2. Models derived from BNs offer a significant advantage due to their inherent query capabilities. Queries involving any model attributes mapped within the network can be answered, thereby facilitating the prediction of posterior values. Specifically, these models allow the prediction of the obstruction level for each distinct downtime interval.
This predictive capacity enhances the ability to forecast and anticipate the progression of obstruction levels during various operational downtimes. In essence, BNs allow for a comprehensive exploration of the network's attributes, providing insights into the system's expected behavior over time.
Using the BN shown in Figure 7, it is possible to estimate the resulting obstruction level for a given scenario and to verify which parameters influence the decantation process the most.
Below are the values entered for the BN parameters. Data referring to GU 2 were used. In all scenarios, it is considered that the GU is at the highest level of obstruction, S4.
Scenario A is parameterized only with this S4 obstruction information.
Scenario B is configured with the additional 'Right' information, whose defined value is T3, which covers the range from 50% to 75% of the time.
The information 'Right', 'Left' or 'Both' refers to the percentage of time that the adjacent unit operated while the GU was stopped for decantation. In scenario B, the analyzed GU is GU 2, and the 'Right' neighbor is GU 3.
Scenario C is configured with the same value for the 'Right' parameter: T3. Additionally, the 'Left' information is set to T1, which covers up to 25% of the time.
Scenario D is configured with the value for the 'Both' parameter equal to T4, which means that both the 'Right' and 'Left' GUs, in this case, GUs 3 and 1, respectively, operated for 75% to 100% of the time in which GU 2 was stopped for decantation.
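Scenarios A and B can be read as queries with increasing amounts of evidence. The sketch below illustrates this with inference by enumeration on a hypothetical three-node network, in which the final obstruction level F depends on the initial level K and on the right neighbor operating time R. The structure, the reduced state sets and all probabilities are placeholders chosen for illustration; they are not the network of Figure 7 nor the learned CPDs.

```python
from itertools import product

# Hypothetical network K -> F <- R. All numbers are placeholders.
P_K = {"S3": 0.4, "S4": 0.6}
P_R = {"T1": 0.7, "T3": 0.3}
P_F = {  # P(F | K, R)
    ("S3", "T1"): {"S1": 0.5, "S2": 0.5},
    ("S3", "T3"): {"S1": 0.4, "S2": 0.6},
    ("S4", "T1"): {"S1": 0.3, "S2": 0.7},
    ("S4", "T3"): {"S1": 0.2, "S2": 0.8},
}

def query(evidence):
    """Return P(F | evidence) by summing the joint over unobserved parents."""
    scores = {"S1": 0.0, "S2": 0.0}
    for k, r in product(P_K, P_R):
        # Skip parent values that contradict the evidence.
        if evidence.get("K", k) != k or evidence.get("R", r) != r:
            continue
        for f, p in P_F[(k, r)].items():
            scores[f] += P_K[k] * P_R[r] * p
    total = sum(scores.values())
    return {f: s / total for f, s in scores.items()}

scenario_a = query({"K": "S4"})             # obstruction level only
scenario_b = query({"K": "S4", "R": "T3"})  # plus right-neighbor operating time
```

When a variable is left out of the evidence, as R is in scenario A, it is marginalized according to its own distribution, which is why adding evidence can shift the resulting probabilities.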
The results are presented in Table 9. In scenario A, the probabilities that GU 2, stopped at the worst obstruction level, will resume operation at levels S2 and S3 are approximately 38.8% and 32.9%, respectively. In scenario B, these values are close, 32.3% and 30.7%, respectively. The operation of the neighboring GU, in this case, GU 3, did not significantly impact the obstruction level of GU 2.
In scenario C, the most favorable cleaning results were obtained, with a 40% probability that the GU would return at the cleanest level of obstruction, S1. A probable explanation for this behavior is that the operation of the left GU, in this case, GU 1, pulled the sediment away from GU 2, moving it more quickly to a lower level of obstruction.
Finally, in scenario D, both neighbors were in operation for 75% to 100% of the time GU 2 was stopped. The results demonstrate a more uniformly distributed probability between levels S1, S2 and S3, with values of 48.4%, 35.2% and 42.1%, respectively.
The results show that the decantation process while the GU is stopped is significantly influenced by the neighboring GUs. This relationship changes depending on the time the neighboring GUs were operating and the level of dirt when the GU stopped.
To enable a more comprehensive view of the data, including more complete probability results, the BN was parameterized to output each final obstruction level when resuming GU operation for all available downtimes. The obtained results are shown in Table 10.
Given that the GU was stopped at the highest obstruction level, S4, the following behavior can be observed in the table for many scenarios: after stopping time H1, the highest probability is that the GU resumes operation still at level S4. For time H3, the restart most likely occurs at level S2, and, finally, a stop for time H4 increases the probability of resuming at level S1.
Only for the H2 stop time interval does this pattern not hold. Instead of resuming at level S3, the GU remains at level S4, demonstrating that stopping the GU for short intervals does not influence the level of obstruction so strongly.

Discussions
Table 10. BN results for GU 2: for each scenario A through D, and for each time interval H1 through H4, the probabilities of the GU returning to operation at dirt levels S1 through S4 are presented, using S4 as the level of initial obstruction.
The main feature of HMMs is their suitability for use with sequential data, where the order of observations is essential. In this work, the HMMs captured temporal dependencies and transitions between different states, which evidenced the relation between the obstruction level and elapsed time when GUs are stopped for decanting. The flexibility of HMMs enabled usage with time-series input data, while levels of obstruction are mapped as states in the model.
The two main advantages of HMMs are related to the probabilistic modeling and the incorporation of hidden states. The first feature captures the uncertainties, which fits the objectives of this work: map the ratio of accumulated sediment and the required downtime to settle this material. The second maps the unobserved obstruction levels underlying processes as hidden states, enabling the estimation of the sediment settlement according to the elapsed time.
On the other hand, HMMs have limitations in this model: since the transition probabilities are influenced by the neighboring GUs, the Markov property is directly affected and may not hold. Models relying on the Markov property can also struggle to capture long-range dependencies effectively.
Another limitation is related to the fixed state space: the number of hidden states was determined in advance and may not represent the best possible scenario. Choosing the appropriate number of states was challenging since it could affect model performance. A significant effort was required to ensure that the selected state space reflects the best option.
The results showed that the expressiveness of the HMM technique alone is limited since the models might not effectively capture complex relationships between variables. The training complexity represents a time bottleneck since a new training cycle must be performed with each new parameter and state mapping variation.
The advantages of Bayesian networks stem from the fact that they provide a natural and intuitive way to model uncertainties and dependencies in data obtained from the real HPP operation. As the BN is a probabilistic framework, it was possible to make inferences even when some variables appear unrelated.
The causal inference of BNs allowed for understanding how the neighboring GU usage affected other variables. That feature was valuable for decision making about using the GUs while the adjacent ones are stopped for sediment settling. With the help of problem domain experts of the Jirau HPP, it was possible to validate prior beliefs and causal relationships obtained by the resulting modeling.
BNs allowed exploratory analysis and efficient inference related to sediment behavior, which helped to uncover hidden patterns that were not immediately apparent from the raw data and to compute the probabilities of different scenarios given query evidence. As the real operational data from the HPP were available, it was possible to use the BN to learn the conditional probability parameters from these data, which makes the model consistent with the plant reality.
Bayesian networks present some limitations, such as the heavy dependence on the graph structure. It was challenging to specify the design correctly, which required domain expertise since the model might not effectively capture the true relationships without expert input.
The training time was a bottleneck when varying parameters during the modeling because the high number of GUs increases the computational cost. Finally, distinguishing correlation from causation represented a challenge because assuming causation based on correlation can be misleading.
A strength of the presented work is the union of the HMM with the BN, which made it possible to take advantage of the main characteristics of each technique. The model's robustness allows probability information extraction. It brings to light details related to the behavior of the obstruction in the trash racks, including the impact of neighboring GUs on the sediment settling behavior.
Because this is a pioneering work addressing the obstruction of trash racks and its impact on the operation of hydroelectric plants, further studies are still needed for contextual reference and comparison. This sediment issue is specific to HPPs located within the Amazon basin. In the case of the Jirau plant, the challenge of high sediment transport rates arises only during particular periods of the year. For this reason, only data referring to flood seasons are used.

Conclusions
This paper proposes techniques to estimate the ideal stopping time of the GUs in the Jirau HPP using hidden Markov models and Bayesian networks as inference methods. Field operational data are used to obtain the presented models. The results demonstrate consistency with daily plant operation, allowing the use of the model in the operator's decision making, thus helping to operate the high number of existing GUs in the Jirau HPP.
An essential advantage of the presented methodology is that it provides systematized, data-based means of modeling and inference, enabling more consistent operating rules at the HPP.
As Jirau is a run-of-river plant and does not allow water resources storage, the proposal presented in this work offers methods that enable using a robust model in the plant operation planning under several possible scenarios, extracting the resulting probability under each perspective.
This innovative proposal aims to bring greater clarity and robustness to the extraction of operating rules for HPPs in which the accumulation of materials on the trash racks can negatively influence daily operations, especially those in the Amazon basin.
The methodology applied in this work can be used in other HPPs, both for operating rules extraction and for HMM and BN modeling. In addition, the resulting model can help to identify factors that alter the GUs' operational efficiency, providing tools and methods to operate the plant efficiently.
Although the generated models use HPP operating data, the training was offline. As future work, it is proposed to integrate plant data for online model training, presenting real-time information to assist operators in decision making and minimizing GU downtime.

Figure 1. Hydrography of the Madeira River basin where the Jirau HPP is located.

Figure 3. Porto Velho annual temperature history: temperature in the region ranges between 21 and 34 degrees Celsius. Gray bars: daily range of recorded temperatures; red and blue lines: daily highs and lows, respectively.

Figure 4. Aerial view of the Jirau HPP. Right margin with 28 GUs on the left of the image. Left margin with 22 GUs on the right of the image.

Figure 5. Monthly efficiency and dispatch power readings extracted from the dataset.

Table 1. Initial state distribution vector.

Table 4. HMM results for GU 2: state probability after given time elapsed. Columns represent obstruction levels, and rows represent the time elapsed.

Table 5. Probability of GU 2 migrating from initial state S4 to S1 as its final state.

Table 6. HMM results for GUs 1 and 3: state probability after time elapsed. Columns represent obstruction levels, and rows represent the time elapsed.

Table 7. HMM results for GUs 31 and 32: state probability after time elapsed. Columns represent obstruction levels, and rows represent the time elapsed.

Table 8. BN results for GU 2: state probability after given time elapsed. Columns represent time elapsed, and rows represent obstruction levels.

Table 9. BN results for GU 2: probabilities obtained for scenarios A, B, C and D, using S4 as the initial obstruction level.