Intelligent Wide-Area Water Quality Monitoring and Analysis System Exploiting Unmanned Surface Vehicles and Ensemble Learning

Abstract: Water environment pollution is an acute problem, especially in developing countries, so water quality monitoring is crucial for water protection. This paper presents an intelligent three-dimensional wide-area water quality monitoring and online analysis system. The proposed system is composed of an automatic cruise intelligent unmanned surface vehicle (USV), a water quality monitoring system (WQMS), and a water quality analysis algorithm. An automatic positioning cruising system is constructed for the USV. The WQMS consists of a series of low-power water quality detecting sensors and a lifting device that can collect the water quality monitoring data at different water depths. These data are analyzed by the proposed water quality analysis algorithm based on the ensemble learning method to estimate the water quality level. Then, a real experiment is conducted in a lake to verify the feasibility of the proposed design. The experimental results obtained in real application demonstrate good performance and feasibility of the proposed monitoring system.


Introduction
High-quality water supply is essential for human survival [1,2]. Therefore, water protection has been a hot topic in academic and industrial domains [3,4]. With the rapid development of industry and urbanization, industrial and sanitary sewage has severely affected fresh-water sources worldwide, especially in developing countries, and thus has significantly influenced the living conditions of human beings [5]. In water protection, water quality monitoring is a key task. Therefore, in smart cities, it is extremely important to monitor the quality of water resources effectively [6,7].
In recent years, the evolution of high-resolution sensors and Internet of Things (IoT) technologies has significantly improved water quality monitoring technologies [8-10]. Various WQMSs have been designed to solve the problems related to the water quality monitoring of lakes [11,12], rivers [13,14], and groundwater [15,16]. Generally, water quality can be accurately determined by laboratory analysis, but laboratory analysis cannot efficiently provide high spatial resolution and rapid assessment at the same time.
Significant progress has been made in the water quality monitoring field by the introduction of the IoT [17] and unmanned techniques [18,19] that can reduce the monitoring cost and improve the intelligence of the WQMS [20,21]. On the one hand, new measurement technologies, such as shipborne measurements [22] and IoT systems [23], have been used to monitor water quality. In particular, in [24,25], an unmanned surface vehicle (USV) was used to monitor water quality. On the other hand, various water analysis methods have been used to improve water quality estimation systems [26-28]. However, there are still many challenges, such as achieving low cost, high efficiency, and good real-time performance. Motivated by these studies, in our work we integrate USVs into water quality monitoring at different depths.
In this work, an intelligent three-dimensional wide-area water monitoring and analysis system and a dynamic energy-saving system are proposed. For the purpose of convenient reference, acronyms used in this paper are given in Table 1. The conceptual diagram of the proposed design is shown in Figure 1. Most of the traditional water quality monitoring devices can obtain only the water quality parameters related to the water surface [29,30]. However, because sewage spreads in the water body, surface-only methods collect water quality monitoring data with limited precision; water quality should preferably be monitored at different depth levels. Therefore, a device that can realize three-dimensional water quality monitoring at different depth levels is designed in this work. In the proposed system, a sensor node collects the water quality data and uploads it to the cloud, and then a water quality monitoring model based on ensemble learning is used to determine the water quality level.
The rest of the paper is organized as follows. Section 2 presents the overall design of the USV and automatic cruising strategy. Section 3 describes the proposed water quality monitoring system. Section 4 introduces the water quality analysis algorithm. The experimental results of the proposed system are presented in Section 5. Section 6 concludes the paper.

USV Design
This section introduces the architecture design of the proposed USV and an automatic positioning cruise system.

USV Architecture
The overall architecture design of the USV is shown in Figure 2. The USV weighs 5 kg, and its length, width, and height are 0.8 m, 1.8 m, and 0.6 m, respectively. The draft of the vehicle is 0.1 m. The USV hull adopts a catamaran structure and is equipped with two brushless DC (direct current) motors, which can provide thrust from 1 kg to 2 kg. Both hulls use a pontoon made of PVC (polyvinyl chloride) mesh cloth, and the connecting bridge has a PVC composite frame. The frame of the connecting bridge is covered with a plexiglass panel to protect the system components from damage caused by external factors. The entire USV system mainly consists of three parts: an MCU (micro control unit), an automatic positioning cruise system, and a wireless communication transmission module.

USV Embedded Control Unit
The USV control unit is based on a 32-bit ARM microcontroller. An image of the USV is shown in Figure 3. The USV MCU is directly connected to the cruise system, the power control module, and two wireless communication modules, as demonstrated in Figure 3. The USV MCU can establish wireless links through the communication modules. The MCU is responsible for power control, cruising, and other tasks. While the USV is sailing, the power consumption mode is switched according to the distance between the current and target locations. For instance, the USV is in the working and position-modifying mode when the distance from the target location is less than or equal to 50 m, and it goes into the standby mode when the distance is greater than 50 m.
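The distance-based mode switch described above can be sketched as follows. This is only an illustration under stated assumptions: the names `PowerMode`, `THRESHOLD_M`, and `select_power_mode` are hypothetical, and only the 50 m threshold and the two modes come from the text.

```python
from enum import Enum


class PowerMode(Enum):
    WORKING = "working"   # working and position-modifying mode
    STANDBY = "standby"   # standby mode


# Distance threshold stated in the text (metres).
THRESHOLD_M = 50.0


def select_power_mode(distance_to_target_m: float) -> PowerMode:
    """Return the power mode for a given distance to the target location."""
    if distance_to_target_m < 0:
        raise ValueError("distance must be non-negative")
    if distance_to_target_m <= THRESHOLD_M:
        return PowerMode.WORKING
    return PowerMode.STANDBY
```

For example, `select_power_mode(12.0)` selects the working mode, while `select_power_mode(80.0)` selects standby.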

Automatic Positioning Cruise System
The simplified block diagram of the automatic positioning cruise system of the USV is presented in Figure 4. The positioning cruise system includes a nine-axis MPU9250 sensor (three-axis gyroscope, three-axis accelerometer, and three-axis magnetometer) and a GPS (Global Positioning System) module, which are used for positioning and for determining the moving direction of the USV. The nine-axis sensor and the positioning module are connected to the MCU (STM32F103ZET6) via SPI and UART interfaces, as shown in Figure 4. In some of the previous studies [31,32], automatic positioning achieved by a nine-axis sensor and a GPS module was presented. Therefore, based on the circuit architecture displayed in Figure 4, an APCS (automatic positioning cruise system) is proposed. The flowchart of the APCS working principle is shown in Figure 5.

Figure 5. Flowchart of the APCS working principle. α_C, the current heading angle; α, the angle deviation; ε, the angle resolution; (x_C, y_C), the current position; D, the distance deviation; d_0, the distance resolution; t, moving time.
The main APCS working steps are as follows. First, the coordinate system is set up such that the starting point of the monitoring is the origin, the east-west axis is the x-axis, and the north-south axis is the y-axis. The MCU of the USV obtains the starting position (x_s, y_s) from the GPS module and uses a wireless communication module to obtain the next position (x_N, y_N) for the water monitoring task from the cloud server. Then, the MCU calculates the distance D_N and the angle α_N between the next and starting positions. After obtaining the distance and angle of the next position, the MCU makes the USV move to the next position. During the USV's movement, the APCS periodically reads the MPU9250 sensor and the GPS module and sends the current heading angle α_C, obtained by a weighted fusion algorithm, to the MCU. Then, the MCU calculates the angle deviation α between α_C and α_N and makes the USV change direction until the angle deviation becomes smaller than a predefined value ε. Before the USV reaches the next position, the MCU periodically acquires the USV's current position (x_C, y_C) and calculates the distance deviation D from the next position by Equation (4). The presented steps are repeated until the distance deviation D satisfies the condition given by Equation (4), i.e., until the USV reaches the target position. Finally, when the USV reaches the next position, the water quality parameters are collected by the WQMM (water quality monitoring module).
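The relations described in this paragraph can be written out as a worked sketch. Several points are assumptions rather than statements from the text: a plane Euclidean distance is used, the angle is measured from the x-axis (east), the numbering (1) to (3) is inferred from the reference to Equation (4), and the choice between strict and non-strict inequality in the last line is a guess.

```latex
\begin{align}
D_N &= \sqrt{(x_N - x_s)^2 + (y_N - y_s)^2} \tag{1}\\
\alpha_N &= \operatorname{atan2}\left(y_N - y_s,\; x_N - x_s\right) \tag{2}\\
\alpha &= \left|\alpha_C - \alpha_N\right| < \varepsilon \tag{3}\\
D &= \sqrt{(x_N - x_C)^2 + (y_N - y_C)^2} \le d_0 \tag{4}
\end{align}
```

Here ε is the angle resolution and d_0 is the distance resolution, as listed in the symbol legend of Figure 5.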
In practice, the USV is placed on a lake, and its starting point is set at a specified location. Then, water quality monitoring is completed in the cruise mode. The USV stops at each measurement point until the water quality monitoring at different depths is completed. During the automatic cruise steps described above, the position may deviate from the planned route. Therefore, the position information from the GPS module and the nine-axis sensor is intermittently compared with α_N and D_N, and the USV's position is corrected by adjusting the angle α and the distance D.
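A minimal sketch of the cruise decision logic is given below. It is not the authors' firmware: the function names, the three action labels, and the choice to recompute the target angle from the current position on every cycle are assumptions made for illustration, and the angle is taken from the x-axis (east) in radians.

```python
import math


def distance_and_angle(start, target):
    """Euclidean distance and angle (from the x-axis) from start to target."""
    dx = target[0] - start[0]
    dy = target[1] - start[1]
    return math.hypot(dx, dy), math.atan2(dy, dx)


def angle_deviation(alpha_c, alpha_n):
    """Smallest absolute difference between two angles, in radians."""
    diff = (alpha_c - alpha_n + math.pi) % (2.0 * math.pi) - math.pi
    return abs(diff)


def next_action(current, heading, target, eps, d0):
    """Decide one control step: 'arrived', 'turn', or 'advance'.

    current, target: (x, y) positions in metres
    heading: current heading angle alpha_C in radians
    eps: angle resolution in radians; d0: distance resolution in metres
    """
    dist, alpha_n = distance_and_angle(current, target)
    if dist <= d0:
        return "arrived"          # distance deviation within d0
    if angle_deviation(heading, alpha_n) >= eps:
        return "turn"             # heading not yet aligned with the target
    return "advance"
```

For instance, a vehicle at the origin heading east with a target due north would first be told to turn, and only advance once its heading is within `eps` of the target angle.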

Wireless Communication Transmission Modules
Two different types of wireless modules are adopted to connect to the cloud server and the WQMS, respectively. The first module uses GPRS (general packet radio service) to realize wireless communication between the USV and the cloud server. This technology is built on a cellular network with a long communication distance, and it can be directly used in a GSM (global system for mobile communications) or LTE (long term evolution) network. The second module is a Wi-Fi module that links the USV and the WQMM, and it uses the IEEE (Institute of Electrical and Electronics Engineers) 802.11n communication protocol.

Water Quality Monitoring System
The WQMS represents the key component in this study. This section introduces the main modules of the WQMS, which are the water quality monitoring module, a lifting device, and data transmission and storage system.

Water Quality Monitoring Module
In the proposed system, the WQMM consists of an MCU, a power supply, sensors, a wireless communication module, and a plastic enclosure, as demonstrated in Figure 6a. Like the USV control unit, the MCU of the WQMM is based on a 32-bit ARM microcontroller. The main water quality values, including the pH value, total dissolved solids (TDS) value, and turbidity value, are monitored [33]. The parameters of the sensors used to measure these values, such as sensing range, accuracy, and working voltage, are presented in Table 2. The MCU of the WQMM mainly controls the water quality monitoring sensors. After the relevant water quality data are obtained, they are transmitted to the main control board via the wireless module.
Table 2. Sensor parameters (sensor: range; working voltage in V; accuracy; manufacturer). pH: 0~14; 5; ±0.7; TELESKY. TDS: 0~1000 ppm; 3.3~5; ±5; WAAAX.
pH sensor: The pH sensor is used to monitor the pH value of the water. The determination coefficient R^2 of the calibration model of this sensor reaches a value of 0.999. The pH sensor detects the hydrogen ion concentration in a solution using a hydrogen ion glass electrode and a reference electrode that together form a galvanic cell. During the ion exchange process, the potential difference between the electrodes, which arises between the glass membrane and the solution, is measured to detect the hydrogen ion concentration and thus determine the pH value of the solution.
TDS sensor: The working voltage of the TDS sensor is 3.3~5 V, while its measurement range and accuracy are 0~1000 ppm and ±5% F.S. (full scale) at 25 °C, respectively. The measured TDS value differs from the true value, and the difference is closely related to the water temperature. Therefore, the TDS reading should be corrected for temperature in practical applications.
Turbidity sensor: The turbidity value is monitored based on the light transmittance and scattering rate of a liquid solution. The turbidity information is obtained through the AD (analog to digital) conversion interface in the dynamic monitoring environment. The working voltage and accuracy of the turbidity sensor are 5 V and 0.75%, respectively. The response time is shorter than 500 ms.
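The temperature dependence of the TDS reading mentioned above is often handled with a linear compensation to a 25 °C reference. The sketch below uses that common approach purely as an illustration; the text does not state which correction the authors apply, and the function name, the 2% per °C coefficient, and the parameter names are assumptions.

```python
def compensate_tds(raw_tds_ppm: float, water_temp_c: float,
                   coeff_per_c: float = 0.02, ref_temp_c: float = 25.0) -> float:
    """Scale a raw TDS reading to its equivalent at the reference temperature.

    Assumes the reading grows linearly with temperature by `coeff_per_c`
    per degree Celsius relative to `ref_temp_c` (an assumed model).
    """
    factor = 1.0 + coeff_per_c * (water_temp_c - ref_temp_c)
    if factor <= 0.0:
        raise ValueError("temperature outside the valid range of the linear model")
    return raw_tds_ppm / factor
```

With these assumed defaults, a reading taken at 25 °C is returned unchanged, and a reading taken in warmer water is scaled down.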

WQMM Lifting Device
In order to realize a three-dimensional water quality monitoring system, the proposed design uses the ZS-RE81.3i DC (direct current) motor (ZHENG KE ELECTROMOTOR, Wenzhou, China) to lift and lower the water quality monitoring device. As shown in Figure 6b, the WQMM lifting device consists of a lifting gear shaft and a DC motor. The voltage of the motor is 12 V, and its maximum speed is 60 revolutions per minute. The PID (proportional-integral-derivative) algorithm is used to obtain the desired rotating speed while the device is running.
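For readers unfamiliar with PID speed control, a textbook discrete PID update is sketched below. It is a generic illustration, not the controller used in the system: the class name, the gains, and the sampling interval are all hypothetical.

```python
class PID:
    """Textbook discrete PID controller (illustrative only)."""

    def __init__(self, kp: float, ki: float, kd: float):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = None  # no derivative term on the first sample

    def update(self, setpoint: float, measured: float, dt: float) -> float:
        """Return the control output for one sampling period of length dt."""
        if dt <= 0.0:
            raise ValueError("dt must be positive")
        error = setpoint - measured
        self.integral += error * dt
        if self.prev_error is None:
            derivative = 0.0
        else:
            derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative
```

A typical loop would call `update(target_rpm, measured_rpm, dt)` once per sampling period and apply the returned value to the motor drive signal.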
The lifting device can reach different depths in the range of 0~2 m. The number of revolutions n_r of the motor can be calculated from the number of pulses. Let R denote the radius of the lifting gear shaft; the descending depth l_d of the probe box is then given by l_d = 2πR·n_r. Because the WQMM is initially flush with the bottom of the USV, the starting position is determined by the USV's draft. The USV MCU stops the descent of the WQMM by stopping the motor. When the USV arrives at a predefined location, the USV MCU computes the sinking time and the standing time of the WQMM. The water quality parameters are transmitted before the water monitoring.
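The depth bookkeeping described above can be sketched as follows, assuming that the line pays out one shaft circumference per revolution (l_d = 2πR·n_r) without slip. The function names, the pulses-per-revolution parameter, and the example shaft radius are hypothetical; the 0.1 m draft is the value given for the USV.

```python
import math


def descent_depth_m(pulse_count: int, pulses_per_rev: int, shaft_radius_m: float) -> float:
    """Descending depth l_d from the pulse count (assumes no slip on the shaft)."""
    n_r = pulse_count / pulses_per_rev          # number of revolutions
    return 2.0 * math.pi * shaft_radius_m * n_r


def probe_depth_m(pulse_count: int, pulses_per_rev: int, shaft_radius_m: float,
                  draft_m: float = 0.1) -> float:
    """Depth below the water surface; the probe starts flush with the hull bottom."""
    return draft_m + descent_depth_m(pulse_count, pulses_per_rev, shaft_radius_m)


def sinking_time_s(descent_m: float, shaft_radius_m: float, rpm: float) -> float:
    """Time needed to pay out `descent_m` of line at a constant motor speed."""
    speed_m_per_s = 2.0 * math.pi * shaft_radius_m * rpm / 60.0
    return descent_m / speed_m_per_s
```

For example, with an assumed shaft radius of 0.05 m, one full revolution lowers the probe by about 0.31 m, which takes about one second at 60 revolutions per minute.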

Sensing Data Transmission and Storage
The water sensing data are stored in the local memory of the monitoring module. After the probe box leaves the water body, the WQMM MCU activates the wireless module to establish a wireless connection with the USV. Then, the sensing data are transmitted to the USV via wireless links.
On the open-source cloud platform, a device-specific data access message in JSON (JavaScript object notation) format can be obtained, which is then sent to the cloud platform by the wireless module. The cloud platform parses the data and stores it. The protocol format of data access is different for each device. When a device account is created, the system automatically assigns a device ID (identifier) and an interface key (APIKEY) to the device. When data need to be transmitted to the cloud platform, a device is required to store the data. As mentioned, each device has a fixed ID and APIKEY. When sending a data request message to the cloud platform, the device ID and APIKEY are used to access and transfer the data.
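To make the upload step concrete, the sketch below builds a JSON message carrying a device ID, an API key, and one set of readings. The field names and the message layout are hypothetical; the text does not specify the cloud platform's actual message format, so this should not be read as that platform's API.

```python
import json


def build_upload_message(device_id: str, api_key: str, depth_m: float,
                         ph: float, tds_ppm: float, turbidity: float) -> str:
    """Serialize one measurement as a JSON string (hypothetical schema)."""
    payload = {
        "device_id": device_id,     # fixed ID assigned when the account is created
        "api_key": api_key,         # interface key (APIKEY) assigned to the device
        "depth_m": depth_m,
        "readings": {
            "ph": ph,
            "tds_ppm": tds_ppm,
            "turbidity": turbidity,
        },
    }
    return json.dumps(payload, sort_keys=True)
```

The receiving side can recover the values with `json.loads`, which mirrors the parse-and-store step performed by the cloud platform.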

Water Quality Analysis Algorithm
The water quality level can be determined in a relatively straightforward way using various parameters, including the pH value, turbidity, the total number of bacteria, oxygen content, TDS value, and others. The traditional water quality monitoring methods mainly conduct water quality testing based on water quality samples with a large number of parameters, which lacks automation and efficiency. Therefore, water quality evaluation using a small number of parameters can be an effective alternative. It was shown [34,35] that the pH value was significantly positively correlated with the dissolved oxygen value, electrical conductivity, and other parameters. Besides, turbidity promotes the growth and reproduction of bacteria and the adsorption of harmful toxic inorganic and organic substances, and turbidity particles have a certain impact on human and fish health. In water without electrolysis or acid-base treatment, the salt cations are mainly calcium and magnesium, which coincides with the definition of water hardness. Therefore, water hardness can be indirectly expressed by the TDS value; when this value changes, the water quality also changes. For these reasons, the three parameters (pH, TDS, and turbidity) are used to assess the water quality in the water quality analysis system.
The proposed system uses the ensemble learning method [36,37] to predict subsequent changes in water quality by analyzing the collected water quality data in order to determine the correlation between the pH, turbidity, and TDS values. The Random Forest algorithm represents a concrete implementation of the bagging method. This algorithm trains multiple decision trees and combines the results of these trees to obtain the final result. The Random Forest can be used for classification and regression. Its performance mainly depends on the decision tree type, which is selected according to the specific task. For instance, in machine learning, if a set of objects can be classified into multiple categories, then the information on a certain class x_i can be defined as I(x_i) = −log2 p(x_i), where I(x) represents the information on a random variable, and p(x_i) refers to the probability that x_i occurs.
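As a small numerical illustration, the sketch below computes the information of a class from its probability, using the standard definition I(x_i) = −log2 p(x_i), and the entropy of a label set as the expected information. The entropy helper goes slightly beyond what the paragraph states and is included only because decision trees commonly use it as a split criterion; the function names are hypothetical.

```python
import math
from collections import Counter


def information(p: float) -> float:
    """Self-information in bits of an event with probability p (0 < p <= 1)."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    return -math.log2(p)


def entropy(labels) -> float:
    """Expected information (Shannon entropy, in bits) of a list of class labels."""
    n = len(labels)
    counts = Counter(labels)
    return sum((c / n) * information(c / n) for c in counts.values())
```

A label set split evenly between two classes has an entropy of one bit, while a set containing a single class has zero entropy.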
In Figure 8, the flowchart of the water quality analysis algorithm based on the Random Forest is presented. The steps of this algorithm are as follows.
Figure 8. Flowchart of the water quality analysis algorithm. k, the number of trees; OOB, out-of-bag.
Step 1: The raw dataset is obtained by extracting the three features, namely the pH, turbidity, and TDS values. These values are then combined with the water quality evaluation results obtained from the historical dataset. The historical data consist of manually collected water quality parameters at different depths and the water quality evaluation value obtained by the average method.
Step 2: The data obtained in Step 1 are cleaned by identifying and processing abnormal values of the three parameters and of the water quality evaluation value.
Step 3: The training subset and test samples are generated for each decision tree by the bootstrap sampling technique.
Step 4: Steps 2 and 3 are repeated k times to construct k decision trees, which form the random forest.
Step 5: The classification results of each decision tree for the test samples are summarized, and the class with the maximum number of votes is the final classification result.
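The bootstrap sampling and voting steps can be sketched as follows. This is a minimal illustration and not the implementation used in this work: each decision tree is replaced by a one-split decision stump on a randomly chosen feature so that the example stays short, and the samples, labels, and function names are hypothetical.

```python
import random
from collections import Counter


def bootstrap_split(n_samples, rng):
    """Sample indices with replacement; indices never drawn are out-of-bag (OOB)."""
    in_bag = [rng.randrange(n_samples) for _ in range(n_samples)]
    drawn = set(in_bag)
    oob = [i for i in range(n_samples) if i not in drawn]
    return in_bag, oob


def majority_vote(labels):
    """Return the most frequent label."""
    return Counter(labels).most_common(1)[0][0]


def fit_stump(rows, labels, rng):
    """Fit a one-split stump on one random feature (stand-in for a full tree)."""
    feature = rng.randrange(len(rows[0]))
    best = None
    for threshold in sorted({row[feature] for row in rows}):
        left = [lab for row, lab in zip(rows, labels) if row[feature] <= threshold]
        right = [lab for row, lab in zip(rows, labels) if row[feature] > threshold]
        left_label = majority_vote(left)
        right_label = majority_vote(right) if right else left_label
        errors = sum(lab != left_label for lab in left)
        errors += sum(lab != right_label for lab in right)
        if best is None or errors < best[0]:
            best = (errors, feature, threshold, left_label, right_label)
    return best[1:]


def predict_stump(stump, row):
    """Classify one sample with a fitted stump."""
    feature, threshold, left_label, right_label = stump
    return left_label if row[feature] <= threshold else right_label


def fit_forest(rows, labels, k, seed=0):
    """Steps 3 and 4: build k stumps, each on its own bootstrap sample."""
    rng = random.Random(seed)
    forest = []
    for _ in range(k):
        in_bag, oob = bootstrap_split(len(rows), rng)
        stump = fit_stump([rows[i] for i in in_bag], [labels[i] for i in in_bag], rng)
        forest.append((stump, oob))  # OOB indices are kept for later testing
    return forest


def predict_forest(forest, row):
    """Step 5: every stump votes and the majority class wins."""
    return majority_vote([predict_stump(stump, row) for stump, _ in forest])


# Hypothetical (pH, TDS in ppm, turbidity in NTU) samples with made-up labels.
rows = [(7.2, 40.0, 5.0), (7.5, 42.0, 6.0), (7.8, 44.0, 7.0), (8.0, 45.0, 8.0),
        (9.1, 50.0, 11.0), (9.3, 52.0, 12.0), (9.4, 55.0, 13.0), (9.6, 58.0, 14.0)]
labels = ["clean"] * 4 + ["slightly polluted"] * 4
forest = fit_forest(rows, labels, k=25)
prediction = predict_forest(forest, (9.2, 51.0, 11.5))
```

Keeping the out-of-bag indices next to each stump mirrors the OOB test samples in the flowchart: they are the samples a given tree never saw during training.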

Results and Analysis
This section presents and analyzes experimental sensing results, the performance of the water quality analysis algorithm, and water quality monitoring time.

Sensing Results
The proposed design was evaluated in an aquaculture zone; the water quality was measured at Lat. 21.881131, Long. 110.842761, as shown in Figure 9. To measure the water quality parameters accurately, the water area was gridded and 100 measuring points were selected, as shown in Figure 9. The designed USV measured the water parameters at three depths: 10 cm, 50 cm, and 100 cm. To display part of the measurement results at different depths, we report the water parameters at points (9, 19, 33, 47, 61, 75) at the three depths, as shown in Figure 9, which gives 18 monitoring points in the 15-square-meter area. Because the monitored values vary continuously, a spline interpolation technique was used to generate the cross-section map of each water parameter, so that the map gives an objective reflection of the actual parameter values. The cross-section maps of the pH, TDS, and turbidity are presented in Figure 10.
In Figure 10, it can be seen that the average pH, TDS, and turbidity values were 8, 45 ppm, and 8.5 NTU, respectively. At widths from 10 to 15 m and depths from 0 to 1 m, the pH value was greater than 9, which is beyond the safe limit. Using the proposed evaluation algorithm, the water quality in this region was found to be slightly polluted.
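The spline interpolation behind the cross-section maps can be illustrated in one dimension. The sketch below assumes a natural cubic spline along the width at a single depth; the text does not state which spline variant was used, and the positions and pH readings are hypothetical.

```python
from bisect import bisect_right


def natural_cubic_spline(xs, ys):
    """Return a function evaluating the natural cubic spline through (xs, ys).

    xs must be strictly increasing and contain at least two values.
    """
    n = len(xs) - 1
    if n < 1 or len(ys) != len(xs):
        raise ValueError("need at least two points and matching lengths")
    h = [xs[i + 1] - xs[i] for i in range(n)]

    # Right-hand side of the tridiagonal system for the second-order terms.
    alpha = [0.0] * (n + 1)
    for i in range(1, n):
        alpha[i] = 3.0 * (ys[i + 1] - ys[i]) / h[i] - 3.0 * (ys[i] - ys[i - 1]) / h[i - 1]

    # Forward sweep of the tridiagonal solve (natural ends: zero curvature).
    l = [1.0] * (n + 1)
    mu = [0.0] * (n + 1)
    z = [0.0] * (n + 1)
    for i in range(1, n):
        l[i] = 2.0 * (xs[i + 1] - xs[i - 1]) - h[i - 1] * mu[i - 1]
        mu[i] = h[i] / l[i]
        z[i] = (alpha[i] - h[i - 1] * z[i - 1]) / l[i]

    # Back substitution for the polynomial coefficients of each segment.
    b = [0.0] * n
    c = [0.0] * (n + 1)
    d = [0.0] * n
    for j in range(n - 1, -1, -1):
        c[j] = z[j] - mu[j] * c[j + 1]
        b[j] = (ys[j + 1] - ys[j]) / h[j] - h[j] * (c[j + 1] + 2.0 * c[j]) / 3.0
        d[j] = (c[j + 1] - c[j]) / (3.0 * h[j])

    def evaluate(x):
        # Pick the segment containing x, clamped to the first or last segment.
        j = min(max(bisect_right(xs, x) - 1, 0), n - 1)
        dx = x - xs[j]
        return ys[j] + b[j] * dx + c[j] * dx ** 2 + d[j] * dx ** 3

    return evaluate


# Hypothetical pH readings at six positions across a 15 m width, at one depth.
widths_m = [0.0, 3.0, 6.0, 9.0, 12.0, 15.0]
ph_values = [7.6, 7.7, 7.8, 8.0, 9.1, 9.2]
ph_at = natural_cubic_spline(widths_m, ph_values)
profile = [ph_at(float(w)) for w in range(16)]  # one value per metre of width
```

Repeating this interpolation at each depth, or applying a two-dimensional spline over width and depth, would yield a dense grid of the kind a cross-section map displays.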

Intelligent Water Quality Analysis Algorithm
Model evaluation results: the Random Forest model was tested using the Python 3.6 programming language (Python Software Foundation, Wilmington, DE, USA). A total of 2870 samples corresponding to five different water quality levels were used in the experiment. The parameters of the proposed algorithm are presented in Table 3. In the learning process of the Random Forest model, 90% of the available data were selected as the training set, and the remaining 10% were used to test the developed model. The logistic regression and support vector machine (SVM) methods were also used to analyze water quality, and their results were compared with those of the proposed method. The precision, recall, and F1 measure were used in the comparison because these measures are commonly adopted to evaluate the performance of different algorithms. The static Random Forest model was trained off-line, with the pH, TDS, and turbidity values at different depth levels as input parameters and the water quality level as the output parameter. The evaluation indexes of the different methods on the test dataset are presented in Table 4, where it can be seen that the Random Forest was superior to the other two methods regarding the precision, recall, and F1 measure (H-mean). The precisions of the Random Forest, logistic regression, and SVM were 92%, 39%, and 40%, respectively. In other words, among the tested methods, the proposed algorithm best evaluated the water quality on the test dataset. The receiver operating characteristic (ROC) curves of the different methods are presented in Figure 11, where it can be seen that the values of the area under the curve (AUC) of the Random Forest, logistic regression, and SVM were 0.6, 0.7 and 0.5, respectively. Furthermore, the AUC value of the Random Forest algorithm was larger than those of the other two methods. Thus, the Random Forest that uses only three parameters (pH, TDS, and turbidity) represents a good water quality classifier.
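For reference, the three evaluation measures can be computed per class as in the following sketch. The macro-averaging over classes is an assumption, since the text does not state how the multi-class scores were aggregated, and the label lists are hypothetical.

```python
def per_class_scores(y_true, y_pred):
    """Return {label: (precision, recall, f1)} for every label that occurs."""
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        # F1 is the harmonic mean (H-mean) of precision and recall.
        if precision + recall:
            f1 = 2.0 * precision * recall / (precision + recall)
        else:
            f1 = 0.0
        scores[label] = (precision, recall, f1)
    return scores


def macro_average(scores):
    """Unweighted mean of precision, recall, and F1 over all classes."""
    n = len(scores)
    return tuple(sum(s[i] for s in scores.values()) / n for i in range(3))


# Hypothetical true and predicted water quality levels (illustrative only).
y_true = [1, 1, 2, 2, 3, 3]
y_pred = [1, 2, 2, 2, 3, 1]
scores = per_class_scores(y_true, y_pred)
macro_precision, macro_recall, macro_f1 = macro_average(scores)
```

In this toy example, level 2 is always recovered when it is the true level (recall of 1) but is also predicted once in error, so its precision is lower than its recall.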
Water quality classification precision: two water sample sets, denoted as I and II, were taken at each position at different times, as shown in Figure 9. The water quality of these samples was determined by laboratory analysis and then compared with the values obtained by the Random Forest, SVM, and logistic regression methods.
The comparison of the actual water quality values and the values predicted by the SVM, logistic regression, and Random Forest methods is presented in Figure 12. For sample set I, the precision of the proposed Random Forest algorithm was higher than 92%, whereas those of the SVM and logistic regression methods were less than 40% and 39.8%, respectively. For sample set II, the proposed algorithm had a precision of more than 95% and outperformed the SVM and logistic regression methods. In Figure 12, it can be seen that the Random Forest algorithm was superior to the other methods regarding the precision rate. The experimental results demonstrate the feasibility of the proposed method: the water quality level can be accurately predicted by the Random Forest based on the pH, TDS, and turbidity values.

Water Quality Monitoring Time
Under normal working conditions, the USV's moving speed could reach 0.5 m/s, and the WQMM's descending speed could reach 0.01 m/s; these values were used in the water quality monitoring experiment. Continuous monitoring was conducted at different depths at 1 point (denoted as C I), 5 points (denoted as C II), and a further set of points (denoted as C III), with adjacent points 2 m apart. The experiment was repeated three times under the same conditions, and the average monitoring times of C I, C II, and C III were 400 s, 2100 s, and 4100 s, respectively. It took only three minutes from uploading the monitoring data to obtaining the evaluation results. The proposed system saved more than 60% of the time compared with the manual approach.
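The reported speeds and times can be related by simple arithmetic, as in the sketch below. It assumes that the module is lowered to the deepest reported sampling depth of 100 cm and that adjacent points are 2 m apart; C III is left out because the number of points is not given in the text.

```python
USV_SPEED_M_PER_S = 0.5       # reported USV moving speed
DESCENT_SPEED_M_PER_S = 0.01  # reported WQMM descending speed
POINT_SPACING_M = 2.0         # reported distance between adjacent points
MAX_DEPTH_M = 1.0             # deepest sampling depth reported (100 cm)

# Time to move between adjacent points and to lower the module to 1 m.
transit_time_s = POINT_SPACING_M / USV_SPEED_M_PER_S
descent_time_s = MAX_DEPTH_M / DESCENT_SPEED_M_PER_S

# Reported average monitoring times: (number of points, total seconds).
reported = {"C I": (1, 400.0), "C II": (5, 2100.0)}
seconds_per_point = {
    name: total / points for name, (points, total) in reported.items()
}
```

Under these assumptions, lowering the module dominates the motion time (about 100 s per point against about 4 s of transit), and the average time per point is 400 s for C I and 420 s for C II. How the remaining time per point is spent is not broken down in the text.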

Conclusions
In this paper, an intelligent wide-area water quality monitoring and analysis system is proposed, which combines an intelligent USV, a water quality monitoring module (WQMM), and online water quality analysis. An unmanned system is designed to control the USV's cruise automatically. The WQMM is designed by integrating the water quality sensors with a lifting device. The ensemble learning method is used to analyze water quality, which provides a scientific basis for wide-area water quality testing. The experimental results demonstrate that the proposed system can satisfy the requirements for water quality monitoring while improving the overall work efficiency. In the future, we will study USV drift control and the accuracy of the system under different environmental conditions.