Application of Regression Analysis to Achieve a Smart Monitoring System for Aquaculture

Consumer awareness has risen in recent years, and food safety has become increasingly important. Non-toxic products can be achieved by avoiding the overuse of antibiotics to control growth factors in a water environment, but the dissolved oxygen measurement tools on the market are very expensive and a great economic burden to fishermen. The purpose of this study is therefore to design more economical measurement modules and algorithms for monitoring ponds. The research collected pond data through Oxidation-Reduction Potential (ORP), pH and temperature sensors, used regression analysis to infer Dissolved Oxygen (DO) from ORP and pH, and employed a real-time pond monitoring data map to show pond conditions. Compared with traditional equipment, the findings show that our approach reduces costs by about 20% and increases production capacity and output value.


Introduction
A report by the Food and Agriculture Organization (FAO) of the United Nations estimated that total global seafood production will increase by 18% by 2030, reaching 201 million tons, with aquaculture as the main contributor [1]. Taiwan is surrounded by sea, with a coastline spanning over 1100 km, which gives it naturally favorable conditions for the development of offshore and coastal fisheries. With such an environment, its aquaculture industry has great potential for development.
People have paid increasing attention to food safety as their quality of life has improved. An investigation showed that 82% of respondents mistrusted current food safety, which highlights the importance of this issue [2]. At the same time, water quality is influenced by economic activities, such as waste water disposal from factories and aquaculture industries; the discharged residue has a great influence on the aquatic ecosystem. For this reason, water quality monitoring is very much in demand to provide a good environment for aquaculture [3][4][5][6][7].
In the era of artificial intelligence, big data monitoring and the analysis of smart farming are now trends. The popularity of networked applications has made various online upload service systems more and more convenient. The combination of simple transmission methods and various sensing systems allows users to monitor the status of culture ponds and adjust the water quality environment in a timely manner. Factors affecting the external environment of the fish population can be judged and determined through the long-term monitoring of trend charts, so as to minimize any adverse effects, improve the health of the fish, enhance the quality and output of the culture ponds, and reduce human resources and time costs.
Internet of Things (IoT) technology is widely applied in several industries. Prabha et al. designed an aquaponics system and monitored the growth of fish as well as vegetables [8]. Purwandoko traced each parameter in the production process and established related knowledge and information with a rice traceability system [9]. Rajalashmi et al. established a system for monitoring and controlling water quality in a sewage treatment plant [10]. Abinaya et al. implemented a monitoring and control system for aquaculture [11]. In this study, an IoT-based smart monitoring system is implemented that analyzes water quality from the collected pH and ORP values of aquaculture water.
ORP is measured as the potential difference between a gold or platinum electrode and a reference electrode. In a fish pond, protein metabolites are discharged in the form of organic nitrogen and ammonia, and the nitrification and denitrification that follow are redox reactions whose effect is reflected in the measured ORP [12].
Because the amount of DO influences the growth of soil microorganisms, DO represents an environmental growth indicator. When DO rises, the weight gain rate, specific growth rate and feed conversion rate of rainbow trout fingerlings all increase, showing a significant change in the growth index [13][14][15][16][17][18]. Based on previous research, this paper identifies the following issues.
(1) DO measuring system: DO measuring systems on the market cost NT$ 600,000 on average, and their internal circuit structure is too complicated to reproduce as a product. (2) Restriction of transmission space: Fish farms cannot be fully covered by a Wi-Fi environment.
To set up a smart monitoring system, it becomes necessary to use a gateway and Bluetooth technology to receive each device's measured water quality data, which can then be uploaded to a cloud database via 4G or Wi-Fi.
The main purposes of this study are to use low-cost hardware to collect various environmental data from ponds, to estimate the true DO values from those data through regression analysis, and to develop aquaculture ponds with intelligent environmental monitoring and control systems.

Methods
This paper uses linear and non-linear polynomial regression analyses. For the $i$-th sample of collected data, the pH value is denoted $X_{1i}$, the ORP value $X_{2i}$, and the amount of DO $Y_i$. The analysis is conducted on these data.

Polynomial Regression Analysis
If there is no linear relationship between $X_{1i}$ and $X_{2i}$, the least-squares equations can be solved for $\alpha$, $\beta_1$ and $\beta_2$. Substituting these three coefficients into Equation (1) gives the regression model.
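The fitting step can be illustrated with a minimal sketch. It assumes the linear model has the form DO = α + β1·pH + β2·ORP and solves the least-squares normal equations directly. The function names and the sample readings are hypothetical and are not taken from the study.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix so that row operations act on both sides at once.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: predictors may be collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_linear(ph, orp, do):
    """Least-squares fit of DO = alpha + beta1 * pH + beta2 * ORP (assumed form)."""
    rows = [[1.0, x1, x2] for x1, x2 in zip(ph, orp)]
    # Normal equations: (X^T X) coef = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, do)) for i in range(3)]
    return solve_linear_system(xtx, xty)


# Synthetic, illustrative readings generated from known coefficients.
ph = [7.0, 7.5, 8.0, 7.2, 8.3]
orp = [200.0, 260.0, 240.0, 300.0, 220.0]
do = [1.0 + 0.5 * p + 0.01 * o for p, o in zip(ph, orp)]
alpha, beta1, beta2 = fit_linear(ph, orp, do)
```

The solver raises an error when pH and ORP are collinear, which mirrors the condition stated in the text that the two predictors must not be linearly related.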

Non-Linear Polynomial Regression Analysis
where $Y$ is the response value (that is, the DO quantity) and $X_1$ and $X_2$ are the two independent variables (namely, the pH and ORP values) that make up Equation (5). The equation is converted into a five-variable linear model and fitted according to the principle of the least squares method. The coefficients $\alpha$, $\beta_1$, $\beta_2$, $\beta_3$, $\beta_4$ and $\beta_5$ are obtained from the resulting equations, and the regression model is realized by substituting them back into the original Equation (6).
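The conversion to a five-variable linear model can be sketched as a feature expansion. The sketch assumes the non-linear model is a full second-order polynomial in pH and ORP (terms pH, ORP, pH², ORP² and pH·ORP). The text does not list the five terms, so this choice, the function names and the coefficients used below are illustrative assumptions.

```python
def expand_features(ph, orp):
    """Map (pH, ORP) to five regressors of an assumed second-order model."""
    return [ph, orp, ph ** 2, orp ** 2, ph * orp]


def predict_do(ph, orp, alpha, betas):
    """Evaluate alpha + sum(beta_k * feature_k) for the five expanded features."""
    features = expand_features(ph, orp)
    if len(betas) != len(features):
        raise ValueError("expected five beta coefficients")
    return alpha + sum(b * f for b, f in zip(betas, features))


# Hypothetical coefficients, chosen only to make the arithmetic easy to follow.
example_features = expand_features(2.0, 3.0)
example_prediction = predict_do(2.0, 3.0, 1.0, [1.0, 1.0, 1.0, 1.0, 1.0])
```

Once the features are expanded, the model is linear in its coefficients, so the same least-squares procedure used for the two-variable case applies with six unknowns instead of three.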

System Architecture
In this study, an IoT-based system was designed and implemented; its architecture is shown in Figure 1. A Raspberry Pi serves as the gateway and receives data from two or more sensor modules, each including temperature, pH and ORP sensors. The received data are first filtered for repeated readings. The filtered data are then passed to the regression analysis, uploaded to the database for storage, and checked for whether they exceed the recommended indicators. The information obtained goes back into the regression analysis. Figure 2 shows the system push message activity diagram, which includes estimating the DO amount from the pH and ORP values, sending data to GCM, and enabling the push broadcast service. When a push message is sent, the REGID of each mobile App device is retrieved from the database so that the message is delivered to the correct device and triggers a warning reminder, as shown in Figure 3.
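The gateway flow described above (filter repeated readings, estimate DO, check against the recommended range) can be sketched as follows. The `Reading` type, the regression coefficients and the DO limits are hypothetical placeholders, and the database upload and the push notification are left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reading:
    """One sensor-module sample; frozen so it can be stored in a set."""
    pond: str
    temperature: float
    ph: float
    orp: float


def estimate_do(ph, orp, coef=(1.0, 0.5, 0.01)):
    """Estimate DO from pH and ORP with placeholder linear coefficients."""
    alpha, beta1, beta2 = coef
    return alpha + beta1 * ph + beta2 * orp


def process(readings, do_limits=(4.0, 12.0)):
    """Drop exact repeats, estimate DO, and flag values outside do_limits."""
    seen = set()
    results = []
    low, high = do_limits
    for reading in readings:
        if reading in seen:
            continue  # repeated reading: skip it
        seen.add(reading)
        do = estimate_do(reading.ph, reading.orp)
        results.append({
            "pond": reading.pond,
            "do": round(do, 2),
            "alert": not (low <= do <= high),
        })
    return results


batch = [
    Reading("A", 25.0, 7.0, 200.0),
    Reading("A", 25.0, 7.0, 200.0),  # exact duplicate of the first reading
    Reading("B", 26.0, 5.0, 10.0),
]
results = process(batch)
```

In a deployment, each entry with `alert` set would be the trigger for the push message, while all entries would be stored in the database.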

Figure 3. System push message activity diagram.
Figure 4 shows the sensing module mounted on a styrofoam plate located at the edge of the fish pond. The Arduino UNO board and the Bluetooth module are placed near the pond. Measured signals are transferred over Bluetooth to the Raspberry Pi, which acts as an information gateway that receives the detected data from the pond. With this approach, multiple sensor devices set up over multiple fish ponds can be used under one integrated system. In this study, the total cost of equipment, including the pH and ORP sensors as well as supplies, is about USD 360. Individually, the ORP sensor costs USD 130, the pH sensor USD 35 and the Raspberry Pi USD 70. An offline DO meter on the market costs USD 1500 on average.

Sensor Correction Module
The sensors are placed in the culture pond. However, their accuracy is greatly affected by fish feed deposited on the sensor surface, as well as by impurities in the water. After a period of time, the values captured by the ORP and pH sensors drift, which affects the accuracy of the estimated DO. Therefore, a sensor calibration module is established.
To address these problems, this study added a DO field to the management system so that the manager can enter measured DO values. As shown in Figure 5, when the manager needs to use the automatic calibration system, the automatic calibration button is pressed to enter the settings page. The number of samples to be used for calibration is set, and the confirmation button is pressed to start the automatic regression correction.
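One way to picture the calibration step is a session that collects a set number of manually measured DO samples and then corrects the model. The sketch below is a deliberate simplification: it assumes a linear model and re-estimates only the intercept by least squares while keeping the slopes fixed, whereas the text describes a full regression correction. All names and numbers are hypothetical.

```python
class CalibrationSession:
    """Collect (pH, ORP, measured DO) samples, then re-estimate the intercept."""

    def __init__(self, sample_count, beta1, beta2):
        if sample_count < 1:
            raise ValueError("sample_count must be at least 1")
        self.sample_count = sample_count
        self.beta1 = beta1
        self.beta2 = beta2
        self.samples = []
        self.alpha = None  # set once enough samples have been collected

    def add_sample(self, ph, orp, measured_do):
        """Store one sample; return True when recalibration has just run."""
        self.samples.append((ph, orp, measured_do))
        if len(self.samples) < self.sample_count:
            return False
        # With fixed slopes, the least-squares intercept is the mean residual.
        residuals = [
            do - (self.beta1 * p + self.beta2 * o) for p, o, do in self.samples
        ]
        self.alpha = sum(residuals) / len(residuals)
        self.samples = []
        return True


# Illustrative run: the sensors have drifted so that the true intercept is 1.3.
session = CalibrationSession(sample_count=3, beta1=0.5, beta2=0.01)
inputs = [(7.0, 200.0), (7.5, 250.0), (8.0, 300.0)]
flags = [session.add_sample(p, o, 1.3 + 0.5 * p + 0.01 * o) for p, o in inputs]
```

The sample count plays the role of the number of samples entered on the settings page, and the final `add_sample` call corresponds to pressing the confirmation button.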

Dissolved Oxygen Prediction Analysis
Over-fitting in regression analysis produces inaccurate predictions and typically results from insufficient training data or too many input parameters. In this study, the data set is divided into two parts: a training data set and a validation data set. The training data set is used to fit the model, and the validation data set is used to observe any over-fitting trend.
Two ponds were targeted for the experimental data, and 60 records were taken as a set. Of these, 50 records were used to train the regression and 10 records were used for validation. The study compared polynomial regression analysis with non-linear polynomial regression analysis.
For the polynomial regression analysis, the predicted DO values and the real DO values of the 50 training records were plotted as a line chart, as in Figure 6. The real values and the prediction results of the non-linear polynomial regression analysis are shown in Figure 7.
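The 50/10 division of a 60-record set can be sketched as below. The text does not say how the records were assigned to each part, so the seeded random shuffle here is an assumption, as are the function name and the placeholder records.

```python
import random


def split_records(records, n_train=50, seed=0):
    """Shuffle a copy of the records and split it into training and validation."""
    if n_train > len(records):
        raise ValueError("n_train exceeds the number of records")
    shuffled = list(records)
    # A fixed seed keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]


# Placeholder records: integers stand in for (pH, ORP, DO) tuples.
records = list(range(60))
train, validation = split_records(records)
```

Comparing the error on `train` with the error on `validation` is what reveals over-fitting: a model that fits the training part much better than the validation part has learned noise.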

The prediction results in Figures 6 and 7 did not show large differences between the training data set and the validation data set. Neither the polynomial regression analysis nor the non-linear polynomial regression analysis showed any sign of over-fitting.
The study first calculates the sum of squared errors (SSE) for the two regression approaches, then takes the difference between the real dissolved oxygen ($DO_{real}$) and the predicted dissolved oxygen ($DO_{predict}$) and divides it by $DO_{real}$ to obtain the variation rate, as follows:
$$\text{Variation rate} = \frac{DO_{real} - DO_{predict}}{DO_{real}} \times 100\% \quad (7)$$
According to Equation (7), the polynomial regression analysis estimates DO within a 0.82% error (Table 1), which is within a reasonable range. The non-linear polynomial regression analysis predicts DO values within a 0.42% error, which is lower than that of the linear regression model.
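The two error measures can be computed as in the following sketch. The text does not state how the per-sample variation rates are aggregated, so averaging their absolute values is an assumption. The sample values are made up for illustration and are not the study's data.

```python
def sum_squared_errors(real, predicted):
    """SSE: sum over samples of (real - predicted) squared."""
    return sum((r - p) ** 2 for r, p in zip(real, predicted))


def variation_rate_percent(real, predicted):
    """Mean of |real - predicted| / real over all samples, as a percentage."""
    if not real or len(real) != len(predicted):
        raise ValueError("real and predicted must be non-empty and equal length")
    rates = [abs(r - p) / r for r, p in zip(real, predicted)]
    return 100.0 * sum(rates) / len(rates)


# Illustrative values only.
do_real = [8.0, 10.0]
do_predict = [7.6, 10.5]
sse = sum_squared_errors(do_real, do_predict)
rate = variation_rate_percent(do_real, do_predict)
```

Here each prediction is off by 5% of the real value, so the averaged variation rate is 5%, while the SSE is 0.4² + 0.5² = 0.41.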

Web Page Management Mode
In Figure 8, the manager enters the management page of the website to view the current aquaculture ponds and to monitor the measured data of each pond. Farmed fish tanks can be added, modified or deleted. In this way, the detected information of the fish ponds can be managed conveniently in real time.
Given the different growth environments of each aquaculture pool, the environment parameters should be set individually for each pool, such as the fish category, pool name, ORP limit, pH limit, temperature limit and DO limit. The user interface is shown in Figure 9.
The power durability of the remote control module should be studied closely. Power can be saved by a partial shutdown module: most of the system can be shut down when there are no data to transmit, and the entire system can be booted up when data are ready [19,20]. Therefore, we designed an operation mode in which the administrator can set the remote control manually, including the maximum number of items stored in the repository, the APP data refresh cycle and each parameter's tolerance range, and can filter repeated information to avoid data duplication, as seen in Figure 10. This approach reduces wasted power and improves the durability of the remote module.
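The per-pool settings entered on the management page can be pictured as a small configuration record with a range check. The field names, the fish category and every limit value below are hypothetical and serve only to show the structure.

```python
from dataclasses import dataclass, field


@dataclass
class PondConfig:
    """Per-pool settings: identity plus a (low, high) limit for each parameter."""
    pool_name: str
    fish_category: str
    limits: dict = field(default_factory=dict)


def out_of_range(config, reading):
    """Return the names of parameters whose value falls outside its limits."""
    violations = []
    for name, (low, high) in config.limits.items():
        value = reading.get(name)
        if value is not None and not (low <= value <= high):
            violations.append(name)
    return violations


# Hypothetical limits for one pool.
config = PondConfig(
    pool_name="Pond A",
    fish_category="tilapia",
    limits={
        "ph": (6.5, 8.5),
        "orp": (150.0, 400.0),
        "temperature": (22.0, 32.0),
        "do": (4.0, 12.0),
    },
)
reading = {"ph": 7.2, "orp": 120.0, "temperature": 28.0, "do": 3.5}
violations = out_of_range(config, reading)
```

Keeping the limits per pool rather than global lets ponds with different species use different acceptable ranges, which is the motivation given in the text.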
In this study, the Raspberry Pi was connected to a 100 mAh rechargeable battery to measure the power consumption during both the transmission period and the idle period. When a detector transferred 100 data items per second, the current draw was about 720 mA, and the module could keep working for 1.3 h. When data transfer stopped, the current draw was about 350 mA, and the module could keep working for 4 h. Based on this result, the amount of data transferred can be set on the web pages to save power.

APP Monitoring Mode
In this study, a mobile APP was designed and developed to monitor the water quality in an aquaculture pool. If the detected water quality exceeds the limits set for the index parameters, a warning is shown in red. When the mobile APP is opened, the current condition of the ponds is displayed on the main page. If the values of an aquaculture pond exceed the recommended indicators, a notification is shown on the main page to remind the manager, as shown in Figure 11a. When the manager opens the aquaculture ponds menu, the mobile APP dynamically displays the number of aquaculture ponds and their current parameters according to the currently paired Bluetooth device, as shown in Figure 11b.
Figure 11. APP monitoring mode for (a) warning attention and (b) aquaculture pond status.

Conclusions
This study presents a low-cost monitoring system for aquaculture. Farmers can monitor the current information and historical data of culture ponds. The equipment used in this study costs about USD 350, whereas a complete commercial system costs about USD 2000 on the market.
The amount of DO influences the growth of soil microorganisms; that is, DO indicates the quality of the aquatic environment. Because the DO value follows a trend similar to that of the pH and ORP values, DO increases as the pH and ORP values increase. On this basis, we validated the feasibility of using pH and ORP values to estimate the DO value via regression. A linear polynomial regression approach and a non-linear regression approach were used to predict the value of DO. By comparing the SSE of the two approaches against the actual results, we determined that the non-linear regression approach was more accurate than the linear one.
The managers' user interface was built with HTML in combination with PHP. This foundation makes it convenient for management to set up the parameters of the aquaculture pond and the Raspberry Pi in an orderly manner. The mobile APP was developed for the Android operating system. Its main purpose is to receive information about the aquaculture pond and to send notifications to users. The APP monitors whether the pond's environmental status stays within the range of standard values, anytime and anywhere, so that human resources can be used more efficiently.