A Machine-Learning-Based IoT System for Optimizing Nutrient Supply in Commercial Aquaponic Operations

Nutrient regulation in aquaponic environments has been a topic of research for many years. Most studies have focused on appropriate control of nutrients in an aquaponic set-up, but very little research has been conducted on commercial-scale applications. In our model, the input data were sourced on a weekly basis from three commercial aquaponic farms in Southeast Texas over the course of a year. Due to the limited number of data points, dimensionality reduction techniques such as pairwise correlation matrix were used to remove the highly correlated predictors. Feature selection techniques such as the XGBoost classifier and Recursive Feature Elimination with ExtraTreesClassifier were used to rank the features in order of their relative importance. Ammonium and calcium were found to be the top two nutrient predictors, and based on the months in which lettuce was cultivated, the median of these nutrient values from the historical dataset served as the optimal concentration to be maintained in the aquaponic solution to sustain healthy growth of tilapia fish and lettuce plants in a coupled set-up. To accomplish this, Vernier sensors were used to measure the nutrient values and actuator systems were built to dispense the appropriate nutrient into the ecosystem via a closed loop.


Introduction
The system of aquaponics involves the adaptation of hydroponics and aquaculture in a combined set-up which can go on to eliminate the food scarcity and environmental crisis that the world faces today [1,2]. Due to rapid industrialization and large-scale globalization, the economic and social aspects of the varieties of food products emerging from aquaponics have been of great interest as research studies go on to show that it is possible to achieve a ten-fold increase in production without the use of harmful chemicals or pesticides, while using only 2-10% of the water used in traditional agricultural techniques [3][4][5]. Although there have been many studies over the last three decades on growing plants in aquaponic set-ups, none of them have focused specifically on determining the important nutrients which need to be monitored and controlled, depending on the season in which the crop was cultivated.
There have been numerous studies in which different Internet of Things (IoT) systems have been implemented in small laboratory set-ups for optimal growth of plants in hydroponic and aquaponic environments. In [6], Valiente et al. constructed a system to monitor the pH level and water temperature for growing Nile tilapia and romaine lettuce, and constructed an actuation system to maintain the pH level between 6.2 and 7.5 and water temperature between 27 • C and 30 • C to ensure optimal growth. In [7], Yanes et al. conducted a review study on IoT and smart systems for monitoring and for growing rice. Similarly, in [32], Zhang et al. proposed the use of sensors to measure air temperature, air humidity, soil temperature and soil humidity to monitor the growth of Chinese citrus in real-time environments using Zigbee technology. In [33], Yimwadsana et al. proposed an IoT-controlled system for plant growth consisting of sensors to measure air, light intensity and soil moisture, a dashboard to visualize the collected data and actuator modules consisting of relays, motor gear and water pump controlled by microcontrollers. In [34], Chaudhary et al. proposed a system consisting of a wireless sensor network to monitor and control the environmental parameters in a greenhouse. Similar to the above, in [35], Jaiswal et al. constructed a system to measure the parameters stated above in a greenhouse, controlling them through a Raspberry Pi module and displaying the data on a ThingSpeak cloud platform. Our research measures and regulates the important nutrients in the aquaponic solution which are determined from data-driven approaches rather than regulating the chemical properties of the solution such as pH and conductivity or the environmental parameters of the greenhouse such as temperature and humidity.

Methodology
The dataset used in our case was recorded over the course of a year from three aquaponic facilities in Southeast Texas: Aquatic Greens Farm (Bryan), Wolff Family Farms (Caldwell) and Texas US Farms (Grimes). From the Aquatic Greens Farm, three samples were collected weekly (one from the tank which bred goldfish, another from the tank which bred tilapia and one from the greenhouse where lettuce was grown). From the Wolff Family Farms, two samples were collected weekly (one from the tank which bred Nile tilapia and the other from the main greenhouse where collard greens, jalapeno peppers and lettuce were grown). From the Texas US Farms, three samples were collected weekly (one from the main growth tank which was used to breed tilapia and shrimp, one from the greenhouse which was used to grow lettuce and kale greens and the other from the tank which was used to breed seedlings before placing them in the main plant bed). After collecting the samples, these were sent to the Soil, Forage and Water Testing Laboratory at Texas A&M University, USA for determining the major nutrients in the aquaponic solution. In the laboratory, the nitrate concentration in the aquaponic solution was measured by reducing the nitrites to nitrates using a cadmium column followed by spectrophotometry. The chloride concentration in the aquaponic solution was reduced by chromatography and the carbonate/bicarbonate concentration was determined by using acid titration using sulfuric acid. A set-up of the fish tank and the main plant bed which was used for growing lettuce is shown in Figure 1.

Analysis of the Dataset
The dataset which was used for analysis in our case had a total of 211 observations and 12 predictors. The predictors consisted of the concentrations of calcium, magnesium, sodium, potassium, boron, carbonate, bicarbonate, sulfate, chloride, nitrate, ammonium and phosphorus (all of these measured in ppm). The month in which the observations were recorded was used as the class variable, i.e., the winter and spring months from November to March in which plants were cultivated were coded as 1, and the summer and fall months from April to October were coded as 0, respectively. The predictor which

Analysis of the Dataset
The dataset which was used for analysis in our case had a total of 211 observations and 12 predictors. The predictors consisted of the concentrations of calcium, magnesium, sodium, potassium, boron, carbonate, bicarbonate, sulfate, chloride, nitrate, ammonium and phosphorus (all of these measured in ppm). The month in which the observations were recorded was used as the class variable, i.e., the winter and spring months from November to March in which plants were cultivated were coded as 1, and the summer and fall months from April to October were coded as 0, respectively. The predictor which stored the carbonate concentration in ppm was dropped before carrying out any preliminary analysis as the variance was zero throughout.

Generation of Synthetic Data
As the dataset was too small to make inferences on, it was an important measure to generate synthetic data before applying any data-driven approach. The dataset at hand had 107 observations belonging to Class 0 and 104 observations belonging to Class 1. Therefore, it was decided to use the concept of generation of mean and covariance matrix between and among the classes using two variations of Monte-Carlo (MC) approaches to generate synthetic data which were used for carrying out inferences.
At first, a mean vector and covariance matrix were generated between the observations belonging to class 0 and also for class 1. Both classes were considered separately while generating these matrices, and based on these computed parameters, the synthetic data points were generated. The same process was repeated again where all the observations were considered together as a single dataset, irrespective of their classes, to generate the mean vector and covariance matrices that were used to produce the second set of synthetic data. In the end, both datasets were combined and different feature selection techniques were carried out on them to select the top two macronutrient predictors, as described in detail in the next section.

Feature Selection
Before constructing an IoT system for sensing and regulating the nutrient parameters in a feedback loop, it was important to select the most important parameters to scale down the cost of the system. For this, three methods were used in a pipeline for reducing the dimensionality of the dataset, which is depicted in Figure 2 below. The initial dataset which was used in this case followed a multivariate normal distribution, due to which, it was decided not to normalize the dataset before proceeding with the feature selection techniques. A pairwise correlation matrix was computed to remove the highly correlated predictors, as having these predictors in the analysis can increase the bias as well as degrade the interpretability of the model. After removing the features with the highest pairwise correlation, the XGBoost classifier was used to construct the relative feature importance by constructing boosted decision trees within the model. In the end, the ExtraTreesClassifier was used to aggregate the results of multiple decorrelated decision trees to generate feature importance. After this pipeline was implemented, the top two nutrients were selected, which were sensed and regulated using an IoT set-up that is described in detail in the next section.

IoT System for Nutrient Regulation
In order to build the IoT system for regulating nutrients, the entire system was divided into three parts, namely: (a) the sensor subsystem, (b) the feedback loop and (c) the actuator system. The sensor subsystem consists of two sensors, manufactured by Vernier, to measure the important nutrients and output the data to the Raspberry Pi through USB.
The sensors were selected by a thorough trade study, with consideration of cost, level of accuracy, type of interface required and ease of use. The two Vernier sensors which The initial dataset which was used in this case followed a multivariate normal distribution, due to which, it was decided not to normalize the dataset before proceeding with the feature selection techniques. A pairwise correlation matrix was computed to remove the highly correlated predictors, as having these predictors in the analysis can increase the bias as well as degrade the interpretability of the model. After removing the features with the highest pairwise correlation, the XGBoost classifier was used to construct the relative feature importance by constructing boosted decision trees within the model. In the end, the ExtraTreesClassifier was used to aggregate the results of multiple decorrelated decision trees to generate feature importance. After this pipeline was implemented, the top two nutrients were selected, which were sensed and regulated using an IoT set-up that is described in detail in the next section.

IoT System for Nutrient Regulation
In order to build the IoT system for regulating nutrients, the entire system was divided into three parts, namely: (a) the sensor subsystem, (b) the feedback loop and (c) the actuator system. The sensor subsystem consists of two sensors, manufactured by Vernier, to measure the important nutrients and output the data to the Raspberry Pi through USB.
The sensors were selected by a thorough trade study, with consideration of cost, level of accuracy, type of interface required and ease of use. The two Vernier sensors which were used can measure calcium and ammonium up to 40,000 parts per million (ppm), with an accuracy of ±10%. They have the capability to interface with a Raspberry Pi through USB or Bluetooth, and are programmed through Python using the manufacturer's Python driver. These sensors are calibrated by rinsing with distilled water. Each sensor samples the water five times before outputting an average of the results to the feedback loop which computes the values in ppm.
To properly measure and regulate the nutrient levels within the entire system, a feedback loop is used to continually monitor the environment. It connects to both the sensor system and actuator system to retrieve current nutrient amounts and release each nutrient in uniform concentration if needed. This program was created using the Python programming language due to the availability of libraries developed for usage with the Vernier ion-selective electrodes. The sensors themselves do not have any form of serial communication, so USB connectivity was used. A Python-based feedback loop also presents the opportunity to develop highly object-oriented code for ease of debugging and future improvements. A Raspberry Pi Model 3B+ was selected to run the program. This device also contains multiple USB connections to power the sensor module.
In order to run, the feedback loop first initializes all connected systems including the Vernier ISE probes and actuators. Upon a successful set-up sequence, the loop begins to take readings from the Vernier sensors and averages them five times to use as the current nutrient level. If the measured levels remain below the targeted nutrient levels, the actuator system receives a signal through the GPIO pins of the Raspberry Pi to run one cycle of nutrient release, slightly increasing the concentration of the given nutrient. As the nutrient deficit is not being absolutely compared, the feedback loop's iterative cycling helps to continually maintain a target goal. Rather than dropping an estimated amount in all at once to make up any measured deficit, the continual cycling and addition of small amounts better approaches a consistent system with a smaller chance of nutrient oversaturation or undersaturation.
The actuator subsystem in this case consists of a PIC microcontroller, 12 V tolerance motor modules and two 12 V stepper motors to dispense the nutrients. The stepper motors are connected to the motor modules which supply the motors with 12 V as well as the signals to rotate the dispensing portion. The motor modules are then connected to the PIC microcontroller, which determines the rotational direction of the motor and the speed of rotation. The motors are activated when the PIC microcontroller receives a bit signal from the Raspberry Pi to indicate that more nutrients are needed.

Results
A brief overview of the pipeline used to carry out the analysis is described in Figure 3 below.
The first step before analyzing the dataset is to remove the predictors which had zero or null variance in their values. Carbonate was removed from the dataset as it had a value of zero throughout the dataset. After this, it was decided to proceed with synthetic data generation using Monte-Carlo approaches. Standardization or normalization of the data points was not carried out as all the predictors were distributed normally. A total of 2675 data points were generated for each class using a variation of the Monte-Carlo technique in which a separate mean and covariance matrix was produced for each class. Similarly, 5350 synthetic data points were generated, sharing the mean and covariance matrix between both the classes. This resulted in a total of 10,700 observations with 12 predictors, which were then used for further analysis. As the process of synthetic data generation and feature selection algorithms was moderately computationally exhaustive, an x64-based Intel(R) Core(™) i7-processor-based system with 8 GB RAM was used to run the algorithms. motor modules and two 12 V stepper motors to dispense the nutrients. The stepper motors are connected to the motor modules which supply the motors with 12 V as well as the signals to rotate the dispensing portion. The motor modules are then connected to the PIC microcontroller, which determines the rotational direction of the motor and the speed of rotation. The motors are activated when the PIC microcontroller receives a bit signal from the Raspberry Pi to indicate that more nutrients are needed.

Results
A brief overview of the pipeline used to carry out the analysis is described in Figure  3 below. The first step before analyzing the dataset is to remove the predictors which had zero or null variance in their values. Carbonate was removed from the dataset as it had a value of zero throughout the dataset. After this, it was decided to proceed with synthetic data generation using Monte-Carlo approaches. Standardization or normalization of the data points was not carried out as all the predictors were distributed normally. A total of 2675 data points were generated for each class using a variation of the Monte-Carlo technique in which a separate mean and covariance matrix was produced for each class. Similarly, 5350 synthetic data points were generated, sharing the mean and covariance matrix between both the classes. This resulted in a total of 10,700 observations with 12 predictors, which were then used for further analysis. As the process of synthetic data generation and feature selection algorithms was moderately computationally exhaustive, an x64-based Intel(R) Core(™) i7-processor-based system with 8 GB RAM was used to run the algorithms.
As mentioned in the section above, a pipeline of feature selection techniques was applied to the dataset to select the top two nutrients. At first, a pairwise correlation matrix was generated between the predictors and is mentioned in Figure 4. As mentioned in the section above, a pipeline of feature selection techniques was applied to the dataset to select the top two nutrients. At first, a pairwise correlation matrix was generated between the predictors and is mentioned in Figure 4.  From Figure 4, it can be inferred that the pairwise correlation between the predictors is not greater than 0.05. Therefore, it was decided to proceed with the entire dataset without eliminating any predictors. Further, the XGBoost algorithm was used to generate Fscores for each predictor, as shown in Figure 5. From Figure 4, it can be inferred that the pairwise correlation between the predictors is not greater than 0.05. Therefore, it was decided to proceed with the entire dataset without eliminating any predictors. Further, the XGBoost algorithm was used to generate F-scores for each predictor, as shown in Figure 5. Figure 5 shows that the feature importance of sulfate is the highest, as the F-score generated by the XGBoost algorithm is around 375. This was followed by the feature importance of calcium, sodium, chloride, bicarbonate and potassium, with values ranging between 325 and 375. The rest of the nutrient predictors were excluded from our analysis before proceeding to the third step of the feature selection pipeline. From Figure 4, it can be inferred that the pairwise correlation between the predictors is not greater than 0.05. Therefore, it was decided to proceed with the entire dataset without eliminating any predictors. Further, the XGBoost algorithm was used to generate F-scores for each predictor, as shown in Figure 5.  Figure 5 shows that the feature importance of sulfate is the highest, as the F-score generated by the XGBoost algorithm is around 375. This was followed by the feature importance of calcium, sodium, chloride, bicarbonate and potassium, with values ranging between 325 and 375. The rest of the nutrient predictors were excluded from our analysis before proceeding to the third step of the feature selection pipeline.
The final step in the feature selection process is applying the ExtraTreesClassifier with Recursive Feature Elimination on the trimmed dataset to generate feature importance, as shown in Figure 6. The final step in the feature selection process is applying the ExtraTreesClassifier with Recursive Feature Elimination on the trimmed dataset to generate feature importance, as shown in Figure 6. The ExtraTreesClassifier revealed that the feature importance values of ammonium and calcium were 52% and 16%, respectively, making up 68% of the total feature importance in the dataset. Next, the main focus was to automate the sensing and actuation of these nutrients using a feedback loop. As discussed above, two Vernier sensors were used to measure ammonium and calcium, and two actuator modules were designed to release these nutrients if they fall below the recommended concentrations. A model setup of the sensing, actuation and feedback unit is shown in Figure 7. The ExtraTreesClassifier revealed that the feature importance values of ammonium and calcium were 52% and 16%, respectively, making up 68% of the total feature importance in the dataset. Next, the main focus was to automate the sensing and actuation of these nutrients using a feedback loop. As discussed above, two Vernier sensors were used to measure ammonium and calcium, and two actuator modules were designed to release these nutrients if they fall below the recommended concentrations. A model set-up of the sensing, actuation and feedback unit is shown in Figure 7.
From Figure 7, it was inferred that the output of the ML model, which is either 0 or 1, states the month in which the observations were recorded and has a set of recommended levels of nutrients that need to be maintained in the aquaponic solution. For each recommendation value stated in the table, the median of the nutrient values from the historical dataset per class was chosen as the concentration that should be maintained in the aquaponic solution. The ExtraTreesClassifier revealed that the feature importance values of ammonium and calcium were 52% and 16%, respectively, making up 68% of the total feature importance in the dataset. Next, the main focus was to automate the sensing and actuation of these nutrients using a feedback loop. As discussed above, two Vernier sensors were used to measure ammonium and calcium, and two actuator modules were designed to release these nutrients if they fall below the recommended concentrations. A model setup of the sensing, actuation and feedback unit is shown in Figure 7.  The overall nutrient regulation and monitoring system functions through the combination of three main components and is explained in detail with the help of the block diagram in Figure 8. From Figure 7, it was inferred that the output of the ML model, which is either 0 or 1, states the month in which the observations were recorded and has a set of recommended levels of nutrients that need to be maintained in the aquaponic solution. For each recommendation value stated in the table, the median of the nutrient values from the historical dataset per class was chosen as the concentration that should be maintained in the aquaponic solution.
The overall nutrient regulation and monitoring system functions through the combination of three main components and is explained in detail with the help of the block diagram in Figure 8. The first subsystem involves the nutrient level input to be fed through the rest of the system. In order to measure the current level of calcium and ammonium in the water at any point in time, two Vernier ion-selective electrodes were inserted into the water as shown in Figure 7a and given at least 30 min to accumulate to the present concentration. The sensors then communicated the measured nutrient concentration data through USB  The first subsystem involves the nutrient level input to be fed through the rest of the system. In order to measure the current level of calcium and ammonium in the water at any point in time, two Vernier ion-selective electrodes were inserted into the water as shown in Figure 7a and given at least 30 min to accumulate to the present concentration. The sensors then communicated the measured nutrient concentration data through USB protocol with the Raspberry Pi 3 B+ that is situated in the 3D printed box displayed in Figure 7b.
The second main component of the nutrient and regulation system involved the feedback loop which connects both the sensor and actuator systems. The Raspberry Pi ran this Python-based program to take in sensor readings and output signals for nutrient dispensing, as shown in Figure 7d. In order to integrate with the Vernier sensors, the program imported the GDX library provided by Vernier which holds initialization and interaction functions. With the sensors able to be communicated with, the feedback loop was able to call the sensors five times at an interval of five min in order to attain an average measured nutrient level to use. This level (in ppm) was then compared to the target nutrient levels defined by one of the two models listed in Table 1. The current Machine Learning model being used was altered manually through the use of a GPIO toggle switch depending on the season in which the crops were grown. Another switch was used as well to represent a pause function that allows the user to stop any function in order to perform side tasks such as sensor calibration while preventing unnecessary nutrient dispersal. If the current amount of each nutrient was lower than expected, a signal was sent through GPIO pins on the Raspberry Pi to the PIC32 microcontroller connected to the actuators. The final main component of this system involved the Nema 17 stepper motor actuators which dispensed the powdered nutrients. This system only operates when a high signal is received from the Raspberry Pi to the PIC32 microcontroller. The PIC32 interprets this signal to initiate the stepper motors 90 degrees. These actuators would spin a perforated wheel connected to a tank filled with each nutrient, as seen in Figure 7c. The tanks themselves had holes cut out at the bottom which line up with holes inside of the connected wheel. After the actuator was able to complete a cycle, a filled wheel hole was able to dispense 0.6 g of the respective powder into the tank of water. This cycle continued until the nutrient levels reached a stable level.
A recommendation system which has been implemented on the basis of the binary data-driven model is shown in Table 1.
Based on the above set of recommendations, some test runs were carried out and the results of the IoT-based system using Machine Learning as a Decision Support System are presented in detail in Table 2 below.
In order to demonstrate the efficiency of the proposed aquaponic system, several test runs were carried out over a period of five months from the end of October 2021 to early March 2022 in an aquaponic system containing 450 gallons, and the results of the test runs are presented in Table 2.
Some of the observations which are clearly demonstrated from Table 2 are as follows. As the sensors operate five times in a single cycle and can be seen in the test run dated 22 October 2021, the amount of ammonium sulfate dispensed in the first iteration increases the concentration of ammonium in the aquaponic solution by 0.29 ppm. This cycle of sensing and dispensing the above stated nutrients happens until both the concentrations of ammonium and calcium reach the desired level as stated in Table 1. The calcium nitrate and ammonium sulfate powders are released in dosage of 0.5 g and the respective nutrient concentrations are measured every cycle. This has been further demonstrated in the last three test runs on 9 December 2021, 8 January 2022 and 3 February 2022 where the sensing and actuation of nutrients do not run for five cycles as the desired concentration of the nutrients is achieved much before that. As the system monitors the nutrients in a closed loop, the loop is terminated before hitting five cycles.

Discussion
The main motivation behind developing this approach is to build a recommendation system with real-time sensing and actuation units to regulate nutrient levels in the aquaponic solution for optimal growth of fish and crop plants in a single set-up. In order to achieve this, the prerecorded observations from the historical dataset were used to design the approach and a set of rules was prescribed for both the classes. This is the first of its kind where IoT systems were combined with Machine Learning for optimizing nutrient supply in aquaponic solutions. The major advantage of a datadriven IoT system compared to a traditional aquaponic system is cost savings through improved yield and quality of produce via the integration of AI. This in turn significantly increases the profits generated from the aquaponic operations. Another solution provided in this research is the reduction in the amount of nutrients supplied for plant growth. In many of the earlier aquaponic studies, excess phosphorus accumulation due to unregulated use of fertilizers has resulted in eutrophication, leading to algal growth and a steep decline in fish populations [36]. The nutrient accumulation problem has been addressed in this research through IoT-based nutrient optimization.
For the observations for which the month class is 0, cultivation occurs during the dry and hot summers of Texas. During these months, the fish in the aquaponic set-up grew from fries to the adult stage. As ammonium is a byproduct formed from protein metabolism, it is excreted in high amounts during these months [37,38]. Since high ammonium concentrations are toxic to fish, it is important to restrict ammonia to a maximum level of 1.82 ppm. On the other hand, ammonium is a source of nitrogen for plants [39], resulting in higher yields, though greater than optimal levels can also be injurious to crop growth. For the observations for which the month class is 1, cultivation is performed during the cold winter months. During this period, the fish growth is minimal, and in turn the amount of ammonium released in water is comparatively low. The concentration of ammonium during the winter months can be restricted to 1.55 ppm, which is sufficient for plant growth and is also a safe limit for sustainable fish growth in commercial set-ups.
Most of the fish grown during the summer months in aquaponic set-ups in Texas are warm-water fish species. As mentioned above, calcium is an important element required for normal growth and reproduction of fish species [40]. Calcium levels need to be maintained at 32 ppm during these months, which allows for maintenance of water hardness between 75 and 125 ppm, while still being safe for plant growth. Calcium concentrations beyond this level can cause tip burn in lettuce plants. During the winter months, plant growth is considerably slower and excess levels of carbonates or bicarbonates can be toxic to plants [41]. Therefore, the recommended level of calcium in the aquaponic solution was 27 ppm, which was sufficient to sustain fish growth without injuring the winter crops.
This study specifically addresses the optimization of the top two nutrients (ammonium and calcium) which should be regulated depending on the month in which the plants and fish are grown in a closed-loop control aquaponic set-up. As ammonium is a key factor in controlling the pH and TDS of the solution, regulating the ammonium concentration addresses most of the issues which may occur due to TDS imbalance in the solution [42,43]. On the other hand, the concentration of calcium plays an important role in determining the hardness of the solution. Therefore, regulating both of these nutrients through an IoTbased sensing and actuation system ensures that pH is adequately regulated for promoting optimal growth and yield of both fish and crop plants in a single set-up.
As discussed earlier, multiple tests were carried out between October of 2021 and February of 2022 to validate our design in a leading commercial aquaponic set-up in Grimes, TX. Instead of randomly adding nutrients to the aquaponic set-up in varying concentrations as in traditional decoupled aquaponic set-ups, nutrients were added in small quantities and were regularly monitored using nutrient sensors specified before. The biggest advantage of this automated system is reduction in the overall cost incurred in cleaning up the unregulated aquaponic systems due to eutrophication resulting from overaccumulation of nutrients. The other advantage of the system is reduction in the number of nutrients which are monitored in commercial aquaponic systems. Barring the increase in yield and size of the produce which has been discussed in the conclusion, these are the major advantages which have been observed in our controlled aquaponic set-up compared to that of the existing commercial operations.
In most commercial aquaponic systems, decoupled systems are used where fish and plants are not connected together. Our system goes on to present a coupled system for growing plants and fish in a single set-up. In general, tap water is used as the source of water supply in most of these decoupled systems, with deregulated addition of nutrients such as iron, calcium, etc., and the growth of plants and fish is observed in individual set-ups. The major improvements which we suggest in this research are coupling both the systems together, which reduces the cost of the system significantly and promotes the recyclability of the water with routine sensing, and addition of nutrients based on the month lettuce was cultivated. The concern regarding monitoring of heavy metals is to be addressed in the future, for which we are working on building a spectrometer from scratch which can help in real-time monitoring of the heavy metals in the aquaponic solution.

Conclusions and Future Work
Using the dataset recorded from three aquaponic farms in Southeast Texas, the two most important predictors identified from the existing design, after generating synthetic data and carrying out dimensionality reduction, were the concentrations of calcium and ammonium. These two nutrients were regulated using an IoT-based sensing and actuation system in a closed-loop set-up. The tests were carried out for the cycle of 21 days to grow romaine lettuce and were verified experimentally as to how the proposed recommendations positively impacted plant and fish growth in a single system. The size and yield of the lettuce were compared to the ones grown in unregulated aquaponic environments and the yield showed significant increase in size, with some of them being as large as 40 to 45 inches in diameter. The cost involved in regulating nutrient parameters compared to traditional aquaponic environments also decreased by more than 75%.
In the future, a cloud database can be set up to host the Machine Learning model and store data dynamically. The size of the sensing and actuation modules described above can be scaled up to regulate more macronutrients and heavy metals to build a comprehensive unit to be installed at commercial set-ups. The data collected in our case to generate the AI-based approach were collected from three aquaponic farms in Texas where the terrain is mostly flat. More data can be added in the future from diverse geographical locations with more variable weather conditions to improve the robustness of the model.
As all the data on which our model was trained to build an AI-based approach were recorded from aquaponic farms located in Texas, the IoT system developed here cannot be used directly in places that are extremely hot or cold. The important features determined by our dimensionality reduction techniques are bound to change in those conditions due to high variance expected in the dataset. However, this study provides a general approach that can be adapted to a range of environments and aquaponic settings with adequate modifications. Future research should investigate the effectiveness of the IoT approach for optimizing nutrient supply in extreme climatic conditions and diverse management scenarios. Funding: This research received no external funding, but this research was funded by the departmental grants from the Department of Electrical and Computer Engineering, Texas A&M University.

Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.

Data Availability Statement:
The dataset used in this case has been described in detail in the sections above. The researchers would be willing to provide more details if needed.