1. Introduction
As the main cause of eutrophication in water bodies, nitrogen (ammonia nitrogen and total nitrogen) is a key regulatory parameter in sewage treatment. Conventional biological nitrogen removal from sewage relies on nitrification and denitrification, which require significant energy input for aeration and external carbon sources. In the 1990s, the discovery of anaerobic ammonium oxidation (Anammox) revolutionized our understanding of the natural nitrogen cycle. Anammox is a biological reaction driven by anaerobic ammonium-oxidizing bacteria (AnAOB) that converts ammonium and nitrite into nitrogen gas [1]. Anammox has become widely recognized as the most cost-effective biological nitrogen removal technology, achieving a 60% reduction in aeration, complete elimination of carbon source requirements, and an 80% decrease in residual sludge production [2]. It has been implemented in more than 200 projects worldwide. The Anammox process has become an economically efficient, low-carbon technology for upgrading wastewater treatment processes [3].
AnAOB grows slowly and is sensitive to changes in environmental conditions. In real sewage treatment, water quality and quantity can fluctuate significantly, and the influent often contains inhibitory substances such as salt, organic matter, and heavy metals, which can strongly interfere with the metabolic activity of AnAOB and easily cause unstable performance or even system collapse [4]. In addition, AnAOB often exists in the form of biological aggregates, and this natural tendency toward self-immobilization limits how quickly the functional bacterial community can adapt to changing environmental conditions, so a long time is needed to reach stability. Therefore, compared with conventional nitrogen removal processes, stable operation of the Anammox process requires more stringent monitoring and control measures [5].
From the perspective of process control, sewage treatment is characterized by nonlinearity, uncertainty, strong disturbances, and strong coupling [6]. Traditional control strategies such as feedforward, feedback, and proportional-integral-derivative (PID) control are commonly used in urban sewage treatment plants in China [7]. Because the structure and parameters of these controllers are fixed, they adapt poorly and achieve low control accuracy, resulting in inefficient operation and frequent exceedance of effluent quality standards. In addition, process control relies heavily on key water quality parameters as inputs, yet the core water quality indicators related to biochemical performance remain difficult to acquire in sewage treatment plants. Most of these indicators depend on offline laboratory testing, the capabilities of laboratory equipment, and the expertise of technical personnel, with long testing cycles (several hours to several days). Existing online detection instruments, meanwhile, are expensive, costly to maintain, and strongly affected by suspended solids and sediment in the water, so their practical application rate is low. A literature review of 14 existing Anammox production facilities showed that pH and dissolved oxygen (DO) are still the main online monitoring indicators [6]. The nitrogen concentration measured offline is only the final product of the multiple biochemical metabolic pathways of the complex microbial community within the system and cannot directly characterize the Anammox activity of the sludge. At present, the Anammox process still lacks intuitive and effective indicators for monitoring activity.
AnAOB exhibits a unique red color due to the high concentration of cytochrome c (Cyt c) protein in its cells. An image of Anammox granular sludge is shown in Figure 1. In Anammox reactors, AnAOB often exists in the form of biological aggregates such as granular sludge, and during reactor operation Anammox granular sludge exhibits different colors (gray-black, brown-yellow, deep red, etc.) [8]. Our team’s previous research demonstrated that the color of Anammox granular sludge is closely related to its activity [9], with the content of reduced Cyt c acting as the key link between color and activity. Color is a direct visual indicator that can be measured rapidly and non-destructively, and it has been used for biological monitoring in many fields. Therefore, the color of Anammox granular sludge has potential as an effective indicator for monitoring its activity.
Machine vision technology uses computers and imaging devices to simulate human visual functions: it extracts useful feature information from images, processes, analyzes, and interprets that information, and ultimately applies it to detection, measurement, and control [10]. With the rapid development of sensors, microscopy, and image recognition algorithms, machine vision has been successfully applied in fields such as transportation, healthcare, and security, and its range of applications continues to broaden [11]. In wastewater treatment research, machine vision has been applied to activated sludge systems. Mesquita et al. used image processing and related methods to extract morphological features of sludge in order to characterize its biomass concentration (MLSS) and sludge volume index (SVI), with the aim of evaluating bulking in activated sludge systems [12]. Ginoris et al. combined image processing with multivariate statistical analysis to identify protozoa and metazoa in activated sludge systems and thereby assess the operational performance of these systems [13]. In food research, machine vision systems are used to collect and quantitatively analyze images of fruit, extracting color, shape, size, and texture features to determine fruit quality grades accurately [14]. Therefore, introducing machine vision to recognize the unique color characteristics of Anammox granular sludge has the potential to assess its activity status rapidly, thereby enabling online monitoring of the Anammox process. However, no relevant studies have been reported in the literature so far.
Machine learning is a discipline focused on building mathematical models and algorithms that allow computers to recognize patterns, make predictions, or make decisions from large datasets without the need for explicit programming [15,16]. Machine learning has become a core technology in modern science and engineering, with applications spanning natural language processing, computer vision, medical diagnosis, financial prediction, and more. Its importance in data-driven decision-making and intelligent system development has become increasingly significant [17].
A neural network is a computational model inspired by the human brain’s nervous system [18]. Training a neural network involves learning feature representations of the input data by adjusting connection weights, with complex relationships modeled through layer-by-layer propagation, weighted summation, activation functions, and other operations [19]. In recent years, with the continuous growth in data volume and computing power, deep neural networks have received widespread attention as a powerful machine learning technology. They have achieved great success in fields such as image recognition [20], speech recognition, and natural language processing [21] and have outperformed human experts in many applications. With ongoing research and technological advancement, neural networks are pivotal in advancing the fields of artificial intelligence and machine learning [22]. However, as the number of layers in a neural network increases, vanishing and exploding gradients begin to emerge, which manifests as instability in the model’s accuracy. To address this issue, He et al. proposed residual neural networks (ResNet), which add residual blocks between convolutional layers [23]. The skip connections within the residual blocks transfer gradients more effectively, which helps train deeper networks and enables ResNet to extract richer features through its deep structure, thereby improving recognition accuracy.
The results of this study seek to offer novel insights into monitoring the Anammox process and facilitate the digitalization and intelligent transformation of the wastewater treatment sector.
2. Materials and Methods
2.1. Anammox Granular Sludge
The inoculum sludge for the reactors was taken from a partial nitritation-Anammox nitrogen removal device with a designed treatment capacity of 8 t/d, an ammonia nitrogen concentration of 140 mg/L, a reaction tower volume of 1 m³, an ammonia nitrogen removal rate of up to 88.9%, a total nitrogen removal rate of over 82.7%, and a volumetric load of approximately 4.9 kg-N/(m³·d). The concentration of suspended solids (SS) in the sludge was 28.88 g/L, and the concentration of volatile suspended solids (VSS) was 26.22 g/L. The reactors were fed with simulated wastewater in which ammonia nitrogen and nitrite nitrogen were provided by (NH4)2SO4 and NaNO2, respectively, at a ratio of approximately 1:1.2. Unless otherwise specified, the chemical reagents used in this study were purchased from Sinopharm Chemical Reagent Co. Ltd., Shanghai, China.
According to reported research, the upper limit of the nitrogen load in Anammox reactors generally does not exceed 20 kg-N/(m³·d), with most reactors below 10 kg-N/(m³·d), and the lower limit is generally not less than 0.3 kg-N/(m³·d). Therefore, to cover this range of nitrogen loads, this study operated eight UASB (up-flow anaerobic sludge bed) reactors at different nitrogen loads by setting different hydraulic retention times. The operating and structural parameters of the reactors are detailed in the attachment. The Anammox granular sludge used in the experiment was taken from a UASB reactor operating in steady state.
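The link between hydraulic retention time (HRT) and nitrogen load is simple arithmetic, and a minimal sketch may make it concrete. The influent concentration and HRT values below are hypothetical illustrations, not the operating parameters of the eight reactors, and the function name is our own.

```python
def nitrogen_loading_rate(influent_n_mg_per_l: float, hrt_hours: float) -> float:
    """Volumetric nitrogen loading rate in kg-N/(m3*d).

    NLR = C_in / HRT, with C_in converted from mg/L to kg/m3
    and HRT converted from hours to days.
    """
    if hrt_hours <= 0:
        raise ValueError("HRT must be positive")
    c_in_kg_per_m3 = influent_n_mg_per_l / 1000.0  # 1 mg/L = 0.001 kg/m3
    hrt_days = hrt_hours / 24.0
    return c_in_kg_per_m3 / hrt_days


# Hypothetical influent: 140 mg/L ammonia N plus 168 mg/L nitrite N (1:1.2 ratio).
INFLUENT_TN = 140.0 + 168.0

for hrt in (24.0, 6.0, 1.0):  # illustrative HRTs in hours
    nlr = nitrogen_loading_rate(INFLUENT_TN, hrt)
    print(f"HRT {hrt:>4.1f} h -> {nlr:.2f} kg-N/(m3*d)")
```

With a fixed influent concentration, shortening the HRT raises the load proportionally, which is why varying the HRT alone can span a wide range of nitrogen loads.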
2.2. Specific Activity Determination
A serum bottle batch test was used to determine the specific activity of Anammox granular sludge. We weighed 1 g of wet sludge, washed it three times with 0.1 M phosphate buffer solution (pH = 7.5), and transferred it to a 50 mL serum bottle containing medium with the same formulation as the reactor influent. The bottle was then sparged with 95% Ar + 5% CO2 for 10 min to remove oxygen from the headspace and solution, sealed with a butyl rubber stopper, and secured with an aluminum cap. The initial concentrations of ammonia nitrogen and nitrite nitrogen were set to 50 mg N/L and 60 mg N/L, respectively. The serum bottle was placed on a shaker (150 rpm, 35 °C), and samples were taken at 1 h intervals, passed through a membrane filter (pore size 0.45 μm), and analyzed for ammonia nitrogen, nitrite nitrogen, and nitrate nitrogen. After sampling was completed, the cap was opened to measure the VSS concentration in the serum bottle. The nitrogen concentrations were plotted against time, and linear regression was used to fit the substrate degradation rates. The specific activity was calculated as the sum of the degradation rates of ammonia nitrogen, nitrite nitrogen, and nitrate nitrogen per unit of VSS. Two parallel experiments were set up for each group.
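The rate calculation described above can be sketched as follows. This is an illustration under stated assumptions rather than the exact procedure used: the "sum of the degradation rates" is interpreted here as the net rate of inorganic nitrogen removal (nitrate is produced, so its slope enters with the opposite sign), the concentration series and VSS value are invented, and the function names are hypothetical.

```python
def ols_slope(times, values):
    """Ordinary least-squares slope of values against times."""
    n = len(times)
    if n < 2 or n != len(values):
        raise ValueError("need at least two paired observations")
    t_mean = sum(times) / n
    v_mean = sum(values) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    sxy = sum((t - t_mean) * (v - v_mean) for t, v in zip(times, values))
    return sxy / sxx


def specific_activity(times_h, nh4, no2, no3, vss_g_per_l):
    """Net nitrogen removal rate per unit VSS, in mg N/(g VSS*h).

    Assumption: slopes are summed with their signs, so nitrate
    production offsets ammonium and nitrite consumption.
    """
    net_rate = -(ols_slope(times_h, nh4)
                 + ols_slope(times_h, no2)
                 + ols_slope(times_h, no3))  # mg N/(L*h)
    return net_rate / vss_g_per_l


# Invented hourly measurements (mg N/L) starting from 50 and 60 mg N/L.
times_h = [0, 1, 2, 3, 4]
nh4 = [50.0, 45.0, 40.0, 35.0, 30.0]
no2 = [60.0, 53.4, 46.8, 40.2, 33.6]
no3 = [0.0, 1.3, 2.6, 3.9, 5.2]

activity = specific_activity(times_h, nh4, no2, no3, vss_g_per_l=2.0)
print(f"specific activity: {activity:.2f} mg N/(g VSS*h)")
```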
2.3. Transcriptional Activity Assay
We weighed 0.2 g of Anammox granular sludge and extracted total RNA using the RNeasy PowerBiofilm Kit (Qiagen, Dusseldorf, Germany) according to the manufacturer’s instructions. RNA was reverse transcribed with the PrimeScript RT reagent kit (Takara, Tokyo, Japan), again following the kit instructions. The abundance of the hzsA gene in the resulting cDNA was determined by fluorescence quantitative PCR (qPCR). A 25 μL amplification system was used, consisting of 12.5 μL Ex Taq enzyme premix (Takara, Japan), 9.5 μL deionized water, 1 μL template DNA, 1 μL forward primer, and 1 μL reverse primer. The primer sequences are shown in the attached table. Quantitative PCR analysis was performed on a CFX Connect quantitative PCR instrument (Bio-Rad, Hercules, CA, USA). The melting curve obtained from SYBR Green amplification was used to verify primer specificity, and ten-fold serial dilutions of a recombinant plasmid were used to construct the qPCR standard curve. The amplification efficiency was 90% to 100%, and the R² values of the linear fits were all greater than 0.98. Three replicates were set for each sample.
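For readers unfamiliar with how a ten-fold dilution standard curve yields an amplification efficiency, the sketch below uses the standard relation E = 10^(-1/slope) - 1, where the slope comes from regressing the threshold cycle (Ct) on log10 of the copy number. The copy numbers and Ct values are invented for illustration, and the function names are hypothetical.

```python
import math


def fit_line(x, y):
    """Least-squares slope, intercept, and R^2 of y against x."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - y_mean) ** 2 for yi in y)
    return slope, intercept, 1.0 - ss_res / ss_tot


def standard_curve(copies, ct):
    """Fit Ct = slope * log10(copies) + intercept and derive the efficiency."""
    log_copies = [math.log10(c) for c in copies]
    slope, intercept, r2 = fit_line(log_copies, ct)
    efficiency = 10 ** (-1.0 / slope) - 1.0  # 1.0 means perfect doubling per cycle
    return slope, intercept, r2, efficiency


def copies_from_ct(ct, slope, intercept):
    """Invert the standard curve to estimate copies in a sample."""
    return 10 ** ((ct - intercept) / slope)


# Invented ten-fold plasmid dilution series and matching Ct values.
std_copies = [1e3, 1e4, 1e5, 1e6, 1e7]
std_ct = [30.0, 26.6, 23.2, 19.8, 16.4]

slope, intercept, r2, efficiency = standard_curve(std_copies, std_ct)
print(f"slope={slope:.2f}, R2={r2:.3f}, efficiency={efficiency:.1%}")
```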
2.4. Image Acquisition and Image Data Processing
The entire image acquisition process was carried out in a darkroom. An electron microscope camera (Gaopin, Kunshan, China) was used to capture images of Anammox granular sludge. Two to four Anammox granules were placed in a clean culture dish, and clean test strips were used to absorb surface moisture. A uniform-brightness bar light source (HIKVISION, Hangzhou, China) served as the background light source to eliminate the influence of external lighting conditions. The camera was focused until the image was clear, and the software provided by the camera manufacturer (S-EYE 1.4.3.479) was used to capture the images. Forty to fifty portions of Anammox granules were taken from each UASB reactor, and each portion was photographed from three different angles, yielding approximately 1000 images of Anammox granular sludge in total.
A substantial dataset is crucial for obtaining robust machine learning results, and 1000 images alone are insufficient to train a comprehensive machine learning model. Therefore, six data augmentation techniques were employed: (1) horizontal flipping, (2) vertical flipping, (3) center symmetry processing, (4) horizontal flipping and Gaussian blur, (5) vertical flipping and motion blur, and (6) center symmetry processing with Gaussian noise.
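The six augmentations listed above can be sketched with NumPy alone. This is an illustrative sketch, not the pipeline actually used: "center symmetry processing" is interpreted as a 180° rotation about the image center, and the kernel sizes, blur direction, and noise level are assumed values.

```python
import numpy as np


def gaussian_kernel(size=5, sigma=1.0):
    """Normalized 1-D Gaussian kernel (size and sigma are assumed values)."""
    ax = np.arange(size) - (size - 1) / 2.0
    k = np.exp(-(ax ** 2) / (2.0 * sigma ** 2))
    return k / k.sum()


def blur_along_axis(img, kernel, axis):
    """Filter an H x W x C image with a 1-D kernel along one axis (edge-padded)."""
    pad = len(kernel) // 2
    pad_width = [(0, 0)] * img.ndim
    pad_width[axis] = (pad, pad)
    padded = np.pad(img.astype(np.float64), pad_width, mode="edge")
    out = np.zeros(img.shape, dtype=np.float64)
    for i, w in enumerate(kernel):
        idx = [slice(None)] * img.ndim
        idx[axis] = slice(i, i + img.shape[axis])
        out += w * padded[tuple(idx)]
    return out


def to_uint8(img):
    return np.clip(np.rint(img), 0, 255).astype(np.uint8)


def augment(img, rng):
    """Return the six augmented variants of one uint8 H x W x 3 image."""
    h_flip = img[:, ::-1]
    v_flip = img[::-1, :]
    center = img[::-1, ::-1]  # assumed meaning: 180 degree rotation
    g = gaussian_kernel()
    motion = np.full(7, 1.0 / 7.0)  # horizontal motion-blur kernel
    noise = rng.normal(0.0, 10.0, size=img.shape)  # assumed noise level
    return {
        "h_flip": h_flip,
        "v_flip": v_flip,
        "center": center,
        "h_flip_gauss_blur": to_uint8(
            blur_along_axis(blur_along_axis(h_flip, g, 0), g, 1)),
        "v_flip_motion_blur": to_uint8(blur_along_axis(v_flip, motion, 1)),
        "center_gauss_noise": to_uint8(center.astype(np.float64) + noise),
    }


rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)  # stand-in image
variants = augment(image, rng)
print({name: arr.shape for name, arr in variants.items()})
```

Applying all six variants to each original image multiplies the dataset size by seven when the originals are kept.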
After data augmentation, the highlight areas of the Anammox granular sludge images were detected and then processed with image restoration and enhancement methods to remove reflective parts of the image. The foreground and background of each image were detected and distinguished, and image segmentation algorithms were used to remove the irrelevant background. The background-segmented images were converted from RGB to HSV. The pixel values of the entire HSV image were traversed, weighted, and averaged based on their morphological features to obtain the mean pixel values of the H, S, and V channels. The images were proportionally scaled to 224 × 224 pixels for machine learning training. The original sludge image, the background-segmented sludge image, and the HSV image of the sludge are shown in Figure 2.
2.5. Machine Learning
The H, S, and V values of each Anammox granular sludge image were used as independent variables, and the specific activity of the corresponding Anammox granular sludge sample was the dependent variable. Machine learning training was conducted using three models: polynomial regression, random forest, and XGBoost.
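As an illustration of how the three independent variables might be obtained, the sketch below converts foreground pixels to HSV with the standard-library `colorsys` module and takes a weighted mean per channel. The weighting scheme is an assumption (a simple foreground mask where zero marks background), since the exact morphological weighting is not specified here, and the pixel values are invented.

```python
import colorsys


def mean_hsv(pixels, weights):
    """Weighted mean H, S, V (each in 0..1) over RGB pixels.

    pixels  : iterable of (r, g, b) tuples with 0-255 values
    weights : per-pixel weights; 0 marks a background pixel (assumed scheme)
    """
    total = float(sum(weights))
    if total == 0:
        raise ValueError("no foreground pixels")
    h_sum = s_sum = v_sum = 0.0
    for (r, g, b), w in zip(pixels, weights):
        if w == 0:
            continue  # skip segmented-out background
        h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        h_sum += w * h
        s_sum += w * s
        v_sum += w * v
    return h_sum / total, s_sum / total, v_sum / total


# Invented example: two reddish foreground pixels and one background pixel.
pixels = [(180, 40, 30), (150, 60, 40), (255, 255, 255)]
mask = [1, 1, 0]

h_mean, s_mean, v_mean = mean_hsv(pixels, mask)
print(f"H={h_mean:.3f}, S={s_mean:.3f}, V={v_mean:.3f}")
```

One caveat of this simple form: hue is a circular quantity, so an arithmetic mean can be misleading for red tones that straddle the 0/1 boundary, and a circular mean may be preferable in that case.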
Polynomial regression was applied to the data, and candidate models were compared on the test set by checking predicted values against measured values, with R² and MSE as evaluation metrics. The best performance was achieved with a polynomial degree of 3, giving an R² of approximately 0.91.
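A degree-3 polynomial regression on three input features can be sketched with NumPy as below. The data are synthetic stand-ins for the (H, S, V) features and activity values, so the printed scores say nothing about the results of this study; the helper names are hypothetical.

```python
from itertools import combinations_with_replacement

import numpy as np


def poly_features(X, degree=3):
    """All monomials of the columns of X up to `degree`, plus a bias column."""
    n_samples, n_features = X.shape
    cols = [np.ones(n_samples)]
    for d in range(1, degree + 1):
        for combo in combinations_with_replacement(range(n_features), d):
            cols.append(np.prod(X[:, list(combo)], axis=1))
    return np.column_stack(cols)


def r2_score(y_true, y_pred):
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1.0 - ss_res / ss_tot


def mse(y_true, y_pred):
    return float(np.mean((y_true - y_pred) ** 2))


# Synthetic stand-in data: three features in [0, 1] and a smooth target.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(200, 3))
y = (2.0 + 3.0 * X[:, 0] - X[:, 1] ** 2 + 0.5 * X[:, 0] * X[:, 2] ** 2
     + rng.normal(0.0, 0.01, size=200))

# Hold out 20% of the samples as a test set.
order = rng.permutation(len(X))
train, test = order[:160], order[160:]

# Fit the polynomial coefficients by least squares on the training set.
coef, *_ = np.linalg.lstsq(poly_features(X[train]), y[train], rcond=None)
y_pred = poly_features(X[test]) @ coef

r2_test = r2_score(y[test], y_pred)
mse_test = mse(y[test], y_pred)
print(f"test R2={r2_test:.3f}, test MSE={mse_test:.5f}")
```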
Because the number of samples with measured activity data was limited in this experiment, random forest regression was considered to reduce the risk of overfitting. The model was evaluated by 5-fold cross-validation with R² and MSE as performance metrics. The hyperparameters of the random forest were adjusted iteratively, and the best results were obtained with 47 decision trees and a maximum depth of 17, giving an R² of 0.898.
Given the suboptimal performance of the random forest, the XGBoost algorithm was employed to build a gradient-boosted ensemble of decision trees, which was evaluated by five-fold cross-validation with metrics such as R² and MSE. After repeated tuning of the XGBoost hyperparameters, the best performance was achieved with 50 decision trees, a maximum depth of 5, and a learning rate of 0.22, resulting in an R² of 0.928.
Because none of the three methods performed well enough on its own, the Stacking algorithm was employed to integrate the polynomial regression, random forest, and XGBoost models and improve the training results. These three models served as primary learners, and to mitigate overfitting, a simple linear regression model was used as the secondary learner. The primary learners were trained on the training set, and the predicted values from the trained primary learners were used as input for training the secondary learner. Five-fold cross-validation was used to evaluate the training outcomes, and the hyperparameters of the primary learners were adjusted based on the cross-validation results. Following multiple training iterations, the model with the best R² and MAE values was selected as the final machine learning model.
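The stacking arrangement described above can be sketched with scikit-learn’s `StackingRegressor`. Two points are assumptions made for the sake of a self-contained example: scikit-learn’s `GradientBoostingRegressor` stands in for XGBoost so that no extra dependency is needed, and the data are synthetic rather than the sludge dataset. The hyperparameters reuse the values reported in the text, but the resulting scores are not comparable to those of this study.

```python
import numpy as np
from sklearn.ensemble import (GradientBoostingRegressor,
                              RandomForestRegressor, StackingRegressor)
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures


def build_stack(seed=0):
    """Three primary learners combined by a linear secondary learner."""
    primary = [
        ("poly", make_pipeline(PolynomialFeatures(degree=3),
                               LinearRegression())),
        ("rf", RandomForestRegressor(n_estimators=47, max_depth=17,
                                     random_state=seed)),
        # Stand-in for XGBoost (assumption made for this sketch).
        ("gb", GradientBoostingRegressor(n_estimators=50, max_depth=5,
                                         learning_rate=0.22,
                                         random_state=seed)),
    ]
    # cv=5: the secondary learner is trained on out-of-fold predictions
    # of the primary learners, which limits leakage and overfitting.
    return StackingRegressor(estimators=primary,
                             final_estimator=LinearRegression(), cv=5)


# Synthetic stand-in for the (H, S, V) features and measured activity.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(150, 3))
y = (1.0 + 2.0 * X[:, 0] - X[:, 1] ** 2 + 0.5 * X[:, 0] * X[:, 2]
     + rng.normal(0.0, 0.05, size=150))

# Outer five-fold cross-validation of the whole stacked model.
scores = cross_val_score(build_stack(), X, y, cv=5, scoring="r2")
print(f"5-fold R2: mean={scores.mean():.3f}, std={scores.std():.3f}")
```

Using a plain linear model as the secondary learner keeps the combination step to a handful of coefficients, which is consistent with the stated aim of limiting overfitting.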
2.6. Deep Learning
In this study, the ResNet50d neural network served as the main deep learning model. ResNet50d is an enhanced variant of the ResNet series, featuring a deeper network structure and more parameters. ResNet employs a deep residual structure to mitigate the gradient vanishing issue in deep neural network training, which facilitates the training of deeper networks and improves model performance. Although ResNet has a deep network structure, through the use of techniques like residual connections and global average pooling, the number of parameters in the model remains relatively small, helping to reduce the risk of overfitting.
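The idea behind the residual structure can be illustrated with a deliberately small NumPy sketch. It is not the ResNet50d architecture: dense layers stand in for convolutions, batch normalization is omitted, and all sizes and weights are arbitrary. The point it shows is that the block computes F(x) + x, so the input always has a direct path to the output.

```python
import numpy as np


def relu(x):
    return np.maximum(x, 0.0)


def residual_block(x, w1, w2):
    """y = relu(F(x) + x) with F(x) = relu(x @ w1) @ w2.

    Dense layers replace convolutions in this toy version (assumption).
    """
    fx = relu(x @ w1) @ w2
    return relu(fx + x)  # skip connection adds the input back


rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, size=(4, 8))  # batch of 4 non-negative vectors

# With all-zero weights F(x) = 0, so the block reduces to the identity:
# an untrained block does not block the signal (or its gradient).
w_zero = np.zeros((8, 8))
y_identity = residual_block(x, w_zero, w_zero)

# With small random weights the block learns a correction on top of x.
w1 = rng.normal(0.0, 0.1, size=(8, 8))
w2 = rng.normal(0.0, 0.1, size=(8, 8))
y = residual_block(x, w1, w2)

print(np.allclose(y_identity, x), y.shape)
```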
To balance detection speed and accuracy, ResNet50, which has a moderate depth, was used as the foundation. The neural network structure is shown in Figure 3. The residual function module replaces traditional convolution layers, and the ReLU activation function is applied within the residual function module. The input is a 224 × 224 × 3 HSV three-channel image, and the output size is adapted to the number of target categories, so the network can be used for both classification and regression tasks.
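In this study, the MSE function was employed as the loss function. The MSE function is defined as follows:
$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(\hat{y}_i - y_i\right)^2$$
where $\hat{y}_i$ is the predicted value of the model, $y_i$ is the annotation value of the training set, and $n$ is the number of samples.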
Deep learning is a data-driven approach, and training neural networks with extensive datasets can enhance the robustness of the model and reduce overfitting. In practice, however, the available training data are often limited, which may compromise model reliability, and obtaining more data can be costly. Training with large datasets may also incur significant hardware and time costs. Using pretrained weights can effectively reduce training time and improve the generalization performance of the model. To enhance detection accuracy, weights pretrained on ImageNet were used to initialize our neural network before training, thereby improving the model’s robustness.
4. Conclusions
Introducing machine vision technology to recognize the unique color characteristics of Anammox granular sludge has the potential to rapidly assess its activity status, facilitating online monitoring of the Anammox process. To address the current challenges, an end-to-end detection approach was implemented, proposing a model that combines traditional machine learning methods with a ResNet50d neural network. In addition, given the limited availability of datasets related to granular sludge, a dataset comprising 1000 Anammox granular sludge images was developed for further research. Recognizing the data-driven nature of deep learning, data augmentation and transfer learning methods were utilized to enhance the model’s accuracy.
The experimental results show that the model proposed in this study effectively recognizes the unique color characteristics of Anammox granular sludge and establishes a reliable correlation between color characteristics and sludge activity, satisfying the requirements for real-time detection. In addition, the ResNet50d-based model outperforms traditional machine learning models in detection performance and can effectively detect denitrification efficiency in real time by recognizing the color characteristics of Anammox granular sludge. Compared with traditional detection methods, the proposed method does not require tedious and time-consuming chemical experiments and only requires deployment on a deep learning computing unit with a suitable machine vision system. The model can be applied in laboratory or wastewater denitrification processes to perform real-time, rapid, and safe detection of the denitrification efficiency of Anammox granules without contact with chemicals.
However, current deep learning techniques rely heavily on datasets. Although data augmentation and transfer learning were used in this study to reduce this dependency, the image dataset obtained from the eight UASB reactors still limits the model’s generalization to some extent. In future research, a more comprehensive dataset would facilitate a thorough and systematic investigation of the Anammox process using machine vision and machine learning technologies. To further advance this research, it is also necessary to deploy online models in actual industrial wastewater treatment facilities. Given the requirements of industrial deployment, lightweight models and newer network architectures, such as attention mechanisms that help neural network models adapt to highly complex working environments, can be considered and compared in terms of usability and accuracy.