1. Introduction
Municipal solid waste incineration (MSWI) is a sustainable method for managing the environmental challenges posed by municipal solid waste (MSW) [1]. Through high-temperature combustion, it transforms MSW into ash and heat energy, playing a pivotal role in tackling the escalating environmental issues associated with MSW treatment [2]. It also mitigates, to a certain extent, the negative environmental impacts of conventional landfilling and composting. However, with the growing emphasis on environmental sustainability, increased attention is being paid to the feasibility and long-term consequences of MSWI. The process presents potential hazards, including the release of harmful gases that can degrade air and water quality and thus threaten environmental sustainability [3]. Consequently, effective control measures must be implemented in the design and operation of incineration facilities to reduce emissions and minimize their impact on the environment and human health. In the pursuit of sustainability, MSWI must strike a delicate balance among economic, social, and environmental considerations.
MSWI has gained widespread global recognition owing to its substantial benefits in terms of harm reduction, volume minimization, and resource utilization [4,5]. A variety of incinerators exist for MSW, such as grate-type, bed-type, and fluidized bed incinerators, with grate-type incinerators being the predominant choice for the MSWI process [6]. Compared to other furnace types, grate-based incinerators are flexible and easy to operate; however, their energy efficiency is low and their pollutant-emission rate is high under unstable operating conditions [7]. Technological innovation is a key element in achieving the balance described above, fostering the green and standardized management of MSW [8,9] and propelling MSW management toward a more sustainable trajectory. Thus, more advanced technologies, such as machine learning and vision-based artificial intelligence, are needed to overcome these problems [10]. Owing to its high heterogeneity, MSW poses challenges in maintaining combustion stability, potentially resulting in issues such as coking, ash accumulation, and corrosion inside the furnace. Timely and accurate judgment of the combustion status is therefore necessary [11].
Presently, the observation of waste incinerator combustion status relies primarily on visual assessment by experts, who combine visual observations of flame conditions through on-site observation holes to adjust key parameters and maintain combustion stability [12]. However, this method faces several challenges: (1) the lack of a unified judgment standard leads to inconsistent results that are susceptible to subjective variation; (2) prolonged on-site image observation induces visual fatigue in workers, affecting their health; and (3) multiple interrelated regulatory parameters significantly affect combustion efficiency, making accurate individual control by operators extremely difficult and potentially destabilizing the control process. Relying solely on manual identification of combustion status is no longer adequate to meet production requirements. To improve on-site detection automation, reduce subjective human influence, decrease labor intensity, and improve detection efficiency, online flame video recognition technology based on artificial intelligence is crucial [13].
When it comes to recognizing combustion status through flame-image analysis in the MSWI process, several studies exist, each focusing on a different furnace type. Miyamoto et al. [14] developed the “AI-VISION” system, integrating combustion-image processing, neural networks for discerning combustion status, and online learning methods for optimizing the neural networks; their system manipulated operating values in fluidized bed incinerators. Zhou [15] developed a neural-network-based combustion status diagnosis model utilizing geometric features and grayscale information from flame images, validated through ten-fold cross-validation experiments. Guo et al. [16] presented a combustion status-recognition method employing mixed data augmentation and a deep convolutional generative adversarial network (DCGAN) to obtain flame images under diverse conditions. Huang et al. [12] extracted key parameters such as grayscale mean, flame area ratio, high-temperature ratio, and flame front to characterize and evaluate combustion status. Meanwhile, Zhang et al. [17] extracted 19 feature vectors encompassing the color, shape, and texture of flame images to construct an echo state network recognition model. These findings emphasize the need for further research on and validation of combustion status identification methods tailored to different MSWI plants. In the field of combustion status recognition based on flame videos, researchers have proposed diverse solutions for similarly complex industrial processes. Chen et al. [18] used typical video blocks of rotary kiln flame combustion as training samples, extracting texture and motion features and feeding them into a support vector machine (SVM) to construct a flame status-recognition model, albeit with relatively unstable recognition performance. Li et al. [19] employed a convolutional recurrent neural network (CRNN) that exploits the spatiotemporal relationships in rotary kiln flame image sequences to predict combustion status. Wu et al. [20] first used a convolutional neural network (CNN) to extract spatial features from electric melting magnesia furnace video signals and then applied a recurrent neural network (RNN) to extract temporal features, achieving automatic labeling of abnormal conditions via weighted median filtering. These studies indicate that flame video recognition is founded on analyzing sequences of flame images. Thus, achieving video recognition of combustion status in the MSWI process should commence with constructing an offline recognition model based on flame images.
The offline modeling process for flame-image recognition typically comprises two stages: feature extraction and image recognition. Some researchers have focused on manual feature extraction methods to derive flame features. For instance, Zhang et al. [21] extracted multiple feature vectors encompassing color, shape, and texture from flame images and used them as inputs to a bilinear convolutional neural network (BCNN) for flame-image recognition. Wu et al. [22] first segmented the pertinent region in the flame image and subsequently employed extracted color, texture, and rectangularity features for flame recognition. In another approach, Wu et al. [23] assessed image quality by modeling texture, structure, and naturalness, using the resulting image quality score as the input to a visual recognition model. However, the ability of the feature parameters extracted in the aforementioned studies to accurately represent combustion status relies partly on image-processing techniques, such as image segmentation algorithms, and partly on manual expertise. Consequently, this approach has significant limitations and inherent instability.
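To make the notion of hand-crafted flame descriptors concrete, the sketch below computes three of the statistics mentioned above (grayscale mean, flame area ratio, and high-temperature ratio) from a grayscale image with plain numpy. The threshold values are illustrative assumptions, not parameters reported in the cited studies.

```python
import numpy as np

def flame_features(gray, flame_thresh=120, hot_thresh=200):
    """Hand-crafted flame descriptors of the kind used in earlier studies:
    grayscale mean, flame area ratio, and high-temperature ratio.
    `gray` is a 2-D uint8 array; both thresholds are illustrative values."""
    gray = gray.astype(np.float64)
    mean_gray = gray.mean()                     # overall brightness
    flame_ratio = (gray > flame_thresh).mean()  # fraction of flame pixels
    hot_ratio = (gray > hot_thresh).mean()      # fraction of high-temperature pixels
    return np.array([mean_gray, flame_ratio, hot_ratio])

# Synthetic 8x8 "image": a bright flame patch on a dark background.
img = np.zeros((8, 8), dtype=np.uint8)
img[2:6, 2:6] = 230
feats = flame_features(img)  # [57.5, 0.25, 0.25]
```

The fragility noted above is visible even here: shifting either threshold changes the feature values, which is exactly the dependence on manual expertise that motivates learned features.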
Feature extraction methods based on deep learning can autonomously learn representative features from flame images. Han et al. [24] trained a convolutional sparse autoencoder (CSAE) on flame images, obtaining a feature extractor adept at extracting deep features; visualization of these features demonstrated clear discriminability across combustion statuses. Similarly, Liu et al. [25] applied deep learning to industrial combustion processes, employing a multi-layered deep belief network (DBN) to extract nonlinear features; this approach yielded descriptive insights into flame physical properties, outperforming traditional principal component analysis (PCA). These studies validated the immense potential of deep networks for combustion status recognition. LeNet-5, a convolutional neural network devised by LeCun et al. in 1998, gained prominence in handwritten digit recognition, showing commendable results [26]. Roy et al. [27] used LeNet-5 to extract deep features from forest fire images, controlling model complexity through L2 regularization and offering insights for early-stage forest fire detection systems. He et al. [28] enhanced the model by increasing the layer count of the LeNet-5 network and incorporating a dropout layer, achieving higher recognition accuracy. Li et al. [29] merged low-level and high-level features extracted from the LeNet-5 structure, using the first two pooling layers and the fully connected layers as SoftMax inputs for micro-expression recognition, yielding robust results on a public expression database. LeNet-5’s ability to capture local image features through local receptive fields, reduce the number of trainable parameters through weight sharing, and maintain a simple network structure is noteworthy. Despite being an early convolutional neural network with shallow layers, LeNet-5 is used extensively in image-processing tasks such as license plate recognition and face detection. These studies show LeNet-5’s broad application prospects in image recognition; its structure excels at extracting deep features, making it a promising choice for MSWI flame combustion status recognition in this study.
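The two mechanisms credited to LeNet-5 above, shared weights over local receptive fields followed by subsampling, can be sketched in a few lines of numpy. This is a minimal stand-in (one kernel, a modern ReLU in place of the original sigmoid/tanh activations), not a full LeNet-5 implementation; the input and kernel are toy values.

```python
import numpy as np

def conv2d_valid(x, k):
    """'Valid' 2-D convolution with a single shared kernel: the same weights
    scan every local receptive field, which is what keeps LeNet-5's
    parameter count small."""
    kh, kw = k.shape
    oh, ow = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    out = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = (x[i:i + kh, j:j + kw] * k).sum()
    return out

def max_pool2(x):
    """2x2 max pooling (stride 2), analogous to LeNet-5's subsampling stage."""
    h, w = x.shape[0] // 2 * 2, x.shape[1] // 2 * 2
    return x[:h, :w].reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

x = np.arange(36, dtype=float).reshape(6, 6)         # toy 6x6 input
k = np.ones((3, 3)) / 9.0                            # 3x3 averaging kernel
fmap = max_pool2(np.maximum(conv2d_valid(x, k), 0))  # conv -> ReLU -> pool, 2x2 map
```

Stacking such conv/pool pairs and ending with fully connected layers yields the LeNet-5 topology; in the method proposed here, the intermediate feature maps of such a trained network serve as deep flame features.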
Drawing inspiration from deep neural network (DNN) models, the deep forest classification (DFC) algorithm introduced by Zhou et al. [30] comprises two primary components: a multi-grained scan and a cascaded forest (CF). The former transforms raw data features, while the latter constructs prediction models using these transformed features [31,32]. The multi-grained scan bolsters CF training, augmenting its effectiveness. Cao et al. [33] integrated a rotation forest into the cascaded layer to enhance DFC’s discriminative ability for hyperspectral features; their work also leveraged spatial information from adjacent pixels, refining hyperspectral image classification. Zheng et al. [34] tackled challenges in leaf classification, specifically the lack of large-scale professional datasets and expert annotations, utilizing generative adversarial networks for image feature extraction and a designed fuzzy random forest as CF’s base learner, achieving superior recognition performance compared to existing techniques. Sun et al. [35] applied DFC to chest computed tomography (CT) image recognition for coronavirus disease 2019 (COVID-19); extracting features from specific image locations, they employed DFC to learn high-level representations, achieving commendable recognition performance. Additionally, Nie et al. [36] proposed an online multi-view deep forest architecture for remote sensing image data. DFC offers advantages over DNNs, such as fewer hyperparameters, interpretability, and automatic adjustment of model complexity [37]. Moreover, it performs well on small image datasets, effectively resolving challenges in constructing DNN recognition models. However, the multi-grained scan module of DFC can be time-consuming and inefficient in acquiring deep features at diverse scales. These studies collectively imply that DFC, combined with CNN-based deep feature extraction, can effectively tackle the limitations posed by limited flame-image datasets in the MSWI process.
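The core cascaded-forest idea described above can be illustrated with scikit-learn: each level trains forests and appends their class-probability vectors to the input features of the next level. This sketch omits the multi-grained scan stage and uses in-sample probabilities for brevity (the original DFC uses out-of-fold, cross-validated probabilities and grows levels until validation accuracy stops improving); the synthetic four-class data merely stands in for deep flame features.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Toy stand-in for deep flame features: four combustion-status classes.
X, y = make_classification(n_samples=300, n_features=20, n_informative=8,
                           n_classes=4, random_state=0)

def cascade_layer(X_in, y, n_forests=2, seed=0):
    """One cascaded-forest level: each forest's class-probability vector is
    concatenated to the input features for the next level (the core DFC
    idea). In-sample probabilities are used here purely for illustration."""
    probas = []
    for i in range(n_forests):
        rf = RandomForestClassifier(n_estimators=30, random_state=seed + i)
        rf.fit(X_in, y)
        probas.append(rf.predict_proba(X_in))  # shape (n_samples, 4)
    return np.hstack([X_in] + probas)

X1 = cascade_layer(X, y)             # level 1: 20 + 2*4 = 28 feature columns
X2 = cascade_layer(X1, y, seed=10)   # level 2 stacks on the augmented features
```

Because each level only adds a handful of probability columns, model complexity grows with the number of levels actually needed by the data, which is the automatic complexity adjustment cited above.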
In summary, achieving online recognition of combustion status from flame video in the MSWI process entails addressing several key factors: (1) effectively extracting deep features from flame images despite the limited sample size; (2) maximizing the utilization of these deep features to build a recognition model that meets on-site requirements; and (3) advancing toward online video recognition by leveraging flame-image recognition. Hence, this article proposes an online video recognition method based on convolutional multi-layer feature fusion and DFC. The method involves (1) training the LeNet-5 network on flame images collected on-site to extract deep flame features; (2) employing an adaptive fusion method based on LeNet-5 multi-layer features to select and fuse features as flame representations; (3) feeding the extracted deep fusion features into DFC to construct an offline recognition model that determines combustion status from flame images; and (4) integrating the offline recognition algorithm into the developed MSWI flame video combustion status-recognition platform to achieve real-time online recognition.
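Step (2) of the pipeline above, fusing features drawn from several network layers, can be sketched as flatten-concatenate-select. The variance-based selection criterion and its threshold below are assumed stand-ins for the paper's adaptive selection method, chosen only to make the sketch self-contained; the random arrays stand in for pooled low-level maps and fully connected activations.

```python
import numpy as np

def fuse_multilayer(features, var_thresh=1e-3):
    """Illustrative multi-layer fusion: flatten each layer's features per
    sample, concatenate across layers, then keep only dimensions whose
    variance across samples exceeds a threshold (a simple stand-in for
    adaptive selection; the threshold is an assumed hyperparameter)."""
    flat = np.hstack([f.reshape(f.shape[0], -1) for f in features])
    keep = flat.var(axis=0) > var_thresh   # boolean mask over dimensions
    return flat[:, keep], keep

rng = np.random.default_rng(0)
low = rng.normal(size=(16, 4, 4))    # e.g. pooled low-level maps, 16 samples
high = rng.normal(size=(16, 8))      # e.g. fully connected features
high[:, -2:] = 0.0                   # two dead dimensions to be discarded
fused, keep = fuse_multilayer([low, high])   # 16 + 8 dims -> 22 kept
```

The fused matrix is what a downstream classifier such as DFC would consume; discarding near-constant dimensions keeps the fused representation compact without hand-tuning per layer.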
The existing research highlights prevalent applications of online flame video recognition in areas such as rotary kilns and electric magnesium melting furnaces; surprisingly, there is a dearth of studies on online flame video recognition in the MSWI field. Consequently, this article explores an online recognition method tailored to the unique characteristics of flame videos in MSWI. The primary innovations of this method are (1) a fusion technique combining deep flame feature extraction and adaptive selection based on LeNet-5; (2) the integration of deep fusion features with the DFC algorithm to construct a combustion status-recognition model specifically designed for the MSWI process; and (3) a practical online combustion status-recognition platform based on flame video for MSWI. These advancements signify the potential practical value of this technology within the MSWI field.
5. Conclusions
In response to the practical need to reduce emissions and energy consumption when treating MSW with a grate furnace in the MSWI process, we developed an online combustion status-recognition method based on convolutional multi-layer feature fusion and DFC, built upon a database of flame images depicting typical combustion statuses. First, a LeNet-5 network is trained to extract deep features from flame images across the typical combustion statuses. These deep features are selectively fused using a multi-layer feature adaptive selection method, forming a comprehensive representation of the flame combustion status. The fused deep features are then fed into the DFC to establish an offline recognition model. Finally, this model enables online flame video recognition.
This study presents several notable advantages: (1) Novel combination: it marks the first successful combination of LeNet-5 and DFC applied to MSWI combustion status recognition. (2) High recognition accuracy: the constructed combustion status-recognition model exhibits superior accuracy in identifying the various combustion statuses. (3) Online application validation: applying the offline recognition model in an online recognition system demonstrates its practical value and real-world applicability. (4) Real MSWI plant data: the research is based on actual MSWI plant flame data, offering practical insights and guidance for implementation.
The study’s limitations are apparent in two areas: (1) Incomplete representation: the considered combustion statuses may not encompass all the varied conditions observed on site; future work should supplement these statuses based on expert insights and develop corresponding recognition models. (2) Qualitative analysis only: the current recognition model performs predominantly qualitative analysis of the flame’s combustion status; quantitative analyses using flame data are needed to assess factors such as material layer thickness.
The flame combustion status online recognition system plays a pivotal role in boosting operational efficiency and reducing pollutant emissions within the MSWI process. This technology enables real-time monitoring of incineration flames, ensuring a consistently efficient and stable combustion process. Building on the flame online-recognition software, precise control strategies can be employed to fine-tune combustion parameters, significantly reducing the release of harmful gases and enhancing resource utilization efficiency. This intelligent-control approach contributes significantly to the sustainability objectives of MSW management by combining incineration technology with environmental protection, steering the MSWI process toward a more eco-friendly direction.