1. Introduction
The cement industry is one of the pillars of modern infrastructure development, but it is also among the most energy-intensive and polluting sectors globally. Cement production is responsible for a significant share of global carbon dioxide emissions, contributing to climate change and environmental degradation [
1]. In this context, the cement industry is seeking more sustainable practices, with a growing focus on the use of alternative fuels such as biomass or waste-derived fuels to mitigate environmental impacts and enhance energy efficiency [
2]. Among these, refuse-derived fuel (RDF) has emerged as a promising solution, offering a pathway to reduce reliance on traditional fossil fuels like petcoke and incorporate waste management principles into energy recovery [
3].
RDF is a fuel produced from various types of waste, including municipal solid waste (MSW), industrial waste, and commercial waste. The use of RDF in cement kilns has gained traction as a method to improve the sustainability of the cement industry.
The global generation of MSW continues to increase, leading to significant environmental concerns due to the large portions that end up in landfills or are incinerated [
4]. Converting MSW into refuse-derived fuel (RDF) offers a sustainable solution, especially for the cement industry, by providing an alternative fuel source that can help reduce the reliance on conventional fossil fuels like coal [
5,
6]. RDF is derived from non-recyclable components of MSW and has a calorific value similar to traditional fossil fuels, making it suitable for use in cement kilns for clinker production. Using RDF in cement kilns not only conserves natural resources but also helps lower greenhouse gas emissions. This is due to RDF’s biogenic content, which is considered carbon-neutral, and the overall reduction in emissions of pollutants like sulfur oxides (SOx), nitrogen oxides (NOx), and particulate matter compared to coal combustion [
7,
8,
9].
Recent studies have emphasized the potential of RDF in enhancing the energy efficiency and environmental sustainability of cement production by decreasing the reliance on traditional fossil fuels like coal, oil, and natural gas, helping to conserve these non-renewable resources [
3,
7,
10]. These investigations reveal that the adoption of RDF can lead to significant improvements in the carbon footprint of cement kilns, with the added benefit of addressing the escalating issue of waste disposal. Furthermore, the integration of RDF into cement manufacturing aligns with the principles of the circular economy, promoting the reuse of waste materials and the optimization of energy consumption [
11]. Some authors have calculated that, economically, integrating 15% RDF into cement kiln fuel can save more than 20,000 tons of petcoke annually, reduce CO2 emissions by more than 16,000 tons/year, and result in net savings of approximately USD 3 million per year due to decreased fuel and CO2 costs in the case of the cement industry in Jordan [
12]. However, the transition to RDF as a primary fuel source in cement kilns is not without challenges. Technical, economic, and regulatory hurdles must be overcome to facilitate the widespread adoption of this practice. The heterogeneity of RDF, along with the need for pre-treatment processes to meet industry standards, poses technical challenges that require innovative solutions for the efficient combustion of RDF in rotary kilns [
5]. The use of RDF as a substitute fuel in cement kilns poses potential risks due to the presence of heavy metals, particularly the more volatile ones, which can transfer to gaseous emissions; the suitability of RDF as a fuel is therefore contingent on its quality, and its use must be assessed on a case-by-case basis to ensure environmental safety [
13].
Regarding combustion stability, the maximum thermal substitution rate (TSR) achieved with RDF can reach 80–100% in the calciner, while it is limited to 50–60% in the kiln burner. Advances in pre-combustion technologies, multi-channel burners, and new satellite burners have facilitated high TSR. Extensive modeling of kiln burners and calciners has further enhanced TSR [
8]. Other authors have analyzed alternatives to the direct firing of RDF in rotary kilns, such as the gasification of the RDF [
8].
Rotary kilns, essential for producing lime and cement clinker, utilize direct flame heating to drive multiple chemical processes. Effective monitoring and control of this process are essential to ensure efficient fuel utilization, maintain product quality, and comply with environmental regulations. Such monitoring includes maintaining the appropriate temperature profile, which is crucial for complete RDF combustion and the prevention of pollutant formation, and is achieved through the use of sensors and thermocouples along the various zones of the kiln [
8]. Infrared pyrometers and optical sensors are employed to monitor flame temperature, ensuring it remains within the optimal range for efficient combustion [
14,
15]. Continuous emissions monitoring systems track CO and CO2 levels to ensure complete combustion and monitor process efficiency. Monitoring of NOx and SOx emissions is critical due to regulatory requirements, with some systems providing real-time data that facilitate process adjustments to minimize emissions [
12]. Due to the higher content of volatile heavy metals in RDF, it is essential to monitor their levels in gaseous emissions, and specialized filters and detectors measure concentrations of elements such as Sb, Hg, Cd, and Pb [
12]. Automated feed systems regulate the rate at which RDF is fed into the kiln, maintaining a consistent fuel supply and preventing fluctuations that could impact combustion efficiency [
14]. Other studies present advanced models for predicting the temperature inside the rotary kiln [
16,
17], predictive control techniques [
18,
19,
20], or the analysis of the features of the flame video to recognize patterns [
20].
These methodologies enable operators to make informed decisions regarding adjustments to the operation of the kiln. By analyzing data from these various sources, operators can optimize fuel consumption, enhance clinker quality, extend the lifespan of the kiln, and reduce the risk of unplanned downtime [
21].
The literature indicates the crucial relationship between flame visual representation and cooking zone temperature [
22,
23,
24]. Flame images and various internal components of the rotary kiln provide critical operational condition information. Thus, some methods utilizing computer vision techniques have been developed to achieve intelligent control of rotary kilns [
24,
25,
26,
27,
28]. Among these techniques are algorithms for segmenting the flame in the combustion zone of a rotary kiln based on texture granularity using the Gabor transform and Fuzzy C-Means clustering [
25]. Other approaches use heterogeneous features, such as color and global and local configuration characteristics extracted directly from pixel values without segmentation. Additionally, machine learning techniques are increasingly applied, including image recognition methods using neural networks for flame state identification, albeit with high processing times [
27,
29].
Other methods analyze flame images to extract texture features like energy, entropy, and inertia, employing singular value decomposition (SVD) [
30], support vector machines (SVMs) [
31], and K-means [
28] for feature extraction and classification of flame images to recognize rotary kiln working conditions. Recent studies have combined filters or image segmentation to highlight regions of interest before using neural networks for condition recognition, slightly improving recognition accuracy [
24,
32,
33].
On the other hand, the application of deep learning techniques in image recognition has made significant progress, with models like the convolutional neural network VGG-16 used for feature transfer, training, and testing flame images in different combustion states to achieve automatic combustion state identification [
34]. However, the application of deep learning in recognizing rotary kiln working conditions still holds substantial development potential for more complex analyses and recognition tasks.
The lack of labeled sample data complicates feature extraction in neural networks, and the training process is prone to overfitting. To reduce deep learning models’ dependence on training sample size, transfer learning can be applied to classification, detection, or segmentation tasks to accelerate training efficiency. Therefore, to overcome limitations due to the lack of massive data, it is proposed to start with state-of-the-art deep learning methods previously trained with massive datasets and apply transfer learning strategies to adapt them to the specific problem of instance segmentation within the rotary kiln [
35].
Consequently, the use of deep learning for predictive maintenance and process optimization can lead to significant energy savings. By accurately predicting when maintenance is needed, DL models help prevent unplanned downtime and reduce energy waste associated with inefficient operations. Additionally, optimizing the combustion process ensures that the kiln operates at peak efficiency, minimizing energy consumption and associated costs [
36,
37].
1.1. Instance Segmentation
Computer vision, a rapidly growing interdisciplinary field, prominently features object detection as one of its primary applications. Object detection involves two challenges: locating objects within an image and classifying them.
Advanced object detection methods are divided into two main categories: one-stage detectors and two-stage detectors. Two-stage detectors use a region proposal network (RPN) to generate regions of interest (ROIs), which are subsequently classified in a second step. In contrast, one-stage detectors integrate both tasks into a single process. Generally, two-stage detectors prioritize detection accuracy, while one-stage detectors focus on inference speed and are suitable for real-time applications [
38].
Object detection typically aims to approximate object location in an image, obtaining a bounding box, as shown in the example of
Figure 1. Image segmentation can be approached in various ways, ranging from assigning semantic labels to individual pixels (semantic segmentation) to partitioning individual objects (instance segmentation). Instance segmentation can distinguish isolated objects and separate them into different instances of the same class, providing detailed information about each individual entity in the scene.
The outcome of an instance segmentation model is a set of masks or contours delineating each object in the image, accompanied by class labels and confidence scores for each object. Instance segmentation is particularly useful when it is necessary to determine not only the location of objects within an image but also their exact shape.
Figure 1.
Image analysis techniques: classification, object detection, and segmentation.
1.2. Mask R-CNN
Mask R-CNN [
39], proposed in 2017, is an instance segmentation method that evolved from Faster R-CNN [
40], an advanced object detection model based on convolutional neural networks (CNNs). Mask R-CNN extends Faster R-CNN by adding instance segmentation, allowing precise identification and delineation of objects in images by generating specific masks for each detected object. Its two-stage architecture incorporates a robust mechanism for generating pixel-level instance masks [
39]. In Mask R-CNN, the image is first processed by a convolutional neural network that provides a convolutional feature map. Subsequently, another network (region proposal network, RPN) is used to predict the proposals for regions of interest (ROIs). These ROIs are then refined and classified while high-precision instance masks are simultaneously generated. A critical component of Mask R-CNN is the ROI Align technique, which ensures accurate alignment of the feature maps with the instance masks, significantly contributing to the quality of the resulting masks. This method achieves an inference speed of approximately five frames per second (FPS), which is a step towards real-time performance. Additionally, the impact of this method was significant, especially since the code and model were released after its publication. Furthermore, researchers have introduced variations and extensions, such as Cascade Mask R-CNN [
41] and Panoptic FPN [
42], which enhance its versatility and capabilities.
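As an illustration of the two-stage pipeline described above, the following sketch runs a Mask R-CNN pre-trained on MS COCO using the torchvision implementation. It is not the model used in this work; the input file name and the confidence threshold are illustrative.

```python
# Minimal sketch: instance segmentation with a pre-trained Mask R-CNN from torchvision.
import torch
import torchvision
from torchvision.transforms.functional import to_tensor
from PIL import Image

# Load a Mask R-CNN pre-trained on MS COCO (ResNet-50 + FPN backbone).
model = torchvision.models.detection.maskrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()

image = Image.open("kiln_frame.jpg").convert("RGB")  # hypothetical input frame
with torch.no_grad():
    outputs = model([to_tensor(image)])[0]

# Each detection comes with a box, a class label, a confidence score, and a
# pixel-level mask (soft values in [0, 1], usually thresholded at 0.5).
for box, label, score, mask in zip(outputs["boxes"], outputs["labels"],
                                   outputs["scores"], outputs["masks"]):
    if score > 0.5:
        binary_mask = mask[0] > 0.5
        print(label.item(), round(score.item(), 2), box.tolist(), int(binary_mask.sum()))
```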
1.3. YOLO (You Only Look Once)
Unlike some approaches that use two stages for detection and segmentation, YOLO (You Only Look Once) [
43] employs a single-stage architecture integrating both tasks in one step, significantly enhancing efficiency.
Figure 2 shows the evolution of the different versions of YOLO throughout the years. YOLO, launched in 2015, quickly gained popularity for its speed and accuracy, capable of inferring at 45 FPS. Subsequent versions YOLOv2 [
44] and YOLOv3 [
45] introduced significant improvements. A highly efficient model with outstanding performance in real-time instance segmentation, known as YOLACT [
46], was developed based on an encoder–decoder architecture.
In 2020, two new versions of YOLO were published: YOLOv4 by Alexey Bochkovskiy on Darknet [
47] and YOLOv5 by Glenn Jocher in a PyTorch implementation [
48]. Later, YOLOv6 [
49] was developed by Meituan in 2022, but until the release of YOLOv7 [
50], segmentation models and additional tasks, like pose estimation on the MS COCO keypoints dataset, were not added [
51]. YOLOv8 [
52] and YOLOv9 [
53] are the most recent and advanced versions of the real-time object detection and segmentation algorithm, building on the success of previous versions and introducing new features and improvements to enhance performance, flexibility, and effectiveness.
In this study, the YOLOv8 algorithm has been used for monitoring the combustion zone of a rotary kiln. This selection has been based on several factors, including performance, adaptability, technical integration, and application-specific benefits. YOLOv8, similar to its predecessors, is distinguished by its real-time performance capabilities, offering high frame rates essential for continuous monitoring in industrial settings. Previous studies have demonstrated that YOLO-based models can achieve inference speeds upwards of 45 FPS, making them highly suitable for real-time applications [
45,
47]. The need for prompt detection and segmentation in a dynamic environment such as a rotary kiln makes YOLOv8 particularly advantageous. Additionally, YOLOv8 benefits from transfer learning, leveraging pre-trained models on extensive datasets. This allows for effective fine-tuning with smaller, application-specific datasets, thereby reducing the overall training time while maintaining high accuracy [
54]. In comparison, other algorithms might require more extensive data and longer training periods to achieve comparable performance. This efficiency is crucial in industrial applications where rapid deployment and adaptation are necessary. The practical deployment of YOLOv8 in industrial environments is facilitated by its ease of implementation and resource efficiency. YOLOv8 is designed to run efficiently on standard hardware, making it feasible for real-time monitoring without requiring specialized computational resources [
49]. The dynamic and high-temperature environment of the combustion zone presents unique challenges, such as fluctuating light conditions and high levels of particulate matter. The robust single-stage architecture of YOLOv8 is well suited to handle these challenges, providing reliable segmentation and detection under varying conditions [
54,
55]. Additionally, the capability of YOLOv8 to handle real-time inference with high accuracy ensures that it can effectively monitor and control the combustion process, enhancing operational efficiency and safety.
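As a minimal sketch of the transfer-learning workflow outlined above, the following snippet fine-tunes a COCO-pre-trained YOLOv8 segmentation checkpoint on a small application-specific dataset with the Ultralytics API. The dataset file kiln_data.yaml, the medium model variant, and the hyperparameter values are assumptions for illustration, not the exact configuration used in this study.

```python
from ultralytics import YOLO

# Start from a segmentation checkpoint pre-trained on MS COCO (medium variant).
model = YOLO("yolov8m-seg.pt")

# Fine-tune on a small, application-specific dataset described by a YOLO-format YAML file.
model.train(data="kiln_data.yaml", epochs=30, imgsz=640, batch=16)

# Validate the fine-tuned model and run inference on a new frame.
metrics = model.val()
results = model("kiln_frame.jpg")
```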
1.4. Segment Anything Model (SAM)
Segment Anything (SAM) [
56] is a promptable object segmentation model built on a vision transformer image encoder. SAM is a segmentation system that can be guided through prompts and possesses a general understanding of objects (zero-shot generalization). This means it can segment any element in an image without necessarily having seen objects of the same class before.
In the SAM model, the image is first processed by an image encoder. A prompt encoder then encodes prompts provided by the user or generated automatically; these prompts can take various forms, such as points, boxes, masks, or text. Finally, a lightweight mask decoder combines the image features and the encoded prompts to produce pixel-level instance masks, together with a quality score for each mask.
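The following sketch illustrates prompt-based segmentation with the official segment_anything package; the checkpoint file corresponds to the publicly released ViT-B weights, while the input image and the box prompt coordinates are illustrative and unrelated to the kiln data.

```python
import numpy as np
import cv2
from segment_anything import sam_model_registry, SamPredictor

# Load the released ViT-B SAM checkpoint and wrap it in a predictor.
sam = sam_model_registry["vit_b"](checkpoint="sam_vit_b_01ec64.pth")
predictor = SamPredictor(sam)

image = cv2.cvtColor(cv2.imread("kiln_frame.jpg"), cv2.COLOR_BGR2RGB)
predictor.set_image(image)

# Prompt SAM with a rough bounding box (x1, y1, x2, y2) around a region of interest.
box = np.array([200, 150, 800, 600])
masks, scores, _ = predictor.predict(box=box, multimask_output=True)
print(masks.shape, scores)  # three candidate masks with their predicted quality scores
```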
This paper aims to extend the role of computer vision technologies in monitoring and optimizing the use of RDF in rotary cement kilns. By leveraging advanced imaging and data analysis techniques based on deep learning, it is possible to acquire images of the kiln interior and segment the most relevant regions of the flame. To address the machine learning problem of detecting and classifying key elements in rotary kiln operation images, independent and dependent variables have been defined. The independent variables are the image frames captured during kiln operation; each frame serves as an input to the deep learning model, providing the visual data required for analysis. The dependent variables are the classes that the model predicts for each frame. These classes include Plume, Flame, and Clinker as defined in
Figure 3. The class Plume corresponds to the mix of RDF and fossil fuel when it enters the rotary kiln from the burner, prior to its combustion. The class Flame refers to the air and fuel mix that is in the combustion phase and therefore at a high temperature. Finally, the class Clinker corresponds to the raw material that exits the rotary kiln at the lower part of the image.
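For reference, a YOLO-format dataset configuration for these three classes might look as follows; the file name kiln_data.yaml and the paths are hypothetical and only indicate how the class labels map to indices.

```python
# Write a hypothetical YOLO-format dataset configuration for the three classes.
kiln_data_yaml = """
path: datasets/kiln
train: images/train
val: images/val
test: images/test
names:
  0: Plume
  1: Flame
  2: Clinker
"""
with open("kiln_data.yaml", "w") as f:
    f.write(kiln_data_yaml)
```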
This work is the first step towards the development of control algorithms based on images and process data that allow operators to make the right decisions when using fuels like RDF with high variability in their calorific values.
3. Results and Discussion
To evaluate and select the final model for use in the monitoring tool in this work, two comparisons were conducted. The first comparison involves analyzing the overall performance of three different object segmentation models using the metrics previously described. The second comparison examines the effect of increasing the number of epochs used in training the same model.
3.1. Comparison between the Three Variants of YOLOv8
The object classes to be detected and segmented in this context include the categories of Flame, Clinker, and Plume. Performance evaluation is carried out using metrics such as B(P), B(R), or B(mAP50), obtained through the use of bounding boxes. Additionally, the metrics M(P), M(R), M(mAP50), and M(mAP50-95) are considered, which evaluate instance segmentation using masks. In this analysis, particular focus is placed on the metrics starting with M, associated with the masks used by YOLOv8 for instance segmentation.
Table 3,
Table 4 and
Table 5 show the results of the experiments E1, E2, and E3 carried out with YOLOv8-small, YOLOv8-medium, and YOLOv8-large models.
It is observed that all models exhibit high precision and recall for each class, indicating remarkable performance in instance segmentation for the three categories. However, both E3 (
Table 5) and E2 (
Table 4) present superior values in M(mAP50-95) compared to E1 (Table 3). Furthermore, the models achieve high values in M(mAP50) for all classes, suggesting an effective capability to distinguish instances when the overlap between detection and ground truth is at least 50%. Notably, the performance in the Flame class shows higher values compared to the other two classes. Despite this, a decrease in the values of M(mAP50-95) for all classes is evident as the IoU threshold increases, indicating a loss of precision. This phenomenon can be attributed to the difficulties of the models in capturing details and edges of instances, especially in the more irregularly shaped Clinker and Plume classes.
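For context, the box and mask metrics discussed here can be obtained programmatically from the Ultralytics validation routine, as in the following sketch; the weight path and dataset file are illustrative, and the attribute names may differ between library versions.

```python
from ultralytics import YOLO

model = YOLO("runs/segment/train/weights/best.pt")  # hypothetical trained weights
metrics = model.val(data="kiln_data.yaml")

# Box metrics: precision, recall, mAP@0.5, and mAP@0.5:0.95
print(metrics.box.mp, metrics.box.mr, metrics.box.map50, metrics.box.map)
# Mask metrics: the M(P), M(R), M(mAP50), and M(mAP50-95) values used in this section
print(metrics.seg.mp, metrics.seg.mr, metrics.seg.map50, metrics.seg.map)
```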
Another essential requirement in this project is the ability to work in real time, which involves achieving an optimal inference time to obtain the maximum number of frames per second (FPS). Therefore, the inference time for the three models and the number of FPS they can achieve were measured. According to the results presented in
Table 6, both E2 and E1 achieve a higher FPS value compared to E3. In conclusion, after evaluating the two fundamental requirements of the project and analyzing the results, we opted to select the model trained in E2. This model not only offers higher precision but also achieves a high frame rate, thus meeting the established requirements.
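A simple way to estimate the inference time and FPS figures compared in Table 6 is to time repeated predictions, as sketched below; the weight path, frame names, and number of frames are illustrative, and the resulting numbers depend entirely on the hardware used.

```python
import time
from ultralytics import YOLO

model = YOLO("runs/segment/train/weights/best.pt")
frames = ["frame_%03d.jpg" % i for i in range(100)]  # hypothetical test frames

# Warm-up run so model loading and GPU initialisation do not skew the measurement.
model(frames[0], verbose=False)

start = time.perf_counter()
for frame in frames:
    model(frame, verbose=False)
elapsed = time.perf_counter() - start

latency_ms = 1000 * elapsed / len(frames)
print(f"mean inference time: {latency_ms:.1f} ms -> {1000 / latency_ms:.1f} FPS")
```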
3.2. Impact of Training Epochs on Model Performance
Following the selection of the model to be used, and with the goal of improving precision, the same model was trained with different numbers of epochs, seeking a balance between enhancing performance and avoiding issues such as overfitting.
In
Figure 12, the charts are divided into two sets: those whose titles start with train show metrics calculated on the training dataset, while those starting with val show metrics calculated on the validation dataset. On the left side of the figure, the training losses decrease consistently over the epochs, and the validation losses decrease as well, suggesting that the model generalizes well and avoids overfitting to the training data. On the right side, the progression of model training is as expected, with rapid growth of the metrics in the early epochs and stabilization from epoch 40 onwards. Therefore, a new experiment (Experiment 4, or E4) was carried out with the same model and dataset but increasing the number of epochs to 50, since the gain in
mAP is practically insignificant beyond that point.
In
Table 7, it can be seen that E4 achieves a higher value of M(mAP50-95) compared to E2, indicating better model performance. This translates into an increase in average precision across all classes, with the Flame class again showing outstanding performance, while the Plume class presents greater challenges in segmentation.
In
Figure 13, the results of the model inference on various validation images are presented. In these representations, the assigned class is shown along with the corresponding confidence percentage, as well as the predicted segmentation mask.
3.3. Implementation of the System in Real Time
Once the model was selected, the final integration of all tools was carried out with the objective of obtaining the monitoring system. This system encompasses both the acquisition of images in real time and the trained model.
As illustrated in
Figure 14, the acquisition system is responsible for capturing images from inside the rotary kiln, which will serve as input to the selected model. The latter performs the inference of the images, generating a resulting image with the various segmented elements. Additionally, to assist the operator in decision-making regarding the control of the rotary kiln, a characterization of the different segmented elements is carried out. This characterization provides a quantitative value to the operator through the percentage of the area occupied by each instance within the image.
Following the implementation and startup of the system at the cement production facilities, it was confirmed that the model inference time is 23 ms, as previously indicated, without causing delays in image acquisition. This ensures that the system operates at 25 FPS, thus meeting the two essential requirements of this project.
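A minimal sketch of the characterization step, assuming the Ultralytics prediction API, is shown below: it computes the percentage of the frame area covered by each segmented instance, which is the quantitative value presented to the operator. The weight and image paths are illustrative.

```python
from ultralytics import YOLO

model = YOLO("runs/segment/train/weights/best.pt")
# retina_masks requests masks at the original frame resolution.
result = model("kiln_frame.jpg", retina_masks=True, verbose=False)[0]

h, w = result.orig_shape
for cls_id, mask in zip(result.boxes.cls, result.masks.data):
    # Fraction of the frame occupied by this instance (mask pixels / total pixels).
    pct = 100.0 * float(mask.sum()) / (h * w)
    print(f"{result.names[int(cls_id)]}: {pct:.1f}% of the frame")
```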
3.4. Adaptation to Different Boundary Conditions
Given the changing nature of the industry, it is crucial that the models are robust enough to adapt to modifications. In this context, some adjustments to the burner of the rotary kiln are made during the cement production process, causing changes in the boundary conditions. For this reason, it is necessary to validate whether the previously trained model maintains its performance or, if not, to evaluate the need for adjustments to adapt it to the new conditions. To this end, a new dataset (dataset 2) was collected under the updated conditions, following the previously described steps: a campaign to collect videos, subsequent cleaning, and finally the corresponding labeling. This new dataset encompasses a total of 106 images, of which 75 were assigned to training, 20 to validation, and 11 reserved for testing.
As detailed in
Section 3.2, the model initially trained exhibits very good performance metrics, with high scores in precision, recall, and mAP. Specifically, in E4, an average of 98.6% for M(mAP50) and 71.8% for M(mAP50-95) is achieved, as shown in Table 7. However, the results of the model validation against the new dataset 2, as evidenced in Table 8, are not optimal in this case, with an average of 86.5% for M(mAP50) and 47.2% for M(mAP50-95) across all classes. A clear deterioration in model performance is observed with this variation.
Therefore, it has been determined that it is crucial to make adjustments to the model to enhance its robustness in the face of new boundary conditions. Two different strategies have been evaluated for this purpose. The first involves retraining the model using both the original dataset 1 and the new dataset 2. The second strategy involves applying fine-tuning to the model using exclusively dataset 2.
Once the training of the new models with these respective strategies is completed, an evaluation of them will be conducted. For this, four additional experiments will be carried out. On one hand, Experiment 6 (E6) and Experiment 7 (E7), which implement the retraining strategy, evaluate the performance of the model on dataset 1 and dataset 2, respectively. On the other hand, Experiment 8 (E8) and Experiment 9 (E9), based on fine-tuning, will assess performance on dataset 1 and dataset 2, respectively.
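The two adaptation strategies can be expressed compactly with the Ultralytics API, as in the following sketch; the dataset files, epoch counts, and learning rate are illustrative and do not reproduce the exact settings of E6-E9.

```python
from ultralytics import YOLO

# Strategy 1: retrain from the pre-trained checkpoint on datasets 1 and 2 combined
# (the combined YAML simply lists both image folders under train/val).
model_retrained = YOLO("yolov8m-seg.pt")
model_retrained.train(data="kiln_data_combined.yaml", epochs=50, imgsz=640)

# Strategy 2: fine-tune the already-trained model using only dataset 2,
# typically with a lower learning rate and fewer epochs.
model_finetuned = YOLO("runs/segment/train/weights/best.pt")
model_finetuned.train(data="kiln_data2.yaml", epochs=20, imgsz=640, lr0=0.001)

# Cross-evaluation of both models on both datasets, mirroring E6-E9.
for m, name in [(model_retrained, "retrained"), (model_finetuned, "fine-tuned")]:
    for data in ["kiln_data.yaml", "kiln_data2.yaml"]:
        print(name, data, m.val(data=data).seg.map50)
```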
The results obtained after the adjustment process are highly noteworthy. The models experience significant improvements in precision, recall, and mAP. Notably, Experiment 9 (E9) achieves an average across all classes of 99.5% for M(mAP50) and 72.8% for M(mAP50-95), as detailed in Table 9. However, when the fine-tuned model is evaluated on dataset 1 (E8), its performance is the lowest compared to the other experiments, which could be due to possible overfitting of the model to the new dataset.
On the other hand, Experiment 7 (E7) demonstrates improved performance on the new dataset, in addition to maintaining virtually the same performance on the original dataset 1 as observed in Experiment 6 (E6), indicating a notable capacity for generalization. Ultimately, both strategies can be used successfully, and the choice between them will depend on specific requirements: if the objective is to achieve a model that generalizes across different datasets, the preferred choice is to retrain with all the data, whereas if the goal is to tailor the model to specific boundary conditions, fine-tuning emerges as the most suitable option. Both strategies ensure that the model maintains its effectiveness, robustness, and adaptability, becoming a fundamental asset for the rotary kiln monitoring system. As a final result,
Figure 15 presents an example of the final prediction of the model using dataset 2 compared with the ground truth obtained from the labeling of the dataset.
It is worth mentioning that optimizing hyperparameters such as the learning rate, batch size, and network architecture can lead to significant improvements in model performance. For example, learning rate schedules that reduce the learning rate as training progresses can help achieve better convergence. Moreover, experimenting with different batch sizes to find the optimal balance between memory usage and model performance, and trying different optimizers such as Adam or RMSprop, can help find the best fit for the specific application. Several post-processing techniques can also refine the outputs of the YOLOv8 model to improve accuracy and reduce false positives: Non-Maximum Suppression (NMS) eliminates multiple detections of the same object by keeping only the highest-confidence detection, and a confidence threshold can filter out low-confidence detections.
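The following sketch illustrates how these levers map onto the Ultralytics training and prediction interfaces; the specific values are illustrative rather than tuned for this application.

```python
from ultralytics import YOLO

model = YOLO("yolov8m-seg.pt")

# Training-time hyperparameters: learning-rate schedule, batch size, and optimizer.
model.train(
    data="kiln_data.yaml",
    epochs=50,
    batch=8,            # trade-off between GPU memory and gradient stability
    lr0=0.01,           # initial learning rate
    lrf=0.01,           # final learning-rate fraction, i.e. a decaying schedule
    cos_lr=True,        # cosine learning-rate schedule
    optimizer="AdamW",  # alternatives include "SGD", "Adam", and "RMSProp"
)

# Inference-time post-processing: confidence threshold and NMS IoU threshold.
results = model("kiln_frame.jpg", conf=0.5, iou=0.6)
```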
Finally, future adaptations of the results for practical implementation may need to address some challenges associated with a small data sample, the need for data labeling, and the limitations of transfer learning. Techniques like data augmentation and synthetic data generation can expand the training dataset, improving model robustness, while utilizing semi-supervised and active learning techniques can reduce dependency on labeled data and maximize labeling efficiency.
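As an example of how the training set can be expanded on the fly, the sketch below enables standard augmentation options exposed by the Ultralytics trainer; the values are illustrative and would need to be validated against the kiln imagery.

```python
from ultralytics import YOLO

model = YOLO("yolov8m-seg.pt")
model.train(
    data="kiln_data.yaml",
    epochs=50,
    hsv_h=0.015, hsv_s=0.7, hsv_v=0.4,      # colour jitter for changing light conditions
    degrees=5.0, translate=0.1, scale=0.3,  # mild geometric transforms
    fliplr=0.5,                             # horizontal flips
    mosaic=1.0,                             # mosaic augmentation combines four frames into one
)
```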
4. Conclusions
Throughout this paper, the successful development and implementation of a monitoring system for rotary kilns in a real environment has been demonstrated using advanced computer vision and deep learning techniques. The operation of the developed system under real working conditions can be observed in
Supplementary Video S1. The implementation of this monitoring system can not only enhance the efficiency of the kiln operators but also enable comprehensive supervision and control of the cooking process. These improvements are expected to optimize fuel consumption and contribute to an increase in the quality of the final product while decreasing the consumption of fossil fuels and thus reducing pollutant emissions.
The selection of the YOLOv8 model has been supported by its demonstrated capacity to detect and segment instances of flame, clinker, and plume with high levels of precision in real time. This endorsement validates the suitability of the adopted approach and ensures the fulfillment of the project requirements. Additionally, two strategies have been proposed to adapt the model to changes in boundary conditions, substantially improving both its precision and segmentation capability.
Future developments can be pursued in two main directions: First, to enhance the precision of the model, it is suggested to explore the possibility of increasing the dataset size or balancing the classes to achieve more homogeneous results. In this context, training the model with a greater number of epochs could be considered, maintaining a cautious balance between improving precision and mitigating overfitting. Furthermore, to enhance the robustness of the models, it is necessary to evaluate them under different working conditions, different plants, and various compositions of RDF. This requires a deep interaction between expert operators and the model developers. This collaborative approach ensures that the models can adapt to real-world scenarios and handle the variability inherent in RDF compositions, leading to more reliable and efficient performance in practical applications.
Second, extracting features from each instance and correlating them with process data would allow the construction of predictive models from the images to anticipate events in the rotary kiln or to develop automatic process control, relating variables such as product quality, RDF composition, or pollutant emissions. In this way, techniques such as reinforcement learning could optimize the continuous process by learning from interactions with the environment, improving quality control, fuel optimization, and emissions reduction.