1. Introduction
Pipeline infrastructure plays a pivotal role in transporting vital resources such as crude oil, gas, and water. However, leakages from pipelines can lead to significant economic losses, environmental degradation, and safety hazards [1].
The integration of the Internet of Things (IoT) and Artificial Intelligence (AI) has revolutionized the monitoring and management of industrial systems [2]. The IoT refers to a network of interconnected physical devices such as sensors, machines, appliances, and vehicles that are embedded with software and other technologies. These devices collect and exchange data over the internet, enabling automation, monitoring, and control without human intervention. For instance, refs. [3,4] highlight the crucial role of IoT in smart agriculture through its ability to facilitate real-time environmental monitoring and automated decision-making. Similarly, Ullah [5] demonstrated how a fog-enabled IoT robotic system can enhance crop management in Moroccan oases. Moreover, Malashin [6] utilized IoT-based acoustic sensors to detect pipeline leaks, showing its effectiveness in industrial monitoring applications.
Artificial Intelligence (AI), on the other hand, focuses on building systems capable of performing tasks that typically require human intelligence such as learning, reasoning, and decision-making. According to [2], AI is transforming agricultural practices by enabling predictive analytics and decision support systems. Farooq [7] applied deep learning algorithms to detect corrosion in gas pipelines, showcasing AI’s capability in image classification and fault detection. In another application, machine learning models for pipeline leakage detection were implemented, illustrating the value of AI in pattern recognition and anomaly detection [4].
Thermal imaging, in particular, is a valuable sensing technique for identifying temperature anomalies associated with leaks [1]. Edge computing has also been explored to reduce data processing latency and enhance system efficiency [8,9].
Recent advancements in edge artificial intelligence (Edge AI) have significantly improved real-time data processing in resource-constrained environments, enabling efficient deployment of deep learning models for industrial monitoring applications [10]. In addition, transformer-based models have recently gained attention in thermal image analysis due to their ability to capture long-range dependencies and improve feature representation compared to traditional convolutional neural networks [9]. Furthermore, federated learning has emerged as a promising approach for IoT-based monitoring systems, allowing multiple distributed devices to collaboratively train models without sharing raw data, thereby enhancing privacy and scalability [11]. These emerging trends highlight the need for lightweight, efficient, and intelligent leak detection systems capable of operating in decentralized and resource-limited environments.
In this context, this study proposes a system that integrates deep learning, thermal imaging, edge processing, and image compression techniques. The study simulates data processing, compression techniques, and network transmission scenarios to demonstrate system efficiency in a virtual setting without real-time physical deployment on hardware devices such as Raspberry Pi. This study contributes to the development of a scalable and intelligent leak detection framework using thermal imaging and artificial intelligence in a simulated environment. It demonstrates how Convolutional Neural Networks (CNN) and autoencoders can improve leak detection performance while reducing computational complexity, without requiring costly real-time infrastructure. By leveraging open-source libraries and simulations, the work encourages reproducibility and future adaptation for real-world deployment [7]. It also addresses the technical limitations of thermal image size and communication bandwidth in constrained networks, providing a foundation for more efficient pipeline monitoring systems.
Thermal imaging has emerged as a powerful and non-invasive technique for detecting faults, including leaks, in pipelines and industrial systems. The core principle relies on capturing infrared radiation, which reveals temperature anomalies that often correspond to faults or fluid leakage. This approach is especially effective when the leaked substance differs thermally from the surrounding environment. One of the major advantages of thermal imaging is its remote and non-contact capability, making it suitable for environments that are hazardous, difficult to access, or where physical intervention is not feasible.
Furthermore, thermal imaging systems are highly sensitive to even small temperature variations, which is essential for detecting early-stage leaks that might be missed by conventional methods. Thermal sensors can identify minute thermal changes, allowing for prompt intervention before significant damage occurs [12]. Another significant factor in the adoption of thermal imaging is its cost-effectiveness and real-time performance. Compared to more complex techniques such as fiber optics or acoustic emission analysis, thermal imaging provides a more economical and scalable solution. Mohamed [4] reports that thermographic methods offer reliable wide-area surveillance at a lower cost, making them ideal for industrial applications, particularly where extensive pipeline networks are involved. Moreover, thermal cameras function reliably in low-visibility or harsh environmental conditions, such as fog, dust, or total darkness. This makes them invaluable for 24/7 monitoring in outdoor or industrial settings.
2. Literature Review
Oil pipeline systems, particularly those transporting hydrocarbons, constitute critical national infrastructure but remain highly susceptible to leaks due to corrosion, material fatigue, equipment failure, and sabotage [1]. The repercussions of such failures are severe, ranging from environmental degradation and health hazards to enormous economic losses [13,14].
Traditional leak detection mechanisms, including pressure monitoring, acoustic sensing, and infrared thermography, have made significant contributions to pipeline safety [14]. However, these methods often fall short in delivering real-time, cost-effective, scalable, and autonomous solutions, especially in geographically dispersed and bandwidth-constrained environments [15]. The convergence of emerging technologies such as the Internet of Things (IoT), Artificial Intelligence (AI), and edge devices offers transformative potential in enhancing the efficiency, accuracy, and autonomy of pipeline monitoring systems. This review synthesizes recent advances across key areas: IoT-enabled monitoring systems, thermal imaging techniques, deep learning approaches, edge-device deployments, data transmission and compression strategies, and energy-efficient system design.
Jadin and Obaid [15,16] also applied thermal cameras to detect CO₂ and gas leaks. Although these studies validated thermal imaging’s utility, their approaches lacked IoT and AI integration, thus restricting real-time scalability and adaptive performance.
However, the model’s computational intensity precludes its deployment on lightweight edge devices, making it unsuitable for remote field use. Additionally, an experiment with aerial thermal imagery and Holistically-Nested Edge Detection (HED) was conducted, achieving good results but failing to integrate any edge device framework, thus limiting the system’s portability [17].
Nguyen [11] demonstrated that thermal anomalies caused by underground natural gas leakage can be successfully classified using deep learning models applied to infrared thermal images, achieving over 95% accuracy.
While thermal imaging continues to be vital in leak detection, its integration with AI and deployment on resource-constrained edge devices remain limited.
Recent deep learning innovations have significantly improved leak detection performance. For example, a hybrid model combining CNN, LSTM (Long Short-Term Memory), and genetic algorithms attained 99.69% accuracy using acoustic data. Despite its robustness, it is not suitable for thermal imaging and requires substantial processing power, making edge deployment challenging [14].
3. Methodology
This study proposes a lightweight deep learning framework for thermal-based pipeline leak detection, integrating Convolutional Neural Networks (CNN), Knowledge Distillation (KD), and Autoencoders (AE) for efficient edge deployment. The overall system architecture is designed to balance detection accuracy with computational efficiency, making it suitable for real-time applications on resource-constrained devices.
3.1. System Overview
The proposed system consists of four main stages: data description, preprocessing, model training, and edge deployment. Thermal images are captured from pipeline environments, preprocessed to enhance quality, and then fed into a teacher–student learning framework. The teacher model guides the student model through knowledge distillation, while an autoencoder is incorporated to improve feature representation and noise reduction. The final trained student model is optimized for potential deployment on an edge AI device such as a Raspberry Pi, enabling real-time leak detection.
3.2. Dataset Description
The dataset used in this study consists of 1506 thermal images captured using an FLIR C5 thermal imaging camera (FLIR Systems, Inc., Arlington, VA, USA) operating in the long-wave infrared (LWIR) spectrum (8–14 µm) to document pipeline conditions under both leak and non-leak scenarios. The camera has a thermal resolution of 320 × 240 pixels, a temperature measurement range of −20 °C to 650 °C, and a thermal sensitivity of ≤0.05 °C, enabling accurate detection of temperature variations associated with pipeline leakage and structural anomalies. Data collection was conducted in outdoor operational pipelines to capture realistic thermal variations under real-world conditions. All images were stored in JPEG format, carefully annotated, and classified into two categories:
Leak: Images depicting pipelines with confirmed leakage, including small cracks, punctures, and corrosion-induced leaks (506 samples).
Non-leak: Images depicting pipelines without any leakage (1000 samples).
To increase the dataset size and improve model generalization, image augmentation techniques were applied to all collected images. Each image was augmented into six variations (rotation, zooming, flipping, width shift, height shift, and rescaling), resulting in 3036 leak images (506 × 6) and 6000 non-leak images (1000 × 6).
This brings the final dataset size to a total of 9036 images. Because every image receives the same number of variations, the augmentation preserves the original leak-to-non-leak ratio rather than balancing the classes; its benefit is the added variability, which enhances the robustness of the deep learning model for pipeline leak detection.
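The dataset arithmetic can be checked with a short script. This is only an illustrative sketch: the dictionary keys and the helper name `augmented_counts` are hypothetical, and the factor of six is taken from the description above.

```python
# Original class counts reported for the collected thermal images.
ORIGINAL_COUNTS = {"leak": 506, "non_leak": 1000}
VARIATIONS_PER_IMAGE = 6  # six augmented variations per image


def augmented_counts(counts, factor):
    """Multiply every class count by the same augmentation factor."""
    return {label: n * factor for label, n in counts.items()}


augmented = augmented_counts(ORIGINAL_COUNTS, VARIATIONS_PER_IMAGE)
total_images = sum(augmented.values())

# Share of leak images before and after augmentation (unchanged by a uniform factor).
leak_share_before = ORIGINAL_COUNTS["leak"] / sum(ORIGINAL_COUNTS.values())
leak_share_after = augmented["leak"] / total_images
```

Since the factor is uniform, the leak share stays at roughly one third of the data both before and after augmentation.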
Figure 1 illustrates sample thermal images from the dataset used in this study. The images capture sections of pipelines under different conditions, labeled as leak and non-leak. These thermal variations serve as the input data for training the deep learning model.
3.3. Data Preprocessing
Prior to model training, the dataset undergoes several preprocessing steps to enhance image quality, ensure label accuracy, and optimize compatibility with the model architecture, particularly for deployment on edge devices. An autoencoder-based image compression technique is used to reduce file size while preserving essential thermal features for efficient storage and faster inference on devices like the Raspberry Pi. Corrupted or low-quality images are filtered out, and manual label verification ensures accurate leak and non-leak classification. Pixel normalization standardizes intensity values for stable learning, while data augmentation techniques such as random rotations, flips, and zooming expand dataset diversity, reduce overfitting, and improve generalization without additional data collection. Collectively, these processes enhance dataset reliability, reduce computational overhead, and improve the model’s performance during training and real-time deployment.
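The normalization and augmentation steps can be sketched in a few lines. This is a minimal, dependency-free illustration rather than the pipeline used in the study: the helper names (`normalize`, `horizontal_flip`, `augment`) and the 8-bit intensity scale of 255 are assumptions, and a single deterministic flip stands in for the random rotations, flips, and zooms.

```python
def normalize(image, max_value=255.0):
    """Scale pixel intensities to [0, 1] (assumes 8-bit input; hypothetical helper)."""
    return [[pixel / max_value for pixel in row] for row in image]


def horizontal_flip(image):
    """Mirror the image left to right."""
    return [row[::-1] for row in image]


def augment(image):
    """Return the normalized image plus one flipped variant."""
    base = normalize(image)
    return [base, horizontal_flip(base)]


# Tiny 2x3 grayscale patch used purely for illustration.
patch = [[0, 51, 255],
         [102, 204, 153]]
variants = augment(patch)
```

In practice an image library would handle these operations on full 320 × 240 frames; the nested lists here only make the transformations explicit.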
3.4. Model Building
The proposed model is a thermal-image-based oil and gas pipeline leak detection system optimized for efficient operation on Raspberry Pi devices, with cloud integration for centralized monitoring and control. It employs a Convolutional Neural Network (CNN) enhanced by knowledge distillation to accurately classify thermal images captured by surveillance cameras into leak and non-leak categories. The system architecture, illustrated in Figure 2, consists of three main components: a sensor module with thermal cameras mounted along the pipeline for real-time image capture; a detection module running a lightweight CNN model on a Raspberry Pi; and a cloud control server responsible for data management, analysis, and administrative functions. The structural workflow begins with dataset collection and preprocessing comprising autoencoder-based image compression, normalization, and data augmentation to improve image quality, training efficiency, and model robustness. The overall system framework, shown in Figure 3, presents the high-level operational flow of the proposed model, from data acquisition to leak detection and cloud-based monitoring.
The system flowchart presented in Figure 4 shows how the proposed model is trained using knowledge distillation with a teacher–student technique. The goal is to train a small, lightweight student model, suitable for edge devices, that mimics the behavior of a large teacher model. A high-capacity teacher model is trained first. Once the teacher model reaches stable performance, a student model is trained using both the ground-truth labels and the soft labels generated by the teacher, minimizing a combined loss function. The complete training and deployment process is illustrated in the flowchart.
3.5. Model Architecture
The proposed system adopts a teacher–student architecture to balance high accuracy and computational efficiency. The teacher model is a deep convolutional neural network (CNN) designed to learn rich feature representations, while the student model is a lightweight CNN optimized for edge deployment. The architectural configurations of both models are presented in Table 1.
The teacher model consists of deeper convolutional layers with a higher number of filters to capture complex thermal patterns associated with pipeline leaks. In contrast, the student model reduces the number of layers and filters to achieve a lightweight structure suitable for deployment on edge devices. Both models utilize the Rectified Linear Unit (ReLU) activation function for non-linearity and max-pooling layers for spatial down-sampling. The final classification layer uses a softmax activation function to output probabilities for leak and non-leak classes.
3.6. Proposed Model Formulation
The proposed system uses three components: Convolutional Neural Networks (CNNs), an autoencoder for image compression, and knowledge distillation for transferring knowledge from a complex model to a lightweight one. The resulting hybrid autoencoder–CNN system combines the feature extraction capabilities of CNNs with the dimensionality reduction and data representation power of the autoencoder and the model compression capability of knowledge distillation. This integration is particularly useful for models that are designed to be lightweight for deployment on edge devices.
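This hybridization can be expressed mathematically by formulating the problem as an optimization problem.
Let $x$ be the input image, $z$ its encoded latent feature, and $\hat{x}$ the image reconstructed from $z$ by the autoencoder.
The autoencoder reconstruction loss is given as:
$$L_{AE} = \lVert x - \hat{x} \rVert_2^2 \tag{3}$$
Equation (3) is the squared difference between the input and reconstructed images.
To combine the CNN with the autoencoder, we use Equation (4). We pass the encoded latent feature $z$ into the CNN classifier $f_{CNN}$, which outputs logits $s$:
$$s = f_{CNN}(z) \tag{4}$$
Let $y$ be the ground-truth label and $L_{CE}$ the cross-entropy loss:
$$L_{CE} = -\sum_{c} y_c \log \mathrm{softmax}(s)_c$$
The large CNN model is distilled into a small student model using knowledge distillation, as described in Equation (5).
Let $t$ be the teacher model’s logits for input $x$ and $T$ be the temperature (typically $T > 1$).
Soft targets (with temperature $T$):
$$p^{T} = \mathrm{softmax}(t / T), \qquad q^{T} = \mathrm{softmax}(s / T)$$
Kullback–Leibler (KL) Divergence-based Knowledge Distillation (KD) loss:
$$L_{KD} = \mathrm{KL}\left(p^{T} \,\Vert\, q^{T}\right) = \sum_{c} p^{T}_c \log \frac{p^{T}_c}{q^{T}_c} \tag{5}$$
The proposed hybrid model combines the autoencoder and knowledge distillation with the CNN. The total loss is computed as the weighted sum of the three loss functions $L_{CE}$, $L_{KD}$, and $L_{AE}$, as described in Equation (6):
$$L_{total} = \lambda_{CE} L_{CE} + \lambda_{KD} L_{KD} + \lambda_{AE} L_{AE} \tag{6}$$
where $\lambda_{CE}$, $\lambda_{KD}$, and $\lambda_{AE}$ are hyperparameters balancing the loss terms.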
The knowledge distillation process was implemented using a temperature-scaled softmax function to facilitate effective knowledge transfer from the teacher model to the student model. The temperature parameter (T) controls the smoothness of the output probability distribution; in this study, T was set to 4 based on empirical evaluation, as lower values produced overly sharp distributions similar to standard softmax, while higher values degraded classification performance due to excessive smoothing. The total loss function combines cross-entropy loss (LCE), knowledge distillation loss (LKD), and autoencoder reconstruction loss (LAE), weighted by coefficients λCE, λKD, and λAE, respectively. These coefficients were determined through grid search on the validation dataset, with final values of λCE = 1.0, λKD = 0.5, and λAE = 0.2, which provided the best balance between classification accuracy, knowledge transfer, and noise reduction. Specifically, LCE was prioritized to ensure correct classification, LKD guided the student model to learn from the teacher’s soft targets, and LAE acted as a regularization term to improve robustness against noise in thermal images.
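The combined loss described above can be illustrated with a small, dependency-free sketch. It is not the study's training code: the function names, the use of plain Python lists for logits and flattened images, and the plain KL form without any additional scaling factor are assumptions made for illustration. The defaults (T = 4, λCE = 1.0, λKD = 0.5, λAE = 0.2) are the values reported in the text.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; larger temperatures give smoother outputs."""
    scaled = [v / temperature for v in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(student_logits, label):
    """Cross-entropy of the student's prediction against a hard integer label."""
    return -math.log(softmax(student_logits)[label])


def kd_loss(teacher_logits, student_logits, temperature):
    """KL divergence from the softened teacher to the softened student distribution."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


def reconstruction_loss(x, x_hat):
    """Sum of squared differences between a flattened input and its reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat))


def total_loss(student_logits, teacher_logits, label, x, x_hat,
               temperature=4.0, lam_ce=1.0, lam_kd=0.5, lam_ae=0.2):
    """Weighted sum of the cross-entropy, distillation, and reconstruction terms."""
    return (lam_ce * cross_entropy(student_logits, label)
            + lam_kd * kd_loss(teacher_logits, student_logits, temperature)
            + lam_ae * reconstruction_loss(x, x_hat))


# Illustrative values only: two classes (index 0 = leak, 1 = non-leak).
loss = total_loss(student_logits=[1.5, 0.2], teacher_logits=[3.0, -1.0],
                  label=0, x=[0.2, 0.8, 0.5], x_hat=[0.25, 0.7, 0.5])
```

When the student reproduces the teacher's logits exactly and the reconstruction is perfect, the distillation and reconstruction terms vanish and only the cross-entropy term remains, which is a convenient sanity check for an implementation.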
4. Results and Discussion
4.1. Baseline Models
This section presents the experimental results of the developed Convolutional Neural Network (CNN) and several pre-trained architectures evaluated for pipeline leak detection. The study involved developing a CNN model from scratch and utilizing pre-trained models, InceptionV3, MobileNetV2, and ResNet-50, as teacher networks, while a lightweight CNN served as the student model. Both teacher and student models were trained using the Adam optimizer for faster convergence and stable training.
The Traditional CNN model was included as an additional baseline to measure performance improvements achieved through advanced architectures and the Knowledge Distillation (KD) framework.
The proposed model, trained with the knowledge distilled from the Traditional CNN, outperformed all other model combinations in terms of accuracy and computational efficiency, achieving the highest performance with a significantly smaller number of parameters. This superior result demonstrates that a CNN designed and trained from scratch can be highly effective when optimized specifically for the thermal leak detection task. Unlike pre-trained models such as MobileNetV2, InceptionV3, EfficientNet-Lite and ResNet, which were originally developed for general-purpose image classification, the Traditional CNN was tailored to capture the unique spatial and thermal characteristics of the pipeline imagery. This task-specific optimization enabled it to extract more relevant and discriminative features, resulting in improved accuracy and faster convergence.
Moreover, the reduced number of parameters made the proposed model lightweight and suitable for deployment in embedded or IoT environments with limited computational resources. Overall, the Traditional CNN served as the most effective teacher model, allowing the proposed system to achieve a strong balance between accuracy, training speed, and hardware efficiency, as summarized in Table 2.
4.2. Proposed Model
The proposed CNN + KD + AE model effectively combined a Convolutional Neural Network (CNN) for feature extraction, Knowledge Distillation (KD) for efficient knowledge transfer from a teacher to a student network, and an Autoencoder (AE) for feature compression and noise reduction, resulting in a hybrid design that achieved strong accuracy while maintaining computational efficiency. The teacher model, containing 52.7 million trainable parameters and occupying approximately 201 MB of memory, demonstrated excellent performance compared to larger pre-trained CNN architectures. As shown in Figure 5a–b, the training and validation loss curves depict a rapid reduction in loss during the early epochs, followed by stabilization at very low values, indicating that the model quickly learned essential patterns within the dataset and maintained consistent generalization throughout training. The minimal gap between the training and validation losses signifies reduced overfitting and robust convergence.
Table 3 presents the proposed lightweight IoT-based CNN–Autoencoder model trained on a locally collected thermal image dataset, which achieved 98% accuracy, 98% precision, 98% recall, and an F1 score of 98%, using only 1.18 million parameters and occupying 4.1 MB of storage. These results highlight that the proposed approach delivers higher detection accuracy with far fewer parameters and smaller model size, confirming its suitability for real-time, resource-constrained IoT pipeline monitoring applications.
Figure 6 presents the confusion matrix of the proposed model. The confusion matrix demonstrates that the proposed model achieves strong classification performance, with a high number of correctly identified leak (115) and non-leak (181) samples. However, a small number of misclassifications were observed. Specifically, 4 non-leak samples were incorrectly classified as leaks (false positives), which may be attributed to thermal reflections or environmental heat sources mimicking leak patterns. Additionally, 1 leak sample was misclassified as non-leak (false negative), likely due to minimal temperature variation between the leak and the surrounding environment. These results indicate that while the model is highly accurate, its performance can be affected by environmental noise and low-contrast thermal signatures.
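The confusion-matrix counts above can be turned into the usual metrics with a few lines of code. The sketch below assumes that "leak" is treated as the positive class (an assumption, since the text does not state the averaging scheme behind Table 3), and the function name `classification_metrics` is hypothetical.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute accuracy, precision, recall, and F1 from binary confusion counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Counts reported for Figure 6, with leak as the positive class (assumed):
# 115 leaks detected, 181 non-leaks correctly rejected,
# 4 false alarms, and 1 missed leak.
metrics = classification_metrics(tp=115, tn=181, fp=4, fn=1)
```

With these counts the accuracy is 296/301 (about 0.983), the leak-class precision is 115/119 (about 0.966), the recall is 115/116 (about 0.991), and the F1 score is 230/235 (about 0.979). Per-class values such as these can differ slightly from class-averaged figures.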
While this work highlights the potential for deployment on edge devices such as the Raspberry Pi, we note that the model was not empirically benchmarked on a physical device. The claim of suitability is instead based on the model’s lightweight architecture, characterized by a reduced parameter count (1.18 M), compact memory footprint (4.1 MB), and lower computational complexity compared to conventional deep models. These properties are generally aligned with the constraints of resource-limited hardware and suggest feasibility for edge inference. However, we do not report device-specific metrics such as inference latency, memory utilization, or power consumption. Comprehensive on-device evaluation remains an important direction for future work and will be necessary to fully validate real-world deployment performance in low-resource settings.
4.3. Comparison with Existing Work
To assess the performance and efficiency of the proposed model, a comparative evaluation was conducted against existing state-of-the-art approaches. Ref. [18] proposed a convolutional neural network (CNN)-based leak detection framework enhanced with transfer learning for multi-sensor pipeline monitoring. Although their model achieved high detection accuracy, it required a substantial number of parameters and computational resources, making it less suitable for real-time or edge-based applications. Similarly, refs. [19,20] developed a hybrid deep learning architecture that integrates Long Short-Term Memory (LSTM) and Convolutional layers to improve temporal–spatial feature representation for pipeline leak detection. While their approach demonstrated strong accuracy and robustness, it suffered from increased inference time and energy consumption due to its complex network structure.
The comparison focused on key performance metrics, including accuracy, precision, recall, F1-score, model size, and the number of parameters.
Table 4 presents the results obtained from both existing models and the proposed model using the same locally collected dataset.
The proposed lightweight CNN architecture, optimized through Knowledge Distillation and Autoencoder-based feature compression, outperformed the models of [16,17], even though both have more parameters than the proposed model, while maintaining a significantly smaller model size. These results demonstrate the superior detection capability, computational efficiency, and deployment suitability of the proposed model for real-time IoT-based pipeline monitoring applications.
5. Conclusions and Future Work
This study successfully developed a high-performing, computationally efficient CNN-based model for IoT pipeline leak detection by integrating Knowledge Distillation (KD) and Autoencoder (AE) techniques. The KD framework enabled the student model to mimic the teacher’s predictive behavior while drastically reducing the number of parameters and improving computational efficiency, and the AE compressed image features to minimize redundancy and improve feature representation. Experimental results demonstrated that the proposed CNN + KD + AE model achieved a validation accuracy of 95.02%, with strong precision and recall, matching or exceeding the performance of pre-trained models such as MobileNetV2, InceptionV3, and ResNet, while requiring far fewer computational resources. This makes it suitable for potential deployment on embedded and low-power IoT devices for real-time monitoring, enabling early leak detection, minimizing environmental hazards, and reducing economic losses.
For future work, efforts should focus on real-time implementation and evaluation of the model on edge devices such as Raspberry Pi, as well as expanding the dataset to ensure class balance and diversity, including varied pipeline materials, environmental conditions, and leak severities. This will further enhance model generalization, robustness, and practical applicability.