1. Introduction
With the rapid transition towards carbon neutrality and sustainable built environments, optimizing energy consumption in large-scale facilities has become a paramount global objective to address climate change and achieve net-zero emission targets [1,2]. Among various facility utilities, Compressed Air Systems (CASs) are widely recognized as the “fourth utility”, functioning as a critical and ubiquitous power source for large-scale building operations, HVAC control infrastructures, and automated facility systems [3]. However, despite their indispensable role, CASs are notoriously energy-intensive and highly inefficient. Statistical analyses indicate that compressed air generation accounts for approximately 10% to 30% of the total electricity consumption in large-scale built environments and complex commercial facilities, yet only a small fraction of this electrical energy is converted into useful pneumatic work due to severe thermodynamic and mechanical losses [4,5,6]. In many countries, compressed air systems in industrial buildings account for 20–30% of total facility electricity costs, making them a critical target for building-level decarbonization strategies [7,8]. Improving CAS efficiency therefore directly contributes to reducing Scope 2 CO₂ emissions from building operations, aligning with international frameworks such as the EU Energy Efficiency Directive and national net-zero building mandates [9]. In modern facility-scale building systems, this systemic energy inefficiency predominantly stems from two distinct yet fundamentally interconnected domains: micro-level physical losses (e.g., hidden pipeline leaks) and macro-level operational redundancies (e.g., constant maximum-load compressor operation regardless of actual real-time demand) [10,11].
A significant portion of compressed air, often estimated between 20% and 30% of total production, is continuously wasted due to hidden leaks distributed across complex pipeline networks, aging valves, and deteriorated joints [12]. These leaks encompass various forms of pneumatic faults, including pipe connection failures, fitting degradation, joint seal deterioration, and thread loosening at coupling points, all of which contribute to continuous volumetric energy losses in building pneumatic infrastructure.
Traditionally, detecting these leaks has relied heavily on manual inspections using single-point acoustic wand sensors or rudimentary soap-water tests. These conventional methods are not only labor-intensive but also highly susceptible to severe background noise, making them virtually ineffective in fully operational industrial environments [13].
Beyond ultrasonic detection, several alternative fault monitoring approaches have been investigated for compressed air systems. Pressure-based monitoring methods analyze pressure decay curves to detect leaks and quantify their severity, offering simplicity but limited spatial resolution for pinpointing specific fault locations [14,15]. Cycle-time analysis monitors deviations in actuator or machine cycle times to identify system performance degradation caused by pressure losses [16]. Vibration-based methods have also been explored for detecting mechanical faults in compressor components, though they are less effective for distributed pipeline leaks [17]. Flow rate monitoring using differential flow metering between supply and demand points provides quantitative leak estimates but requires dense instrumentation across the network [18]. More recently, integrated Fault Detection and Diagnosis (FDD) systems combining multiple sensor modalities have emerged as a promising direction in pneumatic system health management, aiming to not only detect faults but also classify their root causes [19,20]. While these methods offer respective advantages in terms of simplicity, cost, or diagnostic depth, they generally lack the capability for non-contact, spatially resolved leak detection that simultaneously provides both location and geometric quantification of the defect. In the present study, acoustic-vision technology was selected because it uniquely enables non-contact, real-time spatial localization and area quantification of leaks without requiring plant shutdown or dense sensor deployment, which is particularly advantageous in complex industrial building environments with extensive pipeline networks [21,22].
Recently, vision-based ultrasonic cameras equipped with microphone arrays have been introduced to localize high-frequency acoustic signals by overlaying visual sound maps onto optical images [21,22]. However, most existing acoustic-vision applications rely on basic static thresholding techniques. While they can successfully indicate the mere presence of a leak, they fail to accurately quantify the physical area of the defect or estimate the consequential volumetric energy loss, particularly under dynamic industrial lighting conditions and severe metallic reflections [23,24]. Without a robust method to extract the physical parameters of the leak, plant operators cannot reliably prioritize maintenance or calculate the direct economic impact of repair interventions.
Parallel to the physical losses in the distribution network, the operational strategy of the central air compressors represents a massive source of energy waste on the supply side. In many manufacturing facilities, even when the utilization rate of the production lines fluctuates or drops significantly (e.g., operating at 50% capacity during off-peak hours or partial shifts), the central compressors frequently run continuously at a fixed high-pressure setpoint (e.g., 8.0 bar) [25,26]. This conservative operational approach is typically adopted to prevent any potential pressure drops at the end-point equipment, which could trigger costly process failures or production halts. To mitigate this idle energy waste, various data-driven and machine learning (ML) architectures, such as Artificial Neural Networks (ANNs), Support Vector Machines (SVMs), and ensemble models, have been explored to predict compressor power demand and optimize operation schedules [27,28]. By predicting the pneumatic load based on historical IoT data, these models aim to implement variable speed drive (VSD) controls or cascade scheduling to match the supply dynamically with the actual factory demand [29].
Despite these individual technological advancements in both leak detection and compressor load forecasting, several critical gaps remain in the literature. First, many studies approach physical leak detection and compressor energy prediction as mutually exclusive, isolated problems, limiting their generalizability to holistic real-world operations. For instance, Guenther and Kroll [14] focused exclusively on automated ultrasonic leak detection without considering the consequential impact on compressor energy optimization, while Wang et al. [24] developed ML-based compressor power prediction models that implicitly assumed a leak-free distribution network. Similarly, Santolamazza et al. [30] proposed anomaly detection for compressor energy consumption but did not incorporate pipeline-level physical fault information into their diagnostic framework. Second, implementing macro-level ML predictive control to aggressively reduce the compressor output pressure without first identifying and addressing micro-level physical leaks inevitably leads to catastrophic process failures, as the continuous volume loss from leaks causes the end-point pressure to plunge below the minimum safe operating threshold (e.g., 6.0 bar) [31]. Third, modern fault monitoring systems increasingly aim to achieve comprehensive Fault Detection and Diagnosis (FDD) capabilities, even for pneumatic systems [19,20]. However, most existing FDD frameworks for compressed air systems focus on compressor-side mechanical faults (e.g., valve degradation, bearing wear) rather than distribution-level pipeline integrity. The present study partially addresses this gap by providing automated detection and quantification of pipeline leaks, though full diagnostic classification of diverse fault types (e.g., distinguishing between joint loosening and seal degradation) remains a direction for future work. Furthermore, while advanced ML methods are promising, their ability to bridge the gap between theoretical energy savings and building-specific, leak-free operational conditions has not been sufficiently validated without risking data leakage from internal flow sensors. These limitations underscore the urgent need for integrated frameworks that combine the structural integrity of the pipeline networks with the realistic, schedule-driven dynamics of the facility.
To address these gaps, the present study introduces a unified, dual-action AI framework. A multimodal ultrasonic camera was utilized to detect and physically quantify pipeline leaks under complex industrial conditions, thereby identifying and eliminating micro-level baseline energy losses. To evaluate macro-level operational robustness, independent testing data were generated from a fully operational automotive parts manufacturing facility (Company M in Gyeonggi-do, South Korea) comprising 13 distinct production lines. Unlike previous studies that approach leak detection and compressor scheduling as isolated problems, the present work establishes an integrated optimization sequence. The framework first quantifies the physical baseline losses using probabilistic acoustic-vision fusion, followed by the application of an eXtreme Gradient Boosting (XGBoost) model to predict dynamic power requirements. Crucially, this prediction is based strictly on external production schedules rather than internal flow monitoring, explicitly preventing data leakage (i.e., the unintentional inclusion of information from the test set or target-correlated variables during model training) and ensuring true predictive control. This explicit two-stage configuration enables a controlled and transparent assessment of system-wide energy efficiency without risking end-point pressure drops.
The key contributions of this study are threefold:
(i) Development of a unified, dual-action optimization framework that seamlessly integrates micro-level physical leak quantification with macro-level operational power prediction;
(ii) Systematic detection, geometric verification, and physical area estimation of representative compressed air pipeline leaks using a robust probabilistic multimodal fusion algorithm;
(iii) Application and optimization of an XGBoost-based predictive control strategy utilizing strictly external production schedules, successfully preventing data leakage and demonstrating over 20% system-wide energy reduction while ensuring end-point pressure stability.
The remainder of this paper is structured as follows. Section 2 details the materials and methods, describing the industrial testbed, the acoustic-vision leak quantification algorithms, and the configuration of the schedule-aware machine learning model. Section 3 presents the experimental results, highlighting the performance outcomes of both the micro-level leak detection and the macro-level XGBoost power prediction. Section 4 provides a comprehensive discussion of these findings, analyzing the integrated synergy of the dual-action framework, system-level ablation studies, and the overall economic and environmental impacts. Finally, Section 5 concludes the paper by summarizing key findings and suggesting directions for future research.
2. Materials and Methods
This section describes the empirical foundation, data acquisition protocols, and analytical methodologies employed to develop and validate the proposed dual-action AI framework. Section 2.1 introduces the industrial testbed and system overview. Section 2.2 details the data acquisition procedures. Section 2.3 presents the micro-level acoustic-vision algorithms (Phase I). Section 2.4 describes the macro-level schedule-aware predictive models (Phase II). Section 2.5 outlines the experimental design and framework integration strategy. Section 2.6 establishes the evaluation metrics.
2.1. Study Site and System Overview
To validate the proposed framework within a realistic and highly complex manufacturing environment, the study was conducted at an electronics and automotive parts manufacturing facility located in Gyeonggi-do, South Korea. Unlike controlled laboratory settings where variables are artificially isolated, this facility operates under stringent industrial conditions characterized by continuous production cycles, dynamic pneumatic load fluctuations, and severe background noise. It should be noted that the validation was conducted at a single manufacturing site, which provides the advantage of real-world industrial testing but inherently limits the immediate generalizability of the findings. The applicability of the proposed framework to other industrial sectors (e.g., food processing, O-ring manufacturing, pharmaceutical production) with different pneumatic demand profiles, pipeline configurations, and environmental conditions requires additional validation and is discussed as a limitation in Section 4.5.
The factory operates on a mixed shift schedule consisting of weekday full production (07:00–22:00), reduced night shifts (22:00–07:00), partial Saturday operations, and minimal Sunday/holiday activity. The compressed air system supplies air through a centralized, densely woven piping network to 13 distinct production lines. Specifically, these include 4 Surface-Mount Technology (SMT) lines, 3 assembly lines, 2 testing lines, 2 packaging lines, a warehouse automation system, and a clean room.
The pneumatic infrastructure centers on an Atlas Copco GA75PM_17W Variable Speed Drive (VSD) air compressor rated at 75 kW, supplying air at a nominal operating pressure of 7.5–8.0 bar. Table 1 summarizes the key technical specifications of the compressed air system. Figure 1 illustrates the overall schematic layout and operational flow of the CAS, detailing the integration of key components such as the air compressor, desiccant dryers, receiver tanks, and the extensive pipeline network extending to the 13 production lines. This diagram serves as the structural foundation for the proposed macro-level predictive optimization, demonstrating how external schedule variables are mapped to the central compressor.
Furthermore, moving from the macro-level schematic to the micro-level physical reality, the actual industrial environment presents significant observational challenges. As shown in Figure 2, the facility is characterized by a dense and complex arrangement of pipelines, varying lighting conditions, and severe metallic reflections. These challenging visual conditions were explicitly considered to ensure the robustness of the acoustic-vision leak detection algorithm proposed in this study. While alternative fault monitoring methods such as pressure-based monitoring and cycle-time analysis are less affected by these visual challenges, they lack the spatial resolution and geometric quantification capability required for the integrated framework proposed herein. The rationale for selecting acoustic-vision technology over these alternatives is detailed in Section 2.3.
2.2. Data Acquisition
To robustly evaluate the compressed air system and validate the proposed framework, comprehensive multi-domain data were continuously collected from the industrial testbed. Table 2 summarizes the accuracy specifications of the key measurement equipment deployed during this acquisition phase.
2.2.1. IoT Sensor Data
Operational data characterizing the central compressor’s performance were continuously acquired via the Atlas Copco SMARTLINK IoT gateway. The data collection spanned a six-month period from April to September 2024, yielding a robust and comprehensive dataset comprising 25,920 discrete observations sampled at 10 min intervals. The 10 min sampling interval was selected based on the following considerations: (i) the SMARTLINK platform’s default data transmission cycle, which aggregates sensor readings at this frequency; (ii) the temporal granularity of the production shift schedules, which operate on 15 min to 1 h transition cycles, making sub-minute resolution unnecessary for capturing schedule-driven demand variations; and (iii) storage and computational efficiency for long-term (6-month) continuous monitoring. While shorter sampling intervals (e.g., 1 or 5 min) could potentially capture faster transient dynamics, the 10 min resolution was deemed sufficient for the schedule-aware prediction task, and the investigation of higher-frequency sampling is suggested as a direction for future work. The extracted internal sensor variables included post-calculated machine power (kW), post-calculated machine flow (l/s), Drive 1 motor current (A), ambient air temperature (°C), element outlet temperature (°C), delivery pressure (bar), running hours, and accumulated volume (
). In parallel with the compressor’s mechanical data, the exact operational schedules of the 13 individual production lines were recorded as binary states (active/inactive). These binary state signals were captured directly from the facility’s Programmable Logic Controllers (PLCs) via Modbus TCP communication, where each production line’s main power relay status was monitored. In more complex manufacturing setups, such schedule data could alternatively be obtained through Manufacturing Execution System (MES) integration, SCADA system interfaces, or OPC-UA communication protocols. The feasibility of schedule data acquisition may vary depending on the level of factory digitalization, which is discussed as a practical consideration in Section 4.5. This highly granular schedule dataset encapsulates the facility’s real-time pneumatic demand profile. Crucially, these external schedule records constitute the primary independent feature matrix for the predictive machine learning models. By relying strictly on these external schedules rather than internal flow sensors, the framework rigorously prevents methodological data leakage (i.e., the unintentional incorporation of target-correlated variables into the training feature space, which would artificially inflate predictive accuracy without providing actionable preemptive capability), thereby ensuring true preemptive predictive capability as further detailed in Section 2.4.
2.2.2. Acoustic Leak Detection Data
To quantify the micro-level physical volume losses within the pneumatic network, comprehensive compressed air leak surveys were conducted utilizing a FLUKE ii910 Precision Acoustic Imager. This industrial-grade diagnostic equipment operates within a broad ultrasonic frequency range of 2 to 100 kHz, utilizing a dense 64-element Micro-Electromechanical System (MEMS) microphone array. The FLUKE ii910 was selected for this study based on the following criteria: (i) it is one of the most widely adopted precision acoustic imagers in industrial maintenance, providing both acoustic frequency analysis and spatial visualization capabilities; (ii) its beamforming microphone array enables non-contact, real-time spatial localization of leak sources at working distances of 1–50 m; and (iii) its built-in sound intensity overlay on optical images provides the foundational multi-modal data required for the proposed MPSF algorithm. While the core algorithmic logic of the proposed framework (multi-stage fusion detection) operates on generic acoustic-optical overlay data and is, therefore, in principle, adaptable to other acoustic imagers with similar specifications (e.g., SDT LUBExpert (SDT International, Forest, Belgium), SONOTEC SONAPHONE (SONOTEC GmbH, Halle, Germany)), the current validation was performed exclusively with the FLUKE ii910, and cross-device compatibility remains to be verified in future studies.
The acoustic camera generates real-time sound intensity overlays onto standard visible-light images, enabling the precise spatial localization of high-frequency leak emissions. A significant operational advantage of this multi-domain data acquisition is that it allows for the continuous monitoring of structural integrity without requiring plant shutdowns or disrupting the ongoing manufacturing processes. During the field survey, multiple primary leak sites were identified and fully documented across the distribution network. The acoustic–visual data acquired from these specific sites served as the foundational input for the micro-level leak quantification and geometric validation algorithms proposed in Section 2.3.
2.3. Phase I: Micro-Level Acoustic-Vision Leak Quantification
The first phase of the proposed framework focuses on the identification, isolation, and physical quantification of baseline volume losses from the pneumatic pipeline network. To accurately extract valid leak signatures from complex acoustic-optical imagery, a deterministic computer vision algorithm was developed. It should be noted that this phase specifically targets continuous leakage faults (pipe connection leaks, fitting degradation, and joint failures) that emit persistent acoustic signatures. Other types of pneumatic faults, such as pressure drops from clogged filters, actuator seal degradation, or valve malfunctions, are not addressed by the current detection pipeline, as they produce different physical signatures that require specialized monitoring approaches (e.g., differential pressure sensing, cycle-time analysis) [19,20].
2.3.1. Multi-Stage Fusion Detection Algorithm
The proposed leak detection pipeline employs a robust multi-stage fusion approach to identify and localize compressed air leaks. The algorithm is explicitly designed to systematically eliminate specific classes of false-positive confounders commonly encountered in industrial environments. The pipeline consists of five sequential stages:
Stage 1—Color Saliency Extraction: The raw acoustic overlay image is decomposed into the Hue, Saturation, and Value (HSV) color space. Regions exhibiting high color saliency relative to the background pipe surface are segmented using a predefined empirical threshold (). While this stage eliminates low-energy background acoustic noise, it inherently retains severe metallic specular reflections.
Stage 2—Texture Saliency Filtering: To differentiate true acoustic emissions from visual artifacts, a Gabor filter bank is applied to extract local spatial texture features. Genuine acoustic leak signatures exhibit characteristic high-frequency turbulent texture patterns, quantified by a texture saliency score. Enforcing a predefined threshold () explicitly isolates and eliminates specular metallic reflections that present high color saliency but lack turbulent acoustic texture.
Stage 3—Candidate Fusion: Regions passing both the color and texture thresholds are fused into unified leak candidates utilizing connected component analysis. A dynamic spatial area constraint () is applied to reject microscopic noise fragments and oversized, non-physical artifacts.
Stage 4—Geometric and Spatial Verification: To rigorously suppress remaining false positives (e.g., elongated pipe joints), each fused instance undergoes geometric evaluation. The primary constraint is Circularity (), defined as . This geometric metric exploits the physical property that true acoustic leak plumes uniformly diffuse outward from a pressurized point source. To ensure morphological consistency, an empirical threshold () is strictly enforced. Furthermore, Centrality () is applied as an optional post-filtering heuristic for operator-guided scanning scenarios. By measuring the distance from the leak centroid to the image’s optical center, this secondary metric leverages the hardware’s beamforming characteristic to provide operational feedback, ensuring the acoustic source is properly centered during field inspections.
Stage 5—Confidence Ranking and Output: The surviving, rigorously verified candidates are ranked utilizing a composite confidence score integrating color, texture, and circularity. Finally, the validated leak locations are annotated onto the original image with bounding boxes.
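The verification and ranking logic of Stages 3 to 5 can be sketched as follows. This is a minimal illustration, not the authors' implementation: the `Candidate` fields, all threshold values, and the equal-weight confidence score are hypothetical, and circularity is taken as the standard isoperimetric ratio 4πA/P², which is an assumption about the definition used in Stage 4.

```python
import math
from dataclasses import dataclass


@dataclass
class Candidate:
    area_px: float        # region area in pixels
    perimeter_px: float   # region perimeter in pixels
    color_score: float    # Stage 1 saliency, assumed scaled to [0, 1]
    texture_score: float  # Stage 2 saliency, assumed scaled to [0, 1]


def circularity(area_px, perimeter_px):
    """Isoperimetric ratio 4*pi*A / P^2 (1.0 for a perfect circle)."""
    if perimeter_px <= 0:
        return 0.0
    return 4.0 * math.pi * area_px / perimeter_px ** 2


def verify_and_rank(candidates, color_thr, texture_thr, area_min, area_max, circ_thr):
    """Apply the colour/texture, area and circularity gates, then rank survivors."""
    kept = []
    for c in candidates:
        if c.color_score < color_thr or c.texture_score < texture_thr:
            continue  # fails Stage 1 or Stage 2
        if not (area_min <= c.area_px <= area_max):
            continue  # Stage 3 area constraint
        circ = circularity(c.area_px, c.perimeter_px)
        if circ < circ_thr:
            continue  # Stage 4 geometric verification
        # Stage 5: equal-weight composite score (hypothetical weighting).
        confidence = (c.color_score + c.texture_score + min(circ, 1.0)) / 3.0
        kept.append((confidence, c))
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return kept


# Illustrative candidates (made-up numbers, not survey data).
plume = Candidate(math.pi * 20 ** 2, 2 * math.pi * 20, 0.9, 0.8)  # near-circular blob
joint = Candidate(2000.0, 420.0, 0.9, 0.7)                        # elongated 200 x 10 region
glare = Candidate(900.0, 110.0, 0.95, 0.1)                        # bright but texture-free
ranked = verify_and_rank([plume, joint, glare], color_thr=0.5, texture_thr=0.5,
                         area_min=100, area_max=5000, circ_thr=0.7)
```

In this toy case the reflection is rejected by the texture gate and the elongated joint by the circularity gate, so only the near-circular plume survives.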
2.3.2. Physical Area Quantification and Energy Conversion
To evaluate the economic severity of a validated leak and prioritize maintenance interventions, it is imperative to translate the two-dimensional pixel area into a tangible physical dimension. The proposed system maps the pixel coordinates to physical space (, in ) using a homography mapping algorithm. This transformation utilizes the camera’s intrinsic parameters, specifically the Field of View (FOV), coupled with the operator-specified working distance configured during the field survey.
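As a rough illustration of the pixel-to-physical conversion, the sketch below uses a simplified pinhole relation rather than a full homography: it assumes the leak plane is perpendicular to the optical axis, pixels are square, and the horizontal FOV is known. The function name and all argument values are hypothetical.

```python
import math


def pixel_area_to_m2(area_px, image_width_px, hfov_deg, distance_m):
    """Convert a pixel area to square metres for a fronto-parallel target plane."""
    # Width of the scene covered by the image at the working distance.
    scene_width_m = 2.0 * distance_m * math.tan(math.radians(hfov_deg) / 2.0)
    metres_per_pixel = scene_width_m / image_width_px
    return area_px * metres_per_pixel ** 2


# Hypothetical example: 90 degree FOV, 1 m working distance, 1000 px wide image.
area_m2 = pixel_area_to_m2(area_px=250, image_width_px=1000, hfov_deg=90.0, distance_m=1.0)
```

Because the scale factor enters squared, an error in the operator-specified working distance propagates quadratically into the estimated area.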
Once the actual physical area $A$ is reliably quantified, the volumetric flow rate loss, $Q_{\text{leak}}$, is estimated using a simplified empirical orifice flow model assuming quasi-steady discharge. The flow rate is calculated as:
$$Q_{\text{leak}} = C_d \, A \sqrt{\frac{2\,\Delta P}{\rho}}$$
where $C_d$ represents the empirical discharge coefficient, which accounts for the frictional losses and non-ideal, irregular geometry of actual pneumatic pipeline cracks, $\Delta P$ denotes the system gauge pressure drop across the orifice, and $\rho$ is the ambient air density.
Finally, to bridge this micro-level physical defect with a macro-level economic metric, the consequential active electrical power waste, $P_{\text{waste}}$ (kW), is derived. This is calculated by multiplying the volumetric flow loss $Q_{\text{leak}}$ by the specific energy requirement (SER) of the compressor:
$$P_{\text{waste}} = Q_{\text{leak}} \times \mathrm{SER}$$
By utilizing the facility-specific SER obtained from the baseline IoT monitoring data, the framework maps the visual bounding boxes directly to absolute energy waste metrics, providing a transparent, physics-informed baseline for the subsequent macro-level predictive optimization.
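The two-step conversion from leak area to wasted power can be sketched as below. It assumes the standard incompressible orifice relation in SI units; the default discharge coefficient and air density, as well as the SER value in the example, are placeholder assumptions rather than values from the study. The incompressible form also ignores compressibility effects.

```python
import math


def leak_flow_m3_per_s(area_m2, delta_p_pa, c_d=0.6, rho=1.2):
    """Orifice estimate Q = C_d * A * sqrt(2 * dP / rho), all quantities in SI units."""
    return c_d * area_m2 * math.sqrt(2.0 * delta_p_pa / rho)


def wasted_power_kw(flow_m3_per_s, ser_kw_per_m3_per_s):
    """Electrical power attributed to the leak: flow times specific energy requirement."""
    return flow_m3_per_s * ser_kw_per_m3_per_s


# Hypothetical inputs: 1 mm^2 opening, 7 bar gauge pressure, assumed SER.
q = leak_flow_m3_per_s(area_m2=1e-6, delta_p_pa=7e5)
p = wasted_power_kw(q, ser_kw_per_m3_per_s=400.0)
```

Since flow scales linearly with area but only with the square root of pressure, the area estimate from the acoustic-vision stage dominates the uncertainty of the resulting power figure.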
2.4. Phase II: Macro-Level Schedule-Aware Power Prediction
Following the physical mitigation of baseline volume losses via acoustic-vision quantification, the second phase of the proposed framework focuses on optimizing the macro-level operational strategy. The objective is to construct a predictive control model that dynamically forecasts the required central compressor power using solely external operational boundaries, without relying on any internal mechanical sensor readings.
2.4.1. Problem Formulation and Feature Design
The predictive modeling utilized a comprehensive multi-domain dataset synchronized with the discrete operational schedules of the manufacturing facility. To align the varying sampling rates, the dataset was temporally resampled to a consistent 10 min interval.
A critical methodological design choice in this formulation is the strict prevention of ‘data leakage’. Internal variables such as compressed air flow rate (l/s), motor current (A), element outlet temperature (°C), outlet pressure (bar), and VSD speed (%) are physical consequences of the power consumption itself. Incorporating these outputs of the internal control loop into the feature space would artificially inflate predictive accuracy to levels that, while technically correct, provide no actionable insight. Furthermore, the practical value of this framework lies in its ability to predict energy demand before the compressor responds, thereby enabling preemptive pressure setpoint adjustments. Consequently, these internal metrics were explicitly excluded.
Instead, the input feature vector $\mathbf{x}$ was strictly constrained to three external feature groups, as summarized in Table 3: production line schedules, temporal identifiers, and environmental conditions.
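The exclusion rule described above can be illustrated with a small filter that drops internal sensor columns before a record reaches the model. All column names here are hypothetical, since the actual dataset schema is not given in the text.

```python
# Hypothetical column names for the internal (leakage-prone) sensor variables.
INTERNAL_COLUMNS = {
    "flow_l_s", "motor_current_a", "outlet_temp_c", "pressure_bar", "vsd_speed_pct",
}
TARGET = "power_kw"


def build_features(record):
    """Split one record (a dict) into external features and the prediction target."""
    features = {
        name: value
        for name, value in record.items()
        if name not in INTERNAL_COLUMNS and name != TARGET
    }
    return features, record[TARGET]


# Made-up record for illustration only.
record = {
    "line_01": 1, "line_02": 0, "hour": 14, "day_of_week": 2,
    "temperature": 27.5, "humidity": 60.0,
    "flow_l_s": 180.0, "motor_current_a": 95.0, "power_kw": 62.0,
}
features, target = build_features(record)
```

Keeping the excluded names in one explicit set makes the leakage boundary easy to audit when new sensor channels are added.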
2.4.2. Predictive Machine Learning Architectures
To map the complex, high-dimensional relationships between the intermittent multi-line factory schedules and the resultant pneumatic load, three distinct machine learning architectures were evaluated: Multiple Linear Regression (MLR) as a parametric baseline, Random Forest (RF) as a bagging-based ensemble, and XGBoost as the primary proposed predictive architecture. MLR was included to establish whether simple linear combinations of schedule and environmental features could sufficiently explain the target variance. RF was selected for its robustness against overfitting and its ability to capture non-linear interactions. XGBoost was chosen as the primary architecture due to its demonstrated superiority in tabular data prediction tasks, its inherent feature importance quantification capability, and its regularized boosting mechanism that systematically minimizes both bias and variance [32]. All models were evaluated using an 80/20 chronological train-test split (20,660 training samples, 5166 test samples) with absolutely no temporal overlap.
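A chronological split of this kind can be sketched in a few lines, assuming the rows are already sorted by timestamp. The function name is illustrative; the row count used below is simply the sum of the two reported sample sizes.

```python
def chronological_split(rows, train_fraction=0.8):
    """Split time-ordered rows so every training row precedes every test row."""
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]


# 20,660 + 5166 = 25,826 rows; integers stand in for time-ordered observations.
train, test = chronological_split(list(range(25826)))
print(len(train), len(test))  # 20660 5166
```

Unlike a shuffled split, this keeps all test observations strictly later than the training period, which is what rules out temporal overlap.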
Multiple Linear Regression (MLR)
As a parametric baseline, an MLR model was constructed to evaluate whether the target variance could be sufficiently explained through simple linear combinations of the schedule and environmental features. The MLR model is mathematically expressed as:
$$\hat{y}_i = \beta_0 + \sum_{j=1}^{p} \beta_j x_{ij} + \varepsilon_i$$
where $\hat{y}_i$ is the predicted compressor power for the $i$-th observation, $p$ is the total number of external features, $x_{ij}$ represents the value of the $j$-th feature, $\beta_0$ is the intercept, $\beta_j$ are the learned regression coefficients, and $\varepsilon_i$ represents the residual error. While computationally efficient and highly interpretable, linear architectures inherently struggle to map the complex interdependencies and synergistic overlaps of multiple concurrent production lines.
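A minimal sketch of fitting such a baseline by ordinary least squares with NumPy is shown below. The synthetic schedule matrix and the coefficients used to generate the target are made up for illustration and do not reflect the facility data.

```python
import numpy as np


def fit_mlr(X, y):
    """Return (intercept, coefficients) minimising the squared error."""
    X1 = np.column_stack([np.ones(len(X)), X])  # prepend an intercept column
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return beta[0], beta[1:]


def predict_mlr(intercept, coefs, X):
    return intercept + X @ coefs


# Synthetic example: three binary line states with a purely additive load.
rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(200, 3)).astype(float)
y = 10.0 + X @ np.array([5.0, 3.0, 1.0])
intercept, coefs = fit_mlr(X, y)
```

On purely additive data like this the baseline recovers the generating coefficients; any load that depends on which lines run together would show up as unexplained residual.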
Random Forest (RF) Regressor
To address the anticipated non-linearities and complex interaction terms, a Random Forest architecture was deployed. As a robust bagging-based ensemble method, RF constructs a multitude of independent decision trees during training using bootstrapped subsets of the data. The final regression prediction $\hat{y}$ is obtained by averaging the predictions across all $B$ independent trees:
$$\hat{y} = \frac{1}{B} \sum_{b=1}^{B} T_b(\mathbf{x})$$
where $T_b(\mathbf{x})$ represents the output of the $b$-th individual decision tree for the input feature vector $\mathbf{x}$. By aggregating these diverse outputs, RF effectively reduces overall model variance and demonstrates strong resistance to overfitting, particularly against the stochastic noise inherent in industrial IoT data.
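The bootstrap-and-average mechanism can be illustrated with a deliberately simplified ensemble of one-split regression stumps. This is not a full Random Forest (no deep trees, no random feature subsets); it only shows how bagged predictions are averaged, and the data are made up.

```python
import numpy as np


def fit_stump(X, y):
    """Best single-split regression stump: (feature, threshold, left_mean, right_mean)."""
    best = (0, np.inf, y.mean(), y.mean())  # fallback: constant predictor
    best_sse = ((y - y.mean()) ** 2).sum()
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j])[:-1]:   # skip the max so the right side is non-empty
            left = X[:, j] <= t
            l_mean, r_mean = y[left].mean(), y[~left].mean()
            sse = ((y[left] - l_mean) ** 2).sum() + ((y[~left] - r_mean) ** 2).sum()
            if sse < best_sse:
                best, best_sse = (j, t, l_mean, r_mean), sse
    return best


def predict_stump(stump, X):
    j, t, l_mean, r_mean = stump
    return np.where(X[:, j] <= t, l_mean, r_mean)


def fit_bagged(X, y, n_trees=25, seed=0):
    rng = np.random.default_rng(seed)
    stumps = []
    for _ in range(n_trees):
        idx = rng.integers(0, len(y), size=len(y))  # bootstrap sample with replacement
        stumps.append(fit_stump(X[idx], y[idx]))
    return stumps


def predict_bagged(stumps, X):
    """Average the individual tree outputs, as in the ensemble mean."""
    return np.mean([predict_stump(s, X) for s in stumps], axis=0)


# Toy data: the load depends only on the first binary line state.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]] * 25, dtype=float)
y = 40.0 + 20.0 * X[:, 0]
preds = predict_bagged(fit_bagged(X, y), X)
```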
XGBoost
As the primary proposed predictive architecture, XGBoost, an advanced gradient boosting framework, was implemented to minimize both bias and variance systematically. Unlike RF, which builds trees in parallel, XGBoost builds a sequence of shallow regression trees iteratively, where each subsequent tree is trained to correct the residual errors of the preceding ensemble. The objective function $\mathcal{L}$ minimized during training comprises a differentiable loss function $l$ and a structural regularization term $\Omega$:
$$\mathcal{L} = \sum_{i=1}^{n} l(y_i, \hat{y}_i) + \sum_{k=1}^{K} \Omega(f_k)$$
where $y_i$ is the actual power, $K$ is the number of trees, and $f_k$ represents the structure of the $k$-th tree. The regularization term actively penalizes tree complexity, thereby smoothing the final learned weights to ensure high generalizability to unseen scheduling patterns. Furthermore, the algorithm incorporates a sparsity-aware split-finding mechanism, making it exceptionally robust in handling missing values or intermittent zero-state data points typical in long-term factory schedules.
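The residual-correcting loop and the effect of the regularization term can be sketched with a tiny boosted ensemble of stumps. This is an illustration of the mechanism for squared-error loss, not the XGBoost library: leaf weights use the L2-regularized form sum(residual) / (count + lambda), the per-leaf complexity penalty is omitted, and all hyperparameters and data are made up.

```python
import numpy as np


def leaf_weight(r, lam):
    """Regularized optimal leaf value for squared-error loss."""
    return r.sum() / (len(r) + lam)


def leaf_score(r, lam):
    """Structure score of a leaf; larger means a better fit after regularization."""
    return r.sum() ** 2 / (len(r) + lam)


def fit_stump(X, r, lam):
    best = (0, np.inf, leaf_weight(r, lam), 0.0)  # fallback: a single leaf
    best_score = leaf_score(r, lam)
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j])[:-1]:
            left = X[:, j] <= t
            score = leaf_score(r[left], lam) + leaf_score(r[~left], lam)
            if score > best_score:
                best = (j, t, leaf_weight(r[left], lam), leaf_weight(r[~left], lam))
                best_score = score
    return best


def predict_stump(stump, X):
    j, t, w_left, w_right = stump
    return np.where(X[:, j] <= t, w_left, w_right)


def fit_boosted(X, y, n_rounds=50, learning_rate=0.3, lam=1.0):
    base = float(y.mean())
    pred = np.full(len(y), base)
    stumps = []
    for _ in range(n_rounds):
        stump = fit_stump(X, y - pred, lam)  # fit the current residuals
        pred = pred + learning_rate * predict_stump(stump, X)
        stumps.append(stump)
    return base, learning_rate, stumps


def predict_boosted(model, X):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(predict_stump(s, X) for s in stumps)


# Toy schedule: two binary "line active" flags with an additive load.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]] * 25, dtype=float)
y = 40.0 + 20.0 * X[:, 0] + 10.0 * X[:, 1]
model = fit_boosted(X, y)
mse = float(np.mean((predict_boosted(model, X) - y) ** 2))
```

Raising `lam` shrinks every leaf weight toward zero, which is the smoothing effect attributed to the regularization term above.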
A critical methodological constraint enforced in this study is the strict prevention of ‘data leakage’—a pervasive flaw in applied machine learning where models inadvertently train on features that are direct byproducts of the target variable. In compressed air systems, internal mechanical sensors such as the volumetric flow rate, delivery pressure, and motor current are direct physical consequences of the compressor’s real-time operation. Therefore, to ensure the XGBoost model learns to preemptively forecast energy demand based solely on independent external schedules, all such internal state variables were strictly excluded from the training feature space.
2.5. Experimental Design and Framework Integration
To rigorously validate the structural necessity of each algorithmic component and to demonstrate the synergistic physical relationship between the micro-level detection and macro-level prediction phases, a comprehensive experimental design and integration strategy was established.
2.5.1. Experimental Design for Predictive Modeling
To quantify the specific contributions of the machine learning architectures and the independent feature groups, two sets of controlled experimental evaluations were designed for the predictive model:
Model Architecture Comparison: The three defined architectures (MLR, RF, and XGBoost) are trained on the identical full feature set and compared across all regression metrics. This experimental setup is designed to quantitatively justify the necessity of complex non-linear modeling for schedule-aware power prediction against a parametric baseline.
Feature Group Dependency Analysis: Using XGBoost as the fixed primary architecture, six distinct input matrix configurations are evaluated by systematically excluding specific feature groups: (a) the full proposed model, (b) exclusion of line schedules (the 13 binary line-state variables), (c) exclusion of temporal features (Hour, Day Of Week), (d) exclusion of environmental factors (Temperature, Humidity), (e) aggregated line count (a single integer variable replacing the 13 specific binary states), and (f) schedule-only (excluding both temporal and environmental data). The degradation in predictive performance relative to the full model quantifies the isolated contribution of each feature group.
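The six configurations in the feature group analysis can be laid out as explicit column lists. The sketch below is illustrative only: the column names are hypothetical, and it assumes the full feature set consists of exactly the 13 binary line states plus the two temporal and two environmental variables named in the text.

```python
# Hypothetical column names; the text gives the groups, not the identifiers.
LINES = [f"line_{i:02d}" for i in range(1, 14)]   # 13 binary line states
TEMPORAL = ["hour", "day_of_week"]
ENVIRONMENT = ["temperature", "humidity"]
LINE_COUNT = ["active_line_count"]                # single integer replacing LINES


def build_configurations():
    """Return the six input configurations (a)-(f) as lists of column names."""
    return {
        "a_full": LINES + TEMPORAL + ENVIRONMENT,
        "b_no_schedule": TEMPORAL + ENVIRONMENT,
        "c_no_temporal": LINES + ENVIRONMENT,
        "d_no_environment": LINES + TEMPORAL,
        "e_line_count": LINE_COUNT + TEMPORAL + ENVIRONMENT,
        "f_schedule_only": LINES,
    }


def add_line_count(row):
    """Derive the aggregated line count for configuration (e) from one sample."""
    out = dict(row)
    out["active_line_count"] = sum(row[name] for name in LINES)
    return out


def degradation_vs_full(rmse_by_config, baseline="a_full"):
    """RMSE increase of each ablated configuration relative to the full model."""
    base = rmse_by_config[baseline]
    return {k: v - base for k, v in rmse_by_config.items() if k != baseline}


configs = build_configurations()
for name, cols in configs.items():
    print(name, len(cols))
```

Each configuration would then be used to train the same XGBoost architecture on the same split, so that differences in RMSE can be attributed to the removed feature group rather than to the model.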
2.5.2. Evaluation Strategy for the Detection Pipeline
The structural necessity of the multi-stage MPSF detection pipeline is evaluated by selectively bypassing specific processing stages on a curated benchmark dataset of 200 acoustic–optical image patches (comprising 50 confirmed valid leaks and 150 non-leak confounders, including metallic reflections, complex pipe joints/valves, and ambient background noise). As detailed in Table 4, four algorithmic configurations are compared.
The core design hypothesis of this evaluation is that all sequential processing stages are strictly mandatory to achieve a high detection precision, as each distinct stage mathematically filters a specific class of industrial confounders that the preceding stages cannot successfully eliminate.
2.5.3. Integrated Energy Optimization Strategy
The proposed framework intrinsically links the micro-level leak mitigation (Phase I) with the macro-level predictive control (Phase II) into a unified, synergistic energy optimization strategy. The physical and operational rationale for this integration is fundamentally based on system pressure dynamics:
Control Without Leak Prevention: If the predictive machine learning model aggressively reduces the compressor’s output pressure to match a dynamically predicted low-demand schedule, the existing unaddressed physical leaks will continue to consume a fixed volumetric flow. Consequently, at this reduced supply pressure, the effective air volume reaching the end-point production processes will precipitously fall below the minimum acceptable pneumatic threshold, inevitably causing critical process failures.
Control With Leak Prevention: By sequentially identifying and physically eliminating the leak-induced volume losses first, the identical dynamic pressure reduction orchestrated by the predictive model becomes operationally safe. The compressor can robustly maintain adequate end-point process pressure while operating at a significantly diminished energy baseline.
Ultimately, the total annual energy saving, $E_{\mathrm{total}}$, achieved by this integrated dual-action framework is estimated as the sum of three physical components:
$$E_{\mathrm{total}} = E_{\mathrm{leak}} + E_{\mathrm{pressure}} + E_{\mathrm{schedule}}$$
where $E_{\mathrm{leak}}$ is the direct energy recovered from physical leak mitigation (Section 2.3), $E_{\mathrm{pressure}}$ represents the secondary thermodynamic savings from optimizing the baseline system pressure (empirically yielding approximately a 7% reduction in specific energy per 1 bar decrease), and $E_{\mathrm{schedule}}$ denotes the macro-level savings derived from the preemptive, schedule-aligned compressor control enabled by the XGBoost prediction model (Section 2.4).
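The three-component sum can be illustrated with a short calculation. This is a sketch under stated assumptions: the 7% per bar rule is taken from the text and applied linearly, which is reasonable only for small pressure changes, and every numeric input below is an illustrative placeholder rather than a result from the study. The function and variable names are hypothetical.

```python
def pressure_saving_kwh(baseline_kwh, pressure_drop_bar, rate_per_bar=0.07):
    """Linear approximation: about 7% of baseline energy per 1 bar setpoint drop."""
    if pressure_drop_bar < 0:
        raise ValueError("pressure_drop_bar must be non-negative")
    return baseline_kwh * rate_per_bar * pressure_drop_bar


def total_annual_saving_kwh(leak_kwh, pressure_kwh, schedule_kwh):
    """Sum of leak repair, pressure reduction, and schedule-aligned control savings."""
    return leak_kwh + pressure_kwh + schedule_kwh


# Illustrative placeholders only (not values reported in the study).
baseline_kwh = 400_000.0   # assumed annual compressor energy after leak repair
leak_kwh = 50_000.0        # assumed energy recovered by leak repair
schedule_kwh = 60_000.0    # assumed saving from schedule-aligned control

pressure_kwh = pressure_saving_kwh(baseline_kwh, pressure_drop_bar=0.5)
total_kwh = total_annual_saving_kwh(leak_kwh, pressure_kwh, schedule_kwh)
print(round(pressure_kwh), round(total_kwh))
```

Keeping the three terms separate, rather than reporting only the total, matches the component-level accounting the framework relies on.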
2.6. Evaluation Metrics
To comprehensively assess the performance of the proposed dual-action framework across the designed validation scenarios, the following statistical metrics were established.
2.6.1. Detection Evaluation Metrics (Phase I)
The reliability of the leak quantification algorithm was evaluated using a standard confusion matrix to compute Precision ($P$), Recall ($R$), and the F1-Score ($F_1$):
$$P = \frac{TP}{TP + FP}, \qquad R = \frac{TP}{TP + FN}, \qquad F_1 = \frac{2 \cdot P \cdot R}{P + R}$$
where $TP$ represents true valid leaks correctly identified, $FP$ denotes metallic reflections misclassified as leaks, and $FN$ indicates actual leaks missed by the algorithm.
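As a minimal sketch, the three detection metrics can be computed directly from confusion-matrix counts. The counts in the example are hypothetical and are not results from the benchmark; returning 0.0 for an empty denominator is a convention chosen here for illustration.

```python
def detection_metrics(tp, fp, fn):
    """Return (precision, recall, f1) from true-positive, false-positive, false-negative counts."""
    precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts for illustration only.
precision, recall, f1 = detection_metrics(tp=45, fp=5, fn=5)
print(f"P={precision:.3f} R={recall:.3f} F1={f1:.3f}")
```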
2.6.2. Prediction Evaluation Metrics (Phase II)
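The predictive performance of the schedule-aware machine learning models was quantified using four complementary regression metrics: Mean Absolute Error (MAE), Root Mean Square Error (RMSE), Coefficient of Determination ($R^2$), and Mean Absolute Percentage Error (MAPE):
$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|y_i - \hat{y}_i\right|$$
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}$$
$$R^2 = 1 - \frac{\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^2}$$
$$\mathrm{MAPE} = \frac{100\%}{n}\sum_{i=1}^{n}\left|\frac{y_i - \hat{y}_i}{y_i}\right|$$
where $n$ is the total number of test samples, $y_i$ is the actual measured power consumption, $\hat{y}_i$ is the predicted power, and $\bar{y}$ is the mean of the actual power values. RMSE is designated as the primary comparison metric due to its inherent sensitivity to large prediction errors, which are particularly critical for industrial applications where the under-prediction of demand can cause catastrophic pressure drops and production downtime.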
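The four regression metrics can be sketched in plain Python as follows. The sample values are illustrative only. One assumption is labeled in the code: MAPE is undefined when an actual value is zero, so this sketch skips such samples, which may differ from how zero-power intervals were handled in the study.

```python
import math


def regression_metrics(y_true, y_pred):
    """Compute MAE, RMSE, R^2, and MAPE (in percent) for paired sequences."""
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    # Assumption: samples with a zero actual value are skipped in MAPE.
    ratios = [abs(e / t) for t, e in zip(y_true, errors) if t != 0]
    return {
        "mae": sum(abs(e) for e in errors) / n,
        "rmse": math.sqrt(ss_res / n),
        "r2": 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan"),
        "mape": 100.0 * sum(ratios) / len(ratios) if ratios else float("nan"),
    }


# Illustrative values in kW, not data from the study.
metrics = regression_metrics([10.0, 20.0, 30.0], [12.0, 18.0, 33.0])
print({k: round(v, 4) for k, v in metrics.items()})
```

Because RMSE squares each error before averaging, the single 3 kW miss in this example contributes more than the two 2 kW misses combined would under MAE weighting, which is the sensitivity to large errors that motivates its use as the primary metric.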
4. Discussion
4.1. Interpretation of Key Findings
This study demonstrates that the proposed dual-action optimization framework significantly improves both operational efficiency and physical plausibility in industrial compressed air systems [
4,
7]. Unlike conventional optimizers that rely purely on reactive mathematical heuristics [
24,
28], the integrated framework functions as a domain-aware reasoning agent, interpreting discrete production schedules and suggesting preemptive pressure updates that preserve pneumatic consistency. Across the real-world deployment, the framework maintained stable convergence and reproduced end-point pressures that closely matched design values, confirming its capacity to integrate physical pipeline integrity into the search process. A key outcome is the framework’s ability to adaptively balance physical defect mitigation and schedule-aware load forecasting throughout the control loop. While traditional machine learning algorithms often become trapped in physically implausible states or display unstable convergence due to data leakage, the strictly external feature selection strategy ensured steady improvement even under highly variable industrial shifts. This combination of the Multimodal Probabilistic Saliency Fusion (MPSF) algorithm for defect resolution and reasoning-driven parameter selection yielded results that were not only numerically precise but also practically interpretable, underscoring the advantage of embedding structural understanding into quantitative optimization. A critical finding is that leak prevention serves as a strict physical prerequisite for any predictive depressurization strategy. Without prior remediation of baseline volume losses, dynamic pressure optimization inevitably causes end-point pressure to fall below safe operating thresholds, resulting in process failure. This sequential dependency—detect and repair first, then optimize—represents a fundamental design principle for building-level CAS energy management that has not been explicitly validated in prior literature. It should be emphasized that the current evaluation considered only continuous leakage faults as the source of micro-level physical losses. 
Other pneumatic fault types, including pressure drops from clogged in-line filters, actuator seal degradation, condensate drain malfunctions, and regulator failures, were not included in the scope of this investigation. Accordingly, the generalizability of the present findings to facilities with different fault profiles or industrial sectors remains to be validated, as discussed in Section 4.5.
4.2. Mechanistic Insights and Comparison with Prior Work
Conventional energy management frameworks—including acoustic array leak detection [
13,
23] and surrogate-based machine learning [
26,
27]—have achieved measurable success in addressing underdetermination and reducing computational cost. For example, Guenther and Kroll [
14] demonstrated automated ultrasonic leak detection with high sensitivity, but their system provided no mechanism for translating detected leaks into quantified energy losses or integrating with downstream operational optimization. Santolamazza et al. [
30] proposed anomaly detection in compressor energy profiles, achieving effective detection of abnormal consumption patterns, but did not incorporate physical pipeline integrity information into their diagnostic framework. Wang et al. [
24] and Benedetti et al. [
5] developed ML-based energy prediction models for building systems, but their approaches implicitly assumed intact distribution networks, rendering them vulnerable to physically implausible optimization outcomes in leak-affected systems. However, these isolated approaches often lack the capacity for holistic reasoning and therefore converge toward physically implausible solutions that satisfy numerical objectives but violate thermodynamic logic (e.g., lowering pressure without addressing structural leaks, causing process starvation). The proposed dual-action framework resolves this by introducing a cognitive feedback layer that interprets acoustic-vision outcomes in the context of pipeline fluid dynamics [
21,
22]. Through the MPSF algorithm, the framework integrates physical dependencies such as acoustic saliency and geometric leak propagation, reasoning about cause–effect volumetric losses rather than merely minimizing compressor power residuals. This capability transforms the optimization process from a blind search into an explainable decision-making sequence, analogous to a human expert iteratively refining a system based on diagnostic insight. Previous studies employing meta-model optimization or ensemble prediction achieved credible uncertainty quantification but remained opaque in their reasoning pathways. In contrast, the present approach provides explicit, interpretable justifications for each parameter update, thereby enhancing transparency and trust in the optimization process. Such interpretive capability marks a conceptual advance toward explainable AI in industrial energy management [
1,
2].
4.3. Practical Implications
From a practical standpoint, the proposed framework introduces a new paradigm for intelligent and explainable energy optimization in manufacturing environments. Its modular structure enables direct integration into existing industrial Internet of Things (IoT) platforms (e.g., SMARTLINK workflows), allowing users to leverage advanced predictive reasoning without altering established hardware infrastructures. For organizations seeking to adopt this approach at a new facility, the deployment process involves the following steps: (a) conducting an acoustic leak survey using a portable imager to identify and quantify baseline leaks, (b) integrating with the facility’s PLC/MES to obtain real-time production line schedules, (c) collecting a minimum of 2–3 months of IoT sensor data for model training, and (d) training and deploying the XGBoost predictive model. Importantly, the algorithmic framework (MPSF detection pipeline, XGBoost model architecture, and optimization logic) is directly reusable across facilities; however, the trained model parameters and detection thresholds are site-specific and require recalibration using local operational data. Regarding cost considerations, the primary capital investment is the acoustic imaging device (FLUKE ii910, approximately €20,000), which is a one-time purchase that can be shared across multiple facilities through periodic survey campaigns. The IoT sensor infrastructure (pressure, temperature, flow) is typically already available in modern compressor systems equipped with manufacturer-provided monitoring platforms (e.g., Atlas Copco SMARTLINK). The ML model training and deployment require standard computational resources and can be executed on commercial cloud platforms at minimal cost. For a single 75 kW compressor system, the framework demonstrated annual energy cost savings of over $27,600, suggesting a payback period of less than one year for the initial acoustic imager investment. 
For larger facilities with multiple compressor units, the economic benefit scales proportionally, while only a single acoustic imager is required for periodic diagnostic surveys. The diagnostic outputs generated by the MPSF vision-acoustic pipeline serve as real-time diagnostic feedback, offering insights into spatial leak severities and pneumatic behaviors that are otherwise obscured in conventional facility operations. In an operational context, this framework can be extended to digital-twin environments, enabling periodic recalibration using real-time production line scheduling data. The reasoning logs and quantified component-level savings—separating leak repair, pressure reduction, and schedule optimization—could also facilitate regulatory verification and transparent reporting for performance-based sustainability codes. Furthermore, by aligning numerical power forecasting with domain-specific physical realities, this approach opens new pathways toward “software-driven retrofitting” that combines non-intrusive defect mitigation with intelligent, schedule-aligned decision support.
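The payback claim above can be checked with a simple-payback calculation. This is a rough sketch: the imager cost and annual saving are the figures quoted in the text, but they are given in different currencies, so the exchange rate below is a hypothetical placeholder, and the calculation ignores installation, labor, and discounting.

```python
def simple_payback_years(capital_cost, annual_saving):
    """Undiscounted payback period: capital cost divided by annual saving."""
    if annual_saving <= 0:
        raise ValueError("annual_saving must be positive")
    return capital_cost / annual_saving


IMAGER_COST_EUR = 20_000.0     # approximate device cost quoted in the text
ANNUAL_SAVING_USD = 27_600.0   # annual cost saving quoted in the text
USD_PER_EUR = 1.10             # hypothetical exchange rate, not from the source

payback = simple_payback_years(IMAGER_COST_EUR * USD_PER_EUR, ANNUAL_SAVING_USD)
print(f"simple payback: {payback:.2f} years")
```

Under this assumed rate the payback stays below one year, and it would remain so for any rate up to 1.38 USD per EUR.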
4.4. Relevance to Building Energy Management
While the experimental validation was conducted in a manufacturing facility, the proposed framework has direct implications for building-level energy management and decarbonization strategies. Compressed air systems are a major contributor to energy consumption in industrial and large-scale commercial buildings, often accounting for 20–30% of total facility electricity costs [
7,
8]. The framework’s demonstrated ability to reduce CAS energy consumption by 42–53% translates directly to building-level Scope 2 CO₂ emission reductions, supporting compliance with increasingly stringent building energy performance standards [
33]. Furthermore, the schedule-aware prediction approach mirrors emerging trends in smart building energy management, where occupancy and usage schedules are leveraged to optimize HVAC, lighting, and other building systems [
34,
35,
36]. The modular architecture of the proposed framework enables its integration into Building Management Systems (BMSs) alongside existing HVAC and lighting optimization modules, positioning compressed air optimization as an integral component of holistic building decarbonization strategies rather than an isolated industrial maintenance task.
4.5. Limitations and Future Work
Despite its promising results, several limitations of the current study should be acknowledged to contextualize the findings and guide future research.
First, the proposed framework was validated at a single electronics and automotive parts manufacturing facility. While this real-world industrial testbed provides high ecological validity, the generalizability of the findings to other industrial sectors—such as food processing, pharmaceutical manufacturing, or O-ring production—where pneumatic demand profiles, pipeline configurations, and environmental conditions differ substantially, remains to be verified through multi-site deployment studies.
In terms of fault coverage, the current detection pipeline exclusively targets continuous leakage faults arising from pipe connections, fittings, and joint degradation. Other pneumatic fault types, including pressure drops from clogged in-line filters, actuator seal degradation, valve malfunctions, and condensate drain failures, fall outside the scope of the present investigation and would require supplementary detection modalities such as differential pressure monitoring or cycle-time analysis. Relatedly, all leaks identified in this study exhibited continuously active acoustic signatures. The framework is inherently limited in detecting intermittent faults—for instance, actuator seal leaks that manifest only during specific operational cycles—since such faults require the acoustic imager to be present and active at the precise moment of occurrence. Addressing this limitation would necessitate continuous automated monitoring using permanently installed acoustic sensor arrays or event-triggered detection strategies.
Regarding instrumentation, the validation was performed exclusively with the FLUKE ii910 Precision Acoustic Imager, a device costing approximately €20,000. Although the core algorithmic logic operates on generic acoustic-optical overlay data and is, in principle, transferable to other imagers with comparable specifications, cross-device compatibility has not been empirically verified. Moreover, the high acquisition cost may present a practical barrier for small- and medium-sized enterprises seeking to adopt this approach.
From a data infrastructure perspective, the XGBoost model’s predictive accuracy is heavily dependent on the availability of granular, real-time production line schedule data obtained via PLC or MES integration. In facilities lacking such digital infrastructure or operating with undocumented, irregular production patterns, the system’s predictive performance would be substantially degraded. Furthermore, a uniform minimum acceptable pressure of 6.0 bar was applied across all 13 production lines in this study; facilities with heterogeneous end-point pressure requirements would benefit from zone-based pressure management, which the current framework does not implement. Finally, the 10 min IoT sampling interval, while sufficient for capturing schedule-driven demand variations, may miss fast transient events associated with rapid load transitions, and the potential benefits of higher-frequency sampling remain to be explored.
Future research should address these limitations by conducting multi-site validation across diverse industrial sectors, integrating supplementary FDD modalities (pressure, cycle time, and vibration sensing) for comprehensive pneumatic fault coverage, exploring permanently installed acoustic sensor arrays for continuous automated leak monitoring, developing zone-based pressure optimization for facilities with heterogeneous equipment requirements, and investigating closed-loop PLC integration for fully autonomous compressor control.
5. Conclusions
This study introduces a novel dual-action energy optimization framework for large-scale building energy systems, bridging the gap between physical built-environment maintenance and macro-level predictive control. By unifying micro-level defect mitigation with schedule-aware machine learning, the proposed methodology overcomes the critical limitations of isolated, reactive optimization strategies that often inadvertently compromise the stability of facility-scale systems and indoor environmental control. Thermodynamic simulations supported the core physical hypothesis of this research: leak prevention is a strict physical prerequisite for predictive pressure reduction. As demonstrated, dynamic pressure optimization without prior remediation of baseline volume losses critically starves end-point processes, highlighting the necessity of a holistic approach to pneumatic energy management. The principal contributions of this work are threefold: (i) the demonstration that leak prevention constitutes a strict thermodynamic prerequisite for any predictive depressurization strategy; (ii) the development of a multimodal acoustic-vision pipeline (MPSF) capable of automated, geometry-verified leak quantification under complex industrial conditions; and (iii) the formulation of a schedule-aware XGBoost predictive model that achieves high-fidelity power forecasting using only external operational boundaries, explicitly preventing data leakage.
The micro-level implementation of the Multimodal Probabilistic Saliency Fusion (MPSF) algorithm successfully automated the detection and quantification of hidden pneumatic leaks, robustly filtering severe industrial confounders to explicitly recover 11.0 kW of continuous baseline power waste. Building upon this secured structural integrity, a macro-level XGBoost predictive architecture was deployed to dynamically forecast compressor power demand using strictly external operational boundaries. By exclusively utilizing the discrete binary schedules of 13 production lines and explicitly excluding internal mechanical sensors, the model successfully prevented data leakage and provided true preemptive load forecasting. This schedule-aware approach substantially outperformed conventional parametric and ensemble baselines, achieving a high-fidelity prediction with an RMSE of 2.67 kW and an $R^2$ of 0.9698.
Ultimately, this research provides a meaningful advance in building-level compressed air and facility energy management. Across a comprehensive six-month real-world deployment, the integrated dual-action framework achieved monthly energy reductions ranging from 42.7% to 53.1%, averaging approximately 23.0 MWh per month. When extrapolated annually, this “software-driven retrofitting” approach translates to a direct operational cost reduction of over $27,600 and the elimination of 116 tons of Scope 2 CO₂-equivalent emissions for a single 75 kW compressor network. It should be noted that the current study is subject to several limitations, including single-site validation, an exclusive focus on continuous leakage faults, dependence on a single acoustic imager model, and a requirement for granular production schedule data, as detailed in Section 4.5. By aligning numerical machine learning optimization with domain-specific thermodynamic realities, the proposed framework offers an explainable, highly cost-effective, and physically grounded pathway toward carbon-neutral building operations and the decarbonization of the built environment, while potentially reducing the need for capital-intensive hardware upgrades.