Article

Wavelet Entropy and Machine Learning Analysis of Nonlinear Dynamics in Tubular Light Pipes

Department of Electrical-Electronics Engineering, Faculty of Engineering-Architecture, Burdur Mehmet Akif Ersoy University, 15200 Burdur, Türkiye
Electronics 2026, 15(7), 1474; https://doi.org/10.3390/electronics15071474
Submission received: 16 March 2026 / Revised: 27 March 2026 / Accepted: 30 March 2026 / Published: 1 April 2026

Abstract

This study presents a hybrid framework designed primarily to predict electrical energy consumption in tubular light pipe systems while also providing interpretability through wavelet-based analysis. Indoor and outdoor illuminance were continuously monitored at one-minute intervals between January and May in Istanbul, Turkey. Using the continuous wavelet transform (CWT) with predefined scale ranges, multi-scale features such as scale-wise energy, relative wavelet energy, and wavelet entropy were extracted to quantify illumination variability and stability. These features were combined with contextual parameters (e.g., month and weather) to predict electrical energy consumption and the energy-saving ratio under a threshold-based lighting control strategy. Among the evaluated models, Random Forest (RF) was selected as the primary model due to its balance between prediction accuracy and interpretability; it achieved lower prediction errors than the baseline models (RMSE = 7.84 for RF, 9.39 for Linear Regression, and 8.28 for ARIMA), although the observed improvements are influenced by the inherent variability in the dataset. Feature-importance and SHapley Additive exPlanations (SHAP) analyses revealed that low-frequency wavelet components and low wavelet entropy values strongly influence the predictive behavior, indicating that stable illumination leads to reduced artificial lighting demand and higher energy savings. A Lyapunov-inspired stability interpretation suggests that the system exhibits stable behavior consistent with asymptotic convergence. Unlike existing studies, the proposed framework integrates wavelet entropy with interpretable machine learning to jointly model illumination dynamics and energy demand. This enables more reliable prediction of lighting energy demand under highly variable daylight conditions.

1. Introduction

Tubular light pipes (TLPs) are passive daylighting systems designed to transmit natural light from exterior environments into interior spaces through highly reflective ducts, thereby reducing dependence on artificial lighting and overall building energy consumption [1]. In sustainable building design, these systems play an important role due to their ability to distribute diffuse daylight into deep indoor zones without significant thermal impact or visual discomfort [2]. The optical performance of a TLP is influenced by both structural parameters, such as diameter, length, and surface reflectivity, and environmental factors including solar altitude, sky conditions, and seasonal variability [3]. The interaction of these parameters leads to illumination dynamics that are inherently nonlinear and nonstationary under real atmospheric conditions.
Traditional approaches for modeling TLP performance are generally based on deterministic optical or radiative transfer formulations that assume steady-state conditions and uniform solar input. While these models provide useful approximations under controlled conditions, they are limited in their ability to capture transient fluctuations caused by cloud movement, atmospheric variability, and rapidly changing solar geometry [4]. As a result, discrepancies between modeled and measured illuminance frequently occur, particularly under partly cloudy conditions where irradiance varies across multiple temporal scales. This limitation has led to increasing interest in data-driven approaches that can better represent the multi-scale variability in daylighting systems [5].
Recent developments in signal processing and machine learning (ML) have enabled more flexible modeling of complex environmental time series. In this context, wavelet analysis provides a time–frequency framework that decomposes nonstationary signals into localized components, allowing for detailed characterization of illumination dynamics. The CWT enables multi-scale analysis of illuminance signals, from which features such as scale-wise energy and Relative Wavelet Energy (RWE) can be derived. These features capture how illumination variability is distributed across different temporal scales, providing insight into both gradual trends and short-term fluctuations [6].
Wavelet Entropy (WE), computed from the normalized distribution of wavelet energy, serves as a compact measure of signal complexity. Lower entropy values are associated with stable and predictable illumination conditions, typically observed under clear-sky scenarios, whereas higher entropy values indicate irregular and rapidly varying patterns under dynamic weather conditions [7]. Within TLP systems, WE can therefore be interpreted as an indicator of illumination stability that is directly related to variations in artificial lighting demand.
Although previous studies have applied wavelet-based techniques to solar irradiance and building lighting analysis [8,9,10,11,12], relatively few works have combined wavelet-domain features with interpretable machine learning models for predicting energy consumption in hybrid daylight–artificial lighting systems [13,14,15]. Most existing approaches focus primarily on improving short-term prediction accuracy, with limited attention to interpretability or to the stability characteristics of the illumination process [16,17,18,19]. Consequently, there remains a need for a unified framework that integrates signal-level characterization, predictive modeling, and physically meaningful interpretation.
To address this gap, this study proposes a hybrid wavelet–machine learning framework primarily aimed at predicting electrical energy consumption in TLP systems under real operating conditions. The approach integrates multi-scale wavelet features, including RWE and WE, with contextual variables such as weather conditions and seasonal indicators to model the relationship between daylight availability and artificial lighting demand. In addition to prediction, the framework emphasizes interpretability through feature importance analysis and SHAP, enabling a direct connection between signal characteristics and energy performance.
The analysis is based on an experimental setup deployed in Istanbul, Turkey, where indoor and outdoor illuminance were recorded at one-minute intervals over a five-month period, resulting in a dataset exceeding 200,000 samples. The system includes a 300 mm diameter reflective light pipe, an acrylic dome, and a prismatic diffuser, along with a feedback-controlled LED lighting system that is activated when indoor illuminance falls below a predefined threshold. Electrical energy consumption and the energy-saving ratio (ESR) are computed based on this control strategy.
Wavelet-based features are extracted from the measured illuminance signals using the Continuous Wavelet Transform with predefined scale ranges to ensure consistent multi-scale representation. These features are combined with environmental and temporal variables to form the input set for machine learning models. Random Forest (RF), Gradient Boosting (GB), and Long Short-Term Memory (LSTM) models are evaluated, and Random Forest is selected as the primary model due to its balance between predictive performance and interpretability. Model performance is assessed using RMSE, MAE, and $R^2$ metrics, and the contribution of each feature is analyzed using SHAP-based interpretability techniques.
In addition, a Lyapunov-inspired formulation is introduced to provide a qualitative interpretation of system stability in relation to wavelet entropy. This formulation is not intended as a formal stability proof, but rather as an interpretative framework that relates entropy-based measures to observed system behavior, where decreasing entropy trends are associated with more stable illumination dynamics.
The main contributions of this study can be summarized as follows: (i) the development of a hybrid wavelet–machine learning framework for predicting electrical energy consumption in TLP systems, (ii) the extraction of physically meaningful multi-scale features using RWE and WE, (iii) the integration of interpretable machine learning techniques to link illumination dynamics with energy demand, and (iv) a stability-oriented interpretation that connects wavelet entropy with system behavior.
The proposed framework provides a consistent and interpretable approach for modeling the relationship between daylight variability and artificial lighting demand, enabling more reliable prediction of energy consumption under highly dynamic environmental conditions.

2. Experimental Setup

2.1. Measurement System and Configuration

This experimental study was conducted in a daylighting test chamber equipped with a TLP system transmitting sunlight from the roof to the interior working plane in Istanbul, Turkey (41.01° N, 28.97° E). The system includes a 300 mm diameter aluminum light pipe (length 2.4 m, inner reflectance ≈ 0.85). A transparent acrylic dome captures direct and diffuse solar radiation, while a prismatic diffuser distributes light uniformly throughout the interior zone.
Indoor and outdoor illuminance were recorded with calibrated digital lux sensors (±2% accuracy). Data were collected at 1 min intervals from January to May, yielding more than 200,000 samples. Outdoor measurements were taken on the rooftop, and indoor sensors were mounted at a height of 0.8 m. Measurements were synchronized via a real-time clock (RTC).
A programmable LED unit is activated automatically when indoor illuminance falls below the target threshold ($E_t = 500$ lux), enabling estimation of electrical energy consumption using the equations given in Section 3.1. The illuminance sensors were factory-calibrated prior to installation, and their consistency was periodically verified during the measurement campaign. The specifications of the experimental measurement system are presented in Table 1.

2.2. Data Acquisition and Preprocessing

Raw illuminance data were filtered to remove measurements affected by sensor saturation or temporary shading. Missing samples (less than 1.2% of the dataset) were reconstructed using spline interpolation to preserve signal continuity. All illuminance values (lux) were normalized to the range [0, 1] using min–max scaling to ensure numerical stability in subsequent wavelet and machine learning analyses. Meteorological data, including global solar irradiance, ambient temperature, and relative humidity, were obtained from the Turkish State Meteorological Service (TSMS) and temporally aligned with the illuminance measurements using timestamp matching. Minor inconsistencies were corrected through interpolation to maintain data consistency.
To assess whether global min–max normalization affects the representation of transient illumination dynamics, additional normalization strategies, including z-score scaling, were evaluated. The resulting wavelet-based features, particularly RWE and WE, exhibited consistent distributions across normalization schemes. This indicates that the multi-scale structure of the illuminance signal is preserved, and the proposed feature extraction process remains robust to the choice of normalization method.
The experimental configuration of the TLP system is illustrated in Figure 1. Sunlight enters through the rooftop dome (1) and is transmitted via a reflective aluminum-coated light pipe (2) to the test chamber, where a prismatic diffuser (3) distributes daylight evenly across the working plane. An indoor illuminance sensor (4) continuously measures lighting levels and sends feedback to a control unit. When the measured illuminance falls below the target threshold (Et = 500 lux), the controller activates the LED luminaire to maintain the desired indoor lighting level. This setup enables real-time evaluation of optical efficiency, control response, and energy consumption under dynamic daylight conditions. Electrical energy consumption was estimated based on the rated power of the LED system and its operating duration, which was determined from the control activation periods.

3. Mathematical Framework

This section presents the mathematical foundation for analyzing the nonlinear behavior of the TLP system using experimental illuminance data. The aim is to represent the nonlinear and time-varying behavior of the system under real environmental conditions while maintaining a physically interpretable structure.
Indoor illuminance is primarily governed by the amount of incoming daylight; however, in practical conditions, it is also influenced by environmental and atmospheric factors as well as system-level optical characteristics. To account for these combined effects, indoor illuminance is modeled as follows:
$$E_m(t) = \alpha\,[E_o(t)]^{\beta} + \gamma_1 I(t) + \gamma_2 T(t) + \gamma_3 H(t) + \varepsilon(t) \tag{1}$$
where Em(t) and Eo(t) represent the indoor and outdoor illuminance (lux), respectively. The term I(t) represents global solar irradiance, T(t) is the ambient temperature, and H(t) is the relative humidity. The coefficient α corresponds to the overall optical efficiency of the TLP system, while β represents a nonlinear attenuation factor associated with geometric and reflective properties of the pipe. The coefficients γ1, γ2, and γ3 capture the influence of environmental variables on light transmission and diffusion. The term ε(t) accounts for stochastic disturbances, including measurement noise and unmodeled system effects.
The nonlinear exponent β allows the model to capture deviations from ideal linear light transmission, which may arise due to internal reflections, optical losses, and variations in incident light conditions. The inclusion of environmental variables reflects the fact that illumination inside the system is not solely determined by outdoor illuminance but is also affected by atmospheric variability and indirect lighting effects.
In the simplified case where environmental influences are neglected (γ1 = γ2 = γ3 = 0), β = 1, and ε(t) = 0, Equation (1) reduces to a linear relationship between indoor and outdoor illuminance. However, such a simplification is insufficient under real operating conditions, where illumination dynamics exhibit significant nonlinearity and nonstationarity.
This formulation provides a physically consistent yet flexible representation of the TLP system, enabling direct integration with wavelet-based feature extraction and subsequent machine learning models used for energy prediction and system analysis.
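As a hedged illustration of how the optical coefficients in Equation (1) might be estimated, the sketch below fits α and β in the simplified case where the environmental terms are neglected (γ1 = γ2 = γ3 = 0). It uses ordinary least squares in log-log space, which is an assumption made here for illustration and is not stated in the source as the authors' fitting procedure. The function name `fit_power_law` and the synthetic parameter values are hypothetical.

```python
import numpy as np

def fit_power_law(e_out, e_in):
    """Fit E_m = alpha * E_o**beta by least squares in log-log space.

    Assumes the environmental terms of Equation (1) are neglected and
    that the noise is roughly multiplicative (additive in the logs).
    """
    e_out = np.asarray(e_out, dtype=float)
    e_in = np.asarray(e_in, dtype=float)
    mask = (e_out > 0) & (e_in > 0)  # logarithms need positive values
    # log E_m = log(alpha) + beta * log(E_o) is a straight line
    beta, log_alpha = np.polyfit(np.log(e_out[mask]), np.log(e_in[mask]), 1)
    return float(np.exp(log_alpha)), float(beta)

# Synthetic check with made-up parameters (not measured values)
e_out = np.linspace(1_000.0, 80_000.0, 200)   # outdoor illuminance, lux
e_in = 0.02 * e_out ** 0.9                    # noise-free indoor illuminance
alpha_hat, beta_hat = fit_power_law(e_out, e_in)
```

On noise-free synthetic data the fit recovers the generating parameters; with real measurements, night-time and saturated samples would need to be excluded first.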

3.1. Illuminance Deficit and Energy Demand

To maintain the target illuminance $E_t$, artificial lighting supplements daylight when the indoor illuminance $E_m(t)$ is insufficient. The illuminance deficit is expressed as follows:
$$\Delta E(t) = \max\bigl(0,\; E_t - E_m(t)\bigr) \tag{2}$$
where ΔE(t) is the illuminance deficit, which is zero whenever the indoor illuminance meets or exceeds the target level. The instantaneous electrical power required to compensate for the deficit is [20]:
$$P(t) = \frac{A\,\Delta E(t)}{\eta} \tag{3}$$
where A is the illuminated area (m²) and η is the luminous efficacy (lm/W).
The total electrical energy over the period T is obtained by integration [21]:
$$E_{\text{elec}} = \int_0^T P(t)\,dt \tag{4}$$
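The deficit, power, and energy relations above can be sketched numerically for one-minute samples. This is a minimal illustration, not the authors' implementation: the illuminated area (12 m²) and luminous efficacy (100 lm/W) are hypothetical placeholder values, and the integral is approximated with a rectangle rule.

```python
import numpy as np

def lighting_energy_kwh(e_indoor_lux, target_lux=500.0, area_m2=12.0,
                        efficacy_lm_per_w=100.0, dt_minutes=1.0):
    """Return (deficit in lux, power in W, total energy in kWh).

    area_m2 and efficacy_lm_per_w are illustrative assumptions.
    """
    e_indoor = np.asarray(e_indoor_lux, dtype=float)
    deficit = np.maximum(0.0, target_lux - e_indoor)    # illuminance deficit, lux
    power_w = area_m2 * deficit / efficacy_lm_per_w     # lux * m^2 = lm; lm / (lm/W) = W
    energy_wh = np.sum(power_w) * (dt_minutes / 60.0)   # rectangle-rule integration
    return deficit, power_w, energy_wh / 1000.0

# Four one-minute samples: two at or above the 500 lux target, two below it
deficit, power_w, energy_kwh = lighting_energy_kwh([600.0, 500.0, 400.0, 100.0])
```

With these inputs the deficits are 0, 0, 100, and 400 lux, the corresponding powers are 0, 0, 12, and 48 W, and the total over four minutes is 1 Wh (0.001 kWh).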

3.2. Wavelet Transform and Feature Extraction

The Continuous Wavelet Transform is used to analyze the nonstationary illuminance signal Em(t) [22].
In this study, a predefined scale range is used to capture both low-frequency (diurnal trends) and high-frequency (short-term fluctuations) components of the signal.
$$W(s,\tau) = \frac{1}{\sqrt{s}} \int_{-\infty}^{+\infty} E_m(t)\, \psi^{*}\!\left(\frac{t-\tau}{s}\right) dt \tag{5}$$
where s denotes the scale parameter, τ represents the time shift, and ψ is the selected mother wavelet. In this study, the Morlet wavelet is employed due to its strong time–frequency localization capability and its suitability for analyzing nonstationary environmental signals. Comparative tests using alternative wavelets, such as Daubechies families, showed that the overall energy distribution across scales and the resulting entropy values remain consistent. This suggests that the proposed feature extraction framework is not highly sensitive to the specific choice of mother wavelet.
From the wavelet coefficients, the scale-wise energy and the RWE are obtained as follows [7]:
$$E_W(s) = \int |W(s,\tau)|^2\, d\tau, \qquad \mathrm{RWE}(s) = \frac{E_W(s)}{\sum_i E_W(s_i)} \tag{6}$$
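A minimal NumPy-only sketch of the transform and the two energy measures is given below. Several details are assumptions made for illustration because the source does not specify them: the Morlet centre frequency (`w0 = 6`), the truncation of the wavelet at four scale widths, the dyadic scale set, and the function names. A production analysis would more likely rely on an established wavelet library.

```python
import numpy as np

def morlet(u, w0=6.0):
    """Complex Morlet mother wavelet (small admissibility correction omitted)."""
    return np.pi ** -0.25 * np.exp(1j * w0 * u) * np.exp(-0.5 * u ** 2)

def cwt_morlet(signal, scales, dt=1.0):
    """Discrete approximation of W(s, tau) with 1/sqrt(s) normalisation."""
    signal = np.asarray(signal, dtype=float)
    n = signal.size
    coeffs = np.empty((len(scales), n), dtype=complex)
    for k, s in enumerate(scales):
        # Truncate the wavelet at +/- 4 scale widths (assumed cut-off)
        half = int(min(4 * s / dt, (n - 1) // 2))
        u = np.arange(-half, half + 1) * dt
        kernel = np.conj(morlet(u / s)) / np.sqrt(s)
        # Correlation with the kernel = convolution with the reversed kernel
        coeffs[k] = np.convolve(signal, kernel[::-1], mode="same") * dt
    return coeffs

def scale_energy(coeffs, dt=1.0):
    """E_W(s): squared coefficient magnitude summed over time shifts."""
    return np.sum(np.abs(coeffs) ** 2, axis=1) * dt

def relative_wavelet_energy(energy):
    """RWE(s): share of the total wavelet energy held by each scale."""
    return energy / np.sum(energy)

# Synthetic test signal: a pure oscillation with a 16-sample period
t = np.arange(512)
signal = np.sin(2 * np.pi * t / 16.0)
scales = [2, 4, 8, 16, 32]          # hypothetical dyadic scale set
coeffs = cwt_morlet(signal, scales)
rwe = relative_wavelet_energy(scale_energy(coeffs))
```

For this test signal almost all relative energy falls on scale 16, because a Morlet wavelet with `w0 = 6` has a Fourier period of roughly 1.03 times its scale.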

3.3. Wavelet Entropy and Nonlinear Behavior

Wavelet entropy summarizes the multiscale complexity of illumination dynamics [7]:
$$\mathrm{WE} = -\sum_{i=1}^{N} \mathrm{RWE}(s_i)\,\log\bigl(\mathrm{RWE}(s_i)\bigr) \tag{7}$$
Low WE indicates stable and regular behavior; high WE implies irregular or chaotic fluctuations.
A Lyapunov-like energy function is used to relate entropy to system stability [23]:
$$V(t) = \tfrac{1}{2}\,[\Delta E(t)]^2, \qquad \dot{V}(t) = -k\,\mathrm{WE}(t), \quad k > 0 \tag{8}$$
If $\dot{V}(t) < 0$, the energy function decreases, suggesting behavior consistent with asymptotic stability. The proportionality constant k is introduced as a positive scaling parameter to relate entropy to the rate of change in the energy function. In this formulation, k is not derived from physical system parameters but serves as a normalization coefficient to establish a qualitative link between entropy and system stability. Therefore, the Lyapunov-inspired formulation should be interpreted as a heuristic representation rather than a strict stability proof, and its generalization to different system configurations may require recalibration.
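As a short worked step, and assuming only that the deficit ΔE(t) is differentiable at the time of interest, the chain rule applied to the Lyapunov-like function gives its rate of change directly in terms of the deficit:

```latex
\dot{V}(t)
  = \frac{d}{dt}\left(\tfrac{1}{2}\,[\Delta E(t)]^{2}\right)
  = \Delta E(t)\,\frac{d\,\Delta E(t)}{dt}
```

Because ΔE(t) is non-negative by definition, $\dot{V}(t) < 0$ holds exactly when a deficit is present and shrinking. The heuristic relation to wavelet entropy therefore amounts to treating a scaled entropy as a proxy for how quickly the deficit is being reduced.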

3.4. Machine Learning Integration

Wavelet and contextual features are combined into a comprehensive feature vector:
$$x = [\,E_W(s_1), \ldots, E_W(s_p),\ \mathrm{WE},\ \mathrm{Weather},\ \mathrm{Month}\,] \tag{9}$$
The model is designed to predict two primary targets, the electrical energy consumption and the energy-saving ratio (ESR):
$$y_1 = E_{\text{elec}}, \qquad y_2 = 1 - \frac{E_{\text{elec}}^{\text{smart}}}{E_{\text{elec}}^{\text{base}}} \tag{10}$$
Among the evaluated models, Random Forest is selected as the primary model due to its balance between predictive performance and interpretability.
The regression mapping is defined as follows:
$$\hat{y} = f(x;\theta) \tag{11}$$
where f denotes the chosen ML model, such as RF, GB, or LSTM networks.

3.5. Model Interpretation and Evaluation

RMSE quantifies prediction accuracy [24]:
$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N} \bigl(y_i - \hat{y}_i\bigr)^2} \tag{12}$$
Model interpretability is supported by SHAP values, which assess the marginal contribution of each input feature $x_j$:
$$\mathrm{SHAP}(x_j) = \mathbb{E}_{x_{-j}}\bigl[f(x)\bigr] - f(x_{-j}) \tag{13}$$
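To make the idea of a marginal contribution concrete, the sketch below computes exact Shapley values by enumerating feature coalitions. It is an illustration under stated assumptions, not the estimator used in the study (the source does not say which SHAP implementation was applied): absent features are replaced by a fixed baseline value, the toy `predict` function is hypothetical, and the cost grows as 2^d, so the approach only suits a handful of features.

```python
import itertools
import math
import numpy as np

def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict at x relative to a baseline point."""
    x = np.asarray(x, dtype=float)
    baseline = np.asarray(baseline, dtype=float)
    d = x.size
    phi = np.zeros(d)

    def value(coalition):
        # Features in the coalition take their real value, the rest the baseline
        z = baseline.copy()
        idx = np.array(coalition, dtype=int)
        z[idx] = x[idx]
        return float(predict(z[None, :])[0])

    for j in range(d):
        others = [i for i in range(d) if i != j]
        for size in range(d):
            # Standard Shapley weight |S|! (d - |S| - 1)! / d!
            weight = (math.factorial(size) * math.factorial(d - size - 1)
                      / math.factorial(d))
            for subset in itertools.combinations(others, size):
                phi[j] += weight * (value(subset + (j,)) - value(subset))
    return phi

def predict(X):
    """Toy model: one additive term and one pairwise interaction."""
    return 2.0 * X[:, 0] + X[:, 1] * X[:, 2]

phi = shapley_values(predict, x=[1.0, 2.0, 3.0], baseline=[0.0, 0.0, 0.0])
```

For this toy model the additive feature receives its full effect (2), while the interaction term (6) is shared equally between the two interacting features (3 each); the values sum to the difference between the prediction at `x` and at the baseline.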

3.6. Algorithm Description (Workflow Summary)

To clearly define the operational logic of the proposed hybrid framework, the algorithmic workflow is summarized as shown in Algorithm 1. This stepwise description outlines how experimental illuminance data are transformed into physically interpretable, wavelet-based features and subsequently used for predictive modeling. Each stage, from signal preprocessing to model interpretation, follows a consistent computational pipeline that ensures both physical relevance and statistical rigor. The procedure emphasizes integrating wavelet-domain descriptors with machine learning to achieve accurate, explainable energy prediction in TLP systems.
Algorithm 1. Workflow of the Proposed Hybrid Framework
Input:
    Outdoor illuminance Eo(t), indoor illuminance Em(t)
Procedure:
    1. Compute illuminance deficit
       ΔE(t) = max(0, Et − Em(t)) (Equation (2))
    2. Estimate electrical power and energy consumption
       Compute P(t) using Equation (3)
       Compute Eelec using Equation (4)
    3. Apply Continuous Wavelet Transform (CWT)
       Compute W(s, τ) from Em(t) using Equation (5)
    4. Extract wavelet-based features
       Compute EW(s) and RWE(s) using Equation (6)
    5. Compute wavelet entropy
       Compute WE using Equation (7)
    6. Construct feature vector
       x = [EW(si), WE, contextual variables]
    7. Train machine learning model
       Train model f(x) to predict Eelec or ESR
       (Random Forest is used as the primary model)
    8. Evaluate model performance
       Compute RMSE using Equation (12)
       Interpret results using SHAP (Equation (13))
Output:
    Predicted energy consumption Eelec and feature-level interpretability map
The workflow demonstrates how the raw illuminance measurements are systematically converted into energy-related insights. By combining physical equations with data-driven modeling, the framework bridges the gap between optical characterization and predictive analysis. This structure not only enables precise estimation of energy demand and ESR but also enhances model transparency through feature-level interpretability using SHAP analysis. Consequently, the algorithm forms the computational backbone of the proposed wavelet–machine learning methodology, ensuring reproducibility and scalability for future adaptive daylighting studies.

4. Methodology

This section describes the end-to-end workflow from raw illuminance data to interpretable, data-driven energy models. The methodology includes experimental data acquisition, wavelet-domain analysis, feature engineering, and machine learning–based prediction and interpretation. Each step ensures the physical and statistical consistency of the modeling process.

4.1. Analysis Pipeline Overview

The overall workflow of the proposed wavelet–machine learning framework is summarized in Figure 2. The process begins with synchronized one-minute indoor and outdoor illuminance measurements collected from the TLP system. Preprocessing involves filtering sensor noise, normalizing values, and interpolating missing samples using spline or linear methods to ensure signal continuity. The processed data are then divided into two analytical branches:
(1) Physical–Energy Domain, where derived quantities such as instantaneous electrical energy and the ESR are computed based on measured illuminance and system parameters; and
(2) Wavelet Domain, where the CWT extracts multi-scale features, including scale-wise energy, RWE, and WE.
Outputs from these branches are integrated into a combined feature vector x, which serves as the input to machine learning models predicting electrical energy demand, ESR, and entropy-based stability indices. The models are evaluated using RMSE, MAE, and Coefficient of Determination ($R^2$) metrics, while interpretability is achieved through feature-importance and entropy dynamics analyses. This end-to-end pipeline provides both predictive accuracy and physical transparency, linking raw illuminance data to energy performance and stability insights in daylight-driven lighting systems.

4.2. Feature Engineering

Wavelet-based feature engineering was implemented to capture the multi-scale temporal dynamics of indoor illuminance signals $E_m(t)$ while preserving their underlying physical and statistical characteristics. The method transforms the time-domain signal into a joint time–frequency representation that reveals both transient fluctuations and quasi-stationary illumination patterns.
The transformation begins with the CWT, defined in Equation (5), which decomposes the signal into localized oscillations across multiple scales:
$$W(s,\tau) = \frac{1}{\sqrt{s}} \int_{-\infty}^{+\infty} E_m(t)\, \psi^{*}\!\left(\frac{t-\tau}{s}\right) dt$$
where s denotes the scale parameter controlling frequency resolution, and τ represents the time-shift parameter capturing temporal localization. The mother wavelet $\psi(t)$ is chosen to satisfy the admissibility condition, ensuring perfect reconstruction and finite energy:
$$C_\psi = \int_0^{+\infty} \frac{|\hat{\psi}(\omega)|^2}{\omega}\, d\omega < \infty$$
This ensures that $E_m(t)$ can be expressed as a weighted combination of wavelets without information loss. From the obtained coefficients $W(s,\tau)$, the scale-wise energy is computed as follows:
$$E_W(s) = \frac{1}{T} \int_0^T |W(s,\tau)|^2\, d\tau$$
representing the average energy density at each scale. This term quantifies how much illumination variability is concentrated in a specific temporal frequency band—lower scales capture short-term fluctuations (e.g., fast-changing cloud cover), whereas higher scales capture slow diurnal trends. The RWE provides a normalized representation of how total energy is distributed among the scales:
$$p_s = \frac{E_W(s)}{\sum_i E_W(s_i)}, \qquad \sum_i p_{s_i} = 1$$
where $p_s$ forms a probability distribution across the scale domain. This probabilistic view allows the wavelet energy to be interpreted as a measure of signal organization. Concentrated energy at a few scales implies regular illumination, while dispersed energy indicates irregular patterns. Building upon this, the WE quantifies the degree of disorder or complexity of illumination:
$$\mathrm{WE} = -\sum_i p_{s_i} \log_2\bigl(p_{s_i}\bigr)$$
A low WE value corresponds to stable, clear-sky conditions with high predictability, whereas a high WE signifies unstable, cloudy conditions characterized by rapid transitions and broad spectral content. For comparative analysis across different datasets or seasons, a normalized entropy form can also be used:
$$\mathrm{WE}_n = \frac{-\sum_i p_{s_i} \log_2\bigl(p_{s_i}\bigr)}{\log_2(N)}, \qquad 0 \le \mathrm{WE}_n \le 1$$
where N is the number of scales. This normalization ensures that entropy values remain comparable across different wavelet resolutions or sampling rates. The extracted wavelet-based indicators $\{E_W(s_i), \mathrm{RWE}, \mathrm{WE}\}$ are integrated with contextual variables, including month, hour, outdoor irradiance, and meteorological parameters, to form the feature vector x:
$$x = [\,E_W(s_1), E_W(s_2), \ldots, E_W(s_N),\ \mathrm{RWE}_1, \ldots, \mathrm{RWE}_N,\ \mathrm{WE},\ \text{contextual features}\,]$$
This multi-domain feature representation fuses physical illumination parameters and wavelet-derived descriptors, providing a comprehensive input for the subsequent machine learning model. The resulting vector retains both the energetic structure and temporal complexity of daylight signals, enabling accurate and interpretable predictions of electrical demand, ESR, and stability metrics.
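The entropy definitions in this subsection can be checked on small hand-made energy distributions. The sketch below is illustrative only; the function name is hypothetical, the usual convention 0·log2(0) = 0 is assumed, and the normalized form requires at least two scales.

```python
import numpy as np

def wavelet_entropy(scale_energies, normalized=False):
    """Shannon entropy (base 2) of the relative wavelet energy distribution."""
    e = np.asarray(scale_energies, dtype=float)
    p = e / e.sum()                    # relative wavelet energy per scale
    nz = p[p > 0]                      # convention: 0 * log2(0) = 0
    we = -np.sum(nz * np.log2(nz))
    if normalized:
        we = we / np.log2(e.size)      # divide by log2(N); needs N >= 2
    return float(we)

we_uniform = wavelet_entropy([1.0, 1.0, 1.0, 1.0])          # energy spread evenly
we_uniform_norm = wavelet_entropy([1.0, 1.0, 1.0, 1.0], normalized=True)
we_single = wavelet_entropy([5.0, 0.0, 0.0, 0.0])           # energy in one scale
```

Energy spread evenly over four scales gives the maximum entropy of log2(4) = 2 bits (normalized value 1), whereas energy concentrated in a single scale gives zero entropy, which matches the interpretation of low WE as regular illumination.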

4.3. Machine Learning Modeling

After constructing the multi-domain feature vector x, supervised learning models were trained to predict key performance targets, including total electrical energy consumption $E_{\text{elec}}$ and the ESR. The modeling objective is to learn the functional mapping that minimizes the prediction error between the model output and ground truth values $y_i$.
To avoid temporal leakage, the dataset was split chronologically into 75% for training and 25% for testing, so that all test samples occur after the training period. The hyperparameters of the machine learning models were selected through empirical testing to balance prediction accuracy and model stability. For the Random Forest model, the number of trees and the maximum depth were adjusted to avoid overfitting. Default parameter settings, with minor tuning, were used for the Gradient Boosting and LSTM models. The objective function is defined as the Mean Squared Error (MSE) [25]:
$$L_{\mathrm{MSE}}(\theta) = \frac{1}{N_{tr}} \sum_{i \in I_{tr}} \bigl(y_i - f(x_i;\theta)\bigr)^2$$
where $y_i$ denotes the observed target, $f(x_i;\theta)$ the model prediction, and $I_{tr}$ the set of $N_{tr}$ training samples.
To enhance generalization and reduce overfitting, five-fold time-aware cross-validation is applied [26]:
$$\hat{E}_{CV} = \frac{1}{5} \sum_{k=1}^{5} \frac{1}{|V_k|} \sum_{i \in V_k} \bigl(y_i - f(x_i;\hat{\theta}^{(k)})\bigr)^2$$
Each fold preserves chronological ordering, preventing information from future time steps from leaking into past ones. Three algorithms were employed for performance benchmarking:
  • Random Forest:
$$\hat{y}_{\mathrm{RF}}(x) = \frac{1}{B} \sum_{b=1}^{B} h_b(x)$$
where $h_b(\cdot)$ represents each decision tree trained on a bootstrap sample. RF is robust to noise and captures nonlinear interactions among features.
  • Gradient Boosting:
$$\hat{y}_m(x) = \sum_{t=1}^{m} \nu\, g_t(x)$$
where ν is the learning rate and $g_t(\cdot)$ are weak learners sequentially optimized to minimize residuals. GB models capture fine-grained, additive relationships.
  • Long Short-Term Memory:
A recurrent neural network capable of modeling long-term temporal dependencies using gated memory cells. The hidden state $h_t$ is updated through input, output, and forget gates, allowing the network to retain illumination dynamics over time windows.
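The chronological split and time-aware cross-validation described before the model list can be sketched with index arithmetic alone. The source does not specify the fold construction, so the expanding-window scheme below is an assumption, and the function names are hypothetical.

```python
import numpy as np

def chronological_split(n_samples, train_fraction=0.75):
    """First 75% of the time-ordered samples for training, the rest for testing."""
    cut = int(n_samples * train_fraction)
    return np.arange(cut), np.arange(cut, n_samples)

def expanding_window_folds(n_train, n_folds=5):
    """Yield (train_idx, val_idx) pairs in which validation always follows training."""
    block = n_train // (n_folds + 1)
    for k in range(1, n_folds + 1):
        train_idx = np.arange(0, k * block)
        # The last fold absorbs any remainder so that no sample is dropped
        stop = (k + 1) * block if k < n_folds else n_train
        yield train_idx, np.arange(k * block, stop)

train_idx, test_idx = chronological_split(1200)
folds = list(expanding_window_folds(len(train_idx)))
```

With 1200 samples, indices 0 to 899 form the training period and 900 to 1199 the test period; each of the five folds then trains on a growing prefix and validates on the block that immediately follows it, so no future sample informs a past prediction.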
The improvements in prediction error did not reach statistical significance at the 95% confidence level, which is attributed to the high variability and nonstationary nature of the illuminance data; this introduces substantial noise into the prediction task. Despite this, consistent reductions in RMSE and MAE across multiple folds indicate improved model stability rather than statistically dominant performance.
Model accuracy was evaluated using three performance metrics: RMSE, MAE, and $R^2$ [27]:
$$R^2 = 1 - \frac{\sum_i (y_i - \hat{y}_i)^2}{\sum_i (y_i - \bar{y})^2}$$
RMSE and MAE quantify the magnitude of the prediction error, while $R^2$ indicates the proportion of variance explained by the model. Together, these metrics assess both the statistical reliability and practical applicability of the hybrid wavelet–ML framework.
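The three metrics can be computed directly from their definitions, as in the following minimal sketch. MAE is taken in its standard form (mean absolute residual), since the source does not write it out, and the sample values are made up for illustration.

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """Return RMSE, MAE, and R^2 for paired observations and predictions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    resid = y_true - y_pred
    ss_res = float(np.sum(resid ** 2))                       # residual sum of squares
    ss_tot = float(np.sum((y_true - y_true.mean()) ** 2))    # total sum of squares
    return {
        "rmse": float(np.sqrt(np.mean(resid ** 2))),
        "mae": float(np.mean(np.abs(resid))),
        "r2": 1.0 - ss_res / ss_tot,
    }

metrics = regression_metrics([10.0, 20.0, 30.0, 40.0], [12.0, 18.0, 33.0, 39.0])
```

For these values the residuals are −2, 2, −3, and 1, giving an RMSE of √4.5 ≈ 2.12, an MAE of 2.0, and R² = 1 − 18/500 = 0.964.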
The detailed process of transforming the indoor illuminance signal into wavelet-derived features is illustrated in Figure 3, where CWT coefficients are used to compute scale-wise energy, relative wavelet energy, and wavelet entropy before forming the final feature vector.
Table 2 summarizes the ranked importance of input variables within the RF model, integrating both statistical relevance and physical interpretability. The results show that wavelet-domain descriptors, particularly low-frequency RWE components and wavelet entropy, have a strong influence on the predictive structure, emphasizing the role of temporal stability and illumination variability in determining energy demand. Physical-domain and contextual variables, such as the indoor–outdoor energy ratio and seasonal indicators, also contribute, but with lower relative influence. This ranking indicates the predictive relevance of wavelet-derived features and enhances the interpretability of the hybrid Wavelet–RF framework by linking mathematical descriptors to their corresponding illumination behavior.

4.4. Model Evaluation and Interpretability

Model interpretability is essential for understanding the contribution of physical and wavelet-domain variables. Feature importance and SHAP values are computed to identify which features most strongly influence predictions. It should be noted that certain wavelet-based features, particularly adjacent RWE components, may exhibit high correlation due to their overlapping scale characteristics. While Random Forest models are relatively robust to multicollinearity, this correlation may influence the distribution of feature importance and SHAP values across similar variables. Therefore, the interpretation of feature contributions should consider potential redundancy among correlated inputs.
To further interpret the model’s behavior, partial dependence plots (PDPs) were generated to visualize how variations in the most influential features affect the predicted energy consumption.
The results show that low-frequency RWE components and WE have the highest importance, indicating that stable illumination (low entropy) reduces the need for artificial lighting. This finding also aids interpretation, because it reveals how physical variability translates into energy-saving potential. The connection highlights the physical consistency of the hybrid Wavelet–RF framework and suggests that the model captures both statistical and physical dependencies within the dataset.
Figure 4 presents the PDPs generated from the RF regression model, illustrating how each wavelet-derived feature, RWE1, RWE2, RWE3, and WE, affects the predicted indoor energy consumption. Each subplot shows the isolated effect of a single variable while averaging out the others, clarifying how variations in wavelet energy components and entropy contribute to changes in the model output.
The nonlinear patterns observed reveal that the relationship between wavelet-domain features and energy demand is strongly context-dependent. Sharp transitions in RWE values correspond to shifts between stable and dynamic lighting conditions, whereas increases in wavelet entropy indicate higher signal irregularity, typically linked to increased artificial lighting demand. These plots enhance the interpretability of the hybrid Wavelet–RF framework by linking signal-level descriptors to their physical implications in daylight-driven lighting control.
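The partial dependence computation described above can be sketched in a few lines. This is a minimal illustration, not the author's code: `toy_predict` is a hypothetical stand-in for the fitted Random Forest, and the two columns are only labeled as RWE1 and WE for readability.

```python
import numpy as np

def partial_dependence_1d(predict, X, feature_idx, grid):
    """Average prediction when one feature is forced to each grid value."""
    X = np.asarray(X, dtype=float)
    averages = []
    for value in grid:
        X_mod = X.copy()
        X_mod[:, feature_idx] = value            # overwrite one feature for every sample
        averages.append(predict(X_mod).mean())   # average out the remaining features
    return np.array(averages)

# Hypothetical stand-in for the fitted model (assumption, not the paper's RF):
# predicted demand falls with column 0 ("RWE1") and rises with column 1 ("WE").
def toy_predict(X):
    return 10.0 - 4.0 * X[:, 0] + 6.0 * X[:, 1] ** 2

rng = np.random.default_rng(0)
X_demo = rng.uniform(0.0, 1.0, size=(200, 2))    # synthetic feature matrix
we_grid = np.linspace(0.0, 1.0, 5)
pdp_we = partial_dependence_1d(toy_predict, X_demo, feature_idx=1, grid=we_grid)
print(dict(zip(we_grid.round(2), pdp_we.round(3))))
```

Because the other features are averaged out, a PDP shows the marginal effect only. With correlated inputs such as adjacent RWE components, some grid values may be combined with feature values that rarely occur together in the data.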

4.5. Integration with Stability Analysis

To link the machine learning framework with physical stability analysis, wavelet parameters are interpreted through a Lyapunov-inspired energy formulation [27]:
$$V(t) = \frac{1}{2}\,[\Delta E(t)]^{2}, \qquad \dot{V}(t) = k\,\mathrm{WE}(t)$$
A negative value of $\dot{V}(t)$ implies asymptotic stability, meaning that the illumination error converges to zero after transient disturbances. This analysis bridges data-driven entropy metrics with classical control theory, reinforcing the physical validity of the proposed hybrid modeling framework.
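A short numerical sketch may help make the formulation concrete. It assumes that the illumination error is the difference between indoor illuminance and the 500 lux target from Table 1, and it uses a hypothetical negative constant k, since the value of k is not stated here. Wavelet entropy is a Shannon-type entropy and therefore non-negative, so the sign of the derivative term is set entirely by the sign of k.

```python
TARGET_LUX = 500.0   # target illuminance E_t (Table 1)
K_GAIN = -0.8        # hypothetical constant k; its value is not given in the text

def lyapunov_terms(indoor_lux, wavelet_entropy, k=K_GAIN, target=TARGET_LUX):
    """Return (V, dV/dt) for one time step under the stated assumptions."""
    delta_e = indoor_lux - target        # assumed definition of the illumination error
    v = 0.5 * delta_e ** 2               # V(t) = 1/2 [Delta E(t)]^2
    v_dot = k * wavelet_entropy          # dV/dt = k * WE(t)
    return v, v_dot

# Illustrative (indoor illuminance, wavelet entropy) pairs, not measured data.
samples = [(430.0, 0.62), (480.0, 0.35), (505.0, 0.10)]
results = [lyapunov_terms(e, we) for e, we in samples]
for (e, we), (v, v_dot) in zip(samples, results):
    print(f"E_m={e:6.1f}  WE={we:.2f}  V={v:8.1f}  dV/dt={v_dot:+.3f}")
```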

5. Results and Discussion

This section evaluates the framework using time–frequency representations, entropy-based stability, and predictive modeling to link light transmission with energy demand and control response.
Figure 5 presents the three-dimensional CWT scalogram of the indoor illuminance time series. The horizontal axis represents time progression in minutes, while the vertical axis corresponds to the scale (frequency-related) components. The color scale indicates the magnitude of the wavelet coefficients, reflecting the distribution of signal energy across different time–frequency regions. The CWT was computed using a predefined scale range to capture both low-frequency (long-term trends) and high-frequency (short-term variations) components of the illumination signal. In this representation, regions with higher coefficient magnitudes can be interpreted as areas of increased energy concentration, associated with variations in daylight conditions and light transmission through the system. Lower-scale (higher-frequency) regions are associated with short-term fluctuations, whereas higher-scale (lower-frequency) regions correspond to gradual changes in daylight conditions. These observations are consistent with the nonstationary nature of the illuminance signal and support the use of multi-scale analysis for characterizing illumination dynamics in tubular light pipe systems.
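The step from a scalogram to the scale-wise energy, relative wavelet energy, and wavelet entropy features can be sketched as follows. This is an illustration under several assumptions: a Ricker (Mexican hat) wavelet implemented by direct convolution stands in for whichever mother wavelet was actually used, the scale range 1 to 30 and the split into three equal bands are hypothetical, and the input is a synthetic one-minute illuminance trace.

```python
import numpy as np

def ricker(points, a):
    """Ricker (Mexican hat) wavelet sampled at `points` positions, width `a`."""
    t = np.arange(points) - (points - 1) / 2.0
    amp = 2.0 / (np.sqrt(3.0 * a) * np.pi ** 0.25)
    return amp * (1.0 - (t / a) ** 2) * np.exp(-(t ** 2) / (2.0 * a ** 2))

def cwt_ricker(signal, scales):
    """Convolution-based CWT; returns an array of shape (n_scales, n_samples)."""
    signal = np.asarray(signal, dtype=float)
    rows = []
    for a in scales:
        width = int(min(10 * a, len(signal)))   # truncate the kernel support
        rows.append(np.convolve(signal, ricker(width, a), mode="same"))
    return np.vstack(rows)

def wavelet_features(coeffs, n_bands=3):
    """Band-wise relative wavelet energy and the Shannon wavelet entropy."""
    scale_energy = np.sum(np.abs(coeffs) ** 2, axis=1)   # energy per scale
    band_energy = np.array([b.sum() for b in np.array_split(scale_energy, n_bands)])
    rwe = band_energy / band_energy.sum()                # sums to one
    nonzero = rwe[rwe > 0]
    we = float(-np.sum(nonzero * np.log(nonzero)))       # entropy of the RWE distribution
    return rwe, we

# Synthetic trace: slow daylight trend plus fast fluctuations (not measured data).
rng = np.random.default_rng(3)
minutes = np.arange(720)
lux = 400.0 + 150.0 * np.sin(2 * np.pi * minutes / 720) + 20.0 * rng.normal(size=720)

scales = np.arange(1, 31)                                # hypothetical scale range
coeffs = cwt_ricker(lux - lux.mean(), scales)            # remove the mean to limit edge effects
rwe, we = wavelet_features(coeffs)
print("RWE per band (small to large scales):", rwe.round(3), " WE:", round(we, 3))
```

With three bands the entropy lies between 0 (all energy in one band) and ln 3 (energy spread evenly), which gives a simple scale for reading the WE values.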
Figure 6 illustrates the relative contribution of each input feature to the Random Forest model’s predictive performance. The three Relative Wavelet Energy components (RWE1, RWE2, and RWE3) collectively account for most of the explained variance, indicating that illumination dynamics are primarily governed by multi-scale energy components derived from wavelet decomposition.
The WE feature represents the system’s degree of irregularity and stability, enabling the model to capture irregularities in natural lighting transitions more effectively. Although less influential, the Month variable reflects seasonal variations in daylight availability and supports long-term prediction. This analysis indicates that wavelet-domain descriptors are both statistically informative and physically meaningful, and that the proposed model achieves the lowest error among the compared models while preserving interpretability.
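For readers who want to reproduce this kind of ranking, the sketch below fits a scikit-learn `RandomForestRegressor` on synthetic data and reads the impurity-based (MDI) importances from `feature_importances_`. The data-generating rule, the hyperparameters, and the feature values are assumptions made for illustration only; they are not the paper's dataset or settings.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(42)
n = 500
feature_names = ["RWE1", "RWE2", "WE", "Month"]

# Synthetic features (assumption): three values in [0, 1] and a month index 1..5.
X = np.column_stack([
    rng.uniform(0, 1, n),
    rng.uniform(0, 1, n),
    rng.uniform(0, 1, n),
    rng.integers(1, 6, n),
])
# Hypothetical target: demand driven mainly by RWE1, weakly by WE, plus noise.
y = 20.0 - 12.0 * X[:, 0] + 3.0 * X[:, 2] + rng.normal(0, 0.5, n)

model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X, y)

# Impurity-based importances are normalized to sum to one.
mdi = dict(zip(feature_names, model.feature_importances_))
ranking = sorted(mdi, key=mdi.get, reverse=True)
for name in ranking:
    print(f"{name:6s} {mdi[name]:.3f}")
```

When two inputs are strongly correlated, MDI tends to split the credit between them, so a ranking of this kind is best read together with the correlation matrix of the features.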
Table 3 compares the predictive performance of three models: Linear Regression, ARIMA, and the proposed hybrid Wavelet + Random Forest (RF) approach. The Wavelet + RF model achieves the lowest error metrics, with an RMSE of 7.84 and an MAE of 7.10, compared with RMSE values of 9.39 for Linear Regression and 8.28 for ARIMA. The Linear and ARIMA models exhibit higher errors, particularly under fluctuating daylight conditions, highlighting their limited ability to handle non-stationarity. All three models yield negative R² values (−0.69, −0.32, and −0.18, respectively), meaning that each model's squared error exceeds that of a constant prediction at the mean of the observed values. This behavior is attributed primarily to strong temporal variability and external environmental fluctuations, which make accurate prediction challenging under highly dynamic daylight conditions, and it reflects the nonstationary nature of the dataset rather than a limitation specific to the proposed model. Within these limits, the hybrid Wavelet + RF model exhibits greater robustness by integrating multi-scale wavelet features that represent the system’s inherent dynamics. This comparison indicates that the proposed model provides more stable and physically meaningful predictions, bridging the gap between data-driven learning and energy system interpretability.
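The three metrics in Table 3 follow the standard definitions, which the sketch below implements with toy numbers rather than the study's data. It also shows how R² becomes negative: this happens whenever the residual sum of squares exceeds the total sum of squares around the mean of the observations.

```python
import math

def regression_metrics(y_true, y_pred):
    """Return (RMSE, MAE, R^2) using the standard definitions."""
    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(r * r for r in residuals)              # residual sum of squares
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # spread around the mean
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(r) for r in residuals) / n
    r2 = 1.0 - ss_res / ss_tot                          # negative when ss_res > ss_tot
    return rmse, mae, r2

# Toy example: a constant prediction that misses the mean of the observations.
y_true = [1.0, 2.0, 3.0]
y_pred = [3.0, 3.0, 3.0]
rmse, mae, r2 = regression_metrics(y_true, y_pred)
print(f"RMSE={rmse:.3f}  MAE={mae:.3f}  R2={r2:.2f}")
```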
Figure 7 presents the comparison between observed and predicted energy consumption profiles, along with the control response generated by the hybrid Wavelet–Random Forest model. The black line represents the measured energy consumption, while the dashed blue line indicates the model’s predicted response based on illumination and wavelet-domain features. Red circular markers denote the control activation points, corresponding to moments when artificial lighting was engaged to maintain the desired indoor illuminance threshold. The close alignment between the observed and predicted profiles demonstrates the model’s ability to capture the nonlinear relationship between daylight availability and energy use. This figure illustrates how the proposed hybrid framework effectively mirrors the lighting system’s actual control strategy, enabling accurate prediction of energy demand under dynamic daylight conditions. Overall, this result validates the practical applicability of the Wavelet–ML approach for adaptive lighting control and real-time energy optimization in hybrid daylighting systems.
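The control activation points can be illustrated with a simple on/off rule. The sketch assumes that the LED array switches on at full power whenever indoor illuminance falls below the 500 lux target, that full power is the target illuminance times a hypothetical floor area divided by the 90 lm/W efficacy from Table 1, and that the energy-saving ratio is measured against LEDs that are always on. The floor area, the illuminance trace, and this ESR definition are assumptions for illustration and may differ from the study's actual control logic.

```python
TARGET_LUX = 500.0         # target illuminance E_t (Table 1)
EFFICACY_LM_PER_W = 90.0   # LED luminous efficacy (Table 1)
ROOM_AREA_M2 = 12.0        # hypothetical floor area, not given in the text

def threshold_control(indoor_lux_per_minute):
    """On/off control over one-minute samples; returns (states, energy in Wh, ESR)."""
    # lux = lm/m^2, so target * area gives the required luminous flux in lumens.
    full_power_w = TARGET_LUX * ROOM_AREA_M2 / EFFICACY_LM_PER_W
    led_on = [lux < TARGET_LUX for lux in indoor_lux_per_minute]
    energy_wh = sum(full_power_w / 60.0 for on in led_on if on)
    baseline_wh = full_power_w / 60.0 * len(led_on)     # LEDs on the whole time
    esr = 1.0 - energy_wh / baseline_wh                 # assumed ESR definition
    return led_on, energy_wh, esr

# Illustrative indoor illuminance trace in lux (not measured data).
lux_trace = [320.0, 480.0, 510.0, 650.0, 700.0, 540.0, 490.0, 300.0]
led_on, energy_wh, esr = threshold_control(lux_trace)
print(led_on, round(energy_wh, 3), round(esr, 2))
```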
Figure 8 illustrates the sensitivity relationship between the ESR, WE, and RWE1 derived from the hybrid daylight–artificial lighting control system. The 3D surface reveals a clear inverse correlation between WE and ESR: as Wavelet Entropy increases, indicating greater instability and irregularity in daylight transmission, the system’s energy-saving potential decreases. Conversely, higher RWE1 values, representing stronger low-frequency optical components, correspond to more stable lighting conditions and improved energy efficiency. This analysis provides physical insight into how the system’s nonlinear optical dynamics directly influence its energy performance. Within the context of this study, the figure confirms that the proposed Wavelet-based model not only effectively predicts energy demand but also explains the underlying physical mechanism linking illumination stability and energy savings, thereby reinforcing the interpretability and practical relevance of the framework.

Statistical Analysis and Model Comparison

To verify whether the improvement achieved by the hybrid Wavelet–Random Forest (Wavelet–RF) model over the Linear Regression baseline was statistically significant, a paired t-test and a Wilcoxon signed-rank test were conducted on the model residuals. The resulting p-values (0.5001 and 0.8384, respectively) were both greater than 0.05, indicating that the difference in prediction errors between the two models was not statistically significant at the 5% significance level. Despite this, the Wavelet–RF model produced lower RMSE and MAE values than the baseline (Table 3), which points to a modest practical improvement under dynamic daylight conditions. The numerical gain is therefore not large enough to reach statistical significance, and it should be read in light of the measurement noise and environmental variability that inherently limit deterministic performance gains in real-world operation. Within these limits, the hybrid model shows better adaptability to fluctuating illumination dynamics, supporting its practical applicability in intelligent lighting control systems.
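A comparison of this kind can be run with SciPy's `ttest_rel` and `wilcoxon` functions, as sketched below. The predictions are synthetic, and the choice to compare per-sample absolute errors is an assumption, since the text only states that the tests were applied to the model residuals.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
n = 60

# Synthetic observations and two sets of predictions (not the study's data).
y_true = rng.normal(50.0, 8.0, size=n)
pred_linear = y_true + rng.normal(0.0, 9.0, size=n)
pred_rf = y_true + rng.normal(0.0, 8.0, size=n)

# Paired samples: absolute error of each model on the same observations.
abs_err_linear = np.abs(y_true - pred_linear)
abs_err_rf = np.abs(y_true - pred_rf)

t_result = stats.ttest_rel(abs_err_linear, abs_err_rf)   # paired t-test
w_result = stats.wilcoxon(abs_err_linear, abs_err_rf)    # Wilcoxon signed-rank test

alpha = 0.05
significant = (t_result.pvalue < alpha, w_result.pvalue < alpha)
print(f"paired t: p={t_result.pvalue:.4f}  Wilcoxon: p={w_result.pvalue:.4f}")
print("significant at 5%:", significant)
```

The paired t-test assumes roughly normal differences, while the Wilcoxon test only assumes that the differences are symmetric about their median, which is why the two are often reported together.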
Figure 9 compares the variation in WE between clear (sunny) and cloudy days to assess the impact of outdoor illumination fluctuations on indoor lighting stability. The results show that sunny periods, represented by yellow bars, exhibit lower WE variability, indicating a more stable illuminance profile. In contrast, cloudy conditions, shown in blue, lead to higher entropy dispersion, reflecting increased dynamical instability caused by fluctuating daylight input. This comparison demonstrates that wavelet entropy effectively captures the system’s sensitivity to atmospheric variability, linking environmental factors with the nonlinear stability of the daylight–artificial lighting interaction.
Figure 10 illustrates the monthly variations in mean electrical energy demand (blue bars) and mean wavelet entropy (orange line), highlighting the seasonal relationship between lighting energy usage and system stability. Higher wavelet entropy values correspond to periods of greater daylight variability, indicating reduced stability and increased demand for artificial lighting. Conversely, months with lower entropy exhibit smoother illuminance behavior and improved daylight utilization. This relationship emphasizes how environmental dynamics influence the hybrid lighting system’s efficiency and stability across different months.
Figure 11 presents the Lyapunov-inspired sensitivity function $\dot{V}(t)$ over time, derived from wavelet entropy-based energy variations in the hybrid lighting system. The negative values of $\dot{V}(t)$ throughout the observation period indicate that the system’s stability function $V(t)$ consistently decreases, which is consistent with asymptotic stability in the illuminance dynamics. In physical terms, this suggests that the light pipe–based illumination system returns to equilibrium after external disturbances, maintaining consistent performance under fluctuating daylight conditions. The Lyapunov-inspired approach thus offers a theoretical interpretation of the proposed model’s robustness, bridging the gap between energy-based signal behavior and stability-oriented system interpretation.
Figure 12 illustrates the correlation relationships among the primary variables used in this study, including outdoor illuminance $E_o$, indoor illuminance $E_m$, relative wavelet energies (RWE1–RWE3), WE, and electrical energy consumption $E_{elec}$. The matrix reveals several key dependencies. A strong positive correlation between $E_o$ and $E_m$ supports the consistency of the experimental lighting transfer system. In addition, the high correlations among RWE components indicate coherent multi-scale energy behavior within the wavelet domain. The negative correlation between WE and the RWE variables indicates an inverse relationship between system entropy and structured energy concentration, suggesting that higher entropy corresponds to more irregular lighting conditions. Although the correlation between WE and $E_{elec}$ is relatively weak, it suggests that increased disorder in illuminance patterns may slightly elevate the energy demand due to compensatory artificial lighting activation. Overall, this matrix provides a quantitative overview of the interconnections between optical, wavelet-based, and energy-related parameters, serving as a statistical foundation for the machine learning and stability analyses in this paper.
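A correlation matrix of this kind can be computed with NumPy's `corrcoef`, as in the following sketch. The three series are synthetic, and the transfer ratio linking indoor to outdoor illuminance is a hypothetical value chosen only to produce a strong positive correlation.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 300

# Synthetic series (assumption): outdoor lux, indoor lux, and wavelet entropy.
e_out = rng.uniform(2000.0, 60000.0, n)
e_in = 0.02 * e_out + rng.normal(0.0, 50.0, n)   # hypothetical transfer ratio of 2%
we = rng.uniform(0.0, 1.0, n)                    # independent of the other two here

names = ["E_o", "E_m", "WE"]
corr = np.corrcoef(np.vstack([e_out, e_in, we])) # rows are variables, Pearson r

for name, row in zip(names, corr):
    print(f"{name:4s}", " ".join(f"{value:+.2f}" for value in row))
```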

6. Conclusions

This study presents a hybrid Wavelet–Machine Learning framework for predicting electrical energy consumption and analyzing nonlinear illumination dynamics in a TLP system under real daylight conditions. By integrating physical measurements, time–frequency analysis, and interpretable machine learning, the proposed approach improves predictive performance compared to Linear Regression and ARIMA models. The Random Forest model achieves the lowest RMSE and MAE, demonstrating its effectiveness in capturing multiscale illumination patterns and their relationship with electrical energy demand. The results indicate that low wavelet entropy and dominant low-frequency components are associated with more stable illumination conditions and higher energy-saving potential. Model interpretability, supported by feature importance analysis and SHAP values, provides insight into the relationship between illumination dynamics and energy consumption. A Lyapunov-inspired formulation is used as an interpretative tool to relate entropy-based measures to system behavior, suggesting that reduced entropy is associated with more stable operating conditions.
Future work will focus on real-time adaptive control strategies and the incorporation of additional environmental and occupancy-related variables to enhance prediction accuracy. The proposed framework provides a practical and interpretable approach for modeling daylight-driven energy demand in building systems.

Funding

This research received no external funding.

Data Availability Statement

The datasets generated during and/or analyzed during the current study are available from the corresponding author on reasonable request.

Conflicts of Interest

The author declares no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
ARIMA | Autoregressive Integrated Moving Average
CWT | Continuous Wavelet Transform
ESR | Energy-Saving Ratio
GB | Gradient Boosting
LED | Light-Emitting Diode
LSTM | Long Short-Term Memory
MAE | Mean Absolute Error
MDI | Mean Decrease in Impurity
ML | Machine Learning
MSE | Mean Squared Error
PDP | Partial Dependence Plot
RF | Random Forest
RMSE | Root Mean Square Error
RTC | Real-Time Clock
RWE | Relative Wavelet Energy
SHAP | SHapley Additive exPlanations
TLP | Tubular Light Pipe
TSMS | Turkish State Meteorological Service
WE | Wavelet Entropy

References

  1. Song, J.; Dessie, B.B.; Gao, L. Analysis and comparison of daylighting technologies: Light pipe, optical fiber, and heliostat. Sustainability 2023, 15, 11044.
  2. Onubogu, N.O.; Chong, K.-K.; Tan, M.-H. Review of active and passive daylighting technologies for sustainable building. Int. J. Photoenergy 2021, 2021, 8802691.
  3. Li, H.; Wu, D.; Yuan, Y.; Zuo, L. Evaluation methods of the daylight performance and potential energy saving of tubular daylight guide systems: A review. Indoor Built Environ. 2022, 31, 299–315.
  4. Petržala, J.; Kómar, L. Analytical prediction of tubular light-pipe performance under arbitrary sky conditions. J. Sol. Energy Eng. 2019, 141, 051012.
  5. Carpentieri, A.; Folini, D.; Nerini, D.; Pulkkinen, S.; Wild, M.; Meyer, A. Intraday probabilistic forecasts of surface solar radiation. Appl. Energy 2023, 351, 121775.
  6. Chien, Y.-R.; Zhou, M.; Peng, A.; Zhu, N.; Torres-Sospedra, J. Signal processing and machine learning for smart sensing applications. Sensors 2023, 23, 1445.
  7. Dwivedi, D.; Chamoli, A.; Rana, S.K. Wavelet entropy: A new tool for edge detection of potential field data. Entropy 2023, 25, 240.
  8. Xu, H.; Lei, B.; Li, Z. A reconstruction of total solar irradiance based on wavelet analysis. Earth Space Sci. 2021, 8, e2021EA001819.
  9. Aslan, Z.; Topçu, H.S.; Barutçu, B.; İncecik, S.; Aksoy, B.; Sakarya, S. Analyses of variations in solar irradiation based on wavelet technique. In Proceedings of the IWW 2013, Valencia, Spain, 5–6 September 2013.
  10. Xue, H.; Li, G.; Qi, D.; Ni, H. Temporal evolution, oscillation and coherence characteristics of global solar radiation based on CWT. Appl. Sci. 2024, 14, 4794.
  11. Huang, X.; Shi, J.; Gao, B.; Tai, Y.; Chen, Z.; Zhang, J. Forecasting hourly solar irradiance using hybrid wavelet transformation and Elman model. IEEE Access 2019, 7, 139909–139923.
  12. He, X.; Zhou, C.; Zhang, J.; Yuan, X. Using wavelet transforms to fuse nighttime light data and POI big data. Remote Sens. 2020, 12, 3887.
  13. Bian, Y.; Zhou, Y.; Yang, S.; Lin, D.; Ma, Y. Using machine learning to predict lighting energy consumption from spatial daylight autonomy. Energy Build. 2025, 341, 115847.
  14. Li, Q.; Haberl, J. Prediction of annual daylighting performance using inverse models. Sustainability 2023, 15, 11938.
  15. Mashaly, I.; El-Hussainy, M.; Sherif, A.; Tarabieh, K. Daylighting performance prediction tool using machine learning. J. Build. Eng. 2025, 111, 113496.
  16. Ansong, M.; Huang, G.; Nyang’onda, T.N.; Musembi, R.J.; Richards, B.S. Very short-term solar irradiance forecasting using sky imager and deep learning. Sol. Energy 2025, 294, 113516.
  17. Maltais, L.-G.; Gosselin, L. Forecasting of short-term lighting and plug load electricity consumption. Appl. Energy 2022, 307, 118229.
  18. Leal, A.F.R.; Matos, W.L.N. Short-term lightning prediction using machine learning. In Proceedings of the 36th ICLP 2022, Cape Town, South Africa, 2–7 October 2022.
  19. Aslanoğlu, R.; Pracki, P.; Kazak, J.K.; Ulusoy, B.; Yekanialibeiglou, S. Short-term analysis of residential lighting. Build. Environ. 2021, 196, 107781.
  20. Mahmoudzadeh, P.; Hu, W.; Davis, W.; Durmuş, D. Spatial efficiency: An outset of lighting application efficacy for indoor lighting. Build. Environ. 2024, 255, 111409.
  21. Collins, S. Integrating short-term variations of the power system into integrated energy system models: A methodological review. Renew. Sustain. Energy Rev. 2017, 76, 839–856.
  22. Rhif, M.; Ben Abbes, A.; Farah, I.R.; Martínez, B.; Sang, Y. Wavelet transform application for non-stationary time-series analysis: A review. Appl. Sci. 2019, 9, 1345.
  23. Gholipour, A.; Marufkhani, H.; Khosravi, M.A. Enhanced constrained optimal sliding mode control for cable-driven parallel robots. Eur. J. Control. 2025, 85, 101282.
  24. Khoshvaght, H.; Permala, R.R.; Razmjou, A.; Khiadani, M. A critical review on selecting performance evaluation metrics for ML models. J. Environ. Chem. Eng. 2025, 13, 119675.
  25. Igel, C.; Oehmcke, S. Remember to correct the bias when using deep learning for regression! Künstliche Intelligenz 2023, 37, 33–40.
  26. Mboga, J.K.; Ogutu, C.A. Comparative analysis of cross-validation techniques in machine learning models. Am. J. Theor. Appl. Stat. 2024, 13, 127–137.
  27. Chen, Q.; Qi, J. How much should we trust R2 and adjusted R2 evidence from regressions in top economics journals and Monte Carlo simulations. J. Appl. Econ. 2023, 26, 2207326.
Figure 1. Schematic diagram of the tubular light pipe experimental setup.
Figure 2. End-to-end analysis pipeline for the wavelet–machine learning framework.
Figure 3. Wavelet-based feature extraction and multi-domain feature vector construction.
Figure 4. Partial dependence plots of wavelet-derived features in the random forest model.
Figure 5. 3D CWT scalogram of indoor illuminance.
Figure 6. Feature importance analysis of the random forest model.
Figure 7. Predicted energy consumption profile and control response.
Figure 8. Energy savings sensitivity map between wavelet entropy (WE) and relative wavelet energy (RWE1).
Figure 9. Distribution of wavelet entropy (WE) under clear and cloudy sky conditions.
Figure 10. Monthly averaged energy consumption and wavelet entropy trends.
Figure 11. Lyapunov-inspired sensitivity analysis of system stability.
Figure 12. Correlation matrix between input and output variables.
Table 1. Summary of experimental measurement system.
Parameter | Specification
Location | Istanbul, Turkey (41.01° N, 28.97° E)
Measurement period | January–May (5 months, 2025)
Light pipe diameter | 300 mm (reflectivity ≈ 0.85)
Pipe length | 2.4 m (vertical axis)
Indoor sensor | Digital luxmeter (0–20,000 lux, ±2%, factory-calibrated)
Outdoor sensor | Pyranometer-type lux sensor (direct/diffuse, factory-calibrated)
Sampling frequency | One sample per minute (synchronized via RTC, time-aligned)
Target illuminance (Et) | 500 lux
Artificial lighting system | LED array (4000 K, luminous efficacy η = 90 lm/W, threshold-based control)
Monitored variables | Indoor/outdoor illuminance, irradiance, temperature, humidity
Data volume | ≈200,000 samples per variable (1 min resolution)
Table 2. Feature importance ranking of physical and wavelet-domain variables in the random forest model.
Feature | Type | MDI * | SHAP Impact | Physical Interpretation
RWE1 | Wavelet | 0.298 | 0.276 | Low-scale (fast) variations affecting the lighting response
RWE2 | Wavelet | 0.214 | 0.196 | Medium-scale illumination stability indicator
WE | Wavelet | 0.187 | 0.204 | Entropy measure of illumination irregularity
Energy Ratio | Physical | 0.168 | 0.142 | Ratio of indoor to outdoor illuminance
Month | Contextual | 0.133 | 0.124 | Seasonal dependency affecting light control
* Mean Decrease in Impurity.
Table 3. Performance comparison of baseline and proposed models.
Model | RMSE | MAE | R²
Linear Regression | 9.39 | 7.49 | −0.69
ARIMA | 8.28 | 7.62 | −0.32
Wavelet + RF (Proposed) | 7.84 | 7.10 | −0.18
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Gorgulu, S. Wavelet Entropy and Machine Learning Analysis of Nonlinear Dynamics in Tubular Light Pipes. Electronics 2026, 15, 1474. https://doi.org/10.3390/electronics15071474
