Article

Deep Reinforcement Learning-Guided Inverse Design of Transparent Heat Mirror Film for Broadband Spectral Selectivity

School of Physics and Optoelectronics, Xiangtan University, Xiangtan 411105, China
*
Author to whom correspondence should be addressed.
Materials 2025, 18(12), 2677; https://doi.org/10.3390/ma18122677
Submission received: 22 April 2025 / Revised: 2 June 2025 / Accepted: 4 June 2025 / Published: 6 June 2025
(This article belongs to the Special Issue Machine Learning for Materials Design)

Abstract

With the increasing energy consumption of buildings, transparent heat mirror films have been widely used in building windows to enhance energy efficiency owing to their excellent spectrally selective properties. Previous studies have typically focused on spectral selectivity in the visible and near-infrared bands, as well as single-parameter optimization of film materials or thickness, without fully exploring the performance potential of the films. To address the limitations of traditional design methods, this paper proposes a deep reinforcement learning-based approach that employs an adaptive strategy network to optimize the thin-film material system and layer thickness parameters simultaneously. Through inverse design, a Ta2O5/Ag/Ta2O5/Ag/Ta2O5 (42 nm/22 nm/79 nm/22 nm/40 nm) thin-film structure with broadband spectral selectivity was obtained. The film exhibited an average reflectance of 75.5% in the ultraviolet band and 93.2% in the near-infrared band while maintaining an average visible transmittance of 87.0% and a mid- to far-infrared emissivity as low as 1.7%. Additionally, the film maintained excellent optical performance over a wide range of incident angles, making it suitable for use in complex lighting environments. Building energy simulations indicate that the film achieves a maximum energy-saving rate of 17.93% under the hot climatic conditions of Changsha and 16.81% in Guangzhou, demonstrating that the designed transparent heat mirror film provides a viable approach to reducing building energy consumption and holds significant potential for practical applications.

1. Introduction

The accelerating urbanization process has given rise to pressing challenges, including critical energy shortages and environmental degradation that have reached unprecedented levels of severity [1,2,3]. Building energy consumption accounts for more than 40% of global energy consumption, and this proportion is expected to continue to increase with the ongoing improvement in living standards [4,5,6]. Windows, the least thermally insulated component of the building envelope, contribute to approximately 60% of the building energy consumption [7,8,9,10,11]. The solar spectrum can be divided into ultraviolet (UV, 0.2–0.38 μm) light, visible (VIS, 0.38–0.78 μm) light, and near-infrared (NIR, 0.78–2.5 μm) light, which account for 3%, 43%, and 54% of the total solar energy, respectively [12,13,14]. Among these, the thermal effect of NIR radiation is the primary contributor to indoor heat gain, while UV radiation not only damages human skin but is also absorbed by materials, reducing their durability [15,16,17]. Therefore, it is essential to design a smart window with high transmittance in the visible band and high reflectance in the UV and NIR bands.
Transparent heat mirror (THM) films have demonstrated significant potential in the development of smart windows due to their high transmittance in the visible spectrum and outstanding reflectance in the infrared band. Early on, Fan et al. [18] pioneered the design of a TiO2/Ag/TiO2-based THM structure, laying the groundwork for the development of dielectric/metal/dielectric (D/M/D) multilayers. Since then, D/M/D configurations have garnered widespread attention owing to their superior optical properties. Dalapati et al. [19] investigated the TiO2/Cu/TiO2 structure, achieving an average visible transmittance of up to 90%, along with 85% NIR reflectance. Similarly, Sibin et al. [20] optimized the ITO/Ag/ITO structure by adjusting the thickness of each layer and etching the glass substrate, resulting in an average visible transmittance of 91% while maintaining high NIR reflectance. However, the D/M/D structure still exhibits certain limitations in spectrally selective modulation, and its design relies on parametric scans with repeated trial and error, which is both inefficient and costly. To address these challenges, researchers have incorporated intelligent algorithms to optimize multilayer dielectric/metal interlayer structures, further enhancing film performance. For example, Hong et al. [14] investigated the Si3N4/Ag/Si3N4/Ag/Si3N4 structure and optimized the thickness of each layer using a particle swarm optimization algorithm, achieving an average visible transmittance of 88.6% and an average NIR reflectance of 92%. However, previous studies have primarily focused on single-parameter optimization, such as the material choice or thickness, without fully exploring the potential for enhancing the films’ performance. Moreover, traditional intelligent algorithms often struggle with high-dimensional data, making the simultaneous optimization of both the material composition and thickness a significant challenge.
Machine learning is widely applied in material preparation, performance prediction, and structural design, offering new research perspectives [21,22]. It can autonomously extract features from data and outperforms traditional intelligent algorithms in handling large-scale and high-dimensional datasets [23,24,25]. Deep reinforcement learning (DRL) combines the feature extraction capability of deep learning with the decision-making ability of reinforcement learning [26,27,28] and has emerged as one of the most prominent research directions in machine learning [29].
In this paper, a deep reinforcement learning-based approach is employed to simultaneously optimize both the material system and thickness of the thin film using an adaptive strategy network, enabling the inverse design of film structures with broadband spectral selectivity. The optimized results are compared to obtain the film with the best comprehensive performance. The underlying mechanism responsible for spectral selectivity is investigated through an electromagnetic field distribution analysis. Furthermore, by examining the influences of the incident angle and polarization state on the optical properties of the film, it is demonstrated that its optical performance remains stable under varying incident conditions. Finally, building energy consumption simulations under the hot climate of Changsha and Guangzhou, China, further validate the film’s outstanding energy-saving capacity.

2. Methodology

2.1. Data Preparation and Preprocessing

An ideal transparent heat mirror film transmits all visible light while completely reflecting UV and NIR radiation, as illustrated in Figure 1a. To achieve this functionality, THM films typically utilize high-reflectivity precious metals such as Au, Ag, and Cu as intermediate metallic layers [30]. In recent years, TiN has also been considered due to its lower cost [31]. Additionally, dielectric materials serve as antireflective layers to minimize reflection losses in the visible spectrum. Commonly used dielectric materials include Al2O3, AlN, HfO2, ITO, Si3N4, SiO2, Ta2O5, TiO2, ZnO, ZnS, WO3, and AZO, among others.
In this paper, to facilitate material optimization using machine learning, we assigned integer codes to the selected materials, as shown in Table S1. Since some of the collected data contained missing values, we applied linear interpolation over the wavelength range of 0.28 μm to 2.5 μm using 500 evenly spaced interpolation points. This process standardized the wavelength data across different materials, providing a consistent dataset for subsequent optimization.
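The resampling step described above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the authors' code: the helper names `linspace` and `interp_linear` and the sparse refractive-index table are hypothetical, and the choice to clamp values outside the tabulated range is an assumption.

```python
from bisect import bisect_right


def linspace(start, stop, num):
    """Return `num` evenly spaced values from start to stop, inclusive."""
    step = (stop - start) / (num - 1)
    return [start + i * step for i in range(num)]


def interp_linear(x, xs, ys):
    """Piecewise-linear interpolation of the table (xs, ys) at x.

    xs must be strictly increasing. Values outside the table are clamped
    to the end points (an assumption; the paper does not say how the
    edges are handled).
    """
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    j = bisect_right(xs, x)                      # xs[j - 1] <= x < xs[j]
    w = (x - xs[j - 1]) / (xs[j] - xs[j - 1])
    return (1.0 - w) * ys[j - 1] + w * ys[j]


# Hypothetical sparse tabulation of a refractive index n(lambda), lambda in um.
raw_wl = [0.28, 0.40, 0.80, 1.50, 2.50]
raw_n = [2.60, 2.30, 2.10, 2.05, 2.00]

# Common grid: 500 evenly spaced points from 0.28 um to 2.5 um.
grid = linspace(0.28, 2.5, 500)
n_on_grid = [interp_linear(wl, raw_wl, raw_n) for wl in grid]
```

Applying the same grid to the refractive index and extinction coefficient of every material gives arrays of identical length, so the optical calculation can index all materials by the same wavelength position.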

2.2. Optimization Framework Design

In this study, a deep Q-network (DQN) framework (as shown in Figure 1b) was developed to optimize both the material system and thickness parameters of thin films. The material system, composed of a periodic stack of dielectric and metal layers, is represented by a state defined as $s = \{m_1, m_2, t_1, t_2, \ldots, t_n\}$, where $m_1$ and $m_2$ denote the material codes for the dielectric and metal layers, respectively, and $\{t_1, t_2, \ldots, t_n\}$ represent the thicknesses of the layers. The number of layers, $n$, is determined by the specific configuration of the film. The action set is defined as $\{m_1 + 1,\ m_1 - 1,\ m_2 + 1,\ m_2 - 1,\ t_1 + \Delta t,\ t_1 - \Delta t,\ \ldots,\ t_n + \Delta t,\ t_n - \Delta t\}$, where $\Delta t$ is fixed at 1 nm. To maximize the sum of the ultraviolet reflectance, visible transmittance, and near-infrared reflectance, the reward function is defined as follows:
$$r = \bar{R}_{\mathrm{UV}} + \bar{T}_{\mathrm{VIS}} + \bar{R}_{\mathrm{NIR}} \tag{1}$$
where $\bar{R}_{\mathrm{UV}}$, $\bar{T}_{\mathrm{VIS}}$, and $\bar{R}_{\mathrm{NIR}}$ represent the average ultraviolet reflectance, the average visible transmittance, and the average near-infrared reflectance, respectively.
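To make the state and action encoding concrete, the sketch below applies one action to a state tuple and evaluates the reward of Equation (1). It is illustrative only: the function names, the even/odd action indexing, the material-code ranges, and the thickness bounds `t_min` and `t_max` are assumptions that the text does not specify, and the material codes in the example are hypothetical rather than taken from Table S1.

```python
def apply_action(state, action, n_dielectrics, n_metals,
                 t_min=1, t_max=200, dt=1):
    """Return the state reached by taking `action` in `state`.

    state  : tuple (m1, m2, t1, ..., tn) of integers, thicknesses in nm
    action : integer in [0, 2 * len(state)); action // 2 selects the state
             entry, an even action increments it and an odd one decrements it
    Values are clamped to their allowed ranges (assumed behaviour).
    """
    idx, odd = divmod(action, 2)
    delta = -1 if odd else 1
    new = list(state)
    if idx == 0:      # dielectric material code m1
        new[0] = min(max(new[0] + delta, 0), n_dielectrics - 1)
    elif idx == 1:    # metal material code m2
        new[1] = min(max(new[1] + delta, 0), n_metals - 1)
    else:             # layer thickness, changed by +/- dt nm
        new[idx] = min(max(new[idx] + delta * dt, t_min), t_max)
    return tuple(new)


def reward(r_uv, t_vis, r_nir):
    """Reward of Equation (1): sum of the three band averages (as fractions)."""
    return r_uv + t_vis + r_nir


# Five-layer example with hypothetical material codes 6 (dielectric), 1 (metal).
state = (6, 1, 42, 22, 79, 22, 40)
thicker_first_layer = apply_action(state, 4, n_dielectrics=12, n_metals=4)
```

With this indexing a five-layer film has 7 state entries and 14 actions, one increment and one decrement per entry.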
Solar radiation in the ultraviolet (UV) range is typically divided into three regions: UVC (200–280 nm), UVB (280–315 nm), and UVA (315–380 nm). Among these, UVC is almost completely absorbed by the ozone layer, while UVA has the most significant impact on the human body, as it can penetrate the epidermis and reach the dermis [32]. Therefore, in this study, we focus on the UV spectrum within the range of 280–380 nm. The average reflectance and transmittance values, $\bar{R}_{\mathrm{UV}}$, $\bar{T}_{\mathrm{VIS}}$, and $\bar{R}_{\mathrm{NIR}}$, are calculated using the following equations [19,33]:
$$\bar{R}_{\mathrm{UV}} = \frac{\int_{0.28}^{0.38} R(\lambda)\, S(\lambda)\, \mathrm{d}\lambda}{\int_{0.28}^{0.38} S(\lambda)\, \mathrm{d}\lambda} \tag{2}$$
$$\bar{T}_{\mathrm{VIS}} = \frac{\int_{0.38}^{0.78} T(\lambda)\, V(\lambda)\, \mathrm{d}\lambda}{\int_{0.38}^{0.78} V(\lambda)\, \mathrm{d}\lambda} \tag{3}$$
$$\bar{R}_{\mathrm{NIR}} = \frac{\int_{0.78}^{2.5} R(\lambda)\, S(\lambda)\, \mathrm{d}\lambda}{\int_{0.78}^{2.5} S(\lambda)\, \mathrm{d}\lambda} \tag{4}$$
In Equations (2) and (4), $R(\lambda)$ represents the reflectance of the thin film, and $S(\lambda)$ denotes the solar spectral irradiance under AM1.5 atmospheric conditions, as illustrated in Figure 2a. In Equation (3), $T(\lambda)$ represents the transmittance of the thin film, while $V(\lambda)$ corresponds to the photopic luminosity function of the human eye, as shown in Figure 2b. The function $V(\lambda)$ is zero outside the visible spectrum and reaches a peak value of 1 at a wavelength of 0.55 μm, indicating the highest sensitivity of the human eye at this wavelength. The integration limits are given in μm. The transmittance $T(\lambda)$ and reflectance $R(\lambda)$ of the film are calculated using the transfer matrix method (TMM), which is one of the most commonly used methods for analyzing the optical properties of multilayer films based on the refractive index ($n$), extinction coefficient ($k$), film thickness, and incident light wavelength [34].
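A compact sketch of the two calculations above follows: a normal-incidence TMM in the standard characteristic-matrix form, and a trapezoidal version of the weighted averages in Equations (2)–(4). It is not the authors' implementation. The complex index is written as N = n − ik, the glass substrate index of 1.5 is an assumed value, and the function names are hypothetical.

```python
import cmath
import math


def tmm_normal(wavelength_nm, layers, n_in=1.0, n_sub=1.5):
    """Reflectance and transmittance of a thin-film stack at normal incidence.

    layers : list of (n, k, thickness_nm), ordered from the incident side
    n_in   : real index of the incident medium (air by default)
    n_sub  : real index of the semi-infinite substrate (assumed value)
    """
    b, c = 1.0 + 0j, complex(n_sub)          # field vector at the substrate
    for n, k, d in reversed(layers):         # apply layer matrices outward
        N = complex(n, -k)                   # convention N = n - ik
        delta = 2.0 * math.pi * N * d / wavelength_nm
        cos_d, sin_d = cmath.cos(delta), cmath.sin(delta)
        b, c = (cos_d * b + 1j * sin_d / N * c,
                1j * N * sin_d * b + cos_d * c)
    denom = n_in * b + c
    R = abs((n_in * b - c) / denom) ** 2
    T = 4.0 * n_in * n_sub / abs(denom) ** 2
    return R, T


def band_average(wl, spectrum, weight):
    """Weighted mean of `spectrum` over the samples `wl` (trapezoidal rule).

    Pass only the samples inside the band of interest; `weight` is S(lambda)
    for Equations (2) and (4) and V(lambda) for Equation (3).
    """
    num = den = 0.0
    for i in range(len(wl) - 1):
        half_dw = 0.5 * (wl[i + 1] - wl[i])
        num += half_dw * (spectrum[i] * weight[i]
                          + spectrum[i + 1] * weight[i + 1])
        den += half_dw * (weight[i] + weight[i + 1])
    return num / den


# Bare air/glass interface: Fresnel reflectance of 4% as a sanity check.
R_bare, T_bare = tmm_normal(550.0, [])
```

For an absorbing layer the two returned values sum to less than one, and the remainder is the absorptance.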
According to the law of energy conservation and the law of thermal radiation, the spectral emissivity $\varepsilon(\lambda)$ of the film can be calculated from its reflectance $R(\lambda)$ and transmittance $T(\lambda)$ as follows [35,36]:
$$\varepsilon(\lambda) = A(\lambda) = 1 - T(\lambda) - R(\lambda) \tag{5}$$
where $A(\lambda)$ is the spectral absorptance. Therefore, the average emissivity of the film in the mid- to far-infrared range (2.5–25 μm) can be calculated using the following expression [37]:
$$\bar{\varepsilon} = \frac{\int_{2.5}^{25} \varepsilon(\lambda)\, B(\lambda, T)\, \mathrm{d}\lambda}{\int_{2.5}^{25} B(\lambda, T)\, \mathrm{d}\lambda} \tag{6}$$
where $B(\lambda, T)$ is the spectral radiance of a blackbody at temperature $T$ = 300 K given by Planck’s law [38].
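Equations (5) and (6) can be evaluated numerically as in the sketch below, which weights the spectral emissivity by the Planck radiance at 300 K and integrates with the trapezoidal rule. The function names and the uniform test spectrum are illustrative assumptions; the physical constants are the SI defining values.

```python
import math

H = 6.62607015e-34     # Planck constant, J s
C0 = 2.99792458e8      # speed of light in vacuum, m/s
KB = 1.380649e-23      # Boltzmann constant, J/K


def planck(wl_um, temp_k=300.0):
    """Blackbody spectral radiance B(lambda, T) in W m^-3 sr^-1."""
    wl = wl_um * 1e-6
    return (2.0 * H * C0 ** 2 / wl ** 5
            / math.expm1(H * C0 / (wl * KB * temp_k)))


def mean_emissivity(wl_um, reflectance, transmittance, temp_k=300.0):
    """Planck-weighted mean of eps = 1 - T - R over the samples wl_um."""
    eps = [1.0 - t - r for r, t in zip(reflectance, transmittance)]
    b = [planck(w, temp_k) for w in wl_um]
    num = den = 0.0
    for i in range(len(wl_um) - 1):
        half_dw = 0.5 * (wl_um[i + 1] - wl_um[i])
        num += half_dw * (eps[i] * b[i] + eps[i + 1] * b[i + 1])
        den += half_dw * (b[i] + b[i + 1])
    return num / den


# Hypothetical opaque film reflecting 98% everywhere from 2.5 um to 25 um.
wl = [2.5 + 0.05 * i for i in range(451)]
eps_bar = mean_emissivity(wl, [0.98] * len(wl), [0.0] * len(wl))
```

Because the 300 K blackbody curve peaks near 9.7 μm, reflectance in the 8–13 μm region dominates the weighted average.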

2.3. Model Training

The DQN model employs two deep neural networks: the main network and the target network. Both networks share the same architecture, which is a fully connected feedforward neural network comprising an input layer (with n input nodes representing the encoded thin-film structure), a single hidden layer with 64 neurons activated by the Tanh function, and an output layer with n output nodes corresponding to the discrete action space (i.e., modifications to the material type or layer thickness). The main network is responsible for computing the Q-values of different actions in the current state, and its weight parameters are continuously updated during training. In contrast, the target network is used to compute the Q-values of all possible actions in the next state. To maintain stability, the weights of the target network are synchronized with those of the main network every 2000 steps. This dual-network design effectively enhances the stability of the model and mitigates the risk of Q-value overestimation, which could otherwise lead to suboptimal policy choices [39].
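The network described above can be sketched in plain Python as a single hidden layer of 64 Tanh units, together with the periodic copy of the main weights into the target network. This is an illustration, not the authors' code. The layer sizes of 7 inputs and 14 outputs are inferred from the five-layer state and action definitions in Section 2.2 and are an assumption, as are the weight initialization and the scaling of the example state.

```python
import copy
import math
import random


class QNetwork:
    """Fully connected network: n_in -> 64 (Tanh) -> n_out Q-values."""

    def __init__(self, n_in, n_out, n_hidden=64, seed=0):
        rng = random.Random(seed)
        s1, s2 = 1.0 / math.sqrt(n_in), 1.0 / math.sqrt(n_hidden)
        self.w1 = [[rng.uniform(-s1, s1) for _ in range(n_in)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [[rng.uniform(-s2, s2) for _ in range(n_hidden)]
                   for _ in range(n_out)]
        self.b2 = [0.0] * n_out

    def forward(self, state):
        """Return one Q-value per action for the given state vector."""
        hidden = [math.tanh(sum(w * x for w, x in zip(row, state)) + b)
                  for row, b in zip(self.w1, self.b1)]
        return [sum(w * h for w, h in zip(row, hidden)) + b
                for row, b in zip(self.w2, self.b2)]


def sync_target(main, target, step, every=2000):
    """Return a fresh copy of `main` every `every` steps, else `target`."""
    return copy.deepcopy(main) if step % every == 0 else target


main_net = QNetwork(n_in=7, n_out=14)
target_net = copy.deepcopy(main_net)      # both start from identical weights

# Example state scaled to order one (the scaling itself is an assumption).
state = [0.5, 0.0, 0.42, 0.22, 0.79, 0.22, 0.40]
q_values = main_net.forward(state)
```

Between synchronizations the target network is frozen, so the regression targets do not shift with every gradient step on the main network.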
During the decision-making process, the agent perceives the current state from the environment and uses the main network to compute the Q-values of all possible actions. An action is then selected using an ε-greedy strategy: with probability ε, a random action is chosen (exploration), and with probability 1 − ε, the action with the highest Q-value is selected (exploitation). In this study, ε is adjusted dynamically during training so that the agent explores more in the early stage and increasingly exploits the learned policy in the late stage, which accelerates convergence toward the optimal policy.
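A minimal sketch of ε-greedy selection with a decaying ε is given below. The text does not state the schedule, so the exponential decay and its parameters (`eps_start`, `eps_end`, `decay`) are assumptions chosen only for illustration.

```python
import random


def epsilon_at(step, eps_start=1.0, eps_end=0.05, decay=0.999):
    """Exploration rate after `step` steps (assumed exponential schedule)."""
    return max(eps_end, eps_start * decay ** step)


def select_action(q_values, epsilon, rng=random):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))                      # explore
    return max(range(len(q_values)), key=q_values.__getitem__)   # exploit


greedy_choice = select_action([0.1, 0.9, 0.3], epsilon=0.0)
```

With ε = 0 the call always returns the index of the largest Q-value, and with ε = 1 every action is equally likely.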
The agent interacts with the environment by performing actions, which in turn update the environment's state and provide corresponding reward feedback. The agent then uses the updated state to compute new Q-values and decides whether to explore or exploit based on the updated ε. Through this continuous loop of interaction and learning, the agent incrementally improves its policy by acquiring new experience from the environment. Unlike supervised learning, DQN does not require pre-existing training data; instead, it learns by interacting with a TMM-based simulation environment that provides performance feedback for each candidate structure.
DQN employs an experience replay mechanism, where at each time step, the current state (s), action (a), reward (r), next state (s′), and a termination flag (done) are stored in the experience replay buffer. During training, a random batch of experiences is sampled from the buffer and used to update the main network. In this study, we set the experience buffer size to 20,000 and the batch size to 64.
Meanwhile, the target network computes the target Q-values for the next state in the sampled batch. The model uses the mean squared error (MSE) loss function to calculate the loss between the target Q-values and Q-values of the current state. The gradient descent algorithm is then applied to update the weight parameters of the main network.
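The replay and update steps can be sketched as follows, using the buffer size of 20,000 and batch size of 64 given above. The discount factor `GAMMA` is not stated in the text and is an assumed value, and the function names are hypothetical. The sketch computes the targets and the MSE loss only; the gradient step itself is omitted.

```python
import random
from collections import deque

BUFFER_SIZE = 20_000
BATCH_SIZE = 64
GAMMA = 0.99            # discount factor: assumed, not given in the text

buffer = deque(maxlen=BUFFER_SIZE)   # oldest experiences are dropped first


def store(s, a, r, s_next, done):
    """Append one transition (s, a, r, s', done) to the replay buffer."""
    buffer.append((s, a, r, s_next, done))


def sample_batch(batch_size=BATCH_SIZE):
    """Draw a uniformly random batch of stored transitions."""
    return random.sample(list(buffer), batch_size)


def td_targets(batch, target_q, gamma=GAMMA):
    """Targets y = r + gamma * max_a' Q_target(s', a'); y = r if done."""
    return [r + (0.0 if done else gamma * max(target_q(s_next)))
            for _, _, r, s_next, done in batch]


def mse_loss(batch, main_q, targets):
    """Mean squared error between Q_main(s, a) and the targets."""
    errors = [(main_q(s)[a] - y) ** 2
              for (s, a, _, _, _), y in zip(batch, targets)]
    return sum(errors) / len(errors)


# Toy check with a constant Q-function returning two action values.
toy_q = lambda s: [1.0, 2.0]
toy_batch = [((0,), 0, 0.5, (1,), False), ((1,), 1, 1.0, (2,), True)]
toy_targets = td_targets(toy_batch, toy_q)
toy_loss = mse_loss(toy_batch, toy_q, toy_targets)
```

Here `main_q` and `target_q` stand for the forward passes of the main and target networks, respectively.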
A sensitivity analysis of the model was conducted by varying key parameters, including the random initialization seed, the number of training episodes, the maximum number of steps per episode, and the learning rate. The results indicated that the model converged reliably only when the number of training episodes reached 20 or more. Therefore, the number of training episodes in this study was set to 20. As training progresses, the Q-value function gradually converges, enabling the agent to select actions that maximize long-term rewards in any given state. Ultimately, the model outputs the state that yields the highest reward, which corresponds to the optimal thin-film structure parameters.

2.4. Optical Simulations

The optical simulations of the designed thin-film structure were performed using COMSOL Multiphysics® (v6.2, COMSOL AB, Stockholm, Sweden) [40]. COMSOL is based on the finite element method (FEM), which discretizes the computational domain into numerous fine meshes for numerical analysis. The model was constructed with an edge length of 1 μm, and Floquet periodic boundary conditions were applied to simulate the optical response of an infinite periodic array. By analyzing the electromagnetic field distribution, the internal mechanisms enabling spectral selectivity in the film structure were investigated.
In practical applications, sunlight does not always strike window surfaces perpendicularly. The angle of incidence varies throughout the day and across seasons. Therefore, achieving stable solar spectral modulation over a wide range of incidence angles is crucial for the real-world implementation of such thin films. In this study, the average visible transmittance and average near-infrared reflectance of the films were assessed for both transverse electric (TE) and transverse magnetic (TM) polarized waves at various angles of incidence. This analysis enabled a comprehensive assessment of the robustness of the film's optical properties under varying illumination conditions.
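The angular and polarization dependence can also be estimated with a transfer-matrix calculation that uses tilted optical admittances. The sketch below is offered as an analytical cross-check under stated assumptions and is not the COMSOL model used in this work: the complex index is written as N = n − ik, the substrate is assumed to be lossless glass with index 1.5, and the function name is hypothetical.

```python
import cmath
import math


def tmm_oblique(wavelength_nm, layers, angle_deg, pol, n_in=1.0, n_sub=1.5):
    """R and T of a thin-film stack at oblique incidence.

    layers    : list of (n, k, thickness_nm), ordered from the incident side
    angle_deg : angle of incidence in the incident medium (below 90 degrees)
    pol       : "TE" (s-polarized) or "TM" (p-polarized)
    """
    if pol not in ("TE", "TM"):
        raise ValueError("pol must be 'TE' or 'TM'")
    s = n_in * math.sin(math.radians(angle_deg))   # conserved by Snell's law

    def tilted(N):
        q = cmath.sqrt(N * N - s * s)              # equals N * cos(theta)
        return q, (q if pol == "TE" else N * N / q)

    _, eta_in = tilted(complex(n_in))
    _, eta_sub = tilted(complex(n_sub))
    b, c = 1.0 + 0j, eta_sub
    for n, k, d in reversed(layers):               # from substrate outward
        q, eta = tilted(complex(n, -k))
        delta = 2.0 * math.pi * q * d / wavelength_nm
        cos_d, sin_d = cmath.cos(delta), cmath.sin(delta)
        b, c = (cos_d * b + 1j * sin_d / eta * c,
                1j * eta * sin_d * b + cos_d * c)
    denom = eta_in * b + c
    R = abs((eta_in * b - c) / denom) ** 2
    T = 4.0 * eta_in.real * eta_sub.real / abs(denom) ** 2
    return R, T


# Sanity check: TM reflectance of bare glass vanishes at the Brewster angle.
brewster_deg = math.degrees(math.atan(1.5))
R_brewster, _ = tmm_oblique(550.0, [], brewster_deg, "TM")
```

At normal incidence the TE and TM results coincide, and for a lossless stack the reflectance and transmittance sum to one for either polarization.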

2.5. Building Energy Consumption Simulation

To further evaluate the feasibility and energy-saving performance of the proposed film in practical applications, a simple office room model with dimensions of 5 m (length) × 4 m (width) × 3 m (height) was constructed using SketchUp (v2023, Trimble, Sunnyvale, CA, USA) [41], and its energy consumption was simulated using EnergyPlus (v23.2.0, U.S. Department of Energy, Washington, DC, USA) [42]. The exterior walls of the building were composed of concrete blocks coated with cement plaster on both sides (see Table S2 for detailed parameters). Meteorological data for the simulations were taken from the China Standard Weather Data (CSWD) set for the climate zone of Changsha, obtained from the official EnergyPlus website. As a city with a subtropical monsoon climate characterized by hot summers, Changsha provides a representative environment to assess the thermal regulation performance of the proposed film.
The internal load settings of the room are summarized in Table S3. In this study, a 5 mm-thick quartz glass window was selected as the reference glazing [43]. The annual energy consumption of the office was simulated under four window configurations: the reference window and windows coated with three, five, and seven layers of the THM film. Based on the simulation results, the energy-saving rate ($ESR$) of the coated windows was calculated using Equation (7):
$$ESR = \frac{Q_{\mathrm{ref}} - Q_{\mathrm{film}}}{Q_{\mathrm{ref}}} \times 100\% \tag{7}$$
where $Q_{\mathrm{ref}}$ represents the annual energy consumption of the office with the reference window, and $Q_{\mathrm{film}}$ denotes the annual energy consumption of the office with the window coated with the THM film.
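Equation (7) is a one-line calculation. The sketch below shows it with hypothetical consumption values that are not taken from the simulations.

```python
def energy_saving_rate(q_ref, q_film):
    """Energy-saving rate in percent, following Equation (7)."""
    if q_ref <= 0:
        raise ValueError("reference consumption must be positive")
    return (q_ref - q_film) / q_ref * 100.0


# Hypothetical annual consumptions in arbitrary but identical units.
esr_example = energy_saving_rate(q_ref=200.0, q_film=150.0)
```

A positive value means the coated window consumes less energy than the reference window, and a negative value would indicate an increase.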

3. Results and Discussion

3.1. Evaluation of the Optimized Film Performance

The multilayer film structures and their optical properties, obtained through inverse design using a DQN, are presented in Table S4. Among these, the Ta2O5/Ag/Ta2O5/Ag/Ta2O5 structure (42 nm/22 nm/79 nm/22 nm/40 nm) exhibits the highest $SUM$ value, where $SUM = \bar{R}_{\mathrm{UV}} + \bar{T}_{\mathrm{VIS}} + \bar{R}_{\mathrm{NIR}}$. Additionally, the structures with the highest $SUM$ values in both the three-layer and seven-layer configurations also employed Ta2O5/Ag, indicating that the Ta2O5/Ag material combination consistently demonstrates superior performance in the design of THM films.
Figure 3a presents the reflectance and transmittance spectra of the ideal THM film. As shown in Figure 3c, the transmittance spectrum of the five-layer film peaks at 0.51 μm with a maximum transmittance of 92.0%. In contrast, the seven-layer film (Figure 3d) peaks at 0.52 μm, with a transmittance of 86.6%, which is 5.4 percentage points lower than that of the five-layer film. Additionally, the transmittance of both films decreases to approximately 80.0% at 0.61 μm. According to Figure 2b, the human eye is particularly sensitive to visible light in the wavelength range of 0.51–0.61 μm. Consequently, the $\bar{T}_{\mathrm{VIS}}$ of the five-layer film (87.0%) is higher than that of the seven-layer film (83.3%). Notably, the reflectance spectrum of the seven-layer film exhibits a steeper rising slope, allowing for a faster transition from high visible transmittance to high near-infrared reflectance. At 0.78 μm, the reflectance of the seven-layer film increases to 84.0% compared to 77.0% for the five-layer film. Although their reflectance spectra nearly converge beyond a wavelength of 1 μm, the $\bar{R}_{\mathrm{NIR}}$ of the seven-layer film remains slightly higher (94.6%) than that of the five-layer film (93.2%). Furthermore, in the UV band, the five-layer film exhibits a broader high-reflectance bandwidth than the seven-layer film, leading to a relatively higher $\bar{R}_{\mathrm{UV}}$, which more effectively reduces UV transmittance and absorption. The three-layer film shown in Figure 3b exhibits higher reflectance in the VIS range, resulting in a lower $\bar{T}_{\mathrm{VIS}}$. Its single Ag layer also has a limited ability to reflect UV and NIR radiation, leading to lower values of both $\bar{R}_{\mathrm{UV}}$ and $\bar{R}_{\mathrm{NIR}}$.
Therefore, the five-layer Ta2O5/Ag/Ta2O5/Ag/Ta2O5 structure demonstrates significantly better overall performance than the three-layer and seven-layer films, particularly in terms of spectral selectivity and modulation capability. As shown in Table 1, the THM film optimized via deep reinforcement learning achieves a more favorable balance between visible transmittance and NIR reflectance than those designed using conventional methods such as particle swarm optimization (PSO), resulting in superior overall spectral performance.
As shown in Figure 4, all three film structures exhibit relatively low emissivity in the mid- to far-infrared wavelength range (2.5–25 μm), with an overall decreasing trend as the wavelength increases. Among them, the five-layer structure shows the lowest emissivity, reaching as low as 0.01552 at a wavelength of 25 μm. Based on Equation (6), the average mid- to far-infrared emissivity values for the three-, five-, and seven-layer films are calculated to be 2.9%, 1.7%, and 2.1%, respectively, confirming that the five-layer film has the lowest average emissivity. This result can be attributed to the fact that the five-layer film exhibits the highest infrared reflectance, consequently achieving the lowest emissivity. This underscores its superior capability in suppressing long-wave thermal radiation compared to the other two designs.

3.2. Simulated Optical Properties

The human eye is most sensitive to light at a wavelength of 0.55 μm. At this wavelength, the transmittance of the five-layer Ta2O5/Ag/Ta2O5/Ag/Ta2O5 film exceeds 90%, while the reflectance remains above 94% for wavelengths beyond 1 μm. To further investigate the underlying mechanisms, we analyzed the electric and magnetic field distributions at wavelengths of 0.55 μm and 1 μm. At a wavelength of 0.55 μm, the position where the electric field reaches its maximum intensity in Figure 5a corresponds to the location where the magnetic field exhibits its minimum intensity in Figure 5b. In addition, the electric and magnetic fields show opposite overall distribution trends. Although the upper silver layer reflects part of the incident visible light, a portion is transmitted into the Fabry–Pérot (F-P) cavity. Multiple internal reflections within the F-P cavity lead to the formation of standing wave modes. Destructive interference of the reflected waves at the interfaces suppresses reflection and consequently enhances the optical transmittance. This effect is known as the Fabry–Pérot resonance [45]. At a wavelength of 1 μm, the distribution trends of the electric field (Figure 5c) and magnetic field (Figure 5d) are approximately the same. This can be attributed to the large extinction coefficient ($k$) of Ag in the NIR band, which forms a highly reflective interface with the high-refractive-index Ta2O5 layer. As a result, nearly all the NIR light is reflected at the upper Ta2O5/Ag interface, causing some of the electric and magnetic field modes beneath the upper silver layer to converge toward a minimum.
According to Figure 6a,b, the trends of the $\bar{T}_{\mathrm{VIS}}$ and $\bar{R}_{\mathrm{NIR}}$ of the Ta2O5/Ag/Ta2O5/Ag/Ta2O5 thin film are generally similar for both TM and TE polarized waves at varying incident angles. When the incident angle is less than 60°, $\bar{T}_{\mathrm{VIS}}$ consistently remains above 70%. Meanwhile, $\bar{R}_{\mathrm{NIR}}$ increases with the incident angle and reaches a maximum of 94% at 70°. However, when the incident angle exceeds 75°, both $\bar{T}_{\mathrm{VIS}}$ and $\bar{R}_{\mathrm{NIR}}$ drop sharply and approach zero at an incident angle of 90°. This indicates that the optical performance of the five-layer film is relatively insensitive to variations in the incident angle within a broad angular range (0–60°).
As shown in Figure 6c,d, in the VIS range, the low-reflectance bandwidth is relatively broad at normal incidence (0°), indicating a higher transmittance of visible light. As the angle of incidence increases, the low-reflectance bandwidth gradually narrows, i.e., the reflectance increases slightly. In the NIR band, however, the reflectance remains almost unaffected by variations in the angle of incidence. This optical behavior is of considerable significance in practical applications. At midday during summer, when sunlight strikes at a high angle, the increased reflectance of visible light helps reduce glare and thermal radiation, thereby improving indoor comfort. In contrast, during winter, when the solar incidence angle is low, the higher visible light transmittance facilitates both natural illumination and passive heat gain, contributing to a reduced heating demand.

3.3. Building Energy Performance

Based on the variation in annual building energy consumption per square meter with the window-to-wall ratio shown in Figure 7a,c, it can be observed that windows coated with a five-layer Ta2O5/Ag/Ta2O5/Ag/Ta2O5 film exhibit the lowest energy consumption among the compared options. Meanwhile, as illustrated in Figure 7b,d, the energy-saving rate of all three types of coated glass windows increases with the window-to-wall ratio. When the window-to-wall ratio reaches 90%, the five-layer coated windows achieve an energy-saving rate of up to 17.93% under the climatic conditions of Changsha, which is 3.1 and 0.13 percentage points higher than those of the three-layer and seven-layer films, respectively. Under the climatic conditions of Guangzhou, the energy-saving rate reaches 16.81%, exceeding those of the three-layer and seven-layer films by 3.61 and 0.22 percentage points, respectively. This result can be attributed to the superior optical performance of the five-layer structure. Its $\bar{R}_{\mathrm{NIR}}$ is higher than that of the three-layer film, indicating a stronger capability to block infrared radiation and thereby reducing cooling energy consumption. Although the $\bar{R}_{\mathrm{NIR}}$ of the seven-layer film is comparable to that of the five-layer structure, the latter exhibits a lower mid- to far-infrared emissivity. This characteristic helps suppress thermal radiation from the window into the indoor environment, effectively reducing heat transfer. Consequently, the five-layer configuration achieves the lowest energy consumption among the three designs, demonstrating the best overall energy-saving performance.

4. Conclusions

In this study, a film structure of Ta2O5/Ag/Ta2O5/Ag/Ta2O5 (42 nm/22 nm/79 nm/22 nm/40 nm) was inversely designed employing deep reinforcement learning to optimize both the material system and layer thickness parameters for broadband spectral selectivity. The film demonstrates high transmittance in the visible band ($\bar{T}_{\mathrm{VIS}}$ = 87.0%) and high reflectance in the UV and NIR bands ($\bar{R}_{\mathrm{UV}}$ = 75.5%, $\bar{R}_{\mathrm{NIR}}$ = 93.2%). Additionally, the film exhibits an average mid- to far-infrared emissivity as low as 1.7%, effectively reducing heat gain in summer and heat loss in winter. Simulation results under different polarization states and incidence angles indicate that the film maintains good optical performance across a wide angular range (0–60°). Under the hot climatic conditions of Changsha, China, windows coated with the proposed film outperform traditional quartz glass windows, achieving a maximum energy-saving rate of 17.93% at a window-to-wall ratio of 90%. Similarly, in Guangzhou, the energy-saving rate reaches 16.81%. The THM film developed in this study exhibits outstanding optical and thermal properties from the UV to the far-infrared range, indicating strong potential for application in energy-efficient building technologies.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/ma18122677/s1, Table S1: Type of material and its code [46,47,48,49,50,51,52,53,54,55,56,57,58], Table S2: Materials of the building façade, Table S3: Room internal load settings, Table S4: DQN-optimized THM film structure.

Author Contributions

Conceptualization, H.J. and Z.Z.; Methodology, Z.Z. and T.X.; Software, Z.Z. and P.L.; Validation, B.L., S.J. and Z.Z.; Formal Analysis, Y.C.; Investigation, T.X.; Resources, H.J.; Data Curation, P.L.; Writing—Original Draft Preparation, Z.Z.; Writing—Review and Editing, H.J.; Visualization, Z.Z. and T.X.; Supervision, H.J. and B.L.; Project Administration, S.J.; Funding Acquisition, H.J. All authors have read and agreed to the published version of the manuscript.

Funding

This work was financially supported by the Xiangtan University College Student Innovation and Entrepreneurship Training Program, the National Natural Science Foundation of China (No. 51902276), the Natural Science Foundation of Hunan Province (No. 2019JJ50583, 2023JJ30585), and the Scientific Research Fund of Hunan Provincial Education Department (No. 21B0111).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The original contributions presented in this study are included in the supplementary material. Further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

References

  1. Zhang, L.; Yang, L.; Zohner, C.M.; Crowther, T.W.; Li, M.; Shen, F.; Guo, M.; Qin, J.; Yao, L.; Zhou, C. Direct and indirect impacts of urbanization on vegetation growth across the world’s cities. Sci. Adv. 2022, 8, eabo0095.
  2. Abbasi, K.R.; Shahbaz, M.; Zhang, J.; Irfan, M.; Alvarado, R. Analyze the environmental sustainability factors of China: The role of fossil fuel energy and renewable energy. Renew. Energy 2022, 187, 390–402.
  3. Chu, S.; Majumdar, A. Opportunities and challenges for a sustainable energy future. Nature 2012, 488, 294–303.
  4. Zhou, Y.; Jin, H.; Li, C.; Ding, L. Spatio-temporal patterns and impact mechanisms of CO2 emissions from China's construction industry under urbanization. Sustain. Cities Soc. 2024, 106, 105353.
  5. Tan, J.; Peng, S.; Liu, E. Spatio-temporal distribution and peak prediction of energy consumption and carbon emissions of residential buildings in China. Appl. Energy 2024, 376, 124330.
  6. Zhang, R.; Li, R.; Xu, P.; Zhong, W.; Zhang, Y.; Luo, Z.; Xiang, B. Thermochromic smart window utilizing passive radiative cooling for self-adaptive thermoregulation. Chem. Eng. J. 2023, 471, 144527.
  7. Rezaei, S.D.; Shannigrahi, S.; Ramakrishna, S. A review of conventional, advanced, and smart glazing technologies and materials for improving indoor environment. Sol. Energy Mater. Sol. Cells 2017, 159, 26–51.
  8. Gao, Y.; Jonsson, J.C.; Curcija, D.C.; Vidanovic, S.; Hong, T. Global and regional perspectives on optimizing thermo-responsive dynamic windows for energy-efficient buildings. Nat. Commun. 2025, 16, 199.
  9. Raman, A.P.; Anoma, M.A.; Zhu, L.; Rephaeli, E.; Fan, S. Passive radiative cooling below ambient air temperature under direct sunlight. Nature 2014, 515, 540–544.
  10. Cuce, E.; Riffat, S.B. A state-of-the-art review on innovative glazing technologies. Renew. Sustain. Energy Rev. 2015, 41, 695–714.
  11. Wang, Y.; Ji, H.; Liu, B.; Tang, P.; Chen, Y.; Huang, J.; Ou, Y.; Tao, J. Radiative cooling: Structure design and application. J. Mater. Chem. A 2024, 12, 9962–9978.
  12. Wang, S.; Zhou, Y.; Jiang, T.; Yang, R.; Tan, G.; Long, Y. Thermochromic smart windows with highly regulated radiative cooling and solar transmission. Nano Energy 2021, 89, 106440.
  13. Wang, S.; Jiang, T.; Meng, Y.; Yang, R.; Tan, G.; Long, Y. Scalable thermochromic smart windows with passive radiative cooling regulation. Science 2021, 374, 1501–1504.
  14. Hong, X.; Yang, Y.; Chen, H.; Tao, Q. Design, fabrication and energy-saving evaluation of five-layer structure based transparent heat mirror coatings for windows application. Build. Simul. 2023, 16, 2333–2342.
  15. Lanfranchi, A.; Megahd, H.; Lova, P.; Comoretto, D. Multilayer polymer photonic aegises against near-infrared solar irradiation heating. ACS Appl. Mater. Interfaces 2022, 14, 14550–14560.
  16. D’Orazio, J.; Jarrett, S.; Amaro-Ortiz, A.; Scott, T. UV radiation and the skin. Int. J. Mol. Sci. 2013, 14, 12222–12248.
  17. Zhang, J.; Xi, S.; Mao, G.; Yin, R.; Zhu, L.; Li, D.; Yao, Z.; Mi, H.-Y.; Han, J.; Liu, C. Robust and efficient UV-reflecting one-dimensional photonic crystals enabled by organic/inorganic nanocomposite thin films for photoprotection of transparent polymers. J. Mater. Chem. C 2021, 9, 4223–4232.
  18. Fan, J.C.; Bachner, F.J. Transparent heat mirrors for solar-energy applications. Appl. Opt. 1976, 15, 1012–1017.
  19. Dalapati, G.K.; Masudy-Panah, S.; Chua, S.T.; Sharma, M.; Wong, T.I.; Tan, H.R.; Chi, D. Color tunable low cost transparent heat reflector using copper and titanium oxide for energy saving application. Sci. Rep. 2016, 6, 20182.
  20. Sibin, K.; Selvakumar, N.; Kumar, A.; Dey, A.; Sridhara, N.; Shashikala, H.; Sharma, A.K.; Barshilia, H.C. Design and development of ITO/Ag/ITO spectral beam splitter coating for photovoltaic-thermoelectric hybrid systems. Sol. Energy 2017, 141, 118–126. [Google Scholar] [CrossRef]
  21. Chen, Y.; Long, P.; Liu, B.; Wang, Y.; Wang, J.; Ma, T.; Wei, H.; Kang, Y.; Ji, H. Development and application of Few-shot learning methods in materials science under data scarcity. J. Mater. Chem. A 2024, 12, 30249–30268. [Google Scholar] [CrossRef]
  22. Lu, M.; Ji, H.; Chen, Y.; Gao, F.; Liu, B.; Long, P.; Deng, C.; Wang, Y.; Tao, J. Machine learning assisted layer-controlled synthesis of MoS2. J. Mater. Chem. C 2024, 12, 8893–8900. [Google Scholar] [CrossRef]
  23. Narvaez, G.; Giraldo, L.F.; Bressan, M.; Pantoja, A. Machine learning for site-adaptation and solar radiation forecasting. Renew. Energy 2021, 167, 333–342. [Google Scholar] [CrossRef]
  24. Chen, Y.; Ji, H.; Lu, M.; Liu, B.; Zhao, Y.; Ou, Y.; Wang, Y.; Tao, J.; Zou, T.; Huang, Y. Machine learning guided hydrothermal synthesis of thermochromic VO2 nanoparticles. Ceram. Int. 2023, 49, 30794–30800. [Google Scholar] [CrossRef]
  25. Lu, M.; Ji, H.; Zhao, Y.; Chen, Y.; Tao, J.; Ou, Y.; Wang, Y.; Huang, Y.; Wang, J.; Hao, G. Machine learning-assisted synthesis of two-dimensional materials. ACS Appl. Mater. Interfaces 2022, 15, 1871–1878. [Google Scholar] [CrossRef]
  26. Xu, X.; Shen, B.; Ding, S.; Srivastava, G.; Bilal, M.; Khosravi, M.R.; Menon, V.G.; Jan, M.A.; Wang, M. Service offloading with deep Q-network for digital twinning-empowered Internet of Vehicles in edge computing. IEEE Trans. Ind. Inform. 2020, 18, 1414–1423. [Google Scholar] [CrossRef]
  27. Heil, C.M.; Patil, A.; Dhinojwala, A.; Jayaraman, A. Computational reverse-engineering analysis for scattering experiments (CREASE) with machine learning enhancement to determine structure of nanoparticle mixtures and solutions. ACS Cent. Sci. 2022, 8, 996–1007. [Google Scholar] [CrossRef]
  28. Chen, Y.; Ji, H.; Long, P.; Liu, B.; Wang, Y.; Ou, Y.; Deng, C.; Huang, Y.; Wang, J. Reinforcement learning-based inverse design of composite films for spacecraft smart thermal control. Phys. Chem. Chem. Phys. 2025, 27, 7753–7762. [Google Scholar] [CrossRef]
  29. Ladosz, P.; Weng, L.; Kim, M.; Oh, H. Exploration in deep reinforcement learning: A survey. Inf. Fusion 2022, 85, 1–22. [Google Scholar] [CrossRef]
  30. Al-Kuhaili, M. Enhancement of plasmonic transmittance of porous gold thin films via gold/metal oxide bi-layers for solar energy-saving applications. Sol. Energy 2019, 181, 456–463. [Google Scholar] [CrossRef]
  31. Okada, M.; Tazawa, M.; Jin, P.; Yamada, Y.; Yoshimura, K. Fabrication of photocatalytic heat-mirror with TiO2/TiN/TiO2 stacked layers. Vacuum 2006, 80, 732–735. [Google Scholar] [CrossRef]
  32. Chang, T.; Cao, X.; Long, Y.; Luo, H.; Jin, P. How to properly evaluate and compare the thermochromic performance of VO2-based smart coatings. J. Mater. Chem. A 2019, 7, 24164–24172. [Google Scholar] [CrossRef]
  33. Dang, S.; Yi, Y.; Ye, H. A visible transparent solar infrared reflecting film with a low long-wave emittance. Sol. Energy 2020, 195, 483–490. [Google Scholar] [CrossRef]
  34. Luce, A.; Mahdavi, A.; Marquardt, F.; Wankerl, H. TMM-Fast, a transfer matrix computation package for multilayer thin-film optimization: Tutorial. J. Opt. Soc. Am. A 2022, 39, 1007–1013. [Google Scholar] [CrossRef]
  35. Volterrani, M.; Minelli, A.; Gaetani, M.; Grossi, N.; Magni, S.; Caturegli, L. Reflectance, absorbance and transmittance spectra of bermudagrass and manilagrass turfgrass canopies. PLoS ONE 2017, 12, e0188080. [Google Scholar] [CrossRef]
  36. Hao, X.; Wei, G.; Zhang, H.; Tan, S.; Ji, G. Defect chemistry-regulated design of doping CeO2 with the enhanced high-temperature low infrared emissivity property. Mater. Today Nano 2025, 30, 100614. [Google Scholar] [CrossRef]
  37. Xu, G.; Kang, Q.; Zhang, X.; Wang, W.; Guo, K.; Guo, Z. Inverse-design laser-infrared compatible stealth with thermal management enabled by wavelength-selective thermal emitter. Appl. Therm. Eng. 2024, 255, 124063. [Google Scholar] [CrossRef]
  38. Quan, C.; Gu, S.; Liu, P.; Xu, W.; Guo, C.; Zhang, J.; Zhu, Z. Spectrally selective radiation infrared stealth based on a simple Mo/Ge bilayer metafilm. Opt. Lasers Eng. 2024, 180, 108328. [Google Scholar] [CrossRef]
  39. Fenjiro, Y.; Benbrahim, H. Deep reinforcement learning overview of the state of the art. J. Autom. Mob. Robot. Intell. Syst. 2018, 12, 20–39. [Google Scholar] [CrossRef]
  40. Timofeev, V.A.; Skvortsov, I.V.; Mashanov, V.I.; Gayduk, A.E.; Bloshkin, A.A.; Kirienko, V.V.; Utkin, D.E.; Nikiforov, A.I.; Kolyada, D.V.; Firsov, D.D. Excitation of hybrid modes in plasmonic nanoantennas coupled with GeSiSn/Si multiple quantum wells for the photoresponse enhancement in the short-wave infrared range. Appl. Surf. Sci. 2024, 659, 159852. [Google Scholar] [CrossRef]
  41. Liu, X. Three-dimensional visualized urban landscape planning and design based on virtual reality technology. IEEE Access 2020, 8, 149510–149521. [Google Scholar] [CrossRef]
  42. Jia, Y.; Liu, D.; Chen, D.; Jin, Y.; Chen, C.; Tao, J.; Cheng, H.; Zhou, S.; Cheng, B.; Wang, X. Transparent dynamic infrared emissivity regulators. Nat. Commun. 2023, 14, 5087. [Google Scholar] [CrossRef] [PubMed]
  43. Long, L.; Ye, H. Dual-intelligent windows regulating both solar and long-wave radiations dynamically. Sol. Energy Mater. Sol. Cells 2017, 169, 145–150. [Google Scholar] [CrossRef]
  44. Al-Kuhaili, M.; Al-Aswad, A.; Durrani, S.; Bakhtiari, I. Energy-saving transparent heat mirrors based on tungsten oxide–gold WO3/Au/WO3 multilayer structures. Sol. Energy 2012, 86, 3183–3189. [Google Scholar] [CrossRef]
  45. Zheng, L.; Zhang, S.; Yao, Q.; Lin, K.; Rao, A.; Niu, C.; Yang, M.; Wang, L.; Lv, Y. High reflectance tunable multi-color electrochromic films based on Fabry–Perot cavity. Ceram. Int. 2023, 49, 13355–13362. [Google Scholar] [CrossRef]
  46. Ciesielski, A.; Skowronski, L.; Trzcinski, M.; Górecka, E.; Trautman, P.; Szoplik, T. Evidence of germanium segregation in gold thin films. Surf. Sci. 2018, 674, 73–78. [Google Scholar] [CrossRef]
  47. Luke, K.; Okawachi, Y.; Lamont, M.R.; Gaeta, A.L.; Lipson, M. Broadband mid-infrared frequency comb generation in a Si3N4 microresonator. Opt. Lett. 2015, 40, 4823–4826. [Google Scholar] [CrossRef]
  48. Gao, L.; Lemarchand, F.; Lequime, M. Exploitation of multiple incidences spectrometric measurements for thin film reverse engineering. Opt. Express 2012, 20, 15734–15751. [Google Scholar] [CrossRef]
  49. Babar, S.; Weaver, J. Optical constants of Cu, Ag, and Au revisited. Appl. Opt. 2015, 54, 477–481. [Google Scholar] [CrossRef]
  50. Franta, D.; Nečas, D.; Ohlídal, I.; Giglia, A. Dispersion model for optical thin films applicable in wide spectral range. In Proceedings of the SPIE 9628, Optical Systems Design 2015: Optical Fabrication, Testing, and Metrology V, 96281U, Jena, Germany, 24 September 2015; pp. 342–353. [Google Scholar]
  51. Pflüger, J.; Fink, J. Determination of optical constants by high-energy, electron-energy-loss spectroscopy (EELS). In Handbook of Optical Constants of Solids; Elsevier: Amsterdam, The Netherlands, 1997; pp. 293–311. [Google Scholar]
  52. Aguilar, O.; de Castro, S.; Godoy, M.P.; Rebello Sousa Dias, M. Optoelectronic characterization of Zn1-xCdxO thin films as an alternative to photonic crystals in organic solar cells. Opt. Mater. Express 2019, 9, 3638–3648. [Google Scholar] [CrossRef]
  53. Beliaev, L.Y.; Shkondin, E.; Lavrinenko, A.V.; Takayama, O. Thickness-dependent optical properties of aluminum nitride films for mid-infrared wavelengths. J. Vac. Sci. Technol. A 2021, 39, 043408. [Google Scholar] [CrossRef]
  54. Ozaki, S.O.S.; Adachi, S.A.S. Optical constants of cubic ZnS. Jpn. J. Appl. Phys. 1993, 32, 5008. [Google Scholar] [CrossRef]
  55. Franta, D.; Nečas, D.; Ohlídal, I. Universal dispersion model for characterization of optical thin films over a wide spectral range: Application to hafnia. Appl. Opt. 2015, 54, 9108–9119. [Google Scholar] [CrossRef] [PubMed]
  56. Kulikova, D.P.; Dobronosova, A.A.; Kornienko, V.V.; Nechepurenko, I.A.; Baburin, A.S.; Sergeev, E.V.; Lotkov, E.S.; Rodionov, I.A.; Baryshev, A.V.; Dorofeenko, A.V. Optical properties of tungsten trioxide, palladium, and platinum thin films for functional nanostructures engineering. Opt. Express 2020, 28, 32049–32060. [Google Scholar] [CrossRef]
  57. Minenkov, A.; Hollweger, S.; Duchoslav, J.; Erdene-Ochir, O.; Weise, M.; Ermilova, E.; Hertwig, A.; Schiek, M. Monitoring the electrochemical failure of indium tin oxide electrodes via operando ellipsometry complemented by electron microscopy and spectroscopy. ACS Appl. Mater. Interfaces 2024, 16, 9517–9531. [Google Scholar] [CrossRef]
  58. Treharne, R.; Seymour-Pierce, A.; Durose, K.; Hutchings, K.; Roncallo, S.; Lane, D. Optical design and fabrication of fully sputtered CdTe/CdS solar cells. J. Phys. Conf. Ser. 2011, 286, 012038. [Google Scholar] [CrossRef]
Figure 1. (a) Diagram of the working mechanism of a transparent heat mirror film. (b) Schematic diagram of the deep Q-network model framework.
Figure 2. (a) Solar spectral irradiance under AM1.5 atmospheric conditions. (b) Photopic luminosity function of the human eye in the visible light range [32].
Figure 3. Reflectance and transmittance spectra of (a) ideal, (b) three-layer, (c) five-layer, and (d) seven-layer THM films over the wavelength range of 0.28–2.5 μm.
Figure 4. Emissivity spectra of the three-layer, five-layer, and seven-layer films in the mid- to far-infrared bands.
Figure 5. Electric and magnetic field distributions of the Ta2O5/Ag/Ta2O5/Ag/Ta2O5 structure. (a) Electric field distribution at a wavelength of 0.55 μm. (b) Magnetic field distribution at a wavelength of 0.55 μm. (c) Electric field distribution at a wavelength of 1 μm. (d) Magnetic field distribution at a wavelength of 1 μm.
Figure 6. (a) Average visible light transmittance under the oblique incidence of TM and TE waves. (b) Average near-infrared reflectance under the oblique incidence of TM and TE waves. (c) Reflectance spectrum of TM waves under oblique incidence in the wavelength range of 0.28–2.5 μm. (d) Reflectance spectrum of TE waves under oblique incidence in the wavelength range of 0.28–2.5 μm.
Figure 7. Annual building energy consumption per square meter and energy-saving rate as functions of the window-to-wall ratio. (a) Energy consumption and (b) energy-saving rate under the climatic conditions of Changsha. (c) Energy consumption and (d) energy-saving rate under the climatic conditions of Guangzhou.
Table 1. Performance comparison between the THM design in this study and other related research.
| Structure | T̄_VIS (%) | R̄_NIR (%) | Optimization Method | Reference |
| --- | --- | --- | --- | --- |
| TiO2/Ag/TiO2 | 62.5 | 71.9 | PSO | Dalapati et al. [19] |
| ZnO/Ag/ZnO | 87.1 | 58.9 | PSO | Dang et al. [33] |
| Si3N4/Ag/Si3N4 | 92.5 | 74.6 | PSO | Hong et al. [14] |
| WO3/Au/WO3 | 79.0 | 60.3 | PSO | Al-Kuhaili et al. [44] |
| Ta2O5/Ag/Ta2O5 | 83.0 | 81.7 | DRL | This work |
| Ta2O5/Ag/Ta2O5/Ag/Ta2O5 | 87.0 | 93.2 | DRL | This work |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Zeng, Z.; Ji, H.; Xiao, T.; Long, P.; Liu, B.; Jin, S.; Cao, Y. Deep Reinforcement Learning-Guided Inverse Design of Transparent Heat Mirror Film for Broadband Spectral Selectivity. Materials 2025, 18, 2677. https://doi.org/10.3390/ma18122677

AMA Style

Zeng Z, Ji H, Xiao T, Long P, Liu B, Jin S, Cao Y. Deep Reinforcement Learning-Guided Inverse Design of Transparent Heat Mirror Film for Broadband Spectral Selectivity. Materials. 2025; 18(12):2677. https://doi.org/10.3390/ma18122677

Chicago/Turabian Style

Zeng, Zhi, Haining Ji, Tianjian Xiao, Peng Long, Bin Liu, Shisong Jin, and Yuxin Cao. 2025. "Deep Reinforcement Learning-Guided Inverse Design of Transparent Heat Mirror Film for Broadband Spectral Selectivity" Materials 18, no. 12: 2677. https://doi.org/10.3390/ma18122677

APA Style

Zeng, Z., Ji, H., Xiao, T., Long, P., Liu, B., Jin, S., & Cao, Y. (2025). Deep Reinforcement Learning-Guided Inverse Design of Transparent Heat Mirror Film for Broadband Spectral Selectivity. Materials, 18(12), 2677. https://doi.org/10.3390/ma18122677
