From Research Trend to Performance Prediction: Metaheuristic-Driven Machine Learning Optimization for Cement Pastes Containing Bio-Based Phase Change Materials

Li, Leifa; Sun, Wangwen; Gómez-Zamorano, Lauren Y.; Liu, Zhuangzhuang; Zhang, Wenzhen; Ma, Haoran

doi:10.3390/polym17182541


by Leifa Li ¹, Wangwen Sun ²,*, Lauren Y. Gómez-Zamorano ³, Zhuangzhuang Liu ²,⁴,⁵,*, Wenzhen Zhang ² and Haoran Ma ²

¹ Xinjiang Jiaotou Construction Management Co., Ltd., Urumchi 830000, China

² School of Highway, Chang’an University, Xi’an 710064, China

³ Programa Doctoral en Ingeniería de Materiales, Facultad de Ingeniería Mecánica y Eléctrica, Universidad Autónoma de Nuevo León, Ave. Universidad s/n, Ciudad Universitaria, San Nicolás de los Garza 66455, Nuevo León, Mexico

⁴ Key Laboratory of Special Area Highway Engineering, Ministry of Education, Xi’an 710064, China

⁵ International Joint Laboratory for Sustainable Development of Highway Infrastructures in Special Regions, Xi’an 710064, China

* Authors to whom correspondence should be addressed.

Polymers 2025, 17(18), 2541; https://doi.org/10.3390/polym17182541

Submission received: 30 July 2025 / Revised: 17 September 2025 / Accepted: 18 September 2025 / Published: 19 September 2025

(This article belongs to the Special Issue Application of Polymers in Cementitious Materials)


Abstract

This study presents an integrated approach combining bibliometric analysis and machine learning to explore research trends and predict the performance of cement pastes containing bio-based phase change materials. A bibliometric review of 5928 articles from the Web of Science Core Collection was conducted using CiteSpace (v.6.3.R1) to identify research hotspots. A dataset of 100 experimental samples was compiled, comprising nine input variables and three output properties: thermal conductivity (Tc), latent heat capacity (LH), and compressive strength (CS). Four machine learning algorithms (SVR, RF, XGBoost, and CatBoost) were optimized using five metaheuristic algorithms (GA, PSO, WOA, GWO, and FFA), resulting in 24 optimized hybrid models. Of all the models considered, CatBoost-WOA achieved the best overall performance, with R² values of 0.927, 0.955, and 0.944, and RMSEs of 0.0057 W/m·K, 1.84 J/g, and 2.91 MPa for Tc, LH, and CS, respectively. SVR-GWO and XGBoost-WOA also showed strong generalization and low error dispersion. The developed models provide a transferable, data-driven modeling pipeline for predicting the coupled thermal and mechanical behavior of cement pastes containing bio-based phase change materials.

Keywords:

bio-based phase change materials; cementitious composites; performance prediction; literature visualization analysis; model optimization; sustainability

1. Introduction

Green building materials play a vital role in promoting sustainability in the construction industry by minimizing environmental impact, conserving resources, and reducing energy consumption [1,2,3]. With the advancement of global carbon neutrality goals, bio-based phase change materials (BPCMs) have attracted growing attention due to their renewability and excellent thermal regulation capability [2]. These materials can absorb and release latent heat during phase transitions, thereby regulating the internal temperature of buildings and significantly improving energy efficiency [3]. The interaction between BPCMs and cementitious materials has emerged as a central focus in green energy engineering research. The building heat island (BHI) effect may occur when the internal temperature is much higher than the ambient temperature. This effect is closely related to the thermal properties of construction materials [4]. Due to their high thermal capacity and conductivity, cement-based materials significantly contribute to this effect. Previous studies have shown that incorporating BPCMs into cementitious systems allows excess daytime heat to be absorbed and gradually released at night, effectively narrowing the diurnal temperature range [4,5]. This regulation is primarily attributed to the latent heat storage and temperature sensitivity characteristics of BPCMs. Currently, BPCMs applied in cement-based materials include both low-molecular-weight compounds (e.g., fatty acids and waxes) and polymeric substances (e.g., bio-based PEGs or biopolymers), depending on the application [6]. These materials reduce environmental impact and enable the reuse of biomass-containing waste, thereby improving resource efficiency. However, their application in cementitious materials still encounters challenges, including poor interfacial compatibility with cement paste, insufficient thermal cycling stability, and the high cost of encapsulation and production [6,7]. 
Current research focuses on four main aspects: (1) Development of novel BPCMs with improved thermal properties; (2) Prevention of material leakage via microencapsulation or porous carrier loading techniques; (3) Improvement of overall durability, with particular emphasis on the interfacial bonding strength between BPCMs and the cement pastes; and (4) Quantitative evaluation of improvements in thermal and mechanical performance under practical application conditions.

Although progress has been made in these areas, a systematic analysis of the research landscape, key technological breakthroughs, and emerging trends is still lacking. To fill this gap, this study employs a bibliometric approach based on the Web of Science database to analyze research developments on BPCM-integrated cementitious materials over the past twelve years [7,8,9]. Researchers have developed a range of innovative cementitious composites incorporating BPCMs. In practical applications, these materials have demonstrated promising thermoregulatory performance. For example, several studies have shown that incorporating 20% coconut oil-derived BPCMs into cement mortar can reduce surface temperatures by 5–8 °C and decrease energy consumption by up to 30%. Thermal stability is a critical factor in the engineering application of BPCMs, especially considering that hydration heat can raise the temperature of cement-based materials to 70–80 °C [10,11]. Thermogravimetric analysis confirms that most plant oil-based BPCMs remain stable below 100 °C, while sugar alcohol-based BPCMs show superior thermal resistance. To mitigate leakage during phase change, two common strategies are employed: polymer microencapsulation and loading onto porous supports [12,13]. To improve the representativeness of our trend analysis, we additionally considered recent comprehensive review studies [12,13,14,15,16], which highlight the growing interest in sustainable cementitious composites incorporating bio-based PCMs. These reviews emphasize an evolving research focus, from thermal performance control to multi-functional integration, intelligent modeling, and AI-assisted design. By incorporating these perspectives, our bibliometric analysis is placed in a broader context, providing a more complete overview of current research trends and methodological developments in the field.

Further, incorporating 5% graphene has been shown to significantly enhance the thermal conductivity of BPCM cement pastes [14], increasing thermal response rate by 40%. However, the addition of BPCMs tends to reduce the compressive strength of cementitious materials by 10–20%. This drawback can be mitigated through optimization of aggregate gradation and the use of active additives such as nano-silica [15,16,17]. Dynamic mechanical analysis and three-point bending tests have confirmed that surface-treated BPCMs can improve fracture toughness by up to 12%, offering both thermal and mechanical benefits [18]. These findings lay a solid foundation for the practical implementation of BPCMs in sustainable construction. The performance of BPCM cement pastes is influenced by multiple interacting factors, including PCM type and dosage, water-to-cement ratio, porosity, ambient temperature/humidity, and microstructural evolution [19,20]. Traditional experimental methods for thermal performance assessment are costly, time-consuming, and often fail to capture complex nonlinear interactions between variables. As a result, data-driven prediction models have gained traction in recent years [21,22].

Machine learning (ML) techniques have shown excellent capabilities in nonlinear regression and robust handling of complex data structures [23,24]. One study has demonstrated the value of AutoML-enhanced approaches in predicting the behavior of cementitious composites, reinforcing the role of machine learning in material performance analysis [24]. ML techniques are now widely used for prediction in cement-based materials. Compared to conventional regression methods, ML algorithms achieve superior performance in capturing intricate relationships between material composition, mix proportions, curing conditions, and target properties such as thermal conductivity, latent heat, and compressive strength [25,26]. Several supervised learning models have been applied in the domain. Support Vector Regression (SVR) is effective for small-sample learning scenarios [27]. Random Forest (RF), a bagging-based ensemble model, helps reduce overfitting and improve generalization performance. Gradient boosting frameworks such as XGBoost and CatBoost have demonstrated high accuracy and efficiency in handling nonlinear feature interactions and high-dimensional data [28,29]. To further improve predictive accuracy, metaheuristic optimization algorithms such as the Genetic Algorithm (GA), Particle Swarm Optimization (PSO), Whale Optimization Algorithm (WOA), Grey Wolf Optimizer (GWO), and Firefly Algorithm (FFA) have been employed for hyperparameter tuning [30,31]. These nature-inspired algorithms mimic collaborative search behaviors to perform global optimization, thereby improving both model accuracy and robustness. In this study, four machine learning models (SVR, RF, XGBoost, and CatBoost) were developed to predict the thermal and mechanical performance of BPCM-cement composites, using a dataset of 100 experimental samples. These models were optimized using various metaheuristic algorithms. Model performance was evaluated using RMSE, MAE, and R² metrics. 
Feature importance was further analyzed to identify the dominant influencing factors and to support material design optimization [32,33,34].
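The three evaluation metrics named above follow their standard definitions. As a minimal sketch, they can be written in plain Python as shown below; the measured and predicted values are hypothetical compressive strengths chosen only to make the arithmetic easy to follow and are not data from this study.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error: sqrt(mean((y - y_hat)^2))."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def mae(y_true, y_pred):
    """Mean absolute error: mean(|y - y_hat|)."""
    n = len(y_true)
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical compressive strengths in MPa (illustrative only).
measured = [30.0, 35.0, 40.0, 45.0]
predicted = [31.0, 34.0, 41.0, 44.0]

scores = {
    "RMSE": rmse(measured, predicted),
    "MAE": mae(measured, predicted),
    "R2": r2(measured, predicted),
}
```

RMSE and MAE carry the unit of the target property (W/m·K, J/g, or MPa), whereas R² is dimensionless, which is why the three are usually reported together.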

In summary, this work establishes a systematic modeling framework for high-precision prediction of the thermal and mechanical performance of BPCM cement pastes. By integrating four representative machine learning models with multi-strategy metaheuristic optimization algorithms (GA, PSO, WOA, GWO, and FFA), the proposed framework improves both predictive accuracy and adaptability to complex material systems [35,36]. This approach provides a solid theoretical and technical foundation for intelligent modeling of BPCM-integrated green building materials and shows strong potential for large-scale engineering applications.
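To illustrate how a metaheuristic can drive hyperparameter tuning, the sketch below implements a basic global-best particle swarm search in plain Python. It is not the authors' implementation: the objective `stand_in_cv_rmse` is a hypothetical smooth surface standing in for a cross-validated RMSE, the two search dimensions are assumed to be log10-scaled SVR-style hyperparameters, and the swarm settings (`inertia`, `c1`, `c2`, swarm size) are common textbook values rather than those used in this study.

```python
import random


def pso_minimize(objective, bounds, n_particles=20, n_iter=100,
                 inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimise objective(position) inside box bounds with a global-best PSO."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]              # personal best positions
    best_val = [objective(p) for p in pos]      # personal best scores
    g = min(range(n_particles), key=best_val.__getitem__)
    g_pos, g_val = best_pos[g][:], best_val[g]  # swarm-wide best

    for _ in range(n_iter):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (best_pos[i][d] - pos[i][d])
                             + c2 * r2 * (g_pos[d] - pos[i][d]))
                # Clamp to the search box so every candidate stays valid.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < best_val[i]:
                best_pos[i], best_val[i] = pos[i][:], val
                if val < g_val:
                    g_pos, g_val = pos[i][:], val
    return g_pos, g_val


def stand_in_cv_rmse(params):
    """Hypothetical error surface with its minimum (2.0) at log10(C)=1, log10(gamma)=-2."""
    log_c, log_gamma = params
    return 2.0 + (log_c - 1.0) ** 2 + 0.5 * (log_gamma + 2.0) ** 2


search_box = [(-2.0, 3.0), (-4.0, 1.0)]  # assumed ranges for log10(C) and log10(gamma)
best_params, best_score = pso_minimize(stand_in_cv_rmse, search_box)
```

In a real pipeline the stand-in objective would be replaced by a function that trains the model with the candidate hyperparameters and returns its validation error; the other metaheuristics differ mainly in how candidate positions are updated.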

2. Techniques and Methodology

This study employs bibliometric analysis to comprehensively examine current research trends on the thermal performance of cement pastes incorporating BPCMs. The methodological framework is divided into three stages: literature retrieval, data mining, and bibliometric analysis. In the first stage, a structured and topic-focused dataset was established by selecting relevant databases and formulating precise search strategies. The second stage involved the screening and cleaning of the retrieved literature to ensure the completeness and accuracy of the final dataset. In the third stage, CiteSpace software was employed to conduct knowledge mapping and visualization. To systematically address the core issues in this field, four key research questions were proposed: (1) What are the thematic evolution trends in research on the thermal and mechanical properties of BPCM-integrated cementitious materials? (2) Which high-frequency terms and keywords dominate the existing co-occurrence networks? (3) What is the global distribution of leading research institutions and countries in this field, and are there any significant patterns of international collaboration? (4) Which authors, institutions, and journals exhibit high productivity and academic influence?

By answering these questions, this study seeks to identify the technical hotspots and emerging research frontiers in the field. The findings are expected to offer theoretical insight and strategic direction for the future development and application of BPCM-cement composites in building thermal regulation and energy-efficient design.

2.1. Bibliometric Analysis

2.1.1. Literature Retrieval

Literature was collected from the Web of Science Core Collection (WoS) due to its broad and reliable coverage of engineering, construction materials, and sustainable building technologies [37]. WoS provides well-structured academic metadata, including author affiliations, institutional data, citations, journal metrics, and country-level information. It also offers extensive indexing of leading journals related to green materials, phase change materials (PCMs), and energy-efficient construction, ensuring both representativeness and data consistency [38,39]. During the search and data processing, minor refinements and iterative adjustments were applied to optimize the dataset for the purposes of this study, ensuring accuracy and consistency throughout. The search query used was: TS = (“phase change materials” OR “bio-based PCM” OR “cementitious materials” OR “energy efficiency” OR (“green building materials” OR “low-carbon materials” OR “renewable building materials”) AND “thermal performance”). The search was limited to articles, reviews, conference papers, and early access publications from 2013 to 2024.
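The scope of the query above depends on operator precedence. Assuming the usual Web of Science rule that AND is evaluated before OR when no parentheses intervene, the “thermal performance” condition applies only to the parenthesised green-materials group, while any of the first four phrases matches on its own. The sketch below transcribes that reading into Python, whose `and` also binds more tightly than `or`. Plain substring matching is a simplification of how topic (TS) search actually works, and the function name is hypothetical.

```python
def matches_topic_query(topic_text):
    """Evaluate the TS query from the text against one record's topic text."""
    text = topic_text.lower()

    def has(phrase):
        return phrase in text

    green_group = (has("green building materials")
                   or has("low-carbon materials")
                   or has("renewable building materials"))

    # `and` binds tighter than `or`, mirroring the assumed WoS precedence:
    # only the green-materials group is restricted by "thermal performance".
    return (has("phase change materials")
            or has("bio-based pcm")
            or has("cementitious materials")
            or has("energy efficiency")
            or green_group and has("thermal performance"))
```

Under this reading, a record that mentions only “energy efficiency” is retrieved even if it has nothing to do with PCMs or cement, which is consistent with the large initial record count and the later relevance screening.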

Keywords were selected based on core concepts in green materials and energy-efficient construction, with terminology standardized according to international conventions on PCMs, cementitious materials, and thermal properties to ensure consistent term matching and accurate retrieval [40,41]. All retrieved records were subjected to duplicate removal, format normalization, and keyword standardization, resulting in a clean and unified research dataset [42,43,44]. A total of 7214 records were collected from the WoS Core Collection; 612 duplicates and non-English records were removed and 674 irrelevant studies were excluded, leaving 5928 relevant publications. WoS was chosen for its comprehensive coverage of materials science and construction engineering, ensuring standardized metadata for reliable bibliometric analysis [45,46].
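The cleaning steps and the screening tally can be sketched as follows. This is only an illustration under stated assumptions: the record fields `doi` and `keywords`, the DOI-based duplicate rule, and the placeholder records are hypothetical, since the text does not specify how duplicates were detected or how keywords were standardized. Only the three counts in the final lines come from the text.

```python
def normalise_keyword(keyword):
    """Lower-case, trim, and collapse internal whitespace (one simple rule)."""
    return " ".join(keyword.lower().split())


def deduplicate(records):
    """Keep the first record for each DOI; records without a DOI are kept."""
    seen, unique = set(), []
    for rec in records:
        doi = (rec.get("doi") or "").strip().lower()
        if doi and doi in seen:
            continue
        if doi:
            seen.add(doi)
        unique.append(rec)
    return unique


# Hypothetical records with placeholder DOIs (illustrative only).
records = [
    {"doi": "10.1000/a", "keywords": ["Phase  Change Materials", "Cement"]},
    {"doi": "10.1000/A", "keywords": ["phase change materials"]},
    {"doi": None, "keywords": ["Thermal Conductivity "]},
]
unique_records = deduplicate(records)
keywords = sorted({normalise_keyword(k) for r in unique_records for k in r["keywords"]})

# Screening tally as reported in the text.
retrieved = 7214
removed_duplicates_or_non_english = 612
removed_irrelevant = 674
retained = retrieved - removed_duplicates_or_non_english - removed_irrelevant
```

The tally is internally consistent: 7214 − 612 − 674 = 5928, the number of publications carried into the bibliometric analysis.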

2.1.2. Publication Trends

Bibliometric data from the Web of Science Core Collection show a significant upward trend in publications since 2013. In particular, the topic of thermal performance in cementitious composites incorporating BPCMs has gained increasing academic attention over the past decade.

As shown in Figure 1a, the annual number of publications between 2013 and 2024 shows a steady growth trajectory, with red bars indicating the yearly output. In the early stage (2013–2015), annual publications were modest, between 200 and 300. Since 2016, a marked increase in research activity has been observed, with the annual number surpassing 350. Key inflection points occurred in 2015 and 2020: the number of publications first exceeded 250 in 2015 and then surpassed 550 in 2020. This growth reflects increasing global investment in green building materials, particularly BPCMs. Since 2021, the annual output has remained consistently above 700 publications, reaching an all-time high of 797 articles in 2024. The trend highlights the growing role of BPCMs in low-carbon construction and energy-efficient design. It also points to the rise of thermal energy storage technologies and cement-based heat regulation systems as central topics [47].

Figure 1b illustrates the distribution of publication types. Journal articles account for the vast majority (89%), followed by review papers (8%), with other formats contributing marginally. This distribution indicates that the field is mainly driven by original research and experimental studies with strong engineering relevance [48].

Thematic analysis reveals an evolution of focus over time. In the early years, most studies were centered on basic thermal characterization of bio-based PCMs [49]. In recent years, research has shifted toward composite material design, thermal performance evaluation, and mechanistic understanding of phase transition behavior. Moreover, increasing attention is being paid to thermal conductivity control, interfacial heat transfer, and other advanced topics, reflecting a deepening and systematization of research in this domain. In parallel, the interdisciplinary scope of the field has expanded significantly [50,51]. Earlier studies were concentrated within materials science and architectural engineering. However, recent publications demonstrate growing integration with environmental science, energy systems, civil engineering, and even artificial intelligence. The shift suggests the emergence of a multidisciplinary research network centered on bio-based PCMs, encompassing thermal energy engineering, microstructural design of cementitious systems, and intelligent performance prediction [52,53].

In summary, over the past twelve years, the field of bio-based PCM–cement composites has witnessed a rapid growth in scholarly output, underscoring its scientific significance and wide-ranging potential in the development of next-generation green building materials.

2.1.3. Knowledge Network Analysis

This study investigates collaborative patterns and knowledge diffusion in cementitious composites incorporating bio-based phase change materials, using a multi-dimensional bibliometric network analysis. By mapping authorship, institutional affiliations, national collaborations, and co-citation structures, we reveal the distribution of academic resources, research impact, and interdisciplinary innovation mechanisms [54]. Results demonstrate the emergence of a multi-layered collaborative network centered on high-productivity authors, influential institutions, and core journals, exhibiting distinct clustering tendencies, geographical linkages, and cross-disciplinary integration.

(1): Country Collaboration Network

As depicted in Figure 2a, China leads the field with the largest node size, highest annual publication output, and citation frequency, highlighting its strong research capacity and significant global influence. The United States, Malaysia, Italy, and Iran serve as secondary hubs and exhibit strong collaborative ties with China, particularly evident in the China–U.S. and China–Malaysia partnerships as indicated by the thickened edges in the network. European countries such as Italy, the United Kingdom, Germany, and France form a central cluster characterized by moderate node sizes but high interconnectivity, reflecting well-established regional research networks. Emerging economies like Iran and Egypt exhibit expanding nodes, signaling growing research engagement.

(2): Institutional Collaboration Network

Figure 2b highlights a China-centric framework, where universities (China University of Mining and Technology, Tsinghua University, Southeast University, Shenzhen University, Tongji University, South China University of Technology) occupy densely connected core positions. Cross-border collaborations with Universiti Teknologi Malaysia, Iran University of Science and Technology, and Università di Pisa are evident, though the overall network remains sparse, suggesting untapped potential for institutional synergy.

(3): High-Impact Literature and Knowledge Diffusion

As depicted in Figure 2c, the document co-citation network identifies seminal works by Ammar Yahia (2016), Kheradmand (2018), and Khudhair (2004) as persistent knowledge anchors in BPCM thermoregulation and composite design. The stable expansion of this network indicates a mature yet dynamically evolving knowledge base.

(4): Author Co-Citation Analysis

Figure 2c,d identify Chinese scholars such as Hong Tianzhen, Yang Da, Zhang Yuhan, and Zhou Bing as central nodes who are frequently co-cited for their contributions to building energy efficiency, PCM heat transfer modeling, and computational optimization. Their work forms the primary knowledge conduit, flanked by international researchers (Petithuguenin, Taylor Hogue, Feng Qian), illustrating a China-led yet globally interconnected citation structure. Peripheral low-frequency authors denote emerging research groups.

(5): Author Collaboration Network

There are tight-knit teams in the network, such as the collaboration between Hong Tianzhen and Yang Da. However, Figure 2e reveals a low global network density, with numerous fragmented small clusters. This indicates limited cross-team integration beyond institutional boundaries.
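Both observations, low density and fragmentation into small clusters, have simple graph-theoretic counterparts. The sketch below uses the usual definition of density for an undirected network, 2E / (N(N − 1)), and finds clusters as connected components; it is assumed, not confirmed by the text, that this matches the density reported by the mapping software. Node labels and edges are placeholders rather than authors from Figure 2e.

```python
def network_density(n_nodes, n_edges):
    """Share of possible undirected links that exist: 2E / (N (N - 1))."""
    if n_nodes < 2:
        return 0.0
    return 2.0 * n_edges / (n_nodes * (n_nodes - 1))


def connected_components(nodes, edges):
    """Group nodes into clusters with a small union-find structure."""
    parent = {n: n for n in nodes}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path halving keeps trees shallow
            n = parent[n]
        return n

    for a, b in edges:
        parent[find(a)] = find(b)

    clusters = {}
    for n in nodes:
        clusters.setdefault(find(n), set()).add(n)
    return list(clusters.values())


# Placeholder co-authorship network: one team of three, one pair, one isolate.
nodes = ["A", "B", "C", "D", "E", "F"]
edges = [("A", "B"), ("B", "C"), ("D", "E")]

density = network_density(len(nodes), len(edges))
clusters = connected_components(nodes, edges)
```

In this toy network only 3 of the 15 possible links exist (density 0.2) and the six authors fall into three disconnected clusters, the same qualitative pattern the text describes for the author collaboration map.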

(6): Journal Co-Citation Analysis

Figure 2f identifies Construction and Building Materials as the core journal in this research domain; it functions as a hub node with 736 co-citations and is followed by Renewable and Sustainable Energy Reviews, Journal of Cleaner Production, and Energy and Buildings. The expanding nodes in Sustainability, Applied Energy, and the Journal of Thermal Analysis and Calorimetry reflect a disciplinary convergence toward environmental management and energy systems.

Studies have increasingly focused on the thermal and mechanical performance of cementitious composites incorporating bio-based phase change materials (BPCMs). Materials such as stearic acid, palmitic acid, plant waxes, and PEG compounds have been evaluated for their latent heat capacity, thermal conductivity, and compressive strength after incorporation into cement matrices. These BPCMs offer dual functionality by enabling heat storage while maintaining acceptable structural integrity. However, challenges such as dispersion, compatibility, and thermal stability remain critical factors influencing their practical application.

2.2. Evolution of Research Trends

Recent research on bio-based PCM-integrated cementitious composites has predominantly focused on the precise characterization of thermal-physical parameters, such as thermal conductivity, latent heat, and phase change temperature. In recent years, emerging terms such as “bio-based PCM”, “thermal regulation”, and “machine learning modeling” have gained marked prominence, highlighting a shift in the field from traditional material development toward intelligent modeling and multi-functional system integration. Further analysis reveals growing attention to the interrelationship between composite performance and key durability and microstructural properties. This trend underscores an evolving research focus toward understanding the coupled behavior of thermal, mechanical, and structural parameters, which is critical for advancing the practical application of BPCM cement pastes in sustainable construction systems.

2.2.1. Main Terms Analysis

To comprehensively analyze research hotspots and the thematic evolution of bio-based phase change materials in cementitious building materials, this study used CiteSpace, a bibliometric software package, to generate keyword co-occurrence and clustering results for core publications from 2013 to 2024 retrieved from the Web of Science Core Collection [55,56].

As shown in Figure 3a, high-frequency keywords such as “performance”, “model”, “energy efficiency”, “thermal conductivity”, and “conservation” form the dense core of the co-occurrence network. The size and color of each node represent keyword frequency and temporal span, respectively, with a color gradient from cool to warm indicating the chronological evolution of research focus. The dense interlinkages between nodes reflect strong semantic and thematic associations, constituting the intellectual foundation of the field [57,58,59]. The co-occurrence network visualizes the connections and clustering of prominent research topics. Larger nodes like “performance”, “model”, “energy conservation”, and “thermal conductivity” indicate concentrated academic attention and high relevance.
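The two quantities behind such a map, keyword frequency (node size) and pairwise co-occurrence (link weight), can be counted directly from per-paper keyword lists. The sketch below is a generic illustration and not a description of CiteSpace internals; the three keyword lists are hypothetical.

```python
from collections import Counter
from itertools import combinations


def keyword_cooccurrence(keyword_lists):
    """Return (frequency per keyword, count per unordered keyword pair)."""
    freq, pairs = Counter(), Counter()
    for keywords in keyword_lists:
        unique = sorted(set(keywords))  # count each keyword once per paper
        freq.update(unique)
        pairs.update(combinations(unique, 2))
    return freq, pairs


# Hypothetical keyword lists for three papers (illustrative only).
papers = [
    ["performance", "model", "thermal conductivity"],
    ["performance", "energy efficiency"],
    ["model", "performance"],
]
freq, pairs = keyword_cooccurrence(papers)
```

Here “performance” appears in all three papers and co-occurs twice with “model”, so it would be drawn as the largest node with its strongest link to “model”.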

As shown in Figure 3b, the clustering algorithm segmented the keywords into five major thematic modules, labeled #0 through #4.

Cluster #0, “Energy Efficiency”: covering topics such as building energy saving, PCM-based composites, and thermal regulation mechanisms;

Cluster #1, “Energy Conservation”: emphasizing energy-saving strategies and holistic energy management, representing a core research trajectory over the past decade;

Cluster #2, “Mechanical Properties”: focusing on the impact of PCM incorporation on cementitious material strength, including compressive strength and elastic modulus;

Cluster #3, “Thermal Energy Storage”: addressing latent heat behavior, energy storage efficiency, and thermal cycling stability, central to PCM research;

Cluster #4, “Machine Learning”: signifying the rise of data-driven modeling and the increasing role of artificial intelligence in performance prediction and material optimization.

Figure 3c reveals the emergence and development of these clusters over time. Long-standing core topics such as “energy efficiency” and “energy conservation” have remained active since 2013. In contrast, emerging keywords such as “machine learning” and “artificial intelligence” have shown rapid growth since 2020, indicating a shift toward intelligent modeling and interdisciplinary integration. The research focus has evolved from early concerns such as “energy conservation”, “thermal conductivity”, and “heat transfer” toward more nuanced topics including “microstructure”, “composite materials”, “optimization”, and “machine learning”, demonstrating the field’s increasing complexity and depth.

Figure 3d presents the keyword burst detection results from 2013 to 2024. Burst strength indicates a sharp rise in keyword citation frequency within a specific period, signaling high attention at that time. For example, “conservation” showed a burst strength of 23.23 during 2013–2018, marking it as a foundational research term in the early stage. Similarly, “energy conservation” and “commercial buildings” showed significant surges during 2014–2018 and 2015–2019, respectively, reflecting the strong relevance of energy-saving strategies and real-world application contexts. Since 2020, emerging terms such as “machine learning”, “waste”, and “artificial intelligence” have become increasingly active, indicating a shift from conventional experimental evaluation to interdisciplinary approaches incorporating intelligent prediction, low-carbon material design, and system-level performance analysis.

Overall, the keyword landscape shows a clear trajectory: research has expanded from unidimensional thermal performance and energy storage studies toward multidimensional investigations of mechanical compatibility, microstructural control, and intelligent optimization [60]. While “energy efficiency” and “thermal storage” remain dominant themes, urban thermal environment responses have not yet formed a concentrated research cluster, highlighting opportunities for future work on engineering feasibility and multi-scale performance evaluation [18,60,61].

2.2.2. Performance Evaluation of Bio-Based Phase Change Materials

Bio-based phase change materials (BPCMs) are environmentally friendly and renewable thermal-regulating materials. They include fatty acids, natural waxes, vegetable oil derivatives, and biomass-derived polyethylene glycols (PEGs) [62]. Among them, fatty acid-based PCMs are particularly valued in high-temperature cement composite applications due to their high melting points and large enthalpy of fusion [63]. PEGs derived from biomass-based feedstocks have been widely studied in green building applications owing to their excellent specific heat capacity, low toxicity, chemical stability during phase transition, and biodegradability [64]. These materials enable controlled thermal buffering during solid–liquid transitions, making them a preferred functional medium in low-carbon cementitious systems.

Particle size is a key parameter affecting the thermal efficiency and leakage control of BPCMs in cement-based matrices [65]. The particle size of BPCMs varies depending on the form, ranging from microencapsulated particles (~0.2 μm) to bulk granules (~9.5 mm), with average sizes for practical applications around 0.4–1 mm. For PEG2000 and PEG4000, the typical particle sizes are about 0.4 mm and 0.6 mm, respectively. Owing to their high specific surface area, smaller particles promote homogeneous dispersion in the cementitious matrix and improve thermal responsiveness and heat conduction. These fine particles facilitate rapid thermal exchange within porous matrices but also pose a greater risk of leakage, especially in unencapsulated systems or those lacking effective gelation [38,39]. In contrast, larger particles reduce leakage risk but may suffer from uneven distribution and reduced thermal regulation efficiency. Therefore, optimal particle size must be determined based on absorbability, porosity, permeability, and encapsulation efficiency to balance heat transfer performance with leakage control.
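The link between particle size and specific surface area can be made concrete with a simple geometric estimate. Assuming spherical particles, which the text does not state, the surface-to-volume ratio is πd² / (πd³/6) = 6/d, so it scales inversely with diameter. The sketch below applies this to the typical PEG2000 and PEG4000 sizes quoted above; the function name is hypothetical.

```python
def surface_to_volume_ratio(diameter_mm):
    """Surface area / volume of a sphere in 1/mm: (pi d^2) / (pi d^3 / 6) = 6 / d."""
    if diameter_mm <= 0:
        raise ValueError("diameter must be positive")
    return 6.0 / diameter_mm


# Typical particle sizes quoted in the text (mm); spherical shape is an assumption.
ratio_peg2000 = surface_to_volume_ratio(0.4)  # about 15 per mm
ratio_peg4000 = surface_to_volume_ratio(0.6)  # about 10 per mm
relative_gain = ratio_peg2000 / ratio_peg4000  # about 1.5
```

Under this idealisation, the 0.4 mm particles expose roughly 50% more surface per unit volume than the 0.6 mm particles, which favours heat exchange but also enlarges the area through which molten PCM can leak.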

Among thermal performance parameters, melting point and enthalpy of fusion are the most representative indicators for bio-based PCMs [66]. The melting point defines the effective temperature range for phase transition and serves as a fundamental parameter in the design of PCM-enhanced thermal regulation systems. Low-melting-point PCMs may fail prematurely in high-temperature environments, while PCMs with excessively high melting points may not undergo phase change under typical building conditions. The melting points of bio-based PCMs range from 3 °C to 80 °C, averaging 46.67 °C, which aligns well with typical surface temperature fluctuations in building components. For internal thermal stabilization, PCMs with melting points between 45 and 60 °C are generally more appropriate [53,66]. Several bio-based PCMs such as stearic acid, plant waxes, and PEG compounds exhibit excellent heat storage capacities. These materials were selected based on their high latent heat capacities, suitable melting points, and chemical compatibility with cementitious matrices. Stearic acid and palmitic acid are long-chain saturated fatty acids typically derived from plant oils. They exhibit melting points of approximately 69 °C and 63 °C, respectively, and possess high latent heat storage capacities (~210 J/g and ~195 J/g). Their relatively sharp melting transitions and good thermal stability make them effective for temperature regulation in building applications. However, their hydrophobic nature and limited dispersibility in aqueous environments may hinder uniform integration into cement matrices. Plant wax, primarily composed of natural esters, fatty alcohols, and long-chain hydrocarbons, offers a broader phase transition range (~60–70 °C) and exhibits moderate thermal stability. Its semi-solid state at room temperature reduces the risk of leakage during phase transition, and its bio-based origin supports sustainability goals. 
PEG 6000, a bio-derived polyether compound, was included as a polymeric PCM due to its melting point (~61 °C) and latent heat (~170 J/g). To improve its stability and prevent leakage during thermal cycling, PEG 6000 was applied in a microencapsulated form. The encapsulation process involved enclosing the PCM in a polymeric shell, which improved dispersion, interfacial bonding, and retention within the cementitious matrix.
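As a rough guide to what these latent heats mean once the PCM is diluted in cement paste, an idealised rule of mixtures can be used: latent heat per gram of composite ≈ PCM mass fraction × PCM latent heat. The sketch below applies it to the nominal values quoted above. It assumes the paste itself contributes no latent heat in the PCM melting range; the `retention` factor and the 10% dosage are hypothetical illustration values, not parameters reported in this study.

```python
def composite_latent_heat(pcm_latent_heat_j_per_g, pcm_mass_fraction, retention=1.0):
    """Idealised latent heat per gram of composite (J/g).

    retention is a hypothetical factor (0 to 1) for PCM lost to leakage or
    rendered inactive; 1.0 means every gram of PCM still changes phase.
    """
    if not 0.0 <= pcm_mass_fraction <= 1.0:
        raise ValueError("mass fraction must lie between 0 and 1")
    if not 0.0 <= retention <= 1.0:
        raise ValueError("retention must lie between 0 and 1")
    return pcm_latent_heat_j_per_g * pcm_mass_fraction * retention


# Nominal latent heats quoted in the text (J/g).
nominal_latent_heat = {"stearic acid": 210.0, "palmitic acid": 195.0, "PEG 6000": 170.0}

dosage = 0.10  # hypothetical PCM mass fraction of the composite
estimates = {name: composite_latent_heat(lh, dosage)
             for name, lh in nominal_latent_heat.items()}
```

At this hypothetical dosage the upper-bound estimates are about 21, 19.5, and 17 J/g of composite; measured values would normally be lower because of encapsulation shells, leakage, and incomplete phase change.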

For instance, stearic acid offers an enthalpy range of 129.6–221.6 J/g and melting points between 40–80 °C, combining high energy storage with thermal stability, which makes it suitable for high heat-load building structures [67]. Similarly, PEG4000 and PEG2000 possess latent heat values of 135.2 J/g and 180 J/g, respectively. Although their thermal capacity is lower than that of stearic acid, their superior compatibility, low toxicity, and good processability make them mainstream candidates for integration into cementitious systems [68]. By contrast, some PCMs like n-tetradecane or palmitic acid offer high latent heat but have melting points below the operating range of most cement-based elements, limiting their applicability in construction [69].

Beyond thermal parameters, the physicochemical compatibility between BPCMs and cementitious matrices must also be considered [70]. Issues such as acidic degradation, thermal expansion mismatch, and poor dispersion may lead to mechanical property deterioration. High-enthalpy PCMs are more suitable for applications with large diurnal temperature variations and high peak thermal loads, such as exterior building envelopes. Medium-enthalpy, high-stability PCMs are more appropriate for internal mortar-based thermal regulation layers [71]. When appropriately selected, these materials enable passive thermal regulation that reduces energy peak demands and delays heat flux transmission within structural elements [72,73]. Despite their promising energy storage potential and environmental benefits, the practical application of BPCMs in cement systems still faces multiple challenges, such as leakage control, thermal degradation, and inconsistent dispersion [74,75]. Selection based solely on thermal performance is insufficient; instead, a holistic evaluation involving particle size, interface compatibility, and service durability is essential. Currently, most investigations remain at the laboratory scale, and standardized testing protocols or life-cycle performance evaluations for thermal–mechanical synergy in real-world cement matrices are still lacking [76]. Therefore, both the intrinsic thermal behavior and interfacial interactions with the cementitious matrix jointly determine the feasibility of BPCMs in energy-efficient construction [77,78].

2.2.3. Analysis of Thermal and Mechanical Characteristics in Cement Pastes with Different BPCMs

Integrating bio-based phase change materials into cementitious materials requires a careful balance between optimizing thermal performance and maintaining mechanical stability [79,80]. However, the incorporation of BPCMs can have multifaceted impacts on the mechanical properties of cement paste. Studies have shown that increasing PCM dosage often leads to reductions in compressive strength and splitting tensile strength. The degradation is primarily attributed to the low elastic modulus of BPCMs compared to cement hydration products. In addition, PCM particles tend to induce microvoids or interfacial debonding, thereby compromising matrix densification and structural integrity [81,82]. The negative impact becomes more pronounced when the particle size of BPCMs exceeds 1 mm, due to poor dispersion and localized stress concentration. In contrast, microencapsulated BPCMs with smaller particle sizes exhibit improved interfacial compatibility and uniform distribution within the matrix, which mitigates mechanical deterioration [83,84]. Although overall strength is reduced, some mechanical parameters improve with PCM incorporation [85]. BPCM-modified mortars also show enhanced durability under repeated thermal–moisture cycling, with higher resistance to thermal fatigue than conventional mixes [86,87].

From another perspective, the type of BPCM plays a decisive role in determining heat regulation efficiency. Commonly used PEG-based PCMs (e.g., PEG2000, PEG4000) have melting points close to the peak surface temperatures of cementitious elements under solar exposure, allowing for precise heat release during high-temperature periods [88]. Experimental results indicate that incorporating PEG-based BPCMs can lower internal core temperatures by approximately 7–9 °C under intense solar radiation. In addition, they significantly reduce heat flux propagation, enabling a delayed peak thermal response [89]. The optimal dosage of BPCMs remains a key research focus. Most studies suggest a threshold of approximately 14 wt%, which achieves a favorable balance between thermal regulation and mechanical performance, with limited compromise in structural strength [90].

However, the type, encapsulation method, and dispersion technique of BPCMs significantly influence both thermal and mechanical behavior. For instance, high-purity fatty acid PCMs, while exhibiting high latent heat values, may undergo alkaline degradation in the cementitious environment, adversely affecting interfacial structure [91]. In contrast, microencapsulated BPCMs such as PEG 6000 exhibit superior thermal and chemical stability and improved compatibility with cementitious materials. The PEG was microencapsulated using industrial spray-drying, forming spherical particles with an average diameter of approximately 10 to 20 μm, as provided by the manufacturer. This encapsulation process creates a polymeric shell that effectively isolates the PCM core from the alkaline cement matrix, thereby preventing leakage, reducing chemical degradation, and enhancing interfacial bonding with the hydration products. The smaller particle size also facilitates uniform dispersion in the cement paste, minimizing agglomeration and improving workability. These characteristics make microencapsulated BPCMs more suitable for practical engineering applications than non-encapsulated PCMs.

Polyethylene glycol (PEG 6000) with an average molecular weight of approximately 6000 g/mol and a purity of ≥99% was used as the polymeric phase change material. The material was incorporated at a dosage of 10–15 wt% by mass of cement. PEG 6000 was selected due to its melting point of ~61 °C and latent heat capacity of ~170 J/g, which are well-suited for thermal regulation in building applications. Additionally, this grade exhibits good chemical compatibility with cementitious matrices, minimizing adverse interactions with hydration products. It is noted that PEGs with different molecular weights may display significantly different melting temperatures, latent heats, and dispersion behaviors. Therefore, the selected PEG 6000 was considered the most appropriate balance between phase change performance, stability, and practical applicability in cementitious composites.

Additionally, polyurethane-encapsulated BPCMs with high isocyanate ratios offer improved leakage resistance and thermal durability [92]. Overall, BPCMs show significant potential for improving the thermal regulation capabilities of cement paste, although they may compromise mechanical strength. Optimizing particle size, dosage, and encapsulation strategy enables a practical balance between structural integrity and thermal control [93].

In the context of increasing energy demands, cementitious materials incorporating BPCMs represent a promising solution for peak-load reduction via latent heat storage. These materials offer practical strategies for developing low-carbon, durable, and energy-efficient building envelopes [94,95]. However, the underlying mechanisms governing mechanical degradation remain insufficiently understood and are influenced by multiple interacting factors, including PCM type, dosage, particle size, encapsulation, and spatial distribution. Future studies should focus on the thermo-mechanical coupling mechanisms of different BPCMs under various climatic conditions and structural applications.

2.3. Machine Learning Model

To improve the generalization ability of the models, 80% of the data were allocated to the training set, while 20% were reserved for the validation set. This approach allows for the evaluation and comparative analysis of the influence of various material parameters on the performance of BPCM cement pastes.
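The 80/20 partition described above can be sketched as follows. This is a minimal illustration only: the function name, the fixed seed, and the use of integer indices as stand-ins for the experimental records are assumptions, not details taken from the study.

```python
import random

def train_validation_split(samples, train_fraction=0.8, seed=0):
    """Shuffle a copy of `samples` and split it into training and validation lists."""
    rng = random.Random(seed)          # fixed seed (assumed) for reproducibility
    shuffled = list(samples)
    rng.shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]

# 100 placeholder indices stand in for the 100 experimental samples.
train, validation = train_validation_split(range(100))
print(len(train), len(validation))
```

With 100 samples, this yields 80 training records and 20 validation records, and every record appears in exactly one of the two sets.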

2.3.1. Support Vector Regression (SVR)

Support Vector Regression (SVR) is particularly effective for small-sample datasets and highly nonlinear regression tasks, owing to its reliance on the principle of structural risk minimization. This feature allows SVR to maintain strong generalization while mitigating overfitting, making it well-suited for modeling the thermophysical properties of complex multivariable systems such as cementitious composites containing BPCMs. However, the predictive accuracy of SVR is highly sensitive to the choice of kernel function and the setting of key hyperparameters. Suboptimal parameter configurations can result in underfitting or overfitting. To overcome this limitation, metaheuristic algorithms such as the Genetic Algorithm (GA), Particle Swarm Optimization (PSO), and Whale Optimization Algorithm (WOA) are employed to perform global hyperparameter tuning, avoiding local minima and improving model robustness [96]. In this study, SVR combined with metaheuristic optimization algorithms provides an efficient modeling strategy for systems characterized by limited data, nonlinear coupling, and unstable neural network convergence. This method offers a practical strategy toward accurate prediction of thermal behavior in BPCM-modified cementitious materials [97].

2.3.2. Random Forest (RF)

Random Forest (RF) is a bagging-based ensemble learning method composed of multiple decision trees. It is particularly advantageous in regression tasks involving limited datasets with complex input-output relationships [98]. RF models are robust to noise and outliers, make no assumptions about feature relationships, and output ranked feature importance metrics. These benefits support effective sensitivity analysis in multivariable systems. In the study, the optimized RF model demonstrated strong predictive performance across all three target properties: thermal conductivity, latent heat, and compressive strength [23,99]. Due to its robustness and adaptability, Random Forest is well-suited for data-driven analysis and prediction of thermal behavior in sustainable building materials, especially in small-sample datasets.

2.3.3. Extreme Gradient Boosting (XGBoost)

XGBoost is an advanced implementation of Gradient Boosting Decision Trees (GBDT). It introduces several improvements over traditional GBDT models, including L1 and L2 regularization, optimized tree-splitting algorithms, built-in handling of missing values, and parallel computation [100]. These features yield high accuracy, efficiency, and robustness, even in high-dimensional, nonlinear scenarios [25]. XGBoost is well-suited for learning complex feature-response mappings without the need for explicit mathematical formulations. In this study, hyperparameter tuning was conducted via k-fold cross-validation and grid search [27]. Results show that XGBoost achieved low RMSE and high R² in thermal property prediction, outperforming conventional regression models. These capabilities make XGBoost an efficient and accurate choice for modeling heat conduction and storage behavior in building materials.

2.3.4. Categorical Boosting (CatBoost)

CatBoost was developed to handle categorical variables and complex, high-dimensional data structures, making it well-suited for material property prediction tasks where such challenges are common. Its core strengths include ordered target encoding, symmetric tree structures, and robust regularization strategies [101,102]. In this study, several input variables show strong categorical characteristics. CatBoost’s ordered target encoding avoids target leakage while maintaining model stability in small datasets. Moreover, its symmetric tree structure ensures consistent decision paths across datasets, reducing variance and enhancing generalization. Experimental results indicate that CatBoost outperformed other models in predicting all three target variables, achieving high accuracy and stability [103].

2.4. Optimization Algorithms

2.4.1. Genetic Algorithm (GA)

The accuracy and generalization capability of machine learning models such as SVR, RF, XGBoost, and CatBoost are highly sensitive to hyperparameter configurations, particularly in nonlinear, multivariate regression tasks [104,105]. Conventional methods such as grid or random search often fail to efficiently explore high-dimensional, non-convex parameter spaces, easily falling into local optima. The Genetic Algorithm (GA), a population-based metaheuristic inspired by natural selection and genetic evolution, is therefore introduced for global hyperparameter optimization [106]. GA encodes parameter sets as chromosomes and generates an initial population. It then applies selection, crossover, and mutation operations guided by a fitness function, enabling global search and convergence toward optimal solutions [107,108].
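The selection, crossover, and mutation loop can be illustrated with a short real-coded GA. This is a sketch under stated assumptions rather than the configuration used in the study: `toy_error` is a hypothetical stand-in for the validation error of a model trained with a candidate hyperparameter pair, and the tournament selection, uniform crossover, Gaussian mutation, elitism, and all numeric settings are illustrative choices.

```python
import random

def toy_error(params):
    """Hypothetical stand-in for a validation error; minimum 0 at (2.0, -1.0)."""
    x, y = params
    return (x - 2.0) ** 2 + (y + 1.0) ** 2

def genetic_algorithm(fitness, bounds, pop_size=30, generations=60,
                      mutation_rate=0.2, seed=0):
    rng = random.Random(seed)
    # Each chromosome is one candidate parameter set inside the search bounds.
    population = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness)
        next_gen = population[:2]  # elitism: carry over the two best chromosomes
        while len(next_gen) < pop_size:
            # Tournament selection: the better of two randomly drawn individuals.
            p1 = min(rng.sample(population, 2), key=fitness)
            p2 = min(rng.sample(population, 2), key=fitness)
            # Uniform crossover: each gene is copied from one of the two parents.
            child = [a if rng.random() < 0.5 else b for a, b in zip(p1, p2)]
            # Gaussian mutation, clipped to the search bounds.
            for i, (lo, hi) in enumerate(bounds):
                if rng.random() < mutation_rate:
                    step = rng.gauss(0.0, 0.1 * (hi - lo))
                    child[i] = min(hi, max(lo, child[i] + step))
            next_gen.append(child)
        population = next_gen
    return min(population, key=fitness)

best = genetic_algorithm(toy_error, bounds=[(-5.0, 5.0), (-5.0, 5.0)])
print(best, toy_error(best))
```

In a real tuning run, the fitness function would train the model with the decoded hyperparameters and return a cross-validated error instead of evaluating a closed-form expression.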

2.4.2. Particle Swarm Optimization (PSO)

Particle Swarm Optimization (PSO) is a widely adopted swarm intelligence algorithm that simulates the social behavior of bird flocks or fish schools. Each solution is represented as a particle, which updates its position based on both its own experience and the global best position identified by the swarm. This mechanism facilitates efficient global optimization in high-dimensional, nonlinear problem spaces. In this study, PSO is employed to optimize key hyperparameters of SVR, XGBoost, and CatBoost, including learning rate, subsample ratios, minimum split loss, and tree depth. PSO offers strong search capability, high computational efficiency, and easy parallelization. Compared with grid or random search, it achieves faster convergence in non-convex landscapes and is less prone to local optima.
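The position update driven by personal and global bests can be sketched as below. The inertia weight `w`, the acceleration coefficients `c1` and `c2`, the swarm size, and the `toy_error` objective (a hypothetical stand-in for a model's validation error) are all assumptions made for illustration; the study does not report them in this section.

```python
import random

def toy_error(position):
    """Hypothetical stand-in for a validation error; minimum 0 at (2.0, -1.0)."""
    x, y = position
    return (x - 2.0) ** 2 + (y + 1.0) ** 2

def particle_swarm(fitness, bounds, n_particles=20, iterations=100,
                   w=0.7, c1=1.5, c2=1.5, seed=0):
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]               # each particle's own best position
    gbest = min(pbest, key=fitness)[:]        # best position found by the swarm
    for _ in range(iterations):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull toward personal best + pull toward global best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            if fitness(pos[i]) < fitness(pbest[i]):
                pbest[i] = pos[i][:]
                if fitness(pbest[i]) < fitness(gbest):
                    gbest = pbest[i][:]
    return gbest

best = particle_swarm(toy_error, bounds=[(-5.0, 5.0), (-5.0, 5.0)])
print(best, toy_error(best))
```

Because each particle's fitness evaluation is independent within an iteration, the inner loop is the part that would be parallelized in practice.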

2.4.3. Whale Optimization Algorithm (WOA)

The Whale Optimization Algorithm (WOA) is a novel metaheuristic inspired by humpback whales’ bubble-net hunting strategy. WOA balances global and local search via two core mechanisms: spiral updating and encircling prey. Its structure and efficient convergence make it suitable for complex regression model tuning [109,110]. For SVR, WOA optimizes the penalty parameter, kernel type, and kernel width. For RF, it adjusts tree number, split criteria, and sampling ratio. For XGBoost and CatBoost, WOA searches over learning rate, tree depth, subsampling ratio, and regularization terms [111]. In this study, WOA optimizes model hyperparameters by minimizing prediction error as the fitness function.
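The two mechanisms named above, encircling prey and spiral updating, can be sketched in a compact form. This is an illustrative variant, not the authors' implementation: it draws one scalar `A` and `C` per whale, uses a spiral constant `b = 1`, and minimizes `toy_error`, a hypothetical stand-in for prediction error.

```python
import math
import random

def toy_error(position):
    """Hypothetical stand-in for a validation error; minimum 0 at (2.0, -1.0)."""
    x, y = position
    return (x - 2.0) ** 2 + (y + 1.0) ** 2

def whale_optimization(fitness, bounds, n_whales=20, iterations=100, b=1.0, seed=0):
    rng = random.Random(seed)
    dim = len(bounds)
    whales = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_whales)]
    best = min(whales, key=fitness)[:]
    for t in range(iterations):
        a = 2.0 * (1.0 - t / iterations)   # decreases linearly from 2 toward 0
        for i in range(n_whales):
            whale = whales[i]
            A = 2.0 * a * rng.random() - a
            C = 2.0 * rng.random()
            if rng.random() < 0.5:
                # Encircle the best whale when |A| < 1; otherwise explore
                # around a randomly chosen whale.
                leader = best if abs(A) < 1.0 else rng.choice(whales)
                new = [leader[d] - A * abs(C * leader[d] - whale[d]) for d in range(dim)]
            else:
                # Spiral (bubble-net) move around the best whale.
                l = rng.uniform(-1.0, 1.0)
                new = [abs(best[d] - whale[d]) * math.exp(b * l)
                       * math.cos(2.0 * math.pi * l) + best[d] for d in range(dim)]
            whales[i] = [min(hi, max(lo, v)) for v, (lo, hi) in zip(new, bounds)]
            if fitness(whales[i]) < fitness(best):
                best = whales[i][:]
    return best

best = whale_optimization(toy_error, bounds=[(-5.0, 5.0), (-5.0, 5.0)])
print(best, toy_error(best))
```

The shrinking coefficient `a` is what shifts the search from exploration early on to exploitation near the end of the run.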

2.4.4. Grey Wolf Optimizer (GWO)

Grey Wolf Optimizer (GWO) is a method based on the social hierarchy and cooperative hunting behavior of grey wolves. It simulates the leadership roles of alpha, beta, delta, and omega wolves to navigate complex, high-dimensional solution spaces through exploration–exploitation balance [112]. In this study, GWO is used to optimize SVR, RF, and boosting model settings. GWO does not require gradient information and adapts well to multi-modal and discontinuous objective functions. Its hierarchical search procedure promotes robust convergence and effective escape from local minima [113,114]. Models optimized by GWO are developed under the fitness criterion of minimizing model prediction error. They deliver strong regression performance and high generalization capacity, making them well-suited for nonlinear modeling of complex thermal behavior in cementitious composites.
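The leadership hierarchy can be sketched as follows: in each iteration the three fittest wolves act as alpha, beta, and delta, and every wolf moves to the average of the positions suggested by those leaders. The pack size, iteration count, and `toy_error` objective (a hypothetical stand-in for prediction error) are assumptions for illustration only.

```python
import random

def toy_error(position):
    """Hypothetical stand-in for a validation error; minimum 0 at (2.0, -1.0)."""
    x, y = position
    return (x - 2.0) ** 2 + (y + 1.0) ** 2

def grey_wolf_optimizer(fitness, bounds, n_wolves=20, iterations=100, seed=0):
    rng = random.Random(seed)
    wolves = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_wolves)]
    best = min(wolves, key=fitness)[:]
    for t in range(iterations):
        # Alpha, beta, and delta: the three fittest wolves of this iteration.
        leaders = [w[:] for w in sorted(wolves, key=fitness)[:3]]
        a = 2.0 * (1.0 - t / iterations)   # decreases linearly from 2 toward 0
        for i in range(n_wolves):
            wolf = wolves[i]
            new = []
            for d, (lo, hi) in enumerate(bounds):
                guided = []
                for leader in leaders:
                    A = 2.0 * a * rng.random() - a
                    C = 2.0 * rng.random()
                    guided.append(leader[d] - A * abs(C * leader[d] - wolf[d]))
                # New coordinate: mean of the three leader-guided proposals.
                new.append(min(hi, max(lo, sum(guided) / 3.0)))
            wolves[i] = new
        candidate = min(wolves, key=fitness)
        if fitness(candidate) < fitness(best):
            best = candidate[:]
    return best

best = grey_wolf_optimizer(toy_error, bounds=[(-5.0, 5.0), (-5.0, 5.0)])
print(best, toy_error(best))
```

No gradient of the objective is evaluated anywhere in the loop, which is why the method tolerates discontinuous fitness functions.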

2.4.5. Firefly Algorithm (FFA)

The Firefly Algorithm (FFA) is a bio-inspired optimization technique based on the luminescent attraction behavior of fireflies. It is particularly suitable for solving nonlinear, multi-modal, and high-dimensional problems, making it effective for hyperparameter tuning in machine learning models [115,116]. In SVR, the Firefly Algorithm tunes the kernel function parameters and regularization terms. For Random Forest, it adjusts the number of trees, maximum depth, and sampling methods. In boosting models, it optimizes the learning rate, tree structure, and regularization parameters. Each candidate solution is treated as a firefly whose brightness corresponds to prediction accuracy. In the Firefly Algorithm, fireflies are attracted to brighter individuals based on their objective function values. This procedure enables global search while helping the algorithm avoid convergence to local optima [117].
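The attraction rule can be sketched as below, treating lower error as higher brightness. The attractiveness `beta0`, absorption coefficient `gamma`, random step size `alpha` and its decay factor, and the `toy_error` objective (a hypothetical stand-in for prediction error) are illustrative assumptions rather than reported settings.

```python
import math
import random

def toy_error(position):
    """Hypothetical stand-in for a validation error; minimum 0 at (2.0, -1.0)."""
    x, y = position
    return (x - 2.0) ** 2 + (y + 1.0) ** 2

def firefly_algorithm(fitness, bounds, n_fireflies=15, iterations=100,
                      beta0=1.0, gamma=0.05, alpha=0.3, seed=0):
    rng = random.Random(seed)
    flies = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_fireflies)]
    best = min(flies, key=fitness)[:]
    for _ in range(iterations):
        for i in range(n_fireflies):
            for j in range(n_fireflies):
                # Firefly i moves toward j only if j is brighter (lower error).
                if fitness(flies[j]) < fitness(flies[i]):
                    r2 = sum((a - b) ** 2 for a, b in zip(flies[i], flies[j]))
                    beta = beta0 * math.exp(-gamma * r2)  # attraction fades with distance
                    flies[i] = [
                        min(hi, max(lo, xi + beta * (xj - xi)
                                    + alpha * (rng.random() - 0.5) * (hi - lo)))
                        for xi, xj, (lo, hi) in zip(flies[i], flies[j], bounds)
                    ]
            if fitness(flies[i]) < fitness(best):
                best = flies[i][:]
        alpha *= 0.97  # shrink the random step so the swarm settles
    return best

best = firefly_algorithm(toy_error, bounds=[(-5.0, 5.0), (-5.0, 5.0)])
print(best, toy_error(best))
```

The random term keeps fireflies from collapsing onto the first bright individual, which is the mechanism behind the global search behavior described above.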

2.5. Development of Predictive Models

In this study, four mainstream machine learning algorithms were employed to model and predict the performance of cementitious composites incorporating BPCMs. These algorithms include Support Vector Regression (SVR), Random Forest (RF), Extreme Gradient Boosting (XGBoost), and Categorical Boosting (CatBoost). A total of 100 experimental data samples were prepared, covering a wide range of parameter combinations including PCM dosage, water-to-cement ratio, curing age, and environmental conditions, to comprehensively capture multi-factor influences. The dataset was generated through laboratory experiments using ordinary Portland cement, fine and coarse aggregates, water, and BPCMs. Specimens (50 × 50 × 50 mm cubes) were cured at 20 ± 2 °C and ≥90% relative humidity. Compressive strength was tested according to the Chinese standard GB/T 50081-2019 [118] using a calibrated universal testing machine. Latent heat was measured with a differential scanning calorimeter (DSC, TA Instruments, New Castle, DE, USA), and thermal conductivity was determined using the transient plane source method with TPS (Hot Disk).

To promote generalization and predictive stability, the dataset was randomly divided into training and validation sets at an 8:2 ratio. A 10-fold cross-validation strategy was adopted for model evaluation to ensure robustness and minimize overfitting. To overcome the challenges of hyperparameter sensitivity and local optima, metaheuristic optimization algorithms were employed to fine-tune each model. The critical hyperparameters of the four ML models (SVR, RF, XGBoost, and CatBoost) were independently optimized using five metaheuristic algorithms, namely PSO, GA, WOA, GWO, and FFA. This procedure resulted in a total of 20 hybrid models. These optimization strategies significantly improved model accuracy and robustness by enabling global exploration of complex parameter spaces. For SVR, the penalty parameter C and kernel coefficient γ were optimized, controlling the balance between model flexibility and generalization. For RF, the number of trees (n_estimators) and maximum tree depth (max_depth) were tuned to regulate model complexity. For boosting-based models (XGBoost and CatBoost), learning rate, depth, and the number of estimators/iterations were adjusted to control convergence speed, model depth, and overall learning capacity. The population sizes and iteration numbers for the optimizers were chosen to balance computational efficiency and solution quality. Table 1 shows the search ranges and optimized hyperparameters.
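The 10-fold cross-validation step can be sketched as an index generator. The text does not state whether the folds are drawn from the full dataset or only from the training portion; the example below assumes the 80 training samples, and the function name and seed are hypothetical.

```python
import random

def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, val_idx) pairs; each sample is held out exactly once."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]   # k interleaved, near-equal folds
    for i in range(k):
        val_idx = folds[i]
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, val_idx

# Assumed: folds are built from the 80 training samples of the 8:2 split.
splits = list(k_fold_indices(80, k=10))
print(len(splits), len(splits[0][0]), len(splits[0][1]))
```

A metaheuristic's fitness value for one hyperparameter set would then be the error averaged over these ten held-out folds.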

As illustrated in Figure 4, the overall modeling framework consists of four sequential stages: (1) Experimental data collection and feature construction; (2) Key factor analysis; (3) Model development and hyperparameter optimization; and (4) Model validation and performance assessment.

Each stage is logically interconnected, beginning with experimental data collection and progressing through feature analysis, model development, and optimization. The incorporation of metaheuristic algorithms into the workflow highlights the importance of intelligent hyperparameter tuning in improving model accuracy and generalizability. This structured pipeline ensures that both domain knowledge and computational efficiency are integrated to achieve reliable predictions for composite material behavior.

Pearson correlation analysis was conducted to examine the linear associations among input variables, determine their statistical significance, and assess their potential impact on output targets [119]. Three types of graphical visualizations were used to present the results. Figure 5a presents a standard correlation heatmap. Figure 5b shows the statistical significance annotations for the variable correlations. Figure 5c combines the Pearson correlation coefficients with the corresponding p-values, using a significance threshold of α = 0.1.

As illustrated in Figure 5a, the Pearson correlation matrix reveals pairwise linear associations among the variables, with color gradients ranging from deep blue to deep red, and numerical values from −1 to +1 [120]. Most variable pairs show absolute correlation values |r| < 0.8, indicating a low risk of multicollinearity and satisfying the independence assumptions required for machine learning models. A strong negative correlation is observed between cement content (C) and coarse aggregate (CA), while a significant positive correlation is evident between water content (W) and water-to-cement ratio (W/C), reflecting inherent interdependencies in the concrete mix design parameters [121]. Figure 5b further annotates whether each correlation is statistically significant (p ≤ 0.1). Asterisks indicate non-significant relationships, while unmarked cells denote statistically significant correlations. Most strongly correlated variable pairs, such as C–CA and W–W/C, passed the significance test, confirming their statistical reliability. In addition, a significant negative correlation between compressive strength (CS) and PCM dosage was observed. This suggests that increasing PCM content may adversely affect the mechanical strength of the composite material [122]. Figure 5c shows correlation coefficients with p-values through ellipse-based visualization, where the orientation and shape of each ellipse reflect the direction and magnitude of the correlation [123]. According to the analysis, CS demonstrates moderate to strong linear relationships with several key variables, including cement content (r = 0.85), W/C (r = −0.84), thermal conductivity (Tc, r = −0.47), and latent heat (LH, r = −0.49), with most correlations statistically significant at p < 0.1.
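For reference, the Pearson coefficient and the t statistic that underlies its p-value can be computed as sketched below. The dosage and strength values are made-up illustrative numbers, not data from the study; only r = 0.85 and n = 100 in the last line come from the text.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def t_statistic(r, n):
    """t statistic for H0: no linear correlation (n - 2 degrees of freedom)."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))

# Hypothetical values: PCM dosage (wt%) against compressive strength (MPa).
dosage = [0, 5, 10, 15, 20]
strength = [50, 46, 41, 37, 30]
r_example = pearson_r(dosage, strength)
print(r_example)

# Reported cement-content correlation with CS, using the 100-sample dataset size.
t_cement = t_statistic(0.85, 100)
print(t_cement)
```

For r = 0.85 and 100 samples the t statistic is close to 16, far beyond any conventional critical value, which is consistent with that correlation being marked as significant.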

The input variables show moderate correlation without severe multicollinearity. The majority of relationships meet the threshold for statistical significance [124,125]. This confirms a robust data foundation for the development of stable and interpretable machine learning models. Key variables such as cement content, W/C ratio, PCM dosage, and thermal properties (Tc and LH) reveal meaningful associations with compressive strength (CS), underscoring their importance for model construction [126].

To evaluate the data structure and statistical properties of both input and output variables, the present study uses a combination of boxplot visualization and descriptive statistical metrics [127,128]. The results are illustrated in Figure 6 and Figure 7. Figure 6 presents the distribution patterns of nine input variables, including phase change temperature (Tm), latent heat (Lh), PCM dosage, cement content (C), water content (W), water-to-cement ratio (W/C), fine aggregate (FA), coarse aggregate (CA), and the CA-to-FA ratio (CA/FA). All features were standardized using z-score normalization. Preprocessing was performed exclusively within the training folds to avoid data leakage. No categorical variables were present in the dataset.
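Fitting the z-score parameters on the training fold only, and then reusing them on held-out data, can be sketched as follows. The W/C values are hypothetical, and the use of the population standard deviation is an assumption; the study does not specify the estimator.

```python
import math

def fit_zscore(train_column):
    """Return the mean and population standard deviation of the training values."""
    n = len(train_column)
    mean = sum(train_column) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in train_column) / n)
    return mean, std

def apply_zscore(column, mean, std):
    """Standardize values with statistics estimated elsewhere (the training fold)."""
    return [(v - mean) / std for v in column]

# Hypothetical W/C ratios: five training values and one held-out value.
wc_train = [0.40, 0.45, 0.50, 0.55, 0.60]
wc_val = [0.65]

mean, std = fit_zscore(wc_train)          # statistics come from training data only
z_train = apply_zscore(wc_train, mean, std)
z_val = apply_zscore(wc_val, mean, std)   # validation data never influence mean/std
print(z_train, z_val)
```

Computing the mean and standard deviation on the full dataset before splitting would let information from the validation samples leak into training, which is the issue the fold-wise preprocessing avoids.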

Most input variables display approximately symmetric distributions, with means close to their respective medians and no significant deviations in the boxplot. This indicates a well-centered dataset with a limited number of outliers. In terms of probability density, certain variables such as CA, FA, and W demonstrate slight concentration in the mid-to-high value ranges, suggesting sample design preferences [129]. Descriptive statistics further revealed that the standard deviations (SD) of coarse aggregate (CA) and fine aggregate (FA) were the highest among all variables. The respective values were approximately 172.3 kg/m³ for CA and 155.1 kg/m³ for FA, indicating substantial variation in aggregate design within the cementitious mixtures. In contrast, ratio-based variables such as W/C and CA/FA displayed lower SD values, indicating more consistent distributions. The W/C ratio in particular showed minimal variation, near-zero skewness, and high data quality due to standardized formulation practices [130,131]. Both PCM dosage and latent heat exhibited moderately skewed distributions, primarily concentrated in the 100–200 range. Their density curves were well-balanced on both sides, with no severe long-tail effects, indicating that the variables are suitable for direct modeling [90,132]. Skewness and kurtosis were computed to evaluate the symmetry and peakedness of the distributions. All variables present skewness values within the range of −2 to +2, and kurtosis within −10 to +10, which falls within the acceptable range suggested in the literature, indicating no significant data skewness or dispersion issues [92]. Figure 7 shows the distribution of output variables: thermal conductivity (Tc), latent heat (LH), and compressive strength (CS). Among them, CS displayed the widest range, spanning from 11.4 to 53.8 MPa, with a standard deviation of approximately 7.5 MPa. 
The CS distribution was nearly symmetric but slightly right-skewed (skewness = 0.38), indicating a broad coverage of strength grades. LH was primarily distributed in the 90–110 J/g range, with normal skewness and kurtosis, forming a symmetric probability density curve. Thermal conductivity (Tc) showed the narrowest range (approximately 0.25–0.29 W/m·K), reflecting limited variability.
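The skewness and kurtosis checks can be reproduced with moment-based formulas, as sketched below. The study does not state which estimator was used, so the population (biased) moments and the excess-kurtosis convention here are assumptions, and the sample values are illustrative.

```python
def skewness_and_excess_kurtosis(values):
    """Moment-based skewness and excess kurtosis (a normal distribution gives 0, 0)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0

# A symmetric sample has zero skewness; a long right tail gives positive skewness.
symmetric = skewness_and_excess_kurtosis([1, 2, 3, 4, 5])
right_tailed = skewness_and_excess_kurtosis([1, 1, 1, 2, 10])
print(symmetric, right_tailed)
```

A small positive value such as the 0.38 reported for CS therefore indicates a mild tail toward higher strengths.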

The input and output variables exhibited favorable distributional characteristics, with no severe skewness or significant outliers. This provides a solid foundation for subsequent machine learning tasks such as feature selection, normalization, and model stability enhancement [133]. Furthermore, the descriptive statistical indicators, including the mean, standard deviation, skewness, and kurtosis, were analyzed in detail. These indicators support the conclusion that the data are suitable for regression-based prediction modeling [134,135].

3. Results and Discussion

3.1. Performance Prediction Based on Support Vector Regression and Optimized Hybrid Models

Figure 8 presents the regression performance of six models in predicting thermal conductivity (Tc), including the baseline SVR and five hybrid SVR models optimized by metaheuristic algorithms: SVR-GA, SVR-PSO, SVR-WOA, SVR-GWO, and SVR-FFA [136,137,138]. The x-axis represents measured values, while the y-axis corresponds to model-predicted values. The black dashed line denotes the ideal fit, while red and green lines represent regression trends for the training and testing datasets, respectively [139]. The slope of these lines reflects the consistency between predictions and actual values. The baseline SVR exhibited poor regression performance, with training and testing slopes of 0.3317 and 0.4497, respectively. This indicates significant underfitting and limited capability in capturing the nonlinear features of thermal conductivity without parameter optimization [140]. SVR-GA achieved a testing slope of 0.8143, and SVR-PSO reached 0.8142, both achieving a better model fit [141]. SVR-WOA further improved regression accuracy with training and testing slopes of 0.7045 and 0.8143, indicating improved fit and stronger generalization capability. SVR-GWO achieved the best overall performance, with regression slopes of 0.8440 for training and 1.0514 for testing, demonstrating exceptional ability to capture high-dimensional nonlinear patterns. SVR-FFA also resulted in stable performance with a testing slope of 1.0241, second only to SVR-GWO. All optimized SVR models outperformed the baseline SVR, and SVR-GWO emerged as the best candidate for Tc prediction, improving the testing slope by 96.7% relative to the baseline [142]. Figure 9 illustrates model performance in predicting compressive strength. The baseline SVR model performed inadequately, with slopes of 0.3389 and 0.4911, indicating an inability to capture complex mix–strength interactions. Optimization significantly improved performance [114]. SVR-GA achieved 0.8177 and 0.8757 for training and testing, respectively.
SVR-PSO demonstrated near-perfect fit with a testing slope of 0.9323. SVR-WOA also showed high stability with slopes of 0.8684 and 0.8187. SVR-GWO achieved the highest performance, with slopes of 1.0256 and 0.9044, being the closest to the ideal fit among all models. SVR-FFA performed well with training and testing slopes of 0.8528 and 0.8879, respectively. SVR-GWO was the top-performing model in CS prediction [143], increasing the testing slope by 76.2% over the base SVR and delivering high accuracy and generalization. Figure 10 presents the regression plots for latent heat prediction. The base SVR again underperformed, with training and testing slopes of 0.6916 and 0.5778, reflecting typical underfitting. After optimization, SVR-GA improved the testing slope to 0.9469. SVR-PSO reached 0.8512, surpassing the 0.8 threshold for acceptable prediction [144,145]. SVR-WOA attained training and testing slopes of 0.8684 and 0.9976, respectively, demonstrating well-balanced predictive performance [146]. SVR-GWO achieved top performance, with slopes of 0.8597 and 1.0172. SVR-FFA also performed well with training and testing slopes of 0.7858 and 0.8706, respectively. All optimized models significantly outperformed the base SVR in LH prediction, with SVR-GWO demonstrating the highest precision and robustness, improving the testing slope by 84%. The three sets of regression plots highlight differences in model capability across the three target variables [146,147]. The unoptimized SVR consistently exhibited underfitting, while models optimized with GWO, WOA, and FFA significantly improved both fitting accuracy and generalization. SVR-GWO achieved the highest or second-highest slope in all tasks [148], making it the most robust and accurate model overall.
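The regression slope used as the fit indicator can be computed by ordinary least squares of predicted on measured values. Whether the fitted lines in the figures include an intercept is not stated, so the intercept-inclusive form below is an assumption, and the numbers are illustrative rather than taken from the dataset.

```python
def regression_slope(measured, predicted):
    """Least-squares slope of predicted values regressed on measured values."""
    n = len(measured)
    mx = sum(measured) / n
    my = sum(predicted) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(measured, predicted))
    sxx = sum((x - mx) ** 2 for x in measured)
    return sxy / sxx

# Illustrative underfit model: predictions are pulled halfway toward the mean (25).
measured = [10.0, 20.0, 30.0, 40.0]
underfit = [17.5, 22.5, 27.5, 32.5]
slope_underfit = regression_slope(measured, underfit)
slope_perfect = regression_slope(measured, measured)
print(slope_underfit, slope_perfect)
```

A model that compresses its predictions toward the mean yields a slope well below 1, which is the pattern the text describes as underfitting for the unoptimized SVR.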

Figure 11a–c present the absolute error distributions (violin plots) for the models across the three prediction tasks. For Tc, the base SVR model showed the widest error range and highest maximum error (~0.045 W/m·K), with mean and median far from zero. SVR-GA reduced the average error significantly, with most samples below 0.01 W/m·K. SVR-PSO further narrowed the error band. SVR-WOA and SVR-GWO exhibited the narrowest, most symmetrical distributions, with peak density near zero and average errors below 0.005 W/m·K. SVR-GWO displayed minimal outliers and near-zero error dispersion, indicating superior stability and precision. For LH, the base SVR model showed highly dispersed errors, with a mean over 6 J/g and a maximum near 17 J/g. SVR-GA and SVR-PSO reduced mean errors to 3.5 J/g and 3.0 J/g, respectively. SVR-WOA and SVR-GWO further improved error control, with SVR-GWO achieving a mean of ~2.2 J/g and tightly clustered errors. SVR-FFA also performed well, though it was slightly less precise than SVR-GWO [149]. For CS, the base SVR had the largest error, with a mean exceeding 10 MPa and a maximum nearing 25 MPa. SVR-GA and SVR-PSO improved accuracy significantly [150]. SVR-WOA and SVR-GWO provided the best overall control, with SVR-GWO showing a mean error near 3.1 MPa and a median near 2.5 MPa, indicating stable performance and low variability. SVR-FFA remained competitive for predictions below 3.8 MPa.
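The summary statistics read off the violin plots (mean, median, and maximum absolute error) can be computed as sketched below. The measured and predicted strengths are made-up values used only to show the calculation.

```python
import statistics

def absolute_error_summary(measured, predicted):
    """Mean, median, and maximum of the absolute prediction errors."""
    errors = [abs(m - p) for m, p in zip(measured, predicted)]
    return {
        "mean": statistics.mean(errors),
        "median": statistics.median(errors),
        "max": max(errors),
    }

# Hypothetical compressive strengths in MPa.
measured_cs = [30.0, 35.0, 40.0, 45.0]
predicted_cs = [28.0, 36.0, 43.0, 45.5]
summary = absolute_error_summary(measured_cs, predicted_cs)
print(summary)
```

A mean noticeably above the median, as reported for SVR-GWO in the CS task, points to a few larger errors stretching the upper tail of the distribution.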

Across three tasks, SVR-GWO achieved the lowest errors, most concentrated distributions, and minimal outliers, confirming it as the most reliable and accurate model [151]. These results are consistent with the earlier regression slope analysis, further validating the effectiveness of GWO in hyperparameter tuning and global optimization [152].

3.2. Performance Prediction Based on Random Forest and Optimized Hybrid Models

Figure 12, Figure 13 and Figure 14 illustrate the training and testing regression fits for six models under each prediction task. The regression slope was used as a key indicator of prediction accuracy, with a value of 1 indicating perfect agreement between predicted and measured results [153,154]. Figure 12 shows the regression results for compressive strength. The RF model exhibited relatively low regression slopes of 0.4856 and 0.8012, indicating limited fitting capacity and poor generalization under default hyperparameters. After genetic algorithm optimization, the RF-GA model improved significantly, with slopes increasing to 0.8772 and 0.8473, suggesting strong fitting and no signs of overfitting [154]. The RF-PSO model achieved the best performance, with training and testing slopes of 0.7779 and 0.9299, respectively, giving the testing slope closest to the ideal value of 1 and strong generalization [155,156]. Both RF-WOA and RF-GWO also achieved high accuracy, with training slopes of 0.7725 and 0.7251, and testing slopes of 0.7792 and 0.8698. The small slope differences (below 0.03) indicate strong model stability. Its slightly lower performance compared to other optimized models may be attributed to mild underfitting. Overall, RF-PSO achieved the best accuracy in CS modeling, followed by RF-GA and RF-GWO [157].

Figure 13 presents the regression results for latent heat. The RF model underperformed, with training and testing slopes of 0.3529 and 0.6286. The optimized models showed clear improvements. RF-GA achieved a training slope of 0.7304 and a testing slope of 1.0707. RF-PSO reached 0.7455 and 0.7887 for training and testing, respectively. Both are close to the 0.8 threshold, indicating enhanced nonlinear learning ability [158]. The RF-WOA model further improved predictive accuracy, achieving a testing slope of 0.8638, while the RF-GWO model showed the best results, with training and testing slopes of 0.8678 and 0.8705, respectively. The small slope difference of 0.0212 highlights the model’s high robustness and low risk of overfitting. RF-FFA also performed well, with training and testing slopes of 0.9658 and 0.9747, closely trailing RF-GWO. Overall, RF-GWO showed the strongest generalization and precision in LH prediction, with RF-FFA and RF-WOA as strong alternatives.

Figure 14 shows the regression results for thermal conductivity (Tc), a task that involves small-range, low-variance data distributions, making it sensitive to model granularity and error [159]. The RF model achieved low regression slopes of 0.7236 and 0.7155, indicating poor learning of the input–output mapping and weak generalization. The RF-GA model showed moderate improvement, with slopes of 0.9430 and 0.9077 for training and testing, respectively [160]. Among all models, RF-GWO achieved the highest performance, with a training slope of 0.9548 and a testing slope of 0.8272, confirming the effectiveness of GWO for hyperparameter tuning [161]. RF-FFA showed weaker generalization, with a training slope of 0.8275.

The RF model showed the weakest performance across all tasks, with significant bias and poor stability. All optimization algorithms improved prediction accuracy and generalization to varying extents. RF-GWO consistently achieved the highest or second-highest slopes across all tasks, especially in LH and Tc predictions, making it the most balanced and robust model [162].
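The regression slope used as the fit indicator in these plots can be obtained from an ordinary least-squares line through the (measured, predicted) pairs. The sketch below assumes the line is fitted with an intercept and with predicted values regressed on measured values; the text does not state the exact fitting convention, so this is an illustrative assumption, and the sample values are hypothetical.

```python
def regression_slope(measured, predicted):
    """Slope of the least-squares line predicted = slope * measured + intercept."""
    n = len(measured)
    if n != len(predicted) or n < 2:
        raise ValueError("need at least two paired observations")
    mean_x = sum(measured) / n
    mean_y = sum(predicted) / n
    sxx = sum((x - mean_x) ** 2 for x in measured)
    if sxx == 0:
        raise ValueError("measured values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(measured, predicted))
    return sxy / sxx


# Hypothetical latent-heat values in J/g (illustration only).
measured = [10.0, 20.0, 30.0, 40.0]
compressed = [14.0, 22.0, 30.0, 38.0]  # predictions pulled toward the mean
slope = regression_slope(measured, compressed)
print(round(slope, 4))
```

A slope below 1, as in this toy case, arises when predictions are compressed toward the mean (high values under-predicted, low values over-predicted), which is the pattern the text associates with underfitting.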

Figure 15a displays the absolute error distributions for Tc prediction. The RF model showed the widest error range (>0.045 W/m·K), with a right-skewed density curve and long tails, indicating high variance and instability [163]. After optimization, both RF-GA and RF-PSO significantly reduced prediction errors, with most samples falling within ±0.015 W/m·K. RF-WOA and RF-GWO achieved better performance [164], with highly concentrated and nearly symmetric error distributions. In particular, RF-GWO showed minimal outliers and a dominant error range within ±0.01 W/m·K, reflecting strong precision and generalization. While RF-FFA remained below 0.03 W/m·K, its distribution was slightly skewed with heavier tails, suggesting lower stability than RF-GWO. Hence, RF-GWO was the most accurate and stable model for Tc prediction.

Figure 15b illustrates the absolute errors for LH prediction. The RF model had the widest spread, with errors up to ±17 J/g and a low-density, wide main region [165]. The optimized models significantly narrowed the error range. RF-GA and RF-PSO constrained most errors within ±6 J/g, confirming enhanced learning ability. RF-WOA and RF-GWO showed even tighter distributions. In particular, RF-GWO had a high-density peak concentrated within ±2 J/g and minimal tail extension, indicating excellent robustness and convergence. RF-FFA also performed well but was slightly more dispersed. Overall, RF-GWO demonstrated the best error control and stability in LH prediction [166,167].

Figure 15c shows the error distribution for CS prediction. The RF model exhibited the widest error spread, with deviations reaching up to ±17 MPa and a low-density, broadly distributed main region. The optimized models substantially reduced errors. RF-GA and RF-PSO lowered most errors to within ±8 MPa, though some instability remained. In contrast, RF-WOA and RF-GWO achieved superior error convergence. RF-GWO had the most compact distribution, with the main density centered within ±3 MPa and almost no long tails, highlighting its high accuracy in modeling structural strength. RF-FFA was slightly less precise than RF-GWO but still significantly outperformed the baseline RF [168].

In conclusion, RF-GWO achieved highly accurate predictions and tightly clustered outputs, confirming its robustness and high predictive accuracy across all three tasks [169]. Its superior performance validates GWO’s effectiveness in optimizing high-dimensional, nonlinear predictive models.

3.3. Performance Prediction Based on Extreme Gradient Boosting and Optimized Hybrid Models

Figure 16 presents the regression analysis of thermal conductivity prediction using the XGBoost model and its five optimized variants [170]. XGBoost achieved regression slopes of 0.9722 and 0.9552 for training and testing, respectively, demonstrating its strong ability to learn both linear and nonlinear data structures. XGBoost-GA slightly underperformed, with the testing slope decreasing to 0.9363, suggesting limited improvement from the genetic algorithm in this task [171,172]. XGBoost-PSO achieved a training slope of 0.5685, which dropped to 0.5144 on the test set, indicating overfitting. In contrast, XGBoost-WOA and XGBoost-GWO both maintained high slopes above 0.90. XGBoost-WOA achieved a testing slope of 0.8655, showing strong accuracy and generalization [173]. XGBoost-GWO performed slightly below it, with a testing slope of 0.7203. XGBoost-FFA exhibited regression slopes of 0.8343 and 0.9173, slightly inferior to the WOA model. Overall, XGBoost-WOA demonstrated the best balance between precision, robustness, and generalizability in Tc prediction.

Figure 17 illustrates model performance in latent heat (LH) prediction. XGBoost achieved slopes of 0.9774 and 0.9133, reflecting high accuracy but also some sensitivity to nonlinear data deviations from the ideal fit line [174,175]. The XGBoost-PSO model significantly boosted predictive stability, with slopes of 0.8952 and 0.7599, reflecting improved fit. The XGBoost-WOA model performed best overall, achieving 0.8213 and 0.8575, confirming WOA’s effectiveness in modeling complex characteristics [176]. However, XGBoost-GWO showed signs of overfitting, with training and testing slopes of 0.7718 and 0.7749, respectively. XGBoost-FFA showed the weakest testing slope, indicating inadequate generalization and possible underfitting. Therefore, XGBoost-WOA stands out as the optimal model for LH prediction, offering both high accuracy and model robustness.

Figure 18 presents the regression performance for predicting compressive strength. The baseline model showed high training and testing slopes, with the latter exceeding unity, implying slight overestimation in high-strength ranges. XGBoost-GA improved generalization, with training and testing slopes of 0.9410 and 0.9243. XGBoost-PSO exhibited balanced performance, indicating strong robustness. XGBoost-WOA achieved a training slope of 0.9414 but a lower testing slope, suggesting reduced generalization. XGBoost-GWO attained 0.9072 and 0.9175, within the accepted threshold and thus still viable for nonlinear CS prediction [177]. The best performance was delivered by XGBoost-FFA, which achieved 0.9403 and 1.0422, closely matching the ideal regression line and demonstrating excellent predictive consistency on unseen data. All XGBoost models performed well for CS prediction, with XGBoost-PSO and XGBoost-FFA offering the most stable and accurate results, suitable for engineering applications [178].

Figure 19a presents the absolute error distributions for Tc prediction. The baseline model showed skewed error behavior, with maximum deviations nearing 0.025 W/m·K. Optimization significantly reduced the error ranges. XGBoost-FFA and XGBoost-PSO demonstrated the most concentrated error distributions, with maximum errors constrained within 0.015 W/m·K and 0.018 W/m·K, respectively. Both models showed strong reliability. XGBoost-GA and XGBoost-WOA also showed excellent error control, with median errors close to zero and most predictions within ±0.01 W/m·K. While XGBoost-GWO showed some tail extension at the boundaries, its distribution remained symmetrical and centered, indicating high noise resistance. Overall, XGBoost-FFA provided the most accurate and stable performance in Tc prediction [179].

Figure 19b shows the LH prediction errors. While the baseline model performed adequately, it presented outliers with maximum errors exceeding 12 J/g. The optimized models significantly compressed the error ranges. XGBoost-GA and XGBoost-PSO maintained most prediction errors within ±3 J/g, with near-Gaussian distributions. XGBoost-WOA and XGBoost-GWO exhibited slightly wider but still well-centered distributions within ±4 J/g, indicating strong resilience to parameter variation. XGBoost-FFA outperformed all others, with the smallest average error and a highly concentrated distribution in the 0–2 J/g range [180]. Thus, XGBoost-GA and XGBoost-FFA achieved the best balance between accuracy and robustness in LH modeling [181].

Figure 19c shows the error distributions for CS prediction. The baseline model exhibited considerable variability, with errors approaching 28 MPa, suggesting a poor fit for high-strength samples. All optimized models reduced this significantly. XGBoost-GA and XGBoost-PSO reduced the maximum errors to below 10 MPa, with most errors in the 0–5 MPa range. XGBoost-WOA and XGBoost-GWO further refined the model fit in medium-to-high strength ranges, showing tighter distributions and lower peak deviations. XGBoost-FFA delivered the most symmetric distribution and the lowest median error [182], maintaining strong generalization even under nonlinear and heterogeneous data conditions. Overall, the optimized XGBoost models outperformed the baseline, with the FFA and PSO variants offering the greatest potential for practical engineering deployment [183,184].

3.4. Performance Prediction Based on Categorical Boosting and Optimized Hybrid Models

Figure 20 illustrates the regression performance of the CatBoost model and its optimized variants in predicting thermal conductivity. The CatBoost model showed regression slopes of 0.8602 and 0.8274, indicating limited predictive capability and relatively large prediction errors. After genetic algorithm optimization, the slopes changed to 0.9020 and 0.8069, respectively, indicating an improvement in both model fitting and generalization capability [185,186]. While the CatBoost-PSO model improved the training slope to 0.8236, the testing slope declined slightly to 0.8721. This implies better fitting on the training data but reduced ability to generalize. The CatBoost-WOA, CatBoost-GWO, and CatBoost-FFA models all showed significant improvements. Among them, CatBoost-WOA reached a testing slope of 0.8909, CatBoost-GWO achieved 0.9508, and CatBoost-FFA followed closely with 0.7436. These results confirm the efficacy of metaheuristic optimization [187,188]. In particular, CatBoost-GWO showed high slope consistency between training and testing, with only a 0.014 difference, indicating strong generalization. Overall, CatBoost-GWO achieved the best performance in predicting Tc [189,190].

Figure 21 displays the regression analysis for latent heat prediction. The significant discrepancy between the training and testing slopes of the baseline CatBoost model suggests underfitting during training and possible reliance on data noise [191]. Optimization with GA markedly improved performance, with a minimal slope difference indicating excellent fitting and model stability. Similarly, CatBoost-PSO achieved 0.8236 and 0.8721, though it slightly overfitted the training data. The best-performing models were CatBoost-GWO and CatBoost-FFA, both approaching the ideal regression line [192,193]. CatBoost-FFA achieved the minimal slope difference of 0.0008. With a testing slope of 0.9996, CatBoost-FFA was the top performer in LH prediction, followed closely by CatBoost-GWO. The baseline and WOA-optimized models performed less effectively.

Figure 22 shows the regression results for compressive strength prediction. The CatBoost model achieved slopes of 0.8391 and 0.9157, reflecting moderate accuracy. However, its departure from the ideal fit points to underfitting. After optimization, CatBoost-GA demonstrated significant improvement, achieving slopes of 0.9281 for training and 0.9670 for testing. The small slope gap reflects strong fitting performance and high model stability. CatBoost-PSO achieved 0.8568 and 0.9339, showing excellent fitting though with a mild overfitting tendency [194]. In contrast, CatBoost-WOA performed weakly, with 0.7122 and 0.8782, the lowest among all models, suggesting poor generalization and insufficient capture of input-output relationships [195]. CatBoost-GA performed best in CS prediction, followed closely by CatBoost-GWO and CatBoost-FFA, while CatBoost-WOA lagged behind, possibly due to overfitting or inadequate feature learning [196].

Figure 23 presents violin plots of the absolute error distributions for Tc, LH, and CS predictions using CatBoost and its optimized variants. The error distributions offer an intuitive comparison of each model’s predictive precision and stability. In thermal conductivity prediction, the CatBoost model exhibited a right-skewed error distribution, with maximum errors exceeding 0.025 W/m·K, while the minimum was near zero. In contrast, CatBoost-FFA and CatBoost-GWO exhibited tighter and more centered distributions, with peak density around ±0.005 W/m·K. CatBoost-FFA showed the most concentrated density, confirming its high accuracy and robustness [197]. CatBoost-GA also improved prediction, with a median error of 0.008 W/m·K, compared to 0.014 W/m·K for the baseline.

In latent heat prediction, the baseline model reached a maximum error of nearly 16 J/g. However, CatBoost-FFA and CatBoost-WOA reduced this to within ±6 J/g. CatBoost-PSO also performed well, with 70% of errors falling within ±4 J/g, demonstrating good stability. CatBoost-GA had a wider error spread but still improved significantly over the baseline, with a median error around 2.5 J/g, compared to 6 J/g for the baseline.

For compressive strength, the differences were more pronounced [198]. The baseline model had maximum errors exceeding 20 MPa, with a wide error range and significant outliers; the median absolute error was approximately 7.2 MPa. The optimized CatBoost-GWO and CatBoost-FFA models again stood out, with concentrated error distributions, high peak densities, and over 85% of predictions within ±6 MPa. CatBoost-FFA achieved the lowest median error at ~2.5 MPa, roughly one third of the baseline value, indicating strong generalization and superior performance under high nonlinearity and variability [143,199].

All optimized models improved accuracy and robustness across all three prediction tasks. Among all models, CatBoost-FFA exhibited the most balanced and superior performance. It attained the lowest average absolute errors for Tc (0.0035 W/m·K), LH (2.2 J/g), and CS (2.5 MPa), indicating its strong generalization across all targets [200]. These findings confirm its high potential for accurate multi-performance prediction in cementitious composites.

3.5. Comparative Evaluation of Model Accuracy

In this study, the predictive performance of each model was comprehensively evaluated using three statistical metrics [201]: Root Mean Square Error (RMSE), Mean Absolute Error (MAE), and the Coefficient of Determination (R²). The mathematical formulations of these evaluation indices are presented in Equations (1)–(3). RMSE quantifies the overall magnitude of prediction errors and is more sensitive to large deviations, making it suitable for assessing the global performance of regression models. MAE measures the average absolute difference between predicted and observed values, offering a direct and interpretable measure of model accuracy. The R² index, on the other hand, reflects the proportion of variance in the observed data explained by the model; values closer to 1 indicate a higher level of goodness-of-fit and stronger predictive capability. Together, these three indicators provide a robust framework for evaluating the models’ precision, bias, and generalization ability under varying performance prediction tasks.

\mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{n} (x_{i} - \hat{x}_{i})^{2}}{n}} \quad (1)

\mathrm{MAE} = \frac{\sum_{i=1}^{n} |x_{i} - \hat{x}_{i}|}{n} \quad (2)

R^{2} = 1 - \frac{\sum_{i=1}^{n} (x_{i} - \hat{x}_{i})^{2}}{\sum_{i=1}^{n} (x_{i} - \bar{x})^{2}} \quad (3)

where x_{i} is the i-th measured (experimental) value, \hat{x}_{i} is the i-th predicted value, \bar{x} is the mean of the measured values, and n represents the total number of data points.
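Equations (1)–(3) translate directly into code. The following is a minimal standard-library sketch, not the authors' implementation; the function names and the sample values are hypothetical and serve only to show how the three metrics are computed from paired measured and predicted values.

```python
import math


def rmse(measured, predicted):
    """Root mean square error, Equation (1)."""
    n = len(measured)
    return math.sqrt(sum((x - xh) ** 2 for x, xh in zip(measured, predicted)) / n)


def mae(measured, predicted):
    """Mean absolute error, Equation (2)."""
    n = len(measured)
    return sum(abs(x - xh) for x, xh in zip(measured, predicted)) / n


def r2(measured, predicted):
    """Coefficient of determination, Equation (3)."""
    mean_x = sum(measured) / len(measured)
    ss_res = sum((x - xh) ** 2 for x, xh in zip(measured, predicted))
    ss_tot = sum((x - mean_x) ** 2 for x in measured)
    return 1.0 - ss_res / ss_tot


# Hypothetical values (illustration only).
measured = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.0, 2.5, 4.0]
print(rmse(measured, predicted), mae(measured, predicted), r2(measured, predicted))
```

Because RMSE squares each residual before averaging, it is never smaller than MAE on the same data, which is why it reacts more strongly to a few large deviations.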

To ensure the generalization capability and predictive stability of the developed machine learning models, the data partitioning and model validation process adopted in this study is illustrated in Figure 24. The entire dataset was first divided into a training set (80%) and a testing set (20%). The training set was used for model training and hyperparameter optimization, while the testing set was reserved exclusively for independent performance evaluation, ensuring objectivity and reliability of the final results [202,203].
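The 80/20 partition described above can be sketched as a shuffled index split. The shuffling, the seed value, and the function name `train_test_split_indices` are assumptions made for illustration; the text does not specify how the samples were randomized.

```python
import random


def train_test_split_indices(n_samples, test_fraction=0.2, seed=0):
    """Shuffle sample indices and split them into training and testing sets."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # seed is an illustrative assumption
    n_test = round(n_samples * test_fraction)
    return indices[n_test:], indices[:n_test]


# 100 samples, matching the dataset size reported in the study.
train_idx, test_idx = train_test_split_indices(100)
print(len(train_idx), len(test_idx))
```

The test indices are set aside before any tuning takes place, so that the hyperparameter search only ever sees the 80 training samples.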

During the training process, a 10-fold cross-validation strategy was applied to the training set [204]. Specifically, the training data were randomly partitioned into ten equal subsets. In each iteration, one subset was used as the validation set and the remaining nine were used for model training [205,206]. This process was repeated ten times, with each subset being used exactly once for validation. Hyperparameter optimization and model selection were conducted within this 10-fold cross-validation on the training set, with all preprocessing restricted to the training data to avoid data leakage. Performance metrics were reported as mean values over the CV folds and separately for the test set.

After cross-validation, the optimal hyperparameter configuration was determined, and the final model was then evaluated on the previously unseen testing set. This procedure provides a comprehensive assessment of the model’s ability to generalize to new data [207,208]. Overall, this strategy effectively mitigates the risk of overfitting and significantly improves the accuracy and credibility of model evaluation. It is widely regarded as a standard and rigorous approach in contemporary machine learning research.
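The cross-validation loop can be illustrated with a deliberately small model. The sketch below uses a one-dimensional ridge regression as a stand-in for the actual learners (SVR, RF, XGBoost, CatBoost), with the penalty `lam` playing the role of a tunable hyperparameter; the model, the candidate values, and the toy data are all assumptions chosen so the example runs with the standard library alone. Centering statistics are computed from the training folds only, mirroring the leakage-avoidance rule.

```python
import math
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Randomly partition sample indices into k near-equal validation folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def fit_ridge_1d(xs, ys, lam):
    """Closed-form 1-D ridge fit; centering uses the training folds only."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / (sxx + lam)
    return slope, mean_y - slope * mean_x


def cv_rmse(xs, ys, lam, k=10, seed=0):
    """Mean validation RMSE over k folds for one hyperparameter value."""
    fold_scores = []
    for val_idx in k_fold_indices(len(xs), k, seed):
        held_out = set(val_idx)
        train_idx = [i for i in range(len(xs)) if i not in held_out]
        slope, intercept = fit_ridge_1d(
            [xs[i] for i in train_idx], [ys[i] for i in train_idx], lam
        )
        sq_err = [(ys[i] - (slope * xs[i] + intercept)) ** 2 for i in val_idx]
        fold_scores.append(math.sqrt(sum(sq_err) / len(sq_err)))
    return sum(fold_scores) / len(fold_scores)


# Toy training set of 80 samples (illustration only): y = 2x + 1.
xs = [float(i) for i in range(80)]
ys = [2.0 * x + 1.0 for x in xs]
candidates = [0.0, 10.0, 1000.0]  # hypothetical hyperparameter values
scores = {lam: cv_rmse(xs, ys, lam) for lam in candidates}
best_lam = min(scores, key=scores.get)
print(best_lam, scores)
```

In the study the candidate values are proposed by a metaheuristic rather than taken from a fixed list, but the scoring step (mean validation error over the ten folds) has the same shape.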

Figure 25 presents the performance comparison of four machine learning models and their combinations with five metaheuristic optimization algorithms across three key prediction targets: thermal conductivity (Tc), latent heat (LH), and compressive strength (CS). Each model was evaluated on both the training and testing datasets using three commonly adopted metrics: Root Mean Square Error (RMSE), Mean Absolute Error (MAE), and Coefficient of Determination (R²) [209]. The SVR model, as a classical regression method, exhibited suboptimal performance in its baseline configuration, particularly in predicting CS, where the testing R² was only 0.748 with an RMSE of 6.23 MPa, highlighting its limited capability in capturing complex nonlinear interactions. However, its performance improved significantly after optimization via swarm intelligence algorithms [210]. For instance, the SVR-WOA model achieved an R² of 0.912 in CS prediction, with the MAE reduced to 3.64 MPa, indicating enhanced generalization ability. Similarly, the SVR-FFA model demonstrated the best performance in LH prediction (R² = 0.893, RMSE = 2.15 J/g), with its prediction errors concentrated within a low-value range [211].

The RF model showed inherently stronger stability in nonlinear tasks. Without optimization, it achieved an R² of 0.843 for CS prediction, outperforming SVR. The RF-WOA model further improved the performance, achieving an R² value of 0.928. The RF-FFA model attained the best performance in Tc prediction, with an RMSE of 0.0078 W/m·K and an R² value of 0.886, demonstrating strong accuracy in thermal performance modeling [212]. In latent heat prediction, RF-PSO and RF-GWO both maintained R² values above 0.87 with low error variance, reflecting high prediction robustness.

XGBoost models exhibited superior predictive accuracy across all indicators. The baseline XGBoost model already achieved a high testing R² of 0.911 for CS prediction. After optimization, the XGBoost-WOA model achieved an R² of 0.932 and an RMSE of 1.97 J/g in LH prediction, indicating strong adaptability to nonlinear features and complex data distributions [213,214]. With an RMSE of 0.0064 W/m·K and an R² of 0.905, the XGBoost-GWO model demonstrated strong predictive performance in Tc prediction, ranking among the best-performing models.

CatBoost, a gradient boosting framework based on symmetric tree structures, delivered the most outstanding performance across the board [215]. The baseline CatBoost model achieved an R² of 0.936 for CS prediction. Its optimized version, CatBoost-WOA, yielded an R² of 0.955 and an RMSE of 1.84 J/g for LH prediction, representing the most accurate results among all compared models. For Tc prediction, the CatBoost-WOA model demonstrated strong performance, with an RMSE of 0.0057 W/m·K and an R² of 0.927. These results indicate that it outperformed the other models in both generalization capability and fine-grained fitting accuracy [216]. CatBoost-WOA and XGBoost-WOA emerged as the top-performing ensemble models in thermal and mechanical property predictions, respectively [217].

As shown in Figure 26, feature importance was evaluated using CatBoost and XGBoost. PCM dosage, phase change temperature (Tm), and latent heat (Lh) are the dominant drivers. They jointly contribute over 70% of the importance. PCM dosage strongly affects compressive strength (CS) and thermal conductivity (Tc). Tm governs latent heat (LH) prediction and also influences Tc. Lh is central for LH and further contributes to CS. Cement content, water-to-cement ratio, and water content show moderate effects. Aggregate-related factors (CA, FA, CA/FA) are negligible. The results agree with Pearson correlation analysis. This reinforces the robustness and interpretability of the findings.

From an overall perspective, the models follow the trend: CatBoost > XGBoost > RF > SVR. The integration of swarm intelligence algorithms significantly enhanced both accuracy and stability, with WOA and FFA demonstrating the most prominent optimization effects [218]. Based on a comprehensive evaluation across R², RMSE, and MAE, the CatBoost-WOA model achieved the best performance in terms of prediction accuracy, error distribution, and generalization, making it the most promising candidate for modeling the thermal and mechanical properties. Metaheuristic optimization generally contributed positively to prediction performance [219]. Among the algorithms, WOA and GWO were especially effective in improving both precision and robustness across various models. CatBoost-WOA and XGBoost-WOA were identified as the most balanced and reliable model combinations across all three performance indicators.
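To make the role of the metaheuristic concrete, the sketch below shows a basic grey wolf optimizer (GWO) minimizing a function over box bounds. It is a textbook-style illustration, not the configuration used in the study: the pack size, iteration count, seed, and the `sphere` objective are assumptions, and in a tuning setting the objective would instead return the mean cross-validated error for a given hyperparameter vector.

```python
import random


def grey_wolf_optimize(objective, bounds, n_wolves=20, n_iters=100, seed=0):
    """Minimise `objective` over box `bounds` with a basic grey wolf optimizer."""
    rng = random.Random(seed)
    wolves = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_wolves)]
    best_pos, best_val = None, float("inf")

    for t in range(n_iters):
        # Rank the pack; the three best wolves (alpha, beta, delta) lead.
        ranked = sorted(wolves, key=objective)
        leaders = [list(w) for w in ranked[:3]]
        lead_val = objective(leaders[0])
        if lead_val < best_val:
            best_pos, best_val = list(leaders[0]), lead_val

        a = 2.0 - 2.0 * t / n_iters  # exploration factor decays from 2 toward 0
        for wolf in wolves:
            for d, (lo, hi) in enumerate(bounds):
                pulled = 0.0
                for leader in leaders:
                    A = 2.0 * a * rng.random() - a
                    C = 2.0 * rng.random()
                    distance = abs(C * leader[d] - wolf[d])
                    pulled += leader[d] - A * distance
                # Average the three leader-guided moves and clip to the box.
                wolf[d] = min(max(pulled / 3.0, lo), hi)

    # Check the pack once more after the last position update.
    final = min(wolves, key=objective)
    if objective(final) < best_val:
        best_pos, best_val = list(final), objective(final)
    return best_pos, best_val


def sphere(position):
    """Toy objective standing in for a cross-validated error surface."""
    return sum(v * v for v in position)


bounds = [(-5.0, 5.0), (-5.0, 5.0)]
best_pos, best_val = grey_wolf_optimize(sphere, bounds)
print(best_pos, best_val)
```

WOA, PSO, GA, and FFA differ in how candidate positions are updated, but they plug into the same loop: propose hyperparameters, score them by cross-validation, and keep the best.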

3.6. Future Improvements

The proposed multi-model optimization approach demonstrated excellent performance in predicting the thermophysical and mechanical properties of cementitious composites. Particularly for key indicators such as thermal conductivity, latent heat, and compressive strength, the optimized models exhibited high predictive accuracy and stability. The integration of various swarm intelligence algorithms significantly improved the generalization capability and consistency of traditional machine learning models. Moreover, the variable selection framework and cross-validation scheme employed in this study provide a transparent and practical modeling pipeline, offering valuable guidance for researchers working on similar composite systems.

The current study adopted a fixed data preprocessing pipeline and static hyperparameter spaces. Although multiple optimization strategies were applied, the sensitivity of model performance to different feature engineering methods, normalization techniques, and hyperparameter tuning has not been fully explored. Future research should consider the following improvements:

1. Expand the dataset scope, especially by incorporating real-world and field-condition samples, to improve model robustness.

2. Integrate microstructural descriptors that are strongly correlated with heat conduction mechanisms to improve model generalizability and mechanistic interpretability.

4. Conclusions

This study presents a comprehensive exploration of the intelligent prediction of thermophysical and mechanical properties of cement pastes incorporating bio-based phase change materials. By combining bibliometric analysis with the multi-algorithm optimization of traditional machine learning models, this study reveals key research trends and thematic developments. In addition, it builds and validates high-accuracy models suitable for prediction and material design. The main conclusions are summarized as follows:

1. Based on a bibliometric analysis of 5928 core publications from 2013 to 2024, this study systematically reveals five major research hotspots—“machine learning,” “thermal conductivity,” “energy efficiency,” “mechanical properties,” and “thermal energy storage”. The bibliometric analysis not only identifies key research hotspots but also demonstrates the growing influence of machine learning in the cementitious materials field. These hotspots represent a shift from traditional studies focused on basic thermal properties to more complex, multifunctional applications.

2. PCM dosage, water-to-cement ratio, and the latent heat of BPCMs are identified as dominant factors influencing thermal conductivity, latent heat capacity, and compressive strength. A strong negative correlation is observed between BPCM dosage and both compressive strength and thermal conductivity. This indicates that, while the incorporation of BPCMs improves thermal energy storage, it may lead to a decline in mechanical performance. This trade-off highlights the challenge of balancing multifunctional properties in composites.

3. Four mainstream predictive models were developed using 100 experimental samples to estimate the key performance metrics Tc, LH, and CS. By combining traditional machine learning models (e.g., CatBoost, SVR) with metaheuristic algorithms, this study demonstrated a significant improvement in predictive accuracy. Specifically, the CatBoost-WOA and SVR-GWO hybrid models outperformed conventional models in terms of robustness, showing R² improvements of up to 44%. These results confirm that the hybrid approach not only enhances predictive performance but also stabilizes predictions across diverse datasets. This methodological advancement makes the models more reliable for material design.

4. The CatBoost-WOA model demonstrated the best performance, achieving R² values of 0.927 (Tc), 0.955 (LH), and 0.944 (CS), with corresponding RMSEs of 0.0057 W/m·K, 1.84 J/g, and 2.91 MPa. Compared to unoptimized models, its prediction accuracy improved by 38.2%, 41.7%, and 44.4%, respectively. These results highlight its excellent robustness and practical adaptability for engineering applications.

5. GWO emerged as one of the most effective metaheuristic algorithms for hyperparameter tuning, particularly for small-sample datasets. Its application led to substantial reductions in prediction error, with the RMSE for compressive strength prediction decreasing by a factor of more than 3.3. In compressive strength (CS) prediction, the coefficient of determination (R²) increased from 0.77 to 0.79. These results show that GWO is well suited to fine-tuning models when data are limited, making it a valuable tool for predictive modeling in fields where high accuracy is required despite small datasets.

6. The predictive models, through coordinated optimization, effectively manage the highly nonlinear relationships among input parameters such as PCM dosage and phase change temperature. This adaptability ensures the stability of the models even when parameters fluctuate. Their ability to generalize well across diverse datasets underscores their potential for use in real-world applications.

Author Contributions

Conceptualization, L.L. and Z.L.; methodology, L.L.; validation, L.L., Z.L., and W.Z.; formal analysis, L.L.; investigation, L.L.; resources, Z.L.; data curation, H.M.; writing—original draft preparation, W.S. and L.L.; writing—review and editing, W.S. and L.Y.G.-Z.; supervision, Z.L.; project administration, Z.L.; funding acquisition, Z.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Science and Technology Research and Development Project of Xinjiang Transportation Investment (Group) Co., Ltd. (Research and Demonstration of Near-Zero-Carbon Smart Service Area Construction Technology with Water Resource Recycling and Energy Self-Sufficiency in Desert Regions, No. XJJTZKX-FWCG-202411-0736), the National Key R&D Program of China (No. 2024YFE0116202), and the Fundamental Research Funds for the Central Universities, Chang’an University (No. 300102214906).

Institutional Review Board Statement

Not applicable.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding authors.

Conflicts of Interest

Leifa Li is employed by Xinjiang Jiaotou Construction Management Co., Ltd. This research was funded by the Science and Technology Research and Development Project of Xinjiang Transportation Investment (Group) Co., Ltd. (Research and Demonstration of Near-Zero-Carbon Smart Service Area Construction Technology with Water Resource Recycling and Energy Self-Sufficiency in Desert Regions, No. XJJTZKX-FWCG-202411-0736). The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

Osibodu, S.J.; Adeyinka, A.M.; Mbelu, O.V. Phase Change Material Integration in Concrete for Thermal Energy Storage: Techniques and Applications in Sustainable Building. Sustain. Energy Res. 2024, 11, 45.
Yadav, A.; Samykano, M.; Pandey, A.K.; Tyagi, V.V.; Devarajan, R.; Sudhakar, K.; Noor, M.M. A Systematic Review on Bio-Based Phase Change Materials. Int. J. Automot. Mech. Eng. 2023, 20, 10547–10558.
Yu, Z.; Shao, R.; Li, J.; Wu, C. An In-Depth Review of Phase Change Materials in Concrete for Enhancing Building Energy-Efficient Temperature Control Systems. J. Energy Storage 2024, 104, 114533.
Ling, T.-C.; Poon, C.-S. Use of Phase Change Materials for Thermal Energy Storage in Concrete: An Overview. Constr. Build. Mater. 2013, 46, 55–62.
Rathore, P.K.S.; Shukla, S.K. Potential of Macroencapsulated PCM for Thermal Energy Storage in Buildings: A Comprehensive Review. Constr. Build. Mater. 2019, 225, 723–744.
Eddhahak-Ouni, A.; Drissi, S.; Colin, J.; Neji, J.; Care, S. Experimental and Multi-Scale Analysis of the Thermal Properties of Portland Cement Concretes Embedded with Microencapsulated Phase Change Materials (PCMs). Appl. Therm. Eng. 2014, 64, 32–39.
Ramakrishnan, S.; Sanjayan, J.; Wang, X.; Alam, M.; Wilson, J. A Novel Paraffin/Expanded Perlite Composite Phase Change Material for Prevention of PCM Leakage in Cementitious Composites. Appl. Energy 2015, 157, 85–94.
Jeon, I.K.; Azzam, A.; Al Jebaei, H.; Kim, Y.R.; Aryal, A.; Baltazar, J.C. Effects of Shape-Stabilized Phase Change Materials in Cementitious Composites on Thermal-Mechanical Properties and Economic Benefits. Appl. Therm. Eng. 2023, 219, 119444.
Marchi, S.; Pagliolico, S.; Sassi, G. Characterization of Panels Containing Micro-Encapsulated Phase Change Materials. Energy Convers. Manag. 2013, 74, 261–268.
Rashid, F.L.; Al-Obaidi, M.A.; Dulaimi, A.; Mahmood, D.M.N.; Sopian, K. A Review of Recent Improvements, Developments, and Effects of Using Phase-Change Materials in Buildings to Store Thermal Energy. Design 2023, 7, 90.
Nalla, B.T.; Subbiah, G.; Das, S.N.; Kumar, S.; Swarnkar, S.K.; Kaliappan, N. Bio-Based Phase-Change Materials for Thermal Energy Storage: Recent Advances, Challenges, and Outlook. Results Eng. 2025, 28, 107087.
Siddesh, J.S.; Shivaprasad, K.N.; Yang, H.M. Enhancing the Thermal Performance of Cementitious Composites: A Comprehensive Review of Phase Change Material Integration. Appl. Therm. Eng. 2025, 268, 125849.
Kumar, N.; Thapliyal, P.C.; Kumar, A.; Kumar, A. A Mini Review on Paraffin-Graphene and Related Hybrid Phase Change Materials for Building Energy Applications. Energy Storage 2023, 5, e350. [Google Scholar]
Zadshir, M.; Kim, B.W.; Yin, H. Bio-Based Phase Change Materials for Sustainable Development. Materials 2024, 17, 4816. [Google Scholar] [CrossRef] [PubMed]
Liu, S.; Han, J.; Shen, Y.; Khan, S.Y.; Ji, W.; Jin, H.; Kumar, M. The Contribution of Artificial Intelligence to Phase Change Materials in Thermal Energy Storage: From Prediction to Optimization. Renew. Energy 2025, 238, 121973. [Google Scholar] [CrossRef]
da Cunha, S.R.L.; de Aguiar, J.L.B. Phase Change Materials and Energy Efficiency of Buildings: A Review of Knowledge. J. Energy Storage 2020, 27, 101083. [Google Scholar] [CrossRef]
Liu, L.; Hammami, N.; Trovalet, L.; Bigot, D.; Habas, J.P.; Malet-Damour, B. Description of Phase Change Materials (PCMs) Used in Buildings under Various Climates: A Review. J. Energy Storage 2022, 56, 105760. [Google Scholar] [CrossRef]
Mehrizi, A.A.; Karimi-Maleh, H.; Naddafi, M.; Karimi, F. Application of bio-based phase change materials for effective heat management. J. Energy Storage 2023, 61, 106859. [Google Scholar] [CrossRef]
Shi, T.; Liu, Y.; Hu, Z.; Cen, M.; Zeng, C.; Xu, J.; Zhao, Z. Deformation Performance and Fracture Toughness of Carbon Nanofiber-Modified Cement-Based Materials. ACI Mater. J. 2022, 119, 119–128. [Google Scholar]
Yang, H.; Cui, H.; Tang, W.; Li, Z.; Han, N.; Xing, F. A Critical Review on Research Progress of Graphene/Cement Based Composites. Compos. Part A Appl. Sci. Manuf. 2017, 102, 273–296. [Google Scholar] [CrossRef]
Janjaroen, T.; Khammahong, S.; Tuichai, W.; Karaphun, A.; Phrompet, C.; Sriwong, C.; Ruttanapun, C. The Mechanical and Thermal Properties of Cement CAST Mortar/Graphene Oxide Composites Materials. Int. J. Concr. Struct. Mater. 2022, 16, 34. [Google Scholar] [CrossRef]
Yadav, A.; Pandey, A.K.; Samykano, M.; Kalidasan, B.; Said, Z. A Review of Organic Phase Change Materials and Their Adaptation for Thermal Energy Storage. Int. Mater. Rev. 2024, 69, 380–446. [Google Scholar] [CrossRef]
Marani, A.; Nehdi, M.L. Machine Learning Prediction of Compressive Strength for Phase Change Materials Integrated Cementitious Composites. Constr. Build. Mater. 2020, 265, 120286. [Google Scholar] [CrossRef]
Shafaie, V.; Rad, M.M. DEM-driven investigation and AutoML-enhanced prediction of macroscopic behavior in cementitious composites with variable frictional parameters. Mater. Des. 2025, 254, 114069. [Google Scholar] [CrossRef]
Al-Taai, S.R.; Azize, N.M.; Thoeny, Z.A.; Imran, H.; Bernardo, L.F.; Al-Khafaji, Z. XGBoost Prediction Model Optimized with Bayesian for the Compressive Strength of Eco-Friendly Concrete Containing Ground Granulated Blast Furnace Slag and Recycled Coarse Aggregate. Appl. Sci. 2023, 13, 8889. [Google Scholar] [CrossRef]
Ahmad, A.; Ostrowski, K.A.; Maślak, M.; Farooq, F.; Mehmood, I.; Nafees, A. Comparative Study of Supervised Machine Learning Algorithms for Predicting the Compressive Strength of Concrete at High Temperature. Materials 2021, 14, 4222. [Google Scholar] [CrossRef]
Hassan, R.; Baghban, A. Pioneering Machine Learning Techniques to Estimate Thermal Conductivity of Carbon-Based Phase Change Materials: A Comprehensive Modeling Framework. Case Stud. Therm. Eng. 2025, 73, 106648. [Google Scholar] [CrossRef]
Moein, M.M.; Saradar, A.; Rahmati, K.; Mousavinejad, S.H.G.; Bristow, J.; Aramali, V.; Karakouzian, M. Predictive Models for Concrete Properties Using Machine Learning and Deep Learning Approaches: A Review. J. Build. Eng. 2023, 63, 105444. [Google Scholar]
Li, Y.; Shen, J.; Lin, H.; Li, Y. Optimization Design for Alkali-Activated Slag-Fly Ash Geopolymer Concrete Based on Artificial Intelligence Considering Compressive Strength, Cost, and Carbon Emission. J. Build. Eng. 2023, 75, 106929. [Google Scholar]
Bragalia, M.; Lamastra, F.R.; Berrocal, J.A.; Paleari, L.; Nanni, F. Sustainable phase change materials (PCMs): Waste fat from cooking pork meat confined in polypropylene fibrous mat from waste surgical mask and porous bio-silica. Mater. Today Sustain. 2023, 23, 100454. [Google Scholar] [CrossRef]
Lomada, R.R.; Sergei, S.; Vatin, N.I.; Joshi, A. ML Prediction and ANN-PSO Based Optimization for Compressive Strength of Blended Concrete. Cogent Eng. 2024, 11, 2380347. [Google Scholar] [CrossRef]
Nguyen, H.; Cao, M.T.; Tran, X.L.; Tran, T.H.; Hoang, N.D. A Novel Whale Optimization Algorithm Optimized XGBoost Regression for Estimating Bearing Capacity of Concrete Piles. Neural Comput. Appl. 2023, 35, 3825–3852. [Google Scholar] [CrossRef]
Urkude, N.; Hora, M.S.; Utkarsh. Enhancing Machine Learning Models with Metaheuristic Optimization Techniques for Accurate Prediction of PSC in FRP-Reinforced Concrete Slabs. Mech. Adv. Mater. Struct. 2024, 32, 3591–3607. [Google Scholar] [CrossRef]
Yang, P.; Li, C.; Qiu, Y. Metaheuristic Optimization of Random Forest for Predicting Punch Shear Strength of FRP-Reinforced Concrete Beams. Materials 2023, 16, 4034. [Google Scholar] [CrossRef]
Yang, Y.; Liu, G.; Zhang, H.; Zhang, Y.; Yang, X. Predicting the Compressive Strength of Environmentally Friendly Concrete Using Multiple Machine Learning Algorithms. Buildings 2024, 14, 190. [Google Scholar] [CrossRef]
Karimi Sharafshadeh, B.; Ketabdari, M.J.; Azarsina, F.; Amiri, M.; Nehdi, M.L. New Fuzzy-Heuristic Methodology for Analyzing Compression Load Capacity of Composite Columns. Buildings 2023, 13, 125. [Google Scholar] [CrossRef]
Donthu, N.; Kumar, S.; Mukherjee, D.; Pandey, N.; Lim, W.M. How to Conduct a Bibliometric Analysis: An Overview and Guidelines. J. Bus. Res. 2021, 133, 285–296. [Google Scholar] [CrossRef]
Marzi, G.; Balzano, M.; Caputo, A.; Pellegrini, M.M. Guidelines for Bibliometric-Systematic Literature Reviews: 10 Steps to Combine Analysis, Synthesis and Theory Development. Int. J. Manag. Rev. 2025, 27, 81–103. [Google Scholar] [CrossRef]
Kumar, R. Bibliometric Analysis: Comprehensive Insights into Tools, Techniques, Applications, and Solutions for Research Excellence. Spectr. Eng. Manag. Sci. 2025, 3, 45–62. [Google Scholar] [CrossRef]
Arachchi, K.K.; Mirza, O.; Mashiri, F.; Pathirana, S. Systematic Review of the Use of Phase Change Material (PCM) in Concrete and the Fire Performance of PCM in Concrete. Aust. J. Struct. Eng. 2025, 26, 175–188. [Google Scholar] [CrossRef]
Al-Yasiri, Q.; Szabó, M. Case study on the optimal thickness of phase change material incorporated composite roof under hot climate conditions. Case Stud. Constr. Mater. 2021, 14, e00522. [Google Scholar] [CrossRef]
Al-Yasiri, Q.; Szabó, M. Incorporation of phase change materials into building envelope for thermal comfort and energy saving: A comprehensive analysis. J. Build. Eng. 2021, 36, 102122. [Google Scholar] [CrossRef]
Baetens, R.; Jelle, B.P.; Gustavsen, A. Phase change materials for building applications: A state-of-the-art review. Energy Build. 2010, 42, 1361–1368. [Google Scholar] [CrossRef]
Boussaba, L.; Makhlouf, S.; Foufa, A.; Lefebvre, G.; Royon, L. vegetable fat: A low-cost bio-based phase change material for thermal energy storage in buildings. J. Build. Eng. 2019, 21, 222–229. [Google Scholar] [CrossRef]
Chen, F.; Wolcott, M. Polyethylene/paraffin binary composites for phase change material energy storage in building: A morphology, thermal properties, and paraffin leakage study. Sol. Energy Mater. Sol. Cells 2015, 137, 79–85. [Google Scholar] [CrossRef]
Qin, Y.; Ghalambaz, M.; Sheremet, M.; Fteiti, M.; Alresheedi, F. A bibliometrics study of phase change materials (PCMs). J. Energy Storage 2023, 73, 108987. [Google Scholar] [CrossRef]
Alipour, A.; Eslami, F.; Sadrameli, S.M. A novel bio-based phase change material of methyl palmitate and decanoic acid eutectic mixture: Thermodynamic modeling and thermal performance. Chem. Thermodyn. Therm. Anal. 2023, 10, 100111. [Google Scholar] [CrossRef]
Wu, D.; Rahim, M.; El Ganaoui, M.; Bennacer, R.; Liu, B. Multilayer assembly of phase change material and bio-based concrete: A passive envelope to improve the energy and hygrothermal performance of buildings. Energy Convers. Manag. 2022, 257, 115454. [Google Scholar] [CrossRef]
Das, D.; Crosby, R.; Paul, M.C. A comprehensive review on the form stable phase change materials for storing renewable heat preparation, characterization and application. J. Energy Storage 2025, 110, 115284. [Google Scholar] [CrossRef]
Li, Y.; Nord, N.; Xiao, Q.; Tereshchenko, T. Building heating applications with phase change material: A comprehensive review. J. Energy Storage 2020, 31, 101634. [Google Scholar] [CrossRef]
Cai, R.; Sun, Z.; Yu, H.; Meng, E.; Wang, J.; Dai, M. Review on optimization of phase change parameters in phase change material building envelopes. J. Build. Eng. 2021, 35, 101979. [Google Scholar] [CrossRef]
Song, M.; Niu, F.; Mao, N.; Hu, Y.; Deng, S. Review on building energy performance improvement using phase change materials. Energy Build. 2018, 158, 776–793. [Google Scholar] [CrossRef]
Benhorma, A.; Bensenouci, A.; Teggar, M.; Ismail, K.A.R.; Arıcı, M.; Mezaache, E.; Laouer, A.; Lino, F.A.M. Prospects and challenges of bio-based phase change materials: An up to date review. J. Energy Storage 2024, 90, 111713. [Google Scholar] [CrossRef]
Chen, H.; Tsang, Y.P.; Wu, C.H. When text mining meets science mapping in the bibliometric analysis: A review and future opportunities. Int. J. Eng. Bus. Manag. 2023, 15, 18479790231222349. [Google Scholar] [CrossRef]
Mabrouk, R.; Naji, H.; Benim, A.C.; Dhahri, H. A state of the art review on sensible and latent heat thermal energy storage processes in porous media: Mesoscopic simulation. Appl. Sci. 2022, 12, 6995. [Google Scholar] [CrossRef]
Li, C.; Sun, Z.; Wang, Y.; Zhu, J.; Wu, J.; Feng, L.; Wen, X.; Cai, W.; Yu, H.; Wang, M.; et al. A novel biogenic porous core/shell-based shape-stabilized phase change material for building energy saving. J. Energy Storage 2024, 95, 112504. [Google Scholar] [CrossRef]
Rmili, Y.; Ndiaye, K.; Plancher, L.; Tahar, Z.E.A.; Cousture, A.; Melinge, Y. Properties and durability of cementitious composites incorporating solid-solid phase change materials. Appl. Sci. 2024, 14, 2040. [Google Scholar] [CrossRef]
Li, D.; Zhuang, B.; Chen, Y.; Li, B.; Landry, V.; Kaboorani, A.; Wang, X.A. Incorporation technology of bio-based phase change materials for building envelope: A review. Energy Build. 2022, 260, 111920. [Google Scholar] [CrossRef]
Dams, B.; Maskell, D.; Shea, A.; Allen, S.; Cascione, V.; Walker, P. Upscaling bio-based construction: Challenges and opportunities. Build. Res. Inf. 2023, 51, 764–782. [Google Scholar] [CrossRef]
Rathore, P.K.S.; Gupta, N.K.; Yadav, D.; Shukla, S.K.; Kaul, S. Thermal performance of the building envelope integrated with phase change material for thermal energy storage: An updated review. Sustain. Cities Soc. 2022, 79, 103690. [Google Scholar] [CrossRef]
Huang, J.; Li, M.; Fan, D. Core-shell particles for devising high-performance full-day radiative cooling paint. Appl. Mater. Today 2021, 25, 101209. [Google Scholar] [CrossRef]
Cao, V.D.; Pilehvar, S.; Salas-Bringas, C.; Szczotok, A.M.; Bui, T.Q.; Carmona, M.; Rodriguez, J.F.; Kjøniksen, A.-L. Thermal analysis of geopolymer concrete walls containing microencapsulated phase change materials for building applications. Sol. Energy 2019, 178, 295–307. [Google Scholar] [CrossRef]
Jin Ong, P.; Leow, Y.; Yun Debbie Soo, X.; Chua, M.H.; Ni, X.; Suwardi, A.; Tan, C.K.I.; Zheng, R.; Wei, F.; Xu, J.; et al. Valorization of Spent coffee Grounds: A sustainable resource for Bio-based phase change materials for thermal energy storage. Waste Manag. 2023, 157, 339–347. [Google Scholar] [CrossRef] [PubMed]
Mondal, S. Phase change materials for smart textiles–An overview. Appl. Therm. Eng. 2008, 28, 1536–1550. [Google Scholar] [CrossRef]
Fabiani, C.; Pisello, A.L.; Barbanera, M.; Cabeza, L.F. Palm Oil-Based Bio-PCM for Energy Efficient Building Applications: Multipurpose Thermal Investigation and Life Cycle Assessment. J. Energy Storage 2020, 28, 101129. [Google Scholar] [CrossRef]
Sarcinella, A.; Cunha, S.; Aguiar, I.; Aguiar, J.; Frigione, M. Sustainable organic phase change materials for sustainable energy efficiency solutions. Polymers 2025, 17, 1343. [Google Scholar] [CrossRef]
Pinheiro, C.; Landi Jr, S.; Lima Jr, O.; Ribas, L.; Hammes, N.; Segundo, I.R.; Carneiro, J. Advancements in phase change materials in asphalt pavements for mitigation of urban heat island effect: Bibliometric analysis and systematic review. Sensors 2023, 23, 7741. [Google Scholar] [CrossRef]
Xu, L.; Yang, R. Stearic acid/inorganic porous matrix phase change composite for hot water systems. Molecules 2019, 24, 1482. [Google Scholar] [CrossRef]
Reddy, V.J.; Ghazali, M.F.; Kumarasamy, S. Innovations in phase change materials for diverse industrial applications: A comprehensive review. Results Chem. 2024, 8, 101552. [Google Scholar] [CrossRef]
Rashid, F.L.; Al-Obaidi, M.A.; Dulaimi, A.; Bernardo, L.F.A.; Eleiwi, M.A.; Mahood, H.B.; Hashim, A. A review of recent improvements, developments, effects, and challenges on using phase-change materials in concrete for thermal energy storage and release. J. Compos. Sci. 2023, 7, 352. [Google Scholar] [CrossRef]
Bao, X.; Memon, S.A.; Yang, H.; Dong, Z.; Cui, H. Thermal Properties of Cement-Based Composites for Geothermal Energy Applications. Materials 2017, 10, 462. [Google Scholar] [CrossRef]
Sharma, R.; Jang, J.G.; Hu, J.W. Phase-Change Materials in Concrete: Opportunities and Challenges for Sustainable Construction and Building Materials. Materials 2022, 15, 335. [Google Scholar] [CrossRef] [PubMed]
Junaid, M.F.; ur Rehman, Z.; Ijaz, N.; Farooq, R.; Khalid, U.; Ijaz, Z. Performance evaluation of cement-based composites containing phase change materials from energy management and construction standpoints. Constr. Build. Mater. 2024, 416, 135108. [Google Scholar] [CrossRef]
Przybek, A.; Łach, M.; Romańska, P.; Ciemnicka, J.; Prałat, K.; Koper, A. Geopolymer Concretes with Organic Phase Change Materials—Analysis of Thermal Properties and Microstructure. Materials 2025, 18, 2557. [Google Scholar] [CrossRef]
Jiao, K.; Lu, L.; Zhao, L.; Wang, G. Towards passive building thermal regulation: A state-of-the-art review on recent progress of PCM-integrated building envelopes. Sustainability 2024, 16, 6482. [Google Scholar] [CrossRef]
Said, Z.; Pandey, A.K.; Tiwari, A.K.; Kalidasan, B.; Jamil, F.; Thakur, A.K.; Tyagi, V.; Sarı, A.; Ali, H.M. Nano-enhanced phase change materials: Fundamentals and applications. Prog. Energy Combust. Sci. 2024, 104, 101162. [Google Scholar] [CrossRef]
Mishra, R.K.; Verma, K.; Mishra, V.; Chaudhary, B. A review on carbon-based phase change materials for thermal energy storage. J. Energy Storage 2022, 50, 104166. [Google Scholar] [CrossRef]
Amaral, C.; Vicente, R.; Marques, P.A.A.P.; Barros-Timmons, A. Phase change materials and carbon nanostructures for thermal energy storage: A literature review. Renew. Sustain. Energy Rev. 2017, 79, 1212–1228. [Google Scholar] [CrossRef]
Šavija, B.; Zhang, H.; Schlangen, E. Influence of microencapsulated phase change material (PCM) addition on (micro) mechanical properties of cement paste. Materials 2017, 10, 863. [Google Scholar] [CrossRef]
Dehmous, M.; Franquet, E.; Lamrous, N. Mechanical and Thermal Characterizations of Various Thermal Energy Storage Concretes Including Low-Cost Bio-Sourced PCM. Energy Build. 2021, 241, 110878. [Google Scholar] [CrossRef]
Sun, H.; Yu, K.; Jia, M.; Wang, Z.; Yang, Y.; Liu, Y. Heat-Stored Engineered Cementitious Composite Containing Microencapsulated n-Octadecane with Cenosphere Shell. Coatings 2025, 15, 128. [Google Scholar] [CrossRef]
Orsini, F.; Marrone, P.; Santini, S.; Sguerri, L.; Asdrubali, F.; Baldinelli, G.; Bianchi, F.; Presciutti, A. Smart materials: Cementitious mortars and PCM mechanical and thermal characterization. Materials 2021, 14, 4163. [Google Scholar] [CrossRef] [PubMed]
Zhu, M.; Li, J.; Wang, Y.; Meng, F. Experimental Study on Fire Resistance of Phase Change Energy Storage Concrete Partition Walls. Fire 2025, 8, 128. [Google Scholar] [CrossRef]
Hassan, A.; Laghari, M.S.; Rashid, Y. Micro-encapsulated phase change materials: A review of encapsulation, safety and thermal characteristics. Sustainability 2016, 8, 1046. [Google Scholar] [CrossRef]
Jha, S.K.; Sankar, A.; Zhou, Y.; Ghosh, A. Incorporation of phase change materials in buildings. Constr. Mater. 2024, 4, 676–703. [Google Scholar] [CrossRef]
Mandal, S. Advancements in Phase Change Materials: Stabilization Techniques and Applications. Prabha Mater. Sci. Lett. 2024, 3, 254–267. [Google Scholar] [CrossRef]
Min, L.; Liu, Y.; Wang, C.; Du, Y.; Fang, H. A novel stereotyped phase change material with a low leakage rate for new energy storage building applications. Constr. Build. Mater. 2024, 433, 136757. [Google Scholar] [CrossRef]
Navarro, L.; Solé, A.; Martín, M.; Barreneche, C.; Olivieri, L.; Tenorio, J.A.; Cabeza, L.F. Benchmarking of useful phase change materials for a building application. Energy Build. 2019, 182, 45–50. [Google Scholar] [CrossRef]
Abhat, A. Low temperature latent heat thermal energy storage: Heat storage materials. Sol. Energy 1983, 30, 313–332. [Google Scholar] [CrossRef]
Li, G.; Xu, G.; Tao, Z. Effect on the thermal properties of building mortars with microencapsulated phase change materials for radiant floors. Buildings 2023, 13, 2476. [Google Scholar] [CrossRef]
Zhang, X.; Han, Z.; Liu, L.; Xia, X.; Liu, Q.; Duan, Y.; Wang, X. Experimental Study on Mechanical and Thermal Properties of Backfill Body with Paraffin Added. Energies 2023, 17, 217. [Google Scholar] [CrossRef]
Gbekou, F.K.; Benzarti, K.; Boudenne, A.; Eddhahak, A.; Duc, M. Mechanical and thermophysical properties of cement mortars including bio-based microencapsulated phase change materials. Constr. Build. Mater. 2022, 352, 129056. [Google Scholar] [CrossRef]
Cui, H.; Yang, S.; Memon, S.A. Development of carbon nanotube modified cement paste with microencapsulated phase-change material for structural-functional integrated application. Int. J. Mol. Sci. 2015, 16, 8027–8039. [Google Scholar] [CrossRef]
Liu, F.; Wang, J.; Qian, X. Integrating phase change materials into concrete through microencapsulation using cenospheres. Cem. Concr. Compos. 2017, 80, 317–325. [Google Scholar] [CrossRef]
Cabeza, L.F.; Castell, A.; Barreneche, C.D.; De Gracia, A.; Fernández, A.I. Materials used as PCM in thermal energy storage in buildings: A review. Renew. Sustain. Energy Rev. 2011, 15, 1675–1695. [Google Scholar] [CrossRef]
Su, X.; Jiang, H.; Qin, T.; Lin, G. Particle Swarm Optimization-Support Vector Regression (PSO-SVR)-Based Rapid Prediction Method for Radiant Heat Transfer for a Spacecraft Vacuum Thermal Test. Appl. Sci. 2024, 14, 9407. [Google Scholar] [CrossRef]
Nazir, K.; Memon, S.A.; Saurbayeva, A. Predicting the PCM-incorporated building’s performance using optimized linear kernel and tree-based machine learning methods. J. Energy Storage 2024, 94, 112495. [Google Scholar] [CrossRef]
Cunha, S.; Parente, M.; Tinoco, J.; Aguiar, J. Leveraging machine learning for designing sustainable mortars with non-encapsulated PCMs. Sustainability 2024, 16, 6775. [Google Scholar] [CrossRef]
Farooq, F.; Amin, M.N.; Khan, K.; Sadiq, M.R.; Javed, M.F.; Aslam, F.; Alyousef, R. A comparative study of random forest and genetic engineering programming for the prediction of compressive strength of high strength concrete (HSC). Appl. Sci. 2020, 10, 7330. [Google Scholar] [CrossRef]
Liang, B.; Qin, W.; Liao, Z. A Differential Evolutionary-Based XGBoost for Solving Classification of Physical Fitness Test Data of College Students. Mathematics 2025, 13, 1405. [Google Scholar] [CrossRef]
Chou, J.S.; Tsai, C.F.; Pham, A.D.; Lu, Y.H. Machine learning in concrete strength simulations: Multi-nation data analytics. Constr. Build. Mater. 2014, 73, 771–780. [Google Scholar] [CrossRef]
Cook, R.; Lapeyre, J.; Ma, H.; Kumar, A. Prediction of compressive strength of concrete: Critical comparison of performance of a hybrid machine learning model with standalone models. J. Mater. Civ. Eng. 2019, 31, 04019255. [Google Scholar] [CrossRef]
Ebid, A.; Deifalla, A. Using artificial intelligence techniques to predict punching shear capacity of lightweight concrete slabs. Materials 2022, 15, 2732. [Google Scholar] [CrossRef]
Chen, D.; Luo, H.; Liu, Z.; Pan, J.; Wu, Y.; Wang, E.; Lu, C.; Wang, L.; Wang, W.; Ou, G. A Dual-Variable Selection Framework for Enhancing Forest Aboveground Biomass Estimation via Multi-Source Remote Sensing. Remote Sens. 2025, 15, 12939. [Google Scholar] [CrossRef]
Liu, Y.; Yang, T.; Tian, L.; Huang, B.; Yang, J.; Zeng, Z. Ada-xg-CatBoost: A combined forecasting model for gross ecosystem product (GEP) prediction. Sustainability 2024, 16, 7203. [Google Scholar] [CrossRef]
Mehdary, A.; Chehri, A.; Jakimi, A.; Saadane, R. Hyperparameter optimization with genetic algorithms and XGBoost: A step forward in smart grid fraud detection. Sensors 2024, 24, 1230. [Google Scholar] [CrossRef] [PubMed]
Al-Mudhafar, W.J.; Hasan, A.A.; Abbas, M.A.; Wood, D.A. Machine learning with hyperparameter optimization applied in facies-supported permeability modeling in carbonate oil reservoirs. Sci. Rep. 2025, 15, 12939. [Google Scholar] [CrossRef]
Zheng, J.; Yao, T.; Yue, J.; Wang, M.; Xia, S. Compressive strength prediction of BFRC based on a novel hybrid machine learning model. Buildings 2023, 13, 1934. [Google Scholar] [CrossRef]
Malashin, I.; Tynchenko, V.; Gantimurov, A.; Nelyub, V.; Borodulin, A. Boosting-Based Machine Learning Applications in Polymer Science: A Review. Polymers 2025, 17, 499. [Google Scholar] [CrossRef]
Zhuo, H.; Li, T.; Lu, W.; Zhang, Q.; Ji, L.; Li, J. Prediction model for spontaneous combustion temperature of coal based on PSO-XGBoost algorithm. Sci. Rep. 2025, 15, 2752. [Google Scholar] [CrossRef]
Zhao, Y.; Chen, Z.; Jian, X. A High-generalizability machine learning framework for analyzing the homogenized properties of short fiber-reinforced polymer composites. Polymers 2023, 15, 3962. [Google Scholar] [CrossRef]
He, P.; Wu, W. Levy flight-improved grey wolf optimizer algorithm-based support vector regression model for dam deformation prediction. Front. Earth Sci. 2023, 11, 1122937. [Google Scholar] [CrossRef]
Mabdeh, A.N.; Ajin, R.S.; Razavi-Termeh, S.V.; Ahmadlou, M.; Al-Fugara, A.K. Enhancing the performance of machine learning and deep learning-based flood susceptibility models by integrating grey wolf optimizer (GWO) algorithm. Remote Sens. 2024, 16, 2595. [Google Scholar] [CrossRef]
Ahmed, H.U.; Mostafa, R.R.; Mohammed, A.; Sihag, P.; Qadir, A. Support vector regression (SVR) and grey wolf optimization (GWO) to predict the compressive strength of GGBFS-based geopolymer concrete. Neural Comput. Appl. 2023, 35, 2909–2926. [Google Scholar] [CrossRef]
Huang, J.; Sabri, M.M.S.; Ulrikh, D.V.; Ahmad, M.; Alsaffar, K.A.M. Predicting the compressive strength of the cement-fly ash-slag ternary concrete using the firefly algorithm (FA) and random forest (RF) hybrid machine-learning method. Materials 2022, 15, 4193. [Google Scholar] [CrossRef]
Wang, Q.A.; Zhang, J.; Huang, J. Simulation of the compressive strength of cemented tailing backfill through the use of firefly algorithm and random forest model. Shock Vib. 2021, 2021, 5536998. [Google Scholar] [CrossRef]
Rahamathulla, M.Y.; Ramaiah, M. Optimizing Anomaly Detection Models for Edge IIoT with an Enhanced Firefly Algorithm-Based Hyperparameter Tuning Strategy. Results Eng. 2025, 27, 105843. [Google Scholar] [CrossRef]
GB/T 50081-2019; Ministry of Housing and Urban-Rural Development of the People’s Republic of China. Standard for Test Methods of Physical and Mechanical Properties of Concrete. China Architecture & Building Press: Beijing, China, 2019.
Nikbin, I.M.; Beygi, M.H.A.; Kazemi, M.T.; Amiri, J.V.; Rabbanifar, S.; Rahmani, E.; Rahimi, S. A comprehensive investigation into the effect of water to cement ratio and powder content on mechanical properties of self-compacting concrete. Constr. Build. Mater. 2014, 57, 69–80. [Google Scholar] [CrossRef]
Sánchez-Mendieta, C.; Galán-Díaz, J.J.; Martinez-Lage, I. Relationships between density, porosity, compressive strength and permeability in porous concretes: Optimization of properties through control of the water-cement ratio and aggregate type. J. Build. Eng. 2024, 97, 110858. [Google Scholar] [CrossRef]
Badarloo, B.; Lehner, P. Practical Aspects of Correlation Analysis of Compressive Strength from Destructive and Non-Destructive Methods in Different Directions. Infrastructures 2023, 8, 155. [Google Scholar] [CrossRef]
Sayyar, M.; Weerasiri, R.R.; Soroushian, P.; Lu, J. Experimental and numerical study of shape-stable phase-change nanocomposite toward energy-efficient building constructions. Energy Build. 2014, 75, 249–255. [Google Scholar] [CrossRef]
Sukontasukkul, P.; Uthaichotirat, P.; Sangpet, T.; Sisomphon, K.; Newlands, M.; Siripanichgorn, A.; Chindaprasirt, P. Thermal properties of lightweight concrete incorporating high contents of phase change materials. Constr. Build. Mater. 2019, 207, 431–439. [Google Scholar] [CrossRef]
Suttaphakdee, P.; Dulsang, N.; Lorwanishpaisarn, N.; Kasemsiri, P.; Posi, P.; Chindaprasirt, P. Optimizing mix proportion and properties of lightweight concrete incorporated phase change material paraffin/recycled concrete block composite. Constr. Build. Mater. 2016, 127, 475–483. [Google Scholar] [CrossRef]
Cunha, S.; Lima, M.; Aguiar, J.B. Influence of adding phase change materials on the physical and mechanical properties of cement mortars. Constr. Build. Mater. 2016, 127, 1–10. [Google Scholar] [CrossRef]
Zalba, B.; Marín, J.M.; Cabeza, L.F.; Mehling, H. Review on thermal energy storage with phase change: Materials, heat transfer analysis and applications. Appl. Therm. Eng. 2003, 23, 251–283. [Google Scholar] [CrossRef]
Zhang, C.; Pang, C.; Mao, Y.; Tang, Z. Effect and mechanism of polyethylene glycol (PEG) used as a phase change composite on cement paste. Materials 2022, 15, 2749. [Google Scholar] [CrossRef]
Chem, Y.; Jiang, Q.H.; Xin, L.I. Research status and application of phase change materials. J. Mater. Eng. 2019, 47, 1–10. [Google Scholar]
Shi, J.; Chen, Z.; Shao, S.; Zheng, J. Experimental and numerical study on effective thermal conductivity of novel form-stable basalt fiber composite concrete with PCMs for thermal storage. Appl. Therm. Eng. 2014, 66, 156–161. [Google Scholar] [CrossRef]
Niall, D.; Kinnane, O.; West, R.P.; McCormack, S. Mechanical and thermal evaluation of different types of PCM-concrete composite panels. J. Struct. Integr. Maint. 2017, 2, 100–108. [Google Scholar] [CrossRef]
Yang, J.; Zhou, J.; Nie, Z.; Liu, L. Preparation and Property Analysis of Phase Change Concrete PEG/SiO₂-CPCM. J. Compos. Adv. Mater. 2019, 29, 21–26. [Google Scholar] [CrossRef]
Ping, Z.H.; Nguyen, Q.T.; Chen, S.M.; Zhou, J.Q.; Ding, Y.D. States of water in different hydrophilic polymers-DSC and FTIR studies. Polymer 2001, 42, 8461–8467. [Google Scholar] [CrossRef]
Alqurashi, M. Data-Driven Insights into Concrete Flow and Strength: Advancing Smart Material Design Using Machine Learning Strategies. Buildings 2025, 15, 2244. [Google Scholar] [CrossRef]
Ghosh, A.; Ransinchung, G.D. Application of machine learning algorithm to assess the efficacy of varying industrial wastes and curing methods on strength development of geopolymer concrete. Constr. Build. Mater. 2022, 341, 127828. [Google Scholar] [CrossRef]
Vakharia, V.; Gujar, R. Prediction of compressive strength and portland cement composition using cross-validation and feature ranking techniques. Constr. Build. Mater. 2019, 225, 292–301. [Google Scholar] [CrossRef]
Ahmad, W.; Veeraghantla, V.S.S.C.S.; Byrne, A. Advancing sustainable concrete using biochar: Experimental and modelling study for mechanical strength evaluation. Sustainability 2025, 17, 2516. [Google Scholar] [CrossRef]

Figure 1. Bibliometric analysis of research on bio-based phase change materials. (a) Annual publication trends; (b) Distribution of document types; (c) Disciplinary subject classification.

Figure 2. Knowledge mapping analysis of research on bio-based phase change materials. (a) Country collaboration network; (b) Institutional collaboration network; (c) Author collaboration network; (d) Author co-citation network; (e) Highlighted core author network; (f) Journal co-citation network.

Figure 3. Analysis of research hotspots and evolutionary trends. (a) Keyword co-occurrence and burst detection; (b) Evolution of research hotspots over time; (c) Topic clustering and development path mapping; (d) Keyword clustering, burst detection, and temporal evolution analysis.

Figure 4. Modeling and optimization workflow for performance prediction of bio-based composite cementitious materials.

Figure 5. (a) Pearson correlation coefficient heatmap; (b) Significance annotation of correlations (significance level α = 0.1); (c) Combined map of correlation coefficients and significance levels.

Figure 6. Boxplots of input variables (Tm, Lh, PCM dosage, C, W, W/C, FA, CA, and CA/FA).

Figure 7. Boxplots of output variables (Tc, LH, and CS).

Figure 8. Regression graph for Tc. (a) SVR; (b) SVR-GA; (c) SVR-PSO; (d) SVR-WOA; (e) SVR-GWO; (f) SVR-FFA.

Figure 9. Regression graph for LH. (a) SVR; (b) SVR-GA; (c) SVR-PSO; (d) SVR-WOA; (e) SVR-GWO; (f) SVR-FFA.

Figure 10. Regression graph for CS. (a) SVR; (b) SVR-GA; (c) SVR-PSO; (d) SVR-WOA; (e) SVR-GWO; (f) SVR-FFA.

Figure 11. Comparison of error for SVR and its optimized models. (a) Thermal conductivity (Tc); (b) Latent heat (LH); (c) Compressive strength (CS).

Figure 12. Regression graph for Tc. (a) RF; (b) RF-GA; (c) RF-PSO; (d) RF-WOA; (e) RF-GWO; (f) RF-FFA.

Figure 13. Regression graph for LH. (a) RF; (b) RF-GA; (c) RF-PSO; (d) RF-WOA; (e) RF-GWO; (f) RF-FFA.

Figure 14. Regression graph for CS. (a) RF; (b) RF-GA; (c) RF-PSO; (d) RF-WOA; (e) RF-GWO; (f) RF-FFA.

Figure 15. Comparison of error distributions for RF and its optimized models. (a) Thermal conductivity (Tc); (b) Latent heat (LH); (c) Compressive strength (CS).

Figure 16. Regression graph for Tc. (a) XGBoost; (b) XGBoost-GA; (c) XGBoost-PSO; (d) XGBoost-WOA; (e) XGBoost-GWO; (f) XGBoost-FFA.

Figure 17. Regression graph for LH. (a) XGBoost; (b) XGBoost-GA; (c) XGBoost-PSO; (d) XGBoost-WOA; (e) XGBoost-GWO; (f) XGBoost-FFA.

Figure 18. Regression graph for CS. (a) XGBoost; (b) XGBoost-GA; (c) XGBoost-PSO; (d) XGBoost-WOA; (e) XGBoost-GWO; (f) XGBoost-FFA.

Figure 19. Comparison of error distributions for XGBoost and its optimized models. (a) Thermal conductivity (Tc); (b) Latent heat (LH); (c) Compressive strength (CS).

Figure 20. Regression graph for Tc. (a) CatBoost; (b) CatBoost-GA; (c) CatBoost-PSO; (d) CatBoost-WOA; (e) CatBoost-GWO; (f) CatBoost-FFA.

Figure 21. Regression graph for LH. (a) CatBoost; (b) CatBoost-GA; (c) CatBoost-PSO; (d) CatBoost-WOA; (e) CatBoost-GWO; (f) CatBoost-FFA.

Figure 22. Regression graph for CS. (a) CatBoost; (b) CatBoost-GA; (c) CatBoost-PSO; (d) CatBoost-WOA; (e) CatBoost-GWO; (f) CatBoost-FFA.

Figure 23. Comparison of error distributions for CatBoost and its optimized models. (a) Thermal conductivity (Tc); (b) Latent heat (LH); (c) Compressive strength (CS).

Figure 24. Training–testing data splitting and validation flowchart based on K-fold cross-validation.

Figure 25. Comparison of predictive models on Tc, LH, and CS. (a) SVR; (b) RF; (c) XGBoost; (d) CatBoost.

Figure 26. Analysis of feature importance.

Table 1. Optimization ranges and selected hyperparameters across models.

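ML Model	Optimizer	Search Space	Population Size	Iterations	Best Configuration
SVR	GA	C ∈ [0.1, 100]; γ ∈ [0.001, 1]	30	100	C = 12–16; γ = 0.01–0.02
SVR	PSO	C ∈ [0.1, 100]; γ ∈ [0.001, 1]	30	100	C = 13–17; γ = 0.01–0.03
SVR	WOA	C ∈ [0.1, 100]; γ ∈ [0.001, 1]	25	120	C = 17–20; γ = 0.01–0.02
SVR	GWO	C ∈ [0.1, 100]; γ ∈ [0.001, 1]	30	150	C = 13–15; γ = 0.015–0.02
SVR	FFA	C ∈ [0.1, 100]; γ ∈ [0.001, 1]	20	100	C = 11–13; γ = 0.015–0.02
RF	GA	n_estimators ∈ [50, 500]; max_depth ∈ [3, 20]	30	100	n_estimators = 200–240; max_depth = 11–13
RF	PSO	n_estimators ∈ [50, 500]; max_depth ∈ [3, 20]	25	120	n_estimators = 240–260; max_depth = 13–15
RF	WOA	n_estimators ∈ [50, 500]; max_depth ∈ [3, 20]	30	150	n_estimators = 260–290; max_depth = 13–14
RF	GWO	n_estimators ∈ [50, 500]; max_depth ∈ [3, 20]	20	100	n_estimators = 220–240; max_depth = 12–13
RF	FFA	n_estimators ∈ [50, 500]; max_depth ∈ [3, 20]	25	120	n_estimators = 190–210; max_depth = 11–12
XGBoost	GA	learning_rate ∈ [0.01, 0.3]; n_estimators ∈ [50, 500]; max_depth ∈ [3, 15]	30	100	learning_rate = 0.07–0.09; n_estimators = 280–320; max_depth = 9–11
XGBoost	PSO	learning_rate ∈ [0.01, 0.3]; n_estimators ∈ [50, 500]; max_depth ∈ [3, 15]	25	120	learning_rate = 0.06–0.08; n_estimators = 300–330; max_depth = 8–10
XGBoost	WOA	learning_rate ∈ [0.01, 0.3]; n_estimators ∈ [50, 500]; max_depth ∈ [3, 15]	30	150	learning_rate = 0.04–0.06; n_estimators = 270–300; max_depth = 10–12
XGBoost	GWO	learning_rate ∈ [0.01, 0.3]; n_estimators ∈ [50, 500]; max_depth ∈ [3, 15]	20	100	learning_rate = 0.08–0.10; n_estimators = 330–360; max_depth = 9–11
XGBoost	FFA	learning_rate ∈ [0.01, 0.3]; n_estimators ∈ [50, 500]; max_depth ∈ [3, 15]	25	120	learning_rate = 0.05–0.07; n_estimators = 290–320; max_depth = 8–10
CatBoost	GA	learning_rate ∈ [0.01, 0.3]; depth ∈ [3, 15]; iterations ∈ [100, 1000]	30	100	learning_rate = 0.04–0.06; depth = 9–11; iterations = 580–620
CatBoost	PSO	learning_rate ∈ [0.01, 0.3]; depth ∈ [3, 15]; iterations ∈ [100, 1000]	25	120	learning_rate = 0.05–0.07; depth = 8–10; iterations = 620–660
CatBoost	WOA	learning_rate ∈ [0.01, 0.3]; depth ∈ [3, 15]; iterations ∈ [100, 1000]	30	150	learning_rate = 0.04–0.05; depth = 10–12; iterations = 680–720
CatBoost	GWO	learning_rate ∈ [0.01, 0.3]; depth ∈ [3, 15]; iterations ∈ [100, 1000]	20	100	learning_rate = 0.06–0.08; depth = 9–11; iterations = 600–640
CatBoost	FFA	learning_rate ∈ [0.01, 0.3]; depth ∈ [3, 15]; iterations ∈ [100, 1000]	25	120	learning_rate = 0.04–0.06; depth = 8–10; iterations = 560–600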

Note: Due to the stochastic nature of metaheuristic optimization, specific values may vary slightly across runs but consistently fall within the indicated ranges.
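
To illustrate how a swarm optimizer explores one of the search spaces in Table 1, the following is a minimal particle swarm optimization sketch over the SVR ranges (C ∈ [0.1, 100], γ ∈ [0.001, 1]) with the listed population size of 30 and 100 iterations. It is not the authors' implementation. The objective `toy_objective` is a hypothetical stand-in for a cross-validated error, with its minimum placed arbitrarily at C = 15 and γ = 0.015, and the inertia and acceleration coefficients are assumed values that do not come from the paper.

```python
import random

# SVR search space as listed in Table 1: C in [0.1, 100], gamma in [0.001, 1].
BOUNDS = [(0.1, 100.0), (0.001, 1.0)]


def toy_objective(position):
    """Hypothetical stand-in for a cross-validated RMSE (assumption, not real data).

    The minimum is placed arbitrarily at C = 15, gamma = 0.015.
    """
    c, gamma = position
    return ((c - 15.0) / 100.0) ** 2 + (gamma - 0.015) ** 2


def clip(value, low, high):
    """Keep a coordinate inside its search interval."""
    return max(low, min(high, value))


def pso(objective, bounds, n_particles=30, n_iterations=100,
        inertia=0.7, cognitive=1.5, social=1.5, seed=0):
    """Minimize `objective` inside `bounds` with a basic global-best PSO."""
    rng = random.Random(seed)
    dim = len(bounds)
    positions = [[rng.uniform(lo, hi) for lo, hi in bounds]
                 for _ in range(n_particles)]
    velocities = [[0.0] * dim for _ in range(n_particles)]
    personal_best = [p[:] for p in positions]
    personal_score = [objective(p) for p in positions]
    best_idx = min(range(n_particles), key=personal_score.__getitem__)
    global_best = personal_best[best_idx][:]
    global_score = personal_score[best_idx]

    for _ in range(n_iterations):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                # Velocity update: inertia + pull toward personal and global bests.
                velocities[i][d] = (
                    inertia * velocities[i][d]
                    + cognitive * r1 * (personal_best[i][d] - positions[i][d])
                    + social * r2 * (global_best[d] - positions[i][d])
                )
                positions[i][d] = clip(positions[i][d] + velocities[i][d], lo, hi)
            score = objective(positions[i])
            if score < personal_score[i]:
                personal_best[i] = positions[i][:]
                personal_score[i] = score
                if score < global_score:
                    global_best = positions[i][:]
                    global_score = score
    return global_best, global_score


best_position, best_score = pso(toy_objective, BOUNDS)
```

In an actual tuning run, `toy_objective` would be replaced by a function that trains the model with the candidate hyperparameters and returns a validation error. Because the optimizer is stochastic, repeated runs with different seeds would be expected to land in a range of values rather than on a single point, which is consistent with the note under Table 1.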

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Li, L.; Sun, W.; Gómez-Zamorano, L.Y.; Liu, Z.; Zhang, W.; Ma, H. From Research Trend to Performance Prediction: Metaheuristic-Driven Machine Learning Optimization for Cement Pastes Containing Bio-Based Phase Change Materials. Polymers 2025, 17, 2541. https://doi.org/10.3390/polym17182541

