Modeling the Structure–Property Linkages Between the Microstructure and Thermodynamic Properties of Ceramic Particle-Reinforced Metal Matrix Composites Using a Materials Informatics Approach

Xie, Rui; Li, Geng; Cao, Peng; Tan, Zhifei; Wang, Jianru

doi:10.3390/ma18102294


by Rui Xie ¹, Geng Li ², Peng Cao ¹,*, Zhifei Tan ³ and Jianru Wang ⁴

¹ The College of Architecture and Civil Engineering, Beijing University of Technology, Beijing 100124, China

² The Institute of Xi’an Aerospace Solid Propulsion Technology, Xi’an 710025, China

³ Department of Civil and Environmental Engineering, The Hong Kong Polytechnic University, Hong Kong, China

⁴ Academy of Aerospace Solid Propulsion Technology, Xi’an 710025, China

* Author to whom correspondence should be addressed.

Materials 2025, 18(10), 2294; https://doi.org/10.3390/ma18102294

Submission received: 11 April 2025 / Revised: 10 May 2025 / Accepted: 13 May 2025 / Published: 15 May 2025

(This article belongs to the Topic Digital Manufacturing Technology)


Abstract

The application of ceramic particle-reinforced metal matrix composites (CPRMMCs) in the nuclear power sector is primarily dependent on their mechanical and thermal properties. A comprehensive understanding of the structure–property (SP) linkages between microstructures and macroscopic properties is critical for optimizing material properties. However, traditional studies on SP linkages generally rely on experimental methods, theoretical analysis, and numerical simulations, which are often associated with high time and economic costs. To address this challenge, this study proposes a novel method based on Materials Informatics (MI), combining the finite element method (FEM), graph Fourier transform, principal component analysis (PCA), and machine learning models to establish the SP linkages between the microstructure and thermodynamic properties of CPRMMCs. Specifically, FEM is used to model the microstructures of CPRMMCs with varying particle volume fractions and sizes, and their elastic modulus, thermal conductivity, and coefficient of thermal expansion are computed. Next, the statistical features of the microstructure are captured using graph Fourier transform based on two-point spatial correlations, and PCA is applied to reduce dimensionality and extract key features. Finally, a polynomial kernel support vector regression (Poly-SVR) model optimized by Bayesian methods is employed to establish the nonlinear relationship between the microstructure and thermodynamic properties. The results show that this method can effectively predict FEM results using only 5–6 microstructure features, with the R² values exceeding 0.91 for the prediction of thermodynamic properties. This study provides a promising approach for accelerating the innovation and design optimization of CPRMMCs.

Keywords: CPRMMCs; thermodynamic properties; graph Fourier transform; principal component analysis; machine learning

1. Introduction

Ceramic particle-reinforced metal matrix composites (CPRMMCs) have been widely applied in various fields, such as the aerospace, automotive, and energy fields, due to their excellent physical properties, including high elastic modulus, high thermal conductivity, low coefficient of thermal expansion, and outstanding electrical conductivity [1,2,3]. Especially in the field of nuclear energy, the key properties of CPRMMCs, such as the elastic modulus, thermal conductivity, and coefficient of thermal expansion, directly influence the performance and application of materials in nuclear reactors. Although these properties are typically obtained through experimental methods, experiments are time-consuming and have certain limitations, which, to some extent, restrict the design and optimization of new nuclear reactors. Therefore, accurately predicting the thermodynamic properties of CPRMMCs in a short period has become a significant challenge in material design and optimization.

CPRMMCs are typically composed of a metal matrix and ceramic particles, with their macroscopic properties often being influenced by various factors, particularly microstructural features such as particle distribution [4,5], particle volume fraction [6,7], and particle size [8,9]. Extensive experimental studies have been conducted to evaluate the impact of microstructural features on the thermodynamic properties of composites [10,11,12,13,14]. However, experiments are often time-consuming and costly. Additionally, many theoretical models have been employed to study the thermodynamic properties of composites [15,16,17,18,19], but these models are often constrained by various assumptions and limitations, with limited predictive capability. Moreover, numerical simulations based on the finite element method (FEM) can effectively capture the linkages between microstructural features and material properties [20,21,22,23,24]. However, this method suffers from high computational costs and substantial data discretization, which constrain its applicability in large-scale optimization and rapid prediction tasks [25]. Therefore, there is an urgent need to develop a more efficient and accurate method to establish the structure–property (SP) linkages between the microstructure and thermodynamic properties of CPRMMCs.

With the development of the Materials Genome Initiative (MGI) [26] and data science technologies, data-driven approaches have provided new avenues for accelerating the study of SP linkages [27]. By mining and analyzing vast amounts of historical data and combining advanced informatics techniques, data-driven methods can accurately extract the SP linkages of materials without the need for new rounds of experiments and simulations. This approach is known as Materials Informatics (MI) [28,29,30]. Specifically, the application of this method typically relies on the establishment of “input” and “output” datasets. The “input” dataset typically includes design parameters [31,32] and microstructural features [33,34], while the “output” dataset typically contains target properties, which are obtained through experimental measurements or high-throughput computational simulations [35,36]. Next, data science techniques are used to perform statistical analysis on the datasets, simplifying the data structure by extracting and reducing the dimensionality of the microstructural features, thereby enhancing the effectiveness of the data. Common statistical methods currently used include correlation functions [37], linear path functions [38,39], and two-point spatial correlations [40,41]. Among these, two-point spatial correlations have been proven to be more rigorous and comprehensive as they can effectively capture the microstructural features of materials. As higher-order statistical datasets often contain significant redundancy and noise, dimensionality reduction techniques such as principal component analysis (PCA) [42,43,44] are widely applied to compress the data, improving its interpretability and predictive accuracy. 
Ultimately, through machine learning algorithms, such as support vector regression (SVR) [45,46] and convolutional neural network (CNN) [47,48], researchers can establish a mapping relationship between input data and output properties, thereby achieving accurate predictions of material properties. This data-driven integrated method not only significantly improves the efficiency and accuracy of property prediction but also effectively combines traditional research methods such as experiments, theoretical modeling, and numerical simulations, providing theoretical support and technical guarantees for the development of new materials.

In this study, an innovative method is proposed within the MI framework, integrating the FEM, graph Fourier transform, PCA, and machine learning to efficiently establish SP linkages between the microstructure and thermodynamic properties of CPRMMCs. First, the FEM is used to generate CPRMMC microstructures with different particle volume fractions and particle sizes, and their thermal conductivity, elastic modulus, and coefficient of thermal expansion are calculated. Then, the microstructural features are statistically represented through the graph Fourier transform based on two-point spatial correlations, and PCA is applied to reduce the dimensionality of the resulting statistics. Furthermore, a combination of various machine learning algorithms, including radial basis function kernel support vector regression (RBF-SVR), polynomial kernel support vector regression (Poly-SVR), random forest (RF), XGBoost, and CatBoost, is used to capture the complex nonlinear relationships between microstructural features and thermodynamic properties. The effects of different truncation levels and the number of microstructure samples on the prediction results are also systematically investigated.

2. Microstructure Model and Dataset

To establish the SP linkages between the microstructure and thermodynamic properties of CPRMMCs, this study is divided into the following five steps: the generation of stochastic microstructures, the evaluation of thermodynamic properties, the statistical representation of the microstructure, dimensionality reduction of the statistics, and the extraction and validation of SP linkages. The workflow is shown in Figure 1. This section primarily introduces the methods used for generating CPRMMCs microstructures and determining thermodynamic properties.

2.1. Generation of Stochastic Microstructures

To explore and evaluate the capability of machine learning models in predicting the thermodynamic properties of CPRMMCs, it is first necessary to construct a dataset that serves as the ground truth. Due to the lack of suitable experimental datasets, this study uses FEM simulations to generate the required dataset in a short time. The design and manufacturing process of CPRMMCs involves multiple parameters, among which the volume fraction and particle size of the reinforcing particles have a significant impact on the material’s thermodynamic properties [9,49]. Therefore, this study selects combinations of different ceramic particle volume fractions and particle sizes to generate the dataset. Although only two parameters are used here to demonstrate the feasibility of establishing SP linkages through data science methods, the method is general and scalable, so additional influencing factors can be included to improve the model’s prediction accuracy and broaden its applicability.

In microstructure modeling, the size of the representative volume element (RVE) is crucial. The RVE size must be large enough to ensure that key microstructural features are included, but it should also be as small as possible to avoid excessive computational costs due to increased size [50]. Based on this, the study selects 700 μm as the RVE size, which has been shown to effectively predict the thermodynamic properties of CPRMMCs while ensuring computational efficiency [51]. For a particle size of 140 μm, particle volume fractions of 10%, 20%, and 30% are set. Meanwhile, considering the limited enhancement of thermodynamic properties at low particle volume fractions and the restrictions on thermodynamic properties at high volume fractions, the particle size is set to 60, 100, 140, and 180 μm at a 20% particle volume fraction. In summary, this study investigates six parameter combinations, with each combination generating 100 distinct particle distribution states for the RVE, resulting in a total of 600 samples.
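The sampling plan described above can be written out as a small table of parameter combinations. The sketch below is only illustrative: it assumes that "particle size" means particle diameter, that the RVE is a two-dimensional square, and that the interface layer can be ignored when estimating how many particles a given volume fraction implies. The function name `approx_particle_count` is hypothetical.

```python
import math

RVE_SIZE_UM = 700.0            # RVE edge length stated in the text
SAMPLES_PER_COMBINATION = 100  # random particle arrangements per combination

# (particle diameter in um, particle volume fraction)
combinations = [(140, vf) for vf in (0.10, 0.20, 0.30)]
# 140 um at 20% is already in the list, so only three new sizes are added.
combinations += [(d, 0.20) for d in (60, 100, 180)]


def approx_particle_count(diameter_um, volume_fraction, rve_size_um=RVE_SIZE_UM):
    """Rough particle count for a 2D square RVE with circular particles.

    Assumption: the interface layer is ignored and the count is rounded.
    """
    particle_area = math.pi * (diameter_um / 2.0) ** 2
    return round(volume_fraction * rve_size_um ** 2 / particle_area)


for diameter, vf in combinations:
    print(diameter, vf, approx_particle_count(diameter, vf))

total_samples = len(combinations) * SAMPLES_PER_COMBINATION
```

Counting the shared case (140 μm at 20%) only once is what gives six combinations and 600 samples rather than seven and 700.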

The FEM is used in this study to create the geometric model of the RVE. The microstructure of the RVE consists of ceramic particles, the matrix, and the interface between the ceramic particles and the matrix, with the interface thickness set to 5 μm. To simplify the computational process, the ceramic particles in the model are assumed to be spherical and are randomly distributed within the RVE. This simplification effectively approximates the morphology and distribution characteristics of the particles while reducing computational complexity. Figure 2 shows six typical RVEs of CPRMMCs under different particle volume fractions and particle sizes, where red represents ceramic particles, blue represents the interface, and white represents the matrix.
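The text does not state how the random particle positions are drawn. One common choice is random sequential placement with an overlap check, sketched below under several assumptions: a two-dimensional periodic RVE, circular particles, an interface modeled as a 5 μm shell around each particle, and a hypothetical labeling of 0 for matrix, 1 for interface, and 2 for particle. All function names are illustrative and are not taken from the study.

```python
import numpy as np


def periodic_distance(p, q, rve_size):
    """Center-to-center distance using the minimum-image convention."""
    delta = np.abs(p - q)
    delta = np.minimum(delta, rve_size - delta)
    return float(np.hypot(delta[0], delta[1]))


def place_particles(n_particles, radius, rve_size, interface, rng, max_tries=100_000):
    """Draw random centers one at a time, rejecting any that would overlap."""
    centers = []
    min_gap = 2.0 * (radius + interface)  # interface shells must not overlap
    tries = 0
    while len(centers) < n_particles:
        tries += 1
        if tries > max_tries:
            raise RuntimeError("placement failed; try a lower volume fraction")
        candidate = rng.uniform(0.0, rve_size, size=2)
        if all(periodic_distance(candidate, c, rve_size) >= min_gap for c in centers):
            centers.append(candidate)
    return np.array(centers)


def rasterize(centers, radius, rve_size, interface, n_pixels):
    """Label image: 0 = matrix, 1 = interface shell, 2 = particle (assumed labels)."""
    coords = (np.arange(n_pixels) + 0.5) * rve_size / n_pixels  # pixel centers
    x, y = np.meshgrid(coords, coords)
    labels = np.zeros((n_pixels, n_pixels), dtype=np.uint8)
    for cx, cy in centers:
        dx = np.abs(x - cx)
        dx = np.minimum(dx, rve_size - dx)
        dy = np.abs(y - cy)
        dy = np.minimum(dy, rve_size - dy)
        dist = np.hypot(dx, dy)
        labels[(dist <= radius + interface) & (labels == 0)] = 1
        labels[dist <= radius] = 2
    return labels


rng = np.random.default_rng(0)
centers = place_particles(6, radius=70.0, rve_size=700.0, interface=5.0, rng=rng)
labels = rasterize(centers, radius=70.0, rve_size=700.0, interface=5.0, n_pixels=350)
particle_fraction = float((labels == 2).mean())
```

Six particles of 140 μm diameter in a 700 μm square give a particle area fraction a little below 20%, so matching a target fraction exactly would need either a non-integer equivalent or a slightly adjusted radius. The text does not say how the study handles this.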

2.2. Model Solution Method

(1): Periodic boundary conditions (PBCs)

PBCs require that the physical quantities (such as stress or temperature) at one boundary of the simulation domain are consistent with those at the opposite boundary, thereby simulating an infinite system with periodic characteristics. Specifically, consider an RVE with boundaries AB, BC, AD, and DC, where the length and width are denoted as L and W, respectively, as shown in Figure 3. The mathematical form of the PBC can be expressed as follows [52]:

\begin{array}{l} u_{B C} - u_{B} = u_{A D} - u_{A} \\ u_{A B} - u_{A} = u_{D C} - u_{D} \end{array}

(1)

where u_AB, u_BC, u_AD, and u_DC represent the displacement vectors of arbitrary material points on the corresponding boundaries, and u_A, u_B, u_C, and u_D represent the displacement vectors at each vertex. Strain is generated by applying horizontal displacement at vertex C to calculate the elastic modulus of CPRMMCs.
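As a quick numerical check of Equation (1), the sketch below builds a displacement field made of a uniform strain plus a fluctuation that is periodic over the RVE, then evaluates both constraints on sampled edge points. The corner layout (A at the origin, B at (L, 0), C at (L, W), D at (0, W)) is an assumption, since Figure 3 is not reproduced here, and the strain values are hypothetical.

```python
import numpy as np

L, W = 700.0, 700.0  # RVE length and width in um
# Assumed corner layout: A = (0, 0), B = (L, 0), C = (L, W), D = (0, W)
strain = np.array([[0.010, 0.002],
                   [0.002, -0.003]])  # hypothetical macroscopic strain


def displacement(points):
    """Affine displacement plus a fluctuation that is periodic in x and y."""
    pts = np.asarray(points, dtype=float)
    affine = pts @ strain.T
    fluctuation = 0.5 * np.column_stack([
        np.sin(2 * np.pi * pts[:, 0] / L) * np.cos(2 * np.pi * pts[:, 1] / W),
        np.sin(2 * np.pi * pts[:, 1] / W),
    ])
    return affine + fluctuation


t = np.linspace(0.0, 1.0, 11)
u_A, u_B, u_D = displacement([[0, 0], [L, 0], [0, W]])
u_AD = displacement(np.column_stack([np.zeros_like(t), t * W]))    # left edge
u_BC = displacement(np.column_stack([np.full_like(t, L), t * W]))  # right edge
u_AB = displacement(np.column_stack([t * L, np.zeros_like(t)]))    # bottom edge
u_DC = displacement(np.column_stack([t * L, np.full_like(t, W)]))  # top edge

# Both residuals of Equation (1) should vanish up to round-off.
residual_1 = (u_BC - u_B) - (u_AD - u_A)
residual_2 = (u_AB - u_A) - (u_DC - u_D)
```

The affine part cancels because opposite edges differ by a constant offset, and the fluctuation cancels because it repeats with the RVE period. In an FEM code the same relations are typically imposed as constraint equations between paired boundary nodes.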

(2): Thermal conductivity

In thermal conductivity analysis, the heat conduction equation is one of the fundamental equations used for understanding and describing how heat propagates through a material. The distribution of heat flux q typically follows Fourier’s law, which states that heat flux is a function of the temperature gradient and can be expressed as follows:

q = - λ \nabla T

(2)

where λ represents the thermal conductivity coefficient, a constant that measures the material’s ability to conduct heat, and ∇T is the temperature gradient, indicating the direction of heat flow from higher to lower temperature regions. In this study, the steady-state heat flux of the RVE under different temperature conditions is calculated using the FEM, and then the thermal conductivity is determined using Equation (2).
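For a one-dimensional steady-state setup, Equation (2) can be rearranged to recover an effective conductivity from the averaged heat flux and the imposed temperature difference. The sketch below assumes temperatures prescribed on two opposite faces of the RVE and an area-averaged flux taken from the FEM output. The numerical values and the function name are hypothetical.

```python
def effective_conductivity(mean_heat_flux, t_hot, t_cold, length):
    """Effective conductivity along one axis from q = -lambda * dT/dx.

    mean_heat_flux: area-averaged flux in the +x direction, W/m^2
    t_hot, t_cold:  temperatures on the faces x = 0 and x = length, K
    length:         RVE size along x, m
    """
    gradient = (t_cold - t_hot) / length  # dT/dx, negative when heat flows in +x
    return -mean_heat_flux / gradient


# Hypothetical values: 10 K across a 700 um RVE carrying 2e5 W/m^2.
lam = effective_conductivity(mean_heat_flux=2.0e5, t_hot=310.0, t_cold=300.0, length=700e-6)
```

With these inputs the result is about 14 W/(m·K). The number only illustrates the arithmetic and is not a value from the study.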

(3): Coefficient of thermal expansion

The coefficient of thermal expansion (CTE) of CPRMMCs is determined by studying the effect of temperature changes on the material’s strain. When using the FEM to calculate the CTE, the thermal expansion process needs to be modeled as the effect of temperature on the material’s properties. Specifically, by defining the relationship between the time step and temperature, the heating process is simulated to obtain the material’s thermal strain–temperature curve, as shown in Figure 4. In this figure, CTE_sec represents the secant coefficient of thermal expansion, CTE_tan represents the tangent coefficient of thermal expansion, and T denotes any given temperature. Typically, the heating process at 1000 °C is divided into 100 equidistant incremental steps for computation. This method can accurately reflect the impact of temperature changes on the thermal expansion characteristics of the composite material and provides effective data support for material design.

In the actual calculation, the following steps are used: (1) define the relationship between temperature and elastic modulus; (2) apply temperature loading from the reference temperature to the maximum temperature; (3) compute the thermal deformation in the RVE; (4) calculate the effective thermal strain; (5) calculate the CTE_sec at different temperatures.
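Steps (4) and (5) can be illustrated with a short post-processing sketch. It assumes the usual definitions: the secant CTE is the thermal strain divided by the temperature rise from the reference temperature, and the tangent CTE is the local slope of the strain–temperature curve. The reference temperature of 20 °C and the quadratic strain curve are hypothetical placeholders for FEM output.

```python
import numpy as np


def secant_cte(temperature, thermal_strain, t_ref):
    """CTE_sec(T) = eps_th(T) / (T - T_ref); undefined (NaN) at T = T_ref."""
    T = np.asarray(temperature, dtype=float)
    eps = np.asarray(thermal_strain, dtype=float)
    out = np.full_like(T, np.nan)
    mask = T != t_ref
    out[mask] = eps[mask] / (T[mask] - t_ref)
    return out


def tangent_cte(temperature, thermal_strain):
    """CTE_tan(T) = d(eps_th)/dT, estimated by finite differences."""
    return np.gradient(thermal_strain, temperature)


# Hypothetical strain curve standing in for the FEM result.
t_ref = 20.0
T = np.linspace(t_ref, 1000.0, 101)  # 100 equal temperature increments
a, b = 6.0e-6, 2.0e-9
eps = a * (T - t_ref) + b * (T - t_ref) ** 2

cte_sec = secant_cte(T, eps, t_ref)
cte_tan = tangent_cte(T, eps)
```

For a curve that bends upward, the tangent value exceeds the secant value at the same temperature, which is why the two quantities are distinguished in Figure 4.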

2.3. Evaluation of Thermodynamic Properties

In this study, UO₂ material is used as the ceramic particle, and Zr alloy is used as the metal matrix. The material properties of the ceramic particles, metal matrix, and interface are shown in Table 1.

When using the FEM for simulation analysis, the model is meshed using triangles. A mesh sensitivity analysis was conducted to ensure that the chosen mesh density provides accurate results without unnecessary computational costs. The mesh size was gradually refined, and the solution was found to converge when further refinement led to negligible changes in the results. To balance accuracy and computational efficiency, the phases in the model used for calculating the elastic modulus and thermal conductivity are meshed into approximately 15,000 elements and 10,000 nodes, while the phases in the model used for calculating the CTE are meshed into approximately 150,000 elements and 80,000 nodes. Additionally, CPRMMCs can be considered as a periodic array of RVEs, so PBC must be applied to the RVE. This means that each RVE in CPRMMCs exhibits the same deformation pattern, and there is no separation or overlap between adjacent RVEs.

To verify the reliability of the finite element simulation, the simulation results for thermal conductivity at different particle volume fractions are compared with the experimental results, as shown in Figure 5. From the figure, it can be seen that the experimental and simulation results are in good agreement. This indicates that the established micro-mechanical model is reliable and can be used for the subsequent establishment of SP linkages.

Figure 6 shows the statistical analysis of thermal conductivity, the elastic modulus, and the CTE for the 600 RVEs. As the particle size increases, thermal conductivity and the CTE first increase and then decrease, reaching a maximum at a particle size of 140 μm, whereas the elastic modulus first decreases and then increases, reaching a minimum at 140 μm. As the particle volume fraction increases, thermal conductivity and the CTE decrease, while the elastic modulus increases. This indicates that both the particle size and the particle volume fraction have a significant impact on the thermodynamic properties of CPRMMCs.

3. Microstructure Dimensionality Reduction and Machine Learning Methods

3.1. Statistical Representation of Microstructure

The concept of two-point spatial correlations involves treating the microstructure image as a matrix containing positional and state information. By calculating the correlation between the state of each position in the matrix and other positions, the correlation characteristics of the entire microstructure image can be statistically analyzed. To facilitate this calculation, it is necessary to discretize the material’s microstructure, which involves dividing the continuous microstructure into distinct regions, each characterized by a uniform local state. The discretized microstructure is then represented using a functional expression. This discretized microstructure includes the spatial positions (denoted by vector s) and the local states (denoted by h) across the entire spatial domain. The two-point spatial correlations can be expressed as a function [55]:

f_{r}^{h h^{'}} = \frac{1}{S_{t}} \sum_{s = 1}^{S_{t}} m_{s}^{h} m_{s + r}^{h^{'}}

(3)

where s indexes a spatial point within the microstructure domain and S_t is the total number of such points. In this study, the spatial points are the pixels of the microstructure image, and s is the location of each pixel. m_{s}^{h} represents the probability density of finding a local state h at spatial position s [56]. That is, if the local state at position s is h, the probability density is 1, and if the local state at s is any other state, the probability density is 0. Based on this, two-point spatial correlations can be expressed as follows: a vector r connects two spatial points, s and s + r, and the probability density of finding structural states h and h′ at these two points is computed. As shown in Figure 7, by varying the vector r, this calculation is extended across the entire spatial domain and summed [57]. The expression f_{r}^{h h^{'}} in Equation (3) represents the cross-correlation of h and h′, which measures the correlation between different local states, while f_{r}^{h h} represents the auto-correlation of state h. When S_t is large, evaluating Equation (3) directly for every vector r becomes computationally expensive. Previous studies have shown that the Fast Fourier Transform (FFT) can reduce the computational complexity from O(N²) to O(NlogN), making the computation efficient for large-scale data [58,59,60]. To speed up the computation, this study employs FFT to calculate Equation (3).
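A minimal sketch of the FFT route to Equation (3) is given below. It assumes periodic wrap-around of the image, which is what the discrete Fourier transform implies, and it compares one entry against a brute-force evaluation. The small random three-state image and the function names are hypothetical.

```python
import numpy as np


def two_point_correlation(labels, h, h_prime):
    """Periodic two-point statistics f_r^{hh'} of Equation (3), computed by FFT."""
    m_h = (labels == h).astype(float)         # indicator of state h
    m_hp = (labels == h_prime).astype(float)  # indicator of state h'
    spectrum = np.conj(np.fft.fft2(m_h)) * np.fft.fft2(m_hp)
    stats = np.fft.ifft2(spectrum).real / labels.size
    return np.fft.fftshift(stats)  # moves r = 0 to index (n // 2, n // 2)


def direct_value(labels, h, h_prime, r):
    """Brute-force evaluation of Equation (3) for one shift r = (ry, rx)."""
    m_h = (labels == h).astype(float)
    m_hp = (labels == h_prime).astype(float)
    # np.roll with a negative shift gives shifted[s] = m_hp[s + r] (periodic).
    shifted = np.roll(m_hp, shift=(-r[0], -r[1]), axis=(0, 1))
    return float(np.mean(m_h * shifted))


rng = np.random.default_rng(0)
labels = rng.integers(0, 3, size=(32, 32))  # hypothetical three-state image
f22 = two_point_correlation(labels, 2, 2)
f02 = two_point_correlation(labels, 0, 2)
center = (16, 16)
```

At r = 0 the auto-correlation equals the volume fraction of the state and the cross-correlation is zero, because a pixel cannot hold two states at once. Both properties make convenient sanity checks for any implementation.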

In this study, two-point correlation statistics are applied to the microstructure images of CPRMMCs. Each image is discretized into 692 × 692 pixels, resulting in a total of S_t = 478,864 spatial points. The matrix, interface, and ceramic particles are labeled as 0, 1, and 2, respectively, meaning each pixel belongs to one of three local states, i.e., h ∈ {0,1,2}. Accordingly, there are nine possible types of spatial correlations, denoted as f_{r}^{h h^{'}}, including three auto-correlations (where h = h′) and six cross-correlations (where h ≠ h′). Since the matrices calculated for f_{r}^{01} and f_{r}^{10}, f_{r}^{02} and f_{r}^{20}, and f_{r}^{12} and f_{r}^{21} are the same, and the volume fraction of the interface in the microstructure images is negligible, this study focuses only on three representative types of correlation: matrix auto-correlation f_{r}^{00}, particle auto-correlation f_{r}^{22}, and matrix–particle cross-correlation f_{r}^{02}.

The microstructure used in the finite element model for thermodynamic property calculations of CPRMMCs is shown in Figure 8a, with an image resolution of 692 × 692 pixels. Figure 8b–d display the three-dimensional mappings of the matrix auto-correlation, particle auto-correlation, and matrix–particle cross-correlation, respectively. It is important to note that the x and y axes represent the spatial locations of the two-point correlation statistics, while the z axis denotes the magnitude of the statistical values.

As observed in Figure 8b,c, the center of each auto-correlation plot (where the vector r = 0) corresponds to the volume fraction of the matrix and particles, respectively, and the amplitude of the surrounding fluctuations is approximately equal to the square of the respective volume fractions. In contrast, the center value of the cross-correlation plot (Figure 8d) is zero, as it is physically impossible for both the matrix and particle phases to coexist at the same spatial location. The amplitude of the surrounding fluctuations in the cross-correlation plot is approximately equal to the product of the matrix and particle volume fractions. Previous studies have shown that the dominant microstructural features are primarily concentrated within the region corresponding to small vectors r in the statistical data. Therefore, to simplify subsequent analyses, only a subset of the statistical data is retained. The retained regions, highlighted by red dashed boxes in Figure 8b–d, correspond to the two-dimensional slices shown in Figure 8e–g. Specifically, 96 rows or columns of statistical values are removed from each edge in both the x and y directions, so that each map shrinks from 692 to 692 − 2 × 96 = 500 pixels per side. As a result, the total data size is reduced from 3 × 692² = 1,436,592 to 3 × 500² = 750,000, which significantly enhances computational efficiency in the following steps. The factor of 3 accounts for the three types of correlation used to represent the microstructure: matrix auto-correlation f_{r}^{00}, particle auto-correlation f_{r}^{22}, and matrix–particle cross-correlation f_{r}^{02}. The influence of different truncation levels will be further discussed in a later section.
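The truncation amounts to cropping the central window of each correlation map. The sketch below assumes the maps are stored with r = 0 at the center (index 346 of 692) and that the margin of 96 is removed from every edge, which gives the 500 × 500 window mentioned in the text. The function name is hypothetical.

```python
import numpy as np

N_FULL = 692  # pixels per side of the full correlation map
MARGIN = 96   # rows/columns removed from each edge


def truncate_center(stats, margin):
    """Keep the central window, i.e. the small-|r| part of a centered map."""
    return stats[margin:-margin, margin:-margin]


full = np.zeros((N_FULL, N_FULL))
full[N_FULL // 2, N_FULL // 2] = 1.0  # marker at r = 0 in the centered layout
cropped = truncate_center(full, MARGIN)

# Three maps (f00, f22, f02) flattened and concatenated into one feature row.
feature_vector = np.concatenate([cropped.ravel()] * 3)
```

After cropping, the r = 0 entry sits at index (250, 250), so the window stays centered on the zero vector.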

Furthermore, Figure 9 illustrates the two-point statistics of particle auto-correlation and matrix–particle cross-correlation at a constant particle volume fraction of 20%, with particle sizes of 60, 100, 140, and 180 μm. For the particle auto-correlation map at a particle size of 60 μm, a bright circular region appears at the center, while the surrounding area is filled with numerous small bright spots. As the particle size increases, both the diameter of the central bright circle and the size of the surrounding bright spots increase accordingly. A similar trend is observed in the matrix–particle cross-correlation maps. This is likely because, at a fixed volume fraction, smaller particles are more numerous than larger ones. It indicates that two-point spatial correlation statistics can effectively capture the characteristic features of the microstructure.

3.2. Dimensionality Reduction of Statistics

Although two-point spatial correlations can effectively capture key features within the microstructure, the challenge of high dimensionality remains. Even after the large-vector region is truncated, the representation of each microstructure remains excessively high-dimensional, making it difficult to establish effective SP linkages in practical applications [61]. Such a representation not only increases the computational cost significantly but may also reduce the model’s generalization capability due to data sparsity, thereby compromising prediction performance. PCA has been shown to provide a reliable and accurate low-dimensional representation of high-dimensional spatial correlations [62,63]. As a classical unsupervised learning technique, PCA reconstructs the feature space through an orthogonal linear projection and ranks the principal components (PCs) in descending order of explained variance, so that the first PC captures the largest share of the variance in the original data. Because the PCs are mutually orthogonal, the resulting features are linearly uncorrelated, which provides a sound mathematical basis for constructing efficient and stable SP prediction models.

In this study, the two-point spatial correlations statistics of microstructures are projected into the principal component space to achieve dimensionality reduction. The vectorized representation of the k-th microstructure in the PC space is given as follows [57]:

f_{r}^{(k)} = \sum_{i = 1}^{\min ((K - 1), R)} α_{i}^{(k)} φ_{i r} + {\bar{f}}_{r}

(4)

where K denotes the total number of microstructures, R represents the retained dimensionality of the two-point spatial correlation statistics, α_{i}^{(k)} is the weight (score) of the i-th principal component for the k-th microstructure, φ_{i r} refers to the i-th basis vector in the transformed space, and {\bar{f}}_{r} is the mean of the reduced dataset, i.e., the truncated statistics averaged over all K microstructures.

By performing singular value decomposition (SVD) on the original data matrix X, the following decomposition can be obtained:

X = U Σ V^{T}

(5)

where the basis vectors φ_{i r} correspond to the rows of V^T, while the PC weight coefficients α_{i}^{(k)} are taken from the corresponding entries of UΣ. The first principal component represents the direction of greatest variance in the data and is associated with the largest singular value in Σ (equivalently, the largest eigenvalue of the covariance matrix). By retaining a reduced number of principal components (i.e., R′ PCs), the dimensionality-reduced representation of the k-th microstructure can be approximated as follows:

f_{r}^{(k)} \approx \sum_{i = 1}^{R^{'}} α_{i}^{(k)} φ_{i r} + {\bar{f}}_{r}

(6)

It should be noted that the number of principal components to retain is typically determined by calculating the proportion of variance explained (PVE), which reflects the ratio of variance explained by each principal component to the total variance of the dataset. PVE helps assess the relative importance of each principal component and guides the selection of the number of components to retain based on the desired level of accuracy. Fewer retained components result in lower dimensionality, improving computational efficiency while preserving the essential information of the data.
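Equations (4)–(6) and the PVE can be sketched with a plain SVD. The example assumes that the data matrix holds one microstructure per row and that the mean over microstructures is subtracted before the decomposition, which is what allows the mean term to be added back in Equation (6). The synthetic data, the sizes K and R, and the function name are hypothetical.

```python
import numpy as np


def pca_svd(X, n_components):
    """PCA by SVD of the mean-centered data matrix (rows = microstructures)."""
    mean = X.mean(axis=0)                             # \bar{f}_r
    U, S, Vt = np.linalg.svd(X - mean, full_matrices=False)
    scores = U[:, :n_components] * S[:n_components]   # alpha_i^(k), from U @ Sigma
    basis = Vt[:n_components]                         # phi_{ir}, rows of V^T
    pve = S ** 2 / np.sum(S ** 2)                     # proportion of variance explained
    return scores, basis, mean, pve


# Synthetic stand-in for the statistics: three latent factors plus small noise.
rng = np.random.default_rng(1)
K, R = 60, 400
latent = rng.normal(size=(K, 3)) * np.array([5.0, 2.0, 1.0])
X = latent @ rng.normal(size=(3, R)) + 0.01 * rng.normal(size=(K, R))

scores, basis, mean, pve = pca_svd(X, n_components=3)
X_approx = scores @ basis + mean  # Equation (6) with R' = 3
cumulative_pve = np.cumsum(pve)
relative_error = np.linalg.norm(X - X_approx) / np.linalg.norm(X)
```

Centering removes one degree of freedom, so at most K − 1 singular values are nonzero. This matches the upper summation limit min(K − 1, R) in Equation (4).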

The high-dimensional feature vectors obtained from the two-point spatial correlation statistics of the microstructure images are reduced using PCA. The variance explained by each PC is illustrated in Figure 10. In this figure, the variance contributed by each individual PC (each corresponding to a specific microstructural feature) is represented by bars, while the cumulative variance from the first PC up to the selected number of PCs is shown as a curve. As observed in the figure, the cumulative PVE of the first three PCs exceeds 87%, meaning that these three PCs capture most of the microstructural information represented by the two-point statistics. With the addition of a fourth PC, the cumulative PVE surpasses 94%. The dimensionality of the dataset is thus reduced from 750,000 to just 4 while retaining more than 94% of the variance. These results demonstrate that PCA can compress the two-point spatial correlation features used for microstructure representation with little loss of information.

Figure 11 presents the first three basis vectors of the two-point spatial correlations statistics for matrix auto-correlation, particle auto-correlation, and matrix–particle cross-correlation. Each basis vector is represented as an image of 500 × 500 = 250,000 pixels. By examining the PC1 basis vector images (leftmost column), a bright circular region is observed at the center of each image. The intensity values at the centers are approximately equal to the volume fractions of the matrix and particles, while the value at the center of the bottom-left image is close to zero. These observations are consistent with the previously discussed characteristics of two-point spatial correlations statistics, indicating that PC1 effectively retains the dominant features of the original data. In contrast, the basis vector images for PC2 and PC3 (middle and right columns) reveal more complex spatial patterns, suggesting that these components capture higher-order structural variations within the microstructure.

PC score maps, obtained by aggregating the first two PC weights of all 600 RVEs, are shown in Figure 12a,d. As observed, samples with different particle volume fractions exhibit significant variation along the PC1 axis, while samples with different particle sizes show greater variation along the PC2 axis. This indicates that PC1 primarily captures structural variations associated with particle volume fraction, whereas PC2 reflects changes related to particle size.

To further explore the intrinsic relationships between the PCs and RVE structural parameters, scatter plots of PC1 and PC2 versus the particle volume fraction and particle size are provided in Figure 12b,c,e,f. A closer examination reveals that, as the particle volume fraction varies, the sample data points show a much clearer separation along the PC1 dimension than along PC2. Conversely, with changes in the particle size, the distribution of data points along PC2 is more distinct than that along PC1. These observations further support the findings in Figure 12a,d.

3.3. Machine Learning Methods

3.3.1. Machine Learning Models

Given the nonlinear relationship between the dimensionality-reduced microstructure features and the macroscopic properties, this study adopts five different models: XGBoost, CatBoost, RF, and SVR with polynomial and RBF kernels. Through comparative analysis, the best-performing model is identified.

(1): XGBoost

XGBoost [64] is based on the gradient boosting decision tree (GBDT) method. It fits the data by iteratively building decision trees, and in each iteration, the prediction is improved by minimizing the residuals. The goal of each new tree is to correct the errors made by the previous tree, thereby progressively enhancing the model’s prediction accuracy. The workflow of the XGBoost algorithm is illustrated in Figure 13a.
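The residual-fitting idea behind gradient boosting can be shown with a deliberately small sketch. It is plain gradient boosting with one-split regression stumps and squared-error loss, not XGBoost itself, which adds regularization, second-order gradient information, and many engineering optimizations. All names and the toy data are hypothetical.

```python
import numpy as np


def fit_stump(x, residual):
    """Best single-split regression stump on a 1D feature (squared error)."""
    best = None
    for threshold in np.unique(x)[:-1]:  # both sides stay non-empty
        left = residual[x <= threshold]
        right = residual[x > threshold]
        sse = np.sum((left - left.mean()) ** 2) + np.sum((right - right.mean()) ** 2)
        if best is None or sse < best[0]:
            best = (sse, threshold, left.mean(), right.mean())
    return best[1:]


def predict_stump(stump, x):
    threshold, left_value, right_value = stump
    return np.where(x <= threshold, left_value, right_value)


def boost(x, y, n_rounds=50, learning_rate=0.3):
    """Each round fits a stump to the current residuals and adds a damped copy."""
    base = y.mean()
    prediction = np.full_like(y, base, dtype=float)
    stumps = []
    for _ in range(n_rounds):
        residual = y - prediction  # negative gradient of the squared loss
        stump = fit_stump(x, residual)
        prediction += learning_rate * predict_stump(stump, x)
        stumps.append(stump)
    return base, stumps, prediction


x = np.linspace(0.0, 1.0, 40)
y = np.sin(2 * np.pi * x)  # toy target
base, stumps, fitted = boost(x, y)
mse_start = float(np.mean((y - y.mean()) ** 2))
mse_end = float(np.mean((y - fitted) ** 2))
```

The learning rate damps each correction, trading more rounds for a smoother fit. It is one of the hyperparameters that would normally be tuned.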

(2): CatBoost

Compared with XGBoost, CatBoost [65] is specifically optimized for handling categorical features. Unlike traditional methods that require converting categorical variables into one-hot encodings, CatBoost processes categorical data directly using an ordered boosting technique, which reduces information loss and enhances both the efficiency and accuracy of the model. The workflow of the CatBoost algorithm is illustrated in Figure 13b.

(3): Random forest (RF)

RF [66] is an ensemble learning method that improves predictive performance and model stability by training multiple independent decision trees and averaging their predictions. During training, each tree is constructed using bootstrap sampling and random feature selection, which increases diversity among the trees and reduces the risk of overfitting. The workflow of the RF algorithm is illustrated in Figure 14.

(4): Support vector regression (SVR)

The basic principle of SVR is to employ a kernel function to map nonlinear data from a low-dimensional space to a high-dimensional space, where the relationship can be approximated by a linear function. An optimal hyperplane is then determined such that the deviation of the farthest sample points from the hyperplane is minimized, as illustrated in Figure 15. The SVR problem can be described as follows:

f(x) = w^{T} x + b

(7)

\min_{w, b, ξ_{i}, ξ_{i}^{*}} \frac{1}{2} {‖w‖}^{2} + C \sum_{i = 1}^{m} (ξ_{i} + ξ_{i}^{*})

(8)

subject to y_{i} - f(x_{i}) \leq ε + ξ_{i}, ξ_{i} \geq 0

(9)

f(x_{i}) - y_{i} \leq ε + ξ_{i}^{*}, ξ_{i}^{*} \geq 0

(10)

where w denotes the weight vector, b is the bias term, ξ_{i} and ξ_{i}^{*} are slack variables, y_{i} represents the target value of the i-th sample, m is the number of training samples, ε is the insensitive loss parameter, and C is the penalty coefficient. The choice of kernel function significantly affects the fitting capability and computational efficiency of the SVR model. In this study, both the polynomial (Poly) kernel and the radial basis function (RBF) kernel are employed.
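To make Equations (7)–(10) concrete, the following sketch evaluates the objective for a fixed linear model. It assumes that, for given w and b, each slack variable takes its smallest feasible value, i.e., the amount by which a point lies outside the ε-tube. The function name `svr_primal_objective` and the numbers are hypothetical and chosen only for illustration.

```python
def svr_primal_objective(w, b, X, y, C, eps):
    """Evaluate Equation (8) for a linear f(x) = w.x + b.

    For fixed w and b, the smallest slacks satisfying (9) and (10) are
      xi_i      = max(0, y_i - f(x_i) - eps)   (target above the tube)
      xi_star_i = max(0, f(x_i) - y_i - eps)   (target below the tube)
    """
    xi, xi_star = [], []
    for x_i, y_i in zip(X, y):
        f_i = sum(w_j * x_j for w_j, x_j in zip(w, x_i)) + b
        xi.append(max(0.0, y_i - f_i - eps))
        xi_star.append(max(0.0, f_i - y_i - eps))
    norm_sq = sum(w_j * w_j for w_j in w)
    objective = 0.5 * norm_sq + C * sum(a + c for a, c in zip(xi, xi_star))
    return objective, xi, xi_star

# Four toy samples (assumption); only the last two lie outside the tube.
X = [[0.0], [1.0], [2.0], [3.0]]
y = [0.0, 1.0, 2.5, 2.0]
objective, xi, xi_star = svr_primal_objective(
    w=[1.0], b=0.0, X=X, y=y, C=10.0, eps=0.25
)
```

Points inside the tube contribute nothing to the penalty term, which is why ε controls how many samples influence the fit.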

3.3.2. Hyperparameter Optimization

The performance of machine learning models is highly dependent on the selection of hyperparameters. However, the hyperparameter space is complex, and the search process is extremely time-consuming, requiring substantial computational resources for model training. Moreover, due to the lack of a direct mathematical relationship between hyperparameters and model performance, traditional gradient-based optimization methods are often ineffective.

To address the aforementioned issues, this study introduces the Bayesian optimization (BO) algorithm [67]. By constructing a surrogate model of the objective function and using a small number of evaluated hyperparameter points for fitting and prediction, BO significantly reduces the number of training iterations. Compared to traditional methods such as random search [68], BO is more efficient in exploring the hyperparameter space and substantially shortens computation time. Furthermore, its inference mechanism, based on observed values, does not require explicit gradient information, making it effective in handling the non-differentiability of the objective function. The core idea of BO is to construct a probabilistic model of the objective function (typically a Gaussian Process model) to infer the shape of the objective function and then select the parameter points most likely to improve the objective function value, thus progressively optimizing the function. The workflow of BO is illustrated in Figure 16.
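A minimal sketch of the BO loop is given below, assuming a Gaussian process surrogate with a fixed RBF kernel and the expected improvement acquisition function evaluated on a grid. The library and acquisition function used in this study are not stated, so every name here (`gp_posterior`, `expected_improvement`, `bayes_opt`, `toy_cv_error`) is hypothetical, and the one-dimensional objective merely stands in for a cross-validation error.

```python
import math
import numpy as np

def rbf_kernel(a, b, length_scale=0.2):
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2)

def gp_posterior(x_obs, y_obs, x_new, noise=1e-6):
    """Posterior mean and standard deviation of a zero-mean GP."""
    k = rbf_kernel(x_obs, x_obs) + noise * np.eye(len(x_obs))
    k_s = rbf_kernel(x_obs, x_new)
    mu = k_s.T @ np.linalg.solve(k, y_obs)
    var = 1.0 - np.sum(k_s * np.linalg.solve(k, k_s), axis=0)
    return mu, np.sqrt(np.clip(var, 1e-12, None))

def expected_improvement(mu, sigma, best):
    """Expected improvement for a minimization problem."""
    z = (best - mu) / sigma
    cdf = 0.5 * (1.0 + np.vectorize(math.erf)(z / math.sqrt(2.0)))
    pdf = np.exp(-0.5 * z ** 2) / math.sqrt(2.0 * math.pi)
    return (best - mu) * cdf + sigma * pdf

def bayes_opt(objective, lo, hi, n_init=3, n_iter=10, seed=0):
    rng = np.random.default_rng(seed)
    x_obs = rng.uniform(lo, hi, n_init)          # initial random evaluations
    y_obs = np.array([objective(x) for x in x_obs])
    grid = np.linspace(lo, hi, 201)              # candidate hyperparameter values
    for _ in range(n_iter):
        mu, sigma = gp_posterior(x_obs, y_obs, grid)
        ei = expected_improvement(mu, sigma, y_obs.min())
        x_next = grid[np.argmax(ei)]             # most promising candidate
        x_obs = np.append(x_obs, x_next)
        y_obs = np.append(y_obs, objective(x_next))
    return x_obs, y_obs

def toy_cv_error(x):
    # Stand-in for a cross-validation error as a function of one hyperparameter.
    return (x - 0.3) ** 2 + 0.1 * math.sin(15 * x)

x_hist, y_hist = bayes_opt(toy_cv_error, 0.0, 1.0)
best_x, best_y = x_hist[np.argmin(y_hist)], y_hist.min()
```

Only 13 objective evaluations are made in total, which reflects the motivation given above: each evaluation corresponds to a costly model training run, so the surrogate is used to decide where to evaluate next.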

4. Results and Discussion

4.1. Extraction and Validation of SP Linkages

The previous sections introduced the methodology for constructing SP linkages. In this section, the effectiveness of the proposed method is validated using the dataset. Specifically, 80% of the dataset is randomly selected as the training set, while the remaining 20% is reserved as the independent test set. To ensure model robustness and generalization, 5-fold cross-validation is conducted within the training set. The cross-validation results are used to optimize the model’s hyperparameters and evaluate predictive performance prior to final testing. The hyperparameter space settings for the machine learning models are summarized in Table 2.
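The data partitioning described above can be sketched as follows. The example only generates index sets for an 80/20 split followed by 5-fold cross-validation within the training set; the function name `split_and_folds`, the seed, and the fold assignment scheme are assumptions, since the study does not specify them.

```python
import random

def split_and_folds(n_samples, test_fraction=0.2, n_folds=5, seed=0):
    """Random hold-out split, then k-fold splits inside the training set."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = int(round(n_samples * test_fraction))
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    # Deal the shuffled training indices into n_folds groups.
    folds = [train_idx[k::n_folds] for k in range(n_folds)]
    cv_splits = []
    for k in range(n_folds):
        val = folds[k]
        fit = [i for j, fold in enumerate(folds) if j != k for i in fold]
        cv_splits.append((fit, val))
    return train_idx, test_idx, cv_splits

# 600 RVEs, as in the dataset used in this study.
train_idx, test_idx, cv_splits = split_and_folds(600)
```

With 600 RVEs this yields 480 training and 120 test samples, and each cross-validation round fits on 384 samples and validates on 96. The test indices are never used during hyperparameter optimization.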

Figure 17a–c show the RMSE and PVE of different machine learning models in predicting the thermal conductivity, elastic modulus, and CTE of CPRMMCs under varying numbers of PCs. As the number of principal components increases up to five, the RMSE decreases significantly and the PVE increases sharply. This indicates that a small number of principal components already captures the dominant features of the data, and it confirms that PCA dimensionality reduction preserves enough useful information for accurate prediction. Beyond five principal components, the decrease in the RMSE becomes more gradual and the PVE plateaus, with only marginal improvements. This suggests that including more features does not significantly enhance the model’s predictive power and may instead introduce redundancy, increasing model complexity and raising the risk of overfitting. Among all evaluated models, the Poly-SVR model exhibits the best predictive performance for thermal conductivity, elastic modulus, and CTE when the number of principal components is 5, 6, and 5, respectively.

To further verify the accuracy of the constructed Poly-SVR model, Figure 17d–f compare the predicted results of the Poly-SVR model with those obtained from finite element simulations for thermal conductivity, elastic modulus, and CTE. The data points are observed to be evenly distributed around the X = Y diagonal, indicating a high level of agreement between the Poly-SVR predictions and FEM results. In addition, the R² values for both the training and test sets exceed 0.91, further confirming the accuracy and reliability of the Poly-SVR model in predicting thermodynamic properties.
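For reference, the two accuracy measures quoted above can be computed as in the sketch below, which assumes the standard definitions of RMSE and the coefficient of determination R². The values are made up for illustration and are not results from this study.

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error between reference and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical reference (FEM) values and model predictions.
fem = [10.0, 12.0, 14.0, 16.0]
pred = [10.5, 11.5, 14.5, 15.5]
error = rmse(fem, pred)
score = r_squared(fem, pred)
```

Points lying exactly on the X = Y diagonal would give an RMSE of 0 and an R² of 1.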

In the context of material property prediction, the Poly-SVR model demonstrates a significant advantage in computational efficiency while maintaining high predictive accuracy. Traditional finite element methods require approximately one hour to process a single sample, resulting in a total computation time of around 600 h for 600 samples. In contrast, the Poly-SVR model has almost negligible prediction time for each sample after training and can achieve fast and accurate predictions in an extended parameter space. This substantial time-saving advantage highlights the practical value of the Poly-SVR model for material property prediction, particularly in scenarios involving large-scale datasets, where it can greatly enhance computational efficiency.

4.2. Discussion

This section analyzes the influence of truncation level and RVE quantity on the predictive performance of the Poly-SVR model. Two sets of comparative conditions are examined: first, truncation levels of 0, 96, and 146 are tested while keeping the number of RVEs fixed at 600; second, the number of RVEs is reduced from 600 to 300 to further evaluate its effect on model performance.

Table 3 summarizes the Poly-SVR model’s prediction results under different truncation levels and RVE quantities. When the truncation level increases from 0 to 96, the test RMSE decreases markedly and the test R² increases (for thermal conductivity, from 1.681 to 1.323 and from 0.867 to 0.918, respectively), indicating improved predictive capability. This suggests that truncation removes part of the redundant information in the two-point statistics, thereby enhancing the distinctiveness of the data features. In contrast, when the truncation level is further increased from 96 to 146, the changes in the RMSE and R² become negligible, implying that the central peak of the two-point statistics carries most of the useful microstructure information. Furthermore, when the number of RVEs is reduced from 600 to 300, the model shows improved prediction performance for thermal conductivity (test R² rising from 0.918 to 0.935), negligible change for elastic modulus (0.930 to 0.929), and a decline in prediction accuracy for the CTE (0.939 to 0.905). This property-dependent response shows that model effectiveness depends on dataset size and that, provided sufficient data are available, an appropriate sample size plays a crucial role in determining the predictive performance of machine learning models. The improvement in thermal conductivity prediction with fewer RVEs may be attributed to the model’s ability to focus on more relevant features, reduce overfitting, and avoid noise or less informative data points.
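The data volumes listed in Table 3 are consistent with a simple relation between truncation level and feature count. The sketch below assumes that the truncation level is the number of pixels removed from each edge of a 692 × 692 correlation map and that three maps are concatenated per RVE. This is an inference from the tabulated numbers, not a definition stated in the text, and `data_volume` is a hypothetical helper.

```python
def data_volume(truncation, full_side=692, n_correlations=3):
    """Feature count after cropping `truncation` pixels from every edge."""
    side = full_side - 2 * truncation
    return n_correlations * side * side

volumes = {t: data_volume(t) for t in (0, 96, 146)}
```

Under this assumption, a truncation level of 96 leaves 500 × 500 pixels per map, matching the basis vector image size reported for Figure 11.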

This study proposes an MI-based method to explore the SP linkages of CPRMMCs and successfully carries out mapping between the microstructure and thermodynamic properties using efficient machine learning methods. Compared with traditional experimental and numerical simulation approaches, this method offers significant advantages in both data processing and model construction. The data-driven methodology is capable of capturing the complex nonlinear relationships between material microstructures and their properties while enabling high-efficiency performance prediction. This approach not only saves substantial time and computational costs but also avoids the complexity and uncertainty often encountered in conventional methods. By combining two-point spatial correlation-based graph Fourier transform with PCA, this study effectively extracts representative microstructural features from a high-dimensional feature space. This approach simplifies data processing while retaining the key information necessary for the accurate prediction of material properties. One of its major advantages lies in reducing the risk of model overfitting caused by excessive features while also improving the model’s generalization ability, which refers to its ability to maintain strong predictive performance when applied to new, unseen datasets.

The current approach, however, relies on traditional machine learning methods to analyze numerical simulation data for six parameter combinations. Future research could explore more advanced techniques [30], such as convolutional neural networks, graph neural networks, and symbolic regression, and extend the analysis to simulations or experiments involving a broader range of parameter combinations, including factors such as particle shapes, orientations, spatial distributions, and additional processing parameters. Additionally, future studies could incorporate supplementary data sources, such as electron microscopy [69] and X-ray tomography, to further enhance prediction accuracy and reliability.

5. Conclusions

This study proposes an efficient and accurate approach within the MI framework for establishing structure–property linkages between the microstructure and thermodynamic properties of CPRMMCs. It integrates the FEM, graph Fourier transform, PCA, and machine learning methods. By combining graph Fourier transform based on two-point spatial correlations with PCA, we successfully reduced the high-dimensional microstructure data (750,000 dimensions) to just five to six key PCs. This dimensionality reduction preserved the core features of the microstructure, such as particle volume fraction, particle size, and spatial distribution. Using these PCs, precise mapping between the microstructure and thermodynamic properties was carried out with a Bayesian-optimized Poly-SVR model. The results show that this model exhibits excellent prediction accuracy, achieving efficient predictions with only a few microstructure features, and all R² values exceed 0.91. This confirms its effectiveness and reliability in predicting material properties. Future research will focus on integrating multi-source data to provide more accurate support for materials design and optimization.

Author Contributions

Writing—original draft preparation, methodology, software, visualization, and investigation, R.X.; supervision, project administration, and funding acquisition, G.L.; writing—review and editing, conceptualization, methodology, and supervision, P.C.; writing—review and editing, formal analysis, investigation, and supervision, Z.T.; investigation and supervision, J.W. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Conflicts of Interest

Author Jianru Wang was employed by the Academy of Aerospace Solid Propulsion Technology. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

1. Miracle, D.B. Metal matrix composites–From science to technological significance. Compos. Sci. Technol. 2005, 65, 2526–2540.
2. Rajak, D.K.; Pagar, D.D.; Kumar, R.; Pruncu, C.I. Recent progress of reinforcement materials: A comprehensive overview of composite materials. J. Mater. Res. Technol. 2019, 8, 6354–6374.
3. Lin, Z.; Su, Y.; Qiu, C.; Yang, J.; Chai, X.; Liu, X.; Ouyang, Q.; Zhang, D. Configuration effect and mechanical behavior of particle reinforced aluminum matrix composites. Scr. Mater. 2023, 224, 115135.
4. Chen, X.-H.; Yan, H. Solid–liquid interface dynamics during solidification of Al 7075–Al₂O₃np based metal matrix composites. Mater. Des. 2016, 94, 148–158.
5. Wang, D.; Shanthraj, P.; Springer, H.; Raabe, D. Particle-induced damage in Fe–TiB₂ high stiffness metal matrix composite steels. Mater. Des. 2018, 160, 557–571.
6. Song, M.; Huang, B. Effects of particle size on the fracture toughness of SiCp/Al alloy metal matrix composites. Mater. Sci. Eng. A 2008, 488, 601–607.
7. Rabiei, A.; Vendra, L.; Kishi, T. Fracture behavior of particle reinforced metal matrix composites. Compos. Part A Appl. Sci. Manuf. 2008, 39, 294–300.
8. Liu, Q.; Qi, F.; Wang, Q.; Ding, H.; Chu, K.; Liu, Y.; Li, C. The influence of particles size and its distribution on the degree of stress concentration in particulate reinforced metal matrix composites. Mater. Sci. Eng. A 2018, 731, 351–359.
9. Jarzabek, D.M.; Chmielewski, M.; Dulnik, J.; Strojny-Nedza, A. The influence of the particle size on the adhesion between ceramic particles and metal matrix in mmc composites. J. Mater. Eng. Perform. 2016, 25, 3139–3145.
10. Singh, M.K.; Gautam, R.K. Structural, mechanical, and electrical behavior of ceramic-reinforced copper metal matrix hybrid composites. J. Mater. Eng. Perform. 2019, 28, 886–899.
11. Jarząbek, D.M. The impact of weak interfacial bonding strength on mechanical properties of metal matrix–Ceramic reinforced composites. Compos. Struct. 2018, 201, 352–362.
12. Yan, Y.-F.; Kou, S.-Q.; Yang, H.-Y.; Shu, S.-L.; Lu, J.-B. Effect mechanism of mono-particles or hybrid-particles on the thermophysical characteristics and mechanical properties of Cu matrix composites. Ceram. Int. 2022, 48, 23033–23043.
13. Commisso, M.S.; Le Bourlot, C.; Bonnet, F.; Zanelatto, O.; Maire, E. Thermo-mechanical characterization of steel-based metal matrix composite reinforced with TiB₂ particles using synchrotron X-ray diffraction. Materialia 2019, 6, 100311.
14. León-Patiño, C.A.; González-Esquivel, R.J.; Aguilar-Reyes, E.A. Thermophysical properties of Ni–20Cr metal matrix reinforced with TiC ceramic particles. MRS Adv. 2021, 6, 807–810.
15. Chen, S.; Hassanzadeh-Aghdam, M.K.; Ansari, R. An analytical model for elastic modulus calculation of SiC whisker-reinforced hybrid metal matrix nanocomposite containing SiC nanoparticles. J. Alloys Compd. 2018, 767, 632–641.
16. Hashin, Z.; Shtrikman, S. A variational approach to the theory of the elastic behaviour of multiphase materials. J. Mech. Phys. Solids 1963, 11, 127–140.
17. Brailsford, A.D.; Major, K.G. The thermal conductivity of aggregates of several phases, including porous materials. Br. J. Appl. Phys. 1964, 15, 313.
18. Khan, K.; Hajeri, F.; Khan, M. Analytical and numerical assessment of the effect of highly conductive inclusions distribution on the thermal conductivity of particulate composites. J. Compos. Mater. 2019, 53, 3499–3514.
19. Sideridis, E. The influence of particle distribution and interphase on the thermal expansion coefficient of particulate composites by the use of a new model. Compos. Interfaces 2016, 23, 231–254.
20. Madan, R.; Khobragade, P.; Mussada, E.K.; Singh, M.K.; Rangappa, S.M.; Njim, E.K.; Siengchin, S. A novel two-step finite element approach to estimate the thermo-mechanical properties of two-phase and three-phase hybrid composites. Compos. Commun. 2025, 53, 102213.
21. Ma, S.; Zhuang, X.; Wang, X. Particle distribution-dependent micromechanical simulation on mechanical properties and damage behaviors of particle reinforced metal matrix composites. J. Mater. Sci. 2021, 56, 6780–6798.
22. Zhang, J.; Ouyang, Q.; Guo, Q.; Li, Z.; Fan, G.; Su, Y.; Jiang, L.; Lavernia, E.J.; Schoenung, J.M.; Zhang, D. 3D Microstructure-based finite element modeling of deformation and fracture of SiCp/Al composites. Compos. Sci. Technol. 2016, 123, 1–9.
23. Haušild, P.; Kovářík, O.; Havlíková, K.; Thomasová, M. Young’s modulus of alumina particles reinforced metal-matrix composite. Defect Diffus. Forum 2016, 368, 174–177.
24. Hua, Y.; Gu, L. Prediction of the thermomechanical behavior of particle-reinforced metal matrix composites. Compos. Part B Eng. 2013, 45, 1464–1470.
25. Li, M.; Zhang, H.; Li, S.; Zhu, W.; Ke, Y. Machine learning and materials informatics approaches for predicting transverse mechanical properties of unidirectional CFRP composites with microvoids. Mater. Des. 2022, 224, 111340.
26. de Pablo, J.J.; Jackson, N.E.; Webb, M.A.; Chen, L.-Q.; Moore, J.E.; Morgan, D.; Jacobs, R.; Pollock, T.; Schlom, D.G.; Toberer, E.S.; et al. New frontiers for the materials genome initiative. npj Comput. Mater. 2019, 5, 41.
27. Lin, Z.; Su, Y.; Yang, J.; Qiu, C.; Chai, X.; Liu, X.; Ouyang, Q.; Zhang, D. Configuration feature extraction and mechanical properties prediction of particle reinforced metal matrix composites. Compos. Commun. 2023, 42, 101688.
28. Agrawal, A.; Choudhary, A. Perspective: Materials informatics and big data: Realization of the “fourth paradigm” of science in materials science. APL Mater. 2016, 4, 053208.
29. Chen, C.-T.; Gu, G.X. Generative deep neural networks for inverse materials design using backpropagation and active learning. Adv. Sci. 2020, 7, 1902607.
30. Ramprasad, R.; Batra, R.; Pilania, G.; Mannodi-Kanakkithodi, A.; Kim, C. Machine learning in materials informatics: Recent applications and prospects. npj Comput. Mater. 2017, 3, 54.
31. Mallela, U.K.; Upadhyay, A. Buckling load prediction of laminated composite stiffened panels subjected to in-plane shear using artificial neural networks. Thin-Walled Struct. 2016, 102, 158–164.
32. Barbosa, A.; Upadhyaya, P.; Iype, E. Neural network for mechanical property estimation of multilayered laminate composite. Mater. Today Proc. 2020, 28, 982–985.
33. Ye, S.; Li, B.; Li, Q.; Zhao, H.-P.; Feng, X.-Q. Deep neural network method for predicting the mechanical properties of composites. Appl. Phys. Lett. 2019, 115, 161901.
34. Yang, Z.; Yabansu, Y.C.; Jha, D.; Liao, W.-K.; Choudhary, A.N.; Kalidindi, S.R.; Agrawal, A. Establishing structure-property localization linkages for elastic deformation of three-dimensional high contrast composites using deep learning approaches. Acta Mater. 2019, 166, 335–345.
35. Liu, Z.; Wu, C.T. Exploring the 3D architectures of deep material network in data-driven multiscale mechanics. J. Mech. Phys. Solids 2019, 127, 20–46.
36. Olfatbakhsh, T.; Milani, A.S. A highly interpretable materials informatics approach for predicting microstructure-property relationship in fabric composites. Compos. Sci. Technol. 2022, 217, 109080.
37. Swaminathan, S.; Ghosh, S. Statistically equivalent representative volume elements for unidirectional composite microstructures: Part II-With interfacial debonding. J. Compos. Mater. 2006, 40, 605–621.
38. Feng, J.; Teng, Q.; He, X.; Wu, X. Accelerating multi-point statistics reconstruction method for porous media via deep learning. Acta Mater. 2018, 159, 296–308.
39. Qu, S.; Dai, Y.; Zhang, D.; Li, Q.; Chou, T.-W.; Lyu, W. Carbon nanotube film based multifunctional composite materials: An overview. Funct. Compos. Struct. 2020, 2, 022002.
40. Cecen, A.; Yabansu, Y.C.; Kalidindi, S.R. A new framework for rotationally invariant two-point spatial correlations in microstructure datasets. Acta Mater. 2018, 158, 53–64.
41. Cecen, A.; Fast, T.; Kalidindi, S.R. Versatile algorithms for the computation of 2-point spatial correlations in quantifying material structure. Integr. Mater. Manuf. Innov. 2016, 5, 1–15.
42. Hu, X.; Li, J.; Wang, Z.; Wang, J. A microstructure-informatic strategy for Vickers hardness forecast of austenitic steels from experimental data. Mater. Des. 2021, 201, 109497.
43. Jiang, M.; Hu, X.; Li, J.; Wang, Z.; Wang, J. An interface-oriented data-driven scheme applying into eutectic patterns evolution. Mater. Des. 2022, 223, 111222.
44. Zhao, Y.; Altschuh, P.; Santoki, J.; Griem, L.; Tosato, G.; Selzer, M.; Koeppe, A.; Nestler, B. Characterization of porous membranes using artificial neural networks. Acta Mater. 2023, 253, 118922.
45. Yi, Y.; Wang, L.; Chen, Z. Adaptive global kernel interval SVR-based machine learning for accelerated dielectric constant prediction of polymer-based dielectric energy storage. Renew. Energy 2021, 176, 81–88.
46. Qu, D.; Zheng, W.; Wang, B.; Wu, B.; Cao, H.; Yi, H. Nondestructive acquisition of the micro-mechanical properties of high-speed-dry milled micro-thin walled structures based on surface traits. Chin. J. Aeronaut. 2021, 34, 438–451.
47. Mann, A.; Kalidindi, S.R. Development of a robust CNN model for capturing microstructure-property linkages and building property closures supporting material design. Front. Mater. 2022, 9, 851085.
48. Yang, Z.; Yabansu, Y.C.; Al-Bahrani, R.; Liao, W.-k.; Choudhary, A.N.; Kalidindi, S.R.; Agrawal, A. Deep learning approaches for mining structure-property linkages in high contrast composites from simulation datasets. Comput. Mater. Sci. 2018, 151, 278–287.
49. Dong, X.; Shin, Y. Multi-scale modeling of thermal conductivity of SiC-reinforced aluminum metal matrix composite. J. Compos. Mater. 2017, 51, 3941–3953.
50. Kanit, T.; Forest, S.; Galliet, I.; Mounoury, V.; Jeulin, D. Determination of the size of the representative volume element for random composites: Statistical and numerical approach. Int. J. Solids Struct. 2003, 40, 3647–3679.
51. Luo, Y. An accuracy comparison of micromechanics models of particulate composites against microstructure-free finite element modeling. Materials 2022, 15, 4021.
52. van der Sluis, O.; Schreurs, P.J.G.; Brekelmans, W.A.M.; Meijer, H.E.H. Overall behaviour of heterogeneous elastoviscoplastic materials: Effect of microstructural modelling. Mech. Mater. 2000, 32, 449–462.
53. MacDonald, P.E.; Thompson, L.B. MATPRO: Version 09. A Handbook of Materials Properties for Use in the Analysis of Light Water Reactor Fuel Rod Behavior; U.S. Department of Energy Office of Scientific and Technical Information: Oak Ridge, TN, USA, 1976; p. 01.
54. Hagrman, D.L.; Reymann, G.A. MATPRO-Version 11: A Handbook of Materials Properties for Use in the Analysis of Light Water Reactor Fuel Rod Behavior; U.S. Department of Energy Office of Scientific and Technical Information: Oak Ridge, TN, USA, 1979; p. 01.
55. Kalidindi, S.R. Hierarchical Materials Informatics: Novel Analytics for Materials Data; Butterworth-Heinemann: Oxford, UK, 2015; pp. 1–219.
56. Steinmetz, P.; Yabansu, Y.C.; Hötzer, J.; Jainta, M.; Nestler, B.; Kalidindi, S.R. Analytics for microstructure datasets produced by phase-field simulations. Acta Mater. 2016, 103, 192–203.
57. Fan, Y.S.; Yang, X.G.; Shi, D.Q.; Han, S.W.; Li, S.L. A quantitative role of rafting on low cycle fatigue behaviour of a directionally solidified Ni-based superalloy through a cross-correlated image processing method. Int. J. Fatigue 2020, 131, 105305.
58. Niezgoda, S.R.; Yabansu, Y.C.; Kalidindi, S.R. Understanding and visualizing microstructure and microstructure variance as a stochastic process. Acta Mater. 2011, 59, 6387–6400.
59. Choudhury, A.; Yabansu, Y.C.; Kalidindi, S.R.; Dennstedt, A. Quantification and classification of microstructures in ternary eutectic alloys using 2-point spatial correlations and principal component analyses. Acta Mater. 2016, 110, 131–141.
60. Altschuh, P.; Yabansu, Y.C.; Hötzer, J.; Selzer, M.; Nestler, B.; Kalidindi, S.R. Data science approaches for microstructure quantification and feature identification in porous membranes. J. Membr. Sci. 2017, 540, 88–97.
61. Yabansu, Y.C.; Steinmetz, P.; Hötzer, J.; Kalidindi, S.R.; Nestler, B. Extraction of reduced-order process-structure linkages from phase-field simulations. Acta Mater. 2017, 124, 182–194.
62. Niezgoda, S.R.; Kanjarla, A.K.; Kalidindi, S.R. Novel microstructure quantification framework for databasing, visualization, and analysis of microstructure data. Integr. Mater. Manuf. Innov. 2013, 2, 54–80.
63. Hao, W.; Shi, D.; Liu, C.; Fan, Y.; Yang, X.; Tan, L.; Zhang, B. A novel microstructure-informed machine learning framework for mechanical property evaluation of SiCf/Ti composites. J. Mater. Res. Technol. 2024, 28, 420–433.
64. Liu, X.; Liu, T.; Feng, P. Long-term performance prediction framework based on XGBoost decision tree for pultruded FRP composites exposed to water, humidity and alkaline solution. Compos. Struct. 2022, 284, 115184.
65. Huang, X.; Liu, W.; Guo, Q.; Tan, J. Prediction method for the dynamic response of expressway lateritic soil subgrades on the basis of Bayesian optimization CatBoost. Soil Dyn. Earthq. Eng. 2024, 186, 108943.
66. Breiman, L. Random Forests. Mach. Learn. 2001, 45, 5–32.
67. Wu, J.; Chen, X.-Y.; Zhang, H.; Xiong, L.-D.; Lei, H.; Deng, S.-H. Hyperparameter optimization for machine learning models based on Bayesian optimization. J. Electron. Sci. Technol. 2019, 17, 26–40.
68. Bergstra, J.; Bengio, Y. Random search for hyper-parameter optimization. J. Mach. Learn. Res. 2012, 13, 281–305.
69. Escherová, J.; Krbata, M.; Kohutiar, M.; Barényi, I.; Chochlíková, H.; Eckert, M.; Jus, M.; Majerský, J.; Janík, R.; Dubcová, P. The Influence of Q & T Heat Treatment on the Change of Tribological Properties of Powder Tool Steels ASP2017, ASP2055 and Their Comparison with Steel X153CrMoV12. Materials 2024, 17, 974.

Figure 1. Workflow for establishing SP linkages between microstructure and thermodynamic properties of CPRMMCs.

Figure 2. Six typical RVEs of CPRMMCs with different particle volume fractions and particle sizes.

Figure 3. Schematic diagram of RVE with PBC.

Figure 4. Calculation method of CTE under different temperature conditions.

Figure 5. Comparison of simulation and experimental results for thermal conductivity at different particle volume fractions.

Figure 6. Statistical results of thermal conductivity, elastic modulus, and CTE under different particle sizes and volume fractions.

Figure 7. Illustration of discretized microstructure m_{s}^{h} [41].

Figure 8. Statistical representation of CPRMMC microstructure. (a) Microstructure image. (b–d) Three-dimensional representations of two-point statistics and truncation settings for (b) matrix auto-correlation, (c) particle auto-correlation, and (d) matrix–particle cross-correlation. (e–g) Two-dimensional representations of truncated two-point statistics for (e) matrix auto-correlation, (f) particle auto-correlation, and (g) matrix–particle cross-correlation.

Figure 9. Four typical RVEs with particle sizes of 60, 100, 140, and 180 μm and their corresponding two-point statistics for particle auto-correlation and matrix–particle cross-correlation.

Figure 10. Scree plot of first five principal components.

Figure 11. The first three PC basis vectors of the two-point statistics for matrix auto-correlation, particle auto-correlation, and matrix–particle cross-correlation.

Figure 12. (a) Correlation plot of first two PCs (color corresponds to particle volume fractions). (b) Correlation plot of particle volume fraction and PC1. (c) Correlation plot of particle volume fraction and PC2. (d) Correlation plot of first two PCs (color corresponds to particle size). (e) Correlation plot of particle size and PC1. (f) Correlation plot of particle size and PC2.

Figure 13. Workflow diagrams of machine learning algorithms. (a) XGBoost. (b) CatBoost.

Figure 14. Workflow diagram of RF algorithm.

Figure 15. Schematic illustration of SVR principle.

Figure 16. Bayesian optimization flowchart.

Figure 17. RMSE of different machine learning models predicting (a) thermal conductivity, (b) elastic modulus, and (c) CTE under varying numbers of PCs; comparison between Poly-SVR model predictions and simulation results for (d) thermal conductivity, (e) elastic modulus, and (f) CTE.

Table 1. Material properties of UO₂, Zr, and interface [53,54].

Material	Density (kg/m³)	Thermal Conductivity (W/(m·K))	Coefficient of Thermal Expansion (°C⁻¹)	Elastic Modulus (MPa)	Poisson’s Ratio
UO₂	10,600	2.3026	1.54 × 10⁻⁵	168,137	0.316
Zr	6550	47.4328	8.95 × 10⁻³	15,150	0.34
Interface	9790	11.3289	1.80 × 10⁻³	137,540	0.3208

Table 2. Hyperparameter settings for machine learning models.

Machine Learning Model	Hyperparameter Settings
Poly-SVR	C ∈ [0.1, 100], epsilon ∈ [0.01, 0.5], degree ∈ [2, 5], coef0 ∈ [0, 1]
RBF-SVR	C ∈ [0.1, 100], epsilon ∈ [0.01, 0.5], gamma ∈ [0.01, 0.1, 1]
RF	n_estimators ∈ [10, 100], max_depth ∈ [3, 10], min_samples_split ∈ [2, 20], min_samples_leaf ∈ [1, 20]
XGBoost	n_estimators ∈ [10, 100], max_depth ∈ [3, 10], learning_rate ∈ [0.01, 0.5], subsample ∈ [0.6, 1], colsample_bytree ∈ [1, 20]
CatBoost	iterations ∈ [10, 100], depth ∈ [3, 10], learning_rate ∈ [0.01, 0.5], l2_leaf_reg ∈ [1, 10], subsample ∈ [0.6, 1], colsample_bylevel ∈ [0.6, 1]

Table 3. Prediction results of Poly-SVR model with different truncation levels and RVE quantities.

Property	Truncation	Data Volume	No. RVEs	Training RMSE	Test RMSE	Training R²	Test R²
Thermal conductivity	0	1,436,592	600	1.292	1.681	0.918	0.867
	96	750,000	600	1.247	1.323	0.923	0.918
	96	750,000	300	1.182	1.227	0.930	0.935
	146	480,000	600	1.231	1.379	0.925	0.911
Elastic modulus	0	1,436,592	600	701.795	1156.328	0.946	0.865
	96	750,000	600	706.179	830.852	0.945	0.930
	96	750,000	300	696.629	810.255	0.948	0.929
	146	480,000	600	731.614	892.195	0.941	0.920
Coefficient of thermal expansion	0	1,436,592	600	0.000229	0.000327	0.951	0.897
	96	750,000	600	0.000216	0.000252	0.956	0.939
	96	750,000	300	0.000291	0.000326	0.918	0.905
	146	480,000	600	0.000228	0.000251	0.951	0.939
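
The RMSE and R² columns in Table 3 can be reproduced for any pair of simulated and predicted property vectors. The sketch below assumes the standard definitions of both metrics; the function names and the sample values are hypothetical and are not data from the study.

```python
import math


def rmse(y_true, y_pred):
    """Root-mean-square error between reference and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical values for illustration only (not taken from Table 3).
y_fem = [10.0, 12.0, 14.0, 16.0]   # reference values, e.g., FEM results
y_model = [11.0, 12.0, 13.0, 16.0]  # model predictions

print(rmse(y_fem, y_model))      # sqrt(0.5), about 0.707
print(r2_score(y_fem, y_model))  # 1 - 2/20 = 0.9
```

Because RMSE carries the units of the property, the values in Table 3 are only comparable within one property, whereas R² is dimensionless and can be compared across properties.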

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
