Article

Transfer Learning-Enhanced Prediction of Glass Transition Temperature in Bismaleimide-Based Polyimides

1
School of Materials Science and Engineering, Beihang University, Beijing 100191, China
2
State Key Laboratory of Artificial Intelligence for Material Science, Beihang University, Beijing 100191, China
3
Tianmushan Laboratory, Yuhang District, Hangzhou 311115, China
*
Authors to whom correspondence should be addressed.
Polymers 2025, 17(13), 1833; https://doi.org/10.3390/polym17131833
Submission received: 22 May 2025 / Revised: 23 June 2025 / Accepted: 26 June 2025 / Published: 30 June 2025
(This article belongs to the Section Artificial Intelligence in Polymer Science)

Abstract

The glass transition temperature (Tg) is a pivotal parameter governing the thermal and mechanical properties of bismaleimide-based polyimide (BMI) resins. However, limited experimental data for BMI systems pose significant challenges for predictive modeling. To address this gap, this study introduces a hybrid modeling framework leveraging transfer learning. Specifically, a multilayer perceptron (MLP) deep neural network was pre-trained on a large-scale polymer database and subsequently fine-tuned on a small-sample BMI dataset. Complementing this approach, six interpretable machine learning algorithms (random forest, ridge regression, k-nearest neighbors, Bayesian regression, support vector regression, and extreme gradient boosting) were employed to construct transparent predictive models. SHapley Additive exPlanations (SHAP) analysis was further used to quantify the relative contributions of molecular descriptors to Tg. The results demonstrate that the transfer learning strategy achieved superior predictive accuracy in data-scarce scenarios compared to direct training on the BMI dataset. SHAP analysis identified charge distribution inhomogeneity, molecular topology, and molecular surface area properties as the major influences on Tg. This integrated framework not only improved prediction performance but also provided actionable insights into molecular structure design, laying a foundation for the rational engineering of high-performance BMI resins.

1. Introduction

The thermal stability of polymer substrates in high-temperature environments represents a critical technical bottleneck limiting the development of advanced composite materials. The glass transition temperature (Tg), a key parameter governing the transition from the glassy to the rubbery state, directly influences the morphological stability and mechanical property retention of thermoplastic and thermosetting polymers under extreme conditions [1]. For high-performance thermosetting polymers such as bismaleimide-based polyimide (BMI) resins, Tg determines their practical application potential in critical fields such as aerospace thermal protection structures and microelectronic packaging materials [2]. However, optimizing Tg in BMI resins remains a complex challenge, as it is highly dependent on the composition and architecture of the polymer [3,4,5,6]. Traditional trial-and-error approaches are constrained by lengthy experimental cycles, prohibitive testing costs, and difficulties in elucidating microscopic mechanisms. While computational methods like density functional theory (DFT) [7] and molecular dynamics (MD) simulations offer theoretical frameworks for Tg prediction [8,9,10], the intricate cross-linking networks, multidimensional aromatization reactions, and synergistic functional group interactions in BMI resins introduce dual challenges, namely exponentially increasing computational resource demands and empirical force field parameter selection in atomic-scale simulations.
Recent advances in polymer informatics have introduced a data-driven paradigm for materials development, enabling an inverse analysis of structure–property relationships through machine learning [11,12,13,14,15]. Notable progress includes Lei et al.’s [16] systematic benchmarking of 79 models for Tg prediction, which revealed synergistic interactions between molecular fingerprints and neural architectures. Their study evaluated how different feature engineering strategies impact model performance, particularly highlighting the efficacy of molecular fingerprints derived from the Simplified Molecular Input Line Entry System (SMILES) (e.g., Morgan fingerprints with radius = 3 and nBits = 2048) when combined with neural architectures. He et al. [17] demonstrated the scalability of this approach by developing a quantitative structure–property relationship (QSPR) model for 695 polyesters, achieving experimental validation errors within 17.4 °C through virtual screening. Bo et al. [18] conducted a comprehensive study on polyimide (PI) materials, focusing on 11 key properties across four categories. They developed a high-throughput predictive framework incorporating diverse feature representations (e.g., Morgan fingerprints, RDKit/Mordred descriptors) and machine learning models. To elucidate the physicochemical mechanisms underlying model predictions, they applied SHapley Additive exPlanations (SHAP) analysis. By leveraging SHAP values to quantify feature importance, they identified critical structural determinants influencing each property at the molecular level. Building on these insights, they designed three PI variants with distinct structural features, demonstrating the practical utility of SHAP-based interpretability in guiding rational materials design. While these studies highlight the transformative potential of machine learning in polymer design, their success critically depends on large-scale, high-quality datasets [19,20,21,22].
In contrast, the BMI resin field is constrained by data scarcity [23,24,25] and label noise arising from inconsistent experimental conditions, severely limiting model predictive capabilities.
To address the data scarcity challenge in BMI resin research, we employ transfer learning, a paradigm where knowledge gained from large datasets is repurposed for related tasks with limited data [26]. This approach has demonstrated efficacy in materials informatics. Yamada et al. [27] achieved high predictive performance in material property estimation using only tens of samples through their XenonPy.MDL pre-trained model library. Zhang et al. [28] developed a transfer learning framework to predict the stress–strain curves of polymer composites, achieving a 46.14% accuracy improvement in plastic deformation stages through optimal transport integration. Kazemi-Khasragh et al. [29] extended this concept to diverse polymer property prediction, accurately forecasting thermal and mechanical properties using datasets as small as 13 samples. Building on these foundations, we propose a hybrid framework that combines transfer learning and interpretable machine learning to overcome data limitations in BMI resin studies. The framework leverages knowledge from a large-scale polymer database to compensate for data limitations in BMI systems while incorporating explainability techniques to unravel the structural determinants of Tg. This approach aims to achieve the two following objectives: (1) enhancing predictive accuracy in data-scarce scenarios through transfer learning and (2) establishing a quantitative structure–property relationship for a rational molecular design of high-performance BMI resins.

2. Materials and Methods

2.1. Data Collection

A two-tier dataset architecture was employed. The base dataset (Data_1) comprised 3916 diverse polymers [30], including 697 polyimides (PIs) [31] that were added because of their topological similarity to the five-membered imide ring of BMI, with the aim of improving BMI-relevant feature extraction during pre-training. The target dataset (Data_2), constructed through experimental synthesis and literature curation [4,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53], contained only 78 BMI molecules; this scarcity posed the primary challenge for model development.
In order to systematically characterize the two-tier dataset architecture, we visualized the Tg distribution of the dataset (Figure 1a) using kernel density estimation (KDE) and distributional analysis. Here, the horizontal coordinates represent the Tg values corresponding to each data point, while the vertical coordinates indicate the frequency density. The green curve denotes Data_1, and the red curve denotes Data_2. Figure 1a reveals the multimodal distribution of Data_1 (the average value μ = 251.25 °C), reflecting its composition of diverse polymer families, as well as the right-skewed distribution of Data_2 (μ = 312.87 °C), suggesting rigid structural patterns specific to BMI resins. The distinct distributional differences between the two datasets underscore the uniqueness of BMI resins compared to other polymers, explaining why generalized polymer models cannot directly predict the Tg of BMI.
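As a minimal sketch of the kernel density estimation behind a plot such as Figure 1a, the snippet below evaluates a Gaussian KDE on a grid of Tg values. The sample values and the bandwidth of 25 °C are hypothetical choices for illustration; the source does not state which kernel or bandwidth was used.

```python
import math

def gaussian_kde(samples, bandwidth):
    """Return a function that estimates the density of `samples` with Gaussian kernels."""
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)

    return density

# Hypothetical Tg values in °C (not taken from Data_1 or Data_2).
tg_values = [180.0, 210.0, 250.0, 255.0, 300.0, 320.0]
density = gaussian_kde(tg_values, bandwidth=25.0)  # bandwidth is an assumed value

# Evaluate the curve on a coarse grid, as one would before plotting it.
grid = [100.0 + 10.0 * i for i in range(31)]  # 100 to 400 °C in 10 °C steps
curve = [density(x) for x in grid]
```

A smaller bandwidth would reveal more modes, and a larger one would smooth them away, so any claim of multimodality depends on this choice.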
Molecular structures from both datasets were encoded as Morgan fingerprints (see Section 2.2 for details) and subjected to principal component analysis (PCA). This method projects high-dimensional feature relationships onto interpretable 2D scatter plots, where the horizontal and vertical axes represent the first two principal components after dimensionality reduction. As shown in Figure 1b, partial domain overlap exists between green Data_1 clusters and red Data_2 clusters, indicating transferable latent representations while preserving domain-specific characteristics. Critically, this structural overlap exists only in latent feature space, and all BMI molecules in Data_2 possess unique molecular scaffolds absent from Data_1. The black arrows symbolize the knowledge transfer pathway from the general polymer space to BMI-specific regions.
While curing conditions (e.g., temperature, post-cure duration) and environmental exposure factors (e.g., humidity, UV irradiation) are widely recognized as critical determinants of Tg in polymer systems, these variables were not systematically incorporated into our modeling framework because they are reported inconsistently in the literature. Furthermore, although cross-linking frequency is a network structural parameter that directly affects Tg by restricting polymer chain movement, it was excluded from our analysis due to two practical limitations, namely (1) the experimental difficulty in measuring this property consistently across large datasets and (2) a lack of standardized measurement protocols between different studies. Although elevated cross-linking frequency is empirically linked to increased Tg in BMI systems, our study prioritized molecular-level descriptors to balance model complexity with data availability constraints. Additionally, we acknowledge that heterogeneity in Tg measurement methodologies across studies introduces unavoidable variability into the dataset. Such inherent noise may also contribute to model robustness by exposing the predictive framework to diverse measurement conventions, thereby enhancing generalization capabilities. This data collection approach reflects a deliberate trade-off between mechanistic completeness and pragmatic model applicability within the constraints of accessible experimental data.

2.2. Feature Engineering

A Morgan fingerprint [54] is a circular fingerprint encoding molecular substructures through hashed bit patterns. This encoding strategy preserves topological information at multiple scales while maintaining computational efficiency for neural processing. Unlike scalar molecular descriptors that aggregate global properties (e.g., molecular weight, the number of aromatic rings), Morgan fingerprints retain information on the local connectivity of functional groups, enabling neural networks to learn representations of structural features for Tg determination. The data preprocessing workflow (Figure 1c) utilized RDKit (RDKit: Open-source cheminformatics; http://www.rdkit.org (accessed on 11 March 2025)) to generate 2048-dimensional Morgan fingerprints from canonical SMILES [55] using the GetMorganFingerprintAsBitVect function with radius = 3 and nBits = 2048, capturing local chemical environments up to three bonds away from each atom.

2.3. Training Strategy of Transfer Learning Model

To address the sample scarcity in Data_2, a two-stage transfer learning framework was devised (Figure 1c). Stage 1 involved pre-training a multilayer perceptron (MLP) model on Data_1, featuring a 2048-dimensional input layer followed by three fully connected layers (1024/512/256 neurons with ReLU activation) and 30% dropout regularization. The model underwent 200 training epochs using the Adam optimizer with dynamic learning rate adjustment via ReduceLROnPlateau. Stage 2 implemented selective fine-tuning during transfer to Data_2: all parameters except those in the final five layers were frozen, enabling gradient updates only in the last two fully connected layers. This hierarchical adaptation mechanism preserved cross-domain generalizable features while enabling localized parameter tuning. Hyperparameter optimization employed grid search with early stopping, exploring learning rates (1 × 10⁻³ to 1 × 10⁻⁵) and batch sizes (16/32/64 samples per batch) with an L2 regularization strength of λ = 0.001.
Three network architectures were configured for the comparison in Section 3.1. The final convolutional neural network (CNN) architecture comprised three convolutional blocks with 32/64/128 filters (3 × 3 kernels), augmented by a 0.3 dropout rate and batch normalization, trained using the Adam optimizer (1 × 10⁻⁴ initial learning rate). For the MLP, a three-layer fully connected network (1024/512/256 neurons) was configured with 0.3 dropout and L2 regularization, optimized via AdamW (5 × 10⁻⁵ learning rate). The deep neural network (DNN) adopted a five-layer sequential structure (1024/256/128/64/32 neurons) with 0.3 dropout, utilizing Adam optimization (1 × 10⁻⁵ learning rate). All experiments were executed on a Linux workstation with an NVIDIA RTX 4090 GPU using Python 3.11. The presentation and calculation of the evaluation metrics are described in Supplementary Section S1.
Evaluation metrics included the root mean squared error (RMSE), mean absolute error (MAE), mean squared error (MSE), and coefficient of determination (R2).
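For reference, the four metrics can be computed as in the sketch below. It uses the standard textbook definitions, which are assumed (not confirmed) to match those in Supplementary Section S1; the Tg values are invented for illustration. The second call shows how R2 becomes negative when predictions are worse than simply predicting the mean of the true values.

```python
import math

def regression_metrics(y_true, y_pred):
    """Compute MSE, RMSE, MAE and R2 from paired true and predicted values."""
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # variance around the mean
    ss_res = sum(e * e for e in errors)                 # residual sum of squares
    return {"MSE": mse, "RMSE": math.sqrt(mse), "MAE": mae, "R2": 1.0 - ss_res / ss_tot}

# Hypothetical Tg values in °C.
y_true = [250.0, 300.0, 350.0]
good = regression_metrics(y_true, [260.0, 295.0, 340.0])
bad = regression_metrics(y_true, [100.0, 450.0, 200.0])  # far off: R2 drops below zero
```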

2.4. Proposed Virtual Structures

To elucidate the structure–property relationship between BMI molecules and their Tg, we categorized BMI molecules as aromatic (conjugated systems such as benzene rings) or aliphatic (linear or branched alkanes) according to the nature of their R-groups, as shown in Figure 2a. We employed a multiscale functional group modification strategy by introducing thiol (-SH), nitrogen-containing groups (-NH2/-NO2), oxygen-containing groups (hydroxyl, carbonyl, ester), and halogen substituents (F, Cl, Br, I). Additionally, five representative copolymer-modified architectures (ABPN, ABPA, DABPA, DABPAF, and AN) were incorporated alongside unmodified self-polymerization samples to simulate real-world modification processes [56,57,58]. The functional group type distribution is visualized in Figure 2b, reflecting the structural diversity of BMI resin macromolecules. Predicting the Tg of these designed virtual structures with Model_1 yielded a virtual BMI database (Data_3, n = 1092).

2.5. Descriptor Calculation

Unlike Morgan fingerprints, molecular descriptors are better suited to analyzing and quantifying global physicochemical properties, so we used molecular descriptors rather than Morgan fingerprints as feature inputs in the interpretable modeling process. We used RDKit to calculate 67 molecular descriptors (Figure 2c), including the maximum partial charge carried by any atom in the molecule (MaxPartialCharge), Balaban’s topological index (BalabanJ), and the sum of atomic molar refractivities (MolMR). The specific descriptors and their meanings are given in Supplementary Section S2 (Table S1). A hybrid feature selection method incorporating the Pearson correlation coefficient identified 23 core descriptors for quantitative structure–property relationship modeling. The results of screening feature correlations with the Pearson correlation coefficient are shown in Supplementary Section S3 (Figure S1).
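One common way to use the Pearson coefficient for descriptor screening is to drop one member of every highly correlated pair. The sketch below illustrates that idea only; the greedy rule, the 0.9 threshold, the `MolMR_scaled` column, and all values are assumptions, and the source does not describe the exact selection procedure.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def drop_correlated(columns, threshold=0.9):
    """Greedily keep descriptors whose |r| with every kept descriptor is <= threshold."""
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) <= threshold for k in kept):
            kept.append(name)
    return kept

# Hypothetical descriptor table: one list of values per descriptor.
descriptors = {
    "MolMR": [50.0, 60.0, 70.0, 80.0, 90.0],
    "MolMR_scaled": [5.1, 6.0, 7.2, 7.9, 9.0],  # invented column, nearly collinear with MolMR
    "BalabanJ": [1.9, 1.2, 1.7, 1.1, 1.6],
}
selected = drop_correlated(descriptors)
```

Because the rule is greedy, the result depends on column order: the first descriptor of a correlated pair is the one that survives.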

2.6. Interpretable Model

We evaluated six machine learning algorithms. Random forest (RF) [59] operates as an ensemble method combining multiple decision trees via bagging and feature randomness, thereby reducing variance and improving generalization. Ridge regression [60] extends ordinary least squares by introducing L2 regularization to penalize large coefficients, effectively mitigating multicollinearity and overfitting. K-nearest neighbors (KNN) [61] follows a non-parametric, instance-based learning paradigm in which predictions are derived from the weighted average of the target variable over the nearest training examples in the feature space. Bayesian regression (NB) [62] incorporates a probabilistic framework by assuming a prior distribution over model parameters, with predictions formulated as posterior distributions via Bayes’ theorem. Support vector regression (SVR) [63] extends the principles of support vector machines to regression tasks by mapping input features into a high-dimensional kernel space through nonlinear transformations. Extreme gradient boosting (XGBoost) [64] implements a gradient boosting framework that sequentially trains decision trees to correct residual errors, employing regularization terms and shrinkage to enhance robustness against overfitting while maintaining computational efficiency through parallel tree construction. All models were trained with 5-fold cross-validation. Specific parameter settings for the model training process are given in Supplementary Section S4.
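The 5-fold cross-validation mentioned above can be sketched as a simple index splitter. This is an illustrative implementation, not the authors' code; the shuffling seed and the way the remainder is distributed across folds are assumptions.

```python
import random

def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_idx, val_idx) pairs; each sample is used for validation exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k nearly equal-sized folds
    for i in range(k):
        val = folds[i]
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, val

# 78 is the size of Data_2 given in the text; the same splitter works for any dataset size.
splits = list(k_fold_indices(78, k=5))
```

With 78 samples, each validation fold holds only 15 or 16 molecules, which is one reason the fold-to-fold scores on Data_2 can vary widely.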

2.7. SHAP Analysis

SHapley Additive exPlanations (SHAP) [65], a framework rooted in cooperative game theory, was employed to decompose model predictions into feature contributions by quantifying Shapley values, i.e., the average marginal contribution of each molecular descriptor to a Tg prediction. By aggregating local explanations across the dataset, SHAP generated globally interpretable insights through summary plots and force plots, enabling the visualization of both linear and nonlinear descriptor relationships. The final model leverages SHAP-derived descriptor importance rankings to construct a transparent structure–property map, in which each molecular descriptor’s contribution to thermal transition behavior is represented.
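To make the notion of a Shapley value concrete, the sketch below computes exact Shapley values for a three-feature toy function by enumerating all feature subsets, with absent features replaced by a baseline value. It illustrates the underlying definition only; it is not the SHAP library, which uses faster model-specific algorithms, and the toy model, its coefficients, and the input values are all hypothetical.

```python
from itertools import combinations
from math import factorial

def shapley_values(model, x, baseline):
    """Exact Shapley values of `model` at `x`; absent features take their `baseline` value."""
    n = len(x)

    def value(subset):
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return model(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Weight of a coalition of this size in the Shapley average.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi

# Toy stand-in for a Tg model over three descriptors (hypothetical coefficients).
def toy_model(z):
    max_charge, rot_bonds, surface = z
    return 300.0 + 40.0 * max_charge - 3.0 * rot_bonds + 0.5 * surface * max_charge

x = [0.4, 10.0, 20.0]         # molecule being explained
baseline = [0.2, 16.0, 10.0]  # reference molecule
phi = shapley_values(toy_model, x, baseline)
```

The contributions sum to the difference between the prediction for `x` and the prediction for the baseline, which is the additivity property that SHAP summary plots rely on. Exhaustive enumeration scales exponentially with the number of features, so it is practical only for small examples like this one.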

3. Results and Discussion

3.1. Performance Comparison of Different Neural Network Frameworks as Pre-Trained Models in Transfer Learning

Given that the success of transfer learning hinges critically on the pre-trained model possessing robust generalization capabilities, we systematically compared the performance of different neural network architectures within the transfer learning framework. We selected three architectures for the pre-trained model comparison, namely an MLP, a CNN, and a DNN. The MLP, as a fundamental feedforward neural network, captures nonlinear relationships within data through its fully connected layers, making it particularly suitable for processing high-dimensional sparse molecular fingerprint data, such as Morgan fingerprints. In this study, the MLP model, employing a three-layer hidden structure, achieved the best predictive capability of the three models on the test set of Data_1, with an R2 value of 0.59. In contrast, the CNN, built from convolutional and pooling layers, excels in tasks like image recognition. However, within the transfer learning framework of this study, the CNN architecture did not surpass the MLP in terms of predictive accuracy and generalization performance. This discrepancy might stem from the fact that the CNN’s strength in handling local features and spatial hierarchies may not be fully leveraged when dealing with high-dimensional sparse molecular data. The DNN enhances model representational power by increasing network depth. The DNN model adopted in this study comprised five hidden layers, enabling it to learn more complex feature representations. Despite its theoretically strong fitting capability, the DNN’s performance on the specific tasks and datasets of this study still slightly lagged behind that of the MLP. This could be attributed to potential overfitting during DNN training, as well as to the challenges posed by data scarcity and quality heterogeneity in this study.
To visually illustrate the performance disparities among these models, Figure 3 presents the parity plots for pre-training on the Data_1 dataset across all three models. In these plots, the horizontal axis represents the true values, while the vertical axis denotes the model predictions. The black dashed line signifies the x = y diagonal, where points closer to this line indicate predictions closer to the true values. The blue line represents the training regression line, and the yellow line denotes the test regression line. Blue dots correspond to training dataset points, and yellow dots to test dataset points. Figure 3a specifically depicts the parity plot for the MLP model trained on Data_1, with training and test R2 values of 0.76 and 0.59, respectively. Figure 3b,c showcases the parity plots for the CNN and DNN models, with training R2 values of 0.67 and 0.68 and test R2 values of 0.51 and 0.57, respectively.
The effectiveness of transfer learning hinges on the pre-trained model’s performance, particularly its generalization ability, as this directly impacts the subsequent fine-tuning process on Data_2. The superior performance of the MLP model in pre-training, as evidenced by its higher test R2 value and closer alignment of test data points to the diagonal line in Figure 3a, underscores its advantage in handling high-dimensional sparse molecular data. Based on this finding, we selected the MLP model as the pre-trained model, and subsequent transfer learning tasks and methodological explorations were all conducted based on the MLP model.

3.2. Necessity and Technical Advantages of Transfer Learning

Confronting the dual challenges of data scarcity (Data_2, n = 78) and quality heterogeneity in predicting the Tg of BMI, our proposed transfer learning framework (Figure 1c) demonstrates significant technical advantages. As our objective focuses on predicting the Tg of BMI resins, we allocated 10% of Data_2 as the test set (designated as Test_2) for comparative analysis across different modeling strategies. Because Data_2 is small and a 10% external test set is highly sensitive to the partitioning method, a stratified random sampling approach was employed to ensure statistical representativeness while maintaining class balance in this low-resource scenario. Table 1 systematically compares the performance of three MLP-based modeling strategies evaluated on Test_2, building upon the conclusion in Section 3.1 that the MLP constitutes the optimal neural architecture for this task. All three strategies utilize the MLP framework but differ in training paradigm: (1) standalone training on Data_1 (general molecular database), (2) standalone training on Data_2 (n = 78 BMI-specific dataset), and (3) the two-stage transfer learning paradigm combining Data_1 pre-training with Data_2 fine-tuning. Specifically, the transfer strategy freezes all layers except the final five during fine-tuning, a configuration validated in Section 3.3. To ensure methodological consistency, both the transfer learning and standalone Data_1 strategies utilized identical MLP architectures and hyperparameter configurations during training on Data_1.
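A stratified split for a continuous target such as Tg can be sketched by sorting the samples, cutting them into bins, and drawing the test samples from every bin. The source does not say which variable or how many strata were used, so the Tg quartile bins, the seed, and the synthetic Tg values below are assumptions for illustration only.

```python
import random

def stratified_split(tg_values, test_fraction=0.1, n_bins=4, seed=0):
    """Split indices into train/test, drawing test samples from each Tg-sorted bin."""
    rng = random.Random(seed)
    order = sorted(range(len(tg_values)), key=lambda i: tg_values[i])
    bin_size = len(order) / n_bins
    bins = [order[round(b * bin_size):round((b + 1) * bin_size)] for b in range(n_bins)]
    test = []
    for members in bins:
        n_test = max(1, round(len(members) * test_fraction))  # at least one per bin
        test.extend(rng.sample(members, n_test))
    test_set = set(test)
    train = [i for i in range(len(tg_values)) if i not in test_set]
    return train, sorted(test)

# Synthetic Tg values in °C for 78 samples (the size of Data_2), not the real data.
tg = [200.0 + 2.5 * i for i in range(78)]
train_idx, test_idx = stratified_split(tg)
```

With 78 samples this yields a test set of 8 molecules, two from each quartile, so the test set covers low-Tg and high-Tg molecules alike instead of depending on the luck of a single random draw.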
However, despite the initial expectation that Data_1—a large-scale general polymer dataset—would provide robust predictive capability (as evidenced by its R2 = 0.59 on internal testing in Section 3.1), the model’s performance on BMI-specific Test_2 plummeted to R2 = −6.19. This dramatic degradation stems from fundamental domain differences: BMI resins exhibit unique thermal behavior mechanisms distinct from conventional polymers, rendering generic structural patterns in Data_1 poorly transferable. While standalone training on Data_2 (n = 78) might seem a logical alternative, the resulting R2 = −4.10 and RMSE = 82.15 °C reflect inherent limitations: (1) extreme data scarcity prevents learning meaningful representations and (2) manual aggregation from heterogeneous literature sources introduces uncontrolled experimental noise, forcing the model to memorize spurious correlations rather than genuine structure–property relationships.
Faced with these challenges (Data_1’s domain mismatch and Data_2’s poor quality), the transfer learning framework strategically leverages Data_1’s generalizable physical information as foundational knowledge while adapting to BMI-specific features through fine-tuning. As shown in Table 1, the transfer learning approach achieved significant improvements in all evaluation metrics. Here, the MSE measures the mean squared difference between predictions and true values; the RMSE expresses the error magnitude on the same scale as the target variable; and the MAE directly reflects the average magnitude of the prediction deviation. All three metrics follow the “lower is better” principle. R2 evaluates the model’s explanatory power for data variance, with values closer to one indicating better performance, while negative values signify worse performance than a baseline that always predicts the mean. Across the three modeling strategies evaluated on Test_2, the transfer learning approach is superior on every performance metric. Notably, transfer learning improves R2 from −6.19 to 0.44 compared to training on Data_1 alone, directly demonstrating the framework’s ability to correct for domain shift. The RMSE decreases by 72.04% (from 97.53 °C to 27.27 °C), which indicates a significant improvement in real-world applicability. Standalone training on Data_2 yields a negative R2 (−4.10) and a considerable RMSE (82.15 °C), indicating that the model does not predict accurately. This behavior stems from the model learning spurious correlations rather than true structure–property relationships in the small Data_2 dataset. The transfer learning framework utilizes the structural knowledge in Data_1 to effectively mitigate this issue, as evidenced by the positive R2 (0.44) and lower RMSE (27.27 °C) on Test_2.
The consistent performance gains across the RMSE, MSE, MAE, and R2 metrics collectively validate the framework’s capacity to mitigate data scarcity limitations in predicting the Tg of BMI resin.
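The headline numbers can be cross-checked with a few lines of arithmetic. The snippet below recomputes the relative RMSE reduction from the two RMSE values quoted above and confirms that the reported MSE of the transfer model (743.61, given in Section 3.3) is consistent with its RMSE; only values stated in the text are used.

```python
rmse_data1_only = 97.53  # °C, standalone training on Data_1, evaluated on Test_2
rmse_transfer = 27.27    # °C, transfer learning, evaluated on Test_2
mse_transfer = 743.61    # transfer learning MSE reported in Section 3.3

# Relative reduction in RMSE when moving from Data_1-only training to transfer learning.
relative_reduction = (rmse_data1_only - rmse_transfer) / rmse_data1_only * 100.0
print(f"RMSE reduction: {relative_reduction:.2f}%")

# RMSE is the square root of MSE, so the two reported values should agree up to rounding.
rmse_from_mse = mse_transfer ** 0.5
print(f"sqrt(MSE) = {rmse_from_mse:.2f} °C")
```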

3.3. Optimizing Transfer Learning Performance Through Layer-Wise Fine-Tuning in MLP Architectures

We explored the impact of varying fine-tuning layer counts on MLP-based transfer learning performance, with a particular focus on identifying the optimal balance between preserving pre-trained knowledge and adapting to target domain specifics. As shown in Figure 4, each subplot systematically evaluates a critical performance metric—the RMSE (4a), MSE (4b), MAE (4c), and R2 (4d)—along the vertical axis, while the horizontal axis spans the number of fine-tuned layers (ranging from 1 to 6). Given that the MLP architecture comprises six layers in total (excluding the input layer and output layer), our experiments systematically unfreeze one to six consecutive layers from the output end backward, enabling an investigation of adaptation effects.
The experimental curves reveal a consistent performance evolution pattern across all metrics. Initially, as layers are progressively unfrozen (moving from one to five layers), model performance improves markedly: the RMSE drops to 27.27 °C, the MSE decreases to 743.61, the MAE reduces to 21.92 °C, and R2 climbs to 0.44. This improvement phase peaks at five fine-tuned layers, indicating optimal adaptation where the model sufficiently adjusts higher-level representations for the target domain while retaining pre-trained feature extraction capabilities from the frozen initial layers. Beyond this optimal point, continued layer unfreezing (six layers) triggers performance deterioration across all metrics. This degradation suggests that excessive parameter adjustment may introduce domain-specific noise or disrupt previously learned robust features, negating the benefits of transfer learning.
These observations validate our strategy of adapting the last five layers. This configuration strikes a critical balance: keeping the remaining layers frozen ensures stability in handling high-dimensional molecular data, while fine-tuning the unfrozen layers provides the flexibility needed for domain-specific calibration. The resulting fine-tuned model, obtained with this five-layer fine-tuning approach, offers the best trade-off between transfer efficiency and adaptive capacity found in our experiments, achieving the highest predictive accuracy without compromising generalization capability.
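The layer-wise sweep described in this section amounts to choosing, for each setting, which layers receive gradient updates. The sketch below shows only that bookkeeping, assuming the six-layer count stated above; it does not train a model. In a framework such as PyTorch, a `False` flag would correspond to setting `requires_grad = False` on that layer's parameters.

```python
def trainable_flags(n_layers, n_unfrozen):
    """Mark the last `n_unfrozen` layers trainable; earlier layers stay frozen."""
    if not 0 <= n_unfrozen <= n_layers:
        raise ValueError("n_unfrozen must be between 0 and n_layers")
    return [i >= n_layers - n_unfrozen for i in range(n_layers)]

N_LAYERS = 6  # layer count stated in the text (input and output layers excluded)

# One plan per setting in the sweep: unfreeze 1 to 6 layers, counted from the output end.
plans = {k: trainable_flags(N_LAYERS, k) for k in range(1, N_LAYERS + 1)}

for k, flags in plans.items():
    frozen = flags.count(False)
    print(f"unfreeze last {k}: {frozen} frozen, {k} trainable")
```

Under this scheme the five-layer setting reported as optimal keeps only the first layer frozen, while the six-layer setting is equivalent to retraining every layer starting from the pre-trained weights.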

3.4. Feature Interpretability Analysis

According to Section 3.3, the layer-wise fine-tuning strategy culminated in the development of Model_1, an optimized MLP architecture incorporating transfer learning principles that directly enabled the Tg predictions for BMI resins. To comprehensively elucidate the intrinsic physicochemical relationships governing the Tg of BMI resins, we conducted an interpretable machine learning analysis comparing experimental data (Data_2) and computationally augmented datasets (Data_3). While Morgan fingerprints excel in capturing structural patterns for predictive modeling, their inherent black-box nature limits physicochemical interpretability. Molecular descriptors, by contrast, encode quantifiable physicochemical properties, enabling direct correlation analysis between specific structural attributes and Tg. This rationale motivated our selection of descriptors for interpretable model development.
Initially, we attempted to derive interpretable insights directly from Data_2, but the insufficient data quantity and methodological variability across experimental sources precluded reliable descriptor analysis. To overcome this limitation, we designed 1092 novel BMI structures, then employed Model_1 to predict their Tg values, thereby establishing the Data_3 dataset. Table 2 presents the quantitative performance metrics of six regression algorithms evaluated on both datasets. Notably, all models exhibited significantly improved predictive capabilities when trained on Data_3: for instance, the XGBoost model achieved a test set R2 of 0.63 with RMSE = 17.06 °C on Data_3 compared to R2 = −1.97 (RMSE = 48.35 °C) on Data_2. This dramatic performance disparity stems from Data_2’s limited sample size (n = 78) and inherent experimental heterogeneity across literature sources, which introduced confounding noise that compromised model generalization.
According to the results in Table 2, XGBoost performed best on Data_3 (R2 = 0.63, RMSE = 17.06 °C) and was therefore selected for SHAP-based descriptor analysis. Figure 5 shows the ranked importance of the descriptors and the direction (positive or negative) of their effects on Tg. SHAP summary plots (Figure 5a,c) visualize feature contributions via color gradient encoding: red/blue tones indicate high/low feature values, with saturation intensity reflecting predictive impact magnitude. The feature importance rankings (Figure 5b,d) further quantify the relative contribution of each descriptor; the horizontal axis gives the relative feature importance, and the vertical axis lists the descriptors. For Data_3 (Figure 5a,b), the most important descriptors include MaxPartialCharge, the maximum partial charge carried by any atom in the molecule. Its top ranking indicates that the inhomogeneity of the charge distribution affects Tg. MinPartialCharge, the smallest partial charge carried by any atom in the molecule, also ranks high and acts together with MaxPartialCharge: a larger difference between the two indicates a stronger localized concentration of charge within the molecule, which may lead to stronger electrostatic interactions between molecules. Both descriptors therefore have an important effect on the Tg of BMI, although neither shows a clear positive or negative correlation pattern. Another highly ranked descriptor is the number of rotatable bonds (NumRotatableBonds), i.e., the number of single bonds in the molecule that can rotate freely. Molecules with fewer rotatable bonds have restricted chain segment mobility, a more rigid molecular conformation, and a higher Tg.
Meanwhile, the specific molecular surface area contribution (SMR_VSA7) and the specific surface area contribution to the lipid–water partition coefficient (SlogP_VSA3) also ranked high, indicating that molecular surface area properties affect Tg as well, probably because they influence how the molecules stack and interact with one another.
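The attribution idea behind SHAP can be illustrated with a brute-force Shapley computation on a toy model. The sketch below is only an illustration under stated assumptions: the linear function `toy_tg`, its coefficients, the sample, and the baseline are all hypothetical, and the exhaustive subset enumeration shown here is not the tree-based algorithm that SHAP tooling typically applies to XGBoost models.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one sample by enumerating all feature subsets.

    Features outside a subset are replaced by their baseline value.
    """
    n = len(x)
    features = range(n)

    def value(subset):
        mixed = [x[i] if i in subset else baseline[i] for i in features]
        return predict(mixed)

    phi = []
    for i in features:
        others = [j for j in features if j != i]
        total = 0.0
        for size in range(n):
            # Shapley weight for subsets of this size that exclude feature i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_tg(features):
    """Hypothetical toy Tg model (NOT the paper's XGBoost model)."""
    max_charge, min_charge, n_rot = features
    return 300.0 + 80.0 * (max_charge - min_charge) - 3.0 * n_rot


# Hypothetical descriptor values: [MaxPartialCharge, MinPartialCharge, NumRotatableBonds]
sample = [0.40, -0.50, 16]
baseline = [0.30, -0.40, 22]

phi = shapley_values(toy_tg, sample, baseline)
for name, contribution in zip(
    ["MaxPartialCharge", "MinPartialCharge", "NumRotatableBonds"], phi
):
    print(f"{name}: {contribution:+.2f} °C")
```

The contributions sum to the difference between the prediction for the sample and the prediction for the baseline, which is the additivity property that SHAP summary plots rely on. In this toy model, having fewer rotatable bonds than the baseline yields a positive contribution to Tg.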
To verify that the results obtained from the virtual Data_3 are genuine, we trained the XGBoost model on Data_2 for feature importance analysis and obtained results (Figure 5c,d) similar to those for Data_3. Notably, MaxPartialCharge and MinPartialCharge, representing charge inhomogeneity, and SMR_VSA7, representing molecular surface properties, still rank near the top, but the model fails to capture influences such as NumRotatableBonds, which represents molecular topological complexity. This is directly related to the small size of Data_2.
Two experimentally characterized BMI derivatives (BMI-I and BMI-II) sourced from the literature [66,67,68] were subjected to computational analysis (detailed in Supplementary Section S5, Tables S2 and S3). The Model_1 predictions yielded Tg values of 263.75 °C and 323.03 °C for BMI-I and BMI-II, respectively. It is important to note that the literature-reported experimental values for these derivatives are given as lower bounds (Tg > 260 °C and Tg > 300 °C) rather than specific numerical values. In this study, we have chosen to use these lower bounds (260 °C and 300 °C) as reference points for comparison with our model predictions. Notably, the Model_1 predictions show excellent quantitative agreement with these conservative benchmarks, exhibiting absolute deviations of 3.75 °C (1.44%) and 23.03 °C (7.67%), respectively. The concomitant calculation of molecular descriptors revealed that while most topological parameters remained consistent between the two systems, the NumRotatableBonds metric exhibited a marked difference (22 and 16 bonds, respectively), inversely correlating with measured Tg. This comparison between experiment and model not only validates Model_1’s predictive accuracy but also reinforces the structural complexity–Tg relationship posited by the interpretable framework: reduced molecular flexibility (lower NumRotatableBonds) corresponds to elevated Tg, thereby substantiating molecular topology as a critical determinant of Tg.
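The deviation figures quoted above follow from simple arithmetic, sketched below using the predicted values and lower-bound references stated in the text. The helper name `deviation_from_lower_bound` is hypothetical. Because the references are lower bounds rather than measured Tg values, the results describe the distance to the bound and not a true prediction error.

```python
def deviation_from_lower_bound(predicted, lower_bound):
    """Return (absolute deviation in °C, deviation as % of the lower bound)."""
    abs_dev = abs(predicted - lower_bound)
    return abs_dev, 100.0 * abs_dev / lower_bound


# (predicted Tg, literature lower bound) in °C, as stated in the text.
cases = {
    "BMI-I": (263.75, 260.0),
    "BMI-II": (323.03, 300.0),
}

results = {
    name: deviation_from_lower_bound(pred, bound)
    for name, (pred, bound) in cases.items()
}

for name, (abs_dev, pct) in results.items():
    print(f"{name}: {abs_dev:.2f} °C above the bound ({pct:.1f}% of the bound)")
```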
In summary, to rigorously validate our findings, we employed a two-step verification process. First, we established cross-dataset consistency through an independent analysis of experimental Data_2, confirming the dominant role of charge heterogeneity and molecular surface properties while highlighting limitations in capturing topological complexity. Second, we conducted targeted experimental validation using BMI-I and BMI-II, addressing the observed discrepancy and empirically substantiating the structural complexity–Tg relationship.

4. Conclusions

The hybrid framework proposed in this study integrates transfer learning and interpretable machine learning to efficiently predict the glass transition temperature (Tg) of bismaleimide-based polyimide (BMI). Through SHapley Additive exPlanations (SHAP) analysis, this study elucidates how molecular descriptors influence Tg. Specifically, topological complexity (represented by NumRotatableBonds), charge distribution properties (represented by MaxPartialCharge and MinPartialCharge), and molecular surface properties (SMR_VSA7 and SlogP_VSA3) are identified as the dominant factors influencing Tg. This success can be attributed to the effective transfer of chemical space knowledge through transfer learning and the explicit resolution of higher-order interactions through SHAP analysis. Additionally, ten candidate structures with higher predicted Tg are proposed based on the predictions of Model_1 (Supplementary Section S6, Table S4).
However, this study has certain limitations, particularly the exclusion of external variables such as processing parameters, which may affect the model’s generalizability under varying processing conditions. Furthermore, while the current study primarily focuses on the influence of monomer structures on Tg, we recognize the potential importance of network structural information, such as cross-linking frequency. A higher cross-linking frequency generally leads to a denser network structure, thereby restricting the movement of polymer chains and resulting in an elevated Tg. Future research could extend this work by incorporating processing parameters, molecular characteristics, and network structural information to comprehensively reveal the multi-scale regulation mechanisms of Tg, further advancing the design of high-temperature-resistant polymers with both predictive accuracy and mechanistic transparency. We believe that these efforts can make Tg predictive models more accurate and practical, thereby providing stronger support for the design and development of high-performance thermosetting polymers.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/polym17131833/s1, Table S1: Introduction to molecular descriptor names and related meanings. Figure S1: Feature correlation analysis after feature selection; darker blue indicates a stronger positive correlation, lighter blue with a negative number indicates a stronger negative correlation, and the numbers give the correlation values. Table S2: Three experimental verifications of molecular structure in the literature. Table S3: The significant relevant descriptor values for the selected molecules. Table S4: Structures with higher predicted Tg values among the designed virtual structures. The detailed datasets of these models can be found at https://doi.org/10.5281/zenodo.15481912 (accessed on 21 May 2025).

Author Contributions

Conceptualization, Z.W. and J.Z.; methodology, Y.L. and L.Z.; software, P.K.; validation, Z.W., Z.L., and P.K.; formal analysis, Z.W.; investigation, J.Z.; resources, Y.L.; data curation, X.X.; writing—original draft preparation, Z.W.; writing—review and editing, P.K.; visualization, Z.W.; supervision, P.K. and L.Z.; project administration, L.Z.; funding acquisition, Y.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Data Availability Statement

The data are available from the authors upon request.

Acknowledgments

We sincerely acknowledge Yanjie Wang for providing instrumental support in preparing partial experimental data and test materials, which significantly enhanced the robustness and reliability of our research findings. The expert assistance in data acquisition and equipment access proved invaluable to this study.

Conflicts of Interest

Authors Ziqi Wang, Yu Liu, Xintong Xu, Jiale Zhang, Zhen Li, Lei Zheng and Peng Kang were employed by the company Tianmushan Laboratory. The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

  1. Zhao, Z.; Liu, F.; Yang, X.; Xie, Z.; Liu, L.; Chen, W. High-velocity impact and post-impact fatigue response of Bismaleimide resin composite laminates. Eur. J. Mech. A/Solids 2025, 112, 105655.
  2. Li, X.; Huang, J.; Chen, Y.; Zhu, F.; Wang, Y.; Wei, W.; Feng, Y. Polymer-Based Electronic Packaging Molding Compounds, Specifically Thermal Performance Improvement: An Overview. ACS Appl. Polym. Mater. 2024, 6, 14948–14969.
  3. Xu, J.; Chen, P.; Ma, S.; Zhu, G.; Wu, L. Synthesis and thermal properties of novel bismaleimides containing cardo and oxazine structures and the thermal transition behaviors of their polymer structures. Thermochim. Acta 2023, 719, 179401.
  4. Feng, Y.; Sun, Q.; Guo, J.; Wang, C. High-Performance Bismaleimide Resin with an Ultralow Coefficient of Thermal Expansion and High Thermostability. Macromolecules 2024, 57, 1808–1818.
  5. Melissaris, A.P.; Mikroyannidis, J.A. Bismaleimides chain-extended by imidized benzophenone tetracarboxylic dianhydride and their polymerization to high temperature matrix resins. J. Polym. Sci. Part A Polym. Chem. 1988, 26, 1165–1178.
  6. Grenier-Loustalot, M.-F.; Da Cunha, L. Sterically hindered bismaleimide monomer: Molten state reactivity and kinetics of polymerization. Eur. Polym. J. 1998, 34, 95–102.
  7. Graser, J.; Kauwe, S.K.; Sparks, T.D. Machine Learning and Energy Minimization Approaches for Crystal Structure Predictions: A Review and New Horizons. Chem. Mater. 2018, 30, 3601–3612.
  8. Radue, M.S.; Varshney, V.; Baur, J.W.; Roy, A.K.; Odegard, G.M. Molecular Modeling of Cross-Linked Polymers with Complex Cure Pathways: A Case Study of Bismaleimide Resins. Macromolecules 2018, 51, 1830–1840.
  9. Han, J.; Gee, R.H.; Boyd, R.H. Glass Transition Temperatures of Polymers from Molecular Dynamics Simulations. Macromolecules 1994, 27, 7781–7784.
  10. Buchholz, J.; Paul, W.; Varnik, F.; Binder, K. Cooling rate dependence of the glass transition temperature of polymer melts: Molecular dynamics study. J. Chem. Phys. 2002, 117, 7364–7372.
  11. Zhang, T.; Wang, S.; Chai, Y.; Yu, J.; Zhu, W.; Li, L.; Li, B. Prediction and Interpretability Study of the Glass Transition Temperature of Polyimide Based on Machine Learning with Quantitative Structure-Property Relationship (Tg-QSPR). J. Phys. Chem. B 2024, 128, 8807–8817.
  12. Babbar, A.; Ragunathan, S.; Mitra, D.; Dutta, A.; Patra, T.K. Explainability and extrapolation of machine learning models for predicting the glass transition temperature of polymers. J. Polym. Sci. 2024, 62, 1175–1186.
  13. Kang, S.; Cho, K. Conditional Molecular Design with Deep Generative Models. J. Chem. Inf. Model. 2019, 59, 43–52.
  14. Preuer, K.; Renz, P.; Unterthiner, T.; Hochreiter, S.; Klambauer, G. Fréchet ChemNet Distance: A Metric for Generative Models for Molecules in Drug Discovery. J. Chem. Inf. Model. 2018, 58, 1736–1741.
  15. Arús-Pous, J.; Blaschke, T.; Ulander, S.; Reymond, J.L.; Chen, H.; Engkvist, O. Exploring the GDB-13 chemical space using deep generative models. J. Chemin. 2019, 11, 20.
  16. Tao, L.; Varshney, V.; Li, Y. Benchmarking Machine Learning Models for Polymer Informatics: An Example of Glass Transition Temperature. J. Chem. Inf. Model. 2021, 61, 5395–5413.
  17. He, X.; Yu, M.; Han, J.-P.; Jiang, J.; Jia, Q.; Wang, Q.; Luo, Z.-H.; Yan, F.; Zhou, Y.-N. Leveraging data-driven strategy for accelerating the discovery of polyesters with targeted glass transition temperatures. AIChE J. 2024, 70, e18409.
  18. Zhang, B.; Li, X.; Xu, X.; Cao, J.; Zeng, M.; Zhang, W. Multi-property prediction and high-throughput screening of polyimides: An application case for interpretable machine learning. Polymer 2024, 312, 127603.
  19. Oviedo, F.; Ferres, J.L.; Buonassisi, T.; Butler, K.T. Interpretable and Explainable Machine Learning for Materials Science and Chemistry. Acc. Mater. Res. 2022, 3, 597–607.
  20. Nguyen, T.; Bavarian, M. A Machine Learning Framework for Predicting the Glass Transition Temperature of Homopolymers. Ind. Eng. Chem. Res. 2022, 61, 12690–12698.
  21. Pilania, G.; Iverson, C.N.; Lookman, T.; Marrone, B.L. Machine-Learning-Based Predictive Modeling of Glass Transition Temperatures: A Case of Polyhydroxyalkanoate Homopolymers and Copolymers. J. Chem. Inf. Model. 2019, 59, 5013–5025.
  22. Alcobaça, E.; Mastelini, S.M.; Botari, T.; Pimentel, B.A.; Cassar, D.R.; de Carvalho, A.C.P.d.L.F.; Zanotto, E.D. Explainable Machine Learning Algorithms For Predicting Glass Transition Temperatures. Acta Mater. 2020, 188, 92–100.
  23. Xu, P.; Ji, X.; Li, M.; Lu, W. Small data machine learning in materials science. npj Comput. Mater. 2023, 9, 42.
  24. Zhu, L.; Zhou, J.; Sun, Z. Materials Data toward Machine Learning: Advances and Challenges. J. Phys. Chem. Lett. 2022, 13, 3965–3977.
  25. Rodrigues, J.F., Jr.; Florea, L.; de Oliveira, M.C.F.; Diamond, D.; Oliveira, O.N., Jr. Big data and machine learning for materials science. Discov. Mater. 2021, 1, 12.
  26. King-Smith, E. Transfer learning for a foundational chemistry model. Chem. Sci. 2024, 15, 5143–5151.
  27. Yamada, H.; Liu, C.; Wu, S.; Koyama, Y.; Ju, S.; Shiomi, J.; Morikawa, J.; Yoshida, R. Predicting Materials Properties with Little Data Using Shotgun Transfer Learning. ACS Cent. Sci. 2019, 5, 1717–1730.
  28. Zhang, Z.; Liu, Q.; Wu, D. Predicting stress–strain curves using transfer learning: Knowledge transfer across polymer composites. Mater. Des. 2022, 218, 110700.
  29. Kazemi-Khasragh, E.; González, C.; Haranczyk, M. Toward diverse polymer property prediction using transfer learning. Comput. Mater. Sci. 2024, 244, 113206.
  30. Otsuka, S.; Kuwajima, I.; Hosoya, J.; Xu, Y.; Yamazaki, M. PoLyInfo: Polymer Database for Polymeric Materials Design. In Proceedings of the 2011 International Conference on Emerging Intelligent Data and Web Technologies, Tirana, Albania, 7–9 September 2011; pp. 22–29.
  31. Zhang, H.; Li, H.; Xin, H.; Zhang, J. Property Prediction and Structural Feature Extraction of Polyimide Materials Based on Machine Learning. J. Chem. Inf. Model. 2023, 63, 5473–5483.
  32. Zhu, J.; Xia, Y.; Liu, L.; Yan, S.; Zeng, Y.; Zhang, R.; Zhang, X.; Sheng, Y. Comparative study of the kinetic behaviors and properties of aromatic and aliphatic bismaleimides. Thermochim. Acta 2024, 737, 179768.
  33. Lyu, J.; Tang, J.; Ji, B.; Wu, N.; Liao, W.; Yin, C.; Bai, S.; Xing, S. Fluorinated polyetherimide as the modifier for synergistically enhancing the mechanical, thermal and dielectric properties of bismaleimide resin and its composites. Compos. Commun. 2024, 51, 102035.
  34. Chen, F.; Zhang, H.; Li, S.; Chen, Y.; Liang, M.; Heng, Z.; Zou, H. Design of high-performance resin by tuning cross-linked network topology to improve CF/bismaleimide composite compressive properties. Compos. Sci. Technol. 2023, 242, 110170.
  35. Hsiao, C.-C.; Lee, J.-J.; Liu, Y.-L. Meldrum’s acid-functionalized bismaleimide, polyaspartimide and their thermally crosslinked resins: Synthesis and properties. React. Funct. Polym. 2024, 202, 105988.
  36. Peng, H.; Wang, Y.; Zhan, Y.; Lei, F.; Wang, P.; Li, K.; Li, Y.; Yang, X. Hierarchical curing mechanism in epoxy/bismaleimide composites: Enhancing mechanical properties without compromising thermal stabilities. Eur. Polym. J. 2025, 222, 113604.
  37. Liu, B.; Yuan, Z.; Liu, C.; Sun, M.; Zhang, X.; Derradji, M.; Zhang, B.; Li, J.; Zhao, M.; Song, C.; et al. Synthesis, curing kinetics and processability of a low melting point aliphatic silicon-containing bismaleimide. Mater. Today Commun. 2024, 41, 110845.
  38. Zhang, Y.; Wang, L.; Yuan, Q.; Zheng, Q.; Wan, L.; Huang, F. Bismaleimide resin modified by a propargyl substituted aromatic amine with ultrahigh glass transition temperature, thermomechanical stability and intrinsic flame retardancy. React. Funct. Polym. 2023, 193, 105740.
  39. Zhou, Y.; Liu, W.; Ye, W.; Chu, F.; Hu, W.; Song, L.; Hu, Y. Design of reactive linear polyphosphazene to improve the dielectric properties and fire safety of bismaleimide composites. Chem. Eng. J. 2024, 482, 148867.
  40. Chen, S.; Yu, L.; Zhang, S.; Sun, X.; Qu, B.; Wang, R.; Zheng, Y.; Liu, X.; Li, W.; Gao, J.; et al. Synergistic strengthening and toughening of 3D printing photosensitive resin by bismaleimide and acrylic liquid-crystal resin. J. Sci. Adv. Mater. Devices 2023, 8, 100565.
  41. Ning, Y.; Li, D.-s.; Jiang, L. Thermally stable and deformation-reversible eugenol-derived bismaleimide resin: Synthesis and structure-property relationships. React. Funct. Polym. 2022, 173, 105236.
  42. Sheng, X.; Yun, S.; Wang, S.; Gao, Y.; Zuo, X.; Miao, X.; Shi, X.; Qin, J.; Ma, Z.; Zhang, G. Highly heat-resistant and mechanically strong co-crosslinked polyimide/bismaleimide rigid foams with superior thermal insulation and flame resistance. Mater. Today Phys. 2023, 36, 101154.
  43. Ge, M.; Liang, G.; Gu, A. A facile strategy and mechanism to achieve biobased bismaleimide resins with high thermal-resistance and strength through copolymerizing with unique propargyl ether-functionalized allyl compound. React. Funct. Polym. 2023, 186, 105570.
  44. Wu, T.; Jiang, P.; Zhang, X.; Guo, Y.; Ji, Z.; Jia, X.; Wang, X.; Zhou, F.; Liu, W. Additively manufacturing high-performance bismaleimide architectures with ultraviolet-assisted direct ink writing. Mater. Des. 2019, 180, 107947.
  45. Xiong, X.; Ma, X.; Chen, P.; Zhou, L.; Ren, R.; Liu, S. New chain-extended bismaleimides with aryl-ether-imide and phthalide cardo skeleton (I): Synthesis, characterization and properties. React. Funct. Polym. 2018, 129, 29–37.
  46. Li, X.; Zhou, Y.; Bao, Y.; Wei, W.; Fei, X.; Li, X.; Liu, X. Bismaleimide/Phenolic/Epoxy Ternary Resin System for Molding Compounds in High-Temperature Electronic Packaging Applications. Ind. Eng. Chem. Res. 2022, 61, 4191–4201.
  47. Ning, L.; Yuan, L.; Liang, G.; Gu, A. Thermally resistant and strong remoldable triple-shape memory thermosets based on bismaleimide with transesterification. J. Mater. Sci. 2021, 56, 3623–3637.
  48. Pu, Z.; Wu, F.; Wang, X.; Zhong, J.; Liu, X.; Pan, Y.; Wang, Y.; Jiang, D.; Ning, Z. Strategy to achieve low-dielectric-constant for benzoxazine-phthalonitriles: Introduction of 2,2′-bis [4-(4-Maleimidephen-oxy)phenyl)]propane by in-situ polymerization. J. Polym. Res. 2024, 31, 140.
  49. Xing, Y.; Zhang, Y.; He, X. Design of acetylene-modified bio-based tri-functional benzoxazine and its copolymerization with bismaleimide for performance enhancement. Polym. Bull. 2023, 80, 12065–12077.
  50. Yang, R.; Zhang, K. Strategies for improving the performance of diallyl bisphenol A-based benzoxazine resin: Chemical modification via acetylene and physical blending with bismaleimide. React. Funct. Polym. 2021, 165, 104958.
  51. Yu, P.; Zhang, Y.-l.; Yang, X.; Pan, L.-j.; Dai, Z.-y.; Xue, M.-z.; Liu, Y.-g.; Wang, W. Synthesis and characterization of asymmetric bismaleimide oligomers with improved processability and thermal/mechanical properties. Polym. Eng. Sci. 2019, 59, 2265–2272.
  52. Liu, C.; Qiao, Y.; Li, N.; Hu, F.; Chen, Y.; Du, G.; Wang, J.; Jian, X. Toughened of bismaleimide resin with improved thermal properties using amino-terminated Poly(phthalazinone ether nitrile sulfone)s. Polymer 2020, 206, 122887.
  53. Xue, K.; Zhang, P.; Song, Z.; Guo, F.; Hua, Z.; You, T.; Li, S.; Cui, C.; Liu, L. Preparation of eugenol-based flame retardant epoxy resin with an ultrahigh glass transition temperature via a dual-curing mechanism. Polym. Degrad. Stab. 2025, 231, 111092.
  54. Ma, R.; Liu, Z.; Zhang, Q.; Liu, Z.; Luo, T. Evaluating Polymer Representations via Quantifying Structure–Property Relationships. J. Chem. Inf. Model. 2019, 59, 3110–3119.
  55. Weininger, D. SMILES, a chemical language and information system. 1. Introduction to methodology and encoding rules. J. Chem. Inf. Comput. Sci. 1988, 28, 31–36.
  56. Lyu, J.; Ji, B.; Wu, N.; Liao, W.; Yin, C.; Bai, S.; Xing, S. The effect of substituent group in allyl benzoxazine on the thermal, mechanical and dielectric properties of modified bismaleimide. React. Funct. Polym. 2023, 191, 105673.
  57. Srinivasan, S.; Saravanamuthu, S.K.S.; Syed Mohammed, S.R.; Jeyaraj Pandian, D.; Chinnaswamy Thangavel, V. Low-temperature processable glass fiber reinforced aromatic diamine chain extended bismaleimide composites with improved mechanical properties. Polym. Compos. 2022, 43, 6987–6997.
  58. Gao, H.; Ding, L.; Li, W.; Ma, G.; Bai, H.; Li, L. Hyper-Cross-Linked Organic Microporous Polymers Based on Alternating Copolymerization of Bismaleimide. ACS Macro Lett. 2016, 5, 377–381.
  59. Breiman, L. Random Forests. Mach. Learn. 2001, 45, 5–32.
  60. Hoerl, A.E.; Kennard, R.W. Ridge Regression: Biased Estimation for Nonorthogonal Problems. Technometrics 1970, 12, 55–67.
  61. Taunk, K.; De, S.; Verma, S.; Swetapadma, A. A Brief Review of Nearest Neighbor Algorithm for Learning and Classification. In Proceedings of the 2019 International Conference on Intelligent Computing and Control Systems (ICCS), Madurai, India, 15–17 May 2019; pp. 1255–1260.
  62. MacKay, D.J.C. Bayesian interpolation. Neural Comput. 1992, 4, 415–447.
  63. Cherkassky, V. The Nature Of Statistical Learning Theory. IEEE Trans. Neural Netw. 1997, 8, 1564.
  64. Chen, T.; Guestrin, C. XGBoost: A Scalable Tree Boosting System. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016.
  65. Lundberg, S.M.; Lee, S.-I. A unified approach to interpreting model predictions. In Proceedings of the 31st International Conference on Neural Information Processing Systems, Long Beach, CA, USA, 4–9 December 2017.
  66. Iredale, R.J.; Ward, C.; Hamerton, I. Modern advances in bismaleimide resin technology: A 21st century perspective on the chemistry of addition polyimides. Prog. Polym. Sci. 2017, 69, 1–21.
  67. Ohtsuka, K.; Nakao, S.; Hatanaka, Y. Toughening of bismaleimide and benzoxazine alloy with allyl group by incorporation of polyrotaxane. Polymer 2025, 320, 127979.
  68. Wang, Y.; Yuan, L.; Liang, G.; Gu, A. Achieving ultrahigh glass transition temperature, halogen-free and phosphorus-free intrinsic flame retardancy for bismaleimide resin through building network with diallyloxydiphenyldisulfide. Polymer 2020, 203, 122769.
Figure 1. Data characterization and methodological framework. (a) Comparative kernel density estimation (KDE) profiles: Data_1 (green) shows multimodal distribution patterns, whereas Data_2 (red) exhibits elevated mean values attributed to rigid structural motifs; (b) principal component analysis (PCA) dimensionality reduction: green/red regions denote Data_1/Data_2 distributions; arrows indicate transfer learning pathways; (c) technical roadmap: molecular structure → RDKit fingerprint generation → deep neural modeling → transfer learning adaptation, forming a framework for cross-domain knowledge migration.
Figure 2. Bismaleimide-based polyimide (BMI) molecular design. (a) Classification framework based on R-group features (aromatic/aliphatic) and systematic functional group modification strategies; (b) modifier types showing functional group compositions in five copolymer structures and self-polymerized samples; (c) process of calculating descriptors.
Figure 3. Pre-training performance comparison of neural network frameworks on Data_1: (a) multilayer perceptron (MLP) parity plot (training R2 = 0.76, test R2 = 0.59); (b) convolutional neural network (CNN) parity plot (training R2 = 0.67, test R2 = 0.51); (c) deep neural network (DNN) parity plot (training R2 = 0.68, test R2 = 0.57). Diagonal line indicates ideal prediction (x = y), with training/test data points and regression lines shown in blue/yellow.
Figure 4. Layer-wise fine-tuning analysis for MLP-based transfer learning: (a) the root mean squared error (RMSE), (b) mean squared error (MSE), (c) mean absolute error (MAE), and (d) coefficient of determination (R2) performance trends across 1–6 fine-tuned layers. The MLP architecture comprises six layers (excluding input/output layers), with experiments systematically unfreezing 1–6 consecutive layers from the output end. The horizontal axis indicates the number of fine-tuned layers; vertical axes show error metrics (°C units for RMSE/MAE) and the coefficient of determination.
Figure 5. Feature role analysis: (a) SHapley Additive exPlanations (SHAP) summary plot for training interpretable models on Data_3; color shades indicate the degree of feature contribution, and red/blue represents the size of feature values; (b) feature importance ranking for training interpretable models on Data_3; the horizontal axis is the relative value of feature importance, and the vertical axis is the feature; (c) SHAP summary plot for training interpretable models on Data_2; color shades indicate the degree of feature contribution, and red/blue represents the size of feature values; (d) feature importance ranking for training interpretable models on Data_2; the horizontal axis is the relative value of feature importance, and the vertical axis is the feature.
Table 1. Performance comparison of MLP-based strategies on Test_2.
| Metrics | Data_1 Standalone | Data_2 Standalone | Transfer Learning (from Data_1 to Data_2) |
|---|---|---|---|
| RMSE (°C) | 97.53 | 82.15 | 27.27 |
| MSE | 9512.74 | 6747.92 | 743.61 |
| MAE (°C) | 89.49 | 79.47 | 21.92 |
| R2 | −6.19 | −4.10 | 0.44 |
Table 2. Performance of six interpretable machine learning models on test dataset from Data_2 and Data_3.
| Model | Dataset | RMSE (°C) | MSE | MAE (°C) | R2 |
|---|---|---|---|---|---|
| RF | Data_2 | 43.18 | 1864.35 | 36.42 | −0.81 |
| RF | Data_3 | 17.32 | 299.97 | 12.27 | 0.62 |
| Ridge | Data_2 | 37.39 | 1398.15 | 30.72 | −0.36 |
| Ridge | Data_3 | 21.75 | 472.98 | 15.33 | 0.40 |
| KNN | Data_2 | 64.09 | 4107.56 | 48.96 | −2.98 |
| KNN | Data_3 | 20.45 | 418.29 | 13.65 | 0.47 |
| Bayesian | Data_2 | 37.39 | 1397.86 | 30.33 | −0.36 |
| Bayesian | Data_3 | 21.60 | 466.51 | 15.18 | 0.41 |
| SVR | Data_2 | 37.02 | 1370.61 | 30.39 | −0.33 |
| SVR | Data_3 | 19.07 | 363.58 | 13.38 | 0.54 |
| XGBoost | Data_2 | 48.35 | 2337.33 | 39.91 | −1.97 |
| XGBoost | Data_3 | 17.06 | 290.98 | 12.09 | 0.63 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
