Machine Learning Prediction of Thermal Properties of PHB/PHBV-Based Materials: A Quantitative Structure–Property Relationship Approach Using an Integrated Polymer Database

Sotiropoulos, Nikolaos P.; Mindrinos, Leonidas; Peltier, Jean-David; Filippou, Konstantina V.; Kotzabasaki, Marianna I.; Tsigkas, Nikolaos; Maraveas, Chrysanthos

doi:10.3390/polym18131559


by Nikolaos P. Sotiropoulos ¹, Leonidas Mindrinos ¹, Jean-David Peltier ², Konstantina V. Filippou ¹, Marianna I. Kotzabasaki ¹, Nikolaos Tsigkas ¹ and Chrysanthos Maraveas ¹,*

¹ Department of Natural Resources Development and Agricultural Engineering, Agricultural University of Athens, Iera Odos 75, 11855 Athens, Greece

² CETEC Centro Tecnológico del Calzado y del Plástico de la Región de Murcia, Polígono Industrial Las Salinas, Avenida Europa, 2–3, 30840 Murcia, Spain

* Author to whom correspondence should be addressed.

Polymers 2026, 18(13), 1559; https://doi.org/10.3390/polym18131559 (registering DOI)

Submission received: 25 May 2026 / Revised: 19 June 2026 / Accepted: 21 June 2026 / Published: 23 June 2026

(This article belongs to the Special Issue Computational Modeling of Polymer Composites and Nanocomposites)


Abstract

Bio-based and biodegradable polymers such as short-chain-length (scl) poly(3-hydroxybutyrate) (PHB) and poly(3-hydroxybutyrate-co-3-hydroxyvalerate) (PHBV) are widely adopted in diverse areas such as healthcare, manufacturing, and packaging. However, high production costs and the complexity of tailoring their thermal properties, such as glass transition temperature (Tg), melting temperature (Tm), and crystallization temperature (Tc), hinder further adoption. The current study reported on the development of a raw dataset of PHB and PHBV materials compiled from 572 instances collected from the literature (558 instances) and in-house experiments (14 instances). The dataset encompassed compositional physicochemical parameters, molecular features, and corresponding thermal characteristics. After assessing data quality and filtering for completeness and available features, curated datasets were created for machine learning (ML) analysis. Two ML models, Random Forest (RF) and eXtreme Gradient Boosting (XGBoost), were utilized to predict values of Tg, Tc, and Tm using feature engineering methods that integrated chemistry-based descriptors with polymer-specific and experimental variables. The predictive performance of the models was systematically investigated using different combinations of input features to identify the most informative descriptor sets for each target property. The best-performing models were obtained using 118 data points for Tg and Tm and 201 data points for Tc, achieving R² values of 0.77, 0.76, and 0.82 for Tg, Tc, and Tm, respectively. Despite the reliable prediction of the thermal properties of scl-PHAs, the main limitations of the study were the relatively small dataset size for certain targets and incomplete or missing reporting of experimental conditions in the literature sources, which may introduce variability in the compiled data. 
The findings implied that curated polymer datasets and interpretable ML models can support the rational design of sustainable polymers with tailored properties for specific applications.

Keywords:

Poly(3-hydroxybutyrate); Poly(3-hydroxybutyrate-co-3-hydroxyvalerate); thermal properties; machine learning; quantitative structure–activity relationship models; polymer informatics

1. Introduction

Conventional plastics are widely used in industries such as packaging, healthcare, and manufacturing because of their durability, versatility, and low production costs. However, they also contribute directly to environmental pollution, as plastic waste is discarded into oceans, rivers, and landfills [1]. Current mitigation methods, such as recycling, are largely ineffective because they eliminate only a small fraction of this waste [2]. As a result, a significant proportion of plastic waste continues to pollute the environment, taking decades to break down into microplastics (MPs) and nanoplastics (NPs) that threaten ecosystems and human health [3]. Given these inadequate plastic waste management processes, there is an urgent need to transition to a circular economy (CE) that provides a sustainable strategic framework governing the use of plastics.

The redesign of production, use, and end-of-life management of plastics is a promising pathway to eradicate plastic pollution and its negative environmental impacts [4]. Additionally, developing and adopting sustainable, bio-based, and biodegradable plastic products (BBpPs) is an effective alternative [5,6]. Polyhydroxyalkanoates (PHAs), a family of bio-based and bio-renewable aliphatic polyesters, are an example of a BBpP that biodegrades in different natural environments and are biosynthesized by microorganisms through fermentation under nutrient-limiting conditions [7,8]. Bacteria synthesize PHAs intracellularly as carbon and energy storage polymers composed of hydroxyalkanoate (HA) units. However, a core challenge of PHAs is their high production costs [9,10,11]. As a result, this limits their use in commercial settings.

PHAs are also categorized based on the length of the alkyl side chain in their repeating units into three groups. First, there are scl-PHAs with three to five carbon atoms, followed by medium-chain-length PHAs (mcl-PHAs) with six to 14 carbon atoms. Third, there are long-chain-length PHAs (lcl-PHAs; 15 or more carbon atoms) [12]. Additionally, the physicochemical and thermal behavior of PHAs is governed by factors such as average molecular weight, monomer composition, side chain length, and crystallinity, which can be modified to tailor the performance of the material. Further, the ability to tune the properties of PHAs, along with their biodegradation and biocompatibility in natural environments, is a core feature that positions PHAs as strong candidates that are suitable alternatives to traditional plastics derived from petroleum [11]. In this context, PHAs can be adopted to support the development of a circular bioeconomy.

Further insights from the literature reveal that PHB and its copolymer PHBV are the most widely studied scl-PHAs. These polymers have side chains attached at the third carbon atom of the backbone and exhibit relatively high crystallinity and melting temperatures. Incorporating comonomers such as 3-hydroxyvalerate into PHB disrupts the regularity of the crystals, which reduces the melting point and changes chain mobility [4]. As a result, the copolymer PHBV shows better thermal processability than PHB, with lower Tm, Tc, and Tg, giving a more elastic and flexible material at ambient temperature that is easier to process.

The blending of scl-PHAs with mcl-PHAs or the integration of additives such as plasticizers also modifies intermolecular interactions and crystallinity, enhancing thermal behavior. While these approaches enhance performance and processability, the resulting structure–property associations are often nonlinear and strongly coupled, leading to difficulty in making rational design using conventional trial-and-error methods. In contrast with scl-PHAs, mcl-PHAs also contain longer side chains that hinder chain packing, thereby leading to lower crystallinity, lower melting points, and elastomeric behavior [13,14]. The distinct structural differences lead to distinct mechanisms of crystallization and thermal transitions, revealing the chemically and physically unique polymer class of scl-PHAs. Additionally, the thermal qualities of scl-PHAs, especially Tg, Tm, and Tc, are highly sensitive to molecular structure and formulation. PHB, the most widely industrialized PHA, is also brittle due to its high crystallinity, poor mechanical properties, and low thermal stability because its melting temperature (Tm) is close to its degradation temperature (Tdeg), making thermal processing (e.g., extrusion, injection molding) challenging [15,16].

Despite the advantageous properties of PHAs (high crystallinity, high melting temperatures, and easier processing), they are inherently limited by brittleness, narrow processing windows, sensitivity of thermal properties to small compositional changes, and high production costs [15]. These challenges are especially pronounced for scl-PHAs, where modest variations in comonomer content, blend composition, or additive concentration can lead to significant shifts in Tg, Tm, and Tc [14]. Consequently, predictive tools that efficiently link formulation parameters to thermal properties are essential to accelerate the development of application-ready scl-PHA- and PHB-based materials with targeted properties, reducing experimental costs and time. ML methods and polymer informatics have emerged as a promising alternative for predicting polymer properties and guiding materials design [17,18,19]. However, the success of ML models is constrained by the lack of high-quality and chemically consistent datasets. For PHAs, this need remains unmet even in large polymer databases such as Polymer Genome and PoLyInfo, in which PHAs represent only a small fraction of entries and thermal property data are often incomplete or inconsistently reported [20,21].

Moreover, these databases do not clearly distinguish between scl- and mcl-PHAs, and they fail to account for formulation effects such as additive incorporation and blending. Currently, only a limited number of ML studies have addressed thermal property prediction within the PHA family. Previous studies have demonstrated the feasibility of predicting Tg or Tm for PHA homopolymers and copolymers based on small, literature-curated datasets, while more recent multitask deep learning models have included PHAs as a minor subset within broad collections of polymers. Such approaches either rely on heterogeneous PHA datasets, which obscure scl-specific physicochemical trends, or prioritize general polymer screening rather than focused modeling of scl-PHA formulations [22,23,24,25,26]. As such, there is a lack of high-quality PHA datasets that address formulation effects related to blending and additive incorporation.

The aim of the current study was to develop a curated and standardized dataset of PHB and PHBV materials compiled from the literature and in-house experiments and apply ML models to predict their thermal features. The objectives in the study were:

To curate a structured data library of PHB/PHBV-based materials incorporating various additives and building blocks from the literature and in-house experiments.
To implement XGBoost and Random Forests to predict the thermal features of scl-PHAs from the curated dataset.
To critically investigate the restrictions of the ML models and the curated database, thereby outlining future research directions.

The novelty of this study lies in its curated dataset dedicated exclusively to scl-PHAs with side chains at the 3-position of the polymer backbone, including PHB, PHBV formulations, additive-containing systems, and blends with mcl-PHAs. This contrasts with broad polymer databases such as Polymer Genome and PoLyInfo, in which PHAs represent only a small subset of entries, and with previous PHA datasets that primarily focus on homopolymers and copolymers. The dataset therefore allows the relationship between formulation composition and thermal behavior to be systematically explored. Experimental Tg, Tc, and Tm values, weight-average molecular weight (Mw), number-average molecular weight (Mn), polydispersity index (PDI), and compositional information were systematically obtained from the literature and critically curated to ensure data consistency and reliability. Narrowing the scope to these specific materials ensured that a thoroughly curated dataset could be constructed and aided the development of predictive models that capture structure–thermal property relationships and additive interactions in Tg, Tc, and Tm, which are inaccessible in broader PHA datasets. Overall, the research aims to establish a foundational data infrastructure for scl-PHA informatics and to demonstrate the value of targeted curation for advancing predictive polymer design and optimization of scl-PHA formulations with tailored thermal performance.

2. Materials and Methods

2.1. Workflow of Model Development

The present study follows a multi-stage research methodology that integrates literature data analysis, experimental data incorporation, database development, feature engineering, and machine learning modeling. The overall workflow is depicted in Figure 1 and summarized as follows.

In the first stage, a comprehensive data curation process was conducted to assemble a structured dataset comprising polymer composition, molecular characteristics, additive information, and physicochemical and thermal properties. Raw data were manually collected from peer-reviewed literature sources as well as in-house experimental measurements to ensure broad coverage of scl-PHA systems. An initial quality control procedure was applied to remove duplicate entries and resolve inconsistencies in units. This step resulted in a curated and standardized database suitable for downstream analysis.

The second stage focused on data preprocessing and feature engineering. Missing values in key molecular descriptors were handled using physically consistent reconstruction strategies based on established polymer relationships, thereby improving descriptor coverage while preserving chemical meaning. Numerical features were subsequently normalized and transformed to ensure comparability across different scales, while categorical variables related to polymer type, composition, and additives were encoded using one-hot encoding. This stage also included preliminary exploratory analysis to assess feature distributions and potential sources of bias in the dataset.
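As a rough illustration of the encoding and scaling step, the following minimal sketch uses only the Python standard library. Min-max scaling is an assumption made here for concreteness, since the text does not state which normalization was applied, and the sample values are hypothetical.

```python
def one_hot(values):
    """Return the sorted category list and one 0/1 indicator row per input value."""
    categories = sorted(set(values))
    return categories, [[1 if v == c else 0 for c in categories] for v in values]


def min_max_scale(values):
    """Rescale numbers to [0, 1]; a constant column maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical values for illustration only (not taken from the dataset)
additive_types = ["plasticizer", "filler", "not applicable", "filler"]
hv_ratio = [0.0, 5.0, 10.0, 20.0]

categories, encoded = one_hot(additive_types)
scaled = min_max_scale(hv_ratio)
```

Tree-based models such as RF and XGBoost are largely insensitive to monotonic rescaling of numerical inputs, so the scaling step mainly matters for exploratory analysis and for methods that depend on distances.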

In the third stage, ML model development was carried out. The dataset was partitioned into training and test subsets using multiple splitting strategies to evaluate robustness. Outlier detection was applied exclusively to the training data to prevent information leakage. Random Forest (RF) and XGBoost models were trained using cross-validated hyperparameter optimization to balance model complexity and generalization [27,28]. Feature importance analysis was performed to identify the most influential descriptors governing thermal behavior, and a Domain of Applicability (DoA) analysis was implemented to define reliable prediction ranges in terms of composition, molecular weight, and additive content. Model evaluation and validation were conducted using multiple complementary metrics and validation strategies. Predictive performance was assessed using the coefficient of determination (R²) together with error-based metrics (MAE and RMSE), and uncertainty was quantified through cross-validation variability.
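The three evaluation metrics named above can be written out directly from their definitions. The sketch below is a plain-Python illustration rather than the authors' evaluation code, and the example temperatures are hypothetical.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return R^2, MAE and RMSE for paired lists of measured and predicted values."""
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / n
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(r * r for r in residuals)               # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)   # total sum of squares
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all measured values are identical")
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "mae": sum(abs(r) for r in residuals) / n,
        "rmse": math.sqrt(ss_res / n),
    }


# Hypothetical Tm values in deg C (illustrative only, not from the dataset)
measured = [170.0, 165.0, 150.0, 160.0]
predicted = [168.0, 166.0, 153.0, 160.0]
metrics = regression_metrics(measured, predicted)
```

Reporting MAE and RMSE alongside R² is useful because both are in the units of the target (°C here), whereas R² depends on the spread of the measured values in the evaluated subset.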

2.2. Data Collection from the Literature

2.2.1. Literature Search Strategy

The literature search was undertaken in verified scientific databases such as Scopus, Elsevier, and Springer in alignment with PRISMA 2020 guidelines [29]. The rationale for their selection was that they provide access to diverse peer-reviewed research articles, patents, and conference proceedings, thereby providing valuable scientific insights to address the research objectives.

Keywords were first derived from the research objectives, such as “PHBV,” “PHB,” “thermal,” “lignin,” “ascorbic acid,” “orotic acid,” “maltose,” “thymol,” and “sorbitol.” Boolean operators AND/OR were then adopted to create a search string used in performing a literature search in the individual databases.

The final search string was structured as follows, where the TITLE-ABS-KEY search function from Scopus was used to query the databases:

TITLE-ABS-KEY ((“PHBV” OR “polyhydroxybutyrate-co-valerate” OR “Poly(3-hydroxybutyrate-co-3-hydroxyvalerate)” OR “polyhydroxybutyrate” OR “PHB”) AND (“thermal” OR “rheological”) AND (“lignin” OR “ascorbic acid” OR “orotic acid” OR “theobromine” OR “fructose” OR “glucose” OR “lactose” OR “sucrose” OR “maltose” OR “melezitose” OR “dextran” OR “ascorbyl palmitate” OR “lignins” OR “ammonium quaternary salts” OR “calcium carbonate” OR “epoxidized soybean oil” OR “acetyl tributyl citrate” OR “castor oil” OR “limonene” OR “thymol” OR “starch-based fillers” OR “sorbitol” OR “maltodextrin” OR “hexadecyl 3,5-bis-tert-butyl-4-hydroxybenzoate” OR “PHN” OR “polyhydroxynonanoate” OR “Poly(3-hydroxynonanoate)”)).

Applying the search string resulted in the retrieval of 327 documents.
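To make the structure of such a query easier to reproduce or extend, the sketch below assembles a search string of the same shape from three term groups. The additive list is deliberately shortened for illustration, and the helper name `or_group` is hypothetical.

```python
def or_group(terms):
    """Quote each term and join the terms with OR inside one pair of parentheses."""
    return "(" + " OR ".join(f'"{t}"' for t in terms) + ")"


polymer_terms = ["PHBV", "polyhydroxybutyrate-co-valerate", "polyhydroxybutyrate", "PHB"]
property_terms = ["thermal", "rheological"]
# Shortened list for illustration; the full string above contains many more additives
additive_terms = ["lignin", "ascorbic acid", "thymol", "sorbitol"]

groups = [or_group(g) for g in (polymer_terms, property_terms, additive_terms)]
query = "TITLE-ABS-KEY (" + " AND ".join(groups) + ")"
```

Keeping the term groups as separate lists makes it straightforward to document exactly which additives were covered by a given search.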

2.2.2. Study Selection and Screening

Inclusion and exclusion criteria were further defined to facilitate the selection of relevant studies through a screening process based on the PRISMA 2020 guidelines [29] (Figure 2).

Based on the PRISMA 2020 guidelines [29], the research only selected studies whose scope focused on the thermal properties of PHB/PHBV formulations, their compositional information, with or without additives, and their blends with mcl-PHAs. Only original research articles, published in English, were included. There were no restrictions on the publication year, ensuring wide coverage of historical information. However, articles published in non-English languages and review articles were excluded from the study. Subsequently, only 109 studies were included in the final study.

2.2.3. Data Extraction and Synthesis

To extract and synthesize data from the identified 109 studies, a comprehensive examination of each study was undertaken. The data extraction from the literature sources identified the compositional, physicochemical, and thermal properties of the PHB/PHBV compounds for a wide range of molecular weights and the percentages of monomers that constituted them.

Further, data was also extracted from the in-house experimental dataset based on existing standards and protocols. First, Differential Scanning Calorimetry (DSC) was adopted to determine Tg, Tm, and Tc, as it directly assesses thermal transitions from differences in heat flow by detecting energy absorbed (endothermic) or released (exothermic) during phase transitions. Next, CETEC experimental protocols were implemented to facilitate the integration of the experimental dataset (14 instances). In cases where thermal properties were not reported but graphs, images, or curves provided useful data, the values were manually extracted using PlotDigitizer. While this approach ensured that all relevant thermal property data, including Tm, Tg, and Tc, were captured, even when only graphical representations were available, it can increase uncertainty and introduce manual errors. Additionally, due to incomplete reporting in the literature, key DSC experimental conditions such as heating/cooling rate, thermal history, and the definition of transition temperatures (onset, midpoint, or peak) were not consistently available and thus could not be incorporated into the dataset. These variations in DSC measurement conditions and methods used across studies can influence the measured thermal transitions and lead to shifts in the reported Tg, Tm, and Tc values, thereby representing an unavoidable source of variability and heterogeneity within the compiled dataset. In cases where sufficient information was available, Tc was taken from the cooling cycle, while Tg and Tm were extracted from the second heating scan to minimize the influence of prior thermal history. A representative DSC diagram for amorphous and semicrystalline polymers is shown below in Figure 3.

2.2.4. Data Curation

The literature and experimental data were curated in a central Excel spreadsheet with three data worksheets. All features were standardized to consistent units (SI or SI-consistent where applicable) to ensure comparability across datasets prior to model development. Abbreviations used in the dataset were documented in an additional, fourth worksheet. Each data worksheet represented a different data category, organized as shown in Table 1.

Each study was also assigned a unique identification number (study ID), and relevant information across each category was entered into a row. If a study included different compositional or experimental details, a separate category (instance) was created for the study. Despite representing different formulations and conditions, these categories retained the study ID of their respective source studies.

2.2.5. Data Integration

The data from the literature (558 instances) and in-house experiments (14 instances) were further integrated into a database, leading to 572 different instances. The study IDs were non-sequential as they were extracted from a larger data library. Table A1 in Appendix A.1 showcases the distribution of instances for each study. The dataset was designed following the Findability, Accessibility, Interoperability, and Reuse (FAIR) data principles, facilitating reproducibility and reuse [31].

Table A2 in Appendix A.2 outlines the input (feature) and output (target) variables included for the construction of the scl-PHAs-based materials data library, which includes 25 numerical and 12 categorical variables. Each variable is described based on its definition, measurement unit, value range, and type (categorical or numerical). The full curated raw data library for the scl-PHAs-based formulations, including all variables across the three worksheets, was uploaded to the Agricultural University of Athens (AUA) Zenodo repository [32].

2.3. Data Collection from In-House Experiments

2.3.1. Experimental Data Acquisition Methodology

Following the collection of data from the literature, the second phase involved conducting in-house experiments to extract and characterize PHBV, as described in the subsequent subsections, resulting in the addition of 14 new instances to the dataset. The experimental values introduced into the database were single measurements.

PHBV Extraction

First, purified PHBV was recovered from the PHBV-rich biomass produced at the CetecBio PHBV pilot plant. Dried PHBV-rich biomass (about 700 g) was ground using a RETSCH cutting mill SM300 mounted with a 6 mm sieve. The ground PHBV-rich biomass was Soxhlet-extracted for 12 h with 1,3-dioxolane (4 L), and the extract was concentrated until the formation of a brown gel in a rotary evaporator. Methanol at −18 °C (1 L) was added to the residue, and the solution was agitated to precipitate the PHBV. The precipitate was then washed 3 times with methanol (500 mL) or until the filtrate was colorless. The recovered PHBV was then dried in the oven at 60 °C overnight, and the yield was recorded.

Gel Permeation Chromatography (GPC)

The GPC data were recorded on an Agilent Infinity II instrument equipped with differential refractive index (DRI), viscometry (VS), and light scatter (LS) detectors. The system was equipped with 2 × Agilent PLGel Mixed C columns (300 × 7.5 mm) and an Agilent 5 µm PLGel Guard column. Agilent poly(methylmethacrylate) (PMMA) EasiVials were used to create a third-order conventional calibration from DRI data between 1,795,000 and 535 g mol⁻¹. The mobile phase was CHCl₃, run at a flow rate of 1 mL min⁻¹ at 30 °C. All sample analyses (number-average molar mass (Mn), weight-average molar mass (Mw), and polydispersity index (Mw/Mn)) were carried out using Agilent GPC/SEC software.

Differential Scanning Calorimetry (DSC)

DSC was performed on the purified PHBV (10 ± 5 mg) using Tzero™ aluminum pans and a TA Instruments DSC 2500 with refrigerated cooling under nitrogen (50 mL/min), based on the international standard UNE-EN ISO 11357-1 [33]. The DSC equipment was calibrated using indium and sapphire standards in adherence with manufacturer protocols. Samples were equilibrated at −70 °C (2 min), then ramped at 10 °C/min to 185 °C (3 min) to remove any thermal history, then cooled to −70 °C at a rate of 10 °C/min (2 min), and the cycle was repeated.

The glass transition temperature (Tg) was determined based on the international standard UNE-EN ISO 11357-2 [34]. The melting temperature (Tm) and melting enthalpy (∆H_f) were further determined from the inflection point and endothermic peak of the second heating scan. The crystallization temperature (Tc) was based on the exothermic peak observed during the cooling cycle (in the presence of a nucleating agent) or during the second heating cycle (in the absence of a nucleating agent) following the international standard UNE-EN ISO 11357-3 [35].

The degree of crystallinity (Xc) was determined using the following equation:

X_{c} = \frac{\Delta H_{f}}{\Delta H^{0}_{f}} \times 100

(1)

where \Delta H_{f} is the melting enthalpy of the sample and \Delta H^{0}_{f} is the melting enthalpy of the pure crystalline polymer (146 J/g for PHBV) [36]. Thereafter, the crystallization temperature was determined. Data were analyzed using TA TRIOS v5.1.1 software.
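Equation (1) is simple enough to express as a one-line function. The sketch below uses the 146 J/g reference value quoted in the text as the default; the sample enthalpy is a hypothetical number chosen for illustration.

```python
DELTA_H_F0_PHBV = 146.0  # J/g, reference enthalpy quoted in the text for PHBV


def degree_of_crystallinity(delta_h_f, delta_h_f0=DELTA_H_F0_PHBV):
    """Degree of crystallinity in percent, following Equation (1)."""
    if delta_h_f0 <= 0:
        raise ValueError("reference melting enthalpy must be positive")
    return delta_h_f / delta_h_f0 * 100.0


# Hypothetical melting enthalpy of 73 J/g, i.e. half of the reference value
xc = degree_of_crystallinity(73.0)
```

For formulations containing additives, the measured enthalpy is often normalized by the polymer weight fraction before applying this ratio; the text does not state whether that correction was used, so it is left out here.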

Nuclear Magnetic Resonance (NMR) Spectroscopy

PHBV samples were dissolved in deuterated chloroform (CDCl₃), and ¹H-NMR spectra were recorded on an Ascend Bruker 400 MHz spectrometer at room temperature. Spectra were analyzed using MestReNova V14.2.0 software. The molar fraction of the 3HV unit of the PHBV samples was estimated following the CEN workshop agreement 18155 procedure [37].

2.3.2. Mechanical Testing

Mechanical tests were carried out in a controlled atmosphere at 23 °C and an RH of 48% using a Zwick/Roell Z010 apparatus based on ISO 527 [38]. The preload was 1 N, the testing speed was 50 mm/min, the initial clamping length was 115 mm, and the initial standard stroke length was 50 mm.

2.3.3. Data Preprocessing

In the subsequent phase, the dataset “Worksheet_1_Physicochemical Information,” consisting of 24 columns and 572 rows, and the dataset “Worksheet_2_ Thermal properties”, with 18 columns and 572 rows, were merged based on the key columns “Study_id” and “Instance”. This integration aligned each experiment’s physicochemical properties with the extracted temperature values.
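The merge on the two key columns can be sketched without any external library. The example assumes an inner join, which is consistent with both worksheets holding 572 rows but is not stated explicitly; the `Tm` column name and the numeric values are hypothetical.

```python
def merge_on_keys(left_rows, right_rows, keys=("Study_id", "Instance")):
    """Inner-join two lists of dict rows on the given key columns."""
    right_index = {tuple(r[k] for k in keys): r for r in right_rows}
    merged = []
    for row in left_rows:
        match = right_index.get(tuple(row[k] for k in keys))
        if match is not None:
            merged.append({**row, **match})  # key columns agree, so no conflict
    return merged


# Hypothetical rows standing in for the two worksheets
physchem = [
    {"Study_id": 3, "Instance": 1, "Mw_PHBV": 450000.0},
    {"Study_id": 3, "Instance": 2, "Mw_PHBV": 380000.0},
]
thermal = [
    {"Study_id": 3, "Instance": 2, "Tm": 168.0},
    {"Study_id": 3, "Instance": 1, "Tm": 172.0},
]
merged = merge_on_keys(physchem, thermal)
```

Using the pair of columns as a composite key matters because a single study ID can cover several instances with different formulations.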

Before the model was developed, a series of preprocessing and feature engineering steps was applied. For samples in which one of the weight-average molecular weight (Mw_PHBV), the number-average molecular weight (Mn_PHBV), and the PDI was not reported while the other two were available, the missing value was recovered using the standard polymer relationship:

M_{w} = \mathrm{PDI} \times M_{n}

(2)

In Table 2, Mw values before and after reconstruction are summarized, together with the percentage of filled entries compared to the full dataset of 572 samples (instances). From the perspective of data completeness, Mw achieved the highest coverage in the curated dataset after reconstruction, i.e., 64.2%, compared to 41.1% for Mn and 40.4% for PDI. Beyond data availability, Mw was also the most relevant molecular weight descriptor for the thermal properties of semicrystalline PHB and PHBV. An explanation was that Mw reflected the length of the polymer chains produced during fermentation for this type of biopolyester. Longer chains, i.e., higher Mw, resulted in a more entangled, cohesive material with higher Tg and Tm. In contrast, shorter chains, i.e., lower Mw, behaved more like diluents within the polymer matrix, reducing both Tm and Tc [39]. Furthermore, Mn captured the shorter chain fraction of the distribution and was more sensitive to the presence of oligomers; hence, it was a less stable predictor of bulk thermal behavior. Finally, PDI described the width of the molecular weight distribution, which also affected the thermal properties.
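The reconstruction based on Equation (2) can be sketched as follows. The function fills a value only when exactly one of the three quantities is missing, which is an interpretation of the procedure described above; the function name and the example numbers are hypothetical.

```python
def fill_molecular_weights(mw=None, mn=None, pdi=None):
    """Fill a single missing value among Mw, Mn and PDI using Mw = PDI * Mn.

    If zero, two, or three values are missing, the inputs are returned unchanged.
    """
    n_missing = sum(v is None for v in (mw, mn, pdi))
    if n_missing == 1:
        if mw is None:
            mw = pdi * mn
        elif mn is None:
            mn = mw / pdi
        else:
            pdi = mw / mn
    return mw, mn, pdi


# Hypothetical sample reporting Mn and PDI but not Mw (g/mol)
mw, mn, pdi = fill_molecular_weights(mn=100000.0, pdi=2.0)
```

Because the relationship is exact by definition of PDI, values filled this way add no new measurement information; they only make an already implied quantity explicit.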

The original additive type labels showed significant heterogeneity due to differences in terminology. For example, “plasticizer” and “plastisizer” were included as separate labels, as well as several categories with fewer than ten instances. To reduce categorical sparsity and improve model robustness, additive types were grouped into a reduced number of meaningful classes. In this study, this involved a trade-off between data availability and the specific roles of the additives in governing mechanisms such as crystallization. Therefore, the model learnt a generalizable representation of an additive function, in line with established QSPR/QSAR (Quantitative Structure–Activity Relationship) methodologies for categorical feature encoding [40]. This decision also prevented excessive dimensionality from one-hot encoding in relation to the limited dataset size, thereby preserving an appropriate sample-to-feature ratio and reducing the risk of overfitting.

“Plasticizer,” “filler,” “polymer_modifier,” and “stabilizer” were the four defined categories. Each category corresponded to a distinct mechanism of thermal property modification in PHB/PHBV formulations. More specifically, plasticizers increased the fractional free volume and reduced both Tg and Tm by disrupting the chain packing regularity [41]. Fillers, nucleating agents, and reinforcements were also grouped based on their similarity in providing heterogeneous nucleation sites or restricting bulk chain mobility, therefore elevating and modifying crystallinity [42]. Additionally, polymer modifiers changed the structure and interaction of the polymer chain by adding new particles to improve properties, and, finally, stabilizers prevented the polymer chains from breaking when heated, keeping their molecular weight stable during processing [43]. Supplementary material related to category grouping is showcased in Table 3 and Table 4, which illustrate the mapping of original to merged categories for “Additive_type_1” and “Additive_type_2”, respectively.
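A label-harmonization step of this kind is commonly implemented as a lookup table. The sketch below is illustrative only: apart from the "plasticizer"/"plastisizer" variant and the four merged classes named in the text, the raw label strings are hypothetical, and the actual mapping is the one given in Table 3 and Table 4.

```python
# Raw label -> merged category. Only the spelling variant and the four target
# classes come from the text; the other raw labels are hypothetical examples.
MERGED_CATEGORY = {
    "plasticizer": "plasticizer",
    "plastisizer": "plasticizer",      # spelling variant mentioned in the text
    "filler": "filler",
    "nucleating agent": "filler",      # grouped with fillers per the text
    "reinforcement": "filler",         # grouped with fillers per the text
    "polymer_modifier": "polymer_modifier",
    "stabilizer": "stabilizer",
}


def merge_additive_label(label):
    """Map a raw additive label to its merged class, ignoring case and padding."""
    key = label.strip().lower()
    if key not in MERGED_CATEGORY:
        raise ValueError(f"unmapped additive label: {label!r}")
    return MERGED_CATEGORY[key]


merged_labels = [merge_additive_label(x) for x in ["Plastisizer ", "Nucleating agent", "stabilizer"]]
```

Raising an error for unmapped labels, rather than passing them through, forces each new label to be reviewed and assigned to a class deliberately.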

Missing additive weight fractions were considered as the absence of the corresponding additives. As such, missing values in the additive variables were replaced with zero. Similarly, for samples where the additive was absent, the corresponding additive type was assigned the category “not applicable”. In the final merged dataset, column features with more than 70% missing values and columns containing irrelevant information (e.g., “Additive1_name”, “Temperature_units”) were excluded. The cleaned dataset with completeness percentages of every feature is presented in Table 5.
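The missing-value rules in this paragraph can be sketched in plain Python as below. The column `Sparse_col` and all values are hypothetical, and the 70% threshold is applied as "strictly more than 70% missing", which is how the sentence above reads.

```python
def fill_additive_defaults(row, pct_col, type_col):
    """Treat a missing additive percentage as 0 and label absent additives."""
    out = dict(row)
    if out.get(pct_col) is None:
        out[pct_col] = 0.0
    if out[pct_col] == 0.0:
        out[type_col] = "not applicable"
    return out


def drop_sparse_columns(rows, max_missing=0.70):
    """Drop columns whose fraction of missing (None) values exceeds max_missing."""
    columns = sorted({c for r in rows for c in r})
    keep = []
    for c in columns:
        missing = sum(1 for r in rows if r.get(c) is None)
        if missing / len(rows) <= max_missing:
            keep.append(c)
    return [{c: r.get(c) for c in keep} for r in rows]


# Hypothetical rows; "Sparse_col" is filled in only 1 of 4 rows (75% missing)
rows = [
    {"Mw": 4.5e5, "Additive1_percentage": 10.0, "Additive1_type": "plasticizer", "Sparse_col": 1.0},
    {"Mw": None, "Additive1_percentage": None, "Additive1_type": None, "Sparse_col": None},
    {"Mw": 3.8e5, "Additive1_percentage": None, "Additive1_type": None, "Sparse_col": None},
    {"Mw": 5.1e5, "Additive1_percentage": 5.0, "Additive1_type": "filler", "Sparse_col": None},
]
filled = [fill_additive_defaults(r, "Additive1_percentage", "Additive1_type") for r in rows]
cleaned = drop_sparse_columns(filled)
```

Note that replacing a missing additive fraction with zero encodes the assumption that "not reported" means "not present", which is the convention stated above.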

Figure 4 presents the distributions of 14 numerical features, showing their range and skewness. In Figure 5, the distributions of the two categorical features are presented.

2.4. Thermal Properties and QSPR Model Development and Validation

The study focused on predicting the key thermal properties of the polymer, namely, Tm, Tc, and Tg. Descriptive statistics of the three thermal properties are summarized in Table 6. In each case, one property was treated as the target variable, and two different predictive models (RF and XGBoost) were developed for each thermal property. To examine the influence of thermal descriptors on model performance, a sensitivity analysis was performed before final model selection for the prediction of all three thermal properties.

More precisely, for each target, multiple feature configurations were examined by including different thermal properties as additional inputs. The basic feature set comprised seven molecular and compositional descriptors: “HB_ratio_formulation”, “HV_ratio_formulation”, “Mw”, “Additive1_percentage”, “Additive1_type”, “Additive2_percentage,” and “Additive2_type”. Since Tm, Tc, and Tg are thermodynamically interrelated properties of PHB/PHBV-based systems, the potential benefit of incorporating one or two thermal properties as additional input features was systematically investigated.

Three feature space configurations were evaluated for each target property: (1) the basic feature set alone, (2) the basic set augmented with one additional thermal descriptor, and (3) the basic set augmented with both remaining thermal descriptors. This procedure limits the practical applicability of the proposed framework but was used to obtain reliable results given the limited and heterogeneous dataset. Although the inclusion of thermal descriptors generally led to improved or comparable predictive performance, this improvement resulted in a reduction in available training data, highlighting a trade-off between model accuracy and data coverage. Note that for Tc prediction, the model with basic features yielded the best overall performance. In practice, the use of thermal descriptors may also restrict applicability in early-stage material design scenarios where such properties are not yet available. Table 7 reports the dataset size corresponding to each target value and the number of features considered in each modeling scenario.

It should be noted that the reduction from the full raw dataset of 572 instances to smaller effective dataset sizes in the final models was due to data completeness requirements, since only samples with measured values for all descriptors included in a given feature space configuration could be used. Consequently, the reported model performance reflects learning from these reduced effective dataset sizes rather than from the full database.
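The effect of the completeness requirement on dataset size can be illustrated with a minimal sketch. The seven basic descriptor names follow the text; the three toy records, the `THERMAL` list, and the helpers `feature_configs`, `complete_cases`, and `toy` are hypothetical and serve only to show why adding thermal descriptors shrinks the usable dataset. Note that the "one additional thermal descriptor" level corresponds to two possible feature sets per target.

```python
from itertools import combinations

# Seven basic descriptors named in the text.
BASIC = [
    "HB_ratio_formulation", "HV_ratio_formulation", "Mw",
    "Additive1_percentage", "Additive1_type",
    "Additive2_percentage", "Additive2_type",
]
THERMAL = ["Tm", "Tc", "Tg"]


def feature_configs(target):
    """Feature sets for one target: basic, basic + one, basic + both other thermal inputs."""
    others = [t for t in THERMAL if t != target]
    configs = []
    for k in range(len(others) + 1):
        for extra in combinations(others, k):
            configs.append(BASIC + list(extra))
    return configs


def complete_cases(records, features, target):
    """Keep only records with a measured (non-None) value for every feature and the target."""
    needed = features + [target]
    return [r for r in records if all(r.get(c) is not None for c in needed)]


def toy(tm, tc, tg):
    """Build an invented record; None marks a thermal value that was not reported."""
    row = {name: 0.0 for name in BASIC}
    row["Additive1_type"] = row["Additive2_type"] = "none"
    row.update({"Tm": tm, "Tc": tc, "Tg": tg})
    return row


records = [toy(170.0, 110.0, 2.0), toy(165.0, None, 1.0), toy(None, 105.0, None)]
for feats in feature_configs("Tm"):
    # Number of features versus number of usable samples for this configuration.
    print(len(feats), len(complete_cases(records, feats, "Tm")))
```

In this toy case, the basic set retains two of the three records for Tm, whereas the configuration with both Tc and Tg as inputs retains only one.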

After selection of the target variable and the feature space, the dataset was split into training and test sets to enable robust evaluation of model performance. Thereafter, the effect of statistically extreme values was examined. Feature-based selection was carried out using dimensionality reduction through Factor Analysis of Mixed Data (FAMD), and outlier detection was performed using the interquartile range (IQR) method [44,45]. Given the limited size of the dataset, the effect of outlier exclusion was systematically evaluated as part of the sensitivity analysis. In each case, only a small number of samples were removed, corresponding to data points exceeding 7 times the interquartile range of the target variable. Note that data points that appear statistically extreme may still carry physically meaningful and polymer-relevant information [46,47,48].
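A minimal sketch of an IQR-based outlier flag is given below. It assumes Tukey-style fences, that is, a point is flagged when it lies below Q1 − k·IQR or above Q3 + k·IQR with k = 7; the text states only that removed points exceeded 7 times the IQR of the target variable, so the exact fence definition, the quartile interpolation method, the function name `iqr_outlier_mask`, and the sample values are assumptions.

```python
from statistics import quantiles


def iqr_outlier_mask(values, k=7.0):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR] (assumed fence definition)."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v < lower or v > upper for v in values]


# Invented target values in °C; the last one is deliberately extreme.
tm_values = [168.0, 170.0, 171.0, 172.0, 173.0, 175.0, 260.0]
mask = iqr_outlier_mask(tm_values)
kept = [v for v, is_outlier in zip(tm_values, mask) if not is_outlier]
print(kept)
```

With such a wide multiplier, only points far from the bulk of the distribution are removed, which is in line with the small number of exclusions reported.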

3. Results and Discussion

3.1. Data Curation

A total of 109 peer-reviewed studies were systematically mined and integrated with an in-house experimental dataset. In this manner, a structured data library of PHB/PHBV-based materials incorporating various additives and building blocks was manually curated. The curated raw dataset used in the data preprocessing stage is publicly available in the associated data repository (AUA Zenodo repository) [32]. An overview of the compositional, molecular, physicochemical, and thermal descriptors and features that were extracted from the referenced studies is provided in Table A2 in Appendix A.2.

3.2. QSPR Model Performance

3.2.1. Sensitivity Analysis and Optimal Model Selection

Before the selection of the final model, a systematic sensitivity analysis was conducted to identify the optimal modeling configuration for each thermal property. Two factors were investigated: the composition of the input feature space and the effect of outlier exclusion on model generalization. All configurations were evaluated using both RF and XGBoost regressors, tuned via cross-validation (five folds), and compared using cross-validated (CV) R² mean and standard deviation (SD). Models exhibiting signs of overfitting (training R² > 0.99) or poor generalization (CV R² < 0.70) were excluded from further analysis. These thresholds were applied as dataset-specific quality control filters rather than universal statistical criteria, reflecting the heterogeneity and limited sample sizes of literature-derived polymer datasets. The cross-validated R² scores for all evaluated configurations are summarized in Tables 8–10, with the selected best-performing configuration highlighted in bold. The optimal models are then presented in detail in Sections 3.2.2–3.2.4.
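The two exclusion thresholds can be expressed as a simple filter over per-configuration scores. The sketch below uses invented fold scores for three hypothetical configurations; the use of the sample standard deviation and the choice of the highest mean CV R² among eligible configurations are assumptions, since the text describes the final choice as a balance between performance and stability.

```python
from statistics import mean, stdev

MAX_TRAIN_R2 = 0.99  # above this, a configuration is treated as overfitted
MIN_CV_R2 = 0.70     # below this, generalization is treated as poor


def summarize(train_r2, fold_r2):
    """Collect the training R2 and the mean and SD of the per-fold CV R2."""
    return {"train_r2": train_r2, "cv_mean": mean(fold_r2), "cv_sd": stdev(fold_r2)}


def is_eligible(summary):
    """Apply both quality-control thresholds from the text."""
    return summary["train_r2"] <= MAX_TRAIN_R2 and summary["cv_mean"] >= MIN_CV_R2


# Invented scores for three hypothetical configurations (five folds each).
candidates = {
    "config_A": summarize(0.95, [0.80, 0.78, 0.84, 0.79, 0.82]),
    "config_B": summarize(0.995, [0.85, 0.83, 0.86, 0.84, 0.85]),
    "config_C": summarize(0.90, [0.60, 0.65, 0.70, 0.62, 0.68]),
}
eligible = {name: s for name, s in candidates.items() if is_eligible(s)}
best = max(eligible, key=lambda name: eligible[name]["cv_mean"])
print(best, round(eligible[best]["cv_mean"], 3), round(eligible[best]["cv_sd"], 3))
```

Here `config_B` is rejected despite its high CV scores because its training R² exceeds 0.99, and `config_C` is rejected for a mean CV R² below 0.70.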

For Tm (Table 8), multiple configurations across both algorithms and feature spaces satisfied the selection criteria, indicating that Tm prediction is robust to different modeling choices and that outlier exclusion consistently improved cross-validation stability. In contrast, Tc and Tg showed considerably fewer eligible configurations, reflecting greater sensitivity to feature space composition and preprocessing strategy; see Table 9 and Table 10, respectively. For Tc, only two configurations met the criteria, with the best performance achieved by XGBoost using the basic feature set alone, suggesting that Tc is best predicted from molecular and compositional descriptors without additional thermal inputs. For Tg, RF with all features and outlier exclusion yielded the best result. The diversity of the selected models (two RF and one XGBoost; two configurations with two thermal descriptors and one with basic features alone) reflects the target-dependent nature of thermal property prediction in PHB/PHBV systems and underlines the importance of systematic sensitivity analysis before final model selection. For Tg and Tm, models using only compositional features (basic set) represent a more realistic scenario for new or unseen samples, but performance improved when other thermal properties were included as inputs. This does not indicate classical data leakage but rather learning driven by strong relationships between the thermal properties of polymers. A restriction is that these additional properties are often unknown in real-world prediction settings. Expanding and improving the dataset is nevertheless expected to further enhance model performance and robustness, even without relying on additional thermal input features.

3.2.2. Tm Prediction Model Performance

The optimal Tm prediction model was identified as the tuned RF regressor trained on the basic feature set augmented with Tc and Tg as additional thermal descriptors, with outlier exclusion applied to the training set. The nine features used for model development are summarized in the supplementary data in Table 11. The dataset comprised 118 samples and was split into training and test sets using an 80/20 ratio, resulting in 94 samples for training and 24 samples for testing.

The optimal hyperparameter values obtained via cross-validated grid search are reported in Table 12. The performance of the tuned model is reported in Table 13.

This configuration achieved a cross-validated R² of 0.817 and a training R² of 0.935, representing the most favorable balance between predictive performance and generalization stability among all evaluated Tm models. The relatively modest train–CV gap, combined with the low CV standard deviation, confirmed that this configuration was the most robust and reliable for Tm prediction within the constraints of the available dataset.

A comparison of the predicted and the true Tm for the tuned RF model is presented in Figure 6.

SHAP analysis was further undertaken to assess feature importance by examining how much each input variable contributed to the model’s predictions. The resulting SHAP summary plots (Figure 7) provided a global view of feature effects, revealing both the magnitude and direction of each feature’s influence on the model output. The most influential features identified by SHAP analysis were associated with crystal formation and the stability of semicrystalline polymers. In particular, higher HV content was related to lower Tm values. This trend is consistent with the literature reporting that HV incorporation can disrupt chain regularity and reduce crystallinity in PHB/PHBV systems. Similarly, lower Mw was associated with reduced Tm values, which may reflect the effect of shorter chain length and a higher density of chain ends on crystal stability.

Higher Tc values, in turn, were associated with higher Tm, which may reflect the formation of thicker and more perfect crystals. Tg was also positively associated with Tm. This behavior may be explained by the reduced mobility of the amorphous phase of the polymers, which has been reported to promote crystal formation and a higher Tm, as more energy is required to melt the ordered crystalline structures. The analysis further indicated that the type of additive and its percentage were also important features in the prediction of Tm. This observation may reflect the influence of additives on the crystallization behavior and chain mobility of the final biopolymer formulation and, in turn, on its Tm. However, these effects depend on the nature and function of each additive used.
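The notions of magnitude and direction of a feature attribution can be made concrete with a brute-force Shapley value calculation. The sketch below is not the SHAP analysis performed in the study: it computes exact Shapley values relative to a single baseline point for an invented two-feature linear stand-in model (`toy_tm_model`, whose coefficients are not fitted to any data), whereas SHAP applies dedicated algorithms to the trained ensemble. All names and numbers are hypothetical.

```python
from itertools import permutations
from math import factorial


def shapley_values(predict, sample, baseline):
    """Exact Shapley values of `predict` at `sample`, relative to one baseline point."""
    names = list(sample)
    phi = dict.fromkeys(names, 0.0)
    for order in permutations(names):
        current = dict(baseline)
        previous = predict(current)
        for name in order:
            current[name] = sample[name]   # switch this feature from baseline to sample
            value = predict(current)
            phi[name] += value - previous  # marginal contribution in this ordering
            previous = value
    n_orderings = factorial(len(names))
    return {name: total / n_orderings for name, total in phi.items()}


def toy_tm_model(row):
    # Invented linear stand-in for a trained model; coefficients are illustrative only.
    return 175.0 - 0.8 * row["HV_ratio_formulation"] + 0.1 * row["Tc"]


sample = {"HV_ratio_formulation": 20.0, "Tc": 110.0}
baseline = {"HV_ratio_formulation": 5.0, "Tc": 100.0}
phi = shapley_values(toy_tm_model, sample, baseline)
print(phi)
```

In this toy case the HV attribution is negative and the Tc attribution is positive, and the attributions sum to the difference between the prediction for the sample and the prediction for the baseline, which is the additivity property that SHAP summary plots rely on.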

3.2.3. Tc Prediction Model Performance

Based on the results, the best Tc prediction model was the tuned XGBoost regressor trained on the basic physicochemical and compositional feature set alone without outlier exclusion; see Table 14. The dataset consisted of 201 samples and was split into a training set with 160 samples and a test set with 41 samples using an 80/20 ratio.
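The split sizes reported for the Tm and Tc datasets can be reproduced with a short calculation. The sketch assumes that the test share is rounded up and the training set takes the remainder, which is the rule used by scikit-learn's `train_test_split`; the text does not state which rounding was applied, and the function name `split_sizes` is hypothetical.

```python
from fractions import Fraction
from math import ceil


def split_sizes(n_samples, test_percent=20):
    """Return (n_train, n_test), rounding the test share up (assumed rule)."""
    # Exact rational arithmetic avoids floating-point surprises at integer boundaries.
    n_test = ceil(Fraction(test_percent, 100) * n_samples)
    return n_samples - n_test, n_test


print(split_sizes(118))  # Tm dataset
print(split_sizes(201))  # Tc dataset
```

Under this assumption, 118 samples give 94 training and 24 test samples, and 201 samples give 160 and 41, matching the reported values.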

The hyperparameters used in the tuned model are summarized in Table 15.

This configuration achieved a cross-validated R² of 0.762 ± 0.054 and a training R² of 0.972, as shown in Table 16. Closer inspection indicated that, unlike for Tm and Tg, the optimal Tc feature space contained no additional thermal descriptors. This is plausible because the crystallization temperature is a kinetic parameter influenced by the interplay between chain mobility and nucleation, which is partially reflected in formulation descriptors such as HV content and Mw. Including additional thermal properties as input features did not improve Tc prediction and, in several cases, increased the variance, consistent with this physical interpretation.

A comparison between the predicted and the true Tc is presented in Figure 8.

The SHAP summary plot for the XGBoost model is presented in Figure 9. Based on the results, “Mw_PHBV” had the highest impact on model output, followed by “HB_ratio,” “HV_ratio,” and additive1 percentage. The results suggested that these features were relevant to the prediction of Tc, which is commonly associated in the literature with chain mobility, regularity, and intermolecular interactions. As such, higher Mw values may be correlated with reduced chain mobility due to a higher degree of entanglement and longer polymer chains, which hinder crystal formation and decrease Tc values. The same effect on the predicted Tc was observed when HV content was increased, which may be linked to crystallization becoming more difficult due to disrupted chain regularity. High HB monomer content was associated with higher Tc predictions, which may reflect the formation of more stable and organized crystals in the polymer matrix and higher structural rigidity. Although SHAP values represent feature importance and statistical associations within the model, the results support the reliability of the QSPR model, as the identified features are consistent with known factors influencing Tc.

3.2.4. Tg Prediction Model Performance

The results indicated that the optimal Tg prediction model was the tuned Random Forest regressor trained on the basic feature set augmented with both Tm and Tc as additional thermal descriptors, with outlier exclusion applied to the training set. The features employed in the Tg model are listed in Table 17. The dataset comprised the same 118 samples as for the Tm model, since both configurations draw on the same set of variables.

The hyperparameter values selected for the Tg model are presented in Table 18.

This configuration achieved a cross-validated R² of 0.765 and a training R² of 0.960 (Table 19). The analysis indicated that including the remaining thermal properties as input features reflected the well-established thermodynamic interdependencies governing the glass transition behavior in semicrystalline polymers. Tg was influenced by chain mobility and free volume, which were directly linked to the degree of crystallinity encoded in Tc and the melting behavior captured by Tm.

A comparison between the predicted and true Tg values of the tuned model is presented in Figure 10.

The SHAP analysis (Figure 11) indicated that the most influential features of the RF model included Tm, Tc, Mw, and “HV_ratio_formulation”. Upon closer inspection, the analysis showed that the glass transition temperature (Tg) reflected how easily polymer chains moved in the amorphous phase. Specifically, samples with higher Tm and Tc values tended to be associated with higher predicted Tg values. This association may reflect the relationship between stronger intermolecular interactions, greater chain regularity, and the development of crystalline structures, which are often linked to restricted molecular motion in amorphous regions. Similarly, high Mw values were associated with high Tg values, which can be explained by increased chain entanglement reducing chain mobility in the amorphous phase. In contrast, an increasing HV ratio was associated with lower predicted Tg values. This trend is consistent with reports that HV units can increase chain irregularity, reduce crystallinity, and enhance chain flexibility. In summary, the identified feature importance and the observed patterns suggest that the model captures relationships that are physically meaningful and relevant to chain mobility in the amorphous phase.

3.3. Comparison of Predictive Performance Across Thermal Properties

The synthesis of the predictive performance across the three thermal properties showed both similarities and significant differences between the targets and their optimal configurations. The systematic sensitivity analysis of feature space composition and outlier exclusion effects helped identify the best-performing configuration for each thermal property. The results showed that the Random Forest algorithm was the preferred choice for both Tm and Tg prediction, while XGBoost proved preferable for Tc, reflecting the target-dependent nature of thermal property modeling in PHB/PHBV systems.

All three optimal models achieved training R² values below 0.990, indicating that the selected configurations avoided severe overfitting while retaining sufficient model complexity to capture nonlinear structure–property relationships. Further, the generalization performance varied by target: the Tm model achieved the highest cross-validated R² of 0.817, followed by Tg at 0.765 and Tc at 0.762. These scores support the generalizability of the ML models adopted in the research. As such, XGBoost and RF could be applied across different datasets to predict the thermal properties of scl-PHAs. Despite ranking third in terms of CV R², the Tc model achieved the lowest cross-validation standard deviation among the three properties, indicating particularly stable generalization despite relying exclusively on the basic feature set without outlier exclusion.

The Tg model exhibited the lowest absolute error metrics among the three properties, with a cross-validated RMSE of 4.05 °C and an MAE of 2.68 °C. This outcome reflects both the performance of the model and the narrower range and lower variability in Tg values in the dataset, which naturally leads to smaller absolute errors independent of predictive strength. In contrast, the Tc model showed higher absolute errors but demonstrated the most stable generalization behavior across folds. Overall, these results highlight that predictive performance should be interpreted jointly with dataset-dependent property distributions, as error magnitudes alone may not directly reflect model quality across different thermal properties.
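The dependence of absolute errors on the spread of the target can be made explicit with the standard metric definitions. In the notation below, which is introduced here for illustration, $y_i$ is a measured value, $\hat{y}_i$ the corresponding prediction, $\bar{y}$ the mean of the measured values, and $n$ the number of samples in the evaluation set.

```latex
\begin{aligned}
\mathrm{RMSE} &= \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i-\hat{y}_i\right)^2}, \qquad
\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|y_i-\hat{y}_i\right|, \\
R^2 &= 1-\frac{\sum_{i=1}^{n}\left(y_i-\hat{y}_i\right)^2}{\sum_{i=1}^{n}\left(y_i-\bar{y}\right)^2}
     = 1-\frac{\mathrm{RMSE}^2}{\sigma_y^2}, \qquad
\sigma_y^2 = \frac{1}{n}\sum_{i=1}^{n}\left(y_i-\bar{y}\right)^2 .
\end{aligned}
```

For a fixed R², the RMSE therefore scales with the standard deviation $\sigma_y$ of the target, so a property with a narrow range of values yields smaller absolute errors than a property with a wide range at comparable R².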

Feature importance analysis via SHAP revealed consistent trends: compositional descriptors such as polymer Mw and the HB and HV formulation ratios were among the most influential features across all three thermal properties. These findings suggested that the models assigned higher importance to variables that are known to influence the thermal behavior of PHB/PHBV systems. The SHAP trends generally agree with relationships reported in the literature, but they should be viewed as model-based associations and measures of feature importance rather than a physically direct interpretation of the features. Further insights are expected to emerge as the dataset grows and more data become available.

Nevertheless, this study has some limitations that should be considered. The thermal property data were collected from different literature sources, where differences in DSC experimental conditions and in the definitions of transition temperatures (e.g., onset, midpoint, or peak) can lead to significant shifts in the reported thermal transition temperatures. Since processing and cooling conditions can significantly affect Tc, their absence from the dataset may further limit the physical meaning of the Tc model. Additionally, characterization protocols for compositional information were not consistently reported or standardized, leading to heterogeneity in the dataset, including differences in sample preparation histories and molecular weight determination methods. In some cases, values had to be manually extracted from published graphs using PlotDigitizer, which may introduce additional noise and uncertainty.

Also, a recurring limitation across all three models was the constrained dataset size, comprising between 118 and 201 samples depending on the target property. This challenge is inherent to the field of polymer informatics, where experimental thermal characterization is resource-intensive and published datasets are sparser than in other materials domains.

For Tg and Tm prediction, models based only on compositional features represent a more realistic scenario for predicting the properties of new materials. Although better performance was achieved when other thermal properties were included as inputs, this is mainly due to the strong relationships between thermal properties rather than data leakage. However, such information is often unavailable for unknown samples.

Despite these restrictions, this work represents a first step toward a data-driven framework for predicting PHA thermal properties. Future studies with larger datasets and more standardized experimental data and testing conditions are expected to further improve model accuracy and robustness. A further limitation is that the study compared only XGBoost and RF models. In future work, a broader range of ML models should be considered and their performance in predicting the thermal properties of scl-PHAs compared.

3.4. Model Validation, Generalization and Applicability Domain Analysis

3.4.1. Study-Wise Splitting and External Validation

To evaluate generalization across independent studies, a source-wise (leave-one-study-out) splitting strategy was initially applied. However, due to the limited number of available studies in several test folds, this approach resulted in highly unstable performance estimates. Although cross-validation performance remained relatively high (CV R² ≈ 0.73–0.78), test set performance showed large variability and, in several cases, negative R² values. These results highlight the limitations of strict study-wise splitting under extreme data scarcity rather than intrinsic model failure. In particular, the imbalance in study sizes and the heterogeneity in experimental conditions led to test sets that were not statistically representative of the training distribution.
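Why study-wise folds can produce unstable or negative R² values is illustrated by the following minimal sketch. The study labels, target values, and predictions are invented, and the helper names `leave_one_study_out` and `r2_score` are hypothetical; the sketch only shows the splitting logic and the behavior of R² on small folds with a narrow target range.

```python
from collections import defaultdict


def leave_one_study_out(study_ids):
    """Yield (held-out study, train indices, test indices), one fold per study."""
    by_study = defaultdict(list)
    for index, study in enumerate(study_ids):
        by_study[study].append(index)
    for study, test_idx in by_study.items():
        held_out = set(test_idx)
        train_idx = [i for i in range(len(study_ids)) if i not in held_out]
        yield study, train_idx, test_idx


def r2_score(y_true, y_pred):
    """Coefficient of determination; undefined (NaN) when the targets do not vary."""
    y_mean = sum(y_true) / len(y_true)
    ss_tot = sum((y - y_mean) ** 2 for y in y_true)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    return float("nan") if ss_tot == 0 else 1.0 - ss_res / ss_tot


study_ids = ["s13", "s13", "s13", "s2", "s64", "s64"]  # toy labels, not the real table
folds = list(leave_one_study_out(study_ids))

# A small fold with a narrow target range: errors of 2 °C already give R2 < 0.
narrow_true = [170.0, 171.0, 172.0]
narrow_pred = [172.0, 171.0, 170.0]
print(r2_score(narrow_true, narrow_pred))
```

Studies contributing a single instance, of which Table A1 lists several, give a test fold in which R² is not defined at all, which is one reason fold-wise estimates become erratic under this scheme.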

To address this constraint, an alternative external validation strategy was introduced based on chemically meaningful hold-out sets, where selected high-temperature observations from studies containing more than five instances were excluded from training. This approach resulted in a more balanced and interpretable evaluation of generalization performance. Predictions on these hold-out sets were assessed using the Pearson correlation coefficient, which is particularly advantageous for evaluating unseen instances, as it captures the strength of the linear relationship between predicted and experimental values independent of scale and systematic bias.
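The invariance of the Pearson coefficient to scale and systematic bias can be checked numerically. In the sketch below, the measured values and the compressed, offset predictions are invented, and the helper names `pearson_r` and `rmse` are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true))


y_true = [150.0, 160.0, 170.0, 180.0]          # invented values, in °C
y_shifted = [0.5 * v + 100.0 for v in y_true]  # compressed and offset predictions

print(pearson_r(y_true, y_shifted))
print(rmse(y_true, y_shifted))
```

The correlation remains perfect even though every prediction is off by at least 10 °C, which shows why correlation coefficients are best read together with error metrics on the same hold-out set.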

Under this scheme, the models achieved improved and more stable predictive behavior, with Pearson correlation coefficients ranging from 0.71 to 0.91 across properties, as shown in Table 20. Figure 12 shows parity plots for training, test, and external validation splits. Nevertheless, validation against fully independent literature sources remains an important direction for further research.

3.4.2. Effect of Molecular Weight Reconstruction on Model Performance

To evaluate the influence of molecular weight reconstruction on predictive performance, additional models were trained using datasets in which all reconstructed molecular weight descriptors were excluded. In these cases, the effective dataset size was substantially reduced; note that for Tm and Tg, the instances were fewer than 100. Across all thermal properties and model types, the resulting models exhibited mean cross-validation R² values below 0.60, accompanied by large variance (>0.15) across folds.

This degradation in cross-validated performance is attributed mainly to the reduced datasets, amplifying the impact of individual studies and experimental conditions. Although reconstruction introduces approximate values, the resulting models consistently achieved higher and more stable cross-validated performance, indicating an improved bias–variance balance. These results highlight that, for sparse polymer datasets, reconstruction is essential for obtaining statistically robust and generalizable models. However, this procedure may introduce additional uncertainty, which should be taken into account when interpreting the results.

3.4.3. Baseline Models and Linearity Assessment

To provide a baseline for performance comparison, several commonly used linear and kernel-based regression models were evaluated, including ordinary linear regression (LR), Ridge regression, Lasso regression, and support vector regression (SVR). These models were trained and validated using the same descriptors, data splits, and cross-validation protocol employed for the ensemble models.

Across all three target properties, linear models exhibited limited predictive capability, with mean cross-validation R² values of 0.67, 0.34 and 0.38 for Tm, Tc and Tg, respectively. Regularization through Ridge and Lasso did not yield substantial improvements, indicating that the observed limitations were not due to overfitting or multicollinearity but rather to an inability of linear formulations to capture the underlying structure–property relationships.

SVR models showed improved performance relative to linear regressions, achieving cross-validation R² values of approximately 0.80 for Tm, which is acceptable, and around 0.51–0.57 for Tc and Tg. However, SVR performance remained consistently poorer than that of the tree-based ensemble models, particularly in terms of robustness and variance across folds. The superior performance of RF and XGBoost indicated that the relationships between molecular descriptors, molecular weight characteristics, and thermal properties of PHB and PHBV were inherently nonlinear and involved higher-order feature interactions that cannot be adequately represented by linear models.

3.4.4. Effect of Data Splitting Strategy

To assess whether the use of random train/test splitting leads to optimistic performance estimates, an additional evaluation was performed using the Kennard–Stone (KS) algorithm [49] to generate a more uniform and space-filling partition of the descriptor space. This approach is commonly used to mitigate sampling bias in small, heterogeneous datasets and provides a more stringent test of model robustness compared to random splitting.
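The Kennard–Stone selection can be sketched in a few lines. The version below assumes Euclidean distances on numeric descriptors that have already been scaled; how categorical descriptors such as additive type were encoded for the distance calculation is not stated in the text, and the function name `kennard_stone` and the toy points are hypothetical.

```python
from math import dist


def kennard_stone(points, n_select):
    """Return (selected, remaining) indices using the Kennard-Stone max-min rule."""
    n = len(points)
    if not 2 <= n_select <= n:
        raise ValueError("n_select must be between 2 and the number of points")
    # Start from the two mutually most distant points.
    first, second = max(
        ((i, j) for i in range(n) for j in range(i + 1, n)),
        key=lambda pair: dist(points[pair[0]], points[pair[1]]),
    )
    selected = [first, second]
    remaining = [i for i in range(n) if i not in (first, second)]
    # Distance from each remaining point to its nearest selected point.
    nearest = {
        i: min(dist(points[i], points[first]), dist(points[i], points[second]))
        for i in remaining
    }
    while len(selected) < n_select:
        # Add the point that is farthest from everything selected so far.
        chosen = max(remaining, key=lambda i: nearest[i])
        selected.append(chosen)
        remaining.remove(chosen)
        for i in remaining:
            nearest[i] = min(nearest[i], dist(points[i], points[chosen]))
    return selected, remaining


toy_points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (10.0, 0.0), (5.0, 0.0)]
train_idx, test_idx = kennard_stone(toy_points, n_select=3)
print(train_idx, test_idx)
```

Because the selected points span the extremes of the descriptor space, the remaining points used for testing tend to lie inside the region covered by the training set.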

Across all three thermal properties, the KS-based results confirmed that the predictive performance of the models was generally stable with respect to the splitting strategy, as shown in Table 21. For Tm and Tg, model performance remained comparable to that obtained using random splitting, indicating that the originally reported results were not artificially inflated. In contrast, Tc exhibited a moderate decrease in predictive performance under the KS split, suggesting a higher sensitivity to data heterogeneity and partitioning effects for this property. The observed differences were property-dependent rather than systematic, indicating that model performance was not primarily driven by sampling bias.

3.4.5. Domain of Applicability for Model Prediction

The domain of applicability (DOA) of the developed models was defined using the observed ranges of the numerical input descriptors in the training dataset (Table 22). This analysis was performed to define the regions of chemical space where the developed models provide trustworthy predictions; outside these ranges, predictions should be treated with caution since they correspond to extrapolation.
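A range-based domain check of this kind can be sketched as follows. The training rows, their values, and the helper names `fit_ranges` and `outside_domain` are hypothetical (Mw is given in arbitrary units); the actual descriptor ranges are those reported in Table 22.

```python
def fit_ranges(train_rows, numeric_features):
    """Record the observed (min, max) of each numerical descriptor in the training set."""
    return {
        name: (min(row[name] for row in train_rows), max(row[name] for row in train_rows))
        for name in numeric_features
    }


def outside_domain(sample, ranges):
    """List the descriptors of `sample` that fall outside the training ranges."""
    return [name for name, (low, high) in ranges.items() if not low <= sample[name] <= high]


# Invented training rows.
train_rows = [
    {"HV_ratio_formulation": 0.0, "Mw": 200.0},
    {"HV_ratio_formulation": 12.0, "Mw": 450.0},
    {"HV_ratio_formulation": 20.0, "Mw": 300.0},
]
ranges = fit_ranges(train_rows, ["HV_ratio_formulation", "Mw"])

inside = outside_domain({"HV_ratio_formulation": 8.0, "Mw": 350.0}, ranges)
extrapolated = outside_domain({"HV_ratio_formulation": 35.0, "Mw": 350.0}, ranges)
print(inside, extrapolated)
```

A per-descriptor range check is simple and transparent, but it does not detect combinations of descriptor values that are individually within range yet absent from the training data.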

4. Conclusions

This study demonstrated the successful development of an ML framework for predicting thermal properties of PHB- and PHBV-based materials. After a literature search, 109 relevant studies were selected, and their findings were integrated with an in-house experimental dataset to construct a raw dataset comprising a total of 572 instances. The curated version of the dataset was employed for ML modeling, enabling reliable model development while minimizing overfitting and capturing nonlinear structure–property relationships, using 118 data points for Tg, 201 data points for Tc, and 118 data points for Tm. This reduction reflects the requirement that only instances with complete descriptor information were used for model training and evaluation, and the reported model performance therefore reflects learning from these reduced datasets.

ML QSPR models based on RF and XGBoost were developed, tuned, and validated for the prediction of the three thermal properties. Following hyperparameter optimization, the best-performing configurations achieved cross-validated R² values of 0.817, 0.762, and 0.765 for Tm, Tc, and Tg, respectively. RF outperformed XGBoost for Tm and Tg prediction, while XGBoost proved preferable for Tc. To examine generalization, a chemically informed external validation strategy was introduced. By holding out high-temperature data points from studies with sufficient sample sizes, an alternative split was performed. Under this setting, all thermal models achieved stable predictive behavior with strong Pearson correlation coefficients and consistent error metrics on unseen data.

Beyond predictive performance, the models provided interpretable structure–property insights and highlighted the importance of data quality and descriptor selection in polymer informatics. When data were limited and the database was relatively small, the combination of balanced performance metrics and explainability tools such as SHAP provided a practical and effective approach for ML modeling.

Integrating datasets with interpretable ML approaches provided physical insight, supporting rational design of sustainable polymers with tailored properties, while reducing experimental time and cost and enabling application-specific optimization. In future work, the expansion of the dataset, incorporation of more detailed molecular and processing descriptors, and evaluation of alternative ML models will further validate QSPR predictive robustness. Larger datasets will enhance the predictive capability and capture more accurately the physical meaning of the features selected, revealing the importance of open-access and organized data libraries following FAIR data principles across the scientific community. The adoption of standardized testing and characterization protocols, including DSC procedures, experimental conditions, molecular weight determination and monomer composition characterization methods, would improve data consistency and comparability across studies, reducing dataset heterogeneity and further strengthening predictive performance.

Author Contributions

Conceptualization, C.M.; methodology, N.P.S. and L.M.; software, N.P.S. and L.M.; validation, L.M.; formal analysis, L.M. and N.P.S.; investigation, N.P.S. and L.M.; experiments, J.-D.P.; resources, N.P.S.; data curation, N.P.S. and N.T.; writing—original draft preparation, N.P.S.; writing—review and editing L.M., M.I.K., K.V.F. and C.M.; visualization, N.P.S. and K.V.F.; supervision, C.M.; project administration, C.M.; funding acquisition, C.M. All authors have read and agreed to the published version of the manuscript.

Funding

This research received funding from the Horizon Europe European Commission project ANIPH (Grant Agreement No. 101181943).

Institutional Review Board Statement

Not applicable.

Data Availability Statement

The datasets [32] used to train and evaluate the ML models were derived from curated thermal properties data reported in the literature and compiled within the framework of this study. The implementation of the ML models, including data preprocessing, feature selection, model training, and evaluation scripts, is publicly available via GitHub at: https://github.com/FSL-AUA/Temperature-model (accessed on 24 May 2026).

Acknowledgments

The work presented is based on research conducted within the framework of the Horizon Europe European Commission project ANIPH (Grant Agreement No. 101181943). The content of the paper is the sole responsibility of its authors and does not necessarily reflect the views of the European Commission.

Conflicts of Interest

The authors declare no conflicts of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of this manuscript; or in the decision to publish the results.

Abbreviations

The following abbreviations are used in this manuscript:

scl	short chain length
PHAs	Polyhydroxyalkanoates
Tg	glass transition temperature
Tm	melting temperature
Tc	crystallization temperature
PHB	poly(3-hydroxybutyrate)
PHBV	poly(3-hydroxybutyrate-co-3-hydroxyvalerate)
XGBoost	Extreme Gradient Boosting
Mw	weight-average molecular weight
MPs	microplastics
NPs	nanoplastics
BBpPs	biodegradable plastic products
CE	circular economy
PP	polypropylene
PET	polyethylene terephthalate
PE	polyethylene
PVC	polyvinyl chloride
mcl	medium chain length
lcl	long chain length
Tdeg	degradation temperature
ML	machine learning
PHA	Polyhydroxyalkanoate
QSPR	Quantitative Structure–Property Relationship
QSAR	Quantitative Structure–Activity Relationship
Mn	number-average molecular weight
PDI	polydispersity index
R²	coefficient of determination
DOA	Domain of Applicability
PRISMA	Preferred Reporting Items for Systematic Reviews and Meta-Analyses
DSC	Differential Scanning Calorimetry
DOI	Digital Object Identifier
FAIR	Findability, Accessibility, Interoperability, and Reuse
AUA	Agricultural University of Athens
T_d5%	decomposition temperature at 5% weight loss
ΔH_m	heat of melting fusion
ΔH_c	heat of crystallization fusion
X_{c %}	crystallinity %
GPC	Gel Permeation Chromatography
SEC	Size Exclusion Chromatography
DRI	differential refractive index
VS	viscometry
LS	light scattering
PMMA	poly(methylmethacrylate)
NMR	Nuclear Magnetic Resonance
CDCl₃	deuterated chloroform
CEN	European Committee for Standardization
ISO	International Organization for Standardization
UNE	Una Norma Española
EN	European Norm
RH	Relative Humidity
FAMD	Factor Analysis of Mixed Data
IQR	interquartile range
HB	hydroxybutyrate
HV	hydroxyvalerate
CV	cross-validation
SHAP	Shapley Additive Explanations
RF	Random Forest
SD	standard deviation

Appendix A

Appendix A.1

In Table A1, we summarize the distribution of instances per study. Note that the study IDs are non-sequential as they were extracted from a larger data library.

Table A1. Distribution of instances across studies.

Study ID	No. of Instances	Reference	Study ID	No. of Instances	Reference
1	1	[50]	59	4	[51]
2	1	[52]	60	2	[53]
3	3	[54]	61	1	[55]
4	1	[56]	62	3	[57]
5	2	[58]	63	4	[59]
6	1	[60]	64	13	[61]
7	1	[62]	65	9	[63]
8	2	[64]	66	4	[65]
9	2	[66]	67	10	[67]
10	1	[68]	69	4	[69]
11	4	[70]	70	7	[71]
12	2	[72]	71	7	[73]
13	20	[74]	72	8	[75]
14	4	[76]	73	6	[77]
15	5	[78]	74	7	[79]
16	5	[80]	75	11	[81]
17	2	[82]	77	9	[83]
18	14	-	78	16	[84]
19	1	[85]	79	4	[86]
20	6	[87]	80	5	[88]
21	4	[89]	81	5	[90]
22	1	[91]	82	5	[92]
23	6	[93]	83	7	[94]
24	5	[95]	84	9	[96]
28	2	[97]	85	5	-
29	1	[98]	86	3	[99]
30	1	[100]	87	22	[101]
31	9	[102]	88	5	[103]
32	1	[104]	89	4	[105]
33	7	[106]	90	3	[107]
34	8	[108]	91	4	[109]
35	12	[110]	92	5	[111]
36	9	[112]	93	8	[113]
37	3	[114]	94	7	[115]
38	5	[116]	96	2	[117]
39	1	[118]	97	2	[119]
40	3	[120]	98	2	[121]
41	5	[122]	99	6	[123]
42	10	[124]	100	4	[125]
43	6	[126]	102	5	[127]
44	7	[128]	103	4	[129]
45	3	[130]	104	2	[131]
46	6	[132]	105	4	[133]
47	5	[134]	106	4	[135]
48	6	[136]	107	11	[137]
49	5	[138]	108	6	[139]
50	5	[140]	109	3	[141]
51	6	[142]	110	4	[143]
52	4	[144]	111	6	[145]
53	5	[146]	112	5	[147]
54	5	[148]	114	5	[149]
55	4	[150]	115	7	[151]
56	7	[152]	116	3	[153]
57	7	[154]	117	1	[155]
58	6	[156]	118	2	[157]
			Total	572
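
The tally in Table A1 can be cross-checked with a short script. The sketch below is only an illustration: the counts are transcribed by hand from the table, and the variable names (`COUNTS`, `missing_ids`) are hypothetical and not part of the published data library. It sums the instances and lists the study IDs that are skipped because the studies were drawn from a larger library.

```python
# Per-study instance counts transcribed from Table A1 (study ID -> instances).
# The dictionary name and layout are illustrative assumptions.
COUNTS = {
    1: 1, 2: 1, 3: 3, 4: 1, 5: 2, 6: 1, 7: 1, 8: 2, 9: 2, 10: 1,
    11: 4, 12: 2, 13: 20, 14: 4, 15: 5, 16: 5, 17: 2, 18: 14, 19: 1, 20: 6,
    21: 4, 22: 1, 23: 6, 24: 5, 28: 2, 29: 1, 30: 1, 31: 9, 32: 1, 33: 7,
    34: 8, 35: 12, 36: 9, 37: 3, 38: 5, 39: 1, 40: 3, 41: 5, 42: 10, 43: 6,
    44: 7, 45: 3, 46: 6, 47: 5, 48: 6, 49: 5, 50: 5, 51: 6, 52: 4, 53: 5,
    54: 5, 55: 4, 56: 7, 57: 7, 58: 6,
    59: 4, 60: 2, 61: 1, 62: 3, 63: 4, 64: 13, 65: 9, 66: 4, 67: 10, 69: 4,
    70: 7, 71: 7, 72: 8, 73: 6, 74: 7, 75: 11, 77: 9, 78: 16, 79: 4, 80: 5,
    81: 5, 82: 5, 83: 7, 84: 9, 85: 5, 86: 3, 87: 22, 88: 5, 89: 4, 90: 3,
    91: 4, 92: 5, 93: 8, 94: 7, 96: 2, 97: 2, 98: 2, 99: 6, 100: 4, 102: 5,
    103: 4, 104: 2, 105: 4, 106: 4, 107: 11, 108: 6, 109: 3, 110: 4, 111: 6,
    112: 5, 114: 5, 115: 7, 116: 3, 117: 1, 118: 2,
}

# Total number of instances across all studies.
total = sum(COUNTS.values())

# Study IDs absent from the table (the IDs are non-sequential).
missing_ids = [i for i in range(min(COUNTS), max(COUNTS) + 1) if i not in COUNTS]

print("studies:", len(COUNTS))
print("instances:", total)
print("skipped IDs:", missing_ids)
```

Run as written, the sum should reproduce the total reported in the last row of Table A1.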

Appendix A.2

Table A2 summarizes the features included in the thermal properties database for PHB/PHBV-based materials, which was uploaded to the AUA Zenodo repository [32]. For each feature, the table gives the name used in the dataset, a concise scientific description, the measurement unit where applicable, the observed value range across all samples, and the data type. The dataset comprises 25 numerical and 12 categorical features covering composition, additives, and physicochemical and thermal properties. The data were obtained from references [50–157].

Table A2. Overview of input (feature) and output (target) variables used to construct the thermal properties data library of scl-PHAs.

Feature	Description	Units	Range	Type
Study_id	Unique identifier for study instance	None	varies	categorical
Instance	Experimental run number	None	varies	categorical
Monomer A, Monomer B	Primary monomers	None	HB, HV	categorical
3 HB_ratio	Initial 3-hydroxybutyrate ratio	mol%	0–100	numerical
3 HV_ratio	Initial 3-hydroxyvalerate ratio	mol%	0–100	numerical
HB_ratio_formulation	3-hydroxybutyrate ratio in formulation	mol%	0–100	numerical
HV_ratio_formulation	3-hydroxyvalerate ratio in formulation	mol%	0–100	numerical
Molecular Weight (Mw)	Weight-average molar mass	g/mol	23,900–4,400,000	numerical
Molecular Weight (Mn)	Number-average molar mass	g/mol	35,575–2,101,000	numerical
Polydispersity Index (PDI)	Polydispersity Index	ratio	1.06–7	numerical
Density	Density of the polymer	g/cm³	1.09–1.25	numerical
Density_units	Units of density	g/cm³	-	categorical
Additive_1, Additive_2	Existence of Additives	None	Yes/No	categorical
Additive_1 name, Additive_2 name	Name of the additive(s)	None	varies	categorical
Additive_1 Percentage, Additive_2 Percentage	Content of Additive_1 and Additive_2	wt%	0–100	numerical
Additive_Type	Type of additive (e.g., filler, plasticizer)	None	varies	categorical
PHB(V)_percentage	PHB or PHBV content in final formulation	%	0–100	numerical
Tg1, Tg2	Glass transition temperature	Celsius (°C)	−48 to 43	numerical
Tc1, Tc2, Tmc	Crystallization temperature	Celsius (°C)	15.5–132	numerical
Tm1, Tm2	Melting temperature	Celsius (°C)	52–180	numerical
Td5	Degradation temperature where 5% of the material’s mass has decomposed	Celsius (°C)	158–329	numerical
Tdeg	Degradation temperature	Celsius (°C)	240.4–455	numerical
Temperature_units	Units of temperature	Celsius (°C)	-	categorical
DHm1, DHm2	Enthalpy change (DH) associated with the melting process	Joule/gram (J/g)	0.5–109.3	numerical
DHc1, DHc2	Enthalpy change (DH) associated with the crystallization process	Joule/gram (J/g)	0.013–97.1	numerical
Enthalpy_units	Units of enthalpy	Joule/gram (J/g)	-	categorical
Crystallinity	Degree to which a material has a well-ordered, repeating atomic or molecular structure	%	0–100	numerical
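
The observed ranges in Table A2 can also serve as a simple plausibility screen when new records are added to the library. The following is a minimal sketch under stated assumptions: the short keys (`HV_ratio`, `Mw`, `Tm`, and so on) and the helper names `out_of_range` and `pdi` are hypothetical and do not correspond to the exact column names of the Zenodo files, and the example record is invented for illustration. The only relation used beyond the table is the standard definition PDI = Mw/Mn.

```python
# Observed ranges copied from Table A2. The short keys are hypothetical
# placeholders, not the actual column names of the published dataset.
RANGES = {
    "HV_ratio": (0.0, 100.0),          # mol%
    "Mw": (23_900.0, 4_400_000.0),     # g/mol
    "Mn": (35_575.0, 2_101_000.0),     # g/mol
    "PDI": (1.06, 7.0),                # dimensionless
    "Density": (1.09, 1.25),           # g/cm^3
    "Tg": (-48.0, 43.0),               # deg C
    "Tc": (15.5, 132.0),               # deg C
    "Tm": (52.0, 180.0),               # deg C
}


def pdi(mw, mn):
    """Polydispersity index, defined as Mw / Mn."""
    return mw / mn


def out_of_range(record):
    """Return the names of fields lying outside the ranges of Table A2.

    Missing fields are skipped, since many literature records are incomplete.
    """
    flagged = []
    for name, (low, high) in RANGES.items():
        value = record.get(name)
        if value is not None and not (low <= value <= high):
            flagged.append(name)
    return flagged


# Invented example record (not taken from the dataset).
example = {"HV_ratio": 12.0, "Mw": 450_000.0, "Mn": 180_000.0, "Tg": 2.0, "Tm": 195.0}
example["PDI"] = pdi(example["Mw"], example["Mn"])

print(out_of_range(example))
```

A flagged field is not necessarily an error. It only indicates a value outside the span observed in the compiled data, which is where tree-based models such as RF and XGBoost cannot extrapolate.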

References

Kumar, R.; Verma, A.; Shome, A.; Sinha, R.; Sinha, S.; Jha, P.K.; Kumar, R.; Kumar, P.; Shubham; Das, S.; et al. Impacts of Plastic Pollution on Ecosystem Services, Sustainable Development Goals, and Need to Focus on Circular Economy and Policy Interventions. Sustainability 2021, 13, 9963. [Google Scholar] [CrossRef]
Gundlapalli, M.; Ganesan, S. Polyhydroxyalkanoates (PHAs): Key Challenges in Production and Sustainable Strategies for Cost Reduction within a Circular Economy Framework. Results Eng. 2025, 26, 105345. [Google Scholar] [CrossRef]
Acharjee, S.A.; Bharali, P.; Gogoi, B.; Sorhie, V.; Walling, B.; Alemtoshi. PHA-Based Bioplastic: A Potential Alternative to Address Microplastic Pollution. Water Air Soil Pollut. 2023, 234, 21. [Google Scholar] [CrossRef] [PubMed]
Schirmeister, C.G.; Mülhaupt, R. Closing the Carbon Loop in the Circular Plastics Economy. Macromol. Rapid Commun. 2022, 43, 2200247. [Google Scholar] [CrossRef] [PubMed]
Vidal, F.; Van Der Marel, E.R.; Kerr, R.W.F.; McElroy, C.; Schroeder, N.; Mitchell, C.; Rosetto, G.; Chen, T.T.D.; Bailey, R.M.; Hepburn, C.; et al. Designing a Circular Carbon and Plastics Economy for a Sustainable Future. Nature 2024, 626, 45–57. [Google Scholar] [CrossRef] [PubMed]
On the Plastics Crisis. Nat. Sustain. 2023, 6, 1137. [CrossRef]
Koller, M.; Heeney, D.; Mukherjee, A. Biodegradability of Polyhydroxyalkanoate (PHA) Biopolyesters in Nature: A Review. Biodegradation 2025, 36, 76. [Google Scholar] [CrossRef] [PubMed]
Ojumu, T.V.; Yu, J.; Solomon, B.O. Production of Polyhydroxyalkanoates, a Bacterial Biodegradable Polymer. Afr. J. Biotechnol. 2004, 3, 18–24. [Google Scholar] [CrossRef]
Możejko-Ciesielska, J.; Kiewisz, R. Bacterial Polyhydroxyalkanoates: Still Fabulous? Microbiol. Res. 2016, 192, 271–282. [Google Scholar] [CrossRef] [PubMed]
Sudesh, K.; Abe, H.; Doi, Y. Synthesis, Structure and Properties of Polyhydroxyalkanoates: Biological Polyesters. Prog. Polym. Sci. 2000, 25, 1503–1555. [Google Scholar] [CrossRef]
Muneer, F.; Rasul, I.; Azeem, F.; Siddique, M.H.; Zubair, M.; Nadeem, H. Microbial Polyhydroxyalkanoates (PHAs): Efficient Replacement of Synthetic Polymers. J. Polym. Environ. 2020, 28, 2301–2323. [Google Scholar] [CrossRef]
Zhou, W.; Bergsma, S.; Colpa, D.I.; Euverink, G.-J.W.; Krooneman, J. Polyhydroxyalkanoates (PHAs) Synthesis and Degradation by Microbes and Applications towards a Circular Economy. J. Environ. Manag. 2023, 341, 118033. [Google Scholar] [CrossRef] [PubMed]
Behera, S.; Priyadarshanee, M.; Vandana; Das, S. Polyhydroxyalkanoates, the Bioplastics of Microbial Origin: Properties, Biochemical Synthesis, and Their Applications. Chemosphere 2022, 294, 133723. [Google Scholar] [CrossRef] [PubMed]
Dalton, B.; Bhagabati, P.; De Micco, J.; Padamati, R.B.; O’Connor, K. A Review on Biological Synthesis of the Biodegradable Polymers Polyhydroxyalkanoates and the Development of Multiple Applications. Catalysts 2022, 12, 319. [Google Scholar] [CrossRef]
Li, Z.; Yang, J.; Loh, X.J. Polyhydroxyalkanoates: Opening Doors for a Sustainable Future. NPG Asia Mater. 2016, 8, e265. [Google Scholar] [CrossRef]
Singh, M.; Kumar, P.; Ray, S.; Kalia, V.C. Challenges and Opportunities for Customizing Polyhydroxyalkanoates. Indian J. Microbiol. 2015, 55, 235–249. [Google Scholar] [CrossRef] [PubMed]
Zhao, Y.; Mulder, R.J.; Houshyar, S.; Le, T.C. A Review on the Application of Molecular Descriptors and Machine Learning in Polymer Design. Polym. Chem. 2023, 14, 3325–3346. [Google Scholar] [CrossRef]
Kuenneth, C.; Rajan, A.C.; Tran, H.; Chen, L.; Kim, C.; Ramprasad, R. Polymer Informatics with Multi-Task Learning. Patterns 2021, 2, 100238. [Google Scholar] [CrossRef] [PubMed]
Chen, L.; Pilania, G.; Batra, R.; Huan, T.D.; Kim, C.; Kuenneth, C.; Ramprasad, R. Polymer Informatics: Current Status and Critical next Steps. Mater. Sci. Eng. R Rep. 2021, 144, 100595. [Google Scholar] [CrossRef]
Kim, C.; Chandrasekaran, A.; Huan, T.D.; Das, D.; Ramprasad, R. Polymer Genome: A Data-Powered Polymer Informatics Platform for Property Predictions. J. Phys. Chem. C 2018, 122, 17575–17585. [Google Scholar] [CrossRef]
Otsuka, S.; Kuwajima, I.; Hosoya, J.; Xu, Y.; Yamazaki, M. PoLyInfo: Polymer Database for Polymeric Materials Design. In Proceedings of the 2011 International Conference on Emerging Intelligent Data and Web Technologies; IEEE: Tirana, Albania, 2011; pp. 22–29. [Google Scholar]
Jiang, Z.; Hu, J.; Marrone, B.L.; Pilania, G.; Yu, X. (Bill). A Deep Neural Network for Accurate and Robust Prediction of the Glass Transition Temperature of Polyhydroxyalkanoate Homo- and Copolymers. Materials 2020, 13, 5701. [Google Scholar] [CrossRef] [PubMed]
Kuenneth, C.; Lalonde, J.; Marrone, B.L.; Iverson, C.N.; Ramprasad, R.; Pilania, G. Bioplastic Design Using Multitask Deep Neural Networks. Commun. Mater. 2022, 3, 96. [Google Scholar] [CrossRef]
Bejagam, K.K.; Gupta, N.S.; Lee, K.-S.; Iverson, C.N.; Marrone, B.L.; Pilania, G. Predicting the Mechanical Response of Polyhydroxyalkanoate Biopolymers Using Molecular Dynamics Simulations. Polymers 2022, 14, 345. [Google Scholar] [CrossRef] [PubMed]
Bejagam, K.K.; Lalonde, J.; Iverson, C.N.; Marrone, B.L.; Pilania, G. Machine Learning for Melting Temperature Predictions and Design in Polyhydroxyalkanoate-Based Biopolymers. J. Phys. Chem. B 2022, 126, 934–945. [Google Scholar] [CrossRef] [PubMed]
Pilania, G.; Iverson, C.N.; Lookman, T.; Marrone, B.L. Machine-Learning-Based Predictive Modeling of Glass Transition Temperatures: A Case of Polyhydroxyalkanoate Homopolymers and Copolymers. J. Chem. Inf. Model. 2019, 59, 5013–5025. [Google Scholar] [CrossRef] [PubMed]
Breiman, L. Random Forests. Mach. Learn. 2001, 45, 5–32. [Google Scholar] [CrossRef]
Chen, T.; Guestrin, C. XGBoost: A Scalable Tree Boosting System. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining; ACM: San Francisco, CA, USA, 2016; pp. 785–794. [Google Scholar]
Page, M.J.; McKenzie, J.E.; Bossuyt, P.M.; Boutron, I.; Hoffmann, T.C.; Mulrow, C.D.; Shamseer, L.; Tetzlaff, J.M.; Akl, E.A.; Brennan, S.E.; et al. The PRISMA 2020 Statement: An Updated Guideline for Reporting Systematic Reviews. BMJ 2021, 372, n71. [Google Scholar] [CrossRef] [PubMed]
AliceChem. Thermal Transitions in Amorphous and Semicrystalline Polymers, 2018. Available online: https://commons.wikimedia.org/wiki/File:Thermal_transitions_in_amorphous_and_semicrystalline_polymers.tif (accessed on 23 April 2026).
Wilkinson, M.D.; Dumontier, M.; Aalbersberg, I.J.; Appleton, G.; Axton, M.; Baak, A.; Blomberg, N.; Boiten, J.-W.; Da Silva Santos, L.B.; Bourne, P.E.; et al. The FAIR Guiding Principles for Scientific Data Management and Stewardship. Sci. Data 2016, 3, 160018. [Google Scholar] [CrossRef] [PubMed]
Agricultural University of Athens. Processability and Product Properties Data Library of scl-PHAs Under ANIPH Project. 2026. Available online: https://zenodo.org/records/19725085 (accessed on 24 May 2026).
UNE-EN ISO 11357-1; Plastics—Differential Scanning Calorimetry (DSC)—Part 1: General Principles. International Organization for Standardization (ISO): Geneva, Switzerland, 2023.
UNE-EN ISO 11357-2; Plastics—Differential Scanning Calorimetry (DSC)—Part 2: Determination of Glass Transition Temperature and Step Height. International Organization for Standardization (ISO): Geneva, Switzerland, 2020.
UNE-EN ISO 11357-3; Plastics—Differential Scanning Calorimetry (DSC)—Part 3: Determination of Temperature and Enthalpy of Melting and Crystallization. International Organization for Standardization (ISO): Geneva, Switzerland, 2018.
Barham, P.J.; Keller, A.; Otun, E.L.; Holmes, P.A. Crystallization and Morphology of a Bacterial Thermoplastic: Poly-3-Hydroxybutyrate. J. Mater. Sci. 1984, 19, 2781–2794. [Google Scholar] [CrossRef]
European Committee for Standardization (CEN). Procedure Guidelines to Determinate 3-Hydroxyvalerate Content in PHBV by Nuclear Magnetic Resonance; European Committee for Standardization: Brussels, Belgium, 2024.
ISO 527; Plastics—Determination of Tensile Properties. International Organization for Standardization (ISO): Geneva, Switzerland, 2019.
Luo, S.; Grubb, D.T.; Netravali, A.N. The Effect of Molecular Weight on the Lamellar Structure, Thermal and Mechanical Properties of Poly(Hydroxybutyrate-Co-Hydroxyvalerates). Polymer 2002, 43, 4159–4166. [Google Scholar] [CrossRef]
Soares, T.A.; Nunes-Alves, A.; Mazzolari, A.; Ruggiu, F.; Wei, G.-W.; Merz, K. The (Re)-Evolution of Quantitative Structure–Activity Relationship (QSAR) Studies Propelled by the Surge of Machine Learning Methods. J. Chem. Inf. Model. 2022, 62, 5317–5320. [Google Scholar] [CrossRef] [PubMed]
Arrieta, M.P.; López, J.; Hernández, A.; Rayón, E. Ternary PLA–PHB–Limonene Blends Intended for Biodegradable Food Packaging Applications. Eur. Polym. J. 2014, 50, 255–270. [Google Scholar] [CrossRef]
Feijoo, P.; Mohanty, A.K.; Rodriguez-Uribe, A.; Gámez-Pérez, J.; Cabedo, L.; Misra, M. Biodegradable Blends from Bacterial Biopolyester PHBV and Bio-Based PBSA: Study of the Effect of Chain Extender on the Thermal, Mechanical and Morphological Properties. Int. J. Biol. Macromol. 2023, 225, 1291–1305. [Google Scholar] [CrossRef] [PubMed]
Kirschweng, B.; Tátraaljai, D.; Földes, E.; Pukánszky, B. Natural Antioxidants as Stabilizers for Polymers. Polym. Degrad. Stab. 2017, 145, 25–40. [Google Scholar] [CrossRef]
Pagès, J. Multiple Factor Analysis by Example Using R, 1st ed.; Chapman and Hall/CRC: Boca Raton, FL, USA, 2014; ISBN 978-0-429-17108-6. [Google Scholar]
Rousseeuw, P.J.; Hubert, M. Robust Statistics for Outlier Detection. WIREs Data Min. Knowl. 2011, 1, 73–79. [Google Scholar] [CrossRef]
Kotzabasaki, M.I.; Mindrinos, L.; Sotiropoulos, N.P.; Filippou, K.V.; Maraveas, C. A Data-Driven Framework for Predicting PHBV Biodegradation-Induced Weight Loss Based on Laboratory and Real-Environment Condition Tests. Polymers 2026, 18, 897. [Google Scholar] [CrossRef] [PubMed]
Tang, W.; Li, Y.; Yu, Y.; Wang, Z.; Xu, T.; Chen, J.; Lin, J.; Li, X. Development of Models Predicting Biodegradation Rate Rating with Multiple Linear Regression and Support Vector Machine Algorithms. Chemosphere 2020, 253, 126666. [Google Scholar] [CrossRef] [PubMed]
Kotzabasaki, M.I.; Mindrinos, L.; Sotiropoulos, N.P.; Filippou, K.V.; Maraveas, C. Machine Learning Methods for Mineralization-Based Biodegradation Prediction in Polyhydroxyalkanoate-Based Biopolymers: Insights from Lab-Scale Experiments. Polymers 2026, 18, 1076. [Google Scholar] [CrossRef] [PubMed]
Galvão, R.K.H.; Araujo, M.C.U.; José, G.E.; Pontes, M.J.C.; Silva, E.C.; Saldanha, T.C.B. A Method for Calibration and Validation Subset Partitioning. Talanta 2005, 67, 736–740. [Google Scholar] [CrossRef] [PubMed]
Luo, R.; Chen, J.; Zhang, L.; Chen, G. Polyhydroxyalkanoate Copolyesters Produced by Ralstonia Eutropha PHB−4 Harboring a Low-Substrate-Specificity PHA Synthase PhaC2Ps from Pseudomonas Stutzeri 1317. Biochem. Eng. J. 2006, 32, 218–225. [Google Scholar] [CrossRef]
Figueroa-Lopez, K.J.; Vicente, A.A.; Reis, M.A.M.; Torres-Giner, S.; Lagaron, J.M. Antimicrobial and Antioxidant Performance of Various Essential Oils and Natural Extracts and Their Incorporation into Biowaste Derived Poly(3-Hydroxybutyrate-Co-3-Hydroxyvalerate) Layers Made from Electrospun Ultrathin Fibers. Nanomaterials 2019, 9, 144. [Google Scholar] [CrossRef] [PubMed]
Kang, C.-K.; Kusaka, S.; Doi, Y. Structure and Properties of Poly(3-Hydroxybutyrate-Co-4-Hydroxybutyrate) Produced by Alcaligenes Latus. Biotechnol. Lett. 1995, 17, 583–588. [Google Scholar] [CrossRef]
Brouchon, G.; Alvarez, P.; Six, A.; Lemechko, P.; Dimitriades-Lemaire, A.; Fleury, G.; Sassi, J.-F.; Bruzaud, S. Biosynthesis of Poly(3-Hydroxybutyrate-Co-3-Hydroxyvalerate) (PHBHV) Using Microalgae-Derived Starch and Levulinic Acid. Polym. Degrad. Stab. 2025, 233, 111176. [Google Scholar] [CrossRef]
Myung, J.; Flanagan, J.C.A.; Waymouth, R.M.; Criddle, C.S. Methane or Methanol-Oxidation Dependent Synthesis of Poly(3-Hydroxybutyrate-Co-3-Hydroxyvalerate) by Obligate Type II Methanotrophs. Process Biochem. 2016, 51, 561–567. [Google Scholar] [CrossRef]
Meléndez-Rodríguez, B.; Torres-Giner, S.; Reis, M.A.M.; Silva, F.; Matos, M.; Cabedo, L.; Lagarón, J.M. Blends of Poly(3-Hydroxybutyrate-Co-3-Hydroxyvalerate) with Fruit Pulp Biowaste Derived Poly(3-Hydroxybutyrate-Co-3-Hydroxyvalerate-Co-3-Hydroxyhexanoate) for Organic Recycling Food Packaging. Polymers 2021, 13, 1155. [Google Scholar] [CrossRef] [PubMed]
Zhang, J.; Kasuya, K.; Hikima, T.; Takata, M.; Takemura, A.; Iwata, T. Mechanical Properties, Structure Analysis and Enzymatic Degradation of Uniaxially Cold-Drawn Films of Poly[(R)-3-Hydroxybutyrate-Co-4-Hydroxybutyrate]. Polym. Degrad. Stab. 2011, 96, 2130–2138. [Google Scholar] [CrossRef]
Melendez-Rodriguez, B.; Reis, M.A.M.; Carvalheira, M.; Sammon, C.; Cabedo, L.; Torres-Giner, S.; Lagaron, J.M. Development and Characterization of Electrospun Biopapers of Poly(3-Hydroxybutyrate-Co-3-Hydroxyvalerate) Derived from Cheese Whey with Varying 3-Hydroxyvalerate Contents. Biomacromolecules 2021, 22, 2935–2953. [Google Scholar] [CrossRef] [PubMed]
Wang, H.; Zhou, X.; Liu, Q.; Chen, G.-Q. Biosynthesis of Polyhydroxyalkanoate Homopolymers by Pseudomonas Putida. Appl. Microbiol. Biotechnol. 2011, 89, 1497–1507. [Google Scholar] [CrossRef] [PubMed]
Sanhueza, C.; Diaz-Rodriguez, P.; Villegas, P.; González, Á.; Seeger, M.; Suárez-González, J.; Concheiro, A.; Alvarez-Lorenzo, C.; Acevedo, F. Influence of the Carbon Source on the Properties of Poly-(3)-Hydroxybutyrate Produced by Paraburkholderia Xenovorans LB400 and Its Electrospun Fibers. Int. J. Biol. Macromol. 2020, 152, 11–20. [Google Scholar] [CrossRef] [PubMed]
Mizuno, S.; Katsumata, S.; Hiroe, A.; Tsuge, T. Biosynthesis and Thermal Characterization of Polyhydroxyalkanoates Bearing Phenyl and Phenylalkyl Side Groups. Polym. Degrad. Stab. 2014, 109, 379–384. [Google Scholar] [CrossRef]
Melendez-Rodriguez, B.; Figueroa-Lopez, K.J.; Bernardos, A.; Martínez-Máñez, R.; Cabedo, L.; Torres-Giner, S.; Lagaron, J.M. Electrospun Antimicrobial Films of Poly(3-Hydroxybutyrate-Co-3-Hydroxyvalerate) Containing Eugenol Essential Oil Encapsulated in Mesoporous Silica Nanoparticles. Nanomaterials 2019, 9, 227. [Google Scholar] [CrossRef] [PubMed]
Iqbal, N.M.; Amirul, A.A. Synthesis of P(3HB-Co-4HB) Copolymer with Target-specific 4HB Molar Fractions Using Combinations of Carbon Substrates. J. Chem. Technol. Biotechnol. 2014, 89, 407–418. [Google Scholar] [CrossRef]
Mousavioun, P.; George, G.A.; Doherty, W.O.S. Environmental Degradation of Lignin/Poly(Hydroxybutyrate) Blends. Polym. Degrad. Stab. 2012, 97, 1114–1122. [Google Scholar] [CrossRef]
Shen, X.-W.; Yang, Y.; Jian, J.; Wu, Q.; Chen, G.-Q. Production and Characterization of Homopolymer Poly(3-Hydroxyvalerate) (PHV) Accumulated by Wild Type and Recombinant Aeromonas Hydrophila Strain 4AK4. Bioresour. Technol. 2009, 100, 4296–4299. [Google Scholar] [CrossRef] [PubMed]
Fayyazbakhsh, A.; Koutný, M.; Kalendová, A.; Šašinková, D.; Julinová, M.; Kadlečková, M. Selected Simple Natural Antimicrobial Terpenoids as Additives to Control Biodegradation of Polyhydroxy Butyrate. Int. J. Mol. Sci. 2022, 23, 14079. [Google Scholar] [CrossRef] [PubMed]
Huu Phong, T.; Van Thuoc, D.; Sudesh, K. Biosynthesis of Poly(3-Hydroxybutyrate) and Its Copolymers by Yangia sp. ND199 from Different Carbon Sources. Int. J. Biol. Macromol. 2016, 84, 361–366. [Google Scholar] [CrossRef] [PubMed]
Viretto, A.; Gontard, N.; Angellier-Coussy, H. Urban Parks and Gardens Green Waste: A Valuable Resource for the Production of Fillers for Biocomposites Applications. Waste Manag. 2021, 120, 538–548. [Google Scholar] [CrossRef] [PubMed]
Shimamura, E.; Scandola, M.; Doi, Y. Microbial Synthesis and Characterization of Poly(3-Hydroxybutyrate-Co-3-Hydroxypropionate). Macromolecules 1994, 27, 4429–4435. [Google Scholar] [CrossRef]
Lugoloobi, I.; Li, X.; Zhang, Y.; Mao, Z.; Wang, B.; Sui, X.; Feng, X. Fabrication of Lignin/Poly(3-Hydroxybutyrate) Nanocomposites with Enhanced Properties via a Pickering Emulsion Approach. Int. J. Biol. Macromol. 2020, 165, 3078–3087. [Google Scholar] [CrossRef] [PubMed]
Chanprateep, S.; Kulpreecha, S. Production and Characterization of Biodegradable Terpolymer Poly(3-Hydroxybutyrate-Co-3-Hydroxyvalerate-Co-4-Hydroxybutyrate) by Alcaligenes sp. A-04. J. Biosci. Bioeng. 2006, 101, 51–56. [Google Scholar] [CrossRef] [PubMed]
Marcoaldi, C.; Acampora, V.; Venezia, V.; Prieto, C.; Grappa, R.; Silvestri, B.; Luciani, G.; Lagaron, J.M. Challenges in Lignin Integration within Biopolymer Matrices: Toward Stable and Effective Lignin Nanoparticles as Additives for Sustainable Food Packaging. Ind. Crops Prod. 2025, 224, 120336. [Google Scholar] [CrossRef]
Laycock, B.; Arcos-Hernandez, M.V.; Langford, A.; Buchanan, J.; Halley, P.J.; Werker, A.; Lant, P.A.; Pratt, S. Thermal Properties and Crystallization Behavior of Fractionated Blocky and Random Polyhydroxyalkanoate Copolymers from Mixed Microbial Cultures. J. Appl. Polym. Sci. 2014, 131, app.40836. [Google Scholar] [CrossRef]
Luo, S.; Cao, J.; McDonald, A.G. Interfacial Improvements in a Green Biopolymer Alloy of Poly(3-Hydroxybutyrate-Co-3-Hydroxyvalerate) and Lignin via in Situ Reactive Extrusion. ACS Sustain. Chem. Eng. 2016, 4, 3465–3476. [Google Scholar] [CrossRef]
Ashby, R.D.; Solaiman, D.K.Y.; Nuñez, A.; Strahan, G.D.; Johnston, D.B. Burkholderia Sacchari DSM 17165: A Source of Compositionally-Tunable Block-Copolymeric Short-Chain Poly(Hydroxyalkanoates) from Xylose and Levulinic Acid. Bioresour. Technol. 2018, 253, 333–342. [Google Scholar] [CrossRef] [PubMed]
Panaitescu, D.M.; Frone, A.N.; Nicolae, C.-A.; Gabor, A.R.; Miu, D.M.; Soare, M.-G.; Vasile, B.S.; Lupescu, I. Poly(3-Hydroxybutyrate) Nanocomposites Modified with Even and Odd Chain Length Polyhydroxyalkanoates. Int. J. Biol. Macromol. 2023, 244, 125324. [Google Scholar] [CrossRef] [PubMed]
Dai, Y.; Yuan, Z.; Jack, K.; Keller, J. Production of Targeted Poly(3-Hydroxyalkanoates) Copolymers by Glycogen Accumulating Organisms Using Acetate as Sole Carbon Source. J. Biotechnol. 2007, 129, 489–497. [Google Scholar] [CrossRef] [PubMed]
Glasser, W.G.; Northey, R.A.; Schultz, T.P. (Eds.) Blends of Biodegradable Thermoplastics with Lignin Esters. In Lignin: Historical, Biological, and Materials Perspectives; ACS Symposium Series; American Chemical Society: Washington, DC, USA, 1999; Volume 742, Chapter 17; pp. 331–350. ISBN 978-0-8412-3611-0. [Google Scholar]
Ferre-Guell, A.; Winterburn, J. Biosynthesis and Characterization of Polyhydroxyalkanoates with Controlled Composition and Microstructure. Biomacromolecules 2018, 19, 996–1005. [Google Scholar] [CrossRef] [PubMed]
Pan, P.; Shan, G.; Bao, Y.; Weng, Z. Crystallization Kinetics of Bacterial Poly(3-hydroxylbutyrate) Copolyesters with Cyanuric Acid as a Nucleating Agent. J. Appl. Polym. Sci. 2013, 129, 1374–1382. [Google Scholar] [CrossRef]
Han, J.; Wu, L.-P.; Hou, J.; Zhao, D.; Xiang, H. Biosynthesis, Characterization, and Hemostasis Potential of Tailor-Made Poly(3-Hydroxybutyrate-Co-3-Hydroxyvalerate) Produced by Haloferax mediterranei. Biomacromolecules 2015, 16, 578–588. [Google Scholar] [CrossRef] [PubMed]
Luo, S.; Cao, J.; McDonald, A.G. Esterification of Industrial Lignin and Its Effect on the Resulting Poly(3-Hydroxybutyrate-Co-3-Hydroxyvalerate) or Polypropylene Blends. Ind. Crops Prod. 2017, 97, 281–291. [Google Scholar] [CrossRef]
Povolo, S.; Romanelli, M.G.; Basaglia, M.; Ilieva, V.I.; Corti, A.; Morelli, A.; Chiellini, E.; Casella, S. Polyhydroxyalkanoate biosynthesis by Hydrogenophaga pseudoflava DSM1034 from Structurally Unrelated Carbon Sources. New Biotechnol. 2013, 30, 629–634. [Google Scholar] [CrossRef] [PubMed]
Panaitescu, D.M.; Nicolae, C.A.; Frone, A.N.; Chiulan, I.; Stanescu, P.O.; Draghici, C.; Iorga, M.; Mihailescu, M. Plasticized Poly(3-hydroxybutyrate) with Improved Melt Processing and Balanced Properties. J. Appl. Polym. Sci. 2017, 134, app.44810. [Google Scholar] [CrossRef]
Davaritouchaee, M.; Mosleh, I.; Dadmohammadi, Y.; Abbaspourrad, A. One-Step Oxidation of Orange Peel Waste to Carbon Feedstock for Bacterial Production of Polyhydroxybutyrate. Polymers 2023, 15, 697. [Google Scholar] [CrossRef] [PubMed]
Arcana, M.; Giani-Beaune, O.; Schue, R.; Schue, F.; Amass, W.; Amass, A. Ring-opening Copolymerization of Racemic β-butyrolactone with ε-caprolactone and δ-valerolactone by Distannoxane Derivative Catalysts: Study of the Enzymatic Degradation in Aerobic Media of Obtained Copolymers. Polym. Int. 2002, 51, 859–866. [Google Scholar] [CrossRef]
Kovalcik, A.; Machovsky, M.; Kozakova, Z.; Koller, M. Designing Packaging Materials with Viscoelastic and Gas Barrier Properties by Optimized Processing of Poly(3-Hydroxybutyrate-Co-3-Hydroxyvalerate) with Lignin. React. Funct. Polym. 2015, 94, 25–34. [Google Scholar] [CrossRef]
Buzarovska, A.; Grozdanov, A.; Avella, M.; Gentile, G.; Errico, M. Poly(Hydroxybutyrate-Co-hydroxyvalerate)/Titanium Dioxide Nanocomposites: A Degradation Study. J. Appl. Polym. Sci. 2009, 114, 3118–3124. [Google Scholar] [CrossRef]
Sánchez-Safont, E.L.; Aldureid, A.; Lagarón, J.M.; Gamez-Perez, J.; Cabedo, L. Effect of the Purification Treatment on the Valorization of Natural Cellulosic Residues as Fillers in PHB-Based Composites for Short Shelf Life Applications. Waste Biomass Valor. 2021, 12, 2541–2556. [Google Scholar] [CrossRef]
Carli, L.N.; Daitx, T.S.; Guégan, R.; Giovanela, M.; Crespo, J.S.; Mauler, R.S. Biopolymer Nanocomposites Based on Poly(Hydroxybutyrate-Co-hydroxyvalerate) Reinforced by a Non-ionic Organoclay. Polym. Int. 2015, 64, 235–241. [Google Scholar] [CrossRef]
Angelini, S.; Cerruti, P.; Immirzi, B.; Santagata, G.; Scarinzi, G.; Malinconico, M. From Biowaste to Bioresource: Effect of a Lignocellulosic Filler on the Properties of Poly(3-Hydroxybutyrate). Int. J. Biol. Macromol. 2014, 71, 163–173. [Google Scholar] [CrossRef] [PubMed]
Doi, Y.; Kitamura, S.; Abe, H. Microbial Synthesis and Characterization of Poly(3-Hydroxybutyrate-Co-3-Hydroxyhexanoate). Macromolecules 1995, 28, 4822–4828. [Google Scholar] [CrossRef]
Wang, H.; Ouyang, Y.; Yang, W.; He, H.; Chen, J.; Yuan, Y.; Park, H.; Wu, F.; Yang, F.; Chen, G.-Q. Production and Characterization of Copolymers Consisting of 3-Hydroxybutyrate and Increased 3-Hydroxyvalerate by β-Oxidation Weakened Halomonas. Metab. Eng. 2025, 89, 97–107. [Google Scholar] [CrossRef] [PubMed]
Khanna, S.; Srivastava, A.K. Recent Advances in Microbial Polyhydroxyalkanoates. Process Biochem. 2005, 40, 607–619. [Google Scholar] [CrossRef]
Seydibeyoğlu, M.Ö.; Misra, M.; Mohanty, A. Synergistic Improvements in the Impact Strength and % Elongation of Polyhydroxybutyrate-Co-Valerate Copolymers with Functionalized Soybean Oils and POSS. Int. J. Plast. Technol. 2010, 14, 1–16. [Google Scholar] [CrossRef]
Avella, M.; Martuscelli, E.; Raimo, M. Review Properties of Blends and Composites Based on Poly(3-Hydroxy)Butyrate (PHB) and Poly(3-Hydroxybutyrate-Hydroxyvalerate) (PHBV) Copolymers. J. Mater. Sci. 2000, 35, 523–545. [Google Scholar] [CrossRef]
Kirboga, S.; Öner, M. Oxygen Barrier and Thermomechanical Properties of Poly (3-Hydroxybutyrate-Co-3-Hydroxyvalerate) Biocomposites Reinforced with Calcium Carbonate Particles. Acta Chim. Slov. 2020, 67, 137–150. [Google Scholar] [CrossRef]
Muniyasamy, S.; Ofosu, O.; John, M.J.; Anandjiwala, R.D. Mineralization of Poly(Lactic Acid) (PLA), Poly(3-Hydroxybutyrate-Co-Valerate) (PHBV) and PLA/PHBV Blend in Compost and Soil Environments. J. Renew. Mater. 2016, 4, 133–145. [Google Scholar] [CrossRef]
Matsusaki, H.; Abe, H.; Doi, Y. Biosynthesis and Properties of Poly(3-Hydroxybutyrate-Co-3-Hydroxyalkanoates) by Recombinant Strains of Pseudomonas sp. 61-3. Biomacromolecules 2000, 1, 17–22. [Google Scholar] [CrossRef] [PubMed]
Moll, E.; Freitas, P.A.V.; Chiralt, A. Effect of Active Rice Straw Extracts on the Properties and Migration of PHBV Films. Food Packag. Shelf Life 2025, 48, 101454. [Google Scholar] [CrossRef]
David, G.; Michel, J.; Gastaldi, E.; Gontard, N.; Angellier-Coussy, H. How Vine Shoots as Fillers Impact the Biodegradation of PHBV-Based Composites. Int. J. Mol. Sci. 2019, 21, 228. [Google Scholar] [CrossRef] [PubMed]
Jost, V.; Langowski, H.-C. Effect of Different Plasticisers on the Mechanical and Barrier Properties of Extruded Cast PHBV Films. Eur. Polym. J. 2015, 68, 302–312. [Google Scholar] [CrossRef]
Mousavioun, P.; Halley, P.J.; Doherty, W.O.S. Thermophysical Properties and Rheology of PHB/Lignin Blends. Ind. Crops Prod. 2013, 50, 270–275. [Google Scholar] [CrossRef]
Shichao, W.; Hengxue, X.; Renlin, W.; Zhe, Z.; Meifang, Z. Influence of Amorphous Alkaline Lignin on the Crystallization Behavior and Thermal Properties of Bacterial Polyester. J. Appl. Polym. Sci. 2015, 132, app.41325. [Google Scholar] [CrossRef]
Haywood, G.W.; Anderson, A.J.; Roger Williams, D.; Dawes, E.A.; Ewing, D.F. Accumulation of a Poly(Hydroxyalkanoate) Copolymer Containing Primarily 3-Hydroxyvalerate from Simple Carbohydrate Substrates by Rhodococcus sp. NCIMB 40126. Int. J. Biol. Macromol. 1991, 13, 83–88. [Google Scholar] [CrossRef] [PubMed]
Sanchez-Garcia, M.D.; Gimenez, E.; Lagaron, J.M. Morphology and Barrier Properties of Solvent Cast Composites of Thermoplastic Biopolymers and Purified Cellulose Fibers. Carbohydr. Polym. 2008, 71, 235–244. [Google Scholar] [CrossRef]
Uzun, G.; Aydemir, D. Biocomposites from Polyhydroxybutyrate and Bio-Fillers by Solvent Casting Method. Bull. Mater. Sci. 2017, 40, 383–393. [Google Scholar] [CrossRef]
Singh, S.; Sithole, B.; Lekha, P.; Permaul, K.; Govinden, R. Optimization of Cultivation Medium and Cyclic Fed-Batch Fermentation Strategy for Enhanced Polyhydroxyalkanoate Production by Bacillus Thuringiensis Using a Glucose-Rich Hydrolyzate. Bioresour. Bioprocess. 2021, 8, 11. [Google Scholar] [CrossRef] [PubMed]
Slongo, M.D.; Brandolt, S.D.F.; Daitx, T.S.; Mauler, R.S.; Giovanela, M.; Crespo, J.S.; Carli, L.N. Comparison of the Effect of Plasticizers on PHBV—and Organoclay—Based Biodegradable Polymer Nanocomposites. J. Polym. Environ. 2018, 26, 2290–2299. [Google Scholar] [CrossRef]
Adorna, J.A.; Ventura, R.L.G.; Dang, V.D.; Doong, R.; Ventura, J.S. Biodegradable Polyhydroxybutyrate/Cellulose/Calcium Carbonate Bioplastic Composites Prepared by Heat-assisted Solution Casting Method. J. Appl. Polym. Sci. 2022, 139, 51645. [Google Scholar] [CrossRef]
Branciforti, M.C.; Corrêa, M.C.S.; Pollet, E.; Agnelli, J.A.M.; Nascente, P.A.D.P.; Avérous, L. Crystallinity Study of Nano-Biocomposites Based on Plasticized Poly(Hydroxybutyrate-Co-Hydroxyvalerate) with Organo-Modified Montmorillonite. Polym. Test. 2013, 32, 1253–1260.
Chiulan, I.; Panaitescu, D.M.; Frone, A.N.; Teodorescu, M.; Nicolae, C.A.; Căşărică, A.; Tofan, V.; Sălăgeanu, A. Biocompatible Polyhydroxyalkanoates/Bacterial Cellulose Composites: Preparation, Characterization, and in Vitro Evaluation. J. Biomed. Mater. Res. 2016, 104, 2576–2584.
Choi, J.S.; Park, W.H. Thermal and Mechanical Properties of Poly(3-hydroxybutyrate-Co-3-hydroxyvalerate) Plasticized by Biodegradable Soybean Oils. Macromol. Symp. 2003, 197, 65–76.
Nerkar, M.; Ramsay, J.A.; Ramsay, B.A.; Kontopoulou, M. Melt Compounded Blends of Short and Medium Chain-Length Poly-3-Hydroxyalkanoates. J. Polym. Environ. 2014, 22, 236–243.
Samantaray, S.; Mallick, N. Production of Poly(3-Hydroxybutyrate-Co-3-Hydroxyvalerate) Co-Polymer by the Diazotrophic Cyanobacterium Aulosira Fertilissima CCC 444. J. Appl. Phycol. 2014, 26, 237–245.
Martelli, S.M.; Sabirova, J.; Fakhouri, F.M.; Dyzma, A.; De Meyer, B.; Soetaert, W. Obtention and Characterization of Poly(3-Hydroxybutyricacid-Co-Hydroxyvaleric Acid)/Mcl-PHA Based Blends. LWT 2012, 47, 386–392.
Abbasi, M.; Pokhrel, D.; Coats, E.R.; Guho, N.M.; McDonald, A.G. Effect of 3-Hydroxyvalerate Content on Thermal, Mechanical, and Rheological Properties of Poly(3-Hydroxybutyrate-Co-3-Hydroxyvalerate) Biopolymers Produced from Fermented Dairy Manure. Polymers 2022, 14, 4140.
Rebocho, A.T.; Pereira, J.R.; Neves, L.A.; Alves, V.D.; Sevrin, C.; Grandfils, C.; Freitas, F.; Reis, M.A.M. Preparation and Characterization of Films Based on a Natural P(3HB)/Mcl-PHA Blend Obtained through the Co-Culture of Cupriavidus necator and Pseudomonas citronellolis in Apple Pulp Waste. Bioengineering 2020, 7, 34.
Gigante, V.; Seggiani, M.; Cinelli, P.; Signori, F.; Vania, A.; Navarini, L.; Amato, G.; Lazzeri, A. Utilization of Coffee Silverskin in the Production of Poly(3-Hydroxybutyrate-Co-3-Hydroxyvalerate) Biopolymer-Based Thermoplastic Biocomposites for Food Contact Applications. Compos. Part A Appl. Sci. Manuf. 2021, 140, 106172.
Tripathi, L.; Wu, L.-P.; Chen, J.; Chen, G.-Q. Synthesis of Diblock Copolymer Poly-3-Hydroxybutyrate-Block-Poly-3-Hydroxyhexanoate [PHB-b-PHHx] by a β-Oxidation Weakened Pseudomonas Putida KT2442. Microb. Cell Fact. 2012, 11, 44.
Wang, K.Y.; Cao, F. Effect of CaCO₃ on Thermal and Crystalline Morphology Properties of Biodegradable PHBV. Adv. Mater. Res. 2013, 781–784, 542–545.
Li, S.Y.; Dong, C.L.; Wang, S.Y.; Ye, H.M.; Chen, G.-Q. Microbial Production of Polyhydroxyalkanoate Block Copolymer by Recombinant Pseudomonas Putida. Appl. Microbiol. Biotechnol. 2011, 90, 659–669.
Figueroa-Lopez, K.J.; Cabedo, L.; Lagaron, J.M.; Torres-Giner, S. Development of Electrospun Poly(3-Hydroxybutyrate-Co-3-Hydroxyvalerate) Monolayers Containing Eugenol and Their Application in Multilayer Antimicrobial Food Packaging. Front. Nutr. 2020, 7, 140.
Turco, R.; Corrado, I.; Zannini, D.; Gargiulo, L.; Di Serio, M.; Pezzella, C.; Santagata, G. Upgrading Cardoon Biomass into Polyhydroxybutyrate Based Blends: A Holistic Approach for the Synthesis of Biopolymers and Additives. Bioresour. Technol. 2022, 363, 127954.
Wang, L.; Zhu, W.; Wang, X.; Chen, X.; Chen, G.; Xu, K. Processability Modifications of Poly(3-hydroxybutyrate) by Plasticizing, Blending, and Stabilizing. J. Appl. Polym. Sci. 2008, 107, 166–173.
Basnett, P.; Ching, K.Y.; Stolz, M.; Knowles, J.C.; Boccaccini, A.R.; Smith, C.; Locke, I.C.; Keshavarz, T.; Roy, I. Novel Poly(3-Hydroxyoctanoate)/Poly(3-Hydroxybutyrate) Blends for Medical Applications. React. Funct. Polym. 2013, 73, 1340–1348.
Angelini, S.; Cerruti, P.; Immirzi, B.; Scarinzi, G.; Malinconico, M. Acid-Insoluble Lignin and Holocellulose from a Lignocellulosic Biowaste: Bio-Fillers in Poly(3-Hydroxybutyrate). Eur. Polym. J. 2016, 76, 63–76.
Baraki, S.Y.; Zhang, Y.; Li, X.; Ding, L.; Debeli, D.K.; Macharia, D.K.; Wang, B.; Feng, X.; Mao, Z.; Sui, X. Regenerated Chitin Reinforced Polyhydroxybutyrate Composites via Pickering Emulsion Template with Improved Rheological, Thermal, and Mechanical Properties. Compos. Commun. 2021, 25, 100655.
Freitas, P.A.V.; Barrasa, H.; Vargas, F.; Rivera, D.; Vargas, M.; Torres-Giner, S. Atomization of Microfibrillated Cellulose and Its Incorporation into Poly(3-Hydroxybutyrate-Co-3-Hydroxyvalerate) by Reactive Extrusion. Appl. Sci. 2022, 12, 2111.
Panaitescu, D.M.; Vizireanu, S.; Stoian, S.A.; Nicolae, C.-A.; Gabor, A.R.; Damian, C.M.; Trusca, R.; Carpen, L.G.; Dinescu, G. Poly(3-Hydroxybutyrate) Modified by Plasma and TEMPO-Oxidized Celluloses. Polymers 2020, 12, 1510.
Wang, Y.; Chen, R.; Cai, J.; Liu, Z.; Zheng, Y.; Wang, H.; Li, Q.; He, N. Biosynthesis and Thermal Properties of PHBV Produced from Levulinic Acid by Ralstonia Eutropha. PLoS ONE 2013, 8, e60318.
De Oliveira, P.F.; Aguiar, V.D.O.; Marques, M.D.F.V.; Monteiro, S.N. Effect of Acid Treatment of Eucalyptus Fibers for Improved Poly(3-Hydroxybutyrate) Nanocomposites. J. Mater. Res. Technol. 2024, 29, 3686–3698.
Fang, C.; Shao, T.; Ji, X.; Wang, F.; Zhang, H.; Xu, J.; Miao, W.; Wang, Z. High Mechanical Property and Antibacterial Poly (3-Hydroxybutyrate-Co-3-Hydroxyvalerate)/Functional Enzymatically-Synthesized Cellulose Biodegradable Composite. Int. J. Biol. Macromol. 2023, 225, 776–785.
Chen, J.; Wu, D.; Pan, K. Effects of Ethyl Cellulose on the Crystallization and Mechanical Properties of Poly(β-Hydroxybutyrate). Int. J. Biol. Macromol. 2016, 88, 120–129.
Panaitescu, D.M.; Nicolae, C.A.; Gabor, A.R.; Trusca, R. Thermal and Mechanical Properties of Poly(3-Hydroxybutyrate) Reinforced with Cellulose Fibers from Wood Waste. Ind. Crops Prod. 2020, 145, 112071.
Chen, J.; Yang, Y.; Fan, W.; Zhu, Y.; Yang, R.; Xu, Y. How Surface Modification of Cellulose Nanocrystals Affects the Crystallization Process of Poly (β-Hydroxybutyrate). Int. J. Biol. Macromol. 2024, 276, 134119.
Bhati, R.; Mallick, N. Production and Characterization of Poly(3-hydroxybutyrate-Co-3-hydroxyvalerate) Co-polymer by a N₂-fixing Cyanobacterium, Nostoc muscorum Agardh. J. Chem. Technol. Biotechnol. 2012, 87, 505–512.
Panaitescu, D.M.; Ionita, E.R.; Nicolae, C.-A.; Gabor, A.R.; Ionita, M.D.; Trusca, R.; Lixandru, B.-E.; Codita, I.; Dinescu, G. Poly(3-Hydroxybutyrate) Modified by Nanocellulose and Plasma Treatment for Packaging Applications. Polymers 2018, 10, 1249.
Choi, J.S.; Park, W.H. Effect of Biodegradable Plasticizers on Thermal and Mechanical Properties of Poly(3-Hydroxybutyrate). Polym. Test. 2004, 23, 455–460.
Li, Z.; Kong, J.; Han, L.; Zhang, H.; Dong, L. Effect of Crystallinity on the Thermal Conductivity of Poly(3-Hydroxybutyrate)/BN Composites. Polym. Bull. 2018, 75, 1651–1666.
Ino, K.; Sato, S.; Ushimaru, K.; Saika, A.; Fukuoka, T.; Ohshiman, K.; Morita, T. Mechanical Properties of Cold-Drawn Films of Ultrahigh-Molecular-Weight Poly(3-Hydroxybutyrate-Co-3-Hydroxyvalerate) Produced by Haloferax Mediterranei. Polym. J. 2020, 52, 1299–1306.
Samaniego-Aguilar, K.; Sánchez-Safont, E.; Arrillaga, A.; Anakabe, J.; Gamez-Perez, J.; Cabedo, L. In Service Performance of Toughened PHBV/TPU Blends Obtained by Reactive Extrusion for Injected Parts. Polymers 2022, 14, 2337.
Kai, D.; Zhang, K.; Liow, S.S.; Loh, X.J. New Dual Functional PHB-Grafted Lignin Copolymer: Synthesis, Mechanical Properties, and Biocompatibility Studies. ACS Appl. Bio Mater. 2019, 2, 127–134.
Wei, L.; McDonald, A.G. Peroxide Induced Cross-linking by Reactive Melt Processing of Two Biopolyesters: Poly(3-hydroxybutyrate) and Poly(l-lactic Acid) to Improve Their Melting Processability. J. Appl. Polym. Sci. 2015, 132, app.41724.
Madden, L.A.; Anderson, A.J.; Asrar, J. Synthesis and Characterization of Poly(3-Hydroxybutyrate) and Poly(3-Hydroxybutyrate-Co-3-Hydroxyvalerate) Polymer Mixtures Produced in High-Density Fed-Batch Cultures of Ralstonia eutropha (Alcaligenes eutrophus). Macromolecules 1998, 31, 5660–5667.
Wang, L.; Wei, X.; Bai, Y.; Zhou, H.; Wang, X.; Wen, B.; Wang, Y. Designing Improved Electromagnetic Shielding Efficacy of Poly(3-hydroxybutyrate-co-3hydroxyvalerate) Nanocomposites Foam Using Carbon Nanotubes and Graphene Nanoplatelets. J. Vinyl Addit. Technol. 2024, 30, 72–88.
Akdoğan, M.; Çelik, E. Enhanced Production of Poly(3-Hydroxybutyrate-Co-3-Hydroxyvalerate) Biopolymer by Recombinant Bacillus Megaterium in Fed-Batch Bioreactors. Bioprocess. Biosyst. Eng. 2021, 44, 403–416.
Sadat-Shojai, M.; Khorasani, M.-T.; Jamshidi, A.; Irani, S. Nano-Hydroxyapatite Reinforced Polyhydroxybutyrate Composites: A Comprehensive Study on the Structural and in Vitro Biological Properties. Mater. Sci. Eng. C 2013, 33, 2776–2787.
Garcia-Garcia, D.; Ferri, J.M.; Montanes, N.; Lopez-Martinez, J.; Balart, R. Plasticization Effects of Epoxidized Vegetable Oils on Mechanical Properties of Poly(3-hydroxybutyrate). Polym. Int. 2016, 65, 1157–1164.
Ambrosi, M.; Raudino, M.; Diañez, I.; Martínez, I. Non-Isothermal Crystallization Kinetics and Morphology of Poly(3-Hydroxybutyrate)/Pluronic Blends. Eur. Polym. J. 2019, 120, 109189.
Cyras, V.P.; Vazquez, A.; Rozsa, C.; Fernandez, N.G.; Torre, L.; Kenny, J.M. Thermal Stability of P(HB-Co-HV) and Its Blends with Polyalcohols: Crystallinity, Mechanical Properties, and Kinetics of Degradation. J. Appl. Polym. Sci. 2000, 77, 2889–2900.
Isa, M.R.M.; Hassan, A.; Nordin, N.A.; Thirmizir, M.Z.A.; Ishak, Z.A.M. Mechanical, Rheological and Thermal Properties of Montmorillonite-Modified Polyhydroxybutyrate Composites. High Perform. Polym. 2020, 32, 192–200.
De Oliveira, J.M.; Sousa, V.M.Z.; Teixeira, L.A.; Leão, R.M.; Sales-Contini, R.C.M.; Steier, V.F.; Da Luz, S.M. The Role of Chemical Treatments on Curaua Fibers on Mechanical and Thermal Behavior of Biodegradable Composites. Appl. Sci. 2024, 14, 10621.
Vogel, R.; Tändler, B.; Häussler, L.; Jehnichen, D.; Brünig, H. Melt Spinning of Poly(3-hydroxybutyrate) Fibers for Tissue Engineering Using α-Cyclodextrin/Polymer Inclusion Complexes as the Nucleation Agent. Macromol. Biosci. 2006, 6, 730–736.
Figueroa-Lopez, K.J.; Prieto, C.; Pardo-Figuerez, M.; Cabedo, L.; Lagaron, J.M. Development and Characterization of Electrospun Poly(3-Hydroxybutyrate-Co-3-Hydroxyvalerate) Biopapers Containing Cerium Oxide Nanoparticles for Active Food Packaging Applications. Nanomaterials 2023, 13, 823.
Al, G.; Aydemir, D.; Altuntaş, E. The Effects of PHB-g-MA Types on the Mechanical, Thermal, Morphological, Structural, and Rheological Properties of Polyhydroxybutyrate Biopolymers. Int. J. Biol. Macromol. 2024, 264, 130745.
Camargo, F.A.; Innocentini-Mei, L.H.; Lemes, A.P.; Moraes, S.G.; Durán, N. Processing and Characterization of Composites of Poly(3-Hydroxybutyrate-Co-Hydroxyvalerate) and Lignin from Sugar Cane Bagasse. J. Compos. Mater. 2012, 46, 417–425.
Das, R.; Saha, N.R.; Pal, A.; Chattopadhyay, D.; Paul, A.K. Comparative Evaluation of Physico-Chemical Characteristics of Biopolyesters P(3HB) and P(3HB-Co-3HV) Produced by Endophytic Bacillus Cereus RCL 02. Front. Biol. 2018, 13, 297–308.

Figure 1. Research workflow of model development.

Figure 2. PRISMA 2020 flow diagram illustrating the identification, screening, eligibility, and inclusion of studies for database construction of PHB/PHBV-based materials [29].

Figure 3. DSC thermogram and thermal transitions for (A) amorphous and (B) semicrystalline polymers. Reproduced from [30], Wikimedia Commons, 2018.

Figure 4. Distribution of model input numerical variables.

Figure 5. Distribution of model input categorical variables.

Figure 6. Comparison between predicted and true Tm values for the training set (left) and the test set (right) using the RF model.

Figure 7. Feature importance of the RF model for Tm using SHAP values.

Figure 8. Comparison between predicted and true Tc values for the training set (left) and the test set (right) using the XGBoost model.

Figure 9. Feature importance of the XGBoost model for Tc using SHAP values.

Figure 10. Comparison between predicted and true Tg values for the training set (left) and the test set (right) using the RF model.

Figure 11. Feature importance of the RF model for Tg using SHAP values.

Figure 12. Comparison between predicted and true thermal values for the training, test and external validation sets.

Table 1. Overview of worksheets and parameters in the thermal properties data libraries of scl-PHAs (PHB/PHBV formulations).

Worksheets	Parameters
Worksheet_1_Physicochemical Information (information on the physicochemical properties and composition of the biopolymer)	Percentage of PHB, percentage of the monomer PHV, weight-average molecular weight (M_w), number-average molecular weight (M_n), polymer density, polydispersity index (PDI), presence of an additive, additive name, additive percentage, etc.
Worksheet_2_Thermal properties (information on the thermal behavior and characteristics of the polymer)	Glass transition temperature (T_g), crystallization temperature (T_c), melting temperature (T_m), decomposition temperature at 5% weight loss (T_d5%), degradation temperature (T_deg), heat of fusion (ΔH_m), heat of crystallization (ΔH_c), crystallinity (X_c, %), etc.
Worksheet_3_Metadata (title and DOI of each study included in the database)	Study_id, author, date, title, DOI

Table 2. Molecular weight reconstruction.

Column	Initial Count (%)	Final Count (%)
Mw_PHBV	332 (58.0%)	367 (64.2%)
Mn_PHBV	157 (27.4%)	235 (41.1%)
PDI	180 (31.5%)	231 (40.4%)
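The reconstruction summarized in Table 2 can be illustrated with a minimal sketch. It assumes that missing entries were recovered from the standard definition PDI = Mw/Mn whenever two of the three quantities were reported; the function name `reconstruct_mw` and the example records are hypothetical and are not taken from the dataset.

```python
from typing import Optional, Tuple

Triple = Tuple[Optional[float], Optional[float], Optional[float]]


def reconstruct_mw(mw: Optional[float], mn: Optional[float],
                   pdi: Optional[float]) -> Triple:
    """Fill in the one missing value of (Mw, Mn, PDI) using PDI = Mw / Mn.

    Assumption: a value is reconstructed only when the other two are known;
    records with two or more missing values are returned unchanged.
    """
    if mw is None and mn is not None and pdi is not None:
        mw = mn * pdi
    elif mn is None and mw is not None and pdi is not None and pdi > 0:
        mn = mw / pdi
    elif pdi is None and mw is not None and mn is not None and mn > 0:
        pdi = mw / mn
    return mw, mn, pdi


# Hypothetical records in the order (Mw, Mn, PDI); None marks a missing value.
records = [
    (400_000.0, None, 2.0),        # Mn can be recovered
    (None, 150_000.0, 1.8),        # Mw can be recovered
    (300_000.0, 120_000.0, None),  # PDI can be recovered
    (250_000.0, None, None),       # not recoverable, left as-is
]
filled = [reconstruct_mw(*r) for r in records]
```

Records with only one reported quantity stay incomplete, which is consistent with the final counts in Table 2 remaining well below 572.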

Table 3. Category grouping of the feature “Additive_type_1”.

Original Category	Count	Merged Category	Count
not_applicable	247	not_applicable	247
plasticizer	79	plasticizer	115
plastisizer	36	plasticizer	115
filler	61	filler	138
reinforcement	59	filler	138
impact_modifier	2	filler	138
nucleating_agent	16	filler	138
crosslinking_agent	9	polymer_modifier	37
compatibilizer	28	polymer_modifier	37
antioxidant	32	stabilizer	35
stabilizer	2	stabilizer	35
antimicrobial	1	stabilizer	35
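The grouping in Table 3 amounts to a label mapping followed by a recount. The sketch below reproduces it from the counts in the table; the names `MERGE_MAP` and `ORIGINAL_COUNTS` are illustrative and are not taken from the authors' code.

```python
from collections import Counter

# Original label -> merged label, copied from Table 3.
MERGE_MAP = {
    "not_applicable": "not_applicable",
    "plasticizer": "plasticizer",
    "plastisizer": "plasticizer",  # misspelled label present in the raw data
    "filler": "filler",
    "reinforcement": "filler",
    "impact_modifier": "filler",
    "nucleating_agent": "filler",
    "crosslinking_agent": "polymer_modifier",
    "compatibilizer": "polymer_modifier",
    "antioxidant": "stabilizer",
    "stabilizer": "stabilizer",
    "antimicrobial": "stabilizer",
}

# Counts of the original categories, copied from Table 3.
ORIGINAL_COUNTS = {
    "not_applicable": 247, "plasticizer": 79, "plastisizer": 36,
    "filler": 61, "reinforcement": 59, "impact_modifier": 2,
    "nucleating_agent": 16, "crosslinking_agent": 9, "compatibilizer": 28,
    "antioxidant": 32, "stabilizer": 2, "antimicrobial": 1,
}

# Accumulate the original counts under their merged labels.
merged = Counter()
for label, count in ORIGINAL_COUNTS.items():
    merged[MERGE_MAP[label]] += count

total = sum(merged.values())
```

The merged counts sum to the 572 instances of the full dataset, which is a quick consistency check on the grouping.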

Table 4. Category grouping of the feature “Additive_type_2”.

Original Category	Count	Merged Category	Count
not_applicable	519	not_applicable	519
plasticizer	3	plasticizer	15
plastisizer	12	plasticizer	15
filler	4	filler	21
reinforcement	17	filler	21
crosslinking_agent	8	polymer_modifier	17
compatibilizer	2	polymer_modifier	17
blend	1	polymer_modifier	17
antioxidant	6	polymer_modifier	17

Table 5. Data type and completeness summary of the cleaned merged datasets.

Column	Count	Completeness	Dtype
Study_id	572	100.0%	int64
Instance	572	100.0%	int64
HB_ratio_formulation	567	99.1%	float64
HV_ratio_formulation	567	99.1%	float64
Mw_PHBV	367	64.2%	float64
Mn_PHBV	235	41.1%	float64
PDI	231	40.4%	float64
Additive1_type	572	100.0%	object
Additive1_percentage	572	100.0%	float64
Additive2_type	572	100.0%	object
Additive2_percentage	572	100.0%	float64
Tg1	273	47.8%	float64
Tc1	326	57.0%	float64
Tm1	541	94.6%	float64
Tm2	193	33.7%	float64
Td5	207	36.2%	float64
DHm1	357	62.4%	float64
Crystallinity	339	59.3%	float64

Table 6. Descriptive statistics of the key thermal properties.

	Tm (°C)	Tc (°C)	Tg (°C)
Count	541	326	273
Mean	158.1	85.1	−1.9
Min	52.0	15.5	−48.0
Max	180.0	126.7	43.0
Median (Q₂)	166.0	86.3	−1.6

Table 7. Dataset sizes for the different scenarios.

Feature Space	Target: Tm	Target: Tc	Target: Tg
Basic	351	201	215
Basic + 1 thermal	207 (+Tg); 196 (+Tc)	196 (+Tm); 122 (+Tg)	207 (+Tm); 122 (+Tc)
Basic + 2 thermal	118	118	118

Table 8. CV R² scores for the different feature spaces for the Tm model.

Model	Feature Space	Outlier Exclusion	Train R²	Test R²	CV R² ± SD
RF	Basic + Tc	No	0.949	0.742	0.731 ± 0.162
RF	Basic + Tc	Yes	0.968	0.715	0.754 ± 0.051
RF	Basic + Tg	No	0.965	0.719	0.718 ± 0.089
RF	Basic + Tg	Yes	0.971	0.621	0.787 ± 0.074
RF	Basic + Tc + Tg	No	0.957	0.649	0.713 ± 0.220
RF	Basic + Tc + Tg	Yes	0.935	0.763	0.817 ± 0.073
XGBoost	Basic + Tc	Yes	0.981	0.715	0.789 ± 0.083
XGBoost	Basic + Tc + Tg	Yes	0.969	0.649	0.771 ± 0.071

Table 9. CV R² scores for the different feature spaces for the Tc model.

Model	Feature Space	Outlier Exclusion	Train R²	Test R²	CV R² ± SD
RF	Basic + Tm	Yes	0.967	0.670	0.704 ± 0.095
XGBoost	Basic	No	0.972	0.753	0.762 ± 0.054

Table 10. CV R² scores for the different feature spaces for the Tg model.

Model	Feature Space	Outlier Exclusion	Train R²	Test R²	CV R² ± SD
RF	Basic + Tm + Tc	Yes	0.960	0.731	0.765 ± 0.093
XGBoost	Basic + Tc	No	0.989	0.868	0.702 ± 0.128

Table 11. Input features for the Tm models.

Feature	Type	Units
Tc	Numerical	Celsius
Tg	Numerical	Celsius
HB_ratio_formulation	Numerical	-
HV_ratio_formulation	Numerical	-
Mw_PHBV	Numerical	g/mol
Additive1_percentage	Numerical	%
Additive1_type	Categorical	-
Additive2_percentage	Numerical	%
Additive2_type	Categorical	-

Table 12. Optimized hyperparameter values for the RF model (Tm prediction).

Parameter	Best Value
max_depth	10
min_samples_leaf	2
min_samples_split	5
n_estimators	50

Table 13. Performance of the tuned Tm model.

Metric	Random Forest (Tuned)
Train R²	0.935
Test R²	0.763
CV Mean R²	0.817 ± 0.073
CV Mean RMSE	8.55 ± 1.63
CV Mean MAE	5.56 ± 0.88
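For reference, the three metrics reported in Table 13 (and in the corresponding Tc and Tg tables) follow their standard definitions. The sketch below computes them for a small set of made-up melting temperatures; the values are purely illustrative and do not come from the dataset, and `regression_metrics` is a hypothetical helper name.

```python
import math
from typing import Dict, Sequence


def regression_metrics(y_true: Sequence[float],
                       y_pred: Sequence[float]) -> Dict[str, float]:
    """Return R², RMSE and MAE for paired true and predicted values."""
    n = len(y_true)
    mean_true = sum(y_true) / n
    # Residual and total sums of squares for the coefficient of determination.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n,
    }


# Made-up Tm values in °C, used only to exercise the formulas.
y_true = [160.0, 170.0, 150.0, 140.0]
y_pred = [158.0, 173.0, 151.0, 144.0]
metrics = regression_metrics(y_true, y_pred)
```

RMSE and MAE carry the units of the target (°C here), so the values for Tm, Tc and Tg are best compared against the spread of each property rather than against each other.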

Table 14. Input features for the Tc model.

Feature	Type	Units
HB_ratio_formulation	Numerical	-
HV_ratio_formulation	Numerical	-
Mw_PHBV	Numerical	g/mol
Additive1_percentage	Numerical	%
Additive1_type	Categorical	-
Additive2_percentage	Numerical	%
Additive2_type	Categorical	-

Table 15. Optimized hyperparameter values for the XGBoost model (Tc prediction).

Parameter	Best Value
max_depth	6
colsample_bytree	0.7
learning_rate	0.05
n_estimators	300
subsample	0.8

Table 16. Performance of the tuned Tc model.

Metric	XGBoost (Tuned)
Train R²	0.972
Test R²	0.753
CV Mean R²	0.762 ± 0.054
CV Mean RMSE	13.48 ± 1.54
CV Mean MAE	9.00 ± 1.32

Table 17. Input features for the Tg model.

Feature	Type	Units
Tm	Numerical	Celsius
Tc	Numerical	Celsius
HB_ratio_formulation	Numerical	-
HV_ratio_formulation	Numerical	-
Mw_PHBV	Numerical	g/mol
Additive1_percentage	Numerical	%
Additive1_type	Categorical	-
Additive2_percentage	Numerical	%
Additive2_type	Categorical	-

Table 18. Optimized hyperparameter values for the RF model (Tg prediction).

Parameter	Best Value
max_depth	10
min_samples_leaf	1
min_samples_split	2
n_estimators	50

Table 19. Performance of the tuned Tg model.

Metric	Random Forest (Tuned)
Train R²	0.960
Test R²	0.731
CV Mean R²	0.765 ± 0.093
CV Mean RMSE	4.05 ± 1.47
CV Mean MAE	2.68 ± 0.63

Table 20. Performance comparison on the validation set.

Metric	Target: Tm	Target: Tc	Target: Tg
Pearson r	0.801	0.907	0.708
RMSE	11.70	11.59	2.89
MAE	6.38	9.81	1.97

Table 21. Model performance using KS splitting.

Metric	Target: Tm	Target: Tc	Target: Tg
Train R²	0.966	0.939	0.973
Test R²	0.850	0.845	0.720
CV Mean R²	0.732 ± 0.131	0.599 ± 0.113	0.732 ± 0.188
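As an illustration of the splitting strategy behind Table 21, the sketch below assumes that "KS" denotes the Kennard–Stone algorithm, which is commonly used in QSPR work to build a training set that spans the feature space. The function name, the Euclidean distance and the toy one-dimensional data are assumptions made for this example; the feature scaling and split ratio used in the study are not stated in this section.

```python
import math
from typing import List, Sequence, Tuple


def kennard_stone_split(X: Sequence[Sequence[float]],
                        n_train: int) -> Tuple[List[int], List[int]]:
    """Return (train_indices, test_indices) chosen by Kennard-Stone selection."""
    n = len(X)
    if not 2 <= n_train <= n:
        raise ValueError("n_train must be between 2 and len(X)")
    # Pairwise Euclidean distances between all samples.
    dist = [[math.dist(a, b) for b in X] for a in X]
    # Seed the training set with the two most distant samples.
    i0, j0 = max(((i, j) for i in range(n) for j in range(i + 1, n)),
                 key=lambda pair: dist[pair[0]][pair[1]])
    selected = [i0, j0]
    remaining = [k for k in range(n) if k not in (i0, j0)]
    # Repeatedly add the sample farthest from its nearest selected neighbour.
    while len(selected) < n_train:
        nxt = max(remaining, key=lambda k: min(dist[k][s] for s in selected))
        selected.append(nxt)
        remaining.remove(nxt)
    return selected, remaining


# Toy one-feature dataset (hypothetical values).
X = [[0.0], [1.0], [2.0], [10.0], [5.0]]
train_idx, test_idx = kennard_stone_split(X, n_train=3)
```

Because the extremes of the feature space are placed in the training set, test samples tend to lie inside the training domain, which may explain test scores that look more favourable than the cross-validated ones.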

Table 22. Domain of Applicability (DoA) for the thermal models.

Numerical Input Feature	Target: Tm	Target: Tc	Target: Tg
HB ratio formulation (mol %)	[0, 100]	[0, 100]	[0, 100]
HV ratio formulation (mol %)	[0, 100]	[0, 100]	[0, 100]
Mw PHBV (g/mol)	[80,509, 5,240,400]	[23,900, 5,240,400]	[80,509, 5,240,400]
Additive 1 percentage (wt%)	[0, 40]	[0, 30]	[0, 40]
Additive 2 percentage (wt%)	[0, 10]	[0, 10]	[0, 10]
Tm (°C)	-	-	[99.7, 176.7]
Tc (°C)	[15.5, 114]	-	[15.5, 114]
Tg (°C)	[−30.7, 7.1]	-	-
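The ranges in Table 22 can be applied as a simple pre-check before a prediction is requested. The sketch below encodes the Tm column of the table and flags any numerical input that is missing or outside its range; the dictionary keys follow the feature names of Table 11, while `out_of_domain` and the two sample formulations are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# Numerical input ranges for the Tm model, copied from Table 22.
TM_MODEL_DOA: Dict[str, Tuple[float, float]] = {
    "HB_ratio_formulation": (0.0, 100.0),   # mol %
    "HV_ratio_formulation": (0.0, 100.0),   # mol %
    "Mw_PHBV": (80_509.0, 5_240_400.0),     # g/mol
    "Additive1_percentage": (0.0, 40.0),    # wt%
    "Additive2_percentage": (0.0, 10.0),    # wt%
    "Tc": (15.5, 114.0),                    # °C
    "Tg": (-30.7, 7.1),                     # °C
}


def out_of_domain(sample: Dict[str, Optional[float]],
                  domain: Dict[str, Tuple[float, float]]) -> List[str]:
    """Return the features that are missing or outside their [min, max] range."""
    violations = []
    for name, (low, high) in domain.items():
        value = sample.get(name)
        if value is None or not low <= value <= high:
            violations.append(name)
    return violations


# Hypothetical formulations used only to exercise the check.
inside = {
    "HB_ratio_formulation": 90.0, "HV_ratio_formulation": 10.0,
    "Mw_PHBV": 400_000.0, "Additive1_percentage": 10.0,
    "Additive2_percentage": 0.0, "Tc": 85.0, "Tg": -2.0,
}
outside = dict(inside, Additive1_percentage=55.0, Tg=20.0)

inside_flags = out_of_domain(inside, TM_MODEL_DOA)
outside_flags = out_of_domain(outside, TM_MODEL_DOA)
```

A per-feature range check of this kind is only a bounding-box test; it does not detect unusual combinations of values that each fall inside their individual ranges.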

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Sotiropoulos, N.P.; Mindrinos, L.; Peltier, J.-D.; Filippou, K.V.; Kotzabasaki, M.I.; Tsigkas, N.; Maraveas, C. Machine Learning Prediction of Thermal Properties of PHB/PHBV-Based Materials: A Quantitative Structure–Property Relationship Approach Using an Integrated Polymer Database. Polymers 2026, 18, 1559. https://doi.org/10.3390/polym18131559

