Advances in the Development of Sol-Gel Materials Combining Small-Angle X-ray Scattering (SAXS) and Machine Learning (ML)

The requirements for new materials are increasing with each new application, which, in most cases, means an enhancement in the complexity of the development process. Nanoporous sol-gel-based materials, especially aerogels, are promising candidates for thermal superinsulation, electrodes for energy conversion and storage or high-end adsorbers. Their synthesis and processing route is complex, and the relationship between the material/processing parameters and the resulting structural and physical properties is not straightforward. Using small-angle X-ray scattering (SAXS) allows for fast structural characterization of both the gel and the resulting aerogel; combining these results with the respective physical properties of the aerogels and using these data as inputs for machine learning (ML) algorithms provide an approach to predict physical properties on the basis of a structural dataset. This data-driven strategy may be a feasible approach to speed up the development process. Thus, the study aimed to provide a proof of concept of ML-based model derivation from material, process and SAXS data to predict physical properties such as the solid-phase thermal conductivity (λs) of silica aerogels from a structural dataset. Here, we used different data subsets as predictors according to different states of synthesis (wet and dry) to evaluate the model performance.


Introduction
Sol-gel-derived porous solids represent a class of materials with a high number of synthesis and processing parameters. This class enables the provision of porous materials and composites with a designed chemical composition and independent control of the specific surface area, porosity and particle size, as well as excellent shaping capabilities via templating, sedimentation or molding [1]. In contrast to other types of porous materials, such as sinter metals, ceramics, foams or fiber felts, the sol-gel route allows for the synthesis of (monolithic) nanoporous materials with extremely high porosities of up to 99%. With these special properties, aerogels are unrivaled candidates for thermal superinsulation, energy conversion and storage, catalyst supports and dielectric materials [2,3], to name just a few target applications. Depending on the application, the requirement of different properties and combinations thereof must be fulfilled, e.g., materials with high porosity and small pore sizes for thermal insulation, but with sufficient mechanical stability for easy handling.
Although the mechanical properties and solid-phase thermal conductivities (representing the main thermal transport path in ambient conditions) of aerogels are connected to the porosity in a first-order approximation [4-6], these properties are also strongly controlled by other characteristics, such as the branching of the gel backbone and its connectivity, as well as the necks between the subunits forming the gel skeleton, i.e., properties that are hard to quantify experimentally. Within the material development process, analysis by small-angle X-ray scattering (SAXS) provides fast characterization of structural quantities such as specific surface area and branching (fractality) of the solid phase, as well as particle and cluster size. Thus far, there is no unequivocal relationship between SAXS data and the resulting physical properties.
Processes 2021, 9, 672
Due to the large number of process parameters and the complexity of the resulting structure, a new approach to speed up the development of aerogels would be very helpful in supporting the above-mentioned applications [7]. In this context, a variety of machine learning (ML) algorithms, development environments and datasets have emerged to support material development processes. Applications of ML in materials science include the following:

• Finding new materials or promising material combinations;
• Classifying materials or properties by recognizing patterns;
• Predicting structural or performance properties from data subsets.
The overall objective is to support "experience-" and "intuition-based" decision-making or leave the "beaten path" with ML as a data-driven approach [8]. This has huge potential to shorten material development processes significantly. In particular, the prediction of a material structure from synthesis and processing parameters and the relationship between its mechanical/thermal structure and the resulting performance are promising fields of ML application. Oftentimes, analytical relationships cannot be applied due to the large number and complexity of influencing factors, e.g., on materials' synthesis and processing, on the resulting physical properties, etc. The challenge, therefore, is to fit or interpret the results of these characterization methods in terms of their structure-process-performance relationship. Here, ML can make a significant contribution [9] and further the development of nanoporous materials by digitalization, thus confirming the statement by Schmidt et al. in 2019, "One of the most exciting tools that have entered the material science toolbox is machine learning" [10].
That the combination of SAXS with ML approaches can yield relevant information was shown, inter alia, by Roth et al. [11], who analyzed a solution cast gradient consisting of colloidal gold nanoparticles on top of a silicon substrate. A highly developed approach with respect to shape classification and molecular mass determination of biomolecules was investigated by Franke et al. [12], who demonstrated the great potential of SAXS with ML. Unfortunately, such large datasets are not available for SAXS data with respect to porous sol-gel materials. However, this publication may be a starting point for improving the material development process in the sol-gel system using machine learning.

Synthesis of Sol-Gel Materials
For the investigation and proof of concept of combining sol-gel materials with machine learning for material development, a series of silica aerogels were chosen as the model system. The silica aerogels were synthesized using a two-step process following the procedure described by Scherer et al. [13]. The raw materials used were tetraethoxysilane (TEOS) as a silica source, ethanol as a solvent, high-purity water for hydrolysis and hydrochloric acid and ammonia solution to adjust the pH. The three synthesis parameters used were the target density, ρtarget (assuming that all silane in the liquid volume was converted to SiO2), the molar ratio of water to TEOS, xHT, and the (calculated) pH, assuming that the whole solution was water. Gelling and aging were performed in airtight vessels at 50 °C for 24 h overall. Subsequent washing with ethanol replaced the liquid in the pores prior to supercritical drying (SCD) with CO2.

Structural, Mechanical and Thermal Analysis of the Materials
In addition to the density of the (dry) aerogels, the structural properties of the gels and the aerogels were determined using small-angle X-ray scattering (SAXS) with a SAXSpoint 1.0 instrument from Anton Paar using Cu Kα radiation (wavelength 1.54 Å) at two sample-detector distances of 109 and 562 mm. Analysis was performed on wet gels and the respective aerogels derived by supercritical drying (SCD). Thus, the structural changes between the wet and dry versions reflect the impact of the drying process on the structural characteristics. For the measurements, the wet gels were placed into a sealed cell with polyimide windows and an excess of ethanol to avoid drying. The SCD-dried aerogels were prepared as thin slices and placed in a solid sample holder. Prior to measurement, the dry silica aerogels were degassed for 8 h at 1 mbar and 110 °C according to the recommendations given by Scherdel et al. [14]. The scattering intensity was normalized to the mass-specific scattering cross-section m⁻¹·dσ/dΩ in units of cm² g⁻¹ sr⁻¹, using a glassy carbon reference with a well-known scattering cross-section as the secondary standard. Figure 1 shows a typical scattering curve for the investigated silica aerogels. Three regions with different slopes in the double-logarithmic plot of the differential scattering cross-section m⁻¹·dσ/dΩ vs. the scattering vector, q, can be identified; the q-values of the intersections between the different slopes are related to the cluster size, dcluster, and the particle size, dparticle [15]. As the conversion factor to the scattering entity size, d, the relation d = π/q was applied [16]. The fractal dimension, df, in the power law dependence (~q^(−df)) [17] is a measure of the mutual arrangement of the primary particles in a cluster. Using the Porod regime (~q⁻⁴), the specific surface area, SSAXS, can be calculated [18].
Figure 1. Typical scattering curve of a supercritical drying (SCD)-dried silica aerogel. Three regions with different slopes (i.e., power law dependence) can be identified. According to the aerogel scheme on top, the intersections are related to cluster (dcluster) and particle size (dparticle), respectively.
The solid-phase thermal conductivity of the silica aerogels (λs) was determined using the transient hot-wire method [19], measured under vacuum at 0.3 mbar.
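The evaluation of a scattering curve described above can be sketched in a few lines. This is a minimal illustration only, not the authors' evaluation routine: the function names, the q-range and the synthetic power-law curve are assumptions made for the example, and the crossover position is a hypothetical value.

```python
import numpy as np

def size_from_q(q_crossover):
    """Convert a crossover position q (e.g., in 1/nm) to a size d via d = pi / q."""
    return np.pi / q_crossover

def power_law_exponent(q, intensity):
    """Fit I ~ q^(-p) as a straight line in log-log space and return p."""
    slope, _intercept = np.polyfit(np.log(q), np.log(intensity), 1)
    return -slope

# Synthetic fractal regime (assumed data): I = 5 * q^(-2.2)
q = np.logspace(-1, 0, 50)           # assumed q-range in 1/nm
intensity = 5.0 * q ** -2.2

d_f = power_law_exponent(q, intensity)   # recovers the exponent 2.2
d_cluster = size_from_q(0.1)             # hypothetical crossover at q = 0.1 1/nm
print(f"d_f = {d_f:.2f}, d_cluster = {d_cluster:.1f} nm")
```

On measured data, the fit would have to be restricted to the q-interval of the respective regime, and the crossover positions would be taken from the intersections of the fitted lines.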

Machine Learning Meta Models
As introduced in Section 1, machine learning is already applied to various specific topics in computational materials science. General boundary conditions must be fulfilled to enable the set-up of machine learning algorithms. Ramprasad et al. [8] named two distinct steps for all data-driven approaches to perform quantitative predictions:
1. Numerical representation of inputs (e.g., synthesis parameters and characterization results);
2. Establishing the mapping/learning between inputs and target properties (e.g., mechanical/thermal properties).
These very general steps for machine learning contain some challenges, which need to be considered to deploy an executable algorithm. In particular, the initial numerical description with an ontology that is as general as possible for the research field and the description of workflows, e.g., for synthesis processes, require the greatest effort within the well-established Cross Industry Standard Process for Data Mining (CRISP-DM) model [20,21]. This meta-model is a de facto standard for handling machine learning or data mining problems in a systematic way. Figure 2 shows the model adopted for the research field of computational materials.
In terms of the "right" modeling, the selection and adaptation of an appropriate algorithm must be conducted systematically. Doan and Kalita proposed a meta-learning model for regression problems using supervised learning [22]. This model was modified according to our approach for sol-gel materials to predict their thermal and structural properties. Figure 3 shows the adopted meta-model for algorithm selection. Based on an initial dataset, the relevant data were filtered to obtain a training example (data subset).
We used the MATLAB Regression Learner tool to evaluate a large set of potential ML models. Meta-knowledge about the common suitability of the model approaches helped to reduce the number of model approaches to be considered for our research. The supervised learning procedure resulted in a ranking of various models. Here, the root mean square error (RMSE) served as the performance indicator of the ML model.
Figure 3. Adopted meta-model following Doan, T. and Kalita, J. [22].
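The ranking step of this meta-model can be illustrated with a small sketch. It is not the MATLAB workflow used in the study: the two candidate models (a mean baseline and a least-squares linear model), the function names and the synthetic data are assumptions chosen only to show how candidates are ordered by their validation RMSE.

```python
import numpy as np

def rmse(predicted, observed):
    """Root mean square error between predicted and observed values."""
    predicted = np.asarray(predicted, dtype=float)
    observed = np.asarray(observed, dtype=float)
    return float(np.sqrt(np.mean((predicted - observed) ** 2)))

def fit_mean(X, y):
    """Baseline candidate: always predict the mean of the training responses."""
    mean_value = float(np.mean(y))
    return lambda X_new: np.full(len(X_new), mean_value)

def fit_linear(X, y):
    """Candidate: multiple linear regression fitted by least squares."""
    design = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return lambda X_new: np.column_stack([np.ones(len(X_new)), X_new]) @ coef

def rank_models(candidates, X_train, y_train, X_val, y_val):
    """Train every candidate and sort them by validation RMSE (best first)."""
    scores = []
    for name, fit in candidates.items():
        predict = fit(X_train, y_train)
        scores.append((name, rmse(predict(X_val), y_val)))
    return sorted(scores, key=lambda item: item[1])

# Synthetic stand-in for a dataset of nine samples (not measured values)
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(9, 2))
y = 3.0 + 2.0 * X[:, 0] - 1.0 * X[:, 1]

ranking = rank_models({"mean": fit_mean, "linear": fit_linear},
                      X[:6], y[:6], X[6:], y[6:])
for name, score in ranking:
    print(f"{name}: validation RMSE = {score:.3g}")
```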
New material discovery and material design for optimal structure-property relationships are just two areas of application for ML techniques, as conventional methodologies are highly iterative and multidimensional [23,24]. The presented approach in this contribution utilizes synthesis, drying parameters and structural SAXS data from wet and/or dry gels to train ML models.

Machine Learning in Sol-Gel Processes
The previous sections presented the principle of aerogel synthesis as well as the determination of SAXS data for the wet gels and the dried aerogels. The approach yielded different datasets, such as synthesis parameters and structural data from SAXS, which may be used to predict physical properties such as the solid-phase thermal conductivity (λs) via machine learning. In Figure 4, the process layer shows the sol-gel process steps and the created data, beginning with the synthesis parameters. SAXS provided the structural characteristics of the wet gels and the aerogels. After supercritical drying, the resulting aerogels were additionally characterized with respect to their thermal properties (solid-phase thermal conductivity, λs).

The application of ML techniques uses four strategies, which correspond to different stages of progress in material processing:
I. Synthesis parameters;
II. Synthesis parameters and wet gel SAXS data;
III. Dry gel SAXS data;
IV. Synthesis parameters and dry gel SAXS data.
We investigated relevant ML algorithms that might fit this regression problem, with a dataset of n = 9 for the proof of concept. Regarding small datasets, Schmidt et al. [10] stated that it is also possible to use ML as a simple fitting procedure for low-dimensional data such as ours. Table 1 illustrates the ML strategies with the predictors used as well as the responses.
The choice of the different ML strategies (S I-S IV) was oriented to the material development process. The earlier we can apply valid ML models to predict the resulting properties (here, the solid-phase thermal conductivity, λs), the more beneficial it is for streamlining the process. Hence, S I aims to predict λs from just the synthesis parameters of the material. By using SAXS as an intermediate characterization method of the wet gel, S II additionally considers the wet-gel SAXS data as predictors. After the (critical) drying process, SAXS is also applied to the dry gel. In S III, SSAXS, df and dcluster as the resulting SAXS data are the basis of ML model training. S IV takes the synthesis parameters as well as the dry SAXS data into account to predict λs. The following section presents and discusses the results gathered from the SAXS characterization of the wet and dry gels and the derivation of the ML models with respect to the described strategies.
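One way to express the four strategies as predictor subsets is sketched below. The column names are hypothetical labels for the quantities named in the text, the wet-gel predictors (fractal dimension and cluster size) are an assumption based on what SAXS could resolve for the wet gels, and the sample values are placeholders rather than measured data.

```python
import numpy as np

# Hypothetical column labels for the quantities named in the text
SYNTHESIS = ["rho_target", "x_HT", "pH"]
WET_SAXS = ["d_f_wet", "d_cluster_wet"]            # assumed wet-gel predictors
DRY_SAXS = ["S_SAXS", "d_f_dry", "d_cluster_dry"]

STRATEGIES = {
    "S I": SYNTHESIS,
    "S II": SYNTHESIS + WET_SAXS,
    "S III": DRY_SAXS,
    "S IV": SYNTHESIS + DRY_SAXS,
}

def predictor_matrix(samples, strategy):
    """Stack the predictors of one strategy into an (n_samples, n_features) array."""
    columns = STRATEGIES[strategy]
    return np.array([[sample[name] for name in columns] for sample in samples],
                    dtype=float)

# Two made-up samples; the numbers are placeholders, not values from the study
samples = [
    {"rho_target": 100.0, "x_HT": 4.0, "pH": 8.0, "d_f_wet": 2.1,
     "d_cluster_wet": 10.0, "S_SAXS": 500.0, "d_f_dry": 2.3, "d_cluster_dry": 8.0},
    {"rho_target": 120.0, "x_HT": 6.0, "pH": 9.0, "d_f_wet": 2.2,
     "d_cluster_wet": 12.0, "S_SAXS": 450.0, "d_f_dry": 2.8, "d_cluster_dry": 6.0},
]

X_s3 = predictor_matrix(samples, "S III")
for name in STRATEGIES:
    print(name, predictor_matrix(samples, name).shape)
```

Keeping the strategy definition in one mapping makes it easy to train the same model type on every subset and compare the resulting errors.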

Fast Structural Characterization-Results
After the sol-gel process, the wet gels were characterized using SAXS (Figure 5). The strong incoherent background of the ethanol superimposing the Porod regime allowed only for determination of the cluster size and fractal dimension of the wet gels. The data derived from SAXS are summarized in Table 2 along with the respective synthesis parameters of the wet gels.
After supercritical drying, the aerogel samples were characterized with respect to their structural and thermal properties. The scattering curves of aerogels A to I on an absolute scale are shown in Figure 6. The data of the SCD-dried silica aerogels are summarized in Table 3.
Assuming that all silane is converted to SiO2 and that the sample volume stays constant during processing, the bulk density of the aerogels should have been similar to the target density of the synthesis. However, as the bulk density reached values of up to 708 kg/m³ (at ρtarget = 120 kg/m³), a big change during processing was obvious and was mainly caused by a strong shrinkage during SCD. Another parameter that changed strongly with SCD was the fractal dimension df, which was similar for all wet gels characterized (df ≈ 2.16) but varied to a large extent for the respective aerogels (2.17 to 2.90). This finding shows, exemplarily, why a straightforward interpretation of porous sol-gel systems may be challenging. In our case, the correlation between the bulk density and the resulting thermal property was R² = 0.966 (see Figure 7), and it can be determined for the given type of aerogel with little effort when the solid specimen is geometrically defined.
Oftentimes, for other sol-gel materials, the bulk density is difficult to evaluate correctly, e.g., due to the irregular shape of the synthesized specimen (aerogel granules, powder, small specimen in early stage of material development, etc.). Hence, intermediate characterization, e.g., by SAXS in combination with ML, can be a feasible approach to gain (predictive) models from these data combined with synthesis parameters. For aerogels in general, the total thermal conductivity in ambient conditions is often the application-related target. In contrast to the type of silica aerogels used in this study, the thermal conductivity in ambient conditions may contain significant contributions from gaseous and radiative heat transfer. In this case, density will only be a secondary parameter.
Furthermore, for potential applications, physical properties such as thermal conductivity are more relevant, even if there are first-order correlations with structural properties. Thus, a heuristic approach such as machine learning seems to be inviting.

Machine Learning Results
As mentioned in Section 3, we applied four different ML strategies to derive models to predict the physical properties of the aerogels without using the bulk density (ρ). The assigned ML problem belongs to the class of supervised regression learning, as λs is to be predicted. Based on the CRISP-DM cycle and the meta-learning model of Doan and Kalita [20], we created data subsets and applied the most appropriate algorithm(s) to retrieve the relevant responses from the data. The models were ranked according to their overall quality using the RMSE as the indicator of model performance:

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - o_i\right)^2} \qquad (1)$$

where n is the number of samples, y_i is the i-th predicted value and o_i is the i-th observed value. The nine samples (n = 9) were split: six were used for model training and three for model validation. We applied the data to linear regression (LR) and Gaussian process regression (GPR) models, which are "classical" models for this (multivariate) regression problem [22,23]. Linear regression is the simplest model to predict a dependent variable (Y) from one or more independent variables (x), defined by

$$Y = \alpha + \beta x \quad \text{(simple linear regression)}, \qquad Y = \alpha + \beta_1 x_1 + \beta_2 x_2 + \ldots + \beta_n x_n \quad \text{(multiple linear regression)} \qquad (2)$$

GPR, as another promising ML model, specifies distributions over functions without having to commit to a specific functional form [25]. It is also a powerful tool for small datasets [26]. GPR is a Bayesian approach, which derives a probability distribution over all possible values. The prior distribution of the parameters w, p(w), is updated on the basis of the observed (training) data according to the Bayes rule:

$$p(w \mid y, X) = \frac{p(y \mid X, w)\, p(w)}{p(y \mid X)}$$

where p(w|y,X) represents the new (posterior) distribution. This distribution contains the information of both the prior distribution and the dataset. Rasmussen provided a detailed explanation of the GPR algorithm in [27].
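To make the Bayesian update behind GPR more tangible, the following sketch computes the posterior mean and standard deviation of a zero-mean Gaussian process in its function-space form. It is an illustration under stated assumptions, not the model configuration of the study: the squared-exponential kernel, its hyperparameters, the noise level and the synthetic one-dimensional data are all chosen for the example.

```python
import numpy as np

def rbf_kernel(A, B, length_scale=1.0, signal_var=1.0):
    """Squared-exponential covariance between the rows of A and the rows of B."""
    sq_dist = (np.sum(A ** 2, axis=1)[:, None]
               + np.sum(B ** 2, axis=1)[None, :]
               - 2.0 * A @ B.T)
    return signal_var * np.exp(-0.5 * np.maximum(sq_dist, 0.0) / length_scale ** 2)

def gpr_posterior(X_train, y_train, X_new, noise_var=1e-6, **kernel_args):
    """Posterior mean and standard deviation of a zero-mean GP at X_new."""
    K = rbf_kernel(X_train, X_train, **kernel_args) + noise_var * np.eye(len(X_train))
    K_s = rbf_kernel(X_train, X_new, **kernel_args)
    K_ss = rbf_kernel(X_new, X_new, **kernel_args)
    L = np.linalg.cholesky(K)                        # K = L L^T
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_s.T @ alpha                             # posterior mean
    v = np.linalg.solve(L, K_s)
    var = np.diag(K_ss) - np.sum(v ** 2, axis=0)     # posterior variance
    return mean, np.sqrt(np.maximum(var, 0.0))

# Synthetic 1-D illustration with six training points (not data from the study)
X_train = np.linspace(0.0, 5.0, 6).reshape(-1, 1)
y_train = np.sin(X_train).ravel()
X_new = np.array([[0.0], [2.5], [20.0]])

mean, std = gpr_posterior(X_train, y_train, X_new)
for x, m, s in zip(X_new.ravel(), mean, std):
    print(f"x = {x:5.1f}: mean = {m:+.3f}, std = {s:.3f}")
```

At a training point the posterior reproduces the observation with almost no uncertainty, whereas far from the data the prediction falls back to the prior (mean 0, standard deviation 1). This uncertainty estimate is what distinguishes GPR from a plain least-squares fit on a small dataset.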
Figure 8 shows the results comprehensively. The RMSE values for the linear regression (LR) and Gaussian process regression (GPR) models predicting the response parameter λs are illustrated for each ML strategy, S I to S IV, for the trained and validated models.
Figure 8. […] (Table 1) and the response parameter λs. Comprehensive results of the trained and validated models.
The trained models had different performance levels depending on the input data, which were determined by the ML strategy. For S I and S II, the model quality was not adequate for either LR or GPR, considering the RMSE of both the trained and validated models. This indicates that the supercritical drying process strongly changed the structural and, thus, physical properties, as the synthesis parameters and SAXS data of the wet gels showed a low correlation with the response λs. In other words, the wet-gel structures determined by SAXS were inadequate to predict the structural changes caused by SCD and, thus, failed to predict λs. After supercritical drying (S III and S IV), the LR model had an RMSE of around 2 for the training and 10 for predicting the thermal conductivity λs (model validation). In relation to the scale of the training and test data, this validated RMSE is promising for the LR approach, despite the small set of validation data. The model coefficients βi (see Equation (2)) are shown in Table 4, where α = −487.9. Despite the small dataset of n = 9, it can be seen that the ML-based models for predicting λs were strongly improved by the SAXS data of the aerogels, as these data represented the thermal properties of the aerogel. Therefore, in addition to the good correlation with bulk density, the SAXS data of aerogels also allowed for a good estimation of λs. It can also be stated that using only synthesis parameters did not lead to a sufficient prediction model. In particular, this also means that the process of supercritical drying had a significant impact on the structural and thermal properties of the resulting sol-gel materials. Generally, a larger dataset leads to greater statistical power for pattern recognition, but numerous studies performed using a small dataset have also shown a high accuracy (in this case, for brain disorders) [28].
Hence, we wanted to show a proof of concept using the intermediate SAXS data of dried gels and synthesis parameter data as an alternative to the bulk density to predict the response parameter λs of our material system.
Based on the results, we can accelerate the material development and characterization process in cases where the bulk density cannot be evaluated easily: extensive tests for the direct determination of λs can be avoided if the ML strategies S III and S IV predict unsatisfactory values of λs. At the same time, SAXS characterization is a high-throughput method, which enables targeted development of sol-gel materials by combining SAXS and ML. Unsuitable material combinations in terms of the required λs can be left out to narrow the search space.
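The screening idea can be written down as a simple filter on predicted values. The sketch below is hypothetical: the function name, the sample labels, the predicted values and the threshold are placeholders, and the units are left unspecified (any consistent unit for λs works).

```python
def screen_candidates(predicted_lambda_s, max_lambda_s):
    """Split candidates into those worth measuring directly and those to drop."""
    keep = [name for name, value in predicted_lambda_s.items()
            if value <= max_lambda_s]
    drop = [name for name in predicted_lambda_s if name not in keep]
    return keep, drop

# Placeholder predictions from a trained model (not results of the study)
predictions = {"A": 5.0, "B": 40.0, "C": 12.0}
keep, drop = screen_candidates(predictions, max_lambda_s=15.0)
print("measure directly:", keep)
print("leave out:", drop)
```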

Conclusions and Further Research
The contribution shows an accelerated data-driven approach for the targeted development of sol-gel materials. Based on the fast, high-throughput SAXS characterization of sol-gel-based nanostructured materials, the trained machine learning models enhanced the development process by using the SAXS data to predict, e.g., thermal properties. The results indicate that the synthesis parameters alone are not sufficient for predictions, as gels change too much during processing (e.g., SCD). The same applies to wet gel characterization with SAXS. However, it can be concluded that the aerogel SAXS data in combination with trained GPR models have a good quality and, thus, enable the prediction of physical properties such as λs. To the best of the authors' knowledge, we have presented, for the first time, a synergy of SAXS and machine learning in a porous sol-gel system. Further research should aim to feed the models with larger datasets to enhance the validation and to transfer the approach to other material systems and targeted developments.