One Step before Synthesis: Structure–Property–Condition Relationship Models to Sustainable Design of Efficient TiO2-Based Multicomponent Nanomaterials

To control the photocatalytic activity, it is essential to consider several parameters affecting the structure of ordered multicomponent TiO2-based photocatalytic nanotubes. The missing systematic knowledge about the relationship between structure, property, and preparation parameters may be provided by applying a machine learning (ML) methodology and predictive models based on the quantitative structure-property-condition relationship (QSPCR). In the present study, for the first time, the quantitative mapping of preparation parameters, morphology, and photocatalytic activity of 136 TiO2 NTs doped with metal and non-metal nanoparticles and synthesized with the one-step anodization method has been investigated via linear and nonlinear ML methods. Moreover, the developed QSPCR model, for the first time, provides systematic knowledge supporting the design of effective TiO2-based nanotubes by proper structure manipulation. The proposed computer-aided methodology reduces the cost and speeds up (optimizes) the design of efficient photocatalysts at the earliest possible stage (before synthesis), in line with the sustainability-by-design strategy.


Introduction
Over the past few years, finding new ways to cut CO2 emissions (by at least 40% by 2030, compared with 1990) and increasing renewable energy consumption, while maintaining economic growth, have both become key industrial and EU targets. One of the accessible approaches that simultaneously allows the utilization of CO2 is harvesting the energy of light (from UV to NIR) via heterogeneous photocatalysis [1], photoelectrochemical reduction [2], or photocatalytic processes [3] with the use of TiO2 nanostructures that convert CO2 into stable, valuable chemicals (fuels or raw materials). TiO2 nanoparticles (NPs) are the most widely investigated material and are considered promising and efficient photocatalysts. They are characterized by high photoactivity, relatively low cost, low toxicity, and good chemical and thermal stability. However, TiO2 has low quantum efficiency under visible light, resulting from the fast recombination of photogenerated e−/h+ pairs [4]. Thus, current systems based on TiO2 NPs are still too expensive and inefficient for commercial deployment.
Various strategies for modifying TiO2 in heterogeneous photocatalysis have been applied to overcome the abovementioned drawbacks. The design of new efficient TiO2-based systems has focused on enhancing its intrinsic properties by metal and non-metal doping (e.g., TiO2 combined with noble/rare metal nanoparticles or narrow band gap semiconductor particles [5,6]), as well as by combining it with metal chalcogenides, graphene-based composites [7], oxygen-type perovskites [8], metal-organic frameworks (MOFs) [9], g-C3N4 [10], and conducting polymers. Among these strategies, the preparation of well-organized TiO2 nanoparticles in the form of nanotubes with metal and non-metal doping in the anodic oxidation process appears to be a highly efficient method to tune the response of the semiconductor to the visible light region and enhance its photocatalytic properties. The anodic oxidation process occurs in electrolytes with double- or triple-electrode systems. Pure or doped titanium foil is used as an electrode. During the anodization process, layers of organized titanium dioxide nanotubes are obtained [5]. Many heterogeneous NP-based photocatalysts have been described in the literature, for example, those obtained with different types or concentrations of electrolyte, such as ionic liquids [11][12][13] or urea as a nitrogen precursor [14], and those based on titanium foil used as a working electrode interlaced with modifiers [15,16] or decorated with Pt nanoparticles and Bi2S3 quantum dots [17].
Unfortunately, the main challenge in the design of new TiO2-based nanostructures is that there are thousands of possible combinations of the nanoparticles' structural features (e.g., size, surface area, metal and non-metal doping, etc.). Because such an experimental study would be time-consuming, expensive, and complicated, it is irrational to synthesize and test all possible structure combinations to find the most effective photocatalysts. The literature lacks systematic knowledge of how the experimental conditions influence changes in the heterogeneous nanostructure and how those changes tune the response of the semiconductor to the visible light region and enhance its photocatalytic properties. So far, the development of heterogeneous nanoparticles has been based on subjective expert expectations and very narrow investigations, rather than on systematic studies of a wide space of possible solutions. In effect, each combination of modifiers is synthesized and its activity is then measured, which is ineffective and time-consuming. This challenge can be addressed by applying appropriate computational methods (virtual design) that combine machine learning with a virtual screening methodology. Quantitative mapping of the relationships between the structure combination (metal and non-metal doping) of TiO2 NTs and the experimental conditions enables researchers to thoroughly understand and control the photocatalytic properties, which is essential for future developments in this research area, as well as for efficient systems design. However, because of the complex structure of heterogeneous nanomaterials, developing these correlations will require extensive studies, high-throughput techniques, and new strategies.
One of the most promising approaches that may fill the gap described above is Quantitative Structure-Property Relationship (QSPR) modeling for advanced and multicomponent nanomaterials. QSPR methods for multicomponent nanomaterials (so-called nano-QSARmix) were introduced for the first time in 2016 by Mikolajczyk et al. [18]. The nano-QSARmix allows for predicting the impact of structural modifications on the photocatalytic activity of newly designed photocatalysts. It facilitates the identification of more meaningful relationships between the experimental and structural identity of TiO2 NTs and their photocatalytic outcome under visible light. Recently, Mikolajczyk et al. [18,19] established how to adjust the QSPR methodology to predict the photocatalytic activity of newly designed TiO2-based nanostructures modified with noble metals. This method has been successfully applied to model the structure-photocatalytic activity relationship of different TiO2-based photocatalysts. However, to date, no study has provided systematic knowledge about the quantitative relationship between experimental conditions, structure modification, and photocatalytic activity of newly designed TiO2-based photocatalysts.
The presented research aimed to develop a computational screening methodology for studying the role of experimental conditions and structural features of well-organized doped and modified TiO2 nanotubes (NTs) in their photocatalytic efficiency at the early stage of advanced photocatalyst design (before synthesis). In the first step, a database of 136 well-organized doped and modified TiO2 NTs was collected. Then, linear and nonlinear ML methods were applied to provide systematic knowledge about the relationship between experimental conditions, nanostructure, and photocatalytic properties of newly designed, well-organized doped and modified TiO2 nanotubes. Finally, ML-based models and a predictive nano-QSARmix model were developed for the first time to answer the question of how to control the photocatalytic properties of newly designed TiO2-based nanotubes by proper structure manipulation and how to manipulate the structure by modifying the experimental conditions. As an effect of the virtualization of the design process, the experimental procedure would become much faster, much more efficient in terms of the number of considered solutions, and less expensive. Thus, improving computer-aided methods of designing new materials will contribute to optimizing the design process while reducing the cost and time of new materials development.

Use of HCA Analysis to Compare the Similarity of Designed TiO2-Based NTs
The data from 113 TiO2-based NTs modified by metal and non-metal nanoparticles derived from the literature were collected into one matrix (Table S1) [11][12][13][14][15][16]. The designed TiO2-based NTs were characterized by experimental conditions, structural modification of TiO2-based NTs, and photocatalytic properties (Table S2). To compare the influence of the proposed metal and non-metal modifications on the structure of TiO2-based NTs, Hierarchical Clustering Analysis (HCA) was performed (Figure 1). The HCA was used to explore the distribution of the selected TiO2-based NTs in the space of structural similarity. The HCA indicates that the TiO2-based NTs were grouped into nine clusters of self-organized TiO2 nanotube arrays interlaced with metal and non-metal particles. Each cluster differs from the others in terms of the metal and non-metal nanoparticles that belong to a given group: (A) well-ordered nanotubes obtained in the presence of different types of ionic liquids (TiO2/ILs) [13], (B) hierarchical V2O5-TiO2 NPs obtained from Ti-V alloys (TiO2/V2O5) [20], (C) TiO2-MnO2 NPs obtained from alloys (TiO2/MnO2), (D) rare-earth metals-modified TiO2 NTs (TiO2/REE) [16], (E) calcinated titanium dioxide nanotubes [21], (F) monometallic-modified TiO2 (Er, Yb, Ho, Tb, Gd, Pr, Cu, Ag, Bi, Pt), bimetallic-modified (AgCu), and BiS quantum dots-modified TiO2 NTs [17,22,23], (G) self-organized TiO2 interlaced with silver nanoparticles (TiO2/Ag2O) [15], (H) nitrogen-doped TiO2 NTs (TiO2/N) [14], and (I) nanotubes modified with the presence of ionic liquids in different conditions (TiO2/ILs) [11,12]. The HCA indicates that the presence of metal and non-metal elements that incorporate into the TiO2 structure during nanotube formation is crucial for controlling the morphological features of the designed TiO2-based NTs. These findings are in agreement with the literature data [14].
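As an illustration of the clustering step, the following minimal sketch groups samples by structural similarity. The text does not state the linkage rule, distance metric, or software that was used, so Ward linkage on autoscaled descriptors, the function name cluster_samples, and the three synthetic descriptor columns are assumptions made only for this example.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster


def cluster_samples(X, n_clusters):
    """Autoscale the descriptors, build a Ward dendrogram, and cut it
    into at most n_clusters groups. Returns one integer label per sample."""
    X = np.asarray(X, dtype=float)
    std = X.std(axis=0, ddof=1)
    std[std == 0] = 1.0                      # guard against constant columns
    Z = (X - X.mean(axis=0)) / std           # autoscaling (zero mean, unit variance)
    tree = linkage(Z, method="ward")         # assumed linkage rule
    return fcluster(tree, t=n_clusters, criterion="maxclust")


# Hypothetical descriptors: [length (um), wall thickness (nm), dopant content (at.%)]
rng = np.random.default_rng(0)
centers = np.array([[2.0, 10.0, 0.0], [6.0, 20.0, 1.0], [12.0, 15.0, 5.0]])
X = np.vstack([c + rng.normal(scale=0.05, size=(5, 3)) for c in centers])

labels = cluster_samples(X, n_clusters=3)
print(labels)  # five samples per group share the same label
```

Autoscaling is applied first because the descriptors are on very different scales; without it, the column with the largest numerical range would dominate the distances.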

How to Control the Morphology of Designed TiO2-Based NTs? Virtual Screening of Structure-Experimental Condition Relationship
The experimental study [16] indicates that photocatalytic degradation of pollutants under visible light is mainly related to photogenerated electrons and superoxide radicals. Moreover, the photoelectrochemical tests [16,24,25] showed that REE modification of TiO2-based NTs increased the photocurrent. The study by Nevárez-Martínez et al. [26] indicates that MnO2 species may absorb visible light, which is crucial for enhancing the charge transfer rate. Thus, the transfer of photoexcited electrons (generated in TiO2 NTs) within MnO2 under visible light may improve the photocatalytic activity of TiO2/MnO2 systems. Similar results were described by Mazierski et al. [15] for TiO2/Ag2O nanotubes. The authors [15] showed that photocatalytic efficiency under UV-VIS light is related to the optimal content of Ag2O and Ag NPs, which determines the number of recombination centers, while the activity under VIS light is correlated with increasing Ag2O and Ag0 content in the TiO2/Ag2O-based NTs, due to the utilization of a higher amount of incident photons. The results of the presented HCA analysis are in agreement with this experimental evidence [15,16,24,25,26]. The HCA (Figure 1) confirms that doping with metal or non-metal elements is crucial for controlling the morphological features of TiO2-based NTs and, consequently, their photocatalytic activity. The morphology, i.e., the length and wall thickness of the TiO2 nanotubes, whether they are top-opened or clogged, and the wall smoothness, is in turn controlled by the experimental conditions of the one-step anodization process. In this context, Principal Component Analysis (PCA) was applied to explore the distribution of the TiO2-based NT photocatalysts in the space of the experimental conditions and the structural features of TiO2-based NTs.
The results of the constructed PC map in the space of structural features and experimental conditions are displayed in Figure 2. The analysis was prepared for 32 TiO2-based NTs. The first two principal components (PC1 and PC2) explained 73.8% (51.1% + 22.6%) of the total variance in the data. The physical interpretation of a given PC can be assigned based on the contributions of the original descriptors to that PC (loading values), schematically represented in Figure 2b. To correlate the dependency between experimental conditions, morphology, and photocatalytic activity, the results obtained from PCA were transferred into a color range, in which the ranges corresponded to the values of the photocatalytic activity expressed as the phenol photodegradation rate (λ, µmol·dm−3·min−1) under VIS and UV-VIS light, respectively (Figure 3). In this way, we tried to identify systematic patterns in the data that might suggest which structural features and experimental conditions are mainly responsible for the efficiency of the designed 32 well-ordered TiO2-based NTs. Along PC1, there are two main clusters.
The first cluster contains Groups I and III, which include groups D, G, and H from HCA (Figure 1, Table S2), i.e., rare-earth metals-modified TiO2 NTs (TiO2/REE), self-organized TiO2 interlaced with silver nanoparticles (TiO2/Ag2O), and nitrogen-doped TiO2 NTs (TiO2/N), as well as group (I), nanotubes modified with the presence of ionic liquids in different conditions (TiO2/ILs). The second cluster contains Group II, which includes group A from HCA (Figure 1, Table S2), represented by well-ordered nanotubes obtained in the presence of different types of ionic liquids (TiO2/ILs). Along PC2, there are three main groups. Group I contains rare-earth metals-modified TiO2 NTs (TiO2/REE) and self-organized TiO2 interlaced with silver nanoparticles (TiO2/Ag2O); Group II contains well-ordered nanotubes obtained in the presence of different types of ionic liquids (TiO2/ILs); and Group III contains nitrogen-doped TiO2 NTs (TiO2/N).
The PCA design map (Figure 4), which shows the correlation between the loading values along PC1 and PC2 and the photocatalytic activity transferred into the color range, was prepared to understand the relationship between experimental conditions and structural features that may be fundamental for the future design of new and efficient TiO2-based NTs.
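The quantities discussed above (scores, loadings, and the fraction of variance explained by each PC) can be sketched as follows. This is a minimal example, assuming autoscaled descriptors and PCA computed through a singular value decomposition; the function name pca, the mixing matrix, and the synthetic 32 x 4 data matrix are hypothetical and do not reproduce the data of the study.

```python
import numpy as np


def pca(X, n_components=2):
    """PCA of autoscaled data via SVD.

    Returns (scores, loadings, explained_ratio) for the first
    n_components principal components."""
    X = np.asarray(X, dtype=float)
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)   # autoscaling
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    explained = s**2 / np.sum(s**2)                    # variance fraction per PC
    scores = U[:, :n_components] * s[:n_components]    # sample coordinates
    loadings = Vt[:n_components].T                     # descriptor contributions
    return scores, loadings, explained[:n_components]


# Synthetic stand-in for 32 samples described by four hypothetical descriptors
# (e.g., anodization time, voltage, tube length, wall thickness).
rng = np.random.default_rng(1)
latent = rng.normal(size=(32, 2))
mixing = np.array([[1.0, 0.8, 0.1, -0.6],
                   [0.1, -0.3, 1.0, 0.5]])
X = latent @ mixing + 0.1 * rng.normal(size=(32, 4))

scores, loadings, ratio = pca(X)
print("variance explained by PC1 and PC2:", ratio, "total:", ratio.sum())
```

Plotting the scores colored by the measured activity would give a map of the kind described for Figure 3, while the loadings indicate which descriptors drive each PC.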
As can be seen from Figure 4, the activity under VIS and UV-VIS light is correlated mainly with length, wall thickness, and the presence of non-metal elements. The highest activity under both VIS and UV-VIS light is obtained for non-metal-doped TiO2 nanotubes. The activity increased with increasing length and decreasing wall thickness of the TiO2 NTs. Both parameters depend on anodization time, anodization voltage, and the number of electrodes (Figure 3a,b). This observation is consistent with results obtained by experimental methods [11]. The PCA indicates that longer TiO2 NTs with thinner walls are obtained with longer anodization time, lower voltage, and, consequently, a lower number of electrodes.
Moreover, the synthesis should be controlled by the amount of water and the Ti content in the foil. The anodization process should be carried out in an electrolyte with a higher amount of ethylene glycol and a lower amount of water, using NH4F and different weight percentages of urea (as a nitrogen precursor). These findings agree with the literature data [11][12][13][14][15][16]. For example, Nischk et al. [22] indicated that the photocatalytic activity, as well as the charge carrier recombination rate, depend on nitrogen concentration and process parameters, such as voltage and anodization time. The authors [22] indicated that the dimensions of nanotubes could be easily controlled within a wide range by changing the applied voltage and anodization time. Similar results were obtained by Pancielejko et al. [11,12], who indicated that the photocatalytic activity and the charge carrier recombination rate may depend on nitrogen concentration. The TiO2 NTs in this group were grown in an ethylene glycol-based electrolyte that contained more NH4F than other samples [14][15][16]. The NTs in this group were grown at lower anodization voltage and longer anodization time using Ti sheets in an organic electrolyte containing specified amounts of urea as a nitrogen precursor. The most active samples are characterized by a higher length and a lower wall thickness of TiO2 NTs compared with groups I and II.
Interestingly, Mazierski et al. [14] concluded that there are no significant differences in the length of nitrogen-doped TiO2-based NTs obtained under higher voltages and longer anodization times. However, there is a significant difference in length compared with nanotubes obtained with lower voltages and shorter anodization times. This conclusion indicates that the virtual screening of a wider group of TiO2-based NTs may bring new insight into the relationship between experimental conditions, structural features, and photocatalytic activity that is crucial for the design of new TiO2-based NTs. For example, ref. [16] indicates that the dimensions of rare-earth metals-modified TiO2 nanotubes strongly depend on the anodization parameters, including anodization time and voltage.

The Quantitative Relationship between Structure, Synthesis Condition, and Photocatalytic Activity of TiO2-Based NTs: Predictive Nano-QSPCRmix Model Development
To develop linear correlations between the determined structural and experimental descriptors (x) and the measured photocatalytic activity endpoints (y), multiple linear regression (MLR) with a genetic algorithm (GA) was used to model y as a function of x, y = f(x). Table 1 and Figure 5 compare the performance of the constructed linear regression models for predicting photocatalytic activity under VIS and UV-VIS light. The MLR-GA models showed high accuracy, as deduced from their acceptable R², CCC, RMSEC, and MAEC values. Interestingly, each MLR-GA model includes one descriptor that corresponds to the structure of TiO2-based NTs, such as length, Ti3+ ion content (based on XPS analysis), and total titanium content (∑Ti [% at.]). The MLR-GA indicates that photocatalytic activity under UV-VIS light increases with length, diameter, and Ti3+ ion content, while the PCA showed that these morphological features might be easily controlled by experimental conditions, i.e., a higher Ti content in the foil, a longer anodization time, and a higher amount of ethylene glycol and a lower amount of water in the electrolyte (Figure 4). These findings agree with the results presented in the literature [14]. In that paper [14], the authors showed that crystallite size, NT length, and Ti3+ state content are critical to creating hydroxyl radicals. The presence of a Ti3+ state can reduce oxygen dissolved in water and produce superoxide radicals, while positive holes may be involved in the generation of •OH radicals that affect photocatalytic activity.
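The idea of selecting a small set of descriptors for an MLR model can be sketched as follows. Note that this example deliberately replaces the genetic algorithm with an exhaustive search over small descriptor subsets, which is feasible only for a handful of candidates; the scoring by leave-one-out Q², the function names, and the synthetic data (in which only the first column, a hypothetical "length" descriptor, carries signal) are assumptions for illustration and not the settings of the study.

```python
import itertools
import numpy as np


def fit_mlr(X, y):
    """Ordinary least squares with an intercept; returns [b0, b1, ..., bp]."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def q2_loo(X, y):
    """Leave-one-out cross-validated Q2 = 1 - PRESS / SS_tot."""
    n = len(y)
    press = 0.0
    for i in range(n):
        keep = np.arange(n) != i
        coef = fit_mlr(X[keep], y[keep])
        pred = coef[0] + X[i] @ coef[1:]
        press += (y[i] - pred) ** 2
    return 1.0 - press / np.sum((y - y.mean()) ** 2)


def select_descriptors(X, y, max_size=2):
    """Exhaustive stand-in for GA selection: best subset by LOO Q2."""
    best_score, best_subset = -np.inf, ()
    for k in range(1, max_size + 1):
        for subset in itertools.combinations(range(X.shape[1]), k):
            score = q2_loo(X[:, list(subset)], y)
            if score > best_score:
                best_score, best_subset = score, subset
    return best_score, best_subset


# Synthetic data: four hypothetical descriptors, activity driven by column 0.
rng = np.random.default_rng(2)
X = rng.normal(size=(20, 4))
y = 0.5 + 0.8 * X[:, 0] + 0.05 * rng.normal(size=20)

best_q2, best_subset = select_descriptors(X, y)
print("selected columns:", best_subset, "Q2_LOO:", round(best_q2, 3))
```

A genetic algorithm explores the same space of descriptor subsets stochastically, which becomes necessary when the number of candidate descriptors makes exhaustive enumeration impractical.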

Finally, the most representative dataset from HCA (Group I, see Figure 1), which contains 17 well-ordered nanotubes, was used for the development of the predictive nano-QSPCRmix model. The group is characterized by nanotubes modified with the presence of ionic liquids in different conditions (voltage, amount of ionic liquid, and water) (TiO2/ILs) [11,12]. The model (Equation (2)) was validated following the recommendations of the Organization for Economic Co-operation and Development (OECD) [27]. Goodness-of-fit was assessed by the determination coefficient (R²) and the root mean square error of calibration (RMSEC), both based on the predictions for the training set. The model's robustness (stability) was verified by internal validation using the leave-one-out cross-validation algorithm (for MLR models). The robustness was expressed by Q²LOO; accordingly, the root mean square error of cross-validation (RMSECV) was also calculated. The predictive ability of all models was assessed by the external validation coefficient (R²EXT), the root mean square error of prediction (RMSEP), and the Concordance Correlation Coefficient (CCC). The applicability domain (AD) was also determined (Figure 6b).
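The fit and prediction statistics named above can be computed as in the following minimal sketch. The formulas are the commonly used ones and are stated here as assumptions, since the text does not give them explicitly: R² as 1 − SS_res/SS_tot, an external R² that references the training-set mean, and Lin's concordance correlation coefficient with population (1/n) moments. The function names and the three-point example are hypothetical.

```python
import numpy as np


def r2(y_obs, y_pred, ref_mean=None):
    """1 - SS_res / SS_tot. For an external set, pass the training-set mean
    as ref_mean (one common convention among several for R2_EXT)."""
    y_obs, y_pred = np.asarray(y_obs, float), np.asarray(y_pred, float)
    ref = y_obs.mean() if ref_mean is None else ref_mean
    return 1.0 - np.sum((y_obs - y_pred) ** 2) / np.sum((y_obs - ref) ** 2)


def rmse(y_obs, y_pred):
    """Root mean square error (RMSEC, RMSECV, or RMSEP depending on the set)."""
    y_obs, y_pred = np.asarray(y_obs, float), np.asarray(y_pred, float)
    return np.sqrt(np.mean((y_obs - y_pred) ** 2))


def ccc(y_obs, y_pred):
    """Lin's concordance correlation coefficient."""
    y_obs, y_pred = np.asarray(y_obs, float), np.asarray(y_pred, float)
    mo, mp = y_obs.mean(), y_pred.mean()
    cov = np.mean((y_obs - mo) * (y_pred - mp))
    return 2.0 * cov / (y_obs.var() + y_pred.var() + (mo - mp) ** 2)


# Tiny worked example with made-up numbers.
y_obs = np.array([1.0, 2.0, 3.0])
y_pred = np.array([1.0, 2.0, 4.0])
print(round(r2(y_obs, y_pred), 3))    # 0.5
print(round(rmse(y_obs, y_pred), 3))  # 0.577
print(round(ccc(y_obs, y_pred), 3))   # 0.857
```

Unlike R², the CCC penalizes systematic shifts between observed and predicted values, which is why it is often reported alongside R²EXT for external validation.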
The developed nano-QSPCRmix model is characterized by the statistical measures described above; its determination coefficient, R², is 0.96. Observed-predicted plots and applicability domain areas of the nano-QSARmix models for photocatalytic activity are presented in Figure 6. The nano-QSPCRmix model, developed here for the first time, is described by Equation (2) and utilizes a descriptor that represents the length of TiO2 NTs.
The final nano-QSPCRmix model (Equation (2)) explains 96% of the variability of the observed photocatalytic activity under UV-VIS light (λUV-VIS) of the investigated ionic liquids-modified titanium oxide nanotube samples. Equation (2) indicates that there is a linear correlation between photocatalytic activity and the length of TiO2 NTs. The strong linear correlation between the observed and predicted values of λUV-VIS, graphically presented in Figure 6a, confirms the good fit and high predictive ability of the model. Additionally, the plot of standardized cross-validated residuals versus leverages (Williams plot) confirms that all training and validation samples are located within the applicability domain, i.e., the area defined by the structural similarity of the nanoparticles, where the predictions are reliable (Figure 6b). Interpretation of the descriptor (i.e., the length of TiO2 NTs) brings significant insight into the current knowledge of structural factors that are likely to affect the photocatalytic activity (λUV-VIS) of the studied nanoparticles. The model (Equation (2)) indicates that photocatalytic activity increases with the length of TiO2 NTs (standardized coefficient: +0.0011). The developed nano-QSPCRmix model quantifies the influence of the structural features (the length) of TiO2 NTs on the modeled photocatalytic activity (λUV-VIS), which is of high importance for designing an efficient photocatalyst. The developed predictive nano-QSPCRmix model is the first step in developing an ML-based framework that may significantly support the experimental design of novel photocatalysts based on TiO2 NTs.
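The applicability domain check behind a Williams plot can be sketched as follows. This is a minimal example under stated assumptions: leverages are computed from the training design matrix with an intercept column, the warning leverage is taken as h* = 3(p + 1)/n, and standardized residuals are accepted within ±3, which are thresholds conventionally used in QSAR practice rather than values given in the text. The function names and the 17 evenly spaced training lengths are hypothetical.

```python
import numpy as np


def leverages(X_train, X_query):
    """Leverage h = x (A^T A)^-1 x^T of each query row, where A is the
    training design matrix including an intercept column."""
    A = np.column_stack([np.ones(len(X_train)), X_train])
    Q = np.column_stack([np.ones(len(X_query)), X_query])
    core = np.linalg.pinv(A.T @ A)
    return np.einsum("ij,jk,ik->i", Q, core, Q)


def in_domain(X_train, X_query, std_residuals, limit=3.0):
    """Flag samples inside the applicability domain.

    Returns (boolean mask, warning leverage h*)."""
    n, p = X_train.shape
    h_star = 3.0 * (p + 1) / n                 # assumed warning leverage
    h = leverages(X_train, X_query)
    ok = (h <= h_star) & (np.abs(std_residuals) <= limit)
    return ok, h_star


# Hypothetical training set: 17 nanotube lengths as the single descriptor.
X_train = np.arange(1.0, 18.0).reshape(-1, 1)
# Two hypothetical query samples: one typical, one far outside the range.
X_query = np.array([[9.0], [100.0]])
std_residuals = np.array([0.5, 0.5])

inside, h_star = in_domain(X_train, X_query, std_residuals)
print(inside, round(h_star, 3))  # [ True False] 0.353
```

A prediction for a sample with leverage above h* is an extrapolation in descriptor space and should be treated with caution even when its residual is small.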
The results obtained in our study regarding the influence of the TiO 2 NTs' structure on photocatalytic activity (λ UV-VIS ) are in agreement with the results presented in previously published papers by experimental groups [28][29][30]. The morphology and structure of TiO 2 modified by ionic liquids are crucial parameters for increasing the efficiency of titanium nanotubes obtained in the electrochemical synthesis process. For example, according to the results provided by Liu et al. [28], longer nanotubes increase the amount of TiO 2 available for contact with pollutants, where the photocatalytic reaction occurs. Furthermore, titanium nanotubes longer than 100 µm are characterized by higher mechanical stability, even for thick layers [29]. The greater length of the nanotubes is associated with their higher durability, which results in a longer period of operation of such a photocatalyst and, consequently, higher efficiency of the photocatalytic system over its entire service life. The tube shape, because of its higher aspect ratio, provides a better geometry for diffusion and slows the trapping and recombination of light-generated electron-hole pairs, because the pairs do not have to migrate between nanoparticles [30,31]. As a result, the undesirable effect of electron-hole recombination is limited. Paramasivam et al. [32] indicate that modification with dopants leads to a reduction in the band gap; as a result, TiO 2 nanotubes can be excited by visible light. Also, the presence of ionic liquids increases the charge separation of photogenerated carriers at the photocatalyst surface, resulting in an increase in photocatalytic activity [33]. The presence of ionic liquids in the reaction environment increases the external dimensions of titania nanotubes in comparison to nanotubes obtained without ionic liquids [11]. Additionally, the authors of [13] report that the presence of Ti 3+ ions on the surface of TiO 2 enhances photoactivity under UV-VIS light.
Under UV-VIS irradiation, the presence of Ti 3+ ions in the TiO 2 matrix can facilitate electron-hole separation and promote the interfacial electron transfer process. This relationship was also confirmed by the linear models for activity under UV-VIS light for REE-doped titania nanotubes shown in Table 1. Moreover, self-organized titania nanotubes can be used in the form in which they were made, unlike titania nanotubes in powder form, which must be immobilized on a carrier by compacting or sintering, or suspended in a liquid [32].

Dataset of Descriptors
Our research focused on characteristics that determine the unique properties of newly designed NTs, i.e., descriptors related to structure composition (so-called system-independent properties), including the amount of chemicals used for surface modification, surface doping, high surface area, good adsorption ability, highly ordered array structure, and open mesoporous nature, as well as descriptors that describe synthesis conditions (so-called system-dependent properties), such as anodization voltage and time, ultrasonic treatment, or calcination time. The preparation of the samples and the characterization of their surface properties are described in previously published papers [11][12][13][14][15][16][17][20][21][22][23]26].

Hierarchical Clustering Analysis (HCA)
Hierarchical Clustering Analysis (HCA) is a grouping method that arranges objects (i.e., TiO 2 -based NTs with different compositions) into tiered, ordered clusters that can be used to explore the data and visualize their underlying structure. All clustering methods are built on the concept of similarity: the greater the distance between objects, the lesser their similarity [34]. In our work, we performed HCA on TiO 2 NTs with different compositions on the linear maps to provide information concerning their distribution among structure properties. In the presented study, the distance between objects was defined by the Euclidean distance, and clusters were merged using Ward's method, implemented in Python scripts [35].
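A minimal sketch of such a clustering step, assuming SciPy as the Python library (the specific scripts cited in [35] are not reproduced here) and using a small hypothetical descriptor matrix, could look as follows. Auto-scaling the descriptors before computing distances is also an assumption of this example.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Hypothetical descriptor matrix: 6 samples x 2 descriptors (illustrative values only)
X = np.array([
    [1.0, 30.0],
    [1.2, 32.0],
    [0.9, 29.0],
    [5.0, 60.0],
    [5.3, 62.0],
    [4.8, 58.0],
])

# Auto-scale each descriptor so that no single unit dominates the Euclidean distance
Xs = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

# Agglomerative clustering: Euclidean distance between objects, Ward's merging criterion
Z = linkage(Xs, method="ward", metric="euclidean")

# Cut the resulting tree into two clusters (the number of clusters is an arbitrary choice here)
labels = fcluster(Z, t=2, criterion="maxclust")
print(labels)
```

The linkage matrix `Z` holds one row per merge and can be passed to `scipy.cluster.hierarchy.dendrogram` to draw the tiered cluster structure.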

Principal Component Analysis (PCA)
Principal Component Analysis (PCA) is a method commonly used to reduce data complexity by creating, from the original dataset, a new set of uncorrelated vectors, and to analyze similarities among the studied structures [36]. In this method, new variables, called principal components (PCs), are constructed as linear combinations of the original ones, where the first PC explains the largest possible amount of the variance in the original dataset. Each subsequent PC explains the largest possible share of the variance left unexplained by the preceding PCs. Finally, every object from the original dataset is described by a set of PCs instead of the original variables.
In the present study, the PCA method was applied to group the studied TiO 2 -based NTs with different compositions based on their structural similarity. Thus, to find the relationship between the structures of the TiO 2 -based NTs and their potential photocatalytic activity, we presented the structures of the TiO 2 -based NTs in the space of two PCs (expressed as a score plot), interpreted according to Malinowski's rules (i.e., the contributing descriptors were selected based on normalized loadings higher than 0.7).
In the final step, the grouped objects were classified and mapped onto a color scale, in which the ranges corresponded to the values of the photocatalytic activity expressed as the phenol photodegradation rate (λ, µmol·dm −3 ·min −1 ) under VIS and UV-VIS light (Figure 3).
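The PCA steps described above can be sketched with NumPy. This is an illustrative example under stated assumptions, not the procedure used in the study: the descriptor matrix is hypothetical, and "normalized loadings" are interpreted here as the correlations between the auto-scaled descriptors and the PCs (eigenvector elements multiplied by the square root of the eigenvalue).

```python
import numpy as np

def pca_autoscaled(X, n_components=2):
    """PCA on auto-scaled data via eigendecomposition of the correlation matrix."""
    X = np.asarray(X, dtype=float)
    Xs = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    corr = np.cov(Xs, rowvar=False)              # equals the correlation matrix of X
    eigvals, eigvecs = np.linalg.eigh(corr)      # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:n_components]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    scores = Xs @ eigvecs                        # coordinates of the objects in PC space
    loadings = eigvecs * np.sqrt(eigvals)        # descriptor-PC correlations (assumed definition)
    explained = eigvals / X.shape[1]             # fraction of total variance per PC
    return scores, loadings, explained

# Hypothetical descriptor matrix: 6 samples x 3 descriptors (illustrative values only)
X = np.array([
    [1.0, 30.0, 0.2],
    [1.2, 32.0, 0.9],
    [0.9, 29.0, 0.5],
    [5.0, 60.0, 0.4],
    [5.3, 62.0, 0.8],
    [4.8, 58.0, 0.1],
])

scores, loadings, explained = pca_autoscaled(X, n_components=2)

# Descriptors whose absolute loading on a given PC exceeds 0.7
contributing = np.abs(loadings) > 0.7
print(explained)
print(contributing)
```

The two columns of `scores` are the coordinates that would be drawn in a score plot, and each point could then be colored by its measured photocatalytic activity.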

Nano-QSAR for Multicomponent Nanomaterials (Nano-QSARmix)
In the present study, QSARINS software was used to develop the Nano-QSAR mix models. The model was developed using the Multiple Linear Regression (MLR) technique on a dataset that was split into two sets: a training set (used to develop the Nano-QSAR mix model) and a validation set (used only for validating the model's predictive ability) [27]. To perform the split, the nanoparticles were sorted by increasing values of UV-VIS activity (Table 2). Then, every second NP was included in the validation set (V), whereas the remaining NPs formed the training set (T). In MLR, the endpoint (y i ) is described as the best combination of the most relevant auto-scaled descriptors used as independent variables (x 1 , x 2 , . . . , x n ) (1):
$$y_i = b_0 + b_1 x_1 + b_2 x_2 + \ldots + b_n x_n \qquad (1)$$
where b 0 is the intercept and b 1 , . . . , b n are the regression coefficients. The coefficient of determination (R 2 ) and the root mean square error of calibration (RMSEC) were used as the measures of goodness-of-fit for each developed model [18]. To verify the stability of the models (their sensitivity to the composition of the training set), the cross-validated coefficient Q 2 LOO (leave-one-out method) and the root mean square error of cross-validation (RMSECV) were calculated [18]. In addition, the leverage approach and the Williams plot were used to assess the applicability domain (AD) of the models. This was undertaken to verify the space, defined by the structural similarity of the nanoparticles and the values of UV-VIS activity, in which the model can make predictions with optimal reliability [27].
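The splitting and validation workflow can be illustrated with a short NumPy sketch. It is a simplified stand-in for the QSARINS procedure, not a reproduction of it: the lengths and activities are hypothetical values, the helper names are introduced only for this example, and Q 2 LOO is computed here as 1 − PRESS/SS tot with the training-set mean.

```python
import numpy as np

def fit_mlr(X, y):
    """Ordinary least squares with an intercept; returns [b0, b1, ..., bn]."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict(coef, X):
    return coef[0] + X @ coef[1:]

def r2_and_rmse(y_obs, y_pred):
    ss_res = np.sum((y_obs - y_pred) ** 2)
    ss_tot = np.sum((y_obs - y_obs.mean()) ** 2)
    return 1.0 - ss_res / ss_tot, np.sqrt(ss_res / len(y_obs))

def leave_one_out(X, y):
    """Refit the model n times, each time predicting the single left-out sample."""
    preds = np.empty(len(y))
    for i in range(len(y)):
        keep = np.arange(len(y)) != i
        preds[i] = predict(fit_mlr(X[keep], y[keep]), X[i:i + 1])[0]
    return r2_and_rmse(y, preds)  # (Q2_LOO, RMSECV)

# Hypothetical data: one descriptor (nanotube length) and a nearly linear activity
length = np.array([2.0, 9.0, 4.0, 7.0, 1.0, 10.0, 3.0, 8.0, 5.0, 6.0])
noise = np.array([0.05, -0.04, 0.02, -0.03, 0.01, 0.04, -0.05, 0.03, -0.02, 0.00])
activity = 0.5 + 0.2 * length + noise
X = length.reshape(-1, 1)

# Sort by increasing activity; every second sample goes to the validation set
order = np.argsort(activity)
train_idx, val_idx = order[0::2], order[1::2]

# Auto-scale the descriptor using training-set statistics only
mu, sd = X[train_idx].mean(axis=0), X[train_idx].std(axis=0, ddof=1)
X_train, X_val = (X[train_idx] - mu) / sd, (X[val_idx] - mu) / sd
y_train, y_val = activity[train_idx], activity[val_idx]

coef = fit_mlr(X_train, y_train)
r2, rmsec = r2_and_rmse(y_train, predict(coef, X_train))     # goodness-of-fit
q2_loo, rmsecv = leave_one_out(X_train, y_train)             # stability
r2_ext, rmsep = r2_and_rmse(y_val, predict(coef, X_val))     # external check on the validation set
print(r2, rmsec, q2_loo, rmsecv, r2_ext, rmsep)
```

Because a linear model with an intercept is invariant to rescaling of its descriptors, the auto-scaling is not repeated inside each leave-one-out fold in this sketch.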

Conclusions
The effect of the experimental conditions in a one-step anodization process was studied with ML methods to determine their influence on the morphology and the photocatalytic properties of newly designed TiO 2 -based NTs. The linear and non-linear ML-based methodology proposed here provides, for the first time, systematic knowledge about the relationship between experimental conditions, structural features, and photocatalytic activity under VIS and UV-VIS light. With the integrated data-driven strategy and predictive nano-QSPCR mix models, experimentalists will be able to identify and precisely control structural features of the TiO 2 NTs by the proper manipulation of experimental conditions. This will allow the design and control of photocatalytic properties at an early stage of TiO 2 NT design (before synthesis). The proposed methodology may significantly support the experimental design of novel TiO 2 NTs with desired properties. It may also significantly speed up the whole design process for a wider group of environmentally friendly TiO 2 -based nanomaterials in terms of the number of considered solutions, while reducing the cost and time of the required experiments. We believe that the proposed computer-aided methodology is crucial for accelerating the design of sustainable materials in line with the safe-and-sustainable-by-design (SSbD) strategy.