Smart Design Nano-Hybrid Formulations by Machine Learning

Nano-hybrid formulations combine organic and inorganic materials in self-assembled platforms for drug delivery. Laponite (LAP) is a biocompatible synthetic clay that can act as a host for other compounds. Poloxamines are amphiphilic four-armed copolymers with pH-sensitive and thermosensitive properties. The association of Laponite and poloxamine can be used to improve the attachment of drugs and to increase the solubility of β-Lapachone (β-Lap). β-Lap has antiviral, antiparasitic, antitumor, and anti-inflammatory properties; however, its low water solubility limits its clinical and medical applications. All samples were prepared by mixing Tetronic 1304 (T1304) and LAP in ranges of 1–20% (w/w) and 0–3% (w/w), respectively. The β-Lap solubility was analyzed by UV-vis spectrophotometry, and the physical behavior was evaluated across a range of temperatures. The data were analyzed by response surface methodology (RSM) and by two machine learning (ML) techniques: multilayer perceptron (MLP) and support vector machine (SVM). The ML models, trained on the experimental data, gave the best correlation coefficients for drug solubility and an adequate classification of the physical behavior of the systems. The SVM method presented the best fit for β-Lap solubilization. The in silico tools allowed fine-tuning and produced predictions close to the experimental data for β-Lap solubility and physical behavior, making them an excellent strategy for developing new nano-hybrid platforms.


Introduction
Nano-hybrid systems have been presented as an attractive platform for drug delivery. These systems combine organic and inorganic materials in self-assembled structures [1]. Laponite (inorganic network, LAP) nanoparticles are disk-like synthetic smectite clays; they are biocompatible, and they have been explored for hybridization with polymers or small molecules to improve the attachment of drugs [2][3][4]. LAP has the empirical formula $\mathrm{Na}^{+}_{0.7}[(\mathrm{Si}_{8}\mathrm{Mg}_{5.5}\mathrm{Li}_{0.3})\mathrm{O}_{20}(\mathrm{OH})_{4}]^{-0.7}$; its surfaces exhibit negative charges, whereas the edge charges are pH-dependent. Poloxamines (organic compounds) are amphiphilic four-armed (X-shaped) block copolymers of poly(ethylene oxide)-poly(propylene oxide)-poly(ethylene oxide) with pH-sensitive and thermosensitive properties. They are very attractive as drug delivery systems because they can form nanometric structures such as micelles or wormlike micelles [5,6]. Poloxamines are commercially available as Tetronic, in grades that differ in the number of poly(propylene oxide) (PPO) and poly(ethylene oxide) (PEO) units per arm, in hydrophilic-lipophilic balance (HLB), and in molecular weight [7]. We expect that associating the polymer (Tetronic) with the clay (LAP), with their respective properties of thermoresponsivity and swelling in aqueous solutions, promotes a significant increase in the solubilization of lipophilic drugs. In addition, adjusting the sol-gel phase transition of the nanocomposites can provide a modified release, depending on the body temperature at the application site [8,9].
β-Lap, the model drug used in this work, is derived from Lapachol, a natural product chemically identified as a naphthoquinone and extracted from various plant species of the genus Tabebuia (family Bignoniaceae), found in the northern and northeastern regions of Brazil. β-Lap has antiviral, antiparasitic, antitumor, and anti-inflammatory properties, showing promising potential in various biomedical applications [10]. However, β-Lap has very low water solubility, 0.038 mg mL⁻¹ [11], and the resulting low bioavailability limits its systemic administration and clinical applications.
Given the wide range of combinations of organic and inorganic compounds reported in the literature for nanohybrid pharmaceutical or cosmetic products, optimizing the parameters used during development makes it possible to assess the impact of each variable (input data) on the target (output data). One such approach is the response surface methodology (RSM), a technique created in 1951 by Box and Wilson [12] and widely used in the chemical industry to optimize experimental procedures with a reduced set of experiments. RSM is used to indicate an ideal operational region through mathematical models capable of predicting the impact of the various factors of a process, both individually and cumulatively, on the response of a system [13].
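A response surface of this kind can be sketched as a least-squares fit of a second-order polynomial in two factors. The example below is only an illustration under stated assumptions: coded factor levels (-1, 0, +1), a full quadratic model with an interaction term, synthetic responses, and hypothetical helper names (`design_row`, `solve`, `fit_surface`, `predict`). It is not the procedure used in this work, which relied on MATLAB.

```python
def design_row(x1, x2):
    """Terms of an assumed full quadratic model in two factors."""
    return [1.0, x1, x2, x1 * x2, x1 ** 2, x2 ** 2]


def solve(a, b):
    """Solve the linear system a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_surface(points, responses):
    """Least-squares coefficients from the normal equations (X^T X) b = X^T y."""
    rows = [design_row(x1, x2) for x1, x2 in points]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, responses)) for i in range(k)]
    return solve(xtx, xty)


def predict(coeffs, x1, x2):
    """Evaluate the fitted surface at one point."""
    return sum(c * t for c, t in zip(coeffs, design_row(x1, x2)))


# Hypothetical 3 x 3 grid in coded levels; the responses are synthetic, not measured data.
grid = [(x1, x2) for x1 in (-1.0, 0.0, 1.0) for x2 in (-1.0, 0.0, 1.0)]
true_coeffs = [0.1, 0.05, 0.2, 0.01, -0.001, -0.03]
y = [predict(true_coeffs, x1, x2) for x1, x2 in grid]

coeffs = fit_surface(grid, y)
print([round(c, 4) for c in coeffs])
```

Because the synthetic responses are generated from the same polynomial, the fit recovers the generating coefficients; with real data the residuals would show how much behavior the quadratic form cannot capture.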
Machine learning (ML) has been increasingly used as a predictive tool in different knowledge areas, such as telecommunications, electronics, bioengineering, and medicine, since the last decade of the twentieth century. ML techniques allow the analysis and extraction of new insights, accelerated discovery of materials and structures, and the planning of new experiments in an optimized way [14]. ML tools, such as support vector machine (SVM) and multilayer perceptron (MLP), use learning algorithms to discriminate the input-output relationships of complex non-linear systems, and they require a good set of inputs and outputs to shape the knowledge of the algorithm and provide the regression and classification parameters [14]. Although the MLP can achieve regression models that are closer to the real system (more predictive), it requires a large amount of experimental data for training and validation [14]. In contrast, the SVM model works very well with small datasets, generating good results for response surfaces and data classification [15][16][17].
Both RSM and ML (SVM and MLP) were applied to different T1304 and LAP concentrations to find the most efficient formulations for solubilizing β-Lap. The resulting prediction models must demonstrate a strong correlation with the experimental results. In addition, ML (SVM) was used to simulate the physical behavior of the samples [15][16][17].

Materials
Tetronic® 1304 (T1304; 10,500 Da; 21 PEO and 27 PPO units; HLB: 12-18) was kindly gifted by the BASF Corporation (Ludwigshafen, Germany). Laponite RD was kindly donated by BYK Additives & Instruments (Wesel, Germany). The β-Lap, the model drug used in this work, was synthesized from Lapachol by acid-catalyzed cyclization (H2SO4) at low temperature by Professor Celson Camara from Federal Rural University of Pernambuco (UFPE, Recife, Brazil). All other chemicals used in this work were of analytical grade and commercially available.

Preparation of the Nanocarriers
The nanocarrier systems (single and hybrid) were prepared by mixing Tetronic with a dispersion of LAP in water that had previously been stirred for 20 min. After the components (T1304 and LAP) were mixed, the samples were kept under stirring for 24 h. The nano-hybrid formulations studied in this work comprised different T1304 concentrations (1, 5, 10, 15, and 20%, w/w) with and without LAP (1.5 or 3%, w/w). All samples were prepared in triplicate at natural pH (~8.2 for nanocarriers in the absence of LAP and ~10 in the presence of LAP).
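The concentration grid described above can be enumerated, and the component masses for one batch computed, with a short sketch. The batch mass of 10 g and the function name `batch_masses` are hypothetical, and water is simply taken as the remainder of the batch, which is an assumption rather than a stated protocol.

```python
from itertools import product

T1304_LEVELS = (1, 5, 10, 15, 20)  # % w/w, as listed in the text
LAP_LEVELS = (0, 1.5, 3)           # % w/w; 0 stands for "without LAP"


def batch_masses(t1304_pct, lap_pct, batch_g=10.0):
    """Return grams of T1304, LAP and water for one batch (batch_g is hypothetical)."""
    t1304_g = batch_g * t1304_pct / 100
    lap_g = batch_g * lap_pct / 100
    water_g = batch_g - t1304_g - lap_g  # assumption: water makes up the rest
    return {"T1304_g": t1304_g, "LAP_g": lap_g, "water_g": water_g}


# Every combination of T1304 and LAP level
formulations = list(product(T1304_LEVELS, LAP_LEVELS))

for t1304_pct, lap_pct in formulations:
    print(t1304_pct, lap_pct, batch_masses(t1304_pct, lap_pct))
```

The grid has 5 x 3 = 15 compositions, each of which was prepared in triplicate.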

Characterization of the Nanocarriers

Experimental Design Using the Central Composite Design
Experimental results of β-Lap solubility were organized in a central composite design (CCD) with a 3² factorial layout (two factors and three levels), according to Table 1. The efficiency of the method, expressed as the coefficient of determination (R²) of the surface, was obtained using the computational tool MATLAB 2020 (License 650662, Mathworks, Natick, MA, USA). The resultant solubility surface of β-Lap is expressed by Equation (1), where $\beta_i$ represents the $i$-th linear and quadratic coefficient of the independent variables, the T1304 ($x_1$) and LAP ($x_2$) concentrations.

Machine Learning
MLP and SVM set-up for the surface response: Figure 1a,b show the MLP and SVM used to create the surface response, respectively. As in the RSM method (see Equation (1)), $x_1$ and $x_2$ represent the independent variables, the T1304 and LAP concentrations, respectively. Both MLP and SVM used two variables on the input layer, which were the proportions of T1304 and LAP (see Table 1). For the output layer, both ML techniques used β-Lap solubility data. The MLP used two hidden layers (with 16 neurons), the sigmoid activation function in all hidden neurons, and the linear function in the output neuron. The training step used Levenberg-Marquardt backpropagation [14][15][16]. The SVM used the sequential minimal optimization (SMO) algorithm in the training step and Gaussian kernels, expressed by Equation (2), where $\mathbf{x} = [x_1, x_2]$ and $\mathbf{c}_j$ is the centre of the $j$-th kernel [14][15][16]. Both ML techniques used 90% of the samples for the training phase and 10% for the validation step.

SVM set-up for phase behavior classification: Figure 2a,b illustrates the SVM used to classify the phase behavior. As an SVM makes a binary classification, a pool of five SVMs, one for each class, was employed. Each class $k$ represents the solution's phase behavior: liquid ($k = 1$), viscous liquid ($k = 2$), gel ($k = 3$), strong gel ($k = 4$), and solid ($k = 5$) (see Table 2).
Each $k$-th SVM of the pool (SVM$_k$) was configured with three inputs: the proportion of T1304 ($x_1$), the proportion of LAP ($x_2$), and the mixture's temperature ($x_3$) in degrees Celsius. Each SVM$_k$ had one output ($y_k$), where $0 \le y_k \le 1$ (see Figure 2a). The kernels in each SVM$_k$ were Gaussian (see Equation (2)), and SMO was also used as the training algorithm. The output of the classifier ($c$) can be expressed as $c = \mathrm{index}(\max(y_1, y_2, y_3, y_4, y_5))$.
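As a sketch of the forms that Equations (1) and (2) and the classifier output commonly take (not necessarily the exact expressions used in this work), the relations can be written as below. Here $y$ denotes the β-Lap solubility, $x_1$ and $x_2$ the T1304 and LAP concentrations, and $y_k$ the output of the $k$-th SVM; these symbol names are assumptions. For Equation (1), the intercept $\beta_0$, the interaction term, and the ordering of the coefficients are assumed, and for Equation (2) the width parameter $\sigma$ is a hypothetical symbol.

```latex
\begin{align}
  % Equation (1): assumed full second-order response surface
  y &= \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2
       + \beta_4 x_1^{2} + \beta_5 x_2^{2} \\
  % Equation (2): Gaussian kernel centred at c_j (sigma is an assumed width parameter)
  K(\mathbf{x}, \mathbf{c}_j) &= \exp\!\left(
       -\frac{\lVert \mathbf{x} - \mathbf{c}_j \rVert^{2}}{2\sigma^{2}} \right) \\
  % Classifier output: index of the SVM with the largest response
  c &= \operatorname*{arg\,max}_{k \in \{1,\dots,5\}} \; y_k
\end{align}
```

The last line is the same rule as taking the index of the maximum of the five SVM outputs: the sample is assigned to the class whose SVM responds most strongly.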
The dataset for the SVM classification model was split at random into a training subset (80%) and a validation subset (20%).
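The random 80/20 split and the pool decision rule described above can be sketched as follows. The function names (`split_80_20`, `classify`), the fixed seed, and the example outputs are hypothetical; the five SVM outputs are stand-in numbers, not results from trained models.

```python
import random

# Class labels as listed in the text (see Table 2)
PHASES = {1: "liquid", 2: "viscous liquid", 3: "gel", 4: "strong gel", 5: "solid"}


def split_80_20(samples, seed=0):
    """Shuffle a copy of the samples and return (training, validation) lists."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_train = round(0.8 * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]


def classify(outputs):
    """Return the class index k (1..5) whose SVM output y_k is the largest."""
    best = max(range(len(outputs)), key=lambda i: outputs[i])
    return best + 1  # classes are numbered from 1


# Hypothetical outputs y_1..y_5 of the five SVMs for one sample
y = [0.05, 0.10, 0.80, 0.30, 0.02]
k = classify(y)
print(k, PHASES[k])  # prints "3 gel"

train, valid = split_80_20(range(100))
print(len(train), len(valid))  # prints "80 20"
```

If two outputs tie, this sketch returns the lower class index; how ties were handled in the original implementation is not stated.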

Table 2, class 5 (solid): phase separation leading to a solid cluster suspended in the fluid.
The data used in RSM followed the same input and output schemes as in the ML techniques (SVM and MLP), and the implementations were made with MATLAB 2020 (License 650662, Mathworks, Natick, MA, USA). Table 3 shows the additional values (beyond Table 1) used for training (assays 1-12 in Table 1 and 12-18 in Table 3) and validating (assays 19-21 in Table 3) the ML techniques.

Experimental Procedure

Solubility of β-Lap in the Nanocarriers
The maximum solubility of β-Lap was evaluated by adding an excess of the drug (5 mg) to vials containing each of the nanocarrier systems mentioned in Section 2.2. The samples were kept under continuous agitation for 10 days using the Blood Homogenizer and Solutions Model AP 22 (Phoenix-Luferco, Araraquara, Brazil). After stirring, the samples were centrifuged (14,000 rpm for 15 min at 20 °C), and the β-Lap in the supernatant was quantified by UV-Vis spectrophotometry at 257 nm. The concentration of β-Lap was obtained from the linear regression of the calibration curve (2-10 μg mL⁻¹) of β-Lap in ethanolic solution (1:1), y = 0.1123x + 0.0041 (R² = 0.9998), where y is the absorbance measured by the equipment and x is the drug concentration (μg mL⁻¹) [11].
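Converting an absorbance reading into a concentration amounts to inverting the calibration line. The sketch below assumes that y is the absorbance and x the concentration in μg mL⁻¹, which is the reading consistent with a slope of 0.1123 over a 2-10 μg mL⁻¹ range. The function names, the `dilution_factor` parameter, and the example reading are hypothetical.

```python
SLOPE = 0.1123      # absorbance per (ug/mL), from the calibration curve
INTERCEPT = 0.0041  # absorbance at zero concentration
CAL_RANGE = (2.0, 10.0)  # ug/mL, calibrated range


def concentration_ug_per_ml(absorbance, dilution_factor=1.0):
    """Invert y = SLOPE * x + INTERCEPT and scale by a (hypothetical) dilution factor."""
    return (absorbance - INTERCEPT) / SLOPE * dilution_factor


def within_calibration(absorbance):
    """True if the undiluted reading maps inside the calibrated concentration range."""
    conc = (absorbance - INTERCEPT) / SLOPE
    return CAL_RANGE[0] <= conc <= CAL_RANGE[1]


reading = 0.5656  # hypothetical absorbance at 257 nm
print(round(concentration_ug_per_ml(reading), 2))           # 5.0
print(round(concentration_ug_per_ml(reading, 100), 1))      # 500.0 for a 100-fold dilution
print(within_calibration(reading))                          # True
```

Readings that map outside the calibrated range would normally call for further dilution before the line is applied.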

Phase Behavior of Nanocarriers
The phase behavior of the samples was determined by visual observation during a gradual increase of temperature in a water bath, from 20 to 80 °C in increments of 5 °C. The samples of T1304 (1-20%, w/w) with LAP (0-3%, w/w) were held for 10 min at each temperature. The samples were made in triplicate at their natural pH, as described in Section 2.2.1. Table 2 presents the parameters used to classify the phase behavior of the samples during the temperature ramp.

Table 4 presents the coefficients (β1-5) obtained from the RSM model, and Table 5 shows the mean square error (MSE) and the coefficient of determination (R²) calculated from the resulting RSM, MLP, and SVM models. For the RSM, the MSE and R² were calculated for the fitting values (assays 1-12 in Table 1) and the validation values (assays 19-21 in Table 3). For MLP and SVM, the MSE and R² were calculated for the training values (assays 1-9 in Table 1) and the validation values (assays 19-21 in Table 3).

According to Table 5, the SVM had better results than the other two techniques, MLP and RSM. The surface found by the RSM methodology is constrained by the form of Equation (1). This constraint produces a standard surface, but it can mask some of the behavior of the real surface. The MLP, on the other hand, fitted the training points more closely, although its surface showed abrupt changes in some regions. In terms of MSE, the results of MLP and RSM are similar; however, the MLP has better results in terms of R². The SVM behaved smoothly over the whole range of the surface variables T1304 ($x_1$) and LAP ($x_2$), and it had the smallest MSE and the highest R² (Table 5). The small number of samples available for training is an important aspect to consider when designing an MLP. In this context, the SVM algorithm appears to be the best alternative because it does not require as large a dataset as an MLP.
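The two figures of merit used to compare the models can be computed as in the sketch below. It assumes that R² is defined as one minus the ratio of the residual to the total sum of squares; the definition applied by the software used in this work may differ. The observed and predicted values are hypothetical numbers chosen only to make the arithmetic easy to follow.

```python
def mse(y_true, y_pred):
    """Mean square error between observed and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Hypothetical observed and predicted solubilities (arbitrary units)
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]

print(round(mse(observed, predicted), 4))        # 0.025
print(round(r_squared(observed, predicted), 4))  # 0.98
```

A lower MSE and an R² closer to 1 on the validation assays, rather than on the training assays alone, is what indicates that a model generalizes.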
Figure 4 shows the physical behavior of samples with different concentrations of T1304, with or without LAP, over a wide temperature range, as obtained by the SVM after training and validation. The physical behavior classifiers used 58, 50, 7, 25, and 22 kernels in SVM$_1$, SVM$_2$, SVM$_3$, SVM$_4$, and SVM$_5$, respectively.

Phase Behavior Classification by SVM
According to Figure 4, samples with T1304 (1-20%, w/w) and no LAP remained liquid at practically all concentrations across the broad temperature range studied; a sol-gel phase transition occurred only above 75 °C. Figure 4 also shows the positive influence of LAP as a formulation ingredient capable of changing the samples' physical behavior at different temperatures. According to the results, at 1.5% of LAP it was possible to obtain samples with five different physical states at different temperatures, demonstrating that they are good candidates for future studies on drug delivery nanocarriers. Nano-hybrid systems with a sol-gel phase transition at body temperatures, such as 32 °C, 35 °C, and 37 °C (Figure 5), are very promising and can improve characteristics such as the release or the bioavailability of drugs.

Prediction models based on machine learning techniques with an accurate response and high effectiveness have been increasingly used to develop pharmaceutical formulations. The purpose of using predictive techniques is to minimize costs (materials, equipment, and workers, among others) and to accelerate the development of new medicines with the desired target characteristics. Figure 6a,b shows slices of the samples' phase diagram obtained by SVM at temperatures from 20 to 40 °C (a) and from 40 to 80 °C (b). These results indicate that ML is an excellent optimization tool for pharmaceutical formulations.

Conclusions
The association of Tetronic and Laponite in different concentrations allowed the formation of systems that present different phase behaviors as a function of temperature. LAP had a great influence on the liquid-gel transition of the systems. β-Lap solubility increased markedly in samples with T1304 (over 10%) and 1.5% LAP, or in systems with only LAP (1.5%). Among the in silico ML tools, the SVM allowed fine-tuning and produced predictions close to the experimental data, proving to be an excellent strategy for the development of new nano-hybrid platforms.