Learning to extrapolate using continued fractions: Predicting the critical temperature of superconductor materials

In the field of Artificial Intelligence (AI) and Machine Learning (ML), the approximation of unknown target functions $y=f(\mathbf{x})$ using limited instances $S=\{(\mathbf{x^{(i)}},y^{(i)})\}$, where $\mathbf{x^{(i)}} \in D$ and $D$ represents the domain of interest, is a common objective. We refer to $S$ as the training set and aim to identify a low-complexity mathematical model that can effectively approximate this target function for new instances $\mathbf{x}$. Consequently, the model's generalization ability is evaluated on a separate set $T=\{\mathbf{x^{(j)}}\} \subset D$, where $T \neq S$, frequently with $T \cap S = \emptyset$, to assess its performance beyond the training set. However, certain applications require accurate approximation not only within the original domain $D$ but also in an extended domain $D'$ that encompasses $D$. This becomes particularly relevant in scenarios involving the design of new structures, where minimizing errors in approximations is crucial. For example, when developing new materials through data-driven approaches, the AI/ML system can provide valuable insights to guide the design process by serving as a surrogate function. Consequently, the learned model can be employed to facilitate the design of new laboratory experiments. In this paper, we propose a method for multivariate regression based on iterative fitting of a continued fraction, incorporating additive spline models. We compare the performance of our method with established techniques, including AdaBoost, Kernel Ridge, Linear Regression, Lasso Lars, Linear Support Vector Regression, Multi-Layer Perceptrons, Random Forests, Stochastic Gradient Descent, and XGBoost. To evaluate these methods, we focus on an important problem in the field: predicting the critical temperature of superconductors based on physical-chemical characteristics.


Introduction
Superconductors are remarkable materials that exhibit the extraordinary property of conducting electrical current with zero resistance. This unique characteristic has led to a wide range of applications, with Magnetic Resonance Imaging (MRI) systems being used globally as a crucial medical tool for producing detailed images of internal organs and tissues. Additionally, in the face of increasing energy demands driven by renewable energy sources and innovations like solar cars, superconductors hold the potential for efficient energy transfer.
The elimination of electrical resistance in superconductors significantly reduces energy wastage during current transmission from one location to another. However, a major limitation of existing superconductors is their reliance on extremely low temperatures, known as critical temperatures ($T_c$), to achieve zero resistance. Typically, these critical temperatures are incredibly cold, often around -196°C, and vary depending on the specific superconducting material Hamidieh [2018]. Predicting the critical temperature ($T_c$) of superconductors has therefore become a topic of great interest in the field of materials science.
In this study, we leverage various machine learning techniques and propose a novel approach based on multivariate continued fractions to develop mathematical models capable of predicting the critical temperature of superconductors. Our models rely solely on the characterization of the chemical structure of the superconducting material, uncovering hidden information within. Accurate prediction of $T_c$ for superconductors will greatly enhance our ability to harness their potential, ushering in a new era of possibilities in multiple fields.

Continued Fraction Regression
In 2019, a new approach for multivariate regression using continued fractions was introduced in Sun and Moscato [2019] and compared with a state-of-the-art genetic programming method for regression. A year later, this technique's results on 354 datasets from the physico-chemical sciences were presented in Moscato et al. [2020] and compared with those of ten state-of-the-art regression techniques. The new method was the top-ranked performer on the training set in 352 out of the 354 datasets, and it was also first in terms of generalisation in 192, which is more than half of the total and more times than all other 10 methods combined. The figure of merit was the Mean Squared Error.
We named this approach 'Continued Fraction Regression', or CFR. The best existing algorithm currently uses a memetic algorithm to optimize the coefficients of a model that approximates a target function as the convergent of a continued fraction Sun and Moscato [2019], Moscato et al. [2020, 2021]. Memetic algorithms are a well-established research area in the field of Evolutionary Computation, and the IEEE established a Task Force in Computational Intelligence for their study. We therefore refer readers to some of the latest references and reviews on the field Cotta and Moscato [2007], Moscato [2012], Cotta et al. [2018], Moscato and Cotta [2019], Moscato and Mathieson [2019]. Very recently, continued fraction regression has been used to obtain analytical approximations of the minimum electrostatic energy configuration of $n$ electrons when the charges are constrained to lie on the surface of a sphere, i.e. the celebrated Thomson Problem Moscato et al. [2023].
Some basic introduction to analytic continued fraction approximation is perhaps necessary. A continued fraction for a real value $\alpha$ is of the following form
$$\alpha = a_0 + \cfrac{b_1}{a_1 + \cfrac{b_2}{a_2 + \cfrac{b_3}{a_3 + \cdots}}} \qquad (1)$$
and may be finite or infinite Sun et al. [2019], according to $\alpha$ being a rational number or not, respectively.
Euler proved a mathematical formula that allows us to write a sum of products as a continued fraction:
$$a_0 + a_0 a_1 + a_0 a_1 a_2 + \cdots + a_0 a_1 \cdots a_n = \cfrac{a_0}{1 - \cfrac{a_1}{1 + a_1 - \cfrac{a_2}{1 + a_2 - \cfrac{\ddots}{\ddots \; \cfrac{a_{n-1}}{1 + a_{n-1} - \cfrac{a_n}{1 + a_n}}}}}} \qquad (2)$$
This simple yet powerful equation reveals how infinite series can be written as infinite continued fractions, meaning that continued fractions can be a good general technique to approximate analytic functions, thanks to improved optimization methods such as those provided by memetic algorithms Moscato et al. [2021]. Indeed, CFR has already been demonstrated to be an effective regression technique on the real-world benchmark provided by the Penn Machine Learning Database Moscato et al. [2021].
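As a small numerical check of Euler's identity, the sketch below evaluates both sides exactly with rational arithmetic for a short list of terms. The function names and the example values are illustrative assumptions and are not taken from the CFR implementation.

```python
from fractions import Fraction


def sum_of_products(a):
    """Left-hand side: a0 + a0*a1 + ... + a0*a1*...*an."""
    total, running = Fraction(0), Fraction(1)
    for term in a:
        running *= term
        total += running
    return total


def euler_continued_fraction(a):
    """Right-hand side, evaluated from the innermost level outwards."""
    if len(a) == 1:
        return a[0]
    tail = 1 + a[-1]                      # innermost denominator: 1 + a_n
    for j in range(len(a) - 2, 0, -1):    # levels n-1, ..., 1
        tail = 1 + a[j] - a[j + 1] / tail
    return a[0] / (1 - a[1] / tail)


# Hypothetical example terms (any values that avoid a zero denominator work).
terms = [Fraction(1, 2), Fraction(1, 3), Fraction(2, 5), Fraction(3, 7)]
lhs = sum_of_products(terms)
rhs = euler_continued_fraction(terms)
print(lhs, rhs)
```

Both sides evaluate to 16/21 for these terms, which illustrates how a truncated series and a finite continued fraction can represent the same quantity.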
In this paper, we will use Carl Friedrich Gauss' mathematical notation for generalized continued fractions Backeljauw and Cuyt [2009] (i.e. a compact notation where "K" stands for the German word "Kettenbruch", which means 'continued fraction'). Using this notation, we may write the continued fraction in (1) as:
$$\alpha = a_0 + \operatorname*{K}_{i=1}^{\infty} \frac{b_i}{a_i}$$
Thus, the problem of finding an approximation of an unknown target function of $n$ variables $\mathbf{x}$, given a training dataset of $m$ samples $S = \{(\mathbf{x^{(i)}}, y^{(i)})\}$, is that of finding the set of functions $F = \{a_0(\mathbf{x}), \ldots, b_1(\mathbf{x}), \ldots\}$ such that a certain objective function is minimized.

Materials and Methods

A new approach: Continued Fractions with Splines
In previous contributions Sun and Moscato [2019], Moscato et al. [2020, 2021], a memetic algorithm was always employed to find the approximations. Here, we present another method to fit continued fraction representations by iteratively fitting splines.
Splines provide a regression technique that involves fitting piecewise polynomial functions to the given data Boor [1978]. The domain is partitioned into intervals at locations known as "knots". Then, a polynomial model of degree $n$ is separately fitted for each interval, generally enforcing boundary conditions that include continuity of the function as well as continuity of its derivatives up to order $n-1$ at each of the knots. Splines can be represented as a linear combination of basis functions, of which the standard is the B-spline basis. Thus, fitting a spline model is equivalent to fitting a linear model of basis functions. We refer to Hastie et al. [2009] for the particular definition of the B-spline basis.
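First, when all the functions $b_i(\mathbf{x}) = 1$, for all $i$, we have a simple continued fraction representation, and we can write it as:
$$f(\mathbf{x}) = g_0(\mathbf{x}) + \cfrac{1}{g_1(\mathbf{x}) + \cfrac{1}{g_2(\mathbf{x}) + \cfrac{1}{g_3(\mathbf{x}) + \cdots}}}$$
Note that for a term $g_i(\mathbf{x})$, we say that it is at "depth" $i$.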
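To make the notion of depth concrete, the following sketch evaluates a simple continued fraction truncated at a given depth, working from the deepest term back to depth 0. The helper name `evaluate_cf` and the example terms are hypothetical and chosen only for illustration.

```python
from fractions import Fraction


def evaluate_cf(terms, x):
    """Evaluate g_0(x) + 1/(g_1(x) + 1/(... + 1/g_d(x))) for terms = [g_0, ..., g_d]."""
    value = terms[-1](x)                 # start at the deepest term
    for g in reversed(terms[:-1]):       # fold back towards depth 0
        value = g(x) + 1 / value
    return value


# Example 1: every term equal to 1 gives the convergents of the golden ratio.
ones = [lambda x: Fraction(1)] * 5
convergents = [evaluate_cf(ones[: depth + 1], None) for depth in range(5)]

# Example 2: terms that depend on x, truncated at depth 1.
value_at_2 = evaluate_cf([lambda x: x, lambda x: x + 1], Fraction(2))
print(convergents, value_at_2)
```

Truncating at a larger depth adds one more nested term, which is the operation the iterative fitting procedure described next performs one level at a time.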
Finding the best values for the coefficients in the set of functions $\{g_i(\mathbf{x})\}$ can be addressed as a non-linear optimization problem, as in Sun and Moscato [2019], Moscato et al. [2020, 2021]. However, despite the great performance of that approach, we aim to introduce a faster variant that can scale well to larger datasets such as this one.
Towards that end, and with scalability in mind, we fit the model iteratively by depth as follows: we first consider only the first term, $g_0(\mathbf{x})$ (at depth 0), ignoring all other terms. We fit a model for the first term using the predictors $\mathbf{x}$ and the target $f(\mathbf{x})$. Next, we consider only the first and second depths, with the terms $g_0(\mathbf{x})$ and $g_1(\mathbf{x})$, ignoring the rest. We then fit $g_1(\mathbf{x})$ using the previously fit model for $g_0(\mathbf{x})$. For example, truncating the expansion at depth 1, we have that
$$f(\mathbf{x}) \approx g_0(\mathbf{x}) + \frac{1}{g_1(\mathbf{x})}, \quad \text{so that} \quad g_1(\mathbf{x}) \approx \left(f(\mathbf{x}) - g_0(\mathbf{x})\right)^{-1}.$$
Thus, we fit $g_1(\mathbf{x})$ using the predictors $\mathbf{x}$ and the target $(f(\mathbf{x}) - g_0(\mathbf{x}))^{-1}$. We label this target as $y^{(1)}$. We repeat this process, fitting a new model by truncating at the next depth and using the models fit at previous depths and iterations.
We have that at depth $i > 0$, the target $y^{(i)}$ for the model $g_i(\mathbf{x})$ is $y^{(i)} = \epsilon_{i-1}(\mathbf{x})^{-1}$, where $\epsilon_{i-1}(\mathbf{x})$ is the residual of the previous depth's model, $y^{(i-1)} - g_{i-1}(\mathbf{x})$.
One notable characteristic of this approach is that if any model $g_i(\mathbf{x})$, $i > 0$, evaluates to 0, then we will have a pole in the continued fraction, which is often spurious. To remedy this, we modify the structure of the fraction such that each fitted $g_i(\mathbf{x})$, $i > 0$, is encouraged to be strictly positive on the domain of the training data. To do this, we add a constant $C_i$ to $\epsilon_i$ when calculating the target $y^{(i+1)}$, where $C_i = |\min_{\mathbf{x}} \epsilon_i(\mathbf{x})|$. Thus, the targets $y^{(i)}$ for $i > 0$ are all non-negative, encouraging each $g_i(\mathbf{x})$, $i > 0$, to be strictly positive. For example, for $g_1(\mathbf{x})$, we would have that the target is $y^{(1)} = (f(\mathbf{x}) - g_0(\mathbf{x}) + C_0)^{-1}$. Of course, we must then subtract $C_i$ from $g_i(\mathbf{x})$ in the final continued fraction model.
We have found that data normalization often results in a better fit using this approach. It is sufficient to simply divide the targets uniformly by a constant when training and multiply by the same constant for prediction. We denote this constant parameter by norm.
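The depth-by-depth procedure, the positivity offset and the normalization constant can be sketched together as follows. This is a minimal one-dimensional illustration under stated assumptions: ordinary least-squares polynomials stand in for the spline models, and a small `floor` is added to each offset so that the reciprocal target never divides by zero. Neither choice is taken from the method described here, and all function names are hypothetical.

```python
import numpy as np


def fit_poly(x, y, degree):
    """Least-squares polynomial fit (a stand-in for the spline model)."""
    coeffs = np.polyfit(x, y, degree)
    return lambda z: np.polyval(coeffs, z)


def fit_cf(x, y, max_depth=2, norm=1.0, degree=3, floor=1e-2):
    """Fit g_0, ..., g_max_depth one depth at a time."""
    models, offsets = [], []
    target = y / norm                          # y^(0)
    for depth in range(max_depth + 1):
        g = fit_poly(x, target, 1 if depth == 0 else degree)  # g_0 is linear
        models.append(g)
        if depth == max_depth:
            break
        resid = target - g(x)                  # epsilon_i
        c = abs(resid.min()) + floor           # C_i, plus an assumed safety floor
        offsets.append(c)
        target = 1.0 / (resid + c)             # y^(i+1), strictly positive
    return models, offsets, norm


def predict_cf(models, offsets, norm, x):
    """Evaluate the fitted fraction, subtracting each C_i from g_i."""
    value = models[-1](x)
    for g, c in zip(reversed(models[:-1]), reversed(offsets)):
        value = g(x) - c + 1.0 / value
    return norm * value


# Hypothetical toy data.
x = np.linspace(0.5, 3.0, 40)
y = x + 1.0 / (1.0 + x)
models, offsets, norm = fit_cf(x, y, max_depth=2)
pred = predict_cf(models, offsets, norm, x)
```

With `max_depth=0` the sketch reduces to a plain linear fit, and dividing by `norm` during training is undone by the final multiplication in `predict_cf`.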
A good choice of regression model for each $g_i(\mathbf{x})$ is a spline, since splines are well-established. For reasons stated in the next section, the exception is the first term $g_0(\mathbf{x})$, which is a linear model. We use an additive model to work with multivariate data, where each term is a spline along one dimension. That is, given $m$ predictor variables, we have that
$$g_i(\mathbf{x}) = \sum_{j=1}^{m} f_j(x_j)$$
for each term $g_i(\mathbf{x})$, $i > 0$, where each function $f_j$ is a cubic spline along variable $j$. That is, $f_j$ is a piecewise polynomial of degree 3 and is a function of variable $j$.
We implement the splines with a penalized cubic B-spline basis. That is, $f_j(x_j) = \sum_{i=1}^{k} \beta_i B_i(x_j)$, where each $B_i(x_j)$ is one of $k$ cubic B-spline basis functions along dimension $j$ and corresponds to one of $k$ knots. We use the following loss function:
$$L(\mathbf{B}, \mathbf{y}, \boldsymbol{\beta}) = \lVert \mathbf{y} - \mathbf{B}\boldsymbol{\beta} \rVert^2 + \lambda \sum_{j=1}^{m} \boldsymbol{\beta}^T P_j \boldsymbol{\beta}$$
where $\mathbf{B}$ is the matrix of cubic B-spline basis functions for all variables, $\boldsymbol{\beta}$ is the vector of all of the weights, and $P_j$ is the associated second derivative smoothing penalty matrix for the basis for the spline $f_j$. This is standard for spline models Hastie et al. [2009]. The pseudocode for this approach is shown in Algorithm 1.
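A one-variable sketch of a penalized cubic B-spline fit is given below. It builds the basis with the Cox-de Boor recursion and minimizes a squared-error loss plus a roughness penalty in closed form. As an assumption made for brevity, the penalty matrix is the second-difference (P-spline) matrix acting on the coefficients, a common discrete stand-in for a second derivative penalty; the helper names, knot layout and data are hypothetical.

```python
import numpy as np


def bspline_basis(x, knots, degree=3):
    """Cox-de Boor recursion; returns an (n_samples, n_basis) design matrix."""
    x = np.asarray(x, dtype=float)
    t = np.asarray(knots, dtype=float)
    # Degree-0 basis: indicator of each half-open knot interval.
    basis = np.array(
        [(t[i] <= x) & (x < t[i + 1]) for i in range(len(t) - 1)], dtype=float
    ).T
    for d in range(1, degree + 1):
        new = np.zeros((len(x), len(t) - 1 - d))
        for i in range(len(t) - 1 - d):
            left = t[i + d] - t[i]
            right = t[i + d + 1] - t[i + 1]
            if left > 0:
                new[:, i] += (x - t[i]) / left * basis[:, i]
            if right > 0:
                new[:, i] += (t[i + d + 1] - x) / right * basis[:, i + 1]
        basis = new
    return basis


def fit_penalized(B, y, lam):
    """Minimize ||y - B beta||^2 + lam * beta^T P beta; return beta and the penalty."""
    D = np.diff(np.eye(B.shape[1]), n=2, axis=0)   # second differences of beta
    P = D.T @ D
    beta = np.linalg.solve(B.T @ B + lam * P, B.T @ y)
    return beta, float(beta @ P @ beta)


# Hypothetical data on [0, 1) with a clamped knot vector.
x = np.linspace(0.0, 1.0, 60, endpoint=False)
y = np.sin(6.0 * x)
knots = np.concatenate(([0.0] * 3, np.linspace(0.0, 1.0, 6), [1.0] * 3))
B = bspline_basis(x, knots)
beta_rough, pen_rough = fit_penalized(B, y, lam=0.01)
beta_smooth, pen_smooth = fit_penalized(B, y, lam=100.0)
```

Increasing the regularization parameter trades fidelity for smoothness: the roughness term of the fitted coefficients cannot increase as `lam` grows.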
Algorithm 1: Iterative CFR using additive spline models with adaptive knot selection
Input: Training data D = {(x_1, f(x_1)), ..., (x_n, f(x_n))} and parameters λ, k, norm, and max_depth
/* Let n be the number of samples; m be the number of variables */
/* X ∈ R^(n×m) be the data matrix and y ∈ R^n be the vector of targets. */
knot_indices ← {}
y^(0) ← y / norm
for i ← 0, 1, ..., max_depth do
    if i = 0 then
        g_0 ← linear model fitted to (X, y^(0))
    else
        g_i ← additive spline model with knots at the samples in knot_indices, fitted to (X, y^(i)) by minimizing ‖y^(i) − Bβ‖² + λ Σ_j β^T P_j β
    end
    /* Compute ε_i, the vector of residuals of the ith model, and then compute the targets and knot locations for the next depth. */
    ε_i ← y^(i) − g_i(X)
    if i < max_depth then
        C_i ← |min ε_i|
        y^(i+1) ← (ε_i + C_i)^(−1)
        knot_indices ← knot_indices ∪ SelectKnots(ε_i, k)
    end
end
The estimate for f(x) at max_depth is:
$$f(\mathbf{x}) \approx \text{norm} \times \left[ g_0(\mathbf{x}) - C_0 + \cfrac{1}{g_1(\mathbf{x}) - C_1 + \cfrac{1}{\ddots + \cfrac{1}{g_{\text{max\_depth}}(\mathbf{x})}}} \right]$$

The iterative method of fitting continued fractions also allows for an adaptive method of selecting knot placements for the additive spline models. For the spline model $g_i(\mathbf{x})$ at depth $i > 0$, we use all of the knots of the spline model $g_{i-1}(\mathbf{x})$ at depth $i - 1$. Then, for each variable, we place $k$ new knots at the unique locations of the $k$ samples with the highest absolute error from the model $g_{i-1}(\mathbf{x})$ at depth $i - 1$. As the points with the highest error are likely to be very close to each other, we impose the condition that the selected samples must have errors of alternating signs.
That is, for $g_i(\mathbf{x})$, $i > 0$, we select $k$ knots, with the first knot at the location of the sample with the highest absolute error computed from the model $g_{i-1}(\mathbf{x})$. For the rest of the knots, the $j$th knot is placed at the location of the sample with the next highest absolute error after the sample used for the $(j-1)$th knot, but only if the sign of the (non-absolute) error of that sample differs from the sign of the (non-absolute) error of the sample used for the $(j-1)$th knot. Otherwise, we move on to the sample with the next highest absolute error, and so on, until this condition is fulfilled. This knot selection procedure is shown in Algorithm 2. Note that we let $g_0$ be a linear model, as there is no previous model from which to obtain the knot locations.
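The alternating-sign selection rule described above can be sketched in a few lines. The function name `select_knots` and the example residuals are hypothetical. Skipping samples whose residual is exactly zero is an added assumption, since such samples carry no sign.

```python
def select_knots(residuals, k):
    """Return indices of up to k samples, by decreasing absolute residual,
    such that consecutive selected residuals have opposite signs."""
    order = sorted(range(len(residuals)), key=lambda i: abs(residuals[i]), reverse=True)
    chosen, last_sign = [], 0
    for idx in order:
        if len(chosen) == k:
            break
        sign = (residuals[idx] > 0) - (residuals[idx] < 0)
        if sign == 0:
            continue                       # assumption: ignore exact zeros
        if not chosen or sign != last_sign:
            chosen.append(idx)
            last_sign = sign
    return chosen


# Hypothetical residual vector: the two largest errors are both positive,
# so the second one is skipped in favour of the largest negative error.
example_residuals = [0.5, -0.2, 3.0, 2.5, -1.0, 0.1]
example_knots = select_knots(example_residuals, k=3)
print(example_knots)  # [2, 4, 0]
```

In the additive model, the returned sample indices would give the knot location along each variable, so that neighbouring high-error samples of the same sign do not all receive a knot.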
The goal of using additive spline models with the continued fraction is to take advantage of the continued fraction representation's demonstrated ability to approximate general functions (see the discussion on the relationship with Padé approximants in Moscato et al. [2021]). The fraction's hierarchical structure allows for the automatic introduction of variable interactions, which are not included in the individual additive models that constitute the fraction. The iterative approach to fitting allows for a better algorithm for knot selection.
An example of this algorithm modeling the well-known gamma function (with standard normally distributed noise added) is shown in Fig. 1. Here, we show how the fit to the gamma function is affected by different depths (3, 5, 10, 15) of the Spline Continued Fraction. As desired, it is evident from the figure that a Spline Continued Fraction with more depth fits the data better.
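Algorithm 2: SelectKnots (Adaptive Knot Selection)
Input: ε_i, k
/* Given the vector of residuals ε_i of the spline model at depth i, select the knot placements for the next spline model at depth i + 1 */
/* Sort by indices of highest absolute error */
abs_error ← elementWiseAbsoluteValue(ε_i)
highest_error_indices ← argsortDecreasing(abs_error)
/* Take the top k highest order indices, such that each error term has the opposite sign of the last */
selected ← {highest_error_indices[1]}
last_sign ← sign(ε_i[highest_error_indices[1]])
for each subsequent index t in highest_error_indices do
    if |selected| = k then stop
    if sign(ε_i[t]) ≠ last_sign then
        selected ← selected ∪ {t}
        last_sign ← sign(ε_i[t])
    end
end
return selected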

Data and Methods in the Study
We used the superconductivity dataset, also used by Hamidieh [2018], from the UCI Machine Learning repository. The website contains two files. In this work, we have only used the train.csv file, which contains information on 21263 superconductors along with the critical temperature and a total of 81 attributes for each of them.
We conducted two main studies to assess the generalization capabilities of many regression algorithms. We denote them as the Out-of-Sample and Out-of-Domain studies, respectively. For the Out-of-Sample study, the data was randomly partitioned into 2/3 training data and 1/3 test data. Each model was fit on the training data, and the RMSE was calculated on the separate test portion of the data.
For the Out-of-Domain study, the data was partitioned such that the training samples are always extracted from the set of samples with the lowest 90% of critical temperatures. For the test set, the samples come from the highest 10% of critical temperatures. It turned out that the lowest 90% have critical temperatures < 89 K, whereas the highest 10% have temperatures greater than or equal to 89 K, ranging from 89 K to 185 K (we highlight that the range of variation of the test set is larger than that of the training set, making generalization a challenging task). For each of the 100 repeated runs of the Out-of-Domain test, we randomly took 1/2 of the training set (from the lowest 90% of the observed values) to train the models and the same ratio from the test data (from the highest 10% of the observed values) to estimate the model performance. That said, the Out-of-Domain study allows us to see the capacity of several regression models for "predicting" on a set of materials that have higher critical temperatures, meaning that generalization, in this case, is strictly connected with the extrapolation capacity of the fitted models. We executed both the Out-of-Sample and Out-of-Domain tests 100 times to help us validate our conclusions with statistical results.
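One run of the Out-of-Domain partition can be sketched as below. The sketch splits by rank rather than by the 89 K threshold, which is an assumption that ignores ties at the boundary, and the function name, argument names and toy targets are hypothetical.

```python
import numpy as np


def out_of_domain_split(y, rng, low_fraction=0.9, sample_fraction=0.5):
    """Return (train_idx, test_idx): train from the lowest targets, test from the highest."""
    order = np.argsort(y, kind="stable")
    cut = int(round(low_fraction * len(y)))
    low, high = order[:cut], order[cut:]
    train = rng.choice(low, size=int(len(low) * sample_fraction), replace=False)
    test = rng.choice(high, size=int(len(high) * sample_fraction), replace=False)
    return train, test


# Hypothetical targets standing in for critical temperatures.
rng = np.random.default_rng(0)
y_toy = np.arange(100, dtype=float)
train_idx, test_idx = out_of_domain_split(y_toy, rng)
```

Every test target lies above every training target, so any accurate prediction on the test portion requires the model to extrapolate beyond the range seen in training.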
The Spline Continued Fraction model had a depth of 5, five knots per depth, a normalization constant of 1000, and a regularization parameter λ of 0.5. These parameters resulted from one-dimensional non-linear model fitting on problems like the gamma function with noise (already discussed in Fig. 1) and others such as fitting the function $f(x) = \sin(x)/x$. The parameters were selected empirically using these datasets, and no problem-specific tuning on the superconductivity dataset was conducted.
The final model was then iteratively produced by beginning at a depth of 1 and increasing the depth by one until the error was greater than the one observed for a previous depth (which we considered as a proxy for overfitting the data).
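The stopping rule for the depth can be sketched as a simple loop. The callable `error_at_depth` is a hypothetical placeholder for fitting a model truncated at a given depth and returning its error; which error is monitored is not specified here, so the example uses made-up values.

```python
def select_depth(error_at_depth, max_depth):
    """Increase the depth from 1 until the error rises; return the last depth before the rise."""
    best_depth, best_error = 1, error_at_depth(1)
    for depth in range(2, max_depth + 1):
        error = error_at_depth(depth)
        if error > best_error:          # treated as a proxy for overfitting
            break
        best_depth, best_error = depth, error
    return best_depth


# Hypothetical error values per depth.
toy_errors = {1: 5.0, 2: 4.0, 3: 3.0, 4: 3.5, 5: 1.0}
chosen_depth = select_depth(toy_errors.get, max_depth=5)
print(chosen_depth)  # 3
```

Because the loop stops at the first increase, a later depth with a lower error (depth 5 in the toy values) is never examined, which keeps the search cheap.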
To compare the performance of the Spline Continued Fraction (Spln-CFR) introduced in this paper with other state-of-the-art regression methods, we used a set of regressors from two popular Python libraries (XGBoost Chen and Guestrin [2016] and the Scikit-learn machine learning library Pedregosa et al. [2011]), giving 11 methods in total. The regression methods, with the abbreviations used in the tables and figures, are: AdaBoost (ada-b), Gradient Boosting (grad-b), Kernel Ridge (krnl-r), Lasso Lars (lasso-l), Linear Regression (l-regr), Linear SVR (l-svr), MLP Regressor (mlp), Random Forest (rf), Stochastic Gradient Descent (sgd-r), and XGBoost (xg-b). The XGBoost code is available as an open-source package. The parameters of the XGBoost model were the same as those used in Hamidieh [2018]. We kept the parameters of the other machine learning algorithms at their Scikit-learn defaults.
All experiments were performed on an Intel Core i7-9750H hex-core computer with hyperthreading and 16 GB of memory. The machine was running the Windows 10 operating system. We used Python v3.7 to implement the Spline Continued Fraction using the pyGAM package Servén and Brummitt [2018]. All experiments were executed under the same Python runtime and computing environment.

Results
Table 1: Results from the 100 runs of the proposed Spline Continued Fraction and ten regression methods, all trained on the dataset, with the median Root Mean Squared Error (RMSE) and the standard deviation as the uncertainty of the error. Columns: Regressor; Median RMSE Score.

We also report some other descriptive statistics such as the number of times that the regressor correctly predicted a material to have a critical temperature greater than or equal to 89 K.

Out-of-Sample Test
For the Out-of-Sample testing, XGBoost achieved the lowest error (median RMSE score of 9.47) among the 11 regression methods. The three closest regression methods to XGBoost are Random Forest (median RMSE of 9.67), Spline Continued Fraction (median RMSE of 10.99) and Gradient Boosting (median RMSE of 12.66). Stochastic Gradient Descent, without parameter estimation, performed the worst among all regression methods used in the experiment, and due to the unreasonably high error observed in the runs we have omitted it from further analysis.

Statistical Significance Testing on the Results obtained for Out-of-Sample test
To evaluate the significance of the results obtained by different regression methods for the Out-of-Sample test, we applied a Friedman test for repeated measures Friedman [1937] to the 100 runs. Here, we computed the ranking of the methods for each of the runs based on the RMSE score obtained on the test portion in the Out-of-Sample setting. This helps us determine whether the techniques in the experiment are consistent in terms of their performance. The statistical test returned a p-value of $1.9899 \times 10^{-183}$, which "rejected" the null hypothesis that "all the algorithms perform the same", and we proceeded with the posthoc test.
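The ranking step and the Friedman chi-square statistic can be sketched with NumPy as follows. The sketch assumes no ties within a run and omits the p-value; the function name and the toy score matrix are hypothetical and do not reproduce the experimental data.

```python
import numpy as np


def friedman_statistic(scores):
    """scores: array of shape (n_runs, n_methods), lower is better.

    Returns the mean rank of each method and the Friedman chi-square statistic.
    Assumes there are no ties within a run.
    """
    n, k = scores.shape
    ranks = scores.argsort(axis=1).argsort(axis=1) + 1      # rank 1 = lowest RMSE
    mean_ranks = ranks.mean(axis=0)
    chi2 = 12.0 * n / (k * (k + 1)) * (np.sum(mean_ranks ** 2) - k * (k + 1) ** 2 / 4.0)
    return mean_ranks, chi2


# Toy example: three methods, five runs, with a perfectly consistent ordering.
toy_scores = np.array([[1.0, 2.0, 3.0]] * 5) + np.arange(5)[:, None] * 0.1
toy_mean_ranks, toy_chi2 = friedman_statistic(toy_scores)
print(toy_mean_ranks, toy_chi2)
```

For this toy matrix the statistic reaches its maximum value n(k-1) = 10, since every run ranks the methods in the same order.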
We applied Friedman's posthoc test to the ranking of the 10 regressors computed from the test RMSE scores obtained for the 100 runs of the Out-of-Sample test. In Fig. 2 (a), there exist 'no significant differences' (NS) in the performance of Spline Continued Fraction (Spln-CFR) compared with rf and grad-b.
Additionally, we generated the Critical Difference (CD) diagram proposed in Demšar [2006] to visualize the differences among the regressors in terms of their median ranking. The CD plot uses the Nemenyi posthoc test and places the regressors along the x-axis according to their median ranking. It then computes the critical difference of rankings between them and connects those which are closer than the critical difference with a horizontal line, denoting them as statistically 'non-significant'.
We plot the CD graph in Fig. 2 (b), using the implementation from the Orange data mining toolbox Demšar et al. [2013] in Python. The Critical Difference (CD) is found to be 1.25. We can see that xg-b ranked 1st among the regressors, with 'no significant difference' from the 2nd ranked rf. The proposed Spline Continued Fraction is ranked 3rd by median ranking, with 'no significant differences' from the performance rankings of rf and grad-b.
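For reference, the critical difference of the Nemenyi test has the closed form $CD = q_\alpha \sqrt{k(k+1)/(6N)}$ for $k$ methods and $N$ runs. The sketch below evaluates it for 10 regressors and 100 runs. The value $q_\alpha = 2.920$ is an assumption here (the tabulated Studentized-range-based constant for ten methods at $\alpha = 0.10$), since the significance level is not stated in the text.

```python
import math


def critical_difference(q_alpha, n_methods, n_runs):
    """Nemenyi critical difference for average ranks."""
    return q_alpha * math.sqrt(n_methods * (n_methods + 1) / (6.0 * n_runs))


# Assumed constant for ten methods at alpha = 0.10; 10 regressors, 100 runs.
cd = critical_difference(2.920, n_methods=10, n_runs=100)
print(round(cd, 2))  # 1.25
```

Under that assumption the formula gives a value consistent with the reported critical difference of 1.25.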

Out-of-Domain Test
For the task of Out-of-Domain prediction, the Spline Continued Fraction regressor exhibited the best performance (median RMSE score of 36.3) among all regression methods used in the experiment (Table 1). The three closest regressors to the proposed Spline Continued Fraction method are XGBoost (median RMSE = 37.3), Random Forest (median RMSE = 38.1) and Gradient Boosting (median RMSE = 39.6).

Statistical Significance Testing on the Results obtained for Out-of-Domain test
To test the significance of the results obtained by different regression methods for the Out-of-Domain test, we employed the same statistical test used for the Out-of-Sample test (in Sec. 3.1.1). The test returned a p-value of $1.2065 \times 10^{-156}$, which "rejected" the null hypothesis, and we proceeded with the posthoc test.
The p-values obtained for the posthoc test are plotted as a heatmap in Fig. 3 (a) for the Out-of-Domain test. It is noticeable that there exist 'no significant differences' (NS) in the performance of Spline Continued Fraction (Spln-CFR) compared with Random Forest (rf) and XGBoost (xg-b). There is also no significant difference in the performance ranking of Linear Regression (l-regr) compared with mlp, l-svr, krnl-r and grad-b.
We plot the Critical Difference (CD) graph in Fig. 3 (b).

To illustrate the performance of the models in the Out-of-Sample study, we fitted Linear Regression, XGBoost and Spline Continued Fraction on the training set and plotted the predicted vs. actual temperatures for the entire dataset (Fig. 5). We show that we were able to reproduce the result of the Out-of-Sample test from Hamidieh [2018] in Fig. 5 (a), with an RMSE of 17.7. The Out-of-Sample models for Spline Continued Fraction and XGBoost are used to predict the critical temperature for the entire dataset. Together, the figures show that Spln-CFR performed better in modelling Out-of-Sample critical temperatures than Linear Regression, particularly for larger temperatures.

Fig. 6 shows actual vs. predicted critical temperatures for the Out-of-Domain test for the Linear Regression, XGBoost and Spline Continued Fraction models. We recall that in the Out-of-Domain setting, we trained each of the models with samples from the bottom 90% of the observed temperatures (which are < 89 K). We measured the testing performance on samples from the top 10% of the observed critical temperatures (containing 2126 samples in the test set).
Table 2: Predicted vs. actual critical temperatures for the materials with the top 20 predicted temperatures in the Out-of-Domain study, i.e. the one in which the lowest 90% of critical temperature samples were used for drawing the training data. The average values of the critical temperatures ($\bar{x}$), the average relative error ($\eta$), and the Root Mean Squared Error (RMSE, shown as rm) of these materials for the top 20 predictions (which are not necessarily the same, since they depend on the models) are shown in the last rows.

Columns, each giving the actual (y) and predicted (pred) values: Spln-CFR, xg-b, rf, grad-b, mlp, l-regr, l-svr, krnl-r, ada-b, lasso-l.
First row entries (y, pred): Spln-CFR 92.00, 114.14; xg-b 89.20, 89.64; rf 91.19, 87.89; grad-b 89.50, 83.44.

We also report the average temperature ($\bar{x}$), average relative error ($\eta$) and RMSE score computed for the top 20 predictions. XGBoost showed the lowest value for both $\eta$ (0.036) and RMSE (3.775) among the 10 regressors. In terms of those scores, the proposed Spln-CFR is in 4th position. However, if we look at the average of the predictions, Spln-CFR has the highest average predicted temperature for the top 20 predictions in the Out-of-Domain tests.
Since all the actual critical temperatures of the test set in the Out-of-Domain setting are ≥ 89 K, it is relevant to evaluate for how many of these samples each regression method was able to predict a value above that threshold. Here, we label a predicted critical temperature ≥ 89 K as 'P' (for positive) and a predicted critical temperature < 89 K as 'N' (for negative). The resulting counts are given in Table 3.

We also look at the consensus between regression methods in Out-of-Domain prediction. Only six regressors (Spln-CFR, xg-b, mlp, l-regr, l-svr and krnl-r) were able to predict at least one positive value (critical temperature ≥ 89 K). We computed a pairwise inter-rater agreement statistic, Cohen's kappa Cohen [1960], for those regression methods. We tabulated the values of kappa ($\kappa$), ordered from highest to lowest, and outlined the level of agreement in Table 4. In most cases, there is either "No" (9 cases) or "None to Slight" (4 cases) agreement between the pairs of regressors. We witness such behaviour in the agreement between the pairs formed with Spln-CFR and each of the other five methods. MLP Regressor and Linear SVR have "Fair" agreement in their predictions. We witnessed the highest value, $\kappa = 0.67$, for Linear Regression and Kernel Ridge, which indicates "Substantial" agreement.
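Cohen's kappa for two regressors' P/N labels compares the observed agreement with the agreement expected by chance, $\kappa = (p_o - p_e)/(1 - p_e)$. The sketch below is a minimal implementation on made-up labels; the function name and the example label lists are hypothetical.

```python
def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two equally long label sequences."""
    n = len(labels_a)
    categories = set(labels_a) | set(labels_b)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    expected = sum(
        (labels_a.count(c) / n) * (labels_b.count(c) / n) for c in categories
    )
    if expected == 1.0:
        return 1.0        # both raters use a single identical label
    return (observed - expected) / (1.0 - expected)


# Hypothetical P/N labels from two regressors on four test samples.
rater_1 = ["P", "P", "N", "N"]
rater_2 = ["P", "N", "N", "N"]
kappa = cohens_kappa(rater_1, rater_2)
print(kappa)  # 0.5
```

Here the two label lists agree on three of four samples, but half of that agreement is expected by chance, which yields a kappa of 0.5.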

Extrapolation Capability of the Regressors in General
As all of the results presented in this work are for the special case of finding models for the extrapolation of the critical temperature of superconductors, we included more robust experimental outcomes with a set of six datasets used in Sun and Moscato [2019]. This additional test helps us evaluate the extrapolation capabilities of the regressors in other problem domains.
Jerome Friedman proposed the Multivariate Adaptive Regression Splines (MARS) algorithm in Friedman [1991], which aggregates multiple linear regression models throughout the range of target values. We used the implementation of the MARS algorithm from the py-earth Python library. We included a comparison of MARS with Spln-CFR and other regressors for extrapolation capability.
Here, the samples from each of the datasets were sorted based on the target value. Then we split each dataset into the out-of-domain setting by taking the samples with the lower 90% of target values as the training set and those with the higher 10% of target values as the test set. For each of the 100 independent runs, we took, uniformly at random, half of the samples from the out-of-domain training set to build the model and the same ratio from the out-of-domain test set for prediction. We applied min-max normalization to the training set and used the same distribution to normalize the test set. We analyzed the performance statistically (in Fig. 7) and found that MARS has a median ranking of 5 and is statistically significantly different from only krnl-r, sgd-r and lasso-l. However, the proposed Spln-CFR achieved the first rank among all the methods, with a median ranking between two and three. The predictions by each model are de-normalized to count the number of predictions above the threshold (the maximum target value in the training portion of the data) in the out-of-domain setting. We show the complete outcome in Table 5. These counts show that Spln-CFR has the highest number of predictions in that range (13560), followed by MARS (3716) and l-regr (2594). These results demonstrate the strength of the regressors in terms of their extrapolation capability.
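The counting step can be sketched as follows: predictions made on the min-max scale of the training targets are mapped back to the original scale and compared with the largest training target. Only the targets are handled here, and the function name and example values are hypothetical.

```python
import numpy as np


def count_extrapolations(y_train, y_pred_normalized):
    """Count de-normalized predictions above the maximum training target."""
    lo, hi = y_train.min(), y_train.max()
    y_pred = y_pred_normalized * (hi - lo) + lo      # undo min-max scaling
    return int(np.sum(y_pred > hi))


# Hypothetical training targets and normalized predictions.
y_train_toy = np.array([10.0, 20.0, 30.0])
pred_norm_toy = np.array([0.5, 1.0, 1.2, 1.5])
n_above = count_extrapolations(y_train_toy, pred_norm_toy)
print(n_above)  # 2
```

A model whose predictions never exceed 1 on the normalized scale cannot produce any count here, whatever its RMSE, which is why this count complements the error-based comparison.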

Conclusions
We give a brief summary of some of the results observed on this new technique:
• For the Out-of-Sample study, in terms of the median RMSE obtained over 100 independent runs, the proposed Spln-CFR is among the top three methods (Table 1).
• For the statistical test of Out-of-Sample rankings, Spln-CFR is statistically similar to the 2nd ranked method (Random Forest) in Fig. 2 (b).
• For the Out-of-Domain median RMSE obtained over 100 runs, the proposed Spln-CFR is the top method (ranked 1st in Table 1).
• For the statistical test of Out-of-Domain rankings, in Fig. 3 (b), Spln-CFR is the best method (median ranking close to 2) and statistically similar to the second best regressor, XGBoost (with a median ranking between 2 and 3).
• Spln-CFR correctly predicted that 108 unique materials have critical temperature values that are greater than or equal to 89 K in the Out-of-Domain test (close to twice the number for all other regression methods tested combined, which was 60) (Table 3).
Table 2 also reveals interesting characteristics of all methods that deserve further consideration as an area of research. First, note that the 20 top materials for each of the methods are not necessarily the same, although some intersections obviously may exist. In the Out-of-Domain study, the top 20 critical temperature values predicted by Spln-CFR were all above 98.9 K (with 18 being above 100 K). The average critical temperature on this set (103.64 K) is nearly the same as the one predicted (103.08 K). The RMSE of xg-b, however, is nearly three times smaller, but the method's top predictions are materials with relatively smaller values (average of 91.31 K). We observed that, for the materials collected in the dataset, the top critical temperatures suggested by Spln-CFR are closer to the measured temperatures, at least on average. Therefore, the use of Spln-CFR as a surrogate model to explore the possibility of testing the superconductivity of materials may bring better returns.
Interestingly, we have also observed behavior similar to that of xg-b in other multivariate regression techniques, but also important differences worth noting. For instance, Linear Regression, perhaps the simplest scheme of them all, has an interesting behavior: the top 20 highest predictions are all in the range [81.83, 91.59] K, while the actual values are in the interval [98.00, 132.60] K. For the multi-layer perceptron method (mlp), the top 20 highest predictions are all in the range [90.01, 100.81] K, yet the true values are in the interval [109.00, 131.40] K. This means that, even when trained using the MSE, these techniques could still give valuable information about materials that could be prioritized for testing if we give more weight to the ranking assigned to the materials and are less concerned about the predicted value.
Overall, the results show the limitations of the current dataset.One possible limitation is the lack of other useful molecular descriptors that can bring important problem-domain knowledge about the structure of the materials and their properties.In addition, it is also possible that a careful "segmentation" of the different materials is necessary.In some sense, the results of the experiments presented here may help the AI community reflect on how to do these analyses and motivate a closer collaboration with superconductivity specialists to provide other molecular descriptors.
We actually often compare the inherent difficulties in prediction in this dataset to other areas on which some of us have been working extensively (like the prediction of survivability in breast cancer using transcriptomic data).In both cases, without separating the training samples into meaningful subgroups, the models obtained generalised poorly.This said, one of the reasons that our continued fraction-based method may be doing just a bit better in the generalisation test in our Out-of-Domain study, is that there might be some structural similarities in the set of compounds used to define the continued fraction approximation at the highest temperatures in the training set, then, indirectly perhaps, from the molecular descriptors present in these samples some useful information exists which the continued fraction representation has exploited.We will investigate this hypothesis in a future publication where we aim to include more relevant problem-domain information, in collaboration with specialists, to benefit from the structure and known properties of the actual compounds.
In terms of future research on the algorithm we propose here, Spln-CFR is already a promising approach with some obvious extensions worth considering, for instance the inclusion of bagging and boosting techniques, which can improve the Out-of-Sample performance. In addition, we consider that learning with modifications of the MSE in the training set may lead to better performance in the Out-of-Domain scenario, and we plan to conduct further research in that area as well.

Figure 1: Examples of the fit obtained by the Spline Continued Fraction on a dataset generated from the gamma function with added noise. We present several continued fractions with depths of 3 (a), 5 (b), 10 (c), and 15 (d). In this example, the number of knots k was chosen to be 3, norm = 1, and λ = 0.1.
Figure 2: Statistical comparison of the regressors for the Out-of-Sample test. a) Heatmap showing the significance levels of p-values obtained by the Friedman post-hoc test and b) critical difference (CD) plot showing the statistical significance of rankings achieved by the regression methods.

Figure 4: Run-time (in seconds) required for model building and prediction by the regressors for 100 runs of the Out-of-Domain test, where samples from the lowest 90% of critical temperatures were drawn as the training data and an equal number of samples, drawn from the top 10% highest critical temperatures, constitute the test data.

Figure 5: Out-of-Sample test results showing predicted vs. actual temperatures for the entire data, with regression models trained on the training data. a) Replication of the Linear Regression outcome from Hamidieh, b) XGBoost and c) Spline Continued Fraction model.

Figure 6: Out-of-Domain test results showing predicted vs. actual temperatures of the samples with the highest 10% critical temperatures, where a model is fitted using the samples with the lowest 90% critical temperatures. We show x-axis values up to 145 K, which leaves only one extreme value (185 K) outside the visual area. Results of the Out-of-Domain test for a) Linear Regression, with an RMSE of 41.3, b) XGBoost, with an RMSE of 36.3, and c) the Spline Continued Fraction model, with an RMSE of 34.8.

Figure 7: Critical difference (CD) plot showing the statistical significance of rankings achieved by the regression methods for 100 runs on the six datasets from Sun and Moscato [2019].
$g_0$ is a linear model parameterized by $\beta$, and is fit with least squares.
Table 1 presents the results of the regression methods, along with those of the Spline Continued Fraction approach, for both the Out-of-Sample and Out-of-Domain studies. The median RMSE value obtained from 100 runs is taken as the Out-of-Sample RMSE estimate.
For each of the 100 repeated runs of the Out-of-Domain test, we estimate the model performance via the Out-of-Domain RMSE score. The median RMSE score obtained from these runs is reported in Table 1 as the Out-of-Domain RMSE.
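The Out-of-Domain protocol can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used for the experiments: the function names are hypothetical, the one-variable least-squares line is only a placeholder for any of the regressors, and the choice to draw as many training samples from the lowest 90% as there are samples in the top 10% is one possible reading of the sampling scheme.

```python
import math
import random
import statistics


def out_of_domain_pools(targets, train_fraction=0.9):
    """Split indices by target value: lowest `train_fraction` vs. the remainder."""
    order = sorted(range(len(targets)), key=lambda i: targets[i])
    cut = int(len(order) * train_fraction)
    return order[:cut], order[cut:]


def rmse(actual, predicted):
    """Root mean squared error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def fit_line(xs, ys):
    """Placeholder model: least-squares line y = a + b*x (needs at least two distinct xs)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    a = my - b * mx
    return lambda x: a + b * x


def median_out_of_domain_rmse(xs, ys, runs=100, seed=0):
    """Median test RMSE over repeated runs of the split described above."""
    rng = random.Random(seed)
    low_pool, high_pool = out_of_domain_pools(ys)
    scores = []
    for _ in range(runs):
        # Assumption: each run trains on a random draw from the low pool
        # whose size equals the size of the high (test) pool.
        train = rng.sample(low_pool, len(high_pool))
        model = fit_line([xs[i] for i in train], [ys[i] for i in train])
        predictions = [model(xs[i]) for i in high_pool]
        scores.append(rmse([ys[i] for i in high_pool], predictions))
    return statistics.median(scores)


# Synthetic one-feature data (illustrative only): a convex target, so a line
# fitted on low values must extrapolate to reach the highest ones.
xs = list(range(100))
ys = [x * x / 100 for x in xs]
score = median_out_of_domain_rmse(xs, ys)
print(f"median Out-of-Domain RMSE over 100 runs: {score:.3f}")
```

Reporting the median rather than the mean keeps a single unlucky draw from dominating the summary, which matters when the test targets lie entirely outside the range seen during training.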
.2.2 Runtime Required by the Methods for the Out-of-Domain Test
Fig. 4 shows the running time (in s) required by each of the regression methods for the 100 runs of the Out-of-Domain test. We can see that Linear Regression (50th percentile runtime of 0.02 s and maximum runtime of 0.158 s) and Lasso Lars (50th percentile of 0.013 s and maximum of 0.027 s) required the lowest running times. XGBoost (xg-b) required the most CPU time (50th percentile runtime of 55.33 s and maximum of 79.05 s). On the other hand, Random Forest and the proposed Spline Continued Fraction Regression required similar running times (50th percentile runtimes of 36.88 s and 41.65 s for rf and Spln-CFR, respectively) for the Out-of-Domain test.
In Table 3, we report the number of samples for which each of the methods predicted a temperature value in the P and N categories over the whole testing set of the Out-of-Domain test. Only six regression methods predicted a critical temperature ≥ 89 K for at least one sample. Linear Regression and XGBoost each predicted a critical temperature ≥ 89 K for two samples, and Kernel Ridge did so for only one sample. MLP Regressor and Linear SVR did so for 21 and 34 samples, respectively. The proposed Spline Continued Fraction predicted a value ≥ 89 K for 108 samples, which is the best among all regression methods used in the experiments.
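The P/N tally amounts to thresholding each method's predictions at 89 K. A minimal sketch follows; the helper name `count_pn`, the method names, and the predicted values are hypothetical and serve only to illustrate the counting.

```python
THRESHOLD_K = 89.0  # threshold on the critical temperature, in kelvin


def count_pn(predicted, threshold=THRESHOLD_K):
    """Count predictions at or above the threshold ('P') and below it ('N')."""
    positives = sum(1 for t in predicted if t >= threshold)
    return {"P": positives, "N": len(predicted) - positives}


# Hypothetical predictions for two unnamed methods (not values from the paper).
predictions = {
    "model_a": [75.2, 88.9, 89.0, 93.4],
    "model_b": [60.0, 70.5, 80.1, 88.0],
}

counts = {name: count_pn(values) for name, values in predictions.items()}
for name, tally in counts.items():
    print(name, tally)
```

Note that a prediction of exactly 89 K counts as positive, matching the "≥ 89 K" convention used in the text.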

Table 3: Number of times the methods predicted a critical temperature value $T_c \geq 89$ K (denoted as 'P', for positive) and $T_c < 89$ K (denoted as 'N', for negative) for the Out-of-Domain test.

Table 4: Inter-rater agreement between the pairs of regressor methods where the resulting models were able to predict at least one positive temperature value ($T_c \geq 89$ K).
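As an illustration of how pairwise agreement on the P/N labels could be quantified, the sketch below computes Cohen's kappa. This is an assumption: the caption does not state which agreement statistic was used, and the label sequences here are hypothetical.

```python
def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    n = len(labels_a)
    categories = set(labels_a) | set(labels_b)
    # Fraction of samples on which the two methods give the same label.
    observed = sum(1 for a, b in zip(labels_a, labels_b) if a == b) / n
    # Agreement expected by chance, from each method's label frequencies.
    expected = sum(
        (labels_a.count(c) / n) * (labels_b.count(c) / n) for c in categories
    )
    if expected == 1.0:
        # Both methods always emit the same single label: kappa is undefined.
        return float("nan")
    return (observed - expected) / (1.0 - expected)


# Hypothetical P/N labels from two methods on the same six samples.
method_a = ["P", "P", "N", "N", "N", "N"]
method_b = ["P", "N", "N", "N", "N", "N"]

kappa = cohen_kappa(method_a, method_b)
print(f"kappa = {kappa:.4f}")
```

Kappa corrects raw agreement for chance, which matters here because most methods label nearly every sample 'N', so two methods can agree on most samples without agreeing on any of the positives.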