# Learning to Extrapolate Using Continued Fractions: Predicting the Critical Temperature of Superconductor Materials


## Abstract


## 1. Introduction

#### Continued Fraction Regression

## 2. Materials and Methods

#### 2.1. A New Approach: Continued Fractions with Splines

Algorithm 1: Iterative CFR using additive spline models with adaptive knot selection
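As context for Algorithm 1, the model it grows is a truncated analytic continued fraction [9] of the form $f(x) = g_0(x) + h_1(x)/(g_1(x) + h_2(x)/(\cdots))$, which can be evaluated bottom-up. The sketch below is illustrative only; the term functions and the depth-2 example are hypothetical stand-ins, not the spline terms the algorithm actually fits:

```python
import numpy as np

def eval_continued_fraction(terms, x):
    """Evaluate a truncated analytic continued fraction
        f(x) = g0(x) + h1(x) / (g1(x) + h2(x) / (g2(x) + ...))
    bottom-up. `terms` is [g0, (h1, g1), (h2, g2), ...], each entry a
    callable mapping an input array to an array of the same shape."""
    g0, rest = terms[0], terms[1:]
    acc = np.zeros_like(x, dtype=float)
    # Work from the deepest level of the fraction upward.
    for h, g in reversed(rest):
        acc = h(x) / (g(x) + acc)
    return g0(x) + acc

# Depth-2 example with simple (hypothetical) linear terms.
f = eval_continued_fraction(
    [lambda x: 1.0 + x,
     (lambda x: np.full_like(x, 2.0), lambda x: 3.0 + x)],
    np.array([0.0, 1.0]))
```

At x = 0 this evaluates 1 + 2/3; at x = 1 it evaluates 2 + 2/4, illustrating how each added level refines the approximation.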

#### 2.2. Adaptive Knot Selection

Algorithm 2: SelectKnots (Adaptive Knot Selection)
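Algorithm 2 adapts knot placement to the data. As a point of reference, a common non-adaptive baseline places the k interior knots at evenly spaced empirical quantiles of a feature; the helper below is that baseline, not the paper's SelectKnots:

```python
import numpy as np

def select_knots_quantile(x, k):
    """Place k interior knots at evenly spaced empirical quantiles of
    the feature x (a standard baseline; an adaptive scheme like
    Algorithm 2 would instead choose knot positions data-dependently)."""
    qs = np.linspace(0.0, 1.0, k + 2)[1:-1]   # drop the 0% and 100% ends
    return np.quantile(x, qs)

knots = select_knots_quantile(np.arange(101, dtype=float), 3)
```

For a uniform feature over 0–100 this returns knots at 25, 50, and 75, i.e., the quartiles.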

#### 2.3. Data and Methods Used in the Study

To compare the performance of the Spline Continued Fraction regression (`Spln-CFR`) introduced in this paper with other state-of-the-art regression methods, we used a set of ten regressors from two popular Python libraries, namely, the XGBoost [17] and Scikit-learn [18] machine learning libraries. The regression methods are listed as follows:

- AdaBoost (`ada-b`)
- Gradient Boosting (`grad-b`)
- Kernel Ridge (`krnl-r`)
- Lasso Lars (`lasso-l`)
- Linear Regression (`l-regr`)
- Linear SVR (`l-svr`)
- MLP Regressor (`mlp`)
- Random Forest (`rf`)
- Stochastic Gradient Descent (`sgd-r`)
- XGBoost (`xg-b`)

## 3. Results

#### 3.1. Out-of-Sample Test

#### Statistical Significance of the Results of the Out-of-Sample Test

`xg-b` ranked first among the regressors, while `rf` ranked second. The median ranking of the proposed Spline Continued Fraction (`Spln-CFR`) was third, with no statistically significant difference between its performance ranking and those of `rf` and `grad-b`.
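The ranking comparison above rests on Friedman's test [20] over per-run errors, followed by average ranks for the critical difference plot. A minimal sketch with toy RMSE values (not the paper's measurements):

```python
# Friedman's test over per-run RMSEs of three methods (toy numbers).
import numpy as np
from scipy.stats import friedmanchisquare, rankdata

rng = np.random.default_rng(0)
runs = 100
rmse = {                                   # one RMSE array per method
    "xg-b":   9.5 + 0.2 * rng.standard_normal(runs),
    "rf":     9.7 + 0.2 * rng.standard_normal(runs),
    "grad-b": 12.7 + 0.2 * rng.standard_normal(runs),
}
stat, p = friedmanchisquare(*rmse.values())
# Average rank per method across runs (rank 1 = lowest RMSE = best).
ranks = rankdata(np.column_stack(list(rmse.values())), axis=1).mean(axis=0)
```

A small p-value rejects the hypothesis that all methods perform equally; the average ranks are what a CD plot then compares against the critical difference.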

#### 3.2. Out-of-Domain Test

#### 3.2.1. Statistical Significance of the Results of the Out-of-Domain Test

There is no statistically significant difference between the performance ranking of the proposed Spline Continued Fraction (`Spln-CFR`) and those of Random Forest (`rf`) and XGBoost (`xg-b`). There is no significant difference between the performance ranking of Linear Regression (`l-regr`) and those of `mlp`, `l-svr`, `krnl-r`, and `grad-b`.

The median ranking of `Spln-CFR` is very close to second, which is the best-ranking performance among the ten regressors. There is no significant difference between Spline Continued Fraction and the second-best method, XGBoost (`xg-b`), which has an average ranking between second and third for Out-of-Domain prediction.

#### 3.2.2. Runtimes of the Different Methods on the Out-of-Domain Test

XGBoost (`xg-b`) required the longest time (50th percentile runtime of 55.33 s and maximum of 79.05 s); on the other hand, Random Forest and the proposed Spline Continued Fraction regression required very similar times (50th percentile runtimes of 36.88 s and 41.65 s for `rf` and `Spln-CFR`, respectively) on the Out-of-Domain test.

## 4. Discussion

`Spln-CFR` performed better in modelling Out-of-Sample critical temperatures than Linear Regression, particularly for larger temperatures.

In this comparison, `Spln-CFR` is in the fourth position. However, looking at the average of the predictions, `Spln-CFR` has the highest average prediction temperatures for the top twenty predictions on the Out-of-Domain test.

We define two categories: **P** = critical temperature value $\ge 89$ K (denoted as ‘P’ for positive) and **N** = critical temperature value < 89 K (denoted as ‘N’ for negative). In Table 3, we report the number of samples for which each of the methods predicted a temperature value in the P and N categories for the whole testing set of the Out-of-Domain test. It can be seen that only six regression methods predicted a critical temperature of $\ge 89$ K for at least one sample. Both Linear Regression and XGBoost predicted two sample temperatures with the critical temperature $\ge 89$ K. Kernel Ridge predicted only one sample value within that range, while MLP Regressor and Linear SVR predicted it for 21 and 34 samples, respectively. The proposed Spline Continued Fraction predicted 108 sample values of $\ge 89$ K, the highest among all regression methods used in our experiments.
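The P/N counts in Table 3 reduce to thresholding each model's predictions at 89 K; a minimal sketch:

```python
import numpy as np

def count_pn(predictions, threshold=89.0):
    """Count predictions at or above the threshold ('P') and below it
    ('N'), mirroring the categories reported in Table 3."""
    preds = np.asarray(predictions, dtype=float)
    p = int(np.sum(preds >= threshold))
    return p, preds.size - p          # (P, N)

p, n = count_pn([27.1, 91.6, 88.9, 100.8])
```

Here two of the four illustrative predictions reach the 89 K threshold, so the function returns (2, 2).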

Six regression methods (`Spln-CFR`, `xg-b`, `mlp`, `l-regr`, `l-svr`, and `krnl-r`) were able to predict at least one positive value (critical temperature $\ge 89$ K). We computed pairwise inter-rater agreement statistics, Cohen’s kappa [23], for these six regression methods. We tabulated the values of kappa ($\kappa$) ordered from highest to lowest; the level of agreement is outlined in Table 4. It can be seen that in most cases there is either no agreement (nine cases) or none-to-slight agreement (four cases) between the pairs of regressors. Such behaviour is seen in the agreement between the pairs formed with `Spln-CFR` and each of the other five methods. MLP Regressor and Linear SVR have fair agreement in their predictions. The highest value is $\kappa = 0.67$, for Linear Regression and Kernel Ridge, yielding substantial agreement.
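The pairwise agreement values in Table 4 can be reproduced with scikit-learn's implementation of Cohen's kappa on the binarized (P/N) predictions of two regressors; the labels below are toy data, not the paper's predictions:

```python
from sklearn.metrics import cohen_kappa_score

# 1 = predicted critical temperature >= 89 K ('P'), 0 = below ('N').
rater1 = [1, 1, 0, 0, 1, 0, 0, 0]   # toy binarized predictions, method A
rater2 = [1, 0, 0, 0, 1, 0, 1, 0]   # toy binarized predictions, method B
kappa = cohen_kappa_score(rater1, rater2)
```

Kappa corrects the observed agreement (6 of 8 here) for the agreement expected by chance from each rater's marginal P/N rates, giving roughly 0.47 ("moderate") for this toy pair.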

#### Extrapolation Capability of the Regressors in General

We compared `Spln-CFR` and other regressors to assess their extrapolation capability.

`MARS` has a median ranking of fifth and is statistically significantly different from only `krnl-r`, `sgd-r`, and `lasso-l`. On the other hand, the proposed `Spln-CFR` achieves the first rank among all the methods, with a median ranking between second and third. To calculate how many predictions surpass the threshold (which matches the highest target score in the training data) in Out-of-Domain scenarios, we convert the predictions of each model back to their original scale, i.e., denormalize them. Table 5 shows the full denormalized results. These counts show that `Spln-CFR` has the highest number of such predictions (13,560), followed by `MARS` (3716) and `l-regr` (2594). These results demonstrate the strength of the regressors in terms of their extrapolation capability.
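Counting predictions that surpass the training-set maximum after denormalization can be sketched as follows; min-max normalization is an assumption here, as the paper's normalization details are not reproduced in this excerpt:

```python
import numpy as np

def count_beyond_training_max(preds_norm, y_train):
    """Denormalize predictions (assuming min-max scaling fitted on the
    training targets) and count how many exceed the highest target seen
    in training, i.e., genuine extrapolations."""
    y_min, y_max = float(np.min(y_train)), float(np.max(y_train))
    preds = np.asarray(preds_norm, dtype=float) * (y_max - y_min) + y_min
    return int(np.sum(preds > y_max))

n_beyond = count_beyond_training_max([0.2, 0.9, 1.1, 1.3], [0.0, 50.0, 90.0])
```

With a training maximum of 90 K, the two normalized predictions above 1.0 denormalize to values beyond it, so the count is 2.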

## 5. Conclusions

- For the Out-of-Sample study, the proposed `Spln-CFR` approach is among the top three methods based on the median RMSE obtained for 100 independent runs (Table 1).
- For the statistical test of the Out-of-Sample rankings, `Spln-CFR` is statistically similar to the second-ranked method, Random Forest, in Figure 2b.
- In terms of the Out-of-Domain median RMSE obtained for 100 runs, `Spln-CFR` is ranked first (Table 1).
- For the statistical test of the Out-of-Domain rankings in Figure 3b, `Spln-CFR` is the best method, with a median ranking close to second, and is statistically similar to the second-best regressor, XGBoost, which has a median ranking between second and third.
- `Spln-CFR` correctly predicted 108 unique materials with critical temperature values greater than or equal to 89 K in the Out-of-Domain test, nearly twice the number achieved by all the other regression methods combined, which was 60 (Table 3).

The top twenty predictions of `Spln-CFR` were all above 98.9 K, with eighteen being above 100 K. The average measured critical temperature of this set (103.08 K) is nearly the same as the average predicted one (103.64 K). The RMSE of `xg-b`, however, is nearly three times smaller, while that method’s top predictions are materials with relatively smaller measured values (average of 91.31 K). From the collected information on the materials in the dataset, we observed that the top suggestions of critical temperatures in superconductors are closer to the measured temperatures, at least on average, when using `Spln-CFR`. Therefore, the use of `Spln-CFR` as a surrogate model to explore the possibility of testing the superconductivity of materials may provide better returns.

Comparing `xg-b` and the other multivariate regression techniques, there were important differences worth noting. For instance, Linear Regression, perhaps the simplest scheme of them all, shows interesting behaviour: its top twenty highest predictions are all in the range of 81.83–91.59 K, while the actual values are in the interval 98.00–132.60 K. For the multi-layer perceptron method (`mlp`), the top twenty highest predictions are all in the range of 90.01–100.81 K, yet the true values are in the interval 109.00–131.40 K. By considering the rankings given to several materials, as opposed to the predicted values, valuable information about materials can be obtained with these techniques when trained using the MSE, allowing prioritization of materials for testing.

`Spln-CFR` is already a promising approach, with obvious extensions worth considering in the future. For instance, the inclusion of bagging and boosting techniques could improve its Out-of-Sample performance, while modifications of the MSE used on the training set may lead to better learning performance in the Out-of-Domain scenario. We plan to conduct further research in these areas.

#### Note Added in Proof

## Author Contributions

## Funding

## Data Availability Statement

## Acknowledgments

## Conflicts of Interest

## Abbreviations

AI | Artificial Intelligence |

CD | Critical Difference |

CFR | Continued Fraction Regression |

IEEE | Institute of Electrical and Electronics Engineers |

MARS | Multivariate Adaptive Regression Splines |

ML | Machine Learning |

MLP | Multilayer Perceptron |

MRI | Magnetic Resonance Imaging |

MSE | Mean Squared Error |

NMSE | Normalised Mean Squared Error |

RMSE | Root Mean Squared Error |

SVR | Support Vector Regressor |

UCI | University of California, Irvine |

## References

1. Tinkham, M. Introduction to Superconductivity: International Series in Pure and Applied Physics; McGraw-Hill: New York, NY, USA, 1975.
2. Tinkham, M. Introduction to Superconductivity, 2nd ed.; Courier Corporation: Chelmsford, MA, USA, 2004.
3. Liu, W.; Li, S.; Wu, H.; Dhale, N.; Koirala, P.; Lv, B. Enhanced superconductivity in the Se-substituted 1T-PdTe_{2}. Phys. Rev. Mater. **2021**, 5, 014802.
4. Chen, X.; Wu, T.; Wu, G.; Liu, R.; Chen, H.; Fang, D. Superconductivity at 43 K in SmFeAsO_{1−x}F_{x}. Nature **2008**, 453, 761–762.
5. Zhang, Y.; Liu, W.; Zhu, X.; Zhao, H.; Hu, Z.; He, C.; Wen, H.H. Unprecedented high irreversibility line in the nontoxic cuprate superconductor (Cu, C)Ba_{2}Ca_{3}Cu_{4}O_{11+}. Sci. Adv. **2018**, 4, eaau0192.
6. Liu, W.; Lin, H.; Kang, R.; Zhu, X.; Zhang, Y.; Zheng, S.; Wen, H.H. Magnetization of potassium-doped p-terphenyl and p-quaterphenyl by high-pressure synthesis. Phys. Rev. B **2017**, 96, 224501.
7. Hamidieh, K. A data-driven statistical model for predicting the critical temperature of a superconductor. Comput. Mater. Sci. **2018**, 154, 346–354.
8. Sun, H.; Moscato, P. A Memetic Algorithm for Symbolic Regression. In Proceedings of the IEEE Congress on Evolutionary Computation, CEC 2019, Wellington, New Zealand, 10–13 June 2019; IEEE: New York, NY, USA, 2019; pp. 2167–2174.
9. Moscato, P.; Sun, H.; Haque, M.N. Analytic Continued Fractions for Regression: A Memetic Algorithm Approach. Expert Syst. Appl. **2021**, 179, 115018.
10. Moscato, P.; Cotta, C. An Accelerated Introduction to Memetic Algorithms. In Handbook of Metaheuristics; Gendreau, M., Potvin, J.Y., Eds.; Springer International Publishing: Cham, Switzerland, 2019; pp. 275–309.
11. Moscato, P.; Mathieson, L. Memetic Algorithms for Business Analytics and Data Science: A Brief Survey. In Business and Consumer Analytics: New Ideas; Moscato, P., de Vries, N.J., Eds.; Springer: Berlin/Heidelberg, Germany, 2019; pp. 545–608.
12. Moscato, P.; Haque, M.N.; Moscato, A. Continued fractions and the Thomson problem. Sci. Rep. **2023**, 13, 7272.
13. Sun, S.; Ouyang, R.; Zhang, B.; Zhang, T.Y. Data-driven discovery of formulas by symbolic regression. Mater. Res. Soc. Bull. **2019**, 44, 559–564.
14. Backeljauw, F.; Cuyt, A.A.M. Algorithm 895: A continued fractions package for special functions. ACM Trans. Math. Softw. **2009**, 36, 15:1–15:20.
15. de Boor, C. A Practical Guide to Splines; Springer: New York, NY, USA, 1978; Volume 27.
16. Hastie, T.; Tibshirani, R.; Friedman, J. The Elements of Statistical Learning; Springer: Berlin/Heidelberg, Germany, 2009; pp. 139–190.
17. Chen, T.; Guestrin, C. XGBoost: A Scalable Tree Boosting System. In Proceedings of the KDD ’16: 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; pp. 785–794.
18. Pedregosa, F.; Varoquaux, G.; Gramfort, A.; Michel, V.; Thirion, B.; Grisel, O.; Blondel, M.; Prettenhofer, P.; Weiss, R.; Dubourg, V.; et al. Scikit-learn: Machine Learning in Python. J. Mach. Learn. Res. **2011**, 12, 2825–2830.
19. Servén, D.; Brummitt, C. pyGAM: Generalized Additive Models in Python. Available online: https://doi.org/10.5281/zenodo.1208723 (accessed on 18 April 2022).
20. Friedman, M. The use of ranks to avoid the assumption of normality implicit in the analysis of variance. J. Am. Stat. Assoc. **1937**, 32, 675–701.
21. Demšar, J. Statistical comparisons of classifiers over multiple data sets. J. Mach. Learn. Res. **2006**, 7, 1–30.
22. Demšar, J.; Curk, T.; Erjavec, A.; Gorup, V.; Hočevar, T.; Milutinovič, M.; Možina, M.; Polajnar, M.; Toplak, M.; Starič, A.; et al. Orange: Data Mining Toolbox in Python. J. Mach. Learn. Res. **2013**, 14, 2349–2353.
23. Cohen, J. A coefficient of agreement for nominal scales. Educ. Psychol. Meas. **1960**, 20, 37–46.
24. Friedman, J.H. Multivariate adaptive regression splines. Ann. Stat. **1991**, 19, 1–67.
25. Lee, S.; Kim, J.H.; Kwon, Y.W. The First Room-Temperature Ambient-Pressure Superconductor. arXiv **2023**, arXiv:2307.12008.
26. Lee, S.; Kim, J.; Kim, H.T.; Im, S.; An, S.; Auh, K.H. Superconductor Pb_{10−x}Cu_{x}(PO_{4})_{6}O showing levitation at room temperature and atmospheric pressure and mechanism. arXiv **2023**, arXiv:2307.12037.
27. Seegmiller, C.C.; Baird, S.G.; Sayeed, H.M.; Sparks, T.D. Discovering chemically novel, high-temperature superconductors. Comput. Mater. Sci. **2023**, 228, 112358.

**Figure 1.** Examples of the fit obtained by the Spline Continued Fraction on a dataset generated from the gamma function with added noise. We present continued fractions with depths of 3 (**a**), 5 (**b**), 10 (**c**), and 15 (**d**). In this example, the number of knots k was chosen to be 3, with $norm$ = 1 and $\lambda$ = 0.1.

**Figure 2.** Statistical comparison of the regressors for the Out-of-Sample test: (**a**) heat map showing the significance levels of the p-values obtained by the Friedman post hoc test and (**b**) critical difference (CD) plot showing the statistical significance of the rankings achieved by the different regression methods.

**Figure 3.** Statistical comparison of the regressors for the Out-of-Domain test: (**a**) heat map showing the significance levels of the p-values obtained by the Friedman post hoc test and (**b**) critical difference (CD) plot showing the statistical significance of the rankings achieved by the different regression methods.

**Figure 4.**Runtime (in seconds) required for model building and prediction by the regressors for 100 runs of the Out-of-Domain test, in which samples with the lowest 90% of critical temperatures were drawn as the training data and an equal number of samples drawn from the top 10% highest critical temperatures constituted the test data.
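The split described in the caption above, with the lowest 90% of critical temperatures available for training and the highest 10% held out, can be sketched as follows; the exact sampling procedure used in the paper is an assumption:

```python
import numpy as np

def out_of_domain_split(y, test_frac=0.10):
    """Return (train_idx, test_idx): the lowest (1 - test_frac) share of
    targets is available for training, the highest share for testing,
    so every test target exceeds every training target."""
    order = np.argsort(y)                        # indices, lowest T_c first
    cut = int(round(len(y) * (1 - test_frac)))
    return order[:cut], order[cut:]

y = np.array([10., 30., 20., 50., 90., 40., 60., 70., 80., 100.])
train_idx, test_idx = out_of_domain_split(y)
```

Because the split is by target value rather than at random, the test set lies entirely outside the range of the training targets, which is what makes the task an extrapolation test.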

**Figure 5.** Out-of-Sample test results, showing the predicted vs. actual temperatures for the entire dataset with regression models trained on the training data: (**a**) Linear Regression (replicating the outcome from Hamidieh), (**b**) XGBoost, and (**c**) Spline Continued Fraction.

**Figure 6.** Out-of-Domain test results, showing the predicted vs. actual temperatures of the samples with the highest 10% of critical temperatures, where the models were fitted using the samples with the lowest 90% of critical temperatures. The x-axis is shown up to 145 K, leaving one extreme value (185 K) outside the visualized area. Results of the Out-of-Domain test for: (**a**) Linear Regression, with an RMSE of 41.3; (**b**) XGBoost, with an RMSE of 36.3; and (**c**) Spline Continued Fraction, with an RMSE of 34.8.

**Figure 7.** Critical difference (CD) plot showing the statistical significance of rankings achieved by the regression methods for 100 runs on the six datasets from [8].

**Table 1.** Results from 100 runs of the proposed Spline Continued Fraction and ten regression methods, all trained on the same dataset, reporting the median Root Mean Squared Error (RMSE) with the standard deviation as the uncertainty.

Regressor | Out-of-Sample (Median RMSE ± Std) | Out-of-Domain (Median RMSE ± Std) |
---|---|---|
Spln-CFR | 10.989 ± 0.382 | 36.327 ± 1.187 |
xg-b | 9.474 ± 0.190 | 37.264 ± 0.947 |
rf | 9.670 ± 0.197 | 38.074 ± 0.751 |
grad-b | 12.659 ± 0.178 | 39.609 ± 0.619 |
l-regr | 17.618 ± 0.187 | 41.265 ± 0.466 |
krnl-r | 17.635 ± 0.163 | 41.427 ± 0.464 |
mlp | 19.797 ± 5.140 | 41.480 ± 9.640 |
ada-b | 18.901 ± 0.686 | 47.502 ± 0.743 |
l-svr | 26.065 ± 7.838 | 47.985 ± 1.734 |
lasso-l | 34.234 ± 0.267 | 74.724 ± 0.376 |
sgd-r ^{1} | N.R. | N.R. |

^{1} The Stochastic Gradient Descent Regressor (`sgd-r`) without parameter estimation predicted unreasonably high values and had an extreme prediction error measure. Hence, we do not report (N.R.) the performance of `sgd-r` and have omitted it from further analysis.

**Table 2.**Predicted vs. actual critical temperatures for the materials with the top twenty predicted temperatures in the Out-of-Domain study, i.e., the one in which the lowest 90% of critical temperature samples were used for drawing the training data. The average values of the critical temperatures ($\overline{x}$), the average relative error ($\overline{\eta}$), and the root mean squared error (RMSE, denoted as rm) of these materials for the top twenty predictions (which are not necessarily the same, as they depend on the models) are shown in the last rows.

Spln-CFR y | Spln-CFR pred | xg-b y | xg-b pred | rf y | rf pred | grad-b y | grad-b pred | mlp y | mlp pred | l-regr y | l-regr pred | l-svr y | l-svr pred | krnl-r y | krnl-r pred | ada-b y | ada-b pred | lasso-l y | lasso-l pred |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
92.00 | 114.14 | 89.20 | 89.64 | 91.19 | 87.89 | 89.50 | 83.44 | 109.00 | 100.81 | 98.00 | 91.59 | 112.00 | 94.81 | 98.00 | 91.02 | 89.50 | 58.63 | 89.00 | 27.06 |
90.00 | 109.69 | 94.20 | 89.19 | 89.90 | 87.88 | 89.90 | 83.44 | 124.90 | 100.31 | 112.00 | 89.14 | 100.00 | 93.49 | 112.00 | 88.67 | 89.50 | 58.63 | 89.00 | 27.06 |
111.00 | 108.54 | 89.88 | 88.69 | 90.00 | 87.88 | 90.50 | 83.44 | 114.00 | 99.70 | 105.00 | 87.53 | 132.60 | 93.49 | 105.00 | 86.84 | 89.70 | 58.63 | 89.00 | 27.06 |
93.50 | 108.01 | 89.93 | 88.34 | 90.20 | 87.88 | 91.50 | 83.44 | 128.40 | 99.59 | 117.00 | 87.06 | 105.00 | 92.94 | 117.00 | 86.65 | 89.80 | 58.63 | 89.00 | 27.06 |
99.00 | 106.50 | 90.00 | 88.15 | 90.90 | 87.88 | 90.00 | 83.42 | 127.40 | 99.53 | 100.00 | 85.92 | 115.00 | 92.93 | 100.00 | 85.88 | 89.80 | 58.63 | 89.00 | 27.06 |
105.60 | 105.01 | 90.10 | 88.15 | 91.00 | 87.88 | 91.80 | 83.42 | 127.80 | 99.53 | 132.60 | 85.92 | 111.00 | 92.90 | 132.60 | 85.88 | 89.90 | 58.63 | 89.00 | 27.06 |
113.00 | 104.35 | 91.00 | 88.15 | 92.00 | 87.88 | 90.00 | 82.22 | 130.10 | 98.76 | 115.00 | 85.50 | 110.00 | 92.84 | 115.00 | 85.51 | 90.00 | 58.63 | 89.00 | 27.06 |
113.00 | 103.95 | 91.30 | 88.15 | 92.20 | 87.88 | 89.50 | 79.29 | 128.50 | 98.55 | 111.00 | 84.97 | 106.70 | 92.54 | 111.00 | 84.46 | 90.00 | 58.63 | 89.00 | 27.06 |
106.60 | 103.95 | 96.10 | 88.15 | 92.40 | 87.88 | 90.00 | 79.29 | 128.40 | 98.45 | 132.00 | 84.96 | 126.90 | 91.73 | 132.00 | 84.42 | 90.50 | 58.63 | 89.00 | 27.06 |
128.70 | 103.92 | 90.00 | 88.10 | 92.50 | 87.88 | 91.00 | 79.29 | 128.80 | 98.45 | 110.00 | 84.31 | 117.00 | 91.73 | 110.00 | 84.38 | 91.50 | 58.63 | 89.00 | 27.06 |
91.80 | 102.10 | 91.40 | 88.10 | 92.74 | 87.88 | 91.80 | 79.29 | 131.40 | 98.33 | 106.70 | 83.95 | 126.80 | 91.30 | 106.70 | 82.97 | 100.00 | 58.63 | 89.00 | 27.06 |
108.00 | 101.56 | 92.60 | 87.82 | 92.80 | 87.88 | 92.30 | 79.29 | 128.80 | 98.10 | 126.90 | 82.72 | 115.00 | 90.84 | 95.00 | 82.64 | 108.00 | 58.63 | 89.00 | 27.06 |
92.00 | 101.32 | 91.60 | 87.53 | 93.00 | 87.88 | 90.00 | 78.85 | 128.70 | 93.96 | 105.00 | 82.63 | 95.00 | 90.80 | 105.00 | 82.01 | 110.00 | 58.63 | 89.00 | 27.06 |
90.00 | 101.19 | 93.00 | 87.53 | 93.00 | 87.88 | 91.60 | 78.85 | 130.30 | 93.94 | 95.00 | 82.62 | 121.60 | 90.80 | 107.00 | 81.88 | 110.90 | 58.63 | 89.00 | 27.06 |
105.10 | 100.50 | 93.80 | 87.49 | 93.05 | 87.88 | 89.10 | 78.79 | 131.30 | 93.93 | 107.00 | 82.47 | 100.00 | 90.78 | 126.90 | 81.82 | 114.00 | 58.63 | 89.00 | 27.06 |
130.30 | 100.35 | 89.90 | 87.48 | 93.20 | 87.88 | 89.20 | 78.79 | 122.00 | 91.96 | 105.00 | 82.41 | 107.00 | 90.78 | 105.00 | 81.51 | 114.00 | 58.63 | 89.00 | 27.06 |
93.00 | 100.24 | 90.00 | 87.48 | 93.40 | 87.88 | 89.40 | 78.79 | 123.50 | 91.64 | 126.80 | 82.12 | 90.00 | 90.63 | 90.00 | 81.40 | 116.00 | 58.63 | 89.10 | 27.06 |
91.50 | 100.00 | 90.20 | 87.48 | 93.50 | 87.88 | 89.40 | 78.79 | 121.00 | 90.69 | 98.50 | 82.03 | 96.00 | 90.49 | 126.80 | 81.24 | 122.50 | 58.63 | 89.10 | 27.06 |
91.50 | 99.18 | 90.90 | 87.48 | 91.80 | 87.75 | 89.40 | 78.79 | 115.00 | 90.14 | 112.00 | 82.03 | 128.70 | 90.48 | 117.00 | 80.89 | 127.00 | 58.63 | 89.10 | 27.06 |
116.00 | 98.39 | 91.00 | 87.48 | 92.10 | 87.69 | 89.50 | 78.79 | 110.00 | 90.01 | 117.00 | 81.83 | 130.30 | 90.26 | 121.60 | 80.87 | 130.90 | 58.63 | 89.10 | 27.06 |
$\overline{x}$: 103.08 | 103.64 | 91.31 | 88.03 | 92.044 | 87.86 | 90.27 | 80.49 | 124.47 | 96.32 | 111.63 | 84.59 | 112.33 | 91.83 | 111.68 | 84.05 | 102.68 | 58.63 | 89.02 | 27.06 |
$\overline{\eta}$: 0.1085 | | 0.036 | | 0.0453 | | 0.1083 | | 0.224 | | 0.2351 | | 0.1733 | | 0.2389 | | 0.4187 | | 0.696 | |
rm: 13.6023 | | 3.7753 | | 4.3261 | | 10.0078 | | 28.9783 | | 29.3265 | | 23.9282 | | 30.2426 | | 46.2473 | | 61.96 | |

**Table 3.** Number of times the different methods predicted a critical temperature value ${T}_{c}\ge 89$ K (denoted as ‘P’ for positive) and ${T}_{c}<89$ K (denoted as ‘N’ for negative) on the Out-of-Domain test.

Regressor | P (${T}_{c} \ge 89$ K) | N (${T}_{c} < 89$ K) |
---|---|---|
Spln-CFR | 108 | 2018 |
xg-b | 2 | 2124 |
rf | 0 | 2126 |
grad-b | 0 | 2126 |
mlp | 21 | 2105 |
l-regr | 2 | 2124 |
l-svr | 34 | 2092 |
krnl-r | 1 | 2125 |
ada-b | 0 | 2126 |
lasso-l | 0 | 2126 |

**Table 4.**Inter-rater agreement between the pairs of regressor methods where the resulting models were able to predict at least one positive temperature value (${T}_{c}\ge 89$ K).

Rater 1 | Rater 2 | Value of Kappa ($\kappa$) | Level of Agreement |
---|---|---|---|
Spln-CFR | xg-b | −0.001851 | No Agreement |
Spln-CFR | mlp | 0.030476 | None to Slight |
Spln-CFR | l-regr | 0.016365 | None to Slight |
Spln-CFR | l-svr | 0.104988 | None to Slight |
Spln-CFR | krnl-r | −0.000933 | No Agreement |
xg-b | mlp | −0.001721 | No Agreement |
xg-b | l-regr | −0.000942 | No Agreement |
xg-b | l-svr | −0.001780 | No Agreement |
xg-b | krnl-r | −0.000628 | No Agreement |
mlp | l-regr | −0.001721 | No Agreement |
mlp | l-svr | 0.208516 | Fair |
mlp | krnl-r | −0.000899 | No Agreement |
l-regr | l-svr | 0.053874 | None to Slight |
l-regr | krnl-r | 0.666457 | Substantial |
l-svr | krnl-r | −0.000915 | No Agreement |

**Table 5.**Number of predictions by the different regressor methods which fall within the out-of-domain threshold range for the test set during 100 repeated runs on six datasets from [8].

Regressor | In Range | Regressor | In Range | Regressor | In Range |
---|---|---|---|---|---|
Spln-CFR | 13,560 | grad-b | 1227 | ada-b | 0 |
MARS | 3716 | mlp | 1158 | lasso-l | 0 |
l-regr | 2594 | xg-b | 826 | rf | 0 |
l-svr | 2045 | krnl-r | 735 | sgd-r | 0 |

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Moscato, P.; Haque, M.N.; Huang, K.; Sloan, J.; Corrales de Oliveira, J.
Learning to Extrapolate Using Continued Fractions: Predicting the Critical Temperature of Superconductor Materials. *Algorithms* **2023**, *16*, 382.
https://doi.org/10.3390/a16080382
