Article

Proposal for the Application of Fractional Operators in Polynomial Regression Models to Enhance the Determination Coefficient R2 on Unseen Data

by
Anthony Torres-Hernandez
1,2,3,*,
Rafael Ramirez-Melendez
1 and
Fernando Brambila-Paz
4
1
Department of Engineering, Universitat Pompeu Fabra, 08018 Barcelona, Spain
2
Department of Physics, Faculty of Science, Universidad Nacional Autónoma de México, Mexico City 04510, Mexico
3
Modeling Team, AleaSoft Energy Forecasting, 08015 Barcelona, Spain
4
Department of Mathematics, Faculty of Science, Universidad Nacional Autónoma de México, Mexico City 04510, Mexico
*
Author to whom correspondence should be addressed.
Fractal Fract. 2025, 9(6), 393; https://doi.org/10.3390/fractalfract9060393
Submission received: 19 April 2025 / Revised: 9 June 2025 / Accepted: 12 June 2025 / Published: 19 June 2025

Abstract

Although polynomial regression models are generally quite reliable for data that can be handled using a linear system, in some cases they may suffer from overfitting during the training phase, which can lead to negative values of the coefficient of determination $R^2$ when they are applied to unseen data. To address this issue, this work proposes the partial implementation of fractional operators in polynomial regression models to construct a fractional regression model. The aim of this approach is to mitigate overfitting, which could improve the $R^2$ value on unseen data compared to the conventional polynomial model, under the assumption that this could lead to predictive models with better performance. The methodology for constructing these fractional regression models is presented along with examples applicable to both Riemann–Liouville and Caputo fractional operators, where some results show that regions with initially negative or near-zero $R^2$ values exhibit remarkable improvements after the application of the fractional operator, with absolute relative increases exceeding 800% on unseen data. Finally, the importance of employing sets in the construction of the fractional regression model within this methodological framework is emphasized: from a theoretical standpoint, one could construct an uncountable family of fractional operators derived from the Riemann–Liouville and Caputo definitions that, although differing in their formulation, would yield the same regression results as those shown in the examples presented in this work.

1. Introduction

Fractional calculus represents an extension of classical calculus, dealing with derivatives and integrals of arbitrary (non-integer) order. This mathematical framework emerged concurrently with traditional calculus, spurred by Leibniz’s notation for integer-order derivatives:
$$
\frac{d^{n}}{dx^{n}}.
$$
This symbolic formulation allowed L’Hôpital to pose an intriguing question to Leibniz in their correspondence, inquiring about the meaning of taking $n = 1/2$ as the order of a derivative. Although Leibniz was unable to provide a geometric or physical interpretation at the time, he responded insightfully, stating in a letter that it “… is an apparent paradox from which, one day, useful consequences will be drawn” [1]. Furthermore, in his correspondence with Johann Bernoulli, Leibniz explicitly mentioned derivatives of “general order” [2], indicating that he had already considered the concept of non-integer derivatives before formally proposing it as a mathematical notion, even though he did not fully understand its implications. The term fractional calculus originates from this historical context, as this branch of mathematical analysis studies derivatives and integrals of order $\alpha$, where $\alpha \in \mathbb{R}$.
At present, there exists no universally accepted definition of a fractional derivative. Consequently, when the specific form is not required, the general notation employed is
$$
\frac{d^{\alpha}}{dx^{\alpha}}.
$$
Fractional operators admit various formulations, with a key property being their consistency with classical calculus in the limit $\alpha \to n$, where $n$ is an integer. For instance, let $f : \Omega \subset \mathbb{R} \to \mathbb{R}$ be a function such that $f \in L^{1}_{loc}(a,b)$, where $L^{1}_{loc}(a,b)$ denotes the space of locally integrable functions on the open interval $(a,b) \subset \Omega$. One of the fundamental constructs in this theory is the Riemann–Liouville fractional integral, defined as [3,4]
$$
{}_{a}I_{x}^{\alpha} f(x) := \frac{1}{\Gamma(\alpha)} \int_{a}^{x} (x - t)^{\alpha - 1} f(t) \, dt,
$$
where Γ denotes the Gamma function. This operator serves as a foundational element for defining the Riemann–Liouville fractional derivative, expressed as [3,5]
$$
{}_{a}D_{x}^{\alpha} f(x) :=
\begin{cases}
{}_{a}I_{x}^{-\alpha} f(x), & \text{if } \alpha < 0, \\[4pt]
\dfrac{d^{n}}{dx^{n}} \, {}_{a}I_{x}^{n - \alpha} f(x), & \text{if } \alpha \geq 0,
\end{cases}
$$
where $n = \lceil \alpha \rceil$ and ${}_{a}I_{x}^{0} f(x) := f(x)$.
Moreover, if $f : \Omega \subset \mathbb{R} \to \mathbb{R}$ is $n$-times differentiable with $f, f^{(n)} \in L^{1}_{loc}(a,b)$, the Caputo fractional derivative can also be defined via the Riemann–Liouville integral [3,5]:
$$
{}_{a}^{C}D_{x}^{\alpha} f(x) :=
\begin{cases}
{}_{a}I_{x}^{-\alpha} f(x), & \text{if } \alpha < 0, \\[4pt]
{}_{a}I_{x}^{n - \alpha} f^{(n)}(x), & \text{if } \alpha \geq 0,
\end{cases}
$$
where $n = \lceil \alpha \rceil$ and ${}_{a}I_{x}^{0} f^{(n)}(x) := f^{(n)}(x)$. Importantly, if the function $f$ satisfies the condition $f^{(k)}(a) = 0$ for all $k \in \{0, 1, \dots, n-1\}$, then the Riemann–Liouville and Caputo derivatives are equivalent:
$$
{}_{a}D_{x}^{\alpha} f(x) = {}_{a}^{C}D_{x}^{\alpha} f(x).
$$
Applying operator (2) to the monomial $x^{\mu}$, with $a = 0$ and $\mu > -1$, the following result is obtained:
$$
{}_{0}D_{x}^{\alpha} x^{\mu} = \frac{\Gamma(\mu + 1)}{\Gamma(\mu - \alpha + 1)} \, x^{\mu - \alpha}, \quad \alpha \in \mathbb{R} \setminus \mathbb{Z},
$$
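To make the closed form above concrete, the following is a minimal numerical sketch (our own illustration, not code from the article; the helper name `rl_derivative_monomial` is ours) that evaluates the Riemann–Liouville derivative of a monomial with $a = 0$ directly from the Gamma-function formula:

```python
from math import gamma

def rl_derivative_monomial(mu: float, alpha: float, x: float) -> float:
    """Riemann-Liouville derivative of x**mu (with a = 0) evaluated at x,
    via the closed form Gamma(mu + 1) / Gamma(mu - alpha + 1) * x**(mu - alpha)."""
    return gamma(mu + 1.0) / gamma(mu - alpha + 1.0) * x ** (mu - alpha)

# Consistency with classical calculus: although the formula is stated for
# non-integer alpha, its limit at alpha = 1 reproduces the ordinary
# derivative of x**3, namely 3 * x**2.
print(rl_derivative_monomial(3.0, 1.0, 2.0))  # 3 * 2**2 = 12.0

# Semigroup check on a monomial: applying the half derivative twice to x
# recovers the first derivative of x, which is 1.
c = rl_derivative_monomial(1.0, 0.5, 1.0)  # coefficient of x**0.5 in the half derivative of x
print(c * rl_derivative_monomial(0.5, 0.5, 7.0))  # ~1.0 for any x > 0
```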
and when $1 \leq \alpha \leq \mu$, it holds that ${}_{0}D_{x}^{\alpha} x^{\mu} = {}_{0}^{C}D_{x}^{\alpha} x^{\mu}$. To illustrate the diversity in the formalism of fractional operators, it is advisable to review key articles such as “A Review of Definitions for Fractional Derivatives and Integral” by Capelas de Oliveira and Tenreiro Machado (2014), which provides a comprehensive analysis of the main definitions of fractional derivatives and integrals in mathematics, physics, and engineering, establishing a systematic classification of the most commonly used operators [6]. Subsequently, the article “A Review of Definitions of Fractional Derivatives and Other Operators” by Teodoro et al. (2019) extends this review by incorporating operators that are not strictly fractional but are relevant within the broader context of fractional calculus, highlighting their applications and practical limitations in both computational and physical domains [7]. More recently, the article “How Many Fractional Derivatives Are There?” by Valério et al. (2022) proposes a unified framework for fractional derivatives based on two fundamental parameters (order and asymmetry), through which many existing formulations can be derived while also addressing misconceptions and inaccurate definitions found in the literature [8]. Collectively, these studies offer a thorough and structured overview of the conceptual and formal evolution of fractional operators, which is essential for their rigorous application across various scientific and technological disciplines.
Before proceeding, it is important to highlight the broad applicability of fractional operators across diverse scientific domains. These include finance [9]; economics [10,11]; number theory through the Riemann zeta function [12]; engineering, particularly in the development of hybrid solar receivers [13]; and physics and mathematics, where they are employed to address systems of nonlinear algebraic equations [14,15]. Such problems involve finding the set of zeros of a function $f : \Omega \subset \mathbb{R}^{n} \to \mathbb{R}^{n}$, i.e.,
$$
\left\{ \xi \in \Omega : \| f(\xi) \| = 0 \right\},
$$
where $\| \cdot \| : \mathbb{R}^{n} \to \mathbb{R}$ denotes a given norm, or equivalently,
$$
\left\{ \xi \in \Omega : [f]_{k}(\xi) = 0 \ \forall k \geq 1 \right\},
$$
with $[f]_{k} : \mathbb{R}^{n} \to \mathbb{R}$ representing the $k$-th component of the vector-valued function $f$.

2. Sets of Fractional Operators

Prior to advancing, it is crucial to emphasize that the extensive variety of fractional operators reported in the literature can be appreciated by considering seminal contributions: Osler (1970) generalizes the classical Leibniz rule to fractional derivatives and demonstrates its applicability to infinite series expansions [16]; Almeida (2017) introduces a Caputo-type fractional derivative with respect to another function and explores its properties and applications in modeling phenomena such as population growth [17]; Fu et al. (2021) derive a generalized fractional Fokker–Planck equation on fractal media through continuous-time random walks, establishing a link between stochastic processes and fractional dynamics [18]; Fan, Wu and Fu (2022) analyze the function spaces and boundedness conditions of these generalized fractional integrals in the context of continuous-time random walks [19]; Abu-Shady and Kaabar (2021) propose yet another generalized fractional derivative, demonstrating its flexibility across various fractional frameworks [20]; Saad (2020) develops a new nonsingular-kernel fractional derivative and applies it to a Legendre spectral collocation method, enhancing numerical stability [21]; Rahmat (2019) defines a conformable fractional derivative on arbitrary time scales, thereby unifying discrete and continuous fractional calculus [22]; Sousa and De Oliveira (2018) introduce the ψ -Hilfer derivative, extending Hilfer’s operator by incorporating a nonconstant kernel function ψ  [23]; Jarad et al. 
(2017) present a novel class of fractional operators that generalizes and unifies several existing definitions [24]; Atangana and Gómez-Aguilar (2017) propose a fractional derivative with a normal-distribution kernel and discuss its theoretical foundations and applications [25]; Yavuz and Özdemir (2020) compare new fractional derivatives featuring exponential and Mittag-Leffler kernels, offering a systematic assessment of their merits and limitations [26]; Liu et al. (2020) introduce a fractional derivative with a sigmoid-function kernel and develop related modeling frameworks [27]; Yang and Machado (2017) develop a variable-order fractional operator and apply it to anomalous diffusion problems, highlighting its enhanced modeling capabilities [28]; Atangana (2016) applies a novel fractional derivative to the nonlinear Fisher reaction–diffusion equation, illustrating its impact on pattern formation in complex systems [29]; He, Li and Wang (2016) employ a new fractional derivative to explain thermophysical properties of polar bear hairs, demonstrating cross-disciplinary applicability [30]; and finally, Sene (2020) formulates a fractional diffusion equation driven by a newly proposed fractional operator, providing fresh insights into anomalous transport phenomena [31]. Collectively, these contributions illustrate the remarkable diversity of fractional derivatives and motivate a set-theoretical framework for their unified characterization. This perspective underpins the methodology known as the fractional calculus of sets [32,33,34,35].
Let $h : \mathbb{R}^{m} \to \mathbb{R}$ be a scalar function, and denote by $\{\hat{e}_{k}\}_{k \geq 1}$ the canonical basis of $\mathbb{R}^{m}$. Applying Einstein’s summation convention, it is possible to define a fractional operator of order $\alpha$ as follows:
$$
o_{x}^{\alpha} h(x) := \hat{e}_{k} \, o_{k}^{\alpha} h(x).
$$
Define $\partial_{k}^{n}$ as the partial derivative of order $n$ with respect to the $k$-th coordinate of $x$. Utilizing the aforementioned operator, we introduce the following family of fractional operators:
$$
O_{x,\alpha}^{n}(h) := \left\{ o_{x}^{\alpha} : \exists \, o_{k}^{\alpha} h(x) \text{ and } \lim_{\alpha \to n} o_{k}^{\alpha} h(x) = \partial_{k}^{n} h(x) \ \forall k \geq 1 \right\},
$$
which is nonempty, as it contains, for instance, the subset
$$
O_{0,x,\alpha}^{n}(h) := \left\{ o_{x}^{\alpha} : o_{k}^{\alpha} h(x) = \partial_{k}^{n} h(x) + \mu(\alpha) \partial_{k}^{\alpha} h(x), \text{ with } \lim_{\alpha \to n} \mu(\alpha) \partial_{k}^{\alpha} h(x) = 0 \ \forall k \geq 1 \right\}.
$$
As a consequence, the following property is satisfied:
$$
\text{If } o_{i,x}^{\alpha}, o_{j,x}^{\alpha} \in O_{x,\alpha}^{n}(h) \text{ and } i \neq j \ \Rightarrow \ o_{k,x}^{\alpha} = \tfrac{1}{2}\left( o_{i,x}^{\alpha} + o_{j,x}^{\alpha} \right) \in O_{x,\alpha}^{n}(h);
$$
moreover, the complement of the set (7) is defined by
$$
O_{x,\alpha}^{n,c}(h) := \left\{ o_{x}^{\alpha} : \exists \, o_{k}^{\alpha} h(x) \ \forall k \geq 1, \text{ and } \lim_{\alpha \to n} o_{k}^{\alpha} h(x) \neq \partial_{k}^{n} h(x) \text{ for at least one } k \right\},
$$
and from this definition, the following assertion follows:
$$
\text{If } o_{i,x}^{\alpha} = \hat{e}_{k} \, o_{i,k}^{\alpha} \in O_{x,\alpha}^{n}(h) \ \Rightarrow \ o_{j,x}^{\alpha} = \hat{e}_{k} \, o_{i,\sigma_{j}(k)}^{\alpha} \in O_{x,\alpha}^{n,c}(h),
$$
where $\sigma_{j} : \{1, 2, \dots, m\} \to \{1, 2, \dots, m\}$ is a permutation different from the identity permutation.
The set (7) allows the extension of certain classical calculus elements. For instance, for $\gamma \in \mathbb{N}_{0}^{m}$ and $x \in \mathbb{R}^{m}$, the following multi-index notation can be defined:
$$
\gamma! := \prod_{k=1}^{m} [\gamma]_{k}!, \qquad |\gamma| := \sum_{k=1}^{m} [\gamma]_{k}, \qquad x^{\gamma} := \prod_{k=1}^{m} [x]_{k}^{[\gamma]_{k}}, \qquad \frac{\partial^{\gamma}}{\partial x^{\gamma}} := \frac{\partial^{[\gamma]_{1}}}{\partial [x]_{1}^{[\gamma]_{1}}} \frac{\partial^{[\gamma]_{2}}}{\partial [x]_{2}^{[\gamma]_{2}}} \cdots \frac{\partial^{[\gamma]_{m}}}{\partial [x]_{m}^{[\gamma]_{m}}}.
$$
So, given a function $h : \Omega \subset \mathbb{R}^{m} \to \mathbb{R}$, the following fractional operator can be defined:
$$
s_{x}^{\alpha \gamma}(o_{x}^{\alpha}) := o_{1}^{\alpha [\gamma]_{1}} o_{2}^{\alpha [\gamma]_{2}} \cdots o_{m}^{\alpha [\gamma]_{m}},
$$
which leads to the definition of the following set:
$$
S_{x,\alpha}^{n,\gamma}(h) := \left\{ s_{x}^{\alpha \gamma} = s_{x}^{\alpha \gamma}(o_{x}^{\alpha}) : \exists \, s_{x}^{\alpha \gamma} h(x) \text{ with } o_{x}^{\alpha} \in O_{x,\alpha}^{s}(h) \ \forall s \leq n^{2}, \text{ and } \lim_{\alpha \to k} s_{x}^{\alpha \gamma} h(x) = \frac{\partial^{k \gamma}}{\partial x^{k \gamma}} h(x) \ \forall \alpha, \ \forall |\gamma| \leq n \right\},
$$
for which the following properties are valid:
$$
\text{If } s_{x}^{\alpha \gamma} \in S_{x,\alpha}^{n,\gamma}(h) \ \Rightarrow \
\begin{cases}
\lim\limits_{\alpha \to 0} s_{x}^{\alpha \gamma} h(x) = o_{1}^{0} o_{2}^{0} \cdots o_{m}^{0} h(x) = h(x), \\[8pt]
\lim\limits_{\alpha \to 1} s_{x}^{\alpha \gamma} h(x) = o_{1}^{[\gamma]_{1}} o_{2}^{[\gamma]_{2}} \cdots o_{m}^{[\gamma]_{m}} h(x) = \dfrac{\partial^{\gamma}}{\partial x^{\gamma}} h(x), & \forall |\gamma| \leq n, \\[8pt]
\lim\limits_{\alpha \to q} s_{x}^{\alpha \gamma} h(x) = o_{1}^{q [\gamma]_{1}} o_{2}^{q [\gamma]_{2}} \cdots o_{m}^{q [\gamma]_{m}} h(x) = \dfrac{\partial^{q \gamma}}{\partial x^{q \gamma}} h(x), & \forall q |\gamma| \leq q n, \\[8pt]
\lim\limits_{\alpha \to n} s_{x}^{\alpha \gamma} h(x) = o_{1}^{n [\gamma]_{1}} o_{2}^{n [\gamma]_{2}} \cdots o_{m}^{n [\gamma]_{m}} h(x) = \dfrac{\partial^{n \gamma}}{\partial x^{n \gamma}} h(x), & \forall n |\gamma| \leq n^{2}.
\end{cases}
$$
Employing the little-$o$ notation and denoting by $B(a; \delta)$ a ball centered at $a \in \mathbb{R}^{m}$ with radius $\delta$, it is possible to derive the following result:
$$
\text{If } x \in B(a; \delta) \ \Rightarrow \ \lim_{x \to a} \frac{o\left( (x - a)^{\gamma} \right)}{(x - a)^{\gamma}} = 0 \quad \forall |\gamma| \geq 1.
$$
This motivates the introduction of the following set of functions:
$$
R_{\alpha \gamma}^{n}(a) := \left\{ r_{\alpha \gamma}^{n} : \lim_{x \to a} r_{\alpha \gamma}^{n}(x) = 0 \ \forall |\gamma| \leq n, \text{ and } r_{\alpha \gamma}^{n}(x) \in o\left( \| x - a \|^{n} \right) \ \forall x \in B(a; \delta) \right\}.
$$
Subsequently, considering the following sets of fractional operators,
$$
T_{x,\alpha,p}^{n,q,\gamma}(a, h) := \left\{ t_{x}^{\alpha,p} = t_{x}^{\alpha,p}(s_{x}^{\alpha \gamma}) : s_{x}^{\alpha \gamma} \in S_{x,\alpha}^{M,\gamma}(h) \text{ and } t_{x}^{\alpha,p} h(x) := \sum_{|\gamma| = 0}^{p} \frac{1}{\gamma!} \, s_{x}^{\alpha \gamma} h(a) (x - a)^{\gamma} + r_{\alpha \gamma}^{p}(x), \ \forall \alpha \leq n, \ \forall p \leq q \right\},
$$
$$
T_{x,\alpha,\infty}^{\gamma}(a, h) := \left\{ t_{x}^{\alpha,\infty} = t_{x}^{\alpha,\infty}(s_{x}^{\alpha \gamma}) : s_{x}^{\alpha \gamma} \in S_{x,\alpha}^{\infty,\gamma}(h) \text{ and } t_{x}^{\alpha,\infty} h(x) := \sum_{|\gamma| = 0}^{\infty} \frac{1}{\gamma!} \, s_{x}^{\alpha \gamma} h(a) (x - a)^{\gamma} \right\},
$$
it is possible to extend the classical Taylor expansion of scalar functions using multi-index notation, where M = max { n , q } . Consequently, the following results emerge:
$$
\text{If } t_{x}^{\alpha,p} \in T_{x,\alpha,p}^{1,q,\gamma}(a, h) \text{ and } \alpha \to 1 \ \Rightarrow \ t_{x}^{1,p} h(x) = h(a) + \sum_{|\gamma| = 1}^{p} \frac{1}{\gamma!} \frac{\partial^{\gamma}}{\partial x^{\gamma}} h(a) (x - a)^{\gamma} + r_{\gamma}^{p}(x).
$$
$$
\text{If } t_{x}^{\alpha,p} \in T_{x,\alpha,p}^{n,1,\gamma}(a, h) \text{ and } p \to 1 \ \Rightarrow \ t_{x}^{\alpha,1} h(x) = h(a) + \sum_{k=1}^{m} o_{k}^{\alpha} h(a) [x - a]_{k} + r_{\alpha \gamma}^{1}(x).
$$
Lastly, the set in (7) can be regarded as a generator of fractional tensor operators. For instance, given $\alpha, n \in \mathbb{R}^{d}$ with $\alpha = \hat{e}_{k} [\alpha]_{k}$ and $n = \hat{e}_{k} [n]_{k}$, it is feasible to define the following set:
$$
O_{x,\alpha}^{n}(h) := \left\{ o_{x}^{\alpha} : \exists \, o_{x}^{\alpha} h(x) \text{ and } o_{x}^{\alpha} \in O_{x,[\alpha]_{1}}^{[n]_{1}}(h) \times O_{x,[\alpha]_{2}}^{[n]_{2}}(h) \times \cdots \times O_{x,[\alpha]_{d}}^{[n]_{d}}(h) \right\}.
$$

3. Groups of Fractional Operators

Consider a function $h : \Omega \subset \mathbb{R}^{m} \to \mathbb{R}^{m}$. We introduce the following sets of fractional operators associated with a vector function:
$$
{}_{m}O_{x,\alpha}^{n}(h) := \left\{ o_{x}^{\alpha} : o_{x}^{\alpha} \in O_{x,\alpha}^{n}\left( [h]_{k} \right) \ \forall k \leq m \right\},
$$
$$
{}_{m}O_{x,\alpha}^{n,c}(h) := \left\{ o_{x}^{\alpha} : o_{x}^{\alpha} \in O_{x,\alpha}^{n,c}\left( [h]_{k} \right) \ \forall k \leq m \right\},
$$
$$
{}_{m}O_{x,\alpha}^{n,u}(h) := {}_{m}O_{x,\alpha}^{n}(h) \cup {}_{m}O_{x,\alpha}^{n,c}(h),
$$
where $[h]_{k} : \Omega \subset \mathbb{R}^{m} \to \mathbb{R}$ denotes the $k$-th component of $h$. Utilizing these sets, we define the following family of fractional operators:
$$
{}_{m}\mathrm{MO}_{x,\alpha}^{\infty,u}(h) := \bigcap_{k \in \mathbb{Z}} {}_{m}O_{x,\alpha}^{k,u}(h).
$$
It is noteworthy that this family adheres to the identity property relative to the classical Hadamard product:
$$
o_{x}^{0} \circ h(x) := h(x) \quad \forall \, o_{x}^{\alpha} \in {}_{m}\mathrm{MO}_{x,\alpha}^{\infty,u}(h).
$$
So, for every operator $o_{x}^{\alpha} \in {}_{m}\mathrm{MO}_{x,\alpha}^{\infty,u}(h)$, the corresponding fractional matrix operator is defined:
$$
A_{\alpha}(o_{x}^{\alpha}) := \left( [A_{\alpha}(o_{x}^{\alpha})]_{jk} \right) := \left( o_{k}^{\alpha} \right).
$$
Subsequently, for each $o_{x}^{\alpha} \in {}_{m}\mathrm{MO}_{x,\alpha}^{\infty,u}(h)$, the following modified Hadamard product is introduced [32]:
$$
o_{i,x}^{p\alpha} \circ o_{j,x}^{q\alpha} :=
\begin{cases}
o_{i,x}^{p\alpha} \circ o_{j,x}^{q\alpha}, & \text{if } i \neq j \ (\text{horizontal Hadamard product}), \\[4pt]
o_{i,x}^{(p+q)\alpha}, & \text{if } i = j \ (\text{vertical Hadamard product}).
\end{cases}
$$
Then, it is feasible, using the aforementioned definitions, to construct an Abelian group of fractional operators which is isomorphic to the additive group of integers, as stated in the following theorem [33]:
Theorem 1. 
Let $o_{x}^{\alpha} \in {}_{m}\mathrm{MO}_{x,\alpha}^{\infty,u}(h)$ be a fractional operator, and consider the group of integers under addition $(\mathbb{Z}, +)$. Then, employing the modified Hadamard product (29), the set of fractional matrix operators
$$
G_{m}\left( A_{\alpha}(o_{x}^{\alpha}) \right) := \left\{ A_{\alpha}^{r} = A_{\alpha}(o_{x}^{r\alpha}) : r \in \mathbb{Z} \right\}
$$
constitutes an Abelian group generated by $A_{\alpha}(o_{x}^{\alpha})$ that is isomorphic to $(\mathbb{Z}, +)$:
$$
G_{m}\left( A_{\alpha}(o_{x}^{\alpha}) \right) \cong (\mathbb{Z}, +).
$$
From this theorem, the following corollary naturally follows:
Corollary 1. 
Given $o_{x}^{\alpha} \in {}_{m}\mathrm{MO}_{x,\alpha}^{\infty,u}(h)$, the group of integers $(\mathbb{Z}, +)$, and any subgroup $H \leq \mathbb{Z}$, one can define the subset of fractional matrix operators
$$
G_{m}\left( A_{\alpha}(o_{x}^{\alpha}), H \right) := \left\{ A_{\alpha}^{r} = A_{\alpha}(o_{x}^{r\alpha}) : r \in H \right\},
$$
which forms a subgroup of $G_{m}\left( A_{\alpha}(o_{x}^{\alpha}) \right)$:
$$
G_{m}\left( A_{\alpha}(o_{x}^{\alpha}), H \right) \leq G_{m}\left( A_{\alpha}(o_{x}^{\alpha}) \right).
$$
Example 1. 
Let $\mathbb{Z}_{n}$ be the residue class ring modulo the positive integer $n$. So, for a fractional operator $o_{x}^{\alpha} \in {}_{m}\mathrm{MO}_{x,\alpha}^{\infty,u}(h)$ and $\mathbb{Z}_{14}$, it is feasible to define the following Abelian group of fractional matrix operators under the modified Hadamard product (29):
$$
G_{m}\left( A_{\alpha}(o_{x}^{\alpha}), \mathbb{Z}_{14} \right) = \left\{ A_{\alpha}^{0}, A_{\alpha}^{1}, A_{\alpha}^{2}, \dots, A_{\alpha}^{13} \right\}.
$$
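Since Theorem 1 reduces composition of these matrix operators to arithmetic on their integer exponents, the group of Example 1 can be simulated by representing each $A_{\alpha}^{r}$ by the exponent $r$ alone. The following is our own illustrative sketch (the names `compose` and `N` are ours, not the article's), checking the Abelian-group axioms exhaustively:

```python
# Each fractional matrix operator A_alpha^r is represented purely by its
# exponent r: under the vertical Hadamard product, composing operators adds
# exponents, which over Z_14 becomes addition modulo 14.
N = 14

def compose(r: int, s: int) -> int:
    """A_alpha^r composed with A_alpha^s maps to A_alpha^((r + s) mod 14)."""
    return (r + s) % N

elements = list(range(N))
# Abelian-group axioms, checked exhaustively over Z_14.
assert all(compose(r, s) in elements for r in elements for s in elements)       # closure
assert all(compose(r, 0) == r for r in elements)                                # identity A_alpha^0
assert all(any(compose(r, s) == 0 for s in elements) for r in elements)         # inverses
assert all(compose(r, s) == compose(s, r) for r in elements for s in elements)  # commutativity
print(compose(9, 8))  # exponents wrap around: A_alpha^9 composed with A_alpha^8 gives A_alpha^3
```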
Moreover, Corollary 1 facilitates the generation of groups of fractional operators under alternative operations. For instance, considering the operation
$$
A_{\alpha}^{r} \ast A_{\alpha}^{s} = A_{\alpha}^{(r \cdot s)},
$$
the following corollaries can be obtained:
Corollary 2. 
Let $M_{n}$ be the multiplicative group of positive residue classes coprime to $n$. So, for each fractional operator $o_{x}^{\alpha} \in {}_{m}\mathrm{MO}_{x,\alpha}^{\infty,u}(h)$, it is feasible to define the following set:
$$
G_{m}\left( A_{\alpha}(o_{x}^{\alpha}), M_{n} \right) := \left\{ A_{\alpha}^{r} = A_{\alpha}(o_{x}^{r\alpha}) : r \in M_{n} \right\},
$$
which forms an Abelian group under the operation (35).
Corollary 3. 
Let $\mathbb{Z}_{p}^{+}$ be the multiplicative group of positive residue classes modulo a prime $p$. So, for each fractional operator $o_{x}^{\alpha} \in {}_{m}\mathrm{MO}_{x,\alpha}^{\infty,u}(h)$, it is feasible to define the following set:
$$
G_{m}\left( A_{\alpha}(o_{x}^{\alpha}), \mathbb{Z}_{p}^{+} \right) := \left\{ A_{\alpha}^{r} = A_{\alpha}(o_{x}^{r\alpha}) : r \in \mathbb{Z}_{p}^{+} \right\},
$$
which forms an Abelian group under the operation (35).
Finally, it is important to note that when n is prime, the following equality holds:
$$
G_{m}\left( A_{\alpha}(o_{x}^{\alpha}), M_{n} \right) = G_{m}\left( A_{\alpha}(o_{x}^{\alpha}), \mathbb{Z}_{n}^{+} \right).
$$

4. Polynomial Regression Model

Polynomial regression constitutes an extension of linear regression that captures the relationship between the independent variable x and the dependent variable y through a polynomial function of degree n. Unlike simple linear regression, which assumes a purely linear association between x and y, polynomial regression introduces curvature by incorporating terms of higher powers of x. The general expression of a polynomial regression model is formulated as
$$
y = \beta_{0} + \beta_{1} x + \beta_{2} x^{2} + \cdots + \beta_{n} x^{n}.
$$
Here, the parameter $\beta_{0}$ denotes the intercept, corresponding to the expected value of $y$ when $x = 0$. The coefficients $\beta_{1}, \beta_{2}, \dots, \beta_{n}$ quantify the influence of each power of $x$ on the predicted outcome.
Incorporating polynomial terms enables the model to accommodate nonlinear tendencies while retaining linearity in terms of the parameters. The model’s adaptability increases with the polynomial degree n, thus permitting it to approximate complex data patterns more effectively.
The polynomial regression model can also be represented in matrix form, which simplifies the estimation of the coefficients. Using matrix notation, the model is expressed as
$$
\begin{pmatrix} y_{1} \\ y_{2} \\ \vdots \\ y_{m} \end{pmatrix}
=
\begin{pmatrix}
1 & x_{1} & x_{1}^{2} & \cdots & x_{1}^{n} \\
1 & x_{2} & x_{2}^{2} & \cdots & x_{2}^{n} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
1 & x_{m} & x_{m}^{2} & \cdots & x_{m}^{n}
\end{pmatrix}
\begin{pmatrix} \beta_{0} \\ \beta_{1} \\ \vdots \\ \beta_{n} \end{pmatrix}
\quad \Longleftrightarrow \quad y = X \beta,
$$
where:
  • y represents the vector of observed responses;
  • X is the design matrix, with each row corresponding to an observation and each column representing a specific power of x;
  • β denotes the vector of regression coefficients.
The parameter vector β is commonly estimated via the least squares criterion, which minimizes the sum of squared residuals between observed and predicted values. The estimator for β is given by
$$
\beta = (X^{T} X)^{-1} X^{T} y.
$$
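The estimator above can be computed directly by forming the normal equations $(X^{T}X)\beta = X^{T}y$ and solving the resulting small linear system. The following is a minimal pure-Python sketch (our own illustration; the helper names `design_matrix`, `fit_polynomial`, and `predict` are ours), using Gaussian elimination rather than an explicit matrix inverse:

```python
def design_matrix(xs, degree):
    """Vandermonde-style design matrix: row i is [1, x_i, x_i**2, ..., x_i**degree]."""
    return [[x ** j for j in range(degree + 1)] for x in xs]

def fit_polynomial(xs, ys, degree):
    """Least-squares estimate of beta by solving the normal equations
    (X^T X) beta = X^T y with Gaussian elimination and partial pivoting."""
    X = design_matrix(xs, degree)
    n = degree + 1
    # Assemble X^T X and X^T y.
    A = [[sum(X[i][r] * X[i][c] for i in range(len(xs))) for c in range(n)] for r in range(n)]
    b = [sum(X[i][r] * ys[i] for i in range(len(xs))) for r in range(n)]
    # Forward elimination with partial pivoting.
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, n):
            f = A[r][col] / A[col][col]
            for c in range(col, n):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    # Back substitution.
    beta = [0.0] * n
    for r in range(n - 1, -1, -1):
        beta[r] = (b[r] - sum(A[r][c] * beta[c] for c in range(r + 1, n))) / A[r][r]
    return beta

def predict(beta, x):
    """Evaluate the fitted polynomial at x."""
    return sum(bi * x ** i for i, bi in enumerate(beta))

# Noise-free data generated by y = 2 + 3x + x^2; the fit recovers the coefficients.
beta = fit_polynomial([0.0, 1.0, 2.0, 3.0, 4.0], [2.0, 6.0, 12.0, 20.0, 30.0], 2)
print([round(b, 6) for b in beta])  # approximately [2, 3, 1]
```

Solving the normal equations by elimination, rather than literally inverting $X^{T}X$, is the standard numerically safer reading of the formula.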
This formula is fundamental in polynomial regression as it facilitates the identification of coefficients that optimally fit the data.
Each coefficient in the polynomial regression equation carries a distinct interpretation with respect to the shape of the regression curve:
  • The intercept $\beta_{0}$ indicates the predicted value of $y$ at $x = 0$.
  • The coefficient $\beta_{1}$ pertains to the linear term $x$, governing the slope of the regression curve near $x = 0$.
  • The coefficient $\beta_{2}$ relates to the quadratic term $x^{2}$, influencing the curvature of the fit.
  • Higher-order coefficients, such as $\beta_{3}$ for $x^{3}$, $\beta_{4}$ for $x^{4}$, etc., introduce additional degrees of flexibility, enabling the model to capture more intricate patterns.
Selecting the polynomial degree n is a critical modeling decision. A polynomial of insufficient degree may fail to capture essential trends in the data, whereas a polynomial of excessively high degree risks overfitting and producing unwarranted oscillations. Common strategies for selecting the degree include cross-validation techniques and incorporating prior knowledge of the data domain.
Polynomial regression has proven to be a valuable analytical tool across a wide range of disciplines, providing nuanced modeling of nonlinear relationships frequently encountered in empirical data. For example, in economics, polynomial regression has been employed to model the Laffer Curve, which posits a nonlinear relationship between tax rates and government revenue; Trabandt and Uhlig (2011) [36] applied this methodology to data from the United States and Europe, identifying optimal taxation thresholds that maximize revenue. In physics, polynomial models are utilized to capture deviations from ideal projectile motion caused by factors such as air resistance, yielding more accurate trajectory estimations beyond simple quadratic fits (Serway and Jewett, 2018) [37]. Similarly, in engineering—particularly within materials science—higher-degree polynomial regression is used to model the stress–strain behavior of materials during plastic deformation, enabling precise characterization of mechanical properties under varying loads (Callister and Rethwisch, 2020) [38].
By enabling the modeling of nonlinear associations while maintaining a simple and interpretable parametric form, polynomial regression significantly extends the applicability of classical regression frameworks.

5. Seen and Unseen Data in Regression Models

In regression modeling, distinguishing between “seen” and “unseen” data is essential for assessing a model’s performance and its capacity to generalize. This differentiation directly impacts how effectively the model can adapt to novel data and produce trustworthy predictions in practical applications.
Seen data, often called training data, refers to the dataset utilized during the model’s learning phase. Throughout this process, the model identifies the inherent patterns and dependencies between the independent variables (features) and the dependent variable (target). By tuning parameters such as coefficients, the model aims to minimize the discrepancy between predicted outputs and actual observations within the training dataset. The accuracy and volume of training data critically influence the model’s ability to accurately represent the relationships among variables.
The primary goal of training on seen data is to derive a function that characterizes the association between input features and output targets. Nonetheless, the model must not only achieve a good fit on the training data but also extend its learned structure to unseen, future data. This capacity for generalization is vital, as it determines the model’s effectiveness when applied to new datasets or real-world problems.
Unseen data, alternatively known as test data, comprises observations that the model has not encountered during training. This dataset serves as a benchmark to evaluate the model’s predictive capabilities on novel information. The chief purpose of employing unseen data is to measure the model’s accuracy and verify that it has identified genuine patterns instead of merely memorizing the training examples.
A model’s aptitude to generalize well to unseen data is a key indicator of its robustness. A well-generalized model can leverage its learned parameters to produce reliable predictions on new inputs. High performance on both training and test datasets implies that the model has effectively captured the fundamental relationships in the data rather than fitting noise or peculiarities present in the training set. Conversely, suboptimal results on unseen data may signal poor generalization, often caused by either overfitting or underfitting.
  • Overfitting: This occurs when a model excessively adapts to the specific details and noise within the training data, thereby impairing its ability to perform accurately on unseen data. Overfitting is usually linked to models that are overly complex relative to the size or variability of the training set. Consequently, the model becomes too specialized in the training data, losing generalization capacity on new samples.
  • Underfitting: This situation arises when the model is overly simplistic and fails to capture the genuine patterns in the data, resulting in poor predictive performance on both the training and test datasets. Underfitting generally happens when the model’s parameterization is insufficient or too rigid to reflect the underlying relationships, causing inaccurate predictions and low overall efficacy.

6. The Coefficient of Determination $R^2$

A fundamental metric for assessing the efficacy of regression models is the coefficient of determination, denoted by $R^2$. This statistic quantifies the proportion of variance in the dependent variable that can be accounted for by the independent variables. Formally, $R^2$ is expressed as
$$
R^{2} = 1 - \frac{\sum_{i=1}^{n} (y_{i} - \hat{y}_{i})^{2}}{\sum_{i=1}^{n} (y_{i} - \bar{y})^{2}},
$$
where $y_{i}$ indicates the observed values, $\hat{y}_{i}$ denotes the predicted values, and $\bar{y}$ represents the mean of the observed data. The numerator corresponds to the residual sum of squares (RSS), reflecting the prediction errors, while the denominator represents the total sum of squares (TSS), measuring the overall variance in the data. An $R^2$ value approaching 1 signifies that the model explains a large fraction of the variance in the target variable.
The coefficient $R^2$ typically ranges from 0 to 1, although it can assume negative values in some cases, especially when the model is improperly specified or overfits the data. The interpretation of $R^2$ values is as follows:
  • $R^2 = 1$: This indicates a perfect fit where the model explains all variance in the dependent variable. The predicted values coincide exactly with the actual values, implying zero prediction error. However, such a result is uncommon in practical contexts and often suggests overfitting.
  • $0 < R^2 < 1$: Values within this interval imply that the model accounts for a portion of the variance. The closer $R^2$ is to 1, the better the fit and the more variance explained. For instance, $R^2 = 0.8$ indicates that 80% of the variance is captured by the model.
  • $R^2 = 0$: This value suggests that the model does not explain any variance beyond the mean of the dependent variable. Predictions are no better than simply using the average outcome, indicating no meaningful relationship is captured.
  • Negative $R^2$: Though rare, negative $R^2$ values may occur when the model performs worse than a naive mean predictor. This signals poor model fit and often results from overfitting or incorrect model formulation.
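The three regimes above can be checked directly from the definition of $R^2$. The following is a small self-contained sketch (our own helper, not code from the article) reproducing a perfect fit, a mean-level fit, and a worse-than-mean fit:

```python
def r_squared(y_obs, y_pred):
    """Coefficient of determination: R^2 = 1 - RSS/TSS."""
    mean = sum(y_obs) / len(y_obs)
    rss = sum((y - yh) ** 2 for y, yh in zip(y_obs, y_pred))  # residual sum of squares
    tss = sum((y - mean) ** 2 for y in y_obs)                 # total sum of squares
    return 1.0 - rss / tss

y = [1.0, 2.0, 3.0, 4.0]
print(r_squared(y, y))               # perfect fit -> 1.0
print(r_squared(y, [2.5] * 4))       # mean predictor -> 0.0
print(r_squared(y, [4.0, 3.0, 2.0, 1.0]))  # worse than the mean -> negative (-3.0 here)
```

The last case is exactly the situation the article targets: a model applied to unseen data can predict worse than the naive mean, driving $R^2$ below zero.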
Generally, a higher $R^2$ reflects superior model performance and greater explanatory power. Nonetheless, relying solely on $R^2$ is inadequate for model evaluation, as it does not penalize overfitting or consider model complexity. Complementary metrics and validation methods, such as cross-validation, are recommended to ensure model robustness.
A high $R^2$ on the training dataset indicates good fit, yet it is even more critical to obtain a strong $R^2$ on unseen test data. A robust $R^2$ on new data demonstrates the model’s ability to generalize true underlying patterns rather than overfit training noise. This confers several benefits:
  • Enhanced generalization: Models with elevated $R^2$ on unseen data provide reliable predictions for novel observations, increasing practical utility.
  • Mitigated overfitting: Sustained high $R^2$ on test sets suggests the model has learned substantive patterns rather than memorizing training instances.
  • Improved decision support: In domains such as finance, healthcare, and engineering, dependable predictions based on high test $R^2$ facilitate better-informed decisions.
  • Robust model selection: Comparing $R^2$ values across models aids in identifying the optimal balance between complexity and predictive accuracy without overfitting.
Therefore, ensuring that a model maintains a consistently high $R^2$ across both training and unseen datasets is crucial for building reliable, interpretable regression frameworks.

7. Fractional Regression Model

In the construction of predictive models based on time series data, it is standard practice to partition the dataset into two principal segments. The first, commonly referred to as the interpolation segment, comprises the training data used for model calibration. The second segment, associated with extrapolation, consists of data near or beyond the boundaries of the observed sample and serves to evaluate the predictive capability of the model once training and testing have been completed.
Within the framework of polynomial regression, the selection of the polynomial degree m plays a crucial role. Small variations in m—even by a single unit—can result in pronounced fluctuations in model performance, particularly during the validation phase. Such instability frequently necessitates a reevaluation of traditional modeling strategies. In response to this challenge, the present study explores a partial integration of fractional derivatives into regression models. This approach is inspired by recent developments in the literature, notably in [39,40], and aims to enhance model robustness in both interpolation and extrapolation contexts.
To examine the viability of this methodology, polynomial regression models are constructed using a dataset comprising average monthly prices of conventional and organic avocados from 2015 to 2018. This dataset is publicly accessible at the following GitHub repository: https://github.com/UOC-curso/curso/blob/main/proyectos/proyecto-1-regresion/avocado.csv (accessed on 18 April 2025).
A standard polynomial regression model of order m is given by
$$
y(x) = \sum_{i=0}^{m} \beta_{i} x^{i},
$$
where the coefficients $\{\beta_{i}\}_{i=0}^{m}$ are to be estimated from the data, generally using an interpolation segment. Considering this structure, it is feasible to define the following set:
$$
\mathrm{MO}_{x,\alpha}^{m}(y - \beta_{0}) := \bigcup_{k=-m}^{m} O_{x,\alpha}^{k}(y - \beta_{0}),
$$
which enables the construction of a fractional regression model for any operator $o_{x}^{\alpha} \in \mathrm{MO}_{x,\alpha}^{1}(y - \beta_{0})$ as follows:
$$
\sigma(\alpha, x) := \beta_{0} + \sum_{i=1}^{m} \beta_{i} \, o_{x}^{\alpha} x^{i}, \quad \alpha \in (-1, 1).
$$
It is worth mentioning that preserving the intercept term β 0 is essential to ensure that the regression model retains its original value at x = 0 . This formulation can be naturally extended to a multidimensional setting using a logistic function f, resulting in a fractional logistic regression model:
σ ( α , x ) : = f β 0 + i = 1 m β i o x i α x i , α ( 1 , 1 ) .
This generalized structure may be interpreted as a fractional activation function, thereby laying the foundation for developing fractional neural networks, provided that all features $x_i$ are non-negative. To satisfy this prerequisite, preprocessing methods such as normalization can be employed to scale all inputs to the interval $[0, 1]$.
It is important to emphasize that the concept of fractional regression is not novel in the literature. Seminal contributions in this domain include the study “Regression Using Fractional Polynomials of Continuous Covariates” by Royston and Altman (1994) [41] and, more recently, “Regression Coefficient Derivation via Fractional Calculus Framework” by Awadalla et al. (2022) [42]. These works, although differing in methodology, share the objective of enhancing the flexibility and performance of regression models with a structure different from that of classical polynomial regression.
Royston and Altman propose the use of fractional polynomials, which enrich traditional polynomial regression by incorporating real-valued exponents in covariate transformations. This method provides an effective balance between model complexity and interpretability, capturing nonlinear trends while maintaining a parametric form consistent with conventional practices.
On the other hand, Awadalla et al. introduce a fundamentally different framework grounded in fractional calculus, particularly employing Caputo derivatives. Their approach modifies the classical least squares estimation technique to incorporate fractional differentiation, thereby expanding the model class. Traditional models are recovered when the fractional order α = 1 . The practical applicability of this methodology has been demonstrated in diverse contexts, including wage analysis and demographic modeling, showing improvements in predictive accuracy.
While Royston and Altman introduce flexibility within the structure of the regression function itself, Awadalla and collaborators redefine the process by which regression coefficients are derived. Despite their methodological divergence, both frameworks contribute significantly to the advancement of regression modeling.
Nevertheless, a notable limitation of the approach by Awadalla et al. lies in its vulnerability to overfitting, particularly when employing a least squares criterion. This issue may result in negative values of the coefficient of determination $R^2$ on unseen data from the extrapolation segment, a drawback also observed in classical polynomial models.
The methodology proposed in this study aims to overcome the aforementioned limitations through a two-stage modeling framework. In the initial phase, model coefficients are estimated using conventional approaches—such as the least squares method, fixed-point algorithms, or backpropagation in the case of neural networks—applied exclusively to the interpolation segment. Overfitting risks are temporarily ignored during this stage. Subsequently, in the second phase, a fractional operator is introduced, transforming the classical model into its fractional counterpart. The parametric nature of this operator induces a controlled mismatch within the interpolation region in the fractional regression model, thereby enhancing its predictive performance in the extrapolation domain.
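The two-stage procedure can be sketched as follows: stage one fits the polynomial by ordinary least squares on the interpolation segment only, and stage two sweeps the fractional order $\alpha$ (using the power rule for the monomials) and keeps the value that maximizes $R^2$ on the extrapolation segment. The data and degree below are synthetic placeholders, not the avocado dataset:

```python
import numpy as np
from math import gamma

def frac_predict(beta, alpha, x):
    # beta[0] is preserved; the power rule is applied to the remaining monomials
    y = np.full_like(x, beta[0], dtype=float)
    for i, b in enumerate(beta[1:], start=1):
        y += b * gamma(i + 1) / gamma(i + 1 - alpha) * x ** (i - alpha)
    return y

def r2(y_true, y_pred):
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1.0 - ss_res / ss_tot

# Stage 1: classical least squares on the interpolation segment only.
x_tr = np.linspace(0.1, 3.0, 30)
y_tr = np.sin(x_tr)
beta = np.polyfit(x_tr, y_tr, 5)[::-1]   # ascending coefficients: beta[0], beta[1], ...

# Stage 2: vary alpha on a grid and keep the best extrapolation R^2.
x_te = np.linspace(3.0, 3.8, 10)         # unseen (extrapolation) segment
y_te = np.sin(x_te)
alphas = np.linspace(-0.5, 0.5, 101)
best_alpha = max(alphas, key=lambda a: r2(y_te, frac_predict(beta, a, x_te)))
```

Note that the coefficients are computed once; the second stage only re-evaluates the model while varying $\alpha$, which is what keeps the approach computationally inexpensive.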
This effect is illustrated in Figure 1, where adjustments to the fractional order $\alpha$ produce a decline in interpolation $R^2$ values while simultaneously increasing the extrapolation $R^2$. In some cases, this transformation yields a positive extrapolation $R^2$ from an initially negative one, thereby significantly improving predictive reliability on unseen data.
It is pertinent to underscore a constraint inherent to the proposed fractional regression methods: they are applicable exclusively to non-negative features. This restriction arises from the potential emergence of outputs within the domain of complex numbers when fractional operators are applied to negative inputs. However, this issue can be effectively mitigated through preprocessing or encoding techniques such as Label Encoding, Ordinal Encoding, or One-Hot Encoding, which ensure that all input variables remain within the non-negative domain. Moreover, the proposed method can be considered a low-cost mismatch approach in computational terms, as it operates on the coefficients previously calculated in classical regression models, incurring only the computational cost associated with varying the parameter $\alpha$ in the fractional operator used. Consequently, this approach can improve the coefficient of determination $R^2$ on unseen data, in some cases transforming a negative value into a positive one.
Finally, the adoption of a fractional-operator-set-based framework introduces an additional layer of flexibility and provides a foundation for broader generalization. In contrast to conventional approaches—which typically rely on a single fractional operator and have led to a proliferation of studies in the field of fractional calculus that differ only in the choice of operator while often yielding equivalent results—the present methodology supports the integration of multiple operators. This facilitates the development of diverse families of models or equations. Consequently, it opens new avenues for both theoretical research and practical applications, enabling the incorporation of novel operators as they are introduced and offering the potential for more accurate modeling of complex real-world phenomena.

Examples Using the Fractional Regression Model

Consider a polynomial regression function y of degree m and a fractional order parameter $\alpha \in (-1, 1)$. The Riemann–Liouville and Caputo fractional operators, defined in Equations (2) and (3), belong to the set $\mathrm{MO}_{x,\alpha}^{1}(y - \beta_0)$ and satisfy Equation (4). Consequently, both operators can be employed as described by Equation (5) within the fractional regression framework. To emphasize the relevance of using sets in the construction of the fractional regression model within this methodological framework, it is important to note that, from a theoretical perspective, one could construct an uncountably infinite family of fractional operators derived from the Riemann–Liouville and Caputo definitions. These operators, although differing in their formulation, would yield the same regression results as those shown in the subsequent examples and can be defined as follows:
$$\left\{ o_x^{\alpha} : o_x^{\alpha} = r\,{}_{0}D_x^{\alpha} + (1 - r)\,{}_{0}^{C}D_x^{\alpha},\ r \in [0, 1] \right\} \subset \mathrm{MO}_{x,\alpha}^{1}(y - \beta_0).$$
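The reason every member of this family yields the same regression is that, for monomials $x^i$ with $i \geq 1$ and base point 0, the Riemann–Liouville and Caputo derivatives coincide, so any convex combination of the two reduces to a single operator. This can be checked numerically by comparing the closed-form power rule against a direct quadrature of the Caputo integral definition (a rough midpoint-rule sketch; grid size and tolerance are ours):

```python
import numpy as np
from math import gamma

def rl_power_rule(i, alpha, x):
    # Closed-form Riemann-Liouville power rule, base point 0 (x > 0, i >= 1)
    return gamma(i + 1) / gamma(i + 1 - alpha) * x ** (i - alpha)

def caputo_numeric(i, alpha, x, n=400_000):
    # Caputo derivative of t**i at x from its integral definition:
    # (1/Gamma(1-alpha)) * int_0^x (x - t)**(-alpha) * i * t**(i-1) dt,
    # approximated with the midpoint rule (which avoids the endpoint singularity).
    t = (np.arange(n) + 0.5) * (x / n)
    integrand = (x - t) ** (-alpha) * i * t ** (i - 1)
    return integrand.sum() * (x / n) / gamma(1 - alpha)

rl = rl_power_rule(2, 0.5, 1.0)
ca = caputo_numeric(2, 0.5, 1.0)

# Since both values agree, r*RL + (1 - r)*Caputo is independent of r in [0, 1].
combo = lambda r: r * rl + (1 - r) * ca
```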
To illustrate the proposed method, the average monthly prices of conventional avocados are extracted from the dataset. Due to notable dispersion within some groups, the monthly median is utilized instead of the mean, as it reduces sensitivity to outliers and extreme values. The dataset is partitioned into the following:
  • Interpolation set: Comprises 80% of the data.
  • Extrapolation set: Comprises the remaining 20%.
This division is performed using the following Python 3.10.12 command:
`train_test_split(X, y_price, test_size=0.2, random_state=42, shuffle=False)`
Here, train_test_split from sklearn.model_selection separates the data into training (interpolation) and testing (extrapolation) subsets as follows:
  • X: Input features (time or date values).
  • y_price: Average price of conventional avocados.
  • test_size = 0.2: Proportion allocated for extrapolation.
  • random_state = 42: Seed for reproducibility.
  • shuffle = False: Maintains temporal order by avoiding data shuffling.
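A minimal, self-contained version of this split on a synthetic stand-in series (the real data come from the CSV linked above; the array contents here are illustrative only):

```python
import numpy as np
from sklearn.model_selection import train_test_split

# 48 monthly observations standing in for the avocado price series.
X = np.arange(48).reshape(-1, 1)        # time index feature
y_price = 1.2 + 0.01 * np.arange(48)    # placeholder average prices

X_train, X_test, y_train, y_test = train_test_split(
    X, y_price, test_size=0.2, random_state=42, shuffle=False)

# With shuffle=False the last 20% of the series becomes the extrapolation set,
# so the temporal order of the observations is preserved.
```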
This methodology enables the generation of datasets suitable for evaluating polynomial models and subsequently constructing fractional regression models to analyze avocado price dynamics by region, as illustrated in Figures 2–14.
In the forthcoming regression examples, the Student’s t-distribution is employed to construct both confidence and prediction intervals. This distribution is particularly advantageous for small sample sizes with unknown population variance. Its shape resembles that of the normal distribution but with heavier tails, reflecting the increased uncertainty inherent in the estimation process.
The t-statistic is defined as
$$t = \frac{\bar{x} - \mu}{s / \sqrt{n}},$$
where $\bar{x}$ is the sample mean, $\mu$ is the population mean, s is the sample standard deviation, and n is the sample size. As n increases, the t-distribution converges to the normal distribution. Accordingly, a confidence interval for the population mean is given by
$$CI = \hat{y} \pm t_{\alpha/2,\, n-1} \times \frac{s}{\sqrt{n}},$$
where $t_{\alpha/2,\, n-1}$ denotes the critical value from the t-distribution with $n - 1$ degrees of freedom and confidence level $1 - \alpha$. This interval estimates the range within which the true population parameter is expected to lie with the specified confidence (e.g., 95%).
Conversely, the prediction interval, which accounts for both data variability and prediction uncertainty, is expressed as
$$PI = \hat{y} \pm t_{\alpha/2,\, n-1} \times \sqrt{s^2 \left(1 + \frac{1}{n}\right)},$$
where $\hat{y}$ is the predicted value, $s^2$ is the residual variance, and n is the number of observations. Unlike the confidence interval, the prediction interval is wider, encompassing both the variability of future observations and the uncertainty in estimation.
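Both intervals can be computed with the Student's t critical value from scipy.stats; the helper below is a sketch under the stated formulas, with s estimated from the model residuals (the function name is ours):

```python
import numpy as np
from scipy import stats

def t_intervals(y_hat, residuals, confidence=0.95):
    # Confidence and prediction intervals around a predicted value y_hat.
    n = len(residuals)
    s = np.std(residuals, ddof=1)                    # sample standard deviation
    t_crit = stats.t.ppf(1 - (1 - confidence) / 2, df=n - 1)
    ci_half = t_crit * s / np.sqrt(n)                # CI half-width: t * s / sqrt(n)
    pi_half = t_crit * np.sqrt(s**2 * (1 + 1 / n))   # PI half-width: t * sqrt(s^2 (1 + 1/n))
    return (y_hat - ci_half, y_hat + ci_half), (y_hat - pi_half, y_hat + pi_half)
```

As expected from the formulas, the prediction interval is always the wider of the two, since $\sqrt{1 + 1/n} > 1/\sqrt{n}$ for every $n \geq 1$.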

8. Conclusions

This work has explored an innovative methodology aimed at improving polynomial regression models through the incorporation of fractional operators. This approach seeks to overcome classical limitations related to overfitting and predictive performance, particularly on unseen data, thereby enhancing the performance of time series forecasting.
The dataset was divided into two main segments: an interpolation segment used for model calibration and an extrapolation segment designed to evaluate the model’s predictive capacity on unseen or boundary-adjacent regions of the sample domain. This scheme is fundamental for assessing model robustness and generalization beyond the training sample.
Fractional operators parametrized by the order $\alpha \in (-1, 1)$ have proven to be an effective tool for inducing a controlled mismatch in the interpolation segment, which translates into significant improvements in the extrapolation segment. This phenomenon, which can be referred to as the fractional mismatch effect, helps mitigate the classical overfitting commonly observed in traditional polynomial models—particularly when high polynomial degrees are used or when the dataset is limited.
It is important to emphasize the role of employing sets in the construction of the fractional regression model within this methodological framework. From a theoretical perspective, one can construct an uncountably infinite family of fractional operators derived from the Riemann–Liouville and Caputo definitions. Although these operators differ in their formulation, they yield equivalent regression results to those demonstrated in the examples presented in this study. This highlights the generality and robustness of the proposed approach regardless of the specific fractional operator chosen.
For a quantitative analysis of the results shown in the previously presented examples, the metric of absolute relative increase in the coefficient of determination $R^2$ can be used to measure the improvement in the extrapolation segments after the partial application of a chosen fractional operator to the classical regression model. This metric is defined as follows:
$$\text{Absolute Relative Increase} = \left| \frac{R^2_{\text{final}} - R^2_{\text{initial}}}{R^2_{\text{initial}}} \right|.$$
This indicator allows for the evaluation of improvement regardless of the initial coefficient’s sign or magnitude, facilitating a robust interpretation even in cases with negative or near-zero values, as shown in Table 1.
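A direct implementation of the metric (function name ours; the example values are illustrative, not taken from Table 1):

```python
def absolute_relative_increase(r2_initial, r2_final):
    # |R2_final - R2_initial| / |R2_initial|: well defined for negative or
    # small (but nonzero) initial coefficients of determination.
    return abs(r2_final - r2_initial) / abs(r2_initial)

# Example: an extrapolation R^2 going from -0.05 to 0.40 is a 900% increase.
```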
Lastly, based on the analysis of the results presented in the above table, the following conclusions can be drawn:
  • Regions with initial $R^2$ values that are negative or close to zero exhibit extraordinary improvements after applying the fractional operator. Notable cases include South Carolina, Hartford Springfield, and California, where absolute relative increases exceed 800% on unseen data.
  • Regions that start with relatively high and positive $R^2$ coefficients show more modest but consistent improvements, confirming that the technique adds value even where the base model already performs reasonably well.
  • Small changes in the parameter $\alpha$ can produce large relative improvements, especially in regions with poor initial performance, highlighting the sensitivity and fine-tuning potential of fractional models.
  • These results suggest that the proposed approach contributes to overcoming the classical limitations of polynomial models, mainly by increasing the coefficient of determination $R^2$ on unseen data and allowing the generalization of the methods through the implementation of fractional operators, thereby reinforcing the practical applicability of the methodology in predictive analyses based on time series.
In summary, the partial incorporation of fractional operators into polynomial regression models presents a promising approach to improve predictive accuracy and robustness, especially in extrapolation contexts. The flexibility introduced by the fractional parameter α allows effective control over the trade-off between mismatch and generalization, mitigating typical overfitting issues associated with classical regression methods.

Author Contributions

Methodology, A.T.-H.; Software, A.T.-H.; Validation, R.R.-M. and F.B.-P.; Formal analysis, A.T.-H.; Investigation, A.T.-H.; Resources, R.R.-M. and F.B.-P.; Writing—original draft, A.T.-H.; Writing—review & editing, A.T.-H.; Visualization, A.T.-H., R.R.-M. and F.B.-P.; Supervision, R.R.-M. and F.B.-P.; Funding acquisition, R.R.-M. and F.B.-P. All authors have read and agreed to the published version of the manuscript.

Funding

This work is supported by IMPA project PID2023 152250OB-410 funded by MCIU/AEI/10.13039/501100011033/FEDER, UE.

Data Availability Statement

The data presented in this study are available in [UOC-curso] at https://github.com/UOC-curso/curso/blob/main/proyectos/proyecto-1-regresion/avocado.csv (accessed on 18 April 2025).

Conflicts of Interest

Author Anthony Torres-Hernandez was employed by the company AleaSoft Energy Forecasting during the period in which this study was carried out as part of a postdoctoral research stay in the Department of Engineering at Universitat Pompeu Fabra. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

  1. Miller, K.S.; Ross, B. An Introduction to the Fractional Calculus and Fractional Differential Equations; Wiley-Interscience: Hoboken, NJ, USA, 1993.
  2. Ross, B. The development of fractional calculus 1695–1900. Hist. Math. 1977, 4, 75–89.
  3. Hilfer, R. Applications of Fractional Calculus in Physics; World Scientific: Singapore, 2000.
  4. Oldham, K.; Spanier, J. The Fractional Calculus: Theory and Applications of Differentiation and Integration to Arbitrary Order; Elsevier: Amsterdam, The Netherlands, 1974; Volume 111.
  5. Kilbas, A.A.; Srivastava, H.M.; Trujillo, J.J. Theory and Applications of Fractional Differential Equations; Elsevier: Amsterdam, The Netherlands, 2006.
  6. De Oliveira, E.C.; Machado, J.A.T. A review of definitions for fractional derivatives and integral. Math. Probl. Eng. 2014, 2014, 1–7.
  7. Teodoro, G.S.; Machado, J.A.T.; De Oliveira, E.C. A review of definitions of fractional derivatives and other operators. J. Comput. Phys. 2019, 388, 195–208.
  8. Valério, D.; Ortigueira, M.D.; Lopes, A.M. How many fractional derivatives are there? Mathematics 2022, 10, 737.
  9. Safdari-Vaighani, A.; Heryudono, A.; Larsson, E. A radial basis function partition of unity collocation method for convection–diffusion equations arising in financial applications. J. Sci. Comput. 2015, 64, 341–367.
  10. Traore, A.; Sene, N. Model of economic growth in the context of fractional derivative. Alex. Eng. J. 2020, 59, 4843–4850.
  11. Tejado, I.; Pérez, E.; Valério, D. Fractional calculus in economic growth modelling of the Group of Seven. Fract. Calc. Appl. Anal. 2019, 22, 139–157.
  12. Guariglia, E. Fractional calculus, zeta functions and Shannon entropy. Open Math. 2021, 19, 87–100.
  13. Torres-Hernandez, A.; Brambila-Paz, F.; Rodrigo, P.M.; De-la-Vega, E. Reduction of a nonlinear system and its numerical solution using a fractional iterative method. J. Math. Stat. Sci. 2020, 6, 285–299.
  14. Torres-Hernandez, A.; Brambila-Paz, F. Fractional Newton–Raphson method. Appl. Math. Sci. Int. J. (MathSJ) 2021, 8, 1–13.
  15. Freitas, F.D.; de Oliveira, L.N. A fractional order derivative Newton–Raphson method for the computation of the power flow problem solution in energy systems. Fract. Calc. Appl. Anal. 2024, 27, 3414–3445.
  16. Osler, T.J. Leibniz rule for fractional derivatives generalized and an application to infinite series. SIAM J. Appl. Math. 1970, 18, 658–674.
  17. Almeida, R. A Caputo fractional derivative of a function with respect to another function. Commun. Nonlinear Sci. Numer. Simul. 2017, 44, 460–481.
  18. Fu, H.; Wu, G.; Yang, G.; Huang, L.-L. Continuous time random walk to a general fractional Fokker–Planck equation on fractal media. Eur. Phys. J. Spec. Top. 2021, 230, 3927–3933.
  19. Fan, Q.; Wu, G.-C.; Fu, H. A note on function space and boundedness of the general fractional integral in continuous time random walk. J. Nonlinear Math. Phys. 2022, 29, 95–102.
  20. Abu-Shady, M.; Kaabar, M.K. A generalized definition of the fractional derivative with applications. Math. Probl. Eng. 2021, 2021, 9444803.
  21. Saad, K.M. New fractional derivative with non-singular kernel for deriving Legendre spectral collocation method. Alex. Eng. J. 2020, 59, 1909–1917.
  22. Rahmat, M.R.S. A new definition of conformable fractional derivative on arbitrary time scales. Adv. Differ. Equ. 2019, 2019, 1–16.
  23. da C. Sousa, J.V.; De Oliveira, E.C. On the ψ-Hilfer fractional derivative. Commun. Nonlinear Sci. Numer. Simul. 2018, 60, 72–91.
  24. Jarad, F.; Uğurlu, E.; Abdeljawad, T.; Baleanu, D. On a new class of fractional operators. Adv. Differ. Equ. 2017, 2017, 1–16.
  25. Atangana, A.; Gómez-Aguilar, J.F. A new derivative with normal distribution kernel: Theory, methods and applications. Phys. A Stat. Mech. Its Appl. 2017, 476, 1–14.
  26. Yavuz, M.; Özdemir, N. Comparing the new fractional derivative operators involving exponential and Mittag-Leffler kernel. Discret. Contin. Dyn. Syst.-S 2020, 13, 995.
  27. Liu, J.-G.; Yang, X.-J.; Feng, Y.-Y.; Cui, P. New fractional derivative with sigmoid function as the kernel and its models. Chin. J. Phys. 2020, 68, 533–541.
  28. Yang, X.-J.; Machado, J.A.T. A new fractional operator of variable order: Application in the description of anomalous diffusion. Phys. A Stat. Mech. Its Appl. 2017, 481, 276–283.
  29. Atangana, A. On the new fractional derivative and application to nonlinear Fisher's reaction–diffusion equation. Appl. Math. Comput. 2016, 273, 948–956.
  30. He, J.-H.; Li, Z.-B.; Wang, Q.-L. A new fractional derivative and its application to explanation of polar bear hairs. J. King Saud Univ.-Sci. 2016, 28, 190–192.
  31. Sene, N. Fractional diffusion equation with new fractional operator. Alex. Eng. J. 2020, 59, 2921–2926.
  32. Torres-Hernandez, A.; Brambila-Paz, F. Sets of fractional operators and numerical estimation of the order of convergence of a family of fractional fixed-point methods. Fractal Fract. 2021, 5, 240.
  33. Torres-Hernandez, A.; Brambila-Paz, F.; Montufar-Chaveznava, R. Acceleration of the order of convergence of a family of fractional fixed point methods and its implementation in the solution of a nonlinear algebraic system related to hybrid solar receivers. Appl. Math. Comput. 2022, 429, 127231.
  34. Torres-Hernandez, A.; Brambila-Paz, F.; Ramirez-Melendez, R. Abelian groups of fractional operators. Comput. Sci. Math. Forum 2022, 4, 4.
  35. Torres-Hernandez, A. Code of a multidimensional fractional quasi-Newton method with an order of convergence at least quadratic using recursive programming. Appl. Math. Sci. Int. J. (MathSJ) 2022, 9, 17–24.
  36. Trabandt, M.; Uhlig, H. The Laffer curve revisited. J. Monet. Econ. 2011, 58, 305–327.
  37. Serway, R.A.; Jewett, J.W. Physics for Scientists and Engineers with Modern Physics, 10th ed.; Cengage Learning: Boston, MA, USA, 2018.
  38. Callister, W.D.; Rethwisch, D.G. Materials Science and Engineering: An Introduction, 10th ed.; Wiley: Hoboken, NJ, USA, 2020.
  39. Torres-Hernandez, A.; Brambila-Paz, F.; Torres-Martínez, C. Numerical solution using radial basis functions for multidimensional fractional partial differential equations of type Black–Scholes. Comput. Appl. Math. 2021, 40, 245.
  40. Torres-Hernandez, A.; Brambila-Paz, F.; Ramirez-Melendez, R. Proposal for use of the fractional derivative of radial functions in interpolation problems. Fractal Fract. 2023, 8, 16.
  41. Royston, P.; Altman, D.G. Regression using fractional polynomials of continuous covariates: Parsimonious parametric modelling. J. R. Stat. Soc. Ser. C (Appl. Stat.) 1994, 43, 429–453.
  42. Awadalla, M.; Noupoue, Y.Y.Y.; Tandogdu, Y.; Abuasbeh, K. Regression coefficient derivation via fractional calculus framework. J. Math. 2022, 2022, 1144296.
Figure 1. Illustrative examples demonstrating the mismatch effect induced by varying the α parameter in the proposed model. As the coefficient of determination in the interpolation region declines, the extrapolation coefficient shows an increasing trend. (a) Fragment of the exponential curve of the coefficient of determination for the interpolation and extrapolation segments in the California region. (b) Fragment of the exponential curve of the coefficient of determination for the interpolation and extrapolation segments in the Louisville region. (c) Fragment of the exponential curve of the coefficient of determination for the interpolation and extrapolation segments in the Jacksonville region.
Figure 2. Performance metrics and comparison of fractional regression models in the BaltimoreWashington region. (a) Box plots of average prices grouped by month using the original data. (b) Box plots of average prices grouped by month after removing outliers. (c) Natural logarithm of the root mean square error for the interpolation and extrapolation segments. (d) Exponential of the coefficient of determination for the interpolation and extrapolation segments. (e) Average price distribution using the monthly median after removing outliers—fractional regression model with $\alpha = 0$, $RMSE_{\mathrm{Interpolation}} = 0.011550$, $RMSE_{\mathrm{Extrapolation}} = 0.004006$, $R^2_{\mathrm{Interpolation}} = 0.717249$, $R^2_{\mathrm{Extrapolation}} = 0.670462$. (f) Average price distribution using the monthly median after removing outliers—fractional regression model with $\alpha = 0.05$, $RMSE_{\mathrm{Interpolation}} = 0.012029$, $RMSE_{\mathrm{Extrapolation}} = 0.003597$, $R^2_{\mathrm{Interpolation}} = 0.705527$, $R^2_{\mathrm{Extrapolation}} = 0.704112$.
Figure 3. Performance metrics and comparison of fractional regression models in the Chicago region. (a) Box plots of average prices grouped by month using the original data. (b) Box plots of average prices grouped by month after removing outliers. (c) Natural logarithm of the root mean square error for the interpolation and extrapolation segments. (d) Exponential of the coefficient of determination for the interpolation and extrapolation segments. (e) Average price distribution using the monthly median after removing outliers—fractional regression model with $\alpha = 0$, $RMSE_{\mathrm{Interpolation}} = 0.028562$, $RMSE_{\mathrm{Extrapolation}} = 0.089324$, $R^2_{\mathrm{Interpolation}} = 0.475296$, $R^2_{\mathrm{Extrapolation}} = 0.263375$. (f) Average price distribution using the monthly median after removing outliers—fractional regression model with $\alpha = 0.15$, $RMSE_{\mathrm{Interpolation}} = 0.031300$, $RMSE_{\mathrm{Extrapolation}} = 0.056405$, $R^2_{\mathrm{Interpolation}} = 0.424983$, $R^2_{\mathrm{Extrapolation}} = 0.202223$.
Figure 4. Performance metrics and comparison of fractional regression models in the HartfordSpringfield region. (a) Box plots of average prices grouped by month using the original data. (b) Box plots of average prices grouped by month after removing outliers. (c) Natural logarithm of the root mean square error for the interpolation and extrapolation segments. (d) Exponential of the coefficient of determination for the interpolation and extrapolation segments. (e) Average price distribution using the monthly median after removing outliers—fractional regression model with $\alpha = 0$, $RMSE_{\mathrm{Interpolation}} = 0.014998$, $RMSE_{\mathrm{Extrapolation}} = 0.008968$, $R^2_{\mathrm{Interpolation}} = 0.650812$, $R^2_{\mathrm{Extrapolation}} = 0.060748$. (f) Average price distribution using the monthly median after removing outliers—fractional regression model with $\alpha = 0.15$, $RMSE_{\mathrm{Interpolation}} = 0.017396$, $RMSE_{\mathrm{Extrapolation}} = 0.004151$, $R^2_{\mathrm{Interpolation}} = 0.594972$, $R^2_{\mathrm{Extrapolation}} = 0.565226$.
Figure 5. Performance metrics and comparison of fractional regression models in the Midsouth region. (a) Box plots of average prices grouped by month using the original data. (b) Box plots of average prices grouped by month after removing outliers. (c) Natural logarithm of the root mean square error for the interpolation and extrapolation segments. (d) Exponential of the coefficient of determination for the interpolation and extrapolation segments. (e) Average price distribution using the monthly median after removing outliers—fractional regression model with $\alpha = 0$, $RMSE_{\mathrm{Interpolation}} = 0.006570$, $RMSE_{\mathrm{Extrapolation}} = 0.021519$, $R^2_{\mathrm{Interpolation}} = 0.555539$, $R^2_{\mathrm{Extrapolation}} = 0.291076$. (f) Average price distribution using the monthly median after removing outliers—fractional regression model with $\alpha = 0.1$, $RMSE_{\mathrm{Interpolation}} = 0.007127$, $RMSE_{\mathrm{Extrapolation}} = 0.020223$, $R^2_{\mathrm{Interpolation}} = 0.517871$, $R^2_{\mathrm{Extrapolation}} = 0.333769$.
Figure 6. Performance metrics and comparison of fractional regression models in the California region. (a) Box plots of average prices grouped by month using the original data. (b) Box plots of average prices grouped by month after removing outliers. (c) Natural logarithm of the root mean square error for the interpolation and extrapolation segments. (d) Exponential of the coefficient of determination for the interpolation and extrapolation segments. (e) Average price distribution using the monthly median after removing outliers—fractional regression model with α = 0 , R M S E I n t e r p o l a t i o n = 0.015709 , R M S E E x t r a p o l a t i o n = 0.078348 , R I n t e r p o l a t i o n 2 = 0.372374 , R E x t r a p o l a t i o n 2 = 0.070536 . (f) Average price distribution using the monthly median after removing outliers—fractional regression model with α = 0.2 , R M S E I n t e r p o l a t i o n = 0.020504 , R M S E E x t r a p o l a t i o n = 0.034247 , R I n t e r p o l a t i o n 2 = 0.180774 , R E x t r a p o l a t i o n 2 = 0.532053 .
Figure 7. Performance metrics and comparison of fractional regression models in the GrandRapids region. (a) Box plots of average prices grouped by month using the original data. (b) Box plots of average prices grouped by month after removing outliers. (c) Natural logarithm of the root mean square error for the interpolation and extrapolation segments. (d) Exponential of the coefficient of determination for the interpolation and extrapolation segments. (e) Average price distribution using the monthly median after removing outliers—fractional regression model with α = 0 , R M S E I n t e r p o l a t i o n = 0.027854 , R M S E E x t r a p o l a t i o n = 0.044803 , R I n t e r p o l a t i o n 2 = 0.600983 , R E x t r a p o l a t i o n 2 = 0.309643 . (f) Average price distribution using the monthly median after removing outliers—fractional regression model with α = 0.05 , R M S E I n t e r p o l a t i o n = 0.028448 , R M S E E x t r a p o l a t i o n = 0.043861 , R I n t e r p o l a t i o n 2 = 0.592488 , R E x t r a p o l a t i o n 2 = 0.324153 .
Figure 8. Performance metrics and comparison of fractional regression models in the Louisville region. (a) Box plots of average prices grouped by month using the original data. (b) Box plots of average prices grouped by month after removing outliers. (c) Natural logarithm of the root mean square error for the interpolation and extrapolation segments. (d) Exponential of the coefficient of determination for the interpolation and extrapolation segments. (e) Average price distribution using the monthly median after removing outliers—fractional regression model with α = 0 , R M S E I n t e r p o l a t i o n = 0.012697 , R M S E E x t r a p o l a t i o n = 0.114513 , R I n t e r p o l a t i o n 2 = 0.141156 , R E x t r a p o l a t i o n 2 = 0.290752 . (f) Average price distribution using the monthly median after removing outliers—fractional regression model with α = 0.55 , R M S E I n t e r p o l a t i o n = 0.018416 , R M S E E x t r a p o l a t i o n = 0.063104 , R I n t e r p o l a t i o n 2 = 0.245712 , R E x t r a p o l a t i o n 2 = 0.28872 .
Figure 9. Performance metrics and comparison of fractional regression models in the NorthernNewEngland region. (a) Box plots of average prices grouped by month using the original data. (b) Box plots of average prices grouped by month after removing outliers. (c) Natural logarithm of the root mean square error for the interpolation and extrapolation segments. (d) Exponential of the coefficient of determination for the interpolation and extrapolation segments. (e) Average price distribution using the monthly median after removing outliers—fractional regression model with α = 0 , R M S E I n t e r p o l a t i o n = 0.010658 , R M S E E x t r a p o l a t i o n = 0.022661 , R I n t e r p o l a t i o n 2 = 0.636816 , R E x t r a p o l a t i o n 2 = 0.192293 . (f) Average price distribution using the monthly median after removing outliers—fractional regression model with α = 0.05 , R M S E I n t e r p o l a t i o n = 0.011036 , R M S E E x t r a p o l a t i o n = 0.020806 , R I n t e r p o l a t i o n 2 = 0.623904 , R E x t r a p o l a t i o n 2 = 0.258411 .
Figure 10. Performance metrics and comparison of fractional regression models in the Plains region. (a) Box plots of average prices grouped by month using the original data. (b) Box plots of average prices grouped by month after removing outliers. (c) Natural logarithm of the root mean square error for the interpolation and extrapolation segments. (d) Exponential of the coefficient of determination for the interpolation and extrapolation segments. (e) Average price distribution using the monthly median after removing outliers—fractional regression model with α = 0 , R M S E I n t e r p o l a t i o n = 0.013831 , R M S E E x t r a p o l a t i o n = 0.039437 , R I n t e r p o l a t i o n 2 = 0.353870 , R E x t r a p o l a t i o n 2 = 0.268231 . (f) Average price distribution using the monthly median after removing outliers—fractional regression model with α = 0.05 , R M S E I n t e r p o l a t i o n = 0.014087 , R M S E E x t r a p o l a t i o n = 0.035492 , R I n t e r p o l a t i o n 2 = 0.341887 , R E x t r a p o l a t i o n 2 = 0.341435 .
Figure 11. Performance metrics and comparison of fractional regression models in the TotalUS region. (a) Box plots of average prices grouped by month using the original data. (b) Box plots of average prices grouped by month after removing outliers. (c) Natural logarithm of the root mean square error for the interpolation and extrapolation segments. (d) Exponential of the coefficient of determination for the interpolation and extrapolation segments. (e) Average price distribution using the monthly median after removing outliers—fractional regression model with α = 0 , R M S E I n t e r p o l a t i o n = 0.008106 , R M S E E x t r a p o l a t i o n = 0.034653 , R I n t e r p o l a t i o n 2 = 0.475631 , R E x t r a p o l a t i o n 2 = 0.26149 . (f) Average price distribution using the monthly median after removing outliers—fractional regression model with α = 0.15 , R M S E I n t e r p o l a t i o n = 0.00884 , R M S E E x t r a p o l a t i o n = 0.027896 , R I n t e r p o l a t i o n 2 = 0.42815 , R E x t r a p o l a t i o n 2 = 0.405492 .
Figure 12. Performance metrics and comparison of fractional regression models in the Boise region. (a) Box plots of average prices grouped by month using the original data. (b) Box plots of average prices grouped by month after removing outliers. (c) Natural logarithm of the root mean square error for the interpolation and extrapolation segments. (d) Exponential of the coefficient of determination for the interpolation and extrapolation segments. (e) Average price distribution using the monthly median after removing outliers—fractional regression model with α = 0 , R M S E I n t e r p o l a t i o n = 0.011818 , R M S E E x t r a p o l a t i o n = 0.053103 , R I n t e r p o l a t i o n 2 = 0.612036 , R E x t r a p o l a t i o n 2 = 0.11939 . (f) Average price distribution using the monthly median after removing outliers—fractional regression model with α = 0.1 , R M S E I n t e r p o l a t i o n = 0.012309 , R M S E E x t r a p o l a t i o n = 0.043779 , R I n t e r p o l a t i o n 2 = 0.595922 , R E x t r a p o l a t i o n 2 = 0.274019 .
Figure 13. Performance metrics and comparison of fractional regression models in the Jacksonville region. (a) Box plots of average prices grouped by month using the original data. (b) Box plots of average prices grouped by month after removing outliers. (c) Natural logarithm of the root mean square error for the interpolation and extrapolation segments. (d) Exponential of the coefficient of determination for the interpolation and extrapolation segments. (e) Average price distribution using the monthly median after removing outliers—fractional regression model with α = 0 , R M S E I n t e r p o l a t i o n = 0.016762 , R M S E E x t r a p o l a t i o n = 0.053439 , R I n t e r p o l a t i o n 2 = 0.500422 , R E x t r a p o l a t i o n 2 = 0.285206 . (f) Average price distribution using the monthly median after removing outliers—fractional regression model with α = 0.25 , R M S E I n t e r p o l a t i o n = 0.026883 , R M S E E x t r a p o l a t i o n = 0.037657 , R I n t e r p o l a t i o n 2 = 0.19877 , R E x t r a p o l a t i o n 2 = 0.496305 .
Figure 14. Performance metrics and comparison of fractional regression models in the SouthCarolina region. (a) Box plots of average prices grouped by month using the original data. (b) Box plots of average prices grouped by month after removing outliers. (c) Natural logarithm of the root mean square error for the interpolation and extrapolation segments. (d) Exponential of the coefficient of determination for the interpolation and extrapolation segments. (e) Average price distribution using the monthly median after removing outliers—fractional regression model with α = 0 , R M S E I n t e r p o l a t i o n = 0.006627 , R M S E E x t r a p o l a t i o n = 0.027769 , R I n t e r p o l a t i o n 2 = 0.630994 , R E x t r a p o l a t i o n 2 = 0.0321 . (f) Average price distribution using the monthly median after removing outliers—fractional regression model with α = 0.1 , R M S E I n t e r p o l a t i o n = 0.007434 , R M S E E x t r a p o l a t i o n = 0.017295 , R I n t e r p o l a t i o n 2 = 0.586072 , R E x t r a p o l a t i o n 2 = 0.397166 .
Table 1. Differences in the α values of the fractional operators employed and the corresponding relative improvements in the coefficient of determination R 2 in the extrapolation segments across the considered regions.
| Region | Δα | R²_Extrapolation (Initial) | R²_Extrapolation (Final) | Absolute Difference | Absolute Relative Increase |
|---|---|---|---|---|---|
| BaltimoreWashington | +0.05 | 0.670462 | 0.704112 | 0.033650 | 0.0502 (5.02%) |
| Chicago | −0.15 | −0.263375 | 0.202223 | 0.465598 | 1.767 (176.7%) |
| HartfordSpringfield | +0.15 | 0.060748 | 0.565226 | 0.504478 | 8.31 (831%) |
| Midsouth | −0.10 | 0.291076 | 0.333769 | 0.042693 | 0.1467 (14.7%) |
| California | −0.20 | −0.070536 | 0.532053 | 0.602589 | 8.54 (854%) |
| GrandRapids | −0.05 | 0.309643 | 0.324153 | 0.014510 | 0.0469 (4.69%) |
| Louisville | −0.55 | −0.290752 | 0.288720 | 0.579472 | 1.993 (199.3%) |
| NorthernNewEngland | +0.05 | 0.192293 | 0.258411 | 0.066118 | 0.3439 (34.4%) |
| Plains | −0.05 | 0.268231 | 0.341435 | 0.073204 | 0.2729 (27.3%) |
| TotalUS | −0.15 | 0.261490 | 0.405492 | 0.144002 | 0.5509 (55.1%) |
| Boise | −0.10 | 0.119390 | 0.274019 | 0.154629 | 1.2947 (129.5%) |
| Jacksonville | −0.25 | 0.285206 | 0.496305 | 0.211099 | 0.7404 (74.0%) |
| SouthCarolina | −0.10 | 0.032100 | 0.397166 | 0.365066 | 11.371 (1137%) |
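The last two columns of Table 1 can be reproduced directly from the extrapolation-segment R² values: the absolute difference |R²_final − R²_initial| and the absolute relative increase |R²_final − R²_initial| / |R²_initial|. A minimal sketch of that computation (the helper name `r2_improvement` is illustrative, not from the paper):

```python
def r2_improvement(r2_initial: float, r2_final: float) -> tuple[float, float]:
    """Return (absolute difference, absolute relative increase) between
    two coefficient-of-determination values, as reported in Table 1."""
    abs_diff = abs(r2_final - r2_initial)
    # Relative increase is taken against the magnitude of the initial R²,
    # which is why negative initial values still yield a positive ratio.
    abs_rel_increase = abs_diff / abs(r2_initial)
    return abs_diff, abs_rel_increase


# Example: the Chicago row, where the initial extrapolation R² is negative.
diff, rel = r2_improvement(-0.263375, 0.202223)
print(f"diff = {diff:.6f}, relative increase = {rel:.4f}")
```

For the Chicago row this recovers the tabulated 0.465598 absolute difference and a relative increase of about 1.768 (the table rounds to 176.7%).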
Torres-Hernandez, A.; Ramirez-Melendez, R.; Brambila-Paz, F. Proposal for the Application of Fractional Operators in Polynomial Regression Models to Enhance the Determination Coefficient R2 on Unseen Data. Fractal Fract. 2025, 9, 393. https://doi.org/10.3390/fractalfract9060393
