Mathematical Computation of Piecewise Linear Regression with Endogenous Segmentation for Accurate Data-Based Model Building: An Example of the Phillips Curve

Lin, Yi-Shin; Fan, Chih-Ping; Lee, Mei-Yu; Lee, Yao-Hsien

doi:10.3390/math14061041


¹ School of Business, Wuyi University, Wuyishan 354300, China

² Department of Finance, Minghsin University of Science and Technology, Hsinchu 30401, Taiwan

³ Department of Finance, Chunghua University, Hsinchu 30012, Taiwan

* Author to whom correspondence should be addressed.

Mathematics 2026, 14(6), 1041; https://doi.org/10.3390/math14061041

Submission received: 15 February 2026 / Revised: 10 March 2026 / Accepted: 13 March 2026 / Published: 19 March 2026

(This article belongs to the Special Issue New Advances in Mathematical Economics and Financial Modelling)


Abstract

We provide a computational piecewise linear regression (CPLR) method with endogenous segmentation to construct accurate mathematical representations of data. The method models the unemployment–inflation relationship through a deterministic segmentation algorithm rather than imposing a single predetermined functional form on all of the data at once. The CPLR procedure sequentially fits local linear regressions on observations ordered by the explanatory variable and detects structural breakpoints according to a rule based on the coefficient of determination. A minimum segment size is imposed to ensure numerical stability, and the optimal minimum sample size is determined by minimizing the global root mean squared error (RMSE) across candidate segmentations. We establish a theoretical property of the CPLR algorithm: the breakpoint detection rule follows directly from the minimum least-squares error principle rather than from heuristic segmentation. The example results, using annual data from 1991 to 2022 as an illustration, reveal clear structural breaks and regime-dependent slope heterogeneity across the G7 economies. The number of segments varies substantially across countries, ranging from a stable two-segment structure in the United States to a six-segment structure in France. Positive slopes appear in specific unemployment ranges for several countries, but not for the United States. Comparisons with conventional linear and nonlinear regressions show that in several countries, including Japan, Germany, Italy, and France, the CPLR representation provides a closer local fit to the observed data than conventional global linear or nonlinear specifications.
The CPLR method, therefore, provides a deterministic and data-driven algorithm for detecting endogenous structural changes and constructing accurate piecewise linear representations without imposing prior theoretical assumptions or relying on statistical inference.

Keywords:

accurate data-based model building; computational piecewise linear regression; Phillips curve

MSC:

91-10

1. Introduction

Model misspecification remains a persistent challenge in empirical economic analysis [1], even when long historical datasets are available. Annual macroeconomic data often contain relatively few observations and limited short-term fluctuations, which makes conventional linear regression models sensitive to specification choices. As a result, empirical relationships that evolve over different levels of economic activity may not be well represented by a single global functional form. Addressing such structural heterogeneity has therefore become an important methodological issue in the construction of empirical economic models, particularly when the objective is to characterize the system’s response across different state-dependent regimes rather than merely tracking temporal changes.

The Phillips curve, which describes the relationship between unemployment rate and inflation rate, provides a representative example of this difficulty. Since the seminal work of Phillips [2], which utilized annual data to fit a specific nonlinear inverse power function, a large literature has attempted to characterize this relationship through various theoretical and empirical specifications. Many studies have introduced nonlinear functional forms, including quadratic or logarithmic models, to capture potential curvature in the data [2,3,4,5,6,7,8,9,10,11,12,13]. Others emphasize the traditional negative relationship [2,3,14]. Significant methodological debates, such as those raised by Hall and Hart [15] and Hoover [16], highlight the tension between imposing a rigid theoretical framework and capturing the authentic empirical structure. However, these approaches typically impose a global functional form, whether linear or nonlinear, before examining the data. In such cases, the estimated model may inadvertently force the observed data to conform to a prespecified global geometry, potentially obscuring localized structural shifts.

Recent empirical studies suggest that the Phillips curve may exhibit more complex structural patterns, including kinked points [17], regime-switching behavior [9], and even positive slopes within certain ranges of unemployment [7,15]. These findings indicate that the empirical relationship between unemployment and inflation may vary across different ranges of the explanatory variable. Consequently, representing the Phillips curve with a single global equation may obscure important local structural features embedded in the data.

To address this issue, this study develops a computational piecewise linear regression (CPLR) method. Unlike stochastic models that rely on probabilistic assumptions or time-series indexing, the CPLR method treats the Phillips curve as a problem of geometric fitting within a state space defined by unemployment and inflation. The fundamental premise is that while the local relationship is assumed to be linear, the overall complexity arises from a sequence of locally linear segments that transition at endogenous breakpoints. Instead of predetermining the number of observations per segment or the specific years of transition, the CPLR method identifies structural breakpoints directly from the geometric properties of the data through a deterministic sequential least-squares procedure. The CPLR method shifts the focus from temporal heterogeneity (how the curve changes over time) to state-space heterogeneity (how the curve’s slope changes across different ranges of unemployment), revealing the intrinsic geometric structure of the unemployment–inflation relationship.

In contrast to existing change-point detection or regime-switching approaches [6,18,19], the proposed CPLR method determines breakpoints through a sequential least-squares procedure that prioritizes the stability of the local fit. The conceptual foundation for integrating clustering with regression to identify local regimes traces back to the work of McGee and Carleton [20], who proposed a hierarchical procedure based on goodness-of-linear-fit criteria. Observations are first ordered by the magnitude of the explanatory variable, and linear segments are estimated iteratively. While the early framework of McGee and Carleton utilized probabilistic F-tests as stopping rules for segment aggregation, the CPLR method shifts this paradigm toward a deterministic computational optimization.

Unlike endogenous breakpoint tests [21] or the existing methods that focus on breakpoint detection or segmentation complexity control [22,23,24,25], the CPLR method identifies an optimal minimum sample size that governs the emergence of local linear segments without requiring explicit breakpoint optimization. Structural breakpoints are detected by monitoring the stability of the functional mapping within the state space as additional observations are incorporated sequentially. A breakpoint is identified when the inclusion of an additional observation, ordered by the explanatory variable, produces a substantial expansion in the local residual variance, indicating a transition to a new local regime. To ensure numerical stability and mitigate the risk of overfitting in small samples, the optimal minimum sample size is determined by minimizing the root mean squared error (RMSE) across candidate segmentations. In this way, segmentation emerges endogenously from the stability properties of the least-squares solution rather than from an explicit search over breakpoint locations.

By following a deterministic computational procedure rather than a probabilistic inference framework, the CPLR method prioritizes the structural characterization of the observed manifold over population-based forecasting. This choice addresses the inherent difficulty noted by McGee and Carleton [20], who acknowledged that formal sampling theory for piecewise hierarchical procedures remains largely intractable due to the non-continuous nature of likelihood functions. This design allows the algorithm to identify structural changes through transparent sequential computations without requiring distributional assumptions or cross-validation techniques typically reserved for stochastic models. In this sense, the CPLR method functions as a structural filter. The optimal minimum segment constraint (k*) and RMSE-based optimization act together to ensure that the identified segments represent robust local patterns rather than in-sample noise. Consequently, the Phillips curve’s segments emerge from the internal geometric consistency of the least-squares solution rather than from externally imposed breakpoint searches. Thus, the method allows the empirical structure to be defined by its internal geometric consistency instead of arbitrary time-domain splits or ex ante theoretical restrictions.

The empirical application of the CPLR method focuses on the Phillips curve for the G7 economies using annual data from 1991 to 2022. This specific timeframe and data frequency are selected to align with the historical context of Phillips’ original work, while providing a sufficiently broad range of unemployment states across diverse advanced economies. The results reveal substantial structural heterogeneity in the unemployment–inflation relationship across different countries. Specifically, the estimated piecewise linear representations exhibit multiple segments with distinct slopes, demonstrating that the sensitivity of inflation to unemployment is not a global constant but a state-dependent variable. Crucially, these segments are designed to represent specific regimes of economic pressure (unemployment ranges) rather than chronological periods, thereby providing a direct mathematical response to the changes observed in different economic states. In some regimes, the Phillips curve appears relatively steep, while in others, it flattens significantly, reflecting localized structural features that a single global equation would fail to capture.

Before proceeding, it is important to clarify the methodological scope of this study. The objective of this study is not to develop a probabilistic econometric model for statistical inference or forecasting. Instead, the study focuses on computational mathematical modeling, where the primary goal is to construct a structural representation of empirical data through deterministic least-squares algorithms. Within this framework, the CPLR method functions as a data-driven segmentation algorithm that detects structural changes through sequential least-squares estimation rather than through hypothesis testing or distribution-based inference. Consequently, the evaluation criteria emphasize the internal consistency of the least-squares solution and the overall fitting performance within the observed dataset.

The remainder of this study is organized as follows. Section 2 describes the methodology and the data. The methodology covers the linear and nonlinear regression models and details the mathematical formulation of the CPLR algorithm. The data subsection describes the source, frequency, and descriptive statistics of the unemployment rates and the inflation rates for the G7 economies. Section 3 presents the estimation results for the linear and nonlinear regressions and for the CPLR method. Section 4 concludes this study.

2. Methodology and Data

2.1. Linear Regression Model for All Samples

The most popular model of the Phillips curve is defined as follows:

P_t = Φ(u_t) + ε_t,

(1)

where P_t denotes the inflation rate, calculated as the year-over-year rate of change of the consumer price index at time t, u_t denotes the unemployment rate at time t, and ε_t denotes the error term at time t with E(ε_t) = E(u_t ε_t) = 0 and Var(ε_t) = σ², t = 1, 2, …, T. We build the function Φ: {u} ⟶ {P}, where u = {u_t, t = 1, 2, …, T} ⊂ ℝ and P = {P_t, t = 1, 2, …, T} ⊂ ℝ. The functional form of Φ(∙) represents the relationship between unemployment rates and inflation rates and is given by the following equation:

Φ (u_{t}) = P_{t}^{e} + α (u_{t} - u^{*}),

(2)

where P_t^e represents the expected inflation rate at time t, u^* represents the natural rate of unemployment, and α is the parameter that measures the response to changes in the unemployment rate. Following Al-Zeaud and Al-Hosban [26], we assume the expected inflation rate and the natural rate of unemployment are constant terms for simplification. Hence, Equation (2) can be rewritten as follows:

Φ (u_{t}) = P_{t}^{e} - α u^{*} + α u_{t},

(3)

Equation (3) is reasonable because (1) the expected inflation rate exhibits forecast errors, and (2) the natural rate of unemployment might be an average value in a specific time period [4]. Therefore, Equation (3) is used to describe the relationship between the unemployment rate and the inflation rate without considering the forecasting data as a variable. By substituting Equation (3) into Equation (1), we can obtain Equation (4) as follows:

P_t = Φ(u_t) + ε_t = β₀ + β₁ × u_t + ε_t,

(4)

where β₀ covers all other factors and any adjustments, and β₁ denotes the slope of the Phillips curve. Equation (4) is the benchmark model for the linear regression and the model form of the CPLR method.
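As an illustration of Equation (4), the following minimal Python sketch estimates β₀ and β₁ by ordinary least squares and reports R² and RMSE. The function name `fit_linear_phillips`, the use of NumPy, the toy data, and the RMSE convention (dividing the residual sum of squares by T, without a degrees-of-freedom correction) are assumptions made for this example and are not taken from the study.

```python
import numpy as np

def fit_linear_phillips(u, p):
    """OLS fit of P_t = b0 + b1 * u_t; returns (b0, b1, r2, rmse)."""
    u = np.asarray(u, dtype=float)
    p = np.asarray(p, dtype=float)
    X = np.column_stack([np.ones_like(u), u])      # intercept column plus u_t
    coef, *_ = np.linalg.lstsq(X, p, rcond=None)   # least-squares solution
    b0, b1 = float(coef[0]), float(coef[1])
    resid = p - (b0 + b1 * u)
    sse = float(np.sum(resid ** 2))                # residual sum of squares
    sst = float(np.sum((p - p.mean()) ** 2))       # total sum of squares
    r2 = 1.0 - sse / sst
    rmse = float(np.sqrt(sse / len(u)))            # assumption: divide by T
    return b0, b1, r2, rmse

# Hypothetical toy data (not from the study).
u_demo = [4.0, 5.0, 6.0, 7.0, 8.0]
p_demo = [3.1, 2.4, 2.2, 1.4, 1.1]
b0, b1, r2, rmse = fit_linear_phillips(u_demo, p_demo)
print(f"b0={b0:.3f}, b1={b1:.3f}, R2={r2:.3f}, RMSE={rmse:.3f}")
```

The same four quantities are the ones compared across the linear, nonlinear, and CPLR specifications later in the text.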

2.2. Nonlinear Regression Model for All Samples

Fitting a nonlinear regression model can help verify whether the convex and nonlinear characteristics of the Phillips curve are present. Following Hsiao et al. [27], Φ(∙) in Equation (1) is chosen as the best-fitting form among 37 candidate mathematical functions. The optimal functional form of Φ(∙) is the one with the maximum sample correlation coefficient, r(P_t, Φ(u_t)), among all candidate forms. After finding the optimal Φ(∙), we regress P_t on Φ(u_t) to obtain the estimated coefficients, R², and RMSE. Because the form is chosen by comparing the sample correlation coefficients of all 37 candidate functions, this design avoids committing to a specific functional form in advance, such as the log-transformed function used by Al-Zeaud and Al-Hosban [26] and Cristini and Ferri [6]. The nonlinear function obtained in this study displays the shape of the Phillips curve; thus, we can verify the nonlinear characteristics using the data characteristics.
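The selection step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the four candidate transformations below are hypothetical stand-ins and are not the 37 functions of Hsiao et al. [27]; the name `select_form` is invented for the example; and the absolute value of the correlation is used for ranking, which is an assumption, since the text speaks only of the maximum correlation coefficient.

```python
import numpy as np

# Illustrative candidates only; NOT the 37 forms used in the study.
CANDIDATES = {
    "power(-0.1)": lambda u: np.abs(u) ** -0.1,
    "log": np.log,
    "exp": np.exp,
    "sin": np.sin,
}

def select_form(u, p, candidates=CANDIDATES):
    """Return (best_name, scores), where scores[name] = |corr(P, Phi(u))|."""
    u = np.asarray(u, dtype=float)
    p = np.asarray(p, dtype=float)
    scores = {
        name: float(abs(np.corrcoef(phi(u), p)[0, 1]))  # assumption: rank by |r|
        for name, phi in candidates.items()
    }
    best = max(scores, key=scores.get)
    return best, scores

# Hypothetical toy data (not from the study).
u_demo = np.array([3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
p_demo = np.array([2.9, 2.1, 1.8, 1.7, 1.2, 1.1])
best, scores = select_form(u_demo, p_demo)
print(best, {name: round(s, 3) for name, s in scores.items()})
```

Once the best form is chosen, P_t is regressed on the transformed variable Φ(u_t) in the usual linear way to obtain the coefficients, R², and RMSE.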

2.3. Computational Piecewise Linear Regression (CPLR) Method

To address structural heterogeneity in the unemployment–inflation relationship, this study proposes a computational piecewise linear regression (CPLR) method that constructs a piecewise linear approximation directly from the data.

Unlike conventional piecewise regression methods that require pre-specified breakpoints or fixed segment lengths, the CPLR method algorithm determines both breakpoints and segment lengths endogenously through a sequential least-squares procedure applied to expanding ordered samples. The CPLR algorithm consists of two stages: (1) sequential segment identification, and (2) optimal minimum segment size selection. The method is deterministic and relies solely on the least-squares fitting criterion.

2.3.1. Stage 1: Sequential Segment Identification

Let the observed data be:

{\{(u_{t}, P_{t})\}}_{t = 1}^{T},

where u_t denotes the unemployment rate and P_t denotes the inflation rate. The observations are first sorted in ascending order of u_t:

u_{1} \leq u_{2} \leq \dots \leq u_{T} .

This ordering allows the algorithm to detect local structural changes in the unemployment–inflation relationship along the horizontal axis of the Phillips curve. First, a minimum segment size is defined. Each regression segment must contain at least k observations:

T \geq k \geq 5,

This constraint ensures that the regression parameters are identifiable and that excessively small segments are avoided, thereby improving numerical stability in small samples. Second, the segment is estimated sequentially. For a segment beginning at observation t, the algorithm first estimates a linear regression using the initial sample (t, t + 1, …, t + k − 1) and computes the coefficient of determination R²(t, k), where k denotes the current segment length. The segment is then expanded by including the next ordered observation: for the expanded sample (t, t + 1, …, t + k), the regression is re-estimated and the updated coefficient of determination R²(t, k + 1) is computed.

Third, the breakpoint detection rule is applied. The segment expansion continues as long as R²(t, k + 1) ≥ R²(t, k). If the inclusion of the next observation causes the coefficient of determination to decrease, that is, R²(t, k + 1) < R²(t, k), the stability of the local linear approximation deteriorates. In that case, the previous sample (t, …, t + k − 1) is defined as the final interval of the current segment, and the next observation (t + k) is treated as the starting point of the subsequent segment. This sequential procedure is repeated until all observations have been processed.

Fourth, an important clarification is in order. In the CPLR algorithm, the comparison of R²(t, k + 1) with R²(t, k) is performed between two independently estimated regressions with different sample intervals. Therefore, the regressions are not nested models estimated on a fixed sample, but distinct least-squares estimations over expanding local samples.
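The Stage 1 procedure described above can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation (their pseudocode is given in Appendix A, which is not reproduced here). The function names are invented for the example, and two behaviours are assumptions that the text does not specify: a trailing group of fewer than k observations is attached to the last segment, and a sample with constant P is treated as a perfect fit.

```python
import numpy as np

def r_squared(u, p):
    """R^2 of the OLS line fitted to one local sample."""
    b1, b0 = np.polyfit(u, p, 1)                   # slope, intercept
    sse = np.sum((p - (b0 + b1 * u)) ** 2)
    sst = np.sum((p - p.mean()) ** 2)
    return 1.0 - sse / sst if sst > 0 else 1.0     # assumption: constant P counts as a perfect fit

def cplr_segments(u, p, k=5):
    """Stage 1 sketch: (start, end) index pairs, end exclusive, over data sorted by u."""
    order = np.argsort(u, kind="stable")
    u = np.asarray(u, dtype=float)[order]
    p = np.asarray(p, dtype=float)[order]
    n = len(u)
    if n < k:
        raise ValueError("need at least k observations")
    segments, start = [], 0
    while start < n:
        if n - start < k:                          # assumption: a short tail joins the last segment
            segments[-1] = (segments[-1][0], n)
            break
        end = start + k
        r2 = r_squared(u[start:end], p[start:end])
        while end < n:
            r2_next = r_squared(u[start:end + 1], p[start:end + 1])
            if r2_next < r2:                       # breakpoint rule: R^2 decreased
                break
            r2, end = r2_next, end + 1
        segments.append((start, end))
        start = end
    return segments

# Hypothetical toy data (not from the study): two exact lines with a kink after the fifth point.
u_demo = list(range(10))
p_demo = [0, 1, 2, 3, 4, 0, -2, -4, -6, -8]
print(cplr_segments(u_demo, p_demo, k=5))
```

Each returned pair indexes the observations after sorting by u, so a segment corresponds to a range of unemployment rates rather than to a range of years.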

2.3.2. Stage 2: Optimal Minimum Segment Size Selection

In Stage 1, the minimum segment size k is treated as a fixed parameter. Stage 2 determines the optimal minimum segment size k* through global error minimization. The segmentation procedure of Stage 1 is repeated for candidate values k = 5, 6, 7, …. For each k, the CPLR model is constructed and the corresponding root mean squared error over the entire dataset, denoted RMSE(k), is computed. The optimal minimum segment size is defined as:

k^{*} = \arg \min_{k \in \mathbb{Z}^{+}} \mathrm{RMSE}(k),

(5)

The segmentation corresponding to k* is selected as the final CPLR model. The value k* acts as a structural filter that prevents excessively small segments while allowing the algorithm to detect genuine structural changes in the data.
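A self-contained Python sketch of the Stage 2 search in Equation (5) is given below. It is an illustration under stated assumptions rather than the authors' code: the function names are invented, the candidate range is taken to be k = 5, …, T, the RMSE divides the pooled residual sum of squares by T, ties are resolved in favour of the smallest k, and a trailing group of fewer than k observations is attached to the last segment.

```python
import numpy as np

def fit_line(u, p):
    """OLS line on one sample; returns (sse, r2)."""
    b1, b0 = np.polyfit(u, p, 1)
    sse = float(np.sum((p - (b0 + b1 * u)) ** 2))
    sst = float(np.sum((p - p.mean()) ** 2))
    return sse, (1.0 - sse / sst if sst > 0 else 1.0)

def stage1(u, p, k):
    """Sequential segmentation on data already sorted by u (end index exclusive)."""
    n, start, segs = len(u), 0, []
    while start < n:
        if n - start < k:                      # assumption: short tail joins the last segment
            segs[-1] = (segs[-1][0], n)
            break
        end = start + k
        r2 = fit_line(u[start:end], p[start:end])[1]
        while end < n:
            r2_next = fit_line(u[start:end + 1], p[start:end + 1])[1]
            if r2_next < r2:                   # breakpoint rule: R^2 decreased
                break
            r2, end = r2_next, end + 1
        segs.append((start, end))
        start = end
    return segs

def select_k(u, p, k_min=5):
    """Stage 2 sketch: return (k_star, rmse, segments) minimizing the global RMSE."""
    if len(u) < k_min:
        raise ValueError("need at least k_min observations")
    order = np.argsort(u, kind="stable")
    u = np.asarray(u, dtype=float)[order]
    p = np.asarray(p, dtype=float)[order]
    best = None
    for k in range(k_min, len(u) + 1):         # candidates k = k_min, ..., T
        segs = stage1(u, p, k)
        sse = sum(fit_line(u[a:b], p[a:b])[0] for a, b in segs)
        rmse = float(np.sqrt(sse / len(u)))    # assumption: RMSE divides by T
        if best is None or rmse < best[1]:     # assumption: ties keep the smallest k
            best = (k, rmse, segs)
    return best

# Hypothetical toy data (not from the study).
u_demo = list(range(10))
p_demo = [0, 1, 2, 3, 4, 0, -2, -4, -6, -8]
k_star, rmse, segs = select_k(u_demo, p_demo)
print(k_star, segs)
```

Because every candidate k triggers a full re-run of Stage 1, the search costs one segmentation per candidate, which is inexpensive for samples of a few dozen annual observations.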

2.3.3. Theoretical Property of the Breakpoint Detection Rule

The breakpoint detection rule used in the CPLR algorithm is not a heuristic stopping rule but a direct consequence of the least-squares error-minimization principle. A decrease in the coefficient of determination indicates that the additional observation deteriorates the stability of the current local linear approximation, suggesting that the expanded sample no longer belongs to the same structural regime. Let the ordered observations be {(u₁, P₁), (u₂, P₂), …, (u_T, P_T)}. For a segment beginning at observation t, define

S_{k} = \{(u_{t}, P_{t}), \dots, (u_{t + k - 1}, P_{t + k - 1})\},

as the sample used to estimate the linear regression for a segment of size k. Let R²(t, k) denote the coefficient of determination obtained from the least-squares regression estimated using the sample S_k. Because the CPLR algorithm compares regressions estimated on expanding samples, the breakpoint rule can be interpreted as a deterministic consequence of the least-squares error structure. Now consider two regressions estimated by ordinary least squares: (1) the regression estimated using sample S_k, and (2) the regression estimated using sample S_k₊₁. Let SSE(k) and SSE(k + 1) denote the residual sums of squares of the two regressions.

The coefficients of determination can be written as:

R^{2}(t, k) = 1 - \frac{\mathrm{SSE}(k)}{\mathrm{SST}(k)},

where SSE(k) is the residual sum of squares and SST(k) is the total sum of squares for sample S_k. If R²(t, k + 1) < R²(t, k), the relative least-squares error increases when the additional observation is incorporated. Because the regression is re-estimated on an expanding sample, both the residual sum of squares and the total sum of squares change simultaneously. Therefore, the coefficient of determination is not guaranteed to be monotonic in this sequential estimation framework. A decrease in R² indicates that the expanded interval provides a poorer local linear fit; that is, the regression estimated over the expanded interval S_k₊₁ has a larger relative least-squares error. Therefore, including observation t + k deteriorates the fit of the regression line representing the local relationship of the current segment. Under the constraint k ≥ k*, the segment S_k therefore provides the minimum-error representation of the data within the admissible interval. Consequently, the observation t + k marks the boundary at which the local linear structure changes, and it is selected as the starting observation of the next segment. The detailed pseudocode and mathematical formulation of the CPLR segmentation procedure are provided in Appendix A. Hence, we can establish the following proposition:

Proposition 1 (Breakpoint Detection Property).

Suppose that the regression segment must satisfy the minimum sample size constraint k ≥ k*, where k* is the optimal minimum segment size determined in Stage 2. If R²(t, k + 1) < R²(t, k), then the interval S_k provides a more accurate least-squares representation of the local relationship than the expanded sample S_k₊₁. Therefore, the observation t + k should be treated as the starting point of the next segment.
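The condition in Proposition 1 can be restated as a comparison of relative errors. The short derivation below is a sketch that uses only the definition of the coefficient of determination given in this subsection, under the added assumption that both total sums of squares are strictly positive.

```latex
% Notation as in Section 2.3.3; assumes SST(k) > 0 and SST(k+1) > 0.
\begin{align*}
R^{2}(t, k+1) < R^{2}(t, k)
  &\iff 1 - \frac{\mathrm{SSE}(k+1)}{\mathrm{SST}(k+1)}
        < 1 - \frac{\mathrm{SSE}(k)}{\mathrm{SST}(k)} \\
  &\iff \frac{\mathrm{SSE}(k+1)}{\mathrm{SST}(k+1)}
        > \frac{\mathrm{SSE}(k)}{\mathrm{SST}(k)} .
\end{align*}
% Adding an observation and refitting cannot lower either sum:
%   SSE(k+1) >= SSE(k)  and  SST(k+1) >= SST(k),
% so the sign of the change in R^2 depends on which sum grows proportionally faster.
```

In words, a breakpoint is declared when the unexplained share of the variation in the local sample rises after the next ordered observation is included.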

2.4. Sample Description

2.4.1. Data and Sample Description

To investigate the empirical relationship between inflation and unemployment rates within a computational framework, this study utilizes yearly macroeconomic data for the G7 economies (the United States (US), the United Kingdom (UK), Canada (CA), Germany (GE), Italy (IT), Japan (JP), and France (FR)) from 1991 to 2022. The dataset comprises 32 annual observations per country, which are used to detect potential structural changes in the Phillips curve relationship. This is a critical prerequisite for computational methods involving segmental analysis or nonlinear fitting. The sample size is small because the same time span is maintained for all countries.

2.4.2. Data Sources and Definitions

Data are extracted from the OECD Statistics Database (OECD [28]) and the Federal Reserve Bank of St. Louis without seasonal adjustments, ensuring consistency with the raw macroeconomic monitoring standards. The key variables are defined as follows:

Inflation rate: measured by the annual percentage change in the consumer price index (CPI), reflecting fluctuations in the general price level.
Unemployment rate: defined as the harmonized unemployment rate, representing the proportion of the labor force that is unemployed and actively seeking employment, standardized across the G7 countries to ensure cross-sectional comparability.

2.4.3. Sample Design and Data Characteristics

The data focus on the G7 advanced economies. These countries have heterogeneous macroeconomic environments, including diverse inflation–unemployment relationships and multiple structural breaks (e.g., the 2008 financial crisis and Japan’s prolonged low-inflation period), which offer a suitable testbed for evaluating the breakpoint detection capability of the CPLR method across different regimes.

Annual data are employed to maintain direct comparability with the original Phillips [2] and Samuelson and Solow [3] Phillips curve analyses, both of which were constructed using yearly observations. Using the same temporal frequency preserves the historical interpretation of the unemployment–inflation relationship and avoids distortions introduced by higher-frequency aggregation.

The dataset contains 32 annual observations per country. Given the relatively small sample size, each observation represents the complete economic performance of one year and contributes information to the structural identification of segments. Thus, no data points are discarded as noise. This characteristic makes accurate data-driven modeling essential and motivates the use of the piecewise or nonlinear regression rather than the conventional global linear fitting.

Preliminary visualization in Figure 1 reveals substantial heterogeneity and potential nonlinearity in the inflation–unemployment relationship across countries, suggesting that a single linear specification is insufficient. These features justify the adoption of the CPLR method, which constructs piecewise linear segments to capture structural changes and improve model fit.

2.4.4. Descriptive Statistics

Table 1 reports key descriptive statistics for the unemployment rates across the G7 economies from 1991 to 2022, including measures of central tendency, dispersion, distribution shape, and association with inflation. The average unemployment rate varies substantially among the G7 countries: Japan exhibits the lowest mean (3.712%), while France (9.701%) and Italy (9.640%) have the highest means, reflecting divergent labor market dynamics. Median values align closely with means for most countries, indicating relatively symmetric distributions, except for the United Kingdom and France, where slightly positive skewness coefficients (0.606 and 0.476, respectively) suggest occasional upward spikes in unemployment.

In terms of volatility, Germany exhibits the highest standard deviation (2.505%), whereas Japan shows the lowest (1.002%), indicating the most stable unemployment dynamics among the G7 economies. Kurtosis values range from 1.717 (Japan) to 2.883 (U.S.), suggesting predominantly platykurtic distributions and infrequent extreme unemployment events. Such distributional characteristics contribute to the stability of subsequent piecewise regression estimation.

Table 2 reports the key descriptive statistics for inflation rates across the G7 economies from 1991 to 2022, summarizing their central tendencies, volatility, and distributional properties, which are relevant for assessing the suitability of computational modeling. Japan exhibits the lowest mean inflation rate (0.418%) and median (0.098%), reflecting its prolonged low-inflation environment. The remaining countries fall into three groups. The group with high and similar average levels includes the United States (2.562%), the United Kingdom (2.488%), and Italy (2.483%). The lower group comprises France (1.604%) and Germany (1.942%), while Canada (2.084%) lies between these two groups.

Volatility, measured by the standard deviation, also differs across countries. Italy (1.902%) and the United Kingdom (1.607%) show relatively higher variability, while France (1.014%) and Japan (1.082%) exhibit more stable inflation dynamics. Skewness coefficients indicate non-normality in most economies: the United States (5.677), the United Kingdom (5.772), and Canada (5.988) display pronounced positive skewness, suggesting occasional extreme high-inflation episodes. Kurtosis values, ranging from 3.989 (Japan) to 5.248 (U.K.), further indicate leptokurtic distributions with heavier tails than the normal distribution, consistent with the presence of structural shocks such as oil price spikes and pandemic-related supply disruptions.

These heterogeneous distributional features, including differences in central tendencies, volatility, and higher-order moments, highlight the limitations of conventional linear specifications for modeling the inflation–unemployment relationship. Accordingly, they motivate the use of the CPLR method, which accommodates non-normality and structural breaks through piecewise regression.
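For readers who wish to reproduce statistics of the kind reported in Tables 1 and 2, the following Python sketch computes them with NumPy. The estimators are assumptions, since the text does not state which formulas were used: the standard deviation uses the sample (T − 1) divisor, and skewness and kurtosis are the moment-based coefficients, with kurtosis on the non-excess scale so that the normal distribution has a value of 3 (the scale implied by the platykurtic and leptokurtic readings above). The function name `describe` is hypothetical.

```python
import numpy as np

def describe(x):
    """Mean, median, sample std, moment skewness, and non-excess kurtosis."""
    x = np.asarray(x, dtype=float)
    mean = x.mean()
    dev = x - mean
    m2 = np.mean(dev ** 2)                      # second central moment
    m3 = np.mean(dev ** 3)                      # third central moment
    m4 = np.mean(dev ** 4)                      # fourth central moment
    return {
        "mean": float(mean),
        "median": float(np.median(x)),
        "std": float(x.std(ddof=1)),            # assumption: sample standard deviation
        "skewness": float(m3 / m2 ** 1.5),      # assumption: moment-based coefficient
        "kurtosis": float(m4 / m2 ** 2),        # normal distribution gives 3
    }

# Hypothetical toy series (not from the study).
print(describe([1.2, 0.4, 2.5, 1.9, 3.3, 0.8, 1.1, 6.0]))
```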

3. Computation Results

3.1. Linear Model Benchmark

Figure 2 and Table 3 present the baseline linear regression results for the G7 economies, highlighting the inherent limitations of a single global functional form. Although most countries, including the United States, Canada, Japan, Germany, Italy, and France, exhibit the expected negative slope coefficients, these estimates lack statistical robustness, with the sole exception of Japan (b₁ = −0.718, p-value < 0.05). The United Kingdom represents a notable case of model misspecification, yielding a positive slope (b₁ = 0.208) that reflects the inability of a static linear form to capture complex policy interventions or structural shifts, rather than a theoretical contradiction.

The inadequacy of the linear framework is further evidenced by poor goodness-of-fit metrics; the coefficient of determination (R²) remains below 0.1 for most economies, explaining less than 10% of the variance in inflation. Such low explanatory power, coupled with inconsistent mean squared errors (MSEs) ranging from 0.675 to 3.453, underscores that a global linear specification fails to accommodate the heterogeneous regimes and nonlinear adjustments present in annual macroeconomic data. These benchmark results confirm that forcing a single function upon the entire data manifold obscures localized structural features, thereby validating the necessity for the more adaptive CPLR method to characterize state-dependent inflation sensitivity.

3.2. Nonlinear Regression Results

The nonlinear regression results are summarized in Table 4 and Figure 3. The results demonstrate that the Phillips curve relationship lacks a common functional shape across the G7 economies. The optimal specifications, selected from 37 candidate forms, reveal substantial cross-country heterogeneity. First, the United States, the United Kingdom, Canada, and France favor negative power functions (|u_t|^−0.1). Second, Japan exhibits an exponential form. Third, Germany and Italy require oscillatory sinusoidal and trigonometric specifications, respectively. These diverse outcomes confirm that a single, synchronized nonlinear model cannot adequately represent the varying inflation dynamics across different advanced economies.

Analytically, the nonlinear models consistently outperform their linear counterparts in terms of explanatory power. As shown in Table 3 and Table 4, the R² increases substantially for most countries, notably the United States (0.466 vs. 0.108), the United Kingdom (0.381 vs. 0.059), and Canada (0.435 vs. 0.057). While these flexible forms provide a superior empirical representation compared to static linear regressions, the resulting curve shapes often deviate significantly from the classical smooth, convex trade-off proposed by Phillips [2] and Samuelson and Solow [3]. For instance, Germany’s oscillatory structure and Italy’s concave form suggest that annual data may contain localized structural shifts that global functions struggle to accommodate.

The primary limitation of these nonlinear specifications remains their reliance on a single global functional form. Although these models improve fitting accuracy, they are inherently constrained by their pre-specified geometry, which forces all observations to conform to a uniform curvature. This “global constraint” may still obscure regime-dependent characteristics or localized sign reversals. These findings further justify the adoption of the data-driven CPLR method, which allows structural regimes to emerge endogenously from the data’s internal geometric consistency rather than from a pre-determined global function.

3.3. Computational Piecewise Regression Results

3.3.1. Cross-Country Structural Heterogeneity

Table 5 and Figure 4 present the outcomes of the CPLR method, revealing significant cross-country differences in the structural complexity of the Phillips curve. Unlike the uniform functional forms assumed in previous sections, the CPLR method identifies a variable number of segments, ranging from a stable two-segment structure in the United States to a highly complex six-segment mapping in France. These results suggest that the true shape of the Phillips curve is not a universal constant but a country-specific attribute. In economies such as France and Italy, the frequent shifts (5–6 segments) likely reflect a history of localized structural breaks and policy regime changes, whereas the simpler structure in the United States indicates a more consistent macroeconomic relationship across different unemployment ranges.

3.3.2. State-Dependent Slope Dynamics and Sign Reversals

A key advantage of the CPLR method is its ability to capture state-dependent piecewise structures that global models obscure. As summarized in Table 5, the U.S. Phillips curve maintains a monotonic negative relationship, though the slope magnitude decreases from −1.67 (low unemployment) to −0.425 (high unemployment), consistent with the convexity findings in the literature [6]. However, for the remaining G7 economies, the CPLR method identifies localized sign reversals where slopes alternate between negative and positive regimes.

For instance, the United Kingdom exhibits a traditional trade-off (slope = −2.979) at low unemployment, but develops positive relationships (1.108 and 0.715) in higher unemployment regimes. Similar patterns are observed in Japan and Germany. These positive segments are not treated as outliers but as internally consistent local regimes [7,13,17]. Within the CPLR method framework, these transitions demonstrate that inflation sensitivity is not only variable in magnitude but can also change in direction depending on the specific state-space region (unemployment range).
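The sign reversals described above can be counted directly from the segment slopes. The following is a minimal sketch, assuming the b₁ values reported in Table 5 for five of the G7 economies; the helper name `count_sign_reversals` is hypothetical and not part of the CPLR method.

```python
# Segment slopes (b1) copied from Table 5, ordered by increasing unemployment.
slopes = {
    "US": [-1.67, -0.425],
    "UK": [-2.979, 1.108, 0.715, -1.317],
    "CA": [-3.191, -2.304, -1.648, 0.12],
    "JP": [-4.542, -7.675, 6.052, -1.425, -0.895],
    "GE": [-2.58, 2.228, 1.888, -0.61, 0.018],
}


def count_sign_reversals(values):
    """Count adjacent slope pairs with opposite signs (hypothetical helper)."""
    return sum(1 for a, b in zip(values, values[1:]) if a * b < 0)


reversals = {country: count_sign_reversals(b1) for country, b1 in slopes.items()}
for country, count in reversals.items():
    print(country, count)
```

Under this count the United States shows no reversal, while every other listed country shows at least one. The count says nothing about how well each segment fits, so it should be read alongside the segment-level R² values in Table 5.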

3.3.3. Model Performance and Adaptive Parsimony

The RMSE comparisons in Table 6 highlight a critical property of the CPLR method: adaptive parsimony. The algorithm does not uniformly dominate global linear or nonlinear models across all nations. For the United States, the nonlinear model achieves a lower RMSE (1.081) than the CPLR method (1.323), suggesting that the United States’ data manifold is sufficiently represented by a smooth, single function. Conversely, in Germany and France, where sharp structural discontinuities cannot be adequately captured by global functions, the CPLR method provides a substantial improvement in fitting accuracy.

This performance variation indicates that the CPLR method functions as a structural filter rather than a mere curve-fitting tool. By utilizing the optimal minimum sample size (k*) and RMSE-based optimization, the algorithm avoids imposing arbitrary segments when a simpler structure is sufficient. This divides the G7 into two distinct groups: those characterized by global smoothness (the U.S., U.K., Canada) and those defined by regime-dependent discontinuities (Japan, Germany, Italy, France). These findings confirm that the CPLR method enhances model construction by aligning the mathematical structure with the data’s inherent geometric consistency, effectively distinguishing persistent structural patterns from localized noise.
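The two-group split can be reproduced from the reported RMSE values. The sketch below assumes the linear and nonlinear RMSEs from Tables 3 and 4 and the CPLR RMSEs from Table 5 (with the France value taken from Table A3); the dictionary layout and variable names are illustrative choices, not part of the original analysis.

```python
# RMSE values copied from Table 3 (linear), Table 4 (nonlinear),
# and Table 5 / Table A3 (CPLR).
rmse = {
    "US": {"linear": 1.397, "nonlinear": 1.081, "cplr": 1.323},
    "UK": {"linear": 1.585, "nonlinear": 1.285, "cplr": 1.336},
    "CA": {"linear": 1.288, "nonlinear": 0.997, "cplr": 1.162},
    "JP": {"linear": 0.822, "nonlinear": 0.821, "cplr": 0.581},
    "GE": {"linear": 1.454, "nonlinear": 1.377, "cplr": 1.263},
    "IT": {"linear": 1.858, "nonlinear": 1.783, "cplr": 1.496},
    "FR": {"linear": 0.983, "nonlinear": 0.782, "cplr": 0.719},
}

# For each country, pick the specification with the smallest RMSE.
best = {country: min(values, key=values.get) for country, values in rmse.items()}

smooth_group = sorted(c for c, model in best.items() if model != "cplr")
regime_group = sorted(c for c, model in best.items() if model == "cplr")
print("global form preferred:", smooth_group)
print("CPLR preferred:", regime_group)
```

With these inputs the first group contains Canada, the United Kingdom, and the United States, and the second contains France, Germany, Italy, and Japan, which matches the grouping stated in the text.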

4. Conclusions

This study has developed a computational piecewise linear regression (CPLR) method with endogenous segmentation to address model misspecification in macroeconomic modeling. By shifting to a state-space perspective, the CPLR method provides a deterministic mathematical framework for identifying state-dependent regimes through intrinsic geometric consistency. The empirical application to the G7 economies reveals substantial cross-country variation in the unemployment–inflation relationship. As indicated in the results, the number of detected segments varies from two in the United States to six in France, illustrating that the sensitivity of inflation to unemployment is a state-dependent property of the economy rather than a global constant. This divergence in segmentation density reflects the underlying structural stability of each nation’s macroeconomic environment.

A critical observation is that the CPLR method does not systematically impose higher complexity across all economies. In the United States, the algorithm identifies a simpler structure, demonstrating its adaptive parsimony. In contrast, for economies like Japan, Germany, Italy, and France, where the data exhibit high curvature and localized sign reversals, the method provides significantly improved fitting accuracy. This confirms that the segmentation emerges endogenously from the stability properties of the least-squares solution, allowing the model to adapt to the inherent geometric consistency of each dataset without forcing arbitrary complexity.

From a methodological perspective, the CPLR method emphasizes the structural characterization of the observed manifold rather than probabilistic inference or forecasting. The segmentation process identifies regime transitions through the deterministic mapping properties of the data rather than through distributional assumptions.

While the piecewise decomposition of nonlinearity is an inherent property of any segmented regression, the CPLR method emphasizes internal geometric stability as the operational criterion for segment formation. Unlike existing methods that rely on explicit breakpoint optimization or an exhaustive search for arbitrary fits, the CPLR method lets segments emerge under the optimal minimum sample size (k*), which acts as a structural filter. The minimum sample constraint and the RMSE-based optimization together ensure geometric stability and model parsimony. This framework facilitates a transparent evaluation of the trade-off between global parsimony and local structural accuracy, ensuring that critical features, such as localized sign reversals, are not obscured. The identified regimes therefore represent consistent local structures derived from the deterministic properties of the least-squares solution, providing a transparent and reproducible representation of the observed data manifold. We acknowledge, however, that in small-sample environments such representations prioritize structural characterization over probabilistic validation.

The identification of these localized linear regimes reveals dynamic properties of the Phillips curve that are typically obscured by global functional assumptions. Our results suggest that the perceived instability in the unemployment–inflation relationship can be mathematically reinterpreted as a sequence of stable local regimes. The empirical results confirm that these segments represent specific regimes of economic pressure (unemployment ranges) rather than chronological periods. This finding provides a direct mathematical account of not only the flattening or steepening but also the localized sign reversals (transitions between negative and positive slopes) observed across different economic states. This piecewise linear state structure captures the dynamic shifts in the economy, providing a detailed map of how inflation sensitivity transitions across different states of economic pressure.

A notable finding is the emergence of positive slopes in specific unemployment ranges for several G7 economies. Within the CPLR method, these segments emerge as internally consistent local regimes rather than outliers. These sign reversals indicate that the sensitivity of inflation to unemployment can change in direction depending on the state-space region. This empirical reality suggests that imposing a rigid, negative-sloped global function may be a theoretical restriction that fails to capture authentic structural dynamics.

Several limitations should be acknowledged. First, the analysis relies on annual observations from 1991 to 2022 in order to remain consistent with the historical data frequency used in early Phillips curve studies. While this choice facilitates comparability, it limits the effective sample size for detecting short-term structural changes. Future research could extend the analysis to higher-frequency data, such as quarterly or monthly observations, to investigate finer structural dynamics. Second, the empirical analysis focuses only on the G7 advanced economies. Applying the CPLR method to a broader set of countries, including emerging economies, would allow further evaluation of the method across different macroeconomic environments. Finally, because the present study emphasizes deterministic structural modeling, issues related to the probabilistic properties and uncertainty of the estimated segments remain an open area for future investigation, particularly when larger datasets become available for simulation-based analysis.

In conclusion, the CPLR method offers a transparent, data-driven framework for detecting endogenous regime changes without imposing prior theoretical assumptions. It reveals that the Phillips curve is best represented as a piecewise linear state structure whose complexity is contingent on the specific economic state. By shifting the focus from global fitting to localized structural identification, this research provides a robust computational tool for characterizing macroeconomic structures in empirical datasets where the number of observations is limited.

Author Contributions

Conceptualization, M.-Y.L. and Y.-S.L.; methodology, M.-Y.L. and C.-P.F.; validation, Y.-H.L. and Y.-S.L.; formal analysis, M.-Y.L. and C.-P.F.; investigation, M.-Y.L. and C.-P.F.; writing—original draft preparation, Y.-H.L. and C.-P.F.; writing—review and editing, Y.-H.L. and M.-Y.L.; visualization, M.-Y.L. and Y.-S.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The data presented in this study are openly available in OECD at https://www.oecd.org/en/data/indicators/unemployment-rate.html (accessed on 19 August 2023) and FRED at https://fred.stlouisfed.org/ (accessed on 19 August 2023).

Acknowledgments

The authors thank the reviewers for their useful feedback.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A. Pseudocode and Mathematical Formulation for the CPLR Method

Appendix A provides the algorithmic procedure and the mathematical formulation of the segments for the CPLR method.

Appendix A.1. Algorithm of the Pseudocode Procedure

Input: the ordered observations

\{(u_{t}, P_{t})\}_{t = 1}^{T},

where T is the number of observations (denoted n in the pseudocode).

Output: the piecewise linear regression model with endogenous breakpoints.

Stage 1: Sequential Segment Identification

Sort observations by unemployment rate:
        u1 ≤ u2 ≤ … ≤ un
Set minimum segment size k (initially k ≥ 5)
Initialize starting index:
        i ← 1
while i ≤ n − k + 1 do
        Set current segment size:
                kin ← k
        Estimate linear regression using observations
                (i, …, i + kin − 1)
        Compute R²(i, kin)
        while (i + kin ≤ n) do
                Estimate regression using expanded sample
                        (i, …, i + kin)
                Compute R²(i, kin + 1)
                if R²(i, kin + 1) ≥ R²(i, kin) then
                        kin ← kin + 1
                        Update R²(i, kin)
                else
                        Breakpoint detected
                        Exit inner loop
                end if
        end while
        Record segment (i, …, i + kin − 1)
        i ← i + kin
end while
if i ≤ n then
        Record remaining observations (i, …, n) as the final segment
end if
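The Stage 1 loop can be sketched in Python as follows. This is a minimal illustration rather than the authors' implementation: the function names, the use of `numpy.polyfit`, and the synthetic V-shaped data are assumptions, and the handling of a tail shorter than k (recorded as the final segment) is inferred from the segment counts in Table A4.

```python
import numpy as np


def r_squared(u, p):
    """R^2 of the least-squares line p = b0 + b1 * u."""
    b1, b0 = np.polyfit(u, p, 1)
    ss_res = np.sum((p - (b0 + b1 * u)) ** 2)
    ss_tot = np.sum((p - p.mean()) ** 2)
    # Constant p has no variance to explain; returning 1.0 is a convention of this sketch.
    return 1.0 - ss_res / ss_tot if ss_tot > 0 else 1.0


def stage1_segments(u, p, k):
    """Return (start, stop) index pairs, stop exclusive, over the data sorted by u."""
    order = np.argsort(u, kind="stable")
    u = np.asarray(u, dtype=float)[order]
    p = np.asarray(p, dtype=float)[order]
    n, i, segments = len(u), 0, []
    while n - i >= k:                      # enough observations left for a new segment
        size = k
        r2 = r_squared(u[i:i + size], p[i:i + size])
        while i + size < n:                # one more observation is available
            r2_next = r_squared(u[i:i + size + 1], p[i:i + size + 1])
            if r2_next < r2:               # fit deteriorates: breakpoint detected
                break
            size, r2 = size + 1, r2_next   # otherwise extend the segment
        segments.append((i, i + size))
        i += size
    if i < n:
        # Assumption: a tail shorter than k becomes the final segment.
        segments.append((i, n))
    return segments


# Synthetic V-shaped data (not from the paper): slope -2 up to u = 8, slope +2 after.
u = np.arange(1, 17, dtype=float)
p = np.where(u <= 8, 20 - 2 * u, 2 * u - 12)
segments = stage1_segments(u, p, k=5)
print(segments)
```

On this noise-free example the first segment grows until the ninth observation lowers R², so the break falls exactly at the kink. Real data would rarely be this clean, and exact ties in R² extend the segment, as in the pseudocode.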

Stage 2: Optimal Minimum Segment Size Selection

for k = 5, 6, 7, … do
        Apply Stage 1 segmentation
        Compute global RMSE(k) across all observations
end for
Select optimal minimum segment size
        k* = arg min RMSE(k)
Return segmentation obtained with k*
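Stage 2 reduces to a small search over candidate values of k. The sketch below assumes that the global RMSE pools squared residuals over all segments and divides by the number of observations, which the pseudocode does not spell out. To stay self-contained it takes the segmentation rule as a parameter and demonstrates it with a simple stand-in (`equal_blocks`), which is not the Stage 1 rule; all names are hypothetical.

```python
import numpy as np


def global_rmse(u, p, segments):
    """Global RMSE when every (start, stop) slice is fitted by its own line."""
    sse = 0.0
    for start, stop in segments:
        us, ps = u[start:stop], p[start:stop]
        b1, b0 = np.polyfit(us, ps, 1)
        sse += float(np.sum((ps - (b0 + b1 * us)) ** 2))
    # Assumption: pooled squared error is divided by the number of observations.
    return float(np.sqrt(sse / len(u)))


def select_k(u, p, segmenter, k_values):
    """Return (k*, RMSE(k*)); ties go to the first k listed."""
    scores = {k: global_rmse(u, p, segmenter(u, p, k)) for k in k_values}
    k_star = min(scores, key=scores.get)
    return k_star, scores[k_star]


def equal_blocks(u, p, k):
    """Stand-in segmenter (NOT the Stage 1 rule): blocks of k points, tail merged into the last."""
    starts = list(range(0, len(u) - k + 1, k))
    stops = starts[1:] + [len(u)]
    return list(zip(starts, stops))


# Synthetic V-shaped data (not from the paper), already sorted by u.
u = np.arange(1, 17, dtype=float)
p = np.where(u <= 8, 20 - 2 * u, 2 * u - 12)
k_star, best_rmse = select_k(u, p, equal_blocks, k_values=[5, 6, 8])
print(k_star, best_rmse)
```

Passing the segmenter as an argument keeps the selection step independent of how segments are formed, so any implementation of Stage 1 with the same signature could be substituted. The candidate list is given explicitly because the pseudocode leaves the upper bound of k open.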

Appendix A.2. Mathematical Formulation of Segments

After breakpoint detection and size validation, the overall relationship, Φ(u_t), is expressed as a piecewise function:

\Phi(u_{t}) = \begin{cases} \beta_{0,1} + \beta_{1,1} u_{t}, & \text{if } u_{t} \in [L_{1}, R_{1}] \\ \beta_{0,2} + \beta_{1,2} u_{t}, & \text{if } u_{t} \in (R_{1}, R_{2}] \\ \vdots & \vdots \\ \beta_{0,m} + \beta_{1,m} u_{t}, & \text{if } u_{t} \in (R_{m-1}, R_{m}] \end{cases},

(A1)

where m denotes the total number of segments, [L_i, R_i] denotes the range of the ith segment for the unemployment rate, and β_0,i and β_1,i are the intercept and slope of the ith segment, respectively. Each segment’s range is determined by the breakpoints identified in Stage 1, with the length validated in Stage 2. The piecewise regression can be written as a single function by using dummy variables that allow the intercept and slope to differ across segments. Let D_j denote the dummy variable for segment j, j = 2, 3, …, m:

D_{j} = \begin{cases} 1, & \text{if } u_{t} \text{ lies in segment } j \\ 0, & \text{otherwise} \end{cases}

(A2)

P_{t} = (b_{0,1} + b_{1,1} u_{t}) (1 - D_{2} - D_{3} - \dots - D_{m}) + (b_{0,2} + b_{1,2} u_{t}) D_{2} + \dots + (b_{0,m} + b_{1,m} u_{t}) D_{m}.

(A3)

The linear equations estimated for each data-based segment are combined to construct a single equation representing the overall dataset. Equation (A3) presents the final estimated specification of the CPLR method in this study. Unlike the piecewise model of Cristini and Ferri [6], our method identifies the unemployment rate segments and associates each segment with its corresponding regression equation, thereby forming a piecewise linear model that more accurately captures the structural features of the Phillips curve and facilitates the direct visualization in a two-axis plot.
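As an illustration of Equation (A3), the sketch below evaluates the single-equation form for the United States, assuming the two segment ranges and coefficients reported in Table 5. The function names and the dictionary layout are hypothetical.

```python
# US segment ranges and coefficients copied from Table 5.
segments = [
    {"lo": 3.698, "hi": 5.366, "b0": 10.216, "b1": -1.67},
    {"lo": 5.415, "hi": 9.768, "b0": 5.44, "b1": -0.425},
]


def dummies(u, segments):
    """D_j for j = 2..m: 1 if u lies in segment j, else 0."""
    return [1 if s["lo"] <= u <= s["hi"] else 0 for s in segments[1:]]


def fitted_inflation(u, segments):
    """Evaluate the dummy-variable form of Equation (A3) at unemployment rate u."""
    d = dummies(u, segments)
    first, rest = segments[0], segments[1:]
    # The first segment is active whenever no other dummy is switched on.
    value = (first["b0"] + first["b1"] * u) * (1 - sum(d))
    value += sum((s["b0"] + s["b1"] * u) * dj for s, dj in zip(rest, d))
    return value


low = fitted_inflation(4.0, segments)    # falls in the first segment
high = fitted_inflation(7.0, segments)   # falls in the second segment
print(round(low, 3), round(high, 3))
```

Because the first segment carries the factor (1 − D₂ − … − D_m), any value of u outside the ranges of segments 2 to m is evaluated with the first segment's line. That includes values in the small gap between observed segment ranges, which is a property of the dummy-variable form rather than a modeling recommendation.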

Appendix B. Comparative Analysis of CPLR and Complexity-Adjusted Selection Criteria (AICc)

While information criteria such as the AICc provide a standard for probabilistic generalization by penalizing parameter count, their derivation relies on likelihood-based assumptions (e.g., normally distributed errors) that differ from the deterministic least-squares framework of the CPLR method.

As shown in Table A1, for economies with relatively smooth data manifolds (e.g., the United States), AICc favors global functional forms. However, in economies exhibiting localized structural heterogeneity (e.g., Germany and France), the CPLR method prioritizes the local structural identification of the observed manifold. The divergence between RMSE-based selection and AICc rankings highlights a fundamental trade-off: while the AICc prioritizes global parsimony, the CPLR method ensures that critical localized sign reversals—essential for characterizing state-dependent inflation sensitivity—are not obscured by global smoothness constraints.

This study focuses on constructing data-based mathematical models based on least-squares fitting rather than likelihood-based or predictive frameworks. Nevertheless, to address potential concerns regarding overfitting, we report several commonly used complexity-adjusted diagnostics as supplementary evidence. First, the corrected Akaike Information Criterion (AICc) is provided as a reference measure of model parsimony. Although the AICc is derived under likelihood-based assumptions and is not part of the present estimation procedure, it offers an external comparison that penalizes excessive parameterization, particularly in small samples. Table A1 summarizes the AICc values for three alternative specifications: linear regression, nonlinear regression, and the CPLR method. For each country, the smallest value is marked with an asterisk (*), indicating the relatively most parsimonious specification under this criterion.

Two caveats apply. The AICc tends to favor overly simple specifications in small samples because of its strong penalty on parameterization, and it is derived from likelihood-based estimation that implicitly relies on distributional assumptions (e.g., independent and normally distributed errors). Since this study does not impose parametric assumptions on the data-generating process and instead adopts a purely least-squares, data-based modeling strategy, the AICc values are treated as supplementary reference diagnostics rather than as the primary selection criterion.

To provide an additional complexity-adjusted measure of goodness-of-fit, we report the adjusted coefficient of determination, adjusted R². By incorporating a degrees-of-freedom correction, adjusted R-squared penalizes unnecessary parameterization and therefore serves as a complementary indicator for assessing potential overfitting while remaining consistent with the least-squares framework. This measure evaluates explanatory efficiency while discouraging unnecessarily complicated specifications. Table A2 summarizes the adjusted R² results for the three models.
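The degrees-of-freedom correction can be illustrated with the standard adjusted R² formula. The text does not state which formula was used, so the sketch below is an assumption: it takes n = 32 observations and one slope parameter, and applies the formula to the United States R² values from Tables 3 and 4. How parameters are counted for the CPLR column is not specified and is not attempted here.

```python
def adjusted_r2(r2, n, p):
    """Standard adjusted R^2 for n observations and p slope parameters."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)


# US R^2 values from Table 3 (linear) and Table 4 (nonlinear); n = 32 annual observations.
us_linear = adjusted_r2(0.108, n=32, p=1)
us_nonlinear = adjusted_r2(0.466, n=32, p=1)
print(round(us_linear, 3), round(us_nonlinear, 3))
```

Under these assumptions the results round to the United States entries for the linear and nonlinear models in Table A2. Entries for other countries may differ in the last digit because the published R² values are themselves rounded.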

The results exhibit clear cross-country heterogeneity in model adequacy. For the United States, the United Kingdom, and Canada, the nonlinear regression provides the highest adjusted R-squared value, indicating that moderate nonlinearity improves the structural representation of the data. For Japan, Germany, Italy, and France, the CPLR method achieves the largest adjusted R-squared value, suggesting that segmented structures more effectively capture regime-dependent variations.

Importantly, these improvements persist after accounting for parameter penalties. Even in cases with relatively larger numbers of segments (e.g., France), the CPLR method maintains substantially higher adjusted R-squared values than the simple linear model. This indicates that the gain in explanatory accuracy is not merely a consequence of overparameterization but reflects genuine structural information captured by the piecewise formulation.

We emphasize that the objective of this study is an accurate mathematical representation of the observed data rather than prediction or statistical inference. Accordingly, the reported diagnostics are intended solely to demonstrate that the proposed models achieve improved fit without relying on unwarranted complexity. The results show that the computational piecewise regression consistently provides a more detailed and faithful description of historical structures than simpler alternatives.

Table A1. AICc comparison of three regression specifications for the Phillips curves of the G7 economies.

Country	Linear Regression	Nonlinear Regression	CPLR Method
US	24.180	12.393 *	28.982
UK	32.263	23.489 *	39.81
CA	19.006	7.200 *	30.897
JP	−9.804 *	−5.237	13.266
GE	26.725 *	27.876	62.952
IT	42.444	44.442 *	73.777
FR	1.714	−8.336 *	23.663

Note: An asterisk (*) indicates the optimal model selected among three models for each country. The CPLR method identifies complex structures in Group 2 economies (Japan, Germany, Italy, France) that capture regime-dependent discontinuities, which may exceed the parsimony thresholds favored by AICc.

Table A2. Adjusted R-squared comparison of three regression specifications for the Phillips curves of the G7 economies.

Country	Linear Regression	Nonlinear Regression	CPLR Method
US	0.078	0.448 *	0.173
UK	0.027	0.360 *	0.309
CA	0.025	0.417 *	0.207
JP	0.423	0.424	0.711 *
GE	−0.012	0.092	0.236 *
IT	0.046	0.121	0.382 *
FR	0.059	0.405	0.500 *

Note: An asterisk (*) indicates the optimal model selected among three models for each country.

Appendix C. Displaying the Optimal Sample Size Where the R-Squared Values Change in the First Stage

Appendix C provides empirical verification of Proposition 1 using the country-specific data. To determine the appropriate sample size for each country-specific regression model, we adopt the root mean squared error as the model selection criterion in the second stage. For each candidate sample size k, the regression model is estimated, and the corresponding RMSE is computed. The optimal minimum sample size is determined by k* = arg min RMSE(k). To verify that the selected k* indeed corresponds to a local minimum rather than a numerical coincidence, we additionally report the RMSE values for its adjacent candidates, k* − 1 and k* + 1. A valid optimum should satisfy RMSE(k*) ≤ RMSE(k* − 1) and RMSE(k*) ≤ RMSE(k* + 1). Table A3 summarizes these comparisons. For boundary cases, where k* − 1 is not available, the value is not reported. For all countries, the RMSE at k* is smaller than or equal to those of the neighboring sample sizes, confirming that the chosen sample size corresponds to the minimum-error solution and is not sensitive to marginal changes in k.
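The local-minimum condition can be checked mechanically against the values in Table A3. The following is a minimal sketch; the tuple layout and the helper name `is_local_minimum` are illustrative assumptions.

```python
# (k*, RMSE(k* - 1), RMSE(k*), RMSE(k* + 1)) copied from Table A3;
# None marks boundary cases where k* - 1 is not available.
table_a3 = {
    "US": (14, 1.323, 1.322, 1.411),
    "UK": (8, 1.474, 1.336, 1.445),
    "CA": (7, 1.173, 1.162, 1.255),
    "JP": (5, None, 0.581, 0.696),
    "GE": (5, None, 1.263, 1.295),
    "IT": (6, 1.660, 1.496, 1.888),
    "FR": (5, None, 0.719, 0.803),
}


def is_local_minimum(below, at, above):
    """True if RMSE(k*) does not exceed either available neighbor."""
    return (below is None or at <= below) and at <= above


checks = {
    country: is_local_minimum(below, at, above)
    for country, (_, below, at, above) in table_a3.items()
}
print(checks)
```

Every country passes this check with the tabulated values. The check covers only the two adjacent candidates, so it verifies a local rather than a global minimum.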

Table A3 reports the optimal minimum sample size, k*, obtained by minimizing RMSE over the 32 observations. We then impose k* as a minimum-sample constraint; thus, each regression segment must contain at least k* observations. We formulate breakpoint detection as a constrained least-squares problem rather than an unconstrained local search. Candidate breakpoints are ordered by increasing values of the explanatory variable (the unemployment rate), and only segment lengths satisfying k ≥ k* are considered. Breakpoints that violate this minimum-sample constraint are excluded, even if they produce higher R² values.

Within this feasible set, we evaluate model fit by adding one observation at a time to the current segment. For each admissible segment size k (kin in Appendix A.1), we compute R²(k) and R²(k + 1), where k + 1 denotes the regression after including one additional data point. The comparison is one-sided because the search is restricted to k ≥ k*. We select the breakpoint when R²(k) > R²(k + 1), indicating that the fit deteriorates after adding one more observation. This rule also ensures that the chosen breakpoint minimizes the least-squares error within the admissible range. Candidates at k − 1 with R²(k − 1) > R²(k) are not considered when k = k*, because a segment of k − 1 observations would violate the minimum-sample constraint. Thus, breakpoint selection follows a constrained optimization rule rather than the symmetric local maximization of R².

Table A4 presents these one-sided R² comparisons, reporting R²(k − 1), R²(k), and R²(k + 1) together with the corresponding segment ranges and observation counts for the G7 advanced economies under the k* values of Table A3. Across these countries, the identified breakpoints consistently satisfy the inequality above, indicating that the segmentation adheres to the least-squares error-minimization principle and achieves the most accurate representation under the constraint of the fixed minimum sample size, k*. Consistent with the objective of this study, these calculations serve solely to verify fitting accuracy and structural adequacy, and are not intended for statistical inference or prediction.
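The breakpoint rule and the segment counts can be cross-checked against Table A4. The sketch below uses the rows for three countries as an example; the helper names are hypothetical, and the final segment of each country is stored with None because Table A4 reports no R² values for it.

```python
# (count k, R2(k), R2(k + 1)) per segment, copied from Table A4.
table_a4 = {
    "US": [(14, 0.246, 0.096), (18, None, None)],
    "UK": [(10, 0.441, 0.432), (8, 0.113, 0.022), (9, 0.614, 0.476), (5, None, None)],
    "FR": [(5, 0.838, 0.701), (5, 0.056, 0.055), (5, 0.070, 0.051),
           (5, 0.271, 0.103), (5, 0.705, 0.055), (7, None, None)],
}
N_OBS = 32  # annual observations, 1991 to 2022


def rule_holds(rows):
    """Every reported breakpoint has R2(k) above R2(k + 1)."""
    return all(r2_k > r2_next for _, r2_k, r2_next in rows if r2_k is not None)


def counts_cover_sample(rows, n=N_OBS):
    """Segment counts add up to the full sample."""
    return sum(count for count, _, _ in rows) == n


for country, rows in table_a4.items():
    print(country, rule_holds(rows), counts_cover_sample(rows))
```

For these rows both checks pass: R² falls when one more observation is added at each recorded breakpoint, and the segment counts sum to 32. Some of the gaps are small (for example 0.056 against 0.055 for France), which shows how sensitive a deterministic rule can be to the third decimal.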

Table A3. Local RMSE minimization verification for optimal sample size selection.

Country	Optimal k (k*)	RMSE(k* − 1)	RMSE(k*)	RMSE(k* + 1)
US	14	1.323	1.322	1.411
UK	8	1.474	1.336	1.445
CA	7	1.173	1.162	1.255
JP	5	-	0.581	0.696
GE	5	-	1.263	1.295
IT	6	1.660	1.496	1.888
FR	5	-	0.719	0.803

Table A4. The R-squared values around the breakpoint for the optimal sample size.

Country (k*)	Segment Range	Counts (k)	R²(k − 1)	R²(k)	R²(k + 1)
US (14)	3.698~5.366	14	0.141	0.246	0.096
US (14)	5.415~9.768	18	-	-	-
UK (8)	3.700~5.025	10	0.440	0.441	0.432
	5.075~5.975	8	0.314	0.113	0.022
	6.175~8.125	9	0.589	0.614	0.476
	8.600~10.400	5	-	-	-
CA (7)	5.300~6.467	7	0.598	0.577	0.480
	6.758~7.192	7	0.597	0.352	0.070
	7.217~8.450	10	0.281	0.506	0.335
	9.117~11.400	8	-	-	-
JP (5)	2.100~2.500	5	0.705	0.570	0.108
	2.600~2.892	5	0.908	0.593	0.467
	3.116~3.592	6	0.614	0.819	0.119
	3.841~4.717	10	0.450	0.556	0.498
	4.716~5.375	6	-	-	-
GE (5)	2.975~4.367	8	0.277	0.295	0.261
	4.708~6.567	6	0.657	0.841	0.222
	6.575~7.859	5	0.577	0.278	0.191
	8.008~9.450	8	0.026	0.153	0.027
	9.675~11.284	5	-	-	-
IT (6)	6.150~8.050	6	0.221	0.122	0.085
	8.075~8.542	6	0.522	0.572	0.185
	8.791~9.934	6	0.045	0.173	0.121
	10.050~11.184	7	0.094	0.259	0.117
	11.206~12.825	7	-	-	-
FR (5)	7.316~8.034	5	0.787	0.838	0.701
	8.433~8.850	5	0.129	0.056	0.055
	8.883~9.217	5	0.675	0.070	0.051
	9.275~9.759	5	0.455	0.271	0.103
	10.066~10.642	5	0.045	0.705	0.055
	11.316~12.400	7	-	-	-

References

1. Golden, R.M.; Henley, S.S.; White, H.; Kashner, T.M. Consequences of Model Misspecification for Maximum Likelihood Estimation with Missing Data. Econometrics 2019, 7, 37.
2. Phillips, A.W. The Relation between Unemployment and the Rate of Change of Money Wage Rates in the United Kingdom, 1861–1957. Economica 1958, 25, 283–299.
3. Samuelson, P.A.; Solow, R.M. Analytical Aspects of Anti-inflation Policy. Am. Econ. Rev. 1960, 50, 177–194.
4. Ball, L.; Mazumder, S. A Phillips Curve with Anchored Expectations and Short-term Unemployment. J. Money Credit Bank. 2019, 51, 111–137.
5. Eser, F.; Karadi, P.; Lane, P.R.; Moretti, L.; Osbat, C. The Phillips Curve at the ECB; European Central Bank Working Paper Series NO2044; European Central Bank: Frankfurt am Main, Germany, 2020.
6. Cristini, A.; Ferri, P. Nonlinear Models of the Phillips Curve. J. Evol. Econ. 2021, 31, 1129–1155.
7. Ho, S.Y.; Iyke, B.N. Unemployment and Inflation: Evidence of a Nonlinear Phillips Curve in the Eurozone; MPRA Paper No. 87122; University Library of Munich: Munich, Germany, 2018; Available online: https://ideas.repec.org/p/pra/mprapa/87122.html (accessed on 10 April 2025).
8. Gagnon, J.E.; Collins, C.G. Low Inflation Bends the Phillips Curve; Working Paper No. 19-6; Peterson Institute for International Economics: Washington, DC, USA, 2019.
9. Nalewaik, J. Non-Linear Phillips Curves with Inflation Regime-Switching; Working Paper No. 2016-78; FEDS: Washington, DC, USA, 2016.
10. Fendel, R.; Lis, E.M.; Rülke, J.C. Do Professional Forecasters Believe in the Phillips curve? Evidence from the G7 Countries. J. Forecast. 2011, 30, 268–287.
11. Popescu, C.C.; Diaconu, L. Inflation–Unemployment Dilemma. A Cross-Country Analysis. Sci. Ann. Econ. Bus. 2022, 69, 377–392.
12. Zhang, L. Modeling the Phillips Curve in China: A Nonlinear Perspective. Macroecon. Dyn. 2017, 21, 439–461.
13. Eliasson, A.C. Is the Short-Run Phillips Curve Nonlinear? Empirical Evidence for Australia, Sweden and the United States; Working Paper Series, No. 124; Sveriges Riksbank: Stockholm, Sweden, 2001.
14. Forbes, K.J.; Gagnon, J.E.; Collins, C.G. Low Inflation Bends the Phillips Curve Around the World; Working Paper No. 29323; NBER: New York, NY, USA, 2021.
15. Hall, T.E.; Hart, W.R. The Samuelson-Solow Phillips Curve and the Great Inflation. Hist. Econ. Rev. 2012, 55, 62–72.
16. Hoover, K.D. The Genesis of Samuelson and Solow’s Price-Inflation Phillips Curve; Working Paper No. 2014-10; CHOPE: Singapore, 2014.
17. Smith, S.C.; Timmermann, A.; Wright, J.H. Nonlinear Phillips Curves; Notes No. 2024-09-04-1; FEDS: Washington, DC, USA, 2024; Available online: https://ideas.repec.org/p/fip/fedgfn/2024-09-04-1.html (accessed on 17 September 2025).
18. Coulombe, P.G. A Neural Phillips Curve and a Deep Output Gap. J. Bus. Econ. Stat. 2025, 43, 669–683.
19. Coulombe, P.G. The Macroeconomy as a Random Forest. J. Appl. Econ. 2024, 39, 401–421.
20. McGee, V.E.; Carleton, W.T. Piecewise Regression. J. Am. Stat. Assoc. 1970, 65, 1109–1124.
21. Caporale, G.M.; Gil-Alana, L.A.; Poza, C. Inflation in the G7 Countries: Persistence and Structural Breaks. J. Econ. Financ. 2022, 46, 493–506.
22. Brodsky, B.E.; Darkhovsky, B.S. Applications of Nonparametric Change-point Detection Methods. In Nonparametric Methods in Change-Point Problems; Springer: Dordrecht, The Netherlands, 1993; pp. 169–182.
23. Gkiolekas, I.; Papageorgiou, L.G. Piecewise Regression Analysis through Information Criteria Using Mathematical Programming. Expert Syst. Appl. 2019, 121, 362–372.
24. Kim, T.; Lee, H. Improved Identification of Breakpoints in Piecewise Regression and Its Applications. arXiv 2024, arXiv:2408.13751.
25. Lu, K.P.; Chang, S.T. An Advanced Segmentation Approach to Piecewise Regression Models. Mathematics 2023, 11, 4959.
26. Al-zeaud, H.; Al-hosban, S. Does the Phillips Curve Really Exist? An Empirical Evidence from Jordan. Eur. Sci. J. 2015, 11, 253–275.
27. Hsiao, C.W.; Chan, Y.C.; Lee, M.Y.; Lu, H.P. Heteroscedasticity and Precise Estimation Model Approach for Complex Financial Time-Series Data: An Example of Taiwan Stock Index Futures before and during COVID-19. Mathematics 2021, 9, 2719.
28. OECD. Unemployment Rate (Indicator). Available online: https://www.oecd.org/en/data/indicators/unemployment-rate.html (accessed on 19 August 2023).

Figure 1. The scatter plots of the Phillips curve for the G7 advanced economies. The horizontal axis represents the unemployment rate (%), and the vertical axis represents the inflation rate (%). In the figure, the open circles represent the observed data points (original values).

Figure 2. The linear regression charts of the Phillips curve for G7. The horizontal axis represents the unemployment rate (%), and the vertical axis represents the inflation rate (%). In the figure, the open circles represent the observed data points (original values), and the straight line represents the regression line.

Figure 3. The nonlinear regression charts of the Phillips curve for the G7 economies. The horizontal axis represents the unemployment rate (%), and the vertical axis represents the inflation rate (%). In the figure, the open circles represent the observed data points (original values), and the black curve represents the nonlinear regression outcome.

Figure 4. The computational piecewise regression charts of the Phillips curve for G7. The horizontal axis represents the unemployment rate, and the vertical axis represents the inflation rate value. The figures are sorted from small to large according to the number of unemployment rate segments generated by the CPLR method. The AICc and R²(adj) comparisons are shown in Appendix B.

Table 1. Descriptive statistics of unemployment rates.

Country	Average	Median	Standard Deviation	Skew	Kurtosis	Spearman Correlation
US	5.907	5.528	1.669	0.823	2.883	−0.203
UK	6.309	5.575	1.873	0.606	2.259	0.406
CA	7.809	7.471	1.592	0.751	2.830	−0.340
JP	3.712	3.717	1.002	0.029	1.717	−0.733
GE	6.813	7.325	2.505	−0.071	1.728	−0.044
IT	9.640	9.642	1.731	−0.098	2.155	−0.312
FR	9.701	9.350	1.490	0.476	2.254	−0.183

Table 2. Descriptive statistics of inflation rates.

Country	Average	Median	Standard Deviation	Skew	Kurtosis
US	2.562	2.525	1.455	5.677	4.452
UK	2.488	2.255	1.607	5.772	5.248
CA	2.084	1.886	1.305	5.988	5.058
JP	0.418	0.098	1.082	0.708	3.989
GE	1.942	1.562	1.445	3.711	4.768
IT	2.483	2.067	1.902	1.487	4.098
FR	1.604	1.665	1.014	4.178	4.277

Table 3. The coefficients of the linear regression of the Phillips curve for G7.

Country	b₀	b₁	R²	MSE	RMSE
US	4.253	−0.286	0.108	1.951	1.397
UK	1.177	0.208	0.059	2.512	1.585
CA	3.607	−0.195	0.057	1.660	1.288
JP	3.082	−0.718 *	0.442	0.675	0.822
GE	2.502	−0.082	0.020	2.113	1.454
IT	5.411	−0.304	0.076	3.453	1.858
FR	3.576	−0.203	0.089	0.967	0.983

* Significant at the 5 percent level.
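
The entries of Tables 1 to 3 can be cross-checked with a standard property of ordinary least squares: a fitted line with an intercept passes through the point of sample means, so b₀ + b₁ × (mean unemployment) should reproduce the mean inflation rate up to rounding. The sketch below is only an illustrative consistency check on the rounded table values; the names `TABLE_VALUES`, `fitted_at_mean`, and `mean_point_gaps` are hypothetical and are not part of the authors' software.

```python
# Values transcribed from Tables 1-3 (all rounded to three decimals in the source).
# country: (mean unemployment, mean inflation, b0, b1)
TABLE_VALUES = {
    "US": (5.907, 2.562, 4.253, -0.286),
    "UK": (6.309, 2.488, 1.177, 0.208),
    "CA": (7.809, 2.084, 3.607, -0.195),
    "JP": (3.712, 0.418, 3.082, -0.718),
    "GE": (6.813, 1.942, 2.502, -0.082),
    "IT": (9.640, 2.483, 5.411, -0.304),
    "FR": (9.701, 1.604, 3.576, -0.203),
}


def fitted_at_mean(mean_u: float, b0: float, b1: float) -> float:
    """Evaluate the fitted line b0 + b1 * u at the mean unemployment rate."""
    return b0 + b1 * mean_u


def mean_point_gaps(values=TABLE_VALUES) -> dict:
    """Absolute gap between the fitted value at the mean and the mean inflation."""
    return {
        country: abs(fitted_at_mean(mean_u, b0, b1) - mean_pi)
        for country, (mean_u, mean_pi, b0, b1) in values.items()
    }


if __name__ == "__main__":
    for country, gap in mean_point_gaps().items():
        print(f"{country}: |fitted - mean inflation| = {gap:.4f}")
```

Because the inputs are rounded to three decimals, small nonzero gaps are expected; a gap of a few thousandths is consistent with rounding rather than with an error in the tables.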

Table 4. Estimated results of the nonlinear regression for the Phillips curves of the G7 economies.

Country	b₀	b₁	R²	MSE	RMSE	F
US	44.300	−41.369 *	0.466	1.168	1.081	26.157
UK	44.158	−41.301 *	0.381	1.652	1.285	18.443
CA	30.993	−28.581 *	0.435	0.993	0.997	23.139
JP	7.980	−5.292 *	0.443	0.673	0.821	23.854
GE	1.811	0.741	0.121	1.895	1.377	4.133
IT	−5.190	1.481 *	0.149	3.181	1.783	5.260
FR	20.486	−18.625 *	0.424	0.611	0.782	22.116

* Significant at the 5 percent level.

Table 5. Estimated results of the computational piecewise regression for the Phillips curves of the G7 economies.

Country (RMSE)	Segment Range	b₀	b₁	R²	MSE
US (1.323)	3.698~5.366	10.216	−1.67	0.246	2.575
	5.415~9.768	5.44	−0.425 *	0.255	1.13
UK (1.336)	3.700~5.025	15.684	−2.979 *	0.441	2.59
	5.075~5.975	−4.183	1.108	0.113	0.933
	6.175~8.125	−2.913	0.715 *	0.614	0.212
	8.600~10.400	16.293	−1.317	0.218	5.012
CA (1.162)	5.300~6.467	21.959	−3.191 *	0.577	1.668
	6.758~7.192	17.842	−2.304	0.352	2.343
	7.217~8.450	14.795	−1.648 *	0.506	0.491
	9.117~11.400	0.681	0.12	0.004	3.107
JP (0.581)	2.100~2.500	12.059	−4.542	0.570	0.648
	2.600~2.892	22.008	−7.675	0.593	0.635
	3.116~3.592	−19.291	6.052 *	0.819	0.311
	3.841~4.717	6.214	−1.425 *	0.556	0.174
	4.716~5.375	3.88	−0.895	0.177	0.236
GE (1.263)	2.975~4.367	11.097	−2.58	0.295	3.926
	4.708~6.567	−9.413	2.228 *	0.841	0.511
	6.575~7.859	−11.826	1.888	0.378	2.098
	8.008~9.450	6.821	−0.61	0.153	0.461
	9.675~11.284	1.367	0.018	0.001	0.144
IT (1.496)	6.150~8.050	4.781	−0.378	0.122	0.744
	8.075~8.542	99.283	−11.312	0.572	3.683
	8.791~9.934	21.806	−2.056	0.173	4.798
	10.050~11.184	−17.231	1.887	0.259	1.855
	11.206~12.825	11.206	−0.868	0.362	0.609
FR (0.717)	7.316~8.034	40.532	−4.939 *	0.838	0.714
	8.433~8.850	−2.888	0.53	0.056	0.177
	8.883~9.217	15.983	−1.594	0.07	0.907
	9.275~9.759	−19.925	2.289	0.271	0.648
	10.066~10.642	−38.431	3.798	0.705	0.344
	11.316~12.400	−7.774	0.762	0.204	0.382

* Significant at the 5 percent level. Note: The RMSE reported for each country is the value obtained at the optimal minimum sample size, k. Appendix C shows that the optimal minimum sample size yields the minimum RMSE and that the R² values change in the regime covering the breakpoint. The results can be reproduced with the MathAI numerical modeling software on GitHub (Version 1.1).
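
To make the reading of Table 5 concrete, the sketch below evaluates the two US segments as a lookup: find the segment whose unemployment range contains the input, then apply that segment's b₀ and b₁. It is an illustration only, not the authors' implementation. The names `US_SEGMENTS` and `predict_inflation` are hypothetical, and returning `None` for unemployment rates that fall between or outside the reported ranges is an assumption, since the table does not say how such values should be treated.

```python
from typing import Optional, Sequence, Tuple

# (lower bound, upper bound, b0, b1) for the two US segments in Table 5.
Segment = Tuple[float, float, float, float]

US_SEGMENTS: Sequence[Segment] = (
    (3.698, 5.366, 10.216, -1.67),
    (5.415, 9.768, 5.44, -0.425),
)


def predict_inflation(u: float, segments: Sequence[Segment] = US_SEGMENTS) -> Optional[float]:
    """Return b0 + b1 * u for the segment whose range contains u.

    Assumption: values outside every reported range (including the gap
    between adjacent segments) have no prediction and yield None.
    """
    for lower, upper, b0, b1 in segments:
        if lower <= u <= upper:
            return b0 + b1 * u
    return None


if __name__ == "__main__":
    for u in (4.0, 5.4, 6.0):
        print(u, predict_inflation(u))
```

With these coefficients, an unemployment rate of 4.0% falls in the first segment (10.216 − 1.67 × 4.0 ≈ 3.536) and 6.0% falls in the second (5.44 − 0.425 × 6.0 ≈ 2.89), while 5.4% lies in the gap between the two observed ranges.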

Table 6. MSE and RMSE of different models for the Phillips curves of the G7 economies.

Country	Linear MSE	Linear RMSE	Nonlinear MSE	Nonlinear RMSE	CPLR MSE	CPLR RMSE
US	1.951	1.397	1.168	1.081 *	1.749	1.323
UK	2.512	1.585	1.652	1.285 *	1.785	1.336
CA	1.660	1.288	0.993	0.997 *	1.351	1.162
JP	0.675	0.822	0.673	0.821	0.338	0.581 *
GE	2.113	1.454	1.895	1.377	1.595	1.263 *
IT	3.453	1.858	3.181	1.783	2.237	1.496 *
FR	0.967	0.983	0.611	0.782	0.514	0.717 *

Note: For each country, the smallest RMSE among the three models is marked with an asterisk (*).
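
The comparison in Table 6 rests on two simple operations: RMSE is the square root of MSE, and the preferred model for a country is the one with the smallest RMSE. The sketch below repeats both steps on the rounded table values as a sanity check. The names `TABLE6`, `rmse_matches_mse`, and `best_model` are hypothetical, and the tolerance of 0.001 is an assumption chosen to absorb three-decimal rounding.

```python
import math

# country: {model: (MSE, RMSE)} as reported in Table 6.
TABLE6 = {
    "US": {"Linear": (1.951, 1.397), "Nonlinear": (1.168, 1.081), "CPLR": (1.749, 1.323)},
    "UK": {"Linear": (2.512, 1.585), "Nonlinear": (1.652, 1.285), "CPLR": (1.785, 1.336)},
    "CA": {"Linear": (1.660, 1.288), "Nonlinear": (0.993, 0.997), "CPLR": (1.351, 1.162)},
    "JP": {"Linear": (0.675, 0.822), "Nonlinear": (0.673, 0.821), "CPLR": (0.338, 0.581)},
    "GE": {"Linear": (2.113, 1.454), "Nonlinear": (1.895, 1.377), "CPLR": (1.595, 1.263)},
    "IT": {"Linear": (3.453, 1.858), "Nonlinear": (3.181, 1.783), "CPLR": (2.237, 1.496)},
    "FR": {"Linear": (0.967, 0.983), "Nonlinear": (0.611, 0.782), "CPLR": (0.514, 0.717)},
}


def rmse_matches_mse(table=TABLE6, tol: float = 1e-3) -> bool:
    """Check that every reported RMSE equals sqrt(MSE) up to rounding."""
    return all(
        abs(math.sqrt(mse) - rmse) < tol
        for models in table.values()
        for mse, rmse in models.values()
    )


def best_model(table=TABLE6) -> dict:
    """Return, per country, the model with the smallest reported RMSE."""
    return {
        country: min(models, key=lambda name: models[name][1])
        for country, models in table.items()
    }


if __name__ == "__main__":
    print("RMSE consistent with MSE:", rmse_matches_mse())
    for country, model in best_model().items():
        print(country, model)
```

On these values the nonlinear model has the smallest RMSE for the US, UK, and CA, and the CPLR method has the smallest RMSE for JP, GE, IT, and FR, which matches the asterisks in the table.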

