# A New Semiparametric Regression Framework for Analyzing Non-Linear Data

## Abstract

## 1. Introduction

## 2. Materials and Methods

#### 2.1. The Semiparametric Non-Linear Regression Model

#### 2.2. Bayesian Inference

`R` environment [42]. The executable scripts can be made available by the authors upon justified request.

#### 2.3. Model Comparison

## 3. Non-Linear Data Analysis

#### 3.1. Artificial Data

#### 3.2. COVID-19 Count Data

#### 3.3. Tuberculosis Count Data

## 4. Concluding Remarks

## Supplementary Materials

## Author Contributions

## Funding

## Institutional Review Board Statement

## Informed Consent Statement

## Data Availability Statement

## Acknowledgments

## Conflicts of Interest

## Abbreviations

| Abbreviation | Definition |
|---|---|
| AR | Autoregressive |
| CI | Credible Interval |
| GAM | Generalized Additive Models |
| GP | Gaussian Process |
| MA | Moving Average |
| MCMC | Markov Chain Monte Carlo |
| MwG | Metropolis-within-Gibbs |
| RVM | Relevance Vector Machine |

## References

1. Bates, D.M.; Watts, D.G. Nonlinear Regression Analysis and Its Applications, 2nd ed.; John Wiley & Sons: New York, NY, USA, 2007.
2. Pinheiro, J.C.; Bates, D.M. Mixed-Effects Models in S and S-Plus; Springer: New York, NY, USA, 2000.
3. Eubank, R.L. Spline Smoothing and Nonparametric Regression; Marcel Dekker: New York, NY, USA, 1988.
4. Green, P.J.; Silverman, B.W. Nonparametric Regression and Generalized Linear Models: A Roughness Penalty Approach; Chapman & Hall: London, UK, 1994.
5. Gu, C. Smoothing Spline ANOVA Models, 2nd ed.; Springer: New York, NY, USA, 2013.
6. Hastie, T.; Tibshirani, R. Generalized Additive Models; Chapman & Hall: London, UK, 1990.
7. Hastie, T.; Tibshirani, R. Varying-coefficient models. J. R. Stat. Soc. Ser. B **1993**, 55, 757–796.
8. Archontoulis, S.V.; Miguez, F.E. Nonlinear regression models and applications in agricultural research. Agron. J. **2015**, 107, 786–798.
9. Martino, L.; Read, J. A joint introduction to Gaussian processes and relevance vector machines with connections to Kalman filtering and other kernel smoothers. Inf. Fusion **2021**, 74, 17–38.
10. Candela, J.Q. Learning with Uncertainty-Gaussian Processes and Relevance Vector Machines; Technical University of Denmark: Copenhagen, Denmark, 2004; pp. 1–152.
11. Dixon, B.L.; Sonka, S.T. A note on the use of exponential functions for estimating farm size distributions. Am. J. Agric. Econ. **1979**, 61, 554–557.
12. Shimojo, M.; Nakano, Y. An investigation into relationships between exponential functions and some natural phenomena. J. Fac. Agric. Kyushu Univ. **2013**, 58, 51–53.
13. Gompertz, B. On the nature of the function expressive of the law of human mortality, and on a new mode of determining the value of life contingencies. Philos. Trans. R. Soc. B **1825**, 115, 513–585.
14. Verhulst, P.F. A note on population growth. Corresp. Math. Phys. **1838**, 10, 113–121.
15. Weibull, W. A statistical distribution function of wide applicability. J. Appl. Math. **1951**, 18, 293–297.
16. Richards, F.J. A flexible growth function for empirical use. J. Exp. Bot. **1959**, 10, 290–300.
17. Yin, X.; Goudriaan, J.; Lantinga, E.A.; Vos, J.; Spiertz, J.H.J. A flexible sigmoid function of determinate growth. Ann. Bot. **2003**, 91, 361–371.
18. Blackman, F.F. Optima and limiting factors. Ann. Bot. **1905**, 19, 281–295.
19. Sinclair, T.R.; Horie, T. Leaf nitrogen, photosynthesis, and crop radiation use efficiency: A review. Crop Sci. **1989**, 29, 90–98.
20. van’t Hoff, J.H. Lectures on Theoretical and Physical Chemistry. Part 1: Chemical Dynamics; Edward Arnold: London, UK, 1898.
21. Arrhenius, S. Über die Reaktionsgeschwindigkeit bei der Inversion von Rohrzucker durch Säuren. Z. Für Phys. Chem. **1889**, 4, 226–248.
22. Ratkowsky, D.A.; Olley, J.; McMeekin, T.A.; Ball, A. Relationship between temperature and growth rate of bacterial cultures. J. Bacteriol. **1982**, 149, 1–5.
23. Lloyd, J.; Taylor, J.A. On the temperature dependence of soil respiration. Funct. Ecol. **1994**, 8, 315–323.
24. Yin, X.; Kroff, M.J.; McLean, G.; Visperas, R.M. A nonlinear model for crop development as a function of temperature. Agric. For. Meteorol. **1995**, 77, 1–16.
25. Hu, Y.; Tao, V.; Croitoru, A. Understanding the rational function model: Methods and applications. Int. Arch. Photogramm. Remote Sens. **2004**, 20, 119–124.
26. Braverman, E.; Kinzebulatov, D. On linear perturbations of the Ricker model. Math. Biosci. **2006**, 202, 323–339.
27. Nijland, G.O.; Schouls, J.; Goudriaan, J. Integrating the production functions of Liebig, Michaelis-Menten, Mitscherlich and Liebscher into one system dynamics model. NJAS-Wagening. J. Life Sci. **2008**, 55, 199–224.
28. Ye, Z.; Zhao, Z. A modified rectangular hyperbola to describe the light-response curve of photosynthesis of Bidens pilosa L. grown under low and high light conditions. Front. Agric. China **2010**, 4, 50–55.
29. Bernardo, J.M.; Smith, A.F.M. Bayesian Theory; John Wiley & Sons: New York, NY, USA, 1994.
30. Gelfand, A.E.; Smith, A.F.M. Sampling based approaches to calculating marginal densities. J. Am. Stat. Assoc. **1990**, 85, 398–409.
31. Casella, G.; George, E.I. Explaining the Gibbs sampler. Am. Stat. **1992**, 46, 167–174.
32. Chib, S.; Greenberg, E. Understanding the Metropolis-Hastings algorithm. Am. Stat. **1995**, 49, 327–335.
33. Gilks, W.R.; Best, N.G.; Tan, K.K. Adaptive rejection Metropolis sampling within Gibbs sampling. J. R. Stat. Soc. Ser. C (Appl. Stat.) **1995**, 44, 455–472.
34. Seber, G.A.F.; Lee, A.J. Linear Regression Analysis, 2nd ed.; John Wiley & Sons: New York, NY, USA, 2003.
35. Ratkowsky, D.A. Nonlinear Regression Modelling: A Unified Practical Approach; Marcel Dekker: New York, NY, USA, 1983.
36. Seber, G.A.F.; Wild, C.J. Nonlinear Regression; John Wiley & Sons: New York, NY, USA, 1989.
37. Koop, G.; Poirier, D.J. Bayesian variants of some classical semiparametric regression techniques. J. Econom. **2004**, 123, 259–282.
38. Munkin, M.; Trivedi, P. Bayesian analysis of the ordered probit model with endogenous selection. J. Econom. **2008**, 143, 334–348.
39. Feng, L.; Munkin, M. Bayesian semiparametric analysis on the relationship between BMI and income for rural and urban workers in China. J. Appl. Stat. **2021**.
40. Heidelberger, P.; Welch, P.D. Simulation run length control in the presence of an initial transient. Oper. Res. **1983**, 31, 1109–1144.
41. Geweke, J. Evaluating the accuracy of sampling-based approaches to the calculation of posterior moments. J. R. Stat. Soc. **1994**, 56, 501–514.
42. R Development Core Team. R: A Language and Environment for Statistical Computing; R Foundation for Statistical Computing: Vienna, Austria, 2020.
43. Carlin, B.P.; Louis, T.A. Bayes and Empirical Bayes Methods for Data Analysis; Chapman & Hall: Boca Raton, FL, USA, 2001.
44. Brooks, S.P. Discussion on the paper by Spiegelhalter, Best, Carlin, and van der Linde. J. R. Stat. Soc. Ser. B (Stat. Methodol.) **2002**, 64, 616–639.

**Figure 2.** Fitted means for the daily number of COVID-19 cases (**left panel**) and deaths (**right panel**).

| Group | Models | Published Works |
|---|---|---|
| I | Exponential Functions | [11,12] |
| II | Sigmoids (e.g., Logistic Function) | [13,14,15,16,17] |
| III | Asymptotic Exponential | [18,19] |
| | Modified Logistic | |
| | Photosynthesis | |
| IV | Modified Arrhenius | [20,21,22,23] |
| | Temperature Dependencies | |
| | van’t Hoff ($Q_{10}$ Function) | |
| V | Bell-shaped Curves | [24] |
| | Gaussian Function | |
| VI | Michaelis–Menten | [8,25,26,27,28] |
| | Modified Hyperbola | |
| | Power Functions | |
| | Rational Functions | |
| | Ricker Curve | |

| y | x | z | y | x | z |
|---|---|---|---|---|---|
| 12 | 1 | 10.0375 | 31 | 12 | 8.9171 |
| 14 | 2 | 11.4128 | 29 | 13 | 10.0933 |
| 15 | 3 | 9.8035 | 27 | 14 | 11.9097 |
| 18 | 4 | 9.9774 | 25 | 15 | 11.0709 |
| 20 | 5 | 9.0706 | 20 | 16 | 10.3041 |
| 21 | 6 | 10.8220 | 19 | 17 | 9.4895 |
| 22 | 7 | 9.6170 | 18 | 18 | 9.1792 |
| 25 | 8 | 9.1354 | 16 | 19 | 9.5295 |
| 28 | 9 | 10.0180 | 15 | 20 | 9.2414 |
| 30 | 10 | 10.1596 | 10 | 21 | 10.3354 |
| 34 | 11 | 10.3520 | - | - | - |

| Model | Parameter | Mean | Std. Dev. | 95% CI Lower | 95% CI Upper |
|---|---|---|---|---|---|
| (6) | $\alpha_0$ | 1.3780 | 2.7771 | −3.7411 | 7.0241 |
| | $\alpha_1$ | 3.6490 | 0.4037 | 2.8950 | 4.4070 |
| | $\alpha_2$ | −0.1659 | 0.0181 | −0.1990 | −0.1297 |
| | $\beta$ | 0.5975 | 0.3185 | −0.0185 | 1.2420 |
| | $\zeta$ | 0.1697 | 0.0592 | 0.0756 | 0.3008 |
| (7) | $\alpha_1$ | 1.0000 | 0.0029 | 0.9944 | 1.0061 |
| | $\alpha_2$ | 0.9990 | 0.0121 | 0.9749 | 1.0230 |
| | $\alpha_3$ | 0.0012 | 0.0257 | −0.0468 | 0.0518 |
| | $\beta$ | −0.0002 | 0.0064 | −0.0131 | 0.0122 |
| | $\zeta$ | 85.2600 | 29.6000 | 39.1100 | 152.5000 |
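As a reading aid for the Mean, Std. Dev., and 95% CI columns, the following minimal Python sketch shows one common way such summaries are obtained from MCMC draws of a single parameter. It assumes equal-tailed credible intervals (the 2.5% and 97.5% sample quantiles); the source does not state which interval type was used, and the authors' own scripts are in `R`. The function names `posterior_summary` and `excludes_zero` and the toy draws are hypothetical.

```python
import statistics


def posterior_summary(draws):
    """Summarize MCMC draws of one parameter.

    Assumption: an equal-tailed 95% credible interval, taken as the
    2.5% and 97.5% sample quantiles of the draws.
    """
    # n=40 yields cut points every 2.5%; the first and last are the bounds.
    cuts = statistics.quantiles(draws, n=40, method="inclusive")
    return {
        "mean": statistics.fmean(draws),
        "sd": statistics.stdev(draws),
        "lower": cuts[0],
        "upper": cuts[-1],
    }


def excludes_zero(lower, upper):
    """True when the interval lies entirely on one side of zero."""
    return lower > 0 or upper < 0


# Toy draws (placeholder values, not output from the fitted models).
toy_draws = [float(v) for v in range(1, 1001)]
print(posterior_summary(toy_draws))

# Intervals copied from the table for model (6).
print(excludes_zero(-0.0185, 1.2420))   # beta: interval contains zero
print(excludes_zero(-0.1990, -0.1297))  # alpha_2: interval excludes zero
```

Applied to the reported intervals, the check flags $\alpha_2$ in model (6) as having an interval that excludes zero, while the interval for $\beta$ contains zero.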

| Model | k | $p_D$ | DIC | EAIC | EBIC |
|---|---|---|---|---|---|
| (6) | 5 | 4.826 | 103.986 | 109.160 | 114.382 |
| (7) | 5 | 5.022 | 59.858 | 69.413 | 79.857 |
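The criteria in this table can be related to one another through the posterior mean deviance $\bar{D}$. The sketch below is in Python rather than the authors' `R`, and it assumes the usual definitions DIC $= \bar{D} + p_D$, EAIC $= \bar{D} + 2k$, and EBIC $= \bar{D} + k \log n$; these are not spelled out in the text available here, so treat them as an assumption. Under that assumption, the row for model (6) with $n = 21$ observations (the size of the artificial data set) is reproduced to within rounding. The function name `information_criteria` is hypothetical.

```python
import math


def information_criteria(dbar, p_d, k, n):
    """Return DIC, EAIC and EBIC from the posterior mean deviance.

    Assumed definitions (not confirmed by the source text):
      DIC  = Dbar + p_D
      EAIC = Dbar + 2 k
      EBIC = Dbar + k log(n)
    """
    return {
        "DIC": dbar + p_d,
        "EAIC": dbar + 2 * k,
        "EBIC": dbar + k * math.log(n),
    }


# Model (6): recover Dbar from the reported DIC and p_D.
dbar_model6 = 103.986 - 4.826
criteria = information_criteria(dbar_model6, p_d=4.826, k=5, n=21)
for name, value in criteria.items():
    print(f"{name}: {value:.3f}")
```

Smaller values indicate a better trade-off between fit and complexity, which is why the table favors model (7) over model (6) on all three criteria.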

| Count | Model | Parameter | Mean | Std. Dev. | 95% CI Lower | 95% CI Upper |
|---|---|---|---|---|---|---|
| Cases | (8) | $\beta_0$ | 0.0371 | 1.0080 | −1.8620 | 1.9800 |
| | | $\beta_1$ | 0.8959 | 0.7839 | −0.6451 | 2.5610 |
| | | $\beta_2$ | 1.3070 | 0.0540 | 1.2100 | 1.4230 |
| | | $\beta_3$ | −0.3089 | 0.0545 | −0.4249 | −0.2088 |
| | | $\zeta$ | <0.0001 | <0.0001 | <0.0001 | <0.0001 |
| | (9) | $\alpha_1$ | 1.0060 | 0.0021 | 1.0021 | 1.0100 |
| | | $\alpha_2$ | −0.1662 | 0.0222 | −0.2113 | −0.1265 |
| | | $\alpha_3$ | −5.3710 | 0.5814 | −6.5710 | −4.3110 |
| | | $\zeta$ | <0.0001 | <0.0001 | <0.0001 | <0.0001 |
| Deaths | (8) | $\beta_0$ | −0.0296 | 0.9695 | −1.9500 | 1.8390 |
| | | $\beta_1$ | 0.0161 | 0.0185 | −0.0214 | 0.0556 |
| | | $\beta_2$ | 1.2930 | 0.0550 | 1.1940 | 1.4080 |
| | | $\beta_3$ | −0.2908 | 0.0558 | −0.4049 | −0.1893 |
| | | $\zeta$ | 0.0009 | <0.0001 | <0.0001 | <0.0001 |
| | (9) | $\alpha_1$ | 1.0101 | 0.0018 | 1.0060 | 1.0130 |
| | | $\alpha_2$ | −0.1681 | 0.0221 | −0.2136 | −0.1287 |
| | | $\alpha_3$ | −5.4050 | 0.5801 | −6.5920 | −4.3460 |
| | | $\zeta$ | 0.0011 | <0.0001 | <0.0001 | <0.0001 |

| Count | Model | k | $p_D$ | DIC | EAIC | EBIC |
|---|---|---|---|---|---|---|
| Cases | (8) | 5 | 3.084 | 6237.000 | 6243.637 | 6263.040 |
| | (9) | 4 | 3.186 | 6194.920 | 6193.722 | 6197.602 |
| Deaths | (8) | 5 | 4.174 | 4104.000 | 4109.994 | 4129.397 |
| | (9) | 4 | 3.945 | 4057.920 | 4060.384 | 4063.967 |

| Model | Parameter | Mean | Std. Dev. | 95% CI Lower | 95% CI Upper |
|---|---|---|---|---|---|
| (10) | $\alpha_0$ | 0.7285 | 0.7262 | −0.5683 | 1.9650 |
| | $\alpha_1$ | 0.0007 | 0.0008 | <−0.0001 | 0.0023 |
| | $\alpha_2$ | <−0.0001 | <0.0001 | <−0.0001 | <−0.0001 |
| | $\alpha_3$ | <0.0001 | <0.0001 | <0.0001 | <0.0001 |
| | $\beta$ | 0.0041 | 0.0004 | 0.0035 | 0.0047 |
| | $\zeta$ | 208.1000 | 19.8200 | 172.4000 | 249.7000 |
| (11) | $\alpha_1$ | 0.9998 | 0.0009 | 0.9982 | 1.0000 |
| | $\alpha_2$ | −0.1817 | 0.0197 | −0.2273 | −0.1478 |
| | $\alpha_3$ | −6.2030 | 0.5525 | −7.3740 | −5.1280 |
| | $\beta$ | 0.0002 | 0.0007 | −0.0012 | 0.0015 |
| | $\zeta$ | 340.1000 | 34.2600 | 275.5000 | 411.0000 |

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Bertoli, W.; Oliveira, R.P.; Achcar, J.A.
A New Semiparametric Regression Framework for Analyzing Non-Linear Data. *Analytics* **2022**, *1*, 15-26.
https://doi.org/10.3390/analytics1010002

**Chicago/Turabian Style**

Bertoli, Wesley, Ricardo P. Oliveira, and Jorge A. Achcar.
2022. "A New Semiparametric Regression Framework for Analyzing Non-Linear Data" *Analytics* 1, no. 1: 15-26.
https://doi.org/10.3390/analytics1010002