# Generalized Partially Functional Linear Model with Unknown Link Function

## Abstract

## 1. Introduction

## 2. Preliminaries

## 3. Model and Estimation

#### 3.1. Abbreviation Introduction

#### 3.2. Model

#### 3.3. Estimation

**Step 1** Assume that the link function $g(\cdot)$ is known, and obtain the initial estimate ${\theta}^{\left(0\right)}$ of ${\theta}_{0}$ by solving Equation (5). The link function $g(\cdot)$ is required to be twice continuously differentiable so that the Hessian matrix exists; moreover, the variance function ${\sigma}^{2}(\cdot)$ is defined on the range of the link function and is strictly positive.

**Step 2** By local linear regression, obtain the estimates ${g}^{\left(0\right)}$ and ${g}^{\prime \left(0\right)}$ of the link function $g$ and its derivative ${g}^{\prime}$.

**Step 3** Using the method of **Step 1**, replace the link function by the estimated link functions ${\tilde{g}}^{\left(\alpha \right)}$ and ${\tilde{g}}^{\prime \left(\alpha \right)}$, where $\alpha =0,1,2,\dots $. Solving the estimating Equation (5) for $\theta $ then yields the updated estimate ${\tilde{\theta}}^{\left(\alpha \right)}$.

**Step 4** Using the method of **Step 2**, replace the parameter vector by the estimate ${\tilde{\theta}}^{\left(\alpha \right)}={({\tilde{\chi}}_{j1}^{\left(\alpha \right)},{\tilde{\chi}}_{j2}^{\left(\alpha \right)},\cdots ,{\tilde{\chi}}_{jm}^{\left(\alpha \right)},{\tilde{\gamma}}_{1}^{\left(\alpha \right)},{\tilde{\gamma}}_{2}^{\left(\alpha \right)},\cdots ,{\tilde{\gamma}}_{q}^{\left(\alpha \right)})}^{T}$, where $\alpha =1,2,3,\dots $. This yields the estimates ${\tilde{g}}^{\left(\alpha \right)}$ and ${\tilde{g}}^{\prime \left(\alpha \right)}$ of $g$ and ${g}^{\prime}$.

**Step 5** Repeat the above steps until $\left|{\tilde{\theta}}^{(\alpha +1)}-{\tilde{\theta}}^{\left(\alpha \right)}\right|$ converges, then stop the iteration.

**Step 6** The final estimate of the regression coefficient $\theta $ is denoted by $\widehat{\theta}$, and the final estimate of the link function $g$ is denoted by $\widehat{g}$.
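The alternating scheme in Steps 1 to 6 can be illustrated with a small Python sketch. It is not the authors' implementation and rests on several assumptions: Equation (5) is not reproduced in this section, so the $\theta$-update is replaced by a Gauss-Newton step for a least-squares criterion with constant variance; Step 1 uses an identity working link as a stand-in for the known link; $\theta$ is two-dimensional and normalized to unit length for identifiability; and the helper names (`local_linear`, `update_theta`, `fit`, `simulate`), the Epanechnikov kernel, and the bandwidth `h=0.5` are hypothetical choices.

```python
import math
import random


def local_linear(eta0, etas, ys, h):
    """Local linear estimates of (g(eta0), g'(eta0)); Epanechnikov kernel (assumed)."""
    s0 = s1 = s2 = t0 = t1 = 0.0
    for e, y in zip(etas, ys):
        d = e - eta0
        u = d / h
        if abs(u) >= 1.0:
            continue  # outside the kernel support
        w = 0.75 * (1.0 - u * u)
        s0 += w
        s1 += w * d
        s2 += w * d * d
        t0 += w * y
        t1 += w * d * y
    det = s0 * s2 - s1 * s1
    if det <= 1e-12:  # too few neighbours: fall back to a local constant fit
        return (t0 / s0 if s0 > 0.0 else 0.0), 0.0
    return (s2 * t0 - s1 * t1) / det, (s0 * t1 - s1 * t0) / det


def normalize(theta):
    """Scale theta to unit length with a non-negative first component (assumed)."""
    norm = math.hypot(theta[0], theta[1])
    if norm == 0.0:
        return [1.0, 0.0]
    sign = 1.0 if theta[0] >= 0.0 else -1.0
    return [sign * theta[0] / norm, sign * theta[1] / norm]


def solve2(a11, a12, a22, b1, b2):
    """Solve the symmetric 2x2 system [[a11, a12], [a12, a22]] x = [b1, b2]."""
    det = a11 * a22 - a12 * a12
    return [(a22 * b1 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det]


def update_theta(theta, U, ys, h, ridge=1e-8):
    """Plug in local linear g and g', then take one Gauss-Newton step for theta."""
    etas = [theta[0] * u[0] + theta[1] * u[1] for u in U]
    a11, a12, a22, b1, b2 = ridge, 0.0, ridge, 0.0, 0.0
    for u, y, eta in zip(U, ys, etas):
        g, dg = local_linear(eta, etas, ys, h)
        a11 += dg * dg * u[0] * u[0]
        a12 += dg * dg * u[0] * u[1]
        a22 += dg * dg * u[1] * u[1]
        b1 += dg * (y - g) * u[0]
        b2 += dg * (y - g) * u[1]
    step = solve2(a11, a12, a22, b1, b2)
    return normalize([theta[0] + step[0], theta[1] + step[1]])


def fit(U, ys, h=0.5, tol=1e-4, max_iter=30):
    """Alternate the theta and link updates until theta stabilizes (Step 5)."""
    # Step 1 stand-in: least squares under an identity working link.
    a11 = sum(u[0] * u[0] for u in U) + 1e-8
    a12 = sum(u[0] * u[1] for u in U)
    a22 = sum(u[1] * u[1] for u in U) + 1e-8
    b1 = sum(u[0] * y for u, y in zip(U, ys))
    b2 = sum(u[1] * y for u, y in zip(U, ys))
    theta = normalize(solve2(a11, a12, a22, b1, b2))
    for _ in range(max_iter):
        new = update_theta(theta, U, ys, h)
        change = abs(new[0] - theta[0]) + abs(new[1] - theta[1])
        theta = new
        if change < tol:
            break
    return theta


def simulate(n=200, seed=0):
    """Toy data: y = logistic(2 * theta0^T u) + noise, with theta0 = (0.8, 0.6)."""
    rng = random.Random(seed)
    U = [[rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)] for _ in range(n)]
    ys = [1.0 / (1.0 + math.exp(-2.0 * (0.8 * u[0] + 0.6 * u[1])))
          + rng.gauss(0.0, 0.1) for u in U]
    return U, ys


if __name__ == "__main__":
    U, ys = simulate()
    print("estimated direction:", fit(U, ys))
```

In this toy setting the covariate vectors `U` play the role of the combined functional principal component scores and scalar covariates. The link is re-estimated inside every call to `update_theta`, so each pass through the loop corresponds to one round of Steps 3 and 4.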

## 4. Asymptotic Properties

- (C1) There exists $b=\max(4,c)$ for a constant $c>0$ such that $E\left[{\int}_{T}{\|{X}_{j}\left(t\right)\|}^{b}dt\right]<\infty$ for $j=1,\cdots ,d$, $E\left[{\|Z\|}^{b}\right]<\infty$, and $E\left[\epsilon \right]<\infty$.
- (C2) The density function $f(\cdot)$ of ${\eta}_{i}$ is strictly positive and satisfies the first-order Lipschitz condition as $\theta \to {\theta}_{0}$.
- (C3) The kernel function $k(\cdot)$ is a bounded, continuous, symmetric probability density function that satisfies the first-order Lipschitz condition, with ${\int}_{-\infty}^{\infty}{u}^{2}k\left(u\right)du\ne 0$ and ${\int}_{-\infty}^{\infty}{\left|u\right|}^{2}k\left(u\right)du<\infty$.
- (C4) $n{h}^{4}/{\log}^{2}n\to \infty$ and $n{h}^{5}=O\left(1\right)$, where $h$ is the bandwidth of the kernel function.
- (C5) For $j=1,\cdots ,d$, ${m}_{j}{n}^{-1/4}\to 0$ as $n\to \infty $.
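Conditions (C3) and (C4) can be checked numerically for a concrete choice of kernel and bandwidth. The sketch below assumes the Epanechnikov kernel and the bandwidth rule $h=c\,n^{-1/5}$; both are illustrative choices rather than ones stated in this section, and the function names are hypothetical.

```python
import math


def epanechnikov(u):
    """Epanechnikov kernel: a bounded, symmetric density supported on [-1, 1]."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0


def integrate(f, a, b, steps=20000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    width = (b - a) / steps
    return width * sum(f(a + (i + 0.5) * width) for i in range(steps))


def bandwidth_rates(n, c=1.0):
    """Return (n*h^5, n*h^4 / log(n)^2) for the assumed choice h = c * n**(-1/5)."""
    h = c * n ** (-0.2)
    return n * h ** 5, n * h ** 4 / math.log(n) ** 2


# (C3): total mass and second moment of the kernel.
mass = integrate(epanechnikov, -1.0, 1.0)
second_moment = integrate(lambda u: u * u * epanechnikov(u), -1.0, 1.0)

if __name__ == "__main__":
    print("mass:", mass, "second moment:", second_moment)
    # (C4): n*h^5 stays bounded while n*h^4 / log^2(n) eventually grows.
    for n in (10**3, 10**6, 10**12, 10**24):
        print(n, bandwidth_rates(n))
```

With $h=c\,n^{-1/5}$, the first quantity equals $c^{5}$ for every $n$, and the second equals $c^{4}n^{1/5}/\log^{2}n$, which diverges but only slowly, so the growth becomes visible only for very large $n$.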

**Remark 1.**

#### 4.1. Asymptotic Convergence of ${g}^{\left(\alpha \right)}$

**Lemma 1.**

**Proof.**

**Theorem 1.**

**Proof.**

**Corollary 1.**

#### 4.2. Asymptotic Convergence of $\widehat{\theta}$

**Lemma 2.**

**Proof.**

**Lemma 3.**

**Proof.**

**Theorem 2.**

**Proof.**

#### 4.3. Asymptotic Convergence of $\widehat{g}$

**Theorem 3.**

**Proof.**

**Corollary 2.**

**Remark 2.**

## 5. Simulation

## 6. Application

#### 6.1. Data Description

#### 6.2. Data Analysis

#### 6.3. Results Analysis

## 7. Conclusions

## Author Contributions

## Funding

## Data Availability Statement

## Acknowledgments

## Conflicts of Interest

## References

- Ramsay, J.O. When the data are functions. Psychometrika **1982**, 47, 379–396.
- Ramsay, J.O.; Silverman, B.W. Functional Data Analysis, 2nd ed.; Springer: New York, NY, USA, 2005.
- Horváth, L.; Kokoszka, P. Inference for Functional Data with Applications; Springer: New York, NY, USA, 2012.
- Shin, H. Partial functional linear regression. J. Stat. Plan. Inference **2009**, 139, 3405–3418.
- Shin, H.; Lee, M.H. On prediction rate in partial functional linear regression. J. Multivar. Anal. **2012**, 103, 93–106.
- James, G.M. Generalized linear models with functional predictors. J. R. Stat. Soc. Ser. B **2002**, 64, 411–432.
- Müller, H.G.; Stadtmüller, U. Generalized functional linear models. Ann. Stat. **2005**, 33, 774–805.
- Shang, Z.F.; Cheng, G. Nonparametric inference in generalized functional linear models. Ann. Stat. **2015**, 43, 1742–1773.
- Wong, R.K.W.; Li, Y.; Zhu, Z.Y. Partially Linear Functional Additive Models for Multivariate Functional Data. J. Am. Stat. Assoc. **2019**, 114, 406–418.
- Scallan, A.; Gilchrist, R.; Green, M. Fitting Parametric Link Functions in Generalized Linear Models. Comput. Stat. Data Anal. **1984**, 2, 37–49.
- Weisberg, S.; Welsh, A.H. Adapting for the missing link. Ann. Stat. **1994**, 22, 1674–1700.
- Chiou, J.M.; Müller, H.G. Quasi-likelihood regression with unknown link and variance functions. J. Am. Stat. Assoc. **1998**, 93, 1376–1387.
- Chiou, J.M.; Müller, H.G. Estimated estimating equations: Semiparametric inference for clustered and longitudinal data. J. R. Stat. Soc. Ser. B **2005**, 67, 531–553.
- Bai, Y.; Fung, W.K.; Zhu, Z.Y. Penalized quadratic inference functions for single-index models with longitudinal data. J. Multivar. Anal. **2009**, 100, 152–161.
- Pang, Z.; Xue, L. Estimation for the single-index models with random effects. Comput. Stat. Data Anal. **2012**, 56, 1837–1853.
- Yuan, M.; Diao, G. Sieve maximum likelihood estimation in generalized linear models with an unknown link function. Wiley Interdiscip. Rev. Comput. Stat. **2017**, 10, e1425.
- Kokoszka, P.; Reimherr, M. Introduction to Functional Data Analysis; CRC Press: Boca Raton, FL, USA, 2017.
- Rao, A.R.; Reimherr, M. Nonlinear Functional Modeling Using Neural Networks. J. Comput. Graph. Stat. **2023**, 32, 1248–1257.
- Huang, C.; Barnett, A.G.; Wang, X.; Tong, S. The impact of temperature on years of life lost in Brisbane, Australia. Nat. Clim. Chang. **2012**, 2, 265–270.
- Yang, Y.; Qi, J.L.; Ruan, Z.L.; Yin, P.; Zhang, S.Y.; Liu, J.M.; Liu, Y.N.; Li, R.; Wang, L.J.; Lin, H.L. Changes in Life Expectancy of Respiratory Diseases from Attaining Daily PM2.5 Standard in China: A Nationwide Observational Study. Innovation **2020**, 1, 100064.
- Deryugina, T.; Molitor, D. The Causal Effects of Place on Health and Longevity. J. Econ. Perspect. **2021**, 35, 147–170.
- Mack, Y.P.; Silverman, B.W. Weak and strong uniform consistency of kernel regression estimates. Probab. Theory Relat. Fields **1982**, 63, 405–415.
- Masry, E.; Tjøstheim, D. Estimation and Identification of Nonlinear ARCH Time Series: Strong Convergence and Asymptotic Normality. Econom. Theory **1995**, 11, 258–289.
- Chiou, J.M.; Müller, H.G. Nonparametric quasi-likelihood. Ann. Stat. **1999**, 27, 36–64.
- Xiao, W.W.; Wang, Y.X.; Liu, H.Y. Generalized partially functional linear model. Sci. Rep. **2021**, 11, 23428.

**Figure 2.** Asymptotic properties of the link function $g$. The black line represents the true link function $g=\exp\left(\eta \right)/\left(1+\exp\left(\eta \right)\right)$. The purple, yellow, and red lines represent the estimated link functions $\widehat{g}$ for sample sizes $n=50$, $n=100$, and $n=300$, respectively.

**Figure 3.** Estimated regression coefficient functions ${\widehat{\beta}}_{1}\left(t\right)$ and ${\widehat{\beta}}_{2}\left(t\right)$ (blue curves) and their 95% confidence intervals (grey areas) for different sample sizes; the red curves are the true regression coefficient functions ${\beta}_{1}\left(t\right)$ and ${\beta}_{2}\left(t\right)$.

**Figure 4.** Daily AQI (left plot) and daily temperatures (right plot) for 58 cities in 2020; each curve represents one city.

**Figure 5.** Estimated regression coefficient function $\widehat{\beta}\left(t\right)$ and its 95% confidence interval.

| Abbreviation | Full Form |
|---|---|
| FPCA | Functional Principal Component Analysis |
| KL expansion | Karhunen–Loève expansion |
| RMISE | Root Mean Integrated Square Error |
| SD | Standard Deviation |
| GCV | Generalized Cross Validation |
| MAE | Mean Absolute Error |
| MSE | Mean Squared Error |
| TP | True Positive |
| TN | True Negative |
| FP | False Positive |
| FN | False Negative |

| n | RMISE |
|---|---|
| 50 | 0.3540 |
| 100 | 0.2734 |
| 300 | 0.1449 |
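For orientation, a generic way to compute a root mean integrated square error on a grid is sketched below. The definition used here (the square root of the average integrated squared error over simulation replicates), the trapezoidal rule, and the function names are assumptions made for illustration; the exact formula behind the reported values is not restated in this section.

```python
import math


def integrated_squared_error(estimate, truth, grid):
    """Trapezoidal approximation of the integral of (estimate - truth)^2 over grid."""
    total = 0.0
    for k in range(len(grid) - 1):
        left = (estimate[k] - truth[k]) ** 2
        right = (estimate[k + 1] - truth[k + 1]) ** 2
        total += 0.5 * (left + right) * (grid[k + 1] - grid[k])
    return total


def rmise(estimates, truth, grid):
    """Square root of the average integrated squared error over replicates."""
    ises = [integrated_squared_error(est, truth, grid) for est in estimates]
    return math.sqrt(sum(ises) / len(ises))


# Toy check: the truth is the logistic curve on [-2, 2] and each made-up
# "estimate" is the truth shifted by a constant 0.5.
grid = [-2.0 + 0.04 * k for k in range(101)]
truth = [1.0 / (1.0 + math.exp(-eta)) for eta in grid]
estimates = [[g + 0.5 for g in truth], [g - 0.5 for g in truth]]
value = rmise(estimates, truth, grid)
```

In the toy check each replicate has integrated squared error $0.25\times 4=1$, so `value` is 1 up to rounding; a decreasing RMISE as $n$ grows would correspond to estimated curves lying closer to the true one.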

**Table 3.** SD and RMISE of the estimated values of ${\widehat{\beta}}_{1}\left(t\right)$ and ${\widehat{\beta}}_{2}\left(t\right)$ for different sample sizes n.

| Coefficient | n | SD | RMISE |
|---|---|---|---|
| ${\widehat{\beta}}_{1}\left(t\right)$ | 50 | 0.2475 | 0.3405 |
| ${\widehat{\beta}}_{1}\left(t\right)$ | 100 | 0.1344 | 0.2517 |
| ${\widehat{\beta}}_{1}\left(t\right)$ | 300 | 0.0552 | 0.1204 |
| ${\widehat{\beta}}_{2}\left(t\right)$ | 50 | 0.2536 | 0.3232 |
| ${\widehat{\beta}}_{2}\left(t\right)$ | 100 | 0.1261 | 0.2863 |
| ${\widehat{\beta}}_{2}\left(t\right)$ | 300 | 0.0239 | 0.1033 |

**Table 4.** Estimated values of the scalar regression coefficients $\widehat{\gamma}$, with their SD in brackets, for different sample sizes n.

| n | ${\widehat{\gamma}}_{1}$ | ${\widehat{\gamma}}_{2}$ | ${\widehat{\gamma}}_{3}$ |
|---|---|---|---|
| 50 | 0.7298 (0.191) | 0.5928 (0.177) | 0.5307 (0.232) |
| 100 | 0.6892 (0.092) | 0.5832 (0.071) | 0.4894 (0.096) |
| 300 | 0.7105 (0.019) | 0.5732 (0.018) | 0.4988 (0.016) |

| n | M1 | M2 |
|---|---|---|
| 50 | 0.3182 | 0.1579 |
| 100 | 0.3028 | 0.1498 |
| 300 | 0.2921 | 0.1406 |

| Coefficient | Estimate | Std. Error | t Value | Pr(>$\lvert t\rvert$) |
|---|---|---|---|---|
| ${\widehat{\gamma}}_{GDP}$ | 0.6776 | 0.339 | 1.9988 | 0.04639 |
| ${\widehat{\gamma}}_{Beds}$ | 0.7354 | 0.367 | 2.0038 | 0.04585 |

**Table 7.** Comparison between the Unknown Link Function Model, the Logit Link Function Model, and the Model without a Link Function.

| Link Function | MAE | MSE | $R^{2}$ | Accuracy |
|---|---|---|---|---|
| Unknown | 0.2584 | 0.1399 | 0.8916 | 81.03% |
| Logit | 0.2872 | 0.2511 | 0.6673 | 75.86% |
| Without | 0.4777 | 0.3146 | 0.4118 | 74.14% |
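The comparison criteria in Table 7 can be computed along the following lines. This is a sketch under assumptions: the response is taken to be binary, $R^{2}$ is taken as one minus the ratio of residual to total sum of squares, accuracy is computed from TP, TN, FP, and FN after thresholding fitted values at 0.5, and the data are made up for illustration. None of these details are confirmed by this section.

```python
def regression_metrics(y_true, y_fit):
    """MAE, MSE, and R^2 between observed responses and fitted values."""
    n = len(y_true)
    residuals = [y - f for y, f in zip(y_true, y_fit)]
    mae = sum(abs(r) for r in residuals) / n
    mse = sum(r * r for r in residuals) / n
    mean_y = sum(y_true) / n
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)
    r2 = 1.0 - sum(r * r for r in residuals) / ss_tot
    return mae, mse, r2


def confusion_counts(y_true, y_fit, threshold=0.5):
    """TP, TN, FP, FN after thresholding fitted values (threshold is assumed)."""
    tp = tn = fp = fn = 0
    for y, f in zip(y_true, y_fit):
        pred = 1 if f >= threshold else 0
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 0 and y == 0:
            tn += 1
        elif pred == 1:
            fp += 1
        else:
            fn += 1
    return tp, tn, fp, fn


def accuracy(tp, tn, fp, fn):
    """Share of correctly classified observations."""
    return (tp + tn) / (tp + tn + fp + fn)


y_true = [1, 0, 1, 1, 0]            # made-up labels, not the paper's data
y_fit = [0.9, 0.2, 0.4, 0.7, 0.6]   # made-up fitted values
mae, mse, r2 = regression_metrics(y_true, y_fit)
tp, tn, fp, fn = confusion_counts(y_true, y_fit)
acc = accuracy(tp, tn, fp, fn)
```

Lower MAE and MSE and higher $R^{2}$ and accuracy indicate a better fit, which is the direction in which the unknown-link row of Table 7 differs from the other two rows.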

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Xiao, W.; Li, S.; Liu, H.
Generalized Partially Functional Linear Model with Unknown Link Function. *Axioms* **2023**, *12*, 1089.
https://doi.org/10.3390/axioms12121089

**Chicago/Turabian Style**

Xiao, Weiwei, Songxuan Li, and Haiyan Liu.
2023. "Generalized Partially Functional Linear Model with Unknown Link Function" *Axioms* 12, no. 12: 1089.
https://doi.org/10.3390/axioms12121089