Contribution of Incorporating the Phosphorus Cycle into TRIPLEX-CNP to Improve the Quantiﬁcation of Land Carbon Cycle

: Phosphorus (P) is a key and a limiting nutrient in ecosystems and plays an important role in many physiological and biochemical processes, affecting both terrestrial ecosystem productivity and soil carbon storage. However, only a few global land surface models have incorporated P cycle and used to investigate the interactions of C-N-P and its limitation on terrestrial ecosystems. The overall objective of this study was to integrate the P cycle and its interaction with carbon (C) and nitrogen (N) into new processes model of TRIPLEX-CNP. In this study, key processes of the P cycle, including P pool sizes and ﬂuxes in plant, litter, and soil were integrated into a new model framework, TRIPLEX-CNP. We also added dynamic P:C ratios for different ecosystems. Based on sensitivity analysis results, we identiﬁed the phosphorus resorption coefﬁcient of leaf (rpleaf) as the most inﬂuential parameter to gross primary productivity (GPP) and biomass, and determined optimal coefﬁcients for different plant functional types (PFTs). TRIPLEX-CNP was calibrated with 49 sites and validated against 116 sites across eight biomes globally. The results suggested that TRIPLEX-CNP performed well on simulating the global GPP and soil organic carbon (SOC) with respective R 2 values of 0.85 and 0.78 (both p < 0.01) between simulated and observed values. The R 2 of simulation and observation of total biomass are 0.67 ( p < 0.01) by TRIPLEX-CNP. The overall model performance had been improved in global GPP, total biomass and SOC after adding the P cycle comparing with the earlier version. Our work represents the promising step toward new coupled ecosystem process models for improving the quantiﬁcations of land carbon cycle and reducing uncertainty.


Introduction
Phosphorus (P) availability in soils is known to be controlled by its parent material, the stage of soil development, weathering, and erosion [1], as well as soil microbial activity [2]. Many forms of P exist in soils, but plants can acquire only simple inorganic orthophosphate (Pi) [3]. Since Pi is easily fixed by soil particles in soils with a range of chemical properties [4], its levels are usually limiting, which in turn limits the productivity of a variety of ecosystems [5].
It is widely accepted that nutrients limit terrestrial plant production and carbon (C) [6][7][8]. Nitrogen (N) has traditionally been considered the most important nutrient that limits plant production in terrestrial ecosystems [8,9], whereas P has been considered less important, perhaps with exception of lowland tropical forest regions [6,7,10]. This traditional view, however, has been challenged by mounting evidence of significant P limitation in areas other than the lowland tropical regions, including tundra regions [11,12] and temperate region areas with strongly weathered soils [13,14]. Despite the important role of P in terrestrial plant production, we still lack a quantitative understanding the mechanisms underlying terrestrial P limitation on a global scale, its spatial heterogeneity, or interactions of P with C and N [15,16]. For example, none of the models in the fifth phase of the Coupled Model Intercomparison Project consider P as a limiting nutrient, resulting in substantial uncertainty in estimates of the strength of the terrestrial carbon (C) sink through the 21st century [15][16][17][18][19]. Therefore, identifying the P cycle specific processes and quantifying the distribution and spatial heterogeneity of P limitation, as well as its potential effects on ecosystem responses to future climate change, remain a high priority.
Integrating the P cycle into terrestrial ecosystem models should be essential, given that laboratory and field experiments, especially empirical stoichiometry observations [20][21][22], have already provided theoretical and data foundations [2] that allowed scientists to develop terrestrial ecosystem models that take into account C, N and P processes, interactions and feedback in entire ecosystems. In fact, the early process-based models developed, focused on C cycles of terrestrial ecosystems under climate change and elevated levels of atmospheric CO 2 [15]. A growing number of models that couple C-N interactions have been applied to understanding the effects of global change on multiple scales. The models that most widely applied are the biogeochemical model CENTURY [23][24][25][26], ecosystem process-based model, BGC (BIOMASS-BGC, FOREST-BGC) [27][28][29][30][31], DNDC [32][33][34][35][36][37], and land biogeochemistry model, CLM [38][39][40]. These models have been applied to different ecosystems but have been limited primarily to forests, grasslands, and croplands.
At global scale, the land carbon models, including LPJ [41], LPJ_GUESS [42], OR-CHIDEE [43], CLM4CN [44], TRIFFID [45], VEGAS [46], SDGVM [47], and Hyland [48], encapsulate natural and human induced processes through which land ecosystems absorb and emit CO 2 , and carbon is stored by plants and soils, and have been widely used by the TRENDY modeling project for simulating the energy, water, nitrogen and carbon balances at daily, monthly and yearly time step [49]. Unfortunately, no model has included P cycle and limitation in their model simulations.
Efforts have been made to integrate co-limiting N and P on simulations of the productivity of terrestrial ecosystems [6,50,51], but the most of second-generation models do not take P limitation into account [15]. In part since this omission creates uncertainty in estimates of vegetation productivity and changes in terrestrial carbon sinks, interest is growing in understanding the effect of P on terrestrial ecosystems, how P limitation affects C and N, and how P and N interact globally [52]. The handful of land surface models that have considered the P cycle and C-N-P interactions, such as CASA-CNP [53], JSBACH-CNP [54,55], CLM-CNP [56], QUINCY [52], and data assimilation framework, GOLUM-CNP [57], show that increases in atmospheric CO 2 limits P in terrestrial ecosystems, especially lowland tropical forest [52,54,[56][57][58]. Recently, the ORCHIDEE-CNP model has been evaluated at a global scale using remote sensing data and provided updated information regarding the need to incorporate P cycle processes to estimate global C stocks under climate change [59]. However, the validation of model is still a big challenge. There are very few models, such as CASA-CNP, JSBACH-CNP, and ORCHIDEE-CNP, which have been validated by comparing the model simulations against the observations across global ecosystems. In addition, the Jena Soli Model (JSM) [60] focused on improving the microbial soil organic carbon (SOM). The importance of more detailed representation of plant P processes is needed to link nutrient availability and ecosystem productivity [61]. Moreover, the above models are rarely studied to confirm the sensitive parameters and adjustment parameters used in the phosphorus cycle. There are also few models that were validated by site-to-site across multiple variables on the global scale.
The objectives of the present study were to (1) incorporate the P cycle and its interactions with the C and N cycles into the existing ecosystem model TRIPLEX-GHG [62,63] in order to develop a new process-based model, called TRIPLEX-CNP; (2) identify the parameters to which TRIPLEX-CNP is the most sensitive to GPP and biomass by sensitivity analysis; and (3) evaluate the performance of TRIPLEX-CNP model for simulating GPP, SOC, and total biomass against global observations.

Model Structure and Description
In this study, we integrated the P cycle, its interaction with C and N, and feedback into the TRIPLEX-GHG model [63], which was developed based on the Integrated Biosphere Simulator (IBIS) and TRIPLEX [64]. TRIPLEX-GHG includes four submodules [63]: (i) the plant phenology module quantifies the response of different plant functional types (PFTs) to different parameters, capturing PFT differences in phenological processes such as budburst and tree leaf fall period; (ii) the vegetation dynamics module determines the distributions of PFTs based on climatic factors such as cold temperature and growing degree days, and it simulates the dynamic C allocation in leaves, stems and roots for each biome; (iii) the land surface module, developed from the land-surface transfer model [65], describes energy and water exchanges among soil, vegetation and atmosphere; and (iv) the soil biogeochemistry module focuses on exchange, flux and decomposition processes that affect C and N in litter and underground pools. TRIPLEX-GHG also contains a methane (CH 4 ) biogeochemical submodule [63] and nitrogen dioxide (N 2 O) submodule [62]. The model has been calibrated, validated, and used to simulate emissions of CH 4 [63,66,67] and N 2 O [62,68] from site-scale to global-scale.
However, the TRIPLEX-GHG model does not include explicit P cycle processes, their interactions or feedback to C and N processes. The present study covered this gap by integrating important P processes, pools and fluxes into the model (Figure 1), generating the new modeling framework that we call TRIPLEX-CNP.

P Fluxes and Pools in Plants
In our model, the P cycle contained 12 pools (leaf, wood and root for plant pools; decomposable, structural, and resistant for litter pools; microbial, slow, passive P pools for soil pools and new adding labile, absorbed and strongly absorbed pools), which are based mainly on the CASA-CNP model [53,69]. The P:C ratio for three key plant P pools (leaf, wood and root) were quantified and allowed to vary within ranges defined for each PFT by the model. All of the algorithms of pools and fluxes followed the mass balance principle. The timescale was daily for the biogeochemistry submodule. Plants were divided into leaf, wood, and root pools; the litter was divided into decomposable, structural and resistant pools; and the soil was divided into a microbial biomass pool, slow pool (protected and non-protected) and passive pool. The key parameters concerning P cycle used in fowling equations are listed in Table 1 (and the others in Table S1).

P Fluxes and Pools in Plants
In our model, the P cycle contained 12 pools (leaf, wood and root for plant pools; decomposable, structural, and resistant for litter pools; microbial, slow, passive P pools for soil pools and new adding labile, absorbed and strongly absorbed pools), which are based mainly on the CASA-CNP model [53,69]. The P:C ratio for three key plant P pools (leaf, wood and root) were quantified and allowed to vary within ranges defined for each PFT by the model. All of the algorithms of pools and fluxes followed the mass balance principle. The timescale was daily for the biogeochemistry submodule. Plants were divided into leaf, wood, and root pools; the litter was divided into decomposable, structural and resistant pools; and the soil was divided into a microbial biomass pool, slow pool (protected and non-protected) and passive pool. The key parameters concerning P cycle used in fowling equations are listed in Table 1 (and the others in Table S1). Equations describing changes in P pools were taken from CASA-CNP [53]. We started from where P i is the amount of P absorbed by plants (g P·m −2 ) (i = leaf, wood or root here), f pi is the proportion of P absorbed by plants allocated to plant pools and ∑ i f pi = 1, F pup is plant P uptake rate (g P·m −2 ·day −1 ), t i is the turnover rate of plant pools (day −1 ), and r pi is the P reabsorption coefficient (0.5 for leaf pool, 0.9 for wood pool and 0.9 for root pool) [53].

P Pools and Fluxes in Litter
The litter was modeled as three pools (including structural, decomposable and resistant pools) occupying an intermediate layer between the plant and the soil layer ( Figure 1). The litter layer receives elements from the vegetation, converts them into organic or inorganic forms, then transmits them to the soil layer. P fluxes in the structural pool were described by the equation dP str dt = t leaf C leaf + t root C root R str − l n t str P str (2) where P str is the amount of structural P (g P·m −2 ); C leaf and C root are the carbon concentrations of leaf and root pools (g C·m −2 ), respectively; t leaf , t root and t str are the turnover rates of leaf, root and structure pools (day −1 ); R str is the P:C ratio in the structural pool (g P·g C −1 ); and l n is the nitrogen limitation on litter C decomposition, defined to be 1.0. P fluxes in the decomposable pool were described by the equation where P dec is the decomposable P pool (g P·m −2 ), while r pleaf and r proot are the constants for leaf and fine root biomass turnover time constants (day −1 ) for each PFT. These constants had the values provided in Table S1. P fluxes in the resistant pool were described by the equation dP res dt = t wood P wood − l n t res P res (4) where P res is the resistant P pool (g P·m −2 ), and t wood and t res are the turnover time constants for wood and litter resistant pools (day −1 ). The values of these constants are shown in Table S1.

Soil P Pools, Fluxes and Mineralization
The soil pools were modeled as containing the three pools of the original model without the P cycle, divided into organic and inorganic pools, as well as three new independent P pools (labile, absorbed and strongly absorbed pools), which all together describe the process of P cycling completely. In contrast to the original TRIPLEX-GHG settings, we combined the non-protected slow pool and the protected slow pool into a single slow pool. In microbial biomass, we allowed P, C, and N to be in organic forms in the slow and passive pools, whereas P was allowed to flow among labile, absorbed and strongly absorbed pools only in their inorganic form. The labile pool is essential since it contains the main P form taken up by plants, while a fraction of P leaches out of the ecosystem due to its solubility. In absorbed and strongly absorbed pools, P becomes inactive and, ultimately, inorganic phosphate ions aggregate in soil, making it difficult for the P to be recycled. The orange arrows in Figure 1 represent the flows of P among different soil compartments, which were modeled using the following equations: dP mic dt = ∑ j C micj l n t j P j + ∑ k d mick t k P k − t mic P mic (5) dP slow dt = ∑ j C slowj l n t j P j + ∑ k d slowk t k P k − t slow P slow − prbiocheminer·M1 (6) dP pass dt = ∑ j C passj l n t j P j + ∑ k d passk t k P k − t pass P pass − prbiocheminer·M2 (7) where P j is the amount of P in different litter pools (j = microbial, slow or passive); k represents the soil P pools, P k is the amount of P in soil pools; P mic , P slow and P pass are the respective amounts of P in microbial, slow and passive pools in litter (g P·m −2 ); C slowj and C passj are the respective amounts of C in litter pools (structural, decomposable and resistant) to different soil pools (slow and passive) (g C·m −2 ); ln represents the limitation of nitrogen; t j is the turnover time constant of litter pool (day −1 ); t k is the turnover time constant of soil pools (day −1 ); d mick , d slowk and d passk are the C allocation from the other two soil pools; prbiocheminer is the biochemical mineralization rate of P (g P·m −2 ); and M1 is t slow P slow t slow P slow +t pass P pass and M2 is t pass P pass t slow P slow +u pass P pass , according to Michaelis-Menten kinetics. Three main soil P pools (labile, absorbed and strong absorbed) were modeled using the equations where P lab is the labile phosphorus pool (g P·m −2 ); F pnet is the mineralization rate (g P·m −2 ·day −1 ); F pdep , F pfert , F pwea , and F Ptase are P inputs into the ecosystem due to deposition, fertilization, weather and biological mineralization (g P·m −2 ·day −1 ); t sob is the turnover time constant rate for absorbed pool (day −1 ); p sorbmax is the maximum amount of absorbed phosphorus (644.0 g P·m −2 ); and k plabsor is an empirical parameter for describing the equilibrium between labile and absorbed phosphorus, which was defined to be 55.0 [53]. F pup and F ploss refer to two pathways out of the labile pool due to P uptake by plants and leaching from the ecosystem (g P·m −2 ·day −1 ), the loss rate is 0.04 yr −1 [70]. We focused on labile P pools, which P for mineralization. The net biological phosphorus mineralization rate (prnetminer) and biochemical phosphorus mineralization rate (prbiochenimer) can both be used to describe P mineralization, but due to the long turnover of this process, we simplified it using the equations dprnetminer dt = Plittermin + Psmin + Psimm (11) dprbiochemimer dt = u mvmax ·M4· t slo P slo + t pas P pas (12) where Plittermin is the minimum value in the litter pool; Psmin is gross immobilization from litter; Psimm is immobilization from other soil pools; M4 is ncstpup−ncstppro+kpbiochem , where u mvmax is the maximum specific biochemical phosphorus mineralization rate; ncstpup is N cost for P uptake; and kpbiochem is the Michaelis-Menten constant for biochemical P mineralization. Each of the 12 PFTs featured different values of u mvmax and ncstpup.
In our model, all P pool sizes were cumulative at each time step, and all the constants used in the algorithms of fluxes and pools are listed in Table S1. The P pool size at a specific time step, P tot , was calculated as (13) where t represents the time step, and P it is the amount of P in different pools.

Dynamic P:C Ratios of Different Phosphorus Pools
At the end of each simulation time step, the P:C ratio in different pools (g P·g C −1 ), pcr, was calculated for each pool from Land 2022, 11, 778 where P(i) and C(i) are the total P and C in the pool (g P·m −2 , g C·m −2 ). The value of pcr(i) was updated according to the P(i) and C(i) calculated in the last time step. In this way, the model relied on dynamic changes of P:C ratios instead of fixed ratios.

Constraints (Feedback) of P on Plant Productivity
Based on the result about global Meta-analysis that interaction between P and N as well as feedback to photosynthesis [71], we used the following equation to describe the feedback: where Vcmax is the maximum leaf carboxylation rate (mol-CO 2 ·m −2 ·s −1 ), while pleaf (g P·m −2 ) and nleaf (g N·m −2 ) are leaf P and N concentrations, respectively. Leaf maximum carboxylation rate is an index of photosynthesis.

Input Data
The climate data of calibration and validation sites for GPP and SOC come from the FLUXNET. Available online: https://fluxnet.org/data/fluxnet2015-dataset/ (accessed on 21 April 2022) that provide daily climate data including temperature, atmospheric pressure, precipitation, wind speed, etc. For the sites that climate data cannot be directly obtained, we downloaded the daily data of the nearest location from Global Historical Climatology Network daily (GHCNd). Available online: https://www.ncei.noaa.gov/products/landbased-station/global-historical-climatology-network-daily (accessed on 21 May 2022). In this way, we have all sites formed daily weather-driven data from 1901 to 2017.
We initialized soil carbon and soil C:N ratios in our model using the global soil dataset, IGBP-DIS (2000). The soil textures including clay, sand, slit fraction and soil pH data are provided by Global Digital Elevation Model (DSMW) obtained from FAO/UNESCO Soil Map of the World. Available online: http://www.fao.org/geonetmork/srv/en/metadata.show? id514116 (accessed on 21 April 2022). And DSMW is the basis of soil classification map.
Topographic input data were extracted from the global digital elevation model GTOPO30, with an approximate spatial resolution of 1 km.
Data on atmospheric CO 2 concentrations for the 1958-2017 period were taken from the Mauna Loa Observatory in Hawaii [72]. Data before 1958 were taken mainly from the IS92a dataset of annual global CO 2 concentrations, obtained by spline fitting ice core data to the Mauna Loa data [73].

Sensitivity Analysis
The most sensitive parameter in the procedure must be identified for site-specific studies. This is because this parameter is useful for model calibration, validation and uncertainty analysis [74]. We conducted initial sensitivity analysis experiments in this study to obtain the most sensitive parameters and simplify the fitting processes. The model parameters listed in Tables 1 and S1 have been basically used in the various processes of P cycle. Table 1 showed the most important parameters that are more closely related to GPP. Hence, we selected six key parameters associated with the P cycle model (Table 1) and four meteorological factors (cloud cover, near-surface temperature, precipitation, vapor pressure; Table S2) as the important parameters and variables for sensitivity analysis. We have also conducted sensitivity analysis on total biomass and six parameters, and four meteorological factors for rpleaf as well.
We calculated the sensitivity index (SI) to quantify the ratio between relative change in model output for a relative change in each parameter for each site using time series, while other parameters were kept constant [75]. Sensitivity is generally expressed as the ratio between a relative change in a model output and a relative change in a parameter. For each parameter, the SI at a given site was described as where n is the total number months of model simulation, j is the ordinal numeral of month, x 0 means the initial value of the input data, ∆x is the difference between the initial and adjusted value of the input data, and y 0 is the model output after adjustment of the input data. The six model parameters were varied ±50%; meteorological factors, cloud cover, precipitation and vapor pressure were varied ±20%; and near-surface temperature was varied ±2 • C. Then we calculated variations in GPP among different biomes, and according to the size of the GPP change, we determined the optimal values of the most sensitive parameter for each biome (Table S3).

Model Calibration and Validation
We used the subjective optimization [76] to calibrate the most sensitive model parameter, rpleaf (phosphorus resorption coefficients of leaf) by using GPP data from 49 sites of FLUXNET. Available online: https://fluxnet.org/data/fluxnet2015-dataset/ (accessed on 21 May 2022) for different biomes. For each site, we controlled other unchanged model parameters, and only increased the parameter value by 5% each time to obtain the GPP model value of model simulation. Since the real range of this parameter was 0.0 to 1, it needs to be tested at least 20 times. When the GPP value of simulation was closest to the observed value, the parameter is regarded as the optimal parameter value of this site. Then the optimal value of the most sensitive parameter for each biome was set as the mean value of all sites optimizations in its biome. The specific information of the rpleaf optimal value setting for each sits and each biome was shown in Table S3.
The model validations were based on the three indexes (e.g., formula [18][19][20], and these indexes were calculated by three variables (GPP, vegetation biomass, SOC) that were selected to verify model performance. In particular, for GPP, observation data was extracted from another 116 global FLUXNET sites. For total biomass, observation data of 933 sites were collected from literatures globally (from Web of Science using the search terms "biomass" or "biomass dataset") and detail information listed in S6. The observed SOC dataset for model validation was from the Harmonized World Soil Database (HWSD) (version 1.2) of the Food and Agriculture Organization of the United Nations. Available online: https://www.fao.org/soils-portal/soil-survey/soil-maps-and-databases/ harmonized-world-soil-database-v12/en/ (accessed on 20 April 2022). In the end, data were taken from 75 sites across three biomes: tree, shrubland and grassland. Details about the sites from which data were taken for model validation are provided in Table S3. All simulations for model calibration and validation were at site level. A spin-up run of approximately 100 years using multiyear mean climate data was conducted to achieve an equilibrium state in C pools and then collected climate data were used to make the following simulation for each site.
We evaluated model performance using three evaluation indexes. The coefficient of determination (R 2 ) was calculated as and the second index of agreement (D) was calculated as The last index was the root-mean-square error (RMSE), which was quantified as where n also represents the number of samples, and S i and O i are the simulated and observed output in the ordinal number month. This index represents the degree of agreement between observed and simulated values; the smaller the value, the better the model performance.

Initial Sensitivity Analysis
The sensitivity of GPP varied across the six model parameters listed in Table 1, with the phosphorus resorption coefficient of leaf (rpleaf) being the most influential, followed by the empirical parameter relating plant phosphorus uptake rate to labile pool size in the soil (kpup) (Figure 2a). Other parameters had only slight or negligible influence on GPP, including rpwood, rproot, kp, kpbiochem, kplasor and psorbmax. Since rpleaf is used to calculate pleaf, which regulates stomatal conductance and photosynthesis, we focused on rpleaf as the target parameter to be adjusted.  (Figure 2a). Other parameters had only slight or negligible influence on GPP, including rpwood, rproot, kp, kpbiochem, kplasor and psorbmax. Since rpleaf is used to calculate pleaf, which regulates stomatal conductance and photosynthesis, we focused on rpleaf as the target parameter to be adjusted. For total biomass, the sensitivity analysis results of the parameters indicate the rpleaf is the most sensitive parameter (Figure 2b), and its value is about 0.74. The other parameter kpup is less sensitive to biomass than GPP. The remaining parameters including rpwood, rproot, kp, kpbiochem, kplasor and psorbmax are less sensitive to total biomass. This also provides a basis for us to determine the adjustment parameter rpleaf.
With respect to GPP, the mean values of SI were all greater than zero for the four meteorological factors (Figure 2c): precipitation, 3.11; near-surface temperature, 1.87; vapor pressure, 0.80; and cloud cover, 0.17. It could be concluded that for the majority sites, the precipitation was the most sensitive meteorological factor to GPP and there was a significant correlation between them. At all sites, GPP generally increased as precipitation increased. The absolute values of maximum reductions of GPP under −20% were greater than the absolute values of maximum increases under +20% for the two precipitation gradients tested. This suggests that simulated GPP was more sensitive to drought, less sensitive to vapor pressure, and was not substantially affected by cloud cover.  panel (a,b), rpeaf, rpwood, and rproot are phosphorus resorption coefficients of leaf, wood and root, respectively; kp, an empirical parameter for phosphorus limitation on net ecosystem productivity (NPP); kpbiochem, an empirical constant for biochemical phosphorus mineralization; kplabsor, an empirical parameter for describing the equilibrium between labile and reabsorbed phosphorus; kpup, an empirical parameter relating plant phosphorus uptake rate to labile pool size in the soil; and psorbmax, the maximum amount of absorbed phosphorus.
In panel (c), the sensitivity of GPP (blue boxes) and amount of leaf phosphorus (red boxes) to variations in the indicated variables are shown. mineralization; kplabsor, an empirical parameter for describing the equilibrium between labile and reabsorbed phosphorus; kpup, an empirical parameter relating plant phosphorus uptake rate to labile pool size in the soil; and psorbmax, the maximum amount of absorbed phosphorus. In panel (c), the sensitivity of GPP (blue boxes) and amount of leaf phosphorus (red boxes) to variations in the indicated variables are shown.
For total biomass, the sensitivity analysis results of the parameters indicate the rpleaf is the most sensitive parameter (Figure 2b), and its value is about 0.74. The other parameter kpup is less sensitive to biomass than GPP. The remaining parameters including rpwood, rproot, kp, kpbiochem, kplasor and psorbmax are less sensitive to total biomass. This also provides a basis for us to determine the adjustment parameter rpleaf.
With respect to GPP, the mean values of SI were all greater than zero for the four meteorological factors (Figure 2c): precipitation, 3.11; near-surface temperature, 1.87; vapor pressure, 0.80; and cloud cover, 0.17. It could be concluded that for the majority sites, the precipitation was the most sensitive meteorological factor to GPP and there was a significant correlation between them. At all sites, GPP generally increased as precipitation increased. The absolute values of maximum reductions of GPP under −20% were greater than the absolute values of maximum increases under +20% for the two precipitation gradients tested. This suggests that simulated GPP was more sensitive to drought, less sensitive to vapor pressure, and was not substantially affected by cloud cover.
Also, in Figure 2c, with respect to the amount of leaf phosphorus, mean SI values were again positive (vapor pressure, 0.63; precipitation, 0.53; cloud cover, 0.06), except for near-surface temperature (−3.79), and this factor was the most influential on pleaf.

Model Calibration
We calibrated the model using 49 sites, which we classified by stratified random sampling into five main biomes (tropical forest, temperate forest, boreal forest, grassland, shrubland; Table S3). The calibration variable was GPP, and rpleaf was allowed to range from 0.0 to 1.0 in 5% increments; at each rpleaf level, we compared 20 simulated GPP values with GPP observations. Finally, we confirmed the optimal value of rpleaf for each site and biome.
Based on the three indexes of model fitting, we determined optimal rpleaf values to be 0.783 for tropical forest, 0.603 for temperate forest, 0.429 for boreal forest, 0.373 for grassland, and 0.610 for shrubland (Table S3). In addition, RMSE and D fell into acceptable ranges.  Table S4). Figure 3a represents the compare GPP which simulated by TRIPLEX-CNP, while Figure 3b is the results of GPP simulated by model without P cycle. TRIPLEX-CNP performs better than its predecessor that without P cycle (R 2 = 0.85 vs. 0.77, both p < 0.01; Figure 3). At all sites, there are 14 sites for deciduous broadleaf forest; 12 sites for evergreen broadleaf forest; 37 sites for evergreen needleaf forest; 4 sites for mixed forest; 24 sites for grassland; 12 sites for shrubland; and 12 for savanna. There is only one site for deciduous needleaf forest since all of the sites were selected by random stratified sampling from 165 sites. For individual biomes, the simulations of savanna, grassland and shrub land are shown to be closer to the observed GPP after adding phosphorus cycle. For the other biomes, modeled GPP was consistently lower than the observed GPP.
sites, there are 14 sites for deciduous broadleaf forest; 12 sites for evergreen broadleaf forest; 37 sites for evergreen needleaf forest; 4 sites for mixed forest; 24 sites for grassland; 12 sites for shrubland; and 12 for savanna. There is only one site for deciduous needleaf forest since all of the sites were selected by random stratified sampling from 165 sites. For individual biomes, the simulations of savanna, grassland and shrub land are shown to be closer to the observed GPP after adding phosphorus cycle. For the other biomes, modeled GPP was consistently lower than the observed GPP.

Total Biomass
The second validated variable was total biomass, which were divided into 5 biomes as showed in Figure 4. And there were 933 observations collected from literatures in which 104 sites are from tropical forests, 219 sites belong to temperature evergreen conifer forests, 227 sites are temperature evergreen broadleaf forests and 162 are temperature deciduous forests, and 221 sites for boreal forests (The detailed information of 933 sites in Table S5). The total biomass simulations were calculated from TRIPLEX-CNP (Figure 4a) and the original TRIPLEX-GHG model without P cycle (Figure 4b). Comparing the correlation between observations and simulations for TRIPLIEX-GHG (R 2 = 0.54), the correlation between observations and simulations for TRIPLIEX-CNP was much better (R 2 = 0.67). The total biomass of tropical forests and temperature evergreen broadleaf forests estimated by TRIPLIEX-CNP are closer to the observations than did the TRIPLIEX-GHG. However, both models trended to over-estimate the total biomass of boreal forests (Figure 4a,b).

SOC
Using the HWSD database from the same sites as those we used to obtain GPP data, we found that the correlation (R 2 ) between TRIPLEX-CNP simulation and observations was 0.78, which was 3% better than the original TRIPLEX-GHG model without P cycle ( Figure 5). Simulated SOC by both models were consistently larger than the observed data.
the original TRIPLEX-GHG model without P cycle (Figure 4b). Comparing the correlation between observations and simulations for TRIPLIEX-GHG (R 2 = 0.54), the correlation between observations and simulations for TRIPLIEX-CNP was much better (R 2 = 0.67). The total biomass of tropical forests and temperature evergreen broadleaf forests estimated by TRIPLIEX-CNP are closer to the observations than did the TRIPLIEX-GHG. However, both models trended to over-estimate the total biomass of boreal forests (Figure 4a,b).  Figure 3. TF is tropical forests, TCF is temperature evergreen conifer forests; TBF is temperature evergreen broadleaf forests, TDF is temperature deciduous forests and BF is boreal forests.

SOC
Using the HWSD database from the same sites as those we used to obtain GPP data, we found that the correlation (R 2 ) between TRIPLEX-CNP simulation and observations was 0.78, which was 3% better than the original TRIPLEX-GHG model without P cycle ( Figure 5). Simulated SOC by both models were consistently larger than the observed data.

Model Sensitivity Analysis
Parameterization of C-N-P process models remains a challenge [53,54,57], so we made the task more tractable by conducting sensitivity analysis to identify the most influential parameters and we used hypothetical values from previous studies (Figure 6a) [53,54]. In this way, we were able to identify rpleaf as more influential than other model

Model Sensitivity Analysis
Parameterization of C-N-P process models remains a challenge [53,54,57], so we made the task more tractable by conducting sensitivity analysis to identify the most influential parameters and we used hypothetical values from previous studies (Figure 6a) [53,54]. In this way, we were able to identify rpleaf as more influential than other model parameters listed in Table 1, which is consistent with a meta-analysis describing direct feedback of P on photosynthesis [71]. Then, for the model calibration, we divided all 49 sites into 5 different biomes. For each biome, the optimal rpleaf value was further determined (detailed values are in Table S3). The rpleaf parameter values of each biome were found to be spatially heterogeneous. For example, average rpleaf in forest was 0.61 which was larger than the average value (0.49) of grasslands and shrublands.

Model Sensitivity Analysis
Parameterization of C-N-P process models remains a challenge [53,54,57], so we made the task more tractable by conducting sensitivity analysis to identify the most influential parameters and we used hypothetical values from previous studies (Figure 6a) [53,54]. In this way, we were able to identify rpleaf as more influential than other model parameters listed in Table 1, which is consistent with a meta-analysis describing direct feedback of P on photosynthesis [71]. Then, for the model calibration, we divided all 49 sites into 5 different biomes. For each biome, the optimal rpleaf value was further determined (detailed values are in Table S3). The rpleaf parameter values of each biome were found to be spatially heterogeneous. For example, average rpleaf in forest was 0.61 which was larger than the average value (0.49) of grasslands and shrublands.  Figure  6a showed the quality of observed GPP and the simulated GPP by all model parameters that were listed in Table 1. 1.5* parameter = +50% and 0.5* parameter = −50% in (a), while in (b), 0.8* factor  Figure 6a showed the quality of observed GPP and the simulated GPP by all model parameters that were listed in Table 1. 1.5* parameter = +50% and 0.5* parameter = −50% in (a), while in (b), 0.8* factor represents meteorological factor decreased 20% to simulate GPP and 1.2* factor represents meteorological factor increased 20%; and the same meanings of legend in Figure 6c but for pleaf simulations, near-surface temperature was varied by ±2 • C, indicated respectively as tmp −2 and tmp +2 • C.
GPP simulated by TRILEX-CNP was most sensitive to precipitation and temperature ( Figure 6b, Table S2), and pleaf was sensitive to temperature (Figures 2 and 6b). In general, both responses of GPP and pleaf to precipitation show a great positive correlation. The increase in GPP with increasing precipitation at most sites is consistent with results of in six larch forest sites in East Asia, based on the process-based terrestrial biosphere model BIOME-BGC [77]. A positive relationship is consistent with previous modeling studies based on TRIPLEX [64] and TRIPLEX-GHG [78]. The relationship between GPP simulated with P cycle and mean annual temperature was negative for most tropical forest sites, but positive at most boreal forest and temperate forest sites, in which the positive relationship is consistent with TRIPLEX-GHG [78]. Warming eliminates the control of P availability on GPP in the typical alpine dry meadow [79].
As with GPP, pleaf as a variable of plant phosphorus increased with precipitation in most biomes examined, except the two variables correlated negatively in evergreen needleleaf forest and grassland, as reported for desert ecosystems in northwest China [80].
It is also consistent with other models: temperature explained as much as 37.7% of the variation in annual GPP across northern Eurasia, based on the light-use-efficiency-based model [81,82].
In our study, pleaf decreased with increasing temperature across various biomes, consistent with previous work [83]. Similarly, senesced-leaf P concentration was found to decrease with increasing cumulative temperature in a larch forest in northeast China across an entire growing season [84]. leaf P decreased with increasing temperature as one approached the equator [80,85,86].
Changes in cloudiness of ±20% did not significantly affect GPP in most biomes in our study. Studies have given divergent results about the potential relationship between cloud cover and GPP. Some studies have reported that increases in cloud cover and the corresponding change in radiation fraction can enhance ecosystem productivity in temperate broadleaf forests in North America [87] and northwestern China [88]. In some tropical savannas, GPP has been shown to peak when cloud cover was minimal or absent [89]. Tropical forests show strong responses to cloudiness [90]. The relationship between cloudiness and GPP may be spatially heterogeneous: The areas where the two factors correlated negatively were approximately nine times more extensive than areas where they correlated positively [81].

Model Evaluation
Accurate modeling of net primary productivity (NPP) and GPP is essential for predicting carbon-climate feedback in the future. There are many models without coupling the P cycle, which have been evaluated for GPP or NPP. The IBIS model used 19 sites globally and found reasonable agreement between simulated and observed NPP (R 2 = 0.52) [91]. BGC and DNDC models have been used to examine potential impacts of climatic and CO 2 changes as well as N deposition or fertilization, but these models do not incorporate the P cycle, and they have been validated only for the regions North America, Australia, or Europe. The recently developed model, such as CLM-CNP model has been validated using NPP data from two forest sites in Hawaii and the Amazon [56], and the model captured the shift of nutrients from N to P limitation in a tropical forest ecosystem. N14CP that integrated P weathering and N deposition was validated using data from 88 sites in northern Europe [92].
In this study, we achieved an R 2 = 0.85 for GPP using TRIPLEX-CNP, which incorporates the P cycle and was validated for different plant functional types (PFTs) using data from 116 sites. Our model simulated GPP 8% better than the predecessor model TRIPLEX-GHG, which does not contain the P cycle. We are aware of a few of global land surface models that have fully integrated C-N-P cycles and have been evaluated for different PFTs on global scale, CASA-CNP and JSBACH-CNP were two good examples. However, the ability of CASA-CNP to model P processes has been validated only with respect to leaf N:P ratios at absolute latitudes, while the ability of JSBACH-CNP has not been validated in detail but only through indirect comparisons to the literature [54]. JSBACH-CNP simulated mean P stock of 0.23 Gt P in a vegetation pool from 1970 to 1999, smaller than the range of 0.39-3.0 [53], but much larger than the 0.08 Gt [93]. JSBACH-CNP estimated soil organic P of 5.74 Gt, which fell at the low end of the 5-10 Gt P [94]. In addition, it provided updated information regarding to the need to incorporate P cycle processes to estimate global C stocks under climate change [95]. They tested and evaluated ORCHIDEE-CNP model at global scale using remote sensing data, instead of site-by-site observation data.
For biomass evaluation, we found that a few models, which have been coupled with the P cycle, have a global validation of biomass. For example, the ORCHIDEE-CNP [59] used the GlobBiomass dataset to evaluate the simulated global forest above-ground biomass. The CoupModel (v6.0) showed the simulated and measured plant biomass of four regions in Sweden [96]. In addition, simulated biomass response to fertilizer addition were consistent with field observations in tropical forests [97].
SOC is an important soil variable to evaluate the effect of the P cycle on the soil C cycle. Accurate modeling of the soil C cycle is essential for predicting C-N-P cycles and climate feedback in the future. Current models provide SOC estimates with large uncertainty: SOC simulations by 11 Earth system models from the fifth Coupled Model Intercomparison Project [98] varied 6-fold, despite the similarities in model structures [99], and only six of the models gave estimates within the range of 890-1660 Pg C in the HWSD. The spatial correlation coefficient between modeled and observed SOC values did not exceed 0.4 for any of the models [100,101]. Compared with other validation results, the simulated SOC result is less accurate. It might be related several reasons, including that most models lack of the microbe process for simulating soil organic matter (SOM), and very few models have explicitly represented soil microorganisms at global scale [102]. In addition, the SOC data used for model validation in this study is not site by site point data, resulting in deviations due to scaling problems, due to the lack of SOC observation-on a global scale. Furthermore, effects of P-induced limitation might be mainly occurred at root biomass [103], while our modeling study pays more attention to leaf P, instead of roots. Another possible reason for the unsatisfactory SOC simulation was that the feedback of phosphorus on carbon was only considered through leaf P, which had a more significant impact on plant carbon and might have a less impact on soil carbon. Mangrove forest was the special ecosystem that contain large amount carbon, especially in soil, while this study was not considered this ecosystem. At the same time, the other important ecosystem (i.e., the farm) was not regarded as the objective of this study. However, it should be noted that nitrogen loss and soil salinity are two key global issues for sustainable farming systems that effected SOC simulation [104,105].
Compared with other models, the TRIPLEX-CNP model has an improvement in the accuracy of calibration and validation. For example, the CASA-CNP just compared the NPP simulation among the latitude with other research results while in this study, we identified the most sensitive parameter in P cycle process and model calibrated by global GPP in 49 different sites including five biomes. For model validation, there were three variables (GPP, SOC and total biomass) used to evaluate the model globally from site to site.

Feedback of N, P on Plant Productivity
For the feedback effect of N and P on C, only a few models have been explored and, as such, it is still a gap area. The study focused on the woody species, trees, and shrubs in the region [106,107]. The meta-analysis to evaluate global patterns and controls of 10 variables associated with ecosystem P cycling under N addition [108]. A model representation of P limitation of two tropical forest sites productivity in Panam [109], but they could not provide a specific algorithm for quantifying the feedback effect of N and/or P on C over the physiological process. In this study, based on the global meta-analysis [71], we investigated and quantified the relationships of Vcmax and Jmax with leaf N, P for six different plant functional types.

Model Limitations, Uncertainty, Improvement, and Outlook
Current C-N-P models exhibit large uncertainties in their estimates of pool sizes, fluxes, and turnover rates of nutrients due to a lack of consistent global observations on P mineralization, sorption, limitation, and the stoichiometric relationships of P with C and N. Fewer data exist on the P cycle than on C and N processes, in part due to the large spatial heterogeneities of the P cycle across terrestrial ecosystems. For example, many studies have analyzed N limitation, but not P limitation, in temperate and highlatitude ecosystems [110,111]. Recently, studies have begun to examine P cycling and its interactions with C and N, such as the FACE experiment in an Amazon forest in Brazil [112] as well as a soil and understory forest warming experiment in Puerto Rico. Available online: http://www.forestwarming.org (accessed on 21 May 2022). In addition, the evaluation of the CNP model is also a great challenge. Since we are still lacking of key variable to validate the phosphorus cycle, and leaf C:P ratios. Although we have tried to eliminate the inconsistency caused by the scaling problem when processing the observed data, there is still a lot of uncertainty. Similarly, GPP, biomass and SOC observed data all have some errors due to poor quality or accuracy. There is also a global-scale project, Imbalance-P, focused on improving our understanding of P limitation. Available online: http://imbalancep-erc.creaf.cat/ (accessed on 21 May 2022). These projects will expand the database on the P cycle in regional and global networks, improving our ability to develop and validate more informative ecosystem models at multi-scales [113].
These efforts are all the more important given the recent realization that P limitation occurs not only in lowland tropical forest regions as traditionally thought [6,7,10], but also in other ecosystems [113,114], including tundra regions [11,12] and temperate region areas with strongly weathered soils [13,14]. Our work here with TRIPLEX-CNP demonstrates the feasibility and usefulness of incorporating the P cycle into global land surface models [52,54,56,57,69].
For the model uncertainty, the observations of GPP used in this article were estimated from the FLUXNET data. Howver, this dataset can also be obtained through a series of processing, the data may have uncertainty caused by processing methods. For the SOC, in order to verify the consistency, we used the same point as the GPP coordinate, but the FLUXNET did not provide SOC data, and it is difficult for us to collect the corresponding single point data globally, so we used the HWSD SOC data for verification, this may also cause the uncertainty due to the data source. In addition to the uncertainty caused by the data, the low level of understanding of the phosphorus cycles processes (mineralization, sedimentation, microbial processes, etc.) also brings uncertainty to the model simulations.
For the development of the model, there is still much work to be carried out, such as the assignments of P fraction to P pools are not consistent and often mix definitions by availability and speciation [115,116]. The questino concerning how nutrients reallocation will be affected by phosphorus availability is still unclear [117]. In addition, the microbial process that affects the absorption of nutrients by vegetation is another bottleneck, since soil microbes release immobile forms of P to the soil solution and are also responsible for the immobilization of P [118], especially they are close to mycorrhizal fungi [119]. Further progress along this path will require improvements in the parameterization of the P cycle based on P mineralization, sorption, limitation, and stoichiometry. Future work should also evaluate the TRIPLEX-CNP in different climate regimes under various experimental manipulations [56,120,121] and C-N-P interactions. Such work can assess not only a model's overall performance but also how well it represents individual processes.

Conclusions
This study had made a new attempt to integrate the P cycle and its interaction with carbon (C) and nitrogen (N) into new processes model of TRIPLEX-CNP. This model accurately simulated GPP, total biomass, and SOC across large geographical ranges and different ecosystem types, based on a range of global datasets after determining and adjusting the most sensitive parameters (phosphorus resorption coefficient of leaf (rpleaf)). The model calibration and validation demonstrated that the TRIPLEX-CNP was able to simulate global GPP, total biomass, and SOC reasonably and the overall model performance had been improved after adding the P cycle comparing with the earlier version. Our results highlight the urgent need for observations and experiments on how climate change and elevated CO 2 affect the P cycle, and for global land surface models that more accurately represent P mineralization, sorption, partitioning, and stoichiometry, as well as C-N-P interactions. This study represents a promising approach to help predicting the P distribution and its effects on productivity of terrestrial ecosystem under phosphorus limitation and global climate change.
Supplementary Materials: The following supporting information can be downloaded at: https://www. mdpi.com/article/10.3390/land11060778/s1, Table S1: Constants used to calculate the phosphorus cycle; Table S2: Percent change in GPP and in the amount of leaf phosphorus (pleaf) in response to changes in four meteorological factors, for different vegetation types; Table S3: Fitting of rpleaf at each site and for five biomes, and model performance for each biome (optimal parameter marked with ※); Table S4: Data used to validate model simulation of gross primary productivity (GPP) and soil organic carbon (SOC); Table S5: Data on total biomass taken from global 933 sites and used to validate model performance.