Parameter Estimation of the Farquhar — von Caemmerer — Berry Biochemical Model from Photosynthetic Carbon Dioxide Response Curves

The Farquhar—von Caemmerer—Berry (FvCB) biochemical model of photosynthesis, commonly used to estimate CO2 assimilation at various spatial scales from leaf to global, has been used to assess the impacts of climate change on crop and ecosystem productivities. However, it is widely known that the parameters in the FvCB model are difficult to accurately estimate. The objective of this study was to assess the methods of Sharkey et al. and Gu et al., which are often used to estimate the parameters of the FvCB model. We generated An/Ci datasets with different data accuracies, numbers of data points, and data point distributions. The results showed that neither method accurately estimated the parameters; however, Gu et al.’s approach provided slightly better estimates. Using Gu et al.’s approach and datasets with measurement errors and the same accuracy as a typical open gas exchange system (i.e., Li-6400), the majority of the estimated parameters—Vcmax (maximal Rubisco carboxylation rate), Kco (effective Michaelis-Menten coefficient for CO2), gm (internal (mesophyll) conductance to CO2 transport) and Γ* (chloroplastic CO2 photocompensation point)—were underestimated, while the majority of Rd (day respiration) and α (the non-returned fraction of the glycolate carbon recycled in the photorespiratory cycle) were overestimated. The distributions of Tp (the rate of triose phosphate export from the chloroplast) were evenly dispersed around the 1:1 line using both approaches. This study revealed that a high accuracy of leaf gas exchange measurements and sufficient data points are required to correctly estimate the parameters for the biochemical model. The accurate estimation of these parameters can contribute to the enhancement of food security under climate change through accurate predictions of crop and ecosystem productivities. A further study is recommended to address the question of how the measurement accuracies can be improved.


Introduction
The FvCB leaf photosynthesis model for C3 plants [1,2] is fundamental for the prediction of leaf responses to environmental variation [3]. This model has been widely used to simulate CO2 assimilation and the response of plants to climate change at different spatiotemporal scales [4-11], owing to its solid theoretical basis and simplicity [12]. It is also frequently used in reverse to quantify the underlying biochemical properties (i.e., the model parameters) of leaves under different environmental conditions [13-17]. These parameters are often considered easier to estimate from gas exchange measurements than from the in vitro measurements required to quantify enzyme activity. This is because it is difficult to extract functional enzymes from many species [18], and in vitro conditions seldom represent those experienced in vivo [19]. According to the different versions of the FvCB model [1,20-23], up to 8 parameters (Vcmax, Kco, J, Tp, α, gm, Rd, and Γ*) can be estimated from an analysis of the response of the net assimilation rate (An) to intercellular CO2 concentration (Ci) if enough accurate data points are available [2].
There are numerous publications discussing the different parameterization methods associated with An/Ci curves [2,12,24-29]. Each method relies on different assumptions and has technical limitations. Most approaches assume that α = 0 and that Kco and Γ* can be chosen a priori from estimates in previous studies in order to determine Vcmax, Jmax, Tp, Rd and gm [2,12,22,26,27]. Gu et al. [2] extended the method of Ethier and Livingston [12] to propose an exhaustive dual optimization (EDO) approach to estimate the parameters from fitting An/Ci curves. All of the curve-fitting methods minimize an objective function, e.g., the sum of squared errors (SSE), based on the nonlinear FvCB model, with a limited number of measurements (typically 8-12 [2,24,26,30]) and the expected accuracy of a measured An/Ci curve (see [2,25,26] for comprehensive reviews). Depending on the equations used for fitting the parameters, two major groups of methods can be distinguished: Group I directly fits parameters with the FvCB model [14,25-27,31] and Group II fits parameters with a quadratic equation [2,12]. The results are sensitive to the method used: the estimated parameters can be substantially different when different An/Ci curve-fitting methods are applied to the same dataset [2,25,26,29]. It is frequently difficult to determine which method is superior based on the SSE because of the characteristics of the FvCB model [2], the assumption that the assigned values of the kinetic properties of Rubisco are the correct ones, and the number of parameters to be estimated [24].
Some studies have compared different fitting methods for determining the parameters of the FvCB model. Miao et al. [26] compared six methods using 160 randomly selected An/Ci datasets from four shrubby indicator species: highbush blueberry (Vaccinium corymbosum L.), dangleberry (Gaylussacia frondosa L.), coastal fetterbush (Eubotrys racemosa L.) and sweet pepperbush (Clethra alnifolia L.). They concluded that the method developed by Sharkey et al. [27] was among the 'best', based on the lowest minimum root mean square error. Gu et al. [2] stated that their approach could estimate reliable FvCB parameters using error-free synthetic An/Ci curves and predicted limited states that matched chlorophyll fluorescence patterns from actual datasets.
An erroneous determination of the FvCB model parameters can lead to inaccurate predictions of ecosystem productivity, because the potential errors can worsen when moving to larger temporal and spatial scales (e.g., from field measurements over short periods to ecosystem predictions over long periods) [32,33]. However, to the best of our knowledge, there has been no comprehensive test of these fitting methods using common generated datasets with superimposed measurement errors. The objective of this study was to assess two approaches for fitting the FvCB model: Group I (Sharkey et al.'s method) and Group II (Gu et al.'s method).

The FvCB Model and Characteristics
An/Ci curves are fitted with the FvCB model for C3 leaves [1] that accounts for gm, whereby An is determined by the most limiting of three carboxylation rates; in the model equations, min{} denotes 'the minimum of'.
α in Equation (6) is often set to zero, in which case Equation (6) reduces to a simpler form (Equation (7)). In these equations, Wc, Wj and Wp are the carboxylation rates limited by Rubisco (Ac state), ribulose 1,5-bisphosphate (RuBP) regeneration (Aj state) and triose phosphate utilization (TPU) (Ap state), respectively; Equation (3) is one of the points where the Aj state is equal to the An state (see the next section for an explanation); Cc is the chloroplastic CO2 partial pressure and can be estimated from Ci, An and gm (Equation (8)). A typical condition for assessing An/Ci curves at saturating light levels is to assume that J approaches Jmax, the maximum rate of electron transport. If light is not saturating at the time of measurement, Jmax must be calculated from J [24]. In this study, we assume that Jmax = J.
Combining Equations (1)-(8), the relationship between An and Ci in the Ac, Aj and Ap states can be expressed as three hyperbolic segments [2,12] (Equation (9)), where the subscripts c, j and p refer to the Ac, Aj and Ap states, respectively, and gm, p, q and u are the four 'coefficients' in each segment of an An/Ci curve. The general form of Equation (9) is the quadratic Equation (10), whose coefficients take state-specific forms in the Ac, Aj and Ap states; when α = 0, the expression for the Ap state simplifies further. Mathematically, there are two positive roots for the quadratic Equation (10), but only one root meets the constraints of the FvCB model.
That root, Equation (16), describes An as a function of Ci [12] and consists of up to three segments (states). The three states share one common coefficient gm, while the other coefficients (p, q and u) are combinations of the common parameters gm, Γ* and Rd and the state-specific parameters Vcmax and Kco in the Ac state, Jmax in the Aj state, and Tp and α in the Ap state.
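As an illustration of how An can be computed as a function of Ci, the following minimal sketch uses the rate expressions as they are commonly written in the FvCB literature, with the Ap state in the form that includes α. The dictionary keys, the function names and the choice of the smaller quadratic root are assumptions of this sketch, and the expressions may differ in detail from the numbered equations of this study.

```python
import math

def state_coefficients(p):
    """Map each state to (x1, x2) in A = x1 * (Cc - Gamma*) / (Cc + x2) - Rd."""
    g = p["gamma_star"]
    return {
        "c": (p["vcmax"], p["kco"]),                          # Rubisco limited
        "j": (p["jmax"] / 4.0, 2.0 * g),                      # RuBP regeneration limited
        "p": (3.0 * p["tp"], -(1.0 + 3.0 * p["alpha"]) * g),  # TPU limited
    }

def rate_at_cc(cc, x1, x2, p):
    """Net assimilation of one state at a given chloroplastic CO2 (Cc)."""
    return x1 * (cc - p["gamma_star"]) / (cc + x2) - p["rd"]

def rate_at_ci(ci, x1, x2, p):
    """Substitute Cc = Ci - A/gm and solve the resulting quadratic in A."""
    gm, rd, g = p["gm"], p["rd"], p["gamma_star"]
    a = 1.0 / gm
    b = -(ci + x2 + (x1 - rd) / gm)
    c = x1 * (ci - g) - rd * (ci + x2)
    disc = b * b - 4.0 * a * c
    if disc < 0.0:
        raise ValueError("no real root for this state at this Ci")
    # Assumption of this sketch: the smaller root is the admissible one.
    return (-b - math.sqrt(disc)) / (2.0 * a)

def net_assimilation(ci, p):
    """Return (An, limiting state) as the minimum over the three states."""
    coeffs = state_coefficients(p)
    ac = rate_at_ci(ci, *coeffs["c"], p)
    if ci - ac / p["gm"] < p["gamma_star"]:
        return ac, "c"  # An = Ac is imposed when Cc < Gamma*
    candidates = {"c": ac}
    for state in ("j", "p"):
        try:
            candidates[state] = rate_at_ci(ci, *coeffs[state], p)
        except ValueError:
            pass  # this state cannot be limiting at this Ci
    state = min(candidates, key=candidates.get)
    return candidates[state], state
```

With hypothetical parameter values such as Vcmax = 100, Kco = 70, Jmax = 180, Tp = 12, Rd = 1.5, gm = 3, Γ* = 4 and α = 0, the limiting state moves from Ac to Aj to Ap as Ci increases, which is the fixed order of states described in the next section.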

Constraints for the Parameters
To generate datasets for testing the methods, the following constraints on the parameters are required. All parameters are greater than zero to keep their biological meanings. The FvCB model is monotonically increasing in both the Ac and Aj states; in the Ap state it is monotonically decreasing when α > 0 and constant when α = 0. The three states follow a fixed order along the Ci axis [2].
If an An/Cc curve consists of both Ac and Aj states, there are, mathematically, two possibilities: (1) The two states could be exactly the same if Jmax/4 = Vcmax and Kco = 2Γ*. However, since Kco > 2Γ* [2,22], the two states cannot be the same. (2) There are two intersection points where Anc = Anj. The first point is at Cc = Γ*; Anj < Anc when Cc < Γ*. We define An = Anc when Cc < Γ* to create the fixed order of Ac and Aj states (Equation (3)). The second point is defined as the transition point Ccc_cj, obtained by combining Equations (4) and (5). Since 2Γ*/Kco < 1, this transition point yields constraints on Vcmax and Jmax. When the Aj and Ap states coexist, Jmax is constrained as a function of Tp [2]; when the Ac and Ap states coexist, Vcmax is constrained as a function of Tp; and when all three states coexist, a combined set of constraints applies (Equation (21)).
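The second intersection point can be illustrated with a short sketch. It assumes the commonly used forms Wc = Vcmax·Cc/(Cc + Kco) and Wj = Jmax·Cc/(4Cc + 8Γ*); equating them gives the transition point below. The function names and the positivity check are assumptions of this sketch and are not necessarily the exact form of the constraints used in this study.

```python
def transition_cc_cj(vcmax, jmax, kco, gamma_star):
    """Cc at which Wc = Wj, i.e. the Ac/Aj transition (other than Cc = Gamma*)."""
    return (kco * jmax - 8.0 * gamma_star * vcmax) / (4.0 * vcmax - jmax)

def has_positive_transition(vcmax, jmax, kco, gamma_star):
    """True when the transition point is positive.

    With Kco > 2 * Gamma*, this requires
    Jmax / 4 < Vcmax < Kco * Jmax / (8 * Gamma*).
    """
    return jmax / 4.0 < vcmax < kco * jmax / (8.0 * gamma_star)

# Hypothetical parameter values, for illustration only.
cc_t = transition_cc_cj(vcmax=100.0, jmax=180.0, kco=70.0, gamma_star=4.0)
wc = 100.0 * cc_t / (cc_t + 70.0)
wj = 180.0 * cc_t / (4.0 * cc_t + 8.0 * 4.0)
```

For these illustrative values the two carboxylation rates coincide at the computed transition point, and the Ac state lies below the transition while the Aj state lies above it.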

Criteria for Deriving Parameters
An involved parameter is a parameter that is present in the given dataset. A non-resolvable parameter is defined as an involved parameter whose value cannot be correctly derived. Resolvable, non-resolvable, and non-involved parameters were defined by Gu et al. [2], where more detailed information on these definitions can be found. An estimated parameter is considered equal to its "true value" if the two agree to three decimal places when the lowest precision of the An/Ci dataset is three decimal places or higher, or to the lowest precision of the An/Ci dataset when that precision is less than three decimal places. Because of limitations in measurement accuracy, the number of data points in the An/Ci dataset, and the fitting methods used, an estimated parameter might be equal to its "true value" by chance. To stop this from biasing our analysis, a parameter is counted as correctly estimated only if all resolvable parameters in the same dataset are equal to their "true values". If the numbers of data points in the Ac, Aj and Ap states are x, y and z, respectively, the data point distribution is written as (x, y, z). More detailed information on the theoretical resolvability of parameters can be found in Gu et al. [2].
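The equality criterion and the definition of a correctly estimated parameter can be sketched as follows. The function names and the dictionary layout are hypothetical, and comparing values after rounding is one possible reading of "equivalency at the accuracy of three decimal places".

```python
def equal_at_precision(estimate, true_value, decimals=3):
    """Compare two values after rounding both to the given number of decimals."""
    return round(estimate, decimals) == round(true_value, decimals)

def correctly_estimated(estimates, true_values, resolvable, decimals=3):
    """A dataset counts as correct only if every resolvable parameter matches."""
    return all(
        equal_at_precision(estimates[name], true_values[name], decimals)
        for name in resolvable
    )
```

A parameter that happens to match by chance is therefore not counted when any other resolvable parameter of the same dataset is wrong.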
In practice, all measurements are subject to measurement errors (Appendix A). Therefore, the measured points lie near to, but not on, the theoretical curve, and fitting techniques are needed to obtain the best approximation of the "true value". The common methods for estimating the parameters of the FvCB model are based on the minimization of an objective function that characterizes the goodness-of-fit of a particular curve with respect to the given set of data points. For example, the parameters are estimated by minimizing the SSE [2,29]:
$$\mathrm{SSE} = \sum_{i=1}^{nc} (A_{ci} - A_{cmi})^2 + \sum_{j=1}^{nj} (A_{jj} - A_{jmj})^2 + \sum_{k=1}^{np} (A_{pk} - A_{pmk})^2 \qquad (22)$$
where Aci, Ajj and Apk are the calculated net assimilation rates at points i, j and k in the Ac, Aj and Ap states, respectively; Acmi, Ajmj and Apmk are the measured counterparts; and nc, nj and np are the numbers of points in the respective states. The objective functions used in the methods of Sharkey et al. and Gu et al. are similar to Equation (22); however, the calculation procedures are different.
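A minimal sketch of such an objective function is given below, assuming that the calculated and measured net assimilation rates are stored per state in dictionaries keyed by "c", "j" and "p". This data layout is an assumption made for illustration only.

```python
def sse(calculated, measured):
    """Sum of squared errors over the Ac, Aj and Ap states.

    Both arguments map a state label ("c", "j", "p") to a list of net
    assimilation rates; a missing state contributes nothing to the sum.
    """
    total = 0.0
    for state in ("c", "j", "p"):
        calc = calculated.get(state, [])
        meas = measured.get(state, [])
        for a_calc, a_meas in zip(calc, meas):
            total += (a_calc - a_meas) ** 2
    return total
```

Because each point is evaluated with the submodel of the state to which it is assigned, the value of the objective function depends on the assumed data point distribution (x, y, z) as well as on the parameter values.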

Generation of Datasets
The parameters were randomly chosen from selected parameter ranges with three decimal places. Vcmax varied from 20.000 to 160.000 µmol m−2 s−1, Jmax from 20.000 to 250.000 µmol m−2 s−1, Tp from 5.000 to 15.500 µmol m−2 s−1, Rd from 0.010 to 5.000 µmol m−2 s−1, gm from 0.100 to 30.000 µmol m−2 s−1 Pa−1, Γ* from 0.100 to 5.000 Pa, Kco from 20.000 to 100.000 Pa, and α from 0.001 to 1.000. As a special case, α was set to 0. All parameter sets were required to satisfy the constraints given in the section on constraints for the parameters (Equation (21)). Ci ranged from 5 to 150 Pa in both cases. We assumed that the datasets were collected at a leaf temperature of 25 °C and an air pressure of 100 kPa. Each set of parameters (except for α) was used to generate datasets with either α = 0 or α > 0. There were 200 datasets within each accuracy level, of which 100 used α = 0 and 100 used α > 0.
There were three accuracy subgroups (high, normal, and varied). Firstly, a high accuracy dataset has the generated An and Ci accurate to eight decimal places. Secondly, a normal accuracy dataset is a high accuracy dataset rounded to the precision of a typical open gas exchange system, e.g., the Li-6400 (Li-Cor, Inc., Lincoln, NE, USA). In this case, An is rounded to three decimal places if An < 1.000 µmol m−2 s−1, to two decimal places if 1.00 µmol m−2 s−1 < An < 10.00 µmol m−2 s−1, and to one decimal place if An > 10.0 µmol m−2 s−1; Ci is rounded to one decimal place if Ci < 100 µmol mol−1 and to an integer if Ci > 100 µmol mol−1.
Measured datasets commonly contain measurement errors. For the datasets with measurement errors, the errors in An and Ci were calculated according to Equations (A5) and (A6), respectively, in the appendix. The precision of these datasets is the same as that of a normal accuracy dataset.
Thirdly, varied accuracy datasets were generated without measurement errors, with either a varied number of data points or varied accuracy:
(i) Datasets with varied data points. The accuracy of these datasets was eight decimal places, and the numbers of data points were 4, 5, 6, 7, 8, 9, and 12. These datasets were used to evaluate the impact of the number of data points on parameterization. It should be noted that these datasets are included in the high accuracy dataset.
(ii) Datasets with varied accuracy. These datasets had either eight or 12 data points, and accuracies ranged from one to eight decimal places. These datasets were used to identify the impact of accuracy on parameterization.
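The rounding rules for a normal accuracy dataset can be sketched as follows. The treatment of values that fall exactly on a threshold (An = 1 or 10, Ci = 100) is not specified by the rules above and is an assumption of this sketch, as are the function names; Python's built-in round is used, which rounds exact ties to the nearest even digit.

```python
def round_an(an):
    """Round An (µmol m-2 s-1) to the precision of a normal accuracy dataset."""
    if an < 1.0:
        return round(an, 3)
    if an <= 10.0:  # assumption: boundary values are grouped with this range
        return round(an, 2)
    return round(an, 1)

def round_ci(ci):
    """Round Ci (µmol mol-1) to the precision of a normal accuracy dataset."""
    if ci < 100.0:
        return round(ci, 1)
    return float(round(ci))  # assumption: Ci = 100 is rounded to an integer

def to_normal_accuracy(points):
    """Apply both rules to a list of (Ci, An) pairs from a high accuracy dataset."""
    return [(round_ci(ci), round_an(an)) for ci, an in points]
```

Applying to_normal_accuracy to a high accuracy dataset reproduces the loss of precision studied here without adding any measurement error.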

Gu et al.'s Method
Gu et al. [2] developed a four-step method to estimate the parameters of the FvCB model.
(1) Enumeration of all possible data point distributions among the three states of a given dataset. The three limited states must follow a certain pattern along the Ci axis in an order dictated by the FvCB model. For parameters to be resolvable, the minimum numbers of data points (3 or 0, 2 or 0, 3 or 0) are required, and the number of data points must be higher than the number of parameters to be derived. Under these conditions, the resolvable parameters are defined as Gu et al.'s resolvable parameters to differentiate them from resolvable parameters in the general case. Thus, a minimum data point distribution of (3, 2, 3) and a minimum of nine observed data points are required for all eight parameters to be resolvable. These conditions are referred to below as the requirements of Gu et al.
(2) Fitting the FvCB model to each limited state distribution separately. In this step, the transition points are never calculated and the carboxylation rates in different states are never compared. An is calculated with the submodel of the limited state to which the data point is assigned.
(3) Detection and correction of inadmissible fits. Gu et al. [2] defined "inadmissible fits" as cases where the limitation states of the points in the An/Ci curve are not consistently or correctly identified by the derived parameters. This step is only used for a dataset that contains multiple limited states. If the calculated limited state distribution is the same as the assigned limited state distribution, then the fit is admissible; otherwise, the fit is inadmissible and is corrected via a penalization strategy.
(4) Selection of the best fit. The best fit for an observed set of data is the limited state distribution that gives the smallest value of the minimized objective function. If the values of the minimized objective function are equal across different limited state distributions, the one with fewer parameters is selected.
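Steps (1) and (4) can be sketched as follows. The sketch assumes that the minimum numbers of data points are applied as a filter during enumeration and that each candidate fit is summarized by a tuple (objective value, number of parameters, distribution). Both choices, and the function names, are illustrative assumptions rather than the implementation of Gu et al.

```python
def enumerate_distributions(n_points, minimums=(3, 2, 3)):
    """List (x, y, z) splits of n_points ordered points into Ac, Aj, Ap states.

    A state is either absent (0 points) or holds at least its minimum number
    of points; the order of states along the Ci axis is fixed.
    """
    result = []
    for x in range(n_points + 1):
        for y in range(n_points - x + 1):
            counts = (x, y, n_points - x - y)
            if all(c == 0 or c >= m for c, m in zip(counts, minimums)):
                result.append(counts)
    return result

def select_best_fit(fits):
    """Pick the fit with the smallest objective; ties go to fewer parameters."""
    return min(fits, key=lambda fit: (fit[0], fit[1]))
```

For a dataset with eight points, the only enumerated distribution in which all three states are present is (3, 2, 3), which is consistent with the requirement of at least nine points for all eight parameters to be resolvable.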

Sharkey et al.'s Method
Sharkey et al.'s method requires an initial set of estimated parameter values to be assigned and iteratively improves this set. The algorithm starts with two initial transition values, one between the Ac and Aj states and one between the Aj and Ap states; it then changes the values until the objective function is minimized. Since the parameters Kco, α and Γ* are assigned a priori, a maximum of five parameters are estimated. The three states share the common parameters gm and Rd, with the state-specific parameters Vcmax in the Ac state, Jmax in the Aj state, and Tp in the Ap state. Thus, the minimum data point distribution is (1, 1, 1) and the minimum number of points is five for all five parameters to be resolvable. If the data points are only distributed in one state or in two states, the minimum number of data points is three or four, respectively, for all involved parameters to be resolvable. It should be noted that, for the same dataset, the number of Gu et al.'s resolvable parameters is different from that of Sharkey et al.'s, and both are different from the resolvable parameters in the general case.

Parameter Calculations
The parameter estimations were conducted from May 2011 to March 2012. The generated datasets were uploaded to the website http://Leafweb.ornl.gov to estimate the parameters by Gu et al.'s method. The detailed procedures can be found on the website. The Excel spreadsheet software created by Sharkey et al. [27] (An/Ci curve fitting utility version 1.1) was used to test Sharkey et al.'s method.

High Accuracy Dataset
Using Gu et al.'s method, all eight parameters were correctly retrieved in 23 of the high accuracy datasets with α > 0. In fact, there were a total of 33 datasets with all eight resolvable parameters, of which ten did not meet the requirements set by Gu et al. for the minimum data point distribution (3, 2, 3) to retrieve all eight parameters. For datasets with α = 0, all eight parameters were correctly derived in 24 of the 25 datasets that met the requirements of Gu et al. for all eight resolvable parameters. The one exception was a (3, 2, 10) dataset, which resulted in poor parameter estimation. These results imply that, when a high accuracy dataset met Gu et al.'s requirements for all resolvable parameters, Gu et al.'s method obtained a full set of parameters for a dataset with α > 0, whereas it might not correctly estimate all eight parameters for a dataset with α = 0.
For datasets with Gu et al.'s partially resolvable parameters, the resolvable parameters may or may not be correctly estimated. For example, two datasets had the same data distribution (9, 0, 6). For one dataset, all of Gu et al.'s partially resolvable parameters were correctly estimated, but for the other, no parameter was correctly estimated. Tp and α were non-involved parameters in this dataset, where the missing Ap state was incorrectly estimated because Tp was calculated by fitting the An/Ci curve with a sigmoid function and fixed α = 0. For example, one dataset (0, 15, 0) was identified as (5, 10, 0), and the non-involved parameters Vcmax, Kco, Tp and α were incorrectly derived.
For a dataset with one or two states that did not meet the minimum data point requirements set by Gu et al. for resolvable parameters, the specific parameters in this state could not be correctly estimated. For example, the method forced a dataset distribution of (7, 6, 2) to (7, 5, 3), leading to an incorrect estimation of all eight parameters. The numbers of estimated parameters were larger than the numbers of resolvable parameters, which in turn were larger than the numbers of correctly estimated parameters. For example, for Vcmax, Jmax and Tp, when α > 0, the numbers of estimated parameters were 91, 74 and 100; the numbers of resolvable parameters were 71, 72 and 61; and the numbers of correctly estimated parameters were 52, 49 and 42, respectively (Table 1). When α = 0, the numbers were 86, 92 and 100 for estimated parameters, 44, 62 and 45 for resolvable parameters, and 37, 46 and 30 for correctly estimated parameters, respectively (Table 1). Although gm was a resolvable parameter in any dataset, it could not always be correctly estimated.
Using Gu et al.'s method, more than half of the values of Vcmax, Kco, Jmax, gm, Tp and Γ* ranged within ±10% of error (Table 1). Some estimated values of gm could be very large, up to 100,000 µmol m−2 s−1 Pa−1 (Figure 1E). The estimated Rd values were more variable in comparison with other parameters, and many of the values were larger than the upper limit of 5.000 µmol m−2 s−1 used to generate datasets (Figure 1D). Most estimates of α were zero or very close to their "true values" when α > 0 (Figure 1H). The incorrectly estimated values of Vcmax, Kco, Jmax, Γ* and α showed a tendency to be underestimated, while Rd tended to be overestimated. About half of the incorrectly estimated gm values (except for extreme values) were overestimated, while Tp was evenly distributed around the 1:1 line (y = 1.002x, R² = 0.941). The uneven distributions of the estimated parameters Vcmax, Kco, Γ*, Rd and α around the 1:1 line implied that the averages of these parameters may not be close to their "true values".
Notes to Table 1. Correctly estimated: the estimated parameter by a specific method with the same value as the "true value"; Total estimated: total estimated parameters including correctly and incorrectly estimated parameters; Error within ±10%: the ranges of the error of estimated parameters within ±10% of the "true value". a The number of resolvable parameters is the same for all datasets (HDS: high accuracy dataset, NDS: normal accuracy dataset and DSE: dataset with measurement errors) when α > 0 (or α = 0) using the method of Gu et al. [2], and using the method of Sharkey et al. [27], respectively; b The number of correctly estimated parameters is the same for datasets using the method of Sharkey et al. [27]; c For values of α, the differences within 0.1 were listed; d The numbers of estimated parameters were the same for all datasets using the method of Sharkey et al. [27]; e The number of correctly estimated parameters is the same for all datasets (using the method of Gu et al. [2]).
Sharkey et al.'s method [27] was unable to correctly estimate any parameter for any dataset, whether α = 0 or α > 0 (Table 1). All five unknown parameters (Vcmax, Jmax, gm, Tp and Rd) could be estimated for any dataset, including non-involved parameters in some datasets (Table 1), since Sharkey et al.'s method incorrectly forced data points into a missing state to minimize the objective function. The numbers of estimates within ±10% of error for Vcmax, Jmax, gm, Tp and Rd were 22, 65, 41, 7 and 6, respectively, when α > 0, and 23, 64, 63, 8 and 5, respectively, when α = 0. Vcmax and Jmax had extreme values in some datasets. Many values of gm reached their upper limit of 30 µmol m−2 s−1 Pa−1. Overall, Vcmax and gm were overestimated, about 40% of the Rd values were zero, Jmax was approximately evenly distributed around the 1:1 line (y = 1.016x, R² = 0.462) with a few extremely large values, and Tp was approximately evenly distributed around the 1:1 line (y = 0.987x, R² = 0.866) (Figure 1).
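The two summary statistics used throughout the results, the number of estimates within ±10% of their "true values" and the slope of the fitted line relative to the 1:1 line, can be sketched as follows. The sketch assumes that the reported lines of the form y = kx are least-squares fits through the origin; this, and the function names, are assumptions made for illustration.

```python
def count_within(estimates, true_values, tolerance=0.10):
    """Count estimates whose relative error is within the given tolerance."""
    return sum(
        1
        for est, true in zip(estimates, true_values)
        if abs(est - true) <= tolerance * abs(true)
    )

def slope_through_origin(true_values, estimates):
    """Least-squares slope k of estimates = k * true_values (no intercept)."""
    numerator = sum(x * y for x, y in zip(true_values, estimates))
    denominator = sum(x * x for x in true_values)
    return numerator / denominator
```

A slope close to 1 indicates that the estimates are evenly distributed around the 1:1 line on average, but it does not by itself indicate how many individual estimates fall within ±10%.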

Normal Accuracy Datasets
For normal accuracy datasets with both α > 0 and α = 0, no parameters were correctly estimated using Gu et al.'s method. In comparison with the high accuracy dataset results, this suggests that the accuracy of An/Ci data was important for correctly determining the parameters. The total number of estimated parameters was larger than the total number of resolvable parameters (except for gm) (Table 1). The number of estimated parameters was also larger than for the high accuracy datasets, since more non-involved parameters were incorrectly estimated. Compared to the high accuracy datasets, more datasets with a missing state were incorrectly identified as datasets with all three states available by changing a missing state to an available one. For example, a dataset (3, 0, 12) was assigned to (4, 2, 9). The total numbers of values within ±10% of their "true values" for Vcmax, Kco, Jmax, Tp, gm, Rd, Γ* and α were 55, 40, 65, 70, 26, 16, 48 and 26 for α > 0, and 43, 27, 64, 67, 20, 12, 42 and 99 for α = 0, respectively (Table 1). These numbers were lower than those for the corresponding parameters in the high accuracy dataset. Overall, Vcmax and Kco were underestimated, Rd was overestimated and gm was generally underestimated. In some cases, gm was overestimated when the estimated value was higher than 20 µmol m−2 s−1 Pa−1. Most of the estimated values of Γ*, Jmax and Tp were evenly distributed around the 1:1 line (y = 0.952x and R² = 0.531 for Jmax, y = 1.001x and R² = 0.958 for Tp) (Figure 2).

Using Sharkey et al.'s method [27], no parameters were correctly estimated for either α = 0 or α > 0 (Table 1). All five unknown parameters (Vcmax, Jmax, gm, Tp and Rd) were estimated for any dataset (Table 1). The numbers of values within ±10% of error for Vcmax, Jmax, gm, Tp and Rd were 24, 70, 6, 57 and 6, respectively, when α > 0, and 22, 59, 8, 51 and 8, respectively, when α = 0. Vcmax and Jmax had extreme values in some datasets. Many values of gm reached their upper constrained value of 30 µmol m−2 s−1 Pa−1 for both α = 0 and α > 0. Overall, the parameters Vcmax and gm were underestimated, Jmax and Tp were evenly distributed around the 1:1 line (y = 0.991x and R² = 0.572 for Jmax, y = 0.976x and R² = 0.800 for Tp), and many values of Rd exceeded the upper limit of 5.000 µmol m−2 s−1 used to generate the datasets (Figure 2).

Datasets with Measurement Errors
Datasets with measurement errors are representative of observed data. Using Gu et al.'s method, the number of correctly estimated parameters was zero. The common parameters gm, Rd and Γ*, and the specific parameters Tp and α were estimated for all 100 datasets. The specific parameter Vcmax, Kco, or Jmax was estimated whenever its state was identified (Table 1). The number of estimated parameters was higher than that in normal accuracy datasets, indicating that more parameters would be estimated when using a less accurate dataset. The total numbers of values within ±10% of their "true values" for Vcmax, Kco, Jmax, Tp, gm, Rd, Γ* and α were 30, 10, 61, 64, 7, 7, 36 and 23 when α > 0, and 22, 7, 50, 51, 5, 8, 23 and 78 when α = 0, respectively (Table 1). The distributions of estimated parameters (except for Tp) were more scattered than those of normal accuracy datasets. Most of the values for Vcmax, Kco, gm and Γ* were underestimated (Figure 3A,G,E,F), while most of the values for Rd and α were overestimated (Figure 3D,H). About half of the values of Tp and Jmax were overestimated.
Using Sharkey et al.'s method, there were no correctly estimated parameters for either α > 0 or α = 0 (Table 1). The numbers of values of gm, Rd, Vcmax, Jmax and Tp within ±10% of error were 8, 5, 24, 70 and 60 when α > 0, and 4, 7, 26, 61 and 63 when α = 0, respectively. The numbers of estimated parameters within ±10% of error were similar between datasets with α > 0 and α = 0, and between normal accuracy datasets and datasets with measurement errors. This observation suggests that the impacts of the value of α and of measurement errors on the estimated parameters were insignificant in datasets with large errors when using Sharkey et al.'s method. Overall, the distributions of the estimated parameters were similar to those of a normal accuracy dataset (Figure 3).
Figure 3. The same as in Figure 2; here applied to datasets with measurement errors.

Datasets with Varied Data Points
Table 2 summarizes the results of the two methods using high accuracy datasets with a varied number of data points.Gu et al.'s method was unable to guarantee the correct estimation of any parameter values when the number of data points was eight or less.For datasets with nine and 12 points, all eight parameters could be correctly estimated if a dataset met the requirements set by Gu et al. for all eight resolvable parameters.However, the actual number of datasets with all eight resolvable parameters was much higher than the number of datasets that met the requirements of Gu et al. for all eight resolvable parameters.Gu et al.'s method could estimate the parameters g m , R d , Γ*, T p and α in any dataset (Table 2), though many of them were non-resolvable.Generally, the ratio of correctly estimated parameters to estimated parameters increased (except for K co and Γ*) with an increasing number of observed data points for the A n /C i curve.The same values for all parameters using the method of Sharkey et al.
None of the parameters from the datasets were correctly estimated by Sharkey et al.'s method (Table 2). For Jmax and Tp, the percentage of estimates within ±10% error, relative to the number of estimated parameters, was about 33% when the number of data points was four. For gm, Rd and Γ*, with 12 data points, this percentage ranged from 0 to 20%. The estimated parameters were unevenly distributed around their "true values".

Datasets with Varied Accuracy
Gu et al.'s method was unable to correctly estimate any parameter when the accuracy of the datasets was four or fewer decimal places. For the datasets with 5, 6, 7 and 8 decimal places, the numbers of datasets with correctly estimated parameters were 1, 4, 4 and 9 for datasets with eight data points, and 1, 9, 24 and 30 for datasets with 12 data points, respectively. If the number of data points was eight or fewer, there was no correct estimate for all eight parameters and no guarantee of correctly estimating any parameter. The number of datasets with a correctly estimated parameter increased with an increasing number of decimal places. For a dataset with 12 data points, it was possible to obtain all eight correct parameters when the accuracy was five or more decimal places; however, to guarantee that all eight parameters are correctly estimated, a dataset must meet the requirements of Gu et al. for all eight resolvable parameters, and the accuracy must be seven or more decimal places. In addition, in Gu et al.'s method, Vcmax was underestimated with an error of about 13-17% for datasets with eight data points and 11-15% for datasets with 12 data points. Jmax was slightly overestimated, with an error of about 0-7% for datasets with eight data points and 1-4% for datasets with 12 data points. Tp was overestimated, with an error of about 8-16% for datasets with eight data points and 7-11% for datasets with 12 data points. The estimated parameters gm, Kco, Rd, α and Γ* had relatively large errors; for example, the errors of the mean values of Rd ranged from 80 to 243%.
Sharkey et al.'s method was unable to correctly estimate any parameter for datasets with varied accuracy. The estimated parameters did not change from their initial values when the accuracies of the datasets were five or more decimal places. Compared to the results using Gu et al.'s method, all the estimated parameters had less variation, as indicated by their smaller standard deviations (data not shown). The mean values of Vcmax were overestimated, with errors of 18% for datasets with eight data points, and 12-16% for datasets with 12 points. The mean values of Jmax were close to the "true value", with errors of 5% for datasets with eight data points, and 5-7% for datasets with 12 points. The means of Tp were overestimated, with errors of 12-13% for datasets with eight data points, and 0-2% for datasets with 12 points. Parameters gm and Rd had relatively large errors. The relative changes of the mean gm and Rd ranged from 164 to 694%, and from −43 to 159%, respectively.
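To make the notion of "accuracy in decimal places" concrete, the following minimal Python sketch rounds a few error-free An values and reports the largest rounding error at each accuracy level. The values in `an_exact` and both helper functions are hypothetical illustrations, not the datasets or code used in this study.

```python
def round_dataset(values, decimals):
    """Round every value to the given number of decimal places."""
    return [round(v, decimals) for v in values]


def max_rounding_error(values, decimals):
    """Largest absolute difference between the original and rounded values."""
    rounded = round_dataset(values, decimals)
    return max(abs(v - r) for v, r in zip(values, rounded))


# Hypothetical error-free An values (µmol m-2 s-1); not taken from the paper.
an_exact = [5.12345678, 12.98765432, 20.55555555]

for decimals in (1, 4, 8):
    err = max_rounding_error(an_exact, decimals)
    print(f"{decimals} decimal places: max rounding error = {err:.2e}")
```

The rounding error is bounded by half a unit in the last retained decimal place, so a dataset reported to one decimal place can deviate from the underlying curve by up to 0.05 µmol m-2 s-1 before any measurement noise is considered.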

Discussion
Using high accuracy datasets, Gu et al.'s method was unable to correctly estimate the resolvable parameters of datasets that did not meet the requirements set by Gu et al. for resolvable parameters. This is because Gu et al.'s method forced these datasets to have the minimum number of data points as (0 or 3, 0 or 2, 0 or 3), which was an altered distribution. Gu et al.'s method could only correctly derive the resolvable parameters if the dataset satisfied the requirements of Gu et al. for resolvable parameters. However, since the distribution of any observed dataset is unknown and there are many fitted parameters, one cannot identify which set of parameters is correct.
Using normal accuracy datasets with measurement errors, Gu et al.'s and Sharkey et al.'s methods were unable to correctly estimate any parameter. As shown in Figures 1-3, the sets of estimated parameters from a high accuracy dataset, a normal accuracy dataset and a dataset with measurement errors differed from one another. One of the main reasons for these differences was the accuracy of the datasets. Both methods were based on standard nonlinear regression, which assumes that the error of An is a random variable whose population mean is zero and whose variance is constant, and that Ci is an independent variable without any error. Since only a few data points are available in each limited state, the sample error will vary considerably, simply by chance. A point with a larger error would tend to have a larger deviation from the curve and so would have a larger impact on the SSE. In contrast, a point with a smaller error would have a smaller influence. Minimizing the SSE would therefore be inappropriate for datasets with only a few data points and relatively large errors.
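The disproportionate influence of a single noisy point on the SSE can be illustrated with a short Python sketch. The residuals below are hypothetical values for a limitation state with only three points; the function `sse_shares` is introduced here purely for illustration.

```python
def sse_shares(residuals):
    """Return each residual's fractional contribution to the sum of squared errors."""
    squared = [r * r for r in residuals]
    total = sum(squared)
    return [s / total for s in squared]


# Hypothetical residuals (µmol m-2 s-1) for a three-point limitation state.
residuals = [0.1, -0.1, 0.4]
shares = sse_shares(residuals)

for r, s in zip(residuals, shares):
    print(f"residual {r:+.1f} contributes {100 * s:.1f}% of the SSE")
```

Because the contributions scale with the square of the residual, the point with a residual four times larger than the others accounts for most of the SSE, so the fitted curve is pulled toward that point.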
The conditions for correctly estimating all eight parameters using Gu et al.'s method are a dataset with (1) a minimum of three data points for the limited state of Gu et al. for all resolvable parameters; (2) a minimum of nine data points for the An/Ci curve; (3) an accuracy of at least seven decimal places; and (4) α > 0. These requirements were necessary and sufficient conditions for Gu et al.'s method. If a dataset does not meet these conditions, Gu et al.'s method will be unable to guarantee a fit for any parameter. For example, in the dataset (2, 4, 2) with high accuracy, all the parameters were resolvable; however, Gu et al.'s method incorrectly identified it as (3, 5, 0), leading to all the parameters being incorrect (data not shown).
Sharkey et al.'s method (and other extant methods, see [26]) fits Vcmax, Jmax, Tp, Rd and gm simultaneously, using all the data points of an An/Ci curve, by fitting Equations (1)-(5) and (14), with fixed Kco and Γ*, and assuming α = 0. This approach simplifies the fitting method, but may introduce more errors to the estimated parameters if the wrong fixed values are used. Sharkey et al.'s method was unable to correctly estimate any parameters using all examined datasets. One of the major reasons for this could be the use of an incorrect value for the fixed parameters Kco, Γ* and α. There are different Kco and Γ* values to choose from in the literature [12,27]. In addition, Kco changes across diverse species and environmental conditions. One can see from Equations (9)-(13) that a change in one or more parameters may lead to changes in all the other parameters; for example, if a dataset is only in the Ac state, by combining Equations (1), (4) and (8) and assuming that all the parameters involved are independent, one can express Vcmax as a function of Ac, Ci, gm, Rd, Kco and Γ*. The direct impacts of Γ* and Kco on Vcmax are then given by the partial derivatives of this expression with respect to Γ* and Kco. From these expressions, one can see that the errors in Vcmax depend directly on the errors in Kco and Γ* and on the values of Ac, Ci, gm and Rd. Thus, incorrect values for Kco and/or Γ* will inevitably lead to incorrect parameter estimates. This result is in agreement with that of Ethier and Livingston [12].
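The sensitivity argument can be written out explicitly. The following is a sketch that assumes the standard Rubisco-limited form of the FvCB model with a finite mesophyll conductance; the notation may differ from that of Equations (1), (4) and (8), which are not reproduced in this section.

```latex
% Assumed Rubisco-limited net assimilation and chloroplastic CO2:
A_c = V_{cmax}\,\frac{C_c - \Gamma^*}{C_c + K_{co}} - R_d,
\qquad
C_c = C_i - \frac{A_c}{g_m}

% Solving for V_{cmax}:
V_{cmax} = \frac{(A_c + R_d)\,(C_c + K_{co})}{C_c - \Gamma^*}

% Direct (partial) effects of the fixed parameters on V_{cmax}:
\frac{\partial V_{cmax}}{\partial \Gamma^*}
  = \frac{(A_c + R_d)\,(C_c + K_{co})}{(C_c - \Gamma^*)^2}
  = \frac{V_{cmax}}{C_c - \Gamma^*},
\qquad
\frac{\partial V_{cmax}}{\partial K_{co}}
  = \frac{A_c + R_d}{C_c - \Gamma^*}
  = \frac{V_{cmax}}{C_c + K_{co}}
```

Under these assumptions both derivatives are positive whenever C_c > Γ*, so a fixed Kco or Γ* that is too large inflates the fitted Vcmax, and one that is too small deflates it.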
In Sharkey et al.'s method, the CO2 partial pressure inside the chloroplast, Ccs, was estimated by (a similar estimation is also used in the methods of Dubois et al. [24] and Miao et al. [26]):
$$C_{cs} = C_i - \frac{A_{nm}}{g_m} \qquad (26)$$
where Anm is the measured net assimilation rate. Equation (26) is not identical to Equation (8); An in Equation (8) is the calculated value in the algorithm. Thus, a minimization of SSE based on Equation (26) differs from one based on Equation (8). The parameterization of a four-point curve in the Aj state (Table 3) illustrates this problem. The dataset with eight decimal places was generated using the same value of Γ* as in Sharkey et al.'s method to eliminate the effects of different values of Γ*. The two sets of estimated parameters were different from their "true values", and the modeled Ajj values were slightly different from the 'measured' values. One can see that Sharkey et al.'s method was unable to correctly derive the parameters, even using a dataset with a high accuracy and the same fixed value as in the method.
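The distinction between using the measured and the calculated assimilation rate can be sketched in a few lines of Python. The sketch assumes that both Equation (8) and Equation (26) have the form Cc = Ci − An/gm and differ only in which An is inserted; the function name and all numerical values are hypothetical.

```python
def chloroplast_co2(ci, an, gm):
    """Chloroplastic CO2 partial pressure, Cc = Ci - An / gm (assumed form)."""
    return ci - an / gm


ci = 20.0        # intercellular CO2 partial pressure (Pa), hypothetical
gm = 3.0         # mesophyll conductance (µmol m-2 s-1 Pa-1), hypothetical
an_model = 16.0  # An calculated by the model (µmol m-2 s-1), hypothetical
an_meas = 16.3   # measured An, i.e. modeled value plus a 0.3 error, hypothetical

cc_model = chloroplast_co2(ci, an_model, gm)  # Equation (8)-style estimate
cc_meas = chloroplast_co2(ci, an_meas, gm)    # Equation (26)-style estimate
shift = cc_meas - cc_model                    # equals -(an_meas - an_model) / gm

print(f"Cc from modeled An:  {cc_model:.4f} Pa")
print(f"Cc from measured An: {cc_meas:.4f} Pa")
print(f"shift in Cc:         {shift:.4f} Pa")
```

Any error in the measured An is passed into Ccs scaled by 1/gm, so the two formulations evaluate the model at different chloroplastic CO2 values and therefore define different SSE surfaces.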
Because the distribution of data points is unknown, it is possible for Sharkey et al.'s method to assign data points to the wrong states to minimize SSE, especially if a state is missing. Table 4 shows an example in which a dataset with 12 data points contained only the Ac and Aj states. If the data points were assigned to the same states as when they were generated (I), the SSE was larger than if the last three data points were assigned to the Ap state, which in turn was larger than if the transitional point between Ac and Aj was also adjusted. The estimated parameters differed among the three conditions (except for Rd and gm) and were different from their "true values." It is worth noting that this is an intrinsic problem in Sharkey et al.'s method, because of the limited accuracy and number of data points of an An/Ci curve. It is very easy to miss a state, especially the Ap state [24]. This result can be explained by considering that the incorrect identification of the distribution of data points led to incorrect parameter estimation.
a The transition point was the same as in the corresponding error-free dataset generated by the true parameters; b the same as a, except that the last three points were assigned to the Ap state; c minimum of SSE, where the transition point of Ci between Ac and Aj is between 150 and 201 µmol mol−1, and between Aj and Ap it is between 552 and 673 µmol mol−1.
There were also different constraints found in other methods in the literature, such as the constraint of −3 < Rd < 50 µmol m−2 s−1 in Dubois et al. [24]. Firstly, if a parameter is estimated within the range of its constraint, the local minimum must be achieved in the range of the constraint. Secondly, the constraints are subjective choices which are probably not realistic; for example, we obtained Rd as zero by Sharkey et al.'s method (data not shown), and Γ* as zero (data not shown) by Gu et al.'s method. Thirdly, if a parameter is equal to its constraint, which is likely to be incorrect, this incorrect parameter may substantially affect the estimates of other parameters. The problems of finding a local minimum and of non-uniqueness of the parameters are intrinsic to nonlinear regression; for example, in Sharkey et al.'s method, the estimates of the parameters were sensitive to the initial values (Table 3). There are similar problems in the methods of Dubois et al. [24] and Miao et al. [26], as stated by Gu et al. [2].
However, the best-fit but unreliable parameter set may be used to predict An from Ci if the SSE is small, as argued by Ethier and Livingston [12]. The four An/Ci curves modeled by the three sets of parameters with 15 data points (7, 6, 2) were compared. The three sets of parameters were high accuracy, including measurement errors, and derived from curves generated using the parameters fitted to the dataset with measurement errors by the methods of Gu et al. and Sharkey et al., respectively. All curves are superposed.
The distributions of the estimated parameters (except for Tp) by both methods using a normal accuracy dataset and a dataset with measurement errors were very scattered. The majority of the estimated parameters were either overestimated or underestimated (Table 1, Figures 2 and 3), implying that the mean values of these estimated parameters could not represent the "true values". This can be explained by low data accuracies and by the intrinsic problems in both methods. The estimated Tp was evenly distributed around the 1:1 line, indicating that the mean Tp was close to its "true value"; this was because Tp is almost equal to An/3 (Equations (1), (6) and (7)), since An >> Rd in the Ap state.
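The approximation Tp ≈ An/3 can be made explicit with a short worked example. The sketch below assumes the simplified triose phosphate limited form with α = 0, which may differ in detail from Equations (6) and (7); the numerical values are hypothetical.

```latex
% Assumed TPU-limited net assimilation with \alpha = 0:
A_p = 3\,T_p - R_d
\quad\Longrightarrow\quad
T_p = \frac{A_n + R_d}{3} \approx \frac{A_n}{3}
\quad \text{when } A_n \gg R_d

% Relative error of the approximation:
\frac{T_p - A_n/3}{T_p} = \frac{R_d}{A_n + R_d}

% Hypothetical example: A_n = 30,\; R_d = 1\ \mu\text{mol m}^{-2}\,\text{s}^{-1}
T_p = \frac{31}{3} \approx 10.33,
\qquad
\frac{A_n}{3} = 10,
\qquad
\frac{R_d}{A_n + R_d} = \frac{1}{31} \approx 3.2\%
```

Under this assumption, Tp is determined almost entirely by the measured plateau of An and depends only weakly on the other fitted parameters, which is consistent with its even distribution around the 1:1 line.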

Conclusions
In this study, we tested Sharkey et al.'s [21] commonly used An/Ci curve fitting method, and the method developed recently by Gu et al. [2], using datasets with four to 15 points per An/Ci curve and accuracies from one to eight decimal places. The generated datasets were conservative. In the literature, the typical number of data points of an An/Ci curve is eight to 12, which is considered enough for estimation [2,24,26,30]. The accuracy of the measured data in a typical open gas exchange system (e.g., Li-6400) is one decimal place. The error level of the generated datasets was lower than normally seen in practice, because only one source of measurement error (the random noise of CO2, ±0.2 µmol mol−1) was imposed on the datasets; other error sources were not included [34].
To correctly estimate the parameters of an An/Ci curve, we have to consider data accuracy and the number and distribution of data points as well as error distribution. Based on the results using different generated datasets, we concluded that Sharkey et al.'s method failed to correctly estimate the parameters, while Gu et al.'s method was unable to correctly estimate the parameters using a dataset with fewer than five data points or with an accuracy of four or fewer decimal places. At least eight data points were required for Gu et al.'s method to correctly estimate all eight parameters. For the datasets with measurement errors and the same accuracy as a typical open gas exchange system (i.e., Li-6400), using Gu et al.'s approach, the parameters Vcmax, Kco, gm and Γ* were underestimated, while Rd and α were often overestimated. The distributions of Tp were evenly dispersed around the 1:1 line using both approaches. Using Sharkey et al.'s approach, the parameter Jmax was overestimated, Vcmax and gm were underestimated, and many values of Rd exceeded their upper limit of 5.000 µmol m−2 s−1. The mean values of all estimated parameters, except for Tp, were not close to their "true values". This failure of parameterization was due to two types of problems. One was the limited number of data points and the limited accuracy of the datasets, which, in both methods, did not meet the assumptions for nonlinear regression (measurement errors in both An and Ci). The other problem was the failure to identify correct parameter estimates using Gu et al.'s method, due to the unknown data point distribution.
This study revealed that high accuracy An/Ci and enzyme kinetic measurements are required to correctly estimate these parameters, even when sufficient data points are provided. An accurate estimation of the parameters can contribute to the enhancement of food security under climate change by reducing potential errors when the biological and biophysical processes of CO2 assimilation are correctly spatially and temporally scaled up for ecosystem studies. This study does not address the question of how these measurement accuracies can be improved. It is recommended that this question be addressed in a further study.

Figure 1. Comparison of the estimated parameters by Gu et al.'s method (black circle for α > 0 and black triangle for α = 0) and Sharkey et al.'s method (open circle for α > 0 and open triangle for α = 0) vs. true parameter values for synthetic An/Ci curves, 100 datasets with α > 0 and 100 with α = 0. The datasets consisted of 15 data points with high accuracy (eight decimal places). The points in each figure may have fewer than 100 values because some datasets do not contain all three states.


Figure 2. The same as in Figure 1; here applied to normal accuracy datasets.

Both the methods of Sharkey et al. and Gu et al. have assigned constraints for some parameters based on prior knowledge; for example, gm ≤ 30 µmol m−2 s−1 Pa−1 in Sharkey et al.'s method, and Rd ≤ 10 µmol m−2 s−1, gm ≤ 1,000,000 µmol m−2 s−1 Pa−1, and Γ* ≥ 0 Pa in Gu et al.'s method.
Γ* of 4.396 Pa, Kco of 43.616 Pa, and α of 0.352. The maximum number of data points in each state was nine. There were two subgroups in this group:

Table 1. Summary of the parameterization of Gu et al.'s and Sharkey et al.'s methods using An/Ci curves with 15 data points. Resolvable: the parameter can be correctly estimated by an appropriate method;

Table 2. Effects of the number of data points of An/Ci curves on the quality of the parameter estimates obtained from Gu et al.'s and Sharkey et al.'s methods. The "true values" of the parameters include Γ* = 4.396 Pa, Kco = 43.616 Pa and α = 0.352. The descriptions of the terms are the same as in Table 1.

Table 3. Parameters estimated from three datasets fitted by Sharkey et al.'s method. The dataset was generated with the fixed value of Sharkey et al.'s method (Γ* = 3.743 Pa −1) at a leaf temperature of 25 °C and an air pressure of 100 Pa. Ajj I and Ajj II and estimated parameters I and II are values estimated using two different sets of the initial values of the parameters. Ajmj and Ajj were measured (generated) values using Equation (8) and modeled values using Equation (26), respectively.

Table 4. Comparison of fitting results from a 12-point An/Ci curve using Sharkey et al.'s method by assigning a different transition point between the Ac, Aj and Ap states, as indicated by I, II, III or IV. The dataset was generated as containing only Ac and Aj states, and the transitional point of Ci is between 201 and 284 µmol mol−1.