Mathematical Modeling to Estimate Photosynthesis: A State of the Art

Abstract: Photosynthesis is a process that indicates the productivity of crops. This variable can be estimated through methods based on mathematical models, which are usually classified as empirical, mechanistic, or hybrid. To model photosynthesis mathematically, it is essential to know the input/output variables and their units; the type of model to be used according to its classification (empirical, mechanistic, or hybrid); the existing measurement methods and their invasiveness; and the forms of validation and the plant species required for experimentation. Until now, a collection of such information in a single reference has not been found in the literature, so the objective of this manuscript is to analyze the most relevant mathematical models for photosynthesis estimation and to discuss their formulation, complexity, validation, number of samples, units of the input/output variables, and the invasiveness of the estimation method. According to the state of the art reviewed here, 67% of the photosynthesis measurement models are mechanistic, 13% are empirical, and 20% are hybrid. These models estimate gross photosynthesis, net photosynthesis, photosynthesis rate, biomass, or carbon assimilation. This review therefore provides an update on the state of research and mathematical modeling of photosynthesis.


Introduction

Photosynthesis Process
Plants perform physiological functions that allow them to grow, develop, and reproduce. For their development and growth, plants require energy from sunlight, assimilated carbon dioxide (CO₂), hydrogen, mineral nutrients, oxygen (O₂), and a suitable temperature in both the air and the roots [1,2]. A fundamental physiological process for the plant kingdom is photosynthesis. It occurs in terrestrial and aquatic plants, algae, and some types of bacteria, and it is essential for living species [3]. Photosynthesis is the process by which plants convert light energy into chemical energy, with sugar as the final product. Oxygen is also produced; it is released into the atmosphere as a by-product and is used in the biological process of respiration [4].
Chloroplasts (plant organelles located in the leaf) are active metabolic centers that capture solar energy through chlorophyll (a green pigment) and manufacture carbohydrates (glucose molecules) through the process of photosynthesis [3]. The light energy absorbed by chlorophyll molecules in a leaf can follow one of three routes: it can be used to drive photosynthesis, the excess can be dissipated as heat, or it can be re-emitted as light (chlorophyll fluorescence) [5]. Glucose is one of the main molecules that serve as an energy source for plants and animals. It is found in plant sap and in the human bloodstream, where it is known as "blood sugar" [6,7]. Through photosynthesis, then, plants produce glucose for their own nutrition and release the oxygen that other living beings need for respiration. Plants capture carbon dioxide and release oxygen during the day, but the exchange reverses at night: they absorb oxygen and release carbon dioxide [8].
Appl. Sci. 2022, 12, 5537
Photosynthesis is divided into two phases: in the first, energy is absorbed and converted; in the second, carbon is taken up and assimilated. Light energy is absorbed by photosensitive biomolecules and transformed into a stable form of biochemical energy. The constituent elements are taken from inorganic mineral sources (water, H₂O; carbon dioxide, CO₂; nitrates, NO₃⁻; sulfates, SO₄²⁻; among others) and are incorporated into metabolizable organic biomolecules. Both phases are closely coordinated and interrelated. Traditionally, they have been called the light phase and the light-independent phase. The first phase takes place only in the presence of light and yields oxygen, ATP (adenosine triphosphate), and NADPH (nicotinamide adenine dinucleotide phosphate). In the second phase, which can occur during the day and also at night, the Calvin cycle is carried out: glucose is produced from CO₂ by using the ATP and NADPH generated in the light phase [2,8].
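As a compact summary of the two phases, the overall stoichiometry of oxygenic photosynthesis is commonly written as shown below. This is the usual textbook simplification rather than an equation taken from the cited references, and it hides the intermediate ATP and NADPH steps described above.

```latex
6\,\mathrm{CO_2} + 6\,\mathrm{H_2O}
  \;\xrightarrow{\text{light}}\;
  \mathrm{C_6H_{12}O_6} + 6\,\mathrm{O_2}
```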
The essential and predominant element in organic material is carbon. In photosynthesis, carbon is taken from the carbon dioxide (CO₂) in the air. In terrestrial plants, CO₂ is generally incorporated from the atmosphere through the stomata. Algae and aquatic plants take it from the CO₂ dissolved in the surrounding water. At night, when photosynthesis is not active and there is therefore no demand for CO₂ inside the leaf, the stomatal openings are reduced, preventing unnecessary water loss. In the morning, when the water supply is abundant and solar radiation favors photosynthetic activity, the demand for CO₂ inside the leaf is high and the stomatal pores are wide open, which reduces the stomatal resistance to CO₂ diffusion [8].
Plants have developed three photosynthetic systems, C₃, C₄, and CAM (crassulacean acid metabolism), with different anatomical and chemical characteristics [9]; the C₃ type is the most common [10,11]. The main difference between the types of photosynthesis lies in the way CO₂ is fixed. The first step in the Calvin cycle is carbon dioxide fixation by rubisco. Plants that use only this "standard" carbon fixation mechanism are called C₃ plants because of the three-carbon compound (3-PGA) that they produce during this part of the photosynthetic process. In C₄ plants, the light-dependent reactions and the Calvin cycle are physically separated: the light-dependent reactions occur in the cells of the mesophyll (spongy tissue in the center of the leaf), and the Calvin cycle occurs in special cells around the veins of the leaf, called bundle-sheath cells. Some plants adapted to dry environments use the crassulacean acid metabolism (CAM) pathway to minimize photorespiration. Instead of separating the light-dependent reactions and the use of CO₂ in the Calvin cycle in space, CAM plants separate these processes in time, so that CO₂ uptake takes place at night. At night, they open their stomata so that CO₂ diffuses into the leaves. This CO₂ is fixed into oxaloacetate by PEP carboxylase (the same step that C₄ plants use), which is then converted into malate or another organic acid [8]. The organic acid is stored inside vacuoles until the next day. During the day, CAM plants do not open their stomata, but they can still carry out photosynthesis because the organic acids are transported out of the vacuoles and broken down to release CO₂, which enters the Calvin cycle. This controlled release maintains a high concentration of CO₂ around rubisco [4]. The ability to represent the three photosynthetic types (C₃, C₄, and CAM) has important implications for studying natural ecosystems and agroecosystems [12].
The complex process of photosynthesis described above must work, integrally and efficiently, in an environment with enormous natural variability in the factors that affect the rate of photosynthesis, for instance light, ambient temperature, air humidity, and the availability of water and of mineral nutrients in the soil, among others. Carbon dioxide (CO₂) can also be considered part of this list of relevant factors because of global climate change [13]. The photosynthesis rate of a leaf is conditioned by one of more than 50 individual reactions, each with its own response to each environmental variable. This photosynthetic rate can vary widely between days and throughout the seasons because of environmental factors such as light and temperature. It can also vary in the longer term, over the coming decades, in response to increasing atmospheric CO₂ levels. The increase in CO₂ and other greenhouse gases in the atmosphere can cause global climate change. Each of the aforementioned environmental factors thus affects the photosynthesis rate in a different way, depending on the time scale [8].
Photosynthesis is important for many reasons. From humanity's point of view, it produces food and oxygen; therefore, it is often studied in terms of its end products. However, these are secondary aspects of an integral process. The most important aspects are the capture and transformation of light [14]. A critical component of crop production is the ability to produce more biomass [15].
Photosynthesis at the ecosystem scale, also known as gross primary productivity (GPP), is the first step by which CO₂ enters the biosphere from the atmosphere. Over the past century, with the increasing carbon release from land-cover change and fossil fuel burning, the CO₂ accumulation rate on land, in the ocean, and in the atmosphere has continuously increased. The increase of CO₂ in Earth's atmosphere is the major cause of global climate change. A major contribution to the variability of this accumulation comes from GPP, as the photosynthesis process is vulnerable to droughts, heatwaves, floods, frost, and other types of disturbances. An accurate estimation of GPP will not only provide information about the ecosystem response to these extreme events but also help to predict future carbon cycle dynamics [16].
For the agronomic sector, research on photosynthesis is useful in many respects, such as determining the state of crops or of certain plants; indicating crop production; helping in crop-health control; helping to optimize natural resources; detecting genetic alterations in plants to learn how plants react to stressful situations; and serving as a predictive indicator of biomass accumulation in green plants [17–22].

Methods for Inferring Photosynthesis
Many of the methods for inferring photosynthesis are invasive; they physically or chemically interfere with the plant, altering its natural process during measurement. Noninvasive methods do not alter the plant's natural process since there is no contact with the specimen [23,24].
Millan et al. [25] presented a review of the advantages and disadvantages of the methods used for photosynthesis estimation. Table 1 shows the classification of these methods. CO₂ exchange is the most used method for constructing commercial and experimental equipment [26,27]. It is possible to measure photosynthesis at the level of individual leaves, whole plants, plant canopies, and even forests [28,29].
Table 1. Classification of the methods used for photosynthesis estimation and their general description.

Destructive
Involves cutting a whole plant or only a portion of it to estimate the photosynthetic activity based on the accumulation of dry matter in the plant, from the point of germination until it is cut [25].

Manometric
Consists of direct measurement of the pressure of oxygen (O₂) or carbon dioxide (CO₂) in an isolated chamber with photosynthetic organisms [30].

Electrochemical
Uses electrochemical electrodes to measure O₂, CO₂, or pH in aqueous solutions of the sample to detect variations that depend on photosynthetic activity [30].

Gas exchange
Consists of isolating the sample for analysis in a closed chamber to quantify the CO₂ concentration [29,31]. The CO₂ concentration is detected by an infrared gas sensor (called IRGA, for Infra-Red Gas Analysis sensor) [32].

Carbon isotopes
Uses carbon isotopes such as ¹¹C, ¹²C, and ¹⁴C to produce CO₂ with incorporated radioactivity. This methodology is applied to analyze samples in isolated and illuminated chambers to produce a maximum fixation of radioactive CO₂ during photosynthesis [33,34]. The main disadvantages are that it is destructive, as it fixes a radioactive compound onto the sample, and that its precision depends on lighting conditions.

Acoustic waves
Based on the principle of sound wave distortion in the medium in which the waves propagate. The technique involves placing an acoustic transmitter on the seabed area where photosynthetic activity is to be monitored. The disadvantage is that it is dependent on water conditions and sensitive to environmental disturbances [35].

Fluorescence
Chlorophyll fluorescence is one of the ways in which part of the light energy absorbed by chlorophylls is dissipated. The fluorescence emission can be analyzed and quantified, which provides information on the electron transport rate, the quantum yield, and the existence of photoinhibition of photosynthesis. Fluorescence is used in various ways and has different applications; the interested reader is referred to references [5,8,24].

Mathematical modeling
An equation or a set of equations that represents a system's behavior. There is a correspondence between the model variables and the observable quantities [36].
As can be seen from Table 1, photosynthesis is a complex process that involves several distinct variables; hence, it cannot be measured directly but only inferred through a mathematical model. Photosynthesis can then be inferred by measuring and integrating these variables of interest. In general, it is possible to generate a mathematical model if the contributions of the variables of interest of a physical, chemical, or biological process, among others, are known [36,37]. It should be noted that all methods used to infer photosynthesis, invasive or non-invasive, apply some type of mathematical model. Therefore, the invasiveness of any method for estimating photosynthesis depends on the technique used to measure the variables of interest; for this reason, the present document focuses on studying and comparing the most relevant mathematical models reported in the literature. Authors such as Zufferey [38], Polerecky [39], Millan [25], Espinosa [40], Magney [41], Aziz & Dianursanti [42], and Kitić [43] demonstrate the use of some photosynthesis measurement methods.
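To illustrate how a measured variable becomes a photosynthesis estimate through a model, the sketch below applies a simplified open-chamber CO₂ mass balance of the kind used in gas-exchange methods. The function name, argument names, and sample values are hypothetical, and the formula deliberately ignores the correction for water vapor added by transpiration, so it should be read as an assumption-laden illustration rather than the procedure of any cited instrument.

```python
def net_assimilation(flow_mol_s, co2_in_umol_mol, co2_out_umol_mol, leaf_area_m2):
    """Net CO2 uptake per unit leaf area, in umol m^-2 s^-1.

    Simplified mass balance (assumption): the CO2 removed from the air
    stream equals the CO2 taken up by the leaf; the dilution caused by
    transpired water vapor is neglected.
    """
    if leaf_area_m2 <= 0:
        raise ValueError("leaf area must be positive")
    # mol air/s * umol CO2/mol air / m^2  ->  umol CO2 m^-2 s^-1
    return flow_mol_s * (co2_in_umol_mol - co2_out_umol_mol) / leaf_area_m2


if __name__ == "__main__":
    # Hypothetical reading: 500 umol/s air flow, 20 umol/mol CO2 drawdown,
    # 6 cm^2 of enclosed leaf area.
    a = net_assimilation(5e-4, 400.0, 380.0, 6e-4)
    print(f"A = {a:.2f} umol m^-2 s^-1")
```

The invasiveness in this case comes from enclosing the leaf in a chamber to obtain the two concentrations, not from the arithmetic itself.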

Mathematical Modeling
Any branch of science, as it progresses from qualitative to quantitative, is likely to reach the point where the use of mathematics to connect experiment and theory is essential.
Mathematical models can be classified into mechanistic (white box), empirical (black box), and hybrid (gray box). These, in turn, have sub-classifications, as shown in Figure 1 [36].
Empirical models, also called black box models, mainly describe a system's responses by using mathematical or statistical equations without any scientific content, restrictions, or scientific principle. Depending on particular goals, this may be the best type of model to build [44]. Their construction is based only on experimental data and does not explain dynamic mechanisms; that is, the system's internal process is unknown [40]. Estimating an unknown function from observations of its values is a challenging problem. The basic advice in this respect is to estimate models of different complexity and evaluate them using validation data. A good way to restrict the flexibility of certain classes of models is to use a regularized fit criterion. A key issue is finding a sufficiently flexible model parameterization. Another key is to find a suitable "close approach" to the model structure [45]. Researchers usually predict physiological parameters by using intelligent algorithms, such as Support Vector Machines (SVM), Back-Propagation Neural Networks (BPNN), Artificial Neural Networks (ANN), Deep Neural Networks (DNN), and the combination of Wide and Deep Neural Networks (WDNN) [46].
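The advice to estimate models of different complexity and judge them on validation data can be sketched as follows. Everything in this example is an assumption made for illustration: the synthetic light-response data, the two candidate model forms (a straight line and a rectangular hyperbola), the grid used for the half-saturation constant, and all function names. It is not a method taken from the reviewed papers.

```python
import math
import random


def fit_linear(xs, ys):
    """Ordinary least squares for y = a + b*x; returns a prediction function."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x


def fit_hyperbola(xs, ys, k_grid):
    """Fit y = pmax * x / (x + k): grid search over k, closed-form pmax."""
    best = None
    for k in k_grid:
        shape = [x / (x + k) for x in xs]
        pmax = sum(s * y for s, y in zip(shape, ys)) / sum(s * s for s in shape)
        sse = sum((pmax * s - y) ** 2 for s, y in zip(shape, ys))
        if best is None or sse < best[0]:
            best = (sse, pmax, k)
    _, pmax, k = best
    return lambda x: pmax * x / (x + k)


def rmse(model, xs, ys):
    """Root-mean-square error of a prediction function on (xs, ys)."""
    return math.sqrt(sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs))


def compare_models(seed=0):
    """Return (validation RMSE of the line, validation RMSE of the hyperbola)."""
    rng = random.Random(seed)
    light = [100.0 * i for i in range(21)]  # hypothetical light levels
    # Synthetic "measurements": saturating response plus Gaussian noise.
    rate = [30.0 * x / (x + 300.0) + rng.gauss(0.0, 1.0) for x in light]
    train_x, train_y = light[0::2], rate[0::2]   # used for fitting
    valid_x, valid_y = light[1::2], rate[1::2]   # held out for validation
    linear = fit_linear(train_x, train_y)
    hyperbola = fit_hyperbola(train_x, train_y, range(50, 1001, 25))
    return rmse(linear, valid_x, valid_y), rmse(hyperbola, valid_x, valid_y)


if __name__ == "__main__":
    err_linear, err_hyperbola = compare_models()
    print(f"validation RMSE: linear={err_linear:.2f}, hyperbola={err_hyperbola:.2f}")
```

Both candidates are black box fits in the sense used above: neither encodes a mechanism, and the choice between them rests only on how well each predicts data it was not fitted to.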
Mechanistic models, also called white box models, provide a degree of understanding or explanation of the modeled phenomena. The term "understanding" implies a causal relationship between quantities and mechanisms (processes). A well-built mechanistic model is transparent and open to modifications and extensions, more or less without limits. A mechanistic model is based on our ideas about how the system works, which elements are important, and how they are related [44]. These models make explicit the input and output variables as well as the variables involved in the modeled process [40,47]. Mechanistic models are more research-oriented than application-oriented, although this is changing as mechanistic models become more reliable. Evaluation of such models is essential, although it is often, and inevitably, rather subjective. Conventional mechanistic models are complex and not user-friendly [44].
Figure 1 shows that both mechanistic and empirical models can be deterministic or stochastic. Deterministic models make definite quantitative predictions (e.g., plant dry matter or animal intake) without any associated probability distribution. This can be acceptable in many cases; however, it may not be satisfactory for highly variable quantities or processes (e.g., rain or the migration of diseases, pests, or predators). Stochastic models, on the other hand, include a random element as part of the model, so that the predictions have a distribution. One problem with stochastic models is that they can be technically difficult to build and complex to test or falsify [44].
In turn, deterministic and stochastic models can be continuous or discrete. A mathematical model that describes the relationship between continuous signals in time is called time-continuous. Differential equations are frequently used to describe such relationships. A model that directly relates the values of the signals at the sampling times is called a discrete or sampled time model. Such a model is typically described by difference equations [48].
Continuous models are classified as dynamic when they predict how quantities vary with time; a dynamic model is generally presented as a set of ordinary differential equations with time (t) as the independent variable. Continuous models can also be static, in which case they do not contain time as a variable and do not make time-dependent predictions [44].
Finally, dynamic models can be lumped (grouped) or distributed. Many physical phenomena are described mathematically by partial differential equations; the events in the system are, so to speak, scattered over the spatial variables. This description is called a distributed parameter model. If the events are described by a finite number of changing variables, we speak of lumped models. These models are usually expressed by ordinary differential equations [48].
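The distinctions between continuous and discrete, and between deterministic and stochastic, can be made concrete with a small sketch. It assumes a logistic growth law for biomass, dW/dt = r·W·(1 − W/K), chosen only as a familiar lumped dynamic model and not taken from the reviewed papers. The forward-Euler step turns the differential equation into a difference equation, and an optional Gaussian term (also an assumption) turns the deterministic model into a stochastic one. All names and parameter values are hypothetical.

```python
import random


def simulate_biomass(w0, rate, capacity, step, n_steps, noise_sd=0.0, seed=None):
    """Discrete-time (sampled) version of dW/dt = rate * W * (1 - W / capacity).

    With noise_sd == 0 the model is deterministic: the same inputs always
    give the same trajectory. With noise_sd > 0 a random shock is added at
    every step, so repeated runs produce a distribution of trajectories.
    """
    rng = random.Random(seed)
    w = w0
    trajectory = [w]
    for _ in range(n_steps):
        growth = rate * w * (1.0 - w / capacity)
        shock = rng.gauss(0.0, noise_sd) if noise_sd > 0.0 else 0.0
        # Difference equation: W[k+1] = W[k] + step * f(W[k]) + shock,
        # clamped at zero because biomass cannot be negative.
        w = max(0.0, w + step * growth + shock)
        trajectory.append(w)
    return trajectory


if __name__ == "__main__":
    deterministic = simulate_biomass(1.0, 0.2, 100.0, 1.0, 100)
    stochastic = simulate_biomass(1.0, 0.2, 100.0, 1.0, 100, noise_sd=1.0, seed=1)
    print(f"deterministic final biomass: {deterministic[-1]:.3f}")
    print(f"stochastic final biomass:    {stochastic[-1]:.3f}")
```

Because the state is a single number per time step, this is a lumped model; a distributed version would track biomass over space as well and would require a partial differential equation.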
Between the black box and white box models lies an intermediate class, the semi-empirical or semi-mechanistic models. These are also called gray box or hybrid models; they consist of a combination of empirical and mechanistic models [40].
The practical use of a classification of mathematical models lies in understanding "where you are" in the space of mathematical models and what types of models might apply to the problem. To understand the nature of mathematical models, they can be defined by the chronological order in which the model's constituents usually appear. Usually, a system is given first, then there is a question regarding that system, and only then is a mathematical model developed. This process is denoted as SQM, where S is a system, Q is a question relative to S, and M is a set of mathematical statements M = (Σ₁, Σ₂, …, Σₙ) that can be used to answer Q. Based on this definition, it is natural to classify mathematical models in an SQM space [36]. Figure 2 shows an approach to visualizing this SQM space of mathematical models based on the white box and black box classification. At the black box end of the spectrum, models can make reliable predictions based on data. At the white box end of the spectrum, mathematical models can be applied to the design, testing, and optimization of processes on the computer before they are physically carried out. On each of the S, Q, and M axes in Figure 2b, the mathematical models are classified according to a series of criteria compiled from various classification attempts in the literature [36].
There are different mathematical models related to biochemical, physical, and agroecological variables that estimate photosynthesis at the leaf, plant, or group-of-plants level.
Therefore, the study of mathematical modeling focused on the photosynthetic process becomes important in the agricultural sector, since photosynthesis is a direct indicator of a plant's health. It also makes it possible to assess the consequences of global climate change on crop growth, since the high concentration of CO₂, the increase in temperature, and altered rainfall patterns can have serious effects on crop production in the near future [44].
However, to the best of the authors' knowledge, no study has yet addressed the diversity of mathematical models in this field with a focus on the mathematical formulation, the complexity of the model, the validation, the type of crop (at the leaf, plant, or canopy level), the diversity of variables used with their respective units, and the invasiveness of their measurement. Hence, this manuscript presents a selective review of mathematical modeling to estimate photosynthesis.
In the literature, there is a review of the mathematical modeling of photosynthesis by Susanne von Caemmerer. However, that review discusses and compares only several models derived from the C₃ model by Farquhar, von Caemmerer, and Berry. The models reviewed there describe steady-state CO₂ assimilation rates and provide a set of quantitatively formulated hypotheses that can be used as research tools to interpret experiments both in the field and in the laboratory. They also provide tools for reflective experiments [49]. Conversely, the present paper provides a new view of the state of the art in mathematical models with certain specifications. This information can be used to develop new mathematical models that estimate photosynthesis with new variables related to the plant's habitat and that are better suited to implementation in electronic systems during the development of photosynthesis estimation equipment.
The objective of this manuscript is to present mathematical models with different characteristics for estimating photosynthesis; to discuss their formulation, complexity, validation, applications, and the invasiveness of the estimation method; to show that the units corresponding to photosynthesis depend on the measurable biochemical, physical, and agroecological variables used for the estimation; and to analyze the relevance of the behavior trend in the photosynthetic process, prioritizing it over the specific magnitude itself. Therefore, in the authors' opinion, this study contributes to the body of knowledge on mathematical modeling for estimating photosynthesis.
The content of this work is organized in the following sections. Section 2 describes the criteria for the selection of mathematical models according to their classification and characteristics. In Section 3, a critical review of the white box, black box, and gray box models is given. Finally, conclusions are reported in the last section of this manuscript.

Techniques
Over the years, various mathematical models have been developed and introduced for photosynthesis estimation. The review of these models is based on an exhaustive search and analysis of the reported literature; it focuses on their mathematical formulation, validation, and input variables, as well as the type of cultivation at the leaf, plant, or canopy level. As a result, an updated list of the most relevant mathematical evaluation models was obtained, and possible areas of opportunity were detected. The search was conducted in the main databases (Science Direct, Web of Science, IEEE, Google Scholar, Scielo, ResearchGate), scientific organizations such as the American Society of Agricultural and Biological Engineers (ASABE), and specialized journals such as Bioresource Technology Reports, Ecological oped over time. This approach allows us to contribute to the knowledge of the different existing methodologies for estimating photosynthesis.
The selection criteria were:
(a) The models had to estimate gross photosynthesis, net photosynthesis, photosynthetic rate, biomass, or carbon assimilation. Wohlfahrt & Gu [50] reviewed the background of photosynthesis and the associated terminology; they showed that different definitions and names are used for photosynthesis in the literature. As the understanding of photosynthesis has deepened, its terminology and definitions have also evolved [51]. In this work, gross photosynthesis is defined as the total energy that plants fix during the photosynthesis process, while net photosynthesis is the total fixation of CO₂ [52]. The photosynthetic rate refers to the amount of CO₂ absorbed by a plant per unit of leaf area and unit of time. Biomass results from the net accumulation of assimilated CO₂ throughout the growth cycle [53]. Lastly, carbon assimilation is the amount of CO₂ stored in the plant. These terms are related to photosynthesis since plants capture CO₂ from the atmosphere and metabolize it through the photosynthetic process to obtain sugars and other compounds required for the normal development of their life cycle [26,54]. All the definitions above therefore refer to the estimation of photosynthesis.
(b) The articles had to include a validation method that checked whether the model yielded results similar to those of other accepted models or of a photosynthesis measurement device on the market. This is important because validation is one of the steps of model development [55,56]. Validation is defined as the comparison of the model predictions with the observed values of the actual process to determine whether the model fits its purpose; the validation results are a necessary step in the acceptance of the model [57–59]. Validation can also be defined as a demonstration that a model has acceptable predictive accuracy within its domain for a specific application. The quantitative precision of the predictions is often referred to as the "validity" of the model. Generally, it refers to the ability of a model to accurately predict the results of a particular experimental or observational scenario [44].
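Validation in the sense of criterion (b), comparing model predictions with observed values, is usually summarized with a few agreement statistics. The sketch below computes two common ones, the root-mean-square error and the coefficient of determination. The choice of these two statistics, the function names, and the sample numbers are assumptions for illustration; the reviewed papers may report other measures.

```python
import math


def rmse(observed, predicted):
    """Root-mean-square error, in the same units as the observations."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Hypothetical photosynthetic rates (umol m^-2 s^-1): device vs. model.
    measured = [10.0, 12.0, 15.0, 18.0]
    modeled = [11.0, 12.0, 14.0, 19.0]
    print(f"RMSE = {rmse(measured, modeled):.3f}")
    print(f"R^2  = {r_squared(measured, modeled):.3f}")
```

Whether a given value counts as acceptable accuracy depends on the application, which is why validity is always stated for a specific domain.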
In this review, a series of papers related to the topic of mathematical modeling were analyzed for inclusion. During data extraction, some studies were excluded because they did not present the information required by the selection criteria. The excluded models have the following issues:
- They did not estimate gross photosynthesis, net photosynthesis, photosynthetic rate, biomass, or carbon assimilation; instead, they modeled chlorophyll, fluorescence, or stomatal conductance, among others.
- The information was related to the measurement methodology [40,41] and not to the mathematical modeling.
- Some articles did not present the mathematical formulation for photosynthesis [60,61].
- Some models estimated crop growth [62–65], which is not the focus of this review.
- Some articles use previously developed models to carry out experiments different from those originally proposed [66–69], but they do not propose any new model.
- Some authors present a mathematical formulation of the model; however, they do not present validation or results obtained [70].
- Some authors present models that estimate photosynthesis only for the Calvin cycle [71,72], which is only one part of the whole photosynthetic process.
- Some authors used models that repeat or resemble those presented in this manuscript [73–76].
- Some models are developed based on soft computing [77,78]. These models were excluded because they are not analytical models, that is, they do not present explicit algebraic expressions.
Finally, a total of 39 papers were selected and reviewed as the basis of this review. In addition to these criteria, the type of cultivation, the mathematical formulation, and the validation of the modeling were considered.
The information retrieved was analyzed to determine the corresponding classification of each model, according to the previously described types of mathematical modeling: white box, black box, or gray box.

Discussion
Models in the literature use input variables such as light (photosynthetically active radiation (PAR), in µmol m−2 s−1, or irradiance, the average energy incident on a surface per unit area per unit time, in W/m2), CO2 in the atmosphere or in the leaf, oxygen in the leaf or in the atmosphere, chlorophyll concentration, chlorophyll fluorescence, energy composition, source concentration energy, stomatal conductance, ambient temperature, leaf temperature, time in hours, vapor pressure in the leaf, nitrogen, and phosphorus. Others focus on the physical characteristics of the plant, such as leaf area index, stem diameter, plant height, leaf mass per area, leaf age, or population density.
Regarding the models' validation, several researchers use the Farquhar model as a basis [79,80], since its authors were pioneers in estimating photosynthesis. Unfortunately, the parameters of the Farquhar model are difficult to estimate [81], since they derive from several biochemical reactions and therefore require techniques that are invasive for the plant. Another disadvantage is that the model involves a long and complex mathematical calculation, which is unfavorable for implementation in measurement systems. Therefore, some other models are validated with commercial devices or with previously accepted models developed by other authors.
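To give a sense of the structure of a Farquhar-type model, a minimal sketch of its widely used steady-state form for C3 leaves is given below: net assimilation is the minimum of a Rubisco-limited rate (Ac) and an electron-transport-limited rate (Aj), minus respiration. The function name, the default kinetic constants (illustrative values often quoted for 25 °C), and the example inputs are assumptions made here for illustration; the exact formulation and parameterization differ between the reviewed articles.

```python
def fvcb_net_assimilation(ci, vcmax, j, rd,
                          o=210.0, kc=404.9, ko=278.4, gamma_star=42.75):
    """Net CO2 assimilation A_n = min(A_c, A_j) - R_d, in µmol m-2 s-1.

    ci, kc, gamma_star are in µmol mol-1; o and ko are in mmol mol-1.
    The defaults are assumed, illustrative 25 °C constants, not values
    taken from the reviewed article.
    """
    # Rubisco-limited (carboxylation-limited) rate.
    ac = vcmax * (ci - gamma_star) / (ci + kc * (1.0 + o / ko))
    # Electron-transport-limited (RuBP-regeneration-limited) rate.
    aj = j * (ci - gamma_star) / (4.0 * ci + 8.0 * gamma_star)
    return min(ac, aj) - rd


# Hypothetical leaf: Vcmax = 60, J = 120, Rd = 1 (all in µmol m-2 s-1).
for ci in (100.0, 200.0, 400.0):
    a_n = fvcb_net_assimilation(ci, vcmax=60.0, j=120.0, rd=1.0)
    print(f"Ci = {ci:5.0f} µmol/mol -> An = {a_n:6.2f} µmol m-2 s-1")
```

Even in this compact form, the model needs Vcmax, J, and Rd for each leaf, which reflects the parameter-estimation difficulty noted above.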
Another difference between the models lies in the terminology and definition of the output variable: some authors present the result not as photosynthesis but as carbon assimilation, net photosynthetic rate, net photosynthesis, photosynthetic rate, gross photosynthetic rate, or biomass, because all these variables represent or are related to the estimation of photosynthesis.
The equations of the models used to estimate photosynthesis are given in Table 2, where the equation number of each model is listed with its corresponding author. Based on these equations, it can be seen that mechanistic models are complex, take a long time to build, are difficult to parameterize, are unfriendly to use, and tend to be more research-oriented than application-oriented. On the other hand, deterministic models are simpler to implement but must nevertheless be adjusted to obtain accurate results. This adjustment is difficult because some black box-based models apply various transformations to the input data, and debugging these models at any stage is not an easy task. The gray box models offer an interesting approach, since new models are needed that merge the white box and black box approaches to make models easier to interpret. A list of the abbreviations used to express the models' equations is also provided.

[Table 2 (columns: Equation, Author, model expression), grouped into mechanistic and empirical mathematical models. Recoverable entries: Equation (1), Hahn [87]; Yin [98] (C3 plants, limited transport part); Torres [99]; Equation (27), Hozumi & Kirita [106]; Wetzel; Romdhonah [109] (linear squared model, S-model); Equation (34), Sau [113]. Table notes: functions f1 and f2 differ by crop and by crop history; complex functions are not detailed and are written as f(main variables).]

Table 3 describes the main characteristics of the selected models. Models are classified according to their type: mechanistic, empirical, and gray box. Table 3 is ordered like Table 2 and lists: the author with the equation number for each model; the measurable input variables in plants and their environment, with their respective units; the output variable corresponding to the estimation of photosynthesis and its units; the number of plants (samples); the validation method used for the modeling; and the coefficient of determination (R2). All investigations based on mathematical modeling require extensive knowledge of the photosynthesis process in order to determine which variables must be measured, as well as knowledge of how these variables interrelate, to obtain a reasonable photosynthesis estimate.
Most of the mathematical models included in this review were validated against accredited models by other authors or by comparison with a device for measuring photosynthesis (mainly gas exchange); some others were only simulated.
The percentage analysis of the reviewed mathematical models shows that 20% are gray box models, 13% are black box, and 67% are white box.
Figure 3 shows that after the year 2000 the number of white box models increased considerably, setting them apart from the other types of models. The construction of white box models is based on previous knowledge and physical intuition [45]. In addition, white box models are more complex to model and to implement in electronic systems than black box models [2]. The empirical (black box) models, on the other hand, have been developed only moderately, probably because no previous physical information is available and they rely on experimental data, trying to follow the white box. However, this does not mean that the results obtained with black box models are unreliable: this type of model has been applied successfully to the estimation of photosynthesis, and it is also more flexible to implement in electronic systems than white box models [47]. Unlike white box models, black box models can solve problems by estimating an unknown function, which serves as a good guide for linear and nonlinear dynamic systems. Regarding hybrid (gray box) models, the first investigations began in 1991, two decades after the other types of models. Recently, interest in gray box models has increased, as they offer the possibility of combining a white box model with a black box model. In other words, previous information may be available on some processes while, at the same time, several parameters are determined from the observed data.
The white box models are as good as the black box models, and their performance depends on the application domain and the input data, so they must be analyzed to select the one best suited to the given problem and to presenting the results obtained. To this end, it is essential to understand the input data as well as the best way to display the output data; the careful formulation of the objectives is paramount and determines the scope of the model built. It is worth mentioning that in the last two years (2021-2022) the development of gray box and black box models has attracted interest, while white box models have lagged behind.
This review shows, with considerable evidence, that photosynthesis can be estimated through different mathematical models. These models have various characteristics, and no specific equation or set of variables has been standardized. Among the articles whose models were validated against trends in mathematical behavior are Marshall [82], validated with a rectangular hyperbolic model; Han [93], validated with an empirical function based on the Poisson distribution; [107], validated with Euler's simple digital integration method and the rectangular hyperbolic method; Perin [116], validated with the hyperbolic model; and Jansen [112], validated with Euler's integration method.
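Several of the validations mentioned above rely on a rectangular hyperbolic light response. A common textbook form, assumed here because the exact expressions used by the cited authors are not reproduced in this section, is P = αI·Pmax/(αI + Pmax) − Rd, where I is the incident light, α the initial slope, Pmax the light-saturated gross rate, and Rd the respiration. The function name and all parameter values below are hypothetical.

```python
def rectangular_hyperbola(i, alpha, p_max, r_d=0.0):
    """Net photosynthesis from a rectangular hyperbolic light response.

    i      : incident light (e.g. PAR in µmol m-2 s-1)
    alpha  : initial slope of the light response (assumed value)
    p_max  : light-saturated gross photosynthesis (assumed value)
    r_d    : respiration, subtracted to obtain the net rate
    """
    gross = (alpha * i * p_max) / (alpha * i + p_max)
    return gross - r_d


# Hypothetical parameters, for illustration only.
for light in (0.0, 200.0, 400.0, 1600.0):
    p_net = rectangular_hyperbola(light, alpha=0.05, p_max=20.0, r_d=1.5)
    print(f"I = {light:6.0f} -> P_net = {p_net:5.2f}")
```

In this form the gross rate reaches half of Pmax when αI equals Pmax and approaches Pmax asymptotically, which is the trend such validations compare against.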
It is important to note that the measurable variables used in the reviewed mathematical models (empirical, mechanistic, or hybrid) have diverse units, so it is impossible to standardize the units for estimating photosynthesis. Therefore, the approach is oriented towards analyzing and comparing the trend of the photosynthetic process, regardless of the specific magnitude of the reported measurement. The units of the photosynthesis estimate thus depend on the type of input variables, since the different mathematical models were found to relate to biochemical, physical, and agroecological variables [14,119]. Figure 4 shows the input variables used in the reviewed mathematical models, classified according to type [120][121][122].
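Although the units across models cannot be standardized in general, the two light units mentioned in this review can be related to each other under an assumed spectrum. The sketch below converts between energy flux in the PAR waveband (W/m2) and photon flux (µmol m−2 s−1) with a factor of about 4.57 µmol per joule, an approximate value commonly quoted for daylight. The factor, the function names, and the example value are assumptions made here; other light sources require other factors, and total (broadband) irradiance would first have to be reduced to its PAR fraction.

```python
# Assumed approximate conversion factor for daylight within 400-700 nm.
SUNLIGHT_UMOL_PER_JOULE = 4.57


def par_wm2_to_ppfd(par_w_m2, umol_per_joule=SUNLIGHT_UMOL_PER_JOULE):
    """Convert PAR energy flux (W m-2) to photon flux (µmol m-2 s-1)."""
    return par_w_m2 * umol_per_joule


def ppfd_to_par_wm2(ppfd, umol_per_joule=SUNLIGHT_UMOL_PER_JOULE):
    """Convert photon flux (µmol m-2 s-1) to PAR energy flux (W m-2)."""
    return ppfd / umol_per_joule


# Hypothetical reading: 100 W m-2 of PAR under daylight.
ppfd = par_wm2_to_ppfd(100.0)
print(f"100 W m-2 (PAR) is roughly {ppfd:.0f} µmol m-2 s-1")
```

Such a conversion only aligns the light input; it does not make the outputs of different models comparable, which still depend on the remaining input variables.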
To implement a mathematical model in an electronic system, it is necessary to consider the physical method of measuring each variable [121,123,124]. With regard to the physical method of measurement, the biochemical, physical, and agroecological variables can be sub-classified as invasive or non-invasive. Invasive measurements interfere physically or chemically with the plant; non-invasive measurements, in contrast, do not alter the natural processes of the plant because, in general, there is no contact with the studied body. Invasive measurement methods have the following disadvantages: they affect the behavior of plants, can be destructive, depend on a laboratory, and require a space with adequate facilities. Therefore, they are not performed in situ and the measurement is not immediate, which causes a delay. In contrast, non-invasive methods can be in situ, in vivo, and non-destructive.
According to the mathematical models reviewed, Figure 5 shows the diversity of variables that have been used to estimate photosynthesis and whether their measurement in plants is invasive or non-invasive. It also presents the percentage of times each variable is used to build a mathematical model, reinforcing the idea that it is not possible to standardize the units of photosynthesis estimation. According to Figure 5, light (photosynthetically active radiation (PAR), in µmol m−2 s−1, or irradiance, the average energy incident on a surface per unit area per unit time, in W/m2) is the most used variable for estimating photosynthesis, since it is required in 25% of the models, followed by ambient temperature with 14%. In general, the advantage of including biochemical variables in the models is that the photosynthesis estimate is more reliable, since these variables are directly related to the photosynthetic process. However, since the information on biochemical variables is obtained by a completely invasive process, and the type of equation to be implemented requires higher-quality digital components, the implementation cost is quite high. Physical variables, on the other hand, are less invasive, depending on the method used for the measurement, as is the case for leaf area index and leaf temperature. Finally, agroecological variables are those found in the atmosphere; these variables are completely non-invasive.
Another relevant finding is that other variables that are important for estimating photosynthesis, because they intervene in this metabolic process, have not been used in previous models; examples are minerals, processed sap (glucose), relative humidity of the air, and relative humidity of the leaf. This opens a niche of opportunity for future models.
Table 4 shows the statistics of the number of input variables used in the mathematical models. Notably, 31% of the models require only 2 input variables, and 5% use 7 input variables to estimate photosynthesis. Thus, it is not necessary to measure all the possible input variables to obtain a photosynthesis output value. Table 4 also breaks down these statistics by model classification. Based on this table, it was concluded that black box models tend to use only 2 input variables; although the estimation process is unknown, the expected result is not affected, since there are white box models that also use only two input variables to estimate photosynthesis. These statistics suggest once again that neither the number of input variables nor the type of modeling influences the expected photosynthesis estimation.
Murchie [125] established that photosynthesis is perhaps the most studied physiological process in plant science, since it supports plant productivity. However, it is notoriously sensitive to small dynamic changes in environmental conditions, which means that quantifying it on different time scales is not easy. For this reason, most of the mathematical models reviewed in this article measure variables in steady state, that is, in controlled environments. It is important and convenient to measure the properties of a plant under steady-state conditions, but this does not always allow accurate prediction of how the plant will respond in a complex field environment [125][126][127][128][129][130]. For the analysis of physiological processes, it is important to take into account that plants are living beings, so each plant has a different photosynthetic response [11,131]. This is due to factors such as the age of the plant, its nutrition, the type of plant, its habitat, whether it is aquatic or terrestrial [132,133], and the climate in which it is found, as well as the type of species [134][135][136][137].
In the reviewed models, plants of different species were used to validate the photosynthesis estimates, classified into aquatic and terrestrial environments (Figure 6). As mentioned by Gómez [138], most studies on photosynthetic activity in different species have been carried out on individual leaves, without considering that, due to various factors, what happens in one leaf may not be what is happening in the other leaves of the same plant or in the crop in general [139][140][141]. For this reason, another important aspect to consider in the analysis and validation of a mathematical model for estimating photosynthesis is the number of plants required for the experimental part of the model, since it is more significant to experiment with a complete crop than with just one plant per species [11]. In this review, the number of plants used in each article was sought; however, not all the articles specified this information, which leaves in doubt whether it is a significant piece of data for evaluating the reliability of the model.
Various principles and methods have been proposed to select the best model for a system among the various existing models [88,142–145]. These quality criteria, which can be used separately or in some combination, are as follows:
i. To have the minimum number of model parameters with a reasonable error.
ii. To have the simplest form with minimal error.
iii. To be based on physical, chemical, and biological laws as much as possible.
iv. To have the minimum deviation between the predicted and empirical values.
v. To have the minimum output variance.
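Criteria i and iv above trade the number of parameters against the fit error. One standard way of combining the two, introduced here as an illustration and not attributed to any of the reviewed articles, is the Akaike information criterion (AIC) in its least-squares form, AIC = n·ln(RSS/n) + 2k, where n is the number of observations, RSS the residual sum of squares, and k the number of fitted parameters. The function name and the residual values are hypothetical.

```python
import math


def aic_least_squares(residuals, n_params):
    """AIC for a least-squares fit: n * ln(RSS / n) + 2 * k (lower is better)."""
    n = len(residuals)
    rss = sum(r * r for r in residuals)
    if n == 0 or rss <= 0.0:
        raise ValueError("need at least one residual and a non-zero RSS")
    return n * math.log(rss / n) + 2 * n_params


# Hypothetical residuals of two candidate models fitted to the same data.
simple_model = aic_least_squares([1.0, -1.0, 1.0, -1.0], n_params=2)
complex_model = aic_least_squares([0.9, -1.0, 1.0, -0.9], n_params=5)
print(f"AIC simple = {simple_model:.2f}, AIC complex = {complex_model:.2f}")
```

In this made-up case the more complex model reduces the error only slightly, so the penalty on its extra parameters gives it the higher (worse) score.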
According to Gutiérrez [146], another useful statistic for measuring the global quality of a model is the coefficient of determination (R2), since it measures the proportion or percentage of the variability in the experimental data that is explained by the model. When interpreting this coefficient, 0 ≤ R2 ≤ 1 holds, and values close to 1 are desirable. In general, for prediction purposes, an adjusted coefficient of determination of at least 70% is recommended. Although the coefficient of determination is an important factor in the validation of models, only some of the authors in the reviewed literature reported it.
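The coefficient of determination and its adjusted version can be computed as in the following minimal sketch, which uses R2 = 1 − SSres/SStot and adjusted R2 = 1 − (1 − R2)(n − 1)/(n − p − 1), with n samples and p predictors. The function names and the sample values are hypothetical and serve only to illustrate the 70% recommendation.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def adjusted_r_squared(r2, n_samples, n_predictors):
    """Adjusted R2, which penalizes the number of predictors."""
    dof = n_samples - n_predictors - 1
    if dof <= 0:
        raise ValueError("need n_samples > n_predictors + 1")
    return 1.0 - (1.0 - r2) * (n_samples - 1) / dof


# Hypothetical observed and predicted photosynthesis values.
observed = [2.0, 4.0, 6.0, 8.0]
predicted = [2.5, 3.5, 6.5, 7.5]
r2 = r_squared(observed, predicted)
r2_adj = adjusted_r_squared(r2, n_samples=len(observed), n_predictors=1)
print(f"R2 = {r2:.3f}, adjusted R2 = {r2_adj:.3f}, meets 70%: {r2_adj >= 0.70}")
```

Computed this way against an arbitrary set of predictions, R2 can fall below 0; the bound 0 ≤ R2 ≤ 1 holds for least-squares fits that include an intercept.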
As seen so far, photosynthesis is not easy to estimate, because many factors can modify the final results; in other words, it is not possible to use the same model for all types of plants.
Years after the publication of Farquhar, von Caemmerer and Berry's landmark C3 photosynthesis model, photosynthesis modeling remains an active field of research. These models have allowed the formulation and testing of new hypotheses, which has led to their refinement [147].

Conclusions
This article summarizes current mathematical models in the context of the photosynthetic process in plants. To this end, an extensive review of the current state of the art was carried out. In addition to highlighting their advantages and limitations, this review shows that mathematical models consider both qualitative and quantitative variables. The qualitative variables include the validation method, the type of crop used for the tests, the mathematical model, and the input and estimated variables with their respective units. The quantitative variables reported are the number of input variables, the number of samples used for the tests, the coefficient of determination, and the percentage of published articles on mathematical models to estimate photosynthesis in each classification.
The reader can compare the existing types of models according to their mathematical complexity, the input variables required, and the units used to estimate photosynthesis. Based on the state of the art, no previous comparison between mathematical models was found.
This review shows various models, from simple to very complex, that relate mathematics to the photosynthetic process.
This review presents the diversity of mathematical models for photosynthesis estimation from previously published research, focusing on the mathematical formulation, complexity, validation, applications, and the diversity of the variables with their respective units, as well as the invasiveness required for their measurement. Having analyzed these input variables, we conclude that the most commonly used non-invasive variables are light, ambient temperature, carbon dioxide in the atmosphere, and leaf temperature. On the other hand, the vast majority of the variables used to estimate photosynthesis are invasive for the plant and cause stress. Among the most common invasive input variables are carbon dioxide concentration in the leaf, oxygen in the leaf, leaf temperature, and vapor pressure, bearing in mind that the terms "invasive" and "non-invasive" depend on the techniques needed to acquire the variables of the model [96].
Regarding the mathematical modeling described, we infer that some models require a wide variety of input variables, which makes them more complex and hinders their future application in electronic systems. Therefore, it would also be useful to develop simpler mathematical models that are easy to implement.
This review supports the idea that an estimate of photosynthesis can be obtained with different mathematical equations and different measurable input variables, without standardization. This idea may be significant for future research, given that, based on the research reviewed here, it can be argued that the units of photosynthesis are determined by the input variables measured.
Mathematical models make an essential contribution to the understanding of both the mathematical and the biological fields. Deepening the knowledge of mathematical modeling to estimate photosynthesis shows that there are different methodologies for obtaining the same result, or at least a similar one. In biological systems, it is not always necessary to have specific definitive values; the tendencies of the system are also important. This information can be used to develop new mathematical models to estimate photosynthesis with new variables related to the plant's habitat and better suited to implementation in electronic systems during the development of photosynthesis estimation equipment.

Figure 1. Classification of mathematical models. The diagram shows the three main classes (white box, black box, and gray box models) and the sub-classifications of the white box (mechanistic) and black box (empirical) models. Mechanistic and empirical models can be deterministic or stochastic and, in turn, continuous or discrete.


Figure 2. The three dimensions of a mathematical model in the SQM space. The systems (S) are ranked at the top of the bar; immediately below the bar is a list of objectives (Q) that the mathematical models in each segment can have; at the lower end are the corresponding mathematical structures (M), ranging from algebraic equations (AEs) to differential equations (DEs). (a) Classification of mathematical models between black box and white box models. (b) Classification of mathematical models in the SQM space. Modified from [36].

Figure 3. Mathematical models reported in the literature. The graph shows the number of published articles focused on mathematical models to estimate photosynthesis over time, classified by decade from 1970 to 2022.


Figure 4. Input variables used in the reviewed mathematical models, classified by biochemical, physical, and agroecological type.


Figure 5. Input variables used by the mathematical models to estimate photosynthesis, the invasiveness of their measurement, and the percentage of times that the models use each variable.


Table 2. Model equations for estimating photosynthesis.

Table 3. Major features of the models for estimating photosynthesis.

Table 4. Reported input variables in mathematical models for estimating photosynthesis. The table shows the percentage of reviewed models using each number of input variables, for white, black, and gray box models.
(symbol missing): Accumulated absorption of O3. Equation (23)
D: Conversion coefficient Pg, Equations (6) and (20); stem diameter, Equation (29)
DO2: Dissolved oxygen concentration. Equations (35), (36) and (38)
(symbol missing): Cumulative density leaf area between the surface of the canopy and height z. Equation (27)
f1(T): Absolute maximum mathematical function due to leaf temperature. Equation (5)
f2(a): Function that describes the pattern for the age of the leaf. Equation (5)
FAO3: Response relationship treatment to control photosynthesis. Equation (23)
FCO2: Relative dimensionless factor that estimates the effect on growth of the concentration of CO2 in the air. Equation (34)
… and (31)
Rd(C): Respiratory CO2 released. Equation (18)
Rd(I): Leaf respiration in light or day. Equations (12) and (22)
RDINC: Relative depth in the crown. Equations (26) and (38)
Rlight: Nonphotorespiratory mitochondrial CO2 release in light. Equation (16)
RO2: Photosynthesis rate. Equation (35)
RO2(DO2): Photosynthesis rate as a function of dissolved oxygen concentration. Equation (35)
RO2(I): Leaf respiration in light or day. Equation (39)
RO2(Iav): Photosynthesis rate as a function of irradiance. Equation (35)
RO2(pH): Photosynthesis rate as a function of pH. Equation (35)
RO2(T): Photosynthesis rate as a function of temperature. Equation (35)
(symbol missing): …ochemical efficacy of the leaf in the absence of oxygen. Equation (25)
(symbol missing): Relationship between physical and total resistance to CO2 diffusion. Equation (2)
(symbol missing): …2 compensation point in the absence of dark breathing. Equations (10), (12) and (22)
Γ*: Compensation point in the absence of dark, Equations (1) and (19); CO2 partial pressure for the compensation of oxygenation and carboxylation reactions, Equation (11)