Accelerating Bayesian Estimation of Solar Cell Equivalent Circuit Parameters Using JAX-Based Sampling

: Equivalent circuit models that reproduce the current–voltage characteristics of solar cells are useful not only to gain physical insight into the power loss mechanisms that take place in solar cells but also for designing systems that use renewable solar energy as a power source. As mentioned in a previous paper, Bayesian estimation of equivalent circuit parameters avoids the drawbacks of nonlinear least-squares methods, such as the possibility of evaluating estimation errors. However, it requires a long computation time because the estimated values are obtained by sampling using a Markov chain Monte Carlo method. In this paper, a trial to accelerate the calculation by upgrading the Bayesian statistical package PyMC is presented. PyMC ver. 4, the next version of PyMC3 used in the previous paper, started to support the latest sampling libraries using a machine learning framework JAX, in addition to PyMC-speciﬁc methods. The acceleration effect of JAX is remarkable, achieving a calculation time of less than 1/20 times that of the case without JAX. Recommended calculation conditions were disclosed based on the results of a number of trials, and a demonstration with testable Python code on Google Colaboratory using the recommended conditions is published on GitHub.


Introduction
Efforts are underway to dramatically improve the efficiency as well as accessibility of crucial societal components including energy production and distribution, infrastructure establishment and upkeep, production technology, agriculture, transport, public health, and security by analyzing and utilizing information collected by the Internet of Things (IoT) networks, which consist of many wireless sensor nodes deployed around the society [1][2][3][4].Utilizing permanent power sources that do not need to be replaced is necessary to minimize the maintenance costs of wireless sensor nodes.So far, devices that convert ambient energy sources such as heat, vibration, and light into electric power have been developed as environmental energy-harvesting devices.Among these, the use of light energy is an attractive option in terms of energy density and other factors, assuming a comfortable living environment for human inhabitants [5][6][7][8].For this purpose, we need to develop technologies to make full use of power generated by solar cells under any lighting situation.
Solar cells have long been used to convert indoor light power into electric power [9].In recent years, new types of solar cells, such as organic solar cells and perovskite solar cells, have attracted much attention [10][11][12][13].One of the reasons why these new solar cells are attractive is that their power conversion efficiency even increases under low irradiation light.A previous work has elucidated that this increased efficiency can be attributed to the increase in parallel resistance within organic thin-film solar cells under lower light intensity, thereby resulting in reduced power losses [14].On the other hand, dye-sensitized solar cells (DSSCs), a kind of the new solar cells and the ancestor of perovskite solar cells, are already on the market for this purpose [15,16].A feature unique to DSSCs is that the electrochemical 2 of 12 nature of DSSCs prevents their power conversion efficiency from decreasing under pulsewidth-modulated illumination, which is commonly used in modern solid-state lighting with light-emitting diodes (LEDs) [17].
Equivalent circuit models of solar cells describe the direct current (DC) operation behavior of solar cells and are useful not only to gain physical insight into the power loss mechanisms that take place in solar cells, but also to design systems that use solar cells as power sources.Obtaining an accurate equivalent circuit model is especially important when employing solar cells as a power source for IoT devices used indoors, since it is necessary to track the optimal operating point of the solar cell in response to fluctuating irradiation light intensity.
The one-diode model shown in Figure 1 is the most basic solar cell equivalent circuit model and has been widely adopted because it reproduces the current-voltage characteristics of many solar cells, despite having only four basic elements displayed in Figure 1 and five parameters mentioned below.In this equivalent circuit, the following equation holds for the current I and the voltage V.
where R s and R p are the series and the parallel (shunt) resistances, I s and n denote the reverse saturation current and the ideality factor of the diode, respectively, and I ph is the ideal photocurrent.The thermal voltage V t is approximately 26 mV at room temperature.intensity, thereby resulting in reduced power losses [14].On the other hand, dye-sensitized solar cells (DSSCs), a kind of the new solar cells and the ancestor of perovskite solar cells, are already on the market for this purpose [15,16].A feature unique to DSSCs is that the electrochemical nature of DSSCs prevents their power conversion efficiency from decreasing under pulse-width-modulated illumination, which is commonly used in modern solid-state lighting with light-emitting diodes (LEDs) [17].Equivalent circuit models of solar cells describe the direct current (DC) operation behavior of solar cells and are useful not only to gain physical insight into the power loss mechanisms that take place in solar cells, but also to design systems that use solar cells as power sources.Obtaining an accurate equivalent circuit model is especially important when employing solar cells as a power source for IoT devices used indoors, since it is necessary to track the optimal operating point of the solar cell in response to fluctuating irradiation light intensity.
The one-diode model shown in Figure 1 is the most basic solar cell equivalent circuit model and has been widely adopted because it reproduces the current-voltage characteristics of many solar cells, despite having only four basic elements displayed in Figure 1 and five parameters mentioned below.In this equivalent circuit, the following equation holds for the current I and the voltage V.
where  s and  p are the series and the parallel (shunt) resistances,  s and  denote the reverse saturation current and the ideality factor of the diode, respectively, and  ph is the ideal photocurrent.The thermal voltage  t is approximately 26 mV at room temperature.

Figure 1.
One-diode equivalent circuit model of photovoltaic cell.
It is necessary to extract five circuit parameters so as to reproduce the current-voltage curve of the solar cell obtained experimentally.This is the task of finding parameters that minimize the discrepancy between the experimental data and the corresponding equivalent circuit curve [18].To accomplish this, a variety of nonlinear least-squares methods have traditionally been developed and used, including sophisticated techniques such as metaheuristic algorithms [19][20][21].If one takes such a proper approach, of course, all five parameters are determined simultaneously.
It is worth noting here that in numerous experimental studies, only the series resistance  s as well as the parallel resistance  p is presented, and the other three parameters are often not presented.In such cases, it is expected that a simple method of estimating the series resistance  s (and parallel resistance  p ) as the reciprocal of the slope of the curve when the terminals are open (and shorted) is often employed [22].However, the values obtained via this method are not suitable for the purpose of accurately estimating  s and  p because the slopes at these particular points are strongly affected not only by  s and  p shown in Figure 1 but also by other parameters like the ideality factor of the diode  [23].Therefore, such an approach potentially yields imprecise out- It is necessary to extract five circuit parameters so as to reproduce the current-voltage curve of the solar cell obtained experimentally.This is the task of finding parameters that minimize the discrepancy between the experimental data and the corresponding equivalent circuit curve [18].To accomplish this, a variety of nonlinear least-squares methods have traditionally been developed and used, including sophisticated techniques such as metaheuristic algorithms [19][20][21].If one takes such a proper approach, of course, all five parameters are determined simultaneously.
It is worth noting here that in numerous experimental studies, only the series resistance R s as well as the parallel resistance R p is presented, and the other three parameters are often not presented.In such cases, it is expected that a simple method of estimating the series resistance R s (and parallel resistance R p ) as the reciprocal of the slope of the curve when the terminals are open (and shorted) is often employed [22].However, the values obtained via this method are not suitable for the purpose of accurately estimating R s and R p because the slopes at these particular points are strongly affected not only by R s and R p shown in Figure 1 but also by other parameters like the ideality factor of the diode n [23].Therefore, such an approach potentially yields imprecise outcomes in determining R s and R p as well as misunderstanding that the effect on n caused by specific conditions is the effect on R s or R p .
The nonlinear least-squares method yields point estimates for a set of parameters.It should be noted that the curves computed from the equivalent circuit are error free in a practical sense, but the experimental data are not free from measurement error.This means that the parameters that minimize the experimental and computed curves will have many locally optimal solutions.This implies that parameter estimation should also involve information about the error in the estimation in addition to point estimation.Indeed, when using nonlinear least-squares methods, it is frequently experienced that the point estimates differ depending on the initial guess of each parameter given to the program.
In recent years, parameter extraction from experimental data by Bayesian estimation has been used in various fields including physics, energy, and materials science to solve the problems mentioned above and has begun to produce fruitful results [24][25][26][27][28]. Bayesian estimation searches the parameter space according to a pre-specified probability distribution called the prior probability distribution to obtain a probability distribution of the parameter set that can reproduce the experimental data, called the posterior probability distribution.One may utilize the mean value and the standard deviation of the posterior probability distribution as the point estimate and the error, respectively, for example.
In a previous paper, the author reported on a method for extracting equivalent circuit parameters of solar cells from current-voltage characteristics using Bayesian estimation [29].The result is not a point estimate of the parameters but an estimated probability distribution of the parameters, and thus, it is also possible to evaluate the error contained in the estimated value.For example, if one adds larger errors in the pseudo-generated experimental data, larger standard deviations of the estimated values are obtained as the outcome.This is a clear piece of evidence of the ability of the Bayesian estimation to treat uncertainty arising from the error in the experimental data.
On the other hand, a problem with the Markov chain Monte Carlo sampling method used in the estimation calculation is that it requires a relatively long computation time.In this paper, it is reported that by applying the latest sampling computation module, appropriate estimates can be obtained in a calculation time that is more than one order of magnitude shorter.

Methods
One can express the current I in terms of an analytical solution for the voltage V, by using the principal branch of Lambert's W-function W 0 (x) [30]. with Lambert's W-function W(x) is defined as the solution of the transcendental equation W(x)•exp((W(x)) = x , and has two real branches.W 0 (x) is the branch defined for x∈[−exp(−1), +∞].This equation includes the exponential function of a relatively large number to be computed once, which creates the risk of overflow when standard doubleprecision floating-point number is used.To avoid this, it is very useful to use the g-function g(x) = ln(W(exp(x))), defined by Roberts [31][32][33]. with This equation of I(V) using g-function appeared in Ref. [33] for the first time.This is because Roberts focused his attention on the equation of V(I), which potentially causes overflow much more easily than I(V).
In the previous work, the Python package PyMC3 [34], which enables us to conduct Bayesian estimation by simply writing a model in Python, was employed.This package is still under active development, and the next version, PyMC ver.4, and the current version, PyMC ver. 5 [35], have already been released.These new versions support the latest sampling libraries Numpyro [36,37] and Blackjax [38], which are based on the machine learning framework JAX [39] and are expected to provide accelerated sampling, in addition to conventional PyMC-specific sampling methods.
As shown below, each parameter is assigned a prior probability density distribution in the form of a uniform distribution spanning several orders of magnitude, thereby indicating a deliberate avoidance of substantial prior information utilization.As an example, the upper limit permitted for I ph surpasses the short-circuit photocurrent magnitude by over tenfold.These are the same as ones used in the previous work [29].
The computational algorithm for g(x) is given in Ref. [31].An initial rough estimation of g(x) was obtained by the piecewise function reported in the original literature, not the softplus-like function developed in the previous work [29].Although the latter consists of a single analytical equation, it was found that it gives a somewhat worse estimation compared to the former one.
The current-voltage characteristics of a Si solar cell [18] and an organic photovoltaic cell (OPV) [40] used in the literature [29] were used as the experimental data.Calculations were performed on the free version of Google Colaboratory, employing 10,000 steps for both sampling and burn-in if not otherwise noted, and the number of chains (parallel calculations) was two.All calculations were run on the CPU, and it was confirmed that the calculations were executed on two Xeon cores (2.2 GHz) in all cases.GPUs and TPUs provided in the free version of Google Colaboratory were not used, as they did not improve the computation speed.

Recommended Calculation Condition and Its Results
As a result of a large number of trials, the author recommends the following calculation conditions: use calculation package Numpyro under PyMC ver. 5 with 10,000 steps for both sampling and burn-in.Figure 2 shows the density distributions, which are similar to those of the posterior probability density function, and the trace plots for each equivalent circuit parameter obtained under these conditions for the tabular experimental data of the commercial Si solar cell (R.T.C.France) presented in the literature [18].Table 1 shows the results for the estimated values.The shapes of density distributions generated by two independent chains mimic each other and are similar to normal distribution functions.According to the central limit theorem, the density distribution generally approaches a normal distribution function as the number of sampling steps increases.The trace plots show that the values are repeatedly updated within an appropriate range, indicating that the Markov chain Monte Carlo calculation converges well.In fact, the R (R-hat) value, which indicates the convergence of this calculation, is 1.0 for all parameters.The deviation of R from 1.0 is usually accompanied by trace plots drifting unsteadily over a wide range and skewed density distributions.The average values for each parameter listed in Table 1 are practically the same as those in the literature.In addition, as shown in Figure 3, the current-voltage curve calculated using the mean values of the density distributions reasonably reproduces the experimental data.
priate range, indicating that the Markov chain Monte Carlo calculation converges we fact, the  ̂ (R-hat) value, which indicates the convergence of this calculation, is 1.0 fo parameters.The deviation of  ̂ from 1.0 is usually accompanied by trace plots dri unsteadily over a wide range and skewed density distributions.The average value each parameter listed in Table 1 are practically the same as those in the literature.In dition, as shown in Figure 3, the current-voltage curve calculated using the mean va of the density distributions reasonably reproduces the experimental data.Figure 4 shows the density distributions and the trace plots for each equivalent circuit parameter obtained under these conditions for the OPV (PTB7-Th:C 70 , annealed at 175 • C) experimental data shown in the literature [40].The results for each parameter estimate are summarized in Table 2. Similarly, the calculation converges well, and no serious discrepancy is observed between the simulated curve using the average values of the density distributions and the experimental data, as shown in Figure 5. On the other hand, the mean values of each parameter shown in Table 2 are slightly different from those shown in the previous work [29].This is due to the fact that the experimental data up to +0.8 V were used in this study, while those up to +0.9 V were used in the previous work.
Table 1.Estimation reported in Ref. [18] and Bayesian estimation of the equivalent circuit parameters of the commercial Si solar cell (R.T.C.France) [18] with the recommended calculation condition.

Parameter
Ref. [  Table 1.Estimation reported in Ref. [18] and Bayesian estimation of the equivalent circuit parameters of the commercial Si solar cell (R.T.C.France) [18] with the recommended calculation condition.

Parameter
Ref. [ Figure 4 shows the density distributions and the trace plots for each equivalent circuit parameter obtained under these conditions for the OPV (PTB7-Th:C70, annealed at 175 °C) experimental data shown in the literature [40].The results for each parameter estimate are summarized in Table 2. Similarly, the calculation converges well, and no serious discrepancy is observed between the simulated curve using the average values of the density distributions and the experimental data, as shown in Figure 5. On the other hand, the mean values of each parameter shown in Table 2 are slightly different from those shown in the previous work [29].This is due to the fact that the experimental data up to +0.8 V were used in this study, while those up to +0.9 V were used in the previous work.The calculation times were approximately 160 s and 310 s for the former and latter data, respectively.The usage of JAX and Numpyro accelerates the calculation, and these calculation times are less than 1/20 times those calculations when using PyMC-specific methods without JAX.The above results, including the Python code, are available on GitHub [41].The calculation time varies somewhat from run to run and data to data, and successful convergence of the calculation potentially depends on the experimental data used.If the calculations do not converge well, increasing the number of burn-in and sampling steps may help.The calculation times were approximately 160 s and 310 s for the former and latter data, respectively.The usage of JAX and Numpyro accelerates the calculation, and these calculation times are less than 1/20 times those calculations when using PyMC-specific methods without JAX.The above results, including the Python code, are available on GitHub [41].The calculation time varies somewhat from run to run and data to data, and successful convergence of the calculation potentially depends on the experimental data used.If the calculations do not converge well, increasing the number of burn-in and sampling steps may help.
Table 2. Bayesian estimation of the equivalent circuit parameters of the OPV (PTB7-Th:C70, annealed at 175 °C) [40] with the recommended calculation condition.

Results of Other Conditions
Table 3 summarizes the calculation results for the Si solar cell obtained under other typical calculation conditions.The parameter estimates themselves were almost identical for all calculation methods: without JAX, PyMC3 and PyMC ver. 4 required approximately 9000 s of calculation time, and PyMC ver. 5 was too time consuming to verify.On the other hand, when JAX was used, the computation time was less than 150 s, which is a remarkable acceleration.
The Markov chain Monte Carlo computation used for the present study is something like a numerical experiment of a random process, and the steps needed to reach the convergence change from trial to trial, probably due to the initial condition and data set.In the case of this experimental data, both Numpyro and Blackjax were verified to con-

Results of Other Conditions
Table 3 summarizes the calculation results for the Si solar cell obtained under other typical calculation conditions.The parameter estimates themselves were almost identical for all calculation methods: without JAX, PyMC3 and PyMC ver. 4 required approximately 9000 s of calculation time, and PyMC ver. 5 was too time consuming to verify.On the other hand, when JAX was used, the computation time was less than 150 s, which is a remarkable acceleration.
Table 3. Mean value, standard deviation (SD), computation time, and computation success rate of estimated equivalent circuit parameters under various computation conditions (experimental data: the commercial Si solar cell (R.T.C.France) [18]).All displayed values are typical ones.The Markov chain Monte Carlo computation used for the present study is something like a numerical experiment of a random process, and the steps needed to reach the convergence change from trial to trial, probably due to the initial condition and data set.In the case of this experimental data, both Numpyro and Blackjax were verified to converge successfully 10 times in 10 trials.On the other hand, for the OPV experimental data, Numpyro converged 10 times in 10 trials, as shown in Table 4, while the calculation converged only in less than half of the cases when Blackjax was used.Table 4. Mean value, standard deviation (SD), computation time, and computation success rate of estimated equivalent circuit parameters under various computation conditions (experimental data: the OPV (PTB7-Th:C 70 , annealed at 175 • C) [40]).All displayed values are typical ones.

Dependence on the Number of Steps
In Bayesian estimation using Markov chain Monte Carlo methods, an increase in the number of steps directly leads to an increase in computation time.Therefore, information on the minimum number of steps required to obtain adequate results is important.This section describes the results of an investigation of this issue using PyMC ver. 5, the latest version at the time of writing the present paper.The number of steps used for sampling and burn-in is the same, and only the number of steps is changed.Other conditions remained unchanged.
The results without JAX are shown in Tables 5 and 6.The computation time increases monotonically as the number of steps increases.In Table 5, the standard deviations of some parameters are noticeably large for the 1000-step case, indicating that adequate results are not obtained.Even in the case of 2000 steps, the computation time exceeds 7000 s, which is much longer than the computation time of 10,000 steps using Numpyro.Tables 7 and 8 show the results of varying the number of steps when using Numpyro.Computing a large number of consecutive calculations decreases the overhead associated with compilation.For this reason, shorter computation times are shown than in the results shown in 3.1.As the number of steps increases, the probability of convergence of the computation increases monotonically, indicating that for both data sets, convergence is almost certain after 10,000 steps of computation.Based on these results, it is recommended that Numpyro be used and that 10,000 steps be used for both sampling and burn-in.In addition, the computation time and the probability of convergence of the computation depended on the data set.The reason for this is not clear at this time and is a subject for future research.

Conclusions
Equivalent circuit models that reproduce the current-voltage characteristics of solar cells are useful not only to gain physical insight into the power loss mechanisms that take place in solar cells but also for designing systems that use renewable solar energy as a power source.The extraction of equivalent circuit parameters from experimental current-voltage characteristics of a solar cell is still an active research topic today.Bayesian estimation of equivalent circuit parameters avoids the drawbacks of nonlinear least-squares methods; however, it requires a long computation time.In this study, we investigated the acceleration of Bayesian estimation of equivalent circuit parameters for solar cells from current-voltage characteristics using the latest JAX-based sampling libraries that are now available in recent versions of PyMC.It was confirmed that reasonable estimates could be obtained in less than 1/20 times that of the computation time compared to the case where JAX is not used.Recommendations were made based on the results of a number of trials, and a demonstration with testable Python code on Google Colaboratory using the recommended conditions is published on GitHub [41].

Figure 1 .
Figure 1.One-diode equivalent circuit model of photovoltaic cell.

Figure 2 .
Figure 2. Density distributions ((left) panels) and trace plots ((right) panels) for each equiv circuit parameter of the commercial Si solar cell (R.T.C.France) [18] with the recommended c lation condition.The solid and dotted lines correspond to different chains.

Figure 2 .
Figure 2. Density distributions ((left) panels) and trace plots ((right) panels) for each equivalent circuit parameter of the commercial Si solar cell (R.T.C.France) [18] with the recommended calculation condition.The solid and dotted lines correspond to different chains.

Figure 4 .
Figure 4. Density distributions ((left) panels) and trace plots ((right) panels) for each equivalent circuit parameter of the OPV (PTB7-Th:C70, annealed at 175 °C) [40] with the recommended calculation condition.The solid and dotted lines correspond to different chains.

Figure 4 .
Figure 4. Density distributions ((left) panels) and trace plots ((right) panels) for each equivalent circuit parameter of the OPV (PTB7-Th:C 70 , annealed at 175 • C) [40] with the recommended calculation condition.The solid and dotted lines correspond to different chains.

18] Bayesian Estimation Mean Value Standard Deviation
RI ph[mA]

Table 5 .
[18] value, standard deviation (SD), computation time, and computation success rate of estimated equivalent circuit parameters obtained by PyMC ver. 5 without JAX under various numbers of steps (experimental data: the commercial Si solar cell (R.T.C.France)[18]).All displayed values are typical ones.

Table 7 .
[18] value, standard deviation (SD), computation time, and computation success rate of estimated equivalent circuit parameters obtained by PyMC ver. 5 with Numpyro under various numbers of steps (experimental data: the commercial Si solar cell (R.T.C.France)[18]).All displayed values are typical ones.