Approaches to Parameter Estimation from Model Neurons and Biological Neurons

Model optimization in neuroscience has focused on inferring intracellular parameters from time series observations of the membrane voltage and calcium concentrations. These parameters constitute the fingerprints of ion channel subtypes and may identify ion channel mutations from observed changes in electrical activity. A central question in neuroscience is whether computational methods may obtain ion channel parameters with sufficient consistency and accuracy to provide new information on the underlying biology. Finding single-valued solutions, in particular, remains an outstanding theoretical challenge. This note reviews recent progress in the field. It first covers well-posed problems and describes the conditions that the model and data need to meet to warrant the recovery of all the original parameters, even in the presence of noise. The main challenge is model error, which reflects our lack of knowledge of the exact equations. We report on strategies that have been partially successful at inferring the parameters of rodent and songbird neurons, when model error is sufficiently small for accurate predictions to be made irrespective of stimulation.


Introduction
Big data are driving the development of algorithms that predict the future evolution of complex systems and learn optimal strategies. Data-based learning has demonstrated remarkable success in predicting chaotic dynamics through reservoir computing [1] and in mastering board games without human input [2]. Model-based learning has instead relied on data assimilation to configure model equations with optimal parameters and initial conditions [3] that synchronize internal state variables to observations. The model is then completed by incorporating the estimated parameters and initial conditions in the model equations. Predictions are then made by forward integrating the completed model. This approach is routinely used by meteorologists, who forward integrate completed Navier-Stokes models to forecast weather conditions [3,4] or map ocean flow [5,6]. It is also increasingly being used by computational neuroscientists to predict the state of a neuron [7-13] and of biocircuits [14,15]. The estimation of parameters from neuron-based conductance models is also important as it has the potential to extract microscopic information on the complement of ion channels and intracellular dynamics, which are currently inaccessible to measurement. Parameters such as ionic conductances, gate thresholds, and gate kinetics are signatures of ion channel subtypes [7,8,12,13,16]. Inferring these parameters can potentially reveal the ion channels expressed in the electrical activity of a cell. Furthermore, changes in parameter values can potentially detect ion channel mutations which are difficult to classify using transcriptomics [17]. Channelopathies, for example, are involved in the early stages of brain cancer [18,19] and Alzheimer's disease [20]. Model-based approaches are also important because they allow reconstructing the dynamics of ionic gates which are inaccessible to experiment [7,8].
Inferring biological information is met with three challenges. Firstly, neuron-based conductance models respond nonlinearly to stimulation. Model optimization calls for convex algorithms that minimize a cost function subject to equality and inequality constraints. Equality constraints are given by the equations of motion of the state vector linearized at each point of the assimilation window. Inequality constraints are given by the boundaries of the parameter search intervals. Early attempts at optimizing neuron models used linear solvers to extract ionic conductances, while setting the gate thresholds and kinetics (nonlinear parameters) at empirically known values [21,22]. These solvers included evolutionary [23], genetic [24-29], and particle swarm optimization algorithms [30]. More recently, interior point methods [7-10,30,31], 4DVar [4], and self-organizing state space methods [32] have successfully estimated both linear and nonlinear parameters (see links in supplementary information). The second challenge is to transfer from the data all the information needed to determine a unique set of model parameters and initial conditions. The added difficulty here is that many state variables cannot be observed, leaving only a few, such as the membrane voltage, as the entry points of experimental data into the model. It is important that the criteria of observability, identifiability, and data sampling be fulfilled to fully constrain the model parameters. Given the availability of powerful convex optimization solvers [30,31], this paper focuses on setting the conditions for transferring information from the data to the model. The third challenge relates to the awareness that biological circuits may intrinsically incorporate redundant degrees of freedom.
Experiments on crustacean central pattern generators have shown that the same electrical activity may arise from different circuit configurations, pointing to possible functional overlap between parameters [33-35] and compensation [36]. This naturally calls into question whether any biologically meaningful information may be inferred from electrical activity alone [37-39]. Functional overlap between parameters is, however, highly conditional on the nature of the stimulation applied to circuits. Parameters which may be redundant under tonic stimulation often reveal their functional significance under complex stimulation protocols. Kushinsky et al. [40] and Haley et al. [41] have demonstrated this by perturbing central pattern generators with changes in temperature and pH level. They observed switching into a new mode of electrical oscillations, revealing the role of certain parameters in setting the onset of the new oscillation regime. Similarly, the rebound and adaptation dynamics of neurons are the signature of calcium currents, or of the hyperpolarization-activated h-current, which are only revealed under time-dependent current stimulation. The answer to the third challenge may therefore lie in the design of appropriate stimulation protocols, since the process of natural evolution would have filtered out most redundant functions in neurons and biocircuits.
From a modelling viewpoint, there are two reasons to suggest that the complement of ion channels might eventually be inferred. Firstly, conductance models are known to be sufficiently close approximations of actual neurons and to predict neuronal oscillations with great accuracy [7,8,32]. Therefore, algorithms that are capable of correcting residual model error will retain the parameter estimation accuracy demonstrated in well-posed models. Secondly, some of the estimated parameters consistently fall in a narrow range that matches known biological values. We first discuss well-posed problems, reviewing the criteria that warrant the systematic recovery of correct parameters in twin experiments. We then estimate the parameters of rodent and avian neurons with guessed conductance models. We describe three strategies that have obtained sensible biological parameters while accounting for residual model error.

Choosing Neuron Models
Neuron-based conductance models [42] describe the rate of change of the membrane voltage as a function of the ionic currents (Na, K, Ca, Cl, leak) crossing the neuron membrane. Each current flows through an ion-specific channel that is gated by the membrane voltage. Activation gates (m) and inactivation gates (h) open (resp. close) above a threshold voltage, albeit with a delay described by first-order kinetics. The voltage thresholds, recovery time constants, and maximum ionic conductances are the parameters that we seek to estimate here. The most common ionic currents are of the form:

I_i = g_i m_i^a h_i^b (E_i - V),   (1)

where g_i is the maximal conductance of channel subtype i, m_i and h_i are the probabilities of the activation and inactivation gates being open, a ≥ 0 and b ≥ 0 are the gate exponents, E_i is the reversal potential of the ionic species, and V is the membrane voltage. Choosing a neuron model entails ensuring that no two ionic currents have the same mathematical dependence on V and t (Equation (1)); the maximal conductances might otherwise not be uniquely determined. To avoid parameter degeneracy, it is necessary that ion channel subtypes differ from each other through their gate exponents, a and b, or through the voltage dependence of their inactivation time. This condition is sufficient if all other criteria in this section are met. The latter dependence can be bell-shaped (transient sodium current), rectifying (inactivating potassium current), or biexponential (low-threshold calcium current, A-current) [8,43]. Conversely, over-specified models, which include more ionic currents than are expressed in the data, are easily handled by data assimilation: the conductances of redundant ion channels are invariably set to zero when assimilating model data [10]. Reduced models have also been introduced to avoid over-specified parameters [44-46].
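As a concrete illustration, Equation (1) and the first-order gate kinetics can be sketched in a few lines (a minimal sketch: the sigmoid steady state, bell-shaped time constant, and all numerical values are illustrative assumptions, not fitted parameters):

```python
import numpy as np

def ionic_current(g, m, h, a, b, E, V):
    """Generic ionic current of Equation (1): I = g * m^a * h^b * (E - V)."""
    return g * m**a * h**b * (E - V)

def gate_step(x, x_inf, tau, dt):
    """First-order gate kinetics dx/dt = (x_inf - x) / tau (forward Euler step)."""
    return x + dt * (x_inf - x) / tau

def sigmoid_inf(V, V_half, dV):
    """Illustrative steady-state activation curve with threshold V_half (mV)."""
    return 1.0 / (1.0 + np.exp(-(V - V_half) / dV))

def bell_tau(V, t0, t1, V_half, dV):
    """Illustrative bell-shaped recovery time constant (ms)."""
    return t0 + t1 / np.cosh((V - V_half) / dV)

# Transient sodium-like current with a = 3 activation and b = 1 inactivation
V = -30.0                                    # membrane voltage (mV)
m = gate_step(0.1, sigmoid_inf(V, -35.0, 7.0),
              bell_tau(V, 0.1, 0.4, -35.0, 15.0), dt=0.01)
I_Na = ionic_current(g=120.0, m=m, h=0.6, a=3, b=1, E=50.0, V=V)
```

The gate thresholds (V_half), kinetic constants (t0, t1, dV), and maximal conductance g are precisely the quantities that data assimilation seeks to estimate.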
Biological considerations regarding the preferences of some ion channels to lie in greater density in the soma or the dendrites may also help select the ionic currents to include in the model. It is known, for example, that the density of calcium channels is greater in dendritic trees than in the soma [47-50], and hence, calcium channels may be omitted when modelling the soma compartment. In addition, single-compartment models are often sufficient to give excellent predictions of neuronal activity [7,8,13,32,51]. The minute contribution of dendritic compartments may therefore be treated as a perturbation alongside model error. Figure 1 shows the process of constructing a complete model from electrophysiological recordings of a songbird neuron [7,8]. The conductance model used in this example is given in the supplementary information. The optimal fit of the model (green line) to the data (black line) is obtained by assimilating these data over a 590 ms time window. Once estimated, the parameters are inserted in the model equations. The completed model is then forward integrated from t = 1090 ms onwards (red trace), driven by the same current protocol applied to the interneuron (blue trace) [8]. Biological considerations also guide the choice of the parameter search intervals, which must be specified prior to assimilating the data.

Figure 1. Membrane voltage time series (black trace) of an interneuron of the songbird brain. The model trace is directly overlaid on the data. The neuron membrane voltage was recorded under current stimulation by a protocol (blue trace) mixing chaotic oscillations (Lorenz system) and long current steps. Parameters are obtained from the model that best synchronizes its membrane voltage state variable to the data (green trace). This optimal model is then used to forward integrate the current protocol and predict the future state of the neuron (red trace). Note that the predicted membrane voltage occasionally departs from the data (black line). The blue arrows show the position of three action potentials missing from the data. These are missing because the current protocol at around 1650 ms is exactly the same as the one giving the continuous bursts at 1350 ms and 1550 ms. The missing action potentials are likely due to input from other neurons in the slice, which cannot be completely eliminated through inhibitors.

Observability
Observability is a measure of how well the state and parameters of a conductance model can be inferred from measurements of the membrane voltage and other experimental observations. Takens' theorem prescribes that all the information needed to constrain the model is contained in an observation window of finite duration [52]. Aeyels [53] refined this argument by predicting that the number of observations should be no fewer than 2L + 1, where L is the number of state variables in the system.
Electrophysiology and calcium imaging only allow a few state variables to be observed, namely the membrane voltage and the calcium concentration. Gate variables m_i and h_i need to be inferred indirectly from time-delayed observations of the membrane voltage. Parlitz et al. [54-58] and Letellier et al. [59-61] have studied the properties of the Jacobian matrix embedding the state vector at time t in the time-delayed measurements of the membrane voltage. They show that, when the embedding space is small, this matrix is singular in localized regions of dynamic maps. In those regions, the data do not contain enough information to uniquely constrain the neuron state, and hence, the system is not observable. Increasing the size of the embedding vector to match the number of state variables plus parameters was found to eliminate the domains where the Jacobian is singular. The assimilation window used to construct neuron models typically contains 10,000-100,000 data points for models with fewer than 100 state variables and parameters [7,8,62-64]. This suggests that the volume of data is sufficient to fulfill the observability criterion. This has been empirically verified by reliably retrieving the parameters of the original model in twin experiments, where the assimilated data were generated by the same model configured with known parameters. Twin experiments are the ultimate validation tool for well-posed inference problems, where the model generating the data is fully known.
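The embedding argument can be made concrete with a toy example. The sketch below (illustrative: a damped linear oscillator stands in for a neuron, and the rank probe is an empirical stand-in for the Jacobian analysis of [54-61]) builds 2L + 1 delay vectors from a scalar observation and uses the rank of their covariance to count the independent directions the data actually constrain:

```python
import numpy as np

def delay_embed(v, dim, tau):
    """Embed a scalar time series v into R^dim with delay tau (Takens)."""
    n = len(v) - (dim - 1) * tau
    return np.column_stack([v[i * tau : i * tau + n] for i in range(dim)])

# Toy observation: one component of a damped linear oscillator (L = 2 states)
t = np.linspace(0.0, 20.0, 2000)
v = np.exp(-0.05 * t) * np.sin(2.0 * np.pi * t)

L_states = 2
Y = delay_embed(v, dim=2 * L_states + 1, tau=25)   # Aeyels' bound: 2L + 1

# Rank of the covariance of the embedded cloud = number of independent
# directions constrained by the data. For this linear system the cloud
# spans exactly two directions, matching its two state variables.
rank = np.linalg.matrix_rank(np.cov(Y.T), tol=1e-10)
```

A rank-deficient covariance in some region of the reconstructed phase space would flag the loss of observability discussed above.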
Naturally, increasing the number of state variables that can be matched to observations increases observability across the entire phase space [54,55]; it also stabilizes convergence towards the global minimum of the objective function by reducing the occurrence of positive conditional Lyapunov exponents [65].

Identifiability
A model is identifiable if any two different sets of parameters produce different state trajectories starting from the same initial state [66,67]. Over-specified models may still be identifiable if unexpressed ion channels are filtered out (i.e., g i = 0), with all other ion channel parameters being fully constrained. This aside, over-specified models are not identifiable.
Identifiability is critically dependent on the stimulation protocol applied to the neuron. Unfortunately, parameter estimation has mainly focused on autonomous oscillators. Examples of parameter estimation from systems driven by an external force are comparatively few. Consequently, there is a need for further research on stimulation protocols that maximize identifiability. Central pattern generators are one example of autonomous oscillators where disparate parameters may yield the same oscillation patterns under no or tonic stimulation [34]. Similarly, no or tonic stimulation would fail to reveal adaptation in the electrical activity of a cell [68], from which calcium properties could not be identified. On the other hand, perturbing a central pattern generator, even with a modest variation in temperature or pH, is sufficient to switch between oscillatory modes [69-71] and in the process lift parameter degeneracy [40,41]. Maximum perturbation has been applied to neurons by designing current protocols that reveal adaptation mechanisms [7,8,14,32]. The design of such protocols has so far been guided by intuitive considerations based on the need to excite the hyperpolarized, sub-threshold, and depolarized regimes of the neuron, and to elicit the different response times of the neuron/circuit [7,8,14].
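A current protocol of the kind used in [7,8], mixing a chaotic segment with a long step, can be sketched as follows (the Lorenz integration step, current range, and segment lengths are illustrative assumptions, not the published protocol):

```python
import numpy as np

def lorenz_x(n, dt=0.005, s=10.0, r=28.0, b=8.0 / 3.0):
    """x-component of the Lorenz system, integrated with forward Euler."""
    x, y, z = 1.0, 1.0, 1.0
    out = np.empty(n)
    for i in range(n):
        x, y, z = (x + dt * s * (y - x),
                   y + dt * (x * (r - z) - y),
                   z + dt * (x * y - b * z))
        out[i] = x
    return out

def protocol(n_chaos=4000, n_step=1000, i_min=-100.0, i_max=300.0):
    """Chaotic segment rescaled to [i_min, i_max] pA, followed by a long step.
    The chaotic drive sweeps hyperpolarized, subthreshold, and depolarized
    regimes on many timescales; the step probes slow adaptation."""
    x = lorenz_x(n_chaos)
    chaos = i_min + (x - x.min()) / (x.max() - x.min()) * (i_max - i_min)
    step = np.full(n_step, 0.5 * i_max)
    return np.concatenate([chaos, step])

I_inj = protocol()
```
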
Identifiability may be achieved through careful design of current protocols. The aim is to retrieve the maximum amount of information from the data. The information contained in the current protocol may be quantified by the conditional Shannon entropy [72] relative to the conductance model. An alternative approach to testing identifiability has been suggested by Herman et al. [73] and Villaverde et al. [74]. Their method computes the Lie matrix linking the time derivatives of the data at different orders to the parameter vector at time t. Identifiability is validated by the matrix having full rank. If the matrix is singular, identifiability may be recovered by removing the matrix column(s) responsible for the singularity. This approach implies reparametrizing the model to remove functional overlap between parameters. Because models are meant to describe physical reality rather than the other way round, such simplifications rarely help characterize the underlying biology. Instead, Lie matrices are useful for validating the stimulation protocols that warrant the identifiability of neuron-based conductance models.
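A numerical stand-in for this rank test (using finite-difference sensitivities in place of Lie derivatives; the exponential toy models and parameter values are hypothetical) can be sketched as:

```python
import numpy as np

def sensitivity_matrix(model, p, t, eps=1e-6):
    """Finite-difference sensitivities S[i, j] = d model(t_i; p) / d p_j."""
    y0 = model(p, t)
    S = np.empty((len(t), len(p)))
    for j in range(len(p)):
        dp = p.copy()
        dp[j] += eps
        S[:, j] = (model(dp, t) - y0) / eps
    return S

t = np.linspace(0.0, 5.0, 50)

# Identifiable parametrization: y = a * exp(-k t); both columns independent
good = lambda p, t: p[0] * np.exp(-p[1] * t)
S_good = sensitivity_matrix(good, np.array([2.0, 0.5]), t)

# Overlapping parametrization: only the product p[0] * p[1] is constrained,
# so two sensitivity columns are proportional and the matrix loses rank
bad = lambda p, t: p[0] * p[1] * np.exp(-p[2] * t)
S_bad = sensitivity_matrix(bad, np.array([2.0, 1.0, 0.5]), t)

# tol absorbs the finite-difference truncation error
rank_good = np.linalg.matrix_rank(S_good, tol=1e-3)
rank_bad = np.linalg.matrix_rank(S_bad, tol=1e-3)
```

Full column rank at a point indicates local identifiability there; the rank deficiency of the second parametrization flags the functional overlap between p[0] and p[1].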
Numerical experiments have shown that identifiability criteria are fulfilled when current protocols elicit the full range of membrane voltage amplitudes and dynamics. Protocols that meet these conditions were found to constrain all model parameters to within 0.1% of their true value, independently of the waveform chosen for the current protocol.

Noise, the Effect of the Assimilation Window on the Parameter Distribution, and Adaptive Sampling
Noise in the membrane voltage introduces uncertainty in the parameter field. This is observed in the broadening of the posterior distribution function of extracted parameters (Figure 2), which is generally non-Gaussian [11]. Gaussian noise added to the current protocol has a comparatively small effect on parameter estimates, as it is integrated by the conductance model.

Measurement noise or intrinsic noise [75] may be smoothed [7,8] to improve parameter accuracy. Parameter accuracy may also be enhanced by increasing the duration of the assimilation window. In Figure 2, the effect of increasing the assimilation window is to narrow the posterior distribution function [11,76]. The assimilation window can be widened without increasing the size of the problem by using an adaptive mesh size. This works by selectively sampling action potentials at a faster rate than subthreshold oscillations across the assimilation window [11].
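The adaptive mesh idea can be sketched as follows (the spike threshold and decimation rate are illustrative; a real implementation would refine the mesh from the local dynamics of the recording):

```python
import numpy as np

def adaptive_indices(v, spike_thresh=-20.0, slow=10):
    """Adaptive mesh: keep every sample where v exceeds spike_thresh
    (action potentials) and every `slow`-th sample elsewhere."""
    near_spike = v > spike_thresh
    keep = near_spike | (np.arange(len(v)) % slow == 0)
    return np.flatnonzero(keep)

# Synthetic trace: slow subthreshold oscillation with two spike-like events
t = np.linspace(0.0, 100.0, 10001)
v = -65.0 + 2.0 * np.sin(0.5 * t)
for t0 in (30.0, 70.0):
    v = v + 80.0 * np.exp(-(t - t0) ** 2 / 0.5)

idx = adaptive_indices(v)
```

With these settings the mesh keeps the full sampling rate across both spikes while decimating the subthreshold segments tenfold, widening the effective window at a fixed problem size.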


Stochastic Regularization of Convergence to Local Minima Vicinal to the Global Minimum
Erroneous parameter solutions may be obtained when the parameter search ends at a local minimum of the cost function. This is a rare occurrence which is usually observed when a vicinal local minimum has nearly the same data misfit error as the global minimum, and the difference between the two is smaller than the termination error of the algorithm. In this case, the true solution at the global minimum can still be obtained through the noise regularization method [11].
Noise has the effect of shifting the positions of local and global minima relative to one another in parameter space, as well as shifting these minima up in energy; stochastic gradient descent [77,78] exploits this effect. The direction and magnitude of the shift depend both on the realization and on the standard deviation of the noise. By varying these noise parameters in a suitable manner, one can perturb the environment of the minima of the cost function and merge the local minimum, where the parameter search has stalled, with the global minimum. The judicious management of additive noise until a merger has occurred, followed by the removal of noise while tracking the return of the global minimum to the zero-noise level, can increase the accuracy of parameters beyond the limitations of gradient descent methods (Figure 3) [11].
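The merge-and-track mechanism can be emulated deterministically by averaging the cost over the additive noise, which smooths the cost surface. This is a toy sketch with a one-dimensional double-well cost, not the algorithm of [11]:

```python
def grad(p, sigma=0.0):
    """Gradient of the double-well cost c(p) = (p^2 - 1)^2 + 0.2 p averaged
    over additive Gaussian noise p -> p + sigma * xi (closed form from the
    Gaussian moments). At sigma = 0 the global minimum is near p = -1.02 and
    the local minimum near p = +0.97; for sigma^2 > 1/3 the two wells merge."""
    return 4.0 * p**3 + 2.0 * (6.0 * sigma**2 - 2.0) * p + 0.2

def descend(p, sigma, eta=0.01, n=5000):
    """Plain gradient descent on the noise-averaged cost."""
    for _ in range(n):
        p = p - eta * grad(p, sigma)
    return p

trapped = descend(1.0, sigma=0.0)        # plain descent: stuck in local minimum
p = 1.0
# Raise the noise above the merger value (sigma ~ 0.58), then anneal it back
# to zero while tracking the surviving minimum down to the zero-noise level.
for sigma in (0.8, 0.6, 0.4, 0.2, 0.0):
    p = descend(p, sigma)
annealed = p                              # lands in the global minimum
```
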

Figure 3. (a) The parameter estimation algorithm starts from an initial guess at the local minimum (pink star). The amplitude of noise is then increased. Up to a critical value, the parameter solution remains in the local minimum but shifts (blue dot). Above this critical value, at σ_3, the local minimum and the global minimum merge, causing an abrupt jump of the solution into the (noise-shifted) global minimum. The global solution above σ_3 is then fed back as input to the parameter estimation algorithm, and noise is progressively decreased to zero to obtain the global minimum at zero noise. (b) Trajectory of the neuron parameter vector projected in two dimensions as noise is varied. A jump is observed from the local to the global solution. (c) Schematics of the noise annealing protocol [11].

Biological Neurons: Estimating Parameters from Guessed Models
The lack of knowledge of the exact biological model presents the principal challenge to the aim of characterizing ion channels. When the cost function only measures the fitting error between the time series data and one or more state variables, the optimum fit cannot represent the optimum parameters which we seek. Instead, some parameters will hit the boundaries of their search intervals in an attempt to compensate for model error. A model error term must therefore be added to the cost function to eliminate this compensation. The weight of the model error term must be carefully balanced against the data error term; this is done via their respective covariance matrices, as we shall see below.
If the model error is omitted from the cost function, parameter solutions will hit search interval boundaries to compensate for model error. Conversely, if the model error term carries a disproportionate weight, parameter solutions will be loosely constrained. This section discusses strategies which have been implemented, with various degrees of success, to obtain biological parameters from models whose error is known to be small. Models with residual error are defined here as models that (i) predict neuronal dynamics to a very high degree of accuracy under any current stimulation, and (ii) assign biologically realistic values to the thresholds and recovery times associated with the largest ionic conductances.

Including Model Error in the Cost Function
Model error is accounted for by adding a quadratic term to the cost function. This term avoids the strict enforcement of constraints when the model F^g is guessed and necessarily departs from the true but unknown equations of motion F:

dx(t)/dt = F^g(x(t)) ≈ F(x(t)).

The constraint is linearized at each mesh point t_i of the assimilation window for each state variable x_l, giving constraints

x_l(t_{i+1}) ≈ G_l^g(x(t_i)),  l = 1, …, L.   (2)

The function G^g depends on the interpolation method used; for example, with the trapezoidal rule,

G_l^g(x(t_i)) = x_l(t_i) + (Δt/2) [F_l^g(x(t_i)) + F_l^g(x(t_{i+1}))].

The cost function has a model error term that measures the discrepancies between the x_l(t_{i+1}) and the G_l^g(x(t_i)) across the assimilation window (i = 0, …, N − 1) and state variables (l = 1, …, L). To this, one adds the data error term that measures the distance between the membrane voltage V_data(t_i) in the experimental recording and the corresponding state variable in the model, x_1(t_i), for i = 0, …, N:

C = (1/2N) Σ_{i=0}^{N} [V_data(t_i) − x_1(t_i)]^2 + Σ_{l=1}^{L} (1/R_l) Σ_{i=0}^{N−1} [x_l(t_{i+1}) − G_l^g(x(t_i))]^2.   (3)

This cost function is similar to the one used to infer the initial state of the atmosphere in meteorology [3]. It may also be formally derived from the action in Lagrangian mechanics [6]. A quadratic regularization term u^2 may be added and treated as an additional state variable to remove the occurrence of positive Lyapunov exponents during convergence [79,80]. The R_l are the diagonal elements of the covariance matrix of the model error. They establish a tradeoff between ignoring model error (R_l → ∞) and giving a greater weight to model error than to data error, with the consequence that the model may fail to synchronize to the data (R_l → 0). The computation of these covariance matrices is an active area of research in data assimilation, which places emphasis on calculating upper (majoration) bounds for the diagonal matrix elements [4]. This approach has been met with partial success in balancing data and model error in such a way that the constraints (Equation (2)) are not enforced too strictly and do not send the model error term to zero in the cost function (Equation (3)).
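A minimal sketch of the cost function in Equation (3), using a trapezoidal discretization for G_l^g and a one-variable leaky-membrane toy model (all names and numerical values are illustrative):

```python
import numpy as np

def cost(x, V_data, F_g, dt, R):
    """Equation (3): data misfit on the observed variable x[:, 0] plus
    model error on the discretized constraints, weighted by 1/R_l.
    x is the (N+1, L) trial trajectory, F_g the guessed vector field."""
    N = len(V_data) - 1
    data_err = 0.5 / N * np.sum((V_data - x[:, 0]) ** 2)
    f = np.array([F_g(xi) for xi in x])           # F^g at every mesh point
    G = x[:-1] + 0.5 * dt * (f[:-1] + f[1:])      # trapezoidal G_l^g(x(t_i))
    model_err = np.sum((x[1:] - G) ** 2 / R)      # R broadcasts over l
    return data_err + model_err

# Guessed model: leaky membrane dV/dt = -(V - E)/tau  (L = 1)
tau, E, dt = 10.0, -65.0, 0.1
F_g = lambda xi: -(xi - E) / tau

t = np.arange(0.0, 20.0, dt)
V_data = E + 10.0 * np.exp(-t / tau)              # exact model solution
x_true = V_data[:, None]                          # satisfies both terms
x_off = x_true + 1.0                              # violates data and constraints
R = np.array([0.01])
```

As R_l grows, the model error term is progressively ignored; as R_l → 0, the constraints are enforced strictly and the model may fail to synchronize to noisy data.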
Note that, in Equation (3), the covariance matrix of the observation error is taken to be the identity matrix. This is because consecutive measurements of the membrane voltage (or calcium concentration) are taken by the same apparatus, which carries the same experimental error, hence giving identical matrix elements on the diagonal. Measurements are also uncorrelated with one another, giving zero off-diagonal terms in the data error covariance matrix.
A simplification of the computation of the model error covariance matrix R was proposed by Ye et al. [6,65]. They applied a majoration to the model error covariance matrix by retaining the error on a single state variable, x_λ, preferably the one that is the most sensitive to model error. The model error on x_λ is weighted by a single adjustable hyperparameter, R_λ, which is varied to optimally balance data and model error. This parameter expresses the degree of confidence one has in the model: the larger the value of R_λ, the lower the confidence in the model constraint:

C = (1/2N) Σ_{i=0}^{N} [V_data(t_i) − x_1(t_i)]^2 + (1/R_λ) Σ_{i=0}^{N−1} [x_λ(t_{i+1}) − G_λ^g(x(t_i))]^2.   (4)

Although Ye et al. [65] did not directly assimilate erroneous models, they studied the convergence of data assimilation as a function of R_λ as the number of state variables accessible to observation was incremented. They obtained an optimum value of R_λ at which the sum of the data and model error passes through a minimum. Current research efforts are directed at engineering the covariance matrices of model error to obtain stable parameter estimates when model error is gradually switched on.

Supplementing Gradient Descent with Statistical Inference
Current clamp recordings of the membrane voltage, V_data(t), may be written as the sum of the useful signal, V_use(t), and a small error component, v_n(t), accounting for noise and model error. The cost function

C(p) = (1/2N) Σ_{i=0}^{N} [V_data(t_i) − x_1(t_i; p)]^2

defines the energy surface whose minimum we seek. This total energy may be decomposed as the sum of a free energy, F, and a random energy, TS, as [8]:

C(p) = F(p) + TS,

where

F(p) = (1/2N) Σ_{i=0}^{N} [V_use(t_i) − x_1(t_i; p)]^2

and

TS = (1/2N) Σ_{i=0}^{N} v_n(t_i)^2,

the cross terms averaging to zero when the noise is uncorrelated with the residual misfit. Minimizing the cost function by a deterministic process such as gradient descent only minimizes the free energy part. The misfit error δc = TS can only be minimized by statistical inference. Since we have assumed small experimental error, the misfit error determines the surface of an ellipsoid centered on the optimal parameter solution. Starting the search for parameters from different initial guesses will generate parameter solutions, p_k, distributed across this ellipsoid surface (Figure 4a). The true solution may therefore be approached by taking the mean value of the statistical sample of solutions derived from K initial guesses.

The approach has been successful in estimating the parameters of interneurons from the songbird high vocal center [8]. Figure 4b shows the activation and inactivation thresholds obtained by averaging the parameter solutions emanating from 84 initial guesses. These are the gate thresholds of all ion channels expressed in two interneurons, N1 and N2. The variance (vertical bars) indicates the dispersion of the underlying p_k values. The average thresholds (dot/square symbols) are otherwise remarkably consistent across the two interneurons [8]. This consistency strongly suggests that averaging the thresholds obtained by gradient descent gives sensible estimates of the true biological thresholds. Similar consistency was obtained across the ionic conductances and recovery time constants; the variance of the latter was, however, greater.
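The averaging strategy can be illustrated with a toy problem in which each descent run lands at a point displaced from the true parameters by a run-specific offset, emulating the misfit floor δc = TS (the parameter values, 5% offsets, and the number of guesses are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
p_true = np.array([120.0, 36.0, 0.3])           # illustrative "true" parameters

def solve_once(rng):
    """One gradient-descent run. The misfit floor displaces each run's
    minimum from p_true by a run-specific offset eps on the ellipsoid."""
    eps = 0.05 * p_true * rng.standard_normal(3)
    p = p_true * rng.uniform(0.5, 1.5, 3)       # random initial guess
    for _ in range(2000):                        # descend c_k = |p - (p_true + eps)|^2
        p = p - 0.2 * (p - (p_true + eps))
    return p

solutions = np.array([solve_once(rng) for _ in range(84)])
p_mean = solutions.mean(axis=0)

# Individual runs scatter by ~5%; their mean is ~sqrt(84) times closer
err_runs = np.abs(solutions / p_true - 1.0).mean()
err_mean = np.abs(p_mean / p_true - 1.0).mean()
```
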

Aggregate State Variables
One effect of model error is to transform the minima of the cost function into deep valleys that correlate two or more parameters. The data misfit ellipsoid δc = TS (Figure 4a) subsequently becomes spiky. This can be seen in the spectrum of the principal axes calculated from the inverse covariance matrix of the parameters of a songbird neuron [8]. This spectrum, shown in Figure 5, has an exponential distribution with a few outliers at i = 1-4. Examination of the covariance matrix of the estimated parameters (inset) identifies the recovery time constants of the ion channels as the most loosely constrained of all the parameters. This is not surprising, as they enter the argument of the exponentials describing the kinetics of the gate variables. The covariance matrix also suggests that parameter correlations are greater within the subset of parameters of each ion channel.

A strategy to extract biological information is to evaluate compound quantities that integrate correlations between more fundamental parameters. Ionic currents, or the amount of ionic charge transferred per action potential, may be reconstructed from parameter estimates of maximal ionic conductances, gate recovery times, and voltage thresholds. The reconstructed quantities often carry a much lower uncertainty than the estimates of individual model parameters. Morris et al. [81] have recently used this approach to predict the amount of charge transfer through individual ion channels modulated by inhibitor drugs. Data assimilation was able to predict the selectivity and potency of inhibitor drugs, in good agreement with IC50 analysis, by estimating the amount of charge transferred across each ion channel before and after inhibition.
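A sketch of the reconstruction of the charge transferred per action potential from estimated parameters (the voltage and gate waveforms and all parameter values below are illustrative, not estimates from [81]):

```python
import numpy as np

def charge_per_spike(g, a, b, E, m, h, V, t):
    """Charge transferred by one ionic current over a spike, Q = integral of I dt,
    reconstructed from the estimated conductance and gate trajectories."""
    I = g * m**a * h**b * (E - V)                        # ionic current, Equation (1)
    return np.sum(0.5 * (I[1:] + I[:-1]) * np.diff(t))   # trapezoidal integral

# Illustrative voltage and gate waveforms over a 2 ms spike
t = np.linspace(0.0, 2.0, 200)                    # ms
V = -65.0 + 110.0 * np.exp(-(t - 0.5) ** 2 / 0.05)
m = 1.0 / (1.0 + np.exp(-(V + 35.0) / 7.0))       # activation follows V
h = 0.6 * np.exp(-t / 1.5)                        # inactivation decays

Q_Na = charge_per_spike(g=120.0, a=3, b=1, E=50.0, m=m, h=h, V=V, t=t)
```

Because errors in g and in the gate kinetics are correlated along the valleys of the cost function, they partially cancel in the integral, which is why aggregate quantities such as Q carry a lower uncertainty than the individual parameters.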

Discussion
Gradient descent algorithms have reliably recovered the 10-100 parameters of neuron-based conductance models. When the problem is well-posed, these parameters are systematically found to lie within 0.1% of the true values, i.e., the parameters set in the model used to synthesize the time series data. An accuracy better than 0.1% presupposes that the criteria of observability and identifiability are satisfied. Observability can be further improved by increasing the number of state variables that are observed, for example by complementing electrophysiological recordings with calcium [82] or optogenetic recordings [83]. Observing more state variables also makes the parameter search less likely to develop positive Lyapunov exponents [65,82,84,85] and more likely to converge [54,55]. Identifiability is conditional on the nature of the stimulation applied to the system. Although much theoretical work remains to be done to design stimulation protocols that transfer a maximum of information from the data to the model, numerical studies show that protocols eliciting both hyperpolarized and depolarized oscillations, while covering the full spectrum of ionic gate kinetics, are sufficient to recover all original parameters in twin experiments. Noise in experimental data is readily dealt with by smoothing membrane voltage time series, which reduces the uncertainty on the parameters. Additive Gaussian noise may even be used constructively to regularize convergence towards the global minimum of the cost function.
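The smoothing step can be sketched as follows; the trace, noise level, and filter window are illustrative assumptions rather than values from any of the cited experiments.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy subthreshold voltage oscillation (mV) sampled at dt (ms), corrupted
# by additive Gaussian measurement noise.
dt = 0.02
t = np.arange(0.0, 20.0, dt)
clean = -60.0 + 30.0 * np.sin(2 * np.pi * t / 5.0)
noisy = clean + rng.normal(0.0, 2.0, size=t.size)

# Zero-phase moving-average filter with a window (~0.5 ms) much shorter
# than the oscillation period, so the waveform is barely distorted.
w = 25
smoothed = np.convolve(noisy, np.ones(w) / w, mode="same")

# Compare RMS error before and after smoothing (edges excluded).
rms_before = np.sqrt(np.mean((noisy - clean) ** 2))
rms_after = np.sqrt(np.mean((smoothed[w:-w] - clean[w:-w]) ** 2))
print(f"RMS error: {rms_before:.2f} mV -> {rms_after:.2f} mV")
```

Reducing the measurement noise in this way tightens the minima of the cost function and hence the uncertainty on the estimated parameters.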
Currently, the major roadblock in estimating the parameters of neurons is model error. Conventional approaches insert model error in the cost function alongside data error. This requires balancing data and model error, either through ab initio calculations of the model error covariance matrix, or by using a variational parameter such as R_λ as an upper bound on model error. In either case, the data assimilation algorithm must relax the enforcement of constraints without sending the model error term of the cost function to zero. Reservoir computing [1] can, in principle, also help correct model error: the non-physical models, built by training a neural network on time series data, can be used to identify error in the conductance equations. The excellent predictions of neuron oscillations made by conductance models have suggested two empirical methods which revealed ion channel properties in good agreement with voltage clamp measurements and IC50 analysis. The first approach consists of taking the mean expectation value of a statistical sample of parameters computed from many different initial guesses. This method was shown to give consistent activation thresholds and ion channel conductances across the complement of ion channels of nominally equivalent songbird neurons. The second approach used the observation that parameter correlations tend to be greatest within each ion channel. We showed that quantities reconstructed from these parameter subsets, namely the charge transferred per action potential and the ionic current, could be predicted with enough accuracy to identify which ion channels were inhibited when an inhibitor drug was applied, and the potency of the drug. Data assimilation has also successfully optimized conductance models derived from the I-V characteristics of neuromorphic circuits [51,86].
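The first approach, averaging estimates over many initial guesses, can be illustrated on a toy cost function with a correlated valley. The quadratic cost and all values below are hypothetical stand-ins for a conductance-model fit, not the actual estimation problem.

```python
import numpy as np

rng = np.random.default_rng(2)

# "True" parameters of the toy problem and a near-degenerate Hessian:
# the 0.95 off-diagonal term creates a shallow valley correlating p[0]
# and p[1], mimicking the effect of model error on the cost landscape.
true_p = np.array([1.0, 2.0])
H = np.array([[1.0, 0.95],
              [0.95, 1.0]])

def cost_grad(p):
    """Gradient of the quadratic cost 0.5 (p - true_p)^T H (p - true_p)."""
    return H @ (p - true_p)

# Run plain gradient descent from many random initial guesses and
# collect the resulting estimates.
estimates = []
for _ in range(50):
    p = true_p + rng.normal(0.0, 2.0, size=2)   # random initial guess
    for _ in range(2000):
        p = p - 0.1 * cost_grad(p)
    estimates.append(p)

# The sample mean averages out the scatter along the sloppy valley.
mean_p = np.mean(estimates, axis=0)
print("sample mean of estimates:", mean_p)
```

Individual fits scatter along the valley, but their mean expectation value recovers the underlying parameters, which is the rationale behind pooling estimates from many initial guesses.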
The conditioning of neuromorphic circuits [69,71] with the parameters that produce the correct adaptation to stimuli is generating considerable interest for bioelectronic medicine [87] and brain-machine interfaces [88,89].
Supplementary Materials: The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/a15050168/s1. Figure S1: The generic rate equation of a gate variable, here n(t). V_n is the threshold of the sigmoidal activation and dV_n is the width of the transition from the open to the closed state of the gate; dV_n > 0 gives an activation curve, dV_n < 0 an inactivation curve. The recovery time τ_n(V) of most gate variables has a bell-shaped curve defined by 3 parameters: V_n, t_0n, and ε_n. This dependence describes all activation gates and almost all inactivation gates, the exceptions being the K2 and CaT inactivation recovery times, which exhibit a rectifying behaviour and a biexponential dependence, respectively. Figure S2: Mathematical form of the 9 ionic currents used to model the songbird neuron in Figure 1 of the manuscript.
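Assuming the standard sigmoidal activation and bell-shaped recovery time described in Figure S1, the generic gate kinetics can be sketched as follows; the parameter values are illustrative, not those of any specific channel.

```python
import numpy as np

# Assumed illustrative gate parameters: activation threshold and width (mV),
# baseline recovery time and peak excess recovery time (ms).
V_n, dV_n = -30.0, 10.0
t_0n, eps_n = 0.5, 4.0

def n_inf(V):
    """Steady-state activation; dV_n > 0 gives activation, dV_n < 0 inactivation."""
    return 0.5 * (1.0 + np.tanh((V - V_n) / dV_n))

def tau_n(V):
    """Bell-shaped recovery time, peaking at V = V_n."""
    return t_0n + eps_n * (1.0 - np.tanh((V - V_n) / dV_n) ** 2)

def dn_dt(n, V):
    """Generic rate equation: first-order relaxation of n(t) toward n_inf(V)."""
    return (n_inf(V) - n) / tau_n(V)
```

This first-order form covers all activation gates and most inactivation gates of the model; the rectifying and biexponential exceptions noted above require separate expressions.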

Conflicts of Interest:
The author declares no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.