A Sparse Multiwavelet-Based Generalized Laguerre–Volterra Model for Identifying Time-Varying Neural Dynamics from Spiking Activities

Modeling of a time-varying dynamical system provides insights into the functions of biological neural networks and contributes to the development of next-generation neural prostheses. In this paper, we formulate a novel sparse multiwavelet-based generalized Laguerre–Volterra (sMGLV) modeling framework to identify time-varying neural dynamics from multiple spike train data. First, the significant inputs are selected by using a group least absolute shrinkage and selection operator (LASSO) method, which can capture the sparsity within the neural system. Second, a multiwavelet-based basis function expansion scheme with an efficient forward orthogonal regression (FOR) algorithm aided by mutual information is utilized to rapidly capture the time-varying characteristics from the sparse model. Quantitative simulation results demonstrate that the proposed sMGLV model outperforms the initial full model and state-of-the-art modeling methods in tracking performance for various time-varying kernels. Analyses of experimental data show that the proposed sMGLV model can capture the timing of transient changes accurately. The proposed framework will be useful to the study of how, when, and where information transmission processes across brain regions evolve during behavior.


Introduction
Modeling of information transmission processes across brain regions, including spike generation and synaptic transmission, is challenging because neurobiological processes can be nonstationary [1,2]. For example, studies establish that synaptic plasticity, in the specific forms of long-term potentiation (LTP) and long-term depression (LTD), occurs in response to specific input patterns, which manifests as system nonstationarity. Tracking the changes of such time-varying functional connectivity at the level of neural ensembles remains difficult [3] due to these nonstationary properties.
Generally, three major classes of system identification methods have been applied to the study of time-varying neural systems. The first class of nonstationary system modeling methods is the data-driven artificial neural networks (ANNs), which have been widely applied to various fields, including speech recognition [4], forecasting [5], and nonstationary or nonlinear system modeling [6]. However, the multiple hidden layers and nodes in the network structure are usually difficult to interpret. The second class of time-varying parametric estimation approaches is based on the adaptive filtering strategy, such as the steepest descent algorithm [7], stochastic gradient algorithms [8], and the Kalman filter and its variants [9]. Adaptive filter algorithms for spike train data update the model parameters iteratively according to the difference between the occurrence of a real spike and the estimated spiking probability [10], which is advantageous for online estimation of a model that best fits an unknown neural system. However, if the model coefficients change abruptly, adaptive filters may be incapable of capturing the transient properties of a time-varying system because of the convergence time they require [11].
The third class is the basis function expansion method, where the unknown time-varying parameters are projected onto a set of predefined basis functions. By this means, the time-varying identification problem can be transformed into an expansion-based time-invariant parametric modeling scheme, with a remarkable reduction in the number of coefficients to be estimated. A variety of basis functions can be utilized under this strategy, including the Walsh function [12], Chebyshev basis [13], Laguerre polynomials [14], Legendre series [15], and wavelets [16,17]. Each family of basis functions should be considered based on its fit to the kernel shape if the number of basis functions is limited. For example, the Chebyshev basis and the Legendre polynomials both perform well in tracking smoothly or slowly changing parameters [13], while the Walsh and Haar basis functions are appropriate for tracking abrupt time-varying parameters [18]. Towards applicability to a dynamical system with spike train inputs and outputs, the Chebyshev basis expansion technique has been tested [19]. However, an unknown neural system can involve multitudes of time-varying dynamics, where both smooth and abrupt changes occur simultaneously, and the Chebyshev basis expansion may be insufficient to capture such diverse time-varying properties effectively. Instead, we propose the use of wavelet basis functions, which are localized in both the time and frequency domains [20] and can effectively represent a sharp or slow variation vanishing outside a short time interval [21,22]. Li et al. previously combined the cardinal B-spline basis function with a block least mean square algorithm to track both slow and rapid changes of coefficients effectively in time-varying linear systems [23,24]. The team later extended this method to the identification of nonlinear time-varying systems and obtained satisfactory results in tracking time-varying variations from nonstationary EEG data [25].
A biological neural network commonly exhibits small-world properties where the neurons are not fully connected [26,27]. Direct application of the basis function expansion method to all input spiking signals can therefore easily suffer from an overfitting problem due to spurious input connectivity. Regularization is crucial for achieving functional sparsity at the global level and has been increasingly applied to solve ill-posed or overfitting problems in statistics and machine learning [28]. In the context of sparse neural connectivity, regularization methods are also widely used [29]. Various regularization approaches, such as the least absolute shrinkage and selection operator (LASSO) [30], the smoothly clipped absolute deviation (SCAD) [31], group LASSO [32], and group bridge [33], have been employed to obtain sparse representations. For example, Song et al. implemented a group LASSO to capture the sparsity of a neural system [34]. However, existing studies have only focused on time-invariant systems, and there is a lack of reliable methods for the regularized estimation of inherently time-varying systems. In addition, our previous work indicated that the multiwavelet-based time-varying generalized Laguerre-Volterra model can track both slow and rapid changes of time-varying parameters simultaneously for spike train data [24], although it may fall into the dilemma of overfitting when the neural system is sparsely connected. Therefore, we integrate both the group LASSO and the multiwavelet basis expansion method in a unified framework to combine their advantages (the group LASSO algorithm can eliminate spurious inputs and capture the sparsity of a neural system, while the multiwavelet basis expansion method can effectively track both slow and rapid changes of time-varying parameters simultaneously) and achieve an efficient and accurate estimation of the nonstationary spiking system.
In this paper, we propose a sparse multiwavelet-based generalized Laguerre-Volterra (sMGLV) modeling framework to identify the underlying nonstationary sparse neuronal connectivity by observing the input and output spike trains only. The contributions of this paper are threefold. First, we address the problem of effective input selection by using a group LASSO regularization method in the analysis of the time-varying dynamics within a neural system. Second, in order to estimate the time-varying kernels quickly and accurately, we project the time-varying parameters in the sparse model onto a set of multiwavelet-based basis functions so that the time-varying parameter estimation problem can be tackled within a time-invariant regression scheme. Third, we utilize an effective forward orthogonal regression (FOR) algorithm aided by the mutual information (MI) criterion (FOR-MI) to determine and refine the model structure. Redundant model terms or regressors are removed, and thus a compact model with good generalization performance is achieved. In particular, the proposed sMGLV modeling approach can produce an optimal model structure to characterize the time-varying dynamical processes underlying the spiking system. The novel sMGLV framework integrates the group LASSO regularization algorithm and the multiwavelet basis expansion method in a unified framework to obtain an efficient and accurate estimation of the nonstationary spiking system. An obvious advantage of the proposed framework lies in its ability to track rapidly and even sharply time-varying processes when the neurons are not fully connected. Extensive simulation results demonstrate that the sMGLV model outperforms existing state-of-the-art modeling methods in the identification of sparse time-varying neural connectivity. Redundant inputs can be effectively eliminated while more reliable kernel estimations are obtained from a group of neural spiking signals. Experimental results show that the proposed method can accurately track changes of functional connectivity in real retinal spike train signals.

Methodology
The schematic diagram of the proposed sMGLV framework is shown in Figure 1. The sMGLV model is implemented by using the Laguerre expansion of the kernel technique, the group LASSO regularization method, and the multiwavelet expansion scheme, based on the time-varying Volterra series and the generalized linear model. Furthermore, an effective FOR-MI algorithm provides an optimal model structure, where the expansion parameters are obtained with a maximum likelihood estimation method. Finally, the time-varying Volterra kernels are reconstructed for a biologically plausible description of the neural dynamics.

Time-Varying Generalized Volterra Model
Neural spiking activities can be viewed as point processes, represented as binary signals [35]. That is, a 1 in the time series corresponds to a spike observed in the predefined time interval, and a 0 otherwise [36]. The spiking activity of an output neuron depends not only on the current activities of other neurons, but also on the spiking histories, which requires a dynamical model to capture the causal relationship [34]. Given the history of the input and output signals, the conditional probability of an action potential in the output signal can be expressed as in Equation (1) [34], where y denotes the output signal, X represents the set of all equivalent input signals, and M is the memory length of the system. Note that X contains the output signal y itself for the purpose of including the autoregressive effect of the preceding activities; that is, the total number of input signals is N_0 − 1, and the autoregressive effect from the preceding output is written as x_{N_0} for simplicity (Table 1). The input-output dynamics are represented by a multi-input, multi-output (MIMO) framework, which is decomposed into a series of structurally plausible multi-input, single-output (MISO) models. The structures of the time-varying MIMO and MISO neural systems for spiking data are given in Figure 2. Note that a population of neurons in a network tends to be sparsely connected: an output is typically innervated by only a small subset of neurons in its input region and thus exhibits a high level of sparsity in nature.

(Table 1, notation excerpt: α, time-invariant multiwavelet expansion coefficients; B_m, the m-th order B-spline basis function; j, the scale parameter of the B-spline basis function; m, the order parameter of the B-spline basis function; p, the number of selected significant model terms.)
The Volterra series expression has been widely used to represent the feedforward and feedback kernels in the context of the spiking neural system [34]. To describe the causal relationship between the input and output spiking probability, the generalized linear model with a time-varying Volterra series is utilized as in Equation (2), where k_0 is a scalar zeroth-order kernel function, and k^(n), n = 1, 2, …, N, are first-order kernel functions describing the relationship between the output spiking probability and the n-th input. Here, g(•) is a known probit link function, which can be interpreted as a noisy threshold function that transforms the pre-threshold membrane potential into output spikes [36]. With the probit link function, Equation (2) can be rewritten in the form of the normal cumulative distribution function, as in Equation (3) [34], where erf is the Gaussian error function, defined in Equation (4) as erf(x) = (2/√π) ∫_0^x e^(−t²) dt.
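As a concrete illustration of the probit link described above, the mapping from the pre-threshold value to a firing probability through the normal cumulative distribution function can be sketched as follows (the function name is illustrative):

```python
import math

def probit_firing_probability(theta: float) -> float:
    """Map a pre-threshold value to a spike probability via the
    standard normal CDF, expressed with the Gaussian error function:
    P = 0.5 * (1 + erf(theta / sqrt(2)))."""
    return 0.5 * (1.0 + math.erf(theta / math.sqrt(2.0)))
```

At theta = 0 the probability is exactly 0.5, and the mapping increases monotonically toward 1 as theta grows, matching the noisy-threshold interpretation.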

Time-Varying Generalized Laguerre-Volterra Model
The memory length of neurons may last for hundreds of milliseconds. If M is large, thousands of unknown coefficients have to be estimated for each input kernel in Equation (2). To alleviate the computational complexity, the Laguerre expansion decomposes the kernel functions k^(n)(t, τ) into a weighted summation of predefined orthogonal Laguerre basis functions, as in Equation (5), where b_l (l = 1, 2, …, L) represents the l-th order Laguerre basis function, and c_l^(n)(t) denotes the time-varying Laguerre expansion coefficients of the n-th kernel. The total number of basis functions L has been chosen in the range of 3 to 15 based on previous experimental analyses [37]. L also determines the complexity of the Laguerre expansion model: a larger L gives rise to more model terms and increases the computational complexity, while producing a more accurate model with better generalization. The Laguerre basis functions b employed here can be obtained iteratively from the recursion in Equation (6) [38], where β is the Laguerre parameter (0 < β < 1) that determines the exponential decay rate of the Laguerre basis functions. The bigger the parameter β is, the slower the Laguerre basis function decays, and a longer spread of significant non-zero values can be seen in the basis function [37]. System kernels with longer memory may therefore require a larger β for efficient representation, so proper selection of β and L can effectively represent the time-varying system coefficients. Figure 3 shows the first 15 Laguerre basis functions, where the number of basis functions L equals 15 and β equals 0.7. The basis functions span the entire system memory [0, M].
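The recursion of [38] is not reproduced in this extraction; a commonly used discrete-time form of orthonormal Laguerre basis functions (the exact indexing convention here is an assumption, so treat this as a sketch rather than the paper's definition) is:

```python
import numpy as np

def laguerre_basis(L: int, M: int, beta: float) -> np.ndarray:
    """Discrete orthonormal Laguerre basis functions b_l(tau),
    l = 0..L-1, tau = 0..M-1, with decay parameter 0 < beta < 1.
    Uses the standard recursion
    b_l(tau) = sqrt(beta)*b_l(tau-1) + sqrt(beta)*b_{l-1}(tau) - b_{l-1}(tau-1)."""
    b = np.zeros((L, M))
    sq = np.sqrt(beta)
    # zeroth-order function: exponentially decaying envelope
    b[0] = np.sqrt(1.0 - beta) * beta ** (np.arange(M) / 2.0)
    for l in range(1, L):
        for tau in range(M):
            prev = b[l, tau - 1] if tau > 0 else 0.0
            prev_lower = b[l - 1, tau - 1] if tau > 0 else 0.0
            b[l, tau] = sq * prev + sq * b[l - 1, tau] - prev_lower
    return b
```

With a memory window long enough for the exponential tails to vanish, the rows are (numerically) orthonormal, which is what makes the kernel expansion coefficients well-conditioned to estimate.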
With the input spike trains in X convolved with b, the convolved products v are represented by Equation (7). The time-varying generalized Volterra model in Equation (2) can thus be rewritten as Equation (8), where c_0(t) and c_l^(n)(t) represent the coefficients of the Laguerre expansion of the respective kernels as functions of time. The model described in Equation (8) forms the time-varying generalized Laguerre-Volterra (TVGLV) model. Given that the order of the Laguerre basis functions L is usually much smaller than the memory length M, the number of time-varying coefficients to be estimated is largely reduced after the Laguerre expansion procedure.

Functional Sparsity Based on a Group Regularization Method
Nodes in a biological neural network are not fully connected [26]; that is, only a small subset of the given multiple input signals may connect to the signal recorded from the output neuron. However, a full Volterra kernel model constructed with all the observed signals, as described in Equation (2) or (8), does not capture this sparsity. Therefore, functional sparsity is introduced into the time-varying generalized Laguerre-Volterra model. Using a regularization approach, functional sparsity selects the significant inputs by forcing the estimated kernels of the redundant inputs to zero.
Regularization approaches, including LASSO [30], SCAD [31], group LASSO [32], and group bridge [33], have been widely applied to simultaneous parameter estimation and variable selection. The former two methods achieve individual coefficient selection, while the latter two produce a sparse set of groups. In the TVGLV model, the Laguerre expansion coefficients can be grouped according to the associated input signals; for example, {c_1^(n)(t), …, c_L^(n)(t)} can be assigned to the same group because they all belong to the n-th input. Given this group setting, functional sparsity then refers to the case where the estimates for entire groups of Laguerre expansion coefficients remain zero-valued. Therefore, a group LASSO regularization method based on the minimization of the sum of squared errors is utilized for the sparse TVGLV model. The group LASSO penalty is constructed as a hybrid of the L_2-penalty and the L_1-penalty: within groups, it shrinks (but does not select) the coefficients with L_2-norm regularization; across groups, it selects the groups in an L_1-norm regularization fashion. The penalty is expressed in Equation (9), where the subscript "2" denotes the L_2-norm within groups and the superscript "1" represents the L_1-norm across groups; c̄_l^(n) represents the mean value of the time-varying Laguerre expansion coefficients over the whole data length, and can be estimated by using the maximum likelihood estimation method in Equation (8). The group LASSO regularization criterion can thus be represented by Equation (10), where λ > 0 is a regularization parameter that controls the degree of sparsity of the model, i.e., a larger value of λ yields sparser coefficients. A wide range of λ, from 10^−5 to 10^−1, is explored.
For each λ, the deviance is obtained from a cross-validation strategy. The λ with the smallest deviance and the corresponding set of coefficients C are used to identify the spurious inputs. For example, an input is determined to be redundant if all the Laguerre expansion coefficients belonging to its group equal zero.
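To make the group-level selection behavior concrete, the following sketch shows the group LASSO penalty and the blockwise soft-thresholding (proximal) step that zeroes out an entire group of Laguerre coefficients when its L2 norm is small. The function names and the closed-form proximal step are illustrative: the paper fits a likelihood-based criterion, whereas this sketch uses the simplified squared-error form for which the proximal step is exact.

```python
import numpy as np

def group_soft_threshold(c: np.ndarray, lam: float) -> np.ndarray:
    """Proximal operator of the group LASSO penalty lam * ||c||_2
    for one group of Laguerre expansion coefficients: shrinks the
    whole group, and zeroes it out entirely when its L2 norm falls
    below lam (i.e., the corresponding input is deemed redundant)."""
    norm = np.linalg.norm(c)
    if norm <= lam:
        return np.zeros_like(c)      # group eliminated: redundant input
    return (1.0 - lam / norm) * c    # group kept but shrunk

def group_lasso_penalty(groups, lam: float) -> float:
    """Sum over inputs of lam * ||c^(n)||_2: L2 within groups, L1 across."""
    return lam * sum(np.linalg.norm(g) for g in groups)
```

Note how the L2 norm within a group couples its coefficients, so either the whole group survives (shrunk) or the whole group is removed, which is exactly the behavior used to detect spurious inputs.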
Suppose that the selected Laguerre expansion coefficients are {c_0(t), c_l^(r_n)(t)}; the associated inputs can thus be expressed as x_{r_1}, x_{r_2}, …, x_{r_N}. According to Equation (8), the sparse TVGLV model can be written as Equation (11). By means of the group LASSO regularization method, functional global sparsity is achieved and the neural connectivity can thus be accurately represented. Furthermore, the time-varying parameters can be efficiently estimated by using the following multiwavelet-based expansion scheme.

Sparse Multiwavelet-Based Generalized Laguerre-Volterra Model
Previous work has illustrated that expanding time-varying coefficients onto a set of basis functions can largely improve the tracking performance of a time-varying model and reduce the model complexity, because the TVGLV model can be transformed into a time-invariant regression model [19]. Multiwavelet basis functions are usually employed to approximate the time-varying parameters because of their good performance in tracking both fast and slowly varying trends regardless of a priori knowledge of the nonstationary processes. Thus, we project the time-varying Laguerre expansion coefficients in the sparse TVGLV model onto a set of multiwavelet basis functions ψ_{j,k}^m(•), yielding Equation (12), where j and m denote the wavelet scale and the wavelet order, respectively, α_{j,k}^{0,m} and α_{l,j,k}^{r_n,m} are the expansion coefficients, and T is the length of the observed data. Multiwavelet basis functions combine a series of wavelet basis functions of different orders. It should be noted that wavelet basis functions of lower orders tend to perform well on signal changes with sharp transients, while wavelet basis functions of higher orders tend to perform well on smooth changes. Multiwavelet basis functions can thus be adopted to approximate time-varying coefficients with both rapid and slow variations. Considering that the cardinal B-spline basis functions are compactly supported with excellent approximation properties [17], ψ_{j,k}^m(•) can take the form of the B-spline basis function of the m-th order (B_m), as in Equation (13), where the variable x = t/T is normalized to [0, 1] and the shift index k satisfies −m ≤ k ≤ 2^j − 1. The scale j determines the resolution: a larger j leads to a higher resolution, but also to greater computational complexity. For most nonlinear dynamical modeling problems for neural spiking data, a scale j of 3 and wavelet orders selected from {ψ_{j,k}^m : m = 2, 3, 4, 5} are usually adequate, as graphically illustrated in Figure 4. Detailed discussions on the selection of the proper scale j and the associated wavelet order can be found in [39]. By substituting Equation (13) into Equation (11), the sparse multiwavelet-based generalized Laguerre-Volterra model is given in Equation (14), where α_{j,k}^{0,m} and α_{l,j,k}^{r_n,m} are the time-invariant coefficients to be estimated. For simplification, new variables V_{j,k}^{0,m}(t) and V_{l,j,k}^{r_n,m}(t) are defined in Equation (15). Substituting Equation (15) into Equation (14) yields Equation (16), where V_{j,k}^{0,m}(t) and V_{l,j,k}^{r_n,m}(t) are the full model terms in the sparse multiwavelet-based TVGLV expansion model. Hence, the original time-varying model described in Equation (11) is now transformed into a time-invariant regression model, which enables the coefficient estimation problem to be solved within the maximum-likelihood framework.
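The cardinal B-spline construction behind the multiwavelet basis can be sketched with the standard recursion for B-splines of increasing order (function names are illustrative):

```python
def bspline(m: int, x: float) -> float:
    """Cardinal B-spline of order m, supported on [0, m], built by the
    standard recursion B_m(x) = (x*B_{m-1}(x) + (m-x)*B_{m-1}(x-1)) / (m-1),
    starting from the indicator function of [0, 1)."""
    if m == 1:
        return 1.0 if 0.0 <= x < 1.0 else 0.0
    return (x * bspline(m - 1, x) + (m - x) * bspline(m - 1, x - 1.0)) / (m - 1)

def psi(m: int, j: int, k: int, x: float) -> float:
    """Multiwavelet basis function: B-spline of order m at scale j and
    shift k, evaluated at normalized time x = t / T in [0, 1]."""
    return bspline(m, (2 ** j) * x - k)
```

Since shifted cardinal B-splines of a fixed order form a partition of unity, summing psi over the admissible shifts k in [-m, 2^j - 1] reproduces a constant, which is a quick sanity check on the implementation.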

Model Structure Selection
The multiwavelet-based basis function expansion model in Equation (16) may involve myriad candidate terms or regressors if the model parameters L, J, m, and N are large, which can easily result in an overfitting problem. Thus, determining the significant terms or regressors to be included in the final identified model is crucial prior to estimating the time-invariant expansion coefficients.
LASSO-related penalized regression methods are popular for variable or term selection [30,40,41]. However, the LASSO algorithm cannot guarantee consistency in model term selection, and noisy terms can occasionally be selected [42]. Moreover, the computation of such methods is demanding when the number of candidate terms in the expansion is large [42]. By contrast, orthogonal-least-squares-type methods can obtain unbiased estimates [43]. The forward orthogonal regression (FOR) algorithm aided by the mutual information (MI) criterion (FOR-MI) is an efficient and widely-used method for model structure selection [44]. Thus, the FOR-MI method is adopted here to select the significant model terms of the neural dynamical system from spike train data, which effectively achieves local sparsity in the model structure. For two discrete random variables X_0 and Y_0, the MI criterion measures the reduction in the uncertainty of Y_0 due to the knowledge of X_0, defined in Equation (17) [45], where P(X_0) and P(Y_0) represent the marginal probability density functions, and P(X_0, Y_0) denotes the joint probability density function.
Given the binary characteristics of the input and output data, the model structure detection algorithm based on the full model in Equation (16) can be described as follows:
Step 1. Combining the output observation y with the full set of model candidates {V_{j,k}^{0,m}(t), V_{l,j,k}^{r_n,m}(t)}, the maximum likelihood estimator α_mle is obtained by using a generalized linear model fitting method. The output spike probability g(θ(t)) can then be computed from Equation (16). Let G(t) = g(θ(t)), t = 1, 2, …, T, be the vector of model outputs over the T time instants.
Step 2. The model terms v_i(t) ∈ V_1 = {V_{j,k}^{0,m}(t), V_{l,j,k}^{r_n,m}(t)}, i = 1, 2, …, M, are candidates for the first significant term. The candidate with the largest MI value with respect to G(t) is selected as the first significant term ϕ_1(t). The error-to-signal ratio (ESR) is used as the search termination criterion.
Step 3. The remaining model terms are evaluated in the same manner, and the candidate with the largest MI value is selected as the second significant term ϕ_2(t).
Step 4. Subsequent significant terms are selected in the same way, step by step, until the procedure terminates at the p-th step when the termination criterion is satisfied. A generalized cross-validation (GCV) criterion is used to determine the model size [44], where MSE(p) denotes the mean square error corresponding to the optimal model with the p selected significant terms. The effective number of model terms is obtained at the minimum GCV value. Through this model structure detection procedure, the significant model terms are selected from a large number of redundant candidates, and therefore a parsimonious model is obtained.
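A minimal sketch of the MI criterion for binary sequences and a greedy forward pass in the spirit of the steps above might look as follows; the orthogonalization of selected terms and the ESR/GCV termination logic of the actual FOR-MI algorithm are deliberately omitted, so this is illustrative only:

```python
import math
from collections import Counter

def mutual_information(x, y):
    """Plug-in MI estimate (in nats) between two binary sequences,
    computed from empirical joint and marginal frequencies."""
    n = len(x)
    joint = Counter(zip(x, y))
    px = Counter(x)
    py = Counter(y)
    mi = 0.0
    for (a, b), c in joint.items():
        p_ab = c / n
        # p_ab * log( p_ab / (p_a * p_b) ), with counts folded in
        mi += p_ab * math.log(p_ab * n * n / (px[a] * py[b]))
    return mi

def forward_select(candidates, y, n_terms):
    """Greedy forward pass: repeatedly pick the unselected candidate
    with the highest MI against the output."""
    selected = []
    remaining = dict(candidates)
    for _ in range(n_terms):
        name = max(remaining, key=lambda k: mutual_information(remaining[k], y))
        selected.append(name)
        remaining.pop(name)
    return selected
```

A candidate identical to the output attains MI = log 2 (one bit, in nats) for a balanced binary sequence, while a constant candidate attains MI = 0 and is never preferred.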

Parameter Estimation and Tuning Parameter Optimization
Let the significant model terms selected using the FOR-MI algorithm be φ = {ϕ_1, …, ϕ_p}; the associated time-invariant expansion coefficients α = {α_1, …, α_p} can then be estimated by using a maximum-likelihood method. The log-likelihood function l is defined over the observed spike train; plugging the selected model terms φ into the approximation of θ(t) yields an approximation of l. The criterion is a function of the basis function coefficients α = {α_1, …, α_p} and is denoted by l(α). The maximizer of l(α) is the maximum likelihood estimator (MLE), represented as {α̂_1, …, α̂_p}, which can be calculated using the standard iteratively reweighted least-squares method.
The optimization of the tuning parameters, including β, L, J, and m, is critical for achieving an accurate and efficient model. The well-known Bayesian information criterion (BIC) is employed to determine the proper tuning parameters, calculated as in Equation (27), where α̂(β, L, J, m) denotes the coefficients estimated for a given set of tuning parameters. The optimal tuning parameters of the proposed sparse multiwavelet-based generalized Laguerre-Volterra model are determined at the minimum BIC value.
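The BIC-based tuning described above can be sketched as follows, using the standard form BIC = −2 log L + p log T; the candidate bookkeeping and setting names are illustrative:

```python
import math

def bic(log_likelihood: float, n_params: int, n_obs: int) -> float:
    """Bayesian information criterion: -2 log L + p log T."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)

def select_tuning(candidates):
    """Pick the tuning-parameter setting (e.g., a (beta, L) pair) with the
    minimum BIC. `candidates` maps each setting to its
    (log_likelihood, n_params, n_obs) triple from a fitted model."""
    return min(candidates, key=lambda s: bic(*candidates[s]))
```

The p log T term penalizes larger settings of L, J, and m, so a richer model is kept only when its likelihood gain outweighs the added complexity.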

Kernel Reconstruction and Interpretation
With the proper model terms selected and the corresponding time-invariant expansion coefficients estimated, a parsimonious model can therefore be achieved through a kernel reconstruction and normalization procedure.
The Laguerre expansion coefficients {ĉ_0(t), ĉ_l^(r_n)(t)} can be calculated using Equation (12), and the Volterra kernels for the significant inputs can thus be reconstructed. For interpretation, the kernel functions are further normalized by the zeroth-order kernel, where the threshold for spike generation is set to 1 and k̂_0(t) is set to −1.
As for the interpretation of the reconstructed model, θ(t) is the conditional probability of an action potential in the output signal, and g(θ(t)) with a probit link function can be regarded as the membrane potential of the output neuron. k̂^(n)(t, τ) is the postsynaptic potential in the output neuron elicited by the n-th input neuron when n is within [0, N_0 − 1], and by the output neuron itself when n equals N_0. Inputs with all zero-valued kernels are identified as redundant inputs.

Simulation Studies
Synthetic spike train data were generated from known systems to study the tracking performance of the proposed sMGLV approach for both rapidly and gradually changing kernels. The simulated system shown in Figures 5-7 is an 8-input, 1-output spiking neural system in which the first four kernel functions (k^(1), k^(2), k^(3), k^(4)) undergo step changes while two further kernels (k^(5), k^(6)) concurrently undergo slower gradual changes. To make the tracking task more difficult, the last two kernels (k^(7), k^(8)) are designed to be zero, i.e., redundant inputs to the spiking neural system. The transient changes for all step-changing kernels occur at 400 s; in particular, the amplitudes of k^(1) and k^(2) are doubled, while the amplitudes of k^(3) and k^(4) are halved. To test the concurrent tracking performance in identifying gradual changes, k^(5) and k^(6) are linear time-varying kernels with different slopes, as shown in Figures 6 and 7. Additionally, the feedback kernel from the preceding output signal is denoted by k^(9) and is set to be time-invariant in this simulation. The zeroth-order kernel, i.e., the baseline, is set to a constant 0. The specific shapes of the different kernels are shown in Figure 5. The input signals are eight independent Poisson random spike trains with a mean firing rate of 10 Hz. The simulation length is 800 s with a bin size of 2 ms. Given the input spike trains and the known kernels defined in this simulation, the conditional probability of the output signal can be generated by using Equation (3), and the output spike trains are then produced at each time step through a Bernoulli process.
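The simulation setup described above (eight independent 10 Hz Poisson inputs, 2 ms bins, Bernoulli output generation) can be sketched as:

```python
import numpy as np

rng = np.random.default_rng(0)

# Setup mirroring the text: 8 Poisson inputs at 10 Hz with 2 ms bins,
# so each bin spikes with probability 10 * 0.002 = 0.02.
n_inputs, bin_size, duration = 8, 0.002, 800.0
n_bins = int(duration / bin_size)          # 400,000 bins over 800 s
rate = 10.0                                # mean firing rate in Hz
p_spike = rate * bin_size

# Binary input spike trains (1 = spike in that 2 ms bin)
X = (rng.random((n_inputs, n_bins)) < p_spike).astype(np.int8)

def bernoulli_output(prob: np.ndarray) -> np.ndarray:
    """Draw the output spike train from the per-bin conditional firing
    probability via a Bernoulli process."""
    return (rng.random(prob.shape) < prob).astype(np.int8)
```

In the full pipeline, `prob` would come from pushing the kernel outputs through the probit link of Equation (3); here it is left as an input so the generation step stays self-contained.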
Using the input and output spike train data, the kernel functions can be estimated by the proposed sMGLV method. As a comparison, the tracking result of a direct multiwavelet-based modeling method (dMGLV), which does not remove the redundant inputs, is also provided in order to verify the effectiveness of the proposed approach. The tuning parameters, chosen according to Equation (27), are the same for the two compared methods. Figure 5 illustrates the recovered kernel functions over different time courses using the dMGLV model and the proposed sMGLV model. The x-axis in the figure represents the time lag τ between previous input events and the present time, and the y-axis indicates the amplitude of the kernels. The true kernels are shown in black lines, and the estimated kernels over the time evolution are displayed in different colors. Specifically, Figure 5a represents the estimated kernels in the first half of the simulation time course (the first 400 s) using the two methods, while Figure 5b represents the estimated kernels in the second half (the last 400 s). The experimental results clearly indicate that the dMGLV model cannot capture global sparsity, i.e., it is unable to recognize the redundant inputs: the dMGLV modeling method estimates spurious inputs even for zero-valued kernel functions, as shown in Figure 5. In contrast, the proposed sMGLV model accurately recognizes all redundant inputs. As for the non-zero kernel functions, both the dMGLV and sMGLV modeling methods can estimate the kernel shapes accurately. In particular, the proposed sMGLV model effectively eliminates the redundant inputs, and therefore achieves a more reliable and accurate kernel estimation for the neural spiking system. We have also compared the tracking performance for both rapidly and gradually changing kernels over the whole simulation data length. As shown in Figures 6 and 7, apart from the proposed method, the stochastic state point process filtering algorithm (SSPPF), the SSPPF with the group LASSO algorithm (sSSPPF), and the dMGLV model are also utilized for comparison. Specifically, Figure 6 shows the actual and estimated kernels over the time evolution, where the x-axis denotes the simulation time, the y-axis represents the time lag τ from previous input events to the present time, and the amplitudes of the kernel functions are color-coded. Additionally, Figure 7 illustrates the tracking results for the peak values of the estimated nonstationary feedforward and time-invariant feedback kernels using the four methods, where the x-axis represents the simulation time course, and the y-axis indicates the kernel amplitudes for the different spike train signals. The goodness-of-fit is quantitatively evaluated by comparing the mean absolute error (MAE) and the normalized root mean square error (RMSE). The MAE and RMSE of the kernel estimates for the simulated spike train data, with respect to the corresponding true values, are given in Table 2.
Compared with the SSPPF, the sSSPPF, and the dMGLV model, the MAE and NRMSE for the proposed sMGLV are much smaller. The MAE and NRMSE of the kernel peak amplitudes are defined as

$$\mathrm{MAE}=\frac{1}{T}\sum_{t=1}^{T}\left|K(t)-\hat{K}(t)\right|,\qquad \mathrm{NRMSE}=\sqrt{\frac{\sum_{t=1}^{T}\left(K(t)-\hat{K}(t)\right)^{2}}{\sum_{t=1}^{T}K(t)^{2}}},$$

where $K(t)$ and $\hat{K}(t)$ represent the peak amplitudes of the actual and estimated kernels at time $t$, respectively. Table 2 statistically confirms the better performance of the proposed sparse multiwavelet-based sMGLV model. Compared with the conventional SSPPF, the SSPPF with the group LASSO, and the direct multiwavelet-based modeling method, the above simulation results show that the proposed approach achieves much higher accuracy in kernel estimation. The results also show that all four estimation methods can track the nonstationary kernels with step or linear changes. Although the SSPPF can roughly track the changes of the significant kernels, its estimates for the insignificant inputs are poor because it cannot identify redundant model inputs. By contrast, the sSSPPF first removes the redundant inputs and then provides relatively satisfactory tracking of the time-varying kernels by eliminating the disruptions from those redundant inputs. However, owing to the limited convergence rate of adaptive filtering methods, the sSSPPF cannot accurately capture the transient properties of jump or abrupt time-varying processes. The dMGLV method, which is established under the multiwavelet-based expansion framework, outperforms the SSPPF-type methods in tracking time-varying kernels with both rapid and gradual changes over time. Nevertheless, without a sparsity constraint, this parametric modeling method can still produce spurious kernels generated by the redundant model inputs. The proposed sMGLV instead provides optimal estimates consistent with the identification of sparse time-varying neural connectivity: the redundant inputs are effectively eliminated, and more reliable kernel estimates are obtained from a group of neural spiking signals. Therefore, the proposed sMGLV framework remarkably improves the tracking performance for time-varying kernels in time-varying dynamical systems based on spiking activities, and can be generally applied to the analysis of real nonstationary neural functional connectivity.
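For concreteness, the two error metrics defined above can be computed as in the following short sketch (the array contents are illustrative, not taken from the paper's simulations):

```python
import numpy as np

def mae(k_true, k_est):
    """Mean absolute error between actual and estimated peak amplitudes."""
    return np.mean(np.abs(k_true - k_est))

def nrmse(k_true, k_est):
    """Root mean square error normalized by the RMS of the actual kernel;
    undefined when the actual kernel is identically zero."""
    return np.sqrt(np.sum((k_true - k_est) ** 2) / np.sum(k_true ** 2))

# illustrative peak-amplitude traces over simulation time
k_true = np.array([1.0, 1.0, 2.0, 2.0])
k_est = np.array([0.9, 1.1, 1.9, 2.1])
print(mae(k_true, k_est))  # 0.1
```

Note that the NRMSE denominator is the energy of the true kernel, which is why it is undefined for kernels that are designed to be zero (see the footnote of Table 2).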
The estimated model under the proposed sMGLV framework can be further validated using a correlation plot between the estimated and actual output spike trains. Figure 8a shows the estimated and actual output spike trains based on identical input spiking signals. The pre-threshold membrane potential was first calculated from the input spike trains (X) and the estimated kernels (k̂) using the proposed sMGLV model. The firing probability intensity (θ) of the output was then obtained using the Gaussian error function in Equation (3) (Figure 8a, first row). The estimated output spike train (ŷ) was finally obtained through a threshold procedure (Figure 8a, second row). The corresponding actual spike train (y) is given in the third row of Figure 8a. Considering the stochastic nature of the model, the similarity between the estimated and actual output spike trains can be quantified with a correlation plot (Figure 8b). Note that the point-process signals ŷ and y are first convolved with a Gaussian kernel with standard deviation σ_g, and are thus smoothed into continuous signals $\hat{y}_{\sigma_g}(t)$ and $y_{\sigma_g}(t)$, respectively. The correlation coefficient r is defined as the Pearson correlation between the two smoothed signals [36]:

$$r=\frac{\operatorname{cov}\left(y_{\sigma_g},\hat{y}_{\sigma_g}\right)}{\sqrt{\operatorname{var}\left(y_{\sigma_g}\right)\operatorname{var}\left(\hat{y}_{\sigma_g}\right)}}.$$

Figure 8b shows that the proposed sMGLV model can generate output highly correlated with the actual output over a large range of the standard deviation σ_g, which further validates the effectiveness of the proposed framework. Results of additional intensive simulations have validated the advantages of the proposed sMGLV model over the state-of-the-art methods with respect to the identification of sparse time-varying neural connectivity. The proposed method first selects the significant inputs to avoid the estimation errors produced by the redundant ones; then, both the gradual and rapid changes of the time-varying kernels can be tracked rapidly and accurately by the sparse multiwavelet-based modeling framework.
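The smoothing-plus-correlation validation step can be sketched as follows; the bin size, firing rates, and surrogate spike trains below are illustrative assumptions, not values from the paper:

```python
import numpy as np

def correlate_smoothed(y_true, y_est, sigma_g, dt=0.002):
    """Convolve two binary spike trains with a Gaussian kernel of standard
    deviation sigma_g (seconds), then return the correlation coefficient r
    between the smoothed continuous signals."""
    n = int(np.ceil(4 * sigma_g / dt))
    t = np.arange(-n, n + 1) * dt
    g = np.exp(-t ** 2 / (2 * sigma_g ** 2))
    g /= g.sum()  # normalize so smoothing preserves total spike mass
    y_s = np.convolve(y_true, g, mode="same")
    e_s = np.convolve(y_est, g, mode="same")
    return np.corrcoef(y_s, e_s)[0, 1]

rng = np.random.default_rng(0)
y = (rng.random(5000) < 0.05).astype(float)           # surrogate actual train
y_hat = np.clip(y + (rng.random(5000) < 0.01), 0, 1)  # noisy surrogate estimate
r = correlate_smoothed(y, y_hat, sigma_g=0.05)        # r close to 1 here
```

Sweeping `sigma_g` and plotting `r` against it reproduces the kind of correlation plot shown in Figure 8b.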

Application to Functional Connectivity Analysis of Retinal Neural Spikes
The proposed sMGLV model can be generally applied to identify functional connections from experimental spike train data. A set of spontaneous retinal spike train recordings has been used to illustrate and reveal the potential properties of the neural dynamics [46]. The dataset contains recordings from 512 electrodes in rat retina. The spontaneous spike trains were recorded simultaneously for 1000 s, with a sampling rate of 20 kHz and a bin size of 2 ms. In this experimental example, the system is composed of four inputs and one output. Utilizing the proposed modeling method, the dynamics of four feedforward kernels and one feedback kernel can be estimated over the time course.
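A minimal sketch of how such recordings could be discretized into the 2 ms bins used here (the function and variable names are hypothetical, not from the paper's code):

```python
import numpy as np

def bin_spikes(spike_times_s, duration_s, bin_size_s=0.002):
    """Convert spike times in seconds into a binary spike train with 2 ms bins
    (a bin is 1 if it contains at least one spike)."""
    n_bins = int(round(duration_s / bin_size_s))
    idx = (np.asarray(spike_times_s) / bin_size_s).astype(int)
    train = np.zeros(n_bins, dtype=np.uint8)
    train[idx[idx < n_bins]] = 1
    return train

# three spikes landing in bins 0, 1, and 499 of a 1 s recording
train = bin_spikes([0.0005, 0.0031, 0.9990], duration_s=1.0)
print(train.sum())  # 3
```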
Similar to the previous simulated example, the orders of the B-spline basis functions were taken from 2 to 5 to construct the sMGLV model from the spike trains. The tuning parameters are optimized using Equation (26); the optimal Laguerre parameter and the number of Laguerre basis functions are 0.83 and 11, respectively. The regularization parameter in the group LASSO is obtained using the cross-validation method introduced in Section C. The group LASSO with the B-spline expansion method is applied to the experimental spiking data to reveal the potential properties underlying the retinal neural system. The parsimonious model structure is produced with the FOR-MI algorithm, and the expansion parameters are obtained with a maximum likelihood estimation method.
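The mechanism by which the group LASSO zeroes out entire inputs can be illustrated with the group soft-thresholding (proximal) operator applied to each input's vector of Laguerre expansion coefficients; this is a minimal sketch of that operator, and the paper's actual solver may differ:

```python
import numpy as np

def group_soft_threshold(alpha, lam):
    """Proximal step of the group LASSO: shrink the coefficient group alpha
    toward zero; if its Euclidean norm is below lam, the whole group (and
    hence the corresponding input) is set exactly to zero."""
    norm = np.linalg.norm(alpha)
    if norm <= lam:
        return np.zeros_like(alpha)
    return (1.0 - lam / norm) * alpha

weak = group_soft_threshold(np.array([0.3, -0.4]), lam=1.0)    # norm 0.5 < 1
strong = group_soft_threshold(np.array([3.0, -4.0]), lam=1.0)  # norm 5 > 1
print(weak)    # whole group zeroed: input eliminated
print(strong)  # group shrunk toward zero: input kept
```

Because the penalty acts on whole coefficient groups rather than individual coefficients, an insignificant input's entire kernel is removed, which is what produces the global sparsity described above.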
Figure 9 shows the reconstructed kernels obtained with the proposed sMGLV method over the time course. Specifically, the left panel of Figure 9 illustrates the estimated kernels at different times (i.e., 300 s, 400 s, 500 s, 600 s, 700 s), where the x-axis represents the time lag τ between previous input events and the present time, and the y-axis stands for the kernel amplitudes. The experimental results show that the kernels for the first and second input spike train signals (i.e., k(1) and k(2)) are nonzero and thus correspond to significant inputs, while the kernels for the third and fourth input spike train signals (i.e., k(3) and k(4)) are estimated to be zero over the whole system memory and are therefore deemed insignificant inputs. The sMGLV model can thus effectively identify the sparsity in the experimental system. Owing to the elimination of the spurious inputs, the kernels of the significant inputs can be estimated more accurately and remain stable. The right panel of Figure 9 shows the peak amplitudes and the corresponding standard deviations of the different kernels across the experimental time, where the red lines represent the changes of the peak amplitudes over time and the green shading indicates the standard deviation of the estimated kernel amplitudes. Both the peak amplitudes and their standard deviations can be readily derived from the time-invariant expansion coefficients α through Equations (12), (27) and (28). The peak amplitudes describe the changes of the estimated kernels during the experimental time, while the standard deviations account for the fluctuation ranges of the peak values. The results in the right panel illustrate that the peak amplitudes of the significant inputs (i.e., k(1) and k(2)) keep varying over time, indicating significantly time-varying connection strengths. These results demonstrate the applicability of the proposed sMGLV framework to real experimental data.

Conclusions
A novel sMGLV modeling framework has been formulated for the identification of time-varying neural dynamics from spiking activities. The proposed framework has three distinct advantages over the state-of-the-art modeling methods. First, in terms of sparse neural connectivity, the proposed sMGLV modeling framework combines the Laguerre expansion technique with the group LASSO algorithm to realize global sparsity, where spurious inputs are effectively eliminated by forcing their Laguerre expansion coefficients to zero. Second, the proposed sMGLV modeling scheme projects the time-varying coefficients onto a set of multiwavelet-based basis functions. By this means, the time-varying modeling problem is converted into a time-invariant regression problem, which further improves the tracking performance of time-varying models. Third, to address model overfitting, the proposed sMGLV model employs the FOR-MI method to select the significant terms and remove the redundant ones efficiently, finally producing a satisfactory sparse model with good generalization performance. Numerical simulations verified the effectiveness of the proposed modeling scheme in identifying the changes of the time-varying kernels. Its application to real retinal spike train recordings indicates that the proposed method can be applied to real experimental data and obtain a relatively faithful representation of the sparse functional connectivity among the neural population.

Figure 2 .
Figure 2. The structures of the time-varying MIMO and MISO neural spiking systems. (a) The MIMO neural spiking system with sparse connectivity among the neural population; the solid lines represent a causal relationship between the input and output spiking signals, while the dashed lines indicate no causal relationship between them. (b) The structurally plausible time-varying MISO neural spiking model.
Figure 3. The comparison of different kernel estimation results using the dMGLV and sMGLV methods in the first half of the time course, between 0 and 400 s.
Figure 4. The comparison of different kernel estimation results using the dMGLV and sMGLV methods in the second half of the time course, between 400 s and 800 s.

Figure 5 .
Figure 5. The estimations of time-varying kernels from an 8-input, 1-output system using the dMGLV and the proposed sMGLV modeling methods, where the true kernels are given at 400 s and 800 s, respectively, in two stages. (a) Comparison of the estimated kernels from the dMGLV and sMGLV models in the first half of the time course; (b) comparison of the estimated kernels from the dMGLV and sMGLV models in the second half of the time course.

Figure 6 .
Figure 6. Kernel shapes for a first-order, 8-input, single-output system with step and linear changes using four different algorithms across the simulation time evolution, shown in 2D. The feedback kernel considered in the system is written as k(9). The amplitudes of the kernels are color-coded.

Figure 7 .
Figure 7. Peak amplitudes of actual kernels (black) and estimated kernels (blue for SSPPF, green for sSSPPF, purple for dMGLV, and red for the proposed sMGLV model) across simulation time evolution in a time-varying system with a 1 s resolution.

Figure 8 .
Figure 8. Estimated output spike train and correlation plot. (a) Estimated output spike train calculated using the proposed sMGLV model: the first row is the estimated firing probability intensity; the second row is the estimated output spike train; the third row is the corresponding actual spike train fragment. (b) Correlation plot: the correlation coefficient r for the whole simulation data length is calculated as a function of the standard deviation σ_g of the Gaussian smoothing kernel.

Figure 9 .
Figure 9. The estimations of a 4-input, 1-output retina model using the proposed sMGLV modeling method. Estimated kernels at different times are drawn in different colors in the left panel; the peak amplitudes and the corresponding standard deviations of the estimated kernels are shown in the right panel.

Table 1 .
Table of notations.

Table 2 .
A performance comparison of kernel estimates for the simulation examples using different approaches. * The NRMSE does not exist when the corresponding actual kernel values are designed to be zero.