This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/3.0/).

Brain-computer interfaces (BCIs) mostly rely on electrophysiological brain signals. Methodological and technical progress has largely solved the challenge of processing these signals online. The main issue that remains, however, is the identification of a reliable mapping between electrophysiological measures and relevant states of mind. This is why BCIs are highly dependent upon advances in cognitive neuroscience and neuroimaging research. Recently, psychological theories have become more biologically plausible, leading to more realistic generative models of psychophysiological observations. Such complex interpretations of empirical data call for efficient and robust computational approaches that can deal with statistical model comparison, such as approximate Bayesian inference schemes. Importantly, the latter enable the optimization of a model selection error rate with respect to experimental control variables, yielding maximally powerful designs. In this paper, we use a Bayesian decision theoretic approach to cast model comparison in an online adaptive design optimization procedure. We show how to maximize design efficiency for individual healthy subjects or patients. Using simulated data, we demonstrate the face- and construct-validity of this approach and illustrate its extension to electrophysiology and multiple hypothesis testing based on recent psychophysiological models of perception. Finally, we discuss its implications for basic neuroscience and BCI itself.

Brain-computer interfaces (BCIs) enable direct interactions between the brain and its bodily environment, as well as the outside world, while bypassing the usual sensory and motor pathways. In BCI, electroencephalography (EEG) is by far the most widely used technique, either with patients or healthy volunteers, simply because it offers a non-invasive, direct and temporally precise measure of neuronal activity at a reasonable cost [

In this paper, we would like to further promote the idea that BCI and cognitive neuroscience researchers can help each other in pursuing this common goal. In short, the BCI paradigm puts the subject in a dynamic interaction with a controlled environment. From the perspective of cognitive neuroscience, this is a new opportunity to study normal and pathological brain functioning and to test mechanistic neurocognitive hypotheses [

Thankfully, a recent and growing trend has been to increase the permeability of the border between the BCI and cognitive neuroscience communities. New applications have emerged that rely on both disciplines and, thus, bring short-term benefit to both. One example is the so-called brain-state-dependent stimulation approach (BSDS) [

In this paper, we extend and formalize the BSDS approach by showing that our ability to process neuroimaging data online can be used to optimize the experimental design at the subject level, with the aim of discriminating between neurocognitive hypotheses. In experimental psychology and neuroimaging, this is a central issue, and examples range from staircase methods used to estimate an individual sensory detection or discrimination threshold [

We introduce a generic approach in which real-time data acquisition and processing is aimed at discriminating between candidate mappings between physiological markers and mental states. This approach is essentially an adaptive design optimization (ADO) procedure [

This paper is organized as follows. In the Theory and Methods section, we first describe the class of dynamical models that we compare. To make this paper self-contained, but still easy to read, we provide an appendix with a comprehensive summary of the variational Bayesian inference approach (see

A schematic illustration of the adaptive

In this section, we briefly introduce the very general type of complex generative models for which the proposed ADO procedure is most appropriate. In their general form, such models are defined by a pair of assumptions {

The second component,

More recently, a related dynamical-system-based approach has been derived to model psychological states, their evolution over time and their mapping onto observable behavioral measures (e.g., choices, reaction times) [

Most of the generative models that are used in cognitive neuroscience fall into the class of nonlinear Gaussian models. Our approach combines two recent methodological advances and brings them online for ADO. First, we use a Bayesian framework to invert and compare such generative models [

To use the same criterion online, in order to optimize the experimental design for model comparison at the individual level, we simply proceed as illustrated in

Running the variational Bayes (VB) inference for each model,

Updating the prior over models with the obtained posteriors;

Computing the design efficiency or Laplace-Chernoff bound for each possible value of the experimental design variable,

Selecting the optimal design for the next trial or stage.
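Assuming a discrete design space, static models and Gaussian observation noise, the four steps above can be sketched as follows. This is an illustrative stand-in, not the VBA toolbox implementation: the variance-based efficiency score replaces the Laplace-Chernoff bound, and the Gaussian likelihood update replaces the full variational Bayes inversion described in the text.

```python
import numpy as np

def ado_loop(models, designs, simulate, sigma=1.0, n_trials=100, threshold=0.95, seed=0):
    """Minimal online ADO sketch: pick the design that best separates model
    predictions, observe an outcome, then update posterior model probabilities."""
    rng = np.random.default_rng(seed)
    log_ev = np.zeros(len(models))              # accumulated log-evidence per model
    for t in range(n_trials):
        post = np.exp(log_ev - log_ev.max())
        post /= post.sum()                      # current posterior over models
        if post.max() >= threshold:
            break                               # stopping criterion: conclusive
        # design efficiency proxy: posterior-weighted spread of model predictions
        # (stands in for the Laplace-Chernoff bound used in the paper)
        preds = np.array([[m(u) for u in designs] for m in models])
        mean_pred = post @ preds
        score = post @ (preds - mean_pred) ** 2
        u = designs[int(np.argmax(score))]      # most discriminating design
        y = simulate(u) + rng.normal(0.0, sigma)
        # Gaussian likelihood update of each model's evidence
        log_ev += -0.5 * ((y - np.array([m(u) for m in models])) / sigma) ** 2
    post = np.exp(log_ev - log_ev.max())
    post /= post.sum()
    return post, t

# toy example: two retention-like curves, data generated under model 0
m0 = lambda u: np.exp(-0.3 * u)
m1 = lambda u: (1 + u) ** -0.8
post, n = ado_loop([m0, m1], designs=np.linspace(0.5, 20, 40), simulate=m0, sigma=0.2)
```

In practice, each model's predictive density would come from the VB inversion of the previous trials, and the efficiency score would be the Laplace-Chernoff bound described in the Appendix.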

Finally, the online experiment is interrupted as soon as some stopping criterion has been met. Typically, the experiment is conclusive as soon as one model is identified as the best, for instance, when its posterior probability becomes greater than 0.95. If this is not the case, when an

We now turn to the validation of the proposed approach. We describe two studies based on synthetic data. The first one demonstrates the face and construct validity of the approach by reproducing the simulation example in [

In order to illustrate our approach for ADO and to provide a first demonstration of its face and construct validity, we reproduce results from Cavagnaro and colleagues [

Model power (POW):
p = a (t + 1)^{−b}

Model exponential (EXP):
p = a e^{−bt}

In each equation, the symbol p denotes the predicted probability of correct recall, t denotes the lag time between study and test, and a and b are the model parameters.
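The two retention curves can be coded directly; the parameter values a = 0.8 and b = 0.4 below are illustrative, and the "most diagnostic" lag is simply the one where the predicted recall probabilities differ most.

```python
import numpy as np

def pow_model(t, a=0.8, b=0.4):
    """Power-law retention: p = a * (t + 1) ** -b (illustrative parameters)."""
    return a * (t + 1.0) ** -b

def exp_model(t, a=0.8, b=0.4):
    """Exponential retention: p = a * exp(-b * t) (illustrative parameters)."""
    return a * np.exp(-b * t)

lags = np.array([1.0, 5.0, 20.0])
gap = np.abs(pow_model(lags) - exp_model(lags))
# the most diagnostic lag time is where the two predictions differ most
best_lag = lags[np.argmax(gap)]
```

At short lags the two curves are nearly indistinguishable, which is why an adaptive procedure that targets the diagnostic lags outperforms a fixed design.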

As in [

The observable data,

For each simulated participant, ADO was initialized with the same priors over model parameters:

To demonstrate how our new instantiation of ADO extends to nonlinear dynamic causal models, which are of increasing interest in cognitive neuroscience, we now turn to a second series of original simulations. We therefore consider recent models of human perceptual learning in a changing environment [

Below, we describe the perceptual (evolution) and response (observation) models used to simulate MMN-like responses.

We considered a simplified version of the perceptual learning model proposed in [ ]. The binary sensory input (x_1 = 1 for deviant and x_1 = 0 for standard stimuli) is governed by a state, x_2, at the next level of the hierarchy. The brain perceptual model assumes that the probability distribution of x_1 is conditional on x_2, as follows:
p(x_1 │ x_2) = s(x_2)^{x_1} (1 − s(x_2))^{1 − x_1} = Bernoulli(x_1; s(x_2))

where s(·) denotes the sigmoid function, such that s(0) = 0.5.

Equations (6) and (7) imply that the states x_1 = 0 and x_1 = 1 are equally probable when x_2 = 0.

The state x_2 itself changes over time (trials) as a Gaussian random walk, so that the value,

Setting the parameter ω to −Inf implies that x_2 is fixed over time. In all other cases, the magnitude of changes in x_2 over time (trials) is controlled by x_3 (the third level of the hierarchy) and
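A minimal simulation of this three-level hierarchy can be sketched as follows, under the assumption (consistent with the parameter settings used in the simulations below) that the variance of x_2's random walk is exp(κ·x_3 + ω) and that of x_3's walk is ϑ:

```python
import numpy as np

def simulate_hierarchy(n_trials=200, omega=-4.0, kappa=1.0, theta=0.2, seed=1):
    """Simulate the 3-level hierarchy: x3 and x2 are Gaussian random walks,
    x1 is a Bernoulli variable with probability s(x2), s being the logistic
    sigmoid. With omega = -inf, x2 is frozen (as for model M1 in the text)."""
    rng = np.random.default_rng(seed)
    s = lambda z: 1.0 / (1.0 + np.exp(-z))
    x2, x3 = 0.0, 0.0
    x1 = np.empty(n_trials, dtype=int)
    for t in range(n_trials):
        x3 += rng.normal(0.0, np.sqrt(theta))      # volatility level (3rd level)
        step_var = np.exp(kappa * x3 + omega)      # variance of x2's walk
        x2 += rng.normal(0.0, np.sqrt(step_var))   # event-probability level
        x1[t] = rng.random() < s(x2)               # deviant (1) / standard (0)
    return x1

x1 = simulate_hierarchy()
```

Freezing the hierarchy (omega = −inf) makes standards and deviants equally likely on every trial, which is the "no learning" limiting case.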

Graphical illustration of the hierarchical perceptual (generative) model with States x_1, x_2 and x_3. The probability at each level is determined by the variables and parameters at the level above. Each level relates to the level below by controlling the variance of its transition probability. The highest level in this hierarchy is a constant parameter, ϑ. x_1 determines the probability of the input stimulus: standard (0) or deviant (1). The model parameters,

One can quantify the novelty of sensory input using Bayesian surprise. In what follows, we assume that EEG response magnitudes encode the Bayesian surprise induced by the observation of sensory stimuli at each trial. This is in line with recent empirical studies of the MMN in oddball paradigms [

Recall that, at any given trial, the Bayesian surprise is simply the Kullback-Leibler divergence between the prior and posterior distribution [

Note that under the Laplace approximation, BS has a straightforward analytic form (see [
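Since both prior and posterior are Gaussian under the Laplace approximation, the Bayesian surprise reduces to the closed-form KL divergence between two univariate Gaussians. The sketch below assumes the direction KL(posterior ‖ prior):

```python
import numpy as np

def bayesian_surprise(mu0, var0, mu1, var1):
    """KL divergence KL(posterior || prior) between two 1-D Gaussians:
    posterior N(mu1, var1), prior N(mu0, var0). This is the closed form
    available under the Laplace approximation (all densities Gaussian)."""
    return 0.5 * (np.log(var0 / var1) + (var1 + (mu1 - mu0) ** 2) / var0 - 1.0)

# no belief update -> zero surprise; a shifted posterior -> positive surprise
assert bayesian_surprise(0.0, 1.0, 0.0, 1.0) == 0.0
```

A deviant stimulus that shifts the posterior mean (or sharpens the posterior variance) thus yields a larger surprise, which is the quantity assumed to drive the simulated EEG response magnitude.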

We considered the problem of comparing five different perceptual models given simulated EEG data (see

Five alternative models used and compared in simulations.

Models | ω | κ | ϑ | Ability to Track Event Probabilities | Ability to Track Environmental Volatility
---|---|---|---|---|---
M1 | −Inf | 0 | - | No | No
M2 | −5 | 0 | - | Low learning | No
M3 | −4 | 0 | - | High learning | No
M4 | −5 | 1 | 0.2 | Low learning | Yes
M5 | −4 | 1 | 0.2 | High learning | Yes

We simulated 75 experiments in total, corresponding to 15 different synthetic subjects simulated under each model type as the true model. Each experiment consists of 350 trials. ADO was compared with the following two classical designs. The “stable” classical design has a fixed probability of the occurrence of a deviant (

Simulations in this work were performed using the VBA toolbox [

In brief, ADO chooses, at each stage, the lag time that maximizes the difference between model predictions and then updates model probabilities based on the model evidences. For example, in Stage 3 of the simulated experiment depicted in

In line with Cavagnaro

Finally,

Predictions of the power (POW) and exponential (EXP) models in the first four stages of one simulated experiment and the landscape of selection error rate across lag time. The predictions are based on the prior parameter estimates at each stage. The text above and inside the graphs provides information about the prior probabilities of each model, the optimal designs for discriminating the models and the observed outcomes (correct responses) at each stage of the simulated experiment. Arrows denote the percentage of correct responses at the optimal lag time. For the heat maps of model predictions (

Lag time distribution for each experimental design (over the 30 simulations).

Posterior probabilities of the true (POW) model at each stage (average over 30 simulations).

In brief, our analysis reproduces the results of Cavagnaro

Here, we assess ADO’s ability to discriminate between complex (computational) models.

Simulated data for five Bayesian learning models (defined in

As in the previous

Adaptive design optimization (ADO) with learning models: simulation results. Note that a simulated experiment is deemed “conclusive” whenever the true model posterior probability is equal to or greater than 95%. (

In conclusion, ADO performs better than classical designs, yielding fast and efficient experiments for all the models considered. This last point is important, since it implies that ADO does not induce biases in model selection.

Posterior model probabilities at each trial in our simulated experiment: the average over 15 simulations for each model.

In this paper, we demonstrate the added value of real-time acquisition and BCI loops when applied to online design optimization. This work follows recent advances in Bayesian decision theoretic approaches to design optimization in experimental neuroscience [

First, ADO’s performance depends upon the accuracy of prior information regarding model parameters. In fact, non-informative priors are unacceptable, because they induce flat predictive densities for all models, which prevents any design optimization procedure [

Second, real-time processing of electrophysiological data remains challenging, because of data contamination by high-magnitude artifacts (

Third, ADO cannot be used to optimize the experimental design and to select relevant data features (e.g., EEG markers) at the same time. This implies that admissible data features have to be identified prior to the experiment.

A promising application of ADO is differential diagnosis, whereby one seeks to discriminate between alternative pathological mechanisms. One such example is the inference of patients’ mental states from electrophysiological markers in coma and related disorders [

Our paper aims to provide a proof of concept of an original way to conduct basic research experiments. Using simulations, we demonstrated robust advantages of optimal design when the ADO procedure was compared with classical designs in behavioral or electrophysiological experiments. We envisage that the present paper could pave the way for future BCI applications in both basic and clinical research.

The authors are grateful to Karen Reilly for insightful comments on an early version of this manuscript. This work was supported by the French ANR project ANR-DEFIS 09-EMER-002 CoAdapt and a grant from the Fondation pour la Recherche Médicale (FRM) to G.S., E.M., O.B. and J.M. A.B. is funded by the MEEGAPERF project, DGA-RAPID. J.D. acknowledges support from the European Research Council. This work was also performed within the framework of the LABEX CORTEX (ANR-11-LABX-0042) of Université de Lyon, within the program “Investissements d’Avenir” (ANR-11-IDEX-0007) operated by the French National Research Agency (ANR). We gratefully acknowledge CC-IN2P3 through TIDRA [

The authors declare no conflict of interest.

In this Appendix, we briefly describe how approximate Bayesian inference applies to dynamic causal models, with a particular emphasis on Bayesian model comparison.

In the Bayesian framework, defining model

For a more detailed description of the VB approach and an exemplar application in neuroimaging, we refer the interested reader to [

In the Bayesian framework, comparing Model _{1} with Model _{2} rests upon computing the Bayes factor:
B_{12} = p(y │ M_1) / p(y │ M_2)

where p(y │ M_k) is the marginal likelihood (model evidence) of Model M_k. A Bayes factor greater than 20 corresponds to strong evidence in favor of Model M_1 (equivalently, to a posterior probability above 0.95 under equiprobable model priors).

Under equiprobable priors over models, this boils down, for Model M_k, to computing its posterior probability:

p(M_k │ y) = p(y │ M_k) / Σ_j p(y │ M_j)

Then, a natural decision criterion is to select as the best model the one that obtains a posterior probability greater than 0.95.
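Numerically, this posterior is a softmax of the log-evidences. A minimal sketch follows (the log-evidence values are illustrative; the max-shift avoids numerical underflow):

```python
import numpy as np

def posterior_model_probs(log_evidence):
    """Posterior p(M_k | y) under equiprobable model priors: a softmax of
    log-evidences, computed with a max-shift for numerical stability."""
    le = np.asarray(log_evidence, dtype=float)
    w = np.exp(le - le.max())
    return w / w.sum()

p = posterior_model_probs([-100.0, -103.0])   # Bayes factor exp(3), about 20
conclusive = p.max() > 0.95                   # decision rule from the text
```

Note that a log-evidence difference of 3 (a Bayes factor of about 20) is exactly what is needed to cross the 0.95 decision threshold with two models.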

In this Appendix, we summarize the decision theoretic approach introduced in [

In _{e}

The task of design optimization is to reduce the effect of the data sampling process upon the overall probability of selecting the wrong model. In other words, the design risk we want to minimize corresponds to the marginalization of the above probability over the whole sample space. The optimal design, u^{*}, thus writes:

u^{*} = argmin_u ∫ P(error │ y, u) p(y │ u) dy

Unfortunately, the above integral has no analytical closed form and will be difficult to evaluate in most cases. As proposed in [ ], one can instead minimize the so-called Laplace-Chernoff bound on the selection error rate, which derives from the Jensen-Shannon divergence, D_{JS}, between the marginal predictive densities of the candidate models.
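For discrete predictive densities, the Jensen-Shannon divergence entering the bound can be computed directly. The sketch below is illustrative (two models, two possible outcomes, in nats):

```python
import numpy as np

def jensen_shannon(p_list, weights):
    """Jensen-Shannon divergence between discrete predictive densities:
    D_JS = H(sum_k w_k p_k) - sum_k w_k H(p_k), with H the Shannon entropy
    and w_k the prior probability of model k."""
    p = np.asarray(p_list, dtype=float)
    w = np.asarray(weights, dtype=float)
    H = lambda q: -np.sum(q[q > 0] * np.log(q[q > 0]))
    mix = w @ p
    return H(mix) - np.sum(w * np.array([H(q) for q in p]))

# identical predictions are useless for model comparison: D_JS = 0
d0 = jensen_shannon([[0.5, 0.5], [0.5, 0.5]], [0.5, 0.5])
# maximally distinct predictions: D_JS = log(2) nats (its upper bound for 2 models)
d1 = jensen_shannon([[1.0, 0.0], [0.0, 1.0]], [0.5, 0.5])
```

Designs under which the models make identical predictions carry no information for model selection, whereas designs that drive D_{JS} toward its maximum minimize the bound on the selection error rate.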

In this Appendix, we disclose the relationship between the Chernoff bound we use for online design optimization and the criterion proposed in the seminal work by Myung, Pitt and Cavagnaro [

The Chernoff bound writes (see the previous Appendix) as a function of the Jensen-Shannon divergence, D_{JS}:

Minimizing this bound on the model selection error rate, with respect to the design variable

Simply unfolding Shannon’s entropy and applying Bayes rule yields:

The Jensen-Shannon divergence, or equivalently, the above conditional mutual information, is the relevant term of the Chernoff bound on the model selection error rate. For simple models, such as the memory retention models compared here and in [