Review

Multivariate Modelling and Prediction of High-Frequency Sensor-Based Cerebral Physiologic Signals: Narrative Review of Machine Learning Methodologies

by Nuray Vakitbilir 1,*, Abrar Islam 1, Alwyn Gomez 2,3, Kevin Y. Stein 1, Logan Froese 4, Tobias Bergmann 5, Amanjyot Singh Sainbhi 1, Davis McClarty 6, Rahul Raj 7 and Frederick A. Zeiler 1,2,4,8

1 Department of Biomedical Engineering, Price Faculty of Engineering, University of Manitoba, Winnipeg, MB R3T 5V6, Canada
2 Section of Neurosurgery, Department of Surgery, Rady Faculty of Health Sciences, University of Manitoba, Winnipeg, MB R3A 1R9, Canada
3 Department of Human Anatomy and Cell Science, Rady Faculty of Health Sciences, University of Manitoba, Winnipeg, MB R3E 0J9, Canada
4 Department of Clinical Neuroscience, Karolinska Institutet, 171 77 Stockholm, Sweden
5 Undergraduate Engineering, Price Faculty of Engineering, University of Manitoba, Winnipeg, MB R3T 5V6, Canada
6 Undergraduate Medicine, College of Medicine, Rady Faculty of Health Sciences, Winnipeg, MB R3E 3P5, Canada
7 Department of Neurosurgery, University of Helsinki, 00100 Helsinki, Finland
8 Pan Am Clinic Foundation, Winnipeg, MB R3M 3E4, Canada
* Author to whom correspondence should be addressed.
Sensors 2024, 24(24), 8148; https://doi.org/10.3390/s24248148
Submission received: 14 October 2024 / Revised: 9 December 2024 / Accepted: 18 December 2024 / Published: 20 December 2024

Abstract
Monitoring cerebral oxygenation and metabolism, using a combination of invasive and non-invasive sensors, is vital due to frequent disruptions in hemodynamic regulation across various diseases. These sensors generate continuous high-frequency data streams, including intracranial pressure (ICP) and cerebral perfusion pressure (CPP), providing real-time insights into cerebral function. Analyzing these signals is crucial for understanding complex brain processes, identifying subtle patterns, and detecting anomalies. Computational models play an essential role in linking sensor-derived signals to the underlying physiological state of the brain. Multivariate machine learning models have proven particularly effective in this domain, capturing intricate relationships among multiple variables simultaneously and enabling the accurate modeling of cerebral physiologic signals. These models facilitate the development of advanced diagnostic and prognostic tools, promote patient-specific interventions, and improve therapeutic outcomes. Additionally, machine learning models offer great flexibility, allowing different models to be combined synergistically to address complex challenges in sensor-based data analysis. Ensemble learning techniques, which aggregate predictions from diverse models, further enhance predictive accuracy and robustness. This review explores the use of multivariate machine learning models in cerebral physiology as a whole, with an emphasis on sensor-derived signals related to hemodynamics, cerebral oxygenation, metabolism, and other modalities such as electroencephalography (EEG) and functional near-infrared spectroscopy (fNIRS) where applicable. It will detail the operational principles, mathematical foundations, and clinical implications of these models, providing a deeper understanding of their significance in monitoring cerebral function.

1. Introduction

The high-frequency measurements of cerebral physiologic parameters provide real-time insights crucial for the precise assessment of cerebral function, enabling the capture of rapid fluctuations, the detection of subtle changes, and the guidance of real-time interventions, making them indispensable for understanding, monitoring, and treating cerebral disorders [1,2,3]. These high-frequency signals are generated by a combination of invasive and non-invasive sensor technologies, which monitor cerebral oxygenation, metabolism, and hemodynamic parameters, including intracranial pressure (ICP) and cerebral perfusion pressure (CPP) [1,4,5]. Additionally, neural activity signals recorded by sensor-based modalities such as electroencephalography (EEG), functional near-infrared spectroscopy (fNIRS), and magnetoencephalography (MEG) provide complementary data that enrich the understanding of cerebral function, particularly when integrated with multivariate analysis approaches. Understanding and analyzing these signals are pivotal for grasping the complexities of cerebral processes, as it enables the identification of intricate patterns and the accurate pinpointing of anomalies [6]. The development of computational models of cerebral physiology plays a vital role in exploring the connections between measurable signals and the underlying physiological state. However, the complexity and multidimensionality of cerebral data pose significant challenges for traditional analytical methods [2,6]. Herein lies the importance of leveraging multivariate machine learning approaches to unlock the full potential of these signals.
Multivariate models, particularly machine learning models, have emerged as indispensable tools for cerebral physiological signal analysis due to their capacity to capture intricate relationships among multiple variables recorded by different sensors simultaneously. Unlike conventional univariate methods, which analyze individual signals in isolation and may overlook subtle yet clinically relevant patterns, multivariate models excel in integrating diverse physiological parameters together, extracting meaningful information from complex datasets [7]. This capability is particularly valuable in neuroimaging studies, where numerous factors, measured by different sensors, contribute to brain activity and dynamics [8]. Moreover, multivariate machine learning facilitates accurate modeling of cerebral physiologic signals, effectively capturing complex patterns in cerebral data, delineating functional brain networks, and helping to identify biomarkers of neurological disorders [9,10]. Such modeling approaches not only enhance our understanding of brain physiology but also hold promise for developing novel diagnostic and prognostic tools for a wide range of neurological conditions.
Furthermore, the predictive capabilities of multivariate machine learning are instrumental in advancing personalized medicine and treatment optimization in neurology [1,5]. By harnessing large-scale datasets encompassing diverse patient populations and clinical variables, these models can forecast disease progression, treatment response, and patient outcomes with unprecedented accuracy [11,12,13]. In the context of cerebral physiologic signals, predictive modeling enables patient-specific intervention, thereby optimizing therapeutic efficacy and improving patient care [14].
Additionally, machine learning models exhibit remarkable versatility, particularly in their capacity to be combined synergistically to tackle complex challenges in cerebral physiological signal analysis and modeling. By integrating multiple machine learning algorithms, the strengths of each model can be harnessed to address specific aspects of cerebral data analysis comprehensively [15,16]. For instance, ensemble learning techniques, such as random forests or gradient boosting, enable the aggregation of predictions from diverse base learners, enhancing predictive accuracy and robustness [17]. In the context of cerebral physiologic signals, combining various machine learning models allows for a more nuanced understanding of brain function and pathology by leveraging the complementary strengths of different approaches, as well as allowing for the mitigation of the limitations of individual models, such as overfitting or bias, resulting in more reliable and interpretable results [18].
In this study, hidden Markov models (HMM), convolutional neural networks (CNNs), long short-term memory (LSTM) networks, recurrent neural networks (RNNs), echo state networks (ESNs), random forests, XGBoost, support vector machines (SVMs), and Gaussian processes (GPs) are studied. Among these, CNNs, LSTMs, and RNNs fall under the category of deep learning, a subfield of machine learning. These models not only serve as powerful tools for automated feature extraction but are also widely employed as standalone predictive models, learning hierarchical representations directly from raw data. Their inclusion in this study highlights their versatility and importance within the broader machine learning landscape. Figure 1 offers a brief summary of the process from raw data to modeling with various machine learning models, highlighting the key steps involved. The models included in this study were chosen based on their ability to handle the complexity, sequential nature, and non-linear dynamics of the dataset. Classical models such as k-nearest neighbors (k-NN), logistic regression, linear regression, and naïve Bayes were excluded due to their limitations in modeling temporal dependencies, non-linear relationships, and high-dimensional data effectively. Additionally, the chosen models align with state-of-the-art approaches for similar applications, ensuring robust and interpretable results.
This review aims to provide a thorough investigation of commonly used multivariate machine learning models, categorized as state-space models, neural networks, ensemble learning methods, and kernel methods, which are also summarized in Table 1. By delving into their operational principles and mathematical formulations, this narrative seeks to clarify the complexities associated with these modeling approaches. Moreover, the discussion will not only cover theoretical foundations but also explore practical applications and clinical implications of these models. We conducted a literature search to summarize research from the past 10 years, focusing specifically on the multivariate applications of machine learning models to cerebral physiological signals. To gather the relevant literature, we used a targeted search strategy in PubMed, employing keywords related to machine learning models and cerebral physiological signals. The selected studies were analyzed by categorizing them based on the models used, the signals analyzed, and the purpose of using the model. These studies were then included to provide insight into the current use of machine learning models in this area of research, highlighting how these models are being applied in practice. By bridging theoretical concepts and practical implementations, this review examines how modern machine learning methodologies can be utilized to model cerebral physiologic signals, offering insights into their potential applications in analyzing and interpreting sensor-derived signals.

2. Methods

2.1. Multivariate State-Space Models

Multivariate state-space models are a class of statistical models used to represent complex systems characterized by multiple interacting variables evolving over time, and they are particularly useful for analyzing sequential data where observations are influenced by latent or unobservable states [19]. In a multivariate state-space model, the observed data are assumed to be generated by a process that evolves through a series of hidden states over time, and each hidden state is associated with a multivariate observation, capturing the relationships between different variables [20]. Multivariate state-space models offer a flexible framework for modeling temporal dependencies, capturing non-linear relationships, and handling missing or noisy data.

Hidden Markov Model (HMM)

The HMM is a versatile statistical modeling approach applicable to ‘linear’ challenges, such as sequences or time series, which assumes the existence of an underlying, unobservable (hidden) state sequence influencing the observed data [21]. The HMM is characterized by two main components: the observable states, which are the directly measurable or observable aspects of the system being modeled, and the hidden states, which represent unobserved variables that capture underlying structures or dynamics in the data sequence. The HMM assigns transition probabilities to states, with each state emitting symbols based on its corresponding emission probability [22]. The probability of transitioning from one hidden state to another is described by the state transition probabilities, denoted by Equation (1), where A represents the transition matrix, aij represents the probability of transitioning from hidden state i at time t to hidden state j at time t+1, and S represents the set of hidden states [22].
$A = \{a_{ij} = P(X_{t+1} = S_j \mid X_t = S_i),\ 1 \le i, j \le n\}$ (1)
The emission probabilities describe the probability of observing a particular observable state given the current hidden state. These probabilities are modeled using an emission matrix B, as shown in Equation (2), where O(t) represents one of the m observable symbols and bj(t) represents the probability of emitting symbol O(t) from state j.
$B = \{b_j(t) = P(O(t) \mid X_t = S_j),\ 1 \le j \le n\}$ (2)
The initial state distribution describes the probability of starting in each hidden state. This distribution is modeled using an initial state vector π, denoted by Equation (3), where πi represents the probability of starting in hidden state i.
$\pi = \{\pi_i = P(X_1 = S_i),\ 1 \le i \le n\}$ (3)
With these components, the probability of observing a sequence of observable symbols O = (o1, o2, ..., om) can be computed.
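To make the notation concrete, the following minimal sketch (with a small, hypothetical two-state HMM whose transition, emission, and initial-state values are purely illustrative) implements the forward algorithm, which computes the probability of an observation sequence from A, B, and π by marginalizing over all hidden state paths.

```python
import numpy as np

# Hypothetical 2-state HMM with 3 observable symbols (illustrative values only).
A = np.array([[0.7, 0.3],          # a_ij: P(X_{t+1} = S_j | X_t = S_i)
              [0.4, 0.6]])
B = np.array([[0.5, 0.4, 0.1],     # b_j(o): P(O_t = o | X_t = S_j)
              [0.1, 0.3, 0.6]])
pi = np.array([0.6, 0.4])          # pi_i: P(X_1 = S_i)

def forward_probability(obs, A, B, pi):
    """Forward algorithm: P(o_1, ..., o_m) summed over all hidden state paths."""
    alpha = pi * B[:, obs[0]]              # initialization
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]      # recursion: propagate, then emit
    return alpha.sum()                     # termination

obs_sequence = [0, 2, 1, 2]                # indices of observed symbols
print(forward_probability(obs_sequence, A, B, pi))
```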
HMMs offer flexible modeling of diverse sequential data types and can capture temporal dependencies and transitions effectively [22]. Moreover, HMMs provide a probabilistic framework for robust inference and uncertainty estimation, making them valuable in various domains such as speech recognition, natural language processing, and signal processing. HMMs also offer interpretability by revealing insights into hidden state dynamics and feature extraction capabilities for capturing relevant patterns [22]. However, they assume a Markovian property that may not hold for complex systems and require specifying the number of hidden states beforehand, which can be challenging when the true number of states is unknown [23]. Additionally, HMMs may struggle to capture long-term dependencies in data and face computational challenges with high-dimensional data [22]. Sensitivity to initialization and parameter tuning, as well as vulnerability to overfitting, are also potential drawbacks [24]. Despite these limitations, HMMs remain a valuable tool for modeling sequential data and understanding dynamic processes in various fields. HMMs are increasingly being utilized for the analysis of cerebral physiologic signals, as they provide a robust framework for modeling the temporal dynamics and underlying states that generate these complex signals [25,26,27].

2.2. Neural Networks

Neural networks are a class of machine learning models inspired by the structure and function of the human brain. Comprising interconnected nodes, or neurons, organized into layers, neural networks are adept at learning complex patterns and relationships within data. Input data are fed into the network, processed through a series of hidden layers, and transformed into output predictions. Each neuron applies a mathematical operation to its inputs, with parameters (weights and biases) that are learned during training to minimize the difference between predicted and actual outcomes [28]. They come in various architectures, including feedforward, convolutional, recurrent, and attention-based networks, each tailored to specific types of data and tasks. Despite their flexibility, scalability, and ability to learn complex representations, neural networks can be computationally intensive and require large amounts of labeled data for effective training [29]. Figure 2 presents a simplified general neural network architecture. Note that the output layer may include more nodes depending on the task, such as regression or classification. Additionally, the hidden layer might consist of more than one layer.

2.2.1. Convolutional Neural Network (CNN)

CNNs are a type of neural network architecture that is particularly effective at capturing spatial hierarchies of features within data, commonly images, through the use of convolutional layers, pooling layers, and fully connected layers [30]. In a CNN, the convolutional layer applies a series of filters, also known as kernels, to the input data, where each filter detects specific features, such as edges or textures, by performing element-wise multiplications and summations across local regions to compute different feature maps [31]. The input is convolved with a learned kernel, and a non-linear activation function is subsequently applied to the convolved outcomes, resulting in the creation of a new feature map [32]. Multiple feature maps are then obtained through the utilization of various kernels. The feature value can be mathematically computed as illustrated by Equation (4), where $w_k^l$ is the weight vector applied to the input and $b_k^l$ is the bias term of the k-th filter of the l-th layer, $x_{i,j}^l$ is the input, and $z_{i,j,k}^l$ represents the feature map value at location (i, j) of the l-th layer [32].
$z_{i,j,k}^{l} = {w_k^l}^{T} x_{i,j}^{l} + b_k^{l}$ (4)
After convolution, the output is typically passed through a non-linear activation function such as the sigmoid, hyperbolic tangent (tanh), or rectified linear unit (ReLU), as given in Equation (5), where $a_{i,j,k}^{l}$ represents the activation value.
$a_{i,j,k}^{l} = a(z_{i,j,k}^{l})$ (5)
The pooling layer, typically situated between two convolutional layers, aims to attain shift invariance through the reduction in feature map resolution. Each feature map within the pooling layer is linked to its corresponding feature map from the preceding convolutional layer. Common pooling operations include max pooling and average pooling. A general pooling layer is presented in Equation (6).
$y_{i,j,k}^{l} = \mathrm{pool}(a_{i,j,k}^{l})$ (6)
Finally, the output from the convolutional and pooling layers is flattened and passed through one or more fully connected layers, which perform classification or regression tasks. These layers, as shown in Equation (7), where $A_f$ represents the flattened output from the previous layers, use weights, w, and biases, b, to compute the final output o, typically followed by a softmax activation function for classification tasks:
$o = \mathrm{softmax}(w A_f + b)$ (7)
The softmax activation function converts raw scores into a probability distribution over multiple classes for multi-class classification tasks. It provides interpretable outputs and allows for efficient training through gradient-based optimization, as given in Equation (8), where x is the vector of outputs of the previous layer, K represents the vector length, and j and i are the indices of a vector element and the corresponding softmax output element, respectively [33].
$f(x)_i = \frac{e^{x_i}}{\sum_{j=1}^{K} e^{x_j}}$ (8)
CNN models leverage convolutional layers to extract hierarchical features, generally from input images, followed by pooling layers for dimensionality reduction and fully connected layers for classification or regression. During the training process, various optimization methods, including stochastic gradient descent (SGD), Adam, AdaDelta, AdaBelief, and AdaMax, can be employed to adjust the parameters of the network, including the weights and biases, in order to minimize the difference between the predicted outputs and the ground truth labels [34]. Each optimization method has its own advantages and disadvantages, and the choice of method depends on factors such as the specific problem, the size and complexity of the dataset, and the computational resources available. Experimentation and tuning are often required to determine the most effective optimization method for a given task.
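As an illustration of how these building blocks fit together, the short PyTorch sketch below (a minimal, hypothetical architecture, not one drawn from the cited studies) stacks a convolutional layer, a ReLU activation, pooling, and a fully connected output layer, mirroring Equations (4)–(8), and performs one training step with one of the optimizers mentioned above.

```python
import torch
import torch.nn as nn

class TinyCNN(nn.Module):
    """Minimal CNN: convolution -> ReLU -> pooling -> fully connected layer."""
    def __init__(self, n_classes=2):
        super().__init__()
        self.conv = nn.Conv2d(1, 8, kernel_size=3, padding=1)  # convolution, Eq. (4)
        self.act = nn.ReLU()                                    # non-linear activation, Eq. (5)
        self.pool = nn.MaxPool2d(2)                             # pooling, Eq. (6)
        self.fc = nn.Linear(8 * 16 * 16, n_classes)             # fully connected output, Eq. (7)

    def forward(self, x):
        x = self.pool(self.act(self.conv(x)))
        x = x.flatten(1)
        return self.fc(x)   # raw scores; the softmax of Eq. (8) is applied inside the loss

# Hypothetical input: a batch of 4 spectrogram-like 32x32 "images" with 1 channel.
x = torch.randn(4, 1, 32, 32)
y = torch.randint(0, 2, (4,))

model = TinyCNN()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)       # one of the optimizers above
loss_fn = nn.CrossEntropyLoss()                                 # includes the softmax of Eq. (8)

optimizer.zero_grad()
loss = loss_fn(model(x), y)
loss.backward()
optimizer.step()
```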
CNN models have the ability to automatically learn hierarchical representations of data, achieve translation-invariant feature detection, and efficiently process high-dimensional data like images and videos; however, they may struggle with understanding spatial context in certain cases, and they can be challenging to apply to irregular data structures [35]. Additionally, CNNs benefit from parameter sharing and sparse connectivity, enabling more efficient learning and inferences [36]. They are also parallelizable, facilitating fast computation and training on large-scale datasets, and can leverage transfer learning to accelerate model training [30]. On the other hand, they may lack interpretability of learned features, require large amounts of labeled data for training, and be sensitive to variations in hyperparameters [31]. Additionally, designing complex CNN architectures and addressing computational requirements for training and inference can also pose challenges as a balance between model complexity, computational resources, training time, generalization performance, and deployment constraints is required [35].

2.2.2. Recurrent Neural Network (RNN)

RNNs are a type of neural network architecture specifically designed for sequential data processing. RNNs have connections that form directed cycles, allowing them to maintain an internal memory or hidden state that captures information about previous inputs [37]. The hidden state ht at time step t in an RNN can be computed as presented in Equation (9), where f is the activation function, such as sigmoid, tanh, or ReLU; Whh is the weight matrix for the recurrent connections; ht−1 is the hidden state from the previous time step; xt is the input at time step t; Wxh is the weight matrix for connections from xt; and bh is the bias vector [38].
$h_t = f(W_{hh} h_{t-1} + W_{xh} x_t + b_h)$ (9)
The output of the RNN at each time step t can then be computed as given in Equation (10), where yt is the output at time step t, Why is the weight matrix for the connections from the hidden state to the output, and by is the bias vector.
$y_t = W_{hy} h_t + b_y$ (10)
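A single recurrent step, written out in NumPy with hypothetical dimensions and randomly initialized parameters, makes Equations (9) and (10) explicit; the hidden state is carried forward across the sequence.

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hidden, n_out = 3, 5, 1          # hypothetical sizes

# Randomly initialized parameters (illustrative only).
W_xh = rng.standard_normal((n_hidden, n_in))
W_hh = rng.standard_normal((n_hidden, n_hidden))
W_hy = rng.standard_normal((n_out, n_hidden))
b_h = np.zeros(n_hidden)
b_y = np.zeros(n_out)

def rnn_step(x_t, h_prev):
    h_t = np.tanh(W_hh @ h_prev + W_xh @ x_t + b_h)   # Eq. (9)
    y_t = W_hy @ h_t + b_y                            # Eq. (10)
    return h_t, y_t

h = np.zeros(n_hidden)
for x_t in rng.standard_normal((10, n_in)):           # a 10-step input sequence
    h, y = rnn_step(x_t, h)
```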
During training, the parameters (weights and biases) of RNNs are learned using backpropagation through time [38]. RNNs can model temporal dynamics, process variable-length inputs, and share parameters for efficient learning of long-term dependencies [38]. They naturally represent sequential data and maintain stateful memory, enabling them to capture context across time steps and support gradient propagation through backpropagation; however, they suffer from vanishing or exploding gradients during training, limiting their ability to capture long-term dependencies effectively [37]. Short-term memory limitations can impact their performance on tasks requiring an understanding of long-range contexts, and sequential computation slows training and inference, hindering scalability [29]. RNNs are also sensitive to hyperparameters and can exhibit training instability with large datasets or complex architectures, limiting their effectiveness in certain scenarios [37].

2.2.3. Long Short-Term Memory (LSTM)

LSTM networks are a specialized type of RNN architecture designed to address the vanishing gradient problem and capture long-term dependencies in sequential data [16]. LSTMs introduce memory cells with gated units, which allow them to selectively retain or forget information over time, enabling LSTMs to learn and remember important information over long sequences and making them well-suited for tasks such as time series prediction [39]. The key components of an LSTM cell include the input gate it, forget gate ft, output gate ot, cell state ct, intermediate state gt, and hidden state ht, which are updated at each time step t. The equations governing the computation of these components are given in Equations (11)–(16), where xt is the input at time step t; ht−1 is the hidden state from the previous time step; W and b are the weights and biases for the input gate (i), forget gate (f), output gate (o), cell state (c), and intermediate state (g); σ is the sigmoid activation function; and ⊙ represents element-wise multiplication [40].
$i_t = \sigma(W_{xi} x_t + W_{hi} h_{t-1} + W_{ci} c_{t-1} + b_i)$ (11)
$f_t = \sigma(W_{xf} x_t + W_{hf} h_{t-1} + W_{cf} c_{t-1} + b_f)$ (12)
$o_t = \sigma(W_{xo} x_t + W_{ho} h_{t-1} + W_{co} c_{t-1} + b_o)$ (13)
$g_t = \tanh(W_{xg} x_t + W_{hg} h_{t-1} + b_g)$ (14)
$c_t = f_t \odot c_{t-1} + i_t \odot g_t$ (15)
$h_t = o_t \odot \tanh(c_t)$ (16)
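The gate updates in Equations (11)–(16) translate directly into code. The NumPy sketch below (hypothetical dimensions, randomly initialized weights, with the peephole terms on the previous cell state included exactly as written in the equations above) performs one LSTM cell step and runs it over a short sequence.

```python
import numpy as np

rng = np.random.default_rng(1)
n_in, n_hid = 4, 6                                    # hypothetical sizes
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

# Randomly initialized weights: W (input), U (recurrent), P (peephole on cell state), b (bias).
W = {k: rng.standard_normal((n_hid, n_in)) for k in ("xi", "xf", "xo", "xg")}
U = {k: rng.standard_normal((n_hid, n_hid)) for k in ("hi", "hf", "ho", "hg")}
P = {k: rng.standard_normal(n_hid) for k in ("ci", "cf", "co")}
b = {k: np.zeros(n_hid) for k in ("i", "f", "o", "g")}

def lstm_step(x_t, h_prev, c_prev):
    i_t = sigmoid(W["xi"] @ x_t + U["hi"] @ h_prev + P["ci"] * c_prev + b["i"])  # Eq. (11)
    f_t = sigmoid(W["xf"] @ x_t + U["hf"] @ h_prev + P["cf"] * c_prev + b["f"])  # Eq. (12)
    o_t = sigmoid(W["xo"] @ x_t + U["ho"] @ h_prev + P["co"] * c_prev + b["o"])  # Eq. (13)
    g_t = np.tanh(W["xg"] @ x_t + U["hg"] @ h_prev + b["g"])                     # Eq. (14)
    c_t = f_t * c_prev + i_t * g_t                                               # Eq. (15)
    h_t = o_t * np.tanh(c_t)                                                     # Eq. (16)
    return h_t, c_t

h, c = np.zeros(n_hid), np.zeros(n_hid)
for x_t in rng.standard_normal((20, n_in)):           # a 20-step input sequence
    h, c = lstm_step(x_t, h, c)
```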
LSTMs excel in learning and retaining information over long sequences, making them ideal for tasks involving temporal context, and they address the vanishing gradient problem of traditional RNNs, ensuring stable training [40]. LSTMs utilize gating mechanisms to regulate information flow and maintain an internal memory cell for context retention [41]. While versatile across various sequential tasks, LSTMs are complex, requiring careful hyperparameter tuning, and may suffer from overfitting. Interpreting LSTM mechanisms can be challenging, hindering performance diagnosis [42]. Despite their proficiency, LSTMs may struggle with very long sequences due to memory limitations and can still face gradient explosion issues during training, especially with deep architectures [41].

2.2.4. Echo State Network (ESN)

ESNs are a type of RNN renowned for their simplicity and effectiveness in handling temporal data [43]. The defining feature of ESNs is their “reservoir”—a fixed, random, and sparse recurrent network that transforms the input signals into high-dimensional dynamic states [44]. The reservoir is composed of a large number of recurrently connected neurons, often arranged in a random or fixed structure. The defining characteristic of ESNs lies in their training method, where only the output weights are learned while the internal connections remain fixed [43]. This simplifies the training process and prevents issues such as vanishing or exploding gradients. Typical update equations for an ESN are given in Equations (17) and (18), where x(t) represents the state of the reservoir at time t; y(t) represents the output value; Win, Wres, and Wout are the input, recurrent, and readout weights, respectively; u(t) represents the input value at time t; and f is a non-linear activation function, such as the sigmoid or tanh [45,46].
$x(t) = f(W_{in} u(t) + W_{res} x(t-1))$ (17)
$y(t) = W_{out} x(t)$ (18)
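The sketch below (NumPy, with a hypothetical reservoir size, spectral radius, and toy prediction task) illustrates the reservoir update of Equation (17) and a ridge-regression fit of the readout weights of Equation (18), the only part of the network that is trained.

```python
import numpy as np

rng = np.random.default_rng(2)
n_in, n_res = 1, 100                                  # hypothetical sizes

# Fixed, random input and reservoir weights (only W_out is trained).
W_in = rng.uniform(-0.5, 0.5, (n_res, n_in))
W_res = rng.standard_normal((n_res, n_res))
W_res *= 0.9 / np.max(np.abs(np.linalg.eigvals(W_res)))   # scale spectral radius to 0.9

def run_reservoir(u_seq):
    """Collect reservoir states x(t) for an input sequence, per Eq. (17)."""
    states, x = [], np.zeros(n_res)
    for u in u_seq:
        x = np.tanh(W_in @ np.atleast_1d(u) + W_res @ x)
        states.append(x)
    return np.array(states)

# Hypothetical task: one-step-ahead prediction of a noisy sine wave.
u = np.sin(np.linspace(0, 20 * np.pi, 1000)) + 0.05 * rng.standard_normal(1000)
X = run_reservoir(u[:-1])                             # reservoir states, one row per time step
y = u[1:]                                             # next-step targets

# Ridge-regression readout (Eq. 18), solved in closed form with Tikhonov regularization.
lam = 1e-6
W_out = y @ X @ np.linalg.inv(X.T @ X + lam * np.eye(n_res))

y_pred = X @ W_out                                    # readout predictions
```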
The effectiveness of ESN lies in its ability to generate complex temporal dynamics from simple input–output mappings, making it suitable for tasks such as time-series prediction and signal processing [46]. ESNs offer a computationally efficient alternative to traditional RNNs and LSTMs for tasks involving temporal data, leveraging a fixed reservoir to generate dynamic representations while simplifying the training process [47]. However, the performance of ESNs can be highly sensitive to the design of the reservoir, including its size, connectivity, and spectral radius, making the optimization of these parameters challenging for specific tasks [48]. Since the reservoir is fixed and untrained, it may not always align well with the particular characteristics of the input data, limiting adaptability [49]. Large reservoirs can consume significant memory resources, posing a challenge for memory-intensive applications. Additionally, the fixed reservoir cannot adapt to new data patterns during training, which might limit the network’s effectiveness in dynamic environments [46]. ESNs may struggle with very large-scale problems or tasks requiring highly precise temporal modeling, where the advantages of traditional RNNs or LSTMs might become more pronounced.

2.3. Ensemble Learning Models

Ensemble learning models are a powerful class of machine learning techniques that combine multiple individual models to produce a more accurate and robust prediction than any single model could achieve on its own [50]. By leveraging the diversity of the constituent models, ensemble methods can mitigate the weaknesses of individual models and exploit their complementary strengths. Bagging, boosting, and stacking are common techniques used in ensemble learning models to improve model performance [17]. In bagging, multiple models are trained independently on different subsets of the training data, and their predictions are combined through averaging or voting. Boosting sequentially trains a series of weak learners, with each subsequent model focusing on the examples that the previous models struggled with. Stacking involves training a meta-model that learns how to combine the predictions of multiple base models. Ensemble learning models are particularly effective when dealing with noisy or heterogeneous data [50]. In Figure 3, a simplified general structure for ensemble learning models based on decision trees is illustrated.

2.3.1. Random Forest

Random forest is an ensemble learning method based on bagging that constructs multiple decision trees during training and combines their predictions to make more accurate and robust predictions [51]. The key idea behind random forest is to introduce randomness both in the selection of the data samples used to train each tree and in the selection of features considered for splitting at each node. Each decision tree in the random forest is trained independently on a bootstrap sample of the training data, where samples are drawn with replacement [52]. Additionally, at each node of the tree, only a random subset of features is considered for splitting, ensuring that each tree in the forest learns different aspects of the data. The final prediction of the random forest is obtained by aggregating the predictions of all individual trees through averaging in regression or voting in classification tasks. The prediction ŷ of a new input x in a random forest model, for a regression task, is presented in Equation (19) [53].
$\hat{y} = \frac{1}{K} \sum_{i=1}^{K} T_i(x)$ (19)
Here, K represents the number of trees in the forest and Ti(x) represents the prediction of the i-th decision tree for input x. The random forest model combines multiple decision trees to produce accurate predictions while reducing overfitting. It effectively handles missing data and identifies influential features for interpretation [54]. Robust to outliers, it can handle large datasets efficiently but may struggle with imbalanced data and highly correlated features. However, random forest models are less interpretable and require the tuning of multiple hyperparameters, leading to longer training times [51].
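As a brief illustration, the following sketch (scikit-learn, using a synthetic stand-in for a multivariate feature matrix derived from physiologic signals) fits a forest whose prediction is the tree average of Equation (19), and then inspects the feature importance scores often used for interpretation.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(3)

# Synthetic stand-in: 500 samples of 6 hypothetical features (e.g., windowed signal statistics).
X = rng.standard_normal((500, 6))
y = 0.8 * X[:, 0] - 0.5 * X[:, 2] + 0.1 * rng.standard_normal(500)   # toy regression target

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# K = 200 trees; the forest prediction averages the individual trees' outputs (Eq. 19).
model = RandomForestRegressor(n_estimators=200, max_features="sqrt", random_state=0)
model.fit(X_train, y_train)

print("R^2 on held-out data:", model.score(X_test, y_test))
print("Feature importances:", model.feature_importances_)
```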

2.3.2. Extreme Gradient Boosting (XGBoost)

Extreme gradient boosting (XGBoost) is an advanced implementation of gradient boosting, a powerful ensemble learning technique. Employing a supervised learning approach, it combines the predictions of multiple weaker models, typically decision trees, in a sequential manner to accurately forecast an objective variable [15]. The key innovation of XGBoost lies in its optimization algorithm, which efficiently minimizes a regularized objective function by iteratively adding new trees to the ensemble [55]. XGBoost applies a gradient descent-based approach to minimize the loss function, incorporating both first-order gradients, i.e., gradients of the loss function with respect to the predictions, and second-order gradients, i.e., second derivatives of the loss function with respect to the predictions [15,56]. Thus, XGBoost can learn complex relationships in the data while preventing overfitting through regularization techniques such as shrinkage, which controls the contribution of each tree to the final ensemble, and tree pruning, which removes unnecessary branches from the model. The objective function L minimized by XGBoost is presented in Equation (20), where l is the loss function measuring the difference between the true label yi and the predicted label ŷi, Ω(fk) is the regularization term penalizing the complexity of each tree fk, and K is the number of trees in the ensemble [56].
$L = \sum_{i=1}^{n} l(\hat{y}_i, y_i) + \sum_{k=1}^{K} \Omega(f_k)$ (20)
XGBoost iteratively adds new trees to the ensemble to minimize this objective function, with each tree trained to correct the errors made by the previous ones [15]. XGBoost, known for its efficiency and scalability, often surpasses other machine learning algorithms in both speed and accuracy [55]. It incorporates built-in regularization techniques that add penalty terms to the loss function, such as L1 regularization (Lasso) and L2 regularization (Ridge) that add the absolute values of the coefficients and the squared values, respectively, to prevent overfitting and enhance generalization. Supporting various objective functions and evaluation metrics, XGBoost adapts well to regression, classification, and ranking tasks. It allows the derivation of feature importance scores to aid in feature selection and interpretation [57]. XGBoost handles missing values automatically and employs parallelization for faster training on large datasets. Advanced tree pruning techniques control model complexity, improving generalization performance. However, tuning numerous hyperparameters may require significant computational resources, and memory consumption can be high, limiting its applicability in memory-constrained environments [58]. Like other ensemble methods, XGBoost may sacrifice interpretability. Despite its regularization, overfitting may occur with complex datasets or inadequate tuning [15]. Sensitivity to outliers and challenges with scalability in very large datasets or distributed environments are potential limitations, and addressing imbalanced datasets may require additional techniques like class weighting or resampling [56].
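A minimal usage sketch of the scikit-learn-style XGBoost interface is shown below; the synthetic data and the hyperparameter values (shrinkage, tree depth, and L1/L2 penalty strengths that shape the regularized objective of Equation (20)) are purely illustrative.

```python
import numpy as np
import xgboost as xgb
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(4)
X = rng.standard_normal((500, 6))                      # synthetic feature matrix
y = (X[:, 0] + X[:, 1] ** 2 > 0.5).astype(int)         # toy binary labels

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# learning_rate is the shrinkage term; reg_alpha/reg_lambda are the L1/L2 penalties.
model = xgb.XGBClassifier(
    n_estimators=300,
    learning_rate=0.05,
    max_depth=4,
    reg_alpha=0.1,        # L1 regularization
    reg_lambda=1.0,       # L2 regularization
)
model.fit(X_train, y_train)
print("Accuracy on held-out data:", model.score(X_test, y_test))
```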

2.4. Kernel Methods

Kernel methods are a class of machine learning techniques that encompass various approaches. Some kernel methods compute similarities between data points in the original input space without explicitly computing the transformation into a higher-dimensional feature space, while others operate by implicitly mapping input data into a high-dimensional feature space using a kernel function [59]. This flexibility allows kernel methods to effectively capture complex patterns in data [60]. Kernel methods are especially useful when dealing with data that cannot be effectively linearly separated or modeled using traditional techniques [59].

2.4.1. Support Vector Machine (SVM)

The SVM is a powerful supervised learning kernel model used for classification and regression tasks. In classification, SVMs aim to find the optimal hyperplane that separates data points belonging to different classes with the largest margin [61].
An SVM aims to find a function f(x) for a given set of training data points {xi, yi}, where xi is the input feature vector and yi is the corresponding continuous target value, such that the difference between the predicted and actual values is minimized while still satisfying a specified margin ϵ [62]. Equation (21) presents the objective function of the SVM, where w is the weight vector orthogonal to the hyperplane, b is the bias term, ξi and ξi* are slack variables representing the distance of the data point (xi, yi) from the margin, and C is the regularization parameter controlling the trade-off between maximizing the margin and minimizing the error [63].
$\min_{w, b, \xi, \xi^*} \ \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{n} (\xi_i + \xi_i^*) \quad \text{subject to} \quad y_i - (w \cdot x_i + b) \le \epsilon + \xi_i, \quad (w \cdot x_i + b) - y_i \le \epsilon + \xi_i^*, \quad \xi_i, \xi_i^* \ge 0$ (21)
The kernel trick can also be applied to SVM to handle non-linear relationships between the input features and the target variable. SVMs excel in high-dimensional spaces, making them ideal for tasks like text classification and image recognition [63]. Their margin maximization objective mitigates overfitting, ensuring robust generalization to unseen data [61]. With different kernel functions, SVMs can handle both linearly and non-linearly separable data, offering flexibility in modeling complex relationships. By utilizing support vectors in the decision function, SVMs are memory-efficient and suitable for large datasets [63]. Even with small datasets, SVMs focus on support vectors near the decision boundary, enhancing generalization, and they also aim for global optima, providing stable solutions compared to local optimization methods. Fine-tuning parameters like the regularization parameter C and kernel choice allows control over model complexity and performance [64]. However, parameter selection can be challenging, and training can be computationally intensive, especially with large datasets. Storing the entire training dataset in memory during training poses memory constraints, particularly with non-linear kernels or high-dimensional spaces, limiting SVMs’ practicality for big data applications [62]. Additionally, SVMs lack interpretability due to their black-box nature and are sensitive to noisy data, potentially affecting performance, and extensions are needed for multi-class classification or regression tasks, adding complexity to the modeling process [61].
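The following sketch (scikit-learn, with a synthetic non-linear regression target) shows an epsilon-SVR with an RBF kernel; the C and epsilon values correspond to the terms in Equation (21) and are illustrative only, and the input is scaled beforehand because SVMs rely on distance computations (see Section 3).

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(5)
X = rng.uniform(-3, 3, (300, 1))                        # synthetic univariate input
y = np.sin(X).ravel() + 0.1 * rng.standard_normal(300)  # noisy non-linear target

# Epsilon-SVR with an RBF kernel; C and epsilon play the roles shown in Eq. (21).
model = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0, epsilon=0.1))
model.fit(X, y)

print("Number of support vectors:", model.named_steps["svr"].support_vectors_.shape[0])
print("Fit R^2:", model.score(X, y))
```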

2.4.2. Gaussian Processes (GPs)

GPs are probabilistic machine learning models that are categorized under kernel models in this paper due to their reliance on kernel functions to define the covariance structure between data points [65]. Unlike traditional kernel methods such as SVMs, which use kernel functions to implicitly map data points into a high-dimensional feature space, GPs directly model the distribution over functions using a covariance function [66]. This covariance function, also known as the kernel function, encodes assumptions about the relationships between input and output variables. Mean function m(x) and covariance function k(x, x′), where x and x′ are input points, fully specify a GP [67]. Given a set of observed data points {xi, yi}, where yi is the observed output corresponding to input xi, the joint distribution of the observed outputs y can be written as presented in Equation (22).
$y \sim \mathcal{N}(m, K)$ (22)
The elements of the covariance matrix K are computed using the covariance function k(xi, xj), which is often referred to as the kernel function [66]. Equation (23) gives the predictive distribution of the output y* for a new input point x*, where μ* is the mean of the predictive distribution and σ*2 is its variance [66].
$P(y_* \mid x_*, x, y) = \mathcal{N}(\mu_*, \sigma_*^2)$ (23)
The mean of the predictive distribution μ* and its variance σ*2 are computed as given in Equations (24) and (25), respectively, where k* is the vector of covariances between the new input x* and the training inputs x, K is the covariance matrix of the training inputs x, σn2 is the noise variance, I is the identity matrix, and k(x*, x*) is the covariance between the new input x* and itself [66].
$\mu_* = k_*^{T} (K + \sigma_n^2 I)^{-1} y$ (24)
$\sigma_*^2 = k(x_*, x_*) - k_*^{T} (K + \sigma_n^2 I)^{-1} k_*$ (25)
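The predictive Equations (22)–(25) can be implemented directly. The NumPy sketch below uses a squared-exponential (RBF) covariance function with hypothetical hyperparameters and a synthetic dataset; a zero mean function is assumed for simplicity.

```python
import numpy as np

def rbf_kernel(X1, X2, length_scale=1.0, variance=1.0):
    """Squared-exponential covariance function k(x, x')."""
    d2 = np.sum(X1**2, 1)[:, None] + np.sum(X2**2, 1)[None, :] - 2 * X1 @ X2.T
    return variance * np.exp(-0.5 * d2 / length_scale**2)

rng = np.random.default_rng(6)
X = rng.uniform(-3, 3, (30, 1))                        # training inputs
y = np.sin(X).ravel() + 0.1 * rng.standard_normal(30)  # noisy training outputs
X_star = np.linspace(-3, 3, 100)[:, None]              # new (test) inputs
sigma_n2 = 0.01                                        # noise variance

K = rbf_kernel(X, X)                                   # covariance of training inputs
K_star = rbf_kernel(X, X_star)                         # covariances between training and test inputs
K_ss = rbf_kernel(X_star, X_star)                      # covariances among test inputs

K_inv = np.linalg.inv(K + sigma_n2 * np.eye(len(X)))
mu_star = K_star.T @ K_inv @ y                                             # Eq. (24): predictive mean
var_star = np.diag(K_ss) - np.sum((K_star.T @ K_inv) * K_star.T, axis=1)   # Eq. (25): predictive variance
```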
GPs excel in modeling complex, non-linear relationships between variables across various data types, offering uncertainty estimates through confidence intervals, which is vital for decision-making under uncertainty [68]. GPs accommodate both interpolation and extrapolation, making them suitable for tasks with sparse or irregular data, and their adaptability to data complexity without extensive manual hyperparameter tuning is a significant advantage of GP models [69]. As nonparametric models, GPs grow in accuracy with increasing data, facilitating robust regression and classification with Bayesian inference [67]. GPs can easily incorporate prior knowledge about the underlying process through the choice of covariance functions, enabling the integration of domain expertise into the modeling process; however, computational intensity poses challenges, especially with large datasets, as the need to store the entire dataset may be impractical [69]. GPs may struggle with high-dimensional or large-scale datasets due to computational constraints, impacting their applicability in signal analysis [67]. Additionally, kernel selection is crucial and may require domain expertise, hyperparameter tuning influences model performance, and extending GPs to non-Gaussian likelihoods can be challenging [68]. Interpreting the underlying processes may also be difficult due to their complexity and lack of explicit model parameters.

3. Preprocessing Requirements

Preprocessing is a critical step in preparing cerebral physiological signals for analysis using machine learning models, as it ensures the data are clean, structured, and well-suited to the chosen algorithm. Although the specific requirements vary, many preprocessing steps are shared across models, with slight adjustments based on the underlying assumptions and operations of each. For HMMs, preprocessing focuses on maintaining sequential data quality and interpretability. Noise reduction, such as bandpass filtering, is critical to isolating the frequencies relevant to the modeled states, while normalization ensures consistency across signal amplitudes [22,24]. Temporal segmentation into meaningful epochs (e.g., 1 s windows) is essential for capturing transitions between hidden states. Preprocessing may also include trend removal, such as detrending ICP signals, in order to focus on short-term dynamics and reduce the influence of long-term drift.
In CNNs, preprocessing often involves transforming time-series data into image-like representations, such as spectrograms or wavelet scalograms. Noise reduction and normalization are critical to ensure that visual features reflect meaningful physiological patterns [32,34]. Additionally, segmenting the data into fixed-size windows helps create consistent input sizes for the network. For temporal models like RNNs, LSTM networks, and ESNs, preprocessing emphasizes the preservation of temporal dependencies. Noise reduction and normalization are fundamental, while segmentation into sequences ensures that input data are structured consistently [37,42,47]. Sequence padding or truncation may also be required to handle varying sequence lengths effectively. Detrending is particularly relevant for signals like ICP or fNIRS, where non-stationary trends could interfere with learning temporal relationships.
Random forests and XGBoost, as tree-based models, are relatively robust to noise and outliers but still benefit from preprocessing to improve model interpretability and performance. Feature extraction and selection are often key steps for reducing dimensionality and focusing on relevant signal attributes, such as power in specific EEG frequency bands or oxygenation levels in fNIRS [70]. Normalization is less critical for these models but can improve feature importance metrics and consistency across datasets. For distance-based models like SVMs, proper scaling or normalization of features is essential, as these models rely on distance calculations for classification or regression [64]. Noise reduction and feature extraction are similarly crucial to reduce irrelevant variability and focus on patterns critical for decision boundaries. Outlier detection is especially important for SVMs, as extreme values can disproportionately affect the model’s performance.
Finally, GPs are highly sensitive to noise and outliers, making preprocessing crucial for reliable probabilistic modeling. Noise reduction, detrending, and outlier removal are fundamental to ensure that the GP models reflect true underlying processes rather than spurious variations [69]. Normalization helps standardize the data range, enabling smoother kernel function operations, while feature extraction can reduce dimensionality for computational efficiency.
Overall, preprocessing steps like noise reduction, normalization, segmentation, detrending, outlier handling, and feature extraction play a pivotal role in preparing data for analysis. Tailoring these steps to the specific model ensures that the algorithm can effectively capture and interpret the underlying dynamics of cerebral physiological signals.
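As a concrete illustration of these shared steps, the sketch below (SciPy/NumPy, applied to a simulated signal; the sampling rate, band limits, and window length are hypothetical and would be chosen per signal modality) applies bandpass filtering, detrending, z-score normalization, and fixed-length segmentation.

```python
import numpy as np
from scipy import signal

fs = 125.0                                             # hypothetical sampling rate (Hz)
rng = np.random.default_rng(7)
t = np.arange(0, 60, 1 / fs)
raw = np.sin(2 * np.pi * 1.0 * t) + 0.02 * t + 0.3 * rng.standard_normal(t.size)  # simulated signal with drift and noise

# Noise reduction: 4th-order Butterworth bandpass (band limits are illustrative).
b, a = signal.butter(4, [0.5, 15.0], btype="bandpass", fs=fs)
filtered = signal.filtfilt(b, a, raw)

# Trend removal and z-score normalization.
detrended = signal.detrend(filtered)
normalized = (detrended - detrended.mean()) / detrended.std()

# Segmentation into non-overlapping 10 s windows for model input.
win = int(10 * fs)
n_windows = normalized.size // win
segments = normalized[: n_windows * win].reshape(n_windows, win)
print(segments.shape)   # (number of windows, samples per window)
```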

4. Clinical Relevance

Studying and analyzing cerebral physiological signals is crucial for gaining a deeper understanding of brain function and dysfunction. The brain is the control center of the human body, responsible for regulating essential functions such as cognition, emotion, movement, and sensory perception. Cerebral physiological signals, such as EEG and fNIRS, provide valuable insights into the complex neural processes underlying these functions. By examining these signals, brain activity patterns, connectivity networks, and aberrant responses associated with various neurological conditions can be investigated. Understanding cerebral physiological signals is essential for diagnosing and monitoring disorders like epilepsy, Alzheimer’s disease, traumatic brain injury (TBI), and psychiatric illnesses. Additionally, it informs the development of novel therapies and interventions aimed at restoring or optimizing brain function, ultimately improving patient outcomes and quality of life.
Multivariate machine learning models play a crucial role in analyzing, modeling, and predicting cerebral physiologic signals in both human and veterinary cohorts. These models excel in handling the complexity and high dimensionality of cerebral data, integrating multiple variables simultaneously to uncover intricate patterns and relationships. In human studies, such models are utilized extensively in neuroimaging research, including EEG and fNIRS data analysis, thus enabling the identification of biomarkers of neurological disorders, the delineation of brain networks, and the prediction of disease progression or treatment outcomes [10,12,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101]. For instance, in epilepsy research, multivariate machine learning models can distinguish between interictal and ictal states, aiding in seizure prediction and localization for potential surgical interventions [102,103,104,105]. The brain–computer interface (BCI) is another important research area where multivariate machine learning models play a crucial role, since BCI analyzes signals from the brain, facilitating direct communication between humans and machines, and ultimately aiding individuals with severe disabilities in controlling external machines or robots to accomplish specific tasks [106,107,108,109]. Another research area involves the use of machine learning models to analyze high-frequency signals, e.g., EEG or electrocorticography (ECoG), in visual or auditory studies for tasks such as attention detection [110,111].
Similarly, in veterinary medicine, multivariate machine learning holds promise for understanding brain function and pathology in animals [112,113,114,115,116]. Multivariate machine learning models could allow the analysis of complex cerebral data from veterinary cohorts, providing insights into disease mechanisms, optimizing treatment strategies, and improving prognostic accuracy [117].
Moreover, while these models support translational research, it is important to recognize that the outcomes observed in animal models cannot always be directly translated to human studies due to significant physiological and anatomical differences [118]. This highlights the need for careful consideration when applying findings from veterinary medicine to human brain disorders. Nevertheless, these models can still provide valuable insights and foster a collaborative approach to understanding brain disorders and developing innovative therapies for both humans and animals.
Multivariate machine learning models are instrumental in analyzing cerebral physiological signals, aiding in various research in both human and animal cohorts. Table 2 lists a compilation of studies employing non-linear multivariate machine learning models to analyze diverse cerebral physiological signals, spanning from EEG signals [39,104,108,119,120,121,122,123,124,125,126,127,128,129,130] to other cerebral physiologic signals like ICP and CPP [131,132,133,134], for tasks related to modeling and prediction. Overall, multivariate models have been used for tasks such as determining cerebral dynamic states; analyzing neuronal dynamics; detecting and classifying depression, fatigue, and emotions; investigating local field potential (LFP) states in Parkinson’s disease (PD); decoding movement intentions; classifying PD patients; classifying brain state changes; predicting brain age and hand kinematics; classifying preictal or interictal states; and carrying out feature extraction. Similarly, using other cerebral signals, tasks such as predicting ICP episodes, automatic sleep state scoring in neonates, distinguishing TBI patients, predicting neurological outcomes, and analyzing ICP and CPP signals have been carried out.

5. Limitations of the Models

Analyzing cerebral physiological data with multivariate machine learning models presents several challenges. Firstly, the high dimensionality of the data, often stemming from various imaging modalities or physiological sensors, complicates model construction and interpretation. Secondly, cerebral signals exhibit complex temporal dynamics and spatial interactions, necessitating sophisticated modeling techniques capable of capturing these intricate patterns effectively. Additionally, the presence of noise and artifacts in the data poses challenges for accurate inference, requiring robust preprocessing methods to enhance signal quality. Moreover, cerebral physiological data may suffer from missing values or irregular sampling rates, necessitating careful handling during data preprocessing and model training. Furthermore, the heterogeneity of cerebral signals across individuals, influenced by factors such as age, gender, and pathology, introduces variability that must be accounted for in model development and validation. Finally, the limited availability of labeled data and ground truth measurements complicates model validation and generalization to diverse populations or clinical settings.
Multivariate machine learning models, despite offering robust tools for analyzing complex data and making predictions, are encumbered by inherent limitations that affect their applicability and interpretability across diverse contexts. Estimating parameters in multivariate machine learning models can be difficult, particularly with high-dimensional data or complex relationships, potentially introducing biases in predictions. Furthermore, these models may be sensitive to initial conditions and have limited flexibility in capturing intricate variable interactions, thus limiting their usefulness. The computational complexity associated with training and analyzing multivariate machine learning models, coupled with the risk of overfitting and interpretability challenges, poses significant obstacles to their application.
However, it is important to acknowledge that specific models may have unique limitations and drawbacks. For instance, HMMs may struggle to capture the complex temporal dynamics inherent in cerebral signals due to their assumption of discrete latent states. CNNs, designed for spatial hierarchies, may overlook the temporal dependencies crucial in cerebral signal analysis. RNNs and LSTMs are adept at modeling sequential data, yet they might encounter difficulties in handling long-term dependencies or noisy signals. GPs, although effective in uncertainty quantification, may face scalability issues with large datasets. SVMs, random forests, and XGBoost, while robust and versatile, may lack interpretability in complex cerebral signal patterns. Furthermore, selecting the appropriate model architecture, tuning hyperparameters, and ensuring generalization to unseen data still pose significant challenges in cerebral physiologic signal analysis. Therefore, careful consideration of the strengths and limitations of each model is essential for effective utilization in this domain. Addressing the challenges stemming from cerebral physiological data with respect to limitations of multivariate machine learning models requires a multidisciplinary approach, integrating expertise from neuroscience, signal processing, and machine learning to develop robust and interpretable models for cerebral physiological data analysis.

6. Conclusions

Studying cerebral physiological signals is essential for understanding brain function and dysfunction. Biomedical sensors like EEG and fNIRS provide insights into neural processes, aiding in the diagnosis and treatment of neurological disorders such as epilepsy, Alzheimer’s disease, and TBI. Multivariate machine learning models are crucial in this field, capable of handling complex and high-dimensional data recorded by these sensors to uncover patterns and predict outcomes. These models are used extensively in neuroimaging research to identify biomarkers, delineate brain networks, and forecast disease progression or treatment efficacy. Additionally, they play a pivotal role in BCIs, which, through the use of real-time sensor data, enable direct communication between the brain and external devices, helping individuals with severe disabilities.
However, analyzing cerebral physiological data with these models presents challenges, such as high dimensionality, noise, artifacts, and variability across individuals. Different models have unique limitations and strengths that must be carefully considered for the successful application of these models in cerebral physiological signal modeling and prediction. Importantly, the use of machine learning models in this domain is still in its infancy, and achieving their full potential requires a concerted effort to address foundational issues. Standard sensor technologies and imaging modalities must be utilized to establish reliable datasets and ensure consistency in data collection and preprocessing. These steps are critical for accurately categorizing disease states and establishing a robust foundation for subsequent predictive modeling. Equally important is the selection of the most appropriate machine learning algorithms for specific tasks, taking into account their ability to manage the unique challenges of cerebral data, such as its multivariate nature and person-specific differences. Accreditation and regulatory frameworks should be developed and adhered to, ensuring that these models meet clinical and ethical standards.
Ultimately, the integration of standardized practices, ideal algorithms, and accredited frameworks will enable machine learning models to deliver on their promise of improving prognostic accuracy and enhancing treatment strategies. By addressing these challenges, machine learning has the potential to revolutionize our understanding of brain function and dysfunction, enabling the development of innovative diagnostic tools and personalized therapies for neurological disorders.

Author Contributions

N.V. and F.A.Z. conceptualized and designed the study. N.V. conducted database screening. N.V. created the tables and wrote the first draft of the manuscript. A.S.S., A.I., A.G., K.Y.S., L.F., T.B., D.M., R.R. and F.A.Z. proofread and edited the final manuscript. F.A.Z. was responsible for supervision. All authors have read and agreed to the published version of the manuscript.

Funding

This work was directly supported through the Endowed Manitoba Public Insurance (MPI) Chair in Neuroscience and the Natural Sciences and Engineering Research Council of Canada (NSERC; ALLRP-576386-22 and ALLRP-586244-23).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

No new data were created or analyzed in this study. Data sharing is not applicable to this article.

Acknowledgments

F.A.Z. is supported through the Endowed Manitoba Public Insurance (MPI) Chair in Neuroscience/TBI Research Endowment, NSERC (DGECR-2022-00260, RGPIN-2022-03621, ALLRP-578524-22, ALLRP-576386-22, I2IPJ 586104–23, and ALLRP 586244-23), Canadian Institutes of Health Research (CIHR), the MPI Neuroscience Research Operating Fund, the Health Sciences Centre Foundation Winnipeg, the Canada Foundation for Innovation (CFI) (project no. 38583), Research Manitoba (grant nos. 3906 and 5429) and the University of Manitoba VPRI Research Investment Fund (RIF). N.V. is supported by NSERC (RGPIN-2022-03621, ALLRP-576386-22, ALLRP 586244-23). ASS is supported through the University of Manitoba Graduate Fellowship (UMGF)—Biomedical Engineering, NSERC (RGPIN-2022-03621), and the Graduate Enhancement of Tri-Council Stipends (GETS)—University of Manitoba. A.I. is supported by a University of Manitoba, Department of Surgery GFT Grant, the University of Manitoba International Graduate Student Entrance Scholarship (IGSES), and the University of Manitoba Graduate Fellowship (UMGF) in Biomedical Engineering. A.G. is supported through a CIHR Fellowship (grant no. 472286). K.Y.S. is supported through the NSERC CGS-D Program (CGS D-579021-2023), University of Manitoba R.G. and E.M. Graduate Fellowship (Doctoral) in Biomedical Engineering and the University of Manitoba MD/PhD program. L.F. is supported through a Research Manitoba PhD Fellowship, the Brain Canada Thomkins Travel Scholarship, NSERC (ALLRP-578524-22, ALLRP-576386-22), and the Graduate Enhancement of Tri-Council Stipends (GETS)—University of Manitoba. T.B. is supported through the NSERC CGS-M program. R.R. is supported through state funding (Helsinki University Hospital), the Swedish Cultural Foundation in Finland, Finska Läkaresällskapet, and Medicinska Understödsföreningen Liv och Hälsa.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Zeiler, F.A.; Ercole, A.; Cabeleira, M.; Zoerle, T.; Stocchetti, N.; Menon, D.; Smieleweski, P.; Czosnyka, M. Univariate Comparison of Performance of Different Cerebrovascular Reactivity Indices for Outcome Association in Adult TBI: A CENTER-TBI Study. Available online: https://pubmed.ncbi.nlm.nih.gov/30877472/?otool=icaumlib (accessed on 14 August 2020).
  2. Wang, R.; Wang, Y.; Xu, X.; Li, Y.; Pan, X. Brain Works Principle Followed by Neural Information Processing: A Review of Novel Brain Theory. Artif. Intell. Rev. 2023, 56, 285–350. [Google Scholar] [CrossRef]
  3. Froese, L.; Gomez, A.; Sainbhi, A.S.; Batson, C.; Stein, K.; Alizadeh, A.; Zeiler, F.A. Dynamic Temporal Relationship Between Autonomic Function and Cerebrovascular Reactivity in Moderate/Severe Traumatic Brain Injury. Front. Netw. Physiol. 2022, 2, 837860. [Google Scholar] [CrossRef] [PubMed]
  4. Tas, J.; Czosnyka, M.; van der Horst, I.C.C.; Park, S.; van Heugten, C.; Sekhon, M.; Robba, C.; Menon, D.K.; Zeiler, F.A.; Aries, M.J.H. Cerebral Multimodality Monitoring in Adult Neurocritical Care Patients with Acute Brain Injury: A Narrative Review. Front. Physiol. 2022, 13, 1071161. [Google Scholar] [CrossRef] [PubMed]
  5. Donnelly, J.; Czosnyka, M.; Adams, H.; Cardim, D.; Kolias, A.G.; Zeiler, F.A.; Lavinio, A.; Aries, M.; Robba, C.; Smielewski, P.; et al. Twenty-Five Years of Intracranial Pressure Monitoring After Severe Traumatic Brain Injury: A Retrospective, Single-Center Analysis. Neurosurgery 2019, 85, E75–E82. [Google Scholar] [CrossRef]
  6. Caldwell, M.; Hapuarachchi, T.; Highton, D.; Elwell, C.; Smith, M.; Tachtsidis, I. BrainSignals Revisited: Simplifying a Computational Model of Cerebral Physiology. PLoS ONE 2015, 10, e0126695. [Google Scholar] [CrossRef]
  7. Brier, L.M.; Zhang, X.; Bice, A.R.; Gaines, S.H.; Landsness, E.C.; Lee, J.-M.; Anastasio, M.A.; Culver, J.P. A Multivariate Functional Connectivity Approach to Mapping Brain Networks and Imputing Neural Activity in Mice. Cereb. Cortex 2022, 32, 1593–1607. [Google Scholar] [CrossRef]
  8. Chen, G.; Adleman, N.E.; Saad, Z.S.; Leibenluft, E.; Cox, R.W. Applications of Multivariate Modeling to Neuroimaging Group Analysis: A Comprehensive Alternative to Univariate General Linear Model. NeuroImage 2014, 99, 571–588. [Google Scholar] [CrossRef]
  9. Jha, A.; Agarwal, S. Do Deep Neural Networks Model Nonlinear Compositionality in the Neural Representation of Human-Object Interactions? In Proceedings of the 2019 Conference on Cognitive Computational Neuroscience, Berlin, Germany, 13–16 September 2019. [Google Scholar]
  10. Shi, W.; Fan, L.; Jiang, T. Developing Neuroimaging Biomarker for Brain Diseases with a Machine Learning Framework and the Brainnetome Atlas. Neurosci. Bull. 2021, 37, 1523–1525. [Google Scholar] [CrossRef]
  11. Zhang, J. Multivariate Analysis and Machine Learning in Cerebral Palsy Research. Front. Neurol. 2017, 8, 715. [Google Scholar] [CrossRef]
  12. Ahmadzadeh, M.; Christie, G.J.; Cosco, T.D.; Arab, A.; Mansouri, M.; Wagner, K.R.; DiPaola, S.; Moreno, S. Neuroimaging and Machine Learning for Studying the Pathways from Mild Cognitive Impairment to Alzheimer’s Disease: A Systematic Review. BMC Neurol. 2023, 23, 309. [Google Scholar] [CrossRef]
  13. Raj, R.; Wennervirta, J.M.; Tjerkaski, J.; Luoto, T.M.; Posti, J.P.; Nelson, D.W.; Takala, R.; Bendel, S.; Thelin, E.P.; Luostarinen, T.; et al. Dynamic Prediction of Mortality after Traumatic Brain Injury Using a Machine Learning Algorithm. NPJ Digit. Med. 2022, 5, 96. [Google Scholar] [CrossRef] [PubMed]
  14. Tanaka, H.; Ishikawa, T.; Kakei, S. Neural Predictive Computation in the Cerebellum. In Cerebellum as a CNS Hub; Mizusawa, H., Kakei, S., Eds.; Springer International Publishing: Cham, Switzerland, 2021; pp. 371–390. [Google Scholar]
  15. Zhang, W.; Braden, B.B.; Miranda, G.; Shu, K.; Wang, S.; Liu, H.; Wang, Y. Integrating Multimodal and Longitudinal Neuroimaging Data with Multi-Source Network Representation Learning. Neuroinformatics 2022, 20, 301–316. [Google Scholar] [CrossRef] [PubMed]
  16. Al-azazi, F.A.; Ghurab, M. ANN-LSTM: A Deep Learning Model for Early Student Performance Prediction in MOOC. Heliyon 2023, 9, e15382. [Google Scholar] [CrossRef] [PubMed]
  17. Polikar, R. Ensemble Learning. In Ensemble Machine Learning: Methods and Applications; Zhang, C., Ma, Y., Eds.; Springer: New York, NY, USA, 2012; pp. 1–34. ISBN 978-1-4419-9326-7. [Google Scholar]
  18. Gao, Z.; Dang, W.; Wang, X.; Hong, X.; Hou, L.; Ma, K.; Perc, M. Complex Networks and Deep Learning for EEG Signal Analysis. Cogn. Neurodyn 2021, 15, 369–388. [Google Scholar] [CrossRef] [PubMed]
  19. Triantafyllopoulos, K. Multivariate State Space Models. In Bayesian Inference of State Space Models: Kalman Filtering and Beyond; Triantafyllopoulos, K., Ed.; Springer Texts in Statistics; Springer International Publishing: Cham, Switzerland, 2021; pp. 209–261. ISBN 978-3-030-76124-0. [Google Scholar]
  20. Liu, W.; Yairi, T. A Unifying View of Multivariate State Space Models for Soft Sensors in Industrial Processes. IEEE Access 2024, 12, 5920–5932. [Google Scholar] [CrossRef]
  21. Eddy, S.R. Hidden Markov Models. Curr. Opin. Struct. Biol. 1996, 6, 361–365. [Google Scholar] [CrossRef]
  22. Mor, B.; Garhwal, S.; Kumar, A. A Systematic Review of Hidden Markov Models and Their Applications. Arch. Comput. Methods Eng. 2021, 28, 1429–1448. [Google Scholar] [CrossRef]
  23. Miller, D.R.H.; Leek, T.; Schwartz, R.M. A Hidden Markov Model Information Retrieval System. In Proceedings of the 22nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, Berkeley, CA, USA, 15–19 August 1999; ACM: Berkeley, CA, USA, 1999; pp. 214–221. [Google Scholar]
  24. Rabiner, L.; Juang, B. An Introduction to Hidden Markov Models. IEEE ASSP Mag. 1986, 3, 4–16. [Google Scholar] [CrossRef]
  25. Chen, K.; Li, C.; Sun, W.; Tao, Y.; Wang, R.; Hou, W.; Liu, D.-Q. Hidden Markov Modeling Reveals Prolonged “Baseline” State and Shortened Antagonistic State across the Adult Lifespan. Cereb. Cortex 2022, 32, 439–453. [Google Scholar] [CrossRef]
  26. Torrésani, B.; Villaron, E. Harmonic Hidden Markov Models for the Study of EEG Signals. In Proceedings of the 2010 18th European Signal Processing Conference, Aalborg, Denmark, 23–27 August 2010; pp. 711–715. [Google Scholar]
  27. Ou, J.; Xie, L.; Jin, C.; Li, X.; Zhu, D.; Jiang, R.; Chen, Y.; Zhang, J.; Li, L.; Liu, T. Characterizing and Differentiating Brain State Dynamics via Hidden Markov Models. Brain Topogr. 2015, 28, 666–679. [Google Scholar] [CrossRef]
  28. Kietzmann, T.C.; McClure, P.; Kriegeskorte, N. Deep Neural Networks in Computational Neuroscience. In Oxford Research Encyclopedia of Neuroscience; Oxford University Press: Oxford, UK, 2019; ISBN 978-0-19-026408-6. [Google Scholar]
  29. Kriegeskorte, N.; Golan, T. Neural Network Models and Deep Learning. Curr. Biol. 2019, 29, R231–R236. [Google Scholar] [CrossRef] [PubMed]
  30. Derry, A.; Krzywinski, M.; Altman, N. Convolutional Neural Networks. Nat. Methods 2023, 20, 1269–1270. [Google Scholar] [CrossRef] [PubMed]
  31. Li, Z.; Liu, F.; Yang, W.; Peng, S.; Zhou, J. A Survey of Convolutional Neural Networks: Analysis, Applications, and Prospects. IEEE Trans. Neural Netw. Learn. Syst. 2022, 33, 6999–7019. [Google Scholar] [CrossRef] [PubMed]
  32. Gu, J.; Wang, Z.; Kuen, J.; Ma, L.; Shahroudy, A.; Shuai, B.; Liu, T.; Wang, X.; Wang, G.; Cai, J.; et al. Recent Advances in Convolutional Neural Networks. Pattern Recognit. 2018, 77, 354–377. [Google Scholar] [CrossRef]
  33. Emanuel, R.H.K.; Docherty, P.D.; Lunt, H.; Möller, K. The Effect of Activation Functions on Accuracy, Convergence Speed, and Misclassification Confidence in CNN Text Classification: A Comprehensive Exploration. J. Supercomput. 2024, 80, 292–312. [Google Scholar] [CrossRef]
  34. Mehmood, F.; Ahmad, S.; Whangbo, T.K. An Efficient Optimization Technique for Training Deep Neural Networks. Mathematics 2023, 11, 1360. [Google Scholar] [CrossRef]
  35. Alzubaidi, L.; Zhang, J.; Humaidi, A.J.; Al-Dujaili, A.; Duan, Y.; Al-Shamma, O.; Santamaría, J.; Fadhel, M.A.; Al-Amidie, M.; Farhan, L. Review of Deep Learning: Concepts, CNN Architectures, Challenges, Applications, Future Directions. J. Big Data 2021, 8, 53. [Google Scholar] [CrossRef]
  36. Xu, Y.; Zhang, H. Convergence of Deep Convolutional Neural Networks. Neural Netw. 2022, 153, 553–563. [Google Scholar] [CrossRef]
  37. Cossu, A.; Carta, A.; Lomonaco, V.; Bacciu, D. Continual Learning for Recurrent Neural Networks: An Empirical Evaluation. Neural Netw. 2021, 143, 607–627. [Google Scholar] [CrossRef]
  38. Barak, O. Recurrent Neural Networks as Versatile Tools of Neuroscience Research. Curr. Opin. Neurobiol. 2017, 46, 1–6. [Google Scholar] [CrossRef]
  39. Mughal, N.E.; Khan, M.J.; Khalil, K.; Javed, K.; Sajid, H.; Naseer, N.; Ghafoor, U.; Hong, K.-S. EEG-fNIRS-Based Hybrid Image Construction and Classification Using CNN-LSTM. Front. Neurorobotics 2022, 16, 873239. [Google Scholar] [CrossRef] [PubMed]
  40. Vakitbilir, N.; Hilal, A.; Direkoğlu, C. Hybrid Deep Learning Models for Multivariate Forecasting of Global Horizontal Irradiation. Neural Comput. Appl. 2022, 34, 8005–8026. [Google Scholar] [CrossRef]
  41. Van Houdt, G.; Mosquera, C.; Nápoles, G. A Review on the Long Short-Term Memory Model. Artif. Intell. Rev. 2020, 53, 5929–5955. [Google Scholar] [CrossRef]
  42. Li, M.; Wang, J.; Yang, S.; Xie, J.; Xu, G.; Luo, S. A CNN-LSTM Model for Six Human Ankle Movements Classification on Different Loads. Front. Hum. Neurosci. 2023, 17, 1101938. [Google Scholar] [CrossRef]
  43. Jaeger, H. Adaptive Nonlinear System Identification with Echo State Networks. In Proceedings of the Advances in Neural Information Processing Systems 15 (NIPS 2002), Vancouver, BC, Canada, 9–14 December 2002; Volume 15. [Google Scholar]
  44. Lukoševičius, M. A Practical Guide to Applying Echo State Networks. In Neural Networks: Tricks of the Trade: Second Edition; Montavon, G., Orr, G.B., Müller, K.-R., Eds.; Springer: Berlin/Heidelberg, Germany, 2012; pp. 659–686. ISBN 978-3-642-35289-8. [Google Scholar]
  45. Ozturk, M.C.; Xu, D.; Príncipe, J.C. Analysis and Design of Echo State Networks. Neural Comput. 2007, 19, 111–138. [Google Scholar] [CrossRef]
  46. Sun, C.; Song, M.; Hong, S.; Li, H. A Review of Designs and Applications of Echo State Networks. arXiv 2020, arXiv:2012.02974. [Google Scholar]
  47. De Vos, N.J. Echo State Networks as an Alternative to Traditional Artificial Neural Networks in Rainfall–Runoff Modelling. Hydrol. Earth Syst. Sci. 2013, 17, 253–267. [Google Scholar] [CrossRef]
  48. Sun, C.; Song, M.; Cai, D.; Zhang, B.; Hong, S.; Li, H. A Systematic Review of Echo State Networks From Design to Application. IEEE Trans. Artif. Intell. 2024, 5, 23–37. [Google Scholar] [CrossRef]
  49. Soltani, R.; Benmohamed, E.; Ltifi, H. Echo State Network Optimization: A Systematic Literature Review. Neural Process Lett. 2023, 55, 10251–10285. [Google Scholar] [CrossRef]
  50. Dong, X.; Yu, Z.; Cao, W.; Shi, Y.; Ma, Q. A Survey on Ensemble Learning. Front. Comput. Sci. 2020, 14, 241–258. [Google Scholar] [CrossRef]
  51. Qi, Y. Random Forest for Bioinformatics. In Ensemble Machine Learning: Methods and Applications; Zhang, C., Ma, Y., Eds.; Springer: New York, NY, USA, 2012; pp. 307–323. ISBN 978-1-4419-9326-7. [Google Scholar]
  52. Bian, J.; Wang, X.; Hao, W.; Zhang, G.; Wang, Y. The Differential Diagnosis Value of Radiomics-Based Machine Learning in Parkinson’s Disease: A Systematic Review and Meta-Analysis. Front. Aging Neurosci. 2023, 15, 1199826. [Google Scholar] [CrossRef] [PubMed]
  53. Hastie, T.; Tibshirani, R.; Friedman, J. Random Forests. In The Elements of Statistical Learning; Springer Series in Statistics; Springer: New York, NY, USA, 2009; pp. 587–604. ISBN 978-0-387-84857-0. [Google Scholar]
  54. Biau, G.; Scornet, E. A Random Forest Guided Tour. TEST 2016, 25, 197–227. [Google Scholar] [CrossRef]
  55. Chen, T.; Guestrin, C. XGBoost: A Scalable Tree Boosting System. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; Association for Computing Machinery: New York, NY, USA, 2016; pp. 785–794. [Google Scholar]
  56. Sheridan, R.P.; Wang, W.M.; Liaw, A.; Ma, J.; Gifford, E.M. Extreme Gradient Boosting as a Method for Quantitative Structure–Activity Relationships. J. Chem. Inf. Model. 2016, 56, 2353–2360. [Google Scholar] [CrossRef] [PubMed]
  57. Chang, Y.-C.; Chang, K.-H.; Wu, G.-J. Application of eXtreme Gradient Boosting Trees in the Construction of Credit Risk Assessment Models for Financial Institutions. Appl. Soft Comput. 2018, 73, 914–920. [Google Scholar] [CrossRef]
  58. Bentéjac, C.; Csörgő, A.; Martínez-Muñoz, G. A Comparative Analysis of XGBoost. Artif. Intell. Rev. 2021, 54, 1937–1967. [Google Scholar] [CrossRef]
  59. Wilson, A.G.; Hu, Z.; Salakhutdinov, R.; Xing, E.P. Deep Kernel Learning. In Proceedings of the 19th International Conference on Artificial Intelligence and Statistics, Cadiz, Spain, 9–11 May 2016; pp. 370–378. [Google Scholar]
  60. Lodhi, H. Computational Biology Perspective: Kernel Methods and Deep Learning. WIREs Comput. Stat. 2012, 4, 455–465. [Google Scholar] [CrossRef]
  61. Valkenborg, D.; Rousseau, A.-J.; Geubbelmans, M.; Burzykowski, T. Support Vector Machines. Am. J. Orthod. Dentofac. Orthop. 2023, 164, 754–757. [Google Scholar] [CrossRef]
  62. Pisner, D.A.; Schnyer, D.M. Chapter 6—Support Vector Machine. In Machine Learning; Mechelli, A., Vieira, S., Eds.; Academic Press: Cambridge, MA, USA, 2020; pp. 101–121. ISBN 978-0-12-815739-8. [Google Scholar]
  63. Kecman, V. Support Vector Machines—An Introduction. In Support Vector Machines: Theory and Applications; Wang, L., Ed.; Studies in Fuzziness and Soft Computing; Springer: Berlin/Heidelberg, Germany, 2005; pp. 1–47. ISBN 978-3-540-32384-6. [Google Scholar]
  64. Ji, W.; Liu, D.; Meng, Y.; Xue, Y. A Review of Genetic-Based Evolutionary Algorithms in SVM Parameters Optimization. Evol. Intel. 2021, 14, 1389–1414. [Google Scholar] [CrossRef]
  65. Seitz, S. Gradient-Based Explanations for Gaussian Process Regression and Classification Models. arXiv 2022, arXiv:2205.12797. [Google Scholar]
  66. Seeger, M. Gaussian Processes for Machine Learning. Int. J. Neur. Syst. 2004, 14, 69–106. [Google Scholar] [CrossRef]
  67. Rasmussen, C.E. Gaussian Processes in Machine Learning. In Advanced Lectures on Machine Learning: ML Summer Schools 2003, Canberra, Australia, February 2–14, 2003, Tübingen, Germany, August 4–16, 2003, Revised Lectures; Lecture Notes in Computer Science; Bousquet, O., von Luxburg, U., Rätsch, G., Eds.; Springer: Berlin/Heidelberg, Germany, 2004; pp. 63–71. ISBN 978-3-540-28650-9. [Google Scholar]
  68. Williams, C.; Rasmussen, C. Gaussian Processes for Regression. In Proceedings of the Advances in Neural Information Processing Systems, Denver, CO, USA, 27–30 November 1995; MIT Press: Cambridge, MA, USA, 1995; Volume 8. [Google Scholar]
  69. MacKay, D.J.C. Introduction to Gaussian Processes. In Neural Networks and Machine Learning; NATO ASI Series F Computer and Systems Sciences; Springer: Berlin, Germany, 1998; Volume 168, pp. 133–166. [Google Scholar]
  70. Khan, M.S.; Salsabil, N.; Alam, M.G.R.; Dewan, M.A.A.; Uddin, M.Z. CNN-XGBoost Fusion-Based Affective State Recognition Using EEG Spectrogram Image Analysis. Sci. Rep. 2022, 12, 14122. [Google Scholar] [CrossRef] [PubMed]
  71. Kwak, S.; Akbari, H.; Garcia, J.A.; Mohan, S.; Dicker, Y.; Sako, C.; Matsumoto, Y.; Nasrallah, M.P.; Shalaby, M.; O’Rourke, D.M.; et al. Predicting peritumoral glioblastoma infiltration and subsequent recurrence using deep-learning–based analysis of multi-parametric magnetic resonance imaging. J. Med. Imaging 2024, 11, 054001. [Google Scholar] [CrossRef] [PubMed]
  72. Zhang, Y.; Wang, S.; Sui, Y.; Yang, M.; Liu, B.; Cheng, H.; Sun, J.; Jia, W.; Phillips, P.; Gorriz, J.M. Multivariate Approach for Alzheimer’s Disease Detection Using Stationary Wavelet Entropy and Predator-Prey Particle Swarm Optimization. J. Alzheimer’s Dis. 2018, 65, 855–869. [Google Scholar] [CrossRef] [PubMed]
  73. Petras, K.; ten Oever, S.; Jacobs, C.; Goffaux, V. Coarse-to-Fine Information Integration in Human Vision. NeuroImage 2019, 186, 103–112. [Google Scholar] [CrossRef] [PubMed]
  74. Shen, Y.; Giannakis, G.B.; Baingana, B. Nonlinear Structural Vector Autoregressive Models with Application to Directed Brain Networks. IEEE Trans. Signal Process 2019, 67, 5325–5339. [Google Scholar] [CrossRef]
  75. Baroni, F.; Morillon, B.; Trébuchon, A.; Liégeois-Chauvel, C.; Olasagasti, I.; Giraud, A.-L. Converging Intracortical Signatures of Two Separated Processing Timescales in Human Early Auditory Cortex. NeuroImage 2020, 218, 116882. [Google Scholar] [CrossRef]
  76. Leech, R.; Leech, D. Testing for Spatial Heterogeneity in Functional MRI Using the Multivariate General Linear Model. IEEE Trans. Med. Imaging 2011, 30, 1293–1302. [Google Scholar] [CrossRef]
  77. McKinney, B.A.; White, B.C.; Grill, D.E.; Li, P.W.; Kennedy, R.B.; Poland, G.A.; Oberg, A.L. ReliefSeq: A Gene-Wise Adaptive-K Nearest-Neighbor Feature Selection Tool for Finding Gene-Gene Interactions and Main Effects in mRNA-Seq Gene Expression Data. PLoS ONE 2013, 8, e81527. [Google Scholar] [CrossRef]
  78. Rasekhi, J.; Mollaei, M.R.K.; Bandarabadi, M.; Teixeira, C.A.; Dourado, A. Preprocessing Effects of 22 Linear Univariate Features on the Performance of Seizure Prediction Methods. J. Neurosci. Methods 2013, 217, 9–16. [Google Scholar] [CrossRef]
  79. Uruñuela, E.; Gonzalez-Castillo, J.; Zheng, C.; Bandettini, P.; Caballero-Gaudes, C. Whole-Brain Multivariate Hemodynamic Deconvolution for Functional MRI with Stability Selection. Med. Image Anal. 2024, 91, 103010. [Google Scholar] [CrossRef]
  80. Srinivasan, S.; Johnson, S.D. Optimizing Feature Subset for Schizophrenia Detection Using Multichannel EEG Signals and Rough Set Theory. Cogn. Neurodyn 2024, 18, 431–446. [Google Scholar] [CrossRef] [PubMed]
  81. Li, W.; Chen, G.; Chen, M.; Shen, K.; Wu, C.; Shen, W.; Zhang, F. PCA-WRKNN-Assisted Label-Free SERS Serum Analysis Platform Enabling Non-Invasive Diagnosis of Alzheimer’s Disease. Spectrochim. Acta Part A Mol. Biomol. Spectrosc. 2023, 302, 123088. [Google Scholar] [CrossRef] [PubMed]
  82. Hajianfar, G.; Haddadi Avval, A.; Hosseini, S.A.; Nazari, M.; Oveisi, M.; Shiri, I.; Zaidi, H. Time-to-Event Overall Survival Prediction in Glioblastoma Multiforme Patients Using Magnetic Resonance Imaging Radiomics. Radiol. Med. 2023, 128, 1521–1534. [Google Scholar] [CrossRef] [PubMed]
  83. Kasim, Ö. Identification of Attention Deficit Hyperactivity Disorder with Deep Learning Model. Phys. Eng. Sci. Med. 2023, 46, 1081–1090. [Google Scholar] [CrossRef]
  84. Keihani, A.; Sajadi, S.S.; Hasani, M.; Ferrarelli, F. Bayesian Optimization of Machine Learning Classification of Resting-State EEG Microstates in Schizophrenia: A Proof-of-Concept Preliminary Study Based on Secondary Analysis. Brain Sci. 2022, 12, 1497. [Google Scholar] [CrossRef]
  85. Chung, H.; Seo, H.; Choi, S.H.; Park, C.-K.; Kim, T.M.; Park, S.-H.; Won, J.K.; Lee, J.H.; Lee, S.-T.; Lee, J.Y.; et al. Cluster Analysis of DSC MRI, Dynamic Contrast-Enhanced MRI, and DWI Parameters Associated with Prognosis in Patients with Glioblastoma after Removal of the Contrast-Enhancing Component: A Preliminary Study. Am. J. Neuroradiol. 2022, 43, 1559–1566. [Google Scholar] [CrossRef]
  86. Treder, M.S.; Codrai, R.; Tsvetanov, K.A. Quality Assessment of Anatomical MRI Images from Generative Adversarial Networks: Human Assessment and Image Quality Metrics. J. Neurosci. Methods 2022, 374, 109579. [Google Scholar] [CrossRef]
  87. Liu, M.; Amey, R.C.; Backer, R.A.; Simon, J.P.; Forbes, C.E. Behavioral Studies Using Large-Scale Brain Networks—Methods and Validations. Front. Hum. Neurosci. 2022, 16, 875201. [Google Scholar] [CrossRef]
  88. Barreto, C.; Bruneri, G.d.A.; Brockington, G.; Ayaz, H.; Sato, J.R. A New Statistical Approach for fNIRS Hyperscanning to Predict Brain Activity of Preschoolers’ Using Teacher’s. Front. Hum. Neurosci. 2021, 15, 622146. [Google Scholar] [CrossRef]
  89. Sarton, B.; Jaquet, P.; Belkacemi, D.; de Montmollin, E.; Bonneville, F.; Sazio, C.; Frérou, A.; Conrad, M.; Daubin, D.; Chabanne, R.; et al. Assessment of Magnetic Resonance Imaging Changes and Functional Outcomes Among Adults With Severe Herpes Simplex Encephalitis. JAMA Netw. Open 2021, 4, e2114328. [Google Scholar] [CrossRef]
  90. Schmuker, M.; Pfeil, T.; Nawrot, M.P. A Neuromorphic Network for Generic Multivariate Data Classification. Proc. Natl. Acad. Sci. USA 2014, 111, 2081–2086. [Google Scholar] [CrossRef] [PubMed]
  91. von Lühmann, A.; Li, X.; Müller, K.-R.; Boas, D.A.; Yücel, M.A. Improved Physiological Noise Regression in fNIRS: A Multimodal Extension of the General Linear Model Using Temporally Embedded Canonical Correlation Analysis. NeuroImage 2020, 208, 116472. [Google Scholar] [CrossRef] [PubMed]
  92. Vizioli, L.; De Martino, F.; Petro, L.S.; Kersten, D.; Ugurbil, K.; Yacoub, E.; Muckli, L. Multivoxel Pattern of Blood Oxygen Level Dependent Activity Can Be Sensitive to Stimulus Specific Fine Scale Responses. Sci. Rep. 2020, 10, 7565. [Google Scholar] [CrossRef] [PubMed]
  93. Tzovara, A.; Chavarriaga, R.; De Lucia, M. Quantifying the Time for Accurate EEG Decoding of Single Value-Based Decisions. J. Neurosci. Methods 2015, 250, 114–125. [Google Scholar] [CrossRef]
  94. Zhang, Y.; Kimberg, D.Y.; Coslett, H.B.; Schwartz, M.F.; Wang, Z. Multivariate Lesion-symptom Mapping Using Support Vector Regression. Hum. Brain Mapp. 2014, 35, 5861–5876. [Google Scholar] [CrossRef]
  95. Dartora, C.; Marseglia, A.; Mårtensson, G.; Rukh, G.; Dang, J.; Muehlboeck, J.-S.; Wahlund, L.-O.; Moreno, R.; Barroso, J.; Ferreira, D.; et al. A Deep Learning Model for Brain Age Prediction Using Minimally Preprocessed T1w Images as Input. Front. Aging Neurosci. 2024, 15, 1303036. [Google Scholar] [CrossRef]
  96. Brown, C.A.; Almarzouki, A.F.; Brown, R.J.; Jones, A.K.P. Neural Representations of Aversive Value Encoding in Pain Catastrophizers. NeuroImage 2019, 184, 508–519. [Google Scholar] [CrossRef]
  97. Khawaldeh, S.; Tinkhauser, G.; Torrecillos, F.; He, S.; Foltynie, T.; Limousin, P.; Zrinzo, L.; Oswal, A.; Quinn, A.J.; Vidaurre, D.; et al. Balance between Competing Spectral States in Subthalamic Nucleus Is Linked to Motor Impairment in Parkinson’s Disease. Brain 2022, 145, 237–250. [Google Scholar] [CrossRef]
  98. Hussain, S.J.; Quentin, R. Decoding Personalized Motor Cortical Excitability States from Human Electroencephalography. Sci. Rep. 2022, 12, 6323. [Google Scholar] [CrossRef]
  99. Kwak, S.; Akbari, H.; Garcia, J.A.; Mohan, S.; Davatzikos, C. Fully Automatic mpMRI Analysis Using Deep Learning Predicts Peritumoral Glioblastoma Infiltration and Subsequent Recurrence. Proc. SPIE Int. Soc. Opt. Eng. 2024, 12926, 423–429. [Google Scholar] [CrossRef]
  100. Vidaurre, C.; Gurunandan, K.; Idaji, M.J.; Nolte, G.; Gómez, M.; Villringer, A.; Müller, K.-R.; Nikulin, V.V. Novel Multivariate Methods to Track Frequency Shifts of Neural Oscillations in EEG/MEG Recordings. Neuroimage 2023, 276, 120178. [Google Scholar] [CrossRef] [PubMed]
  101. Xue, T.; Bai, L.; Chen, S.; Zhong, C.; Feng, Y.; Wang, H.; Liu, Z.; You, Y.; Cui, F.; Ren, Y.; et al. Neural Specificity of Acupuncture Stimulation from Support Vector Machine Classification Analysis. Magn. Reson. Imaging 2011, 29, 943–950. [Google Scholar] [CrossRef] [PubMed]
  102. Aayesha; Qureshi, M.B.; Afzaal, M.; Qureshi, M.S.; Fayaz, M. Machine Learning-Based EEG Signals Classification Model for Epileptic Seizure Detection. Multimed. Tools Appl. 2021, 80, 17849–17877. [Google Scholar] [CrossRef]
  103. Wang, G.; Sun, Z.; Tao, R.; Li, K.; Bao, G.; Yan, X. Epileptic Seizure Detection Based on Partial Directed Coherence Analysis. IEEE J. Biomed. Health Inform. 2016, 20, 873–879. [Google Scholar] [CrossRef]
  104. Williamson, J.R.; Bliss, D.W.; Browne, D.W.; Narayanan, J.T. Seizure Prediction Using EEG Spatiotemporal Correlation Structure. Epilepsy Behav. 2012, 25, 230–238. [Google Scholar] [CrossRef]
  105. Bomela, W.; Wang, S.; Chou, C.-A.; Li, J.-S. Real-Time Inference and Detection of Disruptive EEG Networks for Epileptic Seizures. Sci. Rep. 2020, 10, 8653. [Google Scholar] [CrossRef]
  106. Zhang, Z.; Chen, G.; Yang, S. Ensemble Support Vector Recurrent Neural Network for Brain Signal Detection. IEEE Trans. Neural Netw. Learn. Syst. 2022, 33, 6856–6866. [Google Scholar] [CrossRef]
  107. Csaky, R.; van Es, M.W.J.; Jones, O.P.; Woolrich, M. Interpretable Many-Class Decoding for MEG. NeuroImage 2023, 282, 120396. [Google Scholar] [CrossRef]
  108. Pancholi, S.; Giri, A.; Jain, A.; Kumar, L.; Roy, S. Source Aware Deep Learning Framework for Hand Kinematic Reconstruction Using EEG Signal. IEEE Trans. Cybern. 2023, 53, 4094–4106. [Google Scholar] [CrossRef]
  109. Ieracitano, C.; Mammone, N.; Hussain, A.; Morabito, F.C. A Novel Multi-Modal Machine Learning Based Approach for Automatic Classification of EEG Recordings in Dementia. Neural Netw. 2020, 123, 176–190. [Google Scholar] [CrossRef]
  110. Schrouff, J.; Mourão-Miranda, J.; Phillips, C.; Parvizi, J. Decoding Intracranial EEG Data with Multiple Kernel Learning Method. J. Neurosci. Methods 2016, 261, 19–28. [Google Scholar] [CrossRef] [PubMed]
  111. EskandariNasab, M.; Raeisi, Z.; Lashaki, R.A.; Najafi, H. A GRU-CNN Model for Auditory Attention Detection Using Microstate and Recurrence Quantification Analysis. Sci. Rep. 2024, 14, 8861. [Google Scholar] [CrossRef] [PubMed]
  112. Gier, E.C.; Pulliam, A.N.; Gaul, D.A.; Moore, S.G.; LaPlaca, M.C.; Fernández, F.M. Lipidome Alterations Following Mild Traumatic Brain Injury in the Rat. Metabolites 2022, 12, 150. [Google Scholar] [CrossRef] [PubMed]
  113. Koren, V. Uncovering Structured Responses of Neural Populations Recorded from Macaque Monkeys with Linear Support Vector Machines. STAR Protoc. 2021, 2, 100746. [Google Scholar] [CrossRef] [PubMed]
  114. Fröhlich, H.; Claes, K.; De Wolf, C.; Van Damme, X.; Michel, A. A Machine Learning Approach to Automated Gait Analysis for the Noldus Catwalk System. IEEE Trans. Biomed. Eng. 2018, 65, 1133–1139. [Google Scholar] [CrossRef]
  115. Ehrens, D.; Assaf, F.; Cowan, N.J.; Sarma, S.V.; Schiller, Y. Ultra Broad Band Neural Activity Portends Seizure Onset in a Rat Model of Epilepsy. Annu. Int. Conf. IEEE Eng. Med. Biol. Soc. 2018, 2018, 2276–2279. [Google Scholar] [CrossRef]
  116. Baldassi, C.; Alemi-Neissi, A.; Pagan, M.; Dicarlo, J.J.; Zecchina, R.; Zoccolan, D. Shape Similarity, Better than Semantic Membership, Accounts for the Structure of Visual Object Representations in a Population of Monkey Inferotemporal Neurons. PLoS Comput. Biol. 2013, 9, e1003167. [Google Scholar] [CrossRef]
  117. Appleby, R.B.; Basran, P.S. Artificial Intelligence in Veterinary Medicine. J. Am. Vet. Med. Assoc. 2022, 260, 819–824. [Google Scholar] [CrossRef]
  118. Arzi, B.; Webb, T.L.; Koch, T.G.; Volk, S.W.; Betts, D.H.; Watts, A.; Goodrich, L.; Kallos, M.S.; Kol, A. Cell Therapy in Veterinary Medicine as a Proof-of-Concept for Human Therapies: Perspectives From the North American Veterinary Regenerative Medicine Association. Front. Vet. Sci. 2021, 8, 779109. [Google Scholar] [CrossRef]
  119. Fraiwan, L.; Alkhodari, M. Neonatal Sleep Stage Identification Using Long Short-Term Memory Learning System. Med. Biol. Eng. Comput. 2020, 58, 1383–1391. [Google Scholar] [CrossRef]
  120. Khadidos, A.O.; Alyoubi, K.H.; Mahato, S.; Khadidos, A.O.; Nandan Mohanty, S. Machine Learning and Electroencephalogram Signal Based Diagnosis of Depression. Neurosci. Lett. 2023, 809, 137313. [Google Scholar] [CrossRef] [PubMed]
  121. Xing, M.; Hu, S.; Wei, B.; Lv, Z. Spatial-Frequency-Temporal Convolutional Recurrent Network for Olfactory-Enhanced EEG Emotion Recognition. J. Neurosci. Methods 2022, 376, 109624. [Google Scholar] [CrossRef] [PubMed]
  122. Zong, J.; Xiong, X.; Zhou, J.; Ji, Y.; Zhou, D.; Zhang, Q. FCAN–XGBoost: A Novel Hybrid Model for EEG Emotion Recognition. Sensors 2023, 23, 5680. [Google Scholar] [CrossRef] [PubMed]
  123. Yang, L.; Wang, Z.; Wang, G.; Liang, L.; Liu, M.; Wang, J. Brain-Inspired Modular Echo State Network for EEG-Based Emotion Recognition. Front. Neurosci. 2024, 18, 1305284. [Google Scholar] [CrossRef]
  124. Kim, H.-H.; Jeong, J. Decoding Electroencephalographic Signals for Direction in Brain-Computer Interface Using Echo State Network and Gaussian Readouts. Comput. Biol. Med. 2019, 110, 254–264. [Google Scholar] [CrossRef]
  125. Itälinna, V.; Kaltiainen, H.; Forss, N.; Liljeström, M.; Parkkonen, L. Using Normative Modeling and Machine Learning for Detecting Mild Traumatic Brain Injury from Magnetoencephalography Data. PLoS Comput. Biol. 2023, 19, e1011613. [Google Scholar] [CrossRef]
  126. Jiang, W.; Ding, S.; Xu, C.; Ke, H.; Bo, H.; Zhao, T.; Ma, L.; Li, H. Discovering the Neuronal Dynamics in Major Depressive Disorder Using Hidden Markov Model. Front. Hum. Neurosci. 2023, 17, 1197613. [Google Scholar] [CrossRef]
  127. Nadalizadeh, F.; Rajabioun, M.; Feyzi, A. Driving Fatigue Detection Based on Brain Source Activity and ARMA Model. Med. Biol. Eng. Comput. 2024, 62, 1017–1030. [Google Scholar] [CrossRef]
  128. Paliwal, V.; Das, K.; Doesburg, S.M.; Medvedev, G.; Xi, P.; Ribary, U.; Pachori, R.B.; Vakorin, V.A. Classifying Routine Clinical Electroencephalograms With Multivariate Iterative Filtering and Convolutional Neural Networks. IEEE Trans. Neural Syst. Rehabil. Eng. 2024, 32, 2038–2048. [Google Scholar] [CrossRef]
  129. Uyulan, C.; de la Salle, S.; Erguzel, T.T.; Lynn, E.; Blier, P.; Knott, V.; Adamson, M.M.; Zelka, M.; Tarhan, N. Depression Diagnosis Modeling With Advanced Computational Methods: Frequency-Domain eMVAR and Deep Learning. Clin. EEG Neurosci. 2022, 53, 24–36. [Google Scholar] [CrossRef]
  130. Zafar, R.; Dass, S.C.; Malik, A.S. Electroencephalogram-Based Decoding Cognitive States Using Convolutional Neural Network and Likelihood Ratio Based Score Fusion. PLoS ONE 2017, 12, e0178410. [Google Scholar] [CrossRef] [PubMed]
  131. Asgari, S.; Adams, H.; Kasprowicz, M.; Czosnyka, M.; Smielewski, P.; Ercole, A. Feasibility of Hidden Markov Models for the Description of Time-Varying Physiologic State After Severe Traumatic Brain Injury. Crit. Care Med. 2019, 47, e880. [Google Scholar] [CrossRef] [PubMed]
  132. Farhadi, A.; Chern, J.J.; Hirsh, D.; Davis, T.; Jo, M.; Maier, F.; Rasheed, K. Intracranial Pressure Forecasting in Children Using Dynamic Averaging of Time Series Data. Forecasting 2019, 1, 47–58. [Google Scholar] [CrossRef]
  133. Güiza, F.; Depreitere, B.; Piper, I.; Van den Berghe, G.; Meyfroidt, G. Novel Methods to Predict Increased Intracranial Pressure During Intensive Care and Long-Term Neurologic Outcome After Traumatic Brain Injury: Development and Validation in a Multicenter Dataset. Crit. Care Med. 2013, 41, 554. [Google Scholar] [CrossRef]
  134. Myers, R.B.; Lazaridis, C.; Jermaine, C.M.; Robertson, C.S.; Rusin, C.G. Predicting Intracranial Pressure and Brain Tissue Oxygen Crises in Patients With Severe Traumatic Brain Injury. Crit. Care Med. 2016, 44, 1754. [Google Scholar] [CrossRef]
  135. Lee, S.; Hussein, R.; Ward, R.; Jane Wang, Z.; McKeown, M.J. A Convolutional-Recurrent Neural Network Approach to Resting-State EEG Classification in Parkinson’s Disease. J. Neurosci. Methods 2021, 361, 109282. [Google Scholar] [CrossRef]
Figure 1. An overview illustrating the pathway from raw data to the final model through various machine learning models.
Figure 2. A general neural network architecture.
Figure 3. A general ensemble learning architecture based on decision trees.
Table 1. Summary of the non-linear state-space models. Each entry lists the model, its description, and its main advantages and disadvantages.
Model: HMM
Description: Probabilistic graphical model used to model sequential data, such as recorded data, by considering both directly measured factors (observable) and underlying aspects that cannot be directly seen (hidden), such as disease states.
Advantages:
Flexibility for modeling diverse sequential data types.
Ability to capture temporal dependencies and transitions.
Interpretability, providing insights into hidden state dynamics.
Feature extraction capabilities for capturing relevant patterns.
Well-suited for analyzing sequential data in various domains.
Disadvantages:
Assumption of Markovian property may not hold for complex systems.
Fixed state space can be challenging when the number of states is unknown.
Limited modeling of long-term dependencies in data.
Difficulty with high-dimensional data and computational complexity.
Sensitivity to initialization and parameter tuning.
Inference complexity increases with large state spaces or long sequences.
Limited representation power compared to deep learning models.
Difficulty in handling continuous data without preprocessing.
Vulnerability to overfitting, particularly with large state spaces relative to data size.
Model: CNN
Description: Neural network architecture effective at capturing spatial hierarchies of features within data.
Advantages:
Hierarchical feature learning captures progressively complex features.
Translation invariance enables robustness to spatial variations.
Sparse connectivity reduces parameters and computational load.
Parameter sharing facilitates generalization and handling of variable inputs.
Effective for high-dimensional data like images and videos.
Parallelizable operations enable fast training and inference.
Transfer learning accelerates training with pre-trained models.
Interpretability through visualization aids in model understanding.
Improved fundamental feature extraction.
Disadvantages:
Limited interpretability of learned features.
Requirement for large amounts of labeled data.
Sensitivity to variations in hyperparameters.
Lack of spatial context understanding in some cases.
Difficulty in handling irregular data structures.
Complexity of model architecture design.
Vulnerability to adversarial attacks, meaning subtle alterations to input data can lead the network to confidently misclassify.
Heavy computational requirements for training and inference.
Model: RNN
Description: Neural network architecture designed for processing sequential data by allowing connections between units to form directed cycles, enabling information persistence over time.
Advantages:
Temporal dynamics for time-series prediction and sequence tasks.
Ability to process variable-length inputs.
Shared parameters facilitate learning of long-term dependencies.
Natural representation for sequential data tasks.
Stateful memory captures context across time steps.
Gradient propagation commonly with backpropagation through time.
Disadvantages:
Vanishing and exploding gradients hinder training.
Short-term memory limits capture of long-range dependencies.
Difficulty in capturing long-term dependencies.
Sequential computation slows training and inference.
Sensitivity to hyperparameters affects performance.
Training instability with large datasets or complex architectures.
Inadequacy in capturing complex contextual information.
Model: LSTM
Description: Type of RNN architecture designed to address the vanishing gradient problem and capture long-term dependencies by introducing specialized memory cells with gating mechanisms (see the illustrative sketch following this table).
Advantages:
Capable of capturing long-term dependencies in sequences.
Addresses the vanishing gradient problem for stable training.
Utilizes gating mechanisms for better control over information flow.
Maintains stateful memory for retaining relevant context.
Versatile and effective for various sequential data tasks including time-series prediction.
Disadvantages:
Increased complexity and computational requirements.
Proneness to overfitting, especially with limited data.
Sensitivity to hyperparameters, requiring careful tuning.
Difficulty in interpreting internal mechanisms.
Limited memory for processing very long sequences.
Potential gradient explosion during training.
Model: ESN
Description: Type of RNN model that utilizes a fixed, randomly connected reservoir and only trains the output weights, making it efficient for processing temporal data.
Advantages:
Training only the output layer is fast and computationally inexpensive.
The fixed reservoir simplifies the network design and reduces the parameters to optimize.
The reservoir transforms inputs into a high-dimensional space with diverse dynamic behavior.
Fewer trainable parameters lower the risk of overfitting.
Fixed reservoir weights provide stable and predictable dynamics.
Applicable to various tasks involving temporal data, such as time series prediction and signal processing.
Not affected by vanishing or exploding gradient issues.
Disadvantages:
Sensitivity to reservoir size, connectivity, and spectral radius.
Untrained reservoir may not suit specific input data characteristics.
Large reservoirs consume significant memory resources.
Fixed reservoir cannot adapt to new data patterns during training.
Challenging to find the optimal reservoir configuration for specific tasks.
Model: Random forest
Description: An ensemble learning method that constructs multiple decision trees during training and outputs the mode of the classes (classification) or the mean prediction (regression) of the individual trees.
Advantages:
High accuracy through the aggregation of multiple decision trees.
Robustness to overfitting compared to individual trees.
Effective handling of missing data.
Provision of feature importance measures for interpretation.
Ability to capture non-linear relationships.
Robustness to outliers in the data.
Efficiency in handling large datasets and high dimensionality.
Disadvantages:
Less interpretability compared to simpler models.
Computational complexity can lead to longer training times.
Memory-intensive due to storing multiple trees.
Bias towards majority classes in imbalanced datasets.
Time-consuming hyperparameter tuning.
Slower prediction speed compared to simpler models for real-time applications.
Reduced effectiveness with highly correlated features.
Model: XGBoost
Description: An ensemble method based on gradient-boosted decision trees, designed for efficiency, speed, and accuracy in supervised learning tasks.
Advantages:
High performance with efficiency and scalability.
Built-in regularization techniques for preventing overfitting.
Flexibility in supporting various objective functions and metrics.
Feature importance scores for informed feature selection.
Automatic handling of missing values in the data.
Parallelization for faster training on large datasets.
Advanced tree pruning for improved model complexity control.
Useful in feature extraction.
Disadvantages:
Complexity in hyperparameter tuning.
High memory usage with large datasets or deep trees.
Reduced interpretability as a black box model.
Potential overfitting, especially with complex datasets.
Sensitivity to outliers in the data.
Scalability limitations with extremely large datasets.
Challenges in handling imbalanced datasets.
Model: SVM
Description: A supervised machine learning algorithm used for classification and regression tasks, aiming to find the optimal hyperplane that best separates different classes or predicts continuous values.
Advantages:
Effective in high-dimensional spaces.
Robust to overfitting due to margin maximization.
Versatility with linear and non-linear kernel functions.
Memory efficiency using a subset of support vectors.
Effective with small datasets, focusing on support vectors.
Aim to find global optimum for more stable solutions.
Controlled complexity through parameter tuning.
Disadvantages:
Sensitivity to parameter tuning.
Computationally intensive for large datasets.
Memory-intensive storage of entire training dataset.
Difficulty with scalability to very large datasets.
Limited interpretability as black-box models.
Performance degradation with noisy data.
Inherent limitation to binary classification tasks.
Model: GP
Description: A probabilistic model that defines a distribution over functions, where any finite set of points has a joint Gaussian distribution.
Advantages:
Flexibility to model complex relationships without assuming specific functional forms.
Uncertainty quantification for reliable predictions under uncertainty.
Capable of interpolation and extrapolation for sparse or irregularly sampled data.
Automatic adjustment of complexity based on available data.
Nonparametric nature allows for growing complexity with increasing data.
Probabilistic framework enables principled uncertainty estimation and Bayesian inference.
Easy incorporation of prior knowledge through choice of covariance functions.
Disadvantages:
Computational complexity for large datasets.
Memory requirements for storing entire datasets.
Limited scalability to high-dimensional data.
Challenging choice of covariance function.
Sensitivity to hyperparameters.
Difficulty with non-Gaussian likelihoods.
Limited interpretability due to complex nature.
CNN, convolutional neural network; ESN, echo state networks; GP, Gaussian process; HMM, hidden Markov model; LSTM, long short-term memory; RNN, recurrent neural network; SVM, support vector machines; XGBoost, extreme gradient boosting.
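To make the recurrent entries in Table 1 more concrete, the following is a minimal, hypothetical sketch of a one-step-ahead multivariate forecaster built around an LSTM. It is not drawn from any of the reviewed studies; the window length, hidden size, and the synthetic tensors standing in for high-frequency cerebral signals (e.g., ICP and CPP) are illustrative assumptions only.

```python
# Minimal sketch (PyTorch), assuming synthetic data in place of real ICP/CPP windows.
import torch
import torch.nn as nn

torch.manual_seed(0)
n_signals, window = 2, 30                      # e.g., [ICP, CPP] over a 30-step input window
x = torch.randn(256, window, n_signals)        # hypothetical training windows
y = torch.randn(256, n_signals)                # hypothetical next-step targets

class LSTMForecaster(nn.Module):
    def __init__(self, n_in: int, hidden: int = 32):
        super().__init__()
        self.lstm = nn.LSTM(n_in, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_in)

    def forward(self, seq: torch.Tensor) -> torch.Tensor:
        out, _ = self.lstm(seq)                # hidden state at every time step
        return self.head(out[:, -1])           # forecast from the final hidden state

model = LSTMForecaster(n_signals)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

for _ in range(5):                             # a few epochs, purely for illustration
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    optimizer.step()
print(f"final training MSE on synthetic data: {loss.item():.4f}")
```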
Table 2. Studies employing non-linear multivariate state-space models for various cerebral physiological signals.
Study | Study Group | Relevant ML Model | Cerebral Physiology | Significance of the Model in the Study
Asgari et al., 2019 [131] | Adult patients with TBI | HMM | ICP, CPP, PRx, RAP; other: ABP | A HMM was utilized to determine the cerebral dynamic states with respect to various cerebral physiological signals.
Farhadi et al., 2019 [132] | Pediatric ICU patients | SVM, random forest | ICP, CPP; other: MAP, HR, BP | SVM and random forest models were compared with other models for prediction of ICP episodes, with the random forest model achieving the highest prediction accuracy (an illustrative random-forest sketch follows this table).
Fraiwan and Alkhodari, 2020 [119] | Neonates | LSTM | Sleep-state EEG recordings | A LSTM algorithm was utilized and compared with studies from the literature for automatic sleep-state scoring in neonates, achieving the highest accuracy.
Güiza et al., 2013 [133] | Patients with TBI | GP | ICP, CPP; other: MAP | A GP was compared to logistic regression for prediction of increased-ICP episodes and early prediction of unfavorable neurological outcome, with the GP model exhibiting the best overall performance.
Itälinna et al., 2023 [125] | Patients with mild TBI and healthy controls | SVM | MEG | A SVM classifier was trained on quantitative deviation maps to distinguish TBI patients from healthy control subjects.
Jiang et al., 2023 [126] | Healthy volunteers and individuals with major depressive disorder | HMM with multivariate autoregressive observation (MAR) | Resting-state and task-state EEG recordings | The HMM-MAR model illustrated the ability to capture neuronal dynamics from EEG signals and to interpret brain disease pathogenesis by analyzing state transitions.
Khadidos et al., 2023 [120] | Healthy volunteers and patients with depression | Decision tree, random forest, CNN, RNN, LSTM, XGBoost | Stimuli-induced EEG recordings | Models were compared for detection and classification of depression; the CNN showed the best performance among all employed models.
Khawaldeh et al., 2022 [97] | PD patients | HMM | LFP | A HMM was used to detect different LFP states to investigate the impact of various spectral states in the subthalamic nucleus LFP on motor impairment in PD patients.
Kim and Jeong, 2019 [124] | Healthy volunteers | ESN and Gaussian readouts | Stimuli-induced EEG recordings | ESN and Gaussian readouts were shown to effectively decode user movement intentions using a low-cost, portable EEG system.
Lee et al., 2021 [135] | Healthy volunteers and PD patients | CNN-RNN | Resting-state EEG recordings | CNNs were employed for feature extraction, while an RNN model was used for detection and classification of PD patients, showing better performance than baseline machine learning models as well as deep learning models from the literature.
Mughal et al., 2022 [39] | Healthy volunteers | CNN-LSTM | fNIRS and task-state EEG recordings | The CNN-LSTM hybrid model was applied to images generated by recurrence plots for stand-alone EEG and fNIRS data, as well as hybrid EEG-fNIRS data, for the classification of changes in brain state. The performance of the model using hybrid EEG-fNIRS data was superior to that of the other two image sets, as well as to the results reported in the literature.
Myers et al., 2016 [134] | Patients with severe TBI | GP | ICP, PbtO2; other: MAP, EtCO2, SaO2 | A GP was compared to logistic regression and an autoregressive model for univariate and multivariate prediction of ICP and PbtO2.
Nadalizadeh et al., 2024 [127] | Drivers in fatigued and normal states | k-NN, SVM, random forest | Resting-state and task-state EEG recordings | k-NN, SVM, and random forest classifiers were applied to features extracted from EEG signals for fatigue detection and recognition.
Paliwal et al., 2024 [128] | Patients of various ages who had undergone routine clinical EEG scans | CNN | EEG | A CNN was used to predict the brain age of a patient from EEG scans; CNN performance was shown to improve with multivariate iterative filtering.
Pancholi et al., 2023 [108] | Healthy volunteers | MLP, CNN-LSTM, WPD CNN-LSTM | Task-state EEG recordings | MLP, CNN-LSTM, and WPD CNN-LSTM models were employed and compared for the prediction of hand kinematic trajectory.
Uyulan et al., 2022 [129] | Patients with major depressive disorder and healthy volunteers | Pretrained CNN-LSTM | Resting-state EEG recordings | A hybrid model was employed alongside a stand-alone LSTM to detect depression-specific information from EEG signals for depression classification; the hybrid model demonstrated better performance, lower training time, and no overfitting issues.
Williamson et al., 2012 [104] | Patients with medically intractable focal epilepsy | SVM | Intracranial EEG | A SVM model was trained on 15 s of EEG signals for the classification of the preictal or interictal state of patients.
Yang et al., 2024 [123] | Healthy volunteers | M-ESN, ESN | Stimuli-induced EEG recordings | A M-ESN, in which the ESN hidden state is directly initialized, outperformed the standard ESN while having a smaller reservoir size and a simpler training process.
Xing et al., 2022 [121] | Healthy volunteers | CNN-LSTM | Stimuli-induced EEG recordings | A CNN-LSTM model was utilized for emotion detection by combining spatial-frequency-temporal features extracted from EEG signals; the model showed better performance compared to baseline methods.
Zafar et al., 2017 [130] | Healthy volunteers | CNN | Stimuli-induced EEG recordings | A CNN was utilized for feature extraction from EEG signals, which were then used in a separate classification task.
Zong et al., 2023 [122] | Healthy volunteers | XGBoost | Stimuli-induced EEG recordings | XGBoost was employed for an emotion recognition task using features extracted with a feature attention network module; the proposed model was shown to have better performance than baseline models.
ABP, arterial blood pressure; BP, blood pressure; CNN, convolutional neural network; CPP, cerebral perfusion pressure; EEG, electroencephalography; ESN, echo state network; EtCO2, end-tidal carbon dioxide; fNIRS, functional near-infrared spectroscopy; GP, Gaussian process; HMM, hidden Markov model; HR, heart rate; ICP, intracranial pressure; ICU, intensive care unit; LFP, local field potential; LSTM, long short-term memory; M-ESN, modular echo state network; MAP, mean arterial blood pressure; MAR, multivariate autoregressive; MEG, magnetoencephalography; ML, machine learning; MLP, multi-layer perceptron; PbtO2, cerebral tissue oxygenation; PD, Parkinson’s disease; PRx, pressure reactivity index; RAP, pressure–volume reserve; RNN, recurrent neural network; SaO2, arterial oxygen saturation; SVM, support vector machines; TBI, traumatic brain injury; WPD, wavelet packet decomposition; XGBoost, extreme gradient boosting.
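As a companion to Table 2, the short sketch below illustrates, on purely synthetic data, the kind of random-forest classification of elevated-ICP episodes reported by several tabulated studies (e.g., Farhadi et al. [132]). The feature layout, label definition, and all numeric values are hypothetical and serve only to show the workflow, not to reproduce any published result.

```python
# Minimal sketch (scikit-learn), assuming synthetic features in place of real ICP/CPP/MAP/HR summaries.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))                                   # hypothetical per-epoch summaries of ICP, CPP, MAP, HR
y = (X[:, 0] + 0.5 * rng.normal(size=500) > 0.5).astype(int)    # synthetic "elevated-ICP episode" label

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)
clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_train, y_train)

auc = roc_auc_score(y_test, clf.predict_proba(X_test)[:, 1])
print(f"held-out AUC on synthetic data: {auc:.2f}")
print("feature importances:", np.round(clf.feature_importances_, 2))
```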
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
