Trial Analysis of Brain Activity Information for the Presymptomatic Disease Detection of Rheumatoid Arthritis

Maeda, Keisuke; Ogawa, Takahiro; Kayama, Tasuku; Sasaki, Takuya; Tainaka, Kazuki; Murakami, Masaaki; Haseyama, Miki

doi:10.3390/bioengineering11060523


by Keisuke Maeda ¹, Takahiro Ogawa ², Tasuku Kayama ³, Takuya Sasaki ³,⁴, Kazuki Tainaka ⁵, Masaaki Murakami ⁶,⁷,⁸,⁹ and Miki Haseyama ²,*

¹ Data-Driven Interdisciplinary Research Emergence Department, Hokkaido University, N-13, W-10, Kita-ku, Sapporo 060-0813, Japan

² Faculty of Information Science and Technology, Hokkaido University, N-14, W-9, Kita-ku, Sapporo 060-0814, Japan

³ Department of Pharmacology, Graduate School of Pharmaceutical Sciences, Tohoku University, 6-3 Aramaki-Aoba, Aoba-ku, Sendai 980-8578, Japan

⁴ Department of Neuropharmacology, Tohoku University School of Medicine, 4-1 Seiryo-machi, Aoba-ku, Sendai 980-8575, Japan

⁵ Department of System Pathology for Neurological Disorders, Brain Research Institute, Niigata University, 1-757 Asahimachi-dori, Chuo-ku, Niigata 951-8585, Japan

⁶ Division of Molecular Psychoimmunology, Institute for Genetic Medicine and Graduate School of Medicine, Hokkaido University, Kita-15, Nishi-7, Kita-ku, Sapporo 060-0815, Japan

⁷ Division of Molecular Neuroimmunology, National Institute for Physiological Sciences, Myodaiji, Okazaki 444-8585, Japan

⁸ Group of Quantum Immunology, National Institute for Quantum and Radiological Science and Technology (QST), 4-9-1 Anagawa, Inage 263-8555, Japan

⁹ Institute for Vaccine Research and Development (HU-IVReD), Hokkaido University, Kita-21, Nishi-11, Kita-ku, Sapporo 001-0021, Japan

* Author to whom correspondence should be addressed.

Bioengineering 2024, 11(6), 523; https://doi.org/10.3390/bioengineering11060523

Submission received: 21 March 2024 / Revised: 26 April 2024 / Accepted: 15 May 2024 / Published: 21 May 2024

(This article belongs to the Section Biosignal Processing)


Abstract

This study presents a trial analysis that uses brain activity information obtained from mice to detect rheumatoid arthritis (RA) in its presymptomatic stages. Specifically, we confirmed that F759 mice, serving as a mouse model of RA that is dependent on the inflammatory cytokine IL-6, and healthy wild-type mice can be classified on the basis of brain activity information. We clarified which brain regions are useful for the presymptomatic detection of RA. We introduced a matrix completion-based approach to handle missing brain activity information to perform the aforementioned analysis. In addition, we implemented a canonical correlation-based method capable of analyzing the relationship between various types of brain activity information. This method allowed us to accurately classify F759 and wild-type mice, thereby identifying essential features, including crucial brain regions, for the presymptomatic detection of RA. Our experiment obtained brain activity information from 15 F759 and 10 wild-type mice and analyzed the acquired data. By employing four types of classifiers, our experimental results show that the thalamus and periaqueductal gray are effective for the classification task. Furthermore, we confirmed that classification performance was maximized when seven brain regions were used, excluding the electromyogram and nucleus accumbens.

Keywords:

presymptomatic disease; rheumatoid arthritis; brain activity; missing data; canonical correlation analysis

1. Introduction

According to the World Health Organization, the average life expectancy has increased by 5–10 years over the last two decades (from 2000 to 2019). Despite the concurrent increase in healthy life expectancy, the gap between overall and healthy life expectancy remains at approximately 10 years [1]. Longer healthy life expectancy is expected to improve quality of life, contribute to the overall revitalization of society, and reduce medical costs. In order to increase healthy life expectancy, it is essential to implement preventive measures at the “presymptomatic” stage, addressing health concerns before the onset of disease [2,3,4,5].

In recent years, machine learning has been widely used in presymptomatic disease research, particularly for detecting early signs of Alzheimer’s disease, the risk of which increases with age [2,4]. Despite these advancements, no established technology exists for pre-emptively detecting arthropathies such as rheumatoid arthritis (RA). Our study pioneers the development of a technology capable of identifying individuals at risk of developing RA. Previous studies on Alzheimer’s disease have demonstrated the efficacy of brain activity information, including electroencephalogram (EEG), functional near-infrared spectroscopy, and local field potential (LFP), in presymptomatic disease detection [6]. Therefore, it is expected that leveraging brain activity information will enable the accurate identification of presymptomatic individuals with RA.

Cytokines cause RA [7]. In order to verify the possibility of detecting individuals with RA using brain activity information, it is essential to develop the ability to classify F759 mice, which depend on the inflammatory cytokine IL-6, and healthy wild-type mice. Furthermore, determining the specific brain regions involved in RA is crucial. Although the direct analysis of brain activity information is necessary, specific brain regions may have missing information because of physical constraints, such as electrode insertion-related issues [8,9,10]. Consequently, extracting information about RA induction from incomplete brain activity data is a complex task. The primary objective of this study is to fill the gaps in brain activity information, enabling the analysis of the relationship between the data obtained from various brain regions, classification performance, and the identification of brain regions crucial for RA detection.

In this study, we conducted a trial analysis for the early detection of presymptomatic RA using missing value completion techniques for brain activity information and correlation analysis capable of assessing multiple aspects of brain activity. Specifically, we used a matrix factorization-based approach to address missing brain activity information, confirming its efficacy in complementing missing biological data, including behavior [11] and brain [12] data. In order to identify crucial brain regions for classifying F759 and wild-type mice, we employed canonical correlation analysis, a method widely used for analyzing various brain activity information [13,14,15,16,17]. Specifically, we used supervised multiview canonical correlation analysis (sMVCCA) [18], which is capable of handling multiple types of information. Subsequently, we calculated cross-loadings using features obtained from sMVCCA because this calculation is a powerful tool for revealing the potential relationships between biological information and other factors [19,20,21]. Thus, we identified important brain regions contributing to the classification of F759 and wild-type mice. Furthermore, by constructing a machine learning-based classification model using brain activity information, we clarified the relationship between information derived from different brain regions and classification performance.

2. Data Acquisition of Brain Activities and Feature Extraction

This section outlines the process of acquiring data from mice, as detailed in Section 2.1, and subsequently discusses the procedure for calculating features from the obtained data in Section 2.2.

2.1. Brain Activity Data Acquisition

2.1.1. Animals

All mice used in this study were 8–16 weeks old and had pre-operative weights of 25–35 g. Male C57BL/6 J wild-type mice were purchased from SLC, Inc. (Shizuoka, Japan). The F759 mouse line carrying human gp130 (S710L) has been previously established [22]. In brief, the F759 mice underwent genetic modification involving the replacement of the intracellular region of the mouse gp130 gene, a gene responsible for encoding a signal transducer within the IL-6 receptor complex, with an intracellular mutant human gp130 cDNA (Y759F). This mutation inhibits SOCS3-mediated negative feedback, resulting in enhanced activation of STAT3 after IL-6 stimulation. Consequently, the injection of IL-6 and IL-17 into the ankle joints of F759 mice activates the IL-6 amplifier, improving the activation of the NFkB pathway in nonimmune cells, including synovial cells, ultimately resulting in the development of rheumatoid-like arthritis [23]. The mice were housed in a vivarium at a controlled temperature (22 ± 1 °C) and humidity (55 ± 5%), with a 12:12 h light/dark cycle (lights on from 8 am to 8 pm). The mice were provided ad libitum access to food and water and were individually housed.

2.1.2. Surgery

The standard surgical procedures were similar to those described in previous studies [24,25,26]. The mice were anesthetized with 1–2% isoflurane gas in air and then fixed in a stereotaxic instrument equipped with two ear bars and a nose clamp. Craniotomy was performed on the left hemisphere at specific co-ordinates: anterior cingulate cortex (ACC; 1.1 mm anterior and 0.2 mm lateral to the bregma), prelimbic cortex (PL; 1.7 mm anterior and 0.2 mm lateral to the bregma), nucleus accumbens (NAc; 0.8 mm anterior and 0.8 mm lateral to the bregma), amygdala (AMY; 0.8 mm posterior and 3.0 mm lateral to the bregma), primary somatosensory cortex (S1; 1.4 mm posterior and 2.1 mm lateral to the bregma), thalamus (THL; 2.1 mm posterior and 1.3 mm lateral to the bregma), and periaqueductal gray (PAG; 3.5 mm posterior and 0.2 mm lateral to the bregma). The electrode array was directly implanted into the brain tissue, with electrodes inserted at varying depths (0.8 mm into S1, 2.0 mm into ACC, 2.5 mm into PL, 2.8 mm into PAG, 3.0 mm into THL, and 4.4 mm into NAc and AMY). Two electromyogram (EMG) electrodes were implanted in the dorsal neck area. In addition, stainless steel screws were positioned on the skull above the olfactory bulb and cerebellum, serving as recording electrodes for respiratory signals and ground/reference electrodes, respectively. All wires and electrode arrays were securely attached to the skull using dental cement. After completing all surgical procedures, anesthesia was discontinued, allowing the animals to awaken naturally. After surgery, each mouse was housed with free access to water and food while undergoing daily observations.

2.1.3. Electrophysiological Recording and Histological Analysis to Confirm Electrode Placement

The mice were connected to the recording equipment via a Cereplex M (Blackrock Microsystems, Salt Lake City, UT, USA), a digitally programmable amplifier for electrophysiological signal recording. The head stage output was then transmitted to the Cereplex Direct recording system, a data acquisition system, using a lightweight multiwire tether and a commutator.

The mice were euthanized with an overdose of urethane/α-chloralose, followed by intracardial perfusion with 4% paraformaldehyde (PFA) in phosphate-buffered saline (pH 7.4) and subsequent decapitation. After dissection, the brains were fixed overnight in 4% PFA and equilibrated with 20% and 30% sucrose in phosphate-buffered saline, each overnight. Frozen coronal sections (50 µm) were obtained using a microtome, and the resulting serial sections were mounted and subjected to cresyl violet staining. For cresyl violet staining, the slices were water-rinsed, stained with cresyl violet, and coverslipped using a hydrophobic mounting medium (Marinol). The positions of all electrodes were verified by identifying the corresponding electrode tracks in the histological tissue using an optical microscope (All-in-One Fluorescence Microscope BZ-X810, Keyence, Itasca, IL, USA).

2.1.4. Simultaneous LFP Recording of Seven Brain Regions during Quiescent Periods

LFP signals were simultaneously recorded for over 5 h from various brain regions, including ACC, PL, NAc, AMY, S1, THL, and PAG, in freely moving mice (Figure 1a). EMG signals were recorded using electrodes implanted in the dorsal neck area to assess the movements of the animals. Respiratory signals indicating breathing activity (BR) were recorded from the skull above the olfactory bulb. For analysis, LFP signals spanning 3600 s were manually extracted from quiescent periods identified when EMG signals exhibited minimal fluctuations (Figure 1b).

2.2. Feature Extraction

These electrical signals were sampled at 2 kHz. In previous studies [27,28], the effectiveness of signals within the delta (1–4 Hz), theta (6–10 Hz), and gamma (40–100 Hz) bands was demonstrated. Corresponding band-pass filters were applied to the original signals to extract these bands. This experiment recorded data from a resting-state mouse over a 1 h period. A Hamming window was applied to segment the data into 1 min intervals. Fourier transformation was then applied to the segmented data to calculate each band’s amplitude spectrum and ratio to the entire signal. Consequently, 60 samples of six-dimensional signals (=3 bands × 2 types of data) were obtained for each mouse. Furthermore, signals were collected from mice three times at 3-day intervals, resulting in 180 samples per mouse. An overview of the feature calculation is shown in Figure 2.
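The feature calculation described above can be sketched roughly as follows. This is a minimal illustration rather than the authors' implementation: the function name `band_features` is hypothetical, the bands are selected directly from FFT bins instead of applying separate band-pass filters, and "ratio to the entire signal" is assumed to mean the summed band amplitude divided by the summed amplitude over all frequencies.

```python
import numpy as np

FS = 2000  # sampling rate in Hz (2 kHz, as stated in the text)
BANDS = {"delta": (1.0, 4.0), "theta": (6.0, 10.0), "gamma": (40.0, 100.0)}

def band_features(signal, fs=FS, window_sec=60):
    """Return an (n_windows, 6) array: 3 band amplitudes, then 3 band ratios.

    Assumption: bands are picked from FFT bins of each Hamming-windowed
    segment; the source applies band-pass filters before the transform.
    """
    win_len = int(fs * window_sec)
    n_windows = len(signal) // win_len
    window = np.hamming(win_len)
    freqs = np.fft.rfftfreq(win_len, d=1.0 / fs)
    features = np.empty((n_windows, 2 * len(BANDS)))
    for w in range(n_windows):
        segment = signal[w * win_len:(w + 1) * win_len] * window
        amplitude = np.abs(np.fft.rfft(segment))
        total = amplitude.sum()
        for b, (lo, hi) in enumerate(BANDS.values()):
            in_band = (freqs >= lo) & (freqs <= hi)
            band_amp = amplitude[in_band].sum()
            features[w, b] = band_amp                      # band amplitude
            features[w, len(BANDS) + b] = band_amp / total  # band ratio
    return features
```

Under these assumptions, a 3600 s recording sampled at 2 kHz yields 60 rows of six values, matching the 60 six-dimensional samples per recording session mentioned in the text.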

3. Approach for Presymptomatic Disease Detection of Rheumatoid Arthritis

In Section 3.1, we detail the process of complementing missing brain activity information. In Section 3.2, we apply sMVCCA to the complemented brain activity information and label the information so as to indicate whether the mice are F759 or wild-type. Subsequently, we identify brain regions highly associated with presymptomatic mice. Finally, we confirm the high accuracy of presymptomatic detection of RA by classifying two types of mice using brain activity information obtained from regions strongly linked to presymptomatic mice in Section 3.3.

3.1. Completion of Missing Data

In order to complement the missing data, we employed regularized matrix factorization. Figure 3 provides an overview of the complementation process. By considering the brain features $x_{n,m} \in \mathbb{R}^{D}$ of the m-th brain region ($m = 1, 2, \dots, M$; $M$ being the number of brain regions) of the n-th mouse ($n = 1, 2, \dots, N$; $N$ being the number of mice), we define the feature matrix $X \in \mathbb{R}^{N \times (D \times M)}$ as follows:

$$X = \begin{bmatrix} x_{1,1}^{\top} & x_{1,2}^{\top} & \dots & x_{1,M}^{\top} \\ x_{2,1}^{\top} & x_{2,2}^{\top} & \dots & x_{2,M}^{\top} \\ \vdots & \vdots & \ddots & \vdots \\ x_{N,1}^{\top} & x_{N,2}^{\top} & \dots & x_{N,M}^{\top} \end{bmatrix}, \tag{1}$$

where D represents the dimension of features derived from brain activity information and is set to 6, as explained in Section 2.2.
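As a small illustration of the layout in Equation (1), the sketch below assembles a feature matrix in which each row is one mouse and each block of D columns is one brain region, with unrecorded regions left as NaN. The function name `build_feature_matrix` and the dictionary-based input format are assumptions made for this example only.

```python
import numpy as np

def build_feature_matrix(region_features, n_regions, dim):
    """Stack per-region feature vectors into an N x (D * M) matrix.

    region_features: list with one dict per mouse, mapping a region index
    (0 .. n_regions - 1) to a 1-D array of length dim. Regions absent from
    the dict are treated as missing and stay NaN (hypothetical convention).
    """
    X = np.full((len(region_features), n_regions * dim), np.nan)
    for n, regions in enumerate(region_features):
        for m, x_nm in regions.items():
            X[n, m * dim:(m + 1) * dim] = x_nm
    return X
```

Marking a whole D-dimensional block as NaN mirrors the situation described later in the text, in which an entire brain region is absent for a given mouse.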

In order to complement the missing values in the feature matrix $X$, our method uses matrix factorization, which is a baseline for missing value complementation. In general matrix factorization, it is assumed that the feature matrix $X$ containing missing values can be expressed as the product of two matrices, $P$ and $Q$, with K-dimensional latent features, as outlined below:

$$X \approx P Q^{\top} = \hat{X}, \tag{2}$$

where each row of $P$ represents the strength of association between the mouse and the latent feature. In contrast, each row of $Q$ represents the strength of association between the brain region feature and the latent feature. The $(i, j)$-th element of $\hat{X}$ is calculated using the feature vectors $p_{i}$ and $q_{j}$ from the matrices $P$ and $Q$. By using this approach, we specifically compensate for missing values through matrix factorization as follows:

$$\hat{x}_{ij} = p_{i}^{\top} q_{j} = \sum_{k=1}^{K} p_{ik} q_{kj}. \tag{3}$$

Moreover, to determine the optimal $P$ and $Q$, we minimize the squared error between the observed matrix $X$ and the complemented matrix $\hat{X}$ as follows:

$$\arg\min_{P, Q} \sum_{i,j} (x_{ij} - \hat{x}_{ij})^{2} = \arg\min_{P, Q} \sum_{i,j} \Big(x_{ij} - \sum_{k=1}^{K} p_{ik} q_{kj}\Big)^{2}. \tag{4}$$

The loss of information about brain activity stems from malfunctions in the acquisition equipment, leading to the complete absence of specific brain regions. Because this loss is very different from the typically assumed random loss, it is difficult for general matrix factorization to compensate for it accurately. Our method introduces a regularized matrix factorization that incorporates bias terms for each mouse and brain region feature, accounting for the bias associated with missing values. In regularized matrix factorization, the optimal $P$ and $Q$ are estimated by solving the following equation:

$$\arg\min_{P, Q} \sum_{i,j} (x_{ij} - \hat{x}_{ij})^{2} + \frac{\lambda}{2} \left( \|b_{\mathrm{mouse}}\|^{2} + \|b_{\mathrm{brain}}\|^{2} + \|p_{i}\|^{2} + \|q_{j}\|^{2} \right), \tag{5}$$

where $b_{\mathrm{mouse}}$ and $b_{\mathrm{brain}}$ represent the bias terms for each mouse and brain feature, respectively. Equation (5) can be solved using the gradient descent method, with each parameter updated as follows:

$$b_{\mathrm{mouse}}^{'} = b_{\mathrm{mouse}} + \alpha \left( 2\, \mathrm{ave}(e_{i,*}) - \lambda b_{\mathrm{mouse}} \right), \tag{6}$$

$$b_{\mathrm{brain}}^{'} = b_{\mathrm{brain}} + \alpha \left( 2\, \mathrm{ave}(e_{*,j}) - \lambda b_{\mathrm{brain}} \right), \tag{7}$$

$$p_{ik}^{'} = p_{ik} + \alpha \left( 2 e_{ij} q_{kj} - \lambda p_{ik} \right), \tag{8}$$

$$q_{kj}^{'} = q_{kj} + \alpha \left( 2 e_{ij} p_{ik} - \lambda q_{kj} \right), \tag{9}$$

$$e_{ij} = x_{ij} - \left( \mu + b_{\mathrm{mouse}, i} + b_{\mathrm{brain}, j} + p_{i}^{\top} q_{j} \right). \tag{10}$$
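As a brief check on the update rules, Equation (8) is what a single gradient descent step gives when applied to a per-entry version of the objective in Equation (5). The per-entry objective $J_{ij}$ below is an assumption introduced for illustration; it is not written out in the text. Because $e_{ij}$ in Equation (10) depends on $p_{ik}$ only through $-p_{i}^{\top} q_{j}$, its derivative with respect to $p_{ik}$ is $-q_{kj}$:

```latex
\begin{align}
J_{ij} &= e_{ij}^{2} + \frac{\lambda}{2}\left( b_{\mathrm{mouse},i}^{2} + b_{\mathrm{brain},j}^{2}
          + \|p_{i}\|^{2} + \|q_{j}\|^{2} \right), \\
\frac{\partial J_{ij}}{\partial p_{ik}} &= 2 e_{ij} \frac{\partial e_{ij}}{\partial p_{ik}} + \lambda p_{ik}
          = -2 e_{ij} q_{kj} + \lambda p_{ik}, \\
p_{ik}^{'} &= p_{ik} - \alpha \frac{\partial J_{ij}}{\partial p_{ik}}
          = p_{ik} + \alpha \left( 2 e_{ij} q_{kj} - \lambda p_{ik} \right).
\end{align}
```

The same argument with the roles of $p_{ik}$ and $q_{kj}$ exchanged gives Equation (9).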

Finally, the $(i, j)$-th element of the complemented matrix $\hat{X}$ is predicted using the following equation:

$$\hat{x}_{ij} = \mu + b_{\mathrm{mouse}, i} + b_{\mathrm{brain}, j} + p_{i}^{\top} q_{j}. \tag{11}$$
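The updates in Equations (6)–(11) can be sketched as a small stochastic gradient descent loop. This is a simplified illustration under stated assumptions, not the authors' code: the function name `regularized_mf` is hypothetical, the bias terms are updated per observed entry rather than with the row and column error averages of Equations (6) and (7), $\mu$ is assumed to be the mean of the observed entries, and the default step size and regularization weight are chosen for toy data rather than taken from the paper.

```python
import numpy as np

def regularized_mf(X, k=2, alpha=0.01, lam=0.01, n_epochs=200, seed=0):
    """Fit biases and latent factors on observed entries; return the prediction matrix."""
    rng = np.random.default_rng(seed)
    observed = ~np.isnan(X)
    rows, cols = np.nonzero(observed)
    mu = X[observed].mean()  # assumed definition of the global offset
    n, d = X.shape
    P = 0.1 * rng.standard_normal((n, k))  # one latent vector per mouse
    Q = 0.1 * rng.standard_normal((d, k))  # one latent vector per feature column
    b_row = np.zeros(n)
    b_col = np.zeros(d)
    for _ in range(n_epochs):
        for idx in rng.permutation(len(rows)):
            i, j = rows[idx], cols[idx]
            # Prediction error on one observed entry, as in Equation (10).
            e = X[i, j] - (mu + b_row[i] + b_col[j] + P[i] @ Q[j])
            # Per-entry bias updates (simplification of Equations (6) and (7)).
            b_row[i] += alpha * (2 * e - lam * b_row[i])
            b_col[j] += alpha * (2 * e - lam * b_col[j])
            p_old = P[i].copy()
            P[i] += alpha * (2 * e * Q[j] - lam * P[i])   # Equation (8)
            Q[j] += alpha * (2 * e * p_old - lam * Q[j])  # Equation (9)
    # Equation (11), evaluated for every entry at once.
    return mu + b_row[:, None] + b_col[None, :] + P @ Q.T
```

Only observed entries contribute to the updates, so the returned matrix supplies predictions for the missing positions without those positions influencing the fit.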

Based on the aforementioned information, the missing elements in the feature matrix $X$ are replaced with the corresponding elements in the predicted complemented matrix $\hat{X}$ to generate the complemented feature matrix $Z$, defined as follows:

$$Z = \begin{bmatrix} z_{1,1}^{\top} & z_{1,2}^{\top} & \dots & z_{1,M}^{\top} \\ z_{2,1}^{\top} & z_{2,2}^{\top} & \dots & z_{2,M}^{\top} \\ \vdots & \vdots & \ddots & \vdots \\ z_{N,1}^{\top} & z_{N,2}^{\top} & \dots & z_{N,M}^{\top} \end{bmatrix} \tag{12}$$

$$\phantom{Z} = [Z_{1}, Z_{2}, \dots, Z_{M}], \tag{13}$$

where

$$z_{n,m} = \begin{cases} \hat{x}_{n,m} & (\text{if the } m\text{-th brain region of the } n\text{-th mouse is missing}) \\ x_{n,m} & (\text{otherwise}) \end{cases}. \tag{14}$$

Notably, $Z_{m} \in \mathbb{R}^{N \times D}$ represents the complemented feature matrix obtained from the m-th brain region.
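The substitution rule in Equation (14) and the block structure in Equation (13) amount to a masked replacement followed by a column split. The sketch below assumes missing entries are encoded as NaN and that all regions share the same feature dimension; the function name `complete_and_split` is hypothetical.

```python
import numpy as np

def complete_and_split(X, X_hat, n_regions):
    """Keep observed values, fill NaN entries from X_hat, and split by region."""
    Z = np.where(np.isnan(X), X_hat, X)  # Equation (14), applied element-wise
    blocks = np.hsplit(Z, n_regions)     # Z_1, ..., Z_M of Equation (13)
    return Z, blocks
```

Observed values are never overwritten, so the completion step only affects the positions that were missing in the original recording.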

Furthermore, $b_{\mathrm{mouse}}$ and $b_{\mathrm{brain}}$ in Equation (5) are effective in complementing the missing values in specific brain regions because they introduce bias for each mouse and brain region. Consequently, applying regularized matrix factorization to the feature matrix enables accurate complementation of missing values in brain activity information.

3.2. Estimation of Important Brain Region

In this subsection, we examine the potential correlation between class labels representing mouse types (i.e., F759 or wild-type) and feature vectors obtained from each brain region in the complemented feature matrix $Z$. Because multiple brain regions require comparison with class labels, we construct a latent variable model capable of handling multiview and supervised data. By using sMVCCA [18], which is designed to handle various types of information, we calculate cross-loadings to estimate important brain regions. The overall process is outlined at the top of Figure 4.

We estimate the projection vectors $v_{m'}$ ($m' \in \{1, 2, \dots, M, l\}$; $l$ being the class label) by maximizing the following equation:

$$\arg\max_{v_{m'}} \sum_{m'_{1}} \sum_{m'_{2} \neq m'_{1}} \frac{v_{m'_{1}}^{\top} C_{m'_{1}, m'_{2}} v_{m'_{2}}}{\sqrt{v_{m'_{1}}^{\top} C_{m'_{1}, m'_{1}} v_{m'_{1}}} \sqrt{v_{m'_{2}}^{\top} C_{m'_{2}, m'_{2}} v_{m'_{2}}}}, \quad \text{s.t.}\ v_{m'_{1}}^{\top} C_{m'_{1}, m'_{1}} v_{m'_{1}} = 1,\ v_{m'_{2}}^{\top} C_{m'_{2}, m'_{2}} v_{m'_{2}} = 1, \tag{15}$$

where $C_{m'_{1}, m'_{2}} = Z_{m'_{1}}^{\top} Z_{m'_{2}}$ and $Z_{l} \in \mathbb{R}^{N \times D_{l}}$, where $D_{l}$ is the number of class labels. In this study, because we used F759 and wild-type mice, we set $D_{l}$ to 2. Because the solution to the aforementioned problem is independent of the scale of $v_{m'_{1}}$ and $v_{m'_{2}}$, Equation (15) can be rewritten as follows:

$$\arg\max_{v_{m'}} \sum_{m'_{1}} \sum_{m'_{2} \neq m'_{1}} v_{m'_{1}}^{\top} C_{m'_{1}, m'_{2}} v_{m'_{2}}. \tag{16}$$

In our method, we define $V = [V_{1}^{\top}, V_{2}^{\top}, \dots, V_{M}^{\top}, V_{l}^{\top}]^{\top} \in \mathbb{R}^{(M \times D + D_{l}) \times D_{p}}$. Notably, $D_{p}$ ($\leq \min(D, D_{l})$) represents the dimension of the latent features obtained via sMVCCA. Equation (16) can be rewritten as follows:

$$\arg\max_{V} \mathrm{trace}(V^{\top} \bar{C} V) \quad \text{s.t.}\ V^{\top} \underline{C} V = I, \tag{17}$$

where

$$\bar{C} = \begin{bmatrix} 0 & C_{1,2} & \dots & C_{1,M} & C_{1,l} \\ C_{2,1} & 0 & \dots & C_{2,M} & C_{2,l} \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ C_{M,1} & C_{M,2} & \dots & 0 & C_{M,l} \\ C_{l,1} & C_{l,2} & \dots & C_{l,M} & 0 \end{bmatrix}, \tag{18}$$

$$\underline{C} = \begin{bmatrix} C_{1,1} & 0 & \dots & 0 & 0 \\ 0 & C_{2,2} & \dots & 0 & 0 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & \dots & C_{M,M} & 0 \\ 0 & 0 & \dots & 0 & C_{l,l} \end{bmatrix}. \tag{19}$$

Finally, we solve the following generalized eigenvalue problem:

$$\bar{C} V = \epsilon (\underline{C} + \eta I) V, \tag{20}$$

where $\epsilon$ denotes an eigenvalue, and $\eta$ denotes a regularization parameter. The optimal projection $\hat{V}_{m'}$ for the feature transformation is obtained by solving the aforementioned problem. The matrix $\hat{V}_{m'}$ is constructed using the eigenvectors of the $D_{p}$-largest eigenvalues. We can then calculate the projected features as follows:

$$\hat{Z}_{m'} = Z_{m'} \hat{V}_{m'} \in \mathbb{R}^{N \times D_{p}}, \tag{21}$$

where $\hat{Z}_{m'} = [\hat{z}_{m'}^{1}, \hat{z}_{m'}^{2}, \dots, \hat{z}_{m'}^{N}]^{\top}$. Consequently, by concatenating these projected features for each brain region, we can obtain the projected features $\hat{z}^{n} = [(\hat{z}_{1}^{n})^{\top}, (\hat{z}_{2}^{n})^{\top}, \dots, (\hat{z}_{M}^{n})^{\top}]^{\top}$ of the n-th sample and construct the classifiers in Section 3.3.
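A compact way to see how Equations (17)–(21) fit together is to build the two block matrices and solve the regularized generalized eigenvalue problem numerically. The sketch below is illustrative only: the function name `smvcca` is hypothetical, each view is mean-centered before forming $Z^{\top} Z$ (an assumption not stated in the text), the class label is passed as a one-hot matrix, and the generalized problem is reduced to a standard symmetric one through a Cholesky factor of $\underline{C} + \eta I$.

```python
import numpy as np

def smvcca(views, labels_onehot, n_components=1, eta=0.01):
    """Return top eigenvalues, per-block projection matrices, and projected blocks."""
    # Assumption: center every block so that Z^T Z acts like a covariance.
    blocks = [v - v.mean(axis=0) for v in views]
    blocks.append(labels_onehot - labels_onehot.mean(axis=0))
    stacked = np.hstack(blocks)
    full_cov = stacked.T @ stacked
    sizes = [b.shape[1] for b in blocks]
    edges = np.concatenate([[0], np.cumsum(sizes)])
    # Block-diagonal part (Equation (19)) and off-diagonal part (Equation (18)).
    c_diag = np.zeros_like(full_cov)
    for start, stop in zip(edges[:-1], edges[1:]):
        c_diag[start:stop, start:stop] = full_cov[start:stop, start:stop]
    c_off = full_cov - c_diag
    # Solve c_off V = eps (c_diag + eta I) V via B = L L^T and u = L^T v.
    B = c_diag + eta * np.eye(full_cov.shape[0])
    L_inv = np.linalg.inv(np.linalg.cholesky(B))
    sym = L_inv @ c_off @ L_inv.T
    sym = (sym + sym.T) / 2  # remove round-off asymmetry before eigh
    eigvals, eigvecs = np.linalg.eigh(sym)
    order = np.argsort(eigvals)[::-1][:n_components]
    V = L_inv.T @ eigvecs[:, order]
    projections = [V[start:stop] for start, stop in zip(edges[:-1], edges[1:])]
    projected = [b @ p for b, p in zip(blocks, projections)]  # Equation (21)
    return eigvals[order], projections, projected
```

The last entries of `projections` and `projected` correspond to the label block, so the first column of `projected[-1]` plays the role of the projected label feature used for the cross-loadings.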

In addition to estimating projected features, we calculated cross-loadings to identify brain regions important for classifying F759 and wild-type mice using sMVCCA. The cross-loading between the two features, $g$ and $h$, can be calculated using Pearson’s correlation coefficient as follows:

$$\mathrm{Corr}(g, h) = \frac{\sum_{i=1}^{N} (g_{i} - \bar{g}) (h_{i} - \bar{h})}{\sqrt{\sum_{i=1}^{N} (g_{i} - \bar{g})^{2}} \sqrt{\sum_{i=1}^{N} (h_{i} - \bar{h})^{2}}}, \tag{22}$$

where $g$ and $h$ represent the features, and $\bar{g}$ and $\bar{h}$ represent the averages of $g$ and $h$, respectively. In order to calculate the cross-loading between the m-th brain region and the class label, we set $g$ to $z_{m,d} \in \mathbb{R}^{N}$ ($d = 1, 2, \dots, D$). Notably, the m-th brain features are $z_{m} = [z_{m,1}^{\top}, z_{m,2}^{\top}, \dots, z_{m,D}^{\top}]^{\top}$. In addition, $h$ is set to the projected label feature $\hat{z}_{l}^{1} \in \mathbb{R}^{N}$, where $\hat{z}_{l}^{1}$ is the feature with the highest eigenvalue obtained via sMVCCA. Consequently, the cross-loading $CL_{m}$ of the m-th brain region is computed by averaging the cross-loadings calculated from each dimension, d, and the projected label feature as follows:

$$CL_{m} = \frac{1}{D} \sum_{d=1}^{D} \mathrm{Corr}(z_{m,d}, \hat{z}_{l}^{1}). \tag{23}$$

If $CL_{m}$ is high, the m-th brain region is likely to be effective for classifying mice.

The aforementioned process allows us to calculate the cross-loading between the class labels and features from each brain region in the common latent space. Therefore, effective brain regions can be estimated by computing the cross-loading, $CL_{m}$.
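The cross-loading of one brain region can be sketched as below. The code averages the per-dimension Pearson correlations, following the description of averaging in the text; the function name `cross_loading` and the argument names are hypothetical.

```python
import numpy as np

def cross_loading(Z_m, label_projection):
    """Average Pearson correlation between each column of Z_m and the projected label.

    Z_m: (N, D) complemented features of one brain region.
    label_projection: length-N projected label feature (first sMVCCA component).
    """
    corrs = [
        np.corrcoef(Z_m[:, d], label_projection)[0, 1]
        for d in range(Z_m.shape[1])
    ]
    return float(np.mean(corrs))
```

Because the score is a signed average, dimensions that correlate with the label in opposite directions cancel each other out; this is worth keeping in mind when comparing regions.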

3.3. Classification of Mice

In this section, by using the cross-loading calculated in Section 3.2, we arrange brain regions in descending order of their values and construct classifiers using the selected features. By using the constructed classifiers, we compared the classification performance of mice to identify an effective combination of brain regions. Essentially, this process allows us to identify those brain regions crucial for the presymptomatic detection of RA. The outlined flow is shown at the bottom of Figure 4.

In order to construct the classifier, we arrange the M brain regions in descending order on the basis of their cross-loading values and select up to the R-th highest cross-loading brain region. We then redefine the features $\hat{z}^{n} \in \mathbb{R}^{D_{p} \cdot R}$ of the n-th sample as follows:

$$\hat{z}^{n} = [\hat{z}_{1}^{n\top}, \hat{z}_{2}^{n\top}, \dots, \hat{z}_{R}^{n\top}]. \tag{24}$$
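The selection step in Equation (24) can be illustrated with a short sketch that ranks regions by cross-loading and concatenates the projected features of the top R regions. The function name `select_top_regions` and its argument layout are assumptions made for this example.

```python
import numpy as np

def select_top_regions(projected, cross_loadings, r):
    """Concatenate projected features of the r regions with the highest cross-loading.

    projected: list of (N, D_p) arrays, one per brain region.
    cross_loadings: one score per region, in the same order.
    """
    order = np.argsort(cross_loadings)[::-1][:r]  # descending by cross-loading
    return order, np.hstack([projected[m] for m in order])
```

The returned feature matrix has D_p · R columns, which matches the dimension stated for the redefined features.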

The analysis also assesses the robustness of the results by applying the obtained $\hat{z}^{n}$ to multiple classifiers. Specifically, four classifiers, namely, linear discriminant analysis (LDA), K-nearest neighbor (KNN), support vector machine (SVM) [29], and extreme learning machine (ELM) [30], are employed. The LDA model reduces within-class variance and increases between-class variance. The KNN model classifies target data on the basis of the Euclidean distance between features. The SVM model maximizes the distance, termed the margin, between the features of each class and the discriminative boundary separating the classes. These three models are traditional baseline classification models, which are commonly used in studies involving brain activity information, where acquiring a large amount of data is challenging. As the fourth classifier in this analysis, we opted for a neural network-based model. On the basis of a report that a simple multilayer perceptron can be effective, even with a small training dataset, we used an ELM model with one hidden layer. The classification performance obtained with features arranged according to their cross-loading values is then compared, elucidating the combination of brain regions that is effective for detecting presymptomatic mice.
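For readers unfamiliar with the ELM, the following sketch shows the basic idea of a single-hidden-layer network whose input weights are random and fixed, with only the output weights fitted by least squares. It is a generic textbook-style variant written for illustration; the class name, the tanh activation, and the use of a pseudoinverse are assumptions and may differ from the model used in the experiments.

```python
import numpy as np

class ExtremeLearningMachine:
    """Single hidden layer with random fixed weights and least-squares output weights."""

    def __init__(self, n_hidden=1000, seed=0):
        self.n_hidden = n_hidden
        self.rng = np.random.default_rng(seed)

    def _hidden(self, X):
        # Random nonlinear feature map; tanh is an assumed activation.
        return np.tanh(X @ self.W + self.b)

    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.W = self.rng.standard_normal((X.shape[1], self.n_hidden))
        self.b = self.rng.standard_normal(self.n_hidden)
        # One-hot targets, one column per class.
        T = (y[:, None] == self.classes_[None, :]).astype(float)
        # Output weights from the Moore-Penrose pseudoinverse of the hidden layer.
        self.beta = np.linalg.pinv(self._hidden(X)) @ T
        return self

    def predict(self, X):
        scores = self._hidden(X) @ self.beta
        return self.classes_[np.argmax(scores, axis=1)]
```

Because only `beta` is learned, training reduces to one linear solve, which is one reason this family of models is often considered when the number of training samples is small.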

4. Experimental Conditions

In this experiment, we used 10 wild-type and 15 F759 mice. We attempted to obtain data from nine brain regions (i.e., $M = 9$), but the data from some regions were missing. Figure 5 shows, for each mouse, which data were acquired (“✓”) and which were missing (“-”). As shown in Figure 5, approximately 11.6% of the total data were missing. One way to address missing data is to exclude mice with missing data from the analysis. However, this approach is inappropriate here because 18 of the 25 mice have missing data. Therefore, it was necessary to establish a method based on the assumption of missing data, and this study focused on data completion. These missing data were then complemented using the regularized matrix factorization introduced in Section 3.1.

The details of the parameters used in our method are as follows. The dimension, D, of the features obtained from each brain region was 6. The parameters $\alpha$ and $\lambda$, which were used in the regularized matrix factorization, were $2.0 \times 10^{-5}$ and $1.0 \times 10^{-5}$, respectively. The parameter $\eta$ in sMVCCA was set to $0.01$. The number of neighbors for the KNN model was set to 9, and a linear kernel was adopted as the kernel function for the SVM model. The ELM model is a three-layer neural network with 1000 nodes in the hidden layer.

For evaluation, we adopted the leave-mouse-out approach, where 24 out of 25 mice were used for training and 1 mouse was used for testing. The classifiers were constructed in the order of cross-loading, and the parameter R, which determines the number of features to be used, was varied from 1 to 9 in the experiments.
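The leave-mouse-out protocol can be sketched as a loop that holds out all samples of one mouse at a time. The example below pairs it with a small Euclidean KNN classifier for binary labels; the function names, the 0/1 label encoding, and the choice to report per-mouse accuracy are assumptions for illustration, and the sketch does not show how the completion and sMVCCA steps were handled across folds.

```python
import numpy as np

def knn_predict(train_X, train_y, test_X, k=9):
    """Majority vote among the k nearest training samples (labels must be 0 or 1)."""
    dists = np.linalg.norm(test_X[:, None, :] - train_X[None, :, :], axis=2)
    nearest = np.argsort(dists, axis=1)[:, :k]
    votes = train_y[nearest]
    return (votes.mean(axis=1) > 0.5).astype(int)

def leave_one_mouse_out(features, labels, mouse_ids, k=9):
    """Return a dict mapping each mouse to the accuracy obtained when it is held out."""
    accuracies = {}
    for mouse in np.unique(mouse_ids):
        test = mouse_ids == mouse
        pred = knn_predict(features[~test], labels[~test], features[test], k=k)
        accuracies[str(mouse)] = float((pred == labels[test]).mean())
    return accuracies
```

Splitting by mouse rather than by sample keeps the 180 samples of a held-out mouse entirely out of the training set, so the reported accuracy reflects generalization to an unseen animal.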

5. Results

Figure 6 illustrates the cross-loading between the features obtained from the nine brain regions and class labels. A higher cross-loading indicates a greater likelihood that the brain region contributes to mouse classification. Because each brain region has six types of features, six cross-loadings are calculated from one brain region. Consequently, the average value of these cross-loadings is presented in blue letters in Figure 6 as the cross-loading of the target brain region. These results suggest that THL and PAG are effective for detecting presymptomatic RA. The “Delta ratio” tended to be high even when the average cross-loading was low, suggesting that features in the 1–4 Hz frequency range were more important than those in other frequency ranges.

Subsequently, by comparing the magnitudes of the cross-loadings in Figure 6, it can be confirmed that the values decrease in the order of THL, PAG, PL, AMY, S1, ACC, BR, NAc, and EMG. Therefore, we performed mouse classification by adding these brain regions sequentially in this order.

The LDA, KNN, SVM, and ELM classification results are presented in Figure 7. The term “Ranking” in Figure 7 represents the brain regions sorted in the order of cross-loading. A common finding from these results is that the highest accuracy is achieved when the top seven brain regions are used for all classifiers. In other words, brain regions other than EMG and NAc are effective.

Moreover, these findings indicate that classification performance improves when the top seven regions are included. In essence, there are correlations between cross-loadings and classification performance. Therefore, cross-loading is a highly effective method for estimating crucial brain regions without constructing classifiers.

6. Discussion

The classification results for each mouse, where the number of brain regions, R, was varied, are presented in Figure 8, Figure 9, Figure 10 and Figure 11. These figures show the mouse type on the vertical axis and the number of brain regions used to construct the classifier on the horizontal axis. The average performance at the bottom is the mean classification performance across all mice and corresponds to the values presented in Figure 7; thus, Figure 8, Figure 9, Figure 10 and Figure 11 provide a detailed breakdown of Figure 7. A generally consistent trend is observed in all figures. Specifically, the classification performance of wild-type mice tends to be lower than that of F759 mice. This discrepancy can be attributed to class imbalance, reflecting instability in learning due to the larger number of F759 mice than wild-type mice. Moreover, accuracy is notably low when $R = 1$ and $R = 2$, indicating the difficulty of achieving highly accurate classification, even when features from brain regions with high cross-loading are used. Conversely, as R increases, the overall accuracy improves, with some mice surpassing 90%.

However, even with an increase in R, no improvement in classification performance is seen for certain mice, such as wild_10 and F759_4. We attribute this phenomenon to inherent individual differences in biological information processing. It is conceivable that these differences pose challenges in accurately calculating potential correlations using sMVCCA. The data acquired from these mice likely follow a distribution different from that of the other data. In order to address this issue, potential strategies include excluding mice with different data distributions or removing noise on the basis of data characteristics. However, excluding specific specimens from experiments involving biological entities, such as mice, is not advisable because of the lack of data. Furthermore, elucidating data characteristics is a heuristic and impractical approach. Therefore, when constructing a common latent space, an effective solution is to compare the distribution of the 180 samples within a mouse with the distribution of samples across different mice and introduce a learning mechanism capable of bridging these distributional differences. By understanding differences in data characteristics, features useful for classification can be obtained, which is expected to further improve performance.

7. Conclusions

In this study, we analyzed brain activity using information obtained from mice to detect presymptomatic RA. The novelty of this method is that it attempts to identify brain regions crucial for presymptomatic RA detection by achieving high-accuracy classification between wild-type and F759 mice using a combination of multivariate analysis and machine learning. We introduced a matrix factorization-based approach for data completion to solve the problem of missing brain activity data. Furthermore, we applied sMVCCA to the complemented brain activity information and class labels, calculating cross-loadings between each brain region and class label to identify relevant brain regions. By constructing multiple classifiers using brain regions selected on the basis of cross-loadings, we successfully identified brain regions that were effective for detecting presymptomatic RA. Experiments involving 25 mice revealed the efficacy of seven brain regions, excluding NAc and EMG.
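The data completion step summarized above can be sketched as follows. This is a minimal illustration under stated assumptions rather than the implementation used in this study: the column-mean initialization, the alternating ridge updates, the function name `complete_matrix`, and the values of `rank`, `lam`, `n_outer`, and `n_als` are all choices made here for illustration. Missing entries are initialized, a regularized low-rank factorization is fitted, the missing entries are replaced with the reconstructed values, and the process is repeated.

```python
import numpy as np

def complete_matrix(X, rank=2, lam=0.1, n_outer=50, n_als=5, seed=0):
    """Fill NaN entries of X by iterated regularized matrix factorization."""
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)
    # Assumed initialization: column means of the observed entries.
    col_means = np.nanmean(X, axis=0)
    filled = np.where(missing, col_means, X)

    rng = np.random.default_rng(seed)
    U = rng.normal(scale=0.1, size=(X.shape[0], rank))
    V = rng.normal(scale=0.1, size=(X.shape[1], rank))
    ridge = lam * np.eye(rank)  # L2 regularization on both factors

    for _ in range(n_outer):
        # Alternating ridge-regression updates for filled ~ U @ V.T
        for _ in range(n_als):
            U = filled @ V @ np.linalg.inv(V.T @ V + ridge)
            V = filled.T @ U @ np.linalg.inv(U.T @ U + ridge)
        recon = U @ V.T
        # Keep observed values; overwrite only the missing entries.
        filled = np.where(missing, recon, X)
    return filled

# Synthetic rank-1 example (NOT the study's data) with two missing entries.
a = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
b = np.array([1.0, 0.5, 2.0, 1.5])
X_true = np.outer(a, b)
X_obs = X_true.copy()
X_obs[1, 2] = np.nan
X_obs[4, 0] = np.nan

completed = complete_matrix(X_obs, rank=1, lam=0.01)
print(completed[1, 2], completed[4, 0])
```

The observed entries are never modified, so any error introduced by the completion is confined to the entries that were missing in the first place.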

In order to verify the versatility of our method, it is desirable to conduct experiments using data from other diseases. Because the accumulation of amyloid-β secreted in the brain is related to the detection of Alzheimer’s disease [31,32], it is not necessary to focus on only the electrical signals of the brain, which are targeted in this study. Therefore, future research will involve investigating datasets for detecting presymptomatic diseases other than Alzheimer’s and conducting additional experiments using these datasets.

8. Ethical Approvals

All experiments were approved by the Committee on Animal Experiments at Tohoku University (approval number: 2022 PhA-004). All experiments were conducted following the NIH guidelines for the care and use of animals.

Author Contributions

Conceptualization, K.M., T.O., T.K., T.S., K.T. and M.M.; methodology, K.M., T.O. and M.H.; software, K.M.; validation, K.M.; data curation, T.K.; writing-original draft preparation, K.M., T.K. and T.S.; writing-review and editing, K.M., T.O., T.K., T.S., K.T., M.M. and M.H.; visualization, K.M.; funding acquisition, T.K. and M.H. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by AMED Moonshot (JP23zf01270004), given to M. Murakami, K. Tainaka, M. Haseyama, and T. Sasaki, and JSPS KAKENHI, Grant Number JP23K11211.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data cannot be released.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. World Health Statistics. 2022. Available online: https://www.who.int/news/item/20-05-2022-world-health-statistics-2022 (accessed on 30 September 2023).
2. Small, G.W. Early diagnosis of Alzheimer’s disease: Update on combining genetic and brain-imaging measures. Dialogues Clin. Neurosci. 2022, 2, 241–246.
3. Wachinger, C.; Salat, D.H.; Weiner, M.; Reuter, M.; Alzheimer’s Disease Neuroimaging Initiative. Whole-brain analysis reveals increased neuroanatomical asymmetries in dementia for hippocampus and amygdala. Brain 2016, 139, 3253–3266.
4. Billeci, L.; Badolato, A.; Bachi, L.; Tonacci, A. Machine learning for the classification of Alzheimer’s disease and its prodromal stage using brain diffusion tensor imaging data: A systematic review. Processes 2020, 8, 1071.
5. Conrad, A.O.; Li, W.; Lee, D.Y.; Wang, G.L.; Rodriguez-Saona, L.; Bonello, P. Machine learning-based Presymptomatic detection of Rice sheath blight using spectral profiles. Plant Phenomics 2020, 2020, 8954085.
6. Abrol, A.; Fu, Z.; Du, Y.; Calhoun, V.D. Multimodal data fusion of deep learning and dynamic functional connectivity features to predict Alzheimer’s disease progression. In Proceedings of the Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), Berlin, Germany, 23–27 July 2019; pp. 4409–4413.
7. Kondo, N.; Kuroda, T.; Kobayashi, D. Cytokine networks in the pathogenesis of rheumatoid arthritis. Int. J. Mol. Sci. 2021, 22, 10922.
8. Vaden, K.I., Jr.; Gebregziabher, M.; Kuchinsky, S.E.; Eckert, M.A. Multiple imputation of missing fMRI data in whole brain analysis. Neuroimage 2012, 60, 1843–1855.
9. Liang, Q.; Jiang, R.; Adkinson, B.D.; Rosenblatt, M.; Mehta, S.; Foster, M.L.; Dong, S.; You, C.; Negahban, S.; Zhou, H.H.; et al. Rescuing missing data in connectome-based predictive modeling. Imaging Neurosci. 2023, 2, 1–16.
10. Sole-Casals, J.; Caiafa, C.F.; Zhao, Q.; Cichocki, A. Brain-computer interface with corrupted EEG data: A tensor completion approach. Cogn. Comput. 2018, 10, 1062–1074.
11. Maeda, K.; Kushima, T.; Takahashi, S.; Ogawa, T.; Haseyama, M. Estimation of interest levels from behavior features via tensor completion including adaptive similar user selection. IEEE Access 2020, 8, 126109–126118.
12. Akmal, M.; Zubair, S.; Alquhayz, H. Classification analysis of tensor-based recovered missing EEG data. IEEE Access 2021, 9, 41745–41756.
13. Chiarion, G.; Sparacino, L.; Antonacci, Y.; Faes, L.; Mesin, L. Connectivity Analysis in EEG Data: A Tutorial Review of the State of the Art and Emerging Trends. Bioengineering 2023, 10, 372.
14. Wang, Z.; Huang, R.; Yan, Y.; Luo, Z.; Zhao, S.; Wang, B.; Jin, J.; Xie, L.; Yin, E. An Improved Canonical Correlation Analysis for EEG Inter-Band Correlation Extraction. Bioengineering 2023, 10, 1200.
15. Moroto, Y.; Maeda, K.; Ogawa, T.; Haseyama, M. Human-centric emotion estimation based on correlation maximization considering changes with time in visual attention and brain activity. IEEE Access 2020, 8, 203358–203368.
16. Horii, K.; Maeda, K.; Ogawa, T.; Haseyama, M. Human-centered image classification via a neural network considering visual and biological features. Multimed. Tools Appl. 2020, 79, 4395–4415.
17. Horii, K.; Maeda, K.; Ogawa, T.; Haseyama, M. A human-centered neural network model with discriminative locality preserving canonical correlation analysis for image classification. In Proceedings of the IEEE International Conference on Image Processing, Athens, Greece, 7–10 October 2018; pp. 2366–2370.
18. Lee, G.; Singanamalli, A.; Wang, H.; Feldman, M.D.; Master, S.R.; Shih, N.N.; Spangler, E.; Rebbeck, T.; Tomaszewski, J.E.; Madabhushi, A. Supervised multi-view canonical correlation analysis (sMVCCA): Integrating histologic and proteomic features for predicting recurrent prostate cancer. IEEE Trans. Med. Imaging 2014, 34, 284–297.
19. Maeda, K.; Togo, R.; Ogawa, T.; Adachi, S.i.; Yoshizawa, F.; Haseyama, M. Trial analysis of the relationship between taste and biological information obtained while eating strawberries for sensory evaluation. Sensors 2022, 22, 9496.
20. Ishihara, T.; Miyazaki, A.; Tanaka, H.; Matsuda, T. Association of cardiovascular risk markers and fitness with task-related neural activity during animacy perception. Med. Sci. Sport. Exerc. 2022, 54, 1738.
21. Cristi-Montero, C.; Johansen-Berg, H.; Salvan, P. Multimodal neuroimaging correlates of physical-cognitive covariation in Chilean adolescents. The Cogni-Action Project. Dev. Cogn. Neurosci. 2022, 66, 101345.
22. Atsumi, T.; Ishihara, K.; Kamimura, D.; Ikushima, H.; Ohtani, T.; Hirota, S.; Kobayashi, H.; Park, S.J.; Saeki, Y.; Kitamura, Y.; et al. A point mutation of Tyr-759 in interleukin 6 family cytokine receptor subunit gp130 causes autoimmune arthritis. J. Exp. Med. 2002, 196, 979–990.
23. Murakami, M.; Okuyama, Y.; Ogura, H.; Asano, S.; Arima, Y.; Tsuruoka, M.; Harada, M.; Kanamoto, M.; Sawa, Y.; Iwakura, Y.; et al. Local microbleeding facilitates IL-6–and IL-17–dependent arthritis in the absence of tissue antigen recognition by activated T cells. J. Exp. Med. 2011, 208, 103–114.
24. Konno, D.; Nakayama, R.; Tsunoda, M.; Funatsu, T.; Ikegaya, Y.; Sasaki, T. Collection of biochemical samples with brain-wide electrophysiological recordings from a freely moving rodent. J. Pharmacol. Sci. 2019, 139, 346–351.
25. Sasaki, T.; Nishimura, Y.; Ikegaya, Y. Simultaneous recordings of central and peripheral bioelectrical signals in a freely moving rodent. Biol. Pharm. Bull. 2017, 40, 711–715.
26. Shikano, Y.; Sasaki, T.; Ikegaya, Y. Simultaneous recordings of cortical local field potentials, electrocardiogram, electromyogram, and breathing rhythm from a freely moving rat. J. Vis. Exp. 2018, 134, e56980.
27. Nakayama, R.; Ikegaya, Y.; Sasaki, T. Cortical-wide functional correlations are associated with stress-induced cardiac dysfunctions in individual rats. Sci. Rep. 2019, 9, 10581.
28. Kuga, N.; Sasaki, T. Memory-related neurophysiological mechanisms in the hippocampus underlying stress susceptibility. Neurosci. Res. 2022.
29. Cortes, C.; Vapnik, V. Support-vector networks. Mach. Learn. 1995, 20, 273–297.
30. Huang, G.B.; Zhu, Q.Y.; Siew, C.K. Extreme learning machine: A new learning scheme of feedforward neural networks. In Proceedings of the IEEE International Joint Conference on Neural Networks, Budapest, Hungary, 25–29 July 2004; Volume 2, pp. 985–990.
31. Rother, C.; Uhlmann, R.E.; Müller, S.A.; Schelle, J.; Skodras, A.; Obermüller, U.; Häsler, L.M.; Lambert, M.; Baumann, F.; Xu, Y.; et al. Experimental evidence for temporal uncoupling of brain Aβ deposition and neurodegenerative sequelae. Nat. Commun. 2022, 13, 7333.
32. Jansen, W.J.; Janssen, O.; Tijms, B.M.; Vos, S.J.; Ossenkoppele, R.; Visser, P.J.; Aarsland, D.; Alcolea, D.; Altomare, D.; Von Arnim, C.; et al. Prevalence estimates of amyloid abnormality across the Alzheimer disease clinical spectrum. JAMA Neurol. 2022, 79, 228–243.

Figure 1. Simultaneous LFP recordings from seven brain regions of a freely moving mouse. At the bottom of (a), the electrophysiological signals, including brain LFP signals, an EMG signal, and a respiratory signal (BR), were obtained from an F759 mouse. At the top of (a), the histological confirmation of recording sites in ACC, PL, NAc, AMY, S1, THL, and PAG can be seen, as observed in the cresyl-stained sections. The dotted lines outline the contours of the brain regions, and the arrowheads indicate electrode tracks. (b) Representative LFP signals from ACC, PL, NAc, AMY, S1, THL, and PAG and EMG and respiratory signals. Quiescent periods were manually identified on the basis of nearly silent EMG signals.

Figure 2. Overview of feature extraction from mice. Data acquisition spanned 1 h per day for three consecutive days. The 1 h data were divided into 1 min intervals, yielding 180 samples from each mouse.

Figure 3. We initialize the missing data in the feature matrix X with initial values. We aim to minimize errors using known data by applying regularized matrix factorization to the obtained matrix. Subsequently, the missing data are replaced with complemented data, and this process is repeated.

Figure 4. Overview of the process for identifying brain regions from complemented features that contribute to the classification of mice. In Section 3.2, a common latent space is constructed using sMVCCA, enabling the analysis of correlations between features from multiple brain regions and class labels. Because the features in this space are designed to be highly correlated with the class labels, the brain regions associated with the class labels are estimated by calculating the cross-loadings using these features. In addition, in Section 3.3, we further identify important brain regions by constructing various classification models using features ordered by cross-loading, starting with brain regions with the highest values.

Figure 5. Relationship between each mouse and the missing brain region. “✓” indicates acquired data, and “-” indicates missing data.

Figure 6. Cross-loading between data from nine brain regions and class labels. “Delta”, “theta”, and “gamma” indicate frequency bands. The terms “amp” and “ratio” indicate the amplitude spectrum and its ratio, respectively.

Figure 7. Classification performance of LDA, KNN, SVM, and ELM models. The results were obtained from classifiers constructed using the top R-ranked brain regions.

Figure 8. Classification performance of each mouse, with changes in the number of brain regions, R, used to construct linear discriminant analysis (LDA).

Figure 9. Classification performance of each mouse, with changes in the number of brain regions, R, used to construct K-nearest neighbor (KNN).

Figure 10. Classification performance of each mouse, with changes in the number of brain regions, R, used to construct support vector machine (SVM).

Figure 11. Classification performance of each mouse, with changes in the number of brain regions, R, used to construct extreme learning machine (ELM).

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
