Spatial Smoothing Effect on Group-Level Functional Connectivity during Resting and Task-Based fMRI

Spatial smoothing is a preprocessing step applied to neuroimaging data to enhance data quality by reducing noise and artifacts. However, selecting an appropriate smoothing kernel size can be challenging as it can lead to undesired alterations in final images and functional connectivity networks. However, there is no sufficient information about the effects of the Gaussian kernel size on group-level results for different cases yet. This study investigates the influence of kernel size on functional connectivity networks and network parameters in whole-brain rs-fMRI and tb-fMRI analyses of healthy adults. The analysis includes {0, 2, 4, 6, 8, 10} mm kernels, commonly used in practical analyses, covering all major brain networks. Graph theoretical measures such as betweenness centrality, global/local efficiency, clustering coefficient, and average path length are examined for each kernel. Additionally, principal component analysis (PCA) and independent component analysis (ICA) parameters, namely kurtosis and skewness, are evaluated for the functional images. The findings demonstrate that kernel size directly affects node connections, resulting in modifications to functional network structures and PCA/ICA parameters. However, network metrics exhibit greater resilience to these changes.


Introduction
Although the human brain weighs only about 1.35 kg (~3 pounds) on average and consists of fat, water, and protein, the working mechanism is the most complex one among the known structures [1]. The developments in neuroimaging techniques in recent decades have gained a fast acceleration on the working mechanism of the human brain. One of the leading techniques is functional magnetic resonance imaging (fMRI), which captures the change in the blood oxygen levels in the brain and creates a related time series of images [2]. In general, the studies on functional imaging can be categorized into two main perspectives: (i) resting-state fMRI (rs-fMRI) and (ii) task-based (also known as task-related or taskevoked) fMRI (tb-fMRI) [3]. The rs-fMRI focuses on the intrinsic neuronal activity, which reflects the spontaneous fluctuations while the subject is at "rest". On the other hand, tb-fMRI focuses on extrinsic neuronal activity, unlike rs-fMRI. As opposed to intrinsic activity, extrinsic activity reflects the signals that arise from performing some given tasks during the scan. The task may show great variety according to the hypothesis, from motor movement to speaking or from making calculations to internally rehearsing a skill [4,5]. Until recent decades, the main aim was to map the specific brain areas with the given stimuli type and associate the task-responsible regions. This characterization, also known as localization, requires carefully designed special fMRI tasks, which are performed by the subjects in the MR scanner [6]. Nevertheless, the most current paradigms in neuroscience indicate that cognitive tasks are performed not only by individual brain regions working alone but also by networks of several separate brain regions that are functionally connected [7,8]. Thanks to the graph theoretical approaches, the brain can be modeled as a highly complex network, defined with brain regions as nodes and connections as links [9][10][11]. Functional connectivity 1.
We constructed the functional connectivity maps for both rs-fMRI and tb-fMRI in detail for each subject. For the whole brain coverage, all major brain networks (Default Mode Network (DMN), Somatomotor Network (SMN), Visual Network (VN), Salience Network (SN), Dorsal Attention Network (DAN), Frontoparietal Network (FPN), Limbic Network (LN), Cerebellar Network (CN) were included to the analysis. To the best of our knowledge, there is no such study that analyzes the functional interactions of all these networks together.

2.
Although there are several studies that investigate the smoothing effect on the functional connectivity [30] or tb-fMRI [34] in single-subject [33] or on the healthy and diseased groups [35], however, to the best of our knowledge, there are no studies that have performed rs-fMRI and tb-fMRI together in such detail. Moreover, another important issue that differs from the other studies is that the functional images in the dataset involve sequential resting and task images that belong to the same subjects acquired from a single scanner. Thus, the dataset does not include intra-scanner and subject-related artifacts, which allows for observing the changes clearly. 3.
The main graph metrics, betweenness centrality, global and local efficiency, clustering coefficient, and average path length, are also analyzed for each smoothing level both for rs-fMRI and tb-fMRI. 4.
Besides the functional connectivity analysis, the main component of independent component analysis (ICA) and principal component analysis (PCA), in terms of kurtosis and skewness, are also investigated in detail for both rs-fMRI and tb-fMRI at the group level. We could not find similar studies during the literature search.
Sensors 2023, 23, 5866 3 of 20 the group-level effects of smoothing on functional connectivity networks during rs-fMRI and tb-fMRI. Therefore, we used two functional datasets that were acquired from a single MRI scanner (see Section 2 for details) of 20 healthy volunteers. For each of the subjects in the dataset, we constructed the functional connectivity maps using the {0, 2, 4, 6, 8, 10} mm Gaussian kernels and measured the smoothing effects as illustrated in Figure 1. The contributions can be summarized as follows: 1. We constructed the functional connectivity maps for both rs-fMRI and tb-fMRI in detail for each subject. For the whole brain coverage, all major brain networks (Default Mode Network (DMN), Somatomotor Network (SMN), Visual Network (VN), Salience Network (SN), Dorsal Attention Network (DAN), Frontoparietal Network (FPN), Limbic Network (LN), Cerebellar Network (CN) were included to the analysis. To the best of our knowledge, there is no such study that analyzes the functional interactions of all these networks together. 2. Although there are several studies that investigate the smoothing effect on the functional connectivity [30] or tb-fMRI [34] in single-subject [33] or on the healthy and diseased groups [35], however, to the best of our knowledge, there are no studies that have performed rs-fMRI and tb-fMRI together in such detail. Moreover, another important issue that differs from the other studies is that the functional images in the dataset involve sequential resting and task images that belong to the same subjects acquired from a single scanner. Thus, the dataset does not include intra-scanner and subject-related artifacts, which allows for observing the changes clearly. 3. The main graph metrics, betweenness centrality, global and local efficiency, clustering coefficient, and average path length, are also analyzed for each smoothing level both for rs-fMRI and tb-fMRI. 4. Besides the functional connectivity analysis, the main component of independent component analysis (ICA) and principal component analysis (PCA), in terms of kurtosis and skewness, are also investigated in detail for both rs-fMRI and tb-fMRI at the group level. We could not find similar studies during the literature search. To observe the effects, two fMRI datasets were used: resting state (rs-fMRI) and task-based (tb-fMRI) with n = 20 subjects each. For each subject, the same preprocessing steps were followed, and smoothing was performed with fwhm = {0, 2, 4, 6, 8, 10} mm. For whole brain coverage, the major networks were chosen as ROIs for each image. Afterward, group-level performances of rs-fMRI and tb-fMRI are analyzed in two-fold, separately. First, PCA and ICA decompositions were performed for each kernel size, and the changes in two main components, kurtosis and skewness, were measured. Second, functional connectivity analyses were performed to be able to see the network changes. Finally, the graph theoretical measurements were calculated for each size of kernels.
The rest of the paper is organized as follows: Dataset, imaging procedure, and data preprocessing are presented in Section 2. The methods, voxel-based analysis, and ROIbased analysis are presented in Section 3. The comparative results and conclusion are given in Section 4 and Section 5, respectively. To observe the effects, two fMRI datasets were used: resting state (rs-fMRI) and task-based (tb-fMRI) with n = 20 subjects each. For each subject, the same preprocessing steps were followed, and smoothing was performed with fwhm = {0, 2, 4, 6, 8, 10} mm. For whole brain coverage, the major networks were chosen as ROIs for each image. Afterward, group-level performances of rs-fMRI and tb-fMRI are analyzed in two-fold, separately. First, PCA and ICA decompositions were performed for each kernel size, and the changes in two main components, kurtosis and skewness, were measured. Second, functional connectivity analyses were performed to be able to see the network changes. Finally, the graph theoretical measurements were calculated for each size of kernels.
The rest of the paper is organized as follows: Dataset, imaging procedure, and data preprocessing are presented in Section 2. The methods, voxel-based analysis, and ROIbased analysis are presented in Section 3. The comparative results and conclusion are given in Sections 4 and 5, respectively.

Data Preliminaries
rs-fMRI offers great benefits, such as requiring only the MRI scanner, not requiring the task-related programs, and other information beyond the resting, i.e., behavioral task-related responses. Through the advantages, it could be applicable to large subject populations and even to subjects who could not perform any tasks for any reason. On the other hand, tb-fMRI requires carefully designed tasks, so it is more laborious, and the number of such studies is relatively fewer than the rs-fMRI. In this study, to be able to seek the effect of FWHM kernel size differences on functional connectivity networks, a collection of datasets that consists of a sequential scan with rs-fMRI and memory fMRI task (mem-fMRI) was used. The structural and functional images were acquired by a 3T Siemens Magnetom whole-body MRI scanner at Ege University, Izmir, Turkey. The task was conducted by the Socat Lab research team. The whole research and task were approved by the subjects and the ethical committee of the university. In total, 30 volunteers without reported any diseases were involved in the scans. It was observed that they all had a normal or corrected vision at the initial examination. The exclusion criteria for selecting the subjects were having a mental illness, mental trauma in the present or in the past, having a medical disease such as tension, diabetes, etc., and having an Alzheimer's story in the family). After the quality checks of the acquired data, 23 subjects (20-25 years old, mean = 23.35 ± 1.04, f:12 m:11) were involved in further analyses, while the others were discarded due to the high motion rates and incompleteness of the task. The power of the population was measured over 80% with the power analysis, which is given in Figure 2. According to the results of the power analysis, it was observed that nearly 21 subjects are enough to achieve power over 80%, according to the Random Field Theory (the commonly used scale). Since different tasks will create activations in different areas and levels in the brain, power analysis should be performed task-specifically in every study before further analysis. One recent study analyzed the required sample size estimation on three different functional datasets with power analysis in detail [36].

Data Preliminaries
rs-fMRI offers great benefits, such as requiring only the MRI scanner, not requiring the task-related programs, and other information beyond the resting, i.e., behavioral taskrelated responses. Through the advantages, it could be applicable to large subject populations and even to subjects who could not perform any tasks for any reason. On the other hand, tb-fMRI requires carefully designed tasks, so it is more laborious, and the number of such studies is relatively fewer than the rs-fMRI. In this study, to be able to seek the effect of FWHM kernel size differences on functional connectivity networks, a collection of datasets that consists of a sequential scan with rs-fMRI and memory fMRI task (mem-fMRI) was used. The structural and functional images were acquired by a 3T Siemens Magnetom whole-body MRI scanner at Ege University, Izmir, Turkey. The task was conducted by the Socat Lab research team. The whole research and task were approved by the subjects and the ethical committee of the university. In total, 30 volunteers without reported any diseases were involved in the scans. It was observed that they all had a normal or corrected vision at the initial examination. The exclusion criteria for selecting the subjects were having a mental illness, mental trauma in the present or in the past, having a medical disease such as tension, diabetes, etc., and having an Alzheimer's story in the family). After the quality checks of the acquired data, 23 subjects (20-25 years old, mean = 23.35 ± 1.04, f:12 m:11) were involved in further analyses, while the others were discarded due to the high motion rates and incompleteness of the task. The power of the population was measured over 80% with the power analysis, which is given in Figure 2. According to the results of the power analysis, it was observed that nearly 21 subjects are enough to achieve power over 80%, according to the Random Field Theory (the commonly used scale). Since different tasks will create activations in different areas and levels in the brain, power analysis should be performed task-specifically in every study before further analysis. One recent study analyzed the required sample size estimation on three different functional datasets with power analysis in detail [36].  The parameters of the whole brain T2*-weighted EPI sequence are as follows: Repetition time (TR) = 3 s, echo time (TE) = 30 ms, voxel size 3 × 3 × 3 mm, matrix size = 64 × 64 with 37 slices, FOV = 192 × 192 mm, flip angle (FA) = 90 • . The parameters of the structural data were also acquired with 3D GR\IR T1-weighted images with matrix size 512 × 512 × 160, TR = 1600 ms, TE = 2.21 ms, slice thickness = 1 mm, voxel size 1 × 1 × 1 mm.

fMRI Task
The mem-fMRI is a block design fMRI task that consists of sequential rest and task phases. The fMRI procedure begins with the resting-state scan, in which the subjects are asked to rest with closed eyes for 300 s. Then, the task block starts, and during the task phase, a set of neutral faces and their names are shown on the screen for 6 s and a onesecond black screen comes after. During the screens of 6 s, subjects are asked to memorize the name and face pairs. In total, 24 different face and name pairs are shown to the subjects in the encoding block in 288 s. Each period starts with a baseline screen (30 s) in task blocks. The face dataset used in the memory task is a subset of the FACES dataset, which only includes neutral face expressions [37].
It can be said that the main difference between this dataset and similar ones is conducted with a single MRI scanner in a single session for whole participants. Thus, both the scanner-related artifacts, such as calibration, noise, slice gap, etc., and user-related, such as MR technician, differences are eliminated. Since the scan is acquired in one session, the conditions of each participant are guaranteed, such as being in the same physiological conditions or attention. These aspects allow for the investigation of the functional data without any external effects.

Data Preprocessing
The preprocessing of rs-fMRI and tb-fMRI data was performed by using CONN (version 21.a) functional connectivity toolbox [38]. During the preprocessing, common steps were followed as a default preprocessing pipeline for each subject. All functional images were realigned and corrected for head motion (as stated in Section II-A, 3 subjects were discarded at this step). Afterward, all slices were aligned at the slice timing step so that they were synchronized temporally. The pipeline then continued with the co-registration step, in which the anatomical scans of the subjects were registered to the mean of the images. The segmentation step came after, in which the brain was separated into surrounding tissues, and the white matter and CSF were extracted. During the segmentation, the anatomical images were also registered to the MNI-152 (Montreal Neurological Institute) standard space T1-weighted average structural template for further group analysis. Finally, the spatial smoothing step was applied for all rs-fMRI and tb-fMRI images with isotropic Gaussian kernel at the different sizes: FWHM = {0 (as reference for no smoothing), 2, 4, 6, 8, 10} mm, which are the most frequently used in practice. The smoothing was consistently applied before calculating the functional connectivity networks, ICA, and PCA components extraction with each and every parameter set. Additionally, for each FWHM value, quality checks of the processed data were performed.

Smoothing
The primary aim of smoothing is to reduce the noise, which increases the SNR value in the BOLD signal and minimizes the errors in group-level analysis. Commonly, it is applied as the last step of the preprocessing pipeline. As we stated above, it has several good benefits as well as detrimental effects. A typical spatial smoothing is evaluated by a Gaussian kernel, and the smoothing step results in each voxel being transformed into the weighted average of its nearby voxels. The averaging process is performed by applying a smoothing filter, which represents the smoothing matrix. The width of the kernel is determined by full-width-half-maximum (FWHM) [25]. The relationship between the FWHM and the standard deviation of the Gaussian kernel can be shown as follows: The width of the kernel is w, and it is symmetric between [−0.5 w and 0.5 w]. Due to the relation between the FWHM and standard deviation, (1) can be rewritten as follows: Regarding (2), for a voxel i, the smoothing operation with its nearby voxels can be defined as follows: Here, x i represents the time series of the i th voxel, and w i (k) denotes the value of the k th voxel on which the kernel is centered at the i th voxel. It is obvious that as the kernel size becomes wider, the smoothing filter contains more neighboring voxels inside and vice versa. There are several common assumptions and recommendations that the FWHM kernel size should be 1.5-2 times, or three times the voxel size (i.e., applying FWHM = 3-6 mm smoothing for a voxel size of 2 × 2 × 2 mm) [7,39]. The other concept is not to choose a kernel size wider than the ROI. For example, if the ROI is as small as the amygdala, the FWHM size should be set smaller than the size of it, while if the ROI is a subcortical motor region, the FWHM size may be set to 8-10 mm. Despite the second recommendation being more common, there is no unique consensus on it.
We applied five kernels in total at sizes {2, 4, 6, 8, 10} mm and 0 mm kernels (as the reference for non-smoothing). These kernel sizes are in the commonly recommended range in the literature; however, it is not yet obviously clear how these Gaussian sizes affect the functional connectivity networks and their related parameters.

Functional Connectivity Network Extraction
Thanks to the graph theory, the brain can be modeled as a network regarding its connectivity properties. The term connectivity can refer the regional connectivity, also known as anatomical or structural connectivity, which defines the short-distance connections between nearby voxels, or effective connectivity, which defines the causal effects [13]. On the other hand, it is also possible to define functional connectivity, which refers to the indirect connections between discrete or spatially distant brain regions. Functional connectivity measures the correlation or, in other terms, dependencies of the BOLD signals among the different areas. The structural and effective connectivity is out of scope in this manuscript.
A functional network is interpreted with nodes and edges in graph modeling, where nodes are the disjoint brain regions, and edges are the connections or connection strengths between the nodes. Formally, the network can be described as a graph G(N,E), with N representing the set of nodes and E being the edges in the graph. Once graph G is defined, the connectivity pattern can be designated by a symmetric adjacency matrix sized N × N, and the connectivity matrix can be acquired. As illustrated in Figure 3, the common connectivity analysis consists of the following steps: After the preprocessing, i.e., the initial step for all functional data, the node definition (or parcellation) or seed selection should be made. For this step, either the atlases, such as automated anatomical labeling (AAL), Harvard-Oxford atlas, or manually defined subject-specific or task-specific ROIs can be used. In this way, voxels are grouped into areas that are considered as functionally homogeneous. Afterward, the BOLD signals, which are the time series of the voxels, are extracted for each node within each region. The third step is calculating the correlations for extracted time series. The correlation coefficients (r) are computed for each subject separately and converted into Fisher's z-transformation. r-to-z transformation can be defined as follows [40]:  Since the correlation is not normally distributed, Fisher transformation allows for calculating the in a confidence interval. Therefore, it is strongly recommended to make the values in the connectivity matrix approximately normally distributed [40]. Finally, the last step is building the symmetric connectivity matrix with the values , where , (namely denotes the correlation value between the voxel and . Once the matrix has been acquired, a threshold value can be applied for the graph representation.
In this study, the nodes are defined as the set of major functional brain networks with their high-level cognitive areas [41] for whole brain coverage. These networks dominate cortical activity in the brain to the greatest extent, and they all perform related but distinct tasks. These major networks regulate brain activity both while the brain is at rest and during processing tasks. The neural networks of the brain operate in a hierarchical manner, integrating and synchronizing to carry out complicated tasks.
The A sample connectivity map of all nodes is given in Figure 4, where the red lines indicate the strong functional connections, and the blue lines denote the weaker connections. Here, it can be clearly seen that even though the nodes are located in distributed regions, they are still connected functionally. Since the correlation is not normally distributed, Fisher transformation allows for calculating the (r) in a confidence interval. Therefore, it is strongly recommended to make the values in the connectivity matrix approximately normally distributed [40]. Finally, the last step is building the N × N symmetric connectivity matrix with the values r XY , where r, (namely (z r )) denotes the correlation value between the voxel X and Y. Once the matrix has been acquired, a threshold value can be applied for the graph representation.
In this study, the nodes are defined as the set of major functional brain networks with their high-level cognitive areas [41] for whole brain coverage. These networks dominate cortical activity in the brain to the greatest extent, and they all perform related but distinct tasks. These major networks regulate brain activity both while the brain is at rest and during processing tasks. The neural networks of the brain operate in a hierarchical manner, integrating and synchronizing to carry out complicated tasks.
The A sample connectivity map of all nodes is given in Figure 4, where the red lines indicate the strong functional connections, and the blue lines denote the weaker connections. Here, it can be clearly seen that even though the nodes are located in distributed regions, they are still connected functionally.

Voxel-Based Analysis
The voxel-wised analyses are helpful when we explore the connectivity effec whole brain without the restriction of any a priori regions or selected ROIs. Ho since correlation values for each and every combination of the voxels are calcula cost of this analysis is high. However, it can be reduced with the application of co well-known decomposition methodologies so that how the underlying compone related to each other and how these components are estimated from the four-dime data. In this context, two best-known methodologies, principal component analysi and independent component analysis (ICA), are investigated under the different s ing kernels separately. They are both data-driven techniques used in fMRI data a for identifying patterns of brain activity. Here, it should be noted that PCA or ICA better than the other; they differ in their assumptions, objectives, and results. Each m has its own benefits, which are detailed below, and the choice of the decompositio odology can change the research hypothesis.

Principal Component Analysis (PCA)
One of the major challenges of fMRI data lies in being big data. Among t 100.000 voxels, some of them may contain correlated or uncorrelated information. other hand, this becomes more problematic in terms of computational cost, mem strictions, etc., when the datasets consist of large subject groups [30]. A common an erful approach is applying principal component analysis (PCA) to compress th mation, i.e., reduce the dimensionality. PCA is a linear method that seeks to iden directions of maximal variance in a dataset. When we apply the PCA to fMRI datas extracted components may contain Eigen images or spatial patterns whose variati defined by their eigenvectors over time. The eigenvalue defines the variance expla each component, and as the eigenvalue increases, it means that a larger variance c explained. Since PCA is applied iteratively, it generates orthonormal componen principal components are linear combinations of the original fMRI time series, an can be interpreted as spatial maps of brain activity that represent different sources ral activity in the brain. However, PCA is a completely data-driven method, so its can also be directly affected by the smoothing level of the data. To explore the smo effect on PCA, first, we performed the group-level dimensionality reduction bef

Voxel-Based Analysis
The voxel-wised analyses are helpful when we explore the connectivity effect on the whole brain without the restriction of any a priori regions or selected ROIs. However, since correlation values for each and every combination of the voxels are calculated, the cost of this analysis is high. However, it can be reduced with the application of common, well-known decomposition methodologies so that how the underlying components are related to each other and how these components are estimated from the four-dimensional data. In this context, two best-known methodologies, principal component analysis (PCA) and independent component analysis (ICA), are investigated under the different smoothing kernels separately. They are both data-driven techniques used in fMRI data analysis for identifying patterns of brain activity. Here, it should be noted that PCA or ICA is not better than the other; they differ in their assumptions, objectives, and results. Each method has its own benefits, which are detailed below, and the choice of the decomposition methodology can change the research hypothesis.

Principal Component Analysis (PCA)
One of the major challenges of fMRI data lies in being big data. Among the over 100.000 voxels, some of them may contain correlated or uncorrelated information. On the other hand, this becomes more problematic in terms of computational cost, memory restrictions, etc., when the datasets consist of large subject groups [30]. A common and powerful approach is applying principal component analysis (PCA) to compress the information, i.e., reduce the dimensionality. PCA is a linear method that seeks to identify the directions of maximal variance in a dataset. When we apply the PCA to fMRI datasets, the extracted components may contain Eigen images or spatial patterns whose variations are defined by their eigenvectors over time. The eigenvalue defines the variance explained by each component, and as the eigenvalue increases, it means that a larger variance could be explained. Since PCA is applied iteratively, it generates orthonormal components. The principal components are linear combinations of the original fMRI time series, and they can be interpreted as spatial maps of brain activity that represent different sources of neural activity in the brain. However, PCA is a completely data-driven method, so its success can also be directly affected by the smoothing level of the data. To explore the smoothing effect on PCA, first, we performed the group-level dimensionality reduction before the decomposition and then performed the other steps, such as back projection, for characterizing the voxel-to-voxel connectivity matrix. Nevertheless, this can be grueling and difficult to be able to decide which components should constitute the main component set, which can represent the original data when the others are discarded. Moreover, the extracted first components could be reflecting the noise due to the orthogonality, and these components also affect the following extracted components. At this point, ICA may provide a better approach in such circumstances due to its components should be independent as possible [42]. Here, "independent" refers to minimum overlaps between the components; thus, it can be applied effectively.

Independent Component Analysis (ICA)
ICA, on the other hand, is a statistical method that seeks to identify statistically independent sources of brain activity. ICA assumes that the acquired BOLD signal consists of a mixture of various components that cannot be visually recognized but can be distinguished. These components are statistically independent and represent a different spatial pattern of brain activity that is not necessarily related to the spatial maps identified by PCA. Unlike the orthogonal vectors founded by PCA, ICA uses high-order statistics to maximize independence. A common ICA model is formulized by X = A.s, where X is the mixture (i.e., the data) and denoted by X = (x 1 , x 2 , . . . , x m ), s is the N-dimensional independent source vector which is shown by s = [s 1 , s 2 , . . . , s N ], and A MxN is the matrix of the mixing coefficients. Here, ICA aims to achieve the un-mixing matrixŴ MxN , so that the sources s could be estimated with y = W.X.
ICA is a data-driven and blind decomposition method that identifies the components based on the principle of non-Gaussianity. Thus, it is possible to identify the set of time courses and independent spatial maps that explain the data by finding the components that are maximally non-Gaussians [7]. In this way, it can also identify highly functionally connected networks. Similarly, it can be used to extract the functional connectivity matrix, such as cognitive, motor, or sensory networks, for a given stimulus. Another problem that ICA is used is to distinguish task-related and non-task-related components from the BOLD signals. The non-task-related components caused by the physiological behaviors, such as head motion or cardiac movements, can contribute to the activation signals as noise and creates artifacts. They may result in characteristic changes in the BOLD signal, so they should be removed or suppressed during the smoothing step in preprocessing. Here, it should be noted that the selection of Gaussian kernel size for smoothing constitutes a great variety for further ICA analysis because, in the ICA model, the artifacts are not modeled implicitly but are shown as separate components.
In summary, PCA is used to reduce the dimensionality of the fMRI data and identify the sources of neural activity that explain the majority of the variance in the data, while ICA is used to identify independent sources of neural activity that may not be captured by PCA. Both methods can provide useful insights into the underlying patterns of brain activity in fMRI data and can be used in combination with other data analysis techniques.

ROI-Based Analysis
Once ROIs, i.e., nodes, are defined, functional connectivity analysis of the whole set of nodes can be performed for each possible connection between all pairs of ROIs. The analyses give a statistical map that is estimated using a GLM for pairwise associations, characterizing the effect of interest in connectivity matrices. The F-statistics, which are acquired after the analysis, can tell whether there is any effect between all possible ROI pairs sensitively. Here, both with-in-network and between-network metrics, such as global efficiency, local efficiency, betweenness centrality, average path length, etc., can be calculated. Here, we focused on five main network metrics, which are global efficiency, local efficiency, betweenness centrality, average path length, and clustering coefficient. As illustrated in Figure 5, these metrics can be summarized as follows: Efficiency is a measure of information propagation and can be calculated both as global and local [43]. The global efficiency is the distant information transmission of all networks and is defined as the average of the inverse shortest path length between all node pairs: where G, K, and E denote the network, nodes, and edges, respectively. d ij indicates the shortest path length between the nodes i and j. Here, please note that the path length between the nodes, i.e., the length of an edge, is interpreted as the correlation coefficient where a high correlation indicates a short length and vice versa. The local efficiency, on the other hand, is defined as the capability of information change in each subgraph and calculated as: Here, the subgraph contains all neighboring nodes, which are directly connected to node i.
is thought of as the measure of the functional decomposition in such complex networks [44]. Betweenness centrality metrics would indicate a high score if it is located in many shortest paths; therefore, it is a measure of which nodes act as a bridge in the graph. Betweenness centrality is expressed as follows, where denotes the total shortest path number from node i to node k and is the number of these paths passed through node : The clustering coefficient is a measure of which nodes are in connection to form a cluster together, i.e., it denotes the number of triangles in a graph. High values indicate a dense network structure. Average path length is defined as the average number of steps along the shortest paths for all possible pairs of network nodes, which are formulated as: 1 . 1 The low and high values are the two parameters of a small-world structured graph [45].  Here, the subgraph contains all neighboring nodes, which are directly connected to node i. E loc is thought of as the measure of the functional decomposition in such complex networks [44]. Betweenness centrality metrics would indicate a high score if it is located in many shortest paths; therefore, it is a measure of which nodes act as a bridge in the graph. Betweenness centrality is expressed as follows, where σ ik denotes the total shortest path number from node i to node k and σ ik (j) is the number of these paths passed through node j : The clustering coefficient CC(G) is a measure of which nodes are in connection to form a cluster together, i.e., it denotes the number of triangles in a graph. High CC(G) values indicate a dense network structure. Average path length l(G) is defined as the average number of steps along the shortest paths for all possible pairs of network nodes, which are formulated as: The low l(g) and high CC(G) values are the two parameters of a small-world structured graph [45].

Results for Voxel-Based Analysis: Effects on PCA and ICA Components
We analyzed two main parameters of PCA and ICA: kurtosis and skewness. Kurtosis indicates the degree of the outliers in the given signal or distribution. High kurtosis refers to having extreme tails, which means outliers, and low kurtosis refers to the light tails or absence of outliers. Skewness, on the other hand, is an indicator of symmetric distribution. A high skewness value indicates an asymmetrical distribution. If the data are skewed from either left (negative skew) or right (positive skew), a transform should be applied further until a normal distribution is reached because the tail could behave as an outlier in any case.
In the context of fMRI data analysis, kurtosis and skewness can be used to evaluate the distribution of the principal components obtained from PCA and independent components obtained from ICA.
In this study, PCA and ICA decompositions were performed with the CONN functional connectivity toolbox, which works on Matlab R2019b [38]. Since the selection/ determination of the factor number for PCA and ICA decomposition is beyond the scope of the article, the recommended number of suitable components (n = 40) was automatically extracted.
In Figure 6, the change in the kurtosis and skewness values of PCA are presented for both rs-fMRI and tb-fMRI at various FWHMs at sizes {0, 2, 4, 6, 8, 10} mm. In order to improve the clarity of the graphics, only minimum, average, and maximum values are given with their standard deviations. The pink, green, and blue regions show the fwhm = 0 mm (no smoothing), 6 mm, and 10 mm, respectively. Figure 6a shows the kurtosis and skewness parameters of PCA for the resting state, and Figure 6b shows the encoding state. Although the mean of kurtosis at rs-fMRI and tb-fMRI are nearly the same for fwhm values, the wider kernel size causes more outliers in rs-fMRI. A similar result can also be observed for the skewness parameter. The asymmetry of PCA components increases as the kernel size increases. Since the PCA does not constrain the distribution of the principal components, therefore they can be either symmetric or skewed according to the data characteristic.

Results for Voxel-Based Analysis: Effects on PCA and ICA Components
We analyzed two main parameters of PCA and ICA: kurtosis and skewness. Kurtosis indicates the degree of the outliers in the given signal or distribution. High kurtosis refers to having extreme tails, which means outliers, and low kurtosis refers to the light tails or absence of outliers. Skewness, on the other hand, is an indicator of symmetric distribution. A high skewness value indicates an asymmetrical distribution. If the data are skewed from either left (negative skew) or right (positive skew), a transform should be applied further until a normal distribution is reached because the tail could behave as an outlier in any case.
In the context of fMRI data analysis, kurtosis and skewness can be used to evaluate the distribution of the principal components obtained from PCA and independent components obtained from ICA.
In this study, PCA and ICA decompositions were performed with the CONN functional connectivity toolbox, which works on Matlab R2019b [38]. Since the selection/determination of the factor number for PCA and ICA decomposition is beyond the scope of the article, the recommended number of suitable components (n = 40) was automatically extracted.
In Figure 6, the change in the kurtosis and skewness values of PCA are presented for both rs-fMRI and tb-fMRI at various FWHMs at sizes {0, 2, 4, 6, 8, 10} mm. In order to improve the clarity of the graphics, only minimum, average, and maximum values are given with their standard deviations. The pink, green, and blue regions show the fwhm = 0 mm (no smoothing), 6 mm, and 10 mm, respectively. Figure 6a shows the kurtosis and skewness parameters of PCA for the resting state, and Figure 6b shows the encoding state. Although the mean of kurtosis at rs-fMRI and tb-fMRI are nearly the same for fwhm values, the wider kernel size causes more outliers in rs-fMRI. A similar result can also be observed for the skewness parameter. The asymmetry of PCA components increases as the kernel size increases. Since the PCA does not constrain the distribution of the principal components, therefore they can be either symmetric or skewed according to the data characteristic.  On the other hand, ICA assumes that the sources of neural activity are statistically independent and non-Gaussian. Therefore, the independent components obtained from ICA are often more likely to have non-Gaussian distributions than the principal components obtained from PCA. In general, independent components tend to have higher kurtosis and skewness, reflecting their non-Gaussian nature.
In parallel, the ICA results on the change in the kurtosis and skewness values for rs-fMRI and tb-fMRI are given in Figure 7. As the kernel size increases, the observed outlier behavior increases in ICA components. The mean of the kurtosis also tends to increase regularly for rs-FMRI and tb-fMRI. A similar observation can be performed for the skewness. If we skewed data with a wide kernel, it would diverge to the normal distribution, which also means an increased asymmetry. Here it should be noted that even though the non-smoothing gives the minimum results for kurtosis and skewness, it is not recommended to use PCA or ICA decomposition on non-smoothed data since the functional images are too noisy. As another analysis, the results of the p-value changes across all kernel sizes are given in Figure 8. It can be said that this recommendation is affirmed according to the given subfigures. The most significant p-value changes stand out in the 0 mm kernel size (i.e., non-smoothed data) when compared with the other kernel sizes on the top row (or first column). It confirms that the consecutive smaller kernel size changes may not reflect significant changes in these parameters, for example, changing the kernel size from 4 mm to 6 mm. However, the main difference can be observed better when it is changed, for example, from 0 mm to 6 mm. On the other hand, ICA assumes that the sources of neural activity are statistically independent and non-Gaussian. Therefore, the independent components obtained from ICA are often more likely to have non-Gaussian distributions than the principal components obtained from PCA. In general, independent components tend to have higher kurtosis and skewness, reflecting their non-Gaussian nature.
In parallel, the ICA results on the change in the kurtosis and skewness values for rs-fMRI and tb-fMRI are given in Figure 7. As the kernel size increases, the observed outlier behavior increases in ICA components. The mean of the kurtosis also tends to increase regularly for rs-FMRI and tb-fMRI. A similar observation can be performed for the skewness. If we skewed data with a wide kernel, it would diverge to the normal distribution, which also means an increased asymmetry. Here it should be noted that even though the non-smoothing gives the minimum results for kurtosis and skewness, it is not recommended to use PCA or ICA decomposition on non-smoothed data since the functional images are too noisy. As another analysis, the results of the p-value changes across all kernel sizes are given in Figure 8. It can be said that this recommendation is affirmed according to the given subfigures. The most significant p-value changes stand out in the 0 mm kernel size (i.e., non-smoothed data) when compared with the other kernel sizes on the top row (or first column). It confirms that the consecutive smaller kernel size changes may not reflect significant changes in these parameters, for example, changing the kernel size from 4 mm to 6 mm. However, the main difference can be observed better when it is changed, for example, from 0 mm to 6 mm. On the other hand, ICA assumes that the sources of neural activity are statistically independent and non-Gaussian. Therefore, the independent components obtained from ICA are often more likely to have non-Gaussian distributions than the principal components obtained from PCA. In general, independent components tend to have higher kurtosis and skewness, reflecting their non-Gaussian nature.
In parallel, the ICA results on the change in the kurtosis and skewness values for rs-fMRI and tb-fMRI are given in Figure 7. As the kernel size increases, the observed outlier behavior increases in ICA components. The mean of the kurtosis also tends to increase regularly for rs-FMRI and tb-fMRI. A similar observation can be performed for the skewness. If we skewed data with a wide kernel, it would diverge to the normal distribution, which also means an increased asymmetry. Here it should be noted that even though the non-smoothing gives the minimum results for kurtosis and skewness, it is not recommended to use PCA or ICA decomposition on non-smoothed data since the functional images are too noisy. As another analysis, the results of the p-value changes across all kernel sizes are given in Figure 8. It can be said that this recommendation is affirmed according to the given subfigures. The most significant p-value changes stand out in the 0 mm kernel size (i.e., non-smoothed data) when compared with the other kernel sizes on the top row (or first column). It confirms that the consecutive smaller kernel size changes may not reflect significant changes in these parameters, for example, changing the kernel size from 4 mm to 6 mm. However, the main difference can be observed better when it is changed, for example, from 0 mm to 6 mm. (e) (f) (g) (h) It is worth noting that the interpretation of kurtosis and skewness in fMRI data analysis should be performed with caution, as these measures can be influenced by various factors, including the number of components, the specific data preprocessing steps, and the nature of the experimental design. Additionally, the choice of kernel size may also depend on the specific research question being addressed. For example, if the goal is to identify small, localized areas of activation, a smaller kernel size may be preferable. Conversely, if the goal is to identify larger, more diffuse areas of activation, a larger kernel size may be more appropriate.
The choice of kernel size can also depend on the anatomical features of the brain being studied. For example, a smaller kernel size may be used to smooth data from regions with a high degree of anatomical detail, such as the cortex, while a larger kernel size may be used to smooth data from regions with less detail. Therefore, it is important to carefully evaluate the distributional properties of the components obtained from PCA and ICA and to use additional methods to validate the results obtained from these techniques.

Results for ROI-Based Analysis: Effects on Functional Connectivity Networks
Graph theory analyses on functional connectivity networks are also performed with the CONN toolbox. For whole brain network analysis, ROIs (i.e., nodes) are defined as major brain networks and their high-level cognitive areas [41]. In total, 32 ROIs are determined for each participant both for rs-fMRI and tb-fMRI analysis (See Appendix A for the complete list of the ROIs). All correlation values were calculated, which constitute the weights (i.e., edges) of the functional connectivity matrices. Then, the Fisher transform was applied to make the values in the connectivity matrix approximately normally distributed. Four hundred and ninety-six significant connections were determined using an FDR-corrected threshold (p < 0.05). This procedure results in weighted matrices of size 32 32. Then, according to the presence of a connection, the matrices are converted into the binary adjacency matrix.
First, the smoothing effect with different FWHM on rs-fMRI and tb-fMRI network structure is exhibited in Figure 9 (please see the Supplementary Materials for the connectome rings of all kernel sizes). Here, the circle denotes the selected nodes, and the links denote the network-based statistical results illustrated with their t-values. While red links show the strong positive functional relationships between the nodes, the blue links show the negative ones. The number and strength of the links differ as the kernel size changes, whereas the significant links tend to decrease as the kernel size increases. On the other hand, it can be clearly seen that rs-fMRI and tb-fMRI show great varieties among the connections in the graphics of minimum, average, and maximum kernel sizes, respectively. Later on, to explore the various smoothing parameter effects on topological properties, the above-mentioned metrics (global efficiency ( ), local efficiency ( ), betweenness It is worth noting that the interpretation of kurtosis and skewness in fMRI data analysis should be performed with caution, as these measures can be influenced by various factors, including the number of components, the specific data preprocessing steps, and the nature of the experimental design. Additionally, the choice of kernel size may also depend on the specific research question being addressed. For example, if the goal is to identify small, localized areas of activation, a smaller kernel size may be preferable. Conversely, if the goal is to identify larger, more diffuse areas of activation, a larger kernel size may be more appropriate.
The choice of kernel size can also depend on the anatomical features of the brain being studied. For example, a smaller kernel size may be used to smooth data from regions with a high degree of anatomical detail, such as the cortex, while a larger kernel size may be used to smooth data from regions with less detail. Therefore, it is important to carefully evaluate the distributional properties of the components obtained from PCA and ICA and to use additional methods to validate the results obtained from these techniques.

Results for ROI-Based Analysis: Effects on Functional Connectivity Networks
Graph theory analyses on functional connectivity networks are also performed with the CONN toolbox. For whole brain network analysis, ROIs (i.e., nodes) are defined as major brain networks and their high-level cognitive areas [41]. In total, 32 ROIs are determined for each participant both for rs-fMRI and tb-fMRI analysis (See Appendix A for the complete list of the ROIs). All correlation values were calculated, which constitute the weights (i.e., edges) of the functional connectivity matrices. Then, the Fisher transform was applied to make the values in the connectivity matrix approximately normally distributed. Four hundred and ninety-six significant connections were determined using an FDRcorrected threshold (p < 0.05). This procedure results in weighted matrices of size 32 × 32. Then, according to the presence of a connection, the matrices are converted into the binary adjacency matrix.
First, the smoothing effect with different FWHM on rs-fMRI and tb-fMRI network structure is exhibited in Figure 9 (please see the Supplementary Materials for the connectome rings of all kernel sizes). Here, the circle denotes the selected nodes, and the links denote the network-based statistical results illustrated with their t-values. While red links show the strong positive functional relationships between the nodes, the blue links show the negative ones. The number and strength of the links differ as the kernel size changes, whereas the significant links tend to decrease as the kernel size increases. On the other hand, it can be clearly seen that rs-fMRI and tb-fMRI show great varieties among the connections in the graphics of minimum, average, and maximum kernel sizes, respectively. Later on, to explore the various smoothing parameter effects on topological properties, the above-mentioned metrics (global efficiency (E glob ), local efficiency (E loc ), betweenness centrality (BC), average path length (l avg ), and clustering coefficient (CC)) are computed on the binary adjacency matrix for both rs-fMRI and tb-fMRI. Group-level results are presented in Table 1.
Sensors 2023, 23,5866 14 of 20 centrality , average path length ( ), and clustering coefficient ) are computed on the binary adjacency matrix for both rs-fMRI and tb-fMRI. Group-level results are presented in Table 1.  Contrary to the functional connectivity results, it can be said that there are no significant statistical differences in the graph metrics at the different sizes of the smoothing kernels. On the other hand, the value of these metrics differs remarkably in rs-fMRI and tb-fMRI. While the values of , , and increased during the task state, the values of and metrics are decreased when compared to the resting state. However, in any  Contrary to the functional connectivity results, it can be said that there are no significant statistical differences in the graph metrics at the different sizes of the smoothing kernels. On the other hand, the value of these metrics differs remarkably in rs-fMRI and tb-fMRI. While the values of E glob , E loc , and CC increased during the task state, the values of BC and l avg metrics are decreased when compared to the resting state. However, in any case, the mean of the results shows that the best FWHMs are either 6 mm or 8 mm, which are commonly used Gaussian kernel sizes.

Conclusions and Future Work
The network properties of functionally connected brain networks may affect the various processing steps. Although gold-standard preprocessing has not been identified yet, definite consecutive steps are common. However, one of the fundamental steps, smoothing, is a data-driven process and should be decided specifically for the functional data.
In this study, we explored the topological functional network and parametric affections on different smoothing values at the preprocessing step of functional neuronal imaging data. In order to investigate the changes, functional images were smoothed at kernel sizes FWHM = {0, 2, 4, 6, 8, 10} mm. Both resting state and task state conditions are analyzed in the aspect of functional connectivity networks. For whole brain coverage, the major brain networks and their high cognitive areas were included in the analysis. First, we analyzed PCA and ICA decompositions with their main components: kurtosis and skewness changes. Then, functional connectivity networks of rs-fMRI and tb-fMRI were analyzed as functional connectome changes and the graph parameters: global efficiency, local efficiency, betweenness centrality, clustering coefficient, and average path length. Comparative results for PCA and ICA show that as the kernel size becomes wider, the observed outlier behavior increases in terms of kurtosis. The asymmetry, i.e., skewness, increases with the wider kernel size for all functional data. Nevertheless, rs-fMRI and tb-fMRI values show differences in functional connectivity relationships. The number and strength of the links differ as the kernel size changes, where the significant links tend to decrease as the kernel size increases both for rs-fMRI and tb-fMRI. However, it can be said that the graph theoretical metrics are much more stable and robust and do not show significant differences with kernel size. Overall, it can be noted that when the number of connections, strength, and graph theoretical metrics are considered together, the suggested kernel size is the fwhm with 6 or 8 mm.
Here, one of the points worth mentioning could be the sample size. As is known, performing fMRI studies is challenging in many ways, including technical difficulties, data complexity, participant recruitment, ethics and safety, cost, data interpretation, etc. Among these, one of the main limitations is the high cost of an fMRI scan, which can limit the number of participants. In this context, the sample size of the study should be determined optimally. It should be statistically significant and large enough to generalize to a population but not so large as to be prohibitively expensive or time-consuming. Therefore, the power analysis method is used for the determination of the sample size. In this study, the power of the population was measured over 80%. Since the results of the analyses reflect the group averages and the power of the population is significant enough, it can be foreseen that the results of the wider sample size will approximate the proposed results. In general, it has been reported that the average sample size of most neuroimaging studies is below 25. Moreover, although the average increases to~0.74 participants per year, it is still under 15 subjects in the most cited studies [46]. Table 2 provides a comparative summary of related studies. The challenges in recruiting fMRI data and data loss due to several reasons, such as high motion, generally results in relatively few valid sample size, which is under 25. Here, it is important to note that the appropriate sample size may vary depending on the research question, the type of study, and other factors.

Publication #Subjects Specification Description
Somatotopy of cervical dystonia in motor-cerebellar networks: Evidence from resting state fMRI [47] 18 Resting 11 min resting state, in which the gaze monitored with eye tracking Gender differences in brain regional homogeneity of healthy subjects after normal sleep and after sleep deprivation: A resting-state fMRI study [48] 16 In summary, the selection of smoothing kernel size may still depend on the hypothesis of the research, spatial resolution requirements, and anatomical features of the data being analyzed. However, the results show that smoothing can yield more consistent analysis results, especially in functional network connectivity. Thus, it should also be kept in mind that smoothing is data-dependent and may cause significant alterations in group-level analysis when it is not applied appropriately.
In future work, the region-specific kernel size estimation problem can be considered. With machine learning algorithms trained according to the selected or marked regions in the brain, the suggestion of the most appropriate kernel size would be ensured. This future study, which requires rigorous and tailored analyses, would also be promising in practical clinical use.

Supplementary Materials:
The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/s23135866/s1, Figure S1: The Detailed Graphics of Figure 9. Functional connectivity networks as connectome ring for all fwhm kernel sizes.
Funding: This research received no external funding.

Institutional Review Board Statement:
The study was conducted in accordance with the Ege University Faculty of Medicine and all protocols approved by the Ethics Committee of Ege University.
Informed Consent Statement: Informed consent was obtained from all subjects involved in the study.
Data Availability Statement: The dataset will be available upon request on reasonable request. Acknowledgments: All functional data are provided by the SoCAT research lab at Ege University Medical Faculty, Izmir, Turkey. The author also would like to Ali Saffet Gonul for his permanent encouragement.

Conflicts of Interest:
The author declares no conflict of interest.

Appendix A
The major brain networks and their high cognitive areas that are used as ROIs in the paper are given in detail as follows: