Sensor Location Optimization of Wireless Wearable fNIRS System for Cognitive Workload Monitoring Using a Data-Driven Approach for Improved Wearability

Functional Near-Infrared Spectroscopy (fNIRS) is a hemodynamic modality in human cognitive workload assessment receiving popularity due to its easier implementation, non-invasiveness, low cost and other benefits from the signal-processing point of view. Wearable wireless fNIRS systems used in research have promisingly shown that fNIRS could be used in cognitive workload assessment in out-of-the-lab scenarios, such as in operators’ cognitive workload monitoring. In such a scenario, the wearability of the system is a significant factor affecting user comfort. In this respect, the wearability of the system can be improved if it is possible to minimize an fNIRS system without much compromise of the cognitive workload detection accuracy. In this study, cognitive workload-related hemodynamic changes were acquired using an fNIRS system covering the whole forehead, which is the region of interest in most cognitive workload-monitoring studies. A machine learning approach was applied to explore how the mean accuracy of the cognitive workload classification accuracy varied across various sensing locations on the forehead such as the Left, Mid, Right, Left-Mid, Right-Mid and Whole forehead. The statistical significance analysis result showed that the Mid location could result in significant cognitive workload classification accuracy compared to Whole forehead sensing, with a statistically insignificant difference in the mean accuracy. Thus, the wearable fNIRS system can be improved in terms of wearability by optimizing the sensor location, considering the sensing of the Mid location on the forehead for cognitive workload monitoring.


Introduction
The invention of the wheel introduced a new role for humankind, the operator. This job requires humans to take into account the current state of the system being operated and the current environmental situation the system is in, and to then perform cognitively assessed actuating commands to operate the system effectively and safely. From riding a bicycle to operating an aircraft, these jobs exert various levels of cognitive load on the operator depending on the systems. The safety of the human users accompanying the operator is highly dependent on the continuous cognitive effort of the operator, which is also denoted as the cognitive workload on the operator. For instance, a study on aviation crashes [1] based on 329 major airline crashes claimed that 38% of the crashes were probably caused due to pilot error. Thus, ensuring a balanced and continuous human operator's cognitive effort During the cognitive load, the portion of the cognitive system that retains the information in the short term, necessary for processing to accomplish the cognitive tasks, is known as the working memory [27], which is widely assumed to be served by the Prefrontal Cortex (PFC) along with central executive function. Several studies related to cognitive function conform with this assumption from the observation of high hemodynamic activities in the PFC during the cognitive workloads [3,9,14,28,29]. This observation of regional cerebral activation during cognitive tasks leads to the implementation of the fNIRS modality in assessing cognitive workload [6,[30][31][32] by sensing the PFC optically. These studies focused mainly on anatomically exploring the activation of the brain region by various cognitive processes at deep or shallow depth from the skin. Thus, in such studies, the optimization of sensor locations regarding the wearability of the fNIRS system was not considered. Thereafter, cognitive workload assessment studies focused on fNIRS typically sense the entire front area to identify the workload. To the best of the knowledge of the authors, there is no study addressing this problem of optimization and determining whether it is necessary to sense the whole forehead for the successful detection of cognitive workloads. Furthermore, the compromise of the precision of cognitive workload detection if only a certain smaller portion of the forehead is sensed remains unaddressed.

Participants
Eight healthy volunteers (six males and two females), with no history of neurovascular and cognitive disorders, participated in this study. The study was approved by the Institutional Review Board (IRB-19-0091) of Florida International University, and signed informed consent was obtained from all the subjects prior to the study.

Experimental Design
To induce a cognitive workload in human subjects, the n-back task is a widely used [6,7,28] paradigm related to cognitive study in research, which was first demonstrated by Kirchner, W.K. [33]. In this paradigm, the subject observes a series of events during the testing period. If any of the events match with n-events before then, the subject provides feedback, where n could be 1, 2, 3 and so forth, depending on the requirement. As the target of this study is to explore how the detection accuracy for cognitive workload varies due to the sensing location on the human forehead, a moderate level of cognitive workload induction was applied, which is assumed to be represented by the 2-back test [34]. A free, open-source piece of software [35] named "Brain Workshop" was used to simulate the positional 2-back task, where a solid colored square changed its position randomly within a 3-by-3 grid every two seconds (Figure 2 depicts this task paradigm). If the current position of the square matched its position two events before, the subject pressed a button on the keyboard, and they did nothing if the position did not match. In each session of recording, there were 24 events of the positional 2-back task, which spanned 48 s, and afterward, there was 25 s of relaxing, when the subjects did nothing. This relaxing period was considered as the Rest state [7]. The recording of each During the cognitive load, the portion of the cognitive system that retains the information in the short term, necessary for processing to accomplish the cognitive tasks, is known as the working memory [27], which is widely assumed to be served by the Prefrontal Cortex (PFC) along with central executive function. Several studies related to cognitive function conform with this assumption from the observation of high hemodynamic activities in the PFC during the cognitive workloads [3,9,14,28,29]. This observation of regional cerebral activation during cognitive tasks leads to the implementation of the fNIRS modality in assessing cognitive workload [6,[30][31][32] by sensing the PFC optically. These studies focused mainly on anatomically exploring the activation of the brain region by various cognitive processes at deep or shallow depth from the skin. Thus, in such studies, the optimization of sensor locations regarding the wearability of the fNIRS system was not considered. Thereafter, cognitive workload assessment studies focused on fNIRS typically sense the entire front area to identify the workload. To the best of the knowledge of the authors, there is no study addressing this problem of optimization and determining whether it is necessary to sense the whole forehead for the successful detection of cognitive workloads. Furthermore, the compromise of the precision of cognitive workload detection if only a certain smaller portion of the forehead is sensed remains unaddressed.

Participants
Eight healthy volunteers (six males and two females), with no history of neurovascular and cognitive disorders, participated in this study. The study was approved by the Institutional Review Board (IRB-19-0091) of Florida International University, and signed informed consent was obtained from all the subjects prior to the study.

Experimental Design
To induce a cognitive workload in human subjects, the n-back task is a widely used [6,7,28] paradigm related to cognitive study in research, which was first demonstrated by Kirchner, W.K. [33]. In this paradigm, the subject observes a series of events during the testing period. If any of the events match with n-events before then, the subject provides feedback, where n could be 1, 2, 3 and so forth, depending on the requirement. As the target of this study is to explore how the detection accuracy for cognitive workload varies due to the sensing location on the human forehead, a moderate level of cognitive workload induction was applied, which is assumed to be represented by the 2-back test [34].
A free, open-source piece of software [35] named "Brain Workshop" was used to simulate the positional 2-back task, where a solid colored square changed its position randomly within a 3-by-3 grid every two seconds (Figure 2 depicts this task paradigm). If the current position of the square matched its position two events before, the subject pressed a button on the keyboard, and they did nothing if the position did not match. In each session of recording, there were 24 events of the positional 2-back task, which spanned 48 s, and afterward, there was 25 s of relaxing, when the subjects did nothing. This relaxing period was considered as the Rest state [7]. The recording of each session started 10 s prior to the start of the 2-back task, and this 10 s was used as the baseline of the diffused optical signal in the conversion of the absorption of optical signals to hemodynamic change. Each subject Sensors 2020, 20, 5082 4 of 15 did 10 such sessions, where the subjects relaxed for about 30 s between the sessions, and during these periods, data were not recorded and the subjects were free to move. The sessions with less than 90% accuracy for 2-back task performance were rejected during the recording [7], assuming less cognitive involvement of the subject in the task.
Sensors 2020, 20, x FOR PEER REVIEW 4 of 17 session started 10 s prior to the start of the 2-back task, and this 10 s was used as the baseline of the diffused optical signal in the conversion of the absorption of optical signals to hemodynamic change. Each subject did 10 such sessions, where the subjects relaxed for about 30 s between the sessions, and during these periods, data were not recorded and the subjects were free to move. The sessions with less than 90% accuracy for 2-back task performance were rejected during the recording [7], assuming less cognitive involvement of the subject in the task.

Data Acquisition and Signal Processing
A wearable wireless fNIRS system developed at the Human Cyber-Physical Systems Lab at Florida International University was used for data acquisition in the experiment. The system architecture was based on the sensor system developed in [36][37][38] and modified to accommodate more channels required to sense the whole forehead. The improved system consists of three lightemitting diodes (LED) as a source of near-infrared (NIR) light, capable of multiwavelength (770 and 850 nm) emission, and eight photodetectors (PD) as light detectors, where the source-detector distance is 3 cm, with 0.3 cm variability. The differential path length (DPF) was 6.2 cm for 770 nm and 5.1 cm for 850 nm [39]. The LED and PD together form twelve channels for sensing, which are marked by channel numbers in Figure 3a. The sensitivity map for the depicted channel arrangement is presented in Figure 3b, derived from Homer2 Atlas Viewer [40,41]. The system covers the whole forehead for sensing, which is the region of interest (ROI) in this study. This ROI is subdivided into five sub-locations named Left, Mid, Right, Left-Mid and Right-Mid. Channels 1 to 4 sense the Left location on the forehead, Channels 5 to 8 sense the Mid location on the forehead, and the remaining Channels 9 to 12 sense the Right location on the forehead. The locations stated as Left-Mid and Right-Mid consist of Channels 1 to 10 and Channels 3 to 12, respectively. The location name for the whole forehead sensing area is Whole in subsequent descriptions. Each channel was sampled at a 25 Hz sampling rate. Additionally, the headband that houses the system is equipped with a nine-channel inertial measurement unit (IMU) and records the movement data concurrently with NIR data. IMU data were checked immediately after each session from each subject for movements during the recording, and the sessions that showed movements were discarded. The raw NIR signals were lowpass filtered with a third-order Butterworth bandpass filter with a 0.01-0.5 Hz cut-off frequency [42], and afterward, a two-second windowed moving-average filter was applied to further remove any physiological interference in the detected NIR signal, such as Mayer waves, respiration and heart rate [43,44]. Afterward, the modified Beer-Lambert [45] law was applied to convert the multiband raw NIR signal to oxygenation change signals, known as the change in oxygenated hemoglobin (ΔHbO2) and deoxygenated hemoglobin (ΔHbR).

Data Acquisition and Signal Processing
A wearable wireless fNIRS system developed at the Human Cyber-Physical Systems Lab at Florida International University was used for data acquisition in the experiment. The system architecture was based on the sensor system developed in [36][37][38] and modified to accommodate more channels required to sense the whole forehead. The improved system consists of three light-emitting diodes (LED) as a source of near-infrared (NIR) light, capable of multiwavelength (770 and 850 nm) emission, and eight photodetectors (PD) as light detectors, where the source-detector distance is 3 cm, with 0.3 cm variability. The differential path length (DPF) was 6.2 cm for 770 nm and 5.1 cm for 850 nm [39]. The LED and PD together form twelve channels for sensing, which are marked by channel numbers in Figure 3a. The sensitivity map for the depicted channel arrangement is presented in Figure 3b, derived from Homer2 Atlas Viewer [40,41]. The system covers the whole forehead for sensing, which is the region of interest (ROI) in this study. This ROI is subdivided into five sub-locations named Left, Mid, Right, Left-Mid and Right-Mid. Channels 1 to 4 sense the Left location on the forehead, Channels 5 to 8 sense the Mid location on the forehead, and the remaining Channels 9 to 12 sense the Right location on the forehead. The locations stated as Left-Mid and Right-Mid consist of Channels 1 to 10 and Channels 3 to 12, respectively. The location name for the whole forehead sensing area is Whole in subsequent descriptions. Each channel was sampled at a 25 Hz sampling rate. Additionally, the headband that houses the system is equipped with a nine-channel inertial measurement unit (IMU) and records the movement data concurrently with NIR data. IMU data were checked immediately after each session from each subject for movements during the recording, and the sessions that showed movements were discarded. The raw NIR signals were low-pass filtered with a third-order Butterworth bandpass filter with a 0.01-0.5 Hz cut-off frequency [42], and afterward, a two-second windowed moving-average filter was applied to further remove any physiological interference in the detected NIR signal, such as Mayer waves, respiration and heart rate [43,44]. Afterward, the modified Beer-Lambert [45] law was applied to convert the multiband raw NIR signal to oxygenation change signals, known as the change in oxygenated hemoglobin (∆HbO2) and deoxygenated hemoglobin (∆HbR). Sensors 2020, 20, x FOR PEER REVIEW 5 of 17 To study how the workload detection accuracy varied with the length of sensing period along with the sensing location on the forehead, the oxygenation change signals were segmented with various window lengths, such as 5, 10, 20, 25 and 48 s [7]. There was about 50% overlap [7] in the windowing when the segmentation window length was less than the whole period of the 2-back or Rest state. The overlapping is assumed to be necessary to reduce inter-subject variability in the statistical temporal features of the signal [7,46]. After segmentation, each signal segment was labelled as 2-back or Resting-state accordingly.
This segmentation process resulted in n segments of 2-back state signal and m segments of resting-state signals. Here the values of n and m were dependent on the segmentation window lengths used in this study. For each session of any subject, it resulted in (n, m) = (1, 11) when segmenting with a 5 s window, (n, m) = (9, 5) for a 10 s window, (n, m) = (4, 2) for a 20 s window, (n, m) = (2, 1) for a 25 s window and (n, m) = (1, 1) when using the whole period of each state. These values of n and m also signify the number of samples in each class at different segmentation window lengths. All sessions of each subject, after segmentation and appropriate class-label assignment, resulted in 290, 140, 60, 30 and 20 samples for the segmentation window lengths of 5, 10, 20 and 25 s and the whole state period, respectively.

Feature Extraction
From each segmented hemodynamic change signal sample, commonly used statistical features were extracted, such as the mean [47,48], variance [49], slope [47,48] of polynomial fit, skewness [49,50], kurtosis [49,50] and correlation [7] of ΔHbO2 and ΔHbR. The extraction of these features from each channel resulted in a total of 11 features per channel for any samples under any segmentation window length. As there were various numbers of channels involved in sensing different locations on the forehead, various numbers of features resulted from each sample depending on the sensing location on the forehead. For example, there were 44 features for the Left, Mid and Right locations, whereas there were 110 features for the Left-Mid and Right-Mid locations for any length of segmentation. As there were 12 channels involved in the whole forehead location, there were 132 features recorded in the location denoted by Whole. The supplementary file to this article contains all the feature values segregated according to different sensing locations and segmentation window lengths.

Feature Selection and Classification
In the case of the fNIRS-based classification problem, several classification methods have been used, such as Discriminant analysis, Support Vector Machine (SVM), Artificial Neural Network (ANN) To study how the workload detection accuracy varied with the length of sensing period along with the sensing location on the forehead, the oxygenation change signals were segmented with various window lengths, such as 5, 10, 20, 25 and 48 s [7]. There was about 50% overlap [7] in the windowing when the segmentation window length was less than the whole period of the 2-back or Rest state. The overlapping is assumed to be necessary to reduce inter-subject variability in the statistical temporal features of the signal [7,46]. After segmentation, each signal segment was labelled as 2-back or Resting-state accordingly. This

Feature Extraction
From each segmented hemodynamic change signal sample, commonly used statistical features were extracted, such as the mean [47,48], variance [49], slope [47,48] of polynomial fit, skewness [49,50], kurtosis [49,50] and correlation [7] of ∆HbO2 and ∆HbR. The extraction of these features from each channel resulted in a total of 11 features per channel for any samples under any segmentation window length. As there were various numbers of channels involved in sensing different locations on the forehead, various numbers of features resulted from each sample depending on the sensing location on the forehead. For example, there were 44 features for the Left, Mid and Right locations, whereas there were 110 features for the Left-Mid and Right-Mid locations for any length of segmentation. As there were 12 channels involved in the whole forehead location, there were 132 features recorded in the location denoted by Whole. The supplementary file to this article contains all the feature values segregated according to different sensing locations and segmentation window lengths.

Feature Selection and Classification
In the case of the fNIRS-based classification problem, several classification methods have been used, such as Discriminant analysis, Support Vector Machine (SVM), Artificial Neural Network (ANN) and so on [50][51][52][53][54][55][56]. As the pivotal point of the study was to assess the cognitive workload monitoring Sensors 2020, 20, 5082 6 of 15 accuracy variability using wearable fNIRS at various sensing locations on the forehead, the speed of the classifier was a crucial property for consideration. Thus, the Linear SVM, which had already been used in several other fNIRS-based classification studies [7,52,57], was selected as a classifier in this study. In this respect, the generic implementation of Linear SVM in the MATLAB platform with a box constraint of 1 and auto kernel scale was used for classification. For feature selection, two algorithms were used, namely, the Sequential Forward Selection (SFS) Wrapper algorithm [58] for feature subset selection and Relief algorithm [59,60]. Each of these algorithms has its own merits with respect to the statistical relevance (Relief) of the features to the classes [59] and the interaction of the training feature set (SFS) with the classifier algorithm [58]. Thus, both of the selected feature sets returned by these two algorithms were individually used in the classification using the linear SVM. The classification with the best accuracy among these two classification results was used in the statistical analysis. In this respect, as the SFS claims to return an optimal feature subset by heuristic search, the whole subset of the returned features was used in classification. On the other hand, as the Relief algorithm instead returns the ranks and weights for all the features, only the features with positive weights were used in the classification.
In the classification process, ten-fold testing cross-validation was applied [61]. In other words, there were 240 datasets resulting from segmentation, and in each of the classification processes on these datasets, the dataset was partitioned into ten subsets using a random selection of the observations for each partition. Then, the SVM was trained using the nine subsets of this dataset, leaving out the remaining one subset for testing. This leave-out subset is never seen by the classifier during the training phase. The training of the SVM was performed using ten-fold training cross-validation using those nine subsets of the partitioned data. Afterward, the trained SVM classifier was used to test the classification accuracy using the subset not used in training. Subsequently, this same training and testing procedure was applied on the other remaining nine subsets of this dataset in a nested cross-validation. The final accuracy of the classification on this dataset was calculated as the mean of these ten testing accuracies.

Statistical Analysis
For all the eight subjects, cognitive workload classification accuracies were calculated for the sensing locations of Left, Mid, Right, Left-Mid, Right-Mid and Whole, under several segmentation windows. The mean classification accuracies for each location across all the subjects were different from each other. Thus, to find the statistical significance of the differences in the mean classification accuracies for all subjects for each location (Left, Mid, Right, Left-Mid and Right-Mid) from the accuracy for the whole forehead location (Whole), a two-sample t-test was applied with a 5% significance level, which decided whether the means were statistically equal or not at that significance level.

Results and Discussion
All the classification accuracies from all the datasets are presented in Table 1. For the 5 s segmentation window, the lowest mean accuracy for all the subjects was 83.4%, with a standard deviation of 6.7%, which occurred when only the Right location was used. On the other hand, the highest mean accuracy occurred when the Whole forehead dataset was used to classify, which was 94.0%, with a 3.9% standard error. Like in the 5 s segmentation window, the Right location in all the other segmentation windows also resulted in the lowest classification accuracies, which were 84.0%, 86.5%, 90.8% and 95.0% for the 10, 15, 20, 25 and 48 s window lengths, respectively. For the 48 s segmentation window, the Left location also resulted in the lowest classification accuracy, like the Right location, but with a higher standard deviation, which was 6.5%. In the case of the highest mean accuracy of classifications beyond the 5 s segmentation window length, the Left-Mid location yielded the highest mean accuracy for the 20, 25 and 48 s segmentation window lengths, which were 93.6%, 95.9 and 98.3% respectively, and for the remaining 10 s segmentation window length, Right-Mid resulted in the highest mean accuracy of classification, which was 93.0%. The other details of the Sensors 2020, 20, 5082 7 of 15 classification process, such as the F1 score, sensitivity, specificity and precision are listed in the table in Appendix A for each classification performed in this study. Similar to the mean classification accuracy results from the calculation process mentioned before, the classifier details described in Appendix A were also calculated.  Regarding the statistical analysis aforementioned, the two-sample t-test was used to test the statistical significance of the differences in the mean classification accuracies for each location (Left, Mid, Right, Left-Mid and Right-Mid) from that for the whole forehead location, denoted by Whole. The t-test hypothesis-testing decisions indicated that the mean classification accuracies were statistically different from the mean classification accuracy of the location Whole in the case of only the 10 and 20 s segmentation window lengths for location Right and the 5 s segmentation window length for locations Left and Right. In other words, except for these four cases, the mean classification accuracy for any location or segmentation window lengths, statistical significance was not found for the difference in the mean classification accuracies. Moreover, these results indicate that the Mid location, which is one of the three smallest sensing locations (Left, Mid, Right), resulted in a mean classification accuracy with a statistically insignificant difference, compared to the largest location Whole for any segmentation window lengths.
The mean classification accuracies are depicted by the bar plot in Figure 4 for a qualitative assessment of how the classification accuracies varied along with the change in the sensing location on the forehead, which are grouped into segmentation windows. The error bars in the plot show the standard errors of the mean accuracies, which were estimated by dividing the standard deviation by the square root of the number of subjects in this study. From this plot, it is apparent that the differences in the mean accuracies between the largest location Whole and the other smaller locations were reduced with the increase in the segmentation window length.
Sensors 2020, 20, x FOR PEER REVIEW 8 of 17 Mean ± SD 95 ± 6.5 95.6 ± 5.0 95 ± 3.8 98.3 ± 2.4 97.5 ± 3.8 98.1 ± 3.7 Regarding the statistical analysis aforementioned, the two-sample t-test was used to test the statistical significance of the differences in the mean classification accuracies for each location (Left, Mid, Right, Left-Mid and Right-Mid) from that for the whole forehead location, denoted by Whole. The t-test hypothesis-testing decisions indicated that the mean classification accuracies were statistically different from the mean classification accuracy of the location Whole in the case of only the 10 and 20 s segmentation window lengths for location Right and the 5 s segmentation window length for locations Left and Right. In other words, except for these four cases, the mean classification accuracy for any location or segmentation window lengths, statistical significance was not found for the difference in the mean classification accuracies. Moreover, these results indicate that the Mid location, which is one of the three smallest sensing locations (Left, Mid, Right), resulted in a mean classification accuracy with a statistically insignificant difference, compared to the largest location Whole for any segmentation window lengths.
The mean classification accuracies are depicted by the bar plot in Figure 4 for a qualitative assessment of how the classification accuracies varied along with the change in the sensing location on the forehead, which are grouped into segmentation windows. The error bars in the plot show the standard errors of the mean accuracies, which were estimated by dividing the standard deviation by the square root of the number of subjects in this study. From this plot, it is apparent that the differences in the mean accuracies between the largest location Whole and the other smaller locations were reduced with the increase in the segmentation window length. Optical methods, such as fNIRS, which can reach a depth of a few centimeters only from the skin to the human brain, could only sense hemodynamic changes of the brain tissue that are closer to the cerebral surface, such as the PFC under the forehead. The PFC is one of the most functionally correlated subsystems [62] among other cerebral subsystems such as the Hippocampal Formation (HF), Inferior Parietal Lobule (IPL) and so on, which form the default network with respect to cognitive states. Moreover, research based on other methods such as fMRI and Positron Emission Tomography (PET) showed that the Medial PFC in this network is the most involved region of the PFC related to the cognitive states [63]. In this study, the n-back task was used to induce cognitive workload, which entails the involvement of the brain function named working memory. However, the outcome of any task execution by the human brain is accompanied by the other parts of human cognition such as memory retrieval, relational reasoning and multitasking behaviors. Other studies suggest that the frontopolar cortex (FPC), which we assumed to be sensed by the Mid location depicted in this study, is responsible for carrying out these parts of human cognition [64][65][66][67]. In congruence with those physiological study-based findings, the statistical results presented in this study also showed that sensing the Mid location only can result in significant accuracy in cognitive workload classification The four classification accuracy means whose differences are statistically significant are highlighted for significance with stars.
Optical methods, such as fNIRS, which can reach a depth of a few centimeters only from the skin to the human brain, could only sense hemodynamic changes of the brain tissue that are closer to the cerebral surface, such as the PFC under the forehead. The PFC is one of the most functionally correlated subsystems [62] among other cerebral subsystems such as the Hippocampal Formation (HF), Inferior Parietal Lobule (IPL) and so on, which form the default network with respect to cognitive states. Moreover, research based on other methods such as fMRI and Positron Emission Tomography (PET) showed that the Medial PFC in this network is the most involved region of the PFC related to the cognitive states [63]. In this study, the n-back task was used to induce cognitive workload, which entails the involvement of the brain function named working memory. However, the outcome of any task execution by the human brain is accompanied by the other parts of human cognition such as memory retrieval, relational reasoning and multitasking behaviors. Other studies suggest that the frontopolar cortex (FPC), which we assumed to be sensed by the Mid location depicted in this study, is responsible for carrying out these parts of human cognition [64][65][66][67]. In congruence with those physiological study-based findings, the statistical results presented in this study also showed that sensing the Mid location only can result in significant accuracy in cognitive workload classification compared to using the location Whole on the forehead. Thus, from the system design point of view, the fNIRS system could be minimized to sense only the Mid location on the forehead with a statistically insignificant compromise of the cognitive workload detection accuracy. Similarly to the Mid location, the other two smallest sensing locations, Left and Right, could also be targeted for cognitive workload detection using a minimized fNIRS system but with higher latencies for obtaining significantly comparable accuracy.

Conclusions
The fNIRS is a promising modality for the ubiquitous monitoring of human cognitive workload. For the potential deployment of fNIRS, device design optimization regarding sensor position to boost wearability toward this aim of cognitive workload tracking is essential. In this respect, statistically significant classification accuracy for cognitive workloads can be achieved by sensing the Mid location on the human forehead rather than the entire forehead using fNIRS. This finding can be utilized by researchers to optimize their wearable wireless fNIRS systems, which are resource and power constrained by nature, as well as user comfort being a concern. While the purpose of this study was to investigate how the cognitive workload monitoring accuracy varied across the sensing locations on the forehead, the variability arising from the human subjects was assumed to be minimal, and the number of participants in this study was decided by considering common practice in other relevant studies. Future studies may investigate whether there are any subject-dependent variabilities in sensor location optimization for wearable fNIRS system design. The extracerebral tissue layer signal interferences were assumed to be minimal and similar in both the 2-back and resting-state signals. A future study may utilize a short separation channel to investigate these interferences. In this study, only the immobile state of the subject was considered. Thus, another promising future direction might be to investigate the effect of motion on sensor location optimization.

Conflicts of Interest:
The authors declare no conflict of interest.