Opal Actigraphy (Activity and Sleep) Measures Compared to ActiGraph: A Validation Study

Physical activity and sleep monitoring in daily life provide vital information to track health status and physical fitness. The aim of this study was to establish concurrent validity for the new Opal Actigraphy solution in relation to the widely used ActiGraph GT9X for measuring physical activity from accelerometry epoch counts (sedentary to vigorous levels) and sleep periods in daily life. Twenty participants (age 56 ± 22 years) wore two wearable devices on each wrist for 7 days and nights, recording 3-D accelerations at 30 Hz. Bland–Altman plots and intraclass correlation coefficients (ICCs) assessed validity (agreement) and test–retest reliability between ActiGraph and Opal Actigraphy sleep durations and activity levels, as well as between the two different versions of the ActiGraph. ICCs showed excellent reliability for physical activity measures and moderate-to-excellent reliability for sleep measures between Opal versus ActiGraph GT9X and between GT3X versus GT9X. Bland–Altman plots and mean absolute percentage error (MAPE) also showed comparable performance (within 10%) between Opal and ActiGraph and between the two ActiGraph monitors across activity and sleep measures. In conclusion, physical activity and sleep measures using Opal Actigraphy demonstrate performance comparable to that of ActiGraph, supporting concurrent validation. Opal Actigraphy can be used to quantify activity and monitor sleep patterns in research and clinical studies.


Introduction
Limited physical activity and poor sleep are related to several neurological diseases and result in reduced quality of life [1][2][3][4]. Research has shown a bidirectional relationship between poor or insufficient sleep and low physical activity levels [5][6][7][8][9]. Hence, accurate assessment of physical activity and sleep is critical for management of many chronic conditions and can help quantify treatment-related changes and track individuals' overall health status [10]. Traditionally, comprehensive assessments of physical activity and sleep have been conducted in a clinic/laboratory environment that requires expensive equipment and trained staff [11][12][13]. Self-reporting of physical activity and sleep is subjective and depends on individuals' ability to recall and estimate activity levels and sleep patterns. To overcome these limitations of laboratory and self-reported measures, accelerometer-based wearable technologies have emerged as valid tools to directly and objectively quantify physical activity behaviors and sleep patterns in daily life [14]. Objective assessment of physical activities and sleep patterns with wearable devices shows great promise for application in decentralized clinical trials in patient homes and community settings, where it allows continuous tracking of daily life health status [15][16][17].
Wearable technology with tri-axial accelerometers has been successfully used to measure physical activity and sleep for decades, and provides a cost-effective and practical solution for continuous tracking of daily life health status [18][19][20]. Clinicians and researchers are often interested in quantifying the time, in minutes, spent in selected levels of physical activity intensity commonly defined as sedentary, light, moderate, vigorous and very vigorous intensity. ActiGraph devices have been among the most commonly used solutions for quantification of physical activities and sleep, with more than 20,000 papers concerning them published to date [21].
Algorithms for computing activity counts in the earlier models of these devices have been published [22][23][24][25][26]. Several activity count algorithms are based on zero-crossing and time above threshold. Freedson et al. established accelerometer count ranges for a physical activity monitor [25]. They provided accelerometer count cut-points for adults that correspond to different activity intensity levels [25]. A decade later, they updated their activity count cut-points to classify physical activity intensity based on tri-axial vector magnitude [24,25]. In addition to physical activity intensity, clinicians and researchers are also interested in quantifying periods of sleep and sleep efficiency in order to distinguish sleep from wakefulness [23].
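As an illustration of how count cut-points map epochs to intensity levels, the following minimal Python sketch classifies 1-minute epoch counts and tallies the minutes spent at each level. The numeric thresholds are the commonly cited Freedson (1998) adult cut-points for vertical-axis counts per minute; they are not stated in this paper and should be checked against the original publication before use. The function names and the 1-minute epoch length are hypothetical choices made for this example only. The vector magnitude helper shows the quantity on which the later tri-axial cut-points are based, and it would require its own set of thresholds.

```python
import bisect
import math
from collections import Counter

LEVELS = ["sedentary", "light", "moderate", "vigorous", "very vigorous"]

# ASSUMPTION: commonly cited Freedson (1998) adult cut-points, expressed as the
# inclusive upper bound (counts per minute, vertical axis) of the first four
# levels. These values are not taken from this paper.
FREEDSON98_UPPER_BOUNDS = [99, 1951, 5724, 9498]


def vector_magnitude(x_counts, y_counts, z_counts):
    """Combine per-axis counts into a single tri-axial vector magnitude."""
    return math.sqrt(x_counts ** 2 + y_counts ** 2 + z_counts ** 2)


def classify_epoch(counts_per_minute, upper_bounds=FREEDSON98_UPPER_BOUNDS):
    """Return the intensity level whose count range contains the epoch."""
    # bisect_left keeps a value equal to an upper bound in the lower level.
    return LEVELS[bisect.bisect_left(upper_bounds, counts_per_minute)]


def minutes_per_level(epoch_counts, upper_bounds=FREEDSON98_UPPER_BOUNDS):
    """Tally 1-minute epochs per intensity level (one epoch = one minute)."""
    tally = Counter(classify_epoch(c, upper_bounds) for c in epoch_counts)
    return {level: tally.get(level, 0) for level in LEVELS}


if __name__ == "__main__":
    # Hypothetical hour of data: mostly sedentary with a short brisk walk.
    one_hour = [0] * 30 + [500] * 20 + [3000] * 8 + [7000] * 2
    print(minutes_per_level(one_hour))
```

Passing the thresholds as a parameter keeps the classification logic independent of any particular cut-point set, so vector-magnitude cut-points can be substituted without changing the code.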
Several studies demonstrated the validity of accelerometer-based physical activity and sleep measures by comparing them to in-clinic gold-standard assessment [27][28][29][30]. The validity of ActiGraph wrist monitors for the assessment of physical activity and sleep has been extensively studied across many different age groups, genders, and patient populations. Clevenger et al. compared cross-generational ActiGraph devices and assessed their performance in quantifying physical activity [31]. They used GT9X and wGT3X models worn at the hip and on each wrist for 4 days. They reported that epoch-level data from different models were not identical, but most outcomes were strongly related between models and equivalent when reduced to the percentage of time spent in each intensity of activity. They suggested that caution should be exercised when comparing outcome measures among ActiGraph models [31]. Validity varies widely between devices including the Apple Watch, Yamax Digiwalker, iHealth Edge, and Misfit Shine [32]. Valkenet et al. investigated the validity of their accelerometer-based system for measuring physical activity by comparing it with the ActiGraph wGT3X-BT accelerometer [33]. They reported an intraclass correlation coefficient of 0.95, but participants wore the accelerometers for a comparatively short time, that is, one day between 9 am and 4 pm.
Wrist-worn accelerometers have been validated for sleep detection as well. Cole et al. developed and validated automatic scoring methods to distinguish sleep from wakefulness based on wrist activity during overnight polysomnography. They reported that their algorithms correctly distinguished sleep from wakefulness 88% of the time [23]. Neishabouri et al. described the detailed counts algorithms of five generations of ActiGraph devices and published the counts algorithm in ActiGraph's ActiLife and CentrePoint software [21]. Recently, a sleep detection algorithm based on angular wrist rotation estimated with raw acceleration data was validated using data from multiple accelerometer brands, including Axivity, GENEActiv and ActiGraph [34,35].
Opal® V2 Solutions (APDM Wearable Technologies, a Clario Company) have been used extensively in clinical and academic research to characterize gait, balance, and other aspects of mobility. Opal® V2 wearable sensors contain lightweight inertial measurement units, and can be deployed both in the clinic and in remote settings to capture a broad range of digital movement outcomes. Recently, Clario has developed an actigraphy solution to quantify physical activity and sleep measures during daily life. This solution contains one Opal sensor, configured to use the triaxial accelerometer in a low-power mode sampled at 30 Hz, to collect actigraphy data continuously. The Mobility Lab® software contains algorithms that generate the objective measures of daily activity levels and sleep. In this pilot study, we aimed to validate the Opal Actigraphy measures of physical activity and sleep and assess their agreement with those obtained from two different models of ActiGraph. We chose ActiGraph as a reference because ActiGraph devices (ActiGraph, Pensacola, FL, USA) have been among the most widely used and validated actigraphy devices for over two decades [21]. The main contribution of this study is to provide an alternative solution for characterizing daily free-living physical activity and sleep using Opal, and to demonstrate its concurrent validity compared to the ActiGraph.

Participants
Twenty healthy subjects without neurological, musculoskeletal, or sleep disorders participated in the study. The experimental protocol was approved by the Institutional Review Board of the Oregon Health & Science University (eIRB # 15578). All the participants provided informed written consent.

Data Collection
Participants wore four wearable devices, two on each wrist. One Opal and one ActiGraph GT9X were strapped together and placed on the non-dominant arm, and one ActiGraph GT3X and a second ActiGraph GT9X were strapped together and placed on the dominant arm. Participants were instructed to wear the devices continuously for 7 days (day and night) during their daily activities and sleep. Data collection was performed by a team at OHSU, independent of the data analysis team at Clario.

Data Processing
The sensor data were processed by Mobility Lab (Opal) and ActiLife (GT3X and GT9X) to obtain accelerometer epoch counts data and sleep periods (Cole-Kripke algorithm) [18]. The counts were categorized into activity levels (sedentary, light, moderate, vigorous and very vigorous) according to the thresholds established by Freedson et al., using two versions of the algorithm, referred to as Freedson98 [20] and FreedsonVM3 [19]. We also combined the detected sleep periods from each noon-to-noon period to obtain daily sleep statistics. We chose both algorithms (Freedson98 and FreedsonVM3) because they have been the most widely utilized in the research community and validated for activity and sleep monitoring. They are also used by the ActiGraph devices and software used for comparison in our current study.
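The noon-to-noon aggregation of sleep periods can be sketched as follows. This is a minimal illustration, not the Mobility Lab or ActiLife implementation: it assumes that sleep periods are already available as (start, end) timestamp pairs, that each period is assigned to the noon-to-noon window containing its onset, and that no period straddles noon. The function names are hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def noon_window_start(timestamp):
    """Return the noon that opens the noon-to-noon window containing timestamp."""
    noon = timestamp.replace(hour=12, minute=0, second=0, microsecond=0)
    return noon if timestamp >= noon else noon - timedelta(days=1)


def daily_sleep_minutes(sleep_periods):
    """Sum sleep duration (minutes) per noon-to-noon window.

    ASSUMPTION: each (start, end) period is credited to the window in which
    it starts; the window is labeled by the calendar date of its opening noon.
    """
    totals = defaultdict(float)
    for start, end in sleep_periods:
        window = noon_window_start(start).date()
        totals[window] += (end - start).total_seconds() / 60.0
    return dict(totals)


if __name__ == "__main__":
    # Hypothetical detected sleep periods: two nights and one afternoon nap.
    periods = [
        (datetime(2022, 3, 1, 23, 0), datetime(2022, 3, 2, 6, 30)),
        (datetime(2022, 3, 2, 14, 0), datetime(2022, 3, 2, 14, 45)),
        (datetime(2022, 3, 2, 22, 30), datetime(2022, 3, 3, 5, 30)),
    ]
    for window, minutes in sorted(daily_sleep_minutes(periods).items()):
        print(window, minutes)
```

Using noon rather than midnight as the window boundary keeps a single night of sleep within one daily record, since most nocturnal sleep periods cross midnight but not noon.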

Statistical Analysis
Concurrent validity is established when a new test or instrument is compared against a previously validated test, often a "gold standard" or the one most widely used in the field. In this study, we aim to support validation of Opal Actigraphy by comparing it to the wrist-worn sensor currently most widely used in the industry, the ActiGraph. To compare the devices, we chose several analytic approaches to assess levels of reliability between the Opal and ActiGraph GT9X algorithms. Evidence of high reliability will suggest the concurrent validity of the Opal instrument. Furthermore, we predict that the level of reliability and agreement between the Opal and ActiGraph GT9X will be comparable to the level of agreement between the two ActiGraph devices (GT3X and GT9X), thus providing additional support for validation of the Opal-based actigraphy solution. To establish reliability and agreement, we used intraclass correlation coefficients (ICCs) (two-way mixed effects, absolute agreement) and Bland-Altman plots. Additionally, we calculated mean absolute percentage error (MAPE) to test the performance of Opal versus ActiGraph GT9X, and of ActiGraph GT3X versus ActiGraph GT9X. The aim was to test whether the MAPE of Opal vs. GT9X is comparable (within 10% [36,37]) to the MAPE of GT3X vs. GT9X for all activity and sleep measures. Bland-Altman and ICC analyses were performed by a non-conflicted biostatistician at OHSU (BB) using STATA 16 software. The figures and all other analyses were produced using R Version 1.1.456 software.
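The three agreement statistics can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the STATA or R code used in the study. The ICC shown is the single-measurement, absolute-agreement form from a two-way layout (often written ICC(A,1)); the text does not state whether single or average measures were used, so that choice is an assumption. The limits of agreement use the conventional bias ± 1.96 SD of the paired differences, and the "within 10%" criterion is interpreted here as a difference of at most 10 percentage points between two MAPE values. All function names are hypothetical.

```python
from statistics import mean, stdev


def icc_absolute_agreement(rows):
    """ICC(A,1): two-way layout, absolute agreement, single measurement.

    rows: one list per subject, holding one value per device.
    """
    n, k = len(rows), len(rows[0])
    grand = mean(v for row in rows for v in row)
    row_means = [mean(row) for row in rows]
    col_means = [mean(row[j] for row in rows) for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in rows for v in row)
    ss_error = ss_total - ss_rows - ss_cols
    msr = ss_rows / (n - 1)                # between-subject mean square
    msc = ss_cols / (k - 1)                # between-device mean square
    mse = ss_error / ((n - 1) * (k - 1))   # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + (k / n) * (msc - mse))


def bland_altman(test, reference):
    """Return (bias, lower limit, upper limit) of the paired differences."""
    diffs = [t - r for t, r in zip(test, reference)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)
    return bias, bias - spread, bias + spread


def mape(test, reference):
    """Mean absolute percentage error; reference values must be nonzero."""
    return 100.0 * mean(abs(t - r) / abs(r) for t, r in zip(test, reference))


def comparable(mape_a, mape_b, margin=10.0):
    """ASSUMPTION: 'within 10%' means at most 10 percentage points apart."""
    return abs(mape_a - mape_b) <= margin


if __name__ == "__main__":
    # Hypothetical daily totals for five subjects (new device, reference device).
    new_device = [410.0, 385.0, 450.0, 300.0, 420.0]
    ref_device = [400.0, 390.0, 440.0, 310.0, 415.0]
    print(icc_absolute_agreement([list(p) for p in zip(new_device, ref_device)]))
    print(bland_altman(new_device, ref_device))
    print(mape(new_device, ref_device))
```

Because the absolute-agreement ICC includes the between-device mean square in its denominator, a constant offset between devices lowers it, whereas a consistency-type ICC would not be affected by such an offset.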

Data Availability Statement
The datasets generated during and/or analyzed during the current study are available from the corresponding author on reasonable request.

Results
Twenty healthy subjects participated in the study. The mean age of the group was 56 years (standard deviation (SD) = 22 years), with an average height of 65.5 inches (SD = 5.8 inches) and an average weight of 72.6 kg (SD = 16 kg). Compliance was high for weekly recordings, with an average of 6.5 days (SD = 0.6 days; min = 6 days and max = 8 days) of recording.

Comparison of Opal and ActiGraph Algorithms for Physical Activity Measures
Freedson98 Algorithm: The ICC values show high reliability and a similar range in both comparisons for all activity measures estimated by Freedson98. The ICC ranges across activity levels were 0.9953-0.9996 for Opal versus ActiGraph GT9X and 0.9899-0.9999 for GT3X versus GT9X (see Table 1). Figure 1 shows the Bland-Altman plots comparing Opal with ActiGraph GT9X, and the two versions of ActiGraph (GT3X versus GT9X), side by side, for the total activity counts. The degree of agreement between Opal and ActiGraph and between the two ActiGraph models was similar across activity levels (not all plots shown). The biases (mean differences) and limits of agreement (lower limit, LL; upper limit, UL) for all Freedson98 activity measures are given in Table 2. The mean absolute percentage error (MAPE) shows comparable performance between the Opal and ActiGraph GT9X, and between the two versions of ActiGraph (GT3X versus GT9X) (see Table 2). Specifically, the MAPE for Opal versus ActiGraph GT9X activity measures using the Freedson98 algorithm was within 10% of the MAPE of GT3X versus GT9X.
Figure 1. Bland-Altman plots of total counts using Freedson98 algorithm.
FreedsonVM3 Algorithm: The ICC values for the FreedsonVM3 activity measures are reported in Table 1. Figure 2 shows the Bland-Altman plots comparing Opal with ActiGraph GT9X, and the two versions of ActiGraph (GT3X versus GT9X), side by side, for the total activity counts. The degree of agreement between Opal and ActiGraph and between the two ActiGraph models was similar across activity levels (not all plots shown). The biases (mean differences) and limits of agreement (lower limit, LL; upper limit, UL) for all FreedsonVM3 activity measures are given in Table 3. The MAPE shows comparable performance between the Opal and ActiGraph GT9X, and between the two versions of ActiGraph (GT3X versus GT9X) (see Table 3). Specifically, the MAPE for Opal versus ActiGraph GT9X activity measures using the FreedsonVM3 algorithm was within 10% of the MAPE of GT3X versus GT9X. The mean and SD of all activity measures, averaged across 7 days for all participants, are shown in Supplementary Table S1.
Sleep Measures: The ICC values for the sleep measures are reported in Table 1. Figure 4 shows the Bland-Altman plots comparing Opal with ActiGraph GT9X, and the two versions of ActiGraph (GT3X versus GT9X), side by side, for all the sleep measures. The degree of agreement between Opal and ActiGraph and between the two ActiGraph models was similar across sleep measures. The biases (mean differences) and limits of agreement (lower limit, LL; upper limit, UL) for all sleep measures are given in Table 4. The MAPE shows comparable performance between the Opal and ActiGraph GT9X and between the two versions of ActiGraph (GT3X versus GT9X) (see Table 4). Specifically, the MAPE for Opal versus ActiGraph GT9X sleep measures was within 10% of the MAPE of GT3X versus GT9X, except for the total sleep time. However, in the case of the total sleep time measures, the MAPE for Opal versus GT9X (6%) was lower than that for GT3X versus GT9X (25%). The mean and SD of all sleep measures, averaged across 7 days for all participants, are shown in Supplementary Table S1.

Discussion
The Opal Actigraphy and ActiLife (ActiGraph GT3X and GT9X) solutions generated activity counts, activity intensity levels, and sleep measures during daily life. Comparisons between Opal and ActiGraph GT9X and between ActiGraph GT3X and GT9X were used to assess levels of reliability and agreement. Collectively, in this pilot study, our results indicate high reliability and similar agreement between the Opal and ActiGraph models, and between the two ActiGraph models, lending support for establishing concurrent validity.
Activity measures: Total activity counts were slightly underestimated by Opal, and overestimated by GT3X compared to GT9X. This is consistent with the findings by John et al. who observed an overestimation of counts by GT3X compared to GT9X on a treadmill protocol [38]. In contrast, Hwang et al. found that total counts were overestimated by GT9X compared to GT3X [39].
Sleep measures: GT3X has been shown to be reliable and valid for measuring sleep compared to the in-lab gold standard, i.e., polysomnography (PSG) [27,30,40]. Sleep efficiency has been shown to be somewhat overestimated by GT3X compared to PSG [27,30,40]. In our results, we also found that sleep efficiency is overestimated by GT3X, and slightly underestimated by Opal, compared to GT9X. Similarly, the wake-after-sleep-onset time (WASO) has been shown to be underestimated by GT3X compared to PSG [27,30,40]. In our results, we also found that WASO is underestimated by GT3X, and overestimated by Opal, compared to GT9X. In contrast, total sleep time is overestimated both by GT3X (compared to PSG [27]) and by Opal (compared to GT9X).
We observed that activity measures yielded higher ICC reliability metrics than sleep measures. Further, we observed that for some of the activity and sleep measures, Opal underestimates while GT3X overestimates compared to GT9X, and vice versa, with a notable outlier in the Bland-Altman plots. Both of these observed differences could be attributed to the different hardware, sensor characteristics (see Supplementary Table S2), and algorithms used by each system. Furthermore, some participants clearly noticed the difference in weight between the two wrists (GT3X was bulkier), which might have influenced the measures extracted from the dominant (GT3X versus GT9X) and non-dominant (Opal versus GT9X) wrists. Lastly, although the agreement between Opal and ActiGraph GT9X is similar to the agreement between the two versions of ActiGraph, we recommend caution when interpreting these results; for some activity and sleep measures, there appears to be a systematic bias.
There are several limitations of the current study. First, we only assessed the reliability and agreement of activity and sleep measures of the Opal compared to ActiGraph in healthy middle-aged adults, and findings may not be applicable to other cohorts. Second, we only validated Opal's activity and sleep measures with a small sample of subjects, therefore limiting the strength of the conclusions that can be drawn regarding agreement, especially from the Bland-Altman plots. Finally, we need to investigate the activity and sleep measures for the minimal clinically important difference in order to establish whether the difference between the Opal and GT9X is clinically meaningful. Hence, future studies are needed to strengthen the validity of Opal's activity and sleep measures on larger, more diverse populations, including those with neurological disorders.

Conclusions
Our pilot study provided evidence for establishing concurrent validity of the Opal Actigraphy measures of sleep and physical activity with 3D accelerometer data during daily life, as compared to the most widely used ActiGraph. Historically, Opal multi-sensor solutions have enabled high-resolution data capture of precise aspects of human mobility via prescribed movement tasks in controlled environments to assess the efficacy and safety of therapeutic interventions. The new and validated Opal Actigraphy single-sensor configuration can provide further insight into how movement impairments captured in clinic translate into real-world quality of life by quantifying overall physical activity levels and sleep durations. Future studies should evaluate Clario's Opal Actigraphy solution in larger, more diverse cohorts and determine the clinically meaningful difference in activity and sleep metrics in response to interventions.
Supplementary Materials: The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/s23042296/s1, Table S1. Activity and sleep measures average across 7 days of all participants for the four sensors. Table S2. Sensor characteristics for Opal, GT9X and GT3X.