Article

Accuracy of Mobile Applications versus Wearable Devices in Long-Term Step Measurements

1
Istituto Scientifico Romagnolo per lo Studio e la Cura dei Tumori (IRST) IRCCS, 47014 Meldola (FC), Italy
2
Department of Computer Science and Engineering (DISI), University of Bologna, 47521 Cesena, Italy
*
Author to whom correspondence should be addressed.
Sensors 2020, 20(21), 6293; https://doi.org/10.3390/s20216293
Received: 14 October 2020 / Revised: 28 October 2020 / Accepted: 3 November 2020 / Published: 5 November 2020

Abstract

Fitness sensors and health systems are paving the way toward improving the quality of medical care by exploiting the benefits of new technology. For example, the great amount of patient-generated health data available today gives new opportunities to measure life parameters in real time and create a revolution in communication for professionals and patients. In this work, we concentrated on the basic parameter typically measured by fitness applications and devices: the number of steps taken daily. The main goal of this study was to compare the accuracy and precision of smartphone applications versus those of wearable devices to give users an idea about what can be expected regarding the relative difference in measurements achieved using different system typologies. The data obtained showed a difference of approximately 30%, indicating that smartphone applications provide inaccurate measurements in long-term analysis, while wearable devices are precise and accurate. Accordingly, we challenge the reliability of previous studies reporting data collected with phone-based applications, and besides discussing the current limitations, we support the use of wearable devices for mHealth.


1. Introduction

Innovative applications (apps) and smart devices such as smartphones and tablets have emerged as integral parts of people’s lives [1]. They provide information to profile society [2], but also to make users aware of body signals important for predicting life expectancy [3], promoting patient engagement, self-management of diseases, and assist doctors to remotely follow up patients [4]. In particular, in the healthcare sector, we have seen the development of systems to assist users in different ways during the day [5] and night [6]. These devices offer incredible potential to generate big data, often identified as patient-generated health data (PGHD), that could influence medical doctors’ decision-making [7], as well as the potential for earlier diagnosis [8]. The World Health Organization (WHO) defines these tools under the labels electronic health (eHealth) and mobile health (mHealth): eHealth refers to any use of information and communication technology for health care; mHealth is a subset of eHealth specifically referring to the use of mobile and wireless devices. For instance, fitness trackers, blood glucose meters [9], blood pressure monitors [10], smoking sensors [11], and temperature-detection devices [12] are today popular for monitoring different fitness parameters and vital signs used for health analysis [13] and many other motion-related applications including tracking, positioning, activity recognition, and augmented reality [14]. In addition, electronic textiles (e-textiles), smart clothes, and flexible/printable electronics bring us closer to scenarios where electronic systems are totally integrated in our everyday life and help us in achieving a higher comfort level when interacting with smartphones and domotics in general [15].
Currently, there are three main ways to monitor and record physical activity (PA) parameters: (a) subjective methods, including self-reporting instruments such as questionnaires and diaries; (b) phone-based applications that help to record activities performed daily; and (c) wearable devices that record fitness and vital parameters in a continuous manner. However, most recent studies rely on mobile applications and/or fitness trackers to avoid subjective measurements, which can lead to wrong estimations (e.g., objectively measured values are typically lower than self-reported estimations [16]). Among the fitness trackers, bracelets worn on the wrist can today be considered the gold standard device for activity tracking [17]. Furthermore, most of these fitness bracelets have companion applications that interact with social media, allowing users to post their sleep and activity data, which makes these devices increasingly common. This has spawned a new trend, especially among the younger generations, and the total sales of fitness trackers are projected to be USD 6 billion by the end of 2020 [18].
Recently, Arigo et al. [19] described the history and future of the wearable devices used in medicine. Briefly, the earliest uses of technology to support medicine interventions occurred in the late 1940s and 1950s with the use of mechanical counters to provide feedback on users’ behavior. Pedometers (meant as early-stage mechanical devices that collect ambulatory data) were first used in the treatment of obesity in 1949 [20]. Later versions of activity sensors, such as aligned wrist- and ankle-worn counters [21], paved the way for modern smartphones and wearable fitness trackers. Most of these systems are equipped with accelerometers that record accelerations in one or more planes. These data elements are then processed into more meaningful variables [7], such as step counts; time spent in sedentary, light, moderate, and vigorous PA; flights of stairs climbed; and hours of sleep [22]. Steps are the basic parameter for several other indirect measurements. They are objective, intuitive, and comprehensible in the context of understanding personal activity, which makes this measure ideal for people to reflect on [23]. Steps are recorded in the device when a vertical acceleration deflects a spring-suspended lever arm above a designated force sensitivity threshold [24]. Most of the other common fitness tracker measures (e.g., flights of stairs, active minutes, calories burned, etc.) are then derived from the step count [25]. However, questions remain about the accuracy of the data they collect [26] and whether appropriate validation occurs before commercial sale [27].
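As a rough illustration of the threshold principle described above, the following minimal Python sketch counts one step each time a vertical acceleration signal rises above a threshold. The function name, the threshold value, and the sample signal are hypothetical and are not taken from any of the trackers considered in this work; commercial devices rely on more elaborate, usually proprietary, algorithms.

```python
def count_steps(vertical_acc, threshold=1.2):
    """Count upward crossings of `threshold` in a vertical acceleration signal.

    `vertical_acc` is a sequence of samples (in g); `threshold` is an
    illustrative value, not a parameter of any real device.
    """
    steps = 0
    above = False  # True while the signal stays above the threshold
    for sample in vertical_acc:
        if sample > threshold and not above:
            steps += 1      # rising edge: one new step
            above = True
        elif sample <= threshold:
            above = False   # signal fell back: ready for the next step
    return steps


# Hypothetical signal with three separate peaks above 1.2 g
signal = [1.0, 1.3, 1.4, 1.0, 0.9, 1.5, 1.1, 1.0, 1.25, 1.3, 1.0]
print(count_steps(signal))  # 3
```

Counting only rising edges avoids registering several steps for a single peak that spans more than one sample.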
In recent years, many studies have shown the importance of PA for long-term cardiovascular health [28]. Similarly, the number of studies considering patients’ PA in cancer clinical trials is rapidly increasing. Recently, Cox et al. [29] reviewed the different types of data recorded in cancer clinical trials, Purswani et al. [24] reviewed oncological studies specifically considering the patient’s steps, and Gresham et al. [30] looked at oncology trials based on wearable activity monitors. In general, the trend shows a rising number of studies involving some form of PA analysis, and the population is increasingly willing to share personal data, especially for research purposes [13]. In oncology, PGHD may be useful in providing continuous, dynamic, and objective assessments of patients’ health status between clinician visits. Medical doctors typically suggest that oncology patients engage in several PAs to stay healthy [31]. However, for several cancer types, the effects of PA on cancer prevention, treatment, and survival are still not known. An example is thyroid cancer, for which the literature reports conflicting results. Kitahara et al. [32] observed an increased risk of thyroid cancer in subjects reporting a higher frequency of PA. Conversely, the results reported by Leitzmann et al. [33] suggest that PA is unrelated to total thyroid cancer. Finally, Rossing et al. [34] support the hypothesis that PA may reduce the risk of thyroid cancer, but the evidence is too sparse to make a more definitive judgment and give robust guidelines to patients. As for thyroid cancer, conflicting studies can be found in the literature for several other cancer types, confirming that the health and fitness data at our disposal today are not able to clarify the relations between cancer and PA.
New technologies for collecting fitness and health data and better data accuracy may significantly improve the reliability of scientific studies in the field [35], and having large datasets reporting timeline PA and patients’ clinical status at the scientific community’s disposal would help in better profiling patients and understanding the correlations in order to improve people’s wellbeing [36].
In this work, we considered a basic parameter for PA evaluation: the number of steps taken daily. Our aim was to give users an idea about how many steps they miss when measuring them through a smartphone-based application. In particular, we compared six fitness tracking applications for Android mobiles and three commercial fitness tracking wristbands, and we analyzed the precision and accuracy of the phone-based applications versus the dedicated wearable devices. Several brands and models of consumer trackers were examined for their accuracy in step measurement in a laboratory setting [37], e.g., treadmill walking [38], level walking [39], and stair walking [40]. However, count accuracy in real environments remains a major challenge [41]. To assess the accuracy and precision of the considered trackers, we performed two types of experiments (Figure 1). The first experiment was similar to those performed by Case et al. [42] and Modave et al. [43]. We evaluated the accuracy of the trackers in controlled outdoor tests with known ground truth, i.e., the real number of steps counted manually by analyzing videos recorded with a camera to serve as the benchmark (Figure 1a) [44]. In the second experiment, we asked a healthy 35-year-old man to install the six applications on his mobile and, at the same time, wear all the fitness wristbands, and we measured the precision of the different systems in a 2-month 24 h/7 days experiment (Figure 1b). All the collected data and acquired videos are publicly available on FigShare (https://doi.org/10.6084/m9.figshare.c.4923645.v2).
The results obtained show that despite the good accuracy of all the tracking systems, the mobile applications did not provide reliable data in the long-term analysis due to an intrinsic problem, a missing parameter: the length of time the mobile was carried. In daily routines, people do not always carry their mobiles, especially when moving within a small indoor environment and performing standard daily life activities (e.g., cleaning, cooking, visiting the toilet), and this may generate unreliable day-based measurements. Young people generally carry their mobiles most of the time, so the number of steps recorded is more reliable, but the situation is very different for elderly people. In our case, even though the man performing the experiment was aware of the purpose of the collected data, the number of steps recorded by the phone-based applications was, on average, 30% lower than that recorded by the fitness wristbands, and such a difference is likely to mislead all types of subsequent statistics, especially if the data are collected for clinical trials. Furthermore, patients may perceive these unreliable data as trustworthy, which may also have a negative impact on users’ behavior.
We decided to share our data to make the scientific community aware that many studies based on steps counted with smartphone-based applications may be unreliable. Several recent as well as ongoing studies are based on data collected using smartphone-based applications [45], but our data show that the findings of these studies should be reconsidered. In our opinion, the only way to collect reliable data is to use an accurate fitness tracker worn on the body 24 h a day, 7 days a week. A major limitation of these devices is that they may not be sensitive enough for non-ambulatory physical activities such as cycling, swimming, and dancing [24]. Furthermore, most of them become unreliable when used by people with disabilities [46]. However, they are still the best compromise available today to collect reliable data passively without influencing people’s lives.

2. Materials and Methods

2.1. Phone-Based Applications and Wearable Fitness Trackers

Today, the market offers a wide range of phone-based applications and wearable devices for fitness tracking [47]. They differ in terms of available hardware and software installed, and it is known that the accuracy depends on both software and hardware [48]. However, even when the operating system, application programming interface (API), and sensors are the same, different implementations can affect the computation of the number of steps.
In this work, we mimicked a user interested in installing an application for fitness tracking by downloading one from among those freely available. The applications and devices included in this study were selected considering different software companies, prices, sensors, and algorithms used to analyze the data. Our decision was to install the applications in a mid-range smartphone, considering this to be the representative technology of the average consumer today. In particular, all of the applications were downloaded from the Google Play store and were installed on a Huawei P Smart FIG-LX1 phone with the Android 9 operating system.
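In this work, 6 phone-based fitness tracking applications (APPs) and 3 wearable fitness trackers (WFTs) were tested. The 6 applications were: APP1, Huawei Health; APP2, Bits&Coffee ActivityTracker; APP3, Best Simple Apps Contapassi; APP4, GALA MIX WinWalk; APP5, LG Electronics LG Health; and APP6, Pacer Health’s Pacer.
The 3 wearable fitness trackers were: WFT1, Decathlon OnCoach 100; WFT2, Crane activity tracker; and WFT3, Suunto 9.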

2.2. Tracker Accuracy: Experiment Description

To assess how closely the different trackers agreed, we asked 3 operators to perform a test carrying a mobile with the 6 apps installed in a trouser pocket and wearing the 3 fitness wristbands on their left arm. The operators were 3 healthy subjects: a 35-year-old man (hereafter, Operator1), a 65-year-old man (Operator2), and a 65-year-old woman (Operator3). We asked each operator to walk in a fairly flat park while counting approximately 1200 steps. For the test, we chose this outdoor environment and not an indoor treadmill because it is known that in real-world conditions, especially on difficult terrain, there can be far more variation in step counts, given the changes in gait and wrist movement [43]. To have the ground truth, a cameraman recorded a video by following the operator performing the test with a camera. The acquired videos were post-processed to count the actual number of steps. Accuracy was then assessed by comparing the number of steps recorded by each tracker with the ground truth. All videos considered in the experiments are publicly available on FigShare (https://doi.org/10.6084/m9.figshare.c.4923645.v2).

2.3. Tracker Precision: Experiment Description

To evaluate the precision of the different trackers, we performed a long-term experiment in which the daily measurement was repeated over a few months in order to collect 60 days of reliable data. We asked an operator to carry the mobile with the apps installed and wear the fitness wristbands 24 h/7 days. Then, we kept the data of the first 60 days in which all the tracking systems were working. Specifically, every time one of the tracking systems was being recharged, we discarded the data collected by all the other systems. Furthermore, all the days when the operator could wear the wristbands but not carry the mobile in a trouser pocket (because his free-time activities could potentially damage the smartphone, for instance swimming, working in a vegetable garden, riding a horse, or dancing) were removed from the analysis.
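The day-selection rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dictionary fields "any_tracker_charging" and "phone_carried", the function name, and the sample log are hypothetical and do not reflect how the data were actually stored.

```python
def select_valid_days(days, max_days=60):
    """Return the first `max_days` days on which all trackers were usable.

    Each day is a dict with two hypothetical boolean fields:
    - "any_tracker_charging": True if at least one tracker was under recharge
    - "phone_carried": True if the phone could be kept in a trouser pocket
    """
    valid = [
        day for day in days
        if not day["any_tracker_charging"] and day["phone_carried"]
    ]
    return valid[:max_days]


# Hypothetical log of four calendar days
log = [
    {"date": "day 1", "any_tracker_charging": False, "phone_carried": True},
    {"date": "day 2", "any_tracker_charging": True, "phone_carried": True},
    {"date": "day 3", "any_tracker_charging": False, "phone_carried": False},
    {"date": "day 4", "any_tracker_charging": False, "phone_carried": True},
]
print([day["date"] for day in select_valid_days(log)])  # ['day 1', 'day 4']
```

Because whole days are dropped for every tracker at once, the retained days are not necessarily consecutive.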
The operator was Operator1, a healthy 35-year-old man who leads a standard office job life (he is a computer scientist researcher) with several typical activities like working at the computer, answering the office phone, going to the printer, sharing documents with colleagues, and meeting with collaborators. The raw data recorded by different trackers are reported in Table 1. Once again, it is worth remarking that the 60 days considered in the experiment were not continuous, mainly due to battery recharge issues of the wearable devices and operator needs.

2.4. Statistics

In the accuracy assessment, the absolute normalized difference percentage (PAND) between step value (Vi) recorded by the trackers (i) and ground truth (G), determined in the post-processing of the videos, was computed to measure the closeness of agreement of the different counts according to Equation (1):
$$\mathrm{PAND} = 100 \times \frac{|V_i - G|}{G} \qquad (1)$$
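For clarity, Equation (1) can be written as a short Python function. The function name and the step values in the example are hypothetical and serve only to illustrate the arithmetic; they are not measurements from this study.

```python
def pand(recorded_steps, ground_truth):
    """Absolute normalized difference percentage between a tracker and the ground truth."""
    if ground_truth <= 0:
        raise ValueError("ground_truth must be a positive number of steps")
    return 100.0 * abs(recorded_steps - ground_truth) / ground_truth


# Hypothetical example: a tracker records 1150 steps against 1200 counted on video
print(round(pand(1150, 1200), 2))  # 4.17
```

Because of the absolute value, an overestimation and an underestimation of the same size give the same PAND.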
In the precision assessment, unbalanced one-way analysis of variance (ANOVA) was performed to measure if the step values recorded daily by the apps significantly differed from those recorded by the fitness wristbands. Then, the normalized difference percentage (PND) between the values (Vi) recorded by the trackers (i) and the average value (A), computed considering the values of apps (Aa) or wearables (Aw) together, was calculated to evaluate the precision of each tracker i, according to Equation (2):
$$\mathrm{PND} = 100 \times \frac{V_i - A_{a/w}}{A_{a/w}} \qquad (2)$$
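Equation (2) can be sketched in the same way. In this minimal Python example, the values of one group (apps or wearables) are normalized by the average of that group; the function name and the daily step values are hypothetical.

```python
from statistics import mean


def pnd(values):
    """Normalized difference percentage of each tracker with respect to its group average."""
    group_average = mean(values)  # A_a for apps, A_w for wearables
    return [100.0 * (v - group_average) / group_average for v in values]


# Hypothetical daily step counts recorded by three apps
print(pnd([9000, 10000, 11000]))  # [-10.0, 0.0, 10.0]
```

Unlike PAND, PND keeps its sign, so it shows whether a tracker counts more or fewer steps than the average of its group.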
The tests were carried out using MATLAB (MathWorks, Inc., Natick, MA, USA), and a p-value ≤ 0.05 was considered statistically significant. The asterisks in column 1 of Table 2 are only intended to flag levels of significance for three of the most commonly used levels: p-value less than 0.05 is flagged with one asterisk (*), p-value less than 0.01 is flagged with two asterisks (**), and p-value less than 0.001 is flagged with three asterisks (***). The bold values reported in column 1 of Table 2 highlight the days when the step values recorded by the apps did not significantly differ (considering p-value = 0.05 as the threshold) from those recorded by the fitness wristbands. The same analysis was repeated by accumulating the steps and subdividing the data in 6 blocks of 10 days (Table 3).
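The analysis in this work was performed in MATLAB; the following Python sketch is only an illustrative re-implementation of the two ingredients described above, under stated assumptions. It computes the F statistic of a one-way ANOVA for groups of unequal size and maps a p-value to the asterisk notation. The function names and test values are hypothetical. The p-value itself is not computed here, since it requires the F distribution, which is available in statistical packages (e.g., scipy.stats.f_oneway returns both the statistic and the p-value).

```python
from statistics import mean


def one_way_anova_f(*groups):
    """Return (F, df_between, df_within) for groups of possibly unequal size."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variability of the group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variability of the single values around their own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


def significance_stars(p_value):
    """Map a p-value to the asterisk notation used for the significance levels."""
    if p_value < 0.001:
        return "***"
    if p_value < 0.01:
        return "**"
    if p_value < 0.05:
        return "*"
    return ""


# Hypothetical step values of one day: six apps versus three wearables
apps = [6000, 6200, 5900, 6100, 6050, 5950]
wearables = [9000, 9300, 8700]
f_stat, df_b, df_w = one_way_anova_f(apps, wearables)
print(f_stat, df_b, df_w)
print(significance_stars(0.004))  # **
```

With six apps and three wearables, the two groups have different sizes, which is why the unbalanced form of the test is needed.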

3. Results

3.1. Tracker Accuracy: Results

To assess the accuracy of the trackers, we asked three operators to wear the devices and walk in a park while counting approximately 1200 steps. To have the ground truth number of steps, a cameraman followed the operator and recorded a video that was subsequently used to determine the real number of steps. Table 1 reports the name of the operator performing the test in row 1, the ground truth number of steps in row 2, the number of steps recorded by the devices in rows 3–11, and the PAND values in rows 12–21. It is worth noting that all apps counted the same number of steps in all tests, except APP4 in the test performed by Operator2 and Operator3 (Figure 2a). However, all PAND values were lower than 10% (the worst PAND was 8.23% for the apps and 8.08% for the wristbands), proving the accuracy of all the trackers in a short-term analysis performed under the controlled conditions of a standard walk in a park.

3.2. Tracker Precision: Results

To analyze the precision of the trackers, PND values were computed considering Aa as a normalization factor for apps and Aw for the fitness trackers. Table 2 reports the steps counted by the trackers in the 60-day experiment. Day-based average values and standard deviations of APPs and WFTs are reported in Figure 2b. PND values are reported in Table 4. The average PND computed by considering all the data collected (last line of Table 4) ranges from −3% to 8% for apps and −6% to 4% for wearables, showing good precision of the trackers. The same analysis was repeated not just day-based, but also accumulating the steps and subdividing the data in 6 blocks of 10 days. Table 3 reports the cumulative step numbers and Table 5 the block-based PND values. The average PND computed by considering the 6 blocks together (last line of Table 5) ranges from −4% to 8% for apps and −8% to 4% for wearables. This shows that despite some local values strongly differing from the average, across the whole experiment, there were no significant differences between the APPs or WFTs considered separately. Accordingly, this proves that today, despite recorded values being logically dependent on the quality of the algorithms, on average, thanks to the available technology, all apps and wearables gave practically the same number of counted steps in long-term analysis.
However, the number of steps recorded by the phone-based apps was significantly lower than that recorded by the fitness wristbands. The average number of steps recorded daily by the phone-based apps was on average 34% lower than that recorded by the fitness wristbands (Figure 2b) according to the day-based analysis, and 32% in the cumulative-based one. Furthermore, on approximately 80% of the days (50 out of 60 days), the values recorded by the phone-based apps statistically significantly differed from those recorded by the wearable devices, according to an unbalanced one-way ANOVA test performed on the day-based data. This suggests lower accuracy of the apps in long-term analysis with respect to wearables worn on the body. This is also emphasized when considering the cumulative-based data reported in Table 3: in 100% of the 6 blocks, the cumulative values referring to the phone-based apps strongly (Block I is characterized by a p-value lower than 0.01; all the other blocks by p-values lower than 0.001) statistically significantly differed from those referring to the wearable devices.
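The day-based and cumulative-based comparisons can be illustrated with a minimal Python sketch. The function names and all numerical values below are hypothetical and chosen only to make the arithmetic easy to follow; they are not the data of this study.

```python
def block_sums(daily_steps, block_size=10):
    """Accumulate daily step counts into consecutive blocks of `block_size` days."""
    return [
        sum(daily_steps[start:start + block_size])
        for start in range(0, len(daily_steps), block_size)
    ]


def percent_lower(app_average, wearable_average):
    """Percentage by which the app average is lower than the wearable average."""
    return 100.0 * (wearable_average - app_average) / wearable_average


# Hypothetical example: 20 days of steps grouped in two blocks of 10 days
print(block_sums(list(range(1, 21))))  # [55, 155]

# Hypothetical averages: 6600 steps for the apps, 10000 for the wearables
print(percent_lower(6600, 10000))  # 34.0
```

Applying percent_lower to the daily averages gives the day-based figure, while applying it to the outputs of block_sums gives the cumulative-based one.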
It is worth remarking that the operator performing the experiment was well informed about the purpose of the acquired data. However, to collect reliable data, he did not modify his daily routine or the way he used his mobile. Accordingly, he was not carrying the mobile during several daily movements typical of a standard office job, like standing up from the desk to answer the office phone, going to the printer, bringing documents to colleagues, and visiting the toilet. Furthermore, during his free time (mainly in the evenings at home), he was not carrying the mobile during standard activities like cooking and cleaning. The recorded data showed that these standard movements strongly affect the final count of steps taken, and the difference between steps recorded by phone-based apps and fitness wristbands would probably be even larger for elderly people, who typically leave their mobile on a desk in a central room of the house while performing standard activities around it.

4. Discussion

One of the goals of this work was to compare the accuracy and precision of smartphone-based apps versus those of wearable fitness wristbands to suggest the best way to collect data for long-term clinical trials. Even though several research studies have confirmed that both the accuracy and precision of activity tracking devices have steadily improved over the last decade, device accuracy still remains a concern. However, technological improvements, increased knowledge on how to use the devices and for what purposes, and a better statistical approach to understanding the data all contribute to more comfort in the use of wearables and sensors for clinical trials.
Logically, to be effective in measuring PA, patients must wear a wristband or use an app for the entire day. Therefore, design, battery life, water-resistance, and comfort must all be maximized [29]. However, since in daily life people are not always carrying their smartphone, data collected by phone-based apps may be unreliable for long-term analysis, and asking people to carry their smartphone 24 h a day, 7 days a week would be impossible and would definitely change their lifestyle. Besides, other studies already showed that even fitness wristbands generally slightly underestimate the number of steps [38]. Consequently, there are several reasons to suggest that data obtained by smartphone-based apps are unreliable. On the other hand, fitness wristbands: (a) are only slightly invasive; (b) do not influence people’s lifestyle; and (c) can collect accurate and precise data for long-term analysis. However, since foot pods measure walking or running cadence directly, we do not exclude the idea that more reliable data could be acquired using ankle bracelets, which are probably more accurate than devices measuring at the wrist or the hip [43]. Previous comparisons of data from various consumer- and commercial-grade wearables already demonstrated variability among devices related to body placement. For instance, Hildebrand et al. [49] examined the effects of tracker placement on acceleration values. In particular, they showed a significant difference between hip and wrist placement, with the output from the wrist being generally higher than that from the hip. A hip tracker usually has several limitations, including underestimation of energy expenditure during activities with little or no movement at the hip, potential loss of data due to removing the tracker when dressing, and an influence on the lifestyle. In addition, several studies have documented noncompliance resulting in loss of data when using a hip-mounted accelerometer [16,50].
How to improve the general accuracy of step counting still remains a very important issue. Possible directions include the following: (a) by exploiting the decreasing trend of the prices of electronics and the increasing size-reduction of all the sensors, three sensors could be embedded in all the devices to always provide a median value; (b) GPS measurements of the distances that have been walked or run could be considered in all step counting algorithms to improve the accuracy of the system; (c) other less noise-prone body placement positions could be tested, for instance the neck or waist, or smart wearable technologies, such as smart shirts or pants, could be considered. However, with the information available today, we think that the best trade-off in collecting reliable data without affecting people’s lifestyle is offered by wearable fitness wristbands. Therefore, we suggest that these types of devices be used for long-term clinical trials, and we caution the community to reconsider the findings of previous studies based on smartphone applications.
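Suggestion (a) can be illustrated with a minimal Python sketch, assuming a hypothetical device that embeds three independent step sensors. The function name and the counts are illustrative only; no such device was tested in this work.

```python
from statistics import median


def fused_step_count(sensor_a, sensor_b, sensor_c):
    """Return the median of three step counts, which ignores a single outlier."""
    return median([sensor_a, sensor_b, sensor_c])


# Hypothetical counts: the third sensor strongly underestimates the steps
print(fused_step_count(1000, 1010, 400))  # 1000
```

With three sensors, the median is not affected by one faulty reading, whereas the arithmetic mean of the same values would drop to about 803 steps.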

5. Conclusions

At this time, the number of clinical trials involving physical activity measurements is growing. Smartphone-based applications, fitness wristbands, ankle bracelets, hip belts, and many other tracking systems are very common in society. However, the accuracy and precision of these devices still remain a concern.
In this work, we analyzed the precision and accuracy of six phone-based applications (i.e., APP1: Huawei Health; APP2: Bits&Coffee ActivityTracker; APP3: Best Simple Apps Contapassi; APP4: GALA MIX WinWalk; APP5: LG Electronics LG Health; APP6: Pacer Health’s Pacer) and three wearable fitness wristbands (i.e., WFT1: Decathlon OnCoach 100; WFT2: Crane activity tracker; WFT3: Suunto 9). The experiments performed were conceived as a proof of concept to give an idea about what can be expected in terms of relative differences in the measurements achieved using different system typologies. In particular, the number of steps recorded by the trackers in controlled tests with ground truth was considered to assess the accuracy of the trackers; a long-term analysis based on data acquired in a 2-month experiment was considered to estimate the precision. The first experiment shows that the accuracy of the smartphone applications is comparable to that of the wearable fitness wristbands; the second experiment shows that, because people cannot always carry a smartphone, the step measurement in long-term analysis may differ by as much as 30%. Providing significant statistics about the absolute performance of mobile applications and fitness wristbands is beyond the scope of this work. However, we can summarize that, among the applications, only APP4 gave significantly different results, and among the wristbands, WFT1 gave, in general, lower step measurements.
The main outcome of the experiments is that despite good accuracy in short-term analysis under controlled conditions, data acquired with phone-based applications may be unreliable in long-term analysis. This is not due to system accuracy, but because people do not always carry their smartphone in their trouser pocket. For instance, they typically leave their mobile on a desk in a central room of the house and do not carry it when performing standard daily life activities (e.g., cleaning, cooking, visiting the toilet) in small indoor environments. We showed that the sum of these moments, added to the moments when they do not carry their mobile because they are performing activities that could potentially damage it (e.g., swimming, working in the vegetable garden, riding a horse, dancing), may have a significant impact on step measurement at the end of the day.
To conclude, besides suggesting the use of wearable fitness wristbands to collect data, we also caution the scientific community to reconsider outcomes of studies based on data collected with mobile trackers.

Author Contributions

Conceptualization: F.P., G.M., A.C.; data curation: F.P.; formal analysis: F.P., A.C.; funding acquisition: G.M., A.C.; investigation: F.P., A.C.; methodology: F.P., A.C.; project administration: A.C.; resources: G.M., A.C.; software: F.P.; supervision: A.C.; validation: F.P.; visualization: F.P.; writing (original draft): F.P., A.C.; writing (review and editing): G.M. All authors have read and agreed to the published version of the manuscript.

Funding

G.M. is supported by Horizon 2020—ONCORELIEF EU Project (ref: H2020-875392). This research received no other funding.

Acknowledgments

We would like to thank Mattia Cantagalli (Faenza, Italy) and Roberto Montevecchi (Faenza, Italy) for sharing important opinions about fitness trackers; Roberto Vespignani (IRST IRCCS, Meldola, Italy) and Nicola Caroli (IRST IRCCS, Meldola, Italy) for technical assistance in testing applications/trackers; Amedeo Bertuccioli (University of Bologna, Italy) for analyzing the videos; Roberta Vicchi (Ravenna, Italy) for editing the manuscript; the MDPI English Editing Services for revising the manuscript language.

Conflicts of Interest

The authors declare no conflict of interest.

Ethics Statements

It is worth noting that this is not a clinical study/trial. In this research, there are no patients, and the operators involved in the experiments were volunteers just carrying a smartphone and wearing three fitness wristbands. We are not sharing personal health data or sensitive data, and the experiments have been designed only to test tracking systems. However, all the operators involved in the experiments signed an informed consent form to authorize the treatment of the collected data. Their consent is publicly available: https://doi.org/10.6084/m9.figshare.c.4923645.v2.

References

  1. Jee, H. Review of researches on smartphone applications for physical activity promotion in healthy adults. J. Exerc. Rehabil. 2017, 13, 3–11. [Google Scholar] [CrossRef] [PubMed][Green Version]
  2. Zhao, S.; Li, S.; Ramos, J.; Luo, Z.; Jiang, Z.; Dey, A.K.; Pan, G. User profiling from their use of smartphone applications: A survey. Pervasive Mob. Comput. 2019, 59, 101052. [Google Scholar] [CrossRef]
  3. Kang, J.J.; Adibi, S. Systematic predictive analysis of personalized life expectancy using smart devices. Technologies 2018, 6, 74. [Google Scholar] [CrossRef][Green Version]
  4. Bayo-Monton, J.L.; Martinez-Millana, A.; Han, W.; Fernandez-Llatas, C.; Sun, Y.; Traver, V. Wearable sensors integrated with Internet of Things for advancing eHealth care. Sensors 2018, 18, 1851. [Google Scholar] [CrossRef][Green Version]
  5. Kranz, M.; MöLler, A.; Hammerla, N.; Diewald, S.; PlöTz, T.; Olivier, P.; Roalter, L. The mobile fitness coach: Towards individualized skill assessment using personalized mobile devices. Pervasive Mob. Comput. 2013, 9, 203–215. [Google Scholar] [CrossRef][Green Version]
  6. Sadeghi, R.; Banerjee, T.; Hughes, J.C.; Lawhorne, L.W. Sleep quality prediction in caregivers using physiological signals. Comput. Biol. Med. 2019, 110, 276–288. [Google Scholar] [CrossRef]
  7. Purswani, J.M.; Dicker, A.P.; Champ, C.E.; Cantor, M.; Ohri, N. Big data from small devices: The future of smartphones in oncology. Semin. Radiat. Oncol. 2019, 29, 338–347. [Google Scholar] [CrossRef] [PubMed]
  8. Dias, D.; Silva Cunha, J.P. Wearable health devices—Vital sign monitoring, systems and technologies. Sensors 2018, 18, 2414. [Google Scholar] [CrossRef][Green Version]
  9. Heintzman, N.D. A digital ecosystem of diabetes data and technology: Services, systems, and tools enabled by wearables, sensors, and apps. J. Diabetes Sci. Technol. 2016, 10, 35–41. [Google Scholar] [CrossRef][Green Version]
  10. Villarreal, V.; Nielsen, M.; Samudio, M. Sensing and storing the blood pressure measure by patients through a platform and mobile devices. Sensors 2018, 18, 1805. [Google Scholar] [CrossRef][Green Version]
  11. Sazonov, E.; Lopez-Meyer, P.; Tiffany, S. A wearable sensor system for monitoring cigarette smoking. J. Stud. Alcohol Drugs 2013, 74, 956–964. [Google Scholar] [CrossRef] [PubMed][Green Version]
  12. Lee, J.W.; Han, D.C.; Shin, H.J.; Yeom, S.H.; Ju, B.K.; Lee, W. PEDOT: PSS-based temperature-detection thread for wearable devices. Sensors 2018, 18, 2996. [Google Scholar] [CrossRef][Green Version]
  13. Kessel, K.A.; Vogel, M.M.; Kessel, C.; Bier, H.; Biedermann, T.; Friess, H.; Herschbach, P.; von Eisenhart-Rothe, R.; Meyer, B.; Kiechle, M.; et al. Mobile health in oncology: A patient survey about app-assisted cancer care. JMIR mHealth uHealth 2017, 5, e81. [Google Scholar] [CrossRef]
  14. Kang, X.; Huang, B.; Qi, G. A novel walking detection and step counting algorithm using unconstrained smartphones. Sensors 2018, 18, 297. [Google Scholar] [CrossRef] [PubMed][Green Version]
  15. Rajan, K.; Garofalo, E.; Chiolerio, A. Wearable intrinsically soft, stretchable, flexible devices for memories and computing. Sensors 2018, 18, 367. [Google Scholar] [CrossRef] [PubMed][Green Version]
  16. Troiano, R.P.; Berrigan, D.; Dodd, K.W.; Masse, L.C.; Tilert, T.; McDowell, M. Physical activity in the United States measured by accelerometer. Med. Sci. Sports Exerc. 2008, 40, 181–188. [Google Scholar] [CrossRef] [PubMed]
  17. Mansukhani, M.P.; Kolla, B.P. Apps and fitness trackers that measure sleep: Are they useful. Clevel. Clin. J. Med. 2017, 84, 451–456. [Google Scholar] [CrossRef] [PubMed]
  18. Kinney, D.A.; Nabors, L.A.; Merianos, A.L.; Vidourek, R.A. College students’ use and perceptions of wearable fitness trackers. Am. J. Health Educ. 2019, 50, 298–307. [Google Scholar] [CrossRef]
  19. Arigo, D.; Jake-Schoffman, D.E.; Wolin, K.; Beckjord, E.; Hekler, E.B.; Pagoto, S.L. The history and future of digital health in the field of behavioral medicine. J. Behav. Med. 2019, 42, 67–83. [Google Scholar] [CrossRef]
  20. Larsen, G. Treatment of obesity. Tidsskr. Nor. laegeforen. 1949, 69, 442–446. [Google Scholar]
  21. Schulmann, J.L.; Reisman, J.M. An objective measurement of hyperactivity. Am. J. Ment. Defic. 1959, 64, 455–456. [Google Scholar]
  22. Burgdorf, A.; Güthe, I.; Jovanović, M.; Kutafina, E.; Kohlschein, C.; Bitsch, J.Á.; Jonas, S.M. The mobile sleep lab app: An open-source framework for mobile sleep assessment based on consumer-grade wearable devices. Comput. Biol. Med. 2018, 103, 8–16. [Google Scholar] [CrossRef] [PubMed]
  23. Bassett, D.R.; Toth, L.P.; LaMunion, S.R.; Crouter, S.E. Step counting: A review of measurement considerations and health-related applications. Sports Med. 2017, 47, 1303–1315. [Google Scholar] [CrossRef][Green Version]
  24. Purswani, J.M.; Ohri, N.; Champ, C. Tracking steps in oncology: The time is now. Cancer Manag. Res. 2018, 10, 2439. [Google Scholar] [CrossRef][Green Version]
  25. El-Amrawy, F.; Nounou, M.I. Are currently available wearable devices for activity tracking and heart rate monitoring accurate, precise, and medically beneficial? Health Inform. Res. 2015, 21, 315–320. [Google Scholar] [CrossRef]
  26. Evenson, K.R.; Goto, M.M.; Furberg, R.D. Systematic review of the validity and reliability of consumer-wearable activity trackers. Int. J. Behav. Nutr. Phys. Act. 2015, 12, 159. [Google Scholar] [CrossRef][Green Version]
  27. Peake, J.; Kerr, G.K.; Sullivan, J.P. A critical review of consumer wearables, mobile applications and equipment for providing biofeedback, monitoring stress and sleep in physically active populations. Front. Physiol. 2018, 9, 743. [Google Scholar] [CrossRef]
  28. Hershman, S.G.; Bot, B.M.; Shcherbina, A.; Doerr, M.; Moayedi, Y.; Pavlovic, A.; Waggott, D.; Cho, M.K.; Rosenberger, M.E.; Haskell, W.L.; et al. Physical activity, sleep and cardiovascular health data for 50,000 individuals from the MyHeart Counts Study. Sci. Data 2019, 6, 1–10. [Google Scholar] [CrossRef][Green Version]
  29. Cox, S.M.; Lane, A.; Volchenboum, S.L. Use of wearable, mobile, and sensor technology in cancer clinical trials. JCO Clin. Cancer Inform. 2018, 2, 1–11. [Google Scholar] [CrossRef]
  30. Gresham, G.; Schrack, J.; Gresham, L.M.; Shinde, A.M.; Hendifar, A.E.; Tuli, R.; Rimel, B.J.; Figlin, R.; Meinert, C.L.; Piantadosi, S. Wearable activity monitors in oncology trials: Current use of an emerging technology. Contemp. Clin. Trials 2018, 64, 13–21. [Google Scholar] [CrossRef][Green Version]
  31. Chandwani, K.D.; Perkins, G.; Nagendra, H.R.; Raghuram, N.V.; Spelman, A.; Nagarathna, R.; Johnson, K.; Fortier, A.; Arun, B.; Wei, Q.; et al. Randomized, controlled trial of yoga in women with breast cancer undergoing radiotherapy. J. Clin. Oncol. 2014, 32, 1058. [Google Scholar] [CrossRef][Green Version]
  32. Kitahara, C.M.; Platz, E.A.; Freeman, L.E.B.; Black, A.; Hsing, A.W.; Linet, M.S.; Park, Y.; Schairer, C.; De González, A.B. Physical activity, diabetes, and thyroid cancer risk: A pooled analysis of five prospective studies. Cancer Causes Control. 2012, 23, 463–471. [Google Scholar] [CrossRef][Green Version]
  33. Leitzmann, M.F.; Brenner, A.; Moore, S.C.; Koebnick, C.; Park, Y.; Hollenbeck, A.; Schatzkin, A.; Ron, E. Prospective study of body mass index, physical activity and thyroid cancer. Int. J. Cancer 2010, 126, 2947–2956. [Google Scholar] [CrossRef][Green Version]
  34. Rossing, M.A.; Remler, R.; Voigt, L.F.; Wicklund, K.G.; Daling, J.R. Recreational physical activity and risk of papillary thyroid cancer (United States). Cancer Causes Control. 2001, 12, 881–885. [Google Scholar] [CrossRef] [PubMed]
  35. Carbonaro, A.; Piccinini, F.; Reda, R. Integrating heterogeneous data of healthcare devices to enable domain data management. J. e-Learn. Knowl. Soc. 2018, 14, 1–12. [Google Scholar]
  36. Dijkhuis, T.B.; Blaauw, F.J.; Van Ittersum, M.W.; Velthuijsen, H.; Aiello, M. Personalized physical activity coaching: A machine learning approach. Sensors 2018, 18, 623. [Google Scholar] [CrossRef] [PubMed][Green Version]
  37. Tam, M.K.; Cheung, S.Y. Validation of consumer wearable activity tracker as step measurement in free-living conditions. Finn. J. eHealth eWelfare 2019, 11, 68–75. [Google Scholar] [CrossRef][Green Version]
  38. Sears, T.; Alvalos, E.; Lawson, S.; McAlister, I.; Eschbach, L.C.; Bunn, J. Wrist-worn physical activity trackers tend to underestimate steps during walking. Int. J. Exerc. Sci. 2017, 10, 764–773. [Google Scholar]
  39. Takacs, J.; Pollock, C.L.; Guenther, J.R.; Bahar, M.; Napier, C.; Hunt, M.A. Validation of the Fitbit One activity monitor device during treadmill walking. J. Sci. Med. Sport 2014, 17, 496–500. [Google Scholar] [CrossRef]
  40. Huang, Y.; Xu, J.; Yu, B.; Shull, P.B. Validity of FitBit, Jawbone UP, Nike+ and other wearable devices for level and stair walking. Gait Posture 2016, 48, 36–41. [Google Scholar] [CrossRef]
  41. Ngueleu, A.M.; Blanchette, A.K.; Bouyer, L.; Maltais, D.; McFadyen, B.J.; Moffet, H.; Batcho, C.S. Design and accuracy of an instrumented insole using pressure sensors for step count. Sensors 2019, 19, 984. [Google Scholar] [CrossRef][Green Version]
  42. Case, M.A.; Burwick, H.A.; Volpp, K.G.; Patel, M.S. Accuracy of smartphone applications and wearable devices for tracking physical activity data. JAMA 2015, 313, 625–626. [Google Scholar] [CrossRef][Green Version]
  43. Modave, F.; Guo, Y.; Bian, J.; Gurka, M.J.; Parish, A.; Smith, M.D.; Lee, A.M.; Buford, T.W. Mobile device accuracy for step counting across age groups. JMIR mHealth uHealth 2017, 5, e88. [Google Scholar] [CrossRef]
  44. Bunn, J.A.; Jones, C.; Oliviera, A.; Webster, M.J. Assessment of step accuracy using the Consumer Technology Association standard. J. Sports Sci. 2019, 37, 244–248. [Google Scholar] [CrossRef]
  45. Soto-Perez-De-Celis, E.; Kim, H.; Rojo-Castillo, M.P.; Sun, C.L.; Chavarri-Guerra, Y.; Navarrete-Reyes, A.P.; Waisman, J.R.; Avila-Funes, J.A.; Aguayo, A.; Hurria, A. A pilot study of an accelerometer-equipped smartphone to monitor older adults with cancer receiving chemotherapy in Mexico. J. Geriatr. Oncol. 2018, 9, 145–151. [Google Scholar] [CrossRef]
  46. Jones, M.; Morris, J.; Deruyter, F. Mobile healthcare and people with disabilities: Current state and future needs. Int. J. Environ. Res. Public Health 2018, 15, 515. [Google Scholar] [CrossRef][Green Version]
  47. Muzny, M.; Henriksen, A.; Giordanengo, A.; Muzik, J.; Grottland, A.; Blixgard, H.; Hartvigsen, G.; Arsand, E. Wearable sensors with possibilities for data exchange: Analyzing status and needs of different actors in mobile health monitoring systems. Int. J. Med. Inform. 2020, 133, 104017. [Google Scholar] [CrossRef] [PubMed]
  48. Díaz, S.; Stephenson, J.B.; Labrador, M.A. Use of wearable sensor technology in gait, balance, and range of motion analysis. Appl. Sci. 2020, 10, 234. [Google Scholar] [CrossRef][Green Version]
  49. Hildebrand, M.; Van Hees, V.T.; Hansen, B.H.; Ekelund, U.L.F. Age group comparability of raw accelerometer output from wrist-and hip-worn monitors. Med. Sci. Sports Exerc. 2014, 46, 1816–1824. [Google Scholar] [CrossRef]
  50. Colley, R.; Gorber, S.C.; Tremblay, M.S. Quality control and data reduction procedures for accelerometry-derived measures of physical activity. Health Rep. 2010, 21, 63. [Google Scholar]
Figure 1. Representation of performed experiments. (a) Outdoor controlled test with number of steps counted by processing a video recorded with a camera. (b) Two-month 24 h/7 days monitoring of a healthy 35-year-old man wearing three fitness wristbands and carrying a mobile with six running step-counter applications.
Figure 2. Bar-charts showing (a) absolute normalized difference percentage (PAND) values reported in the last nine rows of Table 1, and (b) day-based averages of the APPs and WFTs step values reported in Table 2. The gray bars in (b) are the standard deviations. * p-value less than 0.05; ** p-value less than 0.01; *** p-value less than 0.001. Bold values in the x-axis of (b) highlight days when step values recorded by APPs do not significantly differ (considering p-value = 0.05 as the threshold) from those recorded by fitness wristbands (WFTs).
Table 1. Steps recorded by trackers in step-controlled experiments.
Operator | Operator1 | Operator2 | Operator3
GROUND TRUTH [steps] | 1027 | 1409 | 1175
APP1 [steps] | 1036 | 1410 | 1176
APP2 [steps] | 1036 | 1410 | 1176
APP3 [steps] | 1036 | 1410 | 1176
APP4 [steps] | 1036 | 1293 | 1095
APP5 [steps] | 1036 | 1410 | 1176
APP6 [steps] | 1036 | 1410 | 1176
WFT1 [steps] | 1016 | 1380 | 1230
WFT2 [steps] | 944 | 1424 | 1197
WFT3 [steps] | 994 | 1402 | 1159
APP1 (PAND) | 0.88 | 0.07 | 0.09
APP2 (PAND) | 0.88 | 0.07 | 0.09
APP3 (PAND) | 0.88 | 0.07 | 0.09
APP4 (PAND) | 0.88 | 8.23 | 6.81
APP5 (PAND) | 0.88 | 0.07 | 0.09
APP6 (PAND) | 0.88 | 0.07 | 0.09
WFT1 (PAND) | 1.07 | 2.06 | 4.68
WFT2 (PAND) | 8.08 | 1.06 | 1.87
WFT3 (PAND) | 3.21 | 0.50 | 1.36
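The PAND rows of Table 1 can be reproduced from the step counts. The following is a minimal Python sketch, assuming that PAND is the absolute difference from the ground truth expressed as a percentage of the ground truth. This assumption is consistent with the tabulated values; the function and variable names are illustrative only.

```python
# Assumed definition (consistent with Table 1):
#   PAND = 100 * |measured - ground_truth| / ground_truth
def pand(measured: int, ground_truth: int) -> float:
    """Absolute normalized difference, in percent, rounded to 2 decimals."""
    return round(100.0 * abs(measured - ground_truth) / ground_truth, 2)

# Step counts copied from Table 1 (ground truth and APP4).
ground_truth = {"Operator1": 1027, "Operator2": 1409, "Operator3": 1175}
app4 = {"Operator1": 1036, "Operator2": 1293, "Operator3": 1095}

app4_pand = {op: pand(app4[op], ground_truth[op]) for op in ground_truth}
print(app4_pand)
```

For APP4 this yields 0.88, 8.23 and 6.81 for the three operators, matching the APP4 (PAND) row.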
Table 2. Steps recorded by trackers in the 2-month experiment. * p-value less than 0.05; ** p-value less than 0.01; *** p-value less than 0.001. Bold values in column 1 highlight days when step values recorded by APPs do not significantly differ (considering p-value = 0.05 as the threshold) from those recorded by fitness wristbands (WFTs).
Day | APP1 | APP2 | APP3 | APP4 | APP5 | APP6 | WFT1 | WFT2 | WFT3
01 * | 6767 | 5000 | 6203 | 6100 | 6328 | 6140 | 7139 | 7216 | 7899
02 ** | 5507 | 5389 | 5070 | 5400 | 5310 | 4594 | 5907 | 6095 | 6631
03 *** | 5015 | 5015 | 4760 | 5000 | 5082 | 5015 | 6114 | 6387 | 7326
04 | 16,234 | 16,150 | 15,469 | 16,100 | 15,951 | 16,150 | 15,082 | 20,687 | 18,774
05 *** | 5363 | 5286 | 4823 | 4900 | 5233 | 5286 | 6596 | 9336 | 8853
06 *** | 1188 | 1188 | 1004 | 1200 | 792 | 1188 | 4011 | 6405 | 5018
07 *** | 5296 | 5255 | 4319 | 5300 | 5255 | 5255 | 7110 | 8817 | 8102
08 ** | 1793 | 1793 | 1589 | 1700 | 1793 | 1733 | 2759 | 3991 | 5452
09 | 13,997 | 629 | 13,699 | 629 | 629 | 629 | 10,112 | 13,966 | 14,618
10 * | 11,305 | 3662 | 11,047 | 2000 | 3663 | 3674 | 12,090 | 14,023 | 13,480
11 ** | 5568 | 4337 | 4651 | 2600 | 4169 | 5443 | 8661 | 7564 | 6381
12 | 8846 | 2396 | 8426 | 3800 | 8708 | 8738 | 8835 | 10,962 | 10,372
13 *** | 6299 | 6199 | 5959 | 4800 | 6101 | 6059 | 9633 | 8314 | 9241
14 ** | 9158 | 8931 | 7976 | 8000 | 8488 | 9621 | 10,356 | 11,972 | 11,192
15 ** | 4966 | 2764 | 4508 | 3400 | 3383 | 3591 | 5848 | 6504 | 6120
16 ** | 10,179 | 10,493 | 9640 | 10,000 | 10,493 | 10,103 | 11,770 | 17,065 | 15,826
17 *** | 6926 | 6926 | 6046 | 6200 | 6924 | 6926 | 9917 | 10,366 | 10,291
18 *** | 2777 | 2788 | 2631 | 2600 | 2604 | 2788 | 5159 | 5013 | 5893
19 *** | 3948 | 3948 | 3846 | 4100 | 3907 | 3948 | 6666 | 5785 | 6068
20 | 6050 | 6154 | 9211 | 6900 | 6154 | 6048 | 8000 | 8831 | 8543
21 * | 7798 | 7798 | 6052 | 5000 | 7652 | 7798 | 9306 | 8831 | 9062
22 *** | 4036 | 3988 | 4858 | 4400 | 3988 | 3988 | 6354 | 5841 | 7265
23 * | 6403 | 6403 | 6429 | 4200 | 6403 | 6403 | 7613 | 8051 | 7811
24 *** | 1306 | 1306 | 1155 | 2100 | 1306 | 1306 | 3105 | 2880 | 3466
25 *** | 2891 | 2862 | 2840 | 2700 | 2772 | 2862 | 4858 | 4592 | 5199
26 ** | 4741 | 4413 | 4259 | 5800 | 4399 | 4393 | 7262 | 6573 | 5391
27 * | 3800 | 3800 | 3615 | 4000 | 3800 | 3800 | 5618 | 4752 | 4107
28 | 12,487 | 12,410 | 12,414 | 14,000 | 12,410 | 12,410 | 11,813 | 16,608 | 16,767
29 | 6442 | 6395 | 6107 | 6500 | 6252 | 6309 | 5275 | 9551 | 8598
30 ** | 5220 | 4987 | 4818 | 3800 | 4987 | 4987 | 6131 | 6130 | 7361
31 * | 5505 | 5373 | 5154 | 9600 | 5370 | 5373 | 9108 | 8936 | 8252
32 *** | 3067 | 3059 | 2740 | 2600 | 3057 | 2991 | 6646 | 5655 | 6898
33 *** | 6972 | 7189 | 6748 | 8900 | 7187 | 6817 | 10,567 | 10,244 | 10,292
34 | 4645 | 4428 | 4015 | 6500 | 4428 | 4428 | 4838 | 5860 | 6421
35 *** | 3188 | 3023 | 2967 | 3800 | 2964 | 3023 | 6174 | 7824 | 8364
36 *** | 4045 | 3921 | 3565 | 4600 | 3921 | 3921 | 9744 | 10,198 | 9900
37 ** | 11,294 | 11,035 | 10,648 | 9400 | 11,035 | 11,035 | 13,264 | 19,936 | 19,899
38 *** | 6323 | 6187 | 5705 | 6000 | 6041 | 6187 | 8226 | 8135 | 8289
39 *** | 3919 | 3697 | 3278 | 3500 | 3563 | 3697 | 6939 | 5700 | 5900
40 * | 6113 | 5809 | 5296 | 4900 | 5809 | 5809 | 7062 | 6118 | 7123
41 ** | 6765 | 6574 | 6232 | 8900 | 6574 | 6574 | 9763 | 9020 | 9004
42 *** | 8707 | 8617 | 8305 | 9000 | 8587 | 8617 | 10,972 | 12,499 | 13,511
43 *** | 5132 | 5129 | 5142 | 4700 | 5129 | 5129 | 7411 | 10,038 | 10,260
44 * | 7439 | 7487 | 7252 | 8200 | 7311 | 7436 | 7536 | 11,673 | 12,831
45 | 11,226 | 11,226 | 10,800 | 7600 | 11,226 | 11,226 | 11,935 | 12,368 | 12,228
46 *** | 4729 | 4731 | 4458 | 4200 | 4580 | 4729 | 6836 | 7442 | 8118
47 | 17,699 | 17,699 | 17,271 | 10,900 | 17,560 | 17,699 | 17,440 | 20,016 | 18,621
48 * | 6055 | 6055 | 5863 | 6700 | 5851 | 6055 | 6396 | 6533 | 7151
49 ** | 13,563 | 13,188 | 13,309 | 11,200 | 13,188 | 13,188 | 13,889 | 17,786 | 18,717
50 | 7744 | 7744 | 7600 | 12,100 | 7606 | 7606 | 10,101 | 8938 | 10,198
51 *** | 2473 | 2473 | 2400 | 2400 | 2407 | 2407 | 5109 | 4265 | 4340
52 *** | 4694 | 4694 | 4631 | 3900 | 4498 | 4694 | 7346 | 6078 | 6859
53 *** | 5578 | 5408 | 5456 | 5400 | 5385 | 5577 | 7974 | 7373 | 9056
54 *** | 4435 | 4363 | 3986 | 3200 | 4363 | 4363 | 6572 | 6236 | 6530
55 * | 3257 | 3228 | 2992 | 3700 | 3228 | 3228 | 4890 | 9674 | 5193
56 ** | 4279 | 4249 | 4120 | 4900 | 4246 | 4249 | 7162 | 11,407 | 7133
57 *** | 6719 | 6553 | 6468 | 4600 | 6553 | 6553 | 9663 | 10,120 | 11,205
58 | 4304 | 3981 | 3807 | 3700 | 3917 | 3981 | 5891 | 6018 | 7882
59 ** | 7811 | 7651 | 7370 | 9700 | 7651 | 7532 | 9423 | 10,990 | 11,043
60 *** | 10,967 | 10,967 | 10,752 | 11,000 | 10,967 | 10,967 | 14,003 | 14,171 | 14,559
Table 3. Cumulative number of steps recorded subdividing the data reported in Table 2 in 6 blocks of 10 days. Asterisks in column 1 highlight blocks when step values recorded by APPs significantly differ from those recorded by fitness wristbands (WFTs). * p-value less than 0.05; ** p-value less than 0.01; *** p-value less than 0.001.
Block | APP1 | APP2 | APP3 | APP4 | APP5 | APP6 | WFT1 | WFT2 | WFT3
I ** | 72,465 | 49,367 | 67,983 | 48,329 | 50,036 | 49,664 | 76,920 | 96,923 | 96,153
II *** | 64,717 | 54,936 | 62,894 | 52,400 | 60,931 | 63,265 | 84,845 | 92,376 | 89,927
III *** | 55,124 | 54,362 | 52,547 | 52,500 | 53,969 | 54,256 | 67,335 | 73,809 | 75,027
IV *** | 55,071 | 53,721 | 50,116 | 59,800 | 53,375 | 53,281 | 82,568 | 88,606 | 91,338
V *** | 89,059 | 88,450 | 86,232 | 83,500 | 87,612 | 88,259 | 102,279 | 116,313 | 120,639
VI *** | 54,517 | 53,567 | 51,982 | 52,500 | 53,215 | 53,551 | 78,033 | 86,332 | 83,800
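Each entry of Table 3 is the sum of ten consecutive daily values from Table 2. A minimal Python sketch of that aggregation follows, using the APP1 values for days 01 to 20; `block_sums` is an illustrative helper name, not something defined in the article.

```python
def block_sums(daily_steps: list[int], block_size: int = 10) -> list[int]:
    """Sum consecutive, non-overlapping blocks of `block_size` days."""
    return [
        sum(daily_steps[start:start + block_size])
        for start in range(0, len(daily_steps), block_size)
    ]

# APP1 daily steps for days 01-20, copied from Table 2.
app1_days_01_20 = [
    6767, 5507, 5015, 16234, 5363, 1188, 5296, 1793, 13997, 11305,
    5568, 8846, 6299, 9158, 4966, 10179, 6926, 2777, 3948, 6050,
]

app1_blocks = block_sums(app1_days_01_20)
print(app1_blocks)
```

The two sums obtained, 72465 and 64717, correspond to the APP1 entries of Blocks I and II.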
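Table 4. Percentage normalized difference (PND) values computed for trackers in the 2-month experiment.
DAY | APP1 | APP2 | APP3 | APP4 | APP5 | APP6 | WFT1 | WFT2 | WFT3
01 | 11 | −18 | 2 | 0 | 4 | 1 | −4 | −3 | 6
02 | 6 | 3 | −3 | 4 | 2 | −12 | −5 | −2 | 7
03 | 1 | 1 | −4 | 0 | 2 | 1 | −7 | −3 | 11
04 | 1 | 1 | −3 | 1 | 0 | 1 | −17 | 14 | 3
05 | 4 | 3 | −6 | −5 | 2 | 3 | −20 | 13 | 7
06 | 9 | 9 | −8 | 10 | −28 | 9 | −22 | 24 | −2
07 | 4 | 3 | −16 | 4 | 3 | 3 | −11 | 10 | 1
08 | 3 | 3 | −8 | −2 | 3 | 0 | −32 | −2 | 34
09 | 178 | −88 | 172 | −88 | −88 | −88 | −22 | 8 | 13
10 | 92 | −38 | 87 | −66 | −38 | −38 | −8 | 6 | 2
11 | 25 | −3 | 4 | −42 | −7 | 22 | 15 | 0 | −15
12 | 30 | −65 | 24 | −44 | 28 | 28 | −12 | 9 | 3
13 | 7 | 5 | 1 | −19 | 3 | 3 | 6 | −8 | 2
14 | 5 | 3 | −8 | −8 | −2 | 11 | −7 | 7 | 0
15 | 32 | −27 | 20 | −10 | −10 | −5 | −5 | 6 | −1
16 | 0 | 3 | −5 | −1 | 3 | 0 | −21 | 15 | 6
17 | 4 | 4 | −9 | −7 | 4 | 4 | −3 | 2 | 1
18 | 3 | 3 | −2 | −4 | −3 | 3 | −4 | −6 | 10
19 | 0 | 0 | −3 | 4 | −1 | 0 | 8 | −6 | −2
20 | −10 | −9 | 36 | 2 | −9 | −10 | −5 | 4 | 1
21 | 11 | 11 | −14 | −29 | 9 | 11 | 3 | −3 | 0
22 | −4 | −5 | 15 | 5 | −5 | −5 | −2 | −10 | 12
23 | 6 | 6 | 6 | −30 | 6 | 6 | −3 | 3 | 0
24 | −8 | −8 | −18 | 49 | −8 | −8 | −1 | −9 | 10
25 | 2 | 1 | 1 | −4 | −2 | 1 | −1 | −6 | 6
26 | 2 | −5 | −9 | 24 | −6 | −6 | 13 | 3 | −16
27 | 0 | 0 | −5 | 5 | 0 | 0 | 16 | −2 | −15
28 | −2 | −2 | −2 | 10 | −2 | −2 | −22 | 10 | 11
29 | 2 | 1 | −4 | 3 | −1 | 0 | −32 | 22 | 10
30 | 9 | 4 | 0 | −21 | 4 | 4 | −6 | −6 | 13
31 | −9 | −11 | −15 | 58 | −11 | −11 | 4 | 2 | −6
32 | 5 | 5 | −6 | −11 | 5 | 2 | 4 | −12 | 8
33 | −5 | −2 | −8 | 22 | −2 | −7 | 2 | −1 | −1
34 | −2 | −7 | −15 | 37 | −7 | −7 | −15 | 3 | 13
35 | 1 | −4 | −6 | 20 | −6 | −4 | −17 | 5 | 12
36 | 1 | −2 | −11 | 15 | −2 | −2 | −2 | 3 | 0
37 | 5 | 3 | −1 | −12 | 3 | 3 | −25 | 13 | 12
38 | 4 | 2 | −6 | −1 | −1 | 2 | 0 | −1 | 1
39 | 9 | 2 | −9 | −3 | −1 | 2 | 12 | −8 | −5
40 | 9 | 3 | −6 | −13 | 3 | 3 | 4 | −10 | 5
41 | −2 | −5 | −10 | 28 | −5 | −5 | 5 | −3 | −3
42 | 1 | 0 | −4 | 4 | −1 | 0 | −11 | 1 | 10
43 | 1 | 1 | 2 | −7 | 1 | 1 | −20 | 9 | 11
44 | −1 | 0 | −4 | 9 | −3 | −1 | −29 | 9 | 20
45 | 6 | 6 | 2 | −28 | 6 | 6 | −2 | 2 | 0
46 | 3 | 3 | −2 | −8 | 0 | 3 | −8 | 0 | 9
47 | 7 | 7 | 5 | −34 | 7 | 7 | −7 | 7 | 0
48 | −1 | −1 | −4 | 10 | −4 | −1 | −4 | −2 | 7
49 | 5 | 2 | 3 | −13 | 2 | 2 | −17 | 6 | 11
50 | −8 | −8 | −10 | 44 | −9 | −9 | 4 | −8 | 5
51 | 2 | 2 | −1 | −1 | −1 | −1 | 12 | −7 | −5
52 | 4 | 4 | 2 | −14 | 0 | 4 | 9 | −10 | 1
53 | 2 | −1 | 0 | −1 | −2 | 2 | −2 | −9 | 11
54 | 8 | 6 | −3 | −22 | 6 | 6 | 2 | −3 | 1
55 | 0 | −1 | −9 | 13 | −1 | −1 | −26 | 47 | −21
56 | −1 | −2 | −5 | 13 | −2 | −2 | −16 | 33 | −17
57 | 8 | 5 | 4 | −26 | 5 | 5 | −6 | −2 | 8
58 | 9 | 1 | −4 | −6 | −1 | 1 | −11 | −9 | 19
59 | −2 | −4 | −7 | 22 | −4 | −5 | −10 | 5 | 5
60 | 0 | 0 | −2 | 1 | 0 | 0 | −2 | −1 | 2
AVERAGE | 8 | −3 | 2 | −3 | −3 | −1 | −6 | 2 | 4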
Table 5. Percentage normalized difference (PND) values computed analyzing the 6 blocks of cumulative steps reported in Table 3.
Block | APP1 | APP2 | APP3 | APP4 | APP5 | APP6 | WFT1 | WFT2 | WFT3
I | 29 | −12 | 21 | −14 | −11 | −12 | −15 | 8 | 7
II | 8 | −8 | 5 | −12 | 2 | 6 | −5 | 4 | 1
III | 2 | 1 | −2 | −2 | 0 | 1 | −7 | 2 | 4
IV | 2 | −1 | −8 | 10 | −2 | −2 | −6 | 1 | 4
V | 2 | 1 | −1 | −4 | 0 | 1 | −10 | 3 | 7
VI | 2 | 1 | −2 | −1 | 0 | 1 | −6 | 4 | 1
AVERAGE | 8 | −3 | 2 | −4 | −2 | −1 | −8 | 4 | 4
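The PND values can be checked against the cumulative steps of Table 3. The sketch below is a hedged illustration only: it assumes that PND is the signed deviation of each tracker from the mean of its own group (the six APPs, or the three WFTs), expressed as a percentage of that mean and rounded to the nearest integer. This assumption reproduces the Block I row; the helper name `pnd` is hypothetical.

```python
from statistics import mean

def pnd(group_values: list[int]) -> list[int]:
    """Signed percentage deviation of each value from the group mean.

    Assumed definition: PND_i = 100 * (x_i - mean(x)) / mean(x),
    where the mean is taken within one tracker group (APPs or WFTs).
    """
    group_mean = mean(group_values)
    return [round(100 * (v - group_mean) / group_mean) for v in group_values]

# Block I cumulative steps, copied from Table 3.
apps_block_1 = [72465, 49367, 67983, 48329, 50036, 49664]  # APP1..APP6
wfts_block_1 = [76920, 96923, 96153]                       # WFT1..WFT3

apps_pnd = pnd(apps_block_1)
wfts_pnd = pnd(wfts_block_1)
print(apps_pnd, wfts_pnd)
```

Under this assumption the APPs give 29, −12, 21, −14, −11, −12 and the WFTs give −15, 8, 7, which agrees with the Block I row of Table 5. Normalizing each group against its own mean measures how much the trackers of one type disagree with each other, not how far they are from a ground truth.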
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Piccinini, F.; Martinelli, G.; Carbonaro, A. Accuracy of Mobile Applications versus Wearable Devices in Long-Term Step Measurements. Sensors 2020, 20, 6293. https://doi.org/10.3390/s20216293

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers.