Abstract
Background/aim Consumer-based physical activity (PA) monitors have become popular tools to track PA behaviours. Currently, little is known about the validity of the measurements provided by consumer monitors. We aimed to compare measures of steps, energy expenditure (EE) and active minutes of four consumer monitors with one research-grade accelerometer within a semistructured protocol.
Methods Thirty men and women (18–80 years old) wore Fitbit One (worn at the waist), Fitbit Zip (waist), Fitbit Flex (wrist), Jawbone UP24 (wrist) and one waist-worn research-grade accelerometer (ActiGraph) while participating in an 80 min protocol. A validated EE prediction equation and active minute cut-points were applied to ActiGraph data. Criterion measures were assessed using direct observation (step count) and a portable metabolic analyser (EE, active minutes). A repeated measures analysis of variance (ANOVA) was used to compare differences between consumer monitors, ActiGraph, and criterion measures. The same repeated measures ANOVA was applied to a subgroup of subjects who did not cycle.
Results Participants took 3321±571 steps, had 28±6 active min and expended 294±56 kcal based on criterion measures. Comparatively, all monitors underestimated steps and EE by 13%–32% (p<0.01); additionally the Fitbit Flex, UP24, and ActiGraph underestimated active minutes by 35%–65% (p<0.05). Underestimations of PA and EE variables were found to be similar in the subgroup analysis.
Conclusion Consumer monitors had similar accuracy for PA assessment as the ActiGraph, which suggests that consumer monitors may serve to track personal PA behaviours and EE. However, due to discrepancies among monitors, individuals should be cautious when comparing relative and absolute differences in PA values obtained using different monitors.
- Physical activity measurement
- energy expenditure
- active minutes
- steps/day
- simulated free-living protocol
- accelerometry
Introduction
Physical activity (PA) reduces risk of obesity, diabetes, hypertension, cardiovascular disease and all-cause mortality.1 National PA guidelines recommend that adults participate in ≥150 min/week of moderate-intensity or vigorous-intensity PA as a minimal amount of PA needed to increase or maintain health.1 Accurate assessment of individuals’ PA, including volume, intensity and time, is important.2
Accelerometers provide an objective assessment of PA by measuring accelerations of the body and translating these accelerations into PA variables such as steps, energy expenditure (EE) and time spent in moderate-intensity or vigorous-intensity PA (active minutes). However, accelerometers have mainly been used in research settings and are rarely used by consumers for personal PA tracking.
More recently, consumer-based PA monitors, which use accelerometer technology, have become popular as personal PA tracking tools. In 2013, consumer monitor sales were estimated at $330 million worldwide, with approximately 87% coming from Fitbit (Fitbit, San Francisco, California, USA) and Jawbone (Jawbone, San Francisco, California, USA) brands.3 Like research-grade accelerometers, consumer monitors estimate PA variables including steps taken, EE and active minutes. However, consumer monitors offer numerous advantages over most research-grade accelerometers, including real-time feedback, easy synchronisation to smartphone or computer applications, and goal-tracking.
Despite the recent widespread adoption of consumer monitors, little research has compared their accuracy with research-grade accelerometers. Lee et al compared the accuracy of consumer monitors (Fitbit One (FO), Fitbit Zip (FZ), Jawbone UP24 (JU)) and one popular research-grade accelerometer, the ActiGraph (AG; ActiGraph, Pensacola, Florida, USA), with indirect calorimetry for estimating EE in a semistructured setting. These researchers found similar accuracy of the consumer monitors to the research-grade accelerometer, with mean absolute per cent error (MAPE) of 10%–12% for the consumer monitors and 12.6% for the research-grade accelerometer compared with indirect calorimetry.4 In a similar investigation, Bai et al assessed the accuracy of the Fitbit Flex (FF), JU, and AG against indirect calorimetry for measuring EE in a semistructured setting, finding that the FF, JU, and AG all had MAPE of <20% for EE measurements.5 Finally, Murakami et al assessed the accuracy of 12 activity monitors, including FF, JU, and AG for measuring EE in a free-living setting compared with doubly labelled water. Estimates from the 12 devices ranged from 590 to 69 kcal/day lower than the doubly labelled water measure.6 While these studies provide insight into EE prediction, they did not assess accuracy of steps or active minute estimates. Storm et al compared seven activity monitors, including two research-grade accelerometers, the MoveMonitor and activPAL (PAL Technologies, Glasgow, UK). All monitors underestimated steps compared with the criterion measure; however, the MoveMonitor accelerometer had the best performance with less than 2% error at all walking speeds. Two recent studies showed moderate or strong correlations between consumer monitors and the AG for step counting (r=0.80–0.91), EE (r=0.74–0.81) and active minutes (r=0.52–0.91) under free-living conditions. However, these studies lacked a criterion measure, so accuracy of these devices could not be determined.7 8
Given the widespread use of consumer monitors and other PA and health monitoring tools, it is important to understand how consumer monitors compare with a widely used research-grade accelerometer in terms of accuracy of PA measurement, and to allow comparisons between studies that use different types and brands of monitors. Thus, we compared four popular consumer monitors with a commonly used research-grade accelerometer in a semistructured environment using a protocol that incorporates activities similar to those performed by adults on a daily basis.
Methods
Healthy adult men (n=15) and women (n=15) who were 18–80 years of age and able to participate in moderate-to-vigorous PA participated in this study. The study was approved by Ball State University’s Institutional Review Board, and all subjects provided informed consent prior to participation. Age, height and weight of subjects were 49.2±19.2 years, 174.0±8.9 cm, and 79.2±15.5 kg, respectively. Subjects were predominantly right-hand dominant (93.4%).
Four consumer monitors (FO, FZ, FF and JU), one research-grade accelerometer (AG) and a COSMED K4b2 (COSMED Srl, Rome, Italy) portable metabolic analyser were used in this study. All equipment was initialised with subject’s sex, height, weight, and age, and synchronised to an external clock at the beginning of each visit.
The FO and FZ were mounted on a waist-worn elastic belt over the left hip, near the anterior axillary line, and were counterbalanced for anterior and posterior placement on the hip among subjects. The FF and JU were worn on the non-dominant wrist and counterbalanced for proximal and distal wrist placement among subjects. Two iPod Touch (Apple, Cupertino, California, USA) media players equipped with the Fitbit and Jawbone applications were synced to the FF and JU, and steps, EE, and active minutes were recorded. For the FO and FZ, steps and EE were recorded from the screen displays; active minutes were not assessed as these data are not available from the screen displays.
The AG (GT3X+), a commonly used accelerometer, was placed by research staff over participants’ right hip on an elastic waistband at the anterior axillary line. AG data were recorded at a frequency of 60 Hz and analysed in 30 s epochs. All time with ≥2691 vector magnitude counts/min was used to estimate active minutes (time in moderate-to-vigorous PA (MVPA); ≥3 metabolic equivalents (METs)).9 The work-energy theorem and Freedson 2011 combination was used to calculate EE across the entire protocol from the AG. When selected, this combination automatically uses the work-energy theorem to calculate EE for sedentary and light PA, and the Freedson 2011 equation to calculate EE for MVPA.10 All calculations were performed via ActiLife 6 software (ActiGraph).
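The cut-point rule above can be illustrated with a minimal sketch. It is not the ActiLife implementation: the function names are hypothetical, and it assumes the data have already been aggregated to per-minute axis counts, because the text does not state how the 30 s epochs were combined before the counts/min threshold was applied. The only values taken from the text are the 2691 counts/min cut-point and the use of vector magnitude.

```python
import math

# Cut-point quoted in the Methods (vector magnitude counts per minute).
VM_CUTPOINT_CPM = 2691


def vector_magnitude(x, y, z):
    """Combine the three axis counts into a single vector magnitude."""
    return math.sqrt(x * x + y * y + z * z)


def active_minutes(minute_counts, cutpoint=VM_CUTPOINT_CPM):
    """Count minutes whose vector magnitude meets or exceeds the cut-point.

    minute_counts: iterable of (x, y, z) axis counts, one tuple per minute.
    Assumption: counts are already summed to 1 min resolution.
    """
    return sum(
        1 for x, y, z in minute_counts
        if vector_magnitude(x, y, z) >= cutpoint
    )


# Illustrative (made-up) data: four minutes of axis counts.
example = [(2691, 0, 0), (1000, 1000, 1000), (3000, 0, 0), (2000, 2000, 0)]
n_active = active_minutes(example)
```

In this made-up example, the second minute falls below the cut-point because its vector magnitude is roughly 1732 counts/min, even though all three axes registered movement.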
The COSMED K4b2 was used to measure oxygen consumption (VO₂) and carbon dioxide production during the study protocol. Breath-by-breath measurements were collected via a breathing mask worn by participants and were used to determine VO₂ in litres per minute (L/min), which was converted by a technician to EE by multiplying by 5 kcal per L of O₂.11 All time with an EE ≥3.0 METs was summed for a measure of active minutes. The COSMED has been shown to provide accurate and reliable measures of VO₂ over a wide range of activity intensities in comparison with metabolic carts and was used as the criterion measure for EE and active minutes in this study.12
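The criterion calculations described above can be sketched as follows. The 5 kcal per litre of O₂ factor and the 3.0 MET threshold come from the text. The conversion of 1 MET to 3.5 mL O₂/kg/min is the conventional definition and is an assumption here, since the text does not say how METs were derived. The function names and the per-minute averaging of breath-by-breath data are also hypothetical.

```python
KCAL_PER_L_O2 = 5.0              # conversion factor quoted in the Methods
ML_O2_PER_KG_MIN_PER_MET = 3.5   # conventional 1 MET value; an assumption here
MVPA_THRESHOLD_METS = 3.0        # threshold quoted in the Methods


def energy_expenditure_kcal(vo2_l_per_min):
    """Total EE (kcal) from per-minute mean VO2 values given in L/min."""
    return sum(v * KCAL_PER_L_O2 for v in vo2_l_per_min)


def mets(vo2_l_per_min, body_mass_kg):
    """Convert absolute VO2 (L/min) to METs for a given body mass."""
    vo2_ml_per_kg_min = vo2_l_per_min * 1000.0 / body_mass_kg
    return vo2_ml_per_kg_min / ML_O2_PER_KG_MIN_PER_MET


def criterion_active_minutes(vo2_l_per_min, body_mass_kg,
                             threshold=MVPA_THRESHOLD_METS):
    """Count minutes at or above the MET threshold."""
    return sum(
        1 for v in vo2_l_per_min
        if mets(v, body_mass_kg) >= threshold
    )


# Illustrative (made-up) data: four minutes for an 80 kg participant.
vo2 = [0.3, 0.9, 1.5, 0.8]
total_kcal = energy_expenditure_kcal(vo2)
active = criterion_active_minutes(vo2, body_mass_kg=80.0)
```

Under these assumptions, an 80 kg participant reaches 3 METs at about 0.84 L/min, so only the second and third minutes in the made-up series count as active.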
Subjects participated in an 80 min, semistructured activity protocol, performing ≥12 activities from a list of 21 choices. Activities were grouped into the following categories: (1) sedentary activities (lying down, watching television, writing, reading, playing cards, and computer use), (2) household activities (standing, dusting, sweeping, vacuuming, folding laundry, making bed, picking up items from floor, and gardening) and (3) ambulatory and cycling activities (slow overground walk, brisk overground walk, treadmill walk, overground jog, treadmill jog, stair climbing, and stationary cycling). Subjects chose the pace, duration (2–15 min) and order of activities. At least four activities from each category (sedentary, household, and ambulatory) were performed, and subjects were instructed to spend ≥40 min in sedentary activities (to replicate adults’ free-living sedentary behaviour patterns).12 13 PA variables were recorded from the consumer monitors at the beginning and end of the protocol; therefore, activity-specific analyses could not be conducted for these data.
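The protocol rules above can be written down as a simple check on one subject's session. This is only an illustration of the stated constraints: the function and field names are hypothetical, cycling is grouped under the "ambulatory" label as in category (3), and treating the 80 min total as an exact requirement is an assumption.

```python
REQUIRED_CATEGORIES = ("sedentary", "household", "ambulatory")


def protocol_problems(session):
    """Return a list of protocol rules that a session breaks.

    session: list of (activity_name, category, minutes) tuples.
    An empty list means every rule quoted in the Methods is satisfied.
    """
    problems = []

    # At least 12 different activities from the list of 21.
    names = {name for name, _, _ in session}
    if len(names) < 12:
        problems.append("fewer than 12 distinct activities")

    # At least four activities from each category.
    for category in REQUIRED_CATEGORIES:
        in_category = {n for n, c, _ in session if c == category}
        if len(in_category) < 4:
            problems.append(f"fewer than 4 {category} activities")

    # Each activity lasts 2 to 15 min.
    if any(not 2 <= minutes <= 15 for _, _, minutes in session):
        problems.append("an activity lasted outside 2-15 min")

    # At least 40 min of sedentary time.
    sedentary = sum(m for _, c, m in session if c == "sedentary")
    if sedentary < 40:
        problems.append("less than 40 min sedentary")

    # Assumption: the protocol total is exactly 80 min.
    if sum(m for _, _, m in session) != 80:
        problems.append("total duration is not 80 min")

    return problems


# Illustrative (made-up) session that satisfies every rule.
valid_session = (
    [(n, "sedentary", 10) for n in
     ("lying down", "watching television", "writing", "reading")]
    + [(n, "household", 5) for n in
       ("standing", "dusting", "sweeping", "vacuuming")]
    + [(n, "ambulatory", 5) for n in
       ("slow overground walk", "brisk overground walk",
        "treadmill walk", "stair climbing")]
)
```

Calling protocol_problems(valid_session) returns an empty list. Dropping any one activity from the made-up session triggers several rules at once.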
A trained research assistant counted and recorded steps during the entire protocol using a handheld tally counter; this served as the criterion measure of steps taken. A step was defined as lifting the entire foot and then placing it on the ground. During cycling, steps were counted for each pedal stroke, or two steps per revolution.
Repeated measures analysis of variance (ANOVA) tests were performed to assess differences among the four consumer monitors, the AG, and the criterion measures for steps, EE, and active minutes. This analysis was conducted for the total sample, as well as for a subgroup of the sample who did not perform cycling (n=9). When the test statistic was significant, post hoc pairwise comparisons were performed using paired t-tests and a least significant difference correction. MAPE and per cent bias (%bias) were calculated to quantify the predictive error of each PA monitor relative to the criterion measures: indirect calorimetry for EE and active minutes, and the steps counted by a trained research assistant throughout the protocol. Bland-Altman analyses were performed to determine the limits of agreement for each device compared with the criterion measure. Finally, a correlation analysis was performed among all devices, including the criterion measures.
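The error statistics named above can be sketched in a few lines. The exact formulas are not given in the text, so the definitions below are the commonly used ones and should be read as assumptions. MAPE is taken as the mean of the absolute errors expressed as a percentage of the criterion value. The %bias is the signed equivalent. The Bland-Altman limits are the mean difference ±1.96 SD of the differences. Function names and data are hypothetical.

```python
from statistics import mean, stdev


def mape(estimates, criteria):
    """Mean absolute per cent error of monitor estimates vs criterion."""
    return 100.0 * mean(abs(e - c) / c for e, c in zip(estimates, criteria))


def percent_bias(estimates, criteria):
    """Signed mean per cent error; negative values mean underestimation."""
    return 100.0 * mean((e - c) / c for e, c in zip(estimates, criteria))


def bland_altman(estimates, criteria):
    """Return (mean difference, lower limit, upper limit of agreement)."""
    diffs = [e - c for e, c in zip(estimates, criteria)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)  # sample SD of the paired differences
    return bias, bias - spread, bias + spread


# Illustrative (made-up) step counts for three participants.
monitor = [90, 110, 80]
criterion = [100, 100, 100]

error_pct = mape(monitor, criterion)
bias_pct = percent_bias(monitor, criterion)
mean_diff, lower, upper = bland_altman(monitor, criterion)
```

The made-up data show why both statistics are reported: over- and underestimates partly cancel in %bias but not in MAPE, so a monitor can have a small bias and still have a large individual-level error.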
Results
During the protocol, subjects averaged 3231±571 steps, a total EE of 294±56 kcal and 28±6 active minutes. Compared with criterion measures, all monitors underestimated steps and EE (table 1; p<0.01); additionally all monitors that assessed active minutes (FF, JU, and AG) underestimated this variable (p<0.05). The FF had the lowest %bias (table 1) when counting steps and predicting EE, although this difference was only statistically significant when compared with the JU and AG (p<0.05). The JU had the lowest %bias in predicting active minutes; however, this difference was only significantly different from the FF (p<0.05). Similar PA underestimations were seen in the subgroup who did not cycle, although in some cases the differences were no longer statistically significant due to the small sample size (table 2).
While all consumer monitors tested had higher accuracy than the AG for at least one PA variable, none of the consumer monitors had higher accuracy than the AG for all PA variables tested. The FO, FZ, and FF had significantly more steps compared with the AG (mean difference: 199–302 steps, p<0.05), whereas there was no significant difference in steps recorded by the JU compared with the AG (mean difference: 145 steps, p=0.21). The FZ produced similar estimates of EE compared with the AG; however, EE estimates were significantly higher for the FO and FF and lower for the JU (mean difference: −7 to 38 kcal). The FF recorded significantly fewer active minutes than the AG (p=0.001), while the JU similarly estimated active minutes compared with the AG.
Figure 1 reports MAPE for all PA monitors tested. The MAPE for the FF was significantly smaller than all other monitors (p<0.05), except the FZ when measuring steps and significantly smaller than the FO and JU when estimating EE (p<0.05). For active minutes, the JU and AG showed similar MAPE, both of which were significantly lower than the FF (p<0.05).
*Significantly different from the ActiGraph accelerometer (p<0.05).
The Bland-Altman plots for steps, EE, and active minutes are shown in figures 2A–E, 3A–E and 4A–C, respectively. There were wide limits of agreement for all devices and variables, indicating high individual predictive error. Additionally, the Bland-Altman plots show a trend of underestimation for all devices across all variables, with similar levels of error and variability for the consumer monitors as with the AG.
The correlation analysis for all devices is shown in table 3. For steps and EE, all monitors were significantly correlated with each other and with the criterion measures (p<0.05). The FF had poor, non-significant correlations with the criterion measure and the AG for active minutes. While not statistically evaluated, the wrist-worn monitors (FF and JU) appeared more highly correlated with each other than with the hip-worn monitors (FO, FZ, AG), and the hip-worn monitors likewise appeared more highly correlated with each other than with the wrist-worn monitors. The criterion measure correlations also appeared higher with the hip-worn monitors than with the wrist-worn monitors.
Discussion
Our primary finding was that the consumer monitors underestimated steps, EE, and active minutes to a similar degree as the AG, with no monitor consistently outperforming the others. Our findings extend previous research that assessed these PA variables individually, illustrating a similar degree of error and good agreement in PA estimates produced by consumer monitors and the AG when measuring steps, EE, and active minutes.4–6 14 A previous analysis by our research group indicated that, while sedentary and ambulatory (walking and jogging) activities could be measured accurately by consumer monitors, the consumer monitors tended to underestimate PA for activities in the household category and for cycling.15
All participants in the current study performed activities in the household category, but a subgroup analysis was performed for subjects who chose not to cycle in the protocol to better understand what activities contributed to underestimates of PA. Although the small sample (n=9) resulted in lack of statistical significance for some comparisons, there were still underestimations of PA variables by all monitors. Therefore, underestimations of overall PA found by all monitors in the current study were most likely driven by the periods of time during the protocol in which the subjects performed household activities, where shuffling of feet and slower walking speeds are common. For individuals with high sedentary time and/or ambulatory time throughout the day, these monitors will likely yield higher measurement accuracy than for individuals who spend more time in non-sedentary, non-ambulatory activities, such as cycling or household activities.
The role of the research-grade accelerometer in our study
Including a research-grade accelerometer (AG) as a comparison is a distinct advantage of the current study, as the AG has been used extensively to characterise activity levels in various healthy and clinical populations, including in the National Health and Nutrition Examination Survey.16 Moreover, these devices have been used more commonly than any other accelerometer brand in intervention protocols to track adherence to prescribed PA.17 Furthermore, the 2008 Physical Activity Guidelines for Americans recommends prescribing exercise using PA measures of intensity, active time and volume (reported as MET-min/week), and these recommendations are based on measurements collected using accelerometry.1 Thus, understanding how well consumer monitors measure PA compared with a commonly used research-grade accelerometer is important when comparing data collected from different consumer monitors.
Notably, the AG, coupled with traditional (cut-point-based) data analysis methods, showed the same trend as the consumer monitors in underestimating PA variables in a simulated free-living environment, with the greatest error found for MVPA (active minutes). Fitbit’s website states that active minutes must be accumulated in 10 min bouts to be recorded, similar to recommendations from PA guidelines; this feature of their proprietary software likely contributed to the underestimation of active minutes by all Fitbit monitors in this study.18 Individuals considering using PA monitoring devices should be aware of this underestimation when tracking their progress and adherence to PA recommendations and/or exercise prescription. The use of PA logs to supplement objective data collection or the use of alternate exercise prescriptions based on daily step count should be considered to better capture PA, especially non-ambulatory types.
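The effect of a 10 min bout requirement on active minutes can be illustrated with a short sketch. Fitbit's software is proprietary, so this is not its algorithm. It only shows, under the assumption that a bout is an uninterrupted run of active minutes, how short bursts of activity are discarded. The function name and data are hypothetical.

```python
def bouted_active_minutes(is_active, min_bout=10):
    """Sum active minutes that fall in uninterrupted runs of >= min_bout.

    is_active: sequence of booleans, one per minute.
    Assumption: any inactive minute ends the current bout.
    """
    total = 0
    run = 0
    for flag in is_active:
        if flag:
            run += 1
        else:
            if run >= min_bout:
                total += run
            run = 0
    # Close a bout that is still open at the end of the record.
    if run >= min_bout:
        total += run
    return total


# Illustrative (made-up) record: bouts of 6, 12 and 9 active minutes.
minutes = [True] * 6 + [False] * 2 + [True] * 12 + [False] + [True] * 9
unbouted = sum(minutes)                   # every active minute counts
bouted = bouted_active_minutes(minutes)   # only the 12 min bout counts
```

In this made-up record the 6 min and 9 min bursts are dropped, so fewer than half of the active minutes survive the bout rule. In a protocol where subjects chose activity durations of 2–15 min, many activities would be too short to register under such a rule.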
Role of the monitor location (hip, wrist)
We were not surprised that the hip-worn monitors correlated more highly with each other than with the wrist-worn monitors (and vice versa) and also appeared to have higher correlations with the criterion measures. These findings are in agreement with past literature, which generally supports higher PA measurement accuracy with hip-worn than with wrist-worn accelerometers.19–21 However, wrist-worn devices are more popular, given that most consumer monitors are designed for wear on the wrist, and previous work also supports higher compliance with wrist-worn devices compared with hip-worn devices.22 Therefore, the choice of a wrist-worn or hip-worn device will depend on the relative importance of compliance and accuracy in a study.
Research and clinical implications
Although the consumer monitors tested in this study had similar accuracy to the AG monitor and its associated linear regression EE and activity intensity prediction, our findings should not be taken to mean that consumer monitors are on par with the highest accuracy monitoring devices available. For example, the activPAL, another popular research-grade accelerometer, has shown higher accuracy than the AG for measurement of sedentary behaviour and steps but poor accuracy for measurement of EE.23–25 Additionally, as techniques for analysing research-grade accelerometer data improve (eg, through pattern recognition), the AG and other research-grade devices may become a more accurate method for assessing PA levels.
Our results indicate that a select sample of consumer monitors showed similar strengths and weaknesses, and produced similar PA estimates, to a single widely used research-grade device and its associated prediction equation. Given the ease of use, relatively low cost, and comparable accuracy of consumer monitors to the AG in monitoring PA, consumer monitors may have utility in monitoring PA behaviours. Most consumer monitors also provide real-time feedback, which may influence behaviour via the Hawthorne effect or a better awareness of PA patterns. Therefore, consumer monitors may be less appropriate for use in surveillance studies or to assess the effectiveness of interventions because of their potential to influence PA behaviour more than the AG or other research-grade accelerometers that do not give immediate feedback. Conversely, their real-time feedback may serve as a valuable intervention strategy/motivational tool that could help promote adoption/maintenance of healthy PA habits in the general population; if used for such purposes, the accuracy of the monitors would be of less importance than the fact that they may encourage people to be more active. However, individuals should be cautious when comparing PA values from different consumer monitors due to discrepancies seen in this and previous studies.4–6 Additionally, individuals should consider these monitors’ potential underestimation of PA variables when interpreting feedback from the monitors and consider whether other types of monitors (such as the activPAL) may give more accurate data to meet their specific needs. Thus, the decision to use consumer monitors or research-grade monitors will depend on the goals of the assessment, the options available to the researcher or individual and the relative importance of usability versus accuracy of measurement.2
Limitations
This study comprised a range of activities performed during a semistructured protocol, but had a small sample size. Given the slower movement speeds of older, diseased or disabled adults and the poorer accuracy of PA monitors at slower movement speeds,26 measurement error may be greater in these populations than indicated by our results. Additionally, although the activities performed in the protocol were similar to those commonly performed on a daily basis, the study did not use a true free-living setting. Measurement error in a true free-living environment may differ from that indicated by the current protocol, especially if the proportion of the day spent in different types of activities differs from that in the current study, where participants spent roughly 50%–60% of the time in sedentary activities.27 This study only assessed the total measurement values for steps, EE, and active minutes over the observed period for all monitors assessed, and therefore time-matched analyses for each activity performed are not provided in the current results. However, this analysis can be found in a similar study from our laboratory.15 The current study defined a step as lifting the entire foot and then placing it on the ground, which may not be the same method as the PA monitors use to recognise a step. This may have contributed to the monitors’ underestimation of steps specifically. Finally, although a subgroup analysis was performed for individuals who did not cycle in the protocol, the sample size was only nine participants. Further validation and comparison of these PA monitors is warranted to determine which monitor is best able to capture non-ambulatory activity, household chores, cycling-type exercise (ie, stationary bicycle, recumbent trainer) and sport activities.
Summary and conclusion
In conclusion, consumer monitors and the AG research-grade accelerometer underestimated PA variables and EE to a similar degree relative to criterion measures. Due to the underestimates of PA and discrepancies between consumer monitors, researchers and consumers should be cautious when comparing PA values that were obtained using different monitors.
What are the findings?
All PA monitors tested (consumer-based and research-grade) tended to underestimate PA measurements compared with criterion methods.
The hip-worn monitors correlated more highly with each other than with the wrist-worn monitors (and vice versa) and also appeared to have higher correlations with the criterion measures.
How might it impact on clinical practice in the future?
Due to the differences found in the measures of PA between the consumer-based PA monitors and the ActiGraph during a simulated free-living protocol, one should be cautious when comparing PA values obtained using these monitors in free-living settings.
Given the ease of use, relatively low cost, immediate feedback capabilities and comparable accuracy of consumer monitors with the AG in monitoring PA, consumer monitors may have utility in monitoring PA behaviours and promoting adoption/maintenance of healthy PA habits in the general population.
This study shows hip-worn devices to have higher accuracy compared with wrist-worn devices; however, past literature has shown higher compliance using wrist-worn devices. Therefore, the choice of wrist-worn or hip-worn devices will depend on the importance of compliance versus accuracy.
Footnotes
Funding This study received funding support, in part, from the Ball State University ASPiRE grant.
Competing interests None declared.
Patient consent Obtained.
Ethics approval Ball State University Institutional Review Board.
Provenance and peer review Not commissioned; externally peer reviewed.