Abstract
Objective Measuring physical activity is a key part of studying its health effects. Questionnaires and pedometers each have weaknesses but are the cheapest and easiest to use measurement methods for large-scale studies. We examined their capacity to detect expected associations between physical activity and a range of surrogate health measures.
Design Cross-sectional analysis of 669 community-dwelling participants (mean age 63.3 (7.7) years) who completed the Physical Activity Scale for the Elderly (PASE) questionnaire and who, within 2 weeks, wore a pedometer for 7 days.
Results PASE score and step count were only poorly correlated (r = 0.37 in women, r = 0.30 in men). Of 12 expected associations examined between activity and surrogate markers of health, 10 were detected as statistically significant by step counts but only 3 by PASE scores. Significant associations in the expected direction were found between step counts and high-density lipoprotein, body mass index, waist circumference, waist-to-hip ratio, blood glucose level, white cell count and fibrinogen. There was no association with either systolic or diastolic blood pressure. The association between PASE score and these markers was detected as significant only for body mass index and waist circumference in women and waist-to-hip ratio in both sexes. Associations were stronger for steps multiplied by stride length than for raw step count.
Conclusions Pedometer-derived step counts are a more valid measurement of overall physical activity in this sample than PASE score. Researchers should use objective measures of physical activity whenever possible.
Maintaining an active lifestyle in later years has been associated with a decreased risk of falls, fractures and malignancies,1 2 prevention of age-associated declines in bone density,3 and maintenance of muscular strength4 and cardiovascular fitness.5 Measuring physical activity6 is difficult, however, because it comprises multiple component activities undertaken for recreation, employment and transport, as well as household activities of daily living.
The quality of measurement of physical activity in research is an important cause for concern as poor-quality measurement tools can obscure important associations. A recent systematic review7 of agreement in measures of activity energy expenditure between physical activity questionnaires and doubly labelled water as a criterion measure showed correlation coefficients of only r = 0.3 to 0.4. Shorter questionnaires generally show better reliability and validity than longer ones, possibly due to fatigue or boredom interfering with the completion of longer instruments.8
There are various ways to measure physical activity that capture its different aspects. Pedometers measure steps, but not intensity of effort, and do not capture activity such as cycling, swimming and upper limb exercise. Pedometers can also be subject to sampling error if the period during which they are worn is not representative of the person's usual activity. One possible cause is reactivity, also known as a “Hawthorne effect”, in which individuals modify their behaviour because it is being recorded. Reactivity has been described as increasing daily step counts by 1800 in young adults,9 although another study showed no effect.10 Accelerometers measure amount and intensity of activity but are more expensive than pedometers ($A400 versus $A20), and neither instrument can be worn during some activities such as swimming.
Questionnaires capture the full range of activity and can allow grading of activity by intensity, such as converting activities into metabolic equivalents and estimation of total energy consumed. Questionnaires are, however, subject to recall and reporting bias, particularly the problem of asking respondents to describe the intensity of activity, and the social desirability of “correct” answers. Given the importance of accurate measures of physical activity for epidemiological research, we examined two approaches: a questionnaire specifically developed for use in older populations (Physical Activity Scale for the Elderly (PASE)) and pedometers worn for a week.
Assessment of energy expenditure by doubly labelled water is generally regarded as the gold-standard measure of physical activity, but it costs up to $A1000 per person and is not widely available. Without a reference method that can be easily applied to large groups, direct assessment of the validity of the various measurement approaches is not possible. Much of the work establishing the validity of physical activity measurement scales relies on a simple correlation coefficient between measures; however, this can be affected by confounders, particularly age. As an indirect way of establishing validity, we have instead assessed the association of physical activity measures with markers of health or fitness that are well-established sequelae of physical activity. Fibrinogen and white cell count (WCC) are included as markers of a low-grade inflammatory state. These surrogate health markers are all significantly associated with mortality in cohort studies. With an adequate sample size, this approach allows the use of regression modelling to adjust for confounders. In this study, we investigate the predictive ability of step counts and PASE score as alternative measures of physical activity against a number of surrogate markers of health status.
Methods
Setting and population
Community-dwelling men and women aged 55 to 85 years in the Newcastle urban area (NSW, Australia) were randomly selected from the state electoral roll. Listing on the electoral roll is compulsory and estimated to be 93.6% correct.11 Up to two letters of introduction and invitation to participate were posted to the selected persons. Non-responders were telephoned by a research assistant, with up to five attempts made at contact. Persons who could not speak English were deemed ineligible. Once informed consent was obtained, participants were asked to complete a series of self-report postal questionnaires including the PASE and then to attend a study centre for clinical measurements, anthropometry, blood collection and instruction in wearing the pedometer. This study was approved by the University of Newcastle Human Research Ethics Committee.
Measures of physical activity
Two measures of physical activity were used. The first was the PASE questionnaire,12 a 12-item questionnaire covering activity frequency and duration in the previous 7 days that can be administered by phone or self-completed. The PASE was established by Washburn and colleagues in a sample aged 65 years and older and validated in both the USA and Holland but not in Australia. Activity frequency is multiplied by item weights and summed to a total score that can range from 0 to 400 or more. In the original validation study, PASE scores were about 20% lower for women than for men.13 The second measure was pedometers worn for 1 week. Participants were shown how to wear a pedometer (Yamax DW200, Yamax, Tokyo, Japan) clipped to their clothing (either side), close to the anterior iliac spine, from the time of rising until the time of getting undressed in the evening. Participants recorded the daily count in a diary but did not reset the pedometer, so that daily counts could be validated against the weekly total. Pedometers were returned by post. Start and end times were recorded; days with a wear duration of less than 9 h were ignored, and participants with fewer than 3 days of step counts were excluded. To explore three possible ways of interpreting step counts, we used the mean daily step count over the recorded days, the mean step count multiplied by the participant's height, and the mean step count multiplied by the participant's observed stride length to give a presumed distance in kilometres.
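The wear-time rule described above can be illustrated with a minimal Python sketch. It assumes each diary day is available as a (wear hours, steps) pair; the function name, the data layout and the example week are hypothetical and are not taken from the study's own processing.

```python
from statistics import mean

MIN_WEAR_HOURS = 9   # days with a shorter wear duration are ignored
MIN_VALID_DAYS = 3   # participants with fewer valid days are excluded


def mean_daily_steps(days):
    """Return the mean daily step count over valid days, or None if excluded.

    `days` is a list of (wear_hours, steps) pairs, one per recorded day
    (a hypothetical layout chosen for this sketch).
    """
    valid = [steps for wear_hours, steps in days if wear_hours >= MIN_WEAR_HOURS]
    if len(valid) < MIN_VALID_DAYS:
        return None
    return mean(valid)


# Hypothetical diary: the 6.5 h day is dropped, leaving four valid days.
week = [(14.0, 8200), (6.5, 2100), (13.5, 7400), (15.0, 9100), (12.0, 6900)]
print(mean_daily_steps(week))
```

Returning None for an excluded participant keeps the exclusion explicit, so that such participants cannot silently enter a later analysis with a step count of zero.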
Confounders recorded include
▶. alcohol consumption, graded to four levels as non-drinkers, drinking at safe (<40 g/day for men and <20 g/day for women), moderate (40–60 g/day for men and 20–40 g/day for women, on no more than 3 days a week) or excessive levels;
▶. smoking status, graded as never (fewer than 100 cigarettes in their lifetime), ever and current;
▶. age and sex.
Surrogate measures of health risk
A number of measures of health risk were obtained, including
▶. Laboratory measures: fibrinogen was measured using the CA 7000 coagulation analyser and high-density lipoprotein (HDL) cholesterol by the automated HDL Dimension clinical chemistry system using the Flex reagent cartridge (Dade Behring, Deerfield, Illinois, USA); WCC was measured on an automated Coulter counter (Beckman, Fullerton, California, USA).
▶. Anthropometry: body circumference was measured using a non-stretch Teflon tape, the waist being the smallest circumference below the rib cage and above the umbilicus. Hip circumference was taken as the largest circumference at the posterior section of the buttocks.
▶. The “timed up and go” test is a valid and reliable test for quantifying functional mobility.14 The test measures, in seconds, the time taken for an individual to stand up from a standard arm chair (seat height 44–47 cm; arm rest 63–65 cm), walk a distance of 3 m, turn, walk back to the chair and sit down again. All study participants wore their regular footwear and used their customary walking aid if needed. No assistance was given.
▶. Clinical measures: blood pressure was measured following the British Hypertension Society guidelines using a BPM-100 oscillometric blood pressure monitor (VSM MedTech, Coquitlam, British Columbia, Canada).15 Stride length was derived from the number of steps taken to cover 5 m in the clinic.
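The conversion from the clinic walk to a presumed daily distance can be sketched as follows. This is an illustration only: it assumes that "stride length" means the distance covered per counted step (5 m divided by the number of steps taken), and the function names and example values are hypothetical.

```python
WALK_DISTANCE_M = 5.0  # length of the clinic walk used to observe stride length


def stride_length_m(steps_over_walk):
    """Distance per counted step, in metres, from the 5 m clinic walk."""
    return WALK_DISTANCE_M / steps_over_walk


def daily_distance_km(mean_daily_steps, steps_over_walk):
    """Presumed daily distance walked: mean daily steps x stride length."""
    return mean_daily_steps * stride_length_m(steps_over_walk) / 1000.0


# Hypothetical participant: 8 steps to cover 5 m, 8000 steps per day.
print(stride_length_m(8))
print(daily_distance_km(8000, 8))
```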
Statistical methods
For analysis of correlation and in regression models, both PASE scores and step counts were truncated at the 99th centile (PASE 403, step count 19 037) to remove the effect of high outliers. After examining individual variable correlations, linear least squares regression models were used to adjust for age, sex, smoking and alcohol consumption. A partial F test was used to compare r2 values. All calculations were done in Stata V.8.2 (College Station, Texas, USA).16
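As an illustration of the outlier handling, the sketch below caps values at the 99th centile. It assumes that "truncated" means values above the centile are set to the centile rather than dropped, and it uses the default interpolation of Python's `statistics.quantiles`, which may differ from the centile calculation in Stata. The function name and data are hypothetical.

```python
from statistics import quantiles


def truncate_at_99th_centile(values):
    """Cap values above the 99th centile at that centile.

    Returns the capped list and the cap itself. Only high outliers
    are affected; low values are left unchanged.
    """
    cap = quantiles(values, n=100)[98]  # the last of the 99 cut points
    return [min(v, cap) for v in values], cap


# Hypothetical step counts with one implausibly high value.
steps = list(range(1, 100)) + [100000]
capped, cap = truncate_at_99th_centile(steps)
print(cap, max(capped))
```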
Results
Of 2253 subjects approached after random selection from the electoral roll, 1071 (48%) agreed to participate, and data collection was complete for 901 (40%). Of the 901 participants in phase I of recruitment, 669 had pedometry data for at least 9 h on at least 3 days. Comparisons between responders and non-responders showed no statistically significant difference in sex (male: 46.1% vs 46.9%). However, responders were slightly younger than non-responders (66.3 (7.8) years vs 68.6 (8.4) years, p<0.001) (table 1).
The distributions of PASE scores and step counts by age and sex are shown in fig 1. In our population, there is no sex difference in the number of steps taken, but the women show lower PASE scores; mean of 148 for women (95% CI 142 to 154), 189 for men (95% CI 180 to 198). After adjustment for age, step counts were not associated with income, employment or education.
The scatter plots for PASE versus step count for men and women are shown in fig 2; there is a statistically significant but small correlation of 0.32 between the two measures in the overall group; the correlation is slightly higher in women (0.37) than in men (0.30).
To examine the association of physical activity with surrogate markers of general health, we compared models for the prediction of HDL cholesterol, BMI, waist circumference, waist-to-hip ratio, fasting blood glucose, WCC, fibrinogen, systolic blood pressure, and diastolic blood pressure in regression models adjusting for age, sex, smoking and alcohol intake; the physical activity term was taken as either step count or PASE score. The interaction term sex × steps was significant (p<0.05) for HDL, BMI and waist circumference; so for these variables, men and women were analysed separately. Table 2 shows the change in each variable for an increase of 1 SD in step count or PASE score, with associated p and r2 values for each model. It can be seen that with the exception of blood pressure, the r2 value is consistently higher for models with step count than for those using PASE. Of the 12 comparisons in table 2, 10 show a statistically significant association with step counts, but only 3 show a significant association with PASE scores.
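The "change per 1 SD" scale can be illustrated with a simplified, unadjusted sketch: the regression slope is multiplied by the standard deviation of the activity measure, and r2 is the squared correlation. Unlike the models reported here, this sketch has a single predictor and makes no adjustment for age, sex, smoking or alcohol intake; the function name and the data are hypothetical.

```python
from statistics import mean, stdev


def change_per_sd(x, y):
    """Return (change in y per 1 SD increase in x, r squared) for y ~ x.

    Simple least squares with one predictor and no covariate adjustment.
    """
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx                   # change in y per unit of x
    r_squared = sxy ** 2 / (sxx * syy)  # proportion of variance explained
    return slope * stdev(x), r_squared


# Hypothetical data: daily steps and body mass index for five people.
daily_steps = [2000, 4000, 6000, 8000, 10000]
bmi = [30.0, 29.0, 28.0, 27.0, 26.0]
effect, r2 = change_per_sd(daily_steps, bmi)
print(round(effect, 2), round(r2, 2))
```

Expressing both activity measures per 1 SD puts step count and PASE score on a common scale, which is what makes their coefficients comparable despite their different units.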
Forcing both PASE and mean step counts into the model shows that PASE does not add explanatory power and contributes a minimal rise in r2 for only four of the models and reduces it in the others. This indicates that PASE does not capture a complementary aspect of physical activity to step count.
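Whether adding a term to a nested linear model improves r2 by more than chance can be checked with a partial F statistic. The sketch below uses the standard formula in terms of r2; it is not the study's Stata code, and the r2 values, the number of predictors and the function name are hypothetical.

```python
def partial_f(r2_full, r2_reduced, n, k_full, q):
    """F statistic for adding q terms to a nested linear model.

    r2_full, r2_reduced: r2 of the larger and smaller models
    n: number of observations
    k_full: predictors in the larger model, excluding the intercept
    q: number of added terms being tested
    """
    numerator = (r2_full - r2_reduced) / q
    denominator = (1.0 - r2_full) / (n - k_full - 1)
    return numerator / denominator


# Hypothetical r2 values, not taken from the tables in this paper.
f_stat = partial_f(r2_full=0.10, r2_reduced=0.08, n=669, k_full=6, q=1)
print(round(f_stat, 2))
```

The statistic is referred to an F distribution with q and n - k_full - 1 degrees of freedom; obtaining the p value needs a statistics package and is omitted here.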
We then examined the explanatory power of alternate formulations of step count by comparing the r2 values of models as above but substituting either height × steps or kilometres walked for daily step count, as shown in table 3. Kilometres walked consistently explains more of the variance than either of the other step count formulations, and this is statistically significant (p<0.05) in 6 of the 12 comparisons.
Discussion
The correlation r = 0.33 of PASE score with step count is surprisingly low for two variables purportedly measuring the same thing. A low correlation could indicate poor measurement performance, limited range in the data or that the two measures are capturing different and possibly complementary aspects of physical activity. Validation research for questionnaires against accelerometer counts has been done in large samples for the Active Australia Survey, International Physical Activity Questionnaire, and the Behavioural Risk Factor Surveillance System questions.17 Correlations for these instruments were in the range 0.22 to 0.36, similar to our findings. We believe that the PASE is as good as other questionnaire methods available and was specifically developed for the age group of our study.
Washburn et al18 have shown better correlations of PASE with 3-day accelerometer recordings (r = 0.49); however, this was in a non-representative sample of only 20 and did not reflect activity over the full weekday/weekend cycle.19
Previous validation work has consistently relied on simple correlation coefficients to evaluate the agreement between methods; however, correlation coefficients will be reduced by variation due to age, sex and other personal characteristics. We have taken the approach of examining the explanatory power of physical activity measures to predict levels of surrogate health markers regarded as sequelae of physical activity. For this we have chosen HDL, BMI, waist, waist-to-hip ratio, fibrinogen, fasting glucose, WCC and blood pressure. All these factors are affected by physical activity, and all are associated with mortality in cohort studies. Using this approach, the step count consistently outperforms PASE score in predicting these surrogate outcomes of physical activity, and adding PASE score to a model with step counts never significantly improves the explanatory power, demonstrating that the two measures are not capturing complementary aspects of physical activity.
The interpretation of WCC as a marker of a low-grade inflammatory state is a recent proposition but is supported by good evidence from the Baltimore Longitudinal Study of Ageing,20 which showed an adjusted hazard ratio for all-cause mortality of 1.24 (95% CI 1.06 to 1.43) in those with WCC of more than 6000 compared with those with WCC of 3500 to 5999. In that study, an increase of 1 SD in energy expenditure on a physical activity questionnaire was associated with 80 fewer white cells per cubic millimetre, similar to our value of 119 based on PASE score, and less than our value of 327 based on step count.
A limitation of this research is that the PASE responses were based on the week before the clinic visit, whereas the pedometer recording was for the week after the visit. However, no advice was given to exercise more, and in a previous analysis we showed that the step count for the first 3 days was not consistently different from that of the last 4 days, suggesting there was no Hawthorne effect of increased steps due to being observed. Except for waist-to-hip ratios, the regression models explain only a small proportion of total variability in the surrogate outcomes; despite this, the greater value of r2 in models based on step count versus PASE score is apparent for all outcomes. We do not claim that pedometers are an exact measure of physical activity; when compared with metabolic methods, they may overestimate or underestimate total energy expenditure.
What is already known on this subject
▶. Physical activity is a crucial element of health and healthy ageing.
▶. Much of what we know about physical activity is based on the results of self-report questionnaires with imperfect validity.
What this paper adds
▶. Physical activity measured by step counts is much more strongly associated with health indicators than physical activity measured by questionnaire.
▶. Future research should use objective measures of physical activity wherever possible.
In this population, both questionnaires and pedometers are feasible measurement approaches, but we believe that pedometry captures total physical activity with greater validity than the PASE questionnaire. This is best demonstrated by the fact that of the 12 expected associations we examined, 10 were evident for step counts and only 3 were evident for physical activity measured by PASE.
The literature on using pedometers to measure physical activity21 suggests interpreting raw step counts rather than converting them to distance, although the basis for this recommendation is unclear. In our data, the conversion of step counts to nominal kilometres walked based on observed stride length gave greater explanatory power than the raw step count, and we think this is objective evidence in support of this interpretation of pedometer counts for public health research purposes.
Questionnaire methods are widely used as outcome measures in physical activity intervention trials; however, in studies that show no effect, it is difficult to be certain whether the problem lies with the programme or with the outcome measurement tool.8 The recent meta-analysis published in this journal by Hamer and Chida22 of 18 prospective studies that all used self-reported walking time or distance showed a hazard ratio of 0.68 for all-cause mortality in the highest versus the lowest walking groups. Poor measurement generally biases results towards the null, so the protective effect may well be greater if objective measures of walking had been used.
We believe it is time to use objective physical activity measurements in epidemiological research whenever possible.
Acknowledgments
We are grateful to Steve Bowe for statistical advice.
Footnotes
Funding The Hunter Community Study is funded by the University of Newcastle and the pedometers were funded by the Hunter Medical Research Institute.
Competing interests None.