Objective—To assess the test-retest reliability and validity of the physical activity questions in the World Health Organisation health behaviour in schoolchildren (WHO HBSC) survey.
Methods—In the validity study, the Multistage Fitness Test was administered to a random sample of year 8 (mean age 13.1 years; n = 1072) and year 10 (mean age 15.1 years; n = 954) high school students from New South Wales (Australia) during February/March 1997. The students completed the self report instruments on the same day. An independent sample of year 8 (n = 121) and year 10 (n = 105) students was used in the reliability study. The questionnaire was administered to the same students on two occasions, two weeks apart, and test-retest reliability was assessed. Students were classified as either active or inadequately active based on their combined responses to the questionnaire items. Kappa and percentage agreement were assessed for the questionnaire items and for a two category summary measure.
Results—All groups of students (boys and girls in year 8 and year 10) classified as active (regardless of the measure) had significantly higher aerobic fitness than students classified as inadequately active. As a result of highly skewed binomial distributions, values of kappa were much lower than percentage agreement for test-retest reliability of the summary measure. For year 8 boys and girls, percentage agreement was 67% and 70% respectively, and for year 10 boys and girls percentage agreement was 85% and 70% respectively.
Conclusions—These brief self report questions on participation in vigorous intensity physical activity appear to have acceptable reliability and validity. These instruments need to be tested in other cultures to ensure that the findings are not specific to Australian students. Further refinement of the measures should be considered.
- physical activity
- health behaviour in schoolchildren
Take home message
The items of the HBSC physical activity questionnaire have acceptable reliability and validity among Australian students, but should also be evaluated among students from other cultures. They provide limited information, so more comprehensive questions should be considered for development.
Regular participation in physical activity is associated with a range of physical and mental health benefits in adults, including reduced risk for cardiovascular disease, non-insulin-dependent diabetes mellitus, overweight, hypertension, osteoporosis, some cancers, and anxiety and depression.1 Because there have been few long term studies, the relation between physical activity during childhood and adolescence and health during adulthood is less clear. However, evidence is accumulating to support the theory that improved blood lipid profiles, blood pressure, body composition, glucose metabolism, bone strength, and psychological health are causally associated with regular participation in physical activity.2
The first step toward understanding and encouraging participation in physical activity and enhanced fitness involves the provision of representative data to assess and monitor the population prevalence and distribution of physical activity participation. It is important to know not only what proportion of children and adolescents are active, but also if there are differences across the population by sex, among diverse cultural groups, or according to socioeconomic backgrounds or place of residence. Identification of these subgroups will allow public health and health promotion resources to be more equitably distributed.
Few systems exist to monitor child and adolescent health. The best known is the World Health Organisation health behaviour in schoolchildren (WHO HBSC) surveys, carried out in Europe for the past 15 years.3 Only a few countries participated in the 1980s, but participation grew to 123 000 young people from 26 countries in the most recent 1997/98 survey.4 These school based surveys used standardised protocols and core questions across countries to measure a range of adolescent health and lifestyle issues, including physical activity and sedentary behaviours such as time spent watching television or using computers.
Kohl and his colleagues5 have recently conducted a comprehensive review of measures of physical activity participation among children and adolescents, including the strengths and limitations of the various methods of assessing their validity and reliability. They concluded that, at present, administration of self report questionnaires is the only practical method of collecting a broad range of data from a large number of children and adolescents, but it is still important that the questionnaire items provide acceptably accurate data. Consequently, a central question for self reported population measures is whether they are known to have good measurement properties, especially test-retest reliability and validity.
The two studies reported here include the HBSC physical activity questions. In the validity study, a field measure of aerobic fitness (the Multistage Fitness Test (MFT))6 was also administered to the students who completed the questionnaire, allowing partial validation of the self report instrument. In the second study (the reliability study), the questionnaire was administered to the same students on two occasions, two weeks apart, and test-retest reliability was assessed. The data were stratified by sex and school year (years 8 and 10) before analysis. This pair of studies represents one of the first assessments of the reliability and validity of the WHO HBSC physical activity questionnaire items.
The validity and reliability studies were conducted separately and involved different schools and students. The two studies used the same survey instrument, which was administered under the same conditions—that is, in class groups under the supervision of at least one research officer. The survey instrument sought a broad range of self reported information, including: demographics; participation in physical activity using different measures, including the physical activity assessment items from the WHO HBSC; the amount of time spent in sedentary activities; questions about school physical education; and individual factors associated with sport and other physical activity participation based on Social Cognitive Theory.7 These studies were approved by the University of Sydney human ethics committee, and informed consent was provided by all students before their participation.
The WHO HBSC physical activity questionnaire asked students to report the frequency and total amount of time spent exercising vigorously outside school hours. The frequency question asked students “Outside school hours: How often do you usually exercise in your free time, so much that you get out of breath or sweat?” The response alternatives were: “Once a month or less”, “Once a week”, “2–3 times a week”, “4–6 times a week”, and “Every day”. The duration question asked students “Outside school hours: How many hours do you usually exercise in your free time, so much that you get out of breath or sweat?” The response alternatives were: “None”, “About half an hour per week”, “About one hour per week”, “About 2–3 hours per week”, “About 4–6 hours per week”, and “About 7 hours per week”.
For the validity analyses, the responses to the two questions were summarised in two ways, based on guideline 2 of the Physical activity guidelines for adolescents.8 The criterion provided by guideline 2 was selected as the basis for categorisation because it is likely to be used as a cut off point for estimation of the prevalence of sufficient activity among young people, and it is important to know whether those identified as more active on the basis of self report also have, on average, greater aerobic fitness. Firstly, responses were dichotomised for each question separately. For the frequency question, those who reported being active “once a week” or less were considered inadequately active, and those who reported being active “2–3 times a week” or more were considered to be active. For the duration question, those who reported that they were active for “about half an hour per week” or less were considered inadequately active and those who reported that they were active for “about one hour per week” or more were considered active. The Physical activity guidelines for adolescents of Sallis and Patrick8 identify three times a week as the minimum frequency of participation to meet the criterion of being “active”, so the criterion used here of “2–3 times a week” will incorrectly categorise some students who participate only twice per week as “active”. This is an unavoidable limitation to this study and a significant shortcoming of the questionnaire.
Secondly, a dichotomous summary measure was created. Students who reported that they were vigorously physically active “2–3 times per week”, “4–6 times per week”, or “every day” and that they were vigorously active for “about 1 hour per week” or more were classified as active. Students who reported their frequency of activity as “once a month or less” or “once a week” or who reported their duration of activity as “none” or “about half an hour per week” were classified as inadequately active. The uncategorised responses to the two questionnaire items and the two category summary measure were used for the reliability analyses. The responses to the two questionnaire items were analysed separately in an attempt to identify whether one was more or less reliable than the other.
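The dichotomous summary classification described above can be sketched in code. This is a minimal illustration only: the function and set names are ours, and the response strings are simplified ASCII encodings of the questionnaire wording.

```python
# Illustrative encodings of the "active" response options for the two
# HBSC items (frequency and duration of vigorous activity out of school).
FREQ_ACTIVE = {"2-3 times a week", "4-6 times a week", "Every day"}
DUR_ACTIVE = {"About one hour per week", "About 2-3 hours per week",
              "About 4-6 hours per week", "About 7 hours per week"}

def classify(frequency, duration):
    """Return 'active' only when BOTH items meet the guideline 2 cut offs;
    any other combination is 'inadequately active'."""
    if frequency in FREQ_ACTIVE and duration in DUR_ACTIVE:
        return "active"
    return "inadequately active"
```

Note that the rule is conjunctive: a student reporting frequent but very brief activity, or long but infrequent activity, falls into the inadequately active category.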
Data relevant to the validation analyses were collected as part of the NSW Schools Fitness and Physical Activity Survey, 1997, the details of which have been published elsewhere.9, 10 Briefly, 44 high schools were selected at random from the three education sectors (Department of School Education, Catholic Education Commission, and Independent Schools) in proportion to the number of students enrolled in each sector. The likelihood of a school being selected in each stratum was proportional to the size of the student enrolment. Within each school, one class was chosen at random from years 8 and 10. Assuming that classes would be about equal in size, this method provides a self weighted sample in which each student has an approximately equal chance of selection. The data were collected during February/March 1997 (summer).
Aerobic fitness was assessed using the MFT (also known as the 20 m Shuttle Run Test, the Beep Test, or PACER); it was first described by Leger and Lambert6 and identified in a recent review as a reliable and valid field test for use among children and adolescents.11 Students are required to run between two lines 20 m apart (one “lap”), starting at 8.5 km/h and increasing by 0.5 km/h every two minutes, in synchrony with a cadence tape. Students were tested in groups of about 15, and the test was supervised by at least two of the field team. The test ended when a student failed to keep pace with the cadence tape on two consecutive laps or voluntarily withdrew. The stage and level achieved by each student was written on the back of the hand with a water soluble marker and recorded when all students had completed the test. Stage and level were converted into number of laps completed (laps) for the analysis. We recognise that aerobic fitness is, at best, only an indirect method of self report validation and that collection of accelerometer data may be a preferable approach to validation. Unfortunately, this study lacked the necessary resources to administer accelerometers to the students.
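Under the protocol described (20 m laps, starting at 8.5 km/h and increasing by 0.5 km/h every two minutes), the stage-to-laps conversion can be approximated from speed and stage duration alone. This is an illustrative sketch, not the published conversion: the exact lap counts per stage are fixed by the cadence tape, and the function names are ours.

```python
def laps_in_stage(stage):
    """Approximate laps in a two minute stage: speed starts at 8.5 km/h
    and rises by 0.5 km/h per stage; each lap is 20 m."""
    speed_kmh = 8.5 + 0.5 * (stage - 1)
    metres_in_two_min = speed_kmh * 1000 / 30  # 2 min = 1/30 hour
    return int(metres_in_two_min // 20)

def total_laps(completed_stages, extra_laps=0):
    """Laps credited for all completed stages plus laps run into the
    stage in which the student stopped."""
    return sum(laps_in_stage(s) for s in range(1, completed_stages + 1)) + extra_laps
```

For example, a student who completes the first two stages and three laps of the third would be credited with roughly 14 + 15 + 3 = 32 laps under this approximation.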
Five high schools were selected at random from all high schools in the southern half of the Sydney metropolitan region and two high schools from a regional city were approached to participate in the study. The questionnaire was administered on two occasions, two weeks apart, in each of the schools during November and December 1998. One year 8 and one year 10 class in each school were selected by school staff to participate in the study.
For the validity study, the mean and standard error of the number of laps on the MFT were calculated for the two questions separately, and for the summary measure. Standard errors were adjusted for the design effects. Comparisons of the mean number of laps between the active and inadequately active categories were carried out using multiple regression with dummy variables. Significance levels were adjusted for the design effects (resulting from schools and classes being the units of random selection) using SUDAAN.12 Results of analysis of square root transformed data were not different from the results of analyses using crude data.
Test-retest reliability was assessed for agreement beyond chance using unweighted (for 2 × 2 contingency tables) and weighted (for multiway contingency tables) kappa statistics with 95% confidence intervals. Skewed binomial data can give rise to erroneously low values of kappa,13, 14 so percentage agreement was also calculated as the number of students classified as active at both test 1 and test 2 plus the number of students classified as inadequately active at both test 1 and test 2 as a proportion of the total number of students.
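The two reliability statistics can be computed directly from a 2 × 2 test-retest table. The sketch below uses made-up counts (not data from this study) chosen to show how skewed marginal distributions depress kappa even when percentage agreement stays high.

```python
def kappa_and_agreement(a, b, c, d):
    """Unweighted Cohen's kappa and percentage agreement for a 2x2
    test-retest table: a = active at both tests, b = active then
    inactive, c = inactive then active, d = inactive at both tests."""
    n = a + b + c + d
    po = (a + d) / n  # observed proportion of agreement
    # chance agreement from the marginal totals of each test
    pe = ((a + b) * (a + c) + (c + d) * (b + d)) / n ** 2
    return (po - pe) / (1 - pe), 100 * po

# Balanced marginals: 90% agreement yields kappa = 0.80
print(kappa_and_agreement(45, 5, 5, 45))
# Skewed marginals (most students active): agreement is still 85%,
# but chance agreement is high, so kappa falls to about 0.31
print(kappa_and_agreement(80, 8, 7, 5))
```

The second call illustrates the paradox discussed in the results: with a highly skewed binomial distribution, chance agreement is already large, so the kappa numerator shrinks even though raw agreement is substantial.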
Of the 1072 year 8 students, 48% were girls and the mean age was 13.1 years; 45% of the 954 year 10 students were girls and the mean age was 15.1 years. The response rates were 86.6% and 83.5% for year 8 boys and girls respectively and 80.5% and 70.8% for year 10 boys and girls respectively.
Table 1 shows the proportion of year 8 and year 10 boys and girls in each category of each of the measures of physical activity, and table 2 shows the mean number of laps completed on the MFT for each category of each measure for boys and girls in years 8 and 10. With regard to the separate frequency and duration questions, the active group had significantly higher aerobic fitness than the inadequately active group for both sexes and both years. Similarly, for the dichotomous summary measure, aerobic fitness was significantly greater for the active than for the inadequately active students, for boys and girls in both years. The differences were statistically significant for all groups of students.
Of the 121 year 8 students, 48% were girls and the mean age was 13.7 years; 29% of the 105 year 10 students were girls and the mean age was 15.7 years. Table 3 shows the proportion of year 8 and year 10 boys and girls who were active and inadequately active at test 1 and test 2, and table 4 shows the values of kappa (95% confidence interval) and percentage agreement for each of the measures for year 8 and year 10 boys and girls separately. For the two questionnaire items, the values of kappa and percentage agreement were generally similar and all were 0.6 or lower, suggesting poor to moderate reliability. However, these items used five and six point response scales, making high agreement less likely, even when agreement is assessed using weighted kappa.
In contrast, the two category summary measure generally showed low values of kappa and higher percentage agreement values. Among year 8 boys and girls, a much higher proportion of students were active than inadequately active, resulting in a skewed binomial distribution and, consequently, paradoxically low values of kappa, despite percentage agreement being close to 70%. A similar, but less pronounced, phenomenon occurred among year 10 students, for whom the distributions were less skewed. Given the characteristics of these data, we consider that percentage agreement provides the better indication of test-retest reliability.
The results of this study indicate that, at least among Australian high school students, the HBSC physical activity questions had acceptable test-retest reliability among year 8 students and acceptable to good reliability among year 10 students. However, these findings should be viewed with some caution until they are confirmed in other populations. The observation that the reliability of self report physical activity measures generally improves with the age of the students has also been reported by Sallis and his colleagues.15
The data also indicate that the HBSC instrument has at least partial validity, as higher scores on the MFT differentiated the groups by self reported activity level. Aerobic fitness is an indirect measure of validity because other factors, in addition to physical activity, also influence aerobic fitness. That is, a measure of aerobic fitness is, at best, only partially related to the differences between self report physical activity categories. It should also be kept in mind that the formulation of physical activity recommendations for children and adolescents is a vexed issue and one for which consensus has not been achieved. For example, Pate and his colleagues16 presented a set of physical activity recommendations that did not include specific reference to a quantum of vigorous activity. Despite these limitations, data on aerobic fitness make a useful contribution to a comprehensive understanding of the characteristics of self report measures, particularly those that focus on vigorous intensity physical activity, as the HBSC items do. A better understanding of the validity of these questionnaire items would require the use of other approaches to validation, such as the use of accelerometers.
Some speculation on the reasons why better reliability was not seen may assist in the development of more reliable and valid self report measures of physical activity participation for use with young people. Firstly, the poorer reliability for year 8 than year 10 students is possibly because year 8 students are at that stage of adolescence during which authority is challenged and few things are taken seriously—that is, we suspect that the year 8 students may have taken less care in responding to the questions. Of course, this may vary from culture to culture, and the results achieved with this Australian sample may not be replicated in other countries.
Secondly, it may be that the questions are too simple and brief to yield accurate responses from many students. The questions do not include a reference period—for example, last or usual week—so students may consider different time periods in formulating their answer on different occasions. The questions also do not make reference to different types of activity—for example, organised sports and non-organised activities—so students may have been unclear about which activities to include in their report. Finally, students' participation may vary substantially with the season and, as the questions do not make reference to the season, students may be unclear about which activities to report.
The findings of this study suggest that the HBSC questionnaire items have acceptable reliability and validity among year 8 students and acceptable to good reliability and validity among year 10 students, but there is clearly room for improvement. However, the poorer performance of the measure among the younger students, a finding that is consistent with other studies, raises doubts about the use of these measures with even younger adolescents. Similar studies should be conducted on the same questions in several other countries to confirm that the findings are not specific to Australian students. If lesser levels of agreement are found in other cultures, thought should be given to further development of the HBSC self report physical activity questions.
The study was supported by grants from the NSW Department of Education and Training, the NSW Department of Health, and the National Professional Development Program.