Association of physical activity with risk of hepatobiliary diseases in China: a prospective cohort study of 0.5 million people

Objective There is limited prospective evidence on the association of physical activity with hepatobiliary cancer subtypes and other major hepatobiliary diseases, especially in China. We aimed to quantify the associations with risk of these diseases.
Methods The study population involved 460 937 participants of the prospective China Kadoorie Biobank aged 30–79 years from 10 diverse areas in China without history of cancer or hepatobiliary disease at baseline. Cox regression was used to estimate adjusted hazard ratios (HRs) for each disease associated with self-reported total and domain-specific physical activity (occupational and non-occupational, ie, leisure time, household and commuting).
Results During ~10 years of follow-up, 22 012 incident cases of hepatobiliary diseases were recorded. The overall mean (SD) total physical activity was 21.2 (13.9) metabolic equivalent of task (MET)-hours/day, with 62% from occupational activity. Total physical activity was inversely associated with hospitalised non-alcoholic fatty liver disease (HR comparing top vs bottom quintile: 0.62, 95% confidence interval (CI) 0.53 to 0.72), viral hepatitis (0.73, 95% CI 0.62 to 0.87), cirrhosis (0.76, 95% CI 0.66 to 0.88) and liver cancer (0.81, 95% CI 0.71 to 0.93), as well as gallstone disease (0.86, 95% CI 0.81 to 0.90), gallbladder cancer (0.51, 95% CI 0.32 to 0.80) and biliary tract cancer (0.55, 95% CI 0.38 to 0.78). The associations for occupational physical activity were similar to those for total physical activity, but for non-occupational physical activity they differed by disease subtype. For leisure-time physical activity, there was an inverse association with liver cancer and an inverse trend for gallstone disease (HR comparing ≥7.5 MET-hours/day with none: 0.83, 95% CI 0.75 to 0.91 and 0.82, 95% CI 0.66 to 1.01).
Conclusion Among Chinese adults, high total physical activity, particularly occupational physical activity, was inversely associated with risk of major hepatobiliary cancers and diseases, including non-alcoholic fatty liver disease, cirrhosis and certain types of cancer.


Follow-up for morbidity and mortality
In CKB, the ICD-10 code C22 was used only for primary liver cancer, and C22.9 was used for "liver cancer, unspecified subsite". Secondary liver cancer was coded as C78.7. Information on cancer histological subtypes was also collected for a subset of cases through cancer registries or reviews of hospital medical notes as part of the ongoing outcome adjudication for major diseases. Participants who developed hepatobiliary cancers or diseases were censored at the time of their first event. Participants who died or were lost to follow-up were censored on the last date they were known to be alive.

Statistical analysis
Supplemental material: BMJ Publishing Group Limited (BMJ) disclaims all liability and responsibility arising from any reliance placed on this supplemental material which has been supplied by the author(s).
Cox proportional hazards models, with age as the underlying time scale and delayed entry at age at baseline, were used to estimate adjusted hazard ratios (HRs) for specific disease incidence associated with physical activity levels. Models were stratified by sex and study area (10 areas) and adjusted for age at baseline, education (6 groups: no formal school, primary school, middle school, high school, technical school/college, or university), household income (6 groups: <2500, 2500-4999, 5000-9999, 10,000-19,999, 20,000-34,999, or ≥35,000 RMB), smoking (3 groups: never, occasional, or ever regular), alcohol (6 groups: abstainers, ex-weekly drinkers, reduced-intake drinkers, occasional drinkers, and among weekly drinkers, <20 or ≥20 g/day [women], <30 or ≥30 g/day [men]), self-rated health (3 groups: poor, fair, and good), diabetes, cardiovascular disease, respiratory disease, rheumatoid arthritis, and sedentary leisure time.
These confounders were selected based on a literature review and previous reports in CKB and included potential risk factors for hepatobiliary diseases [1-5] and factors associated with physical activity (sex, age, smoking, alcohol, self-rated health, prior history of diabetes, cardiovascular disease, respiratory disease, and rheumatoid arthritis) [6-10]. In CKB, risks of hepatobiliary diseases and patterns of physical activity varied across the 10 regions, so all analyses were stratified by study region [12,13]. Education and household income were included to reflect socioeconomic status, which is associated with both physical activity and hepatobiliary diseases [13]. Physical activity was categorised by splitting at quintiles in order to assess the shape of the association.
If the association appeared linear, physical activity was also modelled as a continuous variable to estimate the HR associated with a 4 MET-h/day higher level of physical activity.
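The two ways of modelling physical activity described above (quintile groups and a per 4 MET-h/day increment) can be illustrated with a minimal Python sketch. It does not fit a Cox model; it assumes a per-unit log HR and its standard error have already been estimated elsewhere. The function names, the tie-handling rule at cut points, and all input values are hypothetical and are not taken from CKB.

```python
import math
from bisect import bisect_right
from statistics import quantiles


def quintile_groups(values):
    """Label each value 1-5 by cutting at the sample quintiles."""
    cuts = quantiles(values, n=5)  # four cut points
    # Assumption: a value equal to a cut point goes into the higher group.
    return [bisect_right(cuts, v) + 1 for v in values]


def hr_per_increment(beta_per_unit, se_per_unit, increment=4.0, z=1.96):
    """Convert a per-unit log HR into an HR (with 95% CI) per `increment` units."""
    log_hr = increment * beta_per_unit
    half_width = z * increment * se_per_unit
    return (math.exp(log_hr),
            math.exp(log_hr - half_width),
            math.exp(log_hr + half_width))


# Hypothetical MET-h/day values and coefficient, for illustration only.
met_hours = [3.5, 8.0, 12.4, 15.1, 19.9, 22.3, 27.8, 31.0, 36.6, 44.2]
print(quintile_groups(met_hours))
print(hr_per_increment(beta_per_unit=-0.01, se_per_unit=0.002))
```

Scaling the coefficient and its standard error by the same increment keeps the confidence interval consistent with the rescaled point estimate.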
To assess potentially nonlinear associations between physical activity and disease risk, restricted cubic splines were calculated using three fixed knots at the 10%, 50%, and 90% quantiles. Nonlinearity was evaluated using the likelihood ratio test to compare the fit of linear and nonlinear models.
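As a rough illustration of the spline approach, the sketch below builds the single nonlinear basis term of a three-knot restricted cubic spline and computes a likelihood ratio test with 1 degree of freedom. It assumes the commonly used truncated-power parameterisation with scaling by the squared knot range, which the source does not specify. The knot locations and log-likelihoods are hypothetical.

```python
import math


def rcs_nonlinear_term(x, knots):
    """Nonlinear basis term of a restricted cubic spline with three knots.

    Together with x itself, this term spans a curve that is cubic between
    the knots and linear below the first and above the last knot.
    """
    t1, t2, t3 = knots

    def cube_plus(u):
        return max(u, 0.0) ** 3

    term = (cube_plus(x - t1)
            - cube_plus(x - t2) * (t3 - t1) / (t3 - t2)
            + cube_plus(x - t3) * (t2 - t1) / (t3 - t2))
    return term / (t3 - t1) ** 2  # scaling assumed for numerical stability


def lrt_nonlinearity(loglik_linear, loglik_spline):
    """Likelihood ratio statistic and p-value (chi-square, 1 df)."""
    stat = 2.0 * (loglik_spline - loglik_linear)
    # For 1 df, P(chi2 > s) = erfc(sqrt(s / 2)).
    p_value = math.erfc(math.sqrt(max(stat, 0.0) / 2.0))
    return stat, p_value


# Hypothetical knots (MET-h/day) and log-likelihoods, for illustration only.
knots = (5.0, 20.0, 40.0)
print([rcs_nonlinear_term(x, knots) for x in (3.0, 10.0, 25.0, 45.0)])
print(lrt_nonlinearity(loglik_linear=-100.0, loglik_spline=-98.0))
```

With three knots the spline model adds one parameter to the linear model, which is why the test uses 1 degree of freedom.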
For exposure variables with more than two categories, all HRs are presented with "floating" standard errors to facilitate comparisons between groups [14]. The CKB estimates for physical activity and hepatobiliary cancers were meta-analysed with estimates from published prospective cohort studies using a random effects meta-analysis. Details of the study selection are reported in Supplementary Figure 1.
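A random effects meta-analysis of log HRs can be sketched as follows. The source does not name the estimator, so this example assumes the DerSimonian-Laird method for the between-study variance. The function name and the study estimates are hypothetical.

```python
import math


def random_effects_meta(log_hrs, ses):
    """Pool log HRs with DerSimonian-Laird random effects weights.

    Returns (pooled log HR, its standard error, between-study variance tau^2).
    """
    w = [1.0 / se ** 2 for se in ses]  # inverse-variance (fixed effect) weights
    fixed = sum(wi * y for wi, y in zip(w, log_hrs)) / sum(w)
    q = sum(wi * (y - fixed) ** 2 for wi, y in zip(w, log_hrs))  # Cochran's Q
    df = len(log_hrs) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    w_star = [1.0 / (se ** 2 + tau2) for se in ses]  # random effects weights
    pooled = sum(wi * y for wi, y in zip(w_star, log_hrs)) / sum(w_star)
    return pooled, math.sqrt(1.0 / sum(w_star)), tau2


# Hypothetical study-level log HRs and standard errors, for illustration only.
pooled, se, tau2 = random_effects_meta([-0.21, -0.05, -0.30], [0.07, 0.10, 0.15])
print(math.exp(pooled), math.exp(pooled - 1.96 * se), math.exp(pooled + 1.96 * se))
```

When the studies agree closely, the estimated between-study variance is zero and the result coincides with a fixed effect inverse-variance pooled estimate.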
Single measurements of physical activity may not accurately reflect an individual's usual level because of within-person variation or change over time. Repeat measurements of physical activity were available for ~20,000 participants who attended a resurvey ~3 years after baseline and were used to estimate regression dilution ratios (RDRs) using the MacMahon-Peto method (Supplementary Table 3) [15]. Log HR estimates for 4 MET-h/day higher physical activity were divided by the RDR to estimate associations of usual physical activity with incident disease risk.
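The correction for regression dilution amounts to dividing the log HR by the RDR. The sketch below applies the same division to the confidence limits, which is an assumption of this example (it ignores uncertainty in the RDR itself). The function name and input values are hypothetical.

```python
import math


def correct_for_regression_dilution(hr, ci_low, ci_high, rdr):
    """Rescale an HR and its CI to usual exposure by dividing log HRs by the RDR."""
    if rdr <= 0:
        raise ValueError("RDR must be positive")
    return tuple(math.exp(math.log(v) / rdr) for v in (hr, ci_low, ci_high))


# Hypothetical HR per 4 MET-h/day and RDR, for illustration only.
print(correct_for_regression_dilution(hr=0.95, ci_low=0.92, ci_high=0.98, rdr=0.5))
```

Because an RDR below 1 inflates the log HR, the corrected association with usual physical activity is further from the null than the association with the baseline measurement.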
We selected a priori central adiposity (waist circumference) and diabetes and investigated the extent to which additional adjustment for these factors could alter the association of physical activity with hepatobiliary diseases. We included each of the above-identified factors in the basic model and examined the percent change in the logHRs comparing the top and bottom quintile of total physical activity. The proportion of disease risk reduction due to additional adjustment for the factor was calculated as follows: $(\log \mathrm{HR}_{\text{adjusted model}} - \log \mathrm{HR}_{\text{basic model}}) / \log \mathrm{HR}_{\text{basic model}} \times 100\%$. The 95% CIs for the proportion were obtained through bootstrap with 1000 replications.
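The percent change formula described above and a percentile bootstrap can be sketched as follows. In a real analysis the bootstrapped statistic would refit the basic and adjusted Cox models on each resample of participants; here a simple mean stands in for that step so the example stays self-contained. The percentile interval, the function names, and all values are assumptions of this sketch.

```python
import math
import random
from statistics import mean


def percent_change_log_hr(hr_basic, hr_adjusted):
    """Percent change in the log HR after additional adjustment."""
    log_basic = math.log(hr_basic)
    return (math.log(hr_adjusted) - log_basic) / log_basic * 100.0


def bootstrap_percentile_ci(records, statistic, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for `statistic` over resampled records."""
    rng = random.Random(seed)
    n = len(records)
    stats = sorted(
        statistic([records[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_boot)
    )
    lower = stats[round(alpha / 2 * n_boot)]
    upper = stats[round((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Hypothetical HRs: a negative value means the association was attenuated.
print(percent_change_log_hr(hr_basic=0.62, hr_adjusted=0.70))

# Stand-in statistic (a mean); a real analysis would refit both models here.
print(bootstrap_percentile_ci(list(range(100)), mean))
```

Resampling whole participant records, rather than model estimates, keeps the basic and adjusted models paired within each bootstrap replicate.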

Sensitivity analyses
We conducted several sensitivity analyses. First, previous reports showed that the patterns of physical activity differed by sex and region in CKB [11], and we therefore examined whether the associations of domain-specific physical activity with hepatobiliary diseases differed by sex and urbanicity. Importantly, urbanicity is a major source of confounding by socioeconomic status and other unmeasured confounders (e.g. infections), which are associated with chronic liver disease and liver cancer [13]. Second, we examined whether the associations differed by occupation (farmers, manual workers, and nonmanual workers). Work activity intensity was used to assess whether the associations differed by the intensity of physical activity (the mean MET values of total physical activity in CKB: farmers 10.4, manual workers 11.7, and nonmanual workers 6.4). Third, the main analyses were repeated excluding the first five years of follow-up. This is because participants with subclinical or undetected diseases at baseline might be diagnosed with disease during the early years of follow-up, and subclinical diseases may affect physical activity at baseline, resulting in reverse causation bias [6].
Abbreviations: CLD, chronic liver disease; GBTC, gallbladder and biliary tract cancer; WC, waist circumference. Models were stratified by sex and region, and adjusted for age at baseline, education, household income, smoking, alcohol, self-rated health, diabetes, cardiovascular disease, respiratory disease, rheumatoid arthritis, and sedentary leisure time. For CLD and liver cancer, the first 5 years of follow-up were excluded.
Models were stratified by sex and region, and adjusted for age at baseline, education, household income, smoking, alcohol, self-rated health, and sedentary leisure time.
Time since birth was used as the underlying time scale with delayed entry at age at baseline. * P-value for heterogeneity between participants born before and after 1955.

Associations of total PA with risk of hepatobiliary cancers and diseases
Restricted cubic splines were calculated using three fixed knots at the 10%, 50%, and 90% quantiles. Models were stratified by sex and region, and adjusted for age at baseline, education, household income, smoking, alcohol, self-rated health, and sedentary leisure time. Time since birth was used as the underlying time scale with delayed entry at age at baseline.

Associations of occupational and nonoccupational PA with risk of liver diseases and cancer
Models were stratified by sex and region, and adjusted for age at baseline, education, household income, smoking, alcohol, self-rated health, and sedentary leisure time. Time since birth was used as the underlying time scale with delayed entry at age at baseline. HRs are plotted against the mean level in each category of physical activity. Log-scale is used for the y-axis. The squares represent HRs and the vertical lines represent 95% CIs. The area of the squares is inversely proportional to the variance of the log HRs. The numbers above the vertical lines are point estimates for HRs, and the numbers below the lines are numbers of events.

Supplementary Figure 5. Associations of domain-specific PA with risk of hepatobiliary disease by region
Models were stratified by sex and region, and adjusted for age at baseline, education, household income, smoking, alcohol, self-rated health, diabetes, cardiovascular disease, respiratory disease, rheumatoid arthritis, and sedentary leisure time. The amount of domain-specific physical activity was categorised by splitting at quintiles. To ensure enough participants in each category of physical activity, we estimated HRs comparing the highest quintile with the lower four quintiles of total physical activity for these subgroup analyses. For leisure-time physical activity, the HR compared ≥7.5 MET-h/day with none.
Models were stratified by region, and adjusted for age at baseline, education, household income, smoking, alcohol, self-rated health, diabetes, cardiovascular disease, respiratory disease, rheumatoid arthritis, and sedentary leisure time. The amount of domain-specific physical activity was categorised by splitting at quintiles. To ensure enough participants in each category of physical activity, we estimated HRs comparing the highest quintile with the lower four quintiles of total physical activity for these subgroup analyses. For leisure-time physical activity, the HR compared ≥7.5 MET-h/day with none.

Supplementary Figure 7. Associations of total PA with risk of chronic liver disease and liver cancer by participant characteristic
Models were stratified by sex and region, and adjusted for age at baseline, education, household income, smoking, alcohol, self-rated health, diabetes, cardiovascular disease, respiratory disease, rheumatoid arthritis, and sedentary leisure time, where appropriate. To ensure enough participants in each category of physical activity, we estimated HRs comparing the highest quintile with the lower four quintiles of total physical activity for these subgroup analyses.
Models were stratified by sex and region, and adjusted for age at baseline, education, household income, smoking, alcohol, diabetes, cardiovascular disease, respiratory disease, rheumatoid arthritis, and sedentary leisure time, where appropriate. To ensure enough participants in each category of physical activity, we estimated HRs comparing the highest quintile with the lower four quintiles of total physical activity for these subgroup analyses.