Objectives There have been recent calls for the application of the complex systems approach in sports injury research. However, beyond theoretical description and static models of complexity, little progress has been made towards formalising this approach in a way that is practical to sports injury scientists and clinicians. Therefore, our objective was to use a computational modelling method and develop a dynamic simulation in sports injury research.
Methods Agent-based modelling (ABM) was used to model the occurrence of sports injury in a synthetic athlete population. The ABM was developed based on sports injury causal frameworks and was applied in the context of distance running-related injury (RRI). Using the acute:chronic workload ratio (ACWR), we simulated the dynamic relationship between changes in weekly running distance and RRI through the manipulation of various ‘athlete management tools’.
Results The findings confirmed that building weekly running distances over time, even within the reported ACWR ‘sweet spot’, will eventually result in RRI as athletes reach and surpass their individual physical workload limits. Introducing training-related error into the simulation and the modelling of a ‘hard ceiling’ dynamic resulted in a higher RRI incidence proportion across the population at higher absolute workloads.
Conclusions The presented simulation offers a practical starting point to further apply more sophisticated computational models that can account for the complex nature of sports injury aetiology. Alongside traditional forms of scientific inquiry, the use of ABM and other simulation-based techniques could be considered as a complementary and alternative methodological approach in sports injury research.
- agent-based modelling
- complex systems
- sports injury
- distance running
This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/
In the sports injury aetiology and prevention research field, the use of the ‘complex systems approach’ has been recently promoted.1 Inspired by previous work that questioned the routine application of reductionist scientific methodologies and statistical techniques,2–4 Bittencourt and colleagues1 argued for an alternative causal approach. This approach recognises that sports injury is a ‘complex emergent phenomenon’, resulting from the interactions among different factors (ie, a web of determinants), which may produce regularities (ie, a risk profile) that are antecedent to the emerging pattern (ie, sports injury). In brief, the theoretical assumptions underpinning the complex systems approach can be traced back to general systems theory,5 which identifies several characteristics of complexity as a general philosophical precept. These include, but are not limited to, adaptation and learning, tight coupling, causal feedback, non-linear relations, sensitivity to initial conditions, threshold effects, stochasticity and historical dependency. Those characteristics have featured across multiple discussions in the sports injury scientific literature1 4 6–8; however, no study has yet applied a recognised method that has the capability to dynamically simulate and better understand complex systems causal patterns and processes. One computational modelling method that has been suggested as a suitable approach for sports injury research1 2 4 9 is agent-based modelling (ABM).
In this paper, we develop a first-of-its-kind ABM in the field of sports science and apply it in the context of distance running-related injury (RRI). Distance running is considered a pertinent example to use from a technical point of view given that the main participatory-related exposure (ie, workload) can be readily defined. There are three aspects to the model that underpin its concept, design and operation: (1) its development is based on the complex systems approach,1 and so investigating the application of a novel complex systems method in sports injury research represents an important line of inquiry; (2) the ABM is informed by contemporary sports injury and RRI causal theory10–12; and (3) it incorporates the acute:chronic workload ratio (ACWR),13 14 an approach that calculates relative changes in training load. The primary aim of the ABM is to simulate the dynamic relationship between the absolute weekly running distance and RRI, as well as the relative change to weekly running distance and RRI, through the manipulation of various model parameters (see section titled, ‘athlete management tools’).
Using ABM to simulate the relationship between workload and sports injury development is a considerable step forward in terms of bringing complexity science and systems thinking to the sports injury literature.1 4 6–8 15 16 Indeed, during the past decade, there have been a series of articles that share a number of progressive methodological features. Initially, Quatman et al 2 proposed a conceptual-methodological framework encompassing the integration of in vivo, in vitro and in silico techniques to better understand the development of anterior cruciate ligament injury. In particular, the authors2 stated that the greatest advances in sports injury research were likely to come from a new methodological paradigm that enabled scientists and clinicians to think, theorise and locate appropriate applications that consider the nature of the complex relationships among different exposures. Shortly afterwards, Mendiguchia et al 3 argued for the same paradigm shift, advocating that sports injury research was required to move beyond the process of wanting to isolate risk factors to a conceptual model encompassing ‘dynamic simulations’ and the possibility to ‘modify different parameters’.3 More recently, Hulme and Finch4 and Bittencourt et al 1 have suggested the use of ABM for the explication and testing of theoretical causal assumptions in relation to injury development, as well as for the simulation of complex sports injury aetiologic mechanism(s). 
Further systems-based work has since applied a method from the human factors and ergonomics domain17 and developed a more holistic, ‘complex systems model’ of RRI development and prevention.15 16 Notwithstanding the evolution of systems thinking applications in sports injury research, most scholarly contributions have been descriptive in nature1–4 6–8 or have involved the development of static frameworks and models.15 16 As such, to advance this body of work, it is necessary to apply a computational modelling approach that can simulate dynamic behaviours within complex sports systems and/or understand how systems change over time.
With regard to scientific theory and clinical practice, the use of simulation and computational modelling techniques has the potential to generate new insights about sports injury aetiology, which could be used to support clinical decision making. Healthcare practitioners rely on a wide range of study designs and different forms of evidence when prescribing the most efficacious therapeutic or preventive interventions to athletes.18–20 For that reason, there is a need to investigate how, as a proven complementary method to routine epidemiological inquiry,21–24 ABM can dynamically simulate known mechanisms of sports injury (ie, the relationship between workload and RRI), so that it is possible to develop more sophisticated and clinically relevant complex systems models. In taking the next formal step, this study represents a transition away from theoretical description and static modelling approaches,1–4 6–8 15 16 and examines the feasibility of simulations for studying the complex and dynamic nature of sports injury moving forwards. Therefore, the purpose of this study is to introduce computational modelling to sports injury research, using ABM as one example of a viable method for studying complex injury dynamics in future theoretical and practical applications.
As a computational method, ABM simulates the actions and interactions of heterogeneous, autonomous ‘agents’, to assess the effects of their behaviour on the system as a whole.25 26 Agents in an ABM can constitute any self-contained and goal-directed entity, including but not limited to, molecules, cells, pathogens, people (eg, athletes, runners and sports teams), animals, automated vehicles, organisations, and/or entire synthetic populations.27 28 In the case that the agents are representative of individual people, the model operator can assign demographic and lifestyle-related characteristics such as age, sex, diet, medical history and injury susceptibility, as well as cognitive rules pertaining to memory, personality, behaviour and/or intelligence.29 This means that agents can learn over time based on past experiences, update their internal states, adapt to changing environmental circumstances and demonstrate any other characteristic or behaviour that has been explicitly defined. Based on its ‘ground-up’ modelling approach, ABM can be used to explain how populations self-organise and/or create patterns of global behaviour that are not predictable or programmed into each agent type a priori. For this reason, ABM is a powerful tool when wanting to explore the mechanism(s) by which collective behaviour among individual agents gives rise to emergent-level phenomena (eg, rates of sports injury).
ABM has been applied in many different health-related contexts. A notable example is the Global-Scale Agent-Based Model, which simulated 6.5 billion persons and explored how various behaviours and contact points shaped the transmission rate and distribution of the H1N1 swine influenza virus.27 28 Other studies have integrated ABM with geographic information systems science to improve comprehension of how measles propagates through an urban environment.30 In the non-communicable health context, ABM has been used for multiple purposes,31 including the evaluation of policy-level and environmental intervention strategies for improving diet and promoting exercise.32–34 Specifically, Yang and coworkers32 used ABM to examine the impact of certain policies aimed at changing population-level attitudes towards walking among individuals from different socioeconomic backgrounds. In the medical and healthcare context, ABM has emulated a real-world lifestyle modification programme for individuals with diabetes and estimated the morbidity and economic outcomes associated with the modification of certain parameters (eg, pharmacological delivery options) over a 30-year period.35 Since initial applications in the mid-1990s, the use of ABM has continued to gain popularity in parallel with the evolution of information technology and computing power.36 For further information pertaining to the origins, purpose and general use of ABM, the reader is referred to other more comprehensive sources.25 26 29 36
The distance running agent-based model
The distance running ABM was constructed using the NetLogo toolkit (V.6.0.1), a cross-platform, open-source, programmable modelling environment for simulating natural and social phenomena (https://ccl.northwestern.edu/netlogo/).37 38 The simulation environment was representative of a track and field overlay with dimensions of 70×30 patches (arbitrarily scaled distance units) (online supplementary material figure 1). To guide the reader through the different stages of ABM development, the following four phases are described: (1) defining the personal characteristics of the synthetic agent population (hereafter referred to as ‘runners’, or the ‘running population’); (2) initialising the model and establishing baseline procedures; (3) implementing four distinct ‘athlete management tools’ (ie, these tools drive the dynamics of the model); and (4) establishing the conditions surrounding the execution of the simulation procedure itself.
Phase one: running population characteristics
The distance running ABM contained 1000 runners. This number of runners was chosen to capture aggregate, population-level dynamics and patterns. Each runner possessed personal characteristics that could affect their physical capacity to tolerate an applied external running workload, defined as the number of kilometres (km) undertaken in any given training week (p/w). These characteristics included body mass index (BMI), biomechanics, footwear, sleep, diet, recovery and genetics and were selected based on a recent framework of RRI aetiology.11 To support a comparison of those factors across runners, the relative ‘quality’ of each characteristic was standardised and drawn from a random-normal distribution with a mean of zero (eg, an ‘average’ diet or sleep quality) and an SD of 0.3 (ie, this kept most runners within a range of −1 to +1).
An additional characteristic that each runner possessed was a maximum workload potential (MWP) state. In accordance with contemporary sports injury and RRI causal theory,10–12 surpassing the MWP state was equivalent to the absolute external running workload exceeding a specific musculoskeletal structure’s physical capacity. Safely reaching the MWP state without surpassing it assumes a perfect environment, training and management regimen. For each runner in the ABM, their initial MWP state was set to a random-normal mean of 65.0 km p/w, with an SD of 10.0 km p/w. This produced a population-based MWP distribution that acknowledged not all runners had an equal upper workload limit. It is worth noting that the selection of runners’ characteristics and the MWP state values are not necessarily integral to the operation of the simulation or the validity of its outputs. Rather, this model was focused on demonstrating how ABM can be both programmed and used to simulate the relationship between workload and RRI risk—and by extension—overall population-level athletic performance.
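The population described above can be sketched in a few lines of Python. This is an illustrative re-expression only, not the authors' NetLogo code: the names `Runner`, `make_population` and `CHARACTERISTICS` are hypothetical, and because the text does not state how the seven characteristics modify a runner's capacity, the sketch only draws and stores them.

```python
import random
from dataclasses import dataclass

# Characteristic names follow the list given in the text.
CHARACTERISTICS = ("bmi", "biomechanics", "footwear", "sleep",
                   "diet", "recovery", "genetics")


@dataclass
class Runner:
    traits: dict    # standardised 'quality' scores, mean 0, SD 0.3
    mwp_km: float   # maximum workload potential (km per week)


def make_population(n=1000, seed=None):
    """Draw n synthetic runners from the distributions stated in the text."""
    rng = random.Random(seed)
    runners = []
    for _ in range(n):
        traits = {name: rng.gauss(0.0, 0.3) for name in CHARACTERISTICS}
        mwp_km = rng.gauss(65.0, 10.0)  # mean 65.0 km p/w, SD 10.0 km p/w
        runners.append(Runner(traits=traits, mwp_km=mwp_km))
    return runners


population = make_population(seed=1)
mean_mwp = sum(r.mwp_km for r in population) / len(population)
print(f"{len(population)} runners, mean MWP {mean_mwp:.1f} km p/w")
```

Fixing the seed makes a single run reproducible; leaving it as `None` gives a fresh population each time, which is closer to the repeated stochastic runs described later.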
Central to the distance running ABM is the ACWR.13 14 As a means of facilitating sports performance optimisation, the ACWR can be used to guide the prescription of future workloads. There are two components to this metric: (1) the ‘acute’ phase, which represents the training load undertaken in the most recent week (ie, a 1-week block); and (2) the ‘chronic’ phase, which signifies the average training load undertaken in the month prior (ie, a 4-week block).13 Calculating the ACWR involves dividing the acute phase (eg, 60.0 km of running), by the chronic average (eg, 50.0 km), giving in this case a ratio of 1.2 (ie, 20.0% workload increase).
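The worked example above can be reproduced with a short function. This is a minimal sketch that assumes weekly totals are supplied oldest first and that the chronic window is the four weeks preceding the acute week; some ACWR formulations include the acute week in the chronic average, which would give a slightly different ratio. The function name `acwr` is illustrative.

```python
def acwr(weekly_km):
    """Traditional (rolling-average) acute:chronic workload ratio.

    weekly_km: weekly distances in km, oldest first. The last entry is
    the acute week; the four entries before it form the chronic block
    (an assumption about how the windows are arranged).
    """
    if len(weekly_km) < 5:
        raise ValueError("need one acute week plus four chronic weeks")
    acute = weekly_km[-1]
    chronic = sum(weekly_km[-5:-1]) / 4.0
    return acute / chronic


# Four weeks averaging 50.0 km followed by a 60.0 km week.
ratio = acwr([50.0, 50.0, 50.0, 50.0, 60.0])
print(f"ACWR = {ratio:.2f}")
```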
The ACWR is theoretically driven and practically appealing. Well-developed physical qualities and musculoskeletal adaptations produced during chronic training phases may build athletic resilience and protect against injury.13 14 Gradually increasing workloads, and closely tracking week-to-week changes to training regimens, is more important than the absolute applied workload exposure at any given time.39 40 Prospective epidemiological investigations have found that when the acute training load is equal to, or less than, the chronic phase (ie, ACWR ≤1.0), the risk of non-contact, soft-tissue injury is significantly lower than at ratios of ≥1.5.41 Although further research is needed to strengthen the existing evidence,42–47 an ACWR between 0.8 and 1.3 has been coined the training ‘sweet spot’, whereas a ratio between 1.3 and 1.4, and ≥1.5, represents a moderate and high-risk injury zone, respectively.13 48
In response to the growing interest around the ACWR, concerns have been raised about the use of rolling averages to assess workload and sports-related injury risk.49–51 The two main limitations with the traditional ACWR calculation are: (1) averages fail to account for variation over time such that day-to-day patterns and ‘spikes’ in the applied workload are smoothed; and (2) rolling averages neglect the decaying nature of stimuli over time.49 As such, a non-linear training model that places increasing weighting on the daily workloads undertaken towards the end of a chronic training phase has been proposed.51 The exponentially weighted ACWR (EW-ACWR) was found to be significantly more sensitive than the traditional ACWR at identifying injury likelihoods at upper training load ratio ranges (ie, ≥1.5).50 Along with the traditional ACWR, the EW-ACWR was incorporated into the distance running ABM as an option for calculating the relative variation in runners’ workload.
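To illustrate the weighted alternative, the sketch below uses the commonly published exponentially weighted moving average form, EWMA_today = λ × load_today + (1 − λ) × EWMA_yesterday with λ = 2/(N + 1). The text does not give the decay constants or seeding used in the ABM, so the 7-day and 28-day spans, the seeding with the first day's load and the function names are all assumptions.

```python
def ewma(daily_loads, span_days):
    """Exponentially weighted moving average of daily loads (oldest first)."""
    lam = 2.0 / (span_days + 1)      # decay constant for an N-day span
    value = daily_loads[0]           # seed with the first observation (assumption)
    for load in daily_loads[1:]:
        value = lam * load + (1.0 - lam) * value
    return value


def ew_acwr(daily_loads, acute_days=7, chronic_days=28):
    """Exponentially weighted acute:chronic workload ratio."""
    return ewma(daily_loads, acute_days) / ewma(daily_loads, chronic_days)


steady = [20.0 / 7.0] * 28           # 20 km per week, spread evenly over days
spiked = steady + [10.0, 10.0, 10.0] # three unusually long days at the end

print(f"steady: {ew_acwr(steady):.2f}")
print(f"spiked: {ew_acwr(spiked):.2f}")
```

Because the acute average uses a larger λ than the chronic average, recent days dominate it, so a late spike raises the ratio quickly rather than being smoothed across the week.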
Phase two: model initialisation
At the start of the simulation, each runner was assigned a standard running history spanning the previous 28-day period. This history allocated a total of 20.0 km p/w in each 7-day block preceding each day in the prior 28-day period. Therefore, at ABM initialisation, each runner had 20.0 km p/w history of running in the previous training week and had a rolling average of 20.0 km p/w for the past month. This produced both an initial ACWR and an EW-ACWR of 1.0 (ie, each runner had a consistent workload in relation to a previously recorded workload over the past 28 days as calculated under each regimen).
Phase three: athlete management tools
The distance running ABM incorporated four athlete management tools that were manipulable by the model operator. The first of these tools is the ‘ramp-up rate slider’, which dictated the rate at which runners applied and increased workload over time. Specifically, the goal of each runner in the system was to maximise the absolute distance they were able to run per week without sustaining RRI, that is, safely reach the MWP state and remain there. To achieve this, the simulation started with runners gradually increasing their weekly kilometres at a rate determined by the user-defined ramp-up rate. The lower the ramp-up rate, the longer the time frame before the running population reached an MWP state. Conversely, higher ramp-up rates resulted in runners rapidly ascending to their upper workload limits. For the purposes of experimentation, runners’ workloads were increased within the reported ACWR sweet spot of between 5.0% and 30.0%, in increments of 5.0%.13 This resulted in a total of six possible ramp-up rate conditions.
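The effect of the ramp-up rate on time to reach the upper limit can be illustrated with simple compounding. The sketch assumes the 20.0 km p/w starting workload and the 65.0 km p/w population-mean MWP reported elsewhere in the methods, and it ignores random variation and injury; `weeks_to_reach` is a hypothetical helper, not part of the published model.

```python
def weeks_to_reach(start_km, target_km, ramp_up_rate):
    """Count weeks of compounding growth until the target is met or exceeded."""
    weeks = 0
    workload = start_km
    while workload < target_km:
        workload *= 1.0 + ramp_up_rate
        weeks += 1
    return weeks


# The six ramp-up rate conditions, from 5.0% to 30.0% in 5.0% increments.
for rate in (0.05, 0.10, 0.15, 0.20, 0.25, 0.30):
    weeks = weeks_to_reach(20.0, 65.0, rate)
    print(f"{rate:.0%} ramp-up: {weeks} weeks to reach 65 km p/w")
```

Under these assumptions a 5.0% rate takes 25 weeks to reach 65 km p/w, whereas a 30.0% rate takes 5 weeks.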
The second athlete management tool was an ability to approximate a runner’s individual adherence, misrepresentation or miscalculation of the advice provided by, for example, a coach or healthcare professional as to how much training should be undertaken per week. The ‘random variation slider’ introduced noise into each runner’s planned workload, adjusting the ramp-up rate by a mean of 0.0%, but with increasing SD of 0.0% (ie, perfect training adherence), 1.0% (ie, moderate training adherence) or 2.5% (ie, poor training adherence). The formula for the calculation of workload in the current week is shown in equation 1, where cw=workload in the current week, pw=workload in the previous week, r=ramp-up rate and error=random variation.
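Equation 1 itself is not reproduced in this text. One reading that is consistent with the description, in which the random variation adjusts the ramp-up rate, is cw = pw × (1 + r + error) with error drawn from a normal distribution with mean 0 and the chosen SD. The sketch below implements that reading only; the functional form and the name `next_week_workload` are assumptions rather than the authors' published formula.

```python
import random


def next_week_workload(pw, r, error_sd, rng):
    """Assumed form of equation 1: cw = pw * (1 + r + error).

    pw       : workload in the previous week (km)
    r        : ramp-up rate as a fraction (eg, 0.05 for 5.0%)
    error_sd : SD of the random variation as a fraction (0.0, 0.01 or 0.025)
    rng      : a random.Random instance
    """
    error = rng.gauss(0.0, error_sd) if error_sd > 0 else 0.0
    return pw * (1.0 + r + error)


rng = random.Random(42)
print(next_week_workload(20.0, 0.05, 0.0, rng))    # perfect adherence
print(next_week_workload(20.0, 0.05, 0.025, rng))  # poor adherence
```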
The third athlete management tool that was manipulated by the model operator was the ability to adjust the way in which the ACWR was calculated. Although the traditional ACWR and the EW-ACWR differ only in how they calculate changes in workload, the distance running ABM was constructed so that altering the estimation of the ratio (ie, either non-weighted or weighted) could affect RRI risk. Therefore, the risk of RRI in each week was based on either the ACWR or EW-ACWR calculation, both of which dynamically responded to the user-defined ramp-up rate and random variation condition. Accordingly, if a given runner’s calculated workload ratio was ≥1.1, then the likelihood of RRI was proportional to the cubed value of their allocated ACWR or EW-ACWR condition. This produced a steeply, non-linearly increasing risk of RRI that approximated the observed likelihood of sports injury development as found in empirical studies (figure 1).13 To provide a visual indication to the model operator of the health of the running population at any given time, runners who incurred an RRI doubled in size, turned red in colour and were transferred to the centre of the simulation environment. On sustaining RRI, a given runner’s workload dropped to 5.0 km per week.
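The injury rule described above can be sketched as follows. The text states the 1.1 threshold, the cubic relationship and the drop to 5.0 km per week after RRI, but not the constant of proportionality or the risk below the threshold. The `scale` value of 0.01 and the zero risk below 1.1 are therefore placeholders chosen for illustration, and the function names are hypothetical.

```python
import random

POST_INJURY_KM = 5.0  # workload after sustaining RRI, as stated in the text


def weekly_injury_probability(ratio, scale=0.01, threshold=1.1):
    """Risk proportional to ratio cubed once the ratio reaches the threshold.

    scale is a hypothetical constant of proportionality; risk below the
    threshold is assumed to be zero.
    """
    if ratio < threshold:
        return 0.0
    return min(1.0, scale * ratio ** 3)


def apply_injury_rule(workload_km, ratio, rng, scale=0.01):
    """Return (new_workload_km, injured) for one weekly check."""
    if rng.random() < weekly_injury_probability(ratio, scale):
        return POST_INJURY_KM, True
    return workload_km, False


for ratio in (1.0, 1.1, 1.3, 1.5, 2.0):
    print(f"ratio {ratio:.1f}: weekly risk {weekly_injury_probability(ratio):.4f}")
```

Doubling the ratio multiplies the risk by eight under this rule, which is what gives the curve its steep shape at higher ratios.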
The fourth and final athlete management tool under manipulation was a binary condition relating to the runner’s individual MWP state. In the first condition, whereby an individual’s MWP was unknown (ie, ‘off’), the calculation of runners’ future training was based on the workload in the most current week, multiplied by the ramp-up rate and adjusted for random variation (equation 1). A second condition was constructed whereby runners’ workloads were further adjusted based on how close the current workload was to their MWP state. This calculation recognised that the running population had a randomly distributed MWP that was guaranteed to be reached under conditions of continuous, compounding growth (ie, MWP state ‘on’). The formula for this calculation is shown in equation 2.
Phase four: establishing the conditions of the simulation
The set combination of the six ramp-up rates, three random variation conditions, two ACWR calculations and two MWP states produced a 72-condition matrix. Given stochastic elements within features of the ABM, modelling under the 72 different possible conditions was repeated 10 times for 1000 model time steps, or days (~143 weeks). This produced a total computational model encompassing 720 000 individual simulated runners monitored over a total of 720 000 days (~102 800 weeks). On completion of the simulation, data were exported from the NetLogo37 38 software into spreadsheet processing software (Microsoft Excel for Windows).
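The size of the experimental design follows directly from the factor levels, as the sketch below shows. The factor values are taken from the methods; the variable names and string labels are illustrative.

```python
from itertools import product

ramp_up_rates = [0.05, 0.10, 0.15, 0.20, 0.25, 0.30]
random_variation_sds = [0.0, 0.01, 0.025]
ratio_methods = ["ACWR", "EW-ACWR"]
mwp_states = ["on", "off"]

# Full factorial design: every combination of the four tools.
conditions = list(product(ramp_up_rates, random_variation_sds,
                          ratio_methods, mwp_states))

REPEATS = 10              # repetitions per condition
RUNNERS_PER_RUN = 1000
DAYS_PER_RUN = 1000

total_runs = len(conditions) * REPEATS
total_runners = total_runs * RUNNERS_PER_RUN
total_days = total_runs * DAYS_PER_RUN

print(f"conditions: {len(conditions)}")
print(f"simulated runners: {total_runners}")
print(f"simulated days: {total_days} (about {total_days / 7:.0f} weeks)")
```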
There were no differences between the ACWR and the EW-ACWR calculations in relation to changes to workloads or RRI incidence proportions across the six ramp-up rates and the three random variation conditions under both MWP states (online supplementary material table 1). The EW-ACWR was, however, more sensitive than the traditional ACWR at responding to individual-level workload fluctuations (figure 2). The differences between the 0.0% and 1.0%, and 1.0% and 2.5% random variation conditions did not considerably affect workloads or RRI incidence proportions (online supplementary material table 2). As such, we examined the interaction between the six ramp-up rates and the two most extreme random variation conditions, that is, 0.0% (perfect training adherence) and 2.5% (poor training adherence) under both MWP states.
Perfect training adherence (random variation 0.0%)
When the random variation was set to 0.0%, and the MWP state was set to on, the running population maintained higher workloads than when the MWP state was set to off (figure 3). Similarly, higher ramp-up rates over the simulated time frame also resulted in higher maximum workloads. Specifically, at a 5.0% and 30.0% ramp-up rate, the distance performed by the runners was 53.8 km p/w and 62.5 km p/w, respectively. Under the same set of conditions, the RRI incidence proportion was 0.0%. Conversely, with the MWP state set to off (ie, runners could overshoot their MWP state), the RRI incidence proportion climbed from 4.2% to 30.1% across the six ramp-up rates.
The variability around workloads and RRI incidence proportions under the two different MWP states can be viewed in the dynamic ABM output plots (figures 4 and 5). With the MWP set to on, the running population consistently increased their workload, and aware of the threshold over which they would sustain RRI, safely reached a performance ceiling (figure 4). With the MWP state set to off, a given runner invariably surpassed their physical capacity and sustained an RRI (figure 5).
The workload across the six ramp-up rates remained relatively stable with the MWP state set to off (figure 3). Accordingly, a 5.0% ramp-up rate resulted in runners spending a proportionately greater amount of time performing lower weekly running distances to the benefit of fewer RRIs (figure 6). However, a 30.0% ramp-up rate reduced the length of time that runners spent at lower workloads but equally resulted in a higher RRI incidence proportion. Across the population, MWP spikes stabilised with relatively longer periods of workload growth.
Poor training adherence (random variation 2.5%)
Introducing training error into the runners’ ramp-up rates resulted in changes to both workloads and RRI incidence proportions (figure 7). This condition simulated a scenario whereby runners were aware that an MWP state existed, but they could only estimate the value within a 2.5% random variation. At a 5.0% and 30.0% ramp-up rate, the mean distance performed by the runners was 29.9 km p/w and 35.3 km p/w, respectively. The RRI incidence proportion was higher across the six ramp-up rates relative to the 0.0% random variation condition.
With the random variation set to 2.5%, there was an initial upwards workload trajectory as runners climbed towards their MWP state (figure 8). As a given runner approached and misjudged their MWP state due to training error, a higher RRI incidence proportion across the population brought the workload down over the 143 weeks.
The purpose of this study was to introduce computational modelling to sports injury research, using ABM as one example of a viable method for studying complex injury dynamics in future theoretical and practical applications. To achieve that purpose, an ABM was developed with the aim of simulating the dynamic relationship between the absolute weekly running distance and RRI, as well as the relative change to weekly running distance and RRI, through the manipulation of four athlete management tools (ie, six ramp-up rates, three random variation conditions, two ACWR calculations and two MWP states). This was an important step for the integration of complexity science and systems thinking in the sports injury literature,1–4 6–8 particularly given that no study has yet formally demonstrated the use of computational modelling in this context. Previous attempts to describe and/or apply the complex systems approach have resulted in the development of static frameworks or models that are not capable of simulating dynamic behaviours within complex sports systems and/or understanding how systems change over time.15 16 In addition to the findings related to the optimal management of RRI within running populations, this paper also reiterates the long-standing need for an alternative paradigm involving ‘dynamic simulations’ and ‘complex modelling’ as has previously been called for.2 3 Although the distance running ABM has effectively simulated the occurrence of sports injury in a population of runners, there remains a need to highlight what computational modelling can offer to the field of sports injury research more broadly. Therefore, the following discussion is structured around the main take-home messages and subsequently outlines important considerations when aiming to use simulations in future research-based applications. 
Our intention is to clarify the potential contribution of ABM and to inspire researchers and clinicians to continue to explore computational modelling and further develop applications in the sports injury context.
With regard to the presented simulation dynamics, a useful way of conceptualising the distance running ABM is to consider the rate at which the running population climbed towards an MWP state. Depending on the predefined ramp-up rate plus error condition, the goal of the agents was to run safely towards their maximum performance level. With the MWP state set to on, and the random variation condition set to 0.0% (ie, perfect adherence to instruction), the running population appropriately identified their workload limits, and sustained the lowest number of RRIs. Conversely, increasing the random variation to 2.5% (ie, poor adherence to instruction) while leaving all other conditions equal adversely affected workloads and RRI incidence proportions over the course of the simulation. The maximum workload across the six ramp-up rates was comparatively lower when training error was higher because runners were misjudging the applied workload, and therefore sustaining RRI despite being aware of their MWP state.
Setting the MWP state to on and increasing the level of random variation in the model is representative of a real-world sports training situation. Distance runners, coaches and qualified healthcare professionals may be aware that an MWP state for each athlete exists, but knowing precisely where that upper individual limit is, and how to get there safely, is arguably one of the greatest challenges in prescribing future training loads. Despite the utility of workload calculation approaches such as the ACWR,41–47 the distance running ABM has demonstrated that building weekly running distances over time, even within reported sweet spot guidelines of up to 1.3,13 14 48 will eventually result in RRI as athletes reach their upper physical limits. This was indicated in the simulation, as the programming code was set so that varying workloads would occasionally surpass the MWP state for some individuals in the absence of extreme relative changes to the applied running distance. Exceeding a physical capacity to tolerate workload is, however, not a new concept, and a ‘ceiling effect of safety’ has been described in the literature.52 53 The findings of the present study support the view that the calculation and prescription of athletic workloads should not be performed in isolation (ie, with a single metric) and require a comprehensive, individualised and flexible approach.12 40 54
Another insight offered by the simulation relates to the trade-off between a lower versus higher weekly ramp-up rate under the 2.5% random variation condition at different stages of recovery following RRI. Results showed that, in general, a higher ramp-up rate after returning from RRI resulted in greater maximum running distances across the population over the course of the simulation as runners quickly returned to their preinjury workload levels. For those athletes in the process of returning from RRI, this may be seen as positive. However, this relationship was offset by a higher RRI incidence proportion. Specifically, in the early stages of recovery and prior to reaching their MWP state, runners experienced a greater margin of error when either miscalculating workload or not observing the recommended ramp-up rate. However, when runners’ physical capacity to tolerate workload had been reached, any error to the applied running distance, irrespective of its magnitude, resulted in RRI. This ‘hard ceiling’ dynamic shows that it may be advantageous for runners who wish to maintain high distances over extended periods of time to think long term about their training, or perhaps even refrain slightly from regularly operating at their perceived level of peak performance. While this simulation was modelled on understanding RRI occurrence in a ‘general population’, the implications of this dynamic indicated fragility at the extremities of performance for the more serious runner who might aim to participate in competitive events. Athletes, running coaches and healthcare practitioners are reminded that although it is necessary to progressively and systematically increase external workloads over time, it is equally important to continuously monitor and measure internal physiological and psychological responses to that load.40
The conceptual basis and development of the distance running ABM was based on the complex systems approach.1 For this reason, it is worthwhile to briefly outline the main distinction between computational modelling methods and traditional statistical models for studying health-related phenomena. First and foremost, simulation techniques (like the ABM here) are best used for exploring and understanding mechanisms and theories in complex systems that are potentially unknown or contested.55 Conversely, statistical models, such as regression analyses, are useful for testing a priori hypotheses and analysing already collected data. Simulation methods are, therefore, better suited to generating hypotheses that can be empirically tested with statistical modelling. One theoretical advantage of ABM over other computational modelling methods (eg, system dynamics modelling, machine learning) is that it allows the programmer to more easily study systems in which the causal mechanisms for one person can change depending on their connection to other individuals.56 A typical example would be in the infectious disease context whereby the probability of contracting a given disease is dependent on whether there are only isolated cases or an epidemic. Although our simulation did not model agent interactions, there is a need for future computational studies to explore this social dimension in relation to sports injury development. As such, the application of simulation methods to a practical sports injury problem could be highly beneficial if used to supplement routine epidemiological inquiry.
When studying complex systems phenomena with ABM, it is expected that a greater reliance is placed on theory relative to data.57 That is, ABM cannot offer the same level of external quantitative credibility that traditional statistical modelling can provide.58 However, ABM enables the analyst to establish a balance between realism (ie, face validity), generality (ie, qualitative abstraction) and numerical precision (ie, fineness of model specification). This balance can be achieved by triangulating different forms of evidence and using empirical data to parameterise models when assigning agent characteristics and environmental rules at baseline.59 Notwithstanding the reported guidelines around the development, calibration and validation of computational modelling,55 60 ABM is effectively an in silico laboratory that can provide scientists and clinicians with a means of understanding what results they might expect if current theories are true, albeit without the undue financial, ethical or logistical implications that are associated with real-world pilot studies.56 For instance, it is possible to situate agents within a social network and broader spatial context, duplicate baseline conditions and subsequently change only one aspect of the model. Thus, a range of experimental scenarios can be trialled repeatedly, providing scientists with an agent-based counterfactual simulation that predicts the impact of different health-related strategies or policies based on the simulation settings provided.56 57 61 Consequently, ABM has been used to emulate the randomised controlled trial for patients with diabetic retinopathy, allowing for the examination of hypothetical interventions targeting vision loss.62 63 The next step might be to follow up such an ABM with a ‘real’ experiment by drawing on the theoretical insights generated from the simulation exercise.
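The idea of duplicating baseline conditions and changing a single aspect of the model can be sketched as follows. This is a simplified Python illustration under stated assumptions rather than the study's NetLogo implementation. The agent fields (`ceiling_km`, `weekly_km`, `injured`), the parameter values and the injury rule are hypothetical.

```python
import copy
import random

def make_population(n, seed):
    """Create one baseline population that every scenario starts from."""
    rng = random.Random(seed)
    return [{"ceiling_km": rng.uniform(40, 80), "weekly_km": 20.0, "injured": False}
            for _ in range(n)]

def run_arm(population, ramp_rate, error_pct, weeks, seed):
    """Run one scenario on a copy of the baseline; return the injury proportion."""
    rng = random.Random(seed)
    agents = copy.deepcopy(population)  # the baseline itself is never modified
    for _ in range(weeks):
        for agent in agents:
            if agent["injured"]:
                continue
            planned = agent["weekly_km"] * (1 + ramp_rate)
            applied = planned * (1 + rng.uniform(-error_pct, error_pct))
            # Assumption: exceeding the individual ceiling results in injury.
            if applied > agent["ceiling_km"]:
                agent["injured"] = True
            else:
                agent["weekly_km"] = applied
    return sum(agent["injured"] for agent in agents) / len(agents)

if __name__ == "__main__":
    baseline = make_population(500, seed=42)
    # Identical baseline, seed and error level; only the ramp-up rate differs.
    slow = run_arm(baseline, ramp_rate=0.05, error_pct=0.025, weeks=12, seed=7)
    fast = run_arm(baseline, ramp_rate=0.10, error_pct=0.025, weeks=12, seed=7)
    print(f"5% ramp-up: {slow:.2f}   10% ramp-up: {fast:.2f}")
```

Because both arms start from the same copied population, any difference in injury proportion can be attributed to the one setting that was changed, which is the counterfactual logic that a real-world trial can only approximate through randomisation.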
Given the flexibility of computational modelling, it is not surprising to find that ABM has been used to simulate cyclical, self-reinforcing feedback loops among individual microunits, including people, cells and molecules, to identify emergent patterns of behaviour, such as disease transmission dynamics,64 wound healing processes65 66 and adaptive immunity.67 Indeed, one advantage of ABM is that the complex interactions between agents can (and ideally should) be explicitly modelled. If the theoretical causal mechanisms that are encoded into the simulation programme are accurate, the simulation will provide useful predictions. These causal mechanisms are developed through a synthesis of all evidence, preferably through a causal inference approach that makes assumptions explicit. One challenge when hypothesising these causal mechanisms from observational data is that, traditionally, they require the stable unit treatment value assumption. In brief, this assumption states that the outcome of one individual should not affect the outcome of another individual, irrespective of whether they were exposed or not.36 56 As explained previously in relation to infectious diseases, this assumption is often violated when there is interference among units, leading to biased causal effects68 (eg, exposed athletes might influence the behaviours and potential outcomes of their unexposed peers). More recent work in the causal inference literature does not require this assumption; however, the suggested methods are not yet widely implemented and so the hypothesised causal mechanisms encoded into simulation models may not be accurate despite best intentions.69–72
A final consideration when using regression models (eg, logistic and survival analyses), and something that computational modelling can potentially circumvent, is the events-per-variable (EPV) requirement.73 74 In short, the EPV requirement explains that in order to conduct a robust statistical sports injury analysis and avoid ‘sparse data bias’,74 the number of events (ie, injuries) observed should be large enough in relation to the number of explanatory variables modelled. Not only does the EPV requirement necessitate considerably large sample sizes, but it is also further compounded when the researcher wishes to stratify samples to understand how the applied workload and other time-varying exposures change status over time during follow-up.75 Accordingly, in the absence of large-scale data sets, computational modelling methods such as ABM might prove useful for simulating hundreds or thousands of athletes, each of whom can be assigned characteristics and decision-making rules corresponding to real-world values and observed behaviours. Doing so might lead to a sufficient number of injurious events per explanatory variable modelled, affording insight into the likely mechanisms that generate certain outcomes. In light of the above considerations, computational modelling methods are by no means superior to traditional scientific approaches and statistical models; rather, simulation techniques could be considered (and might prove to be) a useful methodological adjunct. Although beyond the scope of this paper, a more complete list of the advantages and disadvantages of computational modelling, including ABM, should be provided and contextualised within the sports science field.
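To make the EPV arithmetic concrete, the short Python sketch below computes the ratio and the approximate cohort size it implies. The helper names are hypothetical, and the target of 10 events per variable is only a commonly cited and much debated rule of thumb, not a threshold taken from this study.

```python
import math

def events_per_variable(n_events, n_variables):
    """EPV: number of observed events divided by number of explanatory variables."""
    if n_variables <= 0:
        raise ValueError("n_variables must be positive")
    return n_events / n_variables

def athletes_needed(n_variables, target_epv, injury_proportion):
    """Approximate cohort size required to expect enough events for a target EPV."""
    if not 0 < injury_proportion <= 1:
        raise ValueError("injury_proportion must be in (0, 1]")
    required_events = n_variables * target_epv
    return math.ceil(required_events / injury_proportion)

if __name__ == "__main__":
    # 30 injuries analysed with 6 explanatory variables gives an EPV of 5.
    print(events_per_variable(30, 6))
    # 5 variables, a target EPV of 10 and 25% of athletes injured during follow-up.
    print(athletes_needed(5, 10, 0.25))
```

In the second example, 50 events are required, so a cohort of roughly 200 athletes would be needed if one in four is injured during follow-up. Stratifying the sample divides the available events among strata, which is why the requirement grows so quickly.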
On the whole, and in terms of sports injury research, computational modelling has the potential to predict the extent to which different factors and their interactions influence the onset of injury given the settings of the model. Evaluating new hypothetical injury prevention strategies is also possible, provided that these strategies do not change or require additional underlying causal assumptions that are not coded within the model. Irrespective of why simulations and computational modelling techniques are to be used, their development should be carefully planned, debated and scrutinised over a series of iterative stages that starts with a verified working model, similar to the distance running ABM presented. There is now a need for future computational modelling applications to explore how ABM can be used to simulate more advanced complex systems characteristics in relation to sports injury aetiology and prevention.
Limitations and research-based considerations
This proof-of-concept study is not without limitation. First and foremost, the intention of the ABM was not to offer original data, nor provide new or practical knowledge about how to safely increase workloads for running performance optimisation. For example, runners’ personal characteristics such as BMI were not explanatory in the sense of impacting on the results, and this should motivate future computational applications to either build on the distance running ABM directly (annotated NetLogo Code provided) or draw on its premise to guide the development of dynamic simulations in other sports domains. For that reason, the model was not instantiated with data, and the selection of runners’ personal characteristics, as well as numerical values pertaining to workload and RRI risk, were based on subject matter expertise (AH and RON), contemporary RRI causal theory10 11 and evidence around the ACWR.13 14 We contend that this provides a practical position in which to further explore computational modelling.
Another limitation relates to the different classes of agent-based simulations that can be developed. For example, the distance running ABM is more representative of a ‘multi-agent’ system, as runners responded to their environment (ie, direction from the coach) but operated independently from one another. These conditions are easily modelled with traditional simulation techniques as well. Conversely, a typical ABM in the social sciences aims to understand how the mechanism of interaction between boundedly rational agents leads to the emergence of global patterns and collective behaviour. Extending the current model to include local level interactions between agents and factors could be a means of providing further insight into the role of specific social mechanisms that drive behaviour and injury incidence, as well as recovery among the running (or other sports) populations. Although there are many different ways to advance the distance running ABM, it is essential that extensions are biologically plausible, theoretically reasonable and numerically precise where relevant.58
Under the assumption that computational modelling is to gain further traction in the field of sports science, there will be an ongoing requirement for scholarly research to ask important questions such as: when is the use of computational modelling appropriate and for what types of problems should it be applied? What should be included in a given simulation, including its scope, properties, agents, behaviours, environment, inputs and modelled time steps? How can the modelled outputs best be verified, and what type of validation is to be sought (eg, empirical output or macro-face validation)? Finally, what are the technical options for addressing model misspecification? In terms of ABM, detailed resources such as the one provided by Rand and Rust55 represent a useful starting point for sports injury scientists and clinicians who wish to continue to explore what computational modelling can offer as an alternative method.
In terms of research-based considerations, ABM requires the expert use of an object-oriented programming language (eg, Java, Python or C++), and the wide range of available computer-based software packages could be disconcerting. Overcoming these hurdles and transitioning to computational modelling requires a multidisciplinary team comprising clinicians and healthcare professionals, sports injury experts, social scientists, epidemiologists, biostatisticians and, crucially, individuals with a proven level of expertise in coding and programming.
It has long been argued that the adoption of a complex systems approach in sports injury research and practice will help to better understand and prevent injury. However, beyond theoretical description and static models of complexity, little progress has been made towards formalising this approach in a way that is practical to sports injury scientists and clinicians. In short, a true complex systems approach has not yet been applied. By demonstrating for the first time the use of ABM to simulate RRI, this study illustrates that computational modelling can be helpful in sports injury research. We used current theory to guide the choices for causal relationships in the simulations of RRI, including variation in the population MWP state, the ACWR calculation and individual differences in adherence to training (or lack thereof). Our results are therefore necessarily consistent with current theory. Specifically, when runners were aware of their upper workload limits and accurately adhered to training-related instruction, the agent population maintained the highest level of performance while sustaining the lowest number of RRIs. Poor adherence to training-related instruction adversely affected workloads and RRI incidence proportions over the course of the simulation. The ABM confirmed that the calculation and prescription of athletic workloads should not rely on a single metric and instead requires a comprehensive, personalised and adaptable approach. This is especially true for runners who are operating close to their physiological potential.
Additional implications extend beyond the presented model. In particular, scientists and clinicians interested in the philosophy of complex systems should start to explore what computational modelling can offer to a sports injury problem of interest. Likewise, descriptions of the specific advantages and disadvantages of the different computational modelling strategies (including ABM) in different contexts (eg, sports injury prediction, prevention and treatment) would be helpful. Alongside the continuing use of traditional epidemiological and clinical research-based applications, the use of computational modelling methods should be considered as a complementary methodological approach in sports injury research.
What are the findings?
Agent-based modelling (ABM) is best used for exploring and understanding mechanisms and theories in complex systems that are potentially unknown or contested. Conversely, formal statistical analyses are useful for testing a priori hypotheses and analysing already collected data.
This proof-of-concept study shows that the distance running workload and running-related injury (RRI) dynamics can be simulated using an ABM approach.
Based on our simulation settings, attempting to maintain workloads within the ACWR sweet spot will still result in athletes surpassing their own ‘ceiling of safety’ resulting in RRI. This dynamic is exacerbated with greater workload error.
How might it impact on clinical practice in the future?
Computational modelling methods such as ABM are primarily used to understand how local-level behaviours and interactions among individual ‘agents’ (eg, molecules, cells and athletes) may theoretically lead to the emergence of complex systems patterns (eg, sports injury development).
Sports injury scientists and clinicians might want to familiarise themselves with ABM to determine new ways of using this theoretically driven method so that it can be effectively applied to a specific problem of interest.
In the absence of large-scale data, scientists and clinicians interested in the aetiology and prevention of sports injury are invited to consider computational modelling as an alternative and complementary method to traditional epidemiological and clinical research-based applications.
We would like to thank Professor Rod McClure for his insightful and constructive feedback during the early stages of this research. We also thank Mr Tony Carden whose knowledge on the origins and history of complexity theory was most valuable. Dr Simon Feros, Dr Scott Mclean and Ms Natalie Selever provided useful comments pertaining to the flow and readability of the manuscript. We are grateful to Mr Nicholas Patorniti, who helped with technical formatting.
Contributors AH was responsible for the concept, ABM development, methods, results interpretation and write-up. JT was primarily responsible for developing the ABM, contributed to the methods write-up and had editorial input into the manuscript. RON, GJMR and PMS had editorial input into the manuscript and contributed to the write-up. PMS’s contribution to this work was funded by the Australian Research Council (FT140100681).
Funding Australian Research Council (grant number: FT140100681).
Competing interests None declared.
Patient consent Not required.
Provenance and peer review Not commissioned; externally peer reviewed.