Purpose The goal of this meta-analysis was to determine the clinical utility of acute mountain sickness (AMS) history to predict future incidents of AMS.
Method 17 studies (n=7921 participants) were included following a systematic review of the literature. A bivariate random-effects model was used to calculate the summary sensitivity and specificity of the diagnostic test, and moderator variables were tested to explain the heterogeneity across studies. The Quality Assessment of Diagnostic Accuracy Studies (QUADAS-2) method was used to assess concerns for bias and applicability for the included studies.
Results The history of AMS had a low diagnostic accuracy for the prediction of future AMS incidents: the summary sensitivity was 0.50 (95% CI (0.40 to 0.59)) and the summary specificity was 0.72 (95% CI (0.66 to 0.78)). There was significant heterogeneity in the sensitivity and specificity across studies, which we modelled using moderator analysis. Studies that restricted the use of acetazolamide and dexamethasone had not only a higher sensitivity (0.66) relative to those that did not (0.44; p=0.03) but also an increased false-positive rate (0.39 vs 0.23, p=0.03). The QUADAS-2 analysis showed that AMS histories were insufficiently detailed, and few studies controlled for prophylactic medication use or recent altitude exposure, leading to high risks of bias and concerns for applicability.
Conclusions The use of AMS history to guide prophylactic strategies for high-altitude ascent is not supported by the literature; however, the low sensitivity and specificity of this diagnostic test could reflect the quality of the available studies. Ensuring that the characteristics of the history and future ascents are similar may improve the clinical utility of AMS history.
- Outdoor medicine
- Evidence based reviews
Acute mountain sickness (AMS) affects many of the millions of people who ascend above 2500 m each year.1 The severity of AMS symptoms is often mild-moderate, but the environment in which AMS usually occurs (ie, austere mountainous regions where medical services are limited) augments its impact on the health, productivity and travel of high-altitude sojourners. Furthermore, AMS often precedes high-altitude cerebral oedema, a rare but potentially lethal form of altitude illness.2 While established evidence-based guidelines to treat AMS are available,3 preventing AMS is preferable. The often-cited risk factors for AMS are the ascent rate, the altitude attained (or the sleeping altitude) and the AMS history of the individual.2 ,4
There is strong evidence demonstrating that the incidence of AMS is positively correlated with the ascent rate and the altitude attained. With respect to the ascent rate, quicker ascents were associated with significantly greater AMS incidences in large groups of participants on the way to 4559 m (ref. 5) and 6962 m (ref. 6). With respect to the altitude attained, the incidences of AMS in trekkers were 9%, 13%, 34% and 53% at 2850, 3050, 3650 and 4559 m, respectively.7 That the ascent rate and the altitude attained strongly influence the incidence of AMS is congruent with the primary cause of AMS: insufficient acclimatisation to hypoxia.
Many reviews suggest that a history of AMS is a strong predictor of AMS on a future ascent;2 ,8 ,9 however, the utility of AMS history in predicting future AMS incidence is not clear. Although multiple studies have demonstrated that an AMS history is statistically associated with an increased likelihood of developing AMS,10 ,11 many of the ORs that were statistically significant were also relatively small—possibly too small to be clinically useful.12 To determine the utility of AMS history in predicting future AMS incidence, AMS history can be treated as a diagnostic test for future AMS outcomes. In this context, the diagnostic accuracy of AMS history can be described by its sensitivity and specificity for predicting future AMS outcomes.
Meta-analysis of diagnostic accuracy (MADA) is a statistical technique to combine data from multiple studies of diagnostic accuracy.13 Using MADA, a consensus estimate of the sensitivity and specificity of a diagnostic test can be obtained, and possible sources of heterogeneity across studies can be examined. Quality Assessment of Diagnostic Accuracy Studies (QUADAS-2) is a qualitative tool used in conjunction with MADA to determine the strengths and weaknesses of included studies.14 QUADAS-2 guides the researcher to create study-specific signalling questions, which are used to determine the risks of bias and the concerns for applicability (ie, relevance to the specific review question) of four domains: participant selection, index test, reference standard and timing of assessments. Using these two methods, researchers can synthesise comprehensive results for diagnostic tests and suggest modifications to improve the accuracies of diagnostic tests. The purpose of this examination was to use MADA and QUADAS-2 to systematically determine the utility of a history of AMS for the prediction of a future AMS outcome. We hypothesised that AMS history would be a reliable tool for predicting future AMS outcomes.
Potential studies were identified by searching PubMed and Google Scholar with combinations of the following keywords as queries: ‘acute mountain sickness,’ ‘previous history,’ ‘repeatability’ and ‘reproducibility.’ All identified studies published in English before May 2013 were reviewed for inclusion. One French study15 referred to by another publication was included after being translated into English. To be included in the analysis, studies had to report (1) AMS histories for participants; (2) AMS outcomes for participants on the investigated ascent(s) and (3) the number of true positives (TP), false positives (FP), false negatives (FN) and true negatives (TN; see table 1). Data were independently extracted from the included studies by two researchers.
Additional information was extracted from each study to perform a quality assessment using the QUADAS-2 protocol14; however, reporting this information was not a requirement for inclusion. Each study's risk of bias and concern for applicability were rated as low, high or unclear for each domain (except for the timing domain, for which concern for applicability does not apply). Participant selection quality was judged from the method of recruitment, whether any participant took medications to prevent AMS, and whether any participant spent >1 day above 3000 m in the preceding 2 months. Under the QUADAS-2 protocol, the AMS history assessment was termed the ‘index test’ and the AMS outcome assessment was termed the ‘reference standard.’ The quality of each assessment was judged from the following information: timing of assessment in relationship to the occurrence of symptoms (retrospective/prospective), method of diagnosis (and threshold), altitude of assessment and whether the assessments were performed independently. Finally, the risk of bias in the timing of the index test and the reference standard was based on the time elapsed between the AMS history and AMS outcome.
For the purpose of this paper, the participants’ AMS history was treated as a binary index test (ie, positive/negative). The diagnosis of AMS reported in each study was also treated as a binary reference standard (ie, positive/negative) and is referred to as the ‘AMS outcome.’
For each study, the TP, FP, FN and TN data were used to calculate the sensitivity, specificity and false-positive rate (FPR=1−specificity). For each proportion, 95% CIs were calculated based on ref. 16.
Studies of diagnostic accuracy generate pairs of sensitivity and specificity scores as outcomes. Sensitivity (P) is the ratio of TP to the total number of people with an AMS-positive outcome (TP+FN). Conceptually, sensitivity represents how well AMS history predicts who will get AMS on a future exposure among people who subsequently developed AMS (a score of 1 would be perfect prediction). Specificity (N) is the ratio of TN to the total number of people with an AMS-negative outcome (TN+FP). Conceptually, specificity represents how well AMS history predicts who will not get AMS among people who did not subsequently develop AMS (again, a score of 1 would be perfect prediction). Related to specificity is FPR (FPR=1−specificity), for which a score of 0 would indicate perfect prediction (ie, no false-positive cases). All calculations in the meta-analysis are based on the sensitivity and specificity of individual studies (Pi and Ni) and the variability of these parameters between studies.
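The definitions above can be sketched in a few lines of Python. This is a minimal illustration only: the class name `TwoByTwo` and the counts are hypothetical and are not taken from any included study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Counts from one study, with AMS history as the index test."""
    tp: int  # positive history, developed AMS
    fp: int  # positive history, did not develop AMS
    fn: int  # negative history, developed AMS
    tn: int  # negative history, did not develop AMS

    def sensitivity(self) -> float:
        # TP divided by everyone with an AMS-positive outcome
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        # TN divided by everyone with an AMS-negative outcome
        return self.tn / (self.tn + self.fp)

    def false_positive_rate(self) -> float:
        # FPR = 1 - specificity
        return 1.0 - self.specificity()


# Hypothetical counts, for illustration only.
study = TwoByTwo(tp=30, fp=20, fn=30, tn=70)
print(study.sensitivity())                    # 0.5
print(round(study.specificity(), 3))          # 0.778
print(round(study.false_positive_rate(), 3))  # 0.222
```

Note that both proportions are conditioned on the AMS outcome (the reference standard), not on the AMS history.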
The main analysis tested a bivariate random-effects model for sensitivity and specificity of using AMS history to predict AMS outcome. We used a bivariate meta-analysis approach17 instead of a univariate analysis18 because bivariate results allow for the estimation of sensitivity and FPR and the correlation between the two parameters. Simultaneous estimation of these parameters is more clinically relevant than the single diagnostic OR provided by univariate analysis. In order to be analysed statistically, sensitivity and specificity must be transformed into logits (a logit is the natural log of the odds, p/(1−p), which allows for statistical analysis through a general linear model). Following analysis, an inverse logit can be used to return point estimates and CIs to their original dimensions. We used R (r-project.org) to test a bivariate random-effects model of diagnostic accuracy using the ‘mada’ package.19
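The logit and inverse-logit transformations can be sketched as follows. This is an illustrative Python sketch, not the R code used for the analysis; the function names are assumptions, and the two proportions are the summary estimates reported in this paper.

```python
import math


def logit(p: float) -> float:
    """Natural log of the odds p / (1 - p); defined for 0 < p < 1."""
    return math.log(p / (1.0 - p))


def inv_logit(x: float) -> float:
    """Map a logit back to a proportion between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))


sens = 0.50       # summary sensitivity reported in this paper
spec = 0.72       # summary specificity reported in this paper
fpr = 1.0 - spec  # false-positive rate

print(logit(sens))                      # 0.0 (odds of 1:1)
print(round(logit(fpr), 3))             # -0.944
print(round(inv_logit(logit(fpr)), 2))  # 0.28
```

Working on the logit scale keeps model estimates and their CIs inside the 0 to 1 range once they are transformed back.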
Building on this main analysis, we tested a number of moderator variables to explain the heterogeneity of sensitivity and FPR found between studies. First, heterogeneity tests were performed to determine whether sufficient variation was present to warrant analysing moderator variables. Limited data for moderator variables were available in the original studies, and thus only three moderator variables were tested: (1) altitude of the AMS diagnosis (in km), (2) whether all participants had previous altitude exposure prior to the AMS outcome ascent (coded as 0=no, 1=yes) and (3) whether any participant used prophylactic medications (coded as 0=no, 1=yes). Details of the models are explained in the online supplementary appendix.
Risks of bias and concerns for applicability
A total of 736 records were reviewed; 27 studies were assessed for eligibility; 10 studies were excluded for not including sufficient data; 17 studies were included in this meta-analysis (table 2). Results from the QUADAS-2 analysis are displayed in figure 1. There was a low risk of bias in participant selection, but a high concern for the applicability of participants due to prophylactic medication use and recent altitude exposure. As most studies determined AMS history retrospectively, did not describe the characteristics of previous ascents, and did not describe the method of determining AMS history, the index test had a high risk of bias and an unclear concern for applicability. In contrast, most studies determined AMS outcome prospectively with an acceptable method, leading to a low risk of bias and low concern for applicability with the reference standard. Finally, the time between the index test and reference standard was not stated in 15 studies (and many studies included participants with recent altitude exposure), indicating a high risk of bias for the timing of assessments.
The raw data from each study are provided in table 3, and the descriptive statistics for each study are shown in figure 2, which plots the sensitivity and specificity (with 95% CI) separately for each study. A total of 7921 participants were included, and the mean and median sample sizes were 466 and 138, respectively. The test for heterogeneity of sensitivity was significant, χ2(17)=486.20, p<0.0001, as was the test for heterogeneity of specificity, χ2(17)=436.38, p<0.0001. Significant heterogeneity in both factors validates the use of a random-effects modelling procedure and suggests sufficient variability between studies to model this variability using metaregression.
Bivariate random-effects model
Combining data from 7921 participants in the bivariate meta-analysis led to a summary sensitivity of 0.50 (95% CI (0.40 to 0.59)) and summary specificity of 0.72 (95% CI (0.66 to 0.78)). The details of the model are presented in table 4, and a summary receiver-operator characteristic curve is shown in figure 3.
The random-effect model calculates a mean sensitivity and FPR, the amount of between-study variation, and the strength/direction of the correlation between sensitivity and FPR. From these statistics, we calculated a 95% confidence ellipse around a summary estimate of sensitivity and specificity, shown in figure 3. As aforementioned, the model analyses the logit of sensitivity and FPR; in order to display these estimates in their original units, the results were reverse-transformed using an inverse logit.
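To give a sense of what these summary estimates imply for an individual, the sketch below converts them into likelihood ratios and predictive values. These quantities were not part of the reported analysis; the function names are hypothetical, and the 30% AMS incidence is an assumed value chosen purely for illustration.

```python
def likelihood_ratios(sens: float, spec: float) -> tuple:
    """Return (LR+, LR-) for a binary test."""
    lr_pos = sens / (1.0 - spec)
    lr_neg = (1.0 - sens) / spec
    return lr_pos, lr_neg


def predictive_values(sens: float, spec: float, prevalence: float) -> tuple:
    """Return (PPV, NPV) via Bayes' rule for an assumed prevalence."""
    tp = sens * prevalence
    fp = (1.0 - spec) * (1.0 - prevalence)
    fn = (1.0 - sens) * prevalence
    tn = spec * (1.0 - prevalence)
    return tp / (tp + fp), tn / (tn + fn)


SENS, SPEC = 0.50, 0.72    # summary estimates reported above
ASSUMED_PREVALENCE = 0.30  # hypothetical AMS incidence, illustration only

lr_pos, lr_neg = likelihood_ratios(SENS, SPEC)
ppv, npv = predictive_values(SENS, SPEC, ASSUMED_PREVALENCE)
print(round(lr_pos, 2), round(lr_neg, 2))  # 1.79 0.69
print(round(ppv, 2), round(npv, 2))        # 0.43 0.77
```

Under this assumed incidence, a positive history would raise the estimated probability of AMS from 0.30 to about 0.43 and a negative history would lower it to about 0.23. Both shifts are modest, and both depend on the incidence assumed.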
Tests of moderator variables
For reasons of statistical power, each moderator was tested separately (see online supplementary appendix). The results of the moderator variable analysis are shown in table 5. Slopes and intercepts in table 5 are reported in logits; point estimates in the text have been transformed back to the original units using inverse logits.
There was no evidence that the altitude at diagnosis had any effect on test sensitivity or FPR (see online supplementary figure A1). The estimated sensitivity and FPR at sea level (0.48 and 0.27, respectively) were very similar to the values estimated for 8 km above sea level (0.46 and 0.29, respectively).
Controlling for altitude exposure in the AMS history (ie, ensuring that all participants had previously been exposed to altitude) did not significantly affect the sensitivity of AMS history as an index test (see online supplementary figure A2). Studies that did control for previous exposure had higher sensitivity on average (0.64) relative to those that did not (0.46), but this difference was not significant (p=0.24). For FPR, the intercept was significant but the slope was not, indicating that FPR was significantly different from zero but comparable between studies that did control (0.29) and did not control (0.29) for altitude exposure.
Finally, controlling for prophylactic medications had a significant effect on sensitivity and FPR (see online supplementary figure A3). Studies that did not allow prophylactic medications had a significantly higher sensitivity (0.66) than studies that did not control for prophylactic medication use (0.40). FPR was also significantly higher in studies that did control for medications (0.39) than in studies that did not (0.23).
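The relationship between logit-scale coefficients and the point estimates quoted in the text can be sketched as below. The intercept and slope here are back-calculated from the two sensitivities in the preceding paragraph and are not the table 5 values; the 0/1 coding (0=no participant used medications, 1=medications used) follows the coding described in the methods, and the function names are assumptions.

```python
import math


def logit(p: float) -> float:
    return math.log(p / (1.0 - p))


def inv_logit(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


# Point estimates quoted in the text (original units).
SENS_NO_MEDS = 0.66  # studies that did not allow medications (code 0)
SENS_MEDS = 0.40     # studies that did not control medication use (code 1)

# Back-calculated, illustrative coefficients; NOT the table 5 values.
intercept = logit(SENS_NO_MEDS)
slope = logit(SENS_MEDS) - logit(SENS_NO_MEDS)


def predicted_sensitivity(medication_code: int) -> float:
    """Sensitivity implied by a logit-linear model with a 0/1 moderator."""
    return inv_logit(intercept + slope * medication_code)


print(round(predicted_sensitivity(0), 2))  # 0.66
print(round(predicted_sensitivity(1), 2))  # 0.4
```

A negative slope on the logit scale corresponds to lower sensitivity in studies where medication use was permitted.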
Our meta-analysis indicates that a positive AMS history is not very useful in predicting a positive AMS outcome (ie, low sensitivity), but a negative AMS history is moderately useful in predicting a negative AMS outcome (ie, moderate specificity/FPR); however, neither the sensitivity nor the specificity was sufficient to rely on AMS history to plan ascents to high altitudes (eg, recommending medications, ascent rates, etc). The low utility for AMS history as a predictor of AMS outcome may reflect the quality of the included studies (many studies had high risks of bias, especially for the index test, and high concerns for applicability for all QUADAS-2 domains), or it may suggest that AMS history is not a useful predictor of AMS outcomes.
We used QUADAS-2 to qualitatively describe the included studies and identify their weaknesses and strengths. Our analysis demonstrated that participant recruitment was not likely to be biased, but the research question may not have been applicable to many participants given that some were using one or more prophylactic medications or had recently spent time at altitude to prevent AMS on their AMS outcome ascents. These substantial differences between the samples of the included studies contribute to the high variability in diagnostic accuracy (ie, the significant heterogeneity in sensitivity and specificity).
Our analysis also demonstrated that many studies had weak index tests (or did not adequately describe their index tests). The majority of studies reported little or no information related to participants’ previous ascents or how they determined AMS history, providing little confidence in the quality of AMS history assessments in most studies. In contrast, most studies used an acceptable method and threshold to diagnose AMS (ie, Lake Louise Score (LLS) Questionnaire with an LLS >3 or >4 (ref. 34); Environmental Symptom Questionnaire with an AMS cerebral score >0.70 (ref. 35)). Also, many studies diagnosed AMS prospectively, eliminating problems related to remembering past symptoms. Finally, the timing between the index test and reference standard was not reported in most studies, making it unclear whether or not the gap between ascents was appropriate. Given that recent exposure to altitude is associated with a decrease in AMS symptoms on subsequent exposures,5 ,6 ,36 the timing between ascents is critical.
Our QUADAS-2 analysis examined the included studies in the context of our specific research question, and it is necessary to stress that our results do not demonstrate that the included studies are of poor quality overall. Rather, most studies tested the association between AMS and multiple variables, and most were not designed specifically to test the association between AMS history and AMS outcome. These studies are still relevant to the meta-analysis, but their focus on multiple variables may affect their diagnostic accuracies in the context of our analysis.
Given the issues with the quality of the studies included in the meta-analysis, it is not clear if poor sensitivity and specificity reflect problems with predicting AMS outcomes from AMS histories per se or a problem with the quality of the available data. Sensitivity and specificity will be high when FN and FP are reduced. A higher rate of FN and FP could be expected if the altitude, ascent rate, acclimatisation status and drug use were not consistent between the history ascent(s) and outcome ascent. For example, FN may be high (creating low sensitivity) when the altitude and rate of ascent are lower on the AMS history ascent(s) than the AMS outcome ascents or when there was no history ascent and participants’ histories were recorded as negative. This idea is partially supported by the trend for higher sensitivities in studies that only included participants with altitude experience (relative to studies that included participants without any previous altitude experience).
Similarly, the ratio of TP to FP may be affected when participants develop AMS on their history ascents and then take a prophylactic medication (eg, acetazolamide) or preacclimatise on their outcome ascents. This idea is supported by the significant effect showing that studies excluding the use of medications had higher sensitivities than studies that did not control this variable. Conversely, FN may be high if participants took acetazolamide or preacclimatised for the AMS history ascent(s) but not for the AMS outcome ascent. This information was not provided in most studies, but it is very likely that the majority of studies did not control for these potential confounding factors given their opportunistic recruitment strategies and observational designs. We attempted to control for acetazolamide use in our analysis because of its capacity to prevent AMS3; however, the potential side effects of acetazolamide overlap considerably with the symptoms of AMS (eg, nausea and lethargy),37 further complicating the analysis of studies that permitted acetazolamide use.
Only two studies attempted to match the ascent rate, altitude attained, use of medications and acclimatisation across the AMS history ascent and AMS outcome ascent. Rexhaj et al27 reported a much higher sensitivity and a similar specificity to the estimates provided in our analysis; MacInnis et al22 reported not only a much higher sensitivity but also a much lower specificity compared with the estimates provided by our analysis. Increased familiarity with the environment was suggested to explain the low specificity in this study. Based on these studies, it is possible that AMS history could be a useful predictor of a positive AMS outcome if the ascents were better matched in terms of altitude, ascent and the use of prophylactic medications.
While FP may have occurred because of a lack of methodological controls, an alternative explanation is also possible for the moderate FPR: individuals may be less susceptible to AMS on subsequent ascents to altitude simply because they are familiarised with the high altitude environment and the physiological responses to hypoxia and report less severe symptoms. MacInnis et al22 reported a significant decrease in AMS severity across two 12 h exposures to normobaric hypoxia. The likelihood of acclimation was low in this study, as participants only experienced 12 h of hypoxia (>2500 m) in the 10 weeks preceding the second hypoxic exposure. In a study of acclimatisation and reacclimatisation at altitude, MacNutt et al38 suggested that previous exposures to high altitude might increase one's psychological tolerance of altitude, which could lower self-reported AMS symptom severity at altitude on subsequent exposures.
The Wilderness Medical Society consensus guidelines for preventing AMS3 are based partly on the AMS history of individuals. Our meta-analytic results (based on a large and systematically collected dataset) do not suggest that there is utility in this strategy. It is important to note that our analysis does not demonstrate that AMS history cannot predict AMS outcome; rather, it demonstrates that the predictive utility of AMS history is insufficient for AMS history to be a diagnostic tool. As discussed above, a possible reason for the lack of evidence is that most studies classified the AMS history of participants without any appreciation for the conditions of previous ascents (eg, ascent characteristics, use of medications and preacclimatisation strategies): a participant developing AMS only above 6000 m and a participant developing AMS at 2000 m are both considered to have a positive AMS history. In this case, the positive AMS history at 2000 m may indicate that the participant is at high risk for developing AMS at 3000 m, but the former participant's response to 6000 m is not likely to be a good predictor of that participant's response to 3000 m. Similarly, an individual who has not developed AMS at 3000 m should not be considered to have a negative AMS history when planning an ascent to 5000 m (an ‘uninformative history’ would probably be a better classification). Thus, AMS history should be evaluated with an appreciation of the previous ascent(s) and the future ascent(s); there is no reason to necessarily expect AMS history to accurately predict an AMS outcome when the conditions of the ascent are considerably different.
Currently, AMS history is not clinically useful in predicting future AMS outcomes based on the available published data. The low sensitivity and specificity may reflect the quality of the included studies, and large, well-designed studies are needed to clarify the potential utility of previous AMS history in predicting AMS outcomes. Most importantly, AMS history should be considered in the context of the previous and future ascents, and extrapolations to novel altitudes and conditions may not be possible. Future research should give more consideration to the rate of ascent (at outcome and history), the altitude of assessment (at outcome and history) and medication use, and should report demographic data in detail. Given the low cost and speed of using AMS history as a diagnostic tool, it is very important to disentangle these variables and determine if the sensitivity and specificity of AMS history improve as a result. Based on the available data, there is significant variability in the predictive utility of different studies, but controlling for the existing moderators did little to improve the sensitivity or specificity of AMS history as a diagnostic test. Further research is needed to see if this variability can be modelled systematically, creating a more nuanced, but more accurate, relationship between AMS history and the likelihood of developing AMS in the future.
MULTIPLE CHOICE QUESTIONS AND ANSWERS
Acute mountain sickness does not usually develop below:
Correct answer: a
Acute mountain sickness occurs very infrequently below 2500 m.
The current guidelines for the prevention of acute mountain sickness (Luks et al. 2010) use which of the following variables to categorise an individual's risk of developing acute mountain sickness:
the rate of ascent
the rate of ascent and the altitude attained
the rate of ascent, the altitude attained, and the individual's history of acute mountain sickness
Correct answer: c
The current guidelines for the prevention of acute mountain sickness (Luks et al. 2010) suggest using the rate of ascent, the altitude attained (or sleeping altitude), and the individual's history of AMS to categorise an individual's risk of developing AMS.
Based on this meta-analysis of diagnostic accuracy, which of the following statements is correct regarding the use of history to guide prophylactic strategies:
A positive history of acute mountain sickness is more informative than a negative history of acute mountain sickness.
A negative history of acute mountain sickness is more informative than a positive history of acute mountain sickness.
Positive and negative histories of acute mountain sickness are equally informative.
Neither a positive nor a negative history of acute mountain sickness is informative.
Correct answer: b
According to this meta-analysis of diagnostic accuracy, the sensitivity of a previous history of AMS was 50% and the specificity was 72%. This indicates that 50% of people who developed AMS had a positive AMS history and that 72% of people who did not develop AMS had a negative AMS history, which suggests that a negative history of AMS is more informative than a positive history of AMS.
In which of the following scenarios is an individual's history of acute mountain sickness most likely to be useful for planning a future ascent:
The individual has never been to altitude and wants to ascend to 3000 m.
The individual developed AMS at 5000 m and wants to ascend to 3000 m.
The individual did not develop AMS at 3000 m and wants to ascend to 5000 m.
The individual did not develop AMS at 5000 m and wants to ascend to 3000 m.
Correct answer: d
A negative history is more informative than a positive history, and this individual had a negative history at a greater altitude than the altitude to which he/she plans to travel; therefore, it is likely that the individual will not develop acute mountain sickness at 3000 m. In the other scenarios, the individual has no history (a), a positive history (b) or a negative history (c). Of these, only the negative history of scenario c could be informative; however, because the individual wishes to ascend to an altitude greater than that from which his/her history was established, this negative history is quite likely to be uninformative.
The greatest concern with the studies included in this meta analysis is:
The lack of a consensus for measuring an individual's history of AMS.
The methods used to assess AMS symptoms on the “AMS outcome” ascent were not appropriate.
Too many studies used case-control designs.
Too many studies evaluated clinical populations.
Correct answer: a
There is no consensus for assessing an individual's history of AMS. Most studies did not report how this was determined, and nor did they report the conditions of the previous ascent (eg, altitude, ascent rate). The other choices are not correct, as appropriate methods were used to assess acute mountain sickness symptoms on the outcome ascent, only one study had a case-control design, and all studies investigated non-clinical populations.
What are the new findings?
This is the first meta-analysis to assess the diagnostic accuracy of acute mountain sickness (AMS) history for predicting future AMS incidents.
AMS history had a specificity of 0.72 and a sensitivity of 0.50 with respect to predicting future AMS incidents.
Potential risks of bias and concerns for applicability were identified in the included studies, suggesting that the available evidence for AMS history as a predictor of future AMS incidents is weak.
How might it impact on clinical practice in the near future?
The use of AMS history to guide prophylactic strategies for high-altitude ascent is not supported by the literature.
A negative AMS history may be partially informative, but a positive AMS history is uninformative, with respect to future ascents to high altitude.
Ensuring that the characteristics (eg, ascent rate and altitude attained) of the history ascent are similar to those of the future ascent may improve the clinical utility of AMS history.
The authors would like to thank Dr Khan and the manuscript reviewers for reviewing their manuscript and providing constructive feedback.
Contributors MJM conceived of the study, and MJM and JKS performed the systematic review and data extraction. KRL chose the statistical methods and carried out the analysis with assistance from MJM and MSK. MJM and KRL wrote the first draft of the manuscript; MSK critically reviewed the manuscript; and all authors assisted with revising the manuscript and approved the final draft.
Funding MJM was supported by a Natural Science and Engineering Research Council of Canada (NSERC) Canada Graduate Scholarship.
Competing interests None.
Provenance and peer review Not commissioned; externally peer reviewed.