I appreciate the thoughtful considerations raised. Scientific discussion is always the best opportunity to review points, exchange thoughts, and advance knowledge. Some additional considerations follow:
- About strength and VO2peak controlled by FFM and/or weight:
We showed these data in the article (strength/FFM; VO2peak/FFM; VO2peak/weight) in the results and in Table 2. There are no statistical differences when comparing all populations (TW, CM and CW).
- TW with 637 ng/dL testosterone on the day of the tests:
In the long-term follow-up of a cohort of individuals with daily medication use, temporary failures in the regular use of medications are not uncommon. One of the participants had a high level of testosterone at the time of the study. However, we emphasize that we were careful to assess testosterone levels in the year before the study so that we could confirm the correlation of the values obtained at the time of the study with those in the last year. In addition, the haemoglobin values indicated testosterone suppression in the past 4 months. Although one of the TW was not blocked on test day (total testosterone = 637 ng/dL), her value was 79 ng/dL six months before the study. This point did not interfere with her VO2 results (supplementary figure 2).
- Weight and height:
Studies in sports medicine generally exclude height as an interfering factor in the analyses.
Height is a consequence of the pubertal pattern that the individual underwent (male or female), and cis women are on average 13 cm shorter than men. We have to be careful when correlating height with sports performance. Men are taller and also have more muscle mass, larger heart chambers, and a larger bronchial tree diameter. Therefore, height cannot be identified as a cause of sporting abilities, but rather as a confounding factor.
It is not weight alone that may affect sporting abilities, but muscle mass and several other factors.
Gender studies in sport often use BMI for pairing.
- Grip strength test:
The grip strength test yields results that reflect an overall assessment of the individual and is widely used in performance studies across diverse populations.
Alvares et al. conducted a study to compare performance-related measures such as cardiopulmonary exercise capacity and muscle strength in non-athlete transgender women (TW) undergoing long-term gender-affirming hormone therapy to non-athlete cisgender men (CM) and non-athlete cisgender women (CW). The authors report higher absolute VO2peak (L/min) and muscle strength (kg) in TW compared to CW and lower than CM. The authors conclude that their “…findings could inform policy and help in decisions about the participation of transgender women in sporting activities”.
However, the authors interpreted their findings on the basis of the absolute data they present and not the relative data that were controlled for body mass and fat-free mass (FFM), as would be appropriate for comparisons of such performance metrics (e.g., aerobic capacity and muscle strength). By focusing on the absolute data, the authors over-emphasise differences between comparison groups (e.g., TW and CW) that are clearly driven by differences in anthropometry. For example, when the data reported in Table 2 are corrected for body mass and fat-free mass (FFM), differences in aerobic capacity and strength between TW and CW disappear. Yet, in the section “WHAT THIS STUDY ADDS”, which is the primary focus of many readers, the authors omit the results that control for body mass and FFM, instead leaving the reader with the misleading message that “[t]he mean strength and VO2peak in non-athlete TW … were higher than those in non-athlete CW.”
Relatedly, we question the authors’ reliance on expressing lung volume relative to body mass and FFM, when expressing it relative to height would be more informative. The important relationship between height and indices of cardiovascular function/capacity is widely recognised. However, Alvares et al. only report the heights of participants in their supplementary data, which many readers of the paper will overlook. Our own analysis of the supplementary data shows that height differences between TW and CW could fully account for the absolute differences in lung capacity. Again, the key finding here would be an absence of difference in relative aerobic capacity of cisgender versus transgender women.
In addition to the biased message, there are many other basic errors in this manuscript that call into question its value, as published, to inform policy. In the interest of brevity, our main concerns are listed below:
1. We question why one of the TW participants had a very high level of testosterone on the day of testing (TT=637 ng/dL). The authors justified the inclusion of this subject’s testing results on the basis of her having had a lower testosterone level six months prior. However, this is a cross-sectional study, meaning this participant should have been excluded. Without access to the raw data it is impossible to know the impact of the inclusion of this data in the comparison between groups.
2. VO2peak rather than VO2max is listed as the measure of aerobic capacity, even though VO2peak is known to underestimate the value of VO2max by 0.1-0.4 L/min. It is unknown whether this represents a semantic error or whether the authors have indeed measured VO2peak.
3. The authors rely on the International Physical Activity Questionnaire (IPAQ) to assess the physical activity levels of participants and match subjects in the comparison groups for meaningful data interpretation. However, the IPAQ is considered a valid instrument for measuring physical activity in large study populations, and not for such small studies needing validity at the individual level. This makes the interpretation of the comparisons between “matched” groups very difficult, if not impossible.
4. The number of participants reported in the abstract (i.e. 15 TW, 13 CM and 14 CW) does not match the number reported later in the article. The units of measurement for maximum aerobic capacity are clearly erroneous, as the values are physiologically implausible (e.g. in Table 2, VO2peak (L/min) is given as 2606±416.9 in TW, 2167±408.8 in CW and 3358±436.3 in CM).
5. The authors imply that gender dysphoria is a mental disorder, stating that “the criteria for gender dysphoria diagnosis were in accordance with the Diagnostic and Statistical Manual of Mental Disorder.” However, gender dysphoria is no longer classed as a mental health disorder.
In conclusion, we suggest that, as presented, the paper by Alvares et al. cannot be used to “inform policy and help in decisions about the participation of transgender women in sporting activities”. Instead, the scientific community, including these authors, should be encouraged to conduct and publish high quality studies involving trained transgender individuals (and athletes where possible) and involving sport-specific measures to inform policy and guidelines.
1. Alvares LAM, Santos MR, Souza FR, Santos LM, Mendonça BB, Costa EMF, Alves MJNN, Domenice S. Cardiopulmonary capacity and muscle strength in transgender women on long-term gender-affirming hormone therapy: a cross-sectional study. Br J Sports Med. 2022 Nov;56(22):1292-1298.
2. Hepper NG, Fowler WS, Helmholz HF Jr. Relationship of height to lung volume in healthy men. Dis Chest. 1960 Mar;37:314-20. PMID: 14401182.
3. Smirmaul BP, Bertucci DR, Teixeira IP. Is the VO2max that we measure really maximal? Front Physiol. 2013 Aug 5;4:203.
4. Rodríguez MF, Granda MM, González V. Gender incongruence is no longer a mental disorder. Journal of Mental Health & Clinical Psychology. 2018;2(5).
5. Martowicz M, Pape M, Budgett R, Mascagni K, Engebretsen L, Dienstbach-Wech L, Pitsiladis Y, Pigozzi F, Erdener U. Position Statement: IOC Framework on Fairness, Inclusion and Non-Discrimination on the Basis of Gender Identity and Sex Variations. British Journal of Sports Medicine. In review.
The topic of transgender inclusion in women’s sports is politically fraught. Sports’ governing bodies are grappling with the competing priorities of inclusivity and fairness because of any perceived competitive advantage, associated with testosterone exposure during puberty, above and beyond the large and broad continuum of biological variables found within cisgender women (e.g. height, bone mass, bone length, fiber cross-sectional diameter, etc.). This active area of research is rapidly evolving owing to the multitude of new studies published over the previous 5 years. In fact, over a dozen primary prospective and case-control research studies have been published on this topic since 2018, resulting in the lowering of the maximum allowable testosterone level in transgender elite athletes (i.e., from 5.0 to 2.5 nmol/L) by several sports’ governing bodies.
The preponderance of evidence suggests that hematological differences in hematocrit, red cell number, and hemoglobin are largely normalized within 120 days of testosterone suppression, which is biologically plausible as this corresponds with the average lifespan of a red cell (~ 120 days). Since oxygen delivery to peripheral tissues is performance limiting in aerobic sports, any competitive advantage is likely largely diminished within a year of testosterone suppression. Studies evaluating changes in strength, muscle mass, and body composition are more equivocal and most likely occur over a longer time span (12-36 months).
Few studies have evaluated cardiopulmonary differences in transgender women relative to cisgender women or men. The recent publication by Alvares et al. evaluated cardiopulmonary capacity and grip strength in a small cohort of non-athlete cisgender and transgender women (CW and TW) and cisgender men (CM) in São Paulo, Brazil. Fifteen transgender women were recruited from a clinic that specializes in the treatment of transgender patients. The average age of the TW was 34.2 ± 5.2 years, with an average duration of hormonal treatment of 14.4 ± 3.5 years (median age of treatment initiation was 17 years old). Although the TW subjects had been on hormonal treatment for over a decade, 11 of the 15 subjects were dependent on chemical testosterone suppression (i.e., non-gonadectomized). The median testosterone level over the previous 12 months for the TW subjects was 3.5 nmol/L, with 4 of the subjects above 7 nmol/L, which is within the range observed in the CM group. As noted above, several sports’ governing bodies require testosterone suppression below 2.5 nmol/L throughout the entire year. Prospective testosterone data for each subject were not provided, so it is unclear how many TW subjects meet this criterion; however, the median levels presented in supplemental figure 1 suggest that at least 8 of the 15 subjects do not meet it. Despite the suboptimal hormonal control, hemoglobin levels of the TW were not different from those of the CW, and both groups were significantly lower than the CM group. Although the groups were matched by age and activity, they were not weight matched. The average body weights were 60.8 kg, 78.1 kg, and 81.3 kg for the CW, TW, and CM groups respectively (CW vs TW and CW vs CM were significantly different; P < 0.001).
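Because the eligibility thresholds above are stated in nmol/L while the test-day and six-month values discussed elsewhere in this correspondence are quoted in ng/dL (637 and 79 ng/dL), a unit conversion helps put them on the same scale. The following is a minimal sketch, assuming a molar mass of about 288.4 g/mol for testosterone; the function name and the comparison against the 2.5 nmol/L limit are illustrative only and are not part of the original study.

```python
# Illustrative unit check (not from the original study).
# Assumption: molar mass of testosterone is ~288.4 g/mol.
TESTOSTERONE_MOLAR_MASS_G_PER_MOL = 288.4
THRESHOLD_NMOL_L = 2.5  # limit cited in the letter for several governing bodies


def ng_dl_to_nmol_l(ng_dl: float) -> float:
    """Convert a testosterone concentration from ng/dL to nmol/L."""
    ng_per_litre = ng_dl * 10.0  # 1 L = 10 dL
    # g/mol is numerically equal to ng/nmol, so dividing gives nmol/L.
    return ng_per_litre / TESTOSTERONE_MOLAR_MASS_G_PER_MOL


# Values quoted in the correspondence for one TW participant.
reported_ng_dl = {"test day": 637.0, "six months earlier": 79.0}

converted = {label: ng_dl_to_nmol_l(v) for label, v in reported_ng_dl.items()}
for label, nmol in converted.items():
    status = "above" if nmol > THRESHOLD_NMOL_L else "at or below"
    print(f"{label}: {nmol:.1f} nmol/L ({status} {THRESHOLD_NMOL_L} nmol/L)")
```

Under this assumption, 637 ng/dL corresponds to roughly 22 nmol/L and 79 ng/dL to roughly 2.7 nmol/L. The thresholds apply to elite athletes, so the comparison is a scale check only, not a statement about eligibility of the study participants.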
The authors performed cardiopulmonary exercise testing on a treadmill using a ramp protocol to exhaustion. They measured oxygen consumption at rest (prior to running), at anaerobic threshold (AT), at respiratory compensation point (RCP), and at peak. Values were provided on an absolute basis (mL/min) at rest, AT, and RCP. VO2 peak was presented on both an absolute and a relative basis (relative to total body weight and fat free mass [FFM]; mL/kg/min). The absolute oxygen consumption at rest, at RCP, and at peak was higher in the TW group than in the CW group. This is not surprising, since the average body weight of the TW group was about 28% higher than that of the CW group (78.1 vs 60.8 kg).
Conceptually speaking, someone who is heavier (i.e., has a higher fat free mass) is more metabolically active and will consume more oxygen per unit of time. The authors do present peak oxygen consumption normalized to total body weight and fat free mass. When doing so, differences in peak oxygen consumption disappear. In fact, when normalized to FFM, VO2 peak was 11% lower in the TW group than in the CW group, although the difference was not statistically significant. When corrected for body weight, no differences in oxygen consumption were observed between the two groups (Rest - 4.2 vs 4.0 mL/kg/min; AT - 21.6 vs. 21.5 mL/kg/min; RCP - 29.6 vs. 31.5 mL/kg/min; Peak - 33.4 vs 35.7 mL/kg/min). This is an important point because it suggests that there are no differences in cardiopulmonary capacity in TW compared to CW when normalized to body weight. Although these subjects reported high activity levels, the peak oxygen consumption (VO2 peak) for the CW, TW, and CM groups was roughly half the VO2 peak observed in most elite athletes (>60 mL/kg/min). These results should not be extrapolated to elite athletes.
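The effect of normalizing to body weight can be illustrated with the group means quoted in this correspondence. The sketch below is a rough check, not a reanalysis: it assumes the absolute VO2peak values printed in Table 2 (2167, 2606 and 3358) are in mL/min, and it divides group means by group mean body weights, which only approximates the mean of individual ratios that the authors would have computed from the raw data.

```python
# Rough check using group means quoted in the letters (not raw data).
# Assumption: the absolute VO2peak means are in mL/min.
group_means = {
    "CW": {"vo2peak_ml_min": 2167.0, "weight_kg": 60.8},
    "TW": {"vo2peak_ml_min": 2606.0, "weight_kg": 78.1},
    "CM": {"vo2peak_ml_min": 3358.0, "weight_kg": 81.3},
}


def relative_vo2(vo2_ml_min: float, weight_kg: float) -> float:
    """Express oxygen consumption per kilogram of body weight (mL/kg/min)."""
    return vo2_ml_min / weight_kg


relative = {
    name: relative_vo2(g["vo2peak_ml_min"], g["weight_kg"])
    for name, g in group_means.items()
}

# Ratio of absolute VO2peak means, TW relative to CW.
absolute_ratio_tw_cw = (
    group_means["TW"]["vo2peak_ml_min"] / group_means["CW"]["vo2peak_ml_min"]
)

for name, value in relative.items():
    print(f"{name}: {value:.1f} mL/kg/min")
print(f"Absolute VO2peak, TW/CW: {absolute_ratio_tw_cw:.2f}")
```

On these assumptions the ratio of means gives about 33.4 mL/kg/min for TW and about 35.6 mL/kg/min for CW, close to the weight-normalized peak values quoted above, while the absolute TW mean is about 20% higher than the CW mean. The gap between absolute and relative comparisons is therefore consistent with the difference in body weight between the groups.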
The authors also assessed grip strength to evaluate whether there were any strength differences between TW, CW, and CM. Grip strength was significantly higher in TW than in CW; however, this difference was no longer significant after normalizing to body weight. Moreover, the relevance of grip strength to predicting any performance advantage in most elite or professional sports is unclear.
Owing to their scientific rigor and careful interpretation of results, previous case-control and prospective research studies have advanced our understanding of the physiological changes associated with testosterone exposure during puberty, and with its subsequent withdrawal, on human exercise performance. The amalgamation of the available data has allowed sports’ governing bodies to make highly informed policy decisions on balancing inclusivity and fairness for female transgender athletes. It is incumbent on all new studies in this area of research to be of high scientific rigor and to draw conclusions that are appropriate for the data presented, because the results and the language used have immediate ramifications for the inclusion of transgender athletes in sport. The conclusions presented by Alvares et al. are incomplete and not fully supported by the data. Further, and perhaps more importantly, the authors’ conclusion that TW have higher cardiopulmonary function (unadjusted for body weight) is harmful to the sporting community at large because it presents false evidence of a competitive advantage. Although the study was conducted in non-athletes, the authors suggest that the results from their study may inform inclusion policies for transgender athletes. In fact, the data from Alvares et al. suggest that TW do not have improved cardiopulmonary function relative to CW when normalized to body weight. Thus, the study does not provide evidence of a competitive advantage in sports in this small cohort of non-athletes. As such, it does not support further restrictions of transgender athletes from sport.