
Development and validation of a new VISA questionnaire (VISA-H) for patients with proximal hamstring tendinopathy
  1. Angelo Cacchio1,
  2. Fosco De Paulis2,
  3. Nicola Maffulli3
  1. 1Department of Life, Health and Environmental Sciences, School of Medicine, University of L'Aquila, L'Aquila, Italy
  2. 2Division of Diagnostic Imaging, Valle Giulia Clinic, Roma, Italy
  3. 3Centre for Sports and Exercise Medicine Barts and The London School of Medicine and Dentistry, Mile End Hospital, London, UK
  1. Correspondence to Professor Angelo Cacchio, Dipartimento di Medicina Clinica, Sanità Pubblica, Scienze della Vita e dell'Ambiente, Università degli Studi dell'Aquila, P.le Salvatore Tommasi 1, L'Aquila 67100, Italy; angelo.cacchio{at}


Background There is a need for a patient-reported outcome (PRO) questionnaire to evaluate patients with proximal hamstring tendinopathy (PHT).

Objective To develop a PRO questionnaire based on VISA questionnaire forms for patients with PHT.

Methods Item generation, item reduction, item scaling and evaluation of the psychometric properties were used to develop a questionnaire to assess the severity of symptoms, function and ability to play sports in patients with PHT and healthy subjects. The final version, named Victorian Institute of Sport Assessment-Proximal Hamstring Tendons (VISA-H), consisted of eight questions that measured the domains of pain, function and sporting activity. The psychometric properties of the questionnaire were estimated in a population of non-surgical (n=20) and surgical (n=10) patients, as well as in healthy subjects (n=30).

Results The VISA-H questionnaire displayed a high degree of internal consistency, with a Cronbach α of 0.84. The test–retest reliability was high for all groups of participants, with an intraclass correlation coefficient ranging from 0.90 to 0.95. The VISA-H exhibited a high correlation with the Nirschl phase rating scale (r ranging from −0.75 to −0.89) and a generic tendon grading system proposed by Curwin and Stanish (r ranging from −0.70 to −0.88). The responsiveness of the VISA-H questionnaire was also high, with an area under the curve of 0.90 and a minimum clinically important difference of 22 points.

Conclusions The VISA-H is a PRO questionnaire with good psychometric properties for measuring pain, function and sporting activity in patients with PHT.

  • Hamstring injuries



Proximal hamstring tendinopathy (PHT) is an overuse tendinopathy affecting the proximal tendon of the hamstring muscles. PHT usually affects athletes in many sports and at all levels of participation, but is of particular concern for elite sprinters, hurdlers and long-distance-running athletes.1–3

The characteristic complaint of PHT is pain, especially while performing sports activities or when sitting, in the area of the ischial tuberosity that rarely radiates distally to the popliteal fossa.1 ,2 ,4 The pain typically appears and gradually increases without being triggered by any acute event.1 ,2 ,4

Previous studies have clearly described the symptoms,1–6 the MRI7 ,8 and histopathological findings,4 and the non-surgical1 ,3 and surgical2 ,4–6 treatment of this condition. A more recent study has shown that three pain provocation tests can be used for the clinical diagnosis of PHT in athletes.9

Unfortunately, the success or failure of treatment for PHT is open to interpretation, given the lack of instruments to measure treatment outcome in a standardised fashion.

The importance of monitoring the effectiveness of treatment on the basis of the patient's viewpoint is widely recognised.10 ,11 Standardised patient-reported outcome (PRO) questionnaires provide a convenient method to compare different patient populations, evaluate the outcome of treatment, facilitate comparisons between studies, determine the patient's clinical severity, provide a guideline for treatment and monitor treatment effects.12 ,13

The PRO VISA-P and VISA-A questionnaires have been introduced to quantify athletes’ disability due to patellar and Achilles tendinopathy, respectively, thereby facilitating research into these conditions.14 ,15 These questionnaires assess pain and the ability to undertake physical activities and sports. Moreover, these questionnaires have been documented as valid and reliable instruments for monitoring recovery from patellar16 ,17 and Achilles tendinopathy.18 ,19

There is also, however, a need for an easy, valid and reliable PRO questionnaire to appraise the symptoms and the ability to undertake sport in patients with PHT.

To our knowledge, there are no validated PRO questionnaires for patients with PHT. Therefore, the purpose of this study is to develop and validate a new PRO questionnaire for patients with PHT that requires little time to administer and is easily readable.

Materials and methods

The development of this new instrument consisted of four steps: (1) item generation and test construction, (2) item reduction, (3) item scaling and (4) evaluation of the psychometric properties of the final version of the questionnaire.

The VISA-A14 and VISA-P15 questionnaires were used as background material to develop a questionnaire specifically for use in patients with PHT.

Step 1: item generation and test construction

The development of this questionnaire started with a literature review to find items that would be appropriate for inclusion. In addition, other potential items used in clinical practice were gathered while interviewing physicians, athletic trainers and physical therapists directly involved with the management of PHT. Further, patients were informally interviewed about symptoms that they felt were important. Finally, an expert group of colleagues with several years of experience with PHT participated in two brainstorming sessions to ensure good face validity to the 25 new items generated.

Step 2: item reduction

In this phase, a focus group consisting of the principal investigator, a sport orthopaedic surgeon and a radiologist reviewed all the items generated, deciding which of the 25 items should be discarded and which should be retained. Using the frequency-importance product (frequency×mean importance) and the Pearson product–moment correlation, six items were retained. These six items constitute the first six questions (Q1–Q6) of our Victorian Institute of Sport Assessment-Proximal Hamstring Tendons (VISA-H) questionnaire. Together with the last two questions (Q7 and Q8), which are similar to those of other VISA questionnaires, they form an eight-item questionnaire covering the three domains of pain, function and sporting activity.
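The frequency-importance product can be illustrated with a minimal sketch. The text does not say how frequency and importance were recorded, so this example assumes that frequency is the proportion of respondents who endorsed an item and that importance was rated on a 1–5 scale; the item names and ratings are hypothetical.

```python
from statistics import mean

def frequency_importance(ratings, n_respondents):
    """Frequency x mean importance for one candidate item.

    ratings: importance ratings (assumed 1-5 scale) from the respondents
             who endorsed the item; respondents who did not are absent.
    """
    if not ratings:
        return 0.0
    frequency = len(ratings) / n_respondents  # assumed: share of endorsers
    return frequency * mean(ratings)

# Hypothetical candidate items rated by 5 respondents
candidates = {
    "pain when sitting": [5, 4, 5, 4],
    "pain at night": [2, 3],
}
ranked = sorted(
    candidates,
    key=lambda item: frequency_importance(candidates[item], 5),
    reverse=True,
)
```

Items at the top of `ranked` would be the strongest candidates for retention under this assumed definition.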

Step 3: item scaling

Based on the assumptions expressed by the authors of the VISA-A questionnaire,14 the first six questions (pain and function) used a 0–10 numerical rating scale and the final two questions (sporting activity) used a categorical rating scale.

The worst score obtainable with the VISA-H questionnaire was 0 points, while the best was 100 points.

Step 4: evaluation of psychometric properties of the final version of the VISA-H questionnaire


The final version of the VISA-H questionnaire (see online supplementary appendix) was administered to three groups: a non-surgical group with 20 patients (14 men and 6 women; mean age of 23.7 years, range 18–25) with a diagnosis of PHT and selected to receive conservative management; a surgical group with 10 patients (8 men and 2 women; mean age of 21.4 years, range 18–23) with a diagnosis of PHT and, following failure of conservative management, on the waiting list for surgery; a healthy group with 30 individuals (20 men and 10 women; mean age of 23.1 years, range 18–26) represented a convenience sample of athletes whose age matched the patients’ groups to serve as a control group. Written informed consent was obtained from all the participants before enrolment in the study, and the procedures followed in the study were in accordance with the ethical standards of the local ethics committee and conformed to the Declaration of Helsinki.

For inclusion in the study, subjects in all groups had to be at least 18 years old and able to give written informed consent. For non-surgical and surgical groups, patients had to have a diagnosis of PHT made clinically and by means of MRI.1 ,9 A clinical diagnosis of PHT was made when the athlete had pain in the lower gluteal region, tenderness in the ischial tuberosity area and a positive result in at least two of the following three pain provocation tests: the Puranen-Orava test, the bent-knee stretch test and the modified bent-knee stretch test.1 ,9 Briefly, the Puranen-Orava test entails actively stretching the hamstring muscles in the standing position with the hip flexed at approximately 90°, the knee fully extended and the foot on a support. The bent-knee stretch test is performed with the patient supine. The hip and knee of the symptomatic leg are maximally flexed, and the examiner slowly straightens the knee. For the modified bent-knee stretch test, the patient lies in the supine position with the legs fully extended; the examiner grasps the symptomatic leg behind the heel with one hand and at the knee with the other hand, flexes the hip and knee maximally and then rapidly straightens the knee.1 ,9

Exclusion criteria were: lumbar sciatic pain, piriformis syndrome, ischial tuberosity avulsion, ischiogluteal bursitis or hamstring muscle tears; pregnancy; age of <18 years; inflammatory or neoplastic disorders; any treatments administered in the last 2 months.

During the physical examination, a differential diagnosis was made between PHT and lumbar sciatic pain, piriformis syndrome, hamstring muscle tears and knee pain. If one of the aforesaid conditions was suspected on the basis of the clinical findings, additional radiographs, electroneuromyographic studies and MRI of the lumbar spine or of the hamstring muscles or of the knee were performed before the enrolment.


Non-surgical management consisted of relative rest, avoiding activities and/or exercises that would increase the severity of symptoms, and four sessions of radial shockwave therapy, at the rate of one session per week. At each session, 2500 shocks with a pressure of four bars (equal to an energy flux density of approximately 0.18 mJ/mm2) and a frequency of 10 shocks/s were applied. The technique and methodology have been described in detail elsewhere.1

Surgical treatment was performed in another centre according to the procedure described by Lempainen et al.4

Procedures and questionnaires

At baseline, subjects in all three groups were asked to complete the VISA-H questionnaire, the Nirschl phase rating scale (NPRS),20 and a generic tendon grading system proposed by Curwin and Stanish (TGSCS)21 in a comfortable room.

To assess the VISA-H test–retest reliability, all three groups of participants were asked to complete the questionnaire again 3 days after the first administration at baseline. To minimise the risk of clinical changes, non-surgical and surgical groups of patients did not receive any treatment during this 3-day interval.

The final version of the VISA-H questionnaire consists of eight items, covering the two domains of pain/function (questions 1–6) and sporting activity (questions 7–8). Questions 1–7 were scored out of 10 each and question 8 was scored out of 30. Scores are summed to give a total out of 100. An asymptomatic person would score 100, while someone who is symptomatic would score less than that.
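The scoring rule can be written as a short sketch. The function name and the input validation are illustrative assumptions; only the item maxima (10 points for each of questions 1–7, 30 points for question 8) come from the text.

```python
def visa_h_total(item_scores):
    """Sum the eight VISA-H item scores into a 0-100 total.

    item_scores: scores for Q1..Q8 in order. Q1-Q7 are worth up to
    10 points each and Q8 up to 30 points.
    """
    max_points = [10] * 7 + [30]
    if len(item_scores) != len(max_points):
        raise ValueError("expected eight item scores")
    for number, (score, maximum) in enumerate(zip(item_scores, max_points), start=1):
        if not 0 <= score <= maximum:
            raise ValueError(f"Q{number} must be between 0 and {maximum}")
    return sum(item_scores)
```

An asymptomatic respondent who scores the maximum on every item reaches 100, and any symptom lowers the total.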

The NPRS21 is a seven-phase (1–7) assessment of pain and activity limitations caused by overuse injuries. The TGSCS22 assesses six degrees of reported exercise-induced tendon pain and the level of sports performance.

Although the psychometric properties of these tools are not formally validated so far, TGSCS and NPRS have been used as a standard for construct validity testing in previous VISA-A and VISA-P validation research, respectively.14 ,15 ,22–24 Moreover, the NPRS was used as an outcome measure in a previous clinical study on shockwave treatment for patients with chronic PHT.1

At discharge, the VISA-H, NPRS and TGSCS were administered again to the patients on completion of their non-surgical or surgical treatment. Since the patients of the surgical group underwent the surgical procedure in another referral centre, the second administration of the VISA-H, NPRS and TGSCS questionnaires and a global rating of change were undertaken between 7 and 15 days (mean: 11±3 days) after their discharge.

At discharge, the physician and the patient also independently completed a 7-point global rating of change form, ranging from 1=‘very much worse’ to 7=‘very much improved’. The physician's and the patient's global rating of change scores were averaged to give an overall change score, which was used in this study as the criterion standard of change. This measure of change was used as our external criterion, in the absence of a ‘gold standard’, for the evaluation of responsiveness.25 For this purpose, we chose global rating of change scores of 1 or 2 to classify a worsened patient, a score of 3–5 to classify a stable patient, and scores of 6 or 7 to classify an improved patient.
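The classification rule can be sketched as follows. Averaging two integer ratings can yield half points (for example 2.5 or 5.5), and the text does not say how these were handled; this sketch assumes that only averages of 2 or less count as worsened and only averages of 6 or more count as improved.

```python
def classify_change(physician_rating, patient_rating):
    """Classify a patient from two 7-point global rating of change scores.

    Assumption (not stated in the text): half-point averages such as
    2.5 or 5.5 fall into the 'stable' category.
    """
    for rating in (physician_rating, patient_rating):
        if not 1 <= rating <= 7:
            raise ValueError("ratings must be between 1 and 7")
    overall = (physician_rating + patient_rating) / 2
    if overall <= 2:
        return "worsened"
    if overall >= 6:
        return "improved"
    return "stable"
```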

Data analysis

Although the Kolmogorov-Smirnov test showed that the variables were normally distributed, given the small sample size we applied non-parametric tests. We determined that, to detect an intraclass correlation coefficient (ICC2,1) of 0.75 and an area under the curve (AUC) of 0.90 with a type I error of 0.05 and a type II error of 0.20, the necessary sample size was 10 and 15 participants, respectively.

The level of statistical significance was set at p<0.05. All analyses were conducted using MedCalc, V. for Windows (MedCalc Software, Mariakerke, Belgium), GraphPad InStat, V.3.05 for Windows (GraphPad Software Inc, San Diego, California, USA) and STATA software, V.8.2 (Stata Corp, College Station, Texas, USA).

Psychometric properties

Internal consistency

Internal consistency is the degree of inter-relatedness among the items.26 Internal consistency of the VISA-H was assessed by means of the Cronbach α and 95% CIs, using the data from the baseline questionnaire.27 Moreover, a principal component analysis with varimax rotation (eigenvalue >1) was applied to analyse the factor structure of the VISA-H questionnaire.
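For illustration, the Cronbach α can be computed from an item-by-respondent score matrix with the standard formula α = k/(k−1) × (1 − Σ item variances / variance of total scores). This is a minimal sketch with a hypothetical function name, not the routine of the statistical packages used in the study, and it does not compute the 95% CI.

```python
from statistics import variance

def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents' item scores.

    rows: one list per respondent, each holding the k item scores.
    Uses sample variances (n - 1 denominator) throughout.
    """
    k = len(rows[0])
    item_variances = [variance([row[j] for row in rows]) for j in range(k)]
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)
```

Recomputing α with each item removed in turn, as reported in the results, only requires calling the function on the matrix with one column dropped.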

Test–retest reliability

Test–retest reliability indicates the extent to which the same results are obtained on repeated administrations of a given instrument when no change is expected. Test–retest reliability was assessed by means of the ICC2,1.26
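As a sketch of how ICC2,1 is obtained, the two-way random-effects, single-measurement coefficient of Shrout and Fleiss can be computed from the mean squares of a subjects × sessions table: ICC2,1 = (MSR − MSE) / (MSR + (k − 1)MSE + k(MSC − MSE)/n). The function name is hypothetical, and the study's software may differ in detail.

```python
def icc_2_1(scores):
    """ICC(2,1) for n subjects (rows) measured on k occasions (columns)."""
    n, k = len(scores), len(scores[0])
    grand_mean = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Two-way ANOVA sums of squares
    ss_rows = k * sum((m - grand_mean) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand_mean) ** 2 for m in col_means)
    ss_total = sum((x - grand_mean) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)               # between-subjects mean square
    msc = ss_cols / (k - 1)               # between-occasions mean square
    mse = ss_error / ((n - 1) * (k - 1))  # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)
```

For a test–retest design such as this one, each row would hold a participant's baseline and 3-day retest totals (k = 2).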

Changes in the VISA-H scores following both the non-surgical and surgical treatments in comparison with the respective baselines were assessed using the Wilcoxon test.

Additionally, SE of measurement (SEM=SD×√(1−test–retest reliability coefficient)) was calculated.27
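The SEM formula translates directly into code. The sketch below also returns the ±1 SEM band around an observed score, which corresponds to roughly 68% confidence if measurement error is assumed to be normally distributed; the function names are illustrative.

```python
from math import sqrt

def standard_error_of_measurement(sd, reliability):
    """SEM = SD x sqrt(1 - reliability), e.g. with an ICC as reliability."""
    return sd * sqrt(1 - reliability)

def one_sem_band(observed_score, sem):
    """Interval of +/- 1 SEM around an observed score (about 68% confidence)."""
    return observed_score - sem, observed_score + sem
```

With an SEM of 1.4 points, for instance, `one_sem_band(55, 1.4)` spans approximately 53.6 to 56.4 points.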

Construct validity

Construct validity indicates the extent to which the questionnaire scores correlate with those of other questionnaires as expected, that is, whether the questionnaire really measures the intended construct. Construct validity was tested by determining the relationship between the VISA-H scores and the NPRS and TGSCS scores, both at the initial and at the discharge assessments, using the Spearman correlation coefficients (r) and 95% CIs.
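A Spearman coefficient is the Pearson correlation of the ranks, with tied values given their average rank. The sketch below uses hypothetical helper names and does not compute the 95% CI. Because a higher VISA-H score means fewer symptoms while a higher NPRS phase means more severe symptoms, a valid questionnaire is expected to correlate negatively with the NPRS.

```python
from math import sqrt
from statistics import mean

def average_ranks(values):
    """Rank values from 1..n, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for index in order[i:j + 1]:
            ranks[index] = shared_rank
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    covariance = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    spread = sqrt(sum((a - mx) ** 2 for a in rx) * sum((b - my) ** 2 for b in ry))
    return covariance / spread
```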

Following the original article on the VISA-A,14 construct validity of the VISA-H was also tested by using the Kruskal-Wallis test to compare the VISA-H scores of the non-surgical and surgical patients with those of the healthy subjects. A Dunn post hoc comparison was used to determine significant differences between mean values when a significant main effect and interaction were found.

Responsiveness and interpretability

There is no consensus on the most suitable statistical analysis to assess responsiveness. Although the COSMIN guideline has defined some responsiveness parameters such as effect size (ES) and standardised response mean (SRM) as inappropriate measures of responsiveness,26 they are accepted worldwide and used in a large body of scientific literature and many clinicians are familiar with them.28 Therefore, in this study, we opted to use two distribution-based methods to assess the responsiveness of the VISA-H questionnaire: the ES29 and the SRM,30 as well as an anchor-based method, the receiver-operating-characteristic (ROC) curve.25
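The two distribution-based indices can be sketched with their commonly used definitions: ES is the mean change divided by the SD of the baseline scores, and SRM is the mean change divided by the SD of the change scores. The text cites these indices without giving formulas, so treating these definitions as the ones applied here is an assumption.

```python
from statistics import mean, stdev

def effect_size(baseline, follow_up):
    """ES: mean change divided by the SD of the baseline scores."""
    change = [after - before for before, after in zip(baseline, follow_up)]
    return mean(change) / stdev(baseline)

def standardised_response_mean(baseline, follow_up):
    """SRM: mean change divided by the SD of the change scores."""
    change = [after - before for before, after in zip(baseline, follow_up)]
    return mean(change) / stdev(change)
```

Both functions take paired lists, with each patient's baseline and discharge score at the same position.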

The ROC curve was also used to provide an estimate of the minimum clinically important difference (MCID), taken as the point closest to the upper left-hand corner of the ROC curve, which most effectively discriminates between patients who have improved and those whose condition is unchanged.31

We also computed the AUC, which can be interpreted as the probability of correctly identifying an improved patient from randomly selected pairs of patients who have and have not improved.32 An AUC of 1.0 indicates perfect discrimination between these two health states. A questionnaire that does not discriminate more effectively than chance will have an AUC of 0.50.
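Both ROC-based quantities can be sketched on change scores. The AUC is computed literally as the share of improved/unimproved pairs in which the improved patient shows the larger change (ties count as half), and the MCID candidate is the cut-off whose sensitivity and specificity lie closest to the upper left-hand corner of the ROC plot. Function names and the tie-handling rule are assumptions; dedicated software may select the cut-off differently.

```python
def auc_from_pairs(improved, unimproved):
    """Probability that an improved patient has the larger change score."""
    wins = 0.0
    for a in improved:
        for b in unimproved:
            if a > b:
                wins += 1.0
            elif a == b:
                wins += 0.5  # ties count as half
    return wins / (len(improved) * len(unimproved))

def closest_to_corner(improved, unimproved):
    """Cut-off (change >= cut-off means 'improved') nearest to the
    upper left-hand corner of the ROC curve.

    Returns (cut-off, sensitivity, specificity).
    """
    best = None
    for cut_off in sorted(set(improved) | set(unimproved)):
        sensitivity = sum(x >= cut_off for x in improved) / len(improved)
        specificity = sum(x < cut_off for x in unimproved) / len(unimproved)
        distance = (1 - sensitivity) ** 2 + (1 - specificity) ** 2
        if best is None or distance < best[0]:
            best = (distance, cut_off, sensitivity, specificity)
    return best[1:]
```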

Floor and ceiling effects were determined by calculating the number of patients who obtained the best or worst scores possible at both the baseline and discharge assessments in all the questionnaires. Floor or ceiling effects are considered to be present if more than 15% of respondents achieved the lowest or highest possible score, respectively.33
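The 15% criterion can be expressed as a small check. The default bounds of 0 and 100 match the VISA-H score range; the function name and return format are illustrative.

```python
def floor_ceiling_effects(scores, lowest=0, highest=100, threshold=0.15):
    """Flag floor/ceiling effects when more than 15% of respondents
    obtain the lowest or highest possible score."""
    n = len(scores)
    floor_share = sum(score == lowest for score in scores) / n
    ceiling_share = sum(score == highest for score in scores) / n
    return {
        "floor": floor_share > threshold,
        "ceiling": ceiling_share > threshold,
    }
```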

Floor and ceiling effects, distribution of total scores and change scores in the overall study sample and in non-surgical and surgical subgroups, and MCID will allow us to define the interpretability of our results.26 Interpretability is the degree to which one can assign qualitative meaning to an instrument's quantitative scores or change in scores.26


Internal consistency

Internal consistency reached a Cronbach's α of 0.84 (95% CI 0.77 to 0.89) for the eight items. When the α coefficient was calculated for the overall scale by eliminating each of the eight items one at a time, the range was 0.75–0.81; no single item was found to change the internal consistency substantially. No items were missing from the three questionnaires at either the baseline or discharge assessments. The principal components analysis revealed a two-factor structure, which accounted for 73.4% of the total variance. The items loading on the first component (pain/function) with six questions (Q1–Q6) had loadings ranging from 0.67 to 0.86, and explained 34.1% of variance with an eigenvalue of 5.8; the second component (sporting activity) with two questions (Q7 and Q8) had loadings of 0.80 and 0.74, respectively, and explained 39.3% of variance with an eigenvalue of 6.4.

Test–retest reliability

The test-retest reliability yielded an ICC2,1 (table 1) of 0.92 (95% CI 0.80 to 0.97), with an SEM of 1.35 for the non-surgical group, 0.90 (95% CI 0.63 to 0.97), with an SEM of 1.56 for the surgical group, and 0.95 (95% CI 0.90 to 0.97) with an SEM of 0.25 for the healthy group.

Table 1

Baseline, 3-day retest, and discharge scores of the VISA-H in non-surgical (n=20), surgical (n=10) and healthy (n=30) participants, and the magnitude of the changes after non-surgical (ES, SRM, ROC curve) and surgical (ES, SRM) treatment

Construct validity

The Spearman rank correlation coefficients describing the extent of the correlation between the VISA-H scores and those of the comparison questionnaires (NPRS and TGSCS) are shown in table 2.

Table 2

Correlations between the VISA-H, NPRS and TGSCS scores at the baseline and discharge assessments

The Kruskal-Wallis test and Dunn post hoc comparison revealed that the healthy individuals had a significantly higher score (99.3±1.2 points) compared with the patients of the non-surgical group (56.7±11.6 points, p<0.001) and surgical group (45.8±12.2 points, p<0.001). No difference was found between patients of the non-surgical and surgical groups (p>0.05).

Responsiveness and interpretability

The Wilcoxon test revealed statistically significant changes of the VISA-H scores from baseline to discharge for both the non-surgical (mean±SD difference, 25.3±15.8; p<0.0001) and surgical (difference, 41.1±18.9; p<0.0001) groups (table 1).

There were no floor or ceiling effects for the VISA-H questionnaire in the non-surgical and surgical groups at either baseline or discharge (all <15%).

The mean baseline and discharge scores, as well as the magnitude of changes expressed by the ES and SRM for the improved patients of the non-surgical group (n=16) and surgical group (n=9), are presented in table 1. The VISA-H had a large ES and SRM both for the non-surgical group (ES=2.2, SRM=1.6) and surgical group (ES=3.3, SRM=2.2).

The ROC curve analysis of the non-surgical group revealed an AUC of 0.90 (95% CI 0.70 to 0.98; figure 1). The SE value was 0.07. The AUC far exceeded 0.5 (p<0.0001). This indicates that the change scores yielded by the VISA-H are significantly better than chance at identifying an improved patient from randomly selected pairs of improved and unimproved patients. The MCID for the VISA-H questionnaire was 22 points. The sensitivity and specificity associated with the MCID of 22 were 0.91 (95% CI 0.61 to 0.98) and 0.87 (95% CI 0.48 to 0.96), respectively.

Figure 1

Receiver-operating-characteristic (ROC) curves illustrating the relationship between sensitivity and complement of specificity (1-specificity) for the Victorian Institute of Sport Assessment-Proximal Hamstring Tendons (VISA-H) questionnaire.


This study presents the VISA-H questionnaire constructed by adapting questions from the VISA-P and VISA-A questionnaires, with the overall purpose of evaluating pain/function and sporting activity of patients with PHT.

To our knowledge, the VISA-H is the first PRO questionnaire developed for patients with PHT.

Many recommendations exist for the development of a questionnaire; we have tried to follow those of COSMIN (Consensus-based Standards for the selection of health Measurement Instruments).26

Our psychometric evaluation showed that the VISA-H questionnaire had high reliability, validity and responsiveness when evaluating patients with PHT.

The Cronbach α coefficient for the VISA-H was 0.84, which indicates an excellent internal consistency. The Cronbach α coefficient for the VISA-H was similar to that reported for the Swedish version of VISA-P,34 but higher than that reported for the Dutch version of VISA-P,35 and for the Swedish22 and German24 versions of VISA-A. Since the internal consistency was not assessed in the original VISA-A and VISA-P questionnaires, no comparison can be made with these studies. The high internal consistency found in the present study suggests that the VISA-H items measure the same construct or dimension, that is, how patients were limited by their symptoms during various physical activities. This was confirmed by the results of factor analysis that produced two strong factors (pain/function and sporting activity), indicating that the VISA-H is valid for evaluating the patient's pain/function and its effect on sporting activity. These results were similar to those of Silbernagel et al22 for the Swedish version of the VISA-A questionnaire.

The VISA-H displayed an excellent reliability for all groups. Our values are higher than those reported for the Dutch version of VISA-P,35 but similar to those of the German24 and Swedish22 versions of VISA-A, and of the Swedish34 version of VISA-P. As test–retest reliability in the original VISA-A and VISA-P questionnaires was assessed using the Pearson correlation coefficient, a direct comparison with our results is not possible.

We also analysed the SEM to define the error associated with a single application of the VISA-H. Using the SEM, a clinician can be 68% confident that an initial VISA-H score of 55 points actually falls within ±1.4 points of the true score.

The VISA-H questionnaire demonstrates good construct validity with high correlations with TGSCS and NPRS questionnaires, both at baseline and discharge. Our Spearman correlation coefficients at baseline between VISA-H and NPRS were lower than those reported by Visentini et al15 in the original study on VISA-P. Our Spearman correlation coefficients at baseline between VISA-H and TGSCS questionnaires were higher than those of the original version of the VISA-A14 and of the Swedish22 version of VISA-A, but lower than those reported by Lohrer and Nauck for the German24 version of VISA-A.

Construct validity, as tested by the Kruskal-Wallis test between healthy participants and patients, revealed that the healthy participants scored significantly higher on the VISA-H than both groups of patients.

The quality of measurement questionnaires has usually been evaluated by considering the reliability and validity of such questionnaires; it has, however, been suggested that responsiveness should be another criterion in the choice of a measurement questionnaire.

To our knowledge, this is one of the first studies to evaluate the responsiveness of a VISA-type questionnaire using both distribution-based methods (ES and SRM) and an anchor-based method (ROC curve). Only recently, Hernandez-Sanchez et al36 assessed the responsiveness of the Spanish version of VISA-P by combining an anchor-based approach (MCS and ROC curve), like ours, with distribution-based approaches (SEM and MDC) that differ from ours.

The MCID, defined as the magnitude of change that best distinguishes between patients who have improved and those whose condition remains unchanged, was calculated using the ROC curve analysis. Our MCID was 22 points for the VISA-H questionnaire in the non-surgical patients’ group. This value was higher than that reported by Hernandez-Sanchez et al36 (13 points) for the Spanish VISA-P. This difference could be due to the different populations of patients studied and the different methodology used to calculate the MCID from the ROC curve.

The VISA-H questionnaire demonstrated a high degree of responsiveness for both the distribution-based method in both groups of patients (ES and SRM) and the anchor-based method (ROC curve) for the non-surgical patients’ group.

If taken together, the data obtained by both the anchor-based and distribution-based methods demonstrate that the VISA-H questionnaire has very high sensitivity. This allows moderate differences in clinical change to be identified when patients undergo therapy and for there to be fewer patients necessary to detect a significant difference between treatment groups and control groups in a clinical study.

Although the mean VISA-H scoring was significantly different between healthy individuals and patients with PHT, the score, as previously suggested, is not considered to be a diagnostic test.14 ,15

Further, there was no statistically significant difference between mean VISA-H scores in the non-surgical and surgical patients’ groups. This means that the result of the VISA-H, as for the VISA-A and the VISA-P, does not have any role to play in the decision as to whether or not surgery is indicated. In our opinion, the indication for the surgical treatment in patients with PHT remains a clinical decision that must be made between the physician and the patient.

A limitation of our study is the small sample size of subjects. As a consequence, we did not perform an ROC curve analysis in the group of patients that was surgically treated. Another methodological limitation of our study is that the absence of a ‘gold standard’ for comparison makes analysis of this new questionnaire difficult.

The VISA-H questionnaire may provide clinically relevant information to physicians and physiotherapists, and could therefore be very helpful during follow-ups when they conservatively or surgically treat patients with PHT. However, further studies with a large sample size across a broader age range would add to the generalisability of our results.


In conclusion, this study provides initial evidence for VISA-H validity, reliability and responsiveness for making judgements about pain/function and sporting activity in patients with PHT.

What this study adds

  • This study provides initial evidence for validity, reliability and responsiveness of the PRO VISA-H questionnaire, which can be used in a clinical setting for measuring the outcome, related to pain, function and sporting activity, after non-surgical or surgical treatment in patients with PHT.




  • Contributors AC designed the data collection tools, monitored the data collection for the whole trial, wrote the statistical analysis plan, cleaned and analysed the data and drafted and revised the manuscript. He is the guarantor. NM and FDP cleaned and analysed the data, as well as drafted and revised the paper.

  • Competing interests None.

  • Ethics approval University of L'Aquila.

  • Provenance and peer review Not commissioned; externally peer reviewed.
