
Genes encoding proteoglycans are associated with the risk of anterior cruciate ligament ruptures
  1. Sasha Mannion1,2,
  2. Asanda Mtintsilana1,
  3. Michael Posthumus1,3,
  4. Willem van der Merwe3,
  5. Hayden Hobbs3,
  6. Malcolm Collins1,4,
  7. Alison V September1
  1. 1UCT/MRC Research Unit for Exercise Science and Sports Medicine, Department of Human Biology, Faculty of Health Sciences, University of Cape Town, Cape Town, South Africa
  2. 2Division of Human Genetics, Department of Clinical Laboratory Sciences, Faculty of Health Sciences, University of Cape Town, Cape Town, South Africa
  3. 3Sports Science Orthopaedic Clinic, Cape Town, South Africa
  4. 4The South African Medical Research Council, Cape Town, South Africa
  1. Correspondence to Dr Alison V September, MRC/UCT Research Unit for Exercise Science and Sports Medicine, University of Cape Town, PO Box 115, Newlands, Cape Town 7725, South Africa; alison.september{at}uct.ac.za

Abstract

Background Genetic variants within genes involved in fibrillogenesis have previously been implicated in anterior cruciate ligament (ACL) injury susceptibility. Proteoglycans also have important functions in fibrillogenesis and maintaining the structural integrity of ligaments. Genes encoding proteoglycans are plausible candidates to be investigated for associations with ACL injury susceptibility; polymorphisms within genes encoding the proteoglycans aggrecan (ACAN), biglycan (BGN), decorin (DCN), fibromodulin (FMOD) and lumican (LUM) were examined.

Methods A case–control genetic association study was conducted. 227 participants with surgically diagnosed ACL ruptures (ACL group) and 234 controls without any history of ACL injury were genotyped for 10 polymorphisms in 5 proteoglycan genes. Inferred haplotypes were constructed for specific regions.

Results The G allele of ACAN rs1516797 was significantly under-represented in the controls (p=0.024; OR=0.72; 95% CI 0.55 to 0.96) compared with the ACL group. For DCN rs516115, the GG genotype was significantly over-represented in female controls (p=0.015; OR=9.23; 95% CI 1.16 to 73.01) compared with the ACL group and the AA genotype was significantly under-represented in female controls (p=0.013; OR=0.33; 95% CI 0.14 to 0.78) compared with the female non-contact ACL injury subgroup. Haplotype analyses implicated regions overlapping ACAN (rs2351491 C>T-rs1042631 T>C-rs1516797 T>G), BGN (rs1126499 C>T-rs1042103 G>A) and LUM-DCN (rs2268578 T>C-rs13312816 A>T-rs516115 A>G) in ACL injury susceptibility.

Conclusions These independent associations and haplotype analyses suggest that regions within ACAN, BGN, DCN and a region spanning LUM-DCN are associated with ACL injury susceptibility. Taking into account the functions of these genes, it is reasonable to propose that genetic sequence variability within the genes encoding proteoglycans may modulate ligament fibril properties.

  • ACL
  • Soft tissue injuries
  • Sporting injuries


Introduction

Although the molecular mechanisms underlying their aetiology are poorly understood, multiple intrinsic and extrinsic risk factors, including genetic factors, have been associated with anterior cruciate ligament (ACL) ruptures.1–7 DNA sequence variants within several genes encoding collagens, which are implicated in regulating the formation of the collagen fibril (fibrillogenesis), the basic building block of ligaments, have been associated with ACL ruptures.3–6 Like the collagens, the proteoglycans aggrecan, biglycan, decorin, fibromodulin and lumican have important structural roles in ligaments and play an essential role in regulating fibrillogenesis.8

Mutations within the large aggrecan (ACAN) gene cause either dominant familial osteochondritis dissecans or a recessive skeletal dysplasia in humans.9 ,10 Murine tissues and mice deficient in the small leucine-rich proteoglycans (SLRPs) biglycan, decorin, fibromodulin or lumican have similar physical phenotypes to humans with classic Ehlers-Danlos syndrome: fibrillogenesis is compromised resulting in collagen fibrils of highly irregular diameters and abnormal fibrillar organisation.11–13 Young et al14 recently investigated the extracellular matrix (ECM) content of ruptured ACL tissue. Lower proteoglycan and glycosaminoglycan (GAG) levels were observed in ruptured human ACL tissue in comparison with the non-ruptured controls.14 Genes encoding proteoglycans are, therefore, plausible candidate genes to be investigated for an association with ACL injury risk.

Variants within the ACAN and lumican (LUM) genes, on chromosomes 15q26.1 and 12q21.3, respectively, have previously been associated with several multifactorial conditions.15–20 The genes encoding biglycan (BGN) on chromosome Xq28, decorin (DCN) on chromosome 12q21.33 and fibromodulin (FMOD) on chromosome 1q32 have not, however, been associated with any multifactorial conditions to date.

This study aimed to investigate the association of sequence variants in the ACAN, BGN, DCN, FMOD and LUM candidate genes with ACL ruptures based on the important biological functions of these five proteoglycan encoding genes in maintaining the structural integrity of tissues and regulating fibrillogenesis. More importantly, the study aimed to identify genomic regions encompassing these five genes which may be harbouring DNA sequence signatures relevant to our understanding of ACL injury susceptibility. In addition, we aimed to investigate whether there was any contribution of these variants to sex-linked susceptibility as has been previously noted with ACL ruptures.4 ,6

Materials and methods

The reporting of this case–control genetic association is in alignment with the recommendations outlined by the STREGA initiative, which is an extension of the STROBE statement.21

Participants

A total of 461 physically active, unrelated, self-reported Caucasian participants were recruited for this case–control genetic association study using previously described inclusion and exclusion criteria.4 These participants consisted of 227 (166 male) individuals with surgically diagnosed ACL ruptures (ACL group) and 234 (144 male) apparently healthy participants without any history of ACL injuries (CON group). The ACL group was recruited from the Sports Science Orthopaedic Clinic in Cape Town, South Africa and the CON group was recruited from sports clubs and wellness centres within Cape Town, South Africa. Participants were recruited between 2006 and 2013. Of the 227 ACL participants, 126 (61.46%; 94 male) sustained their injury through non-contact mechanisms (NON subgroup).22 The NON subgroup was analysed as part of the ACL group and was also analysed separately.

All participants were required to complete a written informed consent form according to the Declaration of Helsinki. Participants were also requested to complete a questionnaire regarding personal details, medical history, personal and family ligament and tendon injury history, as well as sports participation. All participants were of self-reported Caucasian ancestry.

Sports participation of the CON and ACL groups was characterised into contact sports, non-contact jumping sports, non-contact non-jumping sports and skiing sports as previously defined,2 with slight modification.4 The most common sports played by the ACL group men were rugby (45.2%) and soccer (8.4%), while the ACL group women played predominantly hockey (26.2%) and netball (14.8%).

This study was approved by the Research Ethics Committee of the Faculty of Health Sciences within the University of Cape Town, South Africa (reference number 164/2006).

DNA extraction

Approximately 5 mL of venous blood was obtained from each participant by venepuncture of a forearm vein and collected into an EDTA vacuum container tube. Blood samples were stored at −20°C until total DNA extraction was performed, as previously described by Lahiri and Nurnberger23 and modified by Mokone et al.24

Single nucleotide polymorphism selection and genotyping

Single nucleotide polymorphisms (SNPs; sequence variants) within each of the five candidate genes were identified using the genome database hosted by the National Center for Biotechnology Information (NCBI; http://ncbi.nlm.nih.gov/) as well as SeattleSNPs (http://pga.gs.washington.edu/). SNPs were selected if they met one of the following criteria: a previously reported association, Tag SNP status (and thus moderate coverage of the genetic interval25 ,26) or a heterozygosity score greater than 30%.

Three SNPs (see online supplementary figure S1A) were investigated in ACAN, namely (1) rs2351491 23 C>T, previously associated with height in individuals of African ancestry,17 (2) rs1042631 4157 T>C and (3) rs1516797 1133 T>G, with the latter two SNPs being previously associated with lumbar disc degeneration.16 The two SNPs investigated within BGN (see online supplementary figure S1B) were rs1126499 189 C>T and rs1042103 359 G>A, both of which were previously investigated for an association with congenital muscular dystrophy but excluded as a cause of the disorder.27 DCN rs13312816 IVS1 A>T and rs516115 IVS3 A>G are Tag SNPs25 ,26 and should therefore provide a large coverage of the genomic region spanning approximately 18 kb of the DCN gene (see online supplementary figure S1C). Within the FMOD gene, rs7543148 244 G>A, with a heterozygosity score greater than 30%, and rs10800912 1338 C>T, a Tag SNP, were chosen for investigation. LUM rs2268578 697 T>C, a Tag SNP,26 ,28 has previously been associated with multifactorial phenotypes.26 ,29 Schematic diagrams of the FMOD and LUM genes are given in online supplementary figures S1D and E, respectively. The nomenclature used to describe the SNPs investigated is in accordance with the NCBI (http://ncbi.nlm.nih.gov/).

TaqMan allele-discrimination assays (Applied Biosystems, Foster City, California, USA) were used to genotype participants for the 10 SNPs. Previously inventoried TaqMan primer sets and allele-specific MGB-labelled probes were used together with the PCR master mix, containing AmpliTaq Gold DNA polymerase (Applied Biosystems, Foster City, California, USA), as per the manufacturer's recommendations in a final reaction volume of 8 µL. The PCR reactions were performed on an Applied Biosystems StepOnePlus Real-Time PCR system (Applied Biosystems, Foster City, California, USA) using the Applied Biosystems StepOnePlus Real-Time PCR software V.2.2.2 (Applied Biosystems, Foster City, California, USA). The PCR parameters comprised a 30 s hold step at 60°C followed by a 10 min heat activation step at 95°C, 40 cycles of 95°C for 15 s and 60°C for 1 min, ending with a 30 s hold step at 60°C. Genotypes were determined by endpoint fluorescence. For PCR and genotype quality control purposes, a number of positive (known genotypes) and DNA-free controls were randomly included on every 96-well PCR plate. All control samples were successfully repeated on every plate.

All laboratory work (DNA extraction and sample genotyping) took place at the UCT/MRC Research Unit for Exercise Science & Sports Medicine Laboratory, Faculty of Health Sciences, University of Cape Town.

Statistics

Quanto V.1.2 (http://hydra.usc.edu/gxe) was used to determine the statistical power of the sample size. Assuming allele frequencies between 0.1 and 0.9 for the ‘risk’ allele of each SNP investigated, our sample size of 227 cases would be adequate to detect an allelic OR of 1.8 and greater at a power of 80% and a significance level of 5%.

Genotype and allele frequencies were analysed using Statistica V.11 (StatSoft Inc, Tulsa, Oklahoma, USA) and GraphPad InStat V.5 (GraphPad software, San Diego, California, USA). The BGN gene is on the X chromosome and therefore genotype and allele frequencies of SNPs investigated in this gene were compared separately between male and female participants. One-way analysis of variance was used to compare continuous biological characteristics between the CON and ACL groups and between the CON group and NON subgroup. Fisher's exact and χ2 tests were used to compare categorical variables (sex and country of birth) between the CON group, ACL group and NON subgroup, as well as to analyse any difference in genotype and allele frequencies between the groups. Inferred haplotypes were constructed for the ACAN, BGN, DCN and FMOD genes using the specific SNPs investigated within each gene. A haplotype was also constructed to overlap the LUM-DCN genetic interval (12q21.3-12q21.33) using the SNPs investigated within these genes. The Chaplin case–control haplotype inference software programme V.1.2.2 (http://www.genetics.emory.edu/labs/epstein/software/chaplin/index.html) was used to compare allele frequencies of the variants within each haplotype between cases and controls. CubeX: cubic exact solution (http://www.oege.org/software/cubex/)30 was used to determine which of the SNPs investigated are in linkage disequilibrium (LD) and thus likely to be inherited together. Significance was accepted at p<0.05. In order to determine whether the genotypes obtained for each of the SNPs investigated were in Hardy-Weinberg equilibrium (HWE), the data were analysed using Genepop V.4.2 (http://genepop.curtin.edu.au/).
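To illustrate what a Hardy-Weinberg check involves, the following minimal Python sketch applies a 1-df χ2 goodness-of-fit test to the three genotype counts of a biallelic SNP. It is a simpler stand-in and not the procedure implemented in Genepop; the function name and the genotype counts are hypothetical and are not taken from the study data.

```python
import math

def hwe_chi_square(n_aa, n_ab, n_bb):
    """Return (chi-square statistic, p value) for a 1-df HWE goodness-of-fit test."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p                      # frequency of allele B
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square variable with 1 df: erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Hypothetical genotype counts (AA, AB, BB), for illustration only
chi2, p_value = hwe_chi_square(120, 90, 20)
in_hwe = p_value >= 0.05  # same significance threshold as used in the study
```

A p value at or above 0.05 gives no evidence of departure from HWE for that SNP and group.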

Results

Participant characteristics

There were significantly more men in the ACL group (p=0.008) and NON (p=0.013) subgroup in comparison with the CON group (table 1). Participants within the CON and ACL groups, and within the NON subgroup, were similarly matched for critical body mass index (BMI) (BMI at recruitment for CON group; p=0.107 and 0.141, respectively) and country of birth (COB; p=0.769 and 0.814, respectively). The ACL group (p=0.019) and NON subgroup (p=0.028) were significantly younger than the CON group. When covaried for the differences in sex and age, the ACL group (p=0.002) and NON subgroup (p=0.011) still weighed significantly more than the CON group. When covaried for sex, height did not differ significantly between groups (p=0.231 and 0.153).

Table 1

Characteristics of the asymptomatic control group (CON), the ACL rupture group (ACL), and the ACL subgroup with a non-contact (NON) mechanism of injury

At the time of recruitment, the ACL group was 4.6±8.9 years (n=225) older and 2.0±12.1 kg (n=221) heavier than at the time of the first ACL rupture. The NON subgroup was 4.1±7.3 years (n=126) older and 2.5±11.0 kg (n=124) heavier at recruitment than at the time of the first ACL rupture.

The female ACL and CON groups were matched for participation in non-contact non-jumping sports (p=0.764) and skiing sports (p=0.526; data not shown). The male ACL and CON groups were matched for non-contact non-jumping sports (p=0.138). Significantly more participants (men and women) within the ACL group participated in contact sports (p=0.009 and 0.001, respectively) and non-contact jumping sports (p=0.003 and < 0.001, respectively) in comparison with controls. Significantly more ACL group men also participated in skiing sports in comparison with controls (p<0.001).

With the exception of a significant ACAN rs1042631 genotype effect on sex (p=0.033), there were no other genotype effects on participant characteristics (see online supplementary table S1).

ACAN gene

There were no significant differences in the genotype and allele frequency distributions between the CON and ACL groups at the ACAN rs2351491 (p=0.547 and 0.415) and rs1042631 (p=0.168 and 0.064) SNPs. Similar results were noted when stratified by mechanism of injury (NON subgroup) and sex (see online supplementary table S2). There was a trend (p=0.059) for the TT genotype to be over-represented in the CON group (51.1%, n=119) when compared with the ACL group (42.3%, n=96) for rs1516797. Interestingly, the G allele of rs1516797 was significantly under-represented in the CON group (27.5%, n=128; p=0.024; OR=0.72; 95% CI 0.55 to 0.96) in comparison with the ACL group (34.4%, n=156). No significant differences (p=0.325) in the allele frequencies for rs1516797 were however noted between the CON group and NON subgroup. The genotype and allele frequency distributions for all three SNPs were similar between the male and female participants for all groups (CON, ACL and NON). All the groups were in HWE for all three ACAN SNPs.
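As a worked check of the allelic OR for rs1516797 reported above, a minimal Python sketch follows. It assumes a standard Woolf (log-based) 95% CI, which the article does not confirm as the method used. The T allele counts are back-calculated from the reported G allele counts and percentages (466 CON and 454 ACL alleles), and the function name is illustrative.

```python
import math

def odds_ratio_woolf(a, b, c, d, z=1.96):
    """OR and Woolf (log-based) CI for the 2x2 table [[a, b], [c, d]]."""
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Rows: CON, ACL. Columns: G allele, T allele.
# G counts are reported in the text; T counts are back-calculated (assumption).
or_g, ci_low, ci_high = odds_ratio_woolf(128, 338, 156, 298)
print(f"OR={or_g:.2f}; 95% CI {ci_low:.2f} to {ci_high:.2f}")
```

Under these assumptions the sketch gives OR=0.72 with a 95% CI of 0.55 to 0.96, in line with the reported values. Because the OR is expressed for the CON group relative to the ACL group, a value below 1 corresponds to the G allele being less frequent in controls.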

Only six of the possible eight haplotypes constructed from the three ACAN variants (rs2351491 C>T—rs1042631 T>C—rs1516797 T>G) had a frequency greater than 2%. The haplotype containing alleles T–C–T was significantly over-represented (p=0.001; LR=10.30) in the CON group (N=233, 43.55%) in comparison with the ACL group (N=225, 32.74%), while T–C–G was significantly under-represented (p=0.005; LR=7.79) in the CON group (N=233, 20.83%) in comparison with the ACL group (N=225, 29.08%; figure 1A). ACAN rs2351491 and rs1042631 were found to be in complete LD (D′=1.0), while rs1042631 and rs1516797 were not in LD (D′=−0.784).

Figure 1

Frequency distribution of the (A) ACAN (rs2351491 C>T-rs1042631 T>C-rs1516797 T>G), (B and C) BGN (rs1126499 C>T-rs1042103 G>A), (D and E) DCN (rs13312816 A>T-rs516115 A>G) and (F) LUM-DCN (rs2268578 T>C-rs13312816 A>T-rs516115 A>G) inferred haplotypes among the control (CON, white bars) and anterior cruciate ligament rupture (ACL, black bars) groups for all participants, and for men and women separately in the BGN and DCN genes. The p values of significantly different distributions are noted. The total number of participants with available genotype data within each group is indicated in parentheses on the graph.

BGN gene

There were no significant differences in the genotype frequency distributions between the CON and ACL groups, or between the CON group and NON subgroup at the BGN rs1126499 or rs1042103 loci for either men (rs1126499: CON vs ACL, p=0.533; CON vs NON, p=0.672; rs1042103: CON vs ACL, p=0.383; CON vs NON, p=0.580) or women (rs1126499: CON vs ACL, p=0.105; CON vs NON, p=0.278; rs1042103: CON vs ACL, p=0.226; CON vs NON, p=0.584; see online supplementary table S2). Similarly, no significant differences in allele frequencies were noted. However, there was a trend for the BGN rs1126499T allele to be under-represented (p=0.068) in the female CON group (48.3%, n=85) when compared with the female ACL group (59.0%, n=72). All the female groups were in HWE for both BGN SNPs.

There were no significant differences in the distribution of the inferred haplotypes constructed from the BGN variants (rs1126499 C>T—rs1042103 G>A) when only the male participants were compared between the CON and ACL groups (figure 1B). However, when the female participants were compared, the BGN C–G inferred haplotype was significantly over-represented (p=0.027) in the CON group (N=88, 39.64%) in comparison with the ACL group (N=61, 27.15%; figure 1C). BGN rs1126499 and rs1042103 were found to be in low LD (D′=0.321).

DCN gene

No significant differences in genotype frequencies were noted for the DCN rs13312816 (p=0.221) or rs516115 (p=0.926) SNPs when all participants (men and women) were analysed, or when only the male participants (rs13312816: p=0.256; rs516115: p=0.334) were analysed between the CON and ACL groups (see online supplementary table S2); similarly, no significant differences in allele frequencies were noted. Furthermore, no significant difference in the genotype distribution of DCN rs13312816 was noted in women between the ACL and CON groups (p=0.214). However, the GG genotype of rs516115 was significantly over-represented in the CON group (13.3%, p=0.015; OR=9.23; 95% CI 1.17 to 73.01) when compared with the ACL group (1.6%), as well as being significantly over-represented in the CON group (13.3%, p=0.035; OR=10.35; 95% CI 0.60 to 180.20) compared with the NON subgroup (0.0%), where the GG genotype was absent when only female participants were compared. In contrast, the AA genotype was under-represented in the CON group (38.9%, p=0.065) in comparison with the ACL group (54.2%) and significantly under-represented in the CON group (38.9%, p=0.013; OR=0.33; 95% CI 0.14 to 0.78) compared with the NON subgroup (65.6%) when only female participants were analysed. In addition, the G allele of rs516115 was significantly over-represented in the CON group (37.2%) when compared with the ACL group (23.8%, p=0.014; OR=1.90; 95% CI 1.14 to 3.18) and the NON subgroup (17.2%, p=0.003; OR=2.86; 95% CI 1.40 to 5.85) when only female participants were analysed. There were no significant differences in the allele frequency distributions of rs13312816 between the three groups for the female participants. All the DCN variants were in HWE for all groups.
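Because the GG genotype was absent from the female NON subgroup, an unadjusted OR for that comparison is undefined. One common workaround is the Haldane-Anscombe correction, which adds 0.5 to every cell of the 2×2 table when any cell is zero. The sketch below assumes that correction; the article does not state how the zero cell was handled. The genotype counts are back-calculated from the reported percentages (12 of 90 female CON, 1 of 61 female ACL and 0 of 32 female NON participants with GG), and the function name is illustrative.

```python
def odds_ratio_haldane(a, b, c, d):
    """OR for the table [[a, b], [c, d]]; adds 0.5 to all cells if any is zero."""
    if 0 in (a, b, c, d):
        a, b, c, d = (x + 0.5 for x in (a, b, c, d))
    return (a * d) / (b * c)

# Rows: CON, comparison group. Columns: GG, non-GG (female participants only).
# Counts are back-calculated from the reported percentages (assumption).
or_con_vs_acl = odds_ratio_haldane(12, 78, 1, 60)  # no zero cell, no correction
or_con_vs_non = odds_ratio_haldane(12, 78, 0, 32)  # zero cell, 0.5 added
```

Under these assumptions the point estimates round to 9.23 and 10.35, respectively. With only a handful of GG carriers in any group, such estimates are sensitive to small changes in the counts, which is reflected in the wide confidence intervals reported.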

Only three of the possible four inferred haplotypes constructed for the two DCN variants (rs13312816 A>T-rs516115 A>G) had a frequency greater than 0%. There were no significant differences in the distribution of the DCN inferred haplotypes when only male participants were compared between the CON and ACL groups (figure 1D). When the female participants were analysed, there was a trend for the T–A haplotype to be under-represented in the CON group (N=90, 62.8%) in comparison with the ACL group (N=61, 76.2%; figure 1E). The rs13312816 and rs516115 SNPs were found to be in complete LD (D′=1.0).
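The link between a missing haplotype and complete LD can be made concrete with a short Python sketch of Lewontin's D′ for two biallelic loci. It uses a signed convention (D′ keeps the sign of D) and is not the cubic exact solution implemented in CubeX; the function name and the haplotype frequencies are hypothetical and are not taken from the study.

```python
def d_prime(h11, h12, h21, h22):
    """Signed Lewontin's D' from the four haplotype frequencies.

    h11: allele 1 at both loci; h12: allele 1 at locus 1, allele 2 at locus 2;
    h21: allele 2 at locus 1, allele 1 at locus 2; h22: allele 2 at both loci.
    """
    total = h11 + h12 + h21 + h22
    h11, h12, h21, h22 = (h / total for h in (h11, h12, h21, h22))
    p1 = h11 + h12  # frequency of allele 1 at locus 1
    q1 = h11 + h21  # frequency of allele 1 at locus 2
    d = h11 - p1 * q1
    if d >= 0:
        d_max = min(p1 * (1 - q1), (1 - p1) * q1)
    else:
        d_max = min(p1 * q1, (1 - p1) * (1 - q1))
    return d / d_max if d_max > 0 else 0.0

# Hypothetical frequencies in which one of the four haplotypes is never observed
complete_ld = d_prime(0.60, 0.00, 0.15, 0.25)
# Hypothetical frequencies matching random association of alleles
no_ld = d_prime(0.25, 0.25, 0.25, 0.25)
```

When one of the four possible haplotypes is absent, D equals its maximum attainable value and D′ is 1, which is the pattern described for the two DCN variants.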

FMOD and LUM genes

No significant differences in genotype frequencies were noted between the CON and ACL groups when the variants in the FMOD (rs7543148: p=0.458; rs10800912: p=0.616) and LUM (rs2268578: p=0.598) genes were analysed. Similarly, no significant differences in allele frequencies were noted between these groups (see online supplementary table S2). Likewise, no significant differences in the genotype and allele frequency distributions were noted when the data were stratified by mechanism of injury or sex.

There were no significant differences in the distribution of the inferred haplotypes for FMOD (rs7543148 G>A—rs10800912 C>T) between the CON and ACL groups (data not shown). Only six of the possible eight haplotypes constructed for the 56 kb genetic interval overlapping the LUM and DCN genes (rs2268578 T>C—rs13312816 A>T—rs516115 A>G) had a frequency greater than 2%. The T–A–G inferred haplotype was significantly over-represented (p=0.038) in the CON group (N=234, 9.16%) in comparison with the ACL group (N=227, 7.26%; figure 1F). LUM rs2268578 and DCN rs13312816 were in high LD (D′=0.927) and DCN rs13312816 and rs516115 were in complete LD (D′=1.000).

Discussion

Proteoglycans, such as aggrecan, have major structural roles in ligaments, and the SLRPs biglycan, decorin, fibromodulin and lumican are critical in regulating ECM remodelling and collagen fibrillogenesis through their interactions with the collagen network, the major structural component of ligaments and tendons.31–33 In light of the essential role of proteoglycans in fibrillogenesis, and the previous associations of sequence variants within genes implicated in fibrillogenesis (COL1A1, COL5A1 and COL12A1) with ACL injury risk,3–6 this study aimed to investigate 10 variants within 5 genes encoding proteoglycans (ACAN rs2351491, rs1042631, rs1516797; BGN rs1126499, rs1042103; DCN rs13312816, rs516115; FMOD rs7543148, rs10800912 and LUM rs2268578) for an association with the risk of ACL injuries. The main findings of this study include: (1) ACAN rs1516797 was independently associated with the risk of ACL injury in all participants; (2) DCN rs516115 was independently associated with the risk of injury in female participants (sex-linked association) and (3) haplotype analyses further implicated regions overlapping four of the proteoglycan encoding genes (ACAN, BGN and LUM-DCN) with ACL injury susceptibility. This study is the first report of genetic associations between the genes encoding proteoglycans and ACL injury susceptibility.

Aggrecan is a large structural proteoglycan that stabilises the collagen network by having a highly fixed negative charge which results in water retention (figure 2). This proteoglycan forms large aggregates through its interactions with hyaluronan in the ECM. Although aggrecan is believed to indirectly interact with the collagen fibril, the exact interaction between aggrecan and the collagen network remains unclear.34 Aggrecan is composed of a protein core comprising three globular domains, G1, G2 and G3, each of which has a specific function.35–37 The present study found that participants with the rs1516797 G allele had an increased risk of rupturing their ACL: the G allele was under-represented in the CON group relative to the ACL group (p=0.024; OR=0.72; 95% CI 0.55 to 0.96). The biological function of this T>G substitution in intron 12 of ACAN is unknown.

Figure 2

A schematic diagram representing the association of aggrecan and the small leucine-rich proteoglycans (SLRPs) biglycan, decorin, fibromodulin and lumican, as well as the glycosaminoglycans (GAGs), with the collagen network. Multiple aggrecan core proteins bind hyaluronan via a link protein to form large aggregates; hyaluronan is able to directly interact with the collagen fibril.

Analysis of the ACAN haplotype (rs2351491 C>T-rs1042631 T>C-rs1516797 T>G) further highlights the potential role of aggrecan in the pathobiology of ACL injury susceptibility. The T–C–G haplotype was associated with an increased risk of ACL ruptures (p=0.005) while the T–C–T haplotype was associated with a decreased risk of ACL ruptures (p=0.001). These haplotypes, which overlap the G3 domain, therefore suggest that this genomic interval, which includes ACAN rs1516797 T>G, may harbour DNA sequence signatures that alter aggrecan's role in the collagen network, and it should thus be further explored. Interestingly, disease-causing mutations in proximity to rs1516797 T>G have been associated with inherited forms of skeletal dysplasia, a connective tissue disorder.9 ,10

This study also provided preliminary evidence suggesting that SLRPs such as biglycan and decorin may play a role in the pathobiology of ACL injuries. Disease-causing mutations within the DCN gene have previously been associated with connective tissue disorders.38 ,39 The SLRPs function predominantly in collagen fibrillogenesis, but have also been implicated in regulating cell growth and matrix remodelling.31 ,33 ,40–43

BGN was chosen as a candidate for investigation because it is on the X chromosome; sex is an intrinsic risk factor for ACL ruptures44 and previous ACL injury genetic association studies have observed sex–genetic interactions.4 ,6 Although neither of the two SNPs investigated within the BGN gene were independently associated with ACL injury risk, the C–G BGN haplotype (rs1126499 C>T—rs1042103 G>A) was associated with a decreased risk of ACL injury in female participants (p=0.027). This haplotype analysis suggests that the region overlapping BGN is modulating ACL injury susceptibility and the BGN gene should be further explored to identify the causal regulatory DNA sequence signatures.

Biglycan regulates collagen fibrillogenesis and structure by interacting with collagen fibrils (figure 2)31 ,45; this allows biglycan to maintain the structure of the ECM in connective tissues.46 In addition, biglycan interacts with growth factors such as transforming growth factor β (TGFβ), indicating this proteoglycan's involvement in modulating growth factor availability to cells and its role in regulating matrix turnover.42 Previous genetic association studies have also observed a similar sex-specific selective advantage, as identified in this study, with variants localised to the COL5A1 3′-UTR, which is also implicated in fibrillogenesis.4

The core protein of decorin binds to collagen to regulate collagen fibrillogenesis (figure 2).40 This study noted that DCN rs516115 A>G was implicated in ACL injury risk in female participants, with the GG genotype specifically associated with a 10.4-fold decreased risk of injury (p=0.035; OR=10.35; 95% CI 0.60 to 180.20) and the AA genotype associated with an increased risk of injury (p=0.013; OR=0.33; 95% CI 0.14 to 0.78) when the CON group and NON subgroup were compared. The associations were further illustrated when the A and G alleles were significantly over-represented in the ACL and CON groups, respectively (p=0.014), and mirrored when the data were stratified by mechanism of injury (p=0.003). The functional significance of this variant is unknown but one can hypothesise that variations in the core protein may affect the interaction of decorin with TGFβ or the collagen fibril,33 ,47 thereby possibly altering the process of fibrillogenesis and affecting the mechanical properties of the ligament.

Currently, the authors are not aware of any published evidence suggesting a selective biological advantage for the specific regions overlapping BGN and DCN to have a sex-specific association. Although BGN is on the X chromosome, there is conflicting evidence as to whether this gene undergoes X-inactivation and whether BGN has a Y chromosome homologue.48 The difference, or lack thereof, in dosage and expression of BGN between male and female participants has not been fully explored, and may play a role in the altered risk of ACL injury in female participants. Ovarian hormone levels, particularly oestrogen, modulate the synthesis and degradation of SLRPs such as biglycan and decorin.49 ,50 Receptor sites for oestrogen and progesterone have been found in the ACL,51 ,52 and sex hormones are suggested to affect ACL structure and composition.51 The regulation of SLRPs by oestrogen53–55 may account for the sex-specific association of these proteoglycan genes with the risk of ACL injury in women.

Haplotype analysis, the investigation of a set of variants inherited together, is often more informative in detecting an association compared with analysing individual variants alone.56 No independent associations were noted for LUM; however, the haplotype encompassing the LUM-DCN genes (LUM rs2268578 T>C-DCN rs13312816 A>T-rs516115 A>G) implicated the T–A–G allele combination with reduced ACL injury risk (p=0.038). One can speculate that this genomic interval overlapping LUM and DCN influences ACL injury susceptibility by affecting fibrillogenesis. Therefore, it is critical that this genomic region is further interrogated to identify functional DNA sequences.

Proteoglycans such as aggrecan, biglycan, decorin, fibromodulin and lumican play an important role in fibrillogenesis, possibly through their myriad direct and indirect interactions with various proteins, including the collagen network (figure 2) and cell-signalling molecules within the ECM. Altering the properties of the collagen fibril will most likely alter the biomechanical and functional properties of the ligament and one can therefore hypothesise that this modulation will impact on ACL injury risk.57 It was thus not surprising that our novel results implicate sequence variants within four proteoglycan genes in ACL injury susceptibility. Although there is no immediate clinical translation from this study, the results suggest that interindividual variations in the collagen network and fibril assembly might be an important molecular mechanism contributing to the aetiology of ACL ruptures, similar to Achilles tendinopathy.58 To improve our understanding of ACL pathogenesis and susceptibility, it is imperative that we start elucidating the net effect of the intricate interactions of proteoglycans, such as aggrecan, in regulating the structural properties of the ECM, including the collagen fibril. The elucidation of these interactions is vital for the development of possible therapeutic interventions targeting proteoglycans specifically.59

The cases and controls investigated in this study were not matched for the confounding variables weight and height, and this is, therefore, a limitation of the study. It has been suggested that heavier individuals are more likely to sustain an ACL rupture as adiposity is a contributing risk factor to the development of musculoskeletal soft tissue injuries.60 ,61 Although care was taken to recruit controls matched for sports participation, a limitation is that the cases and controls were not matched exactly for exposure to contact and non-contact jumping sports which are of high risk for ACL injuries.

The pathway-based approach followed in this study provides evidence that highlights the potentially important biological role that the proteoglycan genes ACAN, BGN, DCN and/or LUM may have in modulating ACL injury susceptibility. These findings should, however, be replicated in independent populations to confirm the associations described. The results suggest a need to further interrogate the genomic intervals encompassing the proteoglycan genes with regard to ACL injury risk using functional analyses, and to identify the reasons for the multiple sex-specific associations observed.

What are the new findings?

  • The genes encoding the proteoglycans aggrecan and decorin are independently associated with the risk of anterior cruciate ligament (ACL) injuries (decorin in females only).

  • This study implicates specific regions within four proteoglycan genes in ACL pathogenesis.

  • We propose that specific regions within ACAN, BGN, DCN and LUM potentially contain functional sequences modulating ligament fibril properties, affecting ACL injury susceptibility.

How might it impact on clinical practice in the near future?

  • There are no immediate clinical translations; however, these results implicate proteoglycans in anterior cruciate ligament (ACL) pathogenesis, making them potential therapeutic targets.

  • Results of this study add to the growing body of evidence suggesting that interindividual variations in collagen fibril assembly might be an important molecular mechanism in the aetiology of ACL ruptures.

  • Genetic risk factors could, in future, be included in multifactorial models to assess an individual's ACL injury susceptibility.

Acknowledgments

The authors would like to thank Ms M Rahim and Ms M Hay for assisting with recruiting participants. Dr D O'Cuinneagain is also thanked for his assistance with participant recruitment and diagnosis.

References

Supplementary materials

  • Supplementary Data

    This web only file has been produced by the BMJ Publishing Group from an electronic file supplied by the author(s) and has not been edited for content.

Footnotes

  • Contributors SM contributed to participant recruitment, laboratory work, genotyping, analyses and manuscript preparation. AM contributed to the laboratory work and genotyping. MP was responsible for manuscript preparation and participant recruiting. WvdM and HH contributed to participant recruitment, diagnosis and manuscript editing. MC was responsible for project development, analysis and manuscript preparation. AVS was responsible for genotyping QC, project development, analyses and manuscript preparation.

  • Funding This study was supported in part by funds from the University of Cape Town, and the South African Medical Research Council. SM was supported by the National Research Foundation (NRF) of South Africa. MP was supported by the Thembakazi Trust.

  • Competing interests MC and AVS have filed patents on the application of specific sequence variations (not included in this manuscript) related to risk assessment of Achilles tendinopathy and anterior cruciate ligament injuries.

  • Patient consent Obtained.

  • Ethics approval Ethics approval was obtained from Human Research Ethics Committee, University of Cape Town, South Africa.

  • Provenance and peer review Not commissioned; externally peer reviewed.
