Unique diagnostic signatures of concussion in the saliva of male athletes: the Study of Concussion in Rugby Union through MicroRNAs (SCRUM)

Objective To investigate the role of salivary small non-coding RNAs (sncRNAs) in the diagnosis of sport-related concussion. Methods Saliva was obtained from male professional players in the top two tiers of England’s elite rugby union competition across two seasons (2017–2019). Samples were collected preseason from 1028 players, and during standardised head injury assessments (HIAs) at three time points (in-game, post-game, and 36–48 hours post-game) from 156 of these. Samples were also collected from controls (102 uninjured players and 66 players sustaining a musculoskeletal injury). Diagnostic sncRNAs were identified with next generation sequencing and validated using quantitative PCR in 702 samples. A predictive logistic regression model was built on 2017–2018 data (training dataset) and prospectively validated the following season (test dataset). Results The HIA process confirmed concussion in 106 players (HIA+) and excluded this in 50 (HIA−). 32 sncRNAs were significantly differentially expressed across these two groups, with let-7f-5p showing the highest area under the curve (AUC) at 36–48 hours. Additionally, a combined panel of 14 sncRNAs (let-7a-5p, miR-143-3p, miR-103a-3p, miR-34b-3p, RNU6-7, RNU6-45, Snora57, snoU13.120, tRNA18Arg-CCT, U6-168, U6-428, U6-1249, Uco22cjg1, YRNA_255) could differentiate concussed subjects from all other groups, including players who were HIA− and controls, immediately after the game (AUC 0.91, 95% CI 0.81 to 1) and 36–48 hours later (AUC 0.94, 95% CI 0.86 to 1). When prospectively tested, the panel confirmed high predictive accuracy (AUC 0.96, 95% CI 0.92 to 1 post-game and AUC 0.93, 95% CI 0.86 to 1 at 36–48 hours). Conclusions SCRUM, a large prospective observational study of non-invasive concussion biomarkers, has identified unique signatures of concussion in saliva of male athletes diagnosed with concussion.

play. World Rugby rules required this assessment to be completed within 10 minutes, but in season 1 a special dispensation extended this by 3 minutes so that study procedures could be carried out as part of the protocol for the current study.
Annual mandatory training programmes must be undertaken by all medical staff involved in the delivery of the HIA protocol. Formal audit, governance and disciplinary processes are in place to monitor compliance. All assessments are entered by team medical staff in real-time on an app (CSx) (https://csx.co.nz/our-story/) and are available for subsequent audit and review.
The HIA protocol incorporates a clearly defined and replicable definition of sport-related concussion, which provided the diagnostic reference for our analysis.
For the purpose of this study, we defined the in-match assessment as time point T1, the post-match assessment as time point T2 and the 36-48 hour assessment as time point T3.
Participants who were evaluated using the HIA protocol formed the HIA+ group if concussion was confirmed at any of the three time points, and the HIA− group if concussion was initially considered but subsequently ruled out.
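The grouping rule above can be illustrated with a minimal sketch. The function name, the dictionary layout and the "not assessed" label are assumptions made for illustration only; they are not part of the study's data handling.

```python
from typing import Dict, Optional


def classify_hia(confirmed_by_timepoint: Dict[str, Optional[bool]]) -> str:
    """Assign a player to HIA+ or HIA- from the T1/T2/T3 outcomes.

    Values: True = concussion confirmed, False = ruled out,
    None = no assessment at that time point (hypothetical encoding).
    """
    assessed = [v for v in confirmed_by_timepoint.values() if v is not None]
    if not assessed:
        # Player never entered the HIA process (e.g. an uninjured control).
        return "not assessed"
    # Confirmation at any one time point is enough for HIA+.
    return "HIA+" if any(assessed) else "HIA-"


print(classify_hia({"T1": False, "T2": True, "T3": None}))
print(classify_hia({"T1": None, "T2": False, "T3": False}))
```

A player cleared at T1 but diagnosed at T2 is therefore HIA+, which matches the "any of the three time points" wording.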
Whenever a participant was assessed post-match (T2) for possible concussion, team medical staff were asked to identify another participant who had played a similar number of minutes in the same match but who had not been assessed for concussion and, if possible, a third participant who had been withdrawn from that match due to a musculoskeletal injury.
Samples were requested from all categories of player at approximately the same time after the final whistle, and collection was completed before players had finished showering and changing.
These provided samples at time points T2 and T3, to form the uninjured and the musculoskeletal injury (MSK) control groups, respectively.
The HIA-T2 assessment was usually carried out 30-90 minutes after the match. The interval between injury and assessment therefore ranged from 30-90 minutes, for a player who completed the game and was injured in the last minute, up to 190 minutes (90 minutes + 80 minutes playing time + 20 minutes interval) for a player removed in the first minute.
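The timing arithmetic above can be checked with a short sketch. The 80 minutes of playing time, the 20 minute interval and the 30-90 minute post-match window come from the text; the function name and the assumption that the interval falls after 40 minutes of play are illustrative.

```python
PLAYING_TIME_MIN = 80           # from the text
INTERVAL_MIN = 20               # from the text
T2_DELAY_RANGE_MIN = (30, 90)   # T2 window after the final whistle, from the text


def minutes_post_injury_at_t2(injury_minute, half_length=40):
    """Return (earliest, latest) minutes between injury and the T2 assessment.

    injury_minute is the playing-time minute of the injury (0 to 80).
    half_length=40 is an assumption about where the interval falls.
    """
    remaining_play = PLAYING_TIME_MIN - injury_minute
    # The interval only adds to the delay if the injury precedes it.
    interval = INTERVAL_MIN if injury_minute < half_length else 0
    to_final_whistle = remaining_play + interval
    earliest, latest = T2_DELAY_RANGE_MIN
    return (to_final_whistle + earliest, to_final_whistle + latest)


print(minutes_post_injury_at_t2(0))    # injury in the first minute
print(minutes_post_injury_at_t2(80))   # injury in the last minute
```

The upper bound for a first-minute injury reproduces the 190 minutes quoted in the text (90 + 80 + 20).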
It is also important to note that not all players had an assessment at time point T1, as not all significant head injury events are identified in game and symptoms for some players only develop post-game. Moreover, it would not have been possible to obtain samples from the uninjured group during the match. It was therefore predetermined in the study design that T2 would be the primary time point of interest for comparisons, as this would provide the most consistent timeframe to collect saliva samples across all groups.
Although team physicians were responsible for the clinical management of each player in real time, in order to ensure a consistent diagnostic standard for the study, the full HIA protocol documentation for each player assessed for concussion and (where available) the video footage of the inciting head injury event were subsequently reviewed independently against the HIA protocol by two experienced sports medicine doctors and England Senior National Team doctors (SPTK and RT). They were blinded to any laboratory results and adjudicated each incident as HIA+ or HIA−, or recommended its exclusion due to insufficient or conflicting evidence. For completeness, the analysis of the uncensored data is presented in this section.

Saliva collection
Medical staff at the respective clubs were trained in the collection procedure. Saliva was collected in Oragene®-RNA RE-100 saliva self-collection kits (DNA Genotek) containing an RNA stabilizing solution that preserves the samples for up to 8 weeks. Saliva was collected from each participant at enrolment and at the time points described above. Samples were transported to the lab in Birmingham, where they were processed in line with the manufacturer's protocol for storage. During the second season, DNA Genotek discontinued the RE-100 kits and replaced them with an equivalent product. This was utilised from January 2018 onwards.

Library preparation and Next Generation Sequencing
Library preparation was carried out using the QIAseq miRNA Library Kit (QIAGEN). A total of 5 µl of total RNA was converted into microRNA NGS libraries. Adapters containing unique molecular identifiers (UMIs) were ligated to the RNA, which was then converted to cDNA. The cDNA was amplified using PCR (22 cycles), during which indices were added. After PCR, the samples were purified.
Library preparation QC was performed using either the Bioanalyzer 2100 (Agilent) or TapeStation 4200 (Agilent). Based on the quality of the inserts and the concentration measurements, the libraries were pooled in equimolar ratios. The library pool(s) were quantified using the qPCR ExiSEQ LNA™ Quant kit (Exiqon). The library pools were then sequenced on a NextSeq500 sequencing instrument according to the manufacturer's instructions (NEBNext Multiplex Small RNA Library Prep Set for Illumina) to make approximately 163-175 base-pair sized libraries. Raw data were demultiplexed and FASTQ files for each sample were generated using the bcl2fastq software (Illumina Inc.). FASTQ data were checked using the FastQC tool (http://www.bioinformatics.babraham.ac.uk/projects/fastqc/).
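As an illustration of equimolar pooling, the sketch below converts a library's mass concentration and mean fragment size into molarity and derives the volume that contributes a fixed molar amount. It uses the standard approximation of 660 g/mol per base pair of double-stranded DNA. The function names, the example libraries and the 10 fmol target are assumptions, not values from the study.

```python
def library_molarity_nM(conc_ng_per_ul, mean_fragment_bp):
    """Molarity in nM (equivalently fmol/ul) of a dsDNA library."""
    return conc_ng_per_ul * 1e6 / (660.0 * mean_fragment_bp)


def equimolar_volumes(libraries, fmol_per_library=10.0):
    """Volume (ul) of each library contributing the same molar amount.

    libraries maps a sample name to (ng/ul, mean fragment size in bp).
    fmol_per_library is a hypothetical target amount.
    """
    volumes = {}
    for name, (conc, size) in libraries.items():
        volumes[name] = fmol_per_library / library_molarity_nM(conc, size)
    return volumes


# Hypothetical libraries of about 170 bp, within the 163-175 bp range in the text.
example = {"sample_A": (1.122, 170), "sample_B": (2.244, 170)}
for name, vol in equimolar_volumes(example).items():
    print(name, round(vol, 3), "ul")
```

A library at twice the concentration contributes half the volume, so every library is represented by the same number of molecules in the pool.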

Mapping
A reference profile of sequencing data for each sample was obtained using the whole human genome sequence GRCh37, downloaded from the Genome Reference Consortium, with mirbase_20 as the annotation reference. Reads were aligned to miRBase using Bowtie2. [2] For mapping to spike-ins, abundant sequences and miRBase, reads were required to match the reference sequences perfectly. For mapping to the genome, one mismatch was allowed in the first 32 bases of the read. No indels were allowed in mapping. Unaligned reads were mapped against the host reference genome.

Statistical analysis
Aligned reads were counted, and differential expression analysis was performed with edgeR. [6] P-values for significantly differentially expressed sncRNAs were estimated by an exact test on the negative binomial distribution, and the false discovery rate was controlled according to Benjamini-Hochberg. For normalisation, the trimmed mean of M-values (TMM) method, based on log-fold and absolute gene-wise changes in expression levels between samples, was used.
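The Benjamini-Hochberg step can be sketched in a few lines. This is a plain Python illustration of the adjustment itself, not the edgeR implementation used in the study; the function name and the example p-values are assumptions.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p-values for four sncRNAs.
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

Each raw p-value is scaled by the number of tests divided by its rank, and the running minimum keeps the adjusted values ordered consistently with the raw ones.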

qPCR season 1
14 μl RNA was reverse transcribed in 70 μl reactions using the miRCURY LNA RT Kit (QIAGEN). cDNA was diluted 50× and assayed in 10 μl PCR reactions according to the protocol for miRCURY LNA miRNA PCR; each miRNA was assayed once by qPCR on the miRNA Ready-to-Use PCR custom panel using miRCURY LNA SYBR Green master mix.
qPCR probes are the complementary sequences of the sncRNAs of interest (eTable 2 below).
Negative controls excluding template from the reverse transcription reaction were performed and profiled like the samples. The amplification was performed in a LightCycler® 480 Real-Time PCR System (Roche) in 384 well plates. The amplification curves were analysed using the Roche LC software, both for determination of Cq (by the 2nd derivative method) and for melt curve analysis. The amplification efficiency was calculated using algorithms similar to the LinReg software. All assays were inspected for distinct melting curves, and the melting temperature (Tm) was checked to be within known specifications for the assay. Furthermore, assays must be detected with 0 Cq less than the negative control, and with Cq<37 to be included in the data analysis. Data that did not pass these criteria were omitted from any further analysis. Normalization was performed based on the average of hsa-miR-29c-3p and hsa-let-7b-5p (custom normalizer assays), the two most stable miRs identified across all samples by NormFinder software. [7] The normalized Cq value (dCq) is the difference between the mean Cq of the custom normalizer assays and the Cq of the assay (miRNA of interest). After normalization, 20 was added to the normalized dCq values to shift them into a positive range, allowing use of the qPCR analysis pipelines according to QIAGEN procedures. While processing the data in the qPCR pipeline, a minus is inserted before the normalized dCq value. A higher value indicates that the miRNA is more abundant in that sample.
BMJ Publishing Group Limited (BMJ) disclaims all liability and responsibility arising from any reliance placed on this supplemental material which has been supplied by the author(s).
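The normalisation described for the season 1 qPCR data can be sketched as follows. The normalizer assays, the offset of 20 and the Cq<37 cut-off are taken from the text; the function names, the dictionary layout and the example Cq values are assumptions. The negative-control criterion and the sign handling inside the vendor pipeline are deliberately not modelled.

```python
NORMALIZERS = ("hsa-miR-29c-3p", "hsa-let-7b-5p")  # from the text
OFFSET = 20.0    # shifts dCq values into a positive range, from the text
MAX_CQ = 37.0    # inclusion cut-off, from the text


def passes_cq_cutoff(cq):
    """True if the assay meets the Cq<37 inclusion criterion."""
    return cq < MAX_CQ


def normalized_dcq(cq_by_assay, assay):
    """Mean normalizer Cq minus assay Cq, plus the offset of 20."""
    ref = sum(cq_by_assay[n] for n in NORMALIZERS) / len(NORMALIZERS)
    return ref - cq_by_assay[assay] + OFFSET


# Hypothetical Cq values for one sample.
sample = {"hsa-miR-29c-3p": 25.0, "hsa-let-7b-5p": 27.0,
          "hsa-let-7a-5p": 24.0, "hsa-miR-143-3p": 30.0}
for assay in ("hsa-let-7a-5p", "hsa-miR-143-3p"):
    if passes_cq_cutoff(sample[assay]):
        print(assay, normalized_dcq(sample, assay))
```

Because the assay Cq is subtracted from the normalizer mean, an assay that amplifies earlier (lower Cq) receives a higher normalized value, consistent with higher values indicating greater abundance.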

qPCR season 2
RNA from saliva samples was extracted and analysed with exactly the same protocol used for SCRUM1, and qPCR was performed using the Applied Biosystems QuantStudio 5 (ThermoFisher Scientific) for amplification and melt curve analysis.

Uncensored data
After independent review of the incidents, 47 cases were excluded due to incomplete HIA documentation or failure to identify a clear mechanism of injury on the video footage. The analysis of the complete dataset, including the excluded incidents, does not show substantial differences from the censored data. The full uncensored dataset analysis is reported in eTable 9. The overlap with the previous analysis is highlighted in grey cells.